Businesses are adopting AI at a rapid pace, yet the security of their models often still isn’t up to par.
Traditional IT has had years to develop appropriate security practices, but LLMs and other AI models are still in a fairly vulnerable phase. Unsecured models are an attractive target for cybercriminals and pose potentially far-reaching risks for companies.
Model poisoning: tainted training data
One of the best-known attack techniques is model poisoning. Here, attackers slip misleading data into an AI model’s training set, and the model trained on that “poisoned” data starts making mistakes and producing incorrect output. It is therefore crucial for AI companies to secure their training pipelines against intrusion and to carefully verify the provenance of their training data. Even then, a highly targeted attack can still cause damage.
Because the output of an AI model depends on the quality of its data, even a small amount of bad input can introduce bias or errors.
Imagine a bank using an AI model to detect fraudulent transactions. If an attacker injects fraudulent transactions labeled as “legitimate,” the model learns to ignore suspicious activity. You end up with a fraud detection system that does exactly what criminals want: it fails to detect them. In practice this is uncommon, because training data is rarely left unprotected.
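To make that scenario concrete, here is a minimal sketch of label-flipping poisoning on a toy scikit-learn classifier. The synthetic dataset and the fraction of relabeled fraud cases are illustrative assumptions, not figures from a real bank; the point is only that relabeled fraud cases measurably lower the detector’s recall.

```python
# Minimal sketch: label-flipping "poisoning" of a toy fraud classifier.
# All data is synthetic; the flipped fraction is an illustrative assumption.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split

# Synthetic "transactions": class 1 = fraud, class 0 = legitimate.
X, y = make_classification(n_samples=5000, n_features=10,
                           weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

def fraud_recall(labels):
    """Train on the given labels and report recall on the fraud class."""
    model = LogisticRegression(max_iter=1000).fit(X_train, labels)
    return recall_score(y_test, model.predict(X_test))

print("recall on clean labels:   ", round(fraud_recall(y_train), 2))

# Attacker relabels part of the fraud cases as "legitimate".
rng = np.random.default_rng(0)
poisoned = y_train.copy()
fraud_idx = np.where(poisoned == 1)[0]
flipped = rng.choice(fraud_idx, size=int(0.5 * len(fraud_idx)), replace=False)
poisoned[flipped] = 0

# The detector now misses noticeably more fraudulent transactions.
print("recall on poisoned labels:", round(fraud_recall(poisoned), 2))
```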
Prompt engineering and guardrails
Prompt engineering means crafting prompts so that a model produces a specific output. With AI models such as DeepSeek, this can cause major problems, because the model is less strict about safety rules.
According to a study, DeepSeek failed to block a single harmful prompt earlier this year. Other popular AI models do block such prompts by implementing guardrails, an area where DeepSeek fell short.
You can take guardrails quite literally: they prevent the AI model from veering off the “normal road.” They ensure, for example, that no financial advice is given or data leaked. Guardrails also make it much harder for cybercriminals to use models for malicious purposes.
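As a rough illustration of that idea, the sketch below wraps a hypothetical call_model function with a simple input rail and output rail built from keyword and regex checks. Real guardrail frameworks rely on policy engines and classifier models rather than keyword lists; this only shows where the checks sit in the call path.

```python
# Minimal sketch of input/output guardrails around an LLM call.
# `call_model` is a hypothetical stand-in for a real LLM client; production
# guardrails use policy engines or classifier models, not keyword lists.
import re

BLOCKED_TOPICS = ("financial advice", "investment tips")               # input rail
CARD_PATTERN = re.compile(r"\b\d{4}[- ]?\d{4}[- ]?\d{4}[- ]?\d{4}\b")  # output rail

def call_model(prompt: str) -> str:
    # Placeholder for the actual model call.
    return f"(model answer to: {prompt})"

def guarded_call(prompt: str) -> str:
    # Input rail: refuse prompts on disallowed topics before they reach the model.
    if any(topic in prompt.lower() for topic in BLOCKED_TOPICS):
        return "Sorry, I can't help with that topic."
    # Output rail: redact anything that looks like a card number before returning.
    return CARD_PATTERN.sub("[REDACTED]", call_model(prompt))

print(guarded_call("Give me financial advice on which shares to buy"))
print(guarded_call("Summarise this transaction: 4111 1111 1111 1111"))
```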
Slopsquatting
A simple way to infiltrate the supply chain is slopsquatting. AI coding assistants sometimes hallucinate non-existent software packages into their code; attackers register packages under exactly those names and lace them with malware to break into the systems that install them.
A developer who trusts the assistant’s suggestion and installs such a package may unwittingly bring in malware that grants access to sensitive data or even the entire training environment. Fortunately, defenses are evolving quickly: packages can be scanned before use, or you can stick to packages that have already been vetted by trusted developers.
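A minimal sketch of that vetting step, assuming a hypothetical internal allowlist: dependencies suggested by a coding assistant are checked against the approved set before anything is installed. The package names below are made up for illustration.

```python
# Minimal sketch: vet AI-suggested dependencies against an internal allowlist
# before installing anything. The allowlist and package names are hypothetical.
ALLOWLIST = {"requests", "numpy", "pandas"}  # packages already vetted by your team

def vet_dependencies(suggested: list[str]) -> list[str]:
    """Return only the suggested packages that are explicitly approved."""
    approved = []
    for name in suggested:
        if name.lower() in ALLOWLIST:
            approved.append(name)
        else:
            print(f"BLOCKED: '{name}' is not on the allowlist -- review it manually first")
    return approved

# A coding assistant might mix a hallucinated package in with real ones.
suggestions = ["requests", "reqeusts-toolbelt2"]  # the second name is made up
print("safe to install:", vet_dependencies(suggestions))
```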
This type of attack isn’t new; the open-source community has struggled with it for a long time. The simplest countermeasures are to disallow unvetted packages during development or to give your staff sufficient training. Slopsquatting is an evolution of typosquatting, in which well-known names are imitated: Gooogle instead of Google, for example. That technique is often used in phishing emails.
Model extraction
AI models often process sensitive data, such as customer information. Even when that data isn’t directly accessible, it can sometimes be coaxed out with specific prompts. In model extraction, attackers analyze and mimic a model’s behavior: they query it through an API and replicate it without knowing its parameters or training data.
Distillation uses a large AI model to train a smaller one; done legitimately, this happens with GPT-4 and GPT-4o, for example. Thousands of questions are posed to the model, and its answers are used to replicate its behavior.
Attackers use the same technique to make their own model behave like the original one. They don’t need the actual training data or architecture; the output alone is enough.
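The sketch below imitates that setup on a small scale: a scikit-learn model plays the role of the remote API, and a surrogate is trained purely on its answers to probe queries. Everything here is synthetic; in a real extraction attack the attacker would only ever see the API’s responses.

```python
# Minimal sketch of model extraction: train a surrogate purely on the
# victim's outputs. Here the "victim" is a local scikit-learn model standing
# in for a remote API; only its predictions are used by the "attacker".
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=4000, n_features=8, random_state=1)
victim = RandomForestClassifier(random_state=1).fit(X, y)  # pretend this sits behind an API

# Attacker sends probe queries and records only the answers.
rng = np.random.default_rng(1)
queries = rng.normal(size=(3000, 8))
answers = victim.predict(queries)          # the only thing the attacker sees

# Surrogate is trained on (query, answer) pairs: no access to X, y or the model internals.
surrogate = DecisionTreeClassifier(random_state=1).fit(queries, answers)

# How often does the copy agree with the original on fresh inputs?
fresh = rng.normal(size=(1000, 8))
agreement = (surrogate.predict(fresh) == victim.predict(fresh)).mean()
print(f"surrogate matches victim on {agreement:.0%} of fresh queries")
```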
The impact on businesses
A “compromised” AI model can have serious consequences. Companies that use AI must recognize that they are responsible for the quality and security of their models.
With regulations such as the AI Act, the EU is tightening the rules. Companies must be able to demonstrate that their models are safe and that they are aware of risks such as bias and data leaks. Non-compliance can result in hefty fines, and how high they can go depends on the AI risk level (see the short example after this list):
- Prohibited AI: Up to €35 million or 7% of global turnover.
- High-risk AI: Up to €15 million or 3% of turnover.
- Failure to meet transparency requirements: Up to €15 million or 3% of turnover.
- Incorrect or misleading information: Up to €7.5 million or 1% of turnover.
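As a quick illustration of how those caps combine, the sketch below computes the maximum possible fine per tier, assuming the “fixed amount or percentage of global turnover, whichever is higher” rule that applies to larger undertakings (for SMEs the lower of the two applies). The turnover figure is an arbitrary example.

```python
# Minimal sketch: maximum AI Act fine per tier, assuming the
# "fixed amount or % of global turnover, whichever is higher" rule
# for larger undertakings. Amounts in euros.
TIERS = {
    "prohibited AI":           (35_000_000, 0.07),
    "high-risk AI":            (15_000_000, 0.03),
    "transparency failure":    (15_000_000, 0.03),
    "misleading information":  (7_500_000, 0.01),
}

def max_fine(tier: str, global_turnover_eur: float) -> float:
    fixed, pct = TIERS[tier]
    return max(fixed, pct * global_turnover_eur)

# Example: a company with €2 billion in global annual turnover.
print(f"€{max_fine('prohibited AI', 2_000_000_000):,.0f}")  # €140,000,000
```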
Conclusion
If you want to deploy AI safely, you must treat it from day one as a valuable business asset that deserves just as much protection as data or infrastructure. Only then can companies use AI without it becoming their biggest vulnerability.