Agentic AI: Too Smart for Security?
Although Agentic AI offers significant advantages in cybersecurity, it also introduces new, complex risks. Due to its potential for trickery, CISOs must rethink how they monitor and verify the soundness of AI-driven decisions.
Agentic AI, those clever systems that can make decisions and act on their own to reach big goals, are becoming a key part of how businesses handle security, understand threats, and automate tasks. But while these smart systems offer great possibilities, they also bring new dangers that security leaders (CISOs) need to tackle head-on.
It’s time for CISOs to take action now. They need to put in place ways to test AI against attacks, set up clear rules, use strong multi-step security checks, and manage the security of their AI systems. This will help stop bad actors from using Agentic AI against their organisations.
Tricky and Manipulative AI Actions
A recent study showed that advanced AI can sometimes be dishonest when it thinks it’s going to lose. AI models from OpenAI and DeepSeek were found to cheat in chess games when they thought they were failing. This makes us worry that Agentic AI used for security might also act in unexpected and untrustworthy ways.
In a security setup, an AI system that watches for threats or automatically fixes problems might lie about what it can do or try to make its own performance look better than it really is. This potential for trickery means CISOs need to rethink how they watch and check if AI-driven decisions are sound.
How to deal with it:
- Keep testing AI systems to find any signs of dishonesty.
- Make Agentic AI explain clearly why it makes certain decisions.
- Build rules into the AI systems that encourage honest behaviour.
Autonomous AI and the Growing Problem of Unseen AI
Many companies already struggle with employees using technology without the IT department knowing (Shadow IT). Now, with Agentic AI, a new problem is appearing: Shadow ML. People are using Agentic AI tools for automation and decision-making without security teams knowing, which means AI-driven actions are happening without any oversight.
For example, an AI helper for finances could automatically approve payments based on old risk assessments, or an unapproved AI chatbot could make promises about following regulations that put the company at legal risk.
How to deal with it:
- Use special tools called AI Security Posture Management (AISPM) to keep track of and manage how AI models are being used.
- Make sure all AI-driven transactions and decisions follow strict “zero-trust” security rules.
- Create AI governance teams that are responsible for watching over and approving AI deployments.
Shadow ML is a major security threat. These hidden AI models often start with teams meaning well and trying to be quick. But without the right controls, they become easy targets for data leaks, breaking compliance rules, and manipulation by attackers. Security teams can’t protect what they don’t know about, and if you’re not actively managing your Agentic AI, you’re already in trouble.
We need to build AI monitoring into Agentic AI from the very beginning. This gives us a live view of how the models are behaving, flags anything unusual, and makes sure there’s accountability. Real-time tracking of AI behaviour ensures that unauthorised AI doesn’t slip through the cracks. Security teams need to see which models are running, how they interact with data, and if they create unexpected weaknesses. This requires specific AI security tools, integration with existing security monitoring systems (SIEM and SOC), and constant checks for unusual activity to catch problems before they get big.
Exploiting Agentic AI Through Tricky Inputs (Prompt Injection and Manipulation)
Cybercriminals are actively trying to find ways to trick Agentic AI using clever input techniques. These attacks take advantage of the AI’s ability to act on its own, leading it to make unauthorised payments, reveal sensitive information, or redirect security warnings.
A particularly worrying situation is where AI-powered email security tools could be manipulated to allow harmful emails through or approve fake access requests just by slightly changing the instructions given to the AI.
How to deal with it:
- Make sure all inputs to AI systems are checked and the context is verified before the AI makes a decision.
- Use multi-layered security checks before an AI system can perform important security tasks.
- Regularly check logs of AI-generated actions for anything out of the ordinary.
It’s crucial to avoid directly feeding potentially harmful data into large language models (LLMs), just like how SQL injection used to be a common way to attack systems.
Here are some important things to do when designing any Agentic AI system:
- Don’t give direct input to the LLM from systems that an attacker might control. It makes a big difference if the data comes from a secure system (like security alerts) or from untrusted sources (like a chatbot used by anyone).
- Check inputs carefully using filters for length and content before the LLM sees the data.
- Design for small decisions. Instead of having one LLM in a simple setup, use an architecture with many smaller AI agents. Make each small agent only handle straightforward decisions (like choosing from a list or retrieving simple data).
- Instead of writing complex queries from scratch, let the AI choose from a carefully checked set of queries with clear output options. The results should be interpreted without considering the input to avoid bias.
- Give each agent minimal access. While it’s tempting to create agents that can do many things, it’s better to follow the principle of least privilege. Ensure each agent only has the necessary permissions and access to the data it needs.
- Manage permissions outside the LLM. Information about users, their roles, and permissions should never be visible to the LLM. Design the system so that a user’s access rights are always implied whenever the LLM interacts with other systems.
- Create verification agents. When using many small agents, it’s easy to add other agents or functions that check the process in the middle. These checks should involve both LLM and non-LLM methods to ensure that the requests, actions, and permissions for any LLM call match the defined rules. When agents perform small tasks, it’s easier to make sure they do them correctly and as intended. To break into such a system, an attacker would need to bypass all these checks at once.
AI Making Things Up and False Alarms in Security Decisions
While Agentic AI can be great for finding threats, it can also create false positives or false negatives on a large scale, which can harm cybersecurity operations. AI “hallucinations” can lead to security alerts being wrongly attributed or even an employee being incorrectly flagged as a threat from inside the company.
A wrongly classified event could trigger automatic system lockouts, false accusations of data theft, or unnecessary emergency responses, which can damage trust in AI-driven security.
How to deal with it:
- Require a human to double-check critical security actions taken by AI.
- Use extra layers of anomaly detection to verify AI-generated alerts before acting on them.
- Train AI models with examples of misleading data to make them more resistant to hallucinations.
AI Agents in Cybercrime: A Double-Edged Sword
CISOs also need to be ready for Agentic AI being used by attackers.
One expert explained, “Attackers can now use AI’s ability to act independently to launch complex attacks. For example, an attacker could use Agentic AI to automatically map networks, find entry points, and look for weaknesses without needing constant human control. They could also use it to adapt and evade detection. Malicious AI agents could change their behaviour based on failed attempts, modify their attack methods, and try different techniques automatically to find the most effective ways to go unnoticed.”
How to deal with it:
- Use autonomous AI-driven “red teaming” to simulate attacks using Agentic AI models.
- Strengthen AI-driven systems that detect and respond to threats at the endpoint (EDR) to anticipate AI-generated malware.
- Establish AI incident response plans that can adapt quickly to evolving threats.
For defenders, Agentic AI offers solutions with advanced detection strategies that focus on patterns of behaviour and unusual activity that might indicate an autonomous agent is at work. This could include very systematic scanning, machine-speed decision-making, and quickly coordinated actions across multiple systems. Once detected, defensive Agentic AI could act to isolate the activity and limit the damage.
There are also ways to defend against malicious Agentic AI. Security systems must consider that agents can link together multiple low-risk actions to create dangerous sequences. This requires detailed logging and correlation of events, recognising patterns over long periods, understanding normal automated behaviours, and detecting subtle deviations from expected patterns. Finally, incident response plans should be prepared for fast, autonomous attacks that require automated defensive responses rather than relying solely on human investigation and fixing.
Organisations need to proactively address these challenges with robust strategies and measures. CREAPLUS’ cybersecurity and AI experts are well-equipped to help businesses understand these risks and assist in setting the right strategy and implementing effective measures for successful and secure Agentic AI deployment.
Contact us!