
How Do I Keep an AI Agent Secure and Governed With Human in the Loop?


You keep an AI agent secure and governed by treating it as an identity that acts under your authority, not a black box you switch on. The working setup combines human-in-the-loop approval gates on consequential actions, attribute-based access control that scopes what the agent can touch, a tamper-evident audit trail of every decision, and GDPR-grade disclosure and data limits. That is the difference between an agent you can trust and one that is a single breach away from a nightmare.

Letting software act on its own — send the email, move the money, change the record — feels different from letting it merely suggest. The fear is rational. An autonomous agent plans and executes multi-step tasks with little input, which means the window to intervene is narrow and the consequences land in the real world: an unauthorised data pull, a misconfigured system, a payment that should never have gone out. Good AI agent governance closes that gap. It does not slow the agent to a crawl; it decides, in advance, which actions the agent may take alone and which it must hand to a person first.

Below is the setup we build and would recommend to any UK business putting an agent into live operations — the controls that let you sleep, and the regulatory obligations that make them non-optional from 2026.

Start with the agent as an identity, not a feature

The single most useful shift is to stop thinking of the agent as a clever chatbot and start treating it as a member of staff with its own credentials. A person on your team has a login, a defined role, permissions scoped to their job, and a paper trail. An agent acting on your systems needs exactly the same. Strata, writing on agentic identity, puts it plainly: identity governance is the enforcement layer that binds an agent's actions to your policies through authentication, authorisation and audit.

This reframing makes the hard questions concrete. What can this identity see? What can it change? Who authorised that change, and can you prove it later? An agent without its own scoped identity inherits whatever the integration it runs on can do, which is usually far more than the task requires. That over-permission is where most agent security failures begin.
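As a minimal sketch of what "its own scoped identity" can look like, assuming a simple in-process record rather than any particular identity product: the `AgentIdentity` class, its field names and the example values below are all hypothetical, chosen only to show the shape of the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentIdentity:
    """Hypothetical record of an agent's own credentials and scope."""
    agent_id: str
    owner: str                   # the human accountable for this agent
    allowed_actions: frozenset   # what the agent may change
    readable_data: frozenset     # what the agent may see


def is_permitted(identity: AgentIdentity, action: str, data_class: str) -> bool:
    """Deny by default: both the action and the data class must be in scope."""
    return (
        action in identity.allowed_actions
        and data_class in identity.readable_data
    )


# Illustrative values only.
support_agent = AgentIdentity(
    agent_id="agent-support-01",
    owner="head.of.support@example.com",
    allowed_actions=frozenset({"read_ticket", "draft_reply"}),
    readable_data=frozenset({"ticket"}),
)

print(is_permitted(support_agent, "draft_reply", "ticket"))   # True
print(is_permitted(support_agent, "issue_refund", "ticket"))  # False
```

The useful property is that anything not listed is refused, so the agent cannot quietly inherit the wider permissions of the integration it runs on.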

Human-in-the-loop approval gates: the part that lets you sleep

The core of human-in-the-loop compliance is deciding which actions pause for a person. Two patterns matter, and the difference is not academic:

  • Human-in-the-loop (HITL) — the agent stops and waits for explicit approval before it acts. Right for high-stakes, hard-to-reverse actions: transferring funds, signing an agreement, deleting records, contacting a customer with a binding offer.
  • Human-on-the-loop (HOTL) — the agent acts and a person monitors, ready to intervene. Acceptable for medium-risk, reversible work where stopping every step would defeat the point.

The instinct to gate everything is the wrong one — it buries reviewers and the approvals become rubber stamps. Kiteworks describes a useful spectrum: the strongest control is human authorisation, where no action occurs without explicit, logged approval; the weakest still-compliant control is population-level oversight, where a person watches patterns and steps in when something drifts. Match the gate to the blast radius of the action, not to a blanket rule.
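Matching the gate to the blast radius can be sketched as a small routing function. The action catalogue, the three gate names and the rule that unknown actions get the strictest gate are assumptions made for illustration, not a scheme taken from Kiteworks or Strata.

```python
from enum import Enum


class Gate(Enum):
    HITL = "pause for explicit approval"
    HOTL = "act, with a person monitoring"
    AUTO = "act, reviewed at population level"


# Hypothetical catalogue: action -> (reversible?, stakes)
ACTIONS = {
    "transfer_funds": (False, "high"),
    "delete_records": (False, "high"),
    "update_crm_note": (True, "medium"),
    "tag_inbox_message": (True, "low"),
}


def gate_for(action: str) -> Gate:
    """Pick the gate from reversibility and stakes."""
    if action not in ACTIONS:
        # Assumption: an action nobody has classified gets the strictest gate.
        return Gate.HITL
    reversible, stakes = ACTIONS[action]
    if not reversible or stakes == "high":
        return Gate.HITL
    if stakes == "medium":
        return Gate.HOTL
    return Gate.AUTO


print(gate_for("transfer_funds").name)     # HITL
print(gate_for("update_crm_note").name)    # HOTL
print(gate_for("tag_inbox_message").name)  # AUTO
```

Keeping the catalogue explicit also gives reviewers one place to argue about where an action belongs, instead of discovering the answer in production.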

And the gate has to be real. Kiteworks names four conditions for meaningful oversight: the reviewer has enough information about the data and the agent's confidence; the reviewer has genuine authority to modify, reject or delay; the volume of reviews is low enough to allow real attention; and every decision is recorded in a tamper-evident log. Their sharpest line is the one to remember — a near-zero override rate signals that the reviews are not independent. If your humans never say no, you do not have oversight, you have theatre. Strata adds the practical detail: time-box the decision lanes — fifteen seconds for low-risk approvals, up to fifteen minutes for the heavy ones — and build in a guard against automation bias, the well-documented tendency to wave through whatever the machine proposes.
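Two of those checks lend themselves to simple monitoring: the override rate and the time-boxed decision lanes. The sketch below is one way to flag both. The lane limits use the fifteen-second and fifteen-minute figures quoted above; the 2% threshold for "near zero" and the `Review` record are assumptions you would tune to your own review volumes.

```python
from dataclasses import dataclass

# Decision lanes in seconds, using the figures quoted above.
LANE_SECONDS = {"low": 15, "high": 15 * 60}

# Assumption: below this share of overrides, reviews look like rubber stamps.
MIN_OVERRIDE_RATE = 0.02


@dataclass
class Review:
    risk: str                  # "low" or "high"
    seconds_taken: float
    approved_unchanged: bool   # False if the reviewer modified or rejected


def override_rate(reviews: list) -> float:
    """Share of reviews where the human changed or rejected the proposal."""
    if not reviews:
        return 0.0
    overrides = sum(1 for r in reviews if not r.approved_unchanged)
    return overrides / len(reviews)


def oversight_warnings(reviews: list) -> list:
    """Return human-readable warnings about the health of the review process."""
    warnings = []
    if reviews and override_rate(reviews) < MIN_OVERRIDE_RATE:
        warnings.append("override rate near zero: reviews may be rubber stamps")
    late = [r for r in reviews if r.seconds_taken > LANE_SECONDS[r.risk]]
    if late:
        warnings.append(f"{len(late)} review(s) exceeded their decision lane")
    return warnings
```

A warning here is a prompt to look at the process, not proof of a problem: a very low override rate can also mean the agent is proposing well-scoped actions.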

Contain what the agent can reach: ABAC and least privilege

Approval gates govern actions. Attribute-based access control (ABAC) governs reach, and reach is where a compromised or confused agent does the most damage. Role-based access (this agent is a "support agent", so it gets the support role) is the floor. ABAC is the better target: permissions are evaluated at the moment of the request against attributes such as which user the agent is acting for, what data class is involved, the time, the context and the sensitivity. The agent gets a narrow, scoped token for the task in front of it, not a standing key to everything.
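A per-request ABAC check can be sketched in a few lines. Everything named here is hypothetical: the `Request` attributes, the office-hours rule, the five-minute token lifetime and the policy itself are stand-ins for whatever your own policy engine evaluates.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class Request:
    agent_id: str
    on_behalf_of: str       # the user the agent is acting for
    user_department: str
    action: str
    data_class: str         # e.g. "public" or "personal"
    data_department: str


@dataclass(frozen=True)
class ScopedToken:
    agent_id: str
    action: str
    data_class: str
    expires_at: datetime


def evaluate(req: Request, now: datetime) -> Optional[ScopedToken]:
    """Hypothetical policy, evaluated per request and denying by default."""
    same_department = req.user_department == req.data_department
    office_hours = 8 <= now.hour < 18
    if req.data_class == "public" and req.action == "read":
        allowed = True
    elif req.data_class == "personal":
        allowed = req.action == "read" and same_department and office_hours
    else:
        allowed = False  # any data class the policy does not name is refused
    if not allowed:
        return None
    # A short-lived token for this one action, not a standing key.
    return ScopedToken(req.agent_id, req.action, req.data_class,
                       now + timedelta(minutes=5))


now = datetime(2025, 3, 3, 10, 0, tzinfo=timezone.utc)
req = Request("agent-support-01", "jane", "support", "read", "personal", "support")
token = evaluate(req, now)
print(token is not None)  # True
```

The same request from a user in another department, or outside the assumed office hours, returns `None` and the agent gets no token at all.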

This is agent containment in practice. Technova Partners, writing on enterprise agent security, lists the controls that go with it: session isolation so one user's conversation can never leak into another's, input validation against prompt-injection attacks that try to hijack the agent's instructions, output filtering to catch personal data before it leaves in a response, TLS 1.3 in transit and encryption at rest. The principle underneath all of it is least privilege: the agent can do exactly what the task needs and nothing more, so a single bad instruction cannot cascade.
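Output filtering is the easiest of these controls to illustrate. The sketch below redacts email addresses and card-like digit runs from a response before it leaves. The two patterns are deliberately simple assumptions; they will miss plenty of personal data, and a production filter would need a proper detection service behind it.

```python
import re

# Simplified patterns for illustration only.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
CARD_LIKE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")  # 13 to 16 digits


def redact(text: str) -> str:
    """Replace email addresses and card-like numbers in an agent response."""
    text = EMAIL.sub("[email removed]", text)
    return CARD_LIKE.sub("[number removed]", text)


reply = "Contact jane.doe@example.com about card 4111 1111 1111 1111 today."
print(redact(reply))
# Contact [email removed] about card [number removed] today.
```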

The audit trail is your evidence, not your afterthought

When something goes wrong, or when a regulator, a customer or your own board asks what happened, the agent's audit trail is the only thing that answers. It is also, increasingly, a legal requirement rather than good hygiene.

The standard to build to is attribution at the level of the individual action. Every consequential thing the agent does should be traceable to a human authoriser, the authorisation linked to that specific action, and the whole record kept in tamper-evident storage. Kiteworks frames the test as evidence: a record of who reviewed what, when, under what information, and what they decided. Technova recommends immutable logs with a minimum twelve-month retention and automated alerts on suspicious access patterns. The point is not to collect logs for their own sake — it is to be able to reconstruct any decision the agent made, on demand, with confidence.
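One common way to make a log tamper-evident is to chain entries by hash, so that changing any earlier record breaks every hash after it. The sketch below shows the idea with an in-memory list. The entry fields are hypothetical, and a real deployment would add write-once storage and signing on top; this is not the mechanism Kiteworks or Technova describe, only an illustration of the property.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash for the first record


def _digest(entry: dict, prev_hash: str) -> str:
    """Hash the entry together with the previous record's hash."""
    payload = json.dumps(entry, sort_keys=True) + prev_hash
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append(log: list, entry: dict) -> None:
    """Add an entry, linking it to the record before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"entry": entry, "prev": prev_hash,
                "hash": _digest(entry, prev_hash)})


def verify(log: list) -> bool:
    """Recompute the chain; any edited or removed record breaks it."""
    prev_hash = GENESIS
    for record in log:
        if record["prev"] != prev_hash:
            return False
        if record["hash"] != _digest(record["entry"], prev_hash):
            return False
        prev_hash = record["hash"]
    return True


log = []
append(log, {"action": "transfer_funds", "authoriser": "cfo@example.com",
             "decision": "approved", "at": "2025-03-03T10:00:00Z"})
append(log, {"action": "delete_records", "authoriser": "dpo@example.com",
             "decision": "rejected", "at": "2025-03-03T10:05:00Z"})
print(verify(log))  # True

log[0]["entry"]["decision"] = "rejected"  # simulate tampering
print(verify(log))  # False
```

Each entry carries the authoriser and the specific action, which is the per-action attribution described above.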

GDPR: disclosure, minimisation, and the right to a human

GDPR compliance for an AI agent is not a separate workstream bolted on at the end; it shapes the architecture. These are the obligations that bite hardest for an agent handling personal data:

  • Tell people they are talking to a machine. Transparency requires that users know they are interacting with an automated system, and that they are told when automated decision-making is applied to them. Silence is itself a breach.
  • Article 22 and the right to human review. Where the agent makes a solely automated decision with legal or similarly significant effect, the individual has the right to obtain human intervention and to contest the decision. Kiteworks is explicit that this reviewer must hold genuine authority; a review that is "purely notional" does not satisfy the right. Your HITL gates are how you actually deliver this.
  • Data minimisation. Capture only what the task strictly needs. Every "nice-to-have" field the agent collects is liability without a purpose.
  • Storage limitation. Define retention and enforce it with automated deletion, not a manual promise — and log the deletions so you can prove erasure within the windows the law expects.
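The storage-limitation point can be sketched as a scheduled purge that also writes a deletion log. The retention periods, record fields and the decision to refuse records with no policy are all assumptions for illustration; the law does not set these specific numbers.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical retention periods per record kind.
RETENTION = {
    "chat_transcript": timedelta(days=30),
    "support_ticket": timedelta(days=365),
}


def purge_expired(records: list, now: datetime, deletion_log: list) -> list:
    """Return the records still within retention and log every deletion."""
    kept = []
    for rec in records:
        if rec["kind"] not in RETENTION:
            # Assumption: data with no retention policy is an error, not a default.
            raise ValueError(f"no retention policy for kind {rec['kind']!r}")
        if now - rec["created_at"] > RETENTION[rec["kind"]]:
            deletion_log.append({"record_id": rec["id"], "kind": rec["kind"],
                                 "deleted_at": now.isoformat()})
        else:
            kept.append(rec)
    return kept


now = datetime(2025, 3, 3, tzinfo=timezone.utc)
records = [
    {"id": "t1", "kind": "chat_transcript", "created_at": now - timedelta(days=40)},
    {"id": "t2", "kind": "chat_transcript", "created_at": now - timedelta(days=10)},
    {"id": "s1", "kind": "support_ticket", "created_at": now - timedelta(days=40)},
]
deletion_log = []
remaining = purge_expired(records, now, deletion_log)
print([r["id"] for r in remaining])             # ['t2', 's1']
print([d["record_id"] for d in deletion_log])   # ['t1']
```

Raising on an unknown record kind is a deliberate choice in this sketch: it makes "stored indefinitely with no retention policy" a failure you see on day one rather than in an audit.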

Technova's audit of European agent deployments is a sobering reminder of how far the default falls short: it reports that 73% of implementations it reviewed in 2024 carried GDPR compliance vulnerabilities, 47% lacked explicit informed consent before processing personal data, and 39% stored data indefinitely with no retention policy. These are not exotic failures — they are the predictable result of shipping an agent before governing it. We treat them as the baseline checklist to design out, not findings to discover later.

Human oversight is the law from 2026, not just good practice

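If your agent falls into a high-risk category, human oversight under the EU AI Act moves from sensible to mandatory. Under the Act as adopted, the obligations for high-risk systems, including Article 14's human-oversight requirements, apply from 2 August 2026 to the Annex III use cases such as employment, critical infrastructure and biometrics, with high-risk systems built into regulated products following on 2 August 2027. The European Commission's Digital Omnibus proposal of November 2025 would push the Annex III date back to as late as 2 December 2027, so check the current position before you plan around either date. The European Commission's enforcement powers, including fines, also begin to apply from August 2026.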

Article 14 is specific about what oversight means in practice. The people assigned to it must understand the system's limits, be able to monitor it for anomalies, stay alert to automation bias, interpret outputs correctly and, critically, be able to decide not to use the output or to override it. For the most sensitive cases, such as biometric identification under Annex III, the Act requires that no action be taken unless at least two competent, trained people separately verify the result. Under Article 26, deployers must assign that oversight to genuinely competent, authorised individuals. These deployer obligations for high-risk systems have real teeth, and they map almost exactly onto the HITL gates, scoped access and audit trails above. Build the governance well and compliance falls out of it; bolt compliance on afterwards and you will be retrofitting under pressure.
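The two-person rule translates into a small check before any action is released. This is a sketch of the control, not legal advice on what satisfies the Act; the `verifications` record and its field names are hypothetical.

```python
def may_act(verifications: list, required: int = 2) -> bool:
    """Allow the action only if enough distinct trained reviewers confirmed it.

    Each verification is a dict with 'reviewer', 'trained' and 'confirmed'.
    """
    confirming = {
        v["reviewer"]
        for v in verifications
        if v["trained"] and v["confirmed"]
    }
    # A set is used so the same person confirming twice counts once.
    return len(confirming) >= required


checks = [
    {"reviewer": "alice", "trained": True, "confirmed": True},
    {"reviewer": "alice", "trained": True, "confirmed": True},
]
print(may_act(checks))  # False

checks.append({"reviewer": "bob", "trained": True, "confirmed": True})
print(may_act(checks))  # True
```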

The honest version: most agents don't need all of this

An agent that drafts internal summaries, triages your inbox, or answers questions from public documentation carries little of this weight. There, light-touch monitoring and a sensible retention policy are enough, and adding heavy approval gates would only make a useful tool annoying. The governance scales with the stakes — that is the whole point.

The setup matters when the agent touches personal data, moves money, makes decisions that affect people, or acts on systems where a mistake is expensive to undo. If yours does any of that, the path is not to slow it down but to give it an identity, scope what it can reach, gate the consequential actions to a human who can genuinely say no, and log everything so you can prove what happened. Get that right and the agent stops feeling like a gamble and starts feeling like a colleague you can trust with the keys.

Straight answers

Common questions on governing an AI agent

What does human in the loop actually mean for an AI agent?

It means the agent pauses on a consequential action and waits for a person to approve, modify or reject it before it executes. That is distinct from human-on-the-loop, where the agent acts and a person monitors with the power to intervene afterwards. The right pattern depends on how reversible and high-stakes the action is — fund transfers and binding customer decisions warrant a hard gate, routine reversible work does not.

Does GDPR require a human to review what an AI agent decides?

Where the agent makes a decision with legal or similarly significant effect on someone, GDPR Article 22 gives that person the right to request human review. Kiteworks notes the reviewer must have genuine authority — a review that is purely notional does not satisfy the right. You also have to tell people they are interacting with an automated system in the first place; silence on that is itself a transparency breach.

When do the EU AI Act human oversight rules apply?

The human-oversight requirements for high-risk systems under Article 14 apply from 2 August 2026 under the Act as adopted, alongside the Commission's enforcement powers and fines. That date covers the Annex III use cases such as employment and biometrics; high-risk systems built into regulated products follow on 2 August 2027, and the Commission's Digital Omnibus proposal of November 2025 would delay the Annex III date to as late as 2 December 2027. For sensitive biometric identification, the Act requires at least two competent, trained people to separately verify a result before any action is taken.

How do I stop an AI agent from accessing data it shouldn't?

Give the agent its own scoped identity rather than letting it inherit broad integration permissions, and apply attribute-based access control (ABAC) so each request is evaluated against context — which user, what data class, what sensitivity — and granted a narrow token for that task only. Pair this with session isolation, input validation against prompt injection, and output filtering for personal data. The principle is least privilege: it can do exactly what the task needs and nothing more.

Why do I need an audit trail for an AI agent?

Because when something goes wrong, or a regulator or customer asks what happened, the audit trail is the only thing that answers. Build it to attribute every consequential action to a human authoriser, link that authorisation to the specific action, and store it tamper-evident. Technova recommends immutable logs kept for at least twelve months with automated alerts on suspicious access. It is also how you evidence the human oversight the EU AI Act and GDPR expect.

Do all AI agents need this level of governance?

No. An agent that drafts internal summaries or answers from public documentation needs little more than sensible monitoring and a retention policy. The full setup — approval gates, ABAC, audit trails, GDPR controls — matters when the agent touches personal data, moves money, or makes decisions that affect people. Governance should scale with the stakes, not blanket every use case.

Putting an agent into live operations without the sleepless nights

If your agent will touch personal data, money or decisions that affect people, the governance is the part that protects you when something goes wrong — not an extra you add later. We design the identity, approval gates, access scoping and audit trail into the build from day one, so it is compliant by 2 August 2026 because it was built that way, not retrofitted under pressure. If a lighter setup is genuinely all you need, we will tell you that too.