Detect AI-agent prompt injection in 5 seconds, at 1M-row scale

29 May 2026 · deny.sh team · 5 min read


Tonight we shipped bulk decoy tripwires. One CLI command can now arm 100,000 or 1,000,000 row-level canaries across a customer database, then page you when an AI agent leaks one through a prompt-injection path.

AI agents read attacker-controlled text. That is the uncomfortable bit. They read support tickets, inbound email, Slack threads, scraped web pages, CRM notes, docs, comments, issue descriptions, and whatever else the product team connected because the demo looked useful.

If the attacker can get text into that stream, the attacker can try to make the agent exfiltrate the surrounding context. "Ignore previous instructions" is the cartoon version. The real version is quieter: pull the customer row, summarise the support state, include the hidden metadata, call the wrong tool, leak the API-key-shaped value that happened to be nearby.

A handful of honeypots does not cover that surface. The agent is not reading one fake credential in a lab table. It is reading live rows. If the blast radius is every customer record the agent can see, the tripwire surface needs to exist at that same granularity.

What shipped tonight

The new CLI command is deliberately boring:

deny-sh tripwires arm-bulk --type stripe-live-key --count 100000 --out armed.csv

Under the hood it calls POST /v1/decoy-tripwires/bulk. The endpoint accepts up to 1,000 rows per request, writes each batch inside a single db.transaction, and isolates partial failures so one bad row does not poison the rest of the import. That matters when the input is a real customer export, not a toy array.
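The batching shape described above can be sketched client-side. This is a minimal sketch, not the CLI's actual code: the row shape, the BatchResult shape, and the send callback are all assumptions made for illustration. Only the 1,000-row limit comes from the endpoint description.

```typescript
// Sketch only: TripwireRow, BatchResult, and the `send` callback are
// hypothetical shapes, not the documented deny.sh API.
const MAX_ROWS_PER_REQUEST = 1000; // per-request limit stated in the post

interface TripwireRow {
  type: string; // e.g. "stripe-live-key"
  sha256: string; // hex digest of the decoy value
}

interface BatchResult {
  accepted: number;
  rejected: { index: number; reason: string }[]; // index within the batch
}

// Split rows into consecutive batches of at most `size` rows.
function chunk<T>(rows: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new Error("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < rows.length; i += size) {
    batches.push(rows.slice(i, i + size));
  }
  return batches;
}

// Send every batch in order and collect rejected rows without aborting
// the rest of the import. Rejected indexes are mapped back to positions
// in the original input.
async function armAll(
  rows: TripwireRow[],
  send: (batch: TripwireRow[]) => Promise<BatchResult>,
): Promise<{ accepted: number; rejected: { row: number; reason: string }[] }> {
  let accepted = 0;
  let offset = 0;
  const rejected: { row: number; reason: string }[] = [];
  for (const batch of chunk(rows, MAX_ROWS_PER_REQUEST)) {
    const result = await send(batch);
    accepted += result.accepted;
    for (const r of result.rejected) {
      rejected.push({ row: offset + r.index, reason: r.reason });
    }
    offset += batch.length;
  }
  return { accepted, rejected };
}
```

With 2,500 input rows, chunk produces three batches of 1,000, 1,000, and 500 rows, so a bad row in the second batch is reported with its original position and the third batch still goes out.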

The SDK piece is generateDecoyWithHash(type, opts?). It returns { value, sha256 } for all 69 DecoyType values, so the caller can store the decoy value wherever the agent might read it and arm the hash with deny.sh. No plaintext decoy has to be sent back to the server to register the tripwire.
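To make the value-plus-hash idea concrete, here is a rough stand-in written against Node's built-in crypto module. It is not the SDK's implementation: the function names, the "sk_live_" prefix, and the 24-character body are assumptions chosen to look plausible. The only behaviour taken from the post is the returned { value, sha256 } pair.

```typescript
import { createHash, randomBytes } from "node:crypto";

// Hypothetical stand-in for the SDK call; names and key shape are assumptions.
interface DecoyWithHash {
  value: string;
  sha256: string;
}

const BASE62 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";

// Random base62 string. Bytes >= 248 (62 * 4) are discarded so every
// character is equally likely (rejection sampling, no modulo bias).
function randomBase62(length: number): string {
  let out = "";
  while (out.length < length) {
    const bytes = randomBytes(length);
    for (let i = 0; i < bytes.length && out.length < length; i++) {
      const byte = bytes.readUInt8(i);
      if (byte < 248) {
        out += BASE62.charAt(byte % 62);
      }
    }
  }
  return out;
}

// Hex-encoded SHA-256 of a UTF-8 string.
function sha256Hex(value: string): string {
  return createHash("sha256").update(value, "utf8").digest("hex");
}

// Assumed shape: "sk_live_" followed by 24 base62 characters.
function sketchStripeLiveDecoy(): DecoyWithHash {
  const value = "sk_live_" + randomBase62(24);
  return { value, sha256: sha256Hex(value) };
}
```

The caller would write value into the decoy column and register only sha256, which is what lets the plaintext decoy stay on the caller's side.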

The CLI wraps that into the operator flow: generate decoys, register hashes, and write a CSV with the armed values and their SHA-256 hashes. We sustained roughly 500 tripwires per second in a real CLI run tonight, against the live path.
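Those two numbers, 1,000 rows per request and roughly 500 tripwires per second, are enough for back-of-envelope planning. The helper below is a hypothetical estimator, and it assumes the single observed rate holds for larger runs and that every request is filled to the limit, neither of which the post guarantees.

```typescript
// Figures from the post; treating them as constants is an assumption.
const ROWS_PER_REQUEST = 1000; // bulk endpoint limit
const OBSERVED_RATE = 500; // tripwires per second in one observed CLI run

// Estimate request count and wall-clock seconds for arming `count` tripwires.
function estimateBulkArm(
  count: number,
  ratePerSecond: number = OBSERVED_RATE,
): { requests: number; seconds: number } {
  if (count < 0 || ratePerSecond <= 0) {
    throw new Error("count must be >= 0 and ratePerSecond must be > 0");
  }
  return {
    requests: Math.ceil(count / ROWS_PER_REQUEST),
    seconds: count / ratePerSecond,
  };
}
```

Under those assumptions, 100,000 tripwires is 100 requests and about 200 seconds, and 1,000,000 is 1,000 requests and about 2,000 seconds, or a little over half an hour.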

A worked example

Say you run a B2B SaaS with 250,000 customer rows. Each row has a real encrypted Stripe key, because the product charges customer-owned accounts. Your support agent can read customer rows when it handles billing tickets. That agent also reads inbound ticket text. Congratulations, you now have a prompt-injection problem with a database behind it.

You add one decoy column next to the real Stripe credential field. The value has the shape of a Stripe live key, but it is generated as a deny.sh tripwire. Then you arm one tripwire per row:

deny-sh tripwires arm-bulk \
  --type stripe-live-key \
  --count 250000 \
  --out customer-stripe-tripwires.csv

Import the generated decoys into the new column. Your app does not need a new service, a sidecar, a queue, or a second agent runtime. The only rule is that any path which tries to decrypt, inspect, or validate that decoy-shaped value should go through the same deny.sh check path you already use for tripwires.
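That rule can be pictured as a guard in front of whatever handles credential-shaped values. The sketch below uses a local Set of armed hashes purely so it runs offline. In a real deployment the lookup would be the deny.sh check path, and the CSV column name "sha256" and the function names here are assumptions, not documented output.

```typescript
import { createHash } from "node:crypto";

// Assumption: the CSV has a header row with a "sha256" column and no
// quoted or comma-containing cells.
function loadArmedHashes(csv: string): Set<string> {
  const lines = csv.split(/\r?\n/).filter((line) => line.trim() !== "");
  const header = (lines[0] ?? "").split(",").map((h) => h.trim());
  const column = header.indexOf("sha256");
  if (column === -1) {
    throw new Error('CSV has no "sha256" column');
  }
  const hashes = new Set<string>();
  for (const line of lines.slice(1)) {
    const cell = line.split(",")[column];
    if (cell !== undefined) {
      hashes.add(cell.trim().toLowerCase());
    }
  }
  return hashes;
}

// True when the candidate's SHA-256 matches an armed decoy hash.
function isArmedDecoy(candidate: string, armed: Set<string>): boolean {
  const digest = createHash("sha256").update(candidate, "utf8").digest("hex");
  return armed.has(digest);
}

// Run the check before the real handler ever touches the value.
// `onTripwire` is a hypothetical hook standing in for the alerting path.
function guardedUse<T>(
  candidate: string,
  armed: Set<string>,
  handler: (value: string) => T,
  onTripwire: (value: string) => void,
): T | null {
  if (isArmedDecoy(candidate, armed)) {
    onTripwire(candidate);
    return null;
  }
  return handler(candidate);
}
```

The design point is ordering: the check runs first, so a decoy never reaches the decrypt or validate logic, and a real value passes through unchanged.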

When one fires, your Slack alert is meant to be blunt:

[deny.sh] decoy tripwire fired
type: stripe-live-key
tenant: acme-prod
row: customer_81277
detected: 5s
source: /api/decrypt
action: rotate adjacent real credential and inspect agent transcript

The point is not that the decoy stops the first read. The point is that the read becomes visible fast enough to do something before the incident has turned into a week of archaeology.

Why this is for agents

Traditional secret scanning mostly assumes the leak already happened. A key landed in GitHub, a log aggregator, a paste, a ticket, or a model transcript. Useful, but late.

Agent prompt injection has a different shape. The attacker tries to make the agent fetch a whole row, or a whole object graph, then leak it through a tool call or a response path. If that row includes a decoy column, the attacker probably asks for it too, because they do not know which fields are real.

That gives you a tripwire that lives inside the same context the attacker wants. When the decoy value hits /api/decrypt, deny.sh can fire through the existing webhook adapters, Slack, PagerDuty, or Datadog, in roughly five seconds. The infrastructure already exists. Bulk arming just makes the coverage match the database instead of the demo.

Tier caps

The caps are meant to map to real operating shapes, not to a marketing slide. Pro can test this on a serious table. Scale covers a production agent. Agents Infra is for row-level coverage across a large customer dataset. Full details live on /pricing.

Tier	Tripwire cap
Pro	10,000
Scale	100,000
Agents Infra	1,000,000

Enterprise accounts can go higher. The current ceiling is 10,000,000 tripwires where the account shape warrants it. That is not the default path. It is the "yes, every row in the production customer database" path.
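For sizing, the caps reduce to a lookup. The helper below is a hypothetical convenience, not part of the CLI or SDK. The numbers are the ones in the table and the Enterprise ceiling above, and the one-tripwire-per-row assumption matches the worked example.

```typescript
// Caps copied from the table; the helper itself is hypothetical.
const TIER_CAPS: { tier: string; cap: number }[] = [
  { tier: "Pro", cap: 10000 },
  { tier: "Scale", cap: 100000 },
  { tier: "Agents Infra", cap: 1000000 },
];
const ENTERPRISE_CEILING = 10000000;

// Smallest tier whose cap covers one tripwire per row, or null when the
// row count exceeds the current Enterprise ceiling.
function smallestTierFor(rows: number): string | null {
  if (!Number.isInteger(rows) || rows < 0) {
    throw new Error("rows must be a non-negative integer");
  }
  for (const { tier, cap } of TIER_CAPS) {
    if (rows <= cap) {
      return tier;
    }
  }
  return rows <= ENTERPRISE_CEILING ? "Enterprise" : null;
}
```

The 250,000-row table from the worked example lands on Agents Infra, since it is over the Scale cap of 100,000 and under 1,000,000.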

Use it

If you are already building agents on deny.sh, start with the tripwire section and arm a small table first. Then do the dull thing: export rows, generate decoys, write the CSV back, and put the decoy in a field the agent will read if an attacker tricks it into reading too much.

This is not a silver bullet for prompt injection. Nothing is. It is a cheap detection surface that scales with the data the agent is allowed to touch. That is the bit most agent security plans are still missing.
