I tested 5 LLMs for prompt-injection leaks. Same code, 0% to 90%.

DevJune 18, 20261 min read

I built a scanner that fires prompt-injection probes at a self-hosted AI agent and checks whether it leaks (a) real secret-shaped strings (API keys) or (b) the content of its own system prompt. Then I ran the same agent across 5 model backends. The leak rate ranged from 0% to 90% depending only on the model. Here's what I found and how it works. Why this matters now Prompt injection is #1 on the OWASP 2025 LLM Top 10. It's not theoretical anymore: EchoLeak (CVE-2025-32711,

Story Overview

I built a scanner that fires prompt-injection probes at a self-hosted AI agent and checks whether it leaks (a) real secret-shaped strings (API keys) or (b) the content of its own system prompt. Then I ran the same agent across 5 model backends. The leak rate ranged from 0% to 90% depending only on the model.
Here's what I found and how it works.
Why this matters now
Prompt injection is #1 on the OWASP 2025 LLM Top 10. It's not theoretical anymore:

EchoLeak (CVE-2025-32711,

Dev

dev.to

Read Full Story on Dev.to

← Back to Latest News