Benchmark Coverage Comparison

Auto-generated comparison of ZIRAN's attack vector library against published AI agent security benchmarks.

Last updated: 2026-03-21

Executive Summary

  • 565 attack vectors across 11 attack categories
  • 90.0% OWASP LLM Top 10 coverage (9/10 categories)
  • 10 multi-turn jailbreak tactics, 12 encoding types
  • 219 multi-turn vectors
  • 11 harm categories (AgentHarm-aligned)
  • Gap closure: 34.8% (8/23 gaps closed)

OWASP LLM Top 10 Coverage

| Code | Category | Vectors | Status |
| --- | --- | --- | --- |
| LLM01 | Prompt Injection | 434 | ✅ Comprehensive |
| LLM02 | Insecure Output Handling | 194 | ✅ Comprehensive |
| LLM03 | Training Data Poisoning | 15 | ✅ Strong |
| LLM04 | Model Denial of Service | 12 | ✅ Strong |
| LLM05 | Supply Chain Vulnerabilities | 7 | 🔶 Moderate |
| LLM06 | Sensitive Information Disclosure | 95 | ✅ Comprehensive |
| LLM07 | Insecure Plugin Design | 136 | ✅ Comprehensive |
| LLM08 | Excessive Agency | 139 | ✅ Comprehensive |
| LLM09 | Overreliance | 15 | ✅ Strong |
| LLM10 | Model Theft | | 🚧 Planned |

Not covered: LLM10
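
The coverage figure can be reproduced from the per-category counts. The following is a minimal sketch, not the generator's actual code: the dictionary name is hypothetical, and the blank LLM10 cell is assumed to mean zero vectors.

```python
# Vector counts per OWASP category, copied from the table above.
# Assumption: the blank LLM10 cell is treated as 0 vectors.
owasp_vectors = {
    "LLM01": 434, "LLM02": 194, "LLM03": 15, "LLM04": 12, "LLM05": 7,
    "LLM06": 95, "LLM07": 136, "LLM08": 139, "LLM09": 15, "LLM10": 0,
}

# A category counts as covered when at least one vector maps to it.
covered = [code for code, n in owasp_vectors.items() if n > 0]
not_covered = [code for code, n in owasp_vectors.items() if n == 0]
coverage_pct = 100.0 * len(covered) / len(owasp_vectors)

print(f"{coverage_pct:.1f}% ({len(covered)}/{len(owasp_vectors)} categories)")
print("Not covered:", ", ".join(not_covered))
```

The per-category counts sum to 1,047, more than the 565 vectors in the library, which suggests that a single vector can map to several OWASP categories.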

Benchmark Comparison

| Benchmark | Venue | Dimension | Target | ZIRAN | Progress | Status | Gap |
| --- | --- | --- | --- | --- | --- | --- | --- |
| AgentHarm | ICLR 2025 | Harm categories | 11 | 11 | ███████████████ 100.0% | ✅ closed | GAP-06 |
| AgentHarm | ICLR 2025 | Multi-step vectors | 440 | 161 | █████░░░░░░░░░░ 36.6% | 🚧 open | GAP-23 |
| InjecAgent | ACL 2024 | Indirect injection vectors | 1,054 | 50 | █░░░░░░░░░░░░░░ 4.7% | 🚧 open | GAP-02 |
| AgentDojo | NeurIPS 2024 | Indirect injection vectors | 629 | 50 | █░░░░░░░░░░░░░░ 7.9% | 🚧 open | GAP-02 |
| | | Utility measurement (baseline + post-attack) | 1 | 1 | ███████████████ 100.0% | | |
| HarmBench | ICML 2024 | Attack tactics | 18 | 10 | ████████░░░░░░░ 55.6% | ✅ closed | GAP-08 |
| | | Jailbreak vectors | 510 | 175 | █████░░░░░░░░░░ 34.3% | | |
| JailbreakBench | NeurIPS 2024 | JBB categories (10) | 10 | 10 | ███████████████ 100.0% | ✅ closed | GAP-15 |
| | | Prompt injection vectors | 100 | 175 | ███████████████ 100% | | |
| StrongREJECT | 2024 | StrongREJECT composite formula | 1 | 1 | ███████████████ 100.0% | ✅ closed | GAP-04 |
| | | Scoring dimensions (refusal, specificity, convincingness) | 3 | 3 | ███████████████ 100.0% | | |
| MCPTox | 2025 | MCP vectors | 1,312 | 101 | █░░░░░░░░░░░░░░ 7.7% | 🚧 open | GAP-03 |
| Agent Security Bench (ASB) | 2024 | Attack categories | 10 | 11 | ███████████████ 100% | 🚧 open | GAP-01 |
| | | Total vectors | 400 | 565 | ███████████████ 100% | | |
| | | Utility-under-attack measurement | 1 | 1 | ███████████████ 100.0% | | |
| TensorTrust | 2024 | Prompt injection vectors | 126,000 | 175 | ░░░░░░░░░░░░░░░ 0.1% | 🚧 open | GAP-16 |
| WildJailbreak | 2024 | Jailbreak tactics | 105,000 | 11 | ░░░░░░░░░░░░░░░ 0.0% | 🚧 open | GAP-17 |
| LLMail-Inject | 2024 | RAG injection vectors | | 0 | Not yet implemented | 🚧 open | GAP-13 |
| Agent-SafetyBench | 2024 | Business impact types | 8 | 7 | █████████████░░ 87.5% | 🚧 open | GAP-07 |
| BIPIA | 2024 | Indirect injection vectors | | 50 | Multi-domain benchmark; no fixed target count | 🚧 open | GAP-02 |
| CyberSecEval | Meta, 2024 | Total vectors | | 565 | Multi-category benchmark; partial overlap | 🚧 open | GAP-18 |
| ToolEmu | 2024 | Tool manipulation vectors | 144 | 159 | ███████████████ 100% | 🚧 open | GAP-19 |
| R-Judge | 2024 | R-Judge risk types (10) | 10 | 10 | ███████████████ 100.0% | ✅ closed | GAP-20 |
| | | Risk scoring detectors | | 5 | 5 detectors; a different approach from interaction records | | |
| AILuminate | MLCommons, 2025 | Resilience gap metric | | 0 | Not yet implemented | 🚧 open | GAP-09 |
| ALERT | 2024 | ALERT micro categories (32) | 32 | 32 | ███████████████ 100.0% | ✅ closed | GAP-21 |
| | | Harm categories | | 11 | N/A | | |
| MITRE ATLAS | MITRE, 2025 | Attack categories vs tactics | 15 | 11 | ███████████░░░░ 73.3% | 🚧 open | GAP-22 |
| | | ATLAS technique mapping | | 0 | No atlas_mapping field yet; mapping planned | | |
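
The Progress cells appear to be a 15-cell bar followed by the ZIRAN count as a percentage of the target, capped at 100% when ZIRAN exceeds the target. The sketch below reproduces that rendering under stated assumptions: `progress_cell` is a hypothetical helper rather than the generator's function, and rounding the filled cells to the nearest integer is inferred from the rows above. It does not handle rows without a target, and it always prints one decimal place, whereas the capped rows above show `100%`.

```python
def progress_cell(target: int, actual: int, width: int = 15) -> str:
    """Render a capped progress bar plus percentage (hypothetical helper)."""
    ratio = min(actual / target, 1.0)   # cap at 100% when actual exceeds target
    filled = round(ratio * width)       # assumed rule: round to the nearest cell
    bar = "█" * filled + "░" * (width - filled)
    return f"{bar} {ratio * 100:.1f}%"

print(progress_cell(440, 161))  # AgentHarm multi-step vectors
print(progress_cell(18, 10))    # HarmBench attack tactics
```

With these inputs the sketch prints `█████░░░░░░░░░░ 36.6%` and `████████░░░░░░░ 55.6%`, matching the corresponding rows.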

Gap Status Dashboard

See Gap Analysis for full details.

| ID | Gap | Priority | Issue | Status |
| --- | --- | --- | --- | --- |
| GAP-01 | Benchmark harness | critical | #32 | 🚧 open |
| GAP-02 | Indirect prompt injection scale | critical | #33 | 🚧 open |
| GAP-03 | MCP tool poisoning | critical | #34 | 🚧 open |
| GAP-04 | Quality-aware jailbreak scoring | critical | #35 | ✅ closed |
| GAP-05 | Utility-under-attack measurement | important | #36 | ✅ closed |
| GAP-06 | Harmful multi-step task testing | important | #37 | ✅ closed |
| GAP-07 | Business impact categorization | important | #38 | 🚧 open |
| GAP-08 | Jailbreak tactic breadth | important | #39 | ✅ closed |
| GAP-09 | Resilience gap metric | important | #40 | 🚧 open |
| GAP-10 | OWASP LLM04 (Model DoS) | lower | #41 | ✅ closed |
| GAP-11 | OWASP LLM05 (Supply Chain) | lower | #42 | 🚧 open |
| GAP-12 | OWASP LLM10 (Model Theft) | lower | #43 | 🚧 open |
| GAP-13 | RAG-specific poisoning | lower | #44 | 🚧 open |
| GAP-14 | Defense evasion measurement | lower | #45 | 🚧 open |
| GAP-15 | JailbreakBench coverage | lower | #54 | ✅ closed |
| GAP-16 | TensorTrust coverage | lower | #55 | 🚧 open |
| GAP-17 | WildJailbreak coverage | lower | #56 | 🚧 open |
| GAP-18 | CyberSecEval coverage | lower | #57 | 🚧 open |
| GAP-19 | ToolEmu coverage | lower | #58 | 🚧 open |
| GAP-20 | R-Judge coverage | lower | #59 | ✅ closed |
| GAP-21 | ALERT coverage | lower | #60 | ✅ closed |
| GAP-22 | MITRE ATLAS technique mapping | important | #61 | 🚧 open |
| GAP-23 | AgentHarm multi-step vector scale | important | #131 | 🚧 open |
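
The gap-closure figure in the Executive Summary follows from a tally of this dashboard. A minimal sketch, assuming the rows are held as `(id, priority, status)` tuples; the `GAPS` name and layout are illustrative and not taken from the generator:

```python
from collections import Counter

# (id, priority, status) rows copied from the dashboard above.
GAPS = [
    ("GAP-01", "critical", "open"),    ("GAP-02", "critical", "open"),
    ("GAP-03", "critical", "open"),    ("GAP-04", "critical", "closed"),
    ("GAP-05", "important", "closed"), ("GAP-06", "important", "closed"),
    ("GAP-07", "important", "open"),   ("GAP-08", "important", "closed"),
    ("GAP-09", "important", "open"),   ("GAP-10", "lower", "closed"),
    ("GAP-11", "lower", "open"),       ("GAP-12", "lower", "open"),
    ("GAP-13", "lower", "open"),       ("GAP-14", "lower", "open"),
    ("GAP-15", "lower", "closed"),     ("GAP-16", "lower", "open"),
    ("GAP-17", "lower", "open"),       ("GAP-18", "lower", "open"),
    ("GAP-19", "lower", "open"),       ("GAP-20", "lower", "closed"),
    ("GAP-21", "lower", "closed"),     ("GAP-22", "important", "open"),
    ("GAP-23", "important", "open"),
]

# Count all gaps and closed gaps per priority level.
total_by_priority = Counter(priority for _, priority, _ in GAPS)
closed_by_priority = Counter(p for _, p, status in GAPS if status == "closed")
closed = sum(closed_by_priority.values())

print(f"Gap closure: {100 * closed / len(GAPS):.1f}% ({closed}/{len(GAPS)} gaps closed)")
for priority in ("critical", "important", "lower"):
    print(f"{priority}: {closed_by_priority[priority]}/{total_by_priority[priority]} closed")
```

The first line printed is `Gap closure: 34.8% (8/23 gaps closed)`. The per-priority breakdown shows that only one of the four critical gaps (GAP-04) is closed.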

Vector Inventory

By Attack Category

| Category | Vectors |
| --- | --- |
| prompt_injection | 175 |
| tool_manipulation | 159 |
| indirect_injection | 50 |
| data_exfiltration | 49 |
| privilege_escalation | 35 |
| system_prompt_extraction | 25 |
| authorization_bypass | 17 |
| memory_poisoning | 17 |
| chain_of_thought_manipulation | 15 |
| model_dos | 12 |
| multi_agent | 11 |

By Tactic

| Tactic | Vectors |
| --- | --- |
| single | 346 |
| context_buildup | 62 |
| crescendo | 36 |
| persona_shift | 20 |
| hypothetical | 16 |
| distraction | 15 |
| code_mode | 14 |
| few_shot | 14 |
| language_switch | 14 |
| refusal_suppression | 14 |
| role_play | 14 |

By Severity

| Severity | Vectors |
| --- | --- |
| critical | 358 |
| high | 159 |
| medium | 48 |

By Harm Category

| Harm Category | Vectors |
| --- | --- |
| child_exploitation | 13 |
| cybercrime | 13 |
| disinformation | 13 |
| fraud | 14 |
| harassment | 21 |
| illegal_services | 13 |
| self_harm | 14 |
| sexual_content | 14 |
| substance_abuse | 17 |
| terrorism | 14 |
| weapons | 15 |
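
The four breakdowns above can be cross-checked against the headline figures. The sketch below is an illustrative consistency check with hypothetical variable names. It assumes that every tactic other than `single` counts as multi-turn, which is consistent with the 219 multi-turn vectors reported in the Executive Summary.

```python
# Counts copied from the inventory tables above.
by_category = {
    "prompt_injection": 175, "tool_manipulation": 159, "indirect_injection": 50,
    "data_exfiltration": 49, "privilege_escalation": 35,
    "system_prompt_extraction": 25, "authorization_bypass": 17,
    "memory_poisoning": 17, "chain_of_thought_manipulation": 15,
    "model_dos": 12, "multi_agent": 11,
}
by_tactic = {
    "single": 346, "context_buildup": 62, "crescendo": 36, "persona_shift": 20,
    "hypothetical": 16, "distraction": 15, "code_mode": 14, "few_shot": 14,
    "language_switch": 14, "refusal_suppression": 14, "role_play": 14,
}
by_severity = {"critical": 358, "high": 159, "medium": 48}
by_harm = {
    "child_exploitation": 13, "cybercrime": 13, "disinformation": 13,
    "fraud": 14, "harassment": 21, "illegal_services": 13, "self_harm": 14,
    "sexual_content": 14, "substance_abuse": 17, "terrorism": 14, "weapons": 15,
}

total = sum(by_category.values())
# Category, tactic, and severity each partition the full library.
assert total == sum(by_tactic.values()) == sum(by_severity.values())

# Assumption: every tactic except "single" is a multi-turn tactic.
multi_turn = total - by_tactic["single"]
harm_tagged = sum(by_harm.values())

print(total, multi_turn, harm_tagged)
```

This prints `565 219 161`. The harm-category total of 161 equals the ZIRAN count in the AgentHarm multi-step row of the Benchmark Comparison, so only a subset of the 565 vectors carries a harm category.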

Generated by benchmarks/generate_all.py on 2026-03-21.