Benchmark Coverage Comparison

Auto-generated comparison of ZIRAN's attack vector library against published AI agent security benchmarks.

Last updated: 2026-03-21

Executive Summary

565 attack vectors across 11 attack categories
90.0% OWASP LLM Top 10 coverage (9/10 categories)
10 multi-turn jailbreak tactics, 12 encoding types
219 multi-turn vectors
11 harm categories (AgentHarm-aligned)
Gap closure: 34.8% (8/23 gaps closed)

OWASP LLM Top 10 Coverage

Code	Category	Vectors	Status
LLM01	Prompt Injection	434	Comprehensive
LLM02	Insecure Output Handling	194	Comprehensive
LLM03	Training Data Poisoning	15	Strong
LLM04	Model Denial of Service	12	Strong
LLM05	Supply Chain Vulnerabilities	7	Moderate
LLM06	Sensitive Information Disclosure	95	Comprehensive
LLM07	Insecure Plugin Design	136	Comprehensive
LLM08	Excessive Agency	139	Comprehensive
LLM09	Overreliance	15	Strong
LLM10	Model Theft	—	Planned

Not covered: LLM10 (Model Theft)
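
The 90.0% headline figure can be reproduced from the table. The following is a minimal sketch, assuming a category counts as covered when it has at least one vector (the report does not state the rule explicitly). The names `owasp_vectors` and `coverage` are illustrative and are not part of ZIRAN.

```python
# Vector counts per OWASP code, copied from the table above.
# None marks a category with no vectors yet (shown as a dash in the table).
owasp_vectors = {
    "LLM01": 434, "LLM02": 194, "LLM03": 15, "LLM04": 12, "LLM05": 7,
    "LLM06": 95, "LLM07": 136, "LLM08": 139, "LLM09": 15, "LLM10": None,
}

def coverage(vectors):
    """Return (covered, total, percent), assuming covered means >= 1 vector."""
    covered = sum(1 for n in vectors.values() if n)
    total = len(vectors)
    return covered, total, round(100 * covered / total, 1)

covered, total, pct = coverage(owasp_vectors)
not_covered = [code for code, n in owasp_vectors.items() if not n]
print(f"{pct:.1f}% ({covered}/{total}); not covered: {', '.join(not_covered)}")
```

The per-category counts sum to 1,047, more than the 565 vectors in the library, so a single vector evidently counts toward more than one OWASP category.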

Benchmark Comparison

Benchmark	Venue	Dimension	Target	ZIRAN	Progress	Status	Gap
AgentHarm	ICLR 2025	Harm categories	11	11	`███████████████` 100.0%	closed	GAP-06
AgentHarm	ICLR 2025	Multi-step vectors	440	161	`█████░░░░░░░░░░` 36.6%	open	GAP-23
InjecAgent	ACL 2024	Indirect injection vectors	1,054	50	`█░░░░░░░░░░░░░░` 4.7%	open	GAP-02
AgentDojo	NeurIPS 2024	Indirect injection vectors	629	50	`█░░░░░░░░░░░░░░` 7.9%	open	GAP-02
		Utility measurement (baseline + post-attack)	1	1	`███████████████` 100.0%
HarmBench	ICML 2024	Attack tactics	18	10	`████████░░░░░░░` 55.6%	closed	GAP-08
		Jailbreak vectors	510	175	`█████░░░░░░░░░░` 34.3%
JailbreakBench	NeurIPS 2024	JBB categories (10)	10	10	`███████████████` 100.0%	closed	GAP-15
		Prompt injection vectors	100	175	`███████████████` 100%
StrongREJECT	2024	StrongREJECT composite formula	1	1	`███████████████` 100.0%	closed	GAP-04
		Scoring dimensions (refusal, specificity, convincingness)	3	3	`███████████████` 100.0%
MCPTox	2025	MCP vectors	1,312	101	`█░░░░░░░░░░░░░░` 7.7%	open	GAP-03
Agent Security Bench (ASB)	2024	Attack categories	10	11	`███████████████` 100%	open	GAP-01
		Total vectors	400	565	`███████████████` 100%
		Utility-under-attack measurement	1	1	`███████████████` 100.0%
TensorTrust	2024	Prompt injection vectors	126,000	175	`░░░░░░░░░░░░░░░` 0.1%	open	GAP-16
WildJailbreak	2024	Jailbreak tactics	105,000	11	`░░░░░░░░░░░░░░░` 0.0%	open	GAP-17
LLMail-Inject	2024	RAG injection vectors	—	0	Not yet implemented	open	GAP-13
Agent-SafetyBench	2024	Business impact types	8	7	`█████████████░░` 87.5%	open	GAP-07
BIPIA	2024	Indirect injection vectors	—	50	Multi-domain benchmark — no fixed target count	open	GAP-02
CyberSecEval	Meta, 2024	Total vectors	—	565	Multi-category benchmark — partial overlap	open	GAP-18
ToolEmu	2024	Tool manipulation vectors	144	159	`███████████████` 100%	open	GAP-19
R-Judge	2024	R-Judge risk types (10)	10	10	`███████████████` 100.0%	closed	GAP-20
		Risk scoring detectors	—	5	5 detectors, a different approach from R-Judge's interaction records
AILuminate	MLCommons, 2025	Resilience gap metric	—	0	Not yet implemented	open	GAP-09
ALERT	2024	ALERT micro categories (32)	32	32	`███████████████` 100.0%	closed	GAP-21
		Harm categories	—	11	N/A
MITRE ATLAS	MITRE, 2025	Attack categories vs tactics	15	11	`███████████░░░░` 73.3%	open	GAP-22
		ATLAS technique mapping	—	0	No atlas_mapping field yet — mapping planned
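
The Progress cells above appear to follow a simple rule: the ZIRAN/Target ratio, capped at 100%, drawn as a 15-character bar. The sketch below reproduces the cells shown in the table. The rounding rule and the `100%` label for rows that exceed their target are inferred from the table itself, not taken from `benchmarks/generate_all.py`, and `progress_cell` is a hypothetical name.

```python
BAR_WIDTH = 15  # the bars in the table are 15 characters wide

def progress_cell(ziran: int, target: int) -> str:
    """Render a progress cell in the style of the table above."""
    ratio = min(ziran / target, 1.0)       # cap at 100% when ZIRAN exceeds the target
    filled = int(ratio * BAR_WIDTH + 0.5)  # round half up to the nearest block
    bar = "█" * filled + "░" * (BAR_WIDTH - filled)
    # Inferred from the table: rows above target print "100%" without a decimal.
    label = "100%" if ziran > target else f"{ratio * 100:.1f}%"
    return f"`{bar}` {label}"

print(progress_cell(161, 440))   # AgentHarm multi-step vectors
print(progress_cell(50, 1054))   # InjecAgent indirect injection vectors
print(progress_cell(565, 400))   # ASB total vectors
```

Rows without a fixed target (for example BIPIA or CyberSecEval) carry a note instead of a bar, so they are out of scope for this function.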

Gap Status Dashboard

See Gap Analysis for full details.

ID	Gap	Priority	Issue	Status
GAP-01	Benchmark harness	critical	#32	open
GAP-02	Indirect prompt injection scale	critical	#33	open
GAP-03	MCP tool poisoning	critical	#34	open
GAP-04	Quality-aware jailbreak scoring	critical	#35	closed
GAP-05	Utility-under-attack measurement	important	#36	closed
GAP-06	Harmful multi-step task testing	important	#37	closed
GAP-07	Business impact categorization	important	#38	open
GAP-08	Jailbreak tactic breadth	important	#39	closed
GAP-09	Resilience gap metric	important	#40	open
GAP-10	OWASP LLM04 (Model DoS)	lower	#41	closed
GAP-11	OWASP LLM05 (Supply Chain)	lower	#42	open
GAP-12	OWASP LLM10 (Model Theft)	lower	#43	open
GAP-13	RAG-specific poisoning	lower	#44	open
GAP-14	Defense evasion measurement	lower	#45	open
GAP-15	JailbreakBench coverage	lower	#54	closed
GAP-16	TensorTrust coverage	lower	#55	open
GAP-17	WildJailbreak coverage	lower	#56	open
GAP-18	CyberSecEval coverage	lower	#57	open
GAP-19	ToolEmu coverage	lower	#58	open
GAP-20	R-Judge coverage	lower	#59	closed
GAP-21	ALERT coverage	lower	#60	closed
GAP-22	MITRE ATLAS technique mapping	important	#61	open
GAP-23	AgentHarm multi-step vector scale	important	#131	open
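
The dashboard can be tallied by priority to see where the open gaps sit. This is an illustrative sketch only: the rows are copied from the table above, and `GAPS` and `closure_by_priority` are hypothetical names, not ZIRAN APIs.

```python
from collections import Counter

# (gap id, priority, status) rows copied from the dashboard above.
GAPS = [
    ("GAP-01", "critical", "open"), ("GAP-02", "critical", "open"),
    ("GAP-03", "critical", "open"), ("GAP-04", "critical", "closed"),
    ("GAP-05", "important", "closed"), ("GAP-06", "important", "closed"),
    ("GAP-07", "important", "open"), ("GAP-08", "important", "closed"),
    ("GAP-09", "important", "open"), ("GAP-10", "lower", "closed"),
    ("GAP-11", "lower", "open"), ("GAP-12", "lower", "open"),
    ("GAP-13", "lower", "open"), ("GAP-14", "lower", "open"),
    ("GAP-15", "lower", "closed"), ("GAP-16", "lower", "open"),
    ("GAP-17", "lower", "open"), ("GAP-18", "lower", "open"),
    ("GAP-19", "lower", "open"), ("GAP-20", "lower", "closed"),
    ("GAP-21", "lower", "closed"), ("GAP-22", "important", "open"),
    ("GAP-23", "important", "open"),
]

def closure_by_priority(gaps):
    """Return {priority: (closed, total)} plus an 'all' entry."""
    total = Counter(priority for _, priority, _ in gaps)
    closed = Counter(priority for _, priority, status in gaps if status == "closed")
    summary = {priority: (closed[priority], total[priority]) for priority in total}
    summary["all"] = (sum(closed.values()), len(gaps))
    return summary

for priority, (closed, total) in closure_by_priority(GAPS).items():
    print(f"{priority:<9} {closed}/{total} closed ({100 * closed / total:.1f}%)")
```

The `all` row gives 8/23, the 34.8% gap closure quoted in the Executive Summary. Only one of the four critical gaps is closed.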

Vector Inventory

By Attack Category

Category	Vectors
prompt_injection	175
tool_manipulation	159
indirect_injection	50
data_exfiltration	49
privilege_escalation	35
system_prompt_extraction	25
authorization_bypass	17
memory_poisoning	17
chain_of_thought_manipulation	15
model_dos	12
multi_agent	11

By Tactic

Tactic	Vectors
single	346
context_buildup	62
crescendo	36
persona_shift	20
hypothetical	16
distraction	15
code_mode	14
few_shot	14
language_switch	14
refusal_suppression	14
role_play	14

By Severity

Severity	Vectors
critical	358
high	159
medium	48

By Harm Category

Harm Category	Vectors
child_exploitation	13
cybercrime	13
disinformation	13
fraud	14
harassment	21
illegal_services	13
self_harm	14
sexual_content	14
substance_abuse	17
terrorism	14
weapons	15
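
The four breakdowns above can be cross-checked against the headline figures. A minimal sketch, with the counts copied from the tables; `check_inventory` and the dictionary names are illustrative and do not come from the ZIRAN codebase.

```python
TOTAL_VECTORS = 565  # headline figure from the Executive Summary

by_category = {
    "prompt_injection": 175, "tool_manipulation": 159, "indirect_injection": 50,
    "data_exfiltration": 49, "privilege_escalation": 35,
    "system_prompt_extraction": 25, "authorization_bypass": 17,
    "memory_poisoning": 17, "chain_of_thought_manipulation": 15,
    "model_dos": 12, "multi_agent": 11,
}
by_tactic = {
    "single": 346, "context_buildup": 62, "crescendo": 36, "persona_shift": 20,
    "hypothetical": 16, "distraction": 15, "code_mode": 14, "few_shot": 14,
    "language_switch": 14, "refusal_suppression": 14, "role_play": 14,
}
by_severity = {"critical": 358, "high": 159, "medium": 48}
by_harm = {
    "child_exploitation": 13, "cybercrime": 13, "disinformation": 13,
    "fraud": 14, "harassment": 21, "illegal_services": 13, "self_harm": 14,
    "sexual_content": 14, "substance_abuse": 17, "terrorism": 14, "weapons": 15,
}

def check_inventory():
    """Assert the full breakdowns sum to the total, then return derived figures."""
    for name, table in [("category", by_category), ("tactic", by_tactic),
                        ("severity", by_severity)]:
        assert sum(table.values()) == TOTAL_VECTORS, f"{name} breakdown mismatch"
    return {
        # Every tactic other than "single" is a multi-turn tactic.
        "multi_turn": TOTAL_VECTORS - by_tactic["single"],
        # Only a subset of vectors carries a harm category.
        "harm_tagged": sum(by_harm.values()),
    }

print(check_inventory())
```

The category, tactic, and severity tables each sum to 565. The harm-category table sums to 161, the same figure reported for AgentHarm multi-step vectors in the benchmark comparison, and the ten non-`single` tactics account for the 219 multi-turn vectors.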

Generated by benchmarks/generate_all.py on 2026-03-21.