Multi-Phase Trust Exploitation¶

ZIRAN's core methodology is a multi-phase trust exploitation campaign inspired by social engineering. Instead of throwing attacks at an agent randomly, ZIRAN builds trust incrementally -- exactly like a real attacker would.

Why this matters

Real attackers don't send "Ignore all instructions" as their opening message. They build rapport, discover capabilities, and chain multiple steps together. ZIRAN replicates this real-world approach automatically.

Why Multi-Phase?¶

Single-shot prompt injections work against naive agents. But production agents often have:

Safety guardrails that block obvious attacks
Context awareness that detects suspicious behaviour
Rate limiting on sensitive operations

A multi-phase approach overcomes these defences by establishing trust first, then gradually escalating. The difference in detection rate is dramatic:

Approach	Typical Detection Rate	Against Hardened Agents
Single-shot injection	40--60%	10--20%
Multi-phase campaign	80--95%	60--80%

The Eight Phases¶

Phases are not linear. Each phase feeds its findings into the knowledge graph, and the graph drives what happens next. A discovery during execution may trigger a return to reconnaissance. New tools revealed during trust building cause capability mapping to re-run with updated context.

The diagram below shows how the knowledge graph sits at the center, receiving findings from each phase and informing which phase runs next:

flowchart TD
    KG["🧠 Knowledge Graph\n(live state)"]

    R["🔍 1. Reconnaissance"] --> KG
    TB["🤝 2. Trust Building"] --> KG
    CM["🗺️ 3. Capability Mapping"] --> KG
    VD["⚡ 4. Vulnerability Discovery"] --> KG
    ES["🎯 5. Exploitation Setup"] --> KG
    EX["💥 6. Execution"] --> KG
    PE["🔒 7. Persistence"] --> KG
    EXF["📤 8. Exfiltration"] --> KG

    KG -->|"decides next phase"| R
    KG -->|"decides next phase"| TB
    KG -->|"decides next phase"| CM
    KG -->|"decides next phase"| VD
    KG -->|"decides next phase"| ES
    KG -->|"decides next phase"| EX
    KG -->|"decides next phase"| PE
    KG -->|"decides next phase"| EXF

    style KG fill:#1a1a2e,stroke:#e94560,color:#fff,stroke-width:2px
    style R fill:#4051B5,color:#fff
    style TB fill:#4051B5,color:#fff
    style CM fill:#4051B5,color:#fff
    style VD fill:#E53935,color:#fff
    style ES fill:#E53935,color:#fff
    style EX fill:#E53935,color:#fff
    style PE fill:#FF9800,color:#000
    style EXF fill:#FF9800,color:#000

With the fixed strategy, phases run sequentially (1 through 8) for reproducibility. With adaptive or llm-adaptive, the knowledge graph drives phase selection -- phases can be skipped, reordered, or revisited. See Adaptive Campaigns.

Phase 1: Reconnaissance¶

Discover what the agent can do -- tools, skills, permissions, and data access. This is passive; no attacks are sent. For remote agents, ZIRAN reads endpoint metadata, OpenAPI specs, or A2A Agent Cards.

Phase 2: Trust Building¶

Establish conversational rapport. Ask legitimate questions, use the agent as intended. This builds a conversation history that makes later attacks more likely to succeed.

Phase 3: Capability Mapping¶

Deep-dive into the agent's capabilities. Discover tool parameters, data schemas, and permission boundaries. Build the knowledge graph.

Phase 4: Vulnerability Discovery¶

Probe for weaknesses. Test boundary conditions, try mild prompt injections, and look for information leakage. Use knowledge from previous phases to target probes.

Phase 5: Exploitation Setup¶

Position for attack without triggering defences. Craft prompts that leverage discovered capabilities and trust history.

Phase 6: Execution¶

Execute the exploit chain. Use knowledge graph paths to guide multi-step attacks through the agent's tool chain.

Phase 7: Persistence (opt-in)¶

Test whether the vulnerability survives session resets, memory clears, or agent restarts.

Phase 8: Exfiltration (opt-in)¶

Attempt to extract sensitive data through discovered attack paths.

Coverage Levels¶

The --coverage flag controls how many phases ZIRAN runs:

Level	Phases Included	Use Case
`essential`	1--4 (Recon -> Vulnerability Discovery)	Quick feedback during development
`standard`	1--6 (Recon -> Execution)	Pre-deployment gate (default)
`comprehensive`	1--8 (All phases)	Full security audit

# Quick check
ziran scan --target target.yaml --coverage essential

# Full audit
ziran scan --target target.yaml --coverage comprehensive

Knowledge Graph Integration¶

Each phase updates the attack knowledge graph -- a directed graph that tracks:

Nodes: Agent capabilities, tools, data sources, vulnerabilities
Edges: Relationships (uses_tool, accesses_data, enables, can_chain_to)

The graph enables ZIRAN to discover attack paths that span multiple phases and tool invocations. See Knowledge Graph for details.