Context Engineering

Context engineering, based on the Haddix AI Hackbots methodology, gives each test agent the domain-specific knowledge it needs. That knowledge is supplied as six stacked context layers.

Context Layers Stack

graph TB
    AGENT["Test Agent<br/>Example: test-sqli"]

    LAYER1["Layer 1: Research Terms<br/>Domain vocabulary<br/>Anchor patterns"]
    LAYER2["Layer 2: Knowledge Pack<br/>Deep vulnerability knowledge<br/>knowledge-sqli.md"]
    LAYER3["Layer 3: Exemplars<br/>Few-shot examples<br/>Consistent behavior"]
    LAYER4["Layer 4: Finding Schema<br/>Standardized output<br/>Required fields"]
    LAYER5["Layer 5: AI Decision Points<br/>Reasoning markers<br/>[AI-DECISION]"]
    LAYER6["Layer 6: Kill Switches<br/>Safety limits<br/>Timeout, request count"]

    AGENT --> LAYER1
    LAYER1 --> LAYER2
    LAYER2 --> LAYER3
    LAYER3 --> LAYER4
    LAYER4 --> LAYER5
    LAYER5 --> LAYER6

    LAYER6 --> QUALITY["High-Quality Output<br/>Accurate, Consistent<br/>Safe Findings"]

    style AGENT fill:#9b30ff,color:#fff,stroke:#00e5ff,stroke-width:2px
    style LAYER1 fill:#4a148c,color:#fff
    style LAYER2 fill:#6a1b9a,color:#fff
    style LAYER3 fill:#7b1fa2,color:#fff
    style LAYER4 fill:#8e24aa,color:#fff
    style LAYER5 fill:#9c27b0,color:#fff
    style LAYER6 fill:#ab47bc,color:#fff
    style QUALITY fill:#0277bd,color:#fff,stroke:#00e5ff,stroke-width:2px

Components

Research Terms

Each skill's domain vocabulary anchors the agent's attention on the correct patterns. For applicable skills, the terms are located in helpers/research-terms.md.

Skills with research terms: test-injection, test-auth, test-access, test-ssrf, test-client, test-api, test-logic, test-infra, test-advanced.

Exemplars

Ideal test executions (target -> test -> decision -> output) are recorded in helpers/exemplars.md. They serve as few-shot examples that keep agent behavior consistent.

Skills with exemplars: test-injection, test-auth, test-access, test-ssrf.
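The target -> test -> decision -> output shape of an exemplar can be illustrated with a small sketch. The `Exemplar` class, the `render_few_shot` function, and the text layout are hypothetical and are not taken from helpers/exemplars.md; they only show one way such records might be turned into few-shot text.

```python
from dataclasses import dataclass


@dataclass
class Exemplar:
    """One ideal test execution (hypothetical structure, for illustration)."""
    target: str    # what was tested
    test: str      # what the agent sent or did
    decision: str  # how the agent reasoned about the result
    output: str    # what the agent reported


def render_few_shot(exemplars: list[Exemplar]) -> str:
    """Format exemplars as numbered few-shot blocks separated by blank lines."""
    blocks = []
    for number, ex in enumerate(exemplars, start=1):
        blocks.append(
            f"Example {number}\n"
            f"Target: {ex.target}\n"
            f"Test: {ex.test}\n"
            f"Decision: {ex.decision}\n"
            f"Output: {ex.output}"
        )
    return "\n\n".join(blocks)
```

Keeping the four stages as separate fields makes it easy to check that every exemplar covers the full chain before it is shown to an agent.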

Knowledge Packs

Deep domain knowledge files per vulnerability class:

knowledge-sqli.md — SQL injection techniques, WAF bypass, blind/error-based/union
knowledge-xss.md — XSS contexts, filter bypass, DOM sinks/sources
knowledge-ssrf.md — Protocol smuggling, cloud metadata, redirect chains
knowledge-jwt.md — Algorithm confusion, key cracking, claim manipulation
knowledge-oauth.md — OAuth flow attacks, state bypass, token theft

Finding Schema

The standardized output format is defined in pentest/helpers/finding-schema.md. It requires the following fields:

| Field | Description |
| --- | --- |
| `finding_id` | Unique identifier (FINDING-NNN) |
| `vuln_type` | Vulnerability classification |
| `severity` | Critical/High/Medium/Low/Informational |
| `cvss40_vector` | CVSS 4.0 vector string |
| `endpoint` | Affected URL/endpoint |
| `evidence` | Request + response_indicator + baseline |
| `poc_http` | Full HTTP request/response (Burp style) |
| `confidence` | High/Medium/Low |
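As a rough illustration of how these required fields could be checked, here is a minimal sketch. It is not the project's schema implementation. It assumes that `NNN` means exactly three digits, that `evidence` is a dictionary with `request`, `response_indicator`, and `baseline` keys, and that a CVSS 4.0 vector begins with the `CVSS:4.0/` prefix. The `Finding` class and its `validate` method are hypothetical names.

```python
import re
from dataclasses import dataclass

SEVERITIES = {"Critical", "High", "Medium", "Low", "Informational"}
CONFIDENCES = {"High", "Medium", "Low"}
EVIDENCE_KEYS = {"request", "response_indicator", "baseline"}
# Assumption: NNN in FINDING-NNN stands for exactly three digits.
FINDING_ID_RE = re.compile(r"FINDING-\d{3}")


@dataclass
class Finding:
    finding_id: str
    vuln_type: str
    severity: str
    cvss40_vector: str
    endpoint: str
    evidence: dict
    poc_http: str
    confidence: str

    def validate(self) -> list[str]:
        """Return a list of problems; an empty list means the finding passes."""
        problems = []
        if not FINDING_ID_RE.fullmatch(self.finding_id):
            problems.append("finding_id must look like FINDING-NNN")
        if self.severity not in SEVERITIES:
            problems.append(f"unknown severity: {self.severity}")
        if self.confidence not in CONFIDENCES:
            problems.append(f"unknown confidence: {self.confidence}")
        if not self.cvss40_vector.startswith("CVSS:4.0/"):
            problems.append("cvss40_vector must start with CVSS:4.0/")
        missing = EVIDENCE_KEYS - set(self.evidence)
        if missing:
            problems.append(f"evidence is missing: {sorted(missing)}")
        for name in ("vuln_type", "endpoint", "poc_http"):
            if not getattr(self, name).strip():
                problems.append(f"{name} must not be empty")
        return problems
```

Returning a list of problems rather than raising on the first one lets an agent report every schema violation in a single pass.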

AI Decision Points

`# [AI-DECISION]` markers in skills indicate where AI reasoning should be applied (2 to 3 per skill):

Response analysis (is this a true positive?)
False positive elimination
Attack path selection (which technique to try next?)
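The 2 to 3 markers per skill guideline could be checked mechanically. The sketch below assumes that each marker sits at the start of its own comment line in the form `# [AI-DECISION]`; the function names and the sample skill text are hypothetical.

```python
import re

# Assumption: a marker is a comment line that begins with "# [AI-DECISION]".
MARKER_RE = re.compile(r"^[ \t]*#[ \t]*\[AI-DECISION\]", re.MULTILINE)


def count_decision_markers(skill_text: str) -> int:
    """Count the AI decision markers found in a skill's text."""
    return len(MARKER_RE.findall(skill_text))


def marker_count_ok(skill_text: str, low: int = 2, high: int = 3) -> bool:
    """Check that the marker count falls inside the recommended range."""
    return low <= count_decision_markers(skill_text) <= high


# Illustrative skill fragment, not taken from a real skill file.
SAMPLE_SKILL = """\
# [AI-DECISION] Is the response a true positive?
send_probe
# ordinary comment
# [AI-DECISION] Which technique should be tried next?
"""

if __name__ == "__main__":
    print(count_decision_markers(SAMPLE_SKILL), marker_count_ok(SAMPLE_SKILL))
```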

Kill Switches

Built-in safety limits:

Timeout: 45 min per skill (60 min for test-injection)
Request limit: 500 requests per skill (warning at 400)
Rate limit: three HTTP 429 responses trigger an automatic stop
Functions: check_timeout(), count_request(), stealth_curl()
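The limits above can be sketched as a small guard object. This is an illustration in Python, not the project's implementation of `check_timeout()` or `count_request()`: the signatures, the `KillSwitch` and `StopTesting` names, and the `record_status` method are hypothetical. It also assumes that 429 responses are counted in total rather than consecutively, and that the request after the 500th is the one that gets refused.

```python
import time


class StopTesting(Exception):
    """Raised when any safety limit is hit."""


class KillSwitch:
    def __init__(self, timeout_min=45, request_limit=500, warn_at=400,
                 max_429=3, clock=time.monotonic):
        self._clock = clock
        self._deadline = clock() + timeout_min * 60
        self.request_limit = request_limit
        self.warn_at = warn_at
        self.max_429 = max_429
        self.requests = 0
        self.rate_limited = 0

    def check_timeout(self) -> None:
        """Stop once the per-skill time budget is used up."""
        if self._clock() >= self._deadline:
            raise StopTesting("timeout reached")

    def count_request(self) -> bool:
        """Record one outgoing request.

        Returns True once the warning threshold is reached and raises
        StopTesting if the request limit is already exhausted.
        """
        if self.requests >= self.request_limit:
            raise StopTesting("request limit reached")
        self.requests += 1
        return self.requests >= self.warn_at

    def record_status(self, status_code: int) -> None:
        """Track HTTP 429 responses (assumption: total, not consecutive)."""
        if status_code == 429:
            self.rate_limited += 1
            if self.rate_limited >= self.max_429:
                raise StopTesting("too many 429 responses")
```

A skill with a longer budget, such as test-injection, would pass `timeout_min=60`. Injecting the clock keeps the timeout logic testable without waiting in real time.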