Context Engineering

Based on the Haddix AI Hackbots methodology, context engineering ensures each test agent has domain-specific knowledge.

Context Layers Stack

graph TB
    AGENT["Test Agent<br/>Example: test-sqli"]

    LAYER1["Layer 1: Research Terms<br/>Domain vocabulary<br/>Anchor patterns"]
    LAYER2["Layer 2: Knowledge Pack<br/>Deep vulnerability knowledge<br/>knowledge-sqli.md"]
    LAYER3["Layer 3: Exemplars<br/>Few-shot examples<br/>Consistent behavior"]
    LAYER4["Layer 4: Finding Schema<br/>Standardized output<br/>Required fields"]
    LAYER5["Layer 5: AI Decision Points<br/>Reasoning markers<br/>[AI-DECISION]"]
    LAYER6["Layer 6: Kill Switches<br/>Safety limits<br/>Timeout, request count"]

    AGENT --> LAYER1
    LAYER1 --> LAYER2
    LAYER2 --> LAYER3
    LAYER3 --> LAYER4
    LAYER4 --> LAYER5
    LAYER5 --> LAYER6

    LAYER6 --> QUALITY["High-Quality Output<br/>Accurate, Consistent<br/>Safe Findings"]

    style AGENT fill:#9b30ff,color:#fff,stroke:#00e5ff,stroke-width:2px
    style LAYER1 fill:#4a148c,color:#fff
    style LAYER2 fill:#6a1b9a,color:#fff
    style LAYER3 fill:#7b1fa2,color:#fff
    style LAYER4 fill:#8e24aa,color:#fff
    style LAYER5 fill:#9c27b0,color:#fff
    style LAYER6 fill:#ab47bc,color:#fff
    style QUALITY fill:#0277bd,color:#fff,stroke:#00e5ff,stroke-width:2px

Components

Research Terms

Each skill's domain vocabulary anchors the agent's attention on the correct patterns. For applicable skills, the terms are located in helpers/research-terms.md.

Skills with research terms: test-injection, test-auth, test-access, test-ssrf, test-client, test-api, test-logic, test-infra, test-advanced.

Exemplars

Ideal test executions (target -> test -> decision -> output) are recorded in helpers/exemplars.md. They serve as few-shot examples that keep agent behavior consistent.

Skills with exemplars: test-injection, test-auth, test-access, test-ssrf.

Knowledge Packs

Deep domain knowledge files per vulnerability class:

  • knowledge-sqli.md — SQL injection techniques, WAF bypass, blind/error-based/union
  • knowledge-xss.md — XSS contexts, filter bypass, DOM sinks/sources
  • knowledge-ssrf.md — Protocol smuggling, cloud metadata, redirect chains
  • knowledge-jwt.md — Algorithm confusion, key cracking, claim manipulation
  • knowledge-oauth.md — OAuth flow attacks, state bypass, token theft
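
To illustrate how the three file-based layers (research terms, knowledge pack, exemplars) might be stacked into one context, here is a minimal Python sketch. It assumes each layer is a plain Markdown file and that a skill lacking a layer simply skips it. The function name `build_context`, the directory layout passed in, and the section titles are hypothetical and not taken from the framework itself.

```python
from pathlib import Path
from typing import Optional


def build_context(skill_dir: Path, knowledge_pack: Optional[Path] = None) -> str:
    """Concatenate the available context layers in stack order.

    Hypothetical helper: paths and section titles are illustrative assumptions.
    """
    layers = [
        ("Research Terms", skill_dir / "helpers" / "research-terms.md"),
        ("Knowledge Pack", knowledge_pack),
        ("Exemplars", skill_dir / "helpers" / "exemplars.md"),
    ]
    sections = []
    for title, path in layers:
        # Not every skill ships every layer, so absent files are skipped.
        if path is None or not path.is_file():
            continue
        body = path.read_text(encoding="utf-8").strip()
        sections.append(f"## {title}\n\n{body}")
    return "\n\n".join(sections)
```

Keeping the order fixed means vocabulary is read before the deeper knowledge and the worked examples, which mirrors the layer order in the diagram.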

Finding Schema

Findings use a standardized output format, defined in pentest/helpers/finding-schema.md, with these required fields:

| Field | Description |
| --- | --- |
| finding_id | Unique identifier (FINDING-NNN) |
| vuln_type | Vulnerability classification |
| severity | Critical/High/Medium/Low/Informational |
| cvss40_vector | CVSS 4.0 vector string |
| endpoint | Affected URL/endpoint |
| evidence | Request + response_indicator + baseline |
| poc_http | Full HTTP request/response (Burp style) |
| confidence | High/Medium/Low |
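
The required fields above lend themselves to a mechanical check. The following Python sketch shows one way such a check could look. It is not the project's validator: the function name `validate_finding` is hypothetical, and it assumes that a finding is a dictionary, that `NNN` means exactly three digits, and that `evidence` is a mapping with the keys `request`, `response_indicator`, and `baseline`.

```python
import re

REQUIRED_FIELDS = (
    "finding_id", "vuln_type", "severity", "cvss40_vector",
    "endpoint", "evidence", "poc_http", "confidence",
)
SEVERITIES = {"Critical", "High", "Medium", "Low", "Informational"}
CONFIDENCES = {"High", "Medium", "Low"}
EVIDENCE_KEYS = ("request", "response_indicator", "baseline")  # assumed structure


def validate_finding(finding: dict) -> list:
    """Return a list of problems; an empty list means the finding passes."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in finding]
    if problems:
        return problems
    # Assumption: FINDING-NNN means the literal prefix plus three digits.
    if not re.fullmatch(r"FINDING-\d{3}", str(finding["finding_id"])):
        problems.append("finding_id must look like FINDING-NNN")
    if finding["severity"] not in SEVERITIES:
        problems.append("severity is not one of the allowed values")
    if finding["confidence"] not in CONFIDENCES:
        problems.append("confidence is not one of the allowed values")
    # CVSS 4.0 vector strings begin with the "CVSS:4.0/" prefix.
    if not str(finding["cvss40_vector"]).startswith("CVSS:4.0/"):
        problems.append("cvss40_vector must start with CVSS:4.0/")
    evidence = finding["evidence"]
    if not isinstance(evidence, dict):
        problems.append("evidence must be a mapping")
    else:
        for key in EVIDENCE_KEYS:
            if key not in evidence:
                problems.append(f"evidence is missing: {key}")
    return problems
```

Returning a list of problems rather than raising on the first one lets an agent repair every defect in a finding in a single pass.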

AI Decision Points

# [AI-DECISION] markers in skills indicate where AI reasoning should be applied (2-3 per skill):

  • Response analysis (is this a true positive?)
  • False positive elimination
  • Attack path selection (which technique to try next?)
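
As a rough illustration, the marker convention can be audited with a few lines of Python. This sketch assumes the marker appears literally as `[AI-DECISION]` somewhere on a line of the skill file; the function names and the default range of 2 to 3 markers (taken from the "2-3 per skill" guideline above) are illustrative, not part of the framework.

```python
import re

# Matches the literal marker text anywhere on a line.
MARKER = re.compile(r"\[AI-DECISION\]")


def find_decision_points(skill_text: str) -> list:
    """Return (line_number, stripped_line) pairs for each line carrying a marker."""
    return [
        (number, line.strip())
        for number, line in enumerate(skill_text.splitlines(), start=1)
        if MARKER.search(line)
    ]


def within_budget(skill_text: str, low: int = 2, high: int = 3) -> bool:
    """Check that the number of marked lines falls inside the expected range."""
    return low <= len(find_decision_points(skill_text)) <= high
```

A check like this could flag skills that delegate too little or too much to AI reasoning before they are run.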

Kill Switches

Built-in safety limits:

  • Timeout: 45 min per skill (60 for test-injection)
  • Request limit: 500 per skill (warning at 400)
  • Rate limit: three HTTP 429 responses trigger an automatic stop
  • Functions: check_timeout(), count_request(), stealth_curl()
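
The limits above can be modeled in a small Python sketch. This is not the framework's implementation: the real `check_timeout()` and `count_request()` helpers are not shown here, so their signatures and behavior below are assumptions, and `SafetyLimits`, `KillSwitch`, and `record_response()` are hypothetical names. The sketch also assumes that 429 responses are counted cumulatively rather than consecutively.

```python
import time


class KillSwitch(Exception):
    """Raised when a safety limit is hit and the skill must stop."""


class SafetyLimits:
    def __init__(self, timeout_min=45, max_requests=500, warn_at=400,
                 max_429=3, clock=time.monotonic):
        self._clock = clock  # injectable so the limits can be tested without waiting
        self._deadline = clock() + timeout_min * 60
        self.max_requests = max_requests
        self.warn_at = warn_at
        self.max_429 = max_429
        self.requests = 0
        self.rate_limited = 0

    def check_timeout(self):
        """Stop the skill once the time budget is used up."""
        if self._clock() >= self._deadline:
            raise KillSwitch("timeout reached")

    def count_request(self):
        """Record one outgoing request; return True once the warning level is reached."""
        self.check_timeout()
        if self.requests >= self.max_requests:
            raise KillSwitch("request limit reached")
        self.requests += 1
        return self.requests >= self.warn_at

    def record_response(self, status_code):
        """Count HTTP 429 responses (cumulatively, by assumption) and stop at the limit."""
        if status_code == 429:
            self.rate_limited += 1
            if self.rate_limited >= self.max_429:
                raise KillSwitch("too many 429 responses")
```

Under these assumptions, a skill such as test-injection would be constructed with `SafetyLimits(timeout_min=60)`, call `count_request()` before each request, and pass each response status to `record_response()`.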