Context Engineering
Based on the Haddix AI Hackbots methodology, context engineering ensures each test agent has domain-specific knowledge.
Context Layers Stack
graph TB
AGENT["Test Agent<br/>Example: test-sqli"]
LAYER1["Layer 1: Research Terms<br/>Domain vocabulary<br/>Anchor patterns"]
LAYER2["Layer 2: Knowledge Pack<br/>Deep vulnerability knowledge<br/>knowledge-sqli.md"]
LAYER3["Layer 3: Exemplars<br/>Few-shot examples<br/>Consistent behavior"]
LAYER4["Layer 4: Finding Schema<br/>Standardized output<br/>Required fields"]
LAYER5["Layer 5: AI Decision Points<br/>Reasoning markers<br/>[AI-DECISION]"]
LAYER6["Layer 6: Kill Switches<br/>Safety limits<br/>Timeout, request count"]
AGENT --> LAYER1
LAYER1 --> LAYER2
LAYER2 --> LAYER3
LAYER3 --> LAYER4
LAYER4 --> LAYER5
LAYER5 --> LAYER6
LAYER6 --> QUALITY["High-Quality Output<br/>Accurate, Consistent<br/>Safe Findings"]
style AGENT fill:#9b30ff,color:#fff,stroke:#00e5ff,stroke-width:2px
style LAYER1 fill:#4a148c,color:#fff
style LAYER2 fill:#6a1b9a,color:#fff
style LAYER3 fill:#7b1fa2,color:#fff
style LAYER4 fill:#8e24aa,color:#fff
style LAYER5 fill:#9c27b0,color:#fff
style LAYER6 fill:#ab47bc,color:#fff
style QUALITY fill:#0277bd,color:#fff,stroke:#00e5ff,stroke-width:2px
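As a rough illustration of the stack in the diagram, the sketch below concatenates whichever layers a skill provides, in the order the diagram shows. The function name `build_context`, the layer keys, and the bracketed section labels are hypothetical; the source does not describe how the layers are actually assembled.

```python
# Layer order follows the diagram; the key names are illustrative assumptions.
LAYER_ORDER = [
    "research_terms",
    "knowledge_pack",
    "exemplars",
    "finding_schema",
    "ai_decision_points",
    "kill_switches",
]


def build_context(agent: str, layers: dict[str, str]) -> str:
    """Join the available layers in stack order under labeled sections."""
    parts = [f"Agent: {agent}"]
    for name in LAYER_ORDER:
        text = layers.get(name)
        if text:  # a skill without a given layer simply skips it
            parts.append(f"[{name}]\n{text.strip()}")
    return "\n\n".join(parts)
```

Keeping the order in a single list means a skill that lacks a layer (for example, one without exemplars) still gets the remaining layers in a stable sequence.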
Components
Research Terms
Each skill has a domain vocabulary that anchors the agent's attention on the correct patterns. For applicable skills, it is located in helpers/research-terms.md.
Skills with research terms: test-injection, test-auth, test-access, test-ssrf, test-client, test-api, test-logic, test-infra, test-advanced.
Exemplars
helpers/exemplars.md holds ideal test executions (target -> test -> decision -> output). These serve as few-shot examples that keep agent behavior consistent.
Skills with exemplars: test-injection, test-auth, test-access, test-ssrf.
Knowledge Packs
Deep domain knowledge files per vulnerability class:
- knowledge-sqli.md: SQL injection techniques, WAF bypass, blind/error-based/union
- knowledge-xss.md: XSS contexts, filter bypass, DOM sinks/sources
- knowledge-ssrf.md: Protocol smuggling, cloud metadata, redirect chains
- knowledge-jwt.md: Algorithm confusion, key cracking, claim manipulation
- knowledge-oauth.md: OAuth flow attacks, state bypass, token theft
Finding Schema
Findings use a standardized output format, defined in pentest/helpers/finding-schema.md, with these required fields:
| Field | Description |
|---|---|
| finding_id | Unique identifier (FINDING-NNN) |
| vuln_type | Vulnerability classification |
| severity | Critical/High/Medium/Low/Informational |
| cvss40_vector | CVSS 4.0 vector string |
| endpoint | Affected URL/endpoint |
| evidence | Request + response_indicator + baseline |
| poc_http | Full HTTP request/response (Burp style) |
| confidence | High/Medium/Low |
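To make the required fields concrete, here is a minimal sketch of a record type and a validator for them. The field names and allowed values come from the table; the `Finding` class, the `validate_finding` helper, the reading of NNN as exactly three digits, and the treatment of `evidence` and `poc_http` as plain strings are assumptions for illustration, not the contents of finding-schema.md.

```python
import re
from dataclasses import dataclass

SEVERITIES = {"Critical", "High", "Medium", "Low", "Informational"}
CONFIDENCES = {"High", "Medium", "Low"}


@dataclass
class Finding:
    finding_id: str
    vuln_type: str
    severity: str
    cvss40_vector: str
    endpoint: str
    evidence: str   # assumed free text: request + response_indicator + baseline
    poc_http: str   # assumed free text: full HTTP request/response
    confidence: str


def validate_finding(f: Finding) -> list[str]:
    """Return a list of problems; an empty list means the finding passes."""
    errors = []
    # Assumption: NNN means exactly three digits.
    if not re.fullmatch(r"FINDING-\d{3}", f.finding_id):
        errors.append("finding_id must look like FINDING-NNN")
    if f.severity not in SEVERITIES:
        errors.append("severity is not one of the allowed values")
    if f.confidence not in CONFIDENCES:
        errors.append("confidence is not one of the allowed values")
    # CVSS 4.0 vector strings start with the "CVSS:4.0/" prefix.
    if not f.cvss40_vector.startswith("CVSS:4.0/"):
        errors.append("cvss40_vector must start with CVSS:4.0/")
    for name in ("vuln_type", "endpoint", "evidence", "poc_http"):
        if not getattr(f, name).strip():
            errors.append(f"{name} must not be empty")
    return errors
```

Returning every problem at once, rather than raising on the first, lets an agent correct a finding in a single pass.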
AI Decision Points
Comment markers of the form # [AI-DECISION] in a skill indicate where AI reasoning should be applied (2-3 per skill):
- Response analysis (is this a true positive?)
- False positive elimination
- Attack path selection (which technique to try next?)
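A small check like the following could confirm that a skill carries the expected number of markers. It assumes each marker sits on its own `#` comment line followed by a short note; `find_decision_points` and `marker_count_ok` are hypothetical helpers, not part of the described tooling.

```python
import re

# Matches a comment line such as "# [AI-DECISION] Is this a true positive?"
# [ \t]* is used instead of \s* so a match never spills onto the next line.
MARKER = re.compile(r"^[ \t]*#[ \t]*\[AI-DECISION\][ \t]*(.*)$", re.MULTILINE)


def find_decision_points(skill_text: str) -> list[str]:
    """Return the note that follows each [AI-DECISION] marker, in file order."""
    return [m.group(1).strip() for m in MARKER.finditer(skill_text)]


def marker_count_ok(skill_text: str, low: int = 2, high: int = 3) -> bool:
    """True when the skill has between `low` and `high` markers, inclusive."""
    return low <= len(find_decision_points(skill_text)) <= high
```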
Kill Switches
Built-in safety limits:
- Timeout: 45 min per skill (60 for test-injection)
- Request limit: 500 per skill (warning at 400)
- Rate limit: three HTTP 429 responses trigger an automatic stop
- Functions: check_timeout(), count_request(), stealth_curl()
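The limits above can be sketched as a small counter object. The source lists the functions by name only, so the `KillSwitch` class, the method signatures, the return values, and the injectable `clock` parameter are all assumptions. In particular, this sketch counts 429 responses in total rather than consecutively, and treats the 500th request as the point at which to stop.

```python
import time


class KillSwitch:
    """Tracks elapsed time, request count, and HTTP 429 responses for one skill."""

    def __init__(self, timeout_min=45, request_limit=500, warn_at=400,
                 max_429=3, clock=time.monotonic):
        self._clock = clock          # injectable so the timer can be tested
        self._start = clock()
        self.timeout_s = timeout_min * 60
        self.request_limit = request_limit
        self.warn_at = warn_at
        self.max_429 = max_429
        self.requests = 0
        self.rate_limited = 0

    def check_timeout(self) -> bool:
        """Return True while the skill is still within its time budget."""
        return self._clock() - self._start < self.timeout_s

    def count_request(self, status=None) -> str:
        """Record one request and return 'ok', 'warn', or 'stop'."""
        self.requests += 1
        if status == 429:
            self.rate_limited += 1
        if self.rate_limited >= self.max_429 or self.requests >= self.request_limit:
            return "stop"
        if self.requests >= self.warn_at:
            return "warn"
        return "ok"
```

A skill with a longer budget, such as test-injection, would be created with `KillSwitch(timeout_min=60)`.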