Learning Loop

Extracts successful techniques, payloads, and bypasses from completed engagements and stores them in a learning index. On future engagements, the index recommends payloads that have historically worked on similar tech stacks.


How it works

  1. After an engagement completes, the extraction endpoint parses all FINDING-*.md files
  2. For each verified finding, it extracts:
    • The CWE
    • The successful payload or technique
    • The bypass type (e.g., WAF bypass, filter evasion)
    • The target's tech stack
  3. Each technique is stored with a success_count that increments if the same payload succeeds again on a different engagement
  4. The recommendation engine queries the index and ranks techniques by success rate for the given tech stack and CWE
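The store-and-rank steps above can be sketched with an in-memory stand-in for the index. This is a minimal illustration, not the dashboard's code: the `Technique` and `LearningIndex` names, the choice of (CWE, tech stack, payload) as the identity key, and ranking by raw `success_count` instead of a computed success rate are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Technique:
    cwe: str
    tech_stack: str
    technique: str
    payload: str
    bypass_type: str = ""
    success_count: int = 1


class LearningIndex:
    """In-memory stand-in for the learning index (hypothetical; the real index is SQL-backed)."""

    def __init__(self) -> None:
        self._by_key: dict[tuple[str, str, str], Technique] = {}

    def record_success(self, t: Technique) -> Technique:
        # Assumed identity: same CWE, tech stack and payload means "same technique".
        key = (t.cwe, t.tech_stack, t.payload)
        existing = self._by_key.get(key)
        if existing is None:
            self._by_key[key] = t
            return t
        # Seen before: bump the counter instead of storing a duplicate.
        existing.success_count += 1
        return existing

    def recommend(self, tech_stack: str, cwe: str, limit: int = 20) -> list[Technique]:
        # Case-insensitive substring match on the stack, exact match on the CWE.
        matches = [
            t for t in self._by_key.values()
            if t.cwe == cwe and tech_stack.lower() in t.tech_stack.lower()
        ]
        # Simplification: rank by success_count, highest first.
        matches.sort(key=lambda t: t.success_count, reverse=True)
        return matches[:limit]
```

Calling `record_success` twice with the same key leaves one entry whose `success_count` is 2, and `recommend("Spring Boot", "CWE-89")` then lists that entry ahead of techniques that succeeded only once.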

What gets extracted

| Field | Source | Example |
|---|---|---|
| cwe | Finding CWE tag | CWE-89 |
| tech_stack | context.json | Spring Boot + PostgreSQL |
| technique | Finding description | Time-based blind SQLi via ORDER BY |
| payload | Finding PoC | ' OR SLEEP(5)-- - |
| bypass_type | Finding notes | WAF bypass via chunked encoding |

API endpoints

Extract techniques from engagement

POST /api/v1/learning/engagements/{engagement_name}/extract

Parses the engagement's findings and adds new techniques to the index. Returns the number of techniques extracted along with the extracted technique records.

Response:

{
  "engagement": "acme-2026-q1",
  "techniques_extracted": 12,
  "techniques": [
    {
      "id": 45,
      "cwe": "CWE-89",
      "tech_stack": "Spring Boot + PostgreSQL",
      "technique": "Time-based blind SQLi via ORDER BY clause",
      "payload": "1 AND (SELECT 1 FROM (SELECT SLEEP(5))a)-- -",
      "bypass_type": "WAF chunked encoding",
      "success_count": 3,
      "last_used_at": "2026-03-15T16:00:00Z"
    }
  ]
}
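A client for this endpoint might look like the sketch below, which uses only the Python standard library. The base URL `http://localhost:8880` is taken from the verification step later on this page and may differ in your deployment; the helper names are hypothetical, and the sketch assumes no authentication header is needed, which this page does not state.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the dashboard API is reachable here (port taken from the verify step).
BASE_URL = "http://localhost:8880"


def build_extract_request(engagement_name: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) the POST request for the extract endpoint."""
    # Percent-encode the name so it is safe as a single path segment.
    name = urllib.parse.quote(engagement_name, safe="")
    url = f"{base_url}/api/v1/learning/engagements/{name}/extract"
    return urllib.request.Request(url, method="POST")


def summarize_extraction(body: str) -> str:
    """Turn the JSON response shown above into a short human-readable summary."""
    data = json.loads(body)
    lines = [f"{data['engagement']}: {data['techniques_extracted']} techniques extracted"]
    for t in data.get("techniques", []):
        lines.append(
            f"  #{t['id']} {t['cwe']} {t['technique']} (success_count={t['success_count']})"
        )
    return "\n".join(lines)
```

To send the request, pass the result of `build_extract_request("acme-2026-q1")` to `urllib.request.urlopen` and feed the decoded response body to `summarize_extraction`.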

Get recommendations

GET /api/v1/learning/recommend?tech_stack=Spring+Boot&cwes=CWE-89,CWE-79&limit=20

| Parameter | Type | Default | Description |
|---|---|---|---|
| tech_stack | query | * | Filter by tech stack |
| cwes | query | all | Comma-separated CWE IDs |
| limit | query | 20 | Max recommendations (1-100) |

Response (LearningRecommendation[]):

[
  {
    "cwe": "CWE-89",
    "tech_stack": "Spring Boot + PostgreSQL",
    "recommended_payloads": [
      "' OR SLEEP(5)-- -",
      "1 UNION SELECT NULL,NULL,version()-- -"
    ],
    "recommended_techniques": [
      "Time-based blind via ORDER BY",
      "UNION-based via column count enumeration"
    ],
    "historical_success_rate": 0.73
  }
]
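The query string and response handling for this endpoint can be sketched as follows. The function names are hypothetical, and the client-side check on `limit` simply mirrors the documented 1-100 range; how the server reacts to out-of-range values is not covered here.

```python
import json
from typing import Optional, Sequence
from urllib.parse import urlencode


def build_recommend_url(base_url: str,
                        tech_stack: Optional[str] = None,
                        cwes: Optional[Sequence[str]] = None,
                        limit: int = 20) -> str:
    """Build the GET URL for /api/v1/learning/recommend."""
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    params = {}
    if tech_stack:
        params["tech_stack"] = tech_stack
    if cwes:
        # The endpoint takes CWE IDs as one comma-separated value.
        params["cwes"] = ",".join(cwes)
    params["limit"] = limit
    # safe="," keeps the commas literal, matching the example URL above.
    return f"{base_url}/api/v1/learning/recommend?{urlencode(params, safe=',')}"


def top_recommendation(body: str) -> Optional[dict]:
    """Return the entry with the highest historical_success_rate, or None if empty."""
    recommendations = json.loads(body)
    return max(recommendations, key=lambda r: r["historical_success_rate"], default=None)
```

Omitting `tech_stack` and `cwes` leaves those parameters out of the URL so the documented defaults (`*` and all CWEs) apply.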

Statistics

GET /api/v1/learning/stats

Returns aggregate statistics: total techniques, top CWEs, most successful payloads, and coverage by tech stack.

List all techniques

GET /api/v1/learning/techniques?cwe=CWE-89&tech_stack=Spring+Boot&limit=50

Browse the full learning index with optional filters.


Trigger Hindsight reflection

POST /api/v1/learning/engagements/{engagement_name}/reflect

Triggers a Hindsight reflection cycle, which analyzes all stored memories and synthesizes mental models (patterns, correlations, actionable insights). Requires Hindsight to be enabled.

Response:

{
  "status": "ok",
  "engagement": "acme-2026-q1",
  "tech_stack": "Spring Boot + PostgreSQL",
  "reflection": "Pattern: Java web apps using Spring Boot + Hibernate are 73% likely to have HQL injection when user input reaches dynamic queries. Most successful bypass: parameterized LIKE with wildcard injection."
}

Check Hindsight health

GET /api/v1/learning/hindsight/health

Returns the status of the Hindsight agent memory service.


CLI equivalent

The /learn skill runs the same extraction from the command line. It's automatically called at the end of a /pentest run.

Use /learn <engagement_name> --reflect to also trigger a Hindsight reflection cycle after extraction.


Hindsight agent memory integration

The learning loop uses a dual-write architecture: SQL (primary) + Hindsight (semantic memory).

What Hindsight adds

| Capability | SQL (existing) | Hindsight (new) |
|---|---|---|
| Exact match by CWE/tech_stack | Yes | Yes |
| Semantic similarity search | No | Yes |
| Cross-engagement graph | No | Yes |
| Temporal memory | No | Yes |
| Reflection/mental models | No | Yes |
| Works when Hindsight is down | Yes | N/A (graceful fallback) |

How it works

  1. Retain (dual-write): Every extracted technique is written to SQL and also stored in Hindsight as an Experience Memory
  2. Recall (enrichment): Recommendations combine SQL exact-match results with Hindsight semantic search, which finds similar techniques even when tech stack names don't match exactly
  3. Reflect (synthesis): After extraction, Hindsight can optionally synthesize mental models, which are high-level patterns drawn from accumulated experience
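The retain and recall steps, including the SQL-only fallback, can be sketched with toy stand-ins. Everything here is hypothetical: `SqlStore`, `KeywordMemory`, and `DownMemory` are illustrative classes, and shared-word matching is only a crude placeholder for Hindsight's semantic search. The reflect step is not modeled.

```python
import re


def _tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class SqlStore:
    """Stand-in for the primary SQL index: exact matches only."""

    def __init__(self) -> None:
        self.rows: list[dict] = []

    def save(self, technique: dict) -> None:
        self.rows.append(technique)

    def exact_match(self, cwe: str, tech_stack: str) -> list[dict]:
        return [r for r in self.rows
                if r["cwe"] == cwe and r["tech_stack"] == tech_stack]


class KeywordMemory:
    """Toy stand-in for Hindsight: 'similar' just means shared words."""

    def __init__(self) -> None:
        self.items: list[dict] = []

    def retain(self, technique: dict) -> None:
        self.items.append(technique)

    def recall(self, cwe: str, tech_stack: str) -> list[dict]:
        wanted = _tokens(tech_stack)
        return [t for t in self.items
                if t["cwe"] == cwe and wanted & _tokens(t["tech_stack"])]


class DownMemory:
    """Simulates Hindsight being unreachable."""

    def retain(self, technique: dict) -> None:
        raise ConnectionError("Hindsight unreachable")

    def recall(self, cwe: str, tech_stack: str) -> list[dict]:
        raise ConnectionError("Hindsight unreachable")


def retain(sql, memory, technique: dict, enabled: bool) -> bool:
    """Dual-write. Returns True only if the Hindsight write also succeeded."""
    sql.save(technique)  # the primary write always happens
    if not enabled:
        return False
    try:
        memory.retain(technique)
    except ConnectionError:
        return False  # graceful fallback: the SQL copy is already stored
    return True


def recall(sql, memory, cwe: str, tech_stack: str, enabled: bool) -> list[dict]:
    """SQL exact matches first, then any extra semantic matches."""
    results = sql.exact_match(cwe, tech_stack)
    if enabled:
        try:
            extra = memory.recall(cwe, tech_stack)
        except ConnectionError:
            extra = []
        results += [t for t in extra if t not in results]
    return results
```

With a technique stored under "Spring Boot + PostgreSQL", a query for "spring-boot postgres" returns nothing from SQL alone but finds the technique once the memory layer is enabled.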

Micro-agent memory

Each wave of test agents receives Hindsight context before dispatch. The agent-dispatch protocol calls recall_techniques() with the current tech stack and skill scope, injecting relevant historical techniques into the agent's prompt. This gives agents "experience" without polluting their context window.
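The injection step might look roughly like the sketch below. Only the name `recall_techniques()` comes from this page; its signature, the shape of the records it returns, the `build_agent_prompt` helper, and the `max_items` cap are assumptions made for illustration.

```python
from typing import Callable, Sequence

# Assumed signature: recall_techniques(tech_stack, skill_scope) -> sequence of dicts.
RecallFn = Callable[[str, str], Sequence[dict]]


def build_agent_prompt(base_prompt: str,
                       tech_stack: str,
                       skill_scope: str,
                       recall_techniques: RecallFn,
                       max_items: int = 5) -> str:
    """Append a short, bounded list of recalled techniques to an agent prompt."""
    try:
        # Cap the number of items so the added context stays small.
        recalled = list(recall_techniques(tech_stack, skill_scope))[:max_items]
    except Exception:
        # If recall fails (for example, Hindsight is down), dispatch without it.
        recalled = []
    if not recalled:
        return base_prompt
    lines = [base_prompt, "", "Techniques that worked on similar targets:"]
    for t in recalled:
        lines.append(f"- [{t['cwe']}] {t['technique']}")
    return "\n".join(lines)
```

Capping the list is one way to give agents relevant history while keeping the prompt short; the actual protocol may select or format the context differently.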

Current status: DISABLED by default

Hindsight is integrated but disabled by default (HINDSIGHT_ENABLED=false). The SQL learning loop works independently of it. Enable Hindsight once you have roughly 30 to 50 or more completed engagements to benefit from semantic recall and cross-engagement pattern synthesis.

How to enable Hindsight

Step 1: Get an Anthropic API key

Go to https://console.anthropic.com/settings/keys and create a new key. This key is used only for Hindsight memory operations. It does NOT affect claude -p, which continues to use your Max subscription.

Estimated cost: ~$0.17 per engagement with Sonnet 4.6 (~$1.70/month for 10 pentests).

Step 2: Configure environment

Add to dashboard/.env:

HINDSIGHT_ENABLED=true
HINDSIGHT_API_KEY=sk-ant-api03-your-key-here

Step 3: Restart the dashboard

cd dashboard
docker compose down
docker compose up -d

Hindsight starts automatically (API on :8888, UI on :9999).

Step 4: Ingest knowledge packs (one-time)

Populate Hindsight with the 60+ static knowledge packs:

python scripts/ingest-knowledge-to-hindsight.py [--url http://localhost:8888] [--bank bd-pentest]

Step 5: Verify

# Check Hindsight health
curl http://localhost:8880/api/v1/learning/hindsight/health

# Check Hindsight UI
open http://localhost:9999

How to disable again

Set HINDSIGHT_ENABLED=false in dashboard/.env and restart. All operations fall back to SQL-only — no errors, no data loss. Hindsight data persists in the hindsight-data Docker volume.

Configuration reference

| Variable | Default | Description |
|---|---|---|
| HINDSIGHT_ENABLED | false | Enable/disable Hindsight integration |
| HINDSIGHT_API_KEY | (required when enabled) | Anthropic API key (Hindsight only, not for claude -p) |
| HINDSIGHT_URL | http://hindsight:8888 | Hindsight API endpoint |
| HINDSIGHT_BANK_ID | bd-pentest | Memory bank for pentest techniques |
| HINDSIGHT_LLM_PROVIDER | anthropic | LLM provider |
| HINDSIGHT_LLM_MODEL | claude-sonnet-4-6 | Model for reasoning/reflection |
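A loader for these variables could be sketched as below, with the defaults copied from the table. The `HindsightConfig` and `load_hindsight_config` names are hypothetical, and treating only the string "true" (case-insensitive) as enabled is an assumption about how the dashboard parses the flag.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class HindsightConfig:
    enabled: bool
    api_key: Optional[str]
    url: str
    bank_id: str
    llm_provider: str
    llm_model: str


def load_hindsight_config(env: Mapping[str, str] = os.environ) -> HindsightConfig:
    """Read Hindsight settings from the environment, applying the documented defaults."""
    # Assumption: only "true" (any case) enables the integration.
    enabled = env.get("HINDSIGHT_ENABLED", "false").strip().lower() == "true"
    api_key = env.get("HINDSIGHT_API_KEY") or None
    if enabled and not api_key:
        # The key has no default and is required once Hindsight is enabled.
        raise ValueError("HINDSIGHT_API_KEY is required when HINDSIGHT_ENABLED=true")
    return HindsightConfig(
        enabled=enabled,
        api_key=api_key,
        url=env.get("HINDSIGHT_URL", "http://hindsight:8888"),
        bank_id=env.get("HINDSIGHT_BANK_ID", "bd-pentest"),
        llm_provider=env.get("HINDSIGHT_LLM_PROVIDER", "anthropic"),
        llm_model=env.get("HINDSIGHT_LLM_MODEL", "claude-sonnet-4-6"),
    )
```

Failing early on a missing key surfaces a misconfigured dashboard/.env at startup instead of at the first Hindsight call.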

Connections to other features

  • Confidence Calibration: the learning loop checks calibration data when ranking techniques. Payloads that frequently produce false positives are ranked lower, even if they trigger a match
  • Remediation Generator: knowing which technique exploited a vulnerability helps the Remediation Generator produce more targeted fix code
  • Knowledge packs: the learning index complements the static knowledge packs in .claude/skills/test-*/helpers/. Static packs contain curated techniques; the learning index adds engagement-proven ones
  • Hindsight memory: semantic recall enriches recommendations beyond exact SQL matches. Reflection synthesizes cross-engagement insights. Chain-findings uses graph recall for cross-engagement chain discovery
  • Micro-agents: wave dispatch injects Hindsight context for agent "experience" without context pollution
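As an illustration of the Confidence Calibration connection, the sketch below down-weights techniques whose payloads often produce false positives. The `Candidate` type and the scoring formula (success rate multiplied by one minus the false-positive rate) are assumptions chosen for the example; this page does not specify how the ranking is actually computed.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    technique: str
    success_rate: float         # fraction of uses that produced a finding
    false_positive_rate: float  # fraction of those findings later judged false


def calibrated_score(c: Candidate) -> float:
    # Assumed formula: discount the success rate by the false-positive rate.
    return c.success_rate * (1.0 - c.false_positive_rate)


def rank(candidates: list[Candidate]) -> list[Candidate]:
    """Order candidates by calibrated score, best first."""
    return sorted(candidates, key=calibrated_score, reverse=True)
```

Under this formula a technique with a 0.9 success rate and a 0.6 false-positive rate ranks below one with a 0.7 success rate and a 0.1 false-positive rate.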