Learning Loop

Extracts successful techniques, payloads, and bypasses from completed engagements and stores them in a learning index. On future engagements, the index recommends payloads that have historically worked on similar tech stacks.

How it works

1. After an engagement completes, the extraction endpoint parses all FINDING-*.md files.
2. For each verified finding, it extracts:
   - The CWE
   - The successful payload or technique
   - The bypass type (e.g., WAF bypass, filter evasion)
   - The target's tech stack
3. Each technique is stored with a success_count that increments each time the same payload succeeds again on a different engagement.
4. The recommendation engine queries the index and ranks techniques by success rate for the given tech stack and CWE.
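
The steps above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not the actual implementation: the `LearningIndex` class, its method names, and the choice of `(cwe, tech_stack, payload)` as the identity of a technique are all assumptions, and it ranks by raw `success_count` as a stand-in for the success rate the real engine uses.

```python
from dataclasses import dataclass, field


@dataclass
class Technique:
    cwe: str
    tech_stack: str
    technique: str
    payload: str
    bypass_type: str = ""
    success_count: int = 0
    # Engagements in which this payload succeeded (hypothetical bookkeeping).
    engagements: set = field(default_factory=set)


class LearningIndex:
    """In-memory stand-in for the SQL-backed learning index (hypothetical)."""

    def __init__(self):
        self._rows = {}

    def record_success(self, engagement, cwe, tech_stack, technique, payload, bypass_type=""):
        # Assumption: a technique is identified by (cwe, tech_stack, payload).
        key = (cwe, tech_stack, payload)
        if key not in self._rows:
            self._rows[key] = Technique(cwe, tech_stack, technique, payload, bypass_type)
        row = self._rows[key]
        # Count each engagement once, so re-running extraction does not inflate the score.
        if engagement not in row.engagements:
            row.engagements.add(engagement)
            row.success_count += 1
        return row

    def recommend(self, tech_stack, cwe, limit=20):
        # Case-insensitive substring match on the stack, exact match on the CWE.
        matches = [
            row for row in self._rows.values()
            if row.cwe == cwe and tech_stack.lower() in row.tech_stack.lower()
        ]
        matches.sort(key=lambda row: row.success_count, reverse=True)
        return matches[:limit]
```

Counting each engagement only once mirrors the rule that `success_count` increments when a payload succeeds on a different engagement, so extracting the same engagement twice leaves the ranking unchanged.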

What gets extracted

Field	Source	Example
`cwe`	Finding CWE tag	`CWE-89`
`tech_stack`	`context.json`	`Spring Boot + PostgreSQL`
`technique`	Finding description	`Time-based blind SQLi via ORDER BY`
`payload`	Finding PoC	`' OR SLEEP(5)-- -`
`bypass_type`	Finding notes	`WAF bypass via chunked encoding`

API endpoints

Extract techniques from engagement

POST /api/v1/learning/engagements/{engagement_name}/extract

Parses the engagement's findings and adds new techniques to the index. Returns the number of techniques extracted.

Response:

{
  "engagement": "acme-2026-q1",
  "techniques_extracted": 12,
  "techniques": [
    {
      "id": 45,
      "cwe": "CWE-89",
      "tech_stack": "Spring Boot + PostgreSQL",
      "technique": "Time-based blind SQLi via ORDER BY clause",
      "payload": "1 AND (SELECT 1 FROM (SELECT SLEEP(5))a)-- -",
      "bypass_type": "WAF chunked encoding",
      "success_count": 3,
      "last_used_at": "2026-03-15T16:00:00Z"
    }
  ]
}
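
As a rough illustration, the endpoint could be called from Python with only the standard library. The base URL is taken from the verification step later on this page; the helper names, the absence of authentication headers, and the timeout are assumptions.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

BASE_URL = "http://localhost:8880"  # assumption: dashboard address used in the Verify step


def extract_url(engagement_name, base_url=BASE_URL):
    # Percent-encode the name so it stays a single path segment.
    return f"{base_url}/api/v1/learning/engagements/{quote(engagement_name, safe='')}/extract"


def extract_techniques(engagement_name, base_url=BASE_URL, timeout=30):
    # Assumption: the endpoint needs no request body and no auth header.
    request = Request(extract_url(engagement_name, base_url), method="POST")
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)


def summarize(result):
    # Turn the documented response shape into short, readable lines.
    lines = [f"{result['engagement']}: {result['techniques_extracted']} techniques extracted"]
    for t in result.get("techniques", []):
        lines.append(f"  #{t['id']} {t['cwe']} {t['technique']} (success_count={t['success_count']})")
    return lines
```

Calling `summarize(extract_techniques("acme-2026-q1"))` against a running dashboard would then give a one-line overview followed by one line per technique.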

Get recommendations

GET /api/v1/learning/recommend?tech_stack=Spring+Boot&cwes=CWE-89,CWE-79&limit=20

Parameter	Type	Default	Description
`tech_stack`	query	`*`	Filter by tech stack
`cwes`	query	all	Comma-separated CWE IDs
`limit`	query	`20`	Max recommendations (1-100)

Response (LearningRecommendation[]):

[
  {
    "cwe": "CWE-89",
    "tech_stack": "Spring Boot + PostgreSQL",
    "recommended_payloads": [
      "' OR SLEEP(5)-- -",
      "1 UNION SELECT NULL,NULL,version()-- -"
    ],
    "recommended_techniques": [
      "Time-based blind via ORDER BY",
      "UNION-based via column count enumeration"
    ],
    "historical_success_rate": 0.73
  }
]
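
A small helper can build the query string and order the results client-side. This is a sketch under assumptions: `recommend_url` and `best_first` are hypothetical names, and the check on `limit` simply enforces the documented 1-100 range before the request is sent.

```python
from urllib.parse import urlencode


def recommend_url(base_url, tech_stack=None, cwes=None, limit=20):
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    params = {}
    if tech_stack:
        params["tech_stack"] = tech_stack
    if cwes:
        # The endpoint expects a single comma-separated value.
        params["cwes"] = ",".join(cwes)
    params["limit"] = limit
    # safe="," keeps the commas literal instead of encoding them as %2C.
    return f"{base_url}/api/v1/learning/recommend?{urlencode(params, safe=',')}"


def best_first(recommendations):
    # Sort LearningRecommendation objects by their historical success rate.
    return sorted(recommendations, key=lambda r: r["historical_success_rate"], reverse=True)
```

Omitting `tech_stack` or `cwes` leaves the parameter out of the URL entirely, which lets the server apply the defaults listed in the parameter table.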

Statistics

GET /api/v1/learning/stats

Aggregate statistics: total techniques, top CWEs, most successful payloads, coverage by tech stack.

List all techniques

GET /api/v1/learning/techniques?cwe=CWE-89&tech_stack=Spring+Boot&limit=50

Browse the full learning index with optional filters.

Trigger Hindsight reflection

POST /api/v1/learning/engagements/{engagement_name}/reflect

Triggers a Hindsight reflection cycle — analyzes all stored memories and synthesizes mental models (patterns, correlations, actionable insights). Requires Hindsight to be enabled.

Response:

{
  "status": "ok",
  "engagement": "acme-2026-q1",
  "tech_stack": "Spring Boot + PostgreSQL",
  "reflection": "Pattern: Java web apps using Spring Boot + Hibernate are 73% likely to have HQL injection when user input reaches dynamic queries. Most successful bypass: parameterized LIKE with wildcard injection."
}

Check Hindsight health

GET /api/v1/learning/hindsight/health

Returns the status of the Hindsight agent memory service.

CLI equivalent

The /learn skill runs the same extraction from the command line. It's automatically called at the end of a /pentest run.

Use /learn <engagement_name> --reflect to also trigger a Hindsight reflection cycle after extraction.

Hindsight agent memory integration

The learning loop uses a dual-write architecture: SQL (primary) + Hindsight (semantic memory).

What Hindsight adds

Capability	SQL (existing)	Hindsight (new)
Exact match by CWE/tech_stack	Yes	Yes
Semantic similarity search	No	Yes
Cross-engagement graph	No	Yes
Temporal memory	No	Yes
Reflection/mental models	No	Yes
Works when Hindsight is down	Yes	N/A (graceful fallback)

How it works

- Retain (dual-write): every extracted technique is written to SQL and also stored in Hindsight as an Experience Memory.
- Recall (enrichment): recommendations combine SQL exact-match results with Hindsight semantic search, which finds similar techniques even when tech stack names don't match exactly.
- Reflect (synthesis): after extraction, Hindsight can optionally synthesize mental models, that is, high-level patterns drawn from accumulated experience.
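
The retain and recall steps, together with the graceful fallback, might look roughly like the sketch below. Everything here is hypothetical: the `DualWriteStore` class, the `save`/`find` methods on the SQL store, the `retain`/`search` methods on the semantic store, and the `MemoryUnavailable` exception are illustrative names, not the Hindsight client API.

```python
class MemoryUnavailable(Exception):
    """Raised by the hypothetical semantic store when Hindsight cannot be reached."""


class DualWriteStore:
    def __init__(self, sql_store, semantic_store=None, enabled=False):
        self.sql = sql_store
        self.semantic = semantic_store
        # Semantic memory is used only when enabled and actually configured.
        self.enabled = enabled and semantic_store is not None

    def retain(self, technique):
        # SQL is the primary store, so this write always happens first.
        self.sql.save(technique)
        if self.enabled:
            try:
                self.semantic.retain(technique)
            except MemoryUnavailable:
                pass  # SQL already holds the record, so nothing is lost

    def recall(self, tech_stack, cwe):
        results = list(self.sql.find(tech_stack, cwe))
        if self.enabled:
            try:
                similar = self.semantic.search(f"{cwe} on {tech_stack}")
            except MemoryUnavailable:
                similar = []  # fall back to SQL-only results
            # Append semantic hits that SQL did not already return.
            seen = {t["payload"] for t in results}
            results += [t for t in similar if t["payload"] not in seen]
        return results
```

Writing to SQL first and treating the semantic write as best-effort is what makes the "works when Hindsight is down" row in the table above possible in this sketch.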

Micro-agent memory

Each wave of test agents receives Hindsight context before dispatch. The agent-dispatch protocol calls recall_techniques() with the current tech stack and skill scope, injecting relevant historical techniques into the agent's prompt. This gives agents "experience" without polluting their context window.
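
A minimal sketch of that injection step follows. The real `recall_techniques()` signature and return shape are not documented here, so the stub below, the `build_agent_prompt` helper, and the `max_items` cap are assumptions made for illustration.

```python
def recall_techniques(tech_stack, skill_scope):
    """Stub standing in for the real recall_techniques(); signature and data are assumed."""
    return [
        {"technique": "Time-based blind SQLi via ORDER BY", "payload": "' OR SLEEP(5)-- -"},
    ]


def build_agent_prompt(base_prompt, tech_stack, skill_scope, recall=recall_techniques, max_items=5):
    # Cap the number of recalled items so the added context stays small.
    techniques = recall(tech_stack, skill_scope)[:max_items]
    if not techniques:
        return base_prompt
    lines = [f"- {t['technique']}: {t['payload']}" for t in techniques]
    return base_prompt + "\n\nHistorically successful techniques:\n" + "\n".join(lines)
```

Capping the recalled items is one simple way to give an agent relevant history while keeping the addition to its context window bounded.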

Current status: DISABLED by default

Hindsight is integrated but disabled by default (HINDSIGHT_ENABLED=false). The SQL learning loop works independently. Enable Hindsight once you have at least 30 to 50 completed engagements, which is enough history to benefit from semantic recall and cross-engagement pattern synthesis.

How to enable Hindsight

Step 1: Get an Anthropic API key

Go to https://console.anthropic.com/settings/keys and create a new key. This key is used only for Hindsight memory operations. It does NOT affect claude -p, which continues to use your Max subscription.

Estimated cost: ~$0.17 per engagement with Sonnet 4.6 (~$1.70/month for 10 pentests).

Step 2: Configure environment

Add to dashboard/.env:

HINDSIGHT_ENABLED=true
HINDSIGHT_API_KEY=sk-ant-api03-your-key-here

Step 3: Restart the dashboard

cd dashboard
docker compose down
docker compose up -d

Hindsight starts automatically (API on :8888, UI on :9999).

Step 4: Ingest knowledge packs (one-time)

Populate Hindsight with the 60+ static knowledge packs:

python scripts/ingest-knowledge-to-hindsight.py [--url http://localhost:8888] [--bank bd-pentest]

Step 5: Verify

# Check Hindsight health
curl http://localhost:8880/api/v1/learning/hindsight/health

# Check Hindsight UI
open http://localhost:9999

How to disable again

Set HINDSIGHT_ENABLED=false in dashboard/.env and restart. All operations fall back to SQL-only — no errors, no data loss. Hindsight data persists in the hindsight-data Docker volume.

Configuration reference

Variable	Default	Description
`HINDSIGHT_ENABLED`	`false`	Enable/disable Hindsight integration
`HINDSIGHT_API_KEY`	(required when enabled)	Anthropic API key (Hindsight only, not for `claude -p`)
`HINDSIGHT_URL`	`http://hindsight:8888`	Hindsight API endpoint
`HINDSIGHT_BANK_ID`	`bd-pentest`	Memory bank for pentest techniques
`HINDSIGHT_LLM_PROVIDER`	`anthropic`	LLM provider
`HINDSIGHT_LLM_MODEL`	`claude-sonnet-4-6`	Model for reasoning/reflection
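
For illustration, these variables could be read with the defaults from the table as follows. The `HindsightConfig` dataclass and `load_config` function are hypothetical, and treating only the string `true` (case-insensitive) as enabled is an assumption about how the flag is parsed.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class HindsightConfig:
    enabled: bool
    api_key: str
    url: str
    bank_id: str
    llm_provider: str
    llm_model: str


def load_config(env=None):
    env = os.environ if env is None else env
    # Assumption: only "true" (any case) switches the integration on.
    enabled = env.get("HINDSIGHT_ENABLED", "false").strip().lower() == "true"
    api_key = env.get("HINDSIGHT_API_KEY", "")
    if enabled and not api_key:
        raise ValueError("HINDSIGHT_API_KEY is required when HINDSIGHT_ENABLED=true")
    return HindsightConfig(
        enabled=enabled,
        api_key=api_key,
        url=env.get("HINDSIGHT_URL", "http://hindsight:8888"),
        bank_id=env.get("HINDSIGHT_BANK_ID", "bd-pentest"),
        llm_provider=env.get("HINDSIGHT_LLM_PROVIDER", "anthropic"),
        llm_model=env.get("HINDSIGHT_LLM_MODEL", "claude-sonnet-4-6"),
    )
```

Failing fast on a missing key when the flag is on surfaces a misconfigured `dashboard/.env` at startup rather than at the first memory operation.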

Connections to other features

- Confidence Calibration: the learning loop checks calibration data when ranking techniques. Payloads that frequently produce false positives are ranked lower, even if they trigger a match.
- Remediation Generator: knowing which technique exploited a vulnerability helps the Remediation Generator produce more targeted fix code.
- Knowledge packs: the learning index complements the static knowledge packs in .claude/skills/test-*/helpers/. Static packs contain curated techniques; the learning index adds engagement-proven ones.
- Hindsight memory: semantic recall enriches recommendations beyond exact SQL matches. Reflection synthesizes cross-engagement insights. Chain-findings uses graph recall for cross-engagement chain discovery.
- Micro-agents: wave dispatch injects Hindsight context for agent "experience" without context pollution.
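
To make the Confidence Calibration connection concrete, one possible way to down-rank noisy payloads is to discount the success rate by the false-positive rate. The formula and the `attempts` and `false_positives` fields are assumptions for illustration; the actual calibration data and weighting are not described on this page.

```python
def calibrated_score(success_count, attempts, false_positives):
    # Hypothetical scoring: success rate discounted by the false-positive rate.
    if attempts <= 0:
        return 0.0
    success_rate = success_count / attempts
    false_positive_rate = min(false_positives / attempts, 1.0)
    return success_rate * (1.0 - false_positive_rate)


def rank(techniques):
    # Highest calibrated score first.
    return sorted(
        techniques,
        key=lambda t: calibrated_score(t["success_count"], t["attempts"], t["false_positives"]),
        reverse=True,
    )
```

Under this scoring, a payload that matches often but is frequently a false positive can fall below a less frequent but more reliable one.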