API Response Hardening

Anyone who routes traffic through an intercepting proxy sees the full API responses, which can reveal operational details far beyond what the UI displays. This layer sanitizes responses to remove internal configuration, methodology references, and raw data dumps.

Changes Summary

| Endpoint | Field Removed / Sanitized | Risk Before | Risk After |
| --- | --- | --- | --- |
| `GET /engagements/{name}` | `raw` field removed | Critical -- exposed full `context.json` | None |
| `GET /engagements/{name}/findings/{id}` | `raw_markdown` sanitized | Critical -- contained tool references | Low |
| `GET /engagements/{name}/timeline` | `skill` field genericized (viewer) | High -- exposed skill names + flags | None |
| `GET /engagements/{name}/timeline` | `details` field sanitized (viewer) | High -- contained technique names | None |

Removed: `raw` Field from Engagement Detail

What It Exposed

The `EngagementDetail` schema included a `raw` field that contained the entire `context.json` file:

{
  "name": "acme-pentest",
  "target": "https://acme.com",
  "tech_stack": { "backend": "Laravel", "waf": "Cloudflare" },
  "raw": {
    "stealth": {
      "mode": "chrome",
      "ua": "Mozilla/5.0...",
      "rate_limit": 2,
      "concurrency": 2,
      "jitter_min": 1,
      "jitter_max": 4
    },
    "attack_surface": {
      "graphql": true,
      "file_upload": true,
      "endpoints_discovered": 47,
      "parameters_discovered": 128
    },
    "auth": {
      "type": "jwt+session",
      "jwt": true,
      "session_cookie": "laravel_session",
      "csrf": "X-CSRF-TOKEN",
      "users": [
        { "role": "admin", "username": "admin@acme.com", "token": "eyJ..." }
      ]
    }
  }
}

This exposed the stealth configuration (rate limits, jitter values, user agent), full auth tokens, the session cookie and CSRF header names, and the complete attack surface analysis.

Fix

The `raw` field was removed from the `EngagementDetail` Pydantic schema, and the `get_engagement()` service function no longer passes `raw=ctx`.

The structured fields (`tech_stack`, `auth`, `attack_surface`, `stealth`) remain -- they contain only what the UI needs to render, without the full `context.json` dump.

from typing import Any

from pydantic import BaseModel, Field


# BEFORE
class EngagementDetail(BaseModel):
    # ... structured fields ...
    raw: dict[str, Any] = Field(default_factory=dict)  # FULL context.json


# AFTER
class EngagementDetail(BaseModel):
    # ... structured fields only ...
    # raw field removed -- no context.json exposure
    ...  # placeholder so the elided class body still parses
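
A removal like this is easy to undo by accident, so a regression check on the serialized response can help. The sketch below is illustrative only and is not part of the project: `find_forbidden_keys` and `FORBIDDEN_KEYS` are hypothetical names, and the only key taken from the text above is `raw`. It walks a decoded JSON payload and reports the path of every forbidden key it finds.

```python
from typing import Any, Iterator

# Keys that should never appear in a hardened response. Only "raw" comes from
# the text; the set is a hypothetical starting point to extend as needed.
FORBIDDEN_KEYS = {"raw"}


def find_forbidden_keys(
    payload: Any, forbidden: set[str] = FORBIDDEN_KEYS, path: str = "$"
) -> Iterator[str]:
    """Yield a JSON-style path for every forbidden key anywhere in payload."""
    if isinstance(payload, dict):
        for key, value in payload.items():
            child = f"{path}.{key}"
            if key in forbidden:
                yield child
            # Keep descending so nested occurrences are reported as well.
            yield from find_forbidden_keys(value, forbidden, child)
    elif isinstance(payload, list):
        for index, item in enumerate(payload):
            yield from find_forbidden_keys(item, forbidden, f"{path}[{index}]")


before = {"name": "acme-pentest", "raw": {"stealth": {"mode": "chrome"}}}
after = {"name": "acme-pentest", "tech_stack": {"backend": "Laravel"}}

print(list(find_forbidden_keys(before)))
print(list(find_forbidden_keys(after)))
```

In a test suite, the same helper could be run against the decoded body of `GET /engagements/{name}` with an assertion that the result is empty.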

Sanitized: Finding `raw_markdown`

What It Exposed

The Finding schema includes raw_markdown -- the full FINDING-*.md file content. This can contain references to internal tools and knowledge packs in the form of technique attribution comments:

## Description
SQL injection via union-based technique (ref: knowledge-sqli.md, Section B3).
Discovered by .claude/skills/test-injection/SKILL.md step 4a using
nuclei-templates/http/sqli/generic-sqli.yaml as initial signal.

Fix

All `raw_markdown` content is passed through `sanitize_finding_markdown()`, which replaces the following references with neutral placeholders:

- Knowledge pack references (`knowledge-*.md` -> `[technique-ref]`)
- SKILL.md file paths (`.claude/skills/...` -> `[test-module]`)
- Internal boilerplate references -> `[internal-ref]`
- Nuclei template absolute paths -> `[scan-template]`

The finding's structured fields (title, severity, description, PoC, remediation) are not sanitized -- they contain the deliverable content that clients need.
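
The replacement rules listed above can be sketched with a few regular expressions. This is a minimal illustration, not the project's `sanitize_finding_markdown()`: the function name `sanitize_finding_markdown_sketch` and all three patterns are assumptions inferred only from the example finding shown earlier. The "internal boilerplate" rule is left out because the text does not say what those references look like.

```python
import re

# (pattern, placeholder) pairs. The patterns are guesses at the reference
# formats, based solely on the example finding in this document.
_REPLACEMENTS = [
    (re.compile(r"\bknowledge-[\w-]+\.md\b"), "[technique-ref]"),
    (re.compile(r"\.claude/skills/[\w/-]+(?:\.md)?"), "[test-module]"),
    # Optional leading path characters cover absolute template paths.
    (re.compile(r"[\w./~-]*nuclei-templates/[\w/-]+\.ya?ml"), "[scan-template]"),
]


def sanitize_finding_markdown_sketch(markdown: str) -> str:
    """Replace internal references with neutral placeholders."""
    for pattern, placeholder in _REPLACEMENTS:
        markdown = pattern.sub(placeholder, markdown)
    return markdown


sample = (
    "SQL injection via union-based technique (ref: knowledge-sqli.md, Section B3).\n"
    "Discovered by .claude/skills/test-injection/SKILL.md step 4a using\n"
    "nuclei-templates/http/sqli/generic-sqli.yaml as initial signal."
)
print(sanitize_finding_markdown_sketch(sample))
```

Keeping the rules in a single table makes it straightforward to add a pattern when a new kind of internal reference turns up in a finding.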

Role-Based Timeline Filtering

What It Exposed

Timeline events included the exact skill name and execution details:

{
  "timestamp": "2025-03-14T10:23:45Z",
  "level": "INFO",
  "skill": "test-injection --scope sqli --endpoints /api/users,/api/search",
  "action": "FINDING",
  "details": "Union-based SQLi confirmed on /api/users?sort= using knowledge-sqli.md B3 technique"
}

This reveals which specific skills ran, what scope they targeted, which endpoints were tested, and which knowledge pack techniques were applied.

Fix

For the viewer role, timeline events are sanitized:

| Field | Before | After (viewer) |
| --- | --- | --- |
| `skill` | `test-injection --scope sqli` | `Security Testing` |
| `details` | `Union-based SQLi confirmed... knowledge-sqli.md B3` | `Union-based SQLi confirmed... [internal-ref] B3` |

Admin and pentester roles see the full, unsanitized timeline -- they need the technical details for engagement management.
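
The role gate described above might look roughly like the sketch below. It is an assumption-laden illustration, not the actual service code: `sanitize_timeline_event`, `FULL_ACCESS_ROLES`, and the `label_for` callback are hypothetical names, and the knowledge-pack pattern is inferred from the example event. The sketch also sanitizes for every role other than admin and pentester, not just viewer, which is a fail-closed design choice of the example and not something the text states.

```python
import re
from typing import Callable

# Roles the text says receive the unsanitized timeline.
FULL_ACCESS_ROLES = {"admin", "pentester"}

# Assumed shape of a knowledge pack reference, taken from the example event.
_KNOWLEDGE_REF = re.compile(r"\bknowledge-[\w-]+\.md\b")


def sanitize_timeline_event(
    event: dict, role: str, label_for: Callable[[str], str]
) -> dict:
    """Return the event unchanged for full-access roles, a scrubbed copy otherwise."""
    if role in FULL_ACCESS_ROLES:
        return event
    cleaned = dict(event)  # copy so the stored event is never mutated
    cleaned["skill"] = label_for(event.get("skill", ""))
    cleaned["details"] = _KNOWLEDGE_REF.sub("[internal-ref]", event.get("details", ""))
    return cleaned


event = {
    "timestamp": "2025-03-14T10:23:45Z",
    "level": "INFO",
    "skill": "test-injection --scope sqli --endpoints /api/users,/api/search",
    "action": "FINDING",
    "details": "Union-based SQLi confirmed on /api/users?sort= using knowledge-sqli.md B3 technique",
}

viewer_event = sanitize_timeline_event(event, "viewer", lambda _skill: "Security Testing")
print(viewer_event["skill"])
print(viewer_event["details"])
```

Passing the label lookup in as a callback keeps the role check separate from the skill name mapping, so each can be tested on its own.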

Skill Name Mapping

Internal skill names are mapped to generic phase labels:

| Internal Name | Client-Facing Label |
| --- | --- |
| `recon` | Reconnaissance |
| `discover` | Discovery |
| `scan` | Automated Scanning |
| `walkthrough` | Application Mapping |
| `context` | Context Analysis |
| `route` | Test Planning |
| `test-injection`, `test-auth`, ... (all 16) | Security Testing |
| `verify` | Verification |
| `chain-findings` | Analysis |
| `report` | Report Generation |
| `pentest` | Full Assessment |
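
The mapping above translates naturally into a lookup table. The sketch below is hypothetical and makes three assumptions that the text does not confirm: that the skill name is the first whitespace-separated token of the `skill` string, that all 16 testing skills share the `test-` prefix (only `test-injection` and `test-auth` are named), and that unknown skills fall back to a generic label (`FALLBACK_LABEL` is invented for the example).

```python
# Labels copied from the mapping table; the test-* family is handled by prefix.
PHASE_LABELS = {
    "recon": "Reconnaissance",
    "discover": "Discovery",
    "scan": "Automated Scanning",
    "walkthrough": "Application Mapping",
    "context": "Context Analysis",
    "route": "Test Planning",
    "verify": "Verification",
    "chain-findings": "Analysis",
    "report": "Report Generation",
    "pentest": "Full Assessment",
}

# Hypothetical default, so an unmapped skill name is never echoed to a client.
FALLBACK_LABEL = "Assessment Activity"


def client_facing_label(skill: str) -> str:
    """Map a raw skill string (name plus optional flags) to a generic label."""
    parts = skill.split()
    name = parts[0] if parts else ""  # drop flags such as --scope
    if name.startswith("test-"):  # assumption: all testing skills use this prefix
        return "Security Testing"
    return PHASE_LABELS.get(name, FALLBACK_LABEL)


print(client_facing_label("test-injection --scope sqli --endpoints /api/users,/api/search"))
print(client_facing_label("chain-findings"))
```

Returning a fallback label for anything unrecognized means a newly added skill stays hidden from clients until someone maps it explicitly.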

Unchanged (Not Exposed)

These data points were already properly scoped and do not require additional sanitization:

- Client portal findings (`/api/v2/client/...`) -- already use the `ClientFindingResponse` schema, which excludes PoC, `raw_markdown`, and internal paths
- Finding summaries (`GET /engagements/{name}/findings`) -- return only id, title, severity, CVSS, endpoint
- Engagement summaries (`GET /engagements`) -- return only name, target, phases, severity counts
- Access matrix -- contains endpoint/role/result data, no methodology references
