API Response Hardening

API responses intercepted through a proxy can reveal operational details well beyond what the UI displays. This layer sanitizes responses to strip internal configuration, methodology references, and raw data dumps before they leave the API.


Changes Summary

| Endpoint | Field Removed / Sanitized | Risk Before | Risk After |
|---|---|---|---|
| GET /engagements/{name} | raw field removed | Critical -- exposed full context.json | None |
| GET /engagements/{name}/findings/{id} | raw_markdown sanitized | Critical -- contained tool references | Low |
| GET /engagements/{name}/timeline | skill field genericized (viewer) | High -- exposed skill names + flags | None |
| GET /engagements/{name}/timeline | details field sanitized (viewer) | High -- contained technique names | None |

Removed: raw Field from Engagement Detail

What It Exposed

The EngagementDetail schema included a raw field that contained the entire context.json file:

{
  "name": "acme-pentest",
  "target": "https://acme.com",
  "tech_stack": { "backend": "Laravel", "waf": "Cloudflare" },
  "raw": {
    "stealth": {
      "mode": "chrome",
      "ua": "Mozilla/5.0...",
      "rate_limit": 2,
      "concurrency": 2,
      "jitter_min": 1,
      "jitter_max": 4
    },
    "attack_surface": {
      "graphql": true,
      "file_upload": true,
      "endpoints_discovered": 47,
      "parameters_discovered": 128
    },
    "auth": {
      "type": "jwt+session",
      "jwt": true,
      "session_cookie": "laravel_session",
      "csrf": "X-CSRF-TOKEN",
      "users": [
        { "role": "admin", "username": "admin@acme.com", "token": "eyJ..." }
      ]
    }
  }
}

This exposed the stealth configuration (rate limit, concurrency, jitter values, user agent), full auth tokens, the session cookie and CSRF header names, and the complete attack surface analysis.

Fix

The raw field was removed from the EngagementDetail Pydantic schema and the get_engagement() service function no longer passes raw=ctx.

The structured fields (tech_stack, auth, attack_surface, stealth) remain -- they contain only what the UI needs to render, without the full context.json dump.

from typing import Any

from pydantic import BaseModel, Field

# BEFORE
class EngagementDetail(BaseModel):
    # ... structured fields ...
    raw: dict[str, Any] = Field(default_factory=dict)  # FULL context.json

# AFTER
class EngagementDetail(BaseModel):
    # ... structured fields only ...
    # raw field removed -- no context.json exposure
    pass  # stands in for the elided fields so the snippet parses
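
One way to keep the field from creeping back in is a regression check that walks a response payload and reports any forbidden keys. The sketch below is an illustration and not part of the fix described above: `find_forbidden_keys` and `FORBIDDEN_KEYS` are hypothetical names, and only the `raw` key comes from this page.

```python
from typing import Any

# Assumption: keys that must never appear anywhere in an API response.
FORBIDDEN_KEYS = frozenset({"raw"})


def find_forbidden_keys(
    payload: Any,
    forbidden: frozenset = FORBIDDEN_KEYS,
    path: str = "$",
) -> list:
    """Return JSON-style paths of every forbidden key found in payload."""
    hits = []
    if isinstance(payload, dict):
        for key, value in payload.items():
            child = f"{path}.{key}"
            if key in forbidden:
                hits.append(child)
            # Keep descending so nested occurrences are reported too.
            hits.extend(find_forbidden_keys(value, forbidden, child))
    elif isinstance(payload, list):
        for index, item in enumerate(payload):
            hits.extend(find_forbidden_keys(item, forbidden, f"{path}[{index}]"))
    return hits


before = {"name": "acme-pentest", "raw": {"stealth": {"mode": "chrome"}}}
after = {"name": "acme-pentest", "tech_stack": {"backend": "Laravel"}}
print(find_forbidden_keys(before))  # ['$.raw']
print(find_forbidden_keys(after))   # []
```

A check like this could run against the serialized response in an API test, so that the test fails if a later schema change adds the field again.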

Sanitized: Finding raw_markdown

What It Exposed

The Finding schema includes raw_markdown -- the full FINDING-*.md file content. This can contain references to internal tools and knowledge packs in the form of technique attribution comments:

## Description
SQL injection via union-based technique (ref: knowledge-sqli.md, Section B3).
Discovered by .claude/skills/test-injection/SKILL.md step 4a using
nuclei-templates/http/sqli/generic-sqli.yaml as initial signal.

Fix

All raw_markdown content is passed through sanitize_finding_markdown(), which replaces the following with neutral placeholders:

  • Knowledge pack references (knowledge-*.md -> [technique-ref])
  • SKILL.md file paths (.claude/skills/... -> [test-module])
  • Internal boilerplate references -> [internal-ref]
  • Nuclei template absolute paths -> [scan-template]

The finding's structured fields (title, severity, description, PoC, remediation) are not sanitized -- they contain the deliverable content that clients need.
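
The substitutions listed above could be sketched with ordered regular-expression rules. This is a minimal illustration, not the project's actual sanitize_finding_markdown(): every pattern is an assumption inferred from the example finding, the Nuclei rule also accepts relative paths so that it matches that example, and the "internal boilerplate" rule (any remaining `.claude/` path) is a hypothetical stand-in because this page does not define what those references look like.

```python
import re

# (pattern, placeholder) pairs applied in order. All patterns are assumptions.
_RULES = [
    # Knowledge pack references, e.g. knowledge-sqli.md
    (re.compile(r"\bknowledge-[\w-]+\.md\b"), "[technique-ref]"),
    # SKILL.md file paths, e.g. .claude/skills/test-injection/SKILL.md
    (re.compile(r"\.claude/skills/[\w/-]+/SKILL\.md"), "[test-module]"),
    # Nuclei template paths, with or without an absolute prefix
    (re.compile(r"(?:/[\w.-]+)*/?nuclei-templates/[\w./-]+\.ya?ml"), "[scan-template]"),
    # Hypothetical catch-all for other internal paths under .claude/
    (re.compile(r"\.claude/[\w/-]+(?:\.\w+)?"), "[internal-ref]"),
]


def sanitize_finding_markdown(markdown: str) -> str:
    """Replace internal references in finding markdown with placeholders."""
    for pattern, placeholder in _RULES:
        markdown = pattern.sub(placeholder, markdown)
    return markdown


sample = (
    "SQL injection via union-based technique (ref: knowledge-sqli.md, Section B3).\n"
    "Discovered by .claude/skills/test-injection/SKILL.md step 4a using\n"
    "nuclei-templates/http/sqli/generic-sqli.yaml as initial signal."
)
print(sanitize_finding_markdown(sample))
```

Rule order matters in this sketch: the specific SKILL.md rule runs before the catch-all so that skill paths get the [test-module] placeholder and not [internal-ref].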


Role-Based Timeline Filtering

What It Exposed

Timeline events included the exact skill name and execution details:

{
  "timestamp": "2025-03-14T10:23:45Z",
  "level": "INFO",
  "skill": "test-injection --scope sqli --endpoints /api/users,/api/search",
  "action": "FINDING",
  "details": "Union-based SQLi confirmed on /api/users?sort= using knowledge-sqli.md B3 technique"
}

This revealed which specific skills ran, what scope they targeted, which endpoints were tested, and which knowledge pack techniques were applied.

Fix

For the viewer role, timeline events are sanitized:

| Field | Before | After (viewer) |
|---|---|---|
| skill | test-injection --scope sqli | Security Testing |
| details | Union-based SQLi confirmed... knowledge-sqli.md B3 | Union-based SQLi confirmed... [internal-ref] B3 |

Admin and pentester roles see the full, unsanitized timeline -- they need the technical details for engagement management.
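
The role gate described above might look like the following sketch. The names `filter_timeline_event`, `FULL_ACCESS_ROLES`, and the `to_label` callable are hypothetical, as is the regular expression for knowledge pack references. Sanitizing every role outside the full-access set, not only viewer, is also an assumption: it makes unknown roles fail closed.

```python
import re
from typing import Any, Callable, Dict

# Roles that see the unsanitized timeline, per the text above.
FULL_ACCESS_ROLES = frozenset({"admin", "pentester"})

# Assumed shape of a knowledge pack reference, e.g. knowledge-sqli.md
_KNOWLEDGE_REF = re.compile(r"\bknowledge-[\w-]+\.md\b")


def filter_timeline_event(
    event: Dict[str, Any],
    role: str,
    to_label: Callable[[str], str],
) -> Dict[str, Any]:
    """Return a copy of event, sanitized unless role has full access."""
    if role in FULL_ACCESS_ROLES:
        return dict(event)
    sanitized = dict(event)
    # Replace the skill name and its flags with a client-facing label.
    sanitized["skill"] = to_label(event.get("skill", ""))
    # Mask knowledge pack file names inside the free-text details.
    sanitized["details"] = _KNOWLEDGE_REF.sub("[internal-ref]", event.get("details", ""))
    return sanitized


event = {
    "timestamp": "2025-03-14T10:23:45Z",
    "level": "INFO",
    "skill": "test-injection --scope sqli --endpoints /api/users,/api/search",
    "action": "FINDING",
    "details": "Union-based SQLi confirmed on /api/users?sort= using knowledge-sqli.md B3 technique",
}
viewer_view = filter_timeline_event(event, "viewer", lambda _skill: "Security Testing")
print(viewer_view["skill"])
print(viewer_view["details"])
```

Returning a copy in both branches keeps the stored event untouched, so the same record can be served to different roles without one response affecting another.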

Skill Name Mapping

Internal skill names are mapped to generic phase labels:

| Internal Name | Client-Facing Label |
|---|---|
| recon | Reconnaissance |
| discover | Discovery |
| scan | Automated Scanning |
| walkthrough | Application Mapping |
| context | Context Analysis |
| route | Test Planning |
| test-injection, test-auth, ... (all 16) | Security Testing |
| verify | Verification |
| chain-findings | Analysis |
| report | Report Generation |
| pentest | Full Assessment |
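
The mapping above can be expressed as a lookup that first drops any command-line flags from the skill field. In this sketch, `client_label`, `PHASE_LABELS`, and `FALLBACK_LABEL` are hypothetical names. Matching the 16 test skills by their `test-` prefix is an assumption, since only two of them are named here, and the fallback label for unrecognized skills is likewise invented for illustration.

```python
# Internal skill name -> client-facing label, taken from the table above.
PHASE_LABELS = {
    "recon": "Reconnaissance",
    "discover": "Discovery",
    "scan": "Automated Scanning",
    "walkthrough": "Application Mapping",
    "context": "Context Analysis",
    "route": "Test Planning",
    "verify": "Verification",
    "chain-findings": "Analysis",
    "report": "Report Generation",
    "pentest": "Full Assessment",
}

TEST_SKILL_LABEL = "Security Testing"
FALLBACK_LABEL = "Assessment Activity"  # hypothetical label for unknown skills


def client_label(skill_field: str) -> str:
    """Map a raw skill field (name plus optional flags) to a generic label."""
    parts = skill_field.split()
    name = parts[0] if parts else ""
    # Assumption: every test skill shares the "test-" prefix.
    if name.startswith("test-"):
        return TEST_SKILL_LABEL
    # Unknown names never pass through verbatim.
    return PHASE_LABELS.get(name, FALLBACK_LABEL)


print(client_label("test-injection --scope sqli --endpoints /api/users,/api/search"))
print(client_label("chain-findings"))
```

Falling back to a generic label means a newly added skill cannot leak its internal name to viewers before the table is updated.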

Unchanged (Not Exposed)

These data points were already properly scoped and do not require additional sanitization:

  • Client portal findings (/api/v2/client/...) -- Already use the ClientFindingResponse schema, which excludes PoC, raw_markdown, and internal paths
  • Finding summaries (GET /engagements/{name}/findings) -- Only return id, title, severity, CVSS, endpoint
  • Engagement summaries (GET /engagements) -- Only return name, target, phases, severity counts
  • Access matrix -- Contains endpoint/role/result data, no methodology references