Scope Decomposition Rationale

The Problem: Context Degradation

Monolithic test skills that run 100+ turns in a single agent session suffer from context window degradation. As the conversation grows, the model's attention to earlier instructions, previously tested endpoints, and accumulated state diminishes. In penetration testing, this manifests as:

  • Forgetting endpoints tested in the first 30 minutes
  • Repeating the same payloads on different endpoints instead of adapting
  • Missing coverage on later endpoints because early findings consumed the reasoning budget
  • Drifting from structured methodology into ad-hoc testing

The solution is to decompose large skills into scoped sub-agents, each with a fresh context window focused on a specific vulnerability class or testing domain.

Decomposition Strategy

Six skills were identified as candidates for decomposition based on their breadth of coverage and typical turn count:

| Skill | Scopes | Sub-Agents | Rationale |
|---|---|---|---|
| /test-injection | sqli, xss, cmdi, ssti-xxe, misc | 5 | Each injection class requires different payloads, detection methods, and WAF bypass strategies |
| /test-auth | jwt, oauth, session | 3 | JWT algorithm attacks, OAuth flow manipulation, and session management are distinct protocol domains |
| /test-access | idor, authz, matrix | 3 | IDOR testing (neighbor IDs, field injection) is mechanically different from authorization matrix generation |
| /test-client | csrf-cors, dom, misc | 3 | CSRF/CORS/SameSite analysis, DOM XSS tracing, and miscellaneous client-side checks are independent |
| /test-ssrf | core, vector | 2 | Core SSRF (URL parameter manipulation) vs advanced vectors (PDF, git://, PlantUML, rogue MySQL) |
| /test-logic | business, race, upload | 3 | Business logic, race conditions, and file upload each require fundamentally different testing approaches |

Four additional skills, outside the six primary candidates, are also split into scopes in the wave schedule:

| Skill | Scopes | Sub-Agents |
|---|---|---|
| /test-api | rest, graphql, prototype | 3 |
| /test-advanced | hpp-crlf, bypass, mfa, host-method | 4 |
| /test-infra | smuggling, cache | 2 |
| /test-cloud | storage, takeover, k8s-cicd | 3 |

Total: 10 skills decomposed into 31 sub-agents at Level 2.
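
The totals can be checked with a short sketch. The `SKILL_SCOPES` dictionary and the `dispatch_plan` helper are hypothetical names used only for illustration; they are transcribed from the skill and scope lists in this section and are not the project's actual configuration format.

```python
# Hypothetical representation of the skill/scope lists in this section.
# This is an illustration, not the project's real configuration format.
SKILL_SCOPES = {
    "/test-injection": ["sqli", "xss", "cmdi", "ssti-xxe", "misc"],
    "/test-auth": ["jwt", "oauth", "session"],
    "/test-access": ["idor", "authz", "matrix"],
    "/test-client": ["csrf-cors", "dom", "misc"],
    "/test-ssrf": ["core", "vector"],
    "/test-logic": ["business", "race", "upload"],
    "/test-api": ["rest", "graphql", "prototype"],
    "/test-advanced": ["hpp-crlf", "bypass", "mfa", "host-method"],
    "/test-infra": ["smuggling", "cache"],
    "/test-cloud": ["storage", "takeover", "k8s-cicd"],
}


def dispatch_plan(skill_scopes):
    """Return one (skill, scope) pair per sub-agent to launch."""
    return [
        (skill, scope)
        for skill, scopes in skill_scopes.items()
        for scope in scopes
    ]


plan = dispatch_plan(SKILL_SCOPES)
print(len(SKILL_SCOPES), "skills,", len(plan), "sub-agents")
# prints "10 skills, 31 sub-agents"
```

Each pair in `plan` corresponds to one fresh context window, which is what the per-scope counts in this section add up to.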

Natural Split Points

Each decomposition follows natural boundaries in the testing methodology:

test-injection

| Scope | Coverage | Why Separate |
|---|---|---|
| sqli | SQL injection (UNION, blind, time-based, error-based, second-order) | Requires systematic database-specific payload adaptation |
| xss | Reflected, stored, DOM-based XSS with WAF bypass | Context-dependent encoding and filter evasion |
| cmdi | Command injection (8 timing variants, OOB) | Needs exhaustive blind testing before concluding negative |
| ssti-xxe | Server-Side Template Injection, XML External Entity, ESI | Template engine and parser-specific payloads |
| misc | LDAPi, NoSQLi, SOQL, ExifTool RCE, Mermaid injection | Long tail of injection types that share no payload logic |

test-auth

| Scope | Coverage | Why Separate |
|---|---|---|
| jwt | Algorithm confusion, weak signing key, audience bypass, expiration | JWT-specific tooling (hashcat, jwt_tool) |
| oauth | State null-byte, IDN homograph, pre-ATO, ROPC, PKCE, CSRF | Multi-step OAuth flow analysis |
| session | Session fixation, cookie flags, concurrent sessions, Flask/Redash signing | Cookie and session state management |

test-access

| Scope | Coverage | Why Separate |
|---|---|---|
| idor | Neighbor ID, UUID enumeration, field injection, horizontal escalation | High endpoint count, systematic ID manipulation |
| authz | Vertical privilege escalation, missing function-level access control | Role-based testing across all endpoints |
| matrix | Full access control matrix generation (endpoint x role) | Structured sweep, produces evidence/access-matrix.md |

Section-to-Scope Mapping

Each skill's SKILL.md is organized into sections (STEP 0, STEP 1, etc.). The --scope flag maps to specific sections:

/test-injection --scope sqli     ->  STEP 0 (init) + STEP 1 (SQLi) + STEP 6 (cleanup)
/test-injection --scope xss      ->  STEP 0 (init) + STEP 2 (XSS) + STEP 6 (cleanup)
/test-injection --scope cmdi     ->  STEP 0 (init) + STEP 3 (CMDi) + STEP 6 (cleanup)
/test-injection --scope ssti-xxe ->  STEP 0 (init) + STEP 4 (SSTI/XXE) + STEP 6 (cleanup)
/test-injection --scope misc     ->  STEP 0 (init) + STEP 5 (misc) + STEP 6 (cleanup)

STEP 0 (shared initialization from skill-boilerplate.md) and the final cleanup step always execute regardless of scope. This ensures stealth configuration, logging, kill switches, and finding output are consistent.
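
The mapping for /test-injection can be sketched as a small lookup. The constants and the `steps_for_scope` function are hypothetical names chosen for this example; the step numbers come from the mapping listed in this section, and how the real skill selects sections from SKILL.md is not specified here.

```python
# Illustrative only: step numbers follow the /test-injection mapping in this
# section; the names below are assumptions, not part of the actual skill.
INIT_STEP = 0      # shared initialization, always runs
CLEANUP_STEP = 6   # final cleanup, always runs
SCOPE_TO_STEP = {
    "sqli": 1,
    "xss": 2,
    "cmdi": 3,
    "ssti-xxe": 4,
    "misc": 5,
}


def steps_for_scope(scope: str) -> list[int]:
    """Return the SKILL.md step numbers a scoped sub-agent executes."""
    if scope not in SCOPE_TO_STEP:
        raise ValueError(f"unknown scope: {scope!r}")
    # Init and cleanup bracket the single scope-specific step.
    return [INIT_STEP, SCOPE_TO_STEP[scope], CLEANUP_STEP]


print(steps_for_scope("cmdi"))  # prints [0, 3, 6]
```

Keeping the init and cleanup steps outside the lookup table makes it hard to define a scope that accidentally skips them.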

Backward Compatibility

Scoped dispatch is additive. Skills work identically without the --scope flag:

# Full skill (all scopes sequentially, single agent)
/test-injection https://target.com

# Scoped (focused sub-agent)
/test-injection https://target.com --scope sqli

When the wave coordinator dispatches agents, it passes --scope to activate focused mode. When a user runs a skill manually without --scope, the skill executes all sections sequentially in a single agent, preserving the pre-V3 behavior.
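
One way to picture the optional flag is a parser in which `--scope` defaults to "run everything". This is a minimal sketch under stated assumptions: `parse_invocation` and `INJECTION_SCOPES` are hypothetical names, and the real skills may parse their arguments differently.

```python
import argparse
import shlex

# Scopes for /test-injection, in the order listed in this document.
INJECTION_SCOPES = ["sqli", "xss", "cmdi", "ssti-xxe", "misc"]


def parse_invocation(command: str) -> dict:
    """Parse a skill invocation into the skill name, target, and scopes to run."""
    skill, *rest = shlex.split(command)
    parser = argparse.ArgumentParser(prog=skill)
    parser.add_argument("target")
    # --scope is optional; omitting it preserves the full-skill behavior.
    parser.add_argument("--scope", choices=INJECTION_SCOPES, default=None)
    args = parser.parse_args(rest)
    # No flag: every scope runs sequentially in one agent.
    scopes = [args.scope] if args.scope else list(INJECTION_SCOPES)
    return {"skill": skill, "target": args.target, "scopes": scopes}


full = parse_invocation("/test-injection https://target.com")
scoped = parse_invocation("/test-injection https://target.com --scope sqli")
print(full["scopes"])    # prints ['sqli', 'xss', 'cmdi', 'ssti-xxe', 'misc']
print(scoped["scopes"])  # prints ['sqli']
```

Because the unscoped call resolves to the full scope list, existing manual invocations need no changes when scoped dispatch is introduced.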

Impact on Coverage Quality

Lab evaluations comparing monolithic vs scoped execution show:

  • Endpoint coverage increases from ~65% to ~85% because each sub-agent focuses its full context budget on a smaller test surface
  • False positive rate decreases because each agent maintains clearer state about what has been tested and what responses look like for that specific vulnerability class
  • Time-to-first-finding decreases because scoped agents skip the test sections outside their scope and go directly to their assigned tests
  • Context isolation prevents one finding from biasing subsequent tests (e.g., finding SQLi on one endpoint should not cause the agent to over-focus on SQLi at the expense of XSS)