Scope Decomposition Rationale
The Problem: Context Degradation
Monolithic test skills that run 100+ turns in a single agent session suffer from context window degradation. As the conversation grows, the model's attention to earlier instructions, previously tested endpoints, and accumulated state diminishes. In penetration testing, this manifests as:
- Forgetting endpoints tested in the first 30 minutes
- Repeating the same payloads on different endpoints instead of adapting
- Missing coverage on later endpoints because early findings consumed the reasoning budget
- Drifting from structured methodology into ad-hoc testing
The solution is to decompose large skills into scoped sub-agents, each with a fresh context window focused on a specific vulnerability class or testing domain.
Decomposition Strategy
Six skills were identified as candidates for decomposition based on their breadth of coverage and typical turn count:
| Skill | Scopes | Sub-Agents | Rationale |
|---|---|---|---|
| /test-injection | sqli, xss, cmdi, ssti-xxe, misc | 5 | Each injection class requires different payloads, detection methods, and WAF bypass strategies |
| /test-auth | jwt, oauth, session | 3 | JWT algorithm attacks, OAuth flow manipulation, and session management are distinct protocol domains |
| /test-access | idor, authz, matrix | 3 | IDOR testing (neighbor IDs, field injection) is mechanically different from authorization matrix generation |
| /test-client | csrf-cors, dom, misc | 3 | CSRF/CORS/SameSite analysis, DOM XSS tracing, and miscellaneous client-side checks are independent |
| /test-ssrf | core, vector | 2 | Core SSRF (URL parameter manipulation) vs advanced vectors (PDF, git://, PlantUML, rogue MySQL) |
| /test-logic | business, race, upload | 3 | Business logic, race conditions, and file upload each require fundamentally different testing approaches |
Four additional skills, outside the six candidates above, also define scopes in the wave schedule:
| Skill | Scopes | Sub-Agents |
|---|---|---|
| /test-api | rest, graphql, prototype | 3 |
| /test-advanced | hpp-crlf, bypass, mfa, host-method | 4 |
| /test-infra | smuggling, cache | 2 |
| /test-cloud | storage, takeover, k8s-cicd | 3 |
Total: 10 skills split into 31 scoped sub-agents at Level 2 (19 from the six candidate skills, 12 from the four additional skills).
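The scope lists in the two tables can be captured as a small registry, which also makes the totals easy to check. The following is a minimal sketch, not the coordinator's actual data structure: the `SCOPES` dictionary and the `dispatch_commands` helper are hypothetical names introduced for illustration, and only the skill names, scope names, and `--scope` flag come from this page.

```python
# Scope registry transcribed from the two tables above.
# SCOPES and dispatch_commands are hypothetical names, not part of the tool.
SCOPES = {
    "test-injection": ["sqli", "xss", "cmdi", "ssti-xxe", "misc"],
    "test-auth": ["jwt", "oauth", "session"],
    "test-access": ["idor", "authz", "matrix"],
    "test-client": ["csrf-cors", "dom", "misc"],
    "test-ssrf": ["core", "vector"],
    "test-logic": ["business", "race", "upload"],
    "test-api": ["rest", "graphql", "prototype"],
    "test-advanced": ["hpp-crlf", "bypass", "mfa", "host-method"],
    "test-infra": ["smuggling", "cache"],
    "test-cloud": ["storage", "takeover", "k8s-cicd"],
}


def dispatch_commands(target):
    """Build one scoped invocation string per sub-agent."""
    return [
        f"/{skill} {target} --scope {scope}"
        for skill, scopes in SCOPES.items()
        for scope in scopes
    ]


commands = dispatch_commands("https://target.com")
print(len(SCOPES), len(commands))  # 10 31
```

Keeping the registry in one place means the per-skill sub-agent counts are derived from the scope lists rather than maintained by hand.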
Natural Split Points
Each decomposition follows natural boundaries in the testing methodology:
test-injection
| Scope | Coverage | Why Separate |
|---|---|---|
| sqli | SQL injection (UNION, blind, time-based, error-based, second-order) | Requires systematic database-specific payload adaptation |
| xss | Reflected, stored, DOM-based XSS with WAF bypass | Context-dependent encoding and filter evasion |
| cmdi | Command injection (8 timing variants, OOB) | Needs exhaustive blind testing before concluding negative |
| ssti-xxe | Server-Side Template Injection, XML External Entity, ESI | Template engine and parser-specific payloads |
| misc | LDAPi, NoSQLi, SOQL, ExifTool RCE, Mermaid injection | Long tail of injection types that share no payload logic |
test-auth
| Scope | Coverage | Why Separate |
|---|---|---|
| jwt | Algorithm confusion, weak signing key, audience bypass, expiration | JWT-specific tooling (hashcat, jwt_tool) |
| oauth | State null-byte, IDN homograph, pre-ATO, ROPC, PKCE, CSRF | Multi-step OAuth flow analysis |
| session | Session fixation, cookie flags, concurrent sessions, Flask/Redash signing | Cookie and session state management |
test-access
| Scope | Coverage | Why Separate |
|---|---|---|
| idor | Neighbor ID, UUID enumeration, field injection, horizontal escalation | High endpoint count, systematic ID manipulation |
| authz | Vertical privilege escalation, missing function-level access control | Role-based testing across all endpoints |
| matrix | Full access control matrix generation (endpoint × role) | Structured sweep, produces evidence/access-matrix.md |
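To make the endpoint × role sweep of the matrix scope concrete, the sketch below renders observed results as a Markdown table. The actual layout of evidence/access-matrix.md is not specified here, so the column layout, the `render_access_matrix` helper, and the sample endpoints, roles, and status codes are all assumptions for illustration only.

```python
def render_access_matrix(results, roles):
    """Render {(endpoint, role): status} as a Markdown table.

    Hypothetical helper: the real access-matrix.md format may differ.
    """
    endpoints = sorted({endpoint for endpoint, _ in results})
    header = "| Endpoint | " + " | ".join(roles) + " |"
    divider = "|---|" + "---|" * len(roles)
    rows = []
    for endpoint in endpoints:
        # Untested (endpoint, role) pairs are marked explicitly, not left blank.
        cells = [str(results.get((endpoint, role), "untested")) for role in roles]
        rows.append(f"| {endpoint} | " + " | ".join(cells) + " |")
    return "\n".join([header, divider, *rows])


# Illustrative data only; not taken from any real engagement.
sample = {
    ("GET /api/users", "admin"): 200,
    ("GET /api/users", "user"): 403,
    ("DELETE /api/users/1", "admin"): 204,
}
matrix = render_access_matrix(sample, ["admin", "user"])
print(matrix)
```

Marking missing cells as "untested" keeps coverage gaps visible in the evidence file instead of hiding them behind empty cells.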
Section-to-Scope Mapping
Each skill's SKILL.md is organized into sections (STEP 0, STEP 1, etc.). The --scope flag maps to specific sections:
```
/test-injection --scope sqli     -> STEP 0 (init) + STEP 1 (SQLi)     + STEP 6 (cleanup)
/test-injection --scope xss      -> STEP 0 (init) + STEP 2 (XSS)      + STEP 6 (cleanup)
/test-injection --scope cmdi     -> STEP 0 (init) + STEP 3 (CMDi)     + STEP 6 (cleanup)
/test-injection --scope ssti-xxe -> STEP 0 (init) + STEP 4 (SSTI/XXE) + STEP 6 (cleanup)
/test-injection --scope misc     -> STEP 0 (init) + STEP 5 (misc)     + STEP 6 (cleanup)
```
STEP 0 (shared initialization from skill-boilerplate.md) and the final cleanup step always execute regardless of scope. This ensures stealth configuration, logging, kill switches, and finding output are consistent.
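The mapping for /test-injection can be expressed as a lookup that always wraps the selected step with initialization and cleanup. This is a minimal sketch under the step numbering shown above; `SCOPE_STEP` and `steps_for` are hypothetical names, and how the skill selects sections internally is not documented here.

```python
from typing import List, Optional

# Step numbers follow the /test-injection mapping above.
INIT_STEP = 0
CLEANUP_STEP = 6
SCOPE_STEP = {"sqli": 1, "xss": 2, "cmdi": 3, "ssti-xxe": 4, "misc": 5}


def steps_for(scope: Optional[str] = None) -> List[int]:
    """Return the STEP numbers to execute for a given scope.

    With no scope, every step runs in order (the unscoped behavior).
    """
    if scope is None:
        return list(range(INIT_STEP, CLEANUP_STEP + 1))
    if scope not in SCOPE_STEP:
        raise ValueError(f"unknown scope: {scope!r}")
    # Init and cleanup bracket the single scoped step.
    return [INIT_STEP, SCOPE_STEP[scope], CLEANUP_STEP]


print(steps_for("cmdi"))  # [0, 3, 6]
print(steps_for())        # [0, 1, 2, 3, 4, 5, 6]
```

Rejecting unknown scopes outright, rather than silently running everything, avoids a mistyped scope turning a focused sub-agent back into a monolithic run.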
Backward Compatibility
Scoped dispatch is additive. Skills work identically without the --scope flag:
```
# Full skill (all scopes sequentially, single agent)
/test-injection https://target.com

# Scoped (focused sub-agent)
/test-injection https://target.com --scope sqli
```
When the wave coordinator dispatches agents, it passes --scope to activate focused mode. When a user runs a skill manually without --scope, the skill executes all sections sequentially in a single agent, preserving the pre-V3 behavior.
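The additive nature of the flag can be illustrated with an optional argument whose absence selects the full run. The sketch below uses Python's `argparse` purely as an illustration; the skills are not known to parse arguments this way, and `parse_invocation` and `INJECTION_SCOPES` are hypothetical names.

```python
import argparse

# Scopes of /test-injection as listed on this page.
INJECTION_SCOPES = ["sqli", "xss", "cmdi", "ssti-xxe", "misc"]


def parse_invocation(argv):
    """Parse a /test-injection style invocation (illustrative only)."""
    parser = argparse.ArgumentParser(prog="/test-injection")
    parser.add_argument("target", help="base URL of the system under test")
    # Optional flag: omitting it keeps the original full-skill behavior.
    parser.add_argument("--scope", choices=INJECTION_SCOPES, default=None)
    return parser.parse_args(argv)


full = parse_invocation(["https://target.com"])
scoped = parse_invocation(["https://target.com", "--scope", "sqli"])

# None means "run every section sequentially in a single agent".
print(full.scope)    # None
print(scoped.scope)  # sqli
```

Because the default is `None` rather than a specific scope, existing manual invocations keep working unchanged while the coordinator opts in to focused mode explicitly.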
Impact on Coverage Quality
Lab evaluations comparing monolithic vs scoped execution show:
- Endpoint coverage increases from ~65% to ~85% because each sub-agent focuses its full context budget on a smaller test surface
- False positive rate decreases because each agent maintains clearer state about what has been tested and what responses look like for that specific vulnerability class
- Time-to-first-finding decreases because scoped agents skip the sections outside their scope (shared STEP 0 initialization still runs) and go directly to their assigned tests
- Context isolation prevents one finding from biasing subsequent tests (e.g., finding SQLi on one endpoint should not cause the agent to over-focus on SQLi at the expense of XSS)