Phase 4: Manual Testing (Wave Coordinator)¶
Overview¶
Phase 4 executes AI-driven penetration testing using 17 specialized test skills deployed in parallel waves. The Wave Coordinator manages scheduling, parallelism, health checks, and result aggregation.
Purpose: Systematically test every endpoint for the full spectrum of web application vulnerabilities using AI-powered reasoning.
CRITICAL: ALL 17 Skills are MANDATORY¶
Phase 4 NEVER skips any skill, regardless of the application type. Each skill covers vulnerabilities that might not be obvious from the app's apparent purpose:
- Even "simple" apps without user authentication need JWT testing (internal APIs, third-party integrations)
- Apps without file uploads still need deserialization testing (API payloads)
- Financial apps need infrastructure testing (WAF bypass, request smuggling)
The 17 Test Skills¶
Injection Vulnerabilities (5 scopes: sqli, xss, cmdi, ssti-xxe, misc)¶
/test-injection — Tests for SQL injection, NoSQL injection, OS command injection, template injection, and more
Scopes:
- sqli: SQL injection (blind, error-based, union-based, time-based)
- xss: Cross-site scripting (reflected, stored, DOM)
- cmdi: Command injection (OS commands, alternative syntaxes)
- ssti-xxe: Server-side template injection, XXE, ESI injection
- misc: LDAP injection, Mermaid injection, ExifTool RCE, SOQL injection
Authentication Attacks (3 scopes: jwt, oauth, session)¶
/test-auth — Tests authentication mechanisms and bypass techniques
Scopes:
- jwt: JWT attacks (weak secret, algorithm confusion, claim manipulation, key cracking)
- oauth: OAuth flow attacks (state bypass, PKCE bypass, pre-ATO, ROPC abuse, CSRF)
- session: Session fixation, session hijacking, cookie vulnerabilities, SAML attacks
Access Control (3 scopes: idor, authz, matrix)¶
/test-access — Tests authorization and privilege escalation
Scopes:
- idor: Insecure direct object reference (ID enumeration, neighboring IDs)
- authz: Function-level authorization (can low-privilege user access high-privilege functions?)
- matrix: Multi-user access matrix (comprehensive role-based testing)
SSRF (2 scopes: core, vector)¶
/test-ssrf — Tests server-side request forgery attacks
Scopes:
- core: SSRF primitives (internal ports, redirect chains, protocol smuggling)
- vector: SSRF vectors (PDF filters, Grafana chains, rogue MySQL)
Client-Side Security (3 scopes: csrf-cors, dom, misc)¶
/test-client — Tests client-side vulnerabilities
Scopes:
- csrf-cors: CSRF attacks, CORS misconfigurations, SameSite bypass
- dom: DOM XSS, DOM clobbering, prototype pollution
- misc: Clickjacking, clipboard XSS, service worker theft, CSP bypass
Business Logic (3 scopes: business, race, upload)¶
/test-logic — Tests flawed business logic and workflows
Scopes:
- business: Price/quantity manipulation, workflow bypass, coupon abuse, financial logic
- race: Race conditions, atomicity violations, concurrency issues
- upload: File upload bypass (extension, magic bytes, path traversal)
API Security (3 scopes: rest, graphql, prototype)¶
/test-api — Tests API-specific vulnerabilities
Scopes:
- rest: REST API security (method override, content-type switching, parameter pollution)
- graphql: GraphQL attacks (introspection, query complexity, aliases, directives)
- prototype: Prototype pollution, recursive merge exploitation, constructor abuse
Infrastructure (2 scopes: smuggling, cache)¶
/test-infra — Tests infrastructure-level vulnerabilities
Scopes:
- smuggling: HTTP request smuggling (CL.TE, TE.CL, HTTP/2 desync)
- cache: Cache poisoning, cache deception, host header attacks
Cloud Security (3 scopes: storage, takeover, k8s-cicd)¶
/test-cloud — Tests cloud-specific vulnerabilities
Scopes:
- storage: S3/GCS/Azure bucket misconfigurations, exposed consoles
- takeover: Subdomain takeover, FQDN dot bypass, dangling DNS
- k8s-cicd: Kubernetes exposure, Firebase rules, CI/CD escape, cloud metadata
Cryptography (no scopes)¶
/test-crypto — Tests SSL/TLS and cryptographic failures
Coverage: TLS version analysis, cipher suite weaknesses, certificate validation
Deserialization (no scopes)¶
/test-deser — Tests insecure deserialization across languages
Coverage: Java, PHP, .NET, Python, Ruby gadget chain exploitation
Advanced Techniques (4 scopes: hpp-crlf, bypass, mfa, host-method)¶
/test-advanced — Tests advanced/subtle vulnerabilities
Scopes:
- hpp-crlf: HTTP parameter pollution, CRLF injection, param truncation
- bypass: Open redirect, type juggling, Unicode bypass, parser differences
- mfa: MFA bypass (8+ techniques), downgrade attacks, TOTP recovery
- host-method: HTTP/2 desync, second-order attacks, Host header injection, method override
Supply Chain (no scopes)¶
/test-supply-chain — Tests dependency vulnerabilities
Coverage: Dependency confusion, SRI bypass, Docker layer secrets, manifest injection
Exceptions & Debug (no scopes)¶
/test-exceptions — Tests error handling and debug mode
Coverage: Stack trace leakage, debug mode detection, sensitive data in errors
LLM Attacks (no scopes, requires --llm flag)¶
/test-llm — Tests prompt injection and MCP attacks
Coverage: Direct/indirect prompt injection, jailbreaking, model confusion
Mobile Security (no scopes, requires --mobile flag)¶
/test-mobile — Tests Android and iOS applications
Coverage: APK/IPA analysis, API security, data storage, authentication
Wave Coordinator: 12-Wave Schedule¶
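The 31 sub-agents (the 17 skills expanded by scope) are dispatched in sequential waves, with up to 3 agents running in parallel per wave. Waves 0-11 form the 12-wave core schedule; Wave 12 is conditional and runs only for the flag-gated skills: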
Wave 0: Critical Path (High-Impact Vulns)¶
┌─────────────────────────────────────┐
│ Wave 0: Critical Injection & Access │
├─────────────────────────────────────┤
│ [Agent-1] test-injection:sqli       │
│ [Agent-2] test-injection:xss        │
│ [Agent-3] test-access:idor          │
└─────────────────────────────────────┘
Wave 1: Critical Continued¶
Wave 2: Authentication¶
Wave 3-11: Remaining Skills (20 agents)¶
Each wave runs up to 3 scoped agents in parallel, covering APIs, business logic, infrastructure, cloud, advanced techniques, cryptography, and the remaining skills.
Wave 12: Flag-Gated (Conditional)¶
Full 12-wave schedule: See documentation at /architecture/wave-coordinator.md
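The dispatch model described above (skills expanded by scope, then run a few at a time) can be sketched in Python. This is a minimal illustration, not the coordinator's actual code: `expand_agents`, `plan_waves`, `run_waves`, and the `run_agent` callback are hypothetical names, and the skill ordering in the demo is an assumption apart from Wave 0.

```python
from concurrent.futures import ThreadPoolExecutor

WAVE_SIZE = 3  # maximum number of agents running in parallel per wave


def expand_agents(skills):
    """Turn {skill: [scopes]} into agent IDs such as 'test-injection:sqli'."""
    agents = []
    for skill, scopes in skills.items():
        if scopes:
            agents.extend(f"{skill}:{scope}" for scope in scopes)
        else:
            agents.append(skill)  # a skill without scopes runs as a single agent
    return agents


def plan_waves(agents, wave_size=WAVE_SIZE):
    """Split the ordered agent list into consecutive waves."""
    return [agents[i:i + wave_size] for i in range(0, len(agents), wave_size)]


def run_waves(waves, run_agent):
    """Run waves sequentially; agents inside one wave run in parallel."""
    results = {}
    for number, wave in enumerate(waves):
        with ThreadPoolExecutor(max_workers=len(wave)) as pool:
            for agent, outcome in zip(wave, pool.map(run_agent, wave)):
                results[agent] = {"wave": number, "outcome": outcome}
    return results


if __name__ == "__main__":
    # Illustrative subset only; the real ordering lives in wave-coordinator.md.
    skills = {
        "test-injection": ["sqli", "xss"],
        "test-access": ["idor"],
        "test-auth": ["jwt", "oauth", "session"],
        "test-crypto": [],
    }
    for number, wave in enumerate(plan_waves(expand_agents(skills))):
        print(f"Wave {number}: {wave}")
```

Because each wave is a plain list, a health check or an operator decision can be slotted in between iterations of the loop in `run_waves` without touching the parallel part.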
Per-Endpoint Mandatory Checks¶
EVERY endpoint tested must pass through 8 mandatory checks:
| Check # | Test | Rationale |
|---|---|---|
| 1 | No auth → expect 401/403 | Detects missing authentication enforcement |
| 2 | Wrong role (lowest privilege) | Tests role-based access control |
| 3 | Neighbor ID (±1 from valid ID) | Detects IDOR |
| 4 | Extra fields in body (role_id, is_admin) | Tests mass assignment, privilege escalation |
| 5 | Negative numeric values | Tests financial logic, input validation |
| 6 | Hidden flags (?debug=1, ?admin=1) | Detects development/debug bypasses |
| 7 | Content-Type: application/xml on JSON endpoints | XXE on unexpected endpoints |
| 8 | Content-Type: multipart/form-data | Validation bypass through content-type mismatch |
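One way to picture the eight checks is as eight variants of a single baseline request. The sketch below only builds the variants as dictionaries and sends nothing. All function names are hypothetical, and it assumes a Bearer token scheme, a dict-shaped JSON body, canonically cased header names, and an ID in the last path segment.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_query(url, extra):
    """Append query parameters to a URL, keeping any that already exist."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True) + list(extra.items())
    return urlunsplit(parts._replace(query=urlencode(query)))


def neighbor_url(url, delta=1):
    """Shift the last numeric path segment by delta (the neighbor-ID check)."""
    parts = urlsplit(url)
    path = re.sub(r"/(\d+)(?=/?$)", lambda m: f"/{int(m.group(1)) + delta}", parts.path)
    return urlunsplit(parts._replace(path=path))


def negate_numbers(body):
    """Replace each top-level numeric value with a negative one (0 becomes -1)."""
    return {
        key: (-abs(value) or -1)
        if isinstance(value, (int, float)) and not isinstance(value, bool)
        else value
        for key, value in body.items()
    }


def build_mandatory_checks(method, url, headers, body, low_priv_token):
    """Return the 8 request variants as plain dicts; nothing is sent."""
    auth_free = {k: v for k, v in headers.items()
                 if k.lower() not in ("authorization", "cookie")}
    variants = [
        ("no-auth", {"headers": auth_free}),
        ("wrong-role", {"headers": {**headers, "Authorization": f"Bearer {low_priv_token}"}}),
        ("neighbor-id", {"url": neighbor_url(url)}),
        ("extra-fields", {"body": {**body, "role_id": 1, "is_admin": True}}),
        ("negative-numbers", {"body": negate_numbers(body)}),
        ("hidden-flags", {"url": add_query(url, {"debug": "1", "admin": "1"})}),
        ("xml-content-type", {"headers": {**headers, "Content-Type": "application/xml"}}),
        ("multipart-content-type", {"headers": {**headers, "Content-Type": "multipart/form-data"}}),
    ]
    base = {"method": method, "url": url, "headers": dict(headers), "body": dict(body)}
    return [{"check": number, "name": name, **base, **changes}
            for number, (name, changes) in enumerate(variants, start=1)]
```

Calling `neighbor_url(url, delta=-1)` gives the other neighbor. A real agent would also re-serialize the body for checks 7 and 8 rather than only switching the header.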
Token-Driven Parameter Selection¶
During Phase 0, context.json was populated with detected technologies. Phase 4 agents use this to:
- Select relevant payloads (e.g., skip .NET-specific bypasses if app is Node.js)
- Adjust injection syntax (e.g., SLEEP() for MySQL vs pg_sleep() for PostgreSQL)
- Choose authentication mechanisms (JWT if detected, session if cookie-based)
Example: context.json detected Spring Boot + PostgreSQL → injection agents use PostgreSQL payloads, skip MSSQL/MySQL variants.
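The selection step could look roughly like the following. The `technologies` key and the payload catalogue are assumptions made for illustration; the actual context.json schema is not shown in this document.

```python
# Hypothetical catalogue of time-based SQLi probes, keyed by database engine.
TIME_BASED_SQLI = {
    "mysql": "' AND SLEEP(5)-- -",
    "mariadb": "' AND SLEEP(5)-- -",
    "postgresql": "'; SELECT pg_sleep(5)--",
    "mssql": "'; WAITFOR DELAY '0:0:5'--",
}


def select_sqli_payloads(context):
    """Keep only the payloads that match a detected database engine.

    `context` is the parsed context.json; the "technologies" key is assumed.
    If no known engine was detected, fall back to every variant.
    """
    detected = {tech.lower() for tech in context.get("technologies", [])}
    selected = {db: payload for db, payload in TIME_BASED_SQLI.items() if db in detected}
    return selected or dict(TIME_BASED_SQLI)


if __name__ == "__main__":
    context = {"technologies": ["Spring Boot", "PostgreSQL"]}
    print(select_sqli_payloads(context))
```

Falling back to the full catalogue when nothing matches keeps the selection from silently turning into a skipped test.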
Wave Health Checks¶
Between each wave, the coordinator performs health checks:
# After Wave 0 completes
# Default to 0 so the comparison does not fail when FINDINGS_COUNT is unset
if [ "${FINDINGS_COUNT:-0}" -eq 0 ]; then
    echo "[WARNING] No findings in Wave 0 (critical injection)"
    echo "  Possible causes:"
    echo "  - WAF is blocking all payloads"
    echo "  - Target is not vulnerable (rare)"
    echo "  - Authentication token expired"
fi
Warnings are logged, but testing continues (no auto-abort).
Anti-First-Positive Rule¶
Finding a vulnerability in one endpoint doesn't complete testing for that vulnerability type:
❌ WRONG: Found SQLi on /api/v1/users?sort= → Stop testing SQLi
✅ RIGHT: Found SQLi on /api/v1/users?sort= → Test ALL endpoints with sort/filter/order parameters
Each confirmed vulnerability becomes a pattern to test across similar endpoints.
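As a rough sketch of how a confirmed finding turns into a retest list: the grouping in `RELATED_PARAMS` and the shape of the `finding` dict are hypothetical, chosen to mirror the sort/filter/order example above.

```python
from urllib.parse import parse_qsl, urlsplit

# Hypothetical grouping of parameter names that tend to reach similar code paths.
RELATED_PARAMS = [{"sort", "filter", "order"}]


def related_names(param):
    """Return the family of parameter names that should be tested together."""
    for family in RELATED_PARAMS:
        if param in family:
            return family
    return {param}


def expand_finding(finding, endpoints):
    """Return (url, param) pairs to retest for the same vulnerability type."""
    wanted = related_names(finding["param"])
    confirmed = (finding["url"], finding["param"])
    targets = []
    for url in endpoints:
        for name, _value in parse_qsl(urlsplit(url).query, keep_blank_values=True):
            if name in wanted and (url, name) != confirmed:
                targets.append((url, name))
    return targets


if __name__ == "__main__":
    finding = {"type": "sqli", "url": "/api/v1/users?sort=name", "param": "sort"}
    endpoints = [
        "/api/v1/users?sort=name",
        "/api/v1/orders?order=asc&page=2",
        "/api/v1/items?q=shoes",
    ]
    print(expand_finding(finding, endpoints))
```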
Dependency: Source Code Extraction¶
If Phase 3.5c confirmed LFI/RCE, the source code has already been extracted by this point. Phase 4 agents read the source to obtain:
- Exact field names (avoids guessing user_id vs userId vs uid)
- Validation logic (understands input constraints)
- Hidden endpoints (endpoints not in test-plan.json)
Flag in context.json: "source_code_available": true
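A small sketch of how an agent might branch on this flag when choosing field names. The function name and the `extracted_fields` argument are hypothetical; only the `source_code_available` key comes from this document.

```python
# Guesses used only when no source code is available (names from the list above).
FALLBACK_FIELD_GUESSES = ["user_id", "userId", "uid"]


def candidate_field_names(context, extracted_fields):
    """Prefer exact field names from extracted source; otherwise fall back to guesses."""
    if context.get("source_code_available") and extracted_fields:
        return list(extracted_fields)
    return list(FALLBACK_FIELD_GUESSES)
```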
Output During Phase 4¶
As each wave completes:
- findings/FINDING-NNN.md files are created for confirmed vulnerabilities
- logs/agent-wave-N.log records test execution for each agent
- logs/pentest-timeline.jsonl records the timestamp of each finding
- waves/agent-N.json stores agent state and results
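For the timeline file, an append-only JSON Lines writer is enough. The sketch below assumes one JSON object per finding; the field names are illustrative, since the actual schema is not given here.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def record_finding(timeline_path, finding_id, wave, agent, title):
    """Append one finding to the timeline as a single JSON line and return it."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "finding": finding_id,   # e.g. "FINDING-001"
        "wave": wave,
        "agent": agent,          # e.g. "test-injection:sqli"
        "title": title,
    }
    path = Path(timeline_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(entry) + "\n")
    return entry
```

Appending rather than rewriting the file means earlier entries are never rewritten when a later wave adds findings.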
Decision Point: Abort vs Continue¶
After each wave, decide:
1. Continue normally: Testing proceeding as expected
2. Skip remaining waves: WAF completely blocking all tests, or user requests stop
3. Investigate issue: Tool failure, authentication failure (re-authenticate and resume)
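The three outcomes can be written down as a small decision helper. This is only a sketch of the logic as described: the `wave_status` keys are hypothetical, and the choice remains an operator decision and not an automatic abort.

```python
from enum import Enum


class Decision(Enum):
    CONTINUE = "continue normally"
    SKIP_REMAINING = "skip remaining waves"
    INVESTIGATE = "investigate issue"


def decide_next_step(wave_status):
    """Map observed conditions after a wave to one of the three outcomes."""
    if wave_status.get("user_requested_stop") or wave_status.get("waf_blocking_all"):
        return Decision.SKIP_REMAINING
    if wave_status.get("tool_failure") or wave_status.get("auth_failure"):
        # Re-authenticate or repair the tool, then resume from the same wave.
        return Decision.INVESTIGATE
    return Decision.CONTINUE
```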
Next Phase¶
After all 12 waves complete, proceed to Phase 5: Verification to confirm every finding with a working exploit.