Completeness Gates
Quality gates ensure no phase is skipped and no critical check is missed. Every gate must pass before proceeding to the next phase. The /pentest skill checks these gates automatically between waves.
Iron Rule
ALL phases MUST be executed. NEVER skip any phase. Earlier phases discover assets that later phases depend on -- skipping them guarantees missed vulnerabilities.
Gate Reference
| Gate | Check | Why It Matters |
|---|---|---|
| Intake | brief.json exists with _version: "1.0" | Provides business context, tech stack hints, compliance requirements, and rules of engagement. Improves test targeting and ensures testing aligns with client expectations. Optional but recommended. |
| Walkthrough | app-map.json exists, 0 login failures, crawled-urls.txt populated, ALL users in credentials.json crawled (verify roles_tested count matches user count) | Catches SPA routes, dynamic menus, JS-rendered content, and API calls that static discovery misses. Incomplete crawling means incomplete attack surface. |
| Discovery | Authenticated crawl done (all roles), jsluice+LinkFinder+SecretFinder ran, source maps downloaded, arjun on all endpoints, ffuf content discovery done, inline JS analyzed | Ensures the full endpoint inventory is built before testing begins. Missing endpoints means missing vulnerabilities. |
| API Auth | $EDIR/discovery/api-tokens.json exists with validated Bearer tokens for ALL roles. Tokens MUST be tested on a real endpoint (e.g., /api/v1/auth/me) -- not just obtained. If JWT and Sanctum endpoints coexist, use whichever token type the API actually accepts (test both, keep the one that returns 200). | Without validated tokens, all authenticated API testing produces false negatives. Token validity must be proven, not assumed. |
| Auth Endpoint Discovery | ALL auth endpoints probed (/api/v1/auth/login, /api/v1/auth/token, /api/auth/login, /oauth/token, etc.). App may have multiple auth systems (JWT + Sanctum + session) -- discover ALL of them. Admin email may differ from config (enumerate, don't assume). | Applications often expose multiple auth mechanisms. Missing one means missing an entire attack surface for auth bypass, token confusion, and privilege escalation. |
| Resource Map | $EDIR/discovery/resource-map.json exists (all resource types for IDOR) | IDOR testing requires a complete inventory of resource types, their ID patterns, and access control expectations. Without this, IDOR coverage is incomplete. |
| Parameter Inventory | $EDIR/discovery/injectable-params.json exists (including sort/order/filename/debug flags) | Every injectable parameter is a potential entry point. Missing parameters like sort, order, filename, template, config, amount, role, accountId means missing injection vectors. |
| Dual Auth | If app has both Bearer + session-cookie auth: obtain BOTH for all roles, test independently. Test ALL discovered endpoints with session cookies (not just hardcoded paths). | Access control may differ between auth mechanisms. An endpoint secure with Bearer tokens may be vulnerable with session cookies, or vice versa. |
| Content Discovery | $EDIR/discovery/sensitive-files.txt exists (ffuf ran). Tech-specific SecLists wordlists ran based on fingerprinted stack. | Hidden files (backups, configs, admin panels, debug endpoints) are among the highest-value targets. Generic wordlists alone miss technology-specific paths. |
| Inline JS | $EDIR/discovery/inline-js-analysis.json exists (innerHTML, postMessage, deepMerge checked) | Inline JavaScript contains DOM XSS sinks, hardcoded secrets, and client-side logic that external JS analysis tools miss. |
| DOM XSS Sinks | $EDIR/discovery/js-dom-xss-sinks.txt exists -- grep-based scan for dangerouslySetInnerHTML, innerHTML, .html(, v-html, document.write on ALL JS files. $EDIR/logs/dom-xss-hash.txt exists -- source-sink analysis on ALL crawled pages (not just main URL). Every page with BOTH sources AND sinks must be verified via Playwright MCP or browser. | DOM XSS is invisible to server-side scanners. Source-sink analysis on every page is required to find exploitable DOM XSS chains. |
| Route | $EDIR/discovery/test-plan.json exists (generated by /route after Phase 3, before wave coordinator) | The test plan maps endpoints to specific test scopes, enabling targeted per-endpoint testing instead of blanket testing. Without it, testing is unfocused and inefficient. |
| Unauth Sweep | ALL discovered endpoints tested without auth (A0 in test-access). NOT just hardcoded paths -- sweep crawled-urls.txt + api-endpoints.txt + resource-map.json. POST/PUT/DELETE methods also tested without auth. | Broken Access Control is the top-ranked category (A01) in the OWASP Top 10 (2021). Every endpoint must be verified to require authentication. State-changing methods (POST/PUT/DELETE) are especially critical. |
| Anti-First-Positive | Every confirmed-vulnerable field tested for ALL injection classes (SQLi+XSS+CRLF). Every confirmed vuln class tested on ALL endpoints of same family. JWT: exp check runs even after alg:none. | Finding one vulnerability type on one endpoint does not mean other types are absent. A field vulnerable to SQLi may also be vulnerable to XSS. Stop-on-first-positive leaves findings on the table. |
| Endpoint Coverage | After each test skill, count endpoints tested vs endpoints discovered. If coverage < 80%, go back and test remaining. Every sort/order/filter param tested. Every API endpoint tested for IDOR/unauth. Log [COVERAGE] X/Y endpoints per test category. | Ensures systematic coverage rather than cherry-picking easy targets. The 80% threshold catches skills that test a handful of endpoints and move on. |
| Endpoint Variants | Do NOT assume an endpoint is inaccessible based on one variant failing. Many apps expose BOTH web form (/admin/export/*) AND REST API (/api/v1/*/export) variants of the same operation. Test BOTH. Parameter location may differ (POST body vs query string). Access control may differ (302/405 vs 200 OK). | A web form returning 405 does not mean the underlying functionality is unreachable. The REST API variant may accept the same operation with different (weaker) access controls. |
| No Early Stop | Finding one vuln in a category does NOT complete that category. If SQLi on /employees?sort=, MUST also test /leaves?sort=, /tickets?sort=, etc. Each instance is a separate finding. | Each vulnerable endpoint is a separate risk with potentially different impact. Reporting one SQLi when five exist underrepresents the actual risk posture. |
| Auth SQLi+XSS+CRLF | Login/register/password-reset fields tested for SQLi, XSS, AND CRLF (STEP 0a in test-injection). | Auth forms are highest-value targets -- they process user input before authentication, often with different validation than authenticated endpoints. |
| Blind CMDi | When CMDi output payloads return errors, ALL 8 timing variants tested (;sleep, \|sleep, $(sleep), `sleep`, %0asleep, &&sleep, &timeout, \|ping) before concluding "not vulnerable". | Command injection detection requires exhaustive delimiter testing. Many applications filter some delimiters but not others. Concluding "not vulnerable" after testing 2-3 variants produces false negatives. |
| Source Code | If LFI/RCE confirmed: source code extracted (Phase 3.5c), source_code_available=true in context.json, field names from source used in mass assignment and injection. | Extracted source code reveals exact field names, validation logic, hidden endpoints, and SQL query structure. This transforms blind testing into informed testing. |
| Form Fields | HTML forms read (grep name=) before ANY test payload -- never assume parameter names. | Assumed parameter names lead to testing nonexistent fields while missing real ones. Always read the actual HTML to get correct field names. |
| SSRF Ports | Internal ports (8080, 8443, 3000, 5000, 9090) scanned via SSRF when exposed port fails (B2b in test-ssrf). | SSRF to internal services often targets non-standard ports. If the default port is closed, other common service ports may still be accessible internally. |
| CSRF SameSite | SameSite cookie attribute analyzed BEFORE CSRF testing. None = full CSRF, Lax = GET-only CSRF + Content-Type downgrade, Strict = CSRF via subdomain XSS only. Finding severity adjusted by SameSite value. | SameSite policy fundamentally changes CSRF exploitability. When SameSite=Strict leaves a CSRF unexploitable without a subdomain XSS chain, reporting it as High overestimates severity. |
| Testing | JWT hashcat+aud, CSRF Content-Type downgrade, clickjacking PoC, Host header on pwd reset, HTTP method override, Content-Type switching on all POST/PUT, hidden flags on all GET, frontend CVE check, dep manifest analysis if exposed, conditional: OAuth PKCE+state null byte, Firebase rules, Salesforce Aura, Flask signing, Mermaid injection, NoSQL $regex, DNS rebinding on SSRF blocks. | This is the comprehensive checklist of specific test cases that must not be skipped. Each represents a known attack vector that automated scanners frequently miss. |
| Stored XSS | Verify via curl (POST payload, then GET, then check HTML). Only DOM XSS needs browser. | Stored XSS verification must confirm the payload persists and renders in the response. Curl-based verification is faster and more reliable than browser-based for reflected/stored variants. |
| Verification | Every finding has a working exploit. | Unverified findings risk being false positives. Every finding in the final report must have a reproducible proof-of-concept. |
| Payload Cleanup | All stored/persistent payloads cleaned after evidence collection (cleanup_status in finding = cleaned or not_applicable). No alert() popups left in production. | Leaving test payloads in a production application is unprofessional and potentially harmful. XSS payloads left in place could be triggered by real users. |
| Chain | /chain-findings ran after verification. Includes new patterns: CMDi+LFI, LFI+SSRF, XXE+JAR, JWT+BAC, UserEnum+NoRateLimit, XSS+CORS. | Individual Medium-severity findings may combine into Critical attack chains. Chain detection reveals the true risk that isolated findings underrepresent. |
| Financial | Negative amounts, overdrafts, suspended account API bypass, race conditions. | Financial applications require specific tests for monetary logic flaws. These are high-impact business logic vulnerabilities that generic scanners never catch. |
| Tech Wordlists | After fingerprinting, SecLists tech-specific wordlists ran (Tomcat, Spring, PHP, Django, Rails, Node, CMS, etc.) in addition to generic raft wordlists. | Technology-specific wordlists discover paths that generic lists miss (e.g., /actuator/env for Spring, /wp-admin/ for WordPress, /server-status for Apache). |
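
The Endpoint Coverage gate in the table above reduces to a simple ratio check. The following is a minimal sketch, assuming discovered and tested endpoints are available as plain lists of strings. The function name `coverage_gate`, its return shape, and the category suffix on the log line are hypothetical and not taken from the /pentest skill itself.

```python
def coverage_gate(category, discovered, tested, threshold=0.80):
    """Return (passed, log_line, remaining) for one test category."""
    discovered = set(discovered)
    # Count only tested endpoints that were actually discovered.
    covered = discovered & set(tested)
    total = len(discovered)
    # An empty inventory is treated as fully covered (assumption).
    ratio = len(covered) / total if total else 1.0
    log_line = f"[COVERAGE] {len(covered)}/{total} endpoints ({category})"
    remaining = sorted(discovered - covered)
    return ratio >= threshold, log_line, remaining


discovered = ["/employees", "/leaves", "/tickets", "/reports", "/users"]
tested = ["/employees", "/leaves", "/tickets"]
passed, line, remaining = coverage_gate("idor", discovered, tested)
print(line)               # [COVERAGE] 3/5 endpoints (idor)
print(passed, remaining)  # False ['/reports', '/users']
```

Returning the untested endpoints alongside the pass/fail result makes the "go back and test remaining" step actionable rather than just reporting a percentage.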
Gate Enforcement
Gates are checked at phase transitions:
- After Phase 0.5 (Walkthrough): Walkthrough gate
- After Phase 2 (Discovery): Discovery, API Auth, Auth Endpoint Discovery, Resource Map, Parameter Inventory, Dual Auth, Content Discovery, Inline JS, DOM XSS Sinks
- After Phase 3.5 (Route): Route gate
- During Phase 4 (Testing): Per-skill gates (Endpoint Coverage, Anti-First-Positive, No Early Stop, etc.)
- After Phase 5 (Verification): Verification, Payload Cleanup, Chain gates
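
The file-based part of these transition checks can be sketched as follows. This is an illustration only: it covers just the artifacts whose paths under $EDIR are given in the gate table, the transition keys (`"phase-2"`, `"phase-3.5"`) and the name `missing_artifacts` are hypothetical, and a file existing does not by itself satisfy a gate (for example, API tokens must still be validated against a real endpoint).

```python
import tempfile
from pathlib import Path

# Artifacts named in the gate table, grouped by the transition that checks them.
# The grouping keys are illustrative, not identifiers used by the /pentest skill.
REQUIRED_ARTIFACTS = {
    "phase-2": [
        "discovery/api-tokens.json",
        "discovery/resource-map.json",
        "discovery/injectable-params.json",
        "discovery/sensitive-files.txt",
        "discovery/inline-js-analysis.json",
        "discovery/js-dom-xss-sinks.txt",
    ],
    "phase-3.5": ["discovery/test-plan.json"],
}


def missing_artifacts(edir, transition):
    """List required files that are absent or empty under the engagement directory."""
    root = Path(edir)
    missing = []
    for rel in REQUIRED_ARTIFACTS[transition]:
        path = root / rel
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(rel)
    return missing


# Demo against a throwaway directory holding only a placeholder test plan.
with tempfile.TemporaryDirectory() as edir:
    discovery = Path(edir) / "discovery"
    discovery.mkdir()
    (discovery / "test-plan.json").write_text("{}")
    route_missing = missing_artifacts(edir, "phase-3.5")
    phase2_missing = missing_artifacts(edir, "phase-2")

print(route_missing)        # []
print(len(phase2_missing))  # 6
```

Treating an empty file as missing avoids passing a gate on an artifact that a tool created but never populated.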
Pipeline Overlap
Some gates apply to overlapping tiers: Tier 1 tests (crypto, supply-chain, exceptions) can start after Phase 0, and Tier 2 (cloud) after Phase 1. Gates for these tiers are checked independently.
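
The tier start conditions can be expressed as a small lookup. This sketch assumes phases are identified by their numbers (0, 0.5, 1, 2, ...) and complete in numeric order. The tier labels and the function `startable_tiers` are hypothetical names used for illustration.

```python
# Phase that must be complete before each overlapping tier may start.
TIER_PREREQUISITE = {
    "tier-1": 0,  # crypto, supply-chain, exceptions
    "tier-2": 1,  # cloud
}


def startable_tiers(last_completed_phase):
    """Return the tiers whose prerequisite phase has already completed."""
    return sorted(
        tier
        for tier, phase in TIER_PREREQUISITE.items()
        if last_completed_phase >= phase
    )


print(startable_tiers(0.5))  # ['tier-1']
print(startable_tiers(1))    # ['tier-1', 'tier-2']
```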