AI-Augmented Penetration Testing

When you run a pentest with BeDefended, the automated engine and your manual proxy testing work in parallel and feed each other. This page explains the practical workflow — what you actually do, step by step, during an engagement.

The Problem

Traditional pentesting has two disconnected tracks:

Automated scanning discovers endpoints, parameters, and known vulns — but misses business logic, complex auth flows, and context-dependent bugs
Manual proxy testing (Burp/Caido) catches what scanners can't — but the tester doesn't know what the scanner already covered, wastes time retesting, and has no AI helping spot patterns in traffic

AI-Augmented PT connects them. The automated scan tells you what to focus on. Your proxy traffic feeds the AI for second-opinion analysis. Coverage gaps are visible in real time.

How It Works

                    ┌─────────────────────────────────┐
                    │     BeDefended Dashboard        │
                    │                                 │
                    │  Scan Intel ← automated scan    │
                    │  (endpoints, params, coverage)  │
                    │         ↕                       │
                    │  Proxy History ← your traffic   │
                    │  (requests, AI analysis)        │
                    │         ↕                       │
                    │  Correlation Engine             │
                    │  (overlap, gaps, next steps)    │
                    └──────────┬──────────────────────┘
                               │
              ┌────────────────┼────────────────┐
              │                │                │
        Burp Extension   Caido Plugin    Automated Engine
        (sends copies)   (sends copies)  (/pentest)
              │                │                │
              └────────────────┼────────────────┘
                               │
                           Target App

The dashboard is a passive observer — your proxy flow is untouched. The extensions just POST copies of request/response pairs to the dashboard API.
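As a rough illustration of what "POST a copy" could look like, the sketch below builds such a request without sending it. The `/api/proxy/ingest` path, the `X-API-Key` header, and the payload field names are assumptions made for this example, not the documented dashboard API.

```python
import json
import urllib.request


def build_copy_request(dashboard_url, api_key, session_id, request, response):
    """Build (but do not send) a POST carrying one request/response pair.

    The path, header name, and JSON field names are hypothetical.
    """
    payload = {
        "session_id": session_id,
        "request": request,
        "response": response,
    }
    return urllib.request.Request(
        url=dashboard_url.rstrip("/") + "/api/proxy/ingest",  # hypothetical path
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )


req = build_copy_request(
    "https://dashboard.example.com/",
    api_key="demo-key",
    session_id="demo-session",
    request={"method": "GET", "url": "https://app.target.com/api/v2/orders/142"},
    response={"status": 200, "body": "{}"},
)
print(req.get_method(), req.full_url)
```

Because the copy is sent out-of-band, nothing in this sketch sits between the proxy and the target: a failed POST would lose a dashboard entry but would not affect the proxied request.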

Practical Workflow

Step 1: Connect your proxy extension

Install the extension once; after that, setup for each engagement is a single copy-paste.

First-time install (once only):

Burp Suite: Extensions → Add → bd-proxy-bridge-burp-v1.0.0.jar
Caido: Plugins → Install from folder → select extensions/caido-proxy-bridge/

Per-engagement setup (zero-config):

In the dashboard, go to Proxy History and click New Session.
A modal appears with three setup options:
    Config URL (recommended): Copy the URL, paste it into the extension's "Fetch Config" field, and click Fetch. The API key, session ID, and dashboard URL are filled in automatically.
    Config JSON: Copy the JSON to the clipboard, then click "Paste Config from Clipboard" in the extension. The result is the same.
    Manual: Copy the API key and session ID individually (legacy method).
The extension applies the config automatically, and traffic starts flowing immediately.

No need to manually type URLs, API keys, or session IDs. One copy-paste per engagement.
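To make the Config JSON option concrete, here is a minimal sketch of how a pasted config could be validated before use. The three values come from the setup description above, but the key names (`dashboard_url`, `api_key`, `session_id`) and the `BridgeConfig` type are assumptions; the real config format may differ.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class BridgeConfig:
    dashboard_url: str
    api_key: str
    session_id: str


def parse_config(raw: str) -> BridgeConfig:
    """Parse a pasted config blob and fail loudly if a field is missing.

    Key names are hypothetical; adjust them to the actual config format.
    """
    data = json.loads(raw)
    required = ("dashboard_url", "api_key", "session_id")
    missing = [key for key in required if not data.get(key)]
    if missing:
        raise ValueError(f"config is missing: {', '.join(missing)}")
    return BridgeConfig(
        dashboard_url=data["dashboard_url"].rstrip("/"),
        api_key=data["api_key"],
        session_id=data["session_id"],
    )
```

Rejecting an incomplete config up front is a design choice for this sketch: it surfaces a bad paste immediately instead of letting traffic silently go nowhere.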

Step 2: Launch the automated scan

With the proxy bridge already connected, launch the scan:

/pentest https://app.target.com

Open the dashboard and go to Scan Intel for your engagement. You'll see the scanner discovering the app in real time:

Endpoints appearing as the crawler finds them
Parameters extracted from forms, query strings, JSON bodies
Tech stack fingerprint updating (Spring Boot, PostgreSQL, nginx, Cloudflare...)
"Interesting Finds" cards popping up — source maps, debug endpoints, .git exposed, etc.

Don't wait for it to finish. Start manual testing as soon as you see the initial endpoints — your proxy traffic is already being captured and sent to the dashboard.

Step 3: Manual testing with live intelligence

Open Scan Intel on a second monitor (or in a split screen). As you test manually, use it as your map:

What to look at:

Scan Intel Section	What it tells you	What to do
Discovered Endpoints	All paths the scanner found	Focus on endpoints marked "untested"
Parameters	Params per endpoint, with "injectable" flags	Test the injectable ones the scanner flagged but couldn't confirm
Coverage Map	Percentage of endpoints and parameters tested	Aim for the gaps; the scanner might have missed auth-walled areas
Interesting Finds	Source maps, debug endpoints, .git	Investigate immediately — these are often quick wins
Tech Stack	Server, framework, DB, WAF	Adjust your payloads accordingly

Example: The scanner found /api/v2/orders/{id} with the id parameter flagged as injectable, but it couldn't confirm SQLi because the request returned 403. You're authenticated as a normal user, so test that endpoint for IDOR and SQLi manually.
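The triage described in the table can be sketched as a small sort over scan results. The record shape used here (`path`, `tested`, `params` with `injectable` and `confirmed` flags) is a hypothetical stand-in for whatever Scan Intel actually exposes.

```python
def manual_targets(endpoints):
    """Return (path, param) pairs worth testing by hand, most promising first.

    The endpoint/parameter dictionaries are a hypothetical data shape.
    """
    targets = []
    for ep in endpoints:
        for param in ep["params"]:
            flagged = param["injectable"] and not param["confirmed"]
            if flagged or not ep["tested"]:
                # Flagged-but-unconfirmed parameters sort ahead of merely untested ones.
                targets.append((0 if flagged else 1, ep["path"], param["name"]))
    return [(path, name) for _, path, name in sorted(targets)]


scan_intel = [
    {"path": "/about", "tested": True, "params": []},
    {
        "path": "/api/v2/orders/{id}",
        "tested": True,
        "params": [{"name": "id", "injectable": True, "confirmed": False}],
    },
    {
        "path": "/api/v2/profile",
        "tested": False,
        "params": [{"name": "email", "injectable": False, "confirmed": False}],
    },
]
for path, name in manual_targets(scan_intel):
    print(path, name)
```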

Step 4: Let the AI analyze your traffic

After you've browsed a meaningful chunk of the app (50+ requests), go to Proxy History and click Analyze.

Choose the analysis type:

Type	When to use	What it does
Vulnerability Suggestions	After exploring a new area	Scans your traffic for injection points, missing headers, error leaks, auth issues
Pattern Detection	After testing auth/session flows	Finds session management anomalies, rate limiting gaps, authz inconsistencies
Next Steps	When you're not sure what to test next	Based on tested endpoints, suggests untested attack vectors
Correlation	After automated scan completes	Compares your manual findings with automated scan results — shows overlap and gaps
Batch Summary	At end of testing session	Overview of all traffic patterns and anomalies

What happens:

The AI (Claude or Codex) receives your last 100 requests plus the engagement context (tech stack, existing findings)
It analyzes patterns across requests — not just individual ones
Flagged requests appear highlighted in the Proxy History table with severity + reason
Suggestions appear in the analysis panel at the bottom
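The first step above, selecting the most recent requests and bundling them with engagement context, could be sketched as follows. The function name and the dictionary keys are illustrative assumptions; only the 100-request window and the two context items come from this page.

```python
MAX_REQUESTS = 100  # this page states the AI receives the last 100 requests


def build_analysis_input(history, tech_stack, findings, analysis_type):
    """Bundle the most recent requests with engagement context.

    The key names in the returned dictionary are hypothetical.
    """
    return {
        "analysis_type": analysis_type,
        # Slicing from the end keeps the newest entries, in original order.
        "requests": history[-MAX_REQUESTS:],
        "context": {"tech_stack": tech_stack, "existing_findings": findings},
    }


history = [{"id": i, "method": "GET", "path": f"/page/{i}"} for i in range(250)]
bundle = build_analysis_input(
    history,
    tech_stack=["Spring Boot", "PostgreSQL", "nginx"],
    findings=[],
    analysis_type="vulnerability_suggestions",
)
print(len(bundle["requests"]), bundle["requests"][0]["id"])
```

One practical consequence of a fixed window: in a long session, older requests fall outside the analysis, which is one reason to run Analyze periodically rather than once at the end.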

Example output:

Medium — Potential IDOR: Requests to /api/v2/orders/142 and /api/v2/orders/143 return different user data with the same session token. The id parameter appears to be a sequential integer without authorization check.

Suggestion: Test with IDs belonging to other users. Try /api/v2/orders/1 through /api/v2/orders/10.

Low — Missing Security Headers: 23 responses lack X-Content-Type-Options. All /api/ endpoints return Access-Control-Allow-Origin: *.

Step 5: The correlation loop

This step ties the two tracks together. After both the automated scan and your manual testing have run:

Trigger a Correlation analysis
The AI cross-references everything:

Category	Meaning	Action
Confirmed	Both automated and manual found it	High confidence — include in report
Auto-only	Scanner found it, you didn't encounter it	Verify manually; it could be a false positive
Manual-only	You found it, scanner missed it	New discovery — document PoC
Coverage gaps	Neither approach tested it	Prioritize for remaining testing time
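The four categories in the table map naturally onto set operations. The sketch below is a simplified model, not the correlation engine itself: it assumes findings can be keyed as `(endpoint, issue)` pairs and that each track reports which endpoints it touched.

```python
def correlate(all_endpoints, auto_findings, manual_findings, auto_tested, manual_tested):
    """Classify findings and endpoints into the four correlation categories.

    Findings are hypothetical (endpoint, issue) tuples; real matching between
    scanner and manual findings is likely fuzzier than exact equality.
    """
    auto, manual = set(auto_findings), set(manual_findings)
    touched = set(auto_tested) | set(manual_tested)
    return {
        "confirmed": sorted(auto & manual),
        "auto_only": sorted(auto - manual),
        "manual_only": sorted(manual - auto),
        "coverage_gaps": sorted(set(all_endpoints) - touched),
    }


report = correlate(
    all_endpoints=["/login", "/api/v2/orders/{id}", "/admin/export", "/checkout"],
    auto_findings=[("/login", "missing rate limit"), ("/checkout", "reflected XSS")],
    manual_findings=[("/login", "missing rate limit"), ("/api/v2/orders/{id}", "IDOR")],
    auto_tested=["/login", "/checkout"],
    manual_tested=["/login", "/api/v2/orders/{id}"],
)
for category, items in report.items():
    print(category, items)
```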

Step 6: Repeat per testing focus

Create separate proxy sessions for different areas:

Session: "Auth Flow — all roles"      → Login, register, password reset, 2FA
Session: "Payment Flow"               → Cart, checkout, Stripe callbacks
Session: "API — GraphQL"              → All GraphQL queries/mutations
Session: "Admin Panel"                → Admin-only functionality

Each session gets its own AI analysis, keeping results focused and manageable.

What the Extensions Actually Send

The extensions filter traffic before sending it, so the dashboard isn't flooded with noise.

Automatic filtering (Burp extension):

Scope filtering: Only in-scope requests (via Burp's target scope or custom regex)
Response dedup: SHA-256 hash of the response body. If you reload a page 10 times and the body is identical each time, only the first copy is sent
Endpoint dedup: Same method+path suppressed for 30 minutes (path normalized: /users/42 → /users/{id})
Binary skip: Images, fonts, wasm, audio, video are never sent
Size limit: Responses larger than 96 KB are skipped
Credential redaction: Authorization headers, cookies, Bearer tokens, and API keys in query params are replaced with [REDACTED]. Unlike the filters above, this step runs server-side, after the dashboard receives the request
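The client-side filters in this list can be sketched as a single decision function. This is an illustrative Python model, not the Burp extension's code: the order of the checks, the content-type list, and the `TrafficFilter` name are assumptions, while the 96 KB limit, the 30-minute window, and the `/users/42` to `/users/{id}` normalization come from the list above.

```python
import hashlib
import re
import time
from urllib.parse import urlsplit

SKIP_PREFIXES = ("image/", "font/", "audio/", "video/")  # assumed content types
SKIP_TYPES = {"application/wasm"}
MAX_BODY_BYTES = 96 * 1024
ENDPOINT_WINDOW_SECONDS = 30 * 60


def normalize_path(path):
    """Collapse numeric path segments: /users/42 -> /users/{id}."""
    return re.sub(r"/\d+(?=/|$)", "/{id}", path)


class TrafficFilter:
    def __init__(self, scope_pattern, clock=time.monotonic):
        self.scope = re.compile(scope_pattern)
        self.clock = clock
        self.seen_bodies = set()  # SHA-256 digests of bodies already sent
        self.last_sent = {}       # (method, normalized path) -> timestamp

    def should_send(self, method, url, content_type, body):
        """Return True if this request/response pair should be forwarded."""
        if not self.scope.search(url):
            return False
        ctype = content_type.split(";")[0].strip().lower()
        if ctype.startswith(SKIP_PREFIXES) or ctype in SKIP_TYPES:
            return False
        if len(body) > MAX_BODY_BYTES:
            return False
        digest = hashlib.sha256(body).hexdigest()
        if digest in self.seen_bodies:
            return False
        key = (method.upper(), normalize_path(urlsplit(url).path))
        now = self.clock()
        last = self.last_sent.get(key)
        if last is not None and now - last < ENDPOINT_WINDOW_SECONDS:
            return False
        # Record state only for traffic that is actually forwarded.
        self.seen_bodies.add(digest)
        self.last_sent[key] = now
        return True
```

The injectable `clock` parameter exists only so the 30-minute window can be exercised in a test without waiting.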

Extra metadata sent with each request:

VulnContext tags: isAuth, isPayment, isAdmin, isAPI — based on URL patterns. The AI uses these to prioritize findings (a missing CSRF on /checkout is more critical than on /about)
Injection points: URL params, body params, JSON keys, cookie names, path IDs — pre-extracted so the AI doesn't have to re-parse everything
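A minimal sketch of how this metadata could be derived is shown below. The four tag names come from this page; the URL patterns behind them and the output key names are guesses made for illustration, and the real extension may use different rules.

```python
import json
import re
from urllib.parse import parse_qsl, urlsplit

# Hypothetical URL patterns; the real extension's rules are not documented here.
TAG_PATTERNS = {
    "isAuth": re.compile(r"/(login|logout|register|password|2fa|oauth)", re.I),
    "isPayment": re.compile(r"/(checkout|cart|payment|billing)", re.I),
    "isAdmin": re.compile(r"/admin", re.I),
    "isAPI": re.compile(r"/(api|graphql)(/|$)", re.I),
}


def vuln_context(url):
    """Tag a URL by matching its path against each pattern."""
    path = urlsplit(url).path
    return {tag: bool(rx.search(path)) for tag, rx in TAG_PATTERNS.items()}


def injection_points(url, body="", cookie_header=""):
    """Pre-extract candidate injection points from one request."""
    parts = urlsplit(url)
    points = {
        "url_params": [k for k, _ in parse_qsl(parts.query, keep_blank_values=True)],
        "path_ids": re.findall(r"/(\d+)(?=/|$)", parts.path),
        "json_keys": [],
        "body_params": [],
        "cookie_names": [
            c.split("=", 1)[0].strip() for c in cookie_header.split(";") if c.strip()
        ],
    }
    if body:
        try:
            parsed = json.loads(body)
        except ValueError:
            # Not JSON: treat it as a form-encoded body.
            points["body_params"] = [k for k, _ in parse_qsl(body, keep_blank_values=True)]
        else:
            if isinstance(parsed, dict):
                points["json_keys"] = list(parsed)  # top-level keys only
    return points
```

For example, `vuln_context("https://app.target.com/api/v2/checkout")` would set both `isAPI` and `isPayment` under these assumed patterns.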

Offline / Batch Mode: HAR Import

If you can't keep a live connection to the dashboard (air-gapped network, client site), use HAR import:

Test normally through Burp/Caido
Export traffic as a .har file (Burp: Proxy History → select all → Save Items as HAR)
In the dashboard: Proxy History → Import HAR
Drop the .har file — all entries are imported with credential redaction
Run AI analysis as normal
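For a sense of what "imported with credential redaction" involves, the sketch below reads entries from a HAR document and masks sensitive values. The field layout (`log.entries[].request` and `.response`) follows the HAR 1.2 format; the lists of sensitive header and parameter names, and the output record shape, are assumptions rather than the dashboard's actual rules.

```python
import json
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed name lists; the dashboard's real redaction rules may differ.
SENSITIVE_HEADERS = {"authorization", "cookie", "set-cookie", "x-api-key"}
SENSITIVE_PARAMS = {"api_key", "apikey", "token", "access_token"}
REDACTED = "[REDACTED]"


def redact_headers(headers):
    """Mask values of sensitive headers in a HAR-style name/value list."""
    return [
        {
            "name": h["name"],
            "value": REDACTED if h["name"].lower() in SENSITIVE_HEADERS else h["value"],
        }
        for h in headers
    ]


def redact_url(url):
    """Mask values of sensitive query parameters, keeping the rest of the URL."""
    parts = urlsplit(url)
    query = [
        (k, REDACTED if k.lower() in SENSITIVE_PARAMS else v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
    ]
    # safe="[]" keeps the brackets of the marker readable in the output.
    return urlunsplit(parts._replace(query=urlencode(query, safe="[]")))


def import_har(har_text):
    """Turn a HAR document into redacted proxy-history records."""
    har = json.loads(har_text)
    records = []
    for entry in har["log"]["entries"]:
        req, resp = entry["request"], entry["response"]
        records.append({
            "method": req["method"],
            "url": redact_url(req["url"]),
            "request_headers": redact_headers(req.get("headers", [])),
            "status": resp["status"],
            "response_headers": redact_headers(resp.get("headers", [])),
            "response_body": resp.get("content", {}).get("text", ""),
        })
    return records
```

If the .har file will leave the client site, running a pass like this locally before the upload is worth considering, since the export itself contains the raw credentials.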

Dashboard Pages

Proxy History

[Figure: Proxy History layout]

Session bar: Create/switch sessions, live traffic indicator, Import HAR, Analyze button

Request table: Color-coded methods (GET=green, POST=blue, PUT=amber, DELETE=red), status codes, duration, AI flag icon. Click any row for details.

Detail panel: Three tabs — Request (headers + body), Response (headers + body), AI Analysis (severity + reason if flagged). Notes field for your annotations.

Analysis panel (collapsible): Results from each AI analysis run — findings with severity badges, suggestions with action items, coverage gaps as checklists.

Scan Intelligence

Endpoints table: Path, methods, parameter count, tested/untested status, discovery source. Click to expand and see parameters.

Tech Stack card: Server, framework, languages, database, WAF, CDN — updates live as the scanner fingerprints the target.

Coverage map: Big percentage number + progress bars for endpoints and parameters tested.
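The arithmetic behind a coverage figure is simple; a minimal sketch, assuming each endpoint or parameter record carries a boolean `tested` flag (a hypothetical field name) and that the result is rounded to one decimal place:

```python
def coverage(items):
    """Return the percentage of items whose 'tested' flag is set."""
    if not items:
        return 0.0  # avoid dividing by zero before the scan has found anything
    tested = sum(1 for item in items if item["tested"])
    return round(100 * tested / len(items), 1)


endpoints = [{"tested": True}, {"tested": False}, {"tested": True}]
print(coverage(endpoints))
```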

Interesting Finds: Alert cards for source maps, debug endpoints, exposed files, default credentials — each with an "investigate" suggestion.

Downloads

Both extensions are distributed as GitHub Release assets (not published to Burp/Caido plugin stores):

Burp Suite Extension (.jar)
Caido Plugin (.zip)

Extension	Requirements	Build from source
Burp Suite	Burp 2024+, Java 17+	`cd extensions/burp-proxy-bridge && gradle jar`
Caido	Caido v0.40+	See `extensions/caido-proxy-bridge/README.md`

See extensions/BUILD.md for release packaging instructions.