AI-Augmented Penetration Testing

When you run a pentest with BeDefended, the automated engine and your manual proxy testing work in parallel and feed each other. This page explains the practical workflow — what you actually do, step by step, during an engagement.


The Problem

Traditional pentesting has two disconnected tracks:

  1. Automated scanning discovers endpoints, parameters, and known vulns — but misses business logic, complex auth flows, and context-dependent bugs
  2. Manual proxy testing (Burp/Caido) catches what scanners can't — but the tester doesn't know what the scanner already covered, wastes time retesting, and has no AI helping spot patterns in traffic

AI-Augmented PT connects them. The automated scan tells you what to focus on. Your proxy traffic feeds the AI for second-opinion analysis. Coverage gaps are visible in real time.


How It Works

                    ┌─────────────────────────────────┐
                    │     BeDefended Dashboard        │
                    │                                 │
                    │  Scan Intel ← automated scan    │
                    │  (endpoints, params, coverage)  │
                    │         ↕                       │
                    │  Proxy History ← your traffic   │
                    │  (requests, AI analysis)        │
                    │         ↕                       │
                    │  Correlation Engine             │
                    │  (overlap, gaps, next steps)    │
                    └──────────┬──────────────────────┘
              ┌────────────────┼────────────────┐
              │                │                │
        Burp Extension   Caido Plugin    Automated Engine
        (sends copies)   (sends copies)  (/pentest)
              │                │                │
              └────────────────┼────────────────┘
                           Target App

The dashboard is a passive observer — your proxy flow is untouched. The extensions just POST copies of request/response pairs to the dashboard API.
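As a rough illustration of what "POST a copy" could look like, the sketch below wraps one captured request/response pair in a JSON body and prepares an HTTP POST. The dashboard URL, the `/api/proxy/requests` path, the `X-API-Key` header, and every field name are assumptions made for this example; the real API contract is not documented on this page.

```python
import json
import urllib.request

# Hypothetical values: the real API path, header name, and field names
# are not documented on this page.
DASHBOARD_URL = "https://dashboard.example.internal"
INGEST_PATH = "/api/proxy/requests"


def build_ingest_request(api_key, session_id, method, url, status, req_body, resp_body):
    """Wrap one captured request/response pair in a POST to the dashboard."""
    payload = {
        "session_id": session_id,
        "request": {"method": method, "url": url, "body": req_body},
        "response": {"status": status, "body": resp_body},
    }
    return urllib.request.Request(
        DASHBOARD_URL + INGEST_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )


req = build_ingest_request(
    "demo-key", "sess-1", "GET",
    "https://app.target.com/api/v2/orders/142", 200, "", '{"id": 142}',
)
# The copy travels separately from the proxied request, which is never modified.
# urllib.request.urlopen(req, timeout=5)  # uncomment to actually send
```

Because the copy is built from data the proxy already holds, nothing in the path between your browser and the target changes.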


Practical Workflow

Step 1 — Connect your proxy extension

Install the extension once; after that, setup per engagement takes one copy-paste.

First-time install (once only):

  • Burp Suite: Extensions → Add → bd-proxy-bridge-burp-v1.0.0.jar
  • Caido: Plugins → Install from folder → select extensions/caido-proxy-bridge/

Per-engagement setup (zero-config):

  1. In the dashboard, go to Proxy History → click New Session
  2. A modal appears with three setup options:
       • Config URL (recommended): Copy the URL, paste it into the extension's "Fetch Config" field, click Fetch. Done: API key, session ID, and dashboard URL are all auto-filled.
       • Config JSON: Copy to clipboard, click "Paste Config from Clipboard" in the extension. Same result.
       • Manual: Copy API key and session ID individually (legacy method).
  3. The extension auto-applies the config. Traffic starts flowing immediately.

No need to manually type URLs, API keys, or session IDs. One copy-paste per engagement.
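To make the Config JSON option concrete, here is a minimal sketch of how a pasted config might be parsed and validated before it is applied. The field names `dashboard_url`, `api_key`, and `session_id` are assumptions chosen to mirror the three values the text says are auto-filled; the actual schema may differ.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class BridgeConfig:
    dashboard_url: str
    api_key: str
    session_id: str


def parse_config(raw: str) -> BridgeConfig:
    """Parse pasted config JSON; the field names are assumed, not documented."""
    data = json.loads(raw)
    required = ("dashboard_url", "api_key", "session_id")
    missing = [key for key in required if not data.get(key)]
    if missing:
        raise ValueError(f"config is missing: {', '.join(missing)}")
    return BridgeConfig(
        dashboard_url=data["dashboard_url"].rstrip("/"),
        api_key=data["api_key"],
        session_id=data["session_id"],
    )


pasted = (
    '{"dashboard_url": "https://dashboard.example.internal/", '
    '"api_key": "demo-key", "session_id": "sess-1"}'
)
config = parse_config(pasted)
```

Rejecting an incomplete config up front is one way to avoid a session that looks connected but silently sends nothing.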

Step 2 — Launch the automated scan

With the proxy bridge already connected, launch the scan:

/pentest https://app.target.com

Open the dashboard and go to Scan Intel for your engagement. You'll see the scanner discovering the app in real time:

  • Endpoints appearing as the crawler finds them
  • Parameters extracted from forms, query strings, JSON bodies
  • Tech stack fingerprint updating (Spring Boot, PostgreSQL, nginx, Cloudflare...)
  • "Interesting Finds" cards popping up — source maps, debug endpoints, .git exposed, etc.

Don't wait for it to finish. Start manual testing as soon as you see the initial endpoints — your proxy traffic is already being captured and sent to the dashboard.

Step 3 — Manual testing with live intelligence

Open Scan Intel on a second monitor (or split screen). As you test manually, use it as your map:

What to look at:

| Scan Intel Section | What it tells you | What to do |
|---|---|---|
| Discovered Endpoints | All paths the scanner found | Focus on endpoints marked "untested" |
| Parameters | Params per endpoint, with "injectable" flags | Test the injectable ones the scanner flagged but couldn't confirm |
| Coverage Map | % of endpoints tested | Aim for gaps; the scanner might have missed auth-walled areas |
| Interesting Finds | Source maps, debug endpoints, .git | Investigate immediately; these are often quick wins |
| Tech Stack | Server, framework, DB, WAF | Adjust your payloads accordingly |

Example: The scanner found /api/v2/orders/{id} with parameter id flagged as injectable, but couldn't confirm SQLi because it got 403. You're authenticated as a normal user — test that IDOR/SQLi manually.
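The triage described above can be sketched as a small ordering function: endpoints with parameters flagged as injectable but unconfirmed come first, untested endpoints next, everything else last. The `Endpoint` record and its fields are hypothetical stand-ins for whatever Scan Intel exposes; the coverage helper simply computes the share of endpoints marked tested.

```python
from dataclasses import dataclass, field


@dataclass
class Endpoint:
    path: str
    tested: bool
    injectable_params: list = field(default_factory=list)  # flagged, not confirmed


def prioritize(endpoints):
    """Order endpoints for manual testing: flagged-injectable first, then untested."""
    def rank(ep):
        if ep.injectable_params:
            return 0
        if not ep.tested:
            return 1
        return 2
    return sorted(endpoints, key=rank)  # sorted() is stable within each rank


def coverage_percent(endpoints):
    """Share of discovered endpoints marked as tested, as a percentage."""
    if not endpoints:
        return 0.0
    return 100.0 * sum(ep.tested for ep in endpoints) / len(endpoints)


scan_intel = [
    Endpoint("/about", tested=True),
    Endpoint("/api/v2/profile", tested=False),
    Endpoint("/api/v2/orders/{id}", tested=True, injectable_params=["id"]),
    Endpoint("/login", tested=True),
]
queue = prioritize(scan_intel)
```

With this sample data, `/api/v2/orders/{id}` moves to the front of the queue even though the scanner already touched it, which matches the advice to follow up on flags the scanner could not confirm.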

Step 4 — Let the AI analyze your traffic

After you've browsed a meaningful chunk of the app (50+ requests), go to Proxy History and click Analyze.

Choose the analysis type:

| Type | When to use | What it does |
|---|---|---|
| Vulnerability Suggestions | After exploring a new area | Scans your traffic for injection points, missing headers, error leaks, auth issues |
| Pattern Detection | After testing auth/session flows | Finds session management anomalies, rate limiting gaps, authz inconsistencies |
| Next Steps | When you're not sure what to test next | Based on tested endpoints, suggests untested attack vectors |
| Correlation | After automated scan completes | Compares your manual findings with automated scan results; shows overlap and gaps |
| Batch Summary | At end of testing session | Overview of all traffic patterns and anomalies |

What happens:

  1. The AI (Claude or Codex) gets your last 100 requests + the engagement context (tech stack, existing findings)
  2. It analyzes patterns across requests — not just individual ones
  3. Flagged requests appear highlighted in the Proxy History table with severity + reason
  4. Suggestions appear in the analysis panel at the bottom
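Step 1 of that list can be pictured as a simple windowing operation. The sketch below keeps only the most recent 100 requests and bundles them with the engagement context; the dictionary layout and the `analysis_type` values are illustrative assumptions, not the dashboard's actual request format.

```python
from collections import deque

MAX_REQUESTS = 100  # window size stated in the text


def build_analysis_input(requests, tech_stack, findings, analysis_type):
    """Bundle the most recent requests with engagement context for one run."""
    recent = list(deque(requests, maxlen=MAX_REQUESTS))  # keeps only the tail
    return {
        "analysis_type": analysis_type,
        "context": {"tech_stack": tech_stack, "existing_findings": findings},
        "requests": recent,
    }


captured = [
    {"id": i, "method": "GET", "path": f"/api/v2/orders/{i}"} for i in range(1, 251)
]
bundle = build_analysis_input(captured, ["Spring Boot", "PostgreSQL"], [], "next_steps")
```

A fixed window like this is one reason to split testing into focused sessions: older traffic from an unrelated area would otherwise fall outside the 100 requests the AI sees.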

Example output:

Medium — Potential IDOR: Requests to /api/v2/orders/142 and /api/v2/orders/143 return different user data with the same session token. The id parameter appears to be a sequential integer without authorization check.

Suggestion: Test with IDs belonging to other users. Try /api/v2/orders/1 through /api/v2/orders/10.

Low — Missing Security Headers: 23 responses lack X-Content-Type-Options. All /api/ endpoints return Access-Control-Allow-Origin: *.
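A finding like the header one above can also be double-checked by hand. The following sketch counts responses that lack X-Content-Type-Options and collects paths that return a wildcard Access-Control-Allow-Origin. The input shape (a list of dicts with `path` and `headers`) is an assumption for the example, not the dashboard's export format.

```python
def header_findings(responses):
    """Count responses missing X-Content-Type-Options and list wildcard-CORS paths."""
    missing_xcto = 0
    wildcard_cors = []
    for resp in responses:
        # HTTP header names are case-insensitive, so normalize before lookup.
        headers = {name.lower(): value for name, value in resp["headers"].items()}
        if "x-content-type-options" not in headers:
            missing_xcto += 1
        if headers.get("access-control-allow-origin", "").strip() == "*":
            wildcard_cors.append(resp["path"])
    return missing_xcto, wildcard_cors


sample = [
    {"path": "/api/v2/orders/142", "headers": {"Access-Control-Allow-Origin": "*"}},
    {"path": "/api/v2/orders/143", "headers": {"access-control-allow-origin": "*",
                                                "X-Content-Type-Options": "nosniff"}},
    {"path": "/about", "headers": {"Content-Type": "text/html"}},
]
missing, cors_paths = header_findings(sample)
```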

Step 5 — The correlation loop

This step ties the two tracks together. After both the automated scan and your manual testing have run:

  1. Trigger a Correlation analysis
  2. The AI cross-references everything:

| Category | Meaning | Action |
|---|---|---|
| Confirmed | Both automated and manual found it | High confidence: include in report |
| Auto-only | Scanner found it, you didn't encounter it | Verify manually; could be FP |
| Manual-only | You found it, scanner missed it | New discovery: document PoC |
| Coverage gaps | Neither approach tested it | Prioritize for remaining testing time |
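
Conceptually, the four categories are set operations over the two result sets. The sketch below shows that idea under the assumption that a finding can be keyed by an (endpoint, vulnerability class) pair; the real correlation is done by the AI and is likely fuzzier than exact matching.

```python
def correlate(auto_findings, manual_findings, all_endpoints, tested_endpoints):
    """Bucket findings and endpoints into the four correlation categories."""
    auto, manual = set(auto_findings), set(manual_findings)
    return {
        "confirmed": auto & manual,
        "auto_only": auto - manual,
        "manual_only": manual - auto,
        "coverage_gaps": set(all_endpoints) - set(tested_endpoints),
    }


# Findings keyed by (endpoint, vulnerability class); the keying is an assumption.
auto = [("/api/v2/orders/{id}", "sqli"), ("/search", "xss")]
manual = [("/api/v2/orders/{id}", "sqli"), ("/api/v2/orders/{id}", "idor")]
result = correlate(
    auto, manual,
    all_endpoints=["/api/v2/orders/{id}", "/search", "/admin/export"],
    tested_endpoints=["/api/v2/orders/{id}", "/search"],
)
```

In this sample, the SQLi on the orders endpoint is confirmed by both tracks, the XSS needs manual verification, the IDOR is a manual-only discovery, and `/admin/export` shows up as a coverage gap.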

Step 6 — Repeat per testing focus

Create separate proxy sessions for different areas:

Session: "Auth Flow — all roles"      → Login, register, password reset, 2FA
Session: "Payment Flow"               → Cart, checkout, Stripe callbacks
Session: "API — GraphQL"              → All GraphQL queries/mutations
Session: "Admin Panel"                → Admin-only functionality

Each session gets its own AI analysis, keeping results focused and manageable.


What the Extensions Actually Send

The extensions filter traffic before sending it, so the dashboard is not flooded with noise.

Automatic filtering (Burp extension):

  • Scope filtering: Only in-scope requests (via Burp's target scope or custom regex)
  • Response dedup: SHA-256 hash of response body — if you reload a page 10 times, only the first unique response is sent
  • Endpoint dedup: Same method+path suppressed for 30 minutes (path normalized: /users/42 → /users/{id})
  • Binary skip: Images, fonts, wasm, audio, video are never sent
  • Size limit: Responses > 96KB are skipped
  • Credential redaction: Authorization headers, cookies, Bearer tokens, API keys in query params → [REDACTED] (done server-side)
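
The client-side rules above can be sketched as a single decision function. This is an illustration in Python rather than the extension's actual code; the order of the checks, the content-type prefixes, and the numeric-segment normalization rule are assumptions. Credential redaction is left out because the text says it happens server-side.

```python
import hashlib
import re

MAX_BODY_BYTES = 96 * 1024          # size limit from the list above
ENDPOINT_TTL_SECONDS = 30 * 60      # endpoint dedup window from the list above
BINARY_PREFIXES = ("image/", "font/", "audio/", "video/", "application/wasm")


def normalize_path(path):
    """Replace purely numeric path segments with {id}: /users/42 -> /users/{id}."""
    return re.sub(r"/\d+(?=/|$)", "/{id}", path)


class ForwardFilter:
    def __init__(self, scope_regex):
        self.scope = re.compile(scope_regex)
        self.seen_bodies = set()     # SHA-256 digests of response bodies already sent
        self.seen_endpoints = {}     # (method, normalized path) -> time last sent

    def should_send(self, method, host, path, content_type, body, now):
        if not self.scope.search(host):
            return False             # out of scope
        if content_type.startswith(BINARY_PREFIXES) or len(body) > MAX_BODY_BYTES:
            return False             # binary or oversized response
        digest = hashlib.sha256(body).hexdigest()
        if digest in self.seen_bodies:
            return False             # identical response body already sent
        key = (method, normalize_path(path))
        last = self.seen_endpoints.get(key)
        if last is not None and now - last < ENDPOINT_TTL_SECONDS:
            return False             # same endpoint sent within the last 30 minutes
        self.seen_bodies.add(digest)
        self.seen_endpoints[key] = now
        return True


flt = ForwardFilter(r"(^|\.)target\.com$")
first = flt.should_send("GET", "app.target.com", "/users/42",
                        "application/json", b'{"id":42}', now=0)
reload_ = flt.should_send("GET", "app.target.com", "/users/42",
                          "application/json", b'{"id":42}', now=5)
sibling = flt.should_send("GET", "app.target.com", "/users/43",
                          "application/json", b'{"id":43}', now=60)
```

Here the first request is forwarded, the reload is dropped by response dedup, and `/users/43` is dropped by endpoint dedup because it normalizes to the same `/users/{id}` path.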

Extra metadata sent with each request:

  • VulnContext tags: isAuth, isPayment, isAdmin, isAPI — based on URL patterns. The AI uses these to prioritize findings (a missing CSRF on /checkout is more critical than on /about)
  • Injection points: URL params, body params, JSON keys, cookie names, path IDs — pre-extracted so the AI doesn't have to re-parse everything
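
A minimal sketch of how such metadata could be derived from a request follows. The tag names come from the list above, but the URL patterns behind them and the output structure are hypothetical; the extension's real rules are not listed on this page.

```python
import json
import re
from urllib.parse import urlsplit, parse_qsl

# Hypothetical URL patterns; the extension's real rules are not listed here.
TAG_PATTERNS = {
    "isAuth": re.compile(r"/(login|logout|register|password|2fa|oauth)", re.I),
    "isPayment": re.compile(r"/(checkout|cart|payment|billing)", re.I),
    "isAdmin": re.compile(r"/admin(/|$)", re.I),
    "isAPI": re.compile(r"/(api|graphql)(/|$)", re.I),
}


def vuln_context(url):
    """Tag a URL based on path patterns only."""
    path = urlsplit(url).path
    return {tag: bool(rx.search(path)) for tag, rx in TAG_PATTERNS.items()}


def injection_points(url, body="", cookie_header=""):
    """Pre-extract parameter names and path IDs from one request."""
    parts = urlsplit(url)
    points = {
        "url_params": [k for k, _ in parse_qsl(parts.query, keep_blank_values=True)],
        "path_ids": re.findall(r"/(\d+)(?=/|$)", parts.path),
        "cookie_names": [c.split("=", 1)[0].strip()
                         for c in cookie_header.split(";") if c.strip()],
        "json_keys": [],
        "body_params": [],
    }
    try:
        parsed = json.loads(body) if body else None
    except ValueError:
        parsed = None
    if isinstance(parsed, dict):
        points["json_keys"] = list(parsed)          # top-level keys only
    elif parsed is None and body:
        # Not JSON: treat the body as form-encoded key=value pairs.
        points["body_params"] = [k for k, _ in parse_qsl(body, keep_blank_values=True)]
    return points


url = "https://app.target.com/api/v2/orders/142?expand=items&debug="
ctx = vuln_context(url)
points = injection_points(url, body='{"qty": 2, "note": "x"}',
                          cookie_header="JSESSIONID=abc; theme=dark")
```

Extracting these once on the client means the analysis side receives a ready-made list of candidate inputs rather than raw bytes to re-parse.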

Offline / Batch Mode: HAR Import

If you can't have live connectivity to the dashboard (air-gapped network, client site), use HAR import:

  1. Test normally through Burp/Caido
  2. Export traffic as .har file (Burp: Proxy History → select all → Save Items as HAR)
  3. In the dashboard: Proxy History → Import HAR
  4. Drop the .har file — all entries are imported with credential redaction
  5. Run AI analysis as normal
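
For readers curious what "imported with credential redaction" might involve, the sketch below reads the standard HAR layout (`log.entries[].request` and `.response`) and replaces sensitive header values and query parameters with `[REDACTED]`. The lists of sensitive header and parameter names are assumptions, and the dashboard's own importer may handle more cases.

```python
import json
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

SENSITIVE_HEADERS = {"authorization", "cookie", "set-cookie", "x-api-key"}
SENSITIVE_PARAMS = {"api_key", "apikey", "token", "access_token"}  # assumed names
REDACTED = "[REDACTED]"


def redact_headers(headers):
    """Blank out values of credential-bearing headers, keeping the names."""
    return [
        {"name": h["name"],
         "value": REDACTED if h["name"].lower() in SENSITIVE_HEADERS else h["value"]}
        for h in headers
    ]


def redact_url(url):
    """Blank out credential-like query parameters, keeping the rest of the URL."""
    parts = urlsplit(url)
    query = [(k, REDACTED if k.lower() in SENSITIVE_PARAMS else v)
             for k, v in parse_qsl(parts.query, keep_blank_values=True)]
    # safe="[]" keeps the brackets of [REDACTED] readable in the query string.
    return urlunsplit(parts._replace(query=urlencode(query, safe="[]")))


def import_har(har_text):
    """Turn HAR entries into flat, redacted rows."""
    entries = json.loads(har_text)["log"]["entries"]
    return [{
        "method": e["request"]["method"],
        "url": redact_url(e["request"]["url"]),
        "request_headers": redact_headers(e["request"].get("headers", [])),
        "status": e["response"]["status"],
        "response_headers": redact_headers(e["response"].get("headers", [])),
    } for e in entries]


har = json.dumps({"log": {"entries": [{
    "request": {"method": "GET",
                "url": "https://app.target.com/api/items?api_key=s3cret&page=2",
                "headers": [{"name": "Authorization", "value": "Bearer abc"},
                            {"name": "Accept", "value": "*/*"}]},
    "response": {"status": 200,
                 "headers": [{"name": "Set-Cookie", "value": "sid=1"}]},
}]}})
rows = import_har(har)
```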

Dashboard Pages

Proxy History

Session bar: Create/switch sessions, live traffic indicator, Import HAR, Analyze button

Request table: Color-coded methods (GET=green, POST=blue, PUT=amber, DELETE=red), status codes, duration, AI flag icon. Click any row for details.

Detail panel: Three tabs — Request (headers + body), Response (headers + body), AI Analysis (severity + reason if flagged). Notes field for your annotations.

Analysis panel (collapsible): Results from each AI analysis run — findings with severity badges, suggestions with action items, coverage gaps as checklists.

Scan Intelligence

Endpoints table: Path, methods, parameter count, tested/untested status, discovery source. Click to expand and see parameters.

Tech Stack card: Server, framework, languages, database, WAF, CDN — updates live as the scanner fingerprints the target.

Coverage map: Big percentage number + progress bars for endpoints and parameters tested.

Interesting Finds: Alert cards for source maps, debug endpoints, exposed files, default credentials — each with an "investigate" suggestion.


Downloads

Both extensions are distributed as GitHub Release assets (not published to Burp/Caido plugin stores):

Burp Suite Extension (.jar) Caido Plugin (.zip)

| Extension | Requirements | Build from source |
|---|---|---|
| Burp Suite | Burp 2024+, Java 17+ | cd extensions/burp-proxy-bridge && gradle jar |
| Caido | Caido v0.40+ | See extensions/caido-proxy-bridge/README.md |

See extensions/BUILD.md for release packaging instructions.