ClaimFlow (Neo Benchmark)

21/24 (87.5%) — vs Neo 22/24

Combined white-box code review and black-box dynamic testing. 3 false positives (FP). 3 extra vulnerabilities found outside the ground truth. Run: 2026-03-24.

Insurance claims management system from ProjectDiscovery's Vibe-Coding Benchmark. Built with Cursor. Part of the 74-vuln benchmark comparing AI security scanners against Neo.


Lab Info

| Field | Value |
|---|---|
| Stack | SvelteKit, Drizzle ORM, SQLite, Custom Auth |
| LOC | 12,368 |
| Port | 8103 |
| Roles | Admin, Underwriter, Adjuster, Agent/Broker, Policyholder |
| Auth | Custom session-based (SHA-256 + hardcoded salt, httpOnly cookies) |
| Source | github.com/projectdiscovery/research |

Score Breakdown

By Severity

| Severity | Total | Found | Missed | Rate |
|---|---|---|---|---|
| 🔴 Critical | 2 | 2 | 0 | 100% |
| 🟠 High | 4 | 4 | 0 | 100% |
| 🟡 Medium | 9 | 8 | 1 | 89% |
| 🔵 Low | 5 | 5 | 0 | 100% |
| ⚪ Info | 4 | 2 | 2 | 50% |
| Total | 24 | 21 | 3 | 87.5% |

vs Neo Baseline

| Metric | BeDefended | Neo | Delta |
|---|---|---|---|
| True Positives | 21/24 | 22/24 | -1 |
| False Positives | 3 | 2 | +1 |
| Precision | 88% | 92% | -4pp |
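
The precision figures follow from the counts in the table as TP / (TP + FP). A minimal Python sketch of that arithmetic, with recall added against the 24-entry ClaimFlow ground truth. The function names are illustrative and are not taken from the scorer script:

```python
def precision(tp: int, fp: int) -> float:
    """Share of reported findings that are real: TP / (TP + FP)."""
    return tp / (tp + fp)


def recall(tp: int, ground_truth: int) -> float:
    """Share of ground-truth vulnerabilities that were found."""
    return tp / ground_truth


# (true positives, false positives) per scanner, from the table above.
scanners = {"BeDefended": (21, 3), "Neo": (22, 2)}

for name, (tp, fp) in scanners.items():
    print(f"{name}: precision {precision(tp, fp):.1%}, recall {recall(tp, 24):.1%}")

# Difference in precision, expressed in percentage points.
delta_pp = (precision(21, 3) - precision(22, 2)) * 100
print(f"Precision delta: {delta_pp:+.1f}pp")
```

Unrounded, the precision values are 87.5% and 91.7%, which the table rounds to 88% and 92%. Precision and recall coincide for each scanner here only because TP + FP happens to equal the ground-truth size of 24.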

By Vulnerability Category

| Category | Total | Found | Skills Used |
|---|---|---|---|
| Access Control | 9 | 8 | test-access |
| Auth & Session | 6 | 6 | test-auth |
| Business Logic | 2 | 2 | test-logic |
| Cryptographic | 1 | 1 | test-crypto |
| Client-Side | 1 | 1 | test-client |
| Info Disclosure | 2 | 1 | test-exceptions |
| File Upload | 1 | 1 | test-logic |
| Input Validation | 2 | 1 | test-injection |

Per-Finding Results

| ID | Finding | Severity | Neo | BD | Status |
|---|---|---|---|---|---|
| CF-001 | Missing Auth on Admin Actions | Critical | Found | Found | TP |
| CF-002 | Deactivated User Retains Access | Critical | Found | Found | TP |
| CF-003 | Document Verification IDOR | High | Found | Found | TP |
| CF-004 | Password Hash via Drizzle ORM | High | Found | Found | TP |
| CF-005 | Missing Auth on Adjuster Actions | High | Found | Found | TP |
| CF-006 | Missing Auth on Underwriter makeDecision | High | Found | Found | TP |
| CF-007 | Mass Assignment via Workflow updateData | Medium | Found | Found | TP |
| CF-008 | Adjuster Bypasses Workflow Transitions | Medium | Found | Found | TP |
| CF-009 | Batch Ops Bypass Workflow Rules | Medium | Found | Found | TP |
| CF-010 | Missing Auth on Settlement Calculate | Medium | Found | Missed | FN |
| CF-011 | Missing Role Check on Triage API | Medium | Found | Found | TP |
| CF-012 | Missing Role Check on Settlement/Workflow GET | Medium | Found | Found | TP |
| CF-013 | CSRF Missing on API Endpoints | Medium | Found | Found | TP |
| CF-014 | Unrestricted File Upload | Medium | Found | Found | TP |
| CF-015 | SHA-256 with Hardcoded Salt | Medium | Found | Found | TP |
| CF-016 | Session Cookie Missing Secure Flag | Low | Found | Found | TP |
| CF-017 | Sessions Not Invalidated on Pwd Change | Low | Found | Found | TP |
| CF-018 | No Rate Limiting on Login | Low | Found | Found | TP |
| CF-019 | Unlimited Concurrent Sessions | Low | Found | Found | TP |
| CF-020 | Account Enumeration via Registration | Low | Found | Found | TP |
| CF-021 | HSTS Not Enabled | Info | Found | Missed | FN |
| CF-022 | CSP Not Implemented | Info | Found | Found | TP |
| CF-023 | Generic Email Disclosure | Info | Missed | Missed | FN (Neo too) |
| CF-024 | X-Content-Type-Options Missing | Info | Missed | Found | TP (+1 vs Neo) |
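
CF-015 is easiest to see side by side with a slower, per-user-salted scheme. The sketch below is illustrative only: ClaimFlow's actual hashing code is not reproduced here, and `HARDCODED_SALT`, `ITERATIONS`, `weak_hash`, `stronger_hash`, and `verify` are hypothetical names. PBKDF2 from Python's standard library is used as one possible replacement, not as the benchmark's prescribed fix.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

HARDCODED_SALT = b"static-salt"  # hypothetical; stands in for a salt baked into source
ITERATIONS = 600_000             # assumed work factor; tune for your hardware


def weak_hash(password: str) -> str:
    """Single fast SHA-256 pass with one salt shared by every user."""
    return hashlib.sha256(HARDCODED_SALT + password.encode()).hexdigest()


def stronger_hash(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """PBKDF2-HMAC-SHA256 with a random per-user salt and many iterations."""
    salt = salt if salt is not None else os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest


def verify(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute with the stored salt and compare in constant time."""
    _, digest = stronger_hash(password, salt)
    return hmac.compare_digest(digest, expected)


# Same password, same shared salt: every account gets the same hash.
print(weak_hash("password123") == weak_hash("password123"))

# Same password, fresh salts: the stored digests differ per user.
salt_a, digest_a = stronger_hash("password123")
salt_b, digest_b = stronger_hash("password123")
print(digest_a == digest_b, verify("password123", salt_a, digest_a))
```

With a shared salt, one precomputed guess can be checked against every account at once, and a single SHA-256 pass is cheap to brute-force. A per-user salt and a high iteration count address both points.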

Extra Findings (Not in Ground Truth)

We found 3 additional real vulnerabilities not in the 74-entry benchmark:

| Finding | Severity | Description |
|---|---|---|
| All Seed Users: password123 | Critical | All 9 accounts, including admin, use password123 |
| CSV Injection in Export | Medium | Formula execution when admin exports claims to Excel |
| Weak Password Policy | Medium | Only a length check; aaaaaaaa is accepted |
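
The CSV injection finding comes down to spreadsheet applications treating cells that begin with `=`, `+`, `-`, or `@` as formulas. A minimal sketch of one common mitigation, prefixing such cells with a single quote before export. `neutralize_cell`, `export_rows`, and the sample rows are hypothetical, and ClaimFlow's own export code is not shown here:

```python
import csv
import io
from typing import List

# Leading characters that spreadsheet applications may interpret as a formula.
FORMULA_PREFIXES = ("=", "+", "-", "@", "\t", "\r")


def neutralize_cell(value: str) -> str:
    """Prefix formula-like cells with a quote so they are treated as text."""
    if value.startswith(FORMULA_PREFIXES):
        return "'" + value
    return value


def export_rows(rows: List[List[str]]) -> str:
    """Write rows to CSV text, neutralizing every cell first."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    for row in rows:
        writer.writerow([neutralize_cell(cell) for cell in row])
    return buffer.getvalue()


# Hypothetical claim rows; the second description is attacker-controlled.
claims = [
    ["CLM-1", "Water damage in kitchen"],
    ["CLM-2", '=HYPERLINK("http://attacker.example","Click")'],
]
print(export_rows(claims))
```

One trade-off of this approach is that legitimate values starting with `-` or `+`, such as negative amounts, are also prefixed, so numeric columns may need separate handling.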

Running

/pentest-neo claimflow
python evals/labs/vibeapps-scorer.py engagements/<dir> --app claimflow --html