ClaimFlow (Neo Benchmark)
21/24 (87.5%) — vs Neo 22/24
Combined white-box code review and black-box dynamic testing. 3 false positives. 3 additional vulnerabilities found outside the ground truth. Run: 2026-03-24.
Insurance claims management system from ProjectDiscovery's Vibe-Coding Benchmark. Built with Cursor. Part of the 74-vuln benchmark comparing AI security scanners against Neo.
Lab Info
| Field | Value |
| --- | --- |
| Stack | SvelteKit, Drizzle ORM, SQLite, Custom Auth |
| LOC | 12,368 |
| Port | 8103 |
| Roles | Admin, Underwriter, Adjuster, Agent/Broker, Policyholder |
| Auth | Custom session-based (SHA-256 + hardcoded salt, httpOnly cookies) |
| Source | github.com/projectdiscovery/research |
Score Breakdown
By Severity
| Severity | Total | Found | Missed | Rate |
| --- | --- | --- | --- | --- |
| Critical | 2 | 2 | 0 | 100% |
| High | 4 | 4 | 0 | 100% |
| Medium | 9 | 8 | 1 | 89% |
| Low | 5 | 5 | 0 | 100% |
| Info | 4 | 2 | 2 | 50% |
| Total | 24 | 21 | 3 | 87.5% |
vs Neo Baseline
| Metric | BeDefended | Neo | Delta |
| --- | --- | --- | --- |
| True Positives | 21/24 | 22/24 | -1 |
| False Positives | 3 | 2 | +1 |
| Precision | 88% | 92% | -4pp |
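The precision figures above follow from the true-positive and false-positive counts. A minimal sketch of that arithmetic, assuming precision is defined as TP / (TP + FP); the function name is illustrative and is not taken from the scorer script:

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Fraction of reported findings that are real: TP / (TP + FP)."""
    return true_positives / (true_positives + false_positives)


# Counts taken from the comparison table above.
bd_precision = precision(true_positives=21, false_positives=3)   # 21 / 24
neo_precision = precision(true_positives=22, false_positives=2)  # 22 / 24

# Difference in percentage points, before any rounding.
delta_pp = (bd_precision - neo_precision) * 100

print(f"BeDefended: {bd_precision:.1%}")
print(f"Neo:        {neo_precision:.1%}")
print(f"Delta:      {delta_pp:+.1f}pp")
```

The table reports whole-number percentages, so the -4pp delta is the difference of the rounded values; the unrounded difference is slightly larger in magnitude.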
By Vulnerability Category
| Category | Total | Found | Skills Used |
| --- | --- | --- | --- |
| Access Control | 9 | 8 | test-access |
| Auth & Session | 6 | 6 | test-auth |
| Business Logic | 2 | 2 | test-logic |
| Cryptographic | 1 | 1 | test-crypto |
| Client-Side | 1 | 1 | test-client |
| Info Disclosure | 2 | 1 | test-exceptions |
| File Upload | 1 | 1 | test-logic |
| Input Validation | 2 | 1 | test-injection |
Per-Finding Results
| ID | Finding | Severity | Neo | BD | Status |
| --- | --- | --- | --- | --- | --- |
| CF-001 | Missing Auth on Admin Actions | Critical | Found | Found | TP |
| CF-002 | Deactivated User Retains Access | Critical | Found | Found | TP |
| CF-003 | Document Verification IDOR | High | Found | Found | TP |
| CF-004 | Password Hash via Drizzle ORM | High | Found | Found | TP |
| CF-005 | Missing Auth on Adjuster Actions | High | Found | Found | TP |
| CF-006 | Missing Auth on Underwriter makeDecision | High | Found | Found | TP |
| CF-007 | Mass Assignment via Workflow updateData | Medium | Found | Found | TP |
| CF-008 | Adjuster Bypasses Workflow Transitions | Medium | Found | Found | TP |
| CF-009 | Batch Ops Bypass Workflow Rules | Medium | Found | Found | TP |
| CF-010 | Missing Auth on Settlement Calculate | Medium | Found | Missed | FN |
| CF-011 | Missing Role Check on Triage API | Medium | Found | Found | TP |
| CF-012 | Missing Role Check on Settlement/Workflow GET | Medium | Found | Found | TP |
| CF-013 | CSRF Missing on API Endpoints | Medium | Found | Found | TP |
| CF-014 | Unrestricted File Upload | Medium | Found | Found | TP |
| CF-015 | SHA-256 with Hardcoded Salt | Medium | Found | Found | TP |
| CF-016 | Session Cookie Missing Secure Flag | Low | Found | Found | TP |
| CF-017 | Sessions Not Invalidated on Pwd Change | Low | Found | Found | TP |
| CF-018 | No Rate Limiting on Login | Low | Found | Found | TP |
| CF-019 | Unlimited Concurrent Sessions | Low | Found | Found | TP |
| CF-020 | Account Enumeration via Registration | Low | Found | Found | TP |
| CF-021 | HSTS Not Enabled | Info | Found | Missed | FN |
| CF-022 | CSP Not Implemented | Info | Found | Found | TP |
| CF-023 | Generic Email Disclosure | Info | Missed | Missed | FN (Neo too) |
| CF-024 | X-Content-Type-Options Missing | Info | Missed | Found | TP (+1 vs Neo) |
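The per-finding results can be tallied back into the severity breakdown as a consistency check. The sketch below is illustrative and is not the scorer's actual logic. It assumes CF-023 and CF-024 are Info severity, which matches the four Info findings counted in the By Severity table; all variable names are hypothetical.

```python
# Finding-number ranges per severity, as listed in the per-finding table.
# Assumption: CF-023 and CF-024 are Info (By Severity lists 4 Info findings).
SEVERITY_RANGES = {
    "Critical": range(1, 3),   # CF-001..CF-002
    "High": range(3, 7),       # CF-003..CF-006
    "Medium": range(7, 16),    # CF-007..CF-015
    "Low": range(16, 21),      # CF-016..CF-020
    "Info": range(21, 25),     # CF-021..CF-024
}

# Findings each scanner missed, read off the Neo and BD columns.
BD_MISSED = {10, 21, 23}
NEO_MISSED = {23, 24}


def tally(missed: set[int]) -> dict[str, tuple[int, int, int]]:
    """Return {severity: (total, found, missed)} for one scanner."""
    rows = {}
    for severity, ids in SEVERITY_RANGES.items():
        total = len(ids)
        found = sum(1 for finding_id in ids if finding_id not in missed)
        rows[severity] = (total, found, total - found)
    return rows


bd_rows = tally(BD_MISSED)
neo_rows = tally(NEO_MISSED)
bd_found = sum(found for _, found, _ in bd_rows.values())
neo_found = sum(found for _, found, _ in neo_rows.values())

for severity, (total, found, missed) in bd_rows.items():
    print(f"{severity:<8} total={total} found={found} missed={missed}")
print(f"BeDefended {bd_found}/24, Neo {neo_found}/24")
```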
We found 3 additional real vulnerabilities not in the 74-entry benchmark:
| Finding | Severity | Description |
| --- | --- | --- |
| All Seed Users: password123 | Critical | All 9 seed accounts, including admin, use the password password123 |
| CSV Injection in Export | Medium | Formula execution when an admin opens exported claims in Excel |
| Weak Password Policy | Medium | Only a length check is enforced; aaaaaaaa is accepted |
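To illustrate the CSV injection finding: a spreadsheet application may evaluate an exported cell that begins with a formula character. A common mitigation is to prefix such cells with a single quote before writing them. The sketch below is a generic Python illustration, not ClaimFlow's export code (the application itself is SvelteKit); the function names and sample rows are hypothetical.

```python
import csv
import io

# Leading characters that spreadsheet applications may treat as a formula.
FORMULA_PREFIXES = ("=", "+", "-", "@", "\t", "\r")


def neutralize_cell(value: str) -> str:
    """Prefix formula-like cells with a single quote so they stay plain text."""
    if value.startswith(FORMULA_PREFIXES):
        return "'" + value
    return value


def export_rows(rows: list[list[object]]) -> str:
    """Serialize rows to CSV, neutralizing every cell and quoting all fields."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL)
    for row in rows:
        writer.writerow([neutralize_cell(str(cell)) for cell in row])
    return buffer.getvalue()


# Hypothetical claim rows; the second description carries a formula payload.
sample = [
    ["CLM-1", "Water damage in kitchen"],
    ["CLM-2", '=HYPERLINK("http://example.com","click")'],
]
print(export_rows(sample))
```

This approach also prefixes legitimate values that start with a minus sign, such as negative amounts, so numeric columns may need separate handling.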
Running
/pentest-neo claimflow
python evals/labs/vibeapps-scorer.py engagements/<dir> --app claimflow --html