pwnmachine 👾
@princechaddha
We built three full-stack apps using Claude Code, Codex, and Cursor - a healthcare portal, a banking platform, and an insurance claims system. The prompts were casual, exactly how people actually vibe code. No mention of security, nothing intentionally broken. Then we threw four security scanners at them Neo, Claude, Invicti and Snyk and manually verified every single finding. The results genuinely surprised us. 70 exploitable vulnerabilities across three apps. Unlimited money creation in the banking app. Any user could create admin accounts in the insurance platform. Patient records accessible to anyone in the healthcare portal. All Critical and High severity. All shipped out of the box. But what really got me was the scanner gap. Neo found 62 of 70 vulnerabilities with only 5 false positives. Snyk found literally zero valid issues. The difference between these tools isn't incremental it's the difference between finding the bugs that matter and walking away with a false sense of security. Full blog with the stats is live. The detailed research paper with exact prompts, methodology, all the findings, and the apps themselves is coming soon.