
Built AI code generation pipeline that produces production-ready full-stack apps. Ex-Anthropic.

Built AI schema generation that cut database setup from 3.4 hours to 12 minutes
Supabase users spent an average of 3.4 hours setting up their initial database schema. 41% of new projects were abandoned during schema creation because non-backend developers struggled with relational modeling, foreign keys, and RLS policies.
Built a multi-step AI pipeline: (1) natural language requirement parsing with Claude, (2) schema draft with automatic normalization, (3) RLS policy generation based on described access patterns, (4) migration file output with rollback support. Added a visual schema preview and interactive refinement chat.
Average schema setup time dropped from 3.4 hours to 12 minutes. Project abandonment during setup fell from 41% to 14%. Generated schemas passed all lint rules on the first attempt 87% of the time. Feature became the #2 driver of new project creation.
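A minimal sketch of how steps (3) and (4) of the pipeline above might fit together, not the production implementation. It assumes each described access pattern has already been parsed into a structured rule; the `AccessRule` type, the policy naming scheme, and the owner-only check are all hypothetical. `auth.uid()` is the Supabase helper that returns the current user's id, and the emitted statements are standard Postgres `CREATE POLICY` / `DROP POLICY` SQL.

```python
import re
from dataclasses import dataclass

# Conservative identifier check so rule fields cannot inject SQL (assumption:
# generated table and column names are lowercase snake_case).
_IDENT = re.compile(r"[a-z_][a-z0-9_]*")
_OPERATIONS = {"select", "insert", "update", "delete"}


@dataclass(frozen=True)
class AccessRule:
    """Hypothetical structured form of one described access pattern."""
    table: str
    operation: str      # "select", "insert", "update", or "delete"
    owner_column: str   # column storing the owning user's id


def policy_name(rule: AccessRule) -> str:
    return f"{rule.table}_{rule.operation.lower()}_own"


def policy_sql(rule: AccessRule) -> str:
    """Emit an owner-only CREATE POLICY statement for one rule."""
    op = rule.operation.lower()
    if op not in _OPERATIONS:
        raise ValueError(f"unsupported operation: {rule.operation!r}")
    for ident in (rule.table, rule.owner_column):
        if not _IDENT.fullmatch(ident):
            raise ValueError(f"unsafe identifier: {ident!r}")
    check = f"auth.uid() = {rule.owner_column}"
    if op == "insert":
        clause = f"WITH CHECK ({check})"          # INSERT only accepts WITH CHECK
    elif op == "update":
        clause = f"USING ({check}) WITH CHECK ({check})"
    else:
        clause = f"USING ({check})"               # SELECT and DELETE
    return f"CREATE POLICY {policy_name(rule)} ON {rule.table} FOR {op.upper()} {clause};"


def build_migration(rules: list[AccessRule]) -> tuple[str, str]:
    """Return (up, down) SQL; down undoes up in reverse order."""
    policies = [policy_sql(r) for r in rules]     # validates every rule first
    tables = list(dict.fromkeys(r.table for r in rules))  # unique, order kept
    up = [f"ALTER TABLE {t} ENABLE ROW LEVEL SECURITY;" for t in tables] + policies
    down = [f"DROP POLICY IF EXISTS {policy_name(r)} ON {r.table};" for r in reversed(rules)]
    down += [f"ALTER TABLE {t} DISABLE ROW LEVEL SECURITY;" for t in reversed(tables)]
    return "\n".join(up), "\n".join(down)
```

Calling `build_migration([AccessRule("notes", "select", "user_id")])` returns the forward migration and its rollback as two SQL strings. Disabling RLS in the rollback assumes the forward migration was what enabled it.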

Claude lacked a reliable benchmark for measuring code generation quality across real-world tasks. Existing benchmarks like HumanEval were too narrow.
Created a 500-task evaluation suite covering 12 languages and 8 task types (API integration, data processing, frontend components, etc.). Built an automated execution sandbox with security isolation.
Framework became the internal standard for code model evaluation. Identified 3 key weakness areas, which guided targeted fine-tuning that improved pass@1 by 18% on real-world tasks.
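For readers unfamiliar with the metric, pass@1 is commonly computed with the unbiased pass@k estimator popularized alongside HumanEval: for a task with n generated samples of which c pass the tests, pass@k = 1 - C(n-c, k) / C(n, k), averaged over tasks. The sketch below assumes that definition; the suite's actual scoring code is not described here, and the function names are illustrative.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Probability that at least one of k samples drawn from n is correct,
    given that c of the n samples passed."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        return 1.0  # too few failures to fill a draw of k, so one must pass
    return 1.0 - comb(n - c, k) / comb(n, k)


def suite_pass_at_k(results: list[tuple[int, int]], k: int = 1) -> float:
    """Average pass@k over tasks; each entry is (n_samples, n_correct)."""
    if not results:
        raise ValueError("no task results")
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

With k = 1 the estimator reduces to c / n per task, so pass@1 is the mean fraction of samples that pass on the first try.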