
Backend engineer building AI pipelines that turn unstructured financial docs into clean data. Power of 2.

Built AI receipt pipeline that increased throughput 4x with 91% accuracy
Pilot processed 800K+ receipts monthly with a semi-manual categorization pipeline. Accuracy was 68% and each receipt required an average of 45 seconds of bookkeeper review. The backlog grew 15% month-over-month.
Built a three-stage pipeline: Claude for text extraction, a custom classification model trained on 2M labeled transactions, and a confidence-based routing system. Implemented streaming processing with SQS and Lambda for real-time throughput.
Auto-categorization accuracy hit 91%. Bookkeeper review time dropped from 45s to 8s per receipt. Monthly throughput capacity increased 4x without adding headcount. Pipeline processes 3.2M receipts/month.
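
As a rough illustration of the confidence-based routing stage, here is a minimal Python sketch. The threshold values, the three route names, and the `Prediction` type are assumptions made for this example, not the production configuration.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    AUTO_ACCEPT = "auto_accept"    # no bookkeeper involvement
    QUICK_REVIEW = "quick_review"  # bookkeeper confirms a suggested category
    FULL_REVIEW = "full_review"    # bookkeeper categorizes from scratch


@dataclass(frozen=True)
class Prediction:
    receipt_id: str
    category: str
    confidence: float  # classifier probability in [0, 1]


# Hypothetical thresholds, chosen only for illustration.
AUTO_ACCEPT_THRESHOLD = 0.95
QUICK_REVIEW_THRESHOLD = 0.70


def route(prediction: Prediction) -> Route:
    """Pick a handling path from the classifier's confidence score."""
    if not 0.0 <= prediction.confidence <= 1.0:
        raise ValueError(f"confidence out of range: {prediction.confidence}")
    if prediction.confidence >= AUTO_ACCEPT_THRESHOLD:
        return Route.AUTO_ACCEPT
    if prediction.confidence >= QUICK_REVIEW_THRESHOLD:
        return Route.QUICK_REVIEW
    return Route.FULL_REVIEW
```

In a setup like this, the thresholds would typically be tuned against labeled data so that the auto-accept band meets the target accuracy while keeping the review queues small.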

Automated year-end tax document generation, cutting turnaround from 3 weeks to 2 days
Generating year-end tax documents (1099s, W-2 summaries) required 3 weeks of manual work by a team of 4 bookkeepers each January.
Built a template engine with Claude-powered data extraction from QuickBooks exports. Created a validation layer that checks each generated document against IRS rules.
Tax document generation went from 3 weeks to 2 days. Error rate dropped from 8% to 0.3%.

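Moved bank transaction sync from nightly batch to event-driven, cutting visibility delay to under 5 minutes
Bank transaction syncing relied on nightly batch jobs via Plaid. Customers wanted same-day visibility into their financials, but the batch architecture created 12-24 hour delays.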
Migrated to Plaid webhooks with an event-driven architecture using AWS EventBridge. Built idempotent transaction processing so that redelivered events are applied exactly once.
Transaction visibility went from next-day to under 5 minutes. Reduced Plaid API costs by 40% by eliminating redundant polling.
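
The idempotent processing idea can be sketched as follows, assuming a unique key on the transaction ID as the deduplication mechanism. SQLite stands in for the real datastore, and the event fields (`transaction_id`, `account_id`, `amount_cents`) are simplified, hypothetical names rather than Plaid's actual payload.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Create the store; the primary key is what enforces deduplication."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS transactions (
            transaction_id TEXT PRIMARY KEY,
            account_id     TEXT NOT NULL,
            amount_cents   INTEGER NOT NULL
        )
        """
    )
    return conn


def process_transaction(conn: sqlite3.Connection, event: dict) -> bool:
    """Apply an event at most once; return True only if it was newly applied.

    A redelivered event hits the primary-key conflict and is ignored, so
    at-least-once delivery upstream yields exactly-once effects here.
    """
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(
            "INSERT OR IGNORE INTO transactions "
            "(transaction_id, account_id, amount_cents) VALUES (?, ?, ?)",
            (event["transaction_id"], event["account_id"], event["amount_cents"]),
        )
    return cursor.rowcount == 1
```

Any downstream side effect (for example a ledger update) would need to happen in the same database transaction as the insert, or be idempotent itself, for the guarantee to hold end to end.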