
Substack
Built RunwayML's content moderation pipeline processing 2M+ generations/day. Policy + ML systems.

Built AI moderation pipeline handling 2M/day with 96.2% automated catch rate
RunwayML's content moderation relied on a manual review queue that processed 800 flagged items/day with a 12-hour average response time. As generation volume grew to 2M/day, the human-only approach became unsustainable. Policy violations were increasing 40% month-over-month, and the company risked losing its API providers.
Built a three-tier moderation system: (1) a pre-generation prompt classifier that blocks obviously violating requests in <50ms, (2) a post-generation image classifier trained on 200K labeled examples covering 18 violation categories, and (3) human review for edge cases, each queued with its uncertainty score. Created an internal Retool dashboard for the review team with one-click decisions and automatic policy citation.
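The routing between the three tiers can be sketched roughly as follows. This is a minimal illustration, not the production code: the threshold values, the `Decision` type, and the `moderate` function with its callable parameters are all hypothetical, and the classifiers are passed in as plain functions that return a violation probability.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical thresholds; the actual values are not stated in the source.
PROMPT_BLOCK_THRESHOLD = 0.95  # tier 1: block only obvious violations
IMAGE_BLOCK_THRESHOLD = 0.90   # tier 2: confident violation
IMAGE_ALLOW_THRESHOLD = 0.10   # tier 2: confident non-violation


@dataclass
class Decision:
    action: str   # "block", "allow", or "human_review"
    tier: str     # tier that produced the decision: "prompt" or "image"
    score: float  # violation probability behind the decision


def moderate(
    prompt: str,
    prompt_score: Callable[[str], float],
    generate: Callable[[str], bytes],
    image_score: Callable[[bytes], float],
) -> Decision:
    """Route one request through the three tiers (assumed logic)."""
    # Tier 1: reject obviously violating prompts before spending any
    # generation compute.
    p = prompt_score(prompt)
    if p >= PROMPT_BLOCK_THRESHOLD:
        return Decision("block", "prompt", p)

    # Tier 2: classify the generated image.
    image = generate(prompt)
    s = image_score(image)
    if s >= IMAGE_BLOCK_THRESHOLD:
        return Decision("block", "image", s)
    if s <= IMAGE_ALLOW_THRESHOLD:
        return Decision("allow", "image", s)

    # Tier 3: the classifier is uncertain, so queue the item for human
    # review and carry the score along for the reviewer.
    return Decision("human_review", "image", s)
```

Only scores that fall between the two image thresholds reach the human queue, so in a design like this the width of that band is the main lever on review volume.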
Automated catch rate hit 96.2% for policy violations. Manual review queue dropped from 800 to 120 items/day despite 5x volume growth. Average response time fell from 12 hours to 1.8 seconds for automated decisions. False positive rate held at 0.3%, preventing unnecessary user friction.

Brand-creator partnerships on TikTok's marketplace had no standardized safety vetting. Brands were occasionally paired with creators who had past policy violations, creating PR risks.
Built an automated creator safety scoring system that evaluated historical content, comment sentiment, and policy violation history. Designed tiered brand-safety categories that brands could select during campaign setup.
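One way such a score and its tiers could fit together is sketched below. Everything specific here is an assumption made for illustration: the weights, the violation cap, the tier names and cutoffs, and the names `CreatorSignals`, `safety_score`, `brand_safety_tier`, and `eligible` do not come from the actual system.

```python
from dataclasses import dataclass

# Hypothetical weights and cutoffs, chosen only for illustration.
WEIGHT_CONTENT = 0.4
WEIGHT_COMMENTS = 0.2
WEIGHT_VIOLATIONS = 0.4
VIOLATION_CAP = 5  # violations beyond this count add no further risk

# Tiers ordered from least to most restrictive.
TIER_ORDER = ["open", "standard", "strict"]


@dataclass
class CreatorSignals:
    content_risk: float        # 0..1, share of historical content flagged as risky
    comment_negativity: float  # 0..1, share of negative-sentiment comments
    violations: int            # count of past policy violations


def safety_score(s: CreatorSignals) -> float:
    """Combine the three signals into a 0-100 score (higher is safer)."""
    violation_risk = min(s.violations, VIOLATION_CAP) / VIOLATION_CAP
    risk = (
        WEIGHT_CONTENT * s.content_risk
        + WEIGHT_COMMENTS * s.comment_negativity
        + WEIGHT_VIOLATIONS * violation_risk
    )
    return round(100 * (1 - risk), 1)


def brand_safety_tier(score: float) -> str:
    """Map a score to the strictest tier the creator qualifies for."""
    if score >= 85:
        return "strict"
    if score >= 60:
        return "standard"
    return "open"


def eligible(creator_tier: str, campaign_tier: str) -> bool:
    """A creator matches a campaign if their tier is at least as strict."""
    return TIER_ORDER.index(creator_tier) >= TIER_ORDER.index(campaign_tier)
```

A brand that selects "standard" during campaign setup would then be matched only with creators whose tier is "standard" or "strict".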
Brand safety incidents dropped 87%. Creator marketplace GMV grew 2.3x as brands gained confidence. The scoring system became a key competitive differentiator in enterprise sales pitches.