intuition

Human QA Operations

About the Role

You will own Intuition's annotation and human QA vertical end-to-end. We combine domain experts, a high-quality reviewer pool, and our tech platform to teach AI to reason and act in the physical world. Our globally distributed crowd reviews tens of thousands of hours of first-person video every week, across multiple countries and languages. This is a builder's role. You will set the technical blueprint for how we ingest, annotate, QA, and deliver hundreds of thousands of hours of first-person and embodied AI video data per quarter, at the precision a frontier training stack demands.

What You'll Do

Own the annotation and human QA operations P&L. Throughput, quality, unit economics, partner SLAs, contributor retention.
Deploy the annotation framework. Taxonomy design, golden datasets, performance buckets, consensus thresholds, inter-annotator agreement (Cohen's kappa, Krippendorff's alpha), reviewer onboarding.
Design the human QA review flow. Reviewers per task, average handle time (AHT) targets, QA-of-QA sampling rates, fraud thresholds and treatment.
Industrialize the partner stack. Onboard, scale, and rationalize BPOs, annotation companies, skilled-trade networks (plumbers, electricians, mechanics, chefs, and other trades), and direct community networks in Asia and worldwide.
Recruit deep domain experts. Both white-collar (doctors, lawyers, engineers, physicists) and skilled trades. They serve as senior annotators, golden-task creators, and arbitration reviewers.
Build the standards stack. MSAs, partner intake, standardized payouts, training material, QC SOPs, escalation paths, fraud playbooks, dataset rotation cadence.
Manage and grow a team of 3 to 5 (Manager plus Associates today, scaling with the function). Hire and ramp annotators, QA leads, partner managers, and trainers.
Partner with global engineering, product, finance, and legal teams to ship on aggressive sprint goals.
Track and report on a daily cadence. Throughput, quality, cost per reviewed hour, partner SLAs, fraud alerts, retention. Build the leadership dashboard for the vertical.

What We're Looking For

BS or MS in a STEM field. Computer science, mathematics, physics, or similar.
Demonstrated ability to design operational frameworks from first principles. Not just running someone else's playbook.
Strong with numbers, unit economics, and operational data. Fluent in SQL or BI tools, plus Sheets and Notion.
Deep understanding of AI and GenAI. Technology, market trends, basic ML and statistics.
Strong written and verbal communication. Equally credible with engineers, partners, and senior leadership.
Comfortable across global time zones. On-time delivery sometimes means weekends.
High agency. You don't wait to be told what to do.
Fluent in English.
Nice to have: experience in AI data operations, marketplace ops, gig economy, robotics, trust and safety, or logistics-heavy environments.
Nice to have: experience coordinating distributed teams or large contractor workforces.
Nice to have: experience building processes from scratch in ambiguous environments.

Why Join Us

As models get more capable, the bar on data quality, taxonomy depth, and human judgment only rises. This vertical is one of the most strategic at Intuition.