It took 2.5 years and hundreds of people to build a quality system. Today I'd need a small team and three agents.
A story about vetting, bell curves, and the moment you realize the system you spent years building is now a starting point.
The Before
In 2018, we launched a marketplace for experiences. Quality was the whole game.
So we vetted every single host with a phone call. Someone on the team would work through about ten attributes, trying to figure out whether this person could deliver. Things like depth of expertise, ability to engage a group, and connection to the place.
Before going live, hosts even had to attend an in-person event to learn how to create magic for guests.
Phone call with every single host. ~10 attributes assessed manually.
In-person event required before going live. Learn the hero's journey.
Constant internal debates. What counts as a quality experience? Ten people, ten answers.
Thoughtful? Yes. Scalable? Absolutely not. Expensive, inconsistent, and impossible to standardize.
The Rubric
I listened to leadership, studied the patterns, and designed what became our quality rubric. Three core standards. Five-point scale for each. Simple. Structured. Repeatable.
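A minimal sketch of how a rubric like that might be encoded. The three standard names here are borrowed from the attributes mentioned earlier (expertise, engagement, connection to place); the real rubric's standards and scoring details aren't specified, so treat every name as illustrative:

```python
from dataclasses import dataclass

# Hypothetical standard names -- stand-ins for the actual three core standards.
STANDARDS = ("expertise", "engagement", "sense_of_place")

@dataclass
class RubricScore:
    """One host's assessment: each standard scored on a five-point scale."""
    scores: dict  # standard name -> int in 1..5

    def __post_init__(self):
        assert set(self.scores) == set(STANDARDS), "score every standard"
        assert all(1 <= s <= 5 for s in self.scores.values()), "five-point scale"

    @property
    def total(self) -> int:
        return sum(self.scores.values())  # ranges from 3 to 15

host = RubricScore({"expertise": 5, "engagement": 4, "sense_of_place": 5})
print(host.total)  # 14
```

The point of the structure is the repeatability: every assessor answers the same three questions on the same scale, so scores from different people become comparable.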
We kept the phone call at first but made it structured. Market managers asked the three questions and documented responses.
Results started clustering into a real bell curve. The framework was working.
Product resources were scarce. So before building anything into the platform, we validated with an MVP: a Google Form. Hosts self-reported. A human QA'd the responses.
The bell curve held. The P95 experiences were absolutely incredible.
The Thresholds
Once we had the distribution, we could make real decisions. The bell curve gave us the data to set clear thresholds.
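Threshold-setting from a distribution can be sketched in a few lines. The percentiles chosen here (bottom decile for intervention, P95 for the standout tier) are illustrative, and the scores are simulated rather than real data:

```python
import random

random.seed(7)
# Simulated rubric totals (3..15), roughly bell-shaped, standing in for real data.
totals = [min(15, max(3, round(random.gauss(10, 2)))) for _ in range(1000)]

def percentile(data, p):
    """Nearest-rank percentile of a list of scores."""
    ordered = sorted(data)
    k = max(0, min(len(ordered) - 1, round(p / 100 * (len(ordered) - 1))))
    return ordered[k]

remediate_below = percentile(totals, 10)  # bottom decile: intervene
feature_above = percentile(totals, 95)    # P95: the standout tier
print(remediate_below, feature_above)
```

Once the cutoffs exist, "quality" stops being a debate and becomes a policy: below one line you remediate, above the other you celebrate.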
Then we built the feedback loop. Post-experience reviews validated the upfront vetting. Reviews drove customized education back to hosts. A circular quality engine.
Start to finish: roughly 2.5 years. Hundreds of people operating the machine.
How I'd Rebuild It Today
Three agents. A focused human team where it matters most.
Vetting Agent
Reviews host applications AND runs digital searches to validate claims. When there's a variance it can't resolve, it flags a human. Most of the time the human just pushes it forward.
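The flag-a-human flow could be sketched like this. The function name, inputs, and the agreement threshold are all assumptions, not a real implementation:

```python
def vet_host(application: dict, search_findings: dict, tolerance: float = 0.8) -> str:
    """Compare a host's claims against what a digital search turned up.
    Returns 'approve' or 'needs_human'."""
    matches, checked = 0, 0
    for claim, value in application.items():
        if claim not in search_findings:
            continue  # no external signal: neither confirms nor contradicts
        checked += 1
        if search_findings[claim] == value:
            matches += 1
    if checked == 0:
        return "needs_human"  # nothing verifiable: escalate
    if matches / checked >= tolerance:
        return "approve"      # claims line up with what we found
    return "needs_human"      # unresolved variance: flag a person

print(vet_host(
    {"years_teaching": 10, "city": "Tokyo"},
    {"years_teaching": 10, "city": "Tokyo"},
))  # approve
```

The design choice that matters: the agent never rejects on its own. It either approves or hands the variance to a human with the evidence attached.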
Quality Agent
Analyzes incoming reviews. Identifies what guests actually value. Feeds signal back to the vetting agent so "excellent" evolves in real time.
Remediation + Removal Agent
Identifies underperforming experiences, surfaces the data, and recommends action. But this is where the human team lives. Removing someone from a marketplace is a real decision with real impact on someone's livelihood. You do it with dignity. You do it well. You make sure you're making the right call. This agent arms the team with everything they need. The team makes the final call.
The real unlock: market-level orchestration. Our 2018 system couldn't account for nuance. A cooking class in Tokyo has different quality markers than one in Austin. International travelers want different things than locals. We knew that intuitively but couldn't operationalize it. With these three agents and an orchestration layer, each market gets its own evolving quality standard. Not static. Alive.
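A sketch of what "each market gets its own evolving standard" could mean mechanically: a per-market quality bar that drifts toward what local guests actually reward. The class, update rule, and numbers are all illustrative assumptions:

```python
from collections import defaultdict

class MarketStandard:
    """A per-market quality bar that follows real guest signal over time."""
    def __init__(self, start: float = 4.0, rate: float = 0.1):
        self.bar = start
        self.rate = rate

    def observe(self, review_score: float):
        # Exponential moving average: the standard is never static.
        self.bar += self.rate * (review_score - self.bar)

markets = defaultdict(MarketStandard)

# Hypothetical: one market's guests rate generously, another's rate high and hard.
for score in (4.8, 4.9, 4.7):
    markets["tokyo"].observe(score)
for score in (4.2, 4.0, 4.3):
    markets["austin"].observe(score)

print(markets["tokyo"].bar > markets["austin"].bar)  # True: different bars per market
```

The orchestration layer's job is just to route signal: the quality agent's findings update each market's bar, and the vetting and remediation agents read from that bar instead of a single global threshold.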
The system we spent 2.5 years discovering through iteration would now be the starting architecture. Not the end state.
What system did you spend years building by hand that could now be three agents and an orchestration layer?