80% Done: The Open-Source Playbook for Replacing BPO with Conversational Voice AI
Why open-core infrastructure beats SaaS lock-in, full custom builds, and BPO add-ons when enterprise teams want to own the eval loop behind voice AI support.
Start at 80% done
I recently heard a simple philosophy for building enterprise software:
"Start at 80% done, then figure out the 20% that actually matters."
That is exactly how I think about customer service infrastructure.
When companies try to modernize their contact center with conversational voice AI, they usually get pushed into one of four buckets:
1. SaaS that "just works"
Fast to buy, opaque to operate. You rent the stack, the data layer,
and often the improvement loop.
2. Build everything from scratch
Total control, but months of plumbing before you ship a useful eval.
3. BPO plus "AI services"
Operationally familiar, but misaligned. The business still runs on
labor hours, so automation stays shallow.
4. Open-core infrastructure
Start with the common building blocks, then own the business-specific
layer that actually differentiates your operation.
The fourth option is where I think the market is going.
Not for ideological reasons, but because it maps to how enterprise support actually works.
The false choice in conversational AI contact centers
Most enterprise teams are presented with a false choice.
Option A: Rent from a SaaS
The pitch is clean:
- upload your knowledge base
- connect your tools
- write your prompts
- go live
But once you look closely, many of these companies operate less like pure software and more like software plus services.
Behind the product layer, somebody is still tuning prompts, patching workflow logic, shaping integrations, and building customer-specific behavior. The problem is that you usually do not own any of it.
Your transcripts, your resolution patterns, and your operational learning loop all live inside somebody else's system.
Option B: Build from scratch
This sounds attractive if your team is technical and you hate vendor lock-in.
But building everything yourself means owning every boring layer too:
- telephony and voice infrastructure
- conversation storage
- escalations and inboxing
- observability
- eval execution
- tool integrations
You can easily spend 12 to 18 months on plumbing before you ship the first business-specific improvement loop.
Option C: Ask your BPO to "add AI"
This is the safest-looking route on paper.
You keep the incumbent relationship and ask the BPO to layer AI on top. Large operators are already packaging that story into every renewal cycle.
The incentive problem, though, is obvious.
When a BPO's revenue is tied to agent hours, it is never structurally motivated to automate the work away. At best, AI becomes a feature for winning RFPs. At worst, it becomes a thin automation layer around the same labor-heavy operating model.
You still do not own the system, and you still do not build a real feedback loop.
The 80% is infrastructure. The 20% is your advantage.
The part most companies obsess over is not actually the hard part.
The first 80% of a production-grade contact center AI stack is infrastructure:
- voice and channel orchestration
- transcript and conversation storage
- tool execution
- escalation paths
- observability
- inboxing across voice, chat, and email
That layer matters, but it is not where your strategic advantage lives.
The final 20% is where the value gets created:
- the evals that define what "good" looks like
- the failure taxonomy for your workflows
- the expert review process
- the improvement loop between operations and engineering
- the customer intelligence you extract from conversations
That 20% is your IP.
Nobody should be paying enterprise software margins forever for the generic 80% while giving away the part that actually teaches the system how their business works.
Why evals become the real IP
A lot of people still think the moat in AI systems is the prompt.
I do not buy that.
Over the last 30 years, companies translated business processes into forms, databases, back-office systems, APIs, and CRMs. Software became a mirror image of how the business operated.
Conversational systems force that translation to happen again.
This time, the process gets encoded as evals.
Evals tell you how a conversation should go, what counts as success, what should trigger escalation, what policy boundaries matter, and what failure patterns need to be caught before they hit customers.
In other words, every important workflow eventually becomes a test.
The longer you operate, the more of those tests you accumulate:
- refund flows
- identity checks
- policy edge cases
- compliance rules
- tone constraints
- resolution quality checks
A mature operation will not have one prompt. It will have hundreds or thousands of evals.
That is the asset.
When you build on open infrastructure, you keep that asset. When you hand the system to a SaaS vendor or an AI-enabled BPO, you usually do not.
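To make "every workflow becomes a test" concrete, here is a minimal sketch of one such eval in Python. The `Turn` type, the transcript shape, and the trigger phrases are all hypothetical placeholders; a real eval would run against your stored conversation format and your own policy definitions.

```python
from dataclasses import dataclass

@dataclass
class Turn:
    role: str   # "customer" or "agent"
    text: str

@dataclass
class EvalResult:
    name: str
    passed: bool
    detail: str

def eval_refund_requires_identity_check(transcript: list[Turn]) -> EvalResult:
    """Hypothetical policy eval: a refund must not be confirmed
    before the agent has verified the customer's identity."""
    verified = False
    for turn in transcript:
        text = turn.text.lower()
        if turn.role == "agent" and "verify" in text:
            verified = True
        if turn.role == "agent" and "refund is on its way" in text and not verified:
            return EvalResult("refund_requires_identity_check", False,
                              "refund confirmed before identity verification")
    return EvalResult("refund_requires_identity_check", True, "ok")
```

A mature suite is hundreds of functions like this, each encoding one piece of how your business actually handles a conversation.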
The pricing arbitrage is real
There is also a basic economic point here.
In the traditional model, you are paying for multiple layers at once:
- the underlying model cost
- the telephony and voice stack
- the workflow software
- the services margin
- the vendor's packaged risk premium
With open-core infrastructure, you pay for the commodity layers directly and keep control of the operating logic.
That changes the cost structure.
Closed model:
LLM cost + telephony + orchestration + services margin + lock-in premium
Open-core model:
LLM cost + telephony + orchestration you control
The exact savings depend on design choices, call duration, tool usage, and which speech stack you use.
But the important point is not the precise benchmark. It is that you stop paying a permanent markup on the part of the system that should increasingly look like infrastructure.
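To see the arbitrage as arithmetic, here is a toy cost model. Every rate below is an invented placeholder, not a vendor benchmark; only the structure (commodity layers plus an optional markup multiplier) reflects the argument above.

```python
# Hypothetical per-minute rates in dollars; purely illustrative.
STT_RATE = 0.010
TTS_RATE = 0.015
LLM_RATE = 0.020
TELEPHONY_RATE = 0.008

def cost_per_call(minutes: float, vendor_markup: float = 0.0) -> float:
    """Raw infrastructure cost per call, plus an optional markup
    multiplier standing in for services margin and lock-in premium."""
    base = minutes * (STT_RATE + TTS_RATE + LLM_RATE + TELEPHONY_RATE)
    return base * (1.0 + vendor_markup)

open_core = cost_per_call(6.0)                      # pay commodity layers directly
closed = cost_per_call(6.0, vendor_markup=2.0)      # e.g. a 3x all-in closed price
```

The markup multiplier is the point: it applies permanently, to every call, on layers that should increasingly be priced like infrastructure.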
What the 80% looks like in practice
If you want to replace a BPO with conversational voice AI, the architecture is not mysterious.
It usually looks something like this:
┌────────────────────────────────────────────────────────────────────────────┐
│                         VOICE LAYER (<500ms total)                         │
│                                                                            │
│  ┌───────────────┐    ┌───────────────┐    ┌───────────────┐               │
│  │      STT      │ -> │      LLM      │ -> │      TTS      │               │
│  │    ~150ms     │    │    ~250ms     │    │    ~100ms     │               │
│  └───────────────┘    └───────────────┘    └───────────────┘               │
│                                                                            │
│  Turn-taking / Interruption / VAD                                          │
└────────────────────────────────────────────────────────────────────────────┘
                                     |
                                     v
                                calls into
┌────────────────────────────────────────────────────────────────────────────┐
│                                 FOUNDATION                                 │
│                                                                            │
│ ┌───────────────┐  ┌───────────────┐  ┌───────────────┐  ┌───────────────┐ │
│ │     Read      │  │     Write     │  │  Guardrails   │  │     Evals     │ │
│ │ ERP, DB, Auth │  │   CRM, AR,    │  │  Compliance,  │  │ Test at scale │ │
│ │               │  │    Tickets    │  │      PII      │  │               │ │
│ └───────────────┘  └───────────────┘  └───────────────┘  └───────────────┘ │
│                                                                            │
│                ┌──────────────────────────────────────────┐                │
│                │             Human escalation             │                │
│                │     Handoff, agent assist, fallback      │                │
│                └──────────────────────────────────────────┘                │
└────────────────────────────────────────────────────────────────────────────┘
The voice layer is the fast part. The foundation is the part that actually makes the system useful, safe, and improvable.
The point is not to make the AI agent "magical." The point is to make the operating system around it measurable, debuggable, and improvable.
That is the difference between a demo and a production contact center.
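The voice layer above can be sketched as a single turn loop. The stage functions here are stand-in stubs (a real deployment would wire streaming STT/LLM/TTS clients in their place); the point is that the latency budget is enforced as an observable metric rather than a hope.

```python
import time

TURN_BUDGET_MS = 500  # end-to-end target from the diagram above

# Stand-in stages; real STT/LLM/TTS clients would replace these.
def stt(audio: bytes) -> str:
    return audio.decode("utf-8")      # pretend transcription

def llm(text: str) -> str:
    return f"echo: {text}"            # pretend response generation

def tts(text: str) -> bytes:
    return text.encode("utf-8")       # pretend synthesis

latency_breaches: list[float] = []

def handle_turn(audio: bytes) -> bytes:
    """One conversational turn: STT -> LLM -> TTS, measured against the budget."""
    start = time.monotonic()
    speech = tts(llm(stt(audio)))
    elapsed_ms = (time.monotonic() - start) * 1000
    if elapsed_ms > TURN_BUDGET_MS:
        latency_breaches.append(elapsed_ms)  # feed observability, not just logs
    return speech
```

Recording breaches instead of silently dropping them is what makes the stack debuggable in production.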
The improvement loop is the product
This is where the BPO model breaks.
In a traditional outsourced setup, the loop looks like this:
- Customer contacts support.
- An agent handles the case with varying quality.
- Some of the interaction gets logged.
- A little knowledge sticks in the agent's head.
- The agent leaves or performance drifts.
- The organization relearns the same lesson later.
Now compare that to an eval-driven system you own:
- Customer contacts support.
- AI handles the cases it should handle.
- The system logs transcripts, tool calls, and outcomes.
- Evals catch failure patterns and weak spots.
- Domain experts review the misses and write better tests.
- The system improves for the next interaction, not just the next training cycle.
Every failure becomes a reusable asset.
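That loop can be sketched as a small data model: every reviewed failure becomes a permanent regression case in the suite. `Failure`, `EvalSuite`, and the `check` callback are illustrative names, not a real framework.

```python
from dataclasses import dataclass, field

@dataclass
class Failure:
    conversation_id: str
    category: str        # an entry in your failure taxonomy
    reviewer_note: str

@dataclass
class EvalSuite:
    """Reviewed failures accumulate into permanent regression tests."""
    cases: list[Failure] = field(default_factory=list)

    def add_from_review(self, failure: Failure) -> None:
        self.cases.append(failure)

    def run(self, check) -> list[str]:
        # `check(case)` replays the scenario against the current system
        # and returns True if it now handles the case correctly.
        return [c.conversation_id for c in self.cases if not check(c)]
```

The suite only ever grows, which is exactly why it compounds: a lesson learned once stays learned.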
That is what companies should be buying when they say they want AI in the contact center: not a chatbot, not a shiny voice demo, but an operating loop that compounds.
Who this is for
This is not for every company.
If you handle a few hundred support tickets a month, you probably should not be building your own contact center stack. Use an off-the-shelf product and move on.
But if you are handling serious volume across voice, chat, and email, and your workflows are full of domain-specific edge cases, then ownership starts to matter.
You need:
- infrastructure you can inspect and control
- a place to store transcripts, tool traces, and review outcomes
- a way for operators and domain experts to define quality explicitly
- freedom to go beyond whatever workflow a SaaS vendor decided was "best practice"
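As a sketch of the second requirement, here is one possible shape for a stored conversation record. The field names are assumptions for illustration, not a schema from any particular product.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional

@dataclass
class ToolTrace:
    tool: str
    arguments: dict
    result: str

@dataclass
class ConversationRecord:
    """One stored interaction: the transcript plus everything needed to review it."""
    conversation_id: str
    channel: str                           # "voice", "chat", or "email"
    started_at: datetime
    transcript: list[str] = field(default_factory=list)
    tool_traces: list[ToolTrace] = field(default_factory=list)
    review_outcome: Optional[str] = None   # filled in by a domain expert
```

The detail that matters is that tool traces and review outcomes live next to the transcript, so the eval loop has everything it needs in one place.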
That is the real appeal of starting at 80% done.
You do not waste time rebuilding the commodity layer. You spend your energy on the layer that makes your business better than everyone else's.
What happens next
This is the direction we are building toward at ModelGuide.
The control plane should not be locked inside a vendor:
- the unified inbox
- the conversation store
- the eval framework
- the observability layer
Those pieces should be yours.
Then the part that actually matters, the 20% that captures your workflows, your quality bar, and your customer intelligence, can be built on top of infrastructure you control.
That is the playbook:
Start at 80% done. Own the 20% that matters. Stop renting the learning loop.
If you are running a contact center and you are tired of choosing between SaaS lock-in, custom-build drag, and BPO theater, this is the path worth looking at.