AI Build vs Buy
AI build-vs-buy is the most under-analyzed strategic decision in enterprise AI. The default instinct ("we'll build it ourselves to keep the IP") has produced more abandoned AI projects than any other single failure mode. The honest framework: BUY when the use case is non-differentiating (customer support automation, document extraction, transcription, code completion); vendors have spent hundreds of millions and you cannot match their data flywheel. BUILD only when the use case is BOTH (a) core to your competitive moat AND (b) reliant on proprietary data the vendor cannot access. Most enterprises should be 80% buy, 15% build-on-top-of-vendor (RAG, fine-tunes, custom workflows), 5% pure build. The right question is not "Should we build?" but "What is our durable advantage if we build?"
The Trap
The trap is the 'NIH fallacy' (Not Invented Here), driven by engineering pride and the misconception that owning the model equals owning the value. You don't. Vendors update models monthly with billions of dollars of R&D investment; your in-house model becomes obsolete within 6 months of launch. The second trap is the inverse: buying everything and waking up with 14 AI vendors, no integrated workflow, and a $4M/year SaaS bill that nobody can rationalize. The third trap is buying when you should be building-on-top: using a vendor RAG product when your competitive moat IS the unique knowledge graph you'd be feeding it. The right pattern is often hybrid: buy the model, build the proprietary glue.
What to Do
Run every AI initiative through this 4-question gate: (1) Is the underlying capability commoditizing? If yes → BUY. (2) Does our proprietary data unlock measurably better performance than off-the-shelf? If no → BUY. (3) Can a vendor charge us less than the fully-loaded cost of an internal team to maintain it? If yes → BUY. (4) Will we differentiate in the market on this capability or on what we DO with this capability? If 'what we do with it' → BUY the capability, BUILD the workflow. Reserve pure-build for use cases where all four answers point to build.
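The four-question gate reads naturally as a short decision function. A minimal sketch (the function name and the buy/build labels are illustrative, not from any library):

```python
def sourcing_gate(commoditizing, data_advantage, vendor_cheaper, differentiate_on_capability):
    """The 4-question build-vs-buy gate from the text.

    Pure build only when all four answers point to build;
    the default everywhere else is some form of buy.
    """
    if commoditizing:                    # Q1: capability is commoditizing
        return "BUY"
    if not data_advantage:               # Q2: no measurable edge from proprietary data
        return "BUY"
    if vendor_cheaper:                   # Q3: vendor beats fully-loaded internal cost
        return "BUY"
    if not differentiate_on_capability:  # Q4: differentiation lives in the workflow
        return "BUY capability, BUILD workflow"
    return "BUILD"

# Example: real data edge, but the vendor is still cheaper -> buy
print(sourcing_gate(False, True, True, True))  # BUY
```

Note the ordering: the cheapest questions to answer (commoditization, data advantage) come first, so most initiatives exit to "BUY" before anyone runs a TCO model.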
In Practice
Bloomberg's BloombergGPT, a 50B-parameter LLM trained on 363B tokens of financial data, is one of the few defensible 'pure build' decisions in enterprise AI. Bloomberg's competitive moat IS its 40-year proprietary financial data corpus, and the model's superior performance on financial NLP tasks (sentiment, named entity recognition, FinQA) directly extends Bloomberg Terminal's product moat. Most enterprises lack this combination of unique data scale and product-tied use case. A year later, Klarna went the opposite way: they bought OpenAI's API and shipped a 700-FTE-equivalent customer service AI in months. Both were correct decisions for their respective contexts.
Pro Tips
- 01
Compute the 5-year TCO honestly. Build TCO includes: 3-7 ML engineers at $400K fully loaded, infrastructure ($200K-$2M/year), data labeling, ongoing model retraining, security review, governance, and the 25-40% engineering attrition tax. Most 'build is cheaper' analyses understate true costs by 3-5x. A $2M/year vendor contract usually beats a 'cheap' in-house build that quietly costs $5-7M/year.
- 02
The 'build on top of buy' pattern is the most underrated. Buy the foundation model (OpenAI, Anthropic, Mistral). Build the orchestration, evaluation, prompt management, and proprietary data layer that you control. This gets you 80% of build's strategic value at 20% of the cost.
- 03
Always negotiate exit terms with AI vendors: data export rights, prompt portability, model-version pinning, and price-cap clauses on renewals. AI vendor pricing has risen 30-100% on renewal in many cases; your leverage is best at signing.
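Tip 01's TCO math can be made concrete. A sketch using midpoint figures from the tip (5 engineers at $400K, $1M/year infrastructure, $500K/year for labeling, retraining, security, and governance, 30% attrition tax; every default is an assumption you should replace with your own numbers):

```python
def build_tco_5yr(engineers=5, loaded_cost=400_000,
                  infra_per_year=1_000_000,
                  misc_per_year=500_000,   # labeling, retraining, security, governance
                  attrition_tax=0.30):
    """Fully loaded 5-year build cost using the ranges from tip 01."""
    # Attrition tax models backfill/ramp-up overhead on the team line.
    team = round(engineers * loaded_cost * (1 + attrition_tax))
    return 5 * (team + infra_per_year + misc_per_year)

def buy_tco_5yr(contract_per_year=2_000_000):
    """5-year vendor cost at a flat contract price."""
    return 5 * contract_per_year

print(build_tco_5yr(), buy_tco_5yr())  # 20500000 10000000
```

Even with mid-range assumptions, the "cheap" build runs roughly twice the $2M/year vendor contract over five years, which is the gap the tip warns most analyses hide.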
Myth vs Reality
Myth
"Building gives us defensible IP and a competitive moat"
Reality
The model is rarely the moat; the data, the workflow, the customer relationship, and the distribution are. Bloomberg's moat is the data, not the LLM. JPMorgan's COIN moat is the contract corpus, not the extraction model. If the model is your only proposed moat, you are building on sand: someone will release a better foundation model in 6 months and erase your advantage.
Myth
"Vendors will eventually price-gouge, so building protects us long-term"
Reality
Foundation model pricing has dropped 80-95% per token over the past 24 months and continues falling. Your in-house 'protection' is a stranded asset against an industry that is racing to commoditize compute. The right protection is contractual (price caps, exit rights), not architectural.
Try it
Run the numbers.
Pressure-test the concept against your own knowledge: answer the challenge or try the live scenario.
Knowledge Check
Your retail company is evaluating an AI product-recommendation engine. Vendors like Algolia and Constructor cost ~$300K/year. Engineering proposes building one for $1.2M Year-1, $400K Year-2+, claiming 'we own the IP.' What's your call?
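One way to start running the numbers on this check, using only the nominal figures in the prompt (the hidden build costs from the Pro Tips are deliberately excluded, so this is a lower bound on the build side):

```python
# Nominal 5-year totals from the prompt, before the 3-5x understatement
# the Pro Tips warn about on the build side.
buy = 5 * 300_000                # vendor at ~$300K/year
build = 1_200_000 + 4 * 400_000  # $1.2M Year-1, $400K Years 2-5
print(buy, build)  # 1500000 2800000
```

Even on engineering's own numbers, build costs nearly twice as much as buy, and product recommendations are a commoditizing capability, so question (1) of the gate already points to BUY before the hidden costs are counted.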
Industry benchmarks
Is your number good?
Calibrate against real-world tiers. Use these ranges as targets, not absolutes.
Enterprise AI Sourcing Mix (2024)
Enterprises with mature AI portfolios; based on Andreessen Horowitz and Menlo Ventures enterprise surveys
Pure Buy (vendor SaaS / APIs)
~75% of use cases
Hybrid (build on vendor foundation)
~20% of use cases
Pure Build (custom model from scratch)
~5% of use cases
Source: https://a16z.com/generative-ai-enterprise-2024/
Real-world cases
Companies that lived this.
Verified narratives with the numbers that prove (or break) the concept.
Bloomberg
2023
Bloomberg released BloombergGPT, a 50B-parameter LLM trained on 363B tokens spanning 40 years of proprietary financial data. The decision to build, at significant cost, was justified by the unique combination of (a) a moat built ON proprietary data the vendors cannot access, and (b) a product (Bloomberg Terminal) where AI extends an existing competitive position. The model outperformed comparable open-source models on financial NLP tasks like sentiment, NER, and FinQA.
Model Size
50B parameters
Training Data
363B tokens (financial domain)
Build Justification
Proprietary data + product moat
Performance Edge
Outperforms general LLMs on finance tasks
Pure build is justified ONLY when proprietary data + product positioning unite. Bloomberg has both. Most enterprises lack one or the other.
Klarna
2024
Klarna chose the opposite path: instead of building a proprietary customer-service model, they bought OpenAI's API and built the orchestration layer on top. In months, not years, they shipped a system handling 2.3M conversations, equivalent to 700 FTE agents. The differentiation lives in their workflow integration, training data on Klarna-specific intents, and the product experience, not the underlying model.
Sourcing Decision
Buy foundation, build workflow
Time to Production
Months (vs. years for build)
Conversations Handled
2.3M in first month
Reported Profit Lift
$40M annually
Buying the model and building the proprietary layer (data, workflow, integrations) is the highest-ROI pattern for most enterprises. The model is the engine; the differentiation is the car.
Decision scenario
The Build vs Buy Showdown
You're CTO of a $300M ARR vertical SaaS in healthcare. Your engineering VP wants to build a custom HIPAA-compliant LLM ($4M Year-1, $1.2M Year-2+). Your CRO wants to buy a vendor product ($600K/year). The board wants a recommendation by next Thursday.
Build Path
$4M Year-1, $1.2M ongoing
Buy Path
$600K/year
Use Case
Patient-facing GenAI summarization
Strategic Sensitivity
HIPAA, PHI handling
Decision 1
Your engineering VP argues that HIPAA and PHI sensitivity require a custom model. Your CRO argues that established vendors (Microsoft, AWS, Anthropic via BAA) already offer HIPAA-eligible APIs. What do you propose?
Approve the build. PHI is too sensitive to entrust to a vendor, and a custom model becomes a competitive moat.
Buy the foundation model under HIPAA-compliant BAA (Anthropic, Microsoft Azure OpenAI, AWS Bedrock). Build a proprietary workflow layer with: PHI handling controls, your specialized prompts, your evaluation harness, and your customer's clinical knowledge graphs. Total Year-1 spend: $900K. ✓ Optimal
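The five-year arithmetic behind the optimal answer, using the scenario's own figures (the assumption that the workflow layer's implied $300K build cost is Year-1-only is mine, not stated in the scenario):

```python
# Five-year totals for the healthcare scenario.
build = 4_000_000 + 4 * 1_200_000   # $4M Year-1, $1.2M Years 2-5
hybrid = 900_000 + 4 * 600_000      # $900K Year-1, then vendor fee only (assumption)
print(f"build ${build/1e6:.1f}M vs buy+build-on-top ${hybrid/1e6:.1f}M")
# build $8.8M vs buy+build-on-top $3.3M
```

The hybrid path costs less than 40% of the pure build over five years while keeping the differentiating assets (PHI controls, prompts, evals, knowledge graphs) in-house.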
Beyond the concept
Turn AI Build vs Buy into a live operating decision.
Use this concept as the framing layer, then move into a diagnostic if it maps directly to a current bottleneck.