AI Center of Excellence
An AI Center of Excellence is the small, central team that owns shared AI capabilities — platform, governance, evaluation, vendor management, training, and reusable patterns — while embedded AI talent in product teams owns the actual features. The CoE is a force multiplier, not a delivery org. Done right, a 6-12 person CoE supports 50-200 product engineers shipping AI features. Done wrong, it becomes either a bottleneck where every AI request queues, or an isolated R&D lab that produces papers and demos but no shipped product.
The Trap
Two common traps. First: the CoE becomes the single source of AI delivery. All AI requests funnel through them, the team gets overwhelmed, product teams resent the dependency, and AI velocity stalls. Second: the CoE has no clear scope. Some weeks they're consulting on prompts, other weeks they're rebuilding the data warehouse, other weeks they're presenting at conferences. Without a sharp 'we own X, we explicitly do not own Y' charter, the CoE becomes a generalist team optimizing nothing.
What to Do
Charter the CoE around 5 specific responsibilities: (1) the AI platform (eval, observability, deployment infra, model gateway), (2) governance and standards (templates, review processes, model registry), (3) shared services (prompt library, retrieval components, common evaluators), (4) enablement (training, residencies, office hours), and (5) vendor strategy (model contracts, cost optimization). Explicitly NOT the CoE's job: shipping product features (that's product teams), deep research (unless that's the company archetype), and approving every AI launch (that runs through the governance process, not the CoE). Run the CoE on quarterly OKRs tied to product team adoption metrics, not internal output metrics.
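If it helps to make the charter concrete, it can be written down as a reviewable artifact rather than a slide. The Python sketch below is illustrative, not a standard template: the five "owns" entries mirror the responsibilities above, the exclusions mirror the "not the CoE's job" list, and the OKR names and targets are assumptions about how adoption might be measured.

```python
# Illustrative only: a CoE charter captured as data so that scope and OKRs
# can be reviewed, versioned, and argued over explicitly.
COE_CHARTER = {
    "owns": [
        "AI platform (eval, observability, deployment infra, model gateway)",
        "Governance and standards (templates, review processes, model registry)",
        "Shared services (prompt library, retrieval components, common evaluators)",
        "Enablement (training, residencies, office hours)",
        "Vendor strategy (model contracts, cost optimization)",
    ],
    "explicitly_does_not_own": [
        "Shipping product features",   # product teams own features
        "Deep research",               # unless that's the company archetype
        "Approving every AI launch",   # handled by the governance process
    ],
    # Quarterly OKRs tied to product-team adoption, not internal output.
    "quarterly_okrs": {
        "product_teams_shipping_on_the_platform": 10,      # example target
        "self_service_rate_of_shipped_ai_features": 0.50,  # example target; see Formula below
    },
}
```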
Formula
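The core diagnostic used throughout this page can be stated as a simple ratio (one reasonable formalization, not the only one):

Self-service rate = AI features shipped without direct CoE involvement ÷ total AI features shipped in the period

Read against the maturity benchmarks below: a sustained rate above roughly 50% signals a productized platform; a rate near zero signals a delivery team or a lab, whatever the org chart says.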
In Practice
JPMorgan's COiN (Contract Intelligence) team and broader AI platform group function as a CoE serving the firm's lines of business. McKinsey, Bain, and BCG all maintain firm-wide AI CoEs that build internal tools, evaluate vendors, and train consultants. Inside tech companies, the pattern shows up as 'AI Platform' or 'Foundation' teams (Meta's GenAI Infra, Uber's Michelangelo) supporting product orgs. The common pattern: the CoE owns infrastructure and standards, product teams own features.
Pro Tips
- 01
The strongest leading indicator of CoE health is how many product teams are shipping AI features WITHOUT direct CoE involvement, using only the CoE-built platform. If every AI feature still requires a CoE engineer, the CoE has not productized anything — it's just a shared services team. (A sketch of this calculation follows these tips.)
- 02
Embed CoE engineers into product teams on rotation (8-12 weeks) instead of permanent staffing. Rotations spread expertise, force the platform to handle real production needs, and prevent the CoE from becoming an ivory tower. Permanent embeds eventually go native and stop contributing back to the platform.
- 03
Publish a quarterly 'CoE roadmap' visible to all product teams showing what platform features are coming. Without this, every product team starts building their own evaluation harness, prompt library, and observability tooling — duplicating what the CoE will deliver in 8 weeks.
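Pro Tip 01 can be turned into a periodic check. Below is a minimal Python sketch, assuming you track, per shipped AI feature, whether it needed a CoE engineer beyond the platform itself; the feature names are made up, and only the >50% threshold is taken from the maturity benchmark further down.

```python
from dataclasses import dataclass

@dataclass
class AIFeature:
    name: str
    shipped: bool
    needed_coe_engineer: bool  # direct CoE staffing beyond using the platform

def self_service_rate(features: list[AIFeature]) -> float:
    """Share of shipped AI features built with the CoE platform alone."""
    shipped = [f for f in features if f.shipped]
    if not shipped:
        return 0.0
    self_served = sum(1 for f in shipped if not f.needed_coe_engineer)
    return self_served / len(shipped)

# Illustrative quarter: 3 of 4 shipped features used only the platform.
quarter = [
    AIFeature("support-copilot", shipped=True, needed_coe_engineer=False),
    AIFeature("invoice-extraction", shipped=True, needed_coe_engineer=True),
    AIFeature("search-reranking", shipped=True, needed_coe_engineer=False),
    AIFeature("sales-email-drafts", shipped=True, needed_coe_engineer=False),
]

rate = self_service_rate(quarter)
# The >50% line comes from the maturity benchmark; treat it as a target, not an absolute.
verdict = "platform is productized" if rate > 0.5 else "platform not yet productized"
print(f"Self-service rate: {rate:.0%} ({verdict})")
```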
Myth vs Reality
Myth
“Centralizing AI talent in a CoE produces better outcomes than distributing it”
Reality
Pure centralization creates a bottleneck and disengaged product teams. Pure decentralization creates duplication and inconsistent quality. The hub-and-spoke model — small central CoE plus distributed embedded talent — outperforms both extremes for most enterprises. The CoE multiplies the spokes; it doesn't replace them.
Myth
“A CoE needs senior AI researchers to be credible”
Reality
A CoE needs senior platform engineers, MLOps engineers, and one senior applied scientist for hard cases. Researchers without research problems get bored and leave. Platform credibility comes from shipping reliable infrastructure that product teams actually use, not from publication credentials.
Try it
Run the numbers.
Pressure-test the concept against your own knowledge — answer the challenge or try the live scenario.
Knowledge Check
An AI Center of Excellence is 12 months old. Which metric BEST indicates it has been successful?
Industry benchmarks
Is your number good?
Calibrate against real-world tiers. Use these ranges as targets — not absolutes.
AI CoE Maturity
Enterprise AI CoEs supporting 5+ product teams
Mature: Hub-and-spoke, productized platform, >50% of AI features shipped without CoE engineer
Functional: Defined scope, shared platform, but heavy CoE involvement still required
Bottleneck: All AI requests queue through CoE, product teams frustrated
Lab: CoE produces research/demos but no production adoption
Source: Andrew Ng AI Transformation Playbook + observed enterprise patterns
Real-world cases
Companies that lived this.
Verified narratives with the numbers that prove (or break) the concept.
Uber Michelangelo
2017-present
Uber's Michelangelo is the company's internal ML platform — model registry, training pipelines, feature store, deployment, monitoring. The platform team is small relative to the population of ML engineers in product groups (pricing, ETA, fraud, marketplace). Michelangelo is the canonical example of an AI CoE that succeeds by productizing the platform: product teams ship models without needing to talk to the platform team for each launch. Uber has published the architecture publicly, and the model has been broadly imitated.
Pattern: Platform team + distributed model owners
Models in Production: Thousands across the company
Platform Self-Service Rate: High
The mark of a successful AI CoE is what you do NOT see — product teams shipping AI features without needing to talk to the central team. That self-service ratio is the diagnostic.
Hypothetical: Industrial Manufacturer CoE
Composite scenario
A $6B industrial manufacturer chartered an 18-person AI CoE to lead AI adoption across 9 business units. After 24 months, the CoE had produced 14 proof-of-concepts, of which 2 reached production. Investigation showed the CoE was structured as a delivery team, not a platform team — they tried to build features for every business unit themselves. They had no platform, no enablement program, and no embedded talent in BUs. The CEO restructured: 6 CoE members went to BUs as embedded leads, 6 stayed central and built a platform, and 6 launched a 200-person enablement program. Within 12 months, production AI deployments grew from 2 to 31.
Original CoE Production Deployments (24 mo): 2
Restructured CoE Production Deployments (12 mo): 31
Enablement Population: 200 trained engineers
A CoE that delivers features instead of platforms can never scale. The transition from delivery org to platform org is the single most important step in CoE maturation.
Decision scenario
Reorganizing the CoE for Scale
You inherit an 18-month-old AI CoE as the new VP. The CoE has 14 people and a $6M budget. It has produced 22 demos and 3 production features. Product teams complain the CoE is a bottleneck; the CoE complains product teams 'won't engage properly.' The CEO has given you 90 days to present a restructure plan.
CoE Headcount: 14
Annual Budget: $6M
Production AI Features (18 mo): 3
Product Teams Engaged: 5 of 22
CoE Satisfaction Score: 32/100 (poor)
Decision 1
First decision: should the CoE continue to deliver AI features end-to-end for product teams, or transition to a platform-only model?
Continue end-to-end delivery but expand the CoE to 25 people to meet demand
Transition to platform-only: CoE owns infrastructure, governance, and enablement; product teams own features and embed dedicated AI engineers (hired or upskilled) ✓ Optimal
Decision 2
Second decision: how to handle the existing AI work backlog of 40+ requests already queued with the CoE.
Power through the backlog before transitioning so product teams aren't left hanging
Triage the backlog: the top 5 requests get delivered as 'last-mile' projects, the rest are returned to product teams with enablement support and a 'platform-coming-soon' roadmap ✓ Optimal
Beyond the concept
Turn AI Center of Excellence into a live operating decision.
Use this concept as the framing layer, then move into a diagnostic if it maps directly to a current bottleneck.