AI Knowledge Graph
A knowledge graph is a structured representation of entities (customers, products, contracts, employees, accounts) and the typed relationships between them — modeled in a graph database (Neo4j, TigerGraph, Neptune) or layered onto a vector store as GraphRAG. The bet is that many enterprise questions are not 'find documents like this' but 'find the path between this customer and that contract clause through these three intermediary entities.' Vector search can't answer multi-hop questions; a knowledge graph can. Microsoft's GraphRAG paper (2024) showed a 70-80% improvement on multi-hop reasoning vs naive RAG on the same corpus. The catch: the graph is only as good as the entity extraction and relationship modeling pipeline that builds it. Most enterprise knowledge graph projects die in the schema-design phase.
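A minimal sketch of what 'find the path' means in practice — a toy triple store with hypothetical contract data and a breadth-first path search. No real graph database or embedding model here; the point is that the answer is the path itself, which no similarity score can produce:

```python
from collections import deque

# Toy typed graph as (subject, relation, object) triples — hypothetical data.
triples = [
    ("Acme Corp", "SIGNED", "MSA-2021"),
    ("MSA-2021", "AMENDED_BY", "Amendment-7"),
    ("Amendment-7", "REFERENCES", "Clause 4.2"),
    ("Beta LLC", "SIGNED", "NDA-9"),
]

# Adjacency list: node -> [(relation, neighbor), ...]
adj = {}
for s, r, o in triples:
    adj.setdefault(s, []).append((r, o))

def find_path(start, goal):
    """BFS for a relationship path from start to goal; None if unreachable."""
    queue = deque([(start, [start])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for rel, nxt in adj.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [f"-{rel}->", nxt]))
    return None

print(find_path("Acme Corp", "Clause 4.2"))
# → ['Acme Corp', '-SIGNED->', 'MSA-2021', '-AMENDED_BY->', 'Amendment-7', '-REFERENCES->', 'Clause 4.2']
```

In a production system the same query would be a Cypher or Gremlin traversal; the three-hop chain from customer to clause is exactly the shape vector retrieval cannot return.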
The Trap
The trap is treating the knowledge graph as a database project rather than a corpus discipline project. Teams spend 9 months designing the perfect ontology, build a graph with 200 entity types and 500 relationship types, and then discover that 80% of their source data doesn't cleanly map to it. The KnowMBA POV: knowledge graphs work when you have the corpus discipline. If your source documents don't have consistent entity references, your graph will be a sparse, contradictory mess that performs worse than vector RAG on the same corpus. Start with a narrow ontology covering one domain, validate that extraction works at 90%+ precision, then expand. Don't pre-design the graph for problems you don't have yet.
What to Do
Phase the rollout. (1) Pick ONE domain with high-value multi-hop questions: e.g., 'which contracts reference this clause that affects this customer cohort.' (2) Define a minimal ontology — 5-15 entity types, 10-30 relationship types, no more. (3) Build the extraction pipeline (LLM + rules + human-in-the-loop QA on a 200-document gold set). Measure precision/recall per entity type. (4) Wire it to the LLM as a retrieval tool — the LLM queries the graph for entity paths, then composes the answer. (5) Expand only after the first domain is proving ROI in production. Always measure: % of queries that benefit from graph vs vector retrieval. If less than 30% benefit, you don't need the graph.
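Step (3) above — measuring extraction precision/recall per entity type against a gold set — can be sketched as follows. The entity tuples and labels are hypothetical; the shape of the check is what matters:

```python
from collections import defaultdict

def per_type_metrics(gold, predicted):
    """Precision/recall per entity type.
    gold and predicted are sets of (doc_id, entity_type, surface_form)."""
    tp, fp, fn = defaultdict(int), defaultdict(int), defaultdict(int)
    for item in predicted:
        (tp if item in gold else fp)[item[1]] += 1   # item[1] is the entity type
    for item in gold - predicted:
        fn[item[1]] += 1
    metrics = {}
    for etype in set(tp) | set(fp) | set(fn):
        p = tp[etype] / (tp[etype] + fp[etype]) if tp[etype] + fp[etype] else 0.0
        r = tp[etype] / (tp[etype] + fn[etype]) if tp[etype] + fn[etype] else 0.0
        metrics[etype] = {"precision": p, "recall": r}
    return metrics

# Hypothetical gold annotations vs pipeline output on two documents.
gold = {(1, "Customer", "Acme Corp"), (1, "Contract", "MSA-2021"), (2, "Customer", "Beta LLC")}
pred = {(1, "Customer", "Acme Corp"), (1, "Contract", "MSA-21"), (2, "Customer", "Beta LLC")}
print(per_type_metrics(gold, pred))
```

Run this per entity type on the 200-document gold set; any type below the 90% precision bar either gets better extraction rules or gets cut from the ontology.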
In Practice
Microsoft Research published GraphRAG in 2024, demonstrating that LLM-extracted knowledge graphs improved multi-hop QA accuracy by 70-80% on a Russian-Ukraine news corpus vs naive RAG. Neo4j's customer base (NASA, NewYork-Presbyterian, Adobe, Allianz) uses graphs for fraud detection, supply chain risk, and clinical knowledge integration. LinkedIn's economic graph models 1B+ professionals and their employment relationships — the foundation for nearly every product feature. The pattern: graphs win when relationship traversal is the dominant query pattern.
Pro Tips
- 01
Microsoft's GraphRAG runs LLM-based entity and relationship extraction at index time, not query time. The cost shows up as a one-time indexing bill (often 5-20× the cost of naive RAG indexing) but query-time cost stays comparable. Budget the indexing cost explicitly.
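Budgeting that one-time indexing bill explicitly can be as simple as the arithmetic below. All inputs are hypothetical placeholders — substitute your corpus size, average document length, number of LLM passes, and your model's per-million-token price:

```python
def graphrag_index_budget(docs, tokens_per_doc, passes, usd_per_mtok):
    """One-time indexing cost estimate: each document gets several LLM
    passes (entity extraction, relationship extraction, summarization)."""
    total_tokens = docs * tokens_per_doc * passes
    return total_tokens / 1_000_000 * usd_per_mtok

# Hypothetical: 200K docs, ~3K tokens each, 3 passes, $1 per 1M tokens.
cost = graphrag_index_budget(docs=200_000, tokens_per_doc=3_000, passes=3, usd_per_mtok=1.0)
print(f"${cost:,.0f}")  # $1,800
```

The useful output is not the dollar figure itself but the sensitivity: passes and tokens-per-doc multiply, which is where the 5-20× premium over naive RAG indexing comes from.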
- 02
Hybrid retrieval beats pure graph almost always. The pattern that works: vector search to find the relevant subgraph, then graph traversal within that subgraph to answer the multi-hop part. Pure graph queries miss semantic similarity; pure vector misses path structure.
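A toy sketch of that vector-then-graph pattern, with made-up 2-dimensional embeddings standing in for a real embedding model: step 1 narrows to the k most query-similar nodes, step 2 traverses only edges inside that subgraph.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Hypothetical node embeddings (in practice, from your embedding model).
node_vecs = {
    "Acme Corp":  [0.9, 0.1],
    "MSA-2021":   [0.8, 0.3],
    "Clause 4.2": [0.7, 0.4],
    "Cafeteria":  [0.0, 1.0],   # semantically irrelevant node
}
edges = {("Acme Corp", "MSA-2021"), ("MSA-2021", "Clause 4.2"), ("Cafeteria", "Acme Corp")}

def hybrid_retrieve(query_vec, k=3):
    # Step 1: vector search narrows scope to the k most similar nodes.
    subgraph = set(sorted(node_vecs, key=lambda n: -cosine(node_vecs[n], query_vec))[:k])
    # Step 2: graph traversal only over edges whose endpoints survived step 1.
    return sorted((a, b) for a, b in edges if a in subgraph and b in subgraph)

print(hybrid_retrieve([1.0, 0.2]))
# → [('Acme Corp', 'MSA-2021'), ('MSA-2021', 'Clause 4.2')]
```

The irrelevant "Cafeteria" edge is pruned by the vector step before traversal ever sees it — the pure-graph failure mode (no semantic filter) and the pure-vector failure mode (no path structure) are each covered by the other stage.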
- 03
Graphs require entity resolution. 'Acme Corp', 'Acme Corporation', 'ACME Corp.', and 'Acme Inc.' must collapse to one node. Without entity resolution, your graph fragments into thousands of duplicate nodes and traversal returns nothing. Budget 30-40% of project effort for entity resolution.
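A minimal normalization pass for exactly the 'Acme' case above — stripping legal suffixes and punctuation to produce a blocking key. This is the first 10% of entity resolution, not the whole job; a real pipeline layers fuzzy matching and human review on top:

```python
import re

# Legal-form suffixes to strip (extend for your jurisdictions).
SUFFIXES = re.compile(r"\b(corp(oration)?|inc|ltd|llc|co)\b\.?", re.IGNORECASE)

def canonical_key(name):
    """Blocking key: lowercase, drop legal suffixes and punctuation so
    surface variants collapse onto one candidate node."""
    key = SUFFIXES.sub("", name.lower())
    return re.sub(r"[^a-z0-9]+", "", key)

variants = ["Acme Corp", "Acme Corporation", "ACME Corp.", "Acme Inc."]
print({canonical_key(v) for v in variants})  # {'acme'}
```

All four variants collapse to one key, so traversal sees one node instead of four fragments. Note the trade-off: aggressive normalization risks false merges ('Acme Inc.' and 'Acme Corp' may genuinely be different legal entities), which is why the human-in-the-loop QA from the rollout plan matters here too.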
Myth vs Reality
Myth
“Knowledge graphs replace vector search”
Reality
They complement it. Vector search handles 'find me documents about X.' Graphs handle 'find the path from X to Y through Z.' Production systems run hybrid retrieval: vector first to narrow scope, graph to traverse relationships. Replacing vector with graph is almost always a regression on broad questions.
Myth
“LLMs eliminate the need for ontology design”
Reality
LLM-driven extraction reduces the labor of building a graph but does not remove the need to define what entity and relationship types you care about. Without an ontology, LLM extraction generates inconsistent labels (Person, Human, Individual all extracted from the same corpus) and the graph becomes unqueryable.
Knowledge Check
Your team built a knowledge graph with 180 entity types and 400 relationship types after 8 months of ontology design. Production rollout shows the graph is sparse — most extracted entities only have 1-2 connections, and 70% of LLM queries get better results from plain vector RAG. What's the most likely root cause?
Industry benchmarks
Is your number good?
Calibrate against real-world tiers. Use these ranges as targets — not absolutes.
Multi-hop QA Lift from Knowledge Graph (vs Naive RAG)
Multi-hop reasoning queries on enterprise corpora
Strong Lift
> 50%
Meaningful Lift
25-50%
Marginal Lift
10-25%
No Lift — Don't Build the Graph
< 10%
Source: Microsoft Research, GraphRAG paper (April 2024)
Real-world cases
Companies that lived this.
Verified narratives with the numbers that prove (or break) the concept.
Microsoft Research (GraphRAG)
2024
Microsoft Research published GraphRAG, an open-source approach that uses an LLM to extract entities and relationships from a corpus, builds a hierarchical community graph, and uses graph traversal to answer multi-hop questions. On a Russian-Ukraine news corpus, GraphRAG outperformed naive vector RAG by 70-80% on multi-hop questions and matched it on simple lookups. The cost trade-off: indexing is significantly more expensive (multiple LLM passes per document) but query-time cost stays similar.
Multi-hop Accuracy Lift
70-80% vs naive RAG
Indexing Cost Increase
5-20× vs naive RAG
Query Cost
Comparable to naive RAG
Open Source
Yes (Apache 2.0)
Graphs win on multi-hop reasoning because path structure is the answer, not document similarity. The economics work when the corpus is stable enough that indexing cost amortizes over many queries.
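That amortization claim can be made concrete with a back-of-envelope calculation. All figures below are hypothetical — plug in your own indexing bills and query volume:

```python
# Hypothetical economics: $45K GraphRAG indexing vs $3K naive RAG indexing.
# Query-time cost is comparable (per the GraphRAG results), so the premium
# is the one-time indexing delta, amortized over query volume.
indexing_premium = 45_000 - 3_000      # $42K extra, paid once per index build
queries_per_month = 10_000
months = 12                            # corpus assumed stable for a year
premium_per_query = indexing_premium / (queries_per_month * months)
print(f"${premium_per_query:.3f} per query")  # $0.350 per query
```

The 'stable corpus' assumption is the load-bearing one: if documents churn fast enough to force frequent re-indexing, the premium is paid repeatedly and the economics invert.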
Neo4j Customer Programs
2010-2026
Neo4j powers knowledge graphs at NASA (lessons-learned across mission archives), NewYork-Presbyterian (clinical knowledge integration), Allianz (insurance fraud detection), and Adobe (digital asset relationships). The consistent pattern across customer wins: the graph solves a problem that's fundamentally about relationships between entities, not document content. NASA's lessons-learned graph reportedly saves engineering teams weeks of search time per project by surfacing prior mission decisions linked to current design choices.
Notable Customers
NASA, Allianz, Adobe, NYP
Common Use Cases
Fraud, lineage, lessons-learned
Reported NASA Use
Lessons-learned across missions
Knowledge graphs are not a general AI solution; they are a specific solution for relationship-heavy domains. Pick the use case where the question 'how is this connected to that' is the dominant query pattern.
Decision scenario
Build the Graph or Stay on Vector RAG?
You're VP of AI at a 3,000-person legal services firm. The legal team wants better contract intelligence. Naive RAG works for 'find similar clauses' but fails on 'which contracts reference clauses that conflict with this customer's master agreement, accounting for amendments.' Your AI lead proposes a $1.2M, 9-month knowledge graph project covering 200,000 contracts.
Corpus Size
200,000 contracts
Naive RAG Multi-hop Accuracy
55%
Estimated Graph Project Cost
$1.2M
Estimated Graph Annual Cost
$400K
Multi-hop Queries / Month
~800
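Before deciding, it's worth running the scenario's own numbers. The accuracy lift is an assumption (say the graph raises multi-hop accuracy from the stated 55% to 85%, roughly in line with the GraphRAG-style improvements cited above); everything else is from the scenario:

```python
# Back-of-envelope on the scenario figures; the 85% target is an assumption.
build_cost, annual_cost = 1_200_000, 400_000
queries_per_year = 800 * 12                       # 9,600 multi-hop queries
extra_correct = queries_per_year * (0.85 - 0.55)  # newly correct answers / yr
year1_cost_per_win = (build_cost + annual_cost) / extra_correct
print(f"${year1_cost_per_win:,.0f} per additional correct answer in year 1")
# → $556 per additional correct answer in year 1
```

At legal-review billing rates that may pencil out — but the figure is hostage to the assumed lift, which is exactly why validating extraction quality on a narrow scope first de-risks the $1.2M bet.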
Decision 1
The AI lead proposes a 250-entity-type, 600-relationship-type ontology covering every contract concept. The legal ops team wants this scoped to one practice area first. You have to decide the scope.
Approve the full ontology — better to build the comprehensive graph once than rework it later
Scope to one practice area (e.g., master agreements + amendments) with a 12-entity-type ontology. Validate ROI at 6 months before expanding. ✓ Optimal
Related concepts
Keep connecting.
The concepts that orbit this one — each one sharpens the others.
Beyond the concept
Turn AI Knowledge Graph into a live operating decision.
Use this concept as the framing layer, then move into a diagnostic if it maps directly to a current bottleneck.
Typical response time: 24h · No retainer required