
Tech Debt Prioritization

Tech debt prioritization is the discipline of deciding which engineering shortcuts, legacy systems, and architectural compromises to fix, ignore, or work around, and in what order. The metaphor (Ward Cunningham, 1992) is that bad code is a loan: you can ship faster now, but you pay 'interest' in the form of slower future delivery. Like financial debt, not all tech debt is bad: some is strategic (intentional, paid down deliberately), some is tolerable (low interest, doesn't block work), and some is malignant (compounds, blocks every change). The job is not to eliminate all debt; it's to recognize which debt is actively destroying velocity and to invest in retiring those specific items while ignoring the cosmetic stuff. Most engineering orgs are simultaneously over-investing in low-impact refactors and under-investing in the 2-3 systems that are eating the team.

Also known as: Technical Debt Management, Engineering Debt, Tech Debt Backlog, Code Debt Triage, Refactoring Prioritization

The Trap

The trap is the 'big rewrite.' Every 3-5 years, an engineering org decides the legacy system is too painful and pitches a 12-24 month rewrite. The vast majority of these rewrites take 2-3x longer than estimated, ship a system with the same business logic but new bugs, and lose 12+ months of business velocity. Joel Spolsky's 'Things You Should Never Do, Part I' (2000) on the Netscape rewrite is still the canonical warning. The other trap: tech debt that gets logged in a Jira backlog where it dies. Debt items without an explicit owner, an interest-cost estimate, and an explicit kill-or-fund decision become organizational noise: a constantly growing pile that no one actually triages.

What to Do

Run a quarterly tech debt triage with three columns: (1) Bleeding (debt that is actively slowing every change in the affected area; fix in the next 1-2 quarters). (2) Watching (debt that's annoying but not blocking; track but don't fund yet). (3) Living With (debt in stable systems that aren't being touched; explicitly accept and stop debating). Estimate each item's 'interest cost' in engineering days/week of slowed delivery. Fund the top 3-5 Bleeding items as first-class roadmap items, not '20% time.' Set an explicit policy: every team commits 15-25% of capacity to debt reduction; if they're under 10% they're accumulating; over 30% they may be over-engineering.
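The triage-and-fund step can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed tool: the `DebtItem` fields, the column labels, and the sample backlog are all hypothetical, and it assumes each item already has an interest-cost estimate in engineer days/week.

```python
from dataclasses import dataclass


@dataclass
class DebtItem:
    name: str                    # hypothetical label for the debt item
    column: str                  # "bleeding", "watching", or "living_with"
    days_per_week_slowed: float  # estimated interest cost, engineer days/week


def pick_funded(items, limit=5):
    """Return up to `limit` Bleeding items, highest interest cost first."""
    bleeding = [item for item in items if item.column == "bleeding"]
    bleeding.sort(key=lambda item: item.days_per_week_slowed, reverse=True)
    return bleeding[:limit]


# Hypothetical backlog, for illustration only.
backlog = [
    DebtItem("legacy auth module", "bleeding", 1.5),
    DebtItem("billing service test flakiness", "bleeding", 3.0),
    DebtItem("inconsistent logging format", "watching", 0.5),
    DebtItem("old reporting cron jobs", "living_with", 0.0),
]

funded = pick_funded(backlog, limit=3)
print([item.name for item in funded])
```

Watching and Living With items never reach the funded list, no matter how many of them sit in the backlog; only the Bleeding column competes for roadmap slots.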

Formula

Tech Debt Interest Cost (per item) = Engineer Days/Week Slowed × 50 weeks × Avg Loaded Cost/Day

ROI of Fix = (Annual Interest Cost × Years of Future Use) − Cost of Fix
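As a worked example of the formula, here is a small Python sketch. The function names and all input figures (2 days/week slowed, a loaded cost of 800 per day, 3 years of future use, a fix costing 120,000) are made-up assumptions chosen only to show the arithmetic.

```python
WORK_WEEKS_PER_YEAR = 50  # the constant used in the formula


def annual_interest_cost(days_per_week_slowed, loaded_cost_per_day):
    """Engineer Days/Week Slowed x 50 weeks x Avg Loaded Cost/Day."""
    return days_per_week_slowed * WORK_WEEKS_PER_YEAR * loaded_cost_per_day


def roi_of_fix(annual_interest, years_of_future_use, cost_of_fix):
    """(Annual Interest Cost x Years of Future Use) - Cost of Fix."""
    return annual_interest * years_of_future_use - cost_of_fix


# Hypothetical inputs, for illustration only.
interest = annual_interest_cost(days_per_week_slowed=2, loaded_cost_per_day=800)
roi = roi_of_fix(interest, years_of_future_use=3, cost_of_fix=120_000)

print(interest)  # 2 * 50 * 800 = 80000
print(roi)       # 80000 * 3 - 120000 = 120000
```

A negative result means the fix costs more than the interest it removes over the system's remaining life, which is a reasonable argument for leaving the item alone.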

In Practice

Stripe publicly documented their multi-year API versioning and database migration discipline as a model of strategic tech debt management: they recognized that breaking the public API was unacceptable, so they invested heavily in compatibility layers and gradual migration tooling rather than a big rewrite. As a counterexample to the rewrite trap, Twitter's early-2010s move from Ruby on Rails to the JVM was successful in part because it was incremental (service-by-service), explicitly scoped, and tied to specific scaling pain. It is the rare big-architectural-change story that didn't end in disaster. Most rewrites end like Netscape's: shipped years late, with the company having lost market position during the rewrite window.

Pro Tips

  • 01

    Stop calling all engineering work 'tech debt.' Refactoring code that nobody is touching is not paying down debt; it's making yourself feel productive. Debt that isn't blocking active work is not debt; it's just old code.

  • 02

    The single best predictor of which tech debt to fix is 'how often is this code touched?' Bleeding debt in a hot file (touched weekly) costs 100x more than identical debt in a cold file (touched yearly). Use change frequency from git history (how often a file shows up in commits) as a triage signal.

  • 03

    Tie tech debt to business outcomes. 'Refactor billing service' is a tough sell. 'Reduce billing-related incident MTTR by 4 hours and unblock the new pricing tier launch' is fundable. Translate engineering pain into business impact.
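The touch-frequency signal from tip 02 can be approximated from commit history. The sketch below is one possible approach, not a standard tool: it counts how many commits touched each file using `git log --name-only`, and the helper names, the one-year window, and the top-10 cutoff are arbitrary assumptions.

```python
import subprocess
from collections import Counter


def count_touches(log_output):
    """Count commits per path from `git log --name-only --pretty=format:` output.

    With an empty pretty format, git prints only the changed paths of each
    commit, separated by blank lines, so every non-blank line is one touch.
    """
    return Counter(line.strip() for line in log_output.splitlines() if line.strip())


def hot_files(repo_path=".", since="1 year ago", top=10):
    """Return the `top` most frequently changed files in the given window."""
    result = subprocess.run(
        ["git", "-C", repo_path, "log", f"--since={since}",
         "--name-only", "--pretty=format:"],
        capture_output=True, text=True, check=True,
    )
    return count_touches(result.stdout).most_common(top)


# Hypothetical log output, for illustration only (three commits).
sample = "\nbilling/invoice.py\nbilling/tax.py\n\nbilling/invoice.py\n\nREADME.md\n"
print(count_touches(sample).most_common(1))
```

Running `hot_files()` inside a repository gives a ranked list to cross-check against the debt backlog: debt that sits in the top entries is the likeliest Bleeding candidate, while debt in files that never appear is a candidate for Living With.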

Myth vs Reality

Myth

"All tech debt should eventually be paid down"

Reality

Some debt should NEVER be paid. Code in stable, low-touch systems can stay messy forever; paying it down has zero ROI. The goal is not zero debt; it's controlled debt where the interest cost is bounded.

Myth

"A big rewrite will solve our tech debt"

Reality

Big rewrites are the most expensive way to manage tech debt. They take 2-3x estimates, ship with new bugs, and lose business velocity for the duration. Strangler-fig (incremental replacement) works far more often than big-bang rewrite, both in research and in practice.


Knowledge Check

An engineering team has 200 logged tech debt items. They allocate 20% of every sprint to 'tech debt work' and rotate items off the bottom of the backlog. After 18 months, velocity hasn't improved and the team complains the codebase is worse than ever. What's the most likely root cause?

Industry benchmarks

Is your number good?

Calibrate against real-world tiers. Use these ranges as targets, not absolutes.

Engineering Capacity Allocated to Tech Debt (mid-to-large engineering orgs)

Healthy Steady-State: 15-25%
Catching Up After Neglect: 25-35%
Accumulating Debt Faster Than Paying: 5-15%
Heading for a Crisis: < 5%
Over-Engineering / Rewrite-itis: > 35% sustained

Source: GitHub Octoverse / Stack Overflow Developer Survey aggregated trends
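For teams that track this number in a dashboard or script, the tiers above can be expressed as a simple lookup. This is a sketch under one stated assumption: the published ranges share their endpoints (15%, 25%, 35%), so the code arbitrarily assigns 15% and 25% to Healthy Steady-State and 35% to Catching Up. The function name is hypothetical, and the code cannot tell whether a high share is "sustained" or a temporary catch-up; that judgment stays with the reader.

```python
def debt_capacity_tier(share_pct):
    """Map the % of engineering capacity spent on tech debt to a benchmark tier.

    Boundary handling is an assumption: the source ranges overlap at
    15, 25, and 35, so ties are resolved as documented in the lead-in.
    """
    if share_pct < 5:
        return "Heading for a Crisis"
    if share_pct < 15:
        return "Accumulating Debt Faster Than Paying"
    if share_pct <= 25:
        return "Healthy Steady-State"
    if share_pct <= 35:
        return "Catching Up After Neglect"
    return "Over-Engineering / Rewrite-itis"


# Hypothetical team allocations, for illustration only.
for pct in (3, 10, 20, 30, 40):
    print(pct, debt_capacity_tier(pct))
```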

Real-world cases

Companies that lived this.

Verified narratives with the numbers that prove (or break) the concept.

Stripe (API versioning discipline)

2011-present

success

Stripe made an early decision that breaking the public API was unacceptable: every customer integration would continue to work indefinitely on the version it was built against. This forced a strategic tech debt approach: invest heavily in compatibility layers, version translation, and gradual internal migration tooling rather than ever doing a 'big rewrite' that broke customers. The discipline shows up in their engineering blog: incremental, surgical migrations spanning years, with measurable progress and explicit deprecation timelines. The opposite approach (the cleanup rewrite) would have been faster internally but would have torched customer trust.

API Versioning Strategy: Permanent backward compatibility
Major Internal Migrations: Multiple, all incremental over years
Developer Trust Outcome: Among the highest in payments
Big-Rewrite Count: 0 publicly known

Tech debt discipline at Stripe is a strategic asset, not an engineering chore. The decision to never break the API constrained their tech debt approach to incremental, surgical migration, which produced both better customer outcomes and a lower-risk engineering pattern than the typical big-rewrite cycle.


Shopify (modular monolith)

2014-present

success

Shopify is one of the most prominent examples of resisting microservices and the 'big rewrite' temptation. Instead of breaking the Rails monolith into hundreds of services, Shopify invested in 'modular monolith' patterns: internal boundaries, code ownership, and gradual extraction of only the components that genuinely needed to be separate. Its engineering blog posts describe the discipline candidly: boring, incremental, focused on the specific components causing pain. As of recent reporting, Shopify still runs one of the largest Rails monoliths in production while shipping at high velocity and operating reliably at Black Friday scale.

Codebase Type: Modular monolith (still primarily Rails)
Black Friday Scale: Millions of requests/sec at peak
Big-Rewrite Resisted: Multiple times since 2014
Pattern: Extract surgically, never wholesale

Shopify's resistance to the microservices/rewrite fashion is a study in tech debt discipline. The pattern: keep the monolith healthy via modular boundaries, extract only what genuinely benefits from being separate, never rewrite for fashion. The result is a codebase that ships at scale while many smaller competitors are mid-rewrite.

