The Rewrite That Never Shipped
In 2021, I watched a team spend 14 months rewriting their monolith into microservices. They had all the buzzwords: event-driven architecture, service mesh, Kubernetes, the works.
The rewrite never shipped. The company ran out of runway and got acquired. The monolith is still running in production today.
Meanwhile, at my current company, we've been gradually modernizing our 8-year-old monolith for 3 years. We ship to production every day. We've extracted 12 services. Revenue is up 3x. The monolith is still the core, and that's fine.
Here's what I learned: you don't need to kill the monolith. You need to make it better.
Why Your Monolith Isn't Actually the Problem
Let me guess what you hate about your monolith:
- Takes 20 minutes to run tests
- Deploys are scary and take 2 hours
- One bug in the reporting module takes down checkout
- New engineers take 3 months to be productive
- You can't scale the parts that need scaling without scaling everything
Here's the thing: microservices won't fix any of those problems if you don't fix them first.
Slow tests? You'll have slow tests in 15 services instead of 1.
Scary deploys? Now you have 15 deployment pipelines to be scared of.
Tight coupling? That just becomes network calls that are even harder to debug.
I know because I've made these mistakes. We extracted a service before fixing the underlying problems. It was just a distributed monolith with extra latency.
The Strangler Fig Pattern (That Actually Works)
You've probably heard of the Strangler Fig pattern. The metaphor is cool: a fig tree grows around an old tree, eventually replacing it.
What the blog posts don't tell you: you might not want to strangle the whole tree.
Here's our approach over the past 3 years:
Year 1: Make the Monolith Modular
Before extracting anything, we organized the monolith into clear modules:
```
/src
  /users      (authentication, profiles, permissions)
  /billing    (payments, subscriptions, invoicing)
  /products   (catalog, inventory, pricing)
  /analytics  (reporting, dashboards)
  /core       (shared utilities, database)
```
Rules we enforced:
- Users module can only import from core, not from billing
- Billing can import from users (needs auth) but not from analytics
- Analytics can import from everything (read-only)
We added linting rules to enforce this. PRs that violated module boundaries got rejected automatically.
This took 6 months. Zero new features during this time. Just reorganization. It was painful, but it worked.
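The boundary linting can be sketched as a small check that scans a module's source for constants belonging to modules it isn't allowed to import. This is an illustrative sketch, not our actual linter: the `ALLOWED` map mirrors the rules above, but the `Module::` constant heuristic is simplified (in a real Rails codebase, a tool like Packwerk does this properly).

```ruby
# Hypothetical boundary checker. Each module lists the modules it may
# import; everything else is a violation.
ALLOWED = {
  "users"     => ["core"],
  "billing"   => ["core", "users"],
  "products"  => ["core"],
  "analytics" => ["core", "users", "billing", "products"], # read-only consumer
  "core"      => [],
}.freeze

# Given a module name and the source of one of its files, return the
# forbidden modules that the source references (via Module:: constants).
def boundary_violations(mod, source)
  ALLOWED.keys
         .reject { |m| m == mod || ALLOWED[mod].include?(m) }
         .select { |m| source.include?("#{m.capitalize}::") }
end
```

Wired into CI, a non-empty result fails the build, which is what rejected the boundary-violating PRs automatically.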
Year 2: Extract the Obvious Wins
Once we had clear modules, we looked for services that were:
- Independent: Doesn't need much from other modules
- Scalable: Needs different scaling characteristics than the main app
- Stable: Not changing every week
We extracted:
- Email service: It was sending 10M emails/day while the monolith served 100K requests/day. Different scaling needs. Easy to extract because it was already just a queue consumer.
- Image processing: CPU-intensive, independent, clear API (upload image → get URL back).
- Analytics/Reporting: Read-only, could have its own database replica, didn't need real-time consistency.
We did NOT extract:
- User authentication: Too critical, touches everything, needs real-time consistency
- Billing: Too complex, constantly changing, too risky
- Core product logic: Too coupled to everything else
Year 3: Optimize the Monolith
Plot twist: after extracting a few services, we stopped extracting and focused on making the monolith better.
We:
- Upgraded Rails 5 → Rails 7 (40% faster)
- Optimized database queries (cut API response time in half)
- Implemented proper caching (Redis FTW)
- Split the database reads/writes to replicas
- Added comprehensive monitoring
Result: the monolith now handles 5x more traffic than when we started. We don't need to extract more services because the monolith is fast enough.
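The read/write split can lean on Rails' built-in multiple-database support (Rails 6+). A sketch assuming database configs named `primary` and `primary_replica` — the names and the 2-second delay are illustrative, not our actual config:

```ruby
# app/models/application_record.rb — route writes to the primary and
# reads to a replica (connection names are illustrative).
class ApplicationRecord < ActiveRecord::Base
  self.abstract_class = true

  connects_to database: { writing: :primary, reading: :primary_replica }
end

# config/environments/production.rb — let Rails switch roles automatically:
# GET/HEAD requests read from the replica, unless the same session wrote
# within the last 2 seconds (to avoid reading stale data after a write).
Rails.application.configure do
  config.active_record.database_selector = { delay: 2.seconds }
  config.active_record.database_resolver =
    ActiveRecord::Middleware::DatabaseSelector::Resolver
  config.active_record.database_resolver_context =
    ActiveRecord::Middleware::DatabaseSelector::Resolver::Session
end
```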
How to Decide What to Extract
Use this decision tree (I literally have this printed on my wall):
Extract If:
- Different scaling needs: Email service needs to send 10M/day, main app needs 100K req/day → extract
- Different technology requirements: Need Python for ML, main app is Ruby → extract
- Different team ownership: Separate team that needs to deploy independently → extract
- Clear, stable interface: "Upload file → get URL back" with no plans to change → safe to extract
- High cost of coupling: Mobile app can't release features because they need backend deploys first → extract the mobile API
Don't Extract If:
- It shares a database table with other code: You'll just create a distributed monolith
- The interface isn't clear: "It does... stuff with users?" → not ready
- It changes frequently: Deploying 2 services every time you change something is worse than deploying 1
- It's tightly coupled: If extracting it means 1000 network calls, don't do it yet
- You're doing it because "microservices are best practice": Please don't
The Extraction Process That Worked
When we extracted our email service, here's what we did:
Step 1: Create Internal API Boundaries (in the monolith)
First, we wrapped the email code in a clean interface inside the monolith:
```ruby
class EmailService
  def send_email(to:, subject:, body:, template:)
    # All email logic here
  end
end

# Everywhere in the codebase
EmailService.send_email(to: user.email, subject: "Welcome!", ...)
```
Took 2 weeks. Zero functionality changed. Just created a clear boundary.
Step 2: Add a Feature Flag
```ruby
class EmailService
  def send_email(...)
    if FeatureFlag.enabled?(:external_email_service)
      EmailAPI.send(...) # Call external service
    else
      # Old code in monolith
    end
  end
end
```
Now we could route some traffic to the new service, some to the old code, and switch back instantly if needed.
Step 3: Build the Service (While the Old Code Still Runs)
We built the new email service over 6 weeks. During this time, 100% of production traffic still used the monolith. No pressure.
Step 4: Gradual Rollout
- Week 1: 1% of traffic to new service
- Week 2: 10%
- Week 3: 50%
- Week 4: 100%
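The weekly percentages can be driven by a deterministic hash of a stable key (the recipient's email, say), so each user consistently lands in or out of the new-service bucket as the dial turns up. A sketch — `rollout?` and the keying choice are illustrative, not our actual flag implementation:

```ruby
require "zlib"

# Hypothetical percentage rollout: hash a stable key so the same user
# always gets the same answer for a given percentage. CRC32 mod 100
# buckets keys into 0..99; a key is "in" if its bucket is below the dial.
def rollout?(key, percent)
  Zlib.crc32(key) % 100 < percent
end

# Week 2: roughly 10% of users route to the new email service,
# and it's the SAME 10% every time — no flapping between code paths.
rollout?("user-42@example.com", 10)
```

The determinism is what makes the rollback story clean: turning the dial from 10 back to 0 affects exactly the users who were on the new path, and nobody else.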
We found 3 bugs during this rollout. Because we could instantly switch back to the monolith, users barely noticed.
Step 5: Delete the Old Code
This is the step everyone forgets. We waited 2 months after hitting 100%, then deleted the email code from the monolith.
Total time: 4 months from "let's extract this" to "old code deleted."
Total bugs in production: 3 (minor, caught within hours)
Total downtime: 0 seconds
Mistakes I've Made (Learn From My Pain)
Mistake 1: Extracting Too Early
We extracted a "notifications service" when we had 3 types of notifications. Then we added 12 more types. We ended up deploying the notifications service every single day, which defeated the whole purpose.
Lesson: Wait until the interface is stable. If you're still figuring out what this code should do, keep it in the monolith.
Mistake 2: Shared Database
We extracted a "reports service," but it queried the main database directly. Now we couldn't change the database schema without coordinating deploys across both services.
Lesson: Services should own their data. Use APIs or events, not direct database access.
Mistake 3: Wrong Boundaries
We split "user management" from "authentication." Made sense on paper. In practice, every auth change required a user management change. We merged them back into one service after 6 months.
Lesson: Boundaries should follow business domains, not technical layers.
Mistake 4: Not Having Rollback Plans
We extracted a service and deleted the monolith code immediately. The service had a memory leak. It crashed after 6 hours. We had no fallback. That was a bad Saturday.
Lesson: Keep the old code for at least a month. Feature flags are your friend.
Living With the Monolith
Here's my controversial take: monoliths are fine, actually.
Shopify is largely a monolith. GitHub is largely a monolith. Basecamp is literally famous for being a monolith.
The key word is "modular monolith."
What makes a monolith good:
- Clear module boundaries enforced by tooling
- Fast tests (< 10 minutes for full suite)
- Fast deploys (< 10 minutes from commit to production)
- Good monitoring and observability
- Easy to run locally
If you have those things, you don't need microservices. If you don't have those things, microservices won't save you.
Your Action Plan This Week
- Map your monolith: Draw boxes around major functional areas. Where are the natural boundaries?
- Identify one module boundary violation: Where is code reaching across modules when it shouldn't? Fix that one case.
- Measure your deployment time: How long from merge to production? If > 1 hour, that's your first optimization target (not extraction).
- Run your tests: How long? If > 30 minutes, spend a week optimizing tests before you extract anything.
- Ask: what's the real problem? If the answer is "monolith is bad," dig deeper. The real problem is probably "deploys are scary" or "tests are slow." Fix those.
The Real Goal
The goal isn't microservices. The goal isn't even killing the monolith.
The goal is: can your team ship features quickly and confidently?
If you can ship to production 10 times a day with your monolith, you're doing better than teams with 50 microservices deploying once a week.
Our monolith is 8 years old. It's still growing. We've extracted the pieces that needed extracting. The rest? It's fast, well-tested, and it makes us money.
That's good enough.