
How I rebuilt Booplex's indexing in 5 weeks after the localhost disaster

June 1, 2026
8 min read
case-study, technical-seo, canonical, indexing, gsc, nextjs

For 11 days in March 2026, half of Booplex's pages told Google their canonical URL was http://localhost:3000/whatever.

I didn't notice. Google did. The result was the kind of indexing collapse you only believe when you see your own analytics drop off a cliff. This is the full case study: what broke, when I noticed, what I changed, and what the recovery actually looked like in GSC over the following 5 weeks.

This isn't a redemption arc post. It's a timeline with data. If your stack ever does something similar, the patterns here should help you triage faster than I did.

The bug, in one paragraph

A piece of code in app/blog/[slug]/page.tsx was using process.env.SITE_URL || 'http://localhost:3000' to build canonical URLs. In production, SITE_URL wasn't set (different env var name on cPanel vs. Vercel dev). The fallback fired. Every blog post canonicalized itself to localhost.
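
A minimal sketch of that failure, assuming the canonical was assembled roughly like this. Only the `SITE_URL || 'http://localhost:3000'` pattern comes from the real code; the `buildCanonical` name, the `NEXT_PUBLIC_SITE_URL` variable, and `example.com` are hypothetical stand-ins.

```typescript
// Hypothetical reconstruction of the buggy pattern; names other than SITE_URL are illustrative.
type Env = Record<string, string | undefined>;

function buildCanonical(env: Env, slug: string): string {
  // The silent fallback: if SITE_URL is unset, this quietly becomes localhost.
  const siteUrl = env.SITE_URL || "http://localhost:3000";
  return `${siteUrl}/blog/${slug}`;
}

// Environment where the variable is set: everything looks fine.
const goodCanonical = buildCanonical({ SITE_URL: "https://example.com" }, "my-post");

// Environment where the URL lives under a different name: SITE_URL is undefined.
const badCanonical = buildCanonical({ NEXT_PUBLIC_SITE_URL: "https://example.com" }, "my-post");

console.log(goodCanonical); // https://example.com/blog/my-post
console.log(badCanonical);  // http://localhost:3000/blog/my-post
```

Nothing throws and nothing logs a warning, which is why the pages rendered normally while serving the wrong canonical.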

The full write-up of the bug itself is in the original post. This post is about what happened after the fix shipped.

The 5-week recovery timeline

Day 0 — Fix deployed (March 24, 2026)

Fix: replace every inline process.env.SITE_URL || ... with a centralised getPublicSiteUrl() helper that fails loudly if the env var is missing. 11 files changed.
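
A minimal sketch of what a fail-loud helper along these lines could look like. The real `getPublicSiteUrl()` isn't reproduced in this post, so the signature, the injectable `env` parameter, and the trailing-slash normalisation are assumptions for illustration.

```typescript
type Env = Record<string, string | undefined>;

// Sketch only: env is passed in so the function can be exercised without
// touching the real process environment. In an app you would pass process.env.
function getPublicSiteUrl(env: Env): string {
  const raw = env.SITE_URL?.trim();
  if (!raw) {
    // No fallback: a missing or empty value stops execution here.
    throw new Error("SITE_URL is required but was not set");
  }
  // new URL() throws a TypeError on malformed input, which is also a loud failure.
  const url = new URL(raw);
  // Assumption: canonicals are built as `${base}/path`, so strip any trailing slash.
  return url.origin + url.pathname.replace(/\/+$/, "");
}
```

If the helper runs while pages are being generated at build time, a missing variable should surface as a failed build rather than as a bad canonical in production.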

State of indexing at deploy time:

  • GSC "Pages indexed" count: 4 (down from a pre-bug ~20)
  • GSC "Discovered, not indexed": 18
  • GSC "Crawled, not indexed": 11
  • GSC "Duplicate, Google chose different canonical": 9

That last bucket is the smoking gun. Google had seen the localhost canonical, recognized it as nonsense, and assigned its own canonical — which was a different (and wrong) page.

Day 0–2 — IndexNow blast + sitemap resubmit

Immediately after deploy:

  1. Resubmitted the sitemap in GSC. (Forced re-fetch.)
  2. Resubmitted via IndexNow to Bing — Booplex already had IndexNow plumbed in, but I hand-triggered the whole sitemap to be safe.
  3. Used GSC's URL Inspection "Request indexing" button for the top 8 affected URLs. (Manual, per-URL. GSC throttles you after ~10/day.)
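
For reference, a hand-triggered IndexNow submission is a single JSON POST. The sketch below follows the public IndexNow protocol fields (`host`, `key`, `keyLocation`, `urlList`). The host, key, and URLs are placeholders, the function names are mine, and this is not Booplex's actual integration.

```typescript
interface IndexNowPayload {
  host: string;
  key: string;
  keyLocation: string;
  urlList: string[];
}

// Build the request body. Every URL must belong to `host`.
function buildIndexNowPayload(host: string, key: string, urls: string[]): IndexNowPayload {
  const foreign = urls.filter((u) => new URL(u).host !== host);
  if (foreign.length > 0) {
    throw new Error(`URLs outside ${host}: ${foreign.join(", ")}`);
  }
  return {
    host,
    key,
    // The key file is served from the site so the engine can verify ownership.
    keyLocation: `https://${host}/${key}.txt`,
    urlList: urls,
  };
}

// Submit to the shared endpoint and return the HTTP status code.
async function submitToIndexNow(payload: IndexNowPayload): Promise<number> {
  const res = await fetch("https://api.indexnow.org/indexnow", {
    method: "POST",
    headers: { "Content-Type": "application/json; charset=utf-8" },
    body: JSON.stringify(payload),
  });
  return res.status; // 200 or 202 means the submission was accepted
}
```

Feeding the sitemap's URL list into `buildIndexNowPayload` is one way to "hand-trigger the whole sitemap" in a single request.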

I considered using "Removals" with a temporary block on the bad canonicals, then resubmitting — but read several technical SEO horror stories about that approach making things worse. Skipped.

Day 1–7 — First reading from GSC

GSC's indexing data lags 24–72 hours. Day 1 after deploy showed no change. Day 3 showed Google had started re-crawling the high-priority pages I'd manually inspected. Day 5–7 the chart started moving.

By Day 7:

  • Pages indexed: 9 (up from 4)
  • Discovered, not indexed: 17 (down from 18)
  • Duplicate, Google chose different canonical: 7 (down from 9)

Not a miracle. But the trend reversed.

Day 7–21 — The slow grind

The bulk of the recovery happened in this window. No big single events — just Google crawling the site at its normal cadence (about every 3–5 days for most pages) and slowly trusting the corrected canonicals.

What helped during this period:

  • Internal linking refresh. Added cross-links between the recovering posts. Updated the homepage to feature the affected blog index more prominently. Signals: these pages matter.
  • Brand-name searches. Some of the Discovered-but-not-indexed pages got real impressions on "booplex" branded queries. Google indexes faster when a page is actually being queried.
  • One war-story post that went mini-viral on Reddit. The original "localhost canonical disaster" post got 200+ upvotes in r/SEO. New referring domains. New crawl frequency. Side effect: indexing for other pages on the site accelerated.

By Day 21:

  • Pages indexed: 16
  • Discovered, not indexed: 6
  • Duplicate, Google chose different canonical: 1

Almost back to baseline.

Day 21–35 — The long tail

With Discovered-but-not-indexed down to single digits, the remaining problems in Days 21–35 were specific URL patterns:

  • One project-mock page that probably shouldn't have been indexed anyway (kept it noindex'd, not a recovery problem)
  • One archive page that was generating duplicate-content signals with a paginated equivalent (added a rel=canonical pointing at the preferred version, cleared on Day 28)
  • Two case study pages that just took longer for Google to re-crawl — eventually re-indexed on Day 31 and Day 33

By Day 35 (April 28, 2026), the site was back to full indexing. Actually slightly better than before the bug — 22 indexed pages vs. the pre-bug 20 — because I'd written 2 new posts during recovery.

What the GSC chart actually looked like

Pages indexed over the recovery:

Day 0:  4   ▌
Day 7:  9   ████
Day 14: 12  ██████
Day 21: 16  ████████
Day 28: 19  ██████████
Day 35: 22  ███████████

Almost linear recovery from Day 5 onwards. No miracle inflection points. No "and then everything fixed itself overnight." Just consistent re-crawling.
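
The "almost linear" claim can be checked directly from the six readings above. A quick sketch of the arithmetic (variable names are mine):

```typescript
const readings = [
  { day: 0, indexed: 4 },
  { day: 7, indexed: 9 },
  { day: 14, indexed: 12 },
  { day: 21, indexed: 16 },
  { day: 28, indexed: 19 },
  { day: 35, indexed: 22 },
];

// Pages gained in each 7-day window.
const weeklyGains = readings.slice(1).map((r, i) => r.indexed - readings[i]!.indexed);

// Mean gain per week across the whole recovery.
const averageGain = weeklyGains.reduce((a, b) => a + b, 0) / weeklyGains.length;

console.log(weeklyGains); // [ 5, 3, 4, 3, 3 ]
console.log(averageGain); // 3.6
```

Every week gained between 3 and 5 pages around a 3.6 average, with the largest jump in the first week when the manually inspected URLs were re-crawled.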

What actually moved the needle (ranked)

  1. The fix being right. Obvious but worth saying. Half-fixes prolong the recovery.
  2. Fail-loud env var handling. The new getPublicSiteUrl() helper throws if the var is missing. The original bug was possible because the fallback hid the misconfiguration. Now it can't happen again because the build fails first.
  3. Sitemap resubmission + URL inspection. Faster than waiting for natural recrawl.
  4. The Reddit traffic spike. Sudden referring domains and crawler attention — I didn't plan for this, but it accelerated re-crawl noticeably from Day 12 onwards.
  5. IndexNow for Bing. Bing recovered faster than Google. I had Bing back to full indexing by Day 10. Whether IndexNow gets credit, or Bing just has a faster crawl cadence, I can't fully separate.

What didn't help (negative learning)

  • Posting on LinkedIn about the recovery. Zero crawl signal value. Real talk: LinkedIn posts don't help with indexing. I knew this. I posted anyway. Felt good. Did nothing.
  • Manually re-inspecting low-value pages. I wasted time on the project-mock pages in Week 1. Those didn't need to be recovered — they should never have been indexed. Lesson: triage by value before triaging by visibility.
  • Reading SEO Reddit during recovery. I picked up two pieces of advice that turned out to be wrong (manual deindex + resubmit, hand-edit the sitemap). Didn't follow them, but the doubt cost me a day.

The monitoring changes I made

To make sure this category of bug can't kill me again:

  1. Build-time canonical check. The Next.js build now runs a script that hits 5 sample URLs after build and verifies that their canonicals don't contain "localhost." Fails the build if they do.
  2. Production canonical drift monitor. A weekly cron job hits the same 5 URLs and posts a Slack alert if any canonical changes. (I've since rebuilt this as a public canonical URL checker — same logic, no login required.)
  3. Centralised URL helper. getPublicSiteUrl() is the only function in the codebase allowed to produce the canonical site URL. Other call sites import it. Lint rule pending.
  4. GSC index-coverage alerts. I check the GSC weekly indexing summary every Monday morning. If "Pages indexed" drops by 3+ in a week, I drop other work and investigate.
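
The drift monitor and the Monday check both reduce to small comparisons. A sketch, assuming canonicals from the previous run are stored as a page-to-canonical map. The function names and data shapes are hypothetical, and fetching pages and posting to Slack are left out.

```typescript
type CanonicalMap = Record<string, string>;

interface Drift {
  page: string;
  before: string;
  after: string | undefined;
}

// Report every page whose canonical differs from the stored baseline,
// including pages whose canonical has disappeared entirely.
function findCanonicalDrift(baseline: CanonicalMap, current: CanonicalMap): Drift[] {
  return Object.entries(baseline)
    .filter(([page, before]) => current[page] !== before)
    .map(([page, before]) => ({ page, before, after: current[page] }));
}

// Rule from the list above: a drop of 3 or more indexed pages
// in a week triggers an investigation.
function shouldInvestigate(lastWeekIndexed: number, thisWeekIndexed: number, threshold = 3): boolean {
  return lastWeekIndexed - thisWeekIndexed >= threshold;
}
```

A non-empty result from `findCanonicalDrift` is what would be sent as the alert. An intentional canonical change then means updating the baseline.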

What 5 weeks of recovery cost in real terms

| Lost asset | Estimated impact |
| --- | --- |
| Organic traffic during recovery | ~80 sessions lost (small site, small impact) |
| Time spent on recovery | ~18 hours over 5 weeks |
| Direct revenue impact | €0 (no commercial conversions during the period anyway) |
| Reputation impact | Net positive once the war-story post landed |
| Long-term trust signal cost | None measurable (domain authority unchanged) |

The reputational angle is the interesting one. Publishing the disaster post probably did more for the brand than the underlying mistake hurt. Honest case studies of failures earn trust that humblebrag case studies of successes don't.

The lessons (the actual useful part)

1. Loud failures beat silent fallbacks

process.env.X || 'default' is a code smell. It hides misconfiguration. Prefer an explicit check that throws, such as if (!process.env.X) throw new Error('X is required'), for any value that's load-bearing in production.

2. Centralise anything that touches the canonical or the URL

The 11 separate inline references to the site URL were the multiplier on the bug. One helper, one source of truth, one place to fix.

3. Build a check for the failure mode, not for the bug

The monitor I added isn't "check for this specific localhost bug." It's "check that canonical URLs don't contain hostnames that aren't the production domain." That catches the next localhost-equivalent bug too.
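
A sketch of a check written against the failure mode: pull the canonical out of the rendered HTML and compare its hostname with an allowlist, rather than searching for the string "localhost". The regex-based extraction is a simplification, the function names are hypothetical, and `example.com` is a placeholder for the production domain.

```typescript
// Extract the href of the first <link rel="canonical"> tag, if any.
// Regex parsing is a shortcut here; a real check might use an HTML parser.
function extractCanonical(html: string): string | null {
  const linkTags = html.match(/<link\b[^>]*>/gi) ?? [];
  for (const tag of linkTags) {
    if (!/\brel\s*=\s*["']?canonical["']?/i.test(tag)) continue;
    const href = tag.match(/\bhref\s*=\s*["']([^"']+)["']/i);
    if (href) return href[1] ?? null;
  }
  return null;
}

type Verdict = { ok: true } | { ok: false; reason: string };

// Pass only when the canonical is an absolute URL on an allowed host.
function checkCanonical(html: string, allowedHosts: string[]): Verdict {
  const canonical = extractCanonical(html);
  if (canonical === null) return { ok: false, reason: "no canonical tag found" };
  let host: string;
  try {
    host = new URL(canonical).hostname;
  } catch {
    return { ok: false, reason: `canonical is not an absolute URL: ${canonical}` };
  }
  return allowedHosts.includes(host)
    ? { ok: true }
    : { ok: false, reason: `unexpected canonical host: ${host}` };
}
```

Because the allowlist names what is permitted, staging hostnames, preview deployments, and raw IP addresses fail the same check without needing their own rule.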

4. GSC is enough monitoring for small sites

I considered building elaborate monitoring infrastructure. Didn't. Weekly check of GSC's index coverage report is 80% of the value. For sites under 1000 pages, I'd recommend the same.

The fancy stuff is for sites that can't tolerate a 2-week detection lag.

5. Recovery is uneventful when the underlying fix is right

If you've fixed the actual bug and your sitemap is correct, recovery is a function of Google's crawl cadence — not a function of your scrambling. Most of the things I did in Week 1 (manual URL inspection, IndexNow blasts) shaved days off, not weeks.

FAQ

How long does Google take to reindex after a canonical URL fix?

For a small site (under 100 pages) with normal crawl cadence: 2–5 weeks for full recovery. For a large site: longer, often 6–12 weeks. The first noticeable improvement usually shows up within 7 days of the fix deploying.

Does IndexNow help with Google?

No. Google has not adopted IndexNow. It speeds up discovery on the engines that support it, Bing and Yandex among them. For Google, you rely on natural crawl + sitemap resubmission + manual URL inspection.

How do I know if I have a canonical URL problem?

Check GSC "Page indexing" report. If you see entries under "Duplicate, Google chose different canonical" or "Alternate page with proper canonical tag," investigate. Also use the canonical URL checker to test a few sample URLs against what's actually served.

Should I noindex pages while recovering?

No. Counterintuitively, noindexing during recovery slows Google down — it has to reverify the noindex, then later reverify that the noindex is gone. Just leave the pages alone and let recovery proceed.

What's the worst thing to do during a recovery?

Three nominees: (1) keep deploying small fixes that change canonicals further, (2) use GSC's Removals tool, (3) hand-edit the sitemap repeatedly. Each adds noise that slows recovery.

How do I prevent this category of bug?

Three controls: a centralized URL helper that fails loudly when misconfigured, a build-time check that verifies canonical output, and a weekly monitor that flags drift. Combine those and the failure mode becomes near-impossible.

