Migrating from WordPress to Next.js: A Field Guide

How to move a content site off WordPress and onto Next.js without wrecking your SEO, your content, or your weekends.

WordPress is a fine place to start and an expensive place to stay. Once a site is mostly articles, the database, the plugins and the monthly bill stop earning their keep, and you start paying in performance, in security patching, and in a vague background anxiety every time you log in. Next.js, rendered to static pages on a CDN, removes all three. The catch is the move itself: done carelessly, a migration can erase years of accumulated search authority in a single afternoon.

This is the guide I wish I had before I migrated a live, bilingual site. It is deliberately generic. Your stack will differ in the details, but the shape of the problem, and the order in which you should solve it, does not.

The one rule that governs everything: preserve your URLs

Before you choose a framework, internalise this: every URL that search engines have indexed must keep returning content at the same address, byte for byte. Not "roughly the same". Identical. A path that was /my-article/ must not become /blog/my-article/ or /articles/my-article. A trailing slash that was there must stay there.

Why so strict? Because every indexed URL is a small deposit of trust built up over time: backlinks point at it, search engines rank it, browsers have it bookmarked. Change the address and, at best, you depend on 301 redirects to pass that trust along (lossy, and only if you get every single one right). At worst, you serve a wall of 404s and reset to zero.

So the migration is not really "rebuild the site". It is "rebuild the site under a fixed URL contract". Everything below is arranged around honouring that contract.

Two mechanical safeguards make it enforceable:

Trailing slashes. If your old URLs end in /, set trailingSlash: true in next.config.mjs and never think about it again. A mismatch here alone can trigger a sitewide redirect wave.
Encoding. If you have non-Latin slugs (Arabic, Cyrillic, CJK), search engines indexed the percent-encoded form. Store the slug decoded on disk, normalise it to Unicode NFC, and pass it through encodeURIComponent whenever you emit it, so your canonical, hreflang and sitemap URLs match what was indexed exactly.
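
Both safeguards fit in one small helper. What follows is a minimal sketch, not the code from my site: canonicalPath and its localePrefix parameter are names invented for illustration.

```typescript
// Hypothetical helper: turn a slug as stored on disk into the path that is
// emitted in canonical, hreflang and sitemap URLs.
function canonicalPath(storedSlug: string, localePrefix = ""): string {
  // NFC makes visually identical slugs compare equal, however the editor
  // or filesystem happened to compose the characters.
  const slug = storedSlug.normalize("NFC");
  // encodeURIComponent percent-encodes the UTF-8 bytes of the slug.
  // The trailing slash mirrors trailingSlash: true in next.config.mjs.
  return `${localePrefix}/${encodeURIComponent(slug)}/`;
}

console.log(canonicalPath("my-article"));   // "/my-article/"
console.log(canonicalPath("مرحبا", "/ar")); // "/ar/%D9%85%D8%B1%D8%AD%D8%A8%D8%A7/"
```

One thing to check against your audit: encodeURIComponent writes its hex digits in uppercase. If the old site emitted lowercase escapes, decide which form you treat as canonical and compare it with what is actually indexed.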

Step one: audit before you touch any code

The most dangerous thing in a migration is a confident assumption. Folder names lie, plugin exports lie, and your own memory of "what's on the site" lies. So begin with ground truth, derived from the files, not from the dashboard.

Produce a complete inventory:

Every URL. Walk the live site or its static export and list every address that returns a page: posts, pages, category archives, tag archives, pagination, feeds. Mine ran to roughly 560 URLs once tags and paginated archives were counted, far more than the "forty-odd articles" I had in my head.
The real content model. How many posts actually exist, and what is each one's canonical address? Do not infer language or relationships from directory names. I learned this the hard way: a dozen directories that looked Arabic were in fact the default-language record of posts that simply had non-Latin titles.
The gap between what you advertise and what exists. Audit quality, not just presence. On my site, the CMS advertised a full second language via hreflang, but nearly half of those pages were untranslated fallbacks: the English body wrapped in translated navigation. Carrying that forward unchanged would have meant publishing duplicate, mislabelled content. You cannot make a good decision about a page you have not actually measured.

Write this down as a table. It becomes the specification the rest of the work is checked against.
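
If you are working from a static export, the URL inventory can be derived from the file paths themselves. This is a rough sketch under one assumption: that the export writes each page as a directory containing index.html. The function names are hypothetical, and feeds or other non-HTML endpoints would need rules of their own.

```typescript
// Hypothetical: map a file path inside a static export to the URL it served.
// Returns null for anything that is not a page (images, scripts, styles).
function urlFromExportPath(relativePath: string): string | null {
  // Normalise Windows separators and strip any leading slash.
  const p = relativePath.replace(/\\/g, "/").replace(/^\/+/, "");
  if (p === "index.html") return "/";
  if (p.endsWith("/index.html")) {
    // "my-article/index.html" becomes "/my-article/"
    return "/" + p.slice(0, -"index.html".length);
  }
  return null;
}

// Build the de-duplicated, sorted list that becomes the audit table.
function inventory(paths: string[]): string[] {
  const urls = paths
    .map(urlFromExportPath)
    .filter((u): u is string => u !== null);
  return Array.from(new Set(urls)).sort();
}

console.log(
  inventory([
    "index.html",
    "my-article/index.html",
    "tag/game-theory/page/2/index.html",
    "wp-content/uploads/kindness.jpg",
  ]),
);
```

Feeding it the output of a recursive directory listing gives the list of addresses that the later parity check has to account for.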

Step two: choose an architecture that fits the contract

A few decisions, made once, save you from fighting the framework later.

Next.js App Router, statically generated via generateStaticParams. Crucially, not output: 'export'. Full static export throws away the parts of Next.js you will want: image optimisation, Route Handlers for forms, on-the-fly Open Graph images, edge functions. Leave it off and every page still renders to static HTML at build time, so you get a static site's speed with a real framework's capabilities.
MDX for article bodies. Plain Markdown is enough until the day an article contains an interactive chart or a custom component, and then it is not. MDX (Markdown that can render React components) is the only format that survives both your simplest and your most complex posts. Pair it with gray-matter for frontmatter and a remark/rehype pipeline for GitHub-flavoured Markdown, heading anchors and code highlighting.
Tailwind with a typography plugin for styling, so article bodies get sensible defaults and you theme through tokens rather than scattered CSS.
Internationalisation, if you need it, through a library like next-intl with locale prefixing set to "as needed", so your default language stays prefix-less at the root and other languages live under a path prefix. This is what keeps the default-language URLs byte-identical to the old site.

The guiding principle: server components by default, client components only where there is genuine interactivity. Most of a content site is static text, and it should ship as static text with no JavaScript attached.
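
The static-generation choice from the list above looks roughly like this in a dynamic route file such as app/[slug]/page.tsx. It is a sketch only: the inline posts array stands in for whatever content source you use, and the page component itself is omitted.

```typescript
// Assumption: in a real project this list comes from your content source,
// not from an inline array.
interface PostRef {
  slug: string;
}

const posts: PostRef[] = [
  { slug: "game-theory-proves-kindness-wins" },
];

// Route segment option read by Next.js: a slug that generateStaticParams
// did not return is answered with a 404 instead of being rendered on demand.
export const dynamicParams = false;

// Next.js calls this at build time and pre-renders one page per params object.
export function generateStaticParams(): { slug: string }[] {
  return posts.map((post) => ({ slug: post.slug }));
}
```

Setting dynamicParams to false is a choice, not a requirement. It suits a migration because an address that is not in your inventory fails loudly rather than rendering something unexpected.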

Step three: make a manifest the single source of truth

This is the idea that made everything else tractable. Do not let routing, pairing, or language be decided by where a file happens to sit. Build one manifest, keyed by a stable post id, that records the truth for every post:

interface PostEntry {
  id: number;            // a stable id, not derived from the path
  enSlug: string;        // the default-language slug
  arSlug: string;        // the translation's slug (often identical)
  translationKey: number; // links the two language variants
  publishedAt: string;
  updatedAt: string;
  categories: string[];  // existing slugs, verbatim
  tags: string[];
  status: 'translated' | 'fallback' | 'missing';
}

Routes, generateStaticParams, sitemaps, hreflang and redirects all derive from this one structure. Each MDX file mirrors the relevant fields in its frontmatter, so a single file is self-describing, but the manifest is what the build trusts.
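
To make "derive from this one structure" concrete, here is a small sketch that turns manifest entries into the list of routes to build. The interface is trimmed to the fields it needs, routesFor and allRoutes are hypothetical names, and reading a status of 'missing' as "no page exists in the second language" is my assumption for the example.

```typescript
type Status = "translated" | "fallback" | "missing";

// Trimmed copy of the manifest entry: only the fields this sketch uses.
interface PostEntry {
  id: number;
  enSlug: string;
  arSlug: string;
  status: Status;
}

// Every post has a default-language route at the root. The /ar/ route exists
// unless the second-language page is missing altogether.
function routesFor(entry: PostEntry): string[] {
  const routes = [`/${entry.enSlug}/`];
  if (entry.status !== "missing") {
    routes.push(`/ar/${entry.arSlug}/`);
  }
  return routes;
}

function allRoutes(manifest: PostEntry[]): string[] {
  const routes: string[] = [];
  for (const entry of manifest) {
    routes.push(...routesFor(entry));
  }
  return routes;
}

const manifest: PostEntry[] = [
  { id: 3018, enSlug: "game-theory-proves-kindness-wins", arSlug: "game-theory-proves-kindness-wins", status: "translated" },
  { id: 3019, enSlug: "english-only-post", arSlug: "english-only-post", status: "missing" },
];

console.log(allRoutes(manifest));
```

The same loop can feed generateStaticParams and the sitemap, which is the point: nothing consults the directory layout.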

In practice I generate the manifest from the MDX frontmatter at build time (a predev/prebuild script), keep it gitignored, and give authors a create-article command so they never hand-edit it. A companion check-content script enforces the rules that matter: required frontmatter, slug matches filename, British English, and a hard ban on em dashes. Make your conventions executable and they stop being suggestions.
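
A check-content rule set can be very small. The sketch below covers three of the rules named above; the required-field list, the one-file-per-slug naming (slug.mdx) and the function name are assumptions for illustration, and the British English check is left out because it needs a dictionary.

```typescript
type Frontmatter = Record<string, unknown>;

// Assumption: the fields every article must carry.
const REQUIRED_FIELDS = ["id", "title", "slug", "locale", "translationKey", "description", "publishedAt"];

// Returns a list of human-readable problems; an empty list means the file passes.
function checkContent(fileName: string, frontmatter: Frontmatter, body: string): string[] {
  const errors: string[] = [];

  for (const field of REQUIRED_FIELDS) {
    const value = frontmatter[field];
    if (value === undefined || value === null || value === "") {
      errors.push(`${fileName}: missing frontmatter field "${field}"`);
    }
  }

  // Assumption: files are named <slug>.mdx.
  const expectedSlug = fileName.replace(/\.mdx$/, "");
  const slug = frontmatter["slug"];
  if (typeof slug === "string" && slug !== expectedSlug) {
    errors.push(`${fileName}: slug "${slug}" does not match the filename`);
  }

  // U+2014 is the em dash; written as an escape so this file passes its own rule.
  if (body.includes("\u2014")) {
    errors.push(`${fileName}: em dash found in body`);
  }

  return errors;
}
```

Wired into prebuild, a non-empty result exits with a non-zero status and the build stops, which is what turns the convention into a rule.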

A representative frontmatter block:

---
id: 3018
title: 'Game theory proves kindness wins'
slug: game-theory-proves-kindness-wins
locale: en
translationKey: 3018
description: '...'
publishedAt: '2026-05-20'
heroImage: /uploads/2026/05/kindness.jpg
heroAlt: '...'        # backfill this; exported alt text is usually empty
categories: [game-theory, blog]
tags: [prisoners-dilemma, tit-for-tat]
---

Step four: get the bilingual and RTL details right

If your site is monolingual, skim this section. If it is not, this is where most migrations quietly go wrong.

Routing. Default language at the root, translations under a prefix (/ar/...). Configure the routing middleware carefully so the prefix-less default does not accidentally redirect the homepage.
Pairing by key, not by guess. "The translation is the same slug under /ar/" is true until it is not. I had exactly one post where the two languages had different slugs, and a naive rule would have 404'd it. Store both slugs explicitly and let the manifest hold the exceptions.
Direction and fonts. Set lang and dir on the root element from the locale. Lay out with logical CSS properties (start/end rather than left/right) so components mirror automatically in right-to-left. Load a real webfont for the second script; the OS fallback is rarely what you want.
hreflang, canonical, x-default. Emit an absolute, self-referential canonical on every page (including translations, where exported CMS canonicals are often empty or wrong). Emit reciprocal hreflang only between pages that genuinely correspond, and add an x-default pointing at your default language. Get this right per exact URL, including paginated archives.
The untranslated-content decision. For pages that are advertised as translated but are really fallbacks, pick a policy and apply it consistently: translate them, or mark them noindex and drop their reciprocal hreflang so they stay reachable but stop competing as duplicate content, or redirect them to the original-language version. My default was to noindex the fallbacks until they could be genuinely translated, and to emit translation hreflang only for real pairs.

That last point has a clean payoff: when a fallback is later translated, flipping its status to translated in the manifest is all it takes to make the page indexable, add it to the sitemap, and switch on reciprocal hreflang. The policy becomes a one-line change rather than a manual scramble.
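
Here is one way that policy might look as a function of the manifest status. Treat it as a sketch of my default (noindex for fallbacks, hreflang only for real pairs), not the only valid reading; ORIGIN is a placeholder host and seoFor is a hypothetical name.

```typescript
type Status = "translated" | "fallback" | "missing";
type Locale = "en" | "ar";

interface PostEntry {
  enSlug: string;
  arSlug: string;
  status: Status;
}

interface SeoDecision {
  canonical: string;                // absolute and self-referential
  index: boolean;                   // false means emit robots noindex
  hreflang: Record<string, string>; // language code to absolute URL
}

const ORIGIN = "https://www.example.com"; // placeholder canonical host

function seoFor(entry: PostEntry, locale: Locale): SeoDecision {
  const en = `${ORIGIN}/${encodeURIComponent(entry.enSlug)}/`;
  const ar = `${ORIGIN}/ar/${encodeURIComponent(entry.arSlug)}/`;
  const paired = entry.status === "translated";

  if (locale === "en") {
    // The default-language page is always indexable. It only points at the
    // translation when a genuine one exists.
    const hreflang: Record<string, string> = paired
      ? { en, ar, "x-default": en }
      : { en, "x-default": en };
    return { canonical: en, index: true, hreflang };
  }

  // Second-language page: indexable and reciprocal only for a real pair.
  return {
    canonical: ar,
    index: paired,
    hreflang: paired ? { en, ar, "x-default": en } : {},
  };
}
```

Because everything keys off status, changing one field in the manifest changes the robots directive and the hreflang set together; there is no second place to forget.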

Step five: migrate the content with a real parser, not a regex

CMS-exported HTML is messy: page-builder wrappers, editor artefacts, tracking containers, custom elements, and empty tags litter the body. Do not try to clean that with string replacement. Write a one-off importer (kept out of the shipped app) that:

Parses each exported page with a real DOM parser, not a regular expression.
Extracts the actual article body and sanitises it: strip share-button containers, contenteditable leftovers, unknown custom elements, framework attribute noise, and empty paragraphs.
Converts the cleaned DOM to MDX through a rehype/remark pipeline: keep headings, lists, links, emphasis and blockquotes; convert simple tables to Markdown; keep complex tables as raw JSX.
Pulls frontmatter from the page head and DOM: title, dates, description, hero image, categories, tags, reading time, and the stable id.
Backfills alt text, which exported content almost always lacks. This is an accessibility and SEO win you get for free while you are in there.
Fails the build if any unknown hyphenated custom tag survives, because it will otherwise break the MDX parser downstream.
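
The last guard in that list is a plain tree walk. The sketch below uses a deliberately minimal node type of its own (element nodes only, loosely modelled on the trees HTML parsers produce); a real importer would walk its parser's node types instead, and the function names are hypothetical.

```typescript
// Minimal element tree for illustration. Text nodes are omitted.
interface ElementNode {
  tagName: string;
  children: ElementNode[];
}

// Collect every hyphenated tag name that is not on the allow-list.
// Hyphenated names are how custom elements are spelled in HTML.
function unknownCustomTags(root: ElementNode, allowed: ReadonlySet<string>): string[] {
  const found = new Set<string>();
  const stack: ElementNode[] = [root];
  while (stack.length > 0) {
    const node = stack.pop()!;
    if (node.tagName.includes("-") && !allowed.has(node.tagName)) {
      found.add(node.tagName);
    }
    stack.push(...node.children);
  }
  return Array.from(found).sort();
}

// Throwing here is what fails the import run.
function assertNoUnknownCustomTags(root: ElementNode, allowed: ReadonlySet<string> = new Set()): void {
  const tags = unknownCustomTags(root, allowed);
  if (tags.length > 0) {
    throw new Error(`Unknown custom elements survived sanitising: ${tags.join(", ")}`);
  }
}
```

The allow-list exists for the few custom elements you intend to keep and map to real components.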

Then budget real time for a human pass: roughly five minutes per article-locale to compare the rendered MDX against the original for heading hierarchy, lists, image paths, tables, internal links and language correctness. For a few dozen posts in two languages, that is the better part of a working day. It is not optional; the importer gets ninety-five per cent of the way, and the last five per cent is where embarrassing errors hide.

The exception that proves the rule: anything genuinely interactive (a chart, a calculator, a custom widget) cannot be flattened to Markdown and must be rebuilt as a real component, embedded in the MDX by name. Treat each one as a first-class deliverable with its own acceptance test, not as body text. These are the items most likely to be silently lost in a bulk conversion, so handle them deliberately.

Step six: replace the dynamic features without a backend

WordPress gives you forms, search and newsletters out of the box. You rebuild each as a small, self-contained piece, and they end up faster and more secure.

Contact form. A React form posting to a serverless Route Handler that relays through an email API. Migrating is a chance to harden: escape every field before it touches the email body, validate the address server-side, lock CORS to your production origin, rate-limit, and read the destination only from an environment variable. Never carry forward a hardcoded address or a wildcard CORS rule.
Newsletter and gated downloads. A server route that talks to your email provider's contacts API, with a client helper that gives real on-page success and error feedback. Keep the API key server-side; the browser never sees it.
Search. A build-time JSON index plus a client-side fuzzy search library. No search server, no query language. For non-Latin scripts, normalise the text (strip diacritics, fold letter variants) before indexing so queries actually match.
Open Graph images. Generate share-card images per article and per locale at the edge, loading the right font so non-Latin titles render. Provide a static fallback.
Analytics and consent. A cookieless analytics product can run with no banner; anything that sets cookies (a typical analytics suite) should load only after consent. Decide your posture deliberately rather than pasting a snippet into the head.
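
The search normalisation mentioned in the list above can be done with a handful of replacements. This sketch covers Latin diacritics and a common set of Arabic foldings; exactly which letters to fold is a judgment call for your language, and normaliseForSearch is a hypothetical name.

```typescript
// Apply the same function to the indexed text and to the user's query.
function normaliseForSearch(text: string): string {
  return text
    // Decompose, so accents and the hamza/madda on alef become separate marks.
    .normalize("NFD")
    // Strip Latin combining marks, Arabic vowel marks and related combining
    // signs (U+064B to U+065F), and the superscript alef (U+0670).
    .replace(/[\u0300-\u036f\u064B-\u065F\u0670]/g, "")
    // Remove tatweel, the purely decorative elongation character.
    .replace(/\u0640/g, "")
    // Fold alef maqsura to ya, and ta marbuta to ha.
    .replace(/\u0649/g, "\u064A")
    .replace(/\u0629/g, "\u0647")
    .toLowerCase();
}

console.log(normaliseForSearch("Café")); // "cafe"
```

Because decomposition splits the hamza from its carrier, the alef variants all fold to a bare alef without a separate rule. If that is too aggressive for your content, drop the decomposition step and fold only the letters you choose.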

The pattern throughout: a tiny serverless function for the few things that genuinely need a server, and static files for everything else.
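
For the contact form, most of the hardening is pure logic that can sit beside the Route Handler and be tested without a server. The sketch below shows the escaping, address validation and origin check. All names are hypothetical, the email pattern is deliberately loose, and rate limiting and the environment-variable destination belong in the handler and are not shown.

```typescript
// Escape the five characters that matter when text is placed into HTML.
function escapeHtml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// A plausibility check, not full RFC validation: one "@", a dot in the
// domain part, no whitespace, and a sane overall length.
function isPlausibleEmail(value: string): boolean {
  return value.length <= 254 && /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(value);
}

interface ContactInput {
  name: string;
  email: string;
  message: string;
}

type Prepared = { ok: true; html: string } | { ok: false; error: string };

// requestOrigin is the Origin header of the incoming request;
// allowedOrigin is your production origin, never a wildcard.
function prepareContactEmail(input: ContactInput, requestOrigin: string | null, allowedOrigin: string): Prepared {
  if (requestOrigin !== allowedOrigin) {
    return { ok: false, error: "origin not allowed" };
  }
  if (!isPlausibleEmail(input.email)) {
    return { ok: false, error: "invalid email address" };
  }
  const html =
    `<p>From: ${escapeHtml(input.name)} (${escapeHtml(input.email)})</p>` +
    `<p>${escapeHtml(input.message)}</p>`;
  return { ok: true, html };
}
```

The handler then does only three things: call prepareContactEmail, return a 4xx on failure, and pass the escaped HTML to the email API on success.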

Step seven: the SEO and infrastructure layer

This is the unglamorous work that determines whether the migration is invisible (the goal) or a ranking event (the failure mode).

next.config.mjs. Set trailingSlash: true; declare your security headers (X-Frame-Options, X-Content-Type-Options: nosniff, a strict Referrer-Policy, a locked-down Permissions-Policy); and add the redirects you do need: apex-to-www, old upload paths to new, and any taxonomy slugs you deliberately changed.
A unified sitemap covering every locale, posts, pages, categories and tags, with correct hreflang relationships. Exported CMS sitemaps routinely orphan whole trees (the second-language tree, the tag archives); your rebuilt sitemap is the chance to make it complete.
A real robots.txt with a Sitemap: directive.
Per-page metadata: absolute self-canonical, a localised title pattern, description, full Open Graph and Twitter card data, and reciprocal hreflang with x-default.
Structured data (JSON-LD): regenerate it with your canonical host and real dates and word counts. Use a neutral author handle, never one derived from a private email address.
Kill the dead links. Exported heads are full of references to feeds, JSON endpoints and XML-RPC that no longer exist. Remove them rather than re-advertising 404s.

Pick one canonical host (with or without www) to match what is already indexed, and use it everywhere: metadata base, canonicals, sitemap, structured data. Inconsistency here quietly splits your authority.
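
Put together, the config items above might look like the following. It is written here as a TypeScript config for consistency with the other examples. The header values, the example.com host, the choice of www as canonical and the /wp-content/uploads/ source path are all placeholders to replace with what your audit found.

```typescript
interface Header {
  key: string;
  value: string;
}

// Assumed values: tighten or relax them to suit your site.
const securityHeaders: Header[] = [
  { key: "X-Frame-Options", value: "DENY" },
  { key: "X-Content-Type-Options", value: "nosniff" },
  { key: "Referrer-Policy", value: "strict-origin-when-cross-origin" },
  { key: "Permissions-Policy", value: "camera=(), microphone=(), geolocation=()" },
];

const nextConfig = {
  trailingSlash: true,

  // Apply the security headers to every route.
  async headers() {
    return [{ source: "/:path*", headers: securityHeaders }];
  },

  async redirects() {
    return [
      // Apex to www, matched on the Host header.
      {
        source: "/:path*",
        has: [{ type: "host", value: "example.com" }],
        destination: "https://www.example.com/:path*",
        permanent: true,
      },
      // Old upload paths to their new home.
      {
        source: "/wp-content/uploads/:path*",
        destination: "/uploads/:path*",
        permanent: true,
      },
    ];
  },
};

export default nextConfig;
```

Note that permanent: true makes Next.js answer with a 308 rather than a 301, which matters when you write the checks that verify your redirects.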

Step eight: clean up the security debts on the way out

A migration is the right moment to settle the security debts a long-lived WordPress accrues.

Rotate any secret that ever sat in the repository or its history, including admin passwords pasted into a README years ago. Start the new project with clean git history so nothing leaks forward.
Do not regenerate anything that exposes a private identity. Author-archive slugs derived from an email address are a common offender; use a neutral handle.
Keep the new repository private, and keep real secrets in a gitignored environment file and your host's environment variables, never in tracked docs.

None of this is exotic. It is just easier to do once, at the boundary, than to remember later.

Step nine: prove it with a parity harness before you cut over

The single highest-leverage safeguard in the whole project is a script that enumerates every legacy URL from the filesystem (not from the sitemap, which is exactly where things go missing) and hits your preview deployment, asserting that each one returns either a 200 or an intended redirect. Any unintended 404 fails the cutover. Run it against the preview, fix what it finds, and only proceed when it is green.
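
The core of such a harness is a small decision function plus a loop. In this sketch the HTTP call is passed in as a function so the logic can be tested on its own; judge, runParity and the Probe shape are hypothetical names, and comparing the Location header by exact string is a simplification.

```typescript
// What one request to the preview deployment returned.
interface Probe {
  status: number;
  location?: string; // Location header, if any
}

type Verdict = "ok" | "intended-redirect" | "fail";

// intended maps a legacy path to the destination you deliberately redirect it to.
function judge(path: string, probe: Probe, intended: ReadonlyMap<string, string>): Verdict {
  if (probe.status === 200) return "ok";
  // Accept both permanent redirect codes: 301 and 308.
  const isPermanentRedirect = probe.status === 301 || probe.status === 308;
  if (isPermanentRedirect && intended.get(path) !== undefined && intended.get(path) === probe.location) {
    return "intended-redirect";
  }
  // Anything else (404, 5xx, an unplanned redirect) blocks the cutover.
  return "fail";
}

// Probe every legacy path and return the ones that fail.
async function runParity(
  paths: string[],
  fetchProbe: (path: string) => Promise<Probe>,
  intended: ReadonlyMap<string, string>,
): Promise<string[]> {
  const failures: string[] = [];
  for (const path of paths) {
    const verdict = judge(path, await fetchProbe(path), intended);
    if (verdict === "fail") failures.push(path);
  }
  return failures;
}
```

In use, fetchProbe would request the preview origin plus the path without following redirects and return the status and Location header; the script exits non-zero if runParity returns anything.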

Around that, layer the other gates:

A right-to-left and bidirectional QA matrix per template, in both languages, checking mirrored navigation, numbers and Latin terms inside translated text, and side-by-side screenshots.
An acceptance test for each interactive component.
An accessibility pass: contrast (brand colours on a warm background often fail), keyboard navigation, focus order, real alt text, and a screen-reader check in both directions.
Performance budgets as CI gates, so a heavy image or a stray client component cannot regress the very thing you migrated to gain.
A content-parity sign-off: an explicit checklist that every post renders in both locales, every widget works, and the forms submit.

Step ten: cut over slowly, with a rollback ready

The temptation, after weeks of work, is to flip DNS and celebrate. Resist it.

Cold-archive the old site first: a dated dump of the database and all media, stored somewhere you are not migrating away from. Capture the current DNS records.
Deploy to a preview URL and run the full parity and QA suite there.
Lower your DNS TTL a day or two ahead, so a rollback propagates in minutes, not hours.
Cut DNS to the new deployment, keeping the old one retrievable for instant rollback.
Re-verify your search console, submit the new sitemap, and inspect a sample of URLs.
Watch your 404 and redirect logs for a week or two.
Only then decommission the old hosting, and keep the domain registration regardless.

Write the rollback as a short runbook before you need it. You almost certainly will not, which is precisely why it should exist.

Life after WordPress

The real adoption risk is not technical; it is the editor. WordPress's admin GUI is genuinely good, and going back to files feels like a downgrade until the new workflow clicks. Smooth it: a create-article command that scaffolds the paired files and frontmatter, a check-content script that catches mistakes before they ship, auto-computed reading time and share images, and a preview deployment on every change. If a fully visual editor matters, a git-based CMS can sit on top of the same MDX files. Either way the loop becomes: write, commit, push, and the site builds itself. No local server, no crawler, no publish button.

What it actually costs

Honestly? More than the optimistic estimate in your head. For a bilingual site with interactive content, hardened forms and a proper QA pass, plan for something like thirty-five to fifty-five focused hours, spread across a few weekends. The bulk is not the framework; it is the content QC, the translation decisions, and the verification. The framework is the easy part.

If that sounds like more than you want to take on right now, there is a lighter path that captures most of the benefit for a fraction of the effort. Rather than rebuilding, you can keep WordPress as a local editor and publish it as flat static HTML, shedding the hosting bill and the attack surface without changing how you write. I wrote that approach up separately, with a small open-source tool to automate it, in "I Stopped Paying for WordPress Hosting". It makes a good first move: start there, drop the bill today, and graduate to a full Next.js rebuild later, if and when you outgrow it.

But the trade is decisive. You exchange a recurring bill and a standing attack surface for a site that is static, fast, free to host, and secure by construction, because in production there is no database, no server-side language, and no login for anyone to attack. You keep every URL, and with them every bit of authority you spent years earning. And your writing workflow becomes a text file and a git push.

That was the trade I wanted. If you hold the line on the URL contract and do the audit honestly before you write a line of code, it is a trade you can make without losing a thing.