Jun 12, 2026 · 4 min read

The Complete Next.js SEO Setup: JSON-LD Graphs, llms.txt, and Everything Between

  • SEO
  • Next.js
  • JSON-LD
  • structured data
  • llms.txt
  • GEO

When I rebuilt this site, I treated SEO as a feature with an actual spec, not a checklist to sprinkle on at the end. This post documents every layer that shipped. Since search now includes AI engines that cite sources, Layer 4 covers optimizing for those too.

Layer 1: the Metadata API, with metadataBase

Next.js's Metadata API generates every tag from typed objects. The root layout establishes site-wide defaults; the one field people skip is metadataBase, which makes every relative OG/canonical URL resolve correctly:

import type { Metadata } from "next";

export const metadata: Metadata = {
  metadataBase: new URL("https://shahzaibakram.dev"),
  title: { default: "Shahzaib Akram — Senior Frontend Engineer", template: "%s — Shahzaib Akram" },
  alternates: { canonical: "/", types: { "application/rss+xml": "/feed.xml" } },
  openGraph: {
    images: [{ url: "/og.png", width: 1200, height: 630, alt: "…" }],
  },
};

Per-post pages get generateMetadata deriving title, description, article-type OG tags, and canonical from frontmatter. Width, height, and alt on the OG image aren't pedantry: link unfurlers can reserve layout from the declared dimensions before the image loads, and alt gives the card a text description.
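The per-post derivation can be sketched as a pure function. This is a minimal illustration, not the site's actual code: the PostFrontmatter fields, the /blog/<slug> route, and the name buildPostMetadata are all assumptions. The returned object follows the shape of Next.js's Metadata type, so a real generateMetadata could return it directly.

```typescript
// Hypothetical frontmatter shape; the real field names are not shown in this post.
interface PostFrontmatter {
  slug: string;
  title: string;
  description: string;
  publishedAt: string; // ISO 8601 date string
  tags: string[];
}

// Builds a plain object in the shape of Next.js's Metadata type.
// In a real page, generateMetadata would load the post and return this.
function buildPostMetadata(post: PostFrontmatter) {
  const path = `/blog/${post.slug}`; // assumed route layout
  return {
    title: post.title, // the root layout's "%s" template wraps this
    description: post.description,
    alternates: { canonical: path }, // relative, resolved against metadataBase
    openGraph: {
      type: "article" as const,
      title: post.title,
      description: post.description,
      url: path,
      publishedTime: post.publishedAt,
      tags: post.tags,
    },
  };
}
```

Keeping the URLs relative is the point of the design: the root layout's metadataBase turns them into absolute URLs, so the domain lives in exactly one place.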

Layer 2: one entity graph, not scattered schemas

Most sites paste disconnected JSON-LD blobs per page. Search engines build entity graphs, so give them one: a single @graph where every node carries an @id and references the others. On my homepage:

  • Person (#person): name, job title, knowsAbout, and crucially sameAs pointing at GitHub, LinkedIn, and npm profiles. sameAs is the property that declares those profiles to be the same person, which helps search engines reconcile you across the web into one Knowledge Graph entity.
  • WebSite: owned by #person.
  • ProfilePage: declaring the homepage is about #person.
  • ItemList of SoftwareApplication: every portfolio project as a typed entity with its category and URL, so "agy-bridge" and "Habivit" are machine-readable claims, not paragraph text.

Blog posts add BlogPosting (whose author and publisher reference the same #person by @id) plus a BreadcrumbList. The whole graph reuses identity instead of redeclaring it, and that consistency across pages plausibly works as a trust signal in its own right.
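A rough sketch of the homepage graph described above. The node types and the #person id come from this post. Everything else is an assumption for illustration: the other @id fragments, the Project shape, the homepageGraph name, and the choice of publisher to express "owned by" on WebSite. Profile URLs are passed in rather than hard-coded.

```typescript
const SITE_URL = "https://shahzaibakram.dev"; // same origin as metadataBase
const PERSON_ID = `${SITE_URL}/#person`;

// Hypothetical project shape for the sketch.
interface Project {
  name: string;
  url: string;
  category: string;
}

function homepageGraph(projects: Project[], profileUrls: string[]) {
  const person = { "@id": PERSON_ID }; // a reference, not a redeclaration
  return {
    "@context": "https://schema.org",
    "@graph": [
      {
        "@type": "Person",
        "@id": PERSON_ID,
        name: "Shahzaib Akram",
        jobTitle: "Senior Frontend Engineer",
        url: SITE_URL,
        sameAs: profileUrls,
      },
      // "publisher" is an assumed way to express ownership by #person.
      { "@type": "WebSite", "@id": `${SITE_URL}/#website`, url: SITE_URL, publisher: person },
      { "@type": "ProfilePage", "@id": `${SITE_URL}/#profilepage`, url: SITE_URL, mainEntity: person },
      {
        "@type": "ItemList",
        "@id": `${SITE_URL}/#projects`,
        itemListElement: projects.map((p, i) => ({
          "@type": "ListItem",
          position: i + 1, // schema.org list positions are 1-based
          item: {
            "@type": "SoftwareApplication",
            name: p.name,
            url: p.url,
            applicationCategory: p.category,
            author: person,
          },
        })),
      },
    ],
  };
}
```

Only the Person node carries the full description. Every other node points back to it with a bare @id object, which is what keeps the graph connected instead of duplicated.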

One security note: JSON-LD is serialized into a raw <script> tag via dangerouslySetInnerHTML. If any field can ever contain </script>, that's XSS. Escape < during serialization:

export function jsonLd(data: object): string {
  return JSON.stringify(data).replace(/</g, "\\u003c");
}
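A quick check of why this escape is safe. The sketch repeats the jsonLd helper so it runs standalone; the hostile payload is made up for the demonstration.

```typescript
function jsonLd(data: object): string {
  return JSON.stringify(data).replace(/</g, "\\u003c");
}

// A made-up field value that tries to break out of the script tag.
const hostile = { headline: "</script><script>alert(1)</script>" };
const serialized = jsonLd(hostile);

// No literal "<" survives, so the HTML parser never sees a closing tag.
const containsAngle = serialized.indexOf("<") !== -1;

// \u003c is a standard JSON string escape, so any JSON parser decodes it
// back to "<" and consumers still read the original value.
const roundTripped = JSON.parse(serialized) as { headline: string };
const unchanged = roundTripped.headline === hostile.headline;
```

The escape only changes how the bytes look to the HTML parser. The JSON value is identical, which is why structured-data consumers are unaffected.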

Layer 3: generated artifacts — sitemap, robots, RSS, manifest

Static XML files in public/ rot the moment content changes. Everything machine-readable is generated from the same data source as the pages themselves: sitemap.ts maps over the same getAllPosts() the blog renders from (a new post updates the sitemap by existing), robots.ts points at it, manifest.ts covers PWA metadata, and RSS is a route handler:

// app/feed.xml/route.ts
export const dynamic = "force-static";
export async function GET() {
  return new Response(rss(getAllPosts()), {
    headers: { "Content-Type": "application/rss+xml; charset=utf-8" },
  });
}

force-static matters on a fully-SSG site — the feed becomes a build artifact, not a lambda.
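The rss() helper isn't shown in this post, so here is one way such a function could look. It is a sketch under assumptions: the FeedPost fields, the /blog/<slug> URLs, and the names buildRss and escapeXml are all hypothetical.

```typescript
// Hypothetical post shape for the sketch.
interface FeedPost {
  title: string;
  slug: string;
  description: string;
  publishedAt: string; // ISO 8601 date string
}

// Escape the five characters that are special in XML text and attributes.
function escapeXml(s: string): string {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

function buildRss(posts: FeedPost[], site: string, title: string, description: string): string {
  const items = posts.map((p) => {
    const link = escapeXml(`${site}/blog/${p.slug}`); // assumed route layout
    return [
      "    <item>",
      `      <title>${escapeXml(p.title)}</title>`,
      `      <link>${link}</link>`,
      `      <guid isPermaLink="true">${link}</guid>`,
      `      <description>${escapeXml(p.description)}</description>`,
      // toUTCString() yields the RFC 822 style date that RSS 2.0 expects.
      `      <pubDate>${new Date(p.publishedAt).toUTCString()}</pubDate>`,
      "    </item>",
    ].join("\n");
  });
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<rss version="2.0">',
    "  <channel>",
    `    <title>${escapeXml(title)}</title>`,
    `    <link>${escapeXml(site)}</link>`,
    `    <description>${escapeXml(description)}</description>`,
    ...items,
    "  </channel>",
    "</rss>",
  ].join("\n");
}
```

Because the function only takes data and returns a string, the route handler stays a one-liner and the feed can be snapshot-tested without running Next.js.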

RSS in 2026 isn't nostalgia, by the way: feed readers are a distribution channel again, and a feed gives crawlers, AI ones included, a cheap way to discover new posts.
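The sitemap half of this layer can be sketched the same way. The entry fields (url, lastModified) match what Next.js's sitemap file convention accepts. The SitemapPost shape, the /blog paths, and the buildSitemap name are assumptions for illustration.

```typescript
// Hypothetical post shape for the sketch.
interface SitemapPost {
  slug: string;
  updatedAt: string; // ISO 8601 date string
}

interface SitemapEntry {
  url: string;
  lastModified: Date;
}

function buildSitemap(posts: SitemapPost[], site: string): SitemapEntry[] {
  // The index pages change whenever the newest post changes.
  const newest = posts.reduce(
    (max, p) => Math.max(max, new Date(p.updatedAt).getTime()),
    0,
  );
  const siteUpdated = posts.length > 0 ? new Date(newest) : new Date();

  const staticEntries: SitemapEntry[] = [
    { url: site, lastModified: siteUpdated },
    { url: `${site}/blog`, lastModified: siteUpdated }, // assumed blog index path
  ];
  const postEntries: SitemapEntry[] = posts.map((p) => ({
    url: `${site}/blog/${p.slug}`,
    lastModified: new Date(p.updatedAt),
  }));
  return staticEntries.concat(postEntries);
}
```

In the app, sitemap.ts would default-export a function that returns buildSitemap(getAllPosts(), site), so publishing a post is the only step needed to update the sitemap.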

Layer 4: GEO — the search engines that talk back

A growing share of "search" is ChatGPT, Perplexity, and AI Overviews answering directly and citing sources. Optimizing for citation is now part of the job, and it's cheap:

llms.txt is an emerging convention: a plain-markdown file at the site root that hands language models a curated summary instead of forcing them to parse your DOM (and a JS-heavy 3D site is exactly the DOM you don't want parsed). Mine describes who I am, lists every project with a one-line description and canonical link, and points to the blog and RSS feed. It is five minutes of work, and it improves the odds that an AI answer describes your work accurately, because you wrote the summary it reads.
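Like the other artifacts, llms.txt can be generated from data rather than hand-maintained. A minimal sketch, assuming the layout from the llms.txt proposal (an H1 name, a blockquote summary, then H2 sections of markdown links). The LlmsProject shape, the section titles, and the buildLlmsTxt name are hypothetical.

```typescript
// Hypothetical project shape for the sketch.
interface LlmsProject {
  name: string;
  url: string;
  summary: string; // the one-line description
}

interface LlmsTxtInput {
  name: string;
  summary: string;
  projects: LlmsProject[];
  blogUrl: string;
  feedUrl: string;
}

function buildLlmsTxt(input: LlmsTxtInput): string {
  const lines = [
    `# ${input.name}`,
    "",
    `> ${input.summary}`,
    "",
    "## Projects",
    "",
    // One bullet per project: canonical link, then the description.
    ...input.projects.map((p) => `- [${p.name}](${p.url}): ${p.summary}`),
    "",
    "## Writing",
    "",
    `- [Blog](${input.blogUrl})`,
    `- [RSS feed](${input.feedUrl})`,
  ];
  return lines.join("\n") + "\n";
}
```

Served from a static route handler, this stays in sync with the same project list the portfolio section renders from.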

The structured-data layer does double duty here: entity-linked JSON-LD is precisely the unambiguous format AI engines extract from. And robots.ts deliberately doesn't block AI crawlers — for a personal site, being in training data and citation indexes is the goal, not the threat.
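"Doesn't block AI crawlers" comes down to having no per-bot disallow rules. A sketch of what robots.ts could return, assuming the sitemap lives at /sitemap.xml. The RobotsConfig type is a local stand-in for the shape Next.js accepts from a robots file, and buildRobots is a hypothetical name.

```typescript
// Local stand-in for the object shape a Next.js robots.ts returns.
interface RobotsConfig {
  rules: { userAgent: string; allow?: string; disallow?: string }[];
  sitemap: string;
}

function buildRobots(site: string): RobotsConfig {
  return {
    // A single wildcard rule: no user agent is singled out, so AI crawlers
    // get the same access as every other bot.
    rules: [{ userAgent: "*", allow: "/" }],
    sitemap: `${site}/sitemap.xml`, // assumed sitemap location
  };
}
```

If a site did want to opt out of a specific crawler, that would be an extra entry in rules with its user agent and a disallow of "/", which is exactly what this setup leaves out.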

Layer 5: the OG image

Links get shared into Slack, X, LinkedIn — the preview card is your first impression. Rather than a design-tool export that drifts from the real design system, I rendered og.png (1200×630) from an HTML template using the site's actual tokens — same black, same hairlines, same Geist, same single purple accent — via headless Chromium. Regenerating after a redesign is rerunning a script.

The dependency order

Each layer feeds the next: real URLs and metadata make the entity graph resolvable; the graph makes sitemap entries meaningful; all three give llms.txt something to corroborate. Verification is mechanical: curl every artifact, validate the graph in Google's Rich Results Test, paste a URL into Slack and look at the card. The entire setup is maybe a day of focused work, and it compounds with every post you publish.

WRITTEN BY

Shahzaib Muhammad Akram

Senior Frontend Engineer · Cyberjaya, Malaysia