The Complete Next.js SEO Setup: JSON-LD Graphs, llms.txt, and Everything Between
- SEO
- Next.js
- JSON-LD
- structured data
- llms.txt
- GEO
When I rebuilt this site, I treated SEO as a feature with an actual spec, not a checklist to sprinkle on at the end. This post documents every layer that shipped — and since search now includes AI engines that cite sources, the last section covers optimizing for those too.
Layer 1: the Metadata API, with metadataBase
Next.js's Metadata API generates every tag from typed objects. The root layout establishes site-wide defaults; the one field people skip is metadataBase, which makes every relative OG/canonical URL resolve correctly:
import type { Metadata } from "next";

// Root layout: site-wide defaults that every page inherits.
export const metadata: Metadata = {
  // Base used to resolve the relative canonical and OG URLs below.
  metadataBase: new URL("https://shahzaibakram.dev"),
  title: { default: "Shahzaib Akram — Senior Frontend Engineer", template: "%s — Shahzaib Akram" },
  alternates: { canonical: "/", types: { "application/rss+xml": "/feed.xml" } },
  openGraph: {
    images: [{ url: "/og.png", width: 1200, height: 630, alt: "…" }],
  },
};
Per-post pages get generateMetadata deriving title, description, article-type OG tags, and canonical from frontmatter. Width, height, and alt on the OG image aren't pedantry — link unfurlers reserve layout from the declared dimensions.
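A minimal sketch of that frontmatter-to-metadata mapping, under a few assumptions: the `PostFrontmatter` shape, the `/blog/<slug>` route, and the `postMetadata` helper name are all hypothetical, and the local `PostMetadata` interface stands in for the `Metadata` type from "next" so the snippet runs on its own. In a real page, `generateMetadata` would load the post and return this object.

```typescript
// Hypothetical frontmatter shape; the real fields may differ.
interface PostFrontmatter {
  slug: string;
  title: string;
  description: string;
  date: string; // ISO 8601 publication date
}

// Local stand-in for the subset of Next.js `Metadata` used here.
interface PostMetadata {
  title: string;
  description: string;
  alternates: { canonical: string };
  openGraph: {
    type: "article";
    title: string;
    description: string;
    url: string;
    publishedTime: string;
    images: { url: string; width: number; height: number; alt: string }[];
  };
}

// Maps one post's frontmatter to its metadata object.
// Relative URLs depend on metadataBase from the root layout to become absolute.
function postMetadata(post: PostFrontmatter): PostMetadata {
  const path = `/blog/${post.slug}`; // assumed route shape
  return {
    title: post.title, // the layout's title.template appends the site name
    description: post.description,
    alternates: { canonical: path },
    openGraph: {
      type: "article",
      title: post.title,
      description: post.description,
      url: path,
      publishedTime: post.date,
      // Declared again with dimensions and alt text for this page's card.
      images: [{ url: "/og.png", width: 1200, height: 630, alt: post.title }],
    },
  };
}
```

The image is declared again here because, as far as I understand the merge rules, a page-level openGraph object replaces the layout's rather than deep-merging with it.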
Layer 2: one entity graph, not scattered schemas
Most sites paste disconnected JSON-LD blobs per page. Search engines build entity graphs, so give them one: a single @graph where every node carries an @id and references the others. On my homepage:
- Person (#person): name, job title, knowsAbout, and crucially sameAs pointing at GitHub, LinkedIn, and npm profiles. sameAs is how Google reconciles you across the web into one Knowledge Graph entity.
- WebSite: owned by #person.
- ProfilePage: declaring the homepage is about #person.
- ItemList of SoftwareApplication: every portfolio project as a typed entity with its category and URL, so "agy-bridge" and "Habivit" are machine-readable claims, not paragraph text.
Blog posts add BlogPosting (whose author and publisher reference the same #person by @id) plus a BreadcrumbList. The whole graph reuses identity instead of redeclaring it — and consistency across pages is itself a trust signal.
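To make the "reference by @id" idea concrete, here is a rough sketch of how such a graph could be assembled. The helper names, the `#website` and `#profilepage` fragment IDs, the `/blog/<slug>` route, and the placeholder profile URLs are assumptions for illustration, not the site's actual values; `publisher` and `mainEntity` are one reasonable way to express "owned by" and "about".

```typescript
const SITE_URL = "https://shahzaibakram.dev";
const PERSON_ID = `${SITE_URL}/#person`;

type GraphNode = Record<string, unknown>;

// Homepage graph: one Person node, referenced everywhere else by @id only.
function homepageGraph(): { "@context": string; "@graph": GraphNode[] } {
  const person: GraphNode = {
    "@type": "Person",
    "@id": PERSON_ID,
    name: "Shahzaib Akram",
    jobTitle: "Senior Frontend Engineer",
    // Placeholder URLs; the real list points at GitHub, LinkedIn, and npm.
    sameAs: ["https://github.com/your-handle", "https://www.linkedin.com/in/your-handle"],
  };
  const website: GraphNode = {
    "@type": "WebSite",
    "@id": `${SITE_URL}/#website`,
    url: SITE_URL,
    publisher: { "@id": PERSON_ID }, // a reference, not a second Person object
  };
  const profilePage: GraphNode = {
    "@type": "ProfilePage",
    "@id": `${SITE_URL}/#profilepage`,
    url: SITE_URL,
    isPartOf: { "@id": `${SITE_URL}/#website` },
    mainEntity: { "@id": PERSON_ID },
  };
  return { "@context": "https://schema.org", "@graph": [person, website, profilePage] };
}

// Blog posts reuse the same identity instead of redeclaring the author.
function blogPostingNode(slug: string, headline: string, datePublished: string): GraphNode {
  const url = `${SITE_URL}/blog/${slug}`; // assumed route shape
  return {
    "@type": "BlogPosting",
    "@id": `${url}#article`,
    url,
    headline,
    datePublished,
    author: { "@id": PERSON_ID },
    publisher: { "@id": PERSON_ID },
  };
}
```

The Person's details live in exactly one node; every other node carries only a pointer to it, which is what keeps the identity consistent across pages.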
One security note: JSON-LD is serialized into a raw <script> tag via dangerouslySetInnerHTML. If any field can ever contain </script>, that's XSS. Escape < during serialization:
export function jsonLd(data: object): string {
  // JSON parsers decode \u003c back to "<", but the HTML parser never sees a closing tag.
  return JSON.stringify(data).replace(/</g, "\\u003c");
}
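A quick way to convince yourself the escape is both safe and lossless is to feed it a hostile value. This sketch repeats the helper so it runs standalone; the `hostile` payload is an invented example.

```typescript
function jsonLd(data: object): string {
  return JSON.stringify(data).replace(/</g, "\\u003c");
}

// Invented payload: a headline that tries to break out of the <script> tag.
const hostile = { headline: "</script><script>alert(1)</script>" };
const serialized = jsonLd(hostile);

// No literal "<" survives, so the browser cannot find a closing </script>.
console.log(serialized.includes("<")); // false

// Consumers that parse the JSON still read the original value.
const roundTripped = JSON.parse(serialized) as { headline: string };
console.log(roundTripped.headline === hostile.headline); // true
```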
Layer 3: generated artifacts — sitemap, robots, RSS, manifest
Static XML files in public/ rot the moment content changes. Everything machine-readable is generated from the same data source as the pages themselves: sitemap.ts maps over the same getAllPosts() the blog renders from (a new post updates the sitemap by existing), robots.ts points at it, manifest.ts covers PWA metadata, and RSS is a route handler:
// app/feed.xml/route.ts
export const dynamic = "force-static";

export async function GET() {
  return new Response(rss(getAllPosts()), {
    headers: { "Content-Type": "application/rss+xml; charset=utf-8" },
  });
}
force-static matters on a fully-SSG site — the feed becomes a build artifact, not a lambda.
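The `rss()` helper that the handler calls is not shown here, so the following is only a guess at its shape: a small RSS 2.0 string builder. The `FeedPost` fields, the `/blog/<slug>` route, and the channel description are assumptions. The part worth copying is the XML escaping, since titles with `&` or `<` would otherwise produce an invalid feed.

```typescript
// Hypothetical post shape; the real getAllPosts() may return more fields.
interface FeedPost {
  slug: string;
  title: string;
  description: string;
  date: string; // ISO 8601
}

// Escapes the five characters that are special in XML text and attributes.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Builds a minimal RSS 2.0 document from the same post list the blog renders.
function rss(posts: FeedPost[], site: string = "https://shahzaibakram.dev"): string {
  const items = posts
    .map((post) => {
      const url = escapeXml(`${site}/blog/${post.slug}`); // assumed route shape
      return (
        `<item><title>${escapeXml(post.title)}</title>` +
        `<link>${url}</link><guid>${url}</guid>` +
        `<description>${escapeXml(post.description)}</description>` +
        `<pubDate>${new Date(post.date).toUTCString()}</pubDate></item>`
      );
    })
    .join("");
  return (
    `<?xml version="1.0" encoding="UTF-8"?>` +
    `<rss version="2.0"><channel>` +
    `<title>Shahzaib Akram</title><link>${escapeXml(site)}</link>` +
    `<description>Posts from shahzaibakram.dev</description>` +
    `${items}</channel></rss>`
  );
}
```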
RSS in 2026 isn't nostalgia, by the way: feed readers are a distribution channel again, and several AI crawlers use feeds for discovery.
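The sitemap and robots generators from this layer could look roughly like the sketch below. The local interfaces mirror the fields of Next.js's `MetadataRoute.Sitemap` and `MetadataRoute.Robots` types so the snippet is self-contained; the `getAllPosts()` stand-in, its sample entry, and the `/blog/<slug>` route are hypothetical.

```typescript
interface PostSummary {
  slug: string;
  date: string; // ISO 8601
}
interface SitemapEntry {
  url: string;
  lastModified?: string;
}
interface RobotsFile {
  rules: { userAgent: string; allow: string };
  sitemap: string;
}

const BASE_URL = "https://shahzaibakram.dev";

// Stand-in for the real getAllPosts(); the entry is a hypothetical sample.
function getAllPosts(): PostSummary[] {
  return [{ slug: "example-post", date: "2026-01-15" }];
}

// In app/sitemap.ts this would be the default export.
function sitemap(): SitemapEntry[] {
  const posts: SitemapEntry[] = getAllPosts().map((post) => ({
    url: `${BASE_URL}/blog/${post.slug}`, // assumed route shape
    lastModified: post.date,
  }));
  return [{ url: BASE_URL }, ...posts];
}

// In app/robots.ts this would be the default export.
function robots(): RobotsFile {
  return {
    rules: { userAgent: "*", allow: "/" }, // a single allow-all rule blocks no crawler
    sitemap: `${BASE_URL}/sitemap.xml`,
  };
}
```

Because `sitemap()` maps over the same function the blog renders from, there is no second list of URLs to keep in sync.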
Layer 4: GEO — the search engines that talk back
A growing share of "search" is ChatGPT, Perplexity, and AI Overviews answering directly and citing sources. Optimizing for citation is now part of the job, and it's cheap:
llms.txt is the emerging convention — a plain-markdown file at the site root that hands language models a curated summary instead of forcing them to parse your DOM (and a JS-heavy 3D site is exactly the DOM you don't want parsed). Mine describes who I am, lists every project with a one-line description and canonical link, and points to the blog and RSS feed. Five minutes of work for a much higher chance an AI answer describes your work accurately — you wrote the summary it reads.
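The file can be handwritten, but in the spirit of Layer 3 it could also be generated from the same data as the pages. Here is a sketch of that option, assuming the commonly proposed layout (an H1 title, a blockquote summary, then H2 sections of markdown links). The `buildLlmsTxt` helper, the section names, the `/blog` path, and the example project entry are all hypothetical.

```typescript
interface LlmsLink {
  title: string;
  url: string;
  note?: string; // optional one-line description
}
interface LlmsSection {
  heading: string;
  links: LlmsLink[];
}

// Renders a plain-markdown summary: title, one-line blurb, then link sections.
function buildLlmsTxt(siteName: string, summary: string, sections: LlmsSection[]): string {
  const lines: string[] = [`# ${siteName}`, "", `> ${summary}`];
  for (const section of sections) {
    lines.push("", `## ${section.heading}`, "");
    for (const link of section.links) {
      const note = link.note ? `: ${link.note}` : "";
      lines.push(`- [${link.title}](${link.url})${note}`);
    }
  }
  return lines.join("\n") + "\n";
}

// Sample data: the project entry is a placeholder, not a real project URL.
const llmsTxt = buildLlmsTxt("Shahzaib Akram", "Senior Frontend Engineer. Portfolio, projects, and blog.", [
  {
    heading: "Projects",
    links: [{ title: "Example Project", url: "https://example.com/project", note: "One-line description." }],
  },
  {
    heading: "Writing",
    links: [
      { title: "Blog", url: "https://shahzaibakram.dev/blog" },
      { title: "RSS feed", url: "https://shahzaibakram.dev/feed.xml" },
    ],
  },
]);

console.log(llmsTxt);
```

Served from a static route handler like the RSS feed, the summary would then stay in step with the project list without anyone having to remember to update it.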
The structured-data layer does double duty here: entity-linked JSON-LD is precisely the unambiguous format AI engines extract from. And robots.ts deliberately doesn't block AI crawlers — for a personal site, being in training data and citation indexes is the goal, not the threat.
Layer 5: the OG image
Links get shared into Slack, X, LinkedIn — the preview card is your first impression. Rather than a design-tool export that drifts from the real design system, I rendered og.png (1200×630) from an HTML template using the site's actual tokens — same black, same hairlines, same Geist, same single purple accent — via headless Chromium. Regenerating after a redesign is rerunning a script.
The dependency order
Each layer feeds the next: real URLs and metadata make the entity graph resolvable; the graph makes sitemap entries meaningful; all three give llms.txt something to corroborate. Verification is mechanical: curl every artifact, validate the graph in Google's Rich Results Test, then paste a URL into Slack and look at the card. The entire setup is maybe a day of focused work, and it compounds with every post you publish.