JSON-LD Structured Data for Blogs: A Practical Guide

The schema types that matter, a worked BlogPosting example, and the mistakes that quietly invalidate your markup

July 3, 2026 · 14 min read

Google doesn't read your blog post the way a human does. It parses your HTML, guesses at the title, guesses at the author, guesses at the publish date — and when it guesses wrong, your search listing shows the wrong byline or a date from 2019. JSON-LD structured data removes the guessing. It's a small block of JSON in your page that tells search engines, in their own vocabulary, exactly what the page is.

TL;DR: JSON-LD is Google's recommended format for structured data. For a blog, four schema types do most of the work: BlogPosting (the article itself), BreadcrumbList (site hierarchy), Person/Organization (author and publisher identity), and WebSite (site-level identity). Mark up only what's visible on the page, fill the required fields, validate with the Rich Results Test, and let your CMS generate the blocks so they never drift out of sync with the content.

This guide walks through the schema types that matter for blogs, a complete worked BlogPosting example, how structured data feeds AI search, and the mistakes that quietly invalidate your markup.


What Is JSON-LD and Why Does Google Prefer It?

JSON-LD (JSON for Linked Data) is a script block — usually in the <head> — that describes your page using Schema.org vocabulary. It sits apart from your visible HTML. Google has recommended it over microdata and RDFa since 2015, and its structured data docs still call it the preferred format.

The older alternative, microdata, weaves attributes into your HTML:

<article itemscope itemtype="https://schema.org/BlogPosting">
  <h1 itemprop="headline">My Post Title</h1>
  <time itemprop="datePublished" datetime="2026-06-13">June 13, 2026</time>
</article>

That looks harmless until you redesign. Move the <time> element into a new component, forget the itemprop, and your markup silently breaks. Nobody notices for months because the page still renders fine.

JSON-LD keeps the data in one self-contained block:

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "BlogPosting",
  "headline": "My Post Title",
  "datePublished": "2026-06-13T09:00:00+00:00"
}
</script>

Three practical reasons Google and developers both prefer this:

  1. It survives redesigns. The markup doesn't depend on your DOM structure, so changing templates can't break it.
  2. It's easy to generate. A CMS can build the block from the post record — title, dates, author — without touching the theme's HTML.
  3. It's easy to test. You can lint a JSON block in CI. You can't easily lint microdata scattered across twelve partials.
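To make the third point concrete, here is a minimal sketch of a CI lint step in Python, using only the standard library. It assumes you already have the rendered HTML as a string. The names JsonLdExtractor and lint_jsonld are illustrative, and the checks are deliberately shallow: each block must parse as JSON and carry an @context.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the raw text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._in_jsonld = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def lint_jsonld(html):
    """Return a list of error strings; an empty list means every block parsed."""
    extractor = JsonLdExtractor()
    extractor.feed(html)
    errors = []
    if not extractor.blocks:
        errors.append("no application/ld+json block found")
    for index, raw in enumerate(extractor.blocks):
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as exc:
            errors.append(f"block {index}: invalid JSON ({exc.msg})")
            continue
        if isinstance(data, dict) and "@context" not in data:
            errors.append(f"block {index}: missing @context")
    return errors
```

In a CI job you would run this over each built page and fail when the list is non-empty. It says nothing about rich result eligibility; that still needs the validators covered later in this guide.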

Microdata still works and Google still parses it. But every new rich result type Google documents ships with JSON-LD examples first, and some (like certain Dataset features) are JSON-LD only. If you're writing structured data for a blog in 2026, JSON-LD is the only format worth your time.


Which Schema Types Matter for a Blog?

A blog needs four schema types on every post: BlogPosting, BreadcrumbList, and Person plus Organization (nested inside the post markup as author and publisher). A fifth, WebSite, belongs on the homepage. Everything else — FAQPage, HowTo, VideoObject — is situational.

Schema type | What it does | Where it goes
BlogPosting | Identifies the article, its dates, author, image | Every post
BreadcrumbList | Shows your site hierarchy in search results | Every post and page
Person / Organization | Establishes author and publisher identity (E-E-A-T) | Nested in BlogPosting
WebSite | Names the site, ties pages to one entity | Homepage only
FAQPage | Describes Q&A content for machines | Posts with real FAQ sections

BlogPosting vs Article

BlogPosting is a subtype of Article. Google treats them the same for article rich results, so either works. Use BlogPosting for blog content anyway — it's more precise, and precision costs nothing. Save plain Article for news or editorial pages that aren't blog posts.

BreadcrumbList

Breadcrumb markup replaces the raw URL in your search listing with a readable trail (unfoldcms.com › Blog › SEO). It's one of the easiest rich results to earn and it works on every page. Each item needs a position, name, and item URL.
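Here is a rough sketch of how a breadcrumb block could be assembled, assuming you already know the trail as an ordered list of (name, path) pairs. The build_breadcrumbs helper and the example.com paths are made up for illustration; the property names (itemListElement, position, name, item) come from the Schema.org vocabulary.

```python
import json


def build_breadcrumbs(base_url, trail):
    """Build a BreadcrumbList from (name, path) pairs ordered root-first."""
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            {
                "@type": "ListItem",
                "position": position,  # 1-based, in display order
                "name": name,
                "item": base_url.rstrip("/") + path,  # absolute URL
            }
            for position, (name, path) in enumerate(trail, start=1)
        ],
    }


breadcrumbs = build_breadcrumbs(
    "https://example.com",
    [("Home", "/"), ("Blog", "/blog/"), ("SEO", "/blog/seo/")],
)
print(json.dumps(breadcrumbs, indent=2))
```

Deriving the position from the list order means a reordered trail can't end up with duplicate or skipped positions.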

Person and Organization

These rarely stand alone on a blog. They live inside BlogPosting as the author and publisher properties. Give the author a url pointing to a real profile page and sameAs links to GitHub or LinkedIn. Google can cross-reference these to work out which real-world person the name refers to, and a clearly identified author supports the E-E-A-T qualities its guidelines describe.

WebSite — use potentialAction sparingly

WebSite markup names your site and helps Google connect every page to one entity. You'll see old tutorials adding a potentialAction with SearchAction to get a sitelinks search box — skip it. Google retired the sitelinks search box rich result in late 2024. The potentialAction block isn't harmful, but it no longer earns anything visible. Keep WebSite minimal: name, URL, publisher.

FAQPage — restricted, but not dead

Since August 2023, Google only shows FAQ rich results for "well-known, authoritative government and health websites." Your blog won't get the expandable Q&A in search listings. So why bother?

Because rich results aren't the only consumer. FAQ markup gives LLMs and AI search crawlers clean question-answer pairs to extract. A machine parsing your page doesn't have to infer where a question ends and an answer begins — the markup states it. If your post has a genuine FAQ section, the markup still earns its bytes.
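As a sketch of what that markup looks like when generated, the snippet below builds a FAQPage block from question-answer pairs. The build_faq_page helper is hypothetical; the Question, acceptedAnswer, and Answer names are standard Schema.org vocabulary. It assumes the pairs are copied from a FAQ section that is actually visible on the page.

```python
import json


def build_faq_page(pairs):
    """Build FAQPage markup from (question, answer) pairs shown on the page."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


faq = build_faq_page(
    [
        (
            "Where should the JSON-LD script go, head or body?",
            "Either works. Convention puts it in the head.",
        ),
    ]
)
print(json.dumps(faq, indent=2))
```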


A Complete BlogPosting Example

Here's a production-grade BlogPosting block — the same shape UnfoldCMS generates automatically for every post:

{
  "@context": "https://schema.org",
  "@type": "BlogPosting",
  "mainEntityOfPage": {
    "@type": "WebPage",
    "@id": "https://example.com/blog/json-ld-guide/"
  },
  "headline": "JSON-LD Structured Data for Blogs: A Practical Guide",
  "description": "How to add JSON-LD structured data to a blog: schema types, a worked BlogPosting example, validation, and common mistakes.",
  "image": [
    "https://example.com/media/json-ld-guide-16x9.webp",
    "https://example.com/media/json-ld-guide-4x3.webp",
    "https://example.com/media/json-ld-guide-1x1.webp"
  ],
  "datePublished": "2026-06-13T09:00:00+00:00",
  "dateModified": "2026-06-13T09:00:00+00:00",
  "author": {
    "@type": "Person",
    "name": "Hamed Pakdaman",
    "url": "https://example.com/about/",
    "sameAs": [
      "https://github.com/hamedpakdaman",
      "https://www.linkedin.com/in/hamedpakdaman/"
    ]
  },
  "publisher": {
    "@type": "Organization",
    "name": "UnfoldCMS",
    "logo": {
      "@type": "ImageObject",
      "url": "https://example.com/media/logo-600x60.png"
    }
  }
}

Field-by-field notes, because the details are where markup fails:

  • headline — keep it under 110 characters. Longer values get truncated in some surfaces.
  • datePublished / dateModified — full ISO 8601 with a timezone offset. A bare 2026-06-13 is valid but ambiguous; +00:00 isn't. Update dateModified only when you substantively change the post — bumping the date without changing content is a known spam pattern.
  • image — Google asks for images at least 1200px wide, ideally in three aspect ratios (16:9, 4:3, 1:1) so it can pick per surface. One good 16:9 image beats three cropped afterthoughts, but provide all three if your pipeline can.
  • author.url — point at a real author page, not the homepage. Google's docs explicitly recommend this for author disambiguation.
  • mainEntityOfPage — the canonical URL of the post. If your slugs ever change, this must follow — stale IDs here split your entity signals across two URLs.
  • publisher.logo — required for article rich results. A wide-format logo (roughly 600×60) renders best.
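Several of these notes can be checked mechanically. The sketch below is one way to do it in Python, assuming the block has already been parsed into a dict. The function name check_blog_posting and the exact wording of the messages are illustrative, and the 110-character limit is simply the figure from the headline note above.

```python
from datetime import datetime
from urllib.parse import urlparse

MAX_HEADLINE_LENGTH = 110  # figure taken from the headline note above


def check_blog_posting(block):
    """Return a list of problems found in a parsed BlogPosting dict."""
    problems = []

    headline = block.get("headline", "")
    if not headline:
        problems.append("headline is missing")
    elif len(headline) >= MAX_HEADLINE_LENGTH:
        problems.append("headline is 110 characters or longer")

    for field in ("datePublished", "dateModified"):
        value = block.get(field)
        if value is None:
            continue
        try:
            # Note: a trailing "Z" is only accepted from Python 3.11 onward.
            parsed = datetime.fromisoformat(value)
        except (TypeError, ValueError):
            problems.append(f"{field} is not ISO 8601")
            continue
        if parsed.tzinfo is None:
            problems.append(f"{field} has no timezone offset")

    images = block.get("image", [])
    if isinstance(images, str):
        images = [images]
    for url in images:
        parts = urlparse(url)
        if not (parts.scheme and parts.netloc):
            problems.append(f"image URL is not absolute: {url}")

    author = block.get("author", {})
    if isinstance(author, dict) and "url" not in author:
        problems.append("author.url is missing")

    return problems
```

This only covers what can be judged from the JSON alone. Image dimensions and whether author.url points at a real author page still need a human or a crawler.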

How Structured Data Feeds AI Search

AI search systems (Google's AI Overviews, ChatGPT's browsing, Perplexity) parse structured data the same way classic crawlers do, and they reward it the same way: with accurate citations. When an LLM-backed engine decides whether to cite your post, explicit machine-readable facts beat inferred ones. A datePublished field is a fact. A date string buried in a byline div is a guess.

This matters more than the classic rich-result payoff. Rich results are a CTR play; AI extraction is a citation play. An AI Overview that names your article, your author, and a correct date is pulling from your markup. An AI answer that misattributes your content probably had nothing structured to work with.

Three things structured data gives AI systems that plain HTML doesn't:

  1. Entity resolution. sameAs links and Organization markup let a model connect your author name to a real person across the web, instead of treating "Hamed Pakdaman" as an ambiguous string.
  2. Freshness signals. dateModified tells a retrieval system your content is maintained. Models weight recency when picking sources.
  3. Clean Q&A pairs. FAQPage markup hands extraction-ready answers to systems that quote them.

JSON-LD is one layer of a broader machine-readability stack. The other emerging layer is llms.txt — a plain-text index that tells AI crawlers what your site contains and where the canonical content lives. We covered it in detail in what llms.txt means for CMS SEO. UnfoldCMS generates both automatically: JSON-LD per post and page, plus a site-level llms.txt — no plugin, no template edits.


How Do You Validate JSON-LD?

Use two validators, because they answer two different questions. The Rich Results Test answers "is this eligible for Google rich results?" The Schema.org validator answers "is this valid Schema.org markup?" A block can pass one and fail the other.

The workflow:

  1. Paste the live URL into the Rich Results Test. Test the URL, not pasted code — this catches rendering issues where your JSON-LD never makes it into the served HTML.
  2. Fix errors first, then warnings. Errors make the page ineligible for rich results. Warnings are missing recommended fields — fix the cheap ones (dateModified, image).
  3. Run the same URL through the Schema.org validator. It catches vocabulary mistakes the Rich Results Test ignores — typo'd property names, wrong types — because Google's tool only checks properties it consumes.
  4. Watch Search Console. The Enhancements section reports structured data errors found during real crawls, across the whole site. A template bug that breaks markup on 200 posts shows up here first.

One habit worth building: validate after every template change, not just when you first add markup. Structured data fails silently — the page looks identical with or without a working block.


Common JSON-LD Mistakes (and How to Avoid Them)

The mistakes that get markup ignored — or penalized — fall into three groups: describing content that isn't on the page, omitting required fields, and shipping duplicate conflicting blocks.

Marking up content that isn't visible

Google's structured data policy is blunt: markup must describe content the user can see on the page. Adding FAQPage markup for questions that don't appear in the article, or an aggregateRating with no visible reviews, violates the guidelines. This is the one structured data offense that can draw a manual action — a human reviewer at Google flags your site and your rich results disappear sitewide.

Missing required fields

Every rich result type has required and recommended properties, and they're not interchangeable. For articles, a missing headline or image makes the page ineligible. Missing dateModified just costs you a recommended field. The Rich Results Test labels each gap, but only if you run it — most broken markup in the wild has simply never been tested.

Duplicate, conflicting blocks

The classic WordPress failure: the theme emits an Article block, an SEO plugin emits another, and they disagree on the publish date. Google's behavior with conflicting markup is undefined — sometimes it picks one, sometimes it discards both. View source on your rendered page and count the application/ld+json blocks. If two describe the same entity with different values, that's a bug.
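Counting blocks by hand works for one page; for a whole site a small script is easier. The sketch below assumes you have already parsed every JSON-LD block on a page into a list of dicts, and it groups them by @type as a simplification (a theme emitting Article and a plugin emitting BlogPosting for the same post would need a broader grouping). The function name find_conflicts and the default field list are illustrative.

```python
from collections import defaultdict


def find_conflicts(blocks, fields=("headline", "datePublished", "dateModified")):
    """Group parsed JSON-LD dicts by @type and report fields whose values disagree."""
    by_type = defaultdict(list)
    for block in blocks:
        by_type[block.get("@type")].append(block)

    conflicts = []
    for schema_type, group in by_type.items():
        if len(group) < 2:
            continue  # a single block of this type cannot conflict with itself
        for field in fields:
            values = {item[field] for item in group if field in item}
            if len(values) > 1:
                conflicts.append((schema_type, field, sorted(values)))
    return conflicts
```

An empty result means no two blocks of the same type disagree on the checked fields. It does not mean the duplicate blocks are harmless, so it is still worth removing one of them.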

Smaller ones that still bite

  • Dates without timezones: 2026-06-13 parses, but ambiguously. Always include the offset.
  • Relative image URLs: image and logo must be absolute URLs.
  • Markup pointing at redirected URLs: if a slug changes, mainEntityOfPage has to change with it. (UnfoldCMS keeps slug history and 301s old URLs automatically, which protects the page, but the markup should still carry the current canonical.)
  • Escaping bugs: a quote character in a post title that isn't JSON-escaped invalidates the whole block. Hand-built string concatenation causes this; serializing from a data structure doesn't.
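The escaping bug is easy to reproduce. In this small Python demonstration the post title is invented for the example; the same block is built once by string concatenation and once with json.dumps, and each result is checked for validity.

```python
import json

# Hypothetical post title containing double quotes.
title = 'Why "Headless" Is Not a Strategy'

# Hand-built concatenation: the inner quotes end the JSON string early.
concatenated = '{"@type": "BlogPosting", "headline": "' + title + '"}'

# Serializing from a data structure escapes the quotes for you.
serialized = json.dumps({"@type": "BlogPosting", "headline": title})


def is_valid_json(text):
    """Return True when the text parses as JSON."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


print(is_valid_json(concatenated))  # False
print(is_valid_json(serialized))    # True
```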

Should You Hand-Code JSON-LD or Let Your CMS Generate It?

Hand-code it once to understand it. Then automate it, because hand-maintained markup drifts. The failure mode isn't writing bad JSON-LD on day one — it's the post you update eight months later where dateModified stays frozen, or the author page that moves while forty posts still point at the old URL.

A static site with twelve pages can hand-maintain markup fine. A blog publishing weekly can't. The structured data has to come from the same database record as the visible content — same title, same dates, same author — or the two will disagree eventually, and disagreement is worse than absence.

What template-level generation should give you:

  • BlogPosting built from the post record: title, description, dates, author, featured image
  • BreadcrumbList built from the URL hierarchy
  • Organization publisher data set once, sitewide
  • Proper JSON serialization, so a stray quote in a title can't break the block
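As a rough illustration of that list, here is what template-level generation might look like in Python. The Post dataclass and its field names are hypothetical and are not UnfoldCMS's actual schema; the point is only that every value in the block comes from the same record that renders the page, and that the serializer, not a template string, produces the JSON.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Post:
    """Hypothetical post record; field names are illustrative only."""
    title: str
    description: str
    url: str
    image_url: str
    author_name: str
    author_url: str
    published_at: datetime  # timezone-aware
    updated_at: datetime    # timezone-aware


def render_blog_posting(post, publisher_name):
    """Render a BlogPosting script tag from a post record."""
    block = {
        "@context": "https://schema.org",
        "@type": "BlogPosting",
        "mainEntityOfPage": {"@type": "WebPage", "@id": post.url},
        "headline": post.title,
        "description": post.description,
        "image": [post.image_url],
        "datePublished": post.published_at.isoformat(),
        "dateModified": post.updated_at.isoformat(),
        "author": {"@type": "Person", "name": post.author_name, "url": post.author_url},
        "publisher": {"@type": "Organization", "name": publisher_name},
    }
    # "<\/" is a valid JSON escape for "</", so a title containing
    # "</script>" cannot close the tag early.
    payload = json.dumps(block, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'
```

Because dateModified is read from updated_at, editing the post is what moves the date; nobody has to remember to change the markup.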

This is the standard we apply to UnfoldCMS — structured data is emitted automatically for every post and page, alongside the XML sitemap and llms.txt, with zero per-post effort. Structured data is one item on a longer list; our CMS SEO checklist covers the other nine features worth demanding before you commit to a platform. And if you're evaluating headless options where JSON-LD becomes your frontend's job entirely, our Sanity alternatives breakdown looks at which platforms leave that work on your plate.

The honest trade-off: automated markup is generic markup. A CMS won't add VideoObject for your embedded screencast or HowTo steps for your tutorial — those still need a developer. Automation should cover the 95% baseline so your hands-on time goes to the 5% that's actually custom.


FAQ

Does JSON-LD structured data improve rankings?

Not directly — Google has stated structured data is not a ranking factor. It makes pages eligible for rich results and helps machines understand content, which improves click-through rate and citation accuracy. The ranking benefit is indirect but real: better SERP presentation earns more clicks at the same position.

Where should the JSON-LD script go — head or body?

Either works. Google parses application/ld+json blocks anywhere in the document, including ones injected by JavaScript (though server-rendered is safer). Convention puts it in the <head> so it's easy to find and audit.

Should I use BlogPosting or Article for blog posts?

Use BlogPosting. It's a subtype of Article, so it inherits everything and Google treats both identically for article rich results. The more specific type costs nothing and describes the content more precisely.

Can a page have multiple JSON-LD blocks?

Yes — a post typically carries BlogPosting and BreadcrumbList as separate blocks, or combined in one block using @graph. What you can't have is two blocks describing the same entity with conflicting values, like two BlogPosting blocks with different dates.
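For reference, a combined block might look like the sketch below, built as a Python dict and serialized. The values are placeholders borrowed from the examples in this guide, and the node contents are trimmed to keep it short.

```python
import json

page_markup = {
    "@context": "https://schema.org",  # declared once for every node in the graph
    "@graph": [
        {
            "@type": "BlogPosting",
            "headline": "JSON-LD Structured Data for Blogs: A Practical Guide",
            "datePublished": "2026-06-13T09:00:00+00:00",
        },
        {
            "@type": "BreadcrumbList",
            "itemListElement": [
                {"@type": "ListItem", "position": 1, "name": "Blog",
                 "item": "https://example.com/blog/"},
                {"@type": "ListItem", "position": 2, "name": "SEO",
                 "item": "https://example.com/blog/seo/"},
            ],
        },
    ],
}

print(json.dumps(page_markup, indent=2))
```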

Is FAQPage schema still worth adding after the 2023 restriction?

For rich results, no — Google limits FAQ rich results to government and health sites. For machine readability, yes: LLMs and AI search crawlers extract clean question-answer pairs from the markup. Add it when the page has a real, visible FAQ section; skip it otherwise.


Sources: Google Search Central structured data documentation (developers.google.com), Schema.org vocabulary reference, Google's August 2023 FAQ rich results update announcement, and our own markup running in production on this site — view source on this page to see the generated BlogPosting block.

UnfoldCMS ships JSON-LD, XML sitemaps, llms.txt, and slug-history redirects out of the box — self-hosted Laravel + React, runs on shared hosting. See how the SEO stack works or try the live demo.
