Content Modeling Basics: Structure Your CMS Before It Structures You
What content modeling is, why skipping it hurts, and a 5-step process for a typical marketing site and blog
Every CMS project starts with the same optimistic plan: "It's just some pages and a blog." Six months later there are 80 "pages," a third of them are secretly landing pages, the pricing table lives as pasted HTML inside a rich-text field, and nobody can redesign the homepage without rewriting its content by hand.
That mess has a name: a missing content model. Content modeling is the work of deciding what your content is — which types exist, which fields they carry, how they relate to each other — before you build anything on top of it. Skip it and the CMS structures you instead: every future feature gets bent around decisions you never consciously made.
TL;DR: Content modeling means defining content types, their fields, and their relationships up front. Get it wrong and everything collapses into one "page" type with a giant rich-text blob — content you can't query, can't reuse, and can't redesign without rewriting. This guide covers the core concepts, how modeling differs between headless and traditional CMSs, and a 5-step process for a typical marketing site plus blog.
What Is Content Modeling?
Content modeling is the practice of defining your content's entities (types like post, page, author), their fields (title, summary, publish date), and their relationships (a post belongs to a category, a page references a CTA block) before building the site. It's schema design for content — the same discipline you'd apply to a database, applied one level up.
A concrete example. A blog post isn't "a page with words on it." It's a type with a shape:
    post:
      title: string           # required, plain text
      slug: string            # unique, lowercase-hyphens
      summary: text           # 1-2 sentences, used on index pages
      body: markdown          # the actual article
      categories: relation    # one or more
      author: relation        # exactly one
      published_at: datetime  # drives scheduling and sorting
Nothing exotic. But every line is a decision: the summary is a separate field (so index pages don't truncate the body mid-sentence), categories are a relation (so you can query "all posts in Headless CMS"), and published_at is a real datetime (so scheduled publishing works at all).
The model is the contract between three parties: editors who fill it in, templates that render it, and code that queries it. When the contract is explicit, all three can change independently. When it isn't, they're welded together.
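To make the contract concrete, here is a minimal sketch of the post type as a Python dataclass. The class name, the slug pattern, and the `is_live` helper are illustrative assumptions, not any CMS's actual API; the point is only that each field's rule can be checked by code instead of living in an editor's head.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
import re

# Assumed slug rule: lowercase letters and digits, separated by single hyphens.
SLUG_PATTERN = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")


@dataclass(frozen=True)
class Post:
    title: str
    slug: str
    summary: str
    body: str                      # markdown prose
    category_ids: tuple[str, ...]  # relation: one or more categories
    author_id: str                 # relation: exactly one author
    published_at: datetime

    def __post_init__(self) -> None:
        # Enforce the rules that the schema comments only describe.
        if not self.title.strip():
            raise ValueError("title is required")
        if not SLUG_PATTERN.fullmatch(self.slug):
            raise ValueError(f"slug must be lowercase-hyphenated: {self.slug!r}")
        if not self.category_ids:
            raise ValueError("a post needs at least one category")
        if not self.author_id:
            raise ValueError("a post needs exactly one author")


def is_live(post: Post, now: datetime) -> bool:
    """Scheduled publishing: a post is visible once published_at has passed."""
    return post.published_at <= now
```

Because `published_at` is a real datetime, scheduling reduces to a comparison, and a template or query can rely on every `Post` having a valid slug and at least one category.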
What Happens When You Skip It?
Skipping the model doesn't mean you have no model. It means you have an accidental one — usually a single "page" type with a title and a giant rich-text body. Every problem flows from that.
Everything becomes a blob. The pricing table, the team grid, the FAQ — all pasted into one WYSIWYG field as formatted text. The content has structure in the editor's head, but the CMS stores it as undifferentiated HTML.
Content becomes unqueryable. Want a page listing the five most recent case studies with their client logos? You can't. The client name and logo are buried somewhere inside body HTML, not sitting in fields. Your only options are manual curation or regex against markup. Both are awful.
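For contrast, here is a rough sketch of the same request when client name, logo, and date are fields. The `CaseStudy` type and its field names are hypothetical; any store that exposes these values as fields would allow the same kind of query.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CaseStudy:
    title: str
    client_name: str
    client_logo_url: str
    published_on: date


def latest_case_studies(items: list[CaseStudy], limit: int = 5) -> list[CaseStudy]:
    """Return the most recent case studies, newest first."""
    return sorted(items, key=lambda cs: cs.published_on, reverse=True)[:limit]
```

The listing page becomes a sort and a slice. No markup is parsed and nothing is curated by hand.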
Redesigns become content rewrites. This is the expensive one. When presentation lives inside the content — inline styles, hand-built HTML tables, images floated with markup — a new design means opening every page and reformatting by hand. Teams routinely burn weeks on this. With a real model, a redesign touches templates only; the content doesn't move.
Editors invent their own conventions. Without a summary field, one editor bolds the first sentence, another writes "SUMMARY:" at the top, a third does nothing. Three posts, three formats, zero consistency on index pages.
The pattern is always the same: structure that should live in the model leaks into the content, and from then on every change is a migration.
The Core Concepts
Content Types: Post, Page, Landing Page, Block
Most marketing sites need a small set of types, and they're more distinct than they look:
- Post — dated, authored, categorized, listed in reverse chronology. Lives in a feed.
- Page — timeless, hierarchical, standalone. About, Contact, Privacy.
- Landing page — conversion-focused, often section-based, frequently duplicated for campaigns, usually excluded from nav and sometimes from search.
- Block — a reusable fragment (CTA banner, testimonial, pricing table) that appears inside other types but has no URL of its own.
The classic mistake is collapsing all four into "page." Posts then need a fake date convention, landing pages pollute the sitemap, and shared CTAs get copy-pasted into twenty bodies — so updating the offer means twenty edits.
Fields vs. the Freeform Body
For every piece of information, ask one question: will anything other than this page's main template ever need it?
- The article text itself? Only the article template renders it. Body.
- The publish date? Index pages sort by it, RSS needs it, schema markup uses it. Field.
- The author? Listed on the post, on author archive pages, in structured data. Field (a relation, actually).
- The meta description? Read by the SEO layer, never rendered in the article. Field.
Fields are for anything queried, sorted, filtered, validated, or rendered in more than one place. The body is for the one thing that's genuinely freeform: the prose.
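The question can be written down as a tiny decision rule. This is only a sketch of the heuristic: the consumer names are made up for illustration, and it covers the "read by something else" test, not the validation case.

```python
def placement(consumers: set[str], main_template: str) -> str:
    """Return "field" if anything other than the main template reads the value."""
    other_consumers = consumers - {main_template}
    return "field" if other_consumers else "body"


# Hypothetical consumer names, mirroring the four examples above.
ARTICLE = "article_template"
examples = {
    "article_text": {ARTICLE},
    "publish_date": {ARTICLE, "index_page", "rss_feed", "schema_markup"},
    "author": {ARTICLE, "author_archive", "structured_data"},
    "meta_description": {"seo_layer"},
}
decisions = {name: placement(readers, ARTICLE) for name, readers in examples.items()}
```

Only the article text ends up in the body; everything else has at least one reader besides the article template.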
Structured Fields vs. a Markdown Blob — When Is Each Right?
A freeform body isn't a failure. For long-form writing it's the correct tool — articles are prose, and forcing prose into forty fields is its own pathology. The tradeoff looks like this:
| Content | Markdown body is fine | You need fields |
|---|---|---|
| Blog article prose | Yes — it's writing, not data | — |
| Publish date, author, category | — | Yes — queried and sorted |
| Pricing tiers | — | Yes — rendered in 3+ places, compared |
| Documentation page | Yes — with good heading discipline | — |
| Testimonials | — | Yes — reused across pages |
| One-off About page story | Yes | — |
The rule of thumb: prose goes in the body; data goes in fields. A markdown body with clean headings is portable, diffable, and survives redesigns untouched — one reason flat-file CMSs work as well as they do (more on that tradeoff in our comparison of flat-file vs database CMS architectures). The trouble starts only when data masquerades as prose.
Relationships: Categories, Tags, Authors
Relationships are where a model earns its keep. Three show up on nearly every site:
- Categories — broad, curated, few (5-10). One post usually belongs to 1-3. They drive index pages and site architecture.
- Tags — granular, ad hoc, many. Cross-cutting labels rather than structure. Skip them entirely if you won't maintain them; a tag cloud with 400 single-use tags helps nobody.
- Authors — a relation to a person entity, not a text field. The moment an author's name is typed as a string, you get "Jane Smith," "J. Smith," and "jane smith" as three different authors, and an author page becomes impossible.
The test for any relationship: could you build a page that lists everything on the other end of it? "All posts by this author," "all posts in this category." If the answer matters to your site, model the relation.
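A short sketch of why the author has to be a relation. The post and author records below are invented sample data; the grouping function stands in for whatever query builds an author page.

```python
from collections import defaultdict

# Authors typed as free text: three spellings of one person.
posts_with_strings = [
    {"title": "Post A", "author": "Jane Smith"},
    {"title": "Post B", "author": "J. Smith"},
    {"title": "Post C", "author": "jane smith"},
]

# Authors as a relation: every post points at one author record by id.
authors = {"jane": {"name": "Jane Smith"}}
posts_with_relation = [
    {"title": "Post A", "author_id": "jane"},
    {"title": "Post B", "author_id": "jane"},
    {"title": "Post C", "author_id": "jane"},
]


def group_by(posts: list[dict], key: str) -> dict[str, list[str]]:
    """Group post titles by the value stored under `key`."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for post in posts:
        grouped[post[key]].append(post["title"])
    return dict(grouped)


by_string = group_by(posts_with_strings, "author")       # three "authors"
by_relation = group_by(posts_with_relation, "author_id")  # one author, three posts
```

With strings, "all posts by this author" splits into three one-post groups. With the relation, the same query returns one author with all three posts.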
Reusable Blocks vs. One-Off Pages
If the same CTA banner appears on twelve pages, it should exist once and be referenced twelve times. That's a block: content with no URL of its own, designed for embedding. Change the offer, and all twelve pages update.
The counter-case is real too: don't block-ify everything. A section that appears on exactly one page, and always will, is fine living on that page. Premature abstraction in content models produces the same misery it does in code — editors hunting through a block library to change one headline.
A workable default: extract a block the second time you catch yourself copy-pasting a section, not before.
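The reference-instead-of-copy idea, as a minimal sketch. The block id, field names, and page records are assumptions for illustration; real systems store and resolve references in their own way.

```python
# One block, stored once. It has an id but no URL of its own.
blocks = {
    "cta-spring": {"headline": "Start your free trial", "button_url": "/signup"},
}

# Pages hold references to blocks, not copies of their content.
pages = [
    {"slug": "pricing", "block_ids": ["cta-spring"]},
    {"slug": "features", "block_ids": ["cta-spring"]},
]


def block_headlines(page: dict, block_store: dict[str, dict]) -> list[str]:
    """Resolve a page's block references at render time."""
    return [block_store[block_id]["headline"] for block_id in page["block_ids"]]


# Changing the offer is a single edit to the block record.
blocks["cta-spring"]["headline"] = "Get 20% off this month"
headlines = {page["slug"]: block_headlines(page, blocks) for page in pages}
```

Both pages pick up the new headline because neither ever held its own copy.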
Slugs and URLs Are Part of the Model
URL design gets treated as an afterthought, but it's a modeling decision with long consequences:
- Type-based prefixes (`/blog/{slug}`, `/docs/{slug}`) make URLs self-describing and let you reason about sections in analytics, robots rules, and redirects.
- Slug stability matters more than slug beauty. Every renamed slug is a broken inbound link unless something redirects it. This is why slug history belongs in the model: UnfoldCMS, for instance, keeps old slugs and 301-redirects them to the new one automatically, so renaming a post doesn't torch its backlinks.
- Avoid encoding volatile data in URLs. Dates in blog URLs (`/2023/05/post-title`) look organized until you update the post and the URL advertises its age forever.
Decide the URL scheme when you decide the types. Retrofitting one onto a live site is a redirect-mapping project nobody enjoys.
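Slug history can be sketched in a few lines. This is an illustrative model of the idea, not how UnfoldCMS or any other system implements it; the `SlugRegistry` class and its method names are hypothetical.

```python
from typing import Optional


class SlugRegistry:
    """Tracks current slugs and retired slugs so old URLs can redirect."""

    def __init__(self) -> None:
        self._current: dict[str, str] = {}  # current slug -> post id
        self._history: dict[str, str] = {}  # retired slug -> post id

    def assign(self, post_id: str, slug: str) -> None:
        """Give a post a slug, retiring any slug it held before."""
        if self._current.get(slug, post_id) != post_id:
            raise ValueError(f"slug already in use: {slug!r}")
        for old_slug, owner in list(self._current.items()):
            if owner == post_id and old_slug != slug:
                del self._current[old_slug]
                self._history[old_slug] = post_id
        # A slug that becomes current again stops being a redirect.
        self._history.pop(slug, None)
        self._current[slug] = post_id

    def resolve(self, slug: str) -> tuple[int, Optional[str]]:
        """Return (HTTP status, slug to serve or redirect to)."""
        if slug in self._current:
            return 200, slug
        post_id = self._history.get(slug)
        if post_id is not None:
            target = next(s for s, owner in self._current.items() if owner == post_id)
            return 301, target
        return 404, None
```

Renaming a post then means calling `assign` with the new slug. The old URL keeps resolving with a 301 instead of turning into a dead link.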
How Does Modeling Differ Across CMS Philosophies?
Different CMS families don't just implement modeling differently — they disagree about who should do it and when.
Headless, model-first. Contentful, Sanity, and their peers hand you an empty schema and refuse to do anything until you define types and fields. This is maximum flexibility: your model can match your domain exactly. It's also maximum homework — a Sanity project starts with writing schema code, and a bad model is entirely your fault. Teams who want structured content without that blank-canvas overhead are a big part of why people go looking for Contentful alternatives in the first place.
Traditional, body-first. WordPress starts from the opposite end: posts and pages with a big body field, working in five minutes. Structure arrives later, bolted on through custom post types and plugins like ACF — powerful, but the model ends up scattered across plugin configs, theme code, and editor conventions rather than declared in one place.
The middle ground. Some systems ship a fixed, opinionated set of types instead of a model builder. UnfoldCMS sits here deliberately: it ships posts, pages, landing pages, and reusable blocks, with categories and a markdown body — and no abstract content-type builder. The tradeoff is honest: you can't model "recipe" or "real-estate listing" as first-class types, but a marketing site and blog get a correct model on day one, with structure where structure pays (dates, categories, slugs, scheduling) and markdown where prose lives. The same model is exposed over a REST API (/api/v1/ for posts, pages, categories, menus, settings), so a structured front end can consume it like a headless source. If your domain genuinely needs custom entities, a schema-first system — Sanity or one of the Sanity alternatives — is the better fit. Pick the philosophy that matches how much modeling work your project actually warrants.
A 5-Step Modeling Process for a Marketing Site + Blog
You can produce a solid model for a typical company site in an afternoon. Work on a whiteboard or a text file — not in the CMS yet.
1. Inventory the real content. List every distinct thing the site must show: articles, case studies, team members, pricing tiers, legal pages, campaign landing pages. Pull from the existing site if there is one. No grouping yet, just the raw list.
2. Group into types, ruthlessly. Two items belong to the same type only if they share fields and behavior (listed together, dated together, templated together). Most marketing sites land on 4-6 types: post, page, landing page, plus one or two domain types like case study. If you have ten types, you've modeled templates, not content.
3. Define fields per type, and defend the body. For each type, list fields using the query test from earlier: queried, sorted, reused, or validated → field; prose → body. Mark each field required or optional. Every optional field is an invitation to inconsistency, so make fields required unless you have a reason not to.
4. Map relationships and URLs together. Draw the arrows: post → category, post → author, page → blocks. Then fix the URL scheme per type (`/blog/{slug}`, `/{slug}` for pages, `/lp/{slug}` excluded from nav). Decide now what happens when a slug changes.
5. Stress-test with two scenarios. First, a redesign: could you rebuild every template without editing any content? Second, a new surface: could you ship a "latest case studies" widget purely by querying? If either answer is no, structure is hiding in a body somewhere; go find it.
Then, and only then, set it up in the CMS and migrate a handful of real items as a pilot before moving everything.
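The whiteboard output can live in a text file as plain data before any CMS is involved. The sketch below writes down a URL scheme and required fields for three types. The specific required-field sets are assumptions for illustration; yours come out of the steps above.

```python
# URL pattern per content type, following the scheme from the process above.
URL_PATTERNS = {
    "post": "/blog/{slug}",
    "page": "/{slug}",
    "landing_page": "/lp/{slug}",
}

# Types that get a URL but stay out of site navigation.
EXCLUDED_FROM_NAV = {"landing_page"}

# Assumed required fields per type; adjust to your own inventory.
REQUIRED_FIELDS = {
    "post": {"title", "slug", "summary", "body", "published_at"},
    "page": {"title", "slug", "body"},
    "landing_page": {"title", "slug"},
}


def url_for(content_type: str, slug: str) -> str:
    """Build the public URL for an item of the given type."""
    return URL_PATTERNS[content_type].format(slug=slug)


def missing_required(content_type: str, item: dict) -> set[str]:
    """Return the required fields that are absent or empty on an item."""
    return {name for name in REQUIRED_FIELDS[content_type] if not item.get(name)}
```

Running `missing_required` over a handful of pilot items is a cheap way to find out whether the required/optional split survives contact with real content.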
What Are the Signs Your Content Model Is Wrong?
Broken models announce themselves through editor behavior. Editors are rational; when they abuse the system, the system is wrong.
- HTML pasted into the body. Someone needed a layout the model doesn't support, so they hand-wrote markup in a rich-text field. That content is now invisible to queries and will break in the next redesign.
- Fields used off-label. The subtitle field holds a promo code. The excerpt holds rendering instructions like "IMPORTANT: show the blue banner." Editors are smuggling data through whatever field exists because the right one doesn't.
- Clone-and-edit as a workflow. "Duplicate the spring landing page and change the dates" — fine once, but as the standard process it means shared content isn't shared, and a legal-copy update becomes a 30-page hunt.
- A "misc" type that keeps growing. When new content keeps landing in a catch-all type, the model is missing a real type your team needs.
- Numbered field names. `feature_1`, `feature_2`, `feature_3` is a list pretending to be fields. The fourth feature will arrive the week after launch.
One or two of these is normal drift — fix the field and move on. All five means the model never matched the content, and patching beats rebuilding only in the short term.
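The numbered-fields case is the easiest of the five to repair mechanically. Here is a rough sketch that folds `name_1`, `name_2`, ... into a single ordered list. The function name is hypothetical, and appending "s" to form the list name is a naive assumption you would replace with a proper field name.

```python
import re

# Matches keys like "feature_1" and captures the base name and the index.
NUMBERED = re.compile(r"^(?P<name>.+)_(?P<index>\d+)$")


def collapse_numbered_fields(item: dict) -> dict:
    """Fold numbered fields into one list per base name, ordered by index."""
    result: dict = {}
    numbered: dict[str, list[tuple[int, object]]] = {}
    for key, value in item.items():
        match = NUMBERED.match(key)
        if match:
            numbered.setdefault(match["name"], []).append((int(match["index"]), value))
        else:
            result[key] = value
    for name, entries in numbered.items():
        ordered = sorted(entries, key=lambda entry: entry[0])
        result[name + "s"] = [value for _, value in ordered]  # naive pluralization
    return result
```

Once the data is a list, the fourth feature is just another entry and needs no schema change.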
FAQ
Is content modeling worth it for a small site? Yes, but scale the effort. A five-page brochure site needs ten minutes of thought, not a workshop: separate posts from pages, keep prose in the body, give dates and categories real fields. The cost of a tiny model is near zero; the cost of an accidental one compounds.
Can I change a content model after launch? Adding types and optional fields is cheap. Splitting a blob body into structured fields after the fact is the expensive direction — it means parsing or hand-migrating existing content. That asymmetry is the whole argument for modeling first.
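To show what "parsing existing content" involves, here is a sketch that pulls a summary out of a body when an editor followed a "SUMMARY:" convention on the first non-empty line. The convention and the function name are assumptions. Bodies that used a different convention return `None`, and those are the ones that need hand-migration.

```python
import re
from typing import Optional

# Assumed editor convention: the first non-empty line starts with "SUMMARY:".
SUMMARY_PREFIX = re.compile(r"^\s*SUMMARY:\s*(?P<summary>.+?)\s*$", re.IGNORECASE)


def split_summary(body: str) -> tuple[Optional[str], str]:
    """Return (summary, remaining body), or (None, body) if no summary line is found."""
    lines = body.splitlines()
    for index, line in enumerate(lines):
        if not line.strip():
            continue  # skip leading blank lines
        match = SUMMARY_PREFIX.match(line)
        if match is None:
            return None, body  # first real line is not a summary: leave untouched
        rest = "\n".join(lines[index + 1:]).lstrip("\n")
        return match["summary"], rest
    return None, body
```

Every informal convention needs its own parser like this one, which is why this direction of change costs so much more than adding a field up front.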
How many content types is too many? There's no hard ceiling, but most marketing sites are well served by 4-6. Past ten, check whether you're modeling content (things with distinct fields and behavior) or design variations (which belong in templates, not types).
Do I need a headless CMS to get structured content? No. Structure comes from the model, not the delivery method. A traditional CMS with disciplined types and fields beats a headless CMS with one giant rich-text field. Headless only changes where rendering happens.
Whatever CMS you pick, do the modeling pass before you create your first piece of content. If your needs match the common shape (marketing pages, landing pages, a blog with categories, scheduled publishing, editor roles), UnfoldCMS gives you that model pre-built on a self-hosted Laravel + React stack with SQLite or MySQL underneath, no schema-building phase required. See how the content types fit together, or compare it against the model-first tools before you commit. Either way: decide what your content is before the CMS decides for you.