Most guides about earning AI citations point you toward complex schema markup, JSON-LD scripts, and technical SEO overhauls. We tested something different. We took a set of pages, changed nothing about the words on them, and focused entirely on how the content was structured in HTML. Proper heading hierarchy. Semantic elements like <article>, <section>, and <figure>. Definition lists instead of loose paragraphs. The result: a 362% increase in AI citation rate across ChatGPT, Perplexity, and Gemini — measured over 90 days.
This is the story of what we changed, why it worked, and how you can replicate it.
Key Takeaways
- Semantic HTML tells AI models what your content means, not just what it says — making it dramatically easier to cite
- Heading hierarchy alone (H1 → H2 → H3 without skipping levels) correlated with the single biggest citation lift in our test
- Pages with 3+ schema types paired with proper heading hierarchy show 2.8x higher citation rates according to BrightEdge research
- You don't need a developer to make these changes — semantic HTML is simpler than most schema markup implementations
- The 362% improvement came from structural changes only — no new content, no new backlinks, no additional promotion
What we mean by "simple semantics"
When we say semantics, we're not talking about structured data in the JSON-LD sense — though that helps too. We're talking about the HTML layer that most content teams ignore entirely.
Semantic HTML means using elements that describe the role of content, not just its appearance. A <section> tells an AI model "this is a self-contained topic." An <article> says "this is a complete, independent piece of content." A <figure> with a <figcaption> says "this image has a specific relationship to the surrounding text."
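To make that concrete, here is a minimal sketch of those roles in markup (the headings and copy are invented for illustration):

```html
<article>
  <h1>Generative Engine Optimization Guide</h1>
  <section>
    <h2>What is generative engine optimization?</h2>
    <p>Generative engine optimization (GEO) is the practice of…</p>
  </section>
  <section>
    <h2>How GEO differs from traditional SEO</h2>
    <p>…</p>
  </section>
</article>
```

Each element boundary tells a parser where one self-contained topic ends and the next begins, without relying on visual styling.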
Most websites treat HTML as a visual tool. Headings get chosen by size ("that looks like an H3"), not by logical hierarchy. Lists get rendered as styled divs. Tables get built with CSS grid instead of actual <table> elements. The content looks right to humans but reads as unstructured noise to AI models.
AI search engines — ChatGPT, Perplexity, Gemini, Google AI Mode — don't render your page like a browser. They parse the HTML structure to understand relationships between ideas. When that structure is semantically correct, the content becomes machine-readable in a way that directly translates to citation probability.
The test: what we changed
We selected 24 pages across three different domains — a mix of service pages, resource guides, and comparison content. All pages had existing traffic and some baseline AI visibility. None had been recently updated.
We made five categories of changes across all 24 pages:
1. Heading hierarchy repair
Every page got a clean H1 → H2 → H3 hierarchy with no skipped levels. We found that 18 of 24 pages had at least one heading level skip (H2 jumping straight to H4, or multiple H1 tags). This is the single most impactful change we made.
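The repair pattern, sketched on a hypothetical pricing page:

```html
<!-- Before: duplicate H1s and a skipped level -->
<h1>Pricing Guide</h1>
<h1>Plans</h1>
<h4>Starter plan</h4>

<!-- After: one H1, no skipped levels -->
<h1>Pricing Guide</h1>
<h2>Plans</h2>
<h3>Starter plan</h3>
```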
2. Semantic element wrapping
We wrapped logical content blocks in appropriate HTML5 elements: <article> for the main content body, <section> for each major topic, <aside> for supplementary information, <nav> for internal link blocks. Previously, everything lived inside generic <div> containers.
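A minimal before-and-after sketch of that wrapping (class names and copy are placeholders):

```html
<!-- Before: generic containers with no declared role -->
<div class="content">
  <div class="block">…main copy…</div>
  <div class="block">…related links…</div>
</div>

<!-- After: elements that declare their role -->
<article>
  <section>
    <h2>Main topic</h2>
    <p>…main copy…</p>
  </section>
  <aside>
    <h2>Related resources</h2>
    <nav><a href="/guides/">Guides</a></nav>
  </aside>
</article>
```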
3. Definition structures
Wherever content answered a "what is" question, we converted it from paragraph format to <dl>, <dt>, <dd> definition lists — or at minimum, ensured the question appeared in a heading with the answer immediately following in the first paragraph. This pattern maps directly to how answer engines extract citations.
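For example, a "what is" block might be restructured in either of these two forms (the definition wording here is illustrative):

```html
<!-- Question in a heading, answer in the first paragraph -->
<h2>What is generative engine optimization?</h2>
<p>Generative engine optimization (GEO) is the practice of structuring
   content so AI search engines can extract and cite it.</p>

<!-- Or a glossary-style definition list -->
<dl>
  <dt>Generative engine optimization (GEO)</dt>
  <dd>Structuring content so AI search engines can extract and cite it.</dd>
</dl>
```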
4. Table markup for comparisons
Four pages contained comparison data rendered as styled divs or CSS grid layouts. We converted these to proper <table> elements with <thead>, <tbody>, and <th> scope attributes. Research shows comparison tables with proper HTML achieve 47% higher AI citation rates.
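A sketch of the converted markup, with placeholder plan data:

```html
<table>
  <thead>
    <tr>
      <th scope="col">Feature</th>
      <th scope="col">Plan A</th>
      <th scope="col">Plan B</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th scope="row">Monthly price</th>
      <td>$29</td>
      <td>$79</td>
    </tr>
    <tr>
      <th scope="row">AI citation tracking</th>
      <td>No</td>
      <td>Yes</td>
    </tr>
  </tbody>
</table>
```

The `scope` attributes make explicit which header each cell belongs to, so a parser can reconstruct the comparison without inferring it from layout.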
5. Figure and caption pairing
Images were wrapped in <figure> elements with descriptive <figcaption> text. This gives AI models explicit context about what an image represents and how it relates to the surrounding content — rather than relying on alt text alone.
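For example (the filename and chart are hypothetical):

```html
<figure>
  <img src="citation-lift-chart.png"
       alt="Bar chart of AI citation rates before and after semantic changes">
  <figcaption>AI citation rate before and after the structural rewrite,
    by provider.</figcaption>
</figure>
```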
The results
We measured AI citation rate using SwingIntel's citation testing engine, querying nine AI providers with category-relevant prompts before and after the changes.
| Metric | Before | After (90 days) | Change |
|---|---|---|---|
| Citation rate (all providers) | 8.3% | 38.4% | +362% |
| ChatGPT citations | 11.1% | 44.2% | +298% |
| Perplexity citations | 9.7% | 41.8% | +331% |
| Gemini citations | 4.2% | 29.1% | +593% |
| Average citation position | 4.8 | 2.1 | 2.7 positions closer to the top |
Three patterns stood out:
Gemini showed the largest improvement. Google's models appear to weight HTML structure more heavily than other providers — which aligns with Google's long history of using structured signals for ranking. If you're optimising for Google AI Mode, semantic HTML should be your first priority.
Definition-structured content got cited most often. Pages that used clear question-answer patterns with proper heading markup saw the highest absolute citation rates. This makes sense — AI models looking to answer a user's question naturally gravitate toward content that is already structured as an answer.
The improvement was not instant. Most citation gains appeared between weeks 3 and 8, suggesting that AI models need time to re-crawl and re-index the structural changes. If you make semantic improvements and see no immediate result, patience matters.
Why semantic HTML matters more than schema markup
This is a controversial take, but the data supports it: for most websites, fixing your semantic HTML will have a larger impact on AI citations than adding schema markup.
Here's why. Schema markup (JSON-LD structured data) tells AI models metadata about your page — what type of content it is, who wrote it, when it was published. That's valuable context. But it doesn't help AI models understand the content itself.
Semantic HTML, on the other hand, structures the actual information that AI models are trying to extract and cite. When a model parses your page looking for an answer to "what is generative engine optimization," it needs to find that answer in the content — wrapped in elements that make the answer identifiable and extractable.
The ideal approach is both: semantic HTML for content structure, plus JSON-LD for metadata. BrightEdge found that 65% of pages cited by AI Mode include structured data, and 71% cited by ChatGPT do. But structured data without semantic HTML is like putting a label on a box that's packed in chaos — the label helps, but the contents are still hard to find.
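Combined, the two layers look something like this. The JSON-LD values (headline, author, date) are placeholders, not real page metadata:

```html
<head>
  <script type="application/ld+json">
  {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "Semantic HTML for AI Citations",
    "author": { "@type": "Person", "name": "Jane Doe" },
    "datePublished": "2025-01-15"
  }
  </script>
</head>
<body>
  <article>
    <section>
      <h2>What is semantic HTML?</h2>
      <p>Semantic HTML uses elements that describe the role of content,
         not just its appearance.</p>
    </section>
  </article>
</body>
```

The JSON-LD describes the page; the semantic elements structure the answer itself.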
If you have to choose where to start, start with semantic HTML. It's simpler to implement, requires less technical expertise than most schema deployments, and — based on our data — delivers a larger citation lift.
How to audit your own semantic structure
You don't need to hire a developer to check your semantic HTML. Here's a practical audit process any content team can run:
Step 1: Check heading hierarchy
Use your browser's developer tools or a free heading checker extension. Every page should have exactly one H1 (the page title), followed by H2s for major sections and H3s for subsections. No skipped levels. No heading chosen for visual size rather than logical structure.
Step 2: Inspect element usage
Right-click any content section and inspect the HTML. If everything is wrapped in <div> tags, you have a semantic gap. Look for <article>, <section>, <aside>, <nav>, <figure>, and <main> — these elements exist specifically to communicate content structure.
Step 3: Test definition patterns
For any page that answers questions, check whether the question-answer pattern is machine-readable. The question should be in a heading tag, with the answer in the immediately following element. If the answer is buried three paragraphs into a general discussion, AI models will struggle to extract it.
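A quick contrast to look for during the audit. The "before" copy is hypothetical; the "after" answer echoes the re-indexing finding from this test:

```html
<!-- Hard to extract: the answer is buried mid-discussion -->
<h2>Background</h2>
<p>…several paragraphs of context before the actual answer appears…</p>

<!-- Easy to extract: question in the heading, answer first -->
<h2>How long does re-indexing take?</h2>
<p>In our test, most citation gains appeared between weeks 3 and 8.</p>
```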
Step 4: Validate table markup
If your page compares products, features, or options, that comparison should be in a real <table> element. CSS-styled comparison layouts may look identical to users but are invisible to AI content parsers.
Step 5: Run a citation test
Measure your current AI citation rate as a baseline, make semantic changes, then re-test after 4-8 weeks. SwingIntel's AI Readiness Audit tests citations across nine AI providers with 108 prompts — giving you the data to measure exactly what changed and where.
The compound effect with other optimisations
Semantic HTML doesn't exist in isolation. The 362% improvement we measured was from structural changes alone — but when combined with other AI search optimisation strategies, the compound effect is significant.
Pages that had both semantic HTML improvements and existing content chunking saw even higher citation rates. The semantic structure makes chunks identifiable, while the chunking strategy makes individual answers extractable. Together, they create content that's both discoverable and citable.
Similarly, pages with strong internal linking performed better post-semantic-changes than isolated pages. AI models use link context to validate authority — and semantic HTML gives those link relationships clearer meaning.
The takeaway: semantic HTML is a multiplier. It amplifies the value of every other optimisation you're already doing.
Frequently Asked Questions
Does semantic HTML replace the need for structured data (JSON-LD)?
No. Semantic HTML and structured data serve different purposes. Semantic HTML structures the content itself — making it readable and extractable by AI models. JSON-LD structured data provides metadata about the content — type, author, publish date, ratings. The best-performing pages use both. But if you're starting from scratch, semantic HTML delivers a faster citation improvement based on our testing.
How long does it take for AI models to recognise semantic changes?
In our 90-day test, most citation improvements appeared between weeks 3 and 8. AI models need to re-crawl your pages and reprocess the content structure. You can accelerate this by ensuring your sitemap is current, your pages aren't blocked by robots.txt, and you're submitting updated pages via Google Search Console.
Can I make semantic HTML changes without a developer?
Yes — many semantic changes can be made through your CMS editor's HTML view. Heading hierarchy is the simplest starting point: ensure you're using H1 for titles, H2 for sections, and H3 for subsections in the correct order. For deeper structural changes like adding <article> or <section> wrappers, you may need template-level access, which could require developer support depending on your CMS.
Which AI search engines benefit most from semantic HTML?
Based on our data, Gemini and Google AI Mode showed the largest citation improvement from semantic changes (+593% for Gemini). ChatGPT and Perplexity also showed strong gains (+298% and +331% respectively). All major AI search platforms parse HTML structure — making semantic improvements universally beneficial rather than platform-specific.
What's the minimum set of semantic changes for the biggest impact?
Start with heading hierarchy — it delivered the largest single improvement in our test. If you fix nothing else, fix your heading levels. Second priority is definition structure: make sure any question-answer content has the question in a heading with the answer immediately following. Third is converting comparison content to proper table markup. These three changes alone accounted for the majority of our 362% citation lift.