Why AI Systems Prefer Structured Content
Large language models and AI search engines don't read web pages the way humans do. They parse HTML, extract text, and attempt to infer meaning from structure and context. Structured data in JSON-LD format removes the inference step — it explicitly declares what something is, who made it, when it was published, and how it relates to other entities.
This is why Schema.org markup is increasingly important not just for Google rich results, but for AI citation and attribution. When Perplexity or ChatGPT with browsing encounters a page with clean Article, Person, and Organization schema, it can confidently attribute the content to a named author at a named organization — rather than treating the content as anonymous text from an unknown source.
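To make the "no inference step" point concrete, here is a minimal sketch of how a simple crawler might read JSON-LD out of raw HTML. The function name `extractJsonLd` and the sample markup are illustrative assumptions, not the method any particular AI system is known to use, and a regex stands in for a real HTML parser to keep the example short.

```typescript
// Minimal sketch: collect JSON-LD objects from raw HTML.
// Assumption: a regex is good enough for illustration; real crawlers use an HTML parser.

type JsonLdNode = Record<string, unknown>;

function extractJsonLd(html: string): JsonLdNode[] {
  const pattern =
    /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  const nodes: JsonLdNode[] = [];
  let match: RegExpExecArray | null;

  while ((match = pattern.exec(html)) !== null) {
    try {
      const parsed: unknown = JSON.parse(match[1] ?? "");
      // A block may hold a single object or an array of objects.
      const items: unknown[] = Array.isArray(parsed) ? parsed : [parsed];
      for (const item of items) {
        if (item !== null && typeof item === "object") {
          nodes.push(item as JsonLdNode);
        }
      }
    } catch {
      // Skip blocks that are not valid JSON instead of failing the whole page.
    }
  }
  return nodes;
}

// Hypothetical page used only for this example.
const sampleHtml = `
<html><head>
<script type="application/ld+json">
{"@context":"https://schema.org","@type":"Article","headline":"Hello","author":{"@type":"Person","name":"Author Name"}}
</script>
</head><body><p>Hello</p></body></html>`;

const extractedNodes = extractJsonLd(sampleHtml);
console.log(extractedNodes[0]["@type"], extractedNodes[0]["headline"]);
```

The type, headline, and author arrive as labeled fields, so nothing has to be guessed from the surrounding markup.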
Schema Types AI Systems Use Most
Article: Declares content as editorial, with headline, author, datePublished, dateModified, and publisher. This is the minimum for any blog post or guide.
Person: Defines the author entity with name, jobTitle, url, and sameAs links to their LinkedIn, GitHub, or other authoritative profiles.
Organization: Defines the publishing organization with name, url, logo, and sameAs links to official social profiles.
Product: For product pages, explicitly declares pricing, availability, and ratings. AI systems can surface this data accurately in comparison queries.
FAQPage: Well suited to AI systems because it explicitly encodes questions and answers in a machine-readable format, which AI Overviews and similar features may draw on. Google has limited FAQ rich results to a small set of government and health sites since 2023, so treat this as a machine-readability aid rather than a route to rich results.
HowTo: Step-by-step instructions with explicit step objects, which let AI systems reference specific steps in answers. Google retired HowTo rich results in 2023, but the type remains valid Schema.org vocabulary.
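As an illustration of the FAQPage shape described above, the following sketch builds the schema object from a plain list of questions and answers. The `FaqItem` interface, the `buildFaqSchema` helper, and the sample questions are assumptions made for this example. The `mainEntity`, `Question`, `acceptedAnswer`, and `Answer` names come from Schema.org.

```typescript
// Hypothetical input shape for this example.
interface FaqItem {
  question: string;
  answer: string;
}

// Build a Schema.org FAQPage object from question/answer pairs.
function buildFaqSchema(items: FaqItem[]) {
  return {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    mainEntity: items.map((item) => ({
      "@type": "Question",
      name: item.question,
      acceptedAnswer: {
        "@type": "Answer",
        text: item.answer,
      },
    })),
  };
}

const faqSchema = buildFaqSchema([
  {
    question: "What is JSON-LD?",
    answer: "A JSON-based format for embedding structured data in a page.",
  },
  {
    question: "Where does the schema go?",
    answer: "In a script tag with type application/ld+json.",
  },
]);

console.log(JSON.stringify(faqSchema, null, 2));
```

The resulting object can be serialized into a `script` tag of type `application/ld+json` in the same way as the Article schema.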
Adding JSON-LD in Next.js App Router
Place schema in your page component or layout. For a blog post:
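// app/blog/[slug]/page.tsx
// getPost is a hypothetical data-access helper; substitute your own CMS or database call.
import { getPost } from "@/lib/posts";

export default async function BlogPost({
  params,
}: {
  params: Promise<{ slug: string }>;
}) {
  // App Router pages receive route params rather than the post itself.
  // params is a Promise in Next.js 15; awaiting a plain object in earlier versions is harmless.
  const { slug } = await params;
  const post = await getPost(slug);

  const schema = {
    "@context": "https://schema.org",
    "@type": "Article",
    "@id": `https://yourdomain.com/blog/${post.slug}#article`,
    "headline": post.title,
    "datePublished": post.publishedAt,
    "dateModified": post.updatedAt,
    "author": {
      "@type": "Person",
      "@id": "https://yourdomain.com/team/author-name#person",
      "name": "Author Name",
      "jobTitle": "CEO & ML Engineer",
      "url": "https://yourdomain.com/team/author-name"
    },
    "publisher": {
      "@type": "Organization",
      "@id": "https://yourdomain.com#organization",
      "name": "Your Company",
      "url": "https://yourdomain.com"
    }
  };

  // Escape "<" so a field containing "</script>" cannot close the tag early.
  const json = JSON.stringify(schema).replace(/</g, "\\u003c");

  return (
    <>
      <script
        type="application/ld+json"
        dangerouslySetInnerHTML={{ __html: json }}
      />
      {/* page content */}
    </>
  );
}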
The @id Pattern for Entity Disambiguation
The @id property is a URL that uniquely identifies an entity. When you use the same @id across multiple pages, AI systems and Google's Knowledge Graph can recognize that references to https://yourdomain.com/team/author-name#person on different pages all refer to the same Person entity.
This builds a coherent entity graph for your site — Google understands that multiple articles are written by the same person, and that person is associated with your organization.
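One way to keep @id values consistent is to generate them from a single set of helpers and reference entities by @id instead of repeating them. The sketch below assumes the placeholder domain used earlier; the helper names (`personId`, `buildArticleGraph`) and the `worksFor` link are illustrative choices, not a required structure.

```typescript
// Assumed placeholder values for this example.
const SITE_URL = "https://yourdomain.com";
const ORG_ID = `${SITE_URL}#organization`;

type GraphNode = { "@type": string; "@id": string; [key: string]: unknown };

// Every page derives the author's @id from the same function, so it never drifts.
function personId(authorSlug: string): string {
  return `${SITE_URL}/team/${authorSlug}#person`;
}

function buildArticleGraph(input: {
  slug: string;
  title: string;
  authorSlug: string;
  authorName: string;
}) {
  const graph: GraphNode[] = [
    { "@type": "Organization", "@id": ORG_ID, name: "Your Company", url: SITE_URL },
    {
      "@type": "Person",
      "@id": personId(input.authorSlug),
      name: input.authorName,
      worksFor: { "@id": ORG_ID }, // reference by @id, not a second copy
    },
    {
      "@type": "Article",
      "@id": `${SITE_URL}/blog/${input.slug}#article`,
      headline: input.title,
      author: { "@id": personId(input.authorSlug) },
      publisher: { "@id": ORG_ID },
    },
  ];
  return { "@context": "https://schema.org", "@graph": graph };
}

const firstGraph = buildArticleGraph({
  slug: "first-post",
  title: "First Post",
  authorSlug: "author-name",
  authorName: "Author Name",
});
const secondGraph = buildArticleGraph({
  slug: "second-post",
  title: "Second Post",
  authorSlug: "author-name",
  authorName: "Author Name",
});

const firstPerson = firstGraph["@graph"].find((n) => n["@type"] === "Person");
const secondPerson = secondGraph["@graph"].find((n) => n["@type"] === "Person");
console.log(firstPerson?.["@id"] === secondPerson?.["@id"]);
```

Both articles point at the same Person and Organization identifiers, which is what lets a consumer merge them into one entity.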
How Perplexity Uses Structured Data
Perplexity's citation cards show the source domain, page title, and sometimes the author name. Pages with Article and Person schema give Perplexity an explicit author to attribute, while pages without schema are more likely to be attributed by domain only. Named author attribution builds trust and brand recognition with Perplexity users.
Future: AI Agents and the Machine-Readable Web
Schema.org's roadmap includes vocabulary for AI agents — types and properties that describe what services a site offers, what APIs are available, and what data can be accessed programmatically. As AI agents become capable of taking actions on the web (booking, purchasing, searching), sites with rich structured data will be more agent-accessible than those without.
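Schema.org already includes an Action vocabulary that hints at what this could look like. The sketch below declares a site search capability using `potentialAction` with a `SearchAction`. The `/search` path, the `q` parameter, and the helper name `buildSiteSearchSchema` are assumptions for illustration; this is an example of existing vocabulary, not of any future agent-specific types.

```typescript
// Describe a site's search endpoint as a machine-readable action.
// Assumption: the site exposes search at `${siteUrl}${searchPath}?q=...`.
function buildSiteSearchSchema(siteUrl: string, searchPath: string) {
  return {
    "@context": "https://schema.org",
    "@type": "WebSite",
    "@id": `${siteUrl}#website`,
    url: siteUrl,
    potentialAction: {
      "@type": "SearchAction",
      target: {
        "@type": "EntryPoint",
        // {search_term_string} is a placeholder the consumer fills in.
        urlTemplate: `${siteUrl}${searchPath}?q={search_term_string}`,
      },
      "query-input": "required name=search_term_string",
    },
  };
}

const siteSearchSchema = buildSiteSearchSchema("https://yourdomain.com", "/search");
console.log(JSON.stringify(siteSearchSchema, null, 2));
```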
Links: Schema.org | JSON-LD specification | Google Rich Results Test