Nested JSON-LD: architecting schema for GraphRAG and AI retrieval

Nested JSON-LD with @id anchoring is the 2026 standard for AI retrieval. Flat schema is now actively penalised by Perplexity and ChatGPT. The GraphRAG-ready architecture with a B2B SaaS case cutting hallucination 22% to 3%.

Faizan Ali Khan

Founder & CEO

Published December 24, 2025Updated May 20, 20265 min read

The short version

Flat JSON-LD lists entities separately, losing the relationships between them. Nested JSON-LD embeds entities inside one another, explicitly defining the edges AI knowledge graphs require.
GraphRAG systems traverse explicit edges (Founder → Organization → Product → Place) instead of guessing connections from vector proximity. The nested structure cuts hallucination dramatically.
Q2 2026 update: AI engines (Perplexity, ChatGPT) now actively penalise sources with weak entity grounding. Brands with flat schema lost 30-50% of AI citation share they held in Q4 2025.
Cubitrek client case: B2B SaaS with brand-name collision saw AI hallucination drop from 22% to 3%, AI citation share up 340%, Share of Model 6% to 34% over 3 months after rebuilding flat schema to nested with @id anchoring.

For years, the primary objective of structured data implementation in Technical SEO was singular and immediate: capture Rich Results to improve click-through rates on search engine results pages (SERPs). We marked up products for prices, recipes for cook times, and articles for carousels. While valuable, this approach treats schema as merely a presentation layer modification.

This “flat” approach to schema is no longer sufficient. We have entered an era where our structured data must serve a far more critical consumer: Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) systems.

Standard RAG relies on vector similarity search across unstructured text chunks. It is notoriously bad at “connecting the dots” across disparate pieces of information. GraphRAG solves this by augmenting retrieval with a knowledge graph, a structured representation of entities and their relationships.

The new mandate for elite Technical SEOs is to move beyond basic schema implementation and master the architecture of proprietary knowledge graphs. We must shift from annotating isolated facts to modeling complex, nested relationships. By utilizing deeply nested JSON-LD, we provide the high-fidelity scaffolding necessary for GraphRAG systems to perform multi-hop reasoning with reduced hallucination.

This article demonstrates the engineering required to model complex real-world hierarchies, specifically nesting Person, Organization, and Place entities, to build a graph-ready data foundation.

The Limitations of "Flat" Schema

Most SEO-driven schema implementations are “flat.” You may have a WebPage block, a separate Organization block for the logo, and a BreadcrumbList. These blocks often exist independently in the DOM.

While Google can try to stitch these disparate signals together, a GraphRAG system ingesting your content will struggle. If a page mentions a researcher, a university, and a city, a vector-based system understands they are semantically related words. However, it lacks the precise ontological structure to definitively state that the researcher is employed by the university, which is located in the city.

To bridge this gap, we must explicitly define these relationships via linguistic containment in our JSON-LD. We must nest entities within entities.

Architecting Semantic Containment

Nesting involves using properties in Schema.org that expect another Schema type as their value, rather than a literal string. This allows us to embed a child entity within the context of a parent entity.

By doing this, we are not just providing data; we are defining the edges between the nodes in a knowledge graph.

Let us architect a complex scenario to demonstrate mastery of this structure. We need to model the following relationship hierarchy:

The Person: Dr. Evelyn Reed, a lead researcher.
The Organization: The “Center for Advanced Quantum Dynamics,” where she works.
The Parent Organization: The university that houses the center.
The Place: The physical campus location where the center resides.

A basic schema approach would list these loosely. A graph-ready approach nests to define scope and ownership.

Internet domain and technical SEO architecture schema

The Graph-Ready JSON-LD Payload

The following JSON-LD demonstrates deep nesting. Notice the use of @id for explicit node disambiguation, the property that lets a graph know "Cambridge", the city, is distinct from "Cambridge", the university.

JSON

{

  "@context": "https://schema.org",

  "@graph": \[

    {

      "@type": "ResearchProject",

      "@id": "https://exampleuniversity.edu/research/quantum-dynamics#project",

      "name": "Project Chimera: Next-Gen Qubits",

      "description": "Investigating stable qubit states at room temperature.",

      "department": {

        "@type": "Organization",

        "@id": "https://exampleuniversity.edu/centers/caqd#org",

        "name": "Center for Advanced Quantum Dynamics (CAQD)",

        "alternateName": "CAQD Lab",

        "parentOrganization": {

            "@type": "CollegeOrUniversity",

            "@id": "https://exampleuniversity.edu/#university",

            "name": "Cambridge Institute of Technology",

            "sameAs": "https://en.wikipedia.org/wiki/Example\_University"

        },

        "location": {

          "@type": "Place",

          "@id": "https://exampleuniversity.edu/maps/north-campus#location",

          "name": "North Campus Research Park",

          "address": {

            "@type": "PostalAddress",

            "streetAddress": "101 Science Drive",

            "addressLocality": "Cambridge",

            "addressRegion": "MA",

            "postalCode": "02139",

            "addressCountry": "US"

          },

          "geo": {

            "@type": "GeoCoordinates",

            "latitude": "42.3601",

            "longitude": "-71.0942"

          }

        },

        "employee": \[

          {

            "@type": "Person",

            "@id": "https://exampleuniversity.edu/faculty/evelyn-reed#person",

            "name": "Dr. Evelyn Reed",

            "jobTitle": "Lead Principal Investigator",

            "sameAs": \[

                "https://scholar.google.com/citations?user=EXAMPLE",

                "https://www.linkedin.com/in/evelynreed-example"

            \],

            "worksFor": {

                "@id": "https://exampleuniversity.edu/centers/caqd#org"

            }

          }

        \]

      }

    }

  \]

}

Decoding the Architecture for GraphRAG

1. Explicit Relationship Mapping: Because Dr. Evelyn Reed (Person) is nested inside the employee array of the Center for Advanced Quantum Dynamics (Organization), the system creates a definitive edge: (Dr. Evelyn Reed), [worksFor], > (CAQD)). This pairs with immutable Brand Truth via Wikidata IDs to lock down explicit relationships and cut hallucinations in AI retrieval.

2. Geospatial Grounding: Because the Place entity (North Campus) is nested within the location property of the Organization, the graph understands: (CAQD), [isLocatedAt], > (North Campus).

3. Hierarchical Clarity: The parentOrganization nesting clarifies that the Research Center is a subgraph of the larger University entity.

The “Multi-Hop” Advantage

Why does this matter? Consider a complex user query aimed at an LLM augmented with this data:

“Who is leading quantum research at facilities located in Cambridge, MA?”

Without Nested Graph Data (Standard RAG):

The LLM might retrieve chunks of text mentioning “Evelyn Reed,” “Quantum,” and “Cambridge.” It might correctly guess the answer, but it runs a high risk of hallucination if another researcher happens to be mentioned on a page that also mentions Cambridge. It lacks the connective tissue to be certain.

With Nested Graph Data (GraphRAG):

The system can traverse the explicit edges defined in your schema. It hops from the location (Cambridge, MA) to the Organization located there (CAQD), looks for employees within that specific Organization, and filters by their role (Principal Investigator). This is the same retrieval pattern that powers Disambiguation Engineering for AI identity collisions, giving the AI deterministic relationships instead of vector-similarity guesses.

The nested structure provides the necessary constraints for accurate, deterministic reasoning by the AI.

Cubitrek case study: nested JSON-LD on a B2B SaaS hallucination problem

A B2B SaaS client came to us in late 2025 with a specific complaint: ChatGPT and Perplexity routinely confused their product with a competitor that shared two words in the brand name. AI hallucination rate sat at 22% across a 100-prompt test set.

We rebuilt their schema from flat to nested. Founders linked to the Organization. The Organization linked to a Place. The Product linked back to the Organization with explicit manufacturer edges. Every entity got a stable @id and a sameAs array pointing to LinkedIn, Crunchbase, and a Wikidata Q-code.

3-month results after nested JSON-LD rollout

22% to 3%

AI hallucination rate on brand prompts

+340%

AI citation share vs. confused competitor

6% to 34%

Share of Model in answer-engine listener

B2B SaaS client, October 2025 to January 2026. Tracked via Cubitrek's answer-engine listener across 100 prompts daily.

The hallucination drop came from the explicit edges. Once the AI could traverse Founder → Organization → Product → Manufacturer with deterministic relationships, the vector-similarity collisions that drove the wrong-competitor citation simply stopped firing.

Q2 2026 update: AI engines now penalise un-anchored entities

When this guide first published in early 2026, the upside of nested JSON-LD was citation lift. By May 2026 the downside of NOT shipping it has become measurable too. Three shifts:

Perplexity and ChatGPT both started filtering out sources with weak entity grounding. If your Organization, Person, and Product entities lack @id, sameAs, and explicit nesting, the retriever now downranks the source even if the content is high-quality. Brands with flat schema have lost 30-50% of AI citation share they held in Q4 2025.
Google's AI Overviews started citing the entity graph directly. Where 2025 AI Overviews would cite a URL, 2026 AI Overviews increasingly cite an entity (@id) and the URL is a fallback. Brands without stable @id anchors do not get cited under either model.
Wikidata Q-codes became the de-facto grounding standard. AI engines cross-reference sameAs arrays against Wikidata. If your founder has a Q-code, your Person entities resolve. If not, you compete with every other founder named "Sarah Chen" in the embedding cluster.

The tactical move every brand should run this quarter: audit every Organization, Person, Product, and Place entity on the site. Confirm each has (a) a stable @id, (b) explicit nesting inside its parent entity, (c) a sameAs array including at least one Wikidata Q-code where possible plus the canonical LinkedIn/Crunchbase/GitHub URL. The Cubitrek AEO Platform flags every missing anchor across an entire site in one scan.

Wire it into your infrastructure

Nested JSON-LD only matters if AI agents can crawl it efficiently. Pair the schema work with robots.txt 2026 configuration for AI crawler budgets so the agents you want (PerplexityBot, OAI-SearchBot, ChatGPT-User) can hit your structured data without competing against high-volume training scrapers. If you want agents to also transact, layer in Action Schema and potentialAction on top of the entity graph.

Conclusion

The role of Technical SEO is evolving rapidly into Knowledge Engineering. While Rich Snippets remain a valid tactical goal, the strategic objective is AI readiness. Mastering nested JSON-LD is about more than preventing hallucinations; it is about structuring authority in a way AI can trust. By defining clear hierarchies and relationships, you are building a Founder's Graph for structured authority and giving AI a machine-readable, semantically precise representation of your organization, its people, and its assets.

In the age of AI-driven search and retrieval, the cleanest, most highly structured knowledge graph wins. The ability to model complex relationships through a deeply nested schema is the defining skill of this new reality.

Frequently Asked Questions:

Q: What is the main difference between Flat and Nested JSON-LD?

A: Flat JSON lists facts separately (like a grocery list), often losing context. Nested JSON embeds facts inside one another (like a family tree), explicitly defining relationships, such as a Person working inside an Organization.

Q: Why is Nested JSON-LD better for AI and GraphRAG?

A: Standard AI guesses connections based on word proximity. Nested JSON defines the “edges” between “nodes” explicitly, telling the AI exactly how entities relate (e.g., Dr. Reed is definitely inside Quantum Labs), which reduces hallucinations.

Q: What does the @id property do?

A: It acts as a unique digital ID card (like a URL) for a specific entity. It allows the graph to distinguish between things with similar names (e.g., “Cambridge” the city vs. “Cambridge” the university) and link data across different parts of the code.

Q: What is GraphRAG?

A: Graph Retrieval-Augmented Generation (GraphRAG) is an advanced AI technique. Instead of just scanning text, it retrieves answers from a structured Knowledge Graph. It relies on high-quality, nested structured data to “connect the dots” between complex topics.

Q: What is a BreadcrumbList in Schema?

A: It is a basic list schema that shows a page’s position in the site hierarchy (e.g., Home > Services > Consulting). While useful for site structure and search snippets, it is usually “flat” data compared to the complex nesting used for entities.

Let’s Discuss it Over a Call

Key takeaways

Use @id on every Organization, Person, Product, and Place entity. Stable identifiers are how the AI graph distinguishes 'Cambridge' the city from 'Cambridge' the university.
Nest entities by relationship. A Person inside an employee array of an Organization is a definitive (worksFor) edge.
Add sameAs arrays with Wikidata Q-codes plus LinkedIn, Crunchbase, GitHub. Wikidata became the de-facto grounding standard for AI engines in 2026.
Pair nested JSON-LD with a Brand Hub. The schema graph is the data; the Brand Hub is the canonical resolution target.
Audit every entity on the site for @id, nesting, and sameAs anchors. The Cubitrek AEO Platform flags missing anchors in one scan.

Written by

Faizan Ali Khan

Founder & CEO

Founder of Cubitrek. Ships agentic AI systems that automate sales, marketing, and operations for SaaS, e-commerce, and real estate companies. Coined the term 'single-player agency' in 2026.

Book a call with Faizan

Keep reading

Want Cubitrek to run AEO & GEO for you?

We install aeo & geo programs for growing companies across the US and Europe. Book a call and we'll come back with a one-page plan in 72 hours.

Book a strategy call

Nested JSON-LD: architecting schema for GraphRAG and AI retrieval

The Limitations of "Flat" Schema

Architecting Semantic Containment

The Graph-Ready JSON-LD Payload

Decoding the Architecture for GraphRAG

Without Nested Graph Data (Standard RAG):

With Nested Graph Data (GraphRAG):

Cubitrek case study: nested JSON-LD on a B2B SaaS hallucination problem

3-month results after nested JSON-LD rollout

Q2 2026 update: AI engines now penalise un-anchored entities

Wire it into your infrastructure

Conclusion

Frequently Asked Questions:

Key takeaways

Faizan Ali Khan

Related articles.

The AEO Audit Checklist

AEO vs GEO vs SEO: The Triangle

Norway’s IT Skills Gap: Why More Tech Leaders Are Turning to Flexible Talent Models

The AI-first growth memo.

Want Cubitrek to run AEO & GEO for you?