LLM Data Layer

AI agents can only recommend what they can verify.

Central packages every product as a machine-readable record AI agents can crawl, cite, and trust: Schema.org JSON-LD on the page, the x-central provenance extension carrying sources and confidence, a REST endpoint for per-field source lookup, and an llms.txt-style agent feed.

Schema.org · Product · Offer · FAQPage · x-central provenance · REST per-field provenance · llms.txt agent feed

app.central.to/products/sony-wh-1000xm5/data-layer

Sony WH-1000XM5

central-demo-002 · black · 30 h · LDAC

target → AI agents · 6 surfaces

22 / 22 Schema.org core · 16 sources · 0.94 avg conf

@type | schema.org/Product | Product · application/ld+json | JSON-LD | 1.00

gtin · sku | schema:gtin13 · sku | 4548736141544 · WH-1000XM5-BLK | verified | 1.00

brand · name | schema:brand · schema:name | Sony · WH-1000XM5 wireless ANC headphones | brand | 1.00

offers | schema:Offer × 1 | € 399.00 · in_stock · shop.sony.eu | merchant | 1.00

additionalProperty | PropertyValue × 12 | battery · driver · codec · weight · ANC · multipoint · … | 12 / 12 | 0.97

FAQPage.mainEntity | Question × 8 | battery life · hi-res audio · weight · IP rating · … | 8 / 8 | 0.91

x-central.sources | central:provenance · array | 16 sources · sony.com · jas.or.jp · mediamarkt · coolblue · saveur · … | 16 cited | 0.94

x-central.confidence | central:confidence · float | 0.94 avg · 0.82 floor · per-field map | scored | 0.94

x-central.lastVerified | central:timestamp · ISO 8601 | 2026-05-25T08:14:32Z · 4 min ago | fresh | 1.00

x-central.held | claim · unsupported | "IP rating IPX4" · sources disagree · no merchant decision | held | -

AI-readable

Schema.org Product | JSON-LD · x-central | FAQPage × 8 | REST provenance API | llms.txt feed | 16 sources · 0 unsupported

the citation gap

The same product. The same question. Two completely different answers.

An AI agent like ChatGPT can only recommend a product if it can verify the facts. When the data layer is missing, agents either refuse or invent. When Central writes the data layer, the same query becomes a confident, source-cited recommendation.

🔒 chatgpt.com /c/shopping/anc-headphones-flight

Buyer

"What's the battery life on the Sony WH-1000XM5 with ANC on, and is it JAS-certified for hi-res audio?"

✦

ChatGPT

I can see a product page for the Sony WH-1000XM5, but the page doesn't expose structured product data — no @type Product, no additionalProperty, no source citations.

I'd guess battery life is around 40 hours with ANC on^?, and probably hi-res certified^? — but I can't verify either from the page. The 40-hour number might be the ANC-off figure.

Recommendation withheld. No verified facts attached to this product. Try a retailer that exposes Schema.org JSON-LD on the PDP.

Hallucination risk · 2 unsupported claims

🔒 chatgpt.com /c/shopping/anc-headphones-flight

Buyer

"What's the battery life on the Sony WH-1000XM5 with ANC on, and is it JAS-certified for hi-res audio?"

✦

ChatGPT

The Sony WH-1000XM5 is a strong match for flight use. From the structured product data on the page:

Battery life is 30 hours with ANC on, 40 hours with ANC off¹. It is JAS-certified for LDAC hi-res audio at 990 kbps², uses an 8-microphone array for ANC³, and weighs 250 g⁴. Currently € 399.00, in stock⁵.

Sony WH-1000XM5

WH-1000XM5-BLK · 250 g · 30 h ANC · LDAC · 4.7 / 5 (3,142)

verified · € 399.00

5 source-backed facts · x-central provenance

¹ sony.com/electronics/headphones · jas.or.jp/cert · conf 0.97

² sony.com/spec-sheet · jas.or.jp/ldac-certified · conf 0.99

³ sony.com · cnet.com · whathifi.com · conf 0.94

⁴ sony.com · 4 retailers agree · conf 0.98

⁵ shop.sony.eu · Offer · live · conf 1.00

what changed

One thing flipped: a Central-emitted data layer the agent could read, cite, and trust.

Same buyer. Same product. Same query. ChatGPT moves from refusal to a confident, cited recommendation — because Central writes the x-central provenance metadata that lets the model show its work.

added to page: Schema.org Product · PropertyValue × 12 · FAQPage × 8 · x-central.sources [16] · x-central.confidence 0.94 · x-central.lastVerified · 1 marketing claim held

Make my catalog AI-readable

four layers · one verified record

How the verified record becomes machine-readable.

The LLM Data Layer ships in four formats so every AI surface — from PDP-crawlers to programmatic agents — can pull what it needs. One canonical record, four renders.
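As a rough illustration of "one canonical record, four renders", the sketch below turns a single Python dict into two of the four formats: a Schema.org JSON-LD object and an llms.txt-style entry. The record shape, the function names, and the `summary`, `page_path`, and `data_path` fields are assumptions made for this example; they are not Central's internal model.

```python
import json

# Hypothetical canonical record; the field names are illustrative assumptions.
RECORD = {
    "name": "Sony WH-1000XM5",
    "gtin13": "4548736141544",
    "brand": "Sony",
    "summary": "Wireless ANC headphones · 30 h battery · LDAC.",
    "page_path": "/products/wh-1000xm5",
    "data_path": "/api/v1/products/wh-1000xm5",
}


def render_json_ld(record: dict) -> dict:
    """Render the record as a Schema.org Product object."""
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": record["name"],
        "gtin13": record["gtin13"],
        "brand": {"@type": "Brand", "name": record["brand"]},
    }


def render_llms_txt_entry(record: dict) -> str:
    """Render the record as one llms.txt-style list entry."""
    return (
        f"- [{record['name']}]({record['page_path']}):\n"
        f"  {record['summary']}\n"
        f"  Data: {record['data_path']}"
    )


if __name__ == "__main__":
    print(json.dumps(render_json_ld(RECORD), indent=2, ensure_ascii=False))
    print(render_llms_txt_entry(RECORD))
```

Both functions read from the same dict, so a corrected value in the record shows up in every render the next time it is generated.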

1. Schema.org embedded

Inline structured data inside the PDP. Product, Offer, AggregateRating, FAQPage, ImageObject — the standard vocabulary AI agents already know how to read.

Standard

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Sony WH-1000XM5",
  "gtin13": "4548736141544",
  "brand": { "@type": "Brand", "name": "Sony" },
  "offers": { "@type": "Offer", ... }
}
</script>

read by: ChatGPT · Perplexity · Google AI · Apple · Claude · Bing
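To make the crawler side concrete, here is a minimal sketch of how a consumer might pull the embedded block out of a PDP using only the Python standard library. The `JsonLdExtractor` class, the `extract_products` helper, and the sample HTML are hypothetical; real agents use their own parsers.

```python
import json
from html.parser import HTMLParser

# Hypothetical PDP fragment with a complete (non-elided) JSON-LD block.
SAMPLE_PDP = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Product",
 "name": "Sony WH-1000XM5", "gtin13": "4548736141544",
 "brand": {"@type": "Brand", "name": "Sony"}}
</script>
</head><body></body></html>"""


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of every application/ld+json script tag."""

    def __init__(self) -> None:
        super().__init__()
        self._in_json_ld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content can arrive in several chunks, so buffer it.
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self._in_json_ld = False
            self.blocks.append(json.loads("".join(self._buffer)))


def extract_products(html: str) -> list:
    """Return every top-level JSON-LD object whose @type is Product."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return [b for b in parser.blocks
            if isinstance(b, dict) and b.get("@type") == "Product"]


if __name__ == "__main__":
    for product in extract_products(SAMPLE_PDP):
        print(product["name"], product["gtin13"])
```

The sketch uses strict `json.loads`, so it only handles valid JSON; the elided `...` and comment markers in the illustrative snippets on this page would need to be filled in first.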

2. JSON-LD + x-central

A namespaced Schema.org extension that attaches sources, per-field confidence, ISO 8601 timestamp, and held claims to the payload. Lets AI agents cite the sources behind every claim.

Extension

// Embedded inside the same JSON-LD block
"x-central": {
  "sources": [
    "sony.com/spec-sheet",
    "jas.or.jp/ldac-certified",
    /* + 14 more */
  ],
  "confidence": { "avg": 0.94, "floor": 0.82 },
  "lastVerified": "2026-05-25T08:14:32Z",
  "held": [{ "claim": "IPX4", "reason": "conflict" }]
}

unlocks citations on: ChatGPT · Perplexity · Claude · Bing

3. REST Provenance API

A programmatic endpoint: GET /v1/products/{id}/fields/{field}/provenance. Returns the source list, agreement count, and confidence per field. Built for integrators and bespoke agents.

API

# Per-field source lookup
GET /v1/products/SIG-RND-26-CER/fields/oven_temp/provenance
Authorization: Bearer ck_live_...

# 200 OK
{
  "field": "oven_temp",
  "value": "260 °C",
  "sources": 5,
  "confidence": 0.97,
  "status": "published"
}

consumed by: Custom agents · MCP clients · PIM integrations · Audit tools
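For integrators, the request above could be assembled along these lines. This is a sketch only: `API_BASE` is a placeholder host (the page does not state the API's base URL), `build_provenance_request` is a hypothetical helper, and the token is a dummy value. Nothing is sent over the network.

```python
from urllib.parse import quote
from urllib.request import Request

# Placeholder host for illustration; not a documented Central URL.
API_BASE = "https://api.example.com"


def build_provenance_request(product_id: str, field: str, token: str) -> Request:
    """Build (but do not send) a per-field provenance lookup."""
    path = (
        f"/v1/products/{quote(product_id, safe='')}"
        f"/fields/{quote(field, safe='')}/provenance"
    )
    return Request(
        API_BASE + path,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/json",
        },
        method="GET",
    )


if __name__ == "__main__":
    req = build_provenance_request("SIG-RND-26-CER", "oven_temp", "YOUR_TOKEN")
    print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` and decoding the body with `json.load` would complete the call; the path segments are percent-encoded so that unusual product or field identifiers cannot break the URL.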

4. Agent feed (llms.txt)

A markdown index at the catalog root. The emerging convention for letting AI agents crawl product catalogs efficiently — like sitemap.xml, but for LLMs. Lists every product and its data-layer endpoint.

Feed

# shop.example.com/llms.txt

# Catalog

- [Sony WH-1000XM5](/products/wh-1000xm5):
  Wireless ANC headphones · 30 h battery · LDAC.
  Data: /api/v1/products/wh-1000xm5
- [Apple iPhone 17 Pro](/products/iphone-17-pro):
  6.3" display · A19 Pro · titanium.
  Data: /api/v1/products/iphone-17-pro

## Provenance
Every product carries x-central metadata.

discovered by: ChatGPT · Perplexity · Claude · Custom crawlers
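A crawler could walk a feed like the one above with a few lines of parsing. The sketch below assumes the exact three-line entry layout shown in the example (link line, summary line, `Data:` line); `parse_feed` and the regular expressions are hypothetical and would need adjusting for feeds laid out differently.

```python
import re

# Feed text copied from the example layout; the layout itself is an assumption.
SAMPLE_FEED = """# Catalog

- [Sony WH-1000XM5](/products/wh-1000xm5):
  Wireless ANC headphones · 30 h battery · LDAC.
  Data: /api/v1/products/wh-1000xm5
- [Apple iPhone 17 Pro](/products/iphone-17-pro):
  6.3" display · A19 Pro · titanium.
  Data: /api/v1/products/iphone-17-pro

## Provenance
Every product carries x-central metadata.
"""

ENTRY_RE = re.compile(r"^- \[(?P<name>[^\]]+)\]\((?P<url>[^)]+)\):?\s*$")
DATA_RE = re.compile(r"^\s+Data:\s*(?P<data>\S+)\s*$")


def parse_feed(text: str) -> list:
    """Return one dict per product entry: name, page URL, data-layer path."""
    entries = []
    for line in text.splitlines():
        entry = ENTRY_RE.match(line)
        if entry:
            entries.append({"name": entry["name"], "url": entry["url"], "data": None})
            continue
        data = DATA_RE.match(line)
        if data and entries:
            # A Data: line belongs to the most recent entry.
            entries[-1]["data"] = data["data"]
    return entries


if __name__ == "__main__":
    for item in parse_feed(SAMPLE_FEED):
        print(item["name"], "->", item["data"])
```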

x-central · provenance extension

The x-central namespace.
Where every claim earns its citation.

Schema.org tells AI agents what a product is. It says nothing about whether a claim is verified, how many sources backed it, or when the fact was last checked. x-central is the extension that fills that gap.

It is a namespaced JSON-LD extension that follows the x- vendor-prefix convention, so it travels inside the existing payload, passes Schema.org validators, and gets ignored by surfaces that don't read it. The ones that do read it (ChatGPT, Perplexity, Claude, increasingly Apple Intelligence and Bing Copilot) use it to cite the sources behind every claim.

sources — the list of URLs Central cross-checked. Every fact has a source array attached.

confidence — a 0–1 score per field, with an aggregate and a display floor (0.82 by default).

lastVerified — ISO 8601 timestamp of the last verification pass. Stale records auto-flag for re-grounding.

held — claims Central refuses to publish (conflicts, no source, marketing-only). Visible to operators, invisible to AI surfaces.

agreements — how many independent sources reached the same value. Two agreements minimum to publish.

The promise: AI never sees the open web. It sees a Central-emitted record where every claim already proved itself. The model can be wrong about style, never about fact.
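The publish-or-hold rule described above can be sketched as a small decision function. The two-agreement minimum and the "sources disagree" hold come from the text; treating any disagreement among stated values as a conflict, the `decide` function itself, and the `.example` source names are assumptions made for illustration, since the page does not spell out Central's actual resolution logic.

```python
from collections import Counter
from typing import Mapping, Optional

MIN_AGREEMENTS = 2  # "Two agreements minimum to publish"


def decide(observations: Mapping[str, Optional[str]]) -> dict:
    """Decide whether a field value is published or held.

    `observations` maps a source name to the value it states,
    or None when the source does not state the field at all.
    """
    stated = [value for value in observations.values() if value is not None]
    if not stated:
        return {"status": "held", "reason": "no source"}

    counts = Counter(stated)
    if len(counts) > 1:
        # Assumption: any disagreement between sources is a conflict.
        return {"status": "held", "reason": "sources disagree"}

    value, agreements = counts.most_common(1)[0]
    if agreements < MIN_AGREEMENTS:
        return {"status": "held", "reason": "fewer than 2 agreements"}
    return {"status": "published", "value": value, "agreements": agreements}


if __name__ == "__main__":
    print(decide({"maker.example": "250 g", "shop-a.example": "250 g"}))
    print(decide({"maker.example": None, "shop-a.example": "IPX4",
                  "shop-b.example": "IPX2"}))
```

Under these assumptions the IPX4 claim from the example is held because its sources state different ratings, while a weight confirmed by two or more sources is published.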

JSON-LD · x-central block Sony WH-1000XM5

// Embedded inside the product's JSON-LD payload
"x-central": {

  // Source array — every URL Central cross-checked
  "sources": [
    {
      "url": "https://www.sony.com/electronics/headphones/wh-1000xm5",
      "title": "Sony product page — official spec sheet",
      "role": "manufacturer",
      "lastFetched": "2026-05-25T08:14:30Z"
    },
    {
      "url": "https://www.jas.or.jp/ldac-certified/4548736",
      "title": "JAS LDAC certification record",
      "role": "certification",
      "lastFetched": "2026-05-25T08:14:31Z"
    },
    /* + 14 more sources — total: 16 */
  ],

  // Confidence — global average + per-field map
  "confidence": {
    "avg": 0.94,
    "floor": 0.82,
    "perField": {
      "battery_anc_on":    0.97,
      "codec_support":     0.99,
      "weight":            0.98,
      "jas_certified":     1.00
    }
  },

  // Independent agreement count per published field
  "agreements": { "min": 2, "avg": 4.3, "max": 5 },

  // ISO 8601 timestamp of last verification pass
  "lastVerified": "2026-05-25T08:14:32Z",

  // Claims Central refuses to publish — visible to ops, hidden from AI
  "held": [
    {
      "claim":  "IP rating IPX4",
      "reason": "sources disagree · IPX2 vs IPX4 vs not stated",
      "status": "requires merchant decision"
    }
  ]
}

16 sources cited · 0.94 avg confidence · 1 claim held · spec · v1.0 · open
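On the consuming side, an agent or audit tool might use the block like this: keep only the fields whose confidence is at or above the floor, and check how old the last verification is. This is a sketch under assumptions. The helper names are hypothetical, and the page does not state when a record counts as stale, so the maximum age is left as a caller-supplied parameter.

```python
from datetime import datetime, timedelta, timezone

# Values copied from the x-central block above (sources and held omitted).
X_CENTRAL = {
    "confidence": {
        "avg": 0.94,
        "floor": 0.82,
        "perField": {
            "battery_anc_on": 0.97,
            "codec_support": 0.99,
            "weight": 0.98,
            "jas_certified": 1.00,
        },
    },
    "lastVerified": "2026-05-25T08:14:32Z",
}


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 UTC timestamp with a trailing Z."""
    # datetime.fromisoformat only accepts "Z" from Python 3.11 on,
    # so rewrite it as an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def citable_fields(block: dict) -> list:
    """Return the field names whose confidence meets the block's floor."""
    floor = block["confidence"]["floor"]
    per_field = block["confidence"]["perField"]
    return sorted(name for name, score in per_field.items() if score >= floor)


def is_stale(block: dict, now: datetime, max_age: timedelta) -> bool:
    """True when the last verification is older than max_age (caller's choice)."""
    return now - parse_timestamp(block["lastVerified"]) > max_age


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(citable_fields(X_CENTRAL))
    print(is_stale(X_CENTRAL, now, max_age=timedelta(hours=24)))
```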

REST Provenance API

The data layer is programmable.

Beyond the JSON-LD on every PDP, Central exposes a REST endpoint for per-field provenance lookup. Integrators, MCP clients, audit tooling, and bespoke agents can fetch the source list, agreement count, and confidence for any field on any product — without parsing the PDP.

The same canonical record powers the JSON-LD on the page and the API behind it. One source of truth, two consumption modes: declarative for crawlers, imperative for code.

per-field provenance · bearer-token auth · cursor pagination · JSON-LD compatible · rate-limited 10k/hr · webhooks on change

GET /v1/products/{product_id}/fields/{field}/provenance · authenticated

Request · GET /v1/products/SIG-RND-26-CER/fields/oven_temp/provenance

Response · 200 OK · application/json

{
  "field": "oven_temp",
  "value": "260 °C",
  "unit": "celsius",
  "sources": [
    { "url": "le-creuset.com/specifications", "agrees": true },
    { "url": "williams-sonoma.com",           "agrees": true },
    { "url": "cooks-illustrated.com",         "agrees": true },
    { "url": "saveur.com/best-dutch-ovens",    "agrees": true },
    { "url": "sur-la-table.com",              "agrees": true }
  ],
  "agreements":   5,
  "confidence":   0.97,
  "lastVerified": "2026-05-25T08:14:32Z",
  "status":       "published"
}

5 sources · 0.97 confidence · cached 60s · CDN-edge
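An audit tool could sanity-check a response like this one by recounting the agreeing sources and comparing the result with the reported `agreements` value. The `check_agreements` helper below is hypothetical; the payload is copied from the example response.

```python
# Payload copied from the example response above.
RESPONSE = {
    "field": "oven_temp",
    "value": "260 °C",
    "unit": "celsius",
    "sources": [
        {"url": "le-creuset.com/specifications", "agrees": True},
        {"url": "williams-sonoma.com", "agrees": True},
        {"url": "cooks-illustrated.com", "agrees": True},
        {"url": "saveur.com/best-dutch-ovens", "agrees": True},
        {"url": "sur-la-table.com", "agrees": True},
    ],
    "agreements": 5,
    "confidence": 0.97,
    "lastVerified": "2026-05-25T08:14:32Z",
    "status": "published",
}


def check_agreements(payload: dict) -> dict:
    """Recount agreeing sources and compare with the reported total."""
    counted = sum(1 for source in payload["sources"] if source.get("agrees"))
    reported = payload["agreements"]
    return {
        "counted": counted,
        "reported": reported,
        "consistent": counted == reported,
    }


if __name__ == "__main__":
    print(check_agreements(RESPONSE))
```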

GET /v1/products/{product_id}/data-layer · authenticated

Request · GET /v1/products/SIG-RND-26-CER/data-layer

Response · 200 OK · application/ld+json

{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Le Creuset Signature Round 26 cm",
  "gtin13": "0024147210504",
  "sku": "SIG-RND-26-CER",
  "brand": { "name": "Le Creuset" },
  "offers": { "price": 379, "availability": "InStock" },

  "x-central": {
    "sources":      15,
    "confidence":   0.94,
    "lastVerified": "2026-05-25T08:14:32Z",
    "held":         [{ "claim": "World's best" }]
  }
}

22 properties · ready for ingest · direct from canonical record

GET /v1/products?cursor={cursor}&updated_since=2026-05-24T00:00:00Z · authenticated

Request · paginated, change-only catalog feed for AI agents and integrators that maintain a mirror

Response · 200 OK · 142 products updated since cursor · application/json

{
  "data": [
    { "id": "SIG-RND-26-CER", "name": "Le Creuset Signature Round, 26 cm", "updated": "2026-05-25T08:14:32Z",
      "data_layer_url": "/v1/products/SIG-RND-26-CER/data-layer", "sources": 15, "confidence": 0.94 },
    { "id": "WH-1000XM5-BLK",  "name": "Sony WH-1000XM5",                  "updated": "2026-05-25T08:09:12Z",
      "data_layer_url": "/v1/products/WH-1000XM5-BLK/data-layer",  "sources": 16, "confidence": 0.94 },
    /* + 140 more products in this page */
  ],
  "pagination": { "next_cursor": "eyJpZCI6Imxhc3RfaW5fcGFnZSJ9", "has_more": true },
  "x-rate-limit": { "remaining": 9876, "reset": "2026-05-25T09:00:00Z" }
}

142 changed products · cursor token · 9,876 requests remaining this hour · X-RateLimit headers exposed
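A mirror that follows this feed would typically loop until `has_more` is false, passing each `next_cursor` back on the following request. The sketch below shows only that loop. The `fetch_page` callable is injected so the example runs without network access, and `FAKE_PAGES`, `fake_fetch`, and the `"page-2"` cursor are made up for the demonstration.

```python
from typing import Callable, Iterator, Optional

# A fetcher takes the current cursor (None for the first page)
# and returns one decoded response body shaped like the example above.
FetchPage = Callable[[Optional[str]], dict]


def iter_changed_products(fetch_page: FetchPage) -> Iterator[dict]:
    """Yield every product across all pages of the change feed."""
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        yield from page["data"]
        pagination = page["pagination"]
        if not pagination.get("has_more"):
            return
        cursor = pagination["next_cursor"]


# Hypothetical two-page feed used in place of real HTTP calls.
FAKE_PAGES = {
    None: {
        "data": [{"id": "SIG-RND-26-CER"}],
        "pagination": {"next_cursor": "page-2", "has_more": True},
    },
    "page-2": {
        "data": [{"id": "WH-1000XM5-BLK"}],
        "pagination": {"next_cursor": None, "has_more": False},
    },
}


def fake_fetch(cursor: Optional[str]) -> dict:
    return FAKE_PAGES[cursor]


if __name__ == "__main__":
    for product in iter_changed_products(fake_fetch):
        print(product["id"])
```

In a real client, `fetch_page` would issue the authenticated GET with the `cursor` and `updated_since` parameters and could pause when the remaining request budget runs low.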

proof artifact · Sony WH-1000XM5

One opaque product page in. One AI-citeable data layer out.

The same Sony WH-1000XM5 PDP. Before: the kind of thin Schema.org block AI agents skip. After: the Central-emitted data layer with Product, Offer, FAQPage, and the full x-central provenance (16 sources, per-field confidence, ISO timestamp, and one claim held).

/products/wh-1000xm5 · pdp.html · before

WH-1000XM5 wireless headphones

SKU WH-1000XM5-BLK · black · over-ear

page meta: "Industry-leading…"

og:tags: title + image only

ld+json: 3 keys only

@type Product: name · sku · price

Offer block: missing

PropertyValue × n: no specs

FAQPage: missing

x-central: no provenance

REST endpoint: no programmatic access

llms.txt: missing

<script type="application/ld+json">
{
  "@type": "Product",
  "name": "WH-1000XM5",
  "sku": "WH-1000XM5-BLK",
  "price": "399.00"
}
</script>

3 keys exposed · 0 sources cited · AI skips

→ Central emits

/products/wh-1000xm5 · ld+json · 4.4 KB · verified · citable

Sony WH-1000XM5 — 22 / 22 schema

16 sources · 0.94 avg confidence · 1 held

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Sony WH-1000XM5 Wireless Noise Cancelling Headphones",
  "sku": "WH-1000XM5-BLK",
  "gtin13": "4548736141544",
  "brand": { "@type": "Brand", "name": "Sony" },
  "image": "https://shop.sony.eu/wh-1000xm5/black.jpg",
  "description": "Industry-leading 8-mic noise cancellation, 30 h battery with ANC on, LDAC hi-res audio.",
  "weight":  { "value": 250, "unitCode": "GRM" },
  "offers": {
    "@type": "Offer",
    "price": "399.00",
    "priceCurrency": "EUR",
    "availability": "https://schema.org/InStock",
    "url": "https://shop.sony.eu/p/wh-1000xm5"
  },
  "aggregateRating": { "ratingValue": 4.7, "reviewCount": 3142 },
  "additionalProperty": [
    { "name": "battery_anc_on",  "value": "30 h" },
    { "name": "battery_anc_off", "value": "40 h" },
    { "name": "codec_support",   "value": "LDAC, SBC, AAC" },
    { "name": "hi_res_cert",     "value": "JAS LDAC certified · 990 kbps" },
    { "name": "driver_size",     "value": "30 mm dual-diaphragm" },
    /* + 7 more PropertyValue entries — total: 12 */
  ],
  // Central extension — provenance AI agents can cite
  "x-central": {
    "sources": [
      "sony.com/electronics/headphones/wh-1000xm5",
      "jas.or.jp/ldac-certified/4548736",
      "mediamarkt.de/product/sony-wh-1000xm5",
      "coolblue.de/produkt/sony-wh-1000xm5",
      "whathifi.com/reviews/sony-wh-1000xm5",
      /* + 11 more sources — total: 16 */
    ],
    "confidence":   { "avg": 0.94, "floor": 0.82 },
    "agreements":   { "min": 2, "avg": 4.3 },
    "lastVerified": "2026-05-25T08:14:32Z",
    "held": [
      { "claim": "IP rating IPX4", "reason": "sources disagree" }
    ]
  },
  "@graph": [
    {
      "@type": "FAQPage",
      "mainEntity": [
        { "@type": "Question", "name": "Battery life with ANC on?",
          "acceptedAnswer": { "text": "30 hours. Verified by Sony + JAS + 2 retailers." } },
        { "@type": "Question", "name": "JAS certified for hi-res?",
          "acceptedAnswer": { "text": "Yes — LDAC at 990 kbps, JAS certified." } },
        /* + 6 more verified Q&As */
      ]
    }
  ]
}
</script>

22 properties · 16 sources · 1 held · ready for every AI surface

one record · every AI surface

Same data layer, every conversational shopping agent.

ChatGPT Shopping · Perplexity · Claude (MCP) · Apple Intelligence · Google AI Overviews · Bing Copilot

1 payload · 6+ surfaces · auto-refresh

FAQ

Six questions about the LLM Data Layer.

What is the LLM Data Layer — is it just Schema.org?

No. Schema.org is one of four layers Central writes. The full LLM Data Layer is Schema.org JSON-LD on the PDP, the x-central provenance extension carrying sources and confidence, a REST endpoint for per-field provenance lookup, and an llms.txt-style agent feed at the catalog root. Schema.org alone tells AI agents what a product is; the full data layer tells them why a claim is trustworthy.

What is the `x-central` JSON-LD extension and why does it matter?

x-central is a namespaced extension inside the Schema.org JSON-LD payload, following the x- vendor-prefix convention, that attaches sources, per-field confidence, an ISO 8601 last-verified timestamp, agreement count, and held claims to a product. AI agents need to cite, and a model that can show its work can recommend your product more safely. Surfaces that don't read x-central still parse the rest of the Schema.org payload normally: it's additive, not breaking.

How does the REST Provenance API work?

A bearer-token authenticated endpoint at GET /v1/products/{id}/fields/{field}/provenance returns the source list, agreement count, confidence, and publish status for a single field. There's also GET /v1/products/{id}/data-layer for the full JSON-LD payload and GET /v1/products?updated_since=... for cursor-paginated change feeds. Rate limits are tier-based (Pro: 10k/hr) and X-RateLimit-* headers are exposed on every response. For implementation details, talk to the Central team.

Does this replace Schema.org or extend it?

Extends. Central's JSON-LD payload is standards-compliant Schema.org first — Product, Offer, AggregateRating, FAQPage, ImageObject, PropertyValue — so it passes the Google Rich Results validator, the Schema.org validator, and every Schema-aware crawler. The x-central block sits inside the same JSON-LD payload as an extension. Surfaces that read it use it to cite; surfaces that don't still get full Schema.org.

What's `llms.txt` — is that a real standard?

llms.txt is an emerging convention — proposed by Jeremy Howard in 2024 — for letting AI agents discover and crawl content efficiently. It's a markdown index at the site root, similar to sitemap.xml but human-readable and structured for LLMs. Central emits one automatically for your catalog: every product, its summary, and a link to its data-layer endpoint. Some AI agents already crawl it; the rest will follow as the convention matures. Central handles the format so you don't have to track the spec.

How does this work with the channel-specific surfaces?

The LLM Data Layer is upstream of every AI shopping channel. The same canonical record produces the JSON-LD ChatGPT Shopping crawls, the source-cited payload Perplexity uses for [1] [2] [3] citations, the structured data Google AI Overviews surfaces, the records Apple Intelligence and Bing Copilot will consume, and the MCP-compatible payload Claude will read once MCP commerce ships. Channel pages cover surface-specific behavior; this page covers the data layer underneath all of them.
