agentic-first

Landscape · what's already out there · what's missing

The standards landscape, honestly mapped.

Before publishing yet another standard, we did the survey. There are a lot of good, mature, well-adopted conventions touching this problem — and three specific gaps none of them fill. This page lays both out, and invites anyone with skin in the game to challenge the framing.

On this page
  1. TL;DR — the three gaps
  2. What exists today
  3. Discovery & web compatibility
  4. Financial & reporting
  5. Identity, auth & legal entity
  6. Trust, verification & provenance
  7. Data governance & quality (ISO)
  8. The emerging agent-web stack
  9. Where the gaps are, in one picture
  10. Where agentic-first fits
  11. A starting point, not a fait accompli

TL;DR — the three gaps

We can't find an open, machine-readable, publisher-controlled standard for any of these:

Gap 01

Public general info about a company

Schema.org gets you a name, a URL, and a logo. There's nothing canonical for jurisdiction + registry ID + stage + headcount band + canonical contact channel — the things an investor agent actually needs in the first 30 seconds.

Gap 02

Public structured business info, beyond the regulators

XBRL covers regulated financial filings for listed companies. The other ~99% of companies have no equivalent — no banded revenue, no growth band, no traction summary, no consistent way to publish "here is our shape" in a non-promotional, FCA-aware form.

Gap 03

Private, diligence-grade info, on the company's terms

OAuth gates access. It doesn't tell you what's behind the gate. Verifiable Credentials cover individual claims, not whole company files. Nobody has standardised the shape of the deal-grade detail an investor wants once they've been let in.

The rest of this page goes through every standard worth taking seriously, what it does well, what it doesn't cover, and where agentic-first sits without competing with any of them.

What exists today, at a glance

All of these are in production somewhere; none of them, on its own, gives an agent the answer to "who is this company, and what do they want a serious reader to know?"

Per-standard deep dives

Five of the rows below each have their own page, which walks through a side-by-side comparison, when to use each, how they compose, and the honest summary: Schema.org · XBRL · mcp.json · Verifiable Credentials · GLEIF / LEI.

| Standard | Owner | Covers | Doesn't cover |
| --- | --- | --- | --- |
| Schema.org Organization / Person | Schema.org community (Google, Microsoft, Yahoo, Yandex) | Name, URL, logo, address, social profiles, simple contact info; for SEO and rich results | Stage, funding, banded financials, structured contact preference, evidence-backed claims, anything diligence-grade |
| OpenGraph & Twitter Cards | Meta, then de-facto | Social-share previews: title, image, description | Anything a machine wants to act on; everything below the fold |
| XBRL / iXBRL | XBRL International + national filing regulators | Mandatory machine-readable financial filings for listed firms (SEC EDGAR, Companies House, ESEF in the EU) | Private companies, banded summaries, anything outside the statutory P&L / balance sheet |
| OAuth 2.0 / OIDC | IETF / OpenID Foundation | Token-based authentication and consent for accessing a protected resource | The shape of the resource itself; OAuth doesn't tell you what's behind the gate, only that one is there |
| W3C Verifiable Credentials (VC) + DIDs | W3C | Cryptographically signed, issuer-attested individual claims (your degree, your professional licence, your KYC) | A whole company profile object; the day-to-day "this company has 11–50 staff" non-credential information |
| GLEIF / LEI (ISO 17442) | Global Legal Entity Identifier Foundation | 20-character globally unique legal-entity identifier, mandated for financial counterparties since 2017 | Anything beyond the identifier itself; not a profile, not a schema |
| Companies House & equivalents (Delaware, EDGAR, BvD/Orbis) | National registries | Statutory filings, directors, share capital, accounts (where required) | Anything voluntary, current, marketing-shaped, or under NDA; foreign jurisdictions |
| ISO 8000 (data quality) | ISO | Process and quality framework for master data management | Specific schemas; nothing immediately implementable |
| ISO 27001 / 27701 | ISO | Information-security and privacy management systems | Data shape; these are management systems, not formats |
| /.well-known/mcp.json | modelcontextprotocol working group (SEP-1960 / SEP-2127) | Discovery of an MCP server: endpoint, transport, tools, auth, capabilities | Identity of the publisher running the MCP; covers protocol, not who |
| /.well-known/agent-card.json | A2A protocol (Linux Foundation, IANA-registered Aug 2025) | An A2A agent's capabilities, identity, contact-on-behalf-of | The company or individual behind the agent |
| /llms.txt | De-facto, ~844k adopters incl. Anthropic, Cloudflare, Stripe | A Markdown index of your site for LLMs to read instead of crawling everything | Structure; by design it's narrative Markdown, not data |
| /agents-brief.txt | Draft v0.4, early 2026 | What an AI agent is permitted to do on your site (book, buy, submit) | Identity; this is permissions, not content |
| robots.txt + TDM Reservation Protocol (W3C, EU AI Act-aligned) | De-facto / W3C | Crawler permissions; opt-out for AI training | Anything affirmative; these are signals about what not to do |
| JSON-LD context (W3C) | W3C | Linked-data serialisation; the syntax Schema.org rides on | Specific company / person vocabulary; JSON-LD is a transport, not a schema |

Discovery & web compatibility

What works well: Schema.org's Organization + Person vocabulary, embedded as JSON-LD on a homepage, gives Google enough to render a Knowledge Panel and gives most LLM crawlers enough to know your name, URL and logo. OpenGraph gives every social product a preview. These are mature and worth implementing on day one.

Where the gap is: both are descriptive of content, not of capability. Schema.org has no field for "fundraise stage", no banded headcount, no structured contact-channel preference, no protected-tier pointer. By design — that's not what it's for. It was built when the consumer was a search engine, not an agent doing diligence.

How agentic-first relates: we explicitly recommend adopters publish both. Schema.org JSON-LD on the homepage covers SEO; /.well-known/agentic-profile.json covers the agent-readable diligence shape. They don't conflict; they describe different things.
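
To make the "publish both" recommendation concrete, here is a minimal sketch of the two documents side by side. Only profile_kind and tier are named elsewhere on this page; the company block, its stage and headcount_band fields, the "company" value, and the example.com URLs are assumptions made for illustration, not the v0.1 schema.

```python
import json


def schema_org_jsonld(name: str, url: str, logo: str) -> dict:
    """Schema.org Organization markup, embedded in the homepage for SEO."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "logo": logo,
    }


def agentic_profile(name: str, stage: str, headcount_band: str) -> dict:
    """Public-tier profile sketch. Only profile_kind and tier appear in the
    text; every other key and value here is an illustrative assumption."""
    return {
        "profile_kind": "company",  # assumed value
        "tier": "public",
        "company": {
            "name": name,
            "stage": stage,  # hypothetical field
            "headcount_band": headcount_band,  # hypothetical field
        },
    }


# Two documents, two locations; neither embeds or overrides the other.
jsonld = schema_org_jsonld(
    "Example Ltd", "https://example.com", "https://example.com/logo.png"
)
profile = agentic_profile("Example Ltd", "seed", "11-50")

# Goes in the homepage <head>.
homepage_snippet = (
    '<script type="application/ld+json">' + json.dumps(jsonld) + "</script>"
)
# Served at /.well-known/agentic-profile.json.
well_known_body = json.dumps(profile, indent=2)
```

The search-engine document carries no stage or headcount, and the agent-readable one carries no logo, which is the sense in which they describe different things.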

Financial & reporting

What works well: XBRL and its inline form iXBRL have done the hard work of giving regulators, exchanges, and auditors a consistent vocabulary for financial filings. SEC EDGAR, Companies House (UK), and ESEF (EU listed firms) all consume it. The taxonomies are exhaustive.

Where the gap is: XBRL is for the regulated minority. Private companies — the vast majority of any directory's population — have no obligation, no tooling, and no reason to produce XBRL filings of their own. There is no equivalent for "here's our shape, in bands, deliberately non-promotional, hosted at our own website" — exactly the surface a private-company directory needs.

How agentic-first relates: the public tier borrows the discipline of XBRL — explicit currencies, explicit reporting periods (as_of), enumerated bands — without trying to model the full P&L. The protected tier carries precise figures, where the audience is identified and the controls are the publisher's own. We don't compete with XBRL; we sit below the regulatory threshold most adopters live below.
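
A sketch of what "explicit currency, explicit as_of, enumerated band" could look like when a precise figure is reduced for the public tier. The band edges and labels below are hypothetical, chosen only to show the mechanism; the schema's real enumeration is not reproduced here.

```python
from datetime import date

# Hypothetical band edges (exclusive upper bounds) and labels.
REVENUE_BANDS = [
    (100_000, "under_100k"),
    (1_000_000, "100k_1m"),
    (10_000_000, "1m_10m"),
]
TOP_BAND = "10m_plus"


def revenue_band(amount: int) -> str:
    """Map a precise annual revenue figure to its enumerated band."""
    if amount < 0:
        raise ValueError("revenue cannot be negative")
    for upper_bound, label in REVENUE_BANDS:
        if amount < upper_bound:
            return label
    return TOP_BAND


def public_revenue(amount: int, currency: str, as_of: date) -> dict:
    """Public-tier view: the band, never the figure, plus currency and period."""
    return {
        "revenue_band": revenue_band(amount),
        "currency": currency,
        "as_of": as_of.isoformat(),
    }


summary = public_revenue(2_400_000, "GBP", date(2025, 12, 31))
```

The precise figure never leaves the function; under this reading, it would only appear in the protected tier.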

Identity, auth & legal entity

What works well: OAuth 2.0 + OIDC is the universal auth substrate; every serious protected-tier MCP should use it. GLEIF's LEI (ISO 17442) is the closest thing the world has to a unique global company identifier; it's mandatory for financial counterparties already.

Where the gap is: these tell you that a principal is authenticated and that a legal entity is uniquely named; they say nothing about what data the entity publishes about itself. Token-gated access has no standard profile-shaped object on the other side of the gate.

How agentic-first relates: we anchor identity on the standards that already exist — adopters declare company.registry.{type,id,url} (Companies House, Delaware, EDGAR, …) and company.lei (GLEIF) on their public profile, and the directory verifies them publicly via the registry's own URL. The protected tier expects OAuth with section-scoped tokens (financials:read, fundraise:read, …). We don't reinvent any of this layer — we describe how to wire to it.
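
Two small checks illustrate this wiring. The first validates the structure of an LEI using the ISO 7064 MOD 97-10 check digits that ISO 17442 specifies; it says nothing about whether the LEI is actually registered with GLEIF. The second filters a protected profile down to the sections a token's scopes allow. The function names, and the assumption that each top-level section maps to a `<section>:read` scope, are ours for illustration.

```python
def _to_digits(text: str) -> str:
    # MOD 97-10 maps 0-9 to themselves and A-Z to 10-35.
    return "".join(str(int(ch, 36)) for ch in text)


def lei_check_digits(prefix18: str) -> str:
    """Compute the two check digits for an 18-character LEI prefix."""
    return f"{98 - int(_to_digits(prefix18.upper() + '00')) % 97:02d}"


def is_valid_lei(lei: str) -> bool:
    """Structural check only: length, alphabet, and MOD 97-10 checksum."""
    lei = lei.strip().upper()
    if len(lei) != 20 or not lei.isascii() or not lei.isalnum():
        return False
    if not lei[18:].isdigit():
        return False
    return int(_to_digits(lei)) % 97 == 1


def visible_sections(private_profile: dict, granted_scopes: set[str]) -> dict:
    """Keep only sections whose '<section>:read' scope was granted (assumed mapping)."""
    return {
        section: data
        for section, data in private_profile.items()
        if f"{section}:read" in granted_scopes
    }


# A made-up prefix, not a real entity's LEI.
example_lei = "ABCDEFGHIJ12345678" + lei_check_digits("ABCDEFGHIJ12345678")

example_view = visible_sections(
    {"financials": {"currency": "GBP"}, "fundraise": {"stage": "seed"}},
    {"financials:read"},
)
```

A directory would still need to resolve the LEI and the registry URL against their authoritative sources; the checksum only catches typos early.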

Trust, verification & provenance

What works well: W3C Verifiable Credentials (VCs) give you cryptographically signed, issuer-attested claims with a clean revocation model. Pair them with a DID and you have a portable, self-sovereign identity layer that survives platform churn.

Where the gap is: VCs are claim-shaped, not file-shaped. There's no canonical VC for "this company's whole public profile, signed by the company"; nor is there a clean way to mark a single field inside a larger document as independently verified versus self-asserted. In practice every commercial company-data product invents its own confidence model.

How agentic-first relates: v0.1 takes a pragmatic first step: every material claim can carry an evidence entry pointing at a public URL (a press release, a Companies House filing, a third-party article), and the directory's confidence score weights evidence density. Provenance signing — either VCs over individual fields or a JWS envelope over the whole file — is on the v0.2 roadmap. We'd much rather adopt the W3C work than fork it.
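
One plausible reading of "evidence density" is the share of material claims that carry at least one evidence entry. The sketch below computes that ratio; the claim structure and the field names (field, value, evidence, url) are assumptions, and the directory's actual weighting is not described on this page.

```python
def evidence_density(claims: list[dict]) -> float:
    """Fraction of claims that carry at least one evidence entry."""
    if not claims:
        return 0.0
    backed = sum(1 for claim in claims if claim.get("evidence"))
    return backed / len(claims)


# Hypothetical claims; the URLs are placeholders.
example_claims = [
    {"field": "company.stage", "value": "seed",
     "evidence": [{"url": "https://example.com/press/seed-round"}]},
    {"field": "team.headcount_band", "value": "11-50", "evidence": []},
    {"field": "company.jurisdiction", "value": "GB",
     "evidence": [{"url": "https://example.com/registry-entry"}]},
    {"field": "metrics.growth_band", "value": "high"},
]

density = evidence_density(example_claims)
```

A claim with an empty or missing evidence list counts as self-asserted here, which matches the distinction drawn above between verified and self-asserted fields.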

Data governance & quality (the ISO layer)

What works well: ISO 8000 (data quality), ISO 27001 (information security), and ISO 27701 (privacy) are the reference frameworks regulated buyers expect to hear named in any data-handling pitch.

Where the gap is: all three are management system standards, not formats. They tell you how to run processes, not what fields to publish. They give credibility but not implementability.

How agentic-first relates: we name them as the reference frame our governance practices map onto, without pretending v0.1 of an open spec ships an ISO certification. As and when the project gets serious, formal alignment is on the table.

The emerging agent-web stack

The work most adjacent to ours, and most likely to be confused with it, is the small cluster of agent-web conventions that have appeared in the last 18 months: /.well-known/mcp.json, /.well-known/agent-card.json, /llms.txt and /agents-brief.txt, each summarised in the table above.

These are all about protocol, permission, or content. None of them describe the publisher — the company or individual operating the site, their identity, their banded shape, their preferred contact channel. That's the slot we're trying to fill, and it's why the publisher-identity slot is its own well-known file rather than an extension to one of the existing four.

Where the gaps are, in one picture

| Layer | Public, generally about the publisher | Public, structured business info | Private, diligence-grade detail |
| --- | --- | --- | --- |
| Today | Schema.org Organization / Person: name, URL, logo. Nothing canonical for jurisdiction, registry ID, stage, headcount band, contact preference. | XBRL for listed firms only (~1% of companies). Nothing for the rest. | Bilateral data rooms behind manual NDAs. No standard shape. |
| Gap | An open, machine-readable, publisher-controlled file that an agent can fetch in one HTTP GET. | An open, banded, FCA-aware vocabulary for the 99% of companies XBRL doesn't reach. | An open schema for what sits behind a scoped OAuth token, served from the publisher's own MCP. |
| agentic-first v0.1 | /.well-known/agentic-profile.json with profile_kind + tier: "public" | The same file's funding, team, metrics sections, banded by the schema | The matching *-private-profile schema, served from the publisher's own MCP at their own auth |
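
The "one HTTP GET" in the first column can be sketched with the Python standard library alone. The path and the profile_kind and tier keys come from the table above; the function names, the Accept header, the timeout, and the specific checks are illustrative assumptions rather than the project's validator.

```python
import json
from urllib.parse import urlunsplit
from urllib.request import Request, urlopen

WELL_KNOWN_PATH = "/.well-known/agentic-profile.json"


def profile_url(domain: str) -> str:
    """Build the well-known URL for a publisher's domain."""
    return urlunsplit(("https", domain, WELL_KNOWN_PATH, "", ""))


def parse_public_profile(body: bytes) -> dict:
    """Decode the response body and check the two keys the text names."""
    profile = json.loads(body)
    if not isinstance(profile, dict):
        raise ValueError("profile must be a JSON object")
    if "profile_kind" not in profile:
        raise ValueError("missing profile_kind")
    if profile.get("tier") != "public":
        raise ValueError("expected the public tier at the well-known path")
    return profile


def fetch_public_profile(domain: str, timeout: float = 10.0) -> dict:
    """One GET against the publisher's own domain; no directory involved."""
    request = Request(profile_url(domain), headers={"Accept": "application/json"})
    with urlopen(request, timeout=timeout) as response:
        return parse_public_profile(response.read())
```

Calling fetch_public_profile("example.com") would request https://example.com/.well-known/agentic-profile.json; the protected tier is deliberately out of scope here, since it sits behind the publisher's own MCP and OAuth.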

Where agentic-first fits — and where it doesn't

The point of this page is not to claim agentic-first replaces any of the standards above. It doesn't. It sits next to them and describes the one thing none of them describe: the publisher themselves, in a shape an agent can act on.

The contribution agentic-first makes is the composition: one well-known file, two tiers, four schemas, a publisher-controlled directory that indexes them. Every layer it touches is built on the existing standard for that layer.

A starting point, not a fait accompli

v0.1 is deliberately small. We'd rather ship something three adopters have built against than a 200-page spec nobody has implemented. The schemas, the validator, the directory MCP, and the deployment scripts are all open source under Apache-2.0; the governance lives in SCHEMA.md, and decisions of consequence land as ADRs in docs/adr.

The honest pitch: nobody has done this for general company information yet. We've put a small, opinionated first cut in the open, with a working directory and four published schemas, so the conversation about what an open publisher-controlled company-data layer should look like can happen against running code rather than slides. If you've got strong views — or you've already published your profile — we'd genuinely like to hear from you.
