Markdown / MDX adapter
The Markdown adapter is the canonical reference adapter and the simplest ACT producer: it walks a directory tree of
.md/.mdx files with frontmatter and emits a conformant manifest, index, and per-node JSON. Most other adapters compose Markdown semantics for their prose payload, so this document also pins the body-to-block mapping shared across adapters.
Live examples. Most deployed examples exercise this adapter: astro-docs, eleventy-blog, starlight-docs, vitepress-docs, and docusaurus-docs all source their content from
.md files via this adapter. Pick any of them in the site browser to see the emitted ACT tree.
Status
This is a first-party reference adapter distributed as
@act-spec/adapter-markdown. Its default export satisfies the adapter
contract documented under ../wire-format/ and is the pattern every
other adapter follows. The mapping in this document is normative.
Source content model
A Markdown corpus is rooted at a configured sourceDir. The adapter
walks a glob set (default ["**/*.md", "**/*.mdx"]) honoring an
ignore set (default node_modules, .git, .act). Each file has:
- An OPTIONAL frontmatter block at the very start of the file:
  - YAML 1.2 frontmatter delimited by --- lines.
  - TOML 1.0 frontmatter delimited by +++ lines.
- A markdown body (CommonMark 0.31 + GFM extensions: tables, alerts,
  task lists; MDX 3.x grammar in .mdx).
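As an illustration of the two delimiter styles, the sketch below splits a file into raw frontmatter and body by inspecting its first line. The helper name `splitFrontmatter`, the `SplitResult` shape, and the choice to throw on an unterminated block are assumptions made for this example, not the adapter's actual API; parsing the raw text is left to a YAML or TOML library.

```typescript
type FrontmatterFormat = "yaml" | "toml";

interface SplitResult {
  format: FrontmatterFormat | null; // null when the file has no frontmatter
  raw: string; // unparsed frontmatter text, "" when absent
  body: string; // everything after the closing delimiter
}

// Hypothetical helper: detects the delimiter on the first line and splits
// the file. It does not parse the YAML or TOML payload.
function splitFrontmatter(text: string): SplitResult {
  const lines = text.split(/\r?\n/);
  const first = lines[0] ?? "";
  const format: FrontmatterFormat | null = /^---\s*$/.test(first)
    ? "yaml"
    : /^\+\+\+\s*$/.test(first)
      ? "toml"
      : null;
  if (format === null) return { format: null, raw: "", body: text };

  // The closing delimiter must match the opening one.
  const fence = format === "yaml" ? "---" : "+++";
  const close = lines.findIndex(
    (line, i) => i > 0 && line.replace(/\s+$/, "") === fence,
  );
  if (close === -1) {
    // Assumption: an unterminated block is treated as malformed frontmatter.
    throw new Error(`unterminated ${format} frontmatter`);
  }
  return {
    format,
    raw: lines.slice(1, close).join("\n"),
    body: lines.slice(close + 1).join("\n"),
  };
}
```

A file with no leading delimiter is returned unchanged as body, which matches the frontmatter block being OPTIONAL.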
The adapter does NOT introspect React, Vue, or Angular component
trees. MDX components are surfaced as opaque placeholders with
metadata.extracted_via: "component-contract" so a downstream
component-contract layer can supply real extracted blocks via the
multi-source merge step.
Mapping to ACT nodes
| Source | ACT node type | Required ACT fields | Notes |
|---|---|---|---|
| .md file | leaf | id, type, locale, title, content, parents | frontmatter type overrides default "article"; default ID derived from path |
| .mdx file | leaf | as above | each component embed becomes a marketing:placeholder block; MUST run against a Standard-or-higher target |
| index.md inside a section | branch | id, type ("section"), children, title | the surrounding directory becomes a section node; index.md collapses to the directory’s ID |
| _drafts/ (or other config-excluded path) | EXCLUDED | n/a | drafts skipped unless explicitly included via the include glob |
Recognized frontmatter keys (all OPTIONAL; defaults applied per the rules in the next section):
| Key | Type | ACT mapping | Default if absent |
|---|---|---|---|
| id | string | explicit ID override | derived from path |
| title | string | node title | first H1, else file stem |
| summary | string | node summary, summary_source: "author" | extracted first paragraph |
| summary_source | string | open enum ("author" / "extracted" / etc.) | stamped per derivation |
| type | string | node type | "article" (or "section" for an index.md) |
| tags | array of strings | node tags | absent |
| parent | string (ID) | node parent | absent |
| related | array of { id, relation } or plain ID strings | related[] | absent; plain strings upgraded to { id, relation: "see-also" } |
| metadata | object | merged into node metadata | absent |
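
The upgrade of plain ID strings in related can be sketched as follows. The function name `normalizeRelated` and the type names are hypothetical; only the { id, relation } shape and the "see-also" default come from the table above.

```typescript
interface RelatedRef {
  id: string;
  relation: string;
}

type RelatedInput = string | RelatedRef;

// Hypothetical helper: upgrades plain ID strings to { id, relation: "see-also" }
// and passes explicit objects through unchanged.
function normalizeRelated(
  input: RelatedInput[] | undefined,
): RelatedRef[] | undefined {
  if (input === undefined) return undefined; // absent stays absent
  return input.map((entry) =>
    typeof entry === "string" ? { id: entry, relation: "see-also" } : entry,
  );
}
```

For example, `normalizeRelated(["api/overview"])` yields a single entry with relation "see-also", while an entry that already names a relation keeps it.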
Reserved metadata keys (metadata.source, metadata.locale,
metadata.translations, metadata.translation_status,
metadata.fallback_from, metadata.extraction_status,
metadata.extracted_via) MUST NOT be settable from frontmatter.
Attempted assignment to a reserved key is unrecoverable: the build
fails non-zero with an error citing the offending file and key.
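
A minimal sketch of the reserved-key check, assuming the frontmatter metadata object has already been parsed. The function name and error wording are illustrative only; the key list is taken from the paragraph above.

```typescript
const RESERVED_METADATA_KEYS = [
  "source",
  "locale",
  "translations",
  "translation_status",
  "fallback_from",
  "extraction_status",
  "extracted_via",
] as const;

// Hypothetical guard: throws on the first reserved key found so the build
// can fail with a message that cites both the file and the key.
function assertNoReservedMetadata(
  file: string,
  metadata: Record<string, unknown> | undefined,
): void {
  if (metadata === undefined) return;
  for (const key of RESERVED_METADATA_KEYS) {
    if (Object.prototype.hasOwnProperty.call(metadata, key)) {
      throw new Error(
        `${file}: frontmatter must not set reserved key metadata.${key}`,
      );
    }
  }
}
```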
ID derivation (when no frontmatter id: is present):
- Drop the file extension.
- Replace path separators with /.
- Collapse index (the file stem) to its parent directory, so docs/intro/index.md yields docs/intro, not docs/intro/index.
- Lowercase ASCII; non-grammar characters (anything outside [a-z0-9./-]) are replaced with -; runs of - collapse to one.
- The result MUST satisfy the node-ID grammar pinned in ../wire-format/node.md.
Precedence: explicit frontmatter id: wins over a config rule
(idStrategy.stripPrefix, glob-keyed overrides), which wins over
the default derivation.
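
The derivation steps and the precedence rule can be sketched as below. This is an illustration under stated assumptions: `deriveId` and `resolveId` are hypothetical names, the input path is taken to be relative to sourceDir, and the final grammar check against ../wire-format/node.md is not implemented here.

```typescript
// Hypothetical helper following the derivation steps listed above.
function deriveId(relativePath: string): string {
  return relativePath
    .replace(/\.(md|mdx)$/i, "") // drop the file extension
    .replace(/\\/g, "/") // normalize path separators to "/"
    .replace(/\/index$/, "") // collapse a trailing index stem to its directory
    .toLowerCase()
    .replace(/[^a-z0-9./-]/g, "-") // replace non-grammar characters
    .replace(/-+/g, "-"); // collapse runs of "-"
}

// Precedence: frontmatter id, then a config-level override, then derivation.
// The configId parameter stands in for whatever a config rule produces.
function resolveId(
  relativePath: string,
  frontmatterId?: string,
  configId?: string,
): string {
  return frontmatterId ?? configId ?? deriveId(relativePath);
}
```

Note that a top-level index.md is left as "index" by this sketch, because the collapse only applies when a parent directory segment exists.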
Body-to-block mapping
The adapter operates in one of two modes:
- Coarse mode (Core, default). The full markdown body becomes a single markdown block. Round-trip fidelity is highest; consumers must re-parse to extract structure.
- Fine mode (Standard, opt-in via mode: "fine"). The body is split into typed blocks following the content-block taxonomy:
  - Prose paragraphs / headings / lists / blockquotes → prose blocks with format: "markdown".
  - Fenced code blocks → code blocks with lang set from the fence info string.
  - Data fences (json data, yaml data, toml data) → data blocks with format derived from the fence info string.
  - Admonition (:::note) and GFM-alert (> [!NOTE]) syntax → callout blocks with level from the recognized triggers (note / info / tip / warning / danger / important).
  - MDX components → marketing:placeholder with metadata.component (the component tag) and metadata.props (a JSON-serializable snapshot of the props), plus metadata.extracted_via: "component-contract".
Block ordering in content[] matches source order.
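
To make the two modes concrete, here is a deliberately simplified sketch that handles only prose, code fences, and data fences; callouts and MDX placeholders are omitted. The block field names `type` and `text`, the function names, and the choice to group contiguous prose into one block are assumptions for illustration. A real implementation would work from a CommonMark AST rather than line scanning.

```typescript
// "format" and "lang" follow the mapping above; "type" and "text" are assumed names.
type Block =
  | { type: "markdown"; text: string }
  | { type: "prose"; format: "markdown"; text: string }
  | { type: "code"; lang: string; text: string }
  | { type: "data"; format: string; text: string };

const FENCE = "`".repeat(3);

// An info string such as "json data" marks a data fence; anything else is code.
function fenceToBlock(info: string, text: string): Block {
  const [lang = "", marker] = info.split(/\s+/);
  return marker === "data"
    ? { type: "data", format: lang, text }
    : { type: "code", lang, text };
}

function toBlocks(body: string, mode: "coarse" | "fine" = "coarse"): Block[] {
  if (mode === "coarse") return [{ type: "markdown", text: body }];

  const blocks: Block[] = [];
  let prose: string[] = [];
  let fence: { info: string; lines: string[] } | null = null;

  const flushProse = (): void => {
    const text = prose.join("\n").trim();
    if (text !== "") blocks.push({ type: "prose", format: "markdown", text });
    prose = [];
  };

  for (const line of body.split(/\r?\n/)) {
    if (fence === null) {
      if (line.startsWith(FENCE)) {
        flushProse(); // keep source order: prose before the fence comes first
        fence = { info: line.slice(FENCE.length).trim(), lines: [] };
      } else {
        prose.push(line);
      }
    } else if (line.trim() === FENCE) {
      blocks.push(fenceToBlock(fence.info, fence.lines.join("\n")));
      fence = null;
    } else {
      fence.lines.push(line);
    }
  }
  flushProse();
  // Assumption: an unterminated fence still emits what was collected.
  if (fence !== null) {
    blocks.push(fenceToBlock(fence.info, fence.lines.join("\n")));
  }
  return blocks;
}
```

Because blocks are pushed as they are encountered, the output order matches source order without a separate sort.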
Summary derivation
- If frontmatter summary is present, use it verbatim and stamp summary_source: "author".
- Otherwise, walk the body:
  - Skip the frontmatter block.
  - Skip HTML comments (<!-- ... -->).
  - Skip headings.
  - Take the first contiguous paragraph; trim leading and trailing whitespace.
  - Stamp summary_source: "extracted".
- The extracted summary is capped at the validator’s summary-token warning threshold; an over-cap summary surfaces a build warning, not an error.
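
The walk above can be sketched as follows, assuming the frontmatter block has already been removed from the input. The function names are hypothetical, only ATX (#) headings are recognized, paragraph lines are joined with a single space, and the token cap is not modeled.

```typescript
interface Summary {
  summary: string;
  summary_source: "author" | "extracted";
}

// Hypothetical helper: returns the first contiguous paragraph of the body,
// skipping HTML comments and ATX headings.
function extractSummary(body: string): string | undefined {
  const withoutComments = body.replace(/<!--[\s\S]*?-->/g, "");
  const paragraph: string[] = [];
  for (const line of withoutComments.split(/\r?\n/)) {
    const trimmed = line.trim();
    const isHeading = /^#{1,6}(\s|$)/.test(trimmed);
    if (trimmed === "" || isHeading) {
      if (paragraph.length > 0) break; // the first paragraph has ended
      continue;
    }
    paragraph.push(trimmed);
  }
  return paragraph.length > 0 ? paragraph.join(" ") : undefined;
}

// Frontmatter wins verbatim; otherwise fall back to extraction.
function resolveSummary(
  frontmatterSummary: string | undefined,
  body: string,
): Summary | undefined {
  if (frontmatterSummary !== undefined) {
    return { summary: frontmatterSummary, summary_source: "author" };
  }
  const extracted = extractSummary(body);
  return extracted === undefined
    ? undefined
    : { summary: extracted, summary_source: "extracted" };
}
```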
Manifest emission
The adapter contributes the following to the build’s manifest
(../wire-format/manifest.md):
- site.canonical_url ← from generator config.
- locales ← single locale derived from frontmatter or generator config; multi-locale Markdown corpora compose with the i18n adapter.
- capabilities ← computed from observed emissions, never inflated. Coarse-mode runs advertise { etag: true }; fine-mode runs add { subtree: true } once subtree files exist.
- delivery: "static" (the Markdown adapter is build-time only).
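
One way to keep capabilities from being inflated is to derive them only from counters collected during the run. The sketch below assumes a hypothetical `Observed` record; only the { etag: true } and { subtree: true } flags come from the text above.

```typescript
interface Capabilities {
  etag?: true;
  subtree?: true;
}

// Hypothetical counters gathered while writing output files.
interface Observed {
  nodesWithEtag: number;
  subtreeFiles: number;
}

// A flag is advertised only if the corresponding output was actually emitted.
function computeCapabilities(observed: Observed): Capabilities {
  const capabilities: Capabilities = {};
  if (observed.nodesWithEtag > 0) capabilities.etag = true;
  if (observed.subtreeFiles > 0) capabilities.subtree = true;
  return capabilities;
}
```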
Index emission
The adapter contributes one node-ref per emitted node to the build’s
index document (../wire-format/index.md). Section nodes
(synthesized from directories) carry their children array in
filesystem-walk order; explicit ordering MAY be supplied via a
sibling _order.json file when deterministic ordering matters.
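
The schema of _order.json is not pinned in this document. Purely as an illustration, the sketch below assumes it is a JSON array of child IDs and that children missing from the file keep their filesystem-walk order after the listed ones; both points are assumptions, as is the function name.

```typescript
// Hypothetical helper: applies an explicit order to a section's children.
function orderChildren(
  walkOrder: string[],
  explicitOrder?: string[],
): string[] {
  if (explicitOrder === undefined) return walkOrder.slice();
  const present = new Set(walkOrder);
  // Keep only IDs that exist, and ignore repeated entries.
  const listed = explicitOrder.filter(
    (id, i) => present.has(id) && explicitOrder.indexOf(id) === i,
  );
  const listedSet = new Set(listed);
  const rest = walkOrder.filter((id) => !listedSet.has(id));
  return listed.concat(rest);
}
```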
i18n
A single Markdown adapter instance emits one locale per file. Mixed per-file locales MAY be supported by configuring per-glob locale assignment, but the canonical patterns are:
- Pattern 2 (per-locale subtree, default): one Markdown adapter instance per locale, each rooted at <sourceDir>/<locale>/.
- Pattern 1 (locale-prefixed IDs, opt-in): a single instance walks <sourceDir> and reads a lang: frontmatter key, with the locale prefixed into the ID.
Cross-locale metadata.translations arrays are populated by the i18n
adapter (./i18n.md) via the multi-source merge step; the Markdown
adapter does not emit them on its own.
Failure surface
- Recoverable (build warning, exit 0):
  - Missing optional frontmatter key: silent default applied.
  - Body extraction fails for one block (e.g., a data fence has invalid JSON): emit the rest of the node with metadata.extraction_status: "partial" and metadata.extraction_error describing the cause.
- Unrecoverable (build fails non-zero):
  - Malformed frontmatter (YAML/TOML parse error, or delimiter mismatch when frontmatter.format is explicit).
  - Reserved-metadata-key assignment from frontmatter.
  - ID-grammar violation that survives normalization.
  - Within-adapter ID collision (two files emit the same ID before the multi-source merge step runs).
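
As an example of one unrecoverable case, the within-adapter ID collision check can be sketched like this. The function name, input shape, and error wording are hypothetical.

```typescript
interface EmittedNode {
  id: string;
  file: string; // source path, used only for the error message
}

// Hypothetical guard run before the multi-source merge step: throws as soon
// as two source files resolve to the same node ID.
function assertUniqueIds(nodes: EmittedNode[]): void {
  const firstSeen = new Map<string, string>();
  for (const { id, file } of nodes) {
    const previous = firstSeen.get(id);
    if (previous !== undefined) {
      throw new Error(
        `ID collision: "${id}" is emitted by both ${previous} and ${file}`,
      );
    }
    firstSeen.set(id, file);
  }
}
```

A collision of this kind can arise when intro.md and intro/index.md sit side by side, since both derive the ID intro.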
Conformance target
- Core: coarse mode, frontmatter-or-derived summary, single locale, manifest + index + nodes, ETag, atomic writes.
- Standard: + fine-grained body-to-block splitting, + delta(since) via mtime for incremental rebuilds, + subtree files where eligible, + Pattern 2 i18n composition.
- Strict: reachable when composed with the i18n adapter (Pattern 1 with metadata.translations) and a component-contract layer that resolves MDX placeholders.
Examples
A docs/ tree:

    docs/
      index.md               # site root → id: "index"
      getting-started/
        index.md             # → id: "getting-started"
        install.md           # → id: "getting-started/install"
      api/
        overview.md          # → id: "api/overview"

emits a manifest at /.well-known/act.json, an index with five
node-refs (index, getting-started, getting-started/install,
api, api/overview), and five node JSONs at
/act/nodes/<id>.json. The getting-started node carries
children: ["getting-started/install"] and the
getting-started/install node carries parent: "getting-started".
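
The example can be reproduced with a small sketch that synthesizes section nodes from directories and wires up parent and children. The names `TreeNode`, `toId`, and `buildTree` are hypothetical, paths are taken relative to docs/, and only the ID, parent, and children fields are modeled.

```typescript
interface TreeNode {
  id: string;
  parent?: string;
  children: string[];
}

// Drop the extension and collapse a trailing index stem to its directory.
function toId(file: string): string {
  return file.replace(/\.mdx?$/, "").replace(/\/index$/, "");
}

function buildTree(files: string[]): Map<string, TreeNode> {
  const nodes = new Map<string, TreeNode>();

  // Creates the node and any missing ancestor sections, parents first.
  const ensure = (id: string): TreeNode => {
    const existing = nodes.get(id);
    if (existing !== undefined) return existing;
    const slash = id.lastIndexOf("/");
    const parent = slash > 0 ? ensure(id.slice(0, slash)) : undefined;
    const node: TreeNode =
      parent !== undefined
        ? { id, parent: parent.id, children: [] }
        : { id, children: [] };
    nodes.set(id, node);
    if (parent !== undefined) parent.children.push(id);
    return node;
  };

  for (const file of files) ensure(toId(file));
  return nodes;
}

const tree = buildTree([
  "index.md",
  "getting-started/index.md",
  "getting-started/install.md",
  "api/overview.md",
]);
```

With these four files the map holds five nodes, including an api section that has no file of its own.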
Open questions / extension points
- JSON frontmatter. Some Hugo / Eleventy corpora use { ... } JSON blocks at file head. Not supported in v0.2; an additive ASP could enable it.
- Per-key translation tracking for MDX components that wrap message-catalog calls. Tracked alongside the component-contract layer; the markdown adapter only emits the placeholder seam.
- Custom fence handlers (e.g., mermaid, vega-lite) MAY be registered to emit data blocks with adapter-defined format values; conformance fixtures are the test.
Sources
- ./hugo.md, ./mkdocs.md, ./jekyll.md for cross-generator conventions.
- ../wire-format/node.md for the node envelope grammar.
- ../wire-format/etag.md for ETag derivation.
Changelog
| Date | Version | Change |
|---|---|---|
| 2026-05-03 | 0.2.0 | Initial spec drafted by BDFL |