Transform Markdown Metadata

A remark plugin that extracts metadata from MDX files and optionally updates parent directory index pages. This plugin automatically collects page titles, descriptions, keywords, and hierarchical section structures to create searchable, navigable documentation indexes.

Overview

Use this plugin to extract comprehensive metadata from your MDX documentation pages. It parses both ES module metadata exports (export const metadata = {...}) and document content (headings, descriptions) to build structured metadata that powers navigation, search, and content discovery.

The plugin automatically merges extracted metadata into your page's export const metadata object:

Metadata extraction: Extracts title, description, and sections from page content
Smart merging: Fills in missing fields in existing metadata exports
Auto-creation: Creates metadata export if none exists
Index updates: Optionally updates parent directory's page.mdx with an auto-generated index

Note: The plugin modifies the AST during build time to add or update the export const metadata object in your MDX files. User-defined metadata fields are never overwritten—the plugin only fills in missing values.
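
The fill-only merge rule described in the note can be sketched as follows. This is a minimal illustration, not the plugin's actual implementation: the `PageMetadata` shape and the `fillMissing` helper are hypothetical names introduced here.

```typescript
// Hypothetical, simplified metadata shape (the real plugin tracks more fields).
interface PageMetadata {
  title?: string;
  description?: string;
  keywords?: string[];
}

// Keep every user-defined value; use the extracted value only where the
// user left the field undefined.
function fillMissing(user: PageMetadata, extracted: PageMetadata): PageMetadata {
  return {
    title: user.title ?? extracted.title,
    description: user.description ?? extracted.description,
    keywords: user.keywords ?? extracted.keywords,
  };
}

// The user only declared keywords, so title and description come from content.
const merged = fillMissing(
  { keywords: ['button'] },
  { title: 'Button Component', description: 'A versatile button component.' },
);
console.log(merged.title); // "Button Component"
```

Using `??` rather than `||` means an intentionally empty string set by the user would also be preserved under this sketch.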

Key Features

Automatic title extraction: Finds the first H1 heading as the page title
Description parsing: Extracts the first paragraph as the page description
Hierarchical sections: Builds a nested tree of all H2-H6 headings with slugs
Formatting preservation: Maintains inline code, bold, italics in section titles
Metadata merging: Automatically updates the page's export const metadata object
Smart defaults: Only fills in missing fields—never overwrites user-defined values
ES module metadata: Works with Next.js metadata object format
Open Graph inheritance: Extracted title and description are automatically inherited by Next.js for social media previews
Index auto-generation: Maintains parent directory indexes automatically

Installation & Usage

import { unified } from 'unified';
import remarkParse from 'remark-parse';
import transformMarkdownMetadata from '@mui/internal-docs-infra/pipeline/transformMarkdownMetadata';

const processor = unified()
  .use(remarkParse)
  .use(transformMarkdownMetadata, { extractToIndex: true });

With Next.js and MDX

When using the plugin directly (without withDocsInfra), you'll need to provide the baseDir for path filtering to work correctly:

// next.config.js (ES module syntax)
import createMDX from '@next/mdx';
import transformMarkdownMetadata from '@mui/internal-docs-infra/pipeline/transformMarkdownMetadata';
import { fileURLToPath } from 'node:url';
import { dirname } from 'node:path';

// Directory containing this config file; used as the base for path filtering
const currentDirname = dirname(fileURLToPath(import.meta.url));

const withMDX = createMDX({
  options: {
    remarkPlugins: [
      [
        transformMarkdownMetadata,
        {
          extractToIndex: {
            include: ['app'],
            exclude: [],
            baseDir: currentDirname, // Required for path filtering with absolute paths
          },
        },
      ],
    ],
  },
});

export default withMDX({
  // your Next.js config
});

Note: The baseDir is needed because Next.js provides absolute file paths to remark plugins. The plugin strips this prefix to match against your include/exclude patterns. Index files (e.g., app/page.mdx, app/components/page.mdx) are automatically excluded.

With withDocsInfra Plugin

The withDocsInfra Next.js plugin automatically includes this plugin with the correct configuration:

// next.config.js
import createMDX from '@next/mdx';
import { withDocsInfra, getDocsInfraMdxOptions } from '@mui/internal-docs-infra/withDocsInfra';

// Create MDX with docs-infra configuration
const withMDX = createMDX({
  options: getDocsInfraMdxOptions({
    // Automatically includes extractToIndex with default filter
    // { include: ['app', 'src/app'], exclude: [], baseDir: process.cwd() }
    // Index files are automatically excluded
  }),
});

const nextConfig = {
  // Your custom configuration
};

export default withDocsInfra()(withMDX(nextConfig));

// Alternative: disable index generation (use in place of the createMDX call above)
const withMDX = createMDX({
  options: getDocsInfraMdxOptions({
    extractToIndex: false,
  }),
});

export default withDocsInfra()(withMDX(nextConfig));

// Alternative: customize path filters (use in place of the createMDX call above)
const withMDX = createMDX({
  options: getDocsInfraMdxOptions({
    extractToIndex: {
      include: ['app/docs', 'app/api'],
      exclude: ['app/docs/internal'],
    },
  }),
});

export default withDocsInfra()(withMDX(nextConfig));

Configuration

titleSuffix

Type: string
Default: undefined

A suffix to append to the title in the exported metadata object. This is useful for adding site-wide title suffixes like " | My Site" to page metadata for SEO.

The suffix is only applied to the export const metadata title—it does not affect:

The title used in index extraction
Any internal metadata processing

// Adds " | Base UI" to all page titles in the metadata export
.use(transformMarkdownMetadata, { titleSuffix: ' | Base UI' })

// Input: # Button Component
// Result: export const metadata = { title: "Button Component | Base UI", ... }

extractToIndex

Type: boolean | { include: string[], exclude: string[], baseDir?: string }
Default: false

Controls automatic extraction of page metadata to parent directory index files.

When enabled, the plugin extracts metadata (title, description, headings) from MDX files and maintains an index in the parent directory's page.mdx file.

Index files themselves (e.g., pattern/page.mdx where pattern is in the include list) are automatically excluded from extraction.

Options:

false - Disabled (no index updates)
true - Enabled with default filter: { include: ['app', 'src/app'], exclude: [], baseDir: process.cwd() }
{ include: string[], exclude: string[], baseDir?: string } - Enabled with custom path filters

Path matching uses prefix matching - a file matches if its path starts with any include pattern and doesn't start with any exclude pattern. Files matching pattern/page.mdx (where pattern is in the include list) are automatically skipped as they are index files themselves.

Important: Patterns should not include trailing slashes. The plugin automatically appends / during matching. Use 'app' not 'app/', and 'src/app' not 'src/app/'.
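
The matching rules above can be sketched as a small predicate. This is an illustration under stated assumptions, not the plugin's source: `IndexFilter`, `stripRouteGroups`, and `shouldExtractToIndex` are hypothetical names, and the route-group handling follows the behavior described under Default Filter Rationale.

```typescript
// Hypothetical filter shape mirroring the documented extractToIndex options.
interface IndexFilter {
  include: string[];
  exclude: string[];
  baseDir?: string;
}

// Remove Next.js route groups such as "(shared)" from a path.
function stripRouteGroups(path: string): string {
  return path
    .split('/')
    .filter((segment) => !(segment.startsWith('(') && segment.endsWith(')')))
    .join('/');
}

function shouldExtractToIndex(filePath: string, filter: IndexFilter): boolean {
  // Strip the base directory so absolute paths can match relative patterns.
  let relative = filePath;
  if (filter.baseDir && relative.startsWith(filter.baseDir)) {
    relative = relative.slice(filter.baseDir.length);
  }
  relative = stripRouteGroups(relative.replace(/^\/+/, ''));

  // Files of the form pattern/page.mdx are index files themselves.
  if (filter.include.some((prefix) => relative === `${prefix}/page.mdx`)) {
    return false;
  }

  // A "/" is appended to each pattern, so 'app' does not match 'application/...'.
  const underPrefix = (prefix: string) => relative.startsWith(`${prefix}/`);
  return filter.include.some(underPrefix) && !filter.exclude.some(underPrefix);
}
```

Appending the slash inside the matcher is why patterns are written without a trailing slash: `'app/'` would be tested as `'app//'` and never match.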

Fields:

include (string[]): Path prefixes that files must start with to be indexed (without trailing slashes)
exclude (string[]): Path prefixes to exclude from indexing (without trailing slashes)
baseDir (string, optional): Base directory to strip from absolute file paths before matching. When using getDocsInfraMdxOptions(), this defaults to process.cwd(). When calling the plugin directly, you should provide this for accurate path filtering.
useVisibleDescription (boolean, optional): When true, uses the first visible paragraph after the H1 as the description in the extracted index, even when a <meta> tag provides the SEO description. This is useful when you want different descriptions for SEO (meta tag) vs. the index page (visible content). Default: false.
indexWrapperComponent (string, optional): Name of a React component to wrap around the autogenerated index content. When provided, the generated markdown will wrap the page list and detail sections in this component (e.g., <PagesIndex>...</PagesIndex>). This is useful for applying consistent styling or behavior to index pages.

// Extract but don't update index
.use(transformMarkdownMetadata)

// Extract and update parent index with default filter
.use(transformMarkdownMetadata, { extractToIndex: true })

// Custom path filter (when using directly, provide baseDir for accurate matching)
.use(transformMarkdownMetadata, {
  extractToIndex: {
    include: ['app/docs', 'app/components'],
    exclude: ['app/docs/internal'],
    baseDir: '/path/to/your/project' // e.g., dirname(fileURLToPath(import.meta.url))
  }
})

// Use visible paragraph for index, meta tag for SEO
.use(transformMarkdownMetadata, {
  extractToIndex: {
    include: ['app/components'],
    useVisibleDescription: true
  }
})

// Wrap index content in a custom component
.use(transformMarkdownMetadata, {
  extractToIndex: {
    include: ['app/components'],
    indexWrapperComponent: 'PagesIndex'
  }
})

Default Filter Rationale:

The default { include: ['app', 'src/app'], exclude: [] } is designed for Next.js App Router projects:

Includes app and src/app: Processes pages in both common Next.js directory structures
No trailing slashes: Patterns should not have trailing slashes (they're added automatically during matching)
Automatic index exclusion: Index files like app/page.mdx, app/components/page.mdx are automatically skipped to prevent them from creating parent indexes
Route groups: Next.js route groups like (shared) are automatically removed when matching, so app/(shared)/page.mdx is treated as app/page.mdx

This ensures index pages are created at every level without unwanted parent indexes or interference with your site structure.

Automatic Sitemap Data Injection

When indexWrapperComponent is configured, the plugin automatically injects SitemapSectionData as a data prop into the wrapper component when processing autogenerated index files. This enables wrapper components to receive structured data for dynamic rendering, search, or navigation features.

The plugin:

Detects autogenerated index files (by their marker comments)
Parses the page list metadata from the markdown
Finds the wrapper component in the AST
Injects the data as a data prop

The injected data structure:

interface SitemapSectionData {
  title: string; // From the H1 heading
  prefix: string; // URL prefix derived from file path
  pages: SitemapPage[]; // Array of page metadata
}

interface SitemapPage {
  title?: string;
  slug: string;
  path: string;
  description?: string;
  keywords?: string[];
  sections?: Record<string, SitemapSection>;
  parts?: Record<string, SitemapPart>;
  exports?: Record<string, SitemapExport>;
  tags?: string[];
  skipDetailSection?: boolean;
  image?: {
    url: string;
    alt?: string;
  };
}

Prefix Computation:

The prefix field is derived from the file path using the baseDir from extractToIndex:

Strips baseDir if provided
Removes src and app directories
Filters out Next.js route groups (e.g., (public))
Converts to URL format with leading and trailing slashes

For example, /project/docs/app/(public)/components/page.mdx with baseDir: '/project/docs' becomes prefix /components/.
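
The four steps above can be sketched as a single function. This is a simplified illustration: `computePrefix` is a hypothetical name, and the sketch drops every `src` and `app` segment wherever it appears, which may be broader than what the plugin does.

```typescript
function computePrefix(filePath: string, baseDir?: string): string {
  // 1. Strip baseDir if provided.
  let relative = filePath;
  if (baseDir && relative.startsWith(baseDir)) {
    relative = relative.slice(baseDir.length);
  }

  const segments = relative
    .split('/')
    .filter((segment) => segment.length > 0)
    .slice(0, -1) // drop the file name (e.g. page.mdx)
    // 2. Remove src and app directories.
    .filter((segment) => segment !== 'src' && segment !== 'app')
    // 3. Filter out route groups such as (public).
    .filter((segment) => !(segment.startsWith('(') && segment.endsWith(')')));

  // 4. Convert to URL format with leading and trailing slashes.
  return segments.length === 0 ? '/' : `/${segments.join('/')}/`;
}

console.log(
  computePrefix('/project/docs/app/(public)/components/page.mdx', '/project/docs'),
); // "/components/"
```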

Use Case:

// PagesIndex.tsx
import type { SitemapSectionData } from '@mui/internal-docs-infra/createSitemap/types';

export function PagesIndex({
  children,
  data,
}: {
  children: React.ReactNode;
  data?: SitemapSectionData;
}) {
  // Use data for navigation, search, or custom rendering
  return (
    <div className="pages-index">
      {data && <SearchableList pages={data.pages} />}
      {children}
    </div>
  );
}

Basic Example

The simplest usage—write natural markdown and let the plugin extract metadata automatically:

Input MDX:

# Button Component

A versatile button component with multiple variants and sizes.

## Installation

Install the package using your preferred package manager.

## Usage

Import and use the button in your React components.

Extracted Metadata:

{
  "title": "Button Component",
  "description": "A versatile button component with multiple variants and sizes.",
  "sections": {
    "installation": {
      "title": "Installation",
      "titleMarkdown": [{ "type": "text", "value": "Installation" }],
      "children": {}
    },
    "usage": {
      "title": "Usage",
      "titleMarkdown": [{ "type": "text", "value": "Usage" }],
      "children": {}
    }
  }
}

This is the recommended pattern—clean, readable markdown with automatic metadata extraction.

Common Patterns

Recommended: Content-First Approach

Write natural markdown and let the plugin extract metadata automatically:

# Button Component

A versatile button component with multiple variants and sizes.

## Installation

Install the package using your preferred package manager.

## Usage

Import and use the button in your React components.

export const metadata = {
  keywords: ['button', 'interactive', 'form'],
};


Benefits:

No redundant metadata—the H1 is the title, first paragraph is the description
Clean, readable markdown source
Metadata exports only for computer-specific fields (keywords, Open Graph, etc.)
Placed at the end so they don't distract readers

Adding Keywords and Open Graph

When you need SEO-specific metadata or keywords, use export const metadata at the end of the file:

# CodeHighlighter

The CodeHighlighter component provides syntax highlighting.

<!-- Page content continues... -->

export const metadata = {
  description: 'Override the first paragraph for SEO purposes',
  keywords: ['syntax', 'highlighting', 'code', 'react', 'component'],
};


Why export metadata is preferred:

Clean source without inline tags
Type-safe with IDE autocomplete
Standard Next.js pattern
All metadata in one discoverable location

The plugin also supports <meta> or <Meta> tags anywhere in the document for migration scenarios, though inline tags lack type safety and clutter the markdown source.

Open Graph Example

Next.js automatically inherits metadata.title and metadata.description for Open Graph, so you only need to specify openGraph when adding images:

# CodeHighlighter

The CodeHighlighter component provides syntax highlighting.

<!-- Page content -->

export const metadata = {
  keywords: ['syntax', 'highlighting', 'code', 'react'],
  openGraph: {
    images: [
      {
        url: '/og-code-highlighter.png',
        width: 1200,
        height: 630,
        alt: 'Code Highlighter Preview',
      },
    ],
  },
};


Best Practice: Place metadata exports at the end of the file. The first H1 and paragraph are for human readers—they provide all the context needed when reading the markdown source. Metadata exports are for computers (search engines, social media, tooling) and should be unobtrusive.

Note: This uses MDX's ES module syntax (export const), not traditional YAML frontmatter.

Section Navigation

Use extracted sections for automatic navigation:

// Example component using extracted metadata
import { metadata } from './page.mdx';

export function TableOfContents() {
  return (
    <nav>
      {Object.entries(metadata.sections || {}).map(([slug, section]) => (
        <a key={slug} href={`#${slug}`}>
          {section.title}
        </a>
      ))}
    </nav>
  );
}

Best Practices

File Organization

One page per directory: Use page.mdx for each route
Consistent naming: Stick to kebab-case for directory names
Index placement: Let the plugin manage index files

Heading Structure

Single H1: Use only one H1 per page (the title)
Logical hierarchy: Don't skip heading levels (H2 → H3, not H2 → H4)
Descriptive titles: Write clear, searchable section titles
Formatting: Use inline code for function names, APIs

Metadata Export

Content first: Let the H1 and first paragraph provide title and description
Minimal metadata: Only export what can't be derived from content
Keywords: Add relevant, searchable keywords at the end of the file
Open Graph: Add OG metadata at the end when sharing on social media
End placement: Always place export const metadata at the end of the file

Index Management

Manual ordering: Reorder items in the editable section as needed
Let it auto-generate: New pages appear in the auto-generated section
Review regularly: Check index pages when adding new content

Index Generation

When extractToIndex is enabled, the plugin automatically maintains index pages:

Directory Structure

app/components/
├── page.mdx          # Auto-generated index
├── button/
│   └── page.mdx      # Button component docs
├── checkbox/
│   └── page.mdx      # Checkbox component docs
└── input/
    └── page.mdx      # Input component docs

Generated Index Format

The parent page.mdx is automatically created/updated:

# Components

[//]: # 'This file is autogenerated, but the following list can be modified.'

- [Button](#button) - [Full Docs](./button/page.mdx)
- [Checkbox](#checkbox) [New] - [Full Docs](./checkbox/page.mdx)

[//]: # 'This file is autogenerated, DO NOT EDIT AFTER THIS LINE'

## Button

A versatile button component

- Keywords: button, click, action
- Sections:
  - Installation
  - Usage
    - Basic Usage
    - Advanced Usage

[Read more](./button/page.mdx)

## Checkbox

Toggle selection states

- Keywords: checkbox, selection, form
- Sections:
  - Props
  - Examples

[Read more](./checkbox/page.mdx)

## Input

Text input component.

- Keywords: input, form, text
- Sections:
  - Variants

[Read more](./input/page.mdx)

[//]: # 'This file is autogenerated, but the following metadata can be modified.'

export const metadata = {
  robots: {
    index: false,
  },

}

Index Wrapper Component

When indexWrapperComponent is configured, the autogenerated content is wrapped in the specified component:

# Components

[//]: # 'This file is autogenerated, but the following list can be modified.'

<PagesIndex>

- [Button](#button) - [Full Docs](./button/page.mdx)
- [Checkbox](#checkbox) [New] - [Full Docs](./checkbox/page.mdx)

[//]: # 'This file is autogenerated, DO NOT EDIT AFTER THIS LINE'

## Button

A versatile button component

[Read more](./button/page.mdx)

## Checkbox

Toggle selection states

[Read more](./checkbox/page.mdx)

</PagesIndex>

[//]: # 'This file is autogenerated, but the following metadata can be modified.'

export const metadata = {
  robots: {
    index: false,
  },

}

This allows you to apply consistent styling or behavior (like custom navigation, search indexing, or layout) to the index content by defining a PagesIndex component in your MDX components.

Index Features

Editable section: Brief list above the "DO NOT EDIT" marker can be manually reordered
Tag support: Add status tags like [New], [Hot], or [Beta] directly after component names
Alphabetical sorting: Use the explicit marker to automatically sort pages alphabetically
Auto-generated section: Detailed page entries automatically added below the marker
Two-part structure: Concise links above, full details below
Relative links: All paths are relative for portability
Hierarchical sections: Nested sections shown with indentation (non-clickable)
Full descriptions: Complete page descriptions in the auto-generated section
Keywords and sections: Displayed as bullet lists under each page

Automatic Alphabetical Sorting

To have pages automatically sorted alphabetically in your index, replace the default editable marker with the alphabetical sorting marker:

# Components

[//]: # 'This file is autogenerated, but the following list can be modified. Automatically sorted alphabetically.'

- [Alpha](./alpha/page.mdx) - First component
- [Beta](./beta/page.mdx) - Second component
- [Zebra](./zebra/page.mdx) - Last component

[//]: # 'This file is autogenerated, DO NOT EDIT AFTER THIS LINE'

Sorting behavior:

User-controlled ordering: Default marker 'This file is autogenerated, but the following list can be modified.' preserves the order you define in the editable section
Automatic alphabetical: Marker 'This file is autogenerated, but the following list can be modified. Automatically sorted alphabetically.' sorts all pages alphabetically by title, ignoring the editable section order
Case-insensitive: Sorting uses localeCompare() for natural alphabetical ordering
Fallback to slug: If a page has no title, the slug is used for sorting

This is useful for index pages where alphabetical order makes more sense than manual ordering, such as component libraries or API references.
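
The sorting behavior can be sketched as a comparator over index entries. This is an illustrative sketch: `IndexEntry` and `sortAlphabetically` are hypothetical names, and passing `{ sensitivity: 'base' }` is an assumption made here to make the case-insensitive comparison explicit.

```typescript
// Hypothetical minimal entry shape: a page may lack a title but always has a slug.
interface IndexEntry {
  title?: string;
  slug: string;
}

function sortAlphabetically(pages: IndexEntry[]): IndexEntry[] {
  // Copy first so the caller's array is left untouched.
  return [...pages].sort((a, b) =>
    // Fall back to the slug when a page has no title.
    (a.title ?? a.slug).localeCompare(b.title ?? b.slug, undefined, {
      sensitivity: 'base',
    }),
  );
}

const sorted = sortAlphabetically([
  { title: 'Zebra', slug: 'zebra' },
  { slug: 'beta' },
  { title: 'alpha', slug: 'alpha' },
]);
console.log(sorted.map((page) => page.slug)); // ["alpha", "beta", "zebra"]
```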

Web-Native Navigation

A key benefit of auto-generated index pages is improved navigation UX. When users remove segments from the URL path (a common power-user pattern), they land on a meaningful index page instead of a 404:

/components/checkbox/page.mdx  →  User removes "checkbox"
/components/page.mdx           →  Lands on components index (not 404)

This creates a natural hierarchy where every directory level has content. Index pages don't need to be linked from your home page or site navigation—they can even be marked with noindex for SEO if you prefer they don't appear in search results. They exist purely to provide a web-native browsing experience for users exploring your documentation structure.

Example metadata for an unlisted index:

# Components

<!-- Auto-generated content -->

export const metadata = {
  robots: { index: false },
};


Automatic Index Structure

Let the plugin generate index pages automatically throughout your documentation:

app/docs/
├── page.mdx              # Auto-generated
├── getting-started/
│   └── page.mdx
├── components/
│   ├── page.mdx          # Auto-generated
│   ├── button/
│   │   └── page.mdx
│   └── input/
│       └── page.mdx
└── api/
    ├── page.mdx          # Auto-generated
    └── reference/
        └── page.mdx

Integration with Other Plugins

With transformMarkdownCode

Works alongside transformMarkdownCode to enhance documentation:

// next.config.js (ES module syntax)
import createMDX from '@next/mdx';
import transformMarkdownMetadata from '@mui/internal-docs-infra/pipeline/transformMarkdownMetadata';
import transformMarkdownCode from '@mui/internal-docs-infra/pipeline/transformMarkdownCode';

const withMDX = createMDX({
  options: {
    remarkPlugins: [[transformMarkdownMetadata, { extractToIndex: true }], transformMarkdownCode],
  },
});

Full Documentation Pipeline

Typical plugin order for comprehensive docs processing:

const remarkPlugins = [
  remarkGfm, // GitHub Flavored Markdown
  [transformMarkdownMetadata, { extractToIndex: true }], // Extract metadata & build indexes
  transformMarkdownCode, // Transform code blocks
  transformMarkdownDemoLinks, // Handle demo links
  transformMarkdownBlockquoteCallouts, // Style callouts
];

Advanced Examples

Nested Sections

The plugin builds hierarchical section trees from your heading structure:

Input MDX:

# API Reference

Complete API documentation for the component.

## Props

Configure the component with these props.

### Required Props

Props that must be provided.

### Optional Props

Props with default values.

## Methods

Public methods available on the component.

Extracted Sections:

{
  "props": {
    "title": "Props",
    "titleMarkdown": [{ "type": "text", "value": "Props" }],
    "children": {
      "required-props": {
        "title": "Required Props",
        "titleMarkdown": [{ "type": "text", "value": "Required Props" }],
        "children": {}
      },
      "optional-props": {
        "title": "Optional Props",
        "titleMarkdown": [{ "type": "text", "value": "Optional Props" }],
        "children": {}
      }
    }
  },
  "methods": {
    "title": "Methods",
    "titleMarkdown": [{ "type": "text", "value": "Methods" }],
    "children": {}
  }
}
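
One common way to fold a flat heading list into a tree like the one above is to keep a stack of open ancestors. The sketch below is an assumption about the approach, not the plugin's code: `FlatHeading`, `SectionNode`, and `buildHierarchy` are hypothetical names, slugs are taken as precomputed, and `titleMarkdown` is omitted for brevity.

```typescript
// Hypothetical input: headings in document order, depth 2 for H2 up to 6 for H6.
interface FlatHeading {
  depth: number;
  slug: string;
  title: string;
}

interface SectionNode {
  title: string;
  children: Record<string, SectionNode>;
}

function buildHierarchy(headings: FlatHeading[]): Record<string, SectionNode> {
  const root: Record<string, SectionNode> = {};
  // Stack of currently open ancestor sections, shallowest first.
  const stack: { depth: number; node: SectionNode }[] = [];

  for (const heading of headings) {
    // Close every open section at the same or a deeper level.
    while (stack.length > 0 && stack[stack.length - 1].depth >= heading.depth) {
      stack.pop();
    }
    const node: SectionNode = { title: heading.title, children: {} };
    const parent = stack.length > 0 ? stack[stack.length - 1].node.children : root;
    parent[heading.slug] = node;
    stack.push({ depth: heading.depth, node });
  }
  return root;
}

const tree = buildHierarchy([
  { depth: 2, slug: 'props', title: 'Props' },
  { depth: 3, slug: 'required-props', title: 'Required Props' },
  { depth: 3, slug: 'optional-props', title: 'Optional Props' },
  { depth: 2, slug: 'methods', title: 'Methods' },
]);
console.log(Object.keys(tree)); // ["props", "methods"]
```

In this sketch a skipped level (H2 followed directly by H4) simply nests the H4 under the H2.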

Formatted Section Titles

The plugin preserves inline code, bold, and italic formatting in section titles:

Input MDX:

# Utilities

## `parseSource()`

Parse source code into AST nodes.

## **Performance** Optimization

Tips for improving performance.

## _Advanced_ Topics

Deep dive into advanced features.

Extracted Sections:

{
  "parsesource": {
    "title": "parseSource()",
    "titleMarkdown": [{ "type": "inlineCode", "value": "parseSource()" }],
    "children": {}
  },
  "performance-optimization": {
    "title": "Performance Optimization",
    "titleMarkdown": [
      { "type": "strong", "children": [{ "type": "text", "value": "Performance" }] },
      { "type": "text", "value": " Optimization" }
    ],
    "children": {}
  },
  "advanced-topics": {
    "title": "Advanced Topics",
    "titleMarkdown": [
      { "type": "emphasis", "children": [{ "type": "text", "value": "Advanced" }] },
      { "type": "text", "value": " Topics" }
    ],
    "children": {}
  }
}

Complete Metadata Export

Example combining fields extracted from content with a user-defined keywords export:

Input MDX:

# Custom Title

Custom description text.

Page content here.

export const metadata = {
  keywords: ['react', 'components', 'ui'],
};


Extracted Metadata:

{
  "title": "Custom Title",
  "description": "Custom description text.",
  "keywords": ["react", "components", "ui"],
  "sections": {
    /* ... */
  }
}

Reference: Metadata Structure

The plugin extracts and generates metadata in the following structure:

interface ExtractedMetadata {
  title?: string;
  description?: string;
  descriptionMarkdown?: PhrasingContent[]; // Markdown AST nodes preserving formatting
  keywords?: string[];
  sections?: HeadingHierarchy;
  embeddings?: number[];
  image?: {
    url: string;
    alt?: string;
  };
}

type HeadingHierarchy = {
  [slug: string]: {
    title: string; // Plain text for display
    titleMarkdown: PhrasingContent[]; // Markdown AST nodes preserving formatting
    children: HeadingHierarchy;
  };
};

Dual Storage for Descriptions

Similar to section titles, the plugin preserves both plain text and formatted markdown for descriptions:

Plain text (description): Used for meta tags, search indexing, and SEO
AST nodes (descriptionMarkdown): Preserves original formatting like inline code, bold, italics, and links

This allows descriptions like "Use `transformMarkdownMetadata` to extract metadata" to render with proper formatting while still having clean text available for search engines and social media previews.

Reference: Plugin Behavior

Title Extraction Priority

Exported metadata: export const metadata = { title: 'Custom' }
First H1 heading: # Page Title
Directory name: Falls back to directory name if no title found
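
The priority order above can be sketched as a simple fallback chain. `resolveTitle` is a hypothetical helper introduced for illustration; in particular, returning the raw directory name unchanged is an assumption, since the exact formatting of the fallback is not specified here.

```typescript
function resolveTitle(
  exportedTitle: string | undefined,
  firstH1: string | undefined,
  filePath: string,
): string {
  // 1. A title in `export const metadata` wins.
  if (exportedTitle) return exportedTitle;
  // 2. Otherwise use the first H1 heading.
  if (firstH1) return firstH1;
  // 3. Otherwise fall back to the directory that contains the file,
  //    e.g. "button" for app/components/button/page.mdx.
  const segments = filePath.split('/').filter((segment) => segment.length > 0);
  return segments.length >= 2 ? segments[segments.length - 2] : '';
}

console.log(resolveTitle(undefined, undefined, 'app/components/button/page.mdx')); // "button"
```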

Description Extraction Priority

Inline meta tags: <meta name="description" content="..." /> or <Meta name="description" content="..." /> (anywhere in document)
Exported metadata: export const metadata = { description: '...' }
First paragraph: Text content of the first paragraph after the first H1
No description: Returns undefined if none found

Note: While inline meta tags have the highest priority when present, using export const metadata at the end of the file is preferred for better readability.

Keywords Extraction Priority

Inline meta tags: <meta name="keywords" content="keyword1, keyword2, keyword3" /> (anywhere in document)
Exported metadata: export const metadata = { keywords: ['...'] }
No keywords: Returns undefined if none found

Meta Tag Support

The plugin supports both <meta> and <Meta> tags anywhere in the document:

Supported meta tags:

<meta name="description" content="..." /> - Page description for SEO
<meta name="keywords" content="keyword1, keyword2, keyword3" /> - Comma-separated keywords

Features:

Case-insensitive: Both <meta> and <Meta> work
Location-flexible: Can appear anywhere in the document (beginning, middle, end, within sections)
Automatic parsing: Keywords are automatically split by commas and trimmed
Priority handling: Meta tags override other sources when present

Example:

# Component Name

## Section One

<meta name="description" content="Custom SEO description" />

Content here...

## Section Two

<meta name="keywords" content="react, component, ui, accessibility" />

More content...

Note: While this feature exists for flexibility and migration scenarios, export const metadata at the end of the file is preferred for cleaner, more maintainable documentation.
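
The keyword parsing described under Features (split on commas, then trim) can be sketched as below. `parseKeywords` is a hypothetical name, and dropping empty entries is an assumption added for robustness.

```typescript
// Turn the content attribute of a keywords meta tag into a keyword array.
function parseKeywords(content: string): string[] {
  return content
    .split(',')
    .map((keyword) => keyword.trim())
    .filter((keyword) => keyword.length > 0); // assumption: ignore empty entries
}

console.log(parseKeywords('react, component, ui, accessibility'));
// ["react", "component", "ui", "accessibility"]
```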

Heading Processing

H1 (page title): Used as the page title, not included in sections
H2-H6 (sections): Built into hierarchical section tree
Slug generation: Converts titles to URL-friendly slugs
- Lowercase conversion
- Non-alphanumeric characters replaced with hyphens
- Leading/trailing hyphens removed
- Preserves numbers
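
The slug rules above can be sketched with two regular-expression passes. `slugify` is a hypothetical name, and collapsing a run of non-alphanumeric characters into a single hyphen is an assumption that happens to agree with the slugs shown in the Advanced Examples.

```typescript
function slugify(title: string): string {
  return title
    .toLowerCase() // lowercase conversion
    .replace(/[^a-z0-9]+/g, '-') // non-alphanumeric runs become one hyphen; digits are kept
    .replace(/^-+|-+$/g, ''); // remove leading/trailing hyphens
}

console.log(slugify('parseSource()')); // "parsesource"
console.log(slugify('Required Props')); // "required-props"
```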

Formatting Preservation

The plugin maintains formatting in section titles through dual storage:

Plain text (title): Used for display, slugs, and search
AST nodes (titleMarkdown): Preserves original formatting for rendering

This allows rendering with backticks, bold, and italics while still having clean text for URLs and indexing.
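
Deriving the plain-text title from the stored AST nodes can be sketched as a recursive flatten. The `PhrasingNode` type below is a deliberately reduced stand-in for mdast's `PhrasingContent` (which has more node types, such as links), and `toPlainText` is a hypothetical helper.

```typescript
// Simplified subset of mdast phrasing nodes, enough for the examples in this page.
type PhrasingNode =
  | { type: 'text' | 'inlineCode'; value: string }
  | { type: 'strong' | 'emphasis'; children: PhrasingNode[] };

// Concatenate literal values, descending into formatting wrappers.
function toPlainText(nodes: PhrasingNode[]): string {
  return nodes
    .map((node) => ('value' in node ? node.value : toPlainText(node.children)))
    .join('');
}

const titleMarkdown: PhrasingNode[] = [
  { type: 'strong', children: [{ type: 'text', value: 'Performance' }] },
  { type: 'text', value: ' Optimization' },
];
console.log(toPlainText(titleMarkdown)); // "Performance Optimization"
```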

Error Handling

The plugin handles errors gracefully:

Missing title: Falls back to directory name
No description: Returns undefined
Invalid metadata export: Logs error and continues
File system errors: Logs warning but doesn't fail build
Malformed headings: Skips invalid headings, processes rest

Performance Considerations

Dual Storage Benefits

The plugin stores both plain text and AST nodes for section titles:

Plain text: Fast slug generation and text search (~37ms average)
AST nodes: Preserves formatting for accurate rendering
Build-time processing: All extraction happens during build, zero runtime cost

Index Update Efficiency

The plugin's incremental update strategy is particularly valuable in Next.js:

Only reprocess changed pages: When you edit one MDX file, only that file's metadata is re-extracted and merged into the index. This is crucial for Next.js performance—you don't need to reparse all sibling pages to recompute the parent index.
Fast rebuilds: Changing a single component's documentation triggers minimal work during development and production builds
Smart merging: The plugin merges new metadata into the existing index structure, preserving manual edits in the editable section
Relative paths: All links use relative paths, enabling you to move entire directory structures without breaking the index
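
The incremental strategy can be pictured as an upsert into the existing index entries: only the changed page's entry is replaced, and everything else, including manual ordering, stays as it was. This is a conceptual sketch with hypothetical names (`IndexedPage`, `upsertPage`), not the plugin's actual merge logic.

```typescript
interface IndexedPage {
  slug: string;
  title: string;
  description?: string;
}

function upsertPage(existing: IndexedPage[], updated: IndexedPage): IndexedPage[] {
  const position = existing.findIndex((page) => page.slug === updated.slug);
  // Unknown page: append it to the end of the list.
  if (position === -1) return [...existing, updated];
  // Known page: replace it in place so the existing order is preserved.
  return existing.map((page, index) => (index === position ? updated : page));
}

const index = [
  { slug: 'checkbox', title: 'Checkbox' },
  { slug: 'button', title: 'Button' },
];
const next = upsertPage(index, { slug: 'button', title: 'Button Component' });
console.log(next.map((page) => page.title)); // ["Checkbox", "Button Component"]
```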

Related

createSitemap - Uses extracted metadata to build searchable sitemaps
syncPageIndex - Updates parent directory indexes
docs-infra validate - CLI command that validates index files
transformMarkdownCode - Transform code blocks in documentation
withDocsInfra - Next.js plugin with all docs features
