
Runnable Markdown: Revolutionizing Documentation Testing with Hugging Face

8 min read · Hugging Face

Documentation serves as the critical bridge between developers and their tools, but its reliability is often undermined by a pervasive issue: documentation drift. As software evolves, code examples in documentation can silently break, leading to frustration, lost time, and an erosion of trust. Hugging Face, a leader in AI innovation, is addressing this challenge head-on with its doc-builder project, introducing runnable Markdown blocks that ensure documentation examples are not just illustrative but rigorously tested. This approach redefines executable documentation, merging the clarity of good docs with the robustness of continuous testing.

The Challenge: Bridging Documentation and Code Integrity

The core philosophy behind runnable documentation isn't new. For decades, the Python community has advocated for examples in documentation that users can copy, paste, and expect to run flawlessly. However, maintaining this ideal across large, rapidly evolving projects like Hugging Face's Transformers library is a monumental task. Manual verification is impractical, and traditional methods often force a compromise between clear documentation and effective testing.

The problem arises from the fundamental differences in requirements:

  • Documentation examples prioritize brevity, readability, and a focus on teaching. They aim to be free of "noise."
  • Tests demand assertions, setup/teardown, fixtures, mocking, and debugging capabilities. They prioritize robustness and coverage.

When these two concerns are forced into the same format, one often suffers. Hugging Face's doc-builder aims to resolve this tension by allowing documentation to remain pristine while its underlying examples are rigorously validated, ensuring that every snippet users encounter is a verifiable truth, not just an aspiration. This is crucial for maintaining credibility and accelerating developer adoption in the fast-paced world of AI.
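To make that tension concrete, here is the same behavior written both ways, using a hypothetical `add` function (not from the Hugging Face codebase): once as a documentation example and once as a test.

```python
# A hypothetical helper shared by both versions below.
def add(a: int, b: int) -> int:
    return a + b

# Documentation style: brief, readable, free of test "noise".
print(add(2, 3))

# Test style: the same behavior, but with assertions and edge cases
# that would clutter a tutorial if shown inline.
def test_add():
    assert add(2, 3) == 5   # happy path
    assert add(-1, 1) == 0  # sign handling
    assert add(0, 0) == 0   # identity element

test_add()
```

Neither version is wrong; they simply serve different readers, which is exactly why forcing them into one format tends to degrade one or the other.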

The Legacy of doctest: Early Innovations and Evolving Needs

The concept of executable documentation gained early traction in Python with the introduction of doctest in Python 2.1 (2001). Crafted by Tim Peters, doctest was an elegant solution: it parsed documentation examples formatted like interactive interpreter sessions (e.g. `>>> add(2, 3)` followed by the expected output `5`) and verified that running the code reproduced that output. This innovation turned documentation examples into automatic regression tests, a significant leap forward for code quality.

doctest was particularly well-suited for Python, a language that encouraged interactive exploration. For small projects and straightforward APIs, it worked exceptionally well, providing a simple yet powerful mechanism to ensure basic examples remained functional. It embodied the spirit of "show, don't just tell" in software development, making documentation an active part of the testing suite.
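A minimal doctest example looks like this (the `add` function is illustrative, not from any particular library). `doctest.testmod()` normally scans a whole module for such docstrings; here one docstring is run directly, which is what testmod does under the hood:

```python
import doctest

def add(a, b):
    """Return the sum of two numbers.

    >>> add(2, 3)
    5
    >>> add(-1, 1)
    0
    """
    return a + b

# doctest.testmod() would scan a whole module; here we parse and run
# one docstring directly, which is what testmod does internally.
test = doctest.DocTestParser().get_doctest(
    add.__doc__, {"add": add}, name="add", filename=None, lineno=0
)
runner = doctest.DocTestRunner()
runner.run(test)
print(f"attempted={runner.tries}, failed={runner.failures}")
```

If the function's behavior drifts away from its documented examples, the run reports a failure, turning the docstring into a regression test.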

Hugging Face's Modern Solution: Runnable Markdown Blocks

Recognizing the limitations of older approaches for large-scale, complex projects, Hugging Face's doc-builder project introduces a sophisticated take on runnable documentation. Instead of embedding tests within documentation syntax, it treats documentation snippets as ordinary Python code residing within Markdown. This effectively turns Markdown into a thin test container, decoupling the presentation from the testing methodology.

A runnable block in Markdown looks like this:

```py runnable:quickstart
from transformers import pipeline

pipe = pipeline("sentiment-analysis")
result = pipe("I love runnable docs!")

if not result:  # doc-builder: hide
    raise ValueError("pipeline returned no result")

print(result[0]["label"])
assert result[0]["score"] > 0.5  # doc-builder: ignore-bare-assert
```

When rendered, this block appears as a standard code example. During testing, however, it's executed as normal Python code. This dual nature ensures that documentation remains clean for readers while providing robust, testable examples for developers. This approach is particularly impactful for intricate domains like AI, where examples often involve complex model loading and inference steps.

Seamless Integration with pytest and Advanced Features

A key differentiator of Hugging Face's approach is its seamless integration with modern testing frameworks, particularly pytest. With hf-doc-builder installed, pytest can automatically discover and execute runnable blocks within Markdown files, treating each block as a standard test item. This means documentation examples can fully participate in a project's existing test infrastructure, leveraging pytest's powerful features like assertions, fixtures, debugging tools, and comprehensive reporting.
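The plugin's internals aren't documented here, but the collection step can be sketched in a few lines: scan a Markdown page for `runnable:`-tagged fences and execute each one as a test item. This is an illustrative sketch under stated assumptions, not doc-builder's actual implementation; `collect_runnable_blocks` and `run_block` are invented names.

```python
import re
import textwrap

FENCE = "`" * 3  # a literal triple backtick, built programmatically

# Match fenced blocks tagged `py runnable:<name>`, capturing name and body.
BLOCK_RE = re.compile(
    FENCE + r"py runnable:(?P<name>[\w:]+)\n(?P<body>.*?)" + FENCE,
    re.DOTALL,
)

def collect_runnable_blocks(markdown: str):
    """Yield (name, source) pairs for every runnable block in a page."""
    for match in BLOCK_RE.finditer(markdown):
        yield match.group("name"), textwrap.dedent(match.group("body"))

def run_block(name: str, source: str) -> None:
    """Execute one block as a test item; any exception is a failure."""
    exec(compile(source, f"<runnable:{name}>", "exec"), {"__name__": name})

# A tiny in-memory "documentation page" to collect from.
page = f"""# Quickstart

{FENCE}py runnable:demo
x = 2 + 3
assert x == 5
{FENCE}
"""

for name, source in collect_runnable_blocks(page):
    run_block(name, source)
    print(f"{name}: ok")
```

In the real plugin, each discovered block would surface as a pytest item, so a failing snippet shows up in test reports with a normal traceback rather than a bespoke error format.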

The Evolution of Executable Documentation: doctest vs. doc-builder

| Feature | doctest (Traditional) | doc-builder (Modern Runnable Markdown) |
| --- | --- | --- |
| Testing approach | Embeds tests as interpreter sessions in docs | Treats doc snippets as normal Python code for testing |
| Integration | Standard library module | pytest plugin for seamless integration |
| Test syntax | `>>>` prompts, expected output matching | Standard Python code, pytest assertions |
| Flexibility | Limited, brittle output matching | High; supports complex tests, decorators, debugging |
| Documentation cleanliness | Can clutter docs with test mechanics | Preserves clean docs with hidden directives |
| Debugging | String comparison, less direct inspection | Standard Python debugging, full tracebacks |
| Setup/teardown | Can add noise to examples | Manages context effectively with continuation blocks |
| Source of truth | Documentation format and embedded tests | Markdown source, tested via standard Python execution |

The doc-builder also introduces continuation blocks, a crucial feature for multi-step tutorials. These allow authors to split an example across multiple visible snippets, like runnable:test_basic followed by runnable:test_basic:2. Crucially, these blocks share the same execution context during tests, enabling a natural instructional flow without forcing all code into one long block. This flexibility is vital for guiding users through complex AI model usage or data processing pipelines.
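The shared-context behavior can be modeled in a few lines; this is a sketch of the idea, not doc-builder internals. Each continuation block executes against one persistent namespace, so names defined in the first snippet are visible to the second:

```python
from collections import OrderedDict

# Two snippets that would render as separate blocks
# (`runnable:test_basic` and `runnable:test_basic:2`) but must
# execute in one shared namespace during testing.
snippets = OrderedDict([
    ("test_basic",
     "from collections import Counter\n"
     "counts = Counter('runnable')"),
    ("test_basic:2",
     "assert counts['n'] == 2\n"
     "most_common = counts.most_common(1)"),
])

# One globals dict for the whole sequence: later continuation blocks
# see every name defined by earlier ones.
context: dict = {}
for name, source in snippets.items():
    exec(compile(source, f"<runnable:{name}>", "exec"), context)

print(context["most_common"])  # the second block built on the first
```

Because the second snippet reuses `counts` from the first, the tutorial can interleave prose between the two blocks while the test still runs them as one sequence.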

For instance, an AI agent development workflow could involve several steps: defining the agent's tools, initializing the agent, and then running a query. Continuation blocks allow each of these steps to be presented clearly in separate documentation sections while being executed as a single, cohesive test sequence.

Maintaining Clean Documentation While Ensuring Robust Testing

One of doc-builder's most elegant solutions is its ability to keep the rendered documentation clean, even when the source Markdown contains test-specific directives. Developers can embed comments like # doc-builder: hide for executable lines that shouldn't appear in the documentation, or # doc-builder: ignore-bare-assert for assertions that are part of the test but whose comment shouldn't be rendered. Similarly, pytest decorators (# pytest-decorator: ...) are stripped during rendering.

This ensures that the documentation remains focused on teaching and clarity, without being cluttered by testing boilerplate. The user sees only the relevant code, while the underlying system guarantees its functionality. This balance is critical for developer tools documentation, where both aesthetic appeal and absolute correctness are paramount.
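Rendering-time stripping can be approximated with a few regular expressions. The sketch below is an assumption about how such a filter might work, not doc-builder's real renderer, and `render_snippet` is an invented name: it drops `hide` lines and `pytest-decorator` lines and removes trailing directive comments.

```python
import re

# Directives that mark a line (or a trailing comment) as test-only.
HIDE_RE = re.compile(r".*#\s*doc-builder:\s*hide\s*$")
DECORATOR_RE = re.compile(r"\s*#\s*pytest-decorator:")
COMMENT_RE = re.compile(r"\s*#\s*doc-builder:\s*ignore-bare-assert\s*$")

def render_snippet(source: str) -> str:
    """Return the user-facing version of a runnable block."""
    rendered = []
    for line in source.splitlines():
        if HIDE_RE.match(line) or DECORATOR_RE.match(line):
            continue  # drop whole test-only lines
        rendered.append(COMMENT_RE.sub("", line))  # strip trailing directives
    return "\n".join(rendered)

# This source is only string-processed here, never executed,
# so `pipe` does not need to be defined.
source = """# pytest-decorator: transformers.testing_utils.slow
result = pipe("I love runnable docs!")
if not result:  # doc-builder: hide
    raise ValueError("no result")  # doc-builder: hide
assert result[0]["score"] > 0.5  # doc-builder: ignore-bare-assert
"""
print(render_snippet(source))
```

The rendered output keeps only the two lines a reader needs, while the full source, directives included, is what the test suite executes.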

Impact on Large-Scale AI Projects and Beyond

For massive repositories like Hugging Face's Transformers, with hundreds of documentation pages and thousands of examples, this feature is transformative. It automates the prevention of documentation drift, a problem that would otherwise require immense manual effort or lead to a constant stream of broken examples. Runnable documentation helps keep the documentation and codebase in sync, maintaining trustworthiness at a scale where manual review is simply unfeasible. This aligns with broader efforts in the AI community to rigorously evaluate AI agents for production and ensure reliability.

By bringing executable documentation into the modern era of pytest and sophisticated CI/CD pipelines, Hugging Face demonstrates a powerful commitment to developer experience and code quality. The goal remains the same as it was over two decades ago: documentation examples should work. But now, they not only illustrate how code should work but continuously prove that it does, fostering a more reliable and trustworthy ecosystem for AI development.

Frequently Asked Questions

What is the core problem Hugging Face's runnable Markdown addresses?
Hugging Face's runnable Markdown addresses the pervasive problem of 'documentation drift,' where code examples in documentation become outdated and silently break as libraries and APIs evolve. This leads to user frustration and diminishes the credibility of the documentation. By making documentation examples runnable and testable, the doc-builder ensures that these snippets are continuously validated against the codebase, guaranteeing that they always work as advertised. This proactive approach prevents broken examples, enhances user trust, and improves the overall developer experience by providing reliable resources.
How does runnable Markdown differ from Python's traditional `doctest` module?
While both `doctest` and runnable Markdown aim for executable documentation, they differ significantly in their approach. `doctest` embeds tests directly into documentation syntax, requiring examples to mirror interactive interpreter sessions with expected output. This often leads to documentation being cluttered with test mechanics. Hugging Face's runnable Markdown, in contrast, treats documentation snippets as normal Python code living within Markdown files. It integrates seamlessly with modern testing frameworks like `pytest`, allowing for complex assertions, debugging, and standard test infrastructure. This separation of concerns ensures documentation remains clean and readable, while testing remains powerful and flexible, avoiding the limitations of `doctest`'s brittle output matching and verbose setup/teardown.
What are 'continuation blocks' in Hugging Face's `doc-builder`?
Continuation blocks are a powerful feature in Hugging Face's `doc-builder` that allow authors to split complex code examples or tutorials across multiple visible Markdown snippets while maintaining a shared execution context during testing. This means that a setup defined in one runnable block can be reused and built upon in a subsequent block, without forcing the documentation to present everything as one long, monolithic code fence. For example, `runnable:test_basic` can define initial variables, and `runnable:test_basic:2` can then use those variables. This enhances readability and instructional flow in documentation, making it easier to present multi-step processes without sacrificing the integrity of the underlying testable code.
How does `doc-builder` integrate with existing testing frameworks like `pytest`?
Hugging Face's `doc-builder` integrates natively with `pytest`, transforming runnable Markdown blocks into standard `pytest` test items. With `hf-doc-builder` installed, `pytest` automatically discovers and executes these blocks within Markdown files. This integration means that documentation examples can leverage the full power of `pytest`, including its assertion mechanisms, fixtures, decorators, and debugging tools. Failures appear as normal test failures with comprehensive tracebacks, allowing developers to debug effectively. This approach avoids the need for a special-purpose testing mini-language, embedding documentation tests directly into the project's existing, robust test infrastructure.
How does `doc-builder` ensure documentation remains clean despite embedded test logic?
A key design principle of `doc-builder` is to prevent test mechanics from polluting the user-facing documentation. Authors can embed test-only directives, such as `# pytest-decorator: transformers.testing_utils.slow` or `# doc-builder: hide` for lines that should be executable but not displayed, directly within the Markdown source. When the documentation is rendered, `doc-builder` intelligently strips these directives and comments, presenting a clean, readable code snippet to the user. This allows developers to write comprehensive tests alongside their examples without compromising the clarity and brevity expected of good documentation, maintaining a clear separation between the source code for testing and the rendered content for users.
What are the benefits of runnable documentation for large AI projects like Hugging Face Transformers?
For large AI projects such as Hugging Face Transformers, which involve extensive documentation and thousands of code examples, runnable documentation offers immense benefits. It drastically reduces 'documentation drift' by continuously validating examples against the evolving codebase, ensuring they remain accurate and functional. This prevents user frustration caused by broken examples and builds trust in the documentation's reliability. By integrating with `pytest`, it allows these projects to manage documentation tests within their existing CI/CD pipelines, making manual review of examples unnecessary at scale. This automated validation is crucial for maintaining the quality and usability of documentation in rapidly developing and complex software ecosystems.
Can runnable Markdown be adopted by other projects outside of Hugging Face?
Yes, the principles and mechanisms behind Hugging Face's runnable Markdown, particularly its integration with standard Python testing tools like `pytest` and its focus on separating testing logic from displayed documentation, are highly applicable and beneficial for any software project. While the `doc-builder` itself is specific to Hugging Face, the underlying ideas represent a best practice in developer tools. Other projects can implement similar systems using existing tools or adapt `doc-builder`'s concepts to ensure their documentation examples are continuously tested and reliable. This approach is a general solution to a common problem across the software development landscape, making documentation more robust and trustworthy.
