# Runnable Markdown: Revolutionizing Documentation Testing with Hugging Face


When rendered, this block is displayed as a standard code example. During testing, however, it runs as ordinary Python code. This dual nature keeps the documentation clean for readers while giving developers reliable, testable examples. The approach matters especially in complex areas such as AI, where examples often involve intricate model loading and inference steps.

## Seamless `pytest` integration and advanced features

A key distinction of the Hugging Face approach is its seamless integration with modern testing frameworks, `pytest` in particular. With `hf-doc-builder` installed, `pytest` can automatically discover and execute runnable blocks in Markdown files, treating each block as a standard test item. Documentation examples can therefore take full part in a project's existing test infrastructure and use `pytest` features such as assertions, fixtures, debugging tools, and detailed reports.
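To make the idea of "a Markdown block becomes a test item" concrete, here is a minimal sketch using only the standard library. It is not `hf-doc-builder`'s implementation: the `py runnable:<name>` info string, the sample Markdown, and the helper names `collect_runnable_blocks` and `run_block` are assumptions made for illustration.

```python
import re

# Built programmatically so this example can itself sit inside a code fence.
FENCE = "`" * 3

# Hypothetical Markdown source; the "py runnable:<name>" info string is an assumption.
MARKDOWN = f"""
# Quickstart

{FENCE}py runnable:test_addition
total = 1 + 2
assert total == 3
{FENCE}

{FENCE}py
print("plain block, not collected")
{FENCE}
"""

# Matches only fences labelled as runnable and captures their label and body.
BLOCK_RE = re.compile(
    rf"^{FENCE}py runnable:(?P<name>\S+)\n(?P<body>.*?)^{FENCE}\s*$",
    re.DOTALL | re.MULTILINE,
)


def collect_runnable_blocks(markdown: str) -> dict[str, str]:
    """Map each runnable block's label to its source code."""
    return {m["name"]: m["body"] for m in BLOCK_RE.finditer(markdown)}


def run_block(source: str, name: str) -> dict:
    """Execute a block like a test body; a failing assert raises AssertionError."""
    namespace: dict = {}
    exec(compile(source, f"<markdown:{name}>", "exec"), namespace)
    return namespace


blocks = collect_runnable_blocks(MARKDOWN)
results = {name: run_block(src, name) for name, src in blocks.items()}
```

In a real plugin, discovery would go through `pytest`'s collection hooks (for example `pytest_collect_file`) rather than a hand-rolled loop, which is what lets failures show up as ordinary test failures with tracebacks.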

### The evolution of runnable documentation: `doctest` vs. `doc-builder`

| Feature                       | `doctest` (traditional)                           | `doc-builder` (modern runnable Markdown)                      |
| :---------------------------- | :------------------------------------------------ | :------------------------------------------------------------ |
| **Testing approach**          | Embeds tests in the docs as interpreter sessions  | Treats documentation snippets as ordinary Python code to test |
| **Integration**               | Standard library module                           | `pytest` plugin for seamless integration                      |
| **Test syntax**               | `>>>` prompts, expected-output matching           | Standard Python code, `pytest` assertions                     |
| **Flexibility**               | Limited; brittle output matching                  | High; supports complex tests, decorators, debugging           |
| **Documentation cleanliness** | Can clutter docs with test mechanics              | Keeps docs clean via hidden directives                        |
| **Debugging**                 | String comparison, less direct inspection         | Standard Python debugging, full tracebacks                    |
| **Setup/teardown**            | Can add noise to examples                         | Manages context effectively with continuation blocks          |
| **Source of truth**           | Documentation format with embedded tests          | Source Markdown, tested through standard Python execution     |
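The "brittle output matching" row can be illustrated with the standard library alone. The sketch below runs two interpreter-style sessions through `doctest`: both call the same function with the same argument, but the second writes the expected value with double quotes and fails, because `doctest` compares printed text rather than values. The `normalize` function and the `run_doctest` helper are hypothetical, introduced only for this comparison.

```python
import doctest


def normalize(text: str) -> str:
    """Collapse runs of whitespace into single spaces."""
    return " ".join(text.split())


def run_doctest(session: str, name: str) -> doctest.TestResults:
    """Run one interpreter-style session and return its (failed, attempted) counts."""
    test = doctest.DocTestParser().get_doctest(
        session, {"normalize": normalize}, name, None, 0
    )
    runner = doctest.DocTestRunner(verbose=False)
    # Discard the textual failure report; only the counts are needed here.
    return runner.run(test, out=lambda text: None)


# Expected output matches the repr exactly, so this session passes.
PASSING = """
>>> normalize("  runnable   docs ")
'runnable docs'
"""

# Same value, but written with double quotes: the text comparison fails.
BRITTLE = '''
>>> normalize("  runnable   docs ")
"runnable docs"
'''

passing = run_doctest(PASSING, "passing")
brittle = run_doctest(BRITTLE, "brittle")

# Plain-Python style: the check is on the value, not on its printed form.
assert normalize("  runnable   docs ") == "runnable docs"
```

The final `assert` is the style the right-hand column describes: ordinary Python that a test runner can execute and report on directly.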

`doc-builder` also introduces **continuation blocks**, an important feature for multi-step tutorials. They let authors split an example into several visible snippets, such as `runnable:test_basic` followed by `runnable:test_basic:2`. Crucially, these blocks share the same execution context during tests, so the tutorial flows naturally without forcing all the code into one long block. This flexibility is vital for guiding users through complex AI models or data-processing pipelines.

For example, a workflow for building an AI agent might involve several steps: defining the agent's tools, initializing the agent, and then running a query. Continuation blocks let each step be presented clearly in its own section of the documentation while all of them run as a single, coherent test sequence, much like the advanced agentic workflows described in [Operationalizing Agentic AI: Part 1](/ru/operationalizing-agentic-ai-part-1-a-stakeholders-guide).
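The shared execution context can be sketched as follows. This is an illustration under stated assumptions, not `doc-builder`'s code: it assumes the blocks have already been extracted as `(label, source)` pairs and that everything after the first `:` in a label marks a continuation of the base label. The block contents and the `run_with_shared_context` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical (label, source) pairs, as if already extracted from Markdown fences.
BLOCKS = [
    ("test_basic", "tools = ['search', 'calculator']"),
    ("test_basic:2", "agent = {'tools': tools}\nassert agent['tools'][0] == 'search'"),
    ("test_other", "value = 42"),
]


def run_with_shared_context(blocks):
    """Execute blocks in order; blocks with the same base label share one namespace."""
    namespaces = defaultdict(dict)
    for label, source in blocks:
        base = label.split(":", 1)[0]  # "test_basic:2" continues "test_basic"
        exec(source, namespaces[base])
    return namespaces


contexts = run_with_shared_context(BLOCKS)
```

Here the second block can use `tools` because it runs in the namespace the first block populated, while `test_other` starts from an empty namespace and sees neither name.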

## Keeping documentation clean while ensuring robust testing

One of `doc-builder`'s most elegant solutions is its ability to keep rendered documentation clean even when the source Markdown contains test-specific directives. Developers can embed comments such as `# doc-builder: hide` for executable lines that should not appear in the documentation, or `# doc-builder: ignore-bare-assert` for assertions that are part of the test but whose comment should not be displayed. Similarly, `pytest` decorators (`# pytest-decorator: ...`) are removed during rendering.
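A rough sketch of such a rendering pass is shown below. The directive strings come from the text above, but their exact placement is an assumption: the sketch treats `# doc-builder: hide` and `# doc-builder: ignore-bare-assert` as trailing comments and `# pytest-decorator:` as a comment on its own line. The `render_source` helper and the sample snippet are hypothetical and do not reflect `doc-builder`'s actual implementation.

```python
HIDE = "# doc-builder: hide"
IGNORE_BARE_ASSERT = "# doc-builder: ignore-bare-assert"
DECORATOR_PREFIX = "# pytest-decorator:"


def render_source(source: str) -> str:
    """Return the reader-facing version of a runnable block."""
    rendered = []
    for line in source.splitlines():
        stripped = line.strip()
        if stripped.startswith(DECORATOR_PREFIX):
            continue  # test-only decorator comment, never shown
        if stripped.endswith(HIDE):
            continue  # line runs in tests but is hidden from readers
        if stripped.endswith(IGNORE_BARE_ASSERT):
            # Keep the assertion, drop only the directive comment.
            line = line[: line.rindex(IGNORE_BARE_ASSERT)].rstrip()
        rendered.append(line)
    return "\n".join(rendered)


# Hypothetical block source as it might appear inside a Markdown fence.
SOURCE = """\
# pytest-decorator: transformers.testing_utils.slow
import math  # doc-builder: hide
radius = 2
area = math.pi * radius ** 2
assert area > 12  # doc-builder: ignore-bare-assert"""

rendered = render_source(SOURCE)

# The unrendered source is still valid Python, because directives are comments.
test_namespace: dict = {}
exec(SOURCE, test_namespace)
```

Because every directive is an ordinary Python comment, the same source serves both purposes: the test run executes it unchanged, and the renderer only has to filter lines.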

This keeps the documentation focused on teaching and clarity rather than cluttered with test boilerplate. The reader sees only the relevant code, while the underlying system verifies that it works. This balance is critical for developer-tool documentation, where both presentation and correctness are paramount.

## Impact on large-scale AI projects and beyond

For huge repositories such as Hugging Face Transformers, with hundreds of documentation pages and thousands of examples, this feature is transformative. It automates the prevention of documentation drift, a problem that would otherwise require enormous manual effort or result in a steady stream of broken examples. Runnable documentation helps keep the docs and the codebase in sync, preserving trust at a scale where manual checking is simply not feasible. This aligns with broader efforts in the AI community to rigorously [evaluate AI agents for production](/ru/evaluating-ai-agents-for-production-a-practical-guide-to-strands-evals) and ensure reliability.

By bringing runnable documentation into the modern era of `pytest` and sophisticated CI/CD pipelines, Hugging Face demonstrates a strong commitment to developer experience and code quality. The goal is the same as it was more than two decades ago: documentation examples should work. Now, however, they not only illustrate how code *should* work but also continuously *prove* that it does, fostering a more reliable and trustworthy ecosystem for AI development.

## Original source

https://huggingface.co/blog/huggingface/runnable-examples

## Frequently asked questions

### What is the core problem Hugging Face's runnable Markdown addresses?

Hugging Face's runnable Markdown addresses 'documentation drift': code examples in documentation go stale and silently break as libraries and APIs evolve, which frustrates users and erodes the documentation's credibility. By making examples runnable and testable, `doc-builder` continuously validates these snippets against the codebase, so a broken example fails a test instead of reaching readers. This builds user trust and improves the developer experience by keeping the examples reliable.

### How does runnable Markdown differ from Python's traditional `doctest` module?

While both `doctest` and runnable Markdown aim for executable documentation, they differ significantly in their approach. `doctest` embeds tests directly into documentation syntax, requiring examples to mirror interactive interpreter sessions with expected output. This often leads to documentation being cluttered with test mechanics. Hugging Face's runnable Markdown, in contrast, treats documentation snippets as normal Python code living within Markdown files. It integrates seamlessly with modern testing frameworks like `pytest`, allowing for complex assertions, debugging, and standard test infrastructure. This separation of concerns ensures documentation remains clean and readable, while testing remains powerful and flexible, avoiding the limitations of `doctest`'s brittle output matching and verbose setup/teardown.

### What are 'continuation blocks' in Hugging Face's `doc-builder`?

Continuation blocks are a powerful feature in Hugging Face's `doc-builder` that allow authors to split complex code examples or tutorials across multiple visible Markdown snippets while maintaining a shared execution context during testing. This means that a setup defined in one runnable block can be reused and built upon in a subsequent block, without forcing the documentation to present everything as one long, monolithic code fence. For example, `runnable:test_basic` can define initial variables, and `runnable:test_basic:2` can then use those variables. This enhances readability and instructional flow in documentation, making it easier to present multi-step processes without sacrificing the integrity of the underlying testable code.

### How does `doc-builder` integrate with existing testing frameworks like `pytest`?

Hugging Face's `doc-builder` integrates natively with `pytest`, transforming runnable Markdown blocks into standard `pytest` test items. With `hf-doc-builder` installed, `pytest` automatically discovers and executes these blocks within Markdown files. This integration means that documentation examples can leverage the full power of `pytest`, including its assertion mechanisms, fixtures, decorators, and debugging tools. Failures appear as normal test failures with comprehensive tracebacks, allowing developers to debug effectively. This approach avoids the need for a special-purpose testing mini-language, embedding documentation tests directly into the project's existing, robust test infrastructure.

### How does `doc-builder` ensure documentation remains clean despite embedded test logic?

A key design principle of `doc-builder` is to keep test mechanics out of the user-facing documentation. Authors can embed test-only directives directly in the Markdown source, such as `# pytest-decorator: transformers.testing_utils.slow`, or `# doc-builder: hide` for lines that should be executed but not displayed. When the documentation is rendered, `doc-builder` strips these directives and comments and presents a clean, readable code snippet. Developers can therefore write thorough tests alongside their examples without sacrificing the clarity and brevity expected of good documentation, and the source used for testing stays clearly separated from the content rendered for users.

### What are the benefits of runnable documentation for large AI projects like Hugging Face Transformers?

For large AI projects such as Hugging Face Transformers, which involve extensive documentation and thousands of code examples, runnable documentation offers immense benefits. It drastically reduces 'documentation drift' by continuously validating examples against the evolving codebase, ensuring they remain accurate and functional. This prevents user frustration caused by broken examples and builds trust in the documentation's reliability. By integrating with `pytest`, it allows these projects to manage documentation tests within their existing CI/CD pipelines, making manual review of examples unnecessary at scale. This automated validation is crucial for maintaining the quality and usability of documentation in rapidly developing and complex software ecosystems.

### Can runnable Markdown be adopted by other projects outside of Hugging Face?

Yes, the principles and mechanisms behind Hugging Face's runnable Markdown, particularly its integration with standard Python testing tools like `pytest` and its focus on separating testing logic from displayed documentation, are highly applicable and beneficial for any software project. While the `doc-builder` itself is specific to Hugging Face, the underlying ideas represent a best practice in developer tools. Other projects can implement similar systems using existing tools or adapt `doc-builder`'s concepts to ensure their documentation examples are continuously tested and reliable. This approach is a general solution to a common problem across the software development landscape, making documentation more robust and trustworthy.
