Runnable Markdown: Revolutionizing Documentation Testing with Hugging Face

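Documentation is the bridge between developers and their tools, but its reliability is often undermined by a pervasive problem: **documentation drift**. As software evolves, code examples in the docs can silently stop working, leading to frustration, wasted time, and eroded trust. Hugging Face, a leader in AI innovation, is tackling this problem head-on through its doc-builder project, which introduces runnable Markdown blocks so that documentation examples are not merely illustrative but rigorously tested. This modern approach redefines how executable documentation is handled, combining the clarity of good documentation with the robustness of continuous testing.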

The Challenge: Connecting Documentation and Code Integrity

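The core philosophy of executable documentation is not new. For decades, the Python community has championed documentation examples that users can copy, paste, and expect to run flawlessly. Maintaining that ideal across a large, fast-moving project such as Hugging Face's Transformers library, however, is a formidable task. Manual verification is impractical, and traditional methods often force a compromise between clear documentation and effective testing.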

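The problem stems from a fundamental difference in requirements:

Documentation examples prioritize brevity, readability, and teaching. They aim to be free of 'noise'.
Tests require assertions, setup/teardown, fixtures, mocking, and debugging support. They prioritize robustness and coverage.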

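When these two concerns are forced into the same format, one of them usually suffers. Hugging Face's doc-builder aims to resolve this tension by letting the documentation stay clean while the underlying examples are rigorously verified. Every snippet a user encounters is then a verified example rather than an aspiration, which is important for maintaining credibility and accelerating developer adoption in the fast-moving world of AI.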

The Legacy of `doctest`: Early Innovation and Evolving Needs

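Executable documentation gained early traction in Python with the introduction of doctest in Python 2.1 (2001). Created by Tim Peters, doctest was an elegant solution: it parsed documentation examples formatted like interactive Python interpreter sessions (for example, `>>> add(2, 3)` followed by the expected output `5`) and checked that the actual output matched. This innovation turned documentation examples into automated regression tests, a significant step forward for code quality.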

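doctest was a particularly good fit for Python, a language that encourages interactive exploration. For small projects and simple APIs it worked very well, providing a simple yet powerful mechanism for keeping basic examples functional. It embodied the "show, don't just tell" ethos in software development, making documentation an active part of the test suite.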
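To make the doctest mechanism concrete, here is a minimal sketch using only the standard library. The `add` function is a hypothetical example, not something from any Hugging Face codebase; the interpreter-style examples in its docstring are collected and replayed, and the actual output is compared with the expected output.

```python
import doctest


def add(a, b):
    """Return the sum of a and b.

    >>> add(2, 3)
    5
    >>> add(-1, 1)
    0
    """
    return a + b


# Collect the interpreter-style examples from the docstring and replay them.
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(add, name="add", globs={"add": add}):
    runner.run(test)

# summarize() returns a (failed, attempted) named tuple.
results = runner.summarize(verbose=False)
print(f"attempted={results.attempted} failed={results.failed}")
```

Because the check is a comparison of printed output, any change in formatting of the result counts as a failure, which is the brittleness the article refers to later.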

Hugging Face's Modern Solution: Runnable Markdown Blocks

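Recognizing the limits of older approaches for large, complex projects, Hugging Face's doc-builder project introduces a more sophisticated take on executable documentation. Instead of embedding tests inside documentation syntax, it treats documentation snippets as ordinary Python code that happens to live in Markdown. This effectively turns Markdown into a thin test container and decouples presentation from testing methodology.

A runnable block in Markdown looks like this: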

```py runnable:quickstart
from transformers import pipeline

pipe = pipeline("sentiment-analysis")
result = pipe("I love runnable docs!")

if not result:  # doc-builder: hide
    raise ValueError("pipeline returned no result")

print(result[0]["label"])
assert result[0]["score"] > 0.5  # doc-builder: ignore-bare-assert
```

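When rendered, this block appears as a standard code example. During testing, however, it is executed as ordinary Python code. This dual nature keeps the documentation clean for readers while giving developers robust, testable examples. The approach is particularly effective in complex domains like AI, which involve intricate model loading and inference steps.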
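The idea of Markdown as a thin test container can be sketched in a few lines. This is not doc-builder's implementation: the regular expression, `find_runnable_blocks`, and `run_block` are hypothetical names, and the only thing assumed from the article is that a runnable fence opens with `py runnable:<label>`.

```python
import re

# Built programmatically so this example can itself sit inside a code fence.
FENCE = "`" * 3

# Assumption: a runnable fence opens with "py runnable:<label>" (or "python ...").
RUNNABLE_RE = re.compile(
    rf"^{FENCE}(?:py|python)[ \t]+runnable:(?P<label>\S+)[ \t]*\n"
    rf"(?P<code>.*?)"
    rf"^{FENCE}[ \t]*$",
    re.DOTALL | re.MULTILINE,
)


def find_runnable_blocks(markdown):
    """Return (label, source) pairs for every runnable fence, in document order."""
    return [(m.group("label"), m.group("code")) for m in RUNNABLE_RE.finditer(markdown)]


def run_block(label, source):
    """Execute one block as ordinary Python and return its namespace."""
    namespace = {"__name__": f"runnable_{label}"}
    exec(compile(source, f"<runnable:{label}>", "exec"), namespace)
    return namespace


SAMPLE = "\n".join([
    "Some prose before the example.",
    "",
    f"{FENCE}py runnable:quickstart",
    "total = sum([1, 2, 3])",
    "assert total == 6",
    FENCE,
    "",
    f"{FENCE}py",
    "this_plain_fence_is_never_executed()",
    FENCE,
])

blocks = find_runnable_blocks(SAMPLE)
namespace = run_block(*blocks[0])
print([label for label, _ in blocks], namespace["total"])
```

Only the fence tagged `runnable:` is picked up; the plain `py` fence is left alone, so ordinary illustrative snippets and tested snippets can coexist on one page.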

Seamless Integration with `pytest` and Advanced Features

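A key differentiator of Hugging Face's approach is its seamless integration with modern test frameworks such as pytest. Once hf-doc-builder is installed, pytest automatically discovers and runs the runnable blocks inside Markdown files, treating each block as a standard test item. Documentation examples can therefore take full part in a project's existing test infrastructure, using pytest's assertions, fixtures, debugging tools, and comprehensive reporting.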
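What "each block becomes a test item" means can be illustrated without pytest. The sketch below deliberately swaps in the standard library's `unittest` as a stand-in, so it is not how the doc-builder plugin works internally; `BLOCKS` and `make_test` are hypothetical, and the block sources are assumed to have been extracted from Markdown already.

```python
import io
import unittest

# Hypothetical (label, source) pairs, as if already extracted from Markdown.
BLOCKS = [
    ("quickstart", "result = [1, 2, 3]\nassert sum(result) == 6\n"),
    ("drifted", "value = len('abc')\nassert value == 4, 'API changed'\n"),
]


def make_test(label, source):
    """Wrap one block's source in a test case named after its label."""
    def run():
        exec(compile(source, f"<runnable:{label}>", "exec"), {"__name__": label})
    return unittest.FunctionTestCase(run, description=f"runnable:{label}")


suite = unittest.TestSuite(make_test(label, source) for label, source in BLOCKS)

# Capture the report in memory instead of writing it to stderr.
stream = io.StringIO()
outcome = unittest.TextTestRunner(stream=stream, verbosity=0).run(suite)
print(f"ran={outcome.testsRun} failures={len(outcome.failures)}")
```

The second block simulates drift: its assertion no longer holds, so it surfaces as an ordinary test failure with a traceback rather than as a silent mismatch in the docs.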

The Evolution of Executable Documentation: `doctest` vs. `doc-builder`

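| Feature | `doctest` (traditional) | `doc-builder` (modern runnable Markdown) |
| --- | --- | --- |
| Testing approach | Tests embedded in the docs as interpreter sessions | Documentation snippets treated as ordinary Python code for testing |
| Integration | Standard library module | `pytest` plugin for seamless integration |
| Test syntax | `>>>` prompts, expected-output matching | Standard Python code, `pytest` assertions |
| Flexibility | Limited; brittle output matching | High; supports complex tests, decorators, and debugging |
| Documentation cleanliness | Test mechanics can clutter the docs | Hidden directives keep the docs clean |
| Debugging | String comparison; direct inspection is difficult | Standard Python debugging, full tracebacks |
| Setup/teardown | Can add noise to examples | Continuation blocks manage context effectively |
| Source of truth | Documentation format and embedded tests | Markdown source, tested through standard Python execution |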

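doc-builder also introduces **continuation blocks**, a feature essential for multi-step tutorials. They let authors split an example across several visible snippets, for instance `runnable:test_basic` followed by `runnable:test_basic:2`. Crucially, these blocks share the same execution context during testing, enabling a natural teaching flow without forcing all the code into one long block. That flexibility is essential when walking users through complex AI model usage or data-processing pipelines.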

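For example, an AI agent development workflow might involve several steps: defining the agent's tools, initializing the agent, and running a query. Continuation blocks let each step be presented clearly in its own documentation section while still running as a single, cohesive test sequence, much like the advanced agent workflows described in "Operationalizing AI Agents: Part 1".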
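The shared execution context can be sketched as one namespace per base label. This is an illustration under assumptions, not doc-builder's code: `base_name` and `run_grouped` are hypothetical helpers, and the only convention taken from the article is that `test_basic:2` continues `test_basic`.

```python
# Hypothetical (label, source) pairs in document order.
BLOCKS = [
    ("test_basic", "tools = ['search', 'calculator']\n"),
    ("test_other", "greeting = 'hello'\n"),
    ("test_basic:2", "agent = {'tools': tools}\nassert len(agent['tools']) == 2\n"),
]


def base_name(label):
    """Strip a trailing ':<number>' continuation suffix, if present."""
    head, sep, tail = label.rpartition(":")
    return head if sep and tail.isdigit() else label


def run_grouped(blocks):
    """Run blocks in order; continuations reuse their base block's namespace."""
    namespaces = {}
    for label, source in blocks:
        shared = namespaces.setdefault(base_name(label), {})
        exec(compile(source, f"<runnable:{label}>", "exec"), shared)
    return namespaces


namespaces = run_grouped(BLOCKS)
print(namespaces["test_basic"]["agent"])
```

Here `test_basic:2` can use `tools` because it runs in the namespace that `test_basic` populated, while the unrelated `test_other` block stays isolated.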

Keeping Documentation Clean While Ensuring Robust Tests

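One of doc-builder's most elegant solutions is its ability to keep the rendered documentation clean even when the source Markdown contains test-specific directives. Developers can add comments such as `# doc-builder: hide` to executable lines that should not appear in the documentation, and `# doc-builder: ignore-bare-assert` to assertions that are part of the test but whose comment should not be rendered. Likewise, pytest decorators (`# pytest-decorator: ...`) are stripped during rendering.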

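This keeps the documentation focused on teaching and clarity rather than cluttered with test boilerplate. Users see only the relevant code, while the underlying system guarantees that it works. That balance is crucial for developer-tool documentation, where both visual appeal and absolute accuracy are paramount.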
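The render-time clean-up can be approximated as a small text filter. This is a sketch under stated assumptions rather than doc-builder's actual rules: `render_visible` is a hypothetical function, and it assumes that a hidden line ending in a colon also hides its indented body, that `# pytest-decorator:` lines are dropped entirely, and that any other trailing `# doc-builder:` comment is removed while the code on that line is kept.

```python
import re

HIDE = "# doc-builder: hide"
DECORATOR_PREFIX = "# pytest-decorator:"
# Any other trailing "# doc-builder: <directive>" comment is removed, code is kept.
TRAILING_DIRECTIVE_RE = re.compile(r"\s*# doc-builder: [\w-]+\s*$")


def indent_of(line):
    return len(line) - len(line.lstrip())


def render_visible(source):
    """Return the source as a reader would see it, without test-only directives."""
    visible = []
    hidden_indent = None  # indent of a hidden compound statement we are inside
    for line in source.splitlines():
        stripped = line.strip()
        if hidden_indent is not None:
            if stripped and indent_of(line) <= hidden_indent:
                hidden_indent = None  # the hidden body has ended
            else:
                continue  # still inside the hidden body
        if stripped.startswith(DECORATOR_PREFIX):
            continue
        if stripped.endswith(HIDE):
            # Assumption: hiding a header such as "if ...:" hides its body too.
            if stripped[: -len(HIDE)].rstrip().endswith(":"):
                hidden_indent = indent_of(line)
            continue
        visible.append(TRAILING_DIRECTIVE_RE.sub("", line))
    return "\n".join(visible)


SAMPLE = """\
# pytest-decorator: transformers.testing_utils.slow
result = [{"label": "POSITIVE", "score": 0.99}]

if not result:  # doc-builder: hide
    raise ValueError("no result")

print(result[0]["label"])
assert result[0]["score"] > 0.5  # doc-builder: ignore-bare-assert
"""

rendered = render_visible(SAMPLE)
print(rendered)
```

The unfiltered `SAMPLE` remains valid Python for the test run; only the rendered view loses the guard clause, the decorator comment, and the directive on the assertion.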

Impact on Large-Scale AI Projects and Beyond

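For a massive repository such as Hugging Face's Transformers, with hundreds of documentation pages and thousands of examples, this capability is transformative. It automatically guards against documentation drift, a problem that would otherwise demand enormous manual effort or result in a steady stream of broken examples. Runnable documentation keeps the docs and the codebase in sync, helping maintain reliability at a scale where manual review is simply impractical. This aligns with broader efforts in the AI community to ensure reliability, such as evaluating AI agents for production.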

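By bringing executable documentation into the modern era of `pytest` and sophisticated CI/CD pipelines, Hugging Face demonstrates a strong commitment to developer experience and code quality. The goal is the same as it was two decades ago: documentation examples should work. Now, though, the docs not only show how code is supposed to work but also continuously prove that it does, fostering a more trustworthy ecosystem for AI development.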

Original Source

https://huggingface.co/blog/huggingface/runnable-examples

Frequently Asked Questions

What is the core problem Hugging Face's runnable Markdown addresses?

Hugging Face's runnable Markdown addresses the pervasive problem of 'documentation drift,' where code examples in documentation become outdated and silently break as libraries and APIs evolve. This leads to user frustration and diminishes the credibility of the documentation. By making documentation examples runnable and testable, the doc-builder ensures that these snippets are continuously validated against the codebase, guaranteeing that they always work as advertised. This proactive approach prevents broken examples, enhances user trust, and improves the overall developer experience by providing reliable resources.

How does runnable Markdown differ from Python's traditional `doctest` module?

While both `doctest` and runnable Markdown aim for executable documentation, they differ significantly in their approach. `doctest` embeds tests directly into documentation syntax, requiring examples to mirror interactive interpreter sessions with expected output. This often leads to documentation being cluttered with test mechanics. Hugging Face's runnable Markdown, in contrast, treats documentation snippets as normal Python code living within Markdown files. It integrates seamlessly with modern testing frameworks like `pytest`, allowing for complex assertions, debugging, and standard test infrastructure. This separation of concerns ensures documentation remains clean and readable, while testing remains powerful and flexible, avoiding the limitations of `doctest`'s brittle output matching and verbose setup/teardown.

What are 'continuation blocks' in Hugging Face's `doc-builder`?

Continuation blocks are a powerful feature in Hugging Face's `doc-builder` that allow authors to split complex code examples or tutorials across multiple visible Markdown snippets while maintaining a shared execution context during testing. This means that a setup defined in one runnable block can be reused and built upon in a subsequent block, without forcing the documentation to present everything as one long, monolithic code fence. For example, `runnable:test_basic` can define initial variables, and `runnable:test_basic:2` can then use those variables. This enhances readability and instructional flow in documentation, making it easier to present multi-step processes without sacrificing the integrity of the underlying testable code.

How does `doc-builder` integrate with existing testing frameworks like `pytest`?

Hugging Face's `doc-builder` integrates natively with `pytest`, transforming runnable Markdown blocks into standard `pytest` test items. With `hf-doc-builder` installed, `pytest` automatically discovers and executes these blocks within Markdown files. This integration means that documentation examples can leverage the full power of `pytest`, including its assertion mechanisms, fixtures, decorators, and debugging tools. Failures appear as normal test failures with comprehensive tracebacks, allowing developers to debug effectively. This approach avoids the need for a special-purpose testing mini-language, embedding documentation tests directly into the project's existing, robust test infrastructure.

How does `doc-builder` ensure documentation remains clean despite embedded test logic?

A key design principle of `doc-builder` is to prevent test mechanics from polluting the user-facing documentation. Authors can embed test-only directives, such as `# pytest-decorator: transformers.testing_utils.slow` or `# doc-builder: hide` for lines that should be executable but not displayed, directly within the Markdown source. When the documentation is rendered, `doc-builder` intelligently strips these directives and comments, presenting a clean, readable code snippet to the user. This allows developers to write comprehensive tests alongside their examples without compromising the clarity and brevity expected of good documentation, maintaining a clear separation between the source code for testing and the rendered content for users.

What are the benefits of runnable documentation for large AI projects like Hugging Face Transformers?

For large AI projects such as Hugging Face Transformers, which involve extensive documentation and thousands of code examples, runnable documentation offers immense benefits. It drastically reduces 'documentation drift' by continuously validating examples against the evolving codebase, ensuring they remain accurate and functional. This prevents user frustration caused by broken examples and builds trust in the documentation's reliability. By integrating with `pytest`, it allows these projects to manage documentation tests within their existing CI/CD pipelines, making manual review of examples unnecessary at scale. This automated validation is crucial for maintaining the quality and usability of documentation in rapidly developing and complex software ecosystems.

Can runnable Markdown be adopted by other projects outside of Hugging Face?

Yes, the principles and mechanisms behind Hugging Face's runnable Markdown, particularly its integration with standard Python testing tools like `pytest` and its focus on separating testing logic from displayed documentation, are highly applicable and beneficial for any software project. While the `doc-builder` itself is specific to Hugging Face, the underlying ideas represent a best practice in developer tools. Other projects can implement similar systems using existing tools or adapt `doc-builder`'s concepts to ensure their documentation examples are continuously tested and reliable. This approach is a general solution to a common problem across the software development landscape, making documentation more robust and trustworthy.