Code Velocity

Runnable Markdown: Revolutionizing Documentation Testing with Hugging Face

8 min read · Hugging Face · Original source
[Image: the Hugging Face logo with code snippets and a 'runnable' label, representing the concept of runnable Markdown examples.]

Documentation is the critical bridge between developers and their tools, but its reliability is often undermined by a widespread problem: documentation drift. As software evolves, code examples in the docs can break silently, causing frustration, wasted time, and eroded trust. Hugging Face, a leader in AI innovation, tackles this challenge head-on with its doc-builder project, introducing runnable Markdown blocks that ensure documentation examples are not merely illustrative but rigorously tested. This modern approach redefines how we think about executable documentation, combining the clarity of good docs with the rigor of continuous testing.

The Challenge: Bridging the Gap Between Documentation and Code Integrity

The core philosophy behind runnable documentation is not new. For decades, the Python community has championed documentation examples that users can copy, paste, and expect to run flawlessly. However, upholding this ideal in large, fast-moving projects such as Hugging Face's Transformers library is a major undertaking. Manual verification does not scale, and traditional approaches often force a compromise between clear documentation and effective testing.

The problem stems from fundamentally different requirements:

  • Documentation examples prioritize brevity, readability, and a focus on teaching. They aim to be free of "noise."
  • Tests require assertions, setup/teardown, fixtures, mocking, and debuggability. They prioritize robustness and coverage.

When these two concerns are forced into the same format, one of them usually suffers. Hugging Face's doc-builder aims to resolve this tension by letting documentation stay clean while its underlying examples are rigorously verified, so that every snippet users encounter is a verified fact rather than an aspiration. This is essential for maintaining trust and accelerating developer adoption in the fast-growing world of AI.

The doctest Legacy: Early Innovation and Evolving Needs

The idea of executable documentation gained early traction in Python with the introduction of doctest in Python 2.1 (2001). Created by Tim Peters, doctest was an elegant solution: it parsed documentation examples formatted as interactive Python interpreter sessions (for example, `>>> add(2, 3)` followed by the expected output `5`) and verified that the actual output matched. This innovation turned documentation examples into automated regression tests, a major step forward for code quality.

doctest was a natural fit for Python, a language that encourages interactive exploration. For small projects and straightforward APIs it worked very well, offering a simple yet powerful mechanism for keeping basic examples working. It embodied the "show, don't tell" spirit in software development, making documentation a living part of the test suite.
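
To make the mechanism concrete, here is a minimal sketch of doctest checking an interpreter-style example using only the standard library. The `add` function mirrors the inline example above; the `run_examples` helper is a hypothetical name introduced purely for illustration.

```python
import doctest


def add(a, b):
    """Return the sum of two numbers.

    >>> add(2, 3)
    5
    """
    return a + b


def run_examples():
    """Run the docstring example for add() and return (failed, attempted)."""
    parser = doctest.DocTestParser()
    # Build a DocTest from the docstring; the globs dict is what the example sees.
    test = parser.get_doctest(add.__doc__, {"add": add}, "add", None, 0)
    results = doctest.DocTestRunner(verbose=False).run(test)
    return results.failed, results.attempted


print(run_examples())
```

The check passes only because the printed representation of the result matches the text `5` exactly. That string comparison is what makes doctest simple, and it is also the source of the brittleness discussed later in the article.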

Hugging Face's Modern Solution: Runnable Markdown Blocks

Recognizing the limitations of older approaches for large, complex projects, Hugging Face's doc-builder project introduces a modern take on runnable documentation. Instead of embedding tests inside documentation syntax, it treats documentation snippets as ordinary Python code that lives inside Markdown. This turns Markdown into a lightweight test container, decoupling presentation from the testing mechanism.

A runnable block in Markdown looks like this:

```py runnable:quickstart
from transformers import pipeline

pipe = pipeline("sentiment-analysis")
result = pipe("I love runnable docs!")

if not result:  # doc-builder: hide
    raise ValueError("pipeline returned no result")

print(result[0]["label"])
assert result[0]["score"] > 0.5  # doc-builder: ignore-bare-assert
```

When rendered, this block looks like an ordinary code example. During testing, however, it is executed as normal Python code. This dual nature keeps the documentation clean for readers while providing robust, testable examples for developers. The approach is especially valuable in complex domains such as AI, where examples often involve intricate model loading and inference steps.

Seamless Integration with pytest and Advanced Features

A key differentiator of Hugging Face's approach is its seamless integration with modern testing frameworks, particularly pytest. With hf-doc-builder installed, pytest can automatically discover and execute runnable blocks inside Markdown files, treating each block as a regular test item. Documentation examples can therefore participate fully in a project's existing test infrastructure, drawing on powerful pytest features such as assertions, fixtures, debugging tools, and comprehensive reporting.
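
The article does not describe how the plugin finds blocks internally, so the following is only a rough sketch of the general idea, not hf-doc-builder's implementation. It assumes a fence whose info string reads `py runnable:<name>`, as in the example earlier, and the names `collect_runnable_blocks` and `run_block` are hypothetical.

```python
import re

# Built programmatically so this example can itself sit inside a code fence.
FENCE = "`" * 3

# Matches fences whose info string looks like: py runnable:<name>
RUNNABLE_BLOCK = re.compile(
    rf"^{FENCE}py runnable:(?P<name>[\w:]+)[ \t]*\n(?P<body>.*?)^{FENCE}[ \t]*$",
    re.DOTALL | re.MULTILINE,
)


def collect_runnable_blocks(markdown):
    """Return (name, source) pairs for every runnable fence in the text."""
    return [(m["name"], m["body"]) for m in RUNNABLE_BLOCK.finditer(markdown)]


def run_block(name, source):
    """Execute one block in a fresh namespace; any exception means failure."""
    namespace = {"__name__": f"doc_example_{name}"}
    exec(compile(source, f"<runnable:{name}>", "exec"), namespace)
    return namespace


SAMPLE = "\n".join([
    "# Quickstart",
    "",
    f"{FENCE}py runnable:greeting",
    "message = 'hello'.upper()",
    "assert message == 'HELLO'",
    FENCE,
    "",
    f"{FENCE}py",
    "this fence is not runnable and is skipped",
    FENCE,
])

blocks = collect_runnable_blocks(SAMPLE)
results = {name: run_block(name, source) for name, source in blocks}
print(sorted(results))
```

In a real pytest setup each collected pair would presumably be reported as its own test item, so a failing assertion inside a snippet would surface as an ordinary test failure with a traceback.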

The Evolution of Executable Documentation: doctest vs. doc-builder

| Feature | doctest (Traditional) | doc-builder (Modern Runnable Markdown) |
| --- | --- | --- |
| Testing approach | Embeds tests as interpreter sessions in the documentation | Treats documentation snippets as normal Python code for testing |
| Integration | Standard-library module | pytest plugin for seamless integration |
| Test syntax | `>>>` prompts, expected-output matching | Normal Python code, pytest assertions |
| Flexibility | Limited; brittle output matching | High; supports complex tests, decorators, debugging |
| Documentation cleanliness | Can clutter docs with test mechanics | Keeps docs clean through hidden directives |
| Debugging | String comparison, indirect inspection | Standard Python debugging, full tracebacks |
| Setup/teardown | Can add noise to examples | Manages context efficiently with continuation blocks |
| Source of truth | Documentation format with embedded tests | Markdown source, tested through normal Python execution |

doc-builder also introduces continuation blocks, an important feature for multi-step tutorials. These let authors split an example across several visible snippets, such as runnable:test_basic followed by runnable:test_basic:2. Crucially, these blocks share the same execution context during testing, enabling a natural instructional flow without forcing all the code into one long block. This flexibility is key to guiding users through complex uses of AI models or data processing pipelines.

For example, an AI agent development workflow might involve several steps: defining the agent's tools, initializing the agent, and then running a query. Continuation blocks allow each of these steps to be presented clearly in separate sections of the documentation while being executed as a single test sequence, much like the advanced agent workflows described in Operationalizing Agentic AI: Part 1.
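
The shared execution context can be pictured with a small sketch. It assumes block labels of the form `test_basic` and `test_basic:2` (the part after `runnable:` in the fence), groups them by base name, and runs each group in one namespace. `split_label` and `run_grouped` are hypothetical helpers, not part of doc-builder.

```python
from collections import defaultdict


def split_label(label):
    """Split 'test_basic:2' into ('test_basic', 2); a bare name is part 1."""
    base, _, index = label.partition(":")
    return base, int(index) if index else 1


def run_grouped(blocks):
    """Run blocks that share a base name in one namespace, in part order."""
    groups = defaultdict(list)
    for label, source in blocks:
        base, index = split_label(label)
        groups[base].append((index, source))

    namespaces = {}
    for base, parts in groups.items():
        namespace = {}  # one shared context per group of continuation blocks
        for index, source in sorted(parts, key=lambda part: part[0]):
            exec(compile(source, f"<runnable:{base}:{index}>", "exec"), namespace)
        namespaces[base] = namespace
    return namespaces


blocks = [
    ("test_basic", "values = [1, 2, 3]\n"),
    ("test_basic:2", "total = sum(values)\nassert total == 6\n"),
    # An unrelated block gets its own context and cannot see `values`.
    ("other", "assert 'values' not in globals()\n"),
]
namespaces = run_grouped(blocks)
print(namespaces["test_basic"]["total"])
```

The second snippet can use `values` only because both parts run in the same dictionary, which is the property that lets a tutorial spread one example across several sections.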

Keeping Documentation Clean While Ensuring Robust Testing

One of doc-builder's most elegant solutions is its ability to keep the rendered documentation clean even when the Markdown source contains test-specific directives. Developers can add comments such as # doc-builder: hide to executable lines that should not appear in the documentation, or # doc-builder: ignore-bare-assert to assertions that are part of the test but whose directive comment should not be rendered. Likewise, pytest decorators (# pytest-decorator: ...) are stripped during rendering.

This keeps the documentation focused on teaching and clarity, uncluttered by code that exists only for testing. The reader sees only the relevant code, while the underlying system guarantees that it works. This balance is vital for developer-tool documentation, where both visual polish and complete accuracy matter.
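
As an illustration of the rendering side, the sketch below strips the three directive forms named above from a snippet. It is an assumption-laden approximation: it works line by line only, and how the real tool treats the indented body of a hidden compound statement is not stated in the article. `render_for_readers` is a hypothetical name.

```python
import re

# Directive spellings are taken from the article; the matching rules are assumed.
HIDE = re.compile(r"#\s*doc-builder:\s*hide\s*$")
IGNORE_BARE_ASSERT = re.compile(r"\s*#\s*doc-builder:\s*ignore-bare-assert\s*$")
PYTEST_DECORATOR = re.compile(r"^\s*#\s*pytest-decorator:")


def render_for_readers(source):
    """Return the snippet as a reader would see it, under the assumed rules."""
    visible = []
    for line in source.splitlines():
        if HIDE.search(line) or PYTEST_DECORATOR.match(line):
            continue  # test-only line: executed during testing, never displayed
        # Keep the assertion itself but drop the trailing directive comment.
        visible.append(IGNORE_BARE_ASSERT.sub("", line))
    return "\n".join(visible)


SOURCE = "\n".join([
    "# pytest-decorator: transformers.testing_utils.slow",
    "score = 0.9",
    "assert isinstance(score, float)  # doc-builder: hide",
    "print(score)",
    "assert score > 0.5  # doc-builder: ignore-bare-assert",
])

rendered = render_for_readers(SOURCE)
print(rendered)
```

The test run would still execute every line of `SOURCE`; only the displayed version is shortened, which is how the same fence can serve both readers and the test suite.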

Impact on Large AI Projects and Beyond

For a large codebase such as Hugging Face's Transformers, with hundreds of documentation pages and thousands of examples, this capability is a game changer. It makes it possible to prevent documentation drift automatically, a problem that would otherwise require enormous manual effort or result in a steady stream of broken examples. Runnable documentation helps keep docs and code in sync, preserving trust at a scale where manual review is impractical. This aligns with broader efforts in the AI community around evaluating AI agents for production and ensuring reliability.

By bringing executable documentation into the modern era of pytest and contemporary CI/CD pipelines, Hugging Face demonstrates a strong commitment to developer experience and code quality. The goal remains the same as it was more than two decades ago: documentation examples should work. Now, however, they do not merely show how the code is supposed to work but continuously verify that it does, fostering a more reliable and trustworthy ecosystem for AI development.

Frequently Asked Questions

What is the core problem Hugging Face's runnable Markdown addresses?
Hugging Face's runnable Markdown addresses the pervasive problem of 'documentation drift,' where code examples in documentation become outdated and silently break as libraries and APIs evolve. This leads to user frustration and diminishes the credibility of the documentation. By making documentation examples runnable and testable, the doc-builder ensures that these snippets are continuously validated against the codebase, guaranteeing that they always work as advertised. This proactive approach prevents broken examples, enhances user trust, and improves the overall developer experience by providing reliable resources.
How does runnable Markdown differ from Python's traditional `doctest` module?
While both `doctest` and runnable Markdown aim for executable documentation, they differ significantly in their approach. `doctest` embeds tests directly into documentation syntax, requiring examples to mirror interactive interpreter sessions with expected output. This often leads to documentation being cluttered with test mechanics. Hugging Face's runnable Markdown, in contrast, treats documentation snippets as normal Python code living within Markdown files. It integrates seamlessly with modern testing frameworks like `pytest`, allowing for complex assertions, debugging, and standard test infrastructure. This separation of concerns ensures documentation remains clean and readable, while testing remains powerful and flexible, avoiding the limitations of `doctest`'s brittle output matching and verbose setup/teardown.
What are 'continuation blocks' in Hugging Face's `doc-builder`?
Continuation blocks are a powerful feature in Hugging Face's `doc-builder` that allow authors to split complex code examples or tutorials across multiple visible Markdown snippets while maintaining a shared execution context during testing. This means that a setup defined in one runnable block can be reused and built upon in a subsequent block, without forcing the documentation to present everything as one long, monolithic code fence. For example, `runnable:test_basic` can define initial variables, and `runnable:test_basic:2` can then use those variables. This enhances readability and instructional flow in documentation, making it easier to present multi-step processes without sacrificing the integrity of the underlying testable code.
How does `doc-builder` integrate with existing testing frameworks like `pytest`?
Hugging Face's `doc-builder` integrates natively with `pytest`, transforming runnable Markdown blocks into standard `pytest` test items. With `hf-doc-builder` installed, `pytest` automatically discovers and executes these blocks within Markdown files. This integration means that documentation examples can leverage the full power of `pytest`, including its assertion mechanisms, fixtures, decorators, and debugging tools. Failures appear as normal test failures with comprehensive tracebacks, allowing developers to debug effectively. This approach avoids the need for a special-purpose testing mini-language, embedding documentation tests directly into the project's existing, robust test infrastructure.
How does `doc-builder` ensure documentation remains clean despite embedded test logic?
A key design principle of `doc-builder` is to prevent test mechanics from polluting the user-facing documentation. Authors can embed test-only directives, such as `# pytest-decorator: transformers.testing_utils.slow` or `# doc-builder: hide` for lines that should be executable but not displayed, directly within the Markdown source. When the documentation is rendered, `doc-builder` intelligently strips these directives and comments, presenting a clean, readable code snippet to the user. This allows developers to write comprehensive tests alongside their examples without compromising the clarity and brevity expected of good documentation, maintaining a clear separation between the source code for testing and the rendered content for users.
What are the benefits of runnable documentation for large AI projects like Hugging Face Transformers?
For large AI projects such as Hugging Face Transformers, which involve extensive documentation and thousands of code examples, runnable documentation offers immense benefits. It drastically reduces 'documentation drift' by continuously validating examples against the evolving codebase, ensuring they remain accurate and functional. This prevents user frustration caused by broken examples and builds trust in the documentation's reliability. By integrating with `pytest`, it allows these projects to manage documentation tests within their existing CI/CD pipelines, making manual review of examples unnecessary at scale. This automated validation is crucial for maintaining the quality and usability of documentation in rapidly developing and complex software ecosystems.
Can runnable Markdown be adopted by other projects outside of Hugging Face?
Yes, the principles and mechanisms behind Hugging Face's runnable Markdown, particularly its integration with standard Python testing tools like `pytest` and its focus on separating testing logic from displayed documentation, are highly applicable and beneficial for any software project. While the `doc-builder` itself is specific to Hugging Face, the underlying ideas represent a best practice in developer tools. Other projects can implement similar systems using existing tools or adapt `doc-builder`'s concepts to ensure their documentation examples are continuously tested and reliable. This approach is a general solution to a common problem across the software development landscape, making documentation more robust and trustworthy.
