Agent-Driven Development: Accelerating Copilot Applied Science

Automating Intellectual Work with AI Agents

In the fast-changing landscape of software engineering, the pursuit of efficiency often leads to real innovation. Tyler McGoffin, an AI researcher, recently described a journey that embodies this spirit: automating his own intellectual work through agent-driven development with GitHub Copilot. This is not just about coding faster; it is about shifting the developer's role from repetitive analysis to creative problem solving and strategic oversight. McGoffin's experience highlights a familiar pattern among engineers, building tools to eliminate tedious tasks, but takes it a step further by handing AI agents complex analytical tasks that were previously impractical to perform manually at scale.

McGoffin's inspiration came from a critical but tedious part of his work: analyzing the performance of coding agents against benchmarks such as TerminalBench2 and SWEBench-Pro. This meant analyzing 'trajectories', detailed JSON logs of an agent's thought processes and actions, which can add up to hundreds of thousands of lines across the many tasks and runs of a benchmark. While GitHub Copilot already helped with spotting patterns, the repetitive nature of this analytical loop called for full automation. The result was 'eval-agents', a system designed to automate this intellectual toil and allow his team in Copilot Applied Science to achieve similar efficiency.
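To make the idea of trajectory analysis concrete, here is a minimal sketch of the kind of summary such a system might compute for a single run. The article does not describe the trajectory schema, so the `task`, `steps`, `action`, and `error` fields below are assumptions made for illustration, as is the `summarize_trajectory` helper.

```python
import json
from collections import Counter
from typing import Any


def summarize_trajectory(raw: str) -> dict[str, Any]:
    """Summarize one trajectory given as a JSON string.

    Assumed (hypothetical) schema:
    {"task": str, "steps": [{"action": str, "error": bool}, ...]}
    """
    data = json.loads(raw)
    steps = data.get("steps", [])
    # Count how often each action appears and how many steps reported an error.
    actions = Counter(step.get("action", "unknown") for step in steps)
    errors = sum(1 for step in steps if step.get("error"))
    return {
        "task": data.get("task", "unknown"),
        "step_count": len(steps),
        "error_count": errors,
        "top_actions": actions.most_common(3),
    }


# A tiny made-up trajectory, used only to exercise the function.
example = json.dumps({
    "task": "fix-failing-test",
    "steps": [
        {"action": "read_file", "error": False},
        {"action": "run_tests", "error": True},
        {"action": "edit_file", "error": False},
        {"action": "run_tests", "error": False},
    ],
})
summary = summarize_trajectory(example)
print(summary)
```

Running a summary like this over every task and run is the sort of repetitive pass that is easy to hand to an agent once the schema is known.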

The Blueprint for Agent-Driven Development

The creation of 'eval-agents' was guided by a clear set of principles centered on collaboration and scalability. McGoffin set three goals: make these AI agents easy to share, make new agents simple to write, and make coding agents the primary vehicle for team contributions. These goals reflect GitHub's core values, particularly those honed during his experience as an open source maintainer for the GitHub CLI. However, it was the third goal, making coding agents the primary contributors, that truly shaped the project's direction and unlocked unexpected benefits for the first two.

The agentic coding setup leveraged several powerful tools to streamline the development process:

Coding agent: Copilot CLI, providing direct interaction and control.
Model used: Claude Opus 4.6, offering advanced reasoning and code generation capabilities.
IDE: VSCode, serving as the central workspace for development.

The Copilot SDK was critical: it provided access to existing tools, MCP servers, and mechanisms for registering new tools and skills. This foundation removed the need to reinvent basic agent functionality and let the team focus on application-specific logic. The integrated environment fostered a fast development loop and showed that, with the right setup, AI agents can not only assist with but also drive significant parts of the development effort.

Core Principles for Effective Agentic Coding

Moving to an agent-driven paradigm requires more than tools; it requires a change in methodology. McGoffin identified three core principles that proved essential for accelerating development and fostering collaboration:

Prompting strategies: interacting effectively with agents means being conversational, verbose, and planning-first.
Architectural strategies: a clean, well-documented, and refactored codebase is paramount for agents to navigate and contribute to it effectively.
Iteration strategies: adopting a 'blame process, not agents' approach, similar to a blameless culture, enables rapid experimentation and learning.

Applied consistently, these strategies produced striking results. As evidence, five new contributors, within just three days, together added 11 new agents and four new skills, and introduced the concept of 'evaluation agent workflows' to the project. This collaborative sprint produced a change of +28,858/-2,884 lines of code across 345 files, demonstrating the impact of github-agentic-workflows in practice.

Here is a summary of the main principles:

Principle	Description	Benefit for agent-driven development
Prompting	Treat agents like junior engineers: guide their thinking, over-explain assumptions, and use planning modes (`/plan`) before execution. Be conversational and verbose.	Leads to more accurate and relevant outputs, and helps agents solve complex problems effectively.
Architectural	Prioritize refactoring, comprehensive documentation, and robust tests. Keep the codebase clean, readable, and well structured. Actively remove dead code.	Lets agents understand the codebase, its patterns, and existing functionality, which makes accurate contributions easier.
Iteration	Adopt a "blame process, not agents" approach. Put guardrails in place (strict typing, linters, extensive tests) to prevent mistakes. Learn from agent mistakes by improving processes and guardrails.	Fosters rapid iteration, builds trust in agent contributions, and continuously improves the development pipeline.

Accelerating Development: Strategies in Action

The success of this agent-driven approach lies in the practical application of these principles.

Prompting Strategies: Guiding the AI Engineer

AI coding agents, powerful as they are, excel at well-defined problems. For more complex tasks they need guidance, much like junior engineers. McGoffin found that a conversational style, explaining assumptions, and leveraging planning modes were far more effective than terse commands. For example, when adding robust regression tests, a prompt like /plan I've recently observed Copilot happily updating tests to fit its new paradigms even though those tests shouldn't be updated. How can I create a reserved test space that Copilot can't touch or must reserve to protect against regressions? started a productive dialogue. This back-and-forth, often with the claude-opus-4-6 model, led to sophisticated solutions such as contract-test guardrails that only human engineers could update, ensuring that critical functionality stayed protected.
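The article does not say how the reserved test space was enforced. One possible mechanism, sketched here purely as an assumption, is to record a hash manifest of the protected tests when a human approves them and have CI fail whenever the files no longer match. The directory name `contract_tests` and the helpers `fingerprint` and `changed_files` are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def fingerprint(directory: Path) -> dict[str, str]:
    """Map each file's relative path to the SHA-256 of its contents."""
    return {
        path.relative_to(directory).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(directory.rglob("*"))
        if path.is_file()
    }


def changed_files(directory: Path, approved: dict[str, str]) -> list[str]:
    """Return paths that were added, removed, or modified since approval."""
    current = fingerprint(directory)
    return sorted(
        name
        for name in current.keys() | approved.keys()
        if current.get(name) != approved.get(name)
    )


# Demonstration in a temporary directory standing in for a repository.
with tempfile.TemporaryDirectory() as tmp:
    protected = Path(tmp) / "contract_tests"  # hypothetical protected folder
    protected.mkdir()
    test_file = protected / "test_api_contract.py"
    test_file.write_text("def test_status():\n    assert True\n")

    approved = fingerprint(protected)  # manifest recorded at human approval
    untouched = changed_files(protected, approved)

    # Simulate an agent rewriting a protected test.
    test_file.write_text("def test_status():\n    assert 1 == 1\n")
    modified = changed_files(protected, approved)

print(untouched, modified)
```

For a check like this to mean anything, the manifest itself would have to live somewhere only human reviewers can change, for example behind a required-reviewer rule.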

Architectural Strategies: The Foundation of AI-Assisted Quality

For human engineers, keeping a codebase clean, writing tests, and documenting features often get lower priority under feature pressure. In agent-driven development these become paramount. McGoffin found that investing time in refactoring, documentation, and additional test cases dramatically improved Copilot's ability to navigate and contribute to the codebase. An 'agent-first' repository thrives on clarity. It even lets developers prompt Copilot with questions like 'Knowing what I know now, how would I design this differently?', turning theoretical refactors into achievable projects with AI assistance. This sustained focus on architectural health keeps new features easy to deliver.

Iteration Strategies: Trusting the Process, Not Just the Agent

The evolution of AI models has shifted the mindset from 'trust but verify' to a more trusting approach, similar to how effective teams operate with a 'blame process, not people' philosophy. This 'blameless culture' in agent-driven development means that when an AI agent makes a mistake, the response is to improve the underlying processes and guardrails rather than blame the agent itself. That involves rigorous CI/CD practices: strict typing to ensure interface conformance, robust linters for code quality, and extensive integration, end-to-end, and contract tests. While building these tests by hand can be expensive, agent assistance makes them much cheaper to implement, and they provide critical confidence in new changes. By putting these systems in place, developers let Copilot check its own work, mirroring how a junior engineer is set up for success.
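As an illustration of how strict typing and contract tests can work together as guardrails, the sketch below defines an interface with `typing.Protocol` and a contract check that any implementation must pass. The `Analyzer`, `Finding`, and `ErrorStepAnalyzer` names are invented for this example and are not taken from the eval-agents project.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Finding:
    task: str
    message: str


class Analyzer(Protocol):
    """Interface every (hypothetical) analysis agent is expected to satisfy."""

    name: str

    def analyze(self, trajectory: dict) -> list[Finding]: ...


class ErrorStepAnalyzer:
    """Example implementation: reports each step flagged as an error."""

    name = "error-steps"

    def analyze(self, trajectory: dict) -> list[Finding]:
        task = trajectory.get("task", "unknown")
        return [
            Finding(task, f"step {i} failed: {step.get('action', 'unknown')}")
            for i, step in enumerate(trajectory.get("steps", []))
            if step.get("error")
        ]


def check_contract(analyzer: Analyzer) -> None:
    """Contract test: behaviour any analyzer must keep, whoever wrote it."""
    assert analyzer.name, "analyzers need a non-empty name"
    # An empty trajectory must produce no findings.
    assert analyzer.analyze({"task": "t", "steps": []}) == []
    # Every result must be a Finding, never a raw string or dict.
    findings = analyzer.analyze({"task": "t", "steps": [{"action": "run", "error": True}]})
    assert all(isinstance(f, Finding) for f in findings)


check_contract(ErrorStepAnalyzer())
```

A static type checker such as mypy can flag a class that drifts from the `Analyzer` interface before any test runs, while the contract check catches behavioural drift at test time. Both give an agent a fast, automatic signal that its change broke something.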

Mastering the Agent-Driven Development Loop

Combining these principles into a practical workflow creates a powerful development loop:

Plan with Copilot: start new features with /plan. Iterate on the plan, and make sure test and documentation updates are included and complete before the code is implemented. Documentation can serve as an additional set of guidelines for the agent.
Implement with Autopilot: let Copilot implement the feature using /autopilot, drawing on its code generation capabilities.
Review with Copilot Code Review: ask Copilot to start a review loop. This involves requesting a review from the Copilot Code Review agent, addressing its comments, and re-requesting reviews until the issues are resolved.
Human review: perform a final human review to make sure patterns are enforced and complex decisions match the strategic intent.

Beyond the feature loop, continuous optimization is key. McGoffin routinely prompts Copilot with commands such as /plan Review the code for any missing tests, any tests that may be broken, and dead code or /plan Review the documentation and code to identify any documentation gaps. These checks, run weekly or as new features are merged, keep the agent-driven development environment healthy and efficient.

The Future of Software Engineering with AI

What began as a personal effort to automate a frustrating analysis task has grown into a new paradigm for software development. Agent-driven development, powered by tools like GitHub Copilot and advanced models such as Claude Opus, is not only about making developers faster; it changes the nature of the work for AI researchers and software engineers alike. By handing intellectual toil to capable agents, teams can reach higher levels of productivity, collaboration, and innovation, and ultimately focus on the creative and strategic challenges that actually drive progress. This approach points to a future in which AI agents are not just tools but integral members of the development team, changing how we build and maintain software.

Original source

https://github.blog/ai-and-ml/github-copilot/agent-driven-development-in-copilot-applied-science/

Frequently Asked Questions

What is agent-driven development in the context of GitHub Copilot?

Agent-driven development refers to a software engineering paradigm where AI agents, such as those powered by GitHub Copilot, become primary contributors and collaborators in the development process. Instead of merely suggesting code, these agents actively participate in planning, implementing, refactoring, testing, and documenting software. This approach leverages the AI's ability to automate repetitive intellectual tasks, allowing human engineers to focus on higher-level problem-solving, strategic design, and creative work, thereby accelerating development cycles and improving code quality through structured AI assistance and rigorous guardrails.

How did the 'eval-agents' project originate?

The 'eval-agents' project was born out of a common challenge faced by AI researchers: analyzing vast quantities of data. Tyler McGoffin, an AI researcher, found himself repeatedly poring over hundreds of thousands of lines of 'trajectories'—detailed logs of AI agent thought processes and actions during benchmark evaluations. Recognizing this as an intellectually toilsome and repetitive task, he sought to automate it. By applying agent-driven development principles with GitHub Copilot, he created 'eval-agents' to analyze these trajectories, significantly reducing the manual effort required and transforming a tedious analytical chore into an automated process.

What are the key components of an agentic coding setup for this approach?

An effective agentic coding setup, as demonstrated in this approach, typically includes a powerful AI coding agent like Copilot CLI, a robust underlying large language model such as Claude Opus 4.6, and a feature-rich Integrated Development Environment (IDE) like VSCode. Crucially, leveraging an SDK, such as the Copilot SDK, provides access to essential tools, servers, and mechanisms for registering new tools and skills, offering a foundational infrastructure for building and deploying agents without reinventing core functionalities. This integrated environment enables seamless interaction between the developer and the AI agent throughout the development lifecycle.

What prompting strategies are most effective when working with AI coding agents?

Effective prompting strategies for AI coding agents emphasize conversational, verbose, and planning-oriented interactions. Rather than terse problem statements, developers achieve better results by engaging agents in a dialogue, over-explaining assumptions, and leveraging the AI's speed for initial planning before committing to code changes. This involves using planning modes (e.g., '/plan') to collaboratively brainstorm solutions and refine ideas. Treating the AI agent like a junior engineer who benefits from clear guidance, context, and iterative feedback helps it to produce more accurate and relevant outputs, leading to superior problem-solving and feature implementation.

Why are architectural strategies like refactoring and documentation crucial for agent-driven development?

Architectural strategies like frequent refactoring, comprehensive documentation, and robust testing are paramount in agent-driven development because they create a clean, navigable codebase that AI agents can effectively understand and interact with. A well-maintained codebase, much like for human engineers, allows AI agents to contribute features more accurately and efficiently. By prioritizing readability, consistent patterns, and up-to-date documentation, developers ensure that Copilot can interpret the codebase's intent, identify opportunities for improvement, and implement changes with minimal errors, making feature delivery trivial and facilitating continuous re-architecture.

How does a 'blameless culture' apply to iteration strategies in agent-driven development?

Applying a 'blameless culture' to agent-driven development means shifting from a 'trust but verify' mindset to one that prioritizes 'blame process, not agents.' This philosophy acknowledges that AI agents, like human engineers, can make mistakes. The focus then shifts to implementing robust processes and guardrails—such as strict typing, comprehensive linters, and extensive integration and end-to-end tests—to prevent errors. When an agent does make a mistake, the response is to learn from it and introduce additional guardrails, refining the processes and prompts to ensure the same error isn't repeated, fostering a rapid and psychologically safe iteration pipeline.

What is the typical development loop when using agent-driven development?

The typical development loop in agent-driven development begins with planning a new feature collaboratively with Copilot using a '/plan' prompt, ensuring testing and documentation updates are integrated early. Next, Copilot implements the feature, often using an '/autopilot' command. Following implementation, a review loop is initiated with a Copilot Code Review agent, addressing comments iteratively. The final stage involves a human review to enforce patterns and standards. Outside this feature loop, Copilot is periodically prompted to review for missing tests, code duplication, or documentation gaps, maintaining a continuously optimized agent-driven environment.

What kind of impact did agent-driven development have on team productivity and collaboration?

The impact of agent-driven development on team productivity and collaboration was transformative, leading to an incredibly rapid iteration pipeline. In one instance, a team of five new contributors, using this methodology, created 11 new agents, four new skills, and implemented complex workflows in less than three days. This amounted to a staggering change of +28,858/-2,884 lines of code across 345 files. This dramatic increase in output highlights how agent-driven development, by automating routine tasks and providing intelligent assistance, significantly accelerates feature delivery, fosters deeper collaboration, and enables teams to achieve unprecedented levels of innovation and efficiency.
