Agent-Driven Development: Empowering Copilot Applied Science

title: "Agent-Driven Development: Empowering Copilot Applied Science"
slug: "agent-driven-development-in-copilot-applied-science"
date: "2026-04-02"
lang: "zh"
source: "https://github.blog/ai-and-ml/github-copilot/agent-driven-development-in-copilot-applied-science/"
category: "Developer Tools"
keywords:
- Agent-driven development
- GitHub Copilot
- AI coding agents
- Software engineering
- Automation
- Claude Opus
- Developer tools
- AI research
- Prompt engineering
- Code refactoring
- CI/CD
- AI workflows
meta_description: "Learn how agent-driven development, combined with GitHub Copilot and Claude Opus, changes software engineering, automates intellectual toil, and accelerates collaborative workflows."
image: "/images/articles/agent-driven-development-in-copilot-applied-science.png"
image_alt: "Screenshot of GitHub Copilot's agent-driven development interface, showing code suggestions and a collaborative coding workflow."
quality_score: 94
content_score: 93
seo_score: 95
companies:
- GitHub
schema_type: "NewsArticle"
reading_time: 7

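Automating intellectual toil with AI agents

In the fast-moving field of software engineering, the pursuit of efficiency often leads to breakthrough innovation. AI researcher Tyler McGoffin recently described a journey that embodies this spirit: automating his own intellectual toil through agent-driven development with GitHub Copilot. The point is not just to write code faster; it is to shift the developer's role from repetitive analysis toward creative problem-solving and strategic oversight. McGoffin's experience reflects a pattern common among engineers, building tools to eliminate tedious work, but it goes a step further by delegating complex analytical tasks to AI agents, tasks that previously could not be scaled by hand.

McGoffin's inspiration came from a critical but overwhelming part of his job: analyzing the performance of coding agents against benchmarks such as TerminalBench2 and SWEBench-Pro. This involves dissecting 'trajectories', detailed JSON logs of an agent's thought process and actions, which can run to hundreds of thousands of lines across many tasks and benchmark runs. GitHub Copilot was already helping with pattern recognition, but the repetitive nature of the analysis loop called for full automation. The result was 'eval-agents', a system designed to automate this intellectual burden so that his team in Copilot Applied Science could achieve the same efficiency.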
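To make the idea of a trajectory concrete, here is a minimal Python sketch of the kind of summarizing that is tedious to do by hand. The JSON layout used here (a `steps` list whose entries carry `type`, `tool`, and `exit_code` fields) is an assumption made for illustration; the article does not describe the actual trajectory format, and this is not how 'eval-agents' is implemented.

```python
import json
from collections import Counter
from typing import Any, Dict


def summarize_trajectory(raw: str) -> Dict[str, Any]:
    """Summarize one trajectory stored in a hypothetical JSON layout."""
    trajectory = json.loads(raw)
    steps = trajectory.get("steps", [])

    # Count steps by kind, for example "thought" versus "action".
    by_type = Counter(step.get("type", "unknown") for step in steps)

    # Only action steps can name a tool or report an exit code.
    actions = [step for step in steps if step.get("type") == "action"]
    tools = Counter(step["tool"] for step in actions if "tool" in step)

    # Treat an action as failed when it reports a non-zero exit code.
    failed = sum(1 for step in actions if step.get("exit_code", 0) != 0)

    return {
        "total_steps": len(steps),
        "by_type": dict(by_type),
        "tools": dict(tools),
        "failed_actions": failed,
    }
```

Calling `summarize_trajectory` on each log file and aggregating the results gives a per-run overview without reading every line.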

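A blueprint for agent-driven development

'eval-agents' was built around a clear set of principles focused on collaboration and scalability. McGoffin's goals were to make these AI agents easy to share, to make them easy to author, and to make coding agents the primary vehicle for team contributions. These goals reflect GitHub's core values, especially those he honed as an OSS maintainer of the GitHub CLI. It was the third goal, however, making coding agents the primary contributors, that truly shaped the project's direction and brought unexpected benefits to the first two.

The agentic coding setup relies on a few powerful tools to streamline development:

Coding agent: Copilot CLI, for direct interaction and control.
Model used: Claude Opus 4.6, for advanced reasoning and code generation.
Integrated development environment (IDE): VSCode, the core workspace for development.

Crucially, the Copilot SDK provides access to existing tools, MCP servers, and mechanisms for registering new tools and skills. This foundation removes the need to reinvent core agent functionality and lets the team focus on application-specific logic. The integrated environment enables a fast development loop and shows that, with the right setup, AI agents can do more than assist: they can drive a large share of the development work.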
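The general idea of registering tools under a name so that an agent can look them up can be sketched in plain Python. This is not the Copilot SDK's API, which the article does not show; `ToolRegistry`, `register`, `call`, and the `count_lines` tool are all hypothetical names chosen for illustration.

```python
from typing import Callable, Dict, List

ToolFn = Callable[[str], str]


class ToolRegistry:
    """A hypothetical name-to-function registry (not the Copilot SDK)."""

    def __init__(self) -> None:
        self._tools: Dict[str, ToolFn] = {}

    def register(self, name: str) -> Callable[[ToolFn], ToolFn]:
        """Return a decorator that stores the function under `name`."""
        def decorator(fn: ToolFn) -> ToolFn:
            if name in self._tools:
                raise ValueError(f"tool already registered: {name}")
            self._tools[name] = fn
            return fn
        return decorator

    def call(self, name: str, argument: str) -> str:
        """Look up a tool by name and invoke it."""
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        return self._tools[name](argument)

    def names(self) -> List[str]:
        """List registered tool names in sorted order."""
        return sorted(self._tools)


registry = ToolRegistry()


@registry.register("count_lines")
def count_lines(text: str) -> str:
    # Illustrative tool: report how many lines the input has.
    return str(len(text.splitlines()))
```

With a registry of this kind, adding a capability means writing one function and registering it, which is the sort of reuse an SDK is meant to provide.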

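Core principles for effective agentic coding

Moving to an agent-driven paradigm takes more than tools; it also takes a change in method. McGoffin identified three core principles that proved fundamental to speeding up development and fostering collaboration:

Prompting strategy: interacting effectively with agents means being conversational, verbose, and planning-first.
Architecture strategy: a clean, well-documented, refactored codebase is essential for agents to navigate and contribute effectively.
Iteration strategy: adopting a "blame process, not agents" mindset, similar to a blameless culture, enables rapid experimentation and learning.

Applied consistently, these strategies produced striking results. As evidence, five new contributors together added 11 new agents and four new skills, and introduced the concept of 'eval-agent workflows' to the project, in less than three days. This collaborative sprint changed +28,858/-2,884 lines of code across 345 files, showing the impact of github-agentic-workflows in practice.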

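The core principles are summarized below:

Principle	Description	Benefit for agent-driven development
Prompting	Treat agents like junior engineers: guide their thinking, over-explain assumptions, and use plan mode (`/plan`) before execution. Stay conversational and verbose.	Produces more accurate and relevant output and helps agents solve complex problems effectively.
Architecture	Prioritize refactoring, comprehensive documentation, and robust testing. Keep the codebase clean, readable, and well structured. Actively remove dead code.	Lets agents understand the codebase, its patterns, and existing functionality, which leads to accurate contributions.
Iteration	Adopt a "blame process, not agents" mindset. Put guardrails in place (strict typing, linters, extensive tests) to prevent errors. Learn from agent mistakes by strengthening processes and guardrails.	Enables rapid iteration, builds confidence in agent contributions, and continuously improves the development pipeline.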

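Accelerating development: the strategies in practice

The success of this agent-driven approach is rooted in the practical application of these principles.

Prompting strategy: guiding the AI engineer

AI coding agents are powerful, and they excel at well-scoped problems. For more complex tasks they need guidance, much like a junior engineer. McGoffin found that interacting conversationally, explaining assumptions, and using plan mode worked far better than terse commands. For example, when adding robust regression tests, a prompt such as `/plan I've recently observed Copilot happily updating tests to fit its new paradigms even though those tests shouldn't be updated. How can I create a reserved test space that Copilot can't touch or must reserve to protect against regressions?` started a productive conversation. That back-and-forth, usually with the claude-opus-4-6 model, led to sophisticated solutions such as contract-test guardrails that only human engineers may update, ensuring that critical functionality stays protected.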
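One way such a guardrail could work is to fingerprint the protected test files and fail a check whenever they change. The Python sketch below is an assumption about how this might be done, not the mechanism the article's team built; the function names and the idea of a hash manifest are illustrative.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def build_manifest(protected_dir: Path) -> Dict[str, str]:
    """Map each file under protected_dir (by relative path) to its digest."""
    return {
        path.relative_to(protected_dir).as_posix(): file_digest(path)
        for path in sorted(protected_dir.rglob("*"))
        if path.is_file()
    }


def find_violations(protected_dir: Path, manifest: Dict[str, str]) -> List[str]:
    """List files that were added, deleted, or modified since the manifest."""
    current = build_manifest(protected_dir)
    violations = []
    for name in sorted(set(manifest) | set(current)):
        if name not in current:
            violations.append(f"deleted: {name}")
        elif name not in manifest:
            violations.append(f"added: {name}")
        elif current[name] != manifest[name]:
            violations.append(f"modified: {name}")
    return violations
```

For this to protect anything, the manifest has to live outside the protected directory and be updated only by a human, for example through a review rule on that one file. A CI job would then fail whenever `find_violations` returns a non-empty list.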

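Architecture strategy: the foundation of AI-assisted quality

For human engineers under pressure to ship features, maintaining a clean codebase, writing tests, and documenting features often get deprioritized. In agent-driven development they become essential. McGoffin found that taking time to refactor, write documentation, and add test cases greatly improved Copilot's ability to navigate and contribute to the codebase. An agent-first repository needs clarity. It even lets developers ask Copilot questions such as "Knowing what I know now, how would I design this differently?", turning theoretical refactors into projects that can be carried out with AI assistance. This continued attention to architectural health keeps new features easy to ship.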

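Iteration strategy: trust the process, not just the agent

Advances in AI models have shifted the mindset from "trust but verify" toward a more trusting stance, much as high-performing teams follow the principle of "blame process, not people." In agent-driven development, this blameless culture means that when an AI agent makes a mistake, the response is to improve the underlying process and guardrails rather than to blame the agent. That involves rigorous CI/CD practices: strict typing to keep interfaces consistent, robust linters to maintain code quality, and extensive integration, end-to-end, and contract tests. These tests can be expensive to build by hand, but agent assistance makes them much cheaper to implement, and they provide critical confidence in new changes. With these systems in place, developers enable Copilot to check its own work, mirroring how a junior engineer is set up for success.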
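The guardrails described above can be wired into a single gate that an agent or a CI job runs before a change is accepted. The sketch below is a minimal Python example under stated assumptions: the article does not name specific tools, so the use of mypy, ruff, and pytest in `EXAMPLE_CHECKS` is an illustrative choice, and `run_guardrails` is a hypothetical helper.

```python
import subprocess
import sys
from typing import List, Sequence, Tuple

Check = Tuple[str, List[str]]


def run_guardrails(checks: Sequence[Check]) -> List[str]:
    """Run every check and return the names of those that failed."""
    failed = []
    for name, command in checks:
        # A non-zero exit code marks the check as failed.
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode != 0:
            failed.append(name)
    return failed


# Assumed tool choices; swap in whatever the repository actually uses.
EXAMPLE_CHECKS: List[Check] = [
    ("types", [sys.executable, "-m", "mypy", "--strict", "."]),
    ("lint", [sys.executable, "-m", "ruff", "check", "."]),
    ("tests", [sys.executable, "-m", "pytest", "-q"]),
]
```

Running all checks rather than stopping at the first failure gives the agent the full list of problems to fix in one pass. When a new kind of mistake slips through, the blameless response is to add another entry to the list.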

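Mastering the agent-driven development loop

Bringing these principles together in a practical workflow creates a powerful, accelerated development loop:

Plan with Copilot: start a new feature with /plan. Iterate on the plan, making sure that testing and documentation updates are included and completed before the code is implemented. Documentation can serve as additional guidance for the agent.
Implement with Autopilot: let Copilot implement the feature with /autopilot, drawing on its code generation capabilities.
Review with Copilot Code Review: prompt Copilot to start a review loop. This includes requesting the Copilot Code Review agent, addressing its comments, and re-requesting review until the issues are resolved.
Human review: carry out a final human review to make sure patterns are enforced and complex decisions match strategic intent.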
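The review step in the loop above (request a review, address comments, re-request until nothing is left) can be sketched as a bounded loop. In this Python sketch, `request_review` and `address` are hypothetical callables standing in for whatever actually asks the review agent for comments and applies fixes; the round limit is an added assumption so the loop cannot run forever.

```python
from typing import Callable, List


def review_loop(
    request_review: Callable[[], List[str]],
    address: Callable[[str], None],
    max_rounds: int = 5,
) -> bool:
    """Repeat review rounds until no comments remain.

    Returns True when a round comes back clean, meaning the change is
    ready for the final human review, and False when the round limit
    is reached and a human should step in earlier.
    """
    for _ in range(max_rounds):
        comments = request_review()
        if not comments:
            return True
        for comment in comments:
            address(comment)
    return False
```

Capping the number of rounds keeps a stubborn disagreement between the implementing agent and the reviewing agent from consuming time silently.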

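Beyond the feature loop, continuous optimization is key. McGoffin regularly sends Copilot commands such as `/plan Review the code for any missing tests, any tests that may be broken, and dead code` or `/plan Review the documentation and code to identify any documentation gaps.` These checks run weekly, or whenever a new feature is integrated, to keep the agent-driven development environment healthy and efficient.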
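The cadence described here, weekly or whenever a feature lands, reduces to a small rule. The Python sketch below quotes the two prompts from the article; the `maintenance_due` helper and its parameters are hypothetical, and how the prompts are actually sent to Copilot is left out.

```python
from datetime import date, timedelta

# The two maintenance prompts quoted in the article.
MAINTENANCE_PROMPTS = [
    "/plan Review the code for any missing tests, any tests that may be broken, and dead code",
    "/plan Review the documentation and code to identify any documentation gaps.",
]


def maintenance_due(
    last_run: date,
    today: date,
    feature_merged: bool,
    interval: timedelta = timedelta(days=7),
) -> bool:
    """Return True when the maintenance prompts should be run again.

    A run is due if a feature was just integrated or if at least
    `interval` has passed since the previous run.
    """
    return feature_merged or (today - last_run) >= interval
```

For example, `maintenance_due(date(2026, 4, 1), date(2026, 4, 3), feature_merged=False)` is false because only two days have passed, while the same call with `feature_merged=True` is true.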

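The future of AI-powered software engineering

What began as a personal effort to automate a frustrating analysis task has grown into a new paradigm for software development. Agent-driven development, powered by tools such as GitHub Copilot and advanced models such as Claude Opus, is not only about making developers faster; it is changing the nature of the work that AI researchers and software engineers do. By offloading intellectual toil to agents, teams can reach new levels of productivity, collaboration, and innovation, and focus on the creative and strategic challenges that actually drive progress. The approach points to a future in which AI agents are not just tools but integral members of the development team, changing how software is built and maintained.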

Original source

https://github.blog/ai-and-ml/github-copilot/agent-driven-development-in-copilot-applied-science/

Frequently asked questions

What is agent-driven development in the context of GitHub Copilot?

Agent-driven development refers to a software engineering paradigm where AI agents, such as those powered by GitHub Copilot, become primary contributors and collaborators in the development process. Instead of merely suggesting code, these agents actively participate in planning, implementing, refactoring, testing, and documenting software. This approach leverages the AI's ability to automate repetitive intellectual tasks, allowing human engineers to focus on higher-level problem-solving, strategic design, and creative work, thereby accelerating development cycles and improving code quality through structured AI assistance and rigorous guardrails.

How did the 'eval-agents' project originate?

The 'eval-agents' project was born out of a common challenge faced by AI researchers: analyzing vast quantities of data. Tyler McGoffin, an AI researcher, found himself repeatedly poring over hundreds of thousands of lines of 'trajectories'—detailed logs of AI agent thought processes and actions during benchmark evaluations. Recognizing this as an intellectually toilsome and repetitive task, he sought to automate it. By applying agent-driven development principles with GitHub Copilot, he created 'eval-agents' to analyze these trajectories, significantly reducing the manual effort required and transforming a tedious analytical chore into an automated process.

What are the key components of an agentic coding setup for this approach?

An effective agentic coding setup, as demonstrated in this approach, typically includes a powerful AI coding agent like Copilot CLI, a robust underlying large language model such as Claude Opus 4.6, and a feature-rich Integrated Development Environment (IDE) like VSCode. Crucially, leveraging an SDK, such as the Copilot SDK, provides access to essential tools, servers, and mechanisms for registering new tools and skills, offering a foundational infrastructure for building and deploying agents without reinventing core functionalities. This integrated environment enables seamless interaction between the developer and the AI agent throughout the development lifecycle.

What prompting strategies are most effective when working with AI coding agents?

Effective prompting strategies for AI coding agents emphasize conversational, verbose, and planning-oriented interactions. Rather than terse problem statements, developers achieve better results by engaging agents in a dialogue, over-explaining assumptions, and leveraging the AI's speed for initial planning before committing to code changes. This involves using planning modes (e.g., '/plan') to collaboratively brainstorm solutions and refine ideas. Treating the AI agent like a junior engineer who benefits from clear guidance, context, and iterative feedback helps it to produce more accurate and relevant outputs, leading to superior problem-solving and feature implementation.

Why are architectural strategies like refactoring and documentation crucial for agent-driven development?

Architectural strategies like frequent refactoring, comprehensive documentation, and robust testing are paramount in agent-driven development because they create a clean, navigable codebase that AI agents can effectively understand and interact with. A well-maintained codebase, much like for human engineers, allows AI agents to contribute features more accurately and efficiently. By prioritizing readability, consistent patterns, and up-to-date documentation, developers ensure that Copilot can interpret the codebase's intent, identify opportunities for improvement, and implement changes with minimal errors, making feature delivery trivial and facilitating continuous re-architecture.

How does a 'blameless culture' apply to iteration strategies in agent-driven development?

Applying a 'blameless culture' to agent-driven development means shifting from a 'trust but verify' mindset to one that prioritizes 'blame process, not agents.' This philosophy acknowledges that AI agents, like human engineers, can make mistakes. The focus then shifts to implementing robust processes and guardrails—such as strict typing, comprehensive linters, and extensive integration and end-to-end tests—to prevent errors. When an agent does make a mistake, the response is to learn from it and introduce additional guardrails, refining the processes and prompts to ensure the same error isn't repeated, fostering a rapid and psychologically safe iteration pipeline.

What is the typical development loop when using agent-driven development?

The typical development loop in agent-driven development begins with planning a new feature collaboratively with Copilot using a '/plan' prompt, ensuring testing and documentation updates are integrated early. Next, Copilot implements the feature, often using an '/autopilot' command. Following implementation, a review loop is initiated with a Copilot Code Review agent, addressing comments iteratively. The final stage involves a human review to enforce patterns and standards. Outside this feature loop, Copilot is periodically prompted to review for missing tests, code duplication, or documentation gaps, maintaining a continuously optimized agent-driven environment.

What kind of impact did agent-driven development have on team productivity and collaboration?

The impact of agent-driven development on team productivity and collaboration was transformative, leading to an incredibly rapid iteration pipeline. In one instance, a team of five new contributors, using this methodology, created 11 new agents, four new skills, and implemented complex workflows in less than three days. This amounted to a staggering change of +28,858/-2,884 lines of code across 345 files. This dramatic increase in output highlights how agent-driven development, by automating routine tasks and providing intelligent assistance, significantly accelerates feature delivery, fosters deeper collaboration, and enables teams to achieve unprecedented levels of innovation and efficiency.
