What is the primary goal of the expanded strategic collaboration between AWS and NVIDIA?

The collaboration aims to accelerate the transition of AI solutions from experimental phases to full-scale production environments. This involves integrating new technologies and expanding existing capabilities across accelerated computing, interconnect technologies, model fine-tuning, and inference. The focus is on enabling customers to build and run AI solutions that are reliable, performant at scale, and compliant with enterprise security and regulatory requirements, ultimately driving meaningful business outcomes through production-ready AI systems.

What significant GPU infrastructure expansions are planned by AWS as part of this collaboration?

Starting in 2026, AWS plans to deploy over 1 million NVIDIA GPUs, including the next-generation Blackwell and Rubin architectures, across its global cloud regions. This massive expansion solidifies AWS's position as a leading provider of NVIDIA GPU-based instances, offering the broadest collection for diverse AI/ML workloads. This enhanced capacity is crucial for supporting the surging demand for AI compute, particularly for complex agentic AI systems that require extensive computational power.

How will the new Amazon EC2 instances with NVIDIA RTX PRO 4500 Blackwell Server Edition GPUs benefit users?

AWS is the first major cloud provider to support the NVIDIA RTX PRO 4500 Blackwell Server Edition GPUs on Amazon EC2 instances. These instances are highly versatile, suitable for a broad spectrum of workloads such as data analytics, conversational AI, content generation, recommender systems, video streaming, and advanced graphics rendering. Built on the AWS Nitro System, they offer enhanced resource efficiency, robust security, and stability, delivering superior performance for demanding AI and graphics applications.

How does the integration of NVIDIA NIXL with AWS EFA enhance Large Language Model (LLM) inference?

The integration of NVIDIA Inference Xfer Library (NIXL) with AWS Elastic Fabric Adapter (EFA) is designed to accelerate disaggregated LLM inference on Amazon EC2 across both NVIDIA GPUs and AWS Trainium instances. This is critical for managing the communication overhead in large models, enabling efficient overlap of communication and computation, minimizing latency, and maximizing GPU utilization. It facilitates high-throughput, low-latency KV-cache data movement and integrates natively with popular open-source frameworks like NVIDIA Dynamo, vLLM, and SGLang.

What improvements are being made to Apache Spark performance for data analytics?

AWS and NVIDIA's joint engineering efforts have resulted in a 3x faster performance for Apache Spark workloads. This is achieved by combining Amazon EMR on Amazon EKS with G7e instances, powered by NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs. This significant speedup allows data engineers and scientists to accelerate time-to-insight for critical tasks such as AI/ML feature engineering, complex ETL transformations, and real-time analytics, maintaining full compatibility with existing Spark applications.

What expanded NVIDIA Nemotron model support is coming to Amazon Bedrock?

Amazon Bedrock will soon support fine-tuning NVIDIA Nemotron models directly using Reinforcement Fine-Tuning (RFT). This capability allows developers to precisely align model behavior to specific domains like legal, healthcare, or finance without infrastructure overhead. Additionally, NVIDIA Nemotron 3 Super, a hybrid Mixture-of-Experts (MoE) model optimized for multi-agent workloads and extended reasoning, will also be available on Amazon Bedrock, providing fast, cost-efficient inference via a fully managed API for complex, multi-step AI tasks.

How does this collaboration address energy efficiency and sustainability in AI?

The collaboration acknowledges the growing importance of energy efficiency as AI workloads scale. Performance per watt is highlighted not just as a sustainability metric but as a competitive advantage. The article points to an NVIDIA GTC session where sustainability leaders, including Amazon CSO Kara Hurst, discuss how AI is transforming enterprise energy and infrastructure, emphasizing efforts towards more sustainable AI practices from data centers to broader enterprise AI applications.

title: "AWS、NVIDIA 深化AI合作，加速AI从试点到生产" slug: "aws-and-nvidia-deepen-strategic-collaboration-to-accelerate-ai-from-pilot-to-production" date: "2026-03-18" lang: "zh" source: "https://aws.amazon.com/blogs/machine-learning/aws-and-nvidia-deepen-strategic-collaboration-to-accelerate-ai-from-pilot-to-production/" category: "企业级AI" keywords:

AWS
NVIDIA
AI加速
GTC 2026
GPU
Amazon EC2
Amazon Bedrock
Nemotron
LLM推理
EFA
Apache Spark
企业级AI
生产级AI
机器学习 meta_description: "AWS 和 NVIDIA 在 GTC 2026 大会上深化战略合作，宣布重大集成，以加速AI从试点到生产，包括扩展的 GPU 部署、新的 EC2 实例以及 Amazon Bedrock 对 Nemotron 模型的支持。" image: "/images/articles/aws-and-nvidia-deepen-strategic-collaboration-to-accelerate-ai-from-pilot-to-production.png" image_alt: "AWS 和 NVIDIA 标志醒目地展示，象征着双方在AI加速和创新方面的扩展战略合作。" quality_score: 94 content_score: 93 seo_score: 95 companies:
AWS
NVIDIA schema_type: "NewsArticle" reading_time: 5 faq:
question: "AWS 与 NVIDIA 扩展战略合作的主要目标是什么？" answer: "此次合作旨在加速 AI 解决方案从实验阶段过渡到全面的生产环境。这包括在加速计算、互连技术、模型微调和推理方面集成新技术并扩展现有能力。重点是使客户能够构建和运行可靠、大规模高性能、并符合企业安全和监管要求的 AI 解决方案，最终通过生产就绪的 AI 系统驱动有意义的业务成果。"
question: "作为此次合作的一部分，AWS 计划进行哪些重要的 GPU 基础设施扩展？" answer: "从 2026 年开始，AWS 计划在其全球云区域部署超过 100 万个 NVIDIA GPU，包括下一代 Blackwell 和 Rubin 架构。这一大规模扩展巩固了 AWS 作为 NVIDIA GPU 实例领先供应商的地位，为各种 AI/ML 工作负载提供了最广泛的集合。这种增强的容量对于支持 AI 计算的激增需求至关重要，特别是对于需要大量计算能力的复杂具身智能 (agentic AI) 系统。"
question: "配备 NVIDIA RTX PRO 4500 Blackwell 服务器版 GPU 的新 Amazon EC2 实例将如何使用户受益？" answer: "AWS 是首个在 Amazon EC2 实例上支持 NVIDIA RTX PRO 4500 Blackwell 服务器版 GPU 的主要云提供商。这些实例用途广泛，适用于数据分析、对话式 AI、内容生成、推荐系统、视频流和高级图形渲染等广泛工作负载。它们基于 AWS Nitro System 构建，提供增强的资源效率、强大的安全性和稳定性，为要求苛刻的 AI 和图形应用提供卓越性能。"
question: "NVIDIA NIXL 与 AWS EFA 的集成如何增强大型语言模型 (LLM) 推理？" answer: "NVIDIA 推理传输库 (NIXL) 与 AWS 弹性结构适配器 (EFA) 的集成旨在加速 Amazon EC2 上 NVIDIA GPU 和 AWS Trainium 实例的分布式 LLM 推理。这对于管理大型模型中的通信开销至关重要，能够实现通信和计算的有效重叠，最大限度地减少延迟，并最大化 GPU 利用率。它促进了高吞吐量、低延迟的 KV 缓存数据移动，并与 NVIDIA Dynamo、vLLM 和 SGLang 等流行的开源框架原生集成。"
question: "Apache Spark 数据分析性能正在进行哪些改进？" answer: "AWS 和 NVIDIA 的联合工程工作使 Apache Spark 工作负载的性能提高了 3 倍。这是通过将 Amazon EMR on Amazon EKS 与由 NVIDIA RTX PRO 6000 Blackwell 服务器版 GPU 提供支持的 G7e 实例相结合来实现的。这种显著的加速使数据工程师和科学家能够加快 AI/ML 特征工程、复杂 ETL 转换和实时分析等关键任务的洞察时间，同时保持与现有 Spark 应用程序的完全兼容性。"
question: "Amazon Bedrock 将支持哪些扩展的 NVIDIA Nemotron 模型？" answer: "Amazon Bedrock 即将直接支持使用强化微调 (RFT) 对 NVIDIA Nemotron 模型进行微调。此功能使开发人员能够将模型行为精确地调整到法律、医疗保健或金融等特定领域，而无需基础设施开销。此外，NVIDIA Nemotron 3 Super，一种针对多智能体工作负载和扩展推理进行优化的混合专家 (MoE) 模型，也将在 Amazon Bedrock 上提供，通过完全托管的 API 为复杂的、多步骤的 AI 任务提供快速、高成本效益的推理。"
question: "此次合作如何解决 AI 中的能源效率和可持续性问题？" answer: "此次合作认识到，随着 AI 工作负载的扩展，能源效率日益重要。每瓦性能不仅被视为一项可持续性指标，更是一项竞争优势。文章指出在 NVIDIA GTC 大会的一个会议上，包括 Amazon 首席可持续发展官 (CSO) Kara Hurst 在内的可持续发展领导者们讨论了 AI 如何从根本上改变企业能源和基础设施，强调了从数据中心到更广泛的企业 AI 应用，在实现更可持续的 AI 实践方面所做的努力。"

AWS、NVIDIA 深化AI合作，加速AI从试点到生产

人工智能（AI）正以前所未有的速度改变着各行各业，但其真正的价值不仅在于实验，更在于在生产环境中成功部署和运营 AI 解决方案。这需要稳健、可扩展、安全且合规的系统，以提供切实的业务成果。为了满足这一关键需求，AWS 和 NVIDIA 在 NVIDIA GTC 2026 大会上宣布大幅扩展双方的战略合作，推出新的技术集成，旨在满足不断增长的 AI 计算需求，并将 AI 解决方案推向实际生产。

此次深化合作的重点是加速 AI 生命周期的各个方面，从基础设施到模型部署。这些集成涵盖了加速计算、先进互连技术以及简化的模型微调和推理等关键领域。主要公告包括：

从 2026 年开始，在 AWS 区域部署超过 100 万个 NVIDIA GPU。
Amazon EC2 支持 NVIDIA RTX PRO 4500 Blackwell 服务器版 GPU，使 AWS 成为首个提供此项服务的主要云提供商。
利用 AWS 弹性结构适配器 (EFA) 上的 NVIDIA NIXL，加速分布式大型语言模型 (LLM) 推理的互连。
使用 Amazon EMR on Amazon Elastic Kubernetes Service (Amazon EKS) 结合由 NVIDIA RTX PRO 6000 Blackwell 服务器版 GPU 提供支持的 Amazon EC2 G7e 实例，使 Apache Spark 工作负载的性能提高了 3 倍。
Amazon Bedrock 扩展支持 NVIDIA Nemotron 模型，包括强化微调和 Nemotron 3 Super 模型。

借助增强型 NVIDIA GPU 能力扩展 AI 基础设施

现代 AI 的基础在于强大的计算基础设施。从 2026 年开始，AWS 承诺通过向其全球云区域新增超过 100 万个 NVIDIA GPU，为 AI 进步做出巨大贡献。这包括下一代 Blackwell 和 Rubin GPU 架构，确保客户能够获得最先进的硬件。AWS 已经拥有业界最广泛的 NVIDIA GPU 实例集合，可满足各种 AI/ML 工作负载的需求，而此次扩展将进一步巩固其领先地位。

这项长达 15 年的长期合作关系，也延伸到 Spectrum 网络等关键基础设施领域。其目标是为企业、初创公司和研究人员提供构建和扩展高级具身智能系统所需的强大基础设施——这些 AI 系统能够跨复杂工作流进行自主推理、规划和行动。

推出新的 Amazon EC2 实例和互连创新

此次合作的一大亮点是即将推出的由 NVIDIA RTX PRO 4500 Blackwell 服务器版 GPU 加速的 Amazon EC2 实例。AWS 很高兴成为首个宣布支持这些强大 GPU 的主要云提供商，使它们可用于各种严苛任务。这些实例非常适合数据分析、复杂的对话式 AI、动态内容生成、高级推荐系统、高质量视频流和复杂图形工作负载。

这些新的 EC2 实例将构建在强大的 AWS Nitro System 之上。Nitro System 凭借其专用硬件和轻量级管理程序的独特组合，几乎将所有主机硬件的计算和内存资源直接交付给实例。这种设计确保了卓越的资源利用率和性能。至关重要的是，Nitro System 的专用硬件、软件和固件旨在强制执行严格的限制，保护敏感的 AI 工作负载和数据免受未经授权的访问，即使在 AWS 内部也是如此。其在运行时执行固件更新和优化的能力，进一步增强了生产级 AI、分析和图形工作负载所需的安全性及稳定性。

进一步提升性能，特别是对于大型 AI 模型，是加速分布式 LLM 推理的互连。随着模型规模的不断增长，GPU 或 AWS Trainium 实例之间的通信开销可能成为一个重要的瓶颈。AWS 宣布支持 NVIDIA 推理传输库 (NIXL) 与 AWS 弹性结构适配器 (EFA) 集成，旨在加速 Amazon EC2 上 NVIDIA GPU 和 AWS Trainium 实例的分布式 LLM 推理。这种集成对于扩展现代 AI 工作负载至关重要，能够实现通信和计算的有效重叠，最大限度地减少延迟，并最大化 GPU 利用率。它促进了计算节点和分布式内存资源之间的高吞吐量、低延迟 KV 缓存数据移动。NIXL 与 EFA 原生集成 NVIDIA Dynamo、vLLM 和 SGLang 等流行的开源框架，可提供改进的令牌间延迟和更高效的 KV 缓存内存利用率。

利用 Amazon EMR 和 GPU 加速数据分析

数据工程师和科学家经常面临耗时的数据处理流程，这会显著阻碍 AI/ML 模型迭代和商业智能的生成。AWS 和 NVIDIA 的合作带来了突破性的改进：Apache Spark 工作负载的性能提高了 3 倍。这一加速是通过在 Amazon EKS 上利用 Amazon EMR 结合由 NVIDIA RTX PRO 6000 Blackwell 服务器版 GPU 提供支持的 G7e 实例来实现的。

这一显著的性能提升是双方工程团队致力于优化 GPU 加速分析的直接成果。借助 Amazon EMR 和 G7e 实例，组织可以大幅缩短 AI/ML 特征工程、复杂 ETL 转换以及大规模实时分析所需的时间。运行大规模数据处理流程的客户可以在保持与现有 Spark 应用程序完全兼容的同时，更快地获得洞察。

Amazon Bedrock 扩展支持 NVIDIA Nemotron 模型

AWS 和 NVIDIA 还在基础模型方面扩展合作，将先进的 NVIDIA Nemotron 模型引入 Amazon Bedrock。

开发人员很快就能直接在 Amazon Bedrock 上使用强化微调 (RFT) 对 NVIDIA Nemotron 模型进行微调。对于需要根据特定领域（无论是法律、医疗保健、金融还是其他专业领域）调整模型行为的团队来说，这是一项颠覆性的能力。RFT 赋能用户塑造模型的推理和响应方式，使其超越单纯的知识获取，实现细致入微的行为对齐。至关重要的是，这在 Amazon Bedrock 上原生运行，消除了基础设施开销——用户定义任务、提供反馈，而 Bedrock 管理其余部分。

此外，NVIDIA Nemotron 3 Super，一个专为多智能体工作负载和扩展推理而构建的混合专家 (MoE) 模型，也即将登陆 Amazon Bedrock。Nemotron 3 Super 旨在帮助 AI 智能体在复杂的、多步骤的工作流程中保持准确性，将为金融、网络安全、零售和软件开发等多样化用例提供支持。它承诺通过完全托管的 API 提供快速、高成本效益的推理，从而简化复杂 AI 智能体的部署。

以下是主要公告摘要：

功能/集成	描述	主要优势	可用性
GPU 部署	在 AWS 区域部署超过 100 万个 NVIDIA GPU（Blackwell、Rubin 架构）。	为所有 AI/ML 工作负载、具身智能提供大规模计算能力。	2026 年开始
Amazon EC2 实例	EC2 支持 NVIDIA RTX PRO 4500 Blackwell 服务器版 GPU。	首个主要云提供商支持多功能 AI、图形、分析。	即将推出
LLM 推理	AWS EFA 上的 NVIDIA NIXL，用于加速 GPU 和 Trainium 上的分布式 LLM 推理。	最小化通信延迟，最大化 LLM 的 GPU 利用率。	已宣布
Apache Spark 性能	Amazon EMR on EKS 结合 G7e 实例（RTX PRO 6000），使 Spark 工作负载速度提高 3 倍。	加速数据分析、特征工程的洞察时间。	已宣布
Nemotron 微调	直接在 Amazon Bedrock 上对 Nemotron 模型进行强化微调 (RFT)。	无需基础设施开销，实现领域特定的模型行为对齐。	即将推出
Nemotron 3 Super	Amazon Bedrock 上的混合 MoE 模型，用于多智能体工作负载和扩展推理。	为复杂的、多步骤的 AI 任务提供快速、高成本效益的推理。	即将推出

致力于能源效率和可持续 AI

随着 AI 工作负载持续呈指数级增长，底层基础设施的效率和可持续性变得至关重要。此次合作也突显了双方在提高能源效率方面的共同承诺。每瓦性能不再仅仅是衡量可持续性的一项指标，更成为 AI 领域显著的竞争优势。

在 NVIDIA GTC 2026 大会上，亚马逊首席可持续发展官 (CSO) Kara Hurst 与其他可持续发展领域的领导者共同探讨了 AI 如何从根本上大规模改变企业能源和基础设施。此次讨论强调了开发和部署既强大又环保的 AI 解决方案的重点，从优化为活跃电网参与者的数据中心，到更广泛的企业 AI 应用。这种前瞻性的方法确保了 AI 计算的进步与全球可持续发展目标保持一致。

AWS、NVIDIA 深化AI合作，加速AI从试点到生产

AWS、NVIDIA 深化AI合作，加速AI从试点到生产

借助增强型 NVIDIA GPU 能力扩展 AI 基础设施

推出新的 Amazon EC2 实例和互连创新

利用 Amazon EMR 和 GPU 加速数据分析

Amazon Bedrock 扩展支持 NVIDIA Nemotron 模型

致力于能源效率和可持续 AI

常见问题

保持更新