Mistral Small 4：为开发者统一AI能力

Mistral AI 发布了 Mistral Small 4，这是一款开创性的模型，旨在重新定义AI领域的通用性和效率。此次最新发布标志着在将不同的AI能力——推理、多模态和指令遵循——统一到一个单一、适应性强的模型方面迈出了重要一步。对于开发者、研究人员和企业而言，Mistral Small 4承诺提供一种精简的方法来构建高级AI应用程序，而无需在专业模型之间进行权衡。

历史上，AI模型往往在特定领域表现出色：有些擅长快速执行指令，有些则展现出强大的推理能力，少数模型提供了多模态理解。Mistral Small 4打破了这一范式，将Mistral AI之前旗舰模型的优势——用于推理的Magistral、用于多模态输入的Pixtral以及用于智能体编程的Devstral——整合到一个统一的单元中。这种统一不仅是便利，更是迈向更高效、可扩展和开发者友好型AI的战略举措。

Mistral Small 4在宽松的Apache 2.0许可下发布，凸显了Mistral AI对开源原则的承诺，旨在培育一个创新蓬勃发展的协作生态系统。这种对可访问性的承诺确保了最先进的AI技术不仅仅为少数人所用，而是向全球社区开放，鼓励大家突破可能的界限。

驱动Mistral Small 4性能的架构创新

Mistral Small 4采用尖端架构设计，旨在实现强大的性能和卓越的效率。作为一种混合模型，它经过精心优化，适用于各种任务，包括通用聊天、复杂编码、精密的智能体工作流程和复杂的推理。其原生处理文本和图像输入的能力，使其成为现代AI应用中真正多功能的解决方案。

其设计的核心是**专家混合（MoE）**架构，拥有128位专家，每个token有4位活跃专家。这实现了高效的扩展和专业化，使模型能够动态地调用其网络中最相关的部分来处理任何给定任务。Mistral Small 4拥有1190亿总参数和每token 60亿活跃参数（包括嵌入和输出层在内为80亿），在保持高效占用的同时，蕴含着巨大的计算能力。

一个显著的特点是其宽广的256k上下文窗口，支持超长篇互动和深入的文档分析。这种扩展的上下文对于需要全面理解大量文本的任务至关重要，例如法律审查、科学研究或广泛的代码分析。此外，该模型引入了可配置的推理工作量，允许用户在快速、低延迟响应和深度、推理密集型输出之间切换，从而对性能和输出风格提供了前所未有的控制。

Mistral Small 4的原生多模态是一个颠覆性的特性，它接受文本和图像输入。这解锁了广泛的用例，从智能文档解析和视觉搜索到复杂的图像-文本生成和分析，使其成为新一代AI驱动应用的不可或缺的工具。

企业AI的效率和统一能力

Mistral Small 4的设计直接转化为切实的性能优势，为大型语言模型的效率树立了新标准。与前身Mistral Small 3相比，新模型在延迟优化设置中将端到端完成时间缩短了40%。对于要求高吞吐量的应用，它实现了每秒请求量惊人的3倍增长。

这种效率的飞跃对于企业部署至关重要，因为成本和速度是首要考虑因素。Mistral Small 4的智能设计确保了组织能够以更少的资源实现更多成果，从而降低运营成本并提供卓越的用户体验。该模型在LCR、LiveCodeBench和AIME 2025等基准测试中获得具有竞争力的分数——与GPT-OSS 120B等大型模型相当或超越——同时生成明显更短的输出，这证明了其"每token性能"的效率。这意味着更快的响应、更低的推理成本以及针对复杂、高风险任务的更高可扩展性。

性能亮点：Mistral Small 4 对比先前模型

指标	Mistral Small 4 (延迟优化)	Mistral Small 4 (吞吐量优化)	Mistral Small 3	GPT-OSS 120B (参考)
端到端完成时间	减少40%	—	基线	—
每秒请求量 (RPS)	—	增加3倍	基线	—
LCR基准分数	0.72	0.72	—	匹配/超越
LCR输出长度	1.6K字符	1.6K字符	—	长3.5-4倍
LiveCodeBench分数	优于	优于	—	优于
LiveCodeBench输出长度	减少20%	减少20%	—	基线

'reasoning_effort'参数进一步提升了这种效率，允许开发者根据任务需求微调模型的行为。对于日常聊天和快速响应，reasoning_effort="none"提供快速、轻量级的输出。对于复杂问题解决，设置reasoning_effort="high"则会触发深度、循序渐进的推理，类似于之前Magistral模型详细冗长的风格。这种动态可配置性确保了最佳的资源利用，使Mistral Small 4成为各种应用中适应性强的强大引擎。

拓展视野：用例和可访问性

Mistral Small 4 有望赋能广泛的用户和行业。对于开发者而言，它是编码自动化、代码库探索以及创建高级智能体工作流程的宝贵工具。其高效理解和生成代码的能力将加速开发周期并促进创新。

企业会发现Mistral Small 4在通用聊天助手、复杂的文档理解和全面的多模态分析方面不可或缺。从通过智能聊天机器人增强客户支持，到自动化从复杂文档中提取数据，其统一的能力简化了运营并开启了新的洞察。

研究人员，尤其是在需要严谨分析的领域，将从其在数学、研究和复杂推理任务方面的强大能力中受益。处理大量信息和进行深度推理的能力使其成为科学发现和学术探究的强大助手。

Mistral AI通过Apache 2.0许可展现的开源承诺，进一步扩大了其影响力。这使得在微调和专业化方面具有无与伦比的灵活性，使组织能够根据其独特的领域特定需求调整模型。这种协作精神与让高级AI普惠大众的更广泛运动相契合，体现了为所有人扩展AI的愿景。

可用性与生态系统集成

访问Mistral Small 4非常直接。开发者可以通过Mistral API和AI Studio进行集成。它也在Hugging Face Repository上随时可用，为开源社区提供了一个熟悉的平台。

对于在NVIDIA生态系统内操作的用户，可以在build.nvidia.com上免费进行Mistral Small 4的原型开发。对于生产级部署，该模型在发布当天即可作为NVIDIA NIM（NVIDIA推理微服务）提供，确保开箱即用的优化、容器化推理。通过NVIDIA NeMo也支持特定领域的微调定制。这种广泛的支持网络凸显了Mistral AI和NVIDIA之间的战略伙伴关系，加强了他们共同推进AI创新的目标。

可以在Mistral AI的AI治理中心查阅全面的技术文档，为开发者和集成商提供必要的资源。对于大型企业部署、定制微调或本地解决方案，Mistral AI鼓励直接与他们的专家团队联系。

AI的未来是开放和统一的

Mistral Small 4代表着AI模型演进中的一次重大飞跃。通过成功地将指令、推理和多模态能力统一到一个单一、高效且开放可用的软件包中，Mistral AI简化了AI集成，并赋能了各行各业的用户。这种适应性意味着开发者和组织可以使用一个单一、强大的工具来解决更广泛的任务，有效地将开源AI的变革性优势带入实际应用。

此次发布不仅简化了开发流程，还民主化了高级AI能力的获取，培养了一个更具创新性和协作性的全球AI社区。Mistral AI所展望的AI未来，是一个强大、多功能的工具随时可用，使每个人都能为技术进步的新篇章做出贡献的未来。

原始来源

https://mistral.ai/news/mistral-small-4

常见问题

What is Mistral Small 4 and what makes it unique?

Mistral Small 4 is the latest major release in Mistral AI's 'Small' model family, uniquely unifying the capabilities of their previous flagship models: Magistral for complex reasoning, Pixtral for multimodal understanding, and Devstral for agentic coding. This means developers no longer need to choose between specialized models for different tasks; Mistral Small 4 offers a single, versatile solution capable of fast instruction, powerful reasoning, and multimodal assistance, all with configurable reasoning effort and best-in-class efficiency. It's released under an Apache 2.0 license, emphasizing its commitment to open, accessible, and customizable AI, making it a significant advancement for developers and enterprises seeking integrated AI solutions.

What are the key architectural innovations in Mistral Small 4?

Mistral Small 4 leverages a sophisticated Mixture of Experts (MoE) architecture, featuring 128 experts with 4 active per token, allowing for efficient scaling and specialization. It boasts a total of 119 billion parameters, with 6 billion active parameters per token (8 billion including embedding and output layers), providing substantial processing power. A 256k context window supports extensive long-form interactions and detailed document analysis. Furthermore, its native multimodality accepts both text and image inputs, unlocking a vast array of use cases from document parsing to visual analysis. The model also includes a configurable 'reasoning_effort' parameter, allowing dynamic adjustment between low-latency and deep reasoning outputs.

How does Mistral Small 4 enhance performance compared to previous models?

Mistral Small 4 demonstrates significant performance enhancements, achieving a 40% reduction in end-to-end completion time in latency-optimized setups. For throughput-optimized deployments, it delivers 3x more requests per second compared to its predecessor, Mistral Small 3. This efficiency is critical for enterprise applications, as it directly impacts operational costs and scalability. Benchmarks like LCR, LiveCodeBench, and AIME 2025 show Mistral Small 4, particularly with its reasoning enabled, matching or surpassing the performance of larger models like GPT-OSS 120B, while generating significantly shorter, and thus more efficient, outputs. This 'performance per token' efficiency translates to lower inference costs and improved user experience.

What is the 'reasoning_effort' parameter and how does it benefit users?

The 'reasoning_effort' parameter in Mistral Small 4 allows users to dynamically adjust the model's computational intensity and output style to match the specific demands of their task. Setting 'reasoning_effort='none'' provides fast, lightweight responses suitable for everyday tasks, akin to the chat style of Mistral Small 3.2. Conversely, 'reasoning_effort='high'' prompts the model to engage in deep, step-by-step reasoning, producing more verbose and thoroughly considered outputs equivalent to previous Magistral models. This configurability provides unprecedented flexibility, enabling developers to optimize for either speed or depth, depending on the complexity and criticality of the problem at hand, thereby enhancing both efficiency and accuracy.

What are the primary intended use cases for Mistral Small 4?

Mistral Small 4 is designed to cater to a broad spectrum of users and applications due to its versatile, unified capabilities. For developers, it's ideal for coding automation, codebase exploration, and implementing sophisticated code agentic workflows. Enterprises can leverage it for general chat assistants, comprehensive document understanding, and advanced multimodal analysis. Researchers will find it invaluable for complex math problems, in-depth research tasks, and intricate reasoning challenges. Its open-source license further encourages fine-tuning and specialization, making it adaptable for almost any domain-specific requirement, ensuring it can power a new generation of AI-driven tools and services.

How can developers and enterprises access Mistral Small 4?

Mistral Small 4 is made broadly accessible through multiple channels. Developers can access it via the Mistral API and AI Studio for direct integration into their applications. It's also available on the Hugging Face Repository, making it easy for the open-source community to engage with and build upon. For those leveraging NVIDIA's ecosystem, prototyping is free on build.nvidia.com, and for production, it's available as an NVIDIA NIM (NVIDIA Inference Microservice), offering optimized, containerized inference. Additionally, it can be customized with NVIDIA NeMo for domain-specific fine-tuning. For enterprise-grade deployments, custom fine-tuning, or on-premises solutions, Mistral AI encourages direct contact with their team to facilitate tailored integration.

What does Mistral Small 4's release signify for open-source AI?

The release of Mistral Small 4 under the Apache 2.0 license strongly reaffirms Mistral AI's deep commitment to the open-source community and accessible AI. By unifying advanced instruct, reasoning, and multimodal capabilities into a single, efficient, and openly available model, Mistral Small 4 lowers barriers to entry for developers and organizations. It simplifies AI integration, allowing for a wider range of tasks to be tackled with a single adaptable tool, directly translating the benefits of open-source AI into real-world applications. This move not only fosters collaboration and innovation but also provides a powerful, versatile foundation upon which the global AI community can build the next generation of intelligent systems, aligning with initiatives like the NVIDIA Nemotron Coalition.

保持更新

将最新AI新闻发送到您的收件箱。