
Amazon Bedrock: Stateful MCP Client Capabilities on AgentCore Runtime

7 min read · AWS

[Diagram: stateful MCP client capabilities on Amazon Bedrock AgentCore Runtime with interactive AI agent flows]

Enhancing AI Agents: The Shift to Stateful MCP on Amazon Bedrock

AI agents are rapidly evolving, yet their full potential has often been hampered by stateless implementations, particularly in scenarios demanding real-time user interaction, dynamic content generation, or ongoing progress updates. Developers building sophisticated AI agents frequently face challenges when workflows need to pause, gather clarification, or report status during long-running operations. The rigid, one-way nature of stateless execution restricts the development of truly interactive and responsive AI applications.

Now, Amazon Bedrock AgentCore Runtime introduces groundbreaking stateful Model Context Protocol (MCP) client capabilities, transforming how AI agents engage with users and large language models (LLMs). This pivotal update liberates agents from the constraints of stateless communication, enabling complex, multi-turn, and highly interactive workflows. By integrating crucial MCP client features – Elicitation, Sampling, and Progress Notifications – Bedrock AgentCore Runtime facilitates bidirectional conversations between MCP servers and clients, paving the way for more intelligent, user-centric AI solutions.

From Stateless to Stateful: Unlocking Interactive Agent Workflows

Previously, MCP server support on AgentCore operated in a stateless mode, where each HTTP request functioned independently, devoid of any shared context. While this simplified deployment for basic tool servers, it severely limited scenarios requiring conversational continuity, mid-workflow user clarification, or real-time progress reporting. The server simply could not maintain a conversation thread across discrete requests, hindering the development of truly interactive agents.

The advent of stateful MCP client capabilities fundamentally alters this paradigm. By setting stateless_http=False during server startup, AgentCore Runtime provisions a dedicated microVM for each user session. This microVM persists for the session's duration—up to 8 hours, or 15 minutes of inactivity per idleRuntimeSessionTimeout setting—ensuring CPU, memory, and filesystem isolation between sessions. Continuity is maintained through a Mcp-Session-Id header, which the server provides during initialization and the client includes in all subsequent requests to route back to the same session. This dedicated, persistent environment allows agents to remember context, solicit user input, generate dynamic LLM content, and provide continuous updates.
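The session-routing mechanism described above can be sketched in a few lines. This is an illustrative snippet, not SDK code: the helper names are hypothetical, and only the `Mcp-Session-Id` header name comes from the protocol itself.

```python
# Sketch of client-side session continuity via the Mcp-Session-Id header.
# The helper functions are illustrative, not part of any MCP SDK.

def capture_session_id(init_response_headers: dict) -> str:
    """Capture the session ID the server returns during initialization."""
    session_id = init_response_headers.get("Mcp-Session-Id")
    if session_id is None:
        raise RuntimeError("Server did not establish a stateful session")
    return session_id

def build_request_headers(session_id: str) -> dict:
    """Attach the same session ID to every follow-up request so AgentCore
    routes it back to the session's dedicated microVM."""
    return {
        "Content-Type": "application/json",
        "Mcp-Session-Id": session_id,
    }

# Example: the server's initialize response carried a session ID,
# and all later requests echo it back.
headers = build_request_headers(
    capture_session_id({"Mcp-Session-Id": "sess-1234"})
)
```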

The following table summarizes the key differences between stateless and stateful modes:

| | Stateless mode | Stateful mode |
| --- | --- | --- |
| `stateless_http` setting | `True` | `False` |
| Session isolation | Dedicated microVM per session | Dedicated microVM per session |
| Session lifetime | Up to 8 hours; 15-min idle timeout | Up to 8 hours; 15-min idle timeout |
| Client capabilities | Not supported | Elicitation, sampling, progress notifications |
| Recommended for | Simple tool serving | Interactive, multi-turn workflows |

When a session expires or the server is restarted, subsequent requests carrying the expired session ID will return a 404. At that point, clients must re-initialize the connection to obtain a new session ID and start a fresh session. The configuration change to enable stateful mode is a single flag in your server startup:

mcp.run(
    transport="streamable-http",
    host="0.0.0.0",
    port=8000,
    stateless_http=False,  # Enable stateful mode
)

Beyond this flag, the three client capabilities become available automatically once the MCP client declares support for them during the initialization handshake.
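To illustrate what that handshake declaration looks like, here is a sketch of a JSON-RPC `initialize` request in which the client advertises the optional capabilities. Field names follow the MCP specification; the protocol version string and client info values are assumptions for the example.

```python
# Illustrative JSON-RPC initialize request. A client that omits
# "elicitation" or "sampling" here is signaling that the server must
# not use those features against it.
initialize_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {
        "protocolVersion": "2025-06-18",  # assumed spec revision
        "capabilities": {
            "elicitation": {},  # client can render elicitation prompts
            "sampling": {},     # client can satisfy sampling/createMessage
        },
        "clientInfo": {"name": "example-client", "version": "0.1.0"},
    },
}
```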

Deep Dive into New Client Capabilities: Elicitation, Sampling, and Progress

With the transition to stateful mode, Amazon Bedrock AgentCore Runtime unlocks three powerful client capabilities from the MCP specification, each designed to address distinct interaction patterns crucial for advanced AI agents. These capabilities transform what was once a rigid, one-way command execution into a fluid, two-way dialogue between an MCP server and its connected clients. It’s important to note that these features are opt-in, meaning clients declare their support during initialization, and servers must only utilize capabilities that the connected client has advertised.

Elicitation: Enabling Dynamic User Input in AI Agents

Elicitation stands as a cornerstone of interactive AI, allowing an MCP server to judiciously pause its execution and request specific, structured input from the user via the client. This capability empowers the tool to ask precise questions at opportune moments within its workflow, whether it's to confirm a decision, gather a user preference, or collect a value derived from preceding operations. The server initiates this by sending an elicitation/create JSON-RPC request, which includes a human-readable message and an optional requestedSchema delineating the expected response structure.

The MCP specification provides two robust modes for elicitation:

  • Form mode: This is ideal for collecting structured data directly through the MCP client, such as configuration parameters, user preferences, or simple confirmations where sensitive data is not involved.
  • URL mode: For interactions that necessitate a secure, out-of-band process, like OAuth flows, payment processing, or the input of sensitive credentials, URL mode directs the user to an external URL. This ensures that sensitive information bypasses the MCP client altogether, enhancing security and compliance.

Upon receiving an elicitation request, the client renders an appropriate input interface. The user’s subsequent action triggers a three-action response model back to the server: accept (user provided the requested data), decline (user explicitly rejected the request), or cancel (user dismissed the prompt without making a choice). Intelligent servers are designed to handle each of these scenarios gracefully, ensuring a robust and user-friendly experience. For instance, an add_expense_interactive tool, as demonstrated in the source material, can guide a user through a series of questions—amount, description, category, and final confirmation—before committing data to a backend like Amazon DynamoDB. Each step leverages Pydantic models to define the expected input, which FastMCP seamlessly converts into the JSON Schema required for the elicitation/create request.
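The exchange described above can be sketched as raw JSON-RPC payloads. This is a minimal, simplified illustration under the MCP specification's field names (`elicitation/create`, `requestedSchema`, the `action`/`content` result shape); the expense question and handler are hypothetical.

```python
# Server -> client: pause and ask the user a structured question.
elicitation_request = {
    "jsonrpc": "2.0",
    "id": 7,
    "method": "elicitation/create",
    "params": {
        "message": "How much was this expense?",
        "requestedSchema": {
            "type": "object",
            "properties": {"amount": {"type": "number"}},
            "required": ["amount"],
        },
    },
}

def handle_elicitation_result(result: dict) -> str:
    """Branch on the three-action response model a robust server handles."""
    action = result.get("action")
    if action == "accept":
        # User supplied the requested data in result["content"].
        return f"recorded: {result['content']}"
    if action == "decline":
        return "user declined; skip this step"
    # "cancel": user dismissed the prompt without choosing.
    return "user cancelled; abort the workflow"
```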

Sampling and Progress Notifications: Boosting LLM Interaction and Transparency

Beyond direct user interaction, Sampling equips the MCP server with the ability to request LLM-generated content directly from the client via sampling/createMessage. This is a critical mechanism as it allows tool logic on the server to harness powerful language model capabilities without needing to manage its own LLM credentials or direct API integrations. The server simply provides a prompt and optional model preferences, and the client, acting as an intermediary, forwards the request to its connected LLM and returns the generated response. This opens up a myriad of practical applications, including crafting personalized summaries, generating natural-language explanations from structured data, or producing context-aware recommendations based on the ongoing conversation.
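A sampling exchange can be sketched the same way. The request shape below follows the MCP specification for `sampling/createMessage`; the prompt text and preference values are illustrative assumptions.

```python
# Server -> client: ask the client's connected LLM to generate content.
# The server supplies only a prompt and optional model preferences;
# it never touches LLM credentials itself.
sampling_request = {
    "jsonrpc": "2.0",
    "id": 9,
    "method": "sampling/createMessage",
    "params": {
        "messages": [
            {
                "role": "user",
                "content": {
                    "type": "text",
                    "text": "Summarize this month's expenses in one sentence.",
                },
            }
        ],
        "modelPreferences": {"intelligencePriority": 0.8},  # assumed weighting
        "maxTokens": 200,
    },
}
```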

For operations that extend over time, Progress Notifications become invaluable. This capability allows an MCP server to report incremental updates during long-running tasks. By utilizing ctx.report_progress(progress, total), the server can emit continuous updates that clients can translate into visual feedback, such as a progress bar or a status indicator. Whether it's searching across vast data sources or executing complex computational tasks, transparent progress updates ensure users remain informed, preventing frustration and enhancing the overall user experience, rather than leaving them staring at a blank screen wondering if the system is still active.
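Under the hood, a call like `ctx.report_progress(progress, total)` results in a progress notification on the wire. The sketch below shows the notification shape from the MCP specification; the token value and scan scenario are hypothetical.

```python
# Sketch of the notifications/progress message a server emits for a
# progress token the client supplied with its request.
def progress_notification(token: str, progress: float, total: float) -> dict:
    return {
        "jsonrpc": "2.0",
        "method": "notifications/progress",
        "params": {
            "progressToken": token,
            "progress": progress,
            "total": total,
        },
    }

# e.g. halfway through scanning 10 data sources
note = progress_notification("scan-1", 5, 10)
```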

Future-Proofing AI Agent Development with Bedrock AgentCore Runtime

The introduction of stateful MCP client capabilities on Amazon Bedrock AgentCore Runtime represents a significant leap forward in AI agent development. By transforming previously stateless interactions into dynamic, bidirectional conversations, AWS empowers developers to build more intelligent, responsive, and user-friendly AI applications. These capabilities – Elicitation for guided user input, Sampling for on-demand LLM generation, and Progress Notifications for real-time transparency – collectively unlock a new era of interactive agent workflows. As AI continues to evolve, these foundational capabilities will be crucial for building and operationalizing sophisticated agentic AI that can seamlessly integrate into complex business processes, adapt to user needs, and deliver exceptional value.

Frequently Asked Questions

What problem do stateful MCP client capabilities solve on Amazon Bedrock AgentCore Runtime?
Stateful Model Context Protocol (MCP) client capabilities on Amazon Bedrock AgentCore Runtime address the critical limitations of previous stateless AI agent implementations. Stateless agents struggled with interactive, multi-turn workflows, as they couldn't pause mid-execution to solicit user input for clarification, request dynamic large language model (LLM)-generated content, or provide real-time progress updates during lengthy operations. Each request was independent, lacking shared context. This new feature fundamentally transforms agent interactions by enabling bidirectional conversations, allowing agents to maintain conversational threads, gather necessary input precisely when needed, generate dynamic content on the fly, and transparently inform users about ongoing processes. This leads to the development of significantly more responsive, intelligent, and user-centric AI applications capable of complex, adaptive workflows.
How does the transition from stateless to stateful mode work on AgentCore Runtime?
The transition to stateful mode within Amazon Bedrock AgentCore Runtime is initiated by a simple configuration adjustment: setting `stateless_http=False` when starting your MCP server. Once enabled, AgentCore Runtime provisions a dedicated microVM for each individual user session. This microVM is designed for persistence throughout the session's duration, which can last up to 8 hours or expire after 15 minutes of inactivity, ensuring isolated CPU, memory, and filesystem resources for each session. Continuity across interactions is maintained through a unique `Mcp-Session-Id` header. This ID is established during the initial handshake and subsequently included by the client in all follow-up requests, ensuring they are accurately routed back to the correct, persistent session, thereby preserving context and enabling complex, interactive dialogues.
What is Elicitation, and how does it enhance AI agent interactions?
Elicitation is a powerful stateful MCP capability that allows an AI agent (acting as the MCP server) to intelligently pause its ongoing execution and request specific, structured input directly from the user via the client. This significantly enhances interactive agent workflows by enabling agents to ask targeted questions at precise, opportune moments within their operational flow. For example, an agent might use elicitation to confirm a decision, gather user preferences, or collect particular data values that are contingent on preceding steps. The feature supports two robust modes: 'Form mode' for direct structured data collection through the MCP client, and 'URL mode' for secure, out-of-band interactions that require directing the user to an external URL (e.g., for OAuth or sensitive credential entry). The user's response – whether accepting, declining, or canceling the request – is then returned to the server, allowing the agent to dynamically adapt its workflow based on real-time human feedback.
How does Sampling capability benefit AI agents without managing LLM credentials?
Sampling equips the MCP server with the ability to request sophisticated large language model (LLM)-generated content directly from the client using the `sampling/createMessage` mechanism. A key benefit is that the MCP server itself does not need to manage its own LLM credentials, API keys, or direct integrations with various LLM providers. Instead, the server simply provides a well-formed prompt and any optional model preferences to the client. The client then acts as an intelligent intermediary, forwarding this request to its connected LLM and returning the generated response back to the server. This abstraction allows AI agents to seamlessly leverage powerful language model capabilities for tasks such as crafting personalized summaries, generating natural-language explanations from complex structured data, or producing context-aware recommendations, all while simplifying the operational overhead and security concerns associated with LLM management on the server side.
