Enhancing AI Agents: The Shift to Stateful MCP on Amazon Bedrock
AI agents are rapidly evolving, yet their full potential has often been hampered by stateless implementations, particularly in scenarios demanding real-time user interaction, dynamic content generation, or ongoing progress updates. Developers building sophisticated AI agents frequently face challenges when workflows need to pause, gather clarification, or report status during long-running operations. The rigid, one-way nature of stateless execution restricts the development of truly interactive and responsive AI applications.
Now, Amazon Bedrock AgentCore Runtime introduces groundbreaking stateful Model Context Protocol (MCP) client capabilities, transforming how AI agents engage with users and large language models (LLMs). This pivotal update liberates agents from the constraints of stateless communication, enabling complex, multi-turn, and highly interactive workflows. By integrating crucial MCP client features – Elicitation, Sampling, and Progress Notifications – Bedrock AgentCore Runtime facilitates bidirectional conversations between MCP servers and clients, paving the way for more intelligent, user-centric AI solutions.
From Stateless to Stateful: Unlocking Interactive Agent Workflows
Previously, MCP server support on AgentCore operated in a stateless mode, where each HTTP request functioned independently, devoid of any shared context. While this simplified deployment for basic tool servers, it severely limited scenarios requiring conversational continuity, mid-workflow user clarification, or real-time progress reporting. The server simply could not maintain a conversation thread across discrete requests, hindering the development of truly interactive agents.
The advent of stateful MCP client capabilities fundamentally alters this paradigm. By setting stateless_http=False during server startup, AgentCore Runtime provisions a dedicated microVM for each user session. This microVM persists for the session's duration (up to 8 hours, or until 15 minutes of inactivity, governed by the idleRuntimeSessionTimeout setting) and provides CPU, memory, and filesystem isolation between sessions. Continuity is maintained through an Mcp-Session-Id header, which the server provides during initialization and the client includes in all subsequent requests to route back to the same session. This dedicated, persistent environment allows agents to remember context, solicit user input, generate dynamic LLM content, and provide continuous updates.
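The handshake described above can be sketched as raw JSON-RPC payloads. This is an illustrative hand-built sketch, not SDK code: real clients use an MCP SDK, and the client name, version, and session ID shown are placeholders.

```python
# Sketch of the Mcp-Session-Id handshake (illustrative; a real client
# would use an MCP SDK rather than hand-built payloads).

def initialize_request() -> dict:
    """JSON-RPC `initialize` request the client sends first. The client
    declares optional capabilities (elicitation, sampling) here."""
    return {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            "protocolVersion": "2025-06-18",
            "capabilities": {"elicitation": {}, "sampling": {}},
            "clientInfo": {"name": "demo-client", "version": "0.1"},
        },
    }

def follow_up_headers(session_id: str) -> dict:
    """Every request after initialization carries the session ID so
    AgentCore can route it back to the same microVM."""
    return {
        "Content-Type": "application/json",
        "Mcp-Session-Id": session_id,
    }

# The server returns the session ID as a response header on initialize;
# the client simply echoes it back on each subsequent request.
headers = follow_up_headers("example-session-id")
```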
The following table summarizes the key differences between stateless and stateful modes:
| | Stateless mode | Stateful mode |
|---|---|---|
| `stateless_http` setting | `True` | `False` |
| Session isolation | Dedicated microVM per session | Dedicated microVM per session |
| Session lifetime | Up to 8 hours; 15-min idle timeout | Up to 8 hours; 15-min idle timeout |
| Client capabilities | Not supported | Elicitation, sampling, progress notifications |
| Recommended for | Simple tool serving | Interactive, multi-turn workflows |
When a session expires or the server is restarted, subsequent requests with the expired session ID return a 404. At that point, clients must re-initialize the connection to obtain a new session ID and start a fresh session. The configuration change to enable stateful mode is a single flag in your server startup:
```python
mcp.run(
    transport="streamable-http",
    host="0.0.0.0",
    port=8000,
    stateless_http=False,  # Enable stateful mode
)
```
Beyond this flag, the three client capabilities become available automatically once the MCP client declares support for them during the initialization handshake.
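The session-expiry behavior described earlier (a 404 on an expired session ID, followed by re-initialization) can be sketched as a small retry helper. The `send` callable and its `(status, headers, body)` return shape are hypothetical stand-ins for whatever HTTP transport your client uses.

```python
# Minimal sketch of session-expiry handling: when a request with a known
# session ID gets a 404, re-initialize to obtain a fresh session, then retry.
# `send(payload, session_id=...)` is a hypothetical transport callable
# returning (status, headers, body).

def call_with_session(send, state: dict, payload: dict):
    status, headers, body = send(payload, session_id=state.get("session_id"))
    if status == 404:
        # Session expired or server restarted: start a fresh session.
        status, headers, body = send(
            {"jsonrpc": "2.0", "id": 1, "method": "initialize", "params": {}},
            session_id=None,
        )
        state["session_id"] = headers["Mcp-Session-Id"]
        status, headers, body = send(payload, session_id=state["session_id"])
    return body
```

In practice an MCP SDK handles this loop for you; the sketch only makes the 404-then-reinitialize contract explicit.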
Deep Dive into New Client Capabilities: Elicitation, Sampling, and Progress
With the transition to stateful mode, Amazon Bedrock AgentCore Runtime unlocks three powerful client capabilities from the MCP specification, each designed to address distinct interaction patterns crucial for advanced AI agents. These capabilities transform what was once a rigid, one-way command execution into a fluid, two-way dialogue between an MCP server and its connected clients. It’s important to note that these features are opt-in, meaning clients declare their support during initialization, and servers must only utilize capabilities that the connected client has advertised.
Elicitation: Enabling Dynamic User Input in AI Agents
Elicitation stands as a cornerstone of interactive AI, allowing an MCP server to judiciously pause its execution and request specific, structured input from the user via the client. This capability empowers the tool to ask precise questions at opportune moments within its workflow, whether it's to confirm a decision, gather a user preference, or collect a value derived from preceding operations. The server initiates this by sending an elicitation/create JSON-RPC request, which includes a human-readable message and an optional requestedSchema delineating the expected response structure.
The MCP specification provides two robust modes for elicitation:
- Form mode: This is ideal for collecting structured data directly through the MCP client, such as configuration parameters, user preferences, or simple confirmations where sensitive data is not involved.
- URL mode: For interactions that necessitate a secure, out-of-band process, like OAuth flows, payment processing, or the input of sensitive credentials, URL mode directs the user to an external URL. This ensures that sensitive information bypasses the MCP client altogether, enhancing security and compliance.
Upon receiving an elicitation request, the client renders an appropriate input interface. The user’s subsequent action triggers a three-action response model back to the server: accept (user provided the requested data), decline (user explicitly rejected the request), or cancel (user dismissed the prompt without making a choice). Intelligent servers are designed to handle each of these scenarios gracefully, ensuring a robust and user-friendly experience. For instance, an add_expense_interactive tool, as demonstrated in the source material, can guide a user through a series of questions—amount, description, category, and final confirmation—before committing data to a backend like Amazon DynamoDB. Each step leverages Pydantic models to define the expected input, which FastMCP seamlessly converts into the JSON Schema required for the elicitation/create request.
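The elicitation round trip above can be made concrete as raw JSON-RPC payloads. This is a hand-built sketch for illustration; a real server would rely on FastMCP's elicitation helpers rather than constructing these messages itself, and the expense schema shown is an assumed example.

```python
# Illustrative sketch of the elicitation round trip: the server's
# `elicitation/create` request and the three-action response model.

def elicitation_request(message: str, schema: dict, req_id: int = 2) -> dict:
    """`elicitation/create` request the server sends to the client,
    with a human-readable message and a requestedSchema."""
    return {
        "jsonrpc": "2.0",
        "id": req_id,
        "method": "elicitation/create",
        "params": {"message": message, "requestedSchema": schema},
    }

def handle_response(result: dict) -> str:
    """Three-action response model: accept, decline, or cancel."""
    action = result.get("action")
    if action == "accept":
        return f"got data: {result.get('content')}"
    if action == "decline":
        return "user rejected the request"
    return "user dismissed the prompt"

# Example: ask for an expense amount (assumed schema for illustration).
req = elicitation_request(
    "How much was the expense?",
    {"type": "object",
     "properties": {"amount": {"type": "number"}},
     "required": ["amount"]},
)
```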
Sampling and Progress Notifications: Boosting LLM Interaction and Transparency
Beyond direct user interaction, Sampling equips the MCP server with the ability to request LLM-generated content directly from the client via sampling/createMessage. This is a critical mechanism as it allows tool logic on the server to harness powerful language model capabilities without needing to manage its own LLM credentials or direct API integrations. The server simply provides a prompt and optional model preferences, and the client, acting as an intermediary, forwards the request to its connected LLM and returns the generated response. This opens up a myriad of practical applications, including crafting personalized summaries, generating natural-language explanations from structured data, or producing context-aware recommendations based on the ongoing conversation.
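A `sampling/createMessage` request shaped per the MCP specification looks roughly like the following. This is a hand-built sketch; server frameworks expose helpers for this, and the prompt, token limit, and model hint are assumed example values.

```python
# Sketch of a `sampling/createMessage` request the server sends to the
# client, which forwards it to its connected LLM.

def sampling_request(prompt: str, req_id: int = 3) -> dict:
    return {
        "jsonrpc": "2.0",
        "id": req_id,
        "method": "sampling/createMessage",
        "params": {
            "messages": [{
                "role": "user",
                "content": {"type": "text", "text": prompt},
            }],
            "maxTokens": 200,
            # Optional preferences; the client still picks the actual model.
            "modelPreferences": {"hints": [{"name": "claude"}]},
        },
    }

req = sampling_request("Summarize this month's expenses in one sentence.")
```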
For operations that extend over time, Progress Notifications become invaluable. This capability allows an MCP server to report incremental updates during long-running tasks. By utilizing ctx.report_progress(progress, total), the server can emit continuous updates that clients can translate into visual feedback, such as a progress bar or a status indicator. Whether it's searching across vast data sources or executing complex computational tasks, transparent progress updates ensure users remain informed, preventing frustration and enhancing the overall user experience, rather than leaving them staring at a blank screen wondering if the system is still active.
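Under the hood, each ctx.report_progress(progress, total) call emits a notification shaped roughly like the following sketch of the wire format. The progress token is supplied by the client in the original request's metadata; the value shown here is a placeholder.

```python
# Sketch of the `notifications/progress` message emitted for each
# progress update during a long-running tool call.

def progress_notification(token, progress: float, total: float) -> dict:
    return {
        "jsonrpc": "2.0",
        "method": "notifications/progress",
        "params": {
            "progressToken": token,  # echoed from the request's metadata
            "progress": progress,
            "total": total,
        },
    }

# E.g., step 3 of 10 in a long search; clients can render this as a
# progress bar (3/10 = 30%).
note = progress_notification("example-token", 3, 10)
```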
Future-Proofing AI Agent Development with Bedrock AgentCore Runtime
The introduction of stateful MCP client capabilities on Amazon Bedrock AgentCore Runtime represents a significant leap forward in AI agent development. By transforming previously stateless interactions into dynamic, bidirectional conversations, AWS empowers developers to build more intelligent, responsive, and user-friendly AI applications. These capabilities – Elicitation for guided user input, Sampling for on-demand LLM generation, and Progress Notifications for real-time transparency – collectively unlock a new era of interactive agent workflows. As AI continues to evolve, these foundational capabilities will be crucial for creating and operationalizing sophisticated agentic AI that can seamlessly integrate into complex business processes, adapt to user needs, and deliver exceptional value.
Original source
https://aws.amazon.com/blogs/machine-learning/introducing-stateful-mcp-client-capabilities-on-amazon-bedrock-agentcore-runtime/
