What problem do stateful MCP client capabilities solve on Amazon Bedrock AgentCore Runtime?

Stateful Model Context Protocol (MCP) client capabilities on Amazon Bedrock AgentCore Runtime address the critical limitations of previous stateless AI agent implementations. Stateless agents struggled with interactive, multi-turn workflows, as they couldn't pause mid-execution to solicit user input for clarification, request dynamic large language model (LLM)-generated content, or provide real-time progress updates during lengthy operations. Each request was independent, lacking shared context. This new feature fundamentally transforms agent interactions by enabling bidirectional conversations, allowing agents to maintain conversational threads, gather necessary input precisely when needed, generate dynamic content on the fly, and transparently inform users about ongoing processes. This leads to the development of significantly more responsive, intelligent, and user-centric AI applications capable of complex, adaptive workflows.

How does the transition from stateless to stateful mode work on AgentCore Runtime?

The transition to stateful mode within Amazon Bedrock AgentCore Runtime is initiated by a simple configuration adjustment: setting `stateless_http=False` when starting your MCP server. Once enabled, AgentCore Runtime provisions a dedicated microVM for each individual user session. This microVM is designed for persistence throughout the session's duration, which can last up to 8 hours or expire after 15 minutes of inactivity, ensuring isolated CPU, memory, and filesystem resources for each session. Continuity across interactions is maintained through a unique `Mcp-Session-Id` header. This ID is established during the initial handshake and subsequently included by the client in all follow-up requests, ensuring they are accurately routed back to the correct, persistent session, thereby preserving context and enabling complex, interactive dialogues.

What is Elicitation, and how does it enhance AI agent interactions?

Elicitation is a powerful stateful MCP capability that allows an AI agent (acting as the MCP server) to intelligently pause its ongoing execution and request specific, structured input directly from the user via the client. This significantly enhances interactive agent workflows by enabling agents to ask targeted questions at precise, opportune moments within their operational flow. For example, an agent might use elicitation to confirm a decision, gather user preferences, or collect particular data values that are contingent on preceding steps. The feature supports two robust modes: 'Form mode' for direct structured data collection through the MCP client, and 'URL mode' for secure, out-of-band interactions that require directing the user to an external URL (e.g., for OAuth or sensitive credential entry). The user's response – whether accepting, declining, or canceling the request – is then returned to the server, allowing the agent to dynamically adapt its workflow based on real-time human feedback.

How does Sampling capability benefit AI agents without managing LLM credentials?

Sampling equips the MCP server with the ability to request sophisticated large language model (LLM)-generated content directly from the client using the `sampling/createMessage` mechanism. A key benefit is that the MCP server itself does not need to manage its own LLM credentials, API keys, or direct integrations with various LLM providers. Instead, the server simply provides a well-formed prompt and any optional model preferences to the client. The client then acts as an intelligent intermediary, forwarding this request to its connected LLM and returning the generated response back to the server. This abstraction allows AI agents to seamlessly leverage powerful language model capabilities for tasks such as crafting personalized summaries, generating natural-language explanations from complex structured data, or producing context-aware recommendations, all while simplifying the operational overhead and security concerns associated with LLM management on the server side.

Amazon Bedrock: AgentCore 런타임의 상태 저장 MCP 클라이언트 기능

이 플래그 외에도 MCP 클라이언트가 초기화 핸드셰이크 중에 지원을 선언하면 세 가지 클라이언트 기능이 자동으로 제공됩니다.

새로운 클라이언트 기능 심층 분석: 유도, 샘플링 및 진행 상황

상태 저장 모드로 전환하면서 Amazon Bedrock AgentCore Runtime은 MCP 사양의 세 가지 강력한 클라이언트 기능을 잠금 해제하며, 각 기능은 고급 AI 에이전트에 중요한 고유한 상호 작용 패턴을 해결하도록 설계되었습니다. 이러한 기능은 이전에 경직된 단방향 명령 실행이었던 것을 MCP 서버와 연결된 클라이언트 간의 유동적인 양방향 대화로 변화시킵니다. 이러한 기능은 선택 사항이며, 클라이언트가 초기화 중에 지원을 선언하고 서버는 연결된 클라이언트가 광고한 기능만 사용해야 한다는 점에 유의하는 것이 중요합니다.

유도(Elicitation): AI 에이전트에서 동적 사용자 입력 활성화

**유도(Elicitation)**는 대화형 AI의 초석으로서, MCP 서버가 실행을 신중하게 일시 중지하고 클라이언트를 통해 사용자로부터 특정 구조화된 입력을 요청할 수 있도록 합니다. 이 기능은 도구가 의사 결정을 확인하거나, 사용자 선호도를 수집하거나, 선행 작업에서 파생된 값을 수집하는 등 워크플로우 내에서 적절한 순간에 정확한 질문을 할 수 있도록 합니다. 서버는 사람이 읽을 수 있는 메시지와 예상 응답 구조를 설명하는 선택적 requestedSchema를 포함하는 elicitation/create JSON-RPC 요청을 전송하여 이를 시작합니다.

MCP 사양은 유도를 위한 두 가지 강력한 모드를 제공합니다.

폼 모드: 구성 매개변수, 사용자 선호도 또는 민감한 데이터가 관련되지 않은 간단한 확인과 같이 MCP 클라이언트를 통해 직접 구조화된 데이터를 수집하는 데 이상적입니다.
URL 모드: OAuth 흐름, 결제 처리 또는 민감한 자격 증명 입력과 같이 안전한 대역 외 프로세스가 필요한 상호 작용의 경우, URL 모드는 사용자를 외부 URL로 안내합니다. 이를 통해 민감한 정보는 MCP 클라이언트를 완전히 우회하여 보안 및 규정 준수를 강화합니다.

유도 요청을 받으면 클라이언트는 적절한 입력 인터페이스를 렌더링합니다. 사용자의 후속 작업은 서버로 다시 세 가지 응답 모델을 트리거합니다. accept(사용자가 요청된 데이터를 제공), decline(사용자가 요청을 명시적으로 거부), cancel(사용자가 선택하지 않고 프롬프트를 닫음). 지능형 서버는 이러한 각 시나리오를 원활하게 처리하도록 설계되어 견고하고 사용자 친화적인 경험을 보장합니다. 예를 들어, 소스 자료에 시연된 add_expense_interactive 도구는 Amazon DynamoDB와 같은 백엔드에 데이터를 커밋하기 전에 금액, 설명, 범주 및 최종 확인과 같은 일련의 질문을 통해 사용자를 안내할 수 있습니다. 각 단계는 Pydantic 모델을 활용하여 예상 입력을 정의하며, FastMCP는 이를 elicitation/create 요청에 필요한 JSON 스키마로 원활하게 변환합니다.

샘플링 및 진행 알림: LLM 상호 작용 및 투명성 향상

직접적인 사용자 상호 작용 외에도 **샘플링(Sampling)**은 sampling/createMessage를 통해 클라이언트로부터 직접 LLM이 생성한 콘텐츠를 요청할 수 있는 기능을 MCP 서버에 제공합니다. 이는 서버의 도구 로직이 자체 LLM 자격 증명이나 직접적인 API 통합을 관리할 필요 없이 강력한 언어 모델 기능을 활용할 수 있도록 하는 중요한 메커니즘입니다. 서버는 단순히 프롬프트와 선택적 모델 선호도를 제공하며, 중개자 역할을 하는 클라이언트는 요청을 연결된 LLM에 전달하고 생성된 응답을 반환합니다. 이는 개인화된 요약 작성, 구조화된 데이터에서 자연어 설명 생성, 진행 중인 대화를 기반으로 한 상황 인식 추천 생성과 같은 다양한 실용적인 애플리케이션을 가능하게 합니다.

시간이 오래 걸리는 작업의 경우 **진행 알림(Progress Notifications)**은 매우 중요합니다. 이 기능은 MCP 서버가 장기 실행 작업 중에 점진적인 업데이트를 보고할 수 있도록 합니다. ctx.report_progress(progress, total)을 사용함으로써 서버는 클라이언트가 진행률 표시줄 또는 상태 표시기와 같은 시각적 피드백으로 변환할 수 있는 지속적인 업데이트를 보낼 수 있습니다. 방대한 데이터 소스를 검색하든 복잡한 계산 작업을 실행하든, 투명한 진행 상황 업데이트는 사용자가 시스템이 여전히 활성 상태인지 궁금해하며 빈 화면을 응시하게 하는 대신 정보를 계속 인지하도록 보장하여 좌절감을 방지하고 전반적인 사용자 경험을 향상시킵니다.

Bedrock AgentCore 런타임으로 AI 에이전트 개발 미래 보장

Amazon Bedrock AgentCore 런타임에 상태 저장 MCP 클라이언트 기능이 도입된 것은 AI 에이전트 개발의 중요한 도약을 의미합니다. 이전에 상태 비저장이었던 상호 작용을 동적이고 양방향 대화로 전환함으로써 AWS는 개발자가 더욱 지능적이고 반응적이며 사용자 친화적인 AI 애플리케이션을 구축할 수 있도록 지원합니다. 안내된 사용자 입력을 위한 유도(Elicitation), 주문형 LLM 생성을 위한 샘플링(Sampling), 실시간 투명성을 위한 진행 알림(Progress Notifications)과 같은 이러한 기능들은 대화형 에이전트 워크플로우의 새로운 시대를 열어줍니다. AI가 계속 진화함에 따라 이러한 기반 기능은 복잡한 비즈니스 프로세스에 원활하게 통합되고, 사용자 요구에 적응하며, 탁월한 가치를 제공할 수 있는 정교한 에이전트 AI 운영을 생성하는 데 중요할 것입니다.