Bedrock AgentCore: Embed Live AI Browser Agent in React

[Figure: Amazon Bedrock AgentCore architecture diagram showing the data flow for embedding a live AI browser agent in a React app.]

The era of AI agents operating as opaque "black boxes" is rapidly drawing to a close. As artificial intelligence takes on increasingly complex and autonomous tasks, particularly in web environments, the demand for transparency, trust, and user control has never been higher. Users need to understand and verify the actions an AI agent takes on their behalf, especially when those actions involve navigating websites, interacting with sensitive data, or executing critical workflows.

Addressing this fundamental challenge, Amazon Web Services introduces a powerful solution: the Amazon Bedrock AgentCore BrowserLiveView component. This innovative tool empowers developers to embed a live, real-time video feed of an AI agent's browsing session directly within their React applications. This integration not only demystifies agent behavior but also provides users with unprecedented visibility and a crucial sense of control.

Part of the Bedrock AgentCore TypeScript SDK, the BrowserLiveView component simplifies the integration of live browser streams into your application with just a few lines of JavaScript XML (JSX). Utilizing the high-performance Amazon DCV protocol, it renders the agent's session, effectively transforming a traditionally hidden process into a visually verifiable experience. This article will guide you through the process, from starting a session and generating the Live View URL, to rendering the stream in your React application, and finally, wiring up an AI agent to drive the browser as your users watch.

Enhancing AI Agent Transparency with BrowserLiveView

In today's rapidly evolving landscape of Agentic AI, the ability to delegate web browsing tasks to AI agents promises immense efficiency gains. However, this promise is often tempered by concerns about agent reliability, accuracy, and security. Without a clear window into an agent's operations, users are left to trust a system they cannot observe, which can hinder adoption and limit deployment in sensitive scenarios.

The BrowserLiveView component directly confronts this challenge by opening up the AI agent's "eyes" to the user. Imagine an AI agent tasked with filling out a complex online form or retrieving specific information from multiple websites. Traditionally, the user would only receive a final output or a summary of actions, leaving them to wonder about the intermediary steps. With an embedded Live View, users can follow every navigation, form submission, and search query in real-time as the agent performs it.

This immediate visual confirmation is invaluable. It reassures users that the agent is on the correct page, interacting with the right elements, and progressing through the workflow as intended. This real-time feedback loop goes beyond mere text confirmations; it provides a tangible, verifiable audit trail of the agent's behavior, fostering greater confidence and trust. For workflows that are regulated or involve sensitive customer data, this visual evidence can be critical for compliance and accountability. Furthermore, in scenarios requiring human oversight, a supervisor can directly observe the agent's actions within the application, intervening if necessary, without disrupting the flow.

The Architecture Behind Real-time Agent Observation

The seamless integration of a live AI browser agent within a React application powered by Amazon Bedrock AgentCore is orchestrated through a sophisticated yet efficient architecture comprising three key components. Understanding the interplay between these parts is crucial for successful implementation and deployment.

| Component | Role in Architecture | Key Technologies/Protocols |
| --- | --- | --- |
| User's Web Browser | Runs the React application containing the BrowserLiveView component; establishes a WebSocket connection to receive the DCV stream; handles video rendering. | React, BrowserLiveView component, WebSocket, Amazon DCV |
| Application Server | Functions as the AI agent; orchestrates connections; starts sessions via the Bedrock AgentCore API; generates SigV4-presigned URLs; handles session management and authentication. | Node.js (or similar), Amazon Bedrock AgentCore API, SigV4 URLs, REST, AI model logic |
| AWS Cloud (Bedrock AgentCore & Services) | Hosts isolated cloud browser sessions; provides the browser automation endpoint (Playwright CDP); delivers the Live View streaming endpoint (DCV). | Amazon Bedrock AgentCore, Amazon Bedrock, Playwright CDP, Amazon DCV |

The interaction begins with the user's web browser, which runs your React application. Within this application, the BrowserLiveView component is rendered, awaiting a secure, time-limited SigV4-presigned URL. This URL, generated by your application server, is the key to establishing a persistent WebSocket connection directly to the Bedrock AgentCore service in the AWS Cloud.

Your application server serves a dual purpose: it hosts the AI agent's logic and acts as an intermediary for session management. It calls the Amazon Bedrock AgentCore API to initiate browser sessions and then securely generates the SigV4-presigned URLs that grant your client browser access to the Live View stream. Crucially, while your server orchestrates the agent's actions and generates the necessary credentials, it does not directly handle the video stream itself.

The heavy lifting of browser automation and video streaming occurs within the AWS Cloud. Amazon Bedrock AgentCore hosts isolated cloud browser sessions, providing both the automation endpoint—which your AI agent interacts with using the Playwright Chrome DevTools Protocol (CDP)—and the Live View streaming endpoint, powered by Amazon DCV. This design ensures that the DCV Live View stream flows directly from Amazon Bedrock AgentCore to the user’s browser. This direct WebSocket connection bypasses your application server, minimizing latency, reducing your infrastructure overhead, and ensuring a smooth, real-time viewing experience.

Prerequisites for Deploying Your Live AI Agent

Before diving into the implementation of your live AI browser agent, it's essential to ensure your development environment is correctly configured and you have the necessary AWS resources and permissions. Adhering to these prerequisites will streamline your development process and help maintain a secure operational posture.

  1. Node.js Environment: You'll need Node.js version 20 or later installed on your system for running the server-side components of your application.
  2. AWS Account and Region: An active AWS account is required, with access to a supported AWS Region where Amazon Bedrock AgentCore is available.
  3. AWS Credentials and Permissions: Your AWS credentials must have the appropriate Amazon Bedrock AgentCore Browser permissions. It's crucial to follow the principle of least privilege when configuring AWS Identity and Access Management (IAM) permissions. For enhanced security, always use temporary credentials obtained from AWS IAM Identity Center or AWS Security Token Service (AWS STS), avoiding long-lived access keys.
  4. AI Model Access: If you plan to use an AI model to drive the agent (as demonstrated in the sample, which uses the Amazon Bedrock Converse API with Anthropic Claude), you'll need access to that specific model and any associated AWS Bedrock permissions. However, remember that Live View itself is model-agnostic, allowing you to integrate any model provider or agent framework of your choice.
  5. SDK Installations:
    • Install the bedrock-agentcore TypeScript SDK for interacting with AgentCore:
      npm install bedrock-agentcore
      
    • If you're using AWS Bedrock for your AI model, install the AWS SDK for JavaScript:
      npm install @aws-sdk/client-bedrock-runtime
      

The code base for implementing Live View is typically split: server-side code (for session management and AI agent logic) runs in Node.js, and client-side code (for rendering the Live View) runs within a React application, often bundled with tools like Vite.
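
Because the client and server run as separate processes during development, the React dev server needs to forward API calls to the Node backend. Here is a minimal sketch of that wiring, assuming Vite; the port number and the /api path prefix are illustrative choices, not values mandated by the SDK:

// vite.config.ts — forward API calls to the Node backend during development
import { defineConfig } from 'vite';
import react from '@vitejs/plugin-react';

export default defineConfig({
    plugins: [react()],
    server: {
        // Proxy /api/* requests to the application server (port is illustrative)
        proxy: {
            '/api': 'http://localhost:3000',
        },
    },
});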

Step-by-Step Integration: From Session to Stream

Integrating a live AI browser agent with Amazon Bedrock AgentCore involves a clear, three-step process, bridging your server-side logic with your client-side React application and the robust capabilities of AWS Cloud.

1. Starting a Browser Session and Generating the Live View URL

The first step occurs on your application server. This is where your backend logic initiates a browser session within Amazon Bedrock AgentCore and securely obtains the necessary URL to stream the live view.

You'll use the Browser class from the bedrock-agentcore SDK. This class handles the complexity of creating and managing isolated browser environments in the cloud. The key output from this step is a SigV4-presigned URL, which grants secure, temporary access to the live video stream of the browser session.

// Example server-side code (Node.js)
import { Browser } from 'bedrock-agentcore';
import { AgentCoreClient } from '@aws-sdk/client-bedrock-agentcore';

// Initialize Bedrock AgentCore client (ensure proper AWS credentials are configured)
const agentCoreClient = new AgentCoreClient({ region: 'us-east-1' }); // Use your desired region

async function startLiveSession() {
    // Create a new browser session
    const browser = new Browser(agentCoreClient);
    await browser.create();

    // Generate the Live View URL
    const liveViewUrl = await browser.getLiveViewURL();
    console.log('Live View URL:', liveViewUrl);

    // Store browser.sessionId to later connect your AI agent or terminate the session
    const sessionId = browser.sessionId;
    
    return { liveViewUrl, sessionId };
}

// This `liveViewUrl` will be sent to your React client.

This URL is then passed to your React frontend, which will use it to establish the live stream.
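
On the wire, this hand-off is just an HTTP endpoint. Here is a minimal sketch, assuming Express and the startLiveSession function from the snippet above; the /api/start-agent-session route name is chosen to match the fetch call shown in the React example in step 2:

// Example Express endpoint exposing the Live View URL to the React client
import express from 'express';

const app = express();

app.get('/api/start-agent-session', async (_req, res) => {
    try {
        // startLiveSession is defined in the snippet above
        const { liveViewUrl, sessionId } = await startLiveSession();
        // Only the presigned URL reaches the client; AWS credentials stay server-side
        res.json({ liveViewUrl, sessionId });
    } catch (err) {
        res.status(500).json({ error: 'Failed to start browser session' });
    }
});

app.listen(3000, () => console.log('Agent server listening on port 3000'));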

2. Rendering the Live View in Your React Application

Once your React application receives the liveViewUrl from your server, rendering the real-time stream is remarkably straightforward, thanks to the BrowserLiveView component.

// Example client-side code (React component)
import React, { useEffect, useState } from 'react';
import { BrowserLiveView } from 'bedrock-agentcore';

interface LiveAgentViewerProps {
    // Nullable: the URL arrives asynchronously from the server
    liveViewUrl: string | null;
}

const LiveAgentViewer: React.FC<LiveAgentViewerProps> = ({ liveViewUrl }) => {
    if (!liveViewUrl) {
        return <p>Waiting for Live View URL...</p>;
    }

    return (
        <div style={{ width: '100%', height: '600px', border: '1px solid #ccc' }}>
            <BrowserLiveView url={liveViewUrl} />
        </div>
    );
};

// In your main App component or page:
// const MyPage = () => {
//     const [currentLiveViewUrl, setCurrentLiveViewUrl] = useState<string | null>(null);
//
//     useEffect(() => {
//         // Fetch the liveViewUrl from your backend
//         fetch('/api/start-agent-session')
//             .then(res => res.json())
//             .then(data => setCurrentLiveViewUrl(data.liveViewUrl));
//     }, []);
//
//     return (
//         <div>
//             <h1>AI Agent Live View</h1>
//             <LiveAgentViewer liveViewUrl={currentLiveViewUrl} />
//         </div>
//     );
// };

With just url={liveViewUrl}, the BrowserLiveView component handles the intricate details of establishing the WebSocket connection, consuming the DCV stream, and rendering the live video feed within your specified dimensions. This minimal JSX integration greatly simplifies the frontend development, allowing you to focus on the user experience around the live agent.
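
Presigned Live View URLs are time-limited, and cloud browser sessions should be terminated once the user is done watching. Here is a sketch of a companion teardown endpoint, assuming the Express app from the step 1 hand-off and the reconnect-by-sessionId pattern shown in the next step; the route name and error handling are illustrative:

// Hypothetical teardown endpoint: reconnect to the session by ID, then close it
app.post('/api/stop-agent-session/:sessionId', async (req, res) => {
    try {
        // Reconnect to the existing cloud browser session (same pattern as step 3)
        const browser = new Browser(agentCoreClient, { sessionId: req.params.sessionId });
        await browser.connect();
        await browser.close(); // ends the session, and with it the DCV Live View stream
        res.json({ stopped: true });
    } catch (err) {
        res.status(500).json({ error: 'Failed to stop browser session' });
    }
});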

3. Wiring an AI Agent to Drive the Browser

The final step connects your AI agent's intelligence to the actual browser actions within the isolated session. While the BrowserLiveView provides the visual feedback, your AI agent uses Playwright CDP (Chrome DevTools Protocol) to interact with the browser programmatically.

Your application server, which also hosts your AI agent, will use the Browser object's page property (which is a Playwright Page object) to execute browser actions.

// Example server-side code (continued from step 1)
// Assuming you have a Playwright-like interface or direct Playwright usage
import { Browser } from 'bedrock-agentcore';
import { AgentCoreClient } from '@aws-sdk/client-bedrock-agentcore';
import { BedrockRuntimeClient, InvokeModelCommand } from "@aws-sdk/client-bedrock-runtime";

// ... (previous setup for browser creation) ...

async function driveAgent(sessionId: string) {
    const browser = new Browser(agentCoreClient, { sessionId }); // Reconnect to existing session
    await browser.connect(); // Connect to the browser session

    const page = browser.page; // Get the Playwright Page object

    // Example AI agent logic (simplified for illustration)
    // Here you would integrate with your LLM (e.g., Anthropic Claude via Bedrock Converse API)
    // to determine actions based on user prompts and page content.
    console.log("Agent navigating to example.com...");
    await page.goto('https://www.example.com');
    console.log("Agent waited for 3 seconds...");
    await page.waitForTimeout(3000); // Simulate processing time

    console.log("Agent typing into a search box (hypothetical)...");
    // Example: await page.type('#search-input', 'Amazon Bedrock AgentCore');
    // Example: await page.click('#search-button');

    const content = await page.content();
    // Use an LLM to analyze 'content' and decide next steps
    const bedrockRuntimeClient = new BedrockRuntimeClient({ region: 'us-east-1' });
    const response = await bedrockRuntimeClient.send(new InvokeModelCommand({
        modelId: "anthropic.claude-3-sonnet-20240229-v1:0", // or your preferred model
        contentType: "application/json",
        accept: "application/json",
        body: JSON.stringify({
            anthropic_version: "bedrock-2023-05-31", // required for Anthropic models on Bedrock
            max_tokens: 200,
            messages: [
                {
                    role: "user",
                    content: `Analyze this webpage content and suggest the next action: ${content.substring(0, 500)}`
                }
            ],
        }),
    }));
    const decodedBody = new TextDecoder("utf-8").decode(response.body);
    const parsedBody = JSON.parse(decodedBody);
    console.log("AI Model suggested action:", parsedBody.content[0].text);

    // Based on LLM's suggestion, execute further page actions...

    // Don't forget to close the browser session when done
    // await browser.close();
}

// After starting the session and getting the URL, you would then call driveAgent(sessionId)

This interaction loop—where your AI agent analyzes page content, determines the next action, and executes it via Playwright CDP—forms the core of an autonomous browsing agent. All these actions are visually rendered in real-time through the BrowserLiveView component on the user's screen.
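
To make that loop concrete, here is a minimal observe-decide-act sketch. It assumes the page object from the snippet above, and a hypothetical decideNextAction helper that wraps the Bedrock call from step 3 and returns a structured action:

// Hypothetical action type the LLM helper is assumed to return
type AgentAction =
    | { kind: 'goto'; url: string }
    | { kind: 'click'; selector: string }
    | { kind: 'type'; selector: string; text: string }
    | { kind: 'done' };

// Observe-decide-act loop; `page` is the Playwright Page from browser.page
async function runAgentLoop(
    page: any,
    decideNextAction: (html: string) => Promise<AgentAction>
) {
    for (let step = 0; step < 20; step++) { // cap iterations to avoid runaway agents
        const html = await page.content();            // observe: read the current page
        const action = await decideNextAction(html);  // decide: delegate to the LLM
        if (action.kind === 'done') break;            // the model signals completion

        // act: each Playwright call below is rendered live in BrowserLiveView
        switch (action.kind) {
            case 'goto':
                await page.goto(action.url);
                break;
            case 'click':
                await page.click(action.selector);
                break;
            case 'type':
                await page.fill(action.selector, action.text);
                break;
        }
    }
}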

Unlocking New Possibilities with Embedded AI Agents

The integration of Amazon Bedrock AgentCore's BrowserLiveView is more than just a technical feature; it's a paradigm shift in how users interact with and trust AI agents. By embedding real-time visual feedback, developers can create AI-powered applications that are not only efficient but also transparent, auditable, and user-friendly.

This capability is particularly transformative for applications involving:

  • Complex Workflows: Automating multi-step online processes like data entry, onboarding, or regulatory compliance, where visibility into each step is paramount.
  • Customer Support: Allowing human support staff to observe AI co-pilots resolving customer queries or navigating systems, ensuring accuracy and providing opportunities for intervention.
  • Training and Debugging: Providing developers and end-users with a powerful tool to understand agent behavior, debug issues, and train agents through direct observation.
  • Enhanced Audit Trails: Generating visual records of agent actions, which can be combined with session recordings stored in Amazon S3 for comprehensive post-hoc review and compliance.

The ability to directly stream browser sessions from the AWS Cloud to client browsers, bypassing the application server for the video stream, offers significant advantages in terms of performance and scalability. This architecture minimizes latency and reduces the burden on your backend infrastructure, allowing you to deploy highly responsive and scalable AI agent solutions.

By adopting BrowserLiveView, you're not just building AI agents; you're building trust, control, and a richer user experience. Explore the possibilities and empower your users with the confidence to delegate complex web tasks to intelligent agents.

Frequently Asked Questions

What is the Amazon Bedrock AgentCore BrowserLiveView component and how does it function?
The Amazon Bedrock AgentCore BrowserLiveView component is part of the Bedrock AgentCore TypeScript SDK, designed to embed a real-time video feed of an AI agent's browsing session directly into a React application. It operates by receiving a SigV4-presigned URL from your application server and then establishing a persistent WebSocket connection that streams video data via the Amazon DCV protocol from an isolated cloud browser session. This direct streaming mechanism ensures low latency and high fidelity, allowing users to observe every action an AI agent takes on a webpage, from navigation to form submissions, without the video stream passing through your server.
How does embedding Live View enhance user trust and confidence in AI agents?
Embedding Live View significantly boosts user trust and confidence by providing unparalleled transparency into an AI agent's operations. Instead of a 'black box' experience, users gain immediate visual confirmation of the agent's actions, observing its progress and interactions in real-time. This visual feedback loop helps users understand that the agent is on the correct path, interacting with the right elements, and progressing as expected. This is particularly valuable for complex or sensitive workflows, where visual evidence can reassure users that the agent is performing its tasks accurately and responsibly, enhancing overall confidence and allowing for timely intervention if necessary.
What are the primary architectural components involved in integrating a Live View AI agent?
The integration of a Live View AI agent involves three main architectural components. First, the user's web browser, running a React application, hosts the BrowserLiveView component, which renders the real-time stream. Second, the application server acts as the orchestrator, managing the AI agent's logic, initiating browser sessions via the Amazon Bedrock AgentCore API, and generating secure, time-limited SigV4-presigned URLs for the Live View stream. Third, the AWS Cloud hosts Amazon Bedrock AgentCore and Bedrock services, providing the isolated cloud browser sessions, automation capabilities (via Playwright CDP), and the DCV-powered Live View streaming endpoint. A key design point is that the DCV stream flows directly from AWS to the user's browser, bypassing the application server for optimal performance.
Can developers utilize any AI model or agent framework with Amazon Bedrock AgentCore's Live View?
Yes, developers have the flexibility to use any AI model or agent framework of their choice with Amazon Bedrock AgentCore's Live View. While the provided example often demonstrates integration with the Amazon Bedrock Converse API and models like Anthropic Claude, the BrowserLiveView component itself is model-agnostic. This means that the real-time visual streaming functionality is decoupled from the AI agent's underlying reasoning and decision-making logic. As long as your chosen AI agent or framework can interact with the browser automation endpoint provided by AgentCore (typically via Playwright CDP), you can leverage Live View to provide visual feedback to your users, making it a highly adaptable solution for various AI-powered applications.
What are the essential prerequisites for setting up a Live View AI browser agent with Amazon Bedrock AgentCore?
To set up a Live View AI browser agent, several prerequisites are necessary. Developers need Node.js version 20 or later for the server-side logic and React for the client-side application. An AWS account in a supported region is required, along with AWS credentials that have the necessary Amazon Bedrock AgentCore Browser permissions. It's crucial to follow the principle of least privilege for IAM permissions and use temporary credentials (e.g., from AWS IAM Identity Center or STS) rather than long-lived access keys for enhanced security. Additionally, the Amazon Bedrock AgentCore TypeScript SDK (`bedrock-agentcore`) and potentially the AWS SDK for JavaScript (`@aws-sdk/client-bedrock-runtime`) if using Bedrock models, must be installed in your project.
How does the DCV protocol facilitate real-time, low-latency video streaming for Live View?
The Amazon DCV (NICE DCV) protocol is instrumental in providing real-time, low-latency video streaming for the BrowserLiveView component. DCV is a high-performance remote display protocol designed to deliver a rich user experience over varying network conditions. In the context of AgentCore, it efficiently encodes and transmits the visual output of the isolated cloud browser session directly to the user's React application via a WebSocket connection. By optimizing data compression and transmission, DCV ensures that the visual feed of the AI agent's actions appears smooth and responsive, minimizing lag and enabling users to observe the agent's behavior as if it were happening locally on their machine, without the need for complex streaming infrastructure setup by the developer.
