Code Velocity
Developer Tools

Bedrock AgentCore: Embedding a Live AI Browser Agent Directly into React

· 5 min read · AWS · Original source
Architecture diagram of Amazon Bedrock AgentCore showing the data flow for embedding a live AI browser agent directly into a React application.
  • If you are using AWS Bedrock for your AI model, install the AWS SDK for JavaScript:
    npm install @aws-sdk/client-bedrock-runtime
    

The codebase for implementing Live View is typically split: the server-side code (for session management and the AI agent logic) runs in Node.js, while the client-side code (for rendering the Live View) runs inside the React application, often bundled with a tool such as Vite.

Step-by-Step Integration: From Session to Stream

Integrating a live AI browser agent with Amazon Bedrock AgentCore involves a clear, three-step process that connects your server-side logic and your client-side React application with the robust capabilities of the AWS Cloud.

1. Starting a Browser Session and Generating the Live View URL

The first step happens on your application server. This is where your backend logic starts a browser session inside Amazon Bedrock AgentCore and securely obtains the URL needed to stream the live view.

You will use the Browser class from the bedrock-agentcore SDK. This class handles the complexity of creating and managing isolated browser environments in the cloud. The key output of this step is a SigV4-presigned URL, which grants secure, time-limited access to the browser session's live video stream.
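Because the presigned URL is time-limited, it can be worth checking how much lifetime remains before handing it to the client. A minimal sketch, assuming only the standard SigV4 query parameters X-Amz-Date and X-Amz-Expires; the helper name and example endpoint are ours, not part of any SDK:

```typescript
// Hypothetical helper: how many seconds of validity a SigV4-presigned URL
// has left. X-Amz-Date (signing time) and X-Amz-Expires (lifetime in
// seconds) are standard SigV4 query parameters.
function presignedUrlSecondsRemaining(presignedUrl: string, now: Date = new Date()): number {
  const params = new URL(presignedUrl).searchParams;
  const amzDate = params.get("X-Amz-Date");        // e.g. "20240501T120000Z"
  const expiresRaw = params.get("X-Amz-Expires");  // e.g. "300"
  if (!amzDate || !expiresRaw) {
    throw new Error("URL does not look SigV4-presigned");
  }
  // Expand the compact ISO-8601 form into one Date() can parse
  const iso = amzDate.replace(
    /^(\d{4})(\d{2})(\d{2})T(\d{2})(\d{2})(\d{2})Z$/,
    "$1-$2-$3T$4:$5:$6Z",
  );
  const signedAt = new Date(iso).getTime();
  return Math.round((signedAt + Number(expiresRaw) * 1000 - now.getTime()) / 1000);
}
```

A server could refuse to send a URL to the client when the remaining lifetime is too short to be useful.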

// Server-side code example (Node.js)
import { Browser } from 'bedrock-agentcore';
import { AgentCoreClient } from '@aws-sdk/client-bedrock-agentcore';

// Initialize the Bedrock AgentCore client (make sure valid AWS credentials are configured)
const agentCoreClient = new AgentCoreClient({ region: 'us-east-1' }); // Use your preferred region

async function startLiveSession() {
    // Create a new browser session
    const browser = new Browser(agentCoreClient);
    await browser.create();

    // Generate the Live View URL
    const liveViewUrl = await browser.getLiveViewURL();
    console.log('Live View URL:', liveViewUrl);

    // Store browser.sessionId so you can later attach your AI agent or terminate the session
    const sessionId = browser.sessionId;

    return { liveViewUrl, sessionId };
}

// This `liveViewUrl` is sent to your React client.

This URL is then passed to your React frontend, which uses it to start the live stream.
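On the client, it is prudent to validate the fetched payload before rendering anything. A small sketch, assuming the response shape produced by the startLiveSession example above; the guard itself is a hypothetical helper, not part of the SDK:

```typescript
// Hypothetical shape of the payload returned by the session-start endpoint
// (field names match the startLiveSession example).
interface StartSessionResponse {
  liveViewUrl: string;
  sessionId: string;
}

// Narrow an unknown fetch result before passing it to the viewer component.
function isStartSessionResponse(data: unknown): data is StartSessionResponse {
  return (
    typeof data === "object" &&
    data !== null &&
    typeof (data as Record<string, unknown>).liveViewUrl === "string" &&
    typeof (data as Record<string, unknown>).sessionId === "string"
  );
}
```

A guard like this keeps a malformed or error response from being fed into the viewer as if it were a valid stream URL.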

2. Rendering the Live View in Your React Application

Once your React application receives the liveViewUrl from your server, rendering the real-time stream is straightforward thanks to the BrowserLiveView component.

// Client-side code example (React component)
import React, { useEffect, useState } from 'react';
import { BrowserLiveView } from 'bedrock-agentcore';

interface LiveAgentViewerProps {
    liveViewUrl: string | null; // null until the backend has responded
}

const LiveAgentViewer: React.FC<LiveAgentViewerProps> = ({ liveViewUrl }) => {
    if (!liveViewUrl) {
        return <p>Waiting for the Live View URL...</p>;
    }

    return (
        <div style={{ width: '100%', height: '600px', border: '1px solid #ccc' }}>
            <BrowserLiveView url={liveViewUrl} />
        </div>
    );
};

// In your main App component or page:
// const MyPage = () => {
//     const [currentLiveViewUrl, setCurrentLiveViewUrl] = useState<string | null>(null);
//
//     useEffect(() => {
//         // Fetch the liveViewUrl from your backend
//         fetch('/api/start-agent-session')
//             .then(res => res.json())
//             .then(data => setCurrentLiveViewUrl(data.liveViewUrl));
//     }, []);
//
//     return (
//         <div>
//             <h1>AI Agent Live View</h1>
//             <LiveAgentViewer liveViewUrl={currentLiveViewUrl} />
//         </div>
//     );
// };

With just url={liveViewUrl}, the BrowserLiveView component handles the intricate details of establishing the WebSocket connection, consuming the DCV stream, and rendering the live video feed within the dimensions you specify. This minimal JSX integration greatly simplifies frontend development, letting you focus on the user experience around the live agent.

3. Connecting the AI Agent to Drive the Browser

The final step connects your AI agent's intelligence to actual browser actions inside the isolated session. While BrowserLiveView provides the visual feedback, your AI agent uses Playwright CDP (Chrome DevTools Protocol) to interact with the browser programmatically.

Your application server, which also hosts your AI agent, uses the page property of the Browser object (a Playwright Page object) to perform browser actions.

// Server-side code example (continued from step 1)
// Assume a Playwright-compatible interface or direct use of Playwright
import { Browser } from 'bedrock-agentcore';
import { AgentCoreClient } from '@aws-sdk/client-bedrock-agentcore';
import { BedrockRuntimeClient, InvokeModelCommand } from "@aws-sdk/client-bedrock-runtime";

// ... (browser-creation setup from step 1) ...

async function driveAgent(sessionId: string) {
    const browser = new Browser(agentCoreClient, { sessionId }); // Reattach to the existing session
    await browser.connect(); // Connect to the browser session

    const page = browser.page; // Get the Playwright Page object

    // Example AI agent logic (abbreviated for illustration)
    // Here you would call your LLM (e.g., Anthropic Claude via the Bedrock Converse API)
    // to decide on actions based on user instructions and page content.
    console.log("Agent is navigating to example.com...");
    await page.goto('https://www.example.com');
    console.log("Agent waited for 3 seconds...");
    await page.waitForTimeout(3000); // Simulate processing time

    console.log("Agent is typing into the search box (for example)...");
    // Example: await page.type('#search-input', 'Amazon Bedrock AgentCore');
    // Example: await page.click('#search-button');

    const content = await page.content();
    // Use the LLM to analyze 'content' and decide on the next steps
    const bedrockRuntimeClient = new BedrockRuntimeClient({ region: 'us-east-1' });
    const response = await bedrockRuntimeClient.send(new InvokeModelCommand({
        modelId: "anthropic.claude-3-sonnet-20240229-v1:0", // or your preferred model
        contentType: "application/json",
        accept: "application/json",
        body: JSON.stringify({
            anthropic_version: "bedrock-2023-05-31", // required by the Anthropic Messages API on Bedrock
            messages: [
                {
                    role: "user",
                    content: `Analyze this web page content and suggest the next action: ${content.substring(0, 500)}`
                }
            ],
            max_tokens: 200,
        }),
    }));
    const decodedBody = new TextDecoder("utf-8").decode(response.body);
    const parsedBody = JSON.parse(decodedBody);
    console.log("The AI model suggested an action:", parsedBody.content[0].text);

    // Based on the LLM's suggestion, perform further page actions...

    // Don't forget to close the browser session when you are done
    // await browser.close();
}

// After starting the session and obtaining the URL, you would call driveAgent(sessionId)

This interaction loop, in which your AI agent analyzes page content, decides on the next action, and executes it via Playwright CDP, forms the foundation of an autonomous browsing agent. All of these actions are rendered visually in real time through the BrowserLiveView component on the user's screen.
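One way to keep that loop robust is to reduce the LLM's free-text suggestion to a structured action before touching the page. A hedged sketch, assuming a simple "goto &lt;url&gt;" / "click &lt;selector&gt;" output convention that we define ourselves; it is not part of the AgentCore or Bedrock APIs:

```typescript
// Hypothetical structured actions for the agent loop. Anything the parser
// does not recognize is treated as "done" so the loop terminates safely.
type AgentAction =
  | { type: "goto"; url: string }
  | { type: "click"; selector: string }
  | { type: "done" };

function parseAgentAction(suggestion: string): AgentAction {
  const text = suggestion.trim();
  const goto = text.match(/^goto\s+(\S+)/i);
  if (goto) return { type: "goto", url: goto[1] };
  const click = text.match(/^click\s+(\S+)/i);
  if (click) return { type: "click", selector: click[1] };
  return { type: "done" }; // unrecognized suggestions stop the loop
}
```

A driveAgent-style loop could then switch on action.type and call page.goto or page.click accordingly, instead of executing raw model output.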

Unlocking New Possibilities with Embedded AI Agents

Amazon Bedrock AgentCore's BrowserLiveView integration is more than a technical feature; it is a shift in how users interact with and trust AI agents. By embedding real-time visual feedback, developers can build AI-powered applications that are not only capable but also transparent, auditable, and user-friendly.

This capability is especially transformative for applications involving:

  • Complex Workflows: Automating multi-step online processes such as data entry, customer onboarding, or regulatory compliance, where visibility into every step is critical.
  • Customer Support: Allowing human agents to watch AI executors resolve customer queries or operate systems, ensuring accuracy and creating opportunities to intervene.
  • Training and Debugging: Giving developers and end users a powerful tool for understanding agent behavior, troubleshooting problems, and training agents through live observation.
  • Enhanced Audit Trails: Producing visual records of agent actions that can be combined with session recordings in Amazon S3 for thorough post-incident review and compliance.

The ability to stream browser sessions live from the AWS Cloud straight to customers' browsers, bypassing the application server for the video stream, offers significant performance and scalability benefits. This architecture reduces latency and lightens the load on your backend infrastructure, letting you deploy highly responsive, scalable AI agent solutions.

By adopting BrowserLiveView, you are not just building AI agents; you are building trust, control, and a richer user experience. Explore the possibilities and give your users the confidence to delegate complex web tasks to intelligent agents.

Frequently Asked Questions

What is the Amazon Bedrock AgentCore BrowserLiveView component and how does it function?
The Amazon Bedrock AgentCore BrowserLiveView component is a crucial part of the Bedrock AgentCore TypeScript SDK, designed to embed a real-time video feed of an AI agent's browsing session directly into a React application. It operates by receiving a SigV4-presigned URL from your application server, which then establishes a persistent WebSocket connection to stream video data via the Amazon DCV protocol from an isolated cloud browser session. This direct streaming mechanism ensures low latency and high fidelity, allowing users to observe every action an AI agent takes on a webpage, from navigation to form submissions, without the video stream passing through your server.
How does embedding Live View enhance user trust and confidence in AI agents?
Embedding Live View significantly boosts user trust and confidence by providing unparalleled transparency into an AI agent's operations. Instead of a 'black box' experience, users gain immediate visual confirmation of the agent's actions, observing its progress and interactions in real-time. This visual feedback loop helps users understand that the agent is on the correct path, interacting with the right elements, and progressing as expected. This is particularly valuable for complex or sensitive workflows, where visual evidence can reassure users that the agent is performing its tasks accurately and responsibly, enhancing overall confidence and allowing for timely intervention if necessary.
What are the primary architectural components involved in integrating a Live View AI agent?
The integration of a Live View AI agent involves three main architectural components. First, the user's web browser, running a React application, hosts the BrowserLiveView component, which renders the real-time stream. Second, the application server acts as the orchestrator, managing the AI agent's logic, initiating browser sessions via the Amazon Bedrock AgentCore API, and generating secure, time-limited SigV4-presigned URLs for the Live View stream. Third, the AWS Cloud hosts Amazon Bedrock AgentCore and Bedrock services, providing the isolated cloud browser sessions, automation capabilities (via Playwright CDP), and the DCV-powered Live View streaming endpoint. A key design point is that the DCV stream flows directly from AWS to the user's browser, bypassing the application server for optimal performance.
Can developers utilize any AI model or agent framework with Amazon Bedrock AgentCore's Live View?
Yes, developers have the flexibility to use any AI model or agent framework of their choice with Amazon Bedrock AgentCore's Live View. While the provided example often demonstrates integration with the Amazon Bedrock Converse API and models like Anthropic Claude, the BrowserLiveView component itself is model-agnostic. This means that the real-time visual streaming functionality is decoupled from the AI agent's underlying reasoning and decision-making logic. As long as your chosen AI agent or framework can interact with the browser automation endpoint provided by AgentCore (typically via Playwright CDP), you can leverage Live View to provide visual feedback to your users, making it a highly adaptable solution for various AI-powered applications.
What are the essential prerequisites for setting up a Live View AI browser agent with Amazon Bedrock AgentCore?
To set up a Live View AI browser agent, several prerequisites are necessary. Developers need Node.js version 20 or later for the server-side logic and React for the client-side application. An AWS account in a supported region is required, along with AWS credentials that have the necessary Amazon Bedrock AgentCore Browser permissions. It's crucial to follow the principle of least privilege for IAM permissions and use temporary credentials (e.g., from AWS IAM Identity Center or STS) rather than long-lived access keys for enhanced security. Additionally, the Amazon Bedrock AgentCore TypeScript SDK (`bedrock-agentcore`) and potentially the AWS SDK for JavaScript (`@aws-sdk/client-bedrock-runtime`) if using Bedrock models, must be installed in your project.
How does the DCV protocol facilitate real-time, low-latency video streaming for Live View?
The Amazon DCV (NICE DCV) protocol is instrumental in providing real-time, low-latency video streaming for the BrowserLiveView component. DCV is a high-performance remote display protocol designed to deliver a rich user experience over varying network conditions. In the context of AgentCore, it efficiently encodes and transmits the visual output of the isolated cloud browser session directly to the user's React application via a WebSocket connection. By optimizing data compression and transmission, DCV ensures that the visual feed of the AI agent's actions appears smooth and responsive, minimizing lag and enabling users to observe the agent's behavior as if it were happening locally on their machine, without the need for complex streaming infrastructure setup by the developer.
