# Build a Generative-UI Coding Agent with CopilotKit and Daytona

import { Image } from 'astro:assets'

import copilotkitSnakeGame from '../../../../../assets/docs/images/copilotkit_snake_game.gif'

This guide demonstrates how to build a [CopilotKit](https://docs.showcase.copilotkit.ai/) Built-in Agent backed by a [Daytona](https://www.daytona.io/) sandbox with full shell and filesystem access. The agent handles whatever a developer might do at a terminal: build apps, debug or analyze code, run scripts, work with data, and install packages.

Every tool call streams into the chat as generative UI: shell commands render as terminal cards, file edits as syntax-highlighted code, listings and grep results as structured cards, and any hosted process (dev server, static site, API) as a live `<iframe>` directly in the message stream.

---

### 1. Workflow Overview

The agent has 11 tools and a sandbox. On the first prompt it calls `createSandbox`, then chains shell commands, file writes, and filesystem queries until the user's request is done. When the user asks for something they should see in a browser, the agent calls `startWebServer` with the dev-server command and its port; that tool spawns the server in the background, polls for a ready signal, and returns the preview URL, which the frontend embeds as a live iframe inside the chat.

Follow-up turns trigger more `writeFile` calls. If the iframe is up and the dev server has HMR, edits reload the inner page in place, so a chat back-and-forth becomes a live editing loop on the running app.

### 2. Project Setup

:::note[Node.js Version]
Node.js 20 or newer is required to run this example.
:::

#### Clone the Repository

Clone the [Daytona repository](https://github.com/daytonaio/daytona) and navigate to the example directory:

```bash
git clone https://github.com/daytonaio/daytona.git
cd daytona/guides/typescript/copilotkit/generative-ui-coding-agent
```

#### Configure Environment

Get your API keys:

- **Daytona API key**, from the [Daytona Dashboard](https://app.daytona.io/dashboard/keys)
- **OpenAI API key**, from [platform.openai.com](https://platform.openai.com/api-keys)

Copy `.env.example` to `.env` and fill them in:

```bash
cp .env.example .env
```

```bash
DAYTONA_API_KEY=your_daytona_key
OPENAI_API_KEY=your_openai_key
```

#### Install Dependencies

```bash
npm install
```

### 3. Understanding the Core Components

Before walking through the implementation, here are the key concepts the code relies on: CopilotKit's Built-in Agent and its streaming protocol, the single `/api/copilotkit` endpoint served by a Next.js Route Handler, generative UI via `useRenderTool`, and the tool surface.

#### CopilotKit Built-in Agent

`BuiltInAgent` is CopilotKit's agent runtime. You give it a model spec, a system prompt, and a list of tools; it runs the model in a per-turn loop (tool call → tool result → tool call → ... → final text response), iterating up to the configured `maxSteps` limit and stopping early when the model returns a final text response with no further tool calls.
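The control flow of that loop can be pictured with a minimal sketch. Everything in it is hypothetical and exists only for illustration: `runTurn`, `ModelStep`, and the scripted `fakeModel` are not CopilotKit APIs, and the real runtime's behavior when it reaches the step limit may differ.

```typescript
// Hypothetical sketch of a per-turn tool loop; not CopilotKit's actual implementation.
type ToolCall = { name: string; args: Record<string, unknown> }
type ModelStep = { text?: string; toolCalls: ToolCall[] }
type Model = (history: string[]) => Promise<ModelStep>
type Tools = Record<string, (args: Record<string, unknown>) => Promise<unknown>>

async function runTurn(
  model: Model,
  tools: Tools,
  maxSteps: number,
): Promise<{ steps: number; text: string }> {
  const history: string[] = []
  for (let step = 1; step <= maxSteps; step++) {
    const out = await model(history)
    // No tool calls means the model produced its final text response: stop early.
    if (out.toolCalls.length === 0) return { steps: step, text: out.text ?? '' }
    for (const call of out.toolCalls) {
      const tool = tools[call.name]
      const result = tool ? await tool(call.args) : { error: `unknown tool ${call.name}` }
      // Feed the serialized result back so the next model step can see it.
      history.push(`${call.name} -> ${JSON.stringify(result)}`)
    }
  }
  // Step budget exhausted before the model finished on its own.
  return { steps: maxSteps, text: '(step limit reached)' }
}

// Scripted stand-in for a model: one tool call, then a final answer.
const fakeModel: Model = async (history) =>
  history.length === 0
    ? { toolCalls: [{ name: 'createSandbox', args: {} }] }
    : { text: 'Sandbox is ready.', toolCalls: [] }

const fakeTools: Tools = { createSandbox: async () => ({ sandboxId: 'sbx-123' }) }

const turn = runTurn(fakeModel, fakeTools, 30)
```

With the scripted model above, `turn` resolves to `{ steps: 2, text: 'Sandbox is ready.' }`: one step for the tool call, one for the final answer.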

The runtime emits a structured streaming protocol on top of that loop: each tool call surfaces as a sequence of events (`TOOL_CALL_START`, one or more `TOOL_CALL_ARGS` carrying argument deltas as the model streams them, `TOOL_CALL_END`, then `TOOL_CALL_RESULT` once the server-side `execute` returns) that the React side subscribes to via hooks like `useRenderTool`. That's what lets us render each tool call as its own live card in the chat as the agent works.
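To make the event sequence concrete, here is a rough sketch of how a client could fold those events into per-call state. Only the four event type names come from the text above; the payload fields (`toolCallId`, `toolCallName`, `delta`, `content`) and the `reduceToolEvents` helper are assumptions for illustration, and in practice `useRenderTool` does this bookkeeping for you.

```typescript
// Assumed event payloads; only the event type names come from the text above.
type ToolEvent =
  | { type: 'TOOL_CALL_START'; toolCallId: string; toolCallName: string }
  | { type: 'TOOL_CALL_ARGS'; toolCallId: string; delta: string }
  | { type: 'TOOL_CALL_END'; toolCallId: string }
  | { type: 'TOOL_CALL_RESULT'; toolCallId: string; content: string }

type ToolCallState = {
  name: string
  argsText: string // raw JSON text accumulated from argument deltas
  status: 'inProgress' | 'executing' | 'complete'
  result?: string
}

function reduceToolEvents(events: ToolEvent[]): Map<string, ToolCallState> {
  const calls = new Map<string, ToolCallState>()
  for (const ev of events) {
    if (ev.type === 'TOOL_CALL_START') {
      calls.set(ev.toolCallId, { name: ev.toolCallName, argsText: '', status: 'inProgress' })
      continue
    }
    const call = calls.get(ev.toolCallId)
    if (!call) continue // ignore events for calls that never started
    if (ev.type === 'TOOL_CALL_ARGS') {
      call.argsText += ev.delta
    } else if (ev.type === 'TOOL_CALL_END') {
      call.status = 'executing' // arguments are complete, the server-side execute is running
    } else {
      call.status = 'complete'
      call.result = ev.content
    }
  }
  return calls
}

const demoCalls = reduceToolEvents([
  { type: 'TOOL_CALL_START', toolCallId: 't1', toolCallName: 'runCommand' },
  { type: 'TOOL_CALL_ARGS', toolCallId: 't1', delta: '{"command":' },
  { type: 'TOOL_CALL_ARGS', toolCallId: 't1', delta: '"ls -la"}' },
  { type: 'TOOL_CALL_END', toolCallId: 't1' },
  { type: 'TOOL_CALL_RESULT', toolCallId: 't1', content: '{"exitCode":0}' },
])
```

The argument text only becomes valid JSON once every delta has arrived, which is why a card rendered mid-stream has to tolerate incomplete parameters.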

#### Single-endpoint routing

Next.js App Router uses convention-based filenames inside `app/`: `page.tsx` is the page component, `layout.tsx` wraps it, and `route.ts` files become API [Route Handlers](https://nextjs.org/docs/app/building-your-application/routing/route-handlers). So `app/api/copilotkit/route.ts` exposes a single endpoint at `/api/copilotkit`.

Every CopilotKit operation (start a run, fetch suggestions, list threads, ...) hits that single endpoint as a `POST` with a `method` field in the JSON body identifying the operation, in a JSON-RPC style. The runtime helper `copilotRuntimeNextJSAppRouterEndpoint` builds the `handleRequest` function for you; you just `return handleRequest(req)` from the route's `POST` export.

You can verify the dispatch shape by hitting the endpoint with an empty body:

```bash
curl -X POST http://localhost:3000/api/copilotkit \
  -H 'Content-Type: application/json' \
  -d '{}'
# {"error":"invalid_request","message":"Missing method field"}
```

#### Generative UI via `useRenderTool`

On the React side, `useRenderTool({ name, parameters, render })` registers a renderer for a specific tool name. The `render` function receives `{ status, parameters, result }`:

- `status` transitions through `inProgress` (model is composing args) → `executing` (server is running the tool) → `complete` (result available)
- `parameters` is the typed, validated tool arguments inferred from the Zod schema
- `result` is the JSON-serialized tool return value, present only when `status === 'complete'`. It is a **string** that needs `JSON.parse` for object access

Each renderer maps a tool call to a React component, so the chat shows live, streaming, structured feedback as the agent works.
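As a framework-free illustration of the three phases, the sketch below maps a call to a line of text instead of a React component. `describeToolCall` is a hypothetical helper, not part of CopilotKit; the point is only that `result` is read and parsed in the `complete` branch alone.

```typescript
type ToolStatus = 'inProgress' | 'executing' | 'complete'

// Hypothetical helper: turn a tool call's phase into a one-line description.
function describeToolCall(name: string, status: ToolStatus, result?: string): string {
  switch (status) {
    case 'inProgress':
      return `${name}: model is composing arguments`
    case 'executing':
      return `${name}: running on the server`
    case 'complete': {
      // The result arrives as a JSON string, so parse it before reading fields.
      const parsed = result ? JSON.parse(result) : undefined
      return `${name}: finished with ${Object.keys(parsed ?? {}).length} result field(s)`
    }
  }
}

const previewLine = describeToolCall(
  'getPreviewUrl',
  'complete',
  '{"url":"https://example.test","port":5173}',
)
```

A real renderer would return a card component from each branch rather than a string.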

#### Tool surface

The agent exposes 11 tools, defined with `defineTool` from `@copilotkit/runtime/v2`. Each is a thin wrapper around a [Daytona TypeScript SDK](https://www.daytona.io/docs/en/typescript-sdk.md) call:

| Tool | Daytona SDK call |
|---|---|
| `createSandbox` (with optional `envVars`, `labels`, `autoStopInterval`) | `daytona.create({ public: true, ephemeral: true, envVars, labels, autoStopInterval })` |
| `runCommand` (with optional `background`) | `sandbox.process.executeCommand` / `executeSessionCommand(..., { runAsync: true })` |
| `writeFile` | `sandbox.fs.uploadFile` |
| `readFile` | `sandbox.fs.downloadFile` |
| `listFiles` | `sandbox.fs.listFiles` |
| `findFiles` (grep) | `sandbox.fs.findFiles` |
| `searchFiles` (glob) | `sandbox.fs.searchFiles` |
| `replaceInFiles` (codemod) | `sandbox.fs.replaceInFiles` |
| `getFileDetails` | `sandbox.fs.getFileDetails` |
| `startWebServer` | `process.createSession` + `executeSessionCommand({runAsync:true})` + log polling + `getPreviewLink` |
| `getPreviewUrl` | `sandbox.getPreviewLink` |

### 4. Implementation

The full backend lives in `app/api/copilotkit/route.ts`. The frontend is split across `app/layout.tsx`, `app/page.tsx`, and a handful of card components in `components/`. We walk through both top-down.

#### Step 1: Imports and the system prompt

The backend pulls the runtime helpers from `@copilotkit/runtime`, the v2 agent + tool helpers from `@copilotkit/runtime/v2`, the Daytona client, the Next.js request type, and Zod for tool parameter schemas:

```typescript
import {
  CopilotRuntime,
  copilotRuntimeNextJSAppRouterEndpoint,
} from '@copilotkit/runtime'
import { BuiltInAgent, defineTool } from '@copilotkit/runtime/v2'
import { Daytona } from '@daytona/sdk'
import type { NextRequest } from 'next/server'
import { z } from 'zod'

const daytona = new Daytona({ apiKey: process.env.DAYTONA_API_KEY })
```

The system prompt frames the agent's role and spells out the rules for previewing web apps. The model receives it verbatim on every turn and uses it to decide what to do next:

```text
You are a coding agent with shell access to a fresh Daytona sandbox.

The user can ask you anything a developer might do at a terminal: build apps,
debug or analyze code, run scripts, work with data, install packages, write
tests, whatever fits the request.

Work under /home/daytona by default. Reuse the same sandboxId across every
tool call. The sandbox auto-deletes after a period of inactivity; if a tool
call fails because the sandbox no longer exists, call createSandbox again
and continue with the new sandboxId.

When the user wants to see a running web app:

1. Prefer a modern, maintained scaffolder. Vite is the safest default for
   React/TS/SPA work; use `npm create vite@latest <name> -- --template
   react-ts --yes` or similar. Avoid `create-react-app`; it is deprecated and
   has very slow first-compile times.

2. ALWAYS bind the dev server to 0.0.0.0 or the Daytona proxy will not reach
   it. Cheat sheet:
   - Vite: `vite --host 0.0.0.0 --port 5173` (CLI flag) AND write a
     `vite.config.ts` with `server: { host: '0.0.0.0', port: 5173, strictPort:
     true, hmr: { clientPort: 443, protocol: 'wss' } }` so HMR survives the
     HTTPS proxy.
   - Next.js: `next dev -H 0.0.0.0 -p 3000`.
   - Express / Node: `app.listen(PORT, '0.0.0.0')`.
   - Flask: `flask run --host 0.0.0.0 --port 5000`.
   - FastAPI / Uvicorn: `uvicorn main:app --host 0.0.0.0 --port 8000`.

3. Use startWebServer with the dev-server command and its port. It starts
   the server in the background, waits for the port to be reachable, and
   returns the preview URL in one shot.

Reply to the user with one short sentence per turn. The tool cards in the
chat carry the visual feedback.
```
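The Vite entry in the cheat sheet can be sketched as a config file. The values are taken from the prompt; the layout is an assumption about what the agent might write, and a plain object is used here only to keep the sketch dependency-free.

```typescript
// vite.config.ts (sketch): the server settings the system prompt asks for.
// Real projects usually wrap this object in defineConfig from 'vite' and add
// framework plugins; both are omitted here to keep the example self-contained.
const config = {
  server: {
    host: '0.0.0.0', // listen on all interfaces so the Daytona proxy can reach the server
    port: 5173,
    strictPort: true, // fail instead of silently moving to another port
    hmr: {
      clientPort: 443, // the browser connects through the HTTPS proxy, not to 5173 directly
      protocol: 'wss', // secure WebSocket, matching the HTTPS page
    },
  },
}

export default config
```

Pinning the port with `strictPort` matters because the agent passes the same port to `startWebServer`; if Vite moved to another port, the preview URL would point at nothing.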

#### Step 2: Tool definitions

Each tool is a `defineTool({ name, description, parameters, execute })` block. The `parameters` schema is a Zod object whose inferred type becomes the `execute` callback's argument shape, so the tool is type-safe end to end.

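`createSandbox` always passes `public: true`, which enables the public preview URLs mentioned in its description, and `ephemeral: true`, which matches the system prompt's note that the sandbox auto-deletes after a period of inactivity. It exposes only the params most relevant to a chat-driven coding agent: `envVars` (inject secrets), `labels` (org tagging), and `autoStopInterval` (idle-stop in minutes, default 15):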

```typescript
const createSandbox = defineTool({
  name: 'createSandbox',
  description:
    'Create a fresh Daytona sandbox with public preview URLs enabled. Call ONCE at session start; reuse the returned sandboxId for every subsequent tool call. Optionally inject environment variables, labels, or change the auto-stop interval.',
  parameters: z.object({
    envVars: z
      .record(z.string())
      .optional()
      .describe(
        'Environment variables to set inside the sandbox. Use this when the user provides API keys or other secrets the project needs.',
      ),
    labels: z.record(z.string()).optional().describe('Optional labels for organization-level sandbox tracking.'),
    autoStopInterval: z
      .number()
      .optional()
      .describe('Minutes of inactivity before the sandbox auto-stops. 0 disables, default 15.'),
  }),
  execute: async ({ envVars, labels, autoStopInterval }) => {
    const sandbox = await daytona.create({
      public: true,
      ephemeral: true,
      envVars,
      labels,
      autoStopInterval,
    })
    return { sandboxId: sandbox.id }
  },
})
```

`runCommand` is the most heavily-used tool. The `background` flag flips between a synchronous `executeCommand` (blocks until exit, returns stdout) and an asynchronous `executeSessionCommand({ runAsync: true })` (returns immediately, leaves the process running). The background path is for long-lived non-preview processes the agent won't need to interact with again, like test watchers, build watchers, or log tails. Dev servers the user should see in a browser go through the dedicated `startWebServer` tool instead, which spawns the server, polls its logs for a ready signal, and returns the preview URL atomically:

```typescript
const runCommand = defineTool({
  name: 'runCommand',
  description:
    'Execute a shell command in the sandbox. Set background:true for long-lived fire-and-forget processes (test watchers, build watchers, log followers) the agent will not need to interact with again. Use plain commands (rm, mv, mkdir, chmod, ...) for filesystem ops that do not need structured output. For dev servers the user should see in a browser, use startWebServer instead — it returns the preview URL atomically.',
  parameters: z.object({
    sandboxId: z.string(),
    command: z.string().describe('Shell command. Use && to chain. Absolute paths or `cd /home/daytona && ...`.'),
    background: z
      .boolean()
      .optional()
      .describe(
        'Run asynchronously and return immediately. Use for long-lived non-preview processes such as watchers or log tails; for user-visible dev servers, use startWebServer.',
      ),
  }),
  execute: async ({ sandboxId, command, background }) => {
    const sandbox = await daytona.get(sandboxId)
    if (background) {
      const sessionId = `bg-${Date.now()}`
      await sandbox.process.createSession(sessionId)
      const result = await sandbox.process.executeSessionCommand(sessionId, {
        command,
        runAsync: true,
      })
      return { background: true, sessionId, cmdId: result.cmdId, command }
    }
    const result = await sandbox.process.executeCommand(command)
    return { exitCode: result.exitCode, stdout: result.result, command }
  },
})
```
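Because `runCommand` returns one of two shapes, whatever consumes the result has to tell them apart. The sketch below models the two return statements as a union and formats each case; `summarizeRun` is a hypothetical helper, and the field types are inferred from the code above rather than taken from the SDK's type definitions.

```typescript
// Result shapes mirrored from the two return statements of runCommand.
// Field types are assumptions inferred from that code.
type BackgroundResult = { background: true; sessionId: string; cmdId?: string; command: string }
type ForegroundResult = { exitCode: number; stdout: string; command: string }
type RunCommandResult = BackgroundResult | ForegroundResult

// Hypothetical formatter for a terminal-style card.
function summarizeRun(result: RunCommandResult): string {
  if ('background' in result) {
    // Background runs have no exit code yet; show where the process lives instead.
    return `$ ${result.command}\n(running in background, session ${result.sessionId})`
  }
  const mark = result.exitCode === 0 ? 'done' : `failed (exit ${result.exitCode})`
  return `$ ${result.command}\n${result.stdout}\n${mark}`
}

const foreground = summarizeRun({ exitCode: 0, stdout: 'hi', command: 'echo hi' })
const background = summarizeRun({ background: true, sessionId: 'bg-1', command: 'npm test -- --watch' })
```

Checking for the `background` key narrows the union, so TypeScript rejects any attempt to read `exitCode` from a background result.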

`writeFile` always takes the FULL new content. There is no diff or patch format; the agent must send the whole file every time it edits one. This keeps the model's job simple and avoids merge ambiguity:

```typescript
const writeFile = defineTool({
  name: 'writeFile',
  description: 'Write a file with the FULL new content. Overwrites if it exists.',
  parameters: z.object({
    sandboxId: z.string(),
    path: z.string().describe('Absolute path, e.g. "/home/daytona/app/src/App.tsx".'),
    content: z.string().describe('Complete new file content.'),
  }),
  execute: async ({ sandboxId, path, content }) => {
    const sandbox = await daytona.get(sandboxId)
    await sandbox.fs.uploadFile(Buffer.from(content), path)
    return { path, bytesWritten: Buffer.byteLength(content) }
  },
})
```

`getPreviewUrl` returns the Daytona proxy URL for any port the sandbox has open. It is the standalone counterpart to `startWebServer`: use it when the agent has already brought up a hosted process by other means (a previous `runCommand`, an already-running service) and just needs the URL to surface as an iframe, without spawning a new dev server:

```typescript
const getPreviewUrl = defineTool({
  name: 'getPreviewUrl',
  description:
    'Get the public preview URL for a port on the sandbox. The port is opened automatically if it was closed. Call after starting a hosted process the user should see in a browser.',
  parameters: z.object({
    sandboxId: z.string(),
    port: z.number().describe('Port the hosted process is listening on.'),
  }),
  execute: async ({ sandboxId, port }) => {
    const sandbox = await daytona.get(sandboxId)
    const preview = await sandbox.getPreviewLink(port)
    return { url: preview.url, port }
  },
})
```

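Of the tools not shown here, six (`readFile`, `listFiles`, `findFiles`, `searchFiles`, `replaceInFiles`, `getFileDetails`) follow the same shape: a Zod schema for inputs, a one-call wrapper around the Daytona SDK in `execute`, and a structured object returned to the frontend. `startWebServer` is the exception: rather than a single SDK call, it chains `process.createSession`, an asynchronous `executeSessionCommand`, log polling, and `getPreviewLink`, as listed in the tool table.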
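The log-polling step that `startWebServer` relies on can be sketched independently of the SDK. This is a minimal illustration under stated assumptions: `waitForReady` and `LogReader` are hypothetical names, the ready pattern is a guess at what a Vite dev server prints, and in the real tool the reader would wrap a Daytona session-log call, followed by `getPreviewLink` on success.

```typescript
// Hypothetical helper: poll a log source until a ready marker appears or time runs out.
type LogReader = () => Promise<string>

async function waitForReady(
  readLogs: LogReader,
  readyPattern: RegExp,
  opts: { timeoutMs: number; intervalMs: number },
): Promise<boolean> {
  const deadline = Date.now() + opts.timeoutMs
  while (Date.now() < deadline) {
    const logs = await readLogs()
    if (readyPattern.test(logs)) return true
    // Not ready yet: wait one interval before reading the logs again.
    await new Promise((resolve) => setTimeout(resolve, opts.intervalMs))
  }
  return false // timed out without seeing the ready marker
}

// Stand-in for the sandbox log stream: the ready line shows up on the third read.
let reads = 0
const fakeLogs: LogReader = async () => {
  reads += 1
  return reads < 3
    ? 'starting dev server...'
    : 'starting dev server...\nLocal: http://localhost:5173/'
}

const ready = waitForReady(fakeLogs, /Local:\s+http/, { timeoutMs: 2000, intervalMs: 10 })
```

Returning `false` on timeout, instead of throwing, lets the tool report a structured failure that the agent can react to, for example by reading the logs and retrying.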

#### Step 3: Mount the agent on `/api/copilotkit`

The agent ties everything together. `model` is the model identifier in `provider:model` form, which CopilotKit resolves to the right provider client. `prompt` is the system prompt from Step 1. `tools` is the list of `defineTool` results from Step 2. `maxSteps` caps how many tool-call iterations a single turn can run before the runtime forces a final answer:

```typescript
const agent = new BuiltInAgent({
  model: 'openai:gpt-5.4',
  prompt: SYSTEM_PROMPT,
  tools: [
    createSandbox,
    runCommand,
    writeFile,
    readFile,
    listFiles,
    findFiles,
    searchFiles,
    replaceInFiles,
    getFileDetails,
    startWebServer,
    getPreviewUrl,
  ],
  maxSteps: 30,
})

const runtime = new CopilotRuntime({
  agents: { default: agent },
})

export const POST = async (req: NextRequest) => {
  const { handleRequest } = copilotRuntimeNextJSAppRouterEndpoint({
    runtime,
    endpoint: '/api/copilotkit',
  })
  return handleRequest(req)
}
```

The `POST` export is what Next.js picks up as the route handler. CopilotKit's `handleRequest` reads the `method` field from the JSON body and dispatches to the matching runtime operation, so this single handler serves every CopilotKit RPC the React frontend makes.

#### Step 4: Wrap the app in `CopilotKitRoot`

`components/CopilotKitRoot.tsx` is a thin client component that wraps children in the v2 provider and points it at the runtime endpoint. The provider sets up the AG-UI event stream and gives `<CopilotChat>` (and any other v2 components nested under it) access to the runtime:

```tsx
'use client'

import { CopilotKit } from '@copilotkit/react-core/v2'
import type { ReactNode } from 'react'
import '@copilotkit/react-core/v2/styles.css'

export function CopilotKitRoot({ children }: { children: ReactNode }) {
  return <CopilotKit runtimeUrl="/api/copilotkit">{children}</CopilotKit>
}
```

`app/layout.tsx` uses it to wrap the whole tree.

#### Step 5: Register tool renderers with `useRenderTool`

`app/page.tsx` registers one `useRenderTool` hook per backend tool. Each render-prop receives `{ status, parameters, result }`. Because `result` is JSON-serialized as a string, we run it through a small helper before reading fields off it:

```tsx
function parseResult<T>(result: unknown): T | undefined {
  if (typeof result !== 'string' || result.length === 0) return undefined
  try {
    return JSON.parse(result) as T
  } catch {
    return undefined
  }
}
```
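A quick check of how the helper behaves on the inputs a renderer typically sees. The helper is repeated so the example runs on its own; `PreviewResult` and the sample values are made up for illustration.

```typescript
// Same helper as above, repeated so this example is self-contained.
function parseResult<T>(result: unknown): T | undefined {
  if (typeof result !== 'string' || result.length === 0) return undefined
  try {
    return JSON.parse(result) as T
  } catch {
    return undefined
  }
}

type PreviewResult = { url: string; port: number }

// A completed call: the result is a JSON string.
const ok = parseResult<PreviewResult>('{"url":"https://example.test","port":5173}')
// A call that is still running: no result yet.
const pending = parseResult<PreviewResult>(undefined)
// Malformed JSON: swallowed instead of crashing the render.
const broken = parseResult<PreviewResult>('{"url":')

console.log(ok?.port) // 5173
console.log(pending) // undefined
console.log(broken) // undefined
```

Note that the cast `as T` is unchecked: the helper guards against missing or malformed JSON, not against a result whose fields differ from the declared type.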

Each render hook references a named Zod schema and a result type declared at the top of `app/page.tsx`. The renderer for `startWebServer` (and the fallback `getPreviewUrl`) is the one users notice most: while either tool is `inProgress`/`executing` the card shows a shimmering skeleton, and once `complete` it flips to a live `<iframe>` pointing at the Daytona preview URL. Both tools share the same `PreviewCard` component because the shape of the data the card needs (`url` and `port`) is identical; the `getPreviewUrl` registration is shown below:

```tsx
const getPreviewUrlParams = z.object({
  sandboxId: z.string(),
  port: z.number(),
})

type GetPreviewUrlResult = { url: string; port: number }

useRenderTool({
  name: 'getPreviewUrl',
  parameters: getPreviewUrlParams,
  render: ({ status, parameters, result }) => {
    const r = parseResult<GetPreviewUrlResult>(result)
    return <PreviewCard status={status} url={r?.url} port={parameters?.port} />
  },
})
```

The iframe stays mounted across subsequent turns, so when the agent calls `writeFile` to update a file in the running dev server, the dev server's HMR (over the WebSocket the iframe already holds open) reloads the inner page in place without a React re-render.

The other renderers follow the same pattern with their own cards: a terminal-style `TerminalCard` for `runCommand`, a syntax-highlighted `FileCard` for `writeFile` / `readFile`, a structured `FileListCard` for `listFiles` / `searchFiles`, a `GrepCard` for `findFiles` matches, a `ReplaceCard` for `replaceInFiles` (showing `pattern → newValue` plus per-file success/fail), and a compact `FileInfoCard` for `getFileDetails` metadata.

#### Step 6: Configure suggestions

`useConfigureSuggestions` seeds the chat with dynamically generated starter prompts, so a new conversation does not open on a blank screen:

```tsx
useConfigureSuggestions({
  instructions:
    'Suggest 3 short, varied prompts a developer might ask a coding agent with shell access. Mix app-building requests with debugging, scripting, or data-analysis tasks (e.g. "Build a todo app", "Find the bug in this Python script", "Generate a CSV of prime numbers under 1000").',
  minSuggestions: 3,
  maxSuggestions: 3,
  available: 'always',
})
```

`available: 'always'` keeps the suggestion pills visible even after the first message, which makes follow-up exploration easier.

### 5. Run the Example

```bash
npm run dev
```

Open [http://localhost:3000](http://localhost:3000) and ask the agent for something. For example:

> Build the classic Snake game in Vite + React using HTML canvas. Use arrow keys to control the snake, count score, end on collision with wall or self, with a restart button. Make the game area dark green and the snake bright green.

#### Example Output

The chat fills with one tool-call card per agent step. Every card is expandable via the `▸` chevron on the right, so you can inspect details like sandbox params, exit codes, session and command IDs, written-file content, and the full preview URL.

A typical first turn looks like this:

```text
✓  Sandbox ready                              a0ffcc3d-7753-4d52-89d0-6c595c60626a

✓  done
$ cd /home/daytona && npm create vite@latest snake-game -- --template react-ts --yes && cd snake-game && npm install

npm warn exec The following package was not found and will be installed: create-vite@9.0.7

> npx
> "create-vite" snake-game --template react-ts --yes

│
◇  Scaffolding project in /home/daytona/snake-game...
│
└  Done. Now run:

  cd snake-game
  npm install
  npm run dev

added 152 packages, and audited 153 packages in 17s

42 packages are looking for funding
  run `npm fund` for details

found 0 vulnerabilities

✓  wrote  /home/daytona/snake-game/src/App.tsx                             4492 B

✓  wrote  /home/daytona/snake-game/src/App.css                             1527 B

✓  wrote  /home/daytona/snake-game/src/index.css                            307 B

✓  wrote  /home/daytona/snake-game/vite.config.ts                           273 B

✓  done
$ cd /home/daytona/snake-game && npm run build

> snake-game@0.0.0 build
> tsc -b && vite build

vite v8.0.16 building client environment for production...
transforming...✓ 17 modules transformed.
rendering chunks...
computing gzip size...
dist/index.html                   0.46 kB │ gzip:  0.29 kB
dist/assets/index-BWaglkr_.css    1.49 kB │ gzip:  0.77 kB
dist/assets/index-BAnbeFU0.js   192.33 kB │ gzip: 60.80 kB

✓ built in 161ms

●  Live preview
<iframe src="https://5173-a0ffcc3d-7753-4d52-89d0-6c595c60626a.daytonaproxy01.net" /> — live preview of the running Snake game

Done — the Snake game is running here: https://5173-a0ffcc3d-7753-4d52-89d0-6c595c60626a.daytonaproxy01.net
```

<Image
  src={copilotkitSnakeGame}
  alt="Snake game built and previewed live in the chat by the CopilotKit + Daytona coding agent"
  width={600}
  style="max-width: 100%; height: auto; margin: 1rem 0;"
/>

On follow-up turns, the agent edits files in place and the dev server's HMR updates the iframe without a manual refresh. For example, asking **"Make it red themed"** triggers:

```text
✓  wrote  /home/daytona/snake-game/src/App.css                             1529 B

Done — the app now uses a red theme.
```


### 6. Key Advantages

- **Chat UI for better UX and easier introspection.** A familiar conversational interface is simpler than a raw shell, and you can follow every step the agent takes right in the chat.
- **Purpose-built UI cards for every tool call.** Each agent action renders through a dedicated React card (`FileCard`, `TerminalCard`, `FileListCard`, `GrepCard`, `ReplaceCard`, `FileInfoCard`, `PreviewCard`), each tailored to that tool's data so the user always sees the right visual for the action.
- **Live streaming of every tool call.** Each step surfaces as a structured card as it happens, so you can introspect exactly what the agent is doing in real time rather than waiting for a final summary.
- **Any hosted process embeds as an `<iframe>` directly in the chat**, with no link-outs or screenshots.
- **Conversational edit loop with HMR**, no manual refresh.
- **All code execution happens inside an isolated Daytona sandbox**, never on the host running the chat.