AsyncComputerUse
class AsyncComputerUse()

Computer Use functionality for interacting with the desktop environment.
Provides access to mouse, keyboard, screenshot, display, recording, and accessibility operations for automating desktop interactions within a sandbox.
Attributes:
mouse (AsyncMouse) - Mouse operations interface.
keyboard (AsyncKeyboard) - Keyboard operations interface.
screenshot (AsyncScreenshot) - Screenshot operations interface.
display (AsyncDisplay) - Display operations interface.
recording (AsyncRecordingService) - Screen recording operations interface.
accessibility (AsyncAccessibility) - Accessibility operations interface.
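One way to pair start() and stop() is an async context manager. The sketch below is not part of the SDK: desktop_session is a hypothetical helper, and it assumes only that sandbox.computer_use exposes the start() and stop() coroutines documented in this section.

```python
import asyncio  # needed by callers, e.g. asyncio.run(main())
from contextlib import asynccontextmanager


@asynccontextmanager
async def desktop_session(sandbox):
    """Start the desktop processes, yield the interface, and always stop them.

    Hypothetical helper: `sandbox` is any object whose `computer_use`
    attribute has awaitable start() and stop() methods.
    """
    computer_use = sandbox.computer_use
    await computer_use.start()
    try:
        yield computer_use
    finally:
        # Runs even if the body raises, so the processes are not left running.
        await computer_use.stop()
```

With a real sandbox, usage would look like `async with desktop_session(sandbox) as cu:` followed by calls such as `await cu.mouse.click(100, 200)`.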
AsyncComputerUse.start
@intercept_errors(message_prefix="Failed to start computer use: ")
@with_instrumentation()
async def start() -> ComputerUseStartResponse

Starts all computer use processes (Xvfb, xfce4, x11vnc, novnc).
Returns:
ComputerUseStartResponse - Computer use start response.
Example:
result = await sandbox.computer_use.start()
print("Computer use processes started:", result.message)

AsyncComputerUse.stop
@intercept_errors(message_prefix="Failed to stop computer use: ")
@with_instrumentation()
async def stop() -> ComputerUseStopResponse

Stops all computer use processes.
Returns:
ComputerUseStopResponse - Computer use stop response.
Example:
result = await sandbox.computer_use.stop()
print("Computer use processes stopped:", result.message)

AsyncComputerUse.get_status
@intercept_errors(message_prefix="Failed to get computer use status: ")
@with_instrumentation()
async def get_status() -> ComputerUseStatusResponse

Gets the status of all computer use processes.
Returns:
ComputerUseStatusResponse - Status information about all VNC desktop processes.
Example:
response = await sandbox.computer_use.get_status()
print("Computer use status:", response.status)

AsyncComputerUse.get_process_status
@intercept_errors(message_prefix="Failed to get process status: ")
@with_instrumentation()
async def get_process_status(process_name: str) -> ProcessStatusResponse

Gets the status of a specific VNC process.
Arguments:
process_name (str) - Name of the process to check.
Returns:
ProcessStatusResponse - Status information about the specific process.
Example:
xvfb_status = await sandbox.computer_use.get_process_status("xvfb")
no_vnc_status = await sandbox.computer_use.get_process_status("novnc")

AsyncComputerUse.restart_process
@intercept_errors(message_prefix="Failed to restart process: ")
@with_instrumentation()
async def restart_process(process_name: str) -> ProcessRestartResponse

Restarts a specific VNC process.
Arguments:
process_name (str) - Name of the process to restart.
Returns:
ProcessRestartResponse - Process restart response.
Example:
result = await sandbox.computer_use.restart_process("xfce4")
print("XFCE4 process restarted:", result.message)

AsyncComputerUse.get_process_logs
@intercept_errors(message_prefix="Failed to get process logs: ")
@with_instrumentation()
async def get_process_logs(process_name: str) -> ProcessLogsResponse

Gets logs for a specific VNC process.
Arguments:
process_name (str) - Name of the process to get logs for.
Returns:
ProcessLogsResponse - Process logs.
Example:
logs = await sandbox.computer_use.get_process_logs("novnc")
print("NoVNC logs:", logs)

AsyncComputerUse.get_process_errors
@intercept_errors(message_prefix="Failed to get process errors: ")
@with_instrumentation()
async def get_process_errors(process_name: str) -> ProcessErrorsResponse

Gets error logs for a specific VNC process.
Arguments:
process_name (str) - Name of the process to get error logs for.
Returns:
ProcessErrorsResponse - Process error logs.
Example:
errors = await sandbox.computer_use.get_process_errors("x11vnc")
print("X11VNC errors:", errors)

AsyncMouse
class AsyncMouse()

Mouse operations for computer use functionality.
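The mouse methods take absolute pixel coordinates. A minimal sketch of clamping a target point to the display bounds before moving or clicking follows. clamp_point is a hypothetical helper, not an SDK function, and it assumes zero-based coordinates with the origin at the top-left corner, which this section does not state.

```python
def clamp_point(x: int, y: int, width: int, height: int) -> tuple[int, int]:
    """Clamp (x, y) into the rectangle [0, width - 1] x [0, height - 1].

    Assumption: coordinates are zero-based pixels with the origin at the
    top-left of a display that is `width` by `height` pixels.
    """
    if width <= 0 or height <= 0:
        raise ValueError("display dimensions must be positive")
    return (min(max(x, 0), width - 1), min(max(y, 0), height - 1))
```

The display size could come from display.get_info(), and the clamped pair can then be passed to mouse.move() or mouse.click().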
AsyncMouse.get_position
@intercept_errors(message_prefix="Failed to get mouse position: ")
@with_instrumentation()
async def get_position() -> MousePositionResponse

Gets the current mouse cursor position.
Returns:
MousePositionResponse - Current mouse position with x and y coordinates.
Example:
position = await sandbox.computer_use.mouse.get_position()
print(f"Mouse is at: {position.x}, {position.y}")

AsyncMouse.move
@intercept_errors(message_prefix="Failed to move mouse: ")
@with_instrumentation()
async def move(x: int, y: int) -> MousePositionResponse

Moves the mouse cursor to the specified coordinates.
Arguments:
x (int) - The x coordinate to move to.
y (int) - The y coordinate to move to.
Returns:
MousePositionResponse - Position after move.
Example:
result = await sandbox.computer_use.mouse.move(100, 200)
print(f"Mouse moved to: {result.x}, {result.y}")

AsyncMouse.click
@intercept_errors(message_prefix="Failed to click mouse: ")
@with_instrumentation()
async def click(x: int, y: int, button: str = "left", double: bool = False) -> MouseClickResponse

Clicks the mouse at the specified coordinates.
Arguments:
x (int) - The x coordinate to click at.
y (int) - The y coordinate to click at.
button (str) - The mouse button to click (‘left’, ‘right’, ‘middle’).
double (bool) - Whether to perform a double-click.
Returns:
MouseClickResponse - Click operation result.
Example:
# Single left click
result = await sandbox.computer_use.mouse.click(100, 200)

# Double click
double_click = await sandbox.computer_use.mouse.click(100, 200, "left", True)

# Right click
right_click = await sandbox.computer_use.mouse.click(100, 200, "right")

AsyncMouse.drag
@intercept_errors(message_prefix="Failed to drag mouse: ")
@with_instrumentation()
async def drag(start_x: int, start_y: int, end_x: int, end_y: int, button: str = "left") -> MouseDragResponse

Drags the mouse from start coordinates to end coordinates.
Arguments:
start_x (int) - The starting x coordinate.
start_y (int) - The starting y coordinate.
end_x (int) - The ending x coordinate.
end_y (int) - The ending y coordinate.
button (str) - The mouse button to use for dragging.
Returns:
MouseDragResponse - Drag operation result.
Example:
result = await sandbox.computer_use.mouse.drag(50, 50, 150, 150)
print(f"Drag ended at {result.x}, {result.y}")

AsyncMouse.scroll
@intercept_errors(message_prefix="Failed to scroll mouse: ")
@with_instrumentation()
async def scroll(x: int, y: int, direction: str, amount: int = 1) -> bool

Scrolls the mouse wheel at the specified coordinates.
Arguments:
x (int) - The x coordinate to scroll at.
y (int) - The y coordinate to scroll at.
direction (str) - The direction to scroll (‘up’ or ‘down’).
amount (int) - The amount to scroll.
Returns:
bool - Whether the scroll operation was successful.
Example:
# Scroll up
scroll_up = await sandbox.computer_use.mouse.scroll(100, 200, "up", 3)

# Scroll down
scroll_down = await sandbox.computer_use.mouse.scroll(100, 200, "down", 5)

AsyncKeyboard
class AsyncKeyboard()

Keyboard operations for computer use functionality.
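press() takes a key plus a list of modifiers, while hotkey() takes a single chord string such as ‘ctrl+shift+t’. A small sketch of converting from the first form to the second is shown here. build_chord is a hypothetical helper, not an SDK function; it only joins the parts with ‘+’ and leaves alias normalization to the API.

```python
from typing import Optional, Sequence


def build_chord(key: str, modifiers: Optional[Sequence[str]] = None) -> str:
    """Join modifiers and a key into one chord string, e.g. 'ctrl+shift+t'.

    Hypothetical helper: modifiers come first, in the order given, then the key.
    Surrounding whitespace is stripped; names are otherwise passed through as-is.
    """
    parts = [m.strip() for m in (modifiers or [])]
    parts.append(key.strip())
    if any(not p for p in parts):
        raise ValueError("key and modifiers must be non-empty")
    return "+".join(parts)
```

The result could then be passed to keyboard.hotkey(), for instance `build_chord("t", ["ctrl", "shift"])`.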
AsyncKeyboard.type
@intercept_errors(message_prefix="Failed to type text: ")
@with_instrumentation()
async def type(text: str, delay: int | None = None) -> None

Types the specified text.
Arguments:
text (str) - The text to type.
delay (int) - Delay between characters in milliseconds.
Raises:
DaytonaError - If the type operation fails.
Example:
try:
    await sandbox.computer_use.keyboard.type("Hello, World!")
    print("Operation success")
except Exception as e:
    print(f"Operation failed: {e}")

# With delay between characters
try:
    await sandbox.computer_use.keyboard.type("Slow typing", 100)
    print("Operation success")
except Exception as e:
    print(f"Operation failed: {e}")

AsyncKeyboard.press
@intercept_errors(message_prefix="Failed to press key: ")
@with_instrumentation()
async def press(key: str, modifiers: list[str] | None = None) -> None

Presses a key with optional modifiers.
Arguments:
key (str) - The key to press. Canonical names include ‘enter’, ‘escape’, ‘tab’, letters, digits, unshifted punctuation, function keys, and grammar-safe numpad names such as ‘num_plus’. Named keys are case-insensitive, and common aliases such as ‘Return’ and ‘Escape’ are normalized.
modifiers (list[str]) - Canonical modifier names are ‘ctrl’, ‘alt’, ‘shift’, and ‘cmd’. Common aliases such as ‘control’, ‘option’, ‘meta’, and ‘win’ are normalized.
Raises:
DaytonaError - If the press operation fails.
Example:
# Press Enter
try:
    await sandbox.computer_use.keyboard.press("enter")
    print("Operation success")
except Exception as e:
    print(f"Operation failed: {e}")

# Press Ctrl+C
try:
    await sandbox.computer_use.keyboard.press("c", ["ctrl"])
    print("Operation success")
except Exception as e:
    print(f"Operation failed: {e}")

# Press Ctrl+Shift+T
try:
    await sandbox.computer_use.keyboard.press("t", ["ctrl", "shift"])
    print("Operation success")
except Exception as e:
    print(f"Operation failed: {e}")

AsyncKeyboard.hotkey
@intercept_errors(message_prefix="Failed to press hotkey: ")
@with_instrumentation()
async def hotkey(keys: str) -> None

Presses a hotkey combination.
Arguments:
keys (str) - A single atomic hotkey chord (e.g., ‘ctrl+c’, ‘alt+tab’, ‘cmd+shift+t’, ‘ctrl + c’, ‘shift’). Uses the same normalized key contract as press().
Raises:
DaytonaError - If the hotkey operation fails.
Example:
# Copy
try:
    await sandbox.computer_use.keyboard.hotkey("ctrl+c")
    print("Operation success")
except Exception as e:
    print(f"Operation failed: {e}")

# Paste
try:
    await sandbox.computer_use.keyboard.hotkey("ctrl+v")
    print("Operation success")
except Exception as e:
    print(f"Operation failed: {e}")

# Alt+Tab
try:
    await sandbox.computer_use.keyboard.hotkey("alt+tab")
    print("Operation success")
except Exception as e:
    print(f"Operation failed: {e}")

AsyncScreenshot
class AsyncScreenshot()

Screenshot operations for computer use functionality.
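The screenshot methods return image data as a base64 string. A minimal sketch of decoding such a string and writing it to disk is given below. save_base64_image is a hypothetical helper; the name of the ScreenshotResponse attribute that holds the payload is not given in this section, so the helper takes the string directly.

```python
import base64
from pathlib import Path


def save_base64_image(data_b64: str, path: str) -> int:
    """Decode a base64 image payload, write it to `path`, return the byte count.

    Hypothetical helper: `data_b64` is the base64 string taken from a
    screenshot response (the attribute name is an assumption left to the caller).
    """
    # validate=True rejects characters outside the base64 alphabet
    raw = base64.b64decode(data_b64, validate=True)
    Path(path).write_bytes(raw)
    return len(raw)
```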
AsyncScreenshot.take_full_screen
@intercept_errors(message_prefix="Failed to take screenshot: ")
@with_instrumentation()
async def take_full_screen(show_cursor: bool = False) -> ScreenshotResponse

Takes a screenshot of the entire screen.
Arguments:
show_cursor (bool) - Whether to show the cursor in the screenshot.
Returns:
ScreenshotResponse - Screenshot data with base64 encoded image.
Example:
screenshot = await sandbox.computer_use.screenshot.take_full_screen()
print(f"Screenshot size: {screenshot.width}x{screenshot.height}")

# With cursor visible
with_cursor = await sandbox.computer_use.screenshot.take_full_screen(True)

AsyncScreenshot.take_region
@intercept_errors(message_prefix="Failed to take region screenshot: ")
@with_instrumentation()
async def take_region(region: ScreenshotRegion, show_cursor: bool = False) -> ScreenshotResponse

Takes a screenshot of a specific region.
Arguments:
region (ScreenshotRegion) - The region to capture.
show_cursor (bool) - Whether to show the cursor in the screenshot.
Returns:
ScreenshotResponse - Screenshot data with base64 encoded image.
Example:
region = ScreenshotRegion(x=100, y=100, width=300, height=200)
screenshot = await sandbox.computer_use.screenshot.take_region(region)
print(f"Captured region: {screenshot.region.width}x{screenshot.region.height}")

AsyncScreenshot.take_compressed
@intercept_errors(message_prefix="Failed to take compressed screenshot: ")
@with_instrumentation()
async def take_compressed(
    options: ScreenshotOptions | None = None
) -> ScreenshotResponse

Takes a compressed screenshot of the entire screen.
Arguments:
options (ScreenshotOptions | None) - Compression and display options.
Returns:
ScreenshotResponse - Compressed screenshot data.
Example:
# Default compression
screenshot = await sandbox.computer_use.screenshot.take_compressed()

# High quality JPEG
jpeg = await sandbox.computer_use.screenshot.take_compressed(
    ScreenshotOptions(format="jpeg", quality=95, show_cursor=True)
)

# Scaled down PNG
scaled = await sandbox.computer_use.screenshot.take_compressed(
    ScreenshotOptions(format="png", scale=0.5)
)

AsyncScreenshot.take_compressed_region
@intercept_errors(
    message_prefix="Failed to take compressed region screenshot: "
)
@with_instrumentation()
async def take_compressed_region(
    region: ScreenshotRegion,
    options: ScreenshotOptions | None = None
) -> ScreenshotResponse

Takes a compressed screenshot of a specific region.
Arguments:
region (ScreenshotRegion) - The region to capture.
options (ScreenshotOptions | None) - Compression and display options.
Returns:
ScreenshotResponse - Compressed screenshot data.
Example:
region = ScreenshotRegion(x=0, y=0, width=800, height=600)
screenshot = await sandbox.computer_use.screenshot.take_compressed_region(
    region,
    ScreenshotOptions(format="webp", quality=80, show_cursor=True)
)
print(f"Compressed size: {screenshot.size_bytes} bytes")

AsyncDisplay
class AsyncDisplay()

Display operations for computer use functionality.
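A common follow-up to listing windows is picking one by title. The sketch below filters any iterable of window-like objects. find_windows_by_title is a hypothetical helper, and it assumes only that each window has a title attribute, as in the get_windows() example in this section.

```python
def find_windows_by_title(windows, needle: str) -> list:
    """Return the windows whose title contains `needle`, ignoring case.

    Hypothetical helper: `windows` is any iterable of objects with a `title`
    attribute; a missing or None title is treated as an empty string.
    """
    needle = needle.lower()
    return [w for w in windows if needle in (getattr(w, "title", None) or "").lower()]
```

With a real response, the input would be the windows list of the value returned by display.get_windows().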
AsyncDisplay.get_info
@intercept_errors(message_prefix="Failed to get display info: ")
@with_instrumentation()
async def get_info() -> DisplayInfoResponse

Gets information about the displays.
Returns:
DisplayInfoResponse - Display information including primary display and all available displays.
Example:
info = await sandbox.computer_use.display.get_info()
print(f"Primary display: {info.primary_display.width}x{info.primary_display.height}")
print(f"Total displays: {info.total_displays}")
for i, display in enumerate(info.displays):
    print(f"Display {i}: {display.width}x{display.height} at {display.x},{display.y}")

AsyncDisplay.get_windows
@intercept_errors(message_prefix="Failed to get windows: ")
@with_instrumentation()
async def get_windows() -> WindowsResponse

Gets the list of open windows.
Returns:
WindowsResponse - List of open windows with their IDs and titles.
Example:
windows = await sandbox.computer_use.display.get_windows()
print(f"Found {windows.count} open windows:")
for window in windows.windows:
    print(f"- {window.title} (ID: {window.id})")

AsyncRecordingService
class AsyncRecordingService()

Recording operations for computer use functionality.
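Because a recording stays active until stop() is called, it can help to wrap the recorded work so the session is always closed. The sketch below is one way to do that. record_action is a hypothetical helper, and it assumes only the start(label) and stop(recording_id) coroutines and the id attribute shown in this section.

```python
import asyncio  # needed by callers, e.g. asyncio.run(main())


async def record_action(recording_service, action, label=None):
    """Record the screen while awaiting `action()`.

    Hypothetical helper. `recording_service` is any object with awaitable
    start(label) and stop(recording_id); `action` is a zero-argument coroutine
    function. Returns (action result, stopped recording).
    """
    recording = await recording_service.start(label)
    try:
        result = await action()
    finally:
        # Stop the recording even if the action raised.
        stopped = await recording_service.stop(recording.id)
    return result, stopped
```

With a real sandbox, the first argument would be sandbox.computer_use.recording.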
AsyncRecordingService.start
@intercept_errors(message_prefix="Failed to start recording: ")
@with_instrumentation()
async def start(label: str | None = None) -> Recording

Starts a new screen recording session.
Arguments:
label (str | None) - Optional custom label for the recording.
Returns:
Recording - Recording start response.
Example:
# Start a recording with a label
recording = await sandbox.computer_use.recording.start("my-test-recording")
print(f"Recording started: {recording.id}")
print(f"File: {recording.file_path}")

AsyncRecordingService.stop
@intercept_errors(message_prefix="Failed to stop recording: ")
@with_instrumentation()
async def stop(recording_id: str) -> Recording

Stops an active screen recording session.
Arguments:
recording_id (str) - The ID of the recording to stop.
Returns:
Recording - Recording stop response.
Example:
result = await sandbox.computer_use.recording.stop(recording.id)
print(f"Recording stopped: {result.duration_seconds} seconds")
print(f"Saved to: {result.file_path}")

AsyncRecordingService.list
@intercept_errors(message_prefix="Failed to list recordings: ")
@with_instrumentation()
async def list() -> ListRecordingsResponse

Lists all recordings (active and completed).
Returns:
ListRecordingsResponse - List of all recordings.
Example:
recordings = await sandbox.computer_use.recording.list()
print(f"Found {len(recordings.recordings)} recordings")
for rec in recordings.recordings:
    print(f"- {rec.file_name}: {rec.status}")

AsyncRecordingService.get
@intercept_errors(message_prefix="Failed to get recording: ")
@with_instrumentation()
async def get(recording_id: str) -> Recording

Gets details of a specific recording by ID.
Arguments:
recording_id (str) - The ID of the recording to retrieve.
Returns:
Recording - Recording details.
Example:
recording = await sandbox.computer_use.recording.get(recording_id)
print(f"Recording: {recording.file_name}")
print(f"Status: {recording.status}")
print(f"Duration: {recording.duration_seconds} seconds")

AsyncRecordingService.delete
@intercept_errors(message_prefix="Failed to delete recording: ")
@with_instrumentation()
async def delete(recording_id: str) -> None

Deletes a recording by ID.
Arguments:
recording_id (str) - The ID of the recording to delete.
Example:
await sandbox.computer_use.recording.delete(recording_id)
print("Recording deleted")

AsyncRecordingService.download
@intercept_errors(message_prefix="Failed to download recording: ")
@with_instrumentation()
async def download(recording_id: str, local_path: str) -> None

Downloads a recording file from the Sandbox and saves it to a local file.
The file is streamed directly to disk without loading the entire content into memory.
Arguments:
recording_id (str) - The ID of the recording to download.
local_path (str) - Path to save the recording file locally.
Example:
# Download recording to file
await sandbox.computer_use.recording.download(recording_id, "local_recording.mp4")
print("Recording downloaded")

AsyncAccessibility
class AsyncAccessibility()

Accessibility operations for computer use functionality.
This service exposes thin wrappers over the toolbox AT-SPI accessibility API. Start computer use before calling these methods.
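A typical flow with these wrappers is to search for a node and then act on it. The sketch below chains find_nodes() and invoke_node(). invoke_first_match is a hypothetical helper, and it assumes that the search result exposes a matches list whose items carry an id, as the examples in this section suggest.

```python
async def invoke_first_match(accessibility, role: str, name: str) -> bool:
    """Invoke the primary action of the first node matching `role` and `name`.

    Hypothetical helper. `accessibility` is any object with awaitable
    find_nodes(...) and invoke_node(node_id). Returns False when nothing
    matched, True after the node was invoked.
    """
    found = await accessibility.find_nodes(
        scope="all", role=role, name=name, name_match="substring", limit=1
    )
    if not found.matches:
        return False
    # No action name is passed, so the API is left to pick the primary action.
    await accessibility.invoke_node(found.matches[0].id)
    return True
```

With a real sandbox, the first argument would be sandbox.computer_use.accessibility, after computer use has been started.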
AsyncAccessibility.get_tree
@intercept_errors(message_prefix="Failed to get accessibility tree: ")
@with_instrumentation()
async def get_tree(scope: str | None = None,
                   pid: int | None = None,
                   max_depth: int | None = None) -> AccessibilityTreeResponse

Fetches the AT-SPI accessibility tree.
Arguments:
scope (str | None) - Tree scope to inspect: focused, pid, or all.
pid (int | None) - Process ID when scope is pid.
max_depth (int | None) - Maximum depth to descend. Use 0 for the root only.
Returns:
AccessibilityTreeResponse - Accessibility tree rooted at the requested scope.
Example:
tree = await sandbox.computer_use.accessibility.get_tree(scope="all", max_depth=3)
print(tree.root.name)

AsyncAccessibility.find_nodes
@intercept_errors(message_prefix="Failed to find accessibility nodes: ")
@with_instrumentation()
async def find_nodes(scope: str | None = None,
                     pid: int | None = None,
                     role: str | None = None,
                     name: str | None = None,
                     name_match: str | None = None,
                     states: list[str] | None = None,
                     limit: int | None = None) -> AccessibilityNodesResponse

Finds AT-SPI accessibility nodes matching the provided filters.
Arguments:
scope (str | None) - Search scope: focused, pid, or all.
pid (int | None) - Process ID when scope is pid.
role (str | None) - Accessibility role to match, such as button.
name (str | None) - Accessible name to match.
name_match (str | None) - Name match mode, such as exact or substring.
states (list[str] | None) - Required accessibility states.
limit (int | None) - Maximum number of matches. Use 0 to let the API apply its default.
Returns:
AccessibilityNodesResponse - Matching accessibility nodes.
Example:
buttons = await sandbox.computer_use.accessibility.find_nodes(
    scope="all",
    role="button",
    name="Submit",
    name_match="substring",
)
print(len(buttons.matches))

AsyncAccessibility.focus_node
@intercept_errors(message_prefix="Failed to focus accessibility node: ")
@with_instrumentation()
async def focus_node(node_id: str) -> None

Focuses an AT-SPI accessibility node.
Arguments:
node_id (str) - Accessibility node ID returned by get_tree or find_nodes.
Raises:
DaytonaError - If the focus operation fails. API failures may use a more specific subclass.
Example:
await sandbox.computer_use.accessibility.focus_node(node.id)

AsyncAccessibility.invoke_node
@intercept_errors(message_prefix="Failed to invoke accessibility node: ")
@with_instrumentation()
async def invoke_node(node_id: str, action: str | None = None) -> None

Invokes an AT-SPI accessibility node action.
Arguments:
node_id (str) - Accessibility node ID returned by get_tree or find_nodes.
action (str | None) - Action name to invoke. If omitted, the API invokes the primary action.
Raises:
DaytonaError - If the invoke operation fails. API failures may use a more specific subclass.
Example:
await sandbox.computer_use.accessibility.invoke_node(node.id, action="click")

AsyncAccessibility.set_node_value
@intercept_errors(message_prefix="Failed to set accessibility node value: ")
@with_instrumentation()
async def set_node_value(node_id: str, value: str) -> None

Sets an AT-SPI accessibility node value.
Arguments:
node_id (str) - Accessibility node ID returned by get_tree or find_nodes.
value (str) - Value to write to the node.
Raises:
DaytonaError - If the value update fails. API failures may use a more specific subclass.
Example:
await sandbox.computer_use.accessibility.set_node_value(node.id, "hello")

ScreenshotRegion
class ScreenshotRegion(BaseModel)

Region coordinates for screenshot operations.
Attributes:
x (int) - X coordinate of the region.
y (int) - Y coordinate of the region.
width (int) - Width of the region.
height (int) - Height of the region.
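A region is often wanted around a point of interest, such as the cursor. The sketch below computes the four fields for a box centered on a point and kept inside the screen. centered_region is a hypothetical helper; it returns a plain dict rather than the SDK model, and it assumes x and y refer to the top-left corner of the region, which this section does not state.

```python
def centered_region(cx: int, cy: int, width: int, height: int,
                    screen_w: int, screen_h: int) -> dict[str, int]:
    """Return x/y/width/height for a box centered on (cx, cy), inside the screen.

    Hypothetical helper. Assumption: x and y are the top-left corner of the
    region in zero-based pixel coordinates.
    """
    # Shrink the box if it is larger than the screen.
    width = min(width, screen_w)
    height = min(height, screen_h)
    # Center on the point, then shift back inside the screen if needed.
    x = min(max(cx - width // 2, 0), screen_w - width)
    y = min(max(cy - height // 2, 0), screen_h - height)
    return {"x": x, "y": y, "width": width, "height": height}
```

Since the model takes these four names as keyword arguments, the result could be unpacked as `ScreenshotRegion(**centered_region(...))`.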
ScreenshotOptions
class ScreenshotOptions(BaseModel)

Options for screenshot compression and display.
Attributes:
show_cursor (bool | None) - Whether to show the cursor in the screenshot.
fmt (str | None) - Image format (e.g., ‘png’, ‘jpeg’, ‘webp’).
quality (int | None) - Compression quality (0-100).
scale (float | None) - Scale factor for the screenshot.
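Checking option values on the client before sending them can give earlier, clearer errors. The sketch below checks quality against the 0-100 range given in the attribute list. check_screenshot_options is a hypothetical helper, and the requirement that scale be positive is an assumption, since no range for it is stated here.

```python
from typing import Optional


def check_screenshot_options(quality: Optional[int] = None,
                             scale: Optional[float] = None) -> None:
    """Raise ValueError if a provided value is out of range; None means unset.

    Hypothetical helper. The quality bounds come from the attribute list above;
    the scale > 0 rule is an assumption, not a documented constraint.
    """
    if quality is not None and not 0 <= quality <= 100:
        raise ValueError(f"quality must be between 0 and 100, got {quality}")
    if scale is not None and scale <= 0:
        raise ValueError(f"scale must be positive, got {scale}")
```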