AsyncComputerUse

class AsyncComputerUse()

Computer Use functionality for interacting with the desktop environment.

Provides access to mouse, keyboard, screenshot, and display operations for automating desktop interactions within a sandbox.

Attributes:

  • mouse AsyncMouse - Mouse operations interface.
  • keyboard AsyncKeyboard - Keyboard operations interface.
  • screenshot AsyncScreenshot - Screenshot operations interface.
  • display AsyncDisplay - Display operations interface.

AsyncComputerUse.start

@intercept_errors(message_prefix="Failed to start computer use: ")
async def start() -> ComputerUseStartResponse

Starts all computer use processes (Xvfb, xfce4, x11vnc, novnc).

Returns:

  • ComputerUseStartResponse - Computer use start response.

Example:

result = await sandbox.computer_use.start()
print("Computer use processes started:", result.message)

AsyncComputerUse.stop

@intercept_errors(message_prefix="Failed to stop computer use: ")
async def stop() -> ComputerUseStopResponse

Stops all computer use processes.

Returns:

  • ComputerUseStopResponse - Computer use stop response.

Example:

result = await sandbox.computer_use.stop()
print("Computer use processes stopped:", result.message)
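Because start and stop are meant to be paired, it can help to wrap them so that the processes are stopped even when the code in between raises. The following is a minimal sketch, not part of the SDK: computer_use_session is a hypothetical helper, and FakeComputerUse is a stand-in that only exists so the sketch runs without a sandbox.

```python
import asyncio
from contextlib import asynccontextmanager


@asynccontextmanager
async def computer_use_session(computer_use):
    """Start the desktop processes, then stop them on exit, even on error.

    `computer_use` is any object with the async start() and stop()
    methods documented above, for example sandbox.computer_use.
    """
    await computer_use.start()
    try:
        yield computer_use
    finally:
        await computer_use.stop()


class FakeComputerUse:
    """Stand-in used only to make this sketch runnable without a sandbox."""

    def __init__(self):
        self.calls = []

    async def start(self):
        self.calls.append("start")

    async def stop(self):
        self.calls.append("stop")


async def demo():
    fake = FakeComputerUse()
    async with computer_use_session(fake):
        fake.calls.append("work")
    return fake.calls


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

With a real sandbox, the same helper would be used as `async with computer_use_session(sandbox.computer_use) as cu:` followed by mouse, keyboard, or screenshot calls on `cu`.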

AsyncComputerUse.get_status

@intercept_errors(message_prefix="Failed to get computer use status: ")
async def get_status() -> ComputerUseStatusResponse

Gets the status of all computer use processes.

Returns:

  • ComputerUseStatusResponse - Status information about all VNC desktop processes.

Example:

response = await sandbox.computer_use.get_status()
print("Computer use status:", response.status)

AsyncComputerUse.get_process_status

@intercept_errors(message_prefix="Failed to get process status: ")
async def get_process_status(process_name: str) -> ProcessStatusResponse

Gets the status of a specific VNC process.

Arguments:

  • process_name str - Name of the process to check.

Returns:

  • ProcessStatusResponse - Status information about the specific process.

Example:

xvfb_status = await sandbox.computer_use.get_process_status("xvfb")
no_vnc_status = await sandbox.computer_use.get_process_status("novnc")
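Desktop processes may take a moment to come up after start, so callers often poll a status check before sending input. The sketch below is a generic polling loop under stated assumptions: wait_until is a hypothetical helper, and the check is supplied by the caller because the fields of ProcessStatusResponse are not listed in this reference.

```python
import asyncio
import time
from typing import Awaitable, Callable


async def wait_until(
    check: Callable[[], Awaitable[bool]],
    timeout: float = 30.0,
    interval: float = 0.5,
) -> bool:
    """Poll `check` until it returns True or `timeout` seconds elapse.

    Returns True if the check succeeded, False if the deadline passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        if await check():
            return True
        if time.monotonic() >= deadline:
            return False
        await asyncio.sleep(interval)
```

A caller would pass a coroutine function that calls get_process_status for the process of interest and returns True once the response indicates the process is up; how to read that from the response is left to the caller.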

AsyncComputerUse.restart_process

@intercept_errors(message_prefix="Failed to restart process: ")
async def restart_process(process_name: str) -> ProcessRestartResponse

Restarts a specific VNC process.

Arguments:

  • process_name str - Name of the process to restart.

Returns:

  • ProcessRestartResponse - Process restart response.

Example:

result = await sandbox.computer_use.restart_process("xfce4")
print("XFCE4 process restarted:", result.message)

AsyncComputerUse.get_process_logs

@intercept_errors(message_prefix="Failed to get process logs: ")
async def get_process_logs(process_name: str) -> ProcessLogsResponse

Gets logs for a specific VNC process.

Arguments:

  • process_name str - Name of the process to get logs for.

Returns:

  • ProcessLogsResponse - Process logs.

Example:

logs = await sandbox.computer_use.get_process_logs("novnc")
print("NoVNC logs:", logs)

AsyncComputerUse.get_process_errors

@intercept_errors(message_prefix="Failed to get process errors: ")
async def get_process_errors(process_name: str) -> ProcessErrorsResponse

Gets error logs for a specific VNC process.

Arguments:

  • process_name str - Name of the process to get error logs for.

Returns:

  • ProcessErrorsResponse - Process error logs.

Example:

errors = await sandbox.computer_use.get_process_errors("x11vnc")
print("X11VNC errors:", errors)

AsyncMouse

class AsyncMouse()

Mouse operations for computer use functionality.

AsyncMouse.get_position

@intercept_errors(message_prefix="Failed to get mouse position: ")
async def get_position() -> MousePositionResponse

Gets the current mouse cursor position.

Returns:

  • MousePositionResponse - Current mouse position with x and y coordinates.

Example:

position = await sandbox.computer_use.mouse.get_position()
print(f"Mouse is at: {position.x}, {position.y}")

AsyncMouse.move

@intercept_errors(message_prefix="Failed to move mouse: ")
async def move(x: int, y: int) -> MousePositionResponse

Moves the mouse cursor to the specified coordinates.

Arguments:

  • x int - The x coordinate to move to.
  • y int - The y coordinate to move to.

Returns:

  • MousePositionResponse - Position after move.

Example:

result = await sandbox.computer_use.mouse.move(100, 200)
print(f"Mouse moved to: {result.x}, {result.y}")

AsyncMouse.click

@intercept_errors(message_prefix="Failed to click mouse: ")
async def click(x: int,
                y: int,
                button: str = "left",
                double: bool = False) -> MouseClickResponse

Clicks the mouse at the specified coordinates.

Arguments:

  • x int - The x coordinate to click at.
  • y int - The y coordinate to click at.
  • button str - The mouse button to click (‘left’, ‘right’, ‘middle’).
  • double bool - Whether to perform a double-click.

Returns:

  • MouseClickResponse - Click operation result.

Example:

# Single left click
result = await sandbox.computer_use.mouse.click(100, 200)
# Double click
double_click = await sandbox.computer_use.mouse.click(100, 200, "left", True)
# Right click
right_click = await sandbox.computer_use.mouse.click(100, 200, "right")

AsyncMouse.drag

@intercept_errors(message_prefix="Failed to drag mouse: ")
async def drag(start_x: int,
               start_y: int,
               end_x: int,
               end_y: int,
               button: str = "left") -> MouseDragResponse

Drags the mouse from start coordinates to end coordinates.

Arguments:

  • start_x int - The starting x coordinate.
  • start_y int - The starting y coordinate.
  • end_x int - The ending x coordinate.
  • end_y int - The ending y coordinate.
  • button str - The mouse button to use for dragging.

Returns:

  • MouseDragResponse - Drag operation result.

Example:

result = await sandbox.computer_use.mouse.drag(50, 50, 150, 150)
print(f"Dragged from {result.from_x},{result.from_y} to {result.to_x},{result.to_y}")

AsyncMouse.scroll

@intercept_errors(message_prefix="Failed to scroll mouse: ")
async def scroll(x: int, y: int, direction: str, amount: int = 1) -> bool

Scrolls the mouse wheel at the specified coordinates.

Arguments:

  • x int - The x coordinate to scroll at.
  • y int - The y coordinate to scroll at.
  • direction str - The direction to scroll (‘up’ or ‘down’).
  • amount int - The amount to scroll.

Returns:

  • bool - Whether the scroll operation was successful.

Example:

# Scroll up
scroll_up = await sandbox.computer_use.mouse.scroll(100, 200, "up", 3)
# Scroll down
scroll_down = await sandbox.computer_use.mouse.scroll(100, 200, "down", 5)

AsyncKeyboard

class AsyncKeyboard()

Keyboard operations for computer use functionality.

AsyncKeyboard.type

@intercept_errors(message_prefix="Failed to type text: ")
async def type(text: str, delay: Optional[int] = None) -> None

Types the specified text.

Arguments:

  • text str - The text to type.
  • delay int - Delay between characters in milliseconds.

Raises:

  • DaytonaError - If the type operation fails.

Example:

try:
    await sandbox.computer_use.keyboard.type("Hello, World!")
    print("Operation success")
except Exception as e:
    print(f"Operation failed: {e}")
# With delay between characters
try:
    await sandbox.computer_use.keyboard.type("Slow typing", 100)
    print("Operation success")
except Exception as e:
    print(f"Operation failed: {e}")

AsyncKeyboard.press

@intercept_errors(message_prefix="Failed to press key: ")
async def press(key: str, modifiers: Optional[List[str]] = None) -> None

Presses a key with optional modifiers.

Arguments:

  • key str - The key to press (e.g., ‘Enter’, ‘Escape’, ‘Tab’, ‘a’, ‘A’).
  • modifiers List[str] - Modifier keys (‘ctrl’, ‘alt’, ‘meta’, ‘shift’).

Raises:

  • DaytonaError - If the press operation fails.

Example:

# Press Enter
try:
    await sandbox.computer_use.keyboard.press("Return")
    print("Operation success")
except Exception as e:
    print(f"Operation failed: {e}")
# Press Ctrl+C
try:
    await sandbox.computer_use.keyboard.press("c", ["ctrl"])
    print("Operation success")
except Exception as e:
    print(f"Operation failed: {e}")
# Press Ctrl+Shift+T
try:
    await sandbox.computer_use.keyboard.press("t", ["ctrl", "shift"])
    print("Operation success")
except Exception as e:
    print(f"Operation failed: {e}")

AsyncKeyboard.hotkey

@intercept_errors(message_prefix="Failed to press hotkey: ")
async def hotkey(keys: str) -> None

Presses a hotkey combination.

Arguments:

  • keys str - The hotkey combination (e.g., ‘ctrl+c’, ‘alt+tab’, ‘cmd+shift+t’).

Raises:

  • DaytonaError - If the hotkey operation fails.

Example:

# Copy
try:
    await sandbox.computer_use.keyboard.hotkey("ctrl+c")
    print("Operation success")
except Exception as e:
    print(f"Operation failed: {e}")
# Paste
try:
    await sandbox.computer_use.keyboard.hotkey("ctrl+v")
    print("Operation success")
except Exception as e:
    print(f"Operation failed: {e}")
# Alt+Tab
try:
    await sandbox.computer_use.keyboard.hotkey("alt+tab")
    print("Operation success")
except Exception as e:
    print(f"Operation failed: {e}")
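The press method takes a key plus a list of modifiers, while hotkey takes a single '+'-separated string. When a program already holds the key and modifiers separately, a small helper can produce the string form. This is a sketch only: build_hotkey is a hypothetical name, and placing modifiers before the key follows the examples above ('ctrl+c', 'alt+tab') rather than any documented rule.

```python
from typing import Iterable, Optional


def build_hotkey(key: str, modifiers: Optional[Iterable[str]] = None) -> str:
    """Join modifiers and a key into a 'mod+mod+key' string."""
    if not key:
        raise ValueError("key must be a non-empty string")
    # Normalise modifier spelling; the key itself is passed through unchanged.
    parts = [m.strip().lower() for m in (modifiers or [])]
    parts.append(key)
    return "+".join(parts)
```

For example, `build_hotkey("t", ["ctrl", "shift"])` yields a string that can be passed to `sandbox.computer_use.keyboard.hotkey(...)`.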

AsyncScreenshot

class AsyncScreenshot()

Screenshot operations for computer use functionality.

AsyncScreenshot.take_full_screen

@intercept_errors(message_prefix="Failed to take screenshot: ")
async def take_full_screen(show_cursor: bool = False) -> ScreenshotResponse

Takes a screenshot of the entire screen.

Arguments:

  • show_cursor bool - Whether to show the cursor in the screenshot.

Returns:

  • ScreenshotResponse - Screenshot data with base64 encoded image.

Example:

screenshot = await sandbox.computer_use.screenshot.take_full_screen()
print(f"Screenshot size: {screenshot.width}x{screenshot.height}")
# With cursor visible
with_cursor = await sandbox.computer_use.screenshot.take_full_screen(True)

AsyncScreenshot.take_region

@intercept_errors(message_prefix="Failed to take region screenshot: ")
async def take_region(region: ScreenshotRegion,
                      show_cursor: bool = False) -> ScreenshotResponse

Takes a screenshot of a specific region.

Arguments:

  • region ScreenshotRegion - The region to capture.
  • show_cursor bool - Whether to show the cursor in the screenshot.

Returns:

  • ScreenshotResponse - Screenshot data with base64 encoded image.

Example:

region = ScreenshotRegion(x=100, y=100, width=300, height=200)
screenshot = await sandbox.computer_use.screenshot.take_region(region)
print(f"Captured region: {screenshot.region.width}x{screenshot.region.height}")

AsyncScreenshot.take_compressed

@intercept_errors(message_prefix="Failed to take compressed screenshot: ")
async def take_compressed(
        options: Optional[ScreenshotOptions] = None) -> ScreenshotResponse

Takes a compressed screenshot of the entire screen.

Arguments:

  • options ScreenshotOptions - Compression and display options.

Returns:

  • ScreenshotResponse - Compressed screenshot data.

Example:

# Default compression
screenshot = await sandbox.computer_use.screenshot.take_compressed()
# High quality JPEG
jpeg = await sandbox.computer_use.screenshot.take_compressed(
    ScreenshotOptions(fmt="jpeg", quality=95, show_cursor=True)
)
# Scaled down PNG
scaled = await sandbox.computer_use.screenshot.take_compressed(
    ScreenshotOptions(fmt="png", scale=0.5)
)

AsyncScreenshot.take_compressed_region

@intercept_errors(
    message_prefix="Failed to take compressed region screenshot: ")
async def take_compressed_region(
        region: ScreenshotRegion,
        options: Optional[ScreenshotOptions] = None) -> ScreenshotResponse

Takes a compressed screenshot of a specific region.

Arguments:

  • region ScreenshotRegion - The region to capture.
  • options ScreenshotOptions - Compression and display options.

Returns:

  • ScreenshotResponse - Compressed screenshot data.

Example:

region = ScreenshotRegion(x=0, y=0, width=800, height=600)
screenshot = await sandbox.computer_use.screenshot.take_compressed_region(
    region,
    ScreenshotOptions(fmt="webp", quality=80, show_cursor=True)
)
print(f"Compressed size: {screenshot.size_bytes} bytes")

AsyncDisplay

class AsyncDisplay()

Display operations for computer use functionality.

AsyncDisplay.get_info

@intercept_errors(message_prefix="Failed to get display info: ")
async def get_info() -> DisplayInfoResponse

Gets information about the displays.

Returns:

  • DisplayInfoResponse - Display information including primary display and all available displays.

Example:

info = await sandbox.computer_use.display.get_info()
print(f"Primary display: {info.primary_display.width}x{info.primary_display.height}")
print(f"Total displays: {info.total_displays}")
for i, display in enumerate(info.displays):
    print(f"Display {i}: {display.width}x{display.height} at {display.x},{display.y}")
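The display width and height returned here can be used to keep mouse coordinates on screen before calling move or click. The following sketch assumes the display origin is (0, 0) and that valid pixel coordinates run from 0 to width - 1 and 0 to height - 1. That convention is an assumption, and clamp_point is a hypothetical helper rather than an SDK function.

```python
from typing import Tuple


def clamp_point(x: int, y: int, width: int, height: int) -> Tuple[int, int]:
    """Clamp (x, y) into the rectangle [0, width-1] x [0, height-1]."""
    if width <= 0 or height <= 0:
        raise ValueError("display dimensions must be positive")
    clamped_x = min(max(x, 0), width - 1)
    clamped_y = min(max(y, 0), height - 1)
    return clamped_x, clamped_y
```

A caller could pass `info.primary_display.width` and `info.primary_display.height` as the bounds and forward the result to `sandbox.computer_use.mouse.move`.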

AsyncDisplay.get_windows

@intercept_errors(message_prefix="Failed to get windows: ")
async def get_windows() -> WindowsResponse

Gets the list of open windows.

Returns:

  • WindowsResponse - List of open windows with their IDs and titles.

Example:

windows = await sandbox.computer_use.display.get_windows()
print(f"Found {windows.count} open windows:")
for window in windows.windows:
    print(f"- {window.title} (ID: {window.id})")

ScreenshotRegion

class ScreenshotRegion()

Region coordinates for screenshot operations.

Attributes:

  • x int - X coordinate of the region.
  • y int - Y coordinate of the region.
  • width int - Width of the region.
  • height int - Height of the region.
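A region is described by its top-left corner plus a width and height, but callers often start from two arbitrary corner points, such as the start and end of a drag. The sketch below normalises two corners into the four values listed above. region_from_corners is a hypothetical helper, and it returns a plain dict so that it runs without the SDK installed.

```python
from typing import Dict


def region_from_corners(x1: int, y1: int, x2: int, y2: int) -> Dict[str, int]:
    """Convert two opposite corners into x, y, width, height values.

    The corners may be given in any order; x and y are always the
    top-left corner of the resulting rectangle.
    """
    left, top = min(x1, x2), min(y1, y2)
    return {
        "x": left,
        "y": top,
        "width": abs(x2 - x1),
        "height": abs(y2 - y1),
    }
```

The keys match the keyword form used in the take_region example, so the result could be unpacked as `ScreenshotRegion(**region_from_corners(100, 100, 400, 300))`.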

ScreenshotOptions

class ScreenshotOptions()

Options for screenshot compression and display.

Attributes:

  • show_cursor bool - Whether to show the cursor in the screenshot.
  • fmt str - Image format (e.g., ‘png’, ‘jpeg’, ‘webp’).
  • quality int - Compression quality (0-100).
  • scale float - Scale factor for the screenshot.
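Checking option values before sending a request can surface mistakes earlier. The sketch below validates the documented 0-100 quality range and returns only the options that were set. screenshot_option_kwargs is a hypothetical helper; rejecting non-positive scale values and using keyword names that mirror the attribute names above are both assumptions.

```python
from typing import Any, Dict, Optional


def screenshot_option_kwargs(
    fmt: Optional[str] = None,
    quality: Optional[int] = None,
    scale: Optional[float] = None,
    show_cursor: Optional[bool] = None,
) -> Dict[str, Any]:
    """Validate option values and return only the ones that were set."""
    if quality is not None and not 0 <= quality <= 100:
        raise ValueError(f"quality must be in 0-100, got {quality}")
    if scale is not None and scale <= 0:
        # Assumption: a non-positive scale factor is never meaningful.
        raise ValueError(f"scale must be positive, got {scale}")
    candidates = {
        "fmt": fmt,
        "quality": quality,
        "scale": scale,
        "show_cursor": show_cursor,
    }
    return {name: value for name, value in candidates.items() if value is not None}
```

If the constructor accepts these keyword names, the result could be unpacked as `ScreenshotOptions(**screenshot_option_kwargs(fmt="jpeg", quality=95))`.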