ComputerUse

Desktop automation operations for a Sandbox.

Provides a Java facade for computer-use features including desktop session management, screenshots, mouse and keyboard automation, display/window inspection, and screen recording.

Methods

start()

public ComputerUseStartResponse start()

Starts the computer-use desktop stack (VNC/noVNC and related processes).

Returns:

  • ComputerUseStartResponse - start response containing process status details

stop()

public ComputerUseStopResponse stop()

Stops all computer-use desktop processes.

Returns:

  • ComputerUseStopResponse - stop response containing process status details

getStatus()

public ComputerUseStatusResponse getStatus()

Returns the current computer-use status.

Returns:

  • ComputerUseStatusResponse - overall computer-use status

takeScreenshot()

public ScreenshotResponse takeScreenshot()

Captures a full-screen screenshot without the cursor.

Returns:

  • ScreenshotResponse - screenshot payload (base64 image and metadata)

takeScreenshot()

public ScreenshotResponse takeScreenshot(boolean showCursor)

Captures a full-screen screenshot, with or without the cursor.

Parameters:

  • showCursor boolean - whether to render the cursor in the screenshot

Returns:

  • ScreenshotResponse - screenshot payload (base64 image and metadata)
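Because the payload is described as a base64 image, a caller usually has to decode it before saving or inspecting it. The following is a minimal sketch using only the JDK. `ScreenshotDecodeSketch`, `decode`, and `save` are hypothetical names, and the sketch assumes you have already obtained the base64 string from the `ScreenshotResponse` (the accessor for it is not shown in this reference) and that the string carries no data-URI prefix or line breaks.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

/** Hypothetical helper, not part of the SDK. */
final class ScreenshotDecodeSketch {

    /** Decodes a plain base64 string into raw image bytes. */
    static byte[] decode(String base64Image) {
        return Base64.getDecoder().decode(base64Image);
    }

    /** Decodes the payload and writes the bytes to the given file. */
    static Path save(String base64Image, Path target) throws IOException {
        return Files.write(target, decode(base64Image));
    }

    public static void main(String[] args) throws IOException {
        // Stand-in payload: base64 of the 8-byte PNG signature, not a real screenshot.
        String payload = "iVBORw0KGgo=";
        Path file = save(payload, Files.createTempFile("screenshot", ".png"));
        System.out.println("wrote " + Files.size(file) + " bytes to " + file);
    }
}
```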

takeRegionScreenshot()

public ScreenshotResponse takeRegionScreenshot(int x, int y, int width, int height)

Captures a screenshot of a rectangular region without the cursor.

Parameters:

  • x int - region top-left X coordinate
  • y int - region top-left Y coordinate
  • width int - region width in pixels
  • height int - region height in pixels

Returns:

  • ScreenshotResponse - region screenshot payload
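This reference does not say how a region that extends past the display is handled, so a caller may want to check the rectangle first. The sketch below is one way to do that. `RegionBoundsSketch` and `fitsDisplay` are hypothetical names, and the display width and height are assumed to come from elsewhere (for example, from the display information).

```java
/** Hypothetical helper, not part of the SDK. */
final class RegionBoundsSketch {

    /**
     * Returns true when the rectangle with top-left corner (x, y) lies
     * entirely inside a display of the given size.
     */
    static boolean fitsDisplay(int x, int y, int width, int height,
                               int displayWidth, int displayHeight) {
        if (x < 0 || y < 0 || width <= 0 || height <= 0) {
            return false;
        }
        // Use long arithmetic so x + width cannot overflow int.
        return (long) x + width <= displayWidth
            && (long) y + height <= displayHeight;
    }

    public static void main(String[] args) {
        System.out.println(fitsDisplay(100, 100, 400, 300, 1920, 1080));
        System.out.println(fitsDisplay(1800, 0, 200, 100, 1920, 1080));
    }
}
```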

takeCompressedScreenshot()

public ScreenshotResponse takeCompressedScreenshot(String format, int quality, double scale)

Captures a compressed full-screen screenshot.

Parameters:

  • format String - output image format (for example: png, jpeg, webp)
  • quality int - compression quality (typically 1-100, format dependent)
  • scale double - screenshot scale factor (for example: 0.5 for 50%)

Returns:

  • ScreenshotResponse - compressed screenshot payload
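To make the `scale` and `quality` parameters concrete, here is a small sketch of the arithmetic a caller might do before requesting a compressed screenshot. `ScreenshotScaleSketch`, `scaledSize`, and `clampQuality` are hypothetical names. Rounding to the nearest pixel and clamping quality to 1-100 are assumptions; the service may round or validate differently.

```java
/** Hypothetical helper, not part of the SDK. */
final class ScreenshotScaleSketch {

    /** Returns {width, height} after applying the scale factor. */
    static int[] scaledSize(int width, int height, double scale) {
        if (width <= 0 || height <= 0 || !(scale > 0) || Double.isInfinite(scale)) {
            throw new IllegalArgumentException("width, height and scale must be positive and finite");
        }
        // ASSUMPTION: round to the nearest pixel, never below 1.
        int w = (int) Math.max(1L, Math.round(width * scale));
        int h = (int) Math.max(1L, Math.round(height * scale));
        return new int[] {w, h};
    }

    /** ASSUMPTION: quality is limited to the "typical" 1-100 range. */
    static int clampQuality(int quality) {
        return Math.max(1, Math.min(100, quality));
    }

    public static void main(String[] args) {
        int[] size = scaledSize(1920, 1080, 0.5);
        System.out.println(size[0] + "x" + size[1]); // prints "960x540"
    }
}
```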

click()

public MouseClickResponse click(int x, int y)

Performs a left mouse click at the given coordinates.

Parameters:

  • x int - target X coordinate
  • y int - target Y coordinate

Returns:

  • MouseClickResponse - click response with resulting cursor position

click()

public MouseClickResponse click(int x, int y, String button)

Performs a mouse click at the given coordinates with a specific button.

Parameters:

  • x int - target X coordinate
  • y int - target Y coordinate
  • button String - button type (left, right, middle)

Returns:

  • MouseClickResponse - click response with resulting cursor position

doubleClick()

public MouseClickResponse doubleClick(int x, int y)

Performs a double left-click at the given coordinates.

Parameters:

  • x int - target X coordinate
  • y int - target Y coordinate

Returns:

  • MouseClickResponse - click response with resulting cursor position

moveMouse()

public MousePositionResponse moveMouse(int x, int y)

Moves the mouse cursor to the given coordinates.

Parameters:

  • x int - target X coordinate
  • y int - target Y coordinate

Returns:

  • MousePositionResponse - new mouse position

getMousePosition()

public MousePositionResponse getMousePosition()

Returns the current mouse position.

Returns:

  • MousePositionResponse - current mouse cursor coordinates

drag()

public MouseDragResponse drag(int startX, int startY, int endX, int endY)

Drags the mouse from one point to another using the left button.

Parameters:

  • startX int - drag start X coordinate
  • startY int - drag start Y coordinate
  • endX int - drag end X coordinate
  • endY int - drag end Y coordinate

Returns:

  • MouseDragResponse - drag response with resulting cursor position

scroll()

public ScrollResponse scroll(int x, int y, int deltaX, int deltaY)

Scrolls at the given coordinates.

The current toolbox API supports directional scrolling (up/down) with an amount. This method maps deltaY to vertical scroll direction and magnitude. If deltaY is 0, deltaX is used as a fallback.

Parameters:

  • x int - anchor X coordinate
  • y int - anchor Y coordinate
  • deltaX int - horizontal delta (used only when deltaY == 0)
  • deltaY int - vertical delta

Returns:

  • ScrollResponse - scroll response indicating operation success
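The delta-to-direction mapping described above can be sketched as follows. This is an illustration, not the SDK's code: `ScrollMappingSketch`, `ScrollCommand`, and `toCommand` are hypothetical names, and the sign convention (negative delta scrolls up, positive scrolls down) is an assumption that this reference does not confirm.

```java
/** Hypothetical helper, not part of the SDK. */
final class ScrollMappingSketch {

    /** Direction ("up" or "down") plus a non-negative amount. */
    static final class ScrollCommand {
        final String direction;
        final int amount;

        ScrollCommand(String direction, int amount) {
            this.direction = direction;
            this.amount = amount;
        }
    }

    static ScrollCommand toCommand(int deltaX, int deltaY) {
        // deltaY takes priority; deltaX is consulted only when deltaY == 0.
        int delta = (deltaY != 0) ? deltaY : deltaX;
        // ASSUMPTION: negative means "up", zero or positive means "down".
        String direction = (delta < 0) ? "up" : "down";
        // Math.abs(Integer.MIN_VALUE) is still negative, so cap that case.
        int amount = (delta == Integer.MIN_VALUE) ? Integer.MAX_VALUE : Math.abs(delta);
        return new ScrollCommand(direction, amount);
    }

    public static void main(String[] args) {
        ScrollCommand command = toCommand(0, -3);
        System.out.println(command.direction + " " + command.amount); // prints "up 3"
    }
}
```

Note that under this mapping a horizontal delta still produces a vertical scroll, which matches the up/down limitation stated above.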

typeText()

public void typeText(String text)

Types text using keyboard automation.

Parameters:

  • text String - text to type

pressKey()

public void pressKey(String key)

Presses a single key.

Parameters:

  • key String - key to press (for example: Enter, Escape, a)

pressHotkey()

public void pressHotkey(String... keys)

Presses a key combination as a hotkey sequence.

Keys are joined with + before being sent (for example, pressHotkey("ctrl", "shift", "t") -> "ctrl+shift+t").

Parameters:

  • keys String... - hotkey parts to combine
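The joining step can be reproduced with the JDK's `String.join`. The sketch below only mirrors the documented behavior. `HotkeyJoinSketch` and `joinHotkey` are hypothetical names, and rejecting an empty key list is an assumption about sensible validation, not documented SDK behavior.

```java
/** Hypothetical helper, not part of the SDK. */
final class HotkeyJoinSketch {

    /** Joins hotkey parts with "+", as described for pressHotkey. */
    static String joinHotkey(String... keys) {
        // ASSUMPTION: an empty combination is treated as a caller error.
        if (keys == null || keys.length == 0) {
            throw new IllegalArgumentException("at least one key is required");
        }
        return String.join("+", keys);
    }

    public static void main(String[] args) {
        System.out.println(joinHotkey("ctrl", "shift", "t")); // prints "ctrl+shift+t"
    }
}
```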

getDisplayInfo()

public DisplayInfoResponse getDisplayInfo()

Returns display configuration information.

Returns:

  • DisplayInfoResponse - display information including available displays and their geometry

getWindows()

public WindowsResponse getWindows()

Returns the currently open windows.

Returns:

  • WindowsResponse - window list and metadata

startRecording()

public Recording startRecording()

Starts a recording with default options.

Returns:

  • Recording - newly started recording metadata

startRecording()

public Recording startRecording(String label)

Starts a recording with an optional label.

Parameters:

  • label String - optional recording label

Returns:

  • Recording - newly started recording metadata

stopRecording()

public Recording stopRecording(String id)

Stops an active recording.

Parameters:

  • id String - recording identifier

Returns:

  • Recording - finalized recording metadata

listRecordings()

public ListRecordingsResponse listRecordings()

Lists all recordings for the current sandbox session.

Returns:

  • ListRecordingsResponse - recordings list response

getRecording()

public Recording getRecording(String id)

Returns metadata for a specific recording.

Parameters:

  • id String - recording identifier

Returns:

  • Recording - recording details

downloadRecording()

public File downloadRecording(String id)

Downloads a recording file.

Parameters:

  • id String - recording identifier

Returns:

  • File - downloaded temporary/local file handle returned by the API client
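The recording methods are typically used together: start, do some work, stop, then download. The sketch below shows one way to make sure a recording is always stopped. It does not call the SDK directly. `RecordingLifecycleSketch`, `RecordingOps`, and `recordWhile` are hypothetical names, and the interface passes plain string ids because the accessor for a `Recording` identifier is not shown in this reference.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of the SDK. */
final class RecordingLifecycleSketch {

    /** Stand-in for startRecording / stopRecording / downloadRecording. */
    interface RecordingOps {
        String start(String label);
        void stop(String id);
        File download(String id);
    }

    /** Records while the work runs; stops the recording even if the work throws. */
    static File recordWhile(RecordingOps ops, String label, Runnable work) {
        String id = ops.start(label);
        try {
            work.run();
        } finally {
            ops.stop(id);
        }
        // Reached only when the work completed normally.
        return ops.download(id);
    }

    public static void main(String[] args) {
        List<String> calls = new ArrayList<>();
        // In-memory fake so the sketch runs without a sandbox.
        RecordingOps fake = new RecordingOps() {
            @Override public String start(String label) { calls.add("start"); return "rec-1"; }
            @Override public void stop(String id) { calls.add("stop"); }
            @Override public File download(String id) { calls.add("download"); return new File(id + ".bin"); }
        };
        File file = recordWhile(fake, "demo", () -> calls.add("work"));
        System.out.println(calls + " -> " + file.getName());
    }
}
```

In real code, an adapter implementing `RecordingOps` would delegate to the corresponding `ComputerUse` methods.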

deleteRecording()

public void deleteRecording(String id)

Deletes a recording.

Parameters:

  • id String - recording identifier