ComputerUse
Desktop automation operations for a Sandbox.
Provides a Java facade for computer-use features including desktop session management, screenshots, mouse and keyboard automation, display/window inspection, and screen recording.
Methods
start()
public ComputerUseStartResponse start()
Starts the computer-use desktop stack (VNC/noVNC and related processes).
Returns:
ComputerUseStartResponse - start response containing process status details
stop()
public ComputerUseStopResponse stop()
Stops all computer-use desktop processes.
Returns:
ComputerUseStopResponse - stop response containing process status details
getStatus()
public ComputerUseStatusResponse getStatus()
Returns the current computer-use status.
Returns:
ComputerUseStatusResponse - overall computer-use status
takeScreenshot()
public ScreenshotResponse takeScreenshot()
Captures a full-screen screenshot without the cursor.
Returns:
ScreenshotResponse - screenshot payload (base64 image and metadata)
takeScreenshot()
public ScreenshotResponse takeScreenshot(boolean showCursor)
Captures a full-screen screenshot.
Parameters:
showCursor boolean - whether to render the cursor in the screenshot
Returns:
ScreenshotResponse - screenshot payload (base64 image and metadata)
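Both takeScreenshot overloads are described as returning a base64 image payload. As a minimal sketch of what a caller might do with such a string, the helper below decodes it to bytes and writes it to a file. This reference does not show how the string is read from ScreenshotResponse, so the helper takes it as a plain String argument; the class and method names are illustrative and not part of the SDK, and the payload is assumed to be standard base64 with no data-URI prefix or line breaks.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

final class ScreenshotDecodeSketch {
    // Decodes a standard base64 string into raw image bytes.
    static byte[] decode(String base64Image) {
        return Base64.getDecoder().decode(base64Image);
    }

    // Writes the decoded bytes to the given path and returns that path.
    static Path save(String base64Image, Path target) throws IOException {
        return Files.write(target, decode(base64Image));
    }

    public static void main(String[] args) throws IOException {
        // Stand-in payload; in real use this would come from a ScreenshotResponse.
        String payload = Base64.getEncoder().encodeToString(new byte[] {1, 2, 3, 4});
        Path out = save(payload, Files.createTempFile("screenshot", ".png"));
        System.out.println("Wrote " + Files.size(out) + " bytes to " + out);
    }
}
```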
takeRegionScreenshot()
public ScreenshotResponse takeRegionScreenshot(int x, int y, int width, int height)
Captures a screenshot of a rectangular region without the cursor.
Parameters:
x int - region top-left X coordinate
y int - region top-left Y coordinate
width int - region width in pixels
height int - region height in pixels
Returns:
ScreenshotResponse - region screenshot payload
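This reference does not say how a region that extends past the display is handled. A caller who wants to be defensive could clip the rectangle before passing it to takeRegionScreenshot. The following is only a sketch under that assumption; the helper name and the idea of clipping on the client side are not part of the SDK, and the screen size would have to come from elsewhere (for example, from display information).

```java
final class RegionClampSketch {
    // Returns {x, y, width, height} clipped to a display of screenWidth x screenHeight pixels.
    static int[] clamp(int x, int y, int width, int height, int screenWidth, int screenHeight) {
        int left = Math.max(0, Math.min(x, screenWidth));
        int top = Math.max(0, Math.min(y, screenHeight));
        // The right and bottom edges never fall before the left and top edges.
        int right = Math.max(left, Math.min(x + width, screenWidth));
        int bottom = Math.max(top, Math.min(y + height, screenHeight));
        return new int[] {left, top, right - left, bottom - top};
    }

    public static void main(String[] args) {
        int[] r = clamp(1800, 1000, 400, 200, 1920, 1080);
        System.out.println(java.util.Arrays.toString(r)); // prints [1800, 1000, 120, 80]
    }
}
```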
takeCompressedScreenshot()
public ScreenshotResponse takeCompressedScreenshot(String format, int quality, double scale)
Captures a compressed full-screen screenshot.
Parameters:
format String - output image format (for example: png, jpeg, webp)
quality int - compression quality (typically 1-100, format dependent)
scale double - screenshot scale factor (for example: 0.5 for 50%)
Returns:
ScreenshotResponse - compressed screenshot payload
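To make the quality and scale parameters concrete, the sketch below keeps a quality value inside the typical 1-100 range and estimates the output dimensions for a given scale factor. Both helpers are hypothetical and run entirely on the caller's side; in particular, rounding to the nearest pixel is an assumption, since this reference does not state how the service rounds scaled dimensions.

```java
final class CompressedScreenshotArgsSketch {
    // Keeps quality inside the typical 1-100 range mentioned above.
    static int clampQuality(int quality) {
        return Math.max(1, Math.min(100, quality));
    }

    // Estimates {width, height} after scaling; rounding to nearest pixel is an assumption.
    static int[] scaledSize(int width, int height, double scale) {
        return new int[] {(int) Math.round(width * scale), (int) Math.round(height * scale)};
    }

    public static void main(String[] args) {
        int[] size = scaledSize(1920, 1080, 0.5);
        System.out.println(size[0] + "x" + size[1]); // prints 960x540
        System.out.println(clampQuality(150));       // prints 100
    }
}
```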
click()
public MouseClickResponse click(int x, int y)
Performs a left mouse click at the given coordinates.
Parameters:
x int - target X coordinate
y int - target Y coordinate
Returns:
MouseClickResponse - click response with resulting cursor position
click()
public MouseClickResponse click(int x, int y, String button)
Performs a mouse click at the given coordinates with a specific button.
Parameters:
x int - target X coordinate
y int - target Y coordinate
button String - button type (left, right, middle)
Returns:
MouseClickResponse - click response with resulting cursor position
doubleClick()
public MouseClickResponse doubleClick(int x, int y)
Performs a double left-click at the given coordinates.
Parameters:
x int - target X coordinate
y int - target Y coordinate
Returns:
MouseClickResponse - click response with resulting cursor position
moveMouse()
public MousePositionResponse moveMouse(int x, int y)
Moves the mouse cursor to the given coordinates.
Parameters:
x int - target X coordinate
y int - target Y coordinate
Returns:
MousePositionResponse - new mouse position
getMousePosition()
public MousePositionResponse getMousePosition()
Returns the current mouse position.
Returns:
MousePositionResponse - current mouse cursor coordinates
drag()
public MouseDragResponse drag(int startX, int startY, int endX, int endY)
Drags the mouse from one point to another using the left button.
Parameters:
startX int - drag start X coordinate
startY int - drag start Y coordinate
endX int - drag end X coordinate
endY int - drag end Y coordinate
Returns:
MouseDragResponse - drag response with resulting cursor position
scroll()
public ScrollResponse scroll(int x, int y, int deltaX, int deltaY)
Scrolls at the given coordinates.
The current toolbox API supports directional scrolling (up/down) with an
amount. This method maps deltaY to vertical scroll direction and magnitude.
If deltaY is 0, deltaX is used as a fallback.
Parameters:
x int - anchor X coordinate
y int - anchor Y coordinate
deltaX int - horizontal delta (used only when deltaY == 0)
deltaY int - vertical delta
Returns:
ScrollResponse - scroll response indicating operation success
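The delta-to-direction mapping described for scroll() can be sketched roughly as follows. This is not the SDK's implementation: the sign convention (a positive delta scrolls down, a negative one scrolls up) is an assumption, as are the class, field, and method names. Only the precedence rule, deltaY first and deltaX when deltaY is 0, comes from the description above.

```java
final class ScrollMappingSketch {
    // A direction plus a magnitude, the shape the toolbox API is described as accepting.
    static final class ScrollCommand {
        final String direction;
        final int amount;

        ScrollCommand(String direction, int amount) {
            this.direction = direction;
            this.amount = amount;
        }
    }

    static ScrollCommand toCommand(int deltaX, int deltaY) {
        // deltaY takes priority; deltaX is consulted only when deltaY == 0.
        int delta = (deltaY != 0) ? deltaY : deltaX;
        // Assumption: a non-negative delta means "down", a negative delta means "up".
        String direction = (delta >= 0) ? "down" : "up";
        return new ScrollCommand(direction, Math.abs(delta));
    }

    public static void main(String[] args) {
        ScrollCommand c = toCommand(0, -3);
        System.out.println(c.direction + " " + c.amount); // prints up 3
    }
}
```

Under this mapping a purely horizontal request such as toCommand(5, 0) still yields a vertical command, which matches the note that the toolbox API only supports up/down scrolling.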
typeText()
public void typeText(String text)
Types text using keyboard automation.
Parameters:
text String - text to type
pressKey()
public void pressKey(String key)
Presses a single key.
Parameters:
key String - key to press (for example: Enter, Escape, a)
pressHotkey()
public void pressHotkey(String... keys)
Presses a key combination as a hotkey sequence.
Keys are joined with + before being sent (for example,
pressHotkey("ctrl", "shift", "t") -> "ctrl+shift+t").
Parameters:
keys String... - hotkey parts to combine
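The joining step described for pressHotkey() amounts to a plain string join. The snippet below reproduces only that step as an illustration; the class and method names are made up for the example, and it does not call the SDK or send any key events.

```java
final class HotkeyJoinSketch {
    // Joins hotkey parts with '+', as described for pressHotkey().
    static String joinHotkey(String... keys) {
        return String.join("+", keys);
    }

    public static void main(String[] args) {
        System.out.println(joinHotkey("ctrl", "shift", "t")); // prints ctrl+shift+t
    }
}
```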
getDisplayInfo()
public DisplayInfoResponse getDisplayInfo()
Returns display configuration information.
Returns:
DisplayInfoResponse - display information including available displays and their geometry
getWindows()
public WindowsResponse getWindows()
Returns currently open windows.
Returns:
WindowsResponse - window list and metadata
startRecording()
public Recording startRecording()
Starts a recording with default options.
Returns:
Recording - newly started recording metadata
startRecording()
public Recording startRecording(String label)
Starts a recording with an optional label.
Parameters:
label String - optional recording label
Returns:
Recording - newly started recording metadata
stopRecording()
public Recording stopRecording(String id)
Stops an active recording.
Parameters:
id String - recording identifier
Returns:
Recording - finalized recording metadata
listRecordings()
public ListRecordingsResponse listRecordings()
Lists all recordings for the current sandbox session.
Returns:
ListRecordingsResponse - recordings list response
getRecording()
public Recording getRecording(String id)
Returns metadata for a specific recording.
Parameters:
id String - recording identifier
Returns:
Recording - recording details
downloadRecording()
public File downloadRecording(String id)
Downloads a recording file.
Parameters:
id String - recording identifier
Returns:
File - downloaded temporary/local file handle returned by the API client
deleteRecording()
public void deleteRecording(String id)
Deletes a recording.
Parameters:
id String - recording identifier