ComputerUse
Desktop automation operations for a Sandbox.
Provides a Java facade for computer-use features including desktop session management, screenshots, mouse and keyboard automation, display/window inspection, and screen recording.
Methods
start()
public ComputerUseStartResponse start()
Starts the computer-use desktop stack (VNC/noVNC and related processes).
Returns:
ComputerUseStartResponse - start response containing process status details
stop()
public ComputerUseStopResponse stop()
Stops all computer-use desktop processes.
Returns:
ComputerUseStopResponse - stop response containing process status details
getStatus()
public ComputerUseStatusResponse getStatus()
Returns the current computer-use status.
Returns:
ComputerUseStatusResponse - overall computer-use status
getAccessibilityTree()
public AccessibilityTreeResponse getAccessibilityTree()
Fetches the focused AT-SPI accessibility tree.
Returns:
AccessibilityTreeResponse - accessibility tree response
getAccessibilityTree()
public AccessibilityTreeResponse getAccessibilityTree(String scope, Integer pid, Integer maxDepth)
Fetches an AT-SPI accessibility tree.
Parameters:
scope (String) - scope to inspect (focused, pid, or all)
pid (Integer) - process ID when scope is pid
maxDepth (Integer) - max tree depth (0 for root only)
Returns:
AccessibilityTreeResponse - accessibility tree response
findAccessibilityNodes()
public AccessibilityNodesResponse findAccessibilityNodes()
Finds AT-SPI accessibility nodes without filters.
Returns:
AccessibilityNodesResponse - matching accessibility nodes
findAccessibilityNodes()
public AccessibilityNodesResponse findAccessibilityNodes(FindAccessibilityNodesRequest request)
Finds AT-SPI accessibility nodes using a generated toolbox request.
Parameters:
request (FindAccessibilityNodesRequest) - generated accessibility find request
Returns:
AccessibilityNodesResponse - matching accessibility nodes
focusAccessibilityNode()
public void focusAccessibilityNode(String id)
Focuses an AT-SPI accessibility node.
Parameters:
id (String) - accessibility node ID
invokeAccessibilityNode()
public void invokeAccessibilityNode(String id)
Invokes an AT-SPI accessibility node's primary action.
Parameters:
id (String) - accessibility node ID
invokeAccessibilityNode()
public void invokeAccessibilityNode(String id, String action)
Invokes an AT-SPI accessibility node action.
Parameters:
id (String) - accessibility node ID
action (String) - action name, or null for the primary action
setAccessibilityNodeValue()
public void setAccessibilityNodeValue(String id, String value)
Sets an AT-SPI accessibility node value.
Parameters:
id (String) - accessibility node ID
value (String) - value to write
takeScreenshot()
public ScreenshotResponse takeScreenshot()
Captures a full-screen screenshot without the cursor.
Returns:
ScreenshotResponse - screenshot payload (base64 image and metadata)
takeScreenshot()
public ScreenshotResponse takeScreenshot(boolean showCursor)
Captures a full-screen screenshot.
Parameters:
showCursor (boolean) - whether to render the cursor in the screenshot
Returns:
ScreenshotResponse - screenshot payload (base64 image and metadata)
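Since the screenshot payload is described as a base64 image, a caller will usually want to turn it back into bytes. The sketch below shows only that step with the JDK's `Base64` class. The accessor that exposes the base64 string on `ScreenshotResponse` is not shown in this reference, so the sketch starts from a plain `String`. The names `ScreenshotDecodeSketch`, `decode`, and `save` are hypothetical, and the payload is assumed to be standard base64 with no data-URI prefix.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

// Hypothetical helper, not part of the SDK: decodes a base64 image string.
class ScreenshotDecodeSketch {

    // ASSUMPTION: the payload is standard base64 without a "data:" prefix.
    static byte[] decode(String base64Image) {
        return Base64.getDecoder().decode(base64Image);
    }

    // Writes the decoded bytes to the given path and returns that path.
    static Path save(String base64Image, Path target) throws IOException {
        return Files.write(target, decode(base64Image));
    }

    public static void main(String[] args) throws IOException {
        // Stand-in payload: the first four bytes of the PNG signature.
        String sample = "iVBORw==";
        Path out = Files.createTempFile("screenshot-", ".bin");
        save(sample, out);
        System.out.println(Files.size(out)); // prints 4
    }
}
```

In real use, the string passed to `decode` would come from the response object returned by `takeScreenshot()`.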
takeRegionScreenshot()
public ScreenshotResponse takeRegionScreenshot(int x, int y, int width, int height)
Captures a screenshot of a rectangular region without the cursor.
Parameters:
x (int) - region top-left X coordinate
y (int) - region top-left Y coordinate
width (int) - region width in pixels
height (int) - region height in pixels
Returns:
ScreenshotResponse - region screenshot payload
takeCompressedScreenshot()
public ScreenshotResponse takeCompressedScreenshot(String format, int quality, double scale)
Captures a compressed full-screen screenshot.
Parameters:
format (String) - output image format (for example: png, jpeg, webp)
quality (int) - compression quality (typically 1-100, format dependent)
scale (double) - screenshot scale factor (for example: 0.5 for 50%)
Returns:
ScreenshotResponse - compressed screenshot payload
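As a rough illustration of the `quality` and `scale` parameters, the sketch below clamps a quality value into the typical 1-100 range and estimates the dimensions a scaled screenshot would have. Both helpers are hypothetical and run entirely on the caller's side. The rounding rule is an assumption, and the service may round or validate differently.

```java
// Hypothetical argument helpers, not part of the SDK.
class CompressedScreenshotArgsSketch {

    // Keeps quality inside the "typically 1-100" range mentioned above.
    static int clampQuality(int quality) {
        return Math.max(1, Math.min(100, quality));
    }

    // ASSUMPTION: scaled dimensions are rounded to the nearest pixel.
    static int scaledSize(int pixels, double scale) {
        return (int) Math.round(pixels * scale);
    }

    public static void main(String[] args) {
        // A 1920x1080 display captured at scale 0.5.
        System.out.println(scaledSize(1920, 0.5) + "x" + scaledSize(1080, 0.5)); // prints 960x540
        System.out.println(clampQuality(150)); // prints 100
    }
}
```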
click()
public MouseClickResponse click(int x, int y)
Performs a left mouse click at the given coordinates.
Parameters:
x (int) - target X coordinate
y (int) - target Y coordinate
Returns:
MouseClickResponse - click response with resulting cursor position
click()
public MouseClickResponse click(int x, int y, String button)
Performs a mouse click at the given coordinates with a specific button.
Parameters:
x (int) - target X coordinate
y (int) - target Y coordinate
button (String) - button type (left, right, middle)
Returns:
MouseClickResponse - click response with resulting cursor position
doubleClick()
public MouseClickResponse doubleClick(int x, int y)
Performs a double left-click at the given coordinates.
Parameters:
x (int) - target X coordinate
y (int) - target Y coordinate
Returns:
MouseClickResponse - click response with resulting cursor position
moveMouse()
public MousePositionResponse moveMouse(int x, int y)
Moves the mouse cursor to the given coordinates.
Parameters:
x (int) - target X coordinate
y (int) - target Y coordinate
Returns:
MousePositionResponse - new mouse position
getMousePosition()
public MousePositionResponse getMousePosition()
Returns the current mouse position.
Returns:
MousePositionResponse - current mouse cursor coordinates
drag()
public MouseDragResponse drag(int startX, int startY, int endX, int endY)
Drags the mouse from one point to another using the left button.
Parameters:
startX (int) - drag start X coordinate
startY (int) - drag start Y coordinate
endX (int) - drag end X coordinate
endY (int) - drag end Y coordinate
Returns:
MouseDragResponse - drag response with resulting cursor position
scroll()
public ScrollResponse scroll(int x, int y, int deltaX, int deltaY)
Scrolls at the given coordinates.
The current toolbox API supports directional scrolling (up/down) with an amount. This method maps deltaY to vertical scroll direction and magnitude. If deltaY is 0, deltaX is used as a fallback.
Parameters:
x (int) - anchor X coordinate
y (int) - anchor Y coordinate
deltaX (int) - horizontal delta (used only when deltaY == 0)
deltaY (int) - vertical delta
Returns:
ScrollResponse - scroll response indicating operation success
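The delta-to-direction mapping described for `scroll()` can be pictured with a small standalone sketch. It is not the SDK's implementation: `ScrollMappingSketch`, `ScrollCommand`, and `fromDeltas` are hypothetical names, and the sign convention (positive delta scrolls down, negative scrolls up) is an assumption, since this reference does not state it.

```java
// Hypothetical sketch of mapping (deltaX, deltaY) to a direction plus an amount.
class ScrollMappingSketch {

    // Direction and amount, the shape a directional scroll API would take.
    static final class ScrollCommand {
        final String direction;
        final int amount;

        ScrollCommand(String direction, int amount) {
            this.direction = direction;
            this.amount = amount;
        }
    }

    static ScrollCommand fromDeltas(int deltaX, int deltaY) {
        // deltaY takes priority; deltaX is consulted only when deltaY is 0.
        int delta = (deltaY != 0) ? deltaY : deltaX;
        // ASSUMPTION: positive deltas scroll down, negative deltas scroll up.
        String direction = (delta >= 0) ? "down" : "up";
        // The magnitude of the chosen delta becomes the scroll amount.
        return new ScrollCommand(direction, Math.abs(delta));
    }

    public static void main(String[] args) {
        ScrollCommand c = fromDeltas(0, -3);
        System.out.println(c.direction + " " + c.amount); // prints up 3
    }
}
```

Under this sketch, passing both deltas as 0 yields an amount of 0, so no scrolling would be requested.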
typeText()
public void typeText(String text)
Types text using keyboard automation.
Parameters:
text (String) - text to type
pressKey()
public void pressKey(String key)
Presses a single key.
Parameters:
key (String) - key to press. Canonical names include enter, escape, tab, letters, digits, unshifted punctuation, function keys, and grammar-safe numpad names such as num_plus. Named keys are case-insensitive, and common aliases such as Return and Escape are normalized.
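To make the key-name contract more concrete, here is a minimal sketch of what case-insensitive normalization with aliases could look like. It is an illustration only: the class `KeyNameSketch`, the one-entry alias table, and the choice to leave single-character keys untouched are all assumptions. The full alias list, and whether normalization happens in the client or in the toolbox, are not given in this reference.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch of named-key normalization, not the SDK's code.
class KeyNameSketch {

    // ASSUMPTION: illustrative alias table; the real list is not documented here.
    private static final Map<String, String> ALIASES = Map.of("return", "enter");

    static String normalize(String key) {
        // ASSUMPTION: single characters (letters, digits, punctuation) pass through unchanged.
        if (key.length() <= 1) {
            return key;
        }
        // Named keys are case-insensitive, so compare in lower case.
        String lower = key.toLowerCase(Locale.ROOT);
        // Replace a known alias with its canonical name, otherwise keep the name.
        return ALIASES.getOrDefault(lower, lower);
    }

    public static void main(String[] args) {
        System.out.println(normalize("Return")); // prints enter
        System.out.println(normalize("Escape")); // prints escape
    }
}
```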
pressHotkey()
public void pressHotkey(String... keys)
Presses a key combination as a hotkey sequence.
Keys are joined with + before being sent (for example, pressHotkey("ctrl", "shift", "t") -> "ctrl+shift+t"). The resulting value is a single atomic chord and uses the same normalized key contract as pressKey(String).
Parameters:
keys (String...) - hotkey parts to combine
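The joining rule for hotkey parts can be reproduced in plain Java. This is a minimal sketch of the documented behavior rather than the SDK's internal code, and `HotkeyJoinSketch` and `joinKeys` are hypothetical names introduced for illustration.

```java
// Hypothetical helper mirroring the documented rule: hotkey parts are joined with "+".
class HotkeyJoinSketch {

    static String joinKeys(String... keys) {
        // String.join places "+" between the parts, in the order given.
        return String.join("+", keys);
    }

    public static void main(String[] args) {
        System.out.println(joinKeys("ctrl", "shift", "t")); // prints ctrl+shift+t
    }
}
```

Because the joined value is treated as one chord, the order of the parts is preserved exactly as passed.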
getDisplayInfo()
public DisplayInfoResponse getDisplayInfo()
Returns display configuration information.
Returns:
DisplayInfoResponse - display information including available displays and their geometry
getWindows()
public WindowsResponse getWindows()
Returns the currently open windows.
Returns:
WindowsResponse - window list and metadata
startRecording()
public Recording startRecording()
Starts a recording with default options.
Returns:
Recording - newly started recording metadata
startRecording()
public Recording startRecording(String label)
Starts a recording with an optional label.
Parameters:
label (String) - optional recording label
Returns:
Recording - newly started recording metadata
stopRecording()
public Recording stopRecording(String id)
Stops an active recording.
Parameters:
id (String) - recording identifier
Returns:
Recording - finalized recording metadata
listRecordings()
public ListRecordingsResponse listRecordings()
Lists all recordings for the current sandbox session.
Returns:
ListRecordingsResponse - recordings list response
getRecording()
public Recording getRecording(String id)
Returns metadata for a specific recording.
Parameters:
id (String) - recording identifier
Returns:
Recording - recording details
downloadRecording()
public File downloadRecording(String id)
Downloads a recording file.
Parameters:
id (String) - recording identifier
Returns:
File - downloaded temporary/local file handle returned by the API client
deleteRecording()
public void deleteRecording(String id)
Deletes a recording.
Parameters:
id (String) - recording identifier