Lybic Docs

Lybic Python SDK Usage Guide

The Lybic Python SDK provides several classes to interact with the Lybic API.

Adapting LLM Output in pyautogui Format

To facilitate the execution of GUI automation scripts generated by Large Language Models (LLMs), which are often trained using the popular pyautogui library, the Lybic SDK provides a Pyautogui compatibility class. This class mirrors the pyautogui interface, allowing you to execute LLM-generated code with minimal changes.

This API requires SDK version 0.6.0 or later.

Usage

First, initialize the LybicClient and then create a Pyautogui instance, binding it to a specific sandbox. You can then use this instance as if it were the pyautogui module.

import asyncio
from lybic import LybicClient, Pyautogui

async def main():
    async with LybicClient() as client:
        # Assume you have a sandbox
        sandbox_id = "your_sandbox_id"

        # Create a Pyautogui instance
        pyautogui = Pyautogui(client, sandbox_id)

        # Now you can execute pyautogui-style commands
        # For example, if an LLM outputs the following string:
        llm_output = "pyautogui.moveTo(100, 150)"

        # You can execute it like this:
        # Warning: Using eval() on untrusted input is a security risk.
        # Always sanitize and validate LLM output.
        eval(llm_output)

        # Or call methods directly
        pyautogui.click(x=200, y=200)
        pyautogui.write("Hello from Lybic!")
        pyautogui.press("enter")

if __name__ == "__main__":
    asyncio.run(main())
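The warning about eval() can be made concrete. The following is a minimal sketch, not part of the Lybic SDK, of one way to validate a single LLM-generated call before running it: the string is parsed with the standard ast module, only allowlisted function names are accepted, and arguments must be plain literals. The names run_llm_call, ALLOWED_CALLS, and Recorder are hypothetical; Recorder stands in for a real Pyautogui instance so the example runs without a sandbox.

```python
import ast

# Assumption for this sketch: the allowlist mirrors the functions this guide
# lists as supported; adjust it to what your application should permit.
ALLOWED_CALLS = {
    "position", "moveTo", "move", "click",
    "doubleClick", "write", "press", "hotkey",
}


def run_llm_call(source, target, name="pyautogui"):
    """Run one `pyautogui.<func>(...)` call from `source` on `target` without eval()."""
    call = ast.parse(source.strip(), mode="eval").body
    is_plain_call = (
        isinstance(call, ast.Call)
        and isinstance(call.func, ast.Attribute)
        and isinstance(call.func.value, ast.Name)
        and call.func.value.id == name
    )
    if not is_plain_call:
        raise ValueError(f"expected a single {name}.<function>(...) call")
    if call.func.attr not in ALLOWED_CALLS:
        raise ValueError(f"function not allowed: {call.func.attr}")
    if any(kw.arg is None for kw in call.keywords):
        raise ValueError("**kwargs unpacking is not allowed")
    # literal_eval() accepts only literals (numbers, strings, lists, ...),
    # so arguments cannot contain function calls or attribute access.
    args = [ast.literal_eval(arg) for arg in call.args]
    kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in call.keywords}
    return getattr(target, call.func.attr)(*args, **kwargs)


class Recorder:
    """Stand-in for a Pyautogui instance; it only records calls."""

    def __init__(self):
        self.calls = []

    def moveTo(self, x, y):
        self.calls.append(("moveTo", x, y))


recorder = Recorder()
run_llm_call("pyautogui.moveTo(100, 150)", recorder)
print(recorder.calls)  # [('moveTo', 100, 150)]
```

With a real sandbox, the Pyautogui instance would be passed in place of recorder. Input that is not valid Python raises SyntaxError from ast.parse, so callers may want to catch that alongside ValueError.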

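Special scenario: your script runs in synchronous mode

If you cannot use async with, create the client directly and close both the Pyautogui instance and the client yourself when you are done.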

import asyncio
from lybic import LybicClient, Pyautogui

sandbox_id = "your_sandbox_id"
llm_output = "pyautogui.moveTo(100, 150)"

client = LybicClient()
pyautogui = Pyautogui(client, sandbox_id)

# Warning: Using eval() on untrusted input is a security risk.
# Always sanitize and validate LLM output.
eval(llm_output)

# Recommendation: You need to manually manage object lifecycles
pyautogui.close()
asyncio.run(client.close())
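In the synchronous example, the two close calls are skipped if the executed code raises. A minimal sketch of one way to guarantee cleanup is a small context manager built on contextlib. The helper name closing_lybic and the Fake* classes are hypothetical stand-ins; the only assumptions taken from the example above are that pyautogui.close() is synchronous and client.close() is a coroutine.

```python
import asyncio
from contextlib import contextmanager


@contextmanager
def closing_lybic(client, pyautogui):
    """Yield `pyautogui`, then close both objects even if the body raises."""
    try:
        yield pyautogui
    finally:
        try:
            pyautogui.close()            # synchronous close
        finally:
            asyncio.run(client.close())  # client.close() is a coroutine


class FakeClient:
    """Stand-in for LybicClient."""
    closed = False

    async def close(self):
        self.closed = True


class FakePyautogui:
    """Stand-in for lybic.Pyautogui."""
    closed = False

    def close(self):
        self.closed = True


client, gui = FakeClient(), FakePyautogui()
try:
    with closing_lybic(client, gui):
        raise RuntimeError("simulated failure while running LLM output")
except RuntimeError:
    pass
print(client.closed, gui.closed)  # True True
```

Because the helper calls asyncio.run(), it is only suitable for code that is not already running inside an event loop, which matches the synchronous scenario described here.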

Supported Functions

The lybic.Pyautogui class supports a subset of the most common pyautogui functions.

Function        Supported   Notes
position()      Yes
moveTo()        Yes
move()          Yes
click()         Yes
doubleClick()   Yes
write()         Yes
press()         Yes         Supports single key and list of keys.
hotkey()        Yes
keyDown()       No          Not supported by Lybic API.
keyUp()         No          Not supported by Lybic API.
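Because keyDown() and keyUp() are unavailable, it can help to screen a multi-line LLM script before running any of it. The sketch below is a hypothetical helper, not an SDK function. It walks the script's syntax tree and reports every pyautogui call whose name is outside the supported set, on the assumption that the functions listed here without a "not supported" note are the supported ones.

```python
import ast

# Assumption: the supported subset as described in this section.
SUPPORTED = {
    "position", "moveTo", "move", "click",
    "doubleClick", "write", "press", "hotkey",
}


def unsupported_calls(script, name="pyautogui"):
    """Return the sorted names of `pyautogui.<func>` calls not in SUPPORTED."""
    found = set()
    for node in ast.walk(ast.parse(script)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and isinstance(node.func.value, ast.Name)
            and node.func.value.id == name
            and node.func.attr not in SUPPORTED
        ):
            found.add(node.func.attr)
    return sorted(found)


script = """
pyautogui.click(x=10, y=20)
pyautogui.keyDown('shift')
pyautogui.write('hi')
pyautogui.keyUp('shift')
"""
print(unsupported_calls(script))  # ['keyDown', 'keyUp']
```

An empty result means every pyautogui call in the script is in the supported subset; it says nothing about whether the script is otherwise safe to execute.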

Organization Stats

The Stats class retrieves usage statistics for your organization.

Get Organization Stats

  • Method: get()
  • Arguments: None
  • Returns: dto.StatsResponseDto

import asyncio
from lybic import LybicClient, Stats

async def main():
    async with LybicClient() as client:
        stats = Stats(client)
        result = await stats.get()
        print(result)

if __name__ == '__main__':
    asyncio.run(main())

Example Output:

mcpServers=3 sandboxes=8 projects=4

Lybic Project for sandbox management

The Project class describes a project and is used to manage the sandboxes that belong to it.

List All Projects

  • Method: list()
  • Arguments: None
  • Returns: list[dto.ProjectResponseDto]

import asyncio
from lybic import LybicClient, Project

async def main():
    async with LybicClient() as client:
        project = Project(client)
        list_result = await project.list()
        for p in list_result:
            print(p)

if __name__ == '__main__':
    asyncio.run(main())

Example Output:

id='PRJ-xxxx' name='test_project' createdAt='2025-07-10T08:03:36.375Z' defaultProject=False
id='PRJ-xxxx' name='Default Project' createdAt='2025-07-08T16:42:30.226Z' defaultProject=True
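The fields in this output are enough to pick out the default project or to order projects by creation time. The sketch below uses a stand-in dataclass, ProjectInfo, that only mirrors the four fields printed above; it is not the SDK's dto.ProjectResponseDto, and the helper names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class ProjectInfo:
    """Stand-in with the fields shown in the example output."""
    id: str
    name: str
    createdAt: str
    defaultProject: bool


def default_project(projects):
    """Return the first project flagged as default, or None if there is none."""
    return next((p for p in projects if p.defaultProject), None)


def parse_created_at(value):
    """Parse an ISO 8601 timestamp such as '2025-07-10T08:03:36.375Z'."""
    # Older Python versions reject a trailing 'Z' in fromisoformat(),
    # so replace it with an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


projects = [
    ProjectInfo("PRJ-xxxx", "test_project", "2025-07-10T08:03:36.375Z", False),
    ProjectInfo("PRJ-xxxx", "Default Project", "2025-07-08T16:42:30.226Z", True),
]
newest = max(projects, key=lambda p: parse_created_at(p.createdAt))
print(default_project(projects).name)  # Default Project
print(newest.name)                     # test_project
```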

Create a Project

  • Method: create(data: dto.CreateProjectDto)
  • Arguments:
    • name (str): Project name.
  • Returns: dto.SingleProjectResponseDto

import asyncio
from lybic import LybicClient, Project

async def main():
    async with LybicClient() as client:
        project = Project(client)
        new_project = await project.create(name="test_project")
        print(new_project)

if __name__ == '__main__':
    asyncio.run(main())

Delete a Project

  • Method: delete(project_id: str)
  • Arguments:
    • project_id (str): ID of the project to delete.
  • Returns: None

import asyncio
from lybic import LybicClient, Project

async def main():
    async with LybicClient() as client:
        project = Project(client)
        await project.delete(project_id="PRJ-xxxx")

if __name__ == '__main__':
    asyncio.run(main())

Sandbox Management

Sandbox provides methods to manage and interact with sandboxes.

List All Sandboxes

  • Method: list()
  • Arguments: None
  • Returns: list[dto.SandboxResponseDto]

import asyncio
from lybic import LybicClient, Sandbox

async def main():
    async with LybicClient() as client:
        sandbox = Sandbox(client)
        sandboxes = await sandbox.list()
        for s in sandboxes:
            print(s)

if __name__ == '__main__':
    asyncio.run(main())

Create a New Sandbox

  • Method: create(data: dto.CreateSandboxDto)
  • Arguments:
    • name (str, optional): Name for the sandbox.
    • maxLifeSeconds (int, optional): Lifetime in seconds (default: 3600).
    • projectId (str, optional): Project ID.
    • specId (str, optional): Sandbox spec ID.
    • datacenterId (str, optional): Datacenter ID.
  • Returns: dto.GetSandboxResponseDto

import asyncio
from lybic import LybicClient, Sandbox

async def main():
    async with LybicClient() as client:
        sandbox = Sandbox(client)
        new_sandbox = await sandbox.create(name="my-sandbox")
        print(new_sandbox)

if __name__ == '__main__':
    asyncio.run(main())
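All arguments to create() are optional, so a caller often wants to send only the options that were actually chosen. The following sketch shows one way to collect them. The helper sandbox_create_args is hypothetical, and the positive-lifetime check is an assumption of this sketch rather than a documented API rule.

```python
def sandbox_create_args(name=None, maxLifeSeconds=None, projectId=None,
                        specId=None, datacenterId=None):
    """Return a dict holding only the options that were explicitly set."""
    # Assumption: a non-positive lifetime is never meaningful, so reject it early.
    if maxLifeSeconds is not None and maxLifeSeconds <= 0:
        raise ValueError("maxLifeSeconds must be a positive number of seconds")
    options = {
        "name": name,
        "maxLifeSeconds": maxLifeSeconds,
        "projectId": projectId,
        "specId": specId,
        "datacenterId": datacenterId,
    }
    # Drop unset options so the service defaults (e.g. 3600 seconds) apply.
    return {key: value for key, value in options.items() if value is not None}


args = sandbox_create_args(name="my-sandbox", maxLifeSeconds=1800)
print(args)  # {'name': 'my-sandbox', 'maxLifeSeconds': 1800}
```

Assuming create() accepts the other options as keyword arguments in the same way it accepts name= in the example above, the result could be passed as await sandbox.create(**args).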

Get a Specific Sandbox

  • Method: get(sandbox_id: str)
  • Arguments:
    • sandbox_id (str): ID of the sandbox.
  • Returns: dto.GetSandboxResponseDto

import asyncio
from lybic import LybicClient, Sandbox

async def main():
    async with LybicClient() as client:
        sandbox = Sandbox(client)
        details = await sandbox.get(sandbox_id="SBX-xxxx")
        print(details)

if __name__ == '__main__':
    asyncio.run(main())

Delete a Sandbox

  • Method: delete(sandbox_id: str)
  • Arguments:
    • sandbox_id (str): ID of the sandbox to delete.
  • Returns: None

import asyncio
from lybic import LybicClient, Sandbox

async def main():
    async with LybicClient() as client:
        sandbox = Sandbox(client)
        await sandbox.delete(sandbox_id="SBX-xxxx")

if __name__ == '__main__':
    asyncio.run(main())

Get a Sandbox Screenshot

  • Method: get_screenshot(sandbox_id: str)
  • Arguments:
    • sandbox_id (str): ID of the sandbox.
  • Returns: a tuple of (screenshot_url, PIL.Image.Image, webp_image_base64_string)

import asyncio
from lybic import LybicClient, Sandbox

async def main():
    async with LybicClient() as client:
        sandbox = Sandbox(client)
        url, image, b64_str = await sandbox.get_screenshot(sandbox_id="SBX-xxxx")
        print(f"Screenshot URL: {url}")
        image.show()

if __name__ == '__main__':
    asyncio.run(main())
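The third element of the returned tuple is a base64 string, which is convenient for sending to a model but has to be decoded before it can be written to disk. A minimal sketch using only the standard library follows. The helper save_base64_image is hypothetical, and because this guide does not say whether the string carries a data: URL prefix, the sketch handles both cases as an assumption. The demo uses placeholder bytes, not a real screenshot.

```python
import base64
import tempfile
from pathlib import Path


def save_base64_image(b64_str, path):
    """Decode a base64 image string, write it to `path`, return the byte count."""
    # Assumption: the string may or may not start with "data:image/webp;base64,".
    if b64_str.startswith("data:"):
        b64_str = b64_str.split(",", 1)[1]
    data = base64.b64decode(b64_str)
    return Path(path).write_bytes(data)


# Demo with 12 placeholder bytes standing in for real WebP data.
sample_b64 = base64.b64encode(b"RIFF\x00\x00\x00\x00WEBP").decode("ascii")
demo_path = Path(tempfile.mkdtemp()) / "screenshot.webp"
print(save_base64_image(sample_b64, demo_path))  # 12
```

With a real sandbox, b64_str from get_screenshot() would take the place of sample_b64.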

Computer-Use

ComputerUse is a client for the Lybic ComputerUse API, used for parsing model outputs and executing actions.

Parse Model Output into Computer Actions

Use this method to parse raw model output into computer-use actions. The supported model types are ui-tars, seed, and oai-compute-use (OpenAI).

For models that use a relative coordinate system, such as ui-tars, use ui-tars.

For models that use an absolute coordinate system, such as doubao-1.6-seed, openCUA, and qwen2.5, use seed.

  • Method: parse_model_output(data: dto.ComputerUseParseRequestDto)
  • Arguments (fields of dto.ComputerUseParseRequestDto):
    • *model (str): The model to use (e.g., "ui-tars").
    • *textContent (str): The text content to parse.
  • Returns: dto.ComputerUseActionResponseDto

Examples:

Suppose you use the "ui-tars" model with a prompt like this:

You are a GUI agent. You are given a task and your action history, with screenshots. You need to perform the next action to complete the task.

## Output Format

  Thought: ...
  Action: ...


## Action Space
click(point='<point>x1 y1</point>')
left_double(point='<point>x1 y1</point>')
right_single(point='<point>x1 y1</point>')
drag(start_point='<point>x1 y1</point>', end_point='<point>x2 y2</point>')
hotkey(key='ctrl c') # Split keys with a space and use lowercase. Also, do not use more than 3 keys in one hotkey action.
type(content='xxx') # Use escape characters \', \", and \n in content part to ensure we can parse the content in normal python string format. If you want to submit your input, use \n at the end of content.
scroll(point='<point>x1 y1</point>', direction='down or up or right or left') # Show more information on the `direction` side.
wait() #Sleep for 5s and take a screenshot to check for any changes.
finished(content='xxx') # Use escape characters \', \", and \n in content part to ensure we can parse the content in normal python string format.

## Note
- Use {language} in `Thought` part.
  - Write a small plan and finally summarize your next action (with its target element) in one sentence in `Thought` part.

## User Instruction
{instruction}

The model output then looks like this:

Thought: The task requires double-left-clicking the "images" folder. In the File Explorer window, the "images" folder is visible under the Desktop directory. The target element is the folder named "images" with a yellow folder icon. Double-left-clicking this folder will open it.

Next action: Left - double - click on the "images" folder icon located in the File Explorer window, under the Desktop directory, with the name "images" and yellow folder icon.
Action: left_double(point='<point>213 257</point>')

This API parses the model output and returns a list of computer-use actions.

import asyncio
from lybic import LybicClient, dto, ComputerUse

async def main():
    async with LybicClient() as client:
        computer_use = ComputerUse(client)
        actions = await computer_use.parse_model_output(
                model="ui-tars",
                textContent="""Thought: The task requires double-left-clicking the "images" folder. In the File Explorer window, the "images" folder is visible under the Desktop directory. The target element is the folder named "images" with a yellow folder icon. Double-left-clicking this folder will open it.

Next action: Left - double - click on the "images" folder icon located in the File Explorer window, under the Desktop directory, with the name "images" and yellow folder icon.
Action: left_double(point='<point>213 257</point>')"""
        )
        print(actions)

if __name__ == '__main__':
    asyncio.run(main())

The output looks something like this (an action list object of length 1):

actions=[MouseDoubleClickAction(type='mouse:doubleClick', x=FractionalLength(type='/', numerator=213, denominator=1000), y=FractionalLength(type='/', numerator=257, denominator=1000), button=1)]
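In this output the coordinates are expressed as fractions (213/1000 and 257/1000) rather than pixels. The sketch below shows the arithmetic for turning such a value into a pixel position, assuming the fraction is relative to the screen's width or height. The FractionalLength dataclass here is a stand-in that copies the printed fields, not the SDK class, and the 1920x1080 resolution is a hypothetical example.

```python
from dataclasses import dataclass


@dataclass
class FractionalLength:
    """Stand-in mirroring the fields printed in the example output."""
    type: str
    numerator: int
    denominator: int


def to_pixels(length, screen_extent):
    """Convert a fractional length to pixels along an axis of `screen_extent` pixels."""
    return round(length.numerator * screen_extent / length.denominator)


x = FractionalLength("/", 213, 1000)
y = FractionalLength("/", 257, 1000)

# Hypothetical 1920x1080 screen: 213/1000 * 1920 = 408.96, 257/1000 * 1080 = 277.56
print(to_pixels(x, 1920), to_pixels(y, 1080))  # 409 278
```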

Execute a computer use action

This interface lets a planner perform actions on the sandbox through RESTful calls.

  • Method: execute_computer_use_action(sandbox_id: str, data: dto.ComputerUseActionDto)
  • Arguments:
    • *sandbox_id (str): ID of the sandbox.
    • *data (dto.ComputerUseActionDto): The action to execute.
  • Returns: dto.SandboxActionResponseDto

import asyncio
from lybic import LybicClient, dto, ComputerUse

async def main():
    async with LybicClient() as client:
        computer_use = ComputerUse(client)
        actions = await computer_use.parse_model_output(
                model="ui-tars",
                textContent="""Thought: The task requires double-left-clicking the "images" folder. In the File Explorer window, the "images" folder is visible under the Desktop directory. The target element is the folder named "images" with a yellow folder icon. Double-left-clicking this folder will open it.

Next action: Left - double - click on the "images" folder icon located in the File Explorer window, under the Desktop directory, with the name "images" and yellow folder icon.
Action: left_double(point='<point>213 257</point>')"""
        )

        response = await computer_use.execute_computer_use_action(
            sandbox_id="SBX-xxxx",
            data=dto.ComputerUseActionDto(action=actions.actions[0])
        )
        print(response)

if __name__ == '__main__':
    asyncio.run(main())

dto.ComputerUseActionDto

This class is a data transfer object used to encapsulate a single computer use action that the agent can execute. It specifies the action itself and options for the response.

Attributes:

  • action (Union[MouseClickAction, MouseDoubleClickAction, ..., FailedAction]): The specific action to be performed. This is a union of all possible action types.
  • includeScreenShot (bool, optional): If True (default), the response will include a URL to a screenshot taken after the action is executed.
  • includeCursorPosition (bool, optional): If True (default), the response will include the cursor's position after the action.
  • callId (str, optional): A unique identifier for the action call.

Action Types:

The action attribute can be one of the following Pydantic models:

  • MouseClickAction: Simulates a single mouse click.
  • MouseDoubleClickAction: Simulates a double-click.
  • MouseMoveAction: Moves the mouse cursor to a specified position.
  • MouseScrollAction: Scrolls the mouse wheel.
  • MouseDragAction: Simulates dragging the mouse from a start to an end point.
  • KeyboardTypeAction: Types a string of text.
  • KeyboardHotkeyAction: Simulates a keyboard shortcut (e.g., Ctrl+C).
  • ScreenshotAction: Takes a screenshot.
  • WaitAction: Pauses execution for a specified duration.
  • FinishedAction: Signals that the task is successfully completed.
  • FailedAction: Signals that the task has failed.

Each action has its own specific set of parameters. For example, a MouseClickAction requires x and y coordinates.

Example:

Here is an example of how to create and execute a MouseClickAction.

import asyncio
from lybic import LybicClient, dto, ComputerUse

async def main():
    async with LybicClient() as client:
        computer_use = ComputerUse(client)

        # 1. Define the action: a left-click at position (500, 300)
        click_action = dto.MouseClickAction(
            type="mouse:click",
            x=dto.PixelLength(type="px", value=500),
            y=dto.PixelLength(type="px", value=300),
            button=1  # 1 for left button
        )

        # 2. Wrap the action in ComputerUseActionDto
        action_dto = dto.ComputerUseActionDto(
            action=click_action,
            includeScreenShot=True,
            includeCursorPosition=True
        )

        # 3. Execute the action on a specific sandbox
        sandbox_id = "SBX-xxxx"  # Replace with your sandbox ID
        response = await computer_use.execute_computer_use_action(
                sandbox_id=sandbox_id,
                data=action_dto
            )

        # The response will contain the screenshot URL and cursor position
        print(response)

        # Example for typing text
        type_action = dto.KeyboardTypeAction(
            type="keyboard:type",
            content="Hello, Lybic!"
        )

        action_dto_typing = dto.ComputerUseActionDto(action=type_action)

        response_typing = await computer_use.execute_computer_use_action(
                sandbox_id=sandbox_id,
                data=action_dto_typing
            )
        print(response_typing)

if __name__ == '__main__':
    asyncio.run(main())
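When a parsed response contains several actions, FinishedAction and FailedAction mark the end of the task and do not need to be sent to the sandbox. The following is a minimal sketch of such a loop, under stated assumptions: the three action classes are empty stand-ins for the SDK models of the same name, run_actions is a hypothetical helper, and execute is any coroutine function the caller supplies.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class MouseClickAction:
    """Stand-in for the SDK model; only the fields this demo needs."""
    x: int
    y: int


@dataclass
class FinishedAction:
    """Stand-in: signals that the task completed."""


@dataclass
class FailedAction:
    """Stand-in: signals that the task failed."""


TERMINAL = (FinishedAction, FailedAction)


async def run_actions(actions, execute):
    """Execute actions in order and stop at the first terminal action.

    Returns the terminal action, or None if the list ended without one.
    """
    for action in actions:
        if isinstance(action, TERMINAL):
            return action
        await execute(action)
    return None


async def demo():
    executed = []

    async def fake_execute(action):
        # A real implementation would call the sandbox here.
        executed.append(action)

    plan = [MouseClickAction(500, 300), FinishedAction(), MouseClickAction(1, 1)]
    final = await run_actions(plan, fake_execute)
    return executed, final


executed, final = asyncio.run(demo())
print(len(executed), type(final).__name__)  # 1 FinishedAction
```

With the SDK, execute would typically wrap each action in dto.ComputerUseActionDto and pass it to execute_computer_use_action, as in the examples above.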

For more SDK APIs, refer to the GitHub repository.