Cua Documentation

Quickstart (CLI)

Get up and running with the cua Agent CLI in 4 simple steps.

Introduction

cua combines a Computer (the interface) with an Agent (the AI) to automate desktop apps. The Agent CLI provides a clean terminal interface for controlling your remote computer with natural language commands.

Create Your First cua Container

  1. Go to trycua.com/signin
  2. Navigate to Dashboard > Containers > Create Instance
  3. Create a Medium, Ubuntu 22 container
  4. Note your container name and API key

Install cua

Install uv

# macOS/Linux: use curl to download the script and execute it with sh:
curl -LsSf https://astral.sh/uv/install.sh | sh

# If your system doesn't have curl, you can use wget instead:
# wget -qO- https://astral.sh/uv/install.sh | sh

# Windows: use irm to download the script and execute it with iex:
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
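After the installer finishes, it can help to confirm that uv landed on your PATH before continuing (if the command isn't found, restart your shell first):

```shell
# Check that uv is available; print its version if so.
if command -v uv >/dev/null 2>&1; then
  uv --version
else
  echo "uv not found - restart your shell or check your PATH"
fi
```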

Install Python 3.12

uv python install 3.12
# uv will install cua dependencies automatically when you use --with "cua-agent[cli]"
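As a quick sanity check, you can ask uv to run the Python 3.12 interpreter it manages and print its version (this assumes the install step above completed; `uv run --python` selects a uv-managed interpreter):

```shell
# Run the uv-managed Python 3.12 and print its version string.
if command -v uv >/dev/null 2>&1; then
  uv run --python 3.12 python -c 'import sys; print(sys.version.split()[0])'
else
  echo "uv not installed"
fi
```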

Run cua CLI

Choose your preferred AI model:

OpenAI Computer Use Preview

# With uv (no install step needed):
uv run --with "cua-agent[cli]" -m agent.cli openai/computer-use-preview

# Or, with cua-agent[cli] already installed:
python -m agent.cli openai/computer-use-preview

Anthropic Claude

# With uv (no install step needed):
uv run --with "cua-agent[cli]" -m agent.cli anthropic/claude-3-5-sonnet-20241022
uv run --with "cua-agent[cli]" -m agent.cli anthropic/claude-opus-4-20250514
uv run --with "cua-agent[cli]" -m agent.cli anthropic/claude-sonnet-4-20250514

# Or, with cua-agent[cli] already installed:
python -m agent.cli anthropic/claude-3-5-sonnet-20241022
python -m agent.cli anthropic/claude-opus-4-20250514
python -m agent.cli anthropic/claude-sonnet-4-20250514

Omniparser + LLMs

# With uv (no install step needed):
uv run --with "cua-agent[cli]" -m agent.cli omniparser+anthropic/claude-3-5-sonnet-20241022
uv run --with "cua-agent[cli]" -m agent.cli omniparser+openai/gpt-4o
uv run --with "cua-agent[cli]" -m agent.cli omniparser+vertex_ai/gemini-pro

# Or, with cua-agent[cli] already installed:
python -m agent.cli omniparser+anthropic/claude-3-5-sonnet-20241022
python -m agent.cli omniparser+openai/gpt-4o
python -m agent.cli omniparser+vertex_ai/gemini-pro

Local Models

# With uv (no install step needed):

# Hugging Face models (local)
uv run --with "cua-agent[cli]" -m agent.cli huggingface-local/ByteDance-Seed/UI-TARS-1.5-7B

# MLX models (Apple Silicon)
uv run --with "cua-agent[cli]" -m agent.cli mlx/mlx-community/UI-TARS-1.5-7B-6bit

# Ollama models
uv run --with "cua-agent[cli]" -m agent.cli omniparser+ollama_chat/llama3.2:latest

# Or, with cua-agent[cli] already installed:
python -m agent.cli huggingface-local/ByteDance-Seed/UI-TARS-1.5-7B
python -m agent.cli mlx/mlx-community/UI-TARS-1.5-7B-6bit
python -m agent.cli omniparser+ollama_chat/llama3.2:latest

Interactive Setup

If you haven't set up environment variables, the CLI will guide you through the setup:

  1. Container Name: Enter your cua container name (or get one at trycua.com)
  2. CUA API Key: Enter your cua API key
  3. Provider API Key: Enter your AI provider API key (OpenAI, Anthropic, etc.)
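To skip the interactive prompts on later runs, you can export the same three values as environment variables before launching the CLI. The variable names below are illustrative assumptions, not confirmed cua names; check the CLI's setup output for the exact names it reads:

```shell
# Hypothetical variable names - verify against the CLI's interactive setup.
export CUA_CONTAINER_NAME="your-container-name"   # from the cua dashboard
export CUA_API_KEY="your-cua-api-key"
export ANTHROPIC_API_KEY="your-provider-api-key"  # or OPENAI_API_KEY, etc.
```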

Start Chatting

Once connected, you'll see:

💻 Connected to your-container-name (model, agent_loop)
Type 'exit' to quit.

>

You can ask your agent to perform actions like:

  • "Take a screenshot and tell me what's on the screen"
  • "Open Firefox and go to github.com"
  • "Type 'Hello world' into the terminal"
  • "Close the current window"
  • "Click on the search button"

For the GUI interface and advanced Python usage, see Quickstart (GUI) and Quickstart for Developers.

For a complete list of supported models, see Supported Agents.

For running models locally, see Running Models Locally.