Cua Documentation

Chat History

Managing conversation history and message arrays

Managing conversation history is essential for multi-turn agent interactions. The agent maintains a messages array that tracks the entire conversation flow.

Managing History

Continuous Conversation

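import asyncio

async def chat(agent):
    # `agent` is a ComputerAgent configured elsewhere
    history = []

    while True:
        user_input = input("> ")
        history.append({"role": "user", "content": user_input})

        # `async for` must run inside a coroutine; append each turn's new messages
        async for result in agent.run(history, stream=False):
            history += result["output"]

asyncio.run(chat(agent))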
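
To see how the history grows across turns without calling a real model, the same accumulation pattern can be exercised with a stand-in agent. This is only a sketch: `EchoAgent` and `run_turns` are hypothetical names, not part of the Cua API, and the stand-in assumes nothing beyond the shape used above (an async `run` generator that yields dicts with an `"output"` list of messages).

```python
import asyncio


class EchoAgent:
    """Hypothetical stand-in for illustration; NOT part of the Cua API."""

    async def run(self, messages, stream=False):
        # Reply to the most recent user message with one assistant message.
        last_user = next(m for m in reversed(messages) if m.get("role") == "user")
        yield {
            "output": [
                {
                    "type": "message",
                    "role": "assistant",
                    "content": [
                        {"type": "output_text", "text": f"echo: {last_user['content']}"}
                    ],
                }
            ]
        }


async def run_turns(agent, user_inputs):
    """Feed each input through the agent, accumulating history turn by turn."""
    history = []
    for user_input in user_inputs:
        history.append({"role": "user", "content": user_input})
        async for result in agent.run(history, stream=False):
            history += result["output"]
    return history


demo_history = asyncio.run(run_turns(EchoAgent(), ["hello", "open github"]))
```

Because the same `history` list is passed back in on every turn, each call sees all earlier user and assistant messages.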

Message Array Structure

The messages array contains different types of messages that represent the conversation state:

messages = [
    # user input
    {
        "role": "user",
        "content": "go to trycua on gh"
    },
    # first agent turn adds the model output to the history
    {
        "summary": [
            {
                "text": "Searching Firefox for Trycua GitHub",
                "type": "summary_text"
            }
        ],
        "type": "reasoning"
    },
    {
        "action": {
            "text": "Trycua GitHub",
            "type": "type"
        },
        "call_id": "call_QI6OsYkXxl6Ww1KvyJc4LKKq",
        "status": "completed",
        "type": "computer_call"
    },
    # second agent turn adds the computer output to the history
    {
        "type": "computer_call_output",
        "call_id": "call_QI6OsYkXxl6Ww1KvyJc4LKKq",
        "output": {
            "type": "input_image",
            "image_url": "data:image/png;base64,..."
        }
    },
    # final agent turn adds the agent output text to the history
    {
        "type": "message",
        "role": "assistant",
        "content": [
          {
            "text": "Success! The Trycua GitHub page has been opened.",
            "type": "output_text"
          }
        ]
    }
]
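
Given the structure above, pulling the agent's final text reply out of a history is a matter of filtering on `type` and `role`. The helper below is a minimal sketch; `assistant_text` is a hypothetical name, and it assumes assistant messages always carry a list of `output_text` parts as in the example.

```python
def assistant_text(messages):
    """Return the text of every assistant `message` item, in order."""
    texts = []
    for item in messages:
        if item.get("type") == "message" and item.get("role") == "assistant":
            for part in item.get("content", []):
                if part.get("type") == "output_text":
                    texts.append(part["text"])
    return texts


example = [
    {"role": "user", "content": "go to trycua on gh"},
    {
        "type": "reasoning",
        "summary": [
            {"type": "summary_text", "text": "Searching Firefox for Trycua GitHub"}
        ],
    },
    {
        "type": "message",
        "role": "assistant",
        "content": [
            {
                "type": "output_text",
                "text": "Success! The Trycua GitHub page has been opened.",
            }
        ],
    },
]

replies = assistant_text(example)
```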

Message Types

  • user: User input messages
  • computer_call: Computer actions (click, type, keypress, etc.)
  • computer_call_output: Results from computer actions (usually screenshots)
  • function_call: Function calls (e.g., computer.call)
  • function_call_output: Results from function calls
  • reasoning: Agent's internal reasoning and planning
  • message: Agent text responses
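
When inspecting a history, it can help to tally items by kind. Note that in the example array the user message has a `role` but no `type`, so the sketch below falls back to `role` when `type` is absent. `message_kind` and `count_kinds` are hypothetical helper names used only for illustration.

```python
from collections import Counter


def message_kind(item):
    """Label an item by its "type", falling back to "role" when "type" is absent."""
    return item.get("type") or item.get("role") or "unknown"


def count_kinds(messages):
    """Count how many items of each kind a history contains."""
    return Counter(message_kind(m) for m in messages)


sample = [
    {"role": "user", "content": "go to trycua on gh"},
    {"type": "reasoning", "summary": []},
    {"type": "computer_call", "call_id": "c1", "status": "completed",
     "action": {"type": "type", "text": "Trycua GitHub"}},
    {"type": "computer_call_output", "call_id": "c1",
     "output": {"type": "input_image", "image_url": "data:image/png;base64,..."}},
    {"type": "message", "role": "assistant",
     "content": [{"type": "output_text", "text": "Done."}]},
]

kinds = count_kinds(sample)
```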

Memory Management

For long conversations, consider using the only_n_most_recent_images parameter to manage memory:

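from agent import ComputerAgent

# `computer` is a Computer instance created elsewhere
agent = ComputerAgent(
    model="anthropic/claude-3-5-sonnet-20241022",
    tools=[computer],
    only_n_most_recent_images=3  # keep only the 3 newest images in history
)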

With this setting, the agent keeps only the three most recent images in the conversation history and automatically removes older ones, which helps prevent context window overflow.
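
To make the idea concrete, here is a rough sketch of how image trimming could be applied to a message array by hand. This is not the library's implementation: `keep_recent_images` is a hypothetical helper, and the choice to drop the matching `computer_call` along with each stale screenshot (so no call is left without its output) is an assumption of this sketch.

```python
def keep_recent_images(messages, n):
    """Return a new list keeping only the `n` most recent screenshot outputs."""

    def is_image_output(item):
        output = item.get("output")
        return (
            item.get("type") == "computer_call_output"
            and isinstance(output, dict)
            and output.get("type") == "input_image"
        )

    image_positions = [i for i, m in enumerate(messages) if is_image_output(m)]
    # Everything except the last n image outputs is considered stale.
    stale = set(image_positions[:-n] if n > 0 else image_positions)
    stale_ids = {messages[i].get("call_id") for i in stale}

    return [
        m
        for i, m in enumerate(messages)
        if i not in stale
        # Assumption: also drop the computer_call that produced a stale image.
        and not (m.get("type") == "computer_call" and m.get("call_id") in stale_ids)
    ]


def _pair(call_id):
    """Build a computer_call and its screenshot output for the demo."""
    return [
        {"type": "computer_call", "call_id": call_id, "status": "completed",
         "action": {"type": "screenshot"}},
        {"type": "computer_call_output", "call_id": call_id,
         "output": {"type": "input_image", "image_url": "data:image/png;base64,..."}},
    ]


sample = (
    [{"role": "user", "content": "go to trycua on gh"}]
    + _pair("c1") + _pair("c2") + _pair("c3")
    + [{"type": "message", "role": "assistant",
        "content": [{"type": "output_text", "text": "Done."}]}]
)

trimmed = keep_recent_images(sample, 2)
```

The function returns a new list and leaves the input untouched, so the full history can still be kept for logging while the trimmed copy is sent to the model.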