Back to Articles

When Agents Need Human Wisdom - Introducing Human-In-The-Loop Support

Published on August 29, 2025 by Francesco Bonacci

Sometimes the best AI agent is a human. Whether you're creating training demonstrations, evaluating complex scenarios, or need to intervene when automation hits a wall, our new Human-In-The-Loop integration puts you directly in control.

With yesterday's HUD evaluation integration, you could benchmark any agent at scale. Today's update lets you become the agent when it matters most—seamlessly switching between automated intelligence and human judgment.

What you get

  • One-line human takeover for any agent configuration with human/human or model+human/human
  • Interactive web UI to see what your agent sees and control what it does
  • Zero context switching - step in exactly where automation left off
  • Training data generation - create perfect demonstrations by doing tasks yourself
  • Ground truth evaluation - validate agent performance with human expertise

Why Human-In-The-Loop?

Even the most sophisticated agents encounter edge cases, ambiguous interfaces, or tasks requiring human judgment. Rather than failing gracefully, they can now fail intelligently—by asking for human help.

This approach bridges the gap between fully automated systems and pure manual control, letting you:

  • Demonstrate complex workflows that agents can learn from
  • Evaluate tricky scenarios where ground truth requires human assessment
  • Intervene selectively when automated agents need guidance
  • Test and debug your tools and environments manually

Getting Started

Launch the human agent interface:

bash
python -m agent.human_tool

The web UI will show pending completions. Click any completion to take control of the agent and see exactly what it sees.

Usage Examples

Direct Human Control

Perfect for creating demonstrations or when you want full manual control:

python
from agent import ComputerAgent from agent.computer import computer agent = ComputerAgent( "human/human", tools=[computer] ) # You'll get full control through the web UI async for _ in agent.run("Take a screenshot, analyze the UI, and click on the most prominent button"): pass

Hybrid: AI Planning + Human Execution

Combine model intelligence with human precision—let AI plan, then execute manually:

python
agent = ComputerAgent( "huggingface-local/HelloKKMe/GTA1-7B+human/human", tools=[computer] ) # AI creates the plan, human executes each step async for _ in agent.run("Navigate to the settings page and enable dark mode"): pass

Fallback Pattern

Start automated, escalate to human when needed:

python
# Primary automated agent primary_agent = ComputerAgent("openai/computer-use-preview", tools=[computer]) # Human fallback agent fallback_agent = ComputerAgent("human/human", tools=[computer]) try: async for result in primary_agent.run(task): if result.confidence < 0.7: # Low confidence threshold # Seamlessly hand off to human async for _ in fallback_agent.run(f"Continue this task: {task}"): pass except Exception: # Agent failed, human takes over async for _ in fallback_agent.run(f"Handle this failed task: {task}"): pass

Interactive Features

The human-in-the-loop interface provides a rich, responsive experience:

Visual Environment

  • Screenshot display with live updates as you work
  • Click handlers for direct interaction with UI elements
  • Zoom and pan to see details clearly

Action Controls

  • Click actions - precise cursor positioning and clicking
  • Keyboard input - type text naturally or send specific key combinations
  • Action history - see the sequence of actions taken
  • Undo support - step back when needed

Tool Integration

  • Full OpenAI compatibility - standard tool call format
  • Custom tools - integrate your own tools seamlessly
  • Real-time feedback - see tool responses immediately

Smart Polling

  • Responsive updates - UI refreshes when new completions arrive
  • Background processing - continue working while waiting for tasks
  • Session persistence - resume interrupted sessions

Real-World Use Cases

Training Data Generation

Create perfect demonstrations for fine-tuning:

python
# Generate training examples for spreadsheet tasks demo_agent = ComputerAgent("human/human", tools=[computer]) tasks = [ "Create a budget spreadsheet with income and expense categories", "Apply conditional formatting to highlight overbudget items", "Generate a pie chart showing expense distribution" ] for task in tasks: # Human demonstrates each task perfectly async for _ in demo_agent.run(task): pass # Recorded actions become training data

Evaluation and Ground Truth

Validate agent performance on complex scenarios:

python
# Human evaluates agent performance evaluator = ComputerAgent("human/human", tools=[computer]) async for _ in evaluator.run("Review this completed form and rate accuracy (1-10)"): pass # Human provides authoritative quality assessment

Interactive Debugging

Step through agent behavior manually:

python
# Test a workflow step by step debug_agent = ComputerAgent("human/human", tools=[computer]) async for _ in debug_agent.run("Reproduce the agent's failed login sequence"): pass # Human identifies exactly where automation breaks

Edge Case Handling

Handle scenarios that break automated agents:

python
# Complex UI interaction requiring human judgment edge_case_agent = ComputerAgent("human/human", tools=[computer]) async for _ in edge_case_agent.run("Navigate this CAPTCHA-protected form"): pass # Human handles what automation cannot

Configuration Options

Customize the human agent experience:

  • UI refresh rate: Adjust polling frequency for your workflow
  • Image quality: Balance detail vs. performance for screenshots
  • Action logging: Save detailed traces for analysis and training
  • Session timeout: Configure idle timeouts for security
  • Tool permissions: Restrict which tools humans can access

When to Use Human-In-The-Loop

ScenarioWhy Human Control
Creating training dataPerfect demonstrations for model fine-tuning
Evaluating complex tasksHuman judgment for subjective or nuanced assessment
Handling edge casesCAPTCHAs, unusual UIs, context-dependent decisions
Debugging workflowsStep through failures to identify breaking points
High-stakes operationsCritical tasks requiring human oversight and approval
Testing new environmentsValidate tools and environments work as expected

Learn More

  • Interactive examples: Try human-in-the-loop control with sample tasks
  • Training data pipelines: Learn how to convert human demonstrations into model training data
  • Evaluation frameworks: Build human-validated test suites for your agents
  • API documentation: Full reference for human agent configuration

Ready to put humans back in the loop? The most sophisticated AI system knows when to ask for help.

Questions about human-in-the-loop agents? Join the conversation in our Discord community or check out our documentation.

Share this post