When Agents Need Human Wisdom - Introducing Human-In-The-Loop Support
Published on August 29, 2025 by Francesco Bonacci
Sometimes the best AI agent is a human. Whether you're creating training demonstrations, evaluating complex scenarios, or need to intervene when automation hits a wall, our new Human-In-The-Loop integration puts you directly in control.
With yesterday's HUD evaluation integration, you could benchmark any agent at scale. Today's update lets you become the agent when it matters most—seamlessly switching between automated intelligence and human judgment.
What you get
- One-line human takeover for any agent configuration with `human/human` or `model+human/human`
- Interactive web UI to see what your agent sees and control what it does
- Zero context switching - step in exactly where automation left off
- Training data generation - create perfect demonstrations by doing tasks yourself
- Ground truth evaluation - validate agent performance with human expertise
Why Human-In-The-Loop?
Even the most sophisticated agents encounter edge cases, ambiguous interfaces, or tasks requiring human judgment. Rather than failing gracefully, they can now fail intelligently—by asking for human help.
This approach bridges the gap between fully automated systems and pure manual control, letting you:
- Demonstrate complex workflows that agents can learn from
- Evaluate tricky scenarios where ground truth requires human assessment
- Intervene selectively when automated agents need guidance
- Test and debug your tools and environments manually
Getting Started
Launch the human agent interface:
```bash
python -m agent.human_tool
```
The web UI will show pending completions. Click any completion to take control of the agent and see exactly what it sees.
Usage Examples
Direct Human Control
Perfect for creating demonstrations or when you want full manual control:
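```python
from agent import ComputerAgent
from agent.computer import computer

agent = ComputerAgent(
    "human/human",
    tools=[computer]
)

# Run this inside an async function (or a notebook / asyncio REPL),
# since `async for` is not valid at the top level of a plain script.
# You'll get full control through the web UI
async for _ in agent.run("Take a screenshot, analyze the UI, and click on the most prominent button"):
    pass
```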
Hybrid: AI Planning + Human Execution
Combine model intelligence with human precision—let AI plan, then execute manually:
```python
agent = ComputerAgent(
    "huggingface-local/HelloKKMe/GTA1-7B+human/human",
    tools=[computer]
)

# AI creates the plan, human executes each step
async for _ in agent.run("Navigate to the settings page and enable dark mode"):
    pass
```
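The composed model string in the hybrid example reads as two parts joined by `+`: a planning model, then `human/human` as the executor. As a rough illustration of that naming convention only (this is not the library's own parser, and `ModelSpec`, `split_model_string`, and `is_human_in_loop` are hypothetical helpers), the string could be taken apart like this:

```python
from typing import NamedTuple, Optional


class ModelSpec(NamedTuple):
    """Hypothetical container for the two halves of a composed model string."""
    planner: Optional[str]  # model that plans, or None for a single-model string
    executor: str           # model (or human) that carries out the actions


def split_model_string(model: str) -> ModelSpec:
    """Split 'planner+executor' on the last '+'. Illustrative only."""
    planner, sep, executor = model.rpartition("+")
    if not sep:
        # No '+' present: the whole string names a single agent.
        return ModelSpec(planner=None, executor=model)
    return ModelSpec(planner=planner, executor=executor)


def is_human_in_loop(model: str) -> bool:
    """True when a human performs the actions."""
    return split_model_string(model).executor == "human/human"


spec = split_model_string("huggingface-local/HelloKKMe/GTA1-7B+human/human")
print(spec.planner, "->", spec.executor)
```

A check like `is_human_in_loop` could be handy for deciding whether to start the web UI before a run, assuming the naming convention holds for your configuration.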
Fallback Pattern
Start automated, escalate to human when needed:
```python
# Example task (any task string works)
task = "Navigate to the settings page and enable dark mode"

# Primary automated agent
primary_agent = ComputerAgent("openai/computer-use-preview", tools=[computer])

# Human fallback agent
fallback_agent = ComputerAgent("human/human", tools=[computer])

try:
    async for result in primary_agent.run(task):
        # Assumes each result exposes a `confidence` score
        if result.confidence < 0.7:  # Low confidence threshold
            # Seamlessly hand off to human
            async for _ in fallback_agent.run(f"Continue this task: {task}"):
                pass
            break  # Stop the automated run once the human has taken over
except Exception:
    # Agent failed, human takes over
    async for _ in fallback_agent.run(f"Handle this failed task: {task}"):
        pass
```
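To try the escalation logic without a live agent, the same pattern can be exercised with stand-in async generators. The sketch below rests on assumptions: `StepResult`, `run_with_fallback`, and the fake runners are hypothetical names, and a real agent may not report a confidence score at all.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, Callable, List


@dataclass
class StepResult:
    """Hypothetical step record; real agent output may look different."""
    source: str        # "auto" or "human"
    confidence: float  # assumed self-reported score in [0, 1]


Runner = Callable[[str], AsyncIterator[StepResult]]


async def run_with_fallback(
    primary: Runner, fallback: Runner, task: str, threshold: float = 0.7
) -> List[StepResult]:
    """Run `primary`; hand the task to `fallback` on low confidence or error."""
    steps: List[StepResult] = []
    escalate = False
    try:
        async for step in primary(task):
            steps.append(step)
            if step.confidence < threshold:
                escalate = True
                break  # stop the automated run before handing off
    except Exception:
        escalate = True
    if escalate:
        async for step in fallback(f"Continue this task: {task}"):
            steps.append(step)
    return steps


async def fake_auto(task: str) -> AsyncIterator[StepResult]:
    # Stand-in for an automated agent: the second step is low confidence.
    yield StepResult("auto", 0.9)
    yield StepResult("auto", 0.4)
    yield StepResult("auto", 0.95)  # never reached after escalation


async def fake_human(task: str) -> AsyncIterator[StepResult]:
    # Stand-in for the human agent.
    yield StepResult("human", 1.0)


steps = asyncio.run(run_with_fallback(fake_auto, fake_human, "enable dark mode"))
print([s.source for s in steps])  # prints ['auto', 'auto', 'human']
```

Keeping the hand-off outside the `try` block means an error raised during the human run is not mistaken for a failure of the automated agent.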
Interactive Features
The human-in-the-loop interface provides a rich, responsive experience:
Visual Environment
- Screenshot display with live updates as you work
- Click handlers for direct interaction with UI elements
- Zoom and pan to see details clearly
Action Controls
- Click actions - precise cursor positioning and clicking
- Keyboard input - type text naturally or send specific key combinations
- Action history - see the sequence of actions taken
- Undo support - step back when needed
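An action history with undo can be modeled as a simple stack. The sketch below is one possible design, not a description of how the web UI stores actions; `Action` and `ActionHistory` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Action:
    """Hypothetical record of one UI action."""
    kind: str    # e.g. "click" or "type"
    detail: str  # human-readable description of the target or text


class ActionHistory:
    """Ordered log of actions with last-in, first-out undo."""

    def __init__(self) -> None:
        self._done: List[Action] = []

    def record(self, action: Action) -> None:
        self._done.append(action)

    def undo(self) -> Optional[Action]:
        """Remove and return the most recent action, or None if empty."""
        return self._done.pop() if self._done else None

    def sequence(self) -> List[str]:
        """Return the remaining actions in the order they were taken."""
        return [f"{a.kind}: {a.detail}" for a in self._done]


history = ActionHistory()
history.record(Action("click", "Submit button"))
history.record(Action("type", "hello"))
undone = history.undo()
print(history.sequence())  # prints ['click: Submit button']
```

Note that removing an entry from the log does not reverse its effect on the controlled machine; a real undo would also need a compensating action in the environment.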
Tool Integration
- Full OpenAI compatibility - standard tool call format
- Custom tools - integrate your own tools seamlessly
- Real-time feedback - see tool responses immediately
Smart Polling
- Responsive updates - UI refreshes when new completions arrive
- Background processing - continue working while waiting for tasks
- Session persistence - resume interrupted sessions
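The polling behavior can be pictured as a loop that periodically asks for pending completions and reports only the ones it has not shown yet. This is a sketch under assumptions: `fetch_pending` stands in for whatever call the UI makes to list pending completions, and the `id` field on each completion is hypothetical.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Set

Completion = Dict[str, str]


async def poll_new_completions(
    fetch_pending: Callable[[], Awaitable[List[Completion]]],
    seen: Set[str],
    interval: float = 1.0,
    max_polls: int = 10,
) -> List[Completion]:
    """Poll until unseen completions show up or `max_polls` is exhausted."""
    for _ in range(max_polls):
        pending = await fetch_pending()
        fresh = [c for c in pending if c["id"] not in seen]
        if fresh:
            seen.update(c["id"] for c in fresh)
            return fresh
        # Sleeping yields control, so other coroutines keep running meanwhile.
        await asyncio.sleep(interval)
    return []


async def demo() -> List[Completion]:
    # Fake backend: nothing pending on the first poll, one item on the second.
    responses = iter([[], [{"id": "c1", "task": "click OK"}]])

    async def fake_fetch() -> List[Completion]:
        return next(responses, [])

    return await poll_new_completions(fake_fetch, seen=set(), interval=0.01)


print(asyncio.run(demo()))  # prints [{'id': 'c1', 'task': 'click OK'}]
```

Tracking seen ids in a set keeps the UI from re-announcing a completion that is still pending on the next poll.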
Real-World Use Cases
Training Data Generation
Create perfect demonstrations for fine-tuning:
```python
# Generate training examples for spreadsheet tasks
demo_agent = ComputerAgent("human/human", tools=[computer])

tasks = [
    "Create a budget spreadsheet with income and expense categories",
    "Apply conditional formatting to highlight overbudget items",
    "Generate a pie chart showing expense distribution"
]

for task in tasks:
    # Human demonstrates each task perfectly
    async for _ in demo_agent.run(task):
        pass
    # Recorded actions become training data
```
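How recorded actions turn into training data depends on the trace format, which this post does not specify. As a sketch only, assuming each demonstration is logged as a task string plus a list of action dictionaries (a hypothetical schema, as is the `demos_to_jsonl` helper), the traces could be serialized to JSON Lines like this:

```python
import json
from typing import Dict, Iterable, List


def demos_to_jsonl(demos: Iterable[Dict]) -> str:
    """Serialize demonstrations to JSON Lines, one demonstration per line."""
    lines: List[str] = []
    for demo in demos:
        if not demo.get("actions"):
            continue  # skip demonstrations with no recorded actions
        record = {"prompt": demo["task"], "actions": demo["actions"]}
        lines.append(json.dumps(record, sort_keys=True))
    return "\n".join(lines)


demos = [
    {
        "task": "Generate a pie chart showing expense distribution",
        "actions": [
            {"type": "click", "x": 120, "y": 48},
            {"type": "type", "text": "Expenses"},
        ],
    },
    {"task": "Unfinished task", "actions": []},  # dropped: nothing recorded
]

jsonl = demos_to_jsonl(demos)
print(jsonl)
```

One record per line keeps the file easy to append to and to stream, which suits data collected one demonstration at a time.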
Evaluation and Ground Truth
Validate agent performance on complex scenarios:
```python
# Human evaluates agent performance
evaluator = ComputerAgent("human/human", tools=[computer])

async for _ in evaluator.run("Review this completed form and rate accuracy (1-10)"):
    pass
# Human provides authoritative quality assessment
```
Interactive Debugging
Step through agent behavior manually:
```python
# Test a workflow step by step
debug_agent = ComputerAgent("human/human", tools=[computer])

async for _ in debug_agent.run("Reproduce the agent's failed login sequence"):
    pass
# Human identifies exactly where automation breaks
```
Edge Case Handling
Handle scenarios that break automated agents:
```python
# Complex UI interaction requiring human judgment
edge_case_agent = ComputerAgent("human/human", tools=[computer])

async for _ in edge_case_agent.run("Navigate this CAPTCHA-protected form"):
    pass
# Human handles what automation cannot
```
Configuration Options
Customize the human agent experience:
- UI refresh rate: Adjust polling frequency for your workflow
- Image quality: Balance detail vs. performance for screenshots
- Action logging: Save detailed traces for analysis and training
- Session timeout: Configure idle timeouts for security
- Tool permissions: Restrict which tools humans can access
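The post lists these options without naming the actual settings, so the following is only a sketch of how such a configuration might be grouped. Every field name and default in `HumanAgentConfig` is hypothetical and should be checked against the API documentation.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List


@dataclass(frozen=True)
class HumanAgentConfig:
    """Hypothetical settings bundle mirroring the options listed above."""
    poll_interval_s: float = 1.0   # UI refresh rate
    screenshot_quality: int = 80   # image quality, 1-100
    log_actions: bool = True       # action logging
    session_timeout_s: int = 900   # idle timeout
    allowed_tools: FrozenSet[str] = field(default_factory=frozenset)  # empty = allow all

    def __post_init__(self) -> None:
        # Validate on construction so bad values fail early.
        if self.poll_interval_s <= 0:
            raise ValueError("poll_interval_s must be positive")
        if not 1 <= self.screenshot_quality <= 100:
            raise ValueError("screenshot_quality must be between 1 and 100")

    def permitted(self, tool_names: List[str]) -> List[str]:
        """Filter tool names down to those a human operator may use."""
        if not self.allowed_tools:
            return list(tool_names)
        return [t for t in tool_names if t in self.allowed_tools]


config = HumanAgentConfig(allowed_tools=frozenset({"computer"}))
print(config.permitted(["computer", "shell"]))  # prints ['computer']
```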
When to Use Human-In-The-Loop
| Scenario | Why Human Control |
|---|---|
| Creating training data | Perfect demonstrations for model fine-tuning |
| Evaluating complex tasks | Human judgment for subjective or nuanced assessment |
| Handling edge cases | CAPTCHAs, unusual UIs, context-dependent decisions |
| Debugging workflows | Step through failures to identify breaking points |
| High-stakes operations | Critical tasks requiring human oversight and approval |
| Testing new environments | Validate tools and environments work as expected |
Learn More
- Interactive examples: Try human-in-the-loop control with sample tasks
- Training data pipelines: Learn how to convert human demonstrations into model training data
- Evaluation frameworks: Build human-validated test suites for your agents
- API documentation: Full reference for human agent configuration
Ready to put humans back in the loop? The most sophisticated AI system knows when to ask for help.
Questions about human-in-the-loop agents? Join the conversation in our Discord community or check out our documentation.