Hunter Gerlach
Human-in-the-Loop (HITL) is a design pattern in which people remain actively involved in the decision-making process of an automated or AI-driven system. Rather than handing full control to a machine, a human reviews, approves, or corrects the system's output before it takes effect.
HITL sits on a spectrum of human oversight models:
Human-in-the-Loop (HITL): A person must approve or reject every decision the system makes. Best for high-risk or high-consequence tasks such as medical diagnoses, legal judgments, or financial approvals.
Human-on-the-Loop (HOTL): The system operates autonomously, but a person monitors its behavior and can intervene when needed. Suitable for moderate-risk tasks where speed matters but oversight is still required.
Human-in-Command (HIC): A person sets the goals, constraints, and boundaries within which the system operates, and retains the authority to override or shut it down at any time. Appropriate for strategic or organizational-level governance of AI systems.
Choosing the right model depends on the risk level of the task, the maturity of the AI system, and the consequences of an incorrect decision.
Safety: AI systems can hallucinate, produce biased outputs, or fail in unexpected ways. Human oversight catches errors before they cause real harm.
Map your AI-assisted workflows into risk tiers. High-risk decisions (affecting safety, finances, legal standing, or individual rights) require HITL. Moderate-risk tasks may use HOTL. Low-risk tasks with well-understood failure modes may operate autonomously with periodic audits.
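The tiering above can be encoded directly in tooling so the oversight model is an explicit, checkable property of each workflow rather than tribal knowledge. A minimal sketch in Python; the tier labels, enum names, and mapping are illustrative assumptions, not a standard:

```python
from enum import Enum

class Oversight(Enum):
    HITL = "human-in-the-loop"             # approve every decision
    HOTL = "human-on-the-loop"             # monitor, intervene as needed
    AUTONOMOUS = "autonomous-with-audits"  # periodic audits only

# Illustrative mapping from risk tier to oversight model; the tier
# names and boundaries are placeholders for this sketch.
RISK_TO_OVERSIGHT = {
    "high": Oversight.HITL,      # safety, finances, legal, individual rights
    "moderate": Oversight.HOTL,
    "low": Oversight.AUTONOMOUS,
}

def oversight_for(risk_tier: str) -> Oversight:
    try:
        return RISK_TO_OVERSIGHT[risk_tier]
    except KeyError:
        # An unclassified workflow defaults to the most conservative model.
        return Oversight.HITL
```

Defaulting unknown tiers to HITL keeps the failure mode conservative: a workflow that was never risk-assessed gets full human review until someone classifies it.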
For each workflow, identify the specific points where a human must review, approve, or reject the AI output before it moves forward. Make these gates explicit in your process documentation and tooling so they cannot be skipped.
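One way to make such a gate unskippable in code is to require a recorded, attributed approval before the output moves forward. A minimal sketch, with hypothetical names:

```python
from dataclasses import dataclass

@dataclass
class ReviewDecision:
    approved: bool
    reviewer: str
    rationale: str

def apply_gate(ai_output: str, decision: ReviewDecision) -> str:
    """Pass an AI output through a mandatory human review gate.

    The output proceeds only if an explicit, attributed approval
    exists; a rejection stops the workflow and carries the
    reviewer's rationale with it.
    """
    if not decision.approved:
        raise PermissionError(
            f"Rejected by {decision.reviewer}: {decision.rationale}"
        )
    return ai_output
```

Because the gate raises rather than returning a flag, downstream steps cannot accidentally consume an unapproved output, and every decision is attributable to a named reviewer.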
A note on the tension between flow and governance: practices like Continuous Delivery and Continuous Deployment rightly teach us to eliminate manual gates that slow delivery. HITL review gates are not a step backward. They apply specifically where AI is making or influencing decisions that carry real risk. For lower-risk AI tasks, lean on automated checks, Test Automation, Canary Releases, and Feature Toggles to keep things flowing. Reserve human gates for where the cost of being wrong justifies the cost of slowing down.
Designate specific people as reviewers and give them the authority and time to reject or override AI outputs. A review gate is meaningless if the reviewer feels pressured to rubber-stamp results or lacks the expertise to evaluate them.
The volume of AI-assisted decisions will only grow. Rather than reviewing every output from scratch, document prior human judgments and treat them as precedent. When a reviewer approves a pattern or rejects an approach, capture that rationale so future reviews of similar outputs can be resolved quickly or even automated. This is how you scale human verification without scaling headcount.
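A precedent store can be as simple as a keyed record of prior rulings that reviewers consult before starting a fresh review. A sketch under assumed names; how patterns are labeled and matched is left open here:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Precedent:
    decision: str    # "approve" or "reject"
    rationale: str
    reviewer: str

# Keyed by a pattern label; the labeling scheme is illustrative.
precedents: dict[str, Precedent] = {}

def record(pattern: str, p: Precedent) -> None:
    """Capture a reviewer's ruling so similar outputs can reuse it."""
    precedents[pattern] = p

def resolve(pattern: str) -> Optional[Precedent]:
    """Return a prior ruling on this pattern, if one exists,
    so the review can be fast-tracked or automated."""
    return precedents.get(pattern)
```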
Establish a mechanism for reviewers to report errors and patterns back to the team maintaining the AI system. Track what types of errors occur, how often, and whether they are improving over time. Use this data to retrain models and refine prompts. Over time, these feedback loops mature into structured evals: systematic evaluations that measure AI output quality against human-defined criteria and tell you whether the system is actually getting better.
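Tracking "how often, and whether they are improving" can start as a simple aggregation of reviewer reports into an error rate per review period. A minimal sketch, assuming reports arrive as (period, error-type) pairs:

```python
from collections import Counter

def error_rate_by_period(reports):
    """Compute the error rate per review period.

    `reports` is an iterable of (period, error_type) pairs, where
    error_type is None when the output was judged correct. The
    result shows whether quality is trending up or down over time.
    """
    totals, errors = Counter(), Counter()
    for period, error_type in reports:
        totals[period] += 1
        if error_type is not None:
            errors[period] += 1
    return {p: errors[p] / totals[p] for p in totals}
```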
Regularly review your oversight model. As the AI system matures and error rates change, you may be able to move some workflows from HITL to HOTL. Conversely, if new failure modes emerge, you may need to tighten oversight. The goal is a living process, not a one-time setup.
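That periodic review can be made data-driven with a simple threshold rule: relax a workflow from HITL to HOTL only after its error rate has stayed low for several consecutive periods. The threshold and streak values below are placeholders, not recommendations:

```python
def can_relax_to_hotl(recent_error_rates, threshold=0.02, streak=3):
    """Suggest moving a workflow from HITL to HOTL only if the
    observed error rate stayed below `threshold` for the last
    `streak` review periods. Values here are illustrative."""
    if len(recent_error_rates) < streak:
        return False  # not enough history to justify loosening
    return all(r < threshold for r in recent_error_rates[-streak:])
```

The inverse check (tightening back to HITL when new failure modes push the rate up) follows the same shape, keeping the oversight model a living process.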
Check out these links, which can help you dive a little deeper into running the Human-in-the-Loop practice with your team, customers, or stakeholders.