Chapter 7 — The Agentic Loop | The Agentic Programming Language

The while loop is the foundation of persistence in programming. Without it, a program executes once and terminates. With it, a program can keep going — iterating on a problem, retrying on failure, waiting for a condition to be met. Correctness requires a well-defined stopping condition — one that will eventually evaluate to False. Without it, the loop runs forever.

The Agentic Loop is the while loop of the reasoning world. An agent executes a cycle: it thinks about what to do, takes an action, observes the result, and decides whether to continue or stop. This cycle — formalized as the ReAct pattern (Reason + Act) — is the dominant execution model in modern agent frameworks.

But where the while loop's stopping condition is a boolean expression — deterministic, evaluable, unambiguous — the agentic loop's stopping condition is a reasoned assessment: has sufficient evidence been gathered to produce a correct answer? This shift from mechanical evaluation to reasoned judgment is both the power and the primary source of failure in agentic loops.

7.1 The ReAct Loop: Reason → Act → Observe

The ReAct pattern formalizes the three-step cycle that characterizes effective agent behavior:

Reason: The agent produces a chain-of-thought reasoning step — considers the current state, the information gathered, what is still missing, and identifies the next action. Act: Based on its reasoning, the agent executes an action — typically a tool call to an external system. Observe: The agent receives the result and incorporates it into its understanding of the task state.

Fig 7.1 — The ReAct Loop: Detailed Execution Cycle

ReAct (Reason + Act) is the canonical agentic loop. The key discipline: the stop condition is a reasoned assessment of whether sufficient evidence has been gathered — not a fixed step count.

A concrete 3-iteration example for a customer support query about order #99:

ReAct Loop — Three Iterations

--- Iteration 1 ---
Reason: I need to look up order #99 to understand its status.
Act:    call get_order_status(order_id="99")
Observe: {"status": "Delayed", "original_eta": "2026-02-18",
          "new_eta": "2026-02-28", "reason": "weather delay"}

--- Iteration 2 ---
Reason: The order is delayed due to weather. I should check
        if the user qualifies for delay compensation.
Act:    call get_compensation_policy(delay_reason="weather")
Observe: {"eligible": false, "reason": "weather delays are force majeure"}

--- Iteration 3 ---
Reason: I have all the information I need. New ETA is Feb 28,
        user is not eligible for compensation. I can now answer.
Act:    [Final Answer — exits loop]

7.2 Mapping While → ReAct

`while` Loop Component	ReAct Loop Equivalent	Notes
`while condition:`	"Does the task feel complete?"	Boolean vs. judgment — the critical difference
Loop body	Reason → Act → Observe cycle	One iteration of reasoning + one tool call
Loop variable	Agent's scratchpad / reasoning trace	Accumulates context across iterations
`break` / `return`	Final Answer / Stop Token	Explicit exit signal
Infinite loop	Agent that cannot determine completeness	Most common failure mode
Maximum iterations guard	`max_iterations` parameter	Safety mechanism, not logic

7.3 The Stopping Condition Problem

This is the hardest engineering challenge in the agentic loop. Traditional: while len(queue) > 0: — unambiguous, terminates when queue is empty. Agentic: "Has the agent gathered enough information and reasoning to produce a complete, accurate, and helpful response?" — not evaluable by a boolean expression. Requires judgment about completeness, accuracy, and relevance.

Failure mode 1: Premature termination. The agent declares "final answer" before gathering enough information — the agentic equivalent of an off-by-one error.

Failure mode 2: Non-termination. The agent loops indefinitely, unable to determine that it has gathered sufficient information — the agentic infinite loop.

The practical solution: Always impose an explicit maximum iteration count as a safety guard, separate from the agent's own stopping logic:

Maximum Iterations Guard

MAX_ITERATIONS = 15

for iteration in range(MAX_ITERATIONS):
    action = agent.step(context)
    
    if action.type == "final_answer":
        return action.content  # Natural termination
    
    result = execute_tool(action)
    context.append(result)

# Safety exit: max iterations reached
return agent.produce_best_effort_answer(context)

The max_iterations guard is not the agent's stopping condition — it is a circuit breaker that prevents a misbehaving loop from running indefinitely.

7.4 Self-Correction Within the Loop

The Reflexion pattern adds a self-evaluation step to the loop: at select points, the agent explicitly evaluates its own output against the task requirements and decides whether to retry — before the output reaches the user.

Fig 7.2 — Reflexion Pattern Within the Agentic Loop

Reflexion adds self-evaluation to the agentic loop. The key insight: improvement notes fed into the next attempt are far more effective than simply repeating the original prompt.

7.5 Multi-Agent Loops: Actor-Critic

The self-correction pattern can be externalized: instead of one agent critiquing its own output, two agents are deployed in a loop — an Actor that produces output and a Critic that evaluates it.

Fig 7.3 — Actor-Critic Multi-Agent Loop

The Actor-Critic pattern separates execution from evaluation at the architectural level. Unlike Reflexion (self-critique), the Critic runs as a separate agent with separate instructions — enabling genuinely independent evaluation.

The Actor-Critic loop externalizes the judgment of quality into a separate agent with a focused evaluation task, which is often more reliable than self-evaluation — agents tend to be lenient on their own output.

7.6 Recursion in Agentic Systems

A special case of the agentic loop: an agent that determines a sub-task is too complex to handle directly and spawns a new agent to handle it — potentially a copy of itself with a refined scope.

Recursive Agent with Depth Guard

def agent_call(task: str, depth: int = 0) -> str:
    MAX_DEPTH = 5
    
    result = llm.call(system_prompt=AGENT_PROMPT, user_message=task)
    
    if result.requires_subtask:
        if depth >= MAX_DEPTH:
            return "Task too complex — escalating."
        return agent_call(result.subtask, depth=depth + 1)
    
    return result.answer

The critical safety constraint: explicit depth limits. Without a base case or depth guard, a recursively spawning agent creates an unbounded tree of sub-agents — the agentic equivalent of a stack overflow.

7.7 Chapter Summary

The agentic loop — Reason → Act → Observe — is the fundamental mechanism by which agents exhibit persistence, adaptability, and self-correction. The loop body is a full reasoning cycle. The stopping condition is a judgment, not a boolean. Premature termination and non-termination are the two primary failure modes. Self-correction patterns (Reflexion, Actor-Critic) improve output quality by externalizing evaluation. Recursive agents require explicit depth limits.

Core Principle — Chapter 7

A while loop terminates when a condition is false. An agentic loop terminates when the agent believes the task is done. This distinction — between checking a fact and making a judgment — is the fundamental challenge of agentic control flow.