Agentic Workflow Patterns#
An agent without a workflow pattern is a chatbot. What separates an agent from a single-turn LLM call is the loop: observe the environment, reason about what to do, act, observe the result, and decide whether to continue. The loop structure determines everything – how the agent plans, how it recovers from errors, when it stops, and whether it can handle tasks that take minutes or hours.
This article covers the core loop patterns, when each fits, and how to avoid the common failure modes (infinite loops, context exhaustion, and premature termination).
The Core Loop: Observe-Reason-Act#
Every agentic workflow is a variation of the same loop:
done = False
history = []
while not done:
    observation = observe(environment)                 # look at the current state
    reasoning = think(observation, goal, history)      # reason about what to do next
    action = decide(reasoning)                         # commit to one action
    result = execute(action)                           # act on the environment
    history.append((observation, reasoning, action, result))
    done = check_termination(result, goal, history)    # goal met, budget spent, or stuck?

The patterns differ in how much planning happens before acting, how much context carries between iterations, and when the agent decides it is finished.
Pattern 1: ReAct (Reason + Act)#
ReAct interleaves reasoning and action. The agent thinks about what to do next, does one thing, observes the result, then thinks again. There is no upfront plan – the agent discovers the path as it goes.
Thought: I need to find which service is failing. Let me check the pod status.
Action: kubectl get pods -n production --field-selector=status.phase!=Running
Observation: pod/auth-service-7b4f8c-x9k2z 0/1 CrashLoopBackOff 5 12m
Thought: auth-service is crash-looping. I should check its logs.
Action: kubectl logs auth-service-7b4f8c-x9k2z --tail=50
Observation: Error: connection refused to postgres:5432
Thought: The auth service cannot reach Postgres. Let me check if the database pod is running.
Action: kubectl get pods -l app=postgres -n production
Observation: pod/postgres-0 1/1 Running 0 45d
Thought: Postgres is running. The issue might be a network policy or service endpoint.
Action: kubectl get endpoints postgres -n production
Observation: postgres 10.244.1.15:5432
...

When to use ReAct:
- Exploratory tasks where the path is not known in advance (debugging, investigation)
- Tasks where each step’s result changes what the next step should be
- When the environment is unpredictable and plans would become stale immediately
When not to use ReAct:
- Well-defined, repeatable workflows (deployments, migrations) where the steps are known
- Tasks requiring coordinated multi-step transactions that should be planned holistically
- When context window is limited – ReAct accumulates reasoning traces quickly
Failure mode: wandering. Without a plan, the agent can explore tangents indefinitely. Mitigate with a step budget (max 15 actions), a relevance check (“is this action moving toward the goal?”), or periodic re-grounding (“given what I have learned so far, what is the most direct path to the answer?”).
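A minimal sketch of a budgeted ReAct loop, assuming two hypothetical helpers: `llm_step` returns a thought plus either a tool call or a final answer (and, every few steps, is asked to re-ground on the most direct path), and `run_tool` executes the call. Nothing here is tied to a specific framework.

```python
MAX_STEPS = 15        # step budget: hard cap on actions
REGROUND_EVERY = 5    # every N steps, re-ground on the most direct path to the goal

def react(goal, llm_step, run_tool):
    history = []
    for step in range(1, MAX_STEPS + 1):
        reground = (step % REGROUND_EVERY == 0)
        thought, action, final_answer = llm_step(goal, history, reground)
        if final_answer is not None:
            return final_answer                 # goal-based termination
        observation = run_tool(action)          # exactly one action per iteration
        history.append({"thought": thought, "action": action, "observation": observation})
    # Budget exhausted: stop and report rather than wander forever
    return {"status": "budget_exhausted", "history": history}
```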
Pattern 2: Plan-Then-Execute#
The agent creates a complete plan first, then executes each step. The plan is a structured artifact – a numbered list, a dependency graph, or a state machine – that the agent follows.
Plan:
1. Check pod status across all namespaces
2. Identify pods not in Running state
3. For each failing pod, check logs for error messages
4. Categorize errors (resource limits, crash loops, image pull failures)
5. For crash loops, identify root cause from logs
6. Write summary with root cause and recommended fix for each
Executing step 1...
Executing step 2... (found 3 failing pods)
Executing step 3... (checking logs for each)
...

When to use plan-then-execute:
- Tasks with well-understood structure (deploy X, migrate Y, set up Z)
- When the user needs to approve the plan before execution begins
- When multiple agents will execute different parts of the plan in parallel
- Long-running workflows where the plan serves as a checkpoint for resumption
When not to use:
- When the environment is too unpredictable for upfront planning
- When early steps reveal information that invalidates later steps (use adaptive planning instead)
Failure mode: plan rigidity. The plan assumes conditions that change during execution. Step 4 expects a file that step 2 failed to create. Mitigate with pre-condition checks at each step and a re-planning trigger when a step fails.
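One way to encode that mitigation, sketched here with an illustrative `PlanStep` shape rather than any particular framework: give every step an explicit pre-condition and treat a failed check as a re-planning trigger instead of pressing on.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class PlanStep:
    description: str
    precondition: Callable[[], bool]   # must hold immediately before the step runs
    run: Callable[[], None]

def execute_plan(plan: list[PlanStep]) -> dict:
    for i, step in enumerate(plan):
        if not step.precondition():
            # Conditions have drifted since planning: stop and trigger a re-plan
            return {"status": "replan_needed", "failed_at": i,
                    "reason": f"precondition not met: {step.description}"}
        step.run()
    return {"status": "complete"}
```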
Adaptive Planning (Plan-Execute-Replan)#
A hybrid that plans upfront but re-plans when execution diverges from expectations:
Plan: [step 1, step 2, step 3, step 4, step 5]
Execute step 1: success
Execute step 2: FAILED (unexpected error)
Re-plan given: step 1 succeeded, step 2 failed with error X
New plan: [step 2b (alternative approach), step 3, step 4, step 5]
Execute step 2b: success
Execute step 3: success
...

This is the most robust pattern for complex, multi-step tasks. The upfront plan provides structure and visibility. Re-planning provides resilience. The cost is additional LLM calls for each re-plan.
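A sketch of that outer loop, assuming two hypothetical LLM-backed helpers: `make_plan(goal, completed, failure)` produces a fresh plan given what has already succeeded and what just failed, and `execute_step` runs one step and reports whether it worked.

```python
MAX_REPLANS = 3   # each re-plan costs extra LLM calls, so cap them

def plan_execute_replan(goal, make_plan, execute_step):
    completed, failure = [], None
    for _ in range(MAX_REPLANS + 1):
        plan = make_plan(goal, completed, failure)
        for step in plan:
            ok, result = execute_step(step)
            if not ok:
                failure = (step, result)        # feed the failure into the next plan
                break
            completed.append((step, result))
        else:
            return {"status": "complete", "steps": completed}
    return {"status": "gave_up", "completed": completed, "last_failure": failure}
```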
Pattern 3: Iterative Refinement#
The agent produces a draft output, evaluates it against criteria, and refines it in a loop until quality thresholds are met.
Draft 1: Generate Helm chart for PostgreSQL deployment
Evaluate: Missing resource limits, no PVC for data, hardcoded password
Score: 3/10
Refine: Add resource limits, PVC with 10Gi, secret reference for password
Evaluate: Resource limits present, PVC present, password from secret. Missing: health checks, pod disruption budget
Score: 6/10
Refine: Add liveness/readiness probes, add PDB with minAvailable: 1
Evaluate: All critical elements present. Minor: no anti-affinity for HA
Score: 8/10 — above threshold, done

When to use:
- Content generation tasks where quality is measurable (code, configs, documentation)
- When there are explicit quality criteria the agent can check against
- When getting a perfect output on the first try is unlikely
Failure mode: diminishing returns. Each refinement pass costs tokens and time but improves quality by less. Set a maximum number of refinement cycles (typically 3-5) and a “good enough” threshold.
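The loop itself is small. A sketch with both guards in place, assuming hypothetical LLM-backed `generate`, `evaluate`, and `refine` functions, where `evaluate` returns a score and written feedback:

```python
THRESHOLD = 8      # "good enough": stop once the evaluator scores at least this
MAX_CYCLES = 4     # diminishing returns: never refine more than this many times

def refine_until_good(task, generate, evaluate, refine):
    draft = generate(task)
    for _ in range(MAX_CYCLES):
        score, feedback = evaluate(draft, task)
        if score >= THRESHOLD:
            break
        draft = refine(draft, feedback)
    return draft
```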
Pattern 4: Checkpoint-Driven Execution#
For long-running workflows, the agent writes checkpoint documents at key milestones. Each checkpoint captures current state, decisions made, and remaining work. If the session ends or context overflows, a new session can resume from the last checkpoint.
checkpoint-1.md:
Status: Requirements gathered
Decisions: Using PostgreSQL 15, Helm for deployment, 3 replicas
Remaining: [design schema, write Helm chart, test, document]
checkpoint-2.md:
Status: Schema designed and reviewed
Decisions: 12 tables, FTS5 for search, uuid primary keys
Artifacts: schema/0001-init.sql (committed)
Remaining: [write Helm chart, test, document]
checkpoint-3.md:
Status: Helm chart written and tested
Decisions: Used Bitnami PostgreSQL subchart, custom initdb scripts
Artifacts: charts/myapp/ (committed)
Remaining: [document]

When to use:
- Tasks spanning multiple sessions or hours of work
- Tasks where sub-agents need to know the overall context
- When context window limits force periodic summarization
- Collaborative workflows where different agents handle different phases
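A minimal sketch of writing and resuming from checkpoint files like the ones above. The directory layout and field names are illustrative assumptions, not a fixed format:

```python
from pathlib import Path

CHECKPOINT_DIR = Path("checkpoints")

def write_checkpoint(n: int, status: str, decisions: list[str], remaining: list[str]) -> None:
    CHECKPOINT_DIR.mkdir(exist_ok=True)
    body = (f"Status: {status}\n"
            f"Decisions: {', '.join(decisions)}\n"
            f"Remaining: [{', '.join(remaining)}]\n")
    (CHECKPOINT_DIR / f"checkpoint-{n}.md").write_text(body)

def latest_checkpoint() -> str | None:
    # A new session reads the most recent checkpoint and feeds it into its prompt
    files = sorted(CHECKPOINT_DIR.glob("checkpoint-*.md"),
                   key=lambda p: int(p.stem.split("-")[1]))
    return files[-1].read_text() if files else None
```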
This pattern is explored in depth in the companion article on context preservation for long-running workflows.
Task Decomposition Strategies#
Before any loop pattern runs, the agent must decide how to break the task into pieces. Bad decomposition wastes all downstream effort.
Decompose by Independence#
Each subtask should be completable without waiting for another subtask’s result. If B depends on A’s output, they are sequential, not parallel.
Good decomposition (independent):
- Check pod status in namespace production
- Check pod status in namespace staging
- Check recent deployments in the last hour
(All can run simultaneously)
Bad decomposition (hidden dependency):
- Find the failing service
- Check the failing service's logs
(Second task depends on first task's result)
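When subtasks really are independent they can be dispatched concurrently; the dependent pair has to stay sequential. A sketch using asyncio, with `run_subtask` as a hypothetical coroutine that hands one subtask to an agent and returns its result:

```python
import asyncio

async def investigate(run_subtask):
    # Independent checks: dispatch all three at once
    independent = [
        "Check pod status in namespace production",
        "Check pod status in namespace staging",
        "Check recent deployments in the last hour",
    ]
    status_results = await asyncio.gather(*(run_subtask(t) for t in independent))

    # Hidden dependency: the second task needs the first task's output,
    # so these two must run sequentially
    failing_service = await run_subtask("Find the failing service")
    logs = await run_subtask(f"Check the logs of {failing_service}")
    return status_results, logs
```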
Decompose by Scope#

For large tasks, split by boundary: directory, service, file, or component. Each piece gets its own agent or iteration.
Task: "Review this codebase for security issues"
Decomposition:
- Agent A: Review authentication module (src/auth/)
- Agent B: Review API handlers (src/api/)
- Agent C: Review database queries (src/db/)
- Agent D: Review configuration and secrets handling (config/, .env*)
- Leader: Merge findings, check cross-module issues

Decompose by Skill#
Different subtasks require different capabilities. Group by the tools or expertise needed.
Task: "Set up monitoring for the new service"
Decomposition:
- Metrics agent: Configure Prometheus scraping, write recording rules
- Dashboard agent: Build Grafana dashboard with key panels
- Alerting agent: Define alert rules with appropriate thresholds
- Documentation agent: Write runbook for on-call response

Decompose by Risk#
Separate safe, reversible actions from dangerous, irreversible ones. Execute safe actions first to gather information. Gate dangerous actions behind human approval.
Phase 1 (safe, automated):
- Read current configuration
- Validate proposed changes against schema
- Dry-run the migration
Phase 2 (dangerous, requires approval):
- Apply database migration
- Update production config
- Restart services
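A sketch of a risk-gated runner: safe, reversible actions execute immediately to gather information, while dangerous ones are collected and held behind an explicit approval step. The action shape and the `ask_for_approval` hook are assumptions for illustration:

```python
def run_with_risk_gate(actions, ask_for_approval):
    deferred = []
    for action in actions:
        if action["irreversible"]:
            deferred.append(action)     # phase 2: needs explicit human approval
        else:
            action["run"]()             # phase 1: safe and reversible, run now

    if deferred and ask_for_approval(deferred):
        for action in deferred:
            action["run"]()
    return deferred
```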
Termination Conditions#

An agent without a clear termination condition will either stop too early (missing work) or loop forever (burning tokens). Define termination explicitly.
Goal-based termination: The agent checks whether the goal is achieved after each action. “Is the service healthy? Is the test passing? Does the config validate?”
Budget-based termination: Hard limits on iterations (max 20 steps), tokens (max 50K), or wall-clock time (max 5 minutes). When the budget is exhausted, the agent must stop and report what it accomplished and what remains.
Quality-based termination: The agent evaluates its output against criteria. If the score exceeds a threshold, stop. If not, refine. Combined with a max-iterations cap to prevent infinite refinement.
Convergence-based termination: The agent stops when additional iterations produce no meaningful change. “The last 3 refinement passes scored 8.1, 8.2, 8.1 – converged, stop.”
termination:
  conditions:              # Stop when ANY condition is met
    - goal_achieved: true
    - max_iterations: 20
    - max_tokens: 50000
    - no_progress_for: 3   # iterations
  on_termination:
    - summarize_progress
    - list_remaining_work
    - write_checkpoint
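A sketch of a checker with the same any-condition-met semantics as the config above. The `state` dict is an assumed structure the loop keeps updated, and the flatline tolerance for "no progress" is an arbitrary example value:

```python
def should_terminate(state: dict, config: dict) -> tuple[bool, str]:
    if state["goal_achieved"]:
        return True, "goal_achieved"
    if state["iterations"] >= config["max_iterations"]:
        return True, "max_iterations"
    if state["tokens_used"] >= config["max_tokens"]:
        return True, "max_tokens"
    recent = state["progress_scores"][-config["no_progress_for"]:]
    if len(recent) == config["no_progress_for"] and max(recent) - min(recent) < 0.2:
        return True, "no_progress"      # convergence: recent scores have flatlined
    return False, ""
```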
Choosing the Right Pattern#

| Scenario | Pattern | Reason |
|---|---|---|
| Debug an unknown failure | ReAct | Path unknown, each observation changes the next action |
| Deploy a service to production | Plan-then-execute | Steps are known, user should approve the plan |
| Write and refine a Terraform module | Iterative refinement | Quality is measurable, multiple passes improve output |
| Multi-hour infrastructure project | Checkpoint-driven | Needs resumption across sessions, sub-agent delegation |
| Migrate a database schema | Adaptive planning | Plan is known but migration might reveal surprises |
| Security review of a PR | ReAct or peer consensus | Exploratory analysis, or multiple independent reviews |
Start with the simplest pattern that fits. ReAct handles most single-session tasks. Plan-then-execute handles most structured work. Add checkpointing when the task exceeds one session. Add re-planning when the environment is unpredictable. Combine patterns when the task has both exploratory and structured phases.
Anti-Patterns#
The infinite explorer. A ReAct agent that keeps investigating tangents without converging on an answer. Fix with a step budget and periodic relevance checks.
The rigid planner. A plan-then-execute agent that follows a stale plan even when early steps reveal it is wrong. Fix with pre-condition checks and re-planning triggers.
The perfectionist refiner. An iterative refinement agent that keeps polishing past the point of diminishing returns. Fix with a quality threshold and max iterations cap.
The context hoarder. An agent that keeps every tool result and reasoning trace in context until the window overflows. Fix with summarization checkpoints – periodically compress history into a summary and drop the raw details.
The blind delegator. A leader agent that decomposes a task and dispatches to sub-agents without validating the decomposition. Sub-agents do redundant or irrelevant work. Fix by having the leader verify: are subtasks collectively exhaustive? Mutually exclusive? Achievable with assigned tools?