Module 8: AI Agents & Tool Use
Agent Architecture Patterns & Reasoning Loops
An LLM agent is a system where the language model acts as the "brain" — it observes the environment, reasons about what to do next, takes actions (calling tools), and iterates until it achieves a goal.
What Makes an Agent Different?
| Aspect | LLM App | LLM Agent |
|---|---|---|
| Control flow | Predetermined by programmer | Decided by the LLM at runtime |
| Iterations | Fixed (one or few LLM calls) | Dynamic (LLM decides when to stop) |
| Tool use | Pre-scripted | LLM selects tools based on need |
| Memory | Context window only | External memory, state persistence |
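The control-flow difference is easiest to see in code. Below is a minimal sketch of the contrast; `call_llm` and the tool registry are hypothetical stand-ins for a real model client:

```python
def call_llm(prompt: str) -> dict:
    """Hypothetical model client; a real one would return the LLM's output."""
    return {"type": "final", "text": "(stub answer)"}

# LLM app: control flow is fixed by the programmer — always one call.
def summarize_app(document: str) -> str:
    return call_llm(f"Summarize:\n{document}")["text"]

# LLM agent: the model picks the next action and decides when to stop.
def run_agent(task: str, tools: dict, max_steps: int = 10) -> str:
    history = [f"Task: {task}"]
    for _ in range(max_steps):              # hard cap guards against loops
        decision = call_llm("\n".join(history))
        if decision["type"] == "final":     # the LLM chose to stop
            return decision["text"]
        result = tools[decision["tool"]](decision["args"])
        history.append(f"Observation: {result}")
    return "Stopped: max steps reached"
```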
Core Agent Components
1. Brain (LLM)
The core reasoning engine. Decides what to do next given the goal/task, current observations, available tools, and memory state.
2. Memory
| Type | Description | Storage |
|---|---|---|
| In-context | Conversation history | Context window |
| External | Long-term facts | Vector DB, key-value store |
| Episodic | Past task summaries | Database |
| Procedural | How to do tasks | System prompt, code |
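One way these memory types might fit together, as a rough sketch — the class layout is illustrative, and the naive keyword match stands in for vector-DB retrieval:

```python
from dataclasses import dataclass, field

@dataclass
class AgentMemory:
    in_context: list[str] = field(default_factory=list)     # conversation history
    external: dict[str, str] = field(default_factory=dict)  # long-term facts
    episodic: list[str] = field(default_factory=list)       # past task summaries

    def build_prompt(self, task: str, k: int = 3) -> str:
        # Procedural memory lives in the system prompt itself.
        system = "You are a careful research agent."
        # Naive keyword match standing in for vector-DB retrieval.
        recalled = [v for v in self.external.values()
                    if any(w in v.lower() for w in task.lower().split())]
        parts = [system, *recalled[:k], *self.episodic[-k:],
                 *self.in_context, task]
        return "\n".join(parts)

memory = AgentMemory()
memory.external["user_tz"] = "User timezone: Europe/Berlin"
memory.episodic.append("Past task: compiled the Q3 sales report (succeeded)")
print(memory.build_prompt("Convert the meeting time to the user timezone"))
```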
3. Tools
Functions the agent can call to interact with the world:
- Information: web search, database queries, document retrieval
- Computation: calculator, code interpreter, data analysis
- Action: send email, create calendar event, post to API
- Perception: read files, take screenshots, browse web
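A common pattern is to register each tool alongside a JSON-schema-style description the model can read when selecting an action. The shape below mirrors typical function-calling APIs but isn't tied to any specific provider; both tools are stubs:

```python
import json

def web_search(query: str) -> str:
    return f"(stub) top results for: {query}"   # would call a real search API

def calculator(expression: str) -> str:
    # eval() shown only for brevity; use a safe math parser in practice
    return str(eval(expression, {"__builtins__": {}}, {}))

TOOLS = {"web_search": web_search, "calculator": calculator}

TOOL_SCHEMAS = [
    {
        "name": "web_search",
        "description": "Search the web. Use for current facts and prices.",
        "parameters": {"type": "object",
                       "properties": {"query": {"type": "string"}},
                       "required": ["query"]},
    },
    {
        "name": "calculator",
        "description": "Evaluate an arithmetic expression, e.g. '2347 * 32.1507'.",
        "parameters": {"type": "object",
                       "properties": {"expression": {"type": "string"}},
                       "required": ["expression"]},
    },
]

print(json.dumps(TOOL_SCHEMAS[1], indent=2))
print(TOOLS["calculator"]("2347 * 32.1507"))    # ≈ 75457.69
```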
Reasoning Loop Architectures
ReAct (Reason + Act)
"ReAct: Synergizing Reasoning and Acting in Language Models" — Yao et al., 2022
Interleaves reasoning (Thought) with actions. The model explains its reasoning before each tool call, then uses the observation to update its plan.
Thought: I need to find the current price of gold.
Action: search("current gold price USD per ounce")
Observation: Gold is trading at $2,347/oz as of today.
Thought: Now I have the price. The user asked for kg price.
Action: calculator("2347 * 32.1507")
Observation: 75,457.69
Thought: I have the answer.
Final Answer: Gold is currently $2,347/troy oz, or approximately $75,458/kg.
Why ReAct beats pure CoT: Grounding through tool use curbs hallucinated reasoning. The model's thoughts inform tool use, and real observations correct the model's beliefs.
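A compact ReAct loop might look like the sketch below. `call_llm` is a hypothetical model client, the `Action: tool("arg")` format matches the trace above, and `tools` is a registry like the one sketched earlier:

```python
import re

def call_llm(transcript: str) -> str:
    """Hypothetical model client returning the next Thought/Action text."""
    return "Thought: done.\nFinal Answer: (stub)"

def react(task: str, tools: dict, max_steps: int = 8) -> str:
    transcript = f"Task: {task}\n"
    for _ in range(max_steps):
        step = call_llm(transcript)
        transcript += step + "\n"
        if "Final Answer:" in step:                  # the model chose to stop
            return step.split("Final Answer:", 1)[1].strip()
        match = re.search(r'Action:\s*(\w+)\("(.*)"\)', step)
        if match:
            name, arg = match.groups()
            obs = tools.get(name, lambda a: f"unknown tool: {name}")(arg)
            transcript += f"Observation: {obs}\n"    # grounds the next thought
    return "Stopped: step limit reached"
```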
Plan-and-Execute
Separate planning from execution. A "planner" LLM creates a step-by-step plan upfront; an "executor" LLM then carries out each step in turn, without re-planning between steps.
Planner → [Step 1: Search for X] [Step 2: Compute Y] [Step 3: Summarize Z]
Executor → Runs Step 1 → Runs Step 2 → Runs Step 3 → Returns result
When to use: Long, structured tasks where the plan is unlikely to change mid-way. Committing to a plan upfront avoids the cost and drift of re-deciding strategy at every step, at the price of flexibility when a step fails.
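In outline — both functions below are hypothetical placeholders for the two model calls:

```python
def plan_llm(task: str) -> list[str]:
    """Hypothetical planner: would ask the model for a numbered plan."""
    return ["Search for X", "Compute Y", "Summarize Z"]

def execute_llm(step: str, context: str) -> str:
    """Hypothetical executor: would run one step, with tool access."""
    return f"(stub) result of: {step}"

def plan_and_execute(task: str) -> str:
    plan = plan_llm(task)                 # planning happens once, upfront
    context = f"Task: {task}"
    for step in plan:                     # execution walks the fixed plan
        result = execute_llm(step, context)
        context += f"\n{step} -> {result}"
    return execute_llm("Produce the final answer", context)
```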
LATS (Language Agent Tree Search)
Combines Monte Carlo Tree Search with LLM reasoning. The agent systematically explores multiple candidate action sequences, uses a value function (often the LLM itself scoring states) to rank branches, and backtracks from dead ends.
Root State
├── Action A → State A1 → [score: 0.7]
│ ├── Action A1a → State A1a → [score: 0.9] ✓ best path
│ └── Action A1b → State A1b → [score: 0.4]
└── Action B → State B1 → [score: 0.3]
When to use: Tasks with a large search space and clear success criteria (coding, math, planning).
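The sketch below captures only the flavor of LATS, replacing full MCTS (UCT selection, rollouts, value backpropagation) with best-first search over scored states; `propose_actions` and `score_state` are hypothetical model calls:

```python
import heapq

def propose_actions(state: str) -> list[str]:
    return []        # hypothetical: the model proposes candidate next actions

def score_state(state: str) -> float:
    return 0.0       # hypothetical: value function scoring the state in [0, 1]

def tree_search(root: str, budget: int = 20, goal: float = 0.95) -> str:
    frontier = [(-score_state(root), root)]   # max-heap via negated scores
    best_state, best_score = root, score_state(root)
    for _ in range(budget):
        if not frontier:
            break                             # every branch was a dead end
        neg, state = heapq.heappop(frontier)
        if -neg >= goal:
            return state                      # high-value leaf: stop early
        if -neg > best_score:
            best_state, best_score = state, -neg
        for action in propose_actions(state): # expand the most promising node
            child = f"{state} -> {action}"
            heapq.heappush(frontier, (-score_state(child), child))
    return best_state  # backtracking falls out of the priority queue
```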
Reflexion
The agent reflects on its past failures to improve future attempts. After each failed attempt, the agent writes a "reflection" — a natural language summary of what went wrong and how to fix it. This reflection is prepended to the next attempt's context.
Attempt 1 → Fail → Reflection: "I forgot to validate the input"
Attempt 2 (+ Reflection) → Fail → Reflection: "Edge case with empty strings"
Attempt 3 (+ Reflections) → Success
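A minimal Reflexion driver, where `attempt`, `passed`, and `reflect` stand in for the model attempt, an external evaluator (such as a test suite), and a reflection prompt:

```python
def attempt(task: str, reflections: list[str]) -> str:
    return "(stub) candidate solution"    # model tries; reflections prepended

def passed(solution: str) -> bool:
    return True                           # e.g. run the test suite

def reflect(task: str, solution: str) -> str:
    return "(stub) what went wrong and how to fix it"

def reflexion(task: str, max_attempts: int = 3) -> str | None:
    reflections: list[str] = []           # persists across attempts
    for _ in range(max_attempts):
        solution = attempt(task, reflections)
        if passed(solution):
            return solution
        reflections.append(reflect(task, solution))
    return None                           # all attempts failed
```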
Agent Failure Modes & Mitigations
| Failure | Cause | Mitigation |
|---|---|---|
| Infinite loops | Model repeats a call that yields no new information | Max-iteration cap + duplicate-call detection |
| Hallucinated tool calls | Vague or underspecified tool schemas | Detailed descriptions + enum-constrained parameters |
| Scope creep | Vague task definition | Clear stopping criteria |
| Error propagation | Early mistake cascades | Validate at each step |
| Context overflow | Long tool outputs fill context | Summarize tool results |
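Several of these mitigations can be wired into the tool-call path itself. The sketch below (names and thresholds illustrative) combines duplicate-call detection with output truncation, complementing the step caps in the loops above:

```python
MAX_OBS_CHARS = 2_000   # truncation threshold (illustrative value)

def guarded_call(tool, arg, seen: set) -> str:
    """Run a tool with duplicate detection and output truncation."""
    key = (tool.__name__, repr(arg))
    if key in seen:                  # exact repeat of an earlier call
        return "Warning: repeated call detected; try a different action."
    seen.add(key)
    obs = str(tool(arg))
    if len(obs) > MAX_OBS_CHARS:     # keep long outputs from flooding context
        obs = obs[:MAX_OBS_CHARS] + " ...[truncated; request more if needed]"
    return obs
```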