Dechive

What is the ReAct Pattern? The Reasoning and Action Principles of AI Agents

An AI that thinks on its own, uses tools, and re-evaluates by looking at the results: the structure and design method of the ReAct pattern.

Introduction: The Question Left by Episode 11

As I wrapped up episode 11, I left this teaser:

"There's a pattern where AI decides 'which tool to use' on its own, searches directly, observes results, and judges the next action. This is ReAct."

RAG is powerful but has limitations. An external system decides when to search, and people design what to search for. The model simply receives already-found documents and answers based on them.

However, real-world problems aren't that simple. To answer "What's the weather in Seoul today?" you need to call a weather API. To respond to "Find the bug in this code," you need to run the code. To answer "Compare the latest prices of competitors A and B," you need to search two sites in sequence.

For problems like these, "what to search for" can't be determined in advance. The model must reason through the problem, judge for itself what it needs, use tools directly, observe the results, and decide its next action.

ReAct is exactly the pattern that structures this process.


1. What is ReAct?

1.1. Meaning of the Name

ReAct is a compound of Reasoning and Acting. It was first proposed in a 2022 paper by Google Research and Princeton University, and has since become the standard pattern for AI agent design.

As the name suggests, it weaves together two things:

  • Reasoning: Analyze the current situation and reason about what should be done
  • Acting: Use tools or take action based on the reasoning

And a third element is added:

  • Observation: Observe the results of the action and reflect them back into reasoning

This three-stage cycle, repeated until the task is complete, is the core of ReAct.

1.2. How is it Different from CoT?

Chain-of-Thought (CoT), covered in episode 5, is a technique that unfolds the reasoning process step-by-step to help the model think more accurately. However, CoT only reasons. It doesn't interact with the external world.

# CoT — only reasons
Thought: To know Seoul's current temperature, I need weather data.
Thought: From training data, recalling Seoul's average temperature...
Thought: Since it's April, it should be around 15 degrees.
Answer: Seoul's weather is about 15 degrees. (← Guess)

# ReAct — reasons and acts
Thought: To know Seoul's current temperature, I need a real-time weather API.
Action: weather_api("Seoul")
Observation: {"temperature": 11, "condition": "Clear", "humidity": 45}
Thought: The actual temperature is 11 degrees and it's clear.
Answer: Current Seoul weather is clear with a temperature of 11 degrees.

CoT reasons using only the model's internal knowledge. ReAct, when external information is needed during reasoning, uses tools to fetch actual data. It then observes the result and continues reasoning.


2. ReAct Cycle: Thought → Action → Observation

ReAct cycle flowchart

The core of ReAct is the repetition of three stages.

[Thought] Analyze the current situation and determine what to do next
    ↓
[Action] Call a tool according to the decision, or generate a final answer
    ↓
[Observation] Receive the tool call result
    ↓
[Thought] Analyze based on the result and make a decision again
    ↓
    ... (repeat) ...
    ↓
[Final Answer] When sufficient information is gathered, generate a final answer

Let's examine each stage in detail.

2.1. Thought — Reasoning Stage

The model analyzes the current situation and plans the next action. This stage is an internal reasoning space not shown to users. Since it's where the model organizes its thoughts, it should be able to think freely without constraints.

A good Thought includes three things:

  1. Current State Understanding: What do we know so far?
  2. Next Required Information: What else is needed?
  3. Action Plan: Which tool will be used to get that information?

Thought: The user is asking which of stocks A and B is better.
         To judge this, I need the current stock prices and recent performance of both companies.
         I'll first check the current information for stock A.

2.2. Action — Acting Stage

Execute the action planned in Thought. There are two kinds of actions:

Tool Invocation: Send a request to an external system.

Action: stock_search(ticker="AAPL")
Action: web_search(query="Samsung Electronics Q1 2026 earnings")
Action: calculator(expression="(1250 - 980) / 980 * 100")
Action: code_executor(code="import pandas as pd\ndf.describe()")

Final Answer Generation: When sufficient information is gathered, end the loop and respond to the user.

Action: final_answer("Stock A's current price is ... Stock B's price is ...")

Clearly distinguishing between these two is important. If the model confuses tool invocation with final answer generation, the loop will either end unintentionally or never end.

2.3. Observation — Observation Stage

In this stage, the tool call result is returned to the model. The model does not perform this stage itself; external systems (code, APIs, databases) generate the result and inject it into the context.

Observation: {"ticker": "AAPL", "price": 193.45, "change": "+1.2%",
              "pe_ratio": 28.4, "market_cap": "2.9T"}

Once Observation is added to the context, the model returns to the Thought stage and plans the next action based on this information.
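Mechanically, the Observation step is just the host system appending the tool result to the context before calling the model again. A minimal Python sketch (the `inject_observation` helper and the list-of-strings context are illustrative, not from any specific framework):

```python
import json

def inject_observation(context: list[str], tool_result: dict) -> list[str]:
    """Append a tool result to the context as an Observation line.

    The model never writes this line itself; the host system does,
    then calls the model again so the next Thought can use it.
    """
    context.append("Observation: " + json.dumps(tool_result))
    return context

context = [
    "Thought: To know Seoul's current temperature, I need a real-time weather API.",
    'Action: weather_api("Seoul")',
]
inject_observation(context, {"temperature": 11, "condition": "Clear", "humidity": 45})
```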

2.4. Complete Actual Flow

Question: "Compare Samsung Electronics and SK Hynix stock prices"

Thought: I need to compare the current stock prices of both companies.
         First, I'll check Samsung Electronics' stock price.

Action: stock_search(ticker="005930")

Observation: {"name": "Samsung Electronics", "price": 74200, "change": "-0.8%",
              "52w_high": 88400, "52w_low": 54100}

Thought: Samsung Electronics' stock price is 74,200 won.
         Now I'll check SK Hynix's stock price.

Action: stock_search(ticker="000660")

Observation: {"name": "SK Hynix", "price": 198500, "change": "+2.1%",
              "52w_high": 238000, "52w_low": 141500}

Thought: SK Hynix's stock price is 198,500 won.
         I have sufficient information to compare the two stocks.
         It would be useful to also calculate their current position relative to 52-week highs.

Action: final_answer(
  "Samsung Electronics: 74,200 won (-16.1% from 52-week high of 88,400 won)\n
   SK Hynix: 198,500 won (-16.6% from 52-week high of 238,000 won)\n
   Both stocks have declined about 16% from their 52-week highs."
)
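The host system has to parse traces like the one above to know which tool to run. A minimal sketch of extracting the tool name and raw arguments from an Action line (the regex and the `parse_action` helper are illustrative assumptions, not part of any standard):

```python
import re

# Matches lines like: Action: stock_search(ticker="005930")
ACTION_RE = re.compile(r'^Action:\s*(\w+)\((.*)\)\s*$')

def parse_action(line: str):
    """Return (tool_name, raw_argument_string), or None for a line
    that is not a well-formed Action line."""
    m = ACTION_RE.match(line.strip())
    if m is None:
        return None
    return m.group(1), m.group(2)
```

For example, `parse_action('Action: stock_search(ticker="005930")')` yields `('stock_search', 'ticker="005930"')`, while a Thought line yields `None`. Production frameworks usually sidestep this fragile text parsing by using structured tool-calling APIs instead.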

3. Tool Design: How to Create Tools the Model Can Use

ReAct's performance heavily depends on tool quality. No matter how good the reasoning ability, poorly designed tools won't be utilized properly.

3.1. Four Elements of Tool Specification

When informing the model about tools, four elements must be included:

Tool Name: weather_api
Description: Takes a city name and returns current weather information.
             Uses real-time data, so this tool must always be used for weather-related questions.
Parameters:
  - city (string, required): Name of city to query. Supports both Korean and English.
  - unit (string, optional): Temperature unit. "celsius"(default) or "fahrenheit"
Return Value: {"temperature": number, "condition": string, "humidity": number, "wind_speed": number}

  • Tool Name: Should be clear and self-explanatory, like a function name
  • Description: Specify when to use this tool. Conditions for use should be stated specifically, like "always for weather-related questions"
  • Parameters: Specify the type, required status, and allowed values for each parameter
  • Return Value: The model needs to understand what kind of data is returned to interpret results properly
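In code, such a specification is often expressed as a JSON-Schema-style dict, similar to the format that common function-calling APIs accept. A sketch for the weather tool above (the exact field names are illustrative):

```python
# Tool specification as a dict, loosely following the JSON-Schema
# style used by common function-calling APIs (field names illustrative).
weather_api_spec = {
    "name": "weather_api",
    "description": (
        "Takes a city name and returns current weather information. "
        "Uses real-time data, so always use this tool for weather-related questions."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string",
                     "description": "Name of city to query. Korean or English."},
            "unit": {"type": "string",
                     "enum": ["celsius", "fahrenheit"],
                     "default": "celsius"},
        },
        "required": ["city"],
    },
    # Documenting the return shape helps the model interpret Observations
    "returns": {"temperature": "number", "condition": "string",
                "humidity": "number", "wind_speed": "number"},
}
```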

3.2. Good vs. Bad Tool Descriptions

# Bad tool description
Tool Name: search
Description: Searches something

# Good tool description
Tool Name: web_search
Description: Search the latest news, real-time information, and events after the training cutoff.
             Use this tool whenever up-to-date information unknown to the model is needed.
             Search quality is higher when queries are concise and keyword-focused.

A good tool description enables the model to judge for itself "when to use this tool." Ambiguous descriptions confuse the model about which of similar tools to choose.

3.3. Minimize the Number of Tools

The more tools available, the higher the probability that the model will choose the wrong one. As a practical guideline, 5–7 tools or fewer per agent are recommended.

Combine tools with similar functions. Rather than having separate "news search" and "web search," it's better to add a source parameter (news / general) to "web_search."


4. ReAct Prompt Design

4.1. System Prompt Structure

A ReAct agent's system prompt consists of three sections:

<role>
  You are an AI agent that uses tools to answer user questions.
  Generate a final answer only after collecting sufficient information.
</role>

<tools>
  The following tools are available. Tools must be invoked in the format below.

  [web_search]
  Description: Search latest news and real-time information.
  Parameters: query (string, required)
  Returns: Top 5 search results with titles and content summaries

  [calculator]
  Description: Calculate mathematical expressions. Use for all situations requiring numerical calculations.
  Parameters: expression (string, required)
  Returns: Calculation result (number)

  [final_answer]
  Description: Deliver a final answer to the user when sufficient information is gathered.
  Parameters: answer (string, required)
  Calling this tool ends the loop.
</tools>

<format>
  Follow the format below strictly.

  Thought: [Analyze current situation and plan next action]
  Action: [tool_name]([parameters])

  Upon receiving tool result:
  Observation: [Tool return value]
  Thought: [Analyze result and plan next steps]
  Action: [Next tool invocation or final_answer]

  When preparing final answer:
  Thought: [Confirm sufficient information]
  Action: final_answer([answer])
</format>

4.2. Loop Control

Since ReAct is a structure where cycles repeat, loop control is essential. Two things must be set:

Maximum Steps Limit

MAX_STEPS = 10  # hard cap on Thought → Action → Observation cycles

for step in range(MAX_STEPS):
    response = llm(context)          # llm() and execute_tool() are placeholders

    if "final_answer" in response:
        break  # Normal termination: the model produced a final answer

    tool_result = execute_tool(response)
    context.append(tool_result)
else:
    # The for-else branch runs only if the loop was never broken,
    # i.e. MAX_STEPS was exhausted — force termination
    context.append("Maximum steps exceeded. Providing answer based on information gathered so far.")
    response = llm(context)

Early Termination Conditions

End the loop if the model calls final_answer, errors repeat, or the same tool is called with the same parameters three or more times. This is the most practical way to prevent infinite loops.
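The repeated-action check can be sketched with a counter keyed on the exact action string (the class name is illustrative; the three-repeat threshold follows the guideline above):

```python
from collections import Counter

class LoopGuard:
    """Early-termination check: flag when the identical action
    (same tool, same parameters) has occurred `max_repeats` times."""
    def __init__(self, max_repeats: int = 3):
        self.max_repeats = max_repeats
        self.counts = Counter()

    def should_stop(self, action: str) -> bool:
        self.counts[action] += 1
        return self.counts[action] >= self.max_repeats
```

In the step loop shown earlier, call `should_stop(response)` each iteration and break out when it returns True.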

4.3. Context Accumulation Method

ReAct's context grows with each step. Thoughts, Actions, and Observations accumulate, rapidly consuming tokens. More steps mean higher costs and increased risk of hitting the context limit.

A practical solution is Summarization. Once steps exceed 5, summarize previous Thought/Action/Observation concisely to reduce cumulative tokens.

# Before summarization (5 steps)
Thought: I need to check Samsung Electronics stock price.
Action: stock_search(ticker="005930")
Observation: {"price": 74200, ...}
Thought: I need to check SK Hynix.
Action: stock_search(ticker="000660")
Observation: {"price": 198500, ...}
...

# After summarization
[Information gathered so far]
- Samsung Electronics current price: 74,200 won, -16.1% from 52-week high
- SK Hynix current price: 198,500 won, -16.6% from 52-week high
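A sketch of this compaction step (`llm` is a placeholder for any completion call; the five-step threshold follows the guideline above):

```python
SUMMARIZE_AFTER = 5  # compact once the trace exceeds this many Action steps

def compact_context(context: list[str], llm) -> list[str]:
    """Replace everything after the system prompt with an LLM-written
    summary once the trace has grown past SUMMARIZE_AFTER steps."""
    steps = sum(1 for line in context if line.startswith("Action:"))
    if steps <= SUMMARIZE_AFTER:
        return context
    system, trace = context[0], "\n".join(context[1:])
    summary = llm("Summarize the information gathered so far concisely:\n" + trace)
    return [system, "[Information gathered so far]\n" + summary]
```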

5. ReAct Failure Patterns

Common failure patterns encountered when first implementing ReAct:

5.1. Infinite Loop — Can't Find Termination Condition

The model fails to recognize when to call final_answer and continues calling tools. It keeps making decisions like "I should verify more" even when sufficient information is already available.

Thought: I've checked Samsung Electronics and SK Hynix prices.
         If I also check LG Electronics, the comparison would be more thorough.
Action: stock_search(ticker="066570")
Observation: ...
Thought: If I also check LG Display, it would be even better.
...  ← Already have all information needed to answer, but continues

The solution is to explicitly state termination conditions in the system prompt:

When you judge that sufficient information to directly answer the question has been gathered,
immediately call final_answer. Collecting additional information is not always better.
Prioritize providing a concise answer with key information.

5.2. Hallucinated Tool Invocation — Using Non-existent Tools

The model calls tools that weren't provided or calls existing tools by different names.

Action: google_search("Samsung Electronics stock price")  ← Provided tool name is "web_search"
Action: stock_price_api(ticker="005930")  ← Non-existent tool

State in the tool description that the model must only use tools from the list provided, and return error messages as Observation when non-existent tools are called, giving the model a chance to correct itself.

def execute_tool(action: str) -> str:
    # parse_tool_name() extracts the tool name from the Action line (helper not shown)
    tool_name = parse_tool_name(action)
    if tool_name not in AVAILABLE_TOOLS:
        # Return the error as an Observation so the model can self-correct
        return f"Observation: Error — '{tool_name}' is not in the list of available tools. Available tools: {list(AVAILABLE_TOOLS.keys())}"
    return AVAILABLE_TOOLS[tool_name](action)

5.3. Premature Termination — Answering with Insufficient Information

The model calls final_answer before gathering sufficient information. It generates an answer based on incomplete information from a single search.

Action: web_search("Samsung Electronics SK Hynix stock price")
Observation: Mostly news about Samsung Electronics. No SK Hynix information.
Action: final_answer("Samsung Electronics stock price is ...")  ← Missing SK Hynix info

Add this condition to the system prompt: "Call final_answer only when information for all items mentioned in the question has been gathered."

5.4. Ignoring Observation

The model doesn't properly reflect tool results and answers based on previous knowledge instead.

Action: stock_search(ticker="005930")
Observation: {"price": 74200}
Thought: Samsung Electronics stock price was around 60,000 won in training data.
Answer: Samsung Electronics stock price is about 60,000 won.  ← Ignores Observation

In this case, add this rule to the system prompt: "When Observation data differs from previous knowledge, always prioritize Observation."


6. RAG vs. ReAct: When to Use What

RAG and ReAct both use external information but serve different purposes.

Item           | RAG                        | ReAct
Search Subject | External system            | Model itself
Tool Types     | Vector search DB           | Diverse tools (APIs, DBs, code execution)
Suitable For   | "Find X in this document"  | "Compare X and Y"
Steps          | Single step                | Multiple steps
Cost           | Low                        | High (LLM called per step)
Complexity     | Low                        | High

RAG is sufficient for simple search and document-based answers. ReAct is suitable when multiple stages of judgment and diverse tool usage are needed. In practice, the two patterns are often combined—using RAG tools within ReAct loops when needed.

# RAG + ReAct Combination
[Tool List]
- vector_search: Search related documents in internal document database (← RAG)
- web_search: Search real-time external information
- calculator: Calculate mathematical expressions
- final_answer: Generate final answer

Conclusion: From Simple Response to Agent

Episodes 1 through 11 covered "how to make the model answer better." The goal was to improve the model's answer quality through better questions, clearer structure, and more accurate context.

ReAct is a different dimension. The model doesn't simply receive input and produce output; it judges for itself, acts, observes results, and judges again. It becomes an agent finding its own path toward a goal rather than a machine answering prompts.

Thought → Action → Observation. The repetition of these three stages is the foundation of AI agents.

Summary of Core Principles

Principle         | Core Idea
Cycle Structure   | Repeat Thought → Action → Observation; end with final_answer when sufficient information is gathered
Tool Design       | Specify "when to use" in descriptions; keep tools to 5–7 or fewer per agent
Loop Control      | Always set maximum steps; force termination if the same action repeats
Observation First | Tool results always take priority over prior knowledge
RAG + ReAct       | Combining patterns lets you leverage both internal documents and external real-time data

Toward Episode 13

ReAct is a pattern where a single agent reasons and acts autonomously. However, a single agent has limits for achieving complex goals. Separating agents for planning, execution, and review produces more sophisticated results.

[Episode 13: Agentic Prompting – Designing Multi-Agent Systems Where Multiple Agents Collaborate] covers design and prompting strategies for multi-agent systems where multiple agents divide roles and collaborate, going beyond single agents.
