TL;DR

The ReAct (Reasoning + Acting) framework is the architectural backbone of modern AI Agents. It interleaves explicit reasoning (Thought) with tool execution (Action) and environment feedback (Observation). This iterative loop reduces hallucinations, lets models handle complex multi-step tasks, and connects static LLMs to real-time external data.

✨ Key Takeaways

  • Actionable Reasoning: ReAct forces models to verbalize their plan before executing a tool.
  • Dynamic Grounding: Instead of guessing facts, a ReAct agent searches for them, reads the Observation, and updates its mental model.
  • Error Recovery: If an action fails (e.g., a 404 error from an API), the agent can reason about the failure and try a different approach.
  • Foundation of Agents: Major frameworks like LangChain, LlamaIndex, and AutoGen build on the ReAct loop as a core agent execution strategy.

💡 Quick Tool: Building a ReAct Agent? Use our JSON Formatter to validate the JSON schemas of the tools you provide to your LLM.
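For illustration, here is what a tool definition typically looks like in OpenAI's function-calling JSON schema format; the `get_weather` tool, its description, and its parameters are hypothetical examples, not part of any specific API:

```python
import json

# Hypothetical tool definition following OpenAI's function-calling schema format.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Returns the current weather for a given city.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City name, e.g. London",
                }
            },
            "required": ["location"],
        },
    },
}

# Round-trip through json.dumps/json.loads to confirm the schema is valid JSON.
serialized = json.dumps(weather_tool, indent=2)
parsed = json.loads(serialized)
print(parsed["function"]["name"])  # get_weather
```

A schema that fails to round-trip cleanly here would also fail when passed to an LLM API, so this is a cheap sanity check before wiring tools into an agent.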

What is the ReAct Framework?

Introduced in the 2022 paper "ReAct: Synergizing Reasoning and Acting in Language Models" by researchers at Princeton University and Google, ReAct stands for Reasoning and Acting.

Before ReAct, Large Language Models (LLMs) were generally used in two isolated ways:

  1. Reasoning-only: Prompting the model to think step-by-step (Chain of Thought) to solve math or logic puzzles based solely on its internal, pre-trained weights.
  2. Acting-only: Prompting the model to output a command or API call, without requiring it to explain why it chose that action.

ReAct merged these two paradigms. It posits that reasoning helps the model decide which action to take, while acting allows the model to gather information that improves its future reasoning.

📝 Glossary: To fully understand ReAct, it helps to be familiar with Chain of Thought (CoT) and general AI Agent concepts.

How ReAct Works: The Trajectory Loop

The core of ReAct is a loop that repeats until the model determines it has enough information to give a final answer. Each iteration consists of three distinct phases:

  1. Thought: The LLM analyzes the current state of the problem and decides what information it needs next.
  2. Action: The LLM selects a specific tool (e.g., Search, Calculator, Weather_API) and provides the necessary input parameters.
  3. Observation: The system executes the tool and returns the raw output back to the LLM.

The LLM then generates a new Thought based on the Observation, and the cycle repeats.
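Stripped of any model specifics, the cycle above can be sketched as a plain Python loop. The `mock_llm` below is a stand-in that replays canned responses so the example runs without API calls; a real agent would send the accumulated history to an LLM at that point:

```python
# Minimal sketch of the ReAct loop with a mocked model (no API calls).
# Canned responses simulate what an LLM would emit on each turn.
SCRIPT = [
    "Thought: I need the weather in London.\nAction: get_weather: London",
    "Thought: I have the data.\nAnswer: It is rainy and 12°C in London.",
]

def mock_llm(history, turn):
    return SCRIPT[turn]  # a real agent would send `history` to an LLM here

def get_weather(location):
    return "Rainy, 12°C"  # mock tool

def react_loop(query, max_steps=5):
    history = [f"Question: {query}"]
    for turn in range(max_steps):
        output = mock_llm(history, turn)   # Thought + Action (or Answer)
        history.append(output)
        if "Answer:" in output:            # terminal state: final answer reached
            return output.split("Answer:", 1)[1].strip()
        if "Action: get_weather:" in output:  # dispatch the requested tool
            location = output.split("Action: get_weather:", 1)[1].strip()
            history.append(f"Observation: {get_weather(location)}")
    return None  # iteration limit hit without an answer

result = react_loop("What is the weather in London?")
print(result)  # It is rainy and 12°C in London.
```

The key structural point is that the Observation is appended back into the history, so the next Thought is conditioned on real tool output rather than on the model's guesses.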

```mermaid
flowchart TD
    A[User Query] --> B(Thought)
    B -->|Decides next step| C["Action / Tool Use"]
    C -->|Executes API/Search| D[Observation]
    D -->|Feeds data back| B
    B -->|Has enough info| E[Final Answer]
    style A fill:#e1f5fe,stroke:#01579b
    style B fill:#fff3e0,stroke:#e65100
    style C fill:#e8f5e9,stroke:#2e7d32
    style D fill:#f3e5f5,stroke:#880e4f
    style E fill:#e8f5e9,stroke:#2e7d32
```

A Real-World ReAct Trace

User Query: "What is the age difference between the current US President and the CEO of Apple?"

  • Thought 1: I need to find out who the current US President is, and who the CEO of Apple is. Then I need their ages. Let's start with the US President.
  • Action 1: Search[Current US President 2026]
  • Observation 1: [Search Results: ...]
  • Thought 2: Okay, the president is X and is Y years old. Now I need to find the CEO of Apple.
  • Action 2: Search[CEO of Apple 2026]
  • Observation 2: [Search Results: Tim Cook...]
  • Thought 3: Tim Cook is the CEO. Let me find his age.
  • Action 3: Search[Tim Cook age]
  • Observation 3: [Search Results: Tim Cook is 65 years old...]
  • Thought 4: The president is Y, Tim Cook is 65. The difference is Y - 65. Let me calculate this.
  • Action 4: Calculator[Y - 65]
  • Observation 4: [Result: Z]
  • Thought 5: I have the final answer.
  • Final Answer: "The age difference is Z years."

ReAct vs Chain of Thought (CoT)

| Feature | Chain of Thought (CoT) | ReAct |
| --- | --- | --- |
| Information Source | Internal model weights only | External tools + internal weights |
| Hallucination Risk | High (if facts are unknown/outdated) | Low (can verify facts via tools) |
| Error Recovery | Poor (cascading logic errors) | Excellent (observes tool failures and adjusts) |
| Use Case | Math, logic, summarization | Web research, API interaction, multi-step execution |

ReAct in Practice (Python Example)

While frameworks like LangChain abstract this away, building a ReAct agent from scratch reveals how it actually works under the hood. It relies heavily on strict Prompt Engineering.

```python
import openai
import re

client = openai.OpenAI()

# Define a simple mock tool
def get_weather(location):
    if "London" in location:
        return "Rainy, 12°C"
    return "Sunny, 25°C"

# The System Prompt that enforces the ReAct loop
REACT_PROMPT = """
You run in a loop of Thought, Action, PAUSE, Observation.
At the end of the loop you output an Answer.

Use Thought to describe your thoughts about the question you have been asked.
Use Action to run one of the actions available to you - then return PAUSE.
Observation will be the result of running those actions.

Your available actions are:
get_weather:
e.g. get_weather: London
Returns the current weather in a location.

Example session:
Question: What is the weather in London?
Thought: I should check the weather in London.
Action: get_weather: London
PAUSE

You will be called again with this:
Observation: Rainy, 12°C

You then output:
Answer: The weather in London is rainy and 12°C.
"""

def run_react_agent(prompt):
    messages = [
        {"role": "system", "content": REACT_PROMPT},
        {"role": "user", "content": prompt}
    ]

    for _ in range(5):  # Maximum 5 iterations to prevent infinite loops
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=messages,
            stop=["Observation:"]  # Stop generating when expecting an observation
        )

        result = response.choices[0].message.content
        print(result)
        messages.append({"role": "assistant", "content": result})

        if "Answer:" in result:
            break

        # Parse the Action
        action_match = re.search(r"Action: (\w+): (.*)", result)
        if action_match:
            action, action_input = action_match.groups()

            # Execute the tool
            if action == "get_weather":
                observation = get_weather(action_input)
            else:
                observation = "Error: Tool not found"

            print(f"Observation: {observation}")
            messages.append({"role": "user", "content": f"Observation: {observation}"})

# Run the agent
run_react_agent("What's the weather like in London?")
```

🔧 Try it now: When defining tools for ReAct agents, ensure your JSON descriptions are flawless. Validate them with our JSON Formatter.

Best Practices for ReAct Agents

  1. Provide Clear Tool Descriptions — The LLM's Thought relies entirely on knowing exactly what a tool does. A description like "Searches the web" is too vague. Use "Searches the live internet for current events, news, and facts."
  2. Implement an Iteration Limit (Max Steps) — ReAct loops can get stuck in infinite failure cycles (e.g., trying the same broken search query repeatedly). Always implement a hard stop (e.g., max_iterations=5).
  3. Format Enforcement — Smaller models (like Llama-3-8B) may forget to output the PAUSE or Action: keywords. Use JSON mode, Structured Outputs, or strict system prompts to enforce the format.
  4. Graceful Tool Failure — If a tool throws a 500 error, do not crash the application. Return the error string as the Observation (e.g., Observation: API returned 500 error). The LLM will read this and formulate a Thought to try a different tool.
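Point 4 can be implemented with a thin wrapper that converts exceptions into Observation strings instead of crashing the loop. The function names below are illustrative, and `flaky_api` is a mock that always fails:

```python
def flaky_api(query):
    # Illustrative tool that always fails, standing in for a real API call.
    raise RuntimeError("API returned 500 error")

def safe_execute(tool, tool_input):
    """Run a tool and turn any exception into an observation string."""
    try:
        return tool(tool_input)
    except Exception as exc:
        # The LLM reads this observation and can reason about an alternative approach.
        return f"Error: {exc}"

observation = safe_execute(flaky_api, "weather in London")
print(f"Observation: {observation}")  # Observation: Error: API returned 500 error
```

Because the error surfaces as a normal Observation, the agent's next Thought can react to it ("the API is down, let me try the search tool instead") rather than the whole run aborting.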

⚠️ Common Mistakes:

  • Giving the agent too many tools. Fix: Context windows have limits. If you have 100 tools, the LLM will get confused. Use a hierarchical approach or tool-retrieval system.
  • Forgetting the "PAUSE" token. Fix: Ensure your prompt explicitly tells the LLM to stop generating text after calling an Action, otherwise it will hallucinate the Observation itself!

FAQ

Q1: Is ReAct outdated now that OpenAI has function calling?

No. OpenAI's Function Calling (or Tool Calling) is simply a more reliable, JSON-structured way to execute the Action phase. The underlying cognitive architecture—requiring the model to generate a Thought before deciding to call a function—is still pure ReAct and is considered best practice.
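To make the relationship concrete, here is roughly how the weather tool from the earlier example would be declared for OpenAI's tool-calling API. Only the request payload is constructed here (no network call is made), and the commented-out line shows where it would be sent; the tool name and parameters are the same mock as before:

```python
# Declaring a ReAct-style tool for OpenAI's tool-calling API.
# Only the request payload is built here; no network call is made.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Returns the current weather in a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string"}
                },
                "required": ["location"],
            },
        },
    }
]

request_payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "What's the weather in London?"}],
    "tools": tools,
}

# With the openai client this would be sent as:
# response = client.chat.completions.create(**request_payload)
# The model then returns structured tool_calls instead of free-text "Action:" lines,
# but the Thought -> Action -> Observation loop around it is unchanged.
print(request_payload["tools"][0]["function"]["name"])  # get_weather
```

In other words, function calling replaces the fragile regex parsing of `Action:` lines with structured JSON, while the surrounding ReAct loop stays the same.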

Q2: Why does my ReAct agent hallucinate Observations?

This happens when you don't configure proper stop sequences in your API call. The LLM doesn't know it's supposed to wait for the environment to reply, so it just keeps typing. Ensure you pass stop=["Observation:"] to the LLM.

Q3: Which models are best for ReAct?

Models with strong reasoning and instruction-following capabilities. GPT-4o, Claude 3.5 Sonnet, and DeepSeek-V3 excel at ReAct. Smaller models (<10B parameters) often struggle to maintain the strict Thought -> Action format over multiple turns.

Summary

The ReAct framework transforms Large Language Models from static text generators into dynamic, autonomous problem solvers. By enforcing a strict cycle of reasoning, acting, and observing, ReAct agents can navigate complex environments, utilize external tools, and recover from errors.

👉 Start using JSON Formatter now — Perfect your Agent's tool schemas.