ReAct Prompting: How to Make LLMs Reason and Act in Alternating Steps
ReAct prompting alternates Thought, Action, and Observation steps so the model commits reasoning before choosing an action. Here is how to use it effectively in 2026.
ReAct (Reasoning + Acting) prompting is a pattern where the model alternates between thinking through a problem and taking an action, then observing the result of that action before thinking again. It often outperforms standard chain-of-thought on multi-step tasks that require external information or tool use.
What the Pattern Looks Like
The ReAct loop has three repeating steps:
Thought: The model reasons about what it knows and what it needs to do next.
Action: The model calls a tool, searches for information, or performs a specific operation.
Observation: The model reads the result of the action.
The loop repeats until the model has enough information to produce a final answer.
Here is a concrete example. Suppose you are building an agent that answers questions about company data and has access to a database query tool.
Prompt:
You have access to the following tool:
- query_database(sql: string): runs a SQL query and returns results
Use the following format for every step:
Thought: [your reasoning about what to do next]
Action: [tool name and input]
Observation: [result of the action]
... repeat as needed ...
Final Answer: [your answer to the original question]
Question: How many active users signed up in March 2026?
Model output:
Thought: I need to count users who signed up in March 2026 and have an active status. I should query the users table filtering by signup_date and status.
Action: query_database("SELECT COUNT(*) FROM users WHERE status = 'active' AND signup_date >= '2026-03-01' AND signup_date < '2026-04-01'")
Observation: [{"count": 1842}]
Thought: The query returned 1842 active users who signed up in March 2026. I have enough information to answer.
Final Answer: 1842 active users signed up in March 2026.
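The application side of one cycle can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes the model writes actions exactly as tool_name("input"), and query_database here is a hypothetical stub that returns a canned result instead of running SQL.

```python
import re

# Hypothetical stand-in for the article's query_database tool.
# A real version would run the SQL against a database.
def query_database(sql: str) -> str:
    return '[{"count": 1842}]'

TOOLS = {"query_database": query_database}

# Matches lines such as: Action: query_database("SELECT ...")
ACTION_RE = re.compile(r'^Action:\s*(\w+)\("(.*)"\)\s*$', re.MULTILINE)

def run_action(model_output: str) -> str:
    """Find the first Action line, run the named tool, return an Observation line."""
    match = ACTION_RE.search(model_output)
    if match is None:
        return "Observation: error: no Action line found"
    name, argument = match.group(1), match.group(2)
    tool = TOOLS.get(name)
    if tool is None:
        return f"Observation: error: unknown tool {name!r}"
    return f"Observation: {tool(argument)}"

output = (
    "Thought: I need to count active users who signed up in March 2026.\n"
    'Action: query_database("SELECT COUNT(*) FROM users WHERE status = \'active\'")'
)
print(run_action(output))
```

The returned Observation line is what gets appended to the context before the model is called again. Free-text parsing like this is fragile, which is one reason to prefer structured tool calls in production.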
Why ReAct Works Better Than Standard Chain-of-Thought
Standard chain-of-thought (CoT) asks the model to reason through a problem step by step, but all of that reasoning happens in a single pass with no ability to look up information or correct course based on real results. If the model's assumption is wrong at step 2, steps 3 through 10 are built on a faulty foundation.
ReAct solves this by grounding each reasoning step in an actual observation. The model cannot hallucinate a query result because the observation comes from a real tool call. When the observation contradicts what the model expected, it can adjust its reasoning before proceeding.
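The paper that introduced ReAct (Yao et al., 2022, "ReAct: Synergizing Reasoning and Acting in Language Models") evaluated it on knowledge-intensive tasks such as HotpotQA and FEVER and found that it hallucinated less than CoT alone, because its reasoning was grounded in retrieved text. Accuracy gains were less uniform: ReAct beat CoT on FEVER but trailed it slightly on HotpotQA, and the strongest results came from combining ReAct with CoT. The grounding benefit matters most for multi-hop questions, where the model must retrieve several pieces of information in sequence.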
Written by Mahmudul Haque Qudrati, CEO and ML Engineer at Pristren, who builds AI-powered software for teams and writes about machine learning, LLMs, developer tools, and practical AI applications.
When to Use ReAct
Agent workflows with tool use. If your application gives the model access to search engines, databases, calculators, code interpreters, or APIs, the ReAct pattern is the standard way to structure the interaction. Frameworks such as LangChain and LlamaIndex, and hosted services such as OpenAI's Assistants API, use variations of this pattern.
Multi-step problems requiring external information. Questions where the answer cannot be computed from the model's training data alone benefit from the grounded observation step. Examples: current stock prices, live database queries, real-time weather data, API responses.
Debugging agent behavior. Because ReAct makes reasoning explicit at each step, you can read the Thought entries to understand exactly why an agent made a particular tool call. This is much easier to debug than a black-box decision.
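To make that debugging workflow concrete, here is a small sketch that splits a transcript into steps you can log or inspect. It assumes one labelled line per field, as in the example earlier in this article; the Step class and parse_trace function are illustrative names, not part of any library.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Step:
    thought: str
    action: str
    observation: str

# Labels follow the Thought/Action/Observation format used in this article.
LINE_RE = re.compile(r"^(Thought|Action|Observation|Final Answer):\s*(.*)$")

def parse_trace(transcript: str) -> Tuple[List[Step], Optional[str]]:
    """Split a ReAct transcript into completed steps plus the final answer."""
    steps: List[Step] = []
    current = {}
    final_answer = None
    for line in transcript.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip lines that carry no label
        label, text = match.groups()
        if label == "Thought":
            current = {"thought": text}
        elif label == "Action":
            current["action"] = text
        elif label == "Observation":
            # An observation closes the step.
            steps.append(Step(current.get("thought", ""), current.get("action", ""), text))
            current = {}
        else:
            # Final Answer: the closing thought before it is not stored as a step.
            final_answer = text
    return steps, final_answer

transcript = """Thought: I need to count users who signed up in March 2026.
Action: query_database("SELECT COUNT(*) FROM users")
Observation: [{"count": 1842}]
Thought: I have enough information to answer.
Final Answer: 1842 active users signed up in March 2026."""

steps, answer = parse_trace(transcript)
for number, step in enumerate(steps, start=1):
    print(f"step {number}: {step.thought} -> {step.action}")
```

Storing steps in this shape makes it straightforward to ask, for any tool call, which Thought preceded it.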
When ReAct Is Overkill
ReAct adds overhead. Each Thought/Action/Observation cycle costs tokens and latency. Avoid it for:
Simple, single-step tasks. If a user asks "What is 2 + 2?" or "Summarize this paragraph," there is no benefit to a multi-step loop. Standard prompting or a single chain-of-thought step is sufficient.
Pure reasoning without actions. Math problems, logical puzzles, and writing tasks do not require tool calls. Chain-of-thought alone is faster and cheaper for these.
Latency-sensitive applications. If you need a response in under two seconds, ReAct's multi-round structure may be too slow depending on the number of tool calls required.
Structuring a ReAct Prompt in Practice
A production ReAct system prompt typically includes four elements:
1. Tool definitions. List every tool the model has access to, with name, description, and input format. Be specific. "search(query: string): searches the company knowledge base and returns the top 3 relevant passages" is better than "search: look things up."
2. Format instructions. Explicitly state the Thought/Action/Observation structure. Models follow formatting instructions reliably when they are stated clearly. Include an example of the complete loop in the system prompt so the model has a template to follow.
3. Stopping conditions. Tell the model when to stop looping and produce a final answer. "Once you have all the information needed to answer the question, respond with 'Final Answer:' followed by your answer." Without this, some models will continue taking unnecessary actions.
4. Error handling instructions. Tell the model what to do when a tool returns an error or no results. "If a tool call returns an error, try an alternative approach or report that you could not find the information."
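The four elements above can be assembled programmatically so that the tool list and the instructions never drift apart. The following is one possible sketch; the Tool class and build_system_prompt function are hypothetical helpers, and the wording of each section is taken from this article.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Tool:
    name: str
    signature: str
    description: str

FORMAT_INSTRUCTIONS = """Use the following format for every step:
Thought: [your reasoning about what to do next]
Action: [tool name and input]
Observation: [result of the action]
... repeat as needed ...
Final Answer: [your answer to the original question]"""

STOP_RULE = (
    "Once you have all the information needed to answer the question, "
    "respond with 'Final Answer:' followed by your answer."
)
ERROR_RULE = (
    "If a tool call returns an error, try an alternative approach "
    "or report that you could not find the information."
)

def build_system_prompt(tools: List[Tool]) -> str:
    """Join tool definitions, format, stopping rule, and error rule into one prompt."""
    tool_lines = [f"- {t.name}({t.signature}): {t.description}" for t in tools]
    tool_section = "You have access to the following tools:\n" + "\n".join(tool_lines)
    return "\n\n".join([tool_section, FORMAT_INSTRUCTIONS, STOP_RULE, ERROR_RULE])

prompt = build_system_prompt([
    Tool(
        "search",
        "query: string",
        "searches the company knowledge base and returns the top 3 relevant passages",
    ),
])
print(prompt)
```

A worked example of the complete loop, as recommended in element 2, could be appended as a fifth section in the same way.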
A More Complex Example
Here is a ReAct prompt for a customer support agent that can look up order status and check a returns policy knowledge base:
You are a customer support agent for an e-commerce company. You have access to these tools:
- get_order_status(order_id: string): returns the current status and tracking info for an order
- search_policy(query: string): searches the returns and shipping policy knowledge base
Use this format:
Thought: [reasoning]
Action: [tool_name(input)]
Observation: [tool result]
... repeat as needed ...
Final Answer: [your response to the customer]
Customer message: My order #ORD-9821 hasn't arrived and I placed it 12 days ago. What can I do?
The model working through this might check the order status, discover it is still "in transit," then search the policy for what customers can do after 10 business days. The final answer is grounded in both pieces of information, not generated from the model's general knowledge about shipping.
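On the application side, the two tools in this prompt need a dispatcher that always returns text, so that failures reach the model as observations rather than crashing the loop. The sketch below uses hypothetical in-memory data; the order record and the policy text are placeholders, not real policy content.

```python
# Hypothetical in-memory data. A real deployment would call an order system
# and a search index instead.
ORDERS = {"ORD-9821": {"status": "in transit"}}
POLICIES = {"late delivery": "Placeholder policy text returned by the knowledge base."}

def get_order_status(order_id: str) -> str:
    order = ORDERS[order_id]  # raises KeyError for unknown orders
    return f"status={order['status']}"

def search_policy(query: str) -> str:
    hits = [text for topic, text in POLICIES.items() if topic in query.lower()]
    return hits[0] if hits else "no results"

TOOLS = {"get_order_status": get_order_status, "search_policy": search_policy}

def call_tool(name: str, argument: str) -> str:
    """Run a tool and always return text, so failures become observations."""
    if name not in TOOLS:
        return f"error: unknown tool {name}"
    try:
        return TOOLS[name](argument)
    except Exception as exc:
        return f"error: {type(exc).__name__}: {exc}"

# The two lookups described above, in order.
print(call_tool("get_order_status", "ORD-9821"))
print(call_tool("search_policy", "late delivery options"))
```

Returning errors as plain text lets the model follow its error handling instructions, for example by trying a different query or telling the customer the order could not be found.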
ReAct vs. Tool Use APIs
Modern LLM APIs (OpenAI function calling, Claude tool use, Gemini function calling) build part of the ReAct loop into the API. When you define functions, the model returns structured tool calls (the Action), and your application runs the tool and sends the result back (the Observation). You do not need to write out the Thought/Action/Observation format in your prompt, although your code still drives the loop.
However, writing explicit ReAct prompts is still useful in two situations: when your API does not support native function calling, and when you want the model's reasoning steps to be visible in the output for debugging or auditing purposes. Native tool use often leaves the Thought steps implicit; explicit ReAct makes them readable.
Common Mistakes
Giving the model too many tools. When a model has 15+ tools to choose from, it frequently picks the wrong one or gets confused about which is appropriate. Start with 2-4 well-defined tools. Add more only when the simpler set is proven to work.
Vague tool descriptions. The model decides which tool to call based on the description you provide. If two tools have overlapping descriptions, the model will pick inconsistently. Make the distinction explicit: "Use search_knowledge_base for policy questions. Use get_order_status for specific order lookups. Never use search_knowledge_base for individual order questions."
No maximum iteration limit. Without a limit, some models will loop indefinitely if they cannot find an answer. Set a maximum of 5-10 iterations in your application layer and return a fallback response if the limit is hit.
Ignoring observations. Some prompts ask the model to reason and act, but do not actually inject the tool output back into the context. The Observation step must contain real data from your tool, not a placeholder. If your application is not injecting tool results, the model is hallucinating the observations.
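Two of the mistakes above, the missing iteration limit and the missing observation injection, are both fixed in the application loop. Here is a minimal sketch under stated assumptions: model is any function that takes the context so far and returns the next chunk of text, actions are written as tool_name(input), and looping_model and scripted_model are hypothetical stand-ins for a real LLM call.

```python
from typing import Callable, Dict

FALLBACK = "I could not find an answer within the step limit."

def run_agent(
    question: str,
    model: Callable[[str], str],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 5,
) -> str:
    """Run a ReAct loop, capped at max_steps model calls."""
    context = f"Question: {question}\n"
    for _step in range(max_steps):
        reply = model(context)
        context += reply + "\n"
        if "Final Answer:" in reply:
            return reply.split("Final Answer:", 1)[1].strip()
        action = next(
            (line for line in reply.splitlines() if line.startswith("Action:")), None
        )
        if action is None:
            context += "Observation: error: expected an Action or a Final Answer\n"
            continue
        name, _sep, rest = action[len("Action:"):].strip().partition("(")
        argument = rest[:-1] if rest.endswith(")") else rest
        tool = tools.get(name)
        result = tool(argument) if tool else f"error: unknown tool {name}"
        # The observation always comes from the tool, never from the model.
        context += f"Observation: {result}\n"
    return FALLBACK

def looping_model(context: str) -> str:
    # Hypothetical model that never reaches an answer.
    return "Thought: I should search again.\nAction: search(anything)"

def scripted_model(context: str) -> str:
    # Hypothetical model that answers once it has seen an observation.
    if "Observation:" in context:
        return "Thought: The search came back empty.\nFinal Answer: No matching records."
    return "Thought: I need to look this up.\nAction: search(order history)"

tools = {"search": lambda query: "no results"}
print(run_agent("Where is the data?", looping_model, tools, max_steps=3))
print(run_agent("Where is the data?", scripted_model, tools))
```

The first call hits the limit and returns the fallback; the second stops as soon as the model emits a final answer.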
Best Practices for ReAct Prompting in 2026
As of 2026, several best practices have emerged from production deployments:
Use structured outputs for tool calls. Instead of parsing free-text Action lines, define tools with JSON schema and ask the model to output a JSON object. This reduces parsing errors and makes the loop more reliable.
Include a "scratchpad" in the prompt. Maintain a running log of previous Thought/Action/Observation steps in the conversation history. This helps the model stay consistent across multiple turns.
Set a token budget per step. Limit the length of each Thought and Observation to prevent the model from generating overly verbose reasoning that wastes context.
Test with adversarial inputs. Ensure your ReAct agent can handle edge cases like empty tool results, ambiguous queries, or contradictory observations.
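The first and third practices can be illustrated together. The sketch below assumes the model is asked to emit each action as a JSON object of the form {"tool": ..., "args": {...}}; that shape, the TOOL_ARGS table, and the character-based clip helper are illustrative choices rather than any particular API's format.

```python
import json

# Hypothetical schema table: tool name -> required argument names and types.
TOOL_ARGS = {"query_database": {"sql": str}}

def parse_tool_call(raw: str) -> dict:
    """Parse a JSON tool call and check it against TOOL_ARGS.

    Raises ValueError with a short message if the call is invalid.
    """
    try:
        call = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc.msg}") from exc
    tool = call.get("tool") if isinstance(call, dict) else None
    if not isinstance(tool, str) or tool not in TOOL_ARGS:
        raise ValueError("unknown or missing tool")
    expected = TOOL_ARGS[tool]
    args = call.get("args")
    if not isinstance(args, dict) or set(args) != set(expected):
        raise ValueError(f"args must be exactly {sorted(expected)}")
    for key, kind in expected.items():
        if not isinstance(args[key], kind):
            raise ValueError(f"{key} must be {kind.__name__}")
    return call

def clip(text: str, limit: int = 2000) -> str:
    """Cap an observation by character count, a rough proxy for a token budget."""
    return text if len(text) <= limit else text[:limit] + " [truncated]"

call = parse_tool_call('{"tool": "query_database", "args": {"sql": "SELECT 1"}}')
print(call["args"]["sql"])
print(clip("x" * 50, limit=10))
```

A ValueError message from parse_tool_call can be fed back to the model as an Observation, which also covers several of the adversarial cases worth testing.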
Summary
ReAct prompting gives language models the ability to ground their reasoning in real observations from the world. The Thought/Action/Observation loop reduces hallucination on multi-step tasks by making the model commit to a reasoning step, act on it, and then update based on what actually happened. Use it when your application involves tool calls, external data, or multi-step information retrieval. Skip it for simple, single-step tasks where the overhead is not justified.
Frequently Asked Questions
What is ReAct prompting?
ReAct prompting is a technique that combines reasoning and acting in an alternating loop. The model first thinks about what to do (Thought), then performs an action like calling a tool or searching for information (Action), and finally observes the result (Observation). This cycle repeats until the model can produce a final answer. It was introduced in the 2022 paper 'ReAct: Synergizing Reasoning and Acting in Language Models' and has become a standard pattern for building LLM agents.
How does ReAct prompting work?
ReAct works by structuring the model's output into three repeating components: Thought, Action, and Observation. The model outputs a Thought explaining its reasoning, then an Action specifying a tool call or query, and then receives an Observation with the real result. This loop continues until the model has enough information to output a Final Answer. The key is that each reasoning step is grounded in actual data from the environment, reducing hallucinations.
What are the best practices for ReAct prompting?
Best practices include: (1) Clearly define tools with specific names and descriptions; (2) Provide a format example in the system prompt; (3) Set a maximum iteration limit (e.g., 5-10 cycles); (4) Include error handling instructions; (5) Use structured outputs (JSON) for tool calls; (6) Maintain a scratchpad of previous steps; (7) Test with edge cases like empty results or ambiguous queries. Start with 2-4 tools and add more only after the simpler set works reliably.
How much does ReAct prompting cost?
ReAct prompting itself is free to implement because it is only a prompt pattern. The cost comes from LLM API usage. Each Thought/Action/Observation cycle consumes tokens for the prompt and the model's output, and each new model call typically resends the steps that came before it. For a typical 3-step ReAct loop, you might use 2-5x more tokens than a single direct answer. Additionally, if actions involve external APIs (e.g., database queries, search), those may have their own costs. For low-volume use, this is negligible; for high-volume production, optimize by limiting steps and using cheaper models for simple reasoning.
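A rough way to see where the extra tokens come from is to add up the input tokens across model calls, assuming every call resends the full history. The figures below (a 600-token base prompt, 150 tokens added per completed step, 3 model calls) are made up for illustration and are not measurements.

```python
def react_input_tokens(base_prompt: int, per_step: int, calls: int) -> int:
    """Total input tokens when every model call resends the history so far.

    Call k (0-based) sends the base prompt plus k completed steps.
    """
    return sum(base_prompt + k * per_step for k in range(calls))

single_call = react_input_tokens(600, 150, 1)   # 600
three_calls = react_input_tokens(600, 150, 3)   # 600 + 750 + 900 = 2250
print(three_calls / single_call)                # 3.75
```

Because the history grows with every step, input cost rises faster than the number of calls, which is why capping steps and trimming observations has a direct effect on the bill.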
Is ReAct prompting worth it in 2026?
Yes, ReAct prompting is worth it for applications that require multi-step reasoning with external tool use. It reduces hallucination and improves accuracy on tasks like question answering over databases, customer support, and research assistants. However, for simple single-step tasks or pure reasoning without tools, it adds unnecessary overhead. In 2026, many LLM APIs offer native tool use that covers the same ground, but explicit ReAct prompts remain valuable for debugging, auditing, and when using APIs without native function calling.