An AI agent is a language model that can take actions — call tools, read files, browse the web, write and execute code — and loop until a goal is achieved. The defining characteristic is the loop: an agent does not respond once and stop. It reasons, acts, observes the result of that action, reasons again, and continues until it either completes the goal or hits a stopping condition. A chatbot generates a response; an agent pursues an objective.
This distinction matters practically because it changes what you can build. Chatbots are reactive. Agents are goal-directed. If you want to automate a multi-step workflow that involves reading data, making decisions, and writing results, you need an agent.
What an AI Agent Is Not
Before the components, it helps to clarify what gets called an "agent" but is not one in the meaningful sense.
A chatbot with tools is not necessarily an agent. If you give a model a search tool but it only calls it once per response and does not loop, it is a tool-using model, not an agent.
A prompt chain is not an agent. Prompt chaining (passing the output of one call to the next) is sequential processing, not autonomous goal-directed behavior. There is no loop and no decision about whether the goal has been achieved.
An agent has at minimum: the ability to take actions, awareness of whether the goal is met, and the ability to decide what to do next based on what it has observed.
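These three minimum capabilities can be sketched as a small loop. This is only an illustration under stated assumptions: `decide`, `increment`, and `run_agent` are hypothetical names, and a hard-coded policy stands in for the language model.

```python
from typing import List, Optional, Tuple


def decide(goal: int, observations: List[int]) -> Tuple[str, int]:
    """Hypothetical stand-in for the model: returns ("done", answer)
    or ("act", tool_input) based on what has been observed so far."""
    current = observations[-1] if observations else 0
    if current >= goal:            # awareness of whether the goal is met
        return ("done", current)
    return ("act", current)        # choose the next step from observations


def increment(value: int) -> int:
    """Hypothetical tool: the only action this toy agent can take."""
    return value + 1


def run_agent(goal: int, max_steps: int = 20) -> Tuple[Optional[int], int]:
    """Decide, act, observe; repeat until done or the step budget runs out."""
    observations: List[int] = []
    for step in range(max_steps):
        kind, payload = decide(goal, observations)
        if kind == "done":
            return payload, step
        observations.append(increment(payload))  # act, then record the result
    return None, max_steps         # stopping condition: budget exhausted


print(run_agent(goal=3))           # prints: (3, 3)
```

A tool-using model or a prompt chain would call `increment` a fixed number of times. The loop differs in that the number of actions depends on what comes back.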
The Four Components of Any Agent
Every AI agent, regardless of framework or use case, has four components:
1. LLM Brain
The model that reasons and decides. It reads the current state of the task (the goal, the actions taken so far, and the observations that have come back) and decides what to do next. The model's quality sets the agent's capability ceiling, because every decision in the loop is only as good as the model making it.
2. Tools
Functions the agent can call to interact with the world. Common tools include web search, code execution, file read/write, database queries, API calls, and email/calendar actions. Each tool has a name, a description, and input/output parameters. The model selects which tool to call based on what it needs to accomplish the next step.
Tool design is one of the highest-leverage parts of building an agent. The most common sources of agent failure are poorly described tools (the model does not know when to use them), tools with too many parameters (the model fills them in incorrectly), and tools with unreliable output (the model gets confused by errors).
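A tool with a name, a description, and typed parameters might be represented as follows. This is a rough sketch, not any framework's actual API: the `Tool` class, its `call` method, and the `word_count` example are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class Tool:
    name: str                       # what the model calls
    description: str                # tells the model when to use it
    parameters: Dict[str, type]     # expected argument names and types
    func: Callable[..., Any]

    def call(self, **kwargs: Any) -> str:
        """Validate arguments, run the tool, and return failures as text
        so the model can read them as an observation."""
        missing = set(self.parameters) - set(kwargs)
        extra = set(kwargs) - set(self.parameters)
        if missing or extra:
            return f"error: missing={sorted(missing)} unexpected={sorted(extra)}"
        for key, expected in self.parameters.items():
            if not isinstance(kwargs[key], expected):
                return f"error: {key} must be {expected.__name__}"
        try:
            return str(self.func(**kwargs))
        except Exception as exc:    # surface tool failures instead of crashing
            return f"error: {exc}"


# Hypothetical example tool with one clearly named parameter.
word_count = Tool(
    name="word_count",
    description="Count the words in a piece of text. Use for length checks.",
    parameters={"text": str},
    func=lambda text: len(text.split()),
)

print(word_count.call(text="agents loop until done"))  # prints: 4
print(word_count.call(txt="wrong argument name"))
```

Keeping the parameter list short and returning readable error strings addresses two of the failure sources above: there is less for the model to fill in incorrectly, and a bad call produces an observation it can correct on the next step.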
3. Memory
How the agent maintains context across steps. There are three types:
- In-context memory: the conversation history within the current context window. Every action and observation is appended as the agent runs. Limited by the context window size.
- External memory: a vector database or key-value store the agent can query. Allows retrieval of information that would not fit in the context window.
- Procedural memory: knowledge baked into the system prompt or model weights about how to handle specific situations.
Most simple agents use only in-context memory. Long-running or multi-session agents need external memory.
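The three memory types can be sketched side by side. The class names are hypothetical, a word count stands in for a real token count, and a plain dictionary stands in for a vector database or key-value store.

```python
from collections import deque
from typing import Deque, Dict, Optional

# Procedural memory: instructions fixed in the system prompt (assumed text).
SYSTEM_PROMPT = "You are a careful research agent. Cite what you find."


class InContextMemory:
    """Keeps the most recent entries that fit a fixed budget."""

    def __init__(self, max_words: int) -> None:
        self.max_words = max_words
        self.entries: Deque[str] = deque()

    def append(self, entry: str) -> None:
        self.entries.append(entry)
        # Drop the oldest entries once the budget is exceeded.
        while self._size() > self.max_words and len(self.entries) > 1:
            self.entries.popleft()

    def _size(self) -> int:
        return sum(len(e.split()) for e in self.entries)


class ExternalMemory:
    """Store queried on demand; survives after context entries are dropped."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}

    def save(self, key: str, value: str) -> None:
        self._store[key] = value

    def lookup(self, key: str) -> Optional[str]:
        return self._store.get(key)


context = InContextMemory(max_words=8)
external = ExternalMemory()
for note in ["searched for Stripe", "found founding year 2010", "found CEO name"]:
    context.append(note)
    external.save(note.split()[-1], note)   # key on the last word, for the demo

print(list(context.entries))       # the oldest note no longer fits the budget
print(external.lookup("Stripe"))   # but it can still be retrieved externally
```

In this run the first note falls out of the in-context window after the third append, yet remains available through `external.lookup`. That is the trade-off the list describes.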
4. Planning
How the agent breaks a goal into steps. Some agents plan explicitly: they produce a plan as the first step, then execute it. Others plan implicitly: each step is decided ad hoc based on the current state. Explicit planning works better for tasks with clear structure. Implicit planning is more flexible for tasks where the path is not known in advance.
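The difference between the two planning styles can be illustrated with a toy pipeline. Everything here is hypothetical (the `load`, `clean`, and `report` steps, the state dictionary, and the threshold), and hand-written functions take the place of a model.

```python
from typing import Callable, Dict, List, Optional

State = Dict[str, int]
Step = Callable[[State], None]


def load(state: State) -> None:
    state["rows"] = 10


def clean(state: State) -> None:
    state["rows"] -= 2             # pretend two bad rows were dropped


def report(state: State) -> None:
    state["reported"] = state["rows"]


def run_explicit(state: State) -> List[str]:
    """Explicit planning: fix the whole plan up front, then execute it."""
    plan: List[Step] = [load, clean, report]
    for step in plan:
        step(state)
    return [step.__name__ for step in plan]


def choose_next(state: State) -> Optional[Step]:
    """Implicit planning: pick one step at a time from the current state."""
    if "rows" not in state:
        return load
    if "reported" not in state:
        # Clean only when the data looks like it needs it.
        return clean if state["rows"] > 8 else report
    return None                    # goal reached


def run_implicit(state: State, max_steps: int = 10) -> List[str]:
    taken: List[str] = []
    for _ in range(max_steps):
        step = choose_next(state)
        if step is None:
            break
        step(state)
        taken.append(step.__name__)
    return taken


print(run_explicit({}))            # prints: ['load', 'clean', 'report']
print(run_implicit({"rows": 5}))   # prints: ['report']
```

The explicit version always runs the same three steps. The implicit version adapts when the starting state already contains data, at the cost of being harder to predict in advance.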
The ReAct Loop
Yao et al. introduced the ReAct framework in "ReAct: Synergizing Reasoning and Acting in Language Models" (ICLR 2023). The pattern alternates between Thought (reasoning about what to do) and Action (calling a tool), with Observation (the tool result) feeding back into the next Thought.
A simple ReAct loop for a research task:
Goal: Find the founding year and current CEO of Stripe.
Thought: I need to find information about Stripe. I will start with a web search.
Action: search("Stripe company founding year CEO 2026")
Observation: Stripe was founded in 2010 by Patrick and John Collison. As of 2026, Patrick Collison is the CEO.
Thought: I have both pieces of information. The goal is complete.
Answer: Stripe was founded in 2010. Patrick Collison is the CEO.
A more complex loop for a coding task:
Goal: Fix the failing test in the auth module.
Thought: I need to see what test is failing and why.
Action: run_tests("src/auth/")
Observation: FAIL auth.test.ts: "should accept unexpired tokens" (AssertionError: expected 200 but received 401)
Thought: Valid tokens are being rejected, so the expiry check is probably wrong. I need to read the auth logic.
Action: read_file("src/auth/verify-token.ts")
Observation: [file contents]
Thought: I see the issue. The expiry check compares Date.now() against exp, but exp is in seconds and Date.now() is in milliseconds.
Action: edit_file("src/auth/verify-token.ts", old="exp < Date.now()", new="exp < Date.now() / 1000")
Observation: File updated successfully.
Thought: Let me run the tests again to verify the fix.
Action: run_tests("src/auth/")
Observation: PASS auth.test.ts — all 12 tests passed
Answer: Fixed the token expiry comparison in verify-token.ts. The exp field is in Unix seconds; Date.now() returns milliseconds. Divided Date.now() by 1000 to match units.
The ReAct loop underlies most production agents in some form. When you see a coding assistant "browse your codebase" or a research tool "search and synthesize," it is typically running a ReAct-style loop.
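The Thought / Action / Observation cycle can be sketched in a few dozen lines. This is a simplified illustration, not the implementation from the ReAct paper: canned strings in `SCRIPT` stand in for real model calls, `search` is a hypothetical tool that returns a fixed result, and the `Action: name("argument")` text format is an assumption chosen to match the traces above.

```python
import re
from typing import Callable, Dict, Optional, Tuple

# Canned model outputs replace real LLM calls (assumption for the demo).
SCRIPT = [
    'Thought: I need information about Stripe.\n'
    'Action: search("Stripe founding year CEO")',
    "Thought: I have both pieces of information.\n"
    "Answer: Stripe was founded in 2010. Patrick Collison is the CEO.",
]


def fake_model(transcript: str, turn: int) -> str:
    return SCRIPT[turn]


def search(query: str) -> str:
    # Hypothetical tool: returns a fixed string instead of querying the web.
    return ("Stripe was founded in 2010 by Patrick and John Collison. "
            "Patrick Collison is the CEO.")


TOOLS: Dict[str, Callable[[str], str]] = {"search": search}
ACTION = re.compile(r'^Action: (\w+)\("(.*)"\)$', re.MULTILINE)
ANSWER = re.compile(r"^Answer: (.*)$", re.MULTILINE)


def react(goal: str, max_turns: int = 5) -> Tuple[Optional[str], str]:
    transcript = f"Goal: {goal}"
    for turn in range(max_turns):
        output = fake_model(transcript, turn)
        transcript += "\n" + output
        answer = ANSWER.search(output)
        if answer:                                   # goal reached: stop
            return answer.group(1), transcript
        action = ACTION.search(output)
        if action is None:
            observation = "error: no Action or Answer found"
        else:
            name, arg = action.groups()
            tool = TOOLS.get(name)
            observation = tool(arg) if tool else f"error: unknown tool {name}"
        # The observation feeds back into the next Thought.
        transcript += "\nObservation: " + observation
    return None, transcript                          # turn budget exhausted


answer, transcript = react("Find the founding year and current CEO of Stripe.")
print(transcript)
```

The growing `transcript` string doubles as the agent's in-context memory, and `max_turns` is the stopping condition that keeps a confused model from looping forever.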
What Production Agents Actually Do in 2026
Real deployed agents handle these categories of tasks reliably:
Software development assistance. Claude Code, GitHub Copilot Workspace, and similar tools read codebases, make targeted edits, run tests, and iterate based on results. These work because the task space is well-defined (code and tests) and results are verifiable (tests pass or fail).
Customer support. Agents that read knowledge bases, look up account information, and resolve tier-1 support issues without human involvement. These work because the action space is bounded (lookup, update account, draft email) and errors are recoverable.
Research and synthesis. Agents that search the web, read documents, and produce structured summaries. These work because the output is text that a human can verify before acting on.
Data pipeline automation. Agents that read from a data source, apply transformations, and write to a destination. These work because the inputs and outputs are structured and verifiable.
Where Agents Still Fail
Long-horizon tasks with many steps. Errors compound. If each step succeeds independently 95% of the time, a 10-step task finishes without error only about 60% of the time (0.95^10 ≈ 0.60). Many current agents degrade noticeably beyond 10 to 15 steps.
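The compounding effect is easy to check numerically. The sketch below assumes every step succeeds or fails independently with the same probability, which is a simplification; the function names are illustrative.

```python
import math


def success_probability(per_step: float, steps: int) -> float:
    """Chance that every step succeeds, assuming independent steps."""
    return per_step ** steps


def max_steps_for(per_step: float, target: float) -> int:
    """Largest step count whose overall success stays at or above target."""
    return math.floor(math.log(target) / math.log(per_step))


for n in (1, 5, 10, 15, 20):
    print(n, round(success_probability(0.95, n), 3))

# With 95% per-step reliability, how many steps keep overall success >= 50%?
print(max_steps_for(0.95, 0.5))    # prints: 13
```

Under these assumptions, a 95%-reliable agent drops below even odds after 13 steps, which is consistent with the 10 to 15 step range mentioned above.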
Tasks requiring genuine judgment. Agents can appear to exercise judgment by pattern-matching from training data, but they do not have situational awareness the way a human does. Tasks that require understanding context that is not written down tend to go wrong.
Tasks where the cost of an error is high and irreversible. "Delete all test data from the staging database" is a fine agent task. "Delete all data from the production database" is not, because an agent that is 99% reliable will still make this mistake 1% of the time.
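One common mitigation is to put a person in the loop for exactly these actions. The following is a minimal sketch of such a gate. The `Action` fields, the approval policy, and the `execute` function are hypothetical, and a real system would decide reversibility and cost from its own metadata.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Action:
    name: str
    reversible: bool
    high_cost: bool      # would a mistake here be expensive?


def needs_approval(action: Action) -> bool:
    """Hypothetical policy: gate actions that are both costly and irreversible."""
    return action.high_cost and not action.reversible


def execute(action: Action, approve: Callable[[Action], bool]) -> str:
    """Run the action unless policy requires approval and it is not given."""
    if needs_approval(action) and not approve(action):
        return f"blocked: {action.name}"
    return f"executed: {action.name}"


wipe_staging = Action("delete staging test data", reversible=False, high_cost=False)
wipe_production = Action("delete production data", reversible=False, high_cost=True)


def deny_all(action: Action) -> bool:
    return False         # stands in for a human who has not approved


print(execute(wipe_staging, deny_all))      # prints: executed: delete staging test data
print(execute(wipe_production, deny_all))   # prints: blocked: delete production data
```

The gate does not make the agent more reliable. It only ensures that the 1% failure case reaches a human before it reaches the database.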
Ambiguous goals. Agents with poorly specified goals either do too little (interpret narrowly) or too much (interpret broadly). The quality of the goal specification determines whether the agent stays on track.