CrewAI is a Python framework for building multi-agent systems where multiple LLM-powered agents collaborate to complete a task. Each agent has a role, goal, and backstory. Tasks are assigned to agents. A crew is a group of agents that work together, passing outputs from one agent to the next, delegating sub-tasks, and collectively producing a final output. CrewAI is genuinely useful for workflows where parallel research, specialized reasoning, or sequential processing by domain-specific agents produces better results than a single agent. It is not useful for simple tasks, and it is expensive on token count: a five-agent crew processing a task will often use 5-20x the tokens of a single well-prompted agent.
The Core Concepts
Agent: An LLM-powered worker with a specific role, goal, and backstory.
from crewai import Agent
researcher = Agent(
    role="AI Research Analyst",
    goal="Find accurate, current information about AI topics",
    backstory="You are an expert AI researcher who reads and synthesizes technical literature.",
    verbose=True,
    allow_delegation=False,
    llm="gpt-4o-mini"
)
Task: A unit of work assigned to an agent.
from crewai import Task
research_task = Task(
    description="Research the current state of open source LLMs in 2026. Focus on: top models by capability, performance benchmarks, and licensing.",
    agent=researcher,
    expected_output="A structured summary covering at least 5 models with benchmark scores and licensing info"
)
Crew: The group of agents working together.
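from crewai import Crew, Process
# writer, reviewer, writing_task, and review_task are assumed to be
# defined the same way as researcher and research_task above.
crew = Crew(
    agents=[researcher, writer, reviewer],
    tasks=[research_task, writing_task, review_task],
    process=Process.sequential,  # Tasks run in order
    verbose=True
)
result = crew.kickoff()
print(result.raw)  # Final output of the last task as plain text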
Process Types
Sequential: Tasks run in order. Each task's output is available to subsequent tasks. Simplest to reason about.
Hierarchical: A manager agent (powered by an LLM) decides which agents to use, assigns tasks, and synthesizes results. More flexible but less predictable and more expensive.
For most use cases, start with sequential. Hierarchical processes add complexity that is rarely worth it until you have validated that sequential does not meet your needs.
Tools Integration
Agents can use tools (functions they can call during execution):
from crewai_tools import SerperDevTool, FileReadTool
search_tool = SerperDevTool()  # Google search via Serper API (needs SERPER_API_KEY in the environment)
researcher = Agent(
    role="Research Analyst",
    goal="Find current information",
    backstory="You research AI topics using web search.",
    tools=[search_tool],  # Tools this agent may call during execution
    llm="gpt-4o-mini"
)
Built-in CrewAI tools: web search (via Serper), website scraping, file read/write, code execution, GitHub search. You can also define custom tools as Python functions.
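A custom tool can be sketched as an ordinary, documented Python function. The function name word_count is hypothetical, and the import path crewai.tools for the tool decorator is an assumption based on recent CrewAI releases (older releases exposed it from crewai_tools), so the snippet falls back to the plain function if the import fails.

```python
def word_count(text: str) -> int:
    """Count the whitespace-separated words in the given text."""
    return len(text.split())


try:
    # Assumption: recent CrewAI releases expose the decorator here.
    from crewai.tools import tool

    # Wrap the plain function so an agent can receive it via tools=[...].
    word_count_tool = tool("Word Counter")(word_count)
except ImportError:
    # CrewAI is missing or lays out its modules differently;
    # the plain function remains usable on its own.
    word_count_tool = None

print(word_count("multi agent systems are expensive"))
```

If the import succeeds, word_count_tool would be passed to an agent the same way search_tool is passed above. The docstring matters because the agent reads it to decide when to call the tool.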
Real Example: Content Creation Crew
A practical crew for creating a blog post:
from crewai import Agent, Task, Crew, Process
# Agents
researcher = Agent(
    role="Content Researcher",
    goal="Research the topic thoroughly and gather accurate facts",
    backstory="Expert at finding and verifying technical information",
    llm="gpt-4o-mini"
)
writer = Agent(
    role="Technical Writer",
    goal="Write clear, accurate, and engaging technical content",
    backstory="Experienced technical writer who explains complex topics simply",
    llm="gpt-4o-mini"
)
editor = Agent(
    role="Editor",
    goal="Ensure accuracy, clarity, and proper structure",
    backstory="Meticulous editor who catches errors and improves readability",
    llm="gpt-4o-mini"
)
# Tasks
research_task = Task(
    description="Research: What are the top 5 open source LLMs in 2026? Include benchmark scores.",
    agent=researcher,
    expected_output="Structured research notes with model names, benchmarks, and sources"
)
writing_task = Task(
    description="Write a 1000-word blog post using the research provided. Focus on practical developer use cases.",
    agent=writer,
    expected_output="A complete blog post draft",
    context=[research_task]  # Has access to research output
)
editing_task = Task(
    description="Review the blog post for accuracy, clarity, and structure. Provide the final version.",
    agent=editor,
    expected_output="Final edited blog post ready to publish",
    context=[writing_task]
)
# Run
crew = Crew(
    agents=[researcher, writer, editor],
    tasks=[research_task, writing_task, editing_task],
    process=Process.sequential
)
result = crew.kickoff()
Limitations
Non-deterministic outputs. Multi-agent systems amplify the non-determinism of individual LLM calls. The same crew run twice can produce significantly different outputs. For applications requiring consistent output format, add output validation and retry logic.
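One way to add validation and retry logic is a small wrapper around whatever produces the output. This is a minimal sketch, assuming the crew is asked to return a JSON object; run_with_retries and its parameters are hypothetical names, not part of CrewAI.

```python
import json
from typing import Callable


def run_with_retries(run: Callable[[], str],
                     required_keys: set[str],
                     max_attempts: int = 3) -> dict:
    """Call `run` until it returns a JSON object containing every required key."""
    last_error = "no attempts made"
    for attempt in range(1, max_attempts + 1):
        raw = run()
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as exc:
            last_error = f"attempt {attempt}: invalid JSON ({exc.msg})"
            continue
        if not isinstance(data, dict):
            last_error = f"attempt {attempt}: expected a JSON object"
            continue
        missing = required_keys - set(data)
        if missing:
            last_error = f"attempt {attempt}: missing keys {sorted(missing)}"
            continue
        return data
    raise ValueError(f"no valid output after {max_attempts} attempts; {last_error}")


# Simulated outputs standing in for three crew runs of varying quality.
outputs = iter(["not json", '{"title": "x"}', '{"title": "x", "body": "y"}'])
result = run_with_retries(lambda: next(outputs), {"title", "body"})
print(result)
```

With a real crew, the callable could be something like lambda: crew.kickoff().raw. Keep in mind that every retry reruns the whole crew, so a low max_attempts keeps token costs bounded.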
Token cost. Each agent call is a full LLM inference, so a 3-agent sequential crew runs at least three inferences for one job. If your base task costs 1,000 tokens, the crew costs 3,000-10,000 tokens, because each agent also receives the previous agents' outputs as context, and that context grows through the pipeline. Budget for 5-15x token usage compared to a single-agent approach.
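The growth can be made concrete with a rough estimate. This sketch assumes a simplified model in which every agent receives the original prompt plus the full output of every earlier agent, and all agents produce outputs of the same size. The function name and the 600/400 token split are illustrative, and the model ignores role and backstory prompts, tool calls, and retries, all of which push real usage higher.

```python
def estimate_crew_tokens(n_agents: int, prompt_tokens: int, output_tokens: int) -> int:
    """Estimate total tokens for a sequential crew under a simple growth model."""
    total = 0
    for position in range(n_agents):
        # Context carried in from all earlier agents in the pipeline.
        context_tokens = position * output_tokens
        total += prompt_tokens + context_tokens + output_tokens
    return total


single_agent = estimate_crew_tokens(1, prompt_tokens=600, output_tokens=400)
three_agents = estimate_crew_tokens(3, prompt_tokens=600, output_tokens=400)
print(single_agent, three_agents, three_agents / single_agent)
# prints: 1000 4200 4.2
```

Even this optimistic model lands above 4x for three agents, which is why the lower end of the 3,000-10,000 range is rarely what you see in practice.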
Debugging difficulty. When the crew produces a bad output, identifying which agent failed and why requires reading through verbose logs. The abstraction makes debugging harder than a standard Python call stack.
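A per-task summary is often quicker to scan than the verbose logs. The sketch below assumes the object returned by crew.kickoff() exposes a tasks_output list whose items carry agent and raw attributes, as described in the CrewAI documentation at the time of writing. The helper name is hypothetical, and a stand-in object is used so the example runs without CrewAI or an API key.

```python
from types import SimpleNamespace


def summarize_task_outputs(crew_output, preview_chars: int = 80) -> list[str]:
    """Return one line per task: the agent that ran it and a preview of its output."""
    lines = []
    for index, task_output in enumerate(crew_output.tasks_output, start=1):
        preview = task_output.raw[:preview_chars].replace("\n", " ")
        lines.append(f"{index}. [{task_output.agent}] {preview}")
    return lines


# Stand-in for a real crew result (assumed attribute names: tasks_output, agent, raw).
fake_output = SimpleNamespace(tasks_output=[
    SimpleNamespace(agent="Content Researcher", raw="Notes on five models..."),
    SimpleNamespace(agent="Technical Writer", raw="Draft:\nOpen source LLMs..."),
])
for line in summarize_task_outputs(fake_output):
    print(line)
```

Reading the previews in order usually shows where the pipeline went wrong, for example a researcher that returned no sources or a writer that ignored its context.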
Speed. Sequential agent execution means latency stacks: 3 agents at 3 seconds each = 9+ seconds minimum. Parallel execution (for example, independent tasks marked async_execution=True) helps but adds complexity.
When CrewAI Beats Single Agents
Research tasks requiring breadth and depth simultaneously. A research crew with a searcher, a summarizer, and a fact-checker produces better research than a single agent because the specialization focuses each agent on one part of the problem.
Content workflows with distinct creation and review phases. The writer/editor pattern works well because editing is a genuinely different task from writing, and separate agents with different prompts perform better than a single agent trying to do both.
Tasks requiring parallel information gathering. If your task needs information from 5 sources simultaneously, 5 parallel agents can gather and summarize them concurrently. A single agent would process them sequentially.
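The concurrency benefit can be illustrated with plain asyncio. This is not CrewAI's own scheduler: gather_source is a hypothetical stand-in for one agent researching one source, and the short sleep simulates waiting on an LLM or network call. In CrewAI itself, the closest equivalent is marking independent tasks with async_execution=True, so check the current documentation for the exact behavior.

```python
import asyncio


async def gather_source(name: str) -> str:
    """Hypothetical stand-in for one agent researching one source."""
    await asyncio.sleep(0.01)  # simulates waiting on an LLM or network call
    return f"summary of {name}"


async def gather_all(sources: list[str]) -> list[str]:
    # gather() runs the coroutines concurrently and returns results in input order.
    return await asyncio.gather(*(gather_source(s) for s in sources))


sources = ["docs", "papers", "benchmarks", "forums", "changelogs"]
summaries = asyncio.run(gather_all(sources))
print(summaries)
```

Because the five waits overlap, total wall-clock time is close to the slowest single call rather than the sum of all five. Token cost is unchanged, since every source is still processed once.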