Building Agents¶

Building agents shares a degree of overlap with the building of applications, but we write about it here because of its unique importance. The AI agents stack has multiple frameworks and components to enable sophisticated agent architectures.

We describe the stack in The AI Agent Stack, how to evaluate and compare agents, and how to optimize agents.

Below you will find a description of the evolution of agents and the different strategies for building them.

The Evolution of AI Agents¶

The AI agent landscape has evolved significantly since the initial release of frameworks like LangChain (Oct 2022) and LlamaIndex (Nov 2022). While these started as simple LLM frameworks, the field has grown to encompass more sophisticated architectures addressing key challenges:

State Management: Agents require sophisticated handling of:
- Message and event history
- Long-term memories
- Execution state in agentic loops
Tool Execution: Agents need secure and reliable ways to:
- Execute LLM-generated actions
- Handle tool dependencies
- Manage execution environments
- Process tool results

Strategies for Building Agents¶

Development Approaches¶

Low/No Code vs Code-Centric Approaches¶

The development of AI agents can follow two main paths, each with its own advantages and use cases:

Aspect	Low/No Code 🔌	Hybrid 🔄	Code-Centric 💻
Key Benefits	Easy to use, rapid deployment	Best of both worlds	Full customization, scalability
Best For	Simple agents, chatbots	Evolving projects	Complex systems, enterprise
Delivery Speed	Days - Weeks ⚡	Variable 📅	Weeks - Months 🗓️
Maintenance Difficulty	Low (platform handles) 🟢	Medium (split scope) 🟡	High (full stack) 🔴
Control	Limited 🔒	Balanced ⚖️	Complete 🛠️
Examples	GPTs, Zapier, Bubble 🤖	GPTs + Custom Backend	LangChain, AutoGen 🚀

Low/No Code vs Code-Centric Approaches (expanded)

Aspect	Low/No Code	Code-Centric	Hybrid
Development Speed	Fast (days/weeks)	Slower (weeks/months)	Medium (varies by component)
Technical Expertise	Minimal	High	Mixed
Customization	Limited	Extensive	Moderate to High
Scalability	Platform-dependent	Highly scalable	Scalable with proper architecture
Integration Depth	Pre-built connectors	Custom integrations	Mix of both
Maintenance	Platform-managed	Team-managed	Split responsibility
Cost Structure	Platform subscriptions	Development resources	Combined costs
Use Cases	• Simple automations • Chatbots • Basic workflows	• Complex agents • Custom solutions • Enterprise systems	• MVP to production • Scaled applications • Enterprise solutions
Examples	• OpenAI GPTs • Zapier • Bubble.io	• LangChain • AutoGen • Custom solutions	• GPTs + custom backend • Visual frontend + coded agents
Team Size	Individual to small team	Development team	Cross-functional team
Iteration Speed	Very fast	Depends on complexity	Fast for no-code components
Security Control	Platform-dependent	Full control	Balanced control

Workflow Automation vs AI Agents¶

A key distinction in modern AI systems is between AI Agents and Workflow Automation approaches:

Aspect	AI Agents and Teams	Workflow Automation
How?	LLMs direct its own action based on feedback	LLM is embedded in, or controls flow in predefined paths
Core Functionality	Language understanding, contextual assistance	Trigger-based actions, workflow automation
Ease of Use	Requires setup and training	User-friendly, often no coding needed
Integration and Customization	Code-based integration with custom and commercial apps	Manual-integration with multiple apps/services
Pricing	LLM API and observability costs	LLM API costs and scale based subscriptions
Testing and Optimization	Enabled programmatically	Generally manual
Tasks	Complex, open-ended goals and tasks	Simpler and predefined tasks and procedures
Scalability	Scalability determined by code efficiency and hosting providers	Scalable through tiered service models
Options	LangGraph, AutoGen, Microsoft Copilot	Make, n8n, Zapier, Stack, Voiceflow

Detailed Comparison of AI Agents vs Workflow Automation

Aspect	AI Agents and Teams	Workflow Automation
Decision Making	• Autonomous reasoning • Self-directed actions • Learning from feedback	• Predefined decision paths • Rule-based triggers • Fixed action sequences
Use Cases	• Complex research tasks • Creative problem solving • Adaptive interactions	• Document processing • Data workflows • Scheduled automations
Development	• Custom code development • API integrations • Advanced configurations	• Visual flow builders • Pre-built templates • No-code interfaces
Maintenance	• Code updates • Model fine-tuning • Performance monitoring	• Visual flow updates • Template modifications • Platform-managed updates
Integration	• Programmatic API access • Custom connectors • Deep system integration	• Pre-built connectors • Visual integrations • Platform limitations
Security	• Custom security policies • Fine-grained access control • Custom audit trails	• Platform security • Predefined permissions • Standard logging
Cost Factors	• API consumption • Infrastructure costs • Development resources	• Platform subscriptions • Usage-based pricing • Integration costs
Performance	• Highly customizable • Infrastructure dependent • Optimization flexibility	• Platform constrained • Tier-based limits • Standard optimization

Development Details¶

Low/No Code Development¶

Low/no code platforms provide visual interfaces and pre-built components for building agents without extensive programming:

Benefits
- Rapid prototyping and deployment
- Accessible to non-technical users
- Visual workflow design
- Pre-built integrations
- Faster iteration cycles
Popular Platforms
- OpenAI GPTs
- Bubble.io with AI integrations
- Zapier with AI actions
- Microsoft Power Platform
- Voiceflow for conversational agents
Best For
- Business users and citizen developers
- Quick proof-of-concept development
- Simple automation workflows
- Standard use cases with common integrations

Code-Centric Development¶

Traditional programming approaches offer maximum flexibility and control:

Benefits
- Complete customization
- Advanced functionality
- Fine-grained control over agent behavior
- Custom integrations
- Better performance optimization
Development Frameworks
- LangChain
- AutoGen
- LlamaIndex
- Custom frameworks using LLM APIs
- Mastra - TypeScript agent framework with workflows, RAG, and observability
Best For
- Complex agent architectures
- Enterprise-grade applications
- Novel use cases
- High-performance requirements
- Deep system integrations

Hybrid Approaches¶

Many organizations adopt a hybrid strategy:

Prototyping with Low Code
- Validate concepts quickly
- Test user interactions
- Define basic workflows
Production with Code
- Refine and optimize
- Add custom features
- Scale for production
Integration Patterns
- Low-code for frontend/UI
- Code-based backend services
- API-driven architecture
- Microservices composition

Architectural Considerations¶

When building agents, several architectural decisions are crucial:

State Persistence
- File-based serialization vs. Database-backed state
- Query capabilities for historical data
- Scaling with conversation length
- Multi-agent state management
Tool Security
- Sandbox environments for arbitrary code execution
- Dependency management
- Access control and authorization
- Input validation and sanitization
Production Deployment
- REST API design for agent interactions
- Data normalization for agent state
- Environment recreation for tool execution
- Scaling to millions of agents

Future Trends¶

The agent ecosystem is still in its early stages, with several emerging trends:

Standardization
- Movement toward common tool schemas (like OpenAI's function calling format)
- Emerging patterns for agent APIs and deployment
- Cross-framework compatibility for tools and agents
Production Focus
- Shift from notebook-based development to production services
- Growing importance of observability and monitoring
- Need for enterprise-grade security and compliance
Tool Ecosystem Growth
- Specialized tool providers for common tasks
- Authentication and access control frameworks
- Industry-specific tool collections

Interesting and notable research and libraries¶

Open GPTs Enables the creation of agents and assistants, using Langchain components

📋

The Open Source AI Assistant Framework & API

Docs

Agenta-AI provides end-to-end LLM developer platform. It provides the tools for prompt engineering and management, ⚖️ evaluation, human annotation, and 🚀 deployment. All without imposing any restrictions on your choice of framework, library, or model.

📋

Jarvis provides essential components to enable LLM-agents to have tools. They provide ToolBench, HuggingGPT, and EasyTool at present.

GenAI_Agents provides comprehensive tutorials and implementations for various Generative AI Agent techniques, from basic to advanced. The repository includes Jupyter notebooks covering LangChain, LangGraph, self-improving agents, multi-agent systems, and more.

Mastra is a TypeScript AI agent framework for building and deploying AI applications with modern JavaScript stacks.

Mastra provides a comprehensive set of tools for building AI agents in TypeScript:

Unified Model API: Integrates with Vercel AI SDK to support multiple LLM providers (OpenAI, Anthropic, Google Gemini)
Agent Development: Create agents with memory, tool-calling capabilities, and workflow integration
Workflow Orchestration: Build graph-based workflows with control flow, branching, and observability
RAG Integration: Process documents, create embeddings, and query vector databases with a unified API
Local Development: Chat with agents and debug their state in a local development environment
Deployment Options: Deploy as standalone endpoints or integrate with React, Next.js, or Node.js applications
Evaluation Tools: Assess agent performance with model-graded, rule-based, and statistical metrics

📋

Easy Tool: Enhancing LLM-based Agents with Concise Tool Instruction provides a framework transforming diverse and lengthy tool documentation into a unified and concise tool instruction for easier tool usage

Development Easy Tool follows a simple pattern of: 1. Task Planning, 2. Tool Retrieval, 3. Tool Selection and 4. Tool Execution, coupled with thoughtful prompting to enable SOT tool usage over multiple models.

Problem Using new tools, software, especially can be challenging for LLMs (and people too!), especially with a poor or redundant documentation and a variety of usage manners.

Solution Easy tool provides "a simple method to condense tool documentation into more concise and effective tool instructions."

I: Tool Description Generation /* I: Task prompt */ Your task is to create a concise and effective tool usage description based on the tool documentation. You should ensure the description only contains the purposes of the tool without irrelevant information. Here is an example: /* Examples */ {Tool Documentation} Tool usage description: {Tool_name} is a tool that can {General_Purposes}. This tool has {Number} multiple built-in functions: class="w"> 1. {Function_1} is to {Functionality_of_Function_1} 2. {Function_2} is to ... /* Auto generation of tool description */ {ToolDocumentationof'AviationWeatherCenter'} Tool usage description: 'Aviation Weather Center' is a tool which can provide official aviation weather data... II: Tool Function Guidelines Construction /* Task prompt */ Your task is to create the scenario that will use the tool. class="w"> 1. You are given a tool with its purpose and its parameters list. The scenario should adopt the parameters in the list. class="w"> 2. If the parameters and parameters are both null, you should set: {"Scenario": XX, "Parameters":{}}. Here is an example: /* Examples */ {Tool_name} is a tool that can {General_Purposes}. {Function_i} is to {Functionality_of_Function_i} {Parameter List of Function_i} One scenario for {Function_i} of {Tool_name} is: {"Scenario": XX, "Parameters":{XX:XX}} /* Auto-construction for Tool Function Guidelines */ 'Ebay' can get products from Ebay in a specific country. 'Product Details' in 'Ebay' can get the product details for a given product id and a specific country. {Parameter List of 'Product Details'} One scenario for 'Product Details' of 'Ebay' is: {"Scenario": "if you want to know the details of the product with product ID 1954 in Germany from Ebay", "Parameters":{"product_id": 1954, "country": "Germany"}}. data-type="image" data-width="100%" data-height="auto" data-desc-position="bottom">

Results The performance is SOT over multiple models. ChatGPT, ToolLLaMA-7B, Vicuna-7B, Mistral-Instruct-&B and GPT-4

📋

Hugging GPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face

Development

Hugging GPT enables LLM models to call other models via the Hugging Face Repo

Problem

LLMs are not the best task for all tasks. Enabling LLMS to use task-specific models can improve the quality of the results.

Solution Hugging GPT provides an intervace for LLMs by breaking it down into 1. Task Planning, 2. Model Selection, 3. Task Execution, and 4. Response Generation

Results The results provide substiantial evidence that HuggingGPT can enable successful single, sequential, and graph-based tasks.

Automatic building Agents¶

📋

Automated Design of Agentic Systems

Developments In their paper the authors revealmeta-agents that observe adn critique prompting nd efforts to enable better agents. In their own words:

"The core concept of Meta Agent Search is to instruct a meta agent to iteratively create interestingly new agents, evaluate them, add them to an archive that stores discovered agents, and use this archive to help the meta agent in subsequent iterations create yet more interestingly new agents."
You are an expert machine learning researcher testing different agentic systems.
[Brief Description of the Domain]
[Framework Code]
[Output Instructions and Examples]
[Discovered Agent Archive] (initialized with baselines, updated at every iteration)
# Your task
You are deeply familiar with prompting techniques and the agent works from the literature. Your goal is
to maximize the performance by proposing interestingly new agents ......
Use the knowledge from the archive and inspiration from academic literature to propose the next
interesting agentic system design.

Resources¶

Awesome Agents