Building Agents¶
Building agents overlaps considerably with building applications, but we cover it here because of its unique importance. The AI agent stack has evolved significantly since 2022-2023, moving beyond simple LLM frameworks to more sophisticated agent architectures.
The Evolution of AI Agents¶
The AI agent landscape has evolved significantly since the initial release of frameworks like LangChain (Oct 2022) and LlamaIndex (Nov 2022). While these started as simple LLM frameworks, the field has grown to encompass more sophisticated architectures addressing key challenges:
- State Management: Agents require sophisticated handling of:
    - Message and event history
    - Long-term memories
    - Execution state in agentic loops
- Tool Execution: Agents need secure and reliable ways to:
    - Execute LLM-generated actions
    - Handle tool dependencies
    - Manage execution environments
    - Process tool results
Key Architectural Considerations¶
When building agents, several architectural decisions are crucial:
- State Persistence
    - File-based serialization vs. database-backed state
    - Query capabilities for historical data
    - Scaling with conversation length
    - Multi-agent state management
- Tool Security
    - Sandbox environments for arbitrary code execution
    - Dependency management
    - Access control and authorization
    - Input validation and sanitization
- Production Deployment
    - REST API design for agent interactions
    - Data normalization for agent state
    - Environment recreation for tool execution
    - Scaling to millions of agents
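The state-persistence trade-off above can be sketched in a few lines. This is an illustrative comparison, not any framework's API: file-based serialization is simple but requires loading everything to answer a query, while database-backed state supports querying history directly. All names (`save_json`, `save_sqlite`) are hypothetical.

```python
import json
import os
import sqlite3
import tempfile

def save_json(path, messages):
    # File-based serialization: simple, but querying requires a full load.
    with open(path, "w") as f:
        json.dump(messages, f)

def load_json(path):
    with open(path) as f:
        return json.load(f)

def save_sqlite(conn, agent_id, messages):
    # Database-backed state: history can be queried without loading it all.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS messages (agent_id TEXT, role TEXT, content TEXT)"
    )
    conn.executemany(
        "INSERT INTO messages VALUES (?, ?, ?)",
        [(agent_id, m["role"], m["content"]) for m in messages],
    )

messages = [{"role": "user", "content": "hi"}, {"role": "assistant", "content": "hello"}]
path = os.path.join(tempfile.mkdtemp(), "state.json")
save_json(path, messages)

conn = sqlite3.connect(":memory:")
save_sqlite(conn, "agent-1", messages)
rows = conn.execute(
    "SELECT content FROM messages WHERE agent_id = ?", ("agent-1",)
).fetchall()
```

The database route also scales more naturally with conversation length and with multiple agents sharing one store, which is why most production deployments move beyond flat files.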
Future Trends¶
The agent ecosystem is still in its early stages, with several emerging trends:
- Standardization
    - Movement toward common tool schemas (like OpenAI's function calling format)
    - Emerging patterns for agent APIs and deployment
    - Cross-framework compatibility for tools and agents
- Production Focus
    - Shift from notebook-based development to production services
    - Growing importance of observability and monitoring
    - Need for enterprise-grade security and compliance
- Tool Ecosystem Growth
    - Specialized tool providers for common tasks
    - Authentication and access control frameworks
    - Industry-specific tool collections
The Stack¶
The modern AI agent stack can be broken down into several key layers, each addressing specific challenges in agent development. These include:
- Agent Hosting & Serving Solutions
- Agent Observability Solutions
- Agent Frameworks
- Agent Memory Solutions
- Tool Libraries
- Model Serving Solutions
along with some examples of successful Vertical AI Agent Solutions.
Agent Hosting & Serving Solutions¶
| Platform | Description |
| --- | --- |
| Letta | Agent deployment and hosting platform |
| LangGraph | Graph-based orchestration for language model agents |
| Assistants API | OpenAI's API for deploying and managing AI assistants |
| Agents API | API platform for deploying and managing autonomous agents |
| Amazon Bedrock Agents | AWS-based agent hosting and management service |
| LiveKit Agents | Real-time agent deployment and communication platform |
These platforms provide infrastructure and tools for deploying, hosting, and serving AI agents at scale, each with different specializations and integration capabilities.
Additional considerations for hosting solutions:
- Scalability and performance requirements
- Integration capabilities with existing systems
- Cost and resource optimization
- Security and compliance features
Agent Observability Solutions¶
| Platform | Description |
| --- | --- |
| LangSmith | LangChain's platform for debugging, testing, evaluating, and monitoring LLM applications and agents |
| Arize | ML observability platform with LLM monitoring capabilities |
| Weave | Weights & Biases' toolkit for tracking and evaluating LLM applications |
| Langfuse | Open-source LLM engineering platform for monitoring and analytics |
| AgentOps.ai | Specialized platform for monitoring and optimizing AI agents |
| Braintrust | LLM evaluation and monitoring platform |
These platforms provide specialized tools for monitoring, debugging, and analyzing the performance of AI agents and LLM applications in production environments.
Key observability features to consider:
- Real-time monitoring and alerting
- Performance analytics and tracing
- Debug tooling and replay capabilities
- Cost tracking and optimization
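The tracing feature at the heart of these platforms can be illustrated without any vendor SDK. The sketch below is a hypothetical, minimal version: a decorator records a span with latency and outcome for each tool or LLM call; a real setup would export these records to a platform such as those listed above rather than keep them in a list.

```python
import functools
import time

TRACE_LOG = []  # stand-in for an exporter to an observability backend

def traced(name):
    """Record latency and status for every call to the wrapped function."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            status = "error"
            try:
                result = fn(*args, **kwargs)
                status = "ok"
                return result
            finally:
                TRACE_LOG.append({
                    "span": name,
                    "latency_s": time.perf_counter() - start,
                    "status": status,
                })
        return wrapper
    return decorator

@traced("tool.add")
def add(a, b):
    return a + b

result = add(2, 3)
```

Because the decorator records in a `finally` block, failed calls are captured too, which is what makes replay and debugging of agent runs possible.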
Agent Frameworks¶
These frameworks provide different approaches and tools for building AI agents, from simple single-agent systems to complex multi-agent orchestrations. Each has its own strengths and specialized use cases.
| Framework | Description |
| --- | --- |
| Letta | Framework for building and deploying AI agents with built-in orchestration |
| LangGraph | LangChain's framework for building structured agents using computational graphs |
| AutoGen | Microsoft's framework for building multi-agent systems with automated agent orchestration |
| LlamaIndex | Framework for building RAG-enabled agents and LLM applications |
| CrewAI | Framework for orchestrating role-playing autonomous AI agents |
| DSPy | Stanford's framework for programming with foundation models |
| Phidata | AI-first development framework for building production-ready AI applications |
| Semantic Kernel | Microsoft's orchestration framework for LLMs |
| AutoGPT | Framework for building autonomous AI agents with GPT-4 |
Framework selection considerations:
- State Management: How agent state is serialized and persisted
- Context Window Management: How data is compiled into LLM context
- Multi-Agent Communication: Support for agent collaboration
- Memory Handling: Techniques for managing long-term memory
- Model Support: Compatibility with open-source models
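Context window management, one of the selection criteria above, can be sketched concretely. This is a hedged illustration, not any framework's implementation: it keeps the newest messages that fit a token budget, using a crude four-characters-per-token estimate (real frameworks use an actual tokenizer such as tiktoken).

```python
def estimate_tokens(text):
    # Rough heuristic: ~4 characters per token for English text.
    return max(1, len(text) // 4)

def trim_history(messages, budget):
    """Return the longest suffix of `messages` whose estimated tokens fit `budget`."""
    kept, used = [], 0
    for msg in reversed(messages):  # walk newest-first
        cost = estimate_tokens(msg["content"])
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))

history = [
    {"role": "user", "content": "a" * 400},      # ~100 tokens, oldest
    {"role": "assistant", "content": "b" * 40},  # ~10 tokens
    {"role": "user", "content": "c" * 40},       # ~10 tokens, newest
]
trimmed = trim_history(history, budget=25)  # keeps only the newest two
```

Real frameworks layer summarization or memory recall on top of simple truncation like this, so older context is compressed rather than silently dropped.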
Agent Memory Solutions¶
These platforms provide specialized solutions for managing agent memory, enabling long-term context retention and efficient memory management for AI applications.
| Platform | Description |
| --- | --- |
| MemGPT | System for extending LLM context windows through self-managed memory |
| Zep | Long-term memory store for LLM applications and agents |
| LangMem | LangChain's memory management system for conversational agents |
| Mem0 | Memory management and persistence layer for AI agents |
Memory architecture considerations:
- Persistence strategies
- Context window optimization
- Memory retrieval mechanisms
- Integration with vector stores
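A memory retrieval mechanism of the kind listed above can be sketched without a real vector store. The example below is purely illustrative: a toy bag-of-words "embedding" stands in for a real embedding model, and cosine similarity ranks stored memories against a query.

```python
import math
from collections import Counter

def embed(text):
    # Stand-in for a real embedding model: sparse bag-of-words vector.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

class MemoryStore:
    """Toy long-term memory: add facts, retrieve the most relevant ones."""
    def __init__(self):
        self.items = []  # list of (text, vector)

    def add(self, text):
        self.items.append((text, embed(text)))

    def retrieve(self, query, k=1):
        qv = embed(query)
        ranked = sorted(self.items, key=lambda it: cosine(qv, it[1]), reverse=True)
        return [text for text, _ in ranked[:k]]

store = MemoryStore()
store.add("the user prefers dark mode")
store.add("the user lives in Berlin")
best = store.retrieve("what city does the user live in")
```

Swapping the toy `embed` for a real embedding model and the list for a vector database (Chroma, Qdrant, etc.) is exactly the integration the platforms above provide.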
Tool Libraries¶
Tools can be categorized into three main types:
- Knowledge Augmentation
    - Text retrievers
    - Image retrievers
    - Web browsers
    - SQL executors
    - Internal knowledge base access
    - API integrations (news, weather, stocks)
- Capability Extension
    - Calculators
    - Code interpreters
    - Calendar tools
    - Unit converters
    - Language translators
    - Multimodal converters (text-to-image, speech-to-text)
- Write Actions
    - Database modifications
    - Email sending
    - File system operations
    - API calls with side effects
    - Transaction processing
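A minimal tool registry distinguishing these categories can be sketched as follows. Everything here (the `tool` decorator, the tool names, the category strings) is hypothetical, not any library's API; the point is that write actions are tagged so the executor can treat them more cautiously than read-only tools.

```python
TOOLS = {}

def tool(name, category):
    """Register a function as an agent tool under a category."""
    def register(fn):
        TOOLS[name] = {"fn": fn, "category": category}
        return fn
    return register

@tool("calculator", category="capability_extension")
def calculator(expression: str) -> float:
    # eval is unsafe for untrusted input; a real agent would use a
    # sandboxed parser or a restricted expression evaluator.
    return eval(expression, {"__builtins__": {}}, {})

@tool("send_email", category="write_action")
def send_email(to: str, body: str) -> str:
    # Stub: real write actions need authentication and often confirmation.
    return f"queued email to {to}"

def execute(name, **kwargs):
    entry = TOOLS[name]
    if entry["category"] == "write_action":
        pass  # hook for extra checks: access control, human approval, rate limits
    return entry["fn"](**kwargs)

result = execute("calculator", expression="2 + 3 * 4")
```

The category tag is what lets an agent runtime apply different policies, e.g. auto-run knowledge tools but require approval for write actions.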
These libraries provide specialized tools and capabilities that can be integrated into AI agents to enhance their ability to interact with various systems and perform specific tasks.
| Library | Description |
| --- | --- |
| Composio | Tool composition and orchestration library for AI agents |
| Browserbase | Browser automation and web interaction tools for AI agents |
| Exa | AI-powered search and knowledge tools library |
| Model Context Protocol (MCP) | A protocol for enabling LLMs to use tools |
Tool Integration Protocols¶
A key challenge in building agents is standardizing how they interact with tools. Several protocols have emerged to address this:
Model Context Protocol (MCP)¶
Model Context Protocol provides a standardized way for LLMs to interact with tools and external systems. Key features include:
- Resource Management
    - Structured exposure of external resources
    - Schema definitions for data access
    - Standardized resource querying
- Tool Definitions
    - Common format for tool specifications
    - Input/output validation
    - Error handling patterns
- Prompt Templates
    - Standardized prompt formats
    - Context management
    - Response handling
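As a concrete illustration of the tool-definition feature, an MCP server advertises each tool with a name, description, and a JSON Schema for its input. The field names below follow the MCP specification; the `get_weather` tool and its parameters are made up for illustration.

```json
{
  "name": "get_weather",
  "description": "Look up the current weather for a city",
  "inputSchema": {
    "type": "object",
    "properties": {
      "city": { "type": "string", "description": "City name" }
    },
    "required": ["city"]
  }
}
```

Because the input is declared as JSON Schema, both the client and the server can validate arguments before execution, which is where MCP's input/output validation comes from.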
Other Tool Integration Standards¶
| Protocol | Description |
| --- | --- |
| OpenAI Function Calling | JSON Schema-based function definitions |
| LangChain Tools | Tool specification format for LangChain agents |
| Semantic Kernel Skills | Microsoft's approach to defining reusable AI capabilities |
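For comparison, OpenAI's function calling wraps a similar JSON Schema definition in a `function` object. The wrapper shape follows OpenAI's tools format; the `get_stock_price` tool itself is a made-up example.

```json
{
  "type": "function",
  "function": {
    "name": "get_stock_price",
    "description": "Get the latest price for a ticker symbol",
    "parameters": {
      "type": "object",
      "properties": {
        "ticker": { "type": "string", "description": "e.g. AAPL" }
      },
      "required": ["ticker"]
    }
  }
}
```

The overlap between this format and MCP's tool definitions is what makes the cross-framework tool compatibility mentioned earlier feasible: both reduce a tool to a name, a description, and a JSON Schema.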
Best Practices for Tool Integration¶
When implementing tool protocols:
- Security Considerations
    - Validate all inputs before execution
    - Implement proper access controls
    - Monitor tool usage and rate limits
- Error Handling
    - Graceful failure modes
    - Clear error messages
    - Recovery strategies
- Documentation
    - Clear tool specifications
    - Usage examples
    - Integration guides
- Testing
    - Tool validation
    - Integration testing
    - Performance monitoring
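The validation and error-handling practices above can be combined into one small sketch. This is a hedged, minimal version with made-up names (`validate_args`, `run_tool`): arguments are checked against a declared schema before execution, and failures come back as structured results instead of uncaught exceptions.

```python
def validate_args(args, schema):
    """Check that `args` matches a simple {field: type} schema; return error strings."""
    errors = []
    for field, ftype in schema.items():
        if field not in args:
            errors.append(f"missing required field: {field}")
        elif not isinstance(args[field], ftype):
            errors.append(f"{field} must be {ftype.__name__}")
    return errors

def run_tool(fn, args, schema):
    errors = validate_args(args, schema)
    if errors:
        return {"ok": False, "error": "; ".join(errors)}  # clear error message
    try:
        return {"ok": True, "result": fn(**args)}
    except Exception as exc:  # graceful failure mode: never crash the agent loop
        return {"ok": False, "error": str(exc)}

def divide(a, b):
    return a / b

good = run_tool(divide, {"a": 6, "b": 3}, {"a": int, "b": int})
bad = run_tool(divide, {"a": 6, "b": 0}, {"a": int, "b": int})
missing = run_tool(divide, {"a": 6}, {"a": int, "b": int})
```

Returning a structured `{"ok": ..., "error": ...}` result lets the LLM see the error message and attempt a recovery strategy, rather than having the whole run abort.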
Model Serving Solutions¶
These platforms provide various solutions for deploying and serving AI models, from local deployment to cloud-based infrastructure, with different performance and scaling capabilities.
| Platform | Description |
| --- | --- |
| vLLM | High-performance inference engine for LLM serving |
| Ollama | Run and serve open-source LLMs locally |
| LM Studio | Desktop application for running and serving local LLMs |
| SGLang | Fast serving framework for LLMs with structured-generation support |
| Together AI | Platform for deploying and serving large language models |
| Fireworks AI | Infrastructure for serving and fine-tuning LLMs |
| Groq | High-performance LLM inference and serving platform |
| OpenAI | API platform for serving GPT and other AI models |
| Anthropic | Platform for serving Claude and other AI models |
| Mistral AI | Platform for serving efficient and powerful language models |
| Google Gemini | Google's platform for serving multimodal AI models |
Agent Sandboxes¶
| Platform | Description |
| --- | --- |
| E2B | Secure sandboxed environments for running and testing AI agents |
| Modal | Cloud platform for running AI agents in isolated environments |
Note: These platforms provide secure, isolated environments for testing and running AI agents, ensuring safe execution and development of agent capabilities.
Agent Storage Solutions¶
These platforms provide specialized storage solutions for AI applications, including vector databases, embedding storage, and traditional databases optimized for AI workloads.
| Platform | Description |
| --- | --- |
| Chroma | Open-source embedding database for AI applications |
| Qdrant | Vector database for AI-powered search and retrieval |
| Milvus | Open-source vector database for scalable similarity search |
| Pinecone | Vector database optimized for machine learning applications |
| Weaviate | Vector search engine and vector database |
| Neon | Serverless Postgres platform for AI applications |
| Supabase | Open-source Firebase alternative with vector storage capabilities |
Vertical AI Agent Solutions¶
| Company | Description/Focus Area |
| --- | --- |
| Decagon | AI agents for customer support |
| Sierra | Conversational AI agents for customer experience |
| Replit | Cloud development environment and AI coding tools |
| Perplexity | AI-powered search and discovery |
| Harvey | Legal AI solutions |
| MultiOn | AI agents that act autonomously on the web |
| Cognition | AI software engineering (maker of Devin) |
| Factory | AI agents for automating the software development lifecycle |
| All Hands | Open-source AI software development agents |
| Dosu | AI assistant for open-source repository maintenance |
| Lindy | Platform for building AI assistants that automate workflows |
| 11x | AI digital workers for sales and go-to-market |
Interesting and Notable Research and Libraries¶
Open GPTs enables the creation of agents and assistants using LangChain components.
Agenta-AI provides an end-to-end LLM developer platform with tools for prompt engineering and management, evaluation, human annotation, and deployment, all without imposing restrictions on your choice of framework, library, or model.
Jarvis provides essential components that enable LLM agents to use tools. It currently offers ToolBench, HuggingGPT, and EasyTool.
Easy Tool: Enhancing LLM-based Agents with Concise Tool Instruction provides a framework that transforms diverse and lengthy tool documentation into unified, concise tool instructions for easier tool usage.
Development: EasyTool follows a simple pattern of 1. Task Planning, 2. Tool Retrieval, 3. Tool Selection, and 4. Tool Execution, coupled with thoughtful prompting to enable state-of-the-art tool usage across multiple models.
Problem: Using new tools and software can be challenging for LLMs (and people too!), especially when documentation is poor or redundant and usage conventions vary.
Solution: EasyTool provides "a simple method to condense tool documentation into more concise and effective tool instructions."
I: Tool Description Generation
/* I: Task prompt */
Your task is to create a concise and effective tool usage description based on the tool documentation. You should ensure the description only contains the purposes of the
tool without irrelevant information. Here is an example:
/* Examples */
{Tool Documentation}
Tool usage description:
{Tool_name} is a tool that can {General_Purposes}.
This tool has {Number} multiple built-in functions:
1. {Function_1} is to {Functionality_of_Function_1}
2. {Function_2} is to ...
/* Auto-generation of tool description */
{Tool Documentation of 'Aviation Weather Center'}
Tool usage description:
'Aviation Weather Center' is a tool which can provide official aviation weather data...
II: Tool Function Guidelines Construction
/* Task prompt */
Your task is to create the scenario that will use the tool.
1. You are given a tool with its purpose and its parameters list. The scenario should adopt the parameters in the list.
2. If the parameters are null, you should set: {"Scenario": XX, "Parameters":{}}.
Here is an example:
/* Examples */
{Tool_name} is a tool that can {General_Purposes}. {Function_i} is to {Functionality_of_Function_i} {Parameter List of Function_i}
One scenario for {Function_i} of {Tool_name} is: {"Scenario": XX, "Parameters":{XX:XX}}
/* Auto-construction for Tool Function Guidelines */
‘Ebay’ can get products from Ebay in a specific country. ‘Product Details’ in ‘Ebay’ can get the product details for a given product id and a specific country.
{Parameter List of ‘Product Details’}
One scenario for ‘Product Details’ of ‘Ebay’ is:
{"Scenario": "if you want to know the details of the product with product ID 1954 in Germany from Ebay", "Parameters":{"product_id": 1954, "country": "Germany"}}.
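The four-stage pattern the prompts above support can be sketched end to end. This is a toy illustration, not the paper's code: the LLM calls for planning and selection are replaced by stubs, and the concise tool instructions stand in for the generated descriptions.

```python
# Concise tool instructions, as EasyTool would generate from documentation.
TOOL_INSTRUCTIONS = {
    "search": "find documents for a query",
    "translate": "translate text between languages",
}

def plan(task):
    # 1. Task Planning: an LLM would decompose the task; stubbed as one step.
    return [f"find documents about {task}"]

def retrieve_tools(step):
    # 2. Tool Retrieval: rank tools whose instruction shares words with the step.
    words = set(step.split())
    return sorted(
        TOOL_INSTRUCTIONS,
        key=lambda t: -len(words & set(TOOL_INSTRUCTIONS[t].split())),
    )

def select_tool(candidates):
    # 3. Tool Selection: an LLM would choose; stubbed as the top candidate.
    return candidates[0]

def execute(tool, step):
    # 4. Tool Execution: stub for the actual API call.
    return f"ran {tool} for: {step}"

steps = plan("aviation weather")
chosen = select_tool(retrieve_tools(steps[0]))
output = execute(chosen, steps[0])
```

The concise instructions are what make the retrieval step tractable: matching against a one-line purpose is far easier than matching against pages of raw documentation.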
Results: Performance is state-of-the-art across multiple models: ChatGPT, ToolLLaMA-7B, Vicuna-7B, Mistral-Instruct-7B, and GPT-4.
HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face
Development: HuggingGPT enables LLMs to call other models via the Hugging Face Hub.
Problem: LLMs are not the best fit for every task. Enabling LLMs to use task-specific models can improve the quality of results.
Solution: HuggingGPT provides an interface for LLMs by breaking work down into 1. Task Planning, 2. Model Selection, 3. Task Execution, and 4. Response Generation.
Results: The results provide substantial evidence that HuggingGPT can successfully complete single, sequential, and graph-based tasks.
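The HuggingGPT flow can be sketched as a toy pipeline. Everything here is illustrative: the registry stands in for the Hugging Face Hub, the model names are invented, and the planning and execution steps are stubs for LLM and inference calls.

```python
# Stand-in for the Hugging Face Hub: task type -> task-specific model.
MODEL_REGISTRY = {
    "image-classification": "hypothetical/vit-classifier",
    "translation": "hypothetical/en-de-translator",
}

def plan_tasks(request):
    # 1. Task Planning (LLM stub): map the request to task types.
    return ["translation"] if "translate" in request else ["image-classification"]

def select_model(task):
    # 2. Model Selection: pick a task-specific model from the registry.
    return MODEL_REGISTRY[task]

def execute_task(model, request):
    # 3. Task Execution: stub for an inference call.
    return f"{model} processed: {request}"

def generate_response(outputs):
    # 4. Response Generation: compose intermediate results into one answer.
    return " | ".join(outputs)

request = "translate this sentence"
outputs = [execute_task(select_model(t), request) for t in plan_tasks(request)]
response = generate_response(outputs)
```

Sequential and graph-based tasks fit the same shape: the planner emits multiple dependent steps, and outputs of earlier tasks feed later ones before response generation.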
Automatically Building Agents¶
Automated Design of Agentic Systems
Development: In their paper, the authors reveal meta-agents that observe and critique prompting efforts to enable better agents. In their own words:
"The core concept of Meta Agent Search is to instruct a meta agent to iteratively create interestingly new agents, evaluate them, add them to an archive that stores discovered agents, and use this archive to help the meta agent in subsequent iterations create yet more interestingly new agents."
You are an expert machine learning researcher testing different agentic systems.
[Brief Description of the Domain]
[Framework Code]
[Output Instructions and Examples]
[Discovered Agent Archive] (initialized with baselines, updated at every iteration)
# Your task
You are deeply familiar with prompting techniques and the agent works from the literature. Your goal is to maximize the performance by proposing interestingly new agents ...... Use the knowledge from the archive and inspiration from academic literature to propose the next interesting agentic system design.
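The Meta Agent Search loop quoted above can be sketched in miniature. This is an illustrative stub, not the paper's implementation: the meta-agent's proposal and the benchmark evaluation are replaced by placeholders, but the archive dynamics (propose, evaluate, append, condition on the archive) follow the quoted description.

```python
import random

def propose_agent(archive):
    # The meta-agent would see the whole archive in its prompt;
    # stubbed here as varying the best design found so far.
    best = max(archive, key=lambda a: a["score"])
    return {"design": best["design"] + " + variation", "score": None}

def evaluate(rng):
    # Stand-in for running the candidate agent on a benchmark.
    return rng.random()

rng = random.Random(0)
archive = [{"design": "chain-of-thought baseline", "score": 0.5}]

for _ in range(3):  # each iteration discovers and archives one new agent
    candidate = propose_agent(archive)
    candidate["score"] = evaluate(rng)
    archive.append(candidate)

best = max(archive, key=lambda a: a["score"])
```

The archive is the key design choice: because every discovered agent is kept and shown to the meta-agent, later proposals can build on earlier discoveries instead of restarting from the baseline.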