Building Agents¶
Building agents overlaps considerably with building applications, but we cover it here because of its unique importance. The AI agent stack has evolved significantly since 2022-2023, moving beyond simple LLM frameworks to more sophisticated agent architectures.
The Evolution of AI Agents¶
The AI agent landscape has evolved significantly since the initial release of frameworks like LangChain (Oct 2022) and LlamaIndex (Nov 2022). While these started as simple LLM frameworks, the field has grown to encompass more sophisticated architectures addressing key challenges:
- State Management: Agents require sophisticated handling of:
    - Message and event history
    - Long-term memories
    - Execution state in agentic loops
- Tool Execution: Agents need secure and reliable ways to:
    - Execute LLM-generated actions
    - Handle tool dependencies
    - Manage execution environments
    - Process tool results
Key Architectural Considerations¶
When building agents, several architectural decisions are crucial:
- State Persistence
    - File-based serialization vs. database-backed state
    - Query capabilities for historical data
    - Scaling with conversation length
    - Multi-agent state management
- Tool Security
    - Sandbox environments for arbitrary code execution
    - Dependency management
    - Access control and authorization
    - Input validation and sanitization
- Production Deployment
    - REST API design for agent interactions
    - Data normalization for agent state
    - Environment recreation for tool execution
    - Scaling to millions of agents
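The state-persistence trade-off above can be sketched in a few lines. This is an illustrative comparison, not any framework's API: file-based serialization is simple but requires loading everything to answer a query, while database-backed state supports querying history directly. All names (`save_json`, `save_sqlite`) are hypothetical.

```python
import json
import os
import sqlite3
import tempfile

def save_json(path, messages):
    # File-based serialization: simple, but querying requires a full load.
    with open(path, "w") as f:
        json.dump(messages, f)

def load_json(path):
    with open(path) as f:
        return json.load(f)

def save_sqlite(conn, agent_id, messages):
    # Database-backed state: history can be queried without loading it all.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS messages (agent_id TEXT, role TEXT, content TEXT)"
    )
    conn.executemany(
        "INSERT INTO messages VALUES (?, ?, ?)",
        [(agent_id, m["role"], m["content"]) for m in messages],
    )

messages = [{"role": "user", "content": "hi"}, {"role": "assistant", "content": "hello"}]
path = os.path.join(tempfile.mkdtemp(), "state.json")
save_json(path, messages)

conn = sqlite3.connect(":memory:")
save_sqlite(conn, "agent-1", messages)
rows = conn.execute(
    "SELECT content FROM messages WHERE agent_id = ?", ("agent-1",)
).fetchall()
```

The database route also scales more naturally with conversation length and with multiple agents sharing one store, which is why most production deployments move beyond flat files.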
Future Trends¶
The agent ecosystem is still in its early stages, with several emerging trends:
- Standardization
    - Movement toward common tool schemas (like OpenAI's function calling format)
    - Emerging patterns for agent APIs and deployment
    - Cross-framework compatibility for tools and agents
- Production Focus
    - Shift from notebook-based development to production services
    - Growing importance of observability and monitoring
    - Need for enterprise-grade security and compliance
- Tool Ecosystem Growth
    - Specialized tool providers for common tasks
    - Authentication and access control frameworks
    - Industry-specific tool collections
The Stack¶
The modern AI agent stack can be broken down into several key layers, each addressing specific challenges in agent development. These include:
- Agent Hosting & Serving Solutions
- Agent Observability Solutions
- Agent Frameworks
- Agent Memory Solutions
- Tool Libraries
- Model Serving Solutions
along with some examples of successful Vertical AI Agent Solutions.
Agent Hosting & Serving Solutions¶
| Platform | Description |
| --- | --- |
| Letta | Agent deployment and hosting platform |
| LangGraph | Graph-based orchestration for language model agents |
| Assistants API | OpenAI's API for deploying and managing AI assistants |
| Agents API | API platform for deploying and managing autonomous agents |
| Amazon Bedrock Agents | AWS-based agent hosting and management service |
| LiveKit Agents | Real-time agent deployment and communication platform |
These platforms provide infrastructure and tools for deploying, hosting, and serving AI agents at scale, each with different specializations and integration capabilities.
Additional considerations for hosting solutions:
- Scalability and performance requirements
- Integration capabilities with existing systems
- Cost and resource optimization
- Security and compliance features
Agent Observability Solutions¶
| Platform | Description |
| --- | --- |
| LangSmith | LangChain's platform for debugging, testing, evaluating, and monitoring LLM applications and agents |
| Arize | ML observability platform with LLM monitoring capabilities |
| Weave | Weights & Biases' toolkit for tracking and evaluating LLM applications |
| Langfuse | Open-source LLM engineering platform for monitoring and analytics |
| AgentOps.ai | Specialized platform for monitoring and optimizing AI agents |
| Braintrust | LLM evaluation and monitoring platform |
These platforms provide specialized tools for monitoring, debugging, and analyzing the performance of AI agents and LLM applications in production environments.
Key observability features to consider:
- Real-time monitoring and alerting
- Performance analytics and tracing
- Debug tooling and replay capabilities
- Cost tracking and optimization
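The tracing feature at the heart of these platforms can be illustrated without any vendor SDK. The sketch below is a hypothetical, minimal version: a decorator records a span with latency and outcome for each tool or LLM call; a real setup would export these records to a platform such as those listed above rather than keep them in a list.

```python
import functools
import time

TRACE_LOG = []  # stand-in for an exporter to an observability backend

def traced(name):
    """Record latency and status for every call to the wrapped function."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            status = "error"
            try:
                result = fn(*args, **kwargs)
                status = "ok"
                return result
            finally:
                TRACE_LOG.append({
                    "span": name,
                    "latency_s": time.perf_counter() - start,
                    "status": status,
                })
        return wrapper
    return decorator

@traced("tool.add")
def add(a, b):
    return a + b

result = add(2, 3)
```

Because the decorator records in a `finally` block, failed calls are captured too, which is what makes replay and debugging of agent runs possible.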
Agent Frameworks¶
These frameworks provide different approaches and tools for building AI agents, from simple single-agent systems to complex multi-agent orchestrations. Each has its own strengths and specialized use cases.
| Framework | Description |
| --- | --- |
| Letta | Framework for building and deploying AI agents with built-in orchestration |
| LangGraph | LangChain's framework for building structured agents using computational graphs |
| AutoGen | Microsoft's framework for building multi-agent systems with automated agent orchestration |
| LlamaIndex | Framework for building RAG-enabled agents and LLM applications |
| CrewAI | Framework for orchestrating role-playing autonomous AI agents |
| DSPy | Stanford's framework for programming with foundation models |
| Phidata | AI-first development framework for building production-ready AI applications |
| Semantic Kernel | Microsoft's orchestration framework for LLMs |
| AutoGPT | Framework for building autonomous AI agents with GPT-4 |
Framework selection considerations:
- State Management: How agent state is serialized and persisted
- Context Window Management: How data is compiled into LLM context
- Multi-Agent Communication: Support for agent collaboration
- Memory Handling: Techniques for managing long-term memory
- Model Support: Compatibility with open-source models
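Context window management, one of the selection criteria above, can be sketched concretely. This is a hedged illustration, not any framework's implementation: it keeps the newest messages that fit a token budget, using a crude four-characters-per-token estimate (real frameworks use an actual tokenizer such as tiktoken).

```python
def estimate_tokens(text):
    # Rough heuristic: ~4 characters per token for English text.
    return max(1, len(text) // 4)

def trim_history(messages, budget):
    """Return the longest suffix of `messages` whose estimated tokens fit `budget`."""
    kept, used = [], 0
    for msg in reversed(messages):  # walk newest-first
        cost = estimate_tokens(msg["content"])
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))

history = [
    {"role": "user", "content": "a" * 400},      # ~100 tokens, oldest
    {"role": "assistant", "content": "b" * 40},  # ~10 tokens
    {"role": "user", "content": "c" * 40},       # ~10 tokens, newest
]
trimmed = trim_history(history, budget=25)  # keeps only the newest two
```

Real frameworks layer summarization or memory recall on top of simple truncation like this, so older context is compressed rather than silently dropped.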
Agent Memory Solutions¶
These platforms provide specialized solutions for managing agent memory, enabling long-term context retention and efficient memory management for AI applications.
| Platform | Description |
| --- | --- |
| MemGPT | System for extending LLM context windows through self-managed memory |
| Zep | Long-term memory store for LLM applications and agents |
| LangMem | LangChain's memory management system for conversational agents |
| Mem0 | Memory management and persistence layer for AI agents |
Memory architecture considerations:
- Persistence strategies
- Context window optimization
- Memory retrieval mechanisms
- Integration with vector stores
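A memory retrieval mechanism of the kind listed above can be sketched without a real vector store. The example below is purely illustrative: a toy bag-of-words "embedding" stands in for a real embedding model, and cosine similarity ranks stored memories against a query.

```python
import math
from collections import Counter

def embed(text):
    # Stand-in for a real embedding model: sparse bag-of-words vector.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

class MemoryStore:
    """Toy long-term memory: add facts, retrieve the most relevant ones."""
    def __init__(self):
        self.items = []  # list of (text, vector)

    def add(self, text):
        self.items.append((text, embed(text)))

    def retrieve(self, query, k=1):
        qv = embed(query)
        ranked = sorted(self.items, key=lambda it: cosine(qv, it[1]), reverse=True)
        return [text for text, _ in ranked[:k]]

store = MemoryStore()
store.add("the user prefers dark mode")
store.add("the user lives in Berlin")
best = store.retrieve("what city does the user live in")
```

Swapping the toy `embed` for a real embedding model and the list for a vector database (Chroma, Qdrant, etc.) is exactly the integration the platforms above provide.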
Tool Libraries¶
Tools can be categorized into three main types:
- Knowledge Augmentation
    - Text retrievers
    - Image retrievers
    - Web browsers
    - SQL executors
    - Internal knowledge base access
    - API integrations (news, weather, stocks)
- Capability Extension
    - Calculators
    - Code interpreters
    - Calendar tools
    - Unit converters
    - Language translators
    - Multimodal converters (text-to-image, speech-to-text)
- Write Actions
    - Database modifications
    - Email sending
    - File system operations
    - API calls with side effects
    - Transaction processing
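A minimal tool registry distinguishing these categories can be sketched as follows. Everything here (the `tool` decorator, the tool names, the category strings) is hypothetical, not any library's API; the point is that write actions are tagged so the executor can treat them more cautiously than read-only tools.

```python
TOOLS = {}

def tool(name, category):
    """Register a function as an agent tool under a category."""
    def register(fn):
        TOOLS[name] = {"fn": fn, "category": category}
        return fn
    return register

@tool("calculator", category="capability_extension")
def calculator(expression: str) -> float:
    # eval is unsafe for untrusted input; a real agent would use a
    # sandboxed parser or a restricted expression evaluator.
    return eval(expression, {"__builtins__": {}}, {})

@tool("send_email", category="write_action")
def send_email(to: str, body: str) -> str:
    # Stub: real write actions need authentication and often confirmation.
    return f"queued email to {to}"

def execute(name, **kwargs):
    entry = TOOLS[name]
    if entry["category"] == "write_action":
        pass  # hook for extra checks: access control, human approval, rate limits
    return entry["fn"](**kwargs)

result = execute("calculator", expression="2 + 3 * 4")
```

The category tag is what lets an agent runtime apply different policies, e.g. auto-run knowledge tools but require approval for write actions.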
These libraries provide specialized tools and capabilities that can be integrated into AI agents to enhance their ability to interact with various systems and perform specific tasks.
| Library | Description |
| --- | --- |
| Composio | Tool composition and orchestration library for AI agents |
| Browserbase | Browser automation and web interaction tools for AI agents |
| Exa | AI-powered search and knowledge tools library |
| Model Context Protocol (MCP) | A protocol for enabling LLMs to use tools |
Tool Integration Protocols¶
A key challenge in building agents is standardizing how they interact with tools. Several protocols have emerged to address this:
Model Context Protocol (MCP)¶
Model Context Protocol provides a standardized way for LLMs to interact with tools and external systems. Key features include:
- Resource Management
    - Structured exposure of external resources
    - Schema definitions for data access
    - Standardized resource querying
- Tool Definitions
    - Common format for tool specifications
    - Input/output validation
    - Error handling patterns
- Prompt Templates
    - Standardized prompt formats
    - Context management
    - Response handling
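As a concrete illustration of the tool-definition feature, an MCP server advertises each tool with a name, description, and a JSON Schema for its input. The field names below follow the MCP specification; the `get_weather` tool and its parameters are made up for illustration.

```json
{
  "name": "get_weather",
  "description": "Look up the current weather for a city",
  "inputSchema": {
    "type": "object",
    "properties": {
      "city": { "type": "string", "description": "City name" }
    },
    "required": ["city"]
  }
}
```

Because the input is declared as JSON Schema, both the client and the server can validate arguments before execution, which is where MCP's input/output validation comes from.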
Other Tool Integration Standards¶
| Protocol | Description |
| --- | --- |
| OpenAI Function Calling | JSON Schema-based function definitions |
| LangChain Tools | Tool specification format for LangChain agents |
| Semantic Kernel Skills | Microsoft's approach to defining reusable AI capabilities |
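For comparison, OpenAI's function calling wraps a similar JSON Schema definition in a `function` object. The wrapper shape follows OpenAI's tools format; the `get_stock_price` tool itself is a made-up example.

```json
{
  "type": "function",
  "function": {
    "name": "get_stock_price",
    "description": "Get the latest price for a ticker symbol",
    "parameters": {
      "type": "object",
      "properties": {
        "ticker": { "type": "string", "description": "e.g. AAPL" }
      },
      "required": ["ticker"]
    }
  }
}
```

The overlap between this format and MCP's tool definitions is what makes the cross-framework tool compatibility mentioned earlier feasible: both reduce a tool to a name, a description, and a JSON Schema.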
Best Practices for Tool Integration¶
When implementing tool protocols:
- Security Considerations
    - Validate all inputs before execution
    - Implement proper access controls
    - Monitor tool usage and rate limits
- Error Handling
    - Graceful failure modes
    - Clear error messages
    - Recovery strategies
- Documentation
    - Clear tool specifications
    - Usage examples
    - Integration guides
- Testing
    - Tool validation
    - Integration testing
    - Performance monitoring
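The validation and error-handling practices above can be combined into one small sketch. This is a hedged, minimal version with made-up names (`validate_args`, `run_tool`): arguments are checked against a declared schema before execution, and failures come back as structured results instead of uncaught exceptions.

```python
def validate_args(args, schema):
    """Check that `args` matches a simple {field: type} schema; return error strings."""
    errors = []
    for field, ftype in schema.items():
        if field not in args:
            errors.append(f"missing required field: {field}")
        elif not isinstance(args[field], ftype):
            errors.append(f"{field} must be {ftype.__name__}")
    return errors

def run_tool(fn, args, schema):
    errors = validate_args(args, schema)
    if errors:
        return {"ok": False, "error": "; ".join(errors)}  # clear error message
    try:
        return {"ok": True, "result": fn(**args)}
    except Exception as exc:  # graceful failure mode: never crash the agent loop
        return {"ok": False, "error": str(exc)}

def divide(a, b):
    return a / b

good = run_tool(divide, {"a": 6, "b": 3}, {"a": int, "b": int})
bad = run_tool(divide, {"a": 6, "b": 0}, {"a": int, "b": int})
missing = run_tool(divide, {"a": 6}, {"a": int, "b": int})
```

Returning a structured `{"ok": ..., "error": ...}` result lets the LLM see the error message and attempt a recovery strategy, rather than having the whole run abort.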
Model Serving Solutions¶
These platforms provide various solutions for deploying and serving AI models, from local deployment to cloud-based infrastructure, with different performance and scaling capabilities.
| Platform | Description |
| --- | --- |
| vLLM | High-performance inference engine for LLM serving |
| Ollama | Run and serve open-source LLMs locally |
| LM Studio | Desktop application for running and serving local LLMs |
| SGLang | Fast serving framework for LLMs with structured-generation support |
| Together AI | Platform for deploying and serving large language models |
| Fireworks AI | Infrastructure for serving and fine-tuning LLMs |
| Groq | High-performance LLM inference and serving platform |
| OpenAI | API platform for serving GPT and other AI models |
| Anthropic | Platform for serving Claude and other AI models |
| Mistral AI | Platform for serving efficient and powerful language models |
| Google Gemini | Google's platform for serving multimodal AI models |
Agent Sandboxes¶
| Platform | Description |
| --- | --- |
| E2B | Secure sandboxed environments for running and testing AI agents |
| Modal | Cloud platform for running AI agents in isolated environments |
Note: These platforms provide secure, isolated environments for testing and running AI agents, ensuring safe execution and development of agent capabilities.
Agent Storage Solutions¶
These platforms provide specialized storage solutions for AI applications, including vector databases, embedding storage, and traditional databases optimized for AI workloads.
| Platform | Description |
| --- | --- |
| Chroma | Open-source embedding database for AI applications |
| Qdrant | Vector database for AI-powered search and retrieval |
| Milvus | Open-source vector database for scalable similarity search |
| Pinecone | Vector database optimized for machine learning applications |
| Weaviate | Vector search engine and vector database |
| Neon | Serverless Postgres platform for AI applications |
| Supabase | Open-source Firebase alternative with vector storage capabilities |
Vertical AI Agent Solutions¶
| Company | Description/Focus Area |
| --- | --- |
| Decagon | AI agents for customer support |
| Sierra | Conversational AI agents for customer experience |
| Replit | Cloud development environment and AI coding tools |
| Perplexity | AI-powered search and discovery |
| Harvey | Legal AI solutions |
| MultiOn | AI agents that act autonomously on the web |
| Cognition | AI software engineering (maker of Devin) |
| Factory | AI agents for automating the software development lifecycle |
| All Hands | Open-source AI software development agents |
| Dosu | AI assistant for open-source repository maintenance |
| Lindy | Platform for building AI assistants that automate workflows |
| 11x | AI digital workers for sales and go-to-market |
Interesting and Notable Research and Libraries¶
Open GPTs enables the creation of agents and assistants using LangChain components.
Agenta-AI provides an end-to-end LLM developer platform with tools for prompt engineering and management, evaluation, human annotation, and deployment, all without imposing restrictions on your choice of framework, library, or model.
Jarvis provides essential components that enable LLM agents to use tools. It currently offers ToolBench, HuggingGPT, and EasyTool.
Easy Tool: Enhancing LLM-based Agents with Concise Tool Instruction provides a framework that transforms diverse and lengthy tool documentation into unified, concise tool instructions for easier tool usage.
Development: EasyTool follows a simple pattern of 1. Task Planning, 2. Tool Retrieval, 3. Tool Selection, and 4. Tool Execution, coupled with thoughtful prompting to enable state-of-the-art tool usage across multiple models.
Problem: Using new tools and software can be challenging for LLMs (and people too!), especially when documentation is poor or redundant and usage conventions vary.
Solution: EasyTool provides "a simple method to condense tool documentation into more concise and effective tool instructions."
I: Tool Description Generation
/* I: Task prompt */
Your task is to create a concise and effective tool usage description based on the tool documentation. You should ensure the description only contains the purposes of the
tool without irrelevant information. Here is an example:
/* Examples */
{Tool Documentation}
Tool usage description:
{Tool_name} is a tool that can {General_Purposes}.
This tool has {Number} multiple built-in functions:
1. {Function_1} is to {Functionality_of_Function_1}
2. {Function_2} is to ...
/* Auto-generation of tool description */
{Tool Documentation of 'Aviation Weather Center'}
Tool usage description:
'Aviation Weather Center' is a tool which can provide official aviation weather data...
II: Tool Function Guidelines Construction
/* Task prompt */
Your task is to create the scenario that will use the tool.
1. You are given a tool with its purpose and its parameters list. The scenario should adopt the parameters in the list.
2. If the parameters are null, you should set: {"Scenario": XX, "Parameters":{}}.
Here is an example:
/* Examples */
{Tool_name} is a tool that can {General_Purposes}. {Function_i} is to {Functionality_of_Function_i} {Parameter List of Function_i}
One scenario for {Function_i} of {Tool_name} is: {"Scenario": XX, "Parameters":{XX:XX}}
/* Auto-construction for Tool Function Guidelines */
‘Ebay’ can get products from Ebay in a specific country. ‘Product Details’ in ‘Ebay’ can get the product details for a given product id and a specific country.
{Parameter List of ‘Product Details’}
One scenario for ‘Product Details’ of ‘Ebay’ is:
{"Scenario": "if you want to know the details of the product with product ID 1954 in Germany from Ebay", "Parameters":{"product_id": 1954, "country": "Germany"}}.
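The four-stage pattern the prompts above support can be sketched end to end. This is a toy illustration, not the paper's code: the LLM calls for planning and selection are replaced by stubs, and the concise tool instructions stand in for the generated descriptions.

```python
# Concise tool instructions, as EasyTool would generate from documentation.
TOOL_INSTRUCTIONS = {
    "search": "find documents for a query",
    "translate": "translate text between languages",
}

def plan(task):
    # 1. Task Planning: an LLM would decompose the task; stubbed as one step.
    return [f"find documents about {task}"]

def retrieve_tools(step):
    # 2. Tool Retrieval: rank tools whose instruction shares words with the step.
    words = set(step.split())
    return sorted(
        TOOL_INSTRUCTIONS,
        key=lambda t: -len(words & set(TOOL_INSTRUCTIONS[t].split())),
    )

def select_tool(candidates):
    # 3. Tool Selection: an LLM would choose; stubbed as the top candidate.
    return candidates[0]

def execute(tool, step):
    # 4. Tool Execution: stub for the actual API call.
    return f"ran {tool} for: {step}"

steps = plan("aviation weather")
chosen = select_tool(retrieve_tools(steps[0]))
output = execute(chosen, steps[0])
```

The concise instructions are what make the retrieval step tractable: matching against a one-line purpose is far easier than matching against pages of raw documentation.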
Results: Performance is state-of-the-art across multiple models: ChatGPT, ToolLLaMA-7B, Vicuna-7B, Mistral-Instruct-7B, and GPT-4.
HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face
Development: HuggingGPT enables LLMs to call other models via the Hugging Face Hub.
Problem: LLMs are not the best fit for every task. Enabling LLMs to use task-specific models can improve the quality of results.
Solution: HuggingGPT provides an interface for LLMs by breaking work down into 1. Task Planning, 2. Model Selection, 3. Task Execution, and 4. Response Generation.
Results: The results provide substantial evidence that HuggingGPT can successfully complete single, sequential, and graph-based tasks.
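The HuggingGPT flow can be sketched as a toy pipeline. Everything here is illustrative: the registry stands in for the Hugging Face Hub, the model names are invented, and the planning and execution steps are stubs for LLM and inference calls.

```python
# Stand-in for the Hugging Face Hub: task type -> task-specific model.
MODEL_REGISTRY = {
    "image-classification": "hypothetical/vit-classifier",
    "translation": "hypothetical/en-de-translator",
}

def plan_tasks(request):
    # 1. Task Planning (LLM stub): map the request to task types.
    return ["translation"] if "translate" in request else ["image-classification"]

def select_model(task):
    # 2. Model Selection: pick a task-specific model from the registry.
    return MODEL_REGISTRY[task]

def execute_task(model, request):
    # 3. Task Execution: stub for an inference call.
    return f"{model} processed: {request}"

def generate_response(outputs):
    # 4. Response Generation: compose intermediate results into one answer.
    return " | ".join(outputs)

request = "translate this sentence"
outputs = [execute_task(select_model(t), request) for t in plan_tasks(request)]
response = generate_response(outputs)
```

Sequential and graph-based tasks fit the same shape: the planner emits multiple dependent steps, and outputs of earlier tasks feed later ones before response generation.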
Automatically Building Agents¶
Automated Design of Agentic Systems
Development: In their paper, the authors reveal meta-agents that observe and critique prompting efforts to enable better agents. In their own words:
"The core concept of Meta Agent Search is to instruct a meta agent to iteratively create interestingly new agents, evaluate them, add them to an archive that stores discovered agents, and use this archive to help the meta agent in subsequent iterations create yet more interestingly new agents."
You are an expert machine learning researcher testing different agentic systems.
[Brief Description of the Domain]
[Framework Code]
[Output Instructions and Examples]
[Discovered Agent Archive] (initialized with baselines, updated at every iteration)
# Your task
You are deeply familiar with prompting techniques and the agent works from the literature. Your goal is to maximize the performance by proposing interestingly new agents ...... Use the knowledge from the archive and inspiration from academic literature to propose the next interesting agentic system design.
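The Meta Agent Search loop quoted above can be sketched in miniature. This is an illustrative stub, not the paper's implementation: the meta-agent's proposal and the benchmark evaluation are replaced by placeholders, but the archive dynamics (propose, evaluate, append, condition on the archive) follow the quoted description.

```python
import random

def propose_agent(archive):
    # The meta-agent would see the whole archive in its prompt;
    # stubbed here as varying the best design found so far.
    best = max(archive, key=lambda a: a["score"])
    return {"design": best["design"] + " + variation", "score": None}

def evaluate(rng):
    # Stand-in for running the candidate agent on a benchmark.
    return rng.random()

rng = random.Random(0)
archive = [{"design": "chain-of-thought baseline", "score": 0.5}]

for _ in range(3):  # each iteration discovers and archives one new agent
    candidate = propose_agent(archive)
    candidate["score"] = evaluate(rng)
    archive.append(candidate)

best = max(archive, key=lambda a: a["score"])
```

The archive is the key design choice: because every discovered agent is kept and shown to the meta-agent, later proposals can build on earlier discoveries instead of restarting from the baseline.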