
Building Agents

Building agents overlaps considerably with building LLM applications, but we cover it here because of its unique importance. The AI agent stack has evolved significantly since 2022-2023, moving beyond simple LLM frameworks to more sophisticated agent architectures.

The Evolution of AI Agents

The AI agent landscape has evolved significantly since the initial release of frameworks like LangChain (Oct 2022) and LlamaIndex (Nov 2022). While these started as simple LLM frameworks, the field has grown to encompass more sophisticated architectures addressing key challenges (a minimal agent-loop sketch follows this list):

  1. State Management: Agents require sophisticated handling of:

    • Message and event history
    • Long-term memories
    • Execution state in agentic loops
  2. Tool Execution: Agents need secure and reliable ways to:

    • Execute LLM-generated actions
    • Handle tool dependencies
    • Manage execution environments
    • Process tool results
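
To make these challenges concrete, here is a minimal sketch of an agentic loop that keeps message history and execution state together and routes LLM-chosen actions through a tool registry. The `llm_call` callable and the `TOOLS` registry are placeholders for whatever model client and tool library you actually use:

```python
import json
from dataclasses import dataclass, field

# Hypothetical tool registry: name -> callable (stands in for a real tool library).
TOOLS = {"add": lambda a, b: a + b}

@dataclass
class AgentState:
    """Execution state the agent must persist across steps."""
    messages: list = field(default_factory=list)   # message and event history
    memories: list = field(default_factory=list)   # long-term memories
    step: int = 0                                  # position in the agentic loop

def run_agent(state: AgentState, llm_call, max_steps: int = 10) -> AgentState:
    """Drive the agentic loop: call the LLM, execute any requested tool, repeat."""
    for _ in range(max_steps):
        state.step += 1
        # llm_call is assumed to return either {"tool": ..., "args": {...}} or {"final": ...}
        action = llm_call(state.messages)
        state.messages.append({"role": "assistant", "content": json.dumps(action)})
        if "final" in action:                      # the model chose to answer directly
            break
        tool = TOOLS[action["tool"]]               # look up and execute the requested tool
        result = tool(**action["args"])
        state.messages.append({"role": "tool", "content": str(result)})
    return state
```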

Key Architectural Considerations

When building agents, several architectural decisions are crucial (a small persistence sketch follows this list):

  1. State Persistence

    • File-based serialization vs. Database-backed state
    • Query capabilities for historical data
    • Scaling with conversation length
    • Multi-agent state management
  2. Tool Security

    • Sandbox environments for arbitrary code execution
    • Dependency management
    • Access control and authorization
    • Input validation and sanitization
  3. Production Deployment

    • REST API design for agent interactions
    • Data normalization for agent state
    • Environment recreation for tool execution
    • Scaling to millions of agents
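
To illustrate the file-based vs. database-backed trade-off from the list above, here is a small sketch that persists agent messages to SQLite so history survives restarts and can be queried; the table layout and helper names are illustrative, not any particular framework's schema:

```python
import json
import sqlite3

def init_store(path: str = "agents.db") -> sqlite3.Connection:
    """Create a simple message table keyed by agent id."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS messages (
               agent_id TEXT,
               ts DATETIME DEFAULT CURRENT_TIMESTAMP,
               role TEXT,
               content TEXT)"""
    )
    return conn

def append_message(conn, agent_id: str, role: str, content: dict) -> None:
    """Persist one message; unlike in-memory state, this survives restarts."""
    conn.execute(
        "INSERT INTO messages (agent_id, role, content) VALUES (?, ?, ?)",
        (agent_id, role, json.dumps(content)),
    )
    conn.commit()

def history(conn, agent_id: str, limit: int = 50) -> list:
    """Query historical messages for one agent -- something flat files make hard."""
    rows = conn.execute(
        "SELECT role, content FROM messages WHERE agent_id = ? ORDER BY ts DESC LIMIT ?",
        (agent_id, limit),
    ).fetchall()
    return [(role, json.loads(content)) for role, content in rows]
```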

The agent ecosystem is still in its early stages, with several emerging trends:

  1. Standardization

    • Movement toward common tool schemas (like OpenAI's function calling format)
    • Emerging patterns for agent APIs and deployment
    • Cross-framework compatibility for tools and agents
  2. Production Focus

    • Shift from notebook-based development to production services
    • Growing importance of observability and monitoring
    • Need for enterprise-grade security and compliance
  3. Tool Ecosystem Growth

    • Specialized tool providers for common tasks
    • Authentication and access control frameworks
    • Industry-specific tool collections

The Stack

The modern AI agent stack can be broken down into several key layers, each addressing specific challenges in agent development. These layers are covered below, along with some examples of successful Vertical AI Agent Solutions.

Agent Hosting & Serving Solutions

| Platform | Description |
| --- | --- |
| Letta | Agent deployment and hosting platform |
| LangGraph | Graph-based orchestration for language model agents |
| Assistants API | OpenAI's API for deploying and managing AI assistants |
| Agents API | API platform for deploying and managing autonomous agents |
| Amazon Bedrock Agents | AWS-based agent hosting and management service |
| LiveKit Agents | Real-time agent deployment and communication platform |

These platforms provide infrastructure and tools for deploying, hosting, and serving AI agents at scale, each with different specializations and integration capabilities.

Additional considerations for hosting solutions:

  • Scalability and performance requirements
  • Integration capabilities with existing systems
  • Cost and resource optimization
  • Security and compliance features

Agent Observability Solutions

| Platform | Description |
| --- | --- |
| LangSmith | LangChain's platform for debugging, testing, evaluating, and monitoring LLM applications and agents |
| Arize | ML observability platform with LLM monitoring capabilities |
| Weave | Weights & Biases' toolkit for tracing, evaluating, and monitoring LLM applications |
| Langfuse | Open-source LLM engineering platform for monitoring and analytics |
| AgentOps.ai | Specialized platform for monitoring and optimizing AI agents |
| Braintrust | LLM evaluation and monitoring platform |

These platforms provide specialized tools for monitoring, debugging, and analyzing the performance of AI agents and LLM applications in production environments.

Key observability features to consider (a minimal tracing sketch follows this list):

  • Real-time monitoring and alerting
  • Performance analytics and tracing
  • Debug tooling and replay capabilities
  • Cost tracking and optimization
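
The platforms above expose richer SDKs, but the core tracing idea can be sketched with a plain decorator that records latency, status, and errors for each agent step; `log_event` is a stand-in for whichever backend you ship events to:

```python
import functools
import json
import time
import uuid

def log_event(event: dict) -> None:
    """Stand-in sink: in practice this would ship to your observability backend."""
    print(json.dumps(event))

def traced(step_name: str):
    """Decorator that records latency, status, and failures for an agent step."""
    def wrap(fn):
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            span_id = str(uuid.uuid4())
            start = time.perf_counter()
            try:
                result = fn(*args, **kwargs)
                log_event({"span": span_id, "step": step_name, "status": "ok",
                           "latency_s": time.perf_counter() - start})
                return result
            except Exception as exc:
                log_event({"span": span_id, "step": step_name, "status": "error",
                           "error": repr(exc),
                           "latency_s": time.perf_counter() - start})
                raise
        return inner
    return wrap

@traced("tool.search")
def search(query: str) -> str:
    return f"results for {query}"
```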

Agent Frameworks

These frameworks provide different approaches and tools for building AI agents, from simple single-agent systems to complex multi-agent orchestrations. Each has its own strengths and specialized use cases.

| Framework | Description |
| --- | --- |
| Letta | Framework for building and deploying AI agents with built-in orchestration |
| LangGraph | LangChain's framework for building structured agents using computational graphs |
| AutoGen | Microsoft's framework for building multi-agent systems with automated agent orchestration |
| LlamaIndex | Framework for building RAG-enabled agents and LLM applications |
| CrewAI | Framework for orchestrating role-playing autonomous AI agents |
| DSPy | Stanford's framework for programming with foundation models |
| Phidata | AI-first development framework for building production-ready AI applications |
| Semantic Kernel | Microsoft's orchestration framework for LLMs |
| AutoGPT | Framework for building autonomous AI agents with GPT-4 |

Framework selection considerations:

  • State Management: How agent state is serialized and persisted
  • Context Window Management: How data is compiled into LLM context
  • Multi-Agent Communication: Support for agent collaboration
  • Memory Handling: Techniques for managing long-term memory
  • Model Support: Compatibility with open-source models

Agent Memory Solutions

These platforms provide specialized solutions for managing agent memory, enabling long-term context retention and efficient memory management for AI applications.

| Platform | Description |
| --- | --- |
| MemGPT | System for extending effective LLM context through self-managed, tiered memory |
| Zep | Long-term memory store for LLM applications and agents |
| LangMem | LangChain's memory management system for conversational agents |
| Mem0 | Memory management and persistence layer for AI agents |

Memory architecture considerations (a retrieval sketch follows this list):

  • Persistence strategies
  • Context window optimization
  • Memory retrieval mechanisms
  • Integration with vector stores
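
A minimal sketch of the retrieval side of agent memory: persist everything, embed it, and pull only the most relevant items back into the context window. The `embed` function here is a toy placeholder for a real embedding model:

```python
import math

def embed(text: str) -> list:
    """Placeholder embedding: a real system would call an embedding model here."""
    vec = [0.0] * 64
    for i, ch in enumerate(text.lower()):
        vec[i % 64] += ord(ch) / 1000.0
    return vec

def cosine(a: list, b: list) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm if norm else 0.0

class MemoryStore:
    """Long-term memory: store everything, retrieve only what fits in context."""
    def __init__(self):
        self._items = []  # list of (text, embedding)

    def add(self, text: str) -> None:
        self._items.append((text, embed(text)))

    def recall(self, query: str, k: int = 3) -> list:
        """Return the k memories most similar to the query."""
        q = embed(query)
        ranked = sorted(self._items, key=lambda it: cosine(q, it[1]), reverse=True)
        return [text for text, _ in ranked[:k]]
```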

Tool Libraries

Tools can be categorized into three main types (a small registry sketch follows this list):

  1. Knowledge Augmentation

    • Text retrievers
    • Image retrievers
    • Web browsers
    • SQL executors
    • Internal knowledge base access
    • API integrations (news, weather, stocks)
  2. Capability Extension

    • Calculators
    • Code interpreters
    • Calendar tools
    • Unit converters
    • Language translators
    • Multimodal converters (text-to-image, speech-to-text)
  3. Write Actions

    • Database modifications
    • Email sending
    • File system operations
    • API calls with side effects
    • Transaction processing
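
Here is the registry sketch referenced above: each tool carries a category tag so an agent (or its guardrails) can treat write actions more cautiously than read-only knowledge or capability tools. The tool names and the approval rule are illustrative:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Tool:
    name: str
    category: str          # "knowledge", "capability", or "write"
    description: str
    fn: Callable

REGISTRY: dict = {}

def register(tool: Tool) -> None:
    REGISTRY[tool.name] = tool

register(Tool("web_search", "knowledge", "Retrieve documents from the web",
              fn=lambda q: f"results for {q}"))
register(Tool("calculator", "capability", "Evaluate arithmetic expressions (toy)",
              fn=lambda expr: eval(expr, {"__builtins__": {}})))
register(Tool("send_email", "write", "Send an email (has side effects)",
              fn=lambda to, body: f"sent to {to}"))

def call_tool(name: str, require_confirmation: bool = True, **kwargs):
    """Write actions get an extra confirmation step; read-only tools run directly."""
    tool = REGISTRY[name]
    if tool.category == "write" and require_confirmation:
        raise PermissionError(f"{name} is a write action and needs explicit approval")
    return tool.fn(**kwargs)
```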

These libraries provide specialized tools and capabilities that can be integrated into AI agents to enhance their ability to interact with various systems and perform specific tasks.

| Library | Description |
| --- | --- |
| Composio | Tool composition and orchestration library for AI agents |
| Browserbase | Browser automation and web interaction tools for AI agents |
| Exa | AI-powered search and knowledge tools library |
| Model Context Protocol (MCP) | A protocol for enabling LLMs to use tools |

Tool Integration Protocols

A key challenge in building agents is standardizing how they interact with tools. Several protocols have emerged to address this:

Model Context Protocol (MCP)

Model Context Protocol provides a standardized way for LLMs to interact with tools and external systems. Key features include:

  1. Resource Management

    • Structured exposure of external resources
    • Schema definitions for data access
    • Standardized resource querying
  2. Tool Definitions

    • Common format for tool specifications
    • Input/output validation
    • Error handling patterns
  3. Prompt Templates

    • Standardized prompt formats
    • Context management
    • Response handling

Other Tool Integration Standards

| Protocol | Description |
| --- | --- |
| OpenAI Function Calling | JSON Schema-based function definitions |
| LangChain Tools | Tool specification format for LangChain agents |
| Semantic Kernel Skills | Microsoft's approach to defining reusable AI capabilities |
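
For concreteness, this is roughly the shape of a tool definition in OpenAI's function-calling format, written as a Python dict; MCP and the other standards above use similarly JSON Schema-based descriptions. The weather tool itself is made up for illustration:

```python
# An illustrative tool definition in OpenAI's function-calling style:
# the model sees the JSON Schema and returns the arguments it wants to call the tool with.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",                      # hypothetical tool name
        "description": "Get the current weather for a city.",
        "parameters": {                             # JSON Schema for the arguments
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name, e.g. 'Berlin'"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["city"],
        },
    },
}
```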

Best Practices for Tool Integration

When implementing tool protocols, keep the following in mind (a validation sketch follows this list):

  1. Security Considerations

    • Validate all inputs before execution
    • Implement proper access controls
    • Monitor tool usage and rate limits
  2. Error Handling

    • Graceful failure modes
    • Clear error messages
    • Recovery strategies
  3. Documentation

    • Clear tool specifications
    • Usage examples
    • Integration guides
  4. Testing

    • Tool validation
    • Integration testing
    • Performance monitoring
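
A small sketch combining the first two practices: validate tool arguments against a declared schema and turn failures into structured errors the agent can recover from. The schema checker is deliberately minimal and illustrative:

```python
def validate_args(args: dict, schema: dict) -> None:
    """Minimal check of required fields and basic types against a declared schema."""
    for name, expected in schema.items():
        if name not in args:
            raise ValueError(f"missing required argument: {name}")
        if not isinstance(args[name], expected):
            raise TypeError(f"{name} must be {expected.__name__}")

def safe_call(tool_fn, args: dict, schema: dict) -> dict:
    """Run a tool with validation; return a structured result either way."""
    try:
        validate_args(args, schema)
        return {"ok": True, "result": tool_fn(**args)}
    except Exception as exc:
        # A clear, machine-readable error lets the agent retry or ask for help.
        return {"ok": False, "error": type(exc).__name__, "message": str(exc)}

# Usage with a made-up tool:
result = safe_call(lambda city: f"sunny in {city}", {"city": "Berlin"}, {"city": str})
```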

Model Serving Solutions

These platforms provide various solutions for deploying and serving AI models, from local deployment to cloud-based infrastructure, with different performance and scaling capabilities.

| Platform | Description |
| --- | --- |
| vLLM | High-performance inference engine for LLM serving |
| Ollama | Run and serve open-source LLMs locally |
| LM Studio | Desktop application for running and serving local LLMs |
| SGLang | Fast LLM serving framework with structured generation support |
| Together AI | Platform for deploying and serving large language models |
| Fireworks AI | Infrastructure for serving and fine-tuning LLMs |
| Groq | High-performance LLM inference and serving platform |
| OpenAI | API platform for serving GPT and other AI models |
| Anthropic | Platform for serving Claude and other AI models |
| Mistral AI | Platform for serving efficient and powerful language models |
| Google Gemini | Google's platform for serving multimodal AI models |
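
Several of these servers (vLLM, Together AI, Fireworks AI, Groq, and Ollama, among others) expose OpenAI-compatible endpoints, so a single client often covers them. The base URL and model name below are assumptions for a locally running server and should be adjusted to your setup:

```python
from openai import OpenAI  # pip install openai

# Point the standard client at a local, OpenAI-compatible server.
# The URL and model name are examples; adjust to whatever you are running.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed-locally")

response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",   # assumed model id
    messages=[{"role": "user", "content": "Summarize what an AI agent is in one sentence."}],
)
print(response.choices[0].message.content)
```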

Agent Sandboxes

| Platform | Description |
| --- | --- |
| E2B | Secure sandboxed environments for running and testing AI agents |
| Modal | Cloud platform for running AI agents in isolated environments |

Note: These platforms provide secure, isolated environments for testing and running AI agents, ensuring safe execution and development of agent capabilities.

Agent Storage Solutions

These platforms provide specialized storage solutions for AI applications, including vector databases, embedding storage, and traditional databases optimized for AI workloads.

| Platform | Description |
| --- | --- |
| Chroma | Open-source embedding database for AI applications |
| Qdrant | Vector database for AI-powered search and retrieval |
| Milvus | Open-source vector database for scalable similarity search |
| Pinecone | Vector database optimized for machine learning applications |
| Weaviate | Vector search engine and vector database |
| Neon | Serverless Postgres platform for AI applications |
| Supabase | Open-source Firebase alternative with vector storage capabilities |
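
As a small example of the vector-database pattern, Chroma can be used roughly like this; the collection name and documents are illustrative:

```python
import chromadb  # pip install chromadb

# In-memory client; persistent and client/server modes are also available.
client = chromadb.Client()
collection = client.create_collection("agent_docs")  # illustrative collection name

# Store documents; Chroma embeds them with its default embedding function.
collection.add(
    ids=["doc1", "doc2"],
    documents=["Agents need persistent state.", "Vector stores enable semantic retrieval."],
)

# Retrieve the most similar documents for a query.
results = collection.query(query_texts=["How do agents remember things?"], n_results=1)
print(results["documents"])
```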

Vertical AI Agent Solutions

| Company | Description / Focus Area |
| --- | --- |
| Decagon | AI agents for customer support |
| Sierra | Conversational AI agents for customer experience |
| Replit | Cloud development environment and AI coding tools |
| Perplexity | AI-powered search and discovery |
| Harvey | Legal AI solutions |
| MultiOn | AI agents that take actions on the web on a user's behalf |
| Cognition | AI software engineering agents (Devin) |
| Factory | Autonomous agents for software development workflows |
| All Hands | Open-source software development agents (OpenHands) |
| Dosu | AI agent for GitHub issue triage and repository maintenance |
| Lindy | AI assistants for automating business workflows |
| 11x | AI sales and go-to-market automation agents |

Interesting and Notable Research and Libraries

OpenGPTs enables the creation of agents and assistants using LangChain components.


Agenta-AI provides an end-to-end LLM developer platform. It provides tools for prompt engineering and management, ⚖️ evaluation, human annotation, and 🚀 deployment, all without imposing restrictions on your choice of framework, library, or model.

Jarvis provides essential components for equipping LLM agents with tools; at present it includes ToolBench, HuggingGPT, and EasyTool.

EasyTool: Enhancing LLM-based Agents with Concise Tool Instruction provides a framework for transforming diverse and lengthy tool documentation into unified, concise tool instructions for easier tool usage.

Development: EasyTool follows a simple pattern of 1. Task Planning, 2. Tool Retrieval, 3. Tool Selection, and 4. Tool Execution, coupled with thoughtful prompting to enable state-of-the-art tool usage across multiple models.

Problem: Using new tools and software can be challenging for LLMs (and for people too!), especially when documentation is poor or redundant and usage conventions vary.

Solution: EasyTool provides "a simple method to condense tool documentation into more concise and effective tool instructions."

     I: Tool Description Generation
     /* Task prompt */
     Your task is to create a concise and effective tool usage description based on the tool documentation. You should ensure the description only contains the purposes of the tool without irrelevant information. Here is an example:
     /* Examples */
     {Tool Documentation}
     Tool usage description:
     {Tool_name} is a tool that can {General_Purposes}.
     This tool has {Number} multiple built-in functions:
     1. {Function_1} is to {Functionality_of_Function_1}
     2. {Function_2} is to ...
     /* Auto-generation of tool description */
     {Tool Documentation of ‘Aviation Weather Center’}
     Tool usage description:
     ‘Aviation Weather Center’ is a tool which can provide official aviation weather data...

     II: Tool Function Guidelines Construction
     /* Task prompt */
     Your task is to create the scenario that will use the tool.
     1. You are given a tool with its purpose and its parameters list. The scenario should adopt the parameters in the list.
     2. If the parameters are null, you should set: {"Scenario": XX, "Parameters":{}}.
     Here is an example:
     /* Examples */
     {Tool_name} is a tool that can {General_Purposes}. {Function_i} is to {Functionality_of_Function_i} {Parameter List of Function_i}
     One scenario for {Function_i} of {Tool_name} is: {"Scenario": XX, "Parameters":{XX:XX}}
     /* Auto-construction for Tool Function Guidelines */
     ‘Ebay’ can get products from Ebay in a specific country. ‘Product Details’ in ‘Ebay’ can get the product details for a given product id and a specific country.
     {Parameter List of ‘Product Details’}
     One scenario for ‘Product Details’ of ‘Ebay’ is:
     {"Scenario": "if you want to know the details of the product with product ID 1954 in Germany from Ebay", "Parameters":{"product_id": 1954, "country": "Germany"}}.

Results: Performance is state-of-the-art across multiple models: ChatGPT, ToolLLaMA-7B, Vicuna-7B, Mistral-Instruct-7B, and GPT-4.

HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face

Development

HuggingGPT enables LLMs to call other, task-specific models hosted on the Hugging Face Hub.

Problem

LLMs are not the best fit for every task; letting them delegate to task-specific models can improve the quality of the results.

Solution: HuggingGPT provides an interface for LLMs that breaks a task down into 1. Task Planning, 2. Model Selection, 3. Task Execution, and 4. Response Generation.


Results: The results provide substantial evidence that HuggingGPT can successfully handle single, sequential, and graph-based tasks.

Automatically Building Agents

Automated Design of Agentic Systems

Development: In their paper, the authors present meta-agents that observe and critique prompts and agent designs in order to discover better agents. In their own words:

"The core concept of Meta Agent Search is to instruct a meta agent to iteratively create interestingly new agents, evaluate them, add them to an archive that stores discovered agents, and use this archive to help the meta agent in subsequent iterations create yet more interestingly new agents." image image

     You are an expert machine learning researcher testing different agentic systems.
     [Brief Description of the Domain]
     [Framework Code]
     [Output Instructions and Examples]
     [Discovered Agent Archive] (initialized with baselines, updated at every iteration)
     # Your task
     You are deeply familiar with prompting techniques and the agent works from the literature. Your goal is to maximize the performance by proposing interestingly new agents ......
     Use the knowledge from the archive and inspiration from academic literature to propose the next interesting agentic system design.

Resources

Awesome Agents