
The AI Agent Stack

The Stack

The modern AI agent stack can be broken down into several key layers, each addressing specific challenges in agent development. The diagram below maps these layers, and the final section lists some examples of successful Vertical AI Agent Solutions.

```mermaid
flowchart LR
    subgraph Infrastructure["Infrastructure Layer"]
        direction TB
        Mem[Memory Solutions]
        Model[Model Serving]
    end

    subgraph Development["Development Layer"]
        direction TB
        Frame[Agent Frameworks]
        Tools[Tool Libraries]
        Sand[Agent Sandboxes]
    end

    subgraph Application["Application Layer"]
        direction TB
        VA[User Interface UI/UX]
        Host[Agent Hosting & Serving]
        Obs[Observability Solutions]
    end

    %% Node styles
    classDef appLayer fill:#ff9e64,stroke:#333,stroke-width:2px
    classDef devLayer fill:#7aa2f7,stroke:#333,stroke-width:2px
    classDef infraLayer fill:#9ece6a,stroke:#333,stroke-width:2px
    classDef subgraphStyle fill:transparent,stroke-width:2px,stroke:#666

    %% Apply styles to nodes
    class VA,Host,Obs appLayer
    class Frame,Tools,Sand devLayer
    class Mem,Model infraLayer
    class Application,Development,Infrastructure subgraphStyle

    %% Add clickable links
    click Mem "#agent-memory-solutions" "Memory Solutions"
    click Model "#model-serving-solutions" "Model Serving"
    click Frame "#agent-frameworks" "Agent Frameworks"
    click Tools "#tool-libraries" "Tool Libraries"
    click Sand "#agent-sandboxes" "Agent Sandboxes"
    click VA "../building_applications/front_end/index.html" "User Interface"
    click Host "#agent-hosting--serving-solutions" "Hosting & Serving"
    click Obs "#agent-observability-solutions" "Observability"
```

Infrastructure Layer

Model Serving Solutions

These platforms provide various solutions for deploying and serving AI models, from local deployment to cloud-based infrastructure, with different performance and scaling capabilities.

| Platform | Description |
| --- | --- |
| vLLM | High-performance inference engine for LLM serving |
| AIBrix | Cost-efficient and pluggable infrastructure components for GenAI inference, with features like high-density LoRA management, LLM gateway/routing, and distributed inference |
| Ollama | Run and serve open-source LLMs locally |
| LM Studio | Desktop application for running and serving local LLMs |
| Together AI | Platform for deploying and serving large language models |
| Fireworks AI | Infrastructure for serving and fine-tuning LLMs |
| Groq | High-performance LLM inference and serving platform |
| OpenAI | API platform for serving GPT and other AI models |
| Anthropic | Platform for serving Claude and other AI models |
| Mistral AI | Platform for serving efficient and powerful language models |
| Google Gemini | Google's platform for serving multimodal AI models |

LLM Serving Considerations:

  • Performance and scaling capabilities
  • Cost and resource optimization
  • Security and compliance features
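One practical upshot of this landscape: most of the platforms above expose an OpenAI-compatible chat-completions HTTP API, so switching providers is often just a base-URL and model-name change. A minimal sketch of assembling such a request (the base URL and model name below are placeholder assumptions, and actually sending the request is left out so the sketch stays network-free):

```python
import json

def build_chat_request(base_url: str, model: str, user_message: str,
                       temperature: float = 0.0) -> dict:
    """Assemble an OpenAI-style /chat/completions request.

    Returns the target URL and JSON body; sending it (e.g. with
    `requests` or the `openai` client) is intentionally omitted.
    """
    return {
        "url": f"{base_url.rstrip('/')}/chat/completions",
        "body": {
            "model": model,
            "messages": [{"role": "user", "content": user_message}],
            "temperature": temperature,
        },
    }

# The same payload shape works against vLLM, Ollama, Together AI, etc. --
# only base_url and model change.
req = build_chat_request("http://localhost:8000/v1", "my-local-model", "Hello")
print(json.dumps(req["body"], indent=2))
```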

Agent Memory Solutions

Agent memory is typically built on vector databases, but platform solutions make it easier to manage, enabling long-term context retention and efficient memory management for AI applications.

| Platform | Description |
| --- | --- |
| Letta | Extends effective LLM context windows through automated memory management |
| Zep | Long-term memory store for LLM applications and agents |
| LangMem | LangChain's memory management system for conversational agents |
| Mem0 | Memory management and persistence solution for AI assistants and agents |

Memory architecture considerations:

  • Persistence strategies
  • Context window optimization
  • Memory retrieval mechanisms
  • Integration with vector stores
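The retrieval mechanism at the heart of these systems can be sketched in a few lines: store snippets with an embedding, then pull the most similar ones back into the context window. The bag-of-words "embedding" below is a deliberately toy stand-in; a real system would call an embedding model and a vector database:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' -- a stand-in for a real
    embedding model backed by a vector store."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class AgentMemory:
    """Minimal long-term memory: store snippets, retrieve the most
    similar ones to fold back into the LLM context window."""
    def __init__(self):
        self.items = []  # list of (text, embedding) pairs

    def add(self, text: str) -> None:
        self.items.append((text, embed(text)))

    def retrieve(self, query: str, k: int = 2) -> list:
        q = embed(query)
        ranked = sorted(self.items, key=lambda it: cosine(q, it[1]), reverse=True)
        return [text for text, _ in ranked[:k]]

mem = AgentMemory()
mem.add("user prefers metric units")
mem.add("user is based in Berlin")
mem.add("project deadline is Friday")
print(mem.retrieve("what units does the user like", k=1))
```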

Development Layer

Agent Frameworks

These frameworks provide different approaches and tools for building AI agents, from simple single-agent systems to complex multi-agent orchestrations. Each has its own strengths and specialized use cases.

| Framework | Description |
| --- | --- |
| PydanticAI | Built by the Pydantic team, offering a Python-centric design for building production-grade AI applications with type safety and structured responses |
| LangGraph | LangChain's framework for building structured agents using computational graphs |
| Letta | Framework for building and deploying AI agents with built-in orchestration |
| Open Hands | Open-source platform for collaborative AI software development agents |
| AutoGen | Microsoft's framework for building multi-agent systems with automated agent orchestration |
| LlamaIndex | Framework for building RAG-enabled agents and LLM applications |
| CrewAI | Framework for orchestrating role-playing autonomous AI agents |
| DSPy | Stanford's framework for programming with foundation models |
| Phidata | AI-first development framework for building production-ready AI applications |
| Semantic Kernel | Microsoft's orchestration framework for LLMs |
| AutoGPT | Framework for building autonomous AI agents with GPT-4 |
| L3AGI | Open-source tool that enables AI assistants to collaborate together as effectively as human teams |
| Open GPTs | Provides an experience similar to OpenAI's GPTs and Assistants, built on LangChain components |
| CAMEL | Communicative Agents for "Mind" Exploration of Large-Scale Language Model Society |

Framework selection considerations:

  • State Management: How agent state is serialized and persisted
  • Context Window Management: How data is compiled into LLM context
  • Multi-Agent Communication: Support for agent collaboration
  • Memory Handling: Techniques for managing long-term memory
  • Model Support: Compatibility with open-source models
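Whatever framework you pick, most implement some variant of the same control loop: the model either calls a tool or emits a final answer, and the framework manages the state in between. A framework-agnostic sketch, with a hard-coded `llm_step` standing in for a real model call (the calculator tool and step policy are illustrative assumptions):

```python
def llm_step(state):
    """Placeholder policy: if a tool result is already in state,
    finish; otherwise request the calculator tool. A real agent
    would get this decision from an LLM."""
    if "tool_result" in state:
        return {"type": "final", "content": f"The answer is {state['tool_result']}"}
    return {"type": "tool_call", "tool": "calculator", "args": {"expr": "6*7"}}

# Tool registry: name -> callable.
TOOLS = {"calculator": lambda args: eval(args["expr"], {"__builtins__": {}})}

def run_agent(task, max_steps=5):
    state = {"task": task, "history": []}  # serialized state = persistence point
    for _ in range(max_steps):
        action = llm_step(state)
        state["history"].append(action)    # context-window management lives here
        if action["type"] == "final":
            return action["content"]
        state["tool_result"] = TOOLS[action["tool"]](action["args"])
    return "step limit reached"

print(run_agent("what is 6 times 7?"))  # -> The answer is 42
```

The considerations above map directly onto this loop: state management is how `state` is serialized, context-window management is what subset of `history` reaches the model, and memory handling is what persists across `run_agent` calls.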

Tool Libraries

Tools can be categorized into three main types:

  1. Knowledge Augmentation

    • Text retrievers
    • Image retrievers
    • Web browsers
    • SQL executors
    • Internal knowledge base access
    • API integrations (news, weather, stocks)
  2. Capability Extension

    • Calculators
    • Code interpreters
    • Calendar tools
    • Unit converters
    • Language translators
    • Multimodal converters (text-to-image, speech-to-text)
  3. Write Actions

    • Database modifications
    • Email sending
    • File system operations
    • API calls with side effects
    • Transaction processing
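A practical consequence of this taxonomy is that write actions deserve extra guardrails, since they have side effects the agent cannot undo. A sketch of a registry that tags each tool with its category and gates write actions behind explicit confirmation (the tool names and gating policy are illustrative assumptions):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Tool:
    name: str
    category: str  # "knowledge" | "capability" | "write"
    fn: Callable

REGISTRY = {}

def register(tool: Tool) -> None:
    REGISTRY[tool.name] = tool

def invoke(name: str, *args, confirmed: bool = False):
    tool = REGISTRY[name]
    # Write actions have side effects, so require explicit confirmation.
    if tool.category == "write" and not confirmed:
        raise PermissionError(f"{name} is a write action; confirmation required")
    return tool.fn(*args)

register(Tool("calculator", "capability",
              lambda expr: eval(expr, {"__builtins__": {}})))
register(Tool("send_email", "write",
              lambda to, body: f"sent to {to}"))  # stand-in side effect

print(invoke("calculator", "2+3"))                          # safe: no side effects
print(invoke("send_email", "a@b.c", "hi", confirmed=True))  # gated write action
```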

These libraries provide specialized tools and capabilities that can be integrated into AI agents to enhance their ability to interact with various systems and perform specific tasks.

| Library | Description |
| --- | --- |
| Composio | Tool composition and orchestration library for AI agents |
| Browserbase | Browser automation and web interaction tools for AI agents |
| Exa | AI-powered search and knowledge tools library |
| Model Context Protocol (MCP) | A protocol for enabling LLMs to use tools |

Tool Integration Protocols

A key challenge in building agents is standardizing how they interact with tools. Several protocols have emerged to address this:

Model Context Protocol (MCP)

Model Context Protocol provides a standardized way for LLMs to interact with tools and external systems. Key features include:

  1. Resource Management

    • Structured exposure of external resources
    • Schema definitions for data access
    • Standardized resource querying
  2. Tool Definitions

    • Common format for tool specifications
    • Input/output validation
    • Error handling patterns
  3. Prompt Templates

    • Standardized prompt formats
    • Context management
    • Response handling
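To make the tool-definition piece concrete, this is the general shape of a single tool entry as an MCP server advertises it: a name, a human-readable description, and a JSON Schema (`inputSchema`) describing the accepted arguments. The weather tool itself is a made-up example, not part of any real server:

```python
import json

# Shape of one tool entry in an MCP server's tool listing:
# name + description + JSON Schema for the arguments.
weather_tool = {
    "name": "get_forecast",  # hypothetical example tool
    "description": "Get a weather forecast for a city",
    "inputSchema": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name"},
            "days": {"type": "integer", "minimum": 1, "maximum": 7},
        },
        "required": ["city"],
    },
}

print(json.dumps(weather_tool, indent=2))
```

Because the argument schema is machine-readable, both the client and the server can validate inputs before anything executes, which is where the input/output validation and error-handling patterns above attach.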
Other Tool Integration Standards

| Protocol | Description |
| --- | --- |
| OpenAI Function Calling | JSON Schema-based function definitions |
| LangChain Tools | Tool specification format for LangChain agents |
| Semantic Kernel Skills | Microsoft's approach to defining reusable AI capabilities |

Best Practices for Tool Integration

When implementing tool protocols:

  1. Security Considerations

    • Validate all inputs before execution
    • Implement proper access controls
    • Monitor tool usage and rate limits
  2. Error Handling

    • Graceful failure modes
    • Clear error messages
    • Recovery strategies
  3. Documentation

    • Clear tool specifications
    • Usage examples
    • Integration guides
  4. Testing

    • Tool validation
    • Integration testing
    • Performance monitoring

Agent Sandboxes

Agent sandboxes provide secure, isolated environments for running and testing AI agents, serving as a critical layer of security and control between agents and the systems they interact with.

| Platform | Description | Key Features |
| --- | --- | --- |
| E2B | Secure sandboxed environments for running and testing AI agents | Secure code execution, real-time monitoring, API integration, custom runtime environments |
| Modal | Cloud platform for running AI agents in isolated environments | Serverless execution, GPU support, automatic scaling, container orchestration |
| Docker | Container platform that can be used for agent sandboxing | Custom environments, resource isolation, portable deployments |

Note: These platforms ensure safe execution and development of agent capabilities while maintaining system security and stability. The choice of sandbox solution should align with your specific security requirements, development workflow, and operational needs.

Core Capabilities
  1. Security & Isolation

    • Containerized environments for safe code execution
    • Resource usage limits and quotas
    • Network access controls and API restrictions
    • File system and process isolation
    • Principle of least privilege enforcement
  2. Development & Operations

    • Rapid prototyping and testing
    • Reproducible environments
    • Version control integration
    • Resource monitoring and optimization
    • Performance profiling and debugging
    • Cost management and scaling
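The weakest form of this isolation, running untrusted code in a separate process with a hard timeout, can be sketched in a few lines. Real sandboxes (E2B, Modal, a locked-down Docker container) add filesystem, network, and resource isolation on top of what this shows:

```python
import subprocess
import sys

def run_untrusted(code: str, timeout: float = 5.0) -> str:
    """Run a Python snippet in a separate interpreter process with a
    hard timeout. Process separation alone is NOT a real sandbox --
    a production setup would also drop filesystem and network access,
    e.g. `docker run --rm --network none` around the same command."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True, text=True, timeout=timeout,
    )
    if result.returncode != 0:
        raise RuntimeError(result.stderr.strip())
    return result.stdout.strip()

print(run_untrusted("print(2 + 2)"))  # -> 4
```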

Application Layer

Agent Hosting & Serving Solutions

| Platform | Description |
| --- | --- |
| Letta | Agent deployment and hosting platform |
| LangGraph | Graph-based orchestration for language model agents |
| Assistants API | OpenAI's API for deploying and managing AI assistants |
| Amazon Bedrock Agents | AWS-based agent hosting and management service |
| LiveKit Agents | Real-time agent deployment and communication platform |
| CopilotKit | Framework for building and deploying AI copilots with multi-agent support |

These platforms provide infrastructure and tools for deploying, hosting, and serving AI agents at scale, each with different specializations and integration capabilities.

Additional considerations for hosting solutions:

  • Scalability and performance requirements
  • Integration capabilities with existing systems
  • Cost and resource optimization
  • Security and compliance features

For more on multi-agent copilots, see https://docs.copilotkit.ai/coagents.

Agent Observability Solutions

These platforms provide specialized tools for monitoring, debugging, and analyzing the performance of AI agents and LLM applications in production environments.

| Platform | Description |
| --- | --- |
| LangSmith | LangChain's platform for debugging, testing, evaluating, and monitoring LLM applications and agents |
| Arize | ML observability platform with LLM monitoring capabilities |
| Weave | AI observability and monitoring platform |
| Langfuse | Open-source LLM engineering platform for monitoring and analytics |
| AgentOps.ai | Specialized platform for monitoring and optimizing AI agents |
| Braintrust | LLM evaluation and monitoring platform |

Key observability features to consider:

  • Real-time monitoring and alerting
  • Performance analytics and tracing
  • Debug tooling and replay capabilities
  • Cost tracking and optimization
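The core of what these platforms capture, per-call spans with name, latency, and outcome, can be sketched as a tracing decorator. The in-memory `TRACES` list and the fake model call are stand-ins; a real integration would ship spans to one of the platforms above:

```python
import functools
import time

TRACES = []  # stand-in for an observability backend

def traced(fn):
    """Record name, latency, and success/failure for each call --
    the raw data behind tracing, alerting, and cost dashboards."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            out = fn(*args, **kwargs)
            status = "ok"
            return out
        except Exception:
            status = "error"
            raise
        finally:
            TRACES.append({
                "span": fn.__name__,
                "latency_s": time.perf_counter() - start,
                "status": status,
            })
    return wrapper

@traced
def fake_llm_call(prompt: str) -> str:
    return prompt.upper()  # stand-in for a real model call

fake_llm_call("hello")
print(TRACES[0]["span"], TRACES[0]["status"])  # -> fake_llm_call ok
```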

Front end

Several solutions exist for building and deploying AI agent front-ends, ranging from development tools to complete frameworks:

| Platform | Description |
| --- | --- |
| Pyspur | Visual development environment for building AI agents and applications |
| LangGraph Studio | Visual interface for building and deploying LangGraph agents |
| CopilotKit | Open-source multi-agent chat interface with Next.js integration |
| Streamlit | Fast way to build and share data/ML/AI apps |
| Gradio | UI library for deploying ML/AI models with easy-to-build interfaces |
| Chainlit | Building Python LLM apps with chat interfaces |
| LlamaIndex UI | React components for building LlamaIndex applications |

Key considerations for front-end solutions:

  • Ease of development and deployment
  • Component reusability
  • Real-time chat capabilities
  • Multi-agent support
  • Integration with backend services
  • Customization options
  • Mobile responsiveness

Vertical AI Agent Solutions

| Company | Description/Focus Area |
| --- | --- |
| Decagon | AI agents for customer support |
| Sierra | Conversational AI agents for customer experience |
| Replit | Cloud development environment and AI coding tools |
| Perplexity | AI-powered search and discovery |
| Harvey | Legal AI solutions |
| Please AI | Multi-agent systems and orchestration |
| Cognition | AI software engineering agents (maker of Devin) |
| Factory | AI agents for automating software development |
| Dosu | AI code-maintenance agent and GitHub plugin |
| Lindy AI | Automated emailing and scheduling |
| 11x | Digital human workers |