The AI Agent Stack
The Stack¶
The modern AI agent stack can be broken down into several key layers, each addressing specific challenges in agent development, rounded out by some notable examples of successful Vertical AI Agent Solutions:
```mermaid
flowchart LR
    subgraph Infrastructure["Infrastructure Layer"]
        direction TB
        Mem[Memory Solutions]
        Model[Model Serving]
    end

    subgraph Development["Development Layer"]
        direction TB
        Frame[Agent Frameworks]
        Tools[Tool Libraries]
        Sand[Agent Sandboxes]
    end

    subgraph Application["Application Layer"]
        direction TB
        VA[User Interface UI/UX]
        Host[Agent Hosting & Serving]
        Obs[Observability Solutions]
    end

    %% Node styles
    classDef appLayer fill:#ff9e64,stroke:#333,stroke-width:2px
    classDef devLayer fill:#7aa2f7,stroke:#333,stroke-width:2px
    classDef infraLayer fill:#9ece6a,stroke:#333,stroke-width:2px
    classDef subgraphStyle fill:transparent,stroke-width:2px,stroke:#666

    %% Apply styles to nodes
    class VA,Host,Obs appLayer
    class Frame,Tools,Sand devLayer
    class Mem,Model infraLayer
    class Application,Development,Infrastructure subgraphStyle

    %% Add clickable links
    click Mem "#agent-memory-solutions" "Memory Solutions"
    click Model "#model-serving-solutions" "Model Serving"
    click Frame "#agent-frameworks" "Agent Frameworks"
    click Tools "#tool-libraries" "Tool Libraries"
    click Sand "#agent-sandboxes" "Agent Sandboxes"
    click VA "../building_applications/front_end/index.html" "User Interface"
    click Host "#agent-hosting--serving-solutions" "Hosting & Serving"
    click Obs "#agent-observability-solutions" "Observability"
```
Infrastructure Layer¶
Model Serving Solutions¶
These platforms provide various solutions for deploying and serving AI models, from local deployment to cloud-based infrastructure, with different performance and scaling capabilities.
Platform | Description |
---|---|
vLLM | High-performance inference engine for LLM serving |
AIBrix | Cost-efficient and pluggable infrastructure components for GenAI inference with features like high-density LoRA management, LLM gateway/routing, and distributed inference |
Ollama | Run and serve open-source LLMs locally |
LM Studio | Desktop application for running and serving local LLMs |
Together AI | Platform for deploying and serving large language models |
Fireworks AI | Infrastructure for serving and fine-tuning LLMs |
Groq | High-performance LLM inference and serving platform |
OpenAI | API platform for serving GPT and other AI models |
Anthropic | Platform for serving Claude and other AI models |
Mistral AI | Platform for serving efficient and powerful language models |
Google Gemini | Google's platform for serving multimodal AI models |
LLM Serving Considerations:
- Performance and scaling capabilities
- Cost and resource optimization
- Security and compliance features
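As a sketch of how a served model is consumed, most of the platforms above (vLLM, Ollama, Together AI, and others) expose an OpenAI-compatible chat completions endpoint. The base URL and model name below are placeholders, not real deployments:

```python
import json
import urllib.request

def build_chat_request(base_url: str, model: str, messages: list[dict]) -> urllib.request.Request:
    """Build an OpenAI-compatible /v1/chat/completions request.

    Works against any server exposing the OpenAI chat API surface
    (vLLM, Ollama, Together AI, ...); base_url and model are assumptions.
    """
    payload = {"model": model, "messages": messages, "temperature": 0.2}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Example: point at a locally served model (e.g. started with
# `vllm serve <model>` or `ollama serve`); URL and model are placeholders.
req = build_chat_request(
    "http://localhost:8000",
    "my-local-model",
    [{"role": "user", "content": "Hello!"}],
)
# resp = urllib.request.urlopen(req)  # uncomment against a live server
```

Because the request shape is shared across providers, swapping serving backends usually only changes the base URL, model name, and authentication header.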
Agent Memory Solutions¶
Agent memory is typically built on vector databases, but dedicated platform solutions make it easier to manage, enabling long-term context retention and efficient memory management for AI applications.
Platform | Description |
---|---|
Letta | System for extending LLM context windows with self-managed, effectively unbounded memory |
Zep | Long-term memory store for LLM applications and agents |
LangMem | LangChain's memory management system for conversational agents |
Mem0 | Memory management and persistence solution for AI assistants and agents |
Memory architecture considerations:
- Persistence strategies
- Context window optimization
- Memory retrieval mechanisms
- Integration with vector stores
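To make the retrieval side of agent memory concrete, here is a minimal sketch that stores facts and retrieves the most relevant one by cosine similarity. It uses a toy bag-of-words "embedding"; the platforms above use learned embeddings backed by a vector store:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; real memory systems (Zep, Mem0, Letta)
    # use learned embeddings stored in a vector database.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class AgentMemory:
    """Minimal long-term memory: store facts, retrieve the most relevant."""

    def __init__(self):
        self._items: list[tuple[str, Counter]] = []

    def add(self, fact: str) -> None:
        self._items.append((fact, embed(fact)))

    def retrieve(self, query: str, k: int = 2) -> list[str]:
        q = embed(query)
        ranked = sorted(self._items, key=lambda it: cosine(q, it[1]), reverse=True)
        return [fact for fact, _ in ranked[:k]]

mem = AgentMemory()
mem.add("The user prefers Python for scripting")
mem.add("The user's deployment target is AWS")
mem.add("The user dislikes verbose logging")
print(mem.retrieve("python scripting", k=1))
# → ['The user prefers Python for scripting']
```

The persistence, context-window, and vector-store considerations listed above are exactly the parts this toy version omits.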
Development Layer¶
Agent Frameworks¶
These frameworks provide different approaches and tools for building AI agents, from simple single-agent systems to complex multi-agent orchestrations. Each has its own strengths and specialized use cases.
Framework | Description |
---|---|
PydanticAI | Built by the Pydantic team, offering a Python-centric design for building production-grade AI applications with type safety and structured responses |
LangGraph | LangChain's framework for building structured agents using computational graphs |
Letta | Framework for building and deploying AI agents with built-in orchestration |
Open Hands | Open-source platform for AI software development agents |
AutoGen | Microsoft's framework for building multi-agent systems with automated agent orchestration |
LlamaIndex | Framework for building RAG-enabled agents and LLM applications |
CrewAI | Framework for orchestrating role-playing autonomous AI agents |
DSPy | Stanford's framework for programming with foundation models |
Phidata | AI-first development framework for building production-ready AI applications |
Semantic Kernel | Microsoft's orchestration framework for LLMs |
AutoGPT | Framework for building autonomous AI agents with GPT-4 |
L3AGI | Open-source tool that enables AI assistants to collaborate as effectively as human teams |
Open GPTs | Provides a similar experience to OpenAI's GPTs and Assistants, built on LangChain components |
CAMEL | Communicative Agents for "Mind" Exploration of Large Scale Language Model Society |
Framework selection considerations:
- State Management: How agent state is serialized and persisted
- Context Window Management: How data is compiled into LLM context
- Multi-Agent Communication: Support for agent collaboration
- Memory Handling: Techniques for managing long-term memory
- Model Support: Compatibility with open-source models
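These considerations are easiest to see in the loop that every framework implements in some form: the model either proposes a tool call or a final answer, and tool results are appended back into the context. A minimal sketch with a stubbed model standing in for a real LLM call:

```python
# Registry of tools the agent may call; the calculator is illustrative
# (eval on model output is unsafe outside a toy example).
TOOLS = {
    "calculator": lambda expression: str(eval(expression, {"__builtins__": {}})),
}

def fake_model(messages: list[dict]) -> dict:
    """Stand-in for an LLM call. A real framework (LangGraph, AutoGen, ...)
    would send `messages` to a model and parse a tool call from its reply."""
    last = messages[-1]["content"]
    if last.startswith("What is"):
        return {"tool": "calculator", "args": {"expression": "2 + 2"}}
    return {"answer": f"The result is {last}"}

def run_agent(user_query: str, max_steps: int = 5) -> str:
    messages = [{"role": "user", "content": user_query}]
    for _ in range(max_steps):
        decision = fake_model(messages)
        if "answer" in decision:
            return decision["answer"]
        # Execute the proposed tool call and feed the result back.
        result = TOOLS[decision["tool"]](**decision["args"])
        messages.append({"role": "tool", "content": result})
    return "step limit reached"

print(run_agent("What is 2 + 2?"))  # → The result is 4
```

State management, context-window management, and memory handling are all about what goes into (and stays out of) that `messages` list between steps.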
Tool Libraries¶
Tools can be categorized into three main types:

- Knowledge Augmentation
    - Text retrievers
    - Image retrievers
    - Web browsers
    - SQL executors
    - Internal knowledge base access
    - API integrations (news, weather, stocks)
- Capability Extension
    - Calculators
    - Code interpreters
    - Calendar tools
    - Unit converters
    - Language translators
    - Multimodal converters (text-to-image, speech-to-text)
- Write Actions
    - Database modifications
    - Email sending
    - File system operations
    - API calls with side effects
    - Transaction processing
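One way to make the three categories operational is to tag each tool with its category and gate write actions behind explicit approval. The tool names and behaviors below are illustrative, not from any particular library:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Tool:
    name: str
    category: str                      # "knowledge" | "capability" | "write"
    fn: Callable[..., str]
    requires_approval: bool = False    # write actions should be gated

# Illustrative tools, one per category; all names are assumptions.
registry = {
    "search_docs": Tool("search_docs", "knowledge", lambda q: f"results for {q!r}"),
    "convert_units": Tool("convert_units", "capability", lambda km: f"{km * 0.621371:.2f} mi"),
    "send_email": Tool("send_email", "write", lambda to: f"sent to {to}", requires_approval=True),
}

def invoke(name: str, approved: bool = False, **kwargs) -> str:
    tool = registry[name]
    if tool.requires_approval and not approved:
        raise PermissionError(f"{name} is a write action and needs approval")
    return tool.fn(**kwargs)

print(invoke("convert_units", km=10))  # read-only: runs immediately → 6.21 mi
```

Read-only knowledge and capability tools can run freely; anything with side effects goes through the approval gate.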
These libraries provide specialized tools and capabilities that can be integrated into AI agents to enhance their ability to interact with various systems and perform specific tasks.
Library | Description |
---|---|
Composio | Tool composition and orchestration library for AI agents |
Browserbase | Browser automation and web interaction tools for AI agents |
Exa | AI-powered search and knowledge tools library |
Model Context Protocol (MCP) | A protocol for enabling LLMs to use tools |
Tool Integration Protocols¶
A key challenge in building agents is standardizing how they interact with tools. Several protocols have emerged to address this need:
Model Context Protocol (MCP)¶
Model Context Protocol provides a standardized way for LLMs to interact with tools and external systems. Key features include:
- Resource Management
    - Structured exposure of external resources
    - Schema definitions for data access
    - Standardized resource querying
- Tool Definitions
    - Common format for tool specifications
    - Input/output validation
    - Error handling patterns
- Prompt Templates
    - Standardized prompt formats
    - Context management
    - Response handling
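As an illustration of the tool-definition piece, MCP servers advertise tools with a name, a description, and a JSON Schema for their inputs. The `get_weather` tool below is hypothetical, and the required-field check is a minimal stand-in for a full JSON Schema validator:

```python
# Illustrative tool definition in the shape MCP servers advertise
# (name, description, JSON Schema input); the values are assumptions
# for this sketch, not a real server's output.
tool_definition = {
    "name": "get_weather",
    "description": "Fetch current weather for a city",
    "inputSchema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

def validate_call(tool: dict, arguments: dict) -> list[str]:
    """Return the required fields missing from a proposed call
    (a full client would run a real JSON Schema validation)."""
    schema = tool["inputSchema"]
    return [f for f in schema.get("required", []) if f not in arguments]

print(validate_call(tool_definition, {}))                # → ['city']
print(validate_call(tool_definition, {"city": "Oslo"}))  # → []
```

Because the schema travels with the tool, any MCP-aware client can validate calls before execution without tool-specific code.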
Other Tool Integration Standards¶
Protocol | Description |
---|---|
OpenAI Function Calling | JSON Schema-based function definitions |
LangChain Tools | Tool specification format for LangChain agents |
Semantic Kernel Skills | Microsoft's approach to defining reusable AI capabilities |
Best Practices for Tool Integration¶
When implementing tool protocols:
- Security Considerations
    - Validate all inputs before execution
    - Implement proper access controls
    - Monitor tool usage and rate limits
- Error Handling
    - Graceful failure modes
    - Clear error messages
    - Recovery strategies
- Documentation
    - Clear tool specifications
    - Usage examples
    - Integration guides
- Testing
    - Tool validation
    - Integration testing
    - Performance monitoring
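Several of these practices can live in a single wrapper applied to every tool: structured errors instead of raw exceptions, plus a simple rate limit. The thresholds and return shape below are illustrative:

```python
import time
from functools import wraps

def guarded(max_calls_per_minute: int = 30):
    """Wrap a tool with graceful errors and a sliding-window rate limit
    (illustrative thresholds; a production system would also log usage)."""
    def decorator(fn):
        calls: list[float] = []

        @wraps(fn)
        def wrapper(**kwargs):
            now = time.monotonic()
            calls[:] = [t for t in calls if now - t < 60]  # keep last minute
            if len(calls) >= max_calls_per_minute:
                return {"ok": False, "error": "rate limit exceeded"}
            calls.append(now)
            try:
                return {"ok": True, "result": fn(**kwargs)}
            except Exception as exc:  # graceful failure mode, clear message
                return {"ok": False, "error": f"{type(exc).__name__}: {exc}"}
        return wrapper
    return decorator

@guarded(max_calls_per_minute=2)
def divide(a: float, b: float) -> float:
    return a / b

print(divide(a=6, b=3))  # → {'ok': True, 'result': 2.0}
print(divide(a=1, b=0))  # → {'ok': False, 'error': 'ZeroDivisionError: division by zero'}
print(divide(a=1, b=1))  # → {'ok': False, 'error': 'rate limit exceeded'}
```

Returning a uniform `{ok, result/error}` envelope lets the agent reason about failures instead of crashing mid-run.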
Agent Sandboxes¶
Agent sandboxes are crucial components in the AI agent development stack, providing secure and isolated environments for running and testing AI agents. They serve as a critical layer of security and control between AI agents and the systems they interact with.
Platform | Description | Key Features |
---|---|---|
E2B | Secure sandboxed environments for running and testing AI agents | • Secure code execution • Real-time monitoring • API integration • Custom runtime environments |
Modal | Cloud platform for running AI agents in isolated environments | • Serverless execution • GPU support • Automatic scaling • Container orchestration |
Docker | Container platform that can be used for agent sandboxing | • Custom environments • Resource isolation • Portable deployments |
Note: These platforms ensure safe execution and development of agent capabilities while maintaining system security and stability. The choice of sandbox solution should align with your specific security requirements, development workflow, and operational needs.
Core Capabilities¶
- Security & Isolation
    - Containerized environments for safe code execution
    - Resource usage limits and quotas
    - Network access controls and API restrictions
    - File system and process isolation
    - Principle of least privilege enforcement
- Development & Operations
    - Rapid prototyping and testing
    - Reproducible environments
    - Version control integration
    - Resource monitoring and optimization
    - Performance profiling and debugging
    - Cost management and scaling
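At its simplest, the isolation idea can be sketched by running agent-generated code in a separate interpreter process with a timeout. Unlike the platforms above, this toy version isolates only the parent process, not the file system or network, so treat it as illustrative:

```python
import subprocess
import sys

def run_sandboxed(code: str, timeout_s: float = 5.0) -> dict:
    """Run agent-generated code in a separate Python process with a
    wall-clock timeout. A toy stand-in for a real sandbox (E2B, Modal,
    Docker): no file-system or network isolation is provided here.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-I", "-c", code],  # -I: isolated mode, no user site
            capture_output=True, text=True, timeout=timeout_s,
        )
        return {"stdout": proc.stdout, "stderr": proc.stderr, "code": proc.returncode}
    except subprocess.TimeoutExpired:
        return {"stdout": "", "stderr": "timed out", "code": -1}

print(run_sandboxed("print(21 * 2)")["stdout"].strip())  # → 42
```

Real sandboxes add the missing layers: containerized file systems, network policies, and resource quotas enforced below the process level.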
Application Layer¶
Agent Hosting & Serving Solutions¶
Platform | Description |
---|---|
Letta | Agent deployment and hosting platform |
LangGraph | Graph-based orchestration for language model agents |
Assistants API | OpenAI's API for deploying and managing AI assistants |
Amazon Bedrock Agents | AWS-based agent hosting and management service |
LiveKit Agents | Real-time agent deployment and communication platform |
CopilotKit | Framework for building and deploying AI copilots with multi-agent support |
These platforms provide infrastructure and tools for deploying, hosting, and serving AI agents at scale, each with different specializations and integration capabilities.
Additional considerations for hosting solutions:
- Scalability and performance requirements
- Integration capabilities with existing systems
- Cost and resource optimization
- Security and compliance features
See the CopilotKit CoAgents documentation for an example of multi-agent hosting: https://docs.copilotkit.ai/coagents
Agent Observability Solutions¶
These platforms provide specialized tools for monitoring, debugging, and analyzing the performance of AI agents and LLM applications in production environments.
Platform | Description |
---|---|
LangSmith | LangChain's platform for debugging, testing, evaluating, and monitoring LLM applications and agents |
Arize | ML observability platform with LLM monitoring capabilities |
Weave | AI observability and monitoring platform |
Langfuse | Open source LLM engineering platform for monitoring and analytics |
AgentOps.ai | Specialized platform for monitoring and optimizing AI agents |
Braintrust | LLM evaluation and monitoring platform |
Key observability features to consider:
- Real-time monitoring and alerting
- Performance analytics and tracing
- Debug tooling and replay capabilities
- Cost tracking and optimization
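Most of these features boil down to recording a span per model or tool call: a name, attributes, latency, and any error. A minimal in-process sketch; a real setup would export these records through an observability SDK (LangSmith, Langfuse, etc.), and the model name and token fields below are assumptions:

```python
import time
from contextlib import contextmanager

TRACE: list[dict] = []  # in a real setup, exported to an observability backend

@contextmanager
def span(name: str, **attrs):
    """Record one traced operation: name, attributes, latency, outcome."""
    record = {"name": name, **attrs, "error": None}
    start = time.perf_counter()
    try:
        yield record
    except Exception as exc:
        record["error"] = repr(exc)
        raise
    finally:
        record["latency_ms"] = (time.perf_counter() - start) * 1000
        TRACE.append(record)

# Instrument a (stubbed) model call; model name and token counts are
# placeholders for what a real client would report.
with span("llm_call", model="my-model", prompt_tokens=12) as rec:
    reply = "hello!"  # stand-in for the real API call
    rec["completion_tokens"] = len(reply.split())

print(TRACE[0]["name"], TRACE[0]["completion_tokens"])  # → llm_call 1
```

Cost tracking falls out of the same records: multiply the token attributes by per-model prices at aggregation time.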
Front end¶
Several solutions exist for building and deploying AI agent front-ends, ranging from development tools to complete frameworks:
Platform | Description |
---|---|
Pyspur | Visual development environment for building AI agents and applications |
LangGraph Studio | Visual interface for building and deploying LangGraph agents |
CopilotKit | Open-source multi-agent chat interface with Next.js integration |
Streamlit | Fast way to build and share data/ML/AI apps |
Gradio | UI library for deploying ML/AI models with easy-to-build interfaces |
Chainlit | Building Python LLM apps with chat interfaces |
LlamaIndex UI | React components for building LlamaIndex applications |
Key considerations for front-end solutions:
- Ease of development and deployment
- Component reusability
- Real-time chat capabilities
- Multi-agent support
- Integration with backend services
- Customization options
- Mobile responsiveness
Vertical AI Agent Solutions¶
Company | Description/Focus Area |
---|---|
Decagon | AI agents for customer support |
Sierra | Conversational AI agents for customer experience |
Replit | Cloud development environment and AI coding tools |
Perplexity | AI-powered search and discovery |
Harvey | Legal AI solutions |
Please AI | Multi-agent systems and orchestration |
Cognition | Makers of Devin, an AI software engineering agent |
Factory | AI agents for automating software development |
Dosu | AI code-writing agent and GitHub plugin |
Lindy | AI assistant for automated email handling and scheduling |
11x | Digital Human Workers |