The AI Agent Stack
The Stack¶
The modern AI agent stack can be broken down into several key layers, each addressing specific challenges in agent development, rounded out by some notable examples of successful Vertical AI Agent Solutions:
```mermaid
flowchart LR
    subgraph Infrastructure["Infrastructure Layer"]
        direction TB
        Mem[Memory Solutions]
        Model[Model Serving]
    end

    subgraph Development["Development Layer"]
        direction TB
        Frame[Agent Frameworks]
        Tools[Tool Libraries]
        Sand[Agent Sandboxes]
    end

    subgraph Application["Application Layer"]
        direction TB
        VA[User Interface UI/UX]
        Host[Agent Hosting & Serving]
        Obs[Observability Solutions]
    end

    %% Node styles
    classDef appLayer fill:#ff9e64,stroke:#333,stroke-width:2px
    classDef devLayer fill:#7aa2f7,stroke:#333,stroke-width:2px
    classDef infraLayer fill:#9ece6a,stroke:#333,stroke-width:2px
    classDef subgraphStyle fill:transparent,stroke-width:2px,stroke:#666

    %% Apply styles to nodes
    class VA,Host,Obs appLayer
    class Frame,Tools,Sand devLayer
    class Mem,Model infraLayer
    class Application,Development,Infrastructure subgraphStyle

    %% Add clickable links
    click Mem "#agent-memory-solutions" "Memory Solutions"
    click Model "#model-serving-solutions" "Model Serving"
    click Frame "#agent-frameworks" "Agent Frameworks"
    click Tools "#tool-libraries" "Tool Libraries"
    click Sand "#agent-sandboxes" "Agent Sandboxes"
    click VA "../building_applications/front_end/index.html" "User Interface"
    click Host "#agent-hosting--serving-solutions" "Hosting & Serving"
    click Obs "#agent-observability-solutions" "Observability"
```
Infrastructure Layer¶
Model Serving Solutions¶
These platforms provide various solutions for deploying and serving AI models, from local deployment to cloud-based infrastructure, with different performance and scaling capabilities.
Platform | Description |
---|---|
vLLM | High-performance inference engine for LLM serving |
AIBrix | Cost-efficient and pluggable infrastructure components for GenAI inference with features like high-density LoRA management, LLM gateway/routing, and distributed inference |
Ollama | Run and serve open-source LLMs locally |
LM Studio | Desktop application for running and serving local LLMs |
Together AI | Platform for deploying and serving large language models |
Fireworks AI | Infrastructure for serving and fine-tuning LLMs |
Groq | High-performance LLM inference and serving platform |
OpenAI | API platform for serving GPT and other AI models |
Anthropic | Platform for serving Claude and other AI models |
Mistral AI | Platform for serving efficient and powerful language models |
Google Gemini | Google's platform for serving multimodal AI models |
LLM Serving Considerations:
- Performance and scaling capabilities
- Cost and resource optimization
- Security and compliance features
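As a sketch of how a served model is consumed, most of the platforms above (vLLM, Ollama, Together AI, and others) expose an OpenAI-compatible chat completions endpoint. The base URL and model name below are placeholders, not real deployments:

```python
import json
import urllib.request

def build_chat_request(base_url: str, model: str, messages: list[dict]) -> urllib.request.Request:
    """Build an OpenAI-compatible /v1/chat/completions request.

    Works against any server exposing the OpenAI chat API surface
    (vLLM, Ollama, Together AI, ...); base_url and model are assumptions.
    """
    payload = {"model": model, "messages": messages, "temperature": 0.2}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Example: point at a locally served model (e.g. started with
# `vllm serve <model>` or `ollama serve`); URL and model are placeholders.
req = build_chat_request(
    "http://localhost:8000",
    "my-local-model",
    [{"role": "user", "content": "Hello!"}],
)
# resp = urllib.request.urlopen(req)  # uncomment against a live server
```

Because the request shape is shared across providers, swapping serving backends usually only changes the base URL, model name, and authentication header.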
Agent Memory Solutions¶
Agent memory is typically built on vector databases, but dedicated platform solutions make it easier to manage, enabling long-term context retention and efficient memory management for AI applications.
Platform | Description |
---|---|
Letta | System for extending LLM context windows with self-managed, effectively unbounded memory |
Zep | Long-term memory store for LLM applications and agents |
LangMem | LangChain's memory management system for conversational agents |
Mem0 | Memory management and persistence solution for AI assistants and agents |
Memory architecture considerations:
- Persistence strategies
- Context window optimization
- Memory retrieval mechanisms
- Integration with vector stores
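To make the retrieval side of agent memory concrete, here is a minimal sketch that stores facts and retrieves the most relevant one by cosine similarity. It uses a toy bag-of-words "embedding"; the platforms above use learned embeddings backed by a vector store:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; real memory systems (Zep, Mem0, Letta)
    # use learned embeddings stored in a vector database.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class AgentMemory:
    """Minimal long-term memory: store facts, retrieve the most relevant."""

    def __init__(self):
        self._items: list[tuple[str, Counter]] = []

    def add(self, fact: str) -> None:
        self._items.append((fact, embed(fact)))

    def retrieve(self, query: str, k: int = 2) -> list[str]:
        q = embed(query)
        ranked = sorted(self._items, key=lambda it: cosine(q, it[1]), reverse=True)
        return [fact for fact, _ in ranked[:k]]

mem = AgentMemory()
mem.add("The user prefers Python for scripting")
mem.add("The user's deployment target is AWS")
mem.add("The user dislikes verbose logging")
print(mem.retrieve("python scripting", k=1))
# → ['The user prefers Python for scripting']
```

The persistence, context-window, and vector-store considerations listed above are exactly the parts this toy version omits.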
Development Layer¶
Agent Frameworks¶
These frameworks provide different approaches and tools for building AI agents, from simple single-agent systems to complex multi-agent orchestrations. Each has its own strengths and specialized use cases.
Framework | Description |
---|---|
PydanticAI | Built by the Pydantic team, offering a Python-centric design for building production-grade AI applications with type safety and structured responses |
LangGraph | LangChain's framework for building structured agents using computational graphs |
Letta | Framework for building and deploying AI agents with built-in orchestration |
Open Hands | Open-source platform for AI software development agents |
AutoGen | Microsoft's framework for building multi-agent systems with automated agent orchestration |
LlamaIndex | Framework for building RAG-enabled agents and LLM applications |
CrewAI | Framework for orchestrating role-playing autonomous AI agents |
DSPy | Stanford's framework for programming with foundation models |
Phidata | AI-first development framework for building production-ready AI applications |
Semantic Kernel | Microsoft's orchestration framework for LLMs |
AutoGPT | Framework for building autonomous AI agents with GPT-4 |
L3AGI | Open-source tool that enables AI assistants to collaborate as effectively as human teams |
Open GPTs | Provides a similar experience to OpenAI's GPTs and Assistants, built on LangChain components |
CAMEL | Communicative Agents for "Mind" Exploration of Large Scale Language Model Society |
Framework selection considerations:
- State Management: How agent state is serialized and persisted
- Context Window Management: How data is compiled into LLM context
- Multi-Agent Communication: Support for agent collaboration
- Memory Handling: Techniques for managing long-term memory
- Model Support: Compatibility with open-source models
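These considerations are easiest to see in the loop that every framework implements in some form: the model either proposes a tool call or a final answer, and tool results are appended back into the context. A minimal sketch with a stubbed model standing in for a real LLM call:

```python
# Registry of tools the agent may call; the calculator is illustrative
# (eval on model output is unsafe outside a toy example).
TOOLS = {
    "calculator": lambda expression: str(eval(expression, {"__builtins__": {}})),
}

def fake_model(messages: list[dict]) -> dict:
    """Stand-in for an LLM call. A real framework (LangGraph, AutoGen, ...)
    would send `messages` to a model and parse a tool call from its reply."""
    last = messages[-1]["content"]
    if last.startswith("What is"):
        return {"tool": "calculator", "args": {"expression": "2 + 2"}}
    return {"answer": f"The result is {last}"}

def run_agent(user_query: str, max_steps: int = 5) -> str:
    messages = [{"role": "user", "content": user_query}]
    for _ in range(max_steps):
        decision = fake_model(messages)
        if "answer" in decision:
            return decision["answer"]
        # Execute the proposed tool call and feed the result back.
        result = TOOLS[decision["tool"]](**decision["args"])
        messages.append({"role": "tool", "content": result})
    return "step limit reached"

print(run_agent("What is 2 + 2?"))  # → The result is 4
```

State management, context-window management, and memory handling are all about what goes into (and stays out of) that `messages` list between steps.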
Tool Libraries¶
Tools can be categorized into three main types:

- Knowledge Augmentation
    - Text retrievers
    - Image retrievers
    - Web browsers
    - SQL executors
    - Internal knowledge base access
    - API integrations (news, weather, stocks)
- Capability Extension
    - Calculators
    - Code interpreters
    - Calendar tools
    - Unit converters
    - Language translators
    - Multimodal converters (text-to-image, speech-to-text)
- Write Actions
    - Database modifications
    - Email sending
    - File system operations
    - API calls with side effects
    - Transaction processing
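One way to make the three categories operational is to tag each tool with its category and gate write actions behind explicit approval. The tool names and behaviors below are illustrative, not from any particular library:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Tool:
    name: str
    category: str                      # "knowledge" | "capability" | "write"
    fn: Callable[..., str]
    requires_approval: bool = False    # write actions should be gated

# Illustrative tools, one per category; all names are assumptions.
registry = {
    "search_docs": Tool("search_docs", "knowledge", lambda q: f"results for {q!r}"),
    "convert_units": Tool("convert_units", "capability", lambda km: f"{km * 0.621371:.2f} mi"),
    "send_email": Tool("send_email", "write", lambda to: f"sent to {to}", requires_approval=True),
}

def invoke(name: str, approved: bool = False, **kwargs) -> str:
    tool = registry[name]
    if tool.requires_approval and not approved:
        raise PermissionError(f"{name} is a write action and needs approval")
    return tool.fn(**kwargs)

print(invoke("convert_units", km=10))  # read-only: runs immediately → 6.21 mi
```

Read-only knowledge and capability tools can run freely; anything with side effects goes through the approval gate.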
These libraries provide specialized tools and capabilities that can be integrated into AI agents to enhance their ability to interact with various systems and perform specific tasks.
Library | Description |
---|---|
Composio | Tool composition and orchestration library for AI agents |
Browserbase | Browser automation and web interaction tools for AI agents |
Exa | AI-powered search and knowledge tools library |
Model Context Protocol (MCP) | A protocol for enabling LLMs to use tools |
Tool Integration Protocols¶
A key challenge in building agents is standardizing how they interact with tools. Several protocols have emerged to address this need:
Model Context Protocol (MCP)¶
Model Context Protocol provides a standardized way for LLMs to interact with tools and external systems. Key features include:
- Resource Management
    - Structured exposure of external resources
    - Schema definitions for data access
    - Standardized resource querying
- Tool Definitions
    - Common format for tool specifications
    - Input/output validation
    - Error handling patterns
- Prompt Templates
    - Standardized prompt formats
    - Context management
    - Response handling
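As an illustration of the tool-definition piece, MCP servers advertise tools with a name, a description, and a JSON Schema for their inputs. The `get_weather` tool below is hypothetical, and the required-field check is a minimal stand-in for a full JSON Schema validator:

```python
# Illustrative tool definition in the shape MCP servers advertise
# (name, description, JSON Schema input); the values are assumptions
# for this sketch, not a real server's output.
tool_definition = {
    "name": "get_weather",
    "description": "Fetch current weather for a city",
    "inputSchema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

def validate_call(tool: dict, arguments: dict) -> list[str]:
    """Return the required fields missing from a proposed call
    (a full client would run a real JSON Schema validation)."""
    schema = tool["inputSchema"]
    return [f for f in schema.get("required", []) if f not in arguments]

print(validate_call(tool_definition, {}))                # → ['city']
print(validate_call(tool_definition, {"city": "Oslo"}))  # → []
```

Because the schema travels with the tool, any MCP-aware client can validate calls before execution without tool-specific code.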
Other Tool Integration Standards¶
Protocol | Description |
---|---|
OpenAI Function Calling | JSON Schema-based function definitions |
LangChain Tools | Tool specification format for LangChain agents |
Semantic Kernel Skills | Microsoft's approach to defining reusable AI capabilities |
Best Practices for Tool Integration¶
When implementing tool protocols:
- Security Considerations
    - Validate all inputs before execution
    - Implement proper access controls
    - Monitor tool usage and rate limits
- Error Handling
    - Graceful failure modes
    - Clear error messages
    - Recovery strategies
- Documentation
    - Clear tool specifications
    - Usage examples
    - Integration guides
- Testing
    - Tool validation
    - Integration testing
    - Performance monitoring
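Several of these practices can live in a single wrapper applied to every tool: structured errors instead of raw exceptions, plus a simple rate limit. The thresholds and return shape below are illustrative:

```python
import time
from functools import wraps

def guarded(max_calls_per_minute: int = 30):
    """Wrap a tool with graceful errors and a sliding-window rate limit
    (illustrative thresholds; a production system would also log usage)."""
    def decorator(fn):
        calls: list[float] = []

        @wraps(fn)
        def wrapper(**kwargs):
            now = time.monotonic()
            calls[:] = [t for t in calls if now - t < 60]  # keep last minute
            if len(calls) >= max_calls_per_minute:
                return {"ok": False, "error": "rate limit exceeded"}
            calls.append(now)
            try:
                return {"ok": True, "result": fn(**kwargs)}
            except Exception as exc:  # graceful failure mode, clear message
                return {"ok": False, "error": f"{type(exc).__name__}: {exc}"}
        return wrapper
    return decorator

@guarded(max_calls_per_minute=2)
def divide(a: float, b: float) -> float:
    return a / b

print(divide(a=6, b=3))  # → {'ok': True, 'result': 2.0}
print(divide(a=1, b=0))  # → {'ok': False, 'error': 'ZeroDivisionError: division by zero'}
print(divide(a=1, b=1))  # → {'ok': False, 'error': 'rate limit exceeded'}
```

Returning a uniform `{ok, result/error}` envelope lets the agent reason about failures instead of crashing mid-run.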
Agent Sandboxes¶
Agent sandboxes are crucial components in the AI agent development stack, providing secure and isolated environments for running and testing AI agents. They serve as a critical layer of security and control between AI agents and the systems they interact with.
Platform | Description | Key Features |
---|---|---|
E2B | Secure sandboxed environments for running and testing AI agents | • Secure code execution • Real-time monitoring • API integration • Custom runtime environments |
Modal | Cloud platform for running AI agents in isolated environments | • Serverless execution • GPU support • Automatic scaling • Container orchestration |
Docker | Container platform that can be used for agent sandboxing | • Custom environments • Resource isolation • Portable deployments |
Note: These platforms ensure safe execution and development of agent capabilities while maintaining system security and stability. The choice of sandbox solution should align with your specific security requirements, development workflow, and operational needs.
Core Capabilities¶
- Security & Isolation
    - Containerized environments for safe code execution
    - Resource usage limits and quotas
    - Network access controls and API restrictions
    - File system and process isolation
    - Principle of least privilege enforcement
- Development & Operations
    - Rapid prototyping and testing
    - Reproducible environments
    - Version control integration
    - Resource monitoring and optimization
    - Performance profiling and debugging
    - Cost management and scaling
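At its simplest, the isolation idea can be sketched by running agent-generated code in a separate interpreter process with a timeout. Unlike the platforms above, this toy version isolates only the parent process, not the file system or network, so treat it as illustrative:

```python
import subprocess
import sys

def run_sandboxed(code: str, timeout_s: float = 5.0) -> dict:
    """Run agent-generated code in a separate Python process with a
    wall-clock timeout. A toy stand-in for a real sandbox (E2B, Modal,
    Docker): no file-system or network isolation is provided here.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-I", "-c", code],  # -I: isolated mode, no user site
            capture_output=True, text=True, timeout=timeout_s,
        )
        return {"stdout": proc.stdout, "stderr": proc.stderr, "code": proc.returncode}
    except subprocess.TimeoutExpired:
        return {"stdout": "", "stderr": "timed out", "code": -1}

print(run_sandboxed("print(21 * 2)")["stdout"].strip())  # → 42
```

Real sandboxes add the missing layers: containerized file systems, network policies, and resource quotas enforced below the process level.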
Application Layer¶
Agent Hosting & Serving Solutions¶
Platform | Description |
---|---|
Letta | Agent deployment and hosting platform |
LangGraph | Graph-based orchestration for language model agents |
Assistants API | OpenAI's API for deploying and managing AI assistants |
Amazon Bedrock Agents | AWS-based agent hosting and management service |
LiveKit Agents | Real-time agent deployment and communication platform |
CopilotKit | Framework for building and deploying AI copilots with multi-agent support |
These platforms provide infrastructure and tools for deploying, hosting, and serving AI agents at scale, each with different specializations and integration capabilities.
Additional considerations for hosting solutions:
- Scalability and performance requirements
- Integration capabilities with existing systems
- Cost and resource optimization
- Security and compliance features
See the CopilotKit CoAgents documentation for an example of multi-agent hosting: https://docs.copilotkit.ai/coagents
Agent Observability Solutions¶
These platforms provide specialized tools for monitoring, debugging, and analyzing the performance of AI agents and LLM applications in production environments.
Platform | Description |
---|---|
LangSmith | LangChain's platform for debugging, testing, evaluating, and monitoring LLM applications and agents |
Arize | ML observability platform with LLM monitoring capabilities |
Weave | AI observability and monitoring platform |
Langfuse | Open source LLM engineering platform for monitoring and analytics |
AgentOps.ai | Specialized platform for monitoring and optimizing AI agents |
Braintrust | LLM evaluation and monitoring platform |
Key observability features to consider:
- Real-time monitoring and alerting
- Performance analytics and tracing
- Debug tooling and replay capabilities
- Cost tracking and optimization
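Most of these features boil down to recording a span per model or tool call: a name, attributes, latency, and any error. A minimal in-process sketch; a real setup would export these records through an observability SDK (LangSmith, Langfuse, etc.), and the model name and token fields below are assumptions:

```python
import time
from contextlib import contextmanager

TRACE: list[dict] = []  # in a real setup, exported to an observability backend

@contextmanager
def span(name: str, **attrs):
    """Record one traced operation: name, attributes, latency, outcome."""
    record = {"name": name, **attrs, "error": None}
    start = time.perf_counter()
    try:
        yield record
    except Exception as exc:
        record["error"] = repr(exc)
        raise
    finally:
        record["latency_ms"] = (time.perf_counter() - start) * 1000
        TRACE.append(record)

# Instrument a (stubbed) model call; model name and token counts are
# placeholders for what a real client would report.
with span("llm_call", model="my-model", prompt_tokens=12) as rec:
    reply = "hello!"  # stand-in for the real API call
    rec["completion_tokens"] = len(reply.split())

print(TRACE[0]["name"], TRACE[0]["completion_tokens"])  # → llm_call 1
```

Cost tracking falls out of the same records: multiply the token attributes by per-model prices at aggregation time.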
Front end¶
Several solutions exist for building and deploying AI agent front-ends, ranging from development tools to complete frameworks:
Platform | Description |
---|---|
Pyspur | Visual development environment for building AI agents and applications |
LangGraph Studio | Visual interface for building and deploying LangGraph agents |
CopilotKit | Open-source multi-agent chat interface with Next.js integration |
Streamlit | Fast way to build and share data/ML/AI apps |
Gradio | UI library for deploying ML/AI models with easy-to-build interfaces |
Chainlit | Building Python LLM apps with chat interfaces |
LlamaIndex UI | React components for building LlamaIndex applications |
Key considerations for front-end solutions:
- Ease of development and deployment
- Component reusability
- Real-time chat capabilities
- Multi-agent support
- Integration with backend services
- Customization options
- Mobile responsiveness
Vertical AI Agent Solutions¶
Company | Description/Focus Area |
---|---|
Decagon | AI agents for customer support |
Sierra | Conversational AI agents for customer experience |
Replit | Cloud development environment and AI coding tools |
Perplexity | AI-powered search and discovery |
Harvey | Legal AI solutions |
Please AI | Multi-agent systems and orchestration |
Cognition | Makers of Devin, an AI software engineering agent |
Factory | AI agents for automating software development |
Dosu | AI code-writing agent and GitHub plugin |
Lindy | AI assistant for automated email handling and scheduling |
11x | Digital Human Workers |