Data foundation layer
Agents are only as good as their data access. The data foundation layer provides the knowledge bases, vector stores, structured APIs, document stores, and real-time feeds that agents use to retrieve evidence, validate context, and take informed action.
Data foundation components for agent systems
| Component | Purpose | Design consideration |
|---|---|---|
| Knowledge bases | Curated, domain-specific reference data | Maintainability, update cadence, and source attribution |
| Vector stores | Semantic search and retrieval for RAG workflows | Embedding model choice, chunking strategy, and freshness |
| Structured APIs | Transaction systems and operational data | API design, rate limits, and authentication |
| Document stores | Unstructured content such as policies, manuals, or contracts | Indexing strategy, access controls, and versioning |
| Real-time feeds | Live data streams for time-sensitive decisions | Latency, ordering guarantees, and failure handling |
Data governance is critical for agent-accessible data. Ensure that agents only access data they are authorized to see, that retrieval is auditable, and that stale or incorrect data does not lead to flawed decisions. Treat data products as contracts with clear schemas and service level objectives.
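The governance rules above (authorized access only, auditable retrieval, and protection against stale data) can be sketched as a single retrieval gate. This is a minimal illustration, not a reference implementation: the `Record` shape, the role model, and the `governed_retrieve` function are hypothetical names introduced here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Record:
    """Hypothetical data-product record with governance metadata."""
    doc_id: str
    text: str
    allowed_roles: frozenset
    updated_at: datetime


def governed_retrieve(records, agent_id, agent_roles, max_age, audit_log, now=None):
    """Return only records the agent is authorized to see and that are fresh enough."""
    now = now or datetime.now(timezone.utc)
    results = []
    for rec in records:
        authorized = bool(rec.allowed_roles & agent_roles)
        fresh = now - rec.updated_at <= max_age
        # Every access decision is recorded, including denials, so retrieval is auditable.
        audit_log.append({
            "agent": agent_id,
            "doc": rec.doc_id,
            "authorized": authorized,
            "fresh": fresh,
            "at": now.isoformat(),
        })
        if authorized and fresh:
            results.append(rec)
    return results


now = datetime(2025, 1, 10, tzinfo=timezone.utc)
records = [
    Record("policy-1", "Refund policy", frozenset({"support"}), now - timedelta(days=2)),
    Record("salary-1", "Payroll bands", frozenset({"hr"}), now - timedelta(days=1)),
    Record("policy-0", "Old refund policy", frozenset({"support"}), now - timedelta(days=400)),
]
audit = []
visible = governed_retrieve(records, "agent-7", {"support"}, timedelta(days=90), audit, now=now)
print([r.doc_id for r in visible])  # ['policy-1']
```

The unauthorized payroll record and the stale policy are both filtered out, and the audit log still holds one entry for each of the three records.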
The Agentic Enterprise — Data Layer and Semantic Layer
Salesforce Architect guide covering the Data Layer (VectorDB, Lakehouse, Data Contracts, AI-Ready Data Fabric) and the Semantic Layer (Enterprise Knowledge Graph, Semantic Query Engine) that underpin agent reasoning.
Knowledge Bases for Amazon Bedrock
AWS documentation on managed RAG capabilities, enabling foundation models and agents to access company data for grounded responses.
AWS Prescriptive Guidance — Enterprise Architecture for Agentic AI
Official AWS architectural guidance depicting the full stack of enterprise agentic AI — from applications and agents through model access, tools, and knowledge bases, with observability and security spanning every tier.
Agent runtime and execution environments
The agent runtime determines where agents execute and how state is managed. Runtime choices affect isolation, persistence, scaling, and observability.
Runtime considerations for agent systems
| Consideration | What it means for agents | Example implementation |
|---|---|---|
| Execution isolation | How agents are separated from each other and from the host | Containers, sandboxes, or serverless functions |
| Session persistence | How conversational and workflow state is stored | In-memory cache, durable store, or distributed cache |
| Timeout handling | How long-running operations are managed | Async workflows, polling, or webhooks |
| Resource limits | Constraints on compute, memory, and concurrency | Quotas, throttling, or auto-scaling policies |
| Scaling model | How the system responds to load | Horizontal scaling, function auto-scaling, or provisioned capacity |
[Diagram: Agent runtime request flow]
Session state management patterns vary in complexity. In-memory state is simple but lost on restart. Persisted state survives restarts but adds latency. Distributed state scales across instances but requires coordination. Choose based on session durability requirements and scaling needs.
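To make the trade-off between in-memory and persisted state concrete, here is a small sketch using only the Python standard library. The class names and the SQLite schema are assumptions chosen for illustration; a production system would more likely use a managed durable store or a distributed cache.

```python
import json
import os
import sqlite3
import tempfile


class InMemorySessionStore:
    """Simple and fast, but state disappears when the process restarts."""

    def __init__(self):
        self._data = {}

    def save(self, session_id, state):
        self._data[session_id] = dict(state)

    def load(self, session_id):
        return self._data.get(session_id)


class SqliteSessionStore:
    """Persisted state: survives restarts at the cost of a disk write per save."""

    def __init__(self, path):
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS sessions (id TEXT PRIMARY KEY, state TEXT NOT NULL)"
        )
        self._conn.commit()

    def save(self, session_id, state):
        self._conn.execute(
            "INSERT OR REPLACE INTO sessions (id, state) VALUES (?, ?)",
            (session_id, json.dumps(state)),
        )
        self._conn.commit()

    def load(self, session_id):
        row = self._conn.execute(
            "SELECT state FROM sessions WHERE id = ?", (session_id,)
        ).fetchone()
        return json.loads(row[0]) if row else None

    def close(self):
        self._conn.close()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sessions.db")
    mem, disk = InMemorySessionStore(), SqliteSessionStore(path)
    for store in (mem, disk):
        store.save("s1", {"step": 3})
    disk.close()

    # Simulate a restart by constructing fresh store instances.
    mem_after, disk_after = InMemorySessionStore(), SqliteSessionStore(path)
    restored = (mem_after.load("s1"), disk_after.load("s1"))
    disk_after.close()

print(restored)  # (None, {'step': 3})
```

Distributed state is not shown; it would add a shared backend plus the coordination (locking or versioning) that the paragraph above mentions.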
AWS Prescriptive Guidance — Tool-Based Agents for Calling Functions
Official AWS pattern describing how tool-based agents extend LLM reasoning with external function calls, covering tool discovery, selection, execution, and result integration.
Scaling and resilience patterns
Agent systems need production-grade reliability. Resilience patterns prevent cascading failures, handle overload gracefully, and ensure that the system can recover from errors without losing data or trust.
Resilience patterns for agent workflows
| Pattern | What it does | When to use it |
|---|---|---|
| Circuit breaker | Stops calling a failing service after a threshold | Downstream services are failing or timing out |
| Rate limiting | Throttles requests to protect downstream systems | APIs have quotas or scale limits |
| Backpressure | Signals the producer to slow down when the consumer is overwhelmed | Processing pipelines are congested |
| Retry with backoff | Retries failed operations with increasing delays | Transient failures are common |
| Graceful degradation | Reduces functionality instead of failing completely | Non-critical features are unavailable |
| Dead letter queue | Captures failed messages for later inspection and retry | Message processing fails and needs manual review |
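Two of the patterns in the table, circuit breaker and retry with backoff, are often combined. The sketch below is one possible arrangement under simplifying assumptions: the thresholds are arbitrary, jitter is omitted, and `CircuitBreaker`, `CircuitOpenError`, and `retry_with_backoff` are illustrative names rather than any library's API.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when the breaker refuses to call the downstream service."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at = None

    def call(self, fn, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; skipping call")
            # Cool-down elapsed: allow one trial call (half-open state).
            self._opened_at = None
            self._failures = self.failure_threshold - 1
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        self._failures = 0
        return result


def retry_with_backoff(fn, attempts=4, base_delay=0.5, sleep=time.sleep):
    """Retry fn with exponentially increasing delays; never retry an open circuit."""
    for attempt in range(attempts):
        try:
            return fn()
        except CircuitOpenError:
            raise
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)


calls = {"n": 0}


def flaky():
    """Fails twice with a transient error, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("transient failure")
    return "ok"


delays = []  # record delays instead of actually sleeping
breaker = CircuitBreaker(failure_threshold=5)
result = retry_with_backoff(lambda: breaker.call(flaky), sleep=delays.append)
print(result, delays)  # ok [0.5, 1.0]
```

Placing the retry outside the breaker means repeated failures still count toward the threshold, and once the circuit opens the retry loop stops immediately instead of adding load.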
Cost-aware scaling is essential because agent invocations are expensive. Cache retrieval results where possible, batch requests when appropriate, and monitor token usage and latency as first-class operational metrics. Set budgets and alerts to prevent cost overruns.
Monitor token usage and latency as first-class operational metrics
Every agent invocation consumes tokens and time. Track average and p95 latency, token counts per request, and cost per invocation. Use these metrics to detect anomalies, optimize prompts, and set budget controls.
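As a rough illustration of these metrics, the following sketch computes average latency, p95 latency (nearest-rank method), cost per invocation, and a budget check. The per-1K-token prices and the budget are made-up numbers, and `Invocation` and `summarize` are hypothetical names.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Invocation:
    latency_ms: float
    input_tokens: int
    output_tokens: int


def p95(values):
    """Nearest-rank 95th percentile."""
    ordered = sorted(values)
    rank = math.ceil(95 * len(ordered) / 100)
    return ordered[rank - 1]


def summarize(invocations, usd_per_1k_input, usd_per_1k_output, budget_usd):
    latencies = [i.latency_ms for i in invocations]
    costs = [
        i.input_tokens / 1000 * usd_per_1k_input
        + i.output_tokens / 1000 * usd_per_1k_output
        for i in invocations
    ]
    total_cost = sum(costs)
    return {
        "avg_latency_ms": sum(latencies) / len(latencies),
        "p95_latency_ms": p95(latencies),
        "avg_tokens": sum(i.input_tokens + i.output_tokens for i in invocations) / len(invocations),
        "cost_per_invocation_usd": total_cost / len(costs),
        "over_budget": total_cost > budget_usd,
    }


# Twenty synthetic invocations with latencies from 100 ms to 2000 ms.
sample = [Invocation(100.0 * n, input_tokens=1000, output_tokens=500) for n in range(1, 21)]
report = summarize(sample, usd_per_1k_input=0.003, usd_per_1k_output=0.015, budget_usd=0.20)
print(report["avg_latency_ms"], report["p95_latency_ms"], report["over_budget"])
```

In this sample the average latency is 1050 ms while the p95 is 1900 ms, which is why tail latency is worth tracking separately from the mean.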
AWS treats memory as a runtime substrate, not just chat history
The AWS memory-augmented agent pattern is useful because it reframes memory as an execution dependency. Memory is not only previous messages. It includes recent task state, retrieved long-term facts, replayable episodes, and strategy artifacts that must be injected and updated intentionally across the agent lifecycle.
Memory layers that matter in enterprise runtimes
| Memory layer | What it stores | Why runtime architecture matters |
|---|---|---|
| Short-term memory | Recent dialogue turns, task context, current system state, and in-flight constraints | Needs fast reads and controlled prompt injection so the agent can act coherently within a session |
| Long-term memory | Durable facts, preferences, histories, knowledge-base content, and retrieved records | Needs freshness, access control, and clear retrieval policy so old context does not poison new decisions |
| Episodic memory | Past successes, failures, and scenario-specific traces that can shape future behavior | Needs selective replay instead of naive append-only history, or prompts become bloated and misleading |
| Shared workflow memory | Cross-agent state, handoff context, and durable progress markers | Needs explicit ownership and conflict control if more than one agent can read or write the same state |
This is why memory belongs in enterprise integration discussions. Once memory influences routing, tool choice, or answer quality, teams need decisions about persistence stores, eviction, redaction, identity scoping, and recovery after failure. AWS lists DynamoDB, Redis, S3, RDS, and Bedrock-adjacent retrieval surfaces because the memory problem is operational as much as cognitive.
AWS: Memory-augmented agents
Defines short-term and long-term memory in agent loops and maps those concerns to AWS persistence and retrieval services.
Durable Execution & Checkpointing
Production agents must survive crashes, deployments, and pauses. LangGraph's durable execution model checkpoints after every super-step (PostgreSQL-backed, keyed by thread_id), enabling agents to resume from exactly where they left off.
Durable execution primitives
| Primitive | Purpose | Behavior |
|---|---|---|
| Super-step checkpointing | State persistence | Checkpoint written after every super-step (one round of graph execution, which may run several nodes in parallel); keyed by thread_id for retrieval |
| Task queue with lease/retry | Crash recovery | Workers acquire lease on task; if worker crashes, lease expires and task is retried by another worker |
| Resumability | Continuity across interruptions | Agent picks up from last checkpoint after crash, deployment, or intentional pause; no state loss |
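The checkpoint-and-resume behavior in the table can be illustrated with a toy loop. This is not LangGraph's implementation: the `CheckpointStore` below is an in-process dictionary standing in for a durable store, and all names are hypothetical. It only shows the idea that state is saved after each step under a thread_id and that a later run continues from the last saved step.

```python
import copy


class CheckpointStore:
    """Stand-in for a durable store: latest (step_index, state) per thread_id."""

    def __init__(self):
        self._latest = {}

    def put(self, thread_id, step_index, state):
        self._latest[thread_id] = (step_index, copy.deepcopy(state))

    def get(self, thread_id):
        step_index, state = self._latest.get(thread_id, (0, {"log": []}))
        return step_index, copy.deepcopy(state)


def run(thread_id, steps, store, crash_before=None):
    """Execute steps in order, checkpointing after each one."""
    done, state = store.get(thread_id)
    for index in range(done, len(steps)):
        if index == crash_before:
            raise RuntimeError("simulated worker crash")
        state = steps[index](state)
        store.put(thread_id, index + 1, state)
    return state


def make_step(name):
    def step(state):
        return {"log": state["log"] + [name]}
    return step


steps = [make_step("plan"), make_step("call_tool"), make_step("summarize")]
store = CheckpointStore()

try:
    run("thread-1", steps, store, crash_before=2)  # crashes before the third step
except RuntimeError:
    pass

resumed = run("thread-1", steps, store)  # picks up at the third step
print(resumed["log"])  # ['plan', 'call_tool', 'summarize']
```

Each step appears exactly once in the log, so the completed steps were not re-executed after the crash.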
[Diagram: Durable Execution Checkpoint Lifecycle]
Memory Architecture
LangGraph provides a two-tier memory system: short-term checkpoint-based state for conversation context, and long-term persistent store for cross-session knowledge.
LangGraph two-tier memory system
| Memory Tier | Backing Store | Scope | Use Cases |
|---|---|---|---|
| Short-term (checkpoint) | PostgreSQL via checkpoint saver | Per-thread, survives crashes | Conversation context, current task state, tool call history |
| Long-term (persistent store) | Key-value store with namespace tuples | Cross-session, by (user_id, 'memories') namespace | User preferences, learned facts, cross-conversation knowledge |
Long-term memory supports semantic search via embeddings. Store memories with store.put((user_id, 'memories'), key, value) and retrieve them with store.search((user_id, 'memories'), query=query_text). The query is passed as natural-language text, and the store embeds it when an embedding index is configured. Use short-term memory for ephemeral context and long-term memory for knowledge that should persist across sessions.
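The namespace-tuple pattern is easier to see in a toy version. The class below is not LangGraph's store; it is a hypothetical stand-in that mirrors the put/search shape and substitutes word overlap for embedding similarity so it runs with no dependencies.

```python
class ToyMemoryStore:
    """Toy namespaced key-value store; word overlap stands in for semantic search."""

    def __init__(self):
        self._items = {}

    def put(self, namespace, key, value):
        self._items[(namespace, key)] = value

    def search(self, namespace, query, limit=3):
        query_words = set(query.lower().split())
        scored = []
        for (item_namespace, key), value in self._items.items():
            if item_namespace != namespace:
                continue  # namespaces isolate one user's memories from another's
            overlap = len(query_words & set(value["text"].lower().split()))
            if overlap:
                scored.append((overlap, key, value))
        scored.sort(key=lambda entry: (-entry[0], entry[1]))
        return [(key, value) for _, key, value in scored[:limit]]


store = ToyMemoryStore()
store.put(("user-1", "memories"), "m1", {"text": "prefers vegetarian food"})
store.put(("user-1", "memories"), "m2", {"text": "works from Berlin office"})
store.put(("user-2", "memories"), "m1", {"text": "prefers spicy food"})

hits = store.search(("user-1", "memories"), "what food does the user like")
print(hits)  # [('m1', {'text': 'prefers vegetarian food'})]
```

The second user's memory also mentions food, but it lives under a different namespace tuple and never appears in the first user's results.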
Multi-Tenancy & Security
LangGraph Platform provides three authentication and authorization layers: Custom Auth for user identity, Authorization Handlers for resource-level access control, and Agent Auth (OAuth) for third-party API credentials.
LangGraph Platform auth layers
| Auth Layer | Mechanism | Responsibility |
|---|---|---|
| Custom Auth | validate → identity | Authenticate users, extract user_id/organization from tokens or API keys |
| Authorization Handlers | Per-resource filter functions | Scope which threads, assistants, or runs a user can access |
| Agent Auth (OAuth) | OAuth2 flow for 3rd-party credentials | Agents access external APIs (Gmail, Slack, GitHub) on behalf of users |
RBAC, namespace isolation, and resource quotas are implemented at the authorization layer. Each tenant's threads, runs, and cron jobs are isolated by namespace — one tenant cannot access another's data.
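The first two layers, authentication followed by per-resource filtering, can be sketched in plain Python. This is an illustration of the idea only, not the LangGraph Platform API: the token table, `Identity`, `Thread`, and both functions are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Identity:
    user_id: str
    tenant: str


@dataclass(frozen=True)
class Thread:
    thread_id: str
    tenant: str
    owner: str


def authenticate(token, token_table):
    """Layer 1: validate the credential and resolve it to an identity."""
    identity = token_table.get(token)
    if identity is None:
        raise PermissionError("unknown token")
    return identity


def visible_threads(identity, threads):
    """Layer 2: per-resource filter scoped to the caller's tenant and ownership."""
    return [
        t for t in threads
        if t.tenant == identity.tenant and t.owner == identity.user_id
    ]


tokens = {
    "tok-alice": Identity("alice", "acme"),
    "tok-bob": Identity("bob", "globex"),
}
threads = [
    Thread("t1", "acme", "alice"),
    Thread("t2", "acme", "carol"),
    Thread("t3", "globex", "bob"),
]

alice = authenticate("tok-alice", tokens)
alice_threads = [t.thread_id for t in visible_threads(alice, threads)]
print(alice_threads)  # ['t1']
```

Applying the filter on every list and read operation, rather than trusting the client to pass the right tenant, is what keeps one tenant's threads invisible to another.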
Human-in-the-Loop Runtime
LangGraph's human-in-the-loop primitives — interrupt() and Command(resume=...) — enable agents to pause mid-execution, surface decisions for human review, and resume with structured feedback, all without losing state.
[Diagram: HITL Interrupt/Resume Flow]
Dynamic placement means interrupt() can be called anywhere in your graph: before a sensitive tool call, after an LLM response, or at a decision boundary. The pause writes a checkpoint and frees the worker. On resume, the JSON-serializable value passed to Command(resume=...) becomes the return value of interrupt(), so the graph can branch on it.
Integration patterns: approval gates (interrupt before destructive actions), escalation (interrupt when confidence falls below a threshold and route to a human), and interactive debugging (pause mid-graph to inspect state).
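The approval-gate pattern can be approximated with a Python generator, which also pauses at a chosen point and resumes with a value supplied from outside. This is an analogy for the control flow only: a generator keeps its state in process memory, whereas the runtime described above checkpoints the state and frees the worker. The function names are hypothetical.

```python
def refund_workflow(amount):
    """Pauses at an approval gate for large refunds, then branches on the reviewer's answer."""
    if amount > 100:
        decision = yield {"question": "Approve refund?", "amount": amount}
        if not decision.get("approved"):
            reason = decision.get("reason", "no reason given")
            return f"refund of {amount} rejected: {reason}"
    return f"refund of {amount} issued"


def start(workflow):
    """Run until the first pause or until completion."""
    try:
        return "paused", next(workflow)
    except StopIteration as finished:
        return "done", finished.value


def resume(workflow, value):
    """Continue a paused workflow with structured human feedback."""
    try:
        return "paused", workflow.send(value)
    except StopIteration as finished:
        return "done", finished.value


workflow = refund_workflow(250)
status, payload = start(workflow)
print(status, payload)  # paused {'question': 'Approve refund?', 'amount': 250}

status, outcome = resume(workflow, {"approved": False, "reason": "duplicate request"})
print(status, outcome)  # done refund of 250 rejected: duplicate request

small_status, small_outcome = start(refund_workflow(50))
print(small_status, small_outcome)  # done refund of 50 issued
```

The small refund never reaches the gate, which mirrors placing the pause only in front of the sensitive action.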
Streaming & Concurrency
LangGraph supports multiple streaming modes and concurrency strategies to handle real-time progress updates, multi-user dashboards, and simultaneous message handling in production deployments.
LangGraph streaming modes
| Streaming Mode | Granularity | Protocol | Use Case |
|---|---|---|---|
| Run streaming | Per-step or per-token | SSE events per run | Real-time progress updates during single agent execution |
| Thread streaming | Long-lived connection | SSE with Last-Event-ID for resumption | Dashboards, multi-user views of ongoing conversations |
Double-Texting Strategies
| Strategy | Behavior | When to use it |
|---|---|---|
| Enqueue (default, safest) | New messages queue behind the active run | Most production agents |
| Reject | Returns an error if a run is active | Latency-sensitive APIs where stale requests should fail fast |
| Interrupt | Cancels the current run and starts a new one | The latest user intent always takes priority |
| Rollback | Interrupts, reverts to the prior checkpoint, and replays with the new input | Message ordering must be preserved |
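The difference between the four strategies shows up in what happens to the active run and its partial work. The simulation below is a simplified model with hypothetical names, not the platform's implementation. It assumes that an interrupt keeps the work the cancelled run had already produced, while a rollback discards it.

```python
from enum import Enum


class Strategy(Enum):
    ENQUEUE = "enqueue"
    REJECT = "reject"
    INTERRUPT = "interrupt"
    ROLLBACK = "rollback"


class ThreadRuns:
    """Tracks one thread's active run, its partial work, and any queued messages."""

    def __init__(self):
        self.active = None    # input of the run currently executing
        self.partial = []     # work the active run has produced so far
        self.committed = []   # work kept in the thread's state
        self.queue = []

    def submit(self, message, strategy):
        if self.active is None:
            self.active = message
            return "started"
        if strategy is Strategy.ENQUEUE:
            self.queue.append(message)
            return "queued"
        if strategy is Strategy.REJECT:
            return "rejected"
        if strategy is Strategy.INTERRUPT:
            # Assumption: an interrupted run's partial work stays in the thread.
            self.committed.extend(self.partial)
        # INTERRUPT and ROLLBACK both cancel the active run; ROLLBACK drops its work.
        self.partial = []
        self.active = message
        return "interrupted" if strategy is Strategy.INTERRUPT else "rolled_back"


def demo(strategy):
    thread = ThreadRuns()
    thread.submit("book a flight", strategy)
    thread.partial.append("searched flights")
    outcome = thread.submit("actually, book a train", strategy)
    return outcome, thread.active, thread.committed, thread.queue


results = {strategy.value: demo(strategy) for strategy in Strategy}
for name, result in results.items():
    print(name, result)
```

With enqueue and reject the original run keeps going; with interrupt and rollback the second message takes over, and only interrupt retains the earlier flight search.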
Middleware & Guardrails at Runtime
LangGraph's hook system lets you intercept and modify agent behavior at specific lifecycle points without changing graph logic. Hooks run in-process and can transform inputs, log, block, or augment calls.
[Diagram: Hook lifecycle in a single agent invocation]
LangGraph hook types and their use cases
| Hook | Fires When | Use Case |
|---|---|---|
| before_model | Before each LLM call | Inject system prompts, redact PII from user input |
| wrap_model_call | Wraps the LLM invocation | Add retry logic, rate limiting, caching |
| wrap_tool_call | Wraps each tool execution | Log tool inputs/outputs, enforce allowlists |
| after_model | After LLM response received | Validate output schema, block disallowed content, trigger fallbacks |
Built-in guardrails include PII redaction, tool-call limits (max N invocations per run), automatic retry on transient failures, fallback to simpler models on timeout, context window summarization, and content moderation filters.
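To show what such hooks do conceptually, here is a framework-free sketch of two guardrails: redacting email addresses before text reaches the model, and wrapping tool calls to enforce an allowlist and a per-run call limit. It does not use the real hook API. `redact_pii`, `ToolGuard`, and the simple email pattern are assumptions for illustration, and real PII detection needs far more than one regular expression.

```python
import re

# Deliberately simple email pattern, for illustration only.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def redact_pii(text):
    """before_model-style step: scrub email addresses from user input."""
    return EMAIL.sub("[REDACTED_EMAIL]", text)


class ToolGuard:
    """wrap_tool_call-style wrapper: allowlist, call limit, and input/output logging."""

    def __init__(self, allowlist, max_calls):
        self.allowlist = set(allowlist)
        self.max_calls = max_calls
        self.calls = 0
        self.log = []

    def wrap(self, tool_name, tool_fn):
        def guarded(*args, **kwargs):
            if tool_name not in self.allowlist:
                raise PermissionError(f"tool '{tool_name}' is not on the allowlist")
            if self.calls >= self.max_calls:
                raise RuntimeError("tool-call limit reached for this run")
            self.calls += 1
            result = tool_fn(*args, **kwargs)
            self.log.append({"tool": tool_name, "args": args, "result": result})
            return result
        return guarded


clean = redact_pii("Contact jane.doe@example.com about the invoice")
print(clean)  # Contact [REDACTED_EMAIL] about the invoice

guard = ToolGuard(allowlist={"search"}, max_calls=2)
search = guard.wrap("search", lambda query: f"results for {query}")
delete = guard.wrap("delete_records", lambda table: f"dropped {table}")

first = search("pricing")
try:
    delete("users")
    blocked = False
except PermissionError:
    blocked = True
print(first, blocked)  # results for pricing True
```

Because the checks live in the wrapper, the tool functions and the graph logic stay unchanged, which is the point of intercepting at lifecycle boundaries.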
Code Execution Sandboxes
Agents often need to execute code — data transformations, calculations, or API orchestration. LangGraph's SandboxBackendProtocol auto-adds a secure execute tool, isolating code execution from the agent runtime.
Sandbox providers and their characteristics
| Provider | Model | Key Feature |
|---|---|---|
| Daytona | Containerized sandbox per session | Full Linux environment, pre-installed packages |
| Modal | Serverless Python sandbox | Cold-start optimized, GPU access |
| Runloop | Ephemeral VM sandbox | Isolated networking, custom images |
| LangSmith Sandboxes | Managed sandbox service | Integrated with LangSmith tracing and auth proxy |
Auth Proxy: Credentials (API keys, database connections) are injected via an auth proxy — the sandbox never sees raw secrets. The proxy authenticates outbound calls on the sandbox's behalf, maintaining zero-trust isolation.
Integration Surface
LangGraph exposes three integration protocols for connecting agents to external systems, other agents, and event-driven workflows — each serving different coupling and directionality needs.
Integration protocols and their use cases
| Protocol | Direction | Purpose |
|---|---|---|
| MCP servers | Bidirectional (tool + data source) | Agents call external tools; external systems call agents |
| Agent-to-Agent (A2A) | Cross-deployment | Agents across different deployments communicate via standardized protocol |
| Webhooks | Outbound POST | Agents trigger external workflows when state changes |
MCP (Model Context Protocol) servers expose tools and data sources that agents can discover and invoke at runtime. A2A enables multi-agent systems spanning organizational boundaries — an HR agent in one deployment can delegate to a payroll agent in another. Webhooks fire on state transitions, notifying external systems of completed runs or flagged decisions.
AWS Prescriptive Guidance — Workflow for Orchestration
Official AWS pattern describing how centralized orchestration decomposes complex requests into subtasks and delegates them to specialized worker agents, the architecture that MCP and A2A protocols standardize.
Cron & Scheduled Agents
LangGraph supports cron-triggered agent runs for periodic tasks like daily report generation, data sync, or proactive monitoring.
Scheduling modes for cron-triggered agents
| Mode | Behavior | Use Case |
|---|---|---|
| Stateful (for_thread) | Appends to existing conversation thread | Daily standup agent that maintains context across days |
| Stateless (create) | New thread per run | Hourly data quality check, weekly report generation |
Stateful cron jobs accumulate conversation history in a persistent thread — the Monday run can reference Friday's findings. Stateless cron jobs start fresh each time, ideal for idempotent tasks where past context would be noise.
Deployment & Packaging
Agents are packaged via deepagents.toml configuration, bundled by CLI, and deployed to LangSmith Deployment — supporting cloud, hybrid, and self-hosted modes.
Deployment modes and their characteristics
| Deployment Mode | Infrastructure | Best For |
|---|---|---|
| Cloud (LangSmith) | Fully managed, auto-scaling | Teams prioritizing speed over infrastructure control |
| Hybrid | LangGraph in your VPC, LangSmith for tracing/evals | Regulated industries with data residency requirements |
| Self-hosted | Full control, your Kubernetes cluster | Maximum customization, air-gapped environments |
The deepagents.toml file declaratively specifies agent configuration: graph definition, tool dependencies, environment variables, and resource limits. The CLI bundles this into a deployable artifact that LangSmith Deployment serves with automatic versioning and rollback.