Enterprise Integration & Data Architecture for AI Agents

Data foundation layer

Agents are only as good as their data access. The data foundation layer provides the knowledge bases, vector stores, structured APIs, document stores, and real-time feeds that agents use to retrieve evidence, validate context, and take informed action.

Figure: Enterprise agentic AI architectural layers showing Applications, Agents, Model Access, Tools, Knowledge Bases, and cross-cutting observability and security. AWS organizes enterprise agentic AI into distinct layers, from user-facing applications through agent orchestration to foundation model access, with observability and security spanning every tier. Source: AWS Prescriptive Guidance.

AWS Prescriptive Guidance: Enterprise Architecture for Agentic AI. Last verified: 2026-05-17

Data foundation components for agent systems

Component	Purpose	Design consideration
Knowledge bases	Curated, domain-specific reference data	Maintainability, update cadence, and source attribution
Vector stores	Semantic search and retrieval for RAG workflows	Embedding model choice, chunking strategy, and freshness
Structured APIs	Transaction systems and operational data	API design, rate limits, and authentication
Document stores	Unstructured content such as policies, manuals, or contracts	Indexing strategy, access controls, and versioning
Real-time feeds	Live data streams for time-sensitive decisions	Latency, ordering guarantees, and failure handling

Data governance is critical for agent-accessible data. Ensure that agents only access data they are authorized to see, that retrieval is auditable, and that stale or incorrect data does not lead to flawed decisions. Treat data products as contracts with clear schemas and service level objectives.
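One way to make "data products as contracts" concrete is to check every retrieved record against a declared schema, freshness objective, and access list before the agent uses it. The sketch below is a minimal illustration under stated assumptions: the DataContract class, its fields, and check_record are hypothetical names, not part of any AWS or Salesforce API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class DataContract:
    """Hypothetical contract for one agent-accessible data product."""
    name: str
    required_fields: dict[str, type]   # field name -> expected Python type
    max_staleness: timedelta           # freshness objective
    allowed_roles: frozenset[str]      # roles permitted to read


def check_record(contract, record, updated_at, role, now=None):
    """Return a list of violations; an empty list means the record is usable."""
    now = now or datetime.now(timezone.utc)
    violations = []
    if role not in contract.allowed_roles:
        violations.append(f"role '{role}' is not authorized")
    for field, expected in contract.required_fields.items():
        if field not in record:
            violations.append(f"missing field '{field}'")
        elif not isinstance(record[field], expected):
            violations.append(f"field '{field}' is not {expected.__name__}")
    if now - updated_at > contract.max_staleness:
        violations.append("record is older than the freshness objective")
    return violations
```

Returning the violations instead of raising keeps the check auditable: the caller can log exactly why a record was withheld from the agent.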

The Agentic Enterprise — Data Layer and Semantic Layer

Salesforce Architect guide covering the Data Layer (VectorDB, Lakehouse, Data Contracts, AI-Ready Data Fabric) and the Semantic Layer (Enterprise Knowledge Graph, Semantic Query Engine) that underpin agent reasoning.

Knowledge Bases for Amazon Bedrock

AWS documentation on managed RAG capabilities, enabling foundation models and agents to access company data for grounded responses.

AWS Prescriptive Guidance — Enterprise Architecture for Agentic AI

Official AWS architectural guidance depicting the full stack of enterprise agentic AI — from applications and agents through model access, tools, and knowledge bases, with observability and security spanning every tier.

Agent runtime and execution environments

The agent runtime determines where agents execute and how state is managed. Runtime choices affect isolation, persistence, scaling, and observability.

Runtime considerations for agent systems

Consideration	What it means for agents	Example implementation
Execution isolation	How agents are separated from each other and from the host	Containers, sandboxes, or serverless functions
Session persistence	How conversational and workflow state is stored	In-memory cache, durable store, or distributed cache
Timeout handling	How long-running operations are managed	Async workflows, polling, or webhooks
Resource limits	Constraints on compute, memory, and concurrency	Quotas, throttling, or auto-scaling policies
Scaling model	How the system responds to load	Horizontal scaling, function auto-scaling, or provisioned capacity

Agent runtime request flow

Figure: Tool-based agent architecture, in which a query flows through tool search and LLM selection to tool execution and response. Tool-based agents extend LLM reasoning with external function calls. The agent runtime manages tool discovery, selection, execution, and result integration, which is the integration surface that MCP and A2A standardize. Source: AWS Prescriptive Guidance.

AWS Prescriptive Guidance: Tool-Based Agents for Calling Functions. Last verified: 2026-05-17

Session state management patterns vary in complexity. In-memory state is simple but lost on restart. Persisted state survives restarts but adds latency. Distributed state scales across instances but requires coordination. Choose based on session durability requirements and scaling needs.
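The trade-off between in-memory and persisted state is easier to see when both sit behind the same interface. The following is a minimal sketch, assuming SQLite as a stand-in for whatever durable store you would actually use; the class names are hypothetical.

```python
import json
import sqlite3


class InMemorySessionStore:
    """Fast, but state disappears when the process restarts."""

    def __init__(self):
        self._data = {}

    def save(self, session_id, state):
        self._data[session_id] = dict(state)

    def load(self, session_id):
        return self._data.get(session_id)


class SqliteSessionStore:
    """Durable: state is written to disk and survives a restart."""

    def __init__(self, path):
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS sessions (id TEXT PRIMARY KEY, state TEXT)"
        )

    def save(self, session_id, state):
        with self._conn:  # commits on success, rolls back on error
            self._conn.execute(
                "INSERT OR REPLACE INTO sessions (id, state) VALUES (?, ?)",
                (session_id, json.dumps(state)),
            )

    def load(self, session_id):
        row = self._conn.execute(
            "SELECT state FROM sessions WHERE id = ?", (session_id,)
        ).fetchone()
        return json.loads(row[0]) if row else None

    def close(self):
        self._conn.close()
```

Because both classes expose save and load, an agent runtime can start with the in-memory version and switch to the durable one when session durability becomes a requirement. A distributed cache would slot in behind the same two methods.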

AWS Prescriptive Guidance — Tool-Based Agents for Calling Functions

Official AWS pattern describing how tool-based agents extend LLM reasoning with external function calls, covering tool discovery, selection, execution, and result integration.

Scaling and resilience patterns

Agent systems need production-grade reliability. Resilience patterns prevent cascading failures, handle overload gracefully, and ensure that the system can recover from errors without losing data or trust.

Resilience patterns for agent workflows

Pattern	What it does	When to use it
Circuit breaker	Stops calling a failing service after a threshold	Downstream services are failing or timing out
Rate limiting	Throttles requests to protect downstream systems	APIs have quotas or scale limits
Backpressure	Signals the producer to slow down when the consumer is overwhelmed	Processing pipelines are congested
Retry with backoff	Retries failed operations with increasing delays	Transient failures are common
Graceful degradation	Reduces functionality instead of failing completely	Non-critical features are unavailable
Dead letter queue	Captures failed messages for later inspection and retry	Message processing fails and needs manual review
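Two of the patterns in the table, circuit breaker and retry with backoff, are often combined around a single downstream call. The sketch below shows one simple way to do that; the class and function names, thresholds, and delays are illustrative assumptions rather than a recommended configuration.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when the breaker refuses to call the downstream service."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at = None

    def call(self, fn, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; skipping call")
            # Cool-down elapsed: allow one trial call (half-open state).
            self._opened_at = None
            self._failures = self.failure_threshold - 1
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        self._failures = 0
        return result


def retry_with_backoff(fn, attempts=4, base_delay=0.5, sleep=time.sleep):
    """Retry fn, doubling the delay after each failure; re-raise the last error."""
    for attempt in range(attempts):
        try:
            return fn()
        except CircuitOpenError:
            raise  # do not hammer a service the breaker has isolated
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))
```

The clock and sleep parameters are injected so the behavior can be tested without waiting. In production you would usually add random jitter to the delay so that many agents do not retry at the same instant.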

Cost-aware scaling is essential because agent invocations are expensive. Cache retrieval results where possible, batch requests when appropriate, and monitor token usage and latency as first-class operational metrics. Set budgets and alerts to prevent cost overruns.

Monitor token usage and latency as first-class operational metrics

Every agent invocation consumes tokens and time. Track average and p95 latency, token counts per request, and cost per invocation. Use these metrics to detect anomalies, optimize prompts, and set budget controls.
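As a rough illustration of these metrics, the sketch below computes average latency, p95 latency (nearest-rank method), average token count, and average cost per invocation from a list of recorded calls. The Invocation fields and the per-1,000-token prices passed to summarize are placeholders, not real pricing.

```python
import math
from dataclasses import dataclass


@dataclass
class Invocation:
    latency_ms: float
    input_tokens: int
    output_tokens: int


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(invocations, usd_per_1k_input, usd_per_1k_output):
    """Aggregate latency, token, and cost metrics for a batch of invocations."""
    latencies = [i.latency_ms for i in invocations]
    costs = [
        i.input_tokens / 1000 * usd_per_1k_input
        + i.output_tokens / 1000 * usd_per_1k_output
        for i in invocations
    ]
    total_tokens = sum(i.input_tokens + i.output_tokens for i in invocations)
    return {
        "avg_latency_ms": sum(latencies) / len(latencies),
        "p95_latency_ms": percentile(latencies, 95),
        "avg_tokens": total_tokens / len(invocations),
        "avg_cost_usd": sum(costs) / len(costs),
    }
```

Tracking p95 alongside the average matters because a small number of very slow or very long invocations can dominate cost and user experience while barely moving the mean.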

AWS treats memory as a runtime substrate, not just chat history

The AWS memory-augmented agent pattern is useful because it reframes memory as an execution dependency. Memory is not only previous messages. It includes recent task state, retrieved long-term facts, replayable episodes, and strategy artifacts that must be injected and updated intentionally across the agent lifecycle.

Figure: Official AWS memory-augmented agent diagram showing short-term and long-term memory feeding the reasoning loop and being updated after interaction. AWS memory-augmented agents treat memory as a first-class part of the runtime loop, not a UI convenience. Source: AWS Prescriptive Guidance.

AWS Prescriptive Guidance: Memory-augmented agents. Last verified: 2026-05-17

Memory layers that matter in enterprise runtimes

Memory layer	What it stores	Why runtime architecture matters
Short-term memory	Recent dialogue turns, task context, current system state, and in-flight constraints	Needs fast reads and controlled prompt injection so the agent can act coherently within a session
Long-term memory	Durable facts, preferences, histories, knowledge-base content, and retrieved records	Needs freshness, access control, and clear retrieval policy so old context does not poison new decisions
Episodic memory	Past successes, failures, and scenario-specific traces that can shape future behavior	Needs selective replay instead of naive append-only history, or prompts become bloated and misleading
Shared workflow memory	Cross-agent state, handoff context, and durable progress markers	Needs explicit ownership and conflict control if more than one agent can read or write the same state

This is why memory belongs in enterprise integration discussions. Once memory influences routing, tool choice, or answer quality, teams need decisions about persistence stores, eviction, redaction, identity scoping, and recovery after failure. AWS lists DynamoDB, Redis, S3, RDS, and Bedrock-adjacent retrieval surfaces because the memory problem is operational as much as cognitive.

AWS: Memory-augmented agents

Defines short-term and long-term memory in agent loops and maps those concerns to AWS persistence and retrieval services.

Durable Execution & Checkpointing

Production agents must survive crashes, deployments, and pauses. LangGraph's durable execution model writes a checkpoint after every super-step, keyed by thread_id, so agents can resume from the last completed step. The setup described here is PostgreSQL-backed, but the checkpointer is pluggable.

Durable execution primitives

Primitive	Purpose	Behavior
Super-step checkpointing	State persistence	Checkpoint written after every super-step (one or more nodes that run in the same step); keyed by thread_id for retrieval
Task queue with lease/retry	Crash recovery	Workers acquire lease on task; if worker crashes, lease expires and task is retried by another worker
Resumability	Continuity across interruptions	Agent picks up from last checkpoint after crash, deployment, or intentional pause; checkpointed state is preserved and only the unfinished step runs again
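The checkpoint-and-resume idea in the table can be sketched without any framework. The code below is a toy model, not LangGraph's API: Checkpointer and run are hypothetical names, SQLite stands in for PostgreSQL, and each "step" is a plain function from state to state.

```python
import json
import sqlite3


class Checkpointer:
    """Toy checkpoint store keyed by thread_id (SQLite stands in for PostgreSQL)."""

    def __init__(self, path=":memory:"):
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS checkpoints "
            "(thread_id TEXT PRIMARY KEY, next_step INTEGER, state TEXT)"
        )

    def save(self, thread_id, next_step, state):
        with self._conn:
            self._conn.execute(
                "INSERT OR REPLACE INTO checkpoints VALUES (?, ?, ?)",
                (thread_id, next_step, json.dumps(state)),
            )

    def load(self, thread_id):
        row = self._conn.execute(
            "SELECT next_step, state FROM checkpoints WHERE thread_id = ?",
            (thread_id,),
        ).fetchone()
        return (row[0], json.loads(row[1])) if row else (0, {})


def run(thread_id, steps, checkpointer):
    """Run steps in order, resuming from the last checkpoint for this thread."""
    start, state = checkpointer.load(thread_id)
    for index in range(start, len(steps)):
        state = steps[index](state)
        checkpointer.save(thread_id, index + 1, state)  # checkpoint after each step
    return state
```

If a step raises, nothing is saved for it, so calling run again with the same thread_id repeats only that step. This is why steps with external side effects should be idempotent.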

Durable Execution Checkpoint Lifecycle

Memory Architecture

LangGraph provides a two-tier memory system: short-term checkpoint-based state for conversation context, and long-term persistent store for cross-session knowledge.

LangGraph two-tier memory system

Memory Tier	Backing Store	Scope	Use Cases
Short-term (checkpoint)	PostgreSQL via checkpoint saver	Per-thread, survives crashes	Conversation context, current task state, tool call history
Long-term (persistent store)	Key-value store with namespace tuples	Cross-session, by (user_id, 'memories') namespace	User preferences, learned facts, cross-conversation knowledge

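Long-term memory supports semantic search via embeddings when the store is configured with an embedding index. Store memories with store.put((user_id, 'memories'), key, value) and retrieve them with store.search((user_id, 'memories'), query='...'), which takes a natural-language query string rather than a precomputed embedding. Use short-term memory for ephemeral context and long-term memory for knowledge that should persist across sessions.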
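To show how namespace scoping and similarity search fit together, here is a small stand-alone sketch. It is not LangGraph's store implementation: MemoryStore is a hypothetical class, and embed is a toy bag-of-words function standing in for a real embedding model.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words 'embedding'; a real system would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class MemoryStore:
    def __init__(self):
        self._items = {}  # (namespace tuple, key) -> value dict

    def put(self, namespace, key, value):
        self._items[(tuple(namespace), key)] = value

    def search(self, namespace, query, limit=3):
        """Return (key, value, score) tuples for one namespace, best match first."""
        q = embed(query)
        scored = [
            (key, value, cosine(q, embed(value["text"])))
            for (ns, key), value in self._items.items()
            if ns == tuple(namespace)
        ]
        return sorted(scored, key=lambda item: item[2], reverse=True)[:limit]
```

The namespace filter runs before scoring, so one user's memories can never surface in another user's results regardless of how similar the text is.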

Multi-Tenancy & Security

LangGraph Platform provides three authentication and authorization layers: Custom Auth for user identity, Authorization Handlers for resource-level access control, and Agent Auth (OAuth) for third-party API credentials.

LangGraph Platform auth layers

Auth Layer	Mechanism	Responsibility
Custom Auth	validate → identity	Authenticate users, extract user_id/organization from tokens or API keys
Authorization Handlers	Per-resource filter functions	Scope which threads, assistants, or runs a user can access
Agent Auth (OAuth)	OAuth2 flow for 3rd-party credentials	Agents access external APIs (Gmail, Slack, GitHub) on behalf of users

RBAC, namespace isolation, and resource quotas are implemented at the authorization layer. Each tenant's threads, runs, and cron jobs are isolated by namespace — one tenant cannot access another's data.
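The split between authentication (who is calling) and authorization (what they may see) can be illustrated in plain Python. This is a conceptual sketch only; the function and class names are hypothetical and do not mirror the LangGraph Platform auth API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Identity:
    user_id: str
    org_id: str


@dataclass
class Thread:
    thread_id: str
    metadata: dict = field(default_factory=dict)


def authenticate(api_keys, token):
    """Authentication: map a credential to an identity, or refuse."""
    identity = api_keys.get(token)
    if identity is None:
        raise PermissionError("unknown credential")
    return identity


def on_create(identity, thread):
    """Authorization on write: stamp new resources with the owner's tenant."""
    thread.metadata["org_id"] = identity.org_id
    thread.metadata["owner"] = identity.user_id
    return thread


def visible_threads(identity, threads):
    """Authorization on read: a tenant only sees threads stamped with its org_id."""
    return [t for t in threads if t.metadata.get("org_id") == identity.org_id]
```

Stamping ownership at creation time and filtering on every read means isolation does not depend on each caller remembering to pass the right tenant filter.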

Human-in-the-Loop Runtime

LangGraph's human-in-the-loop primitives — interrupt() and Command(resume=...) — enable agents to pause mid-execution, surface decisions for human review, and resume with structured feedback, all without losing state.

HITL Interrupt/Resume Flow

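Dynamic placement means interrupt() can be called anywhere in your graph code: before a sensitive tool call, after an LLM response, or at decision boundaries. The pause writes a checkpoint and frees the worker. On resume, the interrupted node runs again from its start, and interrupt() returns the JSON-serializable value passed in Command(resume=...), which the graph can branch on. Keep any side effects placed before interrupt() idempotent, because they execute again on resume.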

Integration patterns: approval gates (interrupt before destructive actions), escalation (interrupt when confidence falls below a threshold and route to a human), and interactive debugging (pause mid-graph to inspect state).
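The approval-gate pattern can be modeled with a Python generator, where yield plays the role of the pause and send delivers the human's decision. This is an analogy, not LangGraph code: the function names are hypothetical, and unlike a checkpointed graph the generator lives only in memory.

```python
def refund_workflow(order_id, amount):
    """Generator-based stand-in for a graph node that pauses for approval."""
    if amount > 100:
        # Pause here: hand a review payload to the caller and wait for a decision.
        decision = yield {
            "question": "Approve refund?",
            "order_id": order_id,
            "amount": amount,
        }
    else:
        decision = {"approved": True, "by": "auto"}
    if decision["approved"]:
        return f"refunded {amount} for {order_id}"
    return f"refund for {order_id} rejected by {decision['by']}"


def start(workflow):
    """Run until the first pause; return (payload, None) or (None, result)."""
    try:
        return next(workflow), None
    except StopIteration as finished:
        return None, finished.value


def resume(workflow, decision):
    """Feed the human's decision back in and run to completion."""
    try:
        workflow.send(decision)
    except StopIteration as finished:
        return finished.value
    raise RuntimeError("workflow paused again; this sketch handles one pause only")
```

A caller would show the payload from start to a reviewer, then pass the reviewer's structured answer to resume. Small refunds skip the pause entirely, which is the escalation-by-threshold idea in its simplest form.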

Streaming & Concurrency

LangGraph supports multiple streaming modes and concurrency strategies to handle real-time progress updates, multi-user dashboards, and simultaneous message handling in production deployments.

LangGraph streaming modes

Streaming Mode	Granularity	Protocol	Use Case
Run streaming	Per-step or per-token	SSE events per run	Real-time progress updates during single agent execution
Thread streaming	Long-lived connection	SSE with Last-Event-ID for resumption	Dashboards, multi-user views of ongoing conversations

Double-Texting Strategies

Strategy	Behavior	When to use it
Enqueue (default, safest)	New messages queue behind the active run	Most production agents
Reject	Returns an error if a run is active	Latency-sensitive APIs where stale requests should fail fast
Interrupt	Cancels the current run and starts a new one	The latest user intent always takes priority
Rollback	Interrupts, reverts to the prior checkpoint, and replays with the new input	Message ordering must be preserved
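The four strategies differ only in what happens when a message arrives while a run is active. The sketch below models that decision with a toy thread object. It follows the descriptions given here rather than the platform's internals, and every name in it is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


class RunActiveError(RuntimeError):
    """Raised by the 'reject' strategy when a run is already in progress."""


@dataclass
class Thread:
    state: list = field(default_factory=list)       # messages seen so far
    checkpoint: list = field(default_factory=list)  # state before the active run
    active: Optional[str] = None                    # input of the run in progress
    queue: list = field(default_factory=list)       # inputs waiting their turn


def _start(thread, message):
    thread.checkpoint = list(thread.state)  # snapshot before the run begins
    thread.active = message
    thread.state.append(message)


def submit(thread, message, strategy="enqueue"):
    """Apply a double-texting strategy and report what happened."""
    if thread.active is None:
        _start(thread, message)
        return "started"
    if strategy == "enqueue":
        thread.queue.append(message)
        return "queued"
    if strategy == "reject":
        raise RunActiveError("a run is already active on this thread")
    if strategy == "interrupt":
        _start(thread, message)  # cancel the active run, keep state as it stands
        return "interrupted"
    if strategy == "rollback":
        thread.state = list(thread.checkpoint)  # discard the active run's changes
        _start(thread, message)
        return "rolled_back"
    raise ValueError(f"unknown strategy: {strategy}")
```

The visible difference between interrupt and rollback is in thread.state afterwards: interrupt leaves the cancelled run's input in place, while rollback restores the snapshot taken before that run started.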

Middleware & Guardrails at Runtime

The hooks described here come from LangChain's agent middleware, which runs on the LangGraph runtime. They let you intercept and modify agent behavior at specific lifecycle points without changing graph logic. Hooks run in-process and can transform inputs, log, block, or augment calls.

Hook lifecycle in a single agent invocation

LangGraph hook types and their use cases

Hook	Fires When	Use Case
before_model	Before each LLM call	Inject system prompts, redact PII from user input
wrap_model_call	Wraps the LLM invocation	Add retry logic, rate limiting, caching
wrap_tool_call	Wraps each tool execution	Log tool inputs/outputs, enforce allowlists
after_model	After LLM response received	Validate output schema, block disallowed content, trigger fallbacks

Built-in guardrails include PII redaction, tool-call limits (max N invocations per run), automatic retry on transient failures, fallback to simpler models on timeout, context window summarization, and content moderation filters.
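Two of these guardrails, PII redaction before the model call and an allowlist with a call limit around tool execution, are simple enough to sketch in plain Python. The code below illustrates the idea only. It does not use the middleware API, the names are hypothetical, and the email pattern is deliberately simplistic.

```python
import re

# Simplistic email pattern for illustration; real PII detection needs more than this.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def redact_pii(text):
    """before_model-style step: mask email addresses before the prompt is sent."""
    return EMAIL.sub("[REDACTED_EMAIL]", text)


class ToolGuard:
    """wrap_tool_call-style wrapper: allowlist plus a per-run call limit."""

    def __init__(self, allowed, max_calls):
        self.allowed = set(allowed)
        self.max_calls = max_calls
        self.calls = 0
        self.log = []

    def __call__(self, name, tool, **kwargs):
        if name not in self.allowed:
            raise PermissionError(f"tool '{name}' is not on the allowlist")
        if self.calls >= self.max_calls:
            raise RuntimeError("tool-call limit reached for this run")
        self.calls += 1
        result = tool(**kwargs)
        self.log.append((name, kwargs, result))  # audit trail of inputs and outputs
        return result
```

Creating one ToolGuard per run keeps the call counter scoped to that run, which matches the "max N invocations per run" limit described above.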

Code Execution Sandboxes

Agents often need to execute code for data transformations, calculations, or API orchestration. In the Deep Agents library, which builds on LangGraph, a backend that implements SandboxBackendProtocol automatically adds an execute tool, so code runs in the sandbox rather than inside the agent runtime.

Sandbox providers and their characteristics

Provider	Model	Key Feature
Daytona	Containerized sandbox per session	Full Linux environment, pre-installed packages
Modal	Serverless Python sandbox	Cold-start optimized, GPU access
Runloop	Ephemeral VM sandbox	Isolated networking, custom images
LangSmith Sandboxes	Managed sandbox service	Integrated with LangSmith tracing and auth proxy

Auth Proxy: Credentials (API keys, database connections) are injected via an auth proxy — the sandbox never sees raw secrets. The proxy authenticates outbound calls on the sandbox's behalf, maintaining zero-trust isolation.

Integration Surface

LangGraph exposes three integration protocols for connecting agents to external systems, other agents, and event-driven workflows — each serving different coupling and directionality needs.

Integration protocols and their use cases

Protocol	Direction	Purpose
MCP servers	Bidirectional (tool + data source)	Agents call external tools; external systems call agents
Agent-to-Agent (A2A)	Cross-deployment	Agents across different deployments communicate via standardized protocol
Webhooks	Outbound POST	Agents trigger external workflows when state changes

Figure: Workflow orchestration pattern, in which a central orchestrator decomposes tasks and delegates to specialized worker agents. Centralized orchestration decomposes complex requests into subtasks and delegates them to specialized worker agents. MCP and A2A protocols standardize the delegation and response flow across this architecture. Source: AWS Prescriptive Guidance.

AWS Prescriptive Guidance: Workflow for Orchestration. Last verified: 2026-05-17

MCP (Model Context Protocol) servers expose tools and data sources that agents can discover and invoke at runtime. A2A enables multi-agent systems spanning organizational boundaries — an HR agent in one deployment can delegate to a payroll agent in another. Webhooks fire on state transitions, notifying external systems of completed runs or flagged decisions.

AWS Prescriptive Guidance — Workflow for Orchestration

Official AWS pattern describing how centralized orchestration decomposes complex requests into subtasks and delegates them to specialized worker agents, the architecture that MCP and A2A protocols standardize.

Cron & Scheduled Agents

LangGraph supports cron-triggered agent runs for periodic tasks like daily report generation, data sync, or proactive monitoring.

Scheduling modes for cron-triggered agents

Mode	Behavior	Use Case
Stateful (for_thread)	Appends to existing conversation thread	Daily standup agent that maintains context across days
Stateless (create)	New thread per run	Hourly data quality check, weekly report generation

Stateful cron jobs accumulate conversation history in a persistent thread — the Monday run can reference Friday's findings. Stateless cron jobs start fresh each time, ideal for idempotent tasks where past context would be noise.

Deployment & Packaging

Agents are packaged via deepagents.toml configuration, bundled by CLI, and deployed to LangSmith Deployment — supporting cloud, hybrid, and self-hosted modes.

Deployment modes and their characteristics

Deployment Mode	Infrastructure	Best For
Cloud (LangSmith)	Fully managed, auto-scaling	Teams prioritizing speed over infrastructure control
Hybrid	LangGraph in your VPC, LangSmith for tracing/evals	Regulated industries with data residency requirements
Self-hosted	Full control, your Kubernetes cluster	Maximum customization, air-gapped environments

The deepagents.toml file declaratively specifies agent configuration: graph definition, tool dependencies, environment variables, and resource limits. The CLI bundles this into a deployable artifact that LangSmith Deployment serves with automatic versioning and rollback.


