Bedrock AgentCore Stateful MCP Servers: Elicitation, Sampling, and Long-Lived Context

Written by Bits Lovers on 10 Apr 2026

On March 10, 2026, AWS added stateful MCP server features to Amazon Bedrock AgentCore Runtime. If you only read the headline, it sounds like a protocol update. It is more important than that.

Stateless tools are fine for one request and one answer. Real engineering workflows are rarely that clean. An incident investigation needs to ask follow-up questions, stream progress while logs are loading, and hold onto context across multiple turns. A provisioning flow needs missing metadata before it can continue. A coding agent may need a server-side tool to ask for confirmation before it runs an expensive or risky step.

That is what stateful MCP changes. AWS now supports elicitation, sampling, and progress notifications for MCP servers deployed to AgentCore Runtime, with session context maintained through an Mcp-Session-Id header. If you are already building with AWS Bedrock Agents for DevOps, or with shell commands plus session storage in AgentCore, this is the tool-side state model that makes those systems feel like real runtimes instead of stitched-together RPC calls.

What AWS Actually Shipped

The AWS launch announcement says Bedrock AgentCore Runtime now supports:

elicitation, where the server asks the client or user for missing information during tool execution
sampling, where the server requests model-generated text from the client during the workflow
progress notifications, where the server emits incremental updates for long-running work

AWS also says these stateful MCP sessions run with isolated resources per session and maintain context across multiple interactions using the Mcp-Session-Id header.

That header is the key detail. In the AWS documentation, AgentCore returns Mcp-Session-Id during initialization. The client must send it back on later requests to keep the session alive. If the server terminates or the session expires, requests can return 404 and the client must re-initialize. That is not trivia. It is the difference between a reliable stateful workflow and a mysterious intermittent failure.

AWS launched the feature in fourteen Regions on March 10, 2026, which is broad enough that most teams already experimenting with AgentCore can test it without moving workloads.

Why Stateless Tools Break First

Most first-generation tool integrations are built like this:

Client asks model what tool to call.
Tool gets a JSON payload.
Tool returns a JSON payload.
Anything interactive is shoved back into prompt context.

That works until the workflow becomes multi-step.

Take a sandbox account request. The user provides application name and region but forgets the cost center. A stateless tool either fails hard or returns a cryptic validation error that the model has to interpret. Then the client starts reconstructing context manually in the next prompt turn. The more steps you add, the uglier that gets.

Stateful MCP lets the tool server own its own interaction lifecycle. That is the right place for it. The tool knows what information is still missing, what stage it is in, and whether it is still waiting on input or actively doing work.
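A minimal sketch of that idea, using the sandbox account request above. The class and field names (SandboxRequest, cost_center, and so on) are hypothetical and not part of any AWS or MCP API. The only point is that the tool, rather than the prompt, tracks what is still missing and which stage it is in.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class SandboxRequest:
    """Hypothetical server-side state for one sandbox account request."""
    application: Optional[str] = None
    region: Optional[str] = None
    cost_center: Optional[str] = None

    def missing_fields(self) -> list[str]:
        # Fields the tool still has to ask for before it can continue.
        return [f.name for f in fields(self) if getattr(self, f.name) is None]

    def stage(self) -> str:
        # The tool, not the prompt, knows where the interaction stands.
        return "waiting_for_input" if self.missing_fields() else "ready"


# The user gave application and region but forgot the cost center.
request = SandboxRequest(application="billing-api", region="us-east-1")
print(request.stage(), request.missing_fields())

# A later turn supplies the missing value (illustrative value only).
request.cost_center = "CC-1234"
print(request.stage(), request.missing_fields())
```

In a stateful server this object would live for the duration of the session, so the client never has to rebuild it from prompt text.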

The Three Features That Matter

Elicitation

Elicitation is the most immediately useful feature. The server can ask for missing data instead of pretending every tool contract must be satisfied on the first try.

That is valuable for:

approval flows
provisioning requests
support intake
security exceptions
any workflow where humans provide incomplete data

In practice, this reduces the amount of brittle prompt glue you need in the client app.
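One detail worth planning for: an elicitation request can come back accepted, declined, or cancelled, and the tool needs a defined next step for each. The sketch below is plain Python with a hypothetical ElicitationReply stand-in rather than a real library type. The action names follow the MCP specification as I understand it, and the step names the function returns are my own.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ElicitationReply:
    """Hypothetical stand-in for the client's answer to an elicitation request."""
    action: str                 # "accept", "decline", or "cancel"
    data: Optional[str] = None  # only meaningful when action == "accept"


def resolve_cost_center(reply: ElicitationReply) -> tuple[str, Optional[str]]:
    """Map an elicitation reply to the tool's next step."""
    if reply.action == "accept" and reply.data and reply.data.strip():
        return ("continue", reply.data.strip())
    if reply.action == "accept":
        # Accepted but empty: ask again rather than failing the tool call.
        return ("ask_again", None)
    if reply.action == "decline":
        return ("stop_declined", None)
    # "cancel" or anything unexpected: stop without side effects.
    return ("stop_cancelled", None)


print(resolve_cost_center(ElicitationReply("accept", " CC-1234 ")))
print(resolve_cost_center(ElicitationReply("decline")))
```

Keeping this mapping in the server is what removes the prompt glue: the client only has to render the question and return the answer.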

Sampling

Sampling is subtler. AWS defines it as the server requesting LLM-generated content from the client during tool execution.

That matters when the tool itself benefits from AI-generated text, summaries, or recommendations at a specific point in the workflow. It turns the MCP server from a dead endpoint into an active participant in the interaction.

Used badly, that is extra complexity. Used well, it means the server can request model help only when it needs it, while still owning the deterministic flow around it.
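To make "owning the deterministic flow" concrete, here is a small sketch in which the model call is injected as a plain function. The sample callable, the fallback text, and the word limit are all assumptions for illustration. In a real server the callable would wrap whatever sampling call your MCP framework provides.

```python
from typing import Callable


def summarize_with_guardrails(
    sample: Callable[[str], str],
    alarm_name: str,
    max_words: int = 60,
) -> str:
    """Ask for model text at one step, but keep the surrounding flow deterministic."""
    prompt = f"Write a short incident summary for alarm {alarm_name}."
    text = sample(prompt).strip()
    if not text:
        # Deterministic fallback if the client returns nothing usable.
        return f"No summary available for {alarm_name}."
    # The server, not the model, enforces the length limit.
    return " ".join(text.split()[:max_words])


def fake_sample(prompt: str) -> str:
    # Stand-in for a real sampling round trip; deliberately too long.
    return "CPU alarm fired after a deploy. " * 30


print(summarize_with_guardrails(fake_sample, "high-cpu", max_words=10))
```

The model contributes text at exactly one point. Validation, limits, and fallbacks stay in ordinary code that can be tested without a model.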

Progress Notifications

Progress notifications are operationally underrated.

Long-running tools are miserable when they go silent. Is the search still running? Did it deadlock? Is it waiting on another system? AWS now supports progress notifications so the client can show stages or percentages while the work continues.

That becomes very practical for:

log and trace searches
data backfills
ticket enrichment
approval orchestration
any tool that might take more than a few seconds

It also pairs well with observability. If you already use AWS CloudWatch Deep Dive patterns for traceability, progress events give you one more signal to understand whether the tool is slow, blocked, or just busy.
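On the wire, a progress update is a small JSON-RPC notification. The sketch below builds one in the shape I understand the MCP specification to describe: method notifications/progress with progressToken, progress, and optional total and message fields. The helper names are hypothetical, and the exact fields are worth checking against the spec version your client targets.

```python
import json
from typing import Optional


def progress_notification(token: str, progress: float,
                          total: Optional[float] = None,
                          message: Optional[str] = None) -> dict:
    """Build a JSON-RPC progress notification (shape assumed from the MCP spec)."""
    params = {"progressToken": token, "progress": progress}
    if total is not None:
        params["total"] = total
    if message is not None:
        params["message"] = message
    return {"jsonrpc": "2.0", "method": "notifications/progress", "params": params}


def percent_complete(notification: dict) -> Optional[int]:
    """Client-side helper: a percentage if a total was sent, otherwise None."""
    params = notification["params"]
    total = params.get("total")
    if not total:
        return None
    return round(100 * params["progress"] / total)


note = progress_notification("search-1", 2, total=3, message="Scanning log groups")
print(json.dumps(note))
print(percent_complete(note))
```

When the total is unknown, the client can still show the message as a stage label, which is often enough to tell "busy" apart from "stuck".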

The Minimum Stateful MCP Configuration

AWS is explicit in the documentation: elicitation, sampling, and progress notifications require stateful mode, which means running the MCP server with stateless_http=False.

This is the smallest meaningful server shape:

from fastmcp import FastMCP, Context
import asyncio

mcp = FastMCP("incident-investigator")

@mcp.tool()
async def investigate_alarm(ctx: Context, alarm_name: str | None = None) -> str:
    # Elicitation: ask the client for the missing input instead of failing.
    if not alarm_name:
        result = await ctx.elicit(
            message="Which CloudWatch alarm should I investigate?",
            response_type=str,
        )
        # Anything other than "accept" (decline or cancel) ends the tool call.
        if result.action != "accept":
            return "Investigation cancelled."
        alarm_name = result.data

    # Progress: report each stage so the client is not left guessing.
    await ctx.report_progress(progress=1, total=3)
    await asyncio.sleep(0.5)  # placeholder for the real lookup work

    await ctx.report_progress(progress=2, total=3)
    # Sampling: request model-generated text from the client.
    summary = await ctx.sample(
        messages=f"Write a short incident summary for alarm {alarm_name}. Max 60 words.",
        max_tokens=120,
    )

    await ctx.report_progress(progress=3, total=3)
    return f"Investigation prepared for {alarm_name}: {summary.text}"

if __name__ == "__main__":
    # stateless_http=False enables stateful mode, which all three features require.
    mcp.run(
        transport="streamable-http",
        host="0.0.0.0",
        port=8000,
        stateless_http=False,
    )

That single flag changes the execution model. Without it, the interactive features are not available.

Session Handling Is Where Teams Get Burned

The AWS docs are unusually clear here:

the server returns Mcp-Session-Id during initialize
clients must send it in subsequent requests
expired or terminated sessions may return 404
clients must re-initialize when that happens

That means your client logic needs to treat session management as a first-class concern, not a hidden transport detail.

A practical rule:

if the tool interaction has multiple turns, persist the session ID alongside the workflow state
if you receive 404, re-initialize and restart the interaction cleanly
do not silently retry stateful calls without understanding whether the tool operation was idempotent

That last point matters a lot. A read-only search can be retried. A provisioning request or an approval step may not be safe to replay casually.
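Here is a rough sketch of those rules in client code. The transport is abstracted as a plain callable so the example runs without a network. StatefulClient, SessionExpired, and FakeServer are hypothetical names, and a real client would send the header over HTTP and persist the session ID next to its workflow state.

```python
from typing import Callable, Optional

# transport(headers, payload) -> (status_code, response_headers, body)
Transport = Callable[[dict, dict], tuple[int, dict, dict]]


class SessionExpired(Exception):
    """Raised when a non-idempotent call hits an expired session."""


class StatefulClient:
    def __init__(self, transport: Transport):
        self.transport = transport
        self.session_id: Optional[str] = None

    def initialize(self) -> None:
        status, headers, _ = self.transport({}, {"method": "initialize"})
        if status != 200:
            raise RuntimeError(f"initialize failed with status {status}")
        self.session_id = headers["Mcp-Session-Id"]

    def call(self, payload: dict, idempotent: bool) -> dict:
        if self.session_id is None:
            self.initialize()
        status, _, body = self.transport({"Mcp-Session-Id": self.session_id}, payload)
        if status != 404:
            return body
        # 404: the session is gone, so always re-initialize.
        self.initialize()
        if not idempotent:
            # Do not replay; the caller restarts the interaction deliberately.
            raise SessionExpired("session re-initialized; call was not replayed")
        _, _, body = self.transport({"Mcp-Session-Id": self.session_id}, payload)
        return body


class FakeServer:
    """In-memory stand-in for a stateful MCP endpoint, for demonstration only."""
    def __init__(self):
        self.count = 0
        self.valid: set = set()

    def __call__(self, headers: dict, payload: dict) -> tuple[int, dict, dict]:
        if payload.get("method") == "initialize":
            self.count += 1
            session = f"s{self.count}"
            self.valid = {session}
            return 200, {"Mcp-Session-Id": session}, {}
        if headers.get("Mcp-Session-Id") not in self.valid:
            return 404, {}, {}
        return 200, {}, {"ok": True, "session": headers["Mcp-Session-Id"]}


server = FakeServer()
client = StatefulClient(server)
print(client.call({"method": "tools/call"}, idempotent=True))
server.valid.clear()  # simulate session expiry on the server
print(client.call({"method": "tools/call"}, idempotent=True))
```

The idempotent flag is the design choice that matters. The caller has to state up front whether a replay is safe, instead of the transport layer guessing.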

Where Stateful MCP Fits in a Real Architecture

Stateful MCP is not a replacement for workflow engines or agent memory. It solves a narrower problem: the tool interaction itself has state.

That leads to a cleaner architecture split:

the agent decides what it is trying to do
the stateful MCP server manages interactive tool execution
a workflow engine handles irreversible orchestration

For example, you can use a stateful MCP server to gather missing inputs, validate them, and show progress. Once everything is ready, hand off the actual provisioning or deployment to EventBridge + Step Functions for deterministic execution.

That separation is healthier than trying to make the MCP server own every side effect end to end.
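A sketch of what the handoff could look like, assuming Step Functions is the workflow engine. The client is passed in so the example runs without AWS credentials; in real use it would be boto3.client("stepfunctions"), whose start_execution call takes stateMachineArn, name, and input. The name scheme and the FakeSfn class are my own illustration. The replay behavior of a repeated execution name depends on the workflow type, so check the StartExecution documentation before relying on it.

```python
import hashlib
import json
from typing import Any


def execution_name(request: dict) -> str:
    """Derive a stable name so a replayed handoff maps to the same execution."""
    canonical = json.dumps(request, sort_keys=True, separators=(",", ":"))
    return "provision-" + hashlib.sha256(canonical.encode()).hexdigest()[:32]


def hand_off(sfn_client: Any, state_machine_arn: str, request: dict) -> str:
    """Start the workflow once inputs are complete; returns the execution ARN."""
    response = sfn_client.start_execution(
        stateMachineArn=state_machine_arn,
        name=execution_name(request),
        input=json.dumps(request, sort_keys=True),
    )
    return response["executionArn"]


class FakeSfn:
    """Stand-in for the Step Functions client, for demonstration only."""
    def __init__(self):
        self.calls = []

    def start_execution(self, **kwargs):
        self.calls.append(kwargs)
        return {"executionArn": "arn:fake:execution:" + kwargs["name"]}


fake = FakeSfn()
request = {"application": "billing-api", "region": "us-east-1", "cost_center": "CC-1234"}
print(hand_off(fake, "arn:fake:stateMachine:provision", request))
```

The MCP server's job ends at hand_off. Everything irreversible happens inside the state machine, where retries and compensation are explicit.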

Good Use Cases

Stateful MCP is worth the extra complexity when the interaction has a genuine lifecycle.

Strong fits:

incident investigation tools that ask follow-up questions
support bots that collect missing inputs gradually
provisioning tools that need approval or confirmation
coding agents that coordinate review or validation steps
enterprise workflows where humans stay in the loop

Weak fits:

one-shot lookups
pricing calculators
health probes
static reference data retrieval

If the tool is just a lookup, keep it stateless. Stateful MCP should solve a problem, not satisfy a trend.

Security and Reliability Rules I Would Use

Stateful systems accumulate risk faster than stateless ones because they hold context over time.

My baseline rules would be:

expire inactive sessions aggressively
log session creation, progress stages, and terminal outcomes
keep sensitive intermediate state minimal
make side-effecting steps explicitly idempotent or explicitly non-retryable
treat a 404 session miss as a lifecycle event, not a random network problem
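The first rule can be sketched as an idle-timeout store for whatever state your own server keeps per session. SessionStore is a hypothetical in-memory class with an injectable clock, and it is not how AgentCore itself manages session lifetime. It only illustrates that an expired session should lose its state and that the caller should treat the miss as a lifecycle event.

```python
import time
from typing import Callable, Optional


class SessionStore:
    """Hypothetical in-memory store that drops sessions idle longer than ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._sessions: dict[str, tuple[float, dict]] = {}

    def put(self, session_id: str, state: dict) -> None:
        self._sessions[session_id] = (self.clock(), state)

    def get(self, session_id: str) -> Optional[dict]:
        entry = self._sessions.get(session_id)
        if entry is None:
            return None
        last_seen, state = entry
        if self.clock() - last_seen > self.ttl:
            # Expired: drop the state so it cannot be resumed later.
            del self._sessions[session_id]
            return None
        # Active session: refresh the idle timer.
        self._sessions[session_id] = (self.clock(), state)
        return state


store = SessionStore(ttl_seconds=300)
store.put("session-a", {"stage": "waiting_for_input"})
print(store.get("session-a"))
print(store.get("unknown-session"))
```

A None result is the point where a server would answer with a session miss, and where the logging rule above applies: record that the session ended and why.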

If your MCP server is also exposing sensitive enterprise tools through AgentCore Gateway, make sure your authorization boundaries are clear. Stateful convenience is not worth turning a shared tool server into an unbounded context bucket.

Final Take

Bedrock AgentCore stateful MCP is one of the more important 2026 AgentCore launches because it fixes a real problem: many enterprise tools are not single-turn function calls. They ask for more data, they take time, and they need to preserve context without dumping everything back into the prompt.

That does not mean every MCP server should become stateful. It means the ones with genuine interaction lifecycles finally have a first-class runtime model on AWS. Use it where the tool needs a conversation, not where a plain request-response contract is already enough.