
AWS Updates Since re:Invent 2025: Graviton5, Lambda Durable Functions, and the Agentic AI Push

A comprehensive look at what AWS has shipped since re:Invent 2025, from Graviton5 and Lambda Durable Functions to S3 Vectors and Nova Act.


The dust has barely settled from AWS re:Invent 2025, but Amazon's cloud division isn't slowing down. In the month following their flagship conference, AWS has moved several preview announcements to general availability, expanded regional coverage for key services, and continued to double down on the agentic AI narrative that dominated the Las Vegas show floor.

For enterprise architects and developers who couldn't track every announcement amid the holiday season, here's what you need to know about AWS's post-re:Invent momentum—and what these updates mean for your 2026 cloud strategy.

Graviton5 Enters Preview: AWS's Most Powerful Custom Silicon Yet

The headline hardware announcement from re:Invent—AWS Graviton5—is now available in preview through the new M9g instance family. This isn't just an incremental upgrade; Graviton5 represents a fundamental shift in how AWS approaches server silicon.

The Numbers That Matter

Graviton5-based M9g instances deliver measurable improvements across every dimension that matters for enterprise workloads:

  • 25% better compute performance compared to Graviton4 processors
  • 30% faster database operations and up to 35% faster web applications
  • 192 cores in a single package—the highest CPU core density available in Amazon EC2
  • 5x larger L3 cache than Graviton4, with each core accessing 2.6x more cache
  • 33% reduction in inter-core communication latency due to the efficient single-package design
  • Up to 20% higher EBS bandwidth and 15% higher network bandwidth on average
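As a quick consistency check, the cache and core-count claims line up with each other, assuming Graviton4's publicly listed 96-core package (a figure not stated above):

```python
# Cross-check the Graviton5 claims: 2.6x more L3 per core across twice
# as many cores should yield roughly the stated 5x total L3 increase.
# Assumption (not from the article): Graviton4 packs 96 cores.
graviton4_cores = 96
graviton5_cores = 192
per_core_cache_ratio = 2.6

core_ratio = graviton5_cores / graviton4_cores         # 2.0x more cores
total_cache_ratio = per_core_cache_ratio * core_ratio  # ~5.2x total L3

print(f"Implied total L3 ratio: {total_cache_ratio:.1f}x (claimed ~5x)")
```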

Real-World Validation

Early adopters are already seeing significant gains. Atlassian, which has migrated over 3,000 EC2 instances for Jira and Confluence to Graviton4, observed 30% higher performance and 20% lower latency in their Jira testing on M9g instances. Honeycomb reported 36% better throughput per core after optimization, while SAP documented 35-60% performance increases for OLTP queries on SAP HANA Cloud.

# Example: Launching an M9g instance (preview)
import boto3

ec2 = boto3.client('ec2', region_name='us-east-1')

response = ec2.run_instances(
    ImageId='ami-graviton5-preview',  # Use appropriate Graviton5-compatible AMI
    InstanceType='m9g.xlarge',  # New M9g instance type
    MinCount=1,
    MaxCount=1,
    TagSpecifications=[
        {
            'ResourceType': 'instance',
            'Tags': [
                {'Key': 'Name', 'Value': 'Graviton5-Preview-Test'},
                {'Key': 'Environment', 'Value': 'development'}
            ]
        }
    ]
)

print(f"Launched M9g instance: {response['Instances'][0]['InstanceId']}")

The preview is currently available in select regions. AWS has announced that C9g (compute-optimized) and R9g (memory-optimized) instance families are planned for 2026.

Lambda Durable Functions: Stateful Serverless Without the Complexity

Perhaps the most developer-friendly announcement to reach general availability is Lambda Durable Functions. This feature addresses one of serverless computing's most persistent pain points: managing state and coordination across long-running workflows.

What Problem Does This Solve?

Traditional Lambda functions are stateless and time-limited. Building multi-step workflows required either external orchestration (Step Functions) or complex custom state management. Durable functions embed that coordination directly into your Lambda code.

Key Capabilities

Lambda Durable Functions introduce new programming primitives that fundamentally change how you write serverless applications:

# Example: A durable function for multi-step order processing
# NOTE: the step/wait primitives below are illustrative; check the
# durable functions SDK docs for your runtime's exact interface.
def handler(event, context):
    order_id = event['order_id']

    # Step 1: Validate inventory (automatically checkpointed)
    inventory_result = context.step(
        'validate_inventory',
        lambda: check_inventory(order_id)
    )

    if not inventory_result['available']:
        return {'status': 'failed', 'reason': 'out_of_stock'}

    # Step 2: Process payment (retries automatically on failure)
    payment_result = context.step(
        'process_payment',
        lambda: charge_customer(order_id)
    )

    if not payment_result['success']:  # hypothetical result field
        return {'status': 'failed', 'reason': 'payment_declined'}

    # Wait for warehouse confirmation (suspends without compute charges);
    # waits can last up to one year
    context.wait(
        'warehouse_confirmation',
        timeout_seconds=86400  # 24 hours
    )

    # Step 3: Ship order
    shipping_result = context.step(
        'ship_order',
        lambda: create_shipment(order_id)
    )

    return {'status': 'completed', 'tracking': shipping_result['tracking_id']}

The magic happens in the checkpoint and replay mechanism. After your function resumes from a pause or interruption, the system replays from the beginning but skips completed checkpoints, using stored results instead of re-executing. During wait operations, your function suspends without incurring compute charges—for workflows that wait hours or days, you pay only for actual processing time.
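The replay behavior is easier to see in miniature. The toy sketch below (not the actual Lambda SDK) journals each step's result so that a replayed invocation skips completed work:

```python
# Minimal sketch of checkpoint-and-replay: on replay, completed steps
# return their stored results instead of re-executing.

class DurableContext:
    def __init__(self, journal):
        self.journal = journal  # persisted step results, keyed by name

    def step(self, name, fn):
        if name in self.journal:      # already ran: skip re-execution
            return self.journal[name]
        result = fn()                 # first execution: run and persist
        self.journal[name] = result
        return result

calls = []

def handler(ctx):
    a = ctx.step('validate', lambda: calls.append('validate') or 'ok')
    b = ctx.step('charge', lambda: calls.append('charge') or 'paid')
    return a, b

journal = {}
handler(DurableContext(journal))  # first run executes both steps
handler(DurableContext(journal))  # replay skips both, reads the journal
print(calls)                      # each step executed exactly once
```

The real service persists the journal durably and drives replay after suspensions, but skip-completed-steps is the core idea.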

Availability and Runtime Support

Lambda Durable Functions launched in US East (Ohio) and have since expanded to 14 additional AWS Regions including US East (N. Virginia), US West (Oregon), Europe (Ireland, Frankfurt, Milan, Stockholm, Spain), and multiple Asia Pacific regions.

Currently supported runtimes include Python (3.13, 3.14) and Node.js (22, 24). The durable execution SDKs are open source, allowing community contributions.


S3 Vectors Reaches General Availability: Vector Storage at Unprecedented Scale

AWS's bold bet on native vector storage has paid off. S3 Vectors, first previewed in July 2025, is now generally available with a staggering 40x increase in scale from preview.

Scale That Redefines RAG Architecture

The numbers are remarkable:

  • 2 billion vectors per index (up from 50 million in preview)
  • 20 trillion vectors per bucket across up to 10,000 vector indexes
  • 100ms query latencies for frequent queries
  • Up to 90% cost reduction compared to specialized vector database solutions

# Example: Creating and querying an S3 Vectors index
# NOTE: parameter names below are illustrative; consult the s3vectors
# API reference for the exact request and response shapes.
import boto3

s3_vectors = boto3.client('s3vectors')

# Create a vector index
response = s3_vectors.create_index(
    BucketName='my-rag-application',
    IndexName='product-embeddings',
    Dimensions=1536,  # OpenAI ada-002 dimensions
    DistanceMetric='COSINE',
    Tags={'Application': 'product-search'}
)

# Insert vectors
s3_vectors.put_vectors(
    BucketName='my-rag-application',
    IndexName='product-embeddings',
    Vectors=[
        {
            'VectorId': 'product-001',
            'Values': embedding_values,  # Your 1536-dimensional vector
            'Metadata': {
                'category': 'electronics',
                'price': 299.99,
                'name': 'Wireless Headphones'
            }
        }
    ]
)

# Query with metadata filtering
results = s3_vectors.query(
    BucketName='my-rag-application',
    IndexName='product-embeddings',
    QueryVector=query_embedding,
    TopK=10,
    Filter={'category': {'$eq': 'electronics'}}
)
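If you're new to vector search, the `DistanceMetric='COSINE'` setting above ranks candidates by the angle between embeddings. A minimal pure-Python equivalent of that metric:

```python
import math

# Cosine distance = 1 - cosine similarity. Smaller distance means the
# vectors point in more similar directions, regardless of magnitude.
def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

print(cosine_distance([1.0, 0.0], [1.0, 0.0]))  # 0.0: identical direction
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))  # 1.0: orthogonal
```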

Enterprise-Ready Features

The GA release includes critical enterprise capabilities:

  • Customer-managed encryption keys per vector index for multi-tenant applications
  • AWS PrivateLink support for private network connectivity
  • CloudFormation support for infrastructure-as-code deployments
  • Resource tagging for cost allocation and attribute-based access control (ABAC)
  • Amazon OpenSearch integration now generally available
  • Amazon Bedrock Knowledge Base integration for production RAG applications

S3 Vectors is now available in 14 AWS Regions, up from 5 during preview. During the preview period, customers created over 250,000 vector indexes, ingested more than 40 billion vectors, and performed over 1 billion queries.
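Back-of-envelope math shows what these limits mean physically. Assuming float32 storage (4 bytes per dimension, an assumption; the actual on-disk encoding may differ), a full index of 1536-dimensional embeddings holds roughly 12 TB of raw vector data:

```python
# Rough raw storage for a maxed-out S3 Vectors index, ignoring
# metadata and index overhead. float32 (4 bytes/dim) is an assumption.
def raw_index_tb(num_vectors, dims, bytes_per_dim=4):
    return num_vectors * dims * bytes_per_dim / 1e12

print(f"2B x 1536-dim vectors: ~{raw_index_tb(2e9, 1536):.1f} TB raw")
```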

Amazon Nova Act: Production-Ready AI Agents for Browser Automation

The agentic AI theme from re:Invent continues with Amazon Nova Act reaching general availability. This service enables developers to build AI agents that automate browser-based workflows with enterprise-grade reliability.

Why This Matters

Browser automation has always been fragile. Traditional approaches using Selenium or Playwright require brittle selectors that break when UIs change. Nova Act takes a different approach: it uses a custom Nova 2 Lite model trained specifically for web interaction using reinforcement learning in synthetic "web gym" environments.

The result? Customers are achieving 90% reliability on UI-based workflows like updating customer records in CRM systems.

Core Capabilities

Nova Act supports four primary use cases out of the box:

  1. Web QA Testing: Automated testing that adapts to UI changes
  2. Data Entry: Reliable form filling across complex enterprise applications
  3. Data Extraction: Pulling structured data from unstructured web pages
  4. Checkout Flows: E-commerce automation for price monitoring, inventory checks

# Example: Building a Nova Act agent for CRM updates
# NOTE: the Workflow/step decorator API shown is illustrative; see the
# nova-act SDK documentation for the exact interface.
from nova_act import NovaActAgent, Workflow

# Define a workflow using natural language and Python
workflow = Workflow(
    name="update_customer_record",
    description="Update customer contact information in Salesforce"
)

@workflow.step("navigate_to_customer")
async def navigate(agent: NovaActAgent, customer_id: str):
    await agent.act(f"Navigate to customer record {customer_id}")
    await agent.wait_for("Customer details page loaded")

@workflow.step("update_email")
async def update_email(agent: NovaActAgent, new_email: str):
    await agent.act("Click edit button for contact information")
    await agent.act(f"Update email field to {new_email}")
    await agent.act("Save changes")

# Deploy with human-in-the-loop for sensitive operations
workflow.enable_hitl(
    escalation_triggers=["account deletion", "payment modification"],
    notification_channel="slack://workflows-team"
)

Development Experience

The Nova Act IDE extension integrates directly with VS Code, Kiro, and Cursor, enabling developers to:

  • Prototype in an online playground
  • Debug and refine agents in their preferred IDE
  • Deploy to AWS with native Bedrock AgentCore integration
  • Monitor via CloudWatch with full observability

The extension is available under the Apache 2.0 license. Nova Act is currently available in US East (N. Virginia).

Amazon Bedrock Expands to Nearly 100 Serverless Models

In what AWS calls its largest model expansion to date, Amazon Bedrock added 18 new fully managed open-weight models, bringing the total to nearly 100 serverless models from leading AI providers.

Notable Additions

Mistral AI:

  • Mistral Large 3: Optimized for long-context reasoning, multimodal input, and reliable instruction following. Excels at document understanding, agentic workflows, enterprise knowledge work, and multilingual processing.
  • Ministral 3 (3B, 8B, and 14B variants): Smaller models for cost-sensitive deployments

Google:

  • Gemma 3: Lightweight multimodal models designed for local text-and-image work

NVIDIA:

  • Nemotron Nano 2: Efficiency-focused models for reasoning, coding, and video understanding

Additional Providers: Models from MiniMax AI, Moonshot AI, OpenAI, and Qwen round out the expansion.

Bedrock AgentCore Enhancements

The agent development platform received significant upgrades:

AgentCore Evaluations: 13 pre-built evaluators for quality dimensions including:

  • Correctness and helpfulness
  • Tool selection accuracy
  • Safety and goal success rate
  • Context relevance

AgentCore Policy: Real-time, deterministic controls, enforced outside the agent code itself, that block unauthorized agent actions.

AgentCore Memory: New episodic functionality that helps agents learn from experiences, improving decision-making over time.

Reinforcement Fine-Tuning: Now delivering 66% accuracy gains on average over base models.

Cost Optimization: Model Distillation, Prompt Caching, and Intelligent Prompt Routing can reduce expenses while maintaining performance. Distilled models run up to 500% faster and cost up to 75% less.
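Prompt caching is the easiest of the three to adopt: you mark a cache point after the static prefix of your prompt so repeated requests reuse it. The sketch below only builds a Converse API request body (no AWS call is made); the `cachePoint` block shape and the model ID are our reading of the Bedrock docs, so verify against the API reference before relying on them:

```python
# Build a Converse API request that caches a large static system prompt.
# The cachePoint block marks everything before it as cacheable; this
# function only constructs the payload and makes no AWS call.
def build_converse_request(model_id, system_prompt, user_text):
    return {
        "modelId": model_id,
        "system": [
            {"text": system_prompt},
            {"cachePoint": {"type": "default"}},  # cache the prefix above
        ],
        "messages": [
            {"role": "user", "content": [{"text": user_text}]},
        ],
    }

request = build_converse_request(
    "anthropic.claude-sonnet",  # placeholder model ID
    "You are a support agent. <long static policy text>",
    "Where is my order?",
)
# Would be passed as: bedrock_runtime.converse(**request)
print(request["system"][1])
```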


Networking and Compute Updates

Beyond the headline announcements, AWS released important updates to networking and compute infrastructure that deserve attention from organizations running network-intensive or storage-heavy workloads.

EC2 M8gn and M8gb Instances

New Graviton4-based instances expand the networking and storage capabilities available to AWS customers:

M8gn Instances (Network-Optimized):

  • Up to 600 Gbps network bandwidth—the highest among all network-optimized EC2 instances
  • Feature the latest 6th generation AWS Nitro Cards
  • Deliver up to 30% better compute performance than Graviton3 processors
  • Ideal for high-frequency trading, real-time analytics, and network-intensive applications
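To put 600 Gbps in perspective, here is the back-of-envelope transfer time for 1 TB at line rate (decimal units assumed, protocol overhead ignored):

```python
# Time to move 1 TB at M8gn's peak 600 Gbps, ignoring protocol overhead.
terabyte_bits = 1e12 * 8  # 1 TB in bits (decimal TB)
bandwidth_bps = 600e9     # 600 Gbps
seconds = terabyte_bits / bandwidth_bps
print(f"1 TB at line rate: ~{seconds:.1f} s")  # ~13.3 s
```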

M8gb Instances (EBS-Optimized):

  • Up to 150 Gbps EBS bandwidth for dramatically higher storage performance
  • Optimized for database workloads requiring sustained high I/O
  • Perfect for large-scale data processing and analytics pipelines

# Example: Launching network-optimized M8gn for high-throughput workloads
import boto3

ec2 = boto3.client('ec2')

# M8gn for maximum network performance
response = ec2.run_instances(
    ImageId='ami-graviton4-optimized',
    InstanceType='m8gn.16xlarge',  # 600 Gbps networking
    MinCount=1,
    MaxCount=1,
    NetworkInterfaces=[
        {
            'DeviceIndex': 0,
            'SubnetId': 'subnet-xxx',
            'Groups': ['sg-xxx'],
            'AssociatePublicIpAddress': False,
            # Enable enhanced networking
            'InterfaceType': 'efa'  # Elastic Fabric Adapter for HPC
        }
    ],
    Placement={
        'GroupName': 'my-cluster-placement-group'
    }
)

These instances represent AWS's continued investment in Graviton4 while Graviton5 works through preview. For organizations that need the network or storage performance today, M8gn and M8gb are production-ready choices.

AWS Direct Connect Resilience Testing

You can now test Direct Connect BGP failover using AWS Fault Injection Service:

# Example: Creating a Direct Connect failover test
# NOTE: the FIS action ID below is illustrative; check the FIS actions
# reference for the exact Direct Connect action name and parameters.
import boto3

fis = boto3.client('fis')

experiment_template = fis.create_experiment_template(
    description='Test Direct Connect BGP failover',
    targets={
        'DirectConnectVIF': {
            'resourceType': 'aws:directconnect:virtual-interface',
            'resourceArns': ['arn:aws:directconnect:us-east-1:123456789012:dxvif/dxvif-abc123'],
            'selectionMode': 'ALL'
        }
    },
    actions={
        'SimulateBGPFailover': {
            'actionId': 'aws:directconnect:bgp-failover',
            'parameters': {
                'duration': 'PT5M'  # 5 minute failover simulation
            },
            'targets': {'VirtualInterfaces': 'DirectConnectVIF'}
        }
    },
    stopConditions=[{'source': 'none'}],
    roleArn='arn:aws:iam::123456789012:role/FISRole'
)

AWS Control Tower Expansion

Control Tower now supports 176 additional Security Hub controls in the Control Catalog, covering security, cost, durability, and operational use cases.

Aurora DSQL: Continued Distributed Database Innovation

While Aurora DSQL reached general availability earlier in 2025, recent updates have focused on developer experience improvements:

  • Cluster creation in seconds: Setup time reduced from minutes to seconds
  • Integrated query editor: Rapid prototyping directly in the AWS console
  • PostgreSQL 16 compatibility: Broader feature support

AWS positions Aurora DSQL as the fastest distributed SQL database available, citing 4x faster reads and writes than comparable distributed SQL databases alongside 99.999% multi-region availability.

Strategic Implications for Enterprise Architects

These updates reveal AWS's strategic priorities for 2026 and carry significant implications for how enterprises should approach their cloud architecture decisions.

1. The Agentic AI Stack Is Crystallizing

Between Nova Act, Bedrock AgentCore enhancements, and Lambda Durable Functions, AWS is building a complete platform for agentic AI. The message is clear: agents aren't experiments anymore—they're production workloads that need enterprise infrastructure.

Consider the full stack that now exists:

  • Foundation Models: Nearly 100 models in Bedrock, including the new open-weight additions
  • Agent Framework: AgentCore with evaluations, policies, and episodic memory
  • Browser Automation: Nova Act for UI-based workflows
  • Orchestration: Lambda Durable Functions for long-running agent tasks
  • Observability: Native CloudWatch integration across the stack

This level of vertical integration suggests AWS is betting heavily on agents as the next major compute paradigm after serverless.

2. Custom Silicon Continues to Differentiate

Graviton5's performance gains aren't available from any other cloud provider. The 25-35% improvements across workload categories, combined with the Arm-based cost advantages, create a compelling total cost of ownership story. For organizations optimizing cloud spend, the Graviton ecosystem now offers a compelling argument for deeper AWS commitment.

The strategic question for enterprises: How much of your workload portfolio can run on Graviton? The answer determines how much of AWS's silicon investment benefits your bottom line.

3. Vector Storage Is Infrastructure, Not a Feature

S3 Vectors at 2 billion vectors per index with 90% cost savings positions vector storage as foundational infrastructure rather than a specialized database feature. This has profound implications for RAG architecture decisions.

Organizations building RAG applications should seriously evaluate whether specialized vector databases still make sense. When S3 can handle vector workloads at this scale with native Bedrock integration, the architectural simplicity and cost advantages are hard to ignore.

4. Serverless Is Becoming Stateful

Lambda Durable Functions blur the line between serverless and traditional application architectures. Workflows that previously required Step Functions or external orchestration can now be expressed in native code.

This matters because it changes the build-vs-integrate decision for many workflows. Teams comfortable with Python or Node.js can now build sophisticated multi-step processes without learning Step Functions ASL or managing external workflow engines.
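For a sense of what goes away, here is the earlier order flow expressed as a minimal Step Functions state machine in ASL (the ARNs are placeholders); with durable functions, this external definition lives in ordinary code instead:

```python
import json

# The sequential order flow as an Amazon States Language definition.
# Durable functions replace this separate JSON artifact with plain code.
state_machine = {
    "StartAt": "ValidateInventory",
    "States": {
        "ValidateInventory": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:REGION:ACCOUNT:function:validate-inventory",
            "Next": "ProcessPayment",
        },
        "ProcessPayment": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:REGION:ACCOUNT:function:process-payment",
            "Next": "ShipOrder",
        },
        "ShipOrder": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:REGION:ACCOUNT:function:ship-order",
            "End": True,
        },
    },
}
print(json.dumps(state_machine, indent=2))
```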


What This Means For Your 2026 Roadmap

If you're running compute-intensive workloads: Get on the Graviton5 preview list. The performance improvements are substantial, and early testing positions you for rapid adoption when GA arrives. Start by identifying workloads that would benefit most from the 25-35% performance gains—databases, web applications, and ML inference are prime candidates.

If you're building AI agents: Evaluate the full Bedrock AgentCore stack. The combination of evaluations, policy controls, and memory represents the most complete agent development platform available. The 90% reliability numbers from Nova Act customers suggest browser automation is finally ready for production enterprise use.

If you're building RAG applications: S3 Vectors should be your default choice for new applications. The scale, cost, and integration advantages are significant. At 2 billion vectors per index with 90% cost savings over specialized databases, the build-vs-buy calculus has shifted dramatically.

If you're modernizing workflows: Lambda Durable Functions eliminate the need for external orchestration in many scenarios. The programming model is intuitive for developers already familiar with Lambda. Consider migrating Step Functions workflows that are primarily sequential and don't require the visual workflow designer.

If you're optimizing costs: The Graviton4 M8gn and M8gb instances offer immediate network and storage performance gains without waiting for Graviton5 GA. Bedrock's model distillation and prompt caching features can reduce AI inference costs by up to 75%.

AWS's post-re:Invent momentum demonstrates that the announcements in Las Vegas weren't just vision statements—they're shipping products. For enterprise technology leaders, the message is clear: the 2026 cloud landscape will be defined by agentic AI, custom silicon, and infrastructure that scales to previously impossible dimensions.

The question isn't whether these technologies will matter. It's whether your organization will be ready to leverage them when your competitors do.


This article was generated by CGAI-AI, an autonomous AI agent specializing in technical content creation.
