Your Multi-Cloud AI Strategy Is Already Non-Compliant
The EU Data Act is live. AI workloads are the worst offenders. Here's how to fix it.

The EU just made your AI stack a legal liability. Most US companies don't know it yet.
In September 2025, the EU Data Act became fully enforceable. It doesn't just regulate how you handle data — it mandates that your customers must be able to leave you. Easily. With everything. To any competitor you have. And if your architecture wasn't built for that from the start, you're not just non-compliant. You're operating illegally in one of the world's largest economic markets.
Here's the part that keeps infrastructure architects up at night: AI workloads are the most locked-in technology in the modern enterprise stack. And almost nobody building AI systems is thinking about portability.
That's about to change — one audit at a time.
The EU Data Act Is Not GDPR 2.0
Stop thinking about this as a data privacy law. The EU Data Act isn't about consent forms and cookie banners. It's about power — specifically, ending the ability of cloud providers to trap customers through architectural lock-in.
The three mandates that matter:
1. Easy provider switching. Customers must be able to switch cloud providers without facing unreasonable technical, contractual, or commercial barriers. "Unreasonable" is being defined by regulators, not by your vendor's SLA.
2. Full data portability. All exportable data must be portable — in open, machine-readable formats — within 30 days of a customer request. Not "eventually." Not "in the next release cycle." Thirty days.
3. No egress fees as a lock-in mechanism. Providers can't weaponize data transfer costs to make switching economically painful. The free data egress provisions start ramping in 2027, but the portability requirements are live now.
The target is any company providing data processing services to EU customers. If you have a single EU enterprise client — one — and your AI system processes their data, you're in scope.
The audits have started. The first enforcement actions will be visible by mid-2026.
Why AI Workloads Are the Worst Offenders
Traditional cloud lock-in is mostly about proprietary services: AWS RDS versus Aurora, Azure Functions versus Lambda, Google Spanner versus Bigtable. Painful to migrate, but theoretically possible with enough engineering effort and budget.
AI lock-in is different. It's deeper, less visible, and harder to unwind.
Model API Coupling
The fastest way to build an AI product is to call openai.chat.completions.create() directly from your application code. Millions of developers have done exactly this. The problem: that's not a cloud service you're using. That's a dependency you've embedded in your architecture.
When OpenAI changes pricing, changes behavior, gets acquired, or loses service reliability — you have no fallback. More relevantly, when your EU enterprise customer asks you to demonstrate provider portability under the Data Act, you can't. Your application is OpenAI. Swapping it out requires a rewrite, not a configuration change.
Embedding Model Lock-In
This one is sneaky. If you're running RAG (retrieval-augmented generation), your entire knowledge base is encoded as vectors — mathematical representations of text that only make sense in the context of the specific embedding model that created them.
Move from text-embedding-ada-002 to text-embedding-3-small? Regenerate every vector in your database. Move from OpenAI embeddings to Cohere or Google's embedding models? Same problem. Your 50 million document chunks need to be re-embedded from scratch, and the semantic relationships between them will shift.
This isn't just migration cost. It's compliance risk. "All exportable data" under the EU Data Act includes the vectors that encode your customer's proprietary documents and workflows. If those vectors are only meaningful in the context of one provider's embedding model, you haven't actually made the data portable — you've made it dependent.
Vector Database Coupling
Pinecone, Weaviate, Qdrant, Chroma, pgvector — each has different index structures, query languages, metadata schemas, and operational characteristics. If your AI system's retrieval layer is built tightly around Pinecone's API, the effort to migrate to a different provider isn't just technical. It requires re-testing every query pattern, every hybrid search combination, every metadata filter that your production system relies on.
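The same abstraction discipline that works for model providers works here. A minimal sketch, assuming a two-method retrieval interface — the `VectorStore` protocol, its method names, and the toy `InMemoryVectorStore` backend below are illustrative, not any vendor's actual API; a Pinecone, Qdrant, or pgvector adapter would implement the same two methods:

```python
from typing import Protocol

class VectorStore(Protocol):
    """Retrieval interface the application codes against — not a vendor SDK."""
    def upsert(self, chunk_id: str, vector: list[float], metadata: dict) -> None: ...
    def query(self, vector: list[float], top_k: int = 5) -> list[dict]: ...

class InMemoryVectorStore:
    """Toy reference backend; real adapters wrap a vendor client behind the same methods."""
    def __init__(self):
        self._rows: dict[str, tuple[list[float], dict]] = {}

    def upsert(self, chunk_id: str, vector: list[float], metadata: dict) -> None:
        self._rows[chunk_id] = (vector, metadata)

    def query(self, vector: list[float], top_k: int = 5) -> list[dict]:
        # Rank stored vectors by dot-product similarity against the query vector
        def dot(a: list[float], b: list[float]) -> float:
            return sum(x * y for x, y in zip(a, b))
        ranked = sorted(self._rows.items(), key=lambda kv: dot(vector, kv[1][0]), reverse=True)
        return [
            {"chunk_id": cid, "score": dot(vector, vec), **meta}
            for cid, (vec, meta) in ranked[:top_k]
        ]

store: VectorStore = InMemoryVectorStore()
store.upsert("c1", [1.0, 0.0], {"raw_text": "termination clause"})
store.upsert("c2", [0.0, 1.0], {"raw_text": "pricing schedule"})
hits = store.query([0.9, 0.1], top_k=1)
```

The point is that query patterns, hybrid search, and metadata filters get re-tested against one interface, not against each vendor's SDK.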
Fine-Tuned Models
This is the deep end. If you've fine-tuned a model on your customer's proprietary data — training a base model to understand their domain, their terminology, their workflows — that fine-tuned model is an asset. Under the EU Data Act, that asset needs to be portable.
Fine-tunes on OpenAI's platform live on OpenAI's infrastructure. They're not exportable in a format that another provider can load and run. The weights may belong to you contractually, but you often can't access them in a way that enables actual portability.
What Non-Compliance Actually Means
Let's be precise about the risk. "Non-compliant" under the EU Data Act is not the same as "we'll get a nastygram from a regulator in three years." The enforcement mechanism is faster and more direct than GDPR was in its early years.
EU regulators learned from the GDPR rollout. They've built enforcement infrastructure. National data protection authorities are already coordinating on AI-specific investigations. Enterprise B2B contracts with EU companies increasingly include Data Act compliance as a contractual requirement — meaning your customers can sue you for breach before a regulator even gets involved.
The practical risk profile for a US company with EU enterprise customers:
- Contract risk: EU customers start adding portability clauses. You can't demonstrate compliance. You lose deals or face breach claims.
- Regulatory risk: An EU customer files a complaint. National authority investigates. Fines up to 4% of global annual turnover for non-compliance with portability obligations.
- Reputational risk: The first high-profile enforcement actions will be public. Being named as a company that built non-portable AI infrastructure is not a headline you want.
Most US companies won't get hit in 2026. But the ones that haven't started building for portability in 2026 will face a much harder retrofit in 2027 and 2028 when enforcement ramps up.
What Compliant AI Architecture Actually Looks Like
The fix is not complicated. It is, however, a different way of thinking about how you structure AI systems from day one.
The core principle: treat model providers as interchangeable infrastructure, not as foundations.
The Abstraction Layer Pattern
Instead of calling OpenAI (or Anthropic, or Gemini) directly, you build a thin interface layer that your application code talks to. Behind that interface, you can swap providers with a configuration change.
```python
# ❌ Non-portable: direct provider coupling
import os

from openai import OpenAI

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

def get_response(user_message: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": user_message}],
    )
    return response.choices[0].message.content
```
```python
# ✅ Portable: provider abstraction layer
import os
from typing import Protocol

class LLMProvider(Protocol):
    def complete(self, messages: list[dict], **kwargs) -> str: ...

class OpenAIProvider:
    def __init__(self):
        from openai import OpenAI
        self.client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

    def complete(self, messages: list[dict], **kwargs) -> str:
        response = self.client.chat.completions.create(
            model=kwargs.get("model", "gpt-4o"),
            messages=messages,
        )
        return response.choices[0].message.content

class AnthropicProvider:
    def __init__(self):
        import anthropic
        self.client = anthropic.Anthropic(api_key=os.getenv("ANTHROPIC_API_KEY"))

    def complete(self, messages: list[dict], **kwargs) -> str:
        response = self.client.messages.create(
            model=kwargs.get("model", "claude-sonnet-4-6"),
            max_tokens=4096,
            messages=messages,
        )
        return response.content[0].text

class LocalOllamaProvider:
    def complete(self, messages: list[dict], **kwargs) -> str:
        import requests
        response = requests.post(
            "http://localhost:11434/api/chat",
            json={"model": kwargs.get("model", "qwen2.5"), "messages": messages, "stream": False},
        ).json()
        return response["message"]["content"]

# Config-driven provider selection
PROVIDERS = {
    "openai": OpenAIProvider,
    "anthropic": AnthropicProvider,
    "local": LocalOllamaProvider,
}

def get_llm() -> LLMProvider:
    provider_name = os.getenv("LLM_PROVIDER", "anthropic")
    return PROVIDERS[provider_name]()

# Application code never knows which provider it's talking to
llm = get_llm()
response = llm.complete([{"role": "user", "content": "Summarize this contract."}])
```
Switching providers is now a single environment variable change — no code rewrite. When an auditor asks for proof of portability, you change LLM_PROVIDER from openai to anthropic and everything keeps working.
Portable Embeddings
The embedding portability problem requires a slightly different approach: store both the raw source content and the embedding vectors, and version your embedding model explicitly.
```python
# ✅ Portable embedding storage schema
from dataclasses import dataclass
from datetime import datetime

@dataclass
class PortableChunk:
    chunk_id: str
    source_document_id: str
    raw_text: str              # Always store the source — this is what's portable
    embedding: list[float]     # The vector representation
    embedding_model: str       # Which model produced this: "text-embedding-3-small"
    embedding_provider: str    # Which provider: "openai"
    embedding_version: str     # Model version for reproducibility
    created_at: datetime

    def to_exportable_dict(self) -> dict:
        """EU Data Act compliant export — includes raw text, not just vectors"""
        return {
            "chunk_id": self.chunk_id,
            "source_document_id": self.source_document_id,
            "raw_text": self.raw_text,  # The actual portable content
            "embedding_metadata": {
                "model": self.embedding_model,
                "provider": self.embedding_provider,
                "version": self.embedding_version,
                "note": "Re-embed raw_text with your preferred model for full portability",
            },
        }
```
Key principle: the raw_text is the portable asset. The embedding is a derivative. When a customer exercises their Data Act portability rights, you give them the raw text. They (or their next vendor) can re-embed it with whatever model they choose.
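In practice, that re-embedding step is a pipeline over the exported chunks. A hedged sketch, assuming the export format above — `migrate_chunks` and `embed_fn` are hypothetical names, and `embed_fn` stands in for whatever embedding call the receiving side uses (the example below uses a toy function so the pipeline runs without any provider):

```python
from typing import Callable

def migrate_chunks(
    exported: list[dict],
    embed_fn: Callable[[str], list[float]],  # the new provider's embedding call
    new_model: str,
    new_provider: str,
) -> list[dict]:
    """Rebuild vectors from exported raw_text; old vectors are dropped as non-portable derivatives."""
    migrated = []
    for chunk in exported:
        migrated.append({
            "chunk_id": chunk["chunk_id"],
            "source_document_id": chunk["source_document_id"],
            "raw_text": chunk["raw_text"],
            "embedding": embed_fn(chunk["raw_text"]),  # fresh vector under the new model
            "embedding_metadata": {"model": new_model, "provider": new_provider},
        })
    return migrated

# Toy embed_fn: vector is just [len(text), 0.0] — replace with a real embedding client
fake_embed = lambda text: [float(len(text)), 0.0]
out = migrate_chunks(
    [{"chunk_id": "c1", "source_document_id": "d1", "raw_text": "hello"}],
    embed_fn=fake_embed,
    new_model="text-embedding-3-small",
    new_provider="openai",
)
```

Because the loop only reads raw_text, the old provider's vectors never constrain the migration.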
Provider-Agnostic Agent Architecture
The same principle applies to the full agent stack. CGAI's approach is to build agents that declare their capabilities abstractly and receive their tools as injected dependencies:
```python
# ✅ Provider-agnostic agent
class ResearchAgent:
    def __init__(
        self,
        llm: LLMProvider,             # Injected — can be any provider
        search_tool: SearchProvider,  # Injected — Brave, Google, DuckDuckGo
        storage: StorageProvider,     # Injected — S3, GCS, local, NAS
    ):
        self.llm = llm
        self.search = search_tool
        self.storage = storage

    def research(self, topic: str) -> str:
        search_results = self.search.query(topic)
        summary = self.llm.complete([
            {"role": "system", "content": "You are a research analyst."},
            {"role": "user", "content": f"Synthesize: {search_results}"},
        ])
        self.storage.save(f"research/{topic}.md", summary)
        return summary

# Instantiate with EU-compliant, swappable providers
agent = ResearchAgent(
    llm=AnthropicProvider(),
    search_tool=BraveSearchProvider(),
    storage=S3Provider(bucket="customer-data-eu-west-1"),
)
```
Need to switch from Anthropic to a local model for a cost-sensitive deployment? One line:
```python
agent = ResearchAgent(llm=LocalOllamaProvider(), ...)
```
The agent doesn't know, doesn't care, and your architecture is now portable.
The 90-Day Action Plan
If you have EU enterprise customers and you're running AI systems against their data, here's where to start.
Days 1–30: Audit and assess
Map every place your codebase makes a direct API call to an AI provider. OpenAI, Anthropic, Cohere, Google — all of them. For each integration:
- What data from EU customers flows through this call?
- Is the output stored? In what format? With what metadata?
- Can we recreate this output without this specific provider?
This audit will be uncomfortable. Most teams discover they're more locked-in than they thought.
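The mapping step can start out mechanical. A rough sketch of a codebase scanner — the pattern list is illustrative, not exhaustive, and `audit_direct_calls` is a hypothetical helper, not a real tool:

```python
import re
from pathlib import Path

# Illustrative signatures of direct provider coupling — extend for your stack
DIRECT_CALL_PATTERNS = [
    r"from openai import",
    r"import anthropic",
    r"import cohere",
    r"google\.generativeai",
]

def audit_direct_calls(root: str) -> list[tuple[str, int, str]]:
    """Return (file, line_number, matched_line) for every direct provider reference found."""
    hits = []
    for path in Path(root).rglob("*.py"):
        for lineno, line in enumerate(path.read_text(errors="ignore").splitlines(), 1):
            if any(re.search(pattern, line) for pattern in DIRECT_CALL_PATTERNS):
                hits.append((str(path), lineno, line.strip()))
    return hits
```

Each hit then gets the three questions above applied to it by a human; the script only tells you where to look.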
Days 31–60: Build the abstraction layer
Implement the provider interface pattern. Don't try to migrate everything at once — start with your highest-volume, highest-risk integration. Build the interface. Implement two providers behind it. Test switching. Now you have proof that portability is achievable.
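"Test switching" can be a literal test: run one contract suite against every provider behind the interface. A minimal sketch using a stub — `FakeProvider` and `check_provider_contract` are illustrative names; in CI you would run the same checks over your real adapters:

```python
class FakeProvider:
    """Stub that satisfies the same complete() contract as real provider adapters."""
    def __init__(self, canned: str = "ok"):
        self.canned = canned
        self.calls: list[list[dict]] = []  # record what the application sent

    def complete(self, messages: list[dict], **kwargs) -> str:
        self.calls.append(messages)
        return self.canned

def check_provider_contract(provider) -> None:
    """The portability proof: any conforming provider passes identical assertions."""
    out = provider.complete([{"role": "user", "content": "ping"}])
    assert isinstance(out, str) and out  # a non-empty string back, regardless of vendor

# Run the contract against each provider you claim is swappable
for provider in [FakeProvider("pong"), FakeProvider("ok")]:
    check_provider_contract(provider)
```

A passing run of this suite against two real providers is exactly the artifact an auditor or an enterprise customer will ask to see.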
Simultaneously: update your data storage schema to always preserve raw source content alongside derived representations (embeddings, summaries, classifications). Future portability is much easier when you haven't thrown away the originals.
Days 61–90: Document and contractualize
Update your technical documentation to reflect your portability architecture. If you have enterprise contracts with EU customers, proactively add a portability appendix — describe how data can be exported, in what format, on what timeline. Getting ahead of this in contract negotiations is far less painful than responding to a customer demand or a regulatory inquiry.
For new enterprise deals: make portability a feature, not a disclaimer. "Our AI systems are designed from day one to be provider-portable and fully compliant with EU Data Act portability requirements" is a sales advantage as EU procurement teams start adding compliance requirements to their vendor evaluation criteria.
What CGAI Builds — And Why
Every AI system we build at CGAI starts with a portability assumption: the customer should be able to take their data and their AI capabilities elsewhere, on 30 days' notice, to any provider they choose.
This isn't altruism. It's architecture discipline. Systems designed for portability are also more resilient, easier to test, and less expensive to operate — because you can route to cheaper models when the task doesn't require the most expensive one, and you're not dependent on any single provider's uptime or pricing decisions.
The EU Data Act is forcing a discipline that good architects were already practicing. If you're not there yet, the clock is running.
The audits will start with the biggest targets — enterprises processing the most EU customer data. But they'll expand. And the companies that built portable architectures will sail through those audits with a configuration file and a demo. The ones that didn't will spend 18 months in a retrofit they could have avoided.
Build for portability now. The regulator isn't asking nicely.
Marc Wojcik is the founder of The CGAI Group, an AI consulting and products studio specializing in agentic AI systems. CGAI builds AI infrastructure for enterprises that need to move fast without building technical debt — including EU Data Act compliant architectures. Reach him at marc@thecgaigroup.com.

