Architecting Cloud-Native Systems with DDD and EDA: AWS vs GCP

A strategic guide to using Domain-Driven Design and Event-Driven Architecture on AWS and GCP, with practical service choices, trade-offs, and decision tests.

[Figure: whiteboard summary of this article]

Cloud-native architecture is not the art of choosing fashionable managed services.

It is the discipline of making business boundaries, software boundaries, data ownership, and operating responsibilities line up.

That is why Domain-Driven Design (DDD) and Event-Driven Architecture (EDA) work so well together.

DDD helps you decide where the boundaries should be.

EDA helps those boundaries communicate without becoming tightly coupled.

AWS and GCP both provide excellent building blocks. The hard part is choosing the right primitives for the business and the team that will operate them.

The Core Decision

Before comparing services, answer one question:

What business capability owns this data, this decision, and this change?

If that answer is unclear, the cloud architecture will become unclear too.

| Architecture concern | DDD question | EDA question | Cloud question |
| --- | --- | --- | --- |
| Ownership | Which bounded context owns the model? | Which service publishes the business fact? | Which team owns the service, data, alerts, and cost? |
| Data | What aggregate or entity is authoritative? | What event represents a meaningful state change? | Which database supports the access pattern and consistency need? |
| Coupling | What should not depend on what? | Who reacts asynchronously? | Which messaging, routing, and retry primitives fit? |
| Operations | Who responds when it breaks? | Can events be replayed and traced? | What dashboards, logs, queues, and runbooks exist? |

Domain-Driven Design In The Cloud

DDD focuses on modelling software around business domains.

The most useful concept is the bounded context: a boundary where a particular model, language, and ownership structure apply.

In cloud-native systems, a bounded context often maps to a service or small group of services. That mapping should be deliberate, not automatic.

What A Good Boundary Looks Like

| Good boundary | Weak boundary |
| --- | --- |
| Has clear business language. | Uses vague technical labels like “processor” or “manager.” |
| Owns its data and invariants. | Shares tables across teams. |
| Publishes meaningful domain events. | Emits low-level implementation events. |
| Can change internally without breaking everyone. | Requires coordinated releases for small changes. |
| Has clear operational ownership. | Nobody knows who owns failures. |

AWS Implementation Patterns

AWS is strong when you need mature enterprise patterns, deep service choice, and fine-grained operational control.

| Concern | AWS option | When it fits |
| --- | --- | --- |
| Compute | ECS/Fargate, Lambda, EKS | Fargate for containerised services, Lambda for event handlers, EKS for Kubernetes-heavy teams. |
| Data | DynamoDB, Aurora, RDS, OpenSearch | DynamoDB for high-scale aggregate access, Aurora/RDS for relational consistency, OpenSearch for search. |
| Event routing | EventBridge | Domain events, content-based routing, multi-account event flows. |
| Queues | SQS | Backpressure, retries, dead-letter queues, decoupled consumers. |
| Workflow | Step Functions | Sagas, compensating actions, visible execution history. |
| API boundary | API Gateway, ALB | Public APIs, internal APIs, auth, throttling, and routing. |
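
The retry and dead-letter behaviour listed for queues can be pictured with a small in-memory simulation. This is only a sketch of the idea, not the SQS API: the `drain` function and its parameters are hypothetical, and `max_receive_count` merely mirrors the role that `maxReceiveCount` plays in an SQS redrive policy.

```python
from collections import deque


def drain(messages, handler, max_receive_count=3):
    """Deliver each message until the handler succeeds or the receive limit is hit.

    Returns (processed, dead_letters). In-memory illustration only.
    """
    queue = deque((msg, 0) for msg in messages)
    processed, dead_letters = [], []
    while queue:
        msg, receives = queue.popleft()
        receives += 1
        try:
            handler(msg)
            processed.append(msg)
        except Exception:
            if receives >= max_receive_count:
                # Stop retrying: park the message for inspection instead of
                # blocking the consumers behind it.
                dead_letters.append(msg)
            else:
                # Put it back for another delivery attempt.
                queue.append((msg, receives))
    return processed, dead_letters
```

The design point is that a poison message ends up somewhere visible and owned, rather than being retried forever or silently dropped.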

AWS Pattern: Bounded Context + EventBridge

A payment context might own payment authorisation state in DynamoDB or Aurora. When a payment is authorised, it publishes PaymentAuthorised to EventBridge. EventBridge routes the event to fraud monitoring, notification, ledger, and analytics consumers.

The key is that the payment context owns the meaning of the event. Consumers react, but they do not own the payment state.
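
As a rough illustration of the publishing side, the sketch below shapes a `PaymentAuthorised` event as an EventBridge `PutEvents` entry. The event fields, the `payments` source name, and the bus name are assumptions made for the example; the client is expected to be a boto3 `events` client supplied by the caller.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class PaymentAuthorised:
    """Domain event owned by the payment context (field names are hypothetical)."""
    payment_id: str
    amount_minor: int   # amount in minor units, e.g. cents
    currency: str
    authorised_at: str  # ISO 8601 timestamp


def to_eventbridge_entry(event: PaymentAuthorised, bus_name: str) -> dict:
    """Shape the domain event as a single PutEvents entry."""
    return {
        "Source": "payments",              # hypothetical source name
        "DetailType": "PaymentAuthorised",
        "Detail": json.dumps(asdict(event)),
        "EventBusName": bus_name,
    }


def publish(client, event: PaymentAuthorised, bus_name: str) -> dict:
    """Publish through a boto3 'events' client passed in by the caller."""
    response = client.put_events(Entries=[to_eventbridge_entry(event, bus_name)])
    # PutEvents reports per-entry failures, so check the count
    # rather than assuming the call succeeded.
    if response.get("FailedEntryCount", 0) > 0:
        raise RuntimeError(f"EventBridge rejected the event: {response.get('Entries')}")
    return response
```

A caller would pass something like `boto3.client("events")` and the name of its own bus. Routing to fraud, notification, ledger, and analytics consumers then lives in EventBridge rules, not in the payment service.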

GCP Implementation Patterns

GCP is strong when you want simple serverless containers, global messaging, data and analytics gravity, and open standards.

| Concern | GCP option | When it fits |
| --- | --- | --- |
| Compute | Cloud Run, Cloud Functions, GKE | Cloud Run for service boundaries with low operational overhead, Cloud Functions for event handlers, GKE for Kubernetes-heavy platforms. |
| Data | Firestore, Cloud SQL, Spanner, BigQuery | Firestore for document aggregates, Cloud SQL for relational apps, Spanner for global consistency, BigQuery for analytics. |
| Event routing | Pub/Sub, Eventarc | Pub/Sub for global messaging, Eventarc for CloudEvents routing into services. |
| Workflow | Workflows | API orchestration and step-based processes. |
| API boundary | API Gateway, Apigee, Cloud Load Balancing | Lightweight APIs, enterprise API management, global routing. |

GCP Pattern: Bounded Context + Cloud Run + Pub/Sub

A payment context can run as a Cloud Run service and publish domain events to Pub/Sub. Fraud, notification, and analytics consumers each subscribe independently, so each receives its own copy of every event. Eventarc can deliver events to Cloud Run services in the CloudEvents format.

This is especially attractive when the team wants container portability and simple deployment without managing clusters.
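
To make the event shape concrete, here is a minimal sketch that wraps a domain payload in a CloudEvents 1.0 envelope and hands it to a Pub/Sub publisher. The event type, source, and topic path are hypothetical; the publisher is assumed to be a `google.cloud.pubsub_v1.PublisherClient` supplied by the caller.

```python
import json
import uuid
from datetime import datetime, timezone


def to_cloudevent(event_type: str, source: str, data: dict) -> dict:
    """Wrap a domain payload in a CloudEvents 1.0 structured-mode envelope."""
    return {
        # Required CloudEvents attributes.
        "specversion": "1.0",
        "id": str(uuid.uuid4()),
        "source": source,
        "type": event_type,
        # Optional attributes.
        "time": datetime.now(timezone.utc).isoformat(),
        "datacontenttype": "application/json",
        "data": data,
    }


def publish(publisher, topic_path: str, envelope: dict):
    """Publish the envelope; Pub/Sub message data must be bytes."""
    payload = json.dumps(envelope).encode("utf-8")
    # The event type is repeated as a message attribute so that
    # subscriptions can filter on it without parsing the body.
    return publisher.publish(topic_path, payload, type=envelope["type"])
```

A call might look like `publish(publisher, topic_path, to_cloudevent("com.example.payments.PaymentAuthorised", "/payments", {"payment_id": "pay_123"}))`, where every name is a placeholder.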

CQRS And Event Sourcing

Command Query Responsibility Segregation (CQRS) separates write models from read models.

Event sourcing stores state changes as a sequence of events.
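
A minimal sketch of both ideas, using hypothetical payment events: the write side rebuilds current state by folding over the event log, and a separate read model is projected from the same log.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class Event:
    payment_id: str
    type: str        # hypothetical event names
    amount: int = 0  # minor units


@dataclass(frozen=True)
class Payment:
    status: str = "new"
    captured: int = 0


def apply(state: Payment, event: Event) -> Payment:
    """Pure transition function: old state plus one event gives new state."""
    if event.type == "PaymentAuthorised":
        return Payment("authorised", state.captured)
    if event.type == "PaymentCaptured":
        return Payment("captured", state.captured + event.amount)
    if event.type == "PaymentRefunded":
        return Payment("refunded", state.captured - event.amount)
    return state  # unknown event types leave the state unchanged


def rehydrate(events) -> Payment:
    """Write side: current state is a fold over the event history."""
    return reduce(apply, events, Payment())


def project_net_captured(events) -> dict:
    """Read side: a query-shaped view (net captured amount per payment)."""
    totals = {}
    for e in events:
        if e.type == "PaymentCaptured":
            totals[e.payment_id] = totals.get(e.payment_id, 0) + e.amount
        elif e.type == "PaymentRefunded":
            totals[e.payment_id] = totals.get(e.payment_id, 0) - e.amount
    return totals
```

Because the projection is derived from the log, it can be discarded and rebuilt, which is also where the operational cost of replay and schema evolution comes from.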

These patterns are powerful, but they are often overused.

| Pattern | Use it when… | Avoid it when… |
| --- | --- | --- |
| CQRS | Read and write workloads have very different shapes, scale, or query needs. | A simple CRUD model is enough. |
| Event sourcing | You need a complete audit trail and can model business state as events. | The team lacks operational maturity for replay, schema evolution, and projections. |
| Sagas | A long-running process spans multiple services and needs compensation. | A single transactional boundary would be simpler and safer. |
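
The compensation idea behind a saga can be sketched in a few lines. This in-memory version is an assumption-laden illustration: `run_saga` and the step names are hypothetical, and a real implementation would need durable state, which is what an orchestrator such as Step Functions or Workflows provides.

```python
def run_saga(steps):
    """Run steps in order; on failure, undo completed steps in reverse order.

    steps: list of (name, action, compensation) tuples, where action and
    compensation are zero-argument callables.
    """
    completed = []
    for name, action, compensate in steps:
        try:
            action()
        except Exception as exc:
            # Compensate only what actually finished, newest first.
            for _done_name, undo in reversed(completed):
                undo()
            return {"status": "compensated", "failed_step": name, "error": str(exc)}
        completed.append((name, compensate))
    return {"status": "completed", "steps": [name for name, _ in completed]}
```

Note that compensations are business actions in their own right (a refund, a release of stock), not database rollbacks, which is why a single transactional boundary is preferable whenever it is available.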

[Figure: AWS CQRS shape]

[Figure: GCP CQRS shape]

AWS vs GCP: Decision Table

| If you need… | AWS may fit better | GCP may fit better |
| --- | --- | --- |
| Rich event routing | EventBridge rules and integrations. | Pub/Sub plus Eventarc where simpler routing is enough. |
| Serverless containers | ECS/Fargate is mature and flexible. | Cloud Run is extremely simple to operate. |
| Global strongly consistent data | Possible, but usually more architecture work. | Cloud Spanner is a first-class option. |
| Enterprise workflow orchestration | Step Functions has strong visibility and ecosystem maturity. | Workflows is simpler for API orchestration. |
| Data and analytics gravity | Strong, but often more service assembly. | BigQuery, Dataflow, and Pub/Sub are a natural combination. |
| Fine-grained control | AWS offers many specialised knobs. | GCP offers fewer, often simpler primitives. |

Security And Well-Architected Gaps To Call Out

Cloud-native architecture reviews often over-index on service selection. The more important review is whether the workload can be operated securely, reliably, and economically by the team that owns it.

| Gap | What it looks like | What to review |
| --- | --- | --- |
| Shared data ownership | Multiple bounded contexts write to the same tables or buckets. | Authoritative data owner, access pattern, schema ownership, and change process. |
| Boundary bypass | Teams call databases or internal APIs directly because it is easier. | API contracts, service identity, network boundaries, and dependency mapping. |
| Identity sprawl | Services, jobs, and functions use broad roles or shared credentials. | Least privilege, workload identity, secret rotation, and service-level audit. |
| Missing threat model | Architecture diagrams show happy paths but not abuse paths. | Trust boundaries, data classification, external inputs, admin paths, and supply chain risk. |
| Reliability assumptions | The design assumes managed services remove failure modes. | Quotas, limits, regional dependencies, retry behaviour, backup, failover, and replay. |
| Cost blind spots | Event fan-out, logs, analytics, data transfer, and idle capacity are not estimated. | Cost model, budgets, unit economics, and high-cardinality observability cost. |

AWS and GCP both provide Well-Architected guidance, but the practical test is simple: can a future engineer understand the boundary, operate it under stress, secure the data, and explain the cost?

The Architecture Review Checklist

Before committing to the design, ask:

  1. What bounded contexts exist, and who owns each one?
  2. Which data source is authoritative for each important entity?
  3. Which events are business facts rather than technical notifications?
  4. How are event schemas versioned?
  5. What happens when a consumer fails?
  6. Can we replay events safely?
  7. Where do we need strong consistency, and where is eventual consistency acceptable?
  8. What is the operational dashboard for each context?
  9. Who gets paged?
  10. What would make this design simpler?

That last question matters.

Cloud-native systems can become elaborate very quickly. A 10-star architecture is not the one with the most services. It is the one where boundaries, ownership, and recovery are easiest to understand.
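
Two of the checklist questions, schema versioning and safe replay, are easiest to reason about in code. The sketch below is one possible shape under stated assumptions: the event fields, the version numbers, and the `LedgerConsumer` class are all hypothetical, and the set of seen IDs would live in a durable store in practice.

```python
def upcast(event: dict) -> dict:
    """Bring older event versions up to the current schema (v2).

    Assumed history: v1 named the payment key 'id'; v2 renamed it to 'payment_id'.
    """
    if event.get("schema_version", 1) == 1:
        return {
            "event_id": event["event_id"],
            "schema_version": 2,
            "payment_id": event["id"],
            "amount_minor": event["amount_minor"],
        }
    return event


class LedgerConsumer:
    """Idempotent consumer: replaying an event it has seen changes nothing."""

    def __init__(self):
        self.seen = set()   # processed event IDs (in-memory for the sketch)
        self.balance = 0

    def handle(self, raw_event: dict) -> bool:
        event = upcast(raw_event)
        if event["event_id"] in self.seen:
            return False    # duplicate or replayed delivery, skip it
        self.balance += event["amount_minor"]
        self.seen.add(event["event_id"])
        return True
```

With this shape, a replay of the full history against a consumer that has already processed it leaves the balance unchanged, and old events remain readable after a schema change.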

The Practical Recommendation

Choose AWS if your system needs rich event routing, mature enterprise workflow patterns, broad service depth, and precise operational control.

Choose GCP if your team values serverless container simplicity, global messaging, data and analytics integration, and open event standards.

But do not let the platform choice hide the deeper discipline.

The architecture succeeds when business boundaries and platform boundaries reinforce each other.

That is the real cloud-native skill.


Written by Haris Habib from Sydney, Australia | February 2026