Cloud-native architecture is not the art of choosing fashionable managed services.
It is the discipline of making business boundaries, software boundaries, data ownership, and operating responsibilities line up.
That is why Domain-Driven Design (DDD) and Event-Driven Architecture (EDA) work so well together.
DDD helps you decide where the boundaries should be.
EDA helps those boundaries communicate without becoming tightly coupled.
AWS and GCP both provide excellent building blocks. The hard part is choosing the right primitives for the business and the team that will operate them.
The Core Decision
Before comparing services, answer one question:
What business capability owns this data, this decision, and this change?
If that answer is unclear, the cloud architecture will become unclear too.
| Architecture concern | DDD question | EDA question | Cloud question |
|---|---|---|---|
| Ownership | Which bounded context owns the model? | Which service publishes the business fact? | Which team owns the service, data, alerts, and cost? |
| Data | What aggregate or entity is authoritative? | What event represents a meaningful state change? | Which database supports the access pattern and consistency need? |
| Coupling | What should not depend on what? | Who reacts asynchronously? | Which messaging, routing, and retry primitives fit? |
| Operations | Who responds when it breaks? | Can events be replayed and traced? | What dashboards, logs, queues, and runbooks exist? |
Domain-Driven Design In The Cloud
DDD focuses on modelling software around business domains.
The most useful concept is the bounded context: a boundary where a particular model, language, and ownership structure apply.
In cloud-native systems, a bounded context often maps to a service or small group of services. That mapping should be deliberate, not automatic.
What A Good Boundary Looks Like
| Good boundary | Weak boundary |
|---|---|
| Has clear business language. | Uses vague technical labels like “processor” or “manager.” |
| Owns its data and invariants. | Shares tables across teams. |
| Publishes meaningful domain events. | Emits low-level implementation events. |
| Can change internally without breaking everyone. | Requires coordinated releases for small changes. |
| Has clear operational ownership. | Nobody knows who owns failures. |
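To make the left-hand column concrete, here is a minimal sketch of a payment aggregate that owns its invariants and records a domain event in business language. The class names, fields, and status values are illustrative assumptions, not a prescribed model.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from decimal import Decimal


@dataclass(frozen=True)
class PaymentAuthorised:
    """A business fact, named in the language of the payment context."""
    payment_id: str
    amount: Decimal
    currency: str
    occurred_at: datetime


@dataclass
class Payment:
    """Hypothetical aggregate: state and invariants live inside the boundary."""
    payment_id: str
    amount: Decimal
    currency: str
    status: str = "PENDING"
    pending_events: list = field(default_factory=list)

    def authorise(self) -> None:
        # Callers cannot skip these checks by writing to a shared table.
        if self.status != "PENDING":
            raise ValueError(f"cannot authorise a payment in state {self.status}")
        if self.amount <= 0:
            raise ValueError("amount must be positive")
        self.status = "AUTHORISED"
        # Record the fact; a publisher would forward it after the state is saved.
        self.pending_events.append(
            PaymentAuthorised(
                self.payment_id,
                self.amount,
                self.currency,
                datetime.now(timezone.utc),
            )
        )
```

Other contexts see only `PaymentAuthorised`, so the internal status model can change without a coordinated release.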
AWS Implementation Patterns
AWS is strong when you need mature enterprise patterns, deep service choice, and fine-grained operational control.
| Concern | AWS option | When it fits |
|---|---|---|
| Compute | ECS/Fargate, Lambda, EKS | Fargate for containerised services, Lambda for event handlers, EKS for Kubernetes-heavy teams. |
| Data | DynamoDB, Aurora, RDS, OpenSearch | DynamoDB for high-scale aggregate access, Aurora/RDS for relational consistency, OpenSearch for search. |
| Event routing | EventBridge | Domain events, content-based routing, multi-account event flows. |
| Queues | SQS | Backpressure, retries, dead-letter queues, decoupled consumers. |
| Workflow | Step Functions | Sagas, compensating actions, visible execution history. |
| API boundary | API Gateway, ALB | Public APIs, internal APIs, auth, throttling, and routing. |
AWS Pattern: Bounded Context + EventBridge
A payment context might own payment authorisation state in DynamoDB or Aurora. When a payment is authorised, it publishes PaymentAuthorised to EventBridge. EventBridge routes the event to fraud monitoring, notification, ledger, and analytics consumers.
The key is that the payment context owns the meaning of the event. Consumers react, but they do not own the payment state.
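As a rough sketch, the published event could be built like this. The entry keys follow the shape of an EventBridge `PutEvents` entry; the source name, bus name, and detail fields are assumptions made up for illustration.

```python
import json
from datetime import datetime, timezone


def build_payment_authorised_entry(payment_id, amount_cents, currency,
                                   bus_name="payments-bus"):
    """Build one entry in the shape accepted by EventBridge PutEvents."""
    detail = {
        "paymentId": payment_id,
        "amountCents": amount_cents,
        "currency": currency,
        "schemaVersion": 1,  # assumed versioning convention
    }
    return {
        "Source": "com.example.payments",  # hypothetical source name
        "DetailType": "PaymentAuthorised",
        "Detail": json.dumps(detail),      # Detail is sent as a JSON string
        "EventBusName": bus_name,          # hypothetical bus name
        "Time": datetime.now(timezone.utc),
    }


# An event pattern a fraud-monitoring rule might use to select this event.
FRAUD_RULE_PATTERN = {
    "source": ["com.example.payments"],
    "detail-type": ["PaymentAuthorised"],
}

entry = build_payment_authorised_entry("pay-1", 2500, "AUD")

# With boto3 installed and credentials configured, publishing would be roughly:
#   import boto3
#   boto3.client("events").put_events(Entries=[entry])
```

The payment context defines `Source`, `DetailType`, and the detail schema. Each consumer owns only its rule and its reaction.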
GCP Implementation Patterns
GCP is strong when you want simple serverless containers, global messaging, data and analytics gravity, and open standards.
| Concern | GCP option | When it fits |
|---|---|---|
| Compute | Cloud Run, Cloud Functions, GKE | Cloud Run for service boundaries with low operational overhead, Cloud Functions for lightweight event handlers, GKE for Kubernetes-heavy platforms. |
| Data | Firestore, Cloud SQL, Spanner, BigQuery | Firestore for document aggregates, Cloud SQL for relational apps, Spanner for global consistency, BigQuery for analytics. |
| Event routing | Pub/Sub, Eventarc | Pub/Sub for global messaging, Eventarc for CloudEvents routing into services. |
| Workflow | Workflows | API orchestration and step-based processes. |
| API boundary | API Gateway, Apigee, Cloud Load Balancing | Lightweight APIs, enterprise API management, global routing. |
GCP Pattern: Bounded Context + Cloud Run + Pub/Sub
A payment context can run as a Cloud Run service and publish domain events to Pub/Sub. Fraud, notification, and analytics consumers each subscribe independently, so one slow or failing consumer does not block the others. Eventarc can route events into Cloud Run services and delivers them in the CloudEvents format.
This is especially attractive when the team wants container portability and simple deployment without managing clusters.
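The sketch below shows the two ideas in that pattern: a CloudEvents 1.0 envelope and independent subscriptions. `InMemoryTopic` is a hypothetical stand-in for a Pub/Sub topic so the example runs without cloud access, and the event type and source values are assumptions.

```python
import json
import uuid
from datetime import datetime, timezone


def to_cloudevent(event_type, source, data):
    """Wrap a domain event in a CloudEvents 1.0 structured-mode envelope."""
    return {
        # The four attributes the CloudEvents 1.0 spec requires.
        "specversion": "1.0",
        "id": str(uuid.uuid4()),
        "source": source,
        "type": event_type,
        # Optional attributes.
        "time": datetime.now(timezone.utc).isoformat(),
        "datacontenttype": "application/json",
        "data": data,
    }


class InMemoryTopic:
    """Hypothetical stand-in for a Pub/Sub topic, for illustration only."""

    def __init__(self):
        self.subscriptions = {}

    def subscribe(self, name):
        self.subscriptions[name] = []

    def publish(self, event):
        payload = json.dumps(event).encode("utf-8")
        # Each subscription receives its own copy of the message.
        for queue in self.subscriptions.values():
            queue.append(payload)


topic = InMemoryTopic()
for consumer in ("fraud", "notification", "analytics"):
    topic.subscribe(consumer)

topic.publish(to_cloudevent(
    "com.example.payments.PaymentAuthorised",  # hypothetical type name
    "/payments",                               # hypothetical source
    {"paymentId": "pay-1", "amountCents": 2500, "currency": "AUD"},
))
```

With the real client library, the publish step would go through `google.cloud.pubsub_v1.PublisherClient`, and each consumer would own its own subscription on the topic.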
CQRS And Event Sourcing
Command Query Responsibility Segregation (CQRS) separates write models from read models.
Event sourcing stores state changes as a sequence of events.
These patterns are powerful, but they are often overused.
| Pattern | Use it when… | Avoid it when… |
|---|---|---|
| CQRS | Read and write workloads have very different shapes, scale, or query needs. | A simple CRUD model is enough. |
| Event sourcing | You need a complete audit trail and can model business state as events. | The team lacks operational maturity for replay, schema evolution, and projections. |
| Sagas | A long-running process spans multiple services and needs compensation. | A single transactional boundary would be simpler and safer. |
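Event sourcing is easier to judge once you see its core move: current state is a fold over past events. This is a minimal sketch with assumed event names and fields, not a full event store.

```python
from dataclasses import dataclass, replace
from functools import reduce


@dataclass(frozen=True)
class PaymentState:
    status: str = "NONE"
    amount_cents: int = 0
    captured_cents: int = 0


def apply_event(state, event):
    """Return the next state for one event; never mutate the old state."""
    kind = event["type"]
    if kind == "PaymentInitiated":
        return replace(state, status="PENDING", amount_cents=event["amountCents"])
    if kind == "PaymentAuthorised":
        return replace(state, status="AUTHORISED")
    if kind == "PaymentCaptured":
        captured = state.captured_cents + event["amountCents"]
        status = "CAPTURED" if captured >= state.amount_cents else "PARTIALLY_CAPTURED"
        return replace(state, status=status, captured_cents=captured)
    # Assumption: unknown event types are ignored so older code tolerates newer events.
    return state


def replay(events):
    """Rebuild state from scratch by folding over the event history."""
    return reduce(apply_event, events, PaymentState())


history = [
    {"type": "PaymentInitiated", "amountCents": 2500},
    {"type": "PaymentAuthorised"},
    {"type": "PaymentCaptured", "amountCents": 1000},
    {"type": "PaymentCaptured", "amountCents": 1500},
]
state = replay(history)
```

Every change to `apply_event` is effectively a change to how history is interpreted. That is the operational maturity the table refers to.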
AWS CQRS Shape
- Write side: DynamoDB or Aurora.
- Event stream: DynamoDB Streams, EventBridge, or Kinesis.
- Projection workers: Lambda, ECS, or Kinesis consumers.
- Read side: Aurora, OpenSearch, DynamoDB, or analytics stores.
GCP CQRS Shape
- Write side: Firestore, Cloud SQL, or Spanner.
- Event stream: Pub/Sub, or database change streams (for example Spanner change streams) where appropriate.
- Projection workers: Cloud Run, Cloud Functions, or Dataflow.
- Read side: BigQuery, Firestore, Cloud SQL, or search indexes.
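Both shapes follow the same flow regardless of platform: the write side validates and emits events, a projection worker consumes them, and queries hit a separate read model. Here is an in-memory sketch of that flow. The class names are hypothetical, and plain dictionaries stand in for the databases listed above.

```python
from collections import defaultdict


class PaymentWriteModel:
    """Command side: enforces rules, stores state, emits events."""

    def __init__(self):
        self.payments = {}
        self.outbox = []  # events waiting to be published

    def authorise(self, payment_id, customer_id, amount_cents):
        if payment_id in self.payments:
            raise ValueError("payment already exists")
        self.payments[payment_id] = {"customerId": customer_id,
                                     "amountCents": amount_cents,
                                     "status": "AUTHORISED"}
        self.outbox.append({"type": "PaymentAuthorised",
                            "paymentId": payment_id,
                            "customerId": customer_id,
                            "amountCents": amount_cents})


class CustomerPaymentsReadModel:
    """Query side: shaped for one question, 'what has this customer paid?'"""

    def __init__(self):
        self.by_customer = defaultdict(dict)

    def project(self, event):
        if event["type"] == "PaymentAuthorised":
            # Upsert keyed by payment id, so a duplicate delivery is harmless.
            self.by_customer[event["customerId"]][event["paymentId"]] = event["amountCents"]

    def total_for(self, customer_id):
        return sum(self.by_customer.get(customer_id, {}).values())


write_side = PaymentWriteModel()
read_side = CustomerPaymentsReadModel()
write_side.authorise("pay-1", "cust-9", 2500)
write_side.authorise("pay-2", "cust-9", 1000)

for event in write_side.outbox + write_side.outbox:  # simulate redelivery
    read_side.project(event)
```

The read model lags the write model by however long the projection takes. Accepting that lag is the price of CQRS.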
AWS vs GCP: Decision Table
| If you need… | AWS may fit better | GCP may fit better |
|---|---|---|
| Rich event routing | EventBridge rules and integrations. | Pub/Sub plus Eventarc where simpler routing is enough. |
| Serverless containers | ECS/Fargate is mature and flexible. | Cloud Run is extremely simple to operate. |
| Global strongly consistent data | Possible, but usually more architecture work. | Cloud Spanner is a first-class option. |
| Enterprise workflow orchestration | Step Functions has strong visibility and ecosystem maturity. | Workflows is simpler for API orchestration. |
| Data and analytics gravity | Strong, but often more service assembly. | BigQuery, Dataflow, and Pub/Sub are a natural combination. |
| Fine-grained control | AWS offers many specialised knobs. | GCP offers fewer, often simpler primitives. |
Security And Well-Architected Gaps To Call Out
Cloud-native architecture reviews often over-index on service selection. The more important review is whether the workload can be operated securely, reliably, and economically by the team that owns it.
| Gap | What it looks like | What to review |
|---|---|---|
| Shared data ownership | Multiple bounded contexts write to the same tables or buckets. | Authoritative data owner, access pattern, schema ownership, and change process. |
| Boundary bypass | Teams call databases or internal APIs directly because it is easier. | API contracts, service identity, network boundaries, and dependency mapping. |
| Identity sprawl | Services, jobs, and functions use broad roles or shared credentials. | Least privilege, workload identity, secret rotation, and service-level audit. |
| Missing threat model | Architecture diagrams show happy paths but not abuse paths. | Trust boundaries, data classification, external inputs, admin paths, and supply chain risk. |
| Reliability assumptions | The design assumes managed services remove failure modes. | Quotas, limits, regional dependencies, retry behaviour, backup, failover, and replay. |
| Cost blind spots | Event fan-out, logs, analytics, data transfer, and idle capacity are not estimated. | Cost model, budgets, unit economics, and high-cardinality observability cost. |
AWS and GCP both provide Well-Architected guidance, but the practical test is simple: can a future engineer understand the boundary, operate it under stress, secure the data, and explain the cost?
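Some of these gaps can be checked mechanically. As one example, this sketch flags the shared data ownership gap from a simple inventory of which context writes to which data store. The inventory format and the names in it are assumptions; in practice the data might come from IAM policies or infrastructure code.

```python
from collections import defaultdict


def find_shared_writers(writes_by_context):
    """Return data stores written by more than one bounded context."""
    writers = defaultdict(set)
    for context, stores in writes_by_context.items():
        for store in stores:
            writers[store].add(context)
    return {store: sorted(contexts)
            for store, contexts in writers.items()
            if len(contexts) > 1}


# Hypothetical inventory: context name -> data stores it writes to.
inventory = {
    "payments": ["payments_table", "payment_events_bucket"],
    "ledger": ["ledger_table", "payments_table"],
    "notifications": ["notification_log"],
}

violations = find_shared_writers(inventory)
# violations == {"payments_table": ["ledger", "payments"]}
```

A non-empty result is a prompt for a conversation about who owns the data, not an automatic failure.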
The Architecture Review Checklist
Before committing to the design, ask:
- What bounded contexts exist, and who owns each one?
- Which data source is authoritative for each important entity?
- Which events are business facts rather than technical notifications?
- How are event schemas versioned?
- What happens when a consumer fails?
- Can we replay events safely?
- Where do we need strong consistency, and where is eventual consistency acceptable?
- What is the operational dashboard for each context?
- Who gets paged?
- What would make this design simpler?
That last question matters.
Cloud-native systems can become elaborate very quickly. The best architecture is not the one with the most services. It is the one where boundaries, ownership, and recovery are easiest to understand.
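Two of the checklist questions, what happens when a consumer fails and whether events can be replayed safely, come down to a small amount of consumer logic. The sketch below shows one possible shape: skip event ids that were already processed, retry a bounded number of times, then park the event in a dead-letter list. The function name, the `id` field, and the in-memory stores are assumptions; a real consumer would persist the processed ids and rely on the queue's own dead-letter support.

```python
def consume(events, handler, max_attempts=3):
    """Process events at most once each, with bounded retries and dead-lettering."""
    processed_ids = set()
    dead_letters = []
    for event in events:
        if event["id"] in processed_ids:
            continue  # duplicate delivery or replay: already handled
        for attempt in range(1, max_attempts + 1):
            try:
                handler(event)
                processed_ids.add(event["id"])
                break
            except Exception as exc:
                if attempt == max_attempts:
                    # Give up and keep the event for inspection and later replay.
                    dead_letters.append({"event": event, "error": repr(exc)})
    return processed_ids, dead_letters


handled = []


def notify(event):
    """Hypothetical handler that cannot process one malformed event."""
    if event.get("poison"):
        raise ValueError("cannot parse event")
    handled.append(event["id"])


stream = [
    {"id": "e1"},
    {"id": "e1"},                  # duplicate delivery
    {"id": "e2", "poison": True},  # always fails
    {"id": "e3"},
]
done, dlq = consume(stream, notify)
```

The retry loop assumes the handler has no partial side effects when it fails. If it does, the handler itself needs to be idempotent.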
The Practical Recommendation
Choose AWS if your system needs rich event routing, mature enterprise workflow patterns, broad service depth, and precise operational control.
Choose GCP if your team values serverless container simplicity, global messaging, data and analytics integration, and open event standards.
But do not let the platform choice hide the deeper discipline.
The architecture succeeds when business boundaries and platform boundaries reinforce each other.
That is the real cloud-native skill.
Written by Haris Habib from Sydney, Australia | February 2026