DevOps · 12 August 2025

Zero-Downtime Deployments with CDK and CloudFront

Our CI/CD pipeline deploys to three environments with automated E2E testing, blue-green switches, and instant rollback.

Deploying a serverless SSR application on AWS involves more moving parts than you might expect: CloudFront distributions, Lambda@Edge or Lambda function URLs, S3 static assets, API Gateway, and DynamoDB tables. Getting zero-downtime deployments right required careful orchestration of all of them. Here is our approach.

The Pipeline

Every push to main triggers a GitHub Actions workflow on self-hosted macOS ARM64 runners. The pipeline runs typecheck, lint, and unit tests in parallel. If all pass, it deploys to development and staging simultaneously. Staging gets a full Playwright E2E suite. Only after E2E passes does production deploy.
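
The gating order described above can be sketched in a few lines of TypeScript. This is only an illustration of the sequencing, not our actual workflow definition (which lives in GitHub Actions); the `Step` type, the `PipelineSteps` field names, and `runPipeline` are hypothetical names introduced for the example.

```typescript
// A step is any async unit of work. In a real pipeline each one would
// shell out to a tool; here they are injected so the ordering is visible.
type Step = () => Promise<void>;

interface PipelineSteps {
  checks: Step[]; // e.g. typecheck, lint, unit tests
  deployDevelopment: Step;
  deployStaging: Step;
  e2eOnStaging: Step;
  deployProduction: Step;
}

async function runPipeline(steps: PipelineSteps): Promise<void> {
  // Start all checks together. Promise.all rejects if any check fails,
  // so none of the later stages ever start.
  await Promise.all(steps.checks.map((check) => check()));

  // Development and staging are deployed at the same time.
  await Promise.all([steps.deployDevelopment(), steps.deployStaging()]);

  // Production is gated on the E2E suite passing against staging.
  await steps.e2eOnStaging();
  await steps.deployProduction();
}
```

Because every stage is awaited before the next begins, a rejected promise anywhere in the chain stops the run before production is touched.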

CDK for Everything

All infrastructure is defined in AWS CDK with TypeScript. We have three stacks: certificate management (Route 53 + ACM), the CloudFront distribution with S3 and API Gateway origins, and the application stack with Lambda functions and DynamoDB tables. CDK gives us type-safe infrastructure that we can review in pull requests like any other code change.
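
A heavily simplified sketch of how three such stacks might be wired together is shown below. It is not our production code: the domain, hosted zone ID, construct IDs, and table key are placeholders, everything is assumed to live in us-east-1 (where CloudFront requires its certificate), and the API Gateway origin and the Lambda functions are left out to keep the example short.

```typescript
import { App, Stack, StackProps } from "aws-cdk-lib";
import * as acm from "aws-cdk-lib/aws-certificatemanager";
import * as cloudfront from "aws-cdk-lib/aws-cloudfront";
import * as origins from "aws-cdk-lib/aws-cloudfront-origins";
import * as dynamodb from "aws-cdk-lib/aws-dynamodb";
import * as route53 from "aws-cdk-lib/aws-route53";
import * as s3 from "aws-cdk-lib/aws-s3";
import { Construct } from "constructs";

// Placeholder values, for illustration only.
const DOMAIN = "example.com";
const HOSTED_ZONE_ID = "Z0000000EXAMPLE";

class CertificateStack extends Stack {
  readonly certificate: acm.ICertificate;

  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);
    const zone = route53.HostedZone.fromHostedZoneAttributes(this, "Zone", {
      hostedZoneId: HOSTED_ZONE_ID,
      zoneName: DOMAIN,
    });
    // DNS-validated certificate for the distribution's domain.
    this.certificate = new acm.Certificate(this, "Certificate", {
      domainName: DOMAIN,
      validation: acm.CertificateValidation.fromDns(zone),
    });
  }
}

interface DistributionStackProps extends StackProps {
  certificate: acm.ICertificate;
}

class DistributionStack extends Stack {
  constructor(scope: Construct, id: string, props: DistributionStackProps) {
    super(scope, id, props);
    const assets = new s3.Bucket(this, "Assets");
    // Only the S3 origin is shown; an API Gateway origin would be added
    // as an additional behavior.
    new cloudfront.Distribution(this, "Distribution", {
      domainNames: [DOMAIN],
      certificate: props.certificate,
      defaultBehavior: {
        origin: origins.S3BucketOrigin.withOriginAccessControl(assets),
      },
    });
  }
}

class ApplicationStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);
    // Lambda functions omitted; the table key name is an assumption.
    new dynamodb.Table(this, "Table", {
      partitionKey: { name: "pk", type: dynamodb.AttributeType.STRING },
      billingMode: dynamodb.BillingMode.PAY_PER_REQUEST,
    });
  }
}

const app = new App();
const env = { region: "us-east-1" };
const certificates = new CertificateStack(app, "Certificates", { env });
new DistributionStack(app, "Distribution", { env, certificate: certificates.certificate });
new ApplicationStack(app, "Application", { env });
app.synth();
```

Passing the certificate through typed stack props is what makes the dependency between stacks explicit and reviewable: a missing or mistyped reference fails at compile time rather than at deploy time.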

Asset Versioning

Static assets are uploaded to S3 under content-hashed filenames before the Lambda function is updated. Until the new Lambda version starts serving, the HTML still references the old hashes, and those files remain in the bucket; once it takes over, every asset it references is already in place. This closes the window in which HTML points at assets that do not exist yet, which is the most common source of deployment-related 404s.
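
As a minimal sketch of content-hash naming, assuming SHA-256 truncated to 12 hex characters (the hash algorithm, the length, and the `hashedFileName` helper are illustrative choices, and most bundlers will do this for you):

```typescript
import { createHash } from "node:crypto";
import { posix as path } from "node:path";

// Derive a file name that changes whenever the contents change, so a new
// build never overwrites an object that older HTML still references.
function hashedFileName(fileName: string, contents: string | Uint8Array): string {
  const hash = createHash("sha256").update(contents).digest("hex").slice(0, 12);
  const ext = path.extname(fileName); // e.g. ".js", or "" if there is none
  const stem = fileName.slice(0, fileName.length - ext.length);
  return `${stem}.${hash}${ext}`;
}
```

Identical contents always map to the same name and changed contents map to a new one, so uploading a new build only ever adds objects to the bucket.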

Instant Rollback

Because Lambda versions are immutable and S3 assets are content-hashed, rollback is as simple as pointing the Lambda alias back to the previous version. Old assets are still in S3, so the previous HTML version finds everything it needs. We can roll back production in under 30 seconds.
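
Choosing the rollback target can be sketched as a small pure function. It assumes the previous version is simply the highest published version number below the one the alias currently points at; `previousVersion` is a hypothetical helper, not part of any AWS SDK. The returned value would then be passed as the function version in a Lambda `UpdateAlias` call.

```typescript
// Published Lambda versions are numeric strings ("1", "2", ...).
// "$LATEST" is mutable and therefore never a valid rollback target.
function previousVersion(
  publishedVersions: string[],
  currentVersion: string,
): string | undefined {
  const current = Number(currentVersion);
  const older = publishedVersions
    .filter((v) => /^\d+$/.test(v)) // drop "$LATEST" and anything non-numeric
    .map(Number)
    .filter((v) => v < current)
    .sort((x, y) => x - y);
  // The highest version below the current one, if any exists.
  return older.length > 0 ? String(older[older.length - 1]) : undefined;
}
```

Returning `undefined` when there is nothing older makes the "first ever deployment" case explicit, so a rollback script can fail loudly instead of pointing the alias at a version that does not exist.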

Monitoring and Alerting

CloudWatch alarms monitor error rates, latency percentiles, and Lambda throttling across all three environments. Deployment failures trigger automatic Slack notifications with direct links to the failed step. We have caught issues in staging that would have been production incidents; the E2E gate has paid for itself many times over.
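
A failure notification of this kind might be built as follows. This is a sketch under assumptions: the `FailedStep` shape, the wording, and the `slackFailureMessage` helper are invented for illustration. The URL follows the GitHub Actions job URL pattern, and the `<url|label>` link syntax and the `text` field are what Slack incoming webhooks accept.

```typescript
interface FailedStep {
  repository: string; // "owner/name"
  runId: number; // GitHub Actions run ID
  jobId: number; // ID of the job that contains the failed step
  environment: "development" | "staging" | "production";
  step: string;
}

// Build the JSON body for a Slack incoming webhook.
function slackFailureMessage(failure: FailedStep): { text: string } {
  const url =
    `https://github.com/${failure.repository}` +
    `/actions/runs/${failure.runId}/job/${failure.jobId}`;
  return {
    text:
      `:x: Deploy to ${failure.environment} failed at step ` +
      `"${failure.step}". <${url}|View job>`,
  };
}
```

Keeping the message builder separate from the HTTP call makes it easy to unit-test the wording and the link without hitting Slack.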
