HeavyDeets is in testing – meaning I’m letting people create accounts and hit it. One of the things I wanted for HeavyDeets was a clean deploy pipeline — the kind where merging a PR is the last manual step and everything else just happens. Here’s the setup we landed on for the API, built on GitHub Actions, Amazon ECR, and ECS.
The trigger
push to master — touching api/** or the workflow file itself
The pipeline only fires when it needs to. A push to master that only touches the frontend, docs, or anything outside api/** won’t trigger a new API deploy. This keeps the build history meaningful — if a run appears, an API change caused it.
The workflow also watches its own file (.github/workflows/deploy-api.yml), so changes to the pipeline itself get tested immediately rather than silently taking effect on the next unrelated merge.
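In workflow YAML, that trigger looks roughly like this (a sketch — the branch and path filters are the ones described above):

```yaml
# Trigger block in .github/workflows/deploy-api.yml
on:
  push:
    branches:
      - master
    paths:
      - "api/**"
      - ".github/workflows/deploy-api.yml"
```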
The nine steps
When the trigger fires, GitHub spins up a fresh Ubuntu VM. Here’s what happens:
| # | Step | What happens |
|---|------|--------------|
| 1 | Checkout | Pulls the repository at the commit that triggered the workflow. Clean slate — no state from previous runs. |
| 2 | Configure AWS credentials | Authenticates against AWS as the IAM role. Credentials are injected from GitHub secrets — nothing sensitive lives in the repo. |
| 3 | ECR login | Calls aws-actions/amazon-ecr-login to get a temporary Docker registry token (12-hour TTL), then runs docker login against the private ECR registry. The runner is now authorized to push images. |
| 4 | Docker build & push | Builds from api/Dockerfile and pushes with two tags: the git commit SHA (e.g. abc1234) for an immutable record of every deployed version, and latest as a convenience pointer. Both land in ECR. |
| 5 | Download task definition | Fetches the current ECS task definition JSON via aws ecs describe-task-definition. This is the full spec — CPU, memory, environment variables, port mappings, execution role — saved to a local file for the next step to operate on. |
| 6 | Render new task definition | amazon-ecs-render-task-definition opens the JSON, swaps the image URI in the Main container to the newly pushed SHA-tagged image, and writes the updated spec to a temp file. Everything else stays identical. |
| 7 | Register task definition revision | Registers the updated JSON with ECS. This creates a new numbered revision — :2, :3, and so on. ECS retains the full history, so rolling back is always just a register-and-deploy of a previous revision. |
| 8 | Update service | Tells the heavydeets-api ECS service to deploy the new revision. ECS starts a fresh task with the new image, waits for it to pass its health check, then drains and stops the old task. Zero-downtime by design. |
| 9 | Wait for service stability | Polls ECS until the service reports stable — all desired tasks running, nothing pending. If it doesn't stabilize within the timeout, the workflow fails and the old task is still serving traffic. You get a clear signal before any user impact. |
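The whole sequence can be sketched as a single job. This is not the actual workflow file — the cluster, repository, region, and secret names are placeholders — but the action names match the ones described above, and steps 7–9 collapse into one deploy action:

```yaml
# Sketch of the deploy job. Cluster/repo/region/secret names are assumptions.
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-east-1

      - name: Log in to ECR
        id: ecr
        uses: aws-actions/amazon-ecr-login@v2

      - name: Build and push image
        run: |
          # Full SHA shown here; the post displays a short SHA, which
          # works the same way.
          IMAGE="${{ steps.ecr.outputs.registry }}/heavydeets-api"
          docker build -f api/Dockerfile \
            -t "$IMAGE:${{ github.sha }}" -t "$IMAGE:latest" api
          docker push "$IMAGE:${{ github.sha }}"
          docker push "$IMAGE:latest"

      - name: Download current task definition
        run: |
          aws ecs describe-task-definition \
            --task-definition heavydeets-api \
            --query taskDefinition > task-definition.json

      - name: Render new task definition
        id: render
        uses: aws-actions/amazon-ecs-render-task-definition@v1
        with:
          task-definition: task-definition.json
          container-name: Main
          image: ${{ steps.ecr.outputs.registry }}/heavydeets-api:${{ github.sha }}

      - name: Register, deploy, and wait for stability
        uses: aws-actions/amazon-ecs-deploy-task-definition@v2
        with:
          task-definition: ${{ steps.render.outputs.task-definition }}
          service: heavydeets-api
          cluster: heavydeets
          wait-for-service-stability: true
```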
Merge to live traffic in roughly 12 minutes.
A few things worth calling out
SHA tags are the real artifact. The latest tag is just a pointer — it moves with every deploy. The SHA tag is an immutable link between a git commit and the exact container image that ran in production. If something goes wrong, you know exactly which code is running.
Task definition history is free rollback. Because ECS keeps every registered revision, rolling back a bad deploy doesn’t require any special tooling – point the service at an older revision number. The deploy pipeline itself handles the register-and-deploy sequence, so a rollback looks identical to a forward deploy.
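Concretely, a rollback can be as small as pointing the service at an older revision. A minimal sketch (cluster, service, and revision number are hypothetical):

```yaml
# One-off rollback step: re-deploy a previous task definition revision.
- name: Roll back to revision 2
  run: |
    aws ecs update-service \
      --cluster heavydeets \
      --service heavydeets-api \
      --task-definition heavydeets-api:2
```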
The stability gate (step 9) is the most important step. Without it, the workflow succeeds the moment update-service is called — before ECS has finished replacing the old task. If your new image crashes on startup, you’ll know before the workflow completes.
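If you were scripting the pipeline by hand rather than using the deploy action, the same gate is available as an AWS CLI waiter — it polls the service and exits non-zero if stability isn't reached, failing the step. Cluster and service names here are assumptions:

```yaml
- name: Wait for service stability
  run: |
    # Exits 0 once all desired tasks are running and steady;
    # exits non-zero (failing the workflow) if the waiter times out.
    aws ecs wait services-stable \
      --cluster heavydeets \
      --services heavydeets-api
```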
The runner is ephemeral. GitHub provisions a fresh Ubuntu VM for every run and tears it down after. There’s no shared state between deploys, no credentials lingering on disk, no Docker layer cache to corrupt. The tradeoff is slightly slower builds (each run pulls base layers fresh), but the security and reproducibility properties are worth it at this stage.
What’s next
The current setup is a straightforward single-container, single-service deploy. As HeavyDeets grows, I’ll probably add a separate staging environment that deploys from main before anything reaches production.
For now though — merge, wait twelve minutes, done.

