“A deployment pipeline that requires a human in the loop for every push isn’t a pipeline — it’s a bottleneck with extra steps.”

This is a full walkthrough of taking a Flask application from local development to production on AWS using Docker, ECR, ECS, and GitHub Actions — with secrets handled properly from the start, not bolted on as an afterthought.

Figure: CI/CD pipeline architecture. Pipeline flow: local development → GitHub Actions → ECR → ECS


Architecture

The flow is straightforward:

  1. Developer pushes to main
  2. GitHub Actions builds the Docker image, authenticates with AWS via stored secrets, pushes to ECR, and triggers an ECS deployment
  3. ECS runs the container with task-scoped IAM roles — no credentials baked into the image or the environment

Each component has one job:

Component | Role
--- | ---
Docker | Consistent, reproducible builds across every environment
ECR | Private, versioned image storage inside your AWS account
ECS | Managed container scheduling and deployment
GitHub Actions | Automated, auditable CI/CD execution

The point of this architecture is that no human needs to touch a deployment after the initial setup. Code merged to main ships. That’s the contract.


Step 1: Dockerizing the Application

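# Slim base image: smaller attack surface, less to patch
FROM python:3.9-slim

WORKDIR /opt/cuteblog

# Copy the dependency list first so the install layer stays cached
# until requirements.txt itself changes
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application source
COPY . .

# Documents the port the app listens on; publish it with -p at run time
EXPOSE 5000

CMD ["python3", "app.py"]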

python:3.9-slim over the full image — smaller attack surface, faster builds, less to patch. --no-cache-dir keeps the image lean.

requirements.txt:

Flask==2.0.1
SQLAlchemy==1.4.23

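Pin your versions. Unpinned dependencies are a supply chain risk and a debugging nightmare: a package update you didn’t ask for breaking a deployment at 11pm is not a good time. That includes transitive dependencies. Flask 2.0.1 accepts any Werkzeug release from 2.0 up, so a fresh build can pull in Werkzeug 3.x, which Flask 2.0.1 cannot import against. Generating the file with pip freeze from a known-good environment pins the whole tree.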

Build and test locally before touching AWS:

docker build -t cuteblog-flask-image .
docker run -d -p 5000:5000 cuteblog-flask-image

If it doesn’t work locally, it won’t work in ECS. Validate the container first.
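One way to make that validation repeatable is a small smoke test that polls the published port until the app answers. The sketch below is a hypothetical helper, not part of the original project. It assumes the container started by the docker run command above is listening on localhost:5000 and that the root path returns a successful status.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url, timeout=2.0):
    """Return the HTTP status code for url, or None if the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (4xx/5xx).
        return exc.code
    except (urllib.error.URLError, OSError):
        # Connection refused, reset, or timed out: the app is not ready yet.
        return None


def wait_until_healthy(url, attempts=10, delay=1.0,
                       fetch=fetch_status, sleep=time.sleep):
    """Poll url until it returns a 2xx/3xx status or the attempts run out."""
    for attempt in range(attempts):
        status = fetch(url)
        if status is not None and 200 <= status < 400:
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False


# Example call (assumed URL, matching `-p 5000:5000`):
#   wait_until_healthy("http://localhost:5000/")
```

The fetch and sleep parameters exist only so the polling logic can be exercised without a running container.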


Step 2: AWS Infrastructure

ECR Repository

Create the repository via CLI — don’t click through the console for things you’ll need to reproduce:

aws ecr create-repository \
  --repository-name cuteblog-flask-image \
  --image-scanning-configuration scanOnPush=true \
  --region us-east-1

scanOnPush=true enables automatic vulnerability scanning on every pushed image. Enable it now, not when something goes wrong.

ECS Cluster

aws ecs create-cluster --cluster-name cuteblog-cluster

Step 3: Secrets Management

This is where most pipelines get it wrong. Credentials don’t belong in code, config files, or Docker images. Full stop.

Local Development

.env file, not committed:

AWS_ACCESS_KEY_ID=your_access_key
AWS_SECRET_ACCESS_KEY=your_secret_key
AWS_REGION=us-east-1

.gitignore — non-negotiable:

.env
*.env

In 2023 alone, thousands of AWS keys were exposed on GitHub. Most of them were committed by developers who knew better and thought “I’ll fix it later.” There is no later.
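A .gitignore entry only protects the files it names, so a second line of defence is to scan text for strings shaped like AWS access key IDs before it is committed. The following is a minimal sketch of that idea, not a replacement for a dedicated secret scanner. The function names are hypothetical, and the pattern only covers access key IDs with the AKIA (long-term) and ASIA (temporary) prefixes.

```python
import re
from pathlib import Path

# AWS access key IDs are 20 characters: a prefix such as AKIA or ASIA
# followed by 16 uppercase letters or digits.
ACCESS_KEY_PATTERN = re.compile(r"\b(?:AKIA|ASIA)[A-Z0-9]{16}\b")


def find_access_keys(text):
    """Return every substring of text that looks like an AWS access key ID."""
    return ACCESS_KEY_PATTERN.findall(text)


def scan_paths(paths):
    """Map each readable file path to the key-like strings found in it."""
    findings = {}
    for path in paths:
        try:
            text = Path(path).read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # Skip directories and unreadable files.
        matches = find_access_keys(text)
        if matches:
            findings[str(path)] = matches
    return findings
```

It could be wired into a pre-commit hook by passing it the staged file names and failing the commit when the returned dictionary is not empty.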

GitHub Actions Secrets

Repository → Settings → Secrets and variables → Actions. Add:

  • AWS_ACCESS_KEY_ID
  • AWS_SECRET_ACCESS_KEY
  • AWS_REGION
  • AWS_ACCOUNT_ID

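GitHub injects these at runtime and masks their values in workflow logs. Masking only matches the exact stored value, so avoid printing transformed copies such as base64-encoded output.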

Production: OIDC Over Long-Term Keys

For anything beyond a personal project, ditch the long-term access keys entirely. GitHub Actions supports OIDC-based role assumption — the pipeline gets a short-lived token scoped to the execution, no keys stored anywhere:

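# The job also needs `permissions: id-token: write` (plus `contents: read`
# for checkout) so it can request the OIDC token.
- name: Configure AWS credentials
  uses: aws-actions/configure-aws-credentials@v2
  with:
    role-to-assume: arn:aws:iam::${{ secrets.AWS_ACCOUNT_ID }}:role/github-actions-deploy
    aws-region: ${{ secrets.AWS_REGION }}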

Set it up once, rotate nothing, and your audit trail is clean. This is what production looks like.
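The one-time setup is mostly the trust policy on the role being assumed. The AWS account needs an IAM OIDC identity provider for token.actions.githubusercontent.com, and the role’s trust policy should restrict which repository and branch may assume it. The sketch below builds such a policy document. The account ID and the example-org/cuteblog repository name are placeholders, not values from the original project.

```python
import json


def github_oidc_trust_policy(account_id, repo, branch="main"):
    """Build an IAM trust policy letting one repo branch assume a role via OIDC."""
    provider = "token.actions.githubusercontent.com"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{provider}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        # Audience requested by the AWS credentials action.
                        f"{provider}:aud": "sts.amazonaws.com",
                        # Only workflows running on this repo and branch match.
                        f"{provider}:sub": f"repo:{repo}:ref:refs/heads/{branch}",
                    }
                },
            }
        ],
    }


# Placeholder account ID and repository name, for illustration only.
policy = github_oidc_trust_policy("123456789012", "example-org/cuteblog")
print(json.dumps(policy, indent=2))
```

Pinning the sub claim to a single branch means a workflow running from a fork or a feature branch cannot assume the deploy role.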


Step 4: The GitHub Actions Workflow

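name: CI/CD Pipeline

on:
  push:
    branches:
      - main

jobs:
  build-and-deploy:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v2
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: ${{ secrets.AWS_REGION }}

      - name: Login to Amazon ECR
        id: login-ecr
        uses: aws-actions/amazon-ecr-login@v2

      - name: Build, tag, and push image to ECR
        id: build-image
        env:
          ECR_REGISTRY: ${{ steps.login-ecr.outputs.registry }}
          ECR_REPOSITORY: cuteblog-flask-image
          IMAGE_TAG: ${{ github.sha }}
        run: |
          docker build -t "$ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG" .
          docker tag "$ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG" "$ECR_REGISTRY/$ECR_REPOSITORY:latest"
          docker push "$ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG"
          docker push "$ECR_REGISTRY/$ECR_REPOSITORY:latest"
          # Expose the SHA-tagged image reference to later steps
          echo "image=$ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG" >> "$GITHUB_OUTPUT"

      # Write the SHA-tagged image into the task definition so the deployed
      # revision points at this exact build
      - name: Render task definition
        id: render-task-def
        uses: aws-actions/amazon-ecs-render-task-definition@v1
        with:
          task-definition: task-definition.json
          container-name: cuteblog  # assumption: must match the container name in task-definition.json
          image: ${{ steps.build-image.outputs.image }}

      - name: Deploy to ECS
        uses: aws-actions/amazon-ecs-deploy-task-definition@v1
        with:
          task-definition: ${{ steps.render-task-def.outputs.task-definition }}
          service: cuteblog-service
          cluster: cuteblog-cluster
          wait-for-service-stability: true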

A few things worth noting:

github.sha as the image tag — every image is tied to an exact commit. You can roll back to any point in history by deploying a specific SHA. latest alone gives you no audit trail and no clean rollback path.

wait-for-service-stability: true — the workflow blocks until ECS confirms the new task is healthy. Without this, your pipeline reports success before the deployment has actually landed.

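Action versions pinned: @v4, @v2, not @latest. Same reason as dependency pinning. Major-version tags can still be moved by the action’s maintainers, so pin to a full commit SHA when you need a stricter guarantee.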
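A commit-tagged image only provides a rollback path if the task definition that ECS runs references that exact tag. The sketch below shows the substitution in isolation: build the ECR image reference from a commit SHA and write it into a task definition. It assumes a task definition shaped like task-definition.json, and the account ID, container name, and short SHA are placeholders.

```python
import copy


def ecr_image_uri(account_id, region, repository, tag):
    """Build the fully qualified ECR image reference for a given tag."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


def render_task_definition(task_definition, container_name, image):
    """Return a copy of the task definition with one container's image replaced."""
    rendered = copy.deepcopy(task_definition)
    for container in rendered["containerDefinitions"]:
        if container["name"] == container_name:
            container["image"] = image
            return rendered
    raise ValueError(f"no container named {container_name!r} in task definition")


# Hypothetical values: the account ID, container name, and SHA are placeholders.
task_definition = {
    "family": "cuteblog",
    "containerDefinitions": [{"name": "cuteblog", "image": "placeholder"}],
}
image = ecr_image_uri("123456789012", "us-east-1", "cuteblog-flask-image", "3f2a9c1")
rendered = render_task_definition(task_definition, "cuteblog", image)
print(rendered["containerDefinitions"][0]["image"])
```

Rolling back is then the same operation with an older SHA: render the task definition against the earlier tag and deploy it.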


Step 5: IAM — Least Privilege, Not Convenience

The path of least resistance in AWS is attaching AdministratorAccess to everything and moving on. That path ends with a breach.

CI/CD IAM User Policy

Scope the pipeline user to exactly what it needs:

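{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "ecr:GetAuthorizationToken",
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "ecr:BatchCheckLayerAvailability",
        "ecr:GetDownloadUrlForLayer",
        "ecr:BatchGetImage",
        "ecr:PutImage",
        "ecr:InitiateLayerUpload",
        "ecr:UploadLayerPart",
        "ecr:CompleteLayerUpload"
      ],
      "Resource": "arn:aws:ecr:us-east-1:ACCOUNT_ID:repository/cuteblog-flask-image"
    },
    {
      "Effect": "Allow",
      "Action": [
        "ecs:UpdateService",
        "ecs:DescribeServices",
        "ecs:RegisterTaskDefinition"
      ],
      "Resource": "*"
    }
  ]
}

ECR push and pull permissions are scoped to the specific repository, not *. The one exception is ecr:GetAuthorizationToken, which does not support resource-level permissions and has to be granted on *; scoping it to the repository ARN makes the ECR login fail. If the task definition names a task role or an execution role, the pipeline user also needs iam:PassRole on those role ARNs. If those credentials are ever compromised, the blast radius is contained.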

ECS Task Execution Role

The role the ECS agent uses to pull images and write logs — separate from the role your application code uses at runtime:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ecr:GetAuthorizationToken",
        "ecr:BatchCheckLayerAvailability",
        "ecr:GetDownloadUrlForLayer",
        "ecr:BatchGetImage",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "*"
    }
  ]
}

One role per purpose. Never reuse roles across services — it makes auditing useless and blast radius management impossible.
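Least privilege is easier to keep when it is checked mechanically. A rough sketch of such a check, assuming policy documents in the JSON shape shown above: it lists every Allow statement that uses a wildcard action or resource so each one can be justified or tightened. The function names are hypothetical, and a wildcard is not always a mistake, since some actions do not support resource-level permissions.

```python
def as_list(value):
    """IAM allows a single string or a list for Action and Resource."""
    return [value] if isinstance(value, str) else list(value)


def find_wildcards(policy):
    """Return (statement index, field, value) for each wildcard in Allow statements."""
    findings = []
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]  # A single statement may be a bare object.
    for index, statement in enumerate(statements):
        if statement.get("Effect") != "Allow":
            continue
        for field in ("Action", "Resource"):
            for value in as_list(statement.get(field, [])):
                if value == "*" or (field == "Action" and value.endswith(":*")):
                    findings.append((index, field, value))
    return findings


policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": ["ecr:PutImage"],
         "Resource": "arn:aws:ecr:us-east-1:ACCOUNT_ID:repository/cuteblog-flask-image"},
        {"Effect": "Allow", "Action": ["ecs:UpdateService"], "Resource": "*"},
    ],
}
print(find_wildcards(policy))  # [(1, 'Resource', '*')]
```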


Troubleshooting

Container exits immediately after start:

# Check what actually happened
docker logs <container-id>

# Run interactively to debug startup
docker run -it cuteblog-flask-image /bin/bash

The usual causes are an incomplete requirements.txt, a crash on startup, or the app binding to the wrong interface. Flask binds to 127.0.0.1 by default, so app.py has to call app.run(host='0.0.0.0', port=5000) to be reachable from outside the container.

GitHub Actions auth failure:

Work through this checklist in order:

  1. Secrets are set in the correct repository (not org-level when you meant repo-level)
  2. IAM user has the permissions above, nothing less
  3. AWS region matches across secrets, workflow file, and ECR repository
  4. ECR repository exists in the region you’re pushing to

Most auth failures are a mismatch in one of those four.

ECS service not updating after deployment:

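Check that wait-for-service-stability is set and that your task definition is registering a new revision. ECS does not start a new deployment when a service is updated with an unchanged task definition unless you force one, for example with aws ecs update-service --force-new-deployment.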


What Production-Ready Looks Like Here

  • Image tagged by commit SHA, not just latest
  • Vulnerability scanning enabled on ECR push
  • IAM scoped to specific resources, not *
  • OIDC-based auth instead of long-term keys
  • Pipeline blocks on deployment health check before reporting success
  • No credentials anywhere in the codebase or Docker image

The pipeline in this post uses long-term keys for simplicity of explanation. Swap in OIDC before going to production — the workflow change is four lines and the security improvement is significant.


Source

Full implementation on GitHub — Dockerfile, workflow, task definition, and IAM policy templates included.

IAM patterns in this post draw on approaches documented by @mesinkasir.


Acknowledgement: Parts of this implementation build upon the work by MesinKasir (@mesinkasir). The codebase was adapted and extended for the purposes of this project.

Tags

#AWS #Infrastructure #Docker #CI/CD #GitHubActions #Security


About the Author

Elijah Udom (elijahu) is an Infrastructure & Cloud Engineer based in Lagos, Nigeria. AWS, Kubernetes, eBPF security, AI/ML infrastructure. Building in the open.
