Docker Best Practices Your AI Coding Assistant Won't Apply by Default


I maintain 6 projects with Docker configs — a Next.js monorepo, a Python email analysis pipeline, a FastAPI social network, a static Astro blog, and a couple of dev environments. Last week I audited all of them.

The pattern was clear: the Dockerfiles I wrote with AI assistance (Claude Code, in my case) started weak. Non-root users? Missing. Read-only filesystems? Never suggested. Signal handling with tini? Not once.

The AI got the basics right — FROM, COPY, RUN, CMD. But the security and operational hardening? I had to learn it, ask for it, and enforce it.

Here’s what I found and what you should check in your own Dockerfiles.


Table of contents

  1. Run as non-root
  2. Pin your base images
  3. Read-only filesystems
  4. Drop capabilities
  5. Signal handling with tini
  6. Named volumes for node_modules
  7. Network isolation
  8. gVisor for untrusted workloads
  9. Resource limits
  10. .dockerignore
  11. Health checks
  12. The AI problem

1. Run as non-root — always

This is the most basic security practice, and AI assistants almost never do it unprompted.

# ❌ Bad: runs as root by default
FROM node:22-alpine
WORKDIR /app
COPY . .
CMD ["node", "server.js"]

# ✅ Good: uses the built-in node user (UID 1000)
FROM node:22-alpine
WORKDIR /app
COPY --chown=node:node . .
USER node
CMD ["node", "server.js"]

For Python, create your own user:

FROM python:3.12-slim
RUN groupadd -g 1000 app && useradd -u 1000 -g app app
WORKDIR /app
COPY --chown=app:app . .
USER app

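Why it matters: if someone exploits your app, they get root in the container. That lets them modify any file in it and install tools, and it makes a container escape easier. With a non-root user, an attacker is limited to what that user can read and write.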

Why AI misses it: it works without it. The Dockerfile builds, the app runs, the tests pass. Non-root is invisible until something goes wrong.


2. Pin your base images

# ❌ Floating tag: works today, resolves to a different image tomorrow
FROM node:22-alpine

# ✅ Reproducible builds
FROM node:22.22.0-alpine3.23

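I learned this the hard way when a minor Alpine update broke a build on a Monday morning. Pin the full version. Your CI will thank you. Keep in mind that even an exact tag can be re-pushed upstream, so for strict reproducibility pin the image digest as well.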

Quick checklist:

  • Use exact version tags, not just the major version (node:22-alpine floats across every 22.x release)
  • Include the OS version (alpine3.23, not just alpine)
  • Update intentionally, not accidentally

3. Use read-only filesystems where possible

# docker-compose.yml
services:
  api:
    image: my-api
    read_only: true
    tmpfs:
      - /tmp:noexec,nosuid,nodev
      - /app/.next/cache:noexec,nosuid

If your app doesn’t need to write to disk (most APIs don’t), make the filesystem read-only. Combined with tmpfs for temp directories, this blocks a whole class of attacks.

In my email analysis pipeline, services that only read data from mounted volumes run with read_only: true. If something gets compromised, it can’t write a backdoor.

Where to use it:

  • ✅ APIs, proxies (nginx, Caddy), static servers
  • ✅ Workers that process data from volumes
  • ❌ Apps that write logs to disk, compile assets, or manage state

4. Drop all capabilities, add only what you need

services:
  api:
    cap_drop:
      - ALL
    cap_add:
      - NET_BIND_SERVICE  # only if binding to ports < 1024
    security_opt:
      - no-new-privileges:true

Linux capabilities split root's powers into fine-grained permissions. By default, Docker grants containers a subset of them, including CHOWN, SETUID, SETGID, and NET_RAW. Drop them all and add back only what you need.

  • cap_drop: ALL removes everything
  • cap_add: NET_BIND_SERVICE is needed only if you bind to ports below 1024
  • no-new-privileges:true prevents processes from gaining privileges through setuid binaries

Most apps need zero capabilities. If yours does, question why.


5. Handle signals properly with tini

RUN apk add --no-cache tini
ENTRYPOINT ["/sbin/tini", "--"]
CMD ["node", "server.js"]

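Without tini, npm or node runs as PID 1. The kernel does not apply default signal handlers to PID 1, so unless the app installs its own handler, SIGTERM is ignored and Docker falls back to SIGKILL after its 10-second stop timeout: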

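|                   | Without tini              | With tini                  |
|-------------------|---------------------------|----------------------------|
| Stop time         | ~10 seconds (forced kill) | <1 second (clean shutdown) |
| Zombie processes  | Accumulate over time      | Properly reaped            |
| Signal forwarding | Broken                    | Works correctly            |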

This one is almost never suggested by AI. I discovered it after wondering why my containers were so slow to restart.
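
The snippet above is Alpine-specific. As a hedged sketch for Debian-based images such as python:3.12-slim: Debian's tini package installs the binary at /usr/bin/tini, so the ENTRYPOINT path changes. The app.py entry point below is a placeholder, not something from my projects.

```dockerfile
FROM python:3.12-slim

# Debian's tini package puts the binary in /usr/bin, not /sbin as on Alpine
RUN apt-get update \
 && apt-get install -y --no-install-recommends tini \
 && rm -rf /var/lib/apt/lists/*

# Same non-root setup as in section 1
RUN groupadd -g 1000 app && useradd -u 1000 -g app app
WORKDIR /app
COPY --chown=app:app . .
USER app

# tini runs as PID 1, forwards SIGTERM to the child, and reaps zombies
ENTRYPOINT ["/usr/bin/tini", "--"]
# app.py is a hypothetical entry point; keep the exec (JSON array) form
# so no shell sits between tini and the app
CMD ["python", "app.py"]
```

If you would rather not install anything, docker run --init injects Docker's bundled init process and gives you the same behavior.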


6. Use named volumes for node_modules

services:
  dev:
    volumes:
      - ./src:/app/src                    # your code (hot reload)
      - app_modules:/app/node_modules     # isolated, fast

volumes:
  app_modules:

Bind-mounting your host node_modules into the container causes two problems:

  1. Slow on macOS/WSL — filesystem translation overhead on every file access
  2. Platform mismatches — native binaries compiled for your host (macOS/Windows) won’t work in Alpine Linux

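Named volumes keep node_modules in a Docker-managed volume on the Linux side instead of on your host. File access is faster, and the native binaries are the ones built for the container's platform.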


7. Network isolation

services:
  # This service has NO internet access
  processor:
    networks:
      - nonet

  # This service is only reachable from the host's loopback interface
  api:
    ports:
      - "127.0.0.1:8080:8080"  # NOT 0.0.0.0
    networks:
      - localonly

networks:
  nonet:
    internal: true    # no internet gateway
  localonly:
    driver: bridge

My email analysis pipeline processes potentially malicious email content. The services that parse attachments have zero internet access — if an attachment exploits a vulnerability, it can’t phone home.

Rules of thumb:

  • Dev servers → always publish ports on 127.0.0.1, not 0.0.0.0 (inside the container the app still listens on 0.0.0.0)
  • Services processing untrusted data → internal: true network
  • Databases → never expose ports to the host unless you need a GUI client

8. gVisor for untrusted workloads

services:
  parser:
    runtime: runsc    # gVisor sandbox
    read_only: true
    cap_drop:
      - ALL
    networks:
      - nonet

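gVisor is an application kernel that runs in user space. It intercepts the container's system calls and handles most of them itself, so the workload rarely talks to the host kernel directly. Regular Docker containers share the host kernel, so a kernel exploit launched from inside a container can compromise the host. gVisor adds a layer between the container and the host kernel. Note that runtime: runsc only works once gVisor is installed on the host and registered as a runtime with the Docker daemon.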

I use it in my email analysis pipeline for services that parse potentially malicious attachments. Combined with no internet access and read-only filesystem, even if an attachment exploits a vulnerability:

  • ❌ Can’t access the network
  • ❌ Can’t write to disk
  • ❌ Can’t reach the real kernel
  • ❌ Can’t escalate privileges

When to use it:

  • ✅ Processing untrusted input — file uploads, email attachments, user-submitted code
  • ❌ Dev environments, trusted internal services — the overhead isn’t worth it

AI will never suggest this. It’s too niche. But for security-sensitive workloads, it’s a game changer.


9. Resource limits

services:
  api:
    mem_limit: 512m
    cpus: 1.0

Without limits, a memory leak or crypto miner in a compromised container eats your entire machine. Set reasonable limits.

Guidelines:

  • Start at 256m–512m for Node.js/Python APIs
  • 1.0 CPU is usually enough for a single service
  • Monitor actual usage and adjust — docker stats is your friend

10. .dockerignore matters

node_modules
.git
.env*
!.env.example
*.log
docs/
*.md
coverage/
.next
dist

Every file not excluded by .dockerignore is part of the build context, and a COPY . . will copy it into the image:

  • A node_modules folder → +500MB to every build
  • A .git folder → +100MB of history you don’t need
  • .env files → secrets baked into an image layer (even if a later step deletes them, they stay in the earlier layer and can be extracted from the image)

My CPA25 project has the strictest .dockerignore — it excludes test files, documentation, logs, and anything that isn’t needed at runtime.


11. Health checks

services:
  db:
    image: postgres:17-alpine
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      timeout: 3s
      retries: 5

  api:
    depends_on:
      db:
        condition: service_healthy

Without health checks, depends_on only waits for the container to start, not for the service to be ready. Your API crashes on startup because Postgres isn’t accepting connections yet.

Common health check patterns:

  • PostgreSQL: pg_isready -U postgres
  • Redis: redis-cli ping
  • HTTP API: curl -f http://localhost:8080/health (requires curl in the image, which many slim and Alpine images don't ship)
  • Python: python -c "import urllib.request; urllib.request.urlopen('http://localhost:8000')"
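
The patterns above can also live in the image itself through the Dockerfile HEALTHCHECK instruction. Here is a minimal sketch for an Alpine-based Node image, assuming the app listens on port 3000 and exposes a /health route (both are placeholders). It uses the BusyBox wget that ships with Alpine, so curl does not need to be installed.

```dockerfile
FROM node:22-alpine
WORKDIR /app
COPY --chown=node:node . .
USER node

# Port 3000 and /health are assumptions about the app.
# 127.0.0.1 is used instead of localhost to avoid resolving to IPv6
# when the app only listens on IPv4.
HEALTHCHECK --interval=10s --timeout=3s --start-period=5s --retries=5 \
  CMD wget -q --spider http://127.0.0.1:3000/health || exit 1

CMD ["node", "server.js"]
```

Compose reads the resulting container health status, so depends_on with condition: service_healthy works with an image-level HEALTHCHECK too.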

The AI problem

I use Claude Code for most of my development. It’s excellent at writing Dockerfiles that work. But “works” and “production-ready” are different things.

When I ask “dockerize this project”, I get:

FROM node:22-alpine
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
EXPOSE 3000
CMD ["npm", "start"]

It runs. It’s correct. It’s also missing every single practice from this post.

This isn’t a bug — it’s a default. The AI optimizes for “does it work?” not “is it hardened?”. And that’s fine, as long as you know what to ask for.
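
For contrast, here is a hedged sketch of the same Dockerfile with the image-level practices from this post applied: pinned base, non-root user, and tini. It assumes a package-lock.json exists (npm ci requires one), that the app needs no build step, and that server.js is the entry point. None of those are guaranteed for your project.

```dockerfile
# Pinned base image (tag from section 2)
FROM node:22.22.0-alpine3.23

# tini becomes PID 1 so SIGTERM is forwarded and zombies are reaped
RUN apk add --no-cache tini

WORKDIR /app

# Install dependencies first so this layer stays cached until the manifests change.
# npm ci installs exactly what package-lock.json specifies; --omit=dev skips devDependencies.
COPY --chown=node:node package*.json ./
RUN npm ci --omit=dev

# Copy the application code, owned by the unprivileged user
COPY --chown=node:node . .

# Drop root before the process starts
USER node

EXPOSE 3000

# Assumption: server.js is the entry point. Running node directly
# keeps npm out of the signal path.
ENTRYPOINT ["/sbin/tini", "--"]
CMD ["node", "server.js"]
```

The runtime practices (read-only filesystem, dropped capabilities, resource limits, network isolation) still belong in docker-compose.yml, as shown in the earlier sections.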


What to do about it

  1. Bookmark this post — or save it somewhere your AI assistant can read it
  2. Next time you dockerize a project, paste the link and tell your AI: “Read this and apply these practices to my Dockerfile”
  3. Audit your existing Dockerfiles — most of them probably run as root right now

The AI is a great tool. But it needs context about your standards. It won’t apply security best practices unless you tell it to — or unless your CLAUDE.md (or equivalent config file) includes them.

Pro tip: add a line to your project’s AI config file: “All Dockerfiles must follow the practices in [this post]. Non-root user, pinned images, read-only where possible, cap_drop ALL, tini, health checks.”

Now your AI applies them by default, every time.


These practices come from auditing 10 Dockerfiles across 6 projects over 2 years. Some I learned from production incidents, others from security audits run by AI agents. The common thread: none of them were suggested by default.