Three days, three CI failures, all rooted in the same place: the Dockerfile our build runs from rebuilds the world from scratch every single CI run, and every external source it touches is somebody else’s reliability problem. Here’s what that looks like in practice and how we replaced it with a pre-built image hosted on Docker Hub.
The Dockerfile that bites you
Ours is the standard Laravel Sail base image — Ubuntu 22.04, PHP 8.2 + 26 extensions from Ondrej Surý’s PPA, Node, npm, pnpm, bun, Puppeteer with bundled Chromium, PostgreSQL client, MySQL client, Yarn. About 50 packages, ~2.3 GB built. It’s one giant RUN apt-get update && … chain that pulls from six different external sources:
- Ubuntu archive (archive.ubuntu.com)
- Ondrej’s PHP PPA (ppa.launchpadcontent.net)
- NodeSource (deb.nodesource.com)
- Yarn deb repo (dl.yarnpkg.com)
- PostgreSQL APT (apt.postgresql.org)
- Composer installer (getcomposer.org)
If any of those is unreachable for 30 seconds, the whole build dies. And our CI didn’t cache the resulting image — it ran sail build --no-cache laravel.test on every job. So every push to a branch was 4-5 minutes of “please all six of you be up at the same time, please.”
What actually broke
Two days ago, Ondrej’s PPA went unreachable for several hours. The error in the pipeline was nice and clear:
```
Err:5 https://ppa.launchpadcontent.net/ondrej/php/ubuntu jammy InRelease
  Could not connect to ppa.launchpadcontent.net:443 (185.125.190.80), connection timed out
E: Unable to locate package php8.2-cli
```
The second line looks like a different error but it’s a consequence of the first — without an InRelease index, apt has no record of php8.2-cli existing, so the install fails with a generic “unable to locate.” The diagnostic clue is the order: connection timeout first, package-not-found second. ⏳
I did the small fix first: wrapped the offending apt-get update in a shell retry loop:
```
&& (for i in 1 2 3 4 5; do apt-get update -o Acquire::Retries=5 && break || sleep 30; done) \
```
That helps when the PPA is flapping (down for seconds-to-minutes). It does nothing when the PPA is down for hours. Which is what happened the next day. So the retry wasn’t enough.
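The same idea generalizes beyond apt-get. Here is a sketch of a reusable retry wrapper in POSIX shell — the `retry` and `flaky` names are mine, and the demo simulates a source that fails twice before succeeding:

```shell
#!/bin/sh
# Generic retry wrapper: retry <max_attempts> <sleep_seconds> <command...>
# Returns 0 as soon as the command succeeds, 1 after max_attempts failures.
retry() {
  max=$1; delay=$2; shift 2
  i=1
  while ! "$@"; do
    [ "$i" -ge "$max" ] && return 1
    echo "attempt $i failed; retrying in ${delay}s" >&2
    i=$((i + 1))
    sleep "$delay"
  done
}

# Demo: a command that only succeeds on its third invocation.
rm -f /tmp/retry-count
flaky() {
  n=$(cat /tmp/retry-count 2>/dev/null || echo 0)
  n=$((n + 1)); echo "$n" > /tmp/retry-count
  [ "$n" -ge 3 ]
}
retry 5 0 flaky && echo "succeeded after $(cat /tmp/retry-count) attempts"
```

Note what this can and cannot do: it papers over an outage shorter than max_attempts × sleep_seconds, and nothing longer.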
The real fix: build once, push to Docker Hub, pull from CI
The mental shift here is small but important. The Dockerfile is your recipe. You don’t need CI to bake the cake fresh on every job; you can bake it once and pass slices around. The recipe stays in the repo unchanged. CI just learns to fetch a slice instead of starting from flour.
The setup is shorter than you’d think. Three steps.
1. Build locally and tag for Docker Hub
From your repo root, with your Docker Hub username (mine is pringadi, hosted at hub.docker.com/r/pringadi/sail-php-8.2):
```shell
docker build --platform linux/amd64 \
  --build-arg WWWGROUP=0 \
  -t pringadi/sail-php-8.2:v1 \
  -t pringadi/sail-php-8.2:latest \
  docker/8.2/
```
Three details worth pointing out:
The --platform linux/amd64 flag. If you’re on Apple Silicon, your Mac defaults to arm64, and Puppeteer 17’s bundled Chromium binary doesn’t exist for arm64. Force amd64 to match what your CI runner uses. The build will be slower under emulation, but the image you push is what production sees.
The --build-arg WWWGROUP=0. Sail’s Dockerfile uses $WWWGROUP to groupadd the container’s sail user. If you don’t pass it, the variable expands to empty, the command collapses to groupadd -g sail, and groupadd errors with “invalid group ID ‘sail’.” Match what CI uses (often 0, the root group, in containerized runners).
Two tags in one command. Both names point at the same image bytes. :v1 is your immutable version pin (CI references this — it never moves). :latest is mutable convenience for humans typing docker pull ad hoc. Never reference :latest from CI — it makes builds non-reproducible.
2. Push to Docker Hub
```shell
docker push pringadi/sail-php-8.2:v1
docker push pringadi/sail-php-8.2:latest
```
The big push is the :v1 one (~2.3 GB on first publish; subsequent layer-only changes are much smaller). The :latest push is near-instant — Docker Hub recognizes the layers are already uploaded and just attaches the second tag.
Verify it’s actually accessible to anyone:
```shell
docker logout
docker pull pringadi/sail-php-8.2:v1
```
The logout step is the test. After it, you’re an anonymous client; if the pull succeeds, the image is genuinely public. (Free Docker Hub accounts get unlimited public repos and one private repo. Public is the right call for this use case — the image contains no secrets, just a stock PHP/Node/Postgres install.)
3. Tell CI to pull instead of build
The cleanest pattern: a tiny override compose file that gets layered on top of the main one. Local dev keeps building from the Dockerfile (so contributors don’t need to know any of this); CI gets the registry path.
New file docker-compose.ci.yml:
```yaml
services:
  laravel.test:
    image: pringadi/sail-php-8.2:v1
    pull_policy: always
```
And in .gitlab-ci.yml, two changes:
```yaml
variables:
  COMPOSE_FILE: "docker-compose.yml:docker-compose.ci.yml"

test:
  before_script:
    # ... unchanged ...
    - ./vendor/bin/sail pull laravel.test   # was: sail build --no-cache laravel.test
    - ./vendor/bin/sail up --no-build -d    # --no-build belt-and-suspenders
```
COMPOSE_FILE is a docker-compose env var that tells it to load multiple files in order; later files override keys from earlier ones. Setting it once at the variables level means every sail command in the job picks up the override consistently — sail pull, sail up, sail down, sail exec, all of them.
The --no-build flag on up is paranoia: it ensures docker-compose can’t accidentally re-trigger a build even if the merged config still contains a build: block (which it does, inherited from the base compose file).
Gotchas worth flagging
The override doesn’t actually delete build:. When you merge two compose files, keys are added, not replaced. So the laravel.test service in the merged config has both an image: and a build:. Compose v2 has a !reset null directive to nullify a key, but compose v1 doesn’t, and many CI environments still ship with docker-compose (the v1 binary). So instead of fighting it, we use docker compose pull to fetch the registry image first, then up --no-build. Both v1 and v2 honor this.
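For the record, if your runners are guaranteed to have Compose v2, the override could null out the inherited key directly — a sketch, not what we shipped:

```yaml
# Compose v2 only: the !reset YAML tag removes the build: key inherited
# from the base compose file, so the merged service has image: alone.
# The v1 docker-compose binary does not understand this tag.
services:
  laravel.test:
    image: pringadi/sail-php-8.2:v1
    pull_policy: always
    build: !reset null
```

We chose the pull-then---no-build route instead precisely because it works on both binaries.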
The Dockerfile is still in the repo. This isn’t a fork-of-fork situation — your Dockerfile remains the source of truth. The pre-built image is its output, snapshotted to Docker Hub. When the Dockerfile changes (you add an extension, bump a Node version), you rebuild + push + bump the version tag in the override:
```shell
docker build --platform linux/amd64 --build-arg WWWGROUP=0 \
  -t pringadi/sail-php-8.2:v2 \
  -t pringadi/sail-php-8.2:latest \
  docker/8.2/
docker push pringadi/sail-php-8.2:v2
docker push pringadi/sail-php-8.2:latest
```
Then change image: pringadi/sail-php-8.2:v1 to :v2 in the override file and commit. The discipline is: image and Dockerfile have to be kept in sync, but you control when that sync happens, not CI.
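The tag bump itself is mechanical enough to script. A hypothetical helper (the `bump_tag` name and the scratch-file demo are mine, not part of our actual setup):

```shell
#!/bin/sh
# Hypothetical helper: bump the pinned image tag in a compose override file.
# Usage: bump_tag <override-file> <new-tag>
bump_tag() {
  sed -i -E "s|(image: pringadi/sail-php-8\.2:)v[0-9]+|\1$2|" "$1"
}

# Demo against a scratch copy of the override file:
printf 'services:\n  laravel.test:\n    image: pringadi/sail-php-8.2:v1\n' > /tmp/override.yml
bump_tag /tmp/override.yml v2
grep 'image:' /tmp/override.yml   # shows the bumped tag
```

Whether this earns its keep over editing one line by hand is debatable; the point is that the sync step is a one-line diff either way.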
Reverting is one PR. Set COMPOSE_FILE back to docker-compose.yml (or remove the line), put sail build --no-cache laravel.test back. The override file can stay or go. No hidden state.
What it bought us
| | Before | After |
|---|---|---|
| Pull/build base image step | sail build --no-cache, 4-5 min (or fail when Ondrej is down) | sail pull, 30-90 sec on first runner pull, near-instant when cached |
| External sources in CI build path | Six (Ubuntu archive, Ondrej PPA, NodeSource, Yarn, PostgreSQL APT, Composer installer) | One (Docker Hub) |
| Time saved per CI run | — | ~3-4 minutes |
Three to four minutes per build is significant in absolute terms — over a few hundred runs a quarter, that’s hours of developer wait time you give back to the team. But the bigger win is the tail: when one of those six third-party sources goes down for an afternoon, your CI keeps working because it never touches them anymore. The blast radius of upstream flakiness shrinks to a single dependency you’ve explicitly accepted (Docker Hub), which is itself one of the most reliable pieces of internet plumbing in existence.
The one piece of operational discipline you take on
You become responsible for rebuilding the image when the Dockerfile changes. Forget to, and CI keeps using the stale image — “but I added that extension yesterday, why isn’t the test seeing it?” confusion.
Two mitigations help. First, document the rebuild command somewhere obvious — I put it in a comment block at the top of the override file:
```
# When the Dockerfile changes, rebuild + push:
#   docker build --platform linux/amd64 --build-arg WWWGROUP=0 \
#     -t pringadi/sail-php-8.2:vN -t pringadi/sail-php-8.2:latest docker/8.2/
#   docker push pringadi/sail-php-8.2:vN
#   docker push pringadi/sail-php-8.2:latest
# Then bump the tag below from vN-1 to vN and commit.
```
Second, if you have time later, add a CI job that detects Dockerfile changes and triggers a rebuild automatically — fail loudly if the image and Dockerfile drift. We haven’t done that yet.
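If we do get to it, the shape of that job in GitLab CI would be roughly this — a sketch, assuming a runner with docker available and DOCKERHUB_USER / DOCKERHUB_TOKEN set as CI variables (the job name and tag scheme are hypothetical):

```yaml
# Hypothetical: rebuild + push the base image whenever the Dockerfile changes.
rebuild-base-image:
  stage: .pre
  rules:
    - changes:
        - docker/8.2/Dockerfile
  script:
    - echo "$DOCKERHUB_TOKEN" | docker login -u "$DOCKERHUB_USER" --password-stdin
    - docker build --platform linux/amd64 --build-arg WWWGROUP=0 -t pringadi/sail-php-8.2:ci-$CI_COMMIT_SHORT_SHA docker/8.2/
    - docker push pringadi/sail-php-8.2:ci-$CI_COMMIT_SHORT_SHA
    # A human still promotes this to the next vN tag in docker-compose.ci.yml;
    # the job's value is that a Dockerfile change can never silently drift.
```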
Lessons
- Anything your build pulls from the internet is a reliability dependency. Count them. Six was too many. One you trust is enough.
- Retry loops absorb minutes, not hours. Useful for the 90% case (transient blips); useless for the 10% (sustained outages). Don’t let them lull you into thinking you’ve solved the problem.
- Caching your image solves more than speed. The reliability win is bigger than the speed win. You stop being a victim of someone else’s bad afternoon.
- Public Docker Hub is fine for build environments. Stock OS + language toolchain has no secrets. The application code that depends on private credentials lives elsewhere (Composer install against your private Git, env vars at runtime). Don’t hide image contents that already aren’t sensitive.
- Local dev shouldn’t have to know any of this. Override files exist precisely so contributors keep using the simple sail up -d from the main compose file. CI complexity belongs in CI configuration, not in everyone’s daily workflow.
Three days of CI failures, one afternoon of work, and the dependency surface area shrinks from six external services to one. That’s a trade I’d take every time. 🐳