Claude Code as Homelab Assistant - Part 2: Finding and Fixing a Bug
Introduction
In Part 1, we set up Claude Code as a homelab assistant - SSH access,
slash commands, first security audit. At the end came the question: what happens if
the /health-check actually finds something?
The answer came faster than expected. On the first real health check, a problem was immediately visible: a container in unhealthy status - for 9 hours, unnoticed.
What followed demonstrates better than any example why this workflow works: Claude Code didn’t just report the status, but found the cause, worked out the fix, and delivered the solution PR-ready. All in one pass, without me typing a single Docker command myself.
Main Section
The Context: timetracker-plus
timetracker-plus is a project I’m currently developing - a React application
for time tracking. The goal is simple: record work hours once and then
distribute them to different formats - for my own records, for the company in
two different formats, and if needed also for the client in their respective
format.
The stack is a Docker-hosted React app with database, deployed via GitHub
Actions with a locally hosted runner directly on the homelab server. In other words:
a git push to main triggers the deploy, the runner builds the image and restarts
the container.
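A deploy pipeline of this shape can be sketched as a minimal workflow file. The file name, job layout, and compose commands below are assumptions for illustration, not the project's actual workflow:

```yaml
# .github/workflows/deploy.yml (hypothetical sketch)
name: deploy
on:
  push:
    branches: [main]
jobs:
  deploy:
    runs-on: self-hosted   # the runner on the homelab server
    steps:
      - uses: actions/checkout@v4
      - name: Build image and restart container
        run: |
          docker compose -f docker-compose.prod.yml build
          docker compose -f docker-compose.prod.yml up -d
```

With `runs-on: self-hosted`, the job executes directly on the server, so the compose commands act on the local Docker daemon without any extra deployment step.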
This runs automatically - and that’s exactly what makes the bug interesting. CI/CD had successfully deployed the
container, GitHub Actions showed green, everything seemed fine.
What nobody had noticed: the container had been unhealthy since the last deploy.
The Health Check Flags a Problem
/health-check

The report was clear:
[ERR] timetracker-plus-web-1 — unhealthy (since 9h)
[OK] All other 19 containers running stably
[WARN] Docker Build Cache (5.2 GB) + dangling images (4 GB) reclaimable
Nine hours. The last deploy had been the evening before - since then the container had technically been running, but the healthcheck had failed on every attempt: 1106 failed attempts, every 30 seconds.
Claude Code asked directly: “Should I look at the logs from timetracker-plus-web-1?”
The Diagnosis

The cause was found quickly - and classic:
exec: "curl": executable file not found in $PATH
FailingStreak: 1106 (since ~9 hours, every 30s)
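Figures like the failing streak and the exact error come from Docker's health state, which can be read directly from the daemon. A sketch of the kind of command involved (container name taken from the report above):

```shell
# Dump the container's health block: Status, FailingStreak,
# and the output of the last few probe attempts
docker inspect --format '{{json .State.Health}}' timetracker-plus-web-1
```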
The healthcheck in the compose file called curl to check the /api/health endpoint. The problem: the Next.js image is a lean Node-Alpine image - and
curl isn’t installed there.
The container itself ran perfectly. The logs showed Ready in 177ms, all
requests were handled normally. Only the healthcheck failed because it called a
tool that simply wasn’t present in the image.
Claude Code summarized it aptly:
Note: The container works perfectly - only the healthcheck is misconfigured. Fix: switch healthcheck to wget or a native Node check.
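The second suggestion, a native Node check, would avoid depending on any extra binary in the Alpine image. A sketch of what that could look like in docker-compose.prod.yml - the inline `node -e` one-liner is an assumption for illustration, not what was actually committed:

```yaml
healthcheck:
  # No curl/wget needed: use the node binary that is in the image anyway
  test: ["CMD-SHELL", "node -e \"require('http').get('http://localhost:3000/api/health', r => process.exit(r.statusCode === 200 ? 0 : 1)).on('error', () => process.exit(1))\""]
  interval: 30s
  timeout: 10s
  retries: 3
```

The trade-off: the wget variant is shorter and easier to read, while the Node variant survives even a future switch to an image without wget.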
Working Out the Fix

Claude Code then looked at the Git repo - the Actions runner workdir was visible on the server, branch main. The fix belongs in
docker-compose.prod.yml:
# before
healthcheck:
  test: ["CMD", "curl", "-f", "http://localhost:3000/api/health"]

# after (wget)
healthcheck:
  test: ["CMD-SHELL", "wget -qO- http://localhost:3000/api/health || exit 1"]
  interval: 30s
  timeout: 10s
  retries: 3
  start_period: 30s
Before the fix was finalized, one good question remained: does the healthcheck really run inside the container? Port 3000 is the internal port - from the outside, the service is reachable on a different port.
Claude Code confirmed: healthchecks run in the container namespace, port 3000 is
correct. The logs additionally confirmed this - next start listens internally on 3000,
externally a different port is mapped.
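The distinction can be illustrated with a compose port mapping. The actual external port isn't named in the post, so 8080 below is a placeholder:

```yaml
services:
  web:
    ports:
      - "8080:3000"   # host port (placeholder) -> container port 3000
    healthcheck:
      # Runs inside the container's network namespace, so the
      # internal port 3000 is the right target - not the host port
      test: ["CMD-SHELL", "wget -qO- http://localhost:3000/api/health || exit 1"]
```

This is exactly why the fix can hardcode port 3000: the healthcheck never crosses the port mapping.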

Then a quick check to see if wget is even present in the image:
docker exec timetracker-plus-web-1 which wget 2>&1
docker exec timetracker-plus-web-1 which curl 2>&1
Result: /usr/bin/wget - wget is there, curl is not. The fix with wget is
the right approach, PR-ready.
The Fix Lands in the Repo
docker-compose.prod.yml adjusted, committed, pushed to main. The GitHub
Actions runner automatically triggered the deploy - built new image, restarted container.

At the next /health-check the world was back in order:

timetracker-plus-web-1 — now healthy, restarted 7 minutes ago.
All 20 containers running stably.
Conclusion
What impressed me most about this process: the bug wasn’t new. It had been created during the last deploy and had existed unnoticed for 9 hours - GitHub Actions showed green, the container was running, everything seemed normal. Without the regular health check, I might not have noticed it until someone explicitly checked.
That’s the real value of this setup. Not that Claude Code fixes a bug that I could have found myself - but that it gets found at all, before it becomes a problem.
The workflow from discovery to fix took no more than 15 minutes. I didn’t type a single Docker command myself, didn’t manually search through log files, didn’t look up compose file syntax. Claude Code handled all of that - I confirmed the steps and made the push.
In Part 3 we set up a new service - from an empty folder to a working HTTPS URL, with Traefik integration and automatic Playwright test at the end.