Claude Code as Homelab Assistant - Part 2: Finding and Fixing a Bug

Introduction

In Part 1, we set up Claude Code as a homelab assistant - SSH access, slash commands, first security audit. At the end came the question: what happens if the /health-check actually finds something?

The answer came faster than expected. The very first real health check surfaced a problem immediately: a container in unhealthy status - unnoticed for nine hours.

What followed demonstrates better than any example why this workflow works: Claude Code didn’t just report the status, but found the cause, worked out the fix, and delivered the solution PR-ready. All in one pass, without me typing a single Docker command myself.


Main Section

The Context: timetracker-plus

timetracker-plus is a project I’m currently developing - a React application for time tracking. The goal is simple: record work hours once and then export them in different formats - for my own records, for the company in two different formats, and if needed for the client in their format as well.

The stack: a Docker-hosted React app with a database, deployed via GitHub Actions using a self-hosted runner directly on the homelab server. In other words: a git push to main triggers the deploy, the runner builds the image and restarts the container.

This runs automatically - and that’s exactly what makes the bug interesting. CI/CD had successfully deployed the container, GitHub Actions showed green, everything seemed fine. What nobody had noticed: the container had been unhealthy since the last deploy.
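This gap - CI green while the container is unhealthy - could be closed with a post-deploy gate in the workflow. A minimal sketch, not part of the actual pipeline; the container name and the two-minute timeout are assumptions:

```yaml
# Hypothetical post-deploy step: fail the workflow if the container
# does not report healthy within ~2 minutes.
- name: Wait for container health
  run: |
    for i in $(seq 1 24); do
      status=$(docker inspect --format '{{.State.Health.Status}}' timetracker-plus-web-1)
      [ "$status" = "healthy" ] && exit 0
      sleep 5
    done
    echo "Container still $status after 120s" >&2
    exit 1
```

With a step like this, the very deploy that introduced the broken healthcheck would have turned the Actions run red instead of green.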

The Health Check Triggers

/health-check

[Image: Healthcheck-output]

The report was clear:

[ERR]  timetracker-plus-web-1 — unhealthy (since 9h)
[OK]   All other 19 containers running stably
[WARN] Docker Build Cache (5.2 GB) + dangling images (4 GB) reclaimable

Nine hours. The last deploy was the evening before - since then the container was technically running, but the healthcheck had failed on every attempt: 1106 failed checks, one every 30 seconds - just over nine hours’ worth.
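The WARN line is a side note, but the reclaimable space is easy to free. Illustrative commands only - they are destructive (they delete build cache and dangling images), so check the usage breakdown first:

```shell
docker system df      # show how much space images, build cache, and volumes use
docker builder prune  # remove unused build cache (asks for confirmation)
docker image prune    # remove dangling images (asks for confirmation)
```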

Claude Code asked directly: “Should I look at the logs from timetracker-plus-web-1?”

The Diagnosis

[Image: Healthcheck-log]

The cause was found quickly - and it was a classic:

exec: "curl": executable file not found in $PATH
FailingStreak: 1106 (since ~9 hours, every 30s)

The healthcheck in the compose file called curl to probe the /api/health endpoint. The problem: the Next.js image is a lean Node Alpine image - and curl isn’t installed in it.

The container itself ran perfectly. The logs showed Ready in 177ms, all requests were handled normally. Only the healthcheck failed because it called a tool that simply wasn’t present in the image.
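Details like the failing streak come straight from Docker’s own bookkeeping, so this diagnosis can be reproduced by hand (container name taken from the report above):

```shell
# Docker records the result of every healthcheck probe per container;
# .State.Health holds the status, the failing streak, and the last probe logs
docker inspect --format '{{json .State.Health}}' timetracker-plus-web-1
```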

Claude Code summarized it aptly:

Note: The container works perfectly - only the healthcheck is misconfigured. Fix: switch healthcheck to wget or a native Node check.
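The wget variant is what ended up in the repo; the native Node check could look like this - a sketch that assumes Node 18+ in the image, where fetch is available globally:

```yaml
healthcheck:
  test: ["CMD-SHELL", "node -e \"fetch('http://localhost:3000/api/health').then(r => process.exit(r.ok ? 0 : 1)).catch(() => process.exit(1))\""]
  interval: 30s
  timeout: 10s
  retries: 3
```

The advantage of this variant: no dependency on curl or wget being present in the image - the check only needs the Node runtime the app already ships with.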

Working Out the Fix

[Image: Healthcheck-log-fix-long]

Claude Code then looked at the Git repo - the Actions runner’s workdir was visible on the server, on branch main. The fix belongs in docker-compose.prod.yml:

# before
healthcheck:
  test: ["CMD", "curl", "-f", "http://localhost:3000/api/health"]

# after (wget)
healthcheck:
  test: ["CMD-SHELL", "wget -qO- http://localhost:3000/api/health || exit 1"]
  interval: 30s
  timeout: 10s
  retries: 3
  start_period: 30s

Before finalizing the fix, I raised an important question: does the healthcheck really run inside the container? Port 3000 is the internal port - from outside, the service is reachable on a different port.

Claude Code confirmed it: healthchecks run in the container’s network namespace, so port 3000 is correct. The logs backed this up - next start listens internally on 3000, while a different port is mapped externally.
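This can also be verified without reading logs (a sketch; the mapped external port will differ per setup):

```shell
# Show the internal-to-external port mapping; the healthcheck runs in the
# container's network namespace and therefore targets the internal port 3000
docker port timetracker-plus-web-1
```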

[Image: Healthcheck-log-fix]

Then a quick check to see if wget is even present in the image:

docker exec timetracker-plus-web-1 which wget 2>&1
docker exec timetracker-plus-web-1 which curl 2>&1

Result: /usr/bin/wget - wget is there, curl is not. The fix with wget is the right approach, PR-ready.

The Fix Lands in the Repo

docker-compose.prod.yml adjusted, committed, pushed to main. The GitHub Actions runner automatically triggered the deploy: new image built, container restarted.

[Image: Healthcheck-github-action-run]

At the next /health-check the world was back in order:

[Image: Healthcheck-fixed]

timetracker-plus-web-1 — now healthy, restarted 7 minutes ago.
All 20 containers running stably.

Conclusion

What impressed me most about this process: the bug wasn’t new. It had been introduced during the last deploy and had existed unnoticed for nine hours - GitHub Actions showed green, the container was running, everything seemed normal. Without the regular health check, I might not have noticed it until someone explicitly checked.

That’s the real value of this setup. Not that Claude Code fixes a bug that I could have found myself - but that it gets found at all, before it becomes a problem.

The workflow from discovery to fix took no more than 15 minutes. I didn’t type a single Docker command myself, didn’t manually search through log files, didn’t look up compose file syntax. Claude Code handled all of that - I confirmed the steps and made the push.

In Part 3 we set up a new service - from an empty folder to a working HTTPS URL, with Traefik integration and an automatic Playwright test at the end.

The repo: github.com/RoninRage/homelab-claude-public