update gcloud skill with Colony deploy workflow + all deployment gotchas

- add Colony to infra table (no longer TBD)
- add Colony deploy steps (pull, build, restart, verify)
- add Docker install on Debian pattern
- add troubleshooting: stub binary, SSH timeout under load, background builds,
  git push via IP, first build timing, --no-cache warning
- update CLAUDE.md infra table

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 19:59:05 +02:00
parent af183abc42
commit 8913cb71c8
2 changed files with 70 additions and 50 deletions

@@ -1,6 +1,6 @@
# gcloud Skill
Common GCP patterns for the apes platform. All commands invoke gcloud/kubectl/docker directly via Bash.
Common GCP patterns for the apes platform. All commands invoke gcloud/docker directly via Bash.
**Project:** `apes-platform`
**Region:** `europe-west1`
@@ -8,19 +8,36 @@ Common GCP patterns for the apes platform. All commands invoke gcloud/kubectl/do
## Current Infrastructure
| Service | Host | VM | IP |
|---------|------|----|----|
| Gitea | git.unslope.com | gitea-vm | 34.78.255.104 |
| Chat (planned) | apes.unslope.com | TBD | TBD |
| Service | Host | VM | IP | Compose dir |
|---------|------|----|----|-------------|
| Gitea | git.unslope.com | gitea-vm | 34.78.255.104 | /opt/gitea |
| Colony | apes.unslope.com | colony-vm (e2-medium) | 35.241.200.77 | /opt/colony-src/infra/colony |
## SSH into VMs
```bash
# Gitea VM
gcloud compute ssh gitea-vm --zone=europe-west1-b --project=apes-platform
gcloud compute ssh <vm> --zone=europe-west1-b --project=apes-platform
gcloud compute ssh <vm> --zone=europe-west1-b --project=apes-platform --command="<cmd>"
```
# Run a command remotely
gcloud compute ssh gitea-vm --zone=europe-west1-b --project=apes-platform --command="sudo docker ps"
## Colony Deploy
```bash
# 1. SSH in and pull latest code
gcloud compute ssh colony-vm --zone=europe-west1-b --project=apes-platform \
--command='sudo bash -c "cd /opt/colony-src && git pull"'
# 2. Rebuild (dependencies are cached; only changed source recompiles, ~30s)
gcloud compute ssh colony-vm --zone=europe-west1-b --project=apes-platform \
--command='sudo bash -c "cd /opt/colony-src/infra/colony && docker compose build 2>&1 | tail -5"'
# 3. Restart
gcloud compute ssh colony-vm --zone=europe-west1-b --project=apes-platform \
--command='sudo bash -c "cd /opt/colony-src/infra/colony && docker compose up -d"'
# 4. Verify
gcloud compute ssh colony-vm --zone=europe-west1-b --project=apes-platform \
--command='sudo bash -c "sleep 3 && docker logs colony 2>&1 | tail -5 && curl -s http://localhost:3001/api/health"'
```
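
The four steps above can also be chained so that a failed step stops the deploy. This is a minimal sketch, assuming the VM name, zone, project and paths listed in the infrastructure table. `run_remote`, `deploy_colony` and `DRY_RUN` are hypothetical names introduced here for illustration, and the remote `set -o pipefail` is an addition that is not in the steps above.

```bash
#!/usr/bin/env bash
# Sketch: chain the Colony deploy steps and stop at the first failure.
# run_remote, deploy_colony and DRY_RUN are illustrative names (assumptions).
set -eu

VM="colony-vm"
ZONE="europe-west1-b"
PROJECT="apes-platform"
SRC_DIR="/opt/colony-src"
COMPOSE_DIR="/opt/colony-src/infra/colony"

# Run one command on the VM as root. With DRY_RUN=1, print it instead.
# The command must not contain single quotes (it is wrapped in them).
run_remote() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "[dry-run] $VM: $1"
    return 0
  fi
  gcloud compute ssh "$VM" --zone="$ZONE" --project="$PROJECT" \
    --command="sudo bash -c '$1'"
}

deploy_colony() {
  # 1. Pull latest code
  run_remote "cd $SRC_DIR && git pull"
  # 2. Rebuild; pipefail (an addition) keeps tail from hiding a failed build
  run_remote "set -o pipefail; cd $COMPOSE_DIR && docker compose build 2>&1 | tail -5"
  # 3. Restart
  run_remote "cd $COMPOSE_DIR && docker compose up -d"
  # 4. Verify
  run_remote "sleep 3 && docker logs colony 2>&1 | tail -5 && curl -s http://localhost:3001/api/health"
}
```

Running `DRY_RUN=1 deploy_colony` prints the four remote commands without contacting the VM; `deploy_colony` on its own runs them in order.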
## Docker Compose on VMs
@@ -28,37 +45,50 @@ gcloud compute ssh gitea-vm --zone=europe-west1-b --project=apes-platform --comm
```bash
# Restart a service
gcloud compute ssh <vm> --zone=europe-west1-b --project=apes-platform \
--command="sudo bash -c 'cd /opt/<service> && docker compose restart <container>'"
--command="sudo bash -c 'cd <compose-dir> && docker compose restart <container>'"
# View logs
# View logs (use 2>&1 for stderr)
gcloud compute ssh <vm> --zone=europe-west1-b --project=apes-platform \
--command="sudo docker logs <container> --tail 50"
--command='sudo docker logs <container> 2>&1 | tail -50'
# Full redeploy
gcloud compute ssh <vm> --zone=europe-west1-b --project=apes-platform \
--command="sudo bash -c 'cd /opt/<service> && docker compose pull && docker compose up -d'"
--command="sudo bash -c 'cd <compose-dir> && docker compose pull && docker compose up -d'"
```
## Install Docker on Debian 12
```bash
gcloud compute ssh <vm> --zone=europe-west1-b --project=apes-platform --command='sudo bash -c "
apt-get update && apt-get install -y ca-certificates curl gnupg git
install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/debian/gpg | gpg --dearmor -o /etc/apt/keyrings/docker.gpg
chmod a+r /etc/apt/keyrings/docker.gpg
echo \"deb [arch=amd64 signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/debian bookworm stable\" > /etc/apt/sources.list.d/docker.list
apt-get update && apt-get install -y docker-ce docker-ce-cli containerd.io docker-compose-plugin
systemctl enable docker && systemctl start docker
"'
```
## Static IPs & DNS
```bash
# Reserve a new static IP
# Reserve IP
gcloud compute addresses create <name> --region=europe-west1 --project=apes-platform
# Get IP value
# Get IP
gcloud compute addresses describe <name> --region=europe-west1 --project=apes-platform --format='value(address)'
# DNS: add A record at Namecheap (Advanced DNS tab) pointing subdomain to IP
# DNS: Namecheap Advanced DNS → Add A Record → host=<subdomain> value=<IP>
# Verify: dig @dns1.registrar-servers.com <domain> A +short
```
## Firewall Rules
```bash
# List rules
gcloud compute firewall-rules list --project=apes-platform
# Open a port
gcloud compute firewall-rules create <name> --allow=tcp:<port> --target-tags=web-server --project=apes-platform
gcloud compute firewall-rules delete <name> --project=apes-platform --quiet
```
## New VM Pattern
@@ -72,46 +102,36 @@ gcloud compute instances create <name> \
--image-project=debian-cloud \
--boot-disk-size=20GB \
--tags=web-server \
--address=<static-ip-name> \
--metadata-from-file=startup-script=<script-path>
--address=<static-ip-name>
```
## Gitea API (git.unslope.com)
## Gitea (git.unslope.com)
```bash
# Auth: basic auth or token
# Create API token:
# Git push/pull: HTTP on port 3000 (no gcloud needed)
git clone http://git.unslope.com:3000/benji/apes.git
git remote set-url origin http://<user>:<token>@34.78.255.104:3000/benji/apes.git
# Create API token
curl -u user:pass -X POST 'https://git.unslope.com/api/v1/users/<user>/tokens' \
-H 'Content-Type: application/json' -d '{"name":"my-token","scopes":["all"]}'
# Create repo
curl -u user:token -X POST 'https://git.unslope.com/api/v1/user/repos' \
-H 'Content-Type: application/json' -d '{"name":"repo-name"}'
# Add collaborator
curl -u user:token -X PUT 'https://git.unslope.com/api/v1/repos/owner/repo/collaborators/username' \
-H 'Content-Type: application/json' -d '{"permission":"write"}'
# Create user (admin only)
sudo docker exec -u git gitea gitea admin user create --username <user> --password '<pass>' --email '<email>'
# DNS not resolved? Use --resolve flag:
curl --resolve git.unslope.com:443:34.78.255.104 ...
```
## IAM
```bash
gcloud auth list
gcloud projects get-iam-policy apes-platform --format=json
# Create user (admin only, via SSH)
gcloud compute ssh gitea-vm --zone=europe-west1-b --project=apes-platform \
--command='sudo docker exec -u git gitea gitea admin user create --username <user> --password "<pass>" --email "<email>"'
```
## Troubleshooting
| Error | Fix |
|-------|-----|
| VM SSH timeout | Check firewall: `gcloud compute firewall-rules list --project=apes-platform` |
| Docker not running | SSH in, run `sudo systemctl start docker` |
| Caddy cert failed | Check DNS propagation: `dig @dns1.registrar-servers.com <domain> A +short` |
| Container not starting | Check logs: `sudo docker logs <container> --tail 50` |
| DNS not resolving | Flush local cache: `sudo dscacheutil -flushcache && sudo killall -HUP mDNSResponder` |
| SSH timeout | VM under load (e.g. compiling Rust). Wait and retry. Don't kill builds. |
| SSH connection refused | VM still booting. Wait 30s, retry. |
| Docker not running | `sudo systemctl start docker` |
| Caddy cert failed | Check DNS: `dig @dns1.registrar-servers.com <domain> A +short`. Clear Caddy data if stuck on staging cert: `sudo rm -rf <compose-dir>/caddy_data/caddy/certificates/acme-staging*` |
| Container exits immediately | Check binary size: under 1MB means it's the stub binary. Remove the image and rebuild: `docker rmi <image> && docker compose build` |
| Container restarting, no logs | Binary panicking before first print. Run interactively: `docker run --rm -e DATABASE_URL=... <image> /app/<binary>` |
| Git push fails (DNS) | Use IP directly: `http://34.78.255.104:3000/benji/apes.git` |
| Git push auth fails | Use API token, not password (special chars break URL encoding) |
| Build takes forever | First Rust build ~10min on e2-medium. Subsequent builds ~30s (deps cached). Don't use `--no-cache` unless Dockerfile changed. |
| Docker build SSH timeout | Build in background: `nohup docker compose build > /tmp/build.log 2>&1 &` then check later |
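
The binary-size check from the "Container exits immediately" row can be scripted. This is a minimal sketch, assuming the 1MB threshold from the table. `is_stub_binary` and `STUB_THRESHOLD_BYTES` are hypothetical names, and getting the binary out of the image is left to the caller.

```bash
# Sketch: flag a binary smaller than 1 MB as the probable stub build.
# is_stub_binary and STUB_THRESHOLD_BYTES are illustrative names (assumptions).
STUB_THRESHOLD_BYTES=1048576

# Returns 0 if the file is smaller than the threshold, 1 if it is not,
# and 2 if the file does not exist.
is_stub_binary() {
  [ -f "$1" ] || { echo "no such file: $1" >&2; return 2; }
  size=$(wc -c < "$1")
  [ "$size" -lt "$STUB_THRESHOLD_BYTES" ]
}
```

One possible use on the VM, assuming the container is named `colony` and the binary lives under `/app/` as in the table: copy it out with `sudo docker cp colony:/app/<binary> /tmp/colony-bin`, then run `is_stub_binary /tmp/colony-bin && echo "stub binary, rebuild"`.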