---
name: ax
description: Audit agent-facing docs, hooks, skills, and config for the apes platform against AX principles. Use when agent behavior is wrong due to missing/unclear docs, poor ergonomics, or misconfigured automation.
argument-hint: [problem description]
disable-model-invocation: true
---
# AX — Agent Experience Audit

Audit the apes project's Claude Code configuration — CLAUDE.md, hooks, skills, rules, permissions — against AX principles. For each finding, recommend the right mechanism to fix it.
## Arguments

`$ARGUMENTS` — description of the AX problem (e.g., "agents keep deploying to wrong project"). If empty, run a general audit.
## Workflow

### Phase 1: AUDIT — Discover and score

#### 1a. Establish ground truth

Derive canonical workflows from:

- `docker-compose.yml` files on VMs (SSH to check)
- Any `Makefile`, `package.json`, `pyproject.toml` in repo
- Deployment scripts, CI pipelines
- GCP project config (`apes-platform`)

Ground truth is authoritative. If docs and automation disagree, fix docs.
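Where docs and automation can drift, a mechanical check beats re-reading both. A minimal sketch, assuming a helper named `compare_compose`; the VM hostname and remote path in the usage comment are illustrative, not real apes infrastructure:

```shell
# Hedged sketch (principle #8, single source of truth): flag drift between
# the compose file in the repo and the copy actually deployed on a VM.
compare_compose() {
  repo_copy=$1
  deployed_copy=$2
  if diff -u "$repo_copy" "$deployed_copy"; then
    echo "in sync"
  else
    echo "DRIFT: update docs/automation to match the deployed state" >&2
    return 1
  fi
}

# Typical use: fetch the deployed copy over SSH, then compare.
#   ssh apes-vm 'cat /opt/apes/docker-compose.yml' > /tmp/deployed.yml
#   compare_compose docker-compose.yml /tmp/deployed.yml
```

A nonzero exit makes the check itself pipeable into CI or a hook.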
#### 1b. Inventory agent-facing surfaces

Discover all Claude Code configuration:

- **Documentation:** CLAUDE.md, .claude/rules/*.md, README.md
- **Automation:** .claude/settings.json, hooks
- **Skills:** .claude/skills/*/SKILL.md
- **Commands:** .claude/commands/*.md
- **Agents:** .claude/agents/*.md
- **Memory:** ~/.claude/projects/*/memory/MEMORY.md

If $ARGUMENTS is provided, focus on relevant surfaces.
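The inventory step above can be sketched as a small globbing helper (`list_surfaces` is a hypothetical name, not part of the skill; memory files live under the home directory and need a separate pass):

```shell
# Sketch: enumerate agent-facing surfaces under a project root.
# Paths mirror the inventory list above.
list_surfaces() {
  root=$1
  for f in "$root"/CLAUDE.md "$root"/README.md "$root"/.claude/settings.json \
           "$root"/.claude/rules/*.md "$root"/.claude/skills/*/SKILL.md \
           "$root"/.claude/commands/*.md "$root"/.claude/agents/*.md; do
    # unmatched globs stay literal in sh, so -e filters them out
    [ -e "$f" ] && printf '%s\n' "$f"
  done
  return 0
}
```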
#### 1c. Score against AX principles
| # | Principle | FAIL when... |
|---|---|---|
| 1 | Explicitness over convention | A non-standard workflow isn't called out explicitly |
| 2 | Fail fast with clear recovery | Errors lack concrete fix commands |
| 3 | Minimize context rot | CLAUDE.md adds tokens that don't earn their keep |
| 4 | Structured over unstructured | Important info buried in prose instead of tables/code blocks |
| 5 | Consistent patterns | Naming or formatting conventions shift across docs |
| 6 | Complete context at point of need | Critical commands missing where they're needed |
| 7 | Guard rails over documentation | Says "don't do X" but X would succeed — a hook or permission would be better |
| 8 | Single source of truth | Same info maintained in multiple places, or docs diverge from reality |
| 9 | Pipeable machine-readable output | CLI commands lack --json, errors go to stdout instead of stderr, exit codes are unpredictable |
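Principle #9 can be probed mechanically. A minimal sketch, assuming a helper named `probe_json`; `mycli` below is a stand-in for whatever CLI is under audit, not a real command:

```shell
# Probe for principle #9: with --json, a command must emit valid JSON on
# stdout, keep errors on stderr, and exit nonzero on failure.
probe_json() {
  if ! out=$("$@" --json 2>/dev/null); then
    echo "FAIL: nonzero exit" >&2
    return 1
  fi
  printf '%s' "$out" | python3 -m json.tool >/dev/null 2>&1 \
    || { echo "FAIL: stdout is not valid JSON" >&2; return 1; }
  echo "PASS"
}

# Stand-in command for illustration; a real audit calls the actual CLI.
mycli() { [ "$1" = "--json" ] && printf '{"ok": true}\n'; }
probe_json mycli
```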
Apes-specific checks:
- GCP project/region/zone correct everywhere?
- Docker Compose configs on VMs match what docs describe?
- DNS records match what's deployed?
- No SaaS dependencies crept in?
### Phase 2: PROPOSE — Select mechanism and draft fixes

For each WARN or FAIL, select the right Claude Code mechanism:
| If the finding is... | Use this mechanism |
|---|---|
| Block forbidden actions | PreToolUse hook |
| Dangerous command that should never run | Permission deny rule |
| Auto-format/lint/test after edits | PostToolUse hook |
| File-type-specific convention | .claude/rules/*.md with paths frontmatter |
| Repeatable workflow or reference | Skill |
| Complex task needing isolation | Subagent |
| Critical context surviving compaction | CLAUDE.md |
| Universal project convention | CLAUDE.md (keep <200 lines) |
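As an illustration of the "guard rails over documentation" row: a PreToolUse hook can hard-block a class of mistakes instead of documenting them. A sketch, assuming Claude Code's documented hook contract (tool-call JSON on stdin; exit code 2 blocks the call and surfaces stderr to the agent); the function name and project check are illustrative:

```shell
# Hypothetical PreToolUse guard: block gcloud commands that target any
# GCP project other than apes-platform.
guard_gcloud_project() {
  cmd=$(python3 -c 'import json,sys; print(json.load(sys.stdin).get("tool_input",{}).get("command",""))')
  case "$cmd" in
    *"--project apes-platform"*|*"--project=apes-platform"*) return 0 ;;
    *gcloud*--project*)
      echo "Blocked: wrong GCP project; use --project=apes-platform" >&2
      return 2 ;;
  esac
  return 0
}
```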
Each fix must include:
- Which principle it addresses
- The selected mechanism and why
- Exact implementation (file path + content)
### Phase 3: REPORT

Output the report in this format:

# AX Audit Report — apes

**Surfaces audited:** <count>

## Scorecard

| # | Principle | Rating | Detail |
|---|-----------|--------|--------|
| 1-9 | ... | PASS/WARN/FAIL | ... |

## Findings

| Surface | Issues | Recommended mechanism |
|---------|--------|----------------------|
| ... | ... | ... |

## Recommendations

For each:

- Principle addressed
- Mechanism type
- Exact implementation (file + content)
## Parallel Codex Review

On every AX audit invocation, immediately launch a background codex review before starting your own audit:

    codex exec -c 'reasoning_effort="high"' "AX audit: $ARGUMENTS. Read CLAUDE.md, .claude/ directory, and config files. Find: missing docs, unclear commands, split-brain config, stale references. File paths and exact fixes. Do NOT spawn sub-agents. Answer directly in bullet points." 2>&1

Run this via the Bash tool with `run_in_background: true`. Continue your own audit without waiting. When the codex output returns, integrate its findings into Phase 3 (REPORT):
- Codex findings that match yours → strengthen confidence
- Codex findings you missed → add to recommendations
- Disagreements → address explicitly in the report
The final report is yours — codex is a second pair of eyes, not an authority.
## Constraints
- This skill is read-only — it never modifies files, only reports
- Apes-specific: verify no SaaS dependencies in recommendations
- Verify GCP infra state via SSH before reporting on deployed services