---
name: ax
description: Audit agent-facing docs, hooks, skills, and config for the apes platform against AX principles. Use when agent behavior is wrong due to missing/unclear docs, poor ergonomics, or misconfigured automation.
argument-hint: [problem description]
disable-model-invocation: true
---
# AX — Agent Experience Audit

Audit the apes project's Claude Code configuration — CLAUDE.md, hooks, skills, rules, permissions — against AX principles. For each finding, recommend the right mechanism to fix it.
## Arguments

`$ARGUMENTS` — description of the AX problem (e.g., "agents keep deploying to wrong project"). If empty, run a general audit.
## Workflow

### Phase 1: AUDIT — Discover and score

#### 1a. Establish ground truth

Derive canonical workflows from:

- `docker-compose.yml` files on VMs (SSH to check)
- Any `Makefile`, `package.json`, `pyproject.toml` in repo
- Deployment scripts, CI pipelines
- GCP project config (`apes-platform`)

Ground truth is authoritative. If docs and automation disagree, fix docs.
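Where docs and automation can drift, a mechanical check beats re-reading both. A minimal sketch, assuming a helper named `compare_compose`; the VM hostname and remote path in the usage comment are illustrative, not real apes infrastructure:

```shell
# Hedged sketch (principle #8, single source of truth): flag drift between
# the compose file in the repo and the copy actually deployed on a VM.
compare_compose() {
  repo_copy=$1
  deployed_copy=$2
  if diff -u "$repo_copy" "$deployed_copy"; then
    echo "in sync"
  else
    echo "DRIFT: update docs/automation to match the deployed state" >&2
    return 1
  fi
}

# Typical use: fetch the deployed copy over SSH, then compare.
#   ssh apes-vm 'cat /opt/apes/docker-compose.yml' > /tmp/deployed.yml
#   compare_compose docker-compose.yml /tmp/deployed.yml
```

A nonzero exit makes the check itself pipeable into CI or a hook.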
#### 1b. Inventory agent-facing surfaces

Discover all Claude Code configuration:

- **Documentation:** CLAUDE.md, .claude/rules/*.md, README.md
- **Automation:** .claude/settings.json, hooks
- **Skills:** .claude/skills/*/SKILL.md
- **Commands:** .claude/commands/*.md
- **Agents:** .claude/agents/*.md
- **Memory:** ~/.claude/projects/*/memory/MEMORY.md

If $ARGUMENTS is provided, focus on relevant surfaces.
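The inventory step above can be sketched as a small globbing helper (`list_surfaces` is a hypothetical name, not part of the skill; memory files live under the home directory and need a separate pass):

```shell
# Sketch: enumerate agent-facing surfaces under a project root.
# Paths mirror the inventory list above.
list_surfaces() {
  root=$1
  for f in "$root"/CLAUDE.md "$root"/README.md "$root"/.claude/settings.json \
           "$root"/.claude/rules/*.md "$root"/.claude/skills/*/SKILL.md \
           "$root"/.claude/commands/*.md "$root"/.claude/agents/*.md; do
    # unmatched globs stay literal in sh, so -e filters them out
    [ -e "$f" ] && printf '%s\n' "$f"
  done
  return 0
}
```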
#### 1c. Score against AX principles
| # | Principle | FAIL when... |
|---|---|---|
| 1 | Explicitness over convention | A non-standard workflow isn't called out explicitly |
| 2 | Fail fast with clear recovery | Errors lack concrete fix commands |
| 3 | Minimize context rot | CLAUDE.md adds tokens that don't earn their keep |
| 4 | Structured over unstructured | Important info buried in prose instead of tables/code blocks |
| 5 | Consistent patterns | Naming or formatting conventions shift across docs |
| 6 | Complete context at point of need | Critical commands missing where they're needed |
| 7 | Guard rails over documentation | Says "don't do X" but X would succeed — a hook or permission would be better |
| 8 | Single source of truth | Same info maintained in multiple places, or docs diverge from reality |
| 9 | Pipeable machine-readable output | CLI commands lack --json, errors go to stdout instead of stderr, exit codes are unpredictable |
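Principle #9 can be probed mechanically. A minimal sketch, assuming a helper named `probe_json`; `mycli` below is a stand-in for whatever CLI is under audit, not a real command:

```shell
# Probe for principle #9: with --json, a command must emit valid JSON on
# stdout, keep errors on stderr, and exit nonzero on failure.
probe_json() {
  if ! out=$("$@" --json 2>/dev/null); then
    echo "FAIL: nonzero exit" >&2
    return 1
  fi
  printf '%s' "$out" | python3 -m json.tool >/dev/null 2>&1 \
    || { echo "FAIL: stdout is not valid JSON" >&2; return 1; }
  echo "PASS"
}

# Stand-in command for illustration; a real audit calls the actual CLI.
mycli() { [ "$1" = "--json" ] && printf '{"ok": true}\n'; }
probe_json mycli
```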
Apes-specific checks:
- GCP project/region/zone correct everywhere?
- Docker Compose configs on VMs match what docs describe?
- DNS records match what's deployed?
- No SaaS dependencies crept in?
### Phase 2: PROPOSE — Select mechanism and draft fixes

For each WARN or FAIL, select the right Claude Code mechanism:
| If the finding is... | Use this mechanism |
|---|---|
| Block forbidden actions | PreToolUse hook |
| Dangerous command that should never run | Permission deny rule |
| Auto-format/lint/test after edits | PostToolUse hook |
| File-type-specific convention | .claude/rules/*.md with paths frontmatter |
| Repeatable workflow or reference | Skill |
| Complex task needing isolation | Subagent |
| Critical context surviving compaction | CLAUDE.md |
| Universal project convention | CLAUDE.md (keep <200 lines) |
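As an illustration of the "guard rails over documentation" row: a PreToolUse hook can hard-block a class of mistakes instead of documenting them. A sketch, assuming Claude Code's documented hook contract (tool-call JSON on stdin; exit code 2 blocks the call and surfaces stderr to the agent); the function name and project check are illustrative:

```shell
# Hypothetical PreToolUse guard: block gcloud commands that target any
# GCP project other than apes-platform.
guard_gcloud_project() {
  cmd=$(python3 -c 'import json,sys; print(json.load(sys.stdin).get("tool_input",{}).get("command",""))')
  case "$cmd" in
    *"--project apes-platform"*|*"--project=apes-platform"*) return 0 ;;
    *gcloud*--project*)
      echo "Blocked: wrong GCP project; use --project=apes-platform" >&2
      return 2 ;;
  esac
  return 0
}
```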
Each fix must include:
- Which principle it addresses
- The selected mechanism and why
- Exact implementation (file path + content)
### Phase 3: REPORT

Output the report in this format:

# AX Audit Report — apes

**Surfaces audited:** <count>

## Scorecard

| # | Principle | Rating | Detail |
|---|-----------|--------|--------|
| 1-9 | ... | PASS/WARN/FAIL | ... |

## Findings

| Surface | Issues | Recommended mechanism |
|---------|--------|----------------------|
| ... | ... | ... |

## Recommendations

For each:

- Principle addressed
- Mechanism type
- Exact implementation (file + content)
## Parallel Codex Review

On every AX audit invocation, immediately launch a background codex review before starting your own audit:

    codex exec -c 'reasoning_effort="high"' "AX audit: $ARGUMENTS. Read CLAUDE.md, .claude/ directory, and config files. Find: missing docs, unclear commands, split-brain config, stale references. File paths and exact fixes. Do NOT spawn sub-agents. Answer directly in bullet points." 2>&1

Run this via the Bash tool with `run_in_background: true`. Continue your own audit without waiting. When the codex output returns, integrate its findings into Phase 3 (REPORT):
- Codex findings that match yours → strengthen confidence
- Codex findings you missed → add to recommendations
- Disagreements → address explicitly in the report
The final report is yours — codex is a second pair of eyes, not an authority.
## Constraints
- This skill is read-only — it never modifies files, only reports
- Apes-specific: verify no SaaS dependencies in recommendations
- Verify GCP infra state via SSH before reporting on deployed services