- separate DB models from API types (no more "one struct rules all") - drop utoipa, drop channel membership, drop model from users - add seq ordering, soft delete, hashed tokens, same-channel reply constraint - WS auth via first message instead of query param - reorder stories to vertical slice (conversation model first, deploy early) - add codex-reviewer subagent for parallel GPT-5.4 reviews - update critic + ax skills to use codex-reviewer Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
130 lines
4.7 KiB
Markdown
130 lines
4.7 KiB
Markdown
---
|
|
name: ax
|
|
description: Audit agent-facing docs, hooks, skills, and config for the apes platform against AX principles. Use when agent behavior is wrong due to missing/unclear docs, poor ergonomics, or misconfigured automation.
|
|
argument-hint: "[problem description]"
|
|
disable-model-invocation: true
|
|
---
|
|
|
|
# AX — Agent Experience Audit
|
|
|
|
Audit the apes project's Claude Code configuration — CLAUDE.md, hooks, skills, rules, permissions — against AX principles. For each finding, recommend the right mechanism to fix it.
|
|
|
|
## Arguments
|
|
|
|
- `$ARGUMENTS` — description of the AX problem (e.g., "agents keep deploying to wrong project"). If empty, run a general audit.
|
|
|
|
## Workflow
|
|
|
|
### Phase 1: AUDIT — Discover and score
|
|
|
|
#### 1a. Establish ground truth
|
|
|
|
Derive canonical workflows from:
|
|
- `docker-compose.yml` files on VMs (SSH to check)
|
|
- Any `Makefile`, `package.json`, `pyproject.toml` in repo
|
|
- Deployment scripts, CI pipelines
|
|
- GCP project config (`apes-platform`)
|
|
|
|
Ground truth is authoritative. If docs and automation disagree, fix docs.
|
|
|
|
#### 1b. Inventory agent-facing surfaces
|
|
|
|
Discover all Claude Code configuration:
|
|
|
|
**Documentation:** `CLAUDE.md`, `.claude/rules/*.md`, `README.md`
|
|
**Automation:** `.claude/settings.json`, hooks
|
|
**Skills:** `.claude/skills/*/SKILL.md`
|
|
**Commands:** `.claude/commands/*.md`
|
|
**Agents:** `.claude/agents/*.md`
|
|
**Memory:** `~/.claude/projects/*/memory/MEMORY.md`
|
|
|
|
If `$ARGUMENTS` is provided, focus on relevant surfaces.
|
|
|
|
#### 1c. Score against AX principles
|
|
|
|
| # | Principle | FAIL when... |
|
|
|---|-----------|--------------|
|
|
| 1 | Explicitness over convention | A non-standard workflow isn't called out explicitly |
|
|
| 2 | Fail fast with clear recovery | Errors lack concrete fix commands |
|
|
| 3 | Minimize context rot | CLAUDE.md adds tokens that don't earn their keep |
|
|
| 4 | Structured over unstructured | Important info buried in prose instead of tables/code blocks |
|
|
| 5 | Consistent patterns | Naming or formatting conventions shift across docs |
|
|
| 6 | Complete context at point of need | Critical commands missing where they're needed |
|
|
| 7 | Guard rails over documentation | Says "don't do X" but X would succeed — a hook or permission would be better |
|
|
| 8 | Single source of truth | Same info maintained in multiple places, or docs diverge from reality |
|
|
|
|
**Apes-specific checks:**
|
|
- GCP project/region/zone correct everywhere?
|
|
- Docker Compose configs on VMs match what docs describe?
|
|
- DNS records match what's deployed?
|
|
- No SaaS dependencies crept in?
|
|
|
|
### Phase 2: PROPOSE — Select mechanism and draft fixes
|
|
|
|
For each WARN or FAIL, select the right Claude Code mechanism:
|
|
|
|
| If the finding is... | Use this mechanism |
|
|
|---|---|
|
|
| Block forbidden actions | **PreToolUse hook** |
|
|
| Dangerous command that should never run | **Permission deny rule** |
|
|
| Auto-format/lint/test after edits | **PostToolUse hook** |
|
|
| File-type-specific convention | **`.claude/rules/*.md`** with `paths` frontmatter |
|
|
| Repeatable workflow or reference | **Skill** |
|
|
| Complex task needing isolation | **Subagent** |
|
|
| Critical context surviving compaction | **CLAUDE.md** |
|
|
| Universal project convention | **CLAUDE.md** (keep <200 lines) |
|
|
|
|
Each fix must include:
|
|
- Which principle it addresses
|
|
- The selected mechanism and why
|
|
- Exact implementation (file path + content)
|
|
|
|
### Phase 3: REPORT
|
|
|
|
```
|
|
# AX Audit Report — apes
|
|
|
|
**Surfaces audited:** <count>
|
|
|
|
## Scorecard
|
|
|
|
| # | Principle | Rating | Detail |
|
|
|---|-----------|--------|--------|
|
|
| 1-8 | ... | PASS/WARN/FAIL | ... |
|
|
|
|
## Findings
|
|
|
|
| Surface | Issues | Recommended mechanism |
|
|
|---------|--------|----------------------|
|
|
| ... | ... | ... |
|
|
|
|
## Recommendations
|
|
|
|
For each:
|
|
- Principle addressed
|
|
- Mechanism type
|
|
- Exact implementation (file + content)
|
|
```
|
|
|
|
## Parallel Codex Review
|
|
|
|
On every AX audit invocation, **immediately** spawn the `codex-reviewer` subagent in the background before starting your own audit:
|
|
|
|
```
|
|
Agent(subagent_type="codex-reviewer", run_in_background=true,
|
|
prompt="AX audit: $ARGUMENTS. Read CLAUDE.md, .claude/ directory, and config files. Find: missing docs, unclear commands, split-brain config, stale references. File paths and exact fixes.")
|
|
```
|
|
|
|
Continue your own audit without waiting. When the codex-reviewer returns, integrate its findings into Phase 3 (REPORT):
|
|
- Codex findings that match yours → strengthen confidence
|
|
- Codex findings you missed → add to recommendations
|
|
- Disagreements → address explicitly in the report
|
|
|
|
The final report is yours — codex is a second pair of eyes, not an authority.
|
|
|
|
## Constraints
|
|
|
|
- This skill is **read-only** — it never modifies files, only reports
|
|
- Apes-specific: verify no SaaS dependencies in recommendations
|
|
- Verify GCP infra state via SSH before reporting on deployed services
|