Shipwright — Production engineering discipline for AI coding agents

The five skills

Each skill is a single readable SKILL.md file. Stack-agnostic. Distilled from engineering repos with 50k+ ⭐ and the OWASP Top 10:2025.

disciplined-delivery

The engine. Frame → Plan → Slice → Verify for any project. Evidence-based “done”, grader-≠-doer, anti-false-done counters.

shipping-production-websites

Web build: 12-layer full-stack map + 14 production pillars + pre-launch gate. Nothing falls through.

securing-applications

OWASP Top 10:2025 → defense + a runnable check per risk. CSRF, SSRF, JWT, LLM prompt-injection, privacy.

operating-production-services

Day-2: rate limiting, retries, circuit breakers, SLO, on-call, canary deploys, DR, backups + restore drills.

token-frugal-engineering

Keep the agent’s context lean: delegate verbose work, search instead of reading, practice session hygiene. Real token savings.

How they chain

The engine pulls in the other skills when a build needs their depth. Use one, or use all.

disciplined-delivery            ← the engine, every non-trivial build
├─ shipping-production-websites  ← pulled in for web projects
├─ securing-applications         ← the security gate (OWASP Top 10:2025)
├─ operating-production-services ← runtime controls + Day-2 maintenance
└─ token-frugal-engineering      ← cross-cutting, keeps context cheap

Why it’s different

Evidence over assertion

“Done” needs the test output / eval score that proves it. “Should work” is a red flag the skill catches.

No step skipped

Pressured to “just ship it”? The plan shrinks to 3 lines — it never disappears.

Grader ≠ doer

A fresh check, separate from the one that did the work, grades the result against the criteria and produces a per-criterion PASS/FAIL table with evidence. Faking a pass would require fabricating output.
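
To make the idea concrete, here is a minimal sketch of a per-criterion evidence table. It is not taken from the skill files: `CriterionResult`, `grade`, and the sample criteria and evidence strings are hypothetical names and values chosen for illustration. The only rule it encodes is the one stated above, that a criterion passes only when evidence accompanies it.

```python
from dataclasses import dataclass


@dataclass
class CriterionResult:
    """One row of the grading table (hypothetical structure, for illustration)."""
    criterion: str
    passed: bool
    evidence: str  # e.g. captured test output or an eval score


def grade(results):
    """Return (verdict, rows). A claimed pass with no evidence is graded FAIL."""
    rows = []
    all_pass = True
    for r in results:
        evidence = r.evidence.strip()
        ok = r.passed and bool(evidence)
        all_pass = all_pass and ok
        status = "PASS" if ok else "FAIL"
        rows.append(f"{status} | {r.criterion} | {evidence or '(no evidence)'}")
    # An empty table proves nothing, so it is never "DONE".
    verdict = "DONE" if rows and all_pass else "NOT DONE"
    return verdict, rows


# Illustrative sample data, not real output from any project.
results = [
    CriterionResult("unit tests pass", True, "pytest: 42 passed in 1.3s"),
    CriterionResult("login is rate limited", True, ""),  # asserted, not shown
]
verdict, table = grade(results)
print(verdict)
for row in table:
    print(row)
```

The second criterion is claimed as passing but carries no evidence, so the sketch grades it FAIL and the overall verdict is NOT DONE.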

Install anywhere

Skill bodies use neutral prose — they port across agents.

Claude Code

claude plugin marketplace add nhattrung0911/shipwright
claude plugin install shipwright@shipwright

Codex

Copy skills/* into ~/.agents/skills/

Gemini CLI

Copy skills/* into ~/.gemini/skills/
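
For Codex and Gemini CLI, the copy step can also be scripted. The following is a minimal sketch, assuming it runs from a checkout of the repository that contains the `skills/` directory; `install_skills` is a hypothetical helper, not something Shipwright ships.

```python
import shutil
from pathlib import Path


def install_skills(source: Path, target: Path) -> list:
    """Copy every entry under `source` into `target` and return the names copied."""
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for entry in sorted(source.iterdir()):
        dest = target / entry.name
        if entry.is_dir():
            # dirs_exist_ok (Python 3.8+) lets a re-run refresh an existing install.
            shutil.copytree(entry, dest, dirs_exist_ok=True)
        else:
            shutil.copy2(entry, dest)
        copied.append(entry.name)
    return copied


# Example calls (targets taken from the instructions above):
#   install_skills(Path("skills"), Path.home() / ".agents" / "skills")   # Codex
#   install_skills(Path("skills"), Path.home() / ".gemini" / "skills")   # Gemini CLI
```

Running it a second time overwrites the previously copied files instead of failing, which keeps updates to a single command.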