well-architected

Name: well-architected
Author: avelikiy

Updated 1 month ago

Category: Design. Tags: excel, word, ai, design

About

This Claude Skill applies a comprehensive architecture review across six pillars (operational excellence, security, reliability, performance, cost, sustainability) to all non-nano projects. The architect agent applies it automatically when creating or auditing ARCH documents, so that the design gets thorough consideration beyond features. Developers should use it for project architectures from small to enterprise scale, for audits, and for reviews of legacy environments, but not for nano projects or simple bug fixes.

Quick install

Claude Code

Primary (recommended)

npx skills add avelikiy/great_cto -a claude-code

Plugin command (alternative)

/plugin add https://github.com/avelikiy/great_cto

Git clone (alternative)

git clone https://github.com/avelikiy/great_cto.git ~/.claude/skills/well-architected

Copy and paste this command into Claude Code to install this skill.

Documentation

Well-Architected — 6 pillars to verify before shipping

Every ARCH document for non-nano work must answer the 6 pillar questions below. Skipping a pillar is allowed only if explicitly justified (e.g. "Sustainability: N/A — backend-only, runs in shared infra.").

This is adapted from the AWS Well-Architected Framework (lens: small-team SaaS / LLM applications), trimmed to the questions that matter at a scale of fewer than 10 engineers.

Pillar 1 — Operational excellence

Questions

Observability: What metrics, logs, traces do we emit? How do we tell from a dashboard if this is working in prod?
Deployability: How do we ship a change? CI gates? Rollback path?
Runbooks: When this breaks at 3am, what does on-call read?

Pass criteria

✅ One metric per business outcome (e.g. webhook-deliveries-acked)
✅ One log line per request, with request-id correlatable across services
✅ Deploy path is documented and tested (rollback dry-run executed)
✅ Runbook covers top-3 failure modes from pre-mortem

Common fail

❌ "We'll add monitoring later." Monitoring is part of the feature.
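As a rough illustration of the "one log line per request" criterion, the sketch below emits a single JSON line per request and reuses an incoming request ID when one is supplied. The field names and the helpers `build_request_log` and `new_request_id` are hypothetical, chosen for this example rather than taken from the skill.

```python
import json
import logging
import uuid


def new_request_id(incoming=None):
    """Reuse the caller's request ID if one was passed along, else mint one."""
    return incoming or uuid.uuid4().hex


def build_request_log(request_id, route, status, duration_ms):
    """Return one JSON log line summarizing a single request."""
    return json.dumps(
        {
            "request_id": request_id,  # same ID is forwarded to downstream services
            "route": route,
            "status": status,
            "duration_ms": duration_ms,
        },
        sort_keys=True,
    )


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO, format="%(message)s")
    rid = new_request_id()
    logging.getLogger("app").info(build_request_log(rid, "/webhooks", 200, 12.5))
```

Forwarding the same `request_id` on outbound calls is what makes the lines correlatable across services; how the ID travels (for example, an HTTP header) is left open here.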

Pillar 2 — Security

Questions

Trust boundaries: Where does untrusted data enter? How is it validated/sanitized?
Authn / authz: Who can call this? Who can read/write the data?
Secrets: Where are API keys, DB passwords, JWT signing keys stored?
Data classification: PII? PHI? PCI cardholder data? What's the retention policy?

Pass criteria

✅ Every external input has explicit validation at the boundary
✅ Authz is enforced at the data layer, not just UI
✅ Secrets in env vars or secret manager, never in source
✅ Sensitive data classified and retention policy defined

Common fail

❌ "JWT validates the user, that's our authz." JWT is authentication. Authorization is separate (this user can read THIS row).
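To make the authentication/authorization split concrete, here is a minimal sketch in which the data-access layer itself checks row ownership. It assumes the caller has already been authenticated (for example, by a verified JWT) and arrives as a `User`. The `InvoiceStore`, `Invoice`, and `Forbidden` names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    user_id: str  # identity established earlier by authentication


@dataclass(frozen=True)
class Invoice:
    invoice_id: str
    owner_id: str
    total_cents: int


class Forbidden(Exception):
    """Raised when the caller may not see the requested row."""


class InvoiceStore:
    def __init__(self, invoices):
        self._by_id = {inv.invoice_id: inv for inv in invoices}

    def get_for_user(self, user, invoice_id):
        """Authorization lives here, so every caller of the store is covered."""
        invoice = self._by_id.get(invoice_id)
        # Missing and not-owned rows raise the same error, so the response
        # does not reveal whether the invoice exists.
        if invoice is None or invoice.owner_id != user.user_id:
            raise Forbidden(invoice_id)
        return invoice
```

Because the check sits in the store rather than in a UI handler, a new endpoint or background job that reads invoices cannot bypass it by accident.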

Pillar 3 — Reliability

Questions

Failure modes: What happens when a downstream dependency is slow / down / corrupted?
Idempotency: Can a retried request safely re-execute?
Backups & recovery: What's the RPO (data-loss tolerance)? RTO (downtime tolerance)? Test plan for both?
Capacity: What's the max QPS this can handle? What happens at 1.5x that?

Pass criteria

✅ Circuit breakers / timeouts on external calls
✅ State-mutating endpoints accept idempotency keys
✅ Backups documented + restore tested in the last 90 days
✅ Load test exists; results in docs/perf/

Common fail

❌ "Postgres has backups." Backups without a tested restore aren't backups.
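The idempotency-key criterion can be sketched as follows: a retried request with the same key returns the stored response instead of repeating the side effect. `PaymentService` and its in-memory dictionary are hypothetical stand-ins. A real service would need a durable store and an atomic insert-if-absent so that concurrent retries stay safe.

```python
class PaymentService:
    """Toy state-mutating endpoint that accepts an idempotency key."""

    def __init__(self):
        self._results = {}  # idempotency key -> response already returned
        self.charges = []   # side effects actually performed

    def charge(self, idempotency_key, account, amount_cents):
        # A retry with a known key replays the stored response.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        # First time this key is seen: perform the side effect once.
        self.charges.append((account, amount_cents))
        response = {"charge_id": len(self.charges), "amount_cents": amount_cents}
        self._results[idempotency_key] = response
        return response
```

Calling `charge("key-1", ...)` twice records one charge; a different key records a second one.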

Pillar 4 — Performance efficiency

Questions

SLOs: What's the p50/p95/p99 latency target? Error rate? Availability?
Bottlenecks: Profile the critical path — what's the slowest step?
Caching: What's cacheable? Cache invalidation strategy?
Scaling: Vertical or horizontal? Auto-scale rules?

Pass criteria

✅ SLO numbers in the ARCH doc (not "fast enough")
✅ Profile attached for non-trivial requests
✅ Cache strategy documented; invalidation explicit
✅ Scaling decision justified by data, not "feels right"

Common fail

❌ "Database can handle it." Quantify: queries/sec, row count, index hit rate.
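One way to turn "SLO numbers, not fast enough" into a check is to compute percentiles from measured latencies and compare them with the targets in the ARCH doc. The sketch below uses the nearest-rank percentile definition and integer percentiles. The function names and the target values in the usage note are illustrative assumptions.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile for an integer pct in 1..100."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Smallest rank such that at least pct% of samples are at or below it.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def check_slo(samples_ms, targets_ms):
    """Map each of p50/p95/p99 to True if the measured value meets its target."""
    return {
        name: percentile(samples_ms, pct) <= targets_ms[name]
        for name, pct in (("p50", 50), ("p95", 95), ("p99", 99))
    }
```

For example, `check_slo(latencies, {"p50": 50, "p95": 200, "p99": 500})` returns a per-percentile pass/fail map that can be pasted into the review next to the raw numbers.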

Pillar 5 — Cost optimization

Questions

Hot path: What's the most expensive operation per request? Why?
Right-sizing: Is the chosen instance type / model / DB tier the smallest one that meets SLO?
Cleanup: What happens to old data? Old logs? Old branch environments?

Pass criteria

✅ Use skill cost-model to document explicit $ numbers
✅ Choose smallest LLM model that meets quality SLO (haiku before sonnet, sonnet before opus)
✅ Retention policy for logs, metrics, old data

Common fail

❌ Defaulting to Opus / GPT-4 when Haiku would work. Test on Haiku first.
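The "smallest model that meets the quality SLO" rule can be read as a simple search over tiers ordered from cheapest to most expensive. In the sketch below, `pick_smallest_model` is a hypothetical helper and `evaluate` stands for whatever evaluation harness the team already runs. The scores in the usage note are made up for illustration.

```python
def pick_smallest_model(tiers, evaluate, quality_slo):
    """Return the first (cheapest) tier whose eval score meets the SLO.

    `tiers` must be ordered cheapest first. Returns None if no tier passes.
    Larger tiers are not evaluated once a smaller one passes.
    """
    for tier in tiers:
        if evaluate(tier) >= quality_slo:
            return tier
    return None
```

With illustrative scores `{"haiku": 0.82, "sonnet": 0.91, "opus": 0.95}` and an SLO of 0.90, calling `pick_smallest_model(["haiku", "sonnet", "opus"], scores.get, 0.90)` selects `"sonnet"`, and with an SLO of 0.80 it stops at `"haiku"`.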

Pillar 6 — Sustainability (env / energy)

Questions

Workload efficiency: Is the code O(n log n) when it could be O(n)?
Idle resources: Can dev environments scale to zero overnight?
Data minimization: Do we collect / store data we never query?

Pass criteria

✅ Hot loop complexity documented
✅ Non-prod resources have shutdown schedules
✅ Data lifecycle covers ingestion, retention, deletion

Common fail

❌ Logs at debug level in prod, never reviewed. Waste of storage + carbon.
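As a small example of the workload-efficiency question, the two functions below answer the same question ("does this list contain a duplicate?") at different complexities. Both are illustrative and not part of the skill. The hashed version assumes the items are hashable.

```python
def has_duplicates_sorted(items):
    """O(n log n): sort, then compare neighbours."""
    ordered = sorted(items)
    return any(a == b for a, b in zip(ordered, ordered[1:]))


def has_duplicates_hashed(items):
    """O(n) expected time: one pass with a set of seen values."""
    seen = set()
    for item in items:
        if item in seen:
            return True
        seen.add(item)
    return False
```

Documenting which variant sits in the hot loop, and why, is the kind of note the "hot loop complexity documented" criterion asks for.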

Output format — add to ARCH

## Well-Architected review

### 1. Operational excellence
- Metrics: <list>
- Deploy path: <link to runbook>
- Verdict: PASS | RISKS LISTED

### 2. Security
- Trust boundaries: <list>
- Data classification: <PII / PHI / PCI / none>
- Verdict: PASS | RISKS LISTED

### 3. Reliability
- Failure modes: <link to pre-mortem>
- Idempotency: <yes/no per endpoint>
- Verdict: PASS | RISKS LISTED

### 4. Performance
- SLOs: p99=<ms>, error_rate=<%>, availability=<%>
- Verdict: PASS | RISKS LISTED

### 5. Cost
- Per-request cost: $<amount>
- Verdict: PASS | RISKS LISTED

### 6. Sustainability
- Hot-path complexity: O(<n>)
- Verdict: PASS | N/A | RISKS LISTED

## Open risks (rolled up)

<bullet list of all RISKS LISTED items + mitigation in plan>
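A review in the format above can be checked mechanically. The following sketch scans an ARCH document for the six numbered pillar headings and a filled-in verdict under each. The function name `review_problems` is hypothetical, and accepting `N/A` for any pillar (on the basis that a justified skip is allowed) is an assumption.

```python
import re

PILLARS = [
    "Operational excellence",
    "Security",
    "Reliability",
    "Performance",
    "Cost",
    "Sustainability",
]
ALLOWED = {"PASS", "RISKS LISTED", "N/A"}  # assumption: N/A allowed when justified


def review_problems(markdown):
    """Return a list of problems; an empty list means all six verdicts are valid."""
    verdicts = {}
    current = None
    for line in markdown.splitlines():
        heading = re.match(r"###\s+\d+\.\s+(.+?)\s*$", line)
        if heading:
            current = heading.group(1)
            verdicts[current] = None
            continue
        if line.startswith("## "):  # a new top-level section ends the pillar
            current = None
            continue
        verdict = re.match(r"-\s*Verdict:\s*(.+?)\s*$", line)
        if verdict and current is not None:
            verdicts[current] = verdict.group(1)
    problems = []
    for pillar in PILLARS:
        if pillar not in verdicts:
            problems.append(f"missing section: {pillar}")
        elif verdicts[pillar] is None:
            problems.append(f"missing verdict: {pillar}")
        elif verdicts[pillar] not in ALLOWED:
            problems.append(f"invalid verdict for {pillar}: {verdicts[pillar]}")
    return problems
```

An unfilled template line such as `- Verdict: PASS | RISKS LISTED` is reported as invalid, which catches reviews that were pasted in but never completed.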

When PASS is acceptable with risks listed

Not every architecture is bulletproof. PASS-with-risks is OK if:

Each risk is explicit (not hand-waved)
Each risk has either a mitigation in the plan OR explicit acceptance by the user
The pre-mortem section addresses the top-3 risk-score items

Gate:plan can approve a PASS-with-risks; gate:ship needs the mitigations shipped.
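The two gates can be sketched as predicates over the rolled-up risk list. This is one reading of the rules above, not the skill's own implementation: the `Risk` fields and the `gate_plan` / `gate_ship` functions are hypothetical, and the sketch assumes that a risk the user explicitly accepted, with no mitigation planned, does not block shipping. The pre-mortem criterion is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    description: str
    mitigation_planned: bool = False
    accepted_by_user: bool = False
    mitigation_shipped: bool = False


def gate_plan(risks):
    """Plan gate: each risk is explicit and is either mitigated in the plan or accepted."""
    return all(
        bool(r.description.strip()) and (r.mitigation_planned or r.accepted_by_user)
        for r in risks
    )


def gate_ship(risks):
    """Ship gate: the plan gate holds and every planned mitigation has shipped."""
    return gate_plan(risks) and all(
        r.mitigation_shipped for r in risks if r.mitigation_planned
    )
```

A plan with one mitigated risk and one accepted risk passes `gate_plan` immediately, but passes `gate_ship` only after the mitigation is marked as shipped.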

GitHub repository

avelikiy/great_cto

Path: skills/well-architected

agentic-coding, claude-code-plugin, claude-code-skills, claude-code-subagents, code-review, cto

FAQ

What is the well-architected skill?

well-architected is a Claude Skill by avelikiy. Skills package instructions and resources that Claude loads on demand, so Claude can perform well-architected-related tasks without extra prompting.

How do I install well-architected?

Use the install commands on this page: add well-architected to Claude Code as a plugin, or clone its repository into your skills directory, then restart Claude so it picks up the skill.

What category does well-architected belong to?

well-architected is in the Design category, tagged excel, word, ai and design.

Is well-architected free to use?

Yes. well-architected is listed on AIMCP and free to install. It runs inside Claude, so no separate service account is required to use the skill itself.

Related skills

executing-plans

Design

Use the executing-plans skill when you have a complete implementation plan to execute in controlled batches with review checkpoints. The skill loads and critically reviews the plan, then executes tasks in small batches (3 tasks by default), reporting progress between batches for architect review. This ensures systematic implementation with built-in quality checkpoints.

requesting-code-review

Design

This skill dispatches a code-reviewer subagent to analyze code changes against the requirements before proceeding. Use it after completing tasks, after implementing major features, or before merging into the main branch. The review helps catch problems early by comparing the current implementation with the original plan.

connect-mcp-server

Design

This skill provides a comprehensive guide for developers connecting MCP servers to Claude Code over HTTP, stdio, or SSE transports. It covers installation, configuration, authentication, and security for integrating external services such as GitHub, Notion, and custom APIs. Use it when setting up MCP integrations, configuring external tools, or working with Claude's Model Context Protocol.

web-cli-teleport

Design

This skill helps developers choose between the Claude Code web and CLI interfaces by analyzing the task, and then enables seamless teleporting of sessions between those environments. It streamlines the workflow by managing session state and context when switching between web, CLI, or mobile. Use it for complex projects that need different tools at different stages.
