diff --git a/HARNESS.md b/HARNESS.md new file mode 100644 index 0000000..3ddc374 --- /dev/null +++ b/HARNESS.md @@ -0,0 +1,77 @@ +# HARNESS.md + +## Purpose +- Define how Claude Code, Codex, and local LLM collaborate safely and reproducibly. +- Prioritize harness quality (context, constraints, verification, logs) over prompt size. + +## Roles +- Codex: task decomposition, design decisions, review criteria, documentation updates. +- Claude Code: implementation, command execution, verification runs, patch application. +- Local LLM: lightweight drafting, refactoring suggestions, alternative implementations. + +## Task Contract (required for each request) +Use exactly 3 lines: +1. Purpose: what outcome is needed. +2. Target: repo/file/environment. +3. Done: objective completion criteria. + +## Operating Rules +- One task at a time. Do not mix implementation and broad redesign in one request. +- Keep prompts map-first: pass file paths and priorities, not long chat logs. +- Prefer small reversible changes. Large changes must be split into staged commits. +- Update docs in the same task when behavior/spec changes. + +## Context Policy +- Never inject full conversation history. +- Provide only: + - current task contract, + - relevant file list, + - latest failing command + first 5 and last 5 log lines, + - current assumptions. +- If context exceeds budget, summarize into a short state snapshot and continue. + +## Safety Rails +- Destructive operations require explicit user approval. +- Never expose secrets in prompts, logs, or commits. +- Limit edits to task-related paths. +- If unexpected unrelated file changes are detected, stop and confirm with user. + +## Verification Policy +Minimum verification per change: +- L1: static checks (format/lint/type). +- L2: focused tests for touched behavior. +- L3: scenario/regression checks for user-facing flows. + +A task is not complete unless verification commands and results are recorded. + +## Quality Gates +- Build/test/lint must pass (or failures explicitly accepted by user). +- Architecture constraints must pass (dependency direction, layering, naming). +- No silent behavior changes without test or doc updates. + +## Logging and Traceability +For each task, record: +- task_id +- input contract +- changed files +- verification commands and results +- follow-up risks + +Store task records under `state/tasks/` as markdown or JSON. + +## Golden Dataset Loop +- Capture recurring failures into `tests/golden/`. +- Add a reproduction case before fixing when possible. +- Re-run golden set on relevant changes. + +## Maintenance Cadence +- Weekly: prune stale rules/context and merge duplicates. +- Monthly: review failure patterns and update guardrails. +- After incidents: add regression tests and update this harness. + +## Escalation Rules +Escalate to user when: +- requirement is ambiguous and risky, +- security/compliance impact exists, +- production-impacting changes are needed, +- verification cannot be executed locally. diff --git a/INDEX.md b/INDEX.md new file mode 100644 index 0000000..4a2f02e --- /dev/null +++ b/INDEX.md @@ -0,0 +1,68 @@ +# INDEX.md + +## Start Here +1. Read `AGENTS.md`. +2. Read `HARNESS.md`. +3. Read `00_開発共通/開発時に共通の前提情報.txt`. +4. Confirm task contract (Purpose/Target/Done). + +## Priority Map +- Rules: + - `AGENTS.md` + - `HARNESS.md` +- Shared context: + - `00_開発共通/開発時に共通の前提情報.txt` +- Docs: + - `docs/` (specs, runbooks, decisions) +- Source: + - project code directories +- State: + - `state/tasks/` (task records) + - `tests/golden/` (regression dataset) + +## Work Sequence +1. Clarify scope from task contract. +2. Identify minimal files to read/edit. +3. Implement in small reversible steps. +4. Run verification (L1/L2/L3 as applicable). +5. Record results in `state/tasks/.md`. +6. Update docs if behavior/spec changed. + +## Input Template for Agents +Use this payload shape: +- Purpose: ... +- Target: ... +- Done: ... +- Constraints: ... +- Relevant files: + - path/to/file1 + - path/to/file2 +- Verification commands: + - cmd1 + - cmd2 + +## Prompt Budget Rules +- Prefer path references over pasted code. +- Provide excerpts only when necessary. +- Keep assumptions explicit and short. +- If token pressure increases, summarize and continue. + +## Verification Checklist +- L1 static checks executed. +- L2 focused tests executed. +- L3 scenario/regression checks executed or justified. +- Results captured with command + exit code + key logs. + +## Definition of Done +- Requested behavior is implemented. +- Verification evidence is recorded. +- Docs/specs are updated if needed. +- Open risks and next actions are explicitly listed. + +## Quick Commands (edit as needed) +```powershell +# Example placeholders +# npm run lint +# npm test +# npm run build +```