Add harness and index templates for agent workflow

2026-03-05 13:03:31 +09:00
parent 7102f0f553
commit c50082771d
2 changed files with 145 additions and 0 deletions
--- a/HARNESS.md
+++ b/HARNESS.md
@@ -0,0 +1,77 @@
+# HARNESS.md
+
+## Purpose
+- Define how Claude Code, Codex, and a local LLM collaborate safely and reproducibly.
+- Prioritize harness quality (context, constraints, verification, logs) over prompt size.
+
+## Roles
+- Codex: task decomposition, design decisions, review criteria, documentation updates.
+- Claude Code: implementation, command execution, verification runs, patch application.
+- Local LLM: lightweight drafting, refactoring suggestions, alternative implementations.
+
+## Task Contract (required for each request)
+Use exactly 3 lines:
+1. Purpose: what outcome is needed.
+2. Target: repo/file/environment.
+3. Done: objective completion criteria.
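+
+The contract can be checked mechanically before a request is sent. The following is a minimal sketch, assuming Python is available to the harness; `parse_contract`, the regex, and the sample contract text are hypothetical and not part of this repository.
+
+```python
+import re
+
+# Field names come from the Task Contract above; everything else
+# (function name, regex, sample text) is an illustrative assumption.
+CONTRACT_FIELDS = ("Purpose", "Target", "Done")
+
+
+def parse_contract(text: str) -> dict:
+    """Parse a 3-line task contract into a dict, or raise ValueError."""
+    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
+    if len(lines) != len(CONTRACT_FIELDS):
+        raise ValueError(f"expected exactly 3 lines, got {len(lines)}")
+    contract = {}
+    for field, line in zip(CONTRACT_FIELDS, lines):
+        # Accept an optional "1." style prefix before the field name.
+        match = re.fullmatch(rf"(?:\d+\.\s*)?{field}:\s*(.+)", line)
+        if match is None:
+            raise ValueError(f"line must start with '{field}:': {line!r}")
+        contract[field] = match.group(1)
+    return contract
+
+
+example = """
+Purpose: make lint pass on the CLI package.
+Target: repo root, src/cli/.
+Done: lint command exits with code 0.
+"""
+print(parse_contract(example))
+```
+
+Rejecting a malformed contract early keeps ambiguous requests from reaching any agent.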
+
+## Operating Rules
+- One task at a time. Do not mix implementation and broad redesign in one request.
+- Keep prompts map-first: pass file paths and priorities, not long chat logs.
+- Prefer small reversible changes. Large changes must be split into staged commits.
+- Update docs in the same task when behavior/spec changes.
+
+## Context Policy
+- Never inject full conversation history.
+- Provide only:
+  - current task contract,
+  - relevant file list,
+  - latest failing command + first 5 and last 5 log lines,
+  - current assumptions.
+- If context exceeds budget, summarize into a short state snapshot and continue.
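+
+The "first 5 and last 5 log lines" rule is easy to automate. This is a rough sketch only; `trim_log` and the omission marker format are assumptions, not an existing helper in this repository.
+
+```python
+def trim_log(log_text: str, head: int = 5, tail: int = 5) -> str:
+    """Keep the first `head` and last `tail` lines of a log."""
+    lines = log_text.splitlines()
+    if len(lines) <= head + tail:
+        return "\n".join(lines)  # short logs pass through unchanged
+    omitted = len(lines) - head - tail
+    # The marker text is an assumption; the policy does not prescribe one.
+    marker = f"... ({omitted} lines omitted) ..."
+    # Slice from an explicit index so tail=0 yields no trailing lines.
+    return "\n".join(lines[:head] + [marker] + lines[len(lines) - tail:])
+
+
+log = "\n".join(f"line {i}" for i in range(1, 21))
+print(trim_log(log))
+```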
+
+## Safety Rails
+- Destructive operations require explicit user approval.
+- Never expose secrets in prompts, logs, or commits.
+- Limit edits to task-related paths.
+- If unexpected unrelated file changes are detected, stop and confirm with user.
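+
+Detecting unrelated file changes can be reduced to a path comparison. The sketch below assumes the changed paths have already been collected (for example from `git diff --name-only`); `unexpected_changes` and the sample paths are hypothetical.
+
+```python
+from pathlib import PurePosixPath
+
+
+def unexpected_changes(changed: list, allowed: list) -> list:
+    """Return changed paths that are not inside any allowed path."""
+
+    def inside(path: str, root: str) -> bool:
+        p, r = PurePosixPath(path), PurePosixPath(root)
+        # A path is allowed if it equals the root or lives beneath it.
+        return p == r or r in p.parents
+
+    return [c for c in changed if not any(inside(c, a) for a in allowed)]
+
+
+# Illustrative paths only.
+changed = ["src/cli/main.py", "docs/cli.md", "package-lock.json"]
+allowed = ["src/cli", "docs/cli.md"]
+print(unexpected_changes(changed, allowed))  # ['package-lock.json']
+```
+
+A non-empty result is the signal to stop and confirm with the user.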
+
+## Verification Policy
+Minimum verification per change:
+- L1: static checks (format/lint/type).
+- L2: focused tests for touched behavior.
+- L3: scenario/regression checks for user-facing flows.
+
+A task is not complete unless verification commands and results are recorded.
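+
+Recording commands and results can be done by wrapping each check. This is a minimal sketch using Python's standard `subprocess` module; `run_check`, its return shape, and the placeholder command are assumptions.
+
+```python
+import subprocess
+import sys
+
+
+def run_check(command: list, key_lines: int = 5) -> dict:
+    """Run one verification command and capture evidence for the task record."""
+    proc = subprocess.run(command, capture_output=True, text=True)
+    output = (proc.stdout + proc.stderr).splitlines()
+    return {
+        "command": " ".join(command),
+        "exit_code": proc.returncode,
+        "passed": proc.returncode == 0,
+        # Keep only the tail of the output as "key logs".
+        "key_logs": output[len(output) - key_lines:] if key_lines < len(output) else output,
+    }
+
+
+# Placeholder command; substitute the project's real lint/test commands.
+result = run_check([sys.executable, "-c", "print('ok')"])
+print(result["exit_code"], result["key_logs"])
+```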
+
+## Quality Gates
+- Build/test/lint must pass, unless the user explicitly accepts specific failures.
+- Architecture constraints must pass (dependency direction, layering, naming).
+- No behavior changes without accompanying test or doc updates.
+
+## Logging and Traceability
+For each task, record:
+- task_id
+- input contract
+- changed files
+- verification commands and results
+- follow-up risks
+
+Store task records under `state/tasks/` as markdown or JSON.
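+
+A JSON task record might be written as follows. This is a sketch under assumptions: only `task_id` and the `state/tasks/` location come from this document; the other key names, `write_task_record`, and the sample values are hypothetical.
+
+```python
+import json
+import tempfile
+from pathlib import Path
+
+# Key names other than task_id are assumed snake_case forms of the list above.
+REQUIRED = {"task_id", "input_contract", "changed_files",
+            "verification", "follow_up_risks"}
+
+
+def write_task_record(root: Path, record: dict) -> Path:
+    """Validate a task record and store it under <root>/state/tasks/."""
+    missing = REQUIRED - record.keys()
+    if missing:
+        raise ValueError(f"missing fields: {sorted(missing)}")
+    task_dir = root / "state" / "tasks"
+    task_dir.mkdir(parents=True, exist_ok=True)
+    path = task_dir / f"{record['task_id']}.json"
+    path.write_text(json.dumps(record, indent=2, ensure_ascii=False),
+                    encoding="utf-8")
+    return path
+
+
+record = {
+    "task_id": "example-001",  # illustrative id format
+    "input_contract": {"Purpose": "...", "Target": "...", "Done": "..."},
+    "changed_files": ["path/to/file1"],
+    "verification": [{"command": "cmd1", "exit_code": 0}],
+    "follow_up_risks": [],
+}
+with tempfile.TemporaryDirectory() as tmp:  # temp dir keeps the demo side-effect free
+    print(write_task_record(Path(tmp), record).name)
+```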
+
+## Golden Dataset Loop
+- Capture recurring failures into `tests/golden/`.
+- Add a reproduction case before fixing when possible.
+- Re-run golden set on relevant changes.
+
+## Maintenance Cadence
+- Weekly: prune stale rules/context and merge duplicates.
+- Monthly: review failure patterns and update guardrails.
+- After incidents: add regression tests and update this harness.
+
+## Escalation Rules
+Escalate to user when:
+- a requirement is ambiguous and a wrong guess would be risky,
+- security/compliance impact exists,
+- production-impacting changes are needed,
+- verification cannot be executed locally.
--- a/INDEX.md
+++ b/INDEX.md
@@ -0,0 +1,68 @@
+# INDEX.md
+
+## Start Here
+1. Read `AGENTS.md`.
+2. Read `HARNESS.md`.
+3. Read `00_開発共通/開発時に共通の前提情報.txt` (shared development prerequisites).
+4. Confirm task contract (Purpose/Target/Done).
+
+## Priority Map
+- Rules:
+  - `AGENTS.md`
+  - `HARNESS.md`
+- Shared context:
+  - `00_開発共通/開発時に共通の前提情報.txt`
+- Docs:
+  - `docs/` (specs, runbooks, decisions)
+- Source:
+  - project code directories
+- State:
+  - `state/tasks/` (task records)
+  - `tests/golden/` (regression dataset)
+
+## Work Sequence
+1. Clarify scope from task contract.
+2. Identify minimal files to read/edit.
+3. Implement in small reversible steps.
+4. Run verification (L1/L2/L3 as applicable).
+5. Record results in `state/tasks/<task_id>.md`.
+6. Update docs if behavior/spec changed.
+
+## Input Template for Agents
+Use this payload shape:
+- Purpose: ...
+- Target: ...
+- Done: ...
+- Constraints: ...
+- Relevant files:
+  - path/to/file1
+  - path/to/file2
+- Verification commands:
+  - cmd1
+  - cmd2
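+
+The payload shape above can be generated rather than typed by hand. The following sketch is one possible renderer; `render_payload` and the sample arguments are hypothetical.
+
+```python
+def render_payload(purpose: str, target: str, done: str, constraints: str,
+                   files: list, commands: list) -> str:
+    """Render an agent input payload in the bullet shape shown above."""
+    lines = [
+        f"- Purpose: {purpose}",
+        f"- Target: {target}",
+        f"- Done: {done}",
+        f"- Constraints: {constraints}",
+        "- Relevant files:",
+        *[f"  - {path}" for path in files],
+        "- Verification commands:",
+        *[f"  - {cmd}" for cmd in commands],
+    ]
+    return "\n".join(lines)
+
+
+# Illustrative values only.
+print(render_payload("fix lint errors", "src/cli/", "lint exits 0",
+                     "no dependency changes", ["src/cli/main.py"],
+                     ["npm run lint"]))
+```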
+
+## Prompt Budget Rules
+- Prefer path references over pasted code.
+- Provide excerpts only when necessary.
+- Keep assumptions explicit and short.
+- If token pressure increases, summarize and continue.
+
+## Verification Checklist
+- L1 static checks executed.
+- L2 focused tests executed.
+- L3 scenario/regression checks executed or justified.
+- Results captured with command + exit code + key logs.
+
+## Definition of Done
+- Requested behavior is implemented.
+- Verification evidence is recorded.
+- Docs/specs are updated if needed.
+- Open risks and next actions are explicitly listed.
+
+## Quick Commands (edit as needed)
+```powershell
+# Example placeholders
+# npm run lint
+# npm test
+# npm run build
+```