feat: add reminders page, BMad skills upgrade, MCP server refactor

- Add reminders page with navigation support
- Upgrade BMad builder module to skills-based architecture
- Refactor MCP server: extract tools and auth into separate modules
- Add connections cache, custom AI provider support
- Update prisma schema and generated client
- Various UI/UX improvements and i18n updates
- Add service worker for PWA support

Made-with: Cursor
2026-04-13 21:02:53 +02:00
parent 18ed116e0d
commit fa7e166f3e
3099 changed files with 397228 additions and 14584 deletions
--- a/.claude/skills/bmad-agent-builder/references/quality-dimensions.md
+++ b/.claude/skills/bmad-agent-builder/references/quality-dimensions.md
@@ -0,0 +1,54 @@
+# Quality Dimensions — Quick Reference
+
+Seven dimensions to keep in mind when building agent skills. The quality scanners check these automatically during quality analysis — this is a mental checklist for the build phase.
+
+## 1. Outcome-Driven Design
+
+Describe what each capability achieves, not how to do it step by step. The agent's persona context (identity, communication style, principles) informs HOW — capability prompts just need the WHAT.
+
+- **The test:** Would removing this instruction cause the agent to produce a worse outcome? If the agent would do it anyway given its persona and the desired outcome, the instruction is noise.
+- **Pruning:** If a capability prompt teaches the LLM something it already knows — or repeats guidance already in the agent's identity/style — cut it.
+- **When procedure IS value:** Exact script invocations, specific file paths, API calls, security-critical operations. These need low freedom.
+
+## 2. Informed Autonomy
+
+The executing agent needs enough context to make judgment calls when situations don't match the script. The Overview section establishes this: domain framing, theory of mind, design rationale.
+
+- Simple agents with 1-2 capabilities need minimal context
+- Agents with memory, autonomous mode, or complex capabilities need domain understanding, user perspective, and rationale for non-obvious choices
+- When in doubt, explain *why* — an agent that understands the mission improvises better than one following blind steps
+
+## 3. Intelligence Placement
+
+Scripts handle plumbing (fetch, transform, validate). Prompts handle judgment (interpret, classify, decide).
+
+**Test:** If a script contains an `if` that decides what content *means*, intelligence has leaked.
+
+**Reverse test:** If a prompt validates structure, counts items, parses known formats, compares against schemas, or checks file existence — determinism has leaked into the LLM. That work belongs in a script.
+
+## 4. Progressive Disclosure
+
+SKILL.md stays focused. Detail goes where it belongs.
+
+- Capability instructions → `./references/`
+- Reference data, schemas, large tables → `./references/`
+- Templates, starter files → `./assets/`
+- Memory discipline → `./references/memory-system.md`
+- Multi-capability SKILL.md under ~250 lines: fine as-is
+- Single-purpose up to ~500 lines: acceptable if focused
+
+## 5. Description Format
+
+Two parts: `[5-8 word summary]. [Use when user says 'X' or 'Y'.]`
+
+Default to conservative triggering. See `./references/standard-fields.md` for full format.
+
+## 6. Path Construction
+
+Only use `{project-root}` for `_bmad` paths. Config variables are used directly because their resolved values already contain `{project-root}`.
+
+See `./references/standard-fields.md` for correct/incorrect patterns.
+
+## 7. Token Efficiency
+
+Remove genuine waste (repetition, defensive padding, meta-explanation). Preserve context that enables judgment (persona voice, domain framing, theory of mind, design rationale). These are different things — never trade effectiveness for efficiency. A capability that works correctly but uses extra tokens is always better than one that's lean but fails edge cases.
--- a/.claude/skills/bmad-agent-builder/references/script-opportunities-reference.md
+++ b/.claude/skills/bmad-agent-builder/references/script-opportunities-reference.md
@@ -0,0 +1,343 @@
+# Quality Scan Script Opportunities — Reference Guide
+
+**See `references/script-standards.md` for script creation guidelines.**
+
+This document identifies deterministic operations that should be offloaded from the LLM into scripts for quality validation of BMad agents.
+
+---
+
+## Core Principle
+
+Scripts validate structure and syntax (deterministic). Prompts evaluate semantics and meaning (judgment). Create scripts for checks that have clear pass/fail criteria.
+
+---
+
+## How to Spot Script Opportunities
+
+During build, walk through every capability/operation and apply these tests:
+
+### The Determinism Test
+For each operation the agent performs, ask:
+- Given identical input, will this ALWAYS produce identical output? → Script
+- Does this require interpreting meaning, tone, context, or ambiguity? → Prompt
+- Could you write a unit test with expected output for every input? → Script
+
+### The Judgment Boundary
+- **Scripts handle:** fetch, transform, validate, count, parse, compare, extract, format, check structure
+- **Prompts handle:** interpret, classify with ambiguity, create, decide with incomplete info, evaluate quality, synthesize meaning
+
+### Pattern Recognition Checklist
+Signal verbs and patterns in a capability description map to script types as follows:
+
+| Signal Verb/Pattern | Script Type |
+|---------------------|-------------|
+| "validate", "check", "verify" | Validation script |
+| "count", "tally", "aggregate", "sum" | Metric/counting script |
+| "extract", "parse", "pull from" | Data extraction script |
+| "convert", "transform", "format" | Transformation script |
+| "compare", "diff", "match against" | Comparison script |
+| "scan for", "find all", "list all" | Pattern scanning script |
+| "check structure", "verify exists" | File structure checker |
+| "against schema", "conforms to" | Schema validation script |
+| "graph", "map dependencies" | Dependency analysis script |
+
+### The Outside-the-Box Test
+Beyond obvious validation, consider:
+- Could any data gathering step be a script that returns structured JSON for the LLM to interpret?
+- Could pre-processing reduce what the LLM needs to read?
+- Could post-processing validate what the LLM produced?
+- Could metric collection feed into LLM decision-making without the LLM doing the counting?
+
+### Your Toolbox
+Scripts have access to full capabilities — think broadly:
+- **Bash**: Full shell — `jq`, `grep`, `awk`, `sed`, `find`, `diff`, `wc`, `sort`, `uniq`, `curl`, plus piping and composition
+- **Python**: Standard library (`json`, `yaml`, `pathlib`, `re`, `argparse`, `collections`, `difflib`, `ast`, `csv`, `xml`, etc.) plus PEP 723 inline-declared dependencies (`tiktoken`, `jsonschema`, `pyyaml`, etc.)
+- **System tools**: `git` commands for history/diff/blame, filesystem operations, process execution
+
+If you can express the logic as deterministic code, it's a script candidate.
+
+### The --help Pattern
+All Python scripts declare their dependencies inline per PEP 723, and every script supports `--help`. When a skill's prompt needs to invoke a script, it can say "Run `scripts/foo.py --help` to understand inputs/outputs, then invoke appropriately" instead of inlining the script's interface. This saves tokens in prompts and keeps a single source of truth for the script's API.
+
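A minimal sketch of the pattern, assuming a hypothetical `scripts/foo.py` that counts Markdown files. The script name, its `--agent-path` argument, and the output keys are illustrative only. The PEP 723 header makes the file self-contained, and `argparse` generates the `--help` text from the same definitions the script actually uses.

```python
# /// script
# requires-python = ">=3.9"
# dependencies = []
# ///
"""Hypothetical scripts/foo.py: count Markdown files under an agent folder."""
import argparse
import json
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    # The parser is the single source of truth for the interface;
    # the --help output is generated from these definitions.
    parser = argparse.ArgumentParser(
        prog="foo.py",
        description="Count Markdown files under an agent folder and print JSON.",
    )
    parser.add_argument("--agent-path", required=True, type=Path,
                        help="folder of the agent to inspect")
    return parser


def main(argv=None) -> int:
    args = build_parser().parse_args(argv)
    count = sum(1 for _ in args.agent_path.rglob("*.md"))
    print(json.dumps({"script": "foo", "markdown_files": count}))
    return 0

# A real script would end with the usual __main__ guard calling main().
```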
+---
+
+## Priority 1: High-Value Validation Scripts
+
+### 1. Frontmatter Validator
+
+**What:** Validate SKILL.md frontmatter structure and content
+
+**Why:** Frontmatter is the #1 factor in skill triggering. Catch errors early.
+
+**Checks:**
+```python
+# Checks:
+# - name exists and is kebab-case
+# - description exists and follows pattern "Use when..."
+# - No forbidden fields (XML, reserved prefixes)
+# - Optional fields have valid values if present
+```
+
+**Output:** JSON with pass/fail per field, line numbers for errors
+
+**Implementation:** Python with argparse, no external deps needed
+
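The first two checks could be sketched as follows, using only the standard library. This is an illustration under assumptions, not the actual validator: the function name `validate_frontmatter` and the result shape are hypothetical, and the parser handles only flat `key: value` frontmatter lines rather than full YAML.

```python
import re

KEBAB_CASE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")


def validate_frontmatter(text: str) -> dict:
    """Check `name` and `description` in a SKILL.md frontmatter block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {"status": "fail", "fields": {}, "error": "no frontmatter block"}

    # Collect flat `key: value` pairs with their 1-based line numbers,
    # stopping at the closing `---`. Nested YAML is deliberately not handled.
    fields = {}
    for lineno, line in enumerate(lines[1:], start=2):
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = (value.strip().strip("'\""), lineno)
    else:
        return {"status": "fail", "fields": {}, "error": "frontmatter not closed"}

    name, name_line = fields.get("name", ("", None))
    desc, desc_line = fields.get("description", ("", None))
    result = {
        "name": {"pass": bool(KEBAB_CASE.match(name)), "line": name_line},
        "description": {"pass": "Use when" in desc, "line": desc_line},
    }
    status = "pass" if all(f["pass"] for f in result.values()) else "fail"
    return {"status": status, "fields": result}
```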
+---
+
+### 2. Template Artifact Scanner
+
+**What:** Scan for orphaned template substitution artifacts
+
+**Why:** Build process may leave `{if-autonomous}`, `{displayName}`, etc.
+
+**Output:** JSON with file path, line number, artifact type
+
+**Implementation:** Bash script with JSON output via jq
+
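The entry above names Bash and jq as the implementation. The sketch below shows the same scan in Python instead, purely as an illustration. The two regular expressions are assumptions about what counts as an artifact: conditional markers of the `{if-...}` / `{/if-...}` shape, plus a small illustrative set of template variable names. Runtime placeholders such as `{project-root}` are intentionally not flagged.

```python
import re
from pathlib import Path

# Assumed artifact shapes (illustrative, not an exhaustive list).
CONDITIONAL = re.compile(r"\{/?if-[a-z-]+\}")
VARIABLE = re.compile(r"\{(displayName|skillName|agent-name|skill-description)\}")


def scan_text(text: str, file: str) -> list[dict]:
    """Return one finding per leftover template artifact in `text`."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for kind, pattern in (("conditional", CONDITIONAL), ("variable", VARIABLE)):
            for match in pattern.finditer(line):
                findings.append({"file": file, "line": lineno,
                                 "type": kind, "artifact": match.group(0)})
    return findings


def scan_agent(agent_path: Path) -> list[dict]:
    """Scan every Markdown file under `agent_path`."""
    findings = []
    for path in sorted(agent_path.rglob("*.md")):
        findings.extend(scan_text(path.read_text(encoding="utf-8"), str(path)))
    return findings
```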
+---
+
+### 3. Access Boundaries Extractor
+
+**What:** Extract and validate access boundaries from memory-system.md
+
+**Why:** Security critical — must be defined before file operations
+
+**Checks:**
+```python
+# Parse memory-system.md for:
+# - ## Read Access section exists
+# - ## Write Access section exists
+# - ## Deny Zones section exists (can be empty)
+# - Paths use placeholders correctly ({project-root} for _bmad paths, relative for skill-internal)
+```
+
+**Output:** Structured JSON of read/write/deny zones
+
+**Implementation:** Python with markdown parsing
+
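A rough sketch of the section checks, assuming each zone lists its paths as Markdown bullets under a level-2 heading. The function name `extract_boundaries` and the output keys are hypothetical, and the placeholder check on individual paths is left out.

```python
import re

# Heading text mapped to the key used in the JSON output (keys are assumed).
SECTIONS = {"Read Access": "read", "Write Access": "write", "Deny Zones": "deny"}


def extract_boundaries(markdown: str) -> dict:
    """Collect bullet items under the three access headings."""
    zones = {}
    current = None
    for line in markdown.splitlines():
        heading = re.match(r"^##\s+(.*\S)\s*$", line)
        if heading:
            # Any other level-2 heading ends the current zone.
            current = SECTIONS.get(heading.group(1))
            if current:
                zones[current] = []
        elif current and line.lstrip().startswith("- "):
            zones[current].append(line.lstrip()[2:].strip().strip("`"))
    # A section that exists but is empty (e.g. Deny Zones) is not "missing".
    missing = [key for key in SECTIONS.values() if key not in zones]
    return {"zones": zones, "missing_sections": missing}
```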
+---
+
+## Priority 2: Analysis Scripts
+
+### 4. Token Counter
+
+**What:** Count tokens in each file of an agent
+
+**Why:** Identify verbose files that need optimization
+
+**Checks:**
+```python
+# For each .md file:
+# - Total tokens (approximate: chars / 4)
+# - Code block tokens
+# - Token density (tokens / meaningful content)
+```
+
+**Output:** JSON with file path, token count, density score
+
+**Implementation:** Python with tiktoken for accurate counting, or char approximation
+
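The character approximation can be sketched with the standard library alone. This covers only the first two metrics; the density score is omitted because "meaningful content" is not defined here. The function names are illustrative, and an exact count would need a real tokenizer such as `tiktoken`.

```python
import re

TICKS = "`" * 3  # fence marker, built indirectly so this example stays fence-safe
FENCED_BLOCK = re.compile(rf"^{TICKS}.*?^{TICKS}", re.MULTILINE | re.DOTALL)


def approx_tokens(text: str) -> int:
    """Rough estimate: one token per four characters."""
    return len(text) // 4


def count_tokens(markdown: str) -> dict:
    """Estimate total tokens and the share sitting inside fenced code blocks."""
    code = "".join(FENCED_BLOCK.findall(markdown))
    return {"total_tokens": approx_tokens(markdown),
            "code_block_tokens": approx_tokens(code)}
```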
+---
+
+### 5. Dependency Graph Generator
+
+**What:** Map skill → external skill dependencies
+
+**Why:** Understand agent's dependency surface
+
+**Checks:**
+```python
+# Parse SKILL.md for skill invocation patterns
+# Parse prompt files for external skill references
+# Build dependency graph
+```
+
+**Output:** DOT format (GraphViz) or JSON adjacency list
+
+**Implementation:** Python, JSON parsing only
+
+---
+
+### 6. Activation Flow Analyzer
+
+**What:** Parse SKILL.md On Activation section for sequence
+
+**Why:** Validate activation order matches best practices
+
+**Checks:**
+
+Validate that the activation sequence is logically ordered (e.g., config loads before config is used, memory loads before memory is referenced).
+
+**Output:** JSON with detected steps, missing steps, out-of-order warnings
+
+**Implementation:** Python with regex pattern matching
+
+---
+
+### 7. Memory Structure Validator
+
+**What:** Validate memory-system.md structure
+
+**Why:** Memory files have specific requirements
+
+**Checks:**
+```python
+# Required sections:
+# - ## Core Principle
+# - ## File Structure
+# - ## Write Discipline
+# - ## Memory Maintenance
+```
+
+**Output:** JSON with missing sections, validation errors
+
+**Implementation:** Python with markdown parsing
+
+---
+
+### 8. Subagent Pattern Detector
+
+**What:** Detect if agent uses BMAD Advanced Context Pattern
+
+**Why:** Agents processing 5+ sources MUST use subagents
+
+**Checks:**
+```python
+# Pattern detection in SKILL.md:
+# - "DO NOT read sources yourself"
+# - "delegate to sub-agents"
+# - "/tmp/analysis-" temp file pattern
+# - Sub-agent output template (50-100 token summary)
+```
+
+**Output:** JSON with pattern found/missing, recommendations
+
+**Implementation:** Python with keyword search and context extraction
+
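A minimal keyword-search sketch for the three literal signals listed above. The key names and the case-insensitive matching are assumptions, and the output-template check is not covered because it has no fixed phrase to search for.

```python
# Literal phrases taken from the checks above; the dict keys are illustrative.
SIGNALS = {
    "no_direct_reads": "DO NOT read sources yourself",
    "delegation": "delegate to sub-agents",
    "temp_files": "/tmp/analysis-",
}


def detect_subagent_pattern(skill_md: str) -> dict:
    """Report which signal phrases appear in the SKILL.md text."""
    lowered = skill_md.lower()
    found = {key: phrase.lower() in lowered for key, phrase in SIGNALS.items()}
    return {
        "found": found,
        "missing": [key for key, present in found.items() if not present],
        "uses_pattern": all(found.values()),
    }
```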
+---
+
+## Priority 3: Composite Scripts
+
+### 9. Agent Health Check
+
+**What:** Run all validation scripts and aggregate results
+
+**Why:** One-stop shop for agent quality assessment
+
+**Composition:** Runs Priority 1 scripts, aggregates JSON outputs
+
+**Output:** Structured health report with severity levels
+
+**Implementation:** Bash script orchestrating Python scripts, jq for aggregation
+
+---
+
+### 10. Comparison Validator
+
+**What:** Compare two versions of an agent for differences
+
+**Why:** Validate changes during iteration
+
+**Checks:**
+```bash
+# Git diff with structure awareness:
+# - Frontmatter changes
+# - Capability additions/removals
+# - New prompt files
+# - Token count changes
+```
+
+**Output:** JSON with categorized changes
+
+**Implementation:** Bash with git, jq, python for analysis
+
+---
+
+## Script Output Standard
+
+All scripts MUST output structured JSON for agent consumption:
+
+```json
+{
+  "script": "script-name",
+  "version": "1.0.0",
+  "agent_path": "/path/to/agent",
+  "timestamp": "2025-03-08T10:30:00Z",
+  "status": "pass|fail|warning",
+  "findings": [
+    {
+      "severity": "critical|high|medium|low|info",
+      "category": "structure|security|performance|consistency",
+      "location": {"file": "SKILL.md", "line": 42},
+      "issue": "Clear description",
+      "fix": "Specific action to resolve"
+    }
+  ],
+  "summary": {
+    "total": 10,
+    "critical": 1,
+    "high": 2,
+    "medium": 3,
+    "low": 4
+  }
+}
+```
+
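One way a script might assemble this structure from a list of findings is sketched below. The helper name `build_report` is hypothetical, and the mapping from severities to `status` is an assumption, since the standard above does not define it: critical or high findings fail, medium or low findings warn, anything else passes.

```python
import json
from collections import Counter
from datetime import datetime, timezone

SUMMARY_LEVELS = ("critical", "high", "medium", "low")


def build_report(script: str, agent_path: str, findings: list[dict]) -> dict:
    """Wrap findings in the standard envelope and compute the summary."""
    counts = Counter(f["severity"] for f in findings)
    # Assumed status rule; adjust to whatever the quality scanners expect.
    if counts["critical"] or counts["high"]:
        status = "fail"
    elif counts["medium"] or counts["low"]:
        status = "warning"
    else:
        status = "pass"
    summary = {"total": len(findings)}
    summary.update({level: counts[level] for level in SUMMARY_LEVELS})
    return {
        "script": script,
        "version": "1.0.0",
        "agent_path": agent_path,
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "status": status,
        "findings": findings,
        "summary": summary,
    }
```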
+---
+
+## Implementation Checklist
+
+When creating validation scripts:
+
+- [ ] Uses `--help` for documentation
+- [ ] Accepts `--agent-path` for target agent
+- [ ] Outputs JSON to stdout
+- [ ] Writes diagnostics to stderr
+- [ ] Returns meaningful exit codes (0=pass, 1=fail, 2=error)
+- [ ] Includes `--verbose` flag for debugging
+- [ ] Has tests in `scripts/tests/` subfolder
+- [ ] Self-contained (PEP 723 for Python)
+- [ ] No interactive prompts
+
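The stdout, stderr, and exit-code items of the checklist can be sketched as a small wrapper. The name `run_check` and the shape of the `check` callable (it takes an agent path and returns a list of findings) are assumptions made for illustration.

```python
import json
import sys

EXIT_PASS, EXIT_FAIL, EXIT_ERROR = 0, 1, 2


def run_check(check, agent_path: str, verbose: bool = False) -> int:
    """Run `check(agent_path)` and map its outcome to an exit code."""
    try:
        findings = check(agent_path)
    except Exception as exc:
        # The script itself broke: report an error (2), not a failed check (1).
        print(f"error: {exc}", file=sys.stderr)
        return EXIT_ERROR
    if verbose:
        # Diagnostics go to stderr so stdout stays machine-readable JSON.
        print(f"{len(findings)} finding(s) for {agent_path}", file=sys.stderr)
    print(json.dumps({"agent_path": agent_path, "findings": findings}))
    return EXIT_FAIL if findings else EXIT_PASS
```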
+---
+
+## Integration with Quality Analysis
+
+The Quality Analysis skill should:
+
+1. **First**: Run available scripts for fast, deterministic checks
+2. **Then**: Use sub-agents for semantic analysis (requires judgment)
+3. **Finally**: Synthesize both sources into report
+
+**Example flow:**
+```bash
+# Run all validation scripts
+python scripts/validate-frontmatter.py --agent-path {path}
+bash scripts/scan-template-artifacts.sh --agent-path {path}
+
+# Collect JSON outputs
+# Spawn sub-agents only for semantic checks
+# Synthesize complete report
+```
+
+---
+
+## Script Creation Priorities
+
+**Phase 1 (Immediate value):**
+1. Template Artifact Scanner (Bash + jq)
+2. Access Boundaries Extractor (Python)
+
+**Phase 2 (Enhanced validation):**
+3. Token Counter (Python)
+4. Subagent Pattern Detector (Python)
+5. Activation Flow Analyzer (Python)
+
+**Phase 3 (Advanced features):**
+6. Dependency Graph Generator (Python)
+7. Memory Structure Validator (Python)
+8. Agent Health Check orchestrator (Bash)
+
+**Phase 4 (Comparison tools):**
+9. Comparison Validator (Bash + Python)
--- a/.claude/skills/bmad-agent-builder/references/skill-best-practices.md
+++ b/.claude/skills/bmad-agent-builder/references/skill-best-practices.md
@@ -0,0 +1,109 @@
+# Skill Authoring Best Practices
+
+For field definitions and description format, see `./standard-fields.md`. For quality dimensions, see `./quality-dimensions.md`.
+
+## Core Philosophy: Outcome-Based Authoring
+
+Skills should describe **what to achieve**, not **how to achieve it**. The LLM is capable of figuring out the approach — it needs to know the goal, the constraints, and the why.
+
+**The test for every instruction:** Would removing this cause the LLM to produce a worse outcome? If the LLM would do it anyway — or if it's just spelling out mechanical steps — cut it.
+
+### Outcome vs Prescriptive
+
+| Prescriptive (avoid) | Outcome-based (prefer) |
+|---|---|
+| "Step 1: Ask about goals. Step 2: Ask about constraints. Step 3: Summarize and confirm." | "Ensure the user's vision is fully captured — goals, constraints, and edge cases — before proceeding." |
+| "Load config. Read user_name. Read communication_language. Greet the user by name in their language." | "Load available config and greet the user appropriately." |
+| "Create a file. Write the header. Write section 1. Write section 2. Save." | "Produce a report covering X, Y, and Z." |
+
+The prescriptive versions miss requirements the author didn't think of. The outcome-based versions let the LLM adapt to the actual situation.
+
+### Why This Works
+
+- **Why over what** — When you explain why something matters, the LLM adapts to novel situations. When you just say what to do, it follows blindly even when it shouldn't.
+- **Context enables judgment** — Give domain knowledge, constraints, and goals. The LLM figures out the approach. It's better at adapting to messy reality than any script you could write.
+- **Prescriptive steps create brittleness** — When reality doesn't match the script, the LLM either follows the wrong script or gets confused. Outcomes let it adapt.
+- **Every instruction should carry its weight** — If the LLM would do it anyway, the instruction is noise. If the LLM wouldn't know to do it without being told, that's signal.
+
+### When Prescriptive Is Right
+
+Reserve exact steps for **fragile operations** where getting it wrong has consequences — script invocations, exact file paths, specific CLI commands, API calls with precise parameters. These need low freedom because there's one right way to do them.
+
+| Freedom | When | Example |
+|---------|------|---------|
+| **High** (outcomes) | Multiple valid approaches, LLM judgment adds value | "Ensure the user's requirements are complete" |
+| **Medium** (guided) | Preferred approach exists, some variation OK | "Present findings in a structured report with an executive summary" |
+| **Low** (exact) | Fragile, one right way, consequences for deviation | `python3 scripts/scan-path-standards.py {skill-path}` |
+
+## Patterns
+
+These are patterns that naturally emerge from outcome-based thinking. Apply them when they fit — they're not a checklist.
+
+### Soft Gate Elicitation
+
+At natural transitions, invite contribution without demanding it: "Anything else, or shall we move on?" Users almost always remember one more thing when given a graceful exit ramp. This produces richer artifacts than rigid section-by-section questioning.
+
+### Intent-Before-Ingestion
+
+Understand why the user is here before scanning documents or project context. Intent gives you the relevance filter — without it, scanning is noise.
+
+### Capture-Don't-Interrupt
+
+When users provide information beyond the current scope, capture it for later rather than redirecting. Users in creative flow share their best insights unprompted — interrupting loses them.
+
+### Dual-Output: Human Artifact + LLM Distillate
+
+Artifact-producing skills can output both a polished human-facing document and a token-efficient distillate for downstream LLM consumption. The distillate captures overflow, rejected ideas, and detail that doesn't belong in the human doc but has value for the next workflow. Always optional.
+
+### Parallel Review Lenses
+
+Before finalizing significant artifacts, fan out reviewers with different perspectives — skeptic, opportunity spotter, domain-specific lens. If subagents aren't available, do a single critical self-review pass. Multiple perspectives catch blind spots no single reviewer would.
+
+### Three-Mode Architecture (Guided / Yolo / Headless)
+
+Consider whether the skill benefits from multiple execution modes:
+
+| Mode | When | Behavior |
+|------|------|----------|
+| **Guided** | Default | Conversational discovery with soft gates |
+| **Yolo** | "just draft it" | Ingest everything, draft complete artifact, then refine |
+| **Headless** | `--headless` / `-H` | Complete the task without user input, using sensible defaults |
+
+Not all skills need all three. But considering them during design prevents locking into a single interaction model.
+
+### Graceful Degradation
+
+Every subagent-dependent feature should have a fallback path. A skill that hard-fails without subagents is fragile — one that falls back to sequential processing works everywhere.
+
+### Verifiable Intermediate Outputs
+
+For complex tasks with consequences: plan → validate → execute → verify. Create a verifiable plan before executing, validate with scripts where possible. Catches errors early and makes the work reversible.
+
+## Writing Guidelines
+
+- **Consistent terminology** — one term per concept, stick to it
+- **Third person** in descriptions — "Processes files" not "I help process files"
+- **Descriptive file names** — `form_validation_rules.md` not `doc2.md`
+- **Forward slashes** in all paths — cross-platform
+- **One level deep** for reference files — SKILL.md → reference.md, never chains
+- **TOC for long files** — >100 lines
+
+## Anti-Patterns
+
+| Anti-Pattern | Fix |
+|---|---|
+| Numbered steps for things the LLM would figure out | Describe the outcome and why it matters |
+| Explaining how to load config (the mechanic) | List the config keys and their defaults (the outcome) |
+| Prescribing exact greeting/menu format | "Greet the user and present capabilities" |
+| Spelling out headless mode in detail | "If headless, complete without user input" |
+| Too many options upfront | One default with escape hatch |
+| Deep reference nesting (A→B→C) | Keep references 1 level from SKILL.md |
+| Inconsistent terminology | Choose one term per concept |
+| Scripts that classify meaning via regex | Intelligence belongs in prompts, not scripts |
+
+## Scripts in Skills
+
+- **Execute vs reference** — "Run `analyze.py`" (execute) vs "See `analyze.py` for the algorithm" (read)
+- **Document constants** — explain why `TIMEOUT = 30`, not just what
+- **PEP 723 for Python** — self-contained with inline dependency declarations
+- **MCP tools** — use fully qualified names: `ServerName:tool_name`
--- a/.claude/skills/bmad-agent-builder/references/standard-fields.md
+++ b/.claude/skills/bmad-agent-builder/references/standard-fields.md
@@ -0,0 +1,79 @@
+# Standard Agent Fields
+
+## Frontmatter Fields
+
+Only these fields go in the YAML frontmatter block:
+
+| Field | Description | Example |
+|-------|-------------|---------|
+| `name` | Full skill name (kebab-case, same as folder name) | `bmad-agent-tech-writer`, `bmad-cis-agent-lila` |
+| `description` | [What it does]. [Use when user says 'X' or 'Y'.] | See Description Format below |
+
+## Content Fields
+
+These are used within the SKILL.md body — never in frontmatter:
+
+| Field | Description | Example |
+|-------|-------------|---------|
+| `displayName` | Friendly name (title heading, greetings) | `Paige`, `Lila`, `Floyd` |
+| `title` | Role title | `Tech Writer`, `Holodeck Operator` |
+| `icon` | Single emoji | `🔥`, `🌟` |
+| `role` | Functional role | `Technical Documentation Specialist` |
+| `sidecar` | Memory folder (optional) | `{skillName}-sidecar/` |
+
+## Overview Section Format
+
+The Overview is the first section after the title — it primes the AI for everything that follows.
+
+**3-part formula:**
+1. **What** — What this agent does
+2. **How** — How it works (role, approach, modes)
+3. **Why/Outcome** — Value delivered, quality standard
+
+**Templates by agent type:**
+
+**Companion agents:**
+```markdown
+This skill provides a {role} who helps users {primary outcome}. Act as {displayName} — {key quality}. With {key features}, {displayName} {primary value proposition}.
+```
+
+**Workflow agents:**
+```markdown
+This skill helps you {outcome} through {approach}. Act as {role}, guiding users through {key stages/phases}. Your output is {deliverable}.
+```
+
+**Utility agents:**
+```markdown
+This skill {what it does}. Use when {when to use}. Returns {output format} with {key feature}.
+```
+
+## SKILL.md Description Format
+
+```
+{description of what the agent does}. Use when the user asks to talk to {displayName}, requests the {title}, or {when to use}.
+```
+
+## Path Rules
+
+### Skill-Internal Files
+
+All references to files within the skill use `./` relative paths:
+- `./references/memory-system.md`
+- `./references/some-guide.md`
+- `./scripts/calculate-metrics.py`
+
+This distinguishes skill-internal files from `{project-root}` paths — without the `./` prefix the LLM may confuse them.
+
+### Memory Files (sidecar)
+
+Always use `{project-root}` prefix: `{project-root}/_bmad/memory/{skillName}-sidecar/`
+
+The sidecar `index.md` is the single entry point to the agent's memory system — it tells the agent what else to load (boundaries, logs, references, etc.). Load it once on activation; don't duplicate load instructions for individual memory files.
+
+### Config Variables
+
+Use directly — they already contain `{project-root}` in their resolved values:
+- `{output_folder}/file.md`
+- Correct: `{bmad_builder_output_folder}/agent.md`
+- Wrong: `{project-root}/{bmad_builder_output_folder}/agent.md` (double-prefix)
+
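The double-prefix mistake is mechanical enough to detect with a regular expression. The sketch below assumes config variables are written in snake_case, as in the examples above; the function name `find_double_prefixes` is hypothetical.

```python
import re

# `{project-root}/` immediately followed by another `{snake_case}` variable.
DOUBLE_PREFIX = re.compile(r"\{project-root\}/\{[a-z][a-z0-9_]*\}")


def find_double_prefixes(text: str) -> list[str]:
    """Return every `{project-root}/{config_variable}` occurrence in `text`."""
    return DOUBLE_PREFIX.findall(text)
```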
--- a/.claude/skills/bmad-agent-builder/references/template-substitution-rules.md
+++ b/.claude/skills/bmad-agent-builder/references/template-substitution-rules.md
@@ -0,0 +1,44 @@
+# Template Substitution Rules
+
+The SKILL-template provides a minimal skeleton: frontmatter, overview, agent identity sections, sidecar, and activation with config loading. Everything beyond that is crafted by the builder based on what was learned during discovery and requirements phases.
+
+## Frontmatter
+
+- `{module-code-or-empty}` → Module code prefix with hyphen (e.g., `cis-`) or empty for standalone
+- `{agent-name}` → Agent functional name (kebab-case)
+- `{skill-description}` → Two parts: [4-6 word summary]. [trigger phrases]
+- `{displayName}` → Friendly display name
+- `{skillName}` → Full skill name with module prefix
+
+## Module Conditionals
+
+### For Module-Based Agents
+- `{if-module}` ... `{/if-module}` → Keep the content inside
+- `{if-standalone}` ... `{/if-standalone}` → Remove the entire block including markers
+- `{module-code}` → Module code without trailing hyphen (e.g., `cis`)
+- `{module-setup-skill}` → Name of the module's setup skill (e.g., `bmad-cis-setup`)
+
+### For Standalone Agents
+- `{if-module}` ... `{/if-module}` → Remove the entire block including markers
+- `{if-standalone}` ... `{/if-standalone}` → Keep the content inside
+
+## Sidecar Conditionals
+
+- `{if-sidecar}` ... `{/if-sidecar}` → Keep if agent has persistent memory, otherwise remove
+- `{if-no-sidecar}` ... `{/if-no-sidecar}` → Inverse of above
+
+## Headless Conditional
+
+- `{if-headless}` ... `{/if-headless}` → Keep if agent supports headless mode, otherwise remove
+
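The keep-or-remove rules for these conditional blocks can be sketched as a small substitution helper. This is an illustration, not the builder's actual mechanism: the function names are hypothetical, and nested blocks with the same name are not handled.

```python
import re


def resolve_conditional(text: str, name: str, keep: bool) -> str:
    """Resolve every {if-NAME}...{/if-NAME} block in `text`.

    keep=True  -> drop the markers, keep the content inside
    keep=False -> remove the entire block including markers
    """
    pattern = re.compile(
        r"\{if-" + re.escape(name) + r"\}(.*?)\{/if-" + re.escape(name) + r"\}",
        re.DOTALL,
    )
    return pattern.sub(lambda m: m.group(1) if keep else "", text)


def resolve_module(text: str, is_module: bool) -> str:
    """Apply the module/standalone pair, which are always inverse of each other."""
    text = resolve_conditional(text, "module", keep=is_module)
    return resolve_conditional(text, "standalone", keep=not is_module)
```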
+## Beyond the Template
+
+The builder determines the rest of the agent structure — capabilities, activation flow, sidecar initialization, capability routing, external skills, scripts — based on the agent's requirements. The template intentionally does not prescribe these.
+
+## Path References
+
+All generated agents use `./` prefix for skill-internal paths:
+- `./references/init.md` — First-run onboarding (if sidecar)
+- `./references/{capability}.md` — Individual capability prompts
+- `./references/memory-system.md` — Memory discipline (if sidecar)
+- `./scripts/` — Python/shell scripts for deterministic operations