# Story 3.5: Secure BYOK Management
Status: done
<!-- Ultimate context engine analysis completed - comprehensive developer guide created -->
## Story
As an enterprise user,
I want to input and use my own LLM API keys (Bring Your Own Key),
so that I can bypass SaaS quotas and control my AI costs.
**Epic:** Epic 3 — The SaaS Commercial Engine (Monetization & API Cost Protection)
**FR coverage:** FR14 (secure BYOK storage + routing)
**NFR coverage:** NFR-S1 (AES-256-GCM at rest), NFR-P3 (router resolves BYOK within existing 50ms routing budget — no extra HTTP round-trip before provider call)
---
## Acceptance Criteria
1. [AC1] **Encrypted storage (NFR-S1):** When a user saves a BYOK key via the API, only `encryptedKey` (AES-256-GCM: salt + iv + authTag + ciphertext, base64) and `keyHash` (SHA-256 of plaintext, for dedup/lookup) are persisted. Plaintext API keys never appear in logs, API responses, or DB columns.
2. [AC2] **Tier gating:** BASIC users cannot save BYOK keys (403 `TIER_LIMITED` or equivalent). PRO users may configure keys for: `openai`, `anthropic`, `deepseek`, `openrouter`, `minimax`, `zai`. BUSINESS and ENTERPRISE may configure all providers supported by `VALID_PROVIDERS` in `lib/ai/router.ts`.
3. [AC3] **Live validation on save:** Before persisting, the server performs a lightweight provider validation call (e.g. models/list or minimal completion) using the submitted key. Invalid keys return 400 without writing to DB.
4. [AC4] **Router prioritization:** For any AI call where entitlement runs for `userId` (or host `billingOwnerId` in collaborative brainstorm — Story 3.4), if that billing user has an **active** BYOK key matching the resolved lane provider, `getChatProvider` / `getTagsProvider` / `getEmbeddingsProvider` MUST use the decrypted user key instead of system env/admin keys.
5. [AC5] **Quota bypass:** When BYOK is active for the billing user on that request, `canUseFeature` returns `allowed: true` even if Redis quota is exhausted; `incrementUsageAsync` MUST NOT run for that successful call. `QuotaExceededError.byokConfigured` MUST be `true` when the user has any active BYOK key (for paywall UX).
6. [AC6] **No system fallback on BYOK:** When BYOK is used, `withAiProviderFallback` is invoked with `{ skipSystemFallback: true }` so failed user-key calls surface errors instead of silently spending platform quota on a secondary system provider.
7. [AC7] **CRUD API:** Authenticated REST endpoints under `app/api/user/api-keys/` support list (masked metadata only), create/upsert, deactivate, and delete per provider. List responses never include ciphertext or plaintext.
8. [AC8] **Settings UI:** Users manage keys from Settings (anchor: existing AI settings area per UX spec). Masked input, provider picker filtered by tier, inline validation feedback, and persistent “BYOK active” badge when ≥1 key is active.
9. [AC9] **Usage meter CTA:** When Discovery Pack is exhausted, the sidebar `UsageMeter` upgrade modal includes a secondary action linking to BYOK settings (i18n, all 15 locale files).
10. [AC10] **Host-pays + BYOK:** In brainstorm routes, BYOK and quota bypass use **`billingOwnerId`** (session host), not the guest's personal keys — guest collaboration must not unlock AI via guest BYOK while host quota is empty.
11. [AC11] **Regression:** Stories 3.1 to 3.4 behavior unchanged when user has no BYOK. Non-AI routes unaffected. Admin system keys in env/admin settings remain the fallback.
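
The tier gating in AC2 could be expressed as a small lookup. This is a minimal sketch, not the project's implementation: the `Tier` union, the `PRO_PROVIDERS` constant, the `canSaveByokKey` helper, and the extra `allProviders` parameter (standing in for `VALID_PROVIDERS` from `lib/ai/router.ts`, whose contents are not shown in this story) are assumptions made for illustration.

```typescript
// Sketch only: tier names and the PRO list come from AC2; the rest is hypothetical.
type Tier = "BASIC" | "PRO" | "BUSINESS" | "ENTERPRISE";

// Providers a PRO user may configure, as listed in AC2.
const PRO_PROVIDERS: readonly string[] = [
  "openai", "anthropic", "deepseek", "openrouter", "minimax", "zai",
];

// `allProviders` stands in for VALID_PROVIDERS from lib/ai/router.ts.
function getAllowedByokProviders(tier: Tier, allProviders: readonly string[]): string[] {
  switch (tier) {
    case "BASIC":
      return []; // BASIC users cannot save BYOK keys
    case "PRO":
      // Intersect so a provider unknown to the router is never offered.
      return PRO_PROVIDERS.filter((p) => allProviders.includes(p));
    case "BUSINESS":
    case "ENTERPRISE":
      return [...allProviders];
  }
}

// Hypothetical guard a POST handler could call before validating the key.
function canSaveByokKey(tier: Tier, provider: string, allProviders: readonly string[]): boolean {
  return getAllowedByokProviders(tier, allProviders).includes(provider);
}
```

A route handler would return the 403 `TIER_LIMITED` response when `canSaveByokKey` is false, before any provider validation call is made.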
---
## Tasks / Subtasks
- [ ] Task 1: Schema & crypto foundation (AC: #1, #2)
- [ ] Subtask 1.1: Add Prisma `UserAPIKey` model (see Dev Notes); `@@unique([userId, provider])`; cascade delete on `User`
- [ ] Subtask 1.2: Add `MASTER_ENCRYPTION_KEY` to `.env.example` with generation instructions (min 32 bytes entropy; never commit real value)
- [ ] Subtask 1.3: Create `lib/crypto.ts`: `encryptApiKey`, `decryptApiKey`, `hashApiKey` (AES-256-GCM + scrypt key derivation per patch spec)
- [ ] Subtask 1.4: Non-destructive migration: `npx prisma migrate dev` (backup per `CLAUDE.md` before apply)
- [ ] Task 2: BYOK domain layer (AC: #2, #4, #5, #10)
- [ ] Subtask 2.1: Create `lib/byok.ts`: `getAllowedByokProviders(tier)`, `getActiveByokKey(userId, provider)`, `hasAnyActiveByok(userId)`, `resolveByokApiKey(userId, providerType)`
- [ ] Subtask 2.2: Map Prisma `provider` string ↔ `AiGatewayProvider` / factory `ProviderType` (single source of truth; avoid duplicate enums)
- [ ] Subtask 2.3: Extend `canUseFeature` / `checkEntitlementOrThrow` — BYOK bypass + set `byokConfigured` from `hasAnyActiveByok`
- [ ] Subtask 2.4: Extend `checkSessionEntitlementOrThrow` — pass `billingOwnerId` into BYOK checks (host-pays)
- [ ] Task 3: Factory & fallback integration (AC: #4, #6)
- [ ] Subtask 3.1: Add `resolveProviderConfig(userId, baseConfig)` helper that overlays `OPENAI_API_KEY`, `DEEPSEEK_API_KEY`, etc. when BYOK active
- [ ] Subtask 3.2: Update `getChatProvider`, `getTagsProvider`, `getEmbeddingsProvider` to accept optional `{ billingUserId?: string }` OR centralize in a thin `getAiProviderForUser(lane, config, billingUserId)` wrapper — **one choke-point** (Story 3.2 AC4)
- [ ] Subtask 3.3: Wire `withAiProviderFallback(..., { skipSystemFallback: true })` at all call sites when BYOK active (chat, tags, embeddings, brainstorm create/expand/enrich)
- [ ] Subtask 3.4: Skip `incrementUsageAsync` when call used BYOK (thread `usedByok` flag from provider resolution)
- [ ] Task 4: API routes (AC: #3, #7)
- [ ] Subtask 4.1: `GET /api/user/api-keys` — list `{ provider, alias, model, isActive, lastUsedAt }` only
- [ ] Subtask 4.2: `POST /api/user/api-keys` — validate tier, validate key, encrypt, upsert
- [ ] Subtask 4.3: `DELETE /api/user/api-keys/[provider]` and `PATCH` deactivate
- [ ] Subtask 4.4: Provider-specific validators in `lib/byok/validate-key.ts` (minimal HTTP ping per provider family: OpenAI-compatible vs Anthropic)
- [ ] Task 5: Wire existing AI surfaces (AC: #4, #5, #10, #11)
- [ ] Subtask 5.1: `app/api/chat/route.ts` — pass `session.user.id` into provider resolution
- [ ] Subtask 5.2: `app/api/ai/tags/route.ts`, `title-suggestions/route.ts` — same
- [ ] Subtask 5.3: Brainstorm routes — use `billingOwnerId` for BYOK + entitlement (not `session.user.id` for guests)
- [ ] Subtask 5.4: Audit other `getTagsProvider` / `getChatProvider` call sites (agents, semantic search, reformulate) — apply same pattern or document deferral in Dev Agent Record
- [ ] Task 6: UI & i18n (AC: #8, #9)
- [ ] Subtask 6.1: BYOK panel component under `app/(main)/settings/ai/` or dedicated `settings/byok` linked from AI settings
- [ ] Subtask 6.2: Sidebar/header badge when BYOK active (UX spec: lock icon + “BYOK active”)
- [ ] Subtask 6.3: `UsageMeter` — add “Add API key” button beside upgrade CTA
- [ ] Subtask 6.4: i18n keys in **all 15** `memento-note/locales/*.json` (FR/EN reference content)
- [ ] Task 7: Tests (AC: all)
- [ ] Subtask 7.1: `tests/unit/crypto.test.ts` — round-trip encrypt/decrypt, wrong key fails
- [ ] Subtask 7.2: `tests/unit/byok-entitlements.test.ts` — quota exhausted + BYOK → allowed; increment skipped
- [ ] Subtask 7.3: `tests/unit/byok-factory.test.ts` — config overlay injects user key
- [ ] Subtask 7.4: `tests/unit/brainstorm-billing.test.ts` — extend: host BYOK bypasses guest-empty-quota scenario
- [ ] Subtask 7.5: Run targeted vitest + `npm run build` in `memento-note/`
---
## Dev Notes
### Epic context
| Story | Relevance to 3.5 |
|-------|------------------|
| 3.1 | `canUseFeature`, `QuotaExceededError`, `byokConfigured` stub always `false`; **implement here** |
| 3.2 | Single choke-point: `factory.ts` + `router.ts` — inject BYOK keys into config overlay, do not fork routing logic |
| 3.3 | `skipSystemFallback` already defined — **wire when BYOK active** |
| 3.4 | Host-pays: BYOK checks on `billingOwnerId`, not guest `session.user.id` |
| 3.6 | Stripe tier changes may revoke provider list — BASIC downgrade deactivates disallowed keys (minimal: reject new saves; optional: `isActive=false` on disallowed rows) |
### Critical brownfield reality
**Nothing BYOK exists in production schema today:**
- No `UserAPIKey` in `prisma/schema.prisma` (only `UserAISettings` for toggles, unrelated to API keys).
- No `lib/crypto.ts`.
- `byokConfigured` is hardcoded to `false` on the errors thrown by `checkEntitlementOrThrow`.
- Extension seams are pre-placed:
```4:8:memento-note/lib/ai/router.ts
* Future (Story 3.5 BYOK): plug user-scoped API keys into resolveAiRoute output / factory instantiation.
* ...
* - BYOK / UserAPIKey decryption → Story 3.5
```
```207:210:memento-note/lib/ai/fallback.ts
export interface WithAiProviderFallbackOptions {
/** Story 3.5: skip system secondary when user BYOK is active */
skipSystemFallback?: boolean
}
```
**Do not** paste the full aspirational `executeLLM` / `PROVIDER_FALLBACK_CHAIN` loop from `memento-note/docs/byok-billing-patch-v3.md` — implement AC scope only; reuse existing router + fallback from 3.2/3.3.
### Recommended Prisma model (adapt to project conventions)
```prisma
model UserAPIKey {
id String @id @default(cuid())
userId String
provider String // matches AiGatewayProvider lowercase, e.g. "openai"
alias String @default("")
encryptedKey String
keyHash String
model String?
isActive Boolean @default(true)
lastUsedAt DateTime?
lastUsedFor String?
createdAt DateTime @default(now())
updatedAt DateTime @updatedAt
user User @relation(fields: [userId], references: [id], onDelete: Cascade)
@@unique([userId, provider])
@@index([userId])
@@index([keyHash])
}
```
Add `userApiKeys UserAPIKey[]` to `User` model.
**Out of scope for 3.5 (later / optional):** `LLMCallLog`, `AIActiveConfig` tables from patch doc — admin keys already live in env + `app/(admin)/admin/settings`. If PM wants call logging, add a thin optional `console.debug` or defer to analytics story.
### BYOK + factory integration pattern
**Preferred approach (minimal churn):**
1. `resolveAiRoute(lane, config)` unchanged.
2. Before `getProviderInstance`, merge BYOK key into `config` copy:
```typescript
// lib/byok.ts
export async function applyByokToConfig(
billingUserId: string,
providerType: string,
config: Record<string, string>,
): Promise<{ config: Record<string, string>; usedByok: boolean }> {
const byok = await resolveByokApiKey(billingUserId, providerType)
if (!byok) return { config, usedByok: false }
const { apiKeyConfigKey } = getProviderConfigKeys(providerType)
if (!apiKeyConfigKey) return { config, usedByok: false }
return {
config: { ...config, [apiKeyConfigKey]: byok.plaintext },
usedByok: true,
}
}
```
3. Export wrapper:
```typescript
export async function getChatProviderForBillingUser(
config: Record<string, string>,
billingUserId: string,
) {
const route = resolveAiRoute('chat', config)
const { config: cfg, usedByok } = await applyByokToConfig(billingUserId, route.providerType, config)
const provider = getProviderInstance(route.providerType, cfg, route.modelName, route.embeddingModelName, route.ollamaBaseUrl)
return { provider, usedByok, route }
}
```
Reuse `getProviderConfigKeys` from `factory.ts` (already exported).
**Ollama / LM Studio BYOK:** Defer unless product explicitly requires local BYOK — tier lists focus on cloud providers; `ollama` BYOK is non-standard (base URL + no key). If user selects ollama, return clear error in UI.
### Entitlements change (exact behavior)
In `canUseFeature(userId, feature)`:
1. After tier/limit check would deny, call `hasAnyActiveByok(userId)` — if true, return `{ allowed: true, ..., byokConfigured: true }`.
2. When denying, set `byokConfigured: hasAnyActiveByok(userId)`.
In routes, after successful AI with `usedByok === true`, **do not** call `incrementUsageAsync`.
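
The decision described above can be isolated as pure logic, with the Redis and Prisma lookups done by the caller. This is a hedged sketch: `EntitlementInput`, `decideEntitlement`, and `shouldIncrementUsage` are hypothetical names, and the real `canUseFeature` returns more fields than shown here.

```typescript
// Sketch only: inputs are resolved by the caller (Redis quota, hasAnyActiveByok).
interface EntitlementInput {
  quotaRemaining: number;  // remaining quota for the feature
  hasActiveByok: boolean;  // result of hasAnyActiveByok(billingUserId)
}

interface EntitlementResult {
  allowed: boolean;
  byokConfigured: boolean;
  reason?: "QUOTA_EXCEEDED";
}

function decideEntitlement(input: EntitlementInput): EntitlementResult {
  if (input.quotaRemaining > 0) {
    return { allowed: true, byokConfigured: input.hasActiveByok };
  }
  // Step 1: the quota check would deny, but an active BYOK key bypasses it.
  if (input.hasActiveByok) {
    return { allowed: true, byokConfigured: true };
  }
  // Step 2: deny and report the BYOK state for the paywall UX.
  return { allowed: false, byokConfigured: input.hasActiveByok, reason: "QUOTA_EXCEEDED" };
}

// Usage is only counted for successful calls that spent platform keys.
function shouldIncrementUsage(callSucceeded: boolean, usedByok: boolean): boolean {
  return callSucceeded && !usedByok;
}
```

Keeping the decision free of I/O makes the "quota exhausted + BYOK" case easy to unit test without mocking Redis.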
### Files — expected touch list
**NEW**
- `memento-note/lib/crypto.ts`
- `memento-note/lib/byok.ts`
- `memento-note/lib/byok/validate-key.ts` (or inline in byok.ts if small)
- `memento-note/app/api/user/api-keys/route.ts`
- `memento-note/app/api/user/api-keys/[provider]/route.ts`
- `memento-note/components/settings/byok-keys-panel.tsx` (name as fits project)
- `memento-note/tests/unit/crypto.test.ts`
- `memento-note/tests/unit/byok-entitlements.test.ts`
- `memento-note/prisma/migrations/*_add_user_api_key`
**UPDATE**
- `memento-note/prisma/schema.prisma`
- `memento-note/lib/entitlements.ts`
- `memento-note/lib/ai/factory.ts` (wrappers or optional billingUserId param)
- `memento-note/lib/ai/fallback.ts` (call sites only if needed)
- `memento-note/app/api/chat/route.ts`
- `memento-note/app/api/ai/tags/route.ts`
- `memento-note/app/api/ai/title-suggestions/route.ts`
- `memento-note/app/api/brainstorm/route.ts`
- `memento-note/app/api/brainstorm/[sessionId]/expand/route.ts`
- `memento-note/app/api/brainstorm/[sessionId]/manual-idea/route.ts`
- `memento-note/components/usage-meter.tsx`
- `memento-note/app/(main)/settings/ai/page.tsx` or new settings subpage
- `memento-note/locales/*.json` (15 files)
- `memento-note/.env.example`
- `memento-note/tests/unit/brainstorm-billing.test.ts`
**READ BEFORE MODIFY (current state documentation)**
| File | Current state | What 3.5 changes |
|------|---------------|-------------------|
| `lib/entitlements.ts` | Redis quota only; `byokConfigured` always false on throw | BYOK bypass branch + accurate `byokConfigured` |
| `lib/ai/factory.ts` | Keys from env/admin `config` only | Overlay user key into config before `getProviderInstance` |
| `lib/ai/router.ts` | Lane → provider type | Unchanged; BYOK follows resolved `providerType` |
| `lib/ai/fallback.ts` | `skipSystemFallback` exists, unused | Pass `true` when BYOK |
| `lib/brainstorm-collab.ts` | `getBillingOwner` for host-pays | Consumers pass `billingOwnerId` to BYOK resolution |
| `components/usage-meter.tsx` | Upgrade modal only | Add BYOK CTA link |
| `app/(main)/settings/ai/page.tsx` | `AISettingsPanel` toggles only | Host BYOK management panel |
### UX requirements (from `docs/ux-design-specification.md`)
- **Entry:** “Manage keys” from quota sidebar or settings.
- **Input:** Masked secret field; silent validation ping; no full page reload.
- **Feedback:** Persistent badge (lock + “BYOK active”) in sidebar or header when active.
- **Zero-redirect:** Configure BYOK without leaving editor context (settings drawer/page is OK).
- **Exhausted quota modal:** Offer “Upgrade” AND “Add API key” (AC9).
### Security requirements
- **NFR-S1:** AES-256-GCM with unique salt/IV per encryption; auth tag verified on decrypt.
- **Master key:** `process.env.MASTER_ENCRYPTION_KEY`; in production, fail fast at startup if it is missing while BYOK routes are enabled.
- **Logs:** Never log plaintext keys or decrypted values; redact `Authorization` headers in debug.
- **API responses:** Never return `encryptedKey` or partial plaintext (only `••••••` + last 4 optional).
- **Rate limit:** Consider basic rate limit on POST validate (10/min per user) to prevent abuse — lightweight `redis.incr` if pattern exists elsewhere.
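
To make the NFR-S1 requirement concrete, here is a minimal sketch of the three `lib/crypto.ts` helpers using Node's built-in `crypto` module. The byte lengths (16-byte salt, 12-byte IV, 16-byte tag), the default scrypt parameters, and passing the master key as an argument instead of reading `process.env.MASTER_ENCRYPTION_KEY` are assumptions for illustration; the patch spec may fix different values.

```typescript
import { createCipheriv, createDecipheriv, createHash, randomBytes, scryptSync } from "node:crypto";

// Assumed sizes; layout is salt + iv + authTag + ciphertext, base64-encoded.
const SALT_LEN = 16;
const IV_LEN = 12;
const TAG_LEN = 16;

function encryptApiKey(plaintext: string, masterKey: string): string {
  const salt = randomBytes(SALT_LEN); // fresh salt per encryption
  const iv = randomBytes(IV_LEN);     // fresh IV per encryption
  const key = scryptSync(masterKey, salt, 32); // 32 bytes for AES-256
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const authTag = cipher.getAuthTag();
  return Buffer.concat([salt, iv, authTag, ciphertext]).toString("base64");
}

function decryptApiKey(encoded: string, masterKey: string): string {
  const raw = Buffer.from(encoded, "base64");
  const salt = raw.subarray(0, SALT_LEN);
  const iv = raw.subarray(SALT_LEN, SALT_LEN + IV_LEN);
  const authTag = raw.subarray(SALT_LEN + IV_LEN, SALT_LEN + IV_LEN + TAG_LEN);
  const ciphertext = raw.subarray(SALT_LEN + IV_LEN + TAG_LEN);
  const key = scryptSync(masterKey, salt, 32);
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(authTag);
  // final() throws if the auth tag does not verify (wrong key or tampered data).
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}

// SHA-256 of the plaintext, hex-encoded, for the keyHash column.
function hashApiKey(plaintext: string): string {
  return createHash("sha256").update(plaintext, "utf8").digest("hex");
}
```

Because the salt and IV are random, encrypting the same key twice yields different `encryptedKey` values, which is why `keyHash` is stored separately for dedup and lookup.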
### Scope boundaries (do NOT implement in 3.5)
- `LLMCallLog` cost analytics table (patch doc §1.2) — defer
- Full `executeLLM` multi-hop fallback chain: Story 3.3 already provides a single secondary fallback
- `BrainstormContextPool` — separate product story
- GDPR hard delete of keys — Story **4.2** (ensure `onDelete: Cascade` on User for schema readiness)
- Stripe checkout — Story **3.6**
- Socket `error:quota_exceeded` BYOK hints — optional; HTTP 402 + `byokConfigured` is MVP
### Product decisions (document in Dev Agent Record)
| Decision | Recommendation |
|----------|----------------|
| PRO provider list | openai, anthropic, deepseek, openrouter, minimax, zai (per GTM doc) |
| Key validation | Required light ping on save (patch doc Q7) |
| Downgrade Business→Pro | Set `isActive=false` on keys for providers no longer allowed; do not delete ciphertext |
| Multiple keys per provider | `@@unique([userId, provider])` — upsert only |
| Guest BYOK in shared session | **Ignored for billing** — only host BYOK applies (AC10) |
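
The downgrade rule in the table could be computed as below. This is an illustrative sketch under stated assumptions: `StoredKey` and `providersToDeactivate` are hypothetical names, and the allowed-provider list for the new tier is passed in by the caller.

```typescript
// Sketch only: minimal projection of a UserAPIKey row.
interface StoredKey {
  provider: string;
  isActive: boolean;
}

// Returns providers whose keys should be set to isActive=false after a tier
// change. Rows are kept (ciphertext is not deleted), per the decision table.
function providersToDeactivate(
  keys: readonly StoredKey[],
  allowedProviders: readonly string[],
): string[] {
  const allowed = new Set(allowedProviders);
  return keys
    .filter((k) => k.isActive && !allowed.has(k.provider))
    .map((k) => k.provider);
}
```

The returned list could then drive a single update that sets `isActive` to `false` for those providers, leaving `encryptedKey` untouched so the key becomes usable again after a later upgrade.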
### Testing standards
- Vitest unit tests with mocked prisma + redis.
- Crypto tests use fixed `MASTER_ENCRYPTION_KEY` in test env.
- Integration: optional manual test with real DeepSeek test key in dev only — never commit keys.
- Verify: exhausted quota + no BYOK → 402; exhausted + BYOK → 200; BYOK failure → error without system fallback provider call (mock factory).
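
The last verification point (a BYOK failure must not trigger a system fallback call) can be illustrated with a simplified stand-in. `withFallbackSketch` is hypothetical and is not the signature of `withAiProviderFallback`; only the `skipSystemFallback` option name comes from `lib/ai/fallback.ts`.

```typescript
// Sketch only: a reduced model of "primary, then one system secondary".
interface FallbackOptions {
  skipSystemFallback?: boolean;
}

async function withFallbackSketch<T>(
  primary: () => Promise<T>,
  secondary: () => Promise<T>,
  options: FallbackOptions = {},
): Promise<T> {
  try {
    return await primary();
  } catch (err) {
    // With BYOK, surface the user-key failure instead of spending platform quota.
    if (options.skipSystemFallback) throw err;
    return secondary();
  }
}
```

In a unit test, the secondary would be a mock whose call count is asserted to stay at zero when `skipSystemFallback` is `true`.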
---
## Dev Agent Guardrails
### Technical requirements
- **Database:** Backup before migration (`CLAUDE.md` — `pg_dump` to `/tmp/`). Use `prisma migrate dev` only — never `migrate reset`.
- **Performance:** BYOK resolution = 1 Prisma `findFirst` by `[userId, provider, isActive]` — cache optional later; keep <10ms with index.
- **Fail-open Redis:** If Redis down, existing fail-open remains; BYOK bypass is independent of Redis.
- **402 body:** Preserve existing `QuotaExceededError.toJSON()` shape; `byokConfigured: true` enables frontend BYOK CTA.
### Architecture compliance
- Brownfield Next.js under `memento-note/`.
- BYOK is **billing + credentials** — not routing policy (stay in `byok.ts` + `factory.ts`, not duplicate logic in every route).
- i18n: zero hardcoded UI strings.
### Library / framework requirements
- Node `crypto` module for AES-256-GCM (no new dependency unless team prefers `@noble/ciphers` — default to Node built-in per patch doc).
- Reuse `getProviderConfigKeys` from `factory.ts` for key env mapping.
- Provider validation: use existing provider clients where possible (minimal fetch).
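
For the "minimal fetch" validation, one possible shape is a pure request builder per provider family plus a thin async wrapper. This is a sketch under assumptions: `ProviderFamily`, `FetchLike`, `buildValidationRequest`, and `validateKey` are hypothetical names, the caller supplies the full models-list URL, and any non-2xx response is treated as invalid (a real validator may want to distinguish 401 from transient errors).

```typescript
// Sketch only: header conventions for the two families named in Subtask 4.4.
type ProviderFamily = "openai-compatible" | "anthropic";

interface ValidationRequest {
  url: string;
  headers: Record<string, string>;
}

// Minimal fetch shape so the wrapper can be tested without network access.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string> },
) => Promise<{ ok: boolean }>;

function buildValidationRequest(
  family: ProviderFamily,
  modelsUrl: string, // e.g. "https://api.openai.com/v1/models"
  apiKey: string,
): ValidationRequest {
  if (family === "anthropic") {
    // Anthropic authenticates with x-api-key plus a version header.
    return { url: modelsUrl, headers: { "x-api-key": apiKey, "anthropic-version": "2023-06-01" } };
  }
  // OpenAI-compatible APIs use a bearer token.
  return { url: modelsUrl, headers: { Authorization: `Bearer ${apiKey}` } };
}

async function validateKey(
  family: ProviderFamily,
  modelsUrl: string,
  apiKey: string,
  fetchImpl: FetchLike,
): Promise<boolean> {
  const { url, headers } = buildValidationRequest(family, modelsUrl, apiKey);
  const res = await fetchImpl(url, { method: "GET", headers });
  return res.ok; // never log the headers: they contain the plaintext key
}
```

In a route, the global `fetch` can be passed as `fetchImpl`; in tests, a stub keeps the validator offline and deterministic.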
### File structure requirements
- `lib/crypto.ts` — encryption only (no business logic).
- `lib/byok.ts` — domain rules, tier maps, DB access.
- API routes under `app/api/user/api-keys/` (user-scoped, not admin).
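
Since list responses must never expose ciphertext, the `GET` handler could map rows through an explicit allow-list instead of spreading the database row. A minimal sketch, assuming a row shape that mirrors the recommended `UserAPIKey` model; `UserApiKeyRow`, `ApiKeyListItem`, and `toApiKeyListItem` are hypothetical names.

```typescript
// Sketch only: row shape mirrors the recommended Prisma model.
interface UserApiKeyRow {
  id: string;
  userId: string;
  provider: string;
  alias: string;
  encryptedKey: string;
  keyHash: string;
  model: string | null;
  isActive: boolean;
  lastUsedAt: Date | null;
}

// The only fields Subtask 4.1 allows in list responses.
interface ApiKeyListItem {
  provider: string;
  alias: string;
  model: string | null;
  isActive: boolean;
  lastUsedAt: string | null;
}

// Copy fields one by one so new sensitive columns are never leaked by default.
function toApiKeyListItem(row: UserApiKeyRow): ApiKeyListItem {
  return {
    provider: row.provider,
    alias: row.alias,
    model: row.model,
    isActive: row.isActive,
    lastUsedAt: row.lastUsedAt ? row.lastUsedAt.toISOString() : null,
  };
}
```

Selecting only these columns in the Prisma query as well would keep `encryptedKey` out of application memory on the list path.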
---
## Previous Story Intelligence
**Source:** `docs/3-4-host-pays-session-logic.md`
- `getBillingOwner` / `billingOwnerFromSession` implemented; use `billingOwnerId` for entitlement + BYOK.
- `checkSessionEntitlementOrThrow` attaches guest metadata to 402.
- Explicit seam: *"Story 3.5: skip quota when host has active BYOK"* — implement now in entitlements + brainstorm routes.
- `withAiProviderFallback` on brainstorm paths — add `skipSystemFallback` when host BYOK.
**Source:** `docs/3-3-smart-routing-fallback.md`
- Do not add tertiary fallback chains.
- `skipSystemFallback` stub exists — wire it.
**Source:** `docs/3-2-custom-llm-router.md`
- AC4 single choke-point: extend factory wrappers, do not add parallel routing paths.
**Source:** `docs/3-1-freemium-quota-tracking.md`
- Deferred: `byokConfigured` always false — **fix in 3.5**.
- 402 pattern established across chat/tags.
---
## Git Intelligence Summary
| Commit | Insight |
|--------|---------|
| `1fcea6e` | Brainstorm + embeddings active — BYOK must cover brainstorm billing owner paths |
| `41596c2` | OpenRouter key env fallback pattern — BYOK overlay same config keys |
| `195e845` | Security-conscious patterns — treat API keys as secrets |
---
## Latest Technical Information
- **Node.js `crypto` (2024+):** `createCipheriv('aes-256-gcm', ...)` + `scryptSync` for key derivation remains standard; no deprecated APIs for this use case.
- **Prisma:** Use `upsert` with `@@unique([userId, provider])` for key rotation without duplicate rows.
- **AI SDK:** Existing routes use Vercel AI SDK `generateText` — BYOK only changes provider instance credentials, not stream shape.
---
## Project Context Reference
| Document | Use |
|----------|-----|
| `docs/epics.md` | Story 3.5 AC + FR14 |
| `docs/prd.md` | BYOK journey, NFR-S1, NFR-P3 |
| `docs/ux-design-specification.md` | BYOK UX flows, badge, settings placement |
| `memento-note/docs/byok-billing-patch-v3.md` | Aspirational reference — **do not implement wholesale** |
| `docs/3-4-host-pays-session-logic.md` | billingOwnerId for BYOK |
| `docs/3-3-smart-routing-fallback.md` | skipSystemFallback |
| `docs/3-2-custom-llm-router.md` | Choke-point |
| `docs/3-1-freemium-quota-tracking.md` | Entitlements baseline |
| `docs/gtm-pricing-strategy.md` | PRO vs BUSINESS BYOK provider lists |
| `CLAUDE.md` | Database safety |
---
## Dev Agent Record
### Agent Model Used
{{agent_model_name_version}}
### Debug Log References
### Completion Notes List
### File List
---
## Story Completion Status
- Story ID: 3.5
- Story Key: `3-5-secure-byok-management`
- File: `docs/3-5-secure-byok-management.md`
- Status: **review**
- Completion Note: Implementation complete, pending the Prisma migration (backup required per CLAUDE.md). Covers UI, API, entitlements, AI/brainstorm wiring, tests, and i18n (15 locales).