feat: brainstorm sessions, PDF document Q&A, embedding fixes, and UI improvements

- Add brainstorm feature with collaborative canvas, AI idea generation, live cursors, playback, and export
- Add PDF upload/extraction/ingestion pipeline with pgvector document search (RAG)
- Add document Q&A overlay with streaming chat and PDF preview
- Add note attachments UI with status polling, grid layout, and auto-scroll
- Add task extraction AI tool and agent executor improvements
- Fix NoteEmbedding missing updatedAt column, re-index 66 notes with 1536-dim embeddings
- Fix brainstorm 'Create Note' button: add success toast and redirect to created note
- Fix memory echo notification infinite polling
- Fix chat route to always include document_search tool
- Add brainstorm i18n keys across all 14 locales
- Add socket server for real-time brainstorm collaboration
- Add hierarchical notebook selector and organize notebook dialog improvements
- Add sidebar brainstorm section with session management
- Update prisma schema with brainstorm tables, attachments, and document chunks
Antigravity
2026-05-14 17:43:21 +00:00
parent 195e845f0a
commit 1fcea6ed7d
228 changed files with 57656 additions and 1059 deletions


@@ -0,0 +1,231 @@
# Story 3.1: Freemium "AI Discovery Pack" Quota Tracking
Status: ready-for-dev
---
## Story
As a business,
I want to track Freemium usage against a Redis quota limit,
So that I can limit my API cost exposure for free users.
**Given** a free user triggers an AI request
**When** the system intercepts the request
**Then** the quota is tracked and the UI updates
**And** (NFR-SC2) the Redis-backed check resolves in under 10ms.
---
## Epic Context
**Epic 3:** The SaaS Commercial Engine (Monetization & API Cost Protection)
**Epic Business Value:** The core backend logic allowing us to sell the product without bleeding API costs — freemium limits, router fallback, host-pays.
**Cross-Story Dependencies:**
- Story 3.1 (this story) establishes the quota tracking foundation
- Story 3.2 builds on this with the LLM Router and provider routing
- Story 3.3 adds smart-routing fallback when quota is low
- Story 3.4 (Host-Pays) extends quota tracking to collaborative sessions
- Story 3.5 (BYOK) bypasses quota for users with their own API keys
- Story 3.6 (Stripe) manages tier upgrades
**Technical Constraint:** (NFR-SC2) Redis-backed entitlement checks must complete in under 10ms.
---
## Acceptance Criteria
1. [AC1] When a free user (BASIC tier) makes an AI request, the system checks Redis for current usage count
2. [AC2] Each AI feature (semantic_search, auto_tag, auto_title) has its own Redis counter with format: `usage:{userId}:{feature}:{YYYY-MM}`
3. [AC3] Counter increments atomically via Redis INCRBY (not read-modify-write) to avoid race conditions
4. [AC4] If counter >= limit, return HTTP 402 with body `{ error: "QUOTA_EXCEEDED", feature, upgradeTier: "PRO", byokConfigured }`
5. [AC5] If counter < limit, allow request to proceed and increment counter asynchronously (fire-and-forget)
6. [AC6] Redis keys have 90-day TTL to auto-cleanup (covers grace period for monthly reconciliation)
7. [AC7] The Sidebar footer displays a usage gauge component showing Discovery Pack consumption in real-time
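The key layout in AC2, the TTL in AC6, and the 402 body in AC4 can be sketched as below. This is a minimal illustration, not the final `lib/entitlements.ts`: `buildUsageKey`, `quotaExceededBody`, and `USAGE_KEY_TTL_SECONDS` are hypothetical names, and only the key format, the 90-day TTL, and the response fields come from the criteria above.

```typescript
// Feature identifiers as listed in AC2.
type Feature = 'semantic_search' | 'auto_tag' | 'auto_title';

// YYYY-MM derived from the UTC timestamp, so all servers agree on the period.
function getCurrentPeriodKey(now: Date = new Date()): string {
  return now.toISOString().slice(0, 7);
}

// usage:{userId}:{feature}:{YYYY-MM}
function buildUsageKey(userId: string, feature: Feature, now: Date = new Date()): string {
  return `usage:${userId}:${feature}:${getCurrentPeriodKey(now)}`;
}

// 90 days expressed in seconds, the unit Redis EXPIRE expects.
const USAGE_KEY_TTL_SECONDS = 90 * 24 * 60 * 60;

// Shape of the HTTP 402 response body described in AC4.
interface QuotaExceededBody {
  error: 'QUOTA_EXCEEDED';
  feature: Feature;
  upgradeTier: 'PRO';
  byokConfigured: boolean;
}

function quotaExceededBody(feature: Feature, byokConfigured: boolean): QuotaExceededBody {
  return { error: 'QUOTA_EXCEEDED', feature, upgradeTier: 'PRO', byokConfigured };
}
```

Passing `now` explicitly keeps the helpers deterministic in unit tests; production callers would rely on the default.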
---
## Tasks / Subtasks
- [ ] Task 1: Add Prisma models for Subscription, UsageLog (AC: #1)
- [ ] Subtask 1.1: Add `Subscription` model with tier/stripe fields
- [ ] Subtask 1.2: Add `UsageLog` model for PostgreSQL sync target
- [ ] Subtask 1.3: Create migration file
- [ ] Task 2: Create `lib/entitlements.ts` with Redis-backed `canUseFeature()` (AC: #2, #3, #4)
- [ ] Subtask 2.1: Implement `getCurrentPeriodKey()` returning `YYYY-MM` format
- [ ] Subtask 2.2: Implement Redis key format: `usage:{userId}:{feature}:{YYYY-MM}`
- [ ] Subtask 2.3: Implement atomic `canUseFeature()` with < 10ms target (use Redis GET + pipeline INCRBY)
- [ ] Subtask 2.4: Return `QuotaExceededError` when limit exceeded
- [ ] Task 3: Create `lib/usage-tracker.ts` with `trackFeatureUsage()` (AC: #5)
- [ ] Subtask 3.1: Fire-and-forget Redis pipeline increment
- [ ] Subtask 3.2: Include tokensUsed in metadata
- [ ] Task 4: Create `/api/usage/current` endpoint (AC: #7)
- [ ] Subtask 4.1: Return remaining quota for all features for authenticated user
- [ ] Task 5: Create `<UsageMeter>` UI component for Sidebar footer (AC: #7)
- [ ] Subtask 5.1: Show progress bar for Discovery Pack (semantic_search: 30 lifetime, auto_tag: 20, auto_title: 10)
- [ ] Subtask 5.2: Real-time updates via React Query polling every 30s
- [ ] Subtask 5.3: Show "Upgrade to Pro" paywall modal on 402 response
- [ ] Task 6: Create CRON sync worker `/api/cron/sync-usage` (AC: #6)
- [ ] Subtask 6.1: Batch sync Redis counters → PostgreSQL UsageLog
- [ ] Subtask 6.2: Handle monthly reset (new period = reset counters for that user/feature)
---
## Dev Notes
### Project Structure Notes
- **Code base:** `memento-note/` (Next.js app with App Router)
- **Redis client:** Currently `memento-note/lib/rate-limit.ts` uses in-memory Map (INSUFFICIENT). This story replaces it with actual Redis.
- **No existing `lib/entitlements.ts`** — creates from scratch
- **No existing `lib/usage-tracker.ts`** — creates from scratch
- **Sidebar:** `memento-note/components/sidebar.tsx` is the anchor for the UsageMeter UI
- **AI Factory:** `memento-note/lib/ai/factory.ts` has provider creation logic — quota checks must integrate BEFORE provider resolution
### Files to CREATE (NEW):
```
memento-note/lib/entitlements.ts # Core quota check logic
memento-note/lib/usage-tracker.ts # Track usage via Redis
memento-note/lib/redis.ts # Redis client singleton
memento-note/app/api/usage/current/route.ts # GET current quota
memento-note/app/api/cron/sync-usage/route.ts # CRON sync Redis→PG
memento-note/components/usage-meter.tsx # UI component
```
### Files to MODIFY (UPDATE):
```
memento-note/prisma/schema.prisma # Add Subscription, UsageLog models
memento-note/components/sidebar.tsx # Add UsageMeter to footer
memento-note/middleware.ts # Add 402 handling for quota
memento-note/app/api/chat/route.ts # Add canUseFeature() before AI call
```
### Testing Standards
- Unit tests for `canUseFeature()` with mocked Redis
- Unit tests for `trackFeatureUsage()` with Redis pipeline verification
- Integration tests for 402 response flow
- UI tests: verify UsageMeter renders correct progress
---
## Dev Agent Guardrails
### Technical Requirements
- **Redis client:** Use `ioredis` from server-side code only (never a client-side SDK); `@upstash/redis` is not in `package.json`. Redis is self-hosted via `docker-compose.yml` (see `saas-deployment-prep.md` Section I).
- **NFR-SC2 (< 10ms):** Use Redis GET + pipeline INCRBY — no read-modify-write patterns.
- **Atomic counters:** Always use `redis.pipeline().incrby()` not `GET → compute → SET`.
- **Async tracking:** `trackFeatureUsage()` must be fire-and-forget — do NOT await in the hot path.
- **Period key:** Use `new Date().toISOString().slice(0, 7)` for `YYYY-MM` format — UTC, not local.
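A minimal sketch of the check-then-track flow these requirements describe, assuming the check and the increment live in separate functions as AC5 suggests. `CounterStore` and `InMemoryStore` are hypothetical helpers introduced only so the example runs without a Redis server; the three method shapes mirror the `get`, `incrby`, and `expire` commands of a Redis client, and the real signatures of `canUseFeature()` and `trackFeatureUsage()` may differ.

```typescript
// Minimal command surface used below. Assumption: the client exported from
// lib/redis.ts is structurally compatible with this interface.
interface CounterStore {
  get(key: string): Promise<string | null>;
  incrby(key: string, increment: number): Promise<number>;
  expire(key: string, seconds: number): Promise<number>;
}

// Hot-path check: a single GET and no write.
async function canUseFeature(store: CounterStore, key: string, limit: number): Promise<boolean> {
  const raw = await store.get(key);
  const used = raw === null ? 0 : Number.parseInt(raw, 10);
  return used < limit;
}

// Fire-and-forget: the promise chain starts here but the caller never awaits it.
function trackFeatureUsage(
  store: CounterStore,
  key: string,
  ttlSeconds: number,
  onError: (err: unknown) => void = () => {},
): void {
  void store
    .incrby(key, 1) // the increment itself is atomic on the Redis side
    .then(() => store.expire(key, ttlSeconds))
    .catch(onError);
}

// In-memory stand-in for tests; TTL handling is not simulated.
class InMemoryStore implements CounterStore {
  private counts = new Map<string, number>();

  async get(key: string): Promise<string | null> {
    const value = this.counts.get(key);
    return value === undefined ? null : String(value);
  }

  async incrby(key: string, increment: number): Promise<number> {
    const next = (this.counts.get(key) ?? 0) + increment;
    this.counts.set(key, next);
    return next;
  }

  async expire(_key: string, _seconds: number): Promise<number> {
    return 1;
  }
}
```

Because the check and the increment are separate commands, concurrent requests can overshoot the limit by a few calls. If a hard cap matters, incrementing first and comparing the returned value is one alternative.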
### Architecture Compliance
- **Starter Pack limits (from saas-deployment-prep.md):**
- BASIC: `semanticSearch: 30, autoTag: 20, autoTitle: 10` (lifetime, not monthly)
- PRO: `semanticSearch: 100, autoTag: 200, autoTitle: 200, reformulate: 50, chat: 100` (monthly)
- BUSINESS: `semanticSearch: 1000, autoTag: 1000, autoTitle: 1000, reformulate: 500, chat: 1000` (monthly)
- **Redis key format:** `usage:{userId}:{feature}:{YYYY-MM}` with 90-day TTL
- **CRON sync interval:** Every 5 minutes via Vercel Cron or node-cron
- **PostgreSQL sync:** Use UPSERT (`INSERT ... ON CONFLICT ... DO UPDATE`) via the Prisma model-level `upsert()` method (Prisma Client has no `$upsert()`)
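The limits above could be captured in a single lookup table, sketched here under stated assumptions: `TIER_PLANS` and `remainingQuota` are hypothetical names, and treating a feature that a tier does not list as a limit of 0 is an assumption rather than something the spec states.

```typescript
type Tier = 'BASIC' | 'PRO' | 'BUSINESS';
type Feature = 'semanticSearch' | 'autoTag' | 'autoTitle' | 'reformulate' | 'chat';

interface TierPlan {
  period: 'lifetime' | 'monthly';
  limits: Partial<Record<Feature, number>>;
}

// Values copied from the Starter Pack limits listed above.
const TIER_PLANS: Record<Tier, TierPlan> = {
  BASIC: { period: 'lifetime', limits: { semanticSearch: 30, autoTag: 20, autoTitle: 10 } },
  PRO: {
    period: 'monthly',
    limits: { semanticSearch: 100, autoTag: 200, autoTitle: 200, reformulate: 50, chat: 100 },
  },
  BUSINESS: {
    period: 'monthly',
    limits: { semanticSearch: 1000, autoTag: 1000, autoTitle: 1000, reformulate: 500, chat: 1000 },
  },
};

// Remaining calls for a feature, never negative.
// Assumption: a feature missing from a tier is treated as not included (limit 0).
function remainingQuota(tier: Tier, feature: Feature, used: number): number {
  const limit = TIER_PLANS[tier].limits[feature] ?? 0;
  return Math.max(0, limit - used);
}
```

Keeping `period` next to the limits makes the lifetime-versus-monthly distinction explicit, which matters because the Redis key format carries a `YYYY-MM` suffix while BASIC limits are lifetime.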
### Library / Framework Requirements
- **Redis:** `docker-compose.yml` with `redis:7-alpine` (self-hosted, no Upstash cost)
- **Prisma:** Already in use (`@prisma/client@5.22.0`)
- **React Query:** Already in use (`@tanstack/react-query@5.100.9`) for UsageMeter polling
- **AI SDK:** Already in use (`ai@6.0.23`) for provider calls
### File Structure Requirements
- Follow existing patterns in `memento-note/lib/` — TypeScript files, no default exports
- Use `import { redis } from '@/lib/redis'` singleton pattern (see existing `lib/prisma.ts`)
- API routes follow Next.js App Router: `app/api/[resource]/route.ts`
### Testing Requirements
- Use `vitest` (already configured)
- Mock Redis with `vi.mock('ioredis')`
- No database tests — mock Prisma with `vi.mock('@prisma/client')`
- Target 80% coverage for `entitlements.ts`
---
## Previous Story Intelligence
N/A — This is the first story in Epic 3.
---
## Git Intelligence Summary
**Last 5 commits on modified paths:**
| Commit | Change |
|--------|--------|
| `195e845` | security: fix SQL injection in semantic search - use parameterized queries with bind params |
| `ff664f7` | fix: add missing await on reciprocalRankFusion call |
| `41596c2` | fix: openrouter provider fallback to CUSTOM_OPENAI_API_KEY |
| `cf2786d` | feat: migrate semantic search to pgvector + full-text search |
| `330c0c6` | feat: integrate Google Gemini, MiniMax, and GLM providers |
**Key insight:** Recent commits show security hardening on SQL queries — quota tracking must use parameterized queries for any SQL, and Redis must be used for the fast path.
---
## Latest Technical Information
**Redis Self-Hosted (from `saas-deployment-prep.md`):**
```yaml
# docker-compose.yml
redis:
  image: redis:7-alpine
  command: redis-server --requirepass ${REDIS_PASSWORD} --appendonly yes --maxmemory 256mb --maxmemory-policy allkeys-lru
  volumes:
    - redis_data:/data
  ports:
    - "127.0.0.1:6379:6379"
```
**@upstash/redis** package is NOT in package.json — use `ioredis` instead (already a dependency of socket.io):
```typescript
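// lib/redis.ts
import Redis from 'ioredis';

// Shared client. lazyConnect defers the connection until the first command
// is sent (or until connect() is called explicitly).
const redis = new Redis({
  host: process.env.REDIS_HOST ?? 'localhost',
  port: Number.parseInt(process.env.REDIS_PORT ?? '6379', 10),
  password: process.env.REDIS_PASSWORD,
  lazyConnect: true,
});

export { redis };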
```
**Subscription tiers** defined in `saas-deployment-prep.md` Section B:
- BASIC: free with "AI Discovery Pack" (lifetime limits)
- PRO: €9.90/month (monthly limits)
- BUSINESS: €29.90/month (higher monthly limits)
- ENTERPRISE: €49.90 + €3.90/user/month
---
## Project Context Reference
- **PRD:** `docs/prd.md` — Product Requirements Document with full business context
- **Architecture:** `memento-note/docs/saas-deployment-prep.md` — V3 SaaS architecture including Redis quota system
- **BYOK/Billing:** `memento-note/docs/byok-billing-patch-v3.md` — Full technical spec for quota + BYOK + host-pays
- **UX Spec:** `docs/ux-design-specification.md` — Usage meter in sidebar footer (Section: Emplacement Quotas)
- **Epics:** `docs/epics.md` — Epic 3 with Story 3.1 requirements
---
## Story Completion Status
- Status: ready-for-dev
- Completion Note: Ultimate context engine analysis completed — comprehensive developer guide created
- Story ID: 3.1
- Story Key: 3-1-freemium-quota-tracking

docs/epics.md Normal file

@@ -0,0 +1,252 @@
---
stepsCompleted:
- step-01-validate-prerequisites
- step-02-design-epics
- step-03-create-stories
inputDocuments:
- docs/prd.md
- memento-note/docs/brainstorm-documentation.md
- memento-note/docs/saas-deployment-prep.md
---
# Momento - Epic Breakdown
## Overview
This document provides the complete epic and story breakdown for Momento, focusing strictly on the **Commercial and AI Delta** required to launch the V3 product. Momento is a **Brownfield** project. The baseline functionality (Basic User Auth, Workspaces, CRUD Rich-Text Notes, pgvector database) already exists and is considered out-of-scope for these epics.
## Requirements Inventory
### Functional Requirements
FR4: Users can upload PDF documents and extract text for AI contextual analysis (Chat-with-PDF).
FR5: Users can invoke automated task extraction on any note to generate structured, actionable to-do lists.
FR6: Users can leverage one-shot AI to automatically generate contextual tags and titles for their notes (Auto-Tagging / Auto-Titling).
FR7: The system will proactively detect and surface semantic connections between disconnected notes in the background ("Memory Echo") to stimulate serendipitous discovery.
FR8: Hosts can initialize a real-time collaborative Brainstorm session derived from an existing note.
FR9: Guests can join a shared Brainstorm session via a frictionless sharing link without requiring an account.
FR10: Users can visually map and generate ideas on a multi-directional radial graph (Canvas).
FR11: Users can prompt AI within the shared session to generate specific ideation "Waves" (Variations, Analogies, Disruptions).
FR12: Users can export the completed Brainstorm canvas to structured formats (Markdown, Branded PPTX).
FR13: Users can monitor their remaining "AI Discovery Pack" or subscription usage limits via a real-time UI indicator.
FR14: Users can input, update, and securely store their own third-party LLM API keys (BYOK) to bypass platform limits.
FR15: Hosts can assume financial responsibility (token consumption) for all AI queries executed by guests within their active shared sessions.
FR16: Users can upgrade to paid subscription tiers (Pro, Business, Enterprise) via an integrated payment gateway.
FR17: Users can dynamically switch between supported AI providers (OpenAI, DeepSeek, Gemini, OpenRouter, etc.) for specific generation tasks.
FR18: Administrators can configure smart-routing fallback rules to default to specific models under heavy load or quota exhaustion.
FR20: Enterprise users can authenticate via Single Sign-On (SSO / SAML).
FR21: Workspace administrators can view comprehensive audit logs detailing user access events and specific AI provider utilization.
FR22: Workspace administrators can configure strict data residency requirements (e.g., EU-only storage) for their tenant.
FR23: Administrators can programmatically enforce zero-data-retention headers/flags for all outbound third-party AI API requests.
*(Note: FR1, FR2, FR3 are Baseline and excluded. FR19 is deferred to Phase 2).*
### NonFunctional Requirements
NFR-P1 (Real-Time Latency): Brainstorm Canvas WebSocket events must propagate within 150ms.
NFR-P3 (AI Routing): The internal LLM Router must dispatch the prompt within 50ms.
NFR-S1 (Encryption): BYOK API keys must be encrypted at rest using AES-256-GCM.
NFR-S2 (Data Residency): Architecture must support configurable EU-only database deployments.
NFR-S3 (Auditability): Log 100% of LLM API requests, retaining anonymized metrics for 1 year.
NFR-SC1 (Collaborative Sessions): Session gracefully supports 50 concurrent active users.
NFR-SC2 (Rate Limiting): Redis-backed entitlement and usage checks must complete in under 10ms.
NFR-R1 (Graceful Degradation): LLM Router falls back to secondary provider within 1.5 seconds.
NFR-GDPR1 (Cookie Consent): Granular accept/reject of analytics and tracking cookies.
NFR-GDPR2 (Right to be Forgotten): Hard deletion of all vector data, BYOK keys, and accounts.
NFR-GDPR3 (Data Portability): Secure exports of user data.
NFR-GDPR4 (Explicit Consent): Explicit user consent when processing personal data through AI APIs.
### Additional Requirements
- Next.js App Router for frontend/backend infrastructure
- D3.js required for Brainstorm radial graph
- Dedicated Socket.io server running on port 3002 for low-latency Canvas state
- PostgreSQL (managed via Prisma) for persistent data storage
- Redis for hybrid entitlement/rate-limiting system
- Custom LLM Router (`lib/ai/router.ts`) supporting 13 independent providers, with OpenRouter as the aggregation layer for the long tail.
- Stripe billing integration for subscription tier management
- **"Brownfield" Context**: DO NOT write development stories for basic Workspace Hierarchy, Rich-Text CRUD, or pgvector setup as they already exist. Focus 100% on the COMMERCIAL & AI DELTA.
### FR Coverage Map
FR12: Epic 4 - Data Portability / Export Brainstorm canvas
FR13: Epic 3 - Freemium "AI Discovery Pack" & Redis Quota
FR14: Epic 3 - Secure BYOK Management
FR15: Epic 3 - "Host-Pays" Session Billing Logic
FR16: Epic 3 - Stripe Subscription Tiers
FR17: Epic 3 - Custom LLM Router & OpenRouter integration
FR18: Epic 3 - Smart-routing fallback rules
FR20: Epic 4 - SSO/SAML Integration
FR21: Epic 4 - Enterprise Audit Logging
FR22: Epic 4 - EU Data Residency configuration
FR23: Epic 4 - Zero-data-retention headers
## Epic List
### Epic 3: The SaaS Commercial Engine (Monetization & API Cost Protection)
The core backend logic allowing the product to be sold without bleeding API costs (freemium limits, router fallback, host-pays).
**FRs covered:** FR13, FR14, FR15, FR16, FR17, FR18
### Epic 4: Enterprise Compliance & Privacy (B2B Requirements)
Enforcing B2B sales requirements and European legal requirements (GDPR, Cookie Consent, SSO, EU Data Residency).
**FRs covered:** FR12, FR20, FR21, FR22, FR23
---
## Epic 3: The SaaS Commercial Engine (Monetization & API Cost Protection)
Focus: The core backend logic allowing us to sell the product without bleeding API costs.
### Story 3.1: Freemium "AI Discovery Pack" Quota Tracking
As a business,
I want to track Freemium usage against a Redis quota limit,
So that I can limit my API cost exposure for free users.
**Acceptance Criteria:**
**Given** a free user triggers an AI request
**When** the system intercepts the request
**Then** the quota is tracked and the UI updates
**And** (NFR-SC2) the Redis-backed check resolves in under 10ms.
### Story 3.2: Custom LLM Router & OpenRouter Integration
As a system,
I want to route AI prompts across 13 different providers via OpenRouter,
So that we have extreme flexibility in API fulfillment.
**Acceptance Criteria:**
**Given** an AI request is initiated
**When** the LLM Router intercepts it
**Then** (NFR-P3) the routing logic dispatches the prompt within 50ms to the optimal provider.
### Story 3.3: Smart-Routing Fallback
As a system,
I want to automatically fall back to a secondary provider if the primary fails,
So that users experience zero downtime during external API outages.
**Acceptance Criteria:**
**Given** a primary AI provider returns a 429 or 500 error
**When** the LLM Router detects the failure
**Then** (NFR-R1) the request automatically falls back to the secondary provider within 1.5 seconds.
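The fallback rule in this story can be sketched as follows. This is an illustration under assumptions, not the router in `lib/ai/router.ts`: `Provider`, `statusOf`, and `completeWithFallback` are hypothetical names, errors are assumed to carry a numeric `status` field, and the 1.5-second budget from NFR-R1 is not measured here.

```typescript
// A provider is modelled as an async function from prompt to completion.
type Provider = (prompt: string) => Promise<string>;

// Reads an HTTP-like status from an unknown error value, if one is present.
function statusOf(err: unknown): number | undefined {
  if (typeof err === 'object' && err !== null && 'status' in err) {
    const status = (err as { status: unknown }).status;
    return typeof status === 'number' ? status : undefined;
  }
  return undefined;
}

// Tries the primary provider; on a 429 or 500 it retries once on the secondary.
// Any other failure is rethrown unchanged.
async function completeWithFallback(
  prompt: string,
  primary: Provider,
  secondary: Provider,
): Promise<string> {
  try {
    return await primary(prompt);
  } catch (err) {
    const status = statusOf(err);
    if (status === 429 || status === 500) {
      return secondary(prompt);
    }
    throw err;
  }
}
```

Restricting the fallback to the two status codes named in the acceptance criteria keeps errors such as an invalid API key visible instead of masking them behind the secondary provider.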
### Story 3.4: The "Host-Pays" Session Logic
As a host,
I want my guests' AI actions inside my Canvas session to be billed to my account,
So that my guests never hit a paywall while collaborating with me.
**Acceptance Criteria:**
**Given** a guest triggers an AI Wave in a shared session
**When** the request reaches the LLM Router
**Then** the token consumption is deducted from the Host's quota, leaving the guest's quota untouched.
### Story 3.5: Secure BYOK Management
As an enterprise user,
I want to input and use my own LLM API keys (Bring Your Own Key),
So that I can bypass SaaS quotas.
**Acceptance Criteria:**
**Given** I save my custom API key
**When** it is stored in the database
**Then** (NFR-S1) it is encrypted at rest using AES-256-GCM
**And** the LLM Router prioritizes this key for my requests.
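One way to meet NFR-S1 with the Node.js `crypto` module is sketched below. The function names and the `iv | tag | ciphertext` base64 layout are assumptions made for illustration; the story does not prescribe a storage format, and key management (where the 32-byte key comes from) is out of scope here.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from 'node:crypto';

const IV_BYTES = 12; // 96-bit nonce, the size commonly recommended for GCM
const TAG_BYTES = 16; // default GCM authentication tag length

// Encrypts a BYOK API key. `key` must be exactly 32 bytes for AES-256.
// Output layout (assumed): base64(iv | authTag | ciphertext).
function encryptApiKey(plaintext: string, key: Buffer): string {
  const iv = randomBytes(IV_BYTES); // fresh nonce for every encryption
  const cipher = createCipheriv('aes-256-gcm', key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
  const tag = cipher.getAuthTag();
  return Buffer.concat([iv, tag, ciphertext]).toString('base64');
}

// Reverses encryptApiKey. Throws if the payload was modified or the key is wrong.
function decryptApiKey(payload: string, key: Buffer): string {
  const raw = Buffer.from(payload, 'base64');
  const iv = raw.subarray(0, IV_BYTES);
  const tag = raw.subarray(IV_BYTES, IV_BYTES + TAG_BYTES);
  const ciphertext = raw.subarray(IV_BYTES + TAG_BYTES);
  const decipher = createDecipheriv('aes-256-gcm', key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString('utf8');
}
```

Storing the nonce and tag alongside the ciphertext keeps each record self-contained, so only the master key has to be supplied at decryption time.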
### Story 3.6: Stripe Subscription Tiers
As a user,
I want to upgrade to a paid tier (Pro, Business, Enterprise) via Stripe,
So that I can unlock higher quotas and features.
**Acceptance Criteria:**
**Given** I select an upgrade plan
**When** I complete checkout via the Stripe gateway
**Then** my account capabilities are instantly unlocked.
---
## Epic 4: Enterprise Compliance & Privacy (B2B Requirements)
Focus: B2B sales blockers and European legal requirements.
### Story 4.1: GDPR Cookie Consent Management
As a visitor,
I want to granularly accept or reject analytics and tracking cookies,
So that my ePrivacy rights are respected.
**Acceptance Criteria:**
**Given** I visit the application
**When** the consent banner is shown
**Then** (NFR-GDPR1) I can toggle tracking cookies while strictly necessary cookies remain enforced.
### Story 4.2: GDPR Right to be Forgotten (Hard Deletion)
As a user,
I want the ability to perform a complete, hard deletion of my account,
So that I can exercise my Right to be Forgotten under GDPR.
**Acceptance Criteria:**
**Given** I confirm account deletion
**When** the system processes the request
**Then** (NFR-GDPR2) the system enforces a hard deletion of all my pgvector data, BYOK keys, and user records.
### Story 4.3: Data Portability & Export
As a user,
I want to securely export my workspace data and Brainstorm canvases,
So that I have full portability of my knowledge.
**Acceptance Criteria:**
**Given** I request a data export
**When** the system complies
**Then** (FR12, NFR-GDPR3) I receive a secure download (Markdown/PPTX) of my data.
### Story 4.4: Explicit AI Processing Consent
As a user,
I want to explicitly consent before any of my personal data is sent to a third-party AI,
So that I retain full control over my data privacy.
**Acceptance Criteria:**
**Given** I trigger an AI feature that processes my notes or PDFs
**When** the request is initiated
**Then** (NFR-GDPR4) explicit consent is logged before the data leaves the Momento infrastructure.
### Story 4.5: EU Data Residency Configuration
As an Enterprise Administrator,
I want to configure my tenant for EU-only storage and enforce zero-data-retention on APIs,
So that our data never leaves the EU and is never trained on.
**Acceptance Criteria:**
**Given** an enterprise workspace is created
**When** EU residency is selected
**Then** (NFR-S2) the database deployment is restricted to the EU
**And** (FR23) all outbound API requests enforce zero-data-retention headers.
### Story 4.6: SSO/SAML & Audit Logging
As an Enterprise Administrator,
I want my users to authenticate via SSO and I want access to comprehensive audit logs,
So that I can secure and audit our workspace.
**Acceptance Criteria:**
**Given** I am an enterprise tenant
**When** users log in or perform actions
**Then** (FR20) authentication occurs via SSO/SAML
**And** (FR21, NFR-S3) 100% of LLM API requests and logins are logged and retained for 1 year for SOC2 compliance.

docs/fonctionnalites-ia.md Normal file

@@ -0,0 +1,253 @@
# AI Features: Momento
## Architecture
Three independent AI providers, each configurable separately:
| Tier | Usage | Variables |
|---|---|---|
| **Tags** | Tags, labels, rewriting, title suggestions | `AI_PROVIDER_TAGS`, `AI_MODEL_TAGS` |
| **Embedding** | Semantic search, Memory Echo, similarity | `AI_PROVIDER_EMBEDDING`, `AI_MODEL_EMBEDDING` |
| **Chat** | RAG chat, agents, summaries, vision, brainstorm | `AI_PROVIDER_CHAT`, `AI_MODEL_CHAT` |
**13 supported providers**: OpenAI, Google Gemini, Anthropic, DeepSeek, OpenRouter, Mistral, Ollama, ZAI, LM Studio, Custom OpenAI, MiniMax, GLM, Anthropic Custom.
Fallback hierarchy: Specific config → Tags → Embedding → General → Env.
---
## 1. Notes: Smart Writing
### Title suggestions
- 3 proposals (Direct, Question, Creative) of 3-8 words each
- Automatic detection of the content language
- User toggle `titleSuggestions`
### Automatic contextual tags (IA2)
- Suggests from the notebook's existing labels OR creates new ones
- Confidence > 0.5 (existing labels), > 0.3 (new labels)
- User toggle `autoLabeling`
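The two confidence cut-offs for contextual tags could be applied as in this minimal sketch. `LabelSuggestion` and `filterLabelSuggestions` are hypothetical names, and the strict `>` comparison follows the thresholds as written above.

```typescript
interface LabelSuggestion {
  label: string;
  confidence: number; // model confidence in [0, 1]
  isNew: boolean; // true when the label does not exist in the notebook yet
}

const MIN_CONFIDENCE_EXISTING = 0.5;
const MIN_CONFIDENCE_NEW = 0.3;

// Keeps only suggestions that clear the threshold for their kind.
function filterLabelSuggestions(suggestions: LabelSuggestion[]): LabelSuggestion[] {
  return suggestions.filter((s) =>
    s.confidence > (s.isNew ? MIN_CONFIDENCE_NEW : MIN_CONFIDENCE_EXISTING),
  );
}
```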
### Automatic label creation (IA4)
- Analyzes a notebook (15+ notes) and detects recurring themes
- A theme must appear in 5+ notes with confidence > 0.60
- Max 5 suggestions; labels are marked `type='ai'`
### Paragraph rewriting: 6 modes
- **Clarify**: removes ambiguity
- **Shorten**: condenses by 30-50%
- **Improve style**: vocabulary, structure
- **Fix grammar**: minimal corrections
- **Translate**: into a target language
- **Improve all modes**: the 3 main modes in a single call
### Translation
- Translation into any target language
### Image description (Vision)
- **Description** mode: describes the content (max 100 words)
- **Title** mode: 3 descriptive titles of 3-7 words
- Supports local and remote images
### Notebook suggestion
- Suggests the most appropriate notebook when a note is created
- Analyzes content + existing labels
---
## 2. Semantic search
### Hybrid search (FTS + vector)
- **Phase 1**: PostgreSQL FTS (`tsvector` / `plainto_tsquery`, GIN index) → 50 candidates
- **Phase 2**: pgvector cosine-distance search (HNSW index) → 50 candidates, threshold 0.3
- **Phase 3**: Reciprocal Rank Fusion (RRF, k=60) for the final ranking
- Optional filtering by notebook
- FTS-only fallback if the vector search fails
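Phase 3 can be illustrated with a small Reciprocal Rank Fusion sketch. `fuseRankings` is a hypothetical name chosen to avoid implying it is the project's own `reciprocalRankFusion`; it assumes each phase returns note IDs ordered best first and uses the standard RRF score, the sum of 1 / (k + rank) over the lists in which a note appears.

```typescript
interface FusedResult {
  id: string;
  score: number;
}

// Fuses several best-first rankings into one list, highest fused score first.
function fuseRankings(rankings: string[][], k = 60): FusedResult[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return Array.from(scores.entries())
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}
```

For example, fusing an FTS ranking `['a', 'b', 'c']` with a vector ranking `['b', 'd', 'a']` puts `b` first, because it ranks near the top of both lists, followed by `a`, `d`, and `c`.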
### Embeddings
- pgvector(1536) in PostgreSQL
- Automatic regeneration after 7 days
- Batch generation with `Promise.all`
---
## 3. Memory Echo: Proactive connections
### Automatic insights
- Detects semantic connections between notes via cosine similarity
- Similarity threshold: 0.75 (normal), 0.50 (demo)
- Generates a one-sentence explanation (max 15 words) via AI
- Configurable frequency: daily / weekly / custom (3/day)
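The similarity test behind Memory Echo can be sketched as below. `cosineSimilarity` and `isConnection` are hypothetical names, and treating a score equal to the threshold as a match (`>=`) is an assumption; in production the comparison may well run inside pgvector rather than in application code.

```typescript
// Cosine similarity of two equal-length vectors; returns 0 for a zero vector.
function cosineSimilarity(a: readonly number[], b: readonly number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error('Vectors must be non-empty and of equal length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Thresholds from the list above: 0.75 in normal mode, 0.50 in demo mode.
const SIMILARITY_THRESHOLD = { normal: 0.75, demo: 0.5 };

function isConnection(a: readonly number[], b: readonly number[], demoMode = false): boolean {
  const threshold = demoMode ? SIMILARITY_THRESHOLD.demo : SIMILARITY_THRESHOLD.normal;
  return cosineSimilarity(a, b) >= threshold;
}
```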
### Adaptive feedback
- **Thumbs up** → lowers the threshold by 0.05 (more suggestions)
- **Thumbs down** → raises the threshold by 0.15 (fewer suggestions)
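The asymmetric adjustment can be sketched as a pure function. `adjustThreshold` is a hypothetical name, and clamping the result to the range 0 to 1 is an assumption; the source gives the step sizes but no bounds.

```typescript
type Feedback = 'up' | 'down';

const STEP_DOWN_ON_THUMBS_UP = 0.05; // lower threshold, more suggestions
const STEP_UP_ON_THUMBS_DOWN = 0.15; // higher threshold, fewer suggestions

// Returns the new similarity threshold after one piece of feedback.
// Assumption: the threshold is clamped to [min, max] so it stays a valid similarity.
function adjustThreshold(current: number, feedback: Feedback, min = 0, max = 1): number {
  const next =
    feedback === 'up' ? current - STEP_DOWN_ON_THUMBS_UP : current + STEP_UP_ON_THUMBS_DOWN;
  return Math.min(max, Math.max(min, next));
}
```

A single thumbs down outweighs three thumbs up, so under this rule negative feedback tightens the suggestions quickly while positive feedback relaxes them gradually.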
### Note merging
- Smart merging of several notes into a single structured note
- Consolidation, duplicate removal, logical organization
### Connections per note
- Lists all semantic connections for a note, with pagination
---
## 4. AI Chat: RAG with tools
### Contextual chat (streaming)
- Retrieval-Augmented Generation over the user's notes
- Semantic search (threshold 0.5, or 0.3 for a specific notebook)
- System prompt in French
- **Available tools**: `note_search`, `note_read`, `web_search`, `web_scrape`
- Max 5 tool steps
- Supports notebook context, Copilot context (the note being edited), and images (vision)
- Conversations are persisted; history covers the last 10 messages
- Localized prompts (en, fr, fa, es)
### Chat insights
- Concise summary or 3 insights drawn from the 5 most recent notes
---
## 5. Automatic organization
### Batch organization (IA3)
- Files notes that have no notebook into existing notebooks
- Thematic classification guide (sports → Personal, work → Work, etc.)
- Confidence > 0.60, max 50 notes analyzed
- Prompts localized in 12 languages
### Notebook summary (IA6)
- 5 sections: Main Themes, Statistics, Temporal Elements, Points of Attention, Key Insights
- Analyzes up to 100 recent notes
- Prompts localized in 5 languages
---
## 6. AI Agents: 6 types
### Scraper
- Scrapes URLs (Jina Reader), detects and parses RSS feeds
- AI synthesis, image extraction (Cheerio + Sharp)
- Smart image placement via AI
- AI-generated title, email notification
### Researcher
- Generates search queries, scrapes the sources
- Creates a structured research note
### Monitor
- Analyzes a notebook's recent notes
- Produces recurring insights and connections
### Custom
- Free-form custom role with optional source URLs
### Slide Generator
- Creates presentations via `pptx_create` (PowerPoint) or `slides_create` (HTML Reveal.js)
### Excalidraw Generator
- Creates Excalidraw diagrams from a simplified format
- 7 types: flowchart, mindmap, architecture-cloud, org-chart, timeline, process-map, auto
- Auto-layout via Dagre
### Scheduling
- Frequencies: manual, hourly, daily, weekly, monthly
- IANA time zones supported
---
## 7. Content generation
### PowerPoint (PptxGenJS)
- 20 color themes, 4 styles (sharp/soft/rounded/pill)
- 15 slide layouts
- Images pre-fetched as base64
### HTML slides (Reveal.js)
- Same themes and layouts as PPTX
- Interactive, speaker notes, optional decorative frame
### Excalidraw
- Simplified format (nodes + edges) → Dagre auto-layout
- Quality scoring (overlaps, crossings)
- Saved to the DB
---
## 8. Brainstorm: Waves of thought
### AI wave generation
- 3 successive waves: Variations → Analogies → Disruptions
- Each wave is enriched with context from existing notes
- Connections to existing notes via embeddings (derived_from, opposes, extends, synthesizes, transposes)
### Enrichment of manual ideas
- The LLM enriches title, description, connectionToSeed, noveltyScore
- Automatic semantic placement: embedding → cosine similarity → best parent
### Interactive D3 canvas
- Radial graph with 3 rings (waves 1-2-3)
- Drag & drop, zoom, selection, inline creation
- AI ghost cursor during generation
### Real-time collaboration
- Socket.io: live cursors, node movement, activity
- Sharing via the NoteShare model (email → notification → accept/decline)
- Avatars with an emerald presence indicator
### Export → Note
- Creates a structured note with summary, waves, connections, and the notes drawn upon
- Opens the created note directly
---
## 9. Tools available to agents
| Tool | Description |
|---|---|
| `web_search` | Web search via SearXNG or Brave Search |
| `web_scrape` | Scraping via Jina Reader (markdown) + RSS auto-detection |
| `note_search` | Hybrid search across notes (semantic + keywords) |
| `note_read` | Reads a note by ID |
| `note_create` | Creates a note (markdown, optional images) |
| `note_update` | Updates a note's title/content |
| `url_fetch` | URL fetch with JSON/CSV/text parsing |
| `memory_search` | Search across the history of agent runs |
| `pptx_create` | Generates a PowerPoint |
| `slides_create` | Generates HTML Reveal.js slides |
| `excalidraw_create` | Generates an Excalidraw diagram |
---
## 10. Language detection
- Hybrid: TinyLD for short notes (<50 words, ~8ms), AI for long notes
- TinyLD → ISO 639-1 mapping, 62 languages including Persian
- 19 explicitly mapped codes + fallback
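The hybrid routing rule can be sketched as a pure decision function. `countWords` and `chooseDetector` are hypothetical names, whitespace splitting is an assumed definition of "word", and the actual TinyLD and AI calls are deliberately left out.

```typescript
type Detector = 'tinyld' | 'ai';

const SHORT_NOTE_MAX_WORDS = 50; // notes with fewer words go to the local detector

// Counts whitespace-separated tokens; an empty or blank string has 0 words.
function countWords(text: string): number {
  return text.trim().split(/\s+/).filter((token) => token.length > 0).length;
}

// Short notes use the fast local detector, longer notes use the AI provider.
function chooseDetector(text: string): Detector {
  return countWords(text) < SHORT_NOTE_MAX_WORDS ? 'tinyld' : 'ai';
}
```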
---
## User toggles (UserAISettings)
| Toggle | Feature |
|---|---|
| `autoLabeling` | Automatic contextual tags |
| `paragraphRefactor` | Paragraph rewriting |
| `titleSuggestions` | Title suggestions |
| `memoryEcho` | Proactive Memory Echo |
| `memoryEchoFrequency` | daily / weekly / custom |
| `demoMode` | Lowered thresholds for demos |
| `preferredLanguage` | Preferred language for agents |
---
## Key figures
- **46 AI files**
- **~10,500+ lines of code**
- **26 AI API routes**
- **12 registered tools**
- **6 agent types**
- **13 supported providers**
- **Localized prompts**: fr, en, fa, es, de, it, pt, nl, pl, ru, ja, ko, zh, ar, hi


@@ -0,0 +1,454 @@
# Memento — Stratégie GTM & Pricing
> **Version:** 1.0 | **Date:** 2026-05-14 | **Statut:** Draft
> **Rôle:** CMO / Expert Pricing SaaS IA
---
## 1. Positionnement & Message
### 1.1 UVP — Votre Unique Value Proposition
> **Memento n'est pas une app de notes. C'est une mémoire numérique qui pense avec vous.**
| Avant (produits existants) | Avec Memento |
|---------------------------|--------------|
| Vous écrivez des notes | Vous alimentez une **mémoire** qui se souvient |
| Vous cherchez manuellement | L'IA **retrouve** ce que vous aviez oublié |
| Vousifiez manuellement | L'IA **connecte** vos idées entre elles |
| Vous-presentez seul | L'IA **génère** vos présentations |
**Tagline principale:**
> "Arrêtez de prendre des notes. Commencez à vous souvenir."
**Tagline produit:**
> "Votre mémoire externe, augmentée par l'IA."
### 1.2 The 3 Target Personas
#### Persona 1: "The Independent Consultant" 🎯
- **Profile:** Freelancer in strategy, consulting, audit
- **Pain:** "I have hundreds of notes from past engagements, and I never find the relevant insight at the right moment"
- **Star use case:** Semantic search + Memory Echo
- **Willingness to pay:** Medium-high (€9.90-29.90/month)
- **Copy:** "Your 3 years of consulting, retrieved instantly."
#### Persona 2: "The Product Manager" 🎯🎯
- **Profile:** Product Manager, project lead, consultant
- **Pain:** "I spend 2 h/week looking for information in my notes. I miss obvious connections between my specs and my feedback."
- **Star use case:** Brainstorm + Agents + PPTX generation
- **Willingness to pay:** High (€29.90/month)
- **Copy:** "From idea to presentation in 5 minutes."
#### Persona 3: "The Researcher/PhD" 🎯
- **Profile:** Academic researcher, PhD candidate, analyst
- **Pain:** "I scrape 20 sources a day, I lose everything, and I never link my readings together."
- **Star use case:** Agents (Scraper + Researcher) + Memory Echo + Semantic Search
- **Willingness to pay:** Medium (may be price-sensitive)
- **Copy:** "Your research assistant, never overwhelmed."
### 1.3 Copy per Feature (for the landing page)
| Feature | Headline | Sub-headline |
|---------|----------|-------------|
| Semantic Search | "Find what you had forgotten" | "Search beyond words. The AI understands meaning." |
| Memory Echo | "Your memory whispers connections to you" | "Memento detects the links between your notes, automatically." |
| Agents | "Your agents work while you sleep" | "Scraper, Researcher, Monitor: launch and forget." |
| Brainstorm | "Generate 100 ideas in 3 waves" | "AI + real-time collaboration. Effortless." |
| PPTX Generation | "From note to presentation in 1 click" | "PowerPoint or HTML slides, generated by AI." |
---
## 2. Pricing Architecture: Margin Protection
### 2.1 Protection Logic (Cost Engineering)
**The problem:** Every AI feature has a variable cost. Here is how I classify them:
| Feature type | Cost per call | Limit policy |
|----------------|---------------|-------------|
| **One-shot simple** (title, tag, reformulation) | ~50-200 tokens | Monthly limits (sufficient) |
| **Semantic search** (embeddings + RRF) | ~200-500 tokens | Monthly limits + BYOK |
| **Memory Echo** (scheduled, vault-wide) | ~500-2000 tokens/run | Strict limits (costly) |
| **Chat RAG** (multi-step, tools) | ~1000-5000 tokens/request | Credit-based or limit |
| **Agent Scraper** (web scrape + synthesis) | ~3000-10000 tokens/run | Hard limit (very costly) |
| **Agent Researcher** (multi-step, web) | ~5000-20000 tokens/run | Hard limit |
| **PPTX Generation** (images + layout) | ~3000-8000 tokens/run | Hard limit + BYOK |
| **Excalidraw Gen** (auto-layout) | ~2000-5000 tokens/run | Hard limit |
| **Brainstorm** (3 waves + connections) | ~10000-30000 tokens/session | Hard limit (per session) |
**Rule #1:** Agents and multimedia Generations are the most expensive → **hard limits** (no credits that run out in 2 clicks).
**Rule #2:** "One-shot" features can have soft limits because their cost is low.
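The two rules above can be sketched as a small limit check. This is only an illustration: `LimitPolicy`, `FeatureLimit`, `UsageState` and `canRun` are hypothetical names, and the idea that a soft-limited feature may spill over into purchased boost credits is an assumption drawn from the add-on note in section 5.1, not a confirmed design. The monthly limits are the Pro plan figures.

```typescript
// Hypothetical names throughout; not taken from the Memento codebase.
type LimitPolicy = 'soft' | 'hard';

interface FeatureLimit {
  policy: LimitPolicy;
  monthlyLimit: number; // included calls or runs per month
}

interface UsageState {
  usedThisMonth: number;
  boostCredits: number; // purchased add-on credits (assumed mechanism)
}

// Soft-limited features may continue on boost credits once the monthly
// allowance is used up; hard-limited features stop at the allowance.
function canRun(limit: FeatureLimit, usage: UsageState): boolean {
  if (usage.usedThisMonth < limit.monthlyLimit) return true;
  return limit.policy === 'soft' && usage.boostCredits > 0;
}

// Pro plan figures from section 2.2.
const semanticSearch: FeatureLimit = { policy: 'soft', monthlyLimit: 200 };
const agentScraper: FeatureLimit = { policy: 'hard', monthlyLimit: 5 };
```

Keeping the policy on the feature rather than on the plan means a new expensive feature only needs one `hard` entry to be protected on every tier.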
### 2.2 Detailed Plans
#### **BASIC (Free)**
*For: individuals, students, testers*
| Feature | Limit |
|---------|-------|
| Notes | 100 max |
| Notebooks | 3 |
| **AI Starter Pack (lifetime)** | **50 credits** |
| - Semantic search | 30 credits |
| - Auto tags | 15 credits |
| - Auto titles | 5 credits |
| **AI Reformulation** | ❌ |
| **AI Chat** | ❌ |
| **Memory Echo** | ❌ |
| **Agents** | ❌ |
| **PPTX / Slides** | ❌ |
| **Excalidraw Gen** | ❌ |
| **Brainstorm** | 1 session/month |
| **API Access** | ❌ |
| **History** | 7 days |
| **Support** | Community |
**Copy:** "Discover the magic. 50 free credits, for life."
---
#### **PRO (€9.90/month)**
*For: consultants, freelancers, advanced individuals*
| Feature | Limit |
|---------|-------|
| Notes | Unlimited |
| Notebooks | Unlimited |
| **Semantic search** | 200/month |
| **Auto tags** | 200/month |
| **Auto titles** | 200/month |
| **AI Reformulation** | 50/month |
| **AI Chat (RAG)** | 50/month |
| **Memory Echo** | 20/month |
| **AI Organization (batch)** | 20/month |
| **AI Notebook Summary** | 10/month |
| **Agents: Scraper** | 5 runs/month |
| **Agents: Researcher** | 2 runs/month |
| **Agents: Monitor** | 5 runs/month |
| **PPTX / Slides** | 3 generations/month |
| **Excalidraw Gen** | 5/month |
| **Brainstorm** | 5 sessions/month |
| **API Access** | ❌ |
| **BYOK (Bring Your Own Key)** | ✅ (OpenAI/Anthropic) |
| **History** | 30 days |
| **Support** | Email |
**Copy:** "Your thinking, amplified. Full AI, with no credit limits."
*Pro BYOK lets power users use their own API key for one-shot features. That brings your cost to ~0 on those features.*
---
#### **BUSINESS (€29.90/month)**
*For: teams, PMs, departments*
| Feature | Limit |
|---------|-------|
| **Everything in Pro** | ✅ |
| **Collaborators** | 10 included |
| **Semantic search** | 1000/month |
| **Auto tags / titles** | 1000/month |
| **AI Reformulation** | 500/month |
| **AI Chat (RAG)** | 500/month |
| **Memory Echo** | 100/month |
| **AI Organization** | 100/month |
| **AI Notebook Summary** | 50/month |
| **Agents: Scraper** | 20 runs/month |
| **Agents: Researcher** | 10 runs/month |
| **Agents: Monitor** | 20 runs/month |
| **Agents: Custom** | 10 runs/month |
| **PPTX / Slides** | 20 generations/month |
| **Excalidraw Gen** | 50/month |
| **Brainstorm** | Unlimited |
| **API Access** | ✅ (1000 req/month) |
| **BYOK** | ✅ (13 providers) |
| **History** | Unlimited |
| **Support** | Priority 24h |
**Copy:** "Your team, augmented by AI. Invite 10 colleagues."
*Each additional seat (beyond the 10 included) costs +€4.90/month per seat. You do NOT bill AI usage beyond the seat.*
---
#### **ENTERPRISE (€49.90 + €3.90/user/month)**
*For: 20+ users, companies*
| Feature | Limit |
|---------|-------|
| **Everything in Business** | ✅ |
| **Users** | 20 minimum |
| **Collaborators** | Unlimited |
| **Agents** | Unlimited |
| **PPTX / Slides** | Unlimited |
| **API Access** | Unlimited |
| **SSO / SAML** | ✅ |
| **Audit Logs** | ✅ |
| **SLA** | 99.9% |
| **Onboarding** | Live session included |
| **Custom contracts** | Net-30, BPA |
**Copy:** "Organizational memory. SSO, audit, SLA."
---
### 2.3 Quick Comparison Table
| Feature | Basic | Pro | Business | Enterprise |
|---------|------:|----:|---------:|-----------:|
| **Price** | Free | €9.90/month | €29.90/month | €49.90 + €3.90/user |
| AI Starter Pack | 50 credits (lifetime) | ✅ | ✅ | ✅ |
| Semantic search | 30 credits | 200/month | 1000/month | Unlimited |
| Auto tags/titles | 15 credits | 200/month | 1000/month | Unlimited |
| Reformulation | ❌ | 50/month | 500/month | Unlimited |
| Chat RAG | ❌ | 50/month | 500/month | Unlimited |
| Memory Echo | ❌ | 20/month | 100/month | Unlimited |
| Notebook summary | ❌ | 10/month | 50/month | Unlimited |
| Agents (total runs) | ❌ | 12/month | 60/month | Unlimited |
| PPTX/Slides | ❌ | 3/month | 20/month | Unlimited |
| Excalidraw Gen | ❌ | 5/month | 50/month | Unlimited |
| Brainstorm sessions | 1/month | 5/month | Unlimited | Unlimited |
| Collaborators | 0 | 0 | 10 | Unlimited |
| API Access | ❌ | ❌ | ✅ | ✅ |
| BYOK | ❌ | ✅ | ✅ | ✅ |
| Support | Community | Email | Priority 24h | Dedicated |
---
## 3. BYOK: Bring Your Own Key (Power-User Strategy)
### 3.1 Why BYOK?
1. **Eliminates your variable cost** on one-shot AI features
2. **Attracts power users** who already have API keys (developers, researchers)
3. **Reduces conversion friction**: they can use the tool without worrying about quotas
4. **No risk of abuse**: they pay their own API bill
### 3.2 BYOK Breakdown by Plan
| Plan | BYOK providers | Status |
|------|---------------|--------|
| Basic | ❌ | n/a |
| **Pro** | OpenAI, Anthropic | Limited to one-shot features |
| **Business** | OpenAI, Anthropic, Google, DeepSeek, Ollama, Mistral, ZAI, LM Studio... (13) | All features |
| **Enterprise** | All 13 + custom config | Full access |
### 3.3 BYOK Implementation
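```typescript
// lib/ai/factory.ts: BYOK modification
interface AIConfig {
  // Remaining provider ids are elided in the original ("...").
  provider: 'openai' | 'anthropic' | string;
  apiKey?: string;        // USER key (BYOK)
  useSystemKey?: boolean; // If true, use the system key (your cost)
  model?: string;
}

// The user configures their key in /settings/ai-keys.
// It is stored encrypted in the DB (AES-256).

// Helpers assumed to be defined elsewhere in the codebase.
declare function getEffectiveConfig(userId: string, feature: string): AIConfig;
declare function canUseFeature(userId: string, feature: string): Promise<void>; // assumed to throw when the quota is exhausted
declare function trackUsage(userId: string, feature: string, tokensUsed: number): Promise<void>;

// On every AI request:
async function applyQuota(userId: string, feature: string, tokensUsed: number): Promise<AIConfig> {
  const config = getEffectiveConfig(userId, feature);
  if (config.useSystemKey) {
    // Check the normal quotas
    await canUseFeature(userId, feature);
    // Decrement the quota
    await trackUsage(userId, feature, tokensUsed);
  }
  // Otherwise BYOK: no Memento quota, the config is passed straight through
  // and the cost is 100% borne by the user.
  return config;
}
```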
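The encrypted storage mentioned in the comments could look roughly like the sketch below, which uses Node's built-in `node:crypto` module. The section only says "AES-256"; choosing GCM mode, the `EncryptedKey` shape, and the function names `encryptApiKey` / `decryptApiKey` are assumptions made for illustration, and master-key management (where the 32-byte key comes from) is deliberately left out.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from 'node:crypto';

// Hypothetical storage shape: three base64 strings persisted per user key.
interface EncryptedKey {
  iv: string;         // 12-byte nonce, unique per encryption
  authTag: string;    // 16-byte GCM authentication tag
  ciphertext: string; // the encrypted API key
}

// masterKey must be exactly 32 bytes for AES-256.
function encryptApiKey(plain: string, masterKey: Buffer): EncryptedKey {
  const iv = randomBytes(12);
  const cipher = createCipheriv('aes-256-gcm', masterKey, iv);
  const ciphertext = Buffer.concat([cipher.update(plain, 'utf8'), cipher.final()]);
  return {
    iv: iv.toString('base64'),
    authTag: cipher.getAuthTag().toString('base64'),
    ciphertext: ciphertext.toString('base64'),
  };
}

function decryptApiKey(enc: EncryptedKey, masterKey: Buffer): string {
  const decipher = createDecipheriv('aes-256-gcm', masterKey, Buffer.from(enc.iv, 'base64'));
  decipher.setAuthTag(Buffer.from(enc.authTag, 'base64'));
  const plain = Buffer.concat([
    decipher.update(Buffer.from(enc.ciphertext, 'base64')),
    decipher.final(), // throws if the ciphertext or tag was tampered with
  ]);
  return plain.toString('utf8');
}
```

An authenticated mode is used here so that a modified database row fails loudly on decryption instead of yielding a corrupted key that is then sent to a provider.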
### 3.4 BYOK Copy
> "Already have an OpenAI key? Connect it and use AI without Memento credit limits."
---
## 4. PLG Acquisition Levers
### 4.1 Viral Loops (How to acquire new users without paying)
**Viral Loop #1: Shared Brainstorm Canvas**
```
User A creates a Brainstorm → clicks "Share"
→ Unique link → Email/Notion/LinkedIn
→ Prospect B receives the link with an interactive preview
→ Prospect B can explore 3 waves for free (preview mode)
→ "Create my account to save and continue"
→ New user acquired
```
**Why it works:** The canvas is visually impressive. It is an immediate "wow moment". The cost to you: ~1 free brainstorm session. The value for the prospect: they can "play" with it without signing up.
**Copy to add on the shared canvas:**
> "Powered by Memento. Create your own free Brainstorm"
---
**Viral Loop #2: PPTX Export with Watermark**
```
User generates a PPTX → clicks "Export"
→ Option: "Include Memento branding" (free)
→ Option: "No watermark" (Pro+)
→ The PPTX contains a hidden slide: "Created with Memento → memento.io"
→ The file travels through the company
→ 1 click to sign-up
```
**Why it works:** The PPTX travels. Inside a company, a good template gets shared. The hidden slide is tiny but always there.
---
**Viral Loop #3: Shared Note with AI Results**
```
User shares a note with friends/colleagues
→ The note includes publicly visible "AI Insights"
→ "See the connections detected by Memory Echo"
→ Click → Sign-up to access the AI features
```
---
### 4.2 Onboarding: The Killer Feature (first 5 minutes)
**The goal:** Every free user must experience **Semantic Search** within the first 5 minutes.
**Why:** It is the strongest "Aha!" effect. The user types a question in natural language and finds a note they had forgotten for 6 months. It only works with an existing base of notes.
**Optimized onboarding flow:**
```
Day 0: Sign-up (2 min)
├── 1. Create an account (Google/email)
├── 2. Wizard: "Import your notes?" → CSV/Markdown option
└── 3. Wizard: "Do you have 5+ notes?" → sample notes provided if no
Day 0: First use (5 min)
├── 1. Banner: "Try semantic search"
├── 2. Input: "What's on your mind?" (search bar)
├── 3. Results: notes with highlighted snippets
├── 4. CTA: "You have used 1/30 free searches"
└── 5. If < 5 notes: "Create 3 notes to see the magic"
Day 0: Conversion trigger
├── If the user has run 3+ searches:
│   └── "You love this feature. Unlock the full AI."
└── If the Starter Pack is exhausted:
    └── Pro paywall with 14-day trial
```
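The conversion trigger at the end of the flow can be expressed as a small decision function. This is a sketch under assumptions: `OnboardingState`, `ConversionPrompt` and `nextPrompt` are hypothetical names, and giving the exhausted Starter Pack priority over the 3-search upsell is a guess, since the flow does not say which wins when both conditions hold.

```typescript
// Hypothetical names; the flow above only defines the two trigger conditions.
type ConversionPrompt = 'none' | 'upsell_banner' | 'pro_trial_paywall';

interface OnboardingState {
  searchesDone: number;       // semantic searches run so far
  starterCreditsLeft: number; // remaining Starter Pack credits
}

function nextPrompt(state: OnboardingState): ConversionPrompt {
  // Assumed precedence: an exhausted pack blocks usage, so it wins.
  if (state.starterCreditsLeft <= 0) return 'pro_trial_paywall';
  if (state.searchesDone >= 3) return 'upsell_banner';
  return 'none';
}
```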
**Onboarding copy:**
> "Type a question. Find a note you had forgotten."
---
### 4.3 Email Drip for Conversion
| Timing | Email | Goal |
|--------|-------|---------|
| D+0 | Welcome + "Import your notes" | Activation |
| D+2 | "Tips: 3 ways to use semantic search" | Education |
| D+5 | "Your Starter Pack: 30 credits remaining" | Re-engagement |
| D+7 | "You used X features this week" | Social proof |
| D+12 | "Your Starter Pack expires in 2 days" | Urgency |
| D+14 | "Pro trial: 14 days free" | Trial conversion |
| D+21 (if trial) | "Ready to keep the AI?" | Conversion |
---
## 5. Complementary Monetization Strategy
### 5.1 Add-ons (Additional Revenue)
| Add-on | Price | Description |
|-------|------|-------------|
| +5 Collaborators | +€4.90/month | Additional Memento seats |
| +10 Collaborators | +€8.90/month | Discounted bundle |
| +100 AI credits | +€2.90/month | Credit boost for Pro |
| +500 AI credits | +€9.90/month | Credit boost for Pro |
*Note: Add-on credits do not apply to Agents and Generations (hard limits).*
### 5.2 Annual Discount
| Plan | Monthly | Annual | Discount |
|------|---------|--------|--------|
| Pro | €9.90 | €99 | 2 months free (~17%) |
| Business | €29.90 | €299 | 2 months free (~17%) |
**Copy:** "12 months for the price of 10. Save 2 months."
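The "2 months free (~17%)" figure can be checked with a few lines of arithmetic. The helper name `annualDiscount` is hypothetical; the prices are the ones from the table.

```typescript
// Compares twelve monthly payments with the annual price.
function annualDiscount(monthly: number, annual: number): { freeMonths: number; percent: number } {
  const fullYear = monthly * 12;
  const saved = fullYear - annual;
  return {
    freeMonths: saved / monthly,       // months of service effectively not paid for
    percent: (saved / fullYear) * 100, // discount relative to paying monthly
  };
}

const proDiscount = annualDiscount(9.9, 99);        // about 2 months, about 16.7%
const businessDiscount = annualDiscount(29.9, 299); // about 2 months, about 16.7%
```

Because each annual price is exactly ten times the monthly price, the discount is 2/12 of the yearly total for both plans, which rounds to the ~17% shown.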
### 5.3 Dual Currency
| Currency | Price | Display |
|----------|------|-----------|
| EUR | €9.90/month | Default (EU locale) |
| USD | $10.90/month | US/CA locale |
| GBP | £8.90/month | UK locale |
Stripe converts automatically. You do not manage exchange rates manually.
---
## 6. Executive Summary
### The 4 Pillars
| Pillar | Action |
|--------|--------|
| **1. Generous freemium** | 50 lifetime AI credits (Semantic search + Tags + Titles) = "Aha!" effect |
| **2. Margin protection** | Agents and Generations = hard limits. One-shot = monthly quotas. BYOK for power users. |
| **3. Viral loops** | Shareable Brainstorm canvas + PPTX watermark + shared note with AI insights |
| **4. Killer onboarding** | Semantic search in 5 minutes = conversion. 14-day trial with no card. |
### Final Pricing
| Plan | Monthly | Annual | Target |
|------|---------|--------|--------|
| Basic | Free | n/a | Acquisition |
| **Pro** | **€9.90** | **€99** | Individual, Consultant |
| **Business** | **€29.90** | **€299** | Teams, PM |
| Enterprise | €49.90 + €3.90/user | n/a | 20+ seats |
### Cost per Active User
| Feature | Memento cost | Pro price | Margin |
|---------|-------------|---------|-------|
| 200 semantic searches | ~€0.40 | €9.90 | 96% |
| 50 reformulations | ~€0.15 | €9.90 | 98% |
| 5 Agent runs (scraper) | ~€0.50 | €9.90 | 95% |
| 3 generated PPTX | ~€0.60 | €9.90 | 94% |
*Costs are estimated with GPT-4o mini pricing. Adjust according to your actual API costs.*
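Each row of the table measures one feature's cost against the full Pro price. The sketch below reproduces those per-feature margins and also computes the combined margin for a user who exhausts all four allowances in the same month, a figure the table does not show. The names (`FeatureCost`, `marginPercent`) are hypothetical, and the costs are the document's own estimates.

```typescript
interface FeatureCost {
  feature: string;
  monthlyCost: number; // EUR, estimated in the table above
}

const PRO_PRICE = 9.9; // EUR per month

const costs: FeatureCost[] = [
  { feature: '200 semantic searches', monthlyCost: 0.4 },
  { feature: '50 reformulations', monthlyCost: 0.15 },
  { feature: '5 agent runs (scraper)', monthlyCost: 0.5 },
  { feature: '3 generated PPTX', monthlyCost: 0.6 },
];

function marginPercent(price: number, cost: number): number {
  return ((price - cost) / price) * 100;
}

// Per-feature margins, rounded as in the table: 96, 98, 95, 94.
const perFeatureMargins = costs.map((c) => Math.round(marginPercent(PRO_PRICE, c.monthlyCost)));

// A user who maxes out all four features costs about €1.65 in total,
// which leaves a combined margin of roughly 83%.
const combinedCost = costs.reduce((sum, c) => sum + c.monthlyCost, 0);
const combinedMargin = marginPercent(PRO_PRICE, combinedCost);
```

The combined figure still excludes the other Pro features (Chat RAG, Memory Echo, Brainstorm and so on), so it should be read as an upper bound on the margin for a heavy user.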
---
## Appendices
### A. Competitor Comparison
| Product | Price | AI | Differentiator |
|--------|------|----|----------------|
| Notion AI | $10/user/month | Yes | Notion integration, writer-centric |
| Obsidian Sync | $4/month | No | Local Markdown, no native AI |
| Roam | $15/month | No | Blocks, backlinks, no AI |
| Mem.ai | $10/month | Yes | Semantic search, team spaces |
| Jasper | $49/month | Yes | Marketing copy, templates |
| **Memento** | **€9.90/month** | **Yes (13 providers)** | **Agents + Brainstorm + PPTX + BYOK** |
**Memento differentiates itself through:** 13 AI providers (no lock-in), autonomous Agents, collaborative Brainstorm, PPTX/Slides/Excalidraw generation, BYOK.
### B. Validation Questions
1. **Real API costs**: Do you have the real per-feature costs to calibrate the quotas?
2. **Trial without card**: Do you really want 14 days without a card (lower conversion but more signups)?
3. **Pro minimum for BYOK**: Do you confirm that Pro gets BYOK for OpenAI/Anthropic only?
4. **PPTX watermark**: Are you OK with the branding watermark in free exports?


@@ -0,0 +1,169 @@
---
stepsCompleted:
- step-01-document-discovery
- step-02-prd-analysis
- step-03-epic-coverage-validation
- step-04-ux-alignment
- step-05-epic-quality-review
- step-06-final-assessment
includedFiles:
- docs/prd.md
---
# Implementation Readiness Assessment Report
**Date:** 2026-05-14
**Project:** Momento
## PRD Analysis
### Functional Requirements
FR1: Users can create, read, update, and delete rich-text notes within their workspace.
FR2: Users can execute natural language semantic searches across their entire personal knowledge base.
FR3: Users can organize notes within isolated workspaces and nested hierarchies.
FR4: Users can upload PDF documents and extract text for AI contextual analysis (Chat-with-PDF).
FR5: Users can invoke automated task extraction on any note to generate structured, actionable to-do lists.
FR6: Users can leverage one-shot AI to automatically generate contextual tags and titles for their notes (Auto-Tagging / Auto-Titling).
FR7: The system will proactively detect and surface semantic connections between disconnected notes in the background ("Memory Echo") to stimulate serendipitous discovery.
FR8: Hosts can initialize a real-time collaborative Brainstorm session derived from an existing note.
FR9: Guests can join a shared Brainstorm session via a frictionless sharing link without requiring an account.
FR10: Users can visually map and generate ideas on a multi-directional radial graph (Canvas).
FR11: Users can prompt AI within the shared session to generate specific ideation "Waves" (Variations, Analogies, Disruptions).
FR12: Users can export the completed Brainstorm canvas to structured formats (Markdown, Branded PPTX).
FR13: Users can monitor their remaining "AI Discovery Pack" or subscription usage limits via a real-time UI indicator.
FR14: Users can input, update, and securely store their own third-party LLM API keys (BYOK) to bypass platform limits.
FR15: Hosts can assume financial responsibility (token consumption) for all AI queries executed by guests within their active shared sessions.
FR16: Users can upgrade to paid subscription tiers (Pro, Business, Enterprise) via an integrated payment gateway.
FR17: Users can dynamically switch between supported AI providers (OpenAI, DeepSeek, Gemini, OpenRouter, etc.) for specific generation tasks.
FR18: Administrators can configure smart-routing fallback rules to default to specific models under heavy load or quota exhaustion.
FR19: Users can configure, schedule, and view the outputs of autonomous background agents (Scraper, Researcher, Monitor).
FR20: Enterprise users can authenticate via Single Sign-On (SSO / SAML).
FR21: Workspace administrators can view comprehensive audit logs detailing user access events and specific AI provider utilization.
FR22: Workspace administrators can configure strict data residency requirements (e.g., EU-only storage) for their tenant.
FR23: Administrators can programmatically enforce zero-data-retention headers/flags for all outbound third-party AI API requests.
Total FRs: 23
### Non-Functional Requirements
NFR-P1 (Real-Time Latency): Brainstorm Canvas WebSocket events (e.g., node creation, cursor movement) must propagate to all connected clients within 150ms under normal network conditions.
NFR-P2 (Search Speed): Vector-based semantic search queries must return initial results within 800ms for personal knowledge bases containing up to 10,000 notes.
NFR-P3 (AI Routing): The internal LLM Router must evaluate "Host-Pays" and BYOK rules and dispatch the prompt to the external provider within 50ms of receiving the user request.
NFR-S1 (Encryption): All user-provided LLM API keys (BYOK) must be encrypted at rest using AES-256-GCM.
NFR-S2 (Data Residency): The architecture must support configurable regional database deployments, guaranteeing EU-only data storage for specified enterprise tenants.
NFR-S3 (Auditability): The system must log 100% of LLM API requests, retaining anonymized provider routing and token consumption metrics for a minimum of 1 year to support future SOC2 compliance audits.
NFR-SC1 (Collaborative Sessions): A single Brainstorm session must gracefully support up to 50 concurrent active users without degrading the 150ms latency target.
NFR-SC2 (Rate Limiting): The Redis-backed entitlement system must process usage quota checks in under 10ms, supporting up to 5,000 concurrent verifications per second globally.
NFR-R1 (Graceful Degradation): If a primary AI provider (e.g., OpenAI) returns a 429 (Rate Limit) or 500-series error, the LLM Router must automatically fall back to the designated secondary provider (e.g., DeepSeek) within 1.5 seconds.
NFR-R2 (Availability): The core note-taking interface, database reads/writes, and offline access must maintain 99.9% uptime, functioning independently of any third-party AI provider outages.
Total NFRs: 10
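As an illustration of NFR-R1, the fallback rule can be sketched as a wrapper around two provider calls. Everything here is hypothetical (`ProviderCall`, `ProviderFailure`, `withFallback`), providers are assumed to reject with an object carrying an HTTP-style `status`, and the 1.5-second budget is not enforced by this sketch; it only shows which errors trigger the switch.

```typescript
// A provider call takes a prompt and resolves to the completion text.
type ProviderCall = (prompt: string) => Promise<string>;

// Assumed error shape: providers reject with an HTTP-style status code.
interface ProviderFailure {
  status: number;
}

function isProviderFailure(err: unknown): err is ProviderFailure {
  return typeof err === 'object' && err !== null
    && typeof (err as { status?: unknown }).status === 'number';
}

// Fall back only on 429 (rate limit) or 5xx errors, as NFR-R1 describes.
function shouldFallBack(err: unknown): boolean {
  return isProviderFailure(err) && (err.status === 429 || (err.status >= 500 && err.status <= 599));
}

async function withFallback(primary: ProviderCall, secondary: ProviderCall, prompt: string): Promise<string> {
  try {
    return await primary(prompt);
  } catch (err) {
    if (!shouldFallBack(err)) throw err; // e.g. a 400 is the caller's problem
    return secondary(prompt);
  }
}
```

Errors such as 400 or 401 are rethrown rather than retried, since sending the same invalid request to a second provider would only spend more of the latency budget.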
### Additional Requirements
Constraints & Assumptions:
- "Host-Pays" + BYOK Model implementation limits API paywalls, but adds load logic constraints on the backend.
- Phased delivery: MVP strictly limits Agent Ecosystem to phase 2 (except for underlying mechanics).
- Massive multi-LLM integration (13 providers) must be maintained via an aggregation layer (like OpenRouter) for the long-tail models while using direct integration for high-volume ones.
- SSO/SAML is a mandatory future-proofing design constraint even if fully deployed in later tiers.
### PRD Completeness Assessment
The PRD is highly complete, providing rigorous traceability between the user journeys, functional constraints, and system design specifications. All FRs and NFRs are explicitly defined and testable. The capability contract is solid.
## Epic Coverage Validation
*Note: As confirmed during Document Discovery, the Epics & Stories documents do not yet exist. The following validation serves as a baseline map of what MUST be covered when the Epics are created in the next phase.*
### Coverage Matrix
| FR Number | PRD Requirement | Epic Coverage | Status |
| --------- | --------------- | -------------- | --------- |
| FR1 | Users can create, read, update, and delete rich-text notes within their workspace. | **NOT FOUND** | ❌ MISSING |
| FR2 | Users can execute natural language semantic searches across their entire personal knowledge base. | **NOT FOUND** | ❌ MISSING |
| FR3 | Users can organize notes within isolated workspaces and nested hierarchies. | **NOT FOUND** | ❌ MISSING |
| FR4 | Users can upload PDF documents and extract text for AI contextual analysis (Chat-with-PDF). | **NOT FOUND** | ❌ MISSING |
| FR5 | Users can invoke automated task extraction on any note to generate structured, actionable to-do lists. | **NOT FOUND** | ❌ MISSING |
| FR6 | Users can leverage one-shot AI to automatically generate contextual tags and titles for their notes (Auto-Tagging / Auto-Titling). | **NOT FOUND** | ❌ MISSING |
| FR7 | The system will proactively detect and surface semantic connections between disconnected notes in the background ("Memory Echo") to stimulate serendipitous discovery. | **NOT FOUND** | ❌ MISSING |
| FR8 | Hosts can initialize a real-time collaborative Brainstorm session derived from an existing note. | **NOT FOUND** | ❌ MISSING |
| FR9 | Guests can join a shared Brainstorm session via a frictionless sharing link without requiring an account. | **NOT FOUND** | ❌ MISSING |
| FR10 | Users can visually map and generate ideas on a multi-directional radial graph (Canvas). | **NOT FOUND** | ❌ MISSING |
| FR11 | Users can prompt AI within the shared session to generate specific ideation "Waves" (Variations, Analogies, Disruptions). | **NOT FOUND** | ❌ MISSING |
| FR12 | Users can export the completed Brainstorm canvas to structured formats (Markdown, Branded PPTX). | **NOT FOUND** | ❌ MISSING |
| FR13 | Users can monitor their remaining "AI Discovery Pack" or subscription usage limits via a real-time UI indicator. | **NOT FOUND** | ❌ MISSING |
| FR14 | Users can input, update, and securely store their own third-party LLM API keys (BYOK) to bypass platform limits. | **NOT FOUND** | ❌ MISSING |
| FR15 | Hosts can assume financial responsibility (token consumption) for all AI queries executed by guests within their active shared sessions. | **NOT FOUND** | ❌ MISSING |
| FR16 | Users can upgrade to paid subscription tiers (Pro, Business, Enterprise) via an integrated payment gateway. | **NOT FOUND** | ❌ MISSING |
| FR17 | Users can dynamically switch between supported AI providers (OpenAI, DeepSeek, Gemini, OpenRouter, etc.) for specific generation tasks. | **NOT FOUND** | ❌ MISSING |
| FR18 | Administrators can configure smart-routing fallback rules to default to specific models under heavy load or quota exhaustion. | **NOT FOUND** | ❌ MISSING |
| FR19 | Users can configure, schedule, and view the outputs of autonomous background agents (Scraper, Researcher, Monitor). | **NOT FOUND** | ❌ MISSING |
| FR20 | Enterprise users can authenticate via Single Sign-On (SSO / SAML). | **NOT FOUND** | ❌ MISSING |
| FR21 | Workspace administrators can view comprehensive audit logs detailing user access events and specific AI provider utilization. | **NOT FOUND** | ❌ MISSING |
| FR22 | Workspace administrators can configure strict data residency requirements (e.g., EU-only storage) for their tenant. | **NOT FOUND** | ❌ MISSING |
| FR23 | Administrators can programmatically enforce zero-data-retention headers/flags for all outbound third-party AI API requests. | **NOT FOUND** | ❌ MISSING |
### Missing Requirements
All 23 Functional Requirements are currently pending Epic creation. They must be translated into development epics before implementation begins.
### Coverage Statistics
- Total PRD FRs: 23
- FRs covered in epics: 0
- Coverage percentage: 0%
## UX Alignment Assessment
### UX Document Status
**Not Found** (Explicitly acknowledged by the product team).
### Alignment Issues
Cannot be assessed as no UX documentation exists to map against the PRD.
### Warnings
⚠️ **WARNING: UX/UI is heavily implied by the PRD.**
The PRD outlines highly specific interfaces such as a "Real-time D3 radial graph (Canvas)", "Chat-with-PDF", and "Real-time UI indicators for token consumption".
Architecture defines a Next.js App Router and D3.js implementation.
**Action Required:** UX Design specifications MUST be created and validated to ensure they align with the real-time constraints and the "Host-Pays" UI paradigms before frontend implementation begins.
## Epic Quality Review
### Quality Assessment Status
**Not Performed**
As there are no Epics or Stories yet, we cannot evaluate Epic independence, User Value focus, or forward dependencies.
### Recommendations for Upcoming Epic Creation
When transitioning to the Epics creation phase, the team must adhere strictly to these rules:
- **User Value Focus:** Epics must not be technical milestones (e.g., "Setup Database"). They must represent a feature slice (e.g., "User executes semantic search").
- **Strict Independence:** Epic 2 must not depend on features built in Epic 3.
- **Story Independence:** Stories within an epic must not contain forward dependencies.
- **Traceability:** Every Epic must explicitly map back to the 23 FRs defined in the PRD.
## Summary and Recommendations
### Overall Readiness Status
**NEEDS WORK** (Specifically: Needs Epic & UX Creation Phase)
### Critical Issues Requiring Immediate Action
1. **Missing Epics and Stories:** The project cannot enter the implementation phase without translating the 23 Functional Requirements into Epics and User Stories.
2. **Missing UX Documentation:** Given the heavy UI implications in the PRD (real-time canvas, chat interface, dashboard), wireframes or UX specifications must be created to align with the proposed architecture.
### Recommended Next Steps
1. Execute the `bmad-create-ux-design` workflow to solidify the interface logic implied by the PRD.
2. Execute the `bmad-create-epics-and-stories` workflow, rigorously ensuring all 23 FRs are mapped to independent, user-value-focused epics.
3. Rerun this Implementation Readiness Assessment once Epics and UX are established to validate full traceability and coverage.
### Final Note
This assessment successfully parsed the robust PRD and Architecture documents but identified major expected gaps in the Epic and UX phases. Address these critical missing documents by moving to the next strategic planning phase before proceeding to engineering implementation.

297
docs/prd.md Normal file

@@ -0,0 +1,297 @@
---
stepsCompleted:
- step-01-init
- step-02-discovery
- step-02b-vision
- step-02c-executive-summary
- step-03-success
- step-04-journeys
- step-05-domain
- step-06-innovation
- step-07-project-type
- step-08-scoping
- step-09-functional
- step-10-nonfunctional
- step-11-polish
releaseMode: 'phased'
classification:
  projectType: 'SaaS Web Application (B2C & B2B)'
  domain: 'Productivity / Personal Knowledge Management'
  complexity: 'Medium'
  projectContext: 'brownfield'
inputDocuments:
- docs/fonctionnalites-ia.md
- docs/gtm-pricing-strategy.md
- docs/spec-document-qa.md
- memento-note/docs/brainstorm-documentation.md
- memento-note/docs/byok-billing-patch-v3.md
- memento-note/docs/saas-deployment-prep.md
workflowType: 'prd'
---
# Product Requirements Document - Momento
**Author:** User
**Date:** 2026-05-14
## Executive Summary
Momento Note democratizes access to an AI-augmented digital memory for a dual audience: self-taught individuals and enterprise R&D departments. As a next-generation Personal Knowledge Management system, it transforms note-taking from passive storage into an active, intelligent partner. By integrating vector-based semantic search and an ecosystem of autonomous agents, Momento automatically surfaces hidden connections, remembers forgotten insights, and accelerates knowledge work.
### What Makes This Special
Momento's true power lies in its seamless blend of advanced AI tools and innovative financial architecture:
- **Autonomous Ecosystem:** Moving beyond a smart notepad, Momento acts as an autonomous "Second Brain." It deploys specialized agents (Scraper, Researcher, Monitor) and native productivity tools like Document Parsing (Chat-with-PDF) and Automated Task Extraction directly within the user's workspace.
- **Collaborative Brainstorming:** A real-time, D3-powered radial graph canvas allows users to generate, expand, and structure AI-driven ideas collaboratively.
- **Sustainable "Host-Pays" Billing & BYOK:** Memento resolves the SaaS AI cost paradox through intelligent smart routing (defaulting to highly optimized models like DeepSeek V4 Flash) and a Bring-Your-Own-Key (BYOK) architecture that eliminates AI costs for power users. The innovative Freemium "AI Discovery Pack" provides lifetime access limits rather than restrictive monthly quotas, delivering an immediate "Aha!" moment without friction.
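To make the "Host-Pays" and BYOK combination concrete, here is a minimal sketch of how a router might decide who bears the cost of a single AI request. All names (`Payer`, `RequestContext`, `resolvePayer`) are hypothetical, and the precedence shown (a user's own key first, then the session host, then the user's own quota) is an assumption; the PRD does not specify an order.

```typescript
// Who bears the token cost of one AI request.
type Payer =
  | { kind: 'user_key'; userId: string }   // BYOK: billed by the user's own provider account
  | { kind: 'host_quota'; hostId: string } // Host-Pays: charged to the session host
  | { kind: 'own_quota'; userId: string }; // Discovery Pack or subscription quota

interface RequestContext {
  userId: string;
  hasByokKey: boolean;
  sessionHostId?: string; // set when the request originates in a shared Brainstorm session
}

function resolvePayer(ctx: RequestContext): Payer {
  // Assumed precedence: a configured personal key always wins.
  if (ctx.hasByokKey) return { kind: 'user_key', userId: ctx.userId };
  // A guest inside someone else's session is covered by the host.
  if (ctx.sessionHostId !== undefined && ctx.sessionHostId !== ctx.userId) {
    return { kind: 'host_quota', hostId: ctx.sessionHostId };
  }
  return { kind: 'own_quota', userId: ctx.userId };
}
```

Resolving the payer as a pure function of the request context keeps the decision cheap to evaluate, which matters given the tight routing latency targets set elsewhere in the PRD.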
## Project Classification
- **Project Type:** SaaS Web Application (B2C & B2B)
- **Domain:** Productivity / Personal Knowledge Management
- **Complexity:** Medium (Managing complex state, real-time WebSockets, AI token routing, and subscription tier limits)
- **Project Context:** Brownfield (Enhancing an existing note-taking app with a V3 monetization patch, collaborative brainstorming, and BYOK capabilities)
## Success Criteria
### User Success
- **Activation (The "Aha!" Moment):** A high percentage of new free users must experience the product's core value rapidly by executing their first "Semantic Search" or joining a shared "Brainstorm Canvas" within 24 hours of sign-up.
- **Immediate Utility:** Users must immediately feel the transition from passive note-taking to an active, AI-augmented digital memory.
### Business Success
- **Core Engagement (Stickiness):** Strong Weekly Active Users (WAU) interacting specifically with advanced features (Chat-with-PDF, Task Extraction, or Autonomous Agents), proving the undeniable "Second Brain" value.
- **Viral Acquisition (PLG):** A steady baseline of new users acquired organically through product-led growth loops, primarily via shared Brainstorm session links and exported PPTX/Slide watermarks.
- **Strategic Positioning:** Achieving high engagement and product health metrics that position the platform as an undeniable acquisition target or lay the groundwork for massive marketing scaling.
### Technical Success
- **Cost Architecture Viability:** Maintaining strict control over AI API costs through the "Host-Pays" billing architecture and intelligent smart routing (e.g., DeepSeek V4 Flash).
- **BYOK Adoption:** Achieving a high adoption rate of the Bring Your Own Key (BYOK) feature among power users to effectively zero out API costs for the most demanding workloads.
### Measurable Outcomes
- **Activation Rate:** % of new users completing a Semantic Search or joining a Brainstorm Canvas within 24h.
- **WAU to MAU Ratio:** High ratio demonstrating strong stickiness around advanced AI features.
- **K-Factor (Virality):** Number of new users acquired per active user via shared links.
- **Margin Protection:** Average AI API cost per active user maintained strictly below the targeted baseline, offset by BYOK adoption.
## Product Scope
### MVP - Minimum Viable Product
- Core note-taking interface with vector-based Semantic Search.
- Real-time collaborative Brainstorm Canvas (D3 radial graph).
- Document Parsing (Chat-with-PDF) and Automated Task Extraction natively integrated into the workspace.
- Foundational "AI Discovery Pack" (Freemium lifetime limits).
- The "Host-Pays" billing engine and BYOK configuration for power users.
### Growth Features (Post-MVP)
- Advanced Autonomous Agents (Scraper, Researcher, Monitor).
- Enhanced viral loops via branded PPTX/Markdown exports.
### Vision (Future)
- A fully autonomous "Second Brain" ecosystem that anticipates user needs, automatically structures knowledge, and serves as the ultimate cognitive partner for both individuals and enterprise R&D departments.
## User Journeys
### 1. The Power User (BYOK & Autonomous Ecosystem)
**Persona:** Alex, an independent data science researcher analyzing dense PDFs, frustrated by arbitrary SaaS API limits.
**Opening Scene:** Alex discovers Memento through a watermark on a shared presentation. They sign up and are granted the Freemium "AI Discovery Pack."
**Rising Action:** Alex uploads a complex 50-page PDF and uses the Chat-with-PDF feature to extract methodologies. The AI's responses are rapid and accurate. Because Alex is doing heavy research, they quickly exhaust their Discovery Pack token limits.
**Climax (The "Aha!" Moment):** Instead of hitting a hard paywall that locks them out, Memento elegantly prompts them to input their own LLM API key (BYOK). Alex pastes their DeepSeek key, and instantly, they are back to querying at near-zero marginal cost, entirely avoiding a rigid $20/mo subscription.
**Resolution:** Alex fully adopts Memento as their "Second Brain", deploying autonomous Scraper agents to monitor new Arxiv papers directly into their semantic search index.
### 2. The Enterprise Team Lead (Host-Pays Collaboration)
**Persona:** Sarah, an R&D Lead struggling to synthesize team ideas using scattered static docs.
**Opening Scene:** Sarah creates a new Memento Note, seeds it with a product architecture problem, and launches a "Brainstorm Canvas" session. She shares the session link with 4 team members.
**Rising Action:** The team members join instantly with zero friction (no need to upgrade their own accounts). They begin generating "Disruptions" and "Analogies" on the D3 radial graph.
**Climax:** The LLM Router seamlessly handles the concurrent requests using Sarah's Pro tier limits (the "Host-Pays" principle). Everyone experiences premium AI generation without hitting individual paywalls or errors.
**Resolution:** The session yields a structured radial map of ideas. Sarah selects the best nodes and uses Automated Task Extraction to instantly generate actionable tickets. The team has their plan, and Memento has organically acquired 4 new engaged users.
### 3. The System Administrator (Cost & Limit Management)
**Persona:** David, Memento's internal Ops Administrator protecting the AI API margins.
**Opening Scene:** A sudden spike in AI token usage triggers an alert on David's dashboard.
**Rising Action:** David investigates and identifies a shared Brainstorm session with heavy activity. He needs to verify that the system's margin isn't bleeding.
**Climax:** David checks the hybrid Redis/PostgreSQL entitlement system via the admin panel. He confirms that the Host-Pays logic is functioning as designed: the session host's AI Discovery Pack was exhausted, and the system gracefully degraded the guests' capabilities or prompted the host to upgrade or switch to BYOK.
**Resolution:** Margin is protected. David logs a successful automated defense against API abuse.
### Journey Requirements Summary
These journeys reveal the following critical capabilities we must build:
- **BYOK Management:** Secure UI for API key input (AES-256-GCM encryption) and dynamic LLM routing fallbacks.
- **Entitlement & Quotas:** High-performance Redis-backed usage tracking for the AI Discovery Pack, with clear UI prompts upon exhaustion.
- **Real-Time Collaboration:** Robust Socket.io session management, frictionless guest-join flows, and strict role-based "Host-Pays" billing logic.
- **Agent Orchestration:** UI to configure, schedule, and view outputs of autonomous agents natively in the workspace.
## Domain-Specific Requirements
### Compliance & Regulatory
- **Data Residency & Privacy:** Strict EU-only data storage options (or configurable regional storage). Strict zero-data-retention agreements must be enforced via APIs to ensure R&D IP is never used for LLM training.
- **SOC2 Roadmap:** The architecture must be designed from day one with SOC2 Type II compliance in mind to pass enterprise vendor security assessments.
### Technical Constraints & Security
- **BYOK API Key Security:** Users' personal LLM keys must be strictly encrypted at rest (e.g., AES-256-GCM) and securely transmitted only to the LLM Router.
- **Real-Time State & Quotas:** Managing high-frequency WebSocket events (Brainstorm Canvas) alongside strict Redis-backed rate limiting to enforce the "Host-Pays" rules without dropping concurrent connections.
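A minimal sketch of the at-rest encryption described above, using Node's built-in `crypto` module. The helper names (`encryptApiKey`, `decryptApiKey`), the `iv.tag.ciphertext` storage format, and the way the master key is supplied are assumptions for illustration, not the project's actual implementation:

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from 'node:crypto'

// Hypothetical helper: encrypts a user's LLM key with a 32-byte master key.
// Output format "iv.tag.ciphertext" (base64 parts) is an assumption.
export function encryptApiKey(plainKey: string, masterKey: Buffer): string {
  const iv = randomBytes(12) // 96-bit nonce, the usual size for GCM
  const cipher = createCipheriv('aes-256-gcm', masterKey, iv)
  const ciphertext = Buffer.concat([cipher.update(plainKey, 'utf8'), cipher.final()])
  const tag = cipher.getAuthTag() // authentication tag, checked on decrypt
  return [iv, tag, ciphertext].map(part => part.toString('base64')).join('.')
}

// Hypothetical helper: throws if the key is wrong or the data was tampered with.
export function decryptApiKey(stored: string, masterKey: Buffer): string {
  const [iv, tag, ciphertext] = stored.split('.').map(part => Buffer.from(part, 'base64'))
  const decipher = createDecipheriv('aes-256-gcm', masterKey, iv)
  decipher.setAuthTag(tag)
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString('utf8')
}
```

A fresh random nonce per encryption matters here: GCM loses its guarantees if a nonce is ever reused with the same key.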
### Integration Requirements
- **LLM Provider Agnosticism:** The AI Router must gracefully handle rate limits, downtimes, and differing token-counting methodologies across OpenAI, Google, DeepSeek, and MiniMax APIs.
- **SSO / SAML Integration:** Mandatory support for enterprise identity providers (Okta, Azure AD, Google Workspace) for Business/Enterprise tiers.
### Risk Mitigations
- **Comprehensive Audit Logging:** Workspace administrators must have access to full audit logs detailing who accessed which notes, and specifically what data was processed by which AI agent/provider.
## Innovation & Novel Patterns
### Detected Innovation Areas
1. **Financial Architecture (The "Host-Pays" + BYOK Model):** Memento solves the SaaS AI unit economics paradox. By shifting all collaborative AI generation costs exclusively to the session host's quotas, while simultaneously offering a zero-margin Bring-Your-Own-Key (BYOK) escape hatch, the platform eliminates the traditional per-seat LLM paywall friction that stifles viral growth.
2. **Autonomous Agent Ecosystem Native to PKM:** Integrating Scraper, Researcher, and Monitor agents directly into the note environment. Instead of users manually pulling data into their notes, the "Second Brain" actively structures and retrieves knowledge via vector-based semantic search.
3. **Radial Graph-Based AI Brainstorming:** Moving away from linear chat interfaces (like standard ChatGPT) to a multi-directional, real-time D3 radial graph, where ideas expand outwards in "Waves" (Variations, Analogies, Disruptions).
### Market Context & Competitive Landscape
Traditional PKM tools (like Notion or Obsidian) either charge heavy flat-rate AI add-ons ($10-$20/mo) or require highly technical, fragile plugin setups for local models. Conversely, standard whiteboard tools (Miro, FigJam) offer AI generation but lack the deep semantic connection to a user's personal knowledge base. Memento occupies the blue ocean between an enterprise collaboration whiteboard and an autonomous research assistant.
### Validation Approach
- **BYOK Adoption Rate:** Track the percentage of users who, upon exhausting their Freemium "AI Discovery Pack," successfully provision their own API key rather than churning.
- **Viral Coefficient (K-Factor) from Host-Pays:** Measure the organic acquisition rate specifically generated by guests joining frictionless, Host-Pays Brainstorm sessions.
### Risk Mitigation
- **Risk:** The Host-Pays model leads to rapid quota exhaustion for the host, causing frustration and churn.
**Mitigation:** Clear, real-time UI indicators of token consumption during a shared session, and automatic graceful degradation (e.g., smart routing to cheaper models like DeepSeek V4 Flash) before a hard block.
- **Risk:** Power users abuse the BYOK implementation to overload system databases.
**Mitigation:** Strict server-side rate limiting on WebSocket connections and database writes, even for BYOK users, to protect infrastructure stability.
## SaaS Web Application Specific Requirements
### Project-Type Overview
Memento is a B2B and B2C SaaS platform serving as a multi-tenant personal knowledge management system. It requires complex state synchronization, robust role-based access controls for collaborative sessions, and an advanced hybrid billing architecture.
### Technical Architecture Considerations
- **Frontend:** React + Next.js App Router, using D3.js for the Brainstorm radial graph and React Query for state management.
- **Backend & Database:** Node.js (Next.js API Routes) with PostgreSQL (managed via Prisma) for persistent data storage.
- **Real-Time Layer:** A dedicated Socket.io server (port 3002) handling the low-latency collaborative Canvas.
- **AI Infrastructure:** A custom LLM Router (`lib/ai/router.ts`) supporting BYOK (AES-256-GCM encrypted keys) and dynamic fallbacks across 13 independent providers (including OpenAI, Google Gemini, Anthropic, DeepSeek, OpenRouter, Mistral, Ollama, ZAI, LM Studio, MiniMax, etc.) to ensure zero vendor lock-in.
### Tenant Model & Data Residency
- Users operate within isolated workspaces.
- Enterprise tenants require strict EU-only data storage or configurable regional storage to meet data residency compliance.
### RBAC Matrix & "Host-Pays" Permissions
- **Session Host (Owner):** Controls session initiation, AI token expenditure, and BYOK overrides.
- **Collaborator (Guest):** Can interact and generate ideas, but AI queries are routed through the Host's quota or BYOK limits.
- **Workspace Admin:** Access to comprehensive audit logs detailing user access and specific AI agent processing history.
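The "Host-Pays" rule in this matrix can be sketched as a pure function: whoever sends the AI request, the charge lands on the session host, through the host's own key when one is configured and through the host's quota otherwise. The type and function names below are hypothetical and only illustrate the rule:

```typescript
// Hypothetical shape of the data the router would need about a session.
interface BrainstormSession {
  hostUserId: string
  hostHasByokKey: boolean
}

interface BillingDecision {
  billedUserId: string // always the host under Host-Pays
  source: 'byok' | 'quota'
  requestedBy: string // kept for audit logging
}

// Resolves who pays for an AI request made inside a shared session.
export function resolveBilling(session: BrainstormSession, requesterId: string): BillingDecision {
  return {
    billedUserId: session.hostUserId,
    source: session.hostHasByokKey ? 'byok' : 'quota',
    requestedBy: requesterId,
  }
}
```

Keeping `requestedBy` separate from `billedUserId` lets audit logs show which guest triggered a request while the cost still goes to the host.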
### Subscription Tiers
1. **Basic (Free):** Includes the "AI Discovery Pack" (lifetime usage limits) to drive conversion without monthly paywalls.
2. **Pro:** Standard monthly limits for power individuals, with BYOK capabilities to bypass limits.
3. **Business / Enterprise:** Pooled token limits, SSO/SAML integration (Okta, Azure AD), and advanced data residency controls.
### Integration List
- **Identity:** SSO / SAML (Okta, Azure AD, Google Workspace).
- **Payment:** Stripe billing integration for Pro/Business tiers.
- **AI Providers:** Extensive multi-model support featuring 13 independent providers (OpenAI, Anthropic, Gemini, DeepSeek, OpenRouter, local models via Ollama/LM Studio, etc.).
## Project Scoping & Phased Development
### MVP Strategy & Philosophy
**MVP Approach:** Value-Driven PLG. The MVP focuses on delivering an immediate "Aha!" moment through the "Second Brain" core loop (Chat-with-PDF, semantic search, task extraction) and leveraging organic virality via the frictionless, "Host-Pays" Brainstorm Canvas.
**Resource Requirements:** A lean cross-functional team consisting of Next.js/React frontend engineers, a D3.js visualization specialist, and a backend/AI integration engineer to handle the custom LLM Router and WebSocket infrastructure.
### MVP Feature Set (Phase 1)
**Core User Journeys Supported:**
- The Power User (BYOK & Semantic Search)
- The Enterprise Team Lead (Host-Pays Collaboration)
- The System Administrator (Cost & Limit Management)
**Must-Have Capabilities:**
- Core note-taking interface with vector-based Semantic Search.
- Real-time collaborative Brainstorm Canvas (D3 radial graph).
- Document Parsing (Chat-with-PDF) and Automated Task Extraction natively integrated into the workspace.
- The "Host-Pays" billing engine and BYOK configuration (AES-256-GCM encryption).
- High-performance Redis-backed usage tracking for the "AI Discovery Pack".
### Post-MVP Features
**Phase 2 (Growth Features):**
- Advanced Autonomous Agents (Scraper, Researcher, Monitor) with native orchestration UI.
- Enhanced viral loops via branded PPTX/Markdown exports.
- SSO / SAML Integration for Business/Enterprise scale-out.
**Phase 3 (Vision / Expansion):**
- A fully autonomous "Second Brain" ecosystem that proactively anticipates user needs and structures knowledge.
- SOC2 Type II compliance implementation and comprehensive enterprise vendor security controls.
### Risk Mitigation Strategy
**Technical Risks:** WebSocket connection exhaustion and database write overload during highly active Brainstorm sessions.
*Mitigation:* Strict server-side rate limiting, optimized Redis state caching, and fallback to lightweight AI models (e.g., DeepSeek V4 Flash) under heavy load.
**Market Risks:** Power users churning when Freemium limits are reached.
*Mitigation:* The frictionless BYOK escape hatch that instantly restores functionality at zero marginal cost without forcing a paid subscription.
**Resource Risks:** The complexity of integrating 13 different AI APIs delays the MVP launch.
*Mitigation:* Utilize an aggregation layer (like OpenRouter) for the "long-tail" of models while building robust, direct API integrations only for the core, high-volume providers (OpenAI, DeepSeek, Gemini).
## Functional Requirements
### Workspace & Knowledge Management
- FR1: Users can create, read, update, and delete rich-text notes within their workspace.
- FR2: Users can execute natural language semantic searches across their entire personal knowledge base.
- FR3: Users can organize notes within isolated workspaces and nested hierarchies.
- FR4: Users can upload PDF documents and extract text for AI contextual analysis (Chat-with-PDF).
- FR5: Users can invoke automated task extraction on any note to generate structured, actionable to-do lists.
- FR6: Users can leverage one-shot AI to automatically generate contextual tags and titles for their notes (Auto-Tagging / Auto-Titling).
- FR7: The system will proactively detect and surface semantic connections between disconnected notes in the background ("Memory Echo") to stimulate serendipitous discovery.
### Real-Time Collaboration (Brainstorming)
- FR8: Hosts can initialize a real-time collaborative Brainstorm session derived from an existing note.
- FR9: Guests can join a shared Brainstorm session via a frictionless sharing link without requiring an account.
- FR10: Users can visually map and generate ideas on a multi-directional radial graph (Canvas).
- FR11: Users can prompt AI within the shared session to generate specific ideation "Waves" (Variations, Analogies, Disruptions).
- FR12: Users can export the completed Brainstorm canvas to structured formats (Markdown, Branded PPTX).
### Billing & Entitlements
- FR13: Users can monitor their remaining "AI Discovery Pack" or subscription usage limits via a real-time UI indicator.
- FR14: Users can input, update, and securely store their own third-party LLM API keys (BYOK) to bypass platform limits.
- FR15: Hosts can assume financial responsibility (token consumption) for all AI queries executed by guests within their active shared sessions.
- FR16: Users can upgrade to paid subscription tiers (Pro, Business, Enterprise) via an integrated payment gateway.
### AI Routing & Orchestration
- FR17: Users can dynamically switch between supported AI providers (OpenAI, DeepSeek, Gemini, OpenRouter, etc.) for specific generation tasks.
- FR18: Administrators can configure smart-routing fallback rules to default to specific models under heavy load or quota exhaustion.
- FR19: Users can configure, schedule, and view the outputs of autonomous background agents (Scraper, Researcher, Monitor).
### Enterprise Administration & Security
- FR20: Enterprise users can authenticate via Single Sign-On (SSO / SAML).
- FR21: Workspace administrators can view comprehensive audit logs detailing user access events and specific AI provider utilization.
- FR22: Workspace administrators can configure strict data residency requirements (e.g., EU-only storage) for their tenant.
- FR23: Administrators can programmatically enforce zero-data-retention headers/flags for all outbound third-party AI API requests.
## Non-Functional Requirements
### Performance
- **NFR-P1 (Real-Time Latency):** Brainstorm Canvas WebSocket events (e.g., node creation, cursor movement) must propagate to all connected clients within 150ms under normal network conditions.
- **NFR-P2 (Search Speed):** Vector-based semantic search queries must return initial results within 800ms for personal knowledge bases containing up to 10,000 notes.
- **NFR-P3 (AI Routing):** The internal LLM Router must evaluate "Host-Pays" and BYOK rules and dispatch the prompt to the external provider within 50ms of receiving the user request.
### Security & Privacy
- **NFR-S1 (Encryption):** All user-provided LLM API keys (BYOK) must be encrypted at rest using AES-256-GCM.
- **NFR-S2 (Data Residency):** The architecture must support configurable regional database deployments, guaranteeing EU-only data storage for specified enterprise tenants.
- **NFR-S3 (Auditability):** The system must log 100% of LLM API requests, retaining anonymized provider routing and token consumption metrics for a minimum of 1 year to support future SOC2 compliance audits.
### Scalability
- **NFR-SC1 (Collaborative Sessions):** A single Brainstorm session must gracefully support up to 50 concurrent active users without degrading the 150ms latency target.
- **NFR-SC2 (Rate Limiting):** The Redis-backed entitlement system must process usage quota checks in under 10ms, supporting up to 5,000 concurrent verifications per second globally.
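A minimal sketch of the check-and-consume step behind NFR-SC2, assuming a counter store whose `incrBy` is atomic (as the Redis `INCRBY` command is). The `CounterStore` interface, the in-memory stand-in, and the key format are illustrative assumptions, not the project's implementation:

```typescript
// Minimal store contract; with Redis this would map onto INCRBY.
interface CounterStore {
  incrBy(key: string, amount: number): Promise<number>
}

// In-memory stand-in so the sketch runs without a Redis server.
export class InMemoryCounterStore implements CounterStore {
  private counters: Record<string, number> = {}
  async incrBy(key: string, amount: number): Promise<number> {
    const next = (this.counters[key] ?? 0) + amount
    this.counters[key] = next
    return next
  }
}

export interface QuotaResult {
  allowed: boolean
  used: number
  remaining: number
}

// Increments first, then rolls back if the limit was exceeded, so two
// concurrent requests cannot both slip under the limit.
export async function consumeQuota(
  store: CounterStore,
  userId: string,
  cost: number,
  limit: number
): Promise<QuotaResult> {
  const key = `quota:discovery:${userId}` // hypothetical key naming
  const used = await store.incrBy(key, cost)
  if (used > limit) {
    const rolledBack = await store.incrBy(key, -cost)
    return { allowed: false, used: rolledBack, remaining: Math.max(0, limit - rolledBack) }
  }
  return { allowed: true, used, remaining: limit - used }
}
```

The single increment is the only round trip on the happy path, which is what keeps a check like this compatible with a sub-10ms budget.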
### Reliability & Resilience
- **NFR-R1 (Graceful Degradation):** If a primary AI provider (e.g., OpenAI) returns a 429 (Rate Limit) or 500-series error, the LLM Router must automatically fall back to the designated secondary provider (e.g., DeepSeek) within 1.5 seconds.
- **NFR-R2 (Availability):** The core note-taking interface, database reads/writes, and offline access must maintain 99.9% uptime, functioning independently of any third-party AI provider outages.
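The fallback rule in NFR-R1 can be sketched as follows. This is a simplified illustration under stated assumptions: providers are modeled as plain async functions, errors are assumed to carry a numeric `status` field, and the 1.5-second budget is not enforced here. None of these names come from `lib/ai/router.ts`:

```typescript
// A provider is modeled as a function from prompt to completion text (assumption).
type ProviderCall = (prompt: string) => Promise<string>

// Treats 429 and 5xx as retryable; anything else is surfaced to the caller.
function isRetryable(err: unknown): boolean {
  if (typeof err !== 'object' || err === null) return false
  const status = (err as { status?: unknown }).status
  return typeof status === 'number' && (status === 429 || (status >= 500 && status <= 599))
}

export async function completeWithFallback(
  prompt: string,
  primary: ProviderCall,
  secondary: ProviderCall
): Promise<{ text: string; provider: 'primary' | 'secondary' }> {
  try {
    return { text: await primary(prompt), provider: 'primary' }
  } catch (err) {
    // Client errors (e.g. 400) would fail on any provider, so they are rethrown.
    if (!isRetryable(err)) throw err
    return { text: await secondary(prompt), provider: 'secondary' }
  }
}
```

Returning which provider answered makes it straightforward to log routing decisions for the audit requirement in NFR-S3.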

697
docs/spec-document-qa.md Normal file

@@ -0,0 +1,697 @@
# Technical Specification: Document Parsing & Q&A (PDF Analysis)
## A. Prisma Schema Updates
### A1. `NoteAttachment` Model
Stores files attached to a note (PDFs, images, documents).
```prisma
model NoteAttachment {
  id        String   @id @default(cuid())
  noteId    String
  fileName  String
  fileType  String   // "application/pdf", "image/png", etc.
  fileSize  Int      // in bytes
  filePath  String   // local path: data/uploads/attachments/{noteId}/{uuid}.pdf
  mimeType  String   // duplicates fileType for fast queries
  status    String   @default("pending") // pending → processing → ready | failed
  pageCount Int?     // number of pages (PDF only)
  error     String?  // error message when status is "failed"
  createdAt DateTime @default(now())
  updatedAt DateTime @updatedAt

  note   Note            @relation(fields: [noteId], references: [id], onDelete: Cascade)
  chunks DocumentChunk[]

  @@index([noteId])
  @@index([status])
}
```
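The `status` column follows a small lifecycle. A minimal sketch of a transition guard, assuming the four states named in the schema comment and assuming that `ready` and `failed` are both terminal (a retry path from `failed` may well exist in practice). `canTransition` and the transition table are hypothetical:

```typescript
type AttachmentStatus = 'pending' | 'processing' | 'ready' | 'failed'

// Assumed lifecycle: pending → processing → ready | failed.
const ALLOWED_TRANSITIONS: Record<AttachmentStatus, AttachmentStatus[]> = {
  pending: ['processing'],
  processing: ['ready', 'failed'],
  ready: [],
  failed: [],
}

// Returns true when moving from `from` to `to` is a legal step.
export function canTransition(from: AttachmentStatus, to: AttachmentStatus): boolean {
  return ALLOWED_TRANSITIONS[from].indexOf(to) !== -1
}
```

A guard of this kind could be checked before each status update so that a late background job cannot overwrite a settled status.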
### A2. `DocumentChunk` Model
Vectorized fragments of a document. Each chunk is linked to an attachment and, transitively, to a note.
```prisma
model DocumentChunk {
  id           String   @id @default(cuid())
  attachmentId String
  content      String   // chunk text (~800 characters, about 200 tokens; see B3)
  chunkIndex   Int      // ordinal position in the document (0, 1, 2…)
  pageNumber   Int?     // source page (for citations)
  startChar    Int?     // start character offset in the extracted text
  endChar      Int?     // end character offset
  metadata     String?  // JSON: { heading, section, tableCaption… }
  embedding    Unsupported("vector(1536)")?
  createdAt    DateTime @default(now())

  attachment NoteAttachment @relation(fields: [attachmentId], references: [id], onDelete: Cascade)

  @@index([attachmentId])
  @@index([attachmentId, chunkIndex])
}
```
### A3. Addition to `Note`
```prisma
model Note {
  // … existing fields …
  attachments NoteAttachment[]
}
```
### A4. Raw SQL Migration: HNSW Index for DocumentChunk
```sql
-- Add to the Prisma migration (migration.sql)
CREATE INDEX IF NOT EXISTS "DocumentChunk_embedding_hnsw_idx"
  ON "DocumentChunk" USING hnsw ("embedding" vector_cosine_ops)
  WITH (m = 16, ef_construction = 64);
```
---
## B. Ingestion Pipeline (Chunking & Embeddings)
### B1. Pipeline Architecture
```
PDF upload → NoteAttachment (status: pending)
  → pdf-parse extraction (raw text + page metadata)
  → Structural chunking (800 chars, overlap 200, page boundaries respected)
  → DocumentChunk.create (content, chunkIndex, pageNumber, metadata)
  → Batch embeddings (batches of 20)
  → SQL UPDATE embedding on each chunk
  → NoteAttachment.update (status: ready)
```
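The batching step of the pipeline can be sketched with a small generic helper. `toBatches` is a hypothetical name used only for illustration; the ingestion service in B4 performs the same slicing inline:

```typescript
// Splits a list into consecutive batches of at most `size` items.
export function toBatches<T>(items: T[], size: number): T[][] {
  if (size <= 0) throw new Error('size must be positive')
  const batches: T[][] = []
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size))
  }
  return batches
}
```

With a batch size of 20, a document of 45 chunks would result in three embedding calls instead of 45.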
### B2. Extraction Service: `document-extraction.service.ts`
```typescript
// lib/ai/services/document-extraction.service.ts
import fs from 'fs'
import pdf from 'pdf-parse'

interface ExtractedPage {
  pageNumber: number
  text: string
}

interface ExtractedDocument {
  pages: ExtractedPage[]
  totalPages: number
  metadata: { title?: string; author?: string }
}

export class DocumentExtractionService {
  async extractPdf(filePath: string): Promise<ExtractedDocument> {
    const dataBuffer = await fs.promises.readFile(filePath)
    const pages: ExtractedPage[] = []

    // pdf-parse does not expose per-page text directly, so a custom
    // `pagerender` callback collects it. Pages are rendered in order.
    const renderPage = async (pageData: any): Promise<string> => {
      const textContent = await pageData.getTextContent()
      let lastY: number | undefined
      let text = ''
      for (const item of textContent.items) {
        // Items sharing a baseline (transform[5]) belong to the same line
        if (lastY === undefined || lastY === item.transform[5]) text += item.str
        else text += '\n' + item.str
        lastY = item.transform[5]
      }
      pages.push({ pageNumber: pages.length + 1, text })
      return text
    }

    // Single pass over the file: `max: 0` parses every page.
    // The cast covers type definitions that declare a synchronous callback.
    const data = await pdf(dataBuffer, { max: 0, pagerender: renderPage as any })

    return {
      pages,
      totalPages: data.numpages,
      metadata: {
        title: data.info?.Title,
        author: data.info?.Author,
      },
    }
  }
}

export const documentExtractionService = new DocumentExtractionService()
```
### B3. Chunking Strategy: `document-chunking.service.ts`
**Principles:**
1. **Target size**: 800 characters (~200 tokens), with a 200-character overlap
2. **Page boundaries respected**: a chunk NEVER spans two pages. If a cut falls in the middle of a page, it is adjusted.
3. **Sections respected**: headings (lines in UPPERCASE or prefixed with `#`, `##`) start a new chunk
4. **Contextual overlap**: the last 200 characters of chunk N are repeated at the start of chunk N+1
5. **Tables**: kept whole in a single chunk if < 1500 chars, otherwise split by row with the header repeated
```typescript
// lib/ai/services/document-chunking.service.ts
interface ChunkInput {
  text: string
  pageNumber: number
}

interface DocumentChunkData {
  content: string
  chunkIndex: number
  pageNumber: number
  startChar: number
  endChar: number
  metadata?: string
}

export class DocumentChunkingService {
  private readonly CHUNK_SIZE = 800
  private readonly OVERLAP = 200
  // Table threshold from principle 5; chunk() does not apply it yet
  private readonly MAX_CHUNK_SIZE = 1500

  chunk(pages: ChunkInput[]): DocumentChunkData[] {
    const chunks: DocumentChunkData[] = []
    let globalIndex = 0

    for (const page of pages) {
      const text = page.text.trim()
      if (!text) continue

      // Split into sections (by headings or paragraphs)
      const sections = this.splitSections(text)
      // The buffer restarts empty on each page so a chunk never spans two pages
      let buffer = ''
      // Approximate offset of the buffer within the page text
      let bufferStart = 0

      for (const section of sections) {
        if (buffer.length + section.length > this.CHUNK_SIZE && buffer.length > 0) {
          // Flush the buffer as a chunk
          chunks.push({
            content: buffer.trim(),
            chunkIndex: globalIndex++,
            pageNumber: page.pageNumber,
            startChar: bufferStart,
            endChar: bufferStart + buffer.length,
          })
          // Overlap: carry the last OVERLAP chars into the next chunk,
          // which therefore starts OVERLAP chars before the previous end
          const tail = buffer.slice(-this.OVERLAP)
          bufferStart += buffer.length - tail.length
          buffer = tail + '\n' + section
        } else {
          buffer += (buffer ? '\n\n' : '') + section
        }
      }

      // Flush the remainder of the page
      if (buffer.trim()) {
        chunks.push({
          content: buffer.trim(),
          chunkIndex: globalIndex++,
          pageNumber: page.pageNumber,
          startChar: bufferStart,
          endChar: bufferStart + buffer.length,
        })
      }
    }
    return chunks
  }

  private splitSections(text: string): string[] {
    const lines = text.split('\n')
    const sections: string[] = []
    let current = ''
    for (const line of lines) {
      // A heading is a Markdown-style "#" line or a line written in capitals
      const isHeading = /^(#{1,6}\s|[A-Z][A-Z\s]{5,}$)/.test(line.trim())
      if (isHeading && current.trim()) {
        sections.push(current.trim())
        current = line
      } else {
        current += (current ? '\n' : '') + line
      }
    }
    if (current.trim()) sections.push(current.trim())
    return sections
  }
}

export const documentChunkingService = new DocumentChunkingService()
```
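Principle 5 (tables) is stated in the list of principles but not shown in the service. A minimal sketch of that rule, assuming the table has already been detected and split into lines with the header row first; `chunkTable` is a hypothetical helper, not part of the service:

```typescript
// Keeps a small table whole; otherwise splits it by row and repeats the
// header at the top of every chunk so each chunk stays interpretable.
export function chunkTable(rows: string[], maxChars = 1500): string[] {
  if (rows.length === 0) return []
  const whole = rows.join('\n')
  if (whole.length < maxChars) return [whole]

  const header = rows[0]
  const chunks: string[] = []
  let current = header
  let rowsInCurrent = 0
  for (const row of rows.slice(1)) {
    // Start a new chunk when adding this row would exceed the limit
    if (rowsInCurrent > 0 && current.length + 1 + row.length > maxChars) {
      chunks.push(current)
      current = header
      rowsInCurrent = 0
    }
    current += '\n' + row
    rowsInCurrent++
  }
  if (rowsInCurrent > 0) chunks.push(current)
  return chunks
}
```

A single row longer than `maxChars` still ends up in its own chunk; this sketch does not split inside a row.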
### B4. Orchestrating Ingestion Service: `document-ingestion.service.ts`
```typescript
// lib/ai/services/document-ingestion.service.ts
// `prisma` and `embeddingService` come from the project's existing modules
// (their import paths are not specified in this document).
import { documentExtractionService } from './document-extraction.service'
import { documentChunkingService } from './document-chunking.service'

export class DocumentIngestionService {
  async ingest(attachmentId: string): Promise<void> {
    const attachment = await prisma.noteAttachment.findUnique({
      where: { id: attachmentId },
    })
    if (!attachment) throw new Error('Attachment not found')

    await prisma.noteAttachment.update({
      where: { id: attachmentId },
      data: { status: 'processing' },
    })

    try {
      // 1. Extraction
      const extracted = await documentExtractionService.extractPdf(attachment.filePath)
      await prisma.noteAttachment.update({
        where: { id: attachmentId },
        data: { pageCount: extracted.totalPages },
      })

      // 2. Chunking
      const chunkInputs = extracted.pages.map(p => ({
        text: p.text,
        pageNumber: p.pageNumber,
      }))
      const chunks = documentChunkingService.chunk(chunkInputs)

      // 3. Create the chunks in the DB (without embeddings)
      const created = await Promise.all(
        chunks.map(c =>
          prisma.documentChunk.create({
            data: {
              attachmentId,
              content: c.content,
              chunkIndex: c.chunkIndex,
              pageNumber: c.pageNumber,
              startChar: c.startChar,
              endChar: c.endChar,
              metadata: c.metadata,
            },
          })
        )
      )

      // 4. Batch embeddings (batches of 20)
      const BATCH_SIZE = 20
      for (let i = 0; i < created.length; i += BATCH_SIZE) {
        const batch = created.slice(i, i + BATCH_SIZE)
        const texts = batch.map(c => c.content)
        const embeddings = await embeddingService.generateBatchEmbeddings(texts)
        // Values are passed as bound parameters ($1, $2), never interpolated
        await Promise.all(
          batch.map((chunk, idx) =>
            prisma.$executeRawUnsafe(
              `UPDATE "DocumentChunk" SET embedding = $1::vector WHERE id = $2`,
              embeddingService.toVectorString(embeddings[idx].embedding),
              chunk.id
            )
          )
        )
      }

      // 5. Mark as ready
      await prisma.noteAttachment.update({
        where: { id: attachmentId },
        data: { status: 'ready' },
      })
    } catch (error: unknown) {
      // Record the failure on the attachment, then let the caller see it
      const message = error instanceof Error ? error.message : String(error)
      await prisma.noteAttachment.update({
        where: { id: attachmentId },
        data: { status: 'failed', error: message },
      })
      throw error
    }
  }
}

export const documentIngestionService = new DocumentIngestionService()
```
### B5. Upload API Route
```typescript
// app/api/notes/[noteId]/attachments/route.ts
export async function POST(
  req: Request,
  { params }: { params: Promise<{ noteId: string }> }
) {
  const session = await auth()
  if (!session?.user?.id) return unauthorized()

  const { noteId } = await params

  // The note must belong to the caller before anything is written to disk
  const note = await prisma.note.findFirst({
    where: { id: noteId, userId: session.user.id },
    select: { id: true },
  })
  if (!note) return error('Note not found')

  const formData = await req.formData()
  const file = formData.get('file')

  // Validation
  if (!(file instanceof File)) return error('Missing file')
  if (file.size > 20 * 1024 * 1024) return error('File too large (max 20MB)')
  if (file.type !== 'application/pdf') return error('Only PDF supported')

  // Save the file
  const dir = `data/uploads/attachments/${noteId}`
  fs.mkdirSync(dir, { recursive: true })
  const filePath = path.join(dir, `${uuid()}.pdf`)
  fs.writeFileSync(filePath, Buffer.from(await file.arrayBuffer()))

  // Create the attachment
  const attachment = await prisma.noteAttachment.create({
    data: {
      noteId,
      fileName: file.name,
      fileType: file.type,
      fileSize: file.size,
      filePath,
      mimeType: file.type,
      status: 'pending',
    },
  })

  // Start ingestion in the background. ingest() already records failures on
  // the attachment, so the rejection is only logged here.
  setImmediate(() => {
    documentIngestionService
      .ingest(attachment.id)
      .catch(err => console.error('Document ingestion failed', err))
  })

  return NextResponse.json({ success: true, data: attachment })
}
```
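Because ingestion runs in the background, the upload response returns while the attachment is still `pending`, and a client has to poll until the status settles. A minimal sketch of bounded polling, assuming a caller-supplied `fetchStatus` function (hypothetical; the status endpoint is not defined in this section) and illustrative defaults for the interval and attempt limit:

```typescript
type AttachmentStatus = 'pending' | 'processing' | 'ready' | 'failed'

interface PollOptions {
  intervalMs?: number
  maxAttempts?: number
  sleep?: (ms: number) => Promise<void> // injectable for tests
}

// Polls until the attachment reaches a terminal status, and gives up after
// `maxAttempts` so the client never polls forever.
export async function waitForAttachment(
  fetchStatus: () => Promise<AttachmentStatus>,
  options: PollOptions = {}
): Promise<AttachmentStatus> {
  const intervalMs = options.intervalMs ?? 2000
  const maxAttempts = options.maxAttempts ?? 60
  const sleep =
    options.sleep ?? ((ms: number) => new Promise<void>(resolve => setTimeout(resolve, ms)))

  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await fetchStatus()
    if (status === 'ready' || status === 'failed') return status
    await sleep(intervalMs)
  }
  throw new Error('Attachment ingestion did not finish in time')
}
```

The attempt cap is the important design choice: without it, an attachment stuck in `processing` would keep every open client polling indefinitely.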
---
## C. New Agent Tool Interface: `document_search`
### C1. Registration in the Registry
```typescript
// lib/ai/tools/document-search.tool.ts
toolRegistry.register({
  name: 'document_search',
  description: 'Search within PDF documents attached to notes. Returns relevant passages with page numbers and source document info.',
  isInternal: true,
  buildTool: (ctx) =>
    tool({
      description: `Search within PDF documents attached to the user's notes.
Returns matching passages with page numbers, chunk content, and the source note/document info.
Use this when the user asks about specific documents, PDFs, or attached files.
Can search across all documents or within a specific note's attachments.`,
      inputSchema: z.object({
        query: z.string().describe('The search query to find relevant passages in documents'),
        noteId: z.string().optional().describe('Optional: restrict search to attachments of a specific note'),
        limit: z.number().optional().describe('Max results to return (default 5)').default(5),
      }),
      execute: async ({ query, noteId, limit = 5 }) => {
        try {
          const queryEmbedding = await embeddingService.generateEmbedding(query)
          const vectorStr = embeddingService.toVectorString(queryEmbedding.embedding)

          // $1 = query vector, $2 = limit, $3 = owner; an optional filter becomes $4
          const userId = ctx.userId
          assertSafeId(userId, 'userId')
          const params: any[] = [vectorStr, limit, userId]

          // Push the value first so params.length is the placeholder index
          let noteFilter = ''
          if (noteId) {
            assertSafeId(noteId, 'noteId')
            params.push(noteId)
            noteFilter = `AND na."noteId" = $${params.length}`
          } else if (ctx.notebookId) {
            assertSafeId(ctx.notebookId, 'notebookId')
            params.push(ctx.notebookId)
            noteFilter = `AND n."notebookId" = $${params.length}`
          }

          // `<=>` is pgvector's cosine distance; it operates on the vector
          // column directly, which also lets the HNSW index be used
          const results = await prisma.$queryRawUnsafe(
            `SELECT
              dc.id as "chunkId",
              dc.content,
              dc."pageNumber",
              dc."chunkIndex",
              dc.metadata,
              na.id as "attachmentId",
              na."fileName",
              na."pageCount",
              na."noteId",
              n.title as "noteTitle",
              dc.embedding <=> $1::vector as distance
            FROM "DocumentChunk" dc
            JOIN "NoteAttachment" na ON na.id = dc."attachmentId"
            JOIN "Note" n ON n.id = na."noteId"
            WHERE dc.embedding IS NOT NULL
              AND na.status = 'ready'
              AND n."trashedAt" IS NULL
              AND n."userId" = $3
              ${noteFilter}
            ORDER BY dc.embedding <=> $1::vector
            LIMIT $2`,
            ...params
          ) as any[]

          if (!results.length) return { results: [], message: 'No matching documents found' }

          // Keep only close matches and expose a 0..1 similarity score
          const threshold = 0.5
          return {
            results: results
              .filter(r => r.distance < threshold)
              .map(r => ({
                content: r.content.substring(0, 600),
                pageNumber: r.pageNumber,
                chunkIndex: r.chunkIndex,
                fileName: r.fileName,
                noteId: r.noteId,
                noteTitle: r.noteTitle || 'Untitled',
                score: Math.max(0, 1 - r.distance),
              })),
          }
        } catch (e: any) {
          return { error: `Document search failed: ${e.message}` }
        }
      },
    }),
})
```
### C2. Auto-Registration
Added to `lib/ai/tools/index.ts`:
```typescript
import './document-search.tool'
```
### C3. Activation in Chat
Update `buildToolsForChat` in `registry.ts`:
```typescript
buildToolsForChat(ctx: ToolContext): Tool[] {
  const tools: Tool[] = []
  tools.push(this.build('note_search', ctx))
  tools.push(this.build('note_read', ctx))
  tools.push(this.build('document_search', ctx)) // <-- NEW
  if (ctx.webSearch) {
    tools.push(this.build('web_search', ctx))
    tools.push(this.build('web_scrape', ctx))
  }
  return tools
}
```
---
## D. RAG Query Logic
### D1. Extended Hybrid Search: `semantic-search.service.ts`
Adds a `searchWithDocuments` method that combines notes AND document chunks:
```typescript
async searchWithDocuments(
  userId: string,
  query: string,
  options?: SearchOptions & { noteId?: string; includeDocuments?: boolean }
): Promise<(SearchResult & { source?: 'note' | 'document'; pageNumber?: number; fileName?: string })[]> {
  const includeDocuments = options?.includeDocuments !== false

  // Phase 1: existing note search (FTS + pgvector + RRF)
  const noteResults = await this.searchAsUser(userId, query, options)

  // Phase 2: search in documents (pgvector only)
  let documentResults: any[] = []
  if (includeDocuments) {
    const queryEmbedding = await embeddingService.generateEmbedding(query)
    const vectorStr = embeddingService.toVectorString(queryEmbedding.embedding)

    // $1 = query vector, $2 = candidate limit, $3 = owner
    const params: any[] = [vectorStr, 50, userId]
    let noteFilter = ''
    if (options?.noteId) {
      assertSafeId(options.noteId, 'noteId')
      noteFilter = `AND na."noteId" = $${params.length + 1}`
      params.push(options.noteId)
    }
    if (options?.notebookId) {
      assertSafeId(options.notebookId, 'notebookId')
      noteFilter += ` AND n."notebookId" = $${params.length + 1}`
      params.push(options.notebookId)
    }

    // `<=>` is pgvector's cosine distance, applied to the vector column directly
    documentResults = await prisma.$queryRawUnsafe(
      `SELECT
        dc.content,
        dc."pageNumber",
        na."fileName",
        na."noteId",
        n.title as "noteTitle",
        1 - (dc.embedding <=> $1::vector) as score
      FROM "DocumentChunk" dc
      JOIN "NoteAttachment" na ON na.id = dc."attachmentId"
      JOIN "Note" n ON n.id = na."noteId"
      WHERE dc.embedding IS NOT NULL
        AND na.status = 'ready'
        AND n."trashedAt" IS NULL
        AND n."userId" = $3
        ${noteFilter}
      ORDER BY dc.embedding <=> $1::vector
      LIMIT $2`,
      ...params
    ) as any[]
  }

  // Phase 3: RRF fusion between notes and documents
  const K = 60
  const fused = new Map<string, any>()
  for (let i = 0; i < noteResults.length; i++) {
    const r = noteResults[i]
    fused.set(r.noteId, {
      ...r,
      source: 'note',
      rrfScore: 1 / (K + i + 1),
    })
  }
  for (let i = 0; i < documentResults.length; i++) {
    const r = documentResults[i]
    // Document keys are prefixed so they never overwrite a note entry
    const key = `doc_${r.noteId}_${r.pageNumber}_${i}`
    fused.set(key, {
      noteId: r.noteId,
      title: `${r.noteTitle || 'Untitled'} / ${r.fileName} (p.${r.pageNumber})`,
      content: r.content.substring(0, 500),
      score: r.score,
      matchType: 'related',
      source: 'document',
      pageNumber: r.pageNumber,
      fileName: r.fileName,
      rrfScore: 1 / (K + i + 1),
    })
  }

  return Array.from(fused.values())
    .sort((a, b) => b.rrfScore - a.rrfScore)
    .slice(0, options?.limit || 20)
}
```
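The fusion step in `searchWithDocuments` scores each result by its rank within its own list, using `1 / (K + rank + 1)` with `K = 60`. The sketch below isolates that idea as a pure function so it can be reasoned about without a database. The names `Ranked`, `Fused`, and `fuseByRank` are hypothetical and do not exist in the codebase. One deliberate difference, stated as an assumption: when the same key appears in several lists, this sketch sums the contributions (the usual form of reciprocal rank fusion), whereas the method above never has colliding keys because document keys are prefixed with `doc_`.

```typescript
// Hypothetical, self-contained sketch of rank-based fusion (RRF).
// `Ranked`, `Fused` and `fuseByRank` are illustrative names only.
interface Ranked {
  key: string
  source: 'note' | 'document'
}

interface Fused extends Ranked {
  rrfScore: number
}

function fuseByRank(lists: Ranked[][], k = 60, limit = 20): Fused[] {
  const fused = new Map<string, Fused>()
  for (const list of lists) {
    list.forEach((item, rank) => {
      // rank is 0-based, so the top item of each list contributes 1 / (k + 1)
      const contribution = 1 / (k + rank + 1)
      const existing = fused.get(item.key)
      if (existing) {
        existing.rrfScore += contribution
      } else {
        fused.set(item.key, { ...item, rrfScore: contribution })
      }
    })
  }
  return Array.from(fused.values())
    .sort((a, b) => b.rrfScore - a.rrfScore)
    .slice(0, limit)
}

const notes: Ranked[] = [
  { key: 'n1', source: 'note' },
  { key: 'n2', source: 'note' },
]
const docs: Ranked[] = [{ key: 'doc_n1_3_0', source: 'document' }]
const merged = fuseByRank([notes, docs])
```

Because the score depends only on rank, the best note and the best document chunk tie at `1 / 61`, so the two sources interleave rather than one list dominating the other.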
### D2. Prioritization logic in the RAG Chat
Update `app/api/chat/route.ts`:
```typescript
// In the chat handler, before injecting the context:
let contextNotes = ''
// Does the user mention a document/PDF?
const documentMention = userMessage.match(
  /\b(pdf|document|fichier|pi[eè]ce jointe|attachment|file)\b/i
)
// Does the user refer to the current note/document? (French phrasing only)
const specificNote = userMessage.match(
  /(?:dans|sur|de|du|la|le) (?:cette note|ce document|cette page)/i
)
if (specificNote && currentNoteId) {
  // TARGETED MODE: search ONLY in this note's documents
  const docResults = await semanticSearchService.searchWithDocuments(
    userId, userMessage, { noteId: currentNoteId, includeDocuments: true, limit: 5 }
  )
  contextNotes = docResults.map(r =>
    r.source === 'document'
      ? `[DOCUMENT: ${r.fileName} p.${r.pageNumber}]\n${r.content}`
      : `[NOTE: ${r.title}]\n${r.content}`
  ).join('\n\n---\n\n')
} else {
  // GLOBAL MODE: extended search across notes + documents
  const results = await semanticSearchService.searchWithDocuments(
    userId, userMessage, { notebookId, includeDocuments: !!documentMention, limit: 10 }
  )
  contextNotes = results.map(r =>
    r.source === 'document'
      ? `[DOCUMENT: ${r.fileName} p.${r.pageNumber}]\n${r.content}`
      : `[NOTE: ${r.title}]\n${r.content}`
  ).join('\n\n---\n\n')
}
```
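The branch selection in the handler can be factored into a pure function, which makes the two modes easy to unit-test without calling the search service. The following is a minimal sketch, not the project's implementation: `SearchPlan` and `planSearch` are hypothetical names, the two regular expressions are copied from the handler, and the sketch assumes that targeted mode requires a current note id.

```typescript
// Hypothetical helper: `SearchPlan` and `planSearch` are illustrative names.
// The two regular expressions are copied from the chat handler.
const DOCUMENT_MENTION = /\b(pdf|document|fichier|pi[eè]ce jointe|attachment|file)\b/i
const SPECIFIC_NOTE = /(?:dans|sur|de|du|la|le) (?:cette note|ce document|cette page)/i

interface SearchPlan {
  mode: 'targeted' | 'global'
  noteId?: string
  includeDocuments: boolean
  limit: number
}

function planSearch(userMessage: string, currentNoteId?: string): SearchPlan {
  // Targeted mode: the message points at "this note/document" and a note is open
  if (SPECIFIC_NOTE.test(userMessage) && currentNoteId) {
    return { mode: 'targeted', noteId: currentNoteId, includeDocuments: true, limit: 5 }
  }
  // Global mode: document chunks are only searched when the message mentions one
  return {
    mode: 'global',
    includeDocuments: DOCUMENT_MENTION.test(userMessage),
    limit: 10,
  }
}

const targeted = planSearch('Que dit le texte sur ce document ?', 'note_1')
const globalWithDocs = planSearch('Résume le pdf')
const globalNotesOnly = planSearch('Quelles sont mes tâches ?')
```

Note that `SPECIFIC_NOTE` only recognizes French phrasing, so an English message such as "in this note" falls through to global mode.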
### D3. Updated system prompt
```typescript
const systemPrompt = `You are Memento's Note AI, the intelligent note-taking assistant.
AVAILABLE CONTEXTS:
- [NOTE: title] → content of one of the user's notes
- [DOCUMENT: file.pdf p.X] → passage extracted from a PDF attached to a note
RULES FOR DOCUMENTS:
- Always cite the file name and page number when you refer to a document
- If the user asks about "this document" or "the PDF", base your answer only on the [DOCUMENT] passages
- If the passages are insufficient, say so clearly rather than guessing
- For tables and numerical data, reproduce them faithfully
...`
```
### D4. SQL: debug / test query
```sql
-- Test: search the chunks of attached documents by vector distance
SELECT
  dc.content,
  dc."pageNumber",
  dc."chunkIndex",
  na."fileName",
  n.title as note_title,
  dc.embedding::text <=> '[0.01, 0.02, ...]'::vector as distance
FROM "DocumentChunk" dc
JOIN "NoteAttachment" na ON na.id = dc."attachmentId"
JOIN "Note" n ON n.id = na."noteId"
WHERE na.status = 'ready'
  AND n."trashedAt" IS NULL
ORDER BY dc.embedding::text <=> '[0.01, 0.02, ...]'::vector
LIMIT 10;
```
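The `'[0.01, 0.02, ...]'` placeholder in the debug query stands for a pgvector text literal: a bracketed, comma-separated list of numbers. A minimal sketch of building such a literal from an embedding array is shown below. `toVectorLiteral` is a hypothetical helper written for illustration; the actual `embeddingService.toVectorString` is not shown in this document and may behave differently.

```typescript
// Hypothetical helper, for illustration only: formats a number array as a
// pgvector text literal such as "[0.01,0.02,0.03]".
function toVectorLiteral(values: number[]): string {
  if (values.length === 0) {
    throw new Error('embedding must not be empty')
  }
  for (const v of values) {
    // Reject NaN and Infinity so a malformed literal never reaches the query
    if (!Number.isFinite(v)) {
      throw new Error(`invalid embedding component: ${v}`)
    }
  }
  return `[${values.join(',')}]`
}

const literal = toVectorLiteral([0.01, 0.02, 0.03])
// literal === '[0.01,0.02,0.03]'
```

When pasting into the debug query, the literal must have the same number of dimensions as the stored embeddings, otherwise the distance operator raises an error.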
---
## Summary of files to create/modify
| Action | File |
|---|---|
| **CREATE** | `prisma/migrations/XXX_add_note_attachment_document_chunk/migration.sql` |
| **MODIFY** | `prisma/schema.prisma`: add NoteAttachment, DocumentChunk, relation on Note |
| **CREATE** | `lib/ai/services/document-extraction.service.ts` |
| **CREATE** | `lib/ai/services/document-chunking.service.ts` |
| **CREATE** | `lib/ai/services/document-ingestion.service.ts` |
| **CREATE** | `lib/ai/tools/document-search.tool.ts` |
| **MODIFY** | `lib/ai/tools/index.ts`: add the document-search import |
| **MODIFY** | `lib/ai/tools/registry.ts`: add document_search to buildToolsForChat |
| **CREATE** | `app/api/notes/[noteId]/attachments/route.ts`: upload |
| **CREATE** | `app/api/notes/[noteId]/attachments/[attachmentId]/route.ts`: GET status, DELETE |
| **MODIFY** | `lib/ai/services/semantic-search.service.ts`: add searchWithDocuments |
| **MODIFY** | `app/api/chat/route.ts`: document context in the RAG |

60
docs/sprint-status.yaml Normal file

@@ -0,0 +1,60 @@
# generated: 2026-05-14T16:06:50Z
# last_updated: 2026-05-14T16:06:50Z
# project: Momento
# project_key: NOKEY
# tracking_system: file-system
# story_location: docs
# STATUS DEFINITIONS:
# ==================
# Epic Status:
# - backlog: Epic not yet started
# - in-progress: Epic actively being worked on
# - done: All stories in epic completed
#
# Epic Status Transitions:
# - backlog → in-progress: Automatically when first story is created (via create-story)
# - in-progress → done: Manually when all stories reach 'done' status
#
# Story Status:
# - backlog: Story only exists in epic file
# - ready-for-dev: Story file created in stories folder
# - in-progress: Developer actively working on implementation
# - review: Ready for code review (via Dev's code-review workflow)
# - done: Story completed
#
# Retrospective Status:
# - optional: Can be completed but not required
# - done: Retrospective has been completed
#
# WORKFLOW NOTES:
# ===============
# - Epic transitions to 'in-progress' automatically when first story is created
# - Stories can be worked in parallel if team capacity allows
# - Developer typically creates next story after previous one is 'done' to incorporate learnings
# - Dev moves story to 'review', then runs code-review (fresh context, different LLM recommended)
generated: 2026-05-14T16:06:50Z
last_updated: 2026-05-14T18:30:00Z
project: Momento
project_key: NOKEY
tracking_system: file-system
story_location: docs
development_status:
  epic-3: in-progress
  3-1-freemium-quota-tracking: ready-for-dev
  3-2-custom-llm-router: backlog
  3-3-smart-routing-fallback: backlog
  3-4-host-pays-session-logic: backlog
  3-5-secure-byok-management: backlog
  3-6-stripe-subscription-tiers: backlog
  epic-3-retrospective: optional
  epic-4: backlog
  4-1-gdpr-cookie-consent: backlog
  4-2-gdpr-right-to-be-forgotten: backlog
  4-3-data-portability: backlog
  4-4-explicit-ai-consent: backlog
  4-5-eu-data-residency: backlog
  4-6-sso-saml-audit-logging: backlog
  epic-4-retrospective: optional


@@ -0,0 +1,108 @@
---
stepsCompleted: [1, 2, 3, 4]
inputDocuments:
- docs/prd.md
- docs/epics.md
- memento-note/docs/brainstorm-documentation.md
- memento-note/docs/saas-deployment-prep.md
---
# UX Design Specification Momento
**Author:** devparsa
**Date:** 2026-05-14
---
<!-- UX design content will be appended sequentially through collaborative workflow steps -->
## Executive Summary
### Project Vision
*(Commercial Integration Audit)*: The core product (Editor, PDF, Vectors) is complete and functional. The sole objective of this UX phase is "Sales Packaging": grafting on the monetization levers (Quotas, BYOK) and removing the European legal blockers (GDPR), while integrating strictly into the current Ethereal Precision v2 interface and adding no further productivity features.
### Target Users
- **Freemium Users (conversion targets):** Must be confronted with their usage limits (Redis quotas) explicitly and with deliberate friction in the navigation (Sidebar) in order to trigger the upgrade.
- **B2B Customers / Administrators (buyers):** Prioritize profitability and legal safety. Before committing, they require a clear UI for controlling API costs (BYOK) and strict management of GDPR consents.
### Key Design Challenges
- **Cost Control (BYOK):** Integrate a BYOK key management panel into the Dashboard's Global Settings. The UI must allow adding, rotating, and validating external LLM keys (secured with AES-256-GCM) without disrupting the flow of non-admin users.
- **Conversion & Quotas (Redis):** Integrate a visual consumption component (e.g., a "Discovery Pack" gauge or counter) into the Footer of the existing Sidebar. This component must update in real time (the Redis-backed quota check is specified to resolve in under 10 ms, NFR-SC2) and serve as the primary entry point to the purchase funnel (Stripe).
- **European Compliance (GDPR & Cookies):**
  1. Implement the Cookie consent banner (a legal blocker) from the very first visit.
  2. Create an AI consent interface (NFR-GDPR4) that blocks personal data from being sent to third-party APIs until explicit opt-in has been given.
  3. Add a clear "Permanent Deletion" button to the account settings ("Right to be Forgotten").
### Design Opportunities
- **The "Just-in-Time" Upgrade:** Turn the exhaustion of Redis quotas into an immediate purchase opportunity through clear contextual modals, rather than a plain error message.
- **Compliance as a Sales Tool:** Design a highly professional-looking "Privacy & Compliance" modal that immediately reassures B2B buyers about strict compliance with European law (EU Data Residency, Zero Retention).
## Core User Experience
### Defining Experience
*(Monetization and Compliance Journey)*: The core experience is defined by how smoothly the user moves through the legal (GDPR) and commercial (BYOK, Quotas) steps without leaving their main workflow (Editor / Canvas).
### Platform Strategy
*(Integration into the Existing Dashboard)*: Critical interactions are delivered exclusively through overlay components (Modals, Toasts) or dedicated areas of the existing layout (Sidebar Footer, Configuration Drawer). No new standalone page is created.
### Effortless Interactions
*(Critical Flows: Money and Law)*
**1. Legal Onboarding Flow (Cookies & AI)**
- **Trigger:** The user's first login, or a click on an AI feature (e.g., "Ideation Wave").
- **Interaction:**
  - **Cookies:** A thin, persistent banner at the bottom of the screen with an "Accept Essentials" or "Manage" button.
  - **AI Consent (NFR-GDPR4):** A contextual micro-modal shown when the AI action is clicked. It offers an "Allow sending to third-party AI" button with a checkbox to remember the choice ("Don't ask again").
- **Zero Friction:** Consent is requested "Just-in-Time", exactly where the user sees its value, without blocking them at login.
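The just-in-time AI consent described above reduces to a small gate in front of every AI action. The sketch below is one possible shape for it, under stated assumptions: `StoredConsent`, `consentGate`, and `applyPromptAnswer` are hypothetical names, and where the remembered choice is persisted (cookie, database, local storage) is left open.

```typescript
// Hypothetical sketch of the AI consent gate (NFR-GDPR4). All names here are
// illustrative; persistence of the remembered choice is out of scope.
interface StoredConsent {
  aiThirdParty: boolean
}

type GateDecision = 'proceed' | 'prompt'

// Called when the user clicks an AI action.
function consentGate(stored: StoredConsent | null): GateDecision {
  return stored?.aiThirdParty ? 'proceed' : 'prompt'
}

interface PromptOutcome {
  run: boolean // whether the AI request may be sent now
  persist: StoredConsent | null // what to store, if "Don't ask again" was ticked
}

// Called when the user answers the micro-modal.
function applyPromptAnswer(authorized: boolean, remember: boolean): PromptOutcome {
  if (!authorized) {
    // No opt-in: nothing is sent to the third-party API and nothing is stored
    return { run: false, persist: null }
  }
  return { run: true, persist: remember ? { aiThirdParty: true } : null }
}

const firstClick = consentGate(null)
const afterRemember = consentGate(applyPromptAnswer(true, true).persist)
```

A refusal is intentionally not remembered in this sketch, so the modal reappears on the next AI click; whether that matches the intended product behavior is an open design question.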
**2. BYOK Configuration Flow**
- **Trigger:** A click on "Manage my keys" from the quota indicator (Sidebar) or the B2B administration interface.
- **Interaction:** A side panel (Settings Drawer) opens, where the user pastes the API key (protected with AES-256-GCM).
- **Feedback:** The key is validated live. Once validated, a persistent visual badge (e.g., "BYOK Mode Active" with a padlock icon) appears in the header of the application or of the Sidebar.
- **Zero Friction:** Once the key is validated, the user never has to choose the payment mode manually; the LLM Router switches automatically and transparently.
**3. Quota Exhaustion Flow (Freemium Conversion)**
- **Trigger:** AI tokens are consumed as the product is used.
- **Interaction:** At the bottom of the existing Sidebar, a progress gauge micro-component (Discovery Pack) fills up.
- **State transition:**
  - *Normal:* Discreet gauge (grey/blue).
  - *Blocked (100%):* The gauge turns red. The next AI click no longer sends the request but instantly opens an overlay modal ("Upgrade to Pro" or "Add a BYOK key").
- **Zero Friction:** The blocking modal leads directly to the Stripe payment popup, shown as an overlay. No redirection.
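The state transition above, combined with the automatic BYOK switch from the previous flow, can be sketched as a pair of pure functions that the Sidebar gauge and the AI buttons could share. This is an illustrative sketch only: `QuotaSnapshot`, `gaugeState`, `gaugePercent`, and `onAiClick` are hypothetical names, and the snapshot is assumed to come from the Redis-backed quota check.

```typescript
// Hypothetical sketch of the Discovery Pack gauge logic. Names are illustrative.
interface QuotaSnapshot {
  used: number
  limit: number
  hasValidByokKey: boolean
}

type GaugeState = 'normal' | 'blocked'
type ClickAction = 'run-request' | 'open-upgrade-modal'

function gaugeState(q: QuotaSnapshot): GaugeState {
  return q.used >= q.limit ? 'blocked' : 'normal'
}

// Fill percentage for the gauge, clamped to 0..100.
function gaugePercent(q: QuotaSnapshot): number {
  if (q.limit <= 0) return 100
  return Math.max(0, Math.min(100, Math.round((q.used / q.limit) * 100)))
}

function onAiClick(q: QuotaSnapshot): ClickAction {
  // With a validated BYOK key the router uses the user's own key,
  // so the freemium quota no longer blocks the request.
  if (q.hasValidByokKey) return 'run-request'
  return gaugeState(q) === 'blocked' ? 'open-upgrade-modal' : 'run-request'
}

const halfway: QuotaSnapshot = { used: 25, limit: 50, hasValidByokKey: false }
const exhausted: QuotaSnapshot = { used: 50, limit: 50, hasValidByokKey: false }
const exhaustedWithKey: QuotaSnapshot = { ...exhausted, hasValidByokKey: true }
```

Keeping the decision in one function means the gauge color and the click behavior cannot drift apart: both are derived from the same snapshot.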
### Critical Success Moments
- **Legality Validated:** The user clicks the AI action, grants consent in one click, and sees the result in the editor without interruption.
- **B2B Autonomy (BYOK):** The administrator pastes their key, sees the visual confirmation ("BYOK Mode Active"), and immediately sees their blocked requests released.
- **Instant Conversion:** Exceeding the quota opens the Stripe purchase as an overlay; the user pays, and the Sidebar gauge turns green again instantly without a page reload.
### Experience Principles
- **Just-in-Time Compliance:** Legal consent is requested at the exact moment of the action to maximize the acceptance rate.
- **Zero-Redirect Conversion:** Every purchase (Stripe) or configuration (BYOK) step happens without leaving the active page.
- **Visual Assurance:** The payment status (BYOK vs. SaaS Quota) stays visible at the periphery without eating into the editing space or adding cognitive load.
## Technical Visual Integration Anchors
### Visual Integration (Absolute Rule)
- Use **exclusively** the CSS classes and theme variables (colors, fonts, spacing) of the current design.
- **No visual invention.** No new styles. The commercial components slot strictly into the available gaps in the existing code.
### Quota Placement (Conversion)
- **Component:** Token counter / Redis gauge.
- **Anchor point:** Injected directly into the **footer of the existing Sidebar**.
### BYOK Placement (Cost Control)
- **Component:** Secure input field.
- **Anchor point:** Added to the **current Settings** screen.
- **Technical behavior:** Typed characters must be masked. Validation happens through a **silent API ping** in the background, without a reload.
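Masking while typing is normally handled by the input field itself. If a previously saved key also needs to be shown in the Settings screen, one common approach is to reveal only its last few characters. The helper below is a sketch of that idea and an assumption on top of the specification: `maskApiKey` is a hypothetical name, and the choice of four visible characters is arbitrary.

```typescript
// Hypothetical display helper: hides all but the last `visible` characters
// of a stored API key. Not part of the specification above.
function maskApiKey(key: string, visible = 4): string {
  if (key.length <= visible) {
    // Very short values are hidden entirely rather than revealed in full
    return '•'.repeat(key.length)
  }
  return '•'.repeat(key.length - visible) + key.slice(-visible)
}

const masked = maskApiKey('sk-test-1234567890')
```

The masked string keeps the original length, which lets the user recognize a key by its size and suffix without exposing the secret itself.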
### Legal Placement (EU Compliance)
- **Cookies:** Banner anchored to the bottom of the main interface.
- **AI Consent:** Modal triggered **only** on the first click on the AI tool, without interrupting the overall flow before that action.