feat: migrate semantic search to pgvector + full-text search
All checks were successful
Deploy to Production / Build and Deploy (push) Successful in 2m12s

Replace JSON-string embeddings with native pgvector(1536) storage and
add PostgreSQL full-text search (tsvector/GIN) with Reciprocal Rank Fusion
for hybrid keyword + semantic ranking.

Changes:
- NoteEmbedding.embedding: String → vector(1536) via pgvector
- NoteEmbedding: added updatedAt for reindex tracking
- Note: added tsv (tsvector) with auto-update trigger for FTS
- semantic-search.service: hybrid FTS + vector search with RRF fusion
- embedding.service: toVectorString() for pgvector SQL literals
- Removed JS-side cosine similarity loops (now DB-side via <=>)
- Added HNSW index on NoteEmbedding.embedding (cosine distance)
- Added GIN index on Note.tsv for FTS queries

Schema migration in: prisma/migrations/20260512120000_pgvector_and_fts_search/

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This commit is contained in:
Antigravity
2026-05-12 07:03:56 +00:00
parent 92c3a6f307
commit 03e6a62b80
43 changed files with 4024 additions and 786 deletions

View File

@@ -155,6 +155,7 @@ model Note {
languageConfidence Float?
lastAiAnalysis DateTime?
trashedAt DateTime?
tsv Unsupported("tsvector")?
aiFeedback AiFeedback[]
memoryEchoAsNote1 MemoryEchoInsight[] @relation("EchoNote1")
memoryEchoAsNote2 MemoryEchoInsight[] @relation("EchoNote2")
@@ -299,8 +300,9 @@ model UserAISettings {
model NoteEmbedding {
id String @id @default(cuid())
noteId String @unique
embedding String
embedding Unsupported("vector(1536)")
createdAt DateTime @default(now())
updatedAt DateTime @updatedAt
note Note @relation(fields: [noteId], references: [id], onDelete: Cascade)
@@index([noteId])