office_translator/alembic/versions/a1b2c3d4e5f6_set_multilingual_target_language.py
sepehr c66252bed4
feat: mark glossary templates as multilingual — support 11 target languages
Templates enriched by enrich_glossary_templates.py already contain
translations for de, es, it, pt, nl, ru, ja, ko, zh, ar, fa (including
Persian). But they were labeled FR→EN, causing incorrect filtering and
warnings when translating to other languages.

Changes:
- index.json: set target_lang='multi' for all 8 templates
- GlossarySelector: treat target_language='multi' as compatible with
  any target language (no false warnings, auto-select works)
- GlossarySelector: display '🌐 MULTILINGUE' badge instead of EN flag
- glossary_routes: default target_language to 'multi' instead of 'en'
- Migration: detect existing multilingual glossaries in DB (5+ keys in
  translations JSON) and set their target_language to 'multi'

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 22:32:27 +02:00
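
The compatibility rule described in the commit message (a glossary whose target language is 'multi' matches any requested target language) can be sketched as below. This is only an illustration: `is_compatible` and `MULTI` are hypothetical names, the case-insensitive comparison is an assumption, and the real check lives in GlossarySelector, whose code is not shown here.

```python
MULTI = "multi"  # marker value the commit uses for multilingual glossaries


def is_compatible(glossary_target: str, requested_target: str) -> bool:
    """Hypothetical helper: can this glossary serve the requested target language?"""
    glossary_target = glossary_target.lower()
    # A multilingual glossary is treated as compatible with every target language.
    if glossary_target == MULTI:
        return True
    # Otherwise the glossary's target must match the requested one (assumed rule).
    return glossary_target == requested_target.lower()
```

With this rule, a 'multi' glossary no longer triggers a mismatch warning for targets such as fa or de, while a glossary labeled 'en' still matches only en.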

55 lines
1.7 KiB
Python

"""Set multilingual glossaries target_language to 'multi'

Revision ID: a1b2c3d4e5f6
Revises: e5b2c9d1f4a8
Create Date: 2026-05-31

Glossary templates that were enriched with multilingual translations
(via enrich_glossary_templates.py) contain translations for 11 languages
(de, es, it, pt, nl, ru, ja, ko, zh, ar, fa) in each term's translations
field. These should be marked as target_language='multi' instead of 'en'.

This migration detects glossaries whose terms have multilingual translations
and sets their target_language to 'multi'.
"""
from typing import Sequence, Union

from alembic import op
import sqlalchemy as sa

# revision identifiers
revision = "a1b2c3d4e5f6"
down_revision = "e5b2c9d1f4a8"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None


def upgrade() -> None:
    # Glossaries with terms containing 5+ translation keys are multilingual templates
    # (enriched glossaries have 11 translations: de, es, it, pt, nl, ru, ja, ko, zh, ar, fa)
    op.execute("""
        UPDATE glossaries
        SET target_language = 'multi'
        WHERE id IN (
            SELECT DISTINCT g.id
            FROM glossaries g
            JOIN glossary_terms gt ON gt.glossary_id = g.id
            WHERE gt.translations IS NOT NULL
              AND jsonb_typeof(gt.translations) = 'object'
              AND (
                  SELECT count(*)
                  FROM jsonb_object_keys(gt.translations)
              ) >= 5
        )
    """)


def downgrade() -> None:
    # Revert multilingual glossaries back to 'en'
    op.execute("""
        UPDATE glossaries
        SET target_language = 'en'
        WHERE target_language = 'multi'
    """)
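
The detection heuristic used by `upgrade()` can be restated in plain Python, which may help when reasoning about which glossaries the migration will touch. This is a sketch, not part of the migration: `term_is_multilingual`, `glossary_is_multilingual`, and `MIN_TRANSLATION_KEYS` are hypothetical names, and the input is assumed to be each term's `translations` value already decoded from JSON.

```python
from typing import Any, Iterable

MIN_TRANSLATION_KEYS = 5  # threshold taken from the migration's SQL (>= 5)


def term_is_multilingual(translations: Any) -> bool:
    # Mirrors the SQL conditions: value is not NULL, is a JSON object,
    # and has at least MIN_TRANSLATION_KEYS top-level keys.
    return isinstance(translations, dict) and len(translations) >= MIN_TRANSLATION_KEYS


def glossary_is_multilingual(term_translations: Iterable[Any]) -> bool:
    # One qualifying term is enough, matching the JOIN + SELECT DISTINCT in the SQL.
    return any(term_is_multilingual(t) for t in term_translations)
```

Because a single term with five or more keys flags the whole glossary, a mostly monolingual glossary containing one enriched term would also be set to 'multi' under this rule.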