📄 Document Translation API

A SaaS-ready Python API for translating structured documents (Excel, Word, PowerPoint) while preserving the original formatting, layout, and embedded media.

Features

🔄 Multiple Translation Providers

| Provider         | Type        | Description                                       |
|------------------|-------------|---------------------------------------------------|
| Google Translate | Cloud       | Free, fast, reliable                              |
| Ollama           | Local LLM   | Privacy-focused, customizable with system prompts |
| WebLLM           | Browser     | Runs entirely in browser using WebGPU             |
| DeepL            | Cloud       | High-quality translations (API key required)      |
| LibreTranslate   | Self-hosted | Open-source alternative                           |
| OpenAI           | Cloud       | GPT-4o/4o-mini with vision support                |

📊 Excel Translation (.xlsx)

  • Translates all cell content and sheet names
  • Preserves cell merging, formulas, and styles
  • Maintains font styles, colors, and borders
  • Image text extraction with vision models
  • Adds translated image text as comments

📝 Word Translation (.docx)

  • Translates body text, headers, footers, and tables
  • Preserves heading styles and paragraph formatting
  • Maintains lists, images, charts, and SmartArt
  • Image text extraction and translation

📽️ PowerPoint Translation (.pptx)

  • Translates slide titles, body text, and speaker notes
  • Preserves slide layouts, transitions, and animations
  • Image text extraction with text boxes added below images
  • Keeps layering order and positions

🧠 LLM Features (Ollama/WebLLM/OpenAI)

  • Custom System Prompts: Provide context for better translations
  • Technical Glossary: Define term mappings (e.g., batterie=coil)
  • Presets: HVAC, IT, Legal, Medical terminology
  • Vision Models: Translate text within images (gemma3, qwen3-vl, gpt-4o)

🏢 SaaS-Ready Features

  • 🚦 Rate Limiting: Per-client IP with token bucket and sliding window algorithms
  • 🔒 Security Headers: CSP, XSS protection, HSTS support
  • 🧹 Auto Cleanup: Automatic file cleanup with TTL tracking
  • 📊 Monitoring: Health checks, metrics, and system status
  • 🔐 Admin Dashboard: Secure admin panel with authentication
  • 📝 Request Logging: Structured logging with unique request IDs
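
As an illustration of the token bucket approach named in the rate limiting bullet, here is a minimal per-client sketch. The class and function names (TokenBucket, allow_request) are hypothetical and not taken from the project's middleware, and mapping a per-minute limit to bucket capacity and refill rate is an assumption.

```python
import time
from typing import Dict, Optional


class TokenBucket:
    """Illustrative token bucket; not the project's actual implementation."""

    def __init__(self, capacity: float, refill_per_second: float,
                 now: Optional[float] = None) -> None:
        self.capacity = float(capacity)            # maximum burst size
        self.refill_per_second = float(refill_per_second)
        self.tokens = float(capacity)              # start with a full bucket
        self.updated_at = time.monotonic() if now is None else now

    def allow(self, now: Optional[float] = None) -> bool:
        """Refill based on elapsed time, then try to spend one token."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity,
                          self.tokens + elapsed * self.refill_per_second)
        self.updated_at = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# One bucket per client IP (hypothetical in-memory store).
buckets: Dict[str, TokenBucket] = {}


def allow_request(client_ip: str, per_minute: int = 30,
                  now: Optional[float] = None) -> bool:
    """Assumption: a per-minute limit maps to capacity=per_minute, refilled evenly."""
    bucket = buckets.get(client_ip)
    if bucket is None:
        bucket = buckets[client_ip] = TokenBucket(per_minute, per_minute / 60.0, now)
    return bucket.allow(now)
```

The optional now argument exists only so the behaviour can be checked without sleeping; a real deployment with several workers would likely need a shared store rather than a process-local dict.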

🚀 Quick Start

Installation

# Clone the repository
git clone https://gitea.parsanet.org/sepehr/office_translator.git
cd office_translator

# Create virtual environment
python -m venv venv
.\venv\Scripts\Activate.ps1   # Windows (PowerShell)
# source venv/bin/activate    # Linux/macOS

# Install dependencies
pip install -r requirements.txt

# Run the API
python main.py

The API starts on http://localhost:8000

Frontend Setup

cd frontend
npm install
npm run dev

Frontend runs on http://localhost:3000

📚 API Documentation

🔧 API Endpoints

Translation

POST /translate

Translate a document with full customization.

curl -X POST "http://localhost:8000/translate" \
  -F "file=@document.xlsx" \
  -F "target_language=en" \
  -F "provider=ollama" \
  -F "ollama_model=gemma3:12b" \
  -F "translate_images=true" \
  -F "system_prompt=You are translating HVAC documents."
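
For callers who prefer Python to curl, the same multipart request could be built with the standard library alone. This is a sketch under assumptions: the helper names (encode_multipart, build_translate_request) are hypothetical, and only form fields that appear in the curl example are used.

```python
import urllib.request
import uuid
from pathlib import Path
from typing import Dict, Tuple


def encode_multipart(fields: Dict[str, str], file_field: str,
                     filename: str, file_bytes: bytes) -> Tuple[bytes, str]:
    """Encode text fields plus one file as multipart/form-data."""
    boundary = uuid.uuid4().hex
    chunks = []
    for name, value in fields.items():
        header = (f'--{boundary}\r\n'
                  f'Content-Disposition: form-data; name="{name}"\r\n\r\n')
        chunks.append(header.encode("utf-8") + str(value).encode("utf-8") + b"\r\n")
    file_header = (f'--{boundary}\r\n'
                   f'Content-Disposition: form-data; name="{file_field}"; '
                   f'filename="{filename}"\r\n'
                   'Content-Type: application/octet-stream\r\n\r\n')
    chunks.append(file_header.encode("utf-8") + file_bytes + b"\r\n")
    chunks.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(chunks), f"multipart/form-data; boundary={boundary}"


def build_translate_request(path: str, base_url: str = "http://localhost:8000",
                            **fields: str) -> urllib.request.Request:
    """Build (but do not send) a POST /translate request for the given file."""
    file_path = Path(path)
    body, content_type = encode_multipart(
        fields, "file", file_path.name, file_path.read_bytes())
    return urllib.request.Request(
        f"{base_url}/translate", data=body,
        headers={"Content-Type": content_type}, method="POST")
```

A request built with build_translate_request("document.xlsx", target_language="en", provider="ollama", ollama_model="gemma3:12b") can then be sent with urllib.request.urlopen. The shape of the response is not described in this README, so handling it is left to the caller.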

Monitoring

GET /health

Comprehensive health check with system status.

{
  "status": "healthy",
  "translation_service": "google",
  "memory": {"system_percent": 34.1, "system_available_gb": 61.7},
  "disk": {"total_files": 0, "total_size_mb": 0},
  "cleanup_service": {"is_running": true}
}
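
A monitoring script could turn this response into a pass/fail signal. The sketch below assumes the field names shown in the sample response; the function name health_issues is hypothetical, and the 80% memory threshold mirrors the MAX_MEMORY_PERCENT value from the sample configuration rather than anything the endpoint itself enforces.

```python
import json
from typing import Any, Dict, List

SAMPLE = json.loads("""
{
  "status": "healthy",
  "translation_service": "google",
  "memory": {"system_percent": 34.1, "system_available_gb": 61.7},
  "disk": {"total_files": 0, "total_size_mb": 0},
  "cleanup_service": {"is_running": true}
}
""")


def health_issues(payload: Dict[str, Any],
                  max_memory_percent: float = 80.0) -> List[str]:
    """Return a list of human-readable problems; an empty list means healthy."""
    issues = []
    if payload.get("status") != "healthy":
        issues.append(f"status is {payload.get('status')!r}")
    memory_percent = payload.get("memory", {}).get("system_percent")
    if memory_percent is not None and memory_percent > max_memory_percent:
        issues.append(f"memory usage {memory_percent}% exceeds {max_memory_percent}%")
    if not payload.get("cleanup_service", {}).get("is_running", False):
        issues.append("cleanup service is not running")
    return issues
```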

GET /metrics

System metrics and statistics.

GET /rate-limit/status

Current rate limit status for the requesting client.

Admin Endpoints (Authentication Required)

POST /admin/login

Login to admin dashboard.

curl -X POST "http://localhost:8000/admin/login" \
  -F "username=admin" \
  -F "password=your_password"

Response:

{
  "status": "success",
  "token": "your_bearer_token",
  "expires_in": 86400
}

GET /admin/dashboard

Get comprehensive dashboard data (requires Bearer token).

curl "http://localhost:8000/admin/dashboard" \
  -H "Authorization: Bearer your_token"
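
The login and dashboard calls can be chained in a short Python sketch using only the standard library. Several points are assumptions: the helper names are hypothetical, the login endpoint is assumed to accept URL-encoded form data as well as the multipart form that curl -F sends, and the dashboard response is assumed to be JSON.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8000"


def login_request(username: str, password: str,
                  base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST /admin/login request (URL-encoded form; an assumption)."""
    data = urllib.parse.urlencode(
        {"username": username, "password": password}).encode("ascii")
    return urllib.request.Request(f"{base_url}/admin/login", data=data, method="POST")


def token_from_login(body: bytes) -> str:
    """Extract the bearer token from a login response like the sample above."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise ValueError("admin login failed")
    return payload["token"]


def dashboard_request(token: str,
                      base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the GET /admin/dashboard request with the Authorization header."""
    return urllib.request.Request(
        f"{base_url}/admin/dashboard",
        headers={"Authorization": f"Bearer {token}"})


def fetch_dashboard(username: str, password: str) -> dict:
    """Log in, then fetch the dashboard (assumes a JSON response body)."""
    with urllib.request.urlopen(login_request(username, password)) as resp:
        token = token_from_login(resp.read())
    with urllib.request.urlopen(dashboard_request(token)) as resp:
        return json.loads(resp.read())
```

Because the sample login response reports expires_in as 86400 seconds, a long-running client would need to log in again roughly once a day.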

POST /admin/cleanup/trigger

Manually trigger file cleanup.

GET /admin/files/tracked

List currently tracked files.

🌐 Supported Languages

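| Code | Language      | Code | Language |
|------|---------------|------|----------|
| en   | English       | fr   | French   |
| fa   | Persian/Farsi | es   | Spanish  |
| de   | German        | it   | Italian  |
| pt   | Portuguese    | ru   | Russian  |
| zh   | Chinese       | ja   | Japanese |
| ko   | Korean        | ar   | Arabic   |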

⚙️ Configuration

Environment Variables (.env)

  1. Copy .env.example to .env: cp .env.example .env
  2. Fill required variables (see comments in .env.example: Required vs Optional).
  3. In production (ENV=production), the app fails at startup with a clear message listing any missing required variables (e.g. JWT_SECRET_KEY, ADMIN_USERNAME, ADMIN_PASSWORD or ADMIN_PASSWORD_HASH, ADMIN_TOKEN_SECRET, DATABASE_URL, and REDIS_URL if rate limiting is enabled) (Story 6.6, NFR10).

# ============== Translation Services ==============
TRANSLATION_SERVICE=google
DEEPL_API_KEY=your_deepl_api_key_here

# Ollama Configuration
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=llama3
OLLAMA_VISION_MODEL=llava

# ============== File Limits ==============
MAX_FILE_SIZE_MB=50

# ============== Rate Limiting (SaaS) ==============
RATE_LIMIT_ENABLED=true
RATE_LIMIT_PER_MINUTE=30
RATE_LIMIT_PER_HOUR=200
TRANSLATIONS_PER_MINUTE=10
TRANSLATIONS_PER_HOUR=50
MAX_CONCURRENT_TRANSLATIONS=5

# ============== Cleanup Service ==============
CLEANUP_ENABLED=true
CLEANUP_INTERVAL_MINUTES=15
FILE_TTL_MINUTES=60
INPUT_FILE_TTL_MINUTES=30
OUTPUT_FILE_TTL_MINUTES=120

# ============== Security ==============
# When behind Nginx (production), HSTS is set by the proxy; ENABLE_HSTS applies when running the app without a reverse proxy (e.g. dev).
ENABLE_HSTS=false
# Use "*" only for local development; set explicit origins in production (see .env.example).
CORS_ORIGINS=*

# ============== Admin Authentication ==============
ADMIN_USERNAME=admin
ADMIN_PASSWORD=changeme123  # Change in production!
# Or use a SHA256 hash:
# ADMIN_PASSWORD_HASH=your_sha256_hash

# ============== Monitoring ==============
LOG_LEVEL=INFO
ENABLE_REQUEST_LOGGING=true
MAX_MEMORY_PERCENT=80
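
The production startup check described in step 3 could look roughly like the following. This is a sketch, not the project's code: the function names are hypothetical, the variable list covers only the examples named above (the real list may be longer), and treating RATE_LIMIT_ENABLED as on by default is an assumption.

```python
import os
from typing import List, Mapping

# Only the variables this README names as examples; the real list may differ.
REQUIRED_IN_PRODUCTION = (
    "JWT_SECRET_KEY",
    "ADMIN_USERNAME",
    "ADMIN_TOKEN_SECRET",
    "DATABASE_URL",
)


def missing_required(env: Mapping[str, str]) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    if env.get("ENV") != "production":
        return []  # the check only applies in production
    missing = [name for name in REQUIRED_IN_PRODUCTION if not env.get(name)]
    # Either the plain password or its hash satisfies the requirement.
    if not (env.get("ADMIN_PASSWORD") or env.get("ADMIN_PASSWORD_HASH")):
        missing.append("ADMIN_PASSWORD or ADMIN_PASSWORD_HASH")
    # REDIS_URL is needed only when rate limiting is on (default is assumed).
    rate_limiting_on = env.get("RATE_LIMIT_ENABLED", "true").lower() == "true"
    if rate_limiting_on and not env.get("REDIS_URL"):
        missing.append("REDIS_URL")
    return missing


def check_environment(env: Mapping[str, str] = os.environ) -> None:
    """Exit with a message listing every missing variable at once."""
    missing = missing_required(env)
    if missing:
        raise SystemExit(
            "Missing required environment variables: " + ", ".join(missing))
```

Reporting all missing variables in one message, rather than failing on the first, saves a restart cycle per variable during deployment.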

Ollama Setup

# Install Ollama (Windows)
winget install Ollama.Ollama

# Pull a model (use the same name for OLLAMA_MODEL in .env or the ollama_model request field)
ollama pull llama3.2

# For vision/image translation
ollama pull gemma3:12b
# or
ollama pull qwen3-vl:8b

🎯 Using System Prompts & Glossary

Example: HVAC Translation

System Prompt:

You are translating HVAC technical documents.
Use precise technical terminology.
Keep unit measurements (kW, m³/h, Pa) unchanged.

Glossary:

batterie=coil
groupe froid=chiller
CTA=AHU (Air Handling Unit)
échangeur=heat exchanger
vanne 3 voies=3-way valve
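
To make the glossary format concrete, the sketch below parses term=translation lines into a dictionary and appends them to a system prompt. How the project actually combines the glossary with the prompt is not documented here, so both function names and the prompt wording are assumptions for illustration only.

```python
from typing import Dict


def parse_glossary(text: str) -> Dict[str, str]:
    """Parse 'source=target' lines; blank or malformed lines are skipped."""
    glossary: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        # Split on the first '=' only, so targets may contain '='.
        source, target = line.split("=", 1)
        glossary[source.strip()] = target.strip()
    return glossary


def build_system_prompt(base_prompt: str, glossary: Dict[str, str]) -> str:
    """Append glossary entries to the prompt (hypothetical wording)."""
    if not glossary:
        return base_prompt
    lines = [f"- {source} -> {target}" for source, target in glossary.items()]
    return (base_prompt.rstrip()
            + "\n\nUse these term translations:\n"
            + "\n".join(lines))
```

For example, parse_glossary("batterie=coil\ngroupe froid=chiller") yields a two-entry dictionary that build_system_prompt turns into two bullet lines under the base prompt.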

Presets Available

  • 🔧 HVAC: Heating, Ventilation, Air Conditioning
  • 💻 IT: Software and technology
  • ⚖️ Legal: Legal documents
  • 🏥 Medical: Healthcare terminology

🔐 Admin Dashboard

Access the admin dashboard at /admin in the frontend. Features:

  • System Status: Health, uptime, and issues
  • Memory & Disk Monitoring: Real-time usage stats
  • Translation Statistics: Total translations, success rate
  • Rate Limit Management: View active clients and limits
  • Cleanup Service: Monitor and trigger manual cleanup

Default Credentials

  • Username: admin
  • Password: changeme123

⚠️ Change the default password in production!

🏗️ Project Structure

Translate/
├── main.py                      # FastAPI application with SaaS features
├── config.py                    # Configuration with SaaS settings
├── requirements.txt             # Dependencies
├── mcp_server.py               # MCP server implementation
├── middleware/                  # SaaS middleware
│   ├── __init__.py
│   ├── rate_limiting.py        # Rate limiting with token bucket
│   ├── validation.py           # Input validation
│   ├── security.py             # Security headers & logging
│   └── cleanup.py              # Auto cleanup service
├── services/
│   └── translation_service.py  # Translation providers
├── translators/
│   ├── excel_translator.py     # Excel with image support
│   ├── word_translator.py      # Word with image support
│   └── pptx_translator.py      # PowerPoint with image support
├── frontend/                    # Next.js frontend
│   ├── src/
│   │   ├── app/
│   │   │   ├── page.tsx        # Main translation page
│   │   │   ├── admin/          # Admin dashboard
│   │   │   └── settings/       # Settings pages
│   │   └── components/
│   └── package.json
├── static/
│   └── webllm.html             # WebLLM standalone interface
├── uploads/                    # Temporary uploads (auto-cleaned)
└── outputs/                    # Translated files (auto-cleaned)

🛠️ Tech Stack

Backend

  • FastAPI: Modern async web framework
  • openpyxl: Excel manipulation
  • python-docx: Word documents
  • python-pptx: PowerPoint presentations
  • deep-translator: Google/DeepL/Libre translation
  • psutil: System monitoring
  • python-magic: File type validation

Frontend

  • Next.js 15: React framework
  • Tailwind CSS: Styling
  • Lucide Icons: Icon library
  • WebLLM: Browser-based LLM

🔌 MCP Integration

This API can be used as an MCP (Model Context Protocol) server for AI assistants.

VS Code Configuration

Add to your VS Code settings.json or .vscode/mcp.json, replacing D:/Translate with the path to your checkout:

{
  "servers": {
    "document-translator": {
      "type": "stdio",
      "command": "python",
      "args": ["mcp_server.py"],
      "cwd": "D:/Translate",
      "env": {
        "PYTHONPATH": "D:/Translate"
      }
    }
  }
}

🚀 Production Deployment

Security Checklist

  • Change ADMIN_PASSWORD or set ADMIN_PASSWORD_HASH
  • Set CORS_ORIGINS to your frontend domain
  • Set ENABLE_HSTS=true if the app serves HTTPS directly (behind Nginx, HSTS is set by the proxy)
  • Configure rate limits appropriately
  • Set up log rotation for logs/ directory
  • Use a reverse proxy (nginx/traefik) for HTTPS

Docker Deployment

In production, use the Docker Compose stack with Nginx as a reverse proxy (ports 80/443):

# With SSL certificates in docker/nginx/ssl/ (see DEPLOYMENT_GUIDE.md)
docker compose up -d

  • Nginx: TLS termination, HTTP→HTTPS redirect, HSTS, routing /api/* → backend, /* → frontend (Story 6.5).
  • The backend and frontend are not exposed on the host; all traffic goes through the proxy.
  • Details: DEPLOYMENT_GUIDE.md (SSL/TLS, /health health check, environment variables).

📝 License

MIT License

🤝 Contributing

Contributions welcome! Please submit a Pull Request.


Built with ❤️ using Python, FastAPI, Next.js, and Ollama
