# 📄 Document Translation API
A powerful Python API for translating complex structured documents (Excel, Word, PowerPoint) while **strictly preserving** the original formatting, layout, and embedded media.
## ✨ Features
### 🔄 Multiple Translation Providers
| Provider | Type | Description |
|----------|------|-------------|
| **Google Translate** | Cloud | Free, fast, reliable |
| **Ollama** | Local LLM | Privacy-focused, customizable with system prompts |
| **WebLLM** | Browser | Runs entirely in browser using WebGPU |
| **DeepL** | Cloud | High-quality translations (API key required) |
| **LibreTranslate** | Self-hosted | Open-source alternative |
### 📊 Excel Translation (.xlsx)
- ✅ Translates all cell content and sheet names
- ✅ Preserves cell merging, formulas, and styles
- ✅ Maintains font styles, colors, and borders
- ✅ Image text extraction with vision models
- ✅ Adds translated image text as comments
### 📝 Word Translation (.docx)
- ✅ Translates body text, headers, footers, and tables
- ✅ Preserves heading styles and paragraph formatting
- ✅ Maintains lists, images, charts, and SmartArt
- ✅ Image text extraction and translation
### 📽️ PowerPoint Translation (.pptx)
- ✅ Translates slide titles, body text, and speaker notes
- ✅ Preserves slide layouts, transitions, and animations
- ✅ Image text extraction with text boxes added below images
- ✅ Keeps layering order and positions
### 🧠 LLM Features (Ollama/WebLLM)
- **Custom System Prompts**: Provide context for better translations
- **Technical Glossary**: Define term mappings (e.g., `batterie=coil`)
- **Presets**: HVAC, IT, Legal, Medical terminology
- **Vision Models**: Translate text within images (gemma3, qwen3-vl, llava)
## 🚀 Quick Start
### Installation
```powershell
# Clone the repository
git clone https://gitea.parsanet.org/sepehr/office_translator.git
cd office_translator
# Create virtual environment
python -m venv venv
.\venv\Scripts\Activate.ps1
# Install dependencies
pip install -r requirements.txt
# Run the API
python main.py
```
The API starts on `http://localhost:8000`.
### Web Interface
Open `http://localhost:8000/static/index.html` for the full-featured web interface.
## 📚 API Documentation
- **Swagger UI**: http://localhost:8000/docs
- **ReDoc**: http://localhost:8000/redoc
## 🔧 API Endpoints
### POST /translate
Translate a document with full customization.
```bash
curl -X POST "http://localhost:8000/translate" \
-F "file=@document.xlsx" \
-F "target_language=en" \
-F "provider=ollama" \
-F "ollama_model=gemma3:12b" \
-F "translate_images=true" \
-F "system_prompt=You are translating HVAC documents. Use: batterie=coil, CTA=AHU"
```
### Parameters
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `file` | File | required | Document to translate (.xlsx, .docx, .pptx) |
| `target_language` | string | required | Target language code (en, fr, es, fa, etc.) |
| `provider` | string | google | Translation provider (google, ollama, webllm, deepl, libre) |
| `ollama_model` | string | llama3.2 | Ollama model name |
| `translate_images` | bool | false | Extract and translate image text with vision |
| `system_prompt` | string | "" | Custom instructions and glossary for LLM |
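The same request can be sent from Python. The sketch below is a minimal client and not part of the project: `build_form_fields` and `translate_file` are hypothetical helper names, it relies on the `requests` library, and it assumes the endpoint returns the translated file as the response body, which this README does not state.

```python
from pathlib import Path


def build_form_fields(target_language, provider="google", ollama_model="llama3.2",
                      translate_images=False, system_prompt=""):
    """Build the non-file form fields listed in the parameters table."""
    return {
        "target_language": target_language,
        "provider": provider,
        "ollama_model": ollama_model,
        # Form values travel as text, so the boolean becomes "true" or "false".
        "translate_images": "true" if translate_images else "false",
        "system_prompt": system_prompt,
    }


def translate_file(path, output_path, base_url="http://localhost:8000", **options):
    """POST a document to /translate and save the response body.

    Assumption: the API answers with the translated file as the response body.
    """
    import requests  # third-party; imported here so build_form_fields has no dependencies

    fields = build_form_fields(**options)
    with open(path, "rb") as handle:
        response = requests.post(
            f"{base_url}/translate",
            data=fields,
            files={"file": (Path(path).name, handle)},
            timeout=600,  # LLM-backed translation can be slow
        )
    response.raise_for_status()
    Path(output_path).write_bytes(response.content)
    return Path(output_path)
```

A call such as `translate_file("document.xlsx", "document_en.xlsx", target_language="en", provider="ollama", ollama_model="gemma3:12b", translate_images=True)` mirrors the `curl` example.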
### GET /ollama/models
List available Ollama models.
### POST /ollama/configure
Configure Ollama settings.
### GET /health
Health check endpoint.
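A script can use this endpoint to wait for the API before uploading files. The following sketch uses only the standard library; `is_healthy` is a hypothetical helper, and it checks the HTTP status only, because the format of the response body is not documented here.

```python
import urllib.error
import urllib.request


def is_healthy(base_url="http://localhost:8000", timeout=3.0):
    """Return True if GET /health answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


if __name__ == "__main__":
    print("API reachable:", is_healthy())
```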
## 🌐 Supported Languages
| Code | Language | Code | Language |
|------|----------|------|----------|
| en | English | fr | French |
| fa | Persian/Farsi | es | Spanish |
| de | German | it | Italian |
| pt | Portuguese | ru | Russian |
| zh | Chinese | ja | Japanese |
| ko | Korean | ar | Arabic |
## ⚙️ Configuration
### Environment Variables (.env)
```env
# Translation Service
TRANSLATION_SERVICE=google
# Ollama Configuration
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=llama3.2
# DeepL API Key (optional)
DEEPL_API_KEY=your_api_key_here
# File Limits
MAX_FILE_SIZE_MB=50
# Directories
UPLOAD_DIR=./uploads
OUTPUT_DIR=./outputs
```
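These variables could be read in Python roughly as follows. This is a sketch, not the contents of the project's `config.py`: the `Settings` class and `load_settings` function are hypothetical, the fallback values simply repeat the example above, and the sketch reads variables already present in the process environment rather than parsing the `.env` file itself.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    translation_service: str
    ollama_base_url: str
    ollama_model: str
    deepl_api_key: str
    max_file_size_mb: int
    upload_dir: str
    output_dir: str


def load_settings(env=None):
    """Read settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    return Settings(
        translation_service=env.get("TRANSLATION_SERVICE", "google"),
        ollama_base_url=env.get("OLLAMA_BASE_URL", "http://localhost:11434"),
        ollama_model=env.get("OLLAMA_MODEL", "llama3.2"),
        deepl_api_key=env.get("DEEPL_API_KEY", ""),  # optional, empty if unset
        max_file_size_mb=int(env.get("MAX_FILE_SIZE_MB", "50")),
        upload_dir=env.get("UPLOAD_DIR", "./uploads"),
        output_dir=env.get("OUTPUT_DIR", "./outputs"),
    )
```

Passing a plain dictionary instead of `os.environ` keeps the loader easy to exercise without touching the real environment.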
### Ollama Setup
```bash
# Install Ollama (Windows)
winget install Ollama.Ollama
# Pull a model
ollama pull llama3.2
# For vision/image translation
ollama pull gemma3:12b
# or
ollama pull qwen3-vl:8b
```
## 🎯 Using System Prompts & Glossary
### Example: HVAC Translation
**System Prompt:**
```
You are translating HVAC technical documents.
Use precise technical terminology.
Keep unit measurements (kW, m³/h, Pa) unchanged.
```
**Glossary:**
```
batterie=coil
groupe froid=chiller
CTA=AHU (Air Handling Unit)
échangeur=heat exchanger
vanne 3 voies=3-way valve
```
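When calling the API directly, the glossary has to reach the model through the `system_prompt` parameter. One way to combine the two, following the `Use: batterie=coil, CTA=AHU` form of the `curl` example, is sketched below; `parse_glossary` and `build_system_prompt` are hypothetical helpers, and how the web interface merges glossary and prompt internally is not described in this README.

```python
def parse_glossary(text):
    """Parse 'source=target' lines into a dict, skipping blank or malformed lines."""
    glossary = {}
    for line in text.splitlines():
        source, separator, target = line.partition("=")
        if separator and source.strip() and target.strip():
            glossary[source.strip()] = target.strip()
    return glossary


def build_system_prompt(instructions, glossary):
    """Append the glossary to the instructions as 'Use: a=b, c=d'."""
    prompt = instructions.strip()
    if glossary:
        terms = ", ".join(f"{source}={target}" for source, target in glossary.items())
        prompt = f"{prompt} Use: {terms}".strip()
    return prompt


glossary = parse_glossary("batterie=coil\ngroupe froid=chiller\nCTA=AHU (Air Handling Unit)")
print(build_system_prompt("You are translating HVAC technical documents.", glossary))
```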
### Presets Available
- 🔧 **HVAC**: Heating, Ventilation, Air Conditioning
- 💻 **IT**: Software and technology
- ⚖️ **Legal**: Legal documents
- 🏥 **Medical**: Healthcare terminology
## 🔌 MCP Integration
This API can be used as an MCP (Model Context Protocol) server for AI assistants.
### VS Code Configuration
Add to your VS Code `settings.json` or `.vscode/mcp.json`:
```json
{
  "servers": {
    "document-translator": {
      "type": "stdio",
      "command": "python",
      "args": ["mcp_server.py"],
      "cwd": "D:/Translate",
      "env": {
        "PYTHONPATH": "D:/Translate"
      }
    }
  }
}
```
### MCP Tools Available
| Tool | Description |
|------|-------------|
| `translate_document` | Translate a document file |
| `list_ollama_models` | Get available Ollama models |
| `get_supported_languages` | List supported language codes |
| `configure_translation` | Set translation provider and options |
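MCP clients invoke these tools with JSON-RPC 2.0 `tools/call` requests, which the stdio transport carries as one JSON message per line. The sketch below only builds such a message; `tool_call_message` is a hypothetical helper, and the argument names `file_path` and `target_language` are assumptions, since the tool schemas are not listed here. A real session also starts with an `initialize` handshake, which an MCP client library normally handles.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need a unique id per session


def tool_call_message(tool_name, arguments):
    """Serialize a JSON-RPC 2.0 'tools/call' request as a single line."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    # json.dumps without indent emits no raw newlines, so one line is one message.
    return json.dumps(request, ensure_ascii=False) + "\n"


# Argument names below are assumed for illustration only.
print(tool_call_message("translate_document",
                        {"file_path": "report.docx", "target_language": "en"}), end="")
```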
## 🏗️ Project Structure
```
Translate/
├── main.py                      # FastAPI application
├── config.py                    # Configuration
├── requirements.txt             # Dependencies
├── mcp_server.py                # MCP server implementation
├── services/
│   └── translation_service.py   # Translation providers
├── translators/
│   ├── excel_translator.py      # Excel with image support
│   ├── word_translator.py       # Word with image support
│   └── pptx_translator.py       # PowerPoint with image support
├── utils/
│   ├── file_handler.py          # File operations
│   └── exceptions.py            # Custom exceptions
├── static/
│   └── index.html               # Web interface
├── uploads/                     # Temporary uploads
└── outputs/                     # Translated files
```
## 🧪 Testing
1. Start the API: `python main.py`
2. Open: http://localhost:8000/static/index.html
3. Configure Ollama model
4. Upload a document
5. Select target language and provider
6. Click "Translate Document"
## 🛠️ Tech Stack
- **FastAPI**: Modern async web framework
- **openpyxl**: Excel manipulation
- **python-docx**: Word documents
- **python-pptx**: PowerPoint presentations
- **deep-translator**: Google/DeepL/Libre translation
- **requests**: Ollama API communication
- **Uvicorn**: ASGI server
## 📝 License
MIT License
## 🤝 Contributing
Contributions welcome! Please submit a Pull Request.
---
**Built with ❤️ using Python, FastAPI, and Ollama**