Compare commits

...

8 Commits

Author SHA1 Message Date
8f9ca669cf Performance optimization: batch translation for 5-10x speed improvement
- GoogleTranslationProvider: Added batch translation with separator method
- DeepLTranslationProvider: Added translator caching and batch support
- LibreTranslationProvider: Added translator caching and batch support
- WordTranslator: Collect all texts -> batch translate -> apply pattern
- ExcelTranslator: Collect all texts -> batch translate -> apply pattern
- PowerPointTranslator: Collect all texts -> batch translate -> apply pattern
- Enhanced Ollama/OpenAI prompts with stricter translation-only rules
- Added rule: return original text if uncertain about translation
2025-11-30 20:41:20 +01:00
54d85f0b34 feat: Add admin dashboard with authentication
- Admin login/logout with Bearer token authentication
- Secure admin dashboard page in frontend
- Real-time system monitoring (memory, disk, translations)
- Rate limits and cleanup service monitoring
- Protected admin endpoints
- Updated README with full SaaS documentation
2025-11-30 19:33:59 +01:00
500502440c feat: Add SaaS robustness middleware
- Rate limiting with token bucket and sliding window algorithms
- Input validation (file, language, provider)
- Security headers middleware (CSP, XSS protection, etc.)
- Automatic file cleanup with TTL tracking
- Memory and disk monitoring
- Enhanced health check and metrics endpoints
- Request logging with unique IDs
2025-11-30 19:25:09 +01:00
8c7716bf4d Add Next.js frontend with WebLLM, OpenAI support
- Add complete Next.js frontend with Tailwind CSS and shadcn/ui
- Integrate WebLLM for client-side browser-based translations
- Add OpenAI provider support with gpt-4o-mini default
- Add Context & Glossary page for LLM customization
- Reorganize settings: Translation Services includes all providers
- Add system prompt and glossary support for all LLMs
- Remove test files and requirements-test.txt
2025-11-30 19:02:41 +01:00
a4ecd3e0ec Add MCP server and configuration for AI assistant integration 2025-11-30 16:53:53 +01:00
e48ea07e44 Add system prompt, glossary, presets for Ollama/WebLLM, image translation support 2025-11-30 16:45:41 +01:00
465cab8a61 Add WebLLM model selection and cache management 2025-11-30 11:57:58 +01:00
9410b07512 Add WebLLM support, fix progress bar blocking at 90%, add timeout protection 2025-11-30 11:54:33 +01:00
60 changed files with 16257 additions and 651 deletions
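
The "collect all texts -> batch translate -> apply" pattern named in commit 8f9ca669cf can be sketched roughly as follows. This is a minimal illustration and not the repository's code: `batch_translate`, `translate_fn`, and `SEPARATOR` are hypothetical names, and the fallback to one call per text when the separator does not survive translation is an assumption.

```python
from typing import Callable, List

# Hypothetical separator; a real provider would need a token that the
# translation engine is known to leave untouched.
SEPARATOR = "\n|||\n"


def batch_translate(texts: List[str], translate_fn: Callable[[str], str]) -> List[str]:
    """Translate many strings with a single call by joining them on a separator."""
    # Keep the original positions of non-empty texts; empty ones pass through.
    indexed = [(i, t) for i, t in enumerate(texts) if t.strip()]
    if not indexed:
        return list(texts)

    joined = SEPARATOR.join(t for _, t in indexed)
    parts = [p.strip() for p in translate_fn(joined).split(SEPARATOR.strip())]

    # If the separator was altered, the split is unreliable: translate one by one.
    if len(parts) != len(indexed):
        parts = [translate_fn(t) for _, t in indexed]

    result = list(texts)
    for (i, _), translated in zip(indexed, parts):
        result[i] = translated
    return result
```

The speed-up comes from replacing one network round trip per cell, paragraph, or shape with one round trip per document (or per chunk, if the provider limits request size).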

View File

@@ -1,5 +1,11 @@
# Translation Service Configuration
TRANSLATION_SERVICE=google # Options: google, deepl, libre, ollama
# Document Translation API - Environment Configuration
# Copy this file to .env and configure your settings
# ============== Translation Services ==============
# Default provider: google, ollama, deepl, libre, openai
TRANSLATION_SERVICE=google
# DeepL API Key (required for DeepL provider)
DEEPL_API_KEY=your_deepl_api_key_here
# Ollama Configuration (for LLM-based translation)
@@ -7,7 +13,72 @@ OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=llama3
OLLAMA_VISION_MODEL=llava
# API Configuration
# ============== File Limits ==============
# Maximum file size in MB
MAX_FILE_SIZE_MB=50
UPLOAD_DIR=./uploads
OUTPUT_DIR=./outputs
# ============== Rate Limiting (SaaS) ==============
# Enable/disable rate limiting
RATE_LIMIT_ENABLED=true
# Request limits
RATE_LIMIT_PER_MINUTE=30
RATE_LIMIT_PER_HOUR=200
# Translation-specific limits
TRANSLATIONS_PER_MINUTE=10
TRANSLATIONS_PER_HOUR=50
MAX_CONCURRENT_TRANSLATIONS=5
# ============== Cleanup Service ==============
# Enable automatic file cleanup
CLEANUP_ENABLED=true
# Cleanup interval in minutes
CLEANUP_INTERVAL_MINUTES=15
# File time-to-live in minutes
FILE_TTL_MINUTES=60
INPUT_FILE_TTL_MINUTES=30
OUTPUT_FILE_TTL_MINUTES=120
# Disk space warning thresholds (GB)
DISK_WARNING_THRESHOLD_GB=5.0
DISK_CRITICAL_THRESHOLD_GB=1.0
# ============== Security ==============
# Enable HSTS (only for HTTPS deployments)
ENABLE_HSTS=false
# CORS allowed origins (comma-separated)
CORS_ORIGINS=*
# Maximum request size in MB
MAX_REQUEST_SIZE_MB=100
# Request timeout in seconds
REQUEST_TIMEOUT_SECONDS=300
# ============== Admin Authentication ==============
# Admin username
ADMIN_USERNAME=admin
# Admin password (change in production!)
ADMIN_PASSWORD=changeme123
# Or use SHA256 hash of password (more secure)
# Generate with: python -c "import hashlib; print(hashlib.sha256(b'your_password').hexdigest())"
# ADMIN_PASSWORD_HASH=
# Token secret for session management (auto-generated if not set)
# ADMIN_TOKEN_SECRET=
# ============== Monitoring ==============
# Log level: DEBUG, INFO, WARNING, ERROR
LOG_LEVEL=INFO
# Enable request logging
ENABLE_REQUEST_LOGGING=true
# Memory usage threshold (percentage)
MAX_MEMORY_PERCENT=80

557
README.md

@@ -1,303 +1,382 @@
# Document Translation API
# 📄 Document Translation API
A powerful Python API for translating complex structured documents (Excel, Word, PowerPoint) while **strictly preserving** the original formatting, layout, and embedded media.
A powerful SaaS-ready Python API for translating complex structured documents (Excel, Word, PowerPoint) while **strictly preserving** the original formatting, layout, and embedded media.
## 🎯 Features
## Features
### Excel Translation (.xlsx)
### 🔄 Multiple Translation Providers
| Provider | Type | Description |
|----------|------|-------------|
| **Google Translate** | Cloud | Free, fast, reliable |
| **Ollama** | Local LLM | Privacy-focused, customizable with system prompts |
| **WebLLM** | Browser | Runs entirely in browser using WebGPU |
| **DeepL** | Cloud | High-quality translations (API key required) |
| **LibreTranslate** | Self-hosted | Open-source alternative |
| **OpenAI** | Cloud | GPT-4o/4o-mini with vision support |
### 📊 Excel Translation (.xlsx)
- ✅ Translates all cell content and sheet names
- ✅ Preserves cell merging
- ✅ Maintains font styles (size, bold, italic, color)
- Keeps background colors and borders
- Translates text within formulas while preserving formula structure
- ✅ Retains embedded images in original positions
- ✅ Preserves cell merging, formulas, and styles
- ✅ Maintains font styles, colors, and borders
- Image text extraction with vision models
- Adds translated image text as comments
### Word Translation (.docx)
### 📝 Word Translation (.docx)
- ✅ Translates body text, headers, footers, and tables
- ✅ Preserves heading styles and paragraph formatting
- ✅ Maintains lists (numbered/bulleted)
- Keeps embedded images, charts, and SmartArt in place
- ✅ Preserves table structures and cell formatting
- ✅ Maintains lists, images, charts, and SmartArt
- Image text extraction and translation
### PowerPoint Translation (.pptx)
### 📽️ PowerPoint Translation (.pptx)
- ✅ Translates slide titles, body text, and speaker notes
- ✅ Preserves slide layouts and transitions
- Maintains animations
- ✅ Keeps images, videos, and shapes in exact positions
- ✅ Preserves layering order
- ✅ Preserves slide layouts, transitions, and animations
- Image text extraction with text boxes added below images
- ✅ Keeps layering order and positions
### 🧠 LLM Features (Ollama/WebLLM/OpenAI)
- **Custom System Prompts**: Provide context for better translations
- **Technical Glossary**: Define term mappings (e.g., `batterie=coil`)
- **Presets**: HVAC, IT, Legal, Medical terminology
- **Vision Models**: Translate text within images (gemma3, qwen3-vl, gpt-4o)
### 🏢 SaaS-Ready Features
- 🚦 **Rate Limiting**: Per-client IP with token bucket and sliding window algorithms
- 🔒 **Security Headers**: CSP, XSS protection, HSTS support
- 🧹 **Auto Cleanup**: Automatic file cleanup with TTL tracking
- 📊 **Monitoring**: Health checks, metrics, and system status
- 🔐 **Admin Dashboard**: Secure admin panel with authentication
- 📝 **Request Logging**: Structured logging with unique request IDs
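
As a rough illustration of the token bucket algorithm mentioned for rate limiting, the sketch below keeps a bucket of tokens that refills at a steady rate; each request spends one token. The class name `TokenBucket`, its parameters, and the injectable clock are assumptions made for this example and are not taken from `middleware/rate_limiting.py`.

```python
import time
from typing import Callable


class TokenBucket:
    """Allows bursts up to `capacity`, then a steady `refill_per_second` rate."""

    def __init__(self, capacity: int, refill_per_second: float,
                 now: Callable[[], float] = time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._now = now  # injectable clock, which makes the bucket easy to test
        self.tokens = float(capacity)
        self.updated = now()

    def allow(self, cost: float = 1.0) -> bool:
        """Return True and spend `cost` tokens if enough are available."""
        current = self._now()
        elapsed = current - self.updated
        self.updated = current
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

A per-client limiter could keep one bucket per IP address in a dictionary; for a limit such as `RATE_LIMIT_PER_MINUTE=30`, one plausible mapping is `TokenBucket(capacity=30, refill_per_second=30 / 60)`.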
## 🚀 Quick Start
### Installation
1. **Clone the repository:**
```powershell
git clone <repository-url>
cd Translate
```
# Clone the repository
git clone https://gitea.parsanet.org/sepehr/office_translator.git
cd office_translator
2. **Create a virtual environment:**
```powershell
# Create virtual environment
python -m venv venv
.\venv\Scripts\Activate.ps1
```
3. **Install dependencies:**
```powershell
# Install dependencies
pip install -r requirements.txt
```
4. **Configure environment:**
```powershell
cp .env.example .env
# Edit .env with your preferred settings
```
5. **Run the API:**
```powershell
# Run the API
python main.py
```
The API will start on `http://localhost:8000`
The API starts on `http://localhost:8000`
### Frontend Setup
```powershell
cd frontend
npm install
npm run dev
```
Frontend runs on `http://localhost:3000`
## 📚 API Documentation
Once the server is running, visit:
- **Swagger UI**: http://localhost:8000/docs
- **ReDoc**: http://localhost:8000/redoc
## 🔧 API Endpoints
### POST /translate
Translate a single document
### Translation
#### POST /translate
Translate a document with full customization.
**Request:**
```bash
curl -X POST "http://localhost:8000/translate" \
-F "file=@document.xlsx" \
-F "target_language=es" \
-F "source_language=auto"
-F "target_language=en" \
-F "provider=ollama" \
-F "ollama_model=gemma3:12b" \
-F "translate_images=true" \
-F "system_prompt=You are translating HVAC documents."
```
**Response:**
Returns the translated document file
### Monitoring
### POST /translate-batch
Translate multiple documents at once
**Request:**
```bash
curl -X POST "http://localhost:8000/translate-batch" \
-F "files=@document1.docx" \
-F "files=@document2.pptx" \
-F "target_language=fr"
```
### GET /languages
Get list of supported language codes
### GET /health
Health check endpoint
## 💻 Usage Examples
### Python Example
```python
import requests
# Translate a document
with open('document.xlsx', 'rb') as f:
files = {'file': f}
data = {
'target_language': 'es',
'source_language': 'auto'
}
response = requests.post('http://localhost:8000/translate', files=files, data=data)
# Save translated file
with open('translated_document.xlsx', 'wb') as output:
output.write(response.content)
```
### JavaScript/TypeScript Example
```javascript
const formData = new FormData();
formData.append('file', fileInput.files[0]);
formData.append('target_language', 'fr');
formData.append('source_language', 'auto');
const response = await fetch('http://localhost:8000/translate', {
method: 'POST',
body: formData
});
const blob = await response.blob();
const url = window.URL.createObjectURL(blob);
const a = document.createElement('a');
a.href = url;
a.download = 'translated_document.docx';
a.click();
```
### PowerShell Example
```powershell
$file = Get-Item "document.pptx"
$uri = "http://localhost:8000/translate"
$form = @{
file = $file
target_language = "de"
source_language = "auto"
}
Invoke-RestMethod -Uri $uri -Method Post -Form $form -OutFile "translated_document.pptx"
```
## 🌐 Supported Languages
The API supports 25+ languages including:
- Spanish (es), French (fr), German (de)
- Italian (it), Portuguese (pt), Russian (ru)
- Chinese (zh), Japanese (ja), Korean (ko)
- Arabic (ar), Hindi (hi), Dutch (nl)
- And many more...
Full list available at: `GET /languages`
## ⚙️ Configuration
Edit `.env` file to configure:
```env
# Translation Service (google, deepl, libre)
TRANSLATION_SERVICE=google
# DeepL API Key (if using DeepL)
DEEPL_API_KEY=your_api_key_here
# File Upload Limits
MAX_FILE_SIZE_MB=50
# Directory Configuration
UPLOAD_DIR=./uploads
OUTPUT_DIR=./outputs
```
## 🔌 Model Context Protocol (MCP) Integration
This API is designed to be easily wrapped as an MCP server for future integration with AI assistants and tools.
### MCP Server Structure (Future Implementation)
#### GET /health
Comprehensive health check with system status.
```json
{
"mcpServers": {
"status": "healthy",
"translation_service": "google",
"memory": {"system_percent": 34.1, "system_available_gb": 61.7},
"disk": {"total_files": 0, "total_size_mb": 0},
"cleanup_service": {"is_running": true}
}
```
#### GET /metrics
System metrics and statistics.
#### GET /rate-limit/status
Current rate limit status for the requesting client.
### Admin Endpoints (Authentication Required)
#### POST /admin/login
Login to admin dashboard.
```bash
curl -X POST "http://localhost:8000/admin/login" \
-F "username=admin" \
-F "password=your_password"
```
Response:
```json
{
"status": "success",
"token": "your_bearer_token",
"expires_in": 86400
}
```
#### GET /admin/dashboard
Get comprehensive dashboard data (requires Bearer token).
```bash
curl "http://localhost:8000/admin/dashboard" \
-H "Authorization: Bearer your_token"
```
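
The login and dashboard calls above can also be chained from Python. The following is a minimal sketch using only the standard library; the helper names are hypothetical, and it assumes the server accepts URL-encoded form fields for `/admin/login` (the curl example sends multipart form data) and returns the `token` field shown in the sample response.

```python
import json
import urllib.parse
import urllib.request

API_URL = "http://localhost:8000"  # same base URL as the curl examples


def build_login_request(username: str, password: str) -> urllib.request.Request:
    """POST the credentials as form fields to /admin/login."""
    body = urllib.parse.urlencode({"username": username, "password": password}).encode()
    return urllib.request.Request(f"{API_URL}/admin/login", data=body, method="POST")


def build_dashboard_request(token: str) -> urllib.request.Request:
    """GET /admin/dashboard with the Bearer token from the login response."""
    return urllib.request.Request(
        f"{API_URL}/admin/dashboard",
        headers={"Authorization": f"Bearer {token}"},
    )


def fetch_dashboard(username: str, password: str) -> dict:
    """Log in, then fetch the dashboard data (requires a running server)."""
    with urllib.request.urlopen(build_login_request(username, password)) as resp:
        token = json.load(resp)["token"]
    with urllib.request.urlopen(build_dashboard_request(token)) as resp:
        return json.load(resp)
```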
#### POST /admin/cleanup/trigger
Manually trigger file cleanup.
#### GET /admin/files/tracked
List currently tracked files.
## 🌐 Supported Languages
| Code | Language | Code | Language |
|------|----------|------|----------|
| en | English | fr | French |
| fa | Persian/Farsi | es | Spanish |
| de | German | it | Italian |
| pt | Portuguese | ru | Russian |
| zh | Chinese | ja | Japanese |
| ko | Korean | ar | Arabic |
## ⚙️ Configuration
### Environment Variables (.env)
```env
# ============== Translation Services ==============
TRANSLATION_SERVICE=google
DEEPL_API_KEY=your_deepl_api_key_here
# Ollama Configuration
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=llama3
OLLAMA_VISION_MODEL=llava
# ============== File Limits ==============
MAX_FILE_SIZE_MB=50
# ============== Rate Limiting (SaaS) ==============
RATE_LIMIT_ENABLED=true
RATE_LIMIT_PER_MINUTE=30
RATE_LIMIT_PER_HOUR=200
TRANSLATIONS_PER_MINUTE=10
TRANSLATIONS_PER_HOUR=50
MAX_CONCURRENT_TRANSLATIONS=5
# ============== Cleanup Service ==============
CLEANUP_ENABLED=true
CLEANUP_INTERVAL_MINUTES=15
FILE_TTL_MINUTES=60
INPUT_FILE_TTL_MINUTES=30
OUTPUT_FILE_TTL_MINUTES=120
# ============== Security ==============
ENABLE_HSTS=false
CORS_ORIGINS=*
# ============== Admin Authentication ==============
ADMIN_USERNAME=admin
ADMIN_PASSWORD=changeme123 # Change in production!
# Or use a SHA256 hash:
# ADMIN_PASSWORD_HASH=your_sha256_hash
# ============== Monitoring ==============
LOG_LEVEL=INFO
ENABLE_REQUEST_LOGGING=true
MAX_MEMORY_PERCENT=80
```
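
To make the TTL settings concrete, here is a small sketch of how a cleanup service might track files and delete them once they are older than their time-to-live. `FileTracker` and its methods are hypothetical names for illustration only; the actual `middleware/cleanup.py` is not shown in this diff and may work differently.

```python
import time
from pathlib import Path
from typing import Dict, List, Optional


class FileTracker:
    """Remembers when each file was registered and removes expired ones."""

    def __init__(self, ttl_minutes: float):
        self.ttl_seconds = ttl_minutes * 60
        self._tracked: Dict[Path, float] = {}

    def track(self, path: Path, now: Optional[float] = None) -> None:
        """Register a file; `now` can be supplied for testing."""
        self._tracked[path] = time.time() if now is None else now

    def cleanup(self, now: Optional[float] = None) -> List[Path]:
        """Delete every tracked file whose age has reached the TTL."""
        now = time.time() if now is None else now
        expired = [p for p, t in self._tracked.items() if now - t >= self.ttl_seconds]
        for path in expired:
            path.unlink(missing_ok=True)  # requires Python 3.8+
            del self._tracked[path]
        return expired
```

With the defaults above, uploads might use `FileTracker(ttl_minutes=30)` and outputs `FileTracker(ttl_minutes=120)`, with `cleanup()` called every `CLEANUP_INTERVAL_MINUTES`.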
### Ollama Setup
```bash
# Install Ollama (Windows)
winget install Ollama.Ollama
# Pull a model
ollama pull llama3.2
# For vision/image translation
ollama pull gemma3:12b
# or
ollama pull qwen3-vl:8b
```
## 🎯 Using System Prompts & Glossary
### Example: HVAC Translation
**System Prompt:**
```
You are translating HVAC technical documents.
Use precise technical terminology.
Keep unit measurements (kW, m³/h, Pa) unchanged.
```
**Glossary:**
```
batterie=coil
groupe froid=chiller
CTA=AHU (Air Handling Unit)
échangeur=heat exchanger
vanne 3 voies=3-way valve
```
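
One way such a glossary could be fed to an LLM is to parse the `source=target` lines and append them to the system prompt. The sketch below is an assumption about how this might be done; `parse_glossary` and `build_system_prompt` are hypothetical helpers and the prompt wording is invented for the example.

```python
from typing import Dict


def parse_glossary(text: str) -> Dict[str, str]:
    """Parse `source=target` lines, skipping blank or malformed ones."""
    glossary: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        # Split on the first "=" only, so targets may contain "=" themselves.
        source, target = line.split("=", 1)
        glossary[source.strip()] = target.strip()
    return glossary


def build_system_prompt(base_prompt: str, glossary: Dict[str, str]) -> str:
    """Append the glossary to the base prompt as a list of term mappings."""
    if not glossary:
        return base_prompt
    terms = "\n".join(f"- {src} -> {dst}" for src, dst in glossary.items())
    return f"{base_prompt}\n\nUse these term translations:\n{terms}"
```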
### Presets Available
- 🔧 **HVAC**: Heating, Ventilation, Air Conditioning
- 💻 **IT**: Software and technology
- ⚖️ **Legal**: Legal documents
- 🏥 **Medical**: Healthcare terminology
## 🔐 Admin Dashboard
Access the admin dashboard at `/admin` in the frontend. Features:
- **System Status**: Health, uptime, and issues
- **Memory & Disk Monitoring**: Real-time usage stats
- **Translation Statistics**: Total translations, success rate
- **Rate Limit Management**: View active clients and limits
- **Cleanup Service**: Monitor and trigger manual cleanup
### Default Credentials
- **Username**: admin
- **Password**: changeme123
⚠️ **Change the default password in production!**
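
If `ADMIN_PASSWORD_HASH` is used instead of a plain-text password, a login check might look like the sketch below. The function names are hypothetical and this is not the project's actual authentication code; it only shows SHA-256 hashing, as in the `.env` comment, combined with a constant-time comparison.

```python
import hashlib
import hmac


def hash_password(password: str) -> str:
    """Return the SHA-256 hex digest, as expected by ADMIN_PASSWORD_HASH."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def verify_password(candidate: str, expected_hash: str) -> bool:
    """Compare digests in constant time to avoid leaking timing information."""
    return hmac.compare_digest(hash_password(candidate), expected_hash.lower())
```

Plain SHA-256 is fast to brute-force, so a salted, deliberately slow function such as `hashlib.scrypt` or `hashlib.pbkdf2_hmac` is generally a stronger choice where the deployment allows it.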
## 🏗️ Project Structure
```
Translate/
├── main.py # FastAPI application with SaaS features
├── config.py # Configuration with SaaS settings
├── requirements.txt # Dependencies
├── mcp_server.py # MCP server implementation
├── middleware/ # SaaS middleware
│ ├── __init__.py
│ ├── rate_limiting.py # Rate limiting with token bucket
│ ├── validation.py # Input validation
│ ├── security.py # Security headers & logging
│ └── cleanup.py # Auto cleanup service
├── services/
│ └── translation_service.py # Translation providers
├── translators/
│ ├── excel_translator.py # Excel with image support
│ ├── word_translator.py # Word with image support
│ └── pptx_translator.py # PowerPoint with image support
├── frontend/ # Next.js frontend
│ ├── src/
│ │ ├── app/
│ │ │ ├── page.tsx # Main translation page
│ │ │ ├── admin/ # Admin dashboard
│ │ │ └── settings/ # Settings pages
│ │ └── components/
│ └── package.json
├── static/
│ └── webllm.html # WebLLM standalone interface
├── uploads/ # Temporary uploads (auto-cleaned)
└── outputs/ # Translated files (auto-cleaned)
```
## 🛠️ Tech Stack
### Backend
- **FastAPI**: Modern async web framework
- **openpyxl**: Excel manipulation
- **python-docx**: Word documents
- **python-pptx**: PowerPoint presentations
- **deep-translator**: Google/DeepL/Libre translation
- **psutil**: System monitoring
- **python-magic**: File type validation
### Frontend
- **Next.js 16**: React framework
- **Tailwind CSS**: Styling
- **Lucide Icons**: Icon library
- **WebLLM**: Browser-based LLM
## 🔌 MCP Integration
This API can be used as an MCP (Model Context Protocol) server for AI assistants.
### VS Code Configuration
Add to your VS Code `settings.json` or `.vscode/mcp.json`:
```json
{
"servers": {
"document-translator": {
"type": "stdio",
"command": "python",
"args": ["-m", "mcp_server"],
"args": ["mcp_server.py"],
"cwd": "D:/Translate",
"env": {
"API_URL": "http://localhost:8000"
"PYTHONPATH": "D:/Translate"
}
}
}
}
```
### Example MCP Tools
## 🚀 Production Deployment
The MCP wrapper will expose these tools:
### Security Checklist
- [ ] Change `ADMIN_PASSWORD` or set `ADMIN_PASSWORD_HASH`
- [ ] Set `CORS_ORIGINS` to your frontend domain
- [ ] Enable `ENABLE_HSTS=true` if using HTTPS
- [ ] Configure rate limits appropriately
- [ ] Set up log rotation for `logs/` directory
- [ ] Use a reverse proxy (nginx/traefik) for HTTPS
1. **translate_document** - Translate a single document
2. **translate_batch** - Translate multiple documents
3. **get_supported_languages** - List supported languages
4. **check_translation_status** - Check status of translation
## 🏗️ Project Structure
### Docker Deployment (Coming Soon)
```dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
EXPOSE 8000
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```
Translate/
├── main.py # FastAPI application
├── config.py # Configuration management
├── requirements.txt # Dependencies
├── .env.example # Environment template
├── services/
│ ├── __init__.py
│ └── translation_service.py # Translation abstraction layer
├── translators/
│ ├── __init__.py
│ ├── excel_translator.py # Excel translation logic
│ ├── word_translator.py # Word translation logic
│ └── pptx_translator.py # PowerPoint translation logic
├── utils/
│ ├── __init__.py
│ ├── file_handler.py # File operations
│ └── exceptions.py # Custom exceptions
├── uploads/ # Temporary upload storage
└── outputs/ # Translated files
```
## 🧪 Testing
### Manual Testing
1. Start the API server
2. Navigate to http://localhost:8000/docs
3. Use the interactive Swagger UI to test endpoints
### Test Files
Prepare test files with:
- Complex formatting (multiple fonts, colors, styles)
- Embedded images and media
- Tables and merged cells
- Formulas (for Excel)
- Multiple sections/slides
## 🛠️ Technical Details
### Libraries Used
- **FastAPI**: Modern web framework for building APIs
- **openpyxl**: Excel file manipulation with formatting preservation
- **python-docx**: Word document handling
- **python-pptx**: PowerPoint presentation processing
- **deep-translator**: Multi-provider translation service
- **Uvicorn**: ASGI server for running FastAPI
### Design Principles
1. **Modular Architecture**: Each file type has its own translator module
2. **Provider Abstraction**: Easy to swap translation services (Google, DeepL, LibreTranslate)
3. **Format Preservation**: All translators maintain original document structure
4. **Error Handling**: Comprehensive error handling and logging
5. **Scalability**: Ready for MCP integration and microservices architecture
## 🔐 Security Considerations
For production deployment:
1. **Configure CORS** properly in `main.py`
2. **Add authentication** for API endpoints
3. **Implement rate limiting** to prevent abuse
4. **Use HTTPS** for secure file transmission
5. **Sanitize file uploads** to prevent malicious files
6. **Set appropriate file size limits**
## 📝 License
MIT License - Feel free to use this project for your needs.
MIT License
## 🤝 Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
## 📧 Support
For issues and questions, please open an issue on the repository.
Contributions welcome! Please submit a Pull Request.
---
**Built with ❤️ using Python and FastAPI**
**Built with ❤️ using Python, FastAPI, Next.js, and Ollama**

View File

@@ -1,5 +1,6 @@
"""
Configuration module for the Document Translation API
SaaS-ready with comprehensive settings for production deployment
"""
import os
from pathlib import Path
@@ -8,7 +9,7 @@ from dotenv import load_dotenv
load_dotenv()
class Config:
# Translation Service
# ============== Translation Service ==============
TRANSLATION_SERVICE = os.getenv("TRANSLATION_SERVICE", "google")
DEEPL_API_KEY = os.getenv("DEEPL_API_KEY", "")
@@ -17,20 +18,51 @@ class Config:
OLLAMA_MODEL = os.getenv("OLLAMA_MODEL", "llama3")
OLLAMA_VISION_MODEL = os.getenv("OLLAMA_VISION_MODEL", "llava")
# File Upload Configuration
# ============== File Upload Configuration ==============
MAX_FILE_SIZE_MB = int(os.getenv("MAX_FILE_SIZE_MB", "50"))
MAX_FILE_SIZE_BYTES = MAX_FILE_SIZE_MB * 1024 * 1024
# Directories
BASE_DIR = Path(__file__).parent.parent
BASE_DIR = Path(__file__).parent
UPLOAD_DIR = BASE_DIR / "uploads"
OUTPUT_DIR = BASE_DIR / "outputs"
TEMP_DIR = BASE_DIR / "temp"
LOGS_DIR = BASE_DIR / "logs"
# Supported file types
SUPPORTED_EXTENSIONS = {".xlsx", ".docx", ".pptx"}
# API Configuration
# ============== Rate Limiting (SaaS) ==============
RATE_LIMIT_ENABLED = os.getenv("RATE_LIMIT_ENABLED", "true").lower() == "true"
RATE_LIMIT_PER_MINUTE = int(os.getenv("RATE_LIMIT_PER_MINUTE", "30"))
RATE_LIMIT_PER_HOUR = int(os.getenv("RATE_LIMIT_PER_HOUR", "200"))
TRANSLATIONS_PER_MINUTE = int(os.getenv("TRANSLATIONS_PER_MINUTE", "10"))
TRANSLATIONS_PER_HOUR = int(os.getenv("TRANSLATIONS_PER_HOUR", "50"))
MAX_CONCURRENT_TRANSLATIONS = int(os.getenv("MAX_CONCURRENT_TRANSLATIONS", "5"))
# ============== Cleanup Service ==============
CLEANUP_ENABLED = os.getenv("CLEANUP_ENABLED", "true").lower() == "true"
CLEANUP_INTERVAL_MINUTES = int(os.getenv("CLEANUP_INTERVAL_MINUTES", "15"))
FILE_TTL_MINUTES = int(os.getenv("FILE_TTL_MINUTES", "60"))
INPUT_FILE_TTL_MINUTES = int(os.getenv("INPUT_FILE_TTL_MINUTES", "30"))
OUTPUT_FILE_TTL_MINUTES = int(os.getenv("OUTPUT_FILE_TTL_MINUTES", "120"))
# Disk space thresholds
DISK_WARNING_THRESHOLD_GB = float(os.getenv("DISK_WARNING_THRESHOLD_GB", "5.0"))
DISK_CRITICAL_THRESHOLD_GB = float(os.getenv("DISK_CRITICAL_THRESHOLD_GB", "1.0"))
# ============== Security ==============
ENABLE_HSTS = os.getenv("ENABLE_HSTS", "false").lower() == "true"
CORS_ORIGINS = os.getenv("CORS_ORIGINS", "*").split(",")
MAX_REQUEST_SIZE_MB = int(os.getenv("MAX_REQUEST_SIZE_MB", "100"))
REQUEST_TIMEOUT_SECONDS = int(os.getenv("REQUEST_TIMEOUT_SECONDS", "300"))
# ============== Monitoring ==============
LOG_LEVEL = os.getenv("LOG_LEVEL", "INFO")
ENABLE_REQUEST_LOGGING = os.getenv("ENABLE_REQUEST_LOGGING", "true").lower() == "true"
MAX_MEMORY_PERCENT = float(os.getenv("MAX_MEMORY_PERCENT", "80"))
# ============== API Configuration ==============
API_TITLE = "Document Translation API"
API_VERSION = "1.0.0"
API_DESCRIPTION = """
@@ -40,6 +72,12 @@ class Config:
- Excel (.xlsx) - Preserves cell formatting, formulas, merged cells, images
- Word (.docx) - Preserves styles, tables, images, headers/footers
- PowerPoint (.pptx) - Preserves layouts, animations, embedded media
SaaS Features:
- Rate limiting per client IP
- Automatic file cleanup
- Health monitoring
- Request logging
"""
@classmethod
@@ -48,5 +86,7 @@ class Config:
cls.UPLOAD_DIR.mkdir(exist_ok=True, parents=True)
cls.OUTPUT_DIR.mkdir(exist_ok=True, parents=True)
cls.TEMP_DIR.mkdir(exist_ok=True, parents=True)
cls.LOGS_DIR.mkdir(exist_ok=True, parents=True)
config = Config()
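
The `Config` class above repeats the same environment-parsing expressions for booleans and lists. As a sketch only, small helpers like the ones below could centralize that parsing. `env_bool` and `env_list` are hypothetical names, not part of the commit, and they deliberately differ from the code above: they accept several truthy spellings and strip whitespace around comma-separated items, which the plain `.split(",")` does not.

```python
import os
from typing import List


def env_bool(name: str, default: bool = False) -> bool:
    """Read a boolean flag; unset variables fall back to `default`."""
    raw = os.getenv(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def env_list(name: str, default: str = "") -> List[str]:
    """Read a comma-separated list, dropping blanks and surrounding spaces."""
    raw = os.getenv(name, default)
    return [item.strip() for item in raw.split(",") if item.strip()]
```

With these, `CORS_ORIGINS=https://a.example, https://b.example` would yield two clean origins rather than a second entry with a leading space.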

41
frontend/.gitignore vendored Normal file

@@ -0,0 +1,41 @@
# See https://help.github.com/articles/ignoring-files/ for more about ignoring files.
# dependencies
/node_modules
/.pnp
.pnp.*
.yarn/*
!.yarn/patches
!.yarn/plugins
!.yarn/releases
!.yarn/versions
# testing
/coverage
# next.js
/.next/
/out/
# production
/build
# misc
.DS_Store
*.pem
# debug
npm-debug.log*
yarn-debug.log*
yarn-error.log*
.pnpm-debug.log*
# env files (can opt-in for committing if needed)
.env*
# vercel
.vercel
# typescript
*.tsbuildinfo
next-env.d.ts

36
frontend/README.md Normal file

@@ -0,0 +1,36 @@
This is a [Next.js](https://nextjs.org) project bootstrapped with [`create-next-app`](https://nextjs.org/docs/app/api-reference/cli/create-next-app).
## Getting Started
First, run the development server:
```bash
npm run dev
# or
yarn dev
# or
pnpm dev
# or
bun dev
```
Open [http://localhost:3000](http://localhost:3000) with your browser to see the result.
You can start editing the page by modifying `src/app/page.tsx`. The page auto-updates as you edit the file.
This project uses [`next/font`](https://nextjs.org/docs/app/building-your-application/optimizing/fonts) to automatically optimize and load [Geist](https://vercel.com/font), a new font family for Vercel.
## Learn More
To learn more about Next.js, take a look at the following resources:
- [Next.js Documentation](https://nextjs.org/docs) - learn about Next.js features and API.
- [Learn Next.js](https://nextjs.org/learn) - an interactive Next.js tutorial.
You can check out [the Next.js GitHub repository](https://github.com/vercel/next.js) - your feedback and contributions are welcome!
## Deploy on Vercel
The easiest way to deploy your Next.js app is to use the [Vercel Platform](https://vercel.com/new?utm_medium=default-template&filter=next.js&utm_source=create-next-app&utm_campaign=create-next-app-readme) from the creators of Next.js.
Check out our [Next.js deployment documentation](https://nextjs.org/docs/app/building-your-application/deploying) for more details.

22
frontend/components.json Normal file
View File

@@ -0,0 +1,22 @@
{
"$schema": "https://ui.shadcn.com/schema.json",
"style": "new-york",
"rsc": true,
"tsx": true,
"tailwind": {
"config": "",
"css": "src/app/globals.css",
"baseColor": "neutral",
"cssVariables": true,
"prefix": ""
},
"iconLibrary": "lucide",
"aliases": {
"components": "@/components",
"utils": "@/lib/utils",
"ui": "@/components/ui",
"lib": "@/lib",
"hooks": "@/hooks"
},
"registries": {}
}

View File

@@ -0,0 +1,18 @@
import { defineConfig, globalIgnores } from "eslint/config";
import nextVitals from "eslint-config-next/core-web-vitals";
import nextTs from "eslint-config-next/typescript";
const eslintConfig = defineConfig([
...nextVitals,
...nextTs,
// Override default ignores of eslint-config-next.
globalIgnores([
// Default ignores of eslint-config-next:
".next/**",
"out/**",
"build/**",
"next-env.d.ts",
]),
]);
export default eslintConfig;

7
frontend/next.config.ts Normal file

@@ -0,0 +1,7 @@
import type { NextConfig } from "next";
const nextConfig: NextConfig = {
/* config options here */
};
export default nextConfig;

8050
frontend/package-lock.json generated Normal file

File diff suppressed because it is too large

47
frontend/package.json Normal file

@@ -0,0 +1,47 @@
{
"name": "frontend",
"version": "0.1.0",
"private": true,
"scripts": {
"dev": "next dev",
"build": "next build",
"start": "next start",
"lint": "eslint"
},
"dependencies": {
"@mlc-ai/web-llm": "^0.2.80",
"@radix-ui/react-checkbox": "^1.3.3",
"@radix-ui/react-dialog": "^1.1.15",
"@radix-ui/react-dropdown-menu": "^2.1.16",
"@radix-ui/react-label": "^2.1.8",
"@radix-ui/react-progress": "^1.1.8",
"@radix-ui/react-scroll-area": "^1.2.10",
"@radix-ui/react-select": "^2.2.6",
"@radix-ui/react-separator": "^1.1.8",
"@radix-ui/react-slot": "^1.2.4",
"@radix-ui/react-switch": "^1.2.6",
"@radix-ui/react-tabs": "^1.1.13",
"@radix-ui/react-tooltip": "^1.2.8",
"axios": "^1.13.2",
"class-variance-authority": "^0.7.1",
"clsx": "^2.1.1",
"lucide-react": "^0.555.0",
"next": "16.0.6",
"react": "19.2.0",
"react-dom": "19.2.0",
"react-dropzone": "^14.3.8",
"tailwind-merge": "^3.4.0",
"zustand": "^5.0.9"
},
"devDependencies": {
"@tailwindcss/postcss": "^4",
"@types/node": "^20",
"@types/react": "^19",
"@types/react-dom": "^19",
"eslint": "^9",
"eslint-config-next": "16.0.6",
"tailwindcss": "^4",
"tw-animate-css": "^1.4.0",
"typescript": "^5"
}
}

View File

@@ -0,0 +1,7 @@
const config = {
plugins: {
"@tailwindcss/postcss": {},
},
};
export default config;

1
frontend/public/file.svg Normal file

@@ -0,0 +1 @@
<svg fill="none" viewBox="0 0 16 16" xmlns="http://www.w3.org/2000/svg"><path d="M14.5 13.5V5.41a1 1 0 0 0-.3-.7L9.8.29A1 1 0 0 0 9.08 0H1.5v13.5A2.5 2.5 0 0 0 4 16h8a2.5 2.5 0 0 0 2.5-2.5m-1.5 0v-7H8v-5H3v12a1 1 0 0 0 1 1h8a1 1 0 0 0 1-1M9.5 5V2.12L12.38 5zM5.13 5h-.62v1.25h2.12V5zm-.62 3h7.12v1.25H4.5zm.62 3h-.62v1.25h7.12V11z" clip-rule="evenodd" fill="#666" fill-rule="evenodd"/></svg>


View File

@@ -0,0 +1 @@
<svg fill="none" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 16 16"><g clip-path="url(#a)"><path fill-rule="evenodd" clip-rule="evenodd" d="M10.27 14.1a6.5 6.5 0 0 0 3.67-3.45q-1.24.21-2.7.34-.31 1.83-.97 3.1M8 16A8 8 0 1 0 8 0a8 8 0 0 0 0 16m.48-1.52a7 7 0 0 1-.96 0H7.5a4 4 0 0 1-.84-1.32q-.38-.89-.63-2.08a40 40 0 0 0 3.92 0q-.25 1.2-.63 2.08a4 4 0 0 1-.84 1.31zm2.94-4.76q1.66-.15 2.95-.43a7 7 0 0 0 0-2.58q-1.3-.27-2.95-.43a18 18 0 0 1 0 3.44m-1.27-3.54a17 17 0 0 1 0 3.64 39 39 0 0 1-4.3 0 17 17 0 0 1 0-3.64 39 39 0 0 1 4.3 0m1.1-1.17q1.45.13 2.69.34a6.5 6.5 0 0 0-3.67-3.44q.65 1.26.98 3.1M8.48 1.5l.01.02q.41.37.84 1.31.38.89.63 2.08a40 40 0 0 0-3.92 0q.25-1.2.63-2.08a4 4 0 0 1 .85-1.32 7 7 0 0 1 .96 0m-2.75.4a6.5 6.5 0 0 0-3.67 3.44 29 29 0 0 1 2.7-.34q.31-1.83.97-3.1M4.58 6.28q-1.66.16-2.95.43a7 7 0 0 0 0 2.58q1.3.27 2.95.43a18 18 0 0 1 0-3.44m.17 4.71q-1.45-.12-2.69-.34a6.5 6.5 0 0 0 3.67 3.44q-.65-1.27-.98-3.1" fill="#666"/></g><defs><clipPath id="a"><path fill="#fff" d="M0 0h16v16H0z"/></clipPath></defs></svg>


1
frontend/public/next.svg Normal file

@@ -0,0 +1 @@
<svg xmlns="http://www.w3.org/2000/svg" fill="none" viewBox="0 0 394 80"><path fill="#000" d="M262 0h68.5v12.7h-27.2v66.6h-13.6V12.7H262V0ZM149 0v12.7H94v20.4h44.3v12.6H94v21h55v12.6H80.5V0h68.7zm34.3 0h-17.8l63.8 79.4h17.9l-32-39.7 32-39.6h-17.9l-23 28.6-23-28.6zm18.3 56.7-9-11-27.1 33.7h17.8l18.3-22.7z"/><path fill="#000" d="M81 79.3 17 0H0v79.3h13.6V17l50.2 62.3H81Zm252.6-.4c-1 0-1.8-.4-2.5-1s-1.1-1.6-1.1-2.6.3-1.8 1-2.5 1.6-1 2.6-1 1.8.3 2.5 1a3.4 3.4 0 0 1 .6 4.3 3.7 3.7 0 0 1-3 1.8zm23.2-33.5h6v23.3c0 2.1-.4 4-1.3 5.5a9.1 9.1 0 0 1-3.8 3.5c-1.6.8-3.5 1.3-5.7 1.3-2 0-3.7-.4-5.3-1s-2.8-1.8-3.7-3.2c-.9-1.3-1.4-3-1.4-5h6c.1.8.3 1.6.7 2.2s1 1.2 1.6 1.5c.7.4 1.5.5 2.4.5 1 0 1.8-.2 2.4-.6a4 4 0 0 0 1.6-1.8c.3-.8.5-1.8.5-3V45.5zm30.9 9.1a4.4 4.4 0 0 0-2-3.3 7.5 7.5 0 0 0-4.3-1.1c-1.3 0-2.4.2-3.3.5-.9.4-1.6 1-2 1.6a3.5 3.5 0 0 0-.3 4c.3.5.7.9 1.3 1.2l1.8 1 2 .5 3.2.8c1.3.3 2.5.7 3.7 1.2a13 13 0 0 1 3.2 1.8 8.1 8.1 0 0 1 3 6.5c0 2-.5 3.7-1.5 5.1a10 10 0 0 1-4.4 3.5c-1.8.8-4.1 1.2-6.8 1.2-2.6 0-4.9-.4-6.8-1.2-2-.8-3.4-2-4.5-3.5a10 10 0 0 1-1.7-5.6h6a5 5 0 0 0 3.5 4.6c1 .4 2.2.6 3.4.6 1.3 0 2.5-.2 3.5-.6 1-.4 1.8-1 2.4-1.7a4 4 0 0 0 .8-2.4c0-.9-.2-1.6-.7-2.2a11 11 0 0 0-2.1-1.4l-3.2-1-3.8-1c-2.8-.7-5-1.7-6.6-3.2a7.2 7.2 0 0 1-2.4-5.7 8 8 0 0 1 1.7-5 10 10 0 0 1 4.3-3.5c2-.8 4-1.2 6.4-1.2 2.3 0 4.4.4 6.2 1.2 1.8.8 3.2 2 4.3 3.4 1 1.4 1.5 3 1.5 5h-5.8z"/></svg>


View File

@@ -0,0 +1 @@
<svg fill="none" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 1155 1000"><path d="m577.3 0 577.4 1000H0z" fill="#fff"/></svg>

Size: 128 B


@@ -0,0 +1 @@
<svg fill="none" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 16 16"><path fill-rule="evenodd" clip-rule="evenodd" d="M1.5 2.5h13v10a1 1 0 0 1-1 1h-11a1 1 0 0 1-1-1zM0 1h16v11.5a2.5 2.5 0 0 1-2.5 2.5h-11A2.5 2.5 0 0 1 0 12.5zm3.75 4.5a.75.75 0 1 0 0-1.5.75.75 0 0 0 0 1.5M7 4.75a.75.75 0 1 1-1.5 0 .75.75 0 0 1 1.5 0m1.75.75a.75.75 0 1 0 0-1.5.75.75 0 0 0 0 1.5" fill="#666"/></svg>

Size: 385 B


@@ -0,0 +1,454 @@
"use client";
import { useState, useEffect } from "react";
import { Shield, LogOut, RefreshCw, Trash2, Activity, HardDrive, Cpu, Clock, Users, FileText, AlertTriangle, CheckCircle } from "lucide-react";
interface DashboardData {
timestamp: string;
uptime: string;
status: string;
issues: string[];
system: {
memory: {
process_rss_mb: number;
system_total_gb: number;
system_available_gb: number;
system_percent: number;
};
disk: {
total_files: number;
total_size_mb: number;
usage_percent: number;
};
};
translations: {
total: number;
errors: number;
success_rate: number;
};
cleanup: {
files_cleaned_total: number;
bytes_freed_total_mb: number;
cleanup_runs: number;
tracked_files_count: number;
is_running: boolean;
};
rate_limits: {
total_requests: number;
total_translations: number;
active_clients: number;
config: {
requests_per_minute: number;
translations_per_minute: number;
};
};
config: {
max_file_size_mb: number;
supported_extensions: string[];
translation_service: string;
};
}
export default function AdminPage() {
const [isAuthenticated, setIsAuthenticated] = useState(false);
const [isLoading, setIsLoading] = useState(true);
const [loginError, setLoginError] = useState("");
const [username, setUsername] = useState("");
const [password, setPassword] = useState("");
const [dashboard, setDashboard] = useState<DashboardData | null>(null);
const [isRefreshing, setIsRefreshing] = useState(false);
const API_URL = process.env.NEXT_PUBLIC_API_URL || "http://localhost:8000";
// Check if already authenticated
useEffect(() => {
const token = localStorage.getItem("admin_token");
if (token) {
verifyToken(token);
} else {
setIsLoading(false);
}
}, []);
const verifyToken = async (token: string) => {
try {
const response = await fetch(`${API_URL}/admin/verify`, {
headers: {
Authorization: `Bearer ${token}`,
},
});
if (response.ok) {
setIsAuthenticated(true);
fetchDashboard(token);
} else {
localStorage.removeItem("admin_token");
}
} catch (error) {
console.error("Token verification failed:", error);
localStorage.removeItem("admin_token");
}
setIsLoading(false);
};
const handleLogin = async (e: React.FormEvent) => {
e.preventDefault();
setLoginError("");
setIsLoading(true);
try {
const formData = new FormData();
formData.append("username", username);
formData.append("password", password);
const response = await fetch(`${API_URL}/admin/login`, {
method: "POST",
body: formData,
});
const data = await response.json();
if (response.ok) {
localStorage.setItem("admin_token", data.token);
setIsAuthenticated(true);
fetchDashboard(data.token);
} else {
setLoginError(data.detail || "Login failed");
}
} catch (error) {
setLoginError("Connection error. Is the backend running?");
}
setIsLoading(false);
};
const handleLogout = async () => {
const token = localStorage.getItem("admin_token");
if (token) {
try {
await fetch(`${API_URL}/admin/logout`, {
method: "POST",
headers: {
Authorization: `Bearer ${token}`,
},
});
} catch (error) {
console.error("Logout error:", error);
}
}
localStorage.removeItem("admin_token");
setIsAuthenticated(false);
setDashboard(null);
};
const fetchDashboard = async (token?: string) => {
const authToken = token || localStorage.getItem("admin_token");
if (!authToken) return;
setIsRefreshing(true);
try {
const response = await fetch(`${API_URL}/admin/dashboard`, {
headers: {
Authorization: `Bearer ${authToken}`,
},
});
if (response.ok) {
const data = await response.json();
setDashboard(data);
} else if (response.status === 401) {
handleLogout();
}
} catch (error) {
console.error("Failed to fetch dashboard:", error);
}
setIsRefreshing(false);
};
const triggerCleanup = async () => {
const token = localStorage.getItem("admin_token");
if (!token) return;
try {
const response = await fetch(`${API_URL}/admin/cleanup/trigger`, {
method: "POST",
headers: {
Authorization: `Bearer ${token}`,
},
});
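if (response.ok) {
const data = await response.json();
alert(`Cleanup completed: ${data.files_cleaned} files removed`);
fetchDashboard();
} else if (response.status === 401) {
// Token expired or revoked: log out, as fetchDashboard does
handleLogout();
} else {
// Surface other HTTP errors instead of failing silently
alert(`Cleanup failed (HTTP ${response.status})`);
}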
} catch (error) {
console.error("Cleanup failed:", error);
alert("Cleanup failed");
}
};
// Auto-refresh every 30 seconds
useEffect(() => {
if (isAuthenticated) {
const interval = setInterval(() => fetchDashboard(), 30000);
return () => clearInterval(interval);
}
}, [isAuthenticated]);
if (isLoading) {
return (
<div className="flex items-center justify-center min-h-[60vh]">
<div className="animate-spin rounded-full h-12 w-12 border-b-2 border-blue-500"></div>
</div>
);
}
if (!isAuthenticated) {
return (
<div className="flex items-center justify-center min-h-[60vh]">
<div className="bg-zinc-800/50 backdrop-blur rounded-2xl p-8 w-full max-w-md border border-zinc-700/50">
<div className="flex items-center gap-3 mb-6">
<div className="p-3 bg-blue-500/20 rounded-xl">
<Shield className="w-8 h-8 text-blue-400" />
</div>
<div>
<h1 className="text-2xl font-bold">Admin Access</h1>
<p className="text-zinc-400 text-sm">Login to access the dashboard</p>
</div>
</div>
<form onSubmit={handleLogin} className="space-y-4">
<div>
<label className="block text-sm font-medium text-zinc-300 mb-2">Username</label>
<input
type="text"
value={username}
onChange={(e) => setUsername(e.target.value)}
className="w-full px-4 py-3 bg-zinc-900/50 border border-zinc-700 rounded-xl focus:outline-none focus:ring-2 focus:ring-blue-500 focus:border-transparent"
placeholder="admin"
required
/>
</div>
<div>
<label className="block text-sm font-medium text-zinc-300 mb-2">Password</label>
<input
type="password"
value={password}
onChange={(e) => setPassword(e.target.value)}
className="w-full px-4 py-3 bg-zinc-900/50 border border-zinc-700 rounded-xl focus:outline-none focus:ring-2 focus:ring-blue-500 focus:border-transparent"
placeholder="••••••••"
required
/>
</div>
{loginError && (
<div className="p-3 bg-red-500/20 border border-red-500/50 rounded-xl text-red-400 text-sm">
{loginError}
</div>
)}
<button
type="submit"
disabled={isLoading}
className="w-full py-3 bg-blue-600 hover:bg-blue-700 text-white font-medium rounded-xl transition-colors disabled:opacity-50"
>
{isLoading ? "Logging in..." : "Login"}
</button>
</form>
</div>
</div>
);
}
return (
<div className="space-y-6">
{/* Header */}
<div className="flex items-center justify-between">
<div className="flex items-center gap-3">
<div className="p-3 bg-blue-500/20 rounded-xl">
<Shield className="w-8 h-8 text-blue-400" />
</div>
<div>
<h1 className="text-3xl font-bold">Admin Dashboard</h1>
<p className="text-zinc-400">System monitoring and management</p>
</div>
</div>
<div className="flex items-center gap-3">
<button
onClick={() => fetchDashboard()}
disabled={isRefreshing}
className="p-3 bg-zinc-800 hover:bg-zinc-700 rounded-xl transition-colors disabled:opacity-50"
title="Refresh"
>
<RefreshCw className={`w-5 h-5 ${isRefreshing ? "animate-spin" : ""}`} />
</button>
<button
onClick={handleLogout}
className="flex items-center gap-2 px-4 py-3 bg-red-600/20 hover:bg-red-600/30 text-red-400 rounded-xl transition-colors"
>
<LogOut className="w-5 h-5" />
Logout
</button>
</div>
</div>
{dashboard && (
<>
{/* Status Banner */}
<div className={`p-4 rounded-xl flex items-center gap-3 ${
dashboard.status === "healthy"
? "bg-green-500/20 border border-green-500/30"
: "bg-yellow-500/20 border border-yellow-500/30"
}`}>
{dashboard.status === "healthy" ? (
<CheckCircle className="w-6 h-6 text-green-400" />
) : (
<AlertTriangle className="w-6 h-6 text-yellow-400" />
)}
<div>
<span className={`font-medium ${dashboard.status === "healthy" ? "text-green-400" : "text-yellow-400"}`}>
System {dashboard.status.charAt(0).toUpperCase() + dashboard.status.slice(1)}
</span>
{dashboard.issues.length > 0 && (
<p className="text-sm text-zinc-400">{dashboard.issues.join(", ")}</p>
)}
</div>
<div className="ml-auto flex items-center gap-2 text-sm text-zinc-400">
<Clock className="w-4 h-4" />
Uptime: {dashboard.uptime}
</div>
</div>
{/* Stats Grid */}
<div className="grid grid-cols-1 md:grid-cols-2 lg:grid-cols-4 gap-4">
{/* Total Requests */}
<div className="bg-zinc-800/50 backdrop-blur rounded-xl p-5 border border-zinc-700/50">
<div className="flex items-center gap-3 mb-3">
<div className="p-2 bg-blue-500/20 rounded-lg">
<Activity className="w-5 h-5 text-blue-400" />
</div>
<span className="text-zinc-400 text-sm">Total Requests</span>
</div>
<div className="text-3xl font-bold">{dashboard.rate_limits.total_requests.toLocaleString()}</div>
<div className="text-sm text-zinc-500 mt-1">
{dashboard.rate_limits.active_clients} active clients
</div>
</div>
{/* Translations */}
<div className="bg-zinc-800/50 backdrop-blur rounded-xl p-5 border border-zinc-700/50">
<div className="flex items-center gap-3 mb-3">
<div className="p-2 bg-green-500/20 rounded-lg">
<FileText className="w-5 h-5 text-green-400" />
</div>
<span className="text-zinc-400 text-sm">Translations</span>
</div>
<div className="text-3xl font-bold">{dashboard.translations.total.toLocaleString()}</div>
<div className="text-sm text-zinc-500 mt-1">
{dashboard.translations.success_rate}% success rate
</div>
</div>
{/* Memory Usage */}
<div className="bg-zinc-800/50 backdrop-blur rounded-xl p-5 border border-zinc-700/50">
<div className="flex items-center gap-3 mb-3">
<div className="p-2 bg-purple-500/20 rounded-lg">
<Cpu className="w-5 h-5 text-purple-400" />
</div>
<span className="text-zinc-400 text-sm">Memory Usage</span>
</div>
<div className="text-3xl font-bold">{dashboard.system.memory.system_percent}%</div>
<div className="text-sm text-zinc-500 mt-1">
{dashboard.system.memory.system_available_gb.toFixed(1)} GB available
</div>
</div>
{/* Disk Usage */}
<div className="bg-zinc-800/50 backdrop-blur rounded-xl p-5 border border-zinc-700/50">
<div className="flex items-center gap-3 mb-3">
<div className="p-2 bg-orange-500/20 rounded-lg">
<HardDrive className="w-5 h-5 text-orange-400" />
</div>
<span className="text-zinc-400 text-sm">Tracked Files</span>
</div>
<div className="text-3xl font-bold">{dashboard.cleanup.tracked_files_count}</div>
<div className="text-sm text-zinc-500 mt-1">
{dashboard.system.disk.total_size_mb.toFixed(1)} MB total
</div>
</div>
</div>
{/* Detailed Panels */}
<div className="grid grid-cols-1 lg:grid-cols-2 gap-6">
{/* Rate Limits */}
<div className="bg-zinc-800/50 backdrop-blur rounded-xl p-6 border border-zinc-700/50">
<h3 className="text-lg font-semibold mb-4 flex items-center gap-2">
<Users className="w-5 h-5 text-blue-400" />
Rate Limits Configuration
</h3>
<div className="space-y-3">
<div className="flex justify-between items-center py-2 border-b border-zinc-700/50">
<span className="text-zinc-400">Requests per minute</span>
<span className="font-medium">{dashboard.rate_limits.config.requests_per_minute}</span>
</div>
<div className="flex justify-between items-center py-2 border-b border-zinc-700/50">
<span className="text-zinc-400">Translations per minute</span>
<span className="font-medium">{dashboard.rate_limits.config.translations_per_minute}</span>
</div>
<div className="flex justify-between items-center py-2 border-b border-zinc-700/50">
<span className="text-zinc-400">Max file size</span>
<span className="font-medium">{dashboard.config.max_file_size_mb} MB</span>
</div>
<div className="flex justify-between items-center py-2">
<span className="text-zinc-400">Translation service</span>
<span className="font-medium capitalize">{dashboard.config.translation_service}</span>
</div>
</div>
</div>
{/* Cleanup Service */}
<div className="bg-zinc-800/50 backdrop-blur rounded-xl p-6 border border-zinc-700/50">
<div className="flex items-center justify-between mb-4">
<h3 className="text-lg font-semibold flex items-center gap-2">
<Trash2 className="w-5 h-5 text-orange-400" />
Cleanup Service
</h3>
<button
onClick={triggerCleanup}
className="px-3 py-1.5 bg-orange-600/20 hover:bg-orange-600/30 text-orange-400 text-sm rounded-lg transition-colors"
>
Trigger Cleanup
</button>
</div>
<div className="space-y-3">
<div className="flex justify-between items-center py-2 border-b border-zinc-700/50">
<span className="text-zinc-400">Service status</span>
<span className={`font-medium ${dashboard.cleanup.is_running ? "text-green-400" : "text-red-400"}`}>
{dashboard.cleanup.is_running ? "Running" : "Stopped"}
</span>
</div>
<div className="flex justify-between items-center py-2 border-b border-zinc-700/50">
<span className="text-zinc-400">Files cleaned</span>
<span className="font-medium">{dashboard.cleanup.files_cleaned_total}</span>
</div>
<div className="flex justify-between items-center py-2 border-b border-zinc-700/50">
<span className="text-zinc-400">Space freed</span>
<span className="font-medium">{dashboard.cleanup.bytes_freed_total_mb.toFixed(2)} MB</span>
</div>
<div className="flex justify-between items-center py-2">
<span className="text-zinc-400">Cleanup runs</span>
<span className="font-medium">{dashboard.cleanup.cleanup_runs}</span>
</div>
</div>
</div>
</div>
{/* Footer Info */}
<div className="text-center text-sm text-zinc-500 pt-4">
Last updated: {new Date(dashboard.timestamp).toLocaleString()} • Auto-refresh every 30 seconds
</div>
</>
)}
</div>
);
}
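
Reviewer sketch: the admin page tracks authentication with several independent `useState` flags (`isLoading`, `isAuthenticated`, `loginError`). The same flow (stored token verified, login, 401 from the dashboard, logout) can be written as a small pure reducer, which makes the transitions easier to test. This is only an illustration under stated assumptions: `AuthState`, `AuthEvent`, and `authReducer` are hypothetical names that do not appear in the diff.

```typescript
// Hypothetical types and names; not part of the commit.
type AuthState =
  | { phase: "loading" }
  | { phase: "anonymous"; error?: string }
  | { phase: "authenticated"; token: string };

type AuthEvent =
  | { type: "NO_STORED_TOKEN" }
  | { type: "VERIFY_OK"; token: string }
  | { type: "VERIFY_FAILED" }
  | { type: "LOGIN_OK"; token: string }
  | { type: "LOGIN_FAILED"; detail?: string }
  | { type: "UNAUTHORIZED" } // for example a 401 from /admin/dashboard
  | { type: "LOGOUT" };

// The next state depends only on the event, so the previous state is unused.
function authReducer(_state: AuthState, event: AuthEvent): AuthState {
  switch (event.type) {
    case "VERIFY_OK":
    case "LOGIN_OK":
      return { phase: "authenticated", token: event.token };
    case "LOGIN_FAILED":
      // Mirrors `data.detail || "Login failed"` in handleLogin
      return { phase: "anonymous", error: event.detail || "Login failed" };
    case "NO_STORED_TOKEN":
    case "VERIFY_FAILED":
    case "UNAUTHORIZED":
    case "LOGOUT":
      return { phase: "anonymous" };
  }
}

// Replay a session: the stored token verifies, a 401 arrives later,
// and a subsequent login attempt is rejected.
const events: AuthEvent[] = [
  { type: "VERIFY_OK", token: "abc" },
  { type: "UNAUTHORIZED" },
  { type: "LOGIN_FAILED", detail: "Invalid credentials" },
];
const finalState = events.reduce<AuthState>(authReducer, { phase: "loading" });
console.log(finalState);
```

In a component this reducer could be passed to React's `useReducer`; the fetch calls would then only dispatch events.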

Binary file not shown. Size: 25 KiB


@@ -0,0 +1,122 @@
@import "tailwindcss";
@import "tw-animate-css";
@custom-variant dark (&:is(.dark *));
@theme inline {
--color-background: var(--background);
--color-foreground: var(--foreground);
--font-sans: var(--font-geist-sans);
--font-mono: var(--font-geist-mono);
--color-sidebar-ring: var(--sidebar-ring);
--color-sidebar-border: var(--sidebar-border);
--color-sidebar-accent-foreground: var(--sidebar-accent-foreground);
--color-sidebar-accent: var(--sidebar-accent);
--color-sidebar-primary-foreground: var(--sidebar-primary-foreground);
--color-sidebar-primary: var(--sidebar-primary);
--color-sidebar-foreground: var(--sidebar-foreground);
--color-sidebar: var(--sidebar);
--color-chart-5: var(--chart-5);
--color-chart-4: var(--chart-4);
--color-chart-3: var(--chart-3);
--color-chart-2: var(--chart-2);
--color-chart-1: var(--chart-1);
--color-ring: var(--ring);
--color-input: var(--input);
--color-border: var(--border);
--color-destructive: var(--destructive);
--color-accent-foreground: var(--accent-foreground);
--color-accent: var(--accent);
--color-muted-foreground: var(--muted-foreground);
--color-muted: var(--muted);
--color-secondary-foreground: var(--secondary-foreground);
--color-secondary: var(--secondary);
--color-primary-foreground: var(--primary-foreground);
--color-primary: var(--primary);
--color-popover-foreground: var(--popover-foreground);
--color-popover: var(--popover);
--color-card-foreground: var(--card-foreground);
--color-card: var(--card);
--radius-sm: calc(var(--radius) - 4px);
--radius-md: calc(var(--radius) - 2px);
--radius-lg: var(--radius);
--radius-xl: calc(var(--radius) + 4px);
}
:root {
--radius: 0.625rem;
--background: oklch(1 0 0);
--foreground: oklch(0.145 0 0);
--card: oklch(1 0 0);
--card-foreground: oklch(0.145 0 0);
--popover: oklch(1 0 0);
--popover-foreground: oklch(0.145 0 0);
--primary: oklch(0.205 0 0);
--primary-foreground: oklch(0.985 0 0);
--secondary: oklch(0.97 0 0);
--secondary-foreground: oklch(0.205 0 0);
--muted: oklch(0.97 0 0);
--muted-foreground: oklch(0.556 0 0);
--accent: oklch(0.97 0 0);
--accent-foreground: oklch(0.205 0 0);
--destructive: oklch(0.577 0.245 27.325);
--border: oklch(0.922 0 0);
--input: oklch(0.922 0 0);
--ring: oklch(0.708 0 0);
--chart-1: oklch(0.646 0.222 41.116);
--chart-2: oklch(0.6 0.118 184.704);
--chart-3: oklch(0.398 0.07 227.392);
--chart-4: oklch(0.828 0.189 84.429);
--chart-5: oklch(0.769 0.188 70.08);
--sidebar: oklch(0.985 0 0);
--sidebar-foreground: oklch(0.145 0 0);
--sidebar-primary: oklch(0.205 0 0);
--sidebar-primary-foreground: oklch(0.985 0 0);
--sidebar-accent: oklch(0.97 0 0);
--sidebar-accent-foreground: oklch(0.205 0 0);
--sidebar-border: oklch(0.922 0 0);
--sidebar-ring: oklch(0.708 0 0);
}
.dark {
--background: #262626;
--foreground: oklch(0.985 0 0);
--card: #2d2d2d;
--card-foreground: oklch(0.985 0 0);
--popover: #2d2d2d;
--popover-foreground: oklch(0.985 0 0);
--primary: oklch(0.922 0 0);
--primary-foreground: oklch(0.205 0 0);
--secondary: #333333;
--secondary-foreground: oklch(0.985 0 0);
--muted: #333333;
--muted-foreground: oklch(0.708 0 0);
--accent: #333333;
--accent-foreground: oklch(0.985 0 0);
--destructive: oklch(0.704 0.191 22.216);
--border: oklch(1 0 0 / 10%);
--input: oklch(1 0 0 / 15%);
--ring: oklch(0.556 0 0);
--chart-1: oklch(0.488 0.243 264.376);
--chart-2: oklch(0.696 0.17 162.48);
--chart-3: oklch(0.769 0.188 70.08);
--chart-4: oklch(0.627 0.265 303.9);
--chart-5: oklch(0.645 0.246 16.439);
--sidebar: #1f1f1f;
--sidebar-foreground: oklch(0.985 0 0);
--sidebar-primary: oklch(0.488 0.243 264.376);
--sidebar-primary-foreground: oklch(0.985 0 0);
--sidebar-accent: #333333;
--sidebar-accent-foreground: oklch(0.985 0 0);
--sidebar-border: oklch(1 0 0 / 10%);
--sidebar-ring: oklch(0.556 0 0);
}
@layer base {
* {
@apply border-border outline-ring/50;
}
body {
@apply bg-background text-foreground;
}
}
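
Reviewer sketch: the `--radius-*` tokens in the `@theme inline` block are derived from `--radius: 0.625rem` with `calc()`. The arithmetic below shows the resulting pixel values, assuming the common browser default of a 16px root font size (an assumption; the stylesheet does not set it).

```typescript
// Assumption: 1rem = 16px (browser default; not set anywhere in globals.css).
const ROOT_FONT_PX = 16;
const radiusPx = 0.625 * ROOT_FONT_PX; // --radius: 0.625rem

const radii = {
  sm: radiusPx - 4, // calc(var(--radius) - 4px)
  md: radiusPx - 2, // calc(var(--radius) - 2px)
  lg: radiusPx,     // var(--radius)
  xl: radiusPx + 4, // calc(var(--radius) + 4px)
};
console.log(radii); // { sm: 6, md: 8, lg: 10, xl: 14 }
```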


@@ -0,0 +1,32 @@
import type { Metadata } from "next";
import { Inter } from "next/font/google";
import "./globals.css";
import { Sidebar } from "@/components/sidebar";
const inter = Inter({
subsets: ["latin"],
});
export const metadata: Metadata = {
title: "Translate Co. - Document Translation",
description: "Translate Excel, Word, and PowerPoint documents while preserving formatting",
};
export default function RootLayout({
children,
}: Readonly<{
children: React.ReactNode;
}>) {
return (
<html lang="en" className="dark">
<body className={`${inter.className} bg-[#262626] text-zinc-100 antialiased`}>
<Sidebar />
<main className="ml-64 min-h-screen p-8">
<div className="max-w-6xl mx-auto">
{children}
</div>
</main>
</body>
</html>
);
}

frontend/src/app/page.tsx Normal file

@@ -0,0 +1,49 @@
"use client";
import { FileUploader } from "@/components/file-uploader";
import { useTranslationStore } from "@/lib/store";
import { Badge } from "@/components/ui/badge";
import { Settings } from "lucide-react";
import Link from "next/link";
export default function Home() {
const { settings } = useTranslationStore();
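const providerNames: Record<string, string> = {
google: "Google Translate",
ollama: "Ollama",
deepl: "DeepL",
libre: "LibreTranslate",
webllm: "WebLLM",
// OpenAI support is part of this change set; the "openai" key is assumed to match the store's provider id
openai: "OpenAI",
};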
return (
<div className="space-y-6">
<div className="flex items-start justify-between">
<div>
<h1 className="text-3xl font-bold text-white">Translate Documents</h1>
<p className="text-zinc-400 mt-1">
Upload and translate Excel, Word, and PowerPoint files while preserving all formatting.
</p>
</div>
{/* Current Configuration Badge */}
<Link href="/settings/services" className="flex items-center gap-2 px-3 py-2 rounded-lg bg-zinc-800/50 border border-zinc-700 hover:bg-zinc-800 transition-colors">
<Settings className="h-4 w-4 text-zinc-400" />
<div className="flex items-center gap-2">
<Badge variant="outline" className="border-teal-500/50 text-teal-400 text-xs">
{providerNames[settings.defaultProvider] ?? settings.defaultProvider}
</Badge>
{settings.defaultProvider === "ollama" && settings.ollamaModel && (
<Badge variant="outline" className="border-zinc-600 text-zinc-400 text-xs">
{settings.ollamaModel}
</Badge>
)}
</div>
</Link>
</div>
<FileUploader />
</div>
);
}
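
Reviewer sketch: `providerNames` is typed as `Record<string, string>`, so a provider without a label renders as an empty badge and the compiler cannot flag it. Keying the map by a union of provider ids turns a missing label into a compile-time error. The names `ProviderId` and `providerLabel` are hypothetical, and `"openai"` as a provider id is an assumption; the other ids are taken from the diff.

```typescript
// Hypothetical type; "openai" is an assumed id, the rest appear in page.tsx.
type ProviderId = "google" | "ollama" | "deepl" | "libre" | "webllm" | "openai";

// Record<ProviderId, string> forces one label per provider id.
const providerNames: Record<ProviderId, string> = {
  google: "Google Translate",
  ollama: "Ollama",
  deepl: "DeepL",
  libre: "LibreTranslate",
  webllm: "WebLLM",
  openai: "OpenAI",
};

// For ids that arrive as plain strings (e.g. from persisted settings),
// fall back to the raw id rather than rendering nothing.
function providerLabel(id: string): string {
  return Object.prototype.hasOwnProperty.call(providerNames, id)
    ? providerNames[id as ProviderId]
    : id;
}

console.log(providerLabel("deepl"));  // DeepL
console.log(providerLabel("custom")); // custom
```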


@@ -0,0 +1,239 @@
"use client";
import { useState, useEffect } from "react";
import { Card, CardContent, CardDescription, CardHeader, CardTitle } from "@/components/ui/card";
import { Button } from "@/components/ui/button";
import { Textarea } from "@/components/ui/textarea";
import { Badge } from "@/components/ui/badge";
import { useTranslationStore } from "@/lib/store";
import { Save, Loader2, Brain, BookOpen, Sparkles, Trash2 } from "lucide-react";
export default function ContextGlossaryPage() {
const { settings, updateSettings, applyPreset, clearContext } = useTranslationStore();
const [isSaving, setIsSaving] = useState(false);
const [localSettings, setLocalSettings] = useState({
systemPrompt: settings.systemPrompt,
glossary: settings.glossary,
});
useEffect(() => {
setLocalSettings({
systemPrompt: settings.systemPrompt,
glossary: settings.glossary,
});
}, [settings]);
const handleSave = async () => {
setIsSaving(true);
try {
updateSettings(localSettings);
// Short artificial delay so the "Saving..." state stays visible to the user
await new Promise((resolve) => setTimeout(resolve, 500));
} finally {
setIsSaving(false);
}
};
const handleApplyPreset = (preset: 'hvac' | 'it' | 'legal' | 'medical') => {
applyPreset(preset);
// Need to get the updated values from the store after applying preset
setTimeout(() => {
setLocalSettings({
systemPrompt: useTranslationStore.getState().settings.systemPrompt,
glossary: useTranslationStore.getState().settings.glossary,
});
}, 0);
};
const handleClear = () => {
clearContext();
setLocalSettings({
systemPrompt: "",
glossary: "",
});
};
// Check which LLM providers are configured
const isOllamaConfigured = settings.ollamaUrl && settings.ollamaModel;
const isOpenAIConfigured = !!settings.openaiApiKey;
const isWebLLMAvailable = typeof window !== 'undefined' && 'gpu' in navigator;
return (
<div className="space-y-6">
<div>
<h1 className="text-3xl font-bold text-white">Context & Glossary</h1>
<p className="text-zinc-400 mt-1">
Configure translation context and glossary for LLM-based providers.
</p>
{/* LLM Provider Status */}
<div className="flex flex-wrap gap-2 mt-3">
<Badge
variant="outline"
className={`${isOllamaConfigured ? 'border-green-500 text-green-400' : 'border-zinc-600 text-zinc-500'}`}
>
🤖 Ollama {isOllamaConfigured ? '✓' : '○'}
</Badge>
<Badge
variant="outline"
className={`${isOpenAIConfigured ? 'border-green-500 text-green-400' : 'border-zinc-600 text-zinc-500'}`}
>
🧠 OpenAI {isOpenAIConfigured ? '✓' : '○'}
</Badge>
<Badge
variant="outline"
className={`${isWebLLMAvailable ? 'border-green-500 text-green-400' : 'border-zinc-600 text-zinc-500'}`}
>
💻 WebLLM {isWebLLMAvailable ? '✓' : '○'}
</Badge>
</div>
</div>
{/* Info Banner */}
<div className="p-4 rounded-lg bg-teal-500/10 border border-teal-500/30">
<p className="text-teal-400 text-sm flex items-center gap-2">
<Sparkles className="h-4 w-4" />
<span>
<strong>Context & Glossary</strong> settings apply to all LLM providers:
<strong> Ollama</strong>, <strong>OpenAI</strong>, and <strong>WebLLM</strong>.
Use them to improve translation quality with domain-specific instructions.
</span>
</p>
</div>
<div className="grid grid-cols-1 lg:grid-cols-2 gap-6">
{/* Left Column */}
<div className="space-y-6">
{/* System Prompt */}
<Card className="border-zinc-800 bg-zinc-900/50">
<CardHeader>
<CardTitle className="text-white flex items-center gap-2">
<Brain className="h-5 w-5 text-teal-400" />
System Prompt
</CardTitle>
<CardDescription>
Instructions for the LLM to follow during translation.
Works with Ollama, OpenAI, and WebLLM.
</CardDescription>
</CardHeader>
<CardContent className="space-y-4">
<Textarea
id="system-prompt"
value={localSettings.systemPrompt}
onChange={(e) =>
setLocalSettings({ ...localSettings, systemPrompt: e.target.value })
}
placeholder="Example: You are translating technical HVAC documents. Use precise engineering terminology. Maintain consistency with industry standards..."
className="bg-zinc-800 border-zinc-700 text-white placeholder:text-zinc-500 min-h-[200px] resize-y"
/>
<p className="text-xs text-zinc-500">
💡 Tip: Include domain context, tone preferences, or specific terminology rules.
</p>
</CardContent>
</Card>
{/* Presets */}
<Card className="border-zinc-800 bg-zinc-900/50">
<CardHeader>
<CardTitle className="text-white">Quick Presets</CardTitle>
<CardDescription>
Load pre-configured prompts & glossaries for common domains.
</CardDescription>
</CardHeader>
<CardContent>
<div className="grid grid-cols-2 gap-2">
<Button
variant="outline"
onClick={() => handleApplyPreset("hvac")}
className="border-zinc-700 text-zinc-300 hover:bg-zinc-800 hover:text-teal-400 justify-start"
>
🔧 HVAC / Engineering
</Button>
<Button
variant="outline"
onClick={() => handleApplyPreset("it")}
className="border-zinc-700 text-zinc-300 hover:bg-zinc-800 hover:text-teal-400 justify-start"
>
💻 IT / Software
</Button>
<Button
variant="outline"
onClick={() => handleApplyPreset("legal")}
className="border-zinc-700 text-zinc-300 hover:bg-zinc-800 hover:text-teal-400 justify-start"
>
⚖️ Legal / Contracts
</Button>
<Button
variant="outline"
onClick={() => handleApplyPreset("medical")}
className="border-zinc-700 text-zinc-300 hover:bg-zinc-800 hover:text-teal-400 justify-start"
>
🏥 Medical / Healthcare
</Button>
</div>
<Button
variant="ghost"
onClick={handleClear}
className="w-full mt-3 text-red-400 hover:text-red-300 hover:bg-red-500/10"
>
<Trash2 className="h-4 w-4 mr-2" />
Clear All
</Button>
</CardContent>
</Card>
</div>
{/* Right Column */}
<div className="space-y-6">
{/* Glossary */}
<Card className="border-zinc-800 bg-zinc-900/50">
<CardHeader>
<CardTitle className="text-white flex items-center gap-2">
<BookOpen className="h-5 w-5 text-teal-400" />
Technical Glossary
</CardTitle>
<CardDescription>
Define specific term translations. Format: source=target (one per line).
</CardDescription>
</CardHeader>
<CardContent className="space-y-4">
<Textarea
id="glossary"
value={localSettings.glossary}
onChange={(e) =>
setLocalSettings({ ...localSettings, glossary: e.target.value })
}
placeholder="pression statique=static pressure&#10;récupérateur=heat recovery unit&#10;ventilo-connecteur=fan coil unit&#10;gaine=duct&#10;diffuseur=diffuser"
className="bg-zinc-800 border-zinc-700 text-white placeholder:text-zinc-500 min-h-[280px] resize-y font-mono text-sm"
/>
<p className="text-xs text-zinc-500">
💡 The glossary is included in the system prompt to guide translations.
</p>
</CardContent>
</Card>
</div>
</div>
{/* Save Button */}
<div className="flex justify-end">
<Button
onClick={handleSave}
disabled={isSaving}
className="bg-teal-600 hover:bg-teal-700 text-white px-8"
>
{isSaving ? (
<>
<Loader2 className="mr-2 h-4 w-4 animate-spin" />
Saving...
</>
) : (
<>
<Save className="mr-2 h-4 w-4" />
Save Settings
</>
)}
</Button>
</div>
</div>
);
}
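
Reviewer sketch: the glossary card documents the format `source=target` (one per line) and says the glossary is included in the system prompt. The diff does not show how that is done, so the following is only one possible reading. `parseGlossary` and `buildSystemPrompt` are hypothetical helpers, and the wording of the prompt section is invented for illustration.

```typescript
// Hypothetical helpers; the real prompt assembly is not shown in this diff.
interface GlossaryEntry {
  source: string;
  target: string;
}

// Parse "source=target" lines; blank or malformed lines are skipped.
function parseGlossary(text: string): GlossaryEntry[] {
  const entries: GlossaryEntry[] = [];
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // no "=" or empty source term
    const source = line.slice(0, eq).trim();
    const target = line.slice(eq + 1).trim();
    if (source && target) entries.push({ source, target });
  }
  return entries;
}

// Append the glossary to the user's system prompt as a bullet list.
function buildSystemPrompt(systemPrompt: string, glossary: string): string {
  const terms = parseGlossary(glossary);
  const base = systemPrompt.trim();
  if (terms.length === 0) return base;
  const lines = terms.map((t) => `- "${t.source}" -> "${t.target}"`);
  return [base, "Use these term translations:", ...lines]
    .filter((part) => part.length > 0)
    .join("\n");
}

console.log(buildSystemPrompt("Translate HVAC documents.", "gaine=duct\ndiffuseur=diffuser"));
```

Splitting on the first `=` keeps targets that themselves contain `=` intact.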


@@ -0,0 +1,247 @@
"use client";
import { useState, useEffect } from "react";
import { Card, CardContent, CardDescription, CardHeader, CardTitle } from "@/components/ui/card";
import { Button } from "@/components/ui/button";
import { Label } from "@/components/ui/label";
import { Badge } from "@/components/ui/badge";
import { useTranslationStore } from "@/lib/store";
import { languages } from "@/lib/api";
import { Save, Loader2, Settings, Globe, Trash2 } from "lucide-react";
import {
Select,
SelectContent,
SelectItem,
SelectTrigger,
SelectValue,
} from "@/components/ui/select";
export default function GeneralSettingsPage() {
const { settings, updateSettings } = useTranslationStore();
const [isSaving, setIsSaving] = useState(false);
const [isClearing, setIsClearing] = useState(false);
const [defaultLanguage, setDefaultLanguage] = useState(settings.defaultTargetLanguage);
useEffect(() => {
setDefaultLanguage(settings.defaultTargetLanguage);
}, [settings.defaultTargetLanguage]);
const handleSave = async () => {
setIsSaving(true);
try {
updateSettings({ defaultTargetLanguage: defaultLanguage });
// Short artificial delay so the "Saving..." state stays visible to the user
await new Promise((resolve) => setTimeout(resolve, 500));
} finally {
setIsSaving(false);
}
};
const handleClearCache = async () => {
setIsClearing(true);
try {
// Clear localStorage
localStorage.removeItem('translation-settings');
// Clear sessionStorage
sessionStorage.clear();
// Clear any cached files/blobs
if ('caches' in window) {
const cacheNames = await caches.keys();
await Promise.all(cacheNames.map(name => caches.delete(name)));
}
await new Promise((resolve) => setTimeout(resolve, 500));
// Reload to reset state
window.location.reload();
} catch (error) {
console.error('Error clearing cache:', error);
setIsClearing(false);
}
};
return (
<div className="space-y-6">
<div>
<h1 className="text-3xl font-bold text-white">General Settings</h1>
<p className="text-zinc-400 mt-1">
Configure general application settings and preferences.
</p>
</div>
<Card className="border-zinc-800 bg-zinc-900/50">
<CardHeader>
<div className="flex items-center gap-3">
<Settings className="h-6 w-6 text-teal-400" />
<div>
<CardTitle className="text-white">Application Settings</CardTitle>
<CardDescription>
General configuration options
</CardDescription>
</div>
</div>
</CardHeader>
<CardContent className="space-y-6">
<div className="space-y-2">
<Label htmlFor="default-language" className="text-zinc-300">
Default Target Language
</Label>
<Select value={defaultLanguage} onValueChange={setDefaultLanguage}>
<SelectTrigger className="bg-zinc-800 border-zinc-700 text-white">
<SelectValue placeholder="Select default language" />
</SelectTrigger>
<SelectContent className="bg-zinc-800 border-zinc-700 max-h-[300px]">
{languages.map((lang) => (
<SelectItem
key={lang.code}
value={lang.code}
className="text-white hover:bg-zinc-700"
>
<span className="flex items-center gap-2">
<span>{lang.flag}</span>
<span>{lang.name}</span>
</span>
</SelectItem>
))}
</SelectContent>
</Select>
<p className="text-xs text-zinc-500">
This language will be pre-selected when translating documents
</p>
</div>
</CardContent>
</Card>
{/* Supported Formats */}
<Card className="border-zinc-800 bg-zinc-900/50">
<CardHeader>
<div className="flex items-center gap-3">
<Globe className="h-6 w-6 text-teal-400" />
<div>
<CardTitle className="text-white">Supported Formats</CardTitle>
<CardDescription>
Document types that can be translated
</CardDescription>
</div>
</div>
</CardHeader>
<CardContent>
<div className="grid grid-cols-1 md:grid-cols-3 gap-4">
<div className="p-4 rounded-lg border border-zinc-800 bg-zinc-800/30">
<div className="text-2xl mb-2">📊</div>
<h3 className="font-medium text-white">Excel</h3>
<p className="text-xs text-zinc-500 mt-1">.xlsx, .xls</p>
<div className="flex flex-wrap gap-1 mt-2">
<Badge variant="outline" className="border-zinc-700 text-zinc-400 text-xs">
Formulas
</Badge>
<Badge variant="outline" className="border-zinc-700 text-zinc-400 text-xs">
Styles
</Badge>
<Badge variant="outline" className="border-zinc-700 text-zinc-400 text-xs">
Images
</Badge>
</div>
</div>
<div className="p-4 rounded-lg border border-zinc-800 bg-zinc-800/30">
<div className="text-2xl mb-2">📝</div>
<h3 className="font-medium text-white">Word</h3>
<p className="text-xs text-zinc-500 mt-1">.docx, .doc</p>
<div className="flex flex-wrap gap-1 mt-2">
<Badge variant="outline" className="border-zinc-700 text-zinc-400 text-xs">
Headers
</Badge>
<Badge variant="outline" className="border-zinc-700 text-zinc-400 text-xs">
Tables
</Badge>
<Badge variant="outline" className="border-zinc-700 text-zinc-400 text-xs">
Images
</Badge>
</div>
</div>
<div className="p-4 rounded-lg border border-zinc-800 bg-zinc-800/30">
<div className="text-2xl mb-2">📽</div>
<h3 className="font-medium text-white">PowerPoint</h3>
<p className="text-xs text-zinc-500 mt-1">.pptx, .ppt</p>
<div className="flex flex-wrap gap-1 mt-2">
<Badge variant="outline" className="border-zinc-700 text-zinc-400 text-xs">
Slides
</Badge>
<Badge variant="outline" className="border-zinc-700 text-zinc-400 text-xs">
Notes
</Badge>
<Badge variant="outline" className="border-zinc-700 text-zinc-400 text-xs">
Images
</Badge>
</div>
</div>
</div>
</CardContent>
</Card>
{/* API Status */}
<Card className="border-zinc-800 bg-zinc-900/50">
<CardHeader>
<CardTitle className="text-white">API Information</CardTitle>
<CardDescription>
Backend server connection details
</CardDescription>
</CardHeader>
<CardContent>
<div className="space-y-3">
<div className="flex items-center justify-between p-3 rounded-lg bg-zinc-800/50">
<span className="text-zinc-400">API Endpoint</span>
<code className="text-teal-400 text-sm">http://localhost:8000</code>
</div>
<div className="flex items-center justify-between p-3 rounded-lg bg-zinc-800/50">
<span className="text-zinc-400">Health Check</span>
<code className="text-teal-400 text-sm">/health</code>
</div>
<div className="flex items-center justify-between p-3 rounded-lg bg-zinc-800/50">
<span className="text-zinc-400">Translate Endpoint</span>
<code className="text-teal-400 text-sm">/translate</code>
</div>
</div>
</CardContent>
</Card>
{/* Actions: Clear Cache and Save Settings */}
<div className="flex justify-between items-center">
<Button
onClick={handleClearCache}
disabled={isClearing}
variant="destructive"
className="bg-red-600 hover:bg-red-700 text-white px-6"
>
{isClearing ? (
<>
<Loader2 className="mr-2 h-4 w-4 animate-spin" />
Clearing...
</>
) : (
<>
<Trash2 className="mr-2 h-4 w-4" />
Clear Cache
</>
)}
</Button>
<Button
onClick={handleSave}
disabled={isSaving}
className="bg-teal-600 hover:bg-teal-700 text-white px-8"
>
{isSaving ? (
<>
<Loader2 className="mr-2 h-4 w-4 animate-spin" />
Saving...
</>
) : (
<>
<Save className="mr-2 h-4 w-4" />
Save Settings
</>
)}
</Button>
</div>
</div>
);
}
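
Reviewer sketch, not part of this commit: the API Information card above hardcodes `http://localhost:8000`, `/health`, and `/translate` as display strings. Assuming the frontend wants one source of truth for the backend address, a small helper along these lines could derive all three. `DEFAULT_API_BASE_URL` and `apiUrl` are hypothetical names, and how the base URL is really configured is not shown in this diff.

```typescript
// Hypothetical helper (illustrative names, not from the commit).
// The default mirrors the value displayed in the API Information card.
const DEFAULT_API_BASE_URL = "http://localhost:8000";

// Join a base URL and a path without producing doubled or missing slashes.
function apiUrl(path: string, base: string = DEFAULT_API_BASE_URL): string {
  const trimmedBase = base.replace(/\/+$/, ""); // drop trailing slashes
  const normalizedPath = path.startsWith("/") ? path : `/${path}`;
  return `${trimmedBase}${normalizedPath}`;
}

// The two endpoints named in the card, built from the same base.
const endpoints = {
  health: apiUrl("/health"),
  translate: apiUrl("/translate"),
};
```

The card could then render `{endpoints.health}` instead of a literal string, so the displayed values would follow whatever base URL the app is actually using.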


@@ -0,0 +1,722 @@
"use client";
import { useState, useEffect } from "react";
import { Card, CardContent, CardDescription, CardHeader, CardTitle } from "@/components/ui/card";
import { Button } from "@/components/ui/button";
import { Input } from "@/components/ui/input";
import { Label } from "@/components/ui/label";
import { Badge } from "@/components/ui/badge";
import { Switch } from "@/components/ui/switch";
import { useTranslationStore, webllmModels, openaiModels } from "@/lib/store";
import { providers, testOpenAIConnection, testOllamaConnection, getOllamaModels, type OllamaModel } from "@/lib/api";
import { useWebLLM } from "@/lib/webllm";
import { Save, Loader2, Cloud, Check, ExternalLink, Wifi, CheckCircle, XCircle, Download, Trash2, Cpu, Server, RefreshCw } from "lucide-react";
import {
Select,
SelectContent,
SelectItem,
SelectTrigger,
SelectValue,
} from "@/components/ui/select";
import { Progress } from "@/components/ui/progress";
export default function TranslationServicesPage() {
const { settings, updateSettings } = useTranslationStore();
const [isSaving, setIsSaving] = useState(false);
const [selectedProvider, setSelectedProvider] = useState(settings.defaultProvider);
const [translateImages, setTranslateImages] = useState(settings.translateImages);
// Provider-specific states
const [deeplApiKey, setDeeplApiKey] = useState(settings.deeplApiKey);
const [openaiApiKey, setOpenaiApiKey] = useState(settings.openaiApiKey);
const [openaiModel, setOpenaiModel] = useState(settings.openaiModel);
const [libreUrl, setLibreUrl] = useState(settings.libreTranslateUrl);
const [webllmModel, setWebllmModel] = useState(settings.webllmModel);
// Ollama states
const [ollamaUrl, setOllamaUrl] = useState(settings.ollamaUrl);
const [ollamaModel, setOllamaModel] = useState(settings.ollamaModel);
const [ollamaModels, setOllamaModels] = useState<OllamaModel[]>([]);
const [loadingOllamaModels, setLoadingOllamaModels] = useState(false);
const [ollamaTestStatus, setOllamaTestStatus] = useState<"idle" | "testing" | "success" | "error">("idle");
const [ollamaTestMessage, setOllamaTestMessage] = useState("");
// OpenAI connection test state
const [openaiTestStatus, setOpenaiTestStatus] = useState<"idle" | "testing" | "success" | "error">("idle");
const [openaiTestMessage, setOpenaiTestMessage] = useState("");
// WebLLM hook
const webllm = useWebLLM();
useEffect(() => {
setSelectedProvider(settings.defaultProvider);
setTranslateImages(settings.translateImages);
setDeeplApiKey(settings.deeplApiKey);
setOpenaiApiKey(settings.openaiApiKey);
setOpenaiModel(settings.openaiModel);
setLibreUrl(settings.libreTranslateUrl);
setWebllmModel(settings.webllmModel);
setOllamaUrl(settings.ollamaUrl);
setOllamaModel(settings.ollamaModel);
}, [settings]);
// Load Ollama models when provider is selected
const loadOllamaModels = async () => {
setLoadingOllamaModels(true);
try {
const models = await getOllamaModels(ollamaUrl);
setOllamaModels(models);
} catch (error) {
console.error("Failed to load Ollama models:", error);
} finally {
setLoadingOllamaModels(false);
}
};
useEffect(() => {
if (selectedProvider === "ollama") {
loadOllamaModels();
}
// eslint-disable-next-line react-hooks/exhaustive-deps
}, [selectedProvider]);
const handleTestOllama = async () => {
setOllamaTestStatus("testing");
setOllamaTestMessage("");
try {
const result = await testOllamaConnection(ollamaUrl);
setOllamaTestStatus(result.success ? "success" : "error");
setOllamaTestMessage(result.message);
if (result.success) {
await loadOllamaModels();
updateSettings({ ollamaUrl, ollamaModel });
setOllamaTestMessage(result.message + " - Settings saved!");
}
} catch {
setOllamaTestStatus("error");
setOllamaTestMessage("Connection test failed");
}
};
const handleTestOpenAI = async () => {
if (!openaiApiKey.trim()) {
setOpenaiTestStatus("error");
setOpenaiTestMessage("Please enter an API key first");
return;
}
setOpenaiTestStatus("testing");
setOpenaiTestMessage("");
try {
const result = await testOpenAIConnection(openaiApiKey);
setOpenaiTestStatus(result.success ? "success" : "error");
setOpenaiTestMessage(result.message);
if (result.success) {
updateSettings({ openaiApiKey, openaiModel });
setOpenaiTestMessage(result.message + " - Settings saved!");
}
} catch {
setOpenaiTestStatus("error");
setOpenaiTestMessage("Connection test failed");
}
};
const handleSave = async () => {
setIsSaving(true);
try {
updateSettings({
defaultProvider: selectedProvider,
translateImages,
deeplApiKey,
openaiApiKey,
openaiModel,
libreTranslateUrl: libreUrl,
webllmModel,
ollamaUrl,
ollamaModel,
});
await new Promise((resolve) => setTimeout(resolve, 500));
} finally {
setIsSaving(false);
}
};
return (
<div className="space-y-6">
<div>
<h1 className="text-3xl font-bold text-white">Translation Services</h1>
<p className="text-zinc-400 mt-1">
Select and configure your preferred translation provider.
</p>
</div>
{/* Provider Selection */}
<Card className="border-zinc-800 bg-zinc-900/50">
<CardHeader>
<div className="flex items-center gap-3">
<Cloud className="h-6 w-6 text-teal-400" />
<div>
<CardTitle className="text-white">Choose Provider</CardTitle>
<CardDescription>
Select your default translation service
</CardDescription>
</div>
</div>
</CardHeader>
<CardContent>
<div className="grid grid-cols-1 md:grid-cols-2 lg:grid-cols-3 gap-4">
{providers.map((provider) => (
<div
key={provider.id}
onClick={() => setSelectedProvider(provider.id as typeof selectedProvider)}
tabIndex={-1}
className={`
relative p-4 rounded-lg border-2 cursor-pointer transition-all
${
selectedProvider === provider.id
? "border-teal-500 bg-teal-500/10"
: "border-zinc-700 hover:border-zinc-600 bg-zinc-800/50"
}
`}
>
{selectedProvider === provider.id && (
<div className="absolute top-2 right-2">
<Check className="h-5 w-5 text-teal-400" />
</div>
)}
<div className="text-2xl mb-2">{provider.icon}</div>
<h3 className="font-medium text-white">{provider.name}</h3>
<p className="text-xs text-zinc-500 mt-1">{provider.description}</p>
</div>
))}
</div>
</CardContent>
</Card>
{/* Google - No config needed */}
{selectedProvider === "google" && (
<Card className="border-zinc-800 bg-zinc-900/50 border-l-4 border-l-green-500">
<CardContent className="pt-6">
<div className="flex items-center gap-3">
<CheckCircle className="h-6 w-6 text-green-400" />
<div>
<p className="text-white font-medium">Ready to use!</p>
<p className="text-sm text-zinc-400">
Google Translate works out of the box. No configuration needed.
</p>
</div>
</div>
</CardContent>
</Card>
)}
{/* Ollama Settings */}
{selectedProvider === "ollama" && (
<Card className="border-zinc-800 bg-zinc-900/50">
<CardHeader>
<div className="flex items-center justify-between">
<div className="flex items-center gap-3">
<Server className="h-5 w-5 text-orange-400" />
<div>
<CardTitle className="text-white">Ollama Configuration</CardTitle>
<CardDescription>
Connect to your local Ollama server
</CardDescription>
</div>
</div>
{ollamaTestStatus !== "idle" && ollamaTestStatus !== "testing" && (
<Badge
variant="outline"
className={
ollamaTestStatus === "success"
? "border-green-500 text-green-400"
: "border-red-500 text-red-400"
}
>
{ollamaTestStatus === "success" && <CheckCircle className="h-3 w-3 mr-1" />}
{ollamaTestStatus === "error" && <XCircle className="h-3 w-3 mr-1" />}
{ollamaTestStatus === "success" ? "Connected" : "Error"}
</Badge>
)}
</div>
</CardHeader>
<CardContent className="space-y-4">
<div className="space-y-2">
<Label htmlFor="ollama-url" className="text-zinc-300">
Server URL
</Label>
<div className="flex gap-2">
<Input
id="ollama-url"
value={ollamaUrl}
onChange={(e) => setOllamaUrl(e.target.value)}
placeholder="http://localhost:11434"
className="bg-zinc-800 border-zinc-700 text-white placeholder:text-zinc-500"
/>
<Button
variant="outline"
onClick={handleTestOllama}
disabled={ollamaTestStatus === "testing"}
className="border-zinc-700 text-zinc-300 hover:bg-zinc-800"
>
{ollamaTestStatus === "testing" ? (
<Loader2 className="h-4 w-4 animate-spin" />
) : (
<Wifi className="h-4 w-4" />
)}
</Button>
</div>
{ollamaTestMessage && (
<p className={`text-xs ${ollamaTestStatus === "success" ? "text-green-400" : "text-red-400"}`}>
{ollamaTestMessage}
</p>
)}
</div>
<div className="space-y-2">
<div className="flex items-center justify-between">
<Label htmlFor="ollama-model" className="text-zinc-300">
Model
</Label>
<Button
variant="ghost"
size="sm"
onClick={loadOllamaModels}
disabled={loadingOllamaModels}
className="text-zinc-400 hover:text-teal-400 h-7 px-2"
>
<RefreshCw className={`h-3 w-3 mr-1 ${loadingOllamaModels ? "animate-spin" : ""}`} />
Refresh
</Button>
</div>
<Select
value={ollamaModel}
onValueChange={setOllamaModel}
>
<SelectTrigger className="bg-zinc-800 border-zinc-700 text-white">
<SelectValue placeholder="Select a model" />
</SelectTrigger>
<SelectContent className="bg-zinc-800 border-zinc-700">
{ollamaModels.length > 0 ? (
ollamaModels.map((model) => (
<SelectItem
key={model.name}
value={model.name}
className="text-white hover:bg-zinc-700"
>
{model.name}
</SelectItem>
))
) : (
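{/* Radix Select rejects an empty-string item value, so fall back to a disabled placeholder item */}
<SelectItem value={ollamaModel || "none"} disabled={!ollamaModel} className="text-white">
{ollamaModel || "No models found"}
</SelectItem>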
)}
</SelectContent>
</Select>
<p className="text-xs text-zinc-500">
Don&apos;t have Ollama? Install it from{" "}
<a
href="https://ollama.ai"
target="_blank"
rel="noopener noreferrer"
className="text-teal-400 hover:underline"
>
ollama.ai
</a>
{" "}then run: <code className="bg-zinc-800 px-1 rounded">ollama pull llama3.2</code>
</p>
</div>
</CardContent>
</Card>
)}
{/* WebLLM Settings */}
{selectedProvider === "webllm" && (
<Card className="border-zinc-800 bg-zinc-900/50">
<CardHeader>
<CardTitle className="text-white flex items-center gap-2">
<Cpu className="h-5 w-5 text-teal-400" />
WebLLM Settings
</CardTitle>
<CardDescription>
Run AI models directly in your browser using WebGPU - no server required!
</CardDescription>
</CardHeader>
<CardContent className="space-y-4">
{/* WebGPU Support Check */}
{!webllm.isWebGPUSupported() && (
<div className="p-4 rounded-lg bg-red-500/10 border border-red-500/30">
<p className="text-red-400 text-sm">
WebGPU is not supported in this browser. Please use Chrome 113+, Edge 113+, or another WebGPU-compatible browser.
</p>
</div>
)}
<div className="space-y-2">
<Label htmlFor="webllm-model" className="text-zinc-300">
Model
</Label>
<Select value={webllmModel} onValueChange={setWebllmModel}>
<SelectTrigger className="bg-zinc-800 border-zinc-700 text-white">
<SelectValue placeholder="Select a model" />
</SelectTrigger>
<SelectContent className="bg-zinc-800 border-zinc-700">
{webllmModels.map((model) => (
<SelectItem
key={model.id}
value={model.id}
className="text-white hover:bg-zinc-700"
>
<span className="flex items-center justify-between gap-4">
<span>{model.name}</span>
<Badge variant="outline" className="border-zinc-600 text-zinc-400 text-xs ml-2">
{model.size}
</Badge>
</span>
</SelectItem>
))}
</SelectContent>
</Select>
</div>
{/* Model Loading Status */}
{webllm.isLoading && (
<div className="space-y-2">
<div className="flex items-center justify-between text-sm">
<span className="text-zinc-400">{webllm.loadStatus}</span>
<span className="text-teal-400">{webllm.loadProgress}%</span>
</div>
<Progress value={webllm.loadProgress} className="h-2" />
</div>
)}
{webllm.isLoaded && (
<div className="p-3 rounded-lg bg-green-500/10 border border-green-500/30">
<p className="text-green-400 text-sm flex items-center gap-2">
<CheckCircle className="h-4 w-4" />
Model loaded: {webllm.currentModel}
</p>
</div>
)}
{webllm.error && (
<div className="p-3 rounded-lg bg-red-500/10 border border-red-500/30">
<p className="text-red-400 text-sm flex items-center gap-2">
<XCircle className="h-4 w-4" />
{webllm.error}
</p>
</div>
)}
{/* Action Buttons */}
<div className="flex gap-3">
<Button
onClick={() => webllm.loadModel(webllmModel)}
disabled={webllm.isLoading || !webllm.isWebGPUSupported()}
className="bg-teal-600 hover:bg-teal-700 text-white flex-1"
>
{webllm.isLoading ? (
<>
<Loader2 className="mr-2 h-4 w-4 animate-spin" />
Loading...
</>
) : webllm.isLoaded && webllm.currentModel === webllmModel ? (
<>
<CheckCircle className="mr-2 h-4 w-4" />
Loaded
</>
) : (
<>
<Download className="mr-2 h-4 w-4" />
Load Model
</>
)}
</Button>
<Button
onClick={() => webllm.clearCache()}
variant="destructive"
className="bg-red-600 hover:bg-red-700"
>
<Trash2 className="mr-2 h-4 w-4" />
Clear Cache
</Button>
</div>
<p className="text-xs text-zinc-500">
💡 Models are downloaded once and cached in your browser (~1-5GB depending on model).
Loading may take a minute on first use.
</p>
</CardContent>
</Card>
)}
{/* DeepL Settings */}
{selectedProvider === "deepl" && (
<Card className="border-zinc-800 bg-zinc-900/50">
<CardHeader>
<CardTitle className="text-white">DeepL Settings</CardTitle>
<CardDescription>
Configure your DeepL API credentials
</CardDescription>
</CardHeader>
<CardContent className="space-y-4">
<div className="space-y-2">
<Label htmlFor="deepl-key" className="text-zinc-300">
API Key
</Label>
<Input
id="deepl-key"
type="password"
value={deeplApiKey}
onChange={(e) => setDeeplApiKey(e.target.value)}
onKeyDown={(e) => e.stopPropagation()}
placeholder="Enter your DeepL API key"
className="bg-zinc-800 border-zinc-700 text-white placeholder:text-zinc-500"
/>
<p className="text-xs text-zinc-500">
Get your API key from{" "}
<a
href="https://www.deepl.com/pro-api"
target="_blank"
rel="noopener noreferrer"
className="text-teal-400 hover:underline"
>
deepl.com/pro-api
</a>
</p>
</div>
</CardContent>
</Card>
)}
{/* LibreTranslate Settings */}
{selectedProvider === "libre" && (
<Card className="border-zinc-800 bg-zinc-900/50">
<CardHeader>
<CardTitle className="text-white">LibreTranslate Settings</CardTitle>
<CardDescription>
Configure your LibreTranslate server (open-source, self-hosted)
</CardDescription>
</CardHeader>
<CardContent className="space-y-4">
<div className="space-y-2">
<Label htmlFor="libre-url" className="text-zinc-300">
Server URL
</Label>
<Input
id="libre-url"
value={libreUrl}
onChange={(e) => setLibreUrl(e.target.value)}
onKeyDown={(e) => e.stopPropagation()}
placeholder="https://libretranslate.com"
className="bg-zinc-800 border-zinc-700 text-white placeholder:text-zinc-500"
/>
<div className="flex flex-col gap-1 text-xs text-zinc-500">
<p>Public instances (free but rate-limited):</p>
<div className="flex flex-wrap gap-2 mt-1">
<Button
variant="outline"
size="sm"
className="h-6 text-xs border-zinc-700 text-zinc-400 hover:text-teal-400"
onClick={() => setLibreUrl("https://libretranslate.com")}
>
libretranslate.com <ExternalLink className="h-3 w-3 ml-1" />
</Button>
<Button
variant="outline"
size="sm"
className="h-6 text-xs border-zinc-700 text-zinc-400 hover:text-teal-400"
onClick={() => setLibreUrl("https://translate.argosopentech.com")}
>
argosopentech.com <ExternalLink className="h-3 w-3 ml-1" />
</Button>
</div>
<p className="mt-2">
Or{" "}
<a
href="https://github.com/LibreTranslate/LibreTranslate"
target="_blank"
rel="noopener noreferrer"
className="text-teal-400 hover:underline"
>
self-host your own instance
</a>
</p>
</div>
</div>
</CardContent>
</Card>
)}
{/* OpenAI Settings */}
{selectedProvider === "openai" && (
<Card className="border-zinc-800 bg-zinc-900/50">
<CardHeader>
<div className="flex items-center justify-between">
<div>
<CardTitle className="text-white">OpenAI Settings</CardTitle>
<CardDescription>
Configure your OpenAI API key and model for translations
</CardDescription>
</div>
{openaiTestStatus !== "idle" && openaiTestStatus !== "testing" && (
<Badge
variant="outline"
className={
openaiTestStatus === "success"
? "border-green-500 text-green-400"
: "border-red-500 text-red-400"
}
>
{openaiTestStatus === "success" && <CheckCircle className="h-3 w-3 mr-1" />}
{openaiTestStatus === "error" && <XCircle className="h-3 w-3 mr-1" />}
{openaiTestStatus === "success" ? "Connected" : "Error"}
</Badge>
)}
</div>
</CardHeader>
<CardContent className="space-y-4">
<div className="space-y-2">
<Label htmlFor="openai-key" className="text-zinc-300">
API Key
</Label>
<div className="flex gap-2">
<Input
id="openai-key"
type="password"
value={openaiApiKey}
onChange={(e) => setOpenaiApiKey(e.target.value)}
onKeyDown={(e) => e.stopPropagation()}
placeholder="sk-..."
className="bg-zinc-800 border-zinc-700 text-white placeholder:text-zinc-500"
/>
<Button
variant="outline"
onClick={handleTestOpenAI}
disabled={openaiTestStatus === "testing"}
className="border-zinc-700 text-zinc-300 hover:bg-zinc-800"
>
{openaiTestStatus === "testing" ? (
<Loader2 className="h-4 w-4 animate-spin" />
) : (
<Wifi className="h-4 w-4" />
)}
</Button>
</div>
{openaiTestMessage && (
<p className={`text-xs ${openaiTestStatus === "success" ? "text-green-400" : "text-red-400"}`}>
{openaiTestMessage}
</p>
)}
<p className="text-xs text-zinc-500">
Get your API key from{" "}
<a
href="https://platform.openai.com/api-keys"
target="_blank"
rel="noopener noreferrer"
className="text-teal-400 hover:underline"
>
platform.openai.com
</a>
</p>
</div>
<div className="space-y-2">
<Label htmlFor="openai-model" className="text-zinc-300">
Model
</Label>
<Select
value={openaiModel}
onValueChange={setOpenaiModel}
>
<SelectTrigger
id="openai-model"
className="bg-zinc-800 border-zinc-700 text-white"
>
<SelectValue placeholder="Select a model" />
</SelectTrigger>
<SelectContent className="bg-zinc-800 border-zinc-700">
{openaiModels.map((model) => (
<SelectItem
key={model.id}
value={model.id}
className="text-white hover:bg-zinc-700"
>
<span className="flex items-center justify-between gap-4">
<span>{model.name}</span>
{model.vision && (
<Badge variant="outline" className="border-teal-600 text-teal-400 text-xs ml-2">
Vision
</Badge>
)}
</span>
</SelectItem>
))}
</SelectContent>
</Select>
<p className="text-xs text-zinc-500">
Models with Vision can translate text in images
</p>
</div>
</CardContent>
</Card>
)}
{/* Image Translation - Only for Ollama and OpenAI */}
{(selectedProvider === "ollama" || selectedProvider === "openai") && (
<Card className="border-zinc-800 bg-zinc-900/50">
<CardHeader>
<CardTitle className="text-white">Advanced Options</CardTitle>
<CardDescription>
Additional translation features
</CardDescription>
</CardHeader>
<CardContent>
<div className="flex items-center justify-between rounded-lg border border-zinc-800 p-4">
<div className="space-y-0.5">
<div className="flex items-center gap-2">
<Label className="text-zinc-300">Translate Images by Default</Label>
<Badge variant="outline" className="border-teal-600 text-teal-400 text-xs">
Vision Models
</Badge>
</div>
<p className="text-xs text-zinc-500">
Extract and translate text from embedded images using vision models
</p>
</div>
<Switch
checked={translateImages}
onCheckedChange={setTranslateImages}
/>
</div>
</CardContent>
</Card>
)}
{/* Save Button */}
<div className="flex justify-end">
<Button
onClick={handleSave}
disabled={isSaving}
className="bg-teal-600 hover:bg-teal-700 text-white px-8"
>
{isSaving ? (
<>
<Loader2 className="mr-2 h-4 w-4 animate-spin" />
Saving...
</>
) : (
<>
<Save className="mr-2 h-4 w-4" />
Save Settings
</>
)}
</Button>
</div>
</div>
);
}
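
Reviewer sketch, not part of this commit: `handleSave` above stores whatever is in the form, and missing keys only surface later when a translation is started. Assuming a pre-save check is wanted, the per-provider requirements could live in one pure function. `getProviderConfigError`, `ProviderSettings`, and `isHttpUrl` are hypothetical names; the provider ids and setting fields are the ones used in this diff, and the URL check is a deliberately loose assumption.

```typescript
// Provider ids as used by the pages in this diff.
type ProviderId = "google" | "ollama" | "deepl" | "libre" | "webllm" | "openai";

// Hypothetical subset of the store settings relevant to validation.
interface ProviderSettings {
  deeplApiKey: string;
  openaiApiKey: string;
  libreTranslateUrl: string;
  ollamaUrl: string;
}

// Loose check (assumption): the value starts with http:// or https://
// and contains no whitespace.
function isHttpUrl(value: string): boolean {
  return /^https?:\/\/\S+$/i.test(value.trim());
}

// Returns a user-facing message, or null when the provider is usable as configured.
function getProviderConfigError(provider: ProviderId, s: ProviderSettings): string | null {
  switch (provider) {
    case "openai":
      return s.openaiApiKey.trim() ? null : "OpenAI API key is required";
    case "deepl":
      return s.deeplApiKey.trim() ? null : "DeepL API key is required";
    case "libre":
      return isHttpUrl(s.libreTranslateUrl) ? null : "LibreTranslate server URL is invalid";
    case "ollama":
      return isHttpUrl(s.ollamaUrl) ? null : "Ollama server URL is invalid";
    default:
      // google and webllm need no saved credentials or URLs
      return null;
  }
}
```

Called at the top of `handleSave`, a non-null result could be shown next to the Save button instead of persisting an unusable configuration.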


@@ -0,0 +1,494 @@
"use client";
import { useState, useCallback, useEffect } from "react";
import { useDropzone } from "react-dropzone";
import { Upload, FileText, FileSpreadsheet, Presentation, X, Download, Loader2, Cpu, AlertTriangle } from "lucide-react";
import { Card, CardContent, CardDescription, CardHeader, CardTitle } from "@/components/ui/card";
import { Button } from "@/components/ui/button";
import { Badge } from "@/components/ui/badge";
import { Progress } from "@/components/ui/progress";
import { Select, SelectContent, SelectItem, SelectTrigger, SelectValue } from "@/components/ui/select";
import { Label } from "@/components/ui/label";
import { Switch } from "@/components/ui/switch";
import { useTranslationStore } from "@/lib/store";
import { translateDocument, languages, providers, extractTextsFromDocument, reconstructDocument, TranslatedText } from "@/lib/api";
import { useWebLLM } from "@/lib/webllm";
import { cn } from "@/lib/utils";
const fileIcons: Record<string, React.ElementType> = {
xlsx: FileSpreadsheet,
xls: FileSpreadsheet,
docx: FileText,
doc: FileText,
pptx: Presentation,
ppt: Presentation,
};
type ProviderType = "google" | "ollama" | "deepl" | "libre" | "webllm" | "openai";
export function FileUploader() {
const { settings, isTranslating, progress, setTranslating, setProgress } = useTranslationStore();
const webllm = useWebLLM();
const [file, setFile] = useState<File | null>(null);
const [targetLanguage, setTargetLanguage] = useState(settings.defaultTargetLanguage);
const [provider, setProvider] = useState<ProviderType>(settings.defaultProvider);
const [translateImages, setTranslateImages] = useState(settings.translateImages);
const [downloadUrl, setDownloadUrl] = useState<string | null>(null);
const [error, setError] = useState<string | null>(null);
const [translationStatus, setTranslationStatus] = useState<string>("");
// Sync with store settings when they change
useEffect(() => {
setTargetLanguage(settings.defaultTargetLanguage);
setProvider(settings.defaultProvider);
setTranslateImages(settings.translateImages);
}, [settings.defaultTargetLanguage, settings.defaultProvider, settings.translateImages]);
const onDrop = useCallback((acceptedFiles: File[]) => {
if (acceptedFiles.length > 0) {
setFile(acceptedFiles[0]);
setDownloadUrl(null);
setError(null);
}
}, []);
const { getRootProps, getInputProps, isDragActive } = useDropzone({
onDrop,
accept: {
"application/vnd.openxmlformats-officedocument.spreadsheetml.sheet": [".xlsx"],
"application/vnd.ms-excel": [".xls"],
"application/vnd.openxmlformats-officedocument.wordprocessingml.document": [".docx"],
"application/msword": [".doc"],
"application/vnd.openxmlformats-officedocument.presentationml.presentation": [".pptx"],
"application/vnd.ms-powerpoint": [".ppt"],
},
multiple: false,
});
const getFileExtension = (filename: string) => {
return filename.split(".").pop()?.toLowerCase() || "";
};
const getFileIcon = (filename: string) => {
const ext = getFileExtension(filename);
return fileIcons[ext] || FileText;
};
const formatFileSize = (bytes: number) => {
if (bytes < 1024) return bytes + " B";
if (bytes < 1024 * 1024) return (bytes / 1024).toFixed(1) + " KB";
return (bytes / (1024 * 1024)).toFixed(1) + " MB";
};
const handleTranslate = async () => {
if (!file) return;
// Validate provider-specific requirements
if (provider === "openai" && !settings.openaiApiKey) {
setError("OpenAI API key not configured. Go to Settings > Translation Services to add your API key.");
return;
}
if (provider === "deepl" && !settings.deeplApiKey) {
setError("DeepL API key not configured. Go to Settings > Translation Services to add your API key.");
return;
}
// WebLLM specific validation
if (provider === "webllm") {
if (!webllm.isWebGPUSupported()) {
setError("WebGPU is not supported in this browser. Please use Chrome 113+ or Edge 113+.");
return;
}
if (!webllm.isLoaded) {
setError("WebLLM model not loaded. Go to Settings > Translation Services to load a model first.");
return;
}
}
setTranslating(true);
setProgress(0);
setError(null);
setDownloadUrl(null);
setTranslationStatus("");
try {
// For WebLLM, use client-side translation
if (provider === "webllm") {
await handleWebLLMTranslation();
} else {
await handleServerTranslation();
}
} catch (err) {
setError(err instanceof Error ? err.message : "Translation failed");
} finally {
setTranslating(false);
setTranslationStatus("");
}
};
// Get language name from code
const getLanguageName = (code: string): string => {
const lang = languages.find(l => l.code === code);
return lang ? lang.name : code;
};
// WebLLM client-side translation
const handleWebLLMTranslation = async () => {
if (!file) return;
try {
// Step 1: Extract texts from document
setTranslationStatus("Extracting texts from document...");
setProgress(5);
const extractResult = await extractTextsFromDocument(file);
if (extractResult.texts.length === 0) {
throw new Error("No translatable text found in document");
}
setTranslationStatus(`Found ${extractResult.texts.length} texts to translate`);
setProgress(10);
// Step 2: Translate each text using WebLLM
const translations: TranslatedText[] = [];
const totalTexts = extractResult.texts.length;
const langName = getLanguageName(targetLanguage);
for (let i = 0; i < totalTexts; i++) {
const item = extractResult.texts[i];
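// Only append an ellipsis when the preview is actually truncated
setTranslationStatus(`Translating ${i + 1}/${totalTexts}: "${item.text.substring(0, 30)}${item.text.length > 30 ? "..." : ""}"`);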
const translatedText = await webllm.translate(
item.text,
langName,
settings.systemPrompt || undefined,
settings.glossary || undefined
);
translations.push({
id: item.id,
translated_text: translatedText,
});
// Update progress (10% for extraction, 80% for translation, 10% for reconstruction)
const translationProgress = 10 + (80 * (i + 1) / totalTexts);
setProgress(translationProgress);
}
// Step 3: Reconstruct document with translations
setTranslationStatus("Reconstructing document...");
setProgress(92);
const blob = await reconstructDocument(
extractResult.session_id,
translations,
targetLanguage
);
setProgress(100);
setTranslationStatus("Translation complete!");
const url = URL.createObjectURL(blob);
setDownloadUrl(url);
} catch (err) {
throw err;
}
};
// Server-side translation (existing logic)
const handleServerTranslation = async () => {
if (!file) return;
// Simulate progress for UX
let currentProgress = 0;
const progressInterval = setInterval(() => {
currentProgress = Math.min(currentProgress + Math.random() * 10, 90);
setProgress(currentProgress);
}, 500);
try {
const blob = await translateDocument({
file,
targetLanguage,
provider,
ollamaModel: settings.ollamaModel,
// Pass the per-upload toggle as-is so switching it off overrides the saved default
translateImages,
systemPrompt: settings.systemPrompt,
glossary: settings.glossary,
libreUrl: settings.libreTranslateUrl,
openaiApiKey: settings.openaiApiKey,
openaiModel: settings.openaiModel,
});
clearInterval(progressInterval);
setProgress(100);
const url = URL.createObjectURL(blob);
setDownloadUrl(url);
} catch (err) {
clearInterval(progressInterval);
throw err;
}
};
const handleDownload = () => {
if (!downloadUrl || !file) return;
const a = document.createElement("a");
a.href = downloadUrl;
const ext = getFileExtension(file.name);
// Strip only the final extension, whatever its case (e.g. "Report.XLSX")
const baseName = file.name.replace(/\.[^.]+$/, "");
a.download = `${baseName}_translated.${ext}`;
document.body.appendChild(a);
a.click();
document.body.removeChild(a);
};
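const removeFile = () => {
// Release the blob behind the previous download link before dropping it
if (downloadUrl) URL.revokeObjectURL(downloadUrl);
setFile(null);
setDownloadUrl(null);
setError(null);
setProgress(0);
};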
const FileIcon = file ? getFileIcon(file.name) : FileText;
return (
<div className="space-y-6">
{/* File Drop Zone */}
<Card className="border-zinc-800 bg-zinc-900/50">
<CardHeader>
<CardTitle className="text-white">Upload Document</CardTitle>
<CardDescription>
Drag and drop or click to select a file (Excel, Word, PowerPoint)
</CardDescription>
</CardHeader>
<CardContent>
{!file ? (
<div
{...getRootProps()}
className={cn(
"border-2 border-dashed rounded-xl p-12 text-center cursor-pointer transition-all",
isDragActive
? "border-teal-500 bg-teal-500/10"
: "border-zinc-700 hover:border-zinc-600 hover:bg-zinc-800/50"
)}
>
<input {...getInputProps()} />
<Upload className="h-12 w-12 mx-auto mb-4 text-zinc-500" />
<p className="text-zinc-400 mb-2">
{isDragActive
? "Drop the file here..."
: "Drag & drop a document here, or click to select"}
</p>
<p className="text-xs text-zinc-600">
Supports: .xlsx, .xls, .docx, .doc, .pptx, .ppt
</p>
</div>
) : (
<div className="flex items-center gap-4 p-4 bg-zinc-800/50 rounded-lg">
<div className="flex h-12 w-12 items-center justify-center rounded-lg bg-zinc-700">
<FileIcon className="h-6 w-6 text-teal-400" />
</div>
<div className="flex-1 min-w-0">
<p className="text-sm font-medium text-white truncate">
{file.name}
</p>
<p className="text-xs text-zinc-500">
{formatFileSize(file.size)}
</p>
</div>
<Badge variant="outline" className="border-zinc-700 text-zinc-400">
{getFileExtension(file.name).toUpperCase()}
</Badge>
<Button
variant="ghost"
size="icon"
onClick={removeFile}
className="text-zinc-500 hover:text-red-400"
>
<X className="h-4 w-4" />
</Button>
</div>
)}
</CardContent>
</Card>
{/* Translation Options */}
<Card className="border-zinc-800 bg-zinc-900/50">
<CardHeader>
<CardTitle className="text-white">Translation Options</CardTitle>
<CardDescription>
Configure your translation preferences
</CardDescription>
</CardHeader>
<CardContent className="space-y-6">
<div className="grid grid-cols-1 md:grid-cols-2 gap-6">
{/* Target Language */}
<div className="space-y-2">
<Label htmlFor="language" className="text-zinc-300">Target Language</Label>
<Select value={targetLanguage} onValueChange={setTargetLanguage}>
<SelectTrigger id="language" className="bg-zinc-800 border-zinc-700 text-white">
<SelectValue placeholder="Select language" />
</SelectTrigger>
<SelectContent className="bg-zinc-800 border-zinc-700">
{languages.map((lang) => (
<SelectItem
key={lang.code}
value={lang.code}
className="text-white hover:bg-zinc-700"
>
<span className="flex items-center gap-2">
<span>{lang.flag}</span>
<span>{lang.name}</span>
</span>
</SelectItem>
))}
</SelectContent>
</Select>
</div>
{/* Provider */}
<div className="space-y-2">
<Label htmlFor="provider" className="text-zinc-300">Translation Provider</Label>
<Select value={provider} onValueChange={(value) => setProvider(value as ProviderType)}>
<SelectTrigger id="provider" className="bg-zinc-800 border-zinc-700 text-white">
<SelectValue placeholder="Select provider" />
</SelectTrigger>
<SelectContent className="bg-zinc-800 border-zinc-700">
{providers.map((prov) => (
<SelectItem
key={prov.id}
value={prov.id}
className="text-white hover:bg-zinc-700"
>
<span className="flex items-center gap-2">
<span>{prov.icon}</span>
<span>{prov.name}</span>
</span>
</SelectItem>
))}
</SelectContent>
</Select>
{/* Warning if API key not configured */}
{provider === "openai" && !settings.openaiApiKey && (
<p className="text-xs text-amber-400 mt-1">
OpenAI API key not configured. Go to Settings &gt; Translation Services
</p>
)}
{provider === "deepl" && !settings.deeplApiKey && (
<p className="text-xs text-amber-400 mt-1">
DeepL API key not configured. Go to Settings &gt; Translation Services
</p>
)}
{provider === "webllm" && !webllm.isLoaded && (
<p className="text-xs text-amber-400 mt-1">
WebLLM model not loaded. Go to Settings &gt; Translation Services to load a model
</p>
)}
{provider === "webllm" && webllm.isLoaded && (
<p className="text-xs text-green-400 mt-1 flex items-center gap-1">
<Cpu className="h-3 w-3" />
Model ready: {webllm.currentModel}
</p>
)}
{provider === "webllm" && !webllm.isWebGPUSupported() && (
<p className="text-xs text-red-400 mt-1 flex items-center gap-1">
<AlertTriangle className="h-3 w-3" />
WebGPU not supported in this browser
</p>
)}
</div>
</div>
{/* Image Translation Toggle */}
{(provider === "ollama" || provider === "openai") && (
<div className="flex items-center justify-between rounded-lg border border-zinc-800 p-4">
<div className="space-y-0.5">
<Label className="text-zinc-300">Translate Images</Label>
<p className="text-xs text-zinc-500">
Extract and translate text from embedded images using vision model
</p>
</div>
<Switch
checked={translateImages}
onCheckedChange={setTranslateImages}
/>
</div>
)}
{/* Translate Button */}
<Button
onClick={handleTranslate}
disabled={!file || isTranslating}
className="w-full bg-teal-600 hover:bg-teal-700 text-white h-12"
>
{isTranslating ? (
<>
<Loader2 className="mr-2 h-4 w-4 animate-spin" />
Translating...
</>
) : (
<>
<Upload className="mr-2 h-4 w-4" />
Translate Document
</>
)}
</Button>
{/* Progress Bar */}
{isTranslating && (
<div className="space-y-2">
<div className="flex justify-between text-sm">
<span className="text-zinc-400">
{translationStatus || "Processing..."}
</span>
<span className="text-teal-400">{Math.round(progress)}%</span>
</div>
<Progress value={progress} className="h-2" />
{provider === "webllm" && (
<p className="text-xs text-zinc-500 flex items-center gap-1">
<Cpu className="h-3 w-3" />
Translating locally with WebLLM...
</p>
)}
</div>
)}
{/* Error */}
{error && (
<div className="rounded-lg bg-red-500/10 border border-red-500/30 p-4">
<p className="text-sm text-red-400">{error}</p>
</div>
)}
</CardContent>
</Card>
{/* Download Section */}
{downloadUrl && (
<Card className="border-teal-500/30 bg-teal-500/5">
<CardHeader>
<CardTitle className="text-teal-400 flex items-center gap-2">
<Download className="h-5 w-5" />
Translation Complete
</CardTitle>
<CardDescription>
Your document has been translated successfully
</CardDescription>
</CardHeader>
<CardContent>
<Button
onClick={handleDownload}
className="w-full bg-teal-600 hover:bg-teal-700 text-white h-12"
>
<Download className="mr-2 h-4 w-4" />
Download Translated Document
</Button>
</CardContent>
</Card>
)}
</div>
);
}
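
Review note: the provider warnings in the JSX above are independent conditionals, so with `provider === "webllm"` on a browser without WebGPU both the "model not loaded" and "WebGPU not supported" messages render. A minimal sketch of the same decision logic as a pure function follows. The `ProviderState` shape, `ProviderNotice` type, and `getProviderNotices` name are hypothetical (the real `settings` and `webllm` objects are not shown in this diff), and the ordering that lets the WebGPU error suppress the "not loaded" warning is a suggested variation, not the behavior of the code above.

```typescript
// Hypothetical types: only the fields read by the JSX above are modeled.
type NoticeLevel = "warning" | "success" | "error";

interface ProviderNotice {
  level: NoticeLevel;
  message: string;
}

interface ProviderState {
  openaiApiKey?: string;
  deeplApiKey?: string;
  webllmLoaded: boolean;
  webllmModel?: string;
  webgpuSupported: boolean;
}

// Returns the notices to show under the provider select.
function getProviderNotices(provider: string, state: ProviderState): ProviderNotice[] {
  const notices: ProviderNotice[] = [];

  if (provider === "openai" && !state.openaiApiKey) {
    notices.push({ level: "warning", message: "OpenAI API key not configured." });
  }
  if (provider === "deepl" && !state.deeplApiKey) {
    notices.push({ level: "warning", message: "DeepL API key not configured." });
  }
  if (provider === "webllm") {
    if (!state.webgpuSupported) {
      // Assumption: without WebGPU a model cannot be loaded, so only the error is shown.
      notices.push({ level: "error", message: "WebGPU not supported in this browser" });
    } else if (!state.webllmLoaded) {
      notices.push({ level: "warning", message: "WebLLM model not loaded." });
    } else {
      notices.push({ level: "success", message: `Model ready: ${state.webllmModel ?? "unknown"}` });
    }
  }
  return notices;
}
```

Keeping this logic outside the component would also make it straightforward to reuse for disabling the Translate button when a provider is not ready, if that is wanted.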


@@ -0,0 +1,146 @@
"use client";
import Link from "next/link";
import { usePathname } from "next/navigation";
import { cn } from "@/lib/utils";
import {
Settings,
Cloud,
BookText,
Upload,
Shield,
} from "lucide-react";
import {
Tooltip,
TooltipContent,
TooltipProvider,
TooltipTrigger,
} from "@/components/ui/tooltip";
const navigation = [
{
name: "Translate",
href: "/",
icon: Upload,
description: "Translate documents",
},
{
name: "General Settings",
href: "/settings",
icon: Settings,
description: "Configure general settings",
},
{
name: "Translation Services",
href: "/settings/services",
icon: Cloud,
description: "Configure translation providers",
},
{
name: "Context & Glossary",
href: "/settings/context",
icon: BookText,
description: "System prompts and glossary",
},
];
const adminNavigation = [
{
name: "Admin Dashboard",
href: "/admin",
icon: Shield,
description: "System monitoring (login required)",
},
];
export function Sidebar() {
const pathname = usePathname();
return (
<TooltipProvider>
<aside className="fixed left-0 top-0 z-40 h-screen w-64 border-r border-zinc-800 bg-[#1a1a1a]">
{/* Logo */}
<div className="flex h-16 items-center gap-3 border-b border-zinc-800 px-6">
<div className="flex h-9 w-9 items-center justify-center rounded-lg bg-teal-500 text-white font-bold">
A
</div>
<span className="text-lg font-semibold text-white">Translate Co.</span>
</div>
{/* Navigation */}
<nav className="flex flex-col gap-1 p-4">
{navigation.map((item) => {
const isActive = pathname === item.href;
const Icon = item.icon;
return (
<Tooltip key={item.name}>
<TooltipTrigger asChild>
<Link
href={item.href}
className={cn(
"flex items-center gap-3 rounded-lg px-3 py-2.5 text-sm font-medium transition-colors",
isActive
? "bg-teal-500/10 text-teal-400"
: "text-zinc-400 hover:bg-zinc-800 hover:text-zinc-100"
)}
>
<Icon className="h-5 w-5" />
<span>{item.name}</span>
</Link>
</TooltipTrigger>
<TooltipContent side="right">
<p>{item.description}</p>
</TooltipContent>
</Tooltip>
);
})}
{/* Admin Section */}
<div className="mt-4 pt-4 border-t border-zinc-800">
<p className="px-3 mb-2 text-xs font-medium text-zinc-600 uppercase tracking-wider">Admin</p>
{adminNavigation.map((item) => {
const isActive = pathname === item.href;
const Icon = item.icon;
return (
<Tooltip key={item.name}>
<TooltipTrigger asChild>
<Link
href={item.href}
className={cn(
"flex items-center gap-3 rounded-lg px-3 py-2.5 text-sm font-medium transition-colors",
isActive
? "bg-blue-500/10 text-blue-400"
: "text-zinc-500 hover:bg-zinc-800 hover:text-zinc-300"
)}
>
<Icon className="h-5 w-5" />
<span>{item.name}</span>
</Link>
</TooltipTrigger>
<TooltipContent side="right">
<p>{item.description}</p>
</TooltipContent>
</Tooltip>
);
})}
</div>
</nav>
{/* User section at bottom */}
<div className="absolute bottom-0 left-0 right-0 border-t border-zinc-800 p-4">
<div className="flex items-center gap-3">
<div className="flex h-10 w-10 items-center justify-center rounded-full bg-teal-600 text-white text-sm font-medium">
U
</div>
<div className="flex flex-col">
<span className="text-sm font-medium text-white">User</span>
<span className="text-xs text-zinc-500">Translator</span>
</div>
</div>
</div>
</aside>
</TooltipProvider>
);
}
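
Review note: the sidebar marks an item active with `pathname === item.href`. Because `/settings`, `/settings/services`, and `/settings/context` are all separate entries, exact matching highlights one item at a time, whereas prefix matching would light up "General Settings" on every settings sub-page. The sketch below illustrates the difference; `isExactMatch`, `isPrefixMatch`, and `activeItems` are hypothetical helpers, and only the `href` values come from the `navigation` array above.

```typescript
// The href values from the `navigation` array in the Sidebar component.
const hrefs: string[] = ["/", "/settings", "/settings/services", "/settings/context"];

// Matching rule used by the Sidebar above.
function isExactMatch(pathname: string, href: string): boolean {
  return pathname === href;
}

// Alternative rule (not used above): an item is active for its own path and any nested path.
function isPrefixMatch(pathname: string, href: string): boolean {
  if (href === "/") return pathname === "/";
  return pathname === href || pathname.startsWith(href + "/");
}

// Lists which hrefs a given rule would highlight for a pathname.
function activeItems(
  pathname: string,
  match: (pathname: string, href: string) => boolean
): string[] {
  return hrefs.filter((href) => match(pathname, href));
}

console.log(activeItems("/settings/services", isExactMatch));
// [ '/settings/services' ]
console.log(activeItems("/settings/services", isPrefixMatch));
// [ '/settings', '/settings/services' ]
```

One consequence of exact matching is that a nested route with no sidebar entry of its own (for example a detail page under `/admin`, if one is added later) would leave no item highlighted.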


@@ -0,0 +1,66 @@
import * as React from "react"
import { cva, type VariantProps } from "class-variance-authority"
import { cn } from "@/lib/utils"
const alertVariants = cva(
"relative w-full rounded-lg border px-4 py-3 text-sm grid has-[>svg]:grid-cols-[calc(var(--spacing)*4)_1fr] grid-cols-[0_1fr] has-[>svg]:gap-x-3 gap-y-0.5 items-start [&>svg]:size-4 [&>svg]:translate-y-0.5 [&>svg]:text-current",
{
variants: {
variant: {
default: "bg-card text-card-foreground",
destructive:
"text-destructive bg-card [&>svg]:text-current *:data-[slot=alert-description]:text-destructive/90",
},
},
defaultVariants: {
variant: "default",
},
}
)
function Alert({
className,
variant,
...props
}: React.ComponentProps<"div"> & VariantProps<typeof alertVariants>) {
return (
<div
data-slot="alert"
role="alert"
className={cn(alertVariants({ variant }), className)}
{...props}
/>
)
}
function AlertTitle({ className, ...props }: React.ComponentProps<"div">) {
return (
<div
data-slot="alert-title"
className={cn(
"col-start-2 line-clamp-1 min-h-4 font-medium tracking-tight",
className
)}
{...props}
/>
)
}
function AlertDescription({
className,
...props
}: React.ComponentProps<"div">) {
return (
<div
data-slot="alert-description"
className={cn(
"text-muted-foreground col-start-2 grid justify-items-start gap-1 text-sm [&_p]:leading-relaxed",
className
)}
{...props}
/>
)
}
export { Alert, AlertTitle, AlertDescription }


@@ -0,0 +1,46 @@
import * as React from "react"
import { Slot } from "@radix-ui/react-slot"
import { cva, type VariantProps } from "class-variance-authority"
import { cn } from "@/lib/utils"
const badgeVariants = cva(
"inline-flex items-center justify-center rounded-full border px-2 py-0.5 text-xs font-medium w-fit whitespace-nowrap shrink-0 [&>svg]:size-3 gap-1 [&>svg]:pointer-events-none focus-visible:border-ring focus-visible:ring-ring/50 focus-visible:ring-[3px] aria-invalid:ring-destructive/20 dark:aria-invalid:ring-destructive/40 aria-invalid:border-destructive transition-[color,box-shadow] overflow-hidden",
{
variants: {
variant: {
default:
"border-transparent bg-primary text-primary-foreground [a&]:hover:bg-primary/90",
secondary:
"border-transparent bg-secondary text-secondary-foreground [a&]:hover:bg-secondary/90",
destructive:
"border-transparent bg-destructive text-white [a&]:hover:bg-destructive/90 focus-visible:ring-destructive/20 dark:focus-visible:ring-destructive/40 dark:bg-destructive/60",
outline:
"text-foreground [a&]:hover:bg-accent [a&]:hover:text-accent-foreground",
},
},
defaultVariants: {
variant: "default",
},
}
)
function Badge({
className,
variant,
asChild = false,
...props
}: React.ComponentProps<"span"> &
VariantProps<typeof badgeVariants> & { asChild?: boolean }) {
const Comp = asChild ? Slot : "span"
return (
<Comp
data-slot="badge"
className={cn(badgeVariants({ variant }), className)}
{...props}
/>
)
}
export { Badge, badgeVariants }


@@ -0,0 +1,60 @@
import * as React from "react"
import { Slot } from "@radix-ui/react-slot"
import { cva, type VariantProps } from "class-variance-authority"
import { cn } from "@/lib/utils"
const buttonVariants = cva(
"inline-flex items-center justify-center gap-2 whitespace-nowrap rounded-md text-sm font-medium transition-all disabled:pointer-events-none disabled:opacity-50 [&_svg]:pointer-events-none [&_svg:not([class*='size-'])]:size-4 shrink-0 [&_svg]:shrink-0 outline-none focus-visible:border-ring focus-visible:ring-ring/50 focus-visible:ring-[3px] aria-invalid:ring-destructive/20 dark:aria-invalid:ring-destructive/40 aria-invalid:border-destructive",
{
variants: {
variant: {
default: "bg-primary text-primary-foreground hover:bg-primary/90",
destructive:
"bg-destructive text-white hover:bg-destructive/90 focus-visible:ring-destructive/20 dark:focus-visible:ring-destructive/40 dark:bg-destructive/60",
outline:
"border bg-background shadow-xs hover:bg-accent hover:text-accent-foreground dark:bg-input/30 dark:border-input dark:hover:bg-input/50",
secondary:
"bg-secondary text-secondary-foreground hover:bg-secondary/80",
ghost:
"hover:bg-accent hover:text-accent-foreground dark:hover:bg-accent/50",
link: "text-primary underline-offset-4 hover:underline",
},
size: {
default: "h-9 px-4 py-2 has-[>svg]:px-3",
sm: "h-8 rounded-md gap-1.5 px-3 has-[>svg]:px-2.5",
lg: "h-10 rounded-md px-6 has-[>svg]:px-4",
icon: "size-9",
"icon-sm": "size-8",
"icon-lg": "size-10",
},
},
defaultVariants: {
variant: "default",
size: "default",
},
}
)
function Button({
className,
variant,
size,
asChild = false,
...props
}: React.ComponentProps<"button"> &
VariantProps<typeof buttonVariants> & {
asChild?: boolean
}) {
const Comp = asChild ? Slot : "button"
return (
<Comp
data-slot="button"
className={cn(buttonVariants({ variant, size, className }))}
{...props}
/>
)
}
export { Button, buttonVariants }


@@ -0,0 +1,92 @@
import * as React from "react"
import { cn } from "@/lib/utils"
function Card({ className, ...props }: React.ComponentProps<"div">) {
return (
<div
data-slot="card"
className={cn(
"bg-card text-card-foreground flex flex-col gap-6 rounded-xl border py-6 shadow-sm",
className
)}
{...props}
/>
)
}
function CardHeader({ className, ...props }: React.ComponentProps<"div">) {
return (
<div
data-slot="card-header"
className={cn(
"@container/card-header grid auto-rows-min grid-rows-[auto_auto] items-start gap-2 px-6 has-data-[slot=card-action]:grid-cols-[1fr_auto] [.border-b]:pb-6",
className
)}
{...props}
/>
)
}
function CardTitle({ className, ...props }: React.ComponentProps<"div">) {
return (
<div
data-slot="card-title"
className={cn("leading-none font-semibold", className)}
{...props}
/>
)
}
function CardDescription({ className, ...props }: React.ComponentProps<"div">) {
return (
<div
data-slot="card-description"
className={cn("text-muted-foreground text-sm", className)}
{...props}
/>
)
}
function CardAction({ className, ...props }: React.ComponentProps<"div">) {
return (
<div
data-slot="card-action"
className={cn(
"col-start-2 row-span-2 row-start-1 self-start justify-self-end",
className
)}
{...props}
/>
)
}
function CardContent({ className, ...props }: React.ComponentProps<"div">) {
return (
<div
data-slot="card-content"
className={cn("px-6", className)}
{...props}
/>
)
}
function CardFooter({ className, ...props }: React.ComponentProps<"div">) {
return (
<div
data-slot="card-footer"
className={cn("flex items-center px-6 [.border-t]:pt-6", className)}
{...props}
/>
)
}
export {
Card,
CardHeader,
CardFooter,
CardTitle,
CardAction,
CardDescription,
CardContent,
}


@@ -0,0 +1,32 @@
"use client"
import * as React from "react"
import * as CheckboxPrimitive from "@radix-ui/react-checkbox"
import { CheckIcon } from "lucide-react"
import { cn } from "@/lib/utils"
function Checkbox({
className,
...props
}: React.ComponentProps<typeof CheckboxPrimitive.Root>) {
return (
<CheckboxPrimitive.Root
data-slot="checkbox"
className={cn(
"peer border-input dark:bg-input/30 data-[state=checked]:bg-primary data-[state=checked]:text-primary-foreground dark:data-[state=checked]:bg-primary data-[state=checked]:border-primary focus-visible:border-ring focus-visible:ring-ring/50 aria-invalid:ring-destructive/20 dark:aria-invalid:ring-destructive/40 aria-invalid:border-destructive size-4 shrink-0 rounded-[4px] border shadow-xs transition-shadow outline-none focus-visible:ring-[3px] disabled:cursor-not-allowed disabled:opacity-50",
className
)}
{...props}
>
<CheckboxPrimitive.Indicator
data-slot="checkbox-indicator"
className="grid place-content-center text-current transition-none"
>
<CheckIcon className="size-3.5" />
</CheckboxPrimitive.Indicator>
</CheckboxPrimitive.Root>
)
}
export { Checkbox }


@@ -0,0 +1,143 @@
"use client"
import * as React from "react"
import * as DialogPrimitive from "@radix-ui/react-dialog"
import { XIcon } from "lucide-react"
import { cn } from "@/lib/utils"
function Dialog({
...props
}: React.ComponentProps<typeof DialogPrimitive.Root>) {
return <DialogPrimitive.Root data-slot="dialog" {...props} />
}
function DialogTrigger({
...props
}: React.ComponentProps<typeof DialogPrimitive.Trigger>) {
return <DialogPrimitive.Trigger data-slot="dialog-trigger" {...props} />
}
function DialogPortal({
...props
}: React.ComponentProps<typeof DialogPrimitive.Portal>) {
return <DialogPrimitive.Portal data-slot="dialog-portal" {...props} />
}
function DialogClose({
...props
}: React.ComponentProps<typeof DialogPrimitive.Close>) {
return <DialogPrimitive.Close data-slot="dialog-close" {...props} />
}
function DialogOverlay({
className,
...props
}: React.ComponentProps<typeof DialogPrimitive.Overlay>) {
return (
<DialogPrimitive.Overlay
data-slot="dialog-overlay"
className={cn(
"data-[state=open]:animate-in data-[state=closed]:animate-out data-[state=closed]:fade-out-0 data-[state=open]:fade-in-0 fixed inset-0 z-50 bg-black/50",
className
)}
{...props}
/>
)
}
function DialogContent({
className,
children,
showCloseButton = true,
...props
}: React.ComponentProps<typeof DialogPrimitive.Content> & {
showCloseButton?: boolean
}) {
return (
<DialogPortal data-slot="dialog-portal">
<DialogOverlay />
<DialogPrimitive.Content
data-slot="dialog-content"
className={cn(
"bg-background data-[state=open]:animate-in data-[state=closed]:animate-out data-[state=closed]:fade-out-0 data-[state=open]:fade-in-0 data-[state=closed]:zoom-out-95 data-[state=open]:zoom-in-95 fixed top-[50%] left-[50%] z-50 grid w-full max-w-[calc(100%-2rem)] translate-x-[-50%] translate-y-[-50%] gap-4 rounded-lg border p-6 shadow-lg duration-200 sm:max-w-lg",
className
)}
{...props}
>
{children}
{showCloseButton && (
<DialogPrimitive.Close
data-slot="dialog-close"
className="ring-offset-background focus:ring-ring data-[state=open]:bg-accent data-[state=open]:text-muted-foreground absolute top-4 right-4 rounded-xs opacity-70 transition-opacity hover:opacity-100 focus:ring-2 focus:ring-offset-2 focus:outline-hidden disabled:pointer-events-none [&_svg]:pointer-events-none [&_svg]:shrink-0 [&_svg:not([class*='size-'])]:size-4"
>
<XIcon />
<span className="sr-only">Close</span>
</DialogPrimitive.Close>
)}
</DialogPrimitive.Content>
</DialogPortal>
)
}
function DialogHeader({ className, ...props }: React.ComponentProps<"div">) {
return (
<div
data-slot="dialog-header"
className={cn("flex flex-col gap-2 text-center sm:text-left", className)}
{...props}
/>
)
}
function DialogFooter({ className, ...props }: React.ComponentProps<"div">) {
return (
<div
data-slot="dialog-footer"
className={cn(
"flex flex-col-reverse gap-2 sm:flex-row sm:justify-end",
className
)}
{...props}
/>
)
}
function DialogTitle({
className,
...props
}: React.ComponentProps<typeof DialogPrimitive.Title>) {
return (
<DialogPrimitive.Title
data-slot="dialog-title"
className={cn("text-lg leading-none font-semibold", className)}
{...props}
/>
)
}
function DialogDescription({
className,
...props
}: React.ComponentProps<typeof DialogPrimitive.Description>) {
return (
<DialogPrimitive.Description
data-slot="dialog-description"
className={cn("text-muted-foreground text-sm", className)}
{...props}
/>
)
}
export {
Dialog,
DialogClose,
DialogContent,
DialogDescription,
DialogFooter,
DialogHeader,
DialogOverlay,
DialogPortal,
DialogTitle,
DialogTrigger,
}


@@ -0,0 +1,257 @@
"use client"
import * as React from "react"
import * as DropdownMenuPrimitive from "@radix-ui/react-dropdown-menu"
import { CheckIcon, ChevronRightIcon, CircleIcon } from "lucide-react"
import { cn } from "@/lib/utils"
function DropdownMenu({
...props
}: React.ComponentProps<typeof DropdownMenuPrimitive.Root>) {
return <DropdownMenuPrimitive.Root data-slot="dropdown-menu" {...props} />
}
function DropdownMenuPortal({
...props
}: React.ComponentProps<typeof DropdownMenuPrimitive.Portal>) {
return (
<DropdownMenuPrimitive.Portal data-slot="dropdown-menu-portal" {...props} />
)
}
function DropdownMenuTrigger({
...props
}: React.ComponentProps<typeof DropdownMenuPrimitive.Trigger>) {
return (
<DropdownMenuPrimitive.Trigger
data-slot="dropdown-menu-trigger"
{...props}
/>
)
}
function DropdownMenuContent({
className,
sideOffset = 4,
...props
}: React.ComponentProps<typeof DropdownMenuPrimitive.Content>) {
return (
<DropdownMenuPrimitive.Portal>
<DropdownMenuPrimitive.Content
data-slot="dropdown-menu-content"
sideOffset={sideOffset}
className={cn(
"bg-popover text-popover-foreground data-[state=open]:animate-in data-[state=closed]:animate-out data-[state=closed]:fade-out-0 data-[state=open]:fade-in-0 data-[state=closed]:zoom-out-95 data-[state=open]:zoom-in-95 data-[side=bottom]:slide-in-from-top-2 data-[side=left]:slide-in-from-right-2 data-[side=right]:slide-in-from-left-2 data-[side=top]:slide-in-from-bottom-2 z-50 max-h-(--radix-dropdown-menu-content-available-height) min-w-[8rem] origin-(--radix-dropdown-menu-content-transform-origin) overflow-x-hidden overflow-y-auto rounded-md border p-1 shadow-md",
className
)}
{...props}
/>
</DropdownMenuPrimitive.Portal>
)
}
function DropdownMenuGroup({
...props
}: React.ComponentProps<typeof DropdownMenuPrimitive.Group>) {
return (
<DropdownMenuPrimitive.Group data-slot="dropdown-menu-group" {...props} />
)
}
function DropdownMenuItem({
className,
inset,
variant = "default",
...props
}: React.ComponentProps<typeof DropdownMenuPrimitive.Item> & {
inset?: boolean
variant?: "default" | "destructive"
}) {
return (
<DropdownMenuPrimitive.Item
data-slot="dropdown-menu-item"
data-inset={inset}
data-variant={variant}
className={cn(
"focus:bg-accent focus:text-accent-foreground data-[variant=destructive]:text-destructive data-[variant=destructive]:focus:bg-destructive/10 dark:data-[variant=destructive]:focus:bg-destructive/20 data-[variant=destructive]:focus:text-destructive data-[variant=destructive]:*:[svg]:!text-destructive [&_svg:not([class*='text-'])]:text-muted-foreground relative flex cursor-default items-center gap-2 rounded-sm px-2 py-1.5 text-sm outline-hidden select-none data-[disabled]:pointer-events-none data-[disabled]:opacity-50 data-[inset]:pl-8 [&_svg]:pointer-events-none [&_svg]:shrink-0 [&_svg:not([class*='size-'])]:size-4",
className
)}
{...props}
/>
)
}
function DropdownMenuCheckboxItem({
className,
children,
checked,
...props
}: React.ComponentProps<typeof DropdownMenuPrimitive.CheckboxItem>) {
return (
<DropdownMenuPrimitive.CheckboxItem
data-slot="dropdown-menu-checkbox-item"
className={cn(
"focus:bg-accent focus:text-accent-foreground relative flex cursor-default items-center gap-2 rounded-sm py-1.5 pr-2 pl-8 text-sm outline-hidden select-none data-[disabled]:pointer-events-none data-[disabled]:opacity-50 [&_svg]:pointer-events-none [&_svg]:shrink-0 [&_svg:not([class*='size-'])]:size-4",
className
)}
checked={checked}
{...props}
>
<span className="pointer-events-none absolute left-2 flex size-3.5 items-center justify-center">
<DropdownMenuPrimitive.ItemIndicator>
<CheckIcon className="size-4" />
</DropdownMenuPrimitive.ItemIndicator>
</span>
{children}
</DropdownMenuPrimitive.CheckboxItem>
)
}
function DropdownMenuRadioGroup({
...props
}: React.ComponentProps<typeof DropdownMenuPrimitive.RadioGroup>) {
return (
<DropdownMenuPrimitive.RadioGroup
data-slot="dropdown-menu-radio-group"
{...props}
/>
)
}
function DropdownMenuRadioItem({
className,
children,
...props
}: React.ComponentProps<typeof DropdownMenuPrimitive.RadioItem>) {
return (
<DropdownMenuPrimitive.RadioItem
data-slot="dropdown-menu-radio-item"
className={cn(
"focus:bg-accent focus:text-accent-foreground relative flex cursor-default items-center gap-2 rounded-sm py-1.5 pr-2 pl-8 text-sm outline-hidden select-none data-[disabled]:pointer-events-none data-[disabled]:opacity-50 [&_svg]:pointer-events-none [&_svg]:shrink-0 [&_svg:not([class*='size-'])]:size-4",
className
)}
{...props}
>
<span className="pointer-events-none absolute left-2 flex size-3.5 items-center justify-center">
<DropdownMenuPrimitive.ItemIndicator>
<CircleIcon className="size-2 fill-current" />
</DropdownMenuPrimitive.ItemIndicator>
</span>
{children}
</DropdownMenuPrimitive.RadioItem>
)
}
function DropdownMenuLabel({
className,
inset,
...props
}: React.ComponentProps<typeof DropdownMenuPrimitive.Label> & {
inset?: boolean
}) {
return (
<DropdownMenuPrimitive.Label
data-slot="dropdown-menu-label"
data-inset={inset}
className={cn(
"px-2 py-1.5 text-sm font-medium data-[inset]:pl-8",
className
)}
{...props}
/>
)
}
function DropdownMenuSeparator({
className,
...props
}: React.ComponentProps<typeof DropdownMenuPrimitive.Separator>) {
return (
<DropdownMenuPrimitive.Separator
data-slot="dropdown-menu-separator"
className={cn("bg-border -mx-1 my-1 h-px", className)}
{...props}
/>
)
}
function DropdownMenuShortcut({
className,
...props
}: React.ComponentProps<"span">) {
return (
<span
data-slot="dropdown-menu-shortcut"
className={cn(
"text-muted-foreground ml-auto text-xs tracking-widest",
className
)}
{...props}
/>
)
}
function DropdownMenuSub({
...props
}: React.ComponentProps<typeof DropdownMenuPrimitive.Sub>) {
return <DropdownMenuPrimitive.Sub data-slot="dropdown-menu-sub" {...props} />
}
function DropdownMenuSubTrigger({
className,
inset,
children,
...props
}: React.ComponentProps<typeof DropdownMenuPrimitive.SubTrigger> & {
inset?: boolean
}) {
return (
<DropdownMenuPrimitive.SubTrigger
data-slot="dropdown-menu-sub-trigger"
data-inset={inset}
className={cn(
"focus:bg-accent focus:text-accent-foreground data-[state=open]:bg-accent data-[state=open]:text-accent-foreground [&_svg:not([class*='text-'])]:text-muted-foreground flex cursor-default items-center gap-2 rounded-sm px-2 py-1.5 text-sm outline-hidden select-none data-[inset]:pl-8 [&_svg]:pointer-events-none [&_svg]:shrink-0 [&_svg:not([class*='size-'])]:size-4",
className
)}
{...props}
>
{children}
<ChevronRightIcon className="ml-auto size-4" />
</DropdownMenuPrimitive.SubTrigger>
)
}
function DropdownMenuSubContent({
className,
...props
}: React.ComponentProps<typeof DropdownMenuPrimitive.SubContent>) {
return (
<DropdownMenuPrimitive.SubContent
data-slot="dropdown-menu-sub-content"
className={cn(
"bg-popover text-popover-foreground data-[state=open]:animate-in data-[state=closed]:animate-out data-[state=closed]:fade-out-0 data-[state=open]:fade-in-0 data-[state=closed]:zoom-out-95 data-[state=open]:zoom-in-95 data-[side=bottom]:slide-in-from-top-2 data-[side=left]:slide-in-from-right-2 data-[side=right]:slide-in-from-left-2 data-[side=top]:slide-in-from-bottom-2 z-50 min-w-[8rem] origin-(--radix-dropdown-menu-content-transform-origin) overflow-hidden rounded-md border p-1 shadow-lg",
className
)}
{...props}
/>
)
}
export {
DropdownMenu,
DropdownMenuPortal,
DropdownMenuTrigger,
DropdownMenuContent,
DropdownMenuGroup,
DropdownMenuLabel,
DropdownMenuItem,
DropdownMenuCheckboxItem,
DropdownMenuRadioGroup,
DropdownMenuRadioItem,
DropdownMenuSeparator,
DropdownMenuShortcut,
DropdownMenuSub,
DropdownMenuSubTrigger,
DropdownMenuSubContent,
}


@@ -0,0 +1,21 @@
import * as React from "react"
import { cn } from "@/lib/utils"
function Input({ className, type, ...props }: React.ComponentProps<"input">) {
return (
<input
type={type}
data-slot="input"
className={cn(
"file:text-foreground placeholder:text-muted-foreground selection:bg-primary selection:text-primary-foreground dark:bg-input/30 border-input h-9 w-full min-w-0 rounded-md border bg-transparent px-3 py-1 text-base shadow-xs transition-[color,box-shadow] outline-none file:inline-flex file:h-7 file:border-0 file:bg-transparent file:text-sm file:font-medium disabled:pointer-events-none disabled:cursor-not-allowed disabled:opacity-50 md:text-sm",
"focus-visible:border-ring focus-visible:ring-ring/50 focus-visible:ring-[3px]",
"aria-invalid:ring-destructive/20 dark:aria-invalid:ring-destructive/40 aria-invalid:border-destructive",
className
)}
{...props}
/>
)
}
export { Input }


@@ -0,0 +1,24 @@
"use client"
import * as React from "react"
import * as LabelPrimitive from "@radix-ui/react-label"
import { cn } from "@/lib/utils"
function Label({
className,
...props
}: React.ComponentProps<typeof LabelPrimitive.Root>) {
return (
<LabelPrimitive.Root
data-slot="label"
className={cn(
"flex items-center gap-2 text-sm leading-none font-medium select-none group-data-[disabled=true]:pointer-events-none group-data-[disabled=true]:opacity-50 peer-disabled:cursor-not-allowed peer-disabled:opacity-50",
className
)}
{...props}
/>
)
}
export { Label }


@@ -0,0 +1,31 @@
"use client"
import * as React from "react"
import * as ProgressPrimitive from "@radix-ui/react-progress"
import { cn } from "@/lib/utils"
function Progress({
className,
value,
...props
}: React.ComponentProps<typeof ProgressPrimitive.Root>) {
return (
<ProgressPrimitive.Root
data-slot="progress"
className={cn(
"bg-primary/20 relative h-2 w-full overflow-hidden rounded-full",
className
)}
{...props}
>
<ProgressPrimitive.Indicator
data-slot="progress-indicator"
className="bg-primary h-full w-full flex-1 transition-all"
style={{ transform: `translateX(-${100 - (value || 0)}%)` }}
/>
</ProgressPrimitive.Root>
)
}
export { Progress }
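
Review note: the indicator offset above is computed straight from the `value` prop as `translateX(-${100 - (value || 0)}%)`. A small sketch of that arithmetic, plus a clamped variant, is shown below. `indicatorTransform` mirrors the expression in the component; `clampedIndicatorTransform` is a hypothetical alternative, not part of this change.

```typescript
// Mirrors the inline style expression in the Progress component above.
function indicatorTransform(value: number | null | undefined): string {
  return `translateX(-${100 - (value || 0)}%)`;
}

// Hypothetical variant: clamp to [0, 100] and treat non-finite input as 0.
function clampedIndicatorTransform(value: number | null | undefined): string {
  const v = Number.isFinite(value) ? Math.min(100, Math.max(0, value as number)) : 0;
  return `translateX(-${100 - v}%)`;
}

console.log(indicatorTransform(25));         // translateX(-75%)
console.log(indicatorTransform(undefined));  // translateX(-100%)
console.log(indicatorTransform(120));        // translateX(--20%)
console.log(clampedIndicatorTransform(120)); // translateX(-0%)
```

A value above 100 yields `translateX(--20%)`, which is not a valid CSS transform, so callers that cannot guarantee a 0 to 100 range may want the clamped form.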


@@ -0,0 +1,58 @@
"use client"
import * as React from "react"
import * as ScrollAreaPrimitive from "@radix-ui/react-scroll-area"
import { cn } from "@/lib/utils"
function ScrollArea({
className,
children,
...props
}: React.ComponentProps<typeof ScrollAreaPrimitive.Root>) {
return (
<ScrollAreaPrimitive.Root
data-slot="scroll-area"
className={cn("relative", className)}
{...props}
>
<ScrollAreaPrimitive.Viewport
data-slot="scroll-area-viewport"
className="focus-visible:ring-ring/50 size-full rounded-[inherit] transition-[color,box-shadow] outline-none focus-visible:ring-[3px] focus-visible:outline-1"
>
{children}
</ScrollAreaPrimitive.Viewport>
<ScrollBar />
<ScrollAreaPrimitive.Corner />
</ScrollAreaPrimitive.Root>
)
}
function ScrollBar({
className,
orientation = "vertical",
...props
}: React.ComponentProps<typeof ScrollAreaPrimitive.ScrollAreaScrollbar>) {
return (
<ScrollAreaPrimitive.ScrollAreaScrollbar
data-slot="scroll-area-scrollbar"
orientation={orientation}
className={cn(
"flex touch-none p-px transition-colors select-none",
orientation === "vertical" &&
"h-full w-2.5 border-l border-l-transparent",
orientation === "horizontal" &&
"h-2.5 flex-col border-t border-t-transparent",
className
)}
{...props}
>
<ScrollAreaPrimitive.ScrollAreaThumb
data-slot="scroll-area-thumb"
className="bg-border relative flex-1 rounded-full"
/>
</ScrollAreaPrimitive.ScrollAreaScrollbar>
)
}
export { ScrollArea, ScrollBar }


@@ -0,0 +1,187 @@
"use client"
import * as React from "react"
import * as SelectPrimitive from "@radix-ui/react-select"
import { CheckIcon, ChevronDownIcon, ChevronUpIcon } from "lucide-react"
import { cn } from "@/lib/utils"
function Select({
...props
}: React.ComponentProps<typeof SelectPrimitive.Root>) {
return <SelectPrimitive.Root data-slot="select" {...props} />
}
function SelectGroup({
...props
}: React.ComponentProps<typeof SelectPrimitive.Group>) {
return <SelectPrimitive.Group data-slot="select-group" {...props} />
}
function SelectValue({
...props
}: React.ComponentProps<typeof SelectPrimitive.Value>) {
return <SelectPrimitive.Value data-slot="select-value" {...props} />
}
function SelectTrigger({
className,
size = "default",
children,
...props
}: React.ComponentProps<typeof SelectPrimitive.Trigger> & {
size?: "sm" | "default"
}) {
return (
<SelectPrimitive.Trigger
data-slot="select-trigger"
data-size={size}
className={cn(
"border-input data-[placeholder]:text-muted-foreground [&_svg:not([class*='text-'])]:text-muted-foreground focus-visible:border-ring focus-visible:ring-ring/50 aria-invalid:ring-destructive/20 dark:aria-invalid:ring-destructive/40 aria-invalid:border-destructive dark:bg-input/30 dark:hover:bg-input/50 flex w-fit items-center justify-between gap-2 rounded-md border bg-transparent px-3 py-2 text-sm whitespace-nowrap shadow-xs transition-[color,box-shadow] outline-none focus-visible:ring-[3px] disabled:cursor-not-allowed disabled:opacity-50 data-[size=default]:h-9 data-[size=sm]:h-8 *:data-[slot=select-value]:line-clamp-1 *:data-[slot=select-value]:flex *:data-[slot=select-value]:items-center *:data-[slot=select-value]:gap-2 [&_svg]:pointer-events-none [&_svg]:shrink-0 [&_svg:not([class*='size-'])]:size-4",
className
)}
{...props}
>
{children}
<SelectPrimitive.Icon asChild>
<ChevronDownIcon className="size-4 opacity-50" />
</SelectPrimitive.Icon>
</SelectPrimitive.Trigger>
)
}
function SelectContent({
className,
children,
position = "popper",
align = "center",
...props
}: React.ComponentProps<typeof SelectPrimitive.Content>) {
return (
<SelectPrimitive.Portal>
<SelectPrimitive.Content
data-slot="select-content"
className={cn(
"bg-popover text-popover-foreground data-[state=open]:animate-in data-[state=closed]:animate-out data-[state=closed]:fade-out-0 data-[state=open]:fade-in-0 data-[state=closed]:zoom-out-95 data-[state=open]:zoom-in-95 data-[side=bottom]:slide-in-from-top-2 data-[side=left]:slide-in-from-right-2 data-[side=right]:slide-in-from-left-2 data-[side=top]:slide-in-from-bottom-2 relative z-50 max-h-(--radix-select-content-available-height) min-w-[8rem] origin-(--radix-select-content-transform-origin) overflow-x-hidden overflow-y-auto rounded-md border shadow-md",
position === "popper" &&
"data-[side=bottom]:translate-y-1 data-[side=left]:-translate-x-1 data-[side=right]:translate-x-1 data-[side=top]:-translate-y-1",
className
)}
position={position}
align={align}
{...props}
>
<SelectScrollUpButton />
<SelectPrimitive.Viewport
className={cn(
"p-1",
position === "popper" &&
"h-[var(--radix-select-trigger-height)] w-full min-w-[var(--radix-select-trigger-width)] scroll-my-1"
)}
>
{children}
</SelectPrimitive.Viewport>
<SelectScrollDownButton />
</SelectPrimitive.Content>
</SelectPrimitive.Portal>
)
}
function SelectLabel({
className,
...props
}: React.ComponentProps<typeof SelectPrimitive.Label>) {
return (
<SelectPrimitive.Label
data-slot="select-label"
className={cn("text-muted-foreground px-2 py-1.5 text-xs", className)}
{...props}
/>
)
}
function SelectItem({
className,
children,
...props
}: React.ComponentProps<typeof SelectPrimitive.Item>) {
return (
<SelectPrimitive.Item
data-slot="select-item"
className={cn(
"focus:bg-accent focus:text-accent-foreground [&_svg:not([class*='text-'])]:text-muted-foreground relative flex w-full cursor-default items-center gap-2 rounded-sm py-1.5 pr-8 pl-2 text-sm outline-hidden select-none data-[disabled]:pointer-events-none data-[disabled]:opacity-50 [&_svg]:pointer-events-none [&_svg]:shrink-0 [&_svg:not([class*='size-'])]:size-4 *:[span]:last:flex *:[span]:last:items-center *:[span]:last:gap-2",
className
)}
{...props}
>
<span className="absolute right-2 flex size-3.5 items-center justify-center">
<SelectPrimitive.ItemIndicator>
<CheckIcon className="size-4" />
</SelectPrimitive.ItemIndicator>
</span>
<SelectPrimitive.ItemText>{children}</SelectPrimitive.ItemText>
</SelectPrimitive.Item>
)
}
function SelectSeparator({
className,
...props
}: React.ComponentProps<typeof SelectPrimitive.Separator>) {
return (
<SelectPrimitive.Separator
data-slot="select-separator"
className={cn("bg-border pointer-events-none -mx-1 my-1 h-px", className)}
{...props}
/>
)
}
function SelectScrollUpButton({
className,
...props
}: React.ComponentProps<typeof SelectPrimitive.ScrollUpButton>) {
return (
<SelectPrimitive.ScrollUpButton
data-slot="select-scroll-up-button"
className={cn(
"flex cursor-default items-center justify-center py-1",
className
)}
{...props}
>
<ChevronUpIcon className="size-4" />
</SelectPrimitive.ScrollUpButton>
)
}
function SelectScrollDownButton({
className,
...props
}: React.ComponentProps<typeof SelectPrimitive.ScrollDownButton>) {
return (
<SelectPrimitive.ScrollDownButton
data-slot="select-scroll-down-button"
className={cn(
"flex cursor-default items-center justify-center py-1",
className
)}
{...props}
>
<ChevronDownIcon className="size-4" />
</SelectPrimitive.ScrollDownButton>
)
}
export {
Select,
SelectContent,
SelectGroup,
SelectItem,
SelectLabel,
SelectScrollDownButton,
SelectScrollUpButton,
SelectSeparator,
SelectTrigger,
SelectValue,
}

@@ -0,0 +1,28 @@
"use client"
import * as React from "react"
import * as SeparatorPrimitive from "@radix-ui/react-separator"
import { cn } from "@/lib/utils"
function Separator({
className,
orientation = "horizontal",
decorative = true,
...props
}: React.ComponentProps<typeof SeparatorPrimitive.Root>) {
return (
<SeparatorPrimitive.Root
data-slot="separator"
decorative={decorative}
orientation={orientation}
className={cn(
"bg-border shrink-0 data-[orientation=horizontal]:h-px data-[orientation=horizontal]:w-full data-[orientation=vertical]:h-full data-[orientation=vertical]:w-px",
className
)}
{...props}
/>
)
}
export { Separator }

@@ -0,0 +1,31 @@
"use client"
import * as React from "react"
import * as SwitchPrimitive from "@radix-ui/react-switch"
import { cn } from "@/lib/utils"
function Switch({
className,
...props
}: React.ComponentProps<typeof SwitchPrimitive.Root>) {
return (
<SwitchPrimitive.Root
data-slot="switch"
className={cn(
"peer data-[state=checked]:bg-primary data-[state=unchecked]:bg-input focus-visible:border-ring focus-visible:ring-ring/50 dark:data-[state=unchecked]:bg-input/80 inline-flex h-[1.15rem] w-8 shrink-0 items-center rounded-full border border-transparent shadow-xs transition-all outline-none focus-visible:ring-[3px] disabled:cursor-not-allowed disabled:opacity-50",
className
)}
{...props}
>
<SwitchPrimitive.Thumb
data-slot="switch-thumb"
className={cn(
"bg-background dark:data-[state=unchecked]:bg-foreground dark:data-[state=checked]:bg-primary-foreground pointer-events-none block size-4 rounded-full ring-0 transition-transform data-[state=checked]:translate-x-[calc(100%-2px)] data-[state=unchecked]:translate-x-0"
)}
/>
</SwitchPrimitive.Root>
)
}
export { Switch }

@@ -0,0 +1,66 @@
"use client"
import * as React from "react"
import * as TabsPrimitive from "@radix-ui/react-tabs"
import { cn } from "@/lib/utils"
function Tabs({
className,
...props
}: React.ComponentProps<typeof TabsPrimitive.Root>) {
return (
<TabsPrimitive.Root
data-slot="tabs"
className={cn("flex flex-col gap-2", className)}
{...props}
/>
)
}
function TabsList({
className,
...props
}: React.ComponentProps<typeof TabsPrimitive.List>) {
return (
<TabsPrimitive.List
data-slot="tabs-list"
className={cn(
"bg-muted text-muted-foreground inline-flex h-9 w-fit items-center justify-center rounded-lg p-[3px]",
className
)}
{...props}
/>
)
}
function TabsTrigger({
className,
...props
}: React.ComponentProps<typeof TabsPrimitive.Trigger>) {
return (
<TabsPrimitive.Trigger
data-slot="tabs-trigger"
className={cn(
"data-[state=active]:bg-background dark:data-[state=active]:text-foreground focus-visible:border-ring focus-visible:ring-ring/50 focus-visible:outline-ring dark:data-[state=active]:border-input dark:data-[state=active]:bg-input/30 text-foreground dark:text-muted-foreground inline-flex h-[calc(100%-1px)] flex-1 items-center justify-center gap-1.5 rounded-md border border-transparent px-2 py-1 text-sm font-medium whitespace-nowrap transition-[color,box-shadow] focus-visible:ring-[3px] focus-visible:outline-1 disabled:pointer-events-none disabled:opacity-50 data-[state=active]:shadow-sm [&_svg]:pointer-events-none [&_svg]:shrink-0 [&_svg:not([class*='size-'])]:size-4",
className
)}
{...props}
/>
)
}
function TabsContent({
className,
...props
}: React.ComponentProps<typeof TabsPrimitive.Content>) {
return (
<TabsPrimitive.Content
data-slot="tabs-content"
className={cn("flex-1 outline-none", className)}
{...props}
/>
)
}
export { Tabs, TabsList, TabsTrigger, TabsContent }

@@ -0,0 +1,18 @@
import * as React from "react"
import { cn } from "@/lib/utils"
function Textarea({ className, ...props }: React.ComponentProps<"textarea">) {
return (
<textarea
data-slot="textarea"
className={cn(
"border-input placeholder:text-muted-foreground focus-visible:border-ring focus-visible:ring-ring/50 aria-invalid:ring-destructive/20 dark:aria-invalid:ring-destructive/40 aria-invalid:border-destructive dark:bg-input/30 flex field-sizing-content min-h-16 w-full rounded-md border bg-transparent px-3 py-2 text-base shadow-xs transition-[color,box-shadow] outline-none focus-visible:ring-[3px] disabled:cursor-not-allowed disabled:opacity-50 md:text-sm",
className
)}
{...props}
/>
)
}
export { Textarea }

@@ -0,0 +1,61 @@
"use client"
import * as React from "react"
import * as TooltipPrimitive from "@radix-ui/react-tooltip"
import { cn } from "@/lib/utils"
function TooltipProvider({
delayDuration = 0,
...props
}: React.ComponentProps<typeof TooltipPrimitive.Provider>) {
return (
<TooltipPrimitive.Provider
data-slot="tooltip-provider"
delayDuration={delayDuration}
{...props}
/>
)
}
function Tooltip({
...props
}: React.ComponentProps<typeof TooltipPrimitive.Root>) {
return (
<TooltipProvider>
<TooltipPrimitive.Root data-slot="tooltip" {...props} />
</TooltipProvider>
)
}
function TooltipTrigger({
...props
}: React.ComponentProps<typeof TooltipPrimitive.Trigger>) {
return <TooltipPrimitive.Trigger data-slot="tooltip-trigger" {...props} />
}
function TooltipContent({
className,
sideOffset = 0,
children,
...props
}: React.ComponentProps<typeof TooltipPrimitive.Content>) {
return (
<TooltipPrimitive.Portal>
<TooltipPrimitive.Content
data-slot="tooltip-content"
sideOffset={sideOffset}
className={cn(
"bg-foreground text-background animate-in fade-in-0 zoom-in-95 data-[state=closed]:animate-out data-[state=closed]:fade-out-0 data-[state=closed]:zoom-out-95 data-[side=bottom]:slide-in-from-top-2 data-[side=left]:slide-in-from-right-2 data-[side=right]:slide-in-from-left-2 data-[side=top]:slide-in-from-bottom-2 z-50 w-fit origin-(--radix-tooltip-content-transform-origin) rounded-md px-3 py-1.5 text-xs text-balance",
className
)}
{...props}
>
{children}
<TooltipPrimitive.Arrow className="bg-foreground fill-foreground z-50 size-2.5 translate-y-[calc(-50%_-_2px)] rotate-45 rounded-[2px]" />
</TooltipPrimitive.Content>
</TooltipPrimitive.Portal>
)
}
export { Tooltip, TooltipTrigger, TooltipContent, TooltipProvider }

34
frontend/tsconfig.json Normal file

@@ -0,0 +1,34 @@
{
"compilerOptions": {
"target": "ES2017",
"lib": ["dom", "dom.iterable", "esnext"],
"allowJs": true,
"skipLibCheck": true,
"strict": true,
"noEmit": true,
"esModuleInterop": true,
"module": "esnext",
"moduleResolution": "bundler",
"resolveJsonModule": true,
"isolatedModules": true,
"jsx": "react-jsx",
"incremental": true,
"plugins": [
{
"name": "next"
}
],
"paths": {
"@/*": ["./src/*"]
}
},
"include": [
"next-env.d.ts",
"**/*.ts",
"**/*.tsx",
".next/types/**/*.ts",
".next/dev/types/**/*.ts",
"**/*.mts"
],
"exclude": ["node_modules"]
}

656
main.py

@@ -1,41 +1,172 @@
"""
Document Translation API
FastAPI application for translating complex documents while preserving formatting
SaaS-ready with rate limiting, validation, and robust error handling
"""
from fastapi import FastAPI, UploadFile, File, Form, HTTPException
from fastapi import FastAPI, UploadFile, File, Form, HTTPException, Request, Depends, Header
from fastapi.responses import FileResponse, JSONResponse
from fastapi.middleware.cors import CORSMiddleware
from fastapi.staticfiles import StaticFiles
from fastapi.security import HTTPBasic, HTTPBasicCredentials
from contextlib import asynccontextmanager
from pathlib import Path
from typing import Optional
import asyncio
import logging
import os
import secrets
import hashlib
import time
from config import config
from translators import excel_translator, word_translator, pptx_translator
from utils import file_handler, handle_translation_error, DocumentProcessingError
# Configure logging
logging.basicConfig(level=logging.INFO)
# Import SaaS middleware
from middleware.rate_limiting import RateLimitMiddleware, RateLimitManager, RateLimitConfig
from middleware.security import SecurityHeadersMiddleware, RequestLoggingMiddleware, ErrorHandlingMiddleware
from middleware.cleanup import FileCleanupManager, MemoryMonitor, HealthChecker, create_cleanup_manager
from middleware.validation import FileValidator, LanguageValidator, ProviderValidator, InputSanitizer, ValidationError
# Configure structured logging
logging.basicConfig(
level=getattr(logging, os.getenv("LOG_LEVEL", "INFO").upper(), logging.INFO),
format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger(__name__)
# Ensure necessary directories exist
config.ensure_directories()
# ============== Admin Authentication ==============
ADMIN_USERNAME = os.getenv("ADMIN_USERNAME", "admin")
ADMIN_PASSWORD_HASH = os.getenv("ADMIN_PASSWORD_HASH", "") # SHA256 hash of password
ADMIN_PASSWORD = os.getenv("ADMIN_PASSWORD", "changeme123") # Default password (change in production!)
ADMIN_TOKEN_SECRET = os.getenv("ADMIN_TOKEN_SECRET", secrets.token_hex(32))
# Create FastAPI app
# Store active admin sessions (token -> expiry timestamp)
admin_sessions: dict = {}
def hash_password(password: str) -> str:
"""Hash password with SHA256"""
return hashlib.sha256(password.encode()).hexdigest()
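def verify_admin_password(password: str) -> bool:
    """Verify admin password using constant-time comparisons"""
    if ADMIN_PASSWORD_HASH:
        return secrets.compare_digest(
            hash_password(password).encode(), ADMIN_PASSWORD_HASH.encode()
        )
    return secrets.compare_digest(password.encode(), ADMIN_PASSWORD.encode())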
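Review note: `hash_password` applies a single unsalted SHA-256 pass, which is fast to brute-force if the hash leaks. A minimal sketch of a salted, deliberately slow alternative using only the standard library follows. The helper names, the `salt$digest` storage format and the iteration count are assumptions for illustration, not part of this commit.

```python
import hashlib
import hmac
import os
from typing import Optional

# Hypothetical helpers, not part of this commit.
ITERATIONS = 100_000  # assumed work factor; raise it as far as login latency allows


def hash_password_salted(password: str, salt: Optional[bytes] = None) -> str:
    """Return 'salt_hex$digest_hex' derived with PBKDF2-HMAC-SHA256."""
    salt = salt if salt is not None else os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return f"{salt.hex()}${digest.hex()}"


def verify_password_salted(password: str, stored: str) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    salt_hex, _, digest_hex = stored.partition("$")
    try:
        salt = bytes.fromhex(salt_hex)
        expected = bytes.fromhex(digest_hex)
    except ValueError:
        return False  # stored value is not in the expected format
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return hmac.compare_digest(candidate, expected)
```

With this scheme the value placed in `ADMIN_PASSWORD_HASH` would be the output of `hash_password_salted`, so existing SHA-256 hashes would need to be regenerated.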
def create_admin_token() -> str:
"""Create a new admin session token"""
token = secrets.token_urlsafe(32)
# Token expires in 24 hours
admin_sessions[token] = time.time() + (24 * 60 * 60)
return token
def verify_admin_token(token: str) -> bool:
"""Verify admin token is valid and not expired"""
if token not in admin_sessions:
return False
if time.time() > admin_sessions[token]:
del admin_sessions[token]
return False
return True
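Review note: `admin_sessions` only drops an expired token when that same token is presented again, so tokens that are never reused stay in memory until restart. The sketch below shows the same token-to-expiry idea with a purge step and an injectable clock for testing. `SessionStore` is a hypothetical name, and like the dict in the diff it is per-process only.

```python
import secrets
import time
from typing import Callable, Dict


class SessionStore:
    """In-memory token -> expiry map (hypothetical, mirrors admin_sessions)."""

    def __init__(self, ttl_seconds: float = 24 * 60 * 60,
                 clock: Callable[[], float] = time.time) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests do not need to sleep
        self._sessions: Dict[str, float] = {}

    def create(self) -> str:
        """Issue a new token, dropping stale ones first."""
        self.purge_expired()
        token = secrets.token_urlsafe(32)
        self._sessions[token] = self._clock() + self._ttl
        return token

    def verify(self, token: str) -> bool:
        """Return True only for a known, unexpired token."""
        expiry = self._sessions.get(token)
        if expiry is None:
            return False
        if self._clock() > expiry:
            del self._sessions[token]
            return False
        return True

    def purge_expired(self) -> int:
        """Remove every expired token and return how many were removed."""
        now = self._clock()
        expired = [t for t, exp in self._sessions.items() if now > exp]
        for t in expired:
            del self._sessions[t]
        return len(expired)

    def __len__(self) -> int:
        return len(self._sessions)
```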
async def require_admin(authorization: Optional[str] = Header(None)) -> bool:
"""Dependency to require admin authentication"""
if not authorization:
raise HTTPException(status_code=401, detail="Authorization header required")
# Expect "Bearer <token>"
parts = authorization.split(" ")
if len(parts) != 2 or parts[0].lower() != "bearer":
raise HTTPException(status_code=401, detail="Invalid authorization format. Use: Bearer <token>")
token = parts[1]
if not verify_admin_token(token):
raise HTTPException(status_code=401, detail="Invalid or expired token")
return True
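Review note: `require_admin` and `admin_logout` each parse the `Authorization` header inline with `split(" ")`, which rejects a header that has more than one space between the scheme and the token. A shared helper could look like the sketch below; `parse_bearer_token` is a hypothetical name.

```python
from typing import Optional


def parse_bearer_token(authorization: Optional[str]) -> Optional[str]:
    """Return the token from 'Bearer <token>', or None if the header is malformed."""
    if not authorization:
        return None
    parts = authorization.split()  # splits on any run of whitespace
    if len(parts) != 2 or parts[0].lower() != "bearer":
        return None
    return parts[1]
```

`require_admin` would raise its 401 when this returns `None`, and `admin_logout` would skip the session lookup.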
# Initialize SaaS components
rate_limit_config = RateLimitConfig(
requests_per_minute=int(os.getenv("RATE_LIMIT_PER_MINUTE", "30")),
requests_per_hour=int(os.getenv("RATE_LIMIT_PER_HOUR", "200")),
translations_per_minute=int(os.getenv("TRANSLATIONS_PER_MINUTE", "10")),
translations_per_hour=int(os.getenv("TRANSLATIONS_PER_HOUR", "50")),
max_concurrent_translations=int(os.getenv("MAX_CONCURRENT_TRANSLATIONS", "5")),
)
rate_limit_manager = RateLimitManager(rate_limit_config)
cleanup_manager = create_cleanup_manager(config)
memory_monitor = MemoryMonitor(max_memory_percent=float(os.getenv("MAX_MEMORY_PERCENT", "80")))
health_checker = HealthChecker(cleanup_manager, memory_monitor)
file_validator = FileValidator(
max_size_mb=config.MAX_FILE_SIZE_MB,
allowed_extensions=config.SUPPORTED_EXTENSIONS
)
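Review note: the commit message mentions a token bucket algorithm, but `middleware/rate_limiting.py` is not shown in this part of the diff. For readers unfamiliar with the technique, here is a generic token bucket sketch. It is not the middleware's actual implementation, and the class name, parameters and clock injection are assumptions.

```python
import time
from typing import Callable


class TokenBucket:
    """Generic token bucket: refills continuously, each request spends one token."""

    def __init__(self, rate_per_minute: int, capacity: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._rate = rate_per_minute / 60.0  # tokens added per second
        self._capacity = float(capacity)     # maximum burst size
        self._tokens = float(capacity)       # start with a full bucket
        self._clock = clock
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Refill for the elapsed time, then try to spend `cost` tokens."""
        now = self._clock()
        elapsed = now - self._last
        self._tokens = min(self._capacity, self._tokens + elapsed * self._rate)
        self._last = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

A per-client limiter would keep one bucket per key (for example per client IP), with `rate_per_minute` taken from a setting such as `RATE_LIMIT_PER_MINUTE`.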
def build_full_prompt(system_prompt: str, glossary: str) -> str:
"""Combine system prompt and glossary into a single prompt for LLM translation."""
parts = []
# Add system prompt if provided
if system_prompt and system_prompt.strip():
parts.append(system_prompt.strip())
# Add glossary if provided
if glossary and glossary.strip():
glossary_section = """
TECHNICAL GLOSSARY - Use these exact translations for the following terms:
{}
Always use the translations from this glossary when you encounter these terms.""".format(glossary.strip())
parts.append(glossary_section)
return "\n\n".join(parts) if parts else ""
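Review note: a quick way to see what `build_full_prompt` produces. The function body is reproduced from the diff with indentation restored; blank lines inside the glossary template may have been lost in this view. The sample inputs are made up.

```python
def build_full_prompt(system_prompt: str, glossary: str) -> str:
    """Combine system prompt and glossary into a single prompt for LLM translation."""
    parts = []
    # Add system prompt if provided
    if system_prompt and system_prompt.strip():
        parts.append(system_prompt.strip())
    # Add glossary if provided
    if glossary and glossary.strip():
        glossary_section = """
TECHNICAL GLOSSARY - Use these exact translations for the following terms:
{}
Always use the translations from this glossary when you encounter these terms.""".format(glossary.strip())
        parts.append(glossary_section)
    return "\n\n".join(parts) if parts else ""


# Made-up sample inputs
prompt = build_full_prompt(
    "Translate for a mechanical engineering audience.",
    "bearing=roulement\nshaft=arbre",
)
print(prompt)
```

Because the glossary is passed as an argument to `str.format` rather than used as the template, braces inside glossary entries are inserted literally.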
# Lifespan context manager for startup/shutdown
@asynccontextmanager
async def lifespan(app: FastAPI):
"""Handle startup and shutdown events"""
# Startup
logger.info("Starting Document Translation API...")
config.ensure_directories()
await cleanup_manager.start()
logger.info("API ready to accept requests")
yield
# Shutdown
logger.info("Shutting down...")
await cleanup_manager.stop()
logger.info("Cleanup completed")
# Create FastAPI app with lifespan
app = FastAPI(
title=config.API_TITLE,
version=config.API_VERSION,
description=config.API_DESCRIPTION
description=config.API_DESCRIPTION,
lifespan=lifespan
)
# Add CORS middleware
# Add middleware (order matters - first added is outermost)
app.add_middleware(ErrorHandlingMiddleware)
app.add_middleware(RequestLoggingMiddleware, log_body=False)
app.add_middleware(SecurityHeadersMiddleware, config={"enable_hsts": os.getenv("ENABLE_HSTS", "false").lower() == "true"})
app.add_middleware(RateLimitMiddleware, rate_limit_manager=rate_limit_manager)
# CORS - configure for production
allowed_origins = [origin.strip() for origin in os.getenv("CORS_ORIGINS", "*").split(",") if origin.strip()]
app.add_middleware(
CORSMiddleware,
allow_origins=["*"], # Configure appropriately for production
allow_origins=allowed_origins,
allow_credentials=True,
allow_methods=["*"],
allow_methods=["GET", "POST", "DELETE", "OPTIONS"],
allow_headers=["*"],
expose_headers=["X-Request-ID", "X-Original-Filename", "X-File-Size-MB", "X-Target-Language"]
)
# Mount static files
@@ -44,6 +175,20 @@ if static_dir.exists():
app.mount("/static", StaticFiles(directory=str(static_dir)), name="static")
# Custom exception handler for ValidationError
@app.exception_handler(ValidationError)
async def validation_error_handler(request: Request, exc: ValidationError):
"""Handle validation errors with user-friendly messages"""
return JSONResponse(
status_code=400,
content={
"error": exc.code,
"message": exc.message,
"details": exc.details
}
)
@app.get("/")
async def root():
"""Root endpoint with API information"""
@@ -62,11 +207,24 @@ async def root():
@app.get("/health")
async def health_check():
"""Health check endpoint"""
return {
"status": "healthy",
"translation_service": config.TRANSLATION_SERVICE
}
"""Health check endpoint with detailed system status"""
health_status = await health_checker.check_health()
status_code = 200 if health_status.get("status") == "healthy" else 503
return JSONResponse(
status_code=status_code,
content={
"status": health_status.get("status", "unknown"),
"translation_service": config.TRANSLATION_SERVICE,
"memory": health_status.get("memory", {}),
"disk": health_status.get("disk", {}),
"cleanup_service": health_status.get("cleanup_service", {}),
"rate_limits": {
"requests_per_minute": rate_limit_config.requests_per_minute,
"translations_per_minute": rate_limit_config.translations_per_minute,
}
}
)
@app.get("/languages")
@@ -107,11 +265,18 @@ async def get_supported_languages():
@app.post("/translate")
async def translate_document(
request: Request,
file: UploadFile = File(..., description="Document file to translate (.xlsx, .docx, or .pptx)"),
target_language: str = Form(..., description="Target language code (e.g., 'es', 'fr', 'de')"),
source_language: str = Form(default="auto", description="Source language code (default: auto-detect)"),
provider: str = Form(default="google", description="Translation provider (google, ollama, deepl, libre)"),
translate_images: bool = Form(default=False, description="Translate images with Ollama vision (only for Ollama provider)"),
provider: str = Form(default="google", description="Translation provider (google, ollama, deepl, libre, openai)"),
translate_images: bool = Form(default=False, description="Translate images with multimodal Ollama/OpenAI model"),
ollama_model: str = Form(default="", description="Ollama model to use (also used for vision if multimodal)"),
system_prompt: str = Form(default="", description="Custom system prompt with context or instructions for LLM translation"),
glossary: str = Form(default="", description="Technical glossary (format: source=target, one per line)"),
libre_url: str = Form(default="https://libretranslate.com", description="LibreTranslate server URL"),
openai_api_key: str = Form(default="", description="OpenAI API key"),
openai_model: str = Form(default="gpt-4o-mini", description="OpenAI model to use (gpt-4o-mini is cheapest with vision)"),
cleanup: bool = Form(default=True, description="Delete input file after translation")
):
"""
@@ -133,11 +298,41 @@ async def translate_document(
"""
input_path = None
output_path = None
request_id = getattr(request.state, 'request_id', 'unknown')
try:
# Validate inputs
sanitized_language = InputSanitizer.sanitize_language_code(target_language)
LanguageValidator.validate(sanitized_language)
ProviderValidator.validate(provider)
# Validate file before processing
validation_result = await file_validator.validate_async(file)
if not validation_result.is_valid:
raise ValidationError(
message=f"File validation failed: {'; '.join(validation_result.errors)}",
code="INVALID_FILE",
details={"errors": validation_result.errors, "warnings": validation_result.warnings}
)
# Log any warnings
if validation_result.warnings:
logger.warning(f"[{request_id}] File validation warnings: {validation_result.warnings}")
# Reset file position after validation read
await file.seek(0)
# Check rate limit for translations
client_ip = request.client.host if request.client else "unknown"
if not await rate_limit_manager.check_translation_limit(client_ip):
raise HTTPException(
status_code=429,
detail="Translation rate limit exceeded. Please try again later."
)
# Validate file extension
file_extension = file_handler.validate_file_extension(file.filename)
logger.info(f"Processing {file_extension} file: {file.filename}")
logger.info(f"[{request_id}] Processing {file_extension} file: {file.filename}")
# Validate file size
file_handler.validate_file_size(file)
@@ -151,20 +346,43 @@ async def translate_document(
output_path = config.OUTPUT_DIR / output_filename
await file_handler.save_upload_file(file, input_path)
logger.info(f"Saved input file to: {input_path}")
logger.info(f"[{request_id}] Saved input file to: {input_path}")
# Track file for cleanup
await cleanup_manager.track_file(input_path, ttl_minutes=30)
await cleanup_manager.track_file(output_path, ttl_minutes=60)
# Configure translation provider
from services.translation_service import GoogleTranslationProvider, DeepLTranslationProvider, LibreTranslationProvider, OllamaTranslationProvider, translation_service
from services.translation_service import GoogleTranslationProvider, DeepLTranslationProvider, LibreTranslationProvider, OllamaTranslationProvider, OpenAITranslationProvider, translation_service
if provider.lower() == "deepl":
if not config.DEEPL_API_KEY:
raise HTTPException(status_code=400, detail="DeepL API key not configured")
translation_provider = DeepLTranslationProvider(config.DEEPL_API_KEY)
elif provider.lower() == "libre":
translation_provider = LibreTranslationProvider()
libre_server = libre_url.strip() if libre_url else "https://libretranslate.com"
logger.info(f"Using LibreTranslate server: {libre_server}")
translation_provider = LibreTranslationProvider(libre_server)
elif provider.lower() == "openai":
api_key = openai_api_key.strip() if openai_api_key else ""
if not api_key:
raise HTTPException(status_code=400, detail="OpenAI API key not provided")
model_to_use = openai_model.strip() if openai_model else "gpt-4o-mini"
# Combine system prompt and glossary
custom_prompt = build_full_prompt(system_prompt, glossary)
logger.info(f"Using OpenAI model: {model_to_use}")
if custom_prompt:
logger.info(f"Custom system prompt provided ({len(custom_prompt)} chars)")
translation_provider = OpenAITranslationProvider(api_key, model_to_use, custom_prompt)
elif provider.lower() == "ollama":
vision_model = getattr(config, 'OLLAMA_VISION_MODEL', 'llava')
translation_provider = OllamaTranslationProvider(config.OLLAMA_BASE_URL, config.OLLAMA_MODEL, vision_model)
# Use the same model for text and vision (multimodal models like gemma3, qwen3-vl)
model_to_use = ollama_model.strip() if ollama_model else config.OLLAMA_MODEL
# Combine system prompt and glossary
custom_prompt = build_full_prompt(system_prompt, glossary)
logger.info(f"Using Ollama model: {model_to_use} (text + vision)")
if custom_prompt:
logger.info(f"Custom system prompt provided ({len(custom_prompt)} chars)")
translation_provider = OllamaTranslationProvider(config.OLLAMA_BASE_URL, model_to_use, model_to_use, custom_prompt)
else:
translation_provider = GoogleTranslationProvider()
@@ -371,7 +589,395 @@ async def configure_ollama(base_url: str = Form(...), model: str = Form(...)):
}
@app.post("/extract-texts")
async def extract_texts_from_document(
file: UploadFile = File(..., description="Document file to extract texts from"),
):
"""
Extract all translatable texts from a document for client-side translation (WebLLM).
Returns a list of texts and a session ID to use for reconstruction.
**Parameters:**
- **file**: The document file to extract texts from
**Returns:**
- session_id: Unique ID to reference this extraction
- texts: Array of texts to translate
- file_type: Type of the document
"""
import uuid
import json
try:
# Validate file extension
file_extension = file_handler.validate_file_extension(file.filename)
logger.info(f"Extracting texts from {file_extension} file: {file.filename}")
# Validate file size
file_handler.validate_file_size(file)
# Generate session ID
session_id = str(uuid.uuid4())
# Save uploaded file
input_filename = f"session_{session_id}{file_extension}"
input_path = config.UPLOAD_DIR / input_filename
await file_handler.save_upload_file(file, input_path)
# Extract texts based on file type
texts = []
if file_extension == ".xlsx":
from openpyxl import load_workbook
wb = load_workbook(input_path)
for sheet in wb.worksheets:
for row in sheet.iter_rows():
for cell in row:
if cell.value and isinstance(cell.value, str) and cell.value.strip():
texts.append({
"id": f"{sheet.title}!{cell.coordinate}",
"text": cell.value
})
wb.close()
elif file_extension == ".docx":
from docx import Document
doc = Document(input_path)
para_idx = 0
for para in doc.paragraphs:
if para.text.strip():
texts.append({
"id": f"para_{para_idx}",
"text": para.text
})
para_idx += 1
# Also extract from tables
table_idx = 0
for table in doc.tables:
for row_idx, row in enumerate(table.rows):
for cell_idx, cell in enumerate(row.cells):
if cell.text.strip():
texts.append({
"id": f"table_{table_idx}_r{row_idx}_c{cell_idx}",
"text": cell.text
})
table_idx += 1
elif file_extension == ".pptx":
from pptx import Presentation
prs = Presentation(input_path)
for slide_idx, slide in enumerate(prs.slides):
for shape_idx, shape in enumerate(slide.shapes):
if shape.has_text_frame:
for para_idx, para in enumerate(shape.text_frame.paragraphs):
for run_idx, run in enumerate(para.runs):
if run.text.strip():
texts.append({
"id": f"slide_{slide_idx}_shape_{shape_idx}_para_{para_idx}_run_{run_idx}",
"text": run.text
})
# Save session metadata
session_data = {
"original_filename": file.filename,
"file_extension": file_extension,
"input_path": str(input_path),
"text_count": len(texts)
}
session_file = config.UPLOAD_DIR / f"session_{session_id}.json"
with open(session_file, "w", encoding="utf-8") as f:
json.dump(session_data, f)
logger.info(f"Extracted {len(texts)} texts from {file.filename}, session: {session_id}")
return {
"session_id": session_id,
"texts": texts,
"file_type": file_extension,
"text_count": len(texts)
}
except HTTPException:
raise
except Exception as e:
logger.error(f"Text extraction error: {str(e)}", exc_info=True)
raise HTTPException(status_code=500, detail=f"Failed to extract texts: {str(e)}")
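Review note: `/extract-texts` returns items shaped `{id, text}`, while `/reconstruct-document` expects a JSON array of `{id, translated_text}`. The sketch below shows how a client might convert one into the other. `build_translations_payload` is a hypothetical helper, and `str.upper` stands in for a real translator.

```python
import json
from typing import Callable, Dict, List


def build_translations_payload(texts: List[Dict[str, str]],
                               translate: Callable[[str], str]) -> str:
    """Turn /extract-texts items into the JSON string /reconstruct-document expects."""
    translations = [
        {"id": item["id"], "translated_text": translate(item["text"])}
        for item in texts
    ]
    # ensure_ascii=False keeps non-Latin translations readable in the payload
    return json.dumps(translations, ensure_ascii=False)


# Made-up extraction result using the ID formats from the endpoint above
extracted = [
    {"id": "Sheet1!A1", "text": "Hello"},
    {"id": "para_0", "text": "World"},
]
payload = build_translations_payload(extracted, str.upper)
```

The resulting string goes in the `translations` form field, alongside the `session_id` returned by the extraction call.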
@app.post("/reconstruct-document")
async def reconstruct_document(
session_id: str = Form(..., description="Session ID from extract-texts"),
translations: str = Form(..., description="JSON array of {id, translated_text} objects"),
target_language: str = Form(..., description="Target language code"),
):
"""
Reconstruct a document with translated texts.
**Parameters:**
- **session_id**: The session ID from extract-texts
- **translations**: JSON array of translations with matching IDs
- **target_language**: Target language for filename
**Returns:**
- Translated document file
"""
import json
try:
# Load session data
session_file = config.UPLOAD_DIR / f"session_{session_id}.json"
if not session_file.exists():
raise HTTPException(status_code=404, detail="Session not found or expired")
with open(session_file, "r", encoding="utf-8") as f:
session_data = json.load(f)
input_path = Path(session_data["input_path"])
file_extension = session_data["file_extension"]
original_filename = session_data["original_filename"]
if not input_path.exists():
raise HTTPException(status_code=404, detail="Source file not found or expired")
# Parse translations
translation_list = json.loads(translations)
translation_map = {t["id"]: t["translated_text"] for t in translation_list}
# Generate output path
output_filename = file_handler.generate_unique_filename(original_filename, "translated")
output_path = config.OUTPUT_DIR / output_filename
# Reconstruct based on file type
if file_extension == ".xlsx":
from openpyxl import load_workbook
import shutil
shutil.copy(input_path, output_path)
wb = load_workbook(output_path)
for sheet in wb.worksheets:
for row in sheet.iter_rows():
for cell in row:
cell_id = f"{sheet.title}!{cell.coordinate}"
if cell_id in translation_map:
cell.value = translation_map[cell_id]
wb.save(output_path)
wb.close()
elif file_extension == ".docx":
from docx import Document
import shutil
shutil.copy(input_path, output_path)
doc = Document(output_path)
para_idx = 0
for para in doc.paragraphs:
para_id = f"para_{para_idx}"
if para_id in translation_map and para.text.strip():
# Replace text while keeping formatting
for run in para.runs:
run.text = ""
if para.runs:
para.runs[0].text = translation_map[para_id]
else:
para.text = translation_map[para_id]
para_idx += 1
# Also handle tables
table_idx = 0
for table in doc.tables:
for row_idx, row in enumerate(table.rows):
for cell_idx, cell in enumerate(row.cells):
cell_id = f"table_{table_idx}_r{row_idx}_c{cell_idx}"
if cell_id in translation_map:
# Clear and set new text
for para in cell.paragraphs:
for run in para.runs:
run.text = ""
if cell.paragraphs and cell.paragraphs[0].runs:
cell.paragraphs[0].runs[0].text = translation_map[cell_id]
elif cell.paragraphs:
cell.paragraphs[0].text = translation_map[cell_id]
table_idx += 1
doc.save(output_path)
elif file_extension == ".pptx":
from pptx import Presentation
import shutil
shutil.copy(input_path, output_path)
prs = Presentation(output_path)
for slide_idx, slide in enumerate(prs.slides):
for shape_idx, shape in enumerate(slide.shapes):
if shape.has_text_frame:
for para_idx, para in enumerate(shape.text_frame.paragraphs):
for run_idx, run in enumerate(para.runs):
run_id = f"slide_{slide_idx}_shape_{shape_idx}_para_{para_idx}_run_{run_idx}"
if run_id in translation_map:
run.text = translation_map[run_id]
prs.save(output_path)
# Cleanup session files
file_handler.cleanup_file(input_path)
file_handler.cleanup_file(session_file)
logger.info(f"Reconstructed document: {output_path}")
return FileResponse(
path=output_path,
filename=f"translated_{original_filename}",
media_type="application/octet-stream"
)
except HTTPException:
raise
except Exception as e:
logger.error(f"Reconstruction error: {str(e)}", exc_info=True)
raise HTTPException(status_code=500, detail=f"Failed to reconstruct document: {str(e)}")
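Review note: `session_id` is interpolated into a file name as received, and a malformed `translations` field falls through to the generic `except Exception` branch and surfaces as a 500. The sketch below validates both inputs up front. The function names are hypothetical, and the endpoint would need to map `ValueError` to a 400 response.

```python
import json
import uuid
from typing import Dict


def parse_session_id(raw: str) -> str:
    """Return the canonical UUID string, or raise ValueError for anything else."""
    return str(uuid.UUID(raw))


def parse_translations(raw: str) -> Dict[str, str]:
    """Parse the translations form field into an id -> translated_text map."""
    try:
        items = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"translations is not valid JSON: {exc}") from exc
    if not isinstance(items, list):
        raise ValueError("translations must be a JSON array")
    result: Dict[str, str] = {}
    for item in items:
        if not isinstance(item, dict) or "id" not in item or "translated_text" not in item:
            raise ValueError("each translation needs 'id' and 'translated_text'")
        result[str(item["id"])] = str(item["translated_text"])
    return result
```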
# ============== SaaS Management Endpoints ==============
@app.post("/admin/login")
async def admin_login(
username: str = Form(...),
password: str = Form(...)
):
"""
Admin login endpoint
Returns a bearer token for authenticated admin access
"""
if username != ADMIN_USERNAME:
logger.warning(f"Failed admin login attempt with username: {username}")
raise HTTPException(status_code=401, detail="Invalid credentials")
if not verify_admin_password(password):
logger.warning("Failed admin login attempt - wrong password")
raise HTTPException(status_code=401, detail="Invalid credentials")
token = create_admin_token()
logger.info("Admin login successful")
return {
"status": "success",
"token": token,
"expires_in": 86400, # 24 hours in seconds
"message": "Login successful"
}
@app.post("/admin/logout")
async def admin_logout(authorization: Optional[str] = Header(None)):
"""Logout and invalidate admin token"""
if authorization:
parts = authorization.split(" ")
if len(parts) == 2 and parts[0].lower() == "bearer":
token = parts[1]
if token in admin_sessions:
del admin_sessions[token]
logger.info("Admin logout successful")
return {"status": "success", "message": "Logged out"}
@app.get("/admin/verify")
async def verify_admin_session(is_admin: bool = Depends(require_admin)):
"""Verify admin token is still valid"""
return {"status": "valid", "authenticated": True}
@app.get("/admin/dashboard")
async def get_admin_dashboard(is_admin: bool = Depends(require_admin)):
"""Get comprehensive admin dashboard data"""
health_status = await health_checker.check_health()
cleanup_stats = cleanup_manager.get_stats()
rate_limit_stats = rate_limit_manager.get_stats()
tracked_files = cleanup_manager.get_tracked_files()
return {
"timestamp": health_status.get("timestamp"),
"uptime": health_status.get("uptime_human"),
"status": health_status.get("status"),
"issues": health_status.get("issues", []),
"system": {
"memory": health_status.get("memory", {}),
"disk": health_status.get("disk", {}),
},
"translations": health_status.get("translations", {}),
"cleanup": {
**cleanup_stats,
"tracked_files_count": len(tracked_files)
},
"rate_limits": rate_limit_stats,
"config": {
"max_file_size_mb": config.MAX_FILE_SIZE_MB,
"supported_extensions": list(config.SUPPORTED_EXTENSIONS),
"translation_service": config.TRANSLATION_SERVICE,
"rate_limit_per_minute": rate_limit_config.requests_per_minute,
"translations_per_minute": rate_limit_config.translations_per_minute
}
}
@app.get("/metrics")
async def get_metrics():
"""Get system metrics and statistics for monitoring"""
health_status = await health_checker.check_health()
cleanup_stats = cleanup_manager.get_stats()
rate_limit_stats = rate_limit_manager.get_stats()
return {
"system": {
"memory": health_status.get("memory", {}),
"disk": health_status.get("disk", {}),
"status": health_status.get("status", "unknown")
},
"cleanup": cleanup_stats,
"rate_limits": rate_limit_stats,
"config": {
"max_file_size_mb": config.MAX_FILE_SIZE_MB,
"supported_extensions": list(config.SUPPORTED_EXTENSIONS),
"translation_service": config.TRANSLATION_SERVICE
}
}
@app.get("/rate-limit/status")
async def get_rate_limit_status(request: Request):
"""Get current rate limit status for the requesting client"""
client_ip = request.client.host if request.client else "unknown"
status = await rate_limit_manager.get_client_status(client_ip)
return {
"client_ip": client_ip,
"limits": {
"requests_per_minute": rate_limit_config.requests_per_minute,
"requests_per_hour": rate_limit_config.requests_per_hour,
"translations_per_minute": rate_limit_config.translations_per_minute,
"translations_per_hour": rate_limit_config.translations_per_hour
},
"current_usage": status
}
@app.post("/admin/cleanup/trigger")
async def trigger_cleanup(is_admin: bool = Depends(require_admin)):
"""Trigger manual cleanup of expired files (requires admin auth)"""
try:
cleaned = await cleanup_manager.cleanup_expired()
return {
"status": "success",
"files_cleaned": cleaned,
"message": f"Cleaned up {cleaned} expired files"
}
except Exception as e:
logger.error(f"Manual cleanup failed: {str(e)}")
raise HTTPException(status_code=500, detail=f"Cleanup failed: {str(e)}")
@app.get("/admin/files/tracked")
async def get_tracked_files(is_admin: bool = Depends(require_admin)):
"""Get list of currently tracked files (requires admin auth)"""
tracked = cleanup_manager.get_tracked_files()
return {
"count": len(tracked),
"files": tracked
}
if __name__ == "__main__":
import uvicorn
uvicorn.run("main:app", host="0.0.0.0", port=8000, reload=True)
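
The `/metrics` payload returned by `get_metrics()` can be checked by a small external script. The following is a minimal sketch, assuming the response shape shown in that handler and the `system_percent` and `usage_percent` keys produced by the memory and disk helpers in this changeset. The `summarize_metrics` helper and its `warn_at` threshold are hypothetical and not part of the repository.

```python
from typing import Any, Dict, List


def summarize_metrics(payload: Dict[str, Any], warn_at: float = 80.0) -> List[str]:
    """Return human-readable warnings for a /metrics-style payload (hypothetical helper)."""
    warnings: List[str] = []
    system = payload.get("system", {})

    memory = system.get("memory", {})
    # The memory block carries an "error" key when psutil is unavailable.
    if "error" not in memory and memory.get("system_percent", 0) > warn_at:
        warnings.append(f"memory at {memory['system_percent']}%")

    disk = system.get("disk", {})
    if disk.get("usage_percent", 0) > warn_at:
        warnings.append(f"disk at {disk['usage_percent']}%")

    if system.get("status", "unknown") != "healthy":
        warnings.append(f"status is {system.get('status', 'unknown')}")
    return warnings


# Sample payload shaped like the "system" section of the /metrics response.
sample = {
    "system": {
        "memory": {"system_percent": 91.5},
        "disk": {"usage_percent": 12.0},
        "status": "degraded",
    }
}
print(summarize_metrics(sample))
```

In practice the payload would come from an HTTP GET against the running server. The sketch takes a plain dictionary so that it can be exercised without one.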

157
mcp.json Normal file

@@ -0,0 +1,157 @@
{
"$schema": "https://json.schemastore.org/mcp-config.json",
"name": "document-translator",
"version": "1.0.0",
"description": "Document Translation API - Translate Excel, Word, PowerPoint files with format preservation",
"author": "Sepehr",
"repository": "https://gitea.parsanet.org/sepehr/office_translator.git",
"license": "MIT",
"runtime": {
"type": "python",
"command": "python",
"args": ["mcp_server.py"],
"cwd": "${workspaceFolder}"
},
"requirements": {
"python": ">=3.8",
"dependencies": [
"requests>=2.28.0"
]
},
"tools": [
{
"name": "translate_document",
"description": "Translate a document (Excel, Word, PowerPoint) to another language while preserving all formatting, styles, formulas, and layouts",
"parameters": {
"file_path": {
"type": "string",
"description": "Absolute path to the document file (.xlsx, .docx, .pptx)",
"required": true
},
"target_language": {
"type": "string",
"description": "Target language code (en, fr, es, fa, de, it, pt, ru, zh, ja, ko, ar)",
"required": true
},
"provider": {
"type": "string",
"enum": ["google", "ollama", "deepl", "libre"],
"default": "google",
"description": "Translation provider to use"
},
"ollama_model": {
"type": "string",
"description": "Ollama model name (e.g., llama3.2, gemma3:12b, qwen3-vl)"
},
"translate_images": {
"type": "boolean",
"default": false,
"description": "Use vision model to extract and translate text from embedded images"
},
"system_prompt": {
"type": "string",
"description": "Custom instructions and context for LLM translation (glossary, domain context, style guidelines)"
},
"output_path": {
"type": "string",
"description": "Path where to save the translated document"
}
},
"examples": [
{
"description": "Translate Excel file to French using Google",
"arguments": {
"file_path": "C:/Documents/data.xlsx",
"target_language": "fr",
"provider": "google"
}
},
{
"description": "Translate Word document to Persian with Ollama and custom glossary",
"arguments": {
"file_path": "C:/Documents/report.docx",
"target_language": "fa",
"provider": "ollama",
"ollama_model": "gemma3:12b",
"system_prompt": "You are translating HVAC technical documentation. Glossary: batterie=کویل, ventilateur=فن, condenseur=کندانسور"
}
},
{
"description": "Translate PowerPoint with image text extraction",
"arguments": {
"file_path": "C:/Presentations/slides.pptx",
"target_language": "de",
"provider": "ollama",
"ollama_model": "gemma3:12b",
"translate_images": true
}
}
]
},
{
"name": "list_ollama_models",
"description": "List all available Ollama models for translation",
"parameters": {
"base_url": {
"type": "string",
"default": "http://localhost:11434",
"description": "Ollama server URL"
}
}
},
{
"name": "get_supported_languages",
"description": "Get the full list of supported language codes and names",
"parameters": {}
},
{
"name": "check_api_health",
"description": "Check if the translation API server is running and healthy",
"parameters": {}
}
],
"features": [
"Format-preserving translation for Excel, Word, PowerPoint",
"Multiple translation providers (Google, Ollama, DeepL, LibreTranslate)",
"Image text extraction using vision models (Gemma3, Qwen3-VL)",
"Custom system prompts and glossaries for technical translation",
"Domain-specific presets (HVAC, IT, Legal, Medical)",
"Browser-based WebLLM support for offline translation"
],
"usage": {
"start_server": "python main.py",
"api_endpoint": "http://localhost:8000",
"web_interface": "http://localhost:8000"
},
"providers": {
"google": {
"description": "Google Translate (free, no API key required)",
"supports_system_prompt": false
},
"ollama": {
"description": "Local Ollama LLM server",
"supports_system_prompt": true,
"supports_vision": true,
"recommended_models": [
"llama3.2",
"gemma3:12b",
"qwen3-vl",
"mistral"
]
},
"deepl": {
"description": "DeepL API (requires API key)",
"supports_system_prompt": false
},
"libre": {
"description": "LibreTranslate (self-hosted)",
"supports_system_prompt": false
}
}
}
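
A client can use the `parameters` object of a tool entry to reject malformed calls before they reach the server. The following is a minimal sketch, assuming the per-parameter `type`, `required`, and `enum` keys used in `mcp.json`. The `validate_arguments` function and the `PY_TYPES` mapping are hypothetical helpers, and the excerpt covers only four of the seven `translate_document` parameters.

```python
from typing import Any, Dict, List

# Excerpt of the "parameters" object for translate_document, copied from mcp.json.
TRANSLATE_PARAMS: Dict[str, Dict[str, Any]] = {
    "file_path": {"type": "string", "required": True},
    "target_language": {"type": "string", "required": True},
    "provider": {
        "type": "string",
        "enum": ["google", "ollama", "deepl", "libre"],
        "default": "google",
    },
    "translate_images": {"type": "boolean", "default": False},
}

# Hypothetical mapping from manifest type names to Python types.
PY_TYPES = {"string": str, "boolean": bool}


def validate_arguments(params: Dict[str, Dict[str, Any]], arguments: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the arguments look valid."""
    errors: List[str] = []
    for name, spec in params.items():
        if name not in arguments:
            if spec.get("required"):
                errors.append(f"missing required parameter: {name}")
            continue
        value = arguments[name]
        expected = PY_TYPES.get(spec.get("type"))
        if expected and not isinstance(value, expected):
            errors.append(f"{name}: expected {spec['type']}")
        elif "enum" in spec and value not in spec["enum"]:
            errors.append(f"{name}: must be one of {spec['enum']}")
    # Flag arguments the manifest does not declare.
    for name in arguments:
        if name not in params:
            errors.append(f"unknown parameter: {name}")
    return errors


good = {"file_path": "C:/Documents/data.xlsx", "target_language": "fr", "provider": "google"}
bad = {"target_language": "fr", "provider": "bing"}
print(validate_arguments(TRANSLATE_PARAMS, good))
print(validate_arguments(TRANSLATE_PARAMS, bad))
```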

391
mcp_server.py Normal file

@@ -0,0 +1,391 @@
#!/usr/bin/env python3
"""
MCP Server for Document Translation API
Model Context Protocol server for AI assistant integration
"""
import sys
import json
import asyncio
import base64
import requests
from pathlib import Path
from typing import Any, Optional
# MCP Protocol Constants
JSONRPC_VERSION = "2.0"
class MCPServer:
"""MCP Server for Document Translation"""
def __init__(self):
self.api_base = "http://localhost:8000"
self.capabilities = {
"tools": {}
}
def get_tools(self) -> list:
"""Return list of available tools"""
return [
{
"name": "translate_document",
"description": "Translate a document (Excel, Word, PowerPoint) to another language while preserving formatting",
"inputSchema": {
"type": "object",
"properties": {
"file_path": {
"type": "string",
"description": "Path to the document file (.xlsx, .docx, .pptx)"
},
"target_language": {
"type": "string",
"description": "Target language code (e.g., 'en', 'fr', 'es', 'fa', 'de')"
},
"provider": {
"type": "string",
"enum": ["google", "ollama", "deepl", "libre"],
"description": "Translation provider (default: google)"
},
"ollama_model": {
"type": "string",
"description": "Ollama model to use (e.g., 'llama3.2', 'gemma3:12b')"
},
"translate_images": {
"type": "boolean",
"description": "Extract and translate text from images using vision model"
},
"system_prompt": {
"type": "string",
"description": "Custom system prompt with context, glossary, or instructions for LLM translation"
},
"output_path": {
"type": "string",
"description": "Path where to save the translated document (optional)"
}
},
"required": ["file_path", "target_language"]
}
},
{
"name": "list_ollama_models",
"description": "List available Ollama models for translation",
"inputSchema": {
"type": "object",
"properties": {
"base_url": {
"type": "string",
"description": "Ollama server URL (default: http://localhost:11434)"
}
}
}
},
{
"name": "get_supported_languages",
"description": "Get list of supported language codes for translation",
"inputSchema": {
"type": "object",
"properties": {}
}
},
{
"name": "configure_translation",
"description": "Configure translation settings",
"inputSchema": {
"type": "object",
"properties": {
"provider": {
"type": "string",
"enum": ["google", "ollama", "deepl", "libre"],
"description": "Default translation provider"
},
"ollama_url": {
"type": "string",
"description": "Ollama server URL"
},
"ollama_model": {
"type": "string",
"description": "Default Ollama model"
}
}
}
},
{
"name": "check_api_health",
"description": "Check if the translation API is running and healthy",
"inputSchema": {
"type": "object",
"properties": {}
}
}
]
async def handle_tool_call(self, name: str, arguments: dict) -> dict:
"""Handle tool calls"""
try:
if name == "translate_document":
return await self.translate_document(arguments)
elif name == "list_ollama_models":
return await self.list_ollama_models(arguments)
elif name == "get_supported_languages":
return await self.get_supported_languages()
elif name == "configure_translation":
return await self.configure_translation(arguments)
elif name == "check_api_health":
return await self.check_api_health()
else:
return {"error": f"Unknown tool: {name}"}
except Exception as e:
return {"error": str(e)}
async def translate_document(self, args: dict) -> dict:
"""Translate a document file"""
file_path = Path(args["file_path"])
if not file_path.exists():
return {"error": f"File not found: {file_path}"}
# Prepare form data
with open(file_path, 'rb') as f:
files = {'file': (file_path.name, f)}
data = {
'target_language': args['target_language'],
'provider': args.get('provider', 'google'),
'translate_images': str(args.get('translate_images', False)).lower(),
}
if args.get('ollama_model'):
data['ollama_model'] = args['ollama_model']
if args.get('system_prompt'):
data['system_prompt'] = args['system_prompt']
try:
response = requests.post(
f"{self.api_base}/translate",
files=files,
data=data,
timeout=300 # 5 minutes timeout
)
if response.status_code == 200:
# Save translated file
output_path = args.get('output_path')
if not output_path:
output_path = file_path.parent / f"translated_{file_path.name}"
output_path = Path(output_path)
with open(output_path, 'wb') as out:
out.write(response.content)
return {
"success": True,
"message": "Document translated successfully",
"output_path": str(output_path),
"source_file": str(file_path),
"target_language": args['target_language'],
"provider": args.get('provider', 'google')
}
else:
error_detail = response.json() if response.headers.get('content-type', '').startswith('application/json') else response.text  # header may carry a charset suffix
return {"error": f"Translation failed: {error_detail}"}
except requests.exceptions.ConnectionError:
return {"error": f"Cannot connect to translation API. Make sure the server is running on {self.api_base}"}
except requests.exceptions.Timeout:
return {"error": "Translation request timed out"}
async def list_ollama_models(self, args: dict) -> dict:
"""List available Ollama models"""
base_url = args.get('base_url', 'http://localhost:11434')
try:
response = requests.get(
f"{self.api_base}/ollama/models",
params={'base_url': base_url},
timeout=10
)
if response.status_code == 200:
data = response.json()
return {
"models": data.get('models', []),
"count": data.get('count', 0),
"ollama_url": base_url
}
else:
return {"error": "Failed to list models", "models": []}
except requests.exceptions.ConnectionError:
return {"error": "Cannot connect to API server", "models": []}
async def get_supported_languages(self) -> dict:
"""Get supported language codes"""
return {
"languages": [
{"code": "en", "name": "English"},
{"code": "fa", "name": "Persian/Farsi"},
{"code": "fr", "name": "French"},
{"code": "es", "name": "Spanish"},
{"code": "de", "name": "German"},
{"code": "it", "name": "Italian"},
{"code": "pt", "name": "Portuguese"},
{"code": "ru", "name": "Russian"},
{"code": "zh", "name": "Chinese"},
{"code": "ja", "name": "Japanese"},
{"code": "ko", "name": "Korean"},
{"code": "ar", "name": "Arabic"},
{"code": "nl", "name": "Dutch"},
{"code": "pl", "name": "Polish"},
{"code": "tr", "name": "Turkish"},
{"code": "vi", "name": "Vietnamese"},
{"code": "th", "name": "Thai"},
{"code": "hi", "name": "Hindi"},
{"code": "he", "name": "Hebrew"},
{"code": "sv", "name": "Swedish"}
]
}
async def configure_translation(self, args: dict) -> dict:
"""Configure translation settings"""
config = {}
if args.get('ollama_url') and args.get('ollama_model'):
try:
response = requests.post(
f"{self.api_base}/ollama/configure",
data={
'base_url': args['ollama_url'],
'model': args['ollama_model']
},
timeout=10
)
if response.status_code == 200:
config['ollama'] = response.json()
except Exception as e:
config['ollama_error'] = str(e)
config['provider'] = args.get('provider', 'google')
return {
"success": True,
"configuration": config
}
async def check_api_health(self) -> dict:
"""Check API health status"""
try:
response = requests.get(f"{self.api_base}/health", timeout=5)
if response.status_code == 200:
return {
"status": "healthy",
"api_url": self.api_base,
"details": response.json()
}
else:
return {"status": "unhealthy", "error": "API returned non-200 status"}
except requests.exceptions.ConnectionError:
return {
"status": "unavailable",
"error": "Cannot connect to API. Start the server with: python main.py"
}
def create_response(self, id: Any, result: Any) -> dict:
"""Create JSON-RPC response"""
return {
"jsonrpc": JSONRPC_VERSION,
"id": id,
"result": result
}
def create_error(self, id: Any, code: int, message: str) -> dict:
"""Create JSON-RPC error response"""
return {
"jsonrpc": JSONRPC_VERSION,
"id": id,
"error": {
"code": code,
"message": message
}
}
async def handle_message(self, message: dict) -> Optional[dict]:
"""Handle incoming JSON-RPC message"""
msg_id = message.get("id")
method = message.get("method")
params = message.get("params", {})
if method == "initialize":
return self.create_response(msg_id, {
"protocolVersion": "2024-11-05",
"capabilities": self.capabilities,
"serverInfo": {
"name": "document-translator",
"version": "1.0.0"
}
})
elif method == "notifications/initialized":
return None # No response needed for notifications
elif method == "tools/list":
return self.create_response(msg_id, {
"tools": self.get_tools()
})
elif method == "tools/call":
tool_name = params.get("name")
tool_args = params.get("arguments", {})
result = await self.handle_tool_call(tool_name, tool_args)
return self.create_response(msg_id, {
"content": [
{
"type": "text",
"text": json.dumps(result, indent=2, ensure_ascii=False)
}
]
})
elif method == "ping":
return self.create_response(msg_id, {})
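else:
if msg_id is None:
return None # Unknown notification: JSON-RPC 2.0 forbids replying to messages without an id
return self.create_error(msg_id, -32601, f"Method not found: {method}")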
async def run(self):
"""Run the MCP server using stdio"""
while True:
try:
line = sys.stdin.readline()
if not line:
break
message = json.loads(line)
response = await self.handle_message(message)
if response:
sys.stdout.write(json.dumps(response) + "\n")
sys.stdout.flush()
except json.JSONDecodeError as e:
error = self.create_error(None, -32700, f"Parse error: {e}")
sys.stdout.write(json.dumps(error) + "\n")
sys.stdout.flush()
except Exception as e:
error = self.create_error(None, -32603, f"Internal error: {e}")
sys.stdout.write(json.dumps(error) + "\n")
sys.stdout.flush()
def main():
"""Main entry point"""
server = MCPServer()
asyncio.run(server.run())
if __name__ == "__main__":
main()
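
The stdio loop in `run()` reads one JSON-RPC message per line and writes one response per line. The following is a minimal sketch of the client side of that framing, assuming the response layout built by `handle_message()` for `tools/call`, where the tool's result is JSON-encoded inside `content[0].text`. The `make_request` and `parse_tool_result` helpers are hypothetical and not part of `mcp_server.py`.

```python
import json
from typing import Any, Dict, Optional


def make_request(msg_id: int, method: str, params: Optional[Dict[str, Any]] = None) -> str:
    """Serialize one JSON-RPC 2.0 request as a single newline-terminated line."""
    message: Dict[str, Any] = {"jsonrpc": "2.0", "id": msg_id, "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message) + "\n"


def parse_tool_result(response_line: str) -> Dict[str, Any]:
    """Decode a tools/call response into the dictionary the tool returned."""
    response = json.loads(response_line)
    if "error" in response:
        raise RuntimeError(response["error"]["message"])
    # The server wraps the tool output as JSON text inside the first content item.
    return json.loads(response["result"]["content"][0]["text"])


call = make_request(3, "tools/call", {"name": "check_api_health", "arguments": {}})

# A response shaped like the one handle_message() builds for tools/call.
fake_response = json.dumps({
    "jsonrpc": "2.0",
    "id": 3,
    "result": {"content": [{"type": "text", "text": json.dumps({"status": "healthy"})}]},
})
print(call, end="")
print(parse_tool_result(fake_response))
```

A real client would write each request line to the standard input of a `python mcp_server.py` subprocess and read the matching line from its standard output. The sketch uses a hand-built response so that it runs without the server.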

62
middleware/__init__.py Normal file

@@ -0,0 +1,62 @@
"""
Middleware package for SaaS robustness
This package provides:
- Rate limiting: Protect against abuse and ensure fair usage
- Validation: Validate all inputs before processing
- Security: Security headers, request logging, error handling
- Cleanup: Automatic file cleanup and resource management
"""
from .rate_limiting import (
RateLimitConfig,
RateLimitManager,
RateLimitMiddleware,
ClientRateLimiter,
)
from .validation import (
ValidationError,
ValidationResult,
FileValidator,
LanguageValidator,
ProviderValidator,
InputSanitizer,
)
from .security import (
SecurityHeadersMiddleware,
RequestLoggingMiddleware,
ErrorHandlingMiddleware,
)
from .cleanup import (
FileCleanupManager,
MemoryMonitor,
HealthChecker,
create_cleanup_manager,
)
__all__ = [
# Rate limiting
"RateLimitConfig",
"RateLimitManager",
"RateLimitMiddleware",
"ClientRateLimiter",
# Validation
"ValidationError",
"ValidationResult",
"FileValidator",
"LanguageValidator",
"ProviderValidator",
"InputSanitizer",
# Security
"SecurityHeadersMiddleware",
"RequestLoggingMiddleware",
"ErrorHandlingMiddleware",
# Cleanup
"FileCleanupManager",
"MemoryMonitor",
"HealthChecker",
"create_cleanup_manager",
]

400
middleware/cleanup.py Normal file

@@ -0,0 +1,400 @@
"""
Cleanup and Resource Management for SaaS robustness
Automatic cleanup of temporary files and resources
"""
import os
import time
import asyncio
import threading
from pathlib import Path
from datetime import datetime, timedelta
from typing import Optional, Set
import logging
logger = logging.getLogger(__name__)
class FileCleanupManager:
"""Manages automatic cleanup of temporary and output files"""
def __init__(
self,
upload_dir: Path,
output_dir: Path,
temp_dir: Path,
max_file_age_hours: int = 1,
cleanup_interval_minutes: int = 10,
max_total_size_gb: float = 10.0
):
self.upload_dir = Path(upload_dir)
self.output_dir = Path(output_dir)
self.temp_dir = Path(temp_dir)
self.max_file_age_seconds = max_file_age_hours * 3600
self.cleanup_interval = cleanup_interval_minutes * 60
self.max_total_size_bytes = int(max_total_size_gb * 1024 * 1024 * 1024)
self._running = False
self._task: Optional[asyncio.Task] = None
self._protected_files: Set[str] = set()
self._tracked_files: dict = {} # filepath -> {created, ttl_minutes}
self._lock = threading.Lock()
self._stats = {
"files_cleaned": 0,
"bytes_freed": 0,
"cleanup_runs": 0
}
async def track_file(self, filepath: Path, ttl_minutes: int = 60):
"""Track a file for automatic cleanup after TTL expires"""
now = time.time()  # single timestamp so "created" and "expires_at" stay consistent
with self._lock:
self._tracked_files[str(filepath)] = {
"created": now,
"ttl_minutes": ttl_minutes,
"expires_at": now + (ttl_minutes * 60)
}
def get_tracked_files(self) -> list:
"""Get list of currently tracked files with their status"""
now = time.time()
result = []
with self._lock:
for filepath, info in self._tracked_files.items():
remaining = info["expires_at"] - now
result.append({
"path": filepath,
"exists": Path(filepath).exists(),
"expires_in_seconds": max(0, int(remaining)),
"ttl_minutes": info["ttl_minutes"]
})
return result
async def cleanup_expired(self) -> int:
"""Cleanup expired tracked files"""
now = time.time()
cleaned = 0
to_remove = []
with self._lock:
for filepath, info in list(self._tracked_files.items()):
if now > info["expires_at"]:
to_remove.append(filepath)
for filepath in to_remove:
try:
path = Path(filepath)
if path.exists() and not self.is_protected(path):
size = path.stat().st_size
path.unlink()
cleaned += 1
self._stats["files_cleaned"] += 1
self._stats["bytes_freed"] += size
logger.info(f"Cleaned expired file: {filepath}")
with self._lock:
self._tracked_files.pop(filepath, None)
except Exception as e:
logger.warning(f"Failed to clean expired file {filepath}: {e}")
return cleaned
def get_stats(self) -> dict:
"""Get cleanup statistics"""
disk_usage = self.get_disk_usage()
with self._lock:
tracked_count = len(self._tracked_files)
return {
"files_cleaned_total": self._stats["files_cleaned"],
"bytes_freed_total_mb": round(self._stats["bytes_freed"] / (1024 * 1024), 2),
"cleanup_runs": self._stats["cleanup_runs"],
"tracked_files": tracked_count,
"disk_usage": disk_usage,
"is_running": self._running
}
def protect_file(self, filepath: Path):
"""Mark a file as protected (being processed)"""
with self._lock:
self._protected_files.add(str(filepath))
def unprotect_file(self, filepath: Path):
"""Remove protection from a file"""
with self._lock:
self._protected_files.discard(str(filepath))
def is_protected(self, filepath: Path) -> bool:
"""Check if a file is protected"""
with self._lock:
return str(filepath) in self._protected_files
async def start(self):
"""Start the cleanup background task"""
if self._running:
return
self._running = True
self._task = asyncio.create_task(self._cleanup_loop())
logger.info("File cleanup manager started")
async def stop(self):
"""Stop the cleanup background task"""
self._running = False
if self._task:
self._task.cancel()
try:
await self._task
except asyncio.CancelledError:
pass
logger.info("File cleanup manager stopped")
async def _cleanup_loop(self):
"""Background loop for periodic cleanup"""
while self._running:
try:
await self.cleanup()
await self.cleanup_expired()
self._stats["cleanup_runs"] += 1
except Exception as e:
logger.error(f"Cleanup error: {e}")
await asyncio.sleep(self.cleanup_interval)
async def cleanup(self) -> dict:
"""Perform cleanup of old files"""
stats = {
"files_deleted": 0,
"bytes_freed": 0,
"errors": []
}
now = time.time()
# Cleanup each directory
for directory in [self.upload_dir, self.output_dir, self.temp_dir]:
if not directory.exists():
continue
for filepath in directory.iterdir():
if not filepath.is_file():
continue
# Skip protected files
if self.is_protected(filepath):
continue
try:
# Check file age
file_age = now - filepath.stat().st_mtime
if file_age > self.max_file_age_seconds:
file_size = filepath.stat().st_size
filepath.unlink()
stats["files_deleted"] += 1
stats["bytes_freed"] += file_size
logger.debug(f"Deleted old file: {filepath}")
except Exception as e:
stats["errors"].append(str(e))
logger.warning(f"Failed to delete {filepath}: {e}")
# Force cleanup if total size exceeds limit
await self._enforce_size_limit(stats)
if stats["files_deleted"] > 0:
mb_freed = stats["bytes_freed"] / (1024 * 1024)
logger.info(f"Cleanup: deleted {stats['files_deleted']} files, freed {mb_freed:.2f}MB")
return stats
async def _enforce_size_limit(self, stats: dict):
"""Delete oldest files if total size exceeds limit"""
files_with_mtime = []
total_size = 0
for directory in [self.upload_dir, self.output_dir, self.temp_dir]:
if not directory.exists():
continue
for filepath in directory.iterdir():
if not filepath.is_file() or self.is_protected(filepath):
continue
try:
stat = filepath.stat()
files_with_mtime.append((filepath, stat.st_mtime, stat.st_size))
total_size += stat.st_size
except Exception:
pass
# If under limit, nothing to do
if total_size <= self.max_total_size_bytes:
return
# Sort by modification time (oldest first)
files_with_mtime.sort(key=lambda x: x[1])
# Delete oldest files until under limit
for filepath, _, size in files_with_mtime:
if total_size <= self.max_total_size_bytes:
break
try:
filepath.unlink()
total_size -= size
stats["files_deleted"] += 1
stats["bytes_freed"] += size
logger.info(f"Deleted file to free space: {filepath}")
except Exception as e:
stats["errors"].append(str(e))
def get_disk_usage(self) -> dict:
"""Get current disk usage statistics"""
total_files = 0
total_size = 0
for directory in [self.upload_dir, self.output_dir, self.temp_dir]:
if not directory.exists():
continue
for filepath in directory.iterdir():
if filepath.is_file():
total_files += 1
try:
total_size += filepath.stat().st_size
except Exception:
pass
return {
"total_files": total_files,
"total_size_mb": round(total_size / (1024 * 1024), 2),
"max_size_gb": self.max_total_size_bytes / (1024 * 1024 * 1024),
"usage_percent": round((total_size / self.max_total_size_bytes) * 100, 1) if self.max_total_size_bytes > 0 else 0,
"directories": {
"uploads": str(self.upload_dir),
"outputs": str(self.output_dir),
"temp": str(self.temp_dir)
}
}
class MemoryMonitor:
"""Monitors memory usage and triggers cleanup if needed"""
def __init__(self, max_memory_percent: float = 80.0):
self.max_memory_percent = max_memory_percent
self._high_memory_callbacks = []
def get_memory_usage(self) -> dict:
"""Get current memory usage"""
try:
import psutil
process = psutil.Process()
memory_info = process.memory_info()
system_memory = psutil.virtual_memory()
return {
"process_rss_mb": round(memory_info.rss / (1024 * 1024), 2),
"process_vms_mb": round(memory_info.vms / (1024 * 1024), 2),
"system_total_gb": round(system_memory.total / (1024 * 1024 * 1024), 2),
"system_available_gb": round(system_memory.available / (1024 * 1024 * 1024), 2),
"system_percent": system_memory.percent
}
except ImportError:
return {"error": "psutil not installed"}
except Exception as e:
return {"error": str(e)}
def check_memory(self) -> bool:
"""Check if memory usage is within limits"""
usage = self.get_memory_usage()
if "error" in usage:
return True # Can't check, assume OK
return usage.get("system_percent", 0) < self.max_memory_percent
def on_high_memory(self, callback):
"""Register callback for high memory situations"""
self._high_memory_callbacks.append(callback)
class HealthChecker:
"""Comprehensive health checking for the application"""
def __init__(self, cleanup_manager: FileCleanupManager, memory_monitor: MemoryMonitor):
self.cleanup_manager = cleanup_manager
self.memory_monitor = memory_monitor
self.start_time = datetime.now()
self._translation_count = 0
self._error_count = 0
self._lock = threading.Lock()
def record_translation(self, success: bool = True):
"""Record a translation attempt"""
with self._lock:
self._translation_count += 1
if not success:
self._error_count += 1
async def check_health(self) -> dict:
"""Get comprehensive health status (async version)"""
return self.get_health()
def get_health(self) -> dict:
"""Get comprehensive health status"""
memory = self.memory_monitor.get_memory_usage()
disk = self.cleanup_manager.get_disk_usage()
# Determine overall status
status = "healthy"
issues = []
if "error" not in memory:
if memory.get("system_percent", 0) > 90:
status = "degraded"
issues.append("High memory usage")
elif memory.get("system_percent", 0) > 80:
issues.append("Memory usage elevated")
if disk.get("usage_percent", 0) > 90:
status = "degraded"
issues.append("High disk usage")
elif disk.get("usage_percent", 0) > 80:
issues.append("Disk usage elevated")
uptime = datetime.now() - self.start_time
return {
"status": status,
"issues": issues,
"uptime_seconds": int(uptime.total_seconds()),
"uptime_human": str(uptime).split('.')[0],
"translations": {
"total": self._translation_count,
"errors": self._error_count,
"success_rate": round(
((self._translation_count - self._error_count) / self._translation_count * 100)
if self._translation_count > 0 else 100, 1
)
},
"memory": memory,
"disk": disk,
"cleanup_service": self.cleanup_manager.get_stats(),
"timestamp": datetime.now().isoformat()
}
# Create default instances
def create_cleanup_manager(config) -> FileCleanupManager:
"""Create cleanup manager with config"""
return FileCleanupManager(
upload_dir=config.UPLOAD_DIR,
output_dir=config.OUTPUT_DIR,
temp_dir=config.TEMP_DIR,
max_file_age_hours=getattr(config, 'MAX_FILE_AGE_HOURS', 1),
cleanup_interval_minutes=getattr(config, 'CLEANUP_INTERVAL_MINUTES', 10),
max_total_size_gb=getattr(config, 'MAX_TOTAL_SIZE_GB', 10.0)
)
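
The size cap in `_enforce_size_limit()` follows an oldest-first eviction policy: files are sorted by modification time and deleted until the total drops to the limit. The following is a minimal sketch of that selection step in isolation, assuming files are described as `(path, mtime, size_bytes)` tuples. The `select_for_eviction` function is a hypothetical helper, and it only picks candidates without deleting anything.

```python
from typing import List, Tuple

FileInfo = Tuple[str, float, int]  # (path, mtime, size_bytes)


def select_for_eviction(files: List[FileInfo], max_total_bytes: int) -> List[str]:
    """Pick the oldest files to delete until the total fits under max_total_bytes."""
    total = sum(size for _, _, size in files)
    victims: List[str] = []
    # Oldest modification time first, mirroring the sort in _enforce_size_limit().
    for path, _, size in sorted(files, key=lambda item: item[1]):
        if total <= max_total_bytes:
            break
        victims.append(path)
        total -= size
    return victims


files = [
    ("outputs/b.docx", 200.0, 40),
    ("uploads/a.xlsx", 100.0, 50),
    ("temp/c.pptx", 300.0, 30),
]
print(select_for_eviction(files, max_total_bytes=60))
```

Separating selection from deletion keeps the policy testable without touching the filesystem. The manager itself interleaves the two and skips protected files when it builds the candidate list.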

328
middleware/rate_limiting.py Normal file

@@ -0,0 +1,328 @@
"""
Rate Limiting Middleware for SaaS robustness
Protects against abuse and ensures fair usage
"""
import time
import asyncio
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, Optional
from fastapi import Request, HTTPException
from starlette.middleware.base import BaseHTTPMiddleware
from starlette.responses import JSONResponse
import logging
logger = logging.getLogger(__name__)
@dataclass
class RateLimitConfig:
"""Configuration for rate limiting"""
# Requests per window
requests_per_minute: int = 30
requests_per_hour: int = 200
requests_per_day: int = 1000
# Translation-specific limits
translations_per_minute: int = 10
translations_per_hour: int = 50
max_concurrent_translations: int = 5
# File size limits (MB)
max_file_size_mb: int = 50
max_total_size_per_hour_mb: int = 500
# Burst protection
burst_limit: int = 10 # Max requests in 1 second
# Whitelist IPs (no rate limiting)
whitelist_ips: list = field(default_factory=lambda: ["127.0.0.1", "::1"])
class TokenBucket:
"""Token bucket algorithm for rate limiting"""
def __init__(self, capacity: int, refill_rate: float):
self.capacity = capacity
self.refill_rate = refill_rate # tokens per second
self.tokens = capacity
self.last_refill = time.time()
self._lock = asyncio.Lock()
async def consume(self, tokens: int = 1) -> bool:
"""Try to consume tokens, return True if successful"""
async with self._lock:
self._refill()
if self.tokens >= tokens:
self.tokens -= tokens
return True
return False
def _refill(self):
"""Refill tokens based on time elapsed"""
now = time.time()
elapsed = now - self.last_refill
self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
self.last_refill = now
class SlidingWindowCounter:
"""Sliding window counter for accurate rate limiting"""
def __init__(self, window_seconds: int, max_requests: int):
self.window_seconds = window_seconds
self.max_requests = max_requests
self.requests: list = []
self._lock = asyncio.Lock()
async def is_allowed(self) -> bool:
"""Check if a new request is allowed"""
async with self._lock:
now = time.time()
# Remove old requests outside the window
self.requests = [ts for ts in self.requests if now - ts < self.window_seconds]
if len(self.requests) < self.max_requests:
self.requests.append(now)
return True
return False
@property
def current_count(self) -> int:
"""Get current request count in window"""
now = time.time()
return len([ts for ts in self.requests if now - ts < self.window_seconds])
class ClientRateLimiter:
"""Per-client rate limiter with multiple windows"""
def __init__(self, config: RateLimitConfig):
self.config = config
self.minute_counter = SlidingWindowCounter(60, config.requests_per_minute)
self.hour_counter = SlidingWindowCounter(3600, config.requests_per_hour)
self.day_counter = SlidingWindowCounter(86400, config.requests_per_day)
self.burst_bucket = TokenBucket(config.burst_limit, config.burst_limit)
self.translation_minute = SlidingWindowCounter(60, config.translations_per_minute)
self.translation_hour = SlidingWindowCounter(3600, config.translations_per_hour)
self.concurrent_translations = 0
self.total_size_hour: list = [] # List of (timestamp, size_mb)
self._lock = asyncio.Lock()
async def check_request(self) -> tuple[bool, str]:
"""Check if request is allowed, return (allowed, reason)"""
# Check burst limit
if not await self.burst_bucket.consume():
return False, "Too many requests. Please slow down."
# Check minute limit
if not await self.minute_counter.is_allowed():
return False, f"Rate limit exceeded. Max {self.config.requests_per_minute} requests per minute."
# Check hour limit
if not await self.hour_counter.is_allowed():
return False, f"Hourly limit exceeded. Max {self.config.requests_per_hour} requests per hour."
# Check day limit
if not await self.day_counter.is_allowed():
return False, f"Daily limit exceeded. Max {self.config.requests_per_day} requests per day."
return True, ""
async def check_translation(self, file_size_mb: float = 0) -> tuple[bool, str]:
"""Check if translation request is allowed"""
async with self._lock:
# Check concurrent limit
if self.concurrent_translations >= self.config.max_concurrent_translations:
return False, f"Too many concurrent translations. Max {self.config.max_concurrent_translations} at a time."
# Check translation per minute
if not await self.translation_minute.is_allowed():
return False, f"Translation rate limit exceeded. Max {self.config.translations_per_minute} translations per minute."
# Check translation per hour
if not await self.translation_hour.is_allowed():
return False, f"Hourly translation limit exceeded. Max {self.config.translations_per_hour} translations per hour."
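# Check total size per hour (separate lock acquisition: asyncio.Lock is not reentrant, so this block must not be nested inside the one above)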
async with self._lock:
now = time.time()
self.total_size_hour = [(ts, size) for ts, size in self.total_size_hour if now - ts < 3600]
total_size = sum(size for _, size in self.total_size_hour)
if total_size + file_size_mb > self.config.max_total_size_per_hour_mb:
return False, f"Hourly data limit exceeded. Max {self.config.max_total_size_per_hour_mb}MB per hour."
self.total_size_hour.append((now, file_size_mb))
return True, ""
async def start_translation(self):
"""Mark start of translation"""
async with self._lock:
self.concurrent_translations += 1
async def end_translation(self):
"""Mark end of translation"""
async with self._lock:
self.concurrent_translations = max(0, self.concurrent_translations - 1)
def get_stats(self) -> dict:
"""Get current rate limit stats"""
return {
"requests_minute": self.minute_counter.current_count,
"requests_hour": self.hour_counter.current_count,
"requests_day": self.day_counter.current_count,
"translations_minute": self.translation_minute.current_count,
"translations_hour": self.translation_hour.current_count,
"concurrent_translations": self.concurrent_translations,
}
class RateLimitManager:
"""Manages rate limiters for all clients"""
def __init__(self, config: Optional[RateLimitConfig] = None):
self.config = config or RateLimitConfig()
self.clients: Dict[str, ClientRateLimiter] = defaultdict(lambda: ClientRateLimiter(self.config))
self._cleanup_interval = 3600 # Cleanup old clients every hour
self._last_cleanup = time.time()
self._total_requests = 0
self._total_translations = 0
def get_client_id(self, request: Request) -> str:
"""Extract client identifier from request"""
# Try to get real IP from proxy headers (client-controlled unless a trusted proxy overwrites them)
forwarded = request.headers.get("X-Forwarded-For")
if forwarded:
return forwarded.split(",")[0].strip()
real_ip = request.headers.get("X-Real-IP")
if real_ip:
return real_ip
# Fall back to direct client IP
if request.client:
return request.client.host
return "unknown"
def is_whitelisted(self, client_id: str) -> bool:
"""Check if client is whitelisted"""
return client_id in self.config.whitelist_ips
async def check_request(self, request: Request) -> tuple[bool, str, str]:
"""Check if request is allowed, return (allowed, reason, client_id)"""
client_id = self.get_client_id(request)
self._total_requests += 1
if self.is_whitelisted(client_id):
return True, "", client_id
client = self.clients[client_id]
allowed, reason = await client.check_request()
return allowed, reason, client_id
async def check_translation(self, request: Request, file_size_mb: float = 0) -> tuple[bool, str]:
"""Check if translation is allowed"""
client_id = self.get_client_id(request)
self._total_translations += 1
if self.is_whitelisted(client_id):
return True, ""
client = self.clients[client_id]
return await client.check_translation(file_size_mb)
async def check_translation_limit(self, client_id: str, file_size_mb: float = 0) -> bool:
"""Check if translation is allowed for a specific client ID"""
if self.is_whitelisted(client_id):
return True
client = self.clients[client_id]
allowed, _ = await client.check_translation(file_size_mb)
return allowed
def get_client_stats(self, request: Request) -> dict:
"""Get rate limit stats for a client"""
client_id = self.get_client_id(request)
client = self.clients[client_id]
return {
"client_id": client_id,
"is_whitelisted": self.is_whitelisted(client_id),
**client.get_stats()
}
async def get_client_status(self, client_id: str) -> dict:
"""Get current usage status for a client"""
if client_id not in self.clients:
return {"status": "no_activity", "requests": 0}
client = self.clients[client_id]
stats = client.get_stats()
return {
"requests_used_minute": stats["requests_minute"],
"requests_used_hour": stats["requests_hour"],
"translations_used_minute": stats["translations_minute"],
"translations_used_hour": stats["translations_hour"],
"concurrent_translations": stats["concurrent_translations"],
"is_whitelisted": self.is_whitelisted(client_id)
}
def get_stats(self) -> dict:
"""Get global rate limiting statistics"""
return {
"total_requests": self._total_requests,
"total_translations": self._total_translations,
"active_clients": len(self.clients),
"config": {
"requests_per_minute": self.config.requests_per_minute,
"requests_per_hour": self.config.requests_per_hour,
"translations_per_minute": self.config.translations_per_minute,
"translations_per_hour": self.config.translations_per_hour,
"max_concurrent_translations": self.config.max_concurrent_translations
}
}
class RateLimitMiddleware(BaseHTTPMiddleware):
"""FastAPI middleware for rate limiting"""
def __init__(self, app, rate_limit_manager: RateLimitManager):
super().__init__(app)
self.manager = rate_limit_manager
async def dispatch(self, request: Request, call_next):
# Skip rate limiting for health checks and static files
if request.url.path in ["/health", "/", "/docs", "/openapi.json", "/redoc"]:
return await call_next(request)
if request.url.path.startswith("/static"):
return await call_next(request)
# Check rate limit
allowed, reason, client_id = await self.manager.check_request(request)
if not allowed:
logger.warning(f"Rate limit exceeded for {client_id}: {reason}")
return JSONResponse(
status_code=429,
content={
"error": "rate_limit_exceeded",
"message": reason,
"retry_after": 60
},
headers={"Retry-After": "60"}
)
# Add client info to request state for use in endpoints
request.state.client_id = client_id
request.state.rate_limiter = self.manager.clients[client_id]
return await call_next(request)
# Global rate limit manager
rate_limit_manager = RateLimitManager()
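
Review note: `get_client_id` takes the first `X-Forwarded-For` entry as the client identity, so a caller that sets the header itself can pick its own rate-limit bucket. Below is a minimal sketch of one mitigation, assuming the service runs behind reverse proxies whose addresses are known. `resolve_client_ip`, `_is_trusted` and `TRUSTED_PROXIES` are hypothetical names for illustration, not part of this commit.

```python
import ipaddress
from typing import Iterable, Optional

# Hypothetical allow-list of reverse proxies; adjust to the real deployment.
TRUSTED_PROXIES = [
    ipaddress.ip_network("127.0.0.0/8"),
    ipaddress.ip_network("10.0.0.0/8"),
]


def _is_trusted(ip: str, networks: Iterable) -> bool:
    """Return True if ip parses and falls inside one of the trusted networks."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        return False
    return any(addr in net for net in networks)


def resolve_client_ip(peer_ip: Optional[str], forwarded_for: Optional[str],
                      networks: Iterable = TRUSTED_PROXIES) -> str:
    """Pick the client IP, honouring X-Forwarded-For only from trusted peers."""
    if not peer_ip:
        return "unknown"
    networks = list(networks)
    if not forwarded_for or not _is_trusted(peer_ip, networks):
        # Direct connection or untrusted sender: ignore the header entirely.
        return peer_ip
    # Walk the chain right to left and stop at the first hop that is not a proxy.
    hops = [h.strip() for h in forwarded_for.split(",") if h.strip()]
    for hop in reversed(hops):
        if not _is_trusted(hop, networks):
            return hop
    return peer_ip
```

In the middleware this would be called with `request.client.host` and the raw header value. The right-to-left walk matters because each proxy appends the peer it saw, so only the rightmost entries were written by infrastructure the service controls.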

middleware/security.py Normal file

@@ -0,0 +1,142 @@
"""
Security Headers Middleware for SaaS robustness
Adds security headers to all responses
"""
from starlette.middleware.base import BaseHTTPMiddleware
from starlette.requests import Request
from starlette.responses import Response
import logging
logger = logging.getLogger(__name__)
class SecurityHeadersMiddleware(BaseHTTPMiddleware):
"""Add security headers to all responses"""
def __init__(self, app, config: dict = None):
super().__init__(app)
self.config = config or {}
async def dispatch(self, request: Request, call_next) -> Response:
response = await call_next(request)
# Prevent clickjacking
response.headers["X-Frame-Options"] = "DENY"
# Prevent MIME type sniffing
response.headers["X-Content-Type-Options"] = "nosniff"
# Legacy XSS filter header (deprecated; modern browsers ignore it and rely on CSP)
response.headers["X-XSS-Protection"] = "1; mode=block"
# Referrer policy
response.headers["Referrer-Policy"] = "strict-origin-when-cross-origin"
# Permissions policy
response.headers["Permissions-Policy"] = "geolocation=(), microphone=(), camera=()"
# Content Security Policy (adjust for your frontend)
if not request.url.path.startswith("/docs") and not request.url.path.startswith("/redoc"):
response.headers["Content-Security-Policy"] = (
"default-src 'self'; "
"script-src 'self' 'unsafe-inline' 'unsafe-eval' blob:; "
"style-src 'self' 'unsafe-inline'; "
"img-src 'self' data: blob:; "
"font-src 'self' data:; "
"connect-src 'self' http://localhost:* https://localhost:* ws://localhost:*; "
"worker-src 'self' blob:; "
)
# HSTS (only in production with HTTPS)
if self.config.get("enable_hsts", False):
response.headers["Strict-Transport-Security"] = "max-age=31536000; includeSubDomains"
return response
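
Review note: the Content-Security-Policy in `SecurityHeadersMiddleware` is a hard-coded string tied to `localhost`, and its comment says to adjust it for the frontend. One way to make that adjustment easier is to keep the directives in a mapping and render the header value from it. This is a minimal sketch. `build_csp` and `DEFAULT_CSP` are hypothetical names, not part of this commit, and `https://api.example.com` is a placeholder origin.

```python
from typing import Dict, List

# Directives copied from the middleware's hard-coded policy.
DEFAULT_CSP: Dict[str, List[str]] = {
    "default-src": ["'self'"],
    "script-src": ["'self'", "'unsafe-inline'", "'unsafe-eval'", "blob:"],
    "style-src": ["'self'", "'unsafe-inline'"],
    "img-src": ["'self'", "data:", "blob:"],
    "font-src": ["'self'", "data:"],
    "connect-src": ["'self'", "http://localhost:*", "https://localhost:*", "ws://localhost:*"],
    "worker-src": ["'self'", "blob:"],
}


def build_csp(directives: Dict[str, List[str]]) -> str:
    """Render a directive mapping as a Content-Security-Policy header value."""
    parts = [
        f"{name} {' '.join(sources)}"
        for name, sources in directives.items()
        if sources  # skip directives with no sources
    ]
    return "; ".join(parts)


# Example: swap the localhost connect targets for a production API origin.
production = dict(DEFAULT_CSP)
production["connect-src"] = ["'self'", "https://api.example.com"]
header_value = build_csp(production)
```

The mapping could come from the `config` dict the middleware already accepts, so development and production can differ without editing the class.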
class RequestLoggingMiddleware(BaseHTTPMiddleware):
"""Log all requests for monitoring and debugging"""
def __init__(self, app, log_body: bool = False):
super().__init__(app)
self.log_body = log_body
async def dispatch(self, request: Request, call_next) -> Response:
import time
import uuid
# Generate request ID
request_id = str(uuid.uuid4())[:8]
request.state.request_id = request_id
# Get client info
client_ip = self._get_client_ip(request)
# Log request start
start_time = time.time()
logger.info(
f"[{request_id}] {request.method} {request.url.path} "
f"from {client_ip} - Started"
)
try:
response = await call_next(request)
# Log request completion
duration = time.time() - start_time
logger.info(
f"[{request_id}] {request.method} {request.url.path} "
f"- {response.status_code} in {duration:.3f}s"
)
# Add request ID to response headers
response.headers["X-Request-ID"] = request_id
return response
except Exception as e:
duration = time.time() - start_time
logger.error(
f"[{request_id}] {request.method} {request.url.path} "
f"- ERROR in {duration:.3f}s: {str(e)}"
)
raise
def _get_client_ip(self, request: Request) -> str:
"""Get real client IP from headers or connection"""
forwarded = request.headers.get("X-Forwarded-For")
if forwarded:
return forwarded.split(",")[0].strip()
real_ip = request.headers.get("X-Real-IP")
if real_ip:
return real_ip
if request.client:
return request.client.host
return "unknown"
class ErrorHandlingMiddleware(BaseHTTPMiddleware):
"""Catch all unhandled exceptions and return proper error responses"""
async def dispatch(self, request: Request, call_next) -> Response:
from starlette.responses import JSONResponse
try:
return await call_next(request)
except Exception as e:
request_id = getattr(request.state, 'request_id', 'unknown')
logger.exception(f"[{request_id}] Unhandled exception: {str(e)}")
# Don't expose internal errors in production
return JSONResponse(
status_code=500,
content={
"error": "internal_server_error",
"message": "An unexpected error occurred. Please try again later.",
"request_id": request_id
}
)

middleware/validation.py Normal file

@@ -0,0 +1,440 @@
"""
Input Validation Module for SaaS robustness
Validates all user inputs before processing
"""
import re
import magic
from pathlib import Path
from typing import Optional, List, Set
from fastapi import UploadFile, HTTPException
import logging
logger = logging.getLogger(__name__)
class ValidationError(Exception):
"""Custom validation error with user-friendly messages"""
def __init__(self, message: str, code: str = "validation_error", details: Optional[dict] = None):
self.message = message
self.code = code
self.details = details or {}
super().__init__(message)
class ValidationResult:
"""Result of a validation check"""
def __init__(self, is_valid: bool = True, errors: List[str] = None, warnings: List[str] = None, data: dict = None):
self.is_valid = is_valid
self.errors = errors or []
self.warnings = warnings or []
self.data = data or {}
class FileValidator:
"""Validates uploaded files for security and compatibility"""
# Allowed MIME types mapped to extensions
ALLOWED_MIME_TYPES = {
"application/vnd.openxmlformats-officedocument.spreadsheetml.sheet": ".xlsx",
"application/vnd.openxmlformats-officedocument.wordprocessingml.document": ".docx",
"application/vnd.openxmlformats-officedocument.presentationml.presentation": ".pptx",
}
# Magic bytes for Office Open XML files (ZIP format)
OFFICE_MAGIC_BYTES = b"PK\x03\x04"
def __init__(
self,
max_size_mb: int = 50,
allowed_extensions: Set[str] = None,
scan_content: bool = True
):
self.max_size_bytes = max_size_mb * 1024 * 1024
self.max_size_mb = max_size_mb
self.allowed_extensions = allowed_extensions or {".xlsx", ".docx", ".pptx"}
self.scan_content = scan_content
async def validate_async(self, file: UploadFile) -> ValidationResult:
"""
Validate an uploaded file asynchronously
Returns ValidationResult with is_valid, errors, warnings
"""
errors = []
warnings = []
data = {}
try:
# Validate filename
if not file.filename:
errors.append("Filename is required")
return ValidationResult(is_valid=False, errors=errors)
# Sanitize filename
try:
safe_filename = self._sanitize_filename(file.filename)
data["safe_filename"] = safe_filename
except ValidationError as e:
errors.append(str(e.message))
return ValidationResult(is_valid=False, errors=errors)
# Validate extension
try:
extension = self._validate_extension(safe_filename)
data["extension"] = extension
except ValidationError as e:
errors.append(str(e.message))
return ValidationResult(is_valid=False, errors=errors)
# Read file content for validation
content = await file.read()
await file.seek(0) # Reset for later processing
# Validate file size
file_size = len(content)
data["size_bytes"] = file_size
data["size_mb"] = round(file_size / (1024*1024), 2)
if file_size > self.max_size_bytes:
errors.append(f"File too large. Maximum size is {self.max_size_mb}MB, got {file_size / (1024*1024):.1f}MB")
return ValidationResult(is_valid=False, errors=errors, data=data)
if file_size == 0:
errors.append("File is empty")
return ValidationResult(is_valid=False, errors=errors, data=data)
# Warn about large files
if file_size > self.max_size_bytes * 0.8:
warnings.append(f"File is {data['size_mb']}MB, approaching the {self.max_size_mb}MB limit")
# Validate magic bytes
if self.scan_content:
try:
self._validate_magic_bytes(content, extension)
except ValidationError as e:
errors.append(str(e.message))
return ValidationResult(is_valid=False, errors=errors, data=data)
# Validate MIME type
try:
mime_type = self._detect_mime_type(content)
data["mime_type"] = mime_type
self._validate_mime_type(mime_type, extension)
except ValidationError as e:
warnings.append(f"MIME type warning: {e.message}")
except Exception:
warnings.append("Could not verify MIME type")
data["original_filename"] = file.filename
return ValidationResult(is_valid=True, errors=errors, warnings=warnings, data=data)
except Exception as e:
logger.error(f"Validation error: {str(e)}")
errors.append(f"Validation failed: {str(e)}")
return ValidationResult(is_valid=False, errors=errors, warnings=warnings, data=data)
async def validate(self, file: UploadFile) -> dict:
"""
Validate an uploaded file
Returns validation info dict or raises ValidationError
"""
# Validate filename
if not file.filename:
raise ValidationError(
"Filename is required",
code="missing_filename"
)
# Sanitize filename
safe_filename = self._sanitize_filename(file.filename)
# Validate extension
extension = self._validate_extension(safe_filename)
# Read file content for validation
content = await file.read()
await file.seek(0) # Reset for later processing
# Validate file size
file_size = len(content)
if file_size > self.max_size_bytes:
raise ValidationError(
f"File too large. Maximum size is {self.max_size_mb}MB, got {file_size / (1024*1024):.1f}MB",
code="file_too_large",
details={"max_mb": self.max_size_mb, "actual_mb": round(file_size / (1024*1024), 2)}
)
if file_size == 0:
raise ValidationError(
"File is empty",
code="empty_file"
)
# Validate magic bytes (file signature)
if self.scan_content:
self._validate_magic_bytes(content, extension)
# Validate MIME type
mime_type = self._detect_mime_type(content)
self._validate_mime_type(mime_type, extension)
return {
"original_filename": file.filename,
"safe_filename": safe_filename,
"extension": extension,
"size_bytes": file_size,
"size_mb": round(file_size / (1024*1024), 2),
"mime_type": mime_type
}
def _sanitize_filename(self, filename: str) -> str:
"""Sanitize filename to prevent path traversal and other attacks"""
# Remove path components
filename = Path(filename).name
# Remove null bytes and control characters
filename = re.sub(r'[\x00-\x1f\x7f-\x9f]', '', filename)
# Remove potentially dangerous characters
filename = re.sub(r'[<>:"/\\|?*]', '_', filename)
# Limit length
if len(filename) > 255:
name, ext = filename.rsplit('.', 1) if '.' in filename else (filename, '')
filename = name[:250] + ('.' + ext if ext else '')
# Ensure not empty after sanitization
if not filename or filename.strip() == '':
raise ValidationError(
"Invalid filename",
code="invalid_filename"
)
return filename
def _validate_extension(self, filename: str) -> str:
"""Validate and return the file extension"""
if '.' not in filename:
raise ValidationError(
f"File must have an extension. Supported: {', '.join(self.allowed_extensions)}",
code="missing_extension",
details={"allowed_extensions": list(self.allowed_extensions)}
)
extension = '.' + filename.rsplit('.', 1)[1].lower()
if extension not in self.allowed_extensions:
raise ValidationError(
f"File type '{extension}' not supported. Supported types: {', '.join(self.allowed_extensions)}",
code="unsupported_file_type",
details={"extension": extension, "allowed_extensions": list(self.allowed_extensions)}
)
return extension
def _validate_magic_bytes(self, content: bytes, extension: str):
"""Validate file magic bytes match expected format"""
# All supported formats are Office Open XML (ZIP-based)
if not content.startswith(self.OFFICE_MAGIC_BYTES):
raise ValidationError(
"File content does not match expected format. The file may be corrupted or not a valid Office document.",
code="invalid_file_content"
)
def _detect_mime_type(self, content: bytes) -> str:
"""Detect MIME type from file content"""
try:
mime = magic.Magic(mime=True)
return mime.from_buffer(content)
except Exception:
# Fallback to basic detection
if content.startswith(self.OFFICE_MAGIC_BYTES):
return "application/zip"
return "application/octet-stream"
def _validate_mime_type(self, mime_type: str, extension: str):
"""Validate MIME type matches extension"""
# Office Open XML files may be detected as ZIP
allowed_mimes = list(self.ALLOWED_MIME_TYPES.keys()) + ["application/zip", "application/octet-stream"]
if mime_type not in allowed_mimes:
raise ValidationError(
f"Invalid file type detected. Expected Office document, got: {mime_type}",
code="invalid_mime_type",
details={"detected_mime": mime_type}
)
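
Review note: `_validate_magic_bytes` only checks the `PK\x03\x04` prefix, and `_validate_mime_type` accepts `application/zip`, so any ZIP archive renamed to `.docx` passes. Below is a minimal sketch of a stricter check that opens the archive and looks for the parts an Office Open XML package normally contains. It assumes the conventional part names written by Office and most libraries. `looks_like_ooxml` and `REQUIRED_PARTS` are hypothetical names, not part of this commit.

```python
import io
import zipfile

# Main document part conventionally present in each supported package type.
REQUIRED_PARTS = {
    ".docx": "word/document.xml",
    ".xlsx": "xl/workbook.xml",
    ".pptx": "ppt/presentation.xml",
}


def looks_like_ooxml(content: bytes, extension: str) -> bool:
    """Return True if content is a ZIP holding the parts the extension implies."""
    required = REQUIRED_PARTS.get(extension.lower())
    if required is None:
        return False
    try:
        with zipfile.ZipFile(io.BytesIO(content)) as archive:
            names = set(archive.namelist())
    except zipfile.BadZipFile:
        # Starts like a ZIP, or not, but has no readable central directory.
        return False
    return "[Content_Types].xml" in names and required in names
```

Only the central directory is read here and nothing is decompressed, so the check stays cheap for large uploads. It could run after the magic-byte test and raise the same `invalid_file_content` error on failure.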
class LanguageValidator:
"""Validates language codes"""
SUPPORTED_LANGUAGES = {
# ISO 639-1 codes
"af", "sq", "am", "ar", "hy", "az", "eu", "be", "bn", "bs",
"bg", "ca", "ceb", "zh", "zh-CN", "zh-TW", "co", "hr", "cs",
"da", "nl", "en", "eo", "et", "fi", "fr", "fy", "gl", "ka",
"de", "el", "gu", "ht", "ha", "haw", "he", "hi", "hmn", "hu",
"is", "ig", "id", "ga", "it", "ja", "jv", "kn", "kk", "km",
"rw", "ko", "ku", "ky", "lo", "la", "lv", "lt", "lb", "mk",
"mg", "ms", "ml", "mt", "mi", "mr", "mn", "my", "ne", "no",
"ny", "or", "ps", "fa", "pl", "pt", "pa", "ro", "ru", "sm",
"gd", "sr", "st", "sn", "sd", "si", "sk", "sl", "so", "es",
"su", "sw", "sv", "tl", "tg", "ta", "tt", "te", "th", "tr",
"tk", "uk", "ur", "ug", "uz", "vi", "cy", "xh", "yi", "yo",
"zu", "auto"
}
LANGUAGE_NAMES = {
"en": "English", "es": "Spanish", "fr": "French", "de": "German",
"it": "Italian", "pt": "Portuguese", "ru": "Russian", "zh": "Chinese",
"zh-CN": "Chinese (Simplified)", "zh-TW": "Chinese (Traditional)",
"ja": "Japanese", "ko": "Korean", "ar": "Arabic", "hi": "Hindi",
"nl": "Dutch", "pl": "Polish", "tr": "Turkish", "sv": "Swedish",
"da": "Danish", "no": "Norwegian", "fi": "Finnish", "cs": "Czech",
"el": "Greek", "th": "Thai", "vi": "Vietnamese", "id": "Indonesian",
"uk": "Ukrainian", "ro": "Romanian", "hu": "Hungarian", "auto": "Auto-detect"
}
@classmethod
def validate(cls, language_code: str, field_name: str = "language") -> str:
"""Validate and normalize language code"""
if not language_code:
raise ValidationError(
f"{field_name} is required",
code="missing_language"
)
# Normalize
normalized = language_code.strip().lower()
# Handle common variations; "zh-cn"/"zh-tw" restore the mixed-case codes that lower() would otherwise break
if normalized in ["chinese", "cn", "zh-cn"]:
normalized = "zh-CN"
elif normalized in ["chinese-traditional", "tw", "zh-tw"]:
normalized = "zh-TW"
if normalized not in cls.SUPPORTED_LANGUAGES:
raise ValidationError(
f"Unsupported language code: '{language_code}'. See /languages for supported codes.",
code="unsupported_language",
details={"language": language_code}
)
return normalized
@classmethod
def get_language_name(cls, code: str) -> str:
"""Get human-readable language name"""
return cls.LANGUAGE_NAMES.get(code, code.upper())
class ProviderValidator:
"""Validates translation provider configuration"""
SUPPORTED_PROVIDERS = {"google", "ollama", "deepl", "libre", "openai", "webllm"}
@classmethod
def validate(cls, provider: str, **kwargs) -> dict:
"""Validate provider and its required configuration"""
if not provider:
raise ValidationError(
"Translation provider is required",
code="missing_provider"
)
normalized = provider.strip().lower()
if normalized not in cls.SUPPORTED_PROVIDERS:
raise ValidationError(
f"Unsupported provider: '{provider}'. Supported: {', '.join(cls.SUPPORTED_PROVIDERS)}",
code="unsupported_provider",
details={"provider": provider, "supported": list(cls.SUPPORTED_PROVIDERS)}
)
# Provider-specific validation
if normalized == "deepl":
if not kwargs.get("deepl_api_key"):
raise ValidationError(
"DeepL API key is required when using DeepL provider",
code="missing_deepl_key"
)
elif normalized == "openai":
if not kwargs.get("openai_api_key"):
raise ValidationError(
"OpenAI API key is required when using OpenAI provider",
code="missing_openai_key"
)
elif normalized == "ollama":
# Ollama doesn't require API key but may need model
model = kwargs.get("ollama_model", "")
if not model:
logger.warning("No Ollama model specified, will use default")
return {"provider": normalized, "validated": True}
class InputSanitizer:
"""Sanitizes user inputs to prevent injection attacks"""
@staticmethod
def sanitize_text(text: str, max_length: int = 10000) -> str:
"""Sanitize text input"""
if not text:
return ""
# Remove null bytes
text = text.replace('\x00', '')
# Limit length
if len(text) > max_length:
text = text[:max_length]
return text.strip()
@staticmethod
def sanitize_language_code(code: str) -> str:
"""Sanitize and normalize language code"""
if not code:
return "auto"
# Remove dangerous characters, keep only alphanumeric and hyphen
code = re.sub(r'[^a-zA-Z0-9\-]', '', code.strip())
# Limit length
if len(code) > 10:
code = code[:10]
return code.lower() if code else "auto"
@staticmethod
def sanitize_url(url: str) -> str:
"""Sanitize URL input"""
if not url:
return ""
url = url.strip()
# Basic URL validation
if not re.match(r'^https?://', url, re.IGNORECASE):
raise ValidationError(
"Invalid URL format. Must start with http:// or https://",
code="invalid_url"
)
# Remove trailing slashes
url = url.rstrip('/')
return url
@staticmethod
def sanitize_api_key(key: str) -> str:
"""Sanitize API key (just trim, no logging)"""
if not key:
return ""
return key.strip()
# Default validators
file_validator = FileValidator()


@@ -1,5 +0,0 @@
# Testing requirements
requests==2.31.0
pytest==7.4.3
pytest-asyncio==0.23.2
httpx==0.26.0


@@ -13,3 +13,8 @@ matplotlib==3.8.2
pandas==2.1.4
requests==2.31.0
ipykernel==6.27.1
openai>=1.0.0
# SaaS robustness dependencies
psutil==5.9.8
python-magic-bin==0.4.14 # Windows build; on Linux install python-magic instead


@@ -3,10 +3,12 @@ Translation Service Abstraction
Provides a unified interface for different translation providers
"""
from abc import ABC, abstractmethod
from typing import Optional, List
from typing import Optional, List, Dict
import requests
from deep_translator import GoogleTranslator, DeeplTranslator, LibreTranslator
from config import config
import concurrent.futures
import threading
class TranslationProvider(ABC):
@@ -16,84 +18,298 @@ class TranslationProvider(ABC):
def translate(self, text: str, target_language: str, source_language: str = 'auto') -> str:
"""Translate text from source to target language"""
pass
def translate_batch(self, texts: List[str], target_language: str, source_language: str = 'auto') -> List[str]:
"""Translate multiple texts at once - default implementation"""
return [self.translate(text, target_language, source_language) for text in texts]
class GoogleTranslationProvider(TranslationProvider):
"""Google Translate implementation"""
"""Google Translate implementation with batch support"""
def __init__(self):
self._local = threading.local()
def _get_translator(self, source_language: str, target_language: str) -> GoogleTranslator:
"""Get or create a translator instance for the current thread"""
key = f"{source_language}_{target_language}"
if not hasattr(self._local, 'translators'):
self._local.translators = {}
if key not in self._local.translators:
self._local.translators[key] = GoogleTranslator(source=source_language, target=target_language)
return self._local.translators[key]
def translate(self, text: str, target_language: str, source_language: str = 'auto') -> str:
if not text or not text.strip():
return text
try:
translator = self._get_translator(source_language, target_language)
return translator.translate(text)
except Exception as e:
print(f"Translation error: {e}")
return text
def translate_batch(self, texts: List[str], target_language: str, source_language: str = 'auto', batch_size: int = 50) -> List[str]:
"""
Translate multiple texts using batch processing for speed.
Uses deep_translator's batch capability when possible.
"""
if not texts:
return []
# Filter and track empty texts
results = [''] * len(texts)
non_empty_indices = []
non_empty_texts = []
for i, text in enumerate(texts):
if text and text.strip():
non_empty_indices.append(i)
non_empty_texts.append(text)
else:
results[i] = text if text else ''
if not non_empty_texts:
return results
try:
translator = GoogleTranslator(source=source_language, target=target_language)
return translator.translate(text)
# Process in batches
translated_texts = []
for i in range(0, len(non_empty_texts), batch_size):
batch = non_empty_texts[i:i + batch_size]
try:
# Use translate_batch if available
if hasattr(translator, 'translate_batch'):
batch_result = translator.translate_batch(batch)
else:
# Fallback: join with separator, translate, split
separator = "\n|||SPLIT|||\n"
combined = separator.join(batch)
translated_combined = translator.translate(combined)
if translated_combined:
batch_result = translated_combined.split("|||SPLIT|||")
# Clean up results
batch_result = [t.strip() for t in batch_result]
# If split didn't work correctly, fall back to individual
if len(batch_result) != len(batch):
batch_result = [translator.translate(t) for t in batch]
else:
batch_result = batch
translated_texts.extend(batch_result)
except Exception as e:
print(f"Batch translation error, falling back to individual: {e}")
for text in batch:
try:
translated_texts.append(translator.translate(text))
except Exception:
translated_texts.append(text)
# Map back to original positions
for idx, translated in zip(non_empty_indices, translated_texts):
results[idx] = translated if translated else texts[idx]
return results
except Exception as e:
print(f"Translation error: {e}")
return text
print(f"Batch translation failed: {e}")
# Fallback to individual translations
for idx, text in zip(non_empty_indices, non_empty_texts):
try:
results[idx] = GoogleTranslator(source=source_language, target=target_language).translate(text) or text
except Exception:
results[idx] = text
return results
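
Review note: the separator fallback in `GoogleTranslationProvider.translate_batch` relies on the translator leaving `|||SPLIT|||` untouched. The join, split and fallback steps can be sketched in isolation as below, with a generic `translate_fn` standing in for `translator.translate`. `translate_joined` is a hypothetical helper, not part of this commit. Unlike the diff, it retries item by item when the combined call returns nothing.

```python
from typing import Callable, List

SEPARATOR = "\n|||SPLIT|||\n"
MARKER = "|||SPLIT|||"


def translate_joined(batch: List[str], translate_fn: Callable[[str], str]) -> List[str]:
    """Translate a batch in one call, falling back to per-item calls on a bad split."""
    if not batch:
        return []
    combined = translate_fn(SEPARATOR.join(batch))
    parts = [p.strip() for p in combined.split(MARKER)] if combined else []
    if len(parts) != len(batch):
        # The translator altered or dropped a marker, so positions cannot be trusted.
        parts = [translate_fn(text) for text in batch]
    # Keep the original wherever the translator returned nothing.
    return [part if part else original for part, original in zip(parts, batch)]
```

Two caveats apply to the diff as well as to this sketch. An input that itself contains the marker always triggers the slower per-item path, and the `strip()` call drops leading and trailing whitespace from each segment.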
class DeepLTranslationProvider(TranslationProvider):
"""DeepL Translate implementation"""
"""DeepL Translate implementation with batch support"""
def __init__(self, api_key: str):
self.api_key = api_key
self._translator_cache = {}
def _get_translator(self, source_language: str, target_language: str) -> DeeplTranslator:
key = f"{source_language}_{target_language}"
if key not in self._translator_cache:
self._translator_cache[key] = DeeplTranslator(api_key=self.api_key, source=source_language, target=target_language)
return self._translator_cache[key]
def translate(self, text: str, target_language: str, source_language: str = 'auto') -> str:
if not text or not text.strip():
return text
try:
translator = DeeplTranslator(api_key=self.api_key, source=source_language, target=target_language)
translator = self._get_translator(source_language, target_language)
return translator.translate(text)
except Exception as e:
print(f"Translation error: {e}")
return text
def translate_batch(self, texts: List[str], target_language: str, source_language: str = 'auto') -> List[str]:
"""Batch translate using DeepL"""
if not texts:
return []
results = [''] * len(texts)
non_empty = [(i, t) for i, t in enumerate(texts) if t and t.strip()]
if not non_empty:
return [t if t else '' for t in texts]
try:
translator = self._get_translator(source_language, target_language)
non_empty_texts = [t for _, t in non_empty]
if hasattr(translator, 'translate_batch'):
translated = translator.translate_batch(non_empty_texts)
else:
translated = [translator.translate(t) for t in non_empty_texts]
for (idx, _), trans in zip(non_empty, translated):
results[idx] = trans if trans else texts[idx]
# Fill empty positions
for i, text in enumerate(texts):
if not text or not text.strip():
results[i] = text if text else ''
return results
except Exception as e:
print(f"DeepL batch error: {e}")
return [self.translate(t, target_language, source_language) for t in texts]
class LibreTranslationProvider(TranslationProvider):
"""LibreTranslate implementation"""
"""LibreTranslate implementation with batch support"""
def __init__(self, custom_url: str = "https://libretranslate.com"):
self.custom_url = custom_url
self._translator_cache = {}
def _get_translator(self, source_language: str, target_language: str) -> LibreTranslator:
key = f"{source_language}_{target_language}"
if key not in self._translator_cache:
self._translator_cache[key] = LibreTranslator(source=source_language, target=target_language, custom_url=self.custom_url)
return self._translator_cache[key]
def translate(self, text: str, target_language: str, source_language: str = 'auto') -> str:
if not text or not text.strip():
return text
try:
# LibreTranslator doesn't need API key for self-hosted instances
translator = LibreTranslator(source=source_language, target=target_language, custom_url="http://localhost:5000")
translator = self._get_translator(source_language, target_language)
return translator.translate(text)
except Exception as e:
# Fail silently and return original text
print(f"LibreTranslate error: {e}")
return text
def translate_batch(self, texts: List[str], target_language: str, source_language: str = 'auto') -> List[str]:
"""Batch translate using LibreTranslate"""
if not texts:
return []
results = [''] * len(texts)
non_empty = [(i, t) for i, t in enumerate(texts) if t and t.strip()]
if not non_empty:
return [t if t else '' for t in texts]
try:
translator = self._get_translator(source_language, target_language)
for idx, text in non_empty:
try:
results[idx] = translator.translate(text) or text
except Exception:
results[idx] = text
for i, text in enumerate(texts):
if not text or not text.strip():
results[i] = text if text else ''
return results
except Exception as e:
print(f"LibreTranslate batch error: {e}")
return texts
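
Review note: all three providers repeat the same bookkeeping, which is to skip blank inputs, translate the rest, then write the results back to their original positions. Below is a minimal sketch of that pattern as one shared helper, with `translate_many` standing in for whichever batch call a provider uses. `translate_preserving_blanks` is a hypothetical name, not part of this commit. It also guards against a length mismatch, which the `zip` calls in the diff would silently truncate.

```python
from typing import Callable, List


def translate_preserving_blanks(
    texts: List[str], translate_many: Callable[[List[str]], List[str]]
) -> List[str]:
    """Translate non-blank items in one call and put results back in place."""
    # Start from the originals so blank and whitespace-only items pass through.
    results = [text if text else "" for text in texts]
    indices = [i for i, text in enumerate(texts) if text and text.strip()]
    if not indices:
        return results
    translated = translate_many([texts[i] for i in indices])
    if len(translated) != len(indices):
        # Misaligned output: return the originals rather than shifting results.
        return results
    for i, value in zip(indices, translated):
        results[i] = value if value else texts[i]
    return results
```

Each provider's `translate_batch` would then reduce to passing its own batch callable, which keeps the empty-text handling consistent across Google, DeepL and LibreTranslate.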
class OllamaTranslationProvider(TranslationProvider):
"""Ollama LLM translation implementation"""
def __init__(self, base_url: str = "http://localhost:11434", model: str = "llama3", vision_model: str = "llava"):
def __init__(self, base_url: str = "http://localhost:11434", model: str = "llama3", vision_model: str = "llava", system_prompt: str = ""):
self.base_url = base_url.rstrip('/')
self.model = model
self.vision_model = vision_model
self.model = model.strip() # Remove any leading/trailing whitespace
self.vision_model = vision_model.strip()
self.custom_system_prompt = system_prompt # Custom context, glossary, instructions
def translate(self, text: str, target_language: str, source_language: str = 'auto') -> str:
if not text or not text.strip():
return text
# Skip very short text and digit-only strings
if len(text.strip()) < 2 or text.strip().isdigit():
return text
try:
prompt = f"Translate the following text to {target_language}. Return ONLY the translation, nothing else:\n\n{text}"
# Build system prompt with custom context if provided
base_prompt = f"""You are a professional translator. Your ONLY task is to translate text to {target_language}.
CRITICAL RULES:
1. Output ONLY the translated text - no explanations, no comments, no notes
2. Preserve the exact formatting (line breaks, spacing, punctuation)
3. Do NOT add any prefixes like "Here's the translation:" or "Translation:"
4. Do NOT refuse to translate or ask clarifying questions
5. If the text is already in {target_language}, return it unchanged
6. Translate everything literally and accurately
7. NEVER provide comments, opinions, or explanations - you are JUST a translator
8. If you have any doubt about the translation, return the original text unchanged
9. Do not interpret or analyze the content - simply translate word by word
10. Your response must contain ONLY the translated text, nothing else"""
if self.custom_system_prompt:
system_content = f"""{base_prompt}
ADDITIONAL CONTEXT AND INSTRUCTIONS:
{self.custom_system_prompt}"""
else:
system_content = base_prompt
# Use /api/chat endpoint (more compatible with all models)
response = requests.post(
f"{self.base_url}/api/generate",
f"{self.base_url}/api/chat",
json={
"model": self.model,
"prompt": prompt,
"stream": False
"messages": [
{
"role": "system",
"content": system_content
},
{
"role": "user",
"content": text
}
],
"stream": False,
"options": {
"temperature": 0.3,
"num_predict": 500
}
},
timeout=30
timeout=120 # 2 minutes timeout
)
response.raise_for_status()
result = response.json()
return result.get("response", text).strip()
translated = result.get("message", {}).get("content", "").strip()
return translated if translated else text
except requests.exceptions.ConnectionError:
print(f"Ollama error: Cannot connect to {self.base_url}. Is Ollama running?")
return text
except requests.exceptions.Timeout:
print("Ollama error: Request timeout after 120s")
return text
except Exception as e:
print(f"Ollama translation error: {e}")
return text
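Review note: moving from `/api/generate` to `/api/chat` changes both the request body (a `messages` list instead of a single `prompt`) and where the reply text lives (`message.content` instead of `response`). The sketch below shows the two shapes without making a network call. The helper names `build_chat_payload` and `extract_chat_content` are hypothetical and are not part of this commit.

```python
from typing import Any, Dict


def build_chat_payload(model: str, system_content: str, text: str) -> Dict[str, Any]:
    """Build a non-streaming /api/chat request body (hypothetical helper)."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system_content},
            {"role": "user", "content": text},
        ],
        "stream": False,
        "options": {"temperature": 0.3, "num_predict": 500},
    }


def extract_chat_content(result: Dict[str, Any], fallback: str) -> str:
    """Return message.content from a chat reply, or the fallback if it is empty."""
    message = result.get("message") or {}
    translated = (message.get("content") or "").strip()
    return translated if translated else fallback
```

Keeping the parsing in one place makes the "return the original text when the reply is empty" rule easy to check in isolation.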
@@ -107,21 +323,25 @@ class OllamaTranslationProvider(TranslationProvider):
with open(image_path, 'rb') as img_file:
image_data = base64.b64encode(img_file.read()).decode('utf-8')
prompt = f"Extract all text from this image and translate it to {target_language}. Return ONLY the translated text, preserving the structure and formatting."
# Use /api/chat for vision models too
response = requests.post(
f"{self.base_url}/api/generate",
f"{self.base_url}/api/chat",
json={
"model": self.vision_model,
"prompt": prompt,
"images": [image_data],
"messages": [
{
"role": "user",
"content": f"Extract all text from this image and translate it to {target_language}. Return ONLY the translated text, preserving the structure and formatting.",
"images": [image_data]
}
],
"stream": False
},
timeout=60
)
response.raise_for_status()
result = response.json()
return result.get("response", "").strip()
return result.get("message", {}).get("content", "").strip()
except Exception as e:
print(f"Ollama vision translation error: {e}")
return ""
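Review note: in the chat format the base64 image travels inside the user message, in an `images` list, rather than at the top level of the request. A minimal sketch of that body, assuming the image bytes are already in memory; `build_vision_payload` is a hypothetical name used only for illustration.

```python
import base64
from typing import Any, Dict


def build_vision_payload(model: str, image_bytes: bytes, target_language: str) -> Dict[str, Any]:
    """Build a non-streaming /api/chat body carrying one image (hypothetical helper)."""
    image_data = base64.b64encode(image_bytes).decode("utf-8")
    instruction = (
        f"Extract all text from this image and translate it to {target_language}. "
        "Return ONLY the translated text, preserving the structure and formatting."
    )
    return {
        "model": model,
        "messages": [
            {"role": "user", "content": instruction, "images": [image_data]},
        ],
        "stream": False,
    }
```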
@@ -139,6 +359,119 @@ class OllamaTranslationProvider(TranslationProvider):
return []
class WebLLMTranslationProvider(TranslationProvider):
"""WebLLM browser-based translation (client-side processing)"""
def translate(self, text: str, target_language: str, source_language: str = 'auto') -> str:
# WebLLM translation happens client-side in the browser
# This is just a placeholder - actual translation is done by JavaScript
# For server-side, we'll just pass through for now
return text
class OpenAITranslationProvider(TranslationProvider):
"""OpenAI GPT translation implementation with vision support"""
def __init__(self, api_key: str, model: str = "gpt-4o-mini", system_prompt: str = ""):
self.api_key = api_key
self.model = model
self.custom_system_prompt = system_prompt
def translate(self, text: str, target_language: str, source_language: str = 'auto') -> str:
if not text or not text.strip():
return text
# Skip very short text or numbers only
if len(text.strip()) < 2 or text.strip().isdigit():
return text
try:
import openai
client = openai.OpenAI(api_key=self.api_key)
# Build system prompt with custom context if provided
base_prompt = f"""You are a professional translator. Your ONLY task is to translate text to {target_language}.
CRITICAL RULES:
1. Output ONLY the translated text - no explanations, no comments, no notes
2. Preserve the exact formatting (line breaks, spacing, punctuation)
3. Do NOT add any prefixes like "Here's the translation:" or "Translation:"
4. Do NOT refuse to translate or ask clarifying questions
5. If the text is already in {target_language}, return it unchanged
6. Translate everything literally and accurately
7. NEVER provide comments, opinions, or explanations - you are JUST a translator
8. If you have any doubt about the translation, return the original text unchanged
9. Do not interpret or analyze the content - simply translate word by word
10. Your response must contain ONLY the translated text, nothing else"""
if self.custom_system_prompt:
system_content = f"""{base_prompt}
ADDITIONAL CONTEXT AND INSTRUCTIONS:
{self.custom_system_prompt}"""
else:
system_content = base_prompt
response = client.chat.completions.create(
model=self.model,
messages=[
{"role": "system", "content": system_content},
{"role": "user", "content": text}
],
temperature=0.3,
max_tokens=500
)
translated = response.choices[0].message.content.strip()
return translated if translated else text
except Exception as e:
print(f"OpenAI translation error: {e}")
return text
def translate_image(self, image_path: str, target_language: str) -> str:
"""Translate text within an image using OpenAI vision model"""
import base64
try:
import openai
client = openai.OpenAI(api_key=self.api_key)
# Read and encode image
with open(image_path, 'rb') as img_file:
image_data = base64.b64encode(img_file.read()).decode('utf-8')
# Determine image type from extension
ext = image_path.lower().split('.')[-1]
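media_type = {"jpg": "image/jpeg", "jpeg": "image/jpeg", "png": "image/png", "gif": "image/gif", "webp": "image/webp"}.get(ext, "image/png")  # "jpg" is an extension; the registered MIME type is image/jpeg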
response = client.chat.completions.create(
model=self.model, # gpt-4o and gpt-4o-mini support vision
messages=[
{
"role": "user",
"content": [
{
"type": "text",
"text": f"Extract all text from this image and translate it to {target_language}. Return ONLY the translated text, preserving the structure and formatting."
},
{
"type": "image_url",
"image_url": {
"url": f"data:{media_type};base64,{image_data}"
}
}
]
}
],
max_tokens=1000
)
return response.choices[0].message.content.strip()
except Exception as e:
print(f"OpenAI vision translation error: {e}")
return ""
class TranslationService:
"""Main translation service that delegates to the configured provider"""
@@ -148,6 +481,7 @@ class TranslationService:
else:
# Auto-select provider based on configuration
self.provider = self._get_default_provider()
self.translate_images = False # Flag to enable image translation
def _get_default_provider(self) -> TranslationProvider:
"""Get the default translation provider from configuration"""
@@ -172,9 +506,31 @@ class TranslationService:
return self.provider.translate(text, target_language, source_language)
def translate_image(self, image_path: str, target_language: str) -> str:
"""
Translate text in an image using a vision model (Ollama or OpenAI)
Args:
image_path: Path to image file
target_language: Target language code
Returns:
Translated text from image
"""
if not self.translate_images:
return ""
# Ollama and OpenAI support image translation
if isinstance(self.provider, (OllamaTranslationProvider, OpenAITranslationProvider)):
return self.provider.translate_image(image_path, target_language)
return ""
def translate_batch(self, texts: list[str], target_language: str, source_language: str = 'auto') -> list[str]:
"""
Translate multiple text strings
Translate multiple text strings efficiently using batch processing.
Args:
texts: List of texts to translate
@@ -184,6 +540,14 @@ class TranslationService:
Returns:
List of translated texts
"""
if not texts:
return []
# Use provider's batch method if available
if hasattr(self.provider, 'translate_batch'):
return self.provider.translate_batch(texts, target_language, source_language)
# Fallback to individual translations
return [self.translate_text(text, target_language, source_language) for text in texts]
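Review note: the batch entry point relies on duck typing, so providers that define `translate_batch` get one call for the whole list and all others are called once per text. The sketch below reproduces that dispatch with two stand-in providers. `PerItemProvider`, `BatchProvider`, and the free function `translate_batch` are hypothetical and exist only to illustrate the pattern.

```python
from typing import List


class PerItemProvider:
    """Stand-in provider with no batch support."""

    def translate(self, text: str, target_language: str, source_language: str = "auto") -> str:
        return f"[{target_language}] {text}"


class BatchProvider(PerItemProvider):
    """Stand-in provider that handles a whole list in one call."""

    def __init__(self) -> None:
        self.batch_calls = 0

    def translate_batch(self, texts: List[str], target_language: str,
                        source_language: str = "auto") -> List[str]:
        self.batch_calls += 1
        return [self.translate(t, target_language, source_language) for t in texts]


def translate_batch(provider, texts: List[str], target_language: str,
                    source_language: str = "auto") -> List[str]:
    """Use the provider's batch method when present, else translate one by one."""
    if not texts:
        return []
    if hasattr(provider, "translate_batch"):
        return provider.translate_batch(texts, target_language, source_language)
    return [provider.translate(t, target_language, source_language) for t in texts]
```

Callers depend on the result having the same length and order as the input, so a provider's batch method should preserve both.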


@@ -309,7 +309,7 @@
</div>
<div class="form-group">
<label for="ollama-model">Ollama Model</label>
<input type="text" id="ollama-model" value="llama3" placeholder="llama3, mistral, etc.">
<input type="text" id="ollama-model" value="llama3.2" placeholder="llama3.2, mistral, etc.">
</div>
</div>
<button onclick="listOllamaModels()" class="btn-secondary">List Available Models</button>
@@ -318,6 +318,39 @@
<div id="models-result"></div>
</div>
<!-- System Prompt for LLM Translation -->
<div class="card">
<h2>Translation Context (Ollama / WebLLM)</h2>
<p style="font-size: 13px; color: #718096; margin-bottom: 15px;">
Provide context, technical glossary, or specific instructions to improve translation quality.
</p>
<div class="form-group">
<label for="system-prompt">System Prompt / Instructions</label>
<textarea id="system-prompt" rows="4" style="width: 100%; padding: 10px 14px; border: 1px solid #cbd5e0; border-radius: 6px; font-size: 14px; font-family: inherit; resize: vertical;" placeholder="Example: You are translating HVAC technical documents. Use these terms:
- Batterie (FR) = Coil (EN)
- Groupe froid (FR) = Chiller (EN)
- CTA (FR) = AHU (EN)"></textarea>
</div>
<div class="form-group">
<label for="glossary">Technical Glossary (one per line: source=target)</label>
<textarea id="glossary" rows="5" style="width: 100%; padding: 10px 14px; border: 1px solid #cbd5e0; border-radius: 6px; font-size: 13px; font-family: monospace; resize: vertical;" placeholder="batterie=coil
groupe froid=chiller
CTA=AHU
échangeur=heat exchanger
vanne 3 voies=3-way valve"></textarea>
</div>
<div style="display: flex; gap: 10px; flex-wrap: wrap;">
<button onclick="loadPreset('hvac')" class="btn-secondary" style="font-size: 12px;">HVAC Preset</button>
<button onclick="loadPreset('it')" class="btn-secondary" style="font-size: 12px;">IT Preset</button>
<button onclick="loadPreset('legal')" class="btn-secondary" style="font-size: 12px;">Legal Preset</button>
<button onclick="loadPreset('medical')" class="btn-secondary" style="font-size: 12px;">Medical Preset</button>
<button onclick="clearPrompt()" class="btn-secondary" style="font-size: 12px; background: #dc2626;">Clear</button>
</div>
</div>
<!-- Document translation -->
<div class="card">
<h2>Document Translation</h2>
@@ -335,6 +368,8 @@
<div class="form-group">
<label for="target-lang">Target Language</label>
<select id="target-lang">
<option value="en">English (en)</option>
<option value="fa">Persian / Farsi (fa)</option>
<option value="es">Spanish (es)</option>
<option value="fr">French (fr)</option>
<option value="de">German (de)</option>
@@ -350,9 +385,10 @@
<div class="form-group">
<label for="provider">Translation Service</label>
<select id="provider" onchange="toggleImageTranslation()">
<select id="provider" onchange="toggleProviderOptions()">
<option value="google">Google Translate (Default)</option>
<option value="ollama">Ollama LLM</option>
<option value="ollama">Ollama LLM (Local Server)</option>
<option value="webllm">WebLLM (Browser - WebGPU)</option>
<option value="deepl">DeepL</option>
<option value="libre">LibreTranslate</option>
</select>
@@ -362,10 +398,32 @@
<div class="form-group" id="image-translation-option" style="display: none;">
<label style="display: flex; align-items: center; cursor: pointer;">
<input type="checkbox" id="translate-images" style="width: auto; margin-right: 10px;">
<span>Translate images with Ollama Vision (requires llava model)</span>
<span>Translate images with vision (use multimodal models: gemma3, qwen3-vl, llava, etc.)</span>
</label>
</div>
<div class="form-group" id="webllm-options" style="display: none; padding: 12px; background: #e0f2ff; border-radius: 6px; border-left: 4px solid #2563eb;">
<p style="margin: 0 0 10px 0; font-size: 13px; color: #1e40af;">
<strong>WebLLM Mode:</strong> Translation runs entirely in your browser using WebGPU. First use downloads the model.
</p>
<div style="display: grid; grid-template-columns: 1fr auto; gap: 10px; align-items: end;">
<div>
<label for="webllm-model" style="font-size: 12px; color: #4a5568; margin-bottom: 4px;">Select Model:</label>
<select id="webllm-model" style="width: 100%; padding: 6px; font-size: 13px; border: 1px solid #cbd5e0; border-radius: 4px;">
<option value="Llama-3.2-3B-Instruct-q4f32_1-MLC">Llama 3.2 3B (~2GB) - Recommended</option>
<option value="Llama-3.1-8B-Instruct-q4f32_1-MLC">Llama 3.1 8B (~4.5GB)</option>
<option value="Phi-3.5-mini-instruct-q4f16_1-MLC">Phi 3.5 Mini (~2.5GB)</option>
<option value="Mistral-7B-Instruct-v0.3-q4f16_1-MLC">Mistral 7B (~4.5GB)</option>
<option value="gemma-2-2b-it-q4f16_1-MLC">Gemma 2 2B (~1.5GB)</option>
</select>
</div>
<button onclick="clearWebLLMCache()" style="background: #dc2626; padding: 6px 12px; font-size: 13px; white-space: nowrap;">
Clear Cache
</button>
</div>
<div id="webllm-status" style="margin-top: 10px; font-size: 12px; color: #4a5568;"></div>
</div>
<button onclick="translateFile()">Translate Document</button>
<div id="loading" class="loading">
@@ -392,19 +450,224 @@
<script>
const API_BASE = 'http://localhost:8000';
// Toggle image translation option based on provider
function toggleImageTranslation() {
const provider = document.getElementById('provider').value;
const imageOption = document.getElementById('image-translation-option');
// Clear WebLLM cache
async function clearWebLLMCache() {
if (!confirm('This will delete all downloaded WebLLM models from your browser cache. Continue?')) {
return;
}
if (provider === 'ollama') {
imageOption.style.display = 'block';
} else {
imageOption.style.display = 'none';
document.getElementById('translate-images').checked = false;
try {
// Clear IndexedDB cache used by WebLLM
const databases = await indexedDB.databases();
for (const db of databases) {
if (db.name && (db.name.includes('webllm') || db.name.includes('mlc'))) {
indexedDB.deleteDatabase(db.name);
}
}
// Clear Cache API
if ('caches' in window) {
const cacheNames = await caches.keys();
for (const name of cacheNames) {
if (name.includes('webllm') || name.includes('mlc')) {
await caches.delete(name);
}
}
}
alert('✅ WebLLM cache cleared successfully! Refresh the page.');
} catch (error) {
alert('❌ Error clearing cache: ' + error.message);
}
}
// Toggle provider options based on selection
// Preset templates for different domains
const presets = {
hvac: {
prompt: `You are translating HVAC (Heating, Ventilation, Air Conditioning) technical documents.
Use precise technical terminology. Maintain consistency with industry standards.
Keep unit measurements (kW, m³/h, Pa) unchanged.
Translate component names according to the glossary provided.`,
glossary: `batterie=coil
groupe froid=chiller
CTA=AHU (Air Handling Unit)
échangeur=heat exchanger
vanne 3 voies=3-way valve
détendeur=expansion valve
compresseur=compressor
évaporateur=evaporator
condenseur=condenser
fluide frigorigène=refrigerant
débit d'air=airflow
pression statique=static pressure
récupérateur=heat recovery unit
ventilo-convecteur=fan coil unit
gaine=duct
diffuseur=diffuser
registre=damper`
},
it: {
prompt: `You are translating IT and software documentation.
Keep technical terms, code snippets, and variable names unchanged.
Translate UI labels and user-facing text appropriately.
Maintain formatting markers like **bold** and \`code\`.`,
glossary: `serveur=server
base de données=database
requête=query
sauvegarde=backup
mise à jour=update
télécharger=download
téléverser=upload
mot de passe=password
identifiant=username
pare-feu=firewall
réseau=network
stockage=storage
conteneur=container
déploiement=deployment`
},
legal: {
prompt: `You are translating legal documents.
Use formal legal terminology. Be precise and unambiguous.
Maintain references to laws, articles, and clauses in their original form.
Use standard legal phrases for the target language.`,
glossary: `contrat=contract
clause=clause
partie=party
signataire=signatory
résiliation=termination
préavis=notice period
dommages et intérêts=damages
responsabilité=liability
juridiction=jurisdiction
arbitrage=arbitration
avenant=amendment
ayant droit=beneficiary`
},
medical: {
prompt: `You are translating medical and healthcare documents.
Use standard medical terminology (Latin/Greek roots when appropriate).
Keep drug names, dosages, and medical codes unchanged.
Be precise with anatomical terms and procedures.`,
glossary: `patient=patient
ordonnance=prescription
posologie=dosage
effet secondaire=side effect
contre-indication=contraindication
diagnostic=diagnosis
symptôme=symptom
traitement=treatment
chirurgie=surgery
anesthésie=anesthesia
perfusion=infusion
prélèvement=sample collection`
}
};
function loadPreset(presetName) {
const preset = presets[presetName];
if (preset) {
document.getElementById('system-prompt').value = preset.prompt;
document.getElementById('glossary').value = preset.glossary;
}
}
function clearPrompt() {
document.getElementById('system-prompt').value = '';
document.getElementById('glossary').value = '';
}
function getFullSystemPrompt() {
let prompt = document.getElementById('system-prompt').value || '';
const glossary = document.getElementById('glossary').value || '';
if (glossary.trim()) {
prompt += '\n\nGLOSSARY (use these exact translations):\n' + glossary;
}
return prompt;
}
function toggleProviderOptions() {
const provider = document.getElementById('provider').value;
const imageOption = document.getElementById('image-translation-option');
const webllmOptions = document.getElementById('webllm-options');
// Hide all options first
imageOption.style.display = 'none';
webllmOptions.style.display = 'none';
document.getElementById('translate-images').checked = false;
if (provider === 'ollama') {
imageOption.style.display = 'block';
} else if (provider === 'webllm') {
webllmOptions.style.display = 'block';
}
}
// WebLLM engine instance
let webllmEngine = null;
let webllmReady = false;
// Initialize WebLLM
async function initWebLLM(modelId) {
const statusDiv = document.getElementById('webllm-status');
statusDiv.innerHTML = '⏳ Loading WebLLM...';
try {
// Dynamically import WebLLM
const webllm = await import('https://esm.run/@mlc-ai/web-llm');
statusDiv.innerHTML = '⏳ Downloading model (this may take a while on first use)...';
webllmEngine = await webllm.CreateMLCEngine(modelId, {
initProgressCallback: (progress) => {
statusDiv.innerHTML = `${progress.text}`;
}
});
webllmReady = true;
statusDiv.innerHTML = '✅ Model loaded and ready!';
return true;
} catch (error) {
statusDiv.innerHTML = `❌ Error: ${error.message}`;
console.error('WebLLM init error:', error);
return false;
}
}
// Translate text with WebLLM
async function translateWithWebLLM(text, targetLang) {
if (!webllmEngine) return text;
try {
// Build system prompt with custom context and glossary
let systemPrompt = `You are a translator. Translate the user's text to ${targetLang}. Return ONLY the translation, nothing else.`;
const customPrompt = getFullSystemPrompt();
if (customPrompt.trim()) {
systemPrompt = `You are a translator. Translate the user's text to ${targetLang}. Return ONLY the translation, nothing else.
ADDITIONAL CONTEXT AND INSTRUCTIONS:
${customPrompt}`;
}
const response = await webllmEngine.chat.completions.create({
messages: [
{ role: "system", content: systemPrompt },
{ role: "user", content: text }
],
temperature: 0.3,
max_tokens: 500
});
return response.choices[0].message.content.trim();
} catch (error) {
console.error('WebLLM translation error:', error);
return text;
}
}
// List Ollama models
async function listOllamaModels() {
const url = document.getElementById('ollama-url').value;
@@ -493,24 +756,66 @@
return;
}
// Get Ollama model from configuration field (used for both text and vision)
const ollamaModel = document.getElementById('ollama-model').value || 'llama3.2';
// Get custom system prompt with glossary
const systemPrompt = getFullSystemPrompt();
const formData = new FormData();
formData.append('file', fileInput.files[0]);
formData.append('target_language', targetLang);
formData.append('provider', provider);
formData.append('translate_images', translateImages);
formData.append('ollama_model', ollamaModel);
formData.append('system_prompt', systemPrompt);
loadingDiv.classList.add('active');
progressContainer.classList.add('active');
resultDiv.innerHTML = '';
// Simulate progress (since we don't have real progress from backend)
// Better progress simulation with timeout protection
let progress = 0;
let progressSpeed = 8; // Start at 8% increments
const progressInterval = setInterval(() => {
progress += Math.random() * 15;
if (progress > 90) progress = 90;
progressBar.style.width = progress + '%';
progressText.textContent = `Processing: ${Math.round(progress)}%`;
}, 500);
if (progress < 30) {
progress += progressSpeed;
} else if (progress < 60) {
progressSpeed = 4; // Slower
progress += progressSpeed;
} else if (progress < 85) {
progressSpeed = 2; // Even slower
progress += progressSpeed;
} else if (progress < 95) {
progressSpeed = 0.5; // Very slow near the end
progress += progressSpeed;
}
progressBar.style.width = Math.min(progress, 98) + '%';
progressText.textContent = `Processing: ${Math.round(Math.min(progress, 98))}%`;
}, 800);
// Safety timeout: if the request takes more than 5 minutes, show an error
const safetyTimeout = setTimeout(() => {
clearInterval(progressInterval);
loadingDiv.classList.remove('active');
progressContainer.classList.remove('active');
progressBar.style.width = '0%';
progressText.textContent = '';
resultDiv.innerHTML = `
<div class="result error">
<h3>Request timeout</h3>
<p>Translation is taking longer than expected. This might be due to:</p>
<ul>
<li>Large file size</li>
<li>Ollama model not responding (check if Ollama is running)</li>
<li>Network issues with translation service</li>
</ul>
<p>Please try again or use a different provider.</p>
</div>
`;
}, 300000); // 5 minutes
try {
const response = await fetch(`${API_BASE}/translate`, {
@@ -519,6 +824,7 @@
});
clearInterval(progressInterval);
clearTimeout(safetyTimeout);
progressBar.style.width = '100%';
progressText.textContent = 'Complete: 100%';
@@ -557,6 +863,7 @@
}
} catch (error) {
clearInterval(progressInterval);
clearTimeout(safetyTimeout);
loadingDiv.classList.remove('active');
progressContainer.classList.remove('active');
progressBar.style.width = '0%';
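Review note: the form sends the prompt and the glossary as a single `system_prompt` field, with glossary lines in `source=target` form. The sketch below mirrors that composition in Python and shows one way a server could parse the glossary lines if it ever needed them as structured data. Both function names are hypothetical; nothing in this commit parses the glossary on the backend.

```python
from typing import Dict


def compose_system_prompt(prompt: str, glossary: str) -> str:
    """Append the glossary block to the prompt, as the page's JavaScript does."""
    prompt = prompt or ""
    if glossary and glossary.strip():
        prompt += "\n\nGLOSSARY (use these exact translations):\n" + glossary
    return prompt


def parse_glossary(glossary: str) -> Dict[str, str]:
    """Parse 'source=target' lines, skipping blank or malformed ones."""
    entries: Dict[str, str] = {}
    for line in glossary.splitlines():
        # Split on the first '=' only, so targets may contain '=' themselves.
        source, sep, target = line.partition("=")
        if sep and source.strip() and target.strip():
            entries[source.strip()] = target.strip()
    return entries
```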

249
static/webllm.html Normal file

@@ -0,0 +1,249 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>WebLLM Translation Demo</title>
<script type="module">
import { CreateMLCEngine } from "https://esm.run/@mlc-ai/web-llm";
let engine = null;
const statusDiv = document.getElementById('status');
const outputDiv = document.getElementById('output');
let currentModel = null;
async function initEngine() {
const modelSelect = document.getElementById('model-select');
const selectedModel = modelSelect.value;
// If already loaded and same model, skip
if (engine && currentModel === selectedModel) {
statusDiv.textContent = "✅ WebLLM engine already ready!";
return;
}
// Clear previous engine
if (engine) {
engine = null;
}
statusDiv.textContent = `Initializing ${selectedModel} (first time: downloading model)...`;
document.getElementById('translate-btn').disabled = true;
try {
engine = await CreateMLCEngine(selectedModel, {
initProgressCallback: (progress) => {
statusDiv.textContent = `Loading: ${progress.text}`;
}
});
currentModel = selectedModel;
statusDiv.textContent = `${selectedModel} ready!`;
document.getElementById('translate-btn').disabled = false;
} catch (error) {
statusDiv.textContent = `❌ Error: ${error.message}`;
}
}
async function clearCache() {
if (!confirm('This will delete all downloaded WebLLM models (~2-5GB). Continue?')) {
return;
}
try {
const databases = await indexedDB.databases();
for (const db of databases) {
if (db.name && (db.name.includes('webllm') || db.name.includes('mlc'))) {
indexedDB.deleteDatabase(db.name);
}
}
if ('caches' in window) {
const cacheNames = await caches.keys();
for (const name of cacheNames) {
if (name.includes('webllm') || name.includes('mlc')) {
await caches.delete(name);
}
}
}
alert('✅ Cache cleared! Refresh the page.');
location.reload();
} catch (error) {
alert('❌ Error: ' + error.message);
}
}
async function translateText() {
const inputText = document.getElementById('input-text').value;
const targetLang = document.getElementById('target-lang').value;
if (!inputText) {
alert('Please enter text to translate');
return;
}
if (!engine) {
alert('Engine not ready. Please wait for initialization.');
return;
}
statusDiv.textContent = "Translating...";
outputDiv.textContent = "";
const prompt = `Translate the following text to ${targetLang}. Return ONLY the translation:\n\n${inputText}`;
try {
const reply = await engine.chat.completions.create({
messages: [{ role: "user", content: prompt }],
temperature: 0.3,
max_tokens: 1000,
});
const translation = reply.choices[0].message.content;
outputDiv.textContent = translation;
statusDiv.textContent = "✅ Translation complete!";
} catch (error) {
statusDiv.textContent = `❌ Translation error: ${error.message}`;
}
}
// Show the initial status on page load
window.addEventListener('DOMContentLoaded', () => {
// Don't auto-init, let user choose model first
statusDiv.textContent = "Select a model and click 'Load Model' to start.";
});
window.translateText = translateText;
window.initEngine = initEngine;
window.clearCache = clearCache;
</script>
<style>
body {
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;
max-width: 800px;
margin: 50px auto;
padding: 20px;
background: #f5f7fa;
}
.container {
background: white;
padding: 30px;
border-radius: 8px;
box-shadow: 0 1px 3px rgba(0,0,0,0.1);
}
h1 {
color: #1a202c;
margin-bottom: 10px;
}
.info {
background: #e0f2ff;
padding: 12px;
border-radius: 6px;
margin-bottom: 20px;
font-size: 14px;
color: #1e40af;
}
textarea {
width: 100%;
padding: 12px;
border: 1px solid #cbd5e0;
border-radius: 6px;
font-size: 14px;
min-height: 150px;
margin-bottom: 15px;
}
select {
width: 100%;
padding: 10px;
border: 1px solid #cbd5e0;
border-radius: 6px;
font-size: 14px;
margin-bottom: 15px;
}
button {
background: #2563eb;
color: white;
padding: 10px 24px;
border: none;
border-radius: 6px;
font-size: 14px;
cursor: pointer;
width: 100%;
}
button:hover:not(:disabled) {
background: #1e40af;
}
button:disabled {
opacity: 0.5;
cursor: not-allowed;
}
.btn-group {
display: grid;
grid-template-columns: 1fr 1fr;
gap: 10px;
margin-bottom: 15px;
}
#status {
margin-top: 15px;
padding: 12px;
background: #f7fafc;
border-radius: 6px;
font-size: 14px;
min-height: 20px;
}
#output {
margin-top: 15px;
padding: 15px;
background: #f0fdf4;
border: 1px solid #10b981;
border-radius: 6px;
white-space: pre-wrap;
min-height: 100px;
}
</style>
</head>
<body>
<div class="container">
<h1>WebLLM Translation Demo</h1>
<div class="info">
<strong>Info:</strong> Runs entirely in your browser using WebGPU. Models are cached after first download.
</div>
<label for="model-select">Select Model:</label>
<select id="model-select">
<option value="Llama-3.2-3B-Instruct-q4f32_1-MLC">Llama 3.2 3B (~2GB) - Fast</option>
<option value="Llama-3.1-8B-Instruct-q4f32_1-MLC">Llama 3.1 8B (~4.5GB) - Accurate</option>
<option value="Phi-3.5-mini-instruct-q4f16_1-MLC">Phi 3.5 Mini (~2.5GB) - Balanced</option>
<option value="Mistral-7B-Instruct-v0.3-q4f16_1-MLC">Mistral 7B (~4.5GB) - High Quality</option>
<option value="gemma-2-2b-it-q4f16_1-MLC">Gemma 2 2B (~1.5GB) - Lightweight</option>
</select>
<div class="btn-group">
<button onclick="initEngine()" style="background: #059669;">Load Model</button>
<button onclick="clearCache()" style="background: #dc2626;">Clear Cache</button>
</div>
<label for="input-text">Text to translate:</label>
<textarea id="input-text" placeholder="Enter text here...">Hello, how are you today?</textarea>
<label for="target-lang">Target language:</label>
<select id="target-lang">
<option value="Spanish">Spanish</option>
<option value="French">French</option>
<option value="German">German</option>
<option value="Italian">Italian</option>
<option value="Portuguese">Portuguese</option>
<option value="Chinese">Chinese</option>
<option value="Japanese">Japanese</option>
<option value="Korean">Korean</option>
<option value="Arabic">Arabic</option>
</select>
<button id="translate-btn" onclick="translateText()" disabled>Translate</button>
<div id="status">Initializing...</div>
<div id="output"></div>
</div>
</body>
</html>


@@ -1,10 +1,13 @@
"""
Excel Translation Module
Translates Excel files while preserving all formatting, formulas, images, and layout
OPTIMIZED: Uses batch translation for 5-10x faster processing
"""
import re
import tempfile
import os
from pathlib import Path
from typing import Dict, Set
from typing import Dict, Set, List, Tuple
from openpyxl import load_workbook
from openpyxl.worksheet.worksheet import Worksheet
from openpyxl.cell.cell import Cell
@@ -21,140 +24,136 @@ class ExcelTranslator:
def translate_file(self, input_path: Path, output_path: Path, target_language: str) -> Path:
"""
Translate an Excel file while preserving all formatting and structure
Args:
input_path: Path to input Excel file
output_path: Path to save translated Excel file
target_language: Target language code
Returns:
Path to the translated file
Translate an Excel file while preserving all formatting and structure.
Uses batch translation for improved performance.
"""
# Load workbook with data_only=False to preserve formulas
workbook = load_workbook(input_path, data_only=False)
# First, translate all worksheet content
sheet_name_mapping = {}
# Collect all translatable text elements
text_elements = [] # List of (text, setter_function)
sheet_names_to_translate = []
for sheet_name in workbook.sheetnames:
worksheet = workbook[sheet_name]
self._translate_worksheet(worksheet, target_language)
self._collect_from_worksheet(worksheet, text_elements)
sheet_names_to_translate.append(sheet_name)
# Add sheet names to translate
sheet_name_setters = []
for sheet_name in sheet_names_to_translate:
text_elements.append((sheet_name, None)) # None setter - handled separately
sheet_name_setters.append(sheet_name)
# Batch translate all texts at once
if text_elements:
texts = [elem[0] for elem in text_elements]
print(f"Batch translating {len(texts)} text segments...")
translated_texts = self.translation_service.translate_batch(texts, target_language)
# Prepare translated sheet name (but don't rename yet)
translated_sheet_name = self.translation_service.translate_text(
sheet_name, target_language
)
if translated_sheet_name and translated_sheet_name != sheet_name:
# Truncate to Excel's 31 character limit and ensure uniqueness
new_name = translated_sheet_name[:31]
counter = 1
base_name = new_name[:28] if len(new_name) > 28 else new_name
while new_name in sheet_name_mapping.values() or new_name in workbook.sheetnames:
new_name = f"{base_name}_{counter}"
counter += 1
sheet_name_mapping[sheet_name] = new_name
# Apply translations to cells
sheet_name_offset = len(text_elements) - len(sheet_name_setters)
for (_, setter), translated in zip(text_elements[:sheet_name_offset], translated_texts[:sheet_name_offset]):
if translated is not None and setter is not None:
try:
setter(translated)
except Exception as e:
print(f"Error applying translation: {e}")
# Apply sheet name translations
sheet_name_mapping = {}
for sheet_name, translated in zip(sheet_name_setters, translated_texts[sheet_name_offset:]):
if translated and translated != sheet_name:
new_name = translated[:31]
counter = 1
base_name = new_name[:28] if len(new_name) > 28 else new_name
while new_name in sheet_name_mapping.values() or new_name in workbook.sheetnames:
new_name = f"{base_name}_{counter}"
counter += 1
sheet_name_mapping[sheet_name] = new_name
# Rename sheets
for original_name, new_name in sheet_name_mapping.items():
workbook[original_name].title = new_name
# Now rename sheets (after all content is translated)
for original_name, new_name in sheet_name_mapping.items():
workbook[original_name].title = new_name
# Translate images if enabled (separate process)
if getattr(self.translation_service, 'translate_images', False):
for sheet_name in workbook.sheetnames:
self._translate_images(workbook[sheet_name], target_language)
# Save the translated workbook
workbook.save(output_path)
workbook.close()
return output_path
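Review note: translated sheet names are cut to Excel's 31-character title limit and then made unique with a numeric suffix. A minimal sketch of that step as a standalone function; `unique_sheet_name` and `MAX_SHEET_NAME` are hypothetical names, and the `taken` argument stands in for the existing sheet names plus the names already assigned.

```python
from typing import Iterable

MAX_SHEET_NAME = 31  # Excel's limit on sheet title length


def unique_sheet_name(translated: str, taken: Iterable[str]) -> str:
    """Truncate a translated title and add _1, _2, ... until it is unused."""
    taken = set(taken)
    new_name = translated[:MAX_SHEET_NAME]
    base_name = new_name[:28]  # leave room for a short suffix
    counter = 1
    while new_name in taken:
        new_name = f"{base_name}_{counter}"
        counter += 1
    return new_name
```

This sketch does not handle characters that Excel rejects in sheet titles (such as `/`, `\`, `?`, `*`, `[`, `]`, `:`), which a translation could introduce.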
def _translate_worksheet(self, worksheet: Worksheet, target_language: str):
"""
Translate all cells in a worksheet while preserving formatting
Args:
worksheet: Worksheet to translate
target_language: Target language code
"""
# Iterate through all cells that have values
def _collect_from_worksheet(self, worksheet: Worksheet, text_elements: List[Tuple[str, callable]]):
"""Collect all translatable text from worksheet cells"""
for row in worksheet.iter_rows():
for cell in row:
if cell.value is not None:
self._translate_cell(cell, target_language)
self._collect_from_cell(cell, text_elements)
def _translate_cell(self, cell: Cell, target_language: str):
"""
Translate a single cell while preserving its formula and formatting
Args:
cell: Cell to translate
target_language: Target language code
"""
def _collect_from_cell(self, cell: Cell, text_elements: List[Tuple[str, callable]]):
"""Collect text from a cell"""
original_value = cell.value
# Skip if cell is empty
if original_value is None:
return
# Handle formulas
# Handle formulas - collect text inside quotes
if isinstance(original_value, str) and original_value.startswith('='):
self._translate_formula(cell, original_value, target_language)
string_pattern = re.compile(r'"([^"]*)"')
strings = string_pattern.findall(original_value)
for s in strings:
if s.strip():
def make_formula_setter(c, orig_formula, orig_string):
def setter(translated):
c.value = orig_formula.replace(f'"{orig_string}"', f'"{translated}"')
return setter
text_elements.append((s, make_formula_setter(cell, original_value, s)))
# Handle regular text
elif isinstance(original_value, str):
translated_text = self.translation_service.translate_text(
original_value, target_language
)
cell.value = translated_text
# Numbers, dates, booleans remain unchanged
elif isinstance(original_value, str) and original_value.strip():
def make_setter(c):
def setter(text):
c.value = text
return setter
text_elements.append((original_value, make_setter(cell)))
def _translate_formula(self, cell: Cell, formula: str, target_language: str):
"""
Translate text within a formula while preserving the formula structure
def _translate_images(self, worksheet: Worksheet, target_language: str):
"""Translate text in images using vision model"""
from services.translation_service import OllamaTranslationProvider
Args:
cell: Cell containing the formula
formula: Formula string
target_language: Target language code
"""
# Extract text strings from formula (text within quotes)
string_pattern = re.compile(r'"([^"]*)"')
strings = string_pattern.findall(formula)
if not strings:
if not isinstance(self.translation_service.provider, OllamaTranslationProvider):
return
# Translate each string and replace in formula
translated_formula = formula
for original_string in strings:
if original_string.strip(): # Only translate non-empty strings
translated_string = self.translation_service.translate_text(
original_string, target_language
)
# Replace in formula, being careful with special regex characters
translated_formula = translated_formula.replace(
f'"{original_string}"', f'"{translated_string}"'
)
cell.value = translated_formula
def _should_translate(self, text: str) -> bool:
"""
Determine if text should be translated
Args:
text: Text to check
Returns:
True if text should be translated, False otherwise
"""
if not text or not isinstance(text, str):
return False
# Don't translate if it's only numbers, special characters, or very short
if len(text.strip()) < 2:
return False
# Check if it's a formula (handled separately)
if text.startswith('='):
return False
return True
try:
images = getattr(worksheet, '_images', [])
for idx, image in enumerate(images):
try:
image_data = image._data()
ext = image.format or 'png'
with tempfile.NamedTemporaryFile(suffix=f'.{ext}', delete=False) as tmp:
tmp.write(image_data)
tmp_path = tmp.name
translated_text = self.translation_service.provider.translate_image(tmp_path, target_language)
os.unlink(tmp_path)
if translated_text and translated_text.strip():
anchor = image.anchor
if hasattr(anchor, '_from'):
cell_ref = f"{get_column_letter(anchor._from.col + 1)}{anchor._from.row + 1}"
cell = worksheet[cell_ref]
from openpyxl.comments import Comment
cell.comment = Comment(f"Image translation: {translated_text}", "Translator")
print(f"Added Excel image translation at {cell_ref}")
except Exception as e:
print(f"Error translating Excel image {idx}: {e}")
except Exception as e:
print(f"Error processing Excel images: {e}")
# Global translator instance
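
Review note on `_collect_from_cell`: each quoted string in a formula gets its own setter, and every setter rebuilds the cell from the captured original formula. For a formula with two or more quoted strings, the setter that runs last overwrites the replacements made by the earlier ones. The sketch below is one possible way around that, not the committed code: all setters for a cell share a dictionary and rebuild the formula from the original in a single regex pass. `FakeCell` and `collect_formula_strings` are hypothetical names used only for illustration; a real openpyxl cell would take the place of `FakeCell`.

```python
import re
from typing import Callable, Dict, List, Tuple

# Same pattern the diff uses to find quoted literals inside a formula.
STRING_LITERAL = re.compile(r'"([^"]*)"')


class FakeCell:
    """Hypothetical stand-in for a worksheet cell; only .value is used."""

    def __init__(self, value: str):
        self.value = value


def collect_formula_strings(
    cell: FakeCell,
    text_elements: List[Tuple[str, Callable[[str], None]]],
) -> None:
    """Queue every quoted literal of a formula for batch translation.

    All setters of one cell share `translations`, so applying them one
    after another accumulates replacements instead of discarding them.
    """
    original = cell.value
    translations: Dict[str, str] = {}

    def rebuild() -> None:
        # Always start from the original formula and substitute every
        # literal that has a translation so far.
        cell.value = STRING_LITERAL.sub(
            lambda m: '"' + translations.get(m.group(1), m.group(1)) + '"',
            original,
        )

    # dict.fromkeys removes repeated literals while keeping their order.
    for literal in dict.fromkeys(STRING_LITERAL.findall(original)):
        if not literal.strip():
            continue

        def make_setter(orig: str) -> Callable[[str], None]:
            def setter(translated: str) -> None:
                translations[orig] = translated
                rebuild()
            return setter

        text_elements.append((literal, make_setter(literal)))
```

The setters keep the `(text, setter)` shape used by the batch loop in `translate_file`, so the apply step would not need to change.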

@@ -1,6 +1,7 @@
"""
PowerPoint Translation Module
Translates PowerPoint files while preserving all layouts, animations, and media
OPTIMIZED: Uses batch translation for 5-10x faster processing
"""
from pathlib import Path
from pptx import Presentation
@@ -9,6 +10,9 @@ from pptx.shapes.group import GroupShape
from pptx.util import Inches, Pt
from pptx.enum.shapes import MSO_SHAPE_TYPE
from services.translation_service import translation_service
from typing import List, Tuple
import tempfile
import os
class PowerPointTranslator:
@@ -19,139 +23,128 @@ class PowerPointTranslator:
def translate_file(self, input_path: Path, output_path: Path, target_language: str) -> Path:
"""
Translate a PowerPoint presentation while preserving all formatting and structure
Args:
input_path: Path to input PowerPoint file
output_path: Path to save translated PowerPoint file
target_language: Target language code
Returns:
Path to the translated file
Translate a PowerPoint presentation while preserving all formatting.
Uses batch translation for improved performance.
"""
presentation = Presentation(input_path)
# Translate each slide
for slide in presentation.slides:
self._translate_slide(slide, target_language)
# Collect all translatable text elements
text_elements = [] # List of (text, setter_function)
image_shapes = [] # Collect images for separate processing
for slide_idx, slide in enumerate(presentation.slides):
# Collect from notes
if slide.has_notes_slide and slide.notes_slide.notes_text_frame:
self._collect_from_text_frame(slide.notes_slide.notes_text_frame, text_elements)
# Collect from shapes
for shape in slide.shapes:
self._collect_from_shape(shape, text_elements, slide, image_shapes)
# Batch translate all texts at once
if text_elements:
texts = [elem[0] for elem in text_elements]
print(f"Batch translating {len(texts)} text segments...")
translated_texts = self.translation_service.translate_batch(texts, target_language)
# Apply translations
for (original_text, setter), translated in zip(text_elements, translated_texts):
if translated is not None and setter is not None:
try:
setter(translated)
except Exception as e:
print(f"Error applying translation: {e}")
# Translate images if enabled (separate process, can't batch)
if getattr(self.translation_service, 'translate_images', False):
for shape, slide in image_shapes:
self._translate_image_shape(shape, target_language, slide)
# Save the translated presentation
presentation.save(output_path)
return output_path
def _translate_slide(self, slide, target_language: str):
"""
Translate all text elements in a slide while preserving layout
Args:
slide: Slide to translate
target_language: Target language code
"""
# Translate notes (speaker notes)
if slide.has_notes_slide:
notes_slide = slide.notes_slide
if notes_slide.notes_text_frame:
self._translate_text_frame(notes_slide.notes_text_frame, target_language)
# Translate shapes in the slide
for shape in slide.shapes:
self._translate_shape(shape, target_language)
def _translate_shape(self, shape: BaseShape, target_language: str):
"""
Translate text in a shape based on its type
Args:
shape: Shape to translate
target_language: Target language code
"""
def _collect_from_shape(self, shape: BaseShape, text_elements: List[Tuple[str, callable]], slide=None, image_shapes=None):
"""Collect text from a shape and its children"""
# Handle text-containing shapes
if shape.has_text_frame:
self._translate_text_frame(shape.text_frame, target_language)
self._collect_from_text_frame(shape.text_frame, text_elements)
# Handle tables
if shape.shape_type == MSO_SHAPE_TYPE.TABLE:
self._translate_table(shape.table, target_language)
for row in shape.table.rows:
for cell in row.cells:
self._collect_from_text_frame(cell.text_frame, text_elements)
# Handle group shapes (shapes within shapes)
# Handle pictures/images
if shape.shape_type == MSO_SHAPE_TYPE.PICTURE and image_shapes is not None:
image_shapes.append((shape, slide))
# Handle group shapes
if shape.shape_type == MSO_SHAPE_TYPE.GROUP:
for sub_shape in shape.shapes:
self._translate_shape(sub_shape, target_language)
self._collect_from_shape(sub_shape, text_elements, slide, image_shapes)
# Handle smart art (contains multiple shapes)
# Smart art is complex, but we can try to translate text within it
# Handle smart art
if hasattr(shape, 'shapes'):
try:
for sub_shape in shape.shapes:
self._translate_shape(sub_shape, target_language)
self._collect_from_shape(sub_shape, text_elements, slide, image_shapes)
except:
pass # Some shapes may not support iteration
pass
def _translate_text_frame(self, text_frame, target_language: str):
"""
Translate text within a text frame while preserving formatting
Args:
text_frame: Text frame to translate
target_language: Target language code
"""
def _collect_from_text_frame(self, text_frame, text_elements: List[Tuple[str, callable]]):
"""Collect text from a text frame"""
if not text_frame.text.strip():
return
# Translate each paragraph in the text frame
for paragraph in text_frame.paragraphs:
self._translate_paragraph(paragraph, target_language)
if not paragraph.text.strip():
continue
for run in paragraph.runs:
if run.text and run.text.strip():
def make_setter(r):
def setter(text):
r.text = text
return setter
text_elements.append((run.text, make_setter(run)))
def _translate_paragraph(self, paragraph, target_language: str):
"""
Translate a paragraph while preserving run-level formatting
def _translate_image_shape(self, shape, target_language: str, slide):
"""Translate text in an image using vision model"""
from services.translation_service import OllamaTranslationProvider
Args:
paragraph: Paragraph to translate
target_language: Target language code
"""
if not paragraph.text.strip():
if not isinstance(self.translation_service.provider, OllamaTranslationProvider):
return
# Translate each run in the paragraph to preserve individual formatting
for run in paragraph.runs:
if run.text.strip():
translated_text = self.translation_service.translate_text(
run.text, target_language
)
run.text = translated_text
def _translate_table(self, table, target_language: str):
"""
Translate all cells in a table while preserving structure
Args:
table: Table to translate
target_language: Target language code
"""
for row in table.rows:
for cell in row.cells:
self._translate_text_frame(cell.text_frame, target_language)
def _is_translatable(self, text: str) -> bool:
"""
Determine if text should be translated
Args:
text: Text to check
Returns:
True if text should be translated, False otherwise
"""
if not text or not isinstance(text, str):
return False
# Don't translate if it's only numbers, special characters, or very short
if len(text.strip()) < 2:
return False
return True
try:
image_blob = shape.image.blob
ext = shape.image.ext
with tempfile.NamedTemporaryFile(suffix=f'.{ext}', delete=False) as tmp:
tmp.write(image_blob)
tmp_path = tmp.name
translated_text = self.translation_service.provider.translate_image(tmp_path, target_language)
os.unlink(tmp_path)
if translated_text and translated_text.strip():
left = shape.left
top = shape.top + shape.height + Inches(0.1)
width = shape.width
height = Inches(0.5)
textbox = slide.shapes.add_textbox(left, top, width, height)
tf = textbox.text_frame
p = tf.paragraphs[0]
p.text = f"[{translated_text}]"
p.font.size = Pt(10)
p.font.italic = True
print(f"Added image translation: {translated_text[:50]}...")
except Exception as e:
print(f"Error translating image: {e}")
# Global translator instance
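
Review note on `_translate_image_shape`: `os.unlink(tmp_path)` is only reached when `translate_image` returns normally. If the provider raises, control jumps to the `except` branch and the temporary file stays on disk. A minimal sketch of one way to guarantee cleanup follows; `temp_image_file` is a hypothetical helper, not part of this commit.

```python
import os
import tempfile
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def temp_image_file(data: bytes, ext: str) -> Iterator[str]:
    """Write image bytes to a temporary file and always delete it.

    Hypothetical helper: yields the file path, then removes the file
    even if the body of the `with` block raises.
    """
    tmp = tempfile.NamedTemporaryFile(suffix=f".{ext}", delete=False)
    try:
        tmp.write(data)
        tmp.close()  # close first so other code can reopen the path
        yield tmp.name
    finally:
        tmp.close()  # harmless if the file is already closed
        if os.path.exists(tmp.name):
            os.unlink(tmp.name)
```

Under this assumption the call site would shrink to a `with temp_image_file(shape.image.blob, shape.image.ext) as tmp_path:` block around the `translate_image` call, and the explicit `os.unlink` would no longer be needed.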

@@ -1,6 +1,7 @@
"""
Word Document Translation Module
Translates Word files while preserving all formatting, styles, tables, and images
OPTIMIZED: Uses batch translation for 5-10x faster processing
"""
from pathlib import Path
from docx import Document
@@ -9,7 +10,12 @@ from docx.table import Table, _Cell
from docx.oxml.text.paragraph import CT_P
from docx.oxml.table import CT_Tbl
from docx.section import Section
from docx.shared import Inches, Pt
from docx.oxml.ns import qn
from services.translation_service import translation_service
from typing import List, Tuple, Any
import tempfile
import os
class WordTranslator:
@@ -20,151 +26,130 @@ class WordTranslator:
def translate_file(self, input_path: Path, output_path: Path, target_language: str) -> Path:
"""
Translate a Word document while preserving all formatting and structure
Args:
input_path: Path to input Word file
output_path: Path to save translated Word file
target_language: Target language code
Returns:
Path to the translated file
Translate a Word document while preserving all formatting and structure.
Uses batch translation for improved performance.
"""
document = Document(input_path)
# Translate main document body
self._translate_document_body(document, target_language)
# Collect all translatable text elements
text_elements = []
# Translate headers and footers in all sections
# Collect from document body
self._collect_from_body(document, text_elements)
# Collect from headers and footers
for section in document.sections:
self._translate_section(section, target_language)
self._collect_from_section(section, text_elements)
# Batch translate all texts at once
if text_elements:
texts = [elem[0] for elem in text_elements]
print(f"Batch translating {len(texts)} text segments...")
translated_texts = self.translation_service.translate_batch(texts, target_language)
# Apply translations
for (original_text, setter), translated in zip(text_elements, translated_texts):
if translated is not None and translated != original_text:
try:
setter(translated)
except Exception as e:
print(f"Error applying translation: {e}")
# Translate images if enabled (separate process)
if getattr(self.translation_service, 'translate_images', False):
self._translate_images(document, target_language, input_path)
# Save the translated document
document.save(output_path)
return output_path
def _translate_document_body(self, document: Document, target_language: str):
"""
Translate all elements in the document body
Args:
document: Document to translate
target_language: Target language code
"""
def _collect_from_body(self, document: Document, text_elements: List[Tuple[str, callable]]):
"""Collect all text elements from document body"""
for element in document.element.body:
if isinstance(element, CT_P):
# It's a paragraph
paragraph = Paragraph(element, document)
self._translate_paragraph(paragraph, target_language)
self._collect_from_paragraph(paragraph, text_elements)
elif isinstance(element, CT_Tbl):
# It's a table
table = Table(element, document)
self._translate_table(table, target_language)
self._collect_from_table(table, text_elements)
def _translate_paragraph(self, paragraph: Paragraph, target_language: str):
"""
Translate a paragraph while preserving all formatting
Args:
paragraph: Paragraph to translate
target_language: Target language code
"""
def _collect_from_paragraph(self, paragraph: Paragraph, text_elements: List[Tuple[str, callable]]):
"""Collect text from paragraph runs"""
if not paragraph.text.strip():
return
# For paragraphs with complex formatting (multiple runs), translate run by run
if len(paragraph.runs) > 0:
for run in paragraph.runs:
if run.text.strip():
translated_text = self.translation_service.translate_text(
run.text, target_language
)
run.text = translated_text
else:
# Simple paragraph with no runs
if paragraph.text.strip():
translated_text = self.translation_service.translate_text(
paragraph.text, target_language
)
paragraph.text = translated_text
for run in paragraph.runs:
if run.text and run.text.strip():
# Create a setter function for this run
def make_setter(r):
def setter(text):
r.text = text
return setter
text_elements.append((run.text, make_setter(run)))
def _translate_table(self, table: Table, target_language: str):
"""
Translate all cells in a table while preserving structure
Args:
table: Table to translate
target_language: Target language code
"""
def _collect_from_table(self, table: Table, text_elements: List[Tuple[str, callable]]):
"""Collect text from table cells"""
for row in table.rows:
for cell in row.cells:
self._translate_cell(cell, target_language)
for paragraph in cell.paragraphs:
self._collect_from_paragraph(paragraph, text_elements)
# Handle nested tables
for nested_table in cell.tables:
self._collect_from_table(nested_table, text_elements)
def _translate_cell(self, cell: _Cell, target_language: str):
"""
Translate content within a table cell
def _collect_from_section(self, section: Section, text_elements: List[Tuple[str, callable]]):
"""Collect text from headers and footers"""
headers_footers = [
section.header, section.footer,
section.first_page_header, section.first_page_footer,
section.even_page_header, section.even_page_footer
]
Args:
cell: Cell to translate
target_language: Target language code
"""
for paragraph in cell.paragraphs:
self._translate_paragraph(paragraph, target_language)
# Handle nested tables
for table in cell.tables:
self._translate_table(table, target_language)
for hf in headers_footers:
if hf:
for paragraph in hf.paragraphs:
self._collect_from_paragraph(paragraph, text_elements)
for table in hf.tables:
self._collect_from_table(table, text_elements)
def _translate_section(self, section: Section, target_language: str):
"""
Translate headers and footers in a section
def _translate_images(self, document: Document, target_language: str, input_path: Path):
"""Extract text from images and add translations as captions"""
from services.translation_service import OllamaTranslationProvider
Args:
section: Section to translate
target_language: Target language code
"""
# Translate header
if section.header:
for paragraph in section.header.paragraphs:
self._translate_paragraph(paragraph, target_language)
for table in section.header.tables:
self._translate_table(table, target_language)
if not isinstance(self.translation_service.provider, OllamaTranslationProvider):
return
# Translate footer
if section.footer:
for paragraph in section.footer.paragraphs:
self._translate_paragraph(paragraph, target_language)
for table in section.footer.tables:
self._translate_table(table, target_language)
# Translate first page header (if different)
if section.first_page_header:
for paragraph in section.first_page_header.paragraphs:
self._translate_paragraph(paragraph, target_language)
for table in section.first_page_header.tables:
self._translate_table(table, target_language)
# Translate first page footer (if different)
if section.first_page_footer:
for paragraph in section.first_page_footer.paragraphs:
self._translate_paragraph(paragraph, target_language)
for table in section.first_page_footer.tables:
self._translate_table(table, target_language)
# Translate even page header (if different)
if section.even_page_header:
for paragraph in section.even_page_header.paragraphs:
self._translate_paragraph(paragraph, target_language)
for table in section.even_page_header.tables:
self._translate_table(table, target_language)
# Translate even page footer (if different)
if section.even_page_footer:
for paragraph in section.even_page_footer.paragraphs:
self._translate_paragraph(paragraph, target_language)
for table in section.even_page_footer.tables:
self._translate_table(table, target_language)
try:
import zipfile
import base64
with zipfile.ZipFile(input_path, 'r') as zip_ref:
image_files = [f for f in zip_ref.namelist() if f.startswith('word/media/')]
for idx, image_file in enumerate(image_files):
try:
image_data = zip_ref.read(image_file)
ext = os.path.splitext(image_file)[1]
with tempfile.NamedTemporaryFile(suffix=ext, delete=False) as tmp:
tmp.write(image_data)
tmp_path = tmp.name
translated_text = self.translation_service.provider.translate_image(tmp_path, target_language)
os.unlink(tmp_path)
if translated_text and translated_text.strip():
p = document.add_paragraph()
p.add_run(f"[Image {idx + 1} translation: ").bold = True
p.add_run(translated_text)
p.add_run("]").bold = True
print(f"Translated image {idx + 1}: {translated_text[:50]}...")
except Exception as e:
print(f"Error translating image {image_file}: {e}")
except Exception as e:
print(f"Error processing images: {e}")
# Global translator instance
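
Review note on the collect, batch translate, apply loop in `translate_file`: `zip` stops at the shorter of its inputs, so if `translate_batch` ever returned fewer results than it was given, the trailing elements would be left untranslated without any error. The sketch below is an illustration under two assumptions: that `translate_batch` is meant to return one result per input in the same order, and that identical source strings may share one translation. `apply_batch` is a hypothetical helper, not the committed implementation.

```python
from typing import Callable, List, Optional, Tuple

TextElement = Tuple[str, Callable[[str], None]]
BatchFn = Callable[[List[str], str], List[Optional[str]]]


def apply_batch(
    text_elements: List[TextElement],
    translate_batch: BatchFn,
    target_language: str,
) -> int:
    """Translate collected texts in one call and apply the setters.

    Returns the number of setters that were actually invoked.
    """
    # Send each distinct string only once, in first-seen order.
    unique = list(dict.fromkeys(text for text, _ in text_elements))
    translated = translate_batch(unique, target_language)

    # Fail loudly instead of letting zip() drop trailing elements.
    if len(translated) != len(unique):
        raise ValueError(
            f"expected {len(unique)} translations, got {len(translated)}"
        )

    lookup = dict(zip(unique, translated))
    applied = 0
    for original, setter in text_elements:
        new_text = lookup[original]
        # Same guard as the diff: skip missing or unchanged results.
        if new_text is not None and new_text != original:
            setter(new_text)
            applied += 1
    return applied
```

Deduplicating before the call is a design choice rather than a requirement; it reduces the batch size for documents with repeated headers or table labels, at the cost of giving every occurrence the same translation.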