office_translator/_bmad-output/implementation-artifacts/2-3-provider-deepl.md
Sepehr Ramezani 26bd096a06 feat: production deployment - full update with providers, admin, glossaries, pricing, tests
Major changes across backend, frontend, infrastructure:
- Provider system with model selection (Google, DeepL, OpenAI, Ollama, Google Cloud)
- Admin panel: user management, pricing, settings
- Glossary system with CSV import/export
- Subscription and tier quota management
- Security hardening (rate limiting, API key auth, path traversal fixes)
- Docker compose for dev, prod, and IONOS deployment
- Alembic migrations for new tables
- Frontend: dashboard, pricing page, landing page, i18n (en/fr)
- Test suite and verification scripts

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-04-25 15:01:47 +02:00

# Story 2.3: Provider DeepL
Status: done
## Story
As a **system**,
I want **to integrate DeepL API as a production-ready provider with automatic Free/Pro endpoint detection**,
so that **users can translate documents with higher quality (especially for European languages)**.
## Acceptance Criteria
1. **AC1: API Integration** - Given `DEEPL_API_KEY` is configured, when `DeepLProvider.translate_text()` is called, then text is translated using DeepL API
2. **AC2: Auto-Detection Free/Pro** - Provider automatically detects Free vs Pro API endpoint based on API key format (Free keys end with `:fx`)
3. **AC3: Graceful Error Handling** - All API errors (quota exceeded, invalid key, network timeout) return structured errors with code and message (never HTTP 500)
4. **AC4: Health Check** - Provider `is_available()` returns `True` when API key is configured and API is reachable, `False` otherwise
5. **AC5: Registry Integration** - Provider is registered in `ProviderRegistry` and appears in fallback chain
6. **AC6: Unit Tests** - Tests verify all error scenarios, Free/Pro detection, and mock API responses
## Tasks / Subtasks
- [x] **Task 1: Create DeepL Provider Implementation** (AC: 1, 2)
  - [x] 1.1 Create `services/providers/deepl_provider.py`
  - [x] 1.2 Implement `DeepLProvider` class extending `TranslationProvider`
  - [x] 1.3 Implement `_detect_api_type()` to auto-detect Free vs Pro from API key
  - [x] 1.4 Implement `_get_api_url()` to return correct endpoint based on type
  - [x] 1.5 Use `deep_translator` library for DeepL integration (same pattern as Google)
- [x] **Task 2: Implement Error Handling** (AC: 3)
  - [x] 2.1 Define error codes: `DEEPL_QUOTA_EXCEEDED`, `DEEPL_INVALID_KEY`, `DEEPL_NETWORK_ERROR`, `DEEPL_UNSUPPORTED_LANGUAGE`, `DEEPL_TEXT_TOO_LONG`
  - [x] 2.2 Implement `DeepLProviderError` exception class
  - [x] 2.3 Map DeepL API errors to structured error responses
  - [x] 2.4 Add retry logic with exponential backoff for transient errors
  - [x] 2.5 Add timeout configuration (default 30s)
  - [x] 2.6 Ensure all errors return JSON: `{error, message, details?}` format
- [x] **Task 3: Implement Health Check** (AC: 4)
  - [x] 3.1 Implement `is_available()` to check API key presence
  - [x] 3.2 Add `health_check()` with caching (TTL 60s) matching Google provider pattern
  - [x] 3.3 Return `ProviderHealthStatus` with availability and latency
- [x] **Task 4: Registry Integration** (AC: 5)
  - [x] 4.1 Add `register_deepl_provider()` function
  - [x] 4.2 Add `get_deepl_provider()` singleton function
  - [x] 4.3 Update `services/providers/__init__.py` to auto-register DeepL when enabled
  - [x] 4.4 Verify provider appears in fallback chain when configured
- [x] **Task 5: Configuration Updates** (AC: 1, 2)
  - [x] 5.1 Verify `DEEPL_API_KEY` and `DEEPL_ENABLED` in `config.py` (already present)
  - [x] 5.2 Add DeepL-specific configuration options to `.env.example`:
    - `DEEPL_TIMEOUT=30`
    - `DEEPL_MAX_RETRIES=3`
    - `DEEPL_RETRY_DELAY=1`
- [x] **Task 6: Create Unit Tests** (AC: 6)
  - [x] 6.1 Create `tests/test_providers/test_deepl_provider.py`
  - [x] 6.2 Test Free vs Pro API key detection
  - [x] 6.3 Test all error scenarios (quota, invalid key, timeout, unsupported language)
  - [x] 6.4 Test retry logic
  - [x] 6.5 Test health check functionality
  - [x] 6.6 Test registry integration
- [x] **Task 7: Update Documentation** (AC: 1-6)
  - [x] 7.1 Update `services/providers/README.md` with DeepL section
  - [x] 7.2 Document Free vs Pro API key differences
  - [x] 7.3 Document supported languages (fewer than Google but higher quality)
## Dev Notes
### 🔬 DeepL API Specifics
**API Endpoints:**
| Type | Endpoint | Key Format |
|------|----------|------------|
| Free | `https://api-free.deepl.com/v2/translate` | Ends with `:fx` |
| Pro | `https://api.deepl.com/v2/translate` | Does NOT end with `:fx` |
**Auto-Detection Logic:**
```python
def _detect_api_type(self, api_key: str) -> str:
    """Detect if API key is Free or Pro based on suffix."""
    if api_key.endswith(":fx"):
        return "free"
    return "pro"

def _get_api_url(self) -> str:
    """Get correct API URL based on key type."""
    if self._api_type == "free":
        return "https://api-free.deepl.com/v2/translate"
    return "https://api.deepl.com/v2/translate"
```
**Free Tier Limits:**
- 500,000 characters/month
- Rate limit: ~5 requests/second
**Pro Tier:**
- Pay per character (~€25 per million characters)
- Higher rate limits
- Priority support
### Supported Languages (DeepL)
DeepL supports **fewer languages** than Google but generally delivers **higher quality** for European languages:
**Supported (as of 2024):**
- BG (Bulgarian), CS (Czech), DA (Danish), DE (German), EL (Greek)
- EN-GB/EN-US (English), ES (Spanish), ET (Estonian), FI (Finnish)
- FR (French), HU (Hungarian), ID (Indonesian), IT (Italian), JA (Japanese)
- KO (Korean), LT (Lithuanian), LV (Latvian), NB (Norwegian Bokmål)
- NL (Dutch), PL (Polish), PT-BR/PT-PT (Portuguese), RO (Romanian)
- RU (Russian), SK (Slovak), SL (Slovenian), SV (Swedish), TR (Turkish)
- UK (Ukrainian), ZH (Chinese)
**Key Differences from Google:**
- DeepL does not accept `"auto"` as a source language code; to enable auto-detection, pass `None` or omit the source language
- Language codes are case-sensitive (uppercase)
- English has two variants: EN-GB, EN-US
- Portuguese has two variants: PT-BR, PT-PT
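The differences above suggest a small normalization step before calling DeepL. The following is a minimal sketch, not the provider's actual code: the helper names `normalize_source_lang` and `normalize_target_lang` are hypothetical, and the default variants (EN to EN-US, PT to PT-BR) are taken from the completion notes of this story.

```python
from typing import Optional

# Assumed defaults, following the completion notes (EN -> EN-US, PT -> PT-BR).
_TARGET_VARIANT_DEFAULTS = {"EN": "EN-US", "PT": "PT-BR"}


def normalize_source_lang(code: Optional[str]) -> Optional[str]:
    """Return an uppercase source code, or None so DeepL auto-detects."""
    if code is None or code.strip().lower() in ("", "auto"):
        return None
    return code.strip().upper()


def normalize_target_lang(code: str) -> str:
    """Return an uppercase target code, expanding bare EN/PT to a variant."""
    upper = code.strip().upper()
    return _TARGET_VARIANT_DEFAULTS.get(upper, upper)
```

Codes that already name a variant (for example `pt-pt`) are only uppercased, so an explicit choice is never overridden by the default.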
### Architecture Compliance
Per `_bmad-output/planning-artifacts/architecture.md`:
**Error Format:**
```json
{
  "error": "DEEPL_QUOTA_EXCEEDED",
  "message": "Quota DeepL dépassé. Réessayez demain.",
  "details": {
    "provider": "deepl",
    "api_type": "free",
    "reset_at": "2024-01-16T00:00:00Z"
  }
}
```
**Never return HTTP 500** - All errors must be 4xx or 502 (upstream error).
**Naming Conventions:**
- File: `deepl_provider.py` (snake_case)
- Class: `DeepLTranslationProvider` (PascalCase)
- Error codes: `DEEPL_*` (UPPER_SNAKE_CASE)
- JSON fields: snake_case
### Previous Story Intelligence (Story 2.2 - Google Translate)
**What Worked Well:**
- `deep_translator` library integration (no API key management for Google)
- Thread-safe translator instances per thread
- Error codes with `to_dict()` method
- Retry logic with exponential backoff
- Health check with 60s TTL caching
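The health check with 60s TTL caching could be sketched roughly as below. This is an illustration under assumptions: `CachedHealthCheck`, its `probe` callable, and the injectable `clock` are hypothetical names, and the `ProviderHealthStatus` fields shown here are a stand-in for the project's real model. The sketch is not thread-safe on its own.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ProviderHealthStatus:
    """Stand-in for the project's ProviderHealthStatus; fields are assumed."""
    available: bool
    latency_ms: float


class CachedHealthCheck:
    """Runs a probe at most once per TTL window and caches the result."""

    def __init__(self, probe: Callable[[], bool], ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self._probe = probe
        self._ttl = ttl_seconds
        self._clock = clock
        self._cached: Optional[ProviderHealthStatus] = None
        self._checked_at = 0.0

    def health_check(self) -> ProviderHealthStatus:
        now = self._clock()
        # Serve the cached status while it is younger than the TTL.
        if self._cached is not None and now - self._checked_at < self._ttl:
            return self._cached
        started = self._clock()
        try:
            available = bool(self._probe())
        except Exception:
            # Any probe failure counts as "unavailable", never as a crash.
            available = False
        latency_ms = (self._clock() - started) * 1000.0
        self._cached = ProviderHealthStatus(available, latency_ms)
        self._checked_at = now
        return self._cached
```

Injecting the clock keeps the TTL behaviour testable without sleeping for a minute.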
**Patterns to Reuse:**
```python
from typing import Any, Dict, Optional

# Error codes pattern
DEEPL_QUOTA_EXCEEDED = "DEEPL_QUOTA_EXCEEDED"
DEEPL_INVALID_KEY = "DEEPL_INVALID_KEY"
DEEPL_NETWORK_ERROR = "DEEPL_NETWORK_ERROR"
DEEPL_UNSUPPORTED_LANGUAGE = "DEEPL_UNSUPPORTED_LANGUAGE"
DEEPL_TEXT_TOO_LONG = "DEEPL_TEXT_TOO_LONG"
_RETRYABLE_ERRORS = {DEEPL_NETWORK_ERROR, DEEPL_QUOTA_EXCEEDED}


# Exception class pattern
class DeepLProviderError(Exception):
    def __init__(self, code: str, message: str, details: Optional[Dict[str, Any]] = None):
        self.code = code
        self.message = message
        self.details = details or {}
        super().__init__(message)

    def to_dict(self) -> Dict[str, Any]:
        """Serialize to the {error, message, details?} format."""
        result = {"error": self.code, "message": self.message}
        if self.details:
            result["details"] = self.details
        return result
```
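The retry logic with exponential backoff is not shown in these notes. A minimal sketch of how it might combine with the retryable error set follows; `call_with_retry` and its parameters are hypothetical, the defaults mirror `DEEPL_MAX_RETRIES=3` and `DEEPL_RETRY_DELAY=1`, and the exception class is a trimmed copy so the block runs on its own.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

RETRYABLE_CODES = {"DEEPL_NETWORK_ERROR", "DEEPL_QUOTA_EXCEEDED"}


class DeepLProviderError(Exception):
    """Trimmed copy of the exception pattern so the sketch is self-contained."""

    def __init__(self, code: str, message: str):
        self.code = code
        self.message = message
        super().__init__(message)


def call_with_retry(
    func: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func; on a retryable error wait base_delay * 2**attempt and retry."""
    attempt = 0
    while True:
        try:
            return func()
        except DeepLProviderError as exc:
            # Non-retryable codes and exhausted budgets propagate unchanged.
            if exc.code not in RETRYABLE_CODES or attempt >= max_retries:
                raise
            sleep(base_delay * (2 ** attempt))
            attempt += 1
```

With the defaults, a call that keeps failing is attempted four times in total (one initial call plus three retries), with waits of 1s, 2s and 4s in between.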
**Key Difference for DeepL:**
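Unlike Google (which uses `deep_translator` without an API key), DeepL **requires** an API key passed to `deep_translator.DeeplTranslator` (note the lowercase `pl` in the class name):
```python
from deep_translator import DeeplTranslator

# Free tier: keys end with ":fx"; use_free_api=True selects api-free.deepl.com
translator = DeeplTranslator(api_key="your-key:fx", source="en", target="fr", use_free_api=True)

# Pro tier: same class; use_free_api=False selects api.deepl.com
translator = DeeplTranslator(api_key="your-pro-key", source="en", target="fr", use_free_api=False)
```
**Note:** `deep_translator` does not infer Free vs Pro from the API key format. It chooses the endpoint from the `use_free_api` argument (default `True`), so the provider should pass the result of `_detect_api_type()` explicitly.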
### File Structure
**Files to Create:**
- `services/providers/deepl_provider.py` - Main provider implementation
- `tests/test_providers/test_deepl_provider.py` - Unit tests
**Files to Modify:**
- `services/providers/__init__.py` - Add DeepL auto-registration
- `.env.example` - Add DeepL-specific config (if not present)
- `services/providers/README.md` - Add DeepL documentation
### Error Codes to Implement
| Code | HTTP | Scenario | Message Template |
|------|------|----------|------------------|
| `DEEPL_QUOTA_EXCEEDED` | 429 | Character quota exceeded | "Quota DeepL dépassé. Réessayez demain." |
| `DEEPL_INVALID_KEY` | 401 | Invalid API key | "Clé API DeepL invalide. Contactez l'administrateur." |
| `DEEPL_NETWORK_ERROR` | 502 | Network/timeout error | "Service DeepL indisponible. Réessayez." |
| `DEEPL_UNSUPPORTED_LANGUAGE` | 400 | Language not supported | "Langue '{lang}' non supportée par DeepL." |
| `DEEPL_TEXT_TOO_LONG` | 413 | Text exceeds limit | "Texte trop long (max 128KB par requête)." |
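The table above can be expressed as a lookup used when building the HTTP response. This is a sketch only: `to_http_error` is a hypothetical helper, and the fallback to 502 for unknown codes is an assumption made here to respect the "never HTTP 500" rule, not something the architecture document specifies.

```python
from typing import Any, Dict, Optional, Tuple

# Status codes copied from the error code table.
DEEPL_HTTP_STATUS = {
    "DEEPL_QUOTA_EXCEEDED": 429,
    "DEEPL_INVALID_KEY": 401,
    "DEEPL_NETWORK_ERROR": 502,
    "DEEPL_UNSUPPORTED_LANGUAGE": 400,
    "DEEPL_TEXT_TOO_LONG": 413,
}


def to_http_error(
    code: str, message: str, details: Optional[Dict[str, Any]] = None
) -> Tuple[int, Dict[str, Any]]:
    """Return (status, body); unknown codes map to 502 (assumed), never 500."""
    status = DEEPL_HTTP_STATUS.get(code, 502)
    body: Dict[str, Any] = {"error": code, "message": message}
    if details:
        body["details"] = details
    return status, body
```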
### Configuration
**Environment Variables (`.env.example`):**
```bash
# DeepL Provider
DEEPL_ENABLED=true
DEEPL_API_KEY=your_deepl_api_key_here # Free keys end with :fx
DEEPL_TIMEOUT=30
DEEPL_MAX_RETRIES=3
DEEPL_RETRY_DELAY=1
```
**Provider Config (`services/providers/config.py`):**
Already has basic config. May need to add:
- `DEEPL_TIMEOUT`
- `DEEPL_MAX_RETRIES`
- `DEEPL_RETRY_DELAY`
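One way these settings might be read from the environment is sketched below. `DeepLSettings` and `load_deepl_settings` are hypothetical names (the real `config.py` may be structured differently), and treating a missing `DEEPL_ENABLED` as disabled is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class DeepLSettings:
    enabled: bool
    api_key: Optional[str]
    timeout: float
    max_retries: int
    retry_delay: float


def load_deepl_settings(env: Mapping[str, str] = os.environ) -> DeepLSettings:
    """Read the DEEPL_* variables, falling back to the documented defaults."""
    enabled_raw = env.get("DEEPL_ENABLED", "false")  # assumed default: disabled
    return DeepLSettings(
        enabled=enabled_raw.strip().lower() in ("1", "true", "yes"),
        api_key=env.get("DEEPL_API_KEY") or None,
        timeout=float(env.get("DEEPL_TIMEOUT", "30")),
        max_retries=int(env.get("DEEPL_MAX_RETRIES", "3")),
        retry_delay=float(env.get("DEEPL_RETRY_DELAY", "1")),
    )
```

Passing a plain dict instead of `os.environ` makes the loader easy to exercise in unit tests.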
### Testing Strategy
**Unit Tests (Mocked):**
- Free vs Pro API key detection
- All error scenarios (quota, invalid key, timeout)
- Health check logic
- Retry logic
- Caching behavior
- Registry integration
**Integration Tests (Optional):**
- With `DEEPL_API_KEY` in environment: real API calls
- Without API key: skip integration tests
- Use pytest markers: `@pytest.mark.integration`
**Test Commands:**
```bash
# Unit tests only
pytest tests/test_providers/test_deepl_provider.py -v
# All provider tests
pytest tests/test_providers/ -v
# With coverage
pytest tests/test_providers/ --cov=services/providers -v
```
### Logging Pattern
```python
import logging

logger = logging.getLogger(__name__)

# Good - metadata only (NO document content)
logger.info(
    f"deepl_translation_success chars={len(text)} "
    f"source_lang={source_language} target_lang={target_language} "
    f"api_type={self._api_type}"
)
logger.error(
    f"deepl_translation_failed error_code={error.code} "
    f"text_length={len(text)} source_lang={source_language} "
    f"target_lang={target_language}"
)
```
### Dependencies
**Internal:**
- `services/providers/base.py` - TranslationProvider abstract class
- `services/providers/registry.py` - ProviderRegistry
- `services/providers/config.py` - Configuration
- `services/providers/schemas.py` - TranslationRequest/Response models
**External:**
- `deep_translator` - DeepL integration (already installed for Google)
- `structlog` or standard `logging` - Structured logging
### References
- [Source: _bmad-output/planning-artifacts/architecture.md#Error Handling]
- [Source: _bmad-output/planning-artifacts/architecture.md#API Response Formats]
- [Source: _bmad-output/planning-artifacts/epics.md#Story 2.3]
- [Source: _bmad-output/planning-artifacts/prd.md#FR6 Google/DeepL providers]
- [Source: _bmad-output/planning-artifacts/prd.md#NFR12 Zero HTTP 500 errors]
- [Source: _bmad-output/planning-artifacts/prd.md#NFR13 Provider fallback]
- [Source: _bmad-output/implementation-artifacts/2-2-provider-google-translate.md]
- [Source: services/providers/google_provider.py - Implementation pattern]
- [Source: services/providers/registry.py - Registration pattern]
### Security Considerations
**API Key Protection:**
- Never log API key
- Load from environment variable only
- Validate key format on startup (basic check for `:fx` suffix)
**Data Privacy:**
- Never log document content (NFR11)
- Only log metadata: text length, languages, timestamps
- Clear cache entries after TTL (security + privacy)
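The key-handling rules above might look like the following at startup. Both helpers are hypothetical: `validate_api_key` only checks presence and the `:fx` suffix, and `redact_api_key` returns a placeholder that exposes the tier but no key material, so it is safe to log.

```python
from typing import Optional


def validate_api_key(api_key: Optional[str]) -> str:
    """Return 'free' or 'pro'; raise ValueError if the key is missing or blank."""
    if api_key is None or not api_key.strip():
        raise ValueError("DEEPL_API_KEY is not set")
    return "free" if api_key.strip().endswith(":fx") else "pro"


def redact_api_key(api_key: str) -> str:
    """Return a log-safe placeholder that reveals only the tier."""
    tier = "free" if api_key.strip().endswith(":fx") else "pro"
    return f"<deepl-key:{tier}>"
```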
## Dev Agent Record
### Agent Model Used
Claude (GLM-5) via opencode
### Debug Log References
- Fixed logging compatibility issue: standard logging doesn't support keyword arguments like structlog
- Created helper functions `_log_info`, `_log_warning`, `_log_error` to bridge the gap
### Completion Notes List
- ✅ Implemented `DeepLTranslationProvider` class with all required features
- ✅ Auto-detection of Free/Pro API endpoints based on key format (`:fx` suffix)
- ✅ All 5 error codes implemented with French messages
- ✅ Retry logic with exponential backoff for `DEEPL_NETWORK_ERROR` and `DEEPL_QUOTA_EXCEEDED`
- ✅ Health check with 60s TTL caching
- ✅ Registry integration with auto-registration when `DEEPL_ENABLED=true` and `DEEPL_API_KEY` is set
- ✅ Language code normalization (uppercase, EN→EN-US, PT→PT-BR)
- ✅ Unit tests created (incl. timeout→DEEPL_NETWORK_ERROR, error_details for TEXT_TOO_LONG)
- ✅ Documentation updated in README.md with DeepL section
- ✅ [Code review] main.py uses `get_legacy_deepl_adapter()` so API uses new provider; `is_available()` performs minimal translate to verify API reachable; `get_deepl_provider()` thread-safe; DeepL error codes mapped in utils/exceptions.py
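For reference, a thread-safe singleton combined with conditional registration could be sketched as follows. This is an assumption-laden illustration: the placeholder provider class, the `api_key` parameter on `get_deepl_provider()`, and the plain dict standing in for `ProviderRegistry` are all hypothetical and may differ from the real signatures.

```python
import threading
from typing import Dict, Optional


class DeepLTranslationProvider:
    """Placeholder for the real provider class."""

    def __init__(self, api_key: str):
        self.api_type = "free" if api_key.endswith(":fx") else "pro"


_provider: Optional[DeepLTranslationProvider] = None
_provider_lock = threading.Lock()


def get_deepl_provider(api_key: str) -> DeepLTranslationProvider:
    """Create the provider once, even when called from several threads."""
    global _provider
    if _provider is None:
        with _provider_lock:
            # Re-check inside the lock: another thread may have won the race.
            if _provider is None:
                _provider = DeepLTranslationProvider(api_key)
    return _provider


def register_deepl_provider(
    registry: Dict[str, object], enabled: bool, api_key: Optional[str]
) -> bool:
    """Register under 'deepl' only when enabled and a key is set."""
    if not enabled or not api_key:
        return False
    registry["deepl"] = get_deepl_provider(api_key)
    return True
```

Once the instance exists, later calls return it regardless of the key they pass, which is the usual trade-off of a module-level singleton.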
### File List
**Files Created:**
- `services/providers/deepl_provider.py` - Main DeepL provider implementation
- `tests/test_providers/test_deepl_provider.py` - Unit tests (incl. timeout/error_details)
**Files Modified:**
- `services/providers/__init__.py` - Added DeepL auto-registration
- `services/providers/config.py` - Added DEEPL_TIMEOUT, DEEPL_MAX_RETRIES, DEEPL_RETRY_DELAY
- `services/providers/README.md` - Added comprehensive DeepL documentation
- `.env.example` - Added DeepL-specific configuration options
- `main.py` - Use `get_legacy_deepl_adapter()` for DeepL (code review)
- `utils/exceptions.py` - Added DEEPL_* error codes to HTTP status mapping (code review)
### Change Log
- 2026-02-21: Story 2.3 implementation complete - DeepL provider with Free/Pro detection, error handling, and 35 passing tests
- 2026-02-21: Code review fixes: `main.py` uses the legacy DeepL adapter; `is_available()` performs a minimal API check; thread-safe singleton; timeout tests; DeepL HTTP status mapping in `utils/exceptions.py`