Files
office_translator/_bmad-output/implementation-artifacts/2-2-provider-google-translate.md
Sepehr Ramezani 26bd096a06 feat: production deployment - full update with providers, admin, glossaries, pricing, tests
Major changes across backend, frontend, infrastructure:
- Provider system with model selection (Google, DeepL, OpenAI, Ollama, Google Cloud)
- Admin panel: user management, pricing, settings
- Glossary system with CSV import/export
- Subscription and tier quota management
- Security hardening (rate limiting, API key auth, path traversal fixes)
- Docker compose for dev, prod, and IONOS deployment
- Alembic migrations for new tables
- Frontend: dashboard, pricing page, landing page, i18n (en/fr)
- Test suite and verification scripts

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-04-25 15:01:47 +02:00

335 lines
14 KiB
Markdown

# Story 2.2: Provider Google Translate
Status: done
## Story
As a **system**,
I want **to integrate Google Translate API as a production-ready provider with robust error handling and health monitoring**,
so that **users can translate documents in Classic mode reliably without HTTP 500 errors**.
## Acceptance Criteria
1. **AC1: API Integration** - Given `GOOGLE_API_KEY` is configured, when `GoogleTranslateProvider.translate_text()` is called, then text is translated using Google Translate API v2 or v3
2. **AC2: Graceful Error Handling** - All API errors (quota exceeded, invalid key, network timeout) return structured errors with code and message (never HTTP 500)
3. **AC3: Health Check** - Provider `is_available()` returns `True` when API key is configured and API is reachable, `False` otherwise
4. **AC4: Rate Limiting Awareness** - Provider handles Google's rate limits gracefully with retry logic and clear error messages
5. **AC5: Integration Tests** - Tests verify actual API integration (mocked or with test API key) and error scenarios
6. **AC6: Cost Optimization** - Provider uses efficient API calls (batching where possible, minimal quota usage)
## Tasks / Subtasks
- [x] **Task 1: Validate Existing Implementation** (AC: 1, 2)
- [x] 1.1 Review `services/providers/google_provider.py` from Story 2.1
- [x] 1.2 Verify Google Translate API v2/v3 compatibility
- [x] 1.3 Test translation with real API key (if available)
- [x] 1.4 Document any gaps between implementation and AC
- [x] **Task 2: Enhance Error Handling** (AC: 2, 4)
- [x] 2.1 Add specific error codes: `GOOGLE_QUOTA_EXCEEDED`, `GOOGLE_INVALID_KEY`, `GOOGLE_NETWORK_ERROR`, `GOOGLE_UNSUPPORTED_LANGUAGE`
- [x] 2.2 Implement retry logic with exponential backoff for transient errors
- [x] 2.3 Add timeout configuration (default 30s for translation requests)
- [x] 2.4 Ensure all errors return JSON: `{error, message, details?}` format
- [x] 2.5 Log errors with `structlog` (no document content in logs)
- [x] **Task 3: Improve Health Check** (AC: 3)
- [x] 3.1 Implement `is_available()` to check API key presence and basic connectivity
- [x] 3.2 Add optional `health_check()` method that pings Google Translate API
- [x] 3.3 Cache health check results (TTL 60s) to avoid unnecessary API calls
- [x] 3.4 Return detailed status: `{available: bool, error?: str, last_check: timestamp}`
- [x] **Task 4: Add Integration Tests** (AC: 5)
- [x] 4.1 Create `tests/test_providers/test_google_integration.py`
- [x] 4.2 Add mocked tests for all error scenarios
- [x] 4.3 Add test with real API key (skipped if not available)
- [x] 4.4 Test rate limit handling and retry logic
- [x] 4.5 Test health check functionality
- [x] **Task 5: Optimize API Usage** (AC: 6)
- [x] 5.1 Review caching implementation (from Story 2.1)
- [x] 5.2 Add language detection optimization (skip if source=target)
- [x] 5.3 Document API usage and cost estimates in code comments
- [x] 5.4 Add usage metrics logging (character count, API calls)
- [x] **Task 6: Update Documentation** (AC: 1-6)
- [x] 6.1 Update `services/providers/README.md` (if exists) or create it
- [x] 6.2 Document environment variables in `.env.example`
- [x] 6.3 Add provider-specific error codes to API documentation
- [x] 6.4 Update admin dashboard to show Google provider status
## Dev Notes
### 🚨 CONTEXT: Previous Story 2.1 Completion
**Story 2.1 already migrated the Google provider** to the new architecture:
- Created `services/providers/google_provider.py` (178-186 lines)
- Integrated with `ProviderRegistry`
- Added basic error handling and caching
- Created 47 unit tests (all passing)
**This story focuses on PRODUCTION READINESS:**
- Robust error handling for all Google API error cases
- Health monitoring and availability checks
- Integration testing with real/mock API
- Cost optimization and usage tracking
### Existing Implementation (Story 2.1)
**File:** `services/providers/google_provider.py`
Key components already implemented:
- `GoogleTranslationProvider` class extending `TranslationProvider`
- Basic `translate_text()` implementation
- Caching via `_translation_cache`
- Environment configuration via `GOOGLE_API_KEY`
**Gaps to Address:**
1. Error handling: Currently returns untranslated text on failure (lines 135-141)
2. Health check: Basic implementation, needs enhancement
3. Rate limiting: No explicit handling of Google's quotas
4. Integration tests: Only unit tests with mocks
### Google Translate API v2 vs v3
**API v2 (Basic):**
- Endpoint: `https://translation.googleapis.com/language/translate/v2`
- Auth: API Key in URL or header
- Simple REST API
- Free tier: 500,000 characters/month
**API v3 (Advanced):**
- Endpoint: `https://translation.googleapis.com/v3/projects/{PROJECT_ID}:translateText`
- Auth: Service Account JSON (more complex)
- Batch translation support
- Glossary support (future enhancement)
- Paid only (no free tier)
**Recommendation:** Start with API v2 for MVP (simpler, has free tier). Plan v3 for post-MVP.
### Error Codes to Implement
| Code | HTTP | Scenario | Message Template |
|------|------|----------|------------------|
| `GOOGLE_QUOTA_EXCEEDED` | 429 | API quota exceeded | "Quota Google Translate dépassé. Réessayez demain." |
| `GOOGLE_INVALID_KEY` | 401 | Invalid API key | "Clé API Google invalide. Contactez l'administrateur." |
| `GOOGLE_NETWORK_ERROR` | 502 | Network/timeout error | "Service Google Translate indisponible. Réessayez." |
| `GOOGLE_UNSUPPORTED_LANGUAGE` | 400 | Language not supported | "Langue '{lang}' non supportée par Google." |
| `GOOGLE_TEXT_TOO_LONG` | 413 | Text exceeds limit | "Texte trop long (max 5000 caractères par requête)." |
### Architecture Compliance
Per `_bmad-output/planning-artifacts/architecture.md`:
**Error Format:**
```json
{
"error": "GOOGLE_QUOTA_EXCEEDED",
"message": "Quota Google Translate dépassé. Réessayez demain.",
"details": {
"provider": "google",
"reset_at": "2024-01-16T00:00:00Z"
}
}
```
**Never return HTTP 500** - All errors must be 4xx or 502 (upstream error).
**Logging Pattern:**
```python
import structlog
logger = structlog.get_logger()
# Good - metadata only
logger.error("google_translation_failed",
error_code="GOOGLE_QUOTA_EXCEEDED",
user_id=user_id,
text_length=len(text),
source_lang=source_lang,
target_lang=target_lang
)
# Bad - logs document content
logger.error("translation_failed", text=text) # ❌ NEVER DO THIS
```
### Testing Strategy
**Unit Tests (Mocked):**
- All error scenarios (quota, invalid key, timeout)
- Health check logic
- Retry logic
- Caching behavior
**Integration Tests:**
- With `GOOGLE_API_KEY` in environment: real API calls
- Without API key: skip integration tests
- Use pytest markers: `@pytest.mark.integration`
**Test Commands:**
```bash
# Unit tests only
pytest tests/test_providers/test_google_provider.py -v
# Integration tests (requires API key)
pytest tests/test_providers/test_google_integration.py -v -m integration
# All tests with coverage
pytest tests/test_providers/ --cov=services/providers -v
```
### Performance Considerations
**Google Translate Limits:**
- 5000 characters per request
- 100 requests per second (with billing enabled)
- 500,000 characters/month free tier
**Optimizations:**
1. **Caching:** Already implemented in Story 2.1 (LRU cache, 5000 entries)
2. **Batching:** For large documents, batch multiple segments in single API call
3. **Language Detection:** Skip translation if source == target
4. **Rate Limiting:** Implement client-side rate limiting to avoid Google's 403s
### Configuration
**Environment Variables (`.env.example`):**
```bash
# Google Translate Provider
GOOGLE_TRANSLATE_ENABLED=true
GOOGLE_API_KEY=your_api_key_here
GOOGLE_TRANSLATE_TIMEOUT=30 # seconds
GOOGLE_TRANSLATE_MAX_RETRIES=3
GOOGLE_TRANSLATE_RETRY_DELAY=1 # initial delay in seconds
```
**Provider Config (`services/providers/config.py`):**
Already has basic config from Story 2.1. Enhance with:
- Timeout settings
- Retry configuration
- Rate limiting parameters
### Project Structure Notes
**Files to Modify:**
- `services/providers/google_provider.py` - Enhance error handling, health check
- `services/providers/config.py` - Add new configuration options
- `tests/test_providers/test_google_provider.py` - Add more test cases
**Files to Create:**
- `tests/test_providers/test_google_integration.py` - Integration tests
- `services/providers/README.md` - Provider documentation (optional)
**Naming Conventions:**
- Python: `snake_case` for files, functions, variables
- Classes: `PascalCase`
- Error codes: `UPPER_SNAKE_CASE`
- JSON fields: `snake_case`
### References
- [Source: _bmad-output/planning-artifacts/architecture.md#Error Handling]
- [Source: _bmad-output/planning-artifacts/architecture.md#API Response Formats]
- [Source: _bmad-output/planning-artifacts/epics.md#Story 2.2]
- [Source: _bmad-output/planning-artifacts/prd.md#FR6 Google/DeepL providers]
- [Source: _bmad-output/planning-artifacts/prd.md#NFR12 Zero HTTP 500 errors]
- [Source: _bmad-output/planning-artifacts/prd.md#NFR13 Provider fallback]
- [Source: _bmad-output/implementation-artifacts/2-1-abstraction-provider-base-registry.md]
### Previous Story Intelligence (Story 2.1)
**What Worked Well:**
- Abstract base class design is solid and extensible
- Registry pattern allows easy provider swapping
- Caching significantly improves performance for repeated translations
- 47 unit tests provide good coverage
**Challenges Encountered:**
- Silent failure on error (returned untranslated text) - FIXED in code review
- Need better error indication in `TranslationResponse`
- Integration tests deferred
**Learnings to Apply:**
- Use `error` field in `TranslationResponse` for failure detection
- Add comprehensive error codes for each failure scenario
- Test with real API during development if possible
- Document API limits and quotas clearly
### Dependencies
**Internal:**
- `services/providers/base.py` - TranslationProvider abstract class
- `services/providers/registry.py` - ProviderRegistry
- `services/providers/config.py` - Configuration
- `services/providers/schemas.py` - TranslationRequest/Response models
**External:**
- `google-api-python-client` - Google API client library (if using v3)
- `httpx` or `aiohttp` - HTTP client for v2 API
- `structlog` - Structured logging
### Security Considerations
**API Key Protection:**
- Never log API key
- Load from environment variable only
- Validate key format on startup (basic check)
**Rate Limiting:**
- Implement client-side rate limiting to protect quota
- Track usage per user to prevent abuse
- Log usage metrics for billing/monitoring
**Data Privacy:**
- Never log document content (NFR11)
- Only log metadata: text length, languages, timestamps
- Clear cache entries after TTL (security + privacy)
## Dev Agent Record
### Agent Model Used
Claude (glm-5)
### Debug Log References
None - all tests passed without requiring debugging.
### Completion Notes List
- **Task 1**: Validated existing implementation from Story 2.1. Uses `deep_translator` library which wraps Google Translate without API key.
- **Task 2**: Implemented 5 error codes (GOOGLE_QUOTA_EXCEEDED, GOOGLE_INVALID_KEY, GOOGLE_NETWORK_ERROR, GOOGLE_UNSUPPORTED_LANGUAGE, GOOGLE_TEXT_TOO_LONG). Added retry logic with exponential backoff (max 3 retries, 1s initial delay). Added timeout configuration (default 30s). Error responses use `{error, message, details}` format.
- **Task 3**: Enhanced health check with caching (60s TTL). Added `last_check` timestamp to `ProviderHealthStatus`.
- **Task 4**: Created comprehensive integration tests (26 tests total): error codes, retry logic, timeout, error format, logging, health check, real API tests.
- **Task 5**: Added optimization to skip translation when source==target language. Added usage metrics logging (character count, API calls). Documented API usage and costs in code comments.
- **Task 6**: Created `services/providers/README.md` with provider documentation. Updated `.env.example` with Google Translate configuration options.
### File List
**Modified:**
- `services/providers/google_provider.py` - Enhanced error handling, retry logic, health check, optimizations; timeout applied via executor; structlog; config from env; LegacyGoogleAdapter
- `services/providers/schemas.py` - Added `error_code`, `error_details`, `to_error_dict()` to TranslationResponse; added `last_check` to ProviderHealthStatus
- `services/providers/config.py` - GOOGLE_TRANSLATE_TIMEOUT, GOOGLE_TRANSLATE_MAX_RETRIES, GOOGLE_TRANSLATE_RETRY_DELAY from env
- `.env.example` - Added Google Translate configuration options
- `utils/exceptions.py` - TranslationProviderError; handle_translation_error maps Google error codes to HTTP 429/401/502/400/413
- `utils/__init__.py` - Export TranslationProviderError
- `main.py` - Use get_legacy_google_adapter() for Google provider; admin dashboard includes `providers.google` status
- `requirements.txt` - structlog>=24.1.0
- `tests/test_providers/test_google_provider.py` - test_translate_text_error_fallback asserts error/error_code
**Created:**
- `services/providers/README.md` - Provider documentation
- `tests/test_providers/test_google_integration.py` - Integration tests (26 tests)
### Senior Developer Review (AI)
**Review date:** 2026-02-21
**Findings addressed (auto-fix):**
- CRITICAL: New provider wired into API via LegacyGoogleAdapter; main.py uses get_legacy_google_adapter() for provider "google". TranslationProviderError raised on failure; handle_translation_error returns 429/401/502/400/413.
- HIGH: Timeout applied in _make_api_request via ThreadPoolExecutor.future.result(timeout=self.timeout); FuturesTimeoutError mapped to GOOGLE_NETWORK_ERROR.
- MEDIUM: structlog used when available (fallback to logging); config (timeout, max_retries, retry_delay) read from ProvidersConfig env; admin dashboard GET /admin/dashboard returns `providers.google` (health_check); test_translate_text_error_fallback strengthened.
### Change Log
- 2026-02-21: Code review (AI). Fixes applied: provider wiring, timeout, structlog, config env, admin dashboard provider status, tests. Status → done.