office_translator/_bmad-output/implementation-artifacts/2-2-provider-google-translate.md
Sepehr Ramezani 26bd096a06 feat: production deployment - full update with providers, admin, glossaries, pricing, tests
Major changes across backend, frontend, infrastructure:
- Provider system with model selection (Google, DeepL, OpenAI, Ollama, Google Cloud)
- Admin panel: user management, pricing, settings
- Glossary system with CSV import/export
- Subscription and tier quota management
- Security hardening (rate limiting, API key auth, path traversal fixes)
- Docker compose for dev, prod, and IONOS deployment
- Alembic migrations for new tables
- Frontend: dashboard, pricing page, landing page, i18n (en/fr)
- Test suite and verification scripts

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-04-25 15:01:47 +02:00

Story 2.2: Provider Google Translate

Status: done

Story

As a system, I want to integrate Google Translate API as a production-ready provider with robust error handling and health monitoring, so that users can translate documents in Classic mode reliably without HTTP 500 errors.

Acceptance Criteria

  1. AC1: API Integration - Given GOOGLE_API_KEY is configured, when GoogleTranslateProvider.translate_text() is called, then text is translated using Google Translate API v2 or v3
  2. AC2: Graceful Error Handling - All API errors (quota exceeded, invalid key, network timeout) return structured errors with code and message (never HTTP 500)
  3. AC3: Health Check - Provider is_available() returns True when API key is configured and API is reachable, False otherwise
  4. AC4: Rate Limiting Awareness - Provider handles Google's rate limits gracefully with retry logic and clear error messages
  5. AC5: Integration Tests - Tests verify actual API integration (mocked or with test API key) and error scenarios
  6. AC6: Cost Optimization - Provider uses efficient API calls (batching where possible, minimal quota usage)

Tasks / Subtasks

  • Task 1: Validate Existing Implementation (AC: 1, 2)

    • 1.1 Review services/providers/google_provider.py from Story 2.1
    • 1.2 Verify Google Translate API v2/v3 compatibility
    • 1.3 Test translation with real API key (if available)
    • 1.4 Document any gaps between implementation and AC
  • Task 2: Enhance Error Handling (AC: 2, 4)

    • 2.1 Add specific error codes: GOOGLE_QUOTA_EXCEEDED, GOOGLE_INVALID_KEY, GOOGLE_NETWORK_ERROR, GOOGLE_UNSUPPORTED_LANGUAGE
    • 2.2 Implement retry logic with exponential backoff for transient errors
    • 2.3 Add timeout configuration (default 30s for translation requests)
    • 2.4 Ensure all errors return JSON: {error, message, details?} format
    • 2.5 Log errors with structlog (no document content in logs)
  • Task 3: Improve Health Check (AC: 3)

    • 3.1 Implement is_available() to check API key presence and basic connectivity
    • 3.2 Add optional health_check() method that pings Google Translate API
    • 3.3 Cache health check results (TTL 60s) to avoid unnecessary API calls
    • 3.4 Return detailed status: {available: bool, error?: str, last_check: timestamp}
  • Task 4: Add Integration Tests (AC: 5)

    • 4.1 Create tests/test_providers/test_google_integration.py
    • 4.2 Add mocked tests for all error scenarios
    • 4.3 Add test with real API key (skipped if not available)
    • 4.4 Test rate limit handling and retry logic
    • 4.5 Test health check functionality
  • Task 5: Optimize API Usage (AC: 6)

    • 5.1 Review caching implementation (from Story 2.1)
    • 5.2 Add language detection optimization (skip if source=target)
    • 5.3 Document API usage and cost estimates in code comments
    • 5.4 Add usage metrics logging (character count, API calls)
  • Task 6: Update Documentation (AC: 1-6)

    • 6.1 Update services/providers/README.md (if exists) or create it
    • 6.2 Document environment variables in .env.example
    • 6.3 Add provider-specific error codes to API documentation
    • 6.4 Update admin dashboard to show Google provider status
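Subtasks 3.3 and 3.4 (a health check cached for 60s that returns a detailed status) could look roughly like the sketch below. `HealthStatus`, `CachedHealthCheck`, and the injectable `clock` are hypothetical names used for illustration; they are not taken from `google_provider.py`.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class HealthStatus:
    """Hypothetical record mirroring {available, error?, last_check}."""
    available: bool
    last_check: float
    error: Optional[str] = None


class CachedHealthCheck:
    """Runs a probe at most once per `ttl` seconds and caches the outcome."""

    def __init__(self, probe: Callable[[], object], ttl: float = 60.0,
                 clock: Callable[[], float] = time.time) -> None:
        self._probe = probe    # assumed to raise on failure
        self._ttl = ttl
        self._clock = clock    # injectable so tests do not need to sleep
        self._cached: Optional[HealthStatus] = None

    def check(self) -> HealthStatus:
        now = self._clock()
        cached = self._cached
        if cached is not None and now - cached.last_check < self._ttl:
            return cached      # still fresh: no API call is made
        try:
            self._probe()
            status = HealthStatus(available=True, last_check=now)
        except Exception as exc:  # any probe failure means "unavailable"
            status = HealthStatus(available=False, last_check=now, error=str(exc))
        self._cached = status
        return status
```

Failures are cached as well, so a provider that is down is not probed on every request during the TTL window.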

Dev Notes

🚨 CONTEXT: Previous Story 2.1 Completion

Story 2.1 already migrated the Google provider to the new architecture:

  • Created services/providers/google_provider.py (178-186 lines)
  • Integrated with ProviderRegistry
  • Added basic error handling and caching
  • Created 47 unit tests (all passing)

This story focuses on PRODUCTION READINESS:

  • Robust error handling for all Google API error cases
  • Health monitoring and availability checks
  • Integration testing with real/mock API
  • Cost optimization and usage tracking

Existing Implementation (Story 2.1)

File: services/providers/google_provider.py

Key components already implemented:

  • GoogleTranslationProvider class extending TranslationProvider
  • Basic translate_text() implementation
  • Caching via _translation_cache
  • Environment configuration via GOOGLE_API_KEY

Gaps to Address:

  1. Error handling: Currently returns untranslated text on failure (lines 135-141)
  2. Health check: Basic implementation, needs enhancement
  3. Rate limiting: No explicit handling of Google's quotas
  4. Integration tests: Only unit tests with mocks

Google Translate API v2 vs v3

API v2 (Basic):

  • Endpoint: https://translation.googleapis.com/language/translate/v2
  • Auth: API Key in URL or header
  • Simple REST API
  • Free tier: 500,000 characters/month

API v3 (Advanced):

  • Endpoint: https://translation.googleapis.com/v3/projects/{PROJECT_ID}:translateText
  • Auth: Service Account JSON (more complex)
  • Batch translation support
  • Glossary support (future enhancement)
  • Paid only (no free tier)

Recommendation: Start with API v2 for MVP (simpler, has free tier). Plan v3 for post-MVP.
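For orientation, a direct v2 call boils down to a POST with a JSON body and the API key as a query parameter. The sketch below only builds the request and parses a response, so it can be handed to either `httpx` or `aiohttp`. The helper names `build_v2_request` and `parse_v2_response` are hypothetical, and the completion notes indicate the shipped provider goes through `deep_translator` instead, so treat this as illustrative.

```python
from typing import Any, Dict, List, Optional, Sequence

V2_ENDPOINT = "https://translation.googleapis.com/language/translate/v2"


def build_v2_request(texts: Sequence[str], target: str, api_key: str,
                     source: Optional[str] = None) -> Dict[str, Any]:
    """Return the URL, query params, and JSON body for a v2 translate call."""
    body: Dict[str, Any] = {"q": list(texts), "target": target, "format": "text"}
    if source:
        body["source"] = source  # omit to let Google detect the source language
    return {"url": V2_ENDPOINT, "params": {"key": api_key}, "json": body}


def parse_v2_response(payload: Dict[str, Any]) -> List[str]:
    """Extract translated strings, in request order, from a v2 response body."""
    return [item["translatedText"] for item in payload["data"]["translations"]]
```

With `httpx`, the request could then be sent as `httpx.post(req["url"], params=req["params"], json=req["json"], timeout=30)`.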

Error Codes to Implement

| Code | HTTP | Scenario | Message Template |
|------|------|----------|------------------|
| GOOGLE_QUOTA_EXCEEDED | 429 | API quota exceeded | "Quota Google Translate dépassé. Réessayez demain." |
| GOOGLE_INVALID_KEY | 401 | Invalid API key | "Clé API Google invalide. Contactez l'administrateur." |
| GOOGLE_NETWORK_ERROR | 502 | Network/timeout error | "Service Google Translate indisponible. Réessayez." |
| GOOGLE_UNSUPPORTED_LANGUAGE | 400 | Language not supported | "Langue '{lang}' non supportée par Google." |
| GOOGLE_TEXT_TOO_LONG | 413 | Text exceeds limit | "Texte trop long (max 5000 caractères par requête)." |
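The table above maps naturally onto a lookup structure. This is a minimal sketch, assuming a hypothetical `ErrorSpec` tuple and `describe_error` helper; the real mapping lives in `handle_translation_error` and may be organized differently.

```python
from typing import Dict, NamedTuple, Tuple


class ErrorSpec(NamedTuple):
    http_status: int
    template: str


# Codes, statuses, and templates copied from the table above.
GOOGLE_ERRORS: Dict[str, ErrorSpec] = {
    "GOOGLE_QUOTA_EXCEEDED": ErrorSpec(
        429, "Quota Google Translate dépassé. Réessayez demain."),
    "GOOGLE_INVALID_KEY": ErrorSpec(
        401, "Clé API Google invalide. Contactez l'administrateur."),
    "GOOGLE_NETWORK_ERROR": ErrorSpec(
        502, "Service Google Translate indisponible. Réessayez."),
    "GOOGLE_UNSUPPORTED_LANGUAGE": ErrorSpec(
        400, "Langue '{lang}' non supportée par Google."),
    "GOOGLE_TEXT_TOO_LONG": ErrorSpec(
        413, "Texte trop long (max 5000 caractères par requête)."),
}


def describe_error(code: str, **fields: str) -> Tuple[int, str]:
    """Return (HTTP status, rendered message) for a known error code."""
    spec = GOOGLE_ERRORS[code]
    return spec.http_status, spec.template.format(**fields)
```

For example, `describe_error("GOOGLE_UNSUPPORTED_LANGUAGE", lang="xx")` renders the `{lang}` placeholder; templates without placeholders ignore extra keyword arguments.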

Architecture Compliance

Per _bmad-output/planning-artifacts/architecture.md:

Error Format:

{
  "error": "GOOGLE_QUOTA_EXCEEDED",
  "message": "Quota Google Translate dépassé. Réessayez demain.",
  "details": {
    "provider": "google",
    "reset_at": "2024-01-16T00:00:00Z"
  }
}

Never return HTTP 500 - All errors must be 4xx or 502 (upstream error).
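One way to enforce both the error format and the "never 500" rule in a single place is a small builder like the sketch below. The function name `to_error_response` is hypothetical, and coercing an unexpected status to 502 is an assumption of this sketch, not something the architecture document prescribes.

```python
from typing import Any, Dict, Optional, Tuple


def to_error_response(error: str, message: str, status: int,
                      details: Optional[Dict[str, Any]] = None
                      ) -> Tuple[int, Dict[str, Any]]:
    """Build (status, body) in the {error, message, details?} format."""
    # Assumption: anything outside 4xx/502 is reported as an upstream error.
    if not (400 <= status < 500 or status == 502):
        status = 502
    body: Dict[str, Any] = {"error": error, "message": message}
    if details:
        body["details"] = details  # optional, omitted when empty
    return status, body
```

A web framework handler can then return the body with the given status code without further checks.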

Logging Pattern:

import structlog
logger = structlog.get_logger()

# Good - metadata only
logger.error("google_translation_failed",
    error_code="GOOGLE_QUOTA_EXCEEDED",
    user_id=user_id,
    text_length=len(text),
    source_lang=source_lang,
    target_lang=target_lang
)

# Bad - logs document content
logger.error("translation_failed", text=text)  # ❌ NEVER DO THIS

Testing Strategy

Unit Tests (Mocked):

  • All error scenarios (quota, invalid key, timeout)
  • Health check logic
  • Retry logic
  • Caching behavior

Integration Tests:

  • With GOOGLE_API_KEY in environment: real API calls
  • Without API key: skip integration tests
  • Use pytest markers: @pytest.mark.integration

Test Commands:

# Unit tests only
pytest tests/test_providers/test_google_provider.py -v

# Integration tests (requires API key)
pytest tests/test_providers/test_google_integration.py -v -m integration

# All tests with coverage
pytest tests/test_providers/ --cov=services/providers -v

Performance Considerations

Google Translate Limits:

  • 5000 characters per request
  • 100 requests per second (with billing enabled)
  • 500,000 characters/month free tier

Optimizations:

  1. Caching: Already implemented in Story 2.1 (LRU cache, 5000 entries)
  2. Batching: For large documents, batch multiple segments in single API call
  3. Language Detection: Skip translation if source == target
  4. Rate Limiting: Implement client-side rate limiting to avoid Google's rate-limit and quota rejections (HTTP 403 or 429)
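The batching optimization can be sketched as a greedy packer that groups segments until the next one would exceed the 5000-character request limit. `batch_segments` is a hypothetical helper; how the real provider splits documents is not described here.

```python
from typing import List, Sequence


def batch_segments(segments: Sequence[str], max_chars: int = 5000) -> List[List[str]]:
    """Group segments, in order, into batches of at most max_chars characters."""
    batches: List[List[str]] = []
    current: List[str] = []
    size = 0
    for seg in segments:
        if len(seg) > max_chars:
            # A single oversized segment cannot fit in any batch.
            raise ValueError("segment exceeds max_chars; split it first")
        if current and size + len(seg) > max_chars:
            batches.append(current)  # flush before the limit is crossed
            current, size = [], 0
        current.append(seg)
        size += len(seg)
    if current:
        batches.append(current)
    return batches
```

Each batch can then be sent as one API call, which reduces request count without changing the number of characters billed.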

Configuration

Environment Variables (.env.example):

# Google Translate Provider
GOOGLE_TRANSLATE_ENABLED=true
GOOGLE_API_KEY=your_api_key_here
GOOGLE_TRANSLATE_TIMEOUT=30  # seconds
GOOGLE_TRANSLATE_MAX_RETRIES=3
GOOGLE_TRANSLATE_RETRY_DELAY=1  # initial delay in seconds

Provider Config (services/providers/config.py): Already has basic config from Story 2.1. Enhance with:

  • Timeout settings
  • Retry configuration
  • Rate limiting parameters
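Reading these variables could look like the following sketch. The `GoogleTranslateSettings` dataclass is a hypothetical stand-in for whatever `ProvidersConfig` actually exposes; the defaults mirror the values in `.env.example`.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class GoogleTranslateSettings:
    enabled: bool = True
    timeout: float = 30.0     # seconds
    max_retries: int = 3
    retry_delay: float = 1.0  # initial backoff delay in seconds

    @classmethod
    def from_env(cls, env: Optional[Mapping[str, str]] = None
                 ) -> "GoogleTranslateSettings":
        """Build settings from a mapping; defaults to os.environ."""
        env = os.environ if env is None else env
        enabled_raw = env.get("GOOGLE_TRANSLATE_ENABLED", "true")
        return cls(
            enabled=enabled_raw.strip().lower() in {"1", "true", "yes"},
            timeout=float(env.get("GOOGLE_TRANSLATE_TIMEOUT", "30")),
            max_retries=int(env.get("GOOGLE_TRANSLATE_MAX_RETRIES", "3")),
            retry_delay=float(env.get("GOOGLE_TRANSLATE_RETRY_DELAY", "1")),
        )
```

Accepting a mapping argument keeps the loader testable without mutating the process environment.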

Project Structure Notes

Files to Modify:

  • services/providers/google_provider.py - Enhance error handling, health check
  • services/providers/config.py - Add new configuration options
  • tests/test_providers/test_google_provider.py - Add more test cases

Files to Create:

  • tests/test_providers/test_google_integration.py - Integration tests
  • services/providers/README.md - Provider documentation (optional)

Naming Conventions:

  • Python: snake_case for files, functions, variables
  • Classes: PascalCase
  • Error codes: UPPER_SNAKE_CASE
  • JSON fields: snake_case

References

  • [Source: _bmad-output/planning-artifacts/architecture.md#Error Handling]
  • [Source: _bmad-output/planning-artifacts/architecture.md#API Response Formats]
  • [Source: _bmad-output/planning-artifacts/epics.md#Story 2.2]
  • [Source: _bmad-output/planning-artifacts/prd.md#FR6 Google/DeepL providers]
  • [Source: _bmad-output/planning-artifacts/prd.md#NFR12 Zero HTTP 500 errors]
  • [Source: _bmad-output/planning-artifacts/prd.md#NFR13 Provider fallback]
  • [Source: _bmad-output/implementation-artifacts/2-1-abstraction-provider-base-registry.md]

Previous Story Intelligence (Story 2.1)

What Worked Well:

  • Abstract base class design is solid and extensible
  • Registry pattern allows easy provider swapping
  • Caching significantly improves performance for repeated translations
  • 47 unit tests provide good coverage

Challenges Encountered:

  • Silent failure on error (returned untranslated text) - FIXED in code review
  • Need better error indication in TranslationResponse
  • Integration tests deferred

Learnings to Apply:

  • Use error field in TranslationResponse for failure detection
  • Add comprehensive error codes for each failure scenario
  • Test with real API during development if possible
  • Document API limits and quotas clearly

Dependencies

Internal:

  • services/providers/base.py - TranslationProvider abstract class
  • services/providers/registry.py - ProviderRegistry
  • services/providers/config.py - Configuration
  • services/providers/schemas.py - TranslationRequest/Response models

External:

  • google-api-python-client - Google API client library (if using v3)
  • httpx or aiohttp - HTTP client for v2 API
  • structlog - Structured logging

Security Considerations

API Key Protection:

  • Never log API key
  • Load from environment variable only
  • Validate key format on startup (basic check)

Rate Limiting:

  • Implement client-side rate limiting to protect quota
  • Track usage per user to prevent abuse
  • Log usage metrics for billing/monitoring
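Client-side rate limiting is often done with a token bucket. The sketch below is one possible approach, not the provider's actual mechanism; `TokenBucket` and its parameters are hypothetical, and the class is not thread-safe as written.

```python
import time
from typing import Callable


class TokenBucket:
    """Allows bursts up to `capacity`, refilled at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._rate = rate
        self._capacity = capacity
        self._tokens = capacity  # start with a full bucket
        self._clock = clock
        self._last = clock()

    def try_acquire(self, cost: float = 1.0) -> bool:
        """Consume `cost` tokens if available; never blocks."""
        now = self._clock()
        elapsed = now - self._last
        self._tokens = min(self._capacity, self._tokens + elapsed * self._rate)
        self._last = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

Keeping one bucket per user would cover the per-user tracking mentioned above; using the character count as `cost` would throttle by quota consumption rather than by request count.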

Data Privacy:

  • Never log document content (NFR11)
  • Only log metadata: text length, languages, timestamps
  • Clear cache entries after TTL (security + privacy)
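Combining the 5000-entry LRU cache with TTL-based expiry could be sketched as follows. `TTLCache` is a hypothetical class, and the one-hour default TTL is an assumption, since the story does not state a value for the translation cache.

```python
import time
from collections import OrderedDict
from typing import Callable, Hashable, Optional, Tuple


class TTLCache:
    """LRU cache whose entries also expire after `ttl` seconds."""

    def __init__(self, max_entries: int = 5000, ttl: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._data: "OrderedDict[Hashable, Tuple[float, str]]" = OrderedDict()
        self._max = max_entries
        self._ttl = ttl
        self._clock = clock

    def get(self, key: Hashable) -> Optional[str]:
        item = self._data.get(key)
        if item is None:
            return None
        expires_at, value = item
        if self._clock() >= expires_at:
            del self._data[key]      # expired: drop the cached text
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key: Hashable, value: str) -> None:
        self._data[key] = (self._clock() + self._ttl, value)
        self._data.move_to_end(key)
        while len(self._data) > self._max:
            self._data.popitem(last=False)  # evict least recently used
```

Expired entries are removed lazily on lookup here; a periodic sweep would be needed if translated text must not linger in memory past the TTL.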

Dev Agent Record

Agent Model Used

Claude (glm-5)

Debug Log References

None - all tests passed without requiring debugging.

Completion Notes List

  • Task 1: Validated the existing implementation from Story 2.1. It uses the deep_translator library, which wraps Google Translate and does not require an API key.
  • Task 2: Implemented 5 error codes (GOOGLE_QUOTA_EXCEEDED, GOOGLE_INVALID_KEY, GOOGLE_NETWORK_ERROR, GOOGLE_UNSUPPORTED_LANGUAGE, GOOGLE_TEXT_TOO_LONG). Added retry logic with exponential backoff (max 3 retries, 1s initial delay). Added timeout configuration (default 30s). Error responses use {error, message, details} format.
  • Task 3: Enhanced health check with caching (60s TTL). Added last_check timestamp to ProviderHealthStatus.
  • Task 4: Created comprehensive integration tests (26 tests total): error codes, retry logic, timeout, error format, logging, health check, real API tests.
  • Task 5: Added optimization to skip translation when source==target language. Added usage metrics logging (character count, API calls). Documented API usage and costs in code comments.
  • Task 6: Created services/providers/README.md with provider documentation. Updated .env.example with Google Translate configuration options.
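The retry behavior described for Task 2 (max 3 retries, 1s initial delay, exponential backoff) can be sketched as below. `TransientError` and `retry_with_backoff` are hypothetical names, and doubling the delay on each attempt is an assumption about the backoff factor.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for retryable failures (timeouts, rate limits)."""


def retry_with_backoff(func: Callable[[], T], max_retries: int = 3,
                       initial_delay: float = 1.0,
                       sleep: Callable[[float], None] = time.sleep) -> T:
    """Call func, retrying transient failures with doubling delays."""
    delay = initial_delay
    for attempt in range(max_retries + 1):  # first try plus max_retries retries
        try:
            return func()
        except TransientError:
            if attempt == max_retries:
                raise                        # retries exhausted
            sleep(delay)
            delay *= 2                       # 1s, 2s, 4s, ...
    raise AssertionError("unreachable")
```

Only `TransientError` is retried; permanent failures such as an invalid key propagate immediately, which avoids wasting quota on requests that cannot succeed.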

File List

Modified:

  • services/providers/google_provider.py - Enhanced error handling, retry logic, health check, optimizations; timeout applied via executor; structlog; config from env; LegacyGoogleAdapter
  • services/providers/schemas.py - Added error_code, error_details, to_error_dict() to TranslationResponse; added last_check to ProviderHealthStatus
  • services/providers/config.py - GOOGLE_TRANSLATE_TIMEOUT, GOOGLE_TRANSLATE_MAX_RETRIES, GOOGLE_TRANSLATE_RETRY_DELAY from env
  • .env.example - Added Google Translate configuration options
  • utils/exceptions.py - TranslationProviderError; handle_translation_error maps Google error codes to HTTP 429/401/502/400/413
  • utils/__init__.py - Export TranslationProviderError
  • main.py - Use get_legacy_google_adapter() for Google provider; admin dashboard includes providers.google status
  • requirements.txt - structlog>=24.1.0
  • tests/test_providers/test_google_provider.py - test_translate_text_error_fallback asserts error/error_code

Created:

  • services/providers/README.md - Provider documentation
  • tests/test_providers/test_google_integration.py - Integration tests (26 tests)

Senior Developer Review (AI)

Review date: 2026-02-21

Findings addressed (auto-fix):

  • CRITICAL: New provider wired into API via LegacyGoogleAdapter; main.py uses get_legacy_google_adapter() for provider "google". TranslationProviderError raised on failure; handle_translation_error returns 429/401/502/400/413.
  • HIGH: Timeout applied in _make_api_request by running the call in a ThreadPoolExecutor and waiting with future.result(timeout=self.timeout); FuturesTimeoutError is mapped to GOOGLE_NETWORK_ERROR.
  • MEDIUM: structlog used when available (fallback to logging); config (timeout, max_retries, retry_delay) read from ProvidersConfig env; admin dashboard GET /admin/dashboard returns providers.google (health_check); test_translate_text_error_fallback strengthened.
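The executor-based timeout from the HIGH finding can be illustrated with this sketch. `call_with_timeout` and `GoogleNetworkError` are hypothetical stand-ins; the actual `_make_api_request` may differ in structure.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FuturesTimeoutError
from typing import Any, Callable


class GoogleNetworkError(Exception):
    """Hypothetical exception carrying the GOOGLE_NETWORK_ERROR code."""
    code = "GOOGLE_NETWORK_ERROR"


def call_with_timeout(func: Callable[..., Any], timeout: float, *args: Any) -> Any:
    """Run a blocking call in a worker thread and give up after `timeout` seconds."""
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(func, *args)
    try:
        return future.result(timeout=timeout)
    except FuturesTimeoutError as exc:
        raise GoogleNetworkError(f"request exceeded {timeout}s") from exc
    finally:
        # Do not wait for a stuck worker; the thread finishes in the background.
        executor.shutdown(wait=False)
```

This bounds how long the caller waits, but it does not cancel the underlying request: the worker thread keeps running until the blocking call returns.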

Change Log

  • 2026-02-21: Code review (AI). Fixes applied: provider wiring, timeout, structlog, config env, admin dashboard provider status, tests. Status → done.