Story 2.5: Provider OpenAI (LLM Cloud)
Status: done
Story
As a system, I want to integrate OpenAI API as an LLM provider, so that Pro users can translate documents with GPT models.
Acceptance Criteria
- AC1: API Integration - Given `OPENAI_API_KEY` is configured in the environment, when `OpenAIProvider.translate_text()` is called, then text is translated using GPT-4 or the specified model
- AC2: Custom System Prompt - Custom system prompt can be injected via request metadata to guide translation context
- AC3: Rate Limiting - API rate limits return error "PROVIDER_RATE_LIMITED" with retry suggestion (HTTP 429)
- AC4: Invalid Key Handling - Invalid API key returns error "OPENAI_INVALID_KEY" with HTTP 401
- AC5: Graceful Error Handling - All errors return structured JSON (never HTTP 500) with French messages
- AC6: Health Check - Provider `is_available()` returns `True` when API key is valid and service is reachable
- AC7: Registry Integration - Provider is registered in `ProviderRegistry` and appears in fallback chain
- AC8: Unit Tests - Tests verify all error scenarios, rate limiting handling, and mock OpenAI API responses
Tasks / Subtasks
- Task 1: Create OpenAI Provider Implementation (AC: 1, 2)
  - 1.1 Create `services/providers/openai_provider.py`
  - 1.2 Implement `OpenAITranslationProvider` class extending `TranslationProvider`
  - 1.3 Implement `translate_text()` using OpenAI Chat Completions API
  - 1.4 Support custom system prompt injection via request metadata
  - 1.5 Configure default translation system prompt with temperature 0.3
- Task 2: Implement Error Handling (AC: 3, 4, 5)
  - 2.1 Define error codes: `OPENAI_RATE_LIMITED`, `OPENAI_INVALID_KEY`, `OPENAI_QUOTA_EXCEEDED`, `OPENAI_TIMEOUT`, `OPENAI_SERVICE_ERROR`, `OPENAI_CONTEXT_TOO_LONG`
  - 2.2 Implement `OpenAIProviderError` exception class (follow Ollama pattern)
  - 2.3 Map OpenAI API errors to structured error responses with French messages
  - 2.4 Add retry logic with exponential backoff for rate limits and timeouts
  - 2.5 Add timeout configuration (default 60s for OpenAI - faster than Ollama)
  - 2.6 Handle specific OpenAI errors: rate_limit_exceeded, insufficient_quota, invalid_api_key
- Task 3: Implement Health Check (AC: 6)
  - 3.1 Implement `is_available()` to validate API key and service reachability
  - 3.2 Add `health_check()` with caching (TTL 60s) matching existing provider pattern
  - 3.3 Make lightweight API call to verify credentials (e.g., list models or simple completion)
  - 3.4 Return `ProviderHealthStatus` with availability, latency, and model info
- Task 4: Registry Integration (AC: 7)
  - 4.1 Add `register_openai_provider()` function
  - 4.2 Add `get_openai_provider()` singleton function
  - 4.3 Update `services/providers/__init__.py` to auto-register OpenAI when `OPENAI_ENABLED=true`
  - 4.4 Verify provider appears in fallback chain when configured
- Task 5: Configuration Updates (AC: 1, 2)
  - 5.1 Verify `OPENAI_API_KEY`, `OPENAI_MODEL`, `OPENAI_ENABLED` in `config.py` (already present)
  - 5.2 Add OpenAI-specific configuration options to `config.py`:
    - `OPENAI_TIMEOUT=60` (faster than Ollama's 120s)
    - `OPENAI_MAX_RETRIES=3`
    - `OPENAI_RETRY_DELAY=1.0`
    - `OPENAI_BASE_URL` (optional, for custom endpoints like Azure OpenAI)
  - 5.3 Update `.env.example` with OpenAI-specific config
- Task 6: Create Unit Tests (AC: 8)
  - 6.1 Create `tests/test_providers/test_openai_provider.py`
  - 6.2 Test successful translation with mocked OpenAI API
  - 6.3 Test all error scenarios (rate limited, invalid key, quota exceeded, timeout)
  - 6.4 Test custom system prompt injection
  - 6.5 Test retry logic for rate limits
  - 6.6 Test health check functionality
  - 6.7 Test registry integration
- Task 7: Update Documentation (AC: 1-8)
  - 7.1 Update `services/providers/README.md` with OpenAI section
  - 7.2 Document OpenAI setup requirements (API key from platform.openai.com)
  - 7.3 Document supported models and pricing considerations
  - 7.4 Document rate limiting behavior and retry strategy
Dev Notes
OpenAI API Specifics
OpenAI Chat Completions API:
| Endpoint | Method | Purpose |
|---|---|---|
| `/v1/chat/completions` | POST | Generate translation |
| `/v1/models` | GET | List available models (for health check) |
API Request Format:
OPENAI_API_URL = "https://api.openai.com/v1/chat/completions"
headers = {
    "Authorization": f"Bearer {OPENAI_API_KEY}",
    "Content-Type": "application/json"
}
payload = {
    "model": "gpt-4o-mini",  # or gpt-4, gpt-3.5-turbo
    "messages": [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": text_to_translate}
    ],
    "temperature": 0.3,  # Lower for consistent translation
    "max_tokens": 4096   # Adjust based on expected output
}
API Response Format:
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "gpt-4o-mini",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Bonjour, comment allez-vous?"
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 50,
    "completion_tokens": 10,
    "total_tokens": 60
  }
}
OpenAI Error Codes:
| OpenAI Error | HTTP | Mapped Code | French Message |
|---|---|---|---|
| `rate_limit_exceeded` | 429 | `OPENAI_RATE_LIMITED` | "Limite de requêtes OpenAI atteinte. Réessayez dans {retry_after}s." |
| `insufficient_quota` | 429 | `OPENAI_QUOTA_EXCEEDED` | "Quota OpenAI épuisé. Vérifiez votre facturation." |
| `invalid_api_key` | 401 | `OPENAI_INVALID_KEY` | "Clé API OpenAI invalide. Vérifiez votre configuration." |
| `context_length_exceeded` | 400 | `OPENAI_CONTEXT_TOO_LONG` | "Texte trop long (max {max_tokens} tokens)." |
| `server_error` | 500 | `OPENAI_SERVICE_ERROR` | "Service OpenAI temporairement indisponible." |
| Timeout | - | `OPENAI_TIMEOUT` | "Délai d'attente OpenAI dépassé." |
Recommended Models for Translation
| Model | Cost | Speed | Quality | Best For |
|---|---|---|---|---|
| `gpt-4o-mini` | $0.15/M tokens | Fast | Good | Default choice, cost-effective |
| `gpt-4o` | $2.50/M tokens | Medium | Excellent | High-quality requirements |
| `gpt-4` | $30/M tokens | Slower | Excellent | Critical translations |
| `gpt-3.5-turbo` | $0.50/M tokens | Fastest | Good | Speed priority |
Default: gpt-4o-mini (best value for translation)
Default System Prompt for Translation
from typing import Optional

DEFAULT_TRANSLATION_PROMPT = """You are a professional translator. Translate the following text from {source_lang} to {target_lang}.
Rules:
- Translate ONLY the text, do not add explanations or notes
- Preserve the original formatting, line breaks, and structure
- Maintain the original tone and style
- For technical terms, use the standard translation in the target language
- If the text contains proper nouns or brand names, keep them unchanged unless there's a well-known translation"""

def _build_system_prompt(
    source_lang: str,
    target_lang: str,
    custom_prompt: Optional[str] = None
) -> str:
    # A custom prompt from request metadata replaces the default entirely
    if custom_prompt:
        return custom_prompt
    return DEFAULT_TRANSLATION_PROMPT.format(
        source_lang=source_lang,
        target_lang=target_lang
    )
Architecture Compliance
Per _bmad-output/planning-artifacts/architecture.md:
Error Format:
{
  "error": "OPENAI_RATE_LIMITED",
  "message": "Limite de requêtes OpenAI atteinte. Réessayez dans 20s.",
  "details": {
    "provider": "openai",
    "retry_after_seconds": 20,
    "model": "gpt-4o-mini"
  }
}
Never return HTTP 500 - All errors must be 4xx or 502 (upstream error).
Naming Conventions:
- File: `openai_provider.py` (snake_case)
- Class: `OpenAITranslationProvider` (PascalCase)
- Error codes: `OPENAI_*` (UPPER_SNAKE_CASE)
- JSON fields: snake_case
Previous Story Intelligence (Story 2.4 - Ollama)
What Worked Well:
- `httpx` library for HTTP requests (supports async and sync)
- Error codes with `to_dict()` method for consistent formatting
- Retry logic with exponential backoff for transient errors
- Health check with 60s TTL caching
- Thread-safe singleton pattern for provider instance
- Structlog-compatible logging with keyword args
- Language name mapping for better LLM understanding
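The health check caching noted above can be sketched as a small wrapper around a probe function. This is a minimal illustration, not the provider's actual code: `CachedHealthCheck`, `probe`, and the injectable `clock` are hypothetical names, and the real `health_check()` returns a richer `ProviderHealthStatus`.

```python
import threading
import time
from typing import Callable, Optional


class CachedHealthCheck:
    """Caches the result of a health probe for a fixed TTL (hypothetical helper)."""

    def __init__(
        self,
        probe: Callable[[], bool],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._probe = probe
        self._ttl = ttl_seconds
        self._clock = clock
        self._lock = threading.Lock()
        self._cached: Optional[bool] = None
        self._checked_at = 0.0

    def is_available(self) -> bool:
        with self._lock:
            now = self._clock()
            expired = now - self._checked_at >= self._ttl
            if self._cached is None or expired:
                try:
                    self._cached = bool(self._probe())
                except Exception:
                    # A failing probe counts as "unavailable" rather than crashing
                    self._cached = False
                self._checked_at = now
            return self._cached
```

Injecting the clock keeps the TTL behaviour testable without sleeping for 60 seconds.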
Patterns to Reuse:
import time
from typing import Any, Dict, Optional

# Error codes pattern
OPENAI_RATE_LIMITED = "OPENAI_RATE_LIMITED"
OPENAI_INVALID_KEY = "OPENAI_INVALID_KEY"
OPENAI_QUOTA_EXCEEDED = "OPENAI_QUOTA_EXCEEDED"
OPENAI_TIMEOUT = "OPENAI_TIMEOUT"
OPENAI_SERVICE_ERROR = "OPENAI_SERVICE_ERROR"
OPENAI_CONTEXT_TOO_LONG = "OPENAI_CONTEXT_TOO_LONG"
_RETRYABLE_ERRORS = {OPENAI_RATE_LIMITED, OPENAI_TIMEOUT, OPENAI_SERVICE_ERROR}

# Exception class pattern
class OpenAIProviderError(Exception):
    def __init__(self, code: str, message: str, details: Optional[Dict[str, Any]] = None):
        self.code = code
        self.message = message
        self.details = details or {}
        super().__init__(message)

    def to_dict(self) -> Dict[str, Any]:
        result = {"error": self.code, "message": self.message}
        if self.details:
            result["details"] = self.details
        return result

# Retry logic pattern (a method of the provider class)
def _translate_with_retry(self, text: str, system_prompt: str) -> str:
    last_error = None
    for attempt in range(self.max_retries + 1):
        try:
            return self._make_api_request(text, system_prompt)
        except OpenAIProviderError as e:
            last_error = e
            # Give up on non-retryable errors or once retries are exhausted
            if e.code not in _RETRYABLE_ERRORS or attempt == self.max_retries:
                raise
            delay = self.retry_delay * (2 ** attempt)
            time.sleep(delay)
    raise last_error
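To make the backoff schedule concrete, the delays produced by the retry pattern above can be computed directly. This is a worked example only; `backoff_delays` and `worst_case_seconds` are hypothetical helpers, and the worst case assumes every attempt runs into the full request timeout.

```python
from typing import List


def backoff_delays(retry_delay: float, max_retries: int) -> List[float]:
    """Sleep durations between attempts: retry_delay * 2**attempt."""
    return [retry_delay * (2 ** attempt) for attempt in range(max_retries)]


def worst_case_seconds(timeout: float, retry_delay: float, max_retries: int) -> float:
    """Upper bound when all (max_retries + 1) attempts time out."""
    attempts = max_retries + 1
    return attempts * timeout + sum(backoff_delays(retry_delay, max_retries))
```

With the defaults from this story (`OPENAI_RETRY_DELAY=1.0`, `OPENAI_MAX_RETRIES=3`, `OPENAI_TIMEOUT=60`), the sleeps are 1 s, 2 s, and 4 s, and a request that times out on every attempt can block for up to 247 s.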
Key Differences from Ollama:
- Requires API key authentication (Bearer token)
- Uses OpenAI's specific error codes and headers
- Rate limiting is stricter (pay-per-use)
- Faster response times (60s timeout vs 120s)
- No model "pulling" concept - models are always available
- Quota management is critical (billing impact)
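Because OpenAI reports failures through its own error codes, the provider needs a translation step into the `OPENAI_*` codes. The sketch below follows the mapping table in the Dev Notes; `map_openai_error` is a hypothetical helper, the error body shape (`{"error": {"code": ..., "type": ...}}`) is an assumption about the API response, and the status-based fallbacks are an illustrative choice.

```python
from typing import Any, Dict, Optional

_OPENAI_TO_INTERNAL = {
    "rate_limit_exceeded": "OPENAI_RATE_LIMITED",
    "insufficient_quota": "OPENAI_QUOTA_EXCEEDED",
    "invalid_api_key": "OPENAI_INVALID_KEY",
    "context_length_exceeded": "OPENAI_CONTEXT_TOO_LONG",
    "server_error": "OPENAI_SERVICE_ERROR",
}


def map_openai_error(status_code: int, body: Optional[Dict[str, Any]]) -> str:
    """Return the internal error code for an OpenAI error response."""
    error = (body or {}).get("error")
    if isinstance(error, dict):
        # Prefer the specific code, then fall back to the broader type
        for key in ("code", "type"):
            value = error.get(key)
            if isinstance(value, str) and value in _OPENAI_TO_INTERNAL:
                return _OPENAI_TO_INTERNAL[value]
    # Assumed fallbacks when the body is missing or unrecognised
    if status_code == 401:
        return "OPENAI_INVALID_KEY"
    if status_code == 429:
        return "OPENAI_RATE_LIMITED"
    return "OPENAI_SERVICE_ERROR"
```

Checking the body before the status code matters for HTTP 429, which covers both rate limiting (retryable) and quota exhaustion (not retryable).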
File Structure
Files to Create:
- `services/providers/openai_provider.py` - Main OpenAI provider implementation
- `tests/test_providers/test_openai_provider.py` - Unit tests
Files to Modify:
- `services/providers/__init__.py` - Add OpenAI auto-registration
- `services/providers/config.py` - Add OPENAI_TIMEOUT, OPENAI_MAX_RETRIES, OPENAI_RETRY_DELAY, OPENAI_BASE_URL
- `.env.example` - Add OpenAI-specific configuration options
- `services/providers/README.md` - Add OpenAI documentation
Error Codes to Implement
| Code | HTTP | Scenario | Message Template |
|---|---|---|---|
| `OPENAI_RATE_LIMITED` | 429 | Rate limit hit | "Limite de requêtes atteinte. Réessayez dans {retry_after}s." |
| `OPENAI_INVALID_KEY` | 401 | Invalid API key | "Clé API invalide. Vérifiez OPENAI_API_KEY." |
| `OPENAI_QUOTA_EXCEEDED` | 429 | Billing quota exceeded | "Quota épuisé. Vérifiez votre facturation OpenAI." |
| `OPENAI_TIMEOUT` | 502 | Request timeout | "Délai dépassé. Le service est lent." |
| `OPENAI_SERVICE_ERROR` | 502 | OpenAI server error | "Service temporairement indisponible." |
| `OPENAI_CONTEXT_TOO_LONG` | 413 | Context exceeds model limit | "Texte trop long (max {max_tokens} tokens)." |
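The HTTP column of this table can be held in a single lookup so that the router never emits a 500. A minimal sketch, assuming a hypothetical `http_status_for` helper; defaulting unknown codes to 502 is an assumption chosen to stay within the "4xx or 502" rule.

```python
HTTP_STATUS_BY_ERROR = {
    "OPENAI_RATE_LIMITED": 429,
    "OPENAI_INVALID_KEY": 401,
    "OPENAI_QUOTA_EXCEEDED": 429,
    "OPENAI_TIMEOUT": 502,
    "OPENAI_SERVICE_ERROR": 502,
    "OPENAI_CONTEXT_TOO_LONG": 413,
}


def http_status_for(error_code: str) -> int:
    """Return the HTTP status for an internal error code (never 500)."""
    # Unknown codes are treated as an upstream failure (assumption)
    return HTTP_STATUS_BY_ERROR.get(error_code, 502)
```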
Configuration
Environment Variables (.env.example):
# OpenAI Provider (Cloud LLM)
OPENAI_ENABLED=true
OPENAI_API_KEY=sk-proj-xxxxxxxxxxxxxxxxxxxxxxxx
OPENAI_MODEL=gpt-4o-mini
OPENAI_TIMEOUT=60
OPENAI_MAX_RETRIES=3
OPENAI_RETRY_DELAY=1.0
# OPENAI_BASE_URL=https://api.openai.com/v1 # Optional: for Azure OpenAI or proxies
Provider Config (services/providers/config.py):
Add to existing OpenAI section:
OPENAI_TIMEOUT: int = int(os.getenv("OPENAI_TIMEOUT", "60"))
OPENAI_MAX_RETRIES: int = int(os.getenv("OPENAI_MAX_RETRIES", "3"))
OPENAI_RETRY_DELAY: float = float(os.getenv("OPENAI_RETRY_DELAY", "1.0"))
OPENAI_BASE_URL: str = os.getenv("OPENAI_BASE_URL", "https://api.openai.com/v1")
Testing Strategy
Unit Tests (Mocked):
- Mock `httpx` or `requests` responses
- Test successful translation
- Test all error scenarios (rate limit, invalid key, quota exceeded, timeout)
- Test custom system prompt injection
- Test health check logic
- Test retry logic for rate limits
- Test registry integration
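As an illustration of the mocked approach listed above, the test below replaces the HTTP client with a `MagicMock` and checks both the returned translation and the system prompt that was sent. `request_translation` is a simplified, hypothetical stand-in for the provider's request method, and the URL is a placeholder.

```python
from unittest.mock import MagicMock


def request_translation(client, url: str, model: str, system_prompt: str, text: str) -> str:
    """Simplified stand-in for the provider's API call."""
    response = client.post(
        url,
        json={
            "model": model,
            "messages": [
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": text},
            ],
            "temperature": 0.3,
        },
    )
    return response.json()["choices"][0]["message"]["content"]


def test_request_translation_uses_system_prompt():
    client = MagicMock()
    # The mocked response mimics the Chat Completions payload shape
    client.post.return_value.json.return_value = {
        "choices": [{"message": {"role": "assistant", "content": "Bonjour"}}]
    }
    result = request_translation(
        client, "https://example.invalid/chat/completions",
        "gpt-4o-mini", "Translate to French.", "Hello",
    )
    assert result == "Bonjour"
    sent = client.post.call_args.kwargs["json"]
    assert sent["messages"][0] == {"role": "system", "content": "Translate to French."}
    assert sent["messages"][1]["content"] == "Hello"
```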
Test Commands:
# Unit tests only
pytest tests/test_providers/test_openai_provider.py -v
# All provider tests
pytest tests/test_providers/ -v
# With coverage
pytest tests/test_providers/ --cov=services/providers -v
Logging Pattern
try:
    import structlog
    logger = structlog.get_logger(__name__)
    _HAS_STRUCTLOG = True
except ImportError:
    import logging
    logger = logging.getLogger(__name__)
    _HAS_STRUCTLOG = False

def _log_info(event: str, **kwargs):
    """Log info with structlog or standard logging compatibility."""
    if _HAS_STRUCTLOG:
        logger.info(event, **kwargs)
    else:
        msg = f"{event} " + " ".join(f"{k}={v}" for k, v in kwargs.items())
        logger.info(msg)

def _log_error(event: str, **kwargs):
    """Error-level counterpart of _log_info."""
    if _HAS_STRUCTLOG:
        logger.error(event, **kwargs)
    else:
        msg = f"{event} " + " ".join(f"{k}={v}" for k, v in kwargs.items())
        logger.error(msg)

# Good - metadata only (NO document content)
_log_info(
    "openai_translation_success",
    chars=len(text),
    source_lang=source_language,
    target_lang=target_language,
    model=self._model,
    latency_ms=round(latency * 1000, 2),
    tokens_used=response.usage.total_tokens,
)
_log_error(
    "openai_translation_failed",
    error_code=error.code,
    text_length=len(text),
    source_lang=source_language,
    target_lang=target_language,
    model=self._model,
)
Dependencies
Internal:
- `services/providers/base.py` - TranslationProvider abstract class
- `services/providers/registry.py` - ProviderRegistry
- `services/providers/config.py` - Configuration
- `services/providers/schemas.py` - TranslationRequest/Response models
External:
- `httpx` - HTTP client (preferred for async/sync support)
- `structlog` or standard `logging` - Structured logging
HTTP Client Pattern
Use httpx for OpenAI API calls:
import httpx

class OpenAITranslationProvider(TranslationProvider):
    def __init__(self, api_key: str, model: str = "gpt-4o-mini", timeout: int = 60, base_url: str = "https://api.openai.com/v1"):
        self._api_key = api_key
        self._model = model
        self._base_url = base_url.rstrip("/")
        self._timeout = timeout
        self._client = httpx.Client(
            timeout=timeout,
            headers={
                "Authorization": f"Bearer {api_key}",
                "Content-Type": "application/json"
            }
        )

    def _make_api_request(self, text: str, system_prompt: str) -> str:
        # base_url already ends in /v1, so only the resource path is appended
        response = self._client.post(
            f"{self._base_url}/chat/completions",
            json={
                "model": self._model,
                "messages": [
                    {"role": "system", "content": system_prompt},
                    {"role": "user", "content": text}
                ],
                "temperature": 0.3,
                "max_tokens": 4096
            }
        )
        # ... error handling based on status code
        return response.json()["choices"][0]["message"]["content"]
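The final line of the pattern above indexes straight into the response, which raises `KeyError` or `IndexError` on a malformed payload (empty `choices`, missing `content`). A defensive variant might look like the sketch below; `extract_translation` is a hypothetical helper, and it raises `ValueError` only to stay self-contained, where the provider would presumably raise `OpenAIProviderError` with `OPENAI_SERVICE_ERROR`.

```python
from typing import Any


def extract_translation(payload: Any) -> str:
    """Pull the translated text out of a Chat Completions payload, or fail clearly."""
    try:
        content = payload["choices"][0]["message"]["content"]
    except (KeyError, IndexError, TypeError) as exc:
        raise ValueError("malformed OpenAI response: no message content") from exc
    if not isinstance(content, str) or not content.strip():
        raise ValueError("malformed OpenAI response: empty content")
    return content
```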
Security Considerations
API Key Management:
- API key stored in environment variable (never in code)
- Key validated at initialization
- Never log the API key (only last 4 characters if needed for debugging)
Data Privacy:
- Never log document content (NFR11)
- Only log metadata: text length, languages, model, timestamps
- OpenAI may retain data per their privacy policy (different from Ollama's local processing)
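For the API key rule above (at most the last 4 characters in logs), a masking helper keeps the check in one place. This is a minimal sketch; `mask_api_key` is a hypothetical name, and fully masking short keys is an assumption made so that a short key is never revealed in full.

```python
def mask_api_key(api_key: str, visible: int = 4) -> str:
    """Return a log-safe representation showing at most the last `visible` characters."""
    if not api_key:
        return "<unset>"
    if len(api_key) <= visible:
        # Too short to reveal any part safely
        return "****"
    return "****" + api_key[-visible:]
```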
Pro Feature Integration
Per PRD FR26: "Pro users can access LLM translation modes"
This provider will be used when:
- User tier is "pro"
- User selects "LLM" mode
- User selects "OpenAI" as LLM provider
The tier check happens in the translation service/router, not in the provider itself.
Rate Limiting Handling
OpenAI returns rate limit info in response headers:
- `x-ratelimit-limit-requests`
- `x-ratelimit-remaining-requests`
- `x-ratelimit-reset-requests`
Extract retry_after from error response or use exponential backoff.
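One way to derive the wait time is sketched below. It assumes the reset header carries a compact duration such as `1s` or `6m0s` and that a numeric `retry-after` header may also be present; both formats should be verified against the OpenAI documentation. `parse_duration` and `retry_after_seconds` are hypothetical helpers, and header names are looked up in lowercase (a case-insensitive mapping such as `httpx.Headers` avoids that concern).

```python
import re
from typing import Mapping, Optional

# "ms" must come before "m" so that "20ms" is not read as minutes
_DURATION_PART = re.compile(r"(\d+(?:\.\d+)?)(ms|h|m|s)")
_UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0, "ms": 0.001}


def parse_duration(value: str) -> Optional[float]:
    """Parse strings like '1s' or '6m0s' into seconds; None if unrecognised."""
    value = value.strip()
    parts = _DURATION_PART.findall(value)
    # Reject input with leftover characters the pattern did not consume
    if not parts or "".join(number + unit for number, unit in parts) != value:
        return None
    return sum(float(number) * _UNIT_SECONDS[unit] for number, unit in parts)


def retry_after_seconds(headers: Mapping[str, str], fallback: float) -> float:
    """Prefer retry-after, then the reset header, then the backoff fallback."""
    retry_after = headers.get("retry-after")
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # e.g. an HTTP-date; fall through to the other sources
    reset = headers.get("x-ratelimit-reset-requests")
    if reset is not None:
        parsed = parse_duration(reset)
        if parsed is not None:
            return parsed
    return fallback
```

The `fallback` argument is where the exponential backoff delay for the current attempt would be passed in.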
References
- [Source: _bmad-output/planning-artifacts/architecture.md#Error Handling]
- [Source: _bmad-output/planning-artifacts/architecture.md#API Response Formats]
- [Source: _bmad-output/planning-artifacts/epics.md#Story 2.5]
- [Source: _bmad-output/planning-artifacts/prd.md#FR7 LLM providers (Ollama, OpenAI)]
- [Source: _bmad-output/planning-artifacts/prd.md#NFR12 Zero HTTP 500 errors]
- [Source: _bmad-output/implementation-artifacts/2-4-provider-ollama-llm-local.md]
- [Source: services/providers/ollama_provider.py - Implementation pattern]
- [Source: https://platform.openai.com/docs/api-reference/chat - OpenAI API docs]
- [Source: https://platform.openai.com/docs/guides/error-codes - OpenAI Error Codes]
Dev Agent Record
Agent Model Used
Claude (GLM-5) via opencode
Debug Log References
- Fixed test mocking issues for registry integration tests
- Resolved ProvidersConfig import path in tests
Completion Notes List
- ✅ Implemented `OpenAITranslationProvider` class with full OpenAI Chat Completions API integration
- ✅ All 6 error codes implemented with French messages: OPENAI_RATE_LIMITED, OPENAI_INVALID_KEY, OPENAI_QUOTA_EXCEEDED, OPENAI_TIMEOUT, OPENAI_SERVICE_ERROR, OPENAI_CONTEXT_TOO_LONG
- ✅ Retry logic with exponential backoff for transient errors (rate limits, timeouts, service errors)
- ✅ Health check with 60s TTL caching and model availability verification
- ✅ Registry integration with auto-registration when OPENAI_ENABLED=true
- ✅ Custom system prompt injection via request.metadata["custom_prompt"]
- ✅ Language name mapping for better LLM understanding (same as Ollama)
- ✅ 44 unit tests created and all passing
- ✅ Configuration updated in config.py with OPENAI_TIMEOUT, OPENAI_MAX_RETRIES, OPENAI_RETRY_DELAY, OPENAI_BASE_URL, OPENAI_HEALTH_CHECK_TIMEOUT
- ✅ Auto-registration added to `__init__.py`
- ✅ All acceptance criteria (AC1-AC8) satisfied
Code Review Fixes (2026-02-21)
- ✅ [HIGH] Added model info to `health_check()` return (`model`, `model_available` fields per Task 3.4)
- ✅ [MEDIUM] Added configurable `health_check_timeout` parameter (default 5s, via OPENAI_HEALTH_CHECK_TIMEOUT)
- ✅ [MEDIUM] Added `reset_openai_provider()` function to reset singleton when config changes
- ✅ [MEDIUM] Added API key validation (empty key raises ValueError)
- ✅ [MEDIUM] Added 11 new tests covering: empty API key, text too long preemptive check, malformed API responses (empty choices, missing content), health check model info, reset function
File List
Files Created:
- `services/providers/openai_provider.py` - Main OpenAI provider implementation (660 lines)
- `tests/test_providers/test_openai_provider.py` - 44 unit tests covering all functionality
Files Modified:
- `services/providers/__init__.py` - Added OpenAI auto-registration
- `services/providers/config.py` - Added OPENAI_TIMEOUT, OPENAI_MAX_RETRIES, OPENAI_RETRY_DELAY, OPENAI_BASE_URL, OPENAI_HEALTH_CHECK_TIMEOUT
- `services/providers/README.md` - OpenAI section (Task 7)
- `.env.example` - Added OPENAI_HEALTH_CHECK_TIMEOUT and OpenAI config options
Change Log
- 2026-02-21: [AI Code Review 2-5/2-6] Fixes: defensive JSON for 429/400, tokens_used in success log, ProviderSettings.openai base_url in config, File List README
- 2026-02-21: Code review fixes applied - Added model info to health_check, configurable health check timeout, reset function for singleton, API key validation, 11 new tests
- 2026-02-21: Story 2.5 implementation complete - OpenAI provider with cloud LLM translation, custom prompts, comprehensive error handling with French messages, retry logic, health checks, and 44 passing tests