Story 2.10: Endpoint POST /api/v1/translate (Core)
Status: done
Story
As a user, I want to submit a document for translation via API, So that I can get my document translated.
Acceptance Criteria
- AC1: Authentication - Endpoint requires valid JWT token (web user) or X-API-Key header (automation user)
- AC2: File Upload - POST to /api/v1/translate accepts multipart/form-data with file, source_lang, target_lang
- AC3: File Validation - System validates format (xlsx/docx/pptx only), max size 50MB, magic bytes check
- AC4: Success Response - Valid requests return HTTP 202 with {data: {id, status: "processing"}, meta: {rate_limit_remaining}}
- AC5: Invalid Format - Unsupported formats return 400 with error "INVALID_FORMAT" and accepted formats list
- AC6: Quota Exceeded - Users exceeding tier limit return 429 with error "QUOTA_EXCEEDED" and Retry-After header
- AC7: File Too Large - Files > 50MB return 413 with error "FILE_TOO_LARGE"
- AC8: Async Processing - Translation is processed asynchronously (endpoint returns immediately after validation)
- AC9: URL Ingestion - Pro users can provide a file_url parameter instead of a file upload (FR62-FR64)
- AC10: Optional Parameters - Support mode (classic/llm), provider, webhook_url, glossary_id, custom_prompt
Tasks / Subtasks
- Task 1: Create Request/Response Schemas (AC: 2, 4, 5, 6, 7)
  - 1.1 Create TranslateRequest schema with file upload or file_url
  - 1.2 Create TranslateResponse schema with {data: {id, status}, meta: {rate_limit_remaining}}
  - 1.3 Create error response schemas for each error code
- Task 2: Implement File Validation (AC: 3, 5, 7)
  - 2.1 Check file extension (only .xlsx, .docx, .pptx)
  - 2.2 Check magic bytes (PK header for Office files)
  - 2.3 Check file size (max 50MB)
  - 2.4 Return structured errors for each validation failure
- Task 3: Implement Authentication Middleware (AC: 1)
  - 3.1 Support JWT Bearer token from Authorization header
  - 3.2 Support X-API-Key header for automation users
  - 3.3 Extract user context and tier information
- Task 4: Implement Rate Limiting Check (AC: 6)
  - 4.1 Check user tier (free: 5/day, pro: unlimited)
  - 4.2 Check daily_translation_count against limit
  - 4.3 Return 429 with Retry-After header if exceeded
  - 4.4 Include rate_limit_remaining in meta response
- Task 5: Implement Translation Job Creation (AC: 4, 8)
  - 5.1 Generate unique translation ID (UUID)
  - 5.2 Store file in temporary location with TTL metadata
  - 5.3 Create translation job record in database/Redis
  - 5.4 Queue job for async processing (or process inline for MVP)
  - 5.5 Return 202 with job ID and status
- Task 6: Implement URL Ingestion (AC: 9)
  - 6.1 Accept file_url parameter as alternative to file upload
  - 6.2 Download file from URL with timeout (10s)
  - 6.3 Validate downloaded file format and size
  - 6.4 Return error "URL_DOWNLOAD_FAILED" or "URL_UNREACHABLE" on failure
- Task 7: Implement Optional Parameters (AC: 10)
  - 7.1 Accept mode parameter (classic/llm, default: classic)
  - 7.2 Accept provider parameter (optional override)
  - 7.3 Accept webhook_url parameter (optional)
  - 7.4 Accept glossary_id parameter (Pro only)
  - 7.5 Accept custom_prompt parameter (Pro only)
- Task 8: Create Router Endpoint (AC: All)
  - 8.1 Create POST /api/v1/translate in routes/translate_routes.py
  - 8.2 Wire all validation, auth, and processing components
  - 8.3 Add OpenAPI documentation with all parameters
  - 8.4 Add unit tests for all scenarios
- Task 9: Integration Tests (AC: All)
  - 9.1 Test successful translation submission
  - 9.2 Test authentication (JWT and API Key)
  - 9.3 Test file validation errors
  - 9.4 Test rate limiting
  - 9.5 Test URL ingestion
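Task 5 (create a job record, schedule it, return 202 right away) can be sketched with the standard library alone. This is a minimal illustration, not the project's implementation: the `_jobs` dict, `_process` placeholder, `create_job` helper and the `tr_` ID prefix are assumptions made for the example, and a production setup would more likely persist jobs in a database or Redis.

```python
import asyncio
import uuid
from datetime import datetime, timezone

# Hypothetical in-memory job store, used only for this sketch.
_jobs: dict = {}
# Keep references to running tasks so they are not garbage-collected mid-flight.
_background_tasks: set = set()


async def _process(job_id: str) -> None:
    """Placeholder for the real translation work (assumption)."""
    await asyncio.sleep(0)  # yield control; real processing would happen here
    _jobs[job_id]["status"] = "completed"


async def create_job(file_name: str, source_lang: str, target_lang: str) -> dict:
    """Register a job, schedule it in the background, return the 202 body data."""
    job_id = f"tr_{uuid.uuid4().hex}"
    _jobs[job_id] = {
        "id": job_id,
        "status": "processing",
        "file_name": file_name,
        "source_lang": source_lang,
        "target_lang": target_lang,
        "created_at": datetime.now(timezone.utc).isoformat(),
    }
    task = asyncio.create_task(_process(job_id))
    _background_tasks.add(task)
    task.add_done_callback(_background_tasks.discard)
    # Return a separate dict so the response does not change when the job does.
    return {"id": job_id, "status": "processing"}


async def demo() -> tuple:
    payload = await create_job("report.xlsx", "en", "fr")
    status_at_return = _jobs[payload["id"]]["status"]
    await asyncio.sleep(0.01)  # give the background task time to run
    return payload, status_at_return, _jobs[payload["id"]]["status"]
```

Running `asyncio.run(demo())` shows the job still marked as processing when the payload is returned, and updated only once the background task has had a chance to run.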
Dev Notes
Previous Story Intelligence (Stories 2.1-2.9)
Critical patterns from Processor stories to reuse:
- File validation pattern (from pptx_translator.py):
MAX_FILE_SIZE_MB = 50
OFFICE_MAGIC_BYTES = b"PK"  # All Office files are ZIP archives
ACCEPTED_EXTENSIONS = {".xlsx", ".docx", ".pptx"}

def _validate_file(file_path: Path) -> None:
    if file_path.suffix.lower() not in ACCEPTED_EXTENSIONS:
        raise ValidationError(code="INVALID_FORMAT", ...)
    with open(file_path, "rb") as f:
        header = f.read(4)
    if header[:2] != OFFICE_MAGIC_BYTES:
        raise ValidationError(code="INVALID_FORMAT", ...)
    file_size_mb = file_path.stat().st_size / (1024 * 1024)
    if file_size_mb > MAX_FILE_SIZE_MB:
        raise ValidationError(code="FILE_TOO_LARGE", ...)
- Provider integration (from all processors):
def translate_file(self, provider: TranslationProvider, request: TranslationRequest) -> TranslationResponse:
    # Provider handles batch translation with fallback
    response = provider.translate_batch(texts, target_lang, source_lang)
    if response.error:
        raise TranslationError(code=response.error_code, message=response.error)
- Error class pattern:
class TranslateEndpointError(Exception):
    INVALID_FORMAT = "INVALID_FORMAT"
    FILE_TOO_LARGE = "FILE_TOO_LARGE"
    QUOTA_EXCEEDED = "QUOTA_EXCEEDED"
    URL_DOWNLOAD_FAILED = "URL_DOWNLOAD_FAILED"
    URL_UNREACHABLE = "URL_UNREACHABLE"
    UNAUTHORIZED = "UNAUTHORIZED"
Existing Code Structure
Check existing files:
- app/main.py - FastAPI application entry
- app/modules/translation/ - Translation module (may need creation)
- app/core/security.py - Auth utilities
- app/middleware/rate_limit.py - Rate limiting logic
- translators/ - Existing processors (Excel, Word, PowerPoint)
Architecture Compliance
Per _bmad-output/planning-artifacts/architecture.md:
Success Response Format:
{
  "data": {
    "id": "tr_abc123",
    "status": "processing",
    "file_name": "report.xlsx",
    "source_lang": "en",
    "target_lang": "fr"
  },
  "meta": {
    "rate_limit_remaining": 49,
    "estimated_time_seconds": 12
  }
}
Error Response Format:
{
  "error": "QUOTA_EXCEEDED",
  "message": "Limite quotidienne atteinte (5/5 fichiers)",
  "details": {
    "current_usage": 5,
    "limit": 5,
    "tier": "free",
    "reset_at": "2024-01-16T00:00:00Z"
  }
}
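One way to keep every failure in the error format above is an exception that carries the code, message and details, and knows its HTTP status. The sketch below is an assumption-laden illustration: `ApiError`, `STATUS_BY_CODE` and `to_payload` are hypothetical names, the 400/413/429/403 mappings come from this story, and the 401 for UNAUTHORIZED plus the 400 fallback for unmapped codes are assumptions.

```python
from typing import Any, Dict, Optional

# Status codes for the first four entries come from this story's criteria and
# completion notes; 401 for UNAUTHORIZED is an assumption.
STATUS_BY_CODE = {
    "INVALID_FORMAT": 400,
    "FILE_TOO_LARGE": 413,
    "QUOTA_EXCEEDED": 429,
    "PRO_FEATURE_REQUIRED": 403,
    "UNAUTHORIZED": 401,
}


class ApiError(Exception):
    """Hypothetical exception that serializes to the structured error format."""

    def __init__(self, code: str, message: str,
                 details: Optional[Dict[str, Any]] = None) -> None:
        super().__init__(message)
        self.code = code
        self.message = message
        self.details = details or {}

    @property
    def status_code(self) -> int:
        # Unmapped codes fall back to 400 (assumption), never to 500.
        return STATUS_BY_CODE.get(self.code, 400)

    def to_payload(self) -> Dict[str, Any]:
        return {"error": self.code, "message": self.message, "details": self.details}
```

An exception handler could then turn any `ApiError` into a response using `status_code` and `to_payload()`, so no code path needs to build the error body by hand.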
Naming Conventions:
- File: router.py (snake_case)
- Class: TranslateRequest, TranslateResponse (PascalCase)
- Variables: user_id, file_path (snake_case)
- JSON fields: snake_case
API Endpoint Specification
POST /api/v1/translate
Authorization: Bearer <jwt_token>
# OR
X-API-Key: sk_live_xxx
Content-Type: multipart/form-data
file: <binary> # Required (unless file_url provided)
file_url: https://example.com/doc.xlsx # Alternative to file (Pro feature)
source_lang: en # Required
target_lang: fr # Required
mode: classic # Optional: "classic" | "llm" (default: classic)
provider: google # Optional: "google" | "deepl" | "ollama" | "openai"
webhook_url: https://... # Optional
glossary_id: uuid # Optional (Pro only)
custom_prompt: string # Optional (Pro only)
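The optional and Pro-only parameters in the specification above could be checked before any job is created. The following is a rough sketch under stated assumptions: `check_optional_params` is a hypothetical helper, `INVALID_PARAMETER` is an invented error code for illustration, and the 403 with `PRO_FEATURE_REQUIRED` follows the behaviour described in the completion notes.

```python
from typing import Optional, Tuple

ALLOWED_MODES = {"classic", "llm"}
ALLOWED_PROVIDERS = {"google", "deepl", "ollama", "openai"}


def check_optional_params(
    tier: str,
    mode: str = "classic",
    provider: Optional[str] = None,
    file_url: Optional[str] = None,
    glossary_id: Optional[str] = None,
    custom_prompt: Optional[str] = None,
) -> Optional[Tuple[int, str]]:
    """Return (http_status, error_code) for the first problem, or None if valid."""
    if mode not in ALLOWED_MODES:
        return 400, "INVALID_PARAMETER"  # hypothetical error code
    if provider is not None and provider not in ALLOWED_PROVIDERS:
        return 400, "INVALID_PARAMETER"  # hypothetical error code
    # file_url, glossary_id and custom_prompt are the Pro-only inputs.
    uses_pro_feature = any(
        value is not None for value in (file_url, glossary_id, custom_prompt)
    )
    if uses_pro_feature and tier != "pro":
        return 403, "PRO_FEATURE_REQUIRED"
    return None
```

Returning a plain tuple keeps the check independent of the web framework; the endpoint can translate it into whatever exception or response type it uses.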
Rate Limiting Logic
# From architecture.md
FREE_TIER_LIMIT = 5  # files per day
PRO_TIER_LIMIT = None  # unlimited

# Check in middleware or endpoint
if user.tier == "free" and user.daily_translation_count >= FREE_TIER_LIMIT:
    raise HTTPException(
        status_code=429,
        detail={
            "error": "QUOTA_EXCEEDED",
            "message": "Limite quotidienne atteinte.",
            "details": {
                "current_usage": user.daily_translation_count,
                "limit": FREE_TIER_LIMIT,
                "reset_at": next_midnight_utc
            }
        },
        headers={"Retry-After": str(seconds_until_midnight)}
    )
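The snippet above uses `next_midnight_utc` and `seconds_until_midnight` without defining them. A minimal sketch of how they might be computed, assuming the daily quota resets at 00:00 UTC (the function names and the `reset_at_iso` formatter are illustrative, not taken from the codebase):

```python
import math
from datetime import datetime, timedelta, timezone


def next_midnight_utc(now: datetime) -> datetime:
    """Return the next 00:00 UTC strictly after `now` (timezone-aware input)."""
    now_utc = now.astimezone(timezone.utc)
    tomorrow = (now_utc + timedelta(days=1)).date()
    return datetime(tomorrow.year, tomorrow.month, tomorrow.day, tzinfo=timezone.utc)


def seconds_until_midnight(now: datetime) -> int:
    """Whole seconds until the quota resets, rounded up for Retry-After."""
    remaining = (next_midnight_utc(now) - now.astimezone(timezone.utc)).total_seconds()
    return math.ceil(remaining)


def reset_at_iso(now: datetime) -> str:
    """Format the reset time like the reset_at field in the error example."""
    return next_midnight_utc(now).strftime("%Y-%m-%dT%H:%M:%SZ")
```

Rounding up avoids sending `Retry-After: 0` in the last fraction of a second before the reset.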
URL Ingestion Details
from pathlib import Path
from typing import Tuple

import httpx

# extract_filename and save_to_temp are helpers defined elsewhere.
async def download_from_url(url: str, timeout: int = 10) -> Tuple[Path, str]:
    """Download file from URL and return (temp_path, filename)."""
    async with httpx.AsyncClient(timeout=timeout) as client:
        response = await client.get(url, follow_redirects=True)
    if response.status_code != 200:
        raise TranslateEndpointError(
            code="URL_UNREACHABLE",
            message=f"URL inaccessible (HTTP {response.status_code})"
        )
    # Extract filename from URL or Content-Disposition
    filename = extract_filename(url, response.headers)
    # Save to temp file
    temp_path = save_to_temp(response.content, filename)
    return temp_path, filename
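The `extract_filename` helper is referenced above but not shown. One possible version, offered as an assumption rather than the project's code, prefers the Content-Disposition header, falls back to the last URL path segment, and strips directory components so a remote server cannot steer where the temp file lands. The `"download"` fallback name is invented for the example.

```python
from email.message import Message
from pathlib import PurePosixPath
from typing import Mapping
from urllib.parse import unquote, urlparse


def _base_name(raw: str) -> str:
    """Keep only the final path component; reject '.', '..' and empty names."""
    name = PurePosixPath(raw.replace("\\", "/")).name
    return "" if name in ("", ".", "..") else name


def extract_filename(url: str, headers: Mapping[str, str]) -> str:
    """Pick a filename from Content-Disposition, else from the URL path."""
    # httpx headers are case-insensitive; a plain dict needs the lowercase key.
    disposition = headers.get("content-disposition")
    if disposition:
        msg = Message()
        msg["Content-Disposition"] = disposition
        candidate = msg.get_filename()
        if candidate and _base_name(candidate):
            return _base_name(candidate)
    from_url = _base_name(unquote(urlparse(url).path))
    return from_url or "download"  # hypothetical fallback name
```

The returned name still has to pass the extension and magic-byte checks; this helper only decides what the file is called.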
File Structure
Files to Create/Modify:
- app/modules/translation/router.py - Main endpoint (create)
- app/modules/translation/schemas.py - Request/Response schemas (create)
- app/modules/translation/service.py - Business logic (create/update)
- app/middleware/rate_limit.py - Rate limiting (update if needed)
- tests/test_translation_endpoint.py - Integration tests (create)
Git Intelligence - Recent Patterns
From recent commits:
- Translation cache implemented (5000 entry LRU cache)
- OpenRouter provider with DeepSeek support added
- Parallel processing optimizations in translation service
- Redis sessions for production
Testing Strategy
# Unit tests
pytest tests/test_translate_endpoint.py -v
# With coverage
pytest tests/test_translate_endpoint.py --cov=app/modules/translation -v
# Integration tests
pytest tests/integration/ -v
Dependencies on Previous Stories
| Story | Dependency |
|---|---|
| 2.1-2.6 | TranslationProvider abstraction and fallback chain |
| 2.7 | ExcelProcessor for .xlsx files |
| 2.8 | WordProcessor for .docx files |
| 2.9 | PowerPointProcessor for .pptx files |
| 1.6 | Rate limiting middleware |
| 1.8 | Usage tracking for billing |
Anti-Patterns to Avoid
- Don't process synchronously - Return 202 immediately, process in background
- Don't skip validation - Always check magic bytes, not just extension
- Don't log file content - Only log metadata (NFR11, NFR16)
- Don't return HTTP 500 - All errors should be 4xx with structured response
- Don't forget tier checks - Pro features (glossary, custom_prompt, URL ingestion) require tier check
References
- [Source: _bmad-output/planning-artifacts/epics.md#Story 2.10]
- [Source: _bmad-output/planning-artifacts/architecture.md#API Response Formats]
- [Source: _bmad-output/planning-artifacts/prd.md#FR50-FR54 File Management]
- [Source: _bmad-output/planning-artifacts/prd.md#FR62-FR64 URL Ingestion]
- [Source: _bmad-output/planning-artifacts/prd.md#NFR12 Zero HTTP 500]
- [Source: _bmad-output/implementation-artifacts/2-9-processor-powerpoint-pptx.md - Previous story patterns]
- [Source: translators/excel_translator.py - File validation pattern]
- [Source: translators/word_translator.py - Error handling pattern]
- [Source: services/providers/base.py - TranslationProvider interface]
- [Source: https://fastapi.tiangolo.com/tutorial/request-files/ - File upload docs]
Dev Agent Record
Agent Model Used
glm-5
Debug Log References
None
Completion Notes List
- Task 1 Complete: Created TranslateEndpointError exception class with error codes (INVALID_FORMAT, FILE_TOO_LARGE, QUOTA_EXCEEDED, URL_DOWNLOAD_FAILED, URL_UNREACHABLE, UNAUTHORIZED, MISSING_FILE, PRO_FEATURE_REQUIRED). Created Pydantic response schemas (TranslateResponseData, TranslateResponseMeta, TranslateResponse, ErrorResponse).
- Task 2 Complete: File validation uses existing FileValidator from middleware/validation.py with magic bytes (PK header), extension check (.xlsx, .docx, .pptx), and 50MB size limit. Added specific FILE_TOO_LARGE detection.
- Task 3 Complete: Authentication supports both JWT Bearer token (via Authorization header) and X-API-Key header. Uses get_authenticated_user dependency that tries API key first, then JWT.
- Task 4 Complete: Rate limiting uses existing TierQuotaService from middleware/tier_quota.py. Free tier: 5/day, Pro: unlimited. Returns 429 with Retry-After header on quota exceeded.
- Task 5 Complete: Translation jobs are stored in an in-memory _translation_jobs dict with job ID, status, file info, timestamps. Jobs are processed asynchronously via asyncio.create_task(). Returns 202 with job ID and status "processing".
- Task 6 Complete: URL ingestion implemented via download_from_url() using httpx with 10s timeout. Validates downloaded file format and size. Returns appropriate error codes (URL_UNREACHABLE, URL_DOWNLOAD_FAILED, FILE_TOO_LARGE). Restricted to Pro users only.
- Task 7 Complete: All optional parameters supported: mode (classic/llm), provider, webhook_url, glossary_id (Pro only), custom_prompt (Pro only). Pro features return 403 with PRO_FEATURE_REQUIRED error for free tier users.
- Task 8 Complete: Created routes/translate_routes.py with router_v1 mounted at /api/v1. Includes POST /translate, GET /translations/{job_id}, and GET /translate/health endpoints. Full OpenAPI documentation with all parameters.
- Task 9 Complete: Created 27 comprehensive tests in tests/test_translate_endpoint.py covering all acceptance criteria: file upload, validation, authentication, quota, file size, async processing, URL ingestion, optional parameters.
- Code Review Fixes Applied: Fixed 5 HIGH and 6 MEDIUM issues:
  - HTTP 500 → 400 (NFR12 compliance)
  - glossary_id now properly passed to translation job
  - Added source_lang validation
  - Added webhook_url format validation
  - Added tests for provider parameter, source_lang validation, webhook validation, and API key auth
File List
Created files:
- routes/translate_routes.py - POST /api/v1/translate endpoint, job status endpoint, error handling, URL ingestion, async processing
- middleware/tier_quota.py - TierQuotaService for daily quota management (Free: 5/day, Pro: unlimited)
- alembic/versions/002_add_tier_daily_count.py - DB migration for tier tracking
- tests/test_translate_endpoint.py - 34+ unit tests covering all ACs
Modified files:
- main.py - Import and include translate_v1_router
- middleware/validation.py - FileValidator, LanguageValidator, ProviderValidator classes
Change Log
- 2026-02-21: Code review fixes - HTTP 500→400, glossary_id propagation, source_lang validation, webhook_url validation, additional tests
- 2026-02-21: Implemented Story 2.10 - POST /api/v1/translate endpoint with async processing, file validation, authentication, rate limiting, URL ingestion (Pro), and comprehensive tests