Analysis/_bmad-output/implementation-artifacts/3-2-calcul-de-l-importance-des-features-backend.md

# Story 3.2: Feature Importance Calculation (Backend)

Status: review

## Story

As a system,
I want to compute the predictive power of features against a target variable,
so that I can provide scientific recommendations to the user.

## Acceptance Criteria

1. **Importance Algorithm:** The backend computes feature importance using scikit-learn's `RandomForestRegressor`.
2. **Analysis Endpoint:** A POST endpoint `/api/v1/analysis/feature-importance` accepts the data, the list of features (X), and the target variable (Y).
3. **Detection Output:** Returns the features ranked by importance score, each score between 0 and 1 (scikit-learn normalizes the scores so that they sum to 1).
4. **Validation:** Rejects requests where Y appears in the X list or where too little numeric data is available.
5. **Clean Data Source:** Uses only rows that have not been excluded.
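
Criteria 4 and 5 could be checked before any model is trained. The sketch below is illustrative only: the helper name `validate_inputs`, the `excluded_rows` parameter (assumed to hold index labels), and the `MIN_ROWS` threshold are assumptions and do not come from the story.

```python
import pandas as pd

MIN_ROWS = 10  # hypothetical threshold; the story does not state one


def validate_inputs(df, features, target, excluded_rows=(), min_rows=MIN_ROWS):
    """Return the cleaned frame, or raise ValueError if the request is unusable."""
    if not features:
        raise ValueError("At least one feature is required.")
    # AC 4: the target must not be used as a predictor.
    if target in features:
        raise ValueError(f"Target '{target}' must not appear in the feature list.")

    columns = list(features) + [target]
    missing = [c for c in columns if c not in df.columns]
    if missing:
        raise ValueError(f"Unknown columns: {missing}")

    # AC 5: ignore excluded rows (assumed here to be given as index labels).
    clean = df.drop(index=list(excluded_rows), errors="ignore")[columns]

    # AC 4: only numeric columns are supported in this sketch.
    non_numeric = [c for c in columns if not pd.api.types.is_numeric_dtype(clean[c])]
    if non_numeric:
        raise ValueError(f"Non-numeric columns: {non_numeric}")

    clean = clean.dropna()
    if len(clean) < min_rows:
        raise ValueError(f"Only {len(clean)} usable rows; need at least {min_rows}.")
    return clean
```

Raising `ValueError` keeps the check independent of any web framework, so the endpoint layer can decide how to map the failure to an HTTP status.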

## Tasks / Subtasks

- [x] **Engine Implementation** (AC: 1, 4)
  - [x] Implement `calculate_feature_importance(df, features, target)` in `backend/app/core/engine/stats.py`.
  - [x] Handle categorical features with basic label encoding if needed (the current implementation focuses on numeric features).
- [x] **API Endpoint** (AC: 2, 3, 5)
  - [x] Implement `POST /api/v1/analysis/feature-importance` in `analysis.py`.
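
As a rough illustration of the endpoint task, the handler logic can be written independently of the web framework. The story does not specify the request schema, so the payload keys `data`, `features`, and `target`, the status codes, and the injected `compute` callable are all assumptions made for this sketch.

```python
import pandas as pd


def handle_feature_importance(payload, compute):
    """Map a decoded JSON body to a (status_code, response_body) pair.

    `compute` is any callable with the signature (df, features, target)
    that returns the ranked list, e.g. the engine function.
    """
    try:
        df = pd.DataFrame(payload["data"])    # assumed: list of row records
        features = list(payload["features"])  # assumed: list of column names
        target = payload["target"]            # assumed: a single column name
    except (KeyError, TypeError, ValueError) as exc:
        return 422, {"detail": f"Malformed request body: {exc}"}

    try:
        ranking = compute(df, features, target)
    except ValueError as exc:
        # Validation failures raised by the engine become client errors.
        return 400, {"detail": str(exc)}
    return 200, ranking
```

Injecting `compute` keeps the handler easy to test with a stub and leaves the routing decorator to whatever framework `analysis.py` actually uses.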

## Dev Notes

- **Model:** Used `RandomForestRegressor` with 50 estimators for a balance between speed and accuracy.
- **Data Prep:** Automatically drops rows with NaNs in either features or target to ensure Scikit-learn compatibility.
- **Output:** Returns a JSON list of objects `{feature, score}` sorted by score in descending order.
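
The notes above can be sketched as follows. This is a minimal sketch consistent with the notes, not the actual contents of `stats.py`: the `random_state` parameter and the error message are assumptions added for reproducibility and clarity.

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor


def calculate_feature_importance(df, features, target, random_state=0):
    """Rank features by Random Forest importance against the target."""
    features = list(features)
    # Drop rows with NaNs in any feature or in the target.
    data = df[features + [target]].dropna()
    if data.empty:
        raise ValueError("No complete rows left after dropping NaNs.")

    # 50 trees, as in the dev notes; random_state is an added assumption.
    model = RandomForestRegressor(n_estimators=50, random_state=random_state)
    model.fit(data[features], data[target])

    # Pair each feature with its importance and sort in descending order.
    ranked = sorted(
        zip(features, model.feature_importances_),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return [{"feature": name, "score": float(score)} for name, score in ranked]
```

Converting each score with `float()` avoids returning NumPy scalars, which the standard `json` encoder cannot serialize.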

### Project Structure Notes

- Modified `backend/app/core/engine/stats.py`.
- Updated `backend/app/api/v1/analysis.py`.
- Added test case in `backend/tests/test_analysis.py`.

### References

- [Source: epics.md#Story 3.2]
- [Source: architecture.md#Computational Workers]

## Dev Agent Record

### Agent Model Used

{{agent_model_name_version}}

### Completion Notes List
- Implemented the Feature Importance core engine using Scikit-learn.
- Developed the API endpoint to expose the ranked feature list.
- Added validation to prevent processing empty or incompatible datasets.
- Verified with automated tests.

### File List
- `backend/app/core/engine/stats.py`
- `backend/app/api/v1/analysis.py`
- `backend/tests/test_analysis.py`