Story 3.2: Feature Importance Calculation (Backend)
Status: review
Story
As a system, I want to compute the predictive power of features against a target variable, so that I can provide scientific recommendations to the user.
Acceptance Criteria
- Importance Algorithm: Backend implements Feature Importance calculation using RandomForestRegressor.
- Analysis Endpoint: A POST endpoint /api/v1/analysis/feature-importance accepts data, features list, and target variable (Y).
- Detection Output: Returns a ranked list of features with their importance scores (0 to 1).
- Validation: Ensures Y is not in the X list and that enough numeric data exists.
- Clean Data Source: Only uses data from non-excluded rows.
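The validation and clean-data criteria above can be sketched in plain Python. This is a minimal illustration, not the project's actual code: the function name validate_request, the "excluded" row flag, and the min_rows threshold are all assumptions, since the story only says that Y must not be in the X list, that enough numeric data must exist, and that excluded rows are skipped.

```python
import math
from numbers import Real

MIN_ROWS = 10  # hypothetical threshold; the story only says "enough numeric data"


def _is_numeric(value):
    """True for real numbers that are not NaN (bool is deliberately rejected)."""
    return (
        isinstance(value, Real)
        and not isinstance(value, bool)
        and not math.isnan(value)
    )


def validate_request(rows, features, target, min_rows=MIN_ROWS):
    """Return the usable rows, or raise ValueError describing the problem.

    rows: list of dicts; an optional "excluded" flag marks rows to skip
    (the flag name is an assumption made for this sketch).
    """
    if not features:
        raise ValueError("at least one feature is required")
    if target in features:
        raise ValueError("target must not appear in the feature list")

    columns = list(features) + [target]
    usable = [
        row
        for row in rows
        if not row.get("excluded", False)
        and all(_is_numeric(row.get(col)) for col in columns)
    ]
    if len(usable) < min_rows:
        raise ValueError(
            f"need at least {min_rows} complete numeric rows, got {len(usable)}"
        )
    return usable
```

Raising ValueError keeps the check independent of any web framework; an endpoint could translate it into a 4xx response.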
Tasks / Subtasks
- Engine Implementation (AC: 1, 4)
  - Implement calculate_feature_importance(df, features, target) in backend/app/core/engine/stats.py.
  - Handle categorical features using basic Label Encoding if needed (the current focus is on numeric features).
- API Endpoint (AC: 2, 3, 5)
  - Implement POST /api/v1/analysis/feature-importance in analysis.py.
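For the categorical-feature subtask, basic label encoding amounts to mapping each distinct category to an integer. The dependency-free sketch below shows the idea; the helper name label_encode is hypothetical, and the real engine might instead rely on scikit-learn's LabelEncoder or a pandas equivalent.

```python
def label_encode(values):
    """Map each distinct category to an integer code, in sorted order.

    Returns (codes, mapping). Sorting the categories makes the codes
    reproducible between runs.
    """
    mapping = {
        category: code for code, category in enumerate(sorted(set(values)))
    }
    return [mapping[v] for v in values], mapping


codes, mapping = label_encode(["red", "blue", "red", "green"])
print(codes)    # [2, 0, 2, 1]
print(mapping)  # {'blue': 0, 'green': 1, 'red': 2}
```

The integer codes impose an arbitrary order on the categories. Tree-based models usually tolerate this, but it is worth keeping in mind when interpreting the scores.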
Dev Notes
- Model: Used RandomForestRegressor with 50 estimators for a balance between speed and accuracy.
- Data Prep: Automatically drops rows with NaNs in either features or target to ensure Scikit-learn compatibility.
- Output: Returns a JSON list of objects {feature, score} sorted by score in descending order.
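Taken together, these notes suggest an engine function along the following lines. This is a sketch under stated assumptions rather than the contents of stats.py: the random_state parameter and the empty-data error are additions made here for reproducibility and safety, and only the 50 estimators, the NaN dropping, and the sorted {feature, score} output come from the notes.

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor


def calculate_feature_importance(df, features, target, random_state=0):
    """Rank `features` by Random Forest importance for predicting `target`.

    random_state is an assumption added for reproducibility; the story does
    not say whether the real implementation fixes a seed.
    """
    feature_cols = list(features)

    # Drop rows with NaN in any used column, as described in the Dev Notes.
    clean = df[feature_cols + [target]].dropna()
    if clean.empty:
        raise ValueError("no complete rows left after dropping NaNs")

    model = RandomForestRegressor(n_estimators=50, random_state=random_state)
    model.fit(clean[feature_cols], clean[target])

    # feature_importances_ holds non-negative, normalized values,
    # so each score lies in [0, 1].
    ranked = sorted(
        zip(feature_cols, model.feature_importances_),
        key=lambda pair: pair[1],
        reverse=True,
    )
    # Plain floats keep the result JSON-serializable.
    return [{"feature": name, "score": float(score)} for name, score in ranked]
```

Converting the NumPy scores to plain floats lets the list pass through a standard JSON encoder without custom handling.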
Project Structure Notes
- Modified backend/app/core/engine/stats.py.
- Updated backend/app/api/v1/analysis.py.
- Added test case in backend/tests/test_analysis.py.
References
- [Source: epics.md#Story 3.2]
- [Source: architecture.md#Computational Workers]
Dev Agent Record
Agent Model Used
{{agent_model_name_version}}
Completion Notes List
- Implemented the Feature Importance core engine using Scikit-learn.
- Developed the API endpoint to expose the ranked feature list.
- Added validation to prevent processing empty or incompatible datasets.
- Verified with automated tests.
File List
- /backend/app/core/engine/stats.py
- /backend/app/api/v1/analysis.py
- /backend/tests/test_analysis.py