Analysis/_bmad-output/implementation-artifacts/3-2-calcul-de-l-importance-des-features-backend.md
2026-01-11 22:56:02 +01:00

Story 3.2: Feature Importance Calculation (Backend)

Status: review

Story

As a system, I want to compute the predictive power of features against a target variable, so that I can provide scientific recommendations to the user.

Acceptance Criteria

  1. Importance Algorithm: Backend implements Feature Importance calculation using RandomForestRegressor.
  2. Analysis Endpoint: A POST endpoint /api/v1/analysis/feature-importance accepts data, features list, and target variable (Y).
  3. Ranked Output: Returns a list of features ranked by importance score, each score between 0 and 1.
  4. Validation: Ensures the target (Y) is not in the feature list (X) and that enough numeric data exists.
  5. Clean Data Source: Only uses data from non-excluded rows.
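
The validation and clean-data criteria (AC 4 and 5) could be sketched as follows. This is a minimal illustration, not the project's code: the function name `validate_request`, the per-row `excluded` flag, and the `MIN_ROWS` threshold are assumptions, since the story does not say how exclusion is stored or how much data counts as "enough".

```python
from numbers import Real

MIN_ROWS = 10  # assumed threshold; the story only says "enough numeric data"


def _is_numeric(value):
    # bool is a subclass of int, so reject it explicitly.
    # NaN is the only float that is not equal to itself.
    return isinstance(value, Real) and not isinstance(value, bool) and value == value


def validate_request(rows, features, target, min_rows=MIN_ROWS):
    """Return the usable rows, or raise ValueError describing the problem."""
    if not features:
        raise ValueError("features list must not be empty")
    if target in features:
        raise ValueError(f"target '{target}' must not appear in the features list")

    columns = [*features, target]
    usable = [
        row
        for row in rows
        if not row.get("excluded", False)  # AC 5: skip excluded rows (assumed flag)
        and all(_is_numeric(row.get(col)) for col in columns)
    ]
    if len(usable) < min_rows:
        raise ValueError(
            f"need at least {min_rows} complete numeric rows, got {len(usable)}"
        )
    return usable
```

Raising a single exception type keeps the endpoint's error handling simple: the API layer can translate any `ValueError` into a client error response.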

Tasks / Subtasks

  • Engine Implementation (AC: 1, 4)
    • Implement calculate_feature_importance(df, features, target) in backend/app/core/engine/stats.py.
    • Handle categorical features with basic Label Encoding if needed (the current implementation focuses on numeric features).
  • API Endpoint (AC: 2, 3, 5)
    • Implement POST /api/v1/analysis/feature-importance in analysis.py.
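
For the categorical fallback mentioned in the engine task, "basic Label Encoding" amounts to mapping each distinct category to an integer. A minimal sketch, assuming a hypothetical helper `label_encode` (the story does not show the actual implementation); it assigns codes in sorted order, which is also how scikit-learn's `LabelEncoder` orders its classes.

```python
def label_encode(values):
    """Map each distinct category to an integer code, in sorted order."""
    classes = sorted(set(values))
    mapping = {category: code for code, category in enumerate(classes)}
    return [mapping[v] for v in values], mapping


codes, mapping = label_encode(["red", "green", "blue", "green"])
print(codes)    # [2, 1, 0, 1]
print(mapping)  # {'blue': 0, 'green': 1, 'red': 2}
```

The integer codes impose an arbitrary ordering on the categories. Tree-based models generally tolerate this better than linear models, which is presumably why a basic encoding is considered sufficient here.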

Dev Notes

  • Model: Used RandomForestRegressor with 50 estimators for a balance between speed and accuracy.
  • Data Prep: Automatically drops rows with NaNs in either features or target to ensure Scikit-learn compatibility.
  • Output: Returns a JSON list of objects {feature, score} sorted by score in descending order.
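
The three notes above can be sketched together as one function. This is an illustration under stated assumptions, not the code in stats.py: the function name and the 50 estimators come from this story, while the fixed `random_state`, the error messages, and the exact validation are assumptions added for the example.

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor


def calculate_feature_importance(df, features, target):
    """Rank `features` by their importance for predicting `target`."""
    if target in features:
        raise ValueError("target must not be one of the features")

    # Data Prep: keep only the columns in use and drop rows with NaNs
    # in any of them, as described in the Dev Notes.
    clean = df[[*features, target]].dropna()
    if clean.empty:
        raise ValueError("no complete rows left after dropping NaNs")

    model = RandomForestRegressor(
        n_estimators=50,  # from the Dev Notes
        random_state=0,   # assumption: fixed seed for reproducible rankings
    )
    model.fit(clean[features], clean[target])

    # Output: pair each feature with its importance, highest score first.
    ranked = sorted(
        zip(features, model.feature_importances_),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return [{"feature": name, "score": float(score)} for name, score in ranked]
```

The scores from `feature_importances_` are impurity-based and normalized, so each lies between 0 and 1 and they sum to 1 across the features passed in. They are therefore relative: adding or removing a feature changes every other score. Converting to `float` keeps the result JSON-serializable, since NumPy scalars are not accepted by the standard `json` module.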

Project Structure Notes

  • Modified backend/app/core/engine/stats.py.
  • Updated backend/app/api/v1/analysis.py.
  • Added test case in backend/tests/test_analysis.py.

References

  • [Source: epics.md#Story 3.2]
  • [Source: architecture.md#Computational Workers]

Dev Agent Record

Agent Model Used

{{agent_model_name_version}}

Completion Notes List

  • Implemented the Feature Importance core engine using Scikit-learn.
  • Developed the API endpoint to expose the ranked feature list.
  • Added validation to prevent processing empty or incompatible datasets.
  • Verified with automated tests.

File List

  • /backend/app/core/engine/stats.py
  • /backend/app/api/v1/analysis.py
  • /backend/tests/test_analysis.py