# Data Analysis Platform

A modern full-stack data analysis platform built with a Python FastAPI backend and a Next.js frontend, designed for efficient data processing, visualization, and statistical analysis.

## Overview

This platform provides a comprehensive toolkit for data analysis workflows, combining powerful Python data science libraries with a modern, responsive web interface. It leverages Apache Arrow for high-performance data transfer and implements best practices for both backend and frontend development.

### Key Features

- **Backend:** FastAPI with async support, Pydantic v2 validation, and a comprehensive data science stack
- **Frontend:** Next.js 16 with TypeScript, Tailwind CSS, and Shadcn UI components
- **Data Processing:** Pandas, Scikit-learn, and Statsmodels integration
- **Performance:** Apache Arrow for zero-copy data transfer between services
- **Architecture:** Feature-based frontend organization, RESTful API design

## Technology Stack

### Backend
- **Python 3.12+** with FastAPI framework
- **Pydantic v2** for schema validation
- **Data Science:** Pandas 2.3.3+, Scikit-learn 1.8.0+, Statsmodels 0.14.6+
- **Serialization:** Apache Arrow 22.0+ for efficient binary data transfer
- **Package Management:** UV for fast dependency resolution

### Frontend
- **Next.js 16** (Standalone mode) with React 19
- **TypeScript** for type safety
- **Styling:** Tailwind CSS 4+ and Shadcn UI components
- **Data Display:** TanStack Table, Apache Arrow 21+, Recharts
- **State Management:** Zustand v5 for local state, TanStack Query v5 for server state
- **Virtualization:** TanStack Virtual for handling large datasets

### DevOps
- **Docker** multi-stage builds with distroless/alpine images
- **Docker Compose** for local development orchestration

## Project Structure

```
Data_analysis/
├── backend/                 # FastAPI backend service
│   ├── app/                 # Application modules
│   ├── tests/               # Backend test suite
│   ├── main.py              # Application entry point
│   ├── pyproject.toml       # Python dependencies (UV)
│   └── Dockerfile           # Backend container image
│
├── frontend/                # Next.js frontend application
│   ├── src/
│   │   └── features/        # Feature-based organization
│   ├── tests/               # Frontend test suite
│   ├── package.json         # Node.js dependencies
│   └── Dockerfile           # Frontend container image
│
├── compose.yaml             # Docker Compose configuration
├── _bmad-output/            # Planning and implementation artifacts
└── README.md                # This file
```

## Prerequisites

Before running the applications locally, ensure you have the following installed:

- **Python 3.12+** - [Download Python](https://www.python.org/downloads/)
- **Node.js 20+** - [Download Node.js](https://nodejs.org/)
- **UV (Python package manager)** - Install via: `pip install uv`
- **npm** (comes with Node.js)

## Local Development Setup

### Backend Setup

1. Navigate to the backend directory:

```bash
cd backend
```

2. Create and activate a virtual environment (optional but recommended):

```bash
python3.12 -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
```

3. Install dependencies using UV:

```bash
uv sync
```

4. Start the FastAPI development server (prefix the command with `uv run` if the virtual environment is not activated):

```bash
uvicorn main:app --reload --host 0.0.0.0 --port 8000
```

The backend API will be available at `http://localhost:8000`.

5. Access the interactive API documentation at:
   - Swagger UI: `http://localhost:8000/docs`
   - ReDoc: `http://localhost:8000/redoc`

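To confirm that the server from step 4 is up, a small smoke check can request the two documentation URLs from step 5. The following is a minimal sketch using only the Python standard library; the helper names `doc_urls` and `is_reachable` are hypothetical and not part of the project.

```python
from urllib.request import urlopen


def doc_urls(base_url: str = "http://localhost:8000") -> dict[str, str]:
    """Build the documentation URLs listed in step 5 (hypothetical helper)."""
    base = base_url.rstrip("/")
    return {"swagger": f"{base}/docs", "redoc": f"{base}/redoc"}


def is_reachable(url: str, timeout: float = 3.0) -> bool:
    """Return True if the URL answers with HTTP 200 within the timeout."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError, refused connections and timeouts all derive from OSError.
        return False


if __name__ == "__main__":
    for name, url in doc_urls().items():
        status = "ok" if is_reachable(url) else "unreachable"
        print(f"{name}: {status} ({url})")
```

Run it from a second terminal while the backend is running; both entries should report `ok` if the server started correctly.
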
### Frontend Setup

1. Open a new terminal and navigate to the frontend directory:

```bash
cd frontend
```

2. Install dependencies:

```bash
npm install
```

3. Create the environment configuration file:

```bash
cp .env.local.example .env.local
```

Edit `.env.local` if needed to configure API endpoints or other settings.

4. Start the Next.js development server:

```bash
npm run dev
```

The frontend application will be available at `http://localhost:3000`.

### Running Both Services

To run both services simultaneously:

1. **Terminal 1 - Backend:**
   ```bash
   cd backend
   source .venv/bin/activate  # If using virtual environment
   uvicorn main:app --reload --host 0.0.0.0 --port 8000
   ```

2. **Terminal 2 - Frontend:**
   ```bash
   cd frontend
   npm run dev
   ```

## Development Workflow

### Code Style and Standards

- **Backend:** Follow PEP 8 guidelines with snake_case naming
- **Frontend:** Use TypeScript strict mode, follow ESLint configuration
- **API Convention:** Use snake_case for JSON keys to maintain consistency with Pandas DataFrames
- **Documentation:** Include docstrings (Python) and JSDoc comments (TypeScript) for all exported functions

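As an illustration of the backend conventions above, here is a minimal sketch of a documented response type whose JSON keys stay in snake_case. `ColumnSummary` and `to_json` are hypothetical names chosen for this example. It uses a standard-library dataclass to stay self-contained; in the actual backend a Pydantic v2 model would likely fill this role.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ColumnSummary:
    """Summary statistics for one column (hypothetical response shape)."""

    column_name: str
    row_count: int
    missing_count: int


def to_json(summary: ColumnSummary) -> str:
    """Serialize a summary; field names are snake_case, so the JSON keys are too."""
    return json.dumps(asdict(summary))


if __name__ == "__main__":
    print(to_json(ColumnSummary("unit_price", 1200, 3)))
```

Because the keys are emitted exactly as the fields are named, no renaming layer is needed between the API payload and DataFrame column names.
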
### Testing

- **Backend tests:** Run `pytest` from the `backend/` directory
- **Frontend tests:** Run `npm test` from the `frontend/` directory (when configured)

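For orientation, a backend test file that `pytest` would collect might look like the sketch below. The file name and the function `column_mean` are hypothetical; in the project the function under test would be imported from `app/` rather than defined next to the tests.

```python
# tests/test_stats.py (hypothetical file name)


def column_mean(values: list[float]) -> float:
    """Hypothetical function under test: arithmetic mean of a column."""
    if not values:
        raise ValueError("values must not be empty")
    return sum(values) / len(values)


def test_column_mean_returns_average():
    assert column_mean([1.0, 2.0, 3.0]) == 2.0


def test_column_mean_rejects_empty_input():
    # With pytest installed, `pytest.raises(ValueError)` is the idiomatic form.
    try:
        column_mean([])
    except ValueError:
        return
    raise AssertionError("expected ValueError for empty input")
```

pytest discovers functions whose names start with `test_` in files named `test_*.py`, so no registration step is needed.
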
### Key Anti-Patterns to Avoid

- Do NOT use standard JSON for transferring datasets larger than 5,000 rows (use Apache Arrow)
- Do NOT use deep React Context for high-frequency state updates (use Zustand)
- Do NOT implement opaque algorithms without logging data exclusions
- Do NOT perform heavy blocking computations on the main FastAPI process (use background tasks)

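The three backend-related rules above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the project's implementation: all function names are hypothetical, `asyncio.to_thread` stands in for whatever background-task mechanism the backend actually uses, and the media type string is the one registered for Arrow IPC streams.

```python
import asyncio
import logging
import statistics

logger = logging.getLogger("analysis")

ARROW_ROW_THRESHOLD = 5_000  # row limit taken from the first rule
ARROW_STREAM = "application/vnd.apache.arrow.stream"  # Arrow IPC stream media type


def pick_media_type(row_count: int) -> str:
    """Choose JSON for small results and Arrow once the threshold is exceeded."""
    return ARROW_STREAM if row_count > ARROW_ROW_THRESHOLD else "application/json"


def drop_incomplete_rows(rows: list[dict]) -> list[dict]:
    """Remove rows containing None and log how many were excluded."""
    kept = [row for row in rows if None not in row.values()]
    excluded = len(rows) - len(kept)
    if excluded:
        logger.warning("Excluded %d of %d rows: missing values", excluded, len(rows))
    return kept


def summarize(rows: list[dict], column: str) -> dict:
    """Stand-in for a heavy, blocking computation."""
    values = [row[column] for row in rows]
    return {"count": len(values), "mean": statistics.fmean(values)}


async def analyze(rows: list[dict], column: str) -> dict:
    """Run the blocking step in a worker thread so the event loop stays responsive."""
    clean = drop_incomplete_rows(rows)
    return await asyncio.to_thread(summarize, clean, column)


if __name__ == "__main__":
    sample = [{"price": 10.0}, {"price": None}, {"price": 20.0}]
    print(asyncio.run(analyze(sample, "price")))
    print(pick_media_type(12_000))
```

A worker thread keeps request handling responsive, but for CPU-bound pure-Python work a process pool or a dedicated task queue may be a better fit.
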
## Docker Deployment (Coming Soon)

The project includes Docker configuration for containerized deployment. Instructions for running with Docker Compose will be added in the next update.

## Documentation

- **Project Context:** See `_bmad-output/project-context.md` for detailed implementation rules
- **Architecture:** Technical architecture documentation is available in `_bmad-output/planning-artifacts/`
- **API Reference:** Access interactive API documentation at `/docs` when the backend is running

## Contributing

This project follows modern software development practices with comprehensive planning artifacts. When contributing:

1. Read the project context file for implementation guidelines
2. Follow the established code patterns and conventions
3. Ensure all tests pass before submitting changes
4. Update documentation as needed

## License

[Specify your license here]

## Support

For questions or issues related to this project, please refer to the project documentation or contact the development team.

---

**Last Updated:** 2026-01-11