first draft for read me + requirements.txt

parent f95d59e44d
commit a4b7190756

.gitignore (vendored) @@ -7,4 +7,5 @@

  !.gitignore
  !*.service
  !*.timer
  !*.yaml
+ !*.txt

readme.md @@ -1,67 +1,279 @@
# Multimodal Driver State Analysis

This repository contains a full workflow for multimodal driver-state analysis in a simulator setting, from raw recording data to trained models and real-time inference.

It combines two modalities:

- Facial Action Units (AUs)
- Eye-tracking features (fixations, saccades, blinks, pupil dynamics)

## What This Project Covers

- Data extraction from raw simulator files (`.h5` / ownCloud)
- Conversion to subject-level Parquet files
- Sliding-window feature engineering (AU + eye tracking)
- Exploratory data analysis (EDA) notebooks
- Model training experiments (CNN, XGBoost, Isolation Forest, OCSVM, DeepSVDD)
- Real-time prediction from SQLite + MQTT publishing
- Optional Linux `systemd` deployment (`predict.service` + `predict.timer`)

## Repository Structure

```text
Fahrsimulator_MSY2526_AI/
|-- dataset_creation/
|   |-- parquet_file_creation.py
|   |-- create_parquet_files_from_owncloud.py
|   |-- combined_feature_creation.py
|   |-- maxDist.py
|   |-- AU_creation/
|   |   |-- AU_creation_service.py
|   |   `-- pyfeat_docu.ipynb
|   `-- camera_handling/
|       |-- camera_stream_AU_and_ET_new.py
|       |-- eyeFeature_new.py
|       |-- db_helper.py
|       `-- *.py (legacy variants/tests)
|-- EDA/
|   `-- *.ipynb
|-- model_training/
|   |-- CNN/
|   |-- xgboost/
|   |-- IsolationForest/
|   |-- OCSVM/
|   |-- DeepSVDD/
|   |-- MAD_outlier_removal/
|   `-- tools/
|-- predict_pipeline/
|   |-- predict_sample.py
|   |-- config.yaml
|   |-- predict.service
|   |-- predict.timer
|   |-- predict_service_timer_documentation.md
|   `-- fill_db.ipynb
|-- tools/
|   `-- db_helpers.py
`-- readme.md
```

## End-to-End Workflow

## 1) Data Ingestion and Conversion

Main scripts:

- `dataset_creation/create_parquet_files_from_owncloud.py`
- `dataset_creation/parquet_file_creation.py`

Purpose:

- Load simulator recordings from ownCloud or local `.h5` files.
- Select relevant columns (`STUDY`, `LEVEL`, `PHASE`, `FACE_AU*`, `EYE_*`).
- Filter invalid rows (for example `LEVEL == 0`).
- Save cleaned subject-level Parquet files.

Notes:

- These scripts contain placeholders for paths and credentials that must be adapted.
- ownCloud download uses `pyocclient` (`owncloud` module).
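
The selection and filtering rules above can be sketched as follows. This is an illustrative snippet, not the repository's actual code: the column values, the dummy DataFrame, and the output file name are made up, and the real scripts read `.h5` recordings or ownCloud downloads instead.

```python
import pandas as pd

# Stand-in for a raw simulator recording; real data comes from .h5 / ownCloud.
raw = pd.DataFrame({
    "STUDY": ["S1"] * 4,
    "LEVEL": [0, 1, 1, 2],
    "PHASE": ["baseline", "baseline", "drive", "drive"],
    "FACE_AU01": [0.1, 0.2, 0.3, 0.4],
    "EYE_PUPIL": [3.1, 3.2, 3.3, 3.4],
})

# Keep only context columns plus AU/eye signal columns.
keep = [c for c in raw.columns
        if c in ("STUDY", "LEVEL", "PHASE")
        or c.startswith(("FACE_AU", "EYE_"))]

# Drop invalid rows (LEVEL == 0) and restrict to the selected columns.
cleaned = raw.loc[raw["LEVEL"] != 0, keep]
print(len(cleaned))  # -> 3

try:
    # Subject-level Parquet output; needs pyarrow (or fastparquet) installed.
    cleaned.to_parquet("subject_01.parquet")
except ImportError:
    pass
```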

## 2) Feature Engineering (Offline Dataset)

Main script:

- `dataset_creation/combined_feature_creation.py`

Behavior:

- Processes all Parquet files in an input directory.
- Applies sliding windows:
  - Window size: 50 seconds (`25 Hz * 50 s = 1250 samples`)
  - Step size: 5 seconds (`125 samples`)
- Groups data by available context columns (`STUDY`, `LEVEL`, `PHASE`).
- Computes:
  - AU means per window (`FACE_AUxx_mean`)
  - Eye-tracking features:
    - Fixation counts and duration stats
    - Saccade count/amplitude/duration stats
    - Blink count/duration stats
    - Pupil mean and IPA (high-frequency pupil activity, 0.6-2.0 Hz)

Output:

- A combined Parquet dataset (one row per window), ready for model training.
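
The windowing arithmetic above (1250-sample windows, 125-sample steps at 25 Hz) can be sketched like this. The helper name and the dummy data are hypothetical; only the AU-mean aggregation and the sampling constants come from the documentation above.

```python
import numpy as np
import pandas as pd

FS = 25          # sampling rate in Hz
WIN = 50 * FS    # 1250 samples per 50 s window
STEP = 5 * FS    # 125 samples per 5 s step

def window_features(df: pd.DataFrame) -> pd.DataFrame:
    """Aggregate AU columns over overlapping 50 s windows (illustrative only)."""
    au_cols = [c for c in df.columns if c.startswith("FACE_AU")]
    rows = []
    for start in range(0, len(df) - WIN + 1, STEP):
        chunk = df.iloc[start:start + WIN]
        row = {f"{c}_mean": chunk[c].mean() for c in au_cols}
        row["start_sample"] = start
        rows.append(row)
    return pd.DataFrame(rows)

# A 60 s dummy recording at 25 Hz yields 3 window starts (at 0 s, 5 s, 10 s):
demo = pd.DataFrame({"FACE_AU01": np.ones(60 * FS)})
print(len(window_features(demo)))  # -> 3
```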

## 3) Camera-Based Online Feature Extraction

Main scripts:

- `dataset_creation/camera_handling/camera_stream_AU_and_ET_new.py`
- `dataset_creation/camera_handling/eyeFeature_new.py`

Behavior:

- Captures the webcam stream (`OpenCV`) at ~25 FPS.
- Computes eye metrics with `MediaPipe`.
- Records 50-second overlapping segments (a new segment starts every 5 seconds).
- Extracts AUs from the recorded clips using `py-feat`.
- Extracts eye features from the saved gaze Parquet file.
- Writes combined feature rows into an SQLite table (`feature_table`).

Important:

- Script paths and DB locations are currently hardcoded for the target environment and must be adapted.
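
One of the eye metrics above, blink count/duration, can be sketched from a pupil-size trace. This is an assumption-laden illustration: it approximates a blink as a run of invalid (NaN) pupil samples, whereas the actual pipeline may detect blinks differently (e.g. via MediaPipe landmarks or `pygazeanalyser`).

```python
import numpy as np

FS = 25  # sampling rate in Hz

def blink_stats(pupil: np.ndarray) -> dict:
    """Count blinks and measure durations; blink = run of NaN pupil samples."""
    invalid = np.isnan(pupil).astype(int)
    # Pad with zeros so diff marks run starts (+1) and ends (-1) cleanly.
    edges = np.diff(np.concatenate(([0], invalid, [0])))
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1)
    durations_ms = (ends - starts) * 1000.0 / FS
    return {
        "blink_count": len(starts),
        "blink_dur_mean_ms": float(durations_ms.mean()) if len(starts) else 0.0,
        "blink_dur_median_ms": float(np.median(durations_ms)) if len(starts) else 0.0,
    }

# Two blinks of 3 and 5 samples (120 ms and 200 ms at 25 Hz):
trace = np.ones(50)
trace[10:13] = np.nan
trace[30:35] = np.nan
print(blink_stats(trace))  # blink_count=2, mean duration 160 ms
```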

## 4) Model Training

Location:

- `model_training/` (mostly notebook-driven)

Includes experiments for:

- CNN-based fusion variants
- XGBoost
- Isolation Forest
- OCSVM
- DeepSVDD

Utility modules:

- `model_training/tools/scaler.py` for fitting/saving/applying scalers
- `model_training/tools/mad_outlier_removal.py`
- `model_training/tools/performance_split.py`
- `model_training/tools/evaluation_tools.py`
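
The one-class setup behind the Isolation Forest experiments can be sketched with scikit-learn. The data here is synthetic; the real experiments train on windowed AU/eye features from the step 2 dataset and live in the notebooks.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Stand-ins for windowed features: "normal" driving windows for training,
# then a mix of normal and clearly anomalous windows for testing.
X_train = rng.normal(0.0, 1.0, size=(500, 8))
X_test = np.vstack([
    rng.normal(0.0, 1.0, size=(5, 8)),   # normal windows
    rng.normal(8.0, 1.0, size=(5, 8)),   # far-off anomalous windows
])

# Fit the scaler on training data only, as the tools/scaler.py module suggests.
scaler = StandardScaler().fit(X_train)
clf = IsolationForest(random_state=0).fit(scaler.transform(X_train))

# predict() returns +1 for inliers and -1 for outliers.
labels = clf.predict(scaler.transform(X_test))
print(labels)
```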

## 5) Real-Time Prediction and Messaging

Main script:

- `predict_pipeline/predict_sample.py`

Runtime behavior:

- Reads the latest row from SQLite (`database.path`, `database.table`, `database.key`).
- Applies NaN handling using fallback medians from `config.yaml`.
- Optionally scales features using a saved scaler (`.pkl` or `.joblib`).
- Loads the model (`.keras`, `.pkl`, or `.joblib`) and predicts.
- Publishes a JSON message via MQTT (topic/host/QoS from config).

Message shape:

```json
{
  "valid": true,
  "_id": 123,
  "prediction": 0
}
```

(The `prediction` key is configurable via `mqtt.publish_format.result_key`.)
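
Building and publishing that message can be sketched with `paho-mqtt`. The payload shape follows the documentation above; the topic and host names are placeholders, and the send is guarded because it needs a reachable broker.

```python
import json

try:
    # paho-mqtt is in requirements.txt; guarded so the sketch runs without it.
    import paho.mqtt.publish as publish
except ImportError:
    publish = None

def build_payload(row_id, prediction, result_key="prediction"):
    """Serialize the documented message shape; result_key mirrors
    mqtt.publish_format.result_key from config.yaml."""
    return json.dumps({"valid": True, "_id": row_id, result_key: prediction})

payload = build_payload(123, 0)
print(payload)  # -> {"valid": true, "_id": 123, "prediction": 0}

# Topic/host/qos come from config.yaml in the real script; the values below
# are placeholders. Flip the False to actually send.
if publish is not None and False:
    publish.single(topic="driver/state", payload=payload,
                   hostname="localhost", qos=1)
```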

## 6) Automated Execution with systemd (Linux)

Files:

- `predict_pipeline/predict.service`
- `predict_pipeline/predict.timer`

Current timer behavior:

- First run after 60 s (`OnActiveSec=60`)
- Then every 5 s (`OnUnitActiveSec=5`)

Detailed operation and commands:

- `predict_pipeline/predict_service_timer_documentation.md`
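
A minimal sketch of what such a timer unit looks like. Only `OnActiveSec=60` and `OnUnitActiveSec=5` are documented above; the other directives are reasonable additions, not necessarily what the repository's `predict.timer` contains.

```ini
[Unit]
Description=Run predict.service on a short interval

[Timer]
# First run 60 s after the timer itself is started.
OnActiveSec=60
# Then re-run 5 s after each activation of the service.
OnUnitActiveSec=5
# Assumption: without this, systemd may coalesce short intervals
# (the default accuracy window is 1 minute).
AccuracySec=1s
Unit=predict.service

[Install]
WantedBy=timers.target
```

The `[Install]` section is what makes `systemctl enable predict.timer` (see Quick Start C) work.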

## Installation

Install dependencies from the tracked requirements file:

```bash
pip install -r requirements.txt
```

## Python Version

Recommended:

- Python `3.10` to `3.12`

## Core Dependencies

```bash
pip install numpy pandas scipy scikit-learn pyarrow pyyaml joblib paho-mqtt matplotlib
```

## Computer Vision / Eye Tracking / AU Stack

```bash
pip install opencv-python mediapipe torch moviepy
pip install pygazeanalyser
pip install py-feat
```

## Data Access (optional)

```bash
pip install pyocclient h5py tables
```

## Notes

- `tensorflow` is required for `.keras` model inference in `predict_sample.py`.
- `py-feat`, `mediapipe`, and `torch` can be platform-sensitive; pin versions per your target machine.

## Configuration

Primary runtime config:

- `predict_pipeline/config.yaml`

Sections:

- `database`: SQLite path/table/key
- `model`: model file path
- `scaler`: scaling toggle + scaler path
- `mqtt`: broker connection + publish format
- `sample.columns`: expected feature order
- `fallback`: median/default feature values used for NaN replacement

Before running prediction, verify all absolute paths in `config.yaml`.
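
A hypothetical `config.yaml` skeleton matching the sections above. Only the section names and the keys quoted elsewhere in this readme (`database.path`, `database.table`, `database.key`, `mqtt.publish_format.result_key`, `sample.columns`, `fallback`) are documented; every other key name and all values are illustrative placeholders.

```yaml
database:
  path: /path/to/features.db        # placeholder, adapt
  table: feature_table
  key: _Id

model:
  path: /path/to/model.keras        # .keras, .pkl, or .joblib

scaler:
  enabled: true
  path: /path/to/scaler.joblib

mqtt:
  host: localhost
  port: 1883
  topic: driver/state               # placeholder topic
  qos: 1
  publish_format:
    result_key: prediction

sample:
  columns: [FACE_AU01_mean, FACE_AU02_mean]   # full feature order omitted

fallback:
  FACE_AU01_mean: 0.12              # median fallback per feature
```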

## Quick Start

## A) Build Training Dataset (Offline)

1. Set input/output paths in:
   - `dataset_creation/parquet_file_creation.py`
   - `dataset_creation/combined_feature_creation.py`
2. Generate subject Parquet files:

   ```bash
   python dataset_creation/parquet_file_creation.py
   ```

3. Generate the combined sliding-window feature dataset:

   ```bash
   python dataset_creation/combined_feature_creation.py
   ```

## B) Run Prediction Once

1. Update the paths in `predict_pipeline/config.yaml`.
2. Run:

   ```bash
   python predict_pipeline/predict_sample.py
   ```

## C) Run as systemd Service + Timer (Linux)

1. Copy the unit files to `/etc/systemd/system/`.
2. Adjust `ExecStart` and the user in `predict.service`.
3. Enable and start the timer:

   ```bash
   sudo systemctl daemon-reload
   sudo systemctl enable predict.timer
   sudo systemctl start predict.timer
   ```

Monitor logs:

```bash
journalctl -u predict.service -f
```

## Database and Table Expectations

The prediction script expects an SQLite table with at least:

- `_Id`
- `start_time`
- all model feature columns listed in `config.yaml` under `sample.columns`

The camera pipeline writes feature rows into `feature_table` using helper utilities in:

- `dataset_creation/camera_handling/db_helper.py`
- `tools/db_helpers.py`
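
Reading the latest feature row can be sketched with the standard library. The table layout follows the minimum columns listed above; the helper name and the demo values are made up, and `predict_sample.py` takes the table/key names from `config.yaml` rather than hardcoding them.

```python
import os
import sqlite3
import tempfile

def fetch_latest_row(db_path, table="feature_table", key="_Id"):
    """Return the most recent row (highest key value) as a dict, or None."""
    con = sqlite3.connect(db_path)
    con.row_factory = sqlite3.Row
    try:
        cur = con.execute(f'SELECT * FROM "{table}" ORDER BY "{key}" DESC LIMIT 1')
        row = cur.fetchone()
        return dict(row) if row is not None else None
    finally:
        con.close()

# Demo database with the documented minimum columns plus one feature column:
path = os.path.join(tempfile.mkdtemp(), "demo.db")
con = sqlite3.connect(path)
con.execute("CREATE TABLE feature_table "
            "(_Id INTEGER PRIMARY KEY, start_time TEXT, FACE_AU01_mean REAL)")
con.executemany("INSERT INTO feature_table VALUES (?, ?, ?)",
                [(1, "t0", 0.1), (2, "t1", 0.2)])
con.commit()
con.close()

print(fetch_latest_row(path))  # row with _Id == 2
```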

## License

No license file is currently present in this repository.
Add a `LICENSE` file if this project should be shared or reused externally.

requirements.txt (new file) @@ -0,0 +1,26 @@

```text
# Core data + ML utilities
numpy
pandas
scipy
scikit-learn
pyarrow
joblib
PyYAML
matplotlib

# Prediction pipeline
paho-mqtt
tensorflow

# Camera / feature extraction stack
opencv-python
mediapipe
torch
moviepy
pygazeanalyser
py-feat

# Data ingestion (ownCloud + HDF)
pyocclient
h5py
tables
```