changes to CNN report

2026-03-19 18:44:00 +01:00 · 2026-03-19 18:44:00 +01:00 · 96b3e35248
commit 96b3e35248
parent 0483c3fea3
1 changed files with 54 additions and 35 deletions
--- a/project_report.md
+++ b/project_report.md
@@ -123,7 +123,7 @@ Supporting utilities in ```model_training/tools```:
 ### 4.1 CNNs
 This section summarizes all CNN‑based supervised learning approaches implemented in the project.  
-All models operate on **facial Action Unit (AU)** features and, depending on the notebook, additional **eye‑tracking features**.  
+All models operate on facial Action Unit (AU) features and, depending on the notebook, additional eye‑tracking features.  
 The notebooks differ in evaluation methodology, fusion strategy, and experimental intention.
 ### 4.1.1 Baseline CNN (Notebook: *CNN_simple*)  
@@ -132,17 +132,17 @@ The model uses two convolutional layers, batch normalization, max pooling, and a
 A single subject‑exclusive train/validation/test split is used.
 The intention behind this notebook is to:  
- Provide a **baseline performance level**  
+- Provide a baseline performance level  
 - Validate that AU features contain discriminative information  
 - Identify overfitting tendencies before moving to more rigorous evaluation  
 ### 4.1.2 Cross‑Validated CNN (Notebook: *CNN_crossVal*)  
-This notebook introduces **5‑fold GroupKFold cross‑validation**, ensuring subject‑exclusive folds.  
+This notebook introduces 5‑fold GroupKFold cross‑validation, ensuring subject‑exclusive folds.  
 The architecture is similar to the baseline but includes stronger regularization and a lower learning rate.
 The intention behind this notebook is to:  
- Provide **robust generalization estimates**  
+- Provide robust generalization estimates  
 - Reduce variance caused by single‑split evaluation  
 - Establish a cross‑validated AU‑only benchmark  
@@ -151,13 +151,14 @@ The intention behind this notebook is to:
 This notebook is a streamlined version of the previous one, removing unused eye‑tracking features and focusing exclusively on AUs.
 The intention behind this notebook is to:  
- Provide a **clean AU‑only benchmark**  
+- Provide a clean AU‑only benchmark  
 - Improve reproducibility and interpretability  
 - Prepare for multimodal comparisons  
-### 4.1.4 Cross‑Validated CNN with Early Fusion (AUs + Eye Features)  (Notebook: *CNN_crossVal_faceAUs_eyeFeatures*)  
+### 4.1.4 Cross‑Validated CNN with Early Fusion (AUs + Eye Features)  
-This notebook introduces **early fusion**, concatenating AU and eye‑tracking features into a single input vector.  
+(Notebook: *CNN_crossVal_faceAUs_eyeFeatures*)  
 This notebook introduces early fusion, concatenating AU and eye‑tracking features into a single input vector.  
 The architecture remains identical to AU‑only models.
 The intention behind this notebook is to:  
@@ -171,7 +172,7 @@ This notebook didn't lead to any useful results.
 This notebook refines the early‑fusion approach by removing samples with missing values and ensuring consistent multimodal input quality.
 The intention behind this notebook is to:  
- Provide a **clean and fully validated early‑fusion model**  
+- Provide a clean and fully validated early‑fusion model  
 - Investigate multimodal complementarity under rigorous CV  
 - Improve interpretability through aggregated confusion matrices  
@@ -180,68 +181,86 @@ The intention behind this notebook is to:
 This notebook applies domain‑specific filtering to isolate a more homogeneous subset of cognitive states before training.
 The intention behind this notebook is to:  
- Evaluate whether **subset filtering** improves multimodal learning  
+- Evaluate whether subset filtering improves multimodal learning  
 - Reduce dataset heterogeneity  
 - Provide a controlled multimodal benchmark  
-### 4.1.7 Hybrid‑Fusion CNN  
+### 4.1.7 Hybrid‑Fusion CNN  (Notebook: *CNN_crossVal_HybridFusion*)  
-(Notebook: *CNN_crossVal_HybridFusion*)  
+This notebook introduces a hybrid‑fusion architecture with two modality‑specific branches:  
 This notebook introduces a **hybrid‑fusion architecture** with two modality‑specific branches:  
 - A 1D CNN for AUs  
 - A dense MLP for eye‑tracking features  
 The branches are fused before classification.
 The intention behind this notebook is to:  
- Allow each modality to learn **specialized representations**  
+- Allow each modality to learn specialized representations  
 - Evaluate whether hybrid fusion outperforms early fusion  
 - Provide a strong multimodal benchmark  
-### 4.1.8 Early‑Fusion CNN with Independent Test Evaluation  
+### 4.1.8 Early‑Fusion CNN with Independent Test Evaluation  (Notebook: *CNN_crossVal_EarlyFusion_Test_Eval*)  
-(Notebook: *CNN_crossVal_EarlyFusion_Test_Eval*)  
+This notebook introduces the first true held‑out test evaluation for an early‑fusion CNN.  
 This notebook introduces the first **true held‑out test evaluation** for an early‑fusion CNN.  
 A subject‑exclusive train/test split is created before cross‑validation.
 The intention behind this notebook is to:  
- Provide a **deployment‑realistic performance estimate**  
+- Provide a deployment‑realistic performance estimate  
 - Compare validation‑fold behavior with true test‑set behavior  
 - Visualize ROC and PR curves for threshold analysis  
 | Metric / Model | CNN_crossVal_EarlyFusion_Test_Eval |
 |----------------|-------------------------------------|
-| Test Accuracy |                                      |
+| Test Accuracy | 0.913 |
-| Test F1        |                                      |
+| Test F1 | 0.927 |
-| Test AUC       |                                      |
+| Test AUC | 0.967 |
-| Balanced Accuracy |                                   |
+| Balanced Accuracy | 0.907 |
-| Precision       |                                     |
+| Precision | 0.918 |
-| Recall          |                                     |
+| Recall | 0.937 |
-### 4.1.9 Hybrid‑Fusion CNN with Independent Test Evaluation  
+#### Confusion Matrix
-(Notebook: *CNN_crossVal_HybridFusion_Test_Eval*)  
+![Confusion matrix](results/Konfusionsmatrix_EarlyFusion.png)
 *Figure 4.1.8.1: Confusion matrix of the Early‑Fusion model.*
 #### ROC Curve
 ![ROC curve](results/ROC_EarlyFusion.png)
 *Figure 4.1.8.2: ROC curve of the Early‑Fusion model.*
 ### 4.1.9 Hybrid‑Fusion CNN with Independent Test Evaluation  (Notebook: *CNN_crossVal_HybridFusion_Test_Eval*)  
 This notebook extends hybrid fusion with a subject‑exclusive train/test split and full test‑set evaluation.
 The intention behind this notebook is to:  
- Evaluate hybrid fusion under **realistic deployment conditions**  
+- Evaluate hybrid fusion under realistic deployment conditions  
 - Compare hybrid vs. early fusion on unseen subjects  
 - Provide full diagnostic plots (ROC, PR, confusion matrices)  
 | Metric / Model | CNN_crossVal_HybridFusion_Test_Eval |
 |----------------|--------------------------------------|
-| Test Accuracy |                                       |
+| Test Accuracy | 0.950 |
-| Test F1        |                                       |
+| Test F1 | 0.959 |
-| Test AUC       |                                       |
+| Test AUC | 0.983 |
-| Balanced Accuracy |                                    |
+| Balanced Accuracy | 0.942 |
-| Precision       |                                      |
+| Precision | 0.933 |
-| Recall          |                                      |
+| Recall | 0.986 |
 #### Confusion Matrix
 ![Confusion matrix](results/Konfusionsmatrix_HybridFusion.png)
 *Figure 4.1.9.1: Confusion matrix of the Hybrid‑Fusion model.*
 #### ROC Curve
 ![ROC curve](results/ROC_HybridFusion.png)
 *Figure 4.1.9.2: ROC curve of the Hybrid‑Fusion model.*
 ### 4.1.10 Summary
 Across all nine notebooks, the project progresses from a simple AU‑only baseline to advanced multimodal hybrid‑fusion architectures with independent test evaluation.  
 This progression reflects increasing methodological rigor and prepares the foundation for selecting a final deployment model. 
-Ultimately, the experiments showed that **early fusion and hybrid fusion perform very similarly**, with no substantial performance advantage for either approach.  
+In the final held‑out test evaluations, hybrid fusion outperformed early fusion. Both approaches achieve strong results, but the hybrid‑fusion model reaches higher accuracy (95.0% vs. 91.3%), higher AUC (0.983 vs. 0.967), and notably higher recall (98.6% vs. 93.7%), indicating that it is more effective at correctly identifying high‑workload samples.
-Furthermore, even when relying **solely on facial Action Unit data**, the models achieve **strong and competitive results**, indicating that AUs alone already capture a significant portion of the cognitive workload signal.
+Precision is also slightly higher for hybrid fusion (93.3% vs. 91.8%), so its gain in recall does not come at the expense of precision.
 Looking ahead, further improvements could likely be achieved through more extensive hyperparameter tuning, as the current results suggest that additional optimization headroom remains.
 ### 4.2 XGBoost
 This documentation outlines the evolution of the XGBoost classification pipeline for cognitive workload detection. The project transitioned from a basic unimodal setup to a sophisticated, multi-stage hybrid system incorporating advanced statistical filtering and deep feature extraction.