changes to CNN report
parent 0483c3fea3
commit 96b3e35248
@@ -123,7 +123,7 @@ Supporting utilities in ```model_training/tools```:
|
|||||||
|
|
||||||
### 4.1 CNNs
|
### 4.1 CNNs
|
||||||
This section summarizes all CNN‑based supervised learning approaches implemented in the project.
|
This section summarizes all CNN‑based supervised learning approaches implemented in the project.
|
||||||
All models operate on **facial Action Unit (AU)** features and, depending on the notebook, additional **eye‑tracking features**.
|
All models operate on facial Action Unit (AU) features and, depending on the notebook, additional eye‑tracking features.
|
||||||
The notebooks differ in evaluation methodology, fusion strategy, and experimental intention.
|
The notebooks differ in evaluation methodology, fusion strategy, and experimental intention.
|
||||||
|
|
||||||
### 4.1.1 Baseline CNN (Notebook: *CNN_simple*)
|
### 4.1.1 Baseline CNN (Notebook: *CNN_simple*)
|
||||||
@@ -132,17 +132,17 @@ The model uses two convolutional layers, batch normalization, max pooling, and a
|
|||||||
A single subject‑exclusive train/validation/test split is used.
|
A single subject‑exclusive train/validation/test split is used.
|
||||||
|
|
||||||
The intention behind this notebook is to:
|
The intention behind this notebook is to:
|
||||||
- Provide a **baseline performance level**
|
- Provide a baseline performance level
|
||||||
- Validate that AU features contain discriminative information
|
- Validate that AU features contain discriminative information
|
||||||
- Identify overfitting tendencies before moving to more rigorous evaluation
|
- Identify overfitting tendencies before moving to more rigorous evaluation
|
||||||
|
|
||||||
|
|
||||||
### 4.1.2 Cross‑Validated CNN (Notebook: *CNN_crossVal*)
|
### 4.1.2 Cross‑Validated CNN (Notebook: *CNN_crossVal*)
|
||||||
This notebook introduces **5‑fold GroupKFold cross‑validation**, ensuring subject‑exclusive folds.
|
This notebook introduces 5‑fold GroupKFold cross‑validation, ensuring subject‑exclusive folds.
|
||||||
The architecture is similar to the baseline but includes stronger regularization and a lower learning rate.
|
The architecture is similar to the baseline but includes stronger regularization and a lower learning rate.
|
||||||
|
|
||||||
The intention behind this notebook is to:
|
The intention behind this notebook is to:
|
||||||
- Provide **robust generalization estimates**
|
- Provide robust generalization estimates
|
||||||
- Reduce variance caused by single‑split evaluation
|
- Reduce variance caused by single‑split evaluation
|
||||||
- Establish a cross‑validated AU‑only benchmark
|
- Establish a cross‑validated AU‑only benchmark
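As a rough illustration of subject‑exclusive 5‑fold cross‑validation, the sketch below uses scikit‑learn's `GroupKFold`. The toy arrays, the feature count, and the name `subject_ids` are assumptions made for the example and are not taken from the notebook.

```python
import numpy as np
from sklearn.model_selection import GroupKFold

# Hypothetical toy data: 200 samples, 20 AU features, 10 subjects.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))
y = rng.integers(0, 2, size=200)
subject_ids = np.repeat(np.arange(10), 20)  # 20 samples per subject

cv = GroupKFold(n_splits=5)
folds = []
for fold, (train_idx, val_idx) in enumerate(cv.split(X, y, groups=subject_ids)):
    train_subjects = set(subject_ids[train_idx])
    val_subjects = set(subject_ids[val_idx])
    # Subject-exclusive: no subject appears on both sides of a fold.
    assert train_subjects.isdisjoint(val_subjects)
    folds.append((train_idx, val_idx))
    print(f"fold {fold}: {len(val_subjects)} validation subjects")
```

Passing the subject identifier as `groups` is what keeps all samples of one person in the same fold, so validation scores reflect generalization to unseen subjects.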
|
||||||
|
|
||||||
@@ -151,13 +151,14 @@ The intention behind this notebook is to:
|
|||||||
This notebook is a streamlined version of the previous one, removing unused eye‑tracking features and focusing exclusively on AUs.
|
This notebook is a streamlined version of the previous one, removing unused eye‑tracking features and focusing exclusively on AUs.
|
||||||
|
|
||||||
The intention behind this notebook is to:
|
The intention behind this notebook is to:
|
||||||
- Provide a **clean AU‑only benchmark**
|
- Provide a clean AU‑only benchmark
|
||||||
- Improve reproducibility and interpretability
|
- Improve reproducibility and interpretability
|
||||||
- Prepare for multimodal comparisons
|
- Prepare for multimodal comparisons
|
||||||
|
|
||||||
|
|
||||||
### 4.1.4 Cross‑Validated CNN with Early Fusion (AUs + Eye Features) (Notebook: *CNN_crossVal_faceAUs_eyeFeatures*)
|
### 4.1.4 Cross‑Validated CNN with Early Fusion (AUs + Eye Features)
|
||||||
This notebook introduces **early fusion**, concatenating AU and eye‑tracking features into a single input vector.
|
(Notebook: *CNN_crossVal_faceAUs_eyeFeatures*)
|
||||||
|
This notebook introduces early fusion, concatenating AU and eye‑tracking features into a single input vector.
|
||||||
The architecture remains identical to AU‑only models.
|
The architecture remains identical to AU‑only models.
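A minimal sketch of early fusion, assuming each sample has one AU vector and one eye‑tracking vector. The feature counts (20 and 6) and the channels‑last input layout are illustrative assumptions, not values from the notebook.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 8
au_features = rng.normal(size=(n_samples, 20))   # hypothetical: 20 AU features
eye_features = rng.normal(size=(n_samples, 6))   # hypothetical: 6 eye-tracking features

# Early fusion: join both modalities feature-wise before the model sees them.
fused = np.concatenate([au_features, eye_features], axis=1)

# Assumed channels-last layout for a 1D CNN: (samples, length, channels).
fused_cnn_input = fused[..., np.newaxis]

print(fused.shape, fused_cnn_input.shape)
```

Because fusion happens at the input, the network itself needs no structural change. Only the input length grows.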
|
||||||
|
|
||||||
The intention behind this notebook is to:
|
The intention behind this notebook is to:
|
||||||
@@ -171,7 +172,7 @@ This notebook didn't lead to any useful results.
|
|||||||
This notebook refines the early‑fusion approach by removing samples with missing values and ensuring consistent multimodal input quality.
|
This notebook refines the early‑fusion approach by removing samples with missing values and ensuring consistent multimodal input quality.
|
||||||
|
|
||||||
The intention behind this notebook is to:
|
The intention behind this notebook is to:
|
||||||
- Provide a **clean and fully validated early‑fusion model**
|
- Provide a clean and fully validated early‑fusion model
|
||||||
- Investigate multimodal complementarity under rigorous CV
|
- Investigate multimodal complementarity under rigorous CV
|
||||||
- Improve interpretability through aggregated confusion matrices
|
- Improve interpretability through aggregated confusion matrices
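One plausible way to aggregate confusion matrices across folds is to sum the per‑fold counts. The sketch below assumes binary 0/1 labels and uses made‑up fold results. The helper name `confusion_matrix_binary` is hypothetical.

```python
import numpy as np

def confusion_matrix_binary(y_true, y_pred):
    """Return [[TN, FP], [FN, TP]] for 0/1 labels."""
    cm = np.zeros((2, 2), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

# Hypothetical per-fold (true labels, predictions) pairs.
fold_results = [
    ([0, 0, 1, 1], [0, 1, 1, 1]),
    ([0, 1, 1, 0], [0, 1, 0, 0]),
    ([1, 1, 0, 0], [1, 1, 0, 0]),
]

# Summing counts weights every validation sample equally across folds.
aggregated = sum(confusion_matrix_binary(t, p) for t, p in fold_results)

# Row-normalize to read off per-class rates.
row_normalized = aggregated / aggregated.sum(axis=1, keepdims=True)
print(aggregated)
```

Summing raw counts rather than averaging per‑fold rates avoids giving small folds disproportionate influence.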
|
||||||
|
|
||||||
@@ -180,68 +181,86 @@ The intention behind this notebook is to:
|
|||||||
This notebook applies domain‑specific filtering to isolate a more homogeneous subset of cognitive states before training.
|
This notebook applies domain‑specific filtering to isolate a more homogeneous subset of cognitive states before training.
|
||||||
|
|
||||||
The intention behind this notebook is to:
|
The intention behind this notebook is to:
|
||||||
- Evaluate whether **subset filtering** improves multimodal learning
|
- Evaluate whether subset filtering improves multimodal learning
|
||||||
- Reduce dataset heterogeneity
|
- Reduce dataset heterogeneity
|
||||||
- Provide a controlled multimodal benchmark
|
- Provide a controlled multimodal benchmark
|
||||||
|
|
||||||
|
|
||||||
### 4.1.7 Hybrid‑Fusion CNN
|
### 4.1.7 Hybrid‑Fusion CNN (Notebook: *CNN_crossVal_HybridFusion*)
|
||||||
(Notebook: *CNN_crossVal_HybridFusion*)
|
This notebook introduces a hybrid‑fusion architecture with two modality‑specific branches:
|
||||||
This notebook introduces a **hybrid‑fusion architecture** with two modality‑specific branches:
|
|
||||||
- A 1D CNN for AUs
|
- A 1D CNN for AUs
|
||||||
- A dense MLP for eye‑tracking features
|
- A dense MLP for eye‑tracking features
|
||||||
|
|
||||||
The branches are fused before classification.
|
The branches are fused before classification.
|
||||||
|
|
||||||
The intention behind this notebook is to:
|
The intention behind this notebook is to:
|
||||||
- Allow each modality to learn **specialized representations**
|
- Allow each modality to learn specialized representations
|
||||||
- Evaluate whether hybrid fusion outperforms early fusion
|
- Evaluate whether hybrid fusion outperforms early fusion
|
||||||
- Provide a strong multimodal benchmark
|
- Provide a strong multimodal benchmark
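The two‑branch data flow can be sketched as a forward pass in plain NumPy, which keeps the example framework‑agnostic since this section does not name the deep‑learning library. Layer sizes, random weights, the single dense layer in the eye branch, and the sigmoid output for a binary label are all assumptions for illustration, not the notebook's actual architecture.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def conv1d_valid(x, kernels):
    """Cross-correlate x (batch, length) with kernels (n_filters, width).

    Returns an array of shape (batch, length - width + 1, n_filters).
    """
    width = kernels.shape[1]
    n_positions = x.shape[1] - width + 1
    windows = np.stack([x[:, i:i + width] for i in range(n_positions)], axis=1)
    return windows @ kernels.T

def hybrid_forward(au, eye, params):
    # AU branch: 1D convolution -> ReLU -> global max pooling over positions.
    au_repr = relu(conv1d_valid(au, params["conv_k"])).max(axis=1)
    # Eye branch: a single dense layer standing in for the MLP.
    eye_repr = relu(eye @ params["eye_w"] + params["eye_b"])
    # Fusion: concatenate the branch representations, then classify.
    fused = np.concatenate([au_repr, eye_repr], axis=1)
    logits = fused @ params["out_w"] + params["out_b"]
    return 1.0 / (1.0 + np.exp(-logits))  # sigmoid probability

rng = np.random.default_rng(0)
params = {
    "conv_k": rng.normal(scale=0.1, size=(8, 3)),   # 8 filters of width 3
    "eye_w": rng.normal(scale=0.1, size=(6, 4)),
    "eye_b": np.zeros(4),
    "out_w": rng.normal(scale=0.1, size=(12, 1)),   # 8 AU + 4 eye units
    "out_b": np.zeros(1),
}
au = rng.normal(size=(5, 20))   # hypothetical: 5 samples, 20 AU features
eye = rng.normal(size=(5, 6))   # hypothetical: 5 samples, 6 eye features
probs = hybrid_forward(au, eye, params)
print(probs.shape)
```

In contrast to early fusion, the modalities only meet after each branch has produced its own representation.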
|
||||||
|
|
||||||
|
|
||||||
### 4.1.8 Early‑Fusion CNN with Independent Test Evaluation
|
### 4.1.8 Early‑Fusion CNN with Independent Test Evaluation (Notebook: *CNN_crossVal_EarlyFusion_Test_Eval*)
|
||||||
(Notebook: *CNN_crossVal_EarlyFusion_Test_Eval*)
|
This notebook introduces the first true held‑out test evaluation for an early‑fusion CNN.
|
||||||
This notebook introduces the first **true held‑out test evaluation** for an early‑fusion CNN.
|
|
||||||
A subject‑exclusive train/test split is created before cross‑validation.
|
A subject‑exclusive train/test split is created before cross‑validation.
|
||||||
|
|
||||||
The intention behind this notebook is to:
|
The intention behind this notebook is to:
|
||||||
- Provide a **deployment‑realistic performance estimate**
|
- Provide a deployment‑realistic performance estimate
|
||||||
- Compare validation‑fold behavior with true test‑set behavior
|
- Compare validation‑fold behavior with true test‑set behavior
|
||||||
- Visualize ROC and PR curves for threshold analysis
|
- Visualize ROC and PR curves for threshold analysis
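A sketch of the split order described above, assuming scikit‑learn: whole subjects are first set aside as a test set, and cross‑validation then runs only on the remaining subjects. The toy data, the 20% test share, and the variable names are assumptions.

```python
import numpy as np
from sklearn.model_selection import GroupKFold, GroupShuffleSplit

# Hypothetical toy data: 300 samples, 26 fused features, 15 subjects.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 26))
y = rng.integers(0, 2, size=300)
subject_ids = np.repeat(np.arange(15), 20)

# Step 1: set aside whole subjects as the held-out test set.
holdout = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=42)
dev_idx, test_idx = next(holdout.split(X, y, groups=subject_ids))

# Step 2: cross-validate only on the remaining (development) subjects.
cv = GroupKFold(n_splits=5)
cv_folds = [
    (dev_idx[train_rel], dev_idx[val_rel])  # map back to original indices
    for train_rel, val_rel in cv.split(
        X[dev_idx], y[dev_idx], groups=subject_ids[dev_idx]
    )
]

print(len(set(subject_ids[test_idx])), "test subjects never seen during CV")
```

Splitting before cross‑validation means no model selection decision can be influenced by the test subjects.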
|
||||||
|
|
||||||
| Metric / Model | CNN_crossVal_EarlyFusion_Test_Eval |
|
| Metric / Model | CNN_crossVal_EarlyFusion_Test_Eval |
|
||||||
|----------------|-------------------------------------|
|
|----------------|-------------------------------------|
|
||||||
| Test Accuracy | |
|
| Test Accuracy | 0.913 |
|
||||||
| Test F1 | |
|
| Test F1 | 0.927 |
|
||||||
| Test AUC | |
|
| Test AUC | 0.967 |
|
||||||
| Balanced Accuracy | |
|
| Balanced Accuracy | 0.907 |
|
||||||
| Precision | |
|
| Precision | 0.918 |
|
||||||
| Recall | |
|
| Recall | 0.937 |
|
||||||
|
|
||||||
### 4.1.9 Hybrid‑Fusion CNN with Independent Test Evaluation
|
#### Confusion Matrix
|
||||||
(Notebook: *CNN_crossVal_HybridFusion_Test_Eval*)
|

|
||||||
|
|
||||||
|
*Figure 4.1.8.1: Confusion matrix of the Early‑Fusion model.*
|
||||||
|
|
||||||
|
#### ROC Curve
|
||||||
|

|
||||||
|
|
||||||
|
*Figure 4.1.8.2: ROC curve of the Early‑Fusion model.*
|
||||||
|
|
||||||
|
### 4.1.9 Hybrid‑Fusion CNN with Independent Test Evaluation (Notebook: *CNN_crossVal_HybridFusion_Test_Eval*)
|
||||||
This notebook extends hybrid fusion with a subject‑exclusive train/test split and full test‑set evaluation.
|
This notebook extends hybrid fusion with a subject‑exclusive train/test split and full test‑set evaluation.
|
||||||
|
|
||||||
The intention behind this notebook is to:
|
The intention behind this notebook is to:
|
||||||
- Evaluate hybrid fusion under **realistic deployment conditions**
|
- Evaluate hybrid fusion under realistic deployment conditions
|
||||||
- Compare hybrid vs. early fusion on unseen subjects
|
- Compare hybrid vs. early fusion on unseen subjects
|
||||||
- Provide full diagnostic plots (ROC, PR, confusion matrices)
|
- Provide full diagnostic plots (ROC, PR, confusion matrices)
|
||||||
|
|
||||||
| Metric / Model | CNN_crossVal_HybridFusion_Test_Eval |
|
| Metric / Model | CNN_crossVal_HybridFusion_Test_Eval |
|
||||||
|----------------|--------------------------------------|
|
|----------------|--------------------------------------|
|
||||||
| Test Accuracy | |
|
| Test Accuracy | 0.950 |
|
||||||
| Test F1 | |
|
| Test F1 | 0.959 |
|
||||||
| Test AUC | |
|
| Test AUC | 0.983 |
|
||||||
| Balanced Accuracy | |
|
| Balanced Accuracy | 0.942 |
|
||||||
| Precision | |
|
| Precision | 0.933 |
|
||||||
| Recall | |
|
| Recall | 0.986 |
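As a consistency check on the reported tables, F1 can be recomputed from precision and recall, and the specificity implied by balanced accuracy can be derived. The sketch assumes binary classification where recall refers to the positive class. Since the inputs are rounded to three decimals, the derived values are approximate.

```python
def f1_from_precision_recall(precision, recall):
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)

def specificity_from_balanced_accuracy(balanced_accuracy, recall):
    # Binary case: balanced accuracy = (recall + specificity) / 2.
    return 2 * balanced_accuracy - recall

# Values copied from the two test-evaluation tables.
reported = {
    "early_fusion": {"precision": 0.918, "recall": 0.937,
                     "f1": 0.927, "balanced_accuracy": 0.907},
    "hybrid_fusion": {"precision": 0.933, "recall": 0.986,
                      "f1": 0.959, "balanced_accuracy": 0.942},
}

derived = {}
for name, m in reported.items():
    f1 = f1_from_precision_recall(m["precision"], m["recall"])
    spec = specificity_from_balanced_accuracy(m["balanced_accuracy"], m["recall"])
    derived[name] = {"f1": f1, "specificity": spec}
    print(f"{name}: recomputed F1={f1:.3f} (reported {m['f1']:.3f}), "
          f"implied specificity={spec:.3f}")
```

Both recomputed F1 scores agree with the tables to three decimals, so the reported precision, recall, and F1 values are mutually consistent.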
|
||||||
|
|
||||||
|
#### Confusion Matrix
|
||||||
|

|
||||||
|
|
||||||
|
*Figure 4.1.9.1: Confusion matrix of the Hybrid‑Fusion model.*
|
||||||
|
|
||||||
|
#### ROC Curve
|
||||||
|

|
||||||
|
|
||||||
|
*Figure 4.1.9.2: ROC curve of the Hybrid‑Fusion model.*
|
||||||
|
|
||||||
### 4.1.10 Summary
|
### 4.1.10 Summary
|
||||||
Across all nine notebooks, the project progresses from a simple AU‑only baseline to advanced multimodal hybrid‑fusion architectures with independent test evaluation.
|
Across all nine notebooks, the project progresses from a simple AU‑only baseline to advanced multimodal hybrid‑fusion architectures with independent test evaluation.
|
||||||
This progression reflects increasing methodological rigor and prepares the foundation for selecting a final deployment model.
|
|
||||||
|
|
||||||
Ultimately, the experiments showed that **early fusion and hybrid fusion perform very similarly**, with no substantial performance advantage for either approach.
|
The final experiments revealed that hybrid fusion provides a measurable performance advantage over early fusion. While both approaches achieve strong results, the hybrid‑fusion model reaches higher test accuracy (95.0% vs. 91.3%) and substantially higher recall (98.6% vs. 93.7%), indicating that it is more effective at correctly identifying high‑workload samples.
|
||||||
Furthermore, even when relying **solely on facial Action Unit data**, the models achieve **strong and competitive results**, indicating that AUs alone already capture a significant portion of the cognitive workload signal.
|
Hybrid fusion also reaches slightly higher precision (93.3% vs. 91.8%), so the reported test metrics do not indicate that early fusion produces fewer false positives.
|
||||||
|
|
||||||
|
Looking ahead, further improvements could likely be achieved through more extensive hyperparameter tuning, as the current results suggest that additional optimization headroom remains.
|
||||||
|
|
||||||
### 4.2 XGBoost
|
### 4.2 XGBoost
|
||||||
This documentation outlines the evolution of the XGBoost classification pipeline for cognitive workload detection. The project transitioned from a basic unimodal setup to a sophisticated, multi-stage hybrid system incorporating advanced statistical filtering and deep feature extraction.
|
This documentation outlines the evolution of the XGBoost classification pipeline for cognitive workload detection. The project transitioned from a basic unimodal setup to a sophisticated, multi-stage hybrid system incorporating advanced statistical filtering and deep feature extraction.
|
||||||
|
|||||||