Citation
Winarno, Sri and Alzami, Farrikh and Santoso, Dewi Agustini and Naufal, Muhammad and Azies, Harun Al and Brilianto, Rivaldo Mersis and Sonai Muthu Anbananthen, Kalaiarasi (2025) Efficient GRU-based Facial Expression Recognition with Adaptive Loss Selection. Statistics, Optimization & Information Computing, 14 (6). pp. 3468-3499. ISSN 2311-004X
Text
3043-Article Text-13093-2-10-20251119.pdf - Published Version (3MB). Restricted to repository staff only.
Abstract
As real-world deployment of facial expression recognition systems becomes increasingly prevalent, computational efficiency emerges as a critical consideration alongside recognition accuracy. Current research places pronounced emphasis on accuracy maximization through sophisticated architectures, yet systematic evaluation of efficiency-performance trade-offs remains insufficient for resource-constrained deployment scenarios. This investigation presents a preliminary comparative analysis of Gated Recurrent Unit (GRU) and Long Short-Term Memory (LSTM) architectures for facial expression recognition, implementing a one-vs-all classification framework with adaptive loss function selection. A 2×2×2 factorial experimental design evaluates architecture, optimization strategy, and loss function complexity across six basic emotions using the controlled laboratory CK+ dataset with MediaPipe-based facial landmark features (468 keypoints). Critical methodological caveat: the limited sample size (n=6 per condition) restricts statistical power to the detection of only very large effects (Cohen’s d ≥ 1.43), so the results should be interpreted as preliminary evidence requiring large-scale validation (n ≥ 34 per condition for medium effect detection). The investigation reveals no statistically significant performance differences between architectures (p>0.05, effect sizes d ≤ 0.306), while GRU architectures demonstrate a 25% computational efficiency advantage through theoretical gate complexity analysis (3 vs 4 memory gates, relative complexity 0.75 vs 1.0), translating to fewer matrix operations per timestep while achieving comparable recognition performance. The system achieves 92.7% ± 5.0% overall accuracy with substantial per-emotion variability (F1-scores: 0.462-0.973).
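The 0.75 relative-complexity figure follows from counting weight blocks per recurrent layer. The sketch below is a minimal illustration using the textbook GRU and LSTM formulations, not the authors' implementation. The function name `recurrent_params`, the hidden size, and the choice of three coordinates per landmark are assumptions made for illustration only. Framework implementations may add extra bias terms, which shifts the ratio slightly.

```python
def recurrent_params(input_dim: int, hidden_dim: int, n_blocks: int) -> int:
    """Trainable weights in one recurrent layer (textbook formulation).

    Each block has an input matrix (hidden x input), a recurrent matrix
    (hidden x hidden), and one bias vector (hidden).
    """
    per_block = hidden_dim * (input_dim + hidden_dim) + hidden_dim
    return n_blocks * per_block


# Assumption: 468 landmarks with x, y, z coordinates each; the abstract
# only states the number of keypoints, not the feature layout.
INPUT_DIM = 468 * 3
# Assumption: hypothetical hidden size, not taken from the paper.
HIDDEN_DIM = 64

# LSTM: input, forget, and output gates plus the cell candidate (4 blocks).
lstm_params = recurrent_params(INPUT_DIM, HIDDEN_DIM, n_blocks=4)
# GRU: update and reset gates plus the hidden candidate (3 blocks).
gru_params = recurrent_params(INPUT_DIM, HIDDEN_DIM, n_blocks=3)

ratio = gru_params / lstm_params
print(f"LSTM parameters: {lstm_params}")
print(f"GRU parameters:  {gru_params}")
print(f"GRU / LSTM ratio: {ratio:.2f}")
```

Under this counting the ratio does not depend on the input or hidden dimensions, which is why the abstract can state it as a theoretical property. It says nothing about measured latency or memory use.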
Counterintuitively, standard binary cross-entropy significantly outperforms adaptive loss functions for minority class recall (p=0.002, d=-0.787), suggesting that the focal loss hyperparameters and threshold calibration need refinement. The adaptive loss selection mechanism represents a methodological contribution for addressing heterogeneous class imbalance across one-vs-all binary classifiers, though its effectiveness requires emotion-specific calibration. This work acknowledges fundamental limitations (a critically small sample size, validation on a single controlled dataset, and theoretical rather than empirical efficiency characterization) while providing preliminary evidence-based guidelines for architecture selection in computationally constrained facial expression recognition applications.
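The abstract does not specify how the adaptive mechanism chooses a loss for each one-vs-all classifier. One plausible reading is a rule keyed to the positive-class fraction, sketched below. The selection rule, the `imbalance_threshold` value, and the function names are all hypothetical. The focal loss follows the standard published form, with the commonly used defaults gamma = 2 and alpha = 0.25.

```python
import math

EPS = 1e-7  # keeps log() away from zero


def bce(p: float, y: int) -> float:
    """Binary cross-entropy for one predicted probability p and label y."""
    p = min(max(p, EPS), 1.0 - EPS)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def focal(p: float, y: int, gamma: float = 2.0, alpha: float = 0.25) -> float:
    """Focal loss: down-weights examples the model already classifies well."""
    p = min(max(p, EPS), 1.0 - EPS)
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def select_loss(labels, imbalance_threshold: float = 0.2):
    """Hypothetical rule: use focal loss when positives are rare, else BCE."""
    positive_fraction = sum(labels) / len(labels)
    return focal if positive_fraction < imbalance_threshold else bce


# Illustrative one-vs-all label vectors (made-up, not CK+ statistics).
rare_emotion_labels = [1] + [0] * 9        # 10% positives
common_emotion_labels = [1, 0, 1, 0]       # 50% positives

rare_loss = select_loss(rare_emotion_labels)
common_loss = select_loss(common_emotion_labels)
print(rare_loss.__name__, common_loss.__name__)
```

Because the focal term shrinks the loss on confident predictions, its interaction with the decision threshold matters for minority-class recall. This is consistent with the abstract's remark that hyperparameters and thresholds need per-emotion calibration.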
| Item Type: | Article |
|---|---|
| Uncontrolled Keywords: | Adaptive loss selection, computational efficiency, facial expression recognition, GRU, LSTM, MediaPipe, one-vs-all classification |
| Subjects: | Q Science > QA Mathematics > QA71-90 Instruments and machines > QA75.5-76.95 Electronic computers. Computer science |
| Divisions: | Faculty of Information Science and Technology (FIST) |
| Depositing User: | Nor Afiqah Mohd Adnan |
| Date Deposited: | 10 Dec 2025 01:54 |
| Last Modified: | 13 Dec 2025 03:01 |
| URI: | http://shdl.mmu.edu.my/id/eprint/15004 |
