Efficient GRU-based Facial Expression Recognition with Adaptive Loss Selection

Citation

Winarno, Sri and Alzami, Farrikh and Santoso, Dewi Agustini and Naufal, Muhammad and Azies, Harun Al and Brilianto, Rivaldo Mersis and Sonai Muthu Anbananthen, Kalaiarasi (2025) Efficient GRU-based Facial Expression Recognition with Adaptive Loss Selection. Statistics, Optimization & Information Computing, 14 (6). pp. 3468-3499. ISSN 2311-004X

3043-Article Text-13093-2-10-20251119.pdf - Published Version
Restricted to Repository staff only

Abstract

As real-world deployment of facial expression recognition systems becomes increasingly prevalent, computational efficiency emerges as a critical consideration alongside recognition accuracy. Current research places pronounced emphasis on maximizing accuracy through sophisticated architectures, yet systematic evaluation of efficiency-performance trade-offs remains insufficient for resource-constrained deployment scenarios. This investigation presents a preliminary comparative analysis of Gated Recurrent Unit (GRU) and Long Short-Term Memory (LSTM) architectures for facial expression recognition, implementing a one-vs-all classification framework with adaptive loss function selection. A 2×2×2 factorial experimental design evaluates architecture, optimization strategy, and loss function complexity across six basic emotions using the controlled laboratory CK+ dataset with MediaPipe-based facial landmark features (468 keypoints). Critical methodological caveat: the limited sample size (n=6 per condition) restricts statistical power to detecting only very large effects (Cohen’s d ≥ 1.43), so the results should be interpreted as preliminary evidence requiring large-scale validation (n ≥ 34 per condition for medium effect detection). The investigation reveals no statistically significant performance differences between architectures (p>0.05, effect sizes d ≤ 0.306), while GRU architectures demonstrate a 25% computational efficiency advantage in a theoretical gate complexity analysis (3 vs 4 memory gates, relative complexity 0.75 vs 1.0), which translates to fewer matrix operations per timestep at comparable recognition performance. The system achieves 92.7% ± 5.0% overall accuracy with substantial per-emotion variability (F1-scores: 0.462-0.973).
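The 0.75 relative complexity quoted above can be illustrated with a simple parameter count. The following is a minimal sketch, assuming the textbook GRU and LSTM formulations in which each gate-like block (including the candidate transform) has one input matrix, one recurrent matrix, and one bias vector. The input size of 468 landmarks × 3 coordinates and the hidden size of 64 are illustrative assumptions and are not taken from the paper.

```python
def recurrent_params(input_dim: int, hidden_dim: int, n_blocks: int) -> int:
    """Trainable parameters of one recurrent layer.

    Each gate-like block holds an input weight matrix (hidden x input),
    a recurrent weight matrix (hidden x hidden), and a bias (hidden).
    """
    per_block = hidden_dim * input_dim + hidden_dim * hidden_dim + hidden_dim
    return n_blocks * per_block


# Assumed sizes (hypothetical, for illustration only).
INPUT_DIM = 468 * 3   # 468 MediaPipe landmarks, assuming (x, y, z) per landmark
HIDDEN_DIM = 64       # hypothetical hidden state size

# GRU: update gate, reset gate, candidate state -> 3 blocks.
# LSTM: input, forget, output gates, candidate cell -> 4 blocks.
gru_params = recurrent_params(INPUT_DIM, HIDDEN_DIM, n_blocks=3)
lstm_params = recurrent_params(INPUT_DIM, HIDDEN_DIM, n_blocks=4)
ratio = gru_params / lstm_params

print(f"GRU parameters:  {gru_params}")
print(f"LSTM parameters: {lstm_params}")
print(f"GRU / LSTM ratio: {ratio}")
```

Under these assumptions the ratio is 3/4 regardless of the chosen sizes, because both layers use identically shaped blocks. Library implementations may differ slightly (for example, some use two bias vectors per block), and a parameter ratio is a theoretical measure, not a measured runtime.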
Counterintuitively, standard binary cross-entropy significantly outperforms adaptive loss functions for minority class recall (p=0.002, d=-0.787), suggesting that the focal loss hyperparameters and threshold calibration need refinement. The adaptive loss selection mechanism represents a methodological contribution for addressing heterogeneous class imbalance across one-vs-all binary classifiers, though its effectiveness requires emotion-specific calibration. This work acknowledges fundamental limitations (a critically small sample size, validation on a single controlled dataset, and theoretical rather than empirical efficiency characterization) while providing preliminary evidence-based guidelines for architecture selection in computationally constrained facial expression recognition applications.
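To make the comparison between binary cross-entropy and focal loss concrete, the sketch below implements both for a single prediction, together with a hypothetical per-classifier selection rule. The abstract does not describe the actual selection mechanism, so `select_loss`, its `threshold` of 0.2, and the defaults `gamma=2.0` and `alpha=0.25` are assumptions chosen for illustration only.

```python
import math

EPS = 1e-7  # clip probabilities away from 0 and 1 to keep log() finite


def bce_loss(p: float, y: int) -> float:
    """Standard binary cross-entropy for predicted probability p and label y."""
    p = min(max(p, EPS), 1.0 - EPS)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def focal_loss(p: float, y: int, gamma: float = 2.0, alpha: float = 0.25) -> float:
    """Binary focal loss: -alpha_t * (1 - p_t)^gamma * log(p_t).

    gamma down-weights well-classified examples; alpha weights the
    positive class (1 - alpha is applied to the negative class).
    """
    p = min(max(p, EPS), 1.0 - EPS)
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def select_loss(positive_fraction: float, threshold: float = 0.2) -> str:
    """Hypothetical rule: use focal loss only for strongly imbalanced classifiers."""
    return "focal" if positive_fraction < threshold else "bce"


# One-vs-all setup: each emotion gets its own binary classifier, so the
# positive-class fraction (and hence the chosen loss) can differ per emotion.
for fraction in (0.05, 0.30):
    print(fraction, select_loss(fraction))

# Focal loss shrinks the penalty on an easy positive far more than on a hard one.
print(bce_loss(0.9, 1), focal_loss(0.9, 1))
print(bce_loss(0.1, 1), focal_loss(0.1, 1))
```

With `gamma=0` and `alpha=0.5`, the focal loss reduces to half the binary cross-entropy, which makes the two losses easy to compare when tuning the hyperparameters the abstract flags as needing calibration.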

Item Type: Article
Uncontrolled Keywords: Adaptive loss selection, computational efficiency, facial expression recognition, GRU, LSTM, MediaPipe, one-vs-all classification
Subjects: Q Science > QA Mathematics > QA71-90 Instruments and machines > QA75.5-76.95 Electronic computers. Computer science
Divisions: Faculty of Information Science and Technology (FIST)
Depositing User: Nor Afiqah Mohd Adnan
Date Deposited: 10 Dec 2025 01:54
Last Modified: 13 Dec 2025 03:01
URI: http://shdl.mmu.edu.my/id/eprint/15004
