Systematic Review of Quantization-Optimized Lightweight Transformer Architectures for Real-Time Fruit Ripeness Detection on Edge Devices

Citation

Maulana, Donny and Ramasamy, R Kanesaraj (2026) Systematic Review of Quantization-Optimized Lightweight Transformer Architectures for Real-Time Fruit Ripeness Detection on Edge Devices. Computers, 15 (1). p. 69. ISSN 2073-431X

Text: computers-15-00069.pdf - Published Version (399kB)
Restricted to Repository staff only

Abstract

Real-time visual inference on resource-constrained hardware remains a core challenge for edge computing and embedded artificial intelligence systems. Recent deep learning architectures, particularly Vision Transformers (ViTs) and Detection Transformers (DETRs), achieve high detection accuracy but impose substantial computational and memory demands that limit their deployment on low-power edge platforms such as NVIDIA Jetson and Raspberry Pi devices. This paper presents a systematic review of model compression and optimization strategies (specifically quantization, pruning, and knowledge distillation) applied to lightweight object detection architectures for edge deployment. Following PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines, peer-reviewed studies were analyzed from Scopus, IEEE Xplore, and ScienceDirect to examine the evolution of efficient detectors from convolutional neural networks to transformer-based models. The synthesis highlights a growing focus on real-time transformer variants, including Real-Time DETR (RT-DETR) and low-bit quantized approaches such as Q-DETR, alongside optimized YOLO-based architectures. While quantization enables substantial theoretical acceleration (e.g., up to 16× operation reduction), aggressive low-bit precision introduces accuracy degradation, particularly in transformer attention mechanisms, highlighting a critical efficiency–accuracy tradeoff. The review further shows that Quantization-Aware Training (QAT) consistently outperforms Post-Training Quantization (PTQ) in preserving performance under low-precision constraints. Finally, this review identifies critical open research challenges, emphasizing the efficiency–accuracy tradeoff and the high computational demands imposed by Transformer architectures. Future directions are proposed, including hardware-aware optimization, robustness to imbalanced datasets, and multimodal sensing integration, to ensure reliable real-time inference in practical agricultural edge computing environments.

Item Type: Article
Uncontrolled Keywords: agriculture; computer vision; deep learning; edge computing; model compression; object detection; quantization; real-time systems; systematic review; transformers; Vision Transformers (ViTs)
Subjects: Q Science > QA Mathematics > QA71-90 Instruments and machines
Divisions: Faculty of Computing and Informatics (FCI)
Depositing User: Ms Suzilawati Abu Samah
Date Deposited: 11 Feb 2026 02:15
Last Modified: 11 Feb 2026 02:15
URI: http://shdl.mmu.edu.my/id/eprint/15347
