Citation
Agarwal, Pankaj and Yogarayan, Sumendra and Sayeed, Md. Shohel and Tipu, Rupesh Kumar (2026) Multi-View Transformers for Structure-Aware HA–NA Drift Risk Scoring and Mutation Hotspot Mapping. Viruses, 18 (4). p. 421. ISSN 1999-4915
Text
viruses-18-00421.pdf - Published Version (8MB). Restricted to repository staff only.
Abstract
Seasonal influenza A evolves quickly through mutations in haemagglutinin (HA) and neuraminidase (NA), which can reduce vaccine match and lower protection. Many sequence-only models do not link codon-level mutations to three-dimensional (3D) protein context and long-term evolutionary signals within one scoring framework. This study presents TRIAD-Influenza (TRIAD: Token–Residue–Integrated Architecture for Drift), a multi-view transformer that combines (i) codon- and residue-level sequence representations, (ii) structure-derived residue interaction features from predicted HA/NA models, and (iii) an embedding-space phylogeny that captures cluster and drift context. The pipeline curates more than 3 × 10⁵ paired HA/NA coding sequences from the NCBI Virus resource (2010–2024) using strict quality control and codon-aware alignment, and predicts 3D structures for nearly all unique HA and NA proteins to build contact graphs and surface/stability descriptors. TRIAD-Influenza outputs a continuous, structure-aware risk score for each HA/NA pair and produces interpretable mutation hotspot maps using gradient saliency and a contact-weighted mutation risk index (CMRI). Under rolling-origin temporal cross-validation and on a temporally held-out internal test window with strong class imbalance (∼3.4% high-risk), the model shows strong ranking performance (AUROC ≈ 0.89; AUPRC ≈ 0.44; Brier score = 0.069) while operating at surveillance speed (median latency ≈ 1.6 ms per HA/NA pair). External validation on independent GISAID/Nextstrain cohorts (2023–2024; 5000 isolates) preserves discrimination (AUROC ≈ 0.85–0.86). Predicted risk scores correlate with experimental haemagglutination inhibition (HI) antigenic distances (Spearman ρ up to ≈ 0.82 at the virus-aggregated level), and CMRI hotspots are enriched for known epitope and deep mutational scanning escape residues (odds ratios ≈ 2.7–3.6).
Overall, token–residue–phylogeny coupling enables rapid, structure-aware prioritisation of emerging influenza A HA/NA sequences and delivers compact hotspot maps for expert review and targeted experiments.
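For readers unfamiliar with the evaluation terms used in the abstract, the following is a minimal sketch of rolling-origin temporal splitting and the Brier score. It is not the authors' code: the function names `rolling_origin_splits` and `brier_score`, the `min_train_years` parameter, and the use of collection year as the time unit are all illustrative assumptions.

```python
from typing import Iterator, List, Sequence, Tuple


def rolling_origin_splits(
    years: Sequence[int], min_train_years: int = 3
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, test_idx) pairs; each test year follows all training years.

    Assumption: one integer year per sample; the paper may use finer windows.
    """
    unique_years = sorted(set(years))
    # The forecast origin moves forward one year per fold.
    for k in range(min_train_years, len(unique_years)):
        test_year = unique_years[k]
        train_idx = [i for i, y in enumerate(years) if y < test_year]
        test_idx = [i for i, y in enumerate(years) if y == test_year]
        yield train_idx, test_idx


def brier_score(probs: Sequence[float], labels: Sequence[int]) -> float:
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    if len(probs) != len(labels) or len(probs) == 0:
        raise ValueError("probs and labels must be non-empty and of equal length")
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)


if __name__ == "__main__":
    # Hypothetical collection years for six samples.
    sample_years = [2010, 2010, 2011, 2012, 2013, 2013]
    for train_idx, test_idx in rolling_origin_splits(sample_years):
        print("train:", train_idx, "test:", test_idx)
    print("Brier:", brier_score([0.5, 0.5], [0, 1]))
```

Because every fold trains only on years strictly earlier than the test year, no future sequences leak into training, which is the property that distinguishes this scheme from random k-fold splitting. A lower Brier score indicates better-calibrated probabilities, with 0 being perfect.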
| Item Type: | Article |
|---|---|
| Uncontrolled Keywords: | influenza A, haemagglutinin |
| Subjects: | R Medicine > RA Public aspects of medicine > RA421-790.95 Public health. Hygiene. Preventive medicine |
| Divisions: | Faculty of Information Science and Technology (FIST) |
| Depositing User: | Ms Rosnani Abd Wahab |
| Date Deposited: | 08 Jun 2026 00:36 |
| Last Modified: | 08 Jun 2026 00:36 |
| URI: | http://shdl.mmu.edu.my/id/eprint/16090 |
