Multimedia Systems

Papers
(The H4-Index of Multimedia Systems is 26. The table below lists the papers whose CrossRef citation counts are at or above that threshold [max. 250 papers]. It covers papers published in the past four years, i.e., from 2022-05-01 to 2026-05-01.)
Article | Citations
Pseudo-global strategy-based visual comfort assessment considering attention mechanism | 161
SS-CMT: a label independent cross-modal transferable adversarial video attack with sparse strategy | 122
DiffRA: universal restorative adversarial attack based on diffusion model | 93
SFFN-YOLO for small object detection in aerial images | 81
On-line monitoring of structural performance of scraper conveyor driven by digital twin | 74
Dual convolutional neural network with attention for image blind denoising | 60
TreeSegNet: multi-scale query-based instance segmentation with frequency-aware and gated feature enhancement | 59
Fast latent-feature augmentation for cross-domain face forgery detection | 54
CAPNet: tomato leaf disease detection network based on adaptive feature fusion and convolutional enhancement | 52
Face and voice cross-modal association with learning convex feature embedding | 52
BENet: bi-directional enhanced network for image captioning | 51
Deep Learning-based forgery detection and localization for compressed images using a hybrid optimization model | 46
Multi-level sentiment-aware clustering for denoising in multimodal sentiment analysis with ASR errors | 45
A variational causal inference-based method for recognizing object state changes in videos | 42
Feature fusion and optimization integrated refined deep residual network for diabetic retinopathy severity classification using fundus image | 40
Design and realization of pulse-controlled multi-memristor Hopfield neural networks and their applications in information encryption | 39
LMFE-RDD: a road damage detector with a lightweight multi-feature extraction network | 38
Improving text-image cross-modal retrieval with contrastive loss | 34
Segmentation-aware image super-resolution with generative adversarial networks | 32
JAMD-Net: image splicing forgery detection based on JPEG compression artifacts and multi-dilated channel refinement fusion | 31
Generalizing sentence-level lipreading to unseen speakers: a two-stream end-to-end approach | 30
Real emotion seeker: recalibrating annotation for facial expression recognition | 29
CHCoT-MSLU: a coupled hierarchical chain-of-thought prompt learning model for multi-intent spoken language understanding | 29
The segmented UEC Food-100 dataset with benchmark experiment on food detection | 29
A visual question answering model based on image captioning | 28
Correction: STASiamRPN: visual tracking based on spatiotemporal and attention | 27
Automatic lymph node segmentation using deep parallel squeeze & excitation and attention Unet | 26