OOIR: Observatory of International Research

Papers

(The H4-Index of Multimedia Systems is 27. The table below lists the papers at or above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2022-06-01 to 2026-06-01.)

Article	Citations
Pseudo-global strategy-based visual comfort assessment considering attention mechanism	171
SS-CMT: a label independent cross-modal transferable adversarial video attack with sparse strategy	123
Face and voice cross-modal association with learning convex feature embedding	93
DiffRA: universal restorative adversarial attack based on diffusion model	84
SFFN-YOLO for small object detection in aerial images	74
Improving text-image cross-modal retrieval with contrastive loss	64
TreeSegNet: multi-scale query-based instance segmentation with frequency-aware and gated feature enhancement	63
GVA: guided visual attention approach for automatic image caption generation	55
Dual-branch spectral–spatial feature extraction network for multispectral image compression	54
A research for sound event localization and detection based on local–global adaptive fusion and temporal importance network	52
FedMAB: adaptive multimodal federated learning with multi-armed bandits	51
A visual question answering model based on image captioning	47
CAPNet: tomato leaf disease detection network based on adaptive feature fusion and convolutional enhancement	45
Automatic lymph node segmentation using deep parallel squeeze & excitation and attention Unet	42
User authentication method based on keystroke dynamics and mouse dynamics using HDA	41
Unsupervised deep metric learning algorithm for crop disease images based on knowledge distillation networks	40
The segmented UEC Food-100 dataset with benchmark experiment on food detection	38
Multi-view Isolated sign language recognition based on cross-view and multi-level transformer	35
Model-based portrait video compression with spatial constraint and adaptive pose processing	32
Segmentation-aware image super-resolution with generative adversarial networks	31
JAMD-Net: image splicing forgery detection based on JPEG compression artifacts and multi-dilated channel refinement fusion	31
CHCoT-MSLU: a coupled hierarchical chain-of-thought prompt learning model for multi-intent spoken language understanding	30
Real emotion seeker: recalibrating annotation for facial expression recognition	30
Generalizing sentence-level lipreading to unseen speakers: a two-stream end-to-end approach	30
Towards domain adaptation underwater image enhancement and restoration	29
SEMNet: a simple and efficient MLP-based network for 3D Face point clouds landmarks localization	27
SFRA: spatial fusion regression augmentation network for facial landmark detection	27
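
The threshold quoted above can be checked against the listed citation counts. The sketch below assumes the H4-Index is the standard h-index (the largest h such that h papers have at least h citations each) restricted to papers from the four-year window; the function name `h_index` and the variable `counts` are illustrative and not part of OOIR.

```python
def h_index(citations):
    """Return the largest h such that at least h papers have >= h citations."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Citation counts copied from the table above, in listed order.
counts = [
    171, 123, 93, 84, 74, 64, 63, 55, 54, 52, 51, 47, 45, 42,
    41, 40, 38, 35, 32, 31, 31, 30, 30, 30, 29, 27, 27,
]

threshold = h_index(counts)
# Papers that meet the threshold (here, every listed row).
listed = [c for c in counts if c >= threshold]
print(threshold, len(listed))
```

With these counts the 27th-ranked paper has 27 citations, so the computed value matches the index stated for the journal.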