International Journal of Multimedia Information Retrieval

Papers
(The TQCC of International Journal of Multimedia Information Retrieval is 8. The table below lists the papers at or above that threshold based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2021-11-01 to 2025-11-01.)
Article | Citations
Recent trends in recommender systems: a survey | 115
Generative adversarial networks and its applications in the biomedical image segmentation: a comprehensive survey | 63
VERITE: a Robust benchmark for multimodal misinformation detection accounting for unimodal bias | 58
Video anomaly detection with memory-guided multilevel embedding | 48
Strengthening attention: knowledge distillation via cross-layer feature fusion for image classification | 45
Enhancing Facial Beauty Prediction via a Dual-Pathway Hybrid Architecture Integrating Vmamba and ViT | 40
DELIGHT-Net: DEep and LIGHTweight network to segment Indian text at word level from wild scenic images | 35
VPC-VoxelNet: multi-modal fusion 3D object detection networks based on virtual point clouds | 32
Enhanced YOLOv10 for small object detection with context-aware and adaptive modules | 29
A local representation-enhanced recurrent convolutional network for image captioning | 26
Prototype local–global alignment network for image–text retrieval | 24
Multi-objective reinforcement learning for recommender systems: a comprehensive survey of methods, challenges, and future directions | 20
Feature-NeuS: Neural Implicit Surface Reconstruction Using Feature Multi-View Consistency Constraint | 20
Hierarchical multi-modal fusion with vision transformers for robust action recognition in infrared-visible videos | 19
MMDL: a multi-modal deep learning for video highlight detection in sports | 18
Similarity-based face image retrieval using sparsely embedded deep features and binary code learning | 17
DC-GNN: drop channel graph neural network for object classification and part segmentation in the point cloud | 17
CAMIR: fine-tuning CLIP and multi-head cross-attention mechanism for multimodal image retrieval with sketch and text features | 16
Human behavior recognition based on DualBiNet model | 16
Visual and semantic ensemble for scene text recognition with gated dual mutual attention | 16
FiCo-ITR: bridging fine-grained and coarse-grained image-text retrieval for comparative performance analysis | 15
How can users’ comments posted on social media videos be a source of effective tags? | 14
Semantic-enhanced discriminative embedding learning for cross-modal retrieval | 13
Multimodal music datasets? Challenges and future goals in music processing | 13
Generative adversarial networks for 2D-based CNN pose-invariant face recognition | 13
Few2Decide: towards a robust model via using few neuron connections to decide | 13
A Comprehensive Review of Multimodal Visual Representation Learning: Tracing the Evolution from CNNs to Transformers and Beyond | 12
DAF-Net: dense attention feature pyramid network for multiscale object detection | 12
An emotion-driven, transformer-based network for multimodal fake news detection | 11
MFAFD: a few-shot learning method for cascading models with parameter free attention and finite discrete space | 11
Cross-domain image retrieval: methods and applications | 11
State of art and emerging trends on group recommender system: a comprehensive review | 11
Multi-view learning for camouflaged object detection with PVTv2 | 10
Ultra fast-inference depth completion with linear attention-based cascaded hourglass network | 10
Organ segmentation from computed tomography images using the 3D convolutional neural network: a systematic review | 9
Human action recognition using an optical flow-gated recurrent neural network | 9
Weighted semantic feature based self-supervised deep cross-modal hashing | 9
Image enhancement with bi-directional normalization and color attention-guided generative adversarial networks | 8
Study of Alzheimer’s disease brain impairment and methods for its early diagnosis: a comprehensive survey | 8
Multi-sensor human activity recognition using CNN and GRU | 8
InceptionDepth-wiseYOLOv2: improved implementation of YOLO framework for pedestrian detection | 8
Optical music recognition for homophonic scores with neural networks and synthetic music generation | 8
Concept-based and embedding-based models in lifelog retrieval: an empirical comparison of performance | 8
FOF: a fine-grained object detection and feature extraction end-to-end network | 8