IEEE Transactions on Image Processing

Papers
(The H4-Index of IEEE Transactions on Image Processing is 86. The table below lists the papers that meet or exceed that threshold based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2021-11-01 to 2025-11-01.)
Article | Citations
Consensus Sparsity: Multi-Context Sparse Image Representation via L∞-Induced Matrix Variate | 684
SemiRS-COC: Semi-Supervised Classification for Complex Remote Sensing Scenes With Cross-Object Consistency | 654
HAda: Hyper-Adaptive Parameter-Efficient Learning for Multi-View ConvNets | 592
Pro2Diff: Proposal Propagation for Multi-Object Tracking via the Diffusion Model | 581
Multiframe Joint Enhancement for Early Interlaced Videos | 482
Fine-Grained Recognition With Learnable Semantic Data Augmentation | 434
Cross-Modality Pyramid Alignment for Visual Intention Understanding | 402
OccNeRF: Advancing 3D Occupancy Prediction in LiDAR-Free Environments | 397
MaCon: A Generic Self-Supervised Framework for Unsupervised Multimodal Change Detection | 363
An Adaptive Multi-Granularity Graph Representation of Image via Granular-ball Computing | 361
Uncertainty-Guided Refinement for Fine-Grained Salient Object Detection | 286
Discrete Metric Learning for Fast Image Set Classification | 282
Bi-Nuclear Tensor Schatten-p Norm Minimization for Multi-View Subspace Clustering | 253
GMLight: Lighting Estimation via Geometric Distribution Approximation | 234
Graph Convolutional Dictionary Selection With L2,p Norm for Video Summarization | 229
Cross-Modal Retrieval With Noisy Correspondence via Consistency Refining and Mining | 211
Density-Guided Incremental Dominant Instance Exploration for Two-View Geometric Model Fitting | 204
TTVFI: Learning Trajectory-Aware Transformer for Video Frame Interpolation | 203
Multi-Granularity Contrastive Cross-Modal Collaborative Generation for End-to-End Long-Term Video Question Answering | 199
Contrast-Reconstruction Representation Learning for Self-Supervised Skeleton-Based Action Recognition | 197
A Fast and Efficient Shape Blending by Stable and Analytically Invertible Finite Descriptors | 192
Multimodal Unrolled Robust PCA for Background Foreground Separation | 192
Variational Structured Attention Networks for Deep Visual Representation Learning | 189
Equivariant Local Reference Frames With Optimization for Robust Non-Rigid Point Cloud Correspondence | 178
Automatic Quaternion-Domain Color Image Stitching | 178
A Low-Rank Tensor Decomposition Model With Factors Prior and Total Variation for Impulsive Noise Removal | 178
FF-LPD: A Real-Time Frame-by-Frame License Plate Detector With Knowledge Distillation and Feature Propagation | 175
STPNet: Scale-Aware Text Prompt Network for Medical Image Segmentation | 173
Self-Supervised Matting-Specific Portrait Enhancement and Generation | 168
Color Spike Camera Reconstruction via Long Short-Term Temporal Aggregation of Spike Signals | 163
AdaAugment: A Tuning-Free and Adaptive Approach to Enhance Data Augmentation | 162
Spatial Frequency Modulation Network for Efficient Image Dehazing | 160
Canonical Correlation Analysis With Low-Rank Learning for Image Representation | 158
Learning Spectral Cues for Multispectral and Panchromatic Image Fusion | 145
An Explanation Method Based on Interpretable Linear Model With Four Key Characteristics | 143
Real Image Denoising With a Locally-Adaptive Bitonic Filter | 140
One-Class Classification Using ℓp-Norm Multiple Kernel Fisher Null Approach | 140
Dual Alternating Direction Method of Multipliers for Inverse Imaging | 139
Pose-Appearance Relational Modeling for Video Action Recognition | 132
Harnessing Multi-modal Large Language Models for Measuring and Interpreting Color Differences | 132
Attentive WaveBlock: Complementarity-Enhanced Mutual Networks for Unsupervised Domain Adaptation in Person Re-Identification and Beyond | 130
Vision-Based UAV Self-Positioning in Low-Altitude Urban Environments | 130
Toward Efficient Test Time Adaptation With Hierarchical Distribution Alignment | 129
Multi-Constraint Adversarial Networks for Unsupervised Image-to-Image Translation | 129
Cross-Domain Few-Shot Medical Image Segmentation via Dynamic Semantic Matching | 129
Toward Projected Clustering With Aggregated Mapping | 127
Graph Embedding Contrastive Multi-Modal Representation Learning for Clustering | 126
Few-Shot Learning With Class-Covariance Metric for Hyperspectral Image Classification | 126
Differentiable SAR Renderer and Image-Based Target Reconstruction | 124
Variational Bayes Image Restoration With Compressive Autoencoders | 121
Attention-Guided Neural Networks for Full-Reference and No-Reference Audio-Visual Quality Assessment | 121
Advances in Predictive RAHT for Geometric Point Cloud Compression | 120
Non-Cascaded and Crosstalk-Free Multi-Image Encryption Based on Optical Scanning Holography Using 2D Orthogonal Compressive Sensing | 120
NeuralDiffuser: Neuroscience-Inspired Diffusion Guidance for fMRI Visual Reconstruction | 120
Interactive Face Video Coding: A Generative Compression Framework | 119
Fast 3D Room Layout Estimation Based on Compact High-Level Representation | 117
Cross-Domain Diffusion With Progressive Alignment for Efficient Adaptive Retrieval | 117
Generalization Beyond Feature Alignment: Concept Activation-Guided Contrastive Learning | 116
Grammar-Induced Wavelet Network for Human Parsing | 114
Cross-Layer Contrastive Learning of Latent Semantics for Facial Expression Recognition | 110
Motion and Appearance Decoupling Representation for Event Cameras | 110
Hyperspectral Meets Optical Flow: Spectral Flow Extraction for Hyperspectral Image Classification | 106
Unsupervised Modality-Transferable Video Highlight Detection With Representation Activation Sequence Learning | 106
IMU-Assisted Online Video Background Identification | 105
Efficient Semi-Supervised Multimodal Hashing With Importance Differentiation Regression | 105
Optimization-Inspired Learning With Architecture Augmentations and Control Mechanisms for Low-Level Vision | 104
Transition Is a Process: Pair-to-Video Change Detection Networks for Very High Resolution Remote Sensing Images | 104
Inverse Image Frequency for Long-Tailed Image Recognition | 103
Boundary-Aware Prototype in Semi-Supervised Medical Image Segmentation | 102
Distractor-Aware Event-Based Tracking | 99
SRS: Siamese Reconstruction-Segmentation Network Based on Dynamic-Parameter Convolution | 99
Learning Dynamic Prompts for All-in-One Image Restoration | 97
Multi-Source Unsupervised Domain Adaptation via Pseudo Target Domain | 96
Precise Facial Landmark Detection by Reference Heatmap Transformer | 95
KSS-ICP: Point Cloud Registration Based on Kendall Shape Space | 93
Stacked Deconvolutional Network for Semantic Segmentation | 92
SharpFormer: Learning Local Feature Preserving Global Representations for Image Deblurring | 92
Video Moment Retrieval With Cross-Modal Neural Architecture Search | 90
SegHSI: Semantic Segmentation of Hyperspectral Images With Limited Labeled Pixels | 90
Addressing Challenges of Incorporating Appearance Cues Into Heuristic Multi-Object Tracker via a Novel Feature Paradigm | 89
Decoupled Cross-Modal Phrase-Attention Network for Image-Sentence Matching | 89
Toward Video Anomaly Retrieval From Video Anomaly Detection: New Benchmarks and Model | 89
Multi-Condition Latent Diffusion Network for Scene-Aware Neural Human Motion Prediction | 88
Unsupervised Foggy Scene Understanding via Self Spatial-Temporal Label Diffusion | 88
Cyclic Self-Training With Proposal Weight Modulation for Cross-Supervised Object Detection | 88
Rethinking Sampling Strategies for Unsupervised Person Re-Identification | 86
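
The threshold of 86 can be checked against the citation counts listed above. The sketch below assumes the H4-Index is an ordinary h-index restricted to papers from the four-year window (the page does not spell out the definition): the largest h such that h papers each have at least h citations. The names `h_index` and `TABLE_COUNTS` are illustrative only and do not come from the site.

```python
def h_index(citations):
    """Largest h such that at least h papers have >= h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank   # the top `rank` papers all have >= rank citations
        else:
            break      # counts only decrease from here, so stop
    return h


# Citation counts transcribed from the table above, in listed order.
TABLE_COUNTS = [
    684, 654, 592, 581, 482, 434, 402, 397, 363, 361,
    286, 282, 253, 234, 229, 211, 204, 203, 199, 197,
    192, 192, 189, 178, 178, 178, 175, 173, 168, 163,
    162, 160, 158, 145, 143, 140, 140, 139, 132, 132,
    130, 130, 129, 129, 129, 127, 126, 126, 124, 121,
    121, 120, 120, 120, 119, 117, 117, 116, 114, 110,
    110, 106, 106, 105, 105, 104, 104, 103, 102, 99,
    99, 97, 96, 95, 93, 92, 92, 90, 90, 89,
    89, 89, 88, 88, 88, 86,
]

print(len(TABLE_COUNTS))      # 86 papers listed
print(h_index(TABLE_COUNTS))  # 86
```

Under that assumption the listed papers alone reproduce the stated value: there are 86 of them and the lowest count is 86. Papers outside the table cannot raise the result as long as each has at most 86 citations.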