IEEE Transactions on Image Processing

Papers
(The median citation count of IEEE Transactions on Image Processing is 8. The table below lists the papers above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2021-11-01 to 2025-11-01.)
Article | Citations
Consensus Sparsity: Multi-Context Sparse Image Representation via L∞-Induced Matrix Variate | 684
SemiRS-COC: Semi-Supervised Classification for Complex Remote Sensing Scenes With Cross-Object Consistency | 654
HAda: Hyper-Adaptive Parameter-Efficient Learning for Multi-View ConvNets | 592
Pro2Diff: Proposal Propagation for Multi-Object Tracking via the Diffusion Model | 581
Multiframe Joint Enhancement for Early Interlaced Videos | 482
Fine-Grained Recognition With Learnable Semantic Data Augmentation | 434
Cross-Modality Pyramid Alignment for Visual Intention Understanding | 402
OccNeRF: Advancing 3D Occupancy Prediction in LiDAR-Free Environments | 397
MaCon: A Generic Self-Supervised Framework for Unsupervised Multimodal Change Detection | 363
An Adaptive Multi-Granularity Graph Representation of Image via Granular-ball Computing | 361
Uncertainty-Guided Refinement for Fine-Grained Salient Object Detection | 286
Discrete Metric Learning for Fast Image Set Classification | 282
Bi-Nuclear Tensor Schatten-p Norm Minimization for Multi-View Subspace Clustering | 253
GMLight: Lighting Estimation via Geometric Distribution Approximation | 234
Graph Convolutional Dictionary Selection With L2,p Norm for Video Summarization | 229
Cross-Modal Retrieval With Noisy Correspondence via Consistency Refining and Mining | 211
Density-Guided Incremental Dominant Instance Exploration for Two-View Geometric Model Fitting | 204
TTVFI: Learning Trajectory-Aware Transformer for Video Frame Interpolation | 203
Multi-Granularity Contrastive Cross-Modal Collaborative Generation for End-to-End Long-Term Video Question Answering | 199
Contrast-Reconstruction Representation Learning for Self-Supervised Skeleton-Based Action Recognition | 197
A Fast and Efficient Shape Blending by Stable and Analytically Invertible Finite Descriptors | 192
Multimodal Unrolled Robust PCA for Background Foreground Separation | 192
Variational Structured Attention Networks for Deep Visual Representation Learning | 189
Automatic Quaternion-Domain Color Image Stitching | 178
A Low-Rank Tensor Decomposition Model With Factors Prior and Total Variation for Impulsive Noise Removal | 178
Equivariant Local Reference Frames With Optimization for Robust Non-Rigid Point Cloud Correspondence | 178
FF-LPD: A Real-Time Frame-by-Frame License Plate Detector With Knowledge Distillation and Feature Propagation | 175
STPNet: Scale-Aware Text Prompt Network for Medical Image Segmentation | 173
Self-Supervised Matting-Specific Portrait Enhancement and Generation | 168
Color Spike Camera Reconstruction via Long Short-Term Temporal Aggregation of Spike Signals | 163
AdaAugment: A Tuning-Free and Adaptive Approach to Enhance Data Augmentation | 162
Spatial Frequency Modulation Network for Efficient Image Dehazing | 160
Canonical Correlation Analysis With Low-Rank Learning for Image Representation | 158
Learning Spectral Cues for Multispectral and Panchromatic Image Fusion | 145
An Explanation Method Based on Interpretable Linear Model With Four Key Characteristics | 143
One-Class Classification Using ℓp-Norm Multiple Kernel Fisher Null Approach | 140
Real Image Denoising With a Locally-Adaptive Bitonic Filter | 140
Dual Alternating Direction Method of Multipliers for Inverse Imaging | 139
Harnessing Multi-modal Large Language Models for Measuring and Interpreting Color Differences | 132
Pose-Appearance Relational Modeling for Video Action Recognition | 132
Attentive WaveBlock: Complementarity-Enhanced Mutual Networks for Unsupervised Domain Adaptation in Person Re-Identification and Beyond | 130
Vision-Based UAV Self-Positioning in Low-Altitude Urban Environments | 130
Multi-Constraint Adversarial Networks for Unsupervised Image-to-Image Translation | 129
Cross-Domain Few-Shot Medical Image Segmentation via Dynamic Semantic Matching | 129
Toward Efficient Test Time Adaptation With Hierarchical Distribution Alignment | 129
Toward Projected Clustering With Aggregated Mapping | 127
Graph Embedding Contrastive Multi-Modal Representation Learning for Clustering | 126
Few-Shot Learning With Class-Covariance Metric for Hyperspectral Image Classification | 126
Differentiable SAR Renderer and Image-Based Target Reconstruction | 124
Variational Bayes Image Restoration With Compressive Autoencoders | 121
Attention-Guided Neural Networks for Full-Reference and No-Reference Audio-Visual Quality Assessment | 121
Non-Cascaded and Crosstalk-Free Multi-Image Encryption Based on Optical Scanning Holography Using 2D Orthogonal Compressive Sensing | 120
NeuralDiffuser: Neuroscience-Inspired Diffusion Guidance for fMRI Visual Reconstruction | 120
Advances in Predictive RAHT for Geometric Point Cloud Compression | 120
Interactive Face Video Coding: A Generative Compression Framework | 119
Fast 3D Room Layout Estimation Based on Compact High-Level Representation | 117
Cross-Domain Diffusion With Progressive Alignment for Efficient Adaptive Retrieval | 117
Generalization Beyond Feature Alignment: Concept Activation-Guided Contrastive Learning | 116
Grammar-Induced Wavelet Network for Human Parsing | 114
Cross-Layer Contrastive Learning of Latent Semantics for Facial Expression Recognition | 110
Motion and Appearance Decoupling Representation for Event Cameras | 110
Unsupervised Modality-Transferable Video Highlight Detection With Representation Activation Sequence Learning | 106
Hyperspectral Meets Optical Flow: Spectral Flow Extraction for Hyperspectral Image Classification | 106
IMU-Assisted Online Video Background Identification | 105
Efficient Semi-Supervised Multimodal Hashing With Importance Differentiation Regression | 105
Transition Is a Process: Pair-to-Video Change Detection Networks for Very High Resolution Remote Sensing Images | 104
Optimization-Inspired Learning With Architecture Augmentations and Control Mechanisms for Low-Level Vision | 104
Inverse Image Frequency for Long-Tailed Image Recognition | 103
Boundary-Aware Prototype in Semi-Supervised Medical Image Segmentation | 102
Distractor-Aware Event-Based Tracking | 99
SRS: Siamese Reconstruction-Segmentation Network Based on Dynamic-Parameter Convolution | 99
Learning Dynamic Prompts for All-in-One Image Restoration | 97
Multi-Source Unsupervised Domain Adaptation via Pseudo Target Domain | 96
Precise Facial Landmark Detection by Reference Heatmap Transformer | 95
KSS-ICP: Point Cloud Registration Based on Kendall Shape Space | 93
Stacked Deconvolutional Network for Semantic Segmentation | 92
SharpFormer: Learning Local Feature Preserving Global Representations for Image Deblurring | 92
SegHSI: Semantic Segmentation of Hyperspectral Images With Limited Labeled Pixels | 90
Video Moment Retrieval With Cross-Modal Neural Architecture Search | 90
Decoupled Cross-Modal Phrase-Attention Network for Image-Sentence Matching | 89
Toward Video Anomaly Retrieval From Video Anomaly Detection: New Benchmarks and Model | 89
Addressing Challenges of Incorporating Appearance Cues Into Heuristic Multi-Object Tracker via a Novel Feature Paradigm | 89
Unsupervised Foggy Scene Understanding via Self Spatial-Temporal Label Diffusion | 88
Cyclic Self-Training With Proposal Weight Modulation for Cross-Supervised Object Detection | 88
Multi-Condition Latent Diffusion Network for Scene-Aware Neural Human Motion Prediction | 88
Rethinking Sampling Strategies for Unsupervised Person Re-Identification | 86
Unsupervised Person Re-Identification With Stochastic Training Strategy | 83
Bidirectional Mapping Coupled GAN for Generalized Zero-Shot Learning | 83
Coarse-to-Fine Contrastive Self-Supervised Feature Learning for Land-Cover Classification in SAR Images With Limited Labeled Data | 80
Cross-Attentional Spatio-Temporal Semantic Graph Networks for Video Question Answering | 80
Point-Based Learnable Query Generator for Human–Object Interaction Detection | 80
Commonality Feature Representation Learning for Unsupervised Multimodal Change Detection | 79
Weighted Feature Fusion of Convolutional Neural Network and Graph Attention Network for Hyperspectral Image Classification | 79
RSSFormer: Foreground Saliency Enhancement for Remote Sensing Land-Cover Segmentation | 78
NR-MVSNet: Learning Multi-View Stereo Based on Normal Consistency and Depth Refinement | 78
FsaNet: Frequency Self-Attention for Semantic Segmentation | 77
BinsFormer: Revisiting Adaptive Bins for Monocular Depth Estimation | 77
Weakly Supervised Semantic Segmentation via Alternate Self-Dual Teaching | 77
DUT: Learning Video Stabilization by Simply Watching Unstable Videos | 76
Joint Local and Nonlocal Progressive Prediction for Versatile Video Coding | 76
Multi-Exposure Image Fusion via Deformable Self-Attention | 76
Fuzzy Sparse Subspace Clustering for Infrared Image Segmentation | 76
Perceptually Weighted Rate Distortion Optimization for Video-Based Point Cloud Compression | 76
Shared Manifold Regularized Joint Feature Selection for Joint Classification and Regression in Alzheimer’s Disease Diagnosis | 75
Learned Spherical Image Compression With Spherical Convolution-Self-Attention and Transformer Context Model | 74
Multispectral Snapshot Image Registration Using Learned Cross Spectral Disparity Estimation and a Deep Guided Occlusion Reconstruction Network | 74
HQ2CL: A High-Quality Class Center Learning System for Deep Face Recognition | 74
RobustMat: Neural Diffusion for Street Landmark Patch Matching Under Challenging Environments | 73
Reduced Biquaternion Dual-Branch Deraining U-Network via Multi-Attention Mechanism | 73
Fast Learning Radiance Fields by Shooting Much Fewer Rays | 73
NesTD-Net: Deep NESTA-Inspired Unfolding Network With Dual-Path Deblocking Structure for Image Compressive Sensing | 73
A Discrete-Mapping-Based Cross-Component Prediction Paradigm for Screen Content Coding | 72
Rich Action-Semantic Consistent Knowledge for Early Action Prediction | 72
Graph-Based Depth Denoising & Dequantization for Point Cloud Enhancement | 71
High-Quality and Diverse Few-Shot Image Generation via Masked Discrimination | 71
Noise Prior Knowledge Informed Bayesian Inference Network for Hyperspectral Super-Resolution | 71
Fine-Grained Spatio-Temporal Parsing Network for Action Quality Assessment | 70
Implicit-Explicit Integrated Representations for Multi-View Video Compression | 69
UVaT: Uncertainty Incorporated View-Aware Transformer for Robust Multi-View Classification | 68
HOPE: Enhanced Position Image Priors via High-Order Implicit Representations | 68
Exploring the Potential of Pooling Techniques for Universal Image Restoration | 68
Fuzzy Sparse Deviation Regularized Robust Principal Component Analysis | 68
Characteristic Mapping for Ellipse Detection Acceleration | 68
Robust Ellipse Fitting Based on Maximum Correntropy Criterion With Variable Center | 66
Generalizing to Out-of-Sample Degradations via Model Reprogramming | 65
Double Oracle Neural Architecture Search for Game Theoretic Deep Learning Models | 65
MaskFaceGAN: High-Resolution Face Editing With Masked GAN Latent Code Optimization | 65
Spatially Consistent Transformer for Colorization in Monochrome-Color Dual-Lens System | 65
Causal Inference Hashing for Long-Tailed Image Retrieval | 64
Compact Representation and Reliable Classification Learning for Point-Level Weakly-Supervised Action Localization | 64
Image Reconstruction for Accelerated MR Scan With Faster Fourier Convolutional Neural Networks | 64
Mutually Reinforcing Learning of Decoupled Degradation and Diffusion Enhancement for Unpaired Low-Light Image Lightening | 64
One Sketch for All: One-Shot Personalized Sketch Segmentation | 63
Hierarchical Superpixel Segmentation by Parallel CRTrees Labeling | 63
AAP-MIT: Attentive Atrous Pyramid Network and Memory Incorporated Transformer for Multisentence Video Description | 63
Rethinking Object Saliency Ranking: A Novel Whole-Flow Processing Paradigm | 62
PolarPose: Single-Stage Multi-Person Pose Estimation in Polar Coordinates | 61
Joint Denoising-Demosaicking Network for Long-Wave Infrared Division-of-Focal-Plane Polarization Images With Mixed Noise Level Estimation | 61
Cross-Modal Causal Representation Learning for Radiology Report Generation | 61
RoMo: Robust Unsupervised Multimodal Learning With Noisy Pseudo Labels | 61
Semi-Supervised Domain Adaptive Structure Learning | 61
DMRA: Depth-Induced Multi-Scale Recurrent Attention Network for RGB-D Saliency Detection | 60
A Real-Time Memory Updating Strategy for Unsupervised Person Re-Identification | 60
Enhancing Few-Shot Out-of-Distribution Detection With Pre-Trained Model Features | 59
Energy-Based Domain Adaptation Without Intermediate Domain Dataset for Foggy Scene Segmentation | 59
Semantic Representation and Attention Alignment for Graph Information Bottleneck in Video Summarization | 59
Hierarchical Random Walker Segmentation for Large Volumetric Biomedical Images | 59
Partition Map Prediction for Fast Block Partitioning in VVC Intra-Frame Coding | 59
Multi-Scale Fusion and Decomposition Network for Single Image Deraining | 58
Continual Referring Expression Comprehension via Dual Modular Memorization | 58
Toward Robust and Unconstrained Full Range of Rotation Head Pose Estimation | 58
Dynamic Atomic Column Detection in Transmission Electron Microscopy Videos via Ridge Estimation | 57
Reviewer Summary for Transactions on Image Processing | 57
Data Augmentation Using Bitplane Information Recombination Model | 56
Arbitrary-Scale Texture Generation From Coarse-Grained Control | 56
Image-Level Adaptive Adversarial Ranking for Person Re-Identification | 56
MA-ST3D: Motion Associated Self-Training for Unsupervised Domain Adaptation on 3D Object Detection | 56
Degraded Reference Image Quality Assessment | 56
U-N2C: A Dual Memory-Guided Disentanglement Framework for Unsupervised System Matrix Denoising in Magnetic Particle Imaging | 55
A New Non-Linear Hyperbolic-Parabolic Coupled PDE Model for Image Despeckling | 55
PVPUFormer: Probabilistic Visual Prompt Unified Transformer for Interactive Image Segmentation | 55
Source-Guided Target Feature Reconstruction for Cross-Domain Classification and Detection | 55
FOVQA: Blind Foveated Video Quality Assessment | 55
UniEmoX: Cross-Modal Semantic-Guided Large-Scale Pretraining for Universal Scene Emotion Perception | 55
Bayesian Nonnegative Tensor Completion With Automatic Rank Determination | 55
Multi-Person Pose Tracking With Sparse Key-Point Flow Estimation and Hierarchical Graph Distance Minimization | 55
CKD: Contrastive Knowledge Distillation From a Sample-Wise Perspective | 55
SIR: Self-Supervised Image Rectification via Seeing the Same Scene From Multiple Different Lenses | 55
Hierarchical Hashing Learning for Image Set Classification | 54
Cross-Modal Contrastive Learning Network for Few-Shot Action Recognition | 54
Deep Ranking Exemplar-Based Dynamic Scene Deblurring | 54
MetaAge: Meta-Learning Personalized Age Estimators | 54
Interpretable Neural Networks for Video Separation: Deep Unfolding RPCA With Foreground Masking | 54
Enhancing Text-Based Person Retrieval by Combining Fused Representation and Reciprocal Learning With Adaptive Loss Refinement | 54
BPMTrack: Multi-Object Tracking With Detection Box Application Pattern Mining | 54
Rotational Convolution: Rethinking Convolution for Downside Fisheye Images | 54
Few-Shot Domain Adaptation via Mixup Optimal Transport | 53
View-Wise Versus Cluster-Wise Weight: Which Is Better for Multi-View Clustering? | 53
CartoonLossGAN: Learning Surface and Coloring of Images for Cartoonization | 53
Sensitivity Decouple Learning for Image Compression Artifacts Reduction | 52
PCE-GAN: A Generative Adversarial Network for Point Cloud Attribute Quality Enhancement Based on Optimal Transport | 52
Model-Induced Generalization Error Bound for Information-Theoretic Representation Learning in Source-Data-Free Unsupervised Domain Adaptation | 52
Image Compression Using Stochastic-AFD Based Multisignal Sparse Representation | 52
Restoration of Images Taken Through a Dirty Window Using Optics-Guided Transformer | 52
Neural Scene Designer: Self-Styled Semantic Image Manipulation | 52
Multistage Spatio-Temporal Networks for Robust Sketch Recognition | 51
Learning Domain Invariant Representations for Generalizable Person Re-Identification | 51
Multi-Label Auroral Image Classification Based on CNN and Transformer | 51
Sampling Agnostic Feature Representation for Long-Term Person Re-Identification | 51
SSL++: Improving Self-Supervised Learning by Mitigating the Proxy Task-Specificity Problem | 51
Learning Transferable Conceptual Prototypes for Interpretable Unsupervised Domain Adaptation | 51
PFONet: A Progressive Feedback Optimization Network for Lightweight Single Image Dehazing | 50
HyperE2VID: Improving Event-Based Video Reconstruction via Hypernetworks | 50
Advancing Video Anomaly Detection: A Bi-Directional Hybrid Framework for Enhanced Single- and Multi-Task Approaches | 50
Dynamic Slimmable Denoising Network | 50
Toward Scalable and Unified Example-Based Explanation and Outlier Detection | 49
Bi-Directional Pseudo-Three-Dimensional Network for Video Frame Interpolation | 49
Contrastive Conditional Latent Diffusion for Audio-Visual Segmentation | 49
Perception-Guided Quality Metric of 3D Point Clouds Using Hybrid Strategy | 48
MBFQuant: A Multiplier-Bitwidth-Fixed, Mixed-Precision Quantization Method for Mobile CNN-Based Applications | 48
Boosting Monocular 3D Human Pose Estimation With Part Aware Attention | 48
Ingredient-Guided Region Discovery and Relationship Modeling for Food Category-Ingredient Prediction | 48
PointFormer: Keypoint-Guided Transformer for Simultaneous Nuclei Segmentation and Classification in Multi-Tissue Histology Images | 47
Attribute and State Guided Structural Embedding Network for Vehicle Re-Identification | 47
Zero-Shot Camouflaged Object Detection | 47
Rethinking the Low-Light Video Enhancement: Benchmark Datasets and Methods | 47
U-Shape Transformer for Underwater Image Enhancement | 47
Hyperspectral Image Classification via Cascaded Spatial Cross-Attention Network | 47
Underwater Image Enhancement With Hyper-Laplacian Reflectance Priors | 47
Underwater Image Enhancement via Minimal Color Loss and Locally Adaptive Contrast Enhancement | 47
DO-Conv: Depthwise Over-Parameterized Convolutional Layer | 46
FABNet: Frequency-Aware Binarized Network for Single Image Super-Resolution | 46
SDSFusion: A Semantic-Aware Infrared and Visible Image Fusion Network for Degraded Scenes | 46
Leveraging Frequency Analysis for Image Denoising Network Pruning | 46
DVMark: A Deep Multiscale Framework for Video Watermarking | 46
Deep Underwater Image Quality Assessment With Explicit Degradation Awareness Embedding | 46
IEEE Transactions on Image Processing publication information | 46
Spatio-Temporal Correlation Guided Geometric Partitioning for Versatile Video Coding | 46
Learned Image Compression With Gaussian-Laplacian-Logistic Mixture Model and Concatenated Residual Modules | 46
TTST: A Top-k Token Selective Transformer for Remote Sensing Image Super-Resolution | 46
Improving Transferability of Universal Adversarial Perturbation With Feature Disruption | 46
Rethinking Generalized Zero-Shot Learning: A Synthesized Per-Instance Attribute Perspective | 45
Versatile Denoising-Based Approximate Message Passing for Compressive Sensing | 45
Linearly Transformed Color Guide for Low-Bitrate Diffusion-Based Image Compression | 45
State-Aware Compositional Learning Toward Unbiased Training for Scene Graph Generation | 45
Explicitly-Decoupled Text Transfer With Minimized Background Reconstruction for Scene Text Editing | 45
ECEA: Extensible Co-Existing Attention for Few-Shot Object Detection | 45
Designing an Illumination-Aware Network for Deep Image Relighting | 44
Scale-Aware Crowd Counting Network With Annotation Error Modeling | 44
Hierarchical Prior-Based Super Resolution for Point Cloud Geometry Compression | 44
Enhancing Multimodal Learning via Hierarchical Fusion Architecture Search With Inconsistency Mitigation | 44
Diverse Target and Contribution Scheduling for Domain Generalization | 44
Zero-Shot Skeleton-Based Action Recognition With Prototype-Guided Feature Alignment | 44
Multi-Modal Remote Sensing Image Matching Considering Co-Occurrence Filter | 44
Advancing Weakly-Supervised Change Detection in Satellite Images via Adversarial Class Prompting | 43
StreakNet-Arch: An Anti-Scattering Network-Based Architecture for Underwater Carrier LiDAR-Radar Imaging | 43
View-Consistency Learning for Incomplete Multiview Clustering | 43
YOLOH: You Only Look One Hourglass for Real-Time Object Detection | 43
C-NeRF: Representing Scene Changes as Directional Consistency Difference-based NeRF | 43
Siamese-DETR for Generic Multi-Object Tracking | 42
Lightweight Deep Neural Networks for Ship Target Detection in SAR Imagery | 42
Magi-Net: Meta Negative Network for Early Activity Prediction | 42
Decoupling Discriminative Attributes for Few-Shot Fine-Grained Recognition | 42
Cluster-Guided Asymmetric Contrastive Learning for Unsupervised Person Re-Identification | 42
BVI-VFI: A Video Quality Database for Video Frame Interpolation | 42
Toward Transparent Deep Image Aesthetics Assessment With Tag-Based Content Descriptors | 42
Accurate 3D Measurement of Complex Texture Objects by Height Compensation Using a Dual-Projector Structure | 42
Hyperpixels: Flexible 4D Over-Segmentation for Dense and Sparse Light Fields | 42
Action Quality Assessment via Hierarchical Pose-Guided Multi-Stage Contrastive Regression | 41
Coupled Splines for Sparse Curve Fitting | 41