International Journal of Computer Vision

Papers
(The TQCC of International Journal of Computer Vision is 12. The table below lists the papers whose CrossRef citation counts are at or above that threshold [max. 250 papers]. It covers papers published in the past four years, i.e., from 2020-05-01 to 2024-05-01.)
Article | Citations
Knowledge Distillation: A Survey | 1047
FairMOT: On the Fairness of Detection and Re-identification in Multiple Object Tracking | 634
BiSeNet V2: Bilateral Network with Guided Aggregation for Real-Time Semantic Segmentation | 609
Image Matching from Handcrafted to Deep Features: A Survey | 516
Learning to Prompt for Vision-Language Models | 360
HOTA: A Higher Order Metric for Evaluating Multi-object Tracking | 357
Beyond Brightening Low-light Images | 293
Rectifying Pseudo Label Learning via Uncertainty Estimation for Domain Adaptive Semantic Segmentation | 268
Scene Text Detection and Recognition: The Deep Learning Era | 222
SDNet: A Versatile Squeeze-and-Decomposition Network for Real-Time Image Fusion | 186
Human Action Recognition and Prediction: A Survey | 184
Image Matching Across Wide Baselines: From Paper to Practice | 181
The MVTec Anomaly Detection Dataset: A Comprehensive Real-World Dataset for Unsupervised Anomaly Detection | 162
Attention Guided Low-Light Image Enhancement with a Large Scale Low-Light Simulation Dataset | 144
MOTChallenge: A Benchmark for Single-Camera Multiple Target Tracking | 144
Weakly-supervised Semantic Guided Hashing for Social Image Retrieval | 143
OCNet: Object Context for Semantic Segmentation | 132
EfficientPS: Efficient Panoptic Segmentation | 123
You Only Look Yourself: Unsupervised and Untrained Single Image Dehazing Neural Network | 116
Benchmarking Low-Light Image Enhancement and Beyond | 113
Rescaling Egocentric Vision: Collection, Pipeline and Challenges for EPIC-KITCHENS-100 | 111
Deep Image Deblurring: A Survey | 109
Comparison of Full-Reference Image Quality Models for Optimization of Image Processing Systems | 96
Unsupervised Scale-Consistent Depth Learning from Video | 96
PV-RCNN++: Point-Voxel Feature Set Abstraction With Local Vector Representation for 3D Object Detection | 92
Semantic Hierarchy Emerges in Deep Generative Representations for Scene Synthesis | 88
Pix2Vox++: Multi-scale Context-aware 3D Object Reconstruction from Single and Multiple Images | 85
LaSOT: A High-quality Large-scale Single Object Tracking Benchmark | 79
On the Arbitrary-Oriented Object Detection: Classification Based Approaches Revisited | 77
Pixel-Wise Crowd Understanding via Synthetic Data | 76
Unified Quality Assessment of in-the-Wild Videos with Mixed Datasets Training | 75
JÂA-Net: Joint Facial Action Unit Detection and Face Alignment Via Adaptive Attention | 74
Unsupervised Deep Representation Learning for Real-Time Tracking | 74
Curriculum Learning: A Survey | 71
CLIP-Adapter: Better Vision-Language Models with Feature Adapters | 68
Reference Pose Generation for Long-term Visual Localization via Learned Features and View Synthesis | 63
A Comprehensive Analysis of Weakly-Supervised Semantic Segmentation in Different Image Domains | 63
ViTAEv2: Vision Transformer Advanced by Exploring Inductive Bias for Image Recognition and Beyond | 59
Pixel-in-Pixel Net: Towards Efficient Facial Landmark Detection in the Wild | 59
Deformable Kernel Networks for Joint Image Filtering | 58
Rain Rendering for Evaluating and Improving Robustness to Bad Weather | 57
GhostNets on Heterogeneous Devices via Cheap Operations | 56
Explainability of Deep Vision-Based Autonomous Driving Systems: Review and Challenges | 55
Learning Adaptive Attribute-Driven Representation for Real-Time RGB-T Tracking | 54
VPR-Bench: An Open-Source Visual Place Recognition Evaluation Framework with Quantifiable Viewpoint and Appearance Change | 53
AutoScale: Learning to Scale for Crowd Counting | 53
Structure-Measure: A New Way to Evaluate Foreground Maps | 53
Adaptive Channel Selection for Robust Visual Object Tracking with Discriminative Correlation Filters | 53
3D-FUTURE: 3D Furniture Shape with TextURE | 49
Occluded Video Instance Segmentation: A Benchmark | 48
Vis-MVSNet: Visibility-Aware Multi-view Stereo Network | 47
An Exploration of Embodied Visual Exploration | 47
Synthetic Humans for Action Recognition from Unseen Viewpoints | 45
Countering Malicious DeepFakes: Survey, Battleground, and Horizon | 45
Towards High Performance Human Keypoint Detection | 44
A Comprehensive Benchmark Analysis of Single Image Deraining: Current Challenges and Future Perspectives | 42
Compositional Convolutional Neural Networks: A Robust and Interpretable Model for Object Recognition Under Occlusion | 42
The Fishyscapes Benchmark: Measuring Blind Spots in Semantic Segmentation | 41
A Survey on Long-Tailed Visual Recognition | 41
AdaFuse: Adaptive Multiview Fusion for Accurate Human Pose Estimation in the Wild | 40
Train Sparsely, Generate Densely: Memory-Efficient Unsupervised Training of High-Resolution Temporal GAN | 40
Scale-Aware Domain Adaptive Faster R-CNN | 40
Bridging Composite and Real: Towards End-to-End Deep Image Matting | 40
Context Autoencoder for Self-supervised Representation Learning | 39
Multi-level Motion Attention for Human Motion Prediction | 38
Twin Contrastive Learning for Online Clustering | 37
Learning Adaptive Classifiers Synthesis for Generalized Few-Shot Learning | 37
Deep Nets: What have They Ever Done for Vision? | 37
NormAttention-PSN: A High-frequency Region Enhanced Photometric Stereo Network with Normalized Attention | 37
Quo Vadis, Skeleton Action Recognition? | 36
Zero-Shot Object Detection: Joint Recognition and Localization of Novel Concepts | 36
Viewpoint and Scale Consistency Reinforcement for UAV Vehicle Re-Identification | 36
Semantic Edge Detection with Diverse Deep Supervision | 35
Beyond Dents and Scratches: Logical Constraints in Unsupervised Anomaly Detection and Localization | 35
SensatUrban: Learning Semantics from Urban-Scale Photogrammetric Point Clouds | 34
Progressive DARTS: Bridging the Optimization Gap for NAS in the Wild | 32
Successive Graph Convolutional Network for Image De-raining | 32
Low-light Image Enhancement via Breaking Down the Darkness | 32
Learning to Reconstruct HDR Images from Events, with Applications to Depth and Flow Prediction | 32
Mitigating Demographic Bias in Facial Datasets with Style-Based Multi-attribute Transfer | 31
Manhattan Room Layout Reconstruction from a Single 360° Image: A Comparative Study of State-of-the-Art Methods | 30
Mimetics: Towards Understanding Human Actions Out of Context | 30
3DFaceGAN: Adversarial Nets for 3D Face Representation, Generation, and Translation | 30
Continuous 3D Multi-Channel Sign Language Production via Progressive Transformers and Mixture Density Networks | 29
Polysemy Deciphering Network for Robust Human–Object Interaction Detection | 29
Benchmarking the Robustness of Semantic Segmentation Models with Respect to Common Corruptions | 29
Compositional GAN: Learning Image-Conditional Binary Composition | 28
MADAN: Multi-source Adversarial Domain Aggregation Network for Domain Adaptation | 28
Hierarchical Domain-Adapted Feature Learning for Video Saliency Prediction | 27
LAMP-HQ: A Large-Scale Multi-pose High-Quality Database and Benchmark for NIR-VIS Face Recognition | 27
Recursive Context Routing for Object Detection | 27
Learning JPEG Compression Artifacts for Image Manipulation Detection and Localization | 26
SportsCap: Monocular 3D Human Motion Capture and Fine-Grained Understanding in Challenging Sports Videos | 26
A Coarse-to-Fine Framework for Resource Efficient Video Recognition | 26
Beyond Monocular Deraining: Parallel Stereo Deraining Network Via Semantic Prior | 26
Selective Wavelet Attention Learning for Single Image Deraining | 26
Separating Content from Style Using Adversarial Learning for Recognizing Text in the Wild | 25
Fine-Grained Instance-Level Sketch-Based Image Retrieval | 25
RePCD-Net: Feature-Aware Recurrent Point Cloud Denoising Network | 24
3D Semantic Scene Completion: A Survey | 23
Feature Matching via Motion-Consistency Driven Probabilistic Graphical Model | 23
Hadamard Matrix Guided Online Hashing | 23
Parallel Single-Pixel Imaging: A General Method for Direct–Global Separation and 3D Shape Reconstruction Under Strong Global Illumination | 23
Zero-Shot Learning on 3D Point Cloud Objects and Beyond | 23
Unsupervised Domain Adaptation with Background Shift Mitigating for Person Re-Identification | 23
Semantics-to-Signal Scalable Image Compression with Learned Revertible Representations | 23
Dual Convolutional Neural Networks for Low-Level Vision | 22
Learning Self-supervised Low-Rank Network for Single-Stage Weakly and Semi-supervised Semantic Segmentation | 22
Talk2Nav: Long-Range Vision-and-Language Navigation with Dual Attention and Spatial Memory | 22
Underwater Camera: Improving Visual Perception Via Adaptive Dark Pixel Prior and Color Correction | 22
Multi-task Compositional Network for Visual Relationship Detection | 22
SODA: Weakly Supervised Temporal Action Localization Based on Astute Background Response and Self-Distillation Learning | 21
SRT3D: A Sparse Region-Based 3D Object Tracking Approach for the Real World | 21
Dual-Attention-Guided Network for Ghost-Free High Dynamic Range Imaging | 21
3D Object Detection for Autonomous Driving: A Comprehensive Survey | 21
Spatial–Temporal Relation Reasoning for Action Prediction in Videos | 20
Memory-Augmented Deep Unfolding Network for Guided Image Super-resolution | 20
PhysFormer++: Facial Video-Based Physiological Measurement with SlowFast Temporal Difference Transformer | 20
Joint Classification and Regression for Visual Tracking with Fully Convolutional Siamese Networks | 20
On Measuring and Controlling the Spectral Bias of the Deep Image Prior | 19
Context-Enhanced Representation Learning for Single Image Deraining | 19
Learning Deep Patch representation for Probabilistic Graphical Model-Based Face Sketch Synthesis | 19
Intra-Camera Supervised Person Re-Identification | 19
OASIS: Only Adversarial Supervision for Semantic Image Synthesis | 18
Adaptive Deep Disturbance-Disentangled Learning for Facial Expression Recognition | 18
Multi-Modal 3D Object Detection in Autonomous Driving: A Survey | 18
Weakly-Supervised Semantic Segmentation with Visual Words Learning and Hybrid Pooling | 18
A Shape Transformation-based Dataset Augmentation Framework for Pedestrian Detection | 18
Enhanced 3D Human Pose Estimation from Videos by Using Attention-Based Neural Network with Dilated Convolutions | 18
Dual-Constrained Deep Semi-Supervised Coupled Factorization Network with Enriched Prior | 18
REVISE: A Tool for Measuring and Mitigating Bias in Visual Datasets | 18
RIConv++: Effective Rotation Invariant Convolutions for 3D Point Clouds Deep Learning | 18
Towards Balanced Learning for Instance Recognition | 17
Delving Deeper into Anti-Aliasing in ConvNets | 17
Vote-Based 3D Object Detection with Context Modeling and SOB-3DNMS | 17
Incorporating Side Information by Adaptive Convolution | 17
Perspectives and Prospects on Transformer Architecture for Cross-Modal Tasks with Language and Vision | 17
Cascaded Split-and-Aggregate Learning with Feature Recombination for Pedestrian Attribute Recognition | 17
Learning Regression and Verification Networks for Robust Long-term Tracking | 16
Learning to Detect Instance-Level Salient Objects Using Complementary Image Labels | 16
Unsupervised Domain Adaptation in the Wild via Disentangling Representation Learning | 16
Pyramid Attention Network for Image Restoration | 16
Exploring the Capacity of an Orderless Box Discretization Network for Multi-orientation Scene Text Detection | 16
A Survey on Intrinsic Images: Delving Deep into Lambert and Beyond | 15
Evaluation Metrics for Conditional Image Generation | 15
One-Shot Object Affordance Detection in the Wild | 15
Artificial Intelligence for Dunhuang Cultural Heritage Protection: The Project and the Dataset | 15
Class-Difficulty Based Methods for Long-Tailed Visual Recognition | 15
ShadingNet: Image Intrinsics by Fine-Grained Shading Decomposition | 15
DeMoCap: Low-Cost Marker-Based Motion Capture | 15
Delving into Inter-Image Invariance for Unsupervised Visual Representations | 15
Residual Dual Scale Scene Text Spotting by Fusing Bottom-Up and Top-Down Processing | 15
Semantically Tied Paired Cycle Consistency for Any-Shot Sketch-Based Image Retrieval | 15
SliderGAN: Synthesizing Expressive Face Images by Sliding 3D Blendshape Parameters | 15
Pre-Training Without Natural Images | 15
Label-Free Robustness Estimation of Object Detection CNNs for Autonomous Driving Applications | 14
Revisiting Consistency Regularization for Semi-Supervised Learning | 14
Attribute Prototype Network for Any-Shot Learning | 14
Visual Object Tracking in First Person Vision | 14
RoCGAN: Robust Conditional GAN | 14
Face Image Reflection Removal | 14
Beyond Covariance: SICE and Kernel Based Visual Feature Representation | 14
EAN: Event Adaptive Network for Enhanced Action Recognition | 14
Learning the Clustering of Longitudinal Shape Data Sets into a Mixture of Independent or Branching Trajectories | 14
GLENet: Boosting 3D Object Detectors with Generative Label Uncertainty Estimation | 14
Guided Attention in CNNs for Occluded Pedestrian Detection and Re-identification | 14
A Numerical Framework for Elastic Surface Matching, Comparison, and Interpolation | 13
CDTD: A Large-Scale Cross-Domain Benchmark for Instance-Level Image-to-Image Translation and Domain Adaptive Object Detection | 13
I3CL: Intra- and Inter-Instance Collaborative Learning for Arbitrary-Shaped Scene Text Detection | 13
Sparse Black-Box Video Attack with Reinforcement Learning | 13
Object Priors for Classifying and Localizing Unseen Actions | 13
Multi-Object Tracking and Segmentation Via Neural Message Passing | 13
A Benchmark and Evaluation of Non-Rigid Structure from Motion | 13
AutoDet: Pyramid Network Architecture Search for Object Detection | 13
Continuous and Diverse Image-to-Image Translation via Signed Attribute Vectors | 12
Incremental Rotation Averaging | 12
Distribution-Sensitive Information Retention for Accurate Binary Neural Network | 12
Going Deeper than Tracking: A Survey of Computer-Vision Based Recognition of Animal Pain and Emotions | 12
Shape My Face: Registering 3D Face Scans by Surface-to-Surface Translation | 12
H-SegMed: A Hybrid Method for Prostate Segmentation in TRUS Images via Improved Closed Principal Curve and Improved Enhanced Machine Learning | 12
Multi-adversarial Faster-RCNN with Paradigm Teacher for Unrestricted Object Detection | 12
Saliency Detection Inspired by Topological Perception Theory | 12
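
The selection rule stated above the table (a citation threshold, a publication window, and a cap of 250 papers) can be sketched in Python as follows. This is a minimal illustration, not the site's actual pipeline: the `Paper` record, the `select_papers` helper, and the sample rows are hypothetical, the threshold is taken as a given input rather than computed, and treating both window endpoints and the threshold as inclusive is an assumption (the table does list papers with exactly 12 citations).

```python
from datetime import date
from typing import NamedTuple


class Paper(NamedTuple):
    # Hypothetical record type for illustration only.
    title: str
    citations: int
    published: date


def select_papers(papers, threshold, start, end, limit=250):
    """Keep papers published in [start, end] with citations >= threshold,
    ordered from most to least cited and capped at `limit` entries."""
    eligible = [
        p for p in papers
        if start <= p.published <= end and p.citations >= threshold
    ]
    eligible.sort(key=lambda p: p.citations, reverse=True)
    return eligible[:limit]


# Hypothetical sample rows; titles, counts, and dates are NOT from the table.
sample = [
    Paper("Paper A", 40, date(2021, 3, 1)),
    Paper("Paper B", 12, date(2022, 7, 15)),
    Paper("Paper C", 11, date(2023, 1, 10)),   # below the threshold
    Paper("Paper D", 90, date(2019, 12, 31)),  # outside the window
]

selected = select_papers(
    sample, threshold=12, start=date(2020, 5, 1), end=date(2024, 5, 1)
)
print([p.title for p in selected])
```

With these sample rows, only "Paper A" and "Paper B" pass both the window and the threshold checks.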