OOIR: Observatory of International Research

Papers

(The median citation count of IEEE Transactions on Pattern Analysis and Machine Intelligence is 8. The table below lists the papers above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2022-06-01 to 2026-06-01.)
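The selection rule described above (a publication window, a median threshold, and a cap of 250 rows) can be sketched roughly as follows. This is only an illustration under stated assumptions, not OOIR's actual procedure: the function name `select_papers`, the record layout, and the sample records are hypothetical, and the sketch assumes that the median is taken over the papers inside the window and that "above" means strictly greater than the median.

```python
from datetime import date
from statistics import median


def select_papers(records, start, end, limit=250):
    """Return records above the median citation count, most cited first.

    records: list of (title, citations, published) tuples (hypothetical layout).
    Assumption: the median is computed over the papers inside the window.
    """
    # Keep only papers published inside the window (inclusive bounds assumed).
    in_window = [r for r in records if start <= r[2] <= end]
    if not in_window:
        return []
    threshold = median(r[1] for r in in_window)
    # "Above that threshold" is read here as strictly greater than the median.
    above = [r for r in in_window if r[1] > threshold]
    above.sort(key=lambda r: r[1], reverse=True)
    return above[:limit]


# Invented sample records, for illustration only.
sample = [
    ("Paper A", 20, date(2023, 1, 1)),
    ("Paper B", 8, date(2024, 5, 1)),
    ("Paper C", 3, date(2025, 2, 1)),
    ("Paper D", 50, date(2021, 1, 1)),  # outside the window, so ignored
]
selected = select_papers(sample, date(2022, 6, 1), date(2026, 6, 1))
for title, citations, _ in selected:
    print(f"{title}\t{citations}")
```

With the invented sample, the in-window median is 8, so only "Paper A" is listed; "Paper B" sits exactly at the median and is excluded under the strict reading.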

Article	Citations
Front Cover	3833
On the Trade-Off Between Flatness and Optimization in Distributed Learning	1991
Symbolic Visual Reinforcement Learning: A Scalable Framework With Object-Level Abstraction and Differentiable Expression Search	1709
Editorial: Introduction to the Special Section on Best of CVPR'2022	1562
Self-Supervised Skeleton Representation Learning Via Actionlet Contrast and Reconstruct	984
BiBBDM: Bidirectional Image Translation With Brownian Bridge Diffusion Models	916
Implicit Annealing in Kernel Spaces: A Strongly Consistent Clustering Approach	774
Video Demoireing Using Focused-Defocused Dual-Camera System	742
Invariant Policy Learning: A Causal Perspective	731
A Hybrid Stochastic-Deterministic Minibatch Proximal Gradient Method for Efficient Optimization and Generalization	728
Towards Accurate and Compact Architectures via Neural Architecture Transformer	686
Seeing Through Satellite Images at Street Views	682
Next Bit Prediction: A Unified Lossless and Lossy Point Cloud Geometry Compression Framework	669
LRANet++: Low-Rank Approximation Network for Accurate and Efficient Text Spotting	627
Modeling Noisy Annotations for Point-Wise Supervision	609
Adaptive Surface Normal Constraint for Geometric Estimation From Monocular Images	608
Test-Time Correction: An Online 3D Detection System via Visual Prompting	602
Simplicial Complex Neural Networks	591
Interaction-Based Inductive Bias in Graph Neural Networks: Enhancing Protein-Ligand Binding Affinity Predictions From 3D Structures	554
Rethinking Rotation-Invariant Recognition of Fine-Grained Shapes From the Perspective of Contour Points	548
Revisiting Transformation Invariant Geometric Deep Learning: An Initial Representation Perspective	538
MECD+: Unlocking Event-Level Causal Graph Discovery for Video Reasoning	532
Quadratic Matrix Factorization With Applications to Manifold Learning	529
Weakly Supervised Semantic Segmentation via Box-Driven Masking and Filling Rate Shifting	508
Learn to Predict Sets Using Feed-Forward Neural Networks	504
Motion-Aware Dynamic Graph Neural Network for Video Compressive Sensing	490
Probing Synergistic High-Order Interaction for Multi-Modal Image Fusion	460
Vertical Layering of Quantized Neural Networks for Heterogeneous Inference	458
Like Human Rethinking: Contour Transformer AutoRegression for Referring Remote Sensing Interpretation	450
Learning Signed Hyper Surfaces for Oriented Point Cloud Normal Estimation	450
A Personalized and Privacy-Preserving Federated Transformer Framework for Multilingual Sentiment Analysis	444
S-NeRF++: Autonomous Driving Simulation via Neural Reconstruction and Generation	437
Task-Oriented Channel Attention for Fine-Grained Few-Shot Classification	435
Ensemble-Enhanced Semi-Supervised Learning With Optimized Graph Construction for High-Dimensional Data	433
DAQE: Enhancing the Quality of Compressed Images by Exploiting the Inherent Characteristic of Defocus	430
VATr++: Choose Your Words Wisely for Handwritten Text Generation	418
Separable Spatial-Temporal Residual Graph for Cloth-Changing Group Re-Identification	418
Multiple Video Frame Interpolation via Enhanced Deformable Separable Convolution	394
Optimization-Based Post-Training Quantization With Bit-Split and Stitching	392
AIRPNet: Adaptive Image Restoration With Privacy Protection in Steganographic Domain	386
Learning With Style: Continual Semantic Segmentation Across Tasks and Domains	382
ABINet++: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Spotting	380
MADAv2: Advanced Multi-Anchor Based Active Domain Adaptation Segmentation	377
Graph Convolutional Module for Temporal Action Localization in Videos	373
Towards Expressive Spectral-Temporal Graph Neural Networks for Time Series Forecasting	371
Multi-Dataset, Multitask Learning of Egocentric Vision Tasks	357
Enhancing Representations Through Heterogeneous Self-Supervised Learning	353
Asymmetric Convolution: An Efficient and Generalized Method to Fuse Feature Maps in Multiple Vision Tasks	336
Centerless Clustering	313
Active Supervised Cross-Modal Retrieval	310
Rein++: Efficient Generalization and Adaptation for Semantic Segmentation with Vision Foundation Models	309
Are Graph Convolutional Networks With Random Weights Feasible?	308
Analysis of Video Quality Datasets via Design of Minimalistic Video Quality Models	302
Reliable and Compact Graph Fine-Tuning Via Graph Sparse Prompting	300
Label Hierarchy Transition: Delving into Class Hierarchies to Enhance Deep Classifiers	298
SCGT: Towards Scalable and Comprehensive Graph Transformer	296
Event-Based Photometric Bundle Adjustment	295
Locating and Counting Heads in Crowds With a Depth Prior	289
Principal Uncertainty Quantification With Spatial Correlation for Image Restoration Problems	288
Rethinking Link Prediction for Directed Graphs	287
Detection-Friendly Dehazing: Object Detection in Real-World Hazy Scenes	278
One-for-All: Towards Universal Domain Translation With a Single StyleGAN	278
DVIS++: Improved Decoupled Framework for Universal Video Segmentation	271
Deep Long-Tailed Learning: A Survey	270
Physics-Informed Guided Disentanglement in Generative Networks	270
Learning to Guide a Saturation-Based Theorem Prover	265
Face Generation and Editing With StyleGAN: A Survey	263
Omni-Training: Bridging Pre-Training and Meta-Training for Few-Shot Learning	262
Prior Image Guided Snapshot Compressive Spectral Imaging	252
Interactive NeRF Geometry Editing With Shape Priors	252
Metrics for Dataset Demographic Bias: A Case Study on Facial Expression Recognition	251
Jailbreak and Guard Aligned Language Models With Only Few In-Context Demonstrations	238
OPAL: Occlusion Pattern Aware Loss for Unsupervised Light Field Disparity Estimation	236
SNI-SLAM++: Tightly-Coupled Semantic Neural Implicit SLAM	234
Moment-Reenacting: Inverse Motion Degradation With Cross-Shutter Guidance	234
A Clustering Validity Index With Multi-Granularity Fusion for Multiple Fuzzy Clustering Algorithms	233
Sparse-to-Dense Matching Network for Large-Scale LiDAR Point Cloud Registration	232
Digging Into Uncertainty-Based Pseudo-Label for Robust Stereo Matching	231
Affective Image Content Analysis: Two Decades Review and New Perspectives	230
Towards Unified Deep Image Deraining: A Survey and a New Benchmark	230
Structure-Preserving Image Super-Resolution	229
Transformer-Based Visual Segmentation: A Survey	229
Inferring Point Cloud Quality via Graph Similarity	227
Learning Graph Convolutional Networks for Multi-Label Recognition and Applications	224
Guaranteed Tensor Recovery Fused Low-rankness and Smoothness	223
Face Forgery Detection by 3D Decomposition and Composition Search	222
Point-to-Pixel Prompting for Point Cloud Analysis With Pre-Trained Image Models	220
Cover 2	219
BNET: Batch Normalization With Enhanced Linear Transformation	218
Adaptive Transfer Kernel Learning for Transfer Gaussian Process Regression	217
Rate-Distortion Theory in Coding for Machines and Its Applications	214
Towards Pointsets Representation Learning via Self-Supervised Learning and Set Augmentation	210
SPARE: Symmetrized Point-to-Plane Distance for Robust Non-Rigid 3D Registration	208
EvolveNav: Empowering LLM-Based Vision-Language Navigation via Self-Improving Embodied Reasoning	205
Deep Orientational Representation Learning for Ordinal Regression	203
Graph-Oriented Instruction Tuning of Large Language Models for Generic Graph Mining	201
Human Interaction Understanding With Consistency-Aware Learning	201
Continuous Review and Timely Correction: Enhancing the Resistance to Noisy Labels via Self-Not-True and Class-Wise Distillation	196
Image Lens Flare Removal Using Adversarial Curve Learning	195
Matrix Completion via Non-Convex Relaxation and Adaptive Correlation Learning	193
Learning Graph Attentions via Replicator Dynamics	190
On Positive-Unlabeled Classification From Corrupted Data in GANs	186
A Variational EM Acceleration for Efficient Clustering at Very Large Scales	182
Evaluation for Weakly Supervised Object Localization: Protocol, Metrics, and Datasets	180
Mining Association Patterns From Neighborhood Insight	179
Controllable Generation With Text-to-Image Diffusion Models: A Survey	177
Revisiting Transferable Adversarial Images: Systemization, Evaluation, and New Insights	177
GCoNet+: A Stronger Group Collaborative Co-Salient Object Detector	177
Winsor-CAM: Human-Tunable Visual Explanations from Deep Networks via Layer-Wise Winsorization	177
HiGCIN: Hierarchical Graph-Based Cross Inference Network for Group Activity Recognition	175
Enhancing Photorealism Enhancement	173
A Unified Experience Replay Framework for Spiking Deep Reinforcement Learning	172
Learning to See Through With Events	171
Self-Scalable Tanh (Stan): Multi-Scale Solutions for Physics-Informed Neural Networks	170
A New Brain Network Construction Paradigm for Brain Disorder via Diffusion-Based Graph Contrastive Learning	170
Revisiting Nonlocal Self-Similarity from Continuous Representation	169
PathNet: Path-Selective Point Cloud Denoising	169
Image-to-Image Translation With Disentangled Latent Vectors for Face Editing	168
SVGDreamer++: Advancing Editability and Diversity in Text-Guided SVG Generation	168
Meta Invariance Defense Towards Generalizable Robustness to Unknown Adversarial Attacks	167
Human-Centric Transformer for Domain Adaptive Action Recognition	167
Random Permutation Set Reasoning	165
Variational Data-Free Knowledge Distillation for Continual Learning	165
QDTrack: Quasi-Dense Similarity Learning for Appearance-Only Multiple Object Tracking	163
A Unified Decision Rule for Generalized Out-of-Distribution Detection	163
Flare7K++: Mixing Synthetic and Real Datasets for Nighttime Flare Removal and Beyond	161
Weakly Supervised Tracklet Association Learning With Video Labels for Person Re-Identification	161
Temporal Feature Matters: A Framework for Diffusion Model Quantization	160
Robust Multimodal Learning With Missing Modalities via Parameter-Efficient Adaptation	160
AutoNovel: Automatically Discovering and Learning Novel Visual Categories	160
Hypergraph-Based Multi-View Action Recognition Using Event Cameras	159
P2T: Pyramid Pooling Transformer for Scene Understanding	159
Bridging Actions: Generate 3D Poses and Shapes In-Between Photos	158
Out-of-Domain Generalization From a Single Source: An Uncertainty Quantification Approach	157
Influence Function Based Second-Order Channel Pruning: Evaluating True Loss Changes for Pruning is Possible Without Retraining	157
Scalable Optimal Transport Methods in Machine Learning: A Contemporary Survey	154
Curriculum-Based Asymmetric Multi-Task Reinforcement Learning	154
MoIL: Momentum Imitation Learning for Efficient Vision-Language Adaptation	154
Reconstruction Guided Meta-Learning for Few Shot Open Set Recognition	152
Towards Deviation-Robust Agent Navigation via Perturbation-Aware Contrastive Learning	152
Compositional Scene Representation Learning via Reconstruction: A Survey	149
Reduced-Rank Tensor-on-Tensor Regression and Tensor-Variate Analysis of Variance	148
Differentially Private Graph Neural Networks for Whole-Graph Classification	147
Fear-Neuro-Inspired Reinforcement Learning for Safe Autonomous Driving	146
Thermal3D-GS: Physics-Induced 3D Gaussians for Thermal Infrared Novel-View Synthesis With a Large-Scale Dataset	143
Boosting the Performance of Decentralized Federated Learning via Catalyst Acceleration	143
Asymmetric Loss Functions for Noise-Tolerant Learning: Theory and Applications	142
Understanding the Effects of Projectors in Knowledge Distillation	141
MESA: Effective Matching Redundancy Reduction by Semantic Area Segmentation	141
M$^{3}$D: A Multimodal, Multilingual and Multitask Dataset for Grounded Document-Level Information Extraction	140
Privacy Preserving Decentralized Learning with Positive-Incentive Noise	138
Universal Image Segmentation With Efficiency	137
Supervised Small-baseline and Large-baseline Homography Learning with Diffusion-based Data Generation	137
To Fold or Not to Fold: Graph Regularized Tensor Train for Visual Data Completion	137
SPLiT: Single Portrait Lighting Estimation via a Tetrad of Face Intrinsics	137
Correcting Optical Aberration via Depth-Aware Point Spread Functions	137
Ensemble Multi-Quantiles: Adaptively Flexible Distribution Prediction for Uncertainty Quantification	135
Inter-Intra Hypergraph Computation for Survival Prediction on Whole Slide Images	135
Learning Efficient Meshflow and Optical Flow From Event Cameras	133
New Dataset and Methods for Fine-Grained Compositional Referring Expression Comprehension via Specialist-MLLM Collaboration	133
InstructLayout: Instruction-Driven 2D and 3D Layout Synthesis With Semantic Graph Prior	133
Physics-Informed Matrix Factorization Operator	132
GenPoly: Learning Generalized and Tessellated Shape Priors via 3D Polymorphic Evolving	131
Discriminant Feature Extraction by Generalized Difference Subspace	131
VNVC: A Versatile Neural Video Coding Framework for Efficient Human-Machine Vision	131
LMP-GAN: Out-of-Distribution Detection for Non-Control Data Malware Attacks	131
Learning From Partially Labeled Data for Multi-Organ and Tumor Segmentation	130
Autonomous Causal Discovery: Evaluating LLMs' Priors and Constraint Strategies for Reliability	130
On the Robustness of Average Losses for Partial-Label Learning	129
Scanpath Prediction in Panoramic Videos Via Expected Code Length Minimization	129
Deep Learning-Based Point Cloud Compression: An In-Depth Survey and Benchmark	129
Dawn of the Transformer Era in Speech Emotion Recognition: Closing the Valence Gap	129
Unbiased Scene Graph Generation via Two-Stage Causal Modeling	125
Accurate and Efficient Stereo Matching via Attention Concatenation Volume	125
MetaDrive: Composing Diverse Driving Scenarios for Generalizable Reinforcement Learning	125
Deep Gait Recognition: A Survey	125
Advances and Challenges in Meta-Learning: A Technical Review	125
Dataset Security for Machine Learning: Data Poisoning, Backdoor Attacks, and Defenses	125
Knowledge-Based Embodied Question Answering	123
GradMDM: Adversarial Attack on Dynamic Networks	123
From Simple to Complex Scenes: Learning Robust Feature Representations for Accurate Human Parsing	122
EVDI++: Event-based Video Deblurring and Interpolation via Self-Supervised Learning	121
Interpretable Optimization-Inspired Unfolding Network for Low-Light Image Enhancement	121
A Fully Automated Method for 3D Individual Tooth Identification and Segmentation in Dental CBCT	121
Self-Supervised Multimodal Learning: A Survey	121
JointFormer: A Unified Framework With Joint Modeling for Video Object Segmentation	120
Low-Shot Video Object Segmentation	120
WildVideo: Benchmarking LMMs for Understanding Video-Language Interaction	119
ComputingEdge ad	119
Cover 3	118
Reframing Neural Networks: Deep Structure in Overcomplete Representations	117
Towards Reliable and Faithful Explanations: A Disentanglement-Augmented Approach for Selective Rationalization	116
PMGT-VR: A Decentralized Proximal-Gradient Algorithmic Framework With Variance Reduction	116
Supervision by Denoising	116
AutoEval: Are Labels Always Necessary for Classifier Accuracy Evaluation?	116
PRANCE: Joint Token-Optimization and Structural Channel-Pruning for Adaptive ViT Inference	115
SS-NeRF: Physically Based Sparse Spectral Rendering With Neural Radiance Field	115
Cascaded Dynamic Memory Refinement and Semantic Alignment for Exo-to-Ego Cross-View Video Generation	115
Deep Learning on Object-Centric 3D Neural Fields	114
MoBluRF: Motion Deblurring Neural Radiance Fields for Blurry Monocular Video	112
SHADOW: Secure Hidden Authenticating Digital Objects in the Wild	112
GLC++: Source-Free Universal Domain Adaptation Through Global-Local Clustering and Contrastive Affinity Learning	110
Learn to Enhance Sparse Spike Streams	109
ModeRNN: Harnessing Spatiotemporal Mode Collapse in Unsupervised Predictive Learning	108
An Energy-Based Prior for Generative Saliency	107
Video DataFlywheel: Resolving the Impossible Data Trinity in Video-Language Understanding	107
The Cluster Structure Function	107
Heterogeneous Feature Re-Sampling for Balanced Pedestrian Attribute Recognition	106
Editorial: Special Section on Egocentric Perception	106
STAR-FC: Structure-Aware Face Clustering on Ultra-Large-Scale Graphs	105
Continual Unsupervised Generative Modeling	105
Joint Framework for Single Image Reconstruction and Super-Resolution With an Event Camera	105
Reusable Architecture Growth for Continual Stereo Matching	105
Learning With Constraint Learning: New Perspective, Solution Strategy and Various Applications	105
Stimulative Training++: Go Beyond the Performance Limits of Residual Networks	105
An Algebraic Geometry Approach to Viewing Graph Solvability	104
Temporal Stereo Matching From Event Cameras via Joint Learning With Stereoscopic Flow	103
Adaptive Sparse Self-Attention for Efficient Image Super-resolution and beyond	103
Orthogonal Decoupling Contrastive Regularization: Toward Uncorrelated Feature Decoupling for Unpaired Image Restoration	103
Confidence-Aware Pseudo-Label Self-Correction for Weakly Supervised Visual Grounding	103
Dynamic Differential Image Circle Diameter Measurement Precision Assessment: Application to Burning Droplets	102
Relationship Quantification of Image Degradations	102
Interpolated Joint Space Adversarial Training for Robust and Generalizable Defenses	101
3D Visual Saliency: An Independent Perceptual Measure or a Derivative of 2D Image Saliency?	101
Deciphering the Feature Representation of Deep Neural Networks for High-Performance AI	100
SKDF: A Simple Knowledge Distillation Framework for Distilling Open-Vocabulary Knowledge to Open-World Object Detector	100
Hypergraph-Based High-Order Correlation Analysis for Large-Scale Long-Tailed Data Classification	100
Revealing the Dark Side of Non-Local Attention in Single Image Super-Resolution	100
luvHarris: A Practical Corner Detector for Event-Cameras	99
Any Fashion Attribute Editing: Dataset and Pretrained Models	99
Adversarially Robust Neural Architectures	97
Compositional Physical Reasoning of Objects and Events From Videos	97
iSeg: An Iterative Refinement-based Framework for Training-free Segmentation	97
Scale Propagation Network for Generalizable Depth Completion	96
YOTO++: Learning Long-Horizon Closed-Loop Bimanual Manipulation from One-Shot Human Video Demonstrations	96
Human as Points: Explicit Point-Based 3D Human Reconstruction From Single-View RGB Images	96
Information-Theoretic Optimization for Task-Adapted Compressed Sensing Magnetic Resonance Imaging	96
ONNXPruner: ONNX-Based General Model Pruning Adapter	96
S$^{2}$O: Enhancing Adversarial Training With Second-Order Statistics of Weights	95
Self-Guidance: Boosting Flow and Diffusion Generation on Their Own	95
Adaptive Perspective Distillation for Semantic Segmentation	95
Homeomorphism Prior for False Positive and Negative Problem in Medical Image Dense Contrastive Representation Learning	95
Unified Adversarial Patch for Visible-Infrared Cross-Modal Attacks in the Physical World	94
Probabilistic Directed Distance Fields for Ray-Based Shape Representations	94
Semi-Supervised Learning for FGVC With Out-of-Category Data	94
SS-TBN: A Semi-Supervised Tri-Branch Network for COVID-19 Screening and Lesion Segmentation	94
Unified Modality Separation: A Vision-Language Framework for Unsupervised Domain Adaptation	94
Learning to Super-Resolve Blurry Images With Events	93
RGB-T Tracking With Template-Bridged Search Interaction and Target-Preserved Template Updating	92
Progressive Instance-Aware Feature Learning for Compositional Action Recognition	91