Computer Vision and Image Understanding

Papers
(The TQCC of Computer Vision and Image Understanding is 5. The table below lists the papers at or above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2020-05-01 to 2024-05-01.)
Article | Citations
Skeleton-based action recognition via spatial and temporal transformer networks | 143
Deep 3D human pose estimation: A review | 138
Video anomaly detection and localization via Gaussian Mixture Fully Convolutional Variational Autoencoder | 126
Pyramid Channel-based Feature Attention Network for image dehazing | 109
Deep learning for deepfakes creation and detection: A survey | 96
Pros and cons of GAN evaluation measures: New developments | 88
A review of 3D human pose estimation algorithms for markerless motion capture | 79
TCLR: Temporal contrastive learning for video representation | 66
Fake face detection via adaptive manipulation traces extraction network | 66
Infrared and visible image fusion via gradientlet filter | 50
A comprehensive review of past and present image inpainting methods | 48
Single-image deblurring with neural networks: A comparative survey | 47
CUFD: An encoder–decoder network for visible and infrared image fusion based on common and unique feature decomposition | 46
High-level prior-based loss functions for medical image segmentation: A survey | 45
Knowledge distillation for incremental learning in semantic segmentation | 45
Age estimation from faces using deep learning: A comparative analysis | 37
Nighttime image dehazing based on Retinex and dark channel prior using Taylor series expansion | 36
Visual complexity analysis using deep intermediate-layer features | 36
Human action recognition in drone videos using a few aerial training examples | 33
Multi-focus image fusion approach based on CNP systems in NSCT domain | 33
Visual object tracking: A survey | 31
Learning deep edge prior for image denoising | 30
Adversarial examples for replay attacks against CNN-based face recognition with anti-spoofing capability | 29
Detection of Face Recognition Adversarial Attacks | 29
A survey on bias in visual datasets | 28
Curriculum self-paced learning for cross-domain object detection | 28
End-to-end deep learning-based fringe projection framework for 3D profiling of objects | 27
The synergy of double attention: Combine sentence-level and word-level attention for image captioning | 25
Video Deblurring via Spatiotemporal Pyramid Network and Adversarial Gradient Prior | 25
MTRNet++: One-stage mask-based scene text eraser | 24
Detail preserving image denoising with patch-based structure similarity via sparse representation and SVD | 23
ICycleGAN: Single image dehazing based on iterative dehazing model and CycleGAN | 23
Predicting the future from first person (egocentric) vision: A survey | 22
SSDA-YOLO: Semi-supervised domain adaptive YOLO for cross-domain object detection | 22
Decoupled appearance and motion learning for efficient anomaly detection in surveillance video | 22
Enhanced discriminative graph convolutional network with adaptive temporal modelling for skeleton-based action recognition | 21
Ghost Removal via Channel Attention in Exposure Fusion | 21
Uncertainty-aware consistency regularization for cross-domain semantic segmentation | 20
SSMTL++: Revisiting self-supervised multi-task learning for video anomaly detection | 20
Multi-scale attention network for image inpainting | 19
Deep structural information fusion for 3D object detection on LiDAR–camera system | 19
Person re-identification with part prediction alignment | 18
Multimodal attention networks for low-level vision-and-language navigation | 18
Hyperspectral image restoration via CNN denoiser prior regularized low-rank tensor recovery | 16
Pruning CNN filters via quantifying the importance of deep visual representations | 16
Efficient dual attention SlowFast networks for video action recognition | 16
An attention recurrent model for human cooperation detection | 15
Sejong face database: A multi-modal disguise face database | 15
Automatic detection and localization of thighbone fractures in X-ray based on improved deep learning method | 15
JSNet: A simulation network of JPEG lossy compression and restoration for robust image watermarking against JPEG attack | 15
Real-time and accurate object detection in compressed video by long short-term feature aggregation | 15
Evaluate and improve the quality of neural style transfer | 15
Residual network with detail perception loss for single image super-resolution | 14
Task dependent deep LDA pruning of neural networks | 14
Deep learning-based single image face depth data enhancement | 14
Scalable learning for bridging the species gap in image-based plant phenotyping | 14
Image dehazing based on a transmission fusion strategy by automatic image matting | 14
Joint identification–verification for person re-identification: A four stream deep learning approach with improved quartet loss function | 14
Attentive deep network for blind motion deblurring on dynamic scenes | 14
Video action detection by learning graph-based spatio-temporal interactions | 14
PS-DeVCEM: Pathology-sensitive deep learning model for video capsule endoscopy based on weakly labeled data | 14
Representation learning of image composition for aesthetic prediction | 14
Product image recognition with guidance learning and noisy supervision | 13
Adaptive CNN filter pruning using global importance metric | 13
Fully convolutional online tracking | 13
MC-Calib: A generic and robust calibration toolbox for multi-camera systems | 12
AC-VRNN: Attentive Conditional-VRNN for multi-future trajectory prediction | 12
Few-shot action recognition with implicit temporal alignment and pair similarity optimization | 12
Light-weight shadow detection via GCN-based annotation strategy and knowledge distillation | 12
Embedding group and obstacle information in LSTM networks for human trajectory prediction in crowded scenes | 12
A survey on RGB-D datasets | 12
Animal pose estimation: A closer look at the state-of-the-art, existing gaps and opportunities | 12
Robust real-world point cloud registration by inlier detection | 12
A novel shape matching descriptor for real-time static hand gesture recognition | 12
Cross-modal distillation for RGB-depth person re-identification | 12
Momental directional patterns for dynamic texture recognition | 12
Encoder and decoder network with ResNet-50 and global average feature pooling for local change detection | 11
Facial landmarks localization using cascaded neural networks | 11
Comprehensive comparative evaluation of background subtraction algorithms in open sea environments | 11
Physics-based shading reconstruction for intrinsic image decomposition | 11
SID: Incremental learning for anchor-free object detection via Selective and Inter-related Distillation | 11
Unifying frame rate and temporal dilations for improved remote pulse detection | 11
Frame-level refinement networks for skeleton-based gait recognition | 11
A data augmentation framework by mining structured features for fake face image detection | 10
Multi-modal semantic image segmentation | 10
Investigating the significance of adversarial attacks and their relation to interpretability for radar-based human activity recognition systems | 10
A multi-view-CNN framework for deep representation learning in image classification | 10
Single image rain removal via multi-module deep grid network | 10
Lightweight adaptive weighted network for single image super-resolution | 10
Image retrieval with mixed initiative and multimodal feedback | 10
Intelligent video analysis: A Pedestrian trajectory extraction method for the whole indoor space without blind areas | 10
MFMAM: Image inpainting via multi-scale feature module with attention module | 9
Learning to locate for fine-grained image recognition | 9
Periocular biometrics and its relevance to partially masked faces: A survey | 9
Casting a BAIT for offline and online source-free domain adaptation | 9
Self-supervised on-line cumulative learning from video streams | 9
Facial landmark points detection using knowledge distillation-based neural networks | 9
Accurate MR image super-resolution via lightweight lateral inhibition network | 9
Visual BMI estimation from face images using a label distribution based method | 9
Multi-human Fall Detection and Localization in Videos | 8
BacklitNet: A dataset and network for backlit image enhancement | 8
When CNNs meet random RNNs: Towards multi-level analysis for RGB-D object and scene recognition | 8
Rotation invariant features based on three dimensional Gaussian Markov random fields for volumetric texture classification | 8
Multiple instance learning on deep features for weakly supervised object detection with extreme domain shifts | 8
Rolling-Shutter-stereo-aware motion estimation and image correction | 8
Monocular 3D multi-person pose estimation via predicting factorized correction factors | 8
Context understanding in computer vision: A survey | 8
Video scene parsing: An overview of deep learning methods and datasets | 7
FIFNET: A convolutional neural network for motion-based multiframe super-resolution using fusion of interpolated frames | 7
Adversarial feature distribution alignment for semi-supervised learning | 7
Open cross-domain visual search | 7
Self-attentive 3D human pose and shape estimation from videos | 7
HSGAN: Reducing mode collapse in GANs by the latent code distance of homogeneous samples | 7
Unsupervised sound localization via iterative contrastive learning | 7
Detecting abnormality with separated foreground and background: Mutual Generative Adversarial Networks for video abnormal event detection | 7
MTCD: Cataract detection via near infrared eye images | 7
Model-image registration of a building’s facade based on dense semantic segmentation | 7
On the exact recovery conditions of 3D human motion from 2D landmark motion with sparse articulated motion | 7
Adaptive Capsule Network | 7
Learning transformer-based attention region with multiple scales for occluded person re-identification | 7
Anti-jamming heart rate estimation using a spatial–temporal fusion network | 7
A comparison of methods for 3D scene shape retrieval | 7
Pointly-supervised scene parsing with uncertainty mixture | 6
Diff attention: A novel attention scheme for person re-identification | 6
Weakly supervised instance segmentation using multi-prior fusion | 6
Multi-perspective cross-class domain adaptation for open logo detection | 6
Multi-person 3D pose estimation from a single image captured by a fisheye camera | 6
Anchor pruning for object detection | 6
Spatial location constraint prototype loss for open set recognition | 6
LSTM guided ensemble correlation filter tracking with appearance model pool | 6
Zero-shot sketch-based image retrieval with structure-aware asymmetric disentanglement | 6
Are 3D convolutional networks inherently biased towards appearance? | 6
Human skeletons and change detection for efficient violence detection in surveillance videos | 6
Classifier-agnostic saliency map extraction | 6
MetaVD: A Meta Video Dataset for enhancing human action recognition datasets | 6
Camouflaged object detection via Neighbor Connection and Hierarchical Information Transfer | 6
LiDARTouch: Monocular metric depth estimation with a few-beam LiDAR | 6
E-ProSRNet: An enhanced progressive single image super-resolution approach | 6
Snow Mask Guided Adaptive Residual Network for Image Snow Removal | 6
Video frame interpolation via down–up scale generative adversarial networks | 6
Semantic segmentation from remote sensor data and the exploitation of latent learning for classification of auxiliary tasks | 6
Stacked Capsule Graph Autoencoders for geometry-aware 3D head pose estimation | 5
NCMS: Towards accurate anchor free object detection through | 5
Color edge preserving image colorization with a coupled natural vectorial total variation | 5
Learning to teach and learn for semi-supervised few-shot image classification | 5
FRIDA — Generative feature replay for incremental domain adaptation | 5
DenseNet-CTC: An end-to-end RNN-free architecture for context-free string recognition | 5
Reliable shot identification for complex event detection via visual-semantic embedding | 5
Unsupervised face frontalization using disentangled representation-learning CycleGAN | 5
BasicTAD: An astounding RGB-Only baseline for temporal action detection | 5
An asymmetrical-structure auto-encoder for unsupervised representation learning of skeleton sequences | 5
One-class anomaly detection via novelty normalization | 5
Fine-grained facial landmark detection exploiting intermediate feature representations | 5
Low-light image enhancement by deep learning network for improved illumination map | 5
A semantically driven self-supervised algorithm for detecting anomalies in image sets | 5
3D semantic segmentation based on spatial-aware convolution and shape completion for augmented reality applications | 5
Co-segmentation inspired attention module for video-based computer vision tasks | 5
Refining high-frequencies for sharper super-resolution and deblurring | 5
Weakly supervised fine-grained image classification via two-level attention activation model | 5
TMF: Temporal Motion and Fusion for action recognition | 5
Pick-Object-Attack: Type-specific adversarial attack for object detection | 5
Dissected 3D CNNs: Temporal skip connections for efficient online video processing | 5
Diversified text-to-image generation via deep mutual information estimation | 5
Prediction and Description of Near-Future Activities in Video | 5
TransRPN: Towards the Transferable Adversarial Perturbations using Region Proposal Networks and Beyond | 5
Single image super-resolution via hybrid resolution NSST prediction | 5
Feature reconstruction and metric based network for few-shot object detection | 5
Video captioning: A comparative review of where we are and which could be the route | 5
STURE: Spatial–Temporal Mutual Representation Learning for robust data association in online multi-object tracking | 5
Learning to combine the modalities of language and video for temporal moment localization | 5
Infrared and visible image fusion using a guiding network to leverage perceptual similarity | 5
Learning representational invariances for data-efficient action recognition | 5
Dynamic mode decomposition via convolutional autoencoders for dynamics modeling in videos | 5
SAPS: Self-Attentive Pathway Search for weakly-supervised action localization with background-action augmentation | 5
SnapshotNet: Self-supervised feature learning for point cloud data segmentation using minimal labeled data | 5