IEEE Transactions on Circuits and Systems for Video Technology

Papers
(The median citation count of IEEE Transactions on Circuits and Systems for Video Technology is 5. The table below lists the papers whose CrossRef citation counts are above that threshold, up to a maximum of 250 papers. It covers papers published in the past four years, i.e., from 2021-09-01 to 2025-09-01.)
Article | Citations
Table of Contents | 1246
IEEE Transactions on Circuits and Systems for Video Technology publication information | 305
IEEE Transactions on Circuits and Systems for Video Technology publication information | 282
2022 Index IEEE Transactions on Circuits and Systems for Video Technology Vol. 32 | 276
Negative Class Guided Spatial Consistency Network for Sparsely Supervised Semantic Segmentation of Remote Sensing Images | 274
Crowd-Powered Photo Enhancement Featuring an Active Learning Based Local Filter | 270
MCCE-REC: MLLM-Driven Cross-Modal Contrastive Entropy Model for Zero-Shot Referring Expression Comprehension | 267
IEEE Transactions on Circuits and Systems for Video Technology Publication Information | 240
Learning Depth-Density Priors for Fourier-Based Unpaired Image Restoration | 230
CRP2-VCS: Contrast-Oriented Region-Based Progressive Probabilistic Visual Cryptography Schemes | 229
VPA: Multi-modal Virtual Point Augmentation for 3D Object Detection | 228
Scene Prior Constrained Self-Paced Learning for Unsupervised Satellite Video Vehicle Detection | 224
DSC3D: Deformable Sampling Constraints in Stereo 3D Object Detection for Autonomous Driving | 216
Guest Editorial Introduction to the Special Issue on Label-Efficient Learning on Video Data | 214
Table of Contents | 198
IEEE Circuits and Systems Society Information | 195
Representing Boundary-Ambiguous Scene Online With Scale-Encoded Cascaded Grids and Radiance Field Deblurring | 193
A Clinically Guided Graph Convolutional Network for Assessment of Parkinsonian Pronation-Supination Movements of Hands | 177
Relative Comparison-Based Consensus Learning for Multi-View Subspace Clustering | 172
Multi-Level Feature Fusion Network for Shadow Removal Detection | 170
Stochastic Gradient Perturbation: An Implicit Regularizer for Person Re-Identification | 169
UDTCWT-PHFMs Domain Statistical Image Watermarking Using Vector BW-Type R Distribution | 166
Iterative Self-Guided Image Filtering | 165
A Format Compliant Framework for HEVC Selective Encryption After Encoding | 161
Harmony: An Eco-Friendly Adaptive Rate Control Scheme for Video-on-Demand in Low Earth Orbit Satellite Internet | 158
Cross-Level Multi-Modal Features Learning With Transformer for RGB-D Object Recognition | 157
SARGAN: Spatial Attention-Based Residuals for Facial Expression Manipulation | 156
Uni3DA: Universal 3D Domain Adaptation for Object Recognition | 155
Convolutional Neural Networks for Omnidirectional Image Quality Assessment: A Benchmark | 149
Block Diagonal Graph Embedded Discriminative Regression for Image Representation | 144
Representation Robustness and Feature Expansion for Exemplar-Free Class-Incremental Learning | 141
Future Feature-Based Supervised Contrastive Learning for Streaming Perception | 138
RT3DHVC: A Real-Time Human Holographic Video Conferencing System With a Consumer RGB-D Camera Array | 137
Fully Unsupervised Domain-Agnostic Image Retrieval | 134
Learning Spatio-Temporal Sharpness Map for Video Deblurring | 134
Viewport Prediction for Volumetric Video Streaming by Exploring Video Saliency and User Trajectory Information | 134
Semantic-Aware Late-Stage Supervised Contrastive Learning for Fine-Grained Action Recognition | 134
Synergistic Fusion Network of Microscopic Hyperspectral and RGB Images for Multi-perspective Segmentation | 132
Truncated Robust Natural Watermarking With Hungarian Optimization | 131
Relation-Aware Multi-Pass Comparison Deconfounded Network for Change Captioning | 129
Exploring Explicitly Disentangled Features for Domain Generalization | 127
Image Super-Resolution With Self-Similarity Prior Guided Network and Sample-Discriminating Learning | 123
Few-Shot Temporal Sentence Grounding via Memory-Guided Semantic Learning | 122
Reversible Data Hiding Over Encrypted Images via Preprocessing-Free Matrix Secret Sharing | 122
Multi-Modal Attribute Prompting for Vision-Language Models | 119
D3C2-Net: Dual-Domain Deep Convolutional Coding Network for Compressive Sensing | 116
Key Role Guided Transformer for Group Activity Recognition | 113
Lightweight Neural Network for Enhancing Imaging Performance of Under-Display Camera | 111
CLIP-VIS: Adapting CLIP for Open-Vocabulary Video Instance Segmentation | 110
Toward Meta-Shape-Based Multi-View 3D Point Cloud Registration: An Evaluation | 109
FastAL: Fast Evaluation Module for Efficient Dynamic Deep Active Learning Using Broad Learning System | 108
Dual Difficulty-aware Adaptive Pseudo Labeling for Semi-supervised CNV Segmentation | 105
Push-and-Pull: A General Training Framework With Differential Augmentor for Domain Generalized Point Cloud Classification | 105
Scalable and Robust Tensor Ring Decomposition for Large-Scale Data With Missing Data and Outliers | 105
Robust Image Watermarking With Synchronization Using Template Enhanced-Extracted Network | 104
SpiReco: Fast and Efficient Recognition of High-Speed Moving Objects With Spike Camera | 103
UAMD-Net: A Unified Adaptive Multimodal Neural Network for Dense Depth Completion | 103
Adversarial Dual-Student With Differentiable Spatial Warping for Semi-Supervised Semantic Segmentation | 102
Equity in Unsupervised Domain Adaptation by Nuclear Norm Maximization | 102
Reversible Data Hiding in Encrypted Image via Secret Sharing Based on GF(p) and GF(2⁸) | 102
Pro-Tuning: Unified Prompt Tuning for Vision Tasks | 101
VDTR: Video Deblurring With Transformer | 101
Deep and Low-Rank Quaternion Priors for Color Image Processing | 101
Local Attention Transformer-Based Full-View Finger-Vein Identification | 100
Active Spatial Positions Based Hierarchical Relation Inference for Group Activity Recognition | 100
Frequency Generation for Real-World Image Super-Resolution | 100
MMI-Det: Exploring Multi-Modal Integration for Visible and Infrared Object Detection | 99
Spatial Attention-Guided Light Field Salient Object Detection Network With Implicit Neural Representation | 98
Learning to Capture the Query Distribution for Few-Shot Learning | 98
Ct-LVI: A Framework Toward Continuous-Time Laser-Visual-Inertial Odometry and Mapping | 95
Instance-Incremental Scene Graph Generation From Real-World Point Clouds via Normalizing Flows | 92
Learning Appearance-Motion Synergy via Memory-Guided Event Prediction for Video Anomaly Detection | 92
Multi-Modal Multi-Grained Embedding Learning for Generalized Zero-Shot Video Classification | 92
Towards Video Anomaly Detection in the Real World: A Binarization Embedded Weakly-Supervised Network | 91
Dual-Stream Transformer With Distribution Alignment for Visible-Infrared Person Re-Identification | 91
Plausible Proxy Mining With Credibility for Unsupervised Person Re-Identification | 89
Spectral–Spatial Feature Extraction With Dual Graph Autoencoder for Hyperspectral Image Clustering | 89
Semi-Supervised Crowd Counting via Multi-Task Pseudo-Label Self-Correction Strategy | 88
Graph-Guided Unsupervised Multiview Representation Learning | 88
Robust Monocular Pose Tracking of Less-Distinct Objects Based on Contour-Part Model | 87
Efficient Single-Object Tracker Based on Local-Global Feature Fusion | 86
TiGDistill-BEV: Multi-view BEV 3D Object Detection via Target Inner-Geometry Learning Distillation | 85
Progressive Point Cloud Upsampling via Differentiable Rendering | 85
USVTrack: A Benchmark for Multi-Object Tracking in Complex Water Surface Scenes | 85
Unsupervised Action Segmentation via Multi-scale Temporal-interaction Enhancement | 85
Deep Affine Motion Compensation Network for Inter Prediction in VVC | 84
DMRFlow: 4D Radar Scene Flow Estimation With Decoupled Matching and Refinement | 84
Pose-Guided Transformer for Fine-Grained Action Quality Assessment | 84
PhyDAA: Physiological Dataset Assessing Attention | 83
EIFNet: An Explicit and Implicit Feature Fusion Network for Finger Vein Verification | 83
Enhancing Representation Learning With Spatial Transformation and Early Convolution for Reinforcement Learning-Based Small Object Detection | 80
Low-Light Image Enhancement via Progressive-Recursive Network | 79
Hierarchical Dynamic Programming Module for Human Pose Refinement | 79
Reliable Entropy-Induced Anchor Learning for Incomplete Multi-View Subspace Clustering | 79
Dependability Feature Learning based on Sample Generation for Unsupervised Text-to-Image Person Re-identification | 78
Fuzzified Contrast Enhancement for Nearly Invisible Images | 78
Video Understanding with Large Language Models: A Survey | 76
Edge and Skeleton Guidance Network for Salient Object Detection in Optical Remote Sensing Images | 76
Joint Learning of Image Deblurring and Depth Estimation Through Adversarial Multi-Task Network | 76
Highly-Parallel Hardwired Deep Convolutional Neural Network for 1-ms Dual-Hand Tracking | 75
Exploring and Exploiting High-Order Spatial–Temporal Dynamics for Long-Term Frame Prediction | 75
Projected Generative Adversarial Network for Point Cloud Completion | 75
MEF-GD: Multimodal Enhancement and Fusion Network for Garment Designer | 75
Single Image Haze Removal With Haze Map Optimization for Various Haze Concentrations | 75
AirSOD: A Lightweight Network for RGB-D Salient Object Detection | 75
Efficient Non-Blind Image Deblurring with Discriminative Shrinkage Deep Networks | 74
IEEE Circuits and Systems Society Information | 74
IEEE Transactions on Circuits and Systems for Video Technology publication information | 74
Touchless Finger Vein and Fingerprint Verification via Exploiting Attention-Based Cross-Domain Fusion | 73
StreetSurfGS: Scalable Urban Street Surface Reconstruction with Planar-based Gaussian Splatting | 73
IEEE Transactions on Circuits and Systems for Video Technology publication information | 73
Online Unsupervised Video Object Segmentation via Contrastive Motion Clustering | 72
Interlayer Restoration Deep Neural Network for Scalable High Efficiency Video Coding | 71
Texture-Aware Spherical Rotation for High Efficiency Omnidirectional Intra Video Coding | 70
Surveillance Video-and-Language Understanding: From Small to Large Multimodal Models | 70
MSGA-Net: Progressive Feature Matching via Multi-Layer Sparse Graph Attention | 69
Diverse Batch Steganography Using Model-Based Selection and Double-Layered Payload Assignment | 69
Efficiently Exploiting Spatially Variant Knowledge for Video Deblurring | 68
Monocular Depth Estimation on Adverse Weathers With Curriculum Domain Distribution Alignment | 67
Mesh2Animation: Unsupervised Animating for Quadruped 3D Objects | 67
FaceGCN: Structured Priors Inspired Graph Convolutional Networks for Face Restoration With Unknown Degradations | 67
WeaFU: Weather-Informed Image Blind Restoration via Multi-Weather Distribution Diffusion | 67
G2LP-Net: Global to Local Progressive Video Inpainting Network | 66
FDAC: Federated Domain Adaptation via Dual Contrastive Learning | 66
Compensating for the Incomplete with the Complete: An Efficient Scene Text Detector | 66
Fixing Defect of Photometric Loss for Self-Supervised Monocular Depth Estimation | 66
Holistic Prototype Attention Network for Few-Shot Video Object Segmentation | 65
Unsupervised Deep Hashing With Fine-Grained Similarity-Preserving Contrastive Learning for Image Retrieval | 65
Learning With Noisy Labels by Semantic and Feature Space Collaboration | 64
Cloth-Imbalanced Gait Recognition via Hallucination | 64
Blind Image Quality Index for Authentic Distortions With Local and Global Deep Feature Aggregation | 64
VVC In-Loop Filters | 63
VSOIQE: A Novel Viewport-Based Stitched 360° Omnidirectional Image Quality Evaluator | 63
Forgery-Aware Adaptive Learning With Vision Transformer for Generalized Face Forgery Detection | 62
A Label-Free and Non-Monotonic Metric for Evaluating Denoising in Event Cameras | 62
Recent Advances in Rate Control: From Optimization to Implementation and Beyond | 62
Multi-Scale Explicit Matching and Mutual Subject Teacher Learning for Generalizable Person Re-Identification | 62
Flow-Edge Guided Unsupervised Video Object Segmentation | 62
FDNet: Frequency Decomposition Network for Learned Image Compression | 62
Target-Aware Tracking With Spatial-Temporal Context Attention | 61
Inter-Scale Similarity Guided Cost Aggregation for Stereo Matching | 61
Adaptive Mixture-of-Experts Distillation for Cross-Satellite Generalizable Incremental Remote Sensing Scene Classification | 61
STAF: 3D Human Mesh Recovery From Video With Spatio-Temporal Alignment Fusion | 61
Efficient Selective Context Network for Accurate Object Detection | 61
Transformer-Based Multimodal Emotional Perception for Dynamic Facial Expression Recognition in the Wild | 61
A Novel Video Coding Strategy in HEVC for Object Detection | 61
A Universal Framework for Improving the Robustness of Coverless Image Steganography Based on Image Restoration | 60
Flow Visualization for Complex Fluid Flows via A Structure-enhanced Motion Estimator | 60
DiffVein: A Unified Diffusion Network for Finger Vein Segmentation and Authentication | 60
DEP-Former: Multimodal Depression Recognition Based on Facial Expressions and Audio Features via Emotional Changes | 59
Table of Contents | 59
Self-Supervised Adversarial Video Summarizer With Context Latent Sequence Learning | 59
MixSSC: Forward-Backward Mixture for Vision-Based 3D Semantic Scene Completion | 59
Table of Contents | 59
OraL: An Observational Learning Paradigm for Unsupervised Hyperspectral Change Detection | 59
Multi-Prior Driven Network for RGB-D Salient Object Detection | 58
A Novel Deep Learning Framework for Automatic Recognition of Thyroid Gland and Tissues of Neck in Ultrasound Image | 58
ImagingNet: A New Learnable SAR Imaging Method via Hierarchical U-shaped Network | 58
All-Inclusive Image Enhancement for Degraded Images Exhibiting Low-Frequency Corruption | 58
Learning Scene-invariant Distribution for Generalizable Blind Image Quality Assessment | 58
SMR: Spatial-Guided Model-Based Regression for 3D Hand Pose and Mesh Reconstruction | 58
Balanced Teacher for Source-Free Object Detection | 58
An Efficient Algorithm for Generating Harmonized Stereoscopic 360° VR Images | 58
Semantic Disentanglement Adversarial Hashing for Cross-Modal Retrieval | 57
Optical Flow Reusing for High-Efficiency Space-Time Video Super Resolution | 57
Boosting Semi-Supervised Face Recognition With Noise Robustness | 57
DAHP: Deep Attention-Guided Hashing With Pairwise Labels | 57
Multi-Level Fusion and Attention-Guided CNN for Image Dehazing | 57
Conditional Dual Diffusion for Multimodal Clustering of Optical and SAR Images | 57
Non-local Guided Neural Fields for 4D CT Reconstruction | 57
Searching a Compact Architecture for Robust Multi-Exposure Image Fusion | 57
Erratum to “Local-Global Temporal Difference Learning for Satellite Video Super-Resolution” | 57
Enhancing Robustness of Multi-Object Trackers With Temporal Feature Mix | 57
TAKD: Target-Aware Knowledge Distillation for Remote Sensing Scene Classification | 57
CNN-Transformer Based Generative Adversarial Network for Copy-Move Source/Target Distinguishment | 56
Low-Resolution Object Recognition With Cross-Resolution Relational Contrastive Distillation | 55
Task-Specific Loss for Robust Instance Segmentation With Noisy Class Labels | 55
Depth Estimation From a Single Image of Blast Furnace Burden Surface Based on Edge Defocus Tracking | 55
Question-Aware Global-Local Video Understanding Network for Audio-Visual Question Answering | 54
Appearance Matters, So Does Audio: Revealing the Hidden Face via Cross-Modality Transfer | 54
Laplacian Pyramid Fusion Network With Hierarchical Guidance for Infrared and Visible Image Fusion | 54
Dynamic Particle Filter Framework for Robust Object Tracking | 54
Dynamic Hypergraph Convolutional Network for No-Reference Point Cloud Quality Assessment | 54
MMGT: Motion Mask Guided Two-Stage Network for Co-Speech Gesture Video Generation | 53
Enhanced Spatial-Temporal Salience for Cross-View Gait Recognition | 53
Meta-Learning Based Domain Prior With Application to Optical-ISAR Image Translation | 53
One for All: A Unified Generative Framework for Image Emotion Classification | 53
Robust Matrix Completion Based on Factorization and Truncated-Quadratic Loss Function | 53
VmambaIR: Visual State Space Model for Image Restoration | 53
Low-Rank Tensor Graph Learning for Multi-View Subspace Clustering | 53
Sampling Propagation Attention With Trimap Generation Network for Natural Image Matting | 52
Feature Alignment in Anchor-Free Object Detection | 52
POS-Trends Dynamic-Aware Model for Video Caption | 52
Learning Multi-View Stereo with Geometry-Aware Prior | 52
Content-Adaptive Rate Control Method for User-Generated Content Videos | 52
Generative Image Steganography Based on Text-to-Image Multimodal Generative Model | 52
VideoPure: Diffusion-based Adversarial Purification for Video Recognition | 52
Exploiting Global Camera Network Constraints for Unsupervised Video Person Re-Identification | 52
Progressive Multi-Prompt learning for Vision-Language Models | 52
Table of Contents | 52
Contrastive Learning With Enhancing Detailed Information for Pre-Training Vision Transformer | 51
Locality-Adaptive Structured Dictionary Learning for Cross-Domain Recognition | 51
IEEE Transactions on Circuits and Systems for Video Technology publication information | 51
Special Issue on Segment Anything for Videos and Beyond | 51
Class Activation Map Calibration for Weakly Supervised Semantic Segmentation | 51
Glimpse and Zoom: Spatio-Temporal Focused Dynamic Network for Skeleton-Based Action Recognition | 51
Exploring Relational Knowledge for Source-Free Domain Adaptation | 50
M3CS: Multi-Target Masked Point Modeling with Learnable Codebook and Siamese Decoders | 50
Corruption-Invariant Person Re-Identification via Coarse-to-Fine Feature Alignment | 50
A Pixel-Level Segmentation-Synthesis Framework for Dynamic Texture Video Compression | 50
Small Sample Image Segmentation by Coupling Convolutions and Transformers | 50
CodingHomo: Bootstrapping Deep Homography With Video Coding | 50
Enhancing Skeleton-Based Action Recognition With Language Descriptions From Pre-Trained Large Multimodal Models | 49
CLSR: Cross-Layer Interaction Pyramid Super-Resolution Network | 49
PCTrack: Accurate Object Tracking for Live Video Analytics on Resource-Constrained Edge Devices | 49
CSTA: Spatial-Temporal Causal Adaptive Learning for Exemplar-Free Video Class-Incremental Learning | 49
Exploiting Multiperspective Driven Hierarchical Content-Aware Network for Finger Vein Verification | 49
Generative Latent Coding for Ultra-Low Bitrate Image and Video Compression | 49
Flexible Temperature Parallel Distillation for Dense Object Detection: Make Response-Based Knowledge Distillation Great Again | 49
Optical Flow-Based Spatiotemporal Sketch for Video Representation: A Novel Framework | 49
Surface-Continuous Scene Representation for Light Field Depth Estimation via Planarity Prior | 49
Hypergraph Contrastive Learning for Large-Scale Hyperspectral Image Clustering | 48
Reference-Guided Large-Scale Face Inpainting With Identity and Texture Control | 48
Enhancing Transparent Object Matting Using Predicted Definite Foreground and Background | 48
Enhancing Vision and Language Navigation With Prompt-Based Scene Knowledge | 48
Table of Contents | 48
Generative Augmentation Hashing for Few-shot Cross-Modal Retrieval | 48
SmokePose: End-to-End Smoke Keypoint Detection | 48
Neuromorphic Imaging With Super-Resolution | 48
Complementary Blind-Spot Network for Self-Supervised Real Image Denoising | 48
A Novel Approach for Effective Partially View-Aligned Clustering with Triple-Consistency | 47
Dual Prototypes-Based Personalized Federated Adversarial Cross-Modal Hashing | 47
CRDH: Compatible Reversible Data Hiding With High Capacity and Generalization | 47
Dual-Domain Feature Fusion and Multi-Level Memory-Enhanced Network for Spectral Compressive Imaging | 47
U²-Former: Nested U-Shaped Transformer for Image Restoration via Multi-View Contrastive Learning | 47
Semantic-Context Graph Network for Point-Based 3D Object Detection | 46
Generalized Intra-Camera Supervised Person Re-Identification | 46
Weakly-Supervised Temporal Action Localization by Progressive Complementary Learning | 46
Concept-Enhanced Relation Network for Video Visual Relation Inference | 46
Exploring Implicit Domain-Invariant Features for Domain Adaptive Object Detection | 46
Toward Extreme Image Compression With Latent Feature Guidance and Diffusion Prior | 46
Curiosity-Driven Class-Incremental Learning via Adaptive Sample Selection | 46
Propagating Facial Prior Knowledge for Multitask Learning in Face Super-Resolution | 45
DilatedTAD: Enhancing Adaptability to Actions of Varying Durations for Temporal Action Detection | 45
Phrase Grounding-based Style Transfer for Single-Domain Generalized Object Detection | 45
Partially View-Aligned Representation Learning via Cross-View Graph Contrastive Network | 45
UNeLF: Unconstrained Neural Light Field for Self-Supervised Angular Super-Resolution | 45
Point Cloud Completion via Self-Projected View Augmentation and Implicit Field Constraint | 45
CFB-Then-ECB Mode-Based Image Encryption for an Efficient Correction of Noisy Encrypted Images | 45
StarPose: 3D Human Pose Estimation via Spatial-Temporal Autoregressive Diffusion | 45