IEEE Transactions on Circuits and Systems for Video Technology

Papers
(The median citation count of IEEE Transactions on Circuits and Systems for Video Technology is 6. The table below lists the papers whose CrossRef citation counts are above that threshold [max. 250 papers]. Only papers published in the past four years, i.e., from 2022-01-01 to 2026-01-01, are included.)
Article | Citations
2022 Index IEEE Transactions on Circuits and Systems for Video Technology Vol. 32 | 404
IEEE Transactions on Circuits and Systems for Video Technology Publication Information | 379
Table of Contents | 336
IEEE Transactions on Circuits and Systems for Video Technology publication information | 318
IEEE Transactions on Circuits and Systems for Video Technology publication information | 306
USVTrack: A Benchmark for Multi-Object Tracking in Complex Water Surface Scenes | 297
Unsupervised Action Segmentation via Multi-scale Temporal-interaction Enhancement | 296
Pose-Guided Transformer for Fine-Grained Action Quality Assessment | 279
Scene Prior Constrained Self-Paced Learning for Unsupervised Satellite Video Vehicle Detection | 252
Multi-Modal Multi-Grained Embedding Learning for Generalized Zero-Shot Video Classification | 249
Dual Difficulty-Aware Adaptive Pseudo Labeling for Semi-Supervised CNV Segmentation | 243
SpiReco: Fast and Efficient Recognition of High-Speed Moving Objects With Spike Camera | 228
Deep Affine Motion Compensation Network for Inter Prediction in VVC | 225
Representation Robustness and Feature Expansion for Exemplar-Free Class-Incremental Learning | 215
Highly-Parallel Hardwired Deep Convolutional Neural Network for 1-ms Dual-Hand Tracking | 211
Draw Like an Artist: Complex Scene Generation with Diffusion Model via Composition, Painting, and Retouching | 208
DS2VP: Dynamically-Selected Spatially Visual Prompting | 195
Table of Contents | 185
IEEE Circuits and Systems Society Information | 184
Guest Editorial Introduction to the Special Issue on Label-Efficient Learning on Video Data | 183
MEF-GD: Multimodal Enhancement and Fusion Network for Garment Designer | 174
A Format Compliant Framework for HEVC Selective Encryption After Encoding | 172
Push-and-Pull: A General Training Framework With Differential Augmentor for Domain Generalized Point Cloud Classification | 172
Toward Meta-Shape-Based Multi-View 3D Point Cloud Registration: An Evaluation | 170
Multi-Stage Cross-Modality Feature Interaction for RGB-Thermal Multi-Object Tracking | 165
Frequency Generation for Real-World Image Super-Resolution | 163
Filtering-and-Alternating-Calibration: Spatiotemporal Context Alternating Fusion for Event-based Monocular Depth Estimation | 162
Cross-Level Multi-Modal Features Learning With Transformer for RGB-D Object Recognition | 161
Scalable and Robust Tensor Ring Decomposition for Large-Scale Data With Missing Data and Outliers | 158
VDTR: Video Deblurring With Transformer | 152
UDTCWT-PHFMs Domain Statistical Image Watermarking Using Vector BW-Type R Distribution | 149
SARGAN: Spatial Attention-Based Residuals for Facial Expression Manipulation | 148
DMRFlow: 4D Radar Scene Flow Estimation With Decoupled Matching and Refinement | 147
FoV Prediction-Based Adaptive Bitrate Streaming with On-Demand Transcoding for 360-Degree Videos | 145
FastAL: Fast Evaluation Module for Efficient Dynamic Deep Active Learning Using Broad Learning System | 145
RT3DHVC: A Real-Time Human Holographic Video Conferencing System With a Consumer RGB-D Camera Array | 139
Block Diagonal Graph Embedded Discriminative Regression for Image Representation | 133
Convolutional Neural Networks for Omnidirectional Image Quality Assessment: A Benchmark | 132
CRP2-VCS: Contrast-Oriented Region-Based Progressive Probabilistic Visual Cryptography Schemes | 131
Semantic-Aware Late-Stage Supervised Contrastive Learning for Fine-Grained Action Recognition | 130
Dependability Feature Learning based on Sample Generation for Unsupervised Text-to-Image Person Re-identification | 130
Stochastic Gradient Perturbation: An Implicit Regularizer for Person Re-Identification | 129
Multi-Level Feature Fusion Network for Shadow Removal Detection | 127
Uni3DA: Universal 3D Domain Adaptation for Object Recognition | 125
Learning Spatio-Temporal Sharpness Map for Video Deblurring | 124
MCCE-REC: MLLM-Driven Cross-Modal Contrastive Entropy Model for Zero-Shot Referring Expression Comprehension | 120
Crowd-Powered Photo Enhancement Featuring an Active Learning Based Local Filter | 119
A Clinically Guided Graph Convolutional Network for Assessment of Parkinsonian Pronation-Supination Movements of Hands | 117
Efficient Single-Object Tracker Based on Local-Global Feature Fusion | 116
Negative Class Guided Spatial Consistency Network for Sparsely Supervised Semantic Segmentation of Remote Sensing Images | 114
Fully Unsupervised Domain-Agnostic Image Retrieval | 113
Harmony: An Eco-Friendly Adaptive Rate Control Scheme for Video-on-Demand in Low Earth Orbit Satellite Internet | 113
Lightweight Neural Network for Enhancing Imaging Performance of Under-Display Camera | 113
Joint Learning of Image Deblurring and Depth Estimation Through Adversarial Multi-Task Network | 113
PPIFuse: Physical Priors Injected Infrared and Visible Image Fusion | 112
Few-Shot Temporal Sentence Grounding via Memory-Guided Semantic Learning | 112
EIFNet: An Explicit and Implicit Feature Fusion Network for Finger Vein Verification | 109
SMART: Semantic Matching Contrastive Learning for Partially View-Aligned Clustering | 109
TPCM-SegNet: A Text-Prompted Dual-Path Convolution-Mamba Network for Anomaly Segmentation | 109
CLIP-VIS: Adapting CLIP for Open-Vocabulary Video Instance Segmentation | 108
Multi-Modal Attribute Prompting for Vision-Language Models | 108
Spatial Attention-Guided Light Field Salient Object Detection Network With Implicit Neural Representation | 105
Relation-Aware Multi-Pass Comparison Deconfounded Network for Change Captioning | 103
Semi-Supervised Crowd Counting via Multi-Task Pseudo-Label Self-Correction Strategy | 103
Edge and Skeleton Guidance Network for Salient Object Detection in Optical Remote Sensing Images | 102
DSC3D: Deformable Sampling Constraints in Stereo 3D Object Detection for Autonomous Driving | 102
Synergistic Fusion Network of Microscopic Hyperspectral and RGB Images for Multi-Perspective Segmentation | 101
VPA: Multi-Modal Virtual Point Augmentation for 3D Object Detection | 101
Subjective and Objective Quality Assessment of Display Content Videos | 101
Viewport Prediction for Volumetric Video Streaming by Exploring Video Saliency and User Trajectory Information | 101
Single Image Haze Removal With Haze Map Optimization for Various Haze Concentrations | 100
Plausible Proxy Mining With Credibility for Unsupervised Person Re-Identification | 99
Enhancing Representation Learning With Spatial Transformation and Early Convolution for Reinforcement Learning-Based Small Object Detection | 99
Projected Generative Adversarial Network for Point Cloud Completion | 98
Deep and Low-Rank Quaternion Priors for Color Image Processing | 97
Iterative Self-Guided Image Filtering | 97
Video Understanding with Large Language Models: A Survey | 96
Exploring and Exploiting High-Order Spatial–Temporal Dynamics for Long-Term Frame Prediction | 95
Graph-Guided Unsupervised Multiview Representation Learning | 94
Active Spatial Positions Based Hierarchical Relation Inference for Group Activity Recognition | 91
Towards Video Anomaly Detection in the Real World: A Binarization Embedded Weakly-Supervised Network | 90
Instance-Incremental Scene Graph Generation From Real-World Point Clouds via Normalizing Flows | 89
Adversarial Dual-Student With Differentiable Spatial Warping for Semi-Supervised Semantic Segmentation | 89
Reversible Data Hiding Over Encrypted Images via Preprocessing-Free Matrix Secret Sharing | 89
Truncated Robust Natural Watermarking With Hungarian Optimization | 88
ASCFormer: An Adaptive Strucure-aware Cascaded Transformer for 3D Object Detection | 88
Image Super-Resolution With Self-Similarity Prior Guided Network and Sample-Discriminating Learning | 88
Exploring Explicitly Disentangled Features for Domain Generalization | 88
AirSOD: A Lightweight Network for RGB-D Salient Object Detection | 88
Learning to Capture the Query Distribution for Few-Shot Learning | 87
Learning Depth-Density Priors for Fourier-Based Unpaired Image Restoration | 87
Representing Boundary-Ambiguous Scene Online With Scale-Encoded Cascaded Grids and Radiance Field Deblurring | 86
Ct-LVI: A Framework Toward Continuous-Time Laser-Visual-Inertial Odometry and Mapping | 84
Dual-Stream Transformer With Distribution Alignment for Visible-Infrared Person Re-Identification | 83
Key Role Guided Transformer for Group Activity Recognition | 83
Deep Convolutional Primal-Dual Network for Image Deblurring | 82
Reversible Data Hiding in Encrypted Image via Secret Sharing Based on GF(p) and GF(2⁸) | 82
Equity in Unsupervised Domain Adaptation by Nuclear Norm Maximization | 82
Relative Comparison-Based Consensus Learning for Multi-View Subspace Clustering | 81
Hierarchical Dynamic Programming Module for Human Pose Refinement | 81
Robust Image Watermarking With Synchronization Using Template Enhanced-Extracted Network | 81
TiGDistill-BEV: Multi-view BEV 3D Object Detection via Target Inner-Geometry Learning Distillation | 80
Spectral–Spatial Feature Extraction With Dual Graph Autoencoder for Hyperspectral Image Clustering | 80
Pro-Tuning: Unified Prompt Tuning for Vision Tasks | 80
Local Attention Transformer-Based Full-View Finger-Vein Identification | 79
Fuzzified Contrast Enhancement for Nearly Invisible Images | 79
PhyDAA: Physiological Dataset Assessing Attention | 78
Future Feature-Based Supervised Contrastive Learning for Streaming Perception | 76
UAMD-Net: A Unified Adaptive Multimodal Neural Network for Dense Depth Completion | 76
D3C2-Net: Dual-Domain Deep Convolutional Coding Network for Compressive Sensing | 75
MMI-Det: Exploring Multi-Modal Integration for Visible and Infrared Object Detection | 75
Learning Appearance-Motion Synergy via Memory-Guided Event Prediction for Video Anomaly Detection | 75
StreetSurfGS: Scalable Urban Street Surface Reconstruction With Planar-Based Gaussian Splatting | 74
IEEE Transactions on Circuits and Systems for Video Technology publication information | 74
IEEE Circuits and Systems Society Information | 74
Reliable Entropy-Induced Anchor Learning for Incomplete Multi-View Subspace Clustering | 74
IEEE Transactions on Circuits and Systems for Video Technology publication information | 74
Cloth-Imbalanced Gait Recognition via Hallucination | 73
Inter-Scale Similarity Guided Cost Aggregation for Stereo Matching | 73
Flow Visualization for Complex Fluid Flows via a Structure-Enhanced Motion Estimator | 73
Multi-Scale Explicit Matching and Mutual Subject Teacher Learning for Generalizable Person Re-Identification | 72
Fixing Defect of Photometric Loss for Self-Supervised Monocular Depth Estimation | 72
Searching a Compact Architecture for Robust Multi-Exposure Image Fusion | 72
Compensating for the Incomplete With the Complete: An Efficient Scene Text Detector | 72
Dynamic Particle Filter Framework for Robust Object Tracking | 72
A Novel Deep Learning Framework for Automatic Recognition of Thyroid Gland and Tissues of Neck in Ultrasound Image | 72
WeaFU: Weather-Informed Image Blind Restoration via Multi-Weather Distribution Diffusion | 71
Appearance Matters, So Does Audio: Revealing the Hidden Face via Cross-Modality Transfer | 71
A Universal Framework for Improving the Robustness of Coverless Image Steganography Based on Image Restoration | 71
Touchless Finger Vein and Fingerprint Verification via Exploiting Attention-Based Cross-Domain Fusion | 70
Adaptive Mixture-of-Experts Distillation for Cross-Satellite Generalizable Incremental Remote Sensing Scene Classification | 69
Transformer-Based Multimodal Emotional Perception for Dynamic Facial Expression Recognition in the Wild | 69
DEP-Former: Multimodal Depression Recognition Based on Facial Expressions and Audio Features via Emotional Changes | 69
MSGA-Net: Progressive Feature Matching via Multi-Layer Sparse Graph Attention | 69
Optical Flow Reusing for High-Efficiency Space-Time Video Super Resolution | 69
Enhanced Spatial-Temporal Salience for Cross-View Gait Recognition | 69
Forgery-Aware Adaptive Learning With Vision Transformer for Generalized Face Forgery Detection | 69
A Label-Free and Non-Monotonic Metric for Evaluating Denoising in Event Cameras | 68
Mesh2Animation: Unsupervised Animating for Quadruped 3D Objects | 68
TAKD: Target-Aware Knowledge Distillation for Remote Sensing Scene Classification | 68
MixSSC: Forward-Backward Mixture for Vision-Based 3D Semantic Scene Completion | 68
Meta-Learning Based Domain Prior With Application to Optical-ISAR Image Translation | 68
FaceGCN: Structured Priors Inspired Graph Convolutional Networks for Face Restoration With Unknown Degradations | 67
Monocular Depth Estimation on Adverse Weathers With Curriculum Domain Distribution Alignment | 67
Texture-Aware Spherical Rotation for High Efficiency Omnidirectional Intra Video Coding | 67
Efficiently Exploiting Spatially Variant Knowledge for Video Deblurring | 67
Boosting Semi-Supervised Face Recognition With Noise Robustness | 67
FDAC: Federated Domain Adaptation via Dual Contrastive Learning | 67
HyPSAM: Hybrid Prompt-driven Segment Anything Model for RGB-Thermal Salient Object Detection | 65
Multimodal Industrial Anomaly Detection via Geometric Prior | 65
Multi-Prior Driven Network for RGB-D Salient Object Detection | 64
Table of Contents | 64
MMGT: Motion Mask Guided Two-Stage Network for Co-Speech Gesture Video Generation | 64
Task-Specific Loss for Robust Instance Segmentation With Noisy Class Labels | 64
G2LP-Net: Global to Local Progressive Video Inpainting Network | 64
Learning Scene-invariant Distribution for Generalizable Blind Image Quality Assessment | 64
Robust Matrix Completion Based on Factorization and Truncated-Quadratic Loss Function | 64
Errata to “Local-Global Temporal Difference Learning for Satellite Video Super-Resolution” | 64
Non-local Guided Neural Fields for 4D CT Reconstruction | 64
Feature Evaluation and Joint Interaction for Audio-Visual Emotion Recognition | 63
Diverse Batch Steganography Using Model-Based Selection and Double-Layered Payload Assignment | 63
ImagingNet: A New Learnable SAR Imaging Method via Hierarchical U-Shaped Network | 63
Self-Supervised Adversarial Video Summarizer With Context Latent Sequence Learning | 63
Interlayer Restoration Deep Neural Network for Scalable High Efficiency Video Coding | 63
Online Unsupervised Video Object Segmentation via Contrastive Motion Clustering | 63
Enhancing Robustness of Multi-Object Trackers With Temporal Feature Mix | 62
Efficient Non-Blind Image Deblurring With Discriminative Shrinkage Deep Networks | 62
STAF: 3D Human Mesh Recovery From Video With Spatio-Temporal Alignment Fusion | 62
Blind Image Quality Index for Authentic Distortions With Local and Global Deep Feature Aggregation | 62
Balanced Teacher for Source-Free Object Detection | 62
SMR: Spatial-Guided Model-Based Regression for 3D Hand Pose and Mesh Reconstruction | 62
DiffVein: A Unified Diffusion Network for Finger Vein Segmentation and Authentication | 62
Flow-Edge Guided Unsupervised Video Object Segmentation | 62
OraL: An Observational Learning Paradigm for Unsupervised Hyperspectral Change Detection | 62
Recent Advances in Rate Control: From Optimization to Implementation and Beyond | 61
Depth Estimation From a Single Image of Blast Furnace Burden Surface Based on Edge Defocus Tracking | 61
Holistic Prototype Attention Network for Few-Shot Video Object Segmentation | 61
Table of Contents | 61
All-Inclusive Image Enhancement for Degraded Images Exhibiting Low-Frequency Corruption | 60
Learning With Noisy Labels by Semantic and Feature Space Collaboration | 60
Conditional Dual Diffusion for Multimodal Clustering of Optical and SAR Images | 60
Dynamic Hypergraph Convolutional Network for No-Reference Point Cloud Quality Assessment | 60
CNN-Transformer Based Generative Adversarial Network for Copy-Move Source/ Target Distinguishment | 60
Laplacian Pyramid Fusion Network With Hierarchical Guidance for Infrared and Visible Image Fusion | 60
FDNet: Frequency Decomposition Network for Learned Image Compression | 60
VSOIQE: A Novel Viewport-Based Stitched 360° Omnidirectional Image Quality Evaluator | 60
One for All: A Unified Generative Framework for Image Emotion Classification | 59
Unsupervised Deep Hashing With Fine-Grained Similarity-Preserving Contrastive Learning for Image Retrieval | 59
Target-Aware Tracking With Spatial-Temporal Context Attention | 59
Surveillance Video-and-Language Understanding: From Small to Large Multimodal Models | 58
VmambaIR: Visual State Space Model for Image Restoration | 58
Semantic Disentanglement Adversarial Hashing for Cross-Modal Retrieval | 58
Low-Rank Tensor Graph Learning for Multi-View Subspace Clustering | 58
Low-Resolution Object Recognition With Cross-Resolution Relational Contrastive Distillation | 57
M3CS: Multi-Target Masked Point Modeling With Learnable Codebook and Siamese Decoders | 57
DAHP: Deep Attention-Guided Hashing With Pairwise Labels | 57
IEEE Transactions on Circuits and Systems for Video Technology publication information | 57
Table of Contents | 57
Question-Aware Global-Local Video Understanding Network for Audio-Visual Question Answering | 57
Dense Crosstalk Feature Aggregation for Classification and Localization in Object Detection | 57
DBVC: An End-to-End 3-D Deep Biomedical Video Coding Framework | 56
DilatedTAD: Enhancing Adaptability to Actions of Varying Durations for Temporal Action Detection | 56
Concept-Enhanced Relation Network for Video Visual Relation Inference | 56
UNeLF: Unconstrained Neural Light Field for Self-Supervised Angular Super-Resolution | 56
StarPose: 3D Human Pose Estimation via Spatial-Temporal Autoregressive Diffusion | 56
Contrastive Learning With Enhancing Detailed Information for Pre-Training Vision Transformer | 56
Enhancing Vision and Language Navigation With Prompt-Based Scene Knowledge | 56
Phrase Grounding-based Style Transfer for Single-Domain Generalized Object Detection | 56
Learning Physical-Spatio-Temporal Features for Video Shadow Removal | 56
Flexible Temperature Parallel Distillation for Dense Object Detection: Make Response-Based Knowledge Distillation Great Again | 55
ViMAEdit: Vision-guided and Mask-enhanced Adaptive Editing Algorithm for Prompt-based Image Editing | 55
Enhancing Transparent Object Matting Using Predicted Definite Foreground and Background | 55
Hypergraph Contrastive Learning for Large-Scale Hyperspectral Image Clustering | 55
PointOT: Interpretable Geometry-Inspired Point Cloud Generative Model via Optimal Transport | 55
MambaPTP: Exploring the Potential of Mamba for Pedestrian Trajectory Prediction | 55
Dual-Net: Dual Visual Spectral Affinity Monitoring Network for Hyperspectral Anomaly Detection | 54
Bridging Inter-task Gap of Continual Self-supervised Learning with External Data | 54
Optical Flow-Based Spatiotemporal Sketch for Video Representation: A Novel Framework | 54
Locality-Adaptive Structured Dictionary Learning for Cross-Domain Recognition | 54
HyperTrack: A Unified Network for Hyperspectral Video Object Tracking | 54
Special Issue on Segment Anything for Videos and Beyond | 53
Table of Contents | 53
Improving Zero-Shot Generalization for CLIP with Prompt Ensemble self-Distillation | 53
Exploring Relational Knowledge for Source-Free Domain Adaptation | 53
CodingHomo: Bootstrapping Deep Homography With Video Coding | 53
PCTrack: Accurate Object Tracking for Live Video Analytics on Resource-Constrained Edge Devices | 53
Diffusion-Based Depth Inpainting for Transparent and Reflective Objects | 53
Class Activation Map Calibration for Weakly Supervised Semantic Segmentation | 53
A Pixel-Level Segmentation-Synthesis Framework for Dynamic Texture Video Compression | 53
SPCL: Semantic Polymorphism and Commonality Learning for Text-based Person Retrieval | 53
MtArtGPT: A Multi-Task Art Generation System With Pre-Trained Transformer | 53
VideoPure: Diffusion-Based Adversarial Purification for Video Recognition | 52
Toward Extreme Image Compression With Latent Feature Guidance and Diffusion Prior | 52
CSTA: Spatial-Temporal Causal Adaptive Learning for Exemplar-Free Video Class-Incremental Learning | 52
Spike Camera Image Reconstruction Using Deep Spiking Neural Networks | 52
Deep Video Super-Resolution Using Hybrid Imaging System | 52
Adaptive Memorization With Group Labels for Unsupervised Person Re-Identification | 51
Dual-Constraint Coarse-to-Fine Network for Camouflaged Object Detection | 51
A Novel Approach for Effective Partially View-Aligned Clustering With Triple-Consistency | 51
A Novel Cross-Perturbation for Single Domain Generalization | 51
DVP-MVS++: Synergize Depth-Normal-Edge and Harmonized Visibility Prior for Multi-View Stereo | 51
Deep Adaptive Quadruplet Hashing With Probability Sampling for Large-Scale Image Retrieval | 51
Knowledge-Based Visual Question Generation | 51
Complementary Blind-Spot Network for Self-Supervised Real Image Denoising | 51
Bi-Directional Progressive Guidance Network for RGB-D Salient Object Detection | 50
Globally Deformable Information Selection Transformer for Underwater Image Enhancement | 50
Diffusion-Based Hypotheses Generation and Joint-Level Hypotheses Aggregation for 3D Human Pose Estimation | 49
Modality Fused Class-Proxy With Knowledge Distillation for Zero-Shot Sketch-Based Image Retrieval | 49
Point Cloud Completion via Self-Projected View Augmentation and Implicit Field Constraint | 49
Learning Informative and Discriminative Features for Facial Expression Recognition in the Wild | 49