IEEE-ACM Transactions on Audio Speech and Language Processing

Papers
(The H4-Index of IEEE-ACM Transactions on Audio Speech and Language Processing is 36. The table below lists the papers at or above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2022-05-01 to 2026-05-01.)
Article | Citations
Representation Learning With Hidden Unit Clustering for Low Resource Speech Applications | 330
Decorrelation in Feedback Delay Networks | 246
CET2: Modelling Topic Transitions for Coherent and Engaging Knowledge-Grounded Conversations | 232
WDEA: The Structure and Semantic Fusion With Wasserstein Distance for Low-Resource Language Entity Alignment | 179
$\mathcal{P}$owMix: A Versatile Regularizer for Multimodal Sentiment Analysis | 173
Towards Generating Diverse Audio Captions via Adversarial Training | 164
MO-Transformer: Extract High-Level Relationship Between Words for Neural Machine Translation | 154
A User-Centric Approach for Deep Residual-Echo Suppression in Double-Talk | 133
Refining Synthesized Speech Using Speaker Information and Phone Masking for Data Augmentation of Speech Recognition | 128
DropAttack: A Random Dropped Weight Attack Adversarial Training for Natural Language Understanding | 111
Reverberant Source Separation Using NTF With Delayed Subsources and Spatial Priors | 93
Audio-Only Phonetic Segment Classification Using Embeddings Learned From Audio and Ultrasound Tongue Imaging Data | 86
Envelope-Based Multichannel Noise Reduction for Cochlear Implant Applications | 83
Generalizing Speaker Verification for Spoof Awareness in the Embedding Space | 81
Multi-Channel to Multi-Channel Noise Reduction and Reverberant Speech Preservation in Time-Varying Acoustic Scenes for Binaural Reproduction | 74
Improvement of Accent Classification Models Through Grad-Transfer From Spectrograms and Gradient-Weighted Class Activation Mapping | 71
Learning Discriminative Representations and Decision Boundaries for Open Intent Detection | 70
Review of Methods for Automatic Speaker Verification | 59
Enhancing Robustness of Speech Watermarking Using a Transformer-Based Framework Exploiting Acoustic Features | 58
The VoxCeleb Speaker Recognition Challenge: A Retrospective | 58
Efficient Lightweight Speaker Verification With Broadcasting CNN-Transformer and Knowledge Distillation Training of Self-Attention Maps | 57
Attention-Based Speech Enhancement Using Human Quality Perception Modeling | 55
AudioLM: A Language Modeling Approach to Audio Generation | 52
Adaptive Multi-Domain Dialogue State Tracking on Spoken Conversations | 49
Label-Correction Capsule Network for Hierarchical Text Classification | 47
COVID-19 Detection via Fusion of Modulation Spectrum and Linear Prediction Speech Features | 46
Implicit Self-Supervised Language Representation for Spoken Language Diarization | 46
IEEE Signal Processing Society Information | 42
Pronunciation Dictionary-Free Multilingual Speech Synthesis Using Learned Phonetic Representations | 41
SPEC: Summary Preference Decomposition for Low-Resource Abstractive Summarization | 41
Hate Speech Detection via Dual Contrastive Learning | 40
Emotion Prediction Oriented Method With Multiple Supervisions for Emotion-Cause Pair Extraction | 38
Source Separation of Piano Concertos Using Musically Motivated Augmentation Techniques | 37
Blind Audio Bandwidth Extension: A Diffusion-Based Zero-Shot Approach | 37
Sound Field Estimation Based on Physics-Constrained Kernel Interpolation Adapted to Environment | 37
Audio-Visual Cross-Attention Network for Robotic Speaker Tracking | 37
ReZero: Region-Customizable Sound Extraction | 36
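
The threshold quoted at the top of this page can be checked against the citation counts in the table. The sketch below assumes the H4-Index follows the standard h-index definition (the largest h such that at least h papers have h or more citations each), restricted to the four-year window; the page itself does not spell out the formula, and the function name `h_index` is only illustrative.

```python
def h_index(citations):
    """Return the largest h such that at least h papers have >= h citations.

    Assumption: this is the standard h-index definition; the page does not
    state the exact formula behind its H4-Index.
    """
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Citation counts copied from the table above, in listed order.
counts = [
    330, 246, 232, 179, 173, 164, 154, 133, 128, 111,
    93, 86, 83, 81, 74, 71, 70, 59, 58, 58,
    57, 55, 52, 49, 47, 46, 46, 42, 41, 41,
    40, 38, 37, 37, 37, 37, 36,
]

print(h_index(counts))  # prints 36
```

Papers below the threshold are not listed, but leaving them out does not change the result: each has fewer than 36 citations, so none could count toward a higher h.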