IEEE-ACM Transactions on Audio Speech and Language Processing

Papers
(The H4-Index of IEEE-ACM Transactions on Audio Speech and Language Processing is 36. The table below lists the papers at or above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2021-08-01 to 2025-08-01.)
| Article | Citations |
| --- | --- |
| Representation Learning With Hidden Unit Clustering for Low Resource Speech Applications | 311 |
| Decorrelation in Feedback Delay Networks | 246 |
| CET2: Modelling Topic Transitions for Coherent and Engaging Knowledge-Grounded Conversations | 209 |
| WDEA: The Structure and Semantic Fusion With Wasserstein Distance for Low-Resource Language Entity Alignment | 131 |
| Review of Methods for Automatic Speaker Verification | 126 |
| Envelope-Based Multichannel Noise Reduction for Cochlear Implant Applications | 101 |
| $\mathcal{P}$owMix: A Versatile Regularizer for Multimodal Sentiment Analysis | 99 |
| Towards Generating Diverse Audio Captions via Adversarial Training | 95 |
| Audio-Only Phonetic Segment Classification Using Embeddings Learned From Audio and Ultrasound Tongue Imaging Data | 71 |
| Multi-Channel to Multi-Channel Noise Reduction and Reverberant Speech Preservation in Time-Varying Acoustic Scenes for Binaural Reproduction | 71 |
| The Harmonic Shift Algorithm for Efficient Multi-Pitch Detection | 67 |
| Reverberant Source Separation Using NTF With Delayed Subsources and Spatial Priors | 59 |
| MO-Transformer: Extract High-Level Relationship Between Words for Neural Machine Translation | 59 |
| Enhancing Robustness of Speech Watermarking Using a Transformer-Based Framework Exploiting Acoustic Features | 54 |
| DropAttack: A Random Dropped Weight Attack Adversarial Training for Natural Language Understanding | 53 |
| Learning Phone Recognition From Unpaired Audio and Phone Sequences Based on Generative Adversarial Network | 52 |
| Inference Skipping for More Efficient Real-Time Speech Enhancement With Parallel RNNs | 52 |
| Interpretable Multimodal Capsule Fusion | 52 |
| Multi-Level Time-Frequency Bins Selection for Direction of Arrival Estimation Using a Single Acoustic Vector Sensor | 50 |
| SBSim: A Sentence-BERT Similarity-Based Evaluation Metric for Indian Language Neural Machine Translation Systems | 49 |
| Learning Discriminative Representations and Decision Boundaries for Open Intent Detection | 47 |
| Similarity Measurement of Segment-Level Speaker Embeddings in Speaker Diarization | 46 |
| Refining Synthesized Speech Using Speaker Information and Phone Masking for Data Augmentation of Speech Recognition | 46 |
| Efficient Lightweight Speaker Verification With Broadcasting CNN-Transformer and Knowledge Distillation Training of Self-Attention Maps | 44 |
| The VoxCeleb Speaker Recognition Challenge: A Retrospective | 44 |
| Attention-Based Speech Enhancement Using Human Quality Perception Modeling | 43 |
| Improvement of Accent Classification Models Through Grad-Transfer From Spectrograms and Gradient-Weighted Class Activation Mapping | 43 |
| A User-Centric Approach for Deep Residual-Echo Suppression in Double-Talk | 42 |
| Generalizing Speaker Verification for Spoof Awareness in the Embedding Space | 41 |
| Comparison of Feature Extraction Methods for Sound-Based Classification of Honey Bee Activity | 40 |
| Automatic Math Word Problem Generation With Topic-Expression Co-Attention Mechanism and Reinforcement Learning | 40 |
| AudioLM: A Language Modeling Approach to Audio Generation | 40 |
| Distinctive and Natural Speaker Anonymization via Singular Value Transformation-Assisted Matrix | 39 |
| SPEC: Summary Preference Decomposition for Low-Resource Abstractive Summarization | 39 |
| Hate Speech Detection via Dual Contrastive Learning | 37 |
| COVID-19 Detection via Fusion of Modulation Spectrum and Linear Prediction Speech Features | 37 |
| IEEE Signal Processing Society Information | 36 |
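
The H4-Index quoted above can be checked against the table. The sketch below assumes the H4-Index is the ordinary h-index (the largest h such that at least h papers have h or more citations each) restricted to the four-year window; the page itself does not spell out the definition. The function name `h_index` is illustrative, and the counts are copied from the table.

```python
def h_index(citations):
    """Return the largest h such that at least h papers have >= h citations."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Citation counts copied from the table, in the order listed.
counts = [
    311, 246, 209, 131, 126, 101, 99, 95, 71, 71,
    67, 59, 59, 54, 53, 52, 52, 52, 50, 49,
    47, 46, 46, 44, 44, 43, 43, 42, 41, 40,
    40, 40, 39, 39, 37, 37, 36,
]

print(h_index(counts))  # prints 36
```

Under that assumption the listed counts reproduce the stated value: 36 papers have at least 37 citations each, and the 37th has 36, so the index cannot reach 37.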