EURASIP Journal on Audio Speech and Music Processing

Papers
(The median citation count of EURASIP Journal on Audio Speech and Music Processing is 1. The table below lists the papers at or above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2021-04-01 to 2025-04-01.)
Article | Citations
dEchorate: a calibrated room impulse response dataset for echo-aware signal processing | 23
Optimizing tiny colorless feedback delay networks | 23
Improving multi-talker binaural DOA estimation by combining periodicity and spatial features in convolutional neural networks | 21
UTran-DSR: a novel transformer-based model using feature enhancement for dysarthric speech recognition | 20
Neural network-based non-intrusive speech quality assessment using attention pooling function | 18
Stripe-Transformer: deep stripe feature learning for music source separation | 17
AUC optimization for deep learning-based voice activity detection | 17
Dynamically localizing multiple speakers based on the time-frequency domain | 16
Correction to: An integrated MVDR beamformer for speech enhancement using a local microphone array and external microphones | 15
Correction: N-dimensional N-microphone sound source localization | 14
Correction: Robustness of ad hoc microphone clustering using speaker embeddings: evaluation under realistic and challenging scenarios | 13
Residual feedback suppression with extended model-based postfilters | 12
Correction: DeepDet: YAMNet with BottleNeck Attention Module (BAM) for TTS synthesis detection | 12
Timestamp-aligning and keyword-biasing end-to-end ASR front-end for a KWS system | 12
An overview of machine learning and other data-based methods for spatial audio capture, processing, and reproduction | 11
Paralinguistic singing attribute recognition using supervised machine learning for describing the classical tenor solo singing voice in vocal pedagogy | 11
On the selection of the number of beamformers in beamforming-based binaural reproduction | 10
Auxiliary function-based algorithm for blind extraction of a moving speaker | 10
Data-based spatial audio processing | 9
Robustness of ad hoc microphone clustering using speaker embeddings: evaluation under realistic and challenging scenarios | 8
Learning domain-heterogeneous speaker recognition systems with personalized continual federated learning | 8
Automatic detection of attachment style in married couples through conversation analysis | 8
Efficient bandwidth extension of musical signals using a differentiable harmonic plus noise model | 7
Automated audio captioning: an overview of recent progress and new challenges | 7
Learning-based robust speaker counting and separation with the aid of spatial coherence | 7
Gated recurrent unit predictor model-based adaptive differential pulse code modulation speech decoder | 7
Heterogeneous separation consistency training for adaptation of unsupervised speech separation | 7
Improving speech recognition systems for the morphologically complex Malayalam language using subword tokens for language modeling | 7
Optimizing feature fusion for improved zero-shot adaptation in text-to-speech synthesis | 6
The power of humorous audio: exploring emotion regulation in traffic congestion through EEG-based study | 6
MIRACLE—a microphone array impulse response dataset for acoustic learning | 6
Continuous lipreading based on acoustic temporal alignments | 6
PlugSonic: a web- and mobile-based platform for dynamic and navigable binaural audio | 6
Deep neural networks for automatic speech processing: a survey from large corpora to limited data | 5
A lightweight approach to real-time speaker diarization: from audio toward audio-visual data streams | 5
Multi-source localization by using offset residual weight | 5
Voice activity detection in the presence of transient based on graph | 5
Modelling note’s pitch and duration in trained professional singers | 5
Online distributed waveform-synchronization for acoustic sensor networks with dynamic topology | 4
End-to-end training of acoustic scene classification using distributed sound-to-light conversion devices: verification through simulation experiments | 4
Comparative evaluation of interpolation methods for the directivity of musical instruments | 4
A latent rhythm complexity model for attribute-controlled drum pattern generation | 4
Masked multi-center angular margin loss for language recognition | 4
Physics-constrained adaptive kernel interpolation for region-to-region acoustic transfer function: a Bayesian approach | 4
AAM: a dataset of Artificial Audio Multitracks for diverse music information retrieval tasks | 4
Microphone utility estimation in acoustic sensor networks using single-channel signal features | 4
Generating chord progression from melody with flexible harmonic rhythm and controllable harmonic density | 4
Dual-branch attention module-based network with parameter sharing for joint sound event detection and localization | 4
A simplified and controllable model of mode coupling for addressing nonlinear phenomena in sound synthesis processes | 4
Supervised Attention Multi-Scale Temporal Convolutional Network for monaural speech enhancement | 4
Synthesis of soundfields through irregular loudspeaker arrays based on convolutional neural networks | 4
Multi-microphone simultaneous speakers detection and localization of multi-sources for separation and noise reduction | 3
SVQ-MAE: an efficient speech pre-training framework with constrained computational resources | 3
Frequency-dependent auto-pooling function for weakly supervised sound event detection | 3
A noise PSD estimation algorithm using derivative-based high-pass filter in non-stationary noise conditions | 3
Singer identification model using data augmentation and enhanced feature conversion with hybrid feature vector and machine learning | 3
Quantifying headphone listening experience in virtual sound environments using distraction | 3
Compression of room impulse responses for compact storage and fast low-latency convolution | 3
Steered Response Power for Sound Source Localization: a tutorial review | 3
Musical note onset detection based on a spectral sparsity measure | 3
Investigations on higher-order spherical harmonic input features for deep learning-based multiple speaker detection and localization | 3
A multichannel learning-based approach for sound source separation in reverberant environments | 3
A recursive expectation-maximization algorithm for speaker tracking and separation | 3
U2-VC: one-shot voice conversion using two-level nested U-structure | 3
Analysis of spatial filtering in neural spatiospectral filters and its dependence on training target characteristics | 3
Domain-weighted transfer learning and discriminative embeddings for low-resource speaker verification | 3
Can all variations within the unified mask-based beamformer framework achieve identical peak extraction performance? | 3
RPCA-DRNN technique for monaural singing voice separation | 3
A speech recognition method with enhanced transformer decoder | 3
Explicit-memory multiresolution adaptive framework for speech and music separation | 3
Automatic music signal mixing system based on one-dimensional Wave-U-Net autoencoders | 2
DeepDet: YAMNet with BottleNeck Attention Module (BAM) for TTS synthesis detection | 2
Spherical harmonic covariance and magnitude function encodings for beamformer design | 2
Efficient binaural rendering of spherical microphone array data by linear filtering | 2
Low-complexity artificial noise suppression methods for deep learning-based speech enhancement algorithms | 2
Estimation of playable piano fingering by pitch-difference fingering match model | 2
Three-stage training and orthogonality regularization for spoken language recognition | 2
Acoustic object canceller: removing a known signal from monaural recording using blind synchronization | 2
Attention mechanism combined with residual recurrent neural network for sound event detection and localization | 2
Text-to-speech system for low-resource language using cross-lingual transfer learning and data augmentation | 2
DOA-informed switching independent vector extraction and beamforming for speech enhancement in underdetermined situations | 2
Point neuron learning: a new physics-informed neural network architecture | 2
An end-to-end approach for blindly rendering a virtual sound source in an audio augmented reality environment | 2
Multi-task deep cross-attention networks for far-field speaker verification and keyword spotting | 2
Points2Sound: from mono to binaural audio using 3D point cloud scenes | 2
Sampling the user controls in neural modeling of audio devices | 2
Enhancing Speaker Recognition with CRET Model: a fusion of CONV2D, RESNET and ECAPA-TDNN | 2
Beyond the Big Five personality traits for music recommendation systems | 2
Black-box adversarial attacks through speech distortion for speech emotion recognition | 2
Time-domain adaptive attention network for single-channel speech separation | 2
Benefits of pre-trained mono- and cross-lingual speech representations for spoken language understanding of Dutch dysarthric speech | 1
Predominant audio source separation in polyphonic music | 1
Recognition of target domain Japanese speech using language model replacement | 1
A review on speech recognition approaches and challenges for Portuguese: exploring the feasibility of fine-tuning large-scale end-to-end models | 1
End-to-end speech emotion recognition using a novel context-stacking dilated convolution neural network | 1
Sound event triage: detecting sound events considering priority of classes | 1
A survey of technologies for automatic Dysarthric speech recognition | 1
Multi-rate modulation encoding via unsupervised learning for audio event detection | 1
Robust single- and multi-loudspeaker least-squares-based equalization for hearing devices | 1
Geometry calibration in wireless acoustic sensor networks utilizing DoA and distance information | 1
Multi-pitch estimation with polyphony per instrument information for Western classical and electronic music | 1
Design and analysis of binaural signal matching with arbitrary microphone arrays and listener head rotations | 1
Sound recurrence analysis for acoustic scene classification | 1
Training audio transformers for cover song identification | 1
Performance evaluation of perceptible impulsive noise detection methods based on auditory models | 1
Single-channel speech enhancement based on joint constrained dictionary learning | 1
Adversarial joint training with self-attention mechanism for robust end-to-end speech recognition | 1
Paralinguistic and spectral feature extraction for speech emotion classification using machine learning techniques | 1
Comparison of semi-supervised deep learning algorithms for audio classification | 1
A large TV dataset for speech and music activity detection | 1
Multi-scale Information Aggregation for Spoofing Detection | 1
AI-based Chinese-style music generation from video content: a study on cross-modal analysis and generation methods | 1
Robust and early howling detection based on a sparsity measure | 1
Singing to speech conversion with generative flow | 1
A framework for the acoustic simulation of passing vehicles using variable length delay lines | 1
Direction-of-arrival and power spectral density estimation using a single directional microphone and group-sparse optimization | 1
Acoustic DOA estimation using space alternating sparse Bayesian learning | 1
Convolutional neural networks for the classification of guitar effects and extraction of the parameter settings of single and multi-guitar effects from instrument mixes | 1
Significance of relative phase features for shouted and normal speech classification | 1
Review of methods for coding of speech signals | 1
Feature compensation based on independent noise estimation for robust speech recognition | 1
Improving sign-algorithm convergence rate using natural gradient for lossless audio compression | 1
Battling with the low-resource condition for snore sound recognition: introducing a meta-learning strategy | 1