OOIR: Observatory of International Research

Papers

(The median citation count of Computer Speech and Language is 3. The table below lists the papers at or above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2021-07-01 to 2025-07-01.)
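
The selection rule described in the note above can be sketched roughly as follows. This is an illustrative sketch only, not OOIR's actual code: the function name `select_papers`, the record layout, and the sample records are all hypothetical. It assumes the median is taken over the papers inside the date window and that papers with exactly the median count are kept (the table includes rows with 3 citations).

```python
from datetime import date
from statistics import median

WINDOW_START = date(2021, 7, 1)  # window start, from the note above
WINDOW_END = date(2025, 7, 1)    # window end, from the note above
MAX_ROWS = 250                   # row cap, from the note above


def select_papers(records, start=WINDOW_START, end=WINDOW_END, max_rows=MAX_ROWS):
    """Return (title, citations) rows at or above the median, most cited first."""
    # Keep only records published inside the window (assumed inclusive on both ends).
    in_window = [r for r in records if start <= r[2] <= end]
    if not in_window:
        return []
    # Assumption: the median is computed over the papers inside the window.
    threshold = median(r[1] for r in in_window)
    # Assumption: papers with exactly the median count are included.
    kept = [(title, cites) for title, cites, _ in in_window if cites >= threshold]
    # list.sort is stable, so papers with equal counts keep their input order.
    kept.sort(key=lambda row: row[1], reverse=True)
    return kept[:max_rows]


# Placeholder records, not real data: (title, citations, publication date).
sample = [
    ("Paper A", 12, date(2022, 3, 1)),
    ("Paper B", 3, date(2023, 6, 15)),
    ("Paper C", 1, date(2024, 1, 10)),
    ("Paper D", 0, date(2021, 9, 30)),
    ("Paper E", 7, date(2020, 5, 5)),  # outside the window, dropped
    ("Paper F", 3, date(2024, 11, 2)),
]

selected = select_papers(sample)
for title, cites in selected:
    print(f"{title}\t{cites}")
```

With the placeholder records, the in-window counts are 12, 3, 1, 0 and 3, so the median is 3 and three rows survive the filter.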

Article	Citations
A language-agnostic model of child language acquisition	190
Stochastic Data-to-Text Generation Using Syntactic Dependency Information	122
Corpus and unsupervised benchmark: Towards Tagalog grammatical error correction	103
Automatic detection of behavioural codes in team interactions	94
Seq2Seq dynamic planning network for progressive text generation	68
Room impulse response reshaping-based expectation–maximization in an underdetermined reverberant environment	61
Speech enhancement approach for body-conducted unvoiced speech based on Taylor–Boltzmann machines trained DNN	61
KddRES: A Multi-level Knowledge-driven Dialogue Dataset for Restaurant Towards Customized Dialogue System	60
GEPC: Global embeddings with PID control	58
Identifying offensive memes in low-resource languages: A multi-modal multi-task approach using valence and arousal	55
Editorial Board	54
A method of phonemic annotation for Chinese dialects based on a deep learning model with adaptive temporal attention and a feature disentangling structure	52
Monotonic Gaussian regularization of attention for robust automatic speech recognition	50
Contextual emotion detection using ensemble deep learning	49
Misogynistic attitude detection in YouTube comments and replies: A high-quality dataset and algorithmic models	45
Complementary regional energy features for spoofed speech detection	43
Multi-branch feature aggregation based on multiple weighting for speaker verification	40
PaSCoNT - Parallel Speech Corpus of Northern-central Thai for automatic speech recognition	39
Unsupervised question-retrieval approach based on topic keywords filtering and multi-task learning	37
Verbal fluency in normal aging and cognitive decline: Results of a longitudinal study	31
Improving low-resource machine transliteration by using 3-way transfer learning	31
Editorial Board	29
Unsupervised sign language validation process based on hand-motion parameter clustering	27
Accentron: Foreign accent conversion to arbitrary non-native speakers using zero-shot learning	26
A transformer-based spelling error correction framework for Bangla and resource scarce Indic languages	26
Perceptions and reactions to conversational privacy initiated by a conversational user interface	25
Maximal activation weighted memory for aspect based sentiment analysis	25
Editorial Board	24
Augmentative and alternative speech communication (AASC) aid for people with dysarthria	23
Adversarial subsequences for unconditional text generation	22
Enhancing analysis of diadochokinetic speech using deep neural networks	22
Exploring accidental triggers of smart speakers	21
A hybrid approach to Natural Language Inference for the SICK dataset	21
Improving self-supervised learning model for audio spoofing detection with layer-conditioned embedding fusion	20
Preserving the beamforming effect for spatial cue-based pseudo-binaural dereverberation of a single source	19
Combining replay and LoRA for continual learning in natural language understanding	19
An unsupervised approach to detect review spam using duplicates of images, videos and Chinese texts	19
The use of Active Learning systems for stimulus selection and response modelling in perception experiments	18
Symbolic and Statistical Learning Approaches to Speech Summarization: A Scoping Review	18
Loanword identification based on web resources: A case study on wikipedia	18
Enhancing Arabic aspect-based sentiment analysis using deep learning models	17
English–Assamese neural machine translation using prior alignment and pre-trained language model	16
A mobile application using automatic speech analysis for classifying Alzheimer's disease and mild cognitive impairment	15
Unsupervised induction of inflectional families	15
A lightweight approach based on prompt for few-shot relation extraction	15
Conversations in the wild: Data collection, automatic generation and evaluation	15
Representation learning strategies to model pathological speech: Effect of multiple spectral resolutions	15
Effects of cross-cultural language differences on social cognition during human-agent interaction in cooperative game environments	14
A novel channel estimate for noise robust speech recognition	14
Meta adversarial learning improves low-resource speech recognition	14
Editorial Board	14
Evidence and Axial Attention Guided Document-level Relation Extraction	13
Adjustable deterministic pseudonymization of speech	13
Dialect Identification using Chroma-Spectral Shape Features with Ensemble Technique	13
Editorial Board	13
MPSA-DenseNet: A novel deep learning model for English accent classification	12
Editorial Board	12
Zero-Shot Strike: Testing the generalisation capabilities of out-of-the-box LLM models for depression detection	12
A multi-label emoji classification method using balanced pointwise mutual information-based feature selection	12
SecNLP: An NLP classification model watermarking framework based on multi-task learning	12
Named entity recognition using neural language model and CRF for Hindi language	12
A flexible BERT model enabling width- and depth-dynamic inference	11
Improved relation extraction through key phrase identification using community detection on dependency trees	11
Addressing subjectivity in paralinguistic data labeling for improved classification performance: A case study with Spanish-speaking Mexican children using data balancing and semi-supervised learning	11
Effective infant cry signal analysis and reasoning using IARO based leaky Bi-LSTM model	11
A computational analysis of transcribed speech of people living with dementia: The Anchise 2022 Corpus	11
Phase sensitive masking-based single channel speech enhancement using conditional generative adversarial network	10
Towards inclusive automatic speech recognition	10
Offensive language detection in Tamil YouTube comments by adapters and cross-domain knowledge transfer	10
Towards a unified assessment framework of speech pseudonymisation	10
FinD: Fine-grained discrepancy-based fake news detection enhanced by event abstract generation	10
Neural multi-task learning for end-to-end Arabic aspect-based sentiment analysis	10
Enhancing accuracy and privacy in speech-based depression detection through speaker disentanglement	10
Editorial Board	9
Prototypical networks relation classification model based on entity convolution	9
A tag-based methodology for the detection of user repair strategies in task-oriented conversational agents	9
Towards detecting the level of trust in the skills of a virtual assistant from the user’s speech	9
Objective and subjective evaluation of speech enhancement methods in the UDASE task of the 7th CHiME challenge	9
Conversation Initiation of Mothers, Fathers, and Toddlers in their Natural Home Environment	9
A review of speaker diarization: Recent advances with deep learning	9
Prosodic event detection in children’s read speech	9
A closer look at reinforcement learning-based automatic speech recognition	9
Improving BERT with local context comprehension for multi-turn response selection in retrieval-based dialogue systems	9
Generating identities with mixture models for speaker anonymization	9
Detection of vowel transition regions from Hindi language	9
Automated grapheme-to-phoneme conversion for Central Kurdish based on optimality theory	8
Evaluating voice-assistant commands for dementia detection	8
Adversarial attack and defense strategies for deep speaker recognition systems	8
Empirical Mode Decomposition articulation feature extraction on Parkinson’s Diadochokinesia	8
Towards lifelong human assisted speaker diarization	8
Multiple time-instances features based approach for reference-free speech quality measurement	8
Test-retest reliability of acoustic and linguistic measures of speech tasks	8
Arabic speech recognition by end-to-end, modular systems and human	8
Hate speech and offensive language detection in Dravidian languages using deep ensemble framework	8
Refining the evaluation of speech synthesis: A summary of the Blizzard Challenge 2023	8
Automatic speaker independent dysarthric speech intelligibility assessment system	7
Generative adversarial networks for speech processing: A review	7
Novel textual entailment technique for the Arabic language using genetic algorithm	7
Morse wavelet transform-based features for voice liveness detection	7
Two in One: A multi-task framework for politeness turn identification and phrase extraction in goal-oriented conversations	7
Significance of chirp MFCC as a feature in speech and audio applications	7
End-to-End Speech-to-Text Translation: A Survey	7
Investigations on speech recognition systems for low-resource dialectal Arabic–English code-switching speech	7
An intention multiple-representation model with expanded information	7
An automated quality evaluation framework of psychotherapy conversations with local quality estimates	7
Speech self-supervised representations benchmarking: A case for larger probing heads	7
Unsupervised speech representation learning for behavior modeling using triplet enhanced contextualized networks	7
Adaptive feature extraction for entity relation extraction	7
A neural network approach for speech enhancement and noise-robust bandwidth extension	7
A physical exertion inspired multi-task learning framework for detecting out-of-breath speech	7
Measuring and implementing lexical alignment: A systematic literature review	7
Classification of stuttering – The ComParE challenge and beyond	7
GTSO: Gradient tangent search optimization enabled voice transformer with speech intelligibility for aphasia	6
Channel and channel subband selection for speaker diarization	6
A cross-attention augmented model for event-triggered context-aware story generation	6
Two evaluations on Ontology-style relation annotations	6
Talking-heads attention-based knowledge representation for link prediction	6
Lightweight and irreversible speech pseudonymization based on data-driven optimization of cascaded voice modification modules	6
Discovering phonetic inventories with crosslingual automatic speech recognition	6
Hierarchical state recurrent neural network for social emotion ranking	6
SEBGM: Sentence Embedding Based on Generation Model with multi-task learning	6
FE-CFNER: Feature Enhancement-based approach for Chinese Few-shot Named Entity Recognition	6
Accurate speaker counting, diarization and separation for advanced recognition of multichannel multispeaker conversations	6
A knowledge-augmented heterogeneous graph convolutional network for aspect-level multimodal sentiment analysis	6
Multilingual non-intrusive binaural intelligibility prediction based on phone classification	6
Goal-oriented conditional variational autoencoders for proactive and knowledge-aware conversational recommender system	6
Improving named entity correctness of abstractive summarization by generative negative sampling	6
Spoofing countermeasure for fake speech detection using brute force features	6
A new speech corpus of super-elderly Japanese for acoustic modeling	6
Analysis and classification of speech sounds of children with autism spectrum disorder using acoustic features	6
Automatic screening of mild cognitive impairment and Alzheimer’s disease by means of posterior-thresholding hesitation representation	6
Direct enhancement of pre-trained speech embeddings for speech processing in noisy conditions	5
Towards better Chinese-centric neural machine translation for low-resource languages	5
UniKDD: A Unified Generative model for Knowledge-driven Dialogue	5
Scale-aware dual-branch complex convolutional recurrent network for monaural speech enhancement	5
A novel word sense disambiguation approach using WordNet knowledge graph	5
Speaking to remember: Model-based adaptive vocabulary learning using automatic speech recognition	5
Editorial Board	5
A potential relation trigger method for entity-relation quintuple extraction in text with excessive entities	5
A study of vowel nasalization using instantaneous spectra	5
Assessing language models’ task and language transfer capabilities for sentiment analysis in dialog data	5
Exploring intrinsic information content models for addressing the issues of traditional semantic measures to evaluate verb similarity	5
Rep-MCA-former: An efficient multi-scale convolution attention encoder for text-independent speaker verification	5
Optimizing pipeline task-oriented dialogue systems using post-processing networks	5
Building a text retrieval system for the Sanskrit language: Exploring indexing, stemming, and searching issues	5
C-KGE: Curriculum learning-based Knowledge Graph Embedding	5
Cross-lingual multi-speaker speech synthesis with limited bilingual training data	5
Uncertainty-aware non-autoregressive neural machine translation	5
Language-independent extractive automatic text summarization based on automatic keyword extraction	5
A semi-supervised high-quality pseudo labels algorithm based on multi-constraint optimization for speech deception detection	4
TadaStride: Using time adaptive strides in audio data for effective downsampling	4
Sequential routing framework: Fully capsule network-based speech recognition	4
A code-mixed task-oriented dialog dataset for medical domain	4
How to make embeddings suitable for PLDA	4
Editorial Board	4
Copiously Quote Classics: Improving Chinese Poetry Generation with historical allusion knowledge	4
COMPASS: A creative support system that alerts novelists to the unnoticed missing contents	4
An analysis of machine learning models for sentiment analysis of Tamil code-mixed data	4
Spectral–temporal saliency masks and modulation tensorgrams for generalizable COVID-19 detection	4
On significance of constant-Q transform for pop noise detection	4
Neural referential form selection: Generalisability and interpretability	4
Predicting children’s perceived reading proficiency with prosody modeling	4
Cross-lingual transfer learning for relation extraction using Universal Dependencies	4
Speaker anonymization by modifying fundamental frequency and x-vector singular value	4
What’s so complex about conversational speech? A comparison of HMM-based and transformer-based ASR architectures	4
An optimal approach for text feature selection	4
EMGVox-GAN: A transformative approach to EMG-based speech synthesis, enhancing clarity, and efficiency via extensive dataset utilization	4
M-Sim: Multi-level Semantic Inference Model for Chinese short answer scoring in low-resource scenarios	3
Train from scratch: Single-stage joint training of speech separation and recognition	3
Editorial Board	3
Joint emotion label space modeling for affect lexica	3
Multi-level context features extraction for named entity recognition	3
Editorial Board	3
Deep learning based multi-source localization with source splitting and its effectiveness in multi-talker speech recognition	3
New research on monaural speech segregation based on quality assessment	3
Analysis of Instantaneous Frequency Components of Speech Signals for Epoch Extraction	3
Editorial Board	3
Multi-task learning neural framework for categorizing sexism	3
Combining context-relevant features with multi-stage attention network for short text classification	3
Knowledge-grounded dialogue modelling with dialogue-state tracking, domain tracking, and entity extraction	3
LeBenchmark 2.0: A standardized, replicable and enhanced framework for self-supervised representations of French speech	3
An automatic Alzheimer’s disease classifier based on spontaneous spoken English	3
Single-channel speech enhancement using colored spectrograms	3
Joint speaker diarization and speech recognition based on region proposal networks	3
Editorial Board	3
Self-feeding training method for semi-supervised grammatical error correction	3
Modelling child comprehension: A case of suffixal passive construction in Korean	3
Enhancing Turkish Coreference Resolution: Insights from deep learning, dropped pronouns, and multilingual transfer learning	3
Demystifying large language models in second language development research	3
Dereverberation of autoregressive envelopes for far-field speech recognition	3
An experimental review of speaker diarization methods with application to two-speaker conversational telephone speech recordings	3
Overlapped Speech Detection and speaker counting using distant microphone arrays	3
RepSum: A general abstractive summarization framework with dynamic word embedding representation correction	3
Supervised speech separation combined with adaptive beamforming	3
HOTTEST: Hate and Offensive content identification in Tamil using Transformers and Enhanced STemming	3
Knowledge-enhanced meta-prompt for few-shot relation extraction	3
Spoken language interaction with robots: Recommendations for future research	3
Discriminating speech traits of Alzheimer's disease assessed through a corpus of reading task for Spanish language	3
Editorial Board	3
Deep learning-based speaker-adaptive postfiltering with limited adaptation data for embedded text-to-speech synthesis systems	3
Character expression for spoken dialogue systems with semi-supervised learning using Variational Auto-Encoder	3