Transactions of the Association for Computational Linguistics

Papers
(The median citation count of Transactions of the Association for Computational Linguistics is 3. The table below lists the papers at or above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2021-11-01 to 2025-11-01.)
Article | Citations
The Ethics of Automating Legal Actors | 420
From Robustness to Improved Generalization and Calibration in Pre-trained Language Models | 234
KEFT: Knowledge-Enhanced Fine-Tuning for Large Language Models in Domain-Specific Question Answering | 188
DARE: Diverse Visual Question Answering with Robustness Evaluation | 179
Cross-functional Analysis of Generalization in Behavioral Learning | 163
Segmentation-Free Streaming Machine Translation | 129
Transformers for Tabular Data Representation: A Survey of Models and Applications | 95
Overcoming Source Object Grounding for Semantic Image Editing | 94
State of What Art? A Call for Multi-Prompt LLM Evaluation | 90
Understanding and Detecting Hallucinations in Neural Machine Translation via Model Introspection | 88
Erasure of Unaligned Attributes from Neural Representations | 78
Revisiting Meta-evaluation for Grammatical Error Correction | 76
A Survey of Text Games for Reinforcement Learning Informed by Natural Language | 74
T²-NER: A Two-Stage Span-Based Framework for Unified Named Entity Recognition with Templates | 69
The Flores-101 Evaluation Benchmark for Low-Resource and Multilingual Machine Translation | 65
A Survey on Automated Fact-Checking | 62
Aggretriever: A Simple Approach to Aggregate Textual Representations for Robust Dense Passage Retrieval | 59
Bridging the Gap between Synthetic and Natural Questions via Sentence Decomposition for Semantic Parsing | 54
Federated Learning for Exploiting Annotators’ Disagreements in Natural Language Processing | 53
Context-Aware Machine Translation with Source Coreference Explanation | 52
mtRAG: A Multi-Turn Conversational Benchmark for Evaluating Retrieval-Augmented Generation Systems | 52
Uncertainty Estimation and Reduction of Pre-trained Models for Text Regression | 48
Retrieval-Pretrained Transformer: Long-range Language Modeling with Self-retrieval | 48
Learning English with Peppa Pig | 46
Benchmarking the Generation of Fact Checking Explanations | 43
Do Multi-Document Summarization Models Synthesize? | 43
Time-Aware Language Models as Temporal Knowledge Bases | 39
Learning More from Mixed Emotions: A Label Refinement Method for Emotion Recognition in Conversations | 38
DEAR: Disentangled Event-Agnostic Representation Learning for Early Fake News Detection | 37
Frame Representation Hypothesis: Multi-Token LLM Interpretability and Concept-Guided Text Generation | 36
How to Dissect a Muppet: The Structure of Transformer Embedding Spaces | 35
Cross-Lingual Dialogue Dataset Creation via Outline-Based Generation | 34
Few-Shot Multilingual Open-Domain QA from Five Examples | 32
Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets | 31
Compositional Evaluation on Japanese Textual Entailment and Similarity | 30
To Diverge or Not to Diverge: A Morphosyntactic Perspective on Machine Translation vs Human Translation | 29
Culturally Aware and Adapted NLP: A Taxonomy and a Survey of the State of the Art | 29
Adversarial Defense without Adversarial Defense: Enhancing Language Model Robustness via Instance-level Principal Component Removal | 28
Scientia Potentia Est—On the Role of Knowledge in Computational Argumentation | 28
Morphology Without Borders: Clause-Level Morphology | 27
Toward Robust RALMs: Revealing the Impact of Imperfect Retrieval on Retrieval-Augmented Language Models | 26
An Energy-based Model for Word-level AutoCompletion in Computer-aided Translation | 26
Break, Perturb, Build: Automatic Perturbation of Reasoning Paths Through Question Decomposition | 25
Transformer Grammars: Augmenting Transformer Language Models with Syntactic Inductive Biases at Scale | 24
Conformal Prediction for Natural Language Processing: A Survey | 24
PADA: Example-based Prompt Learning for on-the-fly Adaptation to Unseen Domains | 23
ProoFVer: Natural Logic Theorem Proving for Fact Verification | 22
Template-based Abstractive Microblog Opinion Summarization | 22
Communication Drives the Emergence of Language Universals in Neural Agents: Evidence from the Word-order/Case-marking Trade-off | 22
Questions Are All You Need to Train a Dense Passage Retriever | 21
Analyzing and Adapting Large Language Models for Few-Shot Multilingual NLU: Are We There Yet? | 21
Accurate and Efficient Fine-Tuning of Quantized Large Language Models Through Optimal Balance in Adaptation | 20
True Few-Shot Learning with Prompts—A Real-World Perspective | 20
Improving Probability-based Prompt Selection Through Unified Evaluation and Analysis | 19
Adapting to the Long Tail: A Meta-Analysis of Transfer Learning Research for Language Understanding Tasks | 19
Prompt Contrastive Transformation: An Enhanced Strategy for Efficient Prompt Transfer in Natural Language Processing | 19
InSCIt: Information-Seeking Conversations with Mixed-Initiative Interactions | 19
Navigating Cultural Chasms: Exploring and Unlocking the Cultural POV of Text-To-Image Models | 17
Retrieve What You Need: A Mutual Learning Framework for Open-domain Question Answering | 17
Are Character-level Translations Worth the Wait? Comparing ByT5 and mT5 for Machine Translation | 16
Navigating the Landscape of Hint Generation Research: From the Past to the Future | 15
Salute the Classic: Revisiting Challenges of Machine Translation in the Age of Large Language Models | 15
Canine: Pre-training an Efficient Tokenization-Free Encoder for Language Representation | 15
Sense-specific Historical Word Usage Generation | 14
OpenFact: Factuality Enhanced Open Knowledge Extraction | 14
Efficient Long-Text Understanding with Short-Text Models | 14
Interactive Machine Teaching by Labeling Rules and Instances | 14
Neuron-level Interpretation of Deep NLP Models: A Survey | 14
Structural Persistence in Language Models: Priming as a Window into Abstract Language Representations | 13
Addressing the Binning Problem in Calibration Assessment through Scalar Annotations | 13
A Confidence-based Acquisition Model for Self-supervised Active Learning and Label Correction | 13
MENLI: Robust Evaluation Metrics from Natural Language Inference | 13
Robust Pronoun Fidelity with English LLMs: Are they Reasoning, Repeating, or Just Biased? | 13
ABNIRML: Analyzing the Behavior of Neural IR Models | 12
Explainable Abuse Detection as Intent Classification and Slot Filling | 12
Pre-train, Prompt, and Recommendation: A Comprehensive Survey of Language Modeling Paradigm Adaptations in Recommender Systems | 12
Learning Fair Representations via Rate-Distortion Maximization | 11
NLP Security and Ethics, in the Wild | 11
Investigating Critical Period Effects in Language Acquisition through Neural Language Models | 11
Human Choice Prediction in Language-based Persuasion Games: Simulation-based Off-Policy Evaluation | 11
Modeling Emotion Dynamics in Song Lyrics with State Space Models | 11
How “Real” is Your Real-Time Simultaneous Speech-to-Text Translation System? | 11
TaxoPro: A Plug-In LoRA-based Cross-Domain Method for Low-Resource Taxonomy Completion | 11
Time-and-Space-Efficient Weighted Deduction | 11
Adding Chocolate to Mint: Mitigating Metric Interference in Machine Translation | 11
Is My Model Using the Right Evidence? Systematic Probes for Examining Evidence-Based Tabular Reasoning | 11
Rescue Conversations from Dead-ends: Efficient Exploration for Task-oriented Dialogue Policy Optimization | 10
Not Eliminate but Aggregate: Post-Hoc Control over Mixture-of-Experts to Address Shortcut Shifts in Natural Language Understanding | 10
TANQ: An Open Domain Dataset of Table Answered Questions | 10
Data-driven Parsing Evaluation for Child-Parent Interactions | 10
PaniniQA: Enhancing Patient Education Through Interactive Question Answering | 10
Evaluating Attribution in Dialogue Systems: The BEGIN Benchmark | 10
Patchwise Cooperative Game-based Interpretability Method for Large Vision-language Models | 10
Helpful Neighbors: Leveraging Neighbors in Geographic Feature Pronunciation | 10
Self-Rationalization in the Wild: A Large-scale Out-of-Distribution Evaluation on NLI-related tasks | 10
Large Language Models Enable Few-Shot Clustering | 10
xCOMET: Transparent Machine Translation Evaluation through Fine-grained Error Detection | 9
FeTaQA: Free-form Table Question Answering | 9
Data-to-text Generation with Variational Sequential Planning | 9
Sub-Character Tokenization for Chinese Pretrained Language Models | 9
Visual Spatial Reasoning | 9
Benchmarking Large Language Models for News Summarization | 9
End-to-end Argument Mining with Cross-corpora Multi-task Learning | 9
Assessing the Capacity of Transformer to Abstract Syntactic Representations: A Contrastive Analysis Based on Long-distance Agreement | 8
Know Your Limits: A Survey of Abstention in Large Language Models | 8
Abstractive Meeting Summarization: A Survey | 8
Direct Speech Translation for Automatic Subtitling | 8
Bridging the Gap: A Survey on Integrating (Human) Feedback for Natural Language Generation | 8
CreoleVal: Multilingual Multitask Benchmarks for Creoles | 8
Diff-Explainer: Differentiable Convex Optimization for Explainable Multi-hop Inference | 8
How Abstract Is Linguistic Generalization in Large Language Models? Experiments with Argument Structure | 8
Samanantar: The Largest Publicly Available Parallel Corpora Collection for 11 Indic Languages | 8
Evaluating Transformer Models and Human Behaviors on Chinese Character Naming | 8
Decomposing and Recomposing Event Structure | 8
QAmeleon: Multilingual QA with Only 5 Examples | 8
Can Authorship Representation Learning Capture Stylistic Features? | 7
Hallucinations in Large Multilingual Translation Models | 7
Scope Ambiguities in Large Language Models | 7
Meta-Learning the Difference: Preparing Large Language Models for Efficient Adaptation | 7
QE4PE: Word-level Quality Estimation for Human Post-Editing | 7
A Cross-Linguistic Pressure for Uniform Information Density in Word Order | 7
On the Effect of Instruction Tuning Loss on Generalization | 7
Visually Grounded Speech Models Have a Mutual Exclusivity Bias | 7
Speak, Read and Prompt: High-Fidelity Text-to-Speech with Minimal Supervision | 7
Conformalizing Machine Translation Evaluation | 6
Expectations over Unspoken Alternatives Predict Pragmatic Inferences | 6
The Parallelism Tradeoff: Limitations of Log-Precision Transformers | 6
Chinese Idiom Paraphrasing | 6
A Multi-Level Optimization Framework for End-to-End Text Augmentation | 6
♫ MuSiQue: Multihop Questions via Single-hop Question Composition | 6
Causal Inference in Natural Language Processing: Estimation, Prediction, Interpretation and Beyond | 6
The Emergence of Argument Structure in Artificial Languages | 6
A Comparative Approach for Auditing Multilingual Phonetic Transcript Archives | 5
mGPT: Few-Shot Learners Go Multilingual | 5
Visual Writing Prompts: Character-Grounded Story Generation with Curated Image Sequences | 5
Hierarchical Indexing for Retrieval-Augmented Opinion Summarization | 5
Document Summarization with Latent Queries | 5
Collective Human Opinions in Semantic Textual Similarity | 5
Robust Dialogue State Tracking with Weak Supervision and Sparse Data | 5
How Much Semantic Information is Available in Large Language Model Tokens? | 5
Comparing Humans and Large Language Models on an Experimental Protocol Inventory for Theory of Mind Evaluation (EPITOME) | 5
Improving the Domain Adaptation of Retrieval Augmented Generation (RAG) Models for Open Domain Question Answering | 5
Cultural Adaptation of Recipes | 5
Compositional Generalization in Multilingual Semantic Parsing over Wikidata | 5
STPar: A Structure-Aware Triaffine Parser for Screenplay Character Coreference Resolution | 5
Meta-Learning a Cross-lingual Manifold for Semantic Parsing | 5
KoBBQ: Korean Bias Benchmark for Question Answering | 4
Unleashing the True Potential of Sequence-to-Sequence Models for Sequence Tagging and Structure Parsing | 4
Do LLMs Exhibit Human-like Response Biases? A Case Study in Survey Design | 4
AfriSpeech-200: Pan-African Accented Speech Dataset for Clinical and General Domain ASR | 4
Less is More: Mitigate Spurious Correlations for Open-Domain Dialogue Response Generation Models by Causal Discovery | 4
Exploring Contrast Consistency of Open-Domain Question Answering Systems on Minimally Edited Questions | 4
Retrieve Fast, Rerank Smart: Cooperative and Joint Approaches for Improved Cross-Modal Retrieval | 4
Optimal Transport Posterior Alignment for Cross-lingual Semantic Parsing | 4
Lost in the Middle: How Language Models Use Long Contexts | 4
Shared Lexical Items as Triggers of Code Switching | 4
Investigating Reasons for Disagreement in Natural Language Inference | 4
Sentence Similarity Based on Contexts | 4
Saturated Transformers are Constant-Depth Threshold Circuits | 4
Ultra-fine Entity Typing with Indirect Supervision from Natural Language Inference | 4
ReCOGS: How Incidental Details of a Logical Form Overshadow an Evaluation of Semantic Interpretation | 4
Decision-Oriented Dialogue for Human-AI Collaboration | 3
Eliciting the Translation Ability of Large Language Models via Multilingual Finetuning with Translation Instructions | 3
FaithDial: A Faithful Benchmark for Information-Seeking Dialogue | 3
Heterogeneous Supervised Topic Models | 3
Data-driven Model Generalizability in Crosslinguistic Low-resource Morphological Segmentation | 3
Reasoning over Public and Private Data in Retrieval-Based Systems | 3
How Often Are Errors in Natural Language Reasoning Due to Paraphrastic Variability? | 3
Relational Memory-Augmented Language Models | 3
Can Authorship Attribution Models Distinguish Speakers in Speech Transcripts? | 3
Naturalistic Causal Probing for Morpho-Syntax | 3
FoVer: First-Order Logic Verification for Natural Language Reasoning | 3
FINCH: Prompt-guided Key-Value Cache Compression for Large Language Models | 3
Tracking Brand-Associated Polarity-Bearing Topics in User Reviews | 3
The Impact of Word Splitting on the Semantic Content of Contextualized Word Representations | 3
Hate Speech Classifiers Learn Normative Social Stereotypes | 3
Do Text Simplification Systems Preserve Meaning? A Human Evaluation via Reading Comprehension | 3
Self-supervised Topic Taxonomy Discovery in the Box Embedding Space | 3
Automatically Correcting Large Language Models: Surveying the Landscape of Diverse Automated Correction Strategies | 3
Multi-task Active Learning for Pre-trained Transformer-based Models | 3
Explicitly Representing Syntax Improves Sentence-to-Layout Prediction of Unexpected Situations | 3
Why Does Surprisal From Larger Transformer-Based Language Models Provide a Poorer Fit to Human Reading Times? | 3