OOIR: Observatory of International Research

Papers

(The median citation count of ACM Transactions on Software Engineering and Methodology is 3. The table below lists the papers above that threshold, based on CrossRef citation counts (maximum 250 papers). It covers papers published in the past four years, i.e., from 2022-01-01 to 2026-01-01.)
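
The selection rule described above can be sketched as follows. This is a minimal illustration and not OOIR's actual implementation: the function name `papers_above_median`, the `max_rows` parameter, and the sample data are all hypothetical, and the sketch assumes "above that threshold" means strictly greater than the median.

```python
from statistics import median


def papers_above_median(papers, max_rows=250):
    """Return (threshold, rows) for a list of (title, citations) tuples.

    Assumption: a paper qualifies only if its citation count is strictly
    greater than the median, and at most max_rows rows are kept.
    """
    threshold = median(count for _, count in papers)
    above = [(title, count) for title, count in papers if count > threshold]
    # Most-cited papers first, as in the table.
    above.sort(key=lambda row: row[1], reverse=True)
    return threshold, above[:max_rows]


# Hypothetical sample data, not taken from the table.
sample = [
    ("Paper A", 0),
    ("Paper B", 1),
    ("Paper C", 3),
    ("Paper D", 9),
    ("Paper E", 424),
]
threshold, selected = papers_above_median(sample)
for title, count in selected:
    print(f"{title}\t{count}")
```

With a cap on the number of rows, the lowest count shown can sit well above the median, because only the most-cited qualifying papers are kept.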

Article	Citations
Finding Information Leaks with Information Flow Fuzzing—RCR Report	424
Mutant Reduction Evaluation: What is There and What is Missing?	214
Automatic Identification of Game Stuttering via Gameplay Videos Analysis	154
Test Generation Strategies for Building Failure Models and Explaining Spurious Failures	103
I Depended on You and You Broke Me: An Empirical Study of Manifesting Breaking Changes in Client Packages	101
Better Supporting Human Aspects in Mobile eHealth Apps: Development and Validation of Enhanced Guidelines	101
Bounded Verification of Atomicity Violations for Interrupt-Driven Programs via Lazy Sequentialization	100
History-Driven Fuzzing for Deep Learning Libraries	95
SPENCER: Self-Adaptive Model Distillation for Efficient Code Retrieval	90
TestLoop: A Process Model Describing Human-in-the-Loop Software Test Suite Generation	90
KAPE: kNN-based Performance Testing for Deep Code Search	88
Communicating Study Design Trade-offs in Software Engineering	83
Enhancing Android Malware Detection: The Influence of ChatGPT on Decision-centric Task	78
An empirical study on vulnerability disclosure management of open source software systems	75
An Empirical Analysis of Machine Learning Model and Dataset Documentation, Supply Chain, and Licensing Challenges on Hugging Face	72
Preference-wise Testing of Android Apps via Test Amplification	69
FairGenerate: Enhancing Fairness through Synthetic Data Generation and Two-Fold Biased Labels Removal	69
Neuron Semantic-Guided Test Generation for Deep Neural Networks Fuzzing	64
Horus: Accelerating Kernel Fuzzing through Efficient Host-VM Memory Access Procedures	59
M2CVD: Enhancing Vulnerability Understanding through Multi-Model Collaboration for Code Vulnerability Detection	59
Deceiving Humans and Machines Alike: Search-based Test Input Generation for DNNs Using Variational Autoencoders	58
Assessing the Robustness of Test Selection Methods for Deep Neural Networks	57
Reusing d-DNNFs for Efficient Feature-Model Counting	57
A Survey on Failure Analysis and Fault Injection in AI Systems	57
An Empirical Study of the Non-Determinism of ChatGPT in Code Generation	56
Securing the Ethereum from Smart Ponzi Schemes: Identification Using Static Features	55
FoC: Figure Out the Cryptographic Functions in Stripped Binaries with LLMs	52
Understanding the OSS Communities of Deep Learning Frameworks: A Comparative Case Study of PyTorch and Tensor	52
A Comprehensive View on TD Prevention Practices and Reasons for Not Preventing It	49
An Empirical Study on Governance in Bitcoin’s Consensus Evolution	49
Assessing and Analyzing the Correctness of GitHub Copilot’s Code Suggestions	46
Fine-Tuning Large Language Models to Improve Accuracy and Comprehensibility of Automated Code Review	45
A Survey of Learning-based Automated Program Repair	44
FormatFuzzer: Effective Fuzzing of Binary File Formats	44
Storage State Analysis and Extraction of Ethereum Blockchain Smart Contracts	44
Single and Multi-objective Test Cases Prioritization for Self-driving Cars in Virtual Environments	44
Deep API Sequence Generation via Golden Solution Samples and API Seeds	41
Enhancing Security and Acuity of Smart Contract Vulnerability Detection based on Federated Learning and BiLSTM-Attention	41
JavaScript SBST Heuristics to Enable Effective Fuzzing of NodeJS Web APIs	40
Towards Automating Domain-Specific Data Generation for Text-to-SQL: A Comprehensive Approach	39
PVDetector: Pretrained Vulnerability Detection on Vulnerability-enriched Code Semantic Graph	39
FAVDisco: Modeling and Discovering File Access Vulnerabilities	39
Characterizing Deep Learning Package Supply Chains in PyPI: Domains, Clusters, and Disengagement	39
Systematic Literature Review on Software Security Vulnerability Information Extraction	38
Adopting Two Supervisors for Efficient Use of Large-Scale Remote Deep Neural Networks - RCR Report	38
Do Current Language Models Support Code Intelligence for R Programming Language?	38
HeMiRCA: Fine-Grained Root Cause Analysis for Microservices with Heterogeneous Data Sources	35
Help Them Understand: Testing and Improving Voice User Interfaces	34
Supporting Emotional Intelligence, Productivity and Team Goals while Handling Software Requirements Changes	33
An Accurate Identifier Renaming Prediction and Suggestion Approach	33
Estimating Uncertainty in Labeled Changes by SZZ Tools on Just-In-Time Defect Prediction	33
I Know What You Are Searching for: Code Snippet Recommendation from Stack Overflow Posts	33
Try with Simpler - An Evaluation of Improved Principal Component Analysis in Log-based Anomaly Detection	32
Toward Interpretable Graph Tensor Convolution Neural Network for Code Semantics Embedding	32
Introducing Interactions in Multi-Objective Optimization of Software Architectures	32
On-the-fly Generation-Quality Enhancement of Deep Code Models via Model Collaboration	31
Why Do GitHub Actions Workflows Fail? An Empirical Study	31
Assessing the Early Bird Heuristic (for Predicting Project Quality)	31
When Fine-Tuning LLMs Meets Data Privacy: An Empirical Study of Federated Learning in LLM-Based Program Repair	31
AutoRIC: Automated Neural Network Repairing Based on Constrained Optimization	30
A Systematic Literature Review of Multi-Label Learning in Software Engineering	30
Vulnerability Repair via Concolic Execution and Code Mutations	30
Exploring Data-Efficient Adaptation of Large Language Models for Code Generation	29
GIST: Generated Inputs Sets Transferability in Deep Learning	28
An Empirical Study on GitHub Pull Requests’ Reactions	28
Editorial: Toward the Future with Eight Issues Per Year	28
SimClone: Detecting Tabular Data Clones Using Value Similarity	28
Reinforcement Learning Informed Evolutionary Search for Autonomous Systems Testing	27
Editorial: ICSE and the Incredible Contradictions of Software Engineering	26
Revisiting the Identification of the Co-evolution of Production and Test Code	26
PatchCensor: Patch Robustness Certification for Transformers via Exhaustive Testing	26
Mapping the Trust Terrain: LLMs in Software Engineering - Insights and Perspectives	26
SimADFuzz: Simulation-Feedback Fuzz Testing for Autonomous Driving Systems	25
SCOPE: Performance Testing for Serverless Computing	25
Leveraging Reviewer Experience in Code Review Comment Generation	25
An Empirical Study of Retrieval-Augmented Code Generation: Challenges and Opportunities	25
APIRO: A Framework for Automated Security Tools API Recommendation	25
Towards On-the-Fly Code Performance Profiling	25
A Survey of Learning-based Method Name Prediction	25
Assessing and Improving an Evaluation Dataset for Detecting Semantic Code Clones via Deep Learning	25
ADSDx: Towards Automated Accident Diagnosis for High-level Autonomous Driving Systems	25
Industry–Academia Research Collaboration and Knowledge Co-creation: Patterns and Anti-patterns	24
Contemporary Software Modernization: Strategies, Driving Forces, and Research Opportunities	24
Commit Messages Generation Based on Core Changes	23
Simulator-based Explanation and Debugging of Hazard-triggering Events in DNN-based Safety-critical Systems	23
SourcererJBF: A Java Build Framework For Large-Scale Compilation	23
Learning from Very Little Data: On the Value of Landscape Analysis for Predicting Software Project Health	23
Security of Language Models for Code: A Systematic Literature Review	22
Beyond Fidelity: Explaining Vulnerability Localization of Learning-Based Detectors	22
You Don’t Have to Say Where to Edit! jLED—Joint Learning to Localize and Edit Source Code	22
Automatic Rule Checking for Microservices: Supporting Security Analysis with Explainability	22
Towards Learning Generalizable Code Embeddings Using Task-agnostic Graph Convolutional Networks	21
Efficient Multivariate Time Series Anomaly Detection through Transfer Learning for Large-Scale Software Systems	21
A Comprehensive Empirical Study of Bias Mitigation Methods for Machine Learning Classifiers	21
Test Input Prioritization for 3D Point Clouds	21
Monitoring data for Anomaly Detection in Cloud-Based Systems: A Systematic Mapping Study	21
MR-Scout: Automated Synthesis of Metamorphic Relations from Existing Test Cases	21
Exploring Fine-Grained Bug Report Categorization with Large Language Models and Prompt Engineering: An Empirical Study	21
Actor-Driven Decomposition of Microservices through Multi-level Scalability Assessment	20
Exploring the Capabilities of LLMs for Code-Change-Related Tasks	20
Characterizing Installation- and Run-time Compatibility Issues in Android Benign Apps and Malware	20
Cleaning Up Confounding: Accounting for Endogeneity Using Instrumental Variables and Two-Stage Models	20
Demo2Test: Transfer Testing of Agent in Competitive Environment with Failure Demonstrations	20
Demystifying Hidden Sensitive Operations in Android Apps	20
A Characterization Study of Merge Conflicts in Java Projects	20
Adaptive Modelling Languages: Abstract Syntax and Model Migration	19
Enhancing Task In-Progress Time Predictions through Affective and Personality Factors	19
Fold2Vec: Towards a Statement-Based Representation of Code for Code Comprehension	19
Graphuzz: Data-driven Seed Scheduling for Coverage-guided Greybox Fuzzing	19
Generation-based Differential Fuzzing for Deep Learning Libraries	19
Fairness Concerns in App Reviews: A Study on AI-Based Mobile Apps	19
Evolution-Aware Constraint Derivation Approach for Software Remodularization	19
Adopting Two Supervisors for Efficient Use of Large-Scale Remote Deep Neural Networks	19
Improving Deep Assertion Generation via Fine-Tuning Retrieval-Augmented Pre-trained Language Models	19
Battling against Protocol Fuzzing: Protecting Networked Embedded Devices from Dynamic Fuzzers	19
Developer Perspectives on Licensing and Copyright Issues Arising from Generative AI for Software Development	19
SPOLRE: Semantic Preserving Object Layout Reconstruction for Image Captioning System Testing	18
Interpreting Deep Neural Networks via Relative Activation-Deactivation Abstractions	18
All in One: Design, Verification, and Implementation of SNOW-optimal Read Atomic Transactions	18
MeDeT: Medical Device Digital Twins Creation with Few-shot Meta-learning	18
Programming Smart Playtesting	18
Bypassing Guardrails: Lessons Learned from Red Teaming ChatGPT	18
Certified Cost Bounds for Abstract Programs	18
Automating TODO-missed Methods Detection and Patching	18
PonziHunter: Hunting Ethereum Ponzi Contract via Static Analysis and Contrastive Learning on the Bytecode Level	18
Duplicate Bug Report Detection: How Far Are We?	17
Stress Testing Control Loops in Cyber-Physical Systems—RCR Report	17
Coverage-directed Differential Testing of X.509 Certificate Validation in SSL/TLS Implementations	17
A Roadmap for Integrating Sustainability into Software Engineering Education	17
Is It Hard to Generate Holistic Commit Message?	17
Autonomous Driving System Testing via Diversity-Oriented Driving Scenario Exploration	17
Efficient Management of Containers for Software Defined Vehicles	17
Variable Renaming-Based Adversarial Test Generation for Code Model: Benchmark and Enhancement	17
Time-travel Investigation: Toward Building a Scalable Attack Detection Framework on Ethereum	16
Reputation Gaming in Crowd Technical Knowledge Sharing	16
An In-depth Study of Java Deserialization Remote-Code Execution Exploits and Vulnerabilities	16
Measuring and Clustering Heterogeneous Chatbot Designs	16
Differentiable Quantum Programming with Unbounded Loops	16
Testing Causality in Scientific Modelling Software	16
OSS Effort Estimation Using Software Features Similarity and Developer Activity-Based Metrics	16
MORepair: Teaching LLMs to Repair Code via Multi-Objective Fine-Tuning	15
CITYWALK: Enhancing LLM-Based C++ Unit Test Generation via Project-Dependency Awareness and Language-Specific Knowledge	15
Inferring Input Grammars from Code with Symbolic Parsing	15
DiPri: Distance-Based Seed Prioritization for Greybox Fuzzing	15
Input Distribution Coverage: Measuring Feature Interaction Adequacy in Neural Network Testing	15
Reference-Based Retrieval-Augmented Unit Test Generation	15
Visualization Task Taxonomy to Understand the Fuzzing Internals	15
Exploring Development Methods for Reactive Synthesis Specifications	15
Studying the Impact of TensorFlow and PyTorch Bindings on Machine Learning Software Quality	15
A Comparative Study on Method Comment and Inline Comment	15
On the Impact of Lower Recall and Precision in Defect Prediction for Guiding Search-based Software Testing	15
AdaptiveLog: An Adaptive Log Analysis Framework with the Collaboration of Large and Small Language Model	14
On the Significance of Category Prediction for Code-Comment Synchronization	14
Preparation and Utilization of Mixed States for Testing Quantum Programs	14
LogUpdater: Automated Detection and Repair of Specific Defects in Logging Statements	14
Can GitHub Issues Help in App Review Classifications?	14
The Influence of Human Aspects on Requirements Engineering-related Activities: Software Practitioners’ Perspective	14
Assessing and Advancing Benchmarks for Evaluating Large Language Models in Software Engineering Tasks	14
AI for DevSecOps: A Landscape and Future Opportunities	14
An Interleaving Guided Metamorphic Testing Approach for Concurrent Programs	14
Automatic Core-Developer Identification on GitHub: A Validation Study	14
Survey of Code Search Based on Deep Learning	13
A Hypothesis Testing-based Framework for Software Cross-modal Retrieval in Heterogeneous Semantic Spaces	13
Understanding Real-Time Collaborative Programming: A Study of Visual Studio Live Share	13
Mitigating Regression Faults Induced by Feature Evolution in Deep Learning Systems	13
How Do Successful and Failed Projects Differ? A Socio-Technical Analysis	13
PanicFI: An Infrastructure for Fixing Panic Bugs in Real-World Rust Programs	13
Refactoring in Computational Notebooks	13
Let’s Discover More API Relations: A Large Language Model-Based AI Chain for Unsupervised API Relation Inference	13
Software Engineering by and for Humans in an AI Era	13
Obfuscated Clone Search in JavaScript based on Reinforcement Subsequence Learning	13
Simulating Software Evolution to Evaluate the Reliability of Early Decision-making among Design Alternatives toward Maintainability	12
Rise of Distributed Deep Learning Training in the Big Model Era: From a Software Engineering Perspective	12
Exploring JVM Garbage Collector Testing with Event-Coverage	12
Towards Practical Binary Code Similarity Detection: Vulnerability Verification via Patch Semantic Analysis	12
Sustainability of Machine Learning-Enabled Systems: The Machine Learning Practitioner’s Perspective	12
Towards Robustness of Deep Program Processing Models—Detection, Estimation, and Enhancement	12
Some Seeds Are Strong: Seeding Strategies for Search-based Test Case Selection	12
Fairness Testing of Machine Translation Systems	12
Can Coverage Criteria Guide Failure Discovery for Image Classifiers? An Empirical Study	12
The IDEA of Us: An Identity-Aware Architecture for Autonomous Systems	12
Identifying and Explaining Safety-critical Scenarios for Autonomous Vehicles via Key Features	12
A Systematic Literature Review on the Use of Deep Learning in Software Engineering Research	12
Data Complexity: A New Perspective for Analyzing the Difficulty of Defect Prediction Tasks	12
BiRD: Race Detection in Software Binaries under Relaxed Memory Models	12
What Constitutes the Deployment and Runtime Configuration System? An Empirical Study on OpenStack Projects	12
Testing RESTful APIs: A Survey	12
Large Language Models for Cyber Security: A Systematic Literature Review	12
NSFuzz: Towards Efficient and State-Aware Network Service Fuzzing	12
Fast, Fine-Grained Equivalence Checking for Neural Decompilers	12
Revealing the Unseen: AI Chain on LLMs for Predicting Implicit Dataflows to Generate Dataflow Graphs in Dynamically Typed Code	12
Addressing OSS Community Managers’ Challenges in Contributor Retention	11
Representation Learning for Stack Overflow Posts: How Far Are We?	11
Automated Abstract Transformer Synthesis for Reduced Product Domains	11
Open Problems in Fuzzing RESTful APIs: A Comparison of Tools	11
Revisiting Sentiment Analysis for Software Engineering in the Era of Large Language Models	11
Toward Better Comprehension of Breaking Changes in the NPM Ecosystem	11
Grammar Mutation for Testing Input Parsers	11
Divide-and-Conquer: Automating Code Revisions via Localization-and-Revision	11
Decision Support Model for Selecting the Optimal Blockchain Oracle Platform: An Evaluation of Key Factors	11
An Empirical Study on the Relationship between Defects and Source Code’s Unnaturalness	11
Analysis of EMF meta-model duplication in open-source repositories	11
The Havoc Paradox in Generator-Based Fuzzing—RCR Report	11
Verification Witnesses	11
Identifying Performance Issues in Cloud Service Systems Based on Relational-Temporal Features	11
Test Oracle Generation for REST APIs	11
SemMT: A Semantic-Based Testing Approach for Machine Translation Systems	10
Understanding Vulnerability Inducing Commits of the Linux Kernel	10
Model Driven Engineering, Artificial Intelligence, and DevOps for Software and Systems Engineering: A Systematic Mapping Study of Synergies and Challenges	10
A Review of Learning-based Smart Contract Vulnerability Detection: A Perspective on Code Representation	10
Microservice Security Metrics for Secure Communication, Identity Management, and Observability	10
Digital Twin-based Anomaly Detection with Curriculum Learning in Cyber-physical Systems	10
Recommending Variable Names for Extract Local Variable Refactorings	10
Less Is More: Unlocking Semi-Supervised Deep Learning for Vulnerability Detection	10
How the Quality of Maintenance Tasks is Affected by Criteria for Selecting Engineers for Collaboration	10
Large Language Model for Vulnerability Detection and Repair: Literature Review and the Road Ahead	10
Automated Identification of Toxic Code Reviews Using ToxiCR	10
Leveraging Symmetry in GR(1) Synthesis	10
DRIVE: Dockerfile Rule Mining and Violation Detection	10
Mobile Application Online Cross-Project Just-in-Time Software Defect Prediction Framework	10
CCIHunter: Enhancing Smart Contract Code-Comment Inconsistencies Detection via Two-Stage Pre-training	10
Editorial: The End of the Journey	10
Large Language Model-Aware In-Context Learning for Code Generation	10
Making Software Development More Diverse and Inclusive: Key Themes, Challenges, and Future Directions	10
VexIR2Vec: An Architecture-Neutral Embedding Framework for Binary Similarity	10
Making Sense of the Unknown: How Managers Make Cyber Security Decisions	9
Foster the use of Hackathons in Collaborative Research Projects: Methodology, Experience Report and Lesson Learned	9
Are Static Analysis Tools Still Working during the Evolution of Smart Contracts? A Comprehensive Empirical Study	9
Finding Information Leaks with Information Flow Fuzzing	9
Learning Software Bug Reports: A Systematic Literature Review	9
Sustainability in the Field of Software Engineering: A Tertiary Study	9
Software Security Analysis in 2030 and Beyond: A Research Roadmap	9
Finding Near-optimal Configurations in Colossal Spaces with Statistical Guarantees	9
An Automated Approach to Constructing STRIDE Threat Rule Model and Updating Rule Base	9
FLITSR: Improved Spectrum-Based Localization of Multiple Faults by Iterative Test Suite Reduction – RCR Report	9
Prompt-Based Code Completion via Multi-Retrieval Augmented Generation	9
Benchmarking and Categorizing the Performance of Neural Program Repair Systems for Java	9
From Triumph to Uncertainty: The Journey of Software Engineering in the AI Era	9
Enumerating Valid Non-Alpha-Equivalent Programs for Interpreter Testing	9
Learning-based Relaxation of Completeness Requirements for Data Entry Forms	9
Blindspots in Python and Java APIs Result in Vulnerable Code	9
A Comprehensive Study of Governance Issues in Decentralized Finance Applications	9
My Fuzzers Won’t Build: An Empirical Study of Fuzzing Build Failures	9
AceCoder: An Effective Prompting Technique Specialized in Code Generation	9
Identifying Affected Third-Party Java Libraries from Textual Descriptions of Vulnerabilities and Libraries	9
AcTracer: Active Testing of Large Language Model via Multi-Stage Sampling	9
Rise of the Planet of Serverless Computing: A Systematic Review	9
Influential Global and Local Contexts Guided Trace Representation for Fault Localization	9
The Good, the Bad, and the Missing: Neural Code Generation for Machine Learning Tasks	9
NeMo: A Neuron-Level Modularizing-While-Training Approach for Decomposing DNN Models	9