OOIR: Observatory of International Research

Papers

(The median citation count of IEEE Transactions on Software Engineering is 6. The table below lists the papers above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2022-01-01 to 2026-01-01.)
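
The selection rule described above can be sketched in Python. This is an illustration only, not OOIR's actual implementation: the `Paper` record and the `select_papers` function are hypothetical names, and the sketch assumes that the median is computed over the papers inside the date window, that the window excludes its end date, and that "above that threshold" means strictly greater than the median.

```python
from dataclasses import dataclass
from datetime import date
from statistics import median


@dataclass(frozen=True)
class Paper:
    """Hypothetical record for one table row plus its publication date."""
    title: str
    citations: int
    published: date


def select_papers(papers, start, end, limit=250):
    """Return papers published in [start, end) that are cited more often
    than the median of that window, most cited first, capped at `limit`."""
    # Assumption: the end date itself is excluded from the window.
    in_window = [p for p in papers if start <= p.published < end]
    if not in_window:
        return []
    # Assumption: the threshold is the median over the windowed papers.
    threshold = median(p.citations for p in in_window)
    # Assumption: "above the threshold" is a strict comparison.
    above = [p for p in in_window if p.citations > threshold]
    above.sort(key=lambda p: p.citations, reverse=True)
    return above[:limit]
```

Calling `select_papers(papers, date(2022, 1, 1), date(2026, 1, 1))` on a list of such records would return at most 250 rows, ordered by citation count as in the table below.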

Article	Citations
50 Years of Transactions on Software Engineering	505
Confirmation Bias and Time Pressure: A Family of Experiments in Software Testing	485
To Do or Not to Do: Semantics and Patterns for Do Activities in UML PSSM State Machines	262
Combining Genetic Programming and Model Checking to Generate Environment Assumptions	194
Towards Scalable Model Checking of Reflective Systems via Labeled Transition Systems	159
Efficiently Testing Distributed Systems via Abstract State Space Prioritization	146
Multi-Granularity Detector for Vulnerability Fixes	127
Can We Trust the Phone Vendors? Comprehensive Security Measurements on the Android Firmware Ecosystem	117
Just-in-Time Prediction of Software Architectural Changes Through Commit-Level Analyses	114
The Why, When, What, and How About Predictive Continuous Integration: A Simulation-Based Investigation	110
Enhancing Project-Specific Code Completion by Inferring Internal API Information	109
Automatic Fairness Testing of Neural Classifiers Through Adversarial Sampling	108
Enhancing Protocol Fuzzing via Diverse Seed Corpus Generation	100
Computation Tree Logic Guided Program Repair	100
Question Selection for Multimodal Code Search Synthesis Using Probabilistic Version Spaces	99
What Leads to a Confirmatory or Disconfirmatory Behavior of Software Testers?	94
Shield Broken: Black-Box Adversarial Attacks on LLM-Based Vulnerability Detectors	92
Theoretical and Empirical Analyses of the Effectiveness of Metamorphic Relation Composition	91
Influence of the 1990 IEEE TSE Paper “Automated Software Test Data Generation” on Software Engineering	83
Mission Specification Patterns for Mobile Robots: Providing Support for Quantitative Properties	83
Are Your Dependencies Code Reviewed?: Measuring Code Review Coverage in Dependency Updates	78
Advanced Smart Contract Vulnerability Detection via LLM-Powered Multi-Agent Systems	77
A Retrospective on Whole Test Suite Generation: On the Role of SBST in the Age of LLMs	73
Recommending API Function Calls and Code Snippets to Support Software Development	72
Prevent: An Unsupervised Approach to Predict Software Failures in Production	70
Enhancing Mobile App Bug Reporting via Real-Time Understanding of Reproduction Steps	69
Answering Uncertain, Under-Specified API Queries Assisted by Knowledge-Aware Human-AI Dialogue	68
A Declarative Metamorphic Testing Framework for Autonomous Driving	67
Socio-Technical Grounded Theory for Software Engineering	66
DSSDPP: Data Selection and Sampling Based Domain Programming Predictor for Cross-Project Defect Prediction	65
Do as You Say: Consistency Detection of Data Practice in Program Code and Privacy Policy in Mini-App	65
2023 Reviewers List	64
Mask–Mediator–Wrapper Architecture as a Data Mesh Driver	64
Esale: Enhancing Code-Summary Alignment Learning for Source Code Summarization	64
Mole: Efficient Crash Reproduction in Android Applications With Enforcing Necessary UI Events	63
Detecting Malicious Packages in PyPI and npm by Clustering Installation Scripts	62
A Theory of Pending Schemas in Combinatorial Testing	62
T-Evos: A Large-Scale Longitudinal Study on CI Test Execution and Failure	62
PerfJIT: Test-Level Just-in-Time Prediction for Performance Regression Introducing Commits	59
Trace Diagnostics for Signal-Based Temporal Properties	59
Measuring the Fidelity of a Physical and a Digital Twin Using Trace Alignments	58
The Impact of Surface Features on Choice of (in)Secure Answers by Stackoverflow Readers	57
GenMorph: Automatically Generating Metamorphic Relations via Genetic Programming	56
A Wizard of Oz Study Simulating API Usage Dialogues With a Virtual Assistant	56
An Empirical Study of Software Refactorings in Real-World Open-Source Java Projects	55
Robust Test Selection for Deep Neural Networks	53
Efficient State Identification for Finite State Machine-Based Testing	53
Mitigating False Positive Static Analysis Warnings: Progress, Challenges, and Opportunities	52
Automated Code Editing With Search-Generate-Modify	52
δ-SCALPEL: Docker Image Slimming Based on Source Code Static Analysis	52
A Systematic Review of IoT Systems Testing: Objectives, Approaches, Tools, and Challenges	51
Multi-Objective Software Defect Prediction via Multi-Source Uncertain Information Fusion and Multi-Task Multi-View Learning	51
Systematic Evaluation and Usability Analysis of Formal Methods Tools for Railway Signaling System Design	50
Detecting Software Security Vulnerabilities Via Requirements Dependency Analysis	49
Multimodal Fusion for Android Malware Detection Based on Large Pre-Trained Models	49
An Empirical Study of Refactoring Rhythms and Tactics in the Software Development Process	48
MASTER: Multi-Source Transfer Weighted Ensemble Learning for Multiple Sources Cross-Project Defect Prediction	47
Towards a Cognitive Model of Dynamic Debugging: Does Identifier Construction Matter?	47
Mutation Testing in Practice: Insights From Open-Source Software Developers	47
Neural Library Recommendation by Embedding Project-Library Knowledge Graph	46
Decision Support for Selecting Blockchain-Based Application Design Patterns With Layered Taxonomy and Quality Attributes	46
On the Understandability of MLOps System Architectures	45
Evolutionary generation of test suites for multi-path coverage of MPI programs with non-determinism	45
AC2Next: A novel model that can predict the next animation API by fusing the animation API context and the UI animation task	45
Program Synthesis for Cyber-Resilience	44
LLMorpheus: Mutation Testing Using Large Language Models	42
An Empirical Study of C++ Vulnerabilities in Crowd-Sourced Code Examples	42
Leveraging Large Language Model for Automatic Patch Correctness Assessment	42
Annotative Software Product Line Analysis Using Variability-Aware Datalog	41
MBL-CPDP: A Multi-Objective Bilevel Method for Cross-Project Defect Prediction	40
Discovering Reusable Functional Features in Legacy Object-Oriented Systems	40
Legion: Massively Composing Rankers for Improved Bug Localization at Adobe	39
Human-in-the-Loop Automatic Program Repair	39
A Faceted Taxonomy of Requirements Changes in Agile Contexts	39
Generalized Coverage Criteria for Combinatorial Sequence Testing	39
Context-Aware Personalized Crowdtesting Task Recommendation	39
An Experience Report on Producing Verifiable Builds for Large-Scale Commercial Systems	38
TransformCode: A Contrastive Learning Framework for Code Embedding via Subtree Transformation	38
Evaluating and Improving GPT-Based Expansion of Abbreviations	37
API2Vec++: Boosting API Sequence Representation for Malware Detection and Classification	37
An Empirical Study of Parameter-Efficient Fine-Tuning in Code Change Learning and Beyond	37
Formal Equivalence Checking for Mobile Malware Detection and Family Classification	37
Pull Request Decisions Explained: An Empirical Overview	37
Can Clean New Code Reduce Technical Debt Density?	37
How Should Software Engineering Secondary Studies Include Grey Material?	36
Triple Peak Day: Work Rhythms of Software Developers in Hybrid Work	36
CODIT: Code Editing With Tree-Based Neural Models	36
EpiTESTER: Testing Autonomous Vehicles With Epigenetic Algorithm and Attention Mechanism	36
How Developers Choose Names	36
Evaluating and Improving Unified Debugging	35
Pathidea: Improving Information Retrieval-Based Bug Localization by Re-Constructing Execution Paths Using Logs	35
Weighted Community Division for Automated Software Architecture Refactoring	35
Specializing Neural Networks for Cryptographic Code Completion Applications	34
Microservice Extraction Based on a Comprehensive Evaluation of Logical Independence and Performance	34
Experimental Evaluation of Test-Driven Development With Interns Working on a Real Industrial Project	33
Detecting Continuous Integration Skip Commits Using Multi-Objective Evolutionary Search	33
Revisiting Test Impact Analysis in Continuous Testing From the Perspective of Code Dependencies	33
CloudRaid: Detecting Distributed Concurrency Bugs via Log Mining and Enhancement	33
Just-In-Time Obsolete Comment Detection and Update	33
Quantitative Verification for Monitoring Event-Streaming Systems	33
Studying Ad Library Integration Strategies of Top Free-to-Download Apps	33
A Study About the Knowledge and Use of Requirements Engineering Standards in Industry	33
Retrieval-Augmented Fine-Tuning for Improving Retrieve-and-Edit Based Assertion Generation	33
Watch Out for Extrinsic Bugs! A Case Study of Their Impact in Just-In-Time Bug Prediction Models on the OpenStack Project	32
Practitioners’ Expectations on Log Anomaly Detection	32
DiffGAN: A Test Generation Approach for Differential Testing of Deep Neural Networks for Image Analysis	32
SigRec: Automatic Recovery of Function Signatures in Smart Contracts	32
Nighthawk: Fully Automated Localizing UI Display Issues via Visual Understanding	32
From Tea Leaves to System Maps: A Survey and Framework on Context-Aware Machine Learning Monitoring	32
Automated Commit Message Generation With Large Language Models: An Empirical Study and Beyond	32
How Do Developers Structure Unit Test Cases? An Empirical Analysis of the AAA Pattern in Open Source Projects	31
What Drives and Sustains Self-Assignment in Agile Teams	31
Evaluation of Static Vulnerability Detection Tools With Java Cryptographic API Benchmarks	30
Automated Refactoring of Non-Idiomatic Python Code With Pythonic Idioms	30
Data Quality Matters: A Case Study on Data Label Correctness for Security Bug Report Prediction	29
Test Flakiness Across Programming Languages	29
Increasing the Confidence of Deep Neural Networks by Coverage Analysis	29
Exploring and Analyzing Software Architecture Refactoring in Practice	29
“Estimating Software Project Effort Using Analogies”: Reflections After 28 Years	29
Continuously Managing NFRs: Opportunities and Challenges in Practice	28
From Executable Specifications to Hard-to-Specify Requirements: Challenges in Describing Reactive System Behavior	28
Empirical Validation of Automated Vulnerability Curation and Characterization	28
Mind the Gap! A Study on the Transferability of Virtual Versus Physical-World Testing of Autonomous Driving Systems	28
The Analysis of Safety Critical Software Systems	28
DAppSCAN: Building Large-Scale Datasets for Smart Contract Weaknesses in DApp Projects	28
Effect of Requirements Analyst Experience on Elicitation Effectiveness: A Family of Quasi-Experiments	28
On the Validity of Pre-Trained Transformers for Natural Language Processing in the Software Engineering Domain	27
Predictive Comment Updating With Heuristics and AST-Path-Based Neural Learning: A Two-Phase Approach	27
An Empirical Study of Model-Agnostic Techniques for Defect Prediction Models	27
Assessing Evaluation Metrics for Neural Test Oracle Generation	27
Why My App Crashes? Understanding and Benchmarking Framework-Specific Exceptions of Android Apps	27
Cross-Language Taint Analysis: Generating Caller-Sensitive Native Code Specification for Java	27
STRE: An Automated Approach to Suggesting App Developers When to Stop Reading Reviews	27
Provably Valid and Diverse Mutations of Real-World Media Data for DNN Testing	27
ASTRAEA: Grammar-based Fairness Testing	27
Improving Cross-Language Code Clone Detection via Code Representation Learning and Graph Neural Networks	27
A Systematic Study on Real-World Android App Bundles	27
Optimization of Software Release Planning Considering Architectural Dependencies, Cost, and Value	26
Understanding the Robustness of Transformer-Based Code Intelligence via Code Transformation: Challenges and Opportunities	26
Towards Exploring Developers’ Struggles in Developing Upgradeable Smart Contracts	26
Reaching Software Quality for Bioinformatics Applications: How Far Are We?	26
Are You Still Working on This? An Empirical Study on Pull Request Abandonment	26
Efficient Summary Reuse for Software Regression Verification	26
Deconstructing the Nature of Collaboration in Organizations Open Source Software Development: The Impact of Developer and Task Characteristics	26
FlakyFix: Using Large Language Models for Predicting Flaky Test Fix Categories and Test Code Repair	26
CRPWarner: Warning the Risk of Contract-Related Rug Pull in DeFi Smart Contracts	26
The Impact of Prompt Programming on Function-Level Code Generation	25
NumScout: Unveiling Numerical Defects in Smart Contracts Using LLM-Pruning Symbolic Execution	25
A Grounded Theory of Cross-Community SECOs: Feedback Diversity Versus Synchronization	25
The Effectiveness of Supervised Machine Learning Algorithms in Predicting Software Refactoring	25
Predicting Defective Lines Using a Model-Agnostic Technique	25
Line-Level Defect Prediction by Capturing Code Contexts With Graph Convolutional Networks	25
Automated Infrastructure as Code Program Testing	24
Sentinel: A Hyper-Heuristic for the Generation of Mutant Reduction Strategies	24
ArchHypo: Managing Software Architecture Uncertainty Using Hypotheses Engineering	24
iTCRL: Causal-Intervention-Based Trace Contrastive Representation Learning for Microservice Systems	24
Diversity-Oriented Testing for Competitive Game Agent via Constraint-Guided Adversarial Agent Training	24
Hashing Fuzzing: Introducing Input Diversity to Improve Crash Detection	24
How Templated Requirements Specifications Inhibit Creativity in Software Engineering	23
Let’s Talk With Developers, Not About Developers: A Review of Automatic Program Repair Research	23
Parameterized Verification of Leader/Follower Systems via Arithmetic Constraints	23
Beyond Literal Meaning: Uncover and Explain Implicit Knowledge in Code Through Wikipedia-Based Concept Linking	23
Causes and Canonicalization of Unreproducible Builds in Java	23
Mithra: Anomaly Detection as an Oracle for Cyberphysical Systems	23
A Variability Fault Localization Approach for Software Product Lines	23
Explaining Static Analysis With Rule Graphs	23
Domain-Driven Design for Microservices: An Evidence-Based Investigation	23
Automated Use-After-Free Detection and Exploit Mitigation: How Far Have We Gone?	23
Stakeholder Preference Extraction From Scenarios	23
Beyond the Sum of Parts: Leveraging Entanglement for Bug Inducing Commit Localization	23
The Power of Small LLMs: A Multi-Agent for Code Generation via Dynamic Precaution Tuning	22
Unearthing Gas-Wasting Code Smells in Smart Contracts With Large Language Models	22
SmartOracle: Generating Smart Contract Oracle via Fine-Grained Invariant Detection	22
FCGHUNTER: Towards Evaluating Robustness of Graph-Based Android Malware Detection	22
Range Specification Bug Detection in Flight Control System Through Fuzzing	22
Syntactic Versus Semantic Similarity of Artificial and Real Faults in Mutation Testing Studies	22
Forecasting the Principal of Code Technical Debt in JavaScript Applications	22
Practical Mutation Testing at Scale: A view from Google	21
Misactivation-Aware Stealthy Backdoor Attacks on Neural Code Understanding Models	21
RefactoringMiner 2.0	21
A Survey on the Use of Computer Vision to Improve Software Engineering Tasks	21
A Comparison of Natural Language Understanding Platforms for Chatbots in Software Engineering	21
Do Pretrained Language Models Indeed Understand Software Engineering Tasks?	21
Verification of Fuzzy Decision Trees	20
Boosting Generalizable Fairness With Mahalanobis Distances Guided Boltzmann Exploratory Testing	20
Learning to Predict User-Defined Types	20
Does Treatment Adherence Impact Experiment Results in TDD?	20
PopArt: Ranked Testing Efficiency	20
Restore: Retrospective Fault Localization Enhancing Automated Program Repair	20
Emerging App Issue Identification via Online Joint Sentiment-Topic Tracing	20
Translating to a Low-Resource Language with Compiler Feedback: A Case Study on Cangjie	20
Retrospective on: Constraint-Based Automatic Test Data Generation	20
Towards More Precise Coincidental Correctness Detection With Deep Semantic Learning	20
Runtime Evolution of Bitcoin's Consensus Rules	20
The “Question Neighbourhood” Approach for Systematic Evaluation of Code-Generating LLMs	19
Do Chase Your Tail! Missing Key Aspects Augmentation in Textual Vulnerability Descriptions of Long-Tail Software Through Feature Inference	19
Active Code Learning: Benchmarking Sample-Efficient Training of Code Models	19
Accelerating Finite State Machine-Based Testing Using Reinforcement Learning	19
Reuse of Similarly Behaving Software Through Polymorphism-Inspired Variability Mechanisms	19
TkT: Automatic Inference of Timed and Extended Pushdown Automata	19
The Human Side of Software Engineering Teams: An Investigation of Contemporary Challenges	18
Concretization of Abstract Traffic Scene Specifications Using Metaheuristic Search	18
SCAnoGenerator: Automatic Anomaly Injection for Ethereum Smart Contracts	18
Clopper-Pearson Algorithms for Efficient Statistical Model Checking Estimation	18
Dealing With Data Challenges When Delivering Data-Intensive Software Solutions	18
Defining Smart Contract Defects on Ethereum	18
Multitask-Based Evaluation of Open-Source LLM on Software Vulnerability	18
Evaluating SZZ Implementations: An Empirical Study on the Linux Kernel	18
PATEN: Identifying Unpatched Third-Party APIs via Fine-Grained Patch-Enhanced AST-Level Signature	18
Stealthy Backdoor Attack for Code Models	18
A Little Help Goes a Long Way: Tutoring LLMs in Solving Competitive Programming Through Hints	18
Studying the Influence and Distribution of the Human Effort in a Hybrid Fitness Function for Search-Based Model-Driven Engineering	17
An Assessment of Rules of Thumb for Software Phase Management, and the Relationship Between Phase Effort and Schedule Success	17
On the Workflows and Smells of Leaderboard Operations (LBOps): An Exploratory Study of Foundation Model Leaderboards	17
Isolating Compiler Faults Through Differentiated Compilation Configurations	17
Engineering Within Boundaries When Software Has None	17
Software Testing With Large Language Models: Survey, Landscape, and Vision	17
Comparing Block-Based Programming Models for Two-Armed Robots	17
Examiner-Pro: Testing Arm Emulators Across Different Privileges	17
Static Profiling of Alloy Models	17
Generating Structurally Realistic Models With Deep Autoregressive Networks	17
A Retrospective of Proving the Correctness of Multiprocess Programs	17
Utilizing Automatic Query Reformulations as Genetic Operations to Improve Feature Location in Software Models	17
Finding Trends in Software Research	17
A Search-Based Testing Approach for Deep Reinforcement Learning Agents	17
Malo in the Code Jungle: Explainable Fault Localization for Decentralized Applications	16
A Theory of Value for Value-Based Feature Selection in Software Engineering	16
AddressWatcher: Sanitizer-Based Localization of Memory Leak Fixes	16
RNN-Test: Towards Adversarial Testing for Recurrent Neural Network Systems	16
Factors Affecting On-Time Delivery in Large-Scale Agile Software Development	16
Darcy: Automatic Architectural Inconsistency Resolution in Java	16
Active Learning of Discriminative Subgraph Patterns for API Misuse Detection	16
How Toxic Can You Get? Search-Based Toxicity Testing for Large Language Models	16
A Procedure to Continuously Evaluate Predictive Performance of Just-In-Time Software Defect Prediction Models During Software Development	16
FlexFL: Flexible and Effective Fault Localization With Open-Source Large Language Models	16
DaNuoYi: Evolutionary Multitask Injection Testing on Web Application Firewalls	16
Heuristic and Neural Network Based Prediction of Project-Specific API Member Access	16
How Software Developers Mitigate Their Errors When Developing Code	16
A Framework for Emotion-Oriented Requirements Change Handling in Agile Software Engineering	16
ATOM: Commit Message Generation Based on Abstract Syntax Tree and Hybrid Ranking	15
Let's Go to the Whiteboard (Again): Perceptions From Software Architects on Whiteboard Architecture Meetings	15
VERCATION: Precise Vulnerable Open-source Software Version Identification based on Static Analysis and LLM	15
Obstacle Analysis in Requirements Engineering: Retrospective and Emerging Challenges	15
Retrospective: Data Mining Static Code Attributes to Learn Defect Predictors	15
Automatic Generation of Acceptance Test Cases From Use Case Specifications: An NLP-Based Approach	15
Safety and Performance, Why Not Both? Bi-Objective Optimized Model Compression Against Heterogeneous Attacks Toward AI Software Deployment	15
A Systematical Study on Application Performance Management Libraries for Apps	15
Fast and Precise Static Null Exception Analysis With Synergistic Preprocessing	15
A Retrospective on Mining Version Histories to Guide Software Changes	15
Robotic Visual GUI Testing for Truly Non-Intrusive Test Automation of Touch Screen Applications	15