Empirical Software Engineering

Papers
(The TQCC of Empirical Software Engineering is 8. The table below lists the papers at or above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2020-05-01 to 2024-05-01.)
Article | Citations
Pandemic programming | 146
Testing machine learning based systems: a systematic mapping | 122
Sampling in software engineering research: a critical review and guidelines | 78
Predictors of well-being and productivity among software professionals during the COVID-19 pandemic – a longitudinal study | 77
A practical guide on conducting eye tracking studies in software engineering | 50
Detection, assessment and mitigation of vulnerabilities in open source dependencies | 44
The impact of automated feature selection techniques on the interpretation of defect models | 43
The practitioners’ point of view on the concept of technical debt and its causes and consequences: a design for a global family of industrial surveys and its first results from Brazil | 43
The who, what, how of software engineering research: a socio-technical framework | 42
A comprehensive study of bloated dependencies in the Maven ecosystem | 41
A privacy and security analysis of early-deployed COVID-19 contact tracing Android apps | 37
Automated patch assessment for program repair at scale | 37
Test case selection and prioritization using machine learning: a systematic literature review | 36
Perceived diversity in software engineering: a systematic literature review | 35
Code cloning in smart contracts: a case study on verified contracts from the Ethereum blockchain platform | 35
Systematic mapping study on domain-specific language development tools | 33
Understanding and improving the quality and reproducibility of Jupyter notebooks | 33
Topic modeling in software engineering research | 33
Promises and challenges of microservices: an exploratory study | 33
AI lifecycle models need to be revised | 32
Enjoy your observability: an industrial survey of microservice tracing and analysis | 32
An empirical investigation on the relationship between design and architecture smells | 31
Formal methods in dependable systems engineering: a survey of professionals from Europe and North America | 28
Lags in the release, adoption, and propagation of npm vulnerability fixes | 28
On the feasibility of automated prediction of bug and non-bug issues | 26
On the time-based conclusion stability of cross-project defect prediction models | 26
On the need of preserving order of data when validating within-project defect classifiers | 25
An exploratory study on confusion in code reviews | 25
An empirical investigation of performance overhead in cross-platform mobile development frameworks | 25
A teamwork effectiveness model for agile software development | 25
Predicting the objective and priority of issue reports in software repositories | 24
Out of sight, out of mind? How vulnerable dependencies affect open-source projects | 24
Problems with SZZ and features: An empirical study of the state of practice of defect prediction data collection | 23
Evaluating the agreement among technical debt measurement tools: building an empirical benchmark of technical debt liabilities | 23
Software development with feature toggles: practices used by practitioners | 22
How do i refactor this? An empirical study on refactoring trends and topics in Stack Overflow | 22
Wait for it: identifying “On-Hold” self-admitted technical debt | 22
The secret life of test smells - an empirical study on test smell evolution and maintenance | 22
Security analysis of permission re-delegation vulnerabilities in Android apps | 21
Why are many businesses instilling a DevOps culture into their organization? | 21
The ‘as code’ activities: development anti-patterns for infrastructure as code | 20
Software provenance tracking at the scale of public source code | 20
On the assessment of software defect prediction models via ROC curves | 20
Self-admitted technical debt practices: a comparison between industry and open-source | 20
Automated demarcation of requirements in textual specifications: a machine learning-based approach | 19
Empirical evaluation of tools for hairy requirements engineering tasks | 19
World of code: enabling a research workflow for mining and analyzing the universe of open source VCS data | 19
An empirical study of IoT topics in IoT developer discussions on Stack Overflow | 19
Analysing app reviews for software engineering: a systematic literature review | 19
A longitudinal study of static analysis warning evolution and the effects of PMD on software quality in Apache open source projects | 19
StateAFL: Greybox fuzzing for stateful network servers | 18
Industry practices and challenges for the evolvability assurance of microservices | 18
Locating faults with program slicing: an empirical analysis | 18
Topic recommendation for software repositories using multi-label classification algorithms | 18
Can Offline Testing of Deep Neural Networks Replace Their Online Testing? | 18
On the impact of security vulnerabilities in the npm and RubyGems dependency networks | 18
Game-based Sprint retrospectives: multiple action research | 17
PHANTOM: Curating GitHub for engineered software projects using time-series clustering | 17
Finding the sweet spot for organizational control and team autonomy in large-scale agile software development | 16
Beyond the virus: a first look at coronavirus-themed Android malware | 16
Development of recommendation systems for software engineering: the CROSSMINER experience | 16
How agile teams make self-assignment work: a grounded theory study | 16
Spearheading agile: the role of the scrum master in agile projects | 16
Learning to recognize actionable static code warnings (is intrinsically easy) | 15
A large scale analysis of mHealth app user reviews | 15
The significance of bug report elements | 15
Strategies to manage quality requirements in agile software development: a multiple case study | 15
Predicting unstable software benchmarks using static source code features | 15
A study of the performance of general compressors on log files | 15
An empirical study on changing leadership in agile teams | 14
Publish or perish, but do not forget your software artifacts | 14
Maintenance-related concerns for post-deployed Ethereum smart contract development: issues, techniques, and future challenges | 14
On the privacy of mental health apps | 14
On systematically building a controlled natural language for functional requirements | 14
A family of experiments on test-driven development | 14
Test smells 20 years later: detectability, validity, and reliability | 14
A comprehensive study on software aging across android versions and vendors | 14
Feature requests-based recommendation of software refactorings | 14
Understanding shared links and their intentions to meet information needs in modern code review: | 13
Software engineering whispers: The effect of textual vs. graphical software design descriptions on software design communication | 13
TaintBench: Automatic real-world malware benchmarking of Android taint analyses | 13
On the relationship between bug reports and queries for text retrieval-based bug localization | 13
API compatibility issues in Android: Causes and effectiveness of data-driven detection techniques | 13
To what extent do DNN-based image classification models make unreliable inferences? | 13
Automated end-to-end management of the modeling lifecycle in deep learning | 13
An empirical study of the characteristics of popular Minecraft mods | 13
Resource and dependency based test case generation for RESTful Web services | 13
GitHub Discussions: An exploratory study of early adoption | 13
Assessment of off-the-shelf SE-specific sentiment analysis tools: An extended replication study | 13
Can pre-trained code embeddings improve model performance? Revisiting the use of code embeddings in software engineering tasks | 12
The entrepreneurial logic of startup software development: A study of 40 software startups | 12
Are datasets for information retrieval-based bug localization techniques trustworthy? | 12
From anecdote to evidence: the relationship between personality and need for cognition of developers | 12
A longitudinal explanatory case study of coordination in a very large development programme: the impact of transitioning from a first- to a second-generation large-scale agile development method | 12
Reuse and maintenance practices among divergent forks in three software ecosystems | 12
Automatically recommending components for issue reports using deep learning | 12
An Empirical Investigation of Relevant Changes and Automation Needs in Modern Code Review | 12
Evaluating network embedding techniques’ performances in software bug prediction | 12
A multi-dimensional analysis of technical lag in Debian-based Docker images | 12
A first look at Android applications in Google Play related to COVID-19 | 11
Deep security analysis of program code | 11
Too many images on DockerHub! How different are images for the same system? | 11
Uniform and scalable sampling of highly configurable systems | 11
Will you come back to contribute? Investigating the inactivity of OSS core developers in GitHub | 11
SMBFL: slice-based cost reduction of mutation-based fault localization | 11
CROKAGE: effective solution recommendation for programming tasks by leveraging crowd knowledge | 11
How to Better Distinguish Security Bug Reports (Using Dual Hyperparameter Optimization) | 11
How does code readability change during software evolution? | 11
Automated issue assignment: results and insights from an industrial case | 11
Search-based fairness testing for regression-based machine learning systems | 11
Ethics in the mining of software repositories | 11
Characterizing the evolution of statically-detectable performance issues of Android apps | 11
A configurable method for benchmarking scalability of cloud-native applications | 11
Breaking bad? Semantic versioning and impact of breaking changes in Maven Central | 11
CGT-FL: using cooperative game theory to effective fault localization in presence of coincidental correctness | 10
What makes a popular academic AI repository? | 10
A comparative study and analysis of developer communications on Slack and Gitter | 10
Developer-centric test amplification | 10
The effects of continuous integration on software development: a systematic literature review | 10
Learning from what we know: How to perform vulnerability prediction using noisy historical data | 10
An empirical study of Q&A websites for game developers | 10
Demystifying the challenges and benefits of analyzing user-reported logs in bug reports | 10
A unified multi-task learning model for AST-level and token-level code completion | 10
Dynamical analysis of diversity in rule-based open source network intrusion detection systems | 10
Standing on shoulders or feet? An extended study on the usage of the MSR data papers | 10
Where were the repair ingredients for Defects4j bugs? | 10
Lessons Learnt on Reproducibility in Machine Learning Based Android Malware Detection | 10
Systematic literature review on software quality for AI-based software | 10
Software testing and Android applications: a large-scale empirical study | 10
How Scrum adds value to achieving software quality? | 10
Is GitHub’s Copilot as bad as humans at introducing vulnerabilities in code? | 9
Gamification in software engineering: the mediating role of developer engagement and job satisfaction | 9
Why and what happened? Aiding bug comprehension with automated category and causal link identification | 9
Testing self-healing cyber-physical systems under uncertainty with reinforcement learning: an empirical study | 9
An exploratory study on the introduction and removal of different types of technical debt in deep learning frameworks | 9
A fine-grained data set and analysis of tangling in bug fixing commits | 9
Learning by sampling: learning behavioral family models from software product lines | 9
Characterizing usages, updates and risks of third-party libraries in Java projects | 9
Identifying self-admitted technical debt in issue tracking systems using machine learning | 9
Improving energy-efficiency by recommending Java collections | 9
Automating system test case classification and prioritization for use case-driven testing in product lines | 9
The Teamwork Process Antecedents (TPA) questionnaire: developing and validating a comprehensive measure for assessing antecedents of teamwork process quality | 9
An empirical study on self-admitted technical debt in Dockerfiles | 9
Do code review measures explain the incidence of post-release defects? | 9
Understanding and improving artifact sharing in software engineering research | 9
Automated test reuse for highly configurable software | 9
FeatCompare: Feature comparison for competing mobile apps leveraging user reviews | 9
On the usage, co-usage and migration of CI/CD tools: A qualitative analysis | 9
Automatic team recommendation for collaborative software development | 9
Using code reviews to automatically configure static analysis tools | 9
Comparing the results of replications in software engineering | 9
Pull request latency explained: an empirical overview | 8
Developers perception of peer code review in research software development | 8
Using black-box performance models to detect performance regressions under varying workloads: an empirical study | 8
Practitioner’s view of the success factors for software outsourcing partnership formation: an empirical exploration | 8
Efficient static analysis and verification of featured transition systems | 8
Interaction-based creation and maintenance of continuously usable trace links between requirements and source code | 8
Do I really need all this work to find vulnerabilities? | 8
Using a balanced scorecard to identify opportunities to improve code review effectiveness: an industrial experience report | 8
FACER: An API usage-based code-example recommender for opportunistic reuse | 8
An empirical study on release notes patterns of popular apps in the Google Play Store | 8
A study of how Docker Compose is used to compose multi-component systems | 8
Automated driver management for Selenium WebDriver | 8
An empirical study of question discussions on Stack Overflow | 8
Flair: efficient analysis of Android inter-component vulnerabilities in response to incremental changes | 8
Mutation testing in the wild: findings from GitHub | 8