Characterizing and Identifying Composite Refactorings: Concepts, Heuristics and Patterns
MSR - Technical Paper
Refactoring is a program transformation applied to improve the internal structure of a program, for instance by helping to remove code smells. Developers often apply multiple interrelated refactorings, which together form a composite refactoring. Even though composite refactoring is common practice, we still lack an investigation, from different points of view, of how it manifests in practice. Previous empirical studies also neglect how different kinds of composite refactorings affect the removal, prevalence, or introduction of smells. To address these gaps, we provide a conceptual framework and two heuristics to identify composite refactorings within and across commits. We then mined the commit history of 48 GitHub software projects, in which we identified and analyzed 24,911 composite refactorings involving 104,505 single refactorings. Among several findings, we observed that most composite refactorings occur within a single commit and consist of refactorings of the same type. We also found that several refactorings are semantically related: they occur in different parts of the system but still belong to the same task. Many smells are introduced in a program by “incomplete” composite refactorings. Additionally, we found 111 patterns of composite refactorings that frequently introduce or remove certain smell types. These patterns can serve as guidelines for developers seeking to improve their refactoring practices, as well as for designers of recommender systems.
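To make the idea of identifying composites "within and across commits" concrete, the following is a minimal illustrative sketch. It is not the paper's heuristics, which the abstract does not specify. It assumes that each detected single refactoring is recorded with a commit hash, a refactoring type, and the code element it touches. The `Refactoring` record and both grouping functions are hypothetical names introduced only for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Refactoring:
    """Hypothetical record of one detected single refactoring."""
    commit: str   # hash of the commit in which the refactoring was detected
    kind: str     # refactoring type, e.g. "Extract Method"
    element: str  # qualified name of the affected code element


def group_within_commit(refactorings):
    """Assumed rule: refactorings in the same commit form one composite."""
    groups = defaultdict(list)
    for r in refactorings:
        groups[r.commit].append(r)
    # A composite needs at least two single refactorings.
    return [g for g in groups.values() if len(g) >= 2]


def group_across_commits(refactorings):
    """Assumed rule: refactorings touching the same element in
    different commits form one composite."""
    groups = defaultdict(list)
    for r in refactorings:
        groups[r.element].append(r)
    return [
        g for g in groups.values()
        if len(g) >= 2 and len({r.commit for r in g}) >= 2
    ]


def is_homogeneous(composite):
    """True when every refactoring in the composite has the same type."""
    return len({r.kind for r in composite}) == 1
```

Under these assumptions, `is_homogeneous` corresponds to the observation that many composites consist of a single refactoring type. A real study would obtain the input records from a refactoring detection tool and would likely use a richer notion of relatedness than exact element equality.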