SoftMon: A Tool to Compare Similar Open-source Software from a Performance Perspective
MSR - Technical Paper
Over the past two decades, a rich ecosystem of open-source software has evolved. For every type of application, there is a wide variety of alternatives. We observed that even when different applications that perform similar tasks are compiled with the same versions of the compiler and the libraries, they perform very differently while running on the same system. Unfortunately, prior work in this area, which compares two codebases for similarities, does not help in finding the reasons for the differences in performance. In this paper, we develop a tool, SoftMon, that can compare the codebases of two separate applications and pinpoint the exact set of functions that are disproportionately responsible for differences in performance. Our tool uses machine learning and NLP techniques to analyze why a given open-source application performs worse than its peers, to design bespoke applications that incorporate specific innovations (identified by SoftMon) from competing applications, and to diagnose performance bugs. We compare a wide variety of large open-source programs such as image editors, audio players, text editors, PDF readers, mail clients, and even full-fledged operating systems (OSs). In all cases, our tool was able to pinpoint, within 200 seconds, a set of at most 10-15 functions that are responsible for the differences. A subsequent manual analysis, assisted by our Graph Visualization Engine, helps us find the reasons. We were able to validate most of the reasons by correlating them with subsequent observations made by developers or with the extant technical literature. The manual phase of our analysis is limited to 30 minutes (tested with human subjects).