SoftMon: A Tool to Compare Similar Open-source Software from a Performance Perspective (MSR 2020 - Technical Papers)

Who

Shubhankar Suman Singh, Smruti Ranjan Sarangi

Track

MSR 2020 Technical Papers

Time Zone

The program is currently displayed in (UTC) Coordinated Universal Time.

When

Tue 30 Jun 2020 11:36 - 11:48 (UTC) at MSR:Zoom - Quality. Chair(s): Jens Krinke

Abstract

Over the past two decades, a rich ecosystem of open-source software has evolved. For every type of application, there is a wide variety of alternatives. We observed that even when different applications that perform similar tasks are compiled with the same versions of the compiler and the libraries, they perform very differently while running on the same system. Unfortunately, prior work in this area, which compares two code bases for similarities, does not help us find the reasons for the difference in performance. In this paper, we develop a tool, SoftMon, that can compare the code bases of two separate applications and pinpoint the exact set of functions that are disproportionately responsible for differences in performance. Our tool uses machine learning and NLP techniques. It can be used to analyze why a given open-source application performs worse than its peers, to design bespoke applications that incorporate specific innovations (identified by SoftMon) found in competing applications, and to diagnose performance bugs. We compare a wide variety of large open-source programs such as image editors, audio players, text editors, PDF readers, mail clients and even full-fledged operating systems (OSs). In all cases, our tool was able to pinpoint, within 200 seconds, a set of at most 10-15 functions that are responsible for the differences. A subsequent manual analysis assisted by our Graph Visualization Engine helps us find the reasons. We were able to validate most of the reasons by correlating them with subsequent observations made by developers or with the extant technical literature. The manual phase of our analysis is limited to 30 minutes (tested with human subjects).

Link to Preprint

http://www.cse.iitd.ac.in/~shubhankar/MSR6.pdf

Shubhankar Suman Singh

IIT Delhi

Smruti Ranjan Sarangi
