MSR 2020
Mon 29 - Tue 30 June 2020
co-located with ICSE 2020
Mon 29 Jun 2020 16:45 - 16:52 at MSR:Zoom2 - Platforms & Datasets Chair(s): Moritz Beller

Mining Software Repositories (MSR) research has recently seen a shift in focus toward ultra-large-scale datasets. Several tools exist to support these efforts, such as the Boa language and infrastructure. While Boa has seen extensive use, in its current form it is not always possible to perform the entire analysis task within the infrastructure; some post-processing in another language is often required. This limits end-to-end reproducibility and the sharing and reuse of MSR queries. To address this problem, we borrow the notion of views from the relational database field and design a query language and runtime infrastructure extension for Boa that we call materialized views. Materialized views provide output reuse to Boa users, so that the results of prior Boa queries can easily be reused by others. This allows for computing results not previously possible within Boa and provides for increased sharing and reuse of MSR queries. To evaluate views, we performed two partial reproductions of prior MSR studies using Boa’s dataset and infrastructure and compared our results to those of the prior studies. This shows the usability of the new infrastructure, allowing analyses in Boa that were not previously possible, as well as providing a previously hand-created gold dataset for identifier splitting as a reusable view for other MSR researchers. We also verified the caching behavior using the queries from one of the case studies. The results show that caching works as expected and can drastically improve the runtime performance.
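
The output-reuse idea in the abstract can be illustrated with a minimal Python sketch. It is not Boa's query language or its implementation: the `ViewStore` class, the `materialize` method, the query name, and the sample rows are all hypothetical and only show how a cached query result might be reused by a later analysis without re-running the query.

```python
import hashlib
from typing import Callable, Dict, List, Tuple

Row = Tuple[str, int]  # hypothetical output row: (project name, commit count)


class ViewStore:
    """Toy in-memory store of materialized query outputs (hypothetical, not Boa's API)."""

    def __init__(self) -> None:
        self._cache: Dict[str, List[Row]] = {}
        self.executions = 0  # how many times a query actually ran

    @staticmethod
    def _key(source: str) -> str:
        # Identify a view by a hash of its query source text.
        return hashlib.sha256(source.encode("utf-8")).hexdigest()

    def materialize(self, source: str, run: Callable[[], List[Row]]) -> List[Row]:
        # Run the query only if its output has not been stored yet.
        key = self._key(source)
        if key not in self._cache:
            self.executions += 1
            self._cache[key] = run()
        return self._cache[key]


def count_commits() -> List[Row]:
    # Stand-in for an expensive repository-mining query (made-up data).
    return [("projA", 3), ("projB", 5)]


store = ViewStore()
QUERY = "commits_per_project"  # hypothetical query identifier

first = store.materialize(QUERY, count_commits)
# A downstream analysis reuses the stored output instead of re-running the query.
total = sum(n for _, n in store.materialize(QUERY, count_commits))
```

In this sketch the second call to `materialize` is served from the cache, so `count_commits` runs only once; that is the kind of saving the caching evaluation in the abstract refers to, under the simplifying assumption that a view is identified by its query text alone.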

Mon 29 Jun
Times are displayed in time zone: (UTC) Coordinated Universal Time

16:30 - 17:00: Platforms & Datasets (Technical Papers / Registered Reports / Keynote / MSR Awards / FOSS Award / Education / Data Showcase / Mining Challenge / MSR Challenge Proposals / Ask Me Anything) at MSR:Zoom2
Chair(s): Moritz Beller (Facebook, USA)

Q/A & Discussion of Session Papers over Zoom (Joining info available on Slack)

16:30 - 16:37
Live Q&A
RTPTorrent: An Open-source Dataset for Evaluating Regression Test Prioritization (MSR - Technical Paper)
Technical Papers
Toni Mattis (Hasso Plattner Institute, University of Potsdam); Patrick Rein (Hasso Plattner Institute); Falco Dürsch; Robert Hirschfeld (Hasso-Plattner-Institut (HPI), Germany)
16:37 - 16:45
Live Q&A
Polyglot and Distributed Software Repository Mining with CROSSFLOW (MSR - Technical Paper)
Technical Papers
Konstantinos Barmpis; Patrick Neubauer (University of York, UK); Jonathan Co; Dimitris Kolovos (University of York); Nicholas Matragkas; Richard Paige (McMaster University)
16:45 - 16:52
Live Q&A
Boa Views: Easy Modularization and Sharing of MSR Analyses (MSR - Technical Paper)
Technical Papers
Che Shian Hung; Robert Dyer (University of Nebraska - Lincoln)
16:52 - 17:00
Live Q&A
Determining the Intrinsic Structure of Public Software Development History (MSR - Registered Reports)
Registered Reports
Antoine Pietri (Inria); Guillaume Rousseau (Université de Paris and Inria); Stefano Zacchiroli (Université de Paris and Inria)