MSR 2020
Mon 29 - Tue 30 June 2020
co-located with ICSE 2020
Mon 29 Jun 2020 16:52 - 17:00 at MSR:Zoom2 - Platforms & Datasets Chair(s): Moritz Beller

Background. Collaborative software development has produced a wealth of version control system (VCS) data that can now be analyzed in full. Little is known about the intrinsic structure of the entire corpus of publicly available VCS as an interconnected graph. Understanding its structure is needed to determine the best approach to analyze it in full and to avoid methodological pitfalls when doing so.

Objective. We intend to determine the most salient network topology properties of public software development history as captured by VCS. We will explore: degree distributions, determining whether they are scale-free or not; distribution of connected component sizes; distribution of shortest path lengths.
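As a rough illustration of the three measures named in the objective, the sketch below computes them on a tiny, hypothetical graph using only the Python standard library. The edge list `TOY_EDGES` and all function names are assumptions made for this example, and the graph is treated as undirected for simplicity. It is not the authors' analysis code, and it does not attempt the power-law fit needed to decide whether a distribution is scale-free.

```python
from collections import Counter, deque

# Hypothetical toy data: each pair links a revision to one of its parents.
TOY_EDGES = [
    ("r1", "r0"), ("r2", "r1"), ("r3", "r1"), ("r4", "r2"), ("r4", "r3"),
    ("s1", "s0"),
]

def build_undirected(edges):
    """Build an undirected adjacency map: node -> set of neighbours."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def degree_distribution(adj):
    """Map degree -> number of nodes with that degree."""
    return Counter(len(neigh) for neigh in adj.values())

def bfs_distances(adj, source):
    """Breadth-first search: map each reachable node -> hop count from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neigh in adj[node]:
            if neigh not in dist:
                dist[neigh] = dist[node] + 1
                queue.append(neigh)
    return dist

def component_sizes(adj):
    """Sizes of connected components, largest first."""
    seen = set()
    sizes = []
    for node in adj:
        if node not in seen:
            component = bfs_distances(adj, node)
            seen.update(component)
            sizes.append(len(component))
    return sorted(sizes, reverse=True)

def path_length_distribution(adj):
    """Map distance -> number of unordered, connected node pairs at that distance."""
    counts = Counter()
    for source in adj:
        for target, d in bfs_distances(adj, source).items():
            if source < target:  # count each pair once, skip source == target
                counts[d] += 1
    return counts

graph = build_undirected(TOY_EDGES)
print(dict(degree_distribution(graph)))
print(component_sizes(graph))
print(dict(path_length_distribution(graph)))
```

Running one breadth-first search per node is quadratic in the worst case, so this only works for toy inputs; at corpus scale, path lengths would more plausibly be estimated by sampling source nodes.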

Method. We will use Software Heritage, which is the largest corpus of public VCS data, compress it using webgraph compression techniques, and analyze it in-memory using classic graph algorithms. Analyses will be performed both on the full graph and on relevant subgraphs.
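To give an intuition for why compression can make in-memory analysis feasible, the sketch below shows gap encoding of a sorted successor list, one of the basic ideas behind webgraph-style compression. It is a deliberately simplified, byte-level illustration: the function names are hypothetical, the variable-length byte code is an assumption chosen for readability, and real webgraph compression combines further techniques. It is not the pipeline used in the study.

```python
def encode_varint(n):
    """Encode a non-negative integer as base-128 bytes, low-order group first."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def compress_successors(successors):
    """Gap-encode node ids: sort them, then store differences between neighbours."""
    out = bytearray()
    previous = 0
    for node in sorted(set(successors)):
        out += encode_varint(node - previous)
        previous = node
    return bytes(out)

def decompress_successors(data):
    """Invert compress_successors, returning the sorted list of node ids."""
    nodes = []
    current = 0
    gap = 0
    shift = 0
    for byte in data:
        gap |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
        else:
            current += gap
            nodes.append(current)
            gap = 0
            shift = 0
    return nodes

# Hypothetical successor list with ids that are large but close to each other.
successors = [1000000, 1000003, 1000001, 1000010]
packed = compress_successors(successors)
print(len(packed), decompress_successors(packed))
```

Because neighbouring ids tend to be close together, most gaps are small and fit in a single byte, while the uncompressed ids would each need several bytes.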

Limitations. The study is exploratory in nature; as such, no hypotheses on the findings are stated at this time. The chosen graph algorithms are expected to scale to the corpus size, but this will need to be confirmed experimentally. External validity will depend on how representative Software Heritage is of the software commons.

Mon 29 Jun
Times are displayed in time zone: (UTC) Coordinated Universal Time

16:30 - 17:00: Platforms & Datasets (Technical Papers / Registered Reports / Keynote / MSR Awards / FOSS Award / Education / Data Showcase / Mining Challenge / MSR Challenge Proposals / Ask Me Anything) at MSR:Zoom2
Chair(s): Moritz Beller (Facebook, USA)

Q/A & Discussion of Session Papers over Zoom (Joining info available on Slack)

16:30 - 16:37
Live Q&A
RTPTorrent: An Open-source Dataset for Evaluating Regression Test Prioritization (MSR - Technical Paper)
Technical Papers
Toni Mattis (Hasso Plattner Institute, University of Potsdam), Patrick Rein (Hasso Plattner Institute), Falco Dürsch, Robert Hirschfeld (Hasso-Plattner-Institut (HPI), Germany)
16:37 - 16:45
Live Q&A
Polyglot and Distributed Software Repository Mining with CROSSFLOW (MSR - Technical Paper)
Technical Papers
Konstantinos Barmpis, Patrick Neubauer (University of York, UK), Jonathan Co, Dimitris Kolovos (University of York), Nicholas Matragkas, Richard Paige (McMaster University)
16:45 - 16:52
Live Q&A
Boa Views: Easy Modularization and Sharing of MSR Analyses (MSR - Technical Paper)
Technical Papers
Che Shian Hung, Robert Dyer (University of Nebraska - Lincoln)
16:52 - 17:00
Live Q&A
Determining the Intrinsic Structure of Public Software Development History (MSR - Registered Reports)
Registered Reports
Antoine Pietri (Inria), Guillaume Rousseau (Université de Paris and Inria), Stefano Zacchiroli (Université de Paris and Inria)