MSR 2020
Mon 29 - Tue 30 June 2020
co-located with ICSE 2020
Tue 30 Jun 2020 10:37 - 10:45 at MSR:Zoom - Evolution Chair(s): Jürgen Cito

The notion of software “fork” has been shifting over time from the (negative) phenomenon of community disagreements that result in the creation of separate development lines, and ultimately separate software products, to the (positive) practice of using distributed version control system (VCS) repositories to collaboratively improve a single software product without stepping on each other's toes in the interim. Either way, VCS repositories involved in a fork share parts of a common development history.

Historically, studies of software forks have relied on hosting platform metadata, most notably from GitHub, as the source of truth for what constitutes a fork. However, this notion of “forge forks” only recognizes as forks those repositories that were created on the platform itself, e.g., by clicking the “fork” button in the GitHub user interface. The increase in popularity of more distributed code hosting platforms (e.g., GitLab) and the habits of significant development communities (e.g., the Linux kernel community, which is not primarily hosted on any single centralized platform) call into question the reliability of trusting hosting platforms to determine what a fork is. Doing so might introduce selection and methodological biases in empirical studies.

In this article we explore various definitions of “software forks”, trying to capture the different forking workflows that exist in the real world. We quantify the differences in how many repositories would be identified as forks on GitHub according to each definition, confirming that a significant number would be overlooked when only considering “forge forks”. We study the structure of fork networks, observing how their size is affected by the proposed definitions, and discuss the potential impacts of these results on empirical research.
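To make the contrast with “forge forks” concrete, here is a minimal sketch of one possible history-based definition: two repositories belong to the same fork network if they share at least one commit, directly or through other repositories. This is an illustrative assumption, not necessarily one of the definitions studied in the article; the function name `shared_commit_fork_networks`, the repository names, and the commit identifiers are all hypothetical.

```python
from collections import defaultdict


def shared_commit_fork_networks(repo_commits):
    """Group repositories into networks of repos linked by shared commits.

    repo_commits maps a repository name to the set of commit IDs it contains.
    Returns a list of networks, each a sorted list of repository names.
    """
    # Union-find over repositories.
    parent = {repo: repo for repo in repo_commits}

    def find(repo):
        while parent[repo] != repo:
            parent[repo] = parent[parent[repo]]  # path halving
            repo = parent[repo]
        return repo

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Link every repository to the first repository seen with the same commit.
    first_seen = {}
    for repo, commits in repo_commits.items():
        for commit in commits:
            if commit in first_seen:
                union(first_seen[commit], repo)
            else:
                first_seen[commit] = repo

    networks = defaultdict(set)
    for repo in repo_commits:
        networks[find(repo)].add(repo)
    return sorted((sorted(members) for members in networks.values()),
                  key=lambda members: members[0])


# Hypothetical data: carol's copy lives on another platform, so no forge
# metadata records it as a fork, yet it shares commit "c1" with alice's repo.
repos = {
    "github/alice/proj": {"c1", "c2", "c3"},
    "github/bob/proj": {"c1", "c2", "c4"},
    "gitlab/carol/proj": {"c1", "c5"},
    "github/dave/other": {"c9"},
}
forge_forks = {"github/bob/proj"}  # what platform metadata alone would report

networks = shared_commit_fork_networks(repos)
history_forks = {repo for net in networks if len(net) > 1 for repo in net}
overlooked = history_forks - forge_forks - {"github/alice/proj"}
```

In this toy data set, `overlooked` contains only `gitlab/carol/proj`: a repository that shares history with the original but would be missed by a study relying solely on the forge's fork metadata.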

Tue 30 Jun
Times are displayed in time zone: (UTC) Coordinated Universal Time

msr-2020-papers
10:30 - 11:00: Technical Papers - Evolution at MSR:Zoom
Chair(s): Jürgen Cito (MIT)

Q/A & Discussion of Session Papers over Zoom (Joining info available on Slack)

10:30 - 10:37, msr-2020-papers
Live Q&A
Jens Meinicke (Carnegie Mellon University), Juan Hoyos (Universidad Nacional de Colombia), Bogdan Vasilescu (Carnegie Mellon University), Christian Kästner (Carnegie Mellon University)
10:37 - 10:45, msr-2020-papers
Live Q&A
Antoine Pietri (Inria), Guillaume Rousseau (Université de Paris and Inria), Stefano Zacchiroli (Université de Paris and Inria)
10:45 - 10:52, msr-2020-papers
Live Q&A
Sergey Svitkov, Timofey Bryksin (JetBrains Research, Saint Petersburg State University)
10:52 - 11:00, msr-2020-Data-showcase
Live Q&A
Themistoklis Diamantopoulos (Electrical and Computer Engineering Dept, Aristotle University of Thessaloniki), Michail Papamichail, Thomas Karanikiotis, Kyriakos Chatzidimitriou (Aristotle University of Thessaloniki), Andreas Symeonidis (Aristotle University of Thessaloniki)