Empirical Study of Restarted and Flaky Builds on Travis CI (MSR 2020 - Technical Papers)

Who

Thomas Durieux, Claire Le Goues, Michael Hilton, Rui Abreu

Track

MSR 2020 Technical Papers

Time Zone

The program is currently displayed in (UTC) Coordinated Universal Time.

When

Mon 29 Jun 2020 11:36 - 11:48 at MSR:Zoom - Build, CI, & Dependencies Chair(s): Raula Gaikovina Kula

Abstract

Continuous Integration (CI) is a development practice where developers frequently integrate code into a common codebase. After the code is integrated, the CI server runs a test suite and other tools to produce a set of reports (e.g., output of linters and tests). If the result of a CI test run is unexpected, developers have the option to manually restart the build, re-running the same test suite on the same code; this can reveal build flakiness if the restarted build outcome differs from the original build. In this study, we analyze restarted builds, flaky builds, and their impact on the development workflow. We observe that developers restart at least 1.72% of builds, amounting to 56,522 restarted builds in our Travis CI dataset. We observe that more mature and more complex projects are more likely to include restarted builds. The restarted builds are mostly builds that initially fail due to a test, a network problem, or a Travis CI limitation such as an execution timeout. Finally, we observe that restarted builds have a major impact on the development workflow. Indeed, in 54.42% of the restarted builds, the developers analyze and restart a build within an hour of the initial failure. This suggests that developers wait for CI results, interrupting their workflow to address the issue. Restarted builds also slow down the merging of pull requests by a factor of three, bringing the median merging time from 16h to 48h.
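
The flakiness criterion stated in the abstract (a restarted build whose outcome differs from the original build) can be illustrated with a minimal sketch. The record type, field names, status strings, and sample data below are hypothetical, chosen only for illustration; they are not taken from the paper's dataset, its tooling, or the Travis CI API.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class RestartedBuild:
    """Hypothetical record of a build that a developer restarted."""
    build_id: int
    original_status: str   # outcome before the restart, e.g. "failed"
    restarted_status: str  # outcome after re-running on the same code


def is_flaky(build: RestartedBuild) -> bool:
    """A restarted build counts as flaky when the two outcomes differ."""
    return build.original_status != build.restarted_status


def flaky_ratio(builds: Sequence[RestartedBuild]) -> float:
    """Fraction of restarted builds whose outcome changed after the restart."""
    if not builds:
        return 0.0
    return sum(is_flaky(b) for b in builds) / len(builds)


# Made-up sample data, not results from the study.
SAMPLE_BUILDS = [
    RestartedBuild(1, "failed", "passed"),   # outcome changed: flaky
    RestartedBuild(2, "failed", "failed"),   # same outcome: not flaky
    RestartedBuild(3, "errored", "passed"),  # outcome changed: flaky
    RestartedBuild(4, "passed", "passed"),   # same outcome: not flaky
]

if __name__ == "__main__":
    print(f"flaky ratio: {flaky_ratio(SAMPLE_BUILDS):.2f}")
```

Because the code and the test suite are identical across the two runs, any change in outcome points to something outside the code itself, such as test nondeterminism, a network problem, or a timeout.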

Link to Preprint

https://arxiv.org/abs/2003.11772

DOI

https://doi.org/10.1145/3379597.3387460

Thomas Durieux

KTH Royal Institute of Technology

Sweden

Claire Le Goues

Carnegie Mellon University

United States

Michael Hilton

Carnegie Mellon University

United States

Rui Abreu

Instituto Superior Técnico, U. Lisboa & INESC-ID

Portugal
