RTPTorrent: An Open-source Dataset for Evaluating Regression Test Prioritization (MSR 2020 - Technical Papers)

Who

Toni Mattis, Patrick Rein, Falco Dürsch, Robert Hirschfeld

Track

MSR 2020 Technical Papers

When

Mon 29 Jun 2020 16:30 - 16:37 at MSR:Zoom2 - Platforms & Datasets Chair(s): Moritz Beller

Abstract

The software engineering practice of automated testing helps programmers find defects earlier during development. With growing software projects and longer-running test suites, frequency and immediacy of feedback decline, thereby making defects harder to repair. Regression test prioritization (RTP) is concerned with running relevant tests earlier to lower the costs of defect localization and to improve feedback.

Finding representative data to evaluate RTP techniques is non-trivial, as most software is published without failing tests. In this work, we systematically survey a wide range of RTP literature regarding whether the datasets use real or synthetic defects or tests, whether they are publicly available, and whether they are reused. We observed that some datasets are reused; however, many studies examine only a few projects, and these rarely resemble real-world development activity.

In light of these threats to ecological validity, we describe the construction and characteristics of a new dataset based on 20 open-source Java programs.

Our dataset allows researchers to evaluate prioritization heuristics based on version control meta-data, source code, and test results from fine-grained, automated builds over 9 years of development history. We provide reproducible baselines for initial comparisons and make all data publicly available.

We see this as a step towards better reproducibility, ecological validity, and long-term availability of studied software in the field of test prioritization.
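
As a rough illustration of what a prioritization heuristic based on test results might look like, the following minimal sketch orders tests by how often they failed in earlier builds and then reports how early the first failing test of a new build would run. The test names, the build history, and both function names are hypothetical; they do not reflect the dataset's actual schema or the baselines provided by the authors.

```python
from collections import Counter


def prioritize_by_failure_history(tests, past_builds):
    """Order tests so that those that failed most often in past builds run first.

    `tests` is a list of test names; `past_builds` is a list of sets, each
    holding the tests that failed in one earlier build (hypothetical layout).
    """
    failures = Counter()
    for failed in past_builds:
        failures.update(failed)
    # sorted() is stable, so tests with equal failure counts keep their original order.
    return sorted(tests, key=lambda test: -failures[test])


def first_failure_position(order, failed_now):
    """Return the 1-based position of the first failing test, or None if none fail."""
    for position, test in enumerate(order, start=1):
        if test in failed_now:
            return position
    return None


# Hypothetical example data, not taken from the dataset.
tests = ["ParserTest", "CacheTest", "IoTest", "UiTest"]
past_builds = [{"IoTest"}, {"IoTest", "UiTest"}, set()]
failed_now = {"UiTest"}

order = prioritize_by_failure_history(tests, past_builds)
baseline_position = first_failure_position(tests, failed_now)
prioritized_position = first_failure_position(order, failed_now)
```

In this toy example the failing test moves from the last position in the original order to the second position in the prioritized order. Evaluating a heuristic on real data would repeat this comparison over many builds.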

Link to Preprint

https://toni.mattis.berlin/files/2020-preprint-mattis-rtptorrent-msr20.pdf

DOI

https://doi.org/10.1145/3379597.3387458

Toni Mattis

Hasso Plattner Institute, University of Potsdam

Germany

Patrick Rein

Hasso Plattner Institute

Germany

Falco Dürsch

Robert Hirschfeld

Hasso-Plattner-Institut (HPI), Germany

Germany

Session Program

Mon 29 Jun
Displayed time zone: (UTC) Coordinated Universal Time

16:30 - 17:00	Platforms & Datasets at MSR:Zoom2
Tracks: Technical Papers / Registered Reports / Keynote / MSR Awards / FOSS Award / Education / Data Showcase / Mining Challenge / MSR Challenge Proposals / Ask Me Anything
Chair(s): Moritz Beller (Facebook, USA)
Q/A & Discussion of Session Papers over Zoom (Joining info available on Slack)

16:30 7m Live Q&A	RTPTorrent: An Open-source Dataset for Evaluating Regression Test Prioritization (MSR - Technical Paper, Technical Papers). Toni Mattis (Hasso Plattner Institute, University of Potsdam), Patrick Rein (Hasso Plattner Institute), Falco Dürsch, Robert Hirschfeld (Hasso-Plattner-Institut (HPI), Germany). DOI, Pre-print, Media Attached
16:37 7m Live Q&A	Polyglot and Distributed Software Repository Mining with CROSSFLOW (MSR - Technical Paper, Technical Papers). Konstantinos Barmpis, Patrick Neubauer (University of York, UK), Jonathan Co, Dimitris Kolovos (University of York), Nicholas Matragkas, Richard Paige (McMaster University). Media Attached
16:45 7m Live Q&A	Boa Views: Easy Modularization and Sharing of MSR Analyses (MSR - Technical Paper, Technical Papers). Che Shian Hung, Robert Dyer (University of Nebraska - Lincoln). Pre-print, Media Attached
16:52 7m Live Q&A	Determining the Intrinsic Structure of Public Software Development History (MSR - Registered Reports, Registered Reports). A: Antoine Pietri (Inria), A: Guillaume Rousseau (Université de Paris and Inria), A: Stefano Zacchiroli (Université de Paris and Inria). Pre-print, Media Attached