MSR 2020 - Mining Challenge

The International Conference on Mining Software Repositories (MSR) has hosted a mining challenge since 2006. With this challenge, we call upon everyone interested to apply their tools to a common dataset. The challenge dares researchers and practitioners to put their mining tools and approaches to the test.

Mon 29 Jun
Displayed time zone: (UTC) Coordinated Universal Time

10:30 - 11:00	Programming Languages & ModelsTechnical Papers / Registered Reports / Keynote / MSR Awards / FOSS Award / Education / Data Showcase / Mining Challenge / MSR Challenge Proposals / Ask Me Anything at MSR:Zoom Chair(s): Dimitris Kolovos University of York Q/A & Discussion of Session Papers over Zoom (Joining info available on Slack)

10:30 6m Live Q&A		An Empirical Study on the Impact of Deimplicitization on Program ComprehensionMSR - Registered Reports Registered Reports A: Jürgen Cito MIT, A: Jiasi Shen Massachusetts Institute of Technology, A: Martin C. Rinard MIT Pre-print Media Attached
10:36 6m Live Q&A		AIMMX: Artificial Intelligence Model Metadata ExtractorMSR - Technical Paper Technical Papers Jason Tsay IBM Research, Alan Braz IBM Research, Martin Hirzel IBM Research, Avraham Shinnar IBM Research, Todd Mummert Pre-print Media Attached
10:42 6m Live Q&A		Using Large-Scale Anomaly Detection on Code to Improve Kotlin CompilerMSR - Technical Paper Technical Papers Timofey Bryksin JetBrains Research, Saint Petersburg State University, Victor Petukhov JetBrains, ITMO University, Ilya Alexin , Stanislav Prikhodko , Alexey Shpilman , Vladimir Kovalenko TU Delft, Nikita Povarov JetBrains Pre-print Media Attached
10:48 6m Live Q&A		An Empirical Study of Method Chaining in JavaMSR - Technical Paper Technical Papers Tomoki Nakamaru Graduate School of Information Science and Technology, The University of Tokyo, Tomomasa Matsunaga , Tetsuro Yamazaki Graduate School of Information Science and Technology, The University of Tokyo, Soramichi Akiyama Department of Creative Informatics, The University of Tokyo, Shigeru Chiba The University of Tokyo Pre-print Media Attached
10:54 6m Live Q&A		Painting Flowers: Reasons for Using Single-State State Machines in Model-Driven EngineeringMSR - Technical Paper Technical Papers Nan Yang Eindhoven University of Technology, The Netherlands, Pieter Cuijpers , Ramon Schiffelers Eindhoven University of Technology and ASML, the Netherlands, Johan Lukkien , Alexander Serebrenik Eindhoven University of Technology Media Attached

10:30 - 11:00	Refactoring & TestingTechnical Papers / Registered Reports / Keynote / MSR Awards / FOSS Award / Education / Data Showcase / Mining Challenge / MSR Challenge Proposals / Ask Me Anything at MSR:Zoom2 Chair(s): Maurício Aniche Delft University of Technology, Netherlands Q/A & Discussion of Session Papers over Zoom (Joining info available on Slack)

10:30 7m Live Q&A		Characterizing and Identifying Composite Refactorings: Concepts, Heuristics and PatternsMSR - Technical Paper Technical Papers Leonardo Da Silva Sousa Carnegie Mellon University, USA, Diego Cedrim Pontifical Catholic University of Rio de Janeiro, Alessandro Garcia PUC-Rio, Willian Oizumi PUC-Rio, Ana Carla Bibiano PUC-Rio, Daniel Oliveira PUC-Rio, Miryung Kim University of California, Los Angeles, Anderson Oliveira PUC-Rio Pre-print Media Attached
10:37 7m Live Q&A		Behind the Intents: An In-depth Empirical Study on Software Refactoring in Modern Code ReviewMSR - Technical Paper Technical Papers Matheus Paixao University of Fortaleza, Anderson Uchôa Pontifical Catholic University of Rio de Janeiro (PUC-Rio), Ana Carla Bibiano PUC-Rio, Daniel Oliveira PUC-Rio, Alessandro Garcia PUC-Rio, Jens Krinke University College London, Emilio Arvonio Pre-print Media Attached
10:45 7m Live Q&A		JTeC: A Large Collection of Java Test Classes for Test Code Analysis and ProcessingMSR - Data Showcase Data Showcase Federico Corò , A: Roberto Verdecchia Vrije Universiteit Amsterdam, A: Emilio Cruciani , A: Breno Miranda Federal University of Pernambuco, A: Antonia Bertolino CNR-ISTI Pre-print Media Attached
10:52 7m Live Q&A		TestRoutes: A Manually Curated Method Level Dataset for Test-to-Code TraceabilityMSR - Data Showcase Data Showcase A: András Kicsi , A: László Vidács University of Szeged, Hungary, A: Tibor Gyimothy Pre-print Media Attached

11:00 - 12:00	Build, CI, & DependenciesTechnical Papers / Registered Reports / Keynote / MSR Awards / FOSS Award / Education / Data Showcase / Mining Challenge / MSR Challenge Proposals / Ask Me Anything at MSR:Zoom Chair(s): Raula Gaikovina Kula NAIST Q/A & Discussion of Session Papers over Zoom (Joining info available on Slack)

11:00 12m Live Q&A		A Tale of Docker Build Failures: A Preliminary StudyMSR - Technical Paper Technical Papers Yiwen Wu National University of Defense Technology, Yang Zhang National University of Defense Technology, China, Tao Wang National University of Defense Technology, Huaimin Wang Pre-print Media Attached
11:12 12m Live Q&A		Using Others' Tests to Avoid Breaking UpdatesMSR - Technical Paper Technical Papers Suhaib Mujahid Concordia University, Rabe Abdalkareem Concordia University, Montreal, Canada, Emad Shihab Concordia University, Shane McIntosh McGill University Pre-print Media Attached
11:24 12m Live Q&A		A Dataset of DockerfilesMSR - Data Showcase Data Showcase A: Jordan Henkel University of Wisconsin–Madison, A: Christian Bird Microsoft Research, A: Shuvendu K. Lahiri Microsoft Research, A: Thomas Reps University of Wisconsin-Madison, USA Media Attached
11:36 12m Live Q&A		Empirical Study of Restarted and Flaky Builds on Travis CIMSR - Technical Paper Technical Papers Thomas Durieux KTH Royal Institute of Technology, Sweden, Claire Le Goues Carnegie Mellon University, Michael Hilton Carnegie Mellon University, USA, Rui Abreu Instituto Superior Técnico, U. Lisboa & INESC-ID DOI Pre-print Media Attached
11:48 12m Live Q&A		LogChunks: A Data Set for Build Log AnalysisMSR - Data Showcase Data Showcase A: Carolin Brandt Delft University of Technology, A: Annibale Panichella Delft University of Technology, A: Andy Zaidman TU Delft, A: Moritz Beller Facebook, USA Pre-print Media Attached

12:00 - 13:00	Code SmellsTechnical Papers / Registered Reports / Keynote / MSR Awards / FOSS Award / Education / Data Showcase / Mining Challenge / MSR Challenge Proposals / Ask Me Anything at MSR:Zoom Chair(s): Alessandro Garcia PUC-Rio Q/A & Discussion of Session Papers over Zoom (Joining info available on Slack)

12:00 10m Live Q&A		Detecting Video Game-Specific Bad Smells in Unity ProjectsMSR - Technical Paper Technical Papers Antonio Borrelli , Vittoria Nardone , Giuseppe Di Lucca , Gerardo Canfora University of Sannio, Massimiliano Di Penta University of Sannio Pre-print Media Attached
12:10 10m Live Q&A		Investigating Severity Thresholds for Test SmellsMSR - Technical Paper Technical Papers Davide Spadini Delft University of Technology, Netherlands, Martin Schvarcbacher , Ana Oprescu University of Amsterdam, Magiel Bruntink Software Improvement Group, Alberto Bacchelli University of Zurich DOI Pre-print Media Attached
12:20 10m Live Q&A		On the Prevalence, Impact, and Evolution of SQL code smells in Data-Intensive SystemsMSR - Technical Paper Technical Papers Biruk Asmare Muse , Masud Rahman Dalhousie University, Csaba Nagy Software Institute - USI, Lugano, Anthony Cleve University of Namur, Foutse Khomh Polytechnique Montréal, Giuliano Antoniol Polytechnique Montréal Pre-print Media Attached
12:30 10m Live Q&A		Multi-language Design Smells: A Backstage PerspectiveMSR - Registered Reports Registered Reports A: Mouna Abidi , A: Moses Openja , A: Foutse Khomh Polytechnique Montréal Pre-print Media Attached
12:40 10m Live Q&A		The Scent of Deep Learning Code: An Empirical StudyMSR - Technical Paper Technical Papers Hadhemi Jebnoun , Masud Rahman Dalhousie University, Foutse Khomh Polytechnique Montréal, Houssem Ben Braiek Pre-print Media Attached
12:50 10m Live Q&A		Developer-Driven Code Smell PrioritizationMSR - Technical Paper Technical Papers Fabiano Pecorelli University of Salerno, Fabio Palomba University of Salerno, Foutse Khomh Polytechnique Montréal, Andrea De Lucia University of Salerno Pre-print Media Attached

12:00 - 13:00	MSR Mining ChallengeMining Challenge / Technical Papers at MSR:Zoom2 Chair(s): Antoine Pietri Inria, Diomidis Spinellis Athens University of Economics and Business, Stefano Zacchiroli Université de Paris and Inria Q/A & Discussion of Session Papers over Zoom (Joining info available on Slack)

12:00 20m Live Q&A		Cheating Death: A Statistical Survival Analysis of Publicly Available Python ProjectsMSR - Mining Challenge Mining Challenge A: Ali Rao Hamza , A: Chelsea Parlett-Pelleriti , A: Erik Linstead Chapman University Pre-print Media Attached
12:20 20m Live Q&A		An investigation to find motives behind cross-platform forks from Software Heritage datasetMSR - Mining Challenge Mining Challenge A: Avijit Bhattacharjee University of Saskatchewan, Canada, A: Sristy Sumana Nath Department of Computer Science, University of Saskatchewan, A: Shurui Zhou Carnegie Mellon University, USA / University of Toronto, CA, A: Debasish Chakroborti , A: Banani Roy University of Saskatchewan, A: Chanchal K. Roy University of Saskatchewan, A: Kevin Schneider University of Saskatchewan DOI Pre-print Media Attached
12:40 20m Live Q&A		Exploring the Security Awareness of the Python and JavaScript Open Source CommunitiesMSR - Mining Challenge Mining Challenge Gabor Antal , Márton Keleti , A: Peter Hegedus University of Szeged Pre-print Media Attached

13:00 - 13:15	"Opening" & AwardsMSR Plenary at MSR:Zoom Chair(s): Georgios Gousios Delft University of Technology, Sunghun Kim Hong Kong University of Science and Technology, Sarah Nadi University of Alberta Live on YouTube: https://www.youtube.com/watch?v=Qvf7mHa-YYs

13:00 15m Day opening		MSR Opening & Awards MSR Plenary Sunghun Kim Hong Kong University of Science and Technology, Sarah Nadi University of Alberta, Georgios Gousios Delft University of Technology Media Attached

14:30 - 15:30	Bugs & IssuesTechnical Papers / Registered Reports / Keynote / MSR Awards / FOSS Award / Education / Data Showcase / Mining Challenge / MSR Challenge Proposals / Ask Me Anything at MSR:Zoom Chair(s): Francisco Servant Virginia Tech Q/A & Discussion of Session Papers over Zoom (Joining info available on Slack)

14:30 10m Live Q&A		Dataset of Video Game Development ProblemsMSR - Data Showcase Data Showcase A: Cristiano Politowski Concordia University, Canada, A: Fabio Petrillo University of Quebec at Chicoutimi, A: Yann-Gaël Guéhéneuc Concordia University and Polytechnique Montréal, A: Gabriel Cavalheiro Ullmann UNIJUI - Universidade Regional do Noroeste do Estado do Rio Grande do Sul, A: Josias De Andrade Werly Media Attached
14:40 10m Live Q&A		On the Relationship between User Churn and Software IssuesMSR - Technical Paper Technical Papers Omar El Zarif , Daniel Alencar Da Costa University of Otago, Safwat Hassan Queens University, Kingston, Canada, Ying Zou Queen's University, Kingston, Ontario Pre-print Media Attached
14:50 10m Live Q&A		A Soft Alignment Model for Bug DeduplicationMSR - Technical Paper Technical Papers Irving Muller Rodrigues , Daniel Aloise , Eraldo Rezende Fernandes , Michel Dagenais Pre-print Media Attached
15:00 10m Live Q&A		A Large-Scale Comparative Evaluation of IR-Based Tools for Bug LocalizationMSR - Technical Paper Technical Papers Shayan Akbar , Avinash Kak Media Attached
15:10 10m Live Q&A		How Often Do Single-Statement Bugs Occur? The ManySStuBs4J DatasetMSR - Data Showcase Data Showcase A: Rafael-Michael Karampatsis The University of Edinburgh, A: Charles Sutton Google Research Pre-print Media Attached
15:20 10m Live Q&A		Large-Scale Manual Validation of Bugfixing ChangesMSR - Registered Reports Registered Reports A: Steffen Herbold University of Göttingen, A: Alexander Trautsch University of Göttingen, A: Benjamin Ledel Pre-print Media Attached

14:30 - 15:00	Tutorial 1: GDPR ConsiderationsEducation / Technical Papers at MSR:Zoom2 Chair(s): Abram Hindle University of Alberta, Alexander Serebrenik Eindhoven University of Technology Q/A for tutorial (Joining info available on Slack)

14:30 30m Tutorial		Mining Software Repositories While Respecting PrivacyMSR - Tutorial Education A: Jesus M. Gonzalez-Barahona Universidad Rey Juan Carlos Pre-print Media Attached

16:30 - 17:30	Github & OSS DatasetsTechnical Papers / Registered Reports / Keynote / MSR Awards / FOSS Award / Education / Data Showcase / Mining Challenge / MSR Challenge Proposals / Ask Me Anything at MSR:Zoom Chair(s): Olga Baysal Carleton University Q/A & Discussion of Session Papers over Zoom (Joining info available on Slack)

16:30 8m Live Q&A		A New Dataset for Pull Request AcceptanceMSR - Data Showcase Data Showcase A: Xunhui Zhang National University of Defense Technology, China, A: Ayushi Rastogi University of Groningen, The Netherlands, A: Yue Yu College of Computer, National University of Defense Technology, Changsha 410073, China Pre-print Media Attached
16:38 8m Live Q&A		A Mixed Graph-Relational Dataset of Socio-technical Interactions in Open Source SystemsMSR - Data Showcase Data Showcase A: Usman Ashraf , A: Christoph Mayr-Dorn Johannes Kepler University Linz, A: Alexander Egyed Johannes Kepler University, Linz, A: Sebastiano Panichella Media Attached
16:47 8m Live Q&A		A Complete Set of Related Git Repositories Identified via Community Detection Approaches Based on Shared CommitsMSR - Data Showcase Data Showcase A: Audris Mockus , A: Zoe Kotti Athens University of Economics and Business, A: Diomidis Spinellis Athens University of Economics and Business, A: Gabriel Dusing Media Attached
16:55 8m Live Q&A		A Dataset of Enterprise-Driven Open Source SoftwareMSR - Data Showcase Data Showcase A: Diomidis Spinellis Athens University of Economics and Business, A: Zoe Kotti Athens University of Economics and Business, A: Konstantinos Kravvaritis , A: Georgios Theodorou , A: Panos Louridas Athens University of Economics and Business DOI Pre-print Media Attached
17:04 8m Live Q&A		A Dataset for GitHub Repository DeduplicationMSR - Data Showcase Data Showcase A: Diomidis Spinellis Athens University of Economics and Business, A: Zoe Kotti Athens University of Economics and Business, A: Audris Mockus DOI Pre-print Media Attached
17:12 8m Live Q&A		A Dataset and an Approach for Identity Resolution of 38 Million Author IDs extracted from 2B Git CommitsMSR - Data Showcase Data Showcase A: Tanner Fry , A: Tapajit Dey , A: Andrey Karnauch University of Tennessee Knoxville, A: Audris Mockus Pre-print Media Attached
17:21 8m Live Q&A		20-MAD - 20 years of issues and commits of Mozilla and Apache DevelopmentMSR - Data Showcase Data Showcase A: Maëlick Claes University of Oulu, A: Mika Mäntylä University of Oulu Media Attached

16:30 - 17:00	Platforms & DatasetsTechnical Papers / Registered Reports / Keynote / MSR Awards / FOSS Award / Education / Data Showcase / Mining Challenge / MSR Challenge Proposals / Ask Me Anything at MSR:Zoom2 Chair(s): Moritz Beller Facebook, USA Q/A & Discussion of Session Papers over Zoom (Joining info available on Slack)

16:30 7m Live Q&A		RTPTorrent: An Open-source Dataset for Evaluating Regression Test PrioritizationMSR - Technical Paper Technical Papers Toni Mattis Hasso Plattner Institute, University of Potsdam, Patrick Rein Hasso Plattner Institute, Falco Dürsch , Robert Hirschfeld Hasso-Plattner-Institut (HPI), Germany DOI Pre-print Media Attached
16:37 7m Live Q&A		Polyglot and Distributed Software Repository Mining with CROSSFLOWMSR - Technical Paper Technical Papers Konstantinos Barmpis , Patrick Neubauer University of York, UK, Jonathan Co , Dimitris Kolovos University of York, Nicholas Matragkas , Richard Paige McMaster University Media Attached
16:45 7m Live Q&A		Boa Views: Easy Modularization and Sharing of MSR AnalysesMSR - Technical Paper Technical Papers Che Shian Hung , Robert Dyer University of Nebraska - Lincoln Pre-print Media Attached
16:52 7m Live Q&A		Determining the Intrinsic Structure of Public Software Development HistoryMSR - Registered Reports Registered Reports A: Antoine Pietri Inria, A: Guillaume Rousseau Université de Paris and Inria, A: Stefano Zacchiroli Université de Paris and Inria Pre-print Media Attached

Tue 30 Jun
Displayed time zone: (UTC) Coordinated Universal Time

10:30 - 11:00	EvolutionTechnical Papers / Registered Reports / Keynote / MSR Awards / FOSS Award / Education / Data Showcase / Mining Challenge / MSR Challenge Proposals / Ask Me Anything at MSR:Zoom Chair(s): Jürgen Cito MIT Q/A & Discussion of Session Papers over Zoom (Joining info available on Slack)

10:30 7m Live Q&A		Capture the Feature Flag: Detecting Feature Flags in Open-SourceMSR - Technical Paper Technical Papers Jens Meinicke Carnegie Mellon University, Juan Hoyos Universidad Nacional de Colombia, Bogdan Vasilescu Carnegie Mellon University, Christian Kästner Carnegie Mellon University Pre-print Media Attached
10:37 7m Live Q&A		Forking Without Clicking: on How to Identify Software Repository ForksMSR - Technical Paper Technical Papers Antoine Pietri Inria, Guillaume Rousseau Université de Paris and Inria, Stefano Zacchiroli Université de Paris and Inria Pre-print Media Attached
10:45 7m Live Q&A		Visualization of Methods Changeability Based on VCS DataMSR - Technical Paper Technical Papers Sergey Svitkov , Timofey Bryksin JetBrains Research, Saint Petersburg State University Pre-print Media Attached
10:52 7m Live Q&A		Employing Contribution and Quality Metrics for Quantifying the Software Development ProcessMSR - Data Showcase Data Showcase A: Themistoklis Diamantopoulos Electrical and Computer Engineering Dept, Aristotle University of Thessaloniki, A: Michail Papamichail , A: Thomas Karanikiotis , A: Kyriakos Chatzidimitriou Aristotle University of Thessaloniki, A: Andreas Symeonidis Aristotle University of Thessaloniki Pre-print Media Attached

10:30 - 11:00	Apps & BotsTechnical Papers / Registered Reports / Keynote / MSR Awards / FOSS Award / Education / Data Showcase / Mining Challenge / MSR Challenge Proposals / Ask Me Anything at MSR:Zoom2 Chair(s): Ivano Malavolta Vrije Universiteit Amsterdam Q/A & Discussion of Session Papers over Zoom (Joining info available on Slack)

10:30 7m Live Q&A		AndroZooOpen: Collecting Large-scale Open Source Android Apps for the Research CommunityMSR - Data Showcase Data Showcase A: Pei Liu , A: Li Li Monash University, Australia, A: Yanjie Zhao , A: Xiaoyu Sun , A: John Grundy Monash University Media Attached
10:37 7m Live Q&A		Hall-of-Apps: The Top Android Apps Metadata ArchiveMSR - Data Showcase Data Showcase A: Laura Bello-Jiménez , A: Camilo Escobar-Velásquez Universidad de los Andes, A: Anamaria Mojica-Hanke , A: Santiago Cortés-Fernández , A: Mario Linares-Vásquez Universidad de los Andes Media Attached
10:45 7m Live Q&A		Detecting and Characterizing Bots that Commit CodeMSR - Technical Paper Technical Papers Tapajit Dey , Sara Mousavi , Eduardo Ponce University of Tennessee - Knoxville, Tanner Fry , Bogdan Vasilescu Carnegie Mellon University, Anna Filippova , Audris Mockus University of Tennessee - Knoxville Pre-print Media Attached
10:52 7m Live Q&A		Challenges in Chatbot Development: A Study of Stack Overflow PostsMSR - Technical Paper Technical Papers Ahmad Abdellatif Concordia University, Diego Elias Costa Concordia University, Canada, Khaled Badran Concordia University, Rabe Abdalkareem Concordia University, Montreal, Canada, Emad Shihab Concordia University Pre-print Media Attached

11:00 - 12:00	QualityTechnical Papers / Registered Reports / Keynote / MSR Awards / FOSS Award / Education / Data Showcase / Mining Challenge / MSR Challenge Proposals / Ask Me Anything at MSR:Zoom Chair(s): Jens Krinke University College London Q/A & Discussion of Session Papers over Zoom (Joining info available on Slack)

11:00 12m Live Q&A		Beyond the Code: Mining Self-Admitted Technical Debt in Issue Tracker SystemsMSR - Technical Paper Technical Papers Laerte Xavier Universidade Federal de Minas Gerais (UFMG), Fabio da Silva Ferreira , Rodrigo Brito , Marco Tulio Valente Federal University of Minas Gerais, Brazil Pre-print Media Attached
11:12 12m Live Q&A		An Empirical Study on Regular Expression BugsMSR - Technical Paper Technical Papers Peipei Wang North Carolina State University, USA, Chris Brown North Carolina State University, Jamie Jennings North Carolina State University, Kathryn Stolee North Carolina State University Pre-print Media Attached
11:24 12m Live Q&A		Do Explicit Review Strategies Improve Code Review Performance?MSR - Registered Reports Registered Reports A: Pavlína Wurzel Gonçalves , A: Enrico Fregnan , A: Tobias Baum , A: Kurt Schneider Leibniz Universität Hannover, Software Engineering Group, A: Alberto Bacchelli University of Zurich Pre-print Media Attached
11:36 12m Live Q&A		SoftMon: A Tool to Compare Similar Open-source Software from a Performance PerspectiveMSR - Technical Paper Technical Papers Shubhankar Suman Singh IIT Delhi, Smruti Ranjan Sarangi Pre-print Media Attached
11:48 12m Live Q&A		A Study of Potential Code Borrowing and License Violations in Java Projects on GitHubMSR - Technical Paper Technical Papers Yaroslav Golubev JetBrains Research, ITMO University, Maria Eliseeva , Nikita Povarov JetBrains, Timofey Bryksin JetBrains Research, Saint Petersburg State University Pre-print Media Attached

11:00 - 12:00	SecurityData Showcase / Technical Papers at MSR:Zoom2 Chair(s): Dimitris Mitropoulos Athens University of Economics and Business Q/A & Discussion of Session Papers over Zoom (Joining info available on Slack)

11:00 12m Live Q&A		Did You Remember To Test Your Tokens?MSR - Technical Paper Technical Papers Danielle Gonzalez Rochester Institute of Technology, USA, Michael Rath Technische Universität Ilmenau, Mehdi Mirakhorli Rochester Institute of Technology DOI Pre-print Media Attached
11:12 12m Live Q&A		Automatically Granted Permissions in Android appsMSR - Technical Paper Technical Papers Paolo Calciati IMDEA Software Institute, Konstantin Kuznetsov Saarland University, CISPA, Alessandra Gorla IMDEA Software Institute, Andreas Zeller CISPA Helmholtz Center for Information Security Media Attached
11:24 12m Live Q&A		PUMiner: Mining Security Posts from Developer Question and Answer Websites with PU LearningMSR - Technical Paper Technical Papers Triet Le The University of Adelaide, David Hin , Roland Croft , Muhammad Ali Babar The University of Adelaide DOI Pre-print Media Attached
11:36 12m Live Q&A		A C/C++ Code Vulnerability Dataset with Code Changes and CVE SummariesMSR - Data Showcase Data Showcase A: Jiahao Fan New Jersey Institute of Technology, USA, A: Yi Li New Jersey Institute of Technology, USA, A: Shaohua Wang New Jersey Institute of Technology, USA, A: Tien N. Nguyen University of Texas at Dallas Media Attached
11:48 12m Live Q&A		The Impact of a Major Security Event on an Open Source Project: The Case of OpenSSLMSR - Technical Paper Technical Papers James Walden Northern Kentucky University Pre-print Media Attached

14:00 - 15:00	ML4SETechnical Papers / Registered Reports / Keynote / MSR Awards / FOSS Award / Education / Data Showcase / Mining Challenge / MSR Challenge Proposals / Ask Me Anything at MSR:Zoom Chair(s): Kevin Moran William & Mary/George Mason University Q/A & Discussion of Session Papers over Zoom (Joining info available on Slack)

14:00 12m Live Q&A		A Machine Learning Approach for Vulnerability Curation (ACM SIGSOFT Distinguished Paper Award) MSR - Technical Paper Technical Papers Chen Yang Veracode, Inc., Andrew Santosa Veracode, Inc., Ang Ming Yi , Abhishek Sharma Singapore Management University, Singapore, Asankhaya Sharma Veracode, Inc., David Lo Singapore Management University Pre-print Media Attached
14:12 12m Live Q&A		Embedding Java Classes with code2vec: Improvements from Variable ObfuscationMSR - Technical Paper Technical Papers Rhys Compton University of Waikato, Eibe Frank Department of Computer Science, University of Waikato, Panos Patros , Abigail Koay University of Waikato DOI Pre-print Media Attached
14:24 12m Live Q&A		A Study on the Accuracy of OCR Engines for Source Code Transcription from Programming ScreencastsMSR - Technical Paper Technical Papers Abdulkarim Malkadi Florida State University, USA - Jazan University, KSA, Mohammad Alahmadi Florida State University, Sonia Haiduc Florida State University Pre-print Media Attached
14:36 12m Live Q&A		What is the Vocabulary of Flaky Tests?MSR - Technical Paper Technical Papers Gustavo Pinto UFPA, Breno Miranda Federal University of Pernambuco, Supun Dissanayake The University of Adelaide, Marcelo d'Amorim Federal University of Pernambuco, Christoph Treude The University of Adelaide, Antonia Bertolino CNR-ISTI Pre-print Media Attached
14:48 12m Live Q&A		Improved Automatic Summarization of Subroutines via Attention to File ContextMSR - Technical Paper Technical Papers Sakib Haque University of Notre Dame, Alexander LeClair University Of Notre Dame, Lingfei Wu IBM Research, Collin McMillan University of Notre Dame Pre-print Media Attached

16:00 - 17:00	Developer CollaborationTechnical Papers / Registered Reports / Keynote / MSR Awards / FOSS Award / Education / Data Showcase / Mining Challenge / MSR Challenge Proposals / Ask Me Anything at MSR:Zoom Chair(s): Bogdan Vasilescu Carnegie Mellon University Q/A & Discussion of Session Papers over Zoom (Joining info available on Slack)

16:00 10m Live Q&A		Need for tweet. How open-source developers use Twitter to talk about their GitHub workMSR - Technical Paper Technical Papers Hongbo Fang , Daniel Klug , Hemank Lamba , James Herbsleb , Bogdan Vasilescu Carnegie Mellon University Pre-print Media Attached
16:10 10m Live Q&A		Can We Use SE-specific Sentiment Analysis Tools in a Cross-Platform Setting?MSR - Technical Paper Technical Papers Nicole Novielli University of Bari, Fabio Calefato University of Bari, Davide Dongiovanni University of Bari, Daniela Girardi University of Bari, Filippo Lanubile University of Bari DOI Pre-print Media Attached
16:20 10m Live Q&A		GitterCom: A Dataset of Open Source Developer Communications in GitterMSR - Data Showcase Data Showcase A: Esteban Parra Florida State University, A: Ashley Ellis , A: Sonia Haiduc Florida State University Pre-print Media Attached
16:30 10m Live Q&A		The Impact of Dynamics of Collaborative Software Engineering on Introverts: A Study ProtocolMSR - Registered Reports Registered Reports A: Ingrid Nunes Universidade Federal do Rio Grande do Sul (UFRGS), Brazil, A: Christoph Treude The University of Adelaide, A: Fabio Calefato University of Bari Pre-print Media Attached
16:40 10m Live Q&A		Software-related Slack Chats with Disentangled ConversationsMSR - Data Showcase Data Showcase A: Preetha Chatterjee University of Delaware, USA, A: Kostadin Damevski Virginia Commonwealth University, A: Nicholas A. Kraft UserVoice, A: Lori Pollock Pre-print Media Attached
16:50 10m Live Q&A		Traceability Support for Multi-Lingual Software Projects (ACM SIGSOFT Distinguished Paper Award) MSR - Technical Paper Technical Papers Yalin Liu University of Notre Dame, Jinfeng Lin University of Notre Dame, Jane Cleland-Huang University of Notre Dame Media Attached

16:00 - 17:00	Visions & ReflectionsTechnical Papers / Registered Reports / Keynote / MSR Awards / FOSS Award / Education / Data Showcase / Mining Challenge / MSR Challenge Proposals / Ask Me Anything at MSR:Zoom2 Chair(s): Venera Arnaoudova Washington State University Q/A & Discussion of Session Papers over Zoom (Joining info available on Slack)

16:00 15m Live Q&A		The State of the ML-universe: 10 Years of Artificial Intelligence & Machine Learning Software Development on GitHubMSR - Technical Paper Technical Papers Danielle Gonzalez Rochester Institute of Technology, USA, Thomas Zimmermann Microsoft Research, Nachiappan Nagappan Microsoft Research DOI Pre-print Media Attached
16:15 15m Live Q&A		Ethical Mining – A Case Study on MSR Mining Challenges (ACM SIGSOFT Distinguished Paper Award) MSR - Technical Paper Technical Papers Nicolas Gold University College London, Jens Krinke University College London DOI Pre-print Media Attached
16:30 15m Live Q&A		From Innovations to Prospects: What Is Hidden Behind Cryptocurrencies?MSR - Technical Paper Technical Papers Ang Jia Xi'an Jiaotong University, Ming Fan Xi'an Jiaotong University, Xi Xu , Di Cui Xi'an Jiaotong University, Wenying Wei , Zijiang Yang Western Michigan University, Kai Ye , Ting Liu Xi'an Jiaotong University DOI Pre-print Media Attached
16:45 15m Live Q&A		What constitutes Software? An Empirical, Descriptive Study of ArtifactsMSR - Technical Paper Technical Papers Rolf-Helge Pfeiffer Pre-print Media Attached

Accepted Papers

	Title
	An investigation to find motives behind cross-platform forks from Software Heritage dataset (MSR - Mining Challenge): Avijit Bhattacharjee, Sristy Sumana Nath, Shurui Zhou, Debasish Chakroborti, Banani Roy, Chanchal K. Roy, Kevin Schneider
	Cheating Death: A Statistical Survival Analysis of Publicly Available Python Projects (MSR - Mining Challenge): Ali Rao Hamza, Chelsea Parlett-Pelleriti, Erik Linstead
	Exploring the Security Awareness of the Python and JavaScript Open Source Communities (MSR - Mining Challenge): Gabor Antal, Márton Keleti, Peter Hegedus

Call for Papers

This year, the challenge is about mining the Software Heritage Graph Dataset, a very large dataset containing the development history of publicly available software, at the granularity used by state-of-the-art distributed version control systems. Included software artifacts were retrieved from major collaborative development platforms (e.g., GitHub, GitLab) and package repositories (e.g., PyPI, Debian, npm), and stored in a uniform representation: a fully-deduplicated Merkle DAG linking together source code files organized in directories, commits tracking evolution over time, up to full snapshots of version control system (VCS) repositories as observed by Software Heritage during periodic crawls.
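To make the "fully-deduplicated Merkle DAG" idea concrete, here is a minimal sketch in Python. It is a toy model, not the dataset's actual schema or hashing scheme: the `ObjectStore` class, its method names, and the serialization format are all assumptions made for illustration. It only shows why addressing every node by a hash of its content means shared files, directories, and commits are stored once, even across forks.

```python
import hashlib


class ObjectStore:
    """Toy content-addressed store (hypothetical, for illustration only)."""

    def __init__(self):
        self.objects = {}  # id -> (kind, payload)

    def put(self, kind, payload):
        # The identifier depends only on the node type and its bytes,
        # so storing the same node twice yields the same id.
        oid = hashlib.sha1(kind.encode() + b"\0" + payload).hexdigest()
        self.objects.setdefault(oid, (kind, payload))
        return oid

    def put_file(self, data):
        return self.put("file", data)

    def put_directory(self, entries):
        # entries maps names to child ids; sorting gives a canonical form.
        lines = [f"{name} {child}" for name, child in sorted(entries.items())]
        return self.put("directory", "\n".join(lines).encode())

    def put_commit(self, root_dir, parents, message):
        # A commit points to a root directory and to its parent commits.
        return self.put("commit", "\n".join([root_dir, *parents, message]).encode())


store = ObjectStore()
readme = store.put_file(b"hello\n")
main_v1 = store.put_file(b"print('v1')\n")
main_v2 = store.put_file(b"print('v2')\n")

dir_v1 = store.put_directory({"README": readme, "main.py": main_v1})
dir_v2 = store.put_directory({"README": readme, "main.py": main_v2})

c1 = store.put_commit(dir_v1, [], "initial")
c2 = store.put_commit(dir_v2, [c1], "update main")

# A fork that shares the first commit adds no new nodes to the store.
fork_c1 = store.put_commit(dir_v1, [], "initial")
```

In this sketch the unchanged README is stored once although two directories reference it, and the fork's first commit resolves to the node already present.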

Analyses can be based on the Software Heritage Graph Dataset alone or expanded to also include data from other resources such as GHTorrent, the Ultimate Debian Database, or any other dataset about software artifacts included in the dataset (e.g., previous studies about npm, PyPI, etc.). Note that the dataset does not contain the source code files themselves, but refers to them using persistent identifiers that can be used to cross-reference source code files referenced in previous studies/datasets or even retrieve source code of interest from Software Heritage.
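As an illustration of cross-referencing by persistent identifier, the sketch below computes an identifier for a file's content. It assumes the identifiers follow the SWHID scheme (`swh:1:cnt:<hash>`), in which the hash for file contents is the same SHA-1 that Git computes for a blob object; the function name `content_swhid` is hypothetical. Check the dataset documentation before relying on this for a real analysis.

```python
import hashlib


def content_swhid(data: bytes) -> str:
    """Return a content identifier for raw file bytes.

    Assumption: the identifier is the Git blob hash, i.e. SHA-1 over
    the header "blob <length>\\0" followed by the file bytes, prefixed
    with "swh:1:cnt:".
    """
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return "swh:1:cnt:" + hashlib.sha1(header + data).hexdigest()


# Identifier of the empty file, which matches Git's well-known empty blob hash.
empty_id = content_swhid(b"")
print(empty_id)  # swh:1:cnt:e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

Because the identifier depends only on the file bytes, a file hashed locally from another dataset can be matched against the graph without downloading any source code.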

The overall goal is to study public software development, expanding the scope of analysis of previous studies to a novel scale thanks to: (1) a good approximation of the entire corpus of publicly available software, (2) blending together related development histories in a single graph, and (3) abstracting over VCS and package differences, offering a canonical representation of source code artifacts.

Questions that are, to the best of our knowledge, not sufficiently answered and could be answered using this year's dataset include:

Scale: Can previous software mining results be reproduced when looking at all the projects of a given kind rather than the “most starred”? At what point is sampling sufficient?
Cross-repository analysis: How can forking and duplication patterns inform us about software health and risks? How can community forks be distinguished from personal-use forks? What are good predictors of the success of a community fork?
Cross-origin analysis: Is software evolution consistent across different version control systems? Are there VCS-specific development patterns? How does a migration from one VCS to another affect development patterns? Is there a relationship between development cycles and package manager releases?
Graph structure: How tightly coupled are the different layers of the graph? What is the deduplication efficiency across different programming languages? When and where do source code files or directories tend to be reused? How is code shared between different forges?

These are just some of the questions that could be answered using the Software Heritage Graph Dataset. We encourage challenge participants to adapt the above research questions or formulate their own about any hidden knowledge that still defeats discovery in the treasure trove of our collective software commons!

How to Participate in the Challenge

First, familiarize yourself with the Software Heritage Graph Dataset:

Read our MSR 2019 paper about the Software Heritage Graph Dataset and the preprint of our mining challenge proposal, which contains example queries.
Study the documentation of the dataset, which includes the most recent database layout, download information, as well as smaller dataset teasers that you can start with to whet your appetite.
Join the public discussion mailing list to discuss with other challenge participants and chairs how to best exploit the dataset.

Then, use the dataset to answer your research questions, report your findings in a four-page data challenge paper (see information below) and submit your abstract and paper in time (see important dates below). If your paper is accepted, present your results at MSR 2020 in Seoul, South Korea!

Submission

A challenge paper should describe the results of your work by providing an introduction to the problem you address and why it is worth studying, the version of the dataset you used, the approach and tools you used, your results and their implications, and conclusions. Make sure your report highlights the contributions and the importance of your work. See also our open science policy regarding the publication of software and additional data you used for the challenge.

Challenge papers must not exceed 4 pages, plus 1 additional page containing only references, and must conform to the MSR 2020 format and submission guidelines. Each submission will be reviewed by at least three members of the program committee. Submissions should follow the ACM Conference Proceedings Formatting Guidelines (https://www.acm.org/publications/proceedings-template). LaTeX users must use the provided acmart.cls and ACM-Reference-Format.bst without modification, enable the conference format in the preamble of the document (i.e., \documentclass[sigconf,review]{acmart}), and use the ACM reference format for the bibliography (i.e., \bibliographystyle{ACM-Reference-Format}). The review option adds line numbers, thereby allowing referees to refer to specific lines in their comments.

IMPORTANT: MSR 2020 follows the double-blind submission model. Submissions should not reveal the identity of the authors in any way. This means that authors should:

leave out author names and affiliations from the body and metadata of the submitted pdf
ensure that any citations to related work by themselves are written in the third person, for example “the prior work of XYZ [2]” as opposed to “our prior work [2]”
not refer to their personal, lab or university website; similarly, care should be taken with personal accounts on GitHub, Google Drive, etc.
not upload unblinded versions of their paper on archival websites during bidding/reviewing. However, uploading unblinded versions prior to submission is allowed and sometimes unavoidable (e.g., thesis).

Authors having further questions on double blind reviewing are encouraged to contact the Mining Challenge Chairs via email.

Papers must be submitted electronically through EasyChair, should not have been published elsewhere, and should not be under review or submitted for review elsewhere for the duration of consideration. ACM plagiarism policy and procedures shall be followed for cases of double submission. The submission must also comply with the IEEE Policy on Authorship.

Upon notification of acceptance, all authors of accepted papers will receive further instructions for preparing their camera ready versions. At least one author of each accepted paper is expected to register and present the results at MSR 2020 in Seoul, South Korea. All accepted contributions will be published in the electronic conference proceedings.

The dataset as object of study for the challenge can be cited through reference [MSR20DC] below, while the Software Heritage dataset itself and its schema can be referenced via [MSR19SH], which also contains additional sample queries.

@inproceedings{MSR20DC,
  author = {Antoine Pietri and Diomidis Spinellis and Stefano Zacchiroli},
  title = {The {Software Heritage Graph Dataset}: Large-scale Analysis of Public Software Development History},
  publisher = {IEEE},
  year = {2020},
  booktitle = {MSR 2020: The 17th International Conference on Mining Software Repositories},
  preprint = {https://upsilon.cc/~zack/research/publications/msr-2020-challenge.pdf}
}

@inproceedings{MSR19SH,
  author = {Antoine Pietri and Diomidis Spinellis and Stefano Zacchiroli},
  title = {The Software Heritage Graph Dataset: Public software development under one roof},
  publisher = {IEEE},
  year = {2019},
  doi = {10.1109/MSR.2019.00030},
  pages = {138--142},
  booktitle = {MSR 2019: The 16th International Conference on Mining Software Repositories},
  preprint={https://upsilon.cc/~zack/research/publications/msr-2019-swh.pdf}
}

Important Dates

Abstracts due: January 30, 2020 (AOE)
Papers due: February 6, 2020 (AOE)
Author notification: March 2, 2020 (AOE)
Camera ready: March 16, 2020 (AOE)

Open Science Policy

Openness in science is key to fostering progress via transparency, reproducibility and replicability. Our steering principle is that all research output should be accessible to the public and that empirical studies should be reproducible. In particular, we actively support the adoption of open data and open source principles. To increase reproducibility and replicability, we encourage all contributing authors to disclose:

the source code of the software they used to retrieve and analyze the data
the (anonymized and curated) empirical data they retrieved in addition to the challenge dataset
a document with instructions for other researchers describing how to reproduce or replicate the results

Already upon submission, authors can privately share their anonymized data and software on preservation archives such as Zenodo, Figshare (see instructions), and Software Heritage (see instructions). After acceptance, data and software should be made public and referenceable. We also encourage authors to self-archive pre- and postprints of their papers in open, preserved repositories such as arXiv.org.

Best Mining Challenge Paper Award

All submissions will undergo the same review process independent of whether or not they disclose their analysis code or data. However, only accepted papers for which code and data are available on preservation archives, as described in the open science policy above, will be considered for the best mining challenge paper award.

Best Student Presentation Award

As in previous years, there will be a public vote during the conference to select the best mining challenge presentation. This award often goes to authors of compelling work who present an engaging story to the audience. To increase student involvement, only students can compete for this award.

Organization

Antoine Pietri, Inria, France
Diomidis Spinellis, Athens University of Economics and Business, Greece
Stefano Zacchiroli, University Paris Diderot and Inria, France

Dataset documentation
Papers:
- MSR 2019 paper
- MSR 2020 challenge proposal
Public discussion mailing list, among challenge participants and with the chairs
Software Heritage public IRC channel for development discussions: #swh-devel @ irc.freenode.net

Antoine Pietri, Mining Challenge Co-Chair, Inria, France
Diomidis Spinellis, Mining Challenge Co-Chair, Athens University of Economics and Business, Greece
Stefano Zacchiroli, Mining Challenge Co-Chair, Université de Paris and Inria, France

Karim Ali, University of Alberta
Rana Alkadhi, Technical University of Munich, Saudi Arabia
Venera Arnaoudova, Washington State University, United States
Lingfeng Bao, Zhejiang University
Jonathan Bell, Northeastern University, United States
Moritz Beller, Facebook, USA
Stefanie Beyer, University of Klagenfurt, Austria
Fabio Calefato, University of Bari, Italy
Gemma Catolino, Delft University of Technology, Netherlands
Chunyang Chen, Monash University, Australia
Christopher Corley, Independent Researcher, United States
Roberta Coelho, PUC-Rio
Themistoklis Diamantopoulos, Electrical and Computer Engineering Dept, Aristotle University of Thessaloniki, Greece
Vasiliki Efstathiou, Athens University of Economics and Business, Greece
Steffen Herbold, University of Göttingen, Germany
Akinori Ihara, Wakayama University, Japan
Maria Kechagia, University College London, United Kingdom
Ivano Malavolta, Vrije Universiteit Amsterdam, Netherlands
Lucas Nussbaum, University of Lorraine, France
Gustavo A. Oliva, Queen's University, Brazil
Klérisson Paixão, Federal University of Uberlândia, Brazil
Gustavo Pinto, UFPA, Brazil
Sebastian Proksch, Delft University of Technology, Netherlands
