The Mining Software Repositories (MSR) conference is the premier conference for data science, machine learning, and artificial intelligence in software engineering. The goal of the conference is to improve software engineering practices by uncovering interesting and actionable information about software systems and projects using the vast amounts of software data held in sources such as source control systems, defect tracking systems, code review repositories, archived communications between project personnel, question-and-answer sites, CI build servers, and run-time telemetry. Mining this information can help to understand software development and evolution, software users, and runtime behavior; support the maintenance of software systems; improve software design/reuse; empirically validate novel ideas and techniques; support predictions about software development; and exploit this knowledge in planning future development.
The goal of this two-day international conference is to advance the science and practice of software engineering with data-driven techniques. The 17th International Conference on Mining Software Repositories is co-located with ICSE 2020 in Seoul, South Korea, and will be held on May 25-26, 2020.
The important dates for the Technical Track papers are:
- Abstract deadline: Thursday January 9, 2020, 23:59 AOE
- Papers deadline: Thursday January 16, 2020, 23:59 AOE (No deadline extension or grace periods will be provided. Please plan accordingly)
- Author Response Period: February 18 - 21, 2020
- Author Notification: Monday March 2, 2020
- Camera Ready: Monday March 16, 2020, 23:59 AOE
Please see the Call for Papers for all the details.
Highlights
Mon 29 Jun (displayed time zone: UTC, Coordinated Universal Time)
11:00 - 12:00 | Build, CI, & Dependencies | Technical Papers / Registered Reports / Keynote / MSR Awards / FOSS Award / Education / Data Showcase / Mining Challenge / MSR Challenge Proposals / Ask Me Anything at MSR:Zoom | Chair(s): Raula Gaikovina Kula (NAIST) | Q/A & Discussion of Session Papers over Zoom (Joining info available on Slack)
11:00 | 12m | Live Q&A | A Tale of Docker Build Failures: A Preliminary Study | MSR - Technical Paper | Technical Papers | Yiwen Wu (National University of Defense Technology), Yang Zhang (National University of Defense Technology, China), Tao Wang (National University of Defense Technology), Huaimin Wang
11:12 | 12m | Live Q&A | Using Others' Tests to Avoid Breaking Updates | MSR - Technical Paper | Technical Papers | Suhaib Mujahid (Concordia University), Rabe Abdalkareem (Concordia University, Montreal, Canada), Emad Shihab (Concordia University), Shane McIntosh (McGill University)
11:24 | 12m | Live Q&A | A Dataset of Dockerfiles | MSR - Data Showcase | Data Showcase | A: Jordan Henkel (University of Wisconsin–Madison), A: Christian Bird (Microsoft Research), A: Shuvendu K. Lahiri (Microsoft Research), A: Thomas Reps (University of Wisconsin-Madison, USA)
11:36 | 12m | Live Q&A | Empirical Study of Restarted and Flaky Builds on Travis CI | MSR - Technical Paper | Technical Papers | Thomas Durieux (KTH Royal Institute of Technology, Sweden), Claire Le Goues (Carnegie Mellon University), Michael Hilton (Carnegie Mellon University, USA), Rui Abreu (Instituto Superior Técnico, U. Lisboa & INESC-ID)
11:48 | 12m | Live Q&A | LogChunks: A Data Set for Build Log Analysis | MSR - Data Showcase | Data Showcase | A: Carolin Brandt (Delft University of Technology), A: Annibale Panichella (Delft University of Technology), A: Andy Zaidman (TU Delft), A: Moritz Beller (Facebook, USA)
11:30 - 12:00 | Registered Reports Track Discussion | Registered Reports at MSR:Zoom2 | Chair(s): Neil Ernst (University of Victoria), Janet Siegmund (TU Chemnitz) | Open Discussion over Zoom (Joining info available on Slack)
11:30 | 30m | Other | Future Directions for Registered Reports | MSR - Registered Reports | Registered Reports
13:00 - 13:15 | "Opening" & Awards | MSR Plenary at MSR:Zoom | Chair(s): Georgios Gousios (Delft University of Technology), Sunghun Kim (Hong Kong University of Science and Technology), Sarah Nadi (University of Alberta) | Live on YouTube: https://www.youtube.com/watch?v=Qvf7mHa-YYs
13:00 | 15m | Day opening | MSR Opening & Awards | MSR Plenary | Sunghun Kim (Hong Kong University of Science and Technology), Sarah Nadi (University of Alberta), Georgios Gousios (Delft University of Technology)
13:15 - 14:15 | MSR 2020 Keynote | Keynote at MSR:Zoom | Chair(s): Sunghun Kim (Hong Kong University of Science and Technology) | Live on YouTube: https://www.youtube.com/watch?v=Qvf7mHa-YYs (Q/A through Slack)
13:15 | 60m | Keynote | Machine Learning for Developer Productivity at Facebook | Keynote | Keynote | Satish Chandra (Facebook)
14:30 - 15:00 | Tutorial 1: GDPR Considerations | Education / Technical Papers at MSR:Zoom2 | Chair(s): Abram Hindle (University of Alberta), Alexander Serebrenik (Eindhoven University of Technology) | Q/A for tutorial (Joining info available on Slack)
14:30 | 30m | Tutorial | Mining Software Repositories While Respecting Privacy | MSR - Tutorial | Education
15:30 - 16:30 | Software Engineering for Machine Learning: AMA | Ask Me Anything at MSR:Zoom | Chair(s): Tim Menzies (North Carolina State University) | Live on YouTube: https://youtu.be/bLAyj1c_JZ0
15:30 | 60m | Live Q&A | SE4ML AMA | MSR - AMA | Ask Me Anything | P: Christian Kästner (Carnegie Mellon University), P: Mohamed El-Geish (Cisco Systems, Inc), P: Foutse Khomh (Polytechnique Montréal), P: Miryung Kim (University of California, Los Angeles)
Tue 30 Jun (displayed time zone: UTC, Coordinated Universal Time)
11:00 - 12:00 | Security | Data Showcase / Technical Papers at MSR:Zoom2 | Chair(s): Dimitris Mitropoulos (Athens University of Economics and Business) | Q/A & Discussion of Session Papers over Zoom (Joining info available on Slack)
11:00 | 12m | Live Q&A | Did You Remember To Test Your Tokens? | MSR - Technical Paper | Technical Papers | Danielle Gonzalez (Rochester Institute of Technology, USA), Michael Rath (Technische Universität Ilmenau), Mehdi Mirakhorli (Rochester Institute of Technology)
11:12 | 12m | Live Q&A | Automatically Granted Permissions in Android apps | MSR - Technical Paper | Technical Papers | Paolo Calciati (IMDEA Software Institute), Konstantin Kuznetsov (Saarland University, CISPA), Alessandra Gorla (IMDEA Software Institute), Andreas Zeller (CISPA Helmholtz Center for Information Security)
11:24 | 12m | Live Q&A | PUMiner: Mining Security Posts from Developer Question and Answer Websites with PU Learning | MSR - Technical Paper | Technical Papers | Triet Le (The University of Adelaide), David Hin, Roland Croft, Muhammad Ali Babar (The University of Adelaide)
11:36 | 12m | Live Q&A | A C/C++ Code Vulnerability Dataset with Code Changes and CVE Summaries | MSR - Data Showcase | Data Showcase | A: Jiahao Fan (New Jersey Institute of Technology, USA), A: Yi Li (New Jersey Institute of Technology, USA), A: Shaohua Wang (New Jersey Institute of Technology, USA), A: Tien N. Nguyen (University of Texas at Dallas)
11:48 | 12m | Live Q&A | The Impact of a Major Security Event on an Open Source Project: The Case of OpenSSL | MSR - Technical Paper | Technical Papers | James Walden (Northern Kentucky University)
12:00 - 13:00 | DevOps: AMA | Ask Me Anything at MSR:Zoom | Chair(s): Philipp Leitner (Chalmers University of Technology & University of Gothenburg) | Live on YouTube: https://youtu.be/lVorjsH6uWM
12:00 | 60m | Live Q&A | DevOps AMA | MSR - AMA | Ask Me Anything
13:00 - 13:45 | Award Talks | MSR Awards at MSR:Zoom | Chair(s): Andy Zaidman (TU Delft) | Live on YouTube: https://youtu.be/97JCEBPZHkU
13:00 | 22m | Talk | Most Influential Paper Talk | MSR - Award Talk | MSR Awards | A: Michele Lanza (Universita della Svizzera italiana, USI), A: Marco D'ambros, A: Romain Robbes (Free University of Bozen-Bolzano)
13:22 | 22m | Talk | MSR Foundational Contribution Talk | MSR - Award Talk | MSR Awards
13:45 - 14:00 | "Closing" & MSR 2021 | MSR Plenary at MSR:Zoom | Chair(s): Georgios Gousios (Delft University of Technology), Sunghun Kim (Hong Kong University of Science and Technology), Sarah Nadi (University of Alberta) | Live on YouTube: https://youtu.be/97JCEBPZHkU
14:30 - 15:00 | Tutorial 2: Software Analytics | Education / Technical Papers at MSR:Zoom2 | Chair(s): Abram Hindle (University of Alberta), Alexander Serebrenik (Eindhoven University of Technology) | Q/A for tutorial (Joining info available on Slack)
14:30 | 30m | Tutorial | Mutation Testing Meets Software Analytics: A Hands-On Tutorial | MSR - Tutorial | Education
15:00 - 16:00 | Machine Learning for Software Engineering: AMA | Ask Me Anything at MSR:Zoom | Chair(s): Baishakhi Ray (Columbia University, New York) | Live on YouTube: https://youtu.be/cphPhsehw2M
15:00 | 60m | Live Q&A | ML4SE AMA | MSR - AMA | Ask Me Anything | Vincent J. Hellendoorn (University of California at Davis, USA), Michael Pradel (University of Stuttgart), Miltiadis Allamanis (Microsoft Research)
Call for Papers
Scope
The technical track of MSR 2020 solicits novel, high quality submissions on a wide range of topics, including (but not limited to):
- Analysis of software data with the goal of improving software productivity and reliability
- Analysis and modeling of runtime information to optimize deployment, delivery and error handling in software development processes
- Analysis of change patterns and trends to assist in future development
- Analysis of natural language artifacts in software data
- Analysis of software ecosystems and mining of software data across multiple projects
- Approaches, applications, and tools for mining software data
- Artificial intelligence for software engineering
- Characterization, classification, and prediction of software defects based on analysis of software data
- Characterization of bias in mining and guidelines to ensure the quality of results
- Data science for software projects
- Empirical studies on extracting data from large long-lived and/or industrial projects
- Machine learning for software engineering
- Meta-models, exchange formats, and infrastructure tools to facilitate the sharing of extracted data and to encourage reuse and repeatability
- Methods of integrating mined data from various historical sources
- Mining code review data
- Mining execution traces and logs
- Mining human and social aspects of development
- Mining interaction data
- Mining mobile app stores and app reviews
- Mining software licensing and copyrights
- Models for social and development processes in large software projects
- Models of software project evolution based on historical repository data
- Models and processes for improving the quality of machine learning pipelines
- Natural language processing in software engineering
- Prediction and modeling of software quality
- Privacy and ethics in mining software data
- Release engineering, including continuous integration, delivery and deployment
- Search-driven software development, including search techniques to assist developers in finding suitable components and code fragments for reuse, and software search engines
- Software analytics
- Software engineering for artificial intelligence and machine learning
- Energy efficiency of software
- Studies of programming language features and their usage
- Techniques and tools for capturing new forms of software data such as effort data, fine-grained changes, and refactoring
- Techniques to model reliability and defect occurrences
- Visualization techniques and models of mined data
Types of Technical Track Submissions
We accept both full (10 pages plus 2 additional pages of references) and short (4 pages plus 1 additional page of references) papers. Furthermore, in order to facilitate the reviewing process of your paper’s contribution, you should select one of the following paper categories:
1. Research Paper
Full research papers are expected to describe new methodologies and/or provide novel research results, and should be evaluated scientifically. While a high degree of technical rigor is expected for long papers, short research papers should discuss controversial issues in the field, or describe interesting or thought-provoking ideas that are not yet fully developed. Accepted short papers will be presented in a short lightning talk.
Relevant review criteria:
- novelty
- soundness of approach
- relevance to the conference (+ clarity of relation with related work)
- quality of presentation
- quality of evaluation [for long papers]
- ability to replicate [for long papers]
2. Practice Experience
MSR encourages the submission of papers that report on both positive and negative experiences of applying software analytics strategies in an industry/open source organization context. Adapting existing algorithms or proposing new algorithms or approaches for practical use is considered a plus.
Relevant review criteria:
- quality of empirical evaluation
- explicit discussion on the usefulness/impact of the approach in practice
- explicit discussion of any adaptations required to apply an existing/new approach in practice
- quality of presentation
- relevance to the conference (+ clarity of relation with related work)
3. Reusable Tool
MSR actively promotes and recognizes the creation and use of tools that are designed and built not only for a specific research project, but for the MSR community as a whole. Those tools enable other researchers to jumpstart their own research efforts, and also enable reproducibility of earlier work.
Reusable Tool papers can be descriptions of tools built by the authors that can be used by other researchers, and/or descriptions of the use of tools built by others to obtain some specific research results in the area of mining software repositories.
Relevant review criteria:
- evaluation of usefulness/reusability of the tool [for long papers]
- novelty
- quality of presentation (details on tool’s internals, usage, etc.)
- relevance to the conference (+ clarity of relation with related work)
- availability of the tool, clear installation instructions and example data set that allow the reviewers to run the tool
Submission Process
All types of technical papers will be peer-reviewed according to the specified review criteria, hence it is required to choose the right type of paper according to the paper’s major contributions. Submissions should follow the ACM Conference Proceedings Formatting Guidelines (https://www.acm.org/publications/proceedings-template). LaTeX users must use the provided acmart.cls and ACM-Reference-Format.bst without modification, enable the conference format in the preamble of the document (i.e., \documentclass[sigconf,review]{acmart}), and use the ACM reference format for the bibliography (i.e., \bibliographystyle{ACM-Reference-Format}). The review option adds line numbers, thereby allowing referees to refer to specific lines in their comments.
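The formatting requirements above can be sketched as a minimal LaTeX skeleton. This is only an illustration: the title, author block, citation key, and the file name references.bib are placeholders and not part of the call.

```latex
% Conference format; the "review" option adds line numbers for referees.
\documentclass[sigconf,review]{acmart}

\begin{document}

\title{Placeholder Title}

% Placeholder author block (acmart expects at least one author).
\author{Anonymous Author(s)}
\affiliation{\institution{Anonymous Institution}\country{Anonymous Country}}

% In acmart, the abstract is declared before \maketitle.
\begin{abstract}
Placeholder abstract.
\end{abstract}

\maketitle

\section{Introduction}
Placeholder text citing related work~\cite{placeholder-key}.

% ACM reference format for the bibliography.
\bibliographystyle{ACM-Reference-Format}
\bibliography{references} % assumes a hypothetical references.bib

\end{document}
```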
Papers submitted for consideration should not have been published elsewhere and should not be under review or submitted for review elsewhere for the duration of consideration. ACM plagiarism policies and procedures shall be followed for cases of double submission. The submission must also comply with the IEEE Policy on Authorship. Please read the ACM Policy and Procedures on Plagiarism (https://www.acm.org/publications/policies/plagiarism) and the IEEE Plagiarism FAQ (https://www.ieee.org/publications/rights/plagiarism/plagiarism-faq.html) before submitting.
Upon notification of acceptance, all authors of accepted papers will be asked to complete a copyright form and will receive further instructions for preparing their camera ready versions. At least one author of each paper is expected to register and present the results at the MSR 2020 conference. All accepted contributions will be published in the conference electronic proceedings.
A selection of the best papers will be invited to an EMSE Special Issue. The authors of accepted papers that show outstanding contributions to the FOSS community will have a chance to self-nominate their paper for the MSR FOSS Impact Paper Award.
IMPORTANT: MSR 2020 follows the double-blind submission model. Submissions should not reveal the identity of the authors in any way. This means that authors should:
- leave out author names and affiliations from the body and metadata of the submitted pdf
- ensure that any citations to related work by themselves are written in the third person, for example “the prior work of XYZ” as opposed to “our prior work [2]”
- not refer to their personal, lab or university website; similarly, care should be taken with personal accounts on github, bitbucket, Google Drive, etc.
- not upload unblinded versions of their paper on archival websites during bidding/reviewing; however, uploading unblinded versions prior to submission is allowed and sometimes unavoidable (e.g., thesis)
- not advertise their submission number or paper topic on social media accounts. Please be careful about posting your paper number, a description of your submitted paper, or any other information that may make it easy for reviewers to identify your submission.
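One possible way to handle the first two points in LaTeX is sketched below. It assumes the acmart "anonymous" class option, which the call does not mandate; the citation key xyz-prior and the file references.bib are hypothetical. Authors should still check the body and metadata of the generated PDF themselves.

```latex
% Assumption: "anonymous" is used in addition to the required options.
% It replaces the author block with "Anonymous Author(s)" in the output.
\documentclass[sigconf,review,anonymous]{acmart}

\begin{document}

\title{Placeholder Title}

% These details are not printed when "anonymous" is set.
\author{Real Name}
\affiliation{\institution{Real Institution}\country{Real Country}}

\maketitle

\section{Related Work}
% Third-person reference to the authors' own earlier work:
The prior work of XYZ~\cite{xyz-prior} studied a related question.
% Avoid first-person phrasing such as: our prior work~\cite{xyz-prior}

\bibliographystyle{ACM-Reference-Format}
\bibliography{references} % hypothetical references.bib

\end{document}
```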
Please note that double-blind submission should not be an excuse for hiding replication packages or data sets from reviewers, since that effectively hinders the peer-review process. Since access to data and scripts is essential during peer review, we strongly recommend archiving data sets on online archival sites such as dropbox.com, zenodo.org or figshare.com (Instructions available in Open Science Policy below). The latter two can also assign a DOI, which makes the data citable.
Submission Link
Technical papers must be submitted through EasyChair: https://easychair.org/conferences/?conf=msr20
Open Science Policy
Openness in science is key to fostering progress via transparency, reproducibility and replicability. Our steering principle is that all research output should be accessible to the public and that empirical studies should be reproducible. In particular, we actively support the adoption of open data and open source principles. The following guidelines are recommendations and not mandatory. Your choice to use open science or not will not affect the review process for your paper. However, to increase reproducibility and replicability, we encourage all contributing authors to disclose:
- the source code of relevant software used or proposed in the paper, including that used to retrieve and analyze data
- the data used in the paper (e.g., evaluation data, anonymized survey data etc)
- instructions for other researchers describing how to reproduce or replicate the results
Already upon submission, authors can privately share their anonymized data and software on preserved archives such as Zenodo or Figshare (tutorial available here; please make sure that any links shared during peer review are anonymized). Zenodo accepts up to 50GB per dataset (more upon request). There is no need to use Dropbox or Google Drive. Once accepted, an option can be toggled to publish the data and scripts with an official DOI. Zenodo and Figshare accounts can easily be linked with GitHub repositories to automatically archive software releases. In the unlikely case that authors need to upload terabytes of data, Archive.org may be used.
After acceptance, we encourage authors to self-archive pre-prints of their papers in open, preserved repositories such as arXiv.org. This is legal and allowed by all major publishers including ACM and IEEE and it lets anybody in the world reach your paper. Note that you are usually not allowed to self-archive the PDF of the published article (that is, the publisher proof or the Digital Library version). Instead, use the manuscript with reviewer comments addressed, but before applying the camera-ready instructions and templates. Feel free to contact the MSR 2020 PC or proceedings chairs for more details.
Please note that the success of the open science initiative depends on the willingness (and possibilities) of authors to disclose their data and that all submissions will undergo the same review process independent of whether or not they disclose their analysis code or data. We encourage authors who cannot disclose industrial or otherwise non-public data, for instance due to non-disclosure agreements, to provide an explicit (short) statement in the paper.
Deadlines
- Abstract Deadline: Thursday January 9, 2020, 23:59 AOE
- Papers Deadline: Thursday January 16, 2020, 23:59 AOE (No deadline extension or grace periods will be provided. Please plan accordingly)
- Author Response Period: February 18 - 21, 2020
- Author Notification: Monday March 2, 2020
- Camera Ready: Monday March 16, 2020, 23:59 AOE