To increase the transparency and replicability of empirical research, other disciplines have started to offer preregistration and registered reports for studies. With preregistration, authors submit an experimental plan, including hypotheses and expected outcomes, and receive feedback before data are collected. More information on preregistration is available from the Center for Open Science (https://cos.io/rr/). Due to its success elsewhere, we are piloting a registered reports track at this year's MSR.
Mon 29 Jun (all times in UTC)

11:00 - 12:00 | Build, CI, & Dependencies (MSR:Zoom). Chair: Raula Gaikovina Kula (NAIST). Q&A and discussion of session papers over Zoom (joining info available on Slack).
- 11:00, 12m, live Q&A: A Tale of Docker Build Failures: A Preliminary Study (Technical Papers). Yiwen Wu, Yang Zhang, Tao Wang, Huaimin Wang (National University of Defense Technology). Pre-print, media attached.
- 11:12, 12m, live Q&A: Using Others' Tests to Avoid Breaking Updates (Technical Papers). Suhaib Mujahid, Rabe Abdalkareem, Emad Shihab (Concordia University), Shane McIntosh (McGill University). Pre-print, media attached.
- 11:24, 12m, live Q&A: A Dataset of Dockerfiles (Data Showcase). Jordan Henkel, Thomas Reps (University of Wisconsin-Madison), Christian Bird, Shuvendu K. Lahiri (Microsoft Research). Media attached.
- 11:36, 12m, live Q&A: Empirical Study of Restarted and Flaky Builds on Travis CI (Technical Papers). Thomas Durieux (KTH Royal Institute of Technology), Claire Le Goues, Michael Hilton (Carnegie Mellon University), Rui Abreu (Instituto Superior Técnico, U. Lisboa & INESC-ID). DOI, pre-print, media attached.
- 11:48, 12m, live Q&A: LogChunks: A Data Set for Build Log Analysis (Data Showcase). Carolin Brandt, Annibale Panichella, Andy Zaidman (Delft University of Technology), Moritz Beller (Facebook). Pre-print, media attached.

11:30 - 12:00 | Registered Reports Track Discussion (MSR:Zoom2). Chairs: Neil Ernst (University of Victoria), Janet Siegmund (TU Chemnitz). Open discussion over Zoom (joining info available on Slack).
- 11:30, 30m: Future Directions for Registered Reports (Registered Reports).

13:00 - 13:15 | Opening & Awards (MSR Plenary, MSR:Zoom). Chairs: Georgios Gousios (Delft University of Technology), Sunghun Kim (Hong Kong University of Science and Technology), Sarah Nadi (University of Alberta). Live on YouTube: https://www.youtube.com/watch?v=Qvf7mHa-YYs
- 13:00, 15m, day opening: MSR Opening & Awards. Sunghun Kim, Sarah Nadi, Georgios Gousios. Media attached.

14:30 - 15:00 | Tutorial 1: GDPR Considerations (MSR:Zoom2). Chairs: Abram Hindle (University of Alberta), Alexander Serebrenik (Eindhoven University of Technology). Q&A for tutorial (joining info available on Slack).
- 14:30, 30m, tutorial: Mining Software Repositories While Respecting Privacy (Education). Pre-print, media attached.
Tue 30 Jun (all times in UTC)

11:00 - 12:00 | Security (MSR:Zoom2). Chair: Dimitris Mitropoulos (Athens University of Economics and Business). Q&A and discussion of session papers over Zoom (joining info available on Slack).
- 11:00, 12m, live Q&A: Did You Remember To Test Your Tokens? (Technical Papers). Danielle Gonzalez, Mehdi Mirakhorli (Rochester Institute of Technology), Michael Rath (Technische Universität Ilmenau). DOI, pre-print, media attached.
- 11:12, 12m, live Q&A: Automatically Granted Permissions in Android Apps (Technical Papers). Paolo Calciati, Alessandra Gorla (IMDEA Software Institute), Konstantin Kuznetsov (Saarland University, CISPA), Andreas Zeller (CISPA Helmholtz Center for Information Security). Media attached.
- 11:24, 12m, live Q&A: PUMiner: Mining Security Posts from Developer Question and Answer Websites with PU Learning (Technical Papers). Triet Le, David Hin, Roland Croft, Muhammad Ali Babar (The University of Adelaide). DOI, pre-print, media attached.
- 11:36, 12m, live Q&A: A C/C++ Code Vulnerability Dataset with Code Changes and CVE Summaries (Data Showcase). Jiahao Fan, Yi Li, Shaohua Wang (New Jersey Institute of Technology), Tien N. Nguyen (University of Texas at Dallas). Media attached.
- 11:48, 12m, live Q&A: The Impact of a Major Security Event on an Open Source Project: The Case of OpenSSL (Technical Papers). James Walden (Northern Kentucky University). Pre-print, media attached.
Call for Registrations: MSR/EMSE Registered Reports
EMSE (Empirical Software Engineering), in conjunction with the conference on Mining Software Repositories (MSR), is conducting a pilot registered reports (RR) track.
See the associated Author’s Guide. Please email the MSR track chairs - Neil Ernst or Janet Siegmund - for any questions, clarifications, or comments.
What are Registered Reports?
- Methods and proposed analyses are pre-registered and reviewed prior to research being conducted.
- Reduce/eliminate under-powered, selectively reported, researcher-biased studies.
Two Phase Review
- (MSR 2020) Phase 1: Introduction, Methods (including proposed analyses), and Pilot Data (where applicable). In Principle Acceptance.
- (EMSE) Phase 2: full study, after data collection and analysis. Results may be negative!
Additional exploratory analyses may be conducted in Phase 2 if they are justified. Any deviation from the protocol must be justified and logged in detail to ensure replicability. The EMSE journal editors reserve the right to tighten eligibility criteria if necessary.
Phase 1 Criteria
- Importance of the research question(s).
- Logic, rationale, and plausibility of the proposed hypotheses.
- Soundness and feasibility of the methodology and analysis pipeline (including statistical power analysis where appropriate).
- Clarity and degree of methodological detail for replication.
- Will results obtained test the stated hypotheses?
Phase 2 Criteria (via https://osf.io/pukzy/)
- Whether the data are able to test the authors’ proposed hypotheses by satisfying the approved outcome-neutral conditions (such as quality checks, positive controls)
- Whether the Introduction, rationale, and stated hypotheses are the same as in the approved Stage 1 submission (required)
- Whether the authors adhered precisely to the registered experimental procedures
- Whether any unregistered post hoc analyses added by the authors are justified, methodologically sound, and informative
- Whether the authors’ conclusions are justified given the data
Qualitative Research and RR
- There is no reason to assume pre-registration cannot be used for qualitative methods such as card sorting, grounded theory, coding, member checking, etc.
- E.g., Phase 1 may include details on survey respondents, survey instrument design, and data collection techniques.
- OSF Qualitative Pre-Registration
Organizers
Publicity
- Norman Peitek
Program Committee
See sidebar.
Timeline
Date | Milestone
---|---
January 10, 2020 | Study protocols and plans due
January 31, 2020 | Initial protocol reviews
February 14, 2020 | Rebuttals/clarifications due
March 2, 2020 | In Principle Acceptance (IPA) decision notifications
March 16, 2020 | Summary plan / camera-ready due
March 31, 2020 | Phase 1 reports registered with the OSF registry
TBD | Phase 2 submitted to EMSE (deadline to be determined)
Submissions
Submit via EasyChair, which will be used to handle Phase 1 reviews and feedback/rebuttal. EMSE's Editorial Manager system will be used for Phase 2 submissions, with OSF managing the registration. Reviews from Phase 1 will be shared with the Phase 2 reviewers.
Submission Details
Submissions should follow the ACM Conference Proceedings Formatting Guidelines. LaTeX users must use the provided acmart.cls and ACM-Reference-Format.bst without modification, enable the conference format in the preamble of the document (i.e., \documentclass[sigconf,review]{acmart}), and use the ACM reference format for the bibliography (i.e., \bibliographystyle{ACM-Reference-Format}). The review option adds line numbers, thereby allowing referees to refer to specific lines in their comments.
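A minimal preamble matching these requirements might look like the following sketch (the title, author, and bibliography file name are placeholders, not part of the official template):

```latex
% Conference format with line numbers for review, as required above.
\documentclass[sigconf,review]{acmart}

\begin{document}

\title{Should Your Family Travel with You on the Enterprise?}
\author{A. Author}
\affiliation{\institution{Starfleet Academy}\country{United Federation of Planets}}

\maketitle

% Body of the registered report (Introduction, Methods, Pilot Data) ...

% ACM reference format for the bibliography, as required above.
\bibliographystyle{ACM-Reference-Format}
\bibliography{references}

\end{document}
```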
Follow the template requested in the author’s guide to MSR RR submissions.
Page limit for MSR is 4 pages including references.
Review will be unblinded or single-blind. There will be a lightweight rebuttal phase, in which authors have the opportunity to clarify unclear parts of the report; the rebuttal is not an opportunity to change the experimental design.
FAQ
Q. How will self-plagiarism be handled?
A. Self-plagiarism occurs when an author includes verbatim text from their own already-published work. We expect this to be managed using the existing workshop/extension model; there will be sufficient new content in Phase 2 to clearly indicate that it is a new piece of work.
Q. What if I publish my Phase 1 proposal, and then someone scoops me by following the protocol?
A. In practice, this seems quite uncommon. However, OSF has mechanisms to manage embargo periods, so this might be something we consider in the future; currently, the MSR/EMSE model makes embargoes impractical. Tracks such as "new ideas" already pose this potential risk, so we don't anticipate extensive problems.
Q. How does this process deal with exploratory studies, where there is no well-defined hypothesis?
A. For now, we strongly suggest such studies target the New Ideas and Emerging Results track of MSR. We will focus on studies that have a clear, well-formulated hypothesis.
Q. What if my study changes as I gather data?
A. RRs allow deviations from the analysis plan. However, authors will need to provide solid reasons as to why they deviated from it.
Other FAQs on RR in general are at the bottom of the OSF page.
Links
- https://cos.io/prereg/
- See these links
Author's Guide
NB: Please contact the MSR RR track chairs with any questions, feedback, or requests for clarification. Specific analysis approaches mentioned below are intended as examples, not mandatory components.
Title (required)
Provide the working title of your study. It may be the same title that you submit for publication of your final manuscript, but it is not a requirement.
- Example: Should your family travel with you on the enterprise?
- Subtitle (optional): Effect of accompanying families on the work habits of crew members
Authors (required)
At this stage, we believe that an unblinded/single-blind review is most productive.
Structured Abstract (required)
The abstract should describe in 200 words or so:
Background/Context
What is your research about? Why are you doing this research, why is it interesting?
Example: The Enterprise is the flagship of the Federation, and it allows families to travel onboard. However, there are no studies that evaluate how this affects the crew members.
Objective/Aim
What exactly are you studying/investigating/evaluating? What are the objects of the study?
We welcome both confirmatory and exploratory types of studies.
Example: We evaluate whether the frequency of sick days, the work effectiveness and efficiency differ between science officers who bring their family with them, compared to science officers who are serving without their family.
Example: We investigate the problem of frequent Holodeck use on interpersonal relationships with an ethnographic study using participant observation, in order to derive specific hypotheses about Holodeck usage.
Method
How are you addressing your objective? What data sources are you using?
Example: We conduct an observational study and use a between-subject design. To analyze the data, we use a t test or Wilcoxon test, depending on the underlying distribution. Our data come from computer monitoring of Enterprise crew members.
Limitations
Hypotheses / research questions (required)
Clearly state the research hypotheses that you want to test with your study, and a rationalization for the hypotheses.
- Example: Science officers with their family on board have more sick days than science officers without their family.
- Rationale: Since toddlers are often sick, we can expect that crew members with their family onboard need to take sick days more often.
Introduction
Give more details on the bigger picture of your study and how it contributes to that bigger picture. An important component of Phase 1 review is assessing the importance and relevance of the study questions, so be sure to explain this.
Variables (required)
- Independent Variable(s) and their operationalization
- Dependent Variable(s) and their operationalization (e.g., time to solve a specified task)
- Confounding Variable(s) and how their effect will be controlled (e.g., species type (Vulcan, Human, Tribble) might be a confounding factor; we control for it by separating our sample additionally into Human/Non-Human and using an ANOVA (normal distribution) or Friedman (non-normal distribution) to distill its effect).
For each variable, you should give:
- name (e.g., presence of family)
- abbreviation (if you intend to use one)
- description (whether the family of the crew member travels on board)
- scale type (nominal: either the family is present or not)
- operationalization (crew members without family on board vs. crew members with family on board)
Material/objects (required)
Describe any material that you plan to use, and be specific on whether you developed it (and how) or whether it is already defined (e.g., a standard Myers-Briggs Type Indicator).
Example: For sick days, we retrieve the medical records from sick bay (ethics approval pending). For efficiency, we conduct standard interviews with the superior officer and crew members. The questions are the following / can be found on the Web site / Appendix. Furthermore, we observe their performance during a simulated task.
Tasks (optional)
If you use tasks, describe them, how they were designed or from where they are taken and why they are suitable to evaluate the hypotheses / research question
Example: For effectiveness of the crew members, we ask them to sweep a class 2 nebula. We simulate an error in the primary sensor array. Crew members should then run a level 3 diagnostic to spot the error, fix the error, and complete the sweep of the nebula. We measure the time to (i) spot that there is an error, (ii) decide on the correct diagnostic protocol, (iii) fix the error, and (iv) complete the sweep.
Participants/Subjects/sample (required)
Describe how and why you select the sample. If you conduct a meta-analysis, describe the primary studies/work on which you base it.
Example: We recruit crew members from the science department on a voluntary basis. They are our targeted population.
Execution Plan (required)
Describe the experimental protocol.
Example: Each crew member needs to sign the informed consent and agreement to process their data according to GDPR. Then, we conduct the interviews. Afterwards, participants need to complete the simulated task.
Analysis Plan (required)
Descriptive statistics
How do you describe the data? How do you handle outliers?
Example: To represent the number of sick days, we use histograms. Depending on the distribution, we remove values that are 2 standard deviations above the mean as outliers (normal distribution). If the data are non-normal, we use the median and remove values below the 10th / above the 90th percentile.
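As a sketch of the two outlier rules in the example above (the sick-day data are hypothetical, and NumPy is only one way to implement this):

```python
import numpy as np

def remove_outliers_2sd(values):
    """Normal case: drop values more than 2 standard deviations above the mean."""
    values = np.asarray(values, dtype=float)
    return values[values <= values.mean() + 2 * values.std()]

def trim_percentiles(values):
    """Non-normal case: keep only values between the 10th and 90th percentiles."""
    values = np.asarray(values, dtype=float)
    lo, hi = np.percentile(values, [10, 90])
    return values[(values >= lo) & (values <= hi)]

sick_days = [2, 3, 1, 4, 2, 3, 25]  # hypothetical counts; 25 is a clear outlier
print(remove_outliers_2sd(sick_days))  # the 25 is dropped
```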
How do you evaluate the practical significance of the hypotheses?
How are you testing the significance of your results? Be specific about the epistemological paradigm and statistical paradigm you are using. This will help us assign reviewers familiar with the relevant research strategies. See Neto et al. for more information.
- Example: (Frequentist) To test for normality, we use a Shapiro-Wilk test. For efficiency, we use a t test / Wilcoxon test depending on the distribution. To evaluate the effect of species type, we use a two-way ANOVA / Friedman test, depending on the distribution.
- Example: (Bayesian) We derive a posterior predictive distribution by choosing a weakly informative prior, with sick days modeled using a Poisson distribution and the likelihood of species influence modeled using a normal distribution with mean 0 and s.d. σ. We then calculate the 95% and 99% uncertainty intervals and the median m and mean μ of the posterior.
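A minimal sketch of the frequentist pipeline from the first example, using SciPy on hypothetical sick-day counts (the data and the α = 0.05 threshold are illustrative assumptions, not requirements of the guide; "Wilcoxon test" is taken here as the Wilcoxon rank-sum / Mann-Whitney U test, since the design is between-subject):

```python
from scipy import stats

# Hypothetical sick-day counts per science officer.
with_family = [5, 7, 6, 8, 5, 9, 7, 6]
without_family = [3, 4, 2, 5, 3, 4, 3, 2]

ALPHA = 0.05  # assumed significance level

# Shapiro-Wilk normality check on each group.
normal = all(stats.shapiro(g).pvalue > ALPHA for g in (with_family, without_family))

if normal:
    # Normal case: independent-samples t test.
    result = stats.ttest_ind(with_family, without_family)
else:
    # Non-normal case: Wilcoxon rank-sum (Mann-Whitney U) test.
    result = stats.mannwhitneyu(with_family, without_family)

print(f"normal={normal}, p={result.pvalue:.4f}")
```

A preregistered analysis plan would state this decision rule (which test fires under which condition) exactly as in the branches above, so that reviewers can check adherence in Phase 2.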
Examples
- Final studies (phase 1 and 2) are available at this Zotero page
- Example phase 1 registrations can be found at OSF Registry
- A sample phase 1 registration is a study of tax in economics