MSR 2020
Mon 29 - Tue 30 June 2020
co-located with ICSE 2020
Mon 29 Jun 2020 10:36 - 10:42 at MSR:Zoom - Programming Languages & Models Chair(s): Dimitris Kolovos

Despite all of the power that machine learning and artificial intelligence (AI) models bring to applications, much of AI development is currently a fairly ad hoc process. Software engineering and AI development share many of the same languages and tools, but AI development as an engineering practice is still in its early stages. Mining software repositories of AI models enables insight into the current state of AI development. However, much of the relevant metadata about models is not easily extractable directly from repositories and requires deduction or domain knowledge. This paper presents a library called AIMMX that enables simplified AI Model Metadata eXtraction from software repositories. The extractors comprise five modules for extracting AI model-specific metadata: model name, associated datasets, references, AI frameworks used, and model domain. We evaluated AIMMX against 7,998 open source models from three sources: model zoos, arXiv AI papers, and state-of-the-art AI papers. AIMMX extracted metadata with 87% precision and 83% recall. As preliminary examples of how AI model metadata extraction enables studies and tools to advance engineering support for AI development, this paper presents an exploratory analysis of data and method reproducibility over the models in the evaluation dataset and a catalog tool for discovering and managing models. Our analysis suggests that while data reproducibility may be relatively poor, with only 42% of models in our sample citing their datasets, method reproducibility is more common at 72% of models in our sample, particularly among state-of-the-art models. Our collected models are searchable in a catalog that uses existing metadata to enable advanced discovery features for efficiently finding models.

The library is open source and currently available at: https://github.com/ibm/aimmx
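To make the abstract's terms concrete, the sketch below shows one way the five metadata fields could be held in a record, how "AI frameworks used" might be guessed from import statements, and how precision and recall are computed for a set of extracted labels. It does not use the AIMMX API and is not the authors' method: `ModelMetadata`, `FRAMEWORK_IMPORTS`, `detect_frameworks`, and `precision_recall` are hypothetical names chosen for illustration, and the import-matching heuristic is an assumption.

```python
import re
from dataclasses import dataclass, field

# Hypothetical lookup from top-level Python module names to framework labels.
# Illustrative only; this is not the list AIMMX uses.
FRAMEWORK_IMPORTS = {
    "tensorflow": "TensorFlow",
    "torch": "PyTorch",
    "keras": "Keras",
    "mxnet": "MXNet",
    "caffe": "Caffe",
}

# Captures the top-level module of "import x" / "from x import y" at line start.
IMPORT_RE = re.compile(
    r"^\s*(?:import|from)\s+([A-Za-z_][A-Za-z0-9_]*)", re.MULTILINE
)


@dataclass
class ModelMetadata:
    """Hypothetical record holding the five fields named in the abstract."""
    name: str = ""
    datasets: set = field(default_factory=set)
    references: set = field(default_factory=set)
    frameworks: set = field(default_factory=set)
    domain: str = ""


def detect_frameworks(source: str) -> set:
    """Guess frameworks from the import statements in one Python source file."""
    found = set()
    for module in IMPORT_RE.findall(source):
        label = FRAMEWORK_IMPORTS.get(module)
        if label is not None:
            found.add(label)
    return found


def precision_recall(predicted: set, actual: set) -> tuple:
    """Precision = TP / |predicted|, recall = TP / |actual| (0.0 if empty)."""
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Usage: extract from a snippet, then score against hand-labelled ground truth.
snippet = "import torch.nn as nn\nfrom tensorflow import keras\nimport os\n"
meta = ModelMetadata(name="example-model", frameworks=detect_frameworks(snippet))
scores = precision_recall(meta.frameworks, {"PyTorch", "Keras"})
```

Here the snippet yields the labels PyTorch and TensorFlow; against a ground truth of PyTorch and Keras, one of two predictions is correct and one of two true labels is found. How the paper aggregates precision and recall across its five modules is not stated in the abstract, so the per-set calculation above is only the textbook definition.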

Mon 29 Jun

Displayed time zone: (UTC) Coordinated Universal Time

10:30 - 11:00
Programming Languages & Models
Technical Papers / Registered Reports / Keynote / MSR Awards / FOSS Award / Education / Data Showcase / Mining Challenge / MSR Challenge Proposals / Ask Me Anything at MSR:Zoom
Chair(s): Dimitris Kolovos (University of York)

Q/A & Discussion of Session Papers over Zoom (Joining info available on Slack)

10:30
6m
Live Q&A
An Empirical Study on the Impact of Deimplicitization on Program Comprehension
Registered Reports
Jürgen Cito (MIT), Jiasi Shen (Massachusetts Institute of Technology), Martin C. Rinard (MIT)
10:36
6m
Live Q&A
AIMMX: Artificial Intelligence Model Metadata Extractor
Technical Papers
Jason Tsay (IBM Research), Alan Braz (IBM Research), Martin Hirzel (IBM Research), Avraham Shinnar (IBM Research), Todd Mummert
10:42
6m
Live Q&A
Using Large-Scale Anomaly Detection on Code to Improve Kotlin Compiler
Technical Papers
Timofey Bryksin (JetBrains Research, Saint Petersburg State University), Victor Petukhov (JetBrains, ITMO University), Ilya Alexin, Stanislav Prikhodko, Alexey Shpilman, Vladimir Kovalenko (TU Delft), Nikita Povarov (JetBrains)
10:48
6m
Live Q&A
An Empirical Study of Method Chaining in Java
Technical Papers
Tomoki Nakamaru (Graduate School of Information Science and Technology, The University of Tokyo), Tomomasa Matsunaga, Tetsuro Yamazaki (Graduate School of Information Science and Technology, The University of Tokyo), Soramichi Akiyama (Department of Creative Informatics, The University of Tokyo), Shigeru Chiba (The University of Tokyo)
10:54
6m
Live Q&A
Painting Flowers: Reasons for Using Single-State State Machines in Model-Driven Engineering
Technical Papers
Nan Yang (Eindhoven University of Technology, The Netherlands), Pieter Cuijpers, Ramon Schiffelers (Eindhoven University of Technology and ASML, the Netherlands), Johan Lukkien, Alexander Serebrenik (Eindhoven University of Technology)