The Scent of Deep Learning Code: An Empirical Study
Deep learning practitioners often focus on improving model accuracy rather than the interpretability of their models. As a result, deep learning applications are inherently complex in their structures, and they must continuously evolve through code changes and model updates. Given these confounding factors, developers are likely to violate recommended programming practices in their deep learning applications. In particular, code quality might suffer from the drive for higher model performance. Unfortunately, the code quality of deep learning applications has rarely been studied to date. In this paper, we conduct an empirical study using 118 open-source software systems from GitHub, contrasting deep learning-based and traditional systems in terms of their code quality. We have several major findings. First, deep learning applications smell largely like traditional ones; however, the long lambda expression, long ternary conditional expression, and complex container comprehension smells occur more frequently in deep learning projects. That is, deep learning code involves more complex or longer expressions than traditional code does. Second, the number of code smells increases across the releases of deep learning applications. Third, we find that code smells co-occur with software bugs in the deep learning code, which confirms our conjecture about the degraded code quality of deep learning applications.
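To make the first finding concrete, the sketch below shows one hypothetical instance of each of the three smells, written in Python, the language in which these smells are typically defined. The function names and values are illustrative assumptions, not code drawn from the studied systems.

```python
# Illustrative (hypothetical) examples of the three smells named above.

# Long lambda expression: a loss function crammed into a single lambda
# instead of a named, documented function.
weighted_mse = lambda y_true, y_pred, w: sum(
    wi * (yt - yp) ** 2 for yt, yp, wi in zip(y_true, y_pred, w)
) / max(len(y_true), 1)

# Long ternary conditional expression: nested conditionals on one line,
# here selecting a learning rate by training epoch.
def pick_lr(epoch):
    return 1e-2 if epoch < 10 else (1e-3 if epoch < 50 else (1e-4 if epoch < 90 else 1e-5))

# Complex container comprehension: nested loops and filtering packed
# into a single comprehension, here min-max normalizing each batch.
def normalized_batches(batches):
    return [
        [(x - min(b)) / (max(b) - min(b)) for x in b]
        for b in batches
        if b and max(b) != min(b)
    ]
```

Each snippet is functionally correct but compresses several concerns into one expression, which is the pattern the corresponding smell detector flags.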