Software maintenance research that is empirically valid and useful in practice

DE GRUYTER OLDENBOURG
it - Information Technology 2016; 58(3): 145-149

Self-Portrayals of GI Junior Fellows

Elmar Juergens*

DOI 10.1515/itit-2016-0014
Received February 29, 2016; accepted March 5, 2016

Abstract: For successful software systems, maintenance efforts dominate initial development costs by far. However, research and practice still place most of the attention on the initial development phase. In consequence, companies suffer from mission-critical systems that are extremely hard to maintain. To better support practitioners in writing maintainable software, we need both an empirically sound understanding of the software properties that help or hinder maintenance and tools that are useful in practice. This requires work that includes both research and practitioner perspectives. In this article, I outline my work towards this goal.

Keywords: Software maintenance, software evolution, static analysis, empirical software engineering.

ACM CCS: Software and its engineering → Software notations and tools → Software maintenance tools

1 Introduction

Successful software is used for decades. This especially holds for business-critical software that is fundamental to the operation of companies. Since requirements and infrastructure change over time, decades of use really mean decades of continuous adaptation and extension of the underlying source code. To support this, software should be built in a way that makes long-term adaptation and extension easy.

Unfortunately, most software systems in practice are not built this way. One pattern that emerged quickly in our research group's interactions with industry was that most teams push their software to or beyond their team's limits. At some point, we began to jokingly understand statements like "our system is very complex" as "our software has become utterly unmaintainable". Ironically, this turned out to be true more often than not.
The source code of these systems is disproportionately expensive and error-prone to maintain.

The observation that software maintenance can cause code to decay is not new. For example, Parnas [1] observed this in 1994, and Eick et al. [2] in 2001. To counteract code decay, properties that influence software maintenance must be measured and acted upon on a continuous basis. This requires a thorough understanding of the relevant properties and suitable tools for their inspection.

Unfortunately, we lack both. While there is a substantial body of research on software maintenance, including code metrics, a lot of it commits the fallacy of focusing on what is easy rather than relevant to measure. The software maintainability index [3], for example, is computed as depicted in Figure 1. The problem with such measures is that they lack a strong relationship to maintenance activities. Changing software such that this index improves thus, at least in many cases, does not make maintenance activities easier. In consequence, many of the proposed metrics are useless in practice. Most teams thus do not have effective measures to spot maintenance issues early, when they are still cheap to repair.

The goal of my research is to help practitioners build more maintainable software.

Figure 1: Computation of the maintainability index.

*Corresponding author: Elmar Juergens, CQSE GmbH, D-85748 Garching near Munich, Germany; and TU München, D-85748 Garching near Munich, Germany, e-mail: juergens@cqse.eu
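
To make the maintainability index discussed above more tangible, the following Python sketch computes a commonly cited three-metric form of the index: 171 - 5.2 ln(V) - 0.23 G - 16.2 ln(LOC), with V the average Halstead volume, G the average cyclomatic complexity and LOC the average lines of code per module. Whether this matches Figure 1 term by term is an assumption, since the figure is not reproduced in this text; the function and parameter names are hypothetical.

```python
import math


def maintainability_index(avg_halstead_volume: float,
                          avg_cyclomatic_complexity: float,
                          avg_loc: float) -> float:
    """Three-metric maintainability index (commonly cited form, assumed here).

    All inputs are per-module averages. Volume and LOC must be positive
    because the formula takes their natural logarithm.
    """
    if avg_halstead_volume <= 0 or avg_loc <= 0:
        raise ValueError("volume and lines of code must be positive")
    return (171.0
            - 5.2 * math.log(avg_halstead_volume)
            - 0.23 * avg_cyclomatic_complexity
            - 16.2 * math.log(avg_loc))


if __name__ == "__main__":
    # Hypothetical averages for two versions of the same system.
    before = maintainability_index(1000.0, 10.0, 100.0)
    after = maintainability_index(1000.0, 10.0, 50.0)
    print(f"before: {before:.2f}, after: {after:.2f}")
```

In this form, the value depends only on averaged size, volume and complexity figures. It rises, for example, whenever the average number of lines per module falls, regardless of whether any concrete maintenance task actually becomes easier.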

2 Approach

I am convinced that work towards this goal must combine perspectives from both research and practice. My work thus spans both fields.

On the research side, I am interested in empirical analyses that further our understanding of which properties of software are meaningful to measure and worthwhile to improve in order to make software more maintainable.

Meaningful measures are necessary, but not sufficient, to create impact. On the side of industrial practice, my goal is thus to understand the further requirements that must be met to achieve impact.

Finally, I want to further the exchange between research and practice in this area.

2.1 Meaningful analyses

My research goal is to further our empirical knowledge about which properties of software are really harmful for software maintenance and thus deserve special attention during development. I want to illustrate this using code duplication as an example.

Programming languages allow the creation of reusable abstractions. This allows functionality that is required multiple times to be implemented in a single place only. Advances in programming language development (e.g. generics, closures, mixins) often aim to make it easier to create such reusable abstractions. Surprisingly, however, copy & paste is still the most widely used reuse mechanism in practice, even in systems written in modern programming languages. Figure 2 displays a duplicated code fragment from a Java open-source system.

Code duplication can have negative consequences for software maintenance, since changes must be performed in multiple locations instead of one, thus increasing effort. In large systems, individual instances of code duplication can easily be overlooked. The inconsistencies resulting from incomplete changes can cause bugs.

But how relevant is this problem in practice? Copy & paste is often conceptually simpler than creating a reusable abstraction.
If maintenance problems rarely arise, could copy & paste not be a viable development style?

To quantify the impact of code cloning on software evolution, I performed an empirical study [4]. For this, I searched industrial software projects for code duplication with slight differences between the individual copies, such as in Figure 2. I then sat down with the developers of the system and inspected the clones to determine whether the differences were intentional and, if not, whether they represented a bug.

In the five systems we analyzed, we found more than 100 bugs, including critical ones that could crash the program or lead to loss of data. This surprised us, since all of the systems were in production at the time of analysis. More importantly, we learned that roughly every second time a difference was unintentional, it represented a bug. This observation held independently of programming language (we studied systems in Java, C# and COBOL), size and age (between 2 and 17 years).

In this study, I personally learned how important the involvement of industrial software and developer experience is for the field of software maintenance research, probably more important than in computer science areas that are easier to formalize. I have hence made it the foundation of my research.

We have since extended our work on clone detection from source code to models [5, 6] and requirements [7]. Furthermore, we studied other aspects such as software architecture evolution [8] and source code comments [9].

Figure 2: Duplicated code in a Java open-source system.
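
The kind of inconsistency examined in the study can be illustrated with a small Python sketch. It is not the detection approach used in [4]; it merely compares two hypothetical copies of a fragment line by line, ignoring whitespace differences, and reports where they diverge. The fragments, the `log` call inside them and all function names are assumptions made for illustration.

```python
import difflib

# Two hypothetical copies of the same fragment. COPY_A received a fix
# (handling a missing price) that was never applied to COPY_B.
COPY_A = """
    if price is None:
        price = 0
    total = total + price
    log(total)
"""

COPY_B = """
    total  =  total + price
    log(total)
"""


def normalize(line: str) -> str:
    """Collapse whitespace so that formatting-only differences are ignored."""
    return " ".join(line.split())


def inconsistencies(copy_a: str, copy_b: str) -> list[tuple[str, str]]:
    """Return (lines_in_a, lines_in_b) pairs where the two copies differ."""
    a = [normalize(line) for line in copy_a.splitlines() if line.strip()]
    b = [normalize(line) for line in copy_b.splitlines() if line.strip()]
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    differences = []
    for tag, a_start, a_end, b_start, b_end in matcher.get_opcodes():
        if tag != "equal":
            differences.append((" | ".join(a[a_start:a_end]),
                                " | ".join(b[b_start:b_end])))
    return differences


if __name__ == "__main__":
    for in_a, in_b in inconsistencies(COPY_A, COPY_B):
        print(f"only in A: {in_a!r}  only in B: {in_b!r}")
```

A report like this only points at a difference. Whether the difference is intentional, and whether an unintentional one is a bug, still has to be decided together with the developers, which is the manual step described above.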

2.2 Impact in practice

The goal of my work in practice is to help research results achieve impact on how practitioners maintain software. Meaningful analyses are necessary, but not sufficient, to achieve this. I would like to illustrate this again using code duplication as an example.

To give a better impression of how it feels to maintain a system that contains a substantial amount of duplication, I use a so-called SeeSoft visualization [10]. SeeSoft visualizations simply zoom out to fit more source code into a figure, as depicted in Figure 3. Each character is represented using a single pixel. Layout and syntax highlighting are preserved. This allows us to cram about 5000 lines of code into a single diagram. Figure 4 depicts code from a business information system written in C# that was 3 years old at the time of analysis.

Figure 3: SeeSoft visualization with one pixel per character of code.

Figure 4: SeeSoft visualization of a part of a code base.

I use the dimension of color to depict code duplication. All code fragments that are copies of each other (modulo small differences in whitespace, comments or names) are depicted in the same color. (The colors themselves are chosen randomly.) The result is displayed in Figure 5.

Figure 5: SeeSoft visualization of a part of a code base, depicting all code duplication in it.

In this system, clones abound. This is no exception. It is common for a clone detection tool to discover tens of thousands of individual copies in a typical industrial system.

Much to our surprise, the typical initial reaction of a developer to this list of findings was to uninstall our clone detection tool. Why? Because while meaningful in principle, the first generation of clone detection tools was perceived as mostly useless in practice, since it did not fit well enough into existing development processes. Developers expected immediate, relevant feedback on their own work. Instead, they received false-positive-ridden overnight feedback on the whole system.

These insights from practitioners caused us to rethink our clone detection research. For example, we changed our detection algorithms to work incrementally [11, 12] to provide quicker feedback, we analyzed clone evolution and tracking to create robust filter and blacklisting capabilities, we developed classifiers to differentiate between classes of clones based on how easily they can be removed [13], and so on.

It also showed me how important the practitioner perspective is for having impact in this field and motivated me to co-found a company, CQSE GmbH, dedicated to helping development teams build better software. At the time of writing, CQSE employs 20 computer scientists, 11 of them with a PhD in computer science.

2.3 Exchange between research and practice

My goal as a Junior Fellow is to increase the dialog between research and practice.

To transfer research insights, I frequently speak at industry conferences, such as OOP, W-JAX, JAX, Clean Code Days, SEACON, Software Quality Days, Teamconf, BASTA and others. My central motivation here is to convey both positive and negative results to make practitioners aware of the fact that a substantial body of research exists from which they could draw.

To transfer industry reality, I hold guest lectures for students or give talks at universities, such as TU Munich, TU Braunschweig or the University of Passau. Please feel free to contact me if you think that my perspective on research and practice could be useful in your context.

Finally, our company has implemented an industrial-strength analysis tool suite, called Teamscale, that is used by many companies. We provide it for free for research and teaching.

3 Conclusion

Software permeates our lives.
The more successful a software system, the longer it has to be maintained. For this, software must be written in a way that allows easy maintenance. This is often not the case. To alleviate this, we need both a sound empirical understanding of the properties of software that really matter for maintenance and tools that fulfill practitioners' requirements to be actually useful in practice. I am convinced that we must close the gap between research and practice in this area to succeed.

References

1. D. L. Parnas. Software aging. In Proceedings of the 16th International Conference on Software Engineering (pp. 279-287). IEEE Computer Society Press, May 1994.
2. S. G. Eick, T. L. Graves, A. F. Karr, J. S. Marron, A. Mockus. Does code decay? Assessing the evidence from change management data. IEEE Transactions on Software Engineering, 27(1), 1-12, 2001.
3. P. Oman, J. Hagemeister. Metrics for assessing a software system's maintainability. In Proceedings of IEEE Conference on Software Maintenance (pp. 337-344). November 1992.
4. E. Juergens, F. Deissenboeck, B. Hummel, S. Wagner. Do code clones matter? In Proceedings of the 31st International Conference on Software Engineering (pp. 485-495). IEEE Computer Society, May 2009.
5. F. Deissenboeck, B. Hummel, E. Juergens, B. Schaetz, S. Wagner, J. F. Girard, S. Teuchert. Clone detection in automotive model-based development. In Proceedings of the 30th International Conference on Software Engineering (pp. 603-612), May 2008.
6. B. Hummel, E. Juergens, D. Steidl. Index-based model clone detection. In Proceedings of the 5th International Workshop on Software Clones (pp. 21-27), May 2011.
7. E. Juergens, F. Deissenboeck, M. Feilkas, B. Hummel, B. Schaetz, S. Wagner, C. Dohmann, J. Streit. Can clone detection support quality assessments of requirements specifications? In Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering, Volume 2 (pp. 79-88), May 2010.
8. M. Feilkas, D. Ratiu, E. Juergens. The loss of architectural knowledge during system evolution: An industrial case study. In Proceedings of the 17th IEEE International Conference on Program Comprehension (pp. 188-197), May 2009.
9. D. Steidl, B. Hummel, E. Juergens. Using network analysis for recommendation of central software classes. In Proceedings of the 19th IEEE Working Conference on Reverse Engineering (pp. 93-102), October 2012.
10. S. G. Eick, J. L. Steffen, E. E. Sumner Jr. Seesoft - A tool for visualizing line oriented software statistics. IEEE Transactions on Software Engineering, 18(11), 957-968, 1992.
11. B. Hummel, E. Juergens, L. Heinemann, M. Conradt. Index-based code clone detection: incremental, distributed, scalable. In Proceedings of the IEEE International Conference on Software Maintenance (pp. 1-9), September 2010.
12. V. Bauer, L. Heinemann, B. Hummel, E. Juergens, M. Conradt. A framework for incremental quality analysis of large software systems. In Proceedings of the IEEE International Conference on Software Maintenance (pp. 537-546), September 2012.
13. D. Steidl, S. Eder. Prioritizing maintainability defects based on refactoring recommendations. In Proceedings of the 22nd International Conference on Program Comprehension (pp. 168-176), June 2014.

Bionotes

Dr. Elmar Juergens
CQSE GmbH and TU München, D-85748 Garching near Munich, Germany
juergens@cqse.eu

Dr. Elmar Juergens is co-founder of CQSE GmbH and post-doctoral researcher at the Institute for Informatics at the Technical University of Munich. His PhD thesis on clone detection received the Software Engineering Award of the Ernst-Denert-Stiftung. In 2015 he received a junior fellowship of the Gesellschaft für Informatik.