GfK SE and Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany
Title: Fighting (aging-related) bugs
Abstract: “What I find disturbing about rejuvenation ... is that it is being used to compensate for lack of thoroughness in design. ... It's not an engineering approach to a solution.” This is how Lawrence Stabile voiced his concerns about using software rejuvenation in a 2007 Letter to the Editor of IEEE Computer. Indeed, shouldn't we try to make sure that software systems do not age at all, by preventing aging-related bugs from being introduced into the software in the first place, or by detecting them during software testing?
This presentation will review previous work related to the classification of software faults, the empirical analysis of fault types and mitigation techniques for different kinds of software systems, and the stochastic analysis of the effects of such techniques on time to failure, availability, and costs. There are two main points that I want to make: 1) Software rejuvenation is one of many weapons in a software developer's armory for fighting bugs. 2) The extent to which we should employ software rejuvenation can be demonstrated by analyses that account both for different types of faults and for various mitigation techniques.
Short Biography: Michael Grottke is Principal Data Scientist at GfK SE and Adjunct Professor at Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU). He received Master's degrees in economics and in business administration from Wayne State University, Detroit, USA, and FAU, Nürnberg, Germany, in 1999 and 2000, respectively. In 2003, he received the Ph.D. degree from FAU for a thesis dealing with software reliability growth models and the analysis of software failures experienced during systematic testing. Upon finishing his Ph.D., he was awarded a fellowship within the Postdoc program of the German Academic Exchange Service for carrying out research on software aging and rejuvenation at Duke University. He spent a total of three years, from 2004 to 2007, as a Postdoctoral Research Associate and Assistant Research Professor with Kishor Trivedi's group. His sojourn at Duke sparked the collaboration with Allen Nikora on analyzing software faults (including aging-related bugs) in JPL/NASA flight software. Prof. Grottke has served as a Guest Editor for two Special Issues (of Performance Evaluation and the Journal of Systems and Software, respectively) based on previous editions of WoSAR. He is an Associate Editor of the International Journal of Performability Engineering and the IEEE Transactions on Reliability.
Vasilis P. Koutras
University of the Aegean, Greece
Title: Modeling the implementation of software rejuvenation in computer systems: Advances and future trends
Abstract: Software aging is a phenomenon that causes performance degradation and eventual failures in computer systems. Software rejuvenation is a proactive software fault management technique that provides an effective way to counteract such issues preventively. Software rejuvenation is the concept of periodically stopping the running software, cleaning its internal state by garbage collection, defragmentation, flushing operating system kernel tables and reinitializing internal data structures, and restarting it. One of our main research streams in the Reliability Engineering Laboratory (RELab) focuses on modeling how software rejuvenation can be implemented in various computer systems to prevent aging phenomena that can lead to software failures and might cause financial loss or even, indirectly, loss of human life. Over the past years we have modeled the implementation of rejuvenation on redundant systems, clusters and distributed computing systems, VoIP applications and other computer systems, aiming to identify the optimal rejuvenation schedules that can improve the availability, reliability, performability and operational cost indicators of computer systems. We have built on various rejuvenation models that appear in the relevant literature, such as partial and full rejuvenation actions or their combination, and we have introduced the concept of failed rejuvenation. Due to the stochastic nature of aging phenomena and of the occurrence of software failures, as well as the stochastic evolution of computer systems' state over time, we have adopted powerful stochastic tools such as Continuous Time Markov Chains, semi-Markov Processes and Cyclic Non-Homogeneous Markov Chains to model the implementation of rejuvenation and to examine how either the transient or the asymptotic behavior of the above-mentioned dependability and performance measures is affected by rejuvenation schedules.
Although rejuvenation has been studied over the last 25 years, there is still plenty of room for new approaches, developments and, more generally, new ideas on how it can be implemented on computer systems in order to improve the services provided. Recently, the idea of classifying rejuvenation strategies based on the level at which aging is detected and the level at which rejuvenation is applied to counteract aging was introduced and studied. Such an approach paves the way for future research. Beyond this, ideas such as modeling the implementation of rejuvenation on multicomponent systems by using Markov Regenerative Processes (MRGPs) or simulation, enabling opportunistic rejuvenation, or examining how rejuvenation can affect the overall performance capacity of systems may attract considerable research effort in the future.
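As a rough illustration of the kind of Continuous Time Markov Chain analysis mentioned in the abstract, the following is a minimal sketch and not one of the speaker's models. It assumes a simple four-state chain (robust, failure-probable, failed, rejuvenating) with exponential transition times; the state names, the function `steady_state`, and all rate values are hypothetical choices made for this example only. It solves the balance equations in closed form and returns the steady-state probabilities, from which availability follows as the probability of being in either of the two "up" states.

```python
def steady_state(aging, failure, repair, rejuv_trigger, rejuv_done):
    """Steady-state probabilities of an assumed four-state rejuvenation CTMC.

    Transitions (all rates per hour, all hypothetical):
      robust            -> failure_probable  at rate `aging`
      failure_probable  -> failed            at rate `failure`
      failure_probable  -> rejuvenating      at rate `rejuv_trigger`
      failed            -> robust            at rate `repair`
      rejuvenating      -> robust            at rate `rejuv_done`
    """
    # Unnormalized solution of the balance equations, fixing robust = 1.
    robust = 1.0
    # Flow into failure_probable equals flow out of it.
    failure_probable = robust * aging / (failure + rejuv_trigger)
    # Flow into failed equals flow out of failed.
    failed = failure_probable * failure / repair
    # Flow into rejuvenating equals flow out of rejuvenating.
    rejuvenating = failure_probable * rejuv_trigger / rejuv_done

    total = robust + failure_probable + failed + rejuvenating
    return {
        "robust": robust / total,
        "failure_probable": failure_probable / total,
        "failed": failed / total,
        "rejuvenating": rejuvenating / total,
    }


def availability(pi):
    """The system is up in the robust and failure-probable states."""
    return pi["robust"] + pi["failure_probable"]


if __name__ == "__main__":
    # Hypothetical rates: aging after ~10 days, failure ~90 days later,
    # 30-minute repair, rejuvenation every ~2 weeks taking 10 minutes.
    pi = steady_state(1 / 240, 1 / 2160, 2.0, 1 / 336, 6.0)
    print(pi, availability(pi))
```

Setting `rejuv_trigger` to 0 gives the same chain without rejuvenation, so the two cases can be compared. Whether rejuvenation pays off in such a model depends on the chosen rates and on how planned and unplanned downtime are costed, which is the kind of question the optimal-schedule analyses described above address.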
Short Biography: Dr Vasilis P. Koutras is currently an Assistant Professor at the Department of Financial and Management Engineering, School of Engineering, University of the Aegean in Greece. He holds a Bachelor's degree in Mathematics (Probability, Statistics & Operational Research) (2002) from the Department of Mathematics, University of Patras, and a Master's degree in Mathematical Modeling in Physical Sciences and new Technologies (2004) from the Department of Mathematics, University of the Aegean. He received his PhD, on Stochastic Modeling of Software Rejuvenation, from the Department of Financial and Management Engineering, University of the Aegean. He is a member of the Reliability Engineering Laboratory (RELab). He is also Academic Staff at the Hellenic Open University, School of Science & Technology, Post Graduate Programme: Quality Management and Technology, MSc, Course: DIP50 Basic Tools and Methods for Quality Control. He has published over 40 peer-reviewed papers in international journals, book chapters and international conference proceedings. His research interests include stochastic modeling of software rejuvenation, highly available and reliable systems, and computer systems performability indicators; software reliability; Markov and semi-Markov processes; preventive maintenance and optimal maintenance policies; and operations research.