Dispersion of changes in cloned and non-cloned code
Currently, the impacts of clones in software maintenance activities are being investigated by different researchers in different ways. Comparative stability analysis of cloned and non-cloned regions of a subject system is a well-known way of measuring the impacts where the hypothesis is that, the more a region is stable the less it is harmful for maintenance. Each of the existing stability measurement methods lacks to address one important characteristic, dispersion, of the changes happening in the cloned and non-cloned regions of software systems. Change dispersion of a particular region quantifies the extent to which the changes are scattered over that region. The intuition is that, more dispersed changes require more efforts to be spent in the maintenance phase. Measurement of Dispersion requires the extraction of method genealogies. In this paper, we have measured the dispersions of changes in cloned and non-cloned regions of several subject systems using a concurrent and robust framework for method genealogy extraction. We implemented the framework on Actor Architecture platform which facilitates coarse grained parallellism with asynchronous message passing capabilities. Our experimental results on 12 open-source subject systems written in three different programming languages (Java, C and C#) using two clone detection tools suggest that, the changes in cloned regions are more dispersed than the changes in non-cloned regions. Also, Type-3 clones exhibit more dispersion as compared to the Type-1 and Type-2 clones. The subject systems written in Java and C show higher dispersions as well as increased maintenance efforts as compared to the subject systems written in C#.