• Information Technology
  • Human Resources
  • Careers
  • Giving
  • Library
  • Bookstore
  • Active Living
  • Continuing Education
  • Go Dinos
  • UCalgary Maps
  • UCalgary Directory
  • Academic Calendar
My UCalgary
Webmail
D2L
ARCHIBUS
IRISS
  • Faculty of Arts
  • Cumming School of Medicine
  • Faculty of Environmental Design
  • Faculty of Graduate Studies
  • Haskayne School of Business
  • Faculty of Kinesiology
  • Faculty of Law
  • Faculty of Nursing
  • Faculty of Nursing (Qatar)
  • Schulich School of Engineering
  • Faculty of Science
  • Faculty of Social Work
  • Faculty of Veterinary Medicine
  • Werklund School of Education
  • Information TechnologiesIT
  • Human ResourcesHR
  • Careers
  • Giving
  • Library
  • Bookstore
  • Active Living
  • Continuing Education
  • Go Dinos
  • UCalgary Maps
  • UCalgary Directory
  • Academic Calendar
  • Libraries and Cultural Resources
View Item 
  •   PRISM Home
  • SurfNet
  • Surfnet
  • View Item
  •   PRISM Home
  • SurfNet
  • Surfnet
  • View Item
JavaScript is disabled for your browser. Some features of this site may not work without it.

On the Effectiveness of Simhash for Detecting Near-Miss Clones in Large Scale Software Systems

Thumbnail
Author
Uddin, M.S.
Roy, C.K.
Schneider, K.A.
Hindle, A.
Accessioned
2015-07-29T19:39:12Z
Available
2015-07-29T19:39:12Z
Issued
2011
Type
unknown
Metadata
Show full item record

Abstract
Clone detection techniques essentially cluster textually, syntactically and/or semantically similar code fragments in or across software systems. For large datasets, similarity identification is costly both in terms of time and memory, and especially so when detecting near-miss clones where lines could be modified, added and/or deleted in the copied fragments. The capability and effectiveness of a clone detection tool mostly depends on the code similarity measurement technique it uses. A variety of similarity measurement approaches have been used for clone detection, including fingerprint based approaches, which have had varying degrees of success notwithstanding some limitations. In this paper, we investigate the effectiveness of simhash, a state of the art fingerprint based data similarity measurement technique for detecting both exact and near-miss clones in large scale software systems. Our experimental data show that simhash is indeed effective in identifying various types of clones in a software system despite wide variations in experimental circumstances. The approach is also suitable as a core capability for building other tools, such as tools for: incremental clone detection, code searching, and clone management.
Refereed
Yes
Url
http://dx.doi.org/10.1109/WCRE.2011.12
Publisher
IEEE
Doi
http://dx.doi.org/10.1109/WCRE.2011.12
Uri
http://hdl.handle.net/1880/50714
Collections
  • Surfnet

Browse

All of PRISMCommunities & CollectionsBy Issue DateAuthorsTitlesSubjectsThis CollectionBy Issue DateAuthorsTitlesSubjects

My Account

LoginRegister

Download Results

Statistics

Most Popular ItemsStatistics by CountryMost Popular Authors

  • Email
  • SMS
  • 403.220.8895
  • Live Chat

Energize: The Campaign for Eyes High

Privacy Policy
Website feedback

University of Calgary
2500 University Drive NW
Calgary, AB T2N 1N4
CANADA

Copyright © 2017