API Usage Templates via Structural Generalization

dc.contributor.advisor: Walker, Robert James
dc.contributor.author: Mahmoud, May Abdelrheem Sayed
dc.contributor.committeemember: Denzinger, Jorg
dc.contributor.committeemember: Maurer, Frank O.
dc.contributor.committeemember: Aycock, John Daniel
dc.contributor.committeemember: Hindle, Abram
dc.date: 2023-11
dc.date.accessioned: 2023-05-09T15:33:56Z
dc.date.available: 2023-05-09T15:33:56Z
dc.date.issued: 2023-05-03
dc.description.abstract: Application programming interfaces (APIs) are key in software development, but determining how to use one can be challenging. Developers often refer to a small set of API usage examples, analyzing the information in them to understand the API usage and adapting them to their own context. Generalization of these examples would aid in understanding their commonalities and differences, thereby reducing information overload. Work on API usage mining seeks recurrent information in usage examples. Some approaches seek frequent subsequences of method calls (e.g., Monperrus et al., 2010; Wasylkowski and Zeller, 2011; Fowkes and Sutton, 2016). Others use graph-based representations, applying frequent subgraph mining techniques (e.g., Nguyen et al., 2009; Amann et al., 2019). However, all such approaches focus on frequently occurring commonalities; this results in either excluding variations in the usage of the API elements in similar contexts or subdividing such variations across several patterns, forcing developers to manually determine variability in the API elements’ usage. Approaches that aim to select the best examples (e.g., Moreno et al., 2013) ignore variation. Approaches that generate examples (e.g., Barnaby et al., 2020) focus on producing maximally succinct examples rather than representing whatever commonality is present.
In this thesis, we propose ASGard (for API usage templates via Structural Generalization), a novel approach that automatically generates API usage templates from usage examples based on the generalization of the examples’ syntactic structure and some semantic structure. API usage templates are a code-based representation generalizing similar API usage contexts, showing the commonality of the usage examples, where the varying aspects of the input examples are replaced with structural variables intended as placeholders. ASGard takes as input a set of API usage examples and a simple indication of the API of interest. We proceed in two phases. (1) For the sake of improved performance, we cluster the examples based on the similarity of the API usage. (2) We then use an approximation of the formalism of E-generalization (Burghardt, 2005) to infer API usage templates from the examples. We start by matching the nodes of the ASTs of the examples, seeking to preserve common elements in the nodes while abstracting away the differences. The generalization proceeds iteratively, permitting increasing abstraction of the template as long as no API usage information is eliminated. The final templates are representations of the generalized ASTs.
We perform a manual evaluation of the output templates from ASGard, which generalize a set of 231 usage examples across 5 different APIs, finding that our approach provides a mean 62% coverage of the API usage elements found in the usage examples as opposed to 48% coverage by the best alternative. Furthermore, we automatically evaluate the templates from our approach and the code representation of the patterns generated from PAM and MUDetect (two prominent API usage mining approaches), using a total of 1,954 API usage examples across 59 different APIs. We measure two aspects of the quality of the resulting templates: (1) how complete each template is relative to each concrete example; and (2) how well each template set compresses the set of API usage examples. We find that, compared to the output from PAM and MUDetect, ASGard provides templates that have superior completeness (51% vs. 12% for PAM and 25% for MUDetect) and far superior compression (81% vs. 54% for PAM and 26% for MUDetect). We perform a user study with 12 participants to compare the use of ASGard templates against MUDetect in solving programming tasks. We find that participants solved the programming tasks in significantly less time with ASGard: 48% for a coding task and 31% for a debugging task. Participants expressed a preference for using ASGard templates and perceived that the approach helped them better understand the API usage; they were more willing to use the approach again than the best alternative.
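Illustrative note (not part of the thesis record): the idea of matching AST nodes, keeping what two examples share, and replacing what differs with placeholder variables can be sketched with plain syntactic anti-unification. This is a minimal sketch under stated assumptions, not ASGard's algorithm: it ignores the clustering phase, the iterative abstraction, and the equational background theory that E-generalization adds. The tuple-based tree encoding, the function name anti_unify, and the $V placeholder naming are all hypothetical choices made for this example.

```python
from itertools import count


def anti_unify(a, b, holes=None, fresh=None):
    """Generalize two trees by plain syntactic anti-unification.

    Trees are nested tuples ("label", child, ...) or strings for leaves
    (an assumed toy encoding, not a real AST format).
    """
    if holes is None:
        holes = {}        # maps a differing (left, right) pair to its variable
    if fresh is None:
        fresh = count(1)  # supplies numbers for new variables
    if a == b:
        return a          # identical subtrees are kept as-is
    # Same node label and arity: keep the node, generalize children pairwise.
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and a[0] == b[0] and len(a) == len(b)):
        children = tuple(anti_unify(x, y, holes, fresh)
                         for x, y in zip(a[1:], b[1:]))
        return (a[0],) + children
    # Otherwise abstract the difference into a placeholder variable,
    # reusing the same variable when the same pair differs again.
    if (a, b) not in holes:
        holes[(a, b)] = f"$V{next(fresh)}"
    return holes[(a, b)]


# Two hypothetical usage examples of a "read" call.
ex1 = ("call", ("name", "reader"), "read", ("arg", ("name", "buffer")))
ex2 = ("call", ("name", "input"), "read", ("arg", ("lit", "1024")))

template = anti_unify(ex1, ex2)
print(template)
```

In the resulting template the shared call to read survives, while the receiver name and the argument become placeholders, which mirrors the role the abstract assigns to structural variables.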
dc.identifier.citation: Mahmoud, M. A. S. (2023). API usage templates via structural generalization (Doctoral thesis, University of Calgary, Calgary, Canada). Retrieved from https://prism.ucalgary.ca.
dc.identifier.uri: http://hdl.handle.net/1880/116188
dc.identifier.uri: https://dx.doi.org/10.11575/PRISM/dspace/41033
dc.language.iso: en
dc.publisher.faculty: Graduate Studies
dc.publisher.institution: University of Calgary
dc.rights: University of Calgary graduate students retain copyright ownership and moral rights for their thesis. You may use this material in any way that is permitted by the Copyright Act or through licensing that has been assigned to the document. For uses that are not allowable under copyright legislation or licensing, you are required to seek permission.
dc.subject: API usage
dc.subject: Coding templates
dc.subject: E-generalization
dc.subject.classification: Computer Science
dc.title: API Usage Templates via Structural Generalization
dc.type: doctoral thesis
thesis.degree.discipline: Computer Science
thesis.degree.grantor: University of Calgary
thesis.degree.name: Doctor of Philosophy (PhD)
ucalgary.thesis.accesssetbystudent: I do not require a thesis withhold – my thesis will have open access and can be viewed and downloaded publicly as soon as possible.
Files
Original bundle
Name: ucalgary_2023_mahmoud_may.pdf
Size: 4.24 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 2.62 KB
Format: Item-specific license agreed upon to submission