PRISM :: Browsing by Author "Barcomb, Ann"

Now showing 1 - 10 of 10

Open Access
Analysis of compatibility in open-source Android mobile applications
(2020-11-09) Mukherjee, Debjyoti; Ruhe, Guenther; Reardon, Joel; Barcomb, Ann
Non-functional requirements (NFRs) form an intrinsic part of any software system. Compatibility between versions or across different platforms of a software product is one such NFR. In this thesis, we have studied compatibility in Android mobile applications. We are interested in understanding the different aspects of mobile application incompatibility, their frequency of occurrence, how much effort developers have spent on them, and whether that effort is commensurate with the needs of the users. An analytical compatibility evaluation approach called ACOCUR is proposed. The main characteristics of ACOCUR are: (i) compatibility requirements are automatically identified from user reviews and their types are determined; (ii) compatibility fixes done by developers are systematically analyzed; and (iii) the requirements from users are linked to the fixes to identify the responsiveness of developers to compatibility requirements. We have evaluated open-source mobile applications and have analyzed their commits and reviews to identify the compatibility fixes and requirements, respectively. Both the commit messages and the reviews have been processed by a pipeline of Natural Language Processing steps. App developers have also been surveyed, and their responses have been analyzed to establish the state of the practice and the problems currently faced by developers in this respect. Finally, an automated tool has been developed that implements the ACOCUR methodology to help app developers identify and analyze compatibility requirements.
Embargo
Augmenting Accessibility: An Exploration of AR Usability and Adaptations for Nonspeaking Autistic Individuals
(2023-09-12) Nazari, Ahmadreza; Krishnamurthy, Diwakar; Far, Behrouz; Barcomb, Ann
About one-third of autistic people are nonspeaking, and most are never provided access to an effective alternative to speech. Thoughtfully designed Augmented Reality (AR) applications could provide members of this population with structured learning opportunities, including training on skills that underlie alternative forms of communication. Yet a considerable gap exists in research exploring AR's use for nonspeaking autistic individuals, a gap this work seeks to address. A fundamental step toward bridging this gap is to investigate nonspeaking autistic people's ability to tolerate a head-mounted AR device and to interact with virtual objects. This thesis presents the first study to examine the usability of an interactive AR-based application by this population. Seventeen nonspeaking autistic subjects were recruited to play a HoloLens 2 game that involved holographic animations and buttons. Almost all subjects tolerated the device long enough to begin the game, and most completed increasingly challenging tasks that involved pressing holographic buttons. These findings contradict prevailing assumptions about nonspeaking autistic people and open up exciting possibilities for AR-based applications for this population, targeting areas such as communication and education. Building upon these findings and recognizing that nonspeaking autistic individuals often have motor skill challenges, this thesis introduces a novel system to enhance the accessibility of AR by optimizing the placement of virtual content using Behavioural Cloning. Specifically, this work targets nonspeaking autistic individuals who are learning to communicate by pointing to letters on a physical letterboard with the help of a caregiver. Observing the real-world interactions between a subject and their caregiver, the proposed system automatically derives a personalized virtual object placement policy.
We show that the proposed approach requires only modest training effort and places a virtual object accurately, closely mirroring how the caregiver caters to the user's unique motor skills and movement patterns. This work represents a significant additional step toward enabling AR applications for this population. In summary, this thesis presents empirical evidence that supports the use of AR for assisting nonspeaking autistic people. It also introduces a system for a personalized AR experience to improve the accessibility of this technology. The contributions of this thesis pave the way for broader future applications of AR for this understudied and underserved population.
Open Access
Automated Bug Severity Prediction using Source Code Metrics, Static Analysis, and Code Representation
(2022-09-12) Mashhadi, Ehsan; Hemmati, Hadi; Barcomb, Ann; Tan, Benjamin
In the past couple of decades, significant research efforts have been devoted to the prediction of software bugs. However, most existing work in this domain treats all bugs the same, which is not the case in practice. It is important for a defect prediction method to estimate the severity of the identified bugs so that the higher-severity ones get immediate attention. In this thesis, we provide a quantitative and qualitative study on two popular datasets (Defects4J and Bugs.jar), using 10 common source code metrics and two popular static analysis tools (SpotBugs and Infer), to analyze their capability in predicting defects and their severity. We studied 3,358 buggy methods with different severity labels from 19 Java open-source projects. Results show that although code metrics are powerful in predicting buggy code, they cannot estimate the severity level of the bugs. In addition, we observed that static analysis tools perform weakly in predicting both bugs (F1 score range of 3.1%-7.1%) and their severity label (F1 score under 2%). We also manually studied the characteristics of the severe bugs to identify possible reasons behind the weak performance of code metrics and static analysis tools. Our categorization shows that Security bugs have high severity in most cases, while Edge/Boundary faults have low severity. Furthermore, we show that code metrics and static analysis methods can be complementary in terms of estimating bug severity. To assess the effectiveness of machine learning models in predicting bug severity, we train 8 different models on code metrics only as a baseline and evaluate them using different evaluation metrics. The overall result was not promising, although the Decision Tree and Random Forest models performed better than the others. We then leveraged the pre-trained CodeBERT model to use code representation by feeding it the source code only, and the results improved significantly, in the range of 29%-140% for different metrics.
We also integrated code metrics into the CodeBERT model by providing two architectures named ConcatInline and ConcatCLS which enhance the CodeBERT model efficacy.
Open Access
A collaborative autoethnographic analysis of industry-academia collaboration for software engineering education development
(Canadian Engineering Education Association, 2022-06) Marasco, Emily; Barcomb, Ann; Dwomoh, Gloria; Eguia, Daniel; Jaffary, Abbas; Johnson, Garth; Leonard, Lance; Shupe, Ryan
As engineering educators seek to prepare students for future careers, it can be challenging to keep course materials current with industry practices and knowledge. Students also often experience a disconnect between their studies and perceived relevance to future industry roles. This study examines the potential impact of an industry-academia collaboration on the development and improvement of software engineering education while addressing these issues. A collaborative autoethnographic approach is used to concurrently analyze the experiences of both industry and academic participants in the collaboration. Common themes across the collected personal reflections show that varied benefits were experienced by all stakeholders while contributing to an improved student experience.
Open Access
Early Identification of Youth at Risk of Long Term Emergency Homeless Shelter Use: An Evaluation of Interpretable Machine Learning models
(2023-12-18) Annaa, Osman Jakpa; Messier, Geoffrey; Yanushkevich, Svetlana; Barcomb, Ann
Homelessness is a serious violation of one’s dignity, and youth who are long-term shelter users are particularly vulnerable members of a vulnerable demographic. The commitment to prevent and eliminate homelessness, particularly among youth, is a shared responsibility. Programmes that aim to provide homeless people with permanent housing mostly identify for support people who have lived with the condition for an extended period of time. Allowing young people to be homeless for an extended period of time before intervening exposes them to several kinds of hardship on the streets. Early identification of youth at risk of becoming long-term shelter users is a proactive and more humane way of addressing the problem. Machine learning is brought forth as a tool to augment the expertise of shelter staff in identifying youth at risk of long-term shelter use. Machine learning algorithms are utilised to predict youth at risk of long-term shelter use from the clients’ first 30, 60, 90, 120, or 180 days of shelter access records. A real-time program delivery approach was incorporated in the experiments as a supplement to other existing methods of fighting homelessness. Interpretable machine learning models capable of ultimately producing classification rules in DNF format are evaluated. The level of control over the complexity of the generated rules, coupled with statistical evaluation metrics, is employed in the evaluation.
Open Access
Evaluation of Volunteering Capabilities in an Open-Source Software Community
(2023-12-19) Hariharan, Aadharsh; Barcomb, Ann; De Carli, Lorenzo; Krishnamurthy, Diwakar
Open-source software is a cornerstone of modern technology. Embodying principles of transparency, collaboration, and innovation, it nurtures a vibrant ecosystem that empowers individuals, businesses, and communities. Open-source software has impacted software development significantly; the longevity of open-source projects is essential to the entire field of software development. Challenges faced by open-source software communities include the management of contributors, effective utilization of them, retention of existing contributors, and recruitment of new contributors. For projects where most contributors are volunteers – which remains the case for several projects such as Gnome, Perl, and Python – attracting and retaining volunteers becomes crucial to success. Crowston (2011) argued that because of the high mobility of knowledge workers, even paid employees require personal motivation to participate in projects. In this sense, they should also be viewed as volunteers. Numerous studies explore the dynamics of open-source communities and volunteer contributions. This research has yielded models to assess the volunteering prowess of open-source software communities, and proposed solutions to address challenges. However, most studies have taken a collective approach, encompassing multiple open-source software communities, which presents a generalized perspective. Utilizing a fusion of quantitative and qualitative techniques, this research project gauges the degree of relevance and applicability of existing theories, models, and solutions within the unique context of the Perl and Raku community. This case study offers valuable insights into the community's existing skills, capabilities, and resources available for constructive contributions to growth and development. These insights are instrumental in identifying and implementing strategies to attract and retain volunteers within the community. 
Conflict within communities can be a significant factor in retaining volunteers, and the latter portion of the thesis emphasizes identifying techniques to address these challenges.
Open Access
Exploration of Techniques for Working with Sparse Data when Applying Natural Language Processing to Assist a Qualitative Data Analysis of a COVID-19 Open Innovation Community
(2024-04-17) Yamani, Shirin; Barcomb, Ann; Far, Behrouz; Abbasi, Zahra
This thesis undertakes a novel integration of Natural Language Processing (NLP) with Qualitative Data Analysis (QDA) to investigate the dynamics of volunteer involvement within the TeamOSV community, a collective formed in response to the COVID-19 pandemic. Central to this study is the exploration of roles and interaction patterns among episodic and habitual volunteers, alongside an analysis of the factors influencing their engagement and disengagement within the community. A significant methodological contribution of this work lies in addressing the sparse data challenge, a common constraint in qualitative research, particularly within multi-class classification contexts. The study employs and critically evaluates a range of NLP techniques, with a focus on data augmentation strategies, to enhance the efficacy of various models, including Logistic Regression, Naive Bayes, Artificial Neural Networks (ANN), Convolutional Neural Networks (CNN), and particularly the Self-Attention model. The proposed framework, identified for its superior performance, demonstrates a noteworthy ability to process and interpret sparse qualitative data, surpassing traditional approaches in its effectiveness. Furthermore, the thesis presents an in-depth analysis of model variations, assessing the impact of differing configurations of Self-Attention blocks and layers of feed-forward neural networks. It also explores the implications of pre-training on model performance, offering insights into the architectural complexities and training dynamics of NLP models. A crucial aspect of this exploration is the consideration of the trade-offs between model complexity and computational efficiency, highlighting the practical challenges and considerations in deploying these models in qualitative research contexts. Qualitatively, the study offers a detailed examination of volunteer roles within the TeamOSV community.
It identifies the distinct contributions and challenges associated with episodic volunteers, characterized by their sporadic engagement patterns, and habitual volunteers, who provide stability and long-term vision. The research also sheds light on the reasons behind volunteer disengagement, such as lifestyle changes and diminishing interest, providing a holistic understanding of volunteer participation in open-source, community-driven projects. The thesis concludes by emphasizing the collaborative strengths of merging NLP with QDA, a union that significantly augments the depth of qualitative research. It proposes a roadmap for future investigations, concentrating on enhancing insights into volunteer coordination within open innovation settings and broadening the application range of NLP in qualitative data examination.
Open Access
Interpretable Deep Learning Models for Wearable Data in Sleep and Stress Analysis: Bridging the Gap between Predictive Accuracy and Explainability in Personalized Health Monitoring
(2024-01-26) Barati, Ronak; Moshirpour, Mohammad; Duffett-Leger, Linda; Barcomb, Ann; Sameet Deshpande, Gouri
This study integrates wearable technology, machine learning, and personal health to analyze human sleep patterns and stress levels. It aims to understand the impact of daily activities and physiological metrics on individual well-being, utilizing a broad data set from various individuals. The research compiles three interrelated studies, offering a detailed view of personalized health monitoring and its potential for future applications. The first study utilizes LSTM networks, as well as RNNs, complemented by Explainable AI, particularly LIME. This approach provides a deep dive into the rich, extensive data gathered from smartwatches, revealing how our daily routines (our steps, heart rates, stress, and physical activities) influence the duration of our different levels of sleep. Through this in-depth analysis, we are not only able to uncover the subtle but significant ways in which our lives influence our sleep, but can also use the data to develop health interventions tailored to each individual. The second study makes use of data from wearable devices to classify sleep levels using seven machine-learning models. Stress plays a pivotal role in affecting sleep quality throughout. The comparison of models with and without stress data makes a compelling case for holistic health monitoring. An important finding from the models that incorporate stress data is that psychological factors play a significant role in understanding and improving sleep health. This insight has significant implications for the development of wearable technologies and health monitoring systems, advancing our understanding of sleep disorders and their treatment. In our final study, smartwatch data from first responders and their families were analyzed over three years using machine learning classifiers such as SVM, Logistic Regression, KNN, Decision Trees, Random Forests, Naive Bayes, and XGBoost.
The comparison between datasets with and without sleep data showed that sleep inclusion significantly boosts stress prediction accuracy to 98%, underlining the relationship between sleep and stress. This research offers vital stress management insights, especially for first responders.
Open Access
Towards Usable API Documentation
(2023-07) Khan, Junaed Younus; Uddin, Gias; Barcomb, Ann; Walker, Robert James
The learning and usage of an API is supported by documentation. Like source code, API documentation is itself a software product. Several research results show that bad design in API documentation can make the reuse of API features difficult. Indeed, similar to code smells, poorly designed API documentation can also exhibit 'smells'. Such documentation smells can be described as bad documentation styles that do not necessarily produce incorrect documentation but make the documentation difficult to understand and use. This thesis aims to enhance API documentation usability by addressing such documentation smells in three phases. In the first phase, we developed a catalog of five API documentation smells by consulting literature on API documentation issues and online developer discussions. We validated their presence in the real world by creating a benchmark of 1K official Java API documentation units and conducting a survey of 21 developers. The developers confirmed that these smells hinder their productivity and called for automatic detection and fixing. In the second phase, we developed machine-learning models to detect the smells using the 1K benchmark; however, they performed poorly when evaluated on larger and more diverse documentation sources. We explored more advanced models and employed re-training and hyperparameter tuning to further improve the performance. Our best-performing model, RoBERTa, achieved F1-scores of 0.71-0.93 in detecting different smells. In the third phase, we first focused on evaluating the feasibility and impact of fixing various smells in the eyes of practitioners. Through a second survey of 30 practitioners, we found that fixing the lazy smell was perceived as the most feasible and impactful. However, there was no universal consensus on whether and how other smells can or should be fixed.
Finally, we proposed a two-stage pipeline for fixing lazy documentation, involving additional textual description and documentation-specific code example generation. Our approach utilized a large language model, GPT-3, to generate enhanced documentation based on non-lazy examples and to produce code examples. The generated code examples were refined iteratively until they were error-free. Our technique demonstrated a high success rate with a significant number of lazy documentation instances being fixed and error-free code examples being generated.
Open Access
Utilization of Natural Language Processing for Extracting Smart Cities Requirements from Large Social Media Text
(2024-05-14) Mirshafiee Khoozani, Mitra Sadat; Barcomb, Ann; Tan, Benjamin; Messier, Geoffrey; Fapojuwo, Abraham
Major organizations such as urban centers worldwide face challenges from rapid population growth and evolving demands, requiring innovative approaches to stay responsive to residents' needs. This challenge is exemplified by the city of Calgary, where an automated system for aggregating and categorizing resident feedback could improve city planning. What people find important and useful can be seen in what they post on social media. One method for determining the performance of urban services and assets for citizens is paying attention to the data generated by residents. In this regard, we need to examine datasets wherein writing is the primary form of citizen engagement (direct messages, requests, comments, complaints, etc.). To interpret this data, it is necessary to use appropriate tools and techniques for processing and analyzing large volumes of unstructured text. Some of the most effective tools used by researchers nowadays fall within the scope of computational linguistics, specifically Natural Language Processing (NLP). Furthermore, Twitter is one of the primary platforms where individuals freely voice their opinions and concerns. In this study, we develop an automated workflow that can scrape, classify, and display tweets in a simplified view. With the help of this system, local officials will be able to speed up the decision-making process when considering citizens' current problems. Following our research question, we look into the optimal scraping criteria, explore a variety of methods for topic and emotion analysis, and validate these methods using both automatic evaluation and manual assessment. As a result, we are able to identify issues related to city development, senior citizens, taxes, and unemployment using our best-performing models (BERTopic for topic modeling and few-shot learning using SetFit for emotion analysis).
Afterward, we collect city employees' opinions regarding our research to determine the usefulness and applicability of this approach. Overall, we demonstrate how delving into these analyses can complement the current systems in place for urban planning.
