Structural Variation in the Caenorhabditis elegans Genome: Challenges and Quality Assurance Strategies for Reliable Variant Calling

dc.contributor.advisor: Wasmuth, James
dc.contributor.author: Lesack, Kyle James
dc.contributor.committeemember: Van Marle, Guido
dc.contributor.committeemember: MacCallum, Justin
dc.contributor.committeemember: Gilleard, John
dc.date: 2023-11
dc.date.accessioned: 2023-08-10T19:44:03Z
dc.date.available: 2023-08-10T19:44:03Z
dc.date.issued: 2023-08
dc.description.abstract: Obtaining an accurate and comprehensive representation of structural variation is crucial for understanding how large alterations in chromosome structure contribute to phenotypic diversity and drive genome evolution. Despite continuous efforts to improve methods for identifying structural variation from whole genome sequencing data, accurate variant calling remains challenging. The barriers to progress in this area are complex and multifactorial, but the technical limitations of short-read sequencing technologies and the limited availability of suitable benchmarking resources for non-human species feature prominently. This thesis includes an in-depth evaluation of several commonly used tools for identifying structural variants from short- and long-read DNA sequencing data from natural Caenorhabditis elegans strains. These comparisons, described in detail in Chapter 2, revealed that popular tools yield considerably different results. A major aim of this project was to identify sources of error and variability that tool developers could address in the future. Surprisingly, the order of reads in PacBio FASTQ files was found to affect the predicted structural variants. Chapter 3 describes these results and demonstrates how alignment sorting algorithms contribute to the problem. Chapter 4 describes an analysis of structural variation in 14 natural C. elegans strains. Importantly, this work demonstrates how long-read DNA sequencing data can be used successfully to identify structural variants at the population level.
dc.identifier.citation: Lesack, K. J. (2023). Structural variation in the Caenorhabditis elegans genome: challenges and quality assurance strategies for reliable variant calling (Doctoral thesis, University of Calgary, Calgary, Canada). Retrieved from https://prism.ucalgary.ca.
dc.identifier.uri: https://hdl.handle.net/1880/116853
dc.identifier.uri: https://dx.doi.org/10.11575/PRISM/41695
dc.language.iso: en
dc.publisher.faculty: Veterinary Medicine
dc.publisher.institution: University of Calgary
dc.rights: University of Calgary graduate students retain copyright ownership and moral rights for their thesis. You may use this material in any way that is permitted by the Copyright Act or through licensing that has been assigned to the document. For uses that are not allowable under copyright legislation or licensing, you are required to seek permission.
dc.subject: structural variation
dc.subject: Caenorhabditis elegans
dc.subject: genomics
dc.subject: bioinformatics
dc.subject.classification: Bioinformatics
dc.title: Structural Variation in the Caenorhabditis elegans Genome: Challenges and Quality Assurance Strategies for Reliable Variant Calling
dc.type: doctoral thesis
thesis.degree.discipline: Veterinary Medical Sciences
thesis.degree.grantor: University of Calgary
thesis.degree.name: Doctor of Philosophy (PhD)
ucalgary.thesis.accesssetbystudent: I do not require a thesis withhold – my thesis will have open access and can be viewed and downloaded publicly as soon as possible.
Files
Original bundle
Name:
ucalgary_2023_lesack_kyle.pdf
Size:
3.1 MB
Format:
Adobe Portable Document Format
License bundle
Name:
license.txt
Size:
2.62 KB
Format:
Item-specific license agreed upon to submission
Description: