Browsing by Author "Anderson, David"
Now showing 1 - 5 of 5
- Item (Open Access): Detectability of Non-Equilibrium Molecular Evolution Caused by Fitness Shift and Drift (2022-05-17). Kazemi Mehrabadi, Mobina; de Koning, Jason; Long, Quan; Anderson, David. One of the key interests of computational molecular evolution is the inference of the strength and direction of natural selection in protein-coding genes. The non-synonymous to synonymous rate ratio (dN/dS) is widely used to evaluate the effect of natural selection on genes, lineages, and sites. When dN/dS is inferred to be greater than one along a particular branch and at a specific site, this is often taken as evidence of episodic positive selection and adaptive change in function. Despite the simplicity and widespread use of dN/dS approaches, they are fundamentally unable to differentiate between fit and unfit states, and the stationary distributions in all widely used approaches are (unrealistically) identical across sites. To address these shortcomings, the mutation-selection framework, which is a class of codon substitution models that allows a mechanistic relationship between fitness and sequence, has been proposed. Recently, due to developments in Markov chain Monte Carlo (MCMC) methods and penalized maximum likelihood approaches, computationally tractable models have been implemented that enable inference under site-heterogeneous mutation-selection models, though substantial computational barriers to using such methods on large datasets persist. Here, in my thesis, I introduce time-heterogeneous mutation-selection models as an ideal representation of how episodic adaptation occurs. Using these models, I study how true dN/dS changes over time following a wide variety of fitness shifts (when the fitness profile at a site is completely replaced with a new fitness profile) and fitness drift scenarios (when the fitness of the two most favorable states is swapped).
Both simulation and direct (simulation-free) analysis are used to characterize non-equilibrium molecular evolution under time-heterogeneous mutation-selection models of codon substitution. Additionally, I evaluate how well existing branch-site type methods distinguish a fitness shift from a relaxation of constraints at a small number of sites. In general, I find that the more different the starting and ending fitness profiles are, the more reliably an adaptive burst is produced, which is potentially detectable using dN/dS approaches. Although all existing methods considered in the simulation performed poorly and had very low power to detect fitness shifts, I find that covariate information that helps inform which sites might be targets of positive selection can restore the high power of dN/dS-type methods to detect modest to strong fitness shifts. Our desire in this project has been to improve our understanding of non-equilibrium molecular evolution under mechanistic models of adaptive change in function and to illuminate how well relatively simple statistical approaches perform in inference tasks. I hope this body of work will broaden the horizon for more realistic, mechanistic, and tractable models of non-equilibrium molecular evolution.
- Item (Open Access): Improved Basecalling and Base Modification Detection Through Signal-level Analysis of Nanopore Direct RNA Data (2023-09-14). Wang, Scott; Long, Quan; Gordon, Paul; Smith, Mike; Anderson, David. Genome sequencing technologies emerged as an essential tool for addressing challenges presented by the natural biological complexity of organisms. Unlike traditionally used next-generation sequencing (NGS) methods, which yield short reads, third-generation sequencing (TGS) methods can sequence transcripts and complete genomes in single contiguous sequencing reads, providing innovative means to address practical topics surrounding viral transmission, evolution, and pathogenesis. TGS alleviates the computational challenges of consensus genome assembly or transcript construction from fragmented reads, as required when building NGS libraries. Despite these advantages, as an emerging technology, TGS faces many technical challenges. High error rates make it difficult to distinguish machine errors from low-frequency mutations in the genome. Some of the most well-known and pervasive diseases in society originate from viruses with ribonucleic acid (RNA) genomes; these include but are not limited to influenza and coronaviruses. Advancement towards a comprehensive understanding of RNA viruses has been hindered by their unique biology and high levels of diversity, along with quick replication and mutation rates, which lead to important viral evolutionary signals in individual viral copies. Some of the high basecalling error rate in TGS can be attributed to the presence of unmodeled signal, e.g., calling just the four canonical nucleobases (A, C, G, T/U) when methylation and other nucleobase modifications are also contributing to the signal. Being able to accurately identify (i.e., signal model) the location of such nucleobase modifications would naturally lead to better nucleobase calling and provide insights into RNA virus biology.
The few extant tools in this area for TGS are based on deep-learning AI methods due to computational tractability, and are demonstrably biased. In contrast to such opaque methods, in this work, new efficient implementations of theoretically optimal (“dynamic programming”) methods for Oxford Nanopore Technologies (ONT) TGS raw signal segmentation, alignment, clustering, and consensus are deployed. Combined with follow-on statistical analyses of signal deviations within those results, these methods define a minimally biased, statistically grounded procedure for detecting unmodeled signal (i.e., putative nucleobase modifications or mutations), as demonstrated using multiple publicly available raw ONT direct RNA sequencing viral datasets.
- Item (Open Access): Precision health equity for racialized communities (2023-12-12). Valiani, Arafaat A.; Anderson, David; Gonzales, Angela; Gray, Mandi; Hardcastle, Lorian; Turin, Tanvir C. In the last three decades, a cohort of genomicists have intentionally sought to include more racially diverse people in their research in human genomics and precision medicine. How such efforts to be inclusive in human genomic research and precision medicine are modeled and enacted, and specifically whether the terms of inclusion are equitable for these communities, remains to be explored. In this commentary, we review the historical context in which issues of racial inclusion arose with early genome and genetics projects. We then discuss attempts to include racialized peoples in more recent human genomics research. In conclusion, we raise critical issues to consider in the future of equitable human genomics and precision medicine research involving racialized communities, particularly as it concerns working towards what we call Precision Health Equity (PHE). Specifically, we examine issues of genetic data governance and the terms of participation in inclusive human genomics and precision health research. We do so by drawing on insights and protocols developed by researchers investigating Indigenous Data Sovereignty and propose exploring their application and adaptation to precision health research involving racialized communities.
- Item (Embargo): The Influence of conditional grants on Canadian social welfare policy (1974). Anderson, David; Armitage, W. Andrew J.
- Item (Open Access): Transcriptomics in the Diagnosis of Genetic Myopathies (2021-09-24). Joel, Matthew M.; Pfeffer, Gerald; de Koning, A. P. Jason; Arnold, Paul; Long, Quan; Anderson, David. The myopathies are a diverse group of primary muscle disorders that arise for a variety of reasons, including acquired disease (e.g., autoimmune disorders) and genetic variation (the genetic myopathies). RNA sequencing is the application of next-generation sequencing technologies to sequence the transcriptomes of cells and tissues, yielding a functional and regulatory snapshot of a sample. Comparing the transcriptomes of the autoimmune disorder inclusion body myositis and a variety of genetic myopathies, including samples with mitochondrial, myofibrillar, dystrophic, or otherwise nonspecific pathology, showed an extensive immunological influence on those with myositis. There are more nuanced differences in the transcriptomes of the histologically grouped conditions among this cohort, including the previously described FGF21 upregulation in mitochondrial myopathies. Long non-coding RNAs are a neglected species of RNA with myriad regulatory roles. Several non-coding transcripts were identified among the studied groups, which will serve as candidates for testing their biomarker potential for muscle diseases. We tested the utility of RNAseq at diagnosing the genetic myopathy participants of this cohort, identifying four cases where potentially pathogenic variants were detected by accounting for transcript isoform abundance. Genes with transcriptional findings and potentially pathogenic variants included FLNC, MYOT, NEB, and SELENON. The approach may not be optimal for diagnosing individuals with presumed mitochondrial disease, where minimal differences were observed in mitochondrial transcripts. Ultimately, RNAseq provides another tool for clinicians to investigate genetic disorders and to assist with differential diagnosis.