Bayesian Approaches to Integrative Structural Biology Using Sparse and Ambiguous Data

Date
2022-01
Abstract
Protein structure determination plays an important role in understanding the biological activity of proteins and protein complexes. While traditional experimental approaches such as X-ray crystallography and nuclear magnetic resonance (NMR) have been extremely successful, they are not applicable to all systems. Computational modelling can explore the structure and dynamics of proteins. However, these methods often struggle to converge for large, complex systems. Instead, we can turn to integrative structural biology, an emerging field that combines experimental and computational approaches, thereby overcoming some of these limitations. One of the main challenges these methods face is how to account for noise and ambiguity in experimental data, which can make it difficult to interpret and correctly incorporate the data into a model. In this thesis, an integrative approach was applied to this problem by incorporating experimental data into physical modelling through Bayesian inference. We first applied this method to the small, well-defined protein GB1 using solid-state paramagnetic NMR restraints. We were able to determine the structure to < 1 Å RMSD in the limit of a sparse dataset. We then expanded our approach to the larger protein calmodulin in complex with a peptide. The introduction of a Bayesian technique to include additional, complementary paramagnetic NMR data enabled the identification of dominant conformations within 3 Å of a reference structure. Next, we addressed a long-standing limitation in NMR experimental methods and introduced a technique to quantitatively determine the accuracy of experimental data. This methodology allowed us to solve for the structure of the protein while simultaneously addressing the ambiguity in the experimental data. We were able to accurately determine the structures of five small proteins within ~2 Å RMSD of reference structures.
Finally, we approached the challenge of modelling protein-protein complexes using inherently ambiguous cross-linking mass spectrometry (XL-MS) data. We were able to accurately dock transferrin with a transferrin binding protein and explore the interfacial region of the complex. The work in this thesis expands on existing Bayesian approaches to structure determination by enabling the inclusion of sparse and ambiguous experimental data, furthering the scope of integrative methods.
Keywords
Integrative Structural Biology, Molecular Dynamics
Citation
Gaalswyk, K. (2022). Bayesian approaches to integrative structural biology using sparse and ambiguous data (Doctoral thesis, University of Calgary, Calgary, Canada). Retrieved from https://prism.ucalgary.ca.