Developing Novel and Interdisciplinary Methods for DNA Detection and DNA Structure Profiling
Zhang, David Yu; Veiseh, Omid
Doctor of Philosophy
Understanding the secondary structures of nucleic acid polymers, i.e. DNA and RNA, is fundamentally important for both biochemistry and molecular biology, because structure often influences biological function, such as protein binding affinity and accessibility to DNA-binding drugs. Software currently used to predict nucleic acid secondary structures from sequence has limited accuracy, and few datasets pairing DNA sequence with structure are available to improve biophysical models and prediction software. Secondary structure prediction software is also known to have significant qualitative limitations, such as the inability to predict pseudoknots. New chemical probing methods, such as SHAPE-Seq and DMS-Seq, have recently emerged for profiling RNA secondary structures, but no experimental method has been demonstrated for profiling DNA secondary structures. I developed a novel, robust, and high-throughput method to experimentally characterize DNA secondary structures at single-molecule resolution by applying low-yield bisulfite conversion and next-generation sequencing (NGS) to a mixture of thousands of DNA species. Bisulfite conversion is a chemical reaction in which cytosines are converted to uracils when DNA is treated with sodium bisulfite. Importantly, the efficiency of the bisulfite conversion reaction is lower when the cytosine nucleotide is in a double-stranded state, so the conversion yield observed across a large number of molecules indicates the base-pairing status of that nucleotide. By reducing the bisulfite concentration and shortening the reaction time, I was able to tune the conversion yield to values that optimize determination of the base-pairing state.
Using chip-synthesized oligo pools of over 10,000 strands, I was able to build a large database that pairs DNA sequences with observed DNA secondary structures, and I used this database to develop an analytical model that determines the secondary structures of any DNA sequence given its experimental bisulfite conversion data. I found that 84% of the 1,057 human genome subsequences studied here adopt 2 or more stable secondary structures in solution.
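The statistical idea behind the readout can be illustrated with a minimal Python sketch. It is not the analytical model developed in this work. It assumes reads are already aligned to the reference with no insertions or deletions, that converted cytosines appear as T after amplification and sequencing, and that a single cutoff separates paired from unpaired cytosines. The function names and the threshold value are hypothetical.

```python
from typing import Dict, List


def conversion_yield(reference: str, reads: List[str]) -> Dict[int, float]:
    """Per-cytosine conversion yield across reads.

    Assumes every read is aligned to `reference` position by position
    (same length, no indels). A converted C is read as T.
    """
    yields: Dict[int, float] = {}
    for pos, base in enumerate(reference):
        if base != "C":
            continue  # only cytosines report on bisulfite conversion
        converted = sum(1 for read in reads if read[pos] == "T")
        unconverted = sum(1 for read in reads if read[pos] == "C")
        total = converted + unconverted
        if total:  # skip positions with no informative reads
            yields[pos] = converted / total
    return yields


def call_paired(yields: Dict[int, float], threshold: float = 0.5) -> Dict[int, bool]:
    """Label a cytosine as paired when its yield falls below `threshold`.

    The threshold is a hypothetical illustration value, not a calibrated one.
    """
    return {pos: y < threshold for pos, y in yields.items()}


if __name__ == "__main__":
    reference = "ACGC"
    reads = ["ATGC", "ATGC", "ACGC", "ATGT"]
    per_site = conversion_yield(reference, reads)
    print(per_site)
    print(call_paired(per_site))
```

Because each read comes from one molecule, the pattern of converted and unconverted cytosines within a single read can also be inspected directly, which is what makes a single-molecule view possible in principle.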
DNA Structure; Bisulfite Conversion; NGS; Molecular Detection; PCR