Understanding protein landscapes at multi-resolution
Doctor of Philosophy
The detailed characterization of the overall folding landscape of a biologically relevant protein is an outstanding challenge for both all-atom simulation and experiment. At present, no single technique, whether experimental or computational, can explore the wide range of time and length scales spanned by a protein molecule during folding. This limitation calls for novel methodologies that combine the complementary strengths of experiment and simulation to characterize protein landscapes completely, at multiple time and length scales. This thesis focuses on providing a realistic but simplified description of protein landscapes that can serve as a solid starting point for developing such a multi-resolution framework. Toward that goal, we have characterized the complex folding landscape of a monomeric lactose repressor protein by combining experimental data with simulation results obtained using a structure-based simplified protein model. In addition, we have developed a realistic but coarse-grained protein model that contains the crucial physicochemical ingredients shaping protein landscapes. The simulated folding landscapes of a number of proteins obtained with this simplified model show remarkable quantitative agreement with experimental measurements. This minimalist model has further allowed us to investigate the direct connection between folding and signaling in a photoreceptor protein, providing valuable insight into the relationship between protein folding and function. Choosing a few optimal reaction coordinates is crucial for identifying the critical transition regions on the folding landscape of a protein. We have developed a powerful technique that automatically extracts a set of collective coordinates from the configurational sampling generated during a molecular simulation. 
The results show that the coordinates emerging from this technique can accurately describe a complex protein folding reaction. Finally, we propose an efficient multi-scale simulation procedure that can precisely identify the folding transition regions of a protein. In practice, an averaged coarse-grained description of the fast protein dynamics, extracted from an ensemble of short molecular dynamics (MD) trajectories, is used to trace the underlying effective folding free energy landscape globally.