A series of advanced scoring functions in ranking protein structures
Doctor of Philosophy
Designing an efficient scoring function is one of the most challenging tasks in computational biology. A good potential function or scoring function can help rank protein structure models, guide the search, and identify possible solutions, all of which are important for protein structure prediction. A lot of work has been done in this area, but none of it has achieved the desired result. Good scoring functions are therefore urgently needed to accelerate this process. In this thesis, I present several novel empirical potential functions and scoring functions to address this problem. First, I present OPUS-DOSP, an upgraded version of the previous work OPUS-PSP, in which a distance-related term is added to the potential function and the performance is improved. Second, I develop a non-traditional scoring function named OPUS-CSF. Instead of using the traditional Boltzmann formula, this scoring function constructs a native configuration distribution table, and it outperforms OPUS-DOSP. Third, I develop two scoring functions, OPUS-SSF and OPUS-Beta, that combine features of the previous two scoring functions. These two scoring functions yield the best results so far. The effectiveness of all of these scoring functions is tested on various decoy sets generated from native structures. On traditional benchmarks such as ROSETTA, ig_structure, fisa_casp3, and MOULDER, OPUS-DOSP, OPUS-CSF, and OPUS-SSF perform better than previous methods. On the beta-prediction benchmarks Beta916 and Beta1452, OPUS-Beta also outperforms existing methods. These scoring functions therefore appear promising and useful in this field.
Scoring function; empirical potential; protein structure recognition; side-chain packing; coarse-graining