TY - JOUR
T1 - Unbiasing scoring functions
T2 - A new normalization and rescoring strategy
AU - Carta, Giorgio
AU - Knox, Andrew J.S.
AU - Lloyd, David G.
PY - 2007
Y1 - 2007
N2 - Ligand bias can contribute significantly to the number of false positives observed in a virtual screening campaign. Using a receptor-based docking approach against a well-established target of therapeutic importance, estrogen receptor α (ERα), coupled with several common scoring functions (ChemGauss, ChemGauss2, ChemScore, ScreenScore, ShapeGauss, and PLP), taken both individually and as a consensus, we sought to examine the characteristics of molecules retrieved by each. It has been previously shown that scoring functions (mainly empirical) exhibit bias in prioritizing more complicated molecules arising from additive components within the function. Using Spearman's correlation coefficient, we show that a large set of descriptors calculated for a docked set of molecules exhibits positive correlation with the ranked position in a hitlist. Moreover, most of these descriptors correlate well with molecular weight (MW). To this end, rather than penalizing the docked score of all high-MW molecules and rewarding those of lower MW, as is common practice, we examine the impact of penalizing the score only of those molecules which were of higher MW, leaving lower MW molecules unaffected. Here, we introduce a new power function to aid the process. Using scoring frequency analysis and SIFt fingerprints, we achieved a more meaningful analysis of virtual screening (VS) performance than with enrichment calculations, facilitating target-specific VS method development.
AB - Ligand bias can contribute significantly to the number of false positives observed in a virtual screening campaign. Using a receptor-based docking approach against a well-established target of therapeutic importance, estrogen receptor α (ERα), coupled with several common scoring functions (ChemGauss, ChemGauss2, ChemScore, ScreenScore, ShapeGauss, and PLP), taken both individually and as a consensus, we sought to examine the characteristics of molecules retrieved by each. It has been previously shown that scoring functions (mainly empirical) exhibit bias in prioritizing more complicated molecules arising from additive components within the function. Using Spearman's correlation coefficient, we show that a large set of descriptors calculated for a docked set of molecules exhibits positive correlation with the ranked position in a hitlist. Moreover, most of these descriptors correlate well with molecular weight (MW). To this end, rather than penalizing the docked score of all high-MW molecules and rewarding those of lower MW, as is common practice, we examine the impact of penalizing the score only of those molecules which were of higher MW, leaving lower MW molecules unaffected. Here, we introduce a new power function to aid the process. Using scoring frequency analysis and SIFt fingerprints, we achieved a more meaningful analysis of virtual screening (VS) performance than with enrichment calculations, facilitating target-specific VS method development.
UR - http://www.scopus.com/inward/record.url?scp=34547676154&partnerID=8YFLogxK
U2 - 10.1021/ci600471m
DO - 10.1021/ci600471m
M3 - Article
C2 - 17552493
AN - SCOPUS:34547676154
SN - 1549-9596
VL - 47
SP - 1564
EP - 1571
JO - Journal of Chemical Information and Modeling
JF - Journal of Chemical Information and Modeling
IS - 4
ER -