In the past,
we have received a number of queries about the status of the PDBbind core set.
We have also noticed some confusion in the literature regarding the
naming convention of the CASF benchmark developed by our group. Here, we would
like to make a formal statement about the PDBbind core set and the CASF
benchmark, in the hope of answering those queries and clarifying the confusion.
Our group has
a long-standing interest in scoring function development. The PDBbind
database is a notable outcome along the path (Liu et al., Acc. Chem. Res. 2017, 50, 302-309). The PDBbind database is now
updated on an annual basis, and each release of PDBbind is named after the
release year, such as PDBbind v.2016, PDBbind v.2017, and so on. The
PDBbind database collects experimentally measured binding affinity data for
four types of molecular complexes, i.e. protein-ligand complexes, nucleic
acid-ligand complexes, protein-protein complexes, and protein-nucleic acid
complexes. Among them, we have named the collection of protein-ligand
complexes the "general set". We focus on this data set because
it is most relevant to drug design and discovery studies. Evidently, not every
entry in the general set is suitable for calibrating or validating
docking/scoring methods, owing to various problems in 3D structure, binding data, and
other aspects. Therefore, we have selected the relatively
"healthy" entries from the general set to compile the so-called
"refined set". The refined set serves as a generally acceptable
data set for docking/scoring studies. Other researchers may apply the refined
set directly to their studies, or use the refined set as the starting point to
compile data sets with their own focus. Both the general set and the refined
set are updated with the PDBbind database on an annual basis. They should be
correctly cited as, for example, "the PDBbind general set v.2016",
"the PDBbind refined set v.2017", and so on.
As another part of our efforts, we have
established the CASF benchmark (Comparative Assessment of Scoring Functions),
which aims at providing an objective platform for assessing scoring functions. The
first published work was CASF-2007 (Cheng et al., J. Chem. Inf. Model. 2009, 49, 1079-1093). Another major update,
i.e. CASF-2013, was published a few years later (Li et al., J. Chem. Inf. Model. 2014, 54, 1700-1716; J. Chem. Inf. Model. 2014, 54, 1717-1736).
The CASF benchmark employs a high-quality set of protein-ligand complexes as the
primary test set. This data set, which we call the PDBbind "core set", is
selected from the PDBbind refined set through a systematic, non-redundant
sampling procedure. Accordingly, each public release of the CASF
benchmark is named after the version of the PDBbind database from which the
test set is selected. For example, the test set in CASF-2007 was compiled based
on PDBbind v.2007, the test set in CASF-2013 was compiled based on PDBbind
v.2013, and so on. It is not a good idea to name each CASF benchmark after its
publication year, because we cannot predict when the paper will be published
while we are still preparing the manuscript.
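The naming rules described above can be captured in a small helper. This is only an illustrative sketch: the function names `pdbbind_set_name` and `casf_name` are hypothetical, and the list of CASF releases is assumed from the versions named on this page (2007, 2013, and 2016).

```python
# Hypothetical helpers that format names according to the convention on this page.
# CASF releases named on this page; assumed here to be the only core set versions.
CASF_RELEASES = (2007, 2013, 2016)


def pdbbind_set_name(set_kind: str, version_year: int) -> str:
    """Return a citation string such as 'the PDBbind refined set v.2017'."""
    if set_kind not in ("general", "refined", "core"):
        raise ValueError(f"unknown PDBbind set: {set_kind!r}")
    # The core set is tied to CASF releases, so it has no annual versions.
    if set_kind == "core" and version_year not in CASF_RELEASES:
        raise ValueError(f"there is no PDBbind core set v.{version_year}")
    return f"the PDBbind {set_kind} set v.{version_year}"


def casf_name(pdbbind_version_year: int) -> str:
    """Name a CASF release after the PDBbind version its test set came from."""
    if pdbbind_version_year not in CASF_RELEASES:
        raise ValueError(f"no CASF release is based on PDBbind v.{pdbbind_version_year}")
    return f"CASF-{pdbbind_version_year}"


if __name__ == "__main__":
    print(pdbbind_set_name("refined", 2017))
    print(casf_name(2013))
```

Keying the CASF name to the PDBbind version year, rather than to the publication year of the paper, keeps the name fixed no matter when the manuscript appears.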
It is
important to point out that unlike the PDBbind database, the PDBbind core
set is not updated on an annual basis. As implied above, the PDBbind core
set is a component of the CASF benchmark rather than the PDBbind database. The
CASF benchmark is not updated on an annual basis for the following reasons:
• A huge amount of effort is
needed to finish each CASF update. The CASF benchmark is more than a simple
data set. Instead, it consists of a whole set of evaluation methods, the
test set, and a large panel of standard scoring functions that are tested
as a demonstration. A lot of material needs to be prepared, and a lot of
computation needs to be conducted for each CASF update.
• Even if it were doable, in our opinion, there is no need to update
CASF so frequently. Our current plan is to update the CASF benchmark every
three years. In fact, we have already finished CASF-2016, and are preparing
a manuscript regarding it. We hope that this paper can be published in
2018.
As mentioned above, the last published version of the PDBbind core set is v.2013. This data set was not updated with PDBbind v.2014 or v.2015, so there is no PDBbind core set v.2014 or v.2015. For historical reasons, the PDBbind core set used to be included in the downloadable data package in some previous releases of PDBbind. To avoid further confusion, we have removed the core set from the data packages of recent releases of PDBbind (e.g. PDBbind v.2014, v.2015, v.2016, and v.2017). If needed, users can obtain the PDBbind core set from the data package of the corresponding CASF benchmark (e.g. CASF-2007 and CASF-2013), which is also downloadable from the PDBbind-CN web site.
In conclusion,
the take-home messages are:
• The CASF benchmark should not
be referred to as the "PDBbind benchmark". This incorrect name has
appeared in the literature; the correct name is the CASF benchmark.
• The data package of the CASF
benchmark can be downloaded from the PDBbind-CN web site under the
"CASF" tab (http://www.pdbbind-cn.org/casf.php). At this point, we do
not think it is necessary to set up two separate web sites to host PDBbind and
CASF, respectively.
• Currently, the latest public
release of the CASF benchmark is CASF-2013. There will be CASF-2016 soon.
----------------------------------------------------------------------------------------------------------------------------
By Prof. Renxiao Wang, Mar 3rd, 2018
Dear PDBbind users,
We have received a good number of queries regarding the next release of PDBbind. The PDBbind database has had a tradition of regular annual updates since its inception. However, it is already 2023, and the latest available release is still version 2020. We understand your concern.
In fact, our team has been working diligently on PDBbind version 2021 for the past three years. It is important to note that version 2021 is not a regular update but the most significant update in the history of PDBbind, encompassing more binding data (increased by ~20%), a new workflow for processing structures, new online functions, and a new cloud-based server. Achieving all these objectives has required much more effort than we had anticipated. After version 2021, we expect to return to the tradition of annual updates in the near future.
Our current plan is to release PDBbind version 2021 officially before the start of 2024. We would like to express our gratitude for your continued support of PDBbind. Please keep an eye on new announcements posted on this website.
Best wishes,
Prof. Renxiao Wang, on behalf of the PDBbind team
Department of Medicinal Chemistry, School of Pharmacy, Fudan University
Shanghai, P. R. China
E-mail: wangrx@fudan.edu.cn
Dear All,
We are excited to announce that the beta version of our new PDBbind+ web site is now ready for testing. Starting from version 2021, all new versions of the PDBbind database will be released solely on PDBbind+. We cordially invite you to experience the upgraded features of the PDBbind+ web site.
Currently registered PDBbind users will soon receive an e-mail from which they can activate their PDBbind+ accounts directly, after transferring their user profiles from PDBbind-CN to the new web site. Others are encouraged to visit PDBbind+ at www.pdbbind-plus.org.cn. Registration on the PDBbind+ web site as a demo user is FREE. Demo users may access the contents of the PDBbind database up to version 2020 on the new web site.
We plan to release version 2021, as well as additional functional modules, on PDBbind+ once the beta test is completed. The official release of version 2021 is anticipated this month, so please stay tuned. For the sake of current PDBbind users, the PDBbind-CN web site will remain up and running as is, but no further updates of PDBbind-CN are planned.
If you need any assistance or have any questions regarding PDBbind+, please feel free to reach us at support@pdbbind-plus.org.cn. Thank you for your continued support of the PDBbind database!
Best regards,
The PDBbind Team
School of Pharmacy, Fudan University
Dear PDBbind users,
Normally, a new update of PDBbind is released in the fourth quarter of each year. Unfortunately, this year the project has also been affected by the COVID-19 pandemic. In addition, our team, as well as the PDBbind-CN server, is in the process of relocating, and thus a lot of extra work needs to be done. However, we will certainly keep the wheel rolling. We expect to release PDBbind v.2020 in the first quarter of 2021.
We wish you a happy and productive new year of 2021!
The PDBbind team
School of Pharmacy, Fudan University
Welcome to the PDBbind-CN Database!
Introduction. The aim of the PDBbind database is to provide a comprehensive collection of experimentally measured binding affinity data for all biomolecular complexes deposited in the Protein Data Bank (PDB). It provides an essential linkage between the energetic and structural information of those complexes, which is helpful for various computational and statistical studies on molecular recognition, drug discovery, and many more (see the list of published applications of PDBbind).
The PDBbind database was originally developed by Prof. Shaomeng Wang's group at the University of Michigan in the USA and was first released to the public in May 2004. This database is now maintained and further developed by Prof. Renxiao Wang's group at the College of Pharmacy, Fudan University in China. The PDBbind database is updated on an annual basis to keep up with the growth of the Protein Data Bank.
Invitation to the new PDBbind+ web site 02/03/2024
Current release.
The current release, i.e. version 2020, is based on the contents of PDB as officially released in the first week of 2020. This release provides binding affinity data for a total of 23,496 biomolecular complexes in PDB, including protein-ligand (19,443), protein-protein (2,852), protein-nucleic acid (1,052), and nucleic acid-ligand complexes (149). Compared to the last release (v.2019), the binding data included in this release have increased by ~10%. All binding data were curated by our team from ~40,500 original references. A brief introduction to the PDBbind database is also available (PDF).
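As a quick consistency check, the per-category counts quoted above for version 2020 can be summed and compared with the stated total. The snippet below is a minimal sketch that uses only the numbers given on this page; the variable names are illustrative.

```python
# Complex counts for PDBbind version 2020, as quoted on this page.
counts = {
    "protein-ligand": 19443,
    "protein-protein": 2852,
    "protein-nucleic acid": 1052,
    "nucleic acid-ligand": 149,
}
stated_total = 23496

total = sum(counts.values())
# The four categories should add up to the stated total.
assert total == stated_total, f"counts sum to {total}, not {stated_total}"

# Share of each category in the whole release.
for kind, n in counts.items():
    print(f"{kind:22s} {n:6d}  {100 * n / total:5.1f}%")
```

The protein-ligand complexes, i.e. the general set, make up by far the largest share of the release.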
A special remark on the PDBbind core set.
The PDBbind core set is compiled to provide a relatively small set of high-quality protein-ligand complexes for validating docking/scoring methods. The data set is selected based on the contents of PDBbind. In particular, this data set has served as the primary test set in the popular Comparative Assessment of Scoring Functions (CASF) benchmark developed by our group. The PDBbind core set is not included in the PDBbind data package because, unlike PDBbind itself, it is not updated annually. Users can obtain the PDBbind core set by downloading the CASF data package at http://www.pdbbind.org.cn/casf.php. The latest available version of the PDBbind core set is included in CASF-2016 and consists of 285 protein-ligand complexes.
Accessibility.
The basic information of each complex in PDBbind is completely open for access (see the [BROWSE] page). Users are required to register under a license agreement in order to utilize the searching functions provided on this web site or to download PDBbind data sets in bulk. Registration is currently free of charge to all academic and industrial users. Please go to the [REGISTER] page and follow the instructions to complete registration.
Acknowledgments.
This project is financially supported by the Ministry of Science and Technology of China (National Key Research Program, Grant No. 2016YFA0502302) and the National Natural Science Foundation of China (Grant Nos. 81725022, 81430083, 21661162003, 21673276, 21472227, 21472226). We are very grateful to Prof. Zenghui (John) Zhang's group at the East China Normal University for their help with versions 2015, 2016, and 2017.
Team Leader: Prof. Renxiao Wang
Email: wangrx@fudan.edu.cn
Tel: +86-21-54925128
Support: yingsaisi@foxmail.com
The PDBbind-CN Team Members
[1] Su, M.; Yang, Q.; Du, Y.; Feng, G.; Liu, Z.; Li, Y.*; Wang, R.* "Comparative Assessment of Scoring Functions: The CASF-2016 Update", J. Chem. Inf. Model., 2019, 59, 895-913. (CASF-2016)
[2] Li, Y.; Su, M.; Liu, Z.; Li, J.; Liu, J.; Han, L.; Wang, R.* "Assessing Protein-Ligand Interaction Scoring Functions with the CASF-2013 Benchmark", Nature Protocols, 2018, 13 (4), 666-680. (CASF-2013)
[3] Liu, Z.; Su, M.; Han, L.; Liu, J.; Yang, Q.; Li, Y.; Wang, R.* "Forging the Basis for Developing Protein-Ligand Interaction Scoring Functions", Acc. Chem. Res., 2017, 50 (2), 302-309. (PDBbind version 2016)
[4] Liu, Z.; Li, Y.; Han, L.; Li, J.; Liu, J.; Zhao, Z.; Nie, W.; Liu, Y.; Wang, R. "PDB-wide collection of binding data: current status of the PDBbind database", Bioinformatics, 2015, 31 (3), 405-412. (PDBbind version 2014)
[5] Li, Y.; Liu, Z. H.; Li, J.; Han, L.; Liu, J.; Zhao, Z. X.; Wang, R. X. "Comparative Assessment of Scoring Functions on an Updated Benchmark: I. Compilation of the Test Set", J. Chem. Inf. Model., 2014, 54 (6), 1700-1716. (PDBbind version 2013)
[6] Li, Y.; Han, L.; Liu, Z. H.; Wang, R. X.* "Comparative Assessment of Scoring Functions on an Updated Benchmark: II. Evaluation Methods and General Results", J. Chem. Inf. Model., 2014, 54 (6), 1717-1736. (CASF-2013)
[7] Cheng, T. J.; Li, X.; Li, Y.; Liu, Z. H.; Wang, R. X. "Comparative assessment of scoring functions on a diverse test set", J. Chem. Inf. Model., 2009, 49 (4), 1079-1093. (PDBbind version 2007)
[8] Wang, R.; Fang, X.; Lu, Y.; Yang, C.-Y.; Wang, S. "The PDBbind Database: Methodologies and updates", J. Med. Chem., 2005, 48 (12), 4111-4119. (PDBbind prototype)
[9] Wang, R.; Fang, X.; Lu, Y.; Wang, S. "The PDBbind Database: Collection of Binding Affinities for Protein-Ligand Complexes with Known Three-Dimensional Structures", J. Med. Chem., 2004, 47 (12), 2977-2980. (PDBbind prototype)