A new face to old data
One of the big pushes behind the data sharing drive has been the idea that, by giving researchers access to as much data as possible, they will be able to combine it in innovative ways, leading to new and exciting developments.
However, having access to large amounts of data also allows us to go back and re-examine the data using current techniques and scrutinize it in a way that was not possible before, as highlighted in a recent Nature news article. X-ray crystallographers have been depositing protein structures into the Protein Data Bank (PDB) since 1971, and it currently holds nearly 53,000 3D structures of protein molecules and nucleic acids. Technology has changed a lot since 1971, enabling more accurate analysis of crystallographic data.
Gert Vriend and colleagues at the University Medical Centre in Nijmegen in the Netherlands have developed software that will re-refine the data deposited in the PDB. In their initial article, they showed that 67% of the 16,807 files they looked at were improved. Errors in protein or nucleic acid structures can have profound implications for research. To name a few, they can lead to incorrect assumptions about how a protein works, or to incorrect development of small-molecule inhibitors, hindering new drug development.
Because a correct structural model is so important, Vriend is not the only one attempting to address this problem. Paul Adams at the Lawrence Berkeley National Laboratory in California has also developed software that often improves the original structure assignments. Though these re-refinement tools might solve a number of issues, they are not able to fix more cosmetic problems, or more serious problems such as amino-acid side chains that have been assigned to the wrong location. Baby steps.