Google will store your data too
As reported in Wired Science recently, Google is shortly to start the data hosting service, originally announced last summer, through its research.google platform. The intention is to turn this into a huge data store for scientists who wish to share their data openly; the emphasis is indeed on huge, with a service geared typically to image-rich research disciplines such as astronomy, with their requirement for dozens of terabytes’ worth of space.
This is all very laudable, and reflects a growing realisation that something has to be done with the ever-increasing volumes of data being spawned by researchers across all disciplines - and of course three cheers for the commitment to open data. However, storage is one thing, but curating is quite another - and so far, there is no information about whether or how data might be maintained, enriched or annotated once it is placed in store. There is little doubt about the impressive nature of this Google endeavour, at least at a quantitative level - but its impressiveness might be somewhat undermined if little thought has been given to looking after the data once it has been deposited. Well, let’s wait and see.
On the other hand, I couldn’t help smiling to myself upon reading about the methodology suggested for transferring the vast volumes of data to the store: essentially, it is to be collected and shipped on multi-terabyte hard drives using the services of FedEx and other commercial carriers. No chance of that ever getting lost in the post, of course.
While I’m at it, I’ll use the opportunity to plug the imminent publication of the RIN’s principles on stewardship of research data; all very relevant to the above, so watch this space…