ORDO best practice #1: Documenting data

Over the coming months I’m going to focus on some examples of best practice on ORDO. The creators of all the items in this series will receive a reusable Figshare coffee cup as a way of saying thanks and congratulations.

The first items I’m going to focus on are the OpenMARS Database datasets (https://doi.org/10.21954/ou.rd.c.4278950.v1), deposited by James Holmes (STEM) earlier this year. From the data record:

“The Open access to Mars Assimilated Remote Soundings (OpenMARS) database is a reanalysis product combining past spacecraft observations with a state-of-the-art Mars Global Circulation Model (GCM). The OpenMARS product is a global surface/atmosphere reference database of key variables for multiple Mars years.”

Since their deposit in February, these datasets have been downloaded a total of 291 times, making them some of the most popular items on ORDO. This is a fine reward for all the hard work that went into preparing them for sharing.

What’s so good about them?

There are four datasets, which are published individually and also grouped together as a collection. The most impressive thing about them is the accompanying documentation, which is excellent:

  • On the landing page for each dataset is a description, which clearly details the provenance of the dataset and information about the OpenMARS project
  • Each dataset has a PDF reference manual that can be read in the browser. Because the datasets are large (~25GB each) and use a file format (.nc) that requires specialist software and does not display in the browser, the manual lets users decide whether the data is useful before downloading it
  • The documentation within the reference manual is very detailed and includes information on access (using a sample Python script included in the dataset), structure of the dataset, provenance and quality assurance
  • The datasets clearly reference the funding body – the European Union’s Horizon 2020 Research and Innovation programme

Is it FAIR?

The gold standard for research data is that it should be FAIR – Findable, Accessible, Interoperable and Re-usable. These datasets fulfill all but one of the criteria detailed in Sarah Jones and Marjan Grootveld’s FAIR data checklist (original version at https://doi.org/10.5281/zenodo.1065991). They only fall down on the fact that the data are not in a widely available format, but considering the nature of the data this would be very difficult to achieve, and since the reference manuals are very accessible, this issue is largely addressed. See the completed checklist.

And finally, a word from James…

‘Adding datasets produced by our team at the Open University that will be of interest to multiple different users was really simple to do using the ORDO system, and the team that manage it were very helpful if I had any questions during the process. Thanks!’

 
