The Library Research Support team are pleased to announce the launch of our new Data Champions programme, with the first of our Data Champions forums being held in the Library last week.
This forum was an opportunity for us to meet our Champions and for them to get to know each other, as well as to find out more about what the programme entails.
The Data Champions programme has been set up as a way of promoting OU Research Data Management (RDM) services and tools within faculties and to provide more discipline-specific data management advice and support.
As part of this programme, our Data Champions have been asked to contribute to the development and delivery of a data-focussed seminar series across the coming year; and planning for our first session has already begun in earnest.
Our 13 Data Champions offer representation across every faculty and bring with them a range of experiences of managing diverse data types, from highly sensitive interview data to archival materials and the re-use of third-party data. After hearing more about the programme (and a bit of themed cake!), our Champions made an enthusiastic start, sharing their data management experiences and producing a whole host of fantastic ideas to theme our future seminars around.
Keep your eyes open for updates on our first seminar!
Over the coming months we will be running a series of online bitesize training sessions on various aspects of research data management. These are open to all OU research staff and postgraduate researchers.
Please follow the links below for further information and joining instructions.
The latest instalment of my series on best practice in ORDO looks at sharing videos.
In late 2017, we were approached by Dr Erica Borgstrom from the Faculty of Wellbeing, Education and Language Studies. Erica’s research focuses on death and dying, with a particular focus on end-of-life care. Over the course of the previous year she had been running a series of seminars on death and dying, all of which had been recorded and posted on an OU-hosted website. Erica was concerned that the website would not be supported for much longer and that the videos were of high interest and needed to be made available to the public on another platform.
This is where ORDO comes in – by putting the videos of the seminars on ORDO, they were given the security and credibility of being hosted on an OU platform, and we were able to guarantee that they would be maintained for a minimum of 10 years. Adding the videos to ORDO gave each one a DOI, enabling Erica and the seminar presenters to cite them at conferences or in papers and ensuring that they are recognised as valid research outputs. ORDO allows in-browser viewing of most audiovisual file types, which means that the videos don’t need to be downloaded to be watched. We were also able to add metadata to the records to enable discoverability, and upload extra background documents alongside the videos to add context.
Finally, we grouped all the videos together into one collection, giving the entire seminar series a DOI and ensuring that they are seen as a complete body of work.
Borgstrom, Erica (2017): Open University Death and Dying Seminar Series. figshare. Collection. https://doi.org/10.21954/ou.rd.c.3825658.v2
Since the seminar series was uploaded to ORDO in January 2018, the videos have consistently featured in our top ten most viewed items. They have been viewed almost 7,500 times and downloaded 571 times.
A brief note from Erica:
I found working with ORDO and the library staff very helpful and exciting. Uploading and storing the videos in this way makes them easy to share with a much wider audience and helps us fulfil our mission as an open, and accessible, university. The seminar speakers have also appreciated the professional platform to recognise their talk as a research output.
Did you know that on the first Thursday of every month, between 14.00 and 15.00, we run an online drop-in for ORDO, our research data repository?
We’re here to help, whether you’re interested in using ORDO but not sure where to start, or you’ve been using it for a while and have questions about how to make the most of it.
To join, go to our Adobe Connect “Research Support” page and click on “join room” (and if you find the link takes you to the “DISS Home” page instead, click on “Resources” at the top and scroll down to “Research Support”).
Dates for the next few months:
- Thursday 1st August 14.00-15.00
- Thursday 5th September 14.00-15.00
- Thursday 3rd October 14.00-15.00
Hope to see you there!
In the latest instalment of my series of blog posts discussing best practice in ORDO, I’m going to highlight some of the datasets underpinning PhD theses that have been deposited in ORDO.
Like OU research staff, postgraduate researchers are expected to deposit any research data underpinning their theses in a trusted data repository. There are numerous benefits to doing this, including:
- enabling verification of results
- increasing your visibility as a researcher (great for career progression)
- ensuring that you have continued access to your data even when you have left the OU
- providing the possibility for re-use of data
Historically, research data or other digital materials underpinning theses have sometimes been put on a CD and enclosed with the hard copy of the thesis, lodged at the Library. However, from August 2019, the OU Library will only accept digital copies of theses, which will be stored in ORO. This means that the old method of putting data onto CDs will no longer be possible.
Ideally, you should deposit your data or other materials in ORDO ahead of submission, so that you can include a Data Access Statement (which contains a DOI) within the body of your thesis.
Within the ORO record for your thesis, there is a field for “Related URLS” into which you can add your ORDO DOI as a “research dataset”. We also advise that you add the ORO URI to your ORDO record. We are looking into how we might be able to automate this process in the future.
A selection of datasets underpinning theses on ORDO
You may have read in the news recently about a scandal concerning the doctoring of research data within a lab run by a top UK academic. Earlier this month UCL released details of the inquiries into misconduct, which were undertaken in 2014 and 2015. Of the 60 papers reviewed, the panels found evidence of misconduct in 15 of them. This included “cloning”, whereby features were copied and pasted throughout an image, and some of the data fabrications were reportedly fundamental to the conclusions reached by the authors.
This news story struck me as a prime example of why data sharing is so important to improve research integrity. If the data underpinning the papers in question had been made publicly available in a trusted research data repository, it seems unlikely that misconduct of this level would have happened. Data sharing should encourage greater transparency of results – ensuring that researchers are less likely to falsify research findings or fabricate data, and if they do then this sort of misconduct could be spotted much more quickly. Would a culture of data sharing also have instilled a sense of responsibility in researchers to “do the right thing” rather than cutting corners?
Sharing research data can seem like an onerous task; however, if a possible outcome of data sharing is greater research integrity, then it needs to be recognised as an important part of all researchers’ work.
Continuing my series on best practice in ORDO, this time I’m going to trumpet The Robert Minter Collection: https://doi.org/10.21954/ou.rd.7258499.v1 which was deposited by Trevor Herbert in December 2018. According to the ORDO record:
This is a copy of the data underlying the website ‘The Robert Minter Collection: A Handlist of Seventeenth- and Eighteenth-Century Trumpet Repertory’ which contained a database of music collected by Robert L. Minter (1949-81).
Minter’s interest was in the collection of sources that contribute to our understanding of the trumpet at various points in its history before the twentieth century.
This is regarded as one of the world’s largest fully catalogued datasets about early trumpet repertoire.
The website in question was created in 2008 and is no longer active; however, it had been archived by the Internet Archive, most recently in May 2017. In 2018, Trevor approached the Library for help archiving the data contained on the website because he was aware that although the Internet Archive had maintained much of the information, not all functionality and content had been preserved; most crucially, the database itself is no longer searchable.
ORDO was deemed a good fit for creating an archive of the content of the website. It allows the deposit of any file type and enables in-browser visualisation of many of these so it is not always necessary to download documents in order to view them. By depositing the material in ORDO, Trevor also obtained a DOI (Digital Object Identifier) – a persistent, reliable link to the record which will be maintained even if the materials are no longer available for any reason. Any materials added to ORDO are guaranteed to be maintained for a minimum of ten years.
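For anyone handling ORDO DOIs in scripts or reference lists, here is a minimal Python sketch that splits a DOI into its parts. The pattern (`ou.rd.<id>` for an item, `ou.rd.c.<id>` for a collection, and an optional `.v<n>` version suffix) is an assumption inferred from the DOIs quoted in these posts rather than an official specification, and the names `OrdoDoi` and `parse_ordo_doi` are made up for illustration.

```python
import re
from typing import NamedTuple, Optional


class OrdoDoi(NamedTuple):
    """Parts of an ORDO-style DOI (hypothetical helper type)."""
    prefix: str             # registrant prefix, e.g. "10.21954"
    kind: str               # "item" or "collection"
    record_id: int          # numeric record identifier
    version: Optional[int]  # None when the DOI has no ".vN" suffix


# Assumed pattern, inferred from the DOIs shown in these posts.
_DOI_PATTERN = re.compile(
    r"^(?:https?://(?:dx\.)?doi\.org/)?"      # optional resolver prefix
    r"(?P<prefix>10\.\d{4,9})/ou\.rd\."       # registrant prefix + namespace
    r"(?P<collection>c\.)?(?P<record_id>\d+)"  # "c." marks a collection
    r"(?:\.v(?P<version>\d+))?$"              # optional version suffix
)


def parse_ordo_doi(doi: str) -> OrdoDoi:
    """Split an ORDO-style DOI (bare or as a doi.org URL) into its parts."""
    match = _DOI_PATTERN.match(doi.strip())
    if match is None:
        raise ValueError(f"Not an ORDO-style DOI: {doi!r}")
    version = match.group("version")
    return OrdoDoi(
        prefix=match.group("prefix"),
        kind="collection" if match.group("collection") else "item",
        record_id=int(match.group("record_id")),
        version=int(version) if version is not None else None,
    )


if __name__ == "__main__":
    print(parse_ordo_doi("https://doi.org/10.21954/ou.rd.7258499.v1"))
    print(parse_ordo_doi("10.21954/ou.rd.c.3825658.v2"))
```

A helper like this can be handy when checking that a Data Access Statement cites a specific version of a record rather than an unversioned link.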
Within the record there are four files: an Access database; a CSV copy of the data; a zip file containing information about the collection, database and website; and a list of the files in the zip file. The description in the record makes it clear to any potential users what they are accessing and how the files can be used. Since it was deposited in December, the collection has been viewed 139 times and downloaded 18 times. Now that deserves a fanfare!
The Library is launching a new Data Champions programme, and we are looking for PGR students and staff who are interested in taking part.
What would this involve?
Data Champions are expected to:
- Lead by example – make data open (via ORDO or other data repositories); share best practice through case studies and blog posts, and share Data Management Plans on the Library Research Support website
- Promote OU Research Data Management (RDM) services and tools within your unit
- Provide discipline-specific data management advice and support to colleagues
- Attend and contribute to Library-run events
- Contribute to The Orb, Open Research Blog
- Offer feedback to Library Services to support RDM service development
What’s in it for me?
Data Champions will benefit from the following:
- Boost CV – increase funding opportunities by having RDM “expert” status
- Increase visibility – dedicated profile on the Data Champions webpage, opportunity to contribute to the successful Open Research Blog
- Opportunity to network with colleagues from across the OU
- Be instrumental in developing the OU Research Data Management Service and improving the culture of data sharing at the OU
- Receive 100 GB of data storage on ORDO as default
- Attendance for one Data Champion per year at the annual Figshare Fest conference in London
Do I need to be a data expert?
No – we’re looking for a range of people from different disciplines who work in different ways with different types of data. You could be a research student, early career researcher, professor, member of research support staff or an IT specialist. You might have experience compiling surveys, collecting lab-based data, harvesting big data or creating video data. Whoever you are and whatever your area of interest, we’d love to hear from you.
Don’t worry if you don’t consider yourself a data expert: your knowledge in your specific area is invaluable, and training and support will be given.
What’s the time commitment?
We expect the Data Champion role to require a commitment of 1-3 hours a month, but this can be flexible according to the amount of time you are able to give.
How do I apply?
Send an email to library-research-support@open.ac.uk by 31st July with the subject “Data Champions”, stating what type of research you are involved with and whether there’s any particular contribution you’d like to make.
When do I start?
We are going to launch the programme with a Data Champions Forum in September. This will be an opportunity to meet the other Data Champions, find out more and help shape the Data Champions programme.
Over the coming months I’m going to focus on some examples of best practice on ORDO. The creators of all the items in this series will receive a reusable Figshare coffee cup as a way of thanks and congratulations.
The first series of items I’m going to focus on are the OpenMARS Database datasets (https://doi.org/10.21954/ou.rd.c.4278950.v1), deposited by James Holmes (STEM) earlier this year. From the data record:
“The Open access to Mars Assimilated Remote Soundings (OpenMARS) database is a reanalysis product combining past spacecraft observations with a state-of-the-art Mars Global Circulation Model (GCM). The OpenMARS product is a global surface/atmosphere reference database of key variables for multiple Mars years.”
Since their deposit in February, these datasets have been downloaded a total of 291 times, making them some of the most popular items on ORDO. This is a fine reward for all the hard work that went into preparing them for sharing.
What’s so good about them?
There are four datasets, which are published individually and also grouped together as a collection. The most impressive thing about them is the accompanying documentation, which is excellent:
- On the landing page for each dataset is a description, which clearly details the provenance of the dataset and information about the OpenMARS project
- Each dataset has a PDF reference manual that can be read in the browser. Because the datasets are large (~25GB each) and use a file format (.nc) that requires specialist software and does not display in the browser, this lets users decide whether the data is useful before downloading
- The documentation within the reference manual is very detailed and includes information on access (using a sample Python script included in the dataset), structure of the dataset, provenance and quality assurance
- The datasets clearly reference the funding body – the European Union’s Horizon 2020 Research and Innovation programme
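As a small aside on the .nc format mentioned above, the sketch below checks which flavour of netCDF a downloaded file appears to be by reading its first few bytes, which can help when choosing software to open it. This is not the sample script from the reference manuals; it relies only on the published file signatures for the netCDF classic formats and for HDF5, and the function names are hypothetical. It only looks at the very start of the file, and an HDF5 signature does not by itself prove that a file is netCDF-4.

```python
# Leading bytes ("magic numbers") of the netCDF classic-family formats.
_CLASSIC_SIGNATURES = {
    b"CDF\x01": "netCDF classic",
    b"CDF\x02": "netCDF 64-bit offset",
    b"CDF\x05": "netCDF 64-bit data (CDF-5)",
}

# HDF5 signature; netCDF-4 files are stored as HDF5 files.
_HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def sniff_netcdf_variant(header: bytes) -> str:
    """Guess the netCDF variant from the first bytes of a file."""
    if header.startswith(_HDF5_SIGNATURE):
        return "HDF5-based (possibly netCDF-4)"
    return _CLASSIC_SIGNATURES.get(header[:4], "unknown")


def sniff_netcdf_file(path: str) -> str:
    """Read the first 8 bytes of the file at `path` and classify them."""
    with open(path, "rb") as handle:
        return sniff_netcdf_variant(handle.read(8))


if __name__ == "__main__":
    # Demonstration on in-memory headers rather than a real 25GB download.
    print(sniff_netcdf_variant(b"CDF\x01\x00\x00\x00\x00"))
    print(sniff_netcdf_variant(b"\x89HDF\r\n\x1a\n"))
```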
Is it FAIR?
The gold standard for research data is that it should be FAIR – Findable, Accessible, Interoperable and Re-usable. These datasets fulfil all but one of the criteria detailed in Sarah Jones and Marjan Grootveld’s FAIR data checklist (original version at https://doi.org/10.5281/zenodo.1065991). They only fall down on the fact that the data are not in a widely available format, but considering the nature of the data this would be very difficult to achieve, and since the reference manuals are very accessible, this issue is dealt with. See the completed checklist.
And finally, a word from James…
‘Adding datasets produced by our team at the Open University that will be of interest to multiple different users was really simple to do using the ORDO system, and the team that manage it were very helpful if I had any questions during the process. Thanks!’
Are you a researcher who develops software? Do you use GitHub?
Did you know you can connect your GitHub account with ORDO?
This will enable you to import items from GitHub to ORDO, thereby assigning them a DOI to enable better citation and discoverability.
There are two options for importing items from GitHub…
If you have something to upload straight away, you can access the GitHub integration directly from My Data in ORDO by clicking on the GitHub icon.
Or you can get set up in the Applications section of ORDO to prepare for when you’re ready.
A key aspect of setting GitHub up via the Applications section is that you can edit the “Auto-sync” global settings for your GitHub integration. If you turn the auto-sync setting on, then every new release for one of your imported repos will be automatically imported. This will only occur if your ORDO item is public, and each new release will generate a new version of your ORDO item. If your item is private, you can still overwrite the repo manually if you wish. This global setting can be overridden for each repo.
Detailed instructions on how to do this are available from Figshare.
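To make the auto-sync rules described above a little more concrete, here is a minimal Python sketch that models them. It is only an illustration of the behaviour as described in this post, not Figshare’s implementation or API; the names `RepoSettings`, `effective_auto_sync` and `release_creates_new_version` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RepoSettings:
    """Per-repo state (hypothetical model, not a Figshare data structure)."""
    # None means "follow the global auto-sync setting".
    auto_sync_override: Optional[bool] = None
    # Whether the corresponding ORDO item has been published.
    item_is_public: bool = False


def effective_auto_sync(global_auto_sync: bool, repo: RepoSettings) -> bool:
    """The per-repo setting, when present, takes precedence over the global one."""
    if repo.auto_sync_override is not None:
        return repo.auto_sync_override
    return global_auto_sync


def release_creates_new_version(global_auto_sync: bool, repo: RepoSettings) -> bool:
    """A new release is imported automatically only when auto-sync is
    in effect for the repo AND the ORDO item is public."""
    return effective_auto_sync(global_auto_sync, repo) and repo.item_is_public


if __name__ == "__main__":
    public_repo = RepoSettings(item_is_public=True)
    private_repo = RepoSettings(item_is_public=False)
    print(release_creates_new_version(True, public_repo))   # auto-imported
    print(release_creates_new_version(True, private_repo))  # manual update only
```

In this model a private item is never updated automatically, which matches the note above that private items can only be overwritten by hand.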