Research Data Management “concerns the organization of data, from its entry to the research cycle through to the dissemination and archiving of valuable results.” (Whyte and Tedds, 2011)
[Whyte, A and Tedds, J., 2011. Making the Case for Research Data Management, Digital Curation Centre]
More and more funders, and some publishers, insist that some elements of research data are made available through Open Access. It is important that you manage and store any data generated by your research project in an ethical and systematic manner. You should also consider the longer term maintenance of that data, often beyond the official end of a project.
Apart from making sure that you comply with University Policy, and any funder or publisher requirements, it is good practice to address RDM for every project you are involved with.
Additional benefits include being sure that your research data is kept in a safe, secure manner. By organising your data you will be able to access and analyse it much more efficiently. You will also be able to demonstrate research integrity. A JISC guide explains this in greater detail. Finally, as with Open Access publishing, RDM makes your research outputs more visible and increases your chances of finding both funding and collaborators for future projects.
What follows are some key considerations to be borne in mind from the start of any research project.
Responsibility for research data management lies with the Principal Investigator(s) or project lead on any research project. It is strongly recommended that you compile a Research Data Management Plan (RDMP) at the outset. An example template to complete is provided by the Digital Curation Centre. The accompanying checklist is a useful document to make sure everything is included from the outset.
Even if your funder does not require you to complete a formal RDMP, it is good practice to do so. Using the checklist ensures that you have considered all possibilities with regard to collection, storage and archiving of your data before you start.
View the webinar from EUDAT and OpenAIRE for more information.
If you have funding for your project, check how the funder expects you to manage the data. The Digital Curation Centre provides template RDMPs (free registration required) for various funders, as well as a generic template.
It is important to consider compliance with University policies and guidance from the outset. All research data management must comply with the following policies:
Consent for future reuse must be considered when planning the initial project. It may be that consent is required of participants, or data may have to be anonymised to safeguard participants’ identities.
Data archives will have user agreements that cover the confidentiality of data.
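As a rough illustration of preparing data for reuse, the sketch below replaces participant identifiers with keyed hashes. This is pseudonymisation rather than full anonymisation: other fields may still identify a participant, so it is only one possible step. The function names, the field name `participant_id` and the 12-character code length are assumptions made for this example.

```python
import hashlib
import hmac


def pseudonymise(participant_id: str, secret_key: bytes) -> str:
    """Return a stable code for a participant that cannot be reversed without the key."""
    digest = hmac.new(secret_key, participant_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:12]  # shortened for readability (an arbitrary choice)


def pseudonymise_rows(rows, id_field, secret_key):
    """Return copies of the rows with the identifier field replaced by its code."""
    return [
        {**row, id_field: pseudonymise(row[id_field], secret_key)}
        for row in rows
    ]


if __name__ == "__main__":
    key = b"keep-this-key-away-from-the-data"  # hypothetical key, for illustration only
    data = [{"participant_id": "jane.smith@example.org", "score": "7"}]
    print(pseudonymise_rows(data, "participant_id", key))
```

The key should be stored separately from the data sets, in the same way as any encryption key.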
It is important to be able to retrieve data throughout any project, so getting this right at the outset can save you valuable time and effort. It is also worth thinking about how the project may develop, and what you intend to do with the data once the project is finished. Choosing an appropriate file format, naming system and location are all important.
Every time a file is updated, for example when data are added, a new version is created. Clear labelling for each version is especially important for large projects and those involving many researchers or several locations. The London School of Economics has produced a useful guide to version control that explains this in more detail. Having this information available also enables you to document the history of the data, which may help to verify its provenance and will be useful for any data being retained at the end of a project.
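One way to label versions clearly is to build file names from a fixed pattern. The following is a minimal sketch, assuming a hypothetical convention of project, description, date and a two-digit version number; the convention and the function names are illustrative, not a University requirement.

```python
import re
from datetime import date
from pathlib import Path
from typing import Iterable

# Hypothetical convention: <project>_<description>_<YYYY-MM-DD>_v<NN>.<ext>
VERSION_PATTERN = re.compile(r"_v(\d+)$")


def versioned_name(project: str, description: str, version: int, ext: str, day: date) -> str:
    """Build a file name recording what the file is, when it was saved and its version."""
    return f"{project}_{description}_{day.isoformat()}_v{version:02d}.{ext}"


def next_version(existing_names: Iterable[str]) -> int:
    """Return the next unused version number, given the file names already in use."""
    versions = []
    for name in existing_names:
        match = VERSION_PATTERN.search(Path(name).stem)
        if match:
            versions.append(int(match.group(1)))
    return max(versions, default=0) + 1


if __name__ == "__main__":
    existing = ["wellbeing_survey_2017-02-01_v01.csv", "wellbeing_survey_2017-02-14_v02.csv"]
    print(versioned_name("wellbeing", "survey", next_version(existing), "csv", date.today()))
```

Putting the date in year-month-day order means that files sort chronologically in an ordinary folder listing.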
Depending on the nature of the project, it may be helpful to maintain a master copy of the original data, with separate versions for any subsequent data manipulation.
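A master copy can be protected in a simple way by marking it read-only and doing all manipulation on a separate working copy. The sketch below assumes a hypothetical folder layout and file names; adapt them to your own project.

```python
import shutil
import stat
from pathlib import Path


def protect_master(master: Path) -> None:
    """Mark the master file read-only so it is not overwritten by accident."""
    master.chmod(stat.S_IREAD | stat.S_IRGRP | stat.S_IROTH)


def make_working_copy(master: Path, working_dir: Path, label: str) -> Path:
    """Copy the master into a working folder under a labelled name and return the new path."""
    working_dir.mkdir(parents=True, exist_ok=True)
    target = working_dir / f"{master.stem}_{label}{master.suffix}"
    # copyfile copies the contents only, so the working copy remains writable.
    shutil.copyfile(master, target)
    return target


if __name__ == "__main__":
    master_file = Path("raw/interviews.csv")  # hypothetical path
    if master_file.exists():
        protect_master(master_file)
        print(make_working_copy(master_file, Path("working"), "cleaned"))
```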
Security measures need to be proportionate to the type of data involved, and the risks identified when compiling your RDMP. Depending on the format of the data, you may need to consider physical security or network security.
Data containing personal information must be treated with extra care, as set out in the Data Protection Act 1998 (since superseded by the Data Protection Act 2018 and the UK GDPR).
As with any information used as part of your work, it is essential to back up all data regularly. ICT can give guidance on the sustainability and security levels of different types of devices. An informal risk assessment may indicate how best to back up your data.
Always store any encryption keys or passwords in a different location from the data sets themselves.
The UK Data Archive provides advice on the short and long-term storage of data. ICT can provide initial guidance on the most appropriate approaches to take. Any requirements from your funder should also be considered.
Best practice would be to store data in at least two different formats and in separate locations.
If the project is to take place over several years, it may also be wise to consider copying data files to new media, so that nothing is at risk from physical degradation or obsolescence.
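When copying files to new media, a checksum comparison is one common way to confirm that nothing was corrupted in transit. The following sketch uses SHA-256 from the Python standard library; the function names and paths are assumptions for illustration.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 checksum of a file, reading it in chunks to limit memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source: Path, destination: Path) -> str:
    """Copy a file, then confirm that the copy has the same checksum as the original."""
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, destination)
    original, copied = sha256_of(source), sha256_of(destination)
    if original != copied:
        raise OSError(f"Checksum mismatch after copying {source} to {destination}")
    return original


if __name__ == "__main__":
    src = Path("data/results_v03.csv")  # hypothetical paths
    if src.exists():
        print(copy_and_verify(src, Path("backup/results_v03.csv")))
```

Recording the returned checksum alongside the file lets you repeat the check later to detect degradation of the storage medium.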
Any supporting documentation should be created at the same time as the data to ensure that the development of the project can be accounted for correctly. Project logs, lab books and project diaries all contribute to this.
Metadata, data which describes your data, can be used to allow your research to be discovered. Metadata standards by discipline are available from the Digital Curation Centre. Your academic liaison librarian can give you guidance on this, or you can make an appointment with one of the contacts listed below.
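As an illustration of what a basic metadata record might contain, the sketch below writes a small set of descriptive fields to JSON. The field names are loosely modelled on Dublin Core elements, and the choice of which fields to require is an assumption for this example; check the standard for your discipline before adopting any scheme.

```python
import json

# Hypothetical minimum set of fields, loosely based on Dublin Core element names.
REQUIRED_FIELDS = ("title", "creator", "description", "date", "format", "rights")


def build_metadata_record(**fields: str) -> str:
    """Return a JSON metadata record, refusing to build one with required fields missing."""
    missing = [name for name in REQUIRED_FIELDS if not fields.get(name)]
    if missing:
        raise ValueError(f"Missing metadata fields: {', '.join(missing)}")
    return json.dumps(fields, indent=2, sort_keys=True)


if __name__ == "__main__":
    # All values below are invented for illustration.
    print(build_metadata_record(
        title="Student wellbeing survey responses",
        creator="Example Researcher",
        description="Anonymised responses to a 20-item questionnaire.",
        date="2017-03-01",
        format="text/csv",
        rights="Access on request",
    ))
```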
Data access or availability statements describe where the data supporting a publication can be found. Many funders require such statements for publicly-funded research. The University of Bath provides helpful guidance on data access statements.
Research data are valuable resources that have taken both time and money to produce. They can be valued beyond the original project that created them. With the move towards open access, far more data sets will become available for reuse, which may encourage comparative research, restudy or follow-up. Certain datasets may lend themselves to secondary analysis.
One of the most important areas of reuse is in the replication or validation of published work, something now recognised by publishers requiring datasets to be available alongside published articles.
The UK Data Service has many examples of case studies using archived data.
Once a project has finished, the data must be processed to ensure that it can be made available for any requisite time specified by the project or any funders. The main consideration is to ensure that any archived data can be identified, found and accessed, together with any accompanying notes or guidance. Digital files should be transferred to non-proprietary file formats to ensure accessibility in the longer term.
Some funders will specify exactly what should be archived, how and where. The University policy recommends that data be stored for a minimum of 10 years.
Decisions on non-digital data will vary depending on the nature of the project and any legal, ethical, funder or collaborator requirements. It may be feasible to digitise some data if provision was made at the outset, with any costs built into the project funding.
Not all data needs to be archived; the principal investigator is best placed to advise on what can be destroyed. Data validating research outputs (articles, reports etc.) must be retained.
There are two main aspects to sharing data: sharing as part of the initial project and making the data available to others after the project has finished.
During the project
Depending on the nature of the project, it may be necessary to share data from the outset. Having an RDMP in place should address many of the issues this will raise, such as security and versioning.
To share with colleagues within the University, ICT can provide access-protected folders on the O drive.
Proprietary software is available to share files and documents, for example Google Docs or Mendeley. However, these cannot be regarded as secure or stable, so care should be taken over what is deposited in them. No information of a personal or confidential nature should be stored there.
After completion of the project
The Concordat on Open Research Data proposes ten principles for working with research data. These outline best practice for the reuse of research data.
FAIR data is Findable, Accessible, Interoperable and Reusable.
In August 2016 the European Commission released new Guidelines on FAIR Data Management in Horizon 2020. Whilst the guidelines apply to any Horizon 2020 funded projects, they also ensure good practice, so are worth considering as you archive project data.
Do you need to restrict access to your data?
Although the presumption is that publicly-funded data should be preserved and made accessible to others, there are circumstances when this is not the case. Some of these are set out in the Policy for the Effective Management of Research Data.
In some instances, it may be possible to restrict access to the data. In this case, a Data Access Statement may be required.
Does all data need to be preserved?
This will depend on the nature of the project and the accompanying data. Preserved data should allow the clear assessment of any published work arising from the research.
For data to be shared effectively it needs to be consistently presented and properly documented. It must be intelligible, and it should be accompanied by enough information for another user to make sense of it.
How to deal with Freedom of Information requests.
Research data can be the subject of Freedom of Information requests. JISC has compiled an FAQ document dealing with some common enquiries. However, it is important to point out that this document does not constitute, and should not be construed as, legal advice. If you are in any doubt, please contact the University’s FOI officer.
There are a number of repositories available for OA data. Some involve funders, others are subject based. The Register of research data repositories is a useful place to check.
Data cannot be deposited in Worcester Research and Publications (WRaP). However, to ensure that data can be discovered, metadata (data that describes your data) can be made available through WRaP, so that other researchers can understand the nature of the research, the data held and assess the re-use potential. Please contact the WRaP team for any help you might need with this.
Contact the Research School for details of the next scheduled session as part of the Researcher Development programme.
Further RDM sessions will be announced as part of the Staff Development Programme (University login required).
Managing and Sharing Research Data by Louise Corti; Matthew Woollard; Veerle Van den Eynden; Libby Bishop
ISBN: 9781446267264 Available from The Hive: Shelfmark: 300.72/COR (UHD item)
Search from the 'ebooks' tab of Library Search for online titles.
Research Data MANTRA, from the University of Edinburgh, is a free online course designed for PhD students and others who are planning a research project using digital data.