Digital Curation

I recently gave a short presentation to colleagues from the John Rylands University Library. The theme was 'digital curation', and the talk was structured around three questions. Here's the gist of it...

'What are the digital curation challenges facing Manchester eScholar?'

Manchester eScholar is populated through a self-archive deposit model whereby authors are encouraged to deposit a metadata record (and, where appropriate, a copyright-cleared full text) for every new publication.

This model has so far been successful. Over six thousand past and present members of staff have at least one record in eScholar and we estimate that the repository contains records for over 95% of the entire annual research output of the University over the past five years. 

To enable authors to deposit quickly, there are very few barriers to submitting a record to the repository. This means no mandatory metadata fields and no checks by the repository support team. The model is successful in achieving high deposit rates but introduces data quality issues such as duplicated records, journal name variants and incorrect metadata.

The strategic goal of the service is to sustain and enhance the research reputations of individuals and organisations affiliated with the University. Low quality metadata in the repository reduces the team's ability to achieve this goal; this is the major digital curation challenge facing Manchester eScholar.

'How are those digital curation challenges overcome?'

By establishing a programme of retrospective curation activities, the team systematically improves the quality of the records stored in the repository. Examples of these activities include:

  • Correcting inaccurate metadata 
  • Deduplicating records
  • Adding supplementary metadata
  • Adding Digital Object Identifiers (DOIs) 
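
As a rough illustration of the deduplication activity, the sketch below groups records that share a DOI or, failing that, a normalised title and year. The record fields (id, title, year, doi) and the function names are assumptions made for this example; they do not describe the actual eScholar data model or tooling.

```python
import re
from collections import defaultdict


def normalise_title(title):
    """Lower-case a title and collapse punctuation and whitespace."""
    cleaned = re.sub(r"[^a-z0-9]+", " ", title.lower())
    return cleaned.strip()


def dedup_key(record):
    """Prefer the DOI as a key; fall back to normalised title plus year."""
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        return ("doi", doi)
    return ("title", normalise_title(record["title"]), record.get("year"))


def find_duplicates(records):
    """Return lists of record ids that appear to describe the same work."""
    groups = defaultdict(list)
    for record in records:
        groups[dedup_key(record)].append(record["id"])
    # Only groups with more than one record are candidate duplicates.
    return [ids for ids in groups.values() if len(ids) > 1]
```

A sketch like this would only flag candidates for a person to review. It would also miss the case where one copy of a record has a DOI and the other does not, which is one reason adding DOIs and deduplicating tend to go hand in hand.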

We know, however, that when research is published it enters a dynamic, ever-changing ecosystem. What we consider to be a high quality record today may not be considered so in five years' time. Consequently, all records in the repository are placed in a 'perpetual curation cycle'. This means we will never stop curating a record!

'How can curating eScholar content benefit users of the service?'

A well-thought-through digital curation programme translates into the following benefits for users:

  • Increased 'findability' of the repository records on the web
  • Improved matching of external bibliometric data to repository records 
  • More accurate reports summarising total research output of organisations within the University
  • Better preservation of the records for future generations

We've also found that the more you can rely on the accuracy of the data in the repository, the more innovative ways you can think of to use the data. Mashing up repository data with data from external sources allows for some really powerful insights into publishing activity.
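
To give a flavour of what such a mash-up might look like, here is a minimal sketch that attaches citation counts from an external source to repository records by matching on DOI. The field names, the shape of the external data (a mapping of DOI to count) and the function name are all hypothetical, chosen only for illustration.

```python
def attach_citation_counts(records, external_counts):
    """Return copies of records with a 'citations' field added.

    external_counts is assumed to map a DOI string to a citation count.
    Records with no DOI, or with a DOI the external source does not
    know about, get citations set to None.
    """
    # Normalise the external DOIs so that case and stray spaces do not
    # prevent a match.
    lookup = {doi.strip().lower(): count for doi, count in external_counts.items()}
    enriched = []
    for record in records:
        doi = (record.get("doi") or "").strip().lower()
        citations = lookup.get(doi) if doi else None
        enriched.append({**record, "citations": citations})
    return enriched
```

The match is only as good as the DOIs held in the repository, which is why adding and correcting identifiers pays off well beyond tidiness.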

And though it may sometimes feel like we're painting the Forth Bridge, the benefits make digital curation a vital function of the service.

Here's the presentation...