Getting Useful IDs Early: Rationale for an ID gathering and co-referencing service

Blog post by David Kay on behalf of the Jisc Monitor development team

The idea

This idea was introduced at the July 8th London workshop, where institutions suggested a template application whereby authors (or institutions on their behalf) could establish and reuse the profile data required by the supply chain to identify a person, their institution and the relevant grants.

The proposed service will enable authors or HEIs to share key ID data as early as possible, to reuse their bundle of IDs and to make them available to actors that need them downstream. The IDs are expected to include Person, Org and Grant, will be flexible in terms of ID Namespaces and duplicates within Namespaces, and may be cross-referenced (e.g. ‘This is me in another submission’). Some Namespaces will need to be private (e.g. Institution person IDs).
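As a rough illustration of the kind of data this implies, the sketch below models a bundle of IDs in Python. The class names, field names, namespace labels (`orcid`, `inst-staff`, `funder-grant`) and ID values are all placeholders invented for this example; none of them come from a Monitor design.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Identifier:
    kind: str              # "person", "org" or "grant"
    namespace: str         # illustrative label, e.g. "orcid"
    value: str             # placeholder value, not a real ID
    private: bool = False  # e.g. institution person IDs


@dataclass
class Bundle:
    ids: list = field(default_factory=list)

    def add(self, identifier):
        # Several IDs in one namespace are allowed; exact repeats are skipped.
        if identifier not in self.ids:
            self.ids.append(identifier)

    def public_view(self):
        # Hide IDs from private namespaces when sharing downstream.
        return [i for i in self.ids if not i.private]


bundle = Bundle()
bundle.add(Identifier("person", "orcid", "0000-0000-0000-0001"))
bundle.add(Identifier("person", "orcid", "0000-0000-0000-0002"))  # duplicate within a namespace
bundle.add(Identifier("person", "inst-staff", "staff-123", private=True))
bundle.add(Identifier("grant", "funder-grant", "GRANT-001"))
public_ids = bundle.public_view()
```

Here `public_ids` holds the two ORCID-style entries and the grant, while the institutional staff ID stays inside the bundle.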

This service will be useful to publishers, to Monitor, to other Jisc services and to HEIs (especially where their authors are not leading), as well as to authors themselves. It potentially addresses a core failure by providing a shared service: a central point where such data can be entered, pushed or harvested, and therefore aggregated.

Monitor proposes to deliver a technical demonstrator of what is in effect a Co-referencing (mapping, matching, resolution) service that can be exposed within our own workflows, within other services (e.g. local university systems, publisher submission systems) or as a national shared service.

Some Detail

1 – Mutual requirement – Both publishers and institutions want to gather ‘authentic’ data as early as possible in the process; authors need to submit the same data multiple times in different systems (repeat submissions, further papers)

2 – Multiplicity is pervasive – This includes (1) Authors each with multiple Person IDs, (2) Grants, (3) Organisation IDs – both HEIs and Funders

3 – Further Complexity – A challenge for verifying compliance and ultimately for the RES is that Researchers frequently play ‘away’ in articles, books and conferences led by authors from other institutions, perhaps in other countries

4 – Monitor Assumption – We will never collectively solve this by proposing a fixed workflow or a single ID; whilst it is important for the sector to consolidate wherever possible (e.g. on ORCID and ISNI), global market forces will continue to introduce further complexity

5 – The Monitor Proposal – Monitor will prototype GUIDE as a profile service that [G]athers [U]seful / re[U]sable [ID]entification data [E]arly / [E]fficiently, which will be very useful as an underlying service for Monitor itself. Think of it as a wallet containing a set of IDs that can be built up over time and co-referenced / cross-walked / mapped so that:

  • Authors can easily declare a bundle of IDs to publishers
  • Those IDs can be re-used throughout the supply chain, principally in OA publishing but also potentially downstream in alt-metrics (e.g. Academia.edu, Plum X)
  • Other actors optionally can add IDs and other relevant assertions to the author’s wallet (e.g. Institutions, Publishers)
  • Machines can detect additions using matching algorithms – for example harvesting IDs from other supply chain services
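The wallet idea can be sketched as a small co-referencing structure. The example below is only an illustration under stated assumptions: the `Wallet` class, its method names, the actor labels and the ID values are hypothetical, and union-find is simply one convenient way to group IDs that have been asserted to refer to the same person. It does not handle withdrawing an assertion that later proves wrong.

```python
class Wallet:
    """Groups (namespace, value) pairs asserted to identify the same thing."""

    def __init__(self):
        self._parent = {}     # union-find parent pointers
        self.assertions = []  # provenance log: (asserted_by, id_a, id_b)

    def _find(self, ident):
        self._parent.setdefault(ident, ident)
        while self._parent[ident] != ident:
            # Path halving keeps later lookups short.
            self._parent[ident] = self._parent[self._parent[ident]]
            ident = self._parent[ident]
        return ident

    def assert_same(self, asserted_by, id_a, id_b):
        # Any actor (author, institution, publisher, harvester) can add a link.
        self.assertions.append((asserted_by, id_a, id_b))
        root_a, root_b = self._find(id_a), self._find(id_b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def equivalents(self, ident):
        root = self._find(ident)
        return {i for i in list(self._parent) if self._find(i) == root}


wallet = Wallet()
orcid = ("orcid", "0000-0000-0000-0001")   # placeholder values throughout
staff = ("inst-staff", "staff-123")
pub = ("publisher-author", "A-42")
wallet.assert_same("author", orcid, staff)
wallet.assert_same("publisher", pub, orcid)
linked = wallet.equivalents(staff)
```

Starting from the institutional ID alone, `linked` now contains all three IDs, and `wallet.assertions` records which actor made each link.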

6 – Implementation use cases – The GUIDE service could be embedded in applications (such as submission systems), or used standalone to check / cross-reference IDs or to add new IDs (potentially an App).

7 – Future proofing – The nature of this type of service and data approach is that it is lightweight and extensible, fits into any future ID-rich / linked data ecosystem and can be backed off into another service if the UK / global portfolio develops in a different way (e.g. linked to a future UK AMF post-Shibboleth vision where people can be truly identified).

8 – What would ultimate success be like? – There will be benefits way before this happens but the ultimate model is that the GUIDE ‘uber ID’ might act as the equivalent of ‘Sign on / identify yourself [for this other thing] with Facebook’ and perhaps even as the next generation replacement for Shib.

What next?

Subject to project board approval of the work package (2 September), Monitor wants to work with supply chain players (notably publishers) and institutions to prototype this approach as just one building block in the set of potential services under consideration over the next six months.

Please contact frank.manista@manchester.ac.uk if you’d like to be part of this, for example by reviewing the design or trying out the prototype service.

In addition to this technical work, the Jisc Monitor Technical Team has continued to run fortnightly webinars as part of the user engagement, focused on four use cases. A recording of the 20th of August session is available here: http://vimeo.com/104194577. The next follow-on webinar is scheduled for the 3rd of September from 10-11am. Please contact Frank if you would like to join.

 

2 thoughts on “Getting Useful IDs Early: Rationale for an ID gathering and co-referencing service”

  1. Hugh Glaser

    Interesting activity – although it would be a mistake to (yet again!) in some sense assume that it is a sensible idea to try to create de jure IDs.

    Of course, what you are describing has a lot in common with our longstanding sameAs.org services and sub-services. For example, http://sameas.org/store/foreign/ does people and paper coreference across Open Repositories.

    Actually, we now have the ability to deploy sameAs (and other services, such as differentFrom) on a better engineered platform wherever we like, for example the VIAF service for which OCLC provides the data.

    Oh yes, don’t forget that differentFrom (of things you once thought were sameAs) is actually more expensive data to build than sameAs, so it is very important to be able to manage and publish that.

    Good luck.

  2. Anna Clements

    Agree that we should be aiming to capture data, including IDs as early as possible in the process … but why do we need a separate ‘service’ to do this, rather than agreeing a common set of metadata for manuscript submission systems?

    I’m sorry if I’ve missed the point completely …. I suspect that I have!
