Feeding PIDza to VIVO: data ingest with SPARQL-Generate

by Maxime Lefrançois, Sandra Mierz

Published by Zenodo.

2021  

Abstract

The first hurdle after installing VIVO is filling it with an initial set of data about an institution, its researchers and their publications. Done manually, this is a cumbersome and time-consuming process. One way to overcome this is to use open data that carries a persistent identifier (PID) such as a ROR ID, ORCID iD or DOI. The advantage lies in reduced processing of the input data: since the data does not need to be disambiguated, the ingestion process reduces to mapping the data to the VIVO ontology. While several tools exist that can import a single PID-identified object into VIVO, the release of DataCite Commons takes this approach to the next level. DataCite Commons offers an interface to a so-called PID Graph: a structure of multiple connected data objects, each identified by a PID. It enables queries that exploit the connections between PIDs, for example querying an organization (identified by a ROR ID), its affiliated persons (identified by their ORCID iDs) and subsequently their publications (identified by DOIs), thus providing a quick data basis for an empty research information system. In this talk, we will present a microservice that imports data from the DataCite Commons PID Graph and the ROR API into VIVO (https://github.com/vivo-community/generate2vivo). This microservice is based on lifting rules defined in the SPARQL-Generate RDF transformation language, which we will introduce beforehand. SPARQL-Generate is an expressive template-based language for generating RDF streams or text streams from RDF datasets and document streams in arbitrary formats (for more information, see https://w3id.org/sparql-generate/).
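To make the idea of "mapping the data to the VIVO ontology" concrete, here is a minimal Python sketch of the organization, person and publication traversal described in the abstract. It is not the generate2vivo implementation, which uses SPARQL-Generate lifting rules. The input field names (id, name, people, publications, title), the placeholder identifiers and the minted IRIs for Position and Authorship nodes are all assumptions made for illustration, and the VIVO modelling is deliberately simplified.

```python
# Illustrative sketch only. Input field names, placeholder PIDs and the
# minted node IRIs are assumptions, not the DataCite schema or the
# generate2vivo rules.
VIVO = "http://vivoweb.org/ontology/core#"
FOAF = "http://xmlns.com/foaf/0.1/"
BIBO = "http://purl.org/ontology/bibo/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def iri(value):
    """Wrap an IRI in angle brackets, N-Triples style."""
    return f"<{value}>"


def literal(value):
    """Serialize a plain string literal with basic escaping."""
    escaped = (
        value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    )
    return f'"{escaped}"'


def lift(org):
    """Map one nested organization record to a list of N-Triples lines."""
    triples = []

    def add(subject, predicate, obj):
        triples.append(f"{iri(subject)} {iri(predicate)} {obj} .")

    add(org["id"], RDF_TYPE, iri(FOAF + "Organization"))
    add(org["id"], RDFS_LABEL, literal(org["name"]))

    for person in org.get("people", []):
        local = person["id"].rsplit("/", 1)[-1]
        add(person["id"], RDF_TYPE, iri(FOAF + "Person"))
        add(person["id"], RDFS_LABEL, literal(person["name"]))

        # Affiliation modelled as a Position node (hypothetical IRI scheme).
        position = f"{person['id']}#position"
        add(position, RDF_TYPE, iri(VIVO + "Position"))
        add(position, VIVO + "relates", iri(person["id"]))
        add(position, VIVO + "relates", iri(org["id"]))

        for pub in person.get("publications", []):
            add(pub["id"], RDF_TYPE, iri(BIBO + "Document"))
            add(pub["id"], RDFS_LABEL, literal(pub["title"]))

            # Authorship node linking person and work (hypothetical IRI scheme).
            authorship = f"{pub['id']}#authorship-{local}"
            add(authorship, RDF_TYPE, iri(VIVO + "Authorship"))
            add(authorship, VIVO + "relates", iri(person["id"]))
            add(authorship, VIVO + "relates", iri(pub["id"]))

    return triples


# Placeholder record; the identifiers below are not real PIDs.
SAMPLE = {
    "id": "https://ror.org/example",
    "name": "Example University",
    "people": [
        {
            "id": "https://orcid.org/0000-0000-0000-0000",
            "name": "Jane Doe",
            "publications": [
                {"id": "https://doi.org/10.0000/example", "title": 'A "test" paper'}
            ],
        }
    ],
}

triples = lift(SAMPLE)
print("\n".join(triples))
```

Because every object already carries a PID, the sketch reuses those identifiers directly as IRIs and performs no disambiguation, which is the simplification the abstract points to.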

Archived Files and Locations

application/pdf  1.5 MB
zenodo.org (repository)
web.archive.org (webarchive)
Type  article-journal
Stage   published
Date   2021-06-23
Language   en