Semantics.gr Use Cases
This section presents examples of how we have hosted, curated and used the first vocabularies developed at EKT to document and enrich our content services and repositories.
SearchCulture.gr is the national aggregator for Greek cultural content, and OpenArchives.gr is the largest aggregator for Greek scientific content. Both infrastructures harvest metadata and thumbnails from the providers’ distributed repositories using the OAI-PMH protocol, and both provide a public portal offering unified search and access to the digitised resources. The aggregation workflow includes data validation, transformation of the original metadata to the target model of the respective infrastructure, metadata enrichment, and publishing as Linked Open Data.
To enrich the content aggregated by the two infrastructures, we used Semantics.gr in a two-step process.
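The harvesting step described above can be sketched as follows. This is a minimal illustration of an OAI-PMH ListRecords request and of reading Dublin Core fields from a response, not the aggregators' actual code. The base URL, the sample record and the function names are assumptions made for the example; only the OAI-PMH verb, the metadataPrefix parameter and the XML namespaces come from the protocol itself.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request URL for a provider's OAI-PMH endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# Hypothetical response, inlined so the sketch runs without network access.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample item</dc:title>
          <dc:type>photograph</dc:type>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def parse_records(xml_text):
    """Extract identifier, titles and types from each harvested record."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        records.append({
            "id": rec.findtext("oai:header/oai:identifier", namespaces=NS),
            "titles": [e.text for e in rec.iterfind(".//dc:title", NS)],
            "types": [e.text for e in rec.iterfind(".//dc:type", NS)],
        })
    return records


if __name__ == "__main__":
    print(list_records_url("https://repo.example.org/oai"))
    print(parse_records(SAMPLE_RESPONSE))
```

A real harvester would also follow resumption tokens and handle deleted records; those parts are left out to keep the sketch short.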
Firstly, we developed the following 5 vocabularies:
Secondly, to serve the enrichment processes, we developed in Semantics.gr an original, user-friendly mapping tool for the semi-automatic semantic enrichment of metadata with terms from the vocabularies published on the platform. Using this tool, EKT scientific staff can normalise, homogenise and enrich the metadata aggregated in SearchCulture.gr and OpenArchives.gr.
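To give a feel for what semi-automatic enrichment means in practice, the sketch below normalises raw metadata values, maps exact matches to vocabulary terms automatically, and flags near matches for a curator to confirm. It is an illustration under stated assumptions, not the matching logic of the Semantics.gr tool: the vocabulary, the URIs, the similarity cutoff and the use of Python's difflib are all choices made for the example.

```python
import difflib
import unicodedata

# Hypothetical vocabulary: label -> term URI.
VOCAB = {
    "photograph": "https://example.org/vocab/types/photograph",
    "manuscript": "https://example.org/vocab/types/manuscript",
    "map": "https://example.org/vocab/types/map",
}


def normalise(value):
    """Lower-case, strip accents and collapse whitespace."""
    decomposed = unicodedata.normalize("NFD", value)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(stripped.casefold().split())


def suggest_mappings(raw_values, vocab, cutoff=0.8):
    """Return {raw value: (term URI or None, status)} for curator review."""
    labels = {normalise(label): uri for label, uri in vocab.items()}
    suggestions = {}
    for raw in raw_values:
        key = normalise(raw)
        if key in labels:
            # Identical after normalisation: safe to map automatically.
            suggestions[raw] = (labels[key], "exact")
            continue
        close = difflib.get_close_matches(key, list(labels), n=1, cutoff=cutoff)
        if close:
            # Similar but not identical: a person should confirm the mapping.
            suggestions[raw] = (labels[close[0]], "needs review")
        else:
            suggestions[raw] = (None, "unmapped")
    return suggestions


if __name__ == "__main__":
    for raw, result in suggest_mappings(
        ["Photograph ", "photographs", "φωτογραφία"], VOCAB
    ).items():
        print(repr(raw), result)
```

Keeping a separate "needs review" status is what makes the process semi-automatic: the machine proposes, and the curator decides on anything that is not an exact match.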
A vocabulary based on the OECD Fields of Research and Development (FORD) classification (OECD 2015). It follows FORD for the 6 first-level broad thematic areas and the 42 second-level thematic areas. EKT staff processed the second-level thematic areas in order to create a third, fine-grained level. The resulting SKOS vocabulary comprises 474 unique bilingual subject terms covering the main areas of Science, Technology & Development. The terms are organised by hierarchical relationships (broader / narrower), while semantic links to external open resources are expressed via the exact match, close match and related attributes. The enrichment with new terms and links is ongoing.
The vocabulary is being used for the enrichment of the National PhD Archive repository, both retrospectively by EKT staff and during self-archiving by PhD candidates, who choose the subject areas their dissertation relates to.
The vocabulary will also be used in other aggregation and repository services of EKT, for example to enrich the scientific data aggregator OpenArchives.gr.
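The three-level structure described above can be pictured with a small data model. The sketch below is illustrative only: the first two entries follow FORD (1 Natural sciences, 1.1 Mathematics), while the third-level term, the Greek labels as written here, all URIs and the external link are hypothetical stand-ins for the actual vocabulary content. The narrower relation is derived from broader rather than stored.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

BASE = "https://example.org/vocab/ford/"  # hypothetical URI base


@dataclass
class Concept:
    uri: str
    pref_label: Dict[str, str]           # language code -> preferred label
    broader: Optional[str] = None        # URI of the parent concept, if any
    exact_match: List[str] = field(default_factory=list)  # external links


CONCEPTS = {
    c.uri: c
    for c in [
        Concept(BASE + "1", {"en": "Natural sciences", "el": "Φυσικές επιστήμες"}),
        Concept(BASE + "1.1", {"en": "Mathematics", "el": "Μαθηματικά"},
                broader=BASE + "1"),
        # Hypothetical third-level term with a placeholder external link.
        Concept(BASE + "1.1.1", {"en": "Algebra", "el": "Άλγεβρα"},
                broader=BASE + "1.1",
                exact_match=["https://example.org/external/algebra"]),
    ]
}


def narrower(uri, concepts):
    """Children of a concept, derived from the broader links."""
    return sorted(c.uri for c in concepts.values() if c.broader == uri)


def path_to_top(uri, concepts, lang="en"):
    """Labels from the top-level area down to the given concept."""
    labels, seen = [], set()
    current = concepts.get(uri)
    while current is not None and current.uri not in seen:
        seen.add(current.uri)  # guard against accidental cycles
        labels.append(current.pref_label[lang])
        current = concepts.get(current.broader) if current.broader else None
    return list(reversed(labels))


if __name__ == "__main__":
    print(path_to_top(BASE + "1.1.1", CONCEPTS))
    print(narrower(BASE + "1", CONCEPTS))
```

Storing only the broader link and computing narrower on demand keeps the two directions of the hierarchy from drifting apart when terms are added.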
Two more pivotal vocabularies have been created by EKT staff in Semantics.gr: one for natural persons and one for corporate bodies. These vocabularies are currently used in the documentation and enrichment processes of EKT’s scientific repositories (the National PhD Archive repository, the OA scientific ePublishing platform and the new EKT institutional repository). The two broad vocabularies, “Persons” and “Corporate Bodies”, can be further elaborated and indexed, based on their attributes, into groupings such as “academic institutions”, “PhD holders”, etc.
The vocabularies are hosted and managed in Semantics.gr. Semantics.gr interoperates with the distributed repositories, and in particular with their documentation environments, where the scientific resources are catalogued. Individual fields in the cataloguing forms, such as creator, contributor and editor, draw controlled values from the vocabularies in Semantics.gr and are populated with that data in real time. In parallel, EKT staff use the semantic enrichment tools for the mass, semi-automatic retrospective documentation and homogenisation of the contents of the respective infrastructures.
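The way a cataloguing field draws controlled values can be sketched as a prefix lookup that returns a label together with the entity URI to be stored in the record. How the repositories actually query Semantics.gr is not described here, so the example replaces the remote call with an in-memory list; the entries, URIs and function name are hypothetical.

```python
# Hypothetical entries from a "Persons" vocabulary.
PERSON_ENTRIES = [
    {"label": "Papadopoulos, Maria", "uri": "https://example.org/persons/1"},
    {"label": "Papadakis, Nikos", "uri": "https://example.org/persons/2"},
    {"label": "Georgiou, Eleni", "uri": "https://example.org/persons/3"},
]


def autocomplete(prefix, entries, limit=5):
    """Return up to `limit` entries whose label starts with the typed prefix."""
    needle = prefix.strip().casefold()
    if not needle:
        return []
    hits = [e for e in entries if e["label"].casefold().startswith(needle)]
    return sorted(hits, key=lambda e: e["label"])[:limit]


if __name__ == "__main__":
    # A cataloguer types "papa" in the creator field.
    for entry in autocomplete("papa", PERSON_ENTRIES):
        print(entry["label"], "->", entry["uri"])
```

The form would store the selected URI rather than the typed text, so the record points at the vocabulary entry instead of holding its own copy of the name.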
This process has several benefits. Each person and corporate body entity created in Semantics.gr is assigned a unique URI, which is then used uniformly to describe that entity across all of EKT’s digital repositories. All information gathered in this central pool can be easily managed and updated in one place and is automatically synchronised across all repositories. Data quality is enhanced and can be better measured. In addition, each entity is linked to all the information relevant to it that lives in different repositories (e.g. a person’s PhD thesis and her articles in an online journal hosted on EKT’s ePublishing platform).
The application profiles for the creation of the above-mentioned vocabularies have been modelled on MADS/RDF. Academic researchers make up the first list of natural persons included in these vocabularies, and the vocabulary is being used in the cataloguing process of EKT’s new institutional repository.
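The benefit of a shared URI can be shown with a toy example. Records in two repositories reference the same person by URI, so the works can be gathered across repositories and a label corrected once in the central registry appears everywhere. The registry, the records and the URI below are hypothetical, and the automatic synchronisation is simulated here by resolving the label at display time.

```python
from collections import defaultdict

PERSON = "https://example.org/persons/42"  # hypothetical entity URI

# Central registry: the only place where the name is stored.
REGISTRY = {PERSON: {"label": "Papadopoulou, M."}}

# Hypothetical records in two repositories, both pointing at the same URI.
PHD_ARCHIVE = [{"title": "A doctoral thesis", "creator": PERSON}]
EPUBLISHING = [{"title": "A journal article", "creator": PERSON}]


def works_by_entity(**repositories):
    """Group (repository, title) pairs by the creator URI they reference."""
    index = defaultdict(list)
    for repo_name, records in repositories.items():
        for record in records:
            index[record["creator"]].append((repo_name, record["title"]))
    return dict(index)


def display(record, registry):
    """Render a record, resolving the creator label from the registry."""
    return f'{registry[record["creator"]]["label"]}: {record["title"]}'


if __name__ == "__main__":
    print(works_by_entity(phd_archive=PHD_ARCHIVE, epublishing=EPUBLISHING))
    # One correction in the registry changes the display in every repository.
    REGISTRY[PERSON]["label"] = "Papadopoulou, Maria"
    print(display(PHD_ARCHIVE[0], REGISTRY))
    print(display(EPUBLISHING[0], REGISTRY))
```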
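As a rough picture of what an entry modelled on MADS/RDF might look like, the sketch below serialises one person or corporate body as Turtle. The fields of EKT's application profiles are not listed here, so the example uses only the MADS/RDF namespace, the PersonalName and CorporateName classes and the authoritativeLabel property; the entity URI, the label and the function names are hypothetical.

```python
MADS = "http://www.loc.gov/mads/rdf/v1#"  # MADS/RDF namespace

ALLOWED_KINDS = ("PersonalName", "CorporateName")


def escape_literal(text):
    """Escape backslashes and double quotes for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def to_turtle(uri, kind, label, lang):
    """Serialise one authority entry as a minimal Turtle document."""
    if kind not in ALLOWED_KINDS:
        raise ValueError(f"kind must be one of {ALLOWED_KINDS}")
    return (
        f"@prefix madsrdf: <{MADS}> .\n\n"
        f"<{uri}> a madsrdf:{kind} ;\n"
        f'    madsrdf:authoritativeLabel "{escape_literal(label)}"@{lang} .\n'
    )


if __name__ == "__main__":
    print(to_turtle(
        "https://example.org/persons/42",  # hypothetical entity URI
        "PersonalName",
        "Papadopoulou, Maria",
        "en",
    ))
```

A production workflow would more likely build the graph with an RDF library than with string formatting; plain strings are used here only to keep the example dependency-free.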