Introduction
One of the major challenges that the EOSC aims to address is the historical lack of Interoperable Online Catalogues of Research Resources that European Researchers could explore across Europe.
EOSC Profiles are specifications that define common data models for EOSC entities (Providers, Resources, etc.) and related Code Lists, Taxonomies and Classifications. They contribute to the unified framework for describing and offering EOSC Resources to end-users in a harmonized way, guaranteeing the interoperability of resource metadata through open APIs. They allow EOSC resource information and its accompanying data to be exchanged and managed automatically (e.g., through harvesting), without human intervention.
Prior Efforts
The eInfraCentral project first recognised that a common approach to describing (Goal-1) and exchanging (Goal-2) Resource-related information is the way forward to enhance discoverability and thus potential uptake of Resources in a European single digital market for Research. eInfraCentral worked on this harmonisation in partnership with five key e-infrastructures: GÉANT, OpenAIRE, PRACE, EGI and EUDAT. The approach was to extend best practices followed independently and to enable the harmonisation of Resource descriptions to allow interoperability and the possibility for a common catalogue or a catalogue of catalogues.
Goal-1 was addressed by the EOSC Portal Profiles (previously known as Service Description Templates (SDT)) that are widely adopted as the de facto standard scheme for the representation of Resource-related information in the EOSC Catalogue. Profiles are simplified, reusable and extensible data models that capture the fundamental characteristics of a data entity in a context-neutral and syntax-neutral fashion. This document addresses the current status of the EOSC Portal Profiles.
Goal-2 was addressed by a rich set of Open REST API methods for the exchange of information among Providers' systems and Portals. The open APIs include methods and mechanisms for data acquisition (resource metadata, indicators, usage, etc.) from federated catalogues, to enable seamless synchronisation of content. The latest version of the EOSC Portal APIs is described in D3.2 - EOSC Portal Open APIs Specifications.
Thus, as depicted in the figure below, the EOSC Resource Providers (data, apps, instruments, etc.) will feed thematic, regional and other aggregators and in turn allow for the aggregated European Open Science Cloud Portal as an additional distribution channel.
This approach bore fruit with the launch of the eInfraCentral Portal in 2017, later the EOSC-hub Portal and finally the EOSC Portal in November 2018 (with SDTv1.13). The work continued within the EOSC Portal Collaboration Agreement of eInfraCentral, EOSC-hub and OpenAIRE-Advance, and within CatRIS, which extended it to offerings by Research Infrastructures (RIs), Core Facilities (CFs) and Shared Scientific Resources (SSRs) (see Figure below).
Figure 1: The EOSC Ecosystem
Note: App/SW/Data = Applications/Software/Data; App/SW = Applications/Software; RI = Research Infrastructures; CF = Core Facility; SP = Service Provider; DP = Data Provider; NOAD = National Open Access Desk; ASP = Application Software Provider. Source: JNP, eInfraCentral project.
Figure 2: Prior efforts on EOSC Profiles
EOSC Portal Profiles v3.00
Overall, the above set of concerted actions has led to the development of the EOSC Portal Profiles version v3.00, which were presented in EOSC Enhance D2.2 – EOSC Processes development and consensus. That version of the EOSC Profiles included two profiles: the Provider Profile and the Resource Profile, each addressing a different entity and a different phase of the onboarding, update, maintenance, and monitoring processes. With v3.00, the EOSC Portal Profiles became one of the most significant components of the EOSC Interoperability Framework.
The Profiles define their attributes, the format/type (if any) and multiplicity of each attribute, and whether the attribute is mandatory or optional for the implementation of several features in Catalogues.
They also provide rules for validating input data. Each Profile should have its own online interface that allows a Manager to keep it up to date at the Portal.
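As a minimal sketch of how such attribute definitions and validation rules could be expressed in code, the Python example below models an attribute by its type, multiplicity and requirement level, and checks a record against those definitions. The attribute names (`name`, `website`, `tags`) and the `AttributeSpec` structure are illustrative assumptions, not the actual fields of the EOSC Profiles.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class AttributeSpec:
    """Hypothetical description of one profile attribute."""
    name: str
    value_type: type                 # expected Python type of each value
    mandatory: bool                  # True if the attribute must be present
    max_values: Optional[int] = 1    # multiplicity upper bound; None = unbounded


# Illustrative attribute definitions (assumed names, not the real profile).
EXAMPLE_PROFILE = [
    AttributeSpec("name", str, mandatory=True),
    AttributeSpec("website", str, mandatory=True),
    AttributeSpec("tags", str, mandatory=False, max_values=None),
]


def validate(record: Dict[str, Any], profile: List[AttributeSpec]) -> List[str]:
    """Return a list of human-readable validation errors (empty if valid)."""
    errors = []
    for spec in profile:
        if spec.name not in record:
            if spec.mandatory:
                errors.append(f"{spec.name}: mandatory attribute is missing")
            continue
        value = record[spec.name]
        # Normalise single values to a list so multiplicity can be checked.
        values = value if isinstance(value, list) else [value]
        if spec.max_values is not None and len(values) > spec.max_values:
            errors.append(f"{spec.name}: at most {spec.max_values} value(s) allowed")
        for item in values:
            if not isinstance(item, spec.value_type):
                errors.append(f"{spec.name}: expected {spec.value_type.__name__}")
    return errors
```

Returning a list of errors rather than raising on the first problem lets an onboarding form show every issue in a record at once.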
The Profiles also include an extensive set of Code lists, Taxonomies and Classifications that have been developed to provide a structured classification of Resources and a harmonized way of describing various attributes. They also constitute the basis for the structure and the filtering functions of a (centralized) Catalogue, thus allowing for easy cross-referencing, comparability, and evaluation.
The specification of the EOSC Provider Profile v3.00 is available at https://wiki.eoscfuture.eu/x/mIABAQ and the specification of the EOSC Resource Profile v3.00 at https://wiki.eoscfuture.eu/x/moABAQ. The Profiles can be downloaded in tabular form at https://wiki.eoscfuture.eu/x/4YABAQ and https://wiki.eoscfuture.eu/x/m4ABAQ and in Excel and PDF at https://wiki.eoscfuture.eu/x/4YABAQ.
EOSC Portal Profiles v3.00 is the currently implemented version. The transition to v4.00 will start on 1/1/2022.
EOSC Portal Profiles v4.00
After v3.00 was adopted as a de facto standard in the EOSC ecosystem and deployed successfully at the EOSC Portal and by other EOSC stakeholders, EOSC Enhance received several Requests for Changes (RfCs) for the EOSC Portal Profiles. Those requests are grouped and presented briefly below:
| Jira ID | EOSC Portal Profiles’ Request for Change (RfC) | Subject | Implementation |
| | Move Hosting Legal Entity from 'other' to 'basic info' in the Provider Profile | Attribute | Up to 01/2022 |
| | Add a new field for a very short resource name | Attribute | Up to 01/2022 |
| | Add a description of Multimedia in the Marketing section of the Resource Profile | Attribute | Up to 01/2022 |
| | Values of ERP.DEI.3 (Related Platforms) controlled by EPOT | Attribute | In July 2022 |
| | Profiles: Optional fields in Resource Profile: terms of use, privacy policy | Attribute | In July 2022 |
| | Resource Profiles: add controlled values to FUNDERS and FUNDING PROGRAM | Controlled Values | Up to 01/2022 |
| | Request to add values to the Funding bodies and programs controlled vocabulary | Controlled Values | Up to 01/2022 |
| | Minimum Service Maturity level as a Prerequisite for listing resources on the marketplace | Controlled Values | Up to 01/2022 |
| | Provider > Location > Country should list only countries, not 'Europe' and 'Worldwide' | Controlled Values | Up to 01/2022 |
| | New Values for Provider profile - Vocabulary | Controlled Values | Up to 01/2022 |
| | Create a new Entity named 'Catalogue' and add the 'Catalogue' attribute to both Provider and Resource profiles | New Profile | In July 2022 |
Those requests, together with the need (a) to facilitate the onboarding and interoperability of the EOSC Portal with Multi-Provider Catalogues of Thematic and Regional Portals and (b) to describe additional resource types such as Data Sources and Research Products, led to the update of the EOSC Portal Profiles and the release of version v4.00.
The EOSC Profile updates to v4.00 can be grouped as:
- Updates to the EOSC Provider and Resource Profiles v3.00:
  - A few additions of new Attributes
  - A few changes of Attribute Requirement (mandatory/optional)
  - Some re-organisation of Attributes in Information Blocks
  - A few updates to the Code Lists, Taxonomies and Classifications
  - Some changes of Attribute types
- An additional Profile introduced:
  - Multi-Provider Catalogue Profile
- Additional extensions introduced:
  - Data Sources and Research Products
The changes introduced fall into two groups as noted in the table above:
- those that have to be applied immediately because they address operational issues or provide additional needed functionality, and that are backwards compatible (they do not impact existing implementations and interoperability), and
- those that are necessary enhancements but are not backwards compatible and, as such, must be announced at least six months before their implementation (as per the Change Management Process), so that providers and other EOSC stakeholders can adjust their implementations and/or prepare to comply with the new specification requirements.
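The six-month notice period can be illustrated with a small date calculation. The sketch below assumes that the notice period is counted in whole calendar months from the announcement date; the function name and this counting rule are assumptions made for illustration and are not taken from the Change Management Process.

```python
import calendar
from datetime import date


def earliest_implementation(announced: date, notice_months: int = 6) -> date:
    """Earliest date a non-backwards-compatible change may take effect,
    assuming the notice period is counted in whole calendar months."""
    month_index = announced.month - 1 + notice_months
    year = announced.year + month_index // 12
    month = month_index % 12 + 1
    # Clamp the day for shorter target months (e.g. 31 August -> 28/29 February).
    day = min(announced.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)
```

For example, under this assumption a change announced on 1 January 2022 could take effect no earlier than 1 July 2022.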
The EOSC Portal Profiles are part of the EOSC Interoperability Framework. The EOSC Interoperability Framework as a whole constitutes an important pillar to realise the EOSC vision and framework. It is an evolving specification, which will incorporate new features from the EOSC ecosystem as they emerge.
Version 4.00 of the EOSC Profiles includes three profiles, each addressing a different entity and a different phase of the onboarding, update, maintenance and monitoring processes of a Resource by a Provider: the Provider Profile, the Resource Profile and the Multi-Provider Catalogue Profile.
The EOSC Profiles define their attributes, the format/type (if any) and multiplicity of each attribute, and whether the attribute is mandatory or optional for the implementation of a number of features in Catalogues. They also provide rules for validating input data.
The EOSC Profiles also include Provider and Resource Code lists, Taxonomies and Classifications that have been developed to provide a structured classification of Resources and a harmonized way of describing various attributes. They also constitute the basis for the structure and the filtering functions of an EOSC Catalogue.
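To illustrate how a controlled vocabulary can drive the filtering functions of a catalogue, the sketch below filters resource records by a category code and rejects codes that are not in the code list. The category codes and resource records are invented for the example and do not come from the EOSC Profiles.

```python
from typing import Dict, List

# Hypothetical code list: code -> human-readable label.
CATEGORIES: Dict[str, str] = {
    "compute": "Compute",
    "storage": "Storage",
    "training": "Training",
}

# Hypothetical catalogue entries, each tagged with codes from the list above.
RESOURCES: List[Dict] = [
    {"id": "res-1", "name": "Batch cluster", "categories": ["compute"]},
    {"id": "res-2", "name": "Object store", "categories": ["storage"]},
    {"id": "res-3", "name": "HPC course", "categories": ["training", "compute"]},
]


def filter_by_category(resources: List[Dict], code: str) -> List[Dict]:
    """Return resources tagged with `code`; reject unknown codes."""
    if code not in CATEGORIES:
        raise ValueError(f"unknown category code: {code!r}")
    return [r for r in resources if code in r["categories"]]
```

Because every record uses codes from the same list, one filter can work across records that come from different providers.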
The specification of the EOSC Provider Profile v4.00 is available at https://wiki.eoscfuture.eu/x/o4ABAQ, the EOSC Resource Profile v4.00 at https://wiki.eoscfuture.eu/x/p4ABAQ and the EOSC Multi-Provider Catalogue Profile v4.00 at https://wiki.eoscfuture.eu/x/nYABAQ. The Profiles can be downloaded in tabular form at https://wiki.eoscfuture.eu/x/pIABAQ, https://wiki.eoscfuture.eu/x/pIABAQ and https://wiki.eoscfuture.eu/x/noABAQ and in Excel and PDF at https://wiki.eoscfuture.eu/x/4YABAQ.
Furthermore, the work in progress for the specifications of the EOSC Data Source Profile is available at https://wiki.eoscfuture.eu/x/nIABAQ and the Research Product Profile at https://wiki.eoscfuture.eu/x/pYABAQ.
EOSC Portal Profiles Data Model
As in v3.00, the main building block of the Data Model in v4.00 is the Resource. A Resource is identified by a persistent unique ID, which is generated by the Portal during the EOSC Portal Onboarding Process (EPOP). Furthermore, a Resource is described by a set of attributes as depicted in the Figure below.
A Resource is offered and managed by a Provider, identified by a Provider ID. A Provider is also described by a set of attributes. A Provider Manager is responsible for managing (adding, updating, maintaining) the Provider’s profile. A Resource Manager is responsible for managing (adding, updating, maintaining) the Resource’s profile.
A Resource is associated with one or more Options/Instantiations and Performance Indicators (Resource Level Targets), which are used for defining indicator measurements. A Resource is characterised by a set of usage statistics collected by the Portal (e.g. the number of visits on a Resource page, number of orders on the Provider page, number of favourites, average ratings, etc.). These Statistics are reported to the Provider, who may report them to the Funder.
An Authenticated (registered) User is a user who can log in to the Portal and generate events for a Resource, such as rating a Resource, adding a Resource to the favourites, updating a Resource, etc. An Authenticated User may belong to a Provider, meaning they are authorised to manage and monitor the Resources of that Provider (e.g. add a Resource, update a Resource, etc.).
Figure 3: Overview of the EOSC Portal Data Model
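The relationships described above can be sketched as a handful of Python data classes. This is an illustrative reading of the prose, not the official model: the class and field names, and the use of `uuid4` to stand in for the persistent ID generated by the Portal, are assumptions.

```python
from dataclasses import dataclass, field
from typing import List
from uuid import uuid4


@dataclass
class Provider:
    provider_id: str
    name: str
    managers: List[str] = field(default_factory=list)  # user names of Provider Managers


@dataclass
class Resource:
    name: str
    provider: Provider
    # Stand-in for the persistent unique ID that the Portal generates.
    resource_id: str = field(default_factory=lambda: str(uuid4()))
    visits: int = 0                                     # example usage statistic
    ratings: List[int] = field(default_factory=list)

    def average_rating(self) -> float:
        """Average of all ratings, or 0.0 if the Resource is unrated."""
        return sum(self.ratings) / len(self.ratings) if self.ratings else 0.0


@dataclass
class AuthenticatedUser:
    username: str

    def rate(self, resource: Resource, stars: int) -> None:
        """Generate a rating event for a Resource."""
        resource.ratings.append(stars)

    def can_manage(self, resource: Resource) -> bool:
        """A user may manage a Resource only if they belong to its Provider."""
        return self.username in resource.provider.managers
```

In this sketch any authenticated user can rate a Resource, while management rights depend on membership of the Provider that offers it.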
EOSC Portal Profiles Representation Model
In the current implementation of the EOSC Portal all Providers and Resources are described by a structured set of metadata, well-defined by the EOSC Portal Profiles, instantiated in a JSON Schema and currently moving from v3.00 to v4.00.
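As a rough illustration of what instantiating a Profile in JSON Schema can look like, the sketch below builds a small schema as a Python dictionary and serialises it with the standard `json` module. The attribute names and the `missing_required` helper are hypothetical; a real deployment would use the published schema and a full JSON Schema validator.

```python
import json

# Hypothetical, heavily reduced schema for a Provider record.
PROVIDER_SCHEMA = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "title": "Provider (illustrative)",
    "type": "object",
    "properties": {
        "id": {"type": "string"},
        "name": {"type": "string"},
        "website": {"type": "string", "format": "uri"},
        "tags": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["id", "name", "website"],
}


def missing_required(record: dict, schema: dict) -> list:
    """Check only the `required` keyword; other keywords are ignored here."""
    return [key for key in schema.get("required", []) if key not in record]


if __name__ == "__main__":
    # Print the schema as it would appear in a JSON file.
    print(json.dumps(PROVIDER_SCHEMA, indent=2))
```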
The EOSC Portal Profiles have been widely accepted and embraced by EOSC projects and initiatives that want to integrate or correlate their catalogues with the EOSC Portal. These developments underline the need to implement and publish the data models more formally than as JSON or PDF files.
The EOSC Portal does not currently provide a linked data endpoint, but EOSC Enhance has collaborated with the CatRIS project to make the vocabularies and classifications in the EOSC Portal Profiles available in RDF format (SKOS). Even though this will ensure consistency in the use of controlled values, it does not provide a holistic standard specification model that may assist software developers in building a software code base that is fully compatible with the latest EOSC Portal Profile version.
To address this gap, EOSC Enhance has initiated the development of a Unified Modelling Language (UML) representation of the EOSC Portal Profiles that is compatible with the existing software code base. This work continues in EOSC Future and will be delivered in that context.
UML has been selected because it is a general-purpose, standard modelling language for specifying, visualizing, constructing, and documenting the artifacts of software systems, and it can also be used for business modelling and other non-software systems. It consists of an integrated set of diagrams. UML is not a development method by itself; however, it was designed to be compatible with the leading object-oriented software development methods. It has gained wide acceptance in software engineering as a de facto tool for object-oriented modelling, offering notations for describing class relationships, component systems, processes, use cases, and more.
The UML representation of the current EOSC Portal Profiles will benefit the EOSC in the following ways:
- It provides EOSC Users with a ready-to-use, expressive visual modelling language so they can develop and exchange meaningful models;
- It allows Projects, Thematic Clusters or any other EOSC User to implement a fully standalone or compatible catalogue without needing additional technical specifications related to the model;
- It provides extensibility and specialisation mechanisms to extend the core concepts;
- It offers a specification that is agnostic of programming language(s) and development processes;
- It supports higher-level development concepts such as collaborations, frameworks, patterns, and components.
The UML description will provide the following parts of a software design: the definition of the involved entities (i.e., the system classes), and the workflow and interactions between those entities.
Out of the 13 different defined diagram types, the Class Diagram will be used. Class diagrams describe classes, their attributes and methods, and associations between classes.
- Classes: a class is the conceptual representation for an entity to be modelled. It can have attributes and operations. Objects of a given class will be called its instances.
- Attributes: each class can have several attributes holding information. Their visibility can be set to public, protected, or private, which restricts access from other classes.
- Methods: a set of operations may be defined for each class. Typically, these operations access and modify attribute values.
- Associations: relations between classes can be modelled by so-called associations. UML defines the concepts of associations, aggregations, compositions, and generalisations. Instances of associations are called links.
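The four class-diagram concepts listed above map naturally onto object-oriented code. The Python sketch below is one possible mapping with invented class names; Python marks visibility only by naming convention (a leading underscore), so the public/protected distinction shown is a convention rather than an enforced rule.

```python
from typing import List


class Entity:
    """Generalisation: a common base class for catalogue entities."""

    def __init__(self, identifier: str) -> None:
        self.identifier = identifier      # public attribute
        self._status = "draft"            # protected by convention

    def publish(self) -> None:
        """Method (operation) that modifies an attribute."""
        self._status = "published"

    def status(self) -> str:
        """Method that reads an attribute."""
        return self._status


class Contact:
    def __init__(self, email: str) -> None:
        self.email = email


class Provider(Entity):
    """Specialisation of Entity."""

    def __init__(self, identifier: str, contact_email: str) -> None:
        super().__init__(identifier)
        # Composition: the Contact is created and owned by the Provider.
        self.contact = Contact(contact_email)


class Resource(Entity):
    def __init__(self, identifier: str, provider: Provider) -> None:
        super().__init__(identifier)
        # Association: a Resource refers to an independently existing Provider.
        self.provider = provider


class Catalogue:
    def __init__(self) -> None:
        # Aggregation: the Catalogue groups Resources that can exist without it.
        self.resources: List[Resource] = []

    def add(self, resource: Resource) -> None:
        self.resources.append(resource)
```

Each object created from these classes is an instance, and each concrete reference such as a particular Resource's `provider` corresponds to a link in UML terms.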