BESD – Monitoring and Evaluation of Software for Data Spaces and Use Cases

What must a software solution look like so that organizations can exchange data on it?

BESD: Data Spaces with Use Cases

Get involved in testing and evaluation – your feedback is an important contribution!

Billions of pieces of data are generated, analysed and managed every day: environmental data, sensor data, health data, data from social media and the web, to name just a few areas. These volumes of data are changing business models in the economy right now. But many people face closed doors when it comes to using data that is not generated in their own company or environment.

The Federal Ministry for Climate Action, Environment, Energy, Mobility, Innovation and Technology (BMK) wants to promote the intelligent use of data and initiate an ecosystem together with the digital economy. The Data Intelligence Offensive provides support here. One step in this direction is the possibility to use and share data in a simple way. This should be done via data spaces and within these in innovative, decentralised use cases. For this purpose, the BMK has purchased software (Nexyo Data Hub), which is available free of charge for one year.

The BESD project is aligned with the infrastructure flagship initiative Gaia-X, in which representatives from business, science and politics promote an open, transparent and secure digital ecosystem at the international level, where data and services can be made available, collected and shared in a trusted environment. With the help of the Nexyo Data Hub, innovative use cases are to be developed on a small scale through metadata matching and then implemented.

For a better understanding, we have created a glossary of the most important terms:

In order to bring storage resources into the ecosystem, they must be actively integrated. This requires so-called connectors (APIs) that enable the technical connection of the (meta) data on the storage resources to an infrastructure (e.g., software solution, connectors, high-performance computing resources) in the ecosystem. In many cases, there are already well-established standards for interfaces so that systems can easily communicate with each other (interoperability).

EDC

The Eclipse Dataspace Connector (EDC) provides a connector framework for sovereign, cross-organizational data exchange. The framework includes modules for data retrieval, data exchange, policy enforcement, monitoring and auditing. In particular, it integrates with existing identity, data catalog, and transfer technologies to provide cross-enterprise compliance, policy, and control capabilities.

A Data Asset consists of, among other things: Data, Metadata, Connector Data, Data Source, Storage Location and API.

Use cases are topic-specific platforms within a Data Space. They focus on a subarea of the respective domain and enable the exchange, use, and trading of data within that clearly defined subarea. The main prerequisite for use cases is a change in basic attitudes towards trading data: data should be traded on a virtual data market and ultimately be shared.

If data is to be shared in the data ecosystem, it is important to define the conditions for data sharing. To avoid having to start a lengthy administrative process for every transaction, there are new and innovative ways to easily create contracts, known as Data Contracts (smart contracts).
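
The idea of agreeing on sharing conditions once, instead of negotiating every transaction, can be sketched as a small data structure. This is only an illustration, not how the Nexyo Data Hub or any smart-contract platform represents contracts; the class `DataContract`, its fields, and the example values are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DataContract:
    """Hypothetical record of the conditions two parties agreed on once."""
    provider: str
    consumer: str
    asset_id: str
    permitted_purposes: frozenset[str]
    valid_until: date

    def permits(self, consumer: str, asset_id: str, purpose: str, on: date) -> bool:
        """Check a single transaction against the agreed conditions."""
        return (
            consumer == self.consumer
            and asset_id == self.asset_id
            and purpose in self.permitted_purposes
            and on <= self.valid_until
        )


# Illustrative values only.
contract = DataContract(
    provider="city-a",
    consumer="operator-b",
    asset_id="air-quality-2021",
    permitted_purposes=frozenset({"research", "route-planning"}),
    valid_until=date(2022, 6, 30),
)

allowed = contract.permits("operator-b", "air-quality-2021", "research", date(2022, 1, 15))
denied = contract.permits("operator-b", "air-quality-2021", "advertising", date(2022, 1, 15))
```

Each later transaction is then a cheap check against the stored contract rather than a new negotiation.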

In the context of the data economy, a domain spans a network of actors who deal with a clear data-specific topic.

A domain can be:

  • an economic sector (e.g. construction industry, energy industry, etc.),
  • an industrial sector (furniture industry, textile industry, etc.),
  • but also a thematic block spanning several areas (smart cities, mobility, etc.).

To start collaborating in the data ecosystem, participants need to know what data is available in it. Metadata catalogs are created for this purpose. In Gaia-X, there are decentralized metadata catalogs, which are lists of data assets. These lists can then be merged in the ecosystem so that all the data in the ecosystem can be seen at a glance. The data itself, however, remains on the original storage resources to ensure data sovereignty.

Federated Data Catalog

The sum of all data catalogs of the participants of a Data Space.
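
As a minimal sketch of this idea, the following merges the metadata catalogs of several participants into one searchable list while the data itself stays where it is. The names `CatalogEntry`, `federate`, and `search`, as well as the example participants and assets, are assumptions made for illustration and do not come from Gaia-X or the Nexyo Data Hub.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """Metadata describing one data asset; the data itself stays with the provider."""
    asset_id: str
    title: str
    keywords: tuple[str, ...]
    provider: str  # participant that hosts the asset


def federate(catalogs: dict[str, list[CatalogEntry]]) -> list[CatalogEntry]:
    """Merge every participant's catalog into one list, sorted by title."""
    merged = [entry for entries in catalogs.values() for entry in entries]
    return sorted(merged, key=lambda e: e.title)


def search(catalog: list[CatalogEntry], keyword: str) -> list[CatalogEntry]:
    """Return entries that carry the given keyword (case-insensitive)."""
    needle = keyword.lower()
    return [e for e in catalog if needle in (k.lower() for k in e.keywords)]


# Hypothetical catalogs of two participants.
catalogs = {
    "city-a": [
        CatalogEntry("a-1", "Air quality sensors", ("environment", "sensor"), "city-a"),
    ],
    "operator-b": [
        CatalogEntry("b-1", "Bus timetable", ("mobility",), "operator-b"),
        CatalogEntry("b-2", "Road sensor counts", ("mobility", "sensor"), "operator-b"),
    ],
}

federated = federate(catalogs)
sensor_hits = search(federated, "Sensor")
```

Only metadata is copied into the federated view; each entry still points back to the provider that holds the asset.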

A data policy contains a set of rules and principles that provide a framework for various areas of data management, including but not limited to data governance, data quality, and data architecture.

Policy Provider

As a rule, the Policy Provider is the Data Space’s Federator, who defines the Data Space’s Data Policy.

Data Sharing is the actual, technical data connection between parties within a Data Space.

Data Spaces focus on domains (economic areas, industrial sectors or other specialist application fields) with a decentralized (see glossary: decentral) and distributed data infrastructure on which Use Cases can build. In a Data Space, metadata is made available for potential innovative services while maintaining data sovereignty, i.e., the greatest possible control and dominion over one’s own data. Central to this is that different actors in a data ecosystem access and use data by means of Use Cases in order to exploit the full innovation potential of data. Domain-specific Data Spaces can also connect with other Data Spaces, such as the Mobility Data Space with the Tourism Data Space, and a Use Case can be created via both Data Spaces.

Function

The function of Data Spaces is to display metadata. This makes an exchange between metadata providers and users possible and, prospectively, an exchange of data. It is supported by policies for data exchange and the smart contracts built on top of them.

Goal

The goal of Data Spaces is knowledge sharing within domains and the disclosure of an organization’s metadata with the intent of carrying out Use Cases.

Participant

Generally, a member of a Data Space that identifies itself as such by means of a Verifiable Credential.

Federator

The Federator is the initiator of a Data Space and is responsible for assigning the Verifiable Credentials that prove membership. The Federator also holds a Verifiable Credential and is therefore always a Participant at the same time.

Consumer

A Consumer is a Participant of a Data Space and a recipient of data.

Provider

A Provider is a Participant of a Data Space that provides data or services.

Public Data Space

(i) Anyone may join the Data Space. The Data Space is visible to everyone and there are no restrictions.

Restricted Data Space

(ii) Access is restricted. The Data Space is visible to everyone, but the participants provide guidelines for joining.

Private Data Space

(iii) Access is restricted. The Data Space is not publicly visible and participants provide guidelines for joining.
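
The three types differ in only two respects: who can see the Data Space and whether joining is subject to guidelines. A small sketch makes that explicit; the enum `DataSpaceType` and both helper functions are hypothetical names chosen for this illustration.

```python
from enum import Enum


class DataSpaceType(Enum):
    PUBLIC = "public"
    RESTRICTED = "restricted"
    PRIVATE = "private"


def is_visible_to_everyone(kind: DataSpaceType) -> bool:
    """Public and restricted Data Spaces are publicly visible; private ones are not."""
    return kind is not DataSpaceType.PRIVATE


def may_join(kind: DataSpaceType, meets_guidelines: bool) -> bool:
    """Anyone may join a public Data Space; otherwise the joining guidelines apply."""
    if kind is DataSpaceType.PUBLIC:
        return True
    return meets_guidelines
```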

This refers to distributed systems without dependence on a central point.

Decentralized Architecture

In a decentralized architecture, neither a central identity provider nor any individual participant can exclude other participants from a Data Space or make decisions affecting all participants. Thus, there is no sovereign, and all participants of the Data Space have equal rights and are self-sovereign.

Decentralized Identity Provider

In a decentralized system, there is no Identity Provider, only decentralized identifiers that everyone can generate themselves and Verifiable Credentials to prove authorizations for Data Spaces.
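
The interplay of self-generated identifiers and membership credentials can be sketched as follows. This is a deliberately simplified illustration: the field names `type`, `issuer`, and `credentialSubject` follow the W3C Verifiable Credentials data model and `did:example:` is the placeholder method used in the W3C specifications, but the credential type `DataSpaceMembership`, the `memberOf` property, and all function names are assumptions. The cryptographic proof that makes a real credential verifiable is omitted entirely.

```python
import secrets


def new_did() -> str:
    """Generate an identifier locally, without asking any central registry."""
    return f"did:example:{secrets.token_hex(16)}"


def issue_membership(federator_did: str, subject_did: str, data_space: str) -> dict:
    """Build an (unsigned) membership credential issued by the Federator."""
    return {
        "type": ["VerifiableCredential", "DataSpaceMembership"],
        "issuer": federator_did,
        "credentialSubject": {"id": subject_did, "memberOf": data_space},
        # A real credential would carry a cryptographic "proof" here.
    }


def proves_membership(credential: dict, federator_did: str,
                      holder_did: str, data_space: str) -> bool:
    """Structural check only; signature verification is out of scope for this sketch."""
    subject = credential.get("credentialSubject", {})
    return (
        "DataSpaceMembership" in credential.get("type", [])
        and credential.get("issuer") == federator_did
        and subject.get("id") == holder_did
        and subject.get("memberOf") == data_space
    )


federator = new_did()
participant = new_did()
credential = issue_membership(federator, participant, "mobility")
```

Without the signature step this shows only the roles involved: anyone can create an identifier, while membership depends on a credential issued by the Federator.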

Federation

From a technical perspective, a federated infrastructure is seen as a collection of interoperable, API-based IT platforms where users control the flow of data through advanced identity and consent management mechanisms. Since the data platforms are decentralized and thus federated, their protection mechanisms are also connected.

A federation designates a group of actors who provide, produce, process or consume data through direct or indirect cooperation. The aim is to generate more added value from data compared to centralised, proprietary or closed systems.

In a federation, different technology providers make a network of services available to enable an overarching exchange of data from spaces in different domains.

A federation is also formed by the organisational merger of decentralised Data Spaces into an integrated Data Space (for example, DS Health Austria and DS Health Germany become a European Data Space Health).

See also: https://www.gaia-x.eu/what-is-gaia-x/federation-services

Gaia-X is a Franco-German flagship project that aims to pave the way for a European digital ecosystem. Representatives from business, science and politics are jointly developing proposals to create a secure and networked data infrastructure that meets the highest standards of digital sovereignty and is conducive to innovation.

Gaia-X AISBL

The Gaia-X Association (AISBL = Association internationale sans but lucratif) is a non-profit association that aims to consolidate efforts within the community and promote international cooperation by assisting in the development of legal frameworks and the expansion and dissemination of necessary services. European and international partners are invited to join and contribute to its development. Gaia-X is also in continuous exchange with the European Commission.

Data Ecosystem

In biological ecosystems such as forests, a variety of symbiotic relationships are found. Plants, animals, and fungi promote and complement each other for everyone’s benefit. Analogously, complex value creation structures can also emerge in digitized economic spaces, in which the individual players mutually benefit from each other. A data ecosystem is understood to be a decentralized form of coordination between organizations and individuals pursuing a common goal, be it the exchange of data or the provision of products or services in order to drive innovation.

This includes the identity of the participant, the organization, etc.

  • Decentralized Identity: Decentralized identity is based on mathematical principles without the need for a centralized management service.
  • Decentralized Identifier: Decentralized Identifiers (DIDs) are a new type of identifier for verifiable, decentralized digital identity. These new identifiers are designed to enable the controller of a DID to prove control over it and to be implemented independently of any centralized registry, identity provider, or certificate authority.
  • Verifiable Credentials: These are cryptographically signed statements that enable the proof of an assertion (e.g., membership in a Data Space).
  • Verifiable Presentation: This is the standardized format in which Verifiable Credentials are presented and transmitted.
  • Decentralized Identity Foundation: This is an organization for standardizing technologies for decentralized identity: https://identity.foundation

Metadata provides information about data and is a description of it. Metadata is not the content of the data, such as the text of a message or the image itself. There are different types of metadata, including:

  • Descriptive metadata: Information about a resource. It is used for discovery and identification and includes elements such as title, abstract, author, and keywords.
  • Structural metadata: Information about data containers. It indicates how data containers are composed, e.g., how pages are arranged into chapters, and describes the types, versions, relationships, and other characteristics of digital materials.
  • Administrative metadata: Information related to the management of a resource, such as resource type, permissions, and when and how it was created.
  • Reference metadata: Information about the content and quality of statistical data.
  • Statistical metadata (also called process data): Information about processes that collect, process, or produce statistical data.
  • Legal metadata: Information about the creator, copyright holder, and public licensing, if any.

The Nexyo Data Hub is a software technology that enables organizations to manage, govern and share decentralized data. It is the first solution that combines data management, sharing and contracting in just one tool. This makes it possible to create meaningful, trusted, and valuable data connections while maintaining your own sovereignty.

Data sovereignty is the ability of the data owner to exercise exclusive self-determination with regard to his/her own data as an economic good. This is one of the central concepts underlying the data ecosystem. For participants in the data ecosystem, data sovereignty means the ability to view, process, manage, and secure their data, as well as essential control over their own data, even when it is made available to other market participants.

Trusted data exchange implies that Use Cases meet expected requirements and that developments of data applications or data analytics take place in a secure environment.

  • Security-by-Design: securing use case sources through unambiguous, non-repudiable agreements (e.g., smart contracts).
  • Privacy-by-Design: integrating privacy constraints into the design of data platforms and data sharing applications.
  • Assurance-by-Design: integrating security and privacy requirements into the development of data platforms and data sharing applications.

Such a trusted data sharing framework includes five pillars:

  • Identify
  • Protect
  • Detect
  • Respond
  • Recover

Issues to be addressed include access control, usage control, trust and identity management. In any case, these parameters must be defined by the Use Case and the Data Space.

Background to the project

Data Spaces and Use Cases

A use case is intended to enable a concrete exchange of data in a specific application area and to create added value for all participants. The added value is usually created by merging existing data sets, resulting in new analysis results.
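
How merging two data sets yields a result that neither provider could compute alone can be shown with a tiny join. The data sets, field names, and figures below are invented purely for illustration, and `merge_on` is a hypothetical helper, not part of any Data Space software.

```python
def merge_on(key: str, left: list[dict], right: list[dict]) -> list[dict]:
    """Inner join of two lists of records on a shared key."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]


# Invented example data from two different providers.
visitors = [
    {"district": "north", "visitors": 1200},
    {"district": "south", "visitors": 300},
]
departures = [
    {"district": "north", "departures": 40},
    {"district": "east", "departures": 10},
]

merged = merge_on("district", visitors, departures)

# New analysis result that needs both data sets: visitors per bus departure.
visitors_per_departure = {
    row["district"]: row["visitors"] / row["departures"] for row in merged
}
```

Only districts present in both data sets appear in the result, which is why agreeing on shared keys and metadata beforehand matters for a use case.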

The Federal Ministry for Climate Action, Environment, Energy, Mobility, Innovation and Technology (BMK) supports the development of data-driven and sustainable technologies and solutions through funding and initiatives. Together with relevant stakeholders, the ministry identified the need for a technical implementation of use cases in the form of a software solution that enables structured and regulated data exchange. This development is to be initiated via a new type of digital marketplace that brings together data providers and data seekers in an ecosystem while complying with the legal requirements.

More about data spaces and use cases

First step: The IÖB-Challenge

The challenge sought a software solution that lets companies trade data in use cases. Initial concepts for an innovative software solution could be submitted to the IÖB Challenge. After evaluation by a jury of internal and external experts, those companies whose solutions stand out particularly positively in terms of the evaluation criteria will be invited to an innovation dialog. The winners will present their concept at the Austrian Data Day on June 17, 2021.

Following the challenge, a decision will be made on the further design of the project.

Open-source software for sovereign data exchange: The Eclipse Dataspace Connector

The Data Spaces concept defines the interaction of various technological components to promote data exchange across (corporate) boundaries while preserving data sovereignty. One of the most important components is the so-called connector, which connects the individual participants of such a Data Space and forms the end point for the actual exchange of data according to existing standards.

The Eclipse Dataspace Connector (EDC) provides a connector framework for sovereign, cross-organizational data exchange. The framework includes modules for data retrieval, data exchange, policy enforcement, monitoring and auditing.