Gaia-X: The cloud for AI development?


In recent years, the field of Artificial Intelligence (AI) has experienced significant growth. AI applications are increasingly being used, not only in research but also in practice, for tasks such as optimizing route planning or calculating delays on routes. One example is Uber, an online ride-hailing service. In 2019, Uber introduced a deep neural network called DeepETA, which predicts delays to previously specified pickup points. Training such models requires large amounts of data. For many small and medium-sized enterprises, this poses a challenge, often preventing them from implementing data-driven algorithms. This is where Gaia-X comes in, offering a way to make the use and application of AI easier for small businesses.

Gaia-X

The Gaia-X project was initiated within the European Union at the end of 2019 and aims to establish the next generation of data infrastructure in Europe. To this end, a so-called European cloud is being built. The cloud acts as a decentralized, standardized data space that any company can join. Through standardized interfaces, companies can offer the data they have collected or purchase data from others under predefined guidelines. The data provider sets these guidelines, and the user (buyer) is responsible for complying with them. An example guideline could require that the user pays 10 euros per month or is affiliated with a public organization such as a university. A guideline can also cover further aspects, such as a limit on the number of data requests. Gaia-X is generally divided into two parts:
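To make the idea of provider-defined guidelines more concrete, the following is a minimal sketch of how such a guideline could be checked. It is not the Gaia-X policy language: the classes `Consumer` and `UsagePolicy` and all of their fields are hypothetical names chosen for illustration, and the values mirror the example above (10 euros per month, or affiliation with a public organization, plus an optional request limit).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Consumer:
    """Hypothetical description of a data buyer; not a Gaia-X data model."""
    name: str
    monthly_payment_eur: float = 0.0
    is_public_organization: bool = False
    requests_this_month: int = 0


@dataclass(frozen=True)
class UsagePolicy:
    """Illustrative guideline set by the data provider."""
    monthly_fee_eur: float = 10.0
    public_organizations_exempt: bool = True
    max_requests_per_month: Optional[int] = None  # None means unlimited

    def permits(self, consumer: Consumer) -> bool:
        # Condition 1: the consumer pays the fee or is an exempt public body.
        paid = consumer.monthly_payment_eur >= self.monthly_fee_eur
        exempt = self.public_organizations_exempt and consumer.is_public_organization
        if not (paid or exempt):
            return False
        # Condition 2 (optional extension): stay under the request quota.
        if self.max_requests_per_month is not None:
            return consumer.requests_this_month < self.max_requests_per_month
        return True


# Usage with made-up participants
policy = UsagePolicy(max_requests_per_month=1000)
university = Consumer("TU Example", is_public_organization=True, requests_this_month=12)
startup = Consumer("Example GmbH")

print(policy.permits(university))  # True: exempt and below the quota
print(policy.permits(startup))     # False: neither paying nor exempt
```

In this sketch the provider only declares the conditions; the check itself could run wherever the guideline is enforced.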

© BMWI / GAIA-X: Technical Architecture (data-infrastructure.eu)
Figure 1: Gaia-X architecture overview

On one side, Gaia-X represents the Infrastructure Ecosystem, which provides the basic services needed to realize data connectors and other services. Data connectors are continuously running services that handle the exchange of data between data pools and services within Gaia-X. On the other side, Gaia-X is also a Data Ecosystem, which enables the data exchange described above (via the data connectors) and the networking of various heterogeneous services. In addition, the essential services that Gaia-X itself depends on are operated centrally. These include a service that handles Identity & Trust and is responsible for verifying negotiated guidelines (certificates). There will also be a central service that provides an overview of all services available within Gaia-X, much like an app store; it is referred to as the Federated Catalogue in the diagram. The centrally provided essential services are highlighted in light green in the diagram. More detailed information about the architecture of Gaia-X and the implementation of the various services can be found in publications by the Gaia-X Association.

Gaia-X as an application platform

Gaia-X is designed to implement two core functions:

© Timon Sachweh / TU Dortmund
Figure 2: Exemplary data exchange between a data provider and consumer.

Standardized data exchange

The primary functionality of Gaia-X is to facilitate a standardized and straightforward data exchange. For this purpose, so-called connectors provide access to the decentralized cloud within Gaia-X. These connectors act as intermediaries between the Federation Services (the central Gaia-X services) and other connectors. Considerable preliminary work on these connectors already exists. For example, a data connector that allows standardized data exchange has been developed within the Industrial Data Space (now International Data Spaces). However, the Industrial Data Space does not include all the functionalities required by the Gaia-X Association. Therefore, the Eclipse Dataspace Connector is currently being developed to cover most of the criteria a connector in the Gaia-X cloud has to meet. Further preliminary work comes from the Ocean Protocol, which can also serve as a connector and builds on Web3 technologies; it is likewise referenced in Gaia-X publications.

The general operation of a data exchange is illustrated in Figure 2. The data consumer is shown on the left, and the data provider on the right. Every participant in the Gaia-X cloud needs a connector to handle communication within the cloud, so both parties have their own connector. To enable the consumer to access the data, the provider must first make the data discoverable in the Federated Catalogue. To do so, the provider sends a “description” of the data to its own connector, which then passes this description on to the Federated Catalogue. Next, the consuming service, labeled Prediction Service in the diagram, looks up the desired data service through its own connector. For a simple data exchange, the consumer then has to meet the conditions of the requested data service and submit a data request (step 4). The data provider checks through its own connector whether the attached authentication is valid (step 5) and, if it is, releases the data.
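The exchange described above can be sketched in a few lines of Python. This is a simplified in-memory model under stated assumptions, not the API of any real connector: `FederatedCatalogue`, `Connector`, their methods, and the plain string token are hypothetical stand-ins. In particular, the token check replaces the credential verification that the Identity & Trust service would perform in Gaia-X.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class FederatedCatalogue:
    """Stand-in for the central catalogue: stores descriptions, never the data."""
    entries: dict[str, dict] = field(default_factory=dict)

    def register(self, offer_id: str, description: dict) -> None:
        self.entries[offer_id] = description

    def lookup(self, offer_id: str) -> dict:
        return self.entries[offer_id]


@dataclass
class Connector:
    """Hypothetical connector; every participant runs its own instance."""
    owner: str
    catalogue: FederatedCatalogue
    datasets: dict[str, list] = field(default_factory=dict)
    accepted_tokens: set[str] = field(default_factory=set)

    def publish(self, offer_id: str, data: list, description: dict) -> None:
        # Provider side: keep the data local, pass only the description on.
        self.datasets[offer_id] = data
        self.catalogue.register(offer_id, {**description, "provider": self.owner})

    def discover(self, offer_id: str) -> dict:
        # Consumer side: find the offer and its conditions in the catalogue.
        return self.catalogue.lookup(offer_id)

    def request(self, provider: Connector, offer_id: str, token: str) -> list:
        # Consumer side: submit the data request with proof of entitlement.
        return provider.release(offer_id, token)

    def release(self, offer_id: str, token: str) -> list:
        # Provider side: check the attached authentication before releasing.
        if token not in self.accepted_tokens:
            raise PermissionError(f"{self.owner}: invalid credential for {offer_id}")
        return self.datasets[offer_id]


# Usage with made-up identifiers and values
catalogue = FederatedCatalogue()
provider = Connector("Data Provider", catalogue, accepted_tokens={"token-123"})
consumer = Connector("Prediction Service", catalogue)

provider.publish("traffic-2022", [12, 7, 31], {"policy": "10 EUR per month"})
offer = consumer.discover("traffic-2022")
data = consumer.request(provider, "traffic-2022", token="token-123")
print(offer["provider"], data)
```

Note that the catalogue only ever holds the description. The data itself stays with the provider until a request with a valid credential arrives.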

Value-added services

The second focus of the Gaia-X cloud is on value-added services. A value-added service is any service that provides additional benefits to end users or businesses. This could be, for example, a user-friendly portal displaying the current weather or a forecast for taxi or train delays.

This is where the simple data exchange of Gaia-X pays off: it makes developing such value-added services considerably easier. Where companies previously had to gather and unify large amounts of data from various sources, the Gaia-X cloud makes it easier to integrate multiple data sources. This is especially beneficial for AI algorithms, which often require large amounts of data for training. One possible scenario is renting training data only for the duration of the training, which reduces the cost of data.
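The rental scenario can be pictured as a time-limited access grant. The sketch below is only an illustration: the `DataLease` class, its fields, and the hourly rate are assumptions made for this example and do not correspond to an existing Gaia-X pricing model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class DataLease:
    """Hypothetical time-limited access grant for a training dataset."""
    offer_id: str
    start: datetime
    duration: timedelta
    price_per_hour_eur: float  # illustrative rate, not a real tariff

    @property
    def end(self) -> datetime:
        return self.start + self.duration

    def is_active(self, now: datetime) -> bool:
        # Access is valid only inside the rental window.
        return self.start <= now < self.end

    def cost_eur(self) -> float:
        hours = self.duration.total_seconds() / 3600
        return round(hours * self.price_per_hour_eur, 2)


# Usage: rent the data for a six-hour training run
lease = DataLease(
    offer_id="traffic-2022",
    start=datetime(2022, 3, 1, 8, 0),
    duration=timedelta(hours=6),
    price_per_hour_eur=0.5,
)
print(lease.is_active(datetime(2022, 3, 1, 10, 0)))  # True: during training
print(lease.is_active(datetime(2022, 3, 1, 14, 0)))  # False: lease has ended
print(lease.cost_eur())                              # 3.0
```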

Additionally, the Gaia-X cloud has the major advantage of being designed with data protection and comprehensive identity management in mind. It addresses current questions such as how to handle data protection, how to implement the right to be forgotten and the right of access to data, and how to secure data during transmission.

The developed services can be integrated back into the Gaia-X cloud as service interfaces or data providers. It might make sense to cover the interface with a rental policy so that the service interface is automatically billed based on usage.
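Usage-based billing of a service interface could look roughly like the following sketch. The `MeteredService` wrapper, the per-call price, and the placeholder function `predict_delay_minutes` are hypothetical and only illustrate the principle of metering calls per consumer; in Gaia-X the billing terms would be part of the policy attached to the service.

```python
from collections import Counter
from typing import Callable


class MeteredService:
    """Wraps a service function and counts calls per consumer for billing."""

    def __init__(self, handler: Callable[[str], float], price_per_call_eur: float):
        self._handler = handler
        self._price = price_per_call_eur
        self._calls: Counter = Counter()

    def call(self, consumer: str, query: str) -> float:
        result = self._handler(query)
        # Count only calls that completed without raising.
        self._calls[consumer] += 1
        return result

    def invoice(self, consumer: str) -> float:
        return round(self._calls[consumer] * self._price, 2)


def predict_delay_minutes(route: str) -> float:
    """Placeholder for a trained model; always returns the same value."""
    return 4.5


# Usage with a made-up consumer and price
service = MeteredService(predict_delay_minutes, price_per_call_eur=0.02)
for _ in range(3):
    service.call("Example GmbH", "Dortmund-Essen")

print(service.invoice("Example GmbH"))  # 0.06
print(service.invoice("Other AG"))      # 0.0
```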

Another advantage of Gaia-X for AI algorithms is the decentralized, heterogeneous execution of services. This means there is no need to ensure that the algorithm runs efficiently on one specific cloud such as Amazon Web Services (AWS). Instead, you can run the algorithm on your own hardware while still embedding it within Gaia-X through the connectors. If desired, it can also run on popular cloud solutions like AWS or Google Cloud. Essentially, users can choose the execution environment for their services without being restricted by other companies.

Because of these features, Gaia-X is well suited for developing value-added services. In particular, its simplicity and the standardization of data and service exchange policies provide an easy way to offer services while maintaining security.

Gaia-X for AI?

In summary, Gaia-X is creating a European cloud with the goal of ensuring a sovereign approach to data and maintaining privacy. With this core feature and the easy-to-implement data exchange, the cloud enables entirely new value creation concepts. In particular, the field of AI development can greatly benefit, as the General Data Protection Regulation (GDPR) often poses challenges for the productive use of AI algorithms.

The concept has various advantages and disadvantages, which are summarized in the following table:

| Advantages | Disadvantages |
| --- | --- |
| Decentralization of services: largely no “single point of failure”. | Each partner is fully responsible for its services: if a service is not running, the company must take care of restarting it. |
| Dynamic expansion of functionalities by adding value-added services (each AI algorithm can receive separate rights and access different data sources). | The initial effort to provide the data via connectors is higher than if it is provided in the existing format. |
| Standardization of interfaces (data access in particular becomes easier to handle by standardizing the interface for data access). | An understanding of how the connectors must be configured in order to communicate with the federation services must be established. |
| Simple assignment of rights through the Identity & Trust Federation Service. | |
| Heterogeneous execution environments: any deployment option is conceivable. Particularly helpful for AI algorithms, whose hardware requirements differ widely. | |
| Integration of data protection (right to be forgotten/data disclosure), identity management, and transmission security is already included in the architecture of the Gaia-X cloud. | |

In conclusion, the Gaia-X cloud, which is currently under development, is excellently suited for data-driven applications, especially AI algorithms. Although there are some drawbacks, such as the additional effort to provide data through standardized interfaces, the benefits for AI applications clearly outweigh the disadvantages.

Further information

Gaia-X – Wegbereiter einer digitalen und wettbewerbsfähigen Zukunft der EU? Jana Bernhardt, Marina Steininger, et al. ifo Schnelldienst, 74(05):66–71, 2021.

DeepETA: A Spatial-Temporal Sequential Neural Network Model for Estimating Time of Arrival in Package Delivery System. Fan Wu and Lixia Wu. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 33, pages 774–781, 2019.

Timon Sachweh

Timon Sachweh is a research assistant at the Chair of Artificial Intelligence at TU Dortmund University. Since the beginning of 2022, he has been working on the development of AI algorithms in the context of the Gaia-X cloud as part of the GAIAX4ROMS research project. His focus is on the efficient prediction of delays and on route planning in logistics scenarios.
