How can Machine Learning help in the fight against COVID-19?


The current pandemic poses new challenges for health authorities in Germany. Within a short period, personnel and technical resources had to be mobilized to respond to dynamic developments. One main task is tracking infection chains and subsequently ordering isolation or quarantine. To tackle this task, health authorities collect various information from individuals who test positive and their contacts, including demographic and epidemiological data.

© Fraunhofer IAIS, Dario Antweiler
The visualization and analysis of infection dynamics support contact tracing. The image depicts a fictional example of infected individuals (red squares or dots) and the known transmissions between them (arrows).

In the summer of 2020, a collaboration between six German health authorities and Fraunhofer Institutes was initiated as part of the “Fraunhofer vs. Corona” initiative. The project “CorASiV” (Supporting Health Authorities in the Corona Response through Analysis, Simulation, and Visualization) aims to prepare and analyze the available infection data and thereby support health experts in contact tracing. It employs techniques from Machine Learning, data visualization, and mathematical simulation. Data scientists at Fraunhofer IAIS collaborate closely with the Cologne Public Health Department, which is responsible for over a million people, focusing on the analysis of structured and spatiotemporal data to contain the pandemic.

Recognizing patterns in infection spread

Together with the health department, the team defined the analysis questions most crucial to the current situation. These include identifying “infection clusters,” that is, groups of individuals likely infected from a common source. Another task is studying the geographical spread of the novel virus in the city: at what speed, and in which districts, is it spreading?
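As a minimal sketch of the cluster idea, assuming each case record stores the identifier of its suspected source, cases can be grouped by shared source. The field layout `(case_id, source_id)`, the function name `find_clusters`, and the sample records are hypothetical and do not reflect the project's actual data model or method.

```python
from collections import defaultdict

# Hypothetical case records: (case_id, source_id); source_id is None when unknown.
cases = [
    ("A", None),
    ("B", "A"),
    ("C", "A"),
    ("D", "B"),
    ("E", None),
]

def find_clusters(records, min_size=2):
    """Group cases by their suspected common source.

    Returns a dict mapping source_id -> sorted list of case_ids,
    keeping only sources with at least `min_size` infected cases.
    """
    by_source = defaultdict(list)
    for case_id, source_id in records:
        if source_id is not None:
            by_source[source_id].append(case_id)
    return {src: sorted(ids) for src, ids in by_source.items() if len(ids) >= min_size}

clusters = find_clusters(cases)
print(clusters)  # {'A': ['B', 'C']}
```

In practice the source is often unknown or only suspected, so a grouping like this would be a starting point for expert review rather than a final result.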

Current analyses by Fraunhofer scientists in collaboration with the health department reveal age-specific, geographical, and, to some extent, socio-economic correlations in the spread of COVID-19 in Cologne. However, they also indicate that these correlations alone do not allow conclusions about the causes of the outbreaks or infection trajectories. The analyses support the Cologne Public Health Department in expanding concrete measures, such as the communication of hygiene rules or the strategic placement of sites for rapid tests and self-tests.

The analysis divided the infection course into three phases and examined them individually and comparatively: Phase 1 from March 2020 to June 2020, Phase 2 from July 2020 to November 2020, and Phase 3 from December 2020 to January 2021. A clear geographical difference in the spread is evident. While the infection primarily spread in left-bank districts in Phase 1, it predominantly occurred in right-bank areas in Phases 2 and 3.

Regarding infections between age groups, the analysis focuses on cases where the known source of infection is in Cologne, which occurs in approximately 30 percent of cases. Three transmission patterns are identified:

  • Most infections occur within the same age group.
  • Outside their own age group, individuals are more likely to be infected by older persons. Only 14% are infected by younger individuals, a scenario that includes transmission from young children to their parents as well as from adults to the older generation.
  • Moreover, 72% of index cases infected by a younger person do not transmit the virus further. In these cases, the infection chain is successfully interrupted.
© Fraunhofer IAIS
Visualization of infection transmission between age groups.
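The age-group comparison above can be sketched as a simple classification of each known transmission by the relative age of source and infected person. This is only an illustration: the age bins, the sample transmissions, and the names `AGE_GROUPS`, `RANK`, and `direction` are assumptions, not the project's actual binning or data.

```python
from collections import Counter

# Ordered age groups, youngest first (hypothetical binning).
AGE_GROUPS = ["0-19", "20-39", "40-59", "60+"]
RANK = {group: i for i, group in enumerate(AGE_GROUPS)}

# Hypothetical transmissions: (age group of source, age group of infected person).
transmissions = [
    ("20-39", "20-39"),
    ("20-39", "20-39"),
    ("40-59", "20-39"),
    ("60+", "40-59"),
    ("0-19", "40-59"),
]

def direction(source_group, target_group):
    """Classify a transmission relative to the infected person's age group."""
    if RANK[source_group] == RANK[target_group]:
        return "same_group"
    return "from_older" if RANK[source_group] > RANK[target_group] else "from_younger"

counts = Counter(direction(s, t) for s, t in transmissions)
shares = {
    kind: counts[kind] / len(transmissions)
    for kind in ("same_group", "from_older", "from_younger")
}
print(shares)
```

Only transmissions with a known source can be classified this way, which matches the restriction to roughly 30 percent of cases described above.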

In addition to contact tracing data, data on socio-economic factors are available at the neighborhood level, such as the unemployment rate, rent index, and migration rate. These data were included to further examine the geographical course of the 7-day incidence rate. Note that no socio-economic data exist for individual index persons, since none of these factors are collected during contact tracing. The analyses yielded the following observations:

  • Neighborhoods with low unemployment were more affected in the early phase.
  • Neighborhoods with high unemployment were more affected in later phases. Migration rate and rent index behave similarly.
  • For index persons in neighborhoods with high unemployment, the source of infection is more often known.
© Fraunhofer IAIS
Geographical differences in the spread across the three temporal phases.
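The 7-day incidence rate referred to above is commonly defined as the number of new cases over seven days per 100,000 inhabitants. A minimal sketch of that common definition follows. The function name `seven_day_incidence`, the daily counts, and the population figure are hypothetical, and the project's exact computation is not described here.

```python
from datetime import date, timedelta

def seven_day_incidence(daily_cases, population, end_day):
    """New cases over the 7 days ending on `end_day` (inclusive), per 100,000 inhabitants."""
    window = [end_day - timedelta(days=offset) for offset in range(7)]
    total = sum(daily_cases.get(day, 0) for day in window)
    return total / population * 100_000

# Hypothetical daily case counts for one neighborhood, starting December 1, 2020.
daily_cases = {
    date(2020, 12, 1) + timedelta(days=i): n
    for i, n in enumerate([3, 5, 2, 4, 6, 1, 4, 7])
}

incidence = seven_day_incidence(daily_cases, population=20_000, end_day=date(2020, 12, 8))
print(round(incidence, 1))  # 145.0
```

Computing this value per neighborhood and per phase gives a series that can then be compared against neighborhood-level factors such as the unemployment rate.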

Analyzing and visualizing infection cases

Visualizing contact tracing data plays a crucial role in understanding the relationships between different infection cases. Within the project, various approaches were tested that focus on the temporal development, geographical distribution, or characteristics of infected individuals and relate these data to one another. Information can be presented flexibly and interactively, depending on the questions health experts ask. For example, the visualizations of infection data described above are intended to help understand how the virus spreads through longer infection chains.

The results of the contact tracing data investigations in Cologne were presented at a joint press conference with the city of Cologne by ML2R (now the Lamarr Institute) Transfer Manager and IAIS Department Head Stefan Rüping (more on welt.de from minute 07:40). The current results are available on the project’s website (for inquiries, please contact Sebastian Ginzel via the project page).

One of the Machine Learning modeling methods was published as a workshop paper at NeurIPS 2020, a leading conference on Artificial Intelligence, by ML2R (now the Lamarr Institute) researcher Pascal Welke and Dario Antweiler from Fraunhofer IAIS. The central question was how the network formed by individuals who tested positive and the suspected transmissions between them can be analyzed effectively. The publication introduces an approach that enables the interactive determination of similar components. Such workshops allow experts to present new approaches to mathematical or technical challenges. They mark the beginning of a scientific process in which new methods are developed for research fields such as Machine Learning for public health.
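To make the notion of "components" in such a network concrete, the following sketch extracts the connected groups of cases from a list of suspected transmissions. It is not the method from the publication, which is not reproduced here. It only illustrates a plausible first step, and the edge list and the function name `connected_components` are hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical suspected transmissions: (source_case, infected_case).
edges = [("A", "B"), ("B", "C"), ("D", "E"), ("F", "G"), ("G", "H"), ("G", "I")]

def connected_components(edge_list):
    """Return the weakly connected components of a directed graph as sorted node lists."""
    neighbors = defaultdict(set)
    for source, target in edge_list:
        neighbors[source].add(target)
        neighbors[target].add(source)  # ignore direction when grouping cases

    seen, components = set(), []
    for start in sorted(neighbors):
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), []
        # Breadth-first search collects every case reachable from `start`.
        while queue:
            node = queue.popleft()
            component.append(node)
            for nxt in neighbors[node] - seen:
                seen.add(nxt)
                queue.append(nxt)
        components.append(sorted(component))
    return components

components = connected_components(edges)
print(components)  # [['A', 'B', 'C'], ['D', 'E'], ['F', 'G', 'H', 'I']]
```

Each component corresponds to one set of linked cases. Comparing such components by shape or timing would be a separate step that this sketch does not attempt.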

Artificial Intelligence in healthcare

Now that Artificial Intelligence methods have established themselves in various medical areas, it is becoming increasingly clear how the closely related fields of public health and pharmacology can benefit from these technologies. Cooperation projects with close exchange between domain experts and data scientists are particularly well suited to this. While new approaches based on Artificial Intelligence offer great potential, defining the right question and interpreting results meaningfully are the basis for successful applications of Machine Learning in practice. More on this in our blog post “Application scenarios for Artificial Intelligence in hospitals.”

Fraunhofer IAIS utilizes methods of Artificial Intelligence in various projects and initiatives in the context of pandemic control. Another current project is “COPERIMOplus.” This project focuses on creating personalized risk profiles through the use of Machine Learning methods for predictive analysis of clinical data and biomarkers.

Project partners: In addition to Fraunhofer IAIS, the “CorASiV” project involves the institutes IGD, IOSB, IME, ITWM, and MEVIS. A joint scientific publication with the epidemiological experts of the health department is planned. We would like to thank the Cologne Public Health Department for the successful collaboration.

More information in the associated publication:

Temporal Graph Analysis for Outbreak Pattern Detection in COVID-19 Contact Tracing Networks. D. Antweiler, P. Welke. Machine Learning for Public Health Workshop at NeurIPS, 2020.

Dario Antweiler, Sebastian Ginzel

April 28, 2021

Dario Antweiler

Dario Antweiler is a data scientist at the Fraunhofer Institute IAIS in Sankt Augustin in the Healthcare Analytics unit. He is responsible for projects in the areas of digitalization in hospitals and Artificial Intelligence in pharmacology. His field of research is Machine Learning on graphs and networks.

Dr. Sebastian Ginzel

Dr. Sebastian Ginzel is a Senior Data Scientist in the Healthcare Analytics department at the Fraunhofer Institute IAIS. He leads projects on topics such as Digital Twins in Medicine, identifying biomarkers and subgroups in clinical studies using Machine Learning, and improving patient care through Artificial Intelligence. In his research, he focuses on how subjective evaluations by experts can be objectified with the help of AI.
