Dr. Mehdi Ali
Lead Scientist, Foundation Models
NLP
Mehdi Ali is the Innovation Group Leader for Foundation Model Research at the Lamarr Institute. His group plays a key role in national and international projects dedicated to training Large Language Models (LLMs), including OpenGPT-X, TrustLLM, and EuroLingua-GPT.
He received his PhD in Computer Science from the University of Bonn, with a research focus on knowledge graph representation learning. His work has been published in leading machine learning venues such as JMLR, TPAMI, and ISWC. His paper “Improving Inductive Link Prediction Using Hyper-Relational Facts” won the Best Paper Award at ISWC 2021.
Mehdi is also the founder of PyKEEN, an open-source Python library for training and evaluating knowledge graph embeddings, which has since grown into a community-driven project within the knowledge graph representation learning community.
After completing his PhD, Mehdi focused on multilingual large language models, contributing to key research areas including high-quality multilingual data filtering, tokenization, pretraining, instruction tuning, and evaluation. His work in these areas has been published in top-tier venues such as EMNLP, NAACL, and ECAI. Mehdi is one of the core researchers behind Teuken-7B, a multilingual seven-billion-parameter language model trained from scratch on all 24 official European languages. Teuken-7B has been downloaded over 100,000 times on Hugging Face.