Towards Automated Recipe Reconstruction: Optimization of Dietary Data Collection using Information Retrieval, Large Language Models and Mathematical Optimization

Accurate and scalable collection of dietary data is vital for advancing nutritional epidemiology and understanding links between diet, public health, and environmental sustainability. A key challenge is collecting detailed product-level nutrition data, which currently relies largely on manual recipe reconstruction. We propose computational approaches to optimize this workflow. First, an information retrieval (IR)-based recommender system integrates food-category prediction with retrieval over product text, ingredients, and nutrient profiles to streamline food item matching and reduce redundancy across the database. Second, we outline a roadmap for automated recipe reconstruction that combines large language models (LLMs) for ingredient parsing with nutrient-constrained mathematical optimization. By integrating machine learning, generative modeling, and optimization, our work enhances the efficiency, transparency, and scalability of nutrition data collection, laying a foundation for sustainable practices in nutritional epidemiology and research on the interactions between diet, health, and the environment.
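The nutrient-constrained optimization step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' method: it assumes the ingredient list is already parsed, that each ingredient has a known nutrient profile per 100 g, and that reconstruction means finding non-negative proportions summing to 1 whose mixed profile best matches the product label in a least-squares sense. The function names, the projected-gradient solver, and all numbers are hypothetical and chosen only for illustration.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    theta, running = 0.0, 0.0
    for j, u in enumerate(sorted(v, reverse=True), start=1):
        running += u
        candidate = (running - 1.0) / j
        if u - candidate > 0:
            theta = candidate
    return [max(x - theta, 0.0) for x in v]


def estimate_proportions(profiles, target, iterations=5000):
    """Projected gradient descent on the squared nutrient mismatch.

    profiles[i][k]: nutrient k per 100 g of ingredient i (assumed known).
    target[k]: nutrient k per 100 g of the finished product (label value).
    """
    n, m = len(profiles), len(target)
    # Conservative step size from the squared Frobenius norm of the profile matrix.
    step = 1.0 / (2.0 * sum(x * x for row in profiles for x in row))
    w = [1.0 / n] * n  # start from equal proportions
    for _ in range(iterations):
        residual = [
            sum(w[i] * profiles[i][k] for i in range(n)) - target[k]
            for k in range(m)
        ]
        grad = [
            2.0 * sum(profiles[i][k] * residual[k] for k in range(m))
            for i in range(n)
        ]
        w = project_to_simplex([w[i] - step * grad[i] for i in range(n)])
    return w


# Made-up illustrative profiles: (protein, fat, carbohydrate) in g per 100 g.
profiles = [
    [13.0, 7.0, 60.0],   # hypothetical ingredient A
    [0.0, 0.0, 100.0],   # hypothetical ingredient B
    [1.0, 82.0, 0.0],    # hypothetical ingredient C
]
true_mix = [0.5, 0.3, 0.2]
# Synthetic "label" built from the known mix, so recovery can be checked.
target = [sum(true_mix[i] * profiles[i][k] for i in range(3)) for k in range(3)]
estimated = estimate_proportions(profiles, target)
```

In this synthetic setup the label is generated from a known mix, so the estimate can be compared against it directly. Real labels are rounded and ingredient profiles are uncertain, so a practical formulation would likely need additional constraints and tolerances.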

  • Published in:
    2025 IEEE International Conference on Big Data (BigData)
  • Type:
    Inproceedings
  • Authors:
    Schmidt, Svetlana; Klasen, Linda; Nöthlings, Ute; Sifa, Rafet
  • Year:
    2025
  • Source:
    https://ieeexplore.ieee.org/document/11401661

Citation information

Schmidt, Svetlana; Klasen, Linda; Nöthlings, Ute; Sifa, Rafet: Towards Automated Recipe Reconstruction: Optimization of Dietary Data Collection using Information Retrieval, Large Language Models and Mathematical Optimization, 2025 IEEE International Conference on Big Data (BigData), 2025, pp. 6821-6830, https://ieeexplore.ieee.org/document/11401661

Associated Lamarr Researchers

Prof. Dr. Rafet Sifa

Principal Investigator, Hybrid ML