Towards Automated Recipe Reconstruction: Optimization of Dietary Data Collection using Information Retrieval, Large Language Models and Mathematical Optimization
Accurate and scalable collection of dietary data is vital for advancing nutritional epidemiology and understanding the links between diet, public health, and environmental sustainability. A key challenge is collecting detailed product-level nutrition data, which currently relies largely on manual recipe reconstruction. We propose computational approaches to optimize this workflow. First, an information retrieval (IR)-based recommender system integrates food-category prediction with retrieval over product text, ingredients, and nutrient profiles to streamline food item matching and reduce redundancy across the database. Second, we outline a roadmap for automated recipe reconstruction that combines large language models (LLMs) for ingredient parsing with nutrient-constrained mathematical optimization. By integrating machine learning, generative modeling, and optimization, our work enhances the efficiency, transparency, and scalability of nutrition data collection, laying a foundation for sustainable practices in nutritional epidemiology and for research on the interactions between diet, health, and the environment.
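To make the idea of nutrient-constrained recipe reconstruction concrete, here is a minimal sketch. It is not the authors' method: the abstract does not specify the formulation, so the brute-force grid search, the `reconstruct` and `mix` helpers, the ingredient names, and all nutrient values are illustrative assumptions. The sketch assumes the ingredients are listed in descending order of weight, as on a product label, and searches for percentage shares whose blended nutrient profile best matches the label values.

```python
from itertools import product

# Hypothetical nutrient values per 100 g: (energy kcal, protein g, fat g).
# These numbers are illustrative assumptions, not data from the paper.
INGREDIENTS = {
    "wheat flour": (340.0, 10.0, 1.0),
    "sugar": (400.0, 0.0, 0.0),
    "butter": (720.0, 1.0, 80.0),
}


def mix(shares, profiles):
    """Nutrient profile of a blend, given shares that sum to 1."""
    n_nutrients = len(profiles[0])
    return tuple(
        sum(s * p[k] for s, p in zip(shares, profiles))
        for k in range(n_nutrients)
    )


def reconstruct(label, names, step=1):
    """Grid-search integer percentage shares that best match the label.

    Constraints: shares are non-negative, sum to 100 %, and are
    non-increasing in the given order (assumed label order).
    Objective: unweighted squared error over the nutrients, so
    large-valued nutrients such as energy dominate.
    """
    profiles = [INGREDIENTS[name] for name in names]
    best, best_err = None, float("inf")
    for combo in product(range(0, 101, step), repeat=len(names) - 1):
        last = 100 - sum(combo)
        if last < 0:
            continue  # shares would exceed 100 %
        percents = combo + (last,)
        if any(a < b for a, b in zip(percents, percents[1:])):
            continue  # violates descending label order
        shares = [p / 100 for p in percents]
        blended = mix(shares, profiles)
        err = sum((m - t) ** 2 for m, t in zip(blended, label))
        if err < best_err:
            best, best_err = percents, err
    return dict(zip(names, best)), best_err


# Usage: build a synthetic label from known shares, then recover them.
names = ["wheat flour", "sugar", "butter"]
label = mix([0.5, 0.3, 0.2], [INGREDIENTS[n] for n in names])
shares, err = reconstruct(label, names)
```

A grid search is used only because it is easy to verify by hand. A practical system would more likely pose the same constraints as a constrained least-squares or linear program and hand it to a solver, with per-nutrient weights to balance the different scales.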
- Published in: 2025 IEEE International Conference on Big Data (BigData)
- Type: Inproceedings
- Authors: Svetlana Schmidt, Linda Klasen, Ute Nöthlings, Rafet Sifa
- Year: 2025
- Source: https://ieeexplore.ieee.org/document/11401661
Citation information
Svetlana Schmidt, Linda Klasen, Ute Nöthlings, Rafet Sifa: Towards Automated Recipe Reconstruction: Optimization of Dietary Data Collection using Information Retrieval, Large Language Models and Mathematical Optimization, 2025 IEEE International Conference on Big Data (BigData), 2025, pp. 6821--6830, https://ieeexplore.ieee.org/document/11401661
@Inproceedings{Schmidt.etal.2025b,
author={Schmidt, Svetlana and Klasen, Linda and Nöthlings, Ute and Sifa, Rafet},
title={Towards Automated Recipe Reconstruction: Optimization of Dietary Data Collection using Information Retrieval, Large Language Models and Mathematical Optimization},
booktitle={2025 IEEE International Conference on Big Data (BigData)},
pages={6821--6830},
url={https://ieeexplore.ieee.org/document/11401661},
year={2025},
abstract={Accurate and scalable collection of dietary data is vital for advancing nutritional epidemiology and understanding links between diet, public health, and environmental sustainability. A key challenge is the collection of the detailed nutrition data on the product level which currently largely relies on manual recipe reconstruction. We propose computational approaches to optimize this workflow....}}