Optimizing Rare Disease Patient Matching with Large Language Models
We present RepLLaMA, a neural ranking model for optimizing patient matching in rare disease communities. Using data from Unrare.me consisting of over two thousand profiles and over ten thousand ratings, our bi-encoder architecture maps profiles to 4096-dimensional vectors, enabling efficient similarity computations. The system processes unstructured symptom descriptions and structured responses, incorporating expert-guided LLM enhancements. Results show Top-10 Recall of 49.36% (±2.03), surpassing baselines while maintaining generalization. The implementation provides a scalable solution for rare disease patient matching, addressing computational complexity challenges.
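The retrieval step the abstract describes (embed each profile once, then rank candidates by vector similarity and score the ranking with Top-k Recall) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names `cosine`, `top_k`, and `recall_at_k` are hypothetical, the toy vectors have 4 dimensions rather than 4096, cosine similarity stands in for whatever similarity function the paper uses, and recall is computed as the fraction of a patient's relevant matches that appear in the top k, which may differ from the paper's exact definition.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def top_k(query_id, embeddings, k=10):
    """Rank every other profile by similarity to the query profile; return the top k ids."""
    query_vec = embeddings[query_id]
    scored = [
        (cosine(query_vec, vec), pid)
        for pid, vec in embeddings.items()
        if pid != query_id  # a patient is never matched with themselves
    ]
    # Highest similarity first; ties broken by id so the order is deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [pid for _, pid in scored[:k]]


def recall_at_k(ranked_ids, relevant_ids, k=10):
    """Fraction of the relevant profiles that appear among the first k ranked ids."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = len(set(ranked_ids[:k]) & relevant)
    return hits / len(relevant)


# Toy, hand-written embeddings (hypothetical data, not from Unrare.me).
embeddings = {
    "a": [1.0, 0.0, 0.0, 0.0],
    "b": [0.9, 0.1, 0.0, 0.0],
    "c": [0.0, 1.0, 0.0, 0.0],
    "d": [0.8, 0.0, 0.2, 0.0],
}

ranked = top_k("a", embeddings, k=2)
score = recall_at_k(ranked, {"b", "c"}, k=2)
print(ranked, score)
```

Because a bi-encoder embeds each profile independently, the vectors can be computed once and reused for every query, so only the similarity comparisons have to run at query time.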
- Published in: 2024 IEEE International Conference on Big Data (BigData)
- Type: Inproceedings
- Authors: Berger, Armin; Bashir, Ali Hamza; Berghaus, David; Mowmita; Afsan, Nazia; Grigull, Lorenz; Fendrich, Lara; Hogl, Henriette; Ernst, Gundula; Schmidt, Ralf; Bascom, David; Lagones, Tom Anglim; Deuber, Tobias; Bell, Thiago; Lubbering, Max; Sifa, Rafet
- Year: 2024
- Source: https://doi.ieeecomputersociety.org/10.1109/BigData62323.2024.10910113
Citation information
Berger, Armin; Bashir, Ali Hamza; Berghaus, David; Mowmita; Afsan, Nazia; Grigull, Lorenz; Fendrich, Lara; Hogl, Henriette; Ernst, Gundula; Schmidt, Ralf; Bascom, David; Lagones, Tom Anglim; Deuber, Tobias; Bell, Thiago; Lubbering, Max; Sifa, Rafet: Optimizing Rare Disease Patient Matching with Large Language Models, 2024 IEEE International Conference on Big Data (BigData), 2024, https://doi.ieeecomputersociety.org/10.1109/BigData62323.2024.10910113
@Inproceedings{Berger.etal.2024b,
author={Berger, Armin and Bashir, Ali Hamza and Berghaus, David and Mowmita and Afsan, Nazia and Grigull, Lorenz and Fendrich, Lara and Hogl, Henriette and Ernst, Gundula and Schmidt, Ralf and Bascom, David and Lagones, Tom Anglim and Deuber, Tobias and Bell, Thiago and Lubbering, Max and Sifa, Rafet},
title={Optimizing Rare Disease Patient Matching with Large Language Models},
booktitle={2024 IEEE International Conference on Big Data (BigData)},
url={https://doi.ieeecomputersociety.org/10.1109/BigData62323.2024.10910113},
year={2024},
abstract={We present RepLLaMA, a neural ranking model for optimizing patient matching in rare disease communities. Using data from Unrare.me consisting of over two thousand profiles and over ten thousand ratings, our bi-encoder architecture maps profiles to 4096-dimensional vectors, enabling efficient similarity computations. The system processes unstructured symptom descriptions and structured responses,...}}