Optimizing Rare Disease Patient Matching with Large Language Models

We present RepLLaMA, a neural ranking model for optimizing patient matching in rare disease communities. Using data from Unrare.me consisting of over two thousand profiles and over ten thousand ratings, our bi-encoder architecture maps profiles to 4096-dimensional vectors, enabling efficient similarity computations. The system processes unstructured symptom descriptions and structured responses, incorporating expert-guided LLM enhancements. Results show Top-10 Recall of 49.36% (± 2.03), surpassing baselines while maintaining generalization. The implementation provides a scalable solution for rare disease patient matching, addressing computational complexity challenges.
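The retrieval step described in the abstract can be illustrated with a minimal sketch. It assumes each profile has already been encoded into a 4096-dimensional vector; random vectors stand in for the encoder output here, and the helper names (`normalize`, `cosine`, `top_k`, `recall_at_k`) as well as the choice of cosine similarity and the exact recall definition are illustrative assumptions, not taken from the paper.

```python
import math
import random

DIM = 4096  # embedding width stated in the abstract


def normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cosine(a, b):
    """Dot product of two unit-length vectors, i.e. their cosine similarity."""
    return sum(x * y for x, y in zip(a, b))


def top_k(query_id, embeddings, k=10):
    """Rank all other profiles by similarity to the query profile."""
    query = embeddings[query_id]
    scored = [
        (cosine(query, vec), pid)
        for pid, vec in embeddings.items()
        if pid != query_id
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [pid for _, pid in scored[:k]]


def recall_at_k(ranked, relevant, k=10):
    """Fraction of relevant profiles that appear in the first k results.

    Assumed definition; the paper may compute Top-10 Recall differently.
    """
    if not relevant:
        return 0.0
    hits = len(set(ranked[:k]) & set(relevant))
    return hits / len(relevant)


# Hypothetical stand-in for the encoder: random unit vectors, not real profiles.
rng = random.Random(0)
embeddings = {
    f"p{i}": normalize([rng.gauss(0, 1) for _ in range(DIM)])
    for i in range(50)
}

ranked = top_k("p0", embeddings, k=10)
```

Because every profile is encoded once and independently in a bi-encoder, matching a new profile only requires one encoding pass followed by vector comparisons like the ones above, which is what makes the similarity computation efficient at scale.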

  • Published in:
    2024 IEEE International Conference on Big Data (BigData)
  • Type:
    Inproceedings
  • Authors:
    Berger, Armin; Bashir, Ali Hamza; Berghaus, David; Mowmita; Afsan, Nazia; Grigull, Lorenz; Fendrich, Lara; Hogl, Henriette; Ernst, Gundula; Schmidt, Ralf; Bascom, David; Lagones, Tom Anglim; Deuber, Tobias; Bell, Thiago; Lubbering, Max; Sifa, Rafet
  • Year:
    2024
  • Source:
    https://doi.ieeecomputersociety.org/10.1109/BigData62323.2024.10910113

Citation information

Berger, Armin; Bashir, Ali Hamza; Berghaus, David; Mowmita; Afsan, Nazia; Grigull, Lorenz; Fendrich, Lara; Hogl, Henriette; Ernst, Gundula; Schmidt, Ralf; Bascom, David; Lagones, Tom Anglim; Deuber, Tobias; Bell, Thiago; Lubbering, Max; Sifa, Rafet: Optimizing Rare Disease Patient Matching with Large Language Models, 2024 IEEE International Conference on Big Data (BigData), 2024, https://doi.ieeecomputersociety.org/10.1109/BigData62323.2024.10910113, Berger.etal.2024b

Associated Lamarr Researchers

Prof. Dr. Rafet Sifa

Principal Investigator Hybrid ML