MaxSimE: Explaining Transformer-based Semantic Similarity via Contextualized Best Matching Token Pairs

Current semantic search approaches rely on black-box language models, which limits their interpretability and transparency. In this work, we propose MaxSimE, an explanation method for language models that are used to measure semantic similarity. Our approach is inspired by the explainable-by-design ColBERT architecture: it generates explanations by matching each contextualized query token to the most similar token in the retrieved document, as measured by the cosine similarity of their embeddings. Existing post-hoc explanation methods may lack fidelity to the model and can therefore fail to provide trustworthy explanations in critical settings. In contrast, we demonstrate that MaxSimE can generate faithful explanations under certain conditions, and we show how it improves the interpretability of semantic search results on ranked documents from the LoTTe benchmark, which indicates its potential for trustworthy information retrieval.
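
The matching step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`cosine`, `best_matching_pairs`, `maxsim_score`) and the two-dimensional toy vectors are hypothetical, and in practice the embeddings would come from a Transformer encoder. Summing the per-token maxima follows the ColBERT-style late-interaction score and is an assumption about how the pairs would be aggregated.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def best_matching_pairs(query_tokens, query_vecs, doc_tokens, doc_vecs):
    """For each query token, find the document token whose embedding
    has the highest cosine similarity to the query token's embedding."""
    pairs = []
    for q_tok, q_vec in zip(query_tokens, query_vecs):
        sims = [cosine(q_vec, d_vec) for d_vec in doc_vecs]
        best = max(range(len(sims)), key=sims.__getitem__)
        pairs.append((q_tok, doc_tokens[best], sims[best]))
    return pairs


def maxsim_score(pairs):
    """Sum of the per-query-token maxima (ColBERT-style aggregation,
    assumed here for illustration)."""
    return sum(sim for _, _, sim in pairs)


# Hypothetical toy embeddings; real ones would be contextualized
# token embeddings produced by a language model.
query_tokens = ["cheap", "flights"]
query_vecs = [[1.0, 0.0], [0.0, 1.0]]
doc_tokens = ["low", "cost", "airfare"]
doc_vecs = [[0.9, 0.1], [0.5, 0.5], [0.1, 0.9]]

pairs = best_matching_pairs(query_tokens, query_vecs, doc_tokens, doc_vecs)
for q_tok, d_tok, sim in pairs:
    print(f"{q_tok} -> {d_tok} (cosine similarity {sim:.3f})")
print(f"total score: {maxsim_score(pairs):.3f}")
```

Each printed pair is one line of the explanation: it shows which document token contributed most to the similarity for a given query token, and how much.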

  • Published in:
    International ACM SIGIR Conference on Research and Development in Information Retrieval
  • Type:
    Inproceedings
  • Authors:
    Brito, Eduardo; Iser, Henri
  • Year:
    2023

Citation information

Brito, Eduardo; Iser, Henri: MaxSimE: Explaining Transformer-based Semantic Similarity via Contextualized Best Matching Token Pairs, International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023, https://dl.acm.org/doi/abs/10.1145/3539618.3592017