Interpretable Topic Extraction and Word Embedding Learning using row-stochastic DEDICOM

The DEDICOM algorithm provides a uniquely interpretable matrix factorization method for symmetric and asymmetric square matrices.
We employ a new row-stochastic variation of DEDICOM on the pointwise mutual information matrices of text corpora to identify latent topic clusters within the vocabulary and simultaneously learn interpretable word embeddings.
We introduce a method to efficiently train a constrained DEDICOM algorithm and present a qualitative evaluation of its topic modeling and word embedding performance.
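
As a rough illustration of the model form described above, the following minimal sketch builds a pointwise mutual information matrix S from a toy co-occurrence table and evaluates a DEDICOM-style reconstruction S ≈ A R Aᵀ in which every row of A is non-negative and sums to one. The row-wise softmax used to enforce the row-stochastic constraint, the squared-error loss, the handling of zero counts, and all function names and toy values are assumptions made for this example; the training procedure from the paper is not reproduced here.

```python
import numpy as np


def pmi_matrix(counts):
    """Pointwise mutual information of a square co-occurrence count matrix."""
    p_ij = counts / counts.sum()
    p_i = p_ij.sum(axis=1, keepdims=True)
    p_j = p_ij.sum(axis=0, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(p_ij / (p_i * p_j))
    # Assumption: pairs that never co-occur get PMI 0 instead of -inf.
    pmi[~np.isfinite(pmi)] = 0.0
    return pmi


def row_softmax(a_raw):
    """Map an unconstrained matrix to a row-stochastic one."""
    shifted = a_raw - a_raw.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def reconstruct(a_raw, r):
    """DEDICOM-style reconstruction A R A^T with row-stochastic A."""
    a = row_softmax(a_raw)
    return a @ r @ a.T


def loss(s, a_raw, r):
    """Squared Frobenius reconstruction error (assumed objective)."""
    return float(np.sum((s - reconstruct(a_raw, r)) ** 2))


# Hypothetical 3-word vocabulary with symmetric co-occurrence counts.
counts = np.array([[0.0, 4.0, 1.0],
                   [4.0, 0.0, 2.0],
                   [1.0, 2.0, 0.0]])
s = pmi_matrix(counts)

rng = np.random.default_rng(0)
a_raw = rng.normal(size=(3, 2))  # 3 words, 2 latent topics
r = rng.normal(size=(2, 2))      # topic-to-topic affinity matrix

a = row_softmax(a_raw)           # each row: a word's distribution over topics
dominant_topic = a.argmax(axis=1)
print(a.sum(axis=1), dominant_topic, loss(s, a_raw, r))
```

Because each row of A is a probability distribution over the latent topics, a row can be read both as a word embedding and as a soft topic assignment for that word, which is the sense in which such a factorization is interpretable.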

  • Published in:
    CD-MAKE 2020: Machine Learning and Knowledge Extraction, Cross Domain Conference for Machine Learning & Knowledge Extraction (CD-MAKE)
  • Type:
    Inproceedings
  • Authors:
    L. Hillebrand, D. Biesner, C. Bauckhage, R. Sifa
  • Year:
    2020

Citation information

L. Hillebrand, D. Biesner, C. Bauckhage, R. Sifa: Interpretable Topic Extraction and Word Embedding Learning using row-stochastic DEDICOM, Cross Domain Conference for Machine Learning & Knowledge Extraction (CD-MAKE), CD-MAKE 2020: Machine Learning and Knowledge Extraction, 2020, https://doi.org/10.1007/978-3-030-57321-8_22