Interpretable Topic Extraction and Word Embedding Learning using row-stochastic DEDICOM

The DEDICOM algorithm provides a uniquely interpretable matrix factorization method for symmetric and asymmetric square matrices.
We employ a new row-stochastic variation of DEDICOM on the pointwise mutual information matrices of text corpora to identify latent topic clusters within the vocabulary and simultaneously learn interpretable word embeddings.
We introduce a method to efficiently train a constrained DEDICOM algorithm and qualitatively evaluate its topic modeling and word embedding performance.
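
As a rough illustration of the idea described above, the following minimal sketch builds a pointwise mutual information (PMI) matrix from co-occurrence counts and forms a DEDICOM-style reconstruction A R A^T in which every row of A sums to one. It is not the authors' implementation: obtaining row-stochasticity through a row-wise softmax, leaving zero-count pairs at a PMI of 0, and all function names and toy values are assumptions made for this example, and no training step is shown.

```python
import math

def pmi_matrix(cooc):
    """PMI from a square co-occurrence count matrix (list of lists)."""
    n = len(cooc)
    total = sum(sum(row) for row in cooc)
    row_sums = [sum(row) for row in cooc]
    col_sums = [sum(cooc[i][j] for i in range(n)) for j in range(n)]
    pmi = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if cooc[i][j] > 0:
                # log( p(i, j) / (p(i) * p(j)) )
                pmi[i][j] = math.log(cooc[i][j] * total / (row_sums[i] * col_sums[j]))
    # Assumption: pairs that never co-occur keep a PMI of 0.0.
    return pmi

def row_softmax(m):
    """Map each row to a probability distribution (one way to get a row-stochastic matrix)."""
    out = []
    for row in m:
        mx = max(row)  # subtract the row maximum for numerical stability
        exps = [math.exp(v - mx) for v in row]
        s = sum(exps)
        out.append([e / s for e in exps])
    return out

def matmul(x, y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]

def transpose(m):
    return [list(col) for col in zip(*m)]

def dedicom_reconstruct(a_raw, r):
    """Return the row-stochastic loading matrix A and the reconstruction A R A^T."""
    a = row_softmax(a_raw)
    return a, matmul(matmul(a, r), transpose(a))

# Toy data (hypothetical): 3 words, 2 latent topics.
cooc = [[0, 4, 1],
        [4, 0, 1],
        [1, 1, 0]]
pmi = pmi_matrix(cooc)

a_raw = [[2.0, 0.0], [1.5, 0.1], [0.0, 1.0]]  # unconstrained parameters
r = [[1.0, -0.5], [-0.5, 0.8]]                # topic-to-topic affinity matrix
a, s_hat = dedicom_reconstruct(a_raw, r)

# Squared reconstruction error between the PMI matrix and A R A^T.
loss = sum((pmi[i][j] - s_hat[i][j]) ** 2 for i in range(3) for j in range(3))
```

In this sketch each row of `a` can be read as a word's distribution over topics, which is what makes the embedding directly interpretable; in practice `a_raw` and `r` would be fitted by minimizing `loss` with a gradient-based optimizer.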

Published in:
CD-MAKE 2020: Machine Learning and Knowledge Extraction, Cross Domain Conference for Machine Learning & Knowledge Extraction (CD-MAKE)
Type:
Inproceedings
Authors:
L. Hillebrand, D. Biesner, C. Bauckhage, R. Sifa
Year:
2020

Citation information

L. Hillebrand, D. Biesner, C. Bauckhage, R. Sifa: Interpretable Topic Extraction and Word Embedding Learning using row-stochastic DEDICOM, Cross Domain Conference for Machine Learning & Knowledge Extraction (CD-MAKE), CD-MAKE 2020: Machine Learning and Knowledge Extraction, 2020, https://doi.org/10.1007/978-3-030-57321-8_22
