Interpretable Topic Extraction and Word Embedding Learning using row-stochastic DEDICOM
The DEDICOM algorithm provides a uniquely interpretable matrix factorization method for symmetric and asymmetric square matrices.
We employ a new row-stochastic variation of DEDICOM on the pointwise mutual information matrices of text corpora to identify latent topic clusters within the vocabulary and simultaneously learn interpretable word embeddings.
We introduce a method to efficiently train a constrained DEDICOM algorithm and a qualitative evaluation of its topic modeling and word embedding performance.
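As a rough illustration of the objects the abstract refers to, the sketch below builds a pointwise mutual information matrix from a toy co-occurrence count matrix and forms a DEDICOM-style reconstruction A R A^T with a row-stochastic loading matrix A. The function names, the clipping to positive PMI, and the use of a row-wise softmax to enforce the row-stochastic constraint are assumptions made for this example; they are not taken from the paper and should not be read as the authors' training procedure.

```python
import numpy as np

def pmi_matrix(cooc, positive=True):
    """PMI from a square word co-occurrence count matrix (hypothetical helper)."""
    cooc = np.asarray(cooc, dtype=float)
    total = cooc.sum()
    row = cooc.sum(axis=1, keepdims=True)
    col = cooc.sum(axis=0, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(cooc * total / (row * col))
    pmi[~np.isfinite(pmi)] = 0.0  # zero counts: treat as no association
    # Assumption: clip negative values (positive PMI), a common choice.
    return np.maximum(pmi, 0.0) if positive else pmi

def row_softmax(M):
    """Map each row to a probability vector (non-negative, sums to 1)."""
    z = M - M.max(axis=1, keepdims=True)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def dedicom_reconstruct(A_raw, R):
    """Return the row-stochastic loadings A and the reconstruction A R A^T."""
    A = row_softmax(A_raw)
    return A, A @ R @ A.T

# Toy example: 3 words, 2 latent topics, random (untrained) factors.
rng = np.random.default_rng(0)
cooc = np.array([[0, 4, 1],
                 [4, 0, 2],
                 [1, 2, 0]])
S = pmi_matrix(cooc)
A, S_hat = dedicom_reconstruct(rng.normal(size=(3, 2)), rng.normal(size=(2, 2)))
loss = np.sum((S - S_hat) ** 2)  # squared reconstruction error to be minimized
```

Under this reading, each row of A can be interpreted as a word's probability distribution over topics, and R as a k-by-k matrix of topic-to-topic affinities; fitting the factors would mean minimizing `loss` over `A_raw` and `R`.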
- Published in: CD-MAKE 2020: Machine Learning and Knowledge Extraction, Cross Domain Conference for Machine Learning & Knowledge Extraction (CD-MAKE)
- Type: Inproceedings
- Authors: L. Hillebrand, D. Biesner, C. Bauckhage, R. Sifa
- Year: 2020
Citation information
L. Hillebrand, D. Biesner, C. Bauckhage, R. Sifa: Interpretable Topic Extraction and Word Embedding Learning using row-stochastic DEDICOM, Cross Domain Conference for Machine Learning & Knowledge Extraction (CD-MAKE), CD-MAKE 2020: Machine Learning and Knowledge Extraction, 2020, https://doi.org/10.1007/978-3-030-57321-8_22
@Inproceedings{Hillebrand.etal.2020,
author={L. Hillebrand and D. Biesner and C. Bauckhage and R. Sifa},
title={Interpretable Topic Extraction and Word Embedding Learning using row-stochastic DEDICOM},
booktitle={Cross Domain Conference for Machine Learning \& Knowledge Extraction (CD-MAKE)},
journal={CD-MAKE 2020: Machine Learning and Knowledge Extraction},
url={https://doi.org/10.1007/978-3-030-57321-8_22},
year={2020},
abstract={The DEDICOM algorithm provides a uniquely interpretable matrix factorization method for symmetric and asymmetric square matrices.
We employ a new row-stochastic variation of DEDICOM on the pointwise mutual information matrices of text corpora to identify latent topic clusters within the vocabulary and simultaneously learn interpretable word embeddings.
We introduce a method to efficiently train a...}}