{"ID":2891871,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2507.16695","arxiv_id":"2507.16695","title":"Interpretable Topic Extraction and Word Embedding Learning using row-stochastic DEDICOM","abstract":"The DEDICOM algorithm provides a uniquely interpretable matrix factorization method for symmetric and asymmetric square matrices. We employ a new row-stochastic variation of DEDICOM on the pointwise mutual information matrices of text corpora to identify latent topic clusters within the vocabulary and simultaneously learn interpretable word embeddings. We introduce a method to efficiently train a constrained DEDICOM algorithm and a qualitative evaluation of its topic modeling and word embedding performance.","short_abstract":"The DEDICOM algorithm provides a uniquely interpretable matrix factorization method for symmetric and asymmetric square matrices. We employ a new row-stochastic variation of DEDICOM on the pointwise mutual information matrices of text corpora to identify latent topic clusters within the vocabulary and simultaneously le...","url_abs":"https://arxiv.org/abs/2507.16695","url_pdf":"https://arxiv.org/pdf/2507.16695v1","authors":"[\"Lars Hillebrand\",\"David Biesner\",\"Christian Bauckhage\",\"Rafet Sifa\"]","published":"2025-07-22T15:30:32Z","proceeding":"cs.CL","tasks":"[\"cs.CL\",\"cs.AI\",\"cs.LG\"]","methods":"[]","has_code":false}
