Lecture by Keli Du (Trier) as part of the Research Colloquium “Digital perspectives”, summer semester 2022
Towards an understanding of LDA Topic Modeling: an evaluation from a digital humanities perspective
Date: 30.06.2022
Place: Thursdays, 6-8 p.m. c.t., digital via Zoom
Link: https://uni-trier.zoom.us/j/85154523515?pwd=VG9SYWZzY2Vlc21YNkRtRU9yWldtUT09
Categories: Event
Contact: Dr. Claudia Bamberg
Latent Dirichlet Allocation (LDA) Topic Modeling is a quantitative text analysis method that has been widely used in the Digital Humanities in recent years. Users often observe that the method is sensitive to its parameter settings, and as a result it is often heavily criticized.
In this talk, Keli Du will present his dissertation “Towards understanding LDA Topic Modeling: an evaluation from a digital humanities perspective”, which focuses on a systematic evaluation of LDA Topic Modeling. The goal of the evaluation is to understand LDA Topic Modeling in depth and to find out under which circumstances it yields stable results. The evaluation was conducted from two perspectives, topic-modeling-based document classification and topic coherence, on two German corpora: a collection of 2000 newspaper articles and a collection of 439 dime novels. The talk mainly presents the results for two factors: the number of topics and chunk length. Interestingly, the experiments on these two factors yielded partly the same and partly different results across the two corpora. This indicates that a thorough understanding of LDA Topic Modeling will probably require many experiments.