Contrastive Text Analysis with pydistinto: A Python Package for Using Different Distinctiveness Measures

Lecture as part of DHd 2022

Date:

10.03.2022

Place:

via Zoom

17:30 – 20:00

Categories:

Conference

Presentation by Keli Du, Julia Dudar, Cora Rok and Christof Schöch.

In Computational Literary Studies (CLS), statistical measures of distinctiveness are used to identify features that are characteristic of one group of texts in comparison with another group of texts. However, most existing tools prove unsuitable when users want to customise their analyses, set their own parameters or work with specific data formats. To facilitate the use of relevant measures for contrastive text analysis and to raise awareness of the diversity of measures, we are developing a Python package called pydistinto. With pydistinto, users with little knowledge of programming and statistics will be able to compare two text corpora using different measures and, in an advanced mode, to empirically determine and compare the properties and performance of those measures. Through tables and figures, the planned poster will mainly present the following aspects of our package: the options for pre-processing the text data, the implemented distinctiveness measures and the visualisation of the contrastive analysis results.
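To make the idea of a distinctiveness measure more concrete, here is a minimal sketch of one widely used measure, Dunning's log-likelihood ratio, applied to two tiny token lists. It does not use pydistinto and does not reflect its API; the function name `log_likelihood`, the toy corpora and the choice of this particular measure are assumptions made purely for illustration.

```python
import math
from collections import Counter


def log_likelihood(target_tokens, comparison_tokens):
    """Score each word with a signed log-likelihood ratio (G2).

    Hypothetical helper for illustration only, not part of pydistinto.
    Positive scores mark words overrepresented in the target corpus,
    negative scores mark words overrepresented in the comparison corpus.
    Both token lists are assumed to be non-empty.
    """
    target = Counter(target_tokens)
    comparison = Counter(comparison_tokens)
    n_target = sum(target.values())
    n_comparison = sum(comparison.values())
    n_total = n_target + n_comparison

    scores = {}
    for word in set(target) | set(comparison):
        a, b = target[word], comparison[word]
        # Expected counts if the word were equally frequent in both corpora.
        expected_a = n_target * (a + b) / n_total
        expected_b = n_comparison * (a + b) / n_total
        g2 = 0.0
        if a:
            g2 += a * math.log(a / expected_a)
        if b:
            g2 += b * math.log(b / expected_b)
        g2 *= 2
        # Sign the score by the direction of the frequency difference.
        if a / n_target < b / n_comparison:
            g2 = -g2
        scores[word] = g2
    return scores


# Toy corpora, already tokenised by whitespace.
target = "the ghost haunted the castle and the ghost wailed".split()
comparison = "the detective searched the office and the detective waited".split()

scores = log_likelihood(target, comparison)
print(max(scores, key=scores.get))  # ghost
```

Words that occur equally often in both corpora, such as "the" in this toy example, receive a score of zero, while words concentrated in one corpus receive scores further from zero. Other measures rank the same words differently, which is why comparing several measures on the same corpora can be informative.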


Keywords: Text Mining