Beyond Words

Semantic and multiword distinctive features for an investigation of literary subgenres


Project Management: Prof Dr Christof Schöch (Universität Trier - Computerlinguistik & Digital Humanities; Universität Trier - Trier Center for Digital Humanities (TCDH))

Sponsors: Deutsche Forschungsgemeinschaft (DFG), within the Priority Programme "Computational Literary Studies" (SPP 2207)

Running time: -

Contact person (TCDH): Prof Dr Christof Schöch


Keli Du, Julia Dudar, Christof Schöch: “Evaluation of measures of distinctiveness: Classification of literary texts on the basis of distinctive words”. Journal of Computational Literary Studies 1.1, 2022. – DOI:

Research Area: Digital Literary and Cultural Studies

Keywords: Keyness, Distinctiveness, Statistics, French Literature, Novel, Genres


The project "Beyond Words" aims to advance the study of literature by employing statistical techniques to analyze French contemporary novels, particularly focusing on science fiction, crime fiction, and sentimental novels, while also considering English texts. The goal is to bridge the gap between the statistical characteristics that define these literary subgenres and a more nuanced, interpretive understanding of their unique qualities. This is being achieved by extracting complex linguistic features from the texts that offer a richer semantic insight than mere word analysis, creating detailed, flexible profiles of these subgenres, and employing both qualitative and quantitative methods to assess the relevance and interpretability of the identified features. With this innovative approach, we aim to significantly contribute to Computational Literary Studies by improving the methods used for analyzing texts and deepening the conceptual understanding of literary subgenres.

Contrastive text analysis, where one group of texts is compared to another, is a widely used procedure in linguistics and literary studies, both in qualitative and quantitative research designs. Measures of ‘keyness’ or ‘distinctiveness’ have been developed, evaluated, and used in a range of related fields, in particular Information Retrieval, Corpus and Computational Linguistics, and Computational Literary Studies. 
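To make the contrastive setup concrete, the following is a minimal sketch of one simple, dispersion-based distinctiveness score: the share of target documents containing a feature minus the share of comparison documents containing it, which is the idea behind Zeta-type measures. The text above does not state which measures the project uses, so the function names, the whitespace tokenization, and the toy sentences are assumptions made purely for illustration.

```python
from collections import Counter


def document_proportions(docs):
    """Share of documents in which each feature occurs at least once."""
    counts = Counter()
    for doc in docs:
        # Naive whitespace tokenization (an assumption); each feature is
        # counted at most once per document.
        counts.update(set(doc.lower().split()))
    return {feature: n / len(docs) for feature, n in counts.items()}


def distinctiveness(target_docs, comparison_docs):
    """Document proportion in the target group minus that in the comparison group.

    Scores range from -1 (only in comparison documents) to 1 (only in
    target documents).
    """
    target = document_proportions(target_docs)
    comparison = document_proportions(comparison_docs)
    features = set(target) | set(comparison)
    return {f: target.get(f, 0.0) - comparison.get(f, 0.0) for f in features}


# Toy data, invented for illustration only.
crime = ["the detective found the body", "the detective questioned the suspect"]
scifi = ["the ship left the planet", "the robot repaired the ship"]

scores = distinctiveness(crime, scifi)
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

In this toy example, a word that occurs in every document of both groups (such as "the") scores 0, while a word found in all target documents and no comparison document scores 1. The same scheme would apply unchanged to features other than single word forms, provided the tokenization step is replaced accordingly.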

The project proposed here builds directly on the insights, experience, and results from the ongoing Zeta and Company project that works on a systematic, methodological exploration of this quantitative contrastive paradigm. In Beyond Words, the literary domain we focus on is again the French contemporary novel, with a special focus on the three popular subgenres of science fiction, crime fiction, and sentimental novels, but English-language literary and non-literary corpora are also taken into account. 

The overall objective of Beyond Words is to significantly narrow the gap between the (statistically speaking) distinctive features of specific groups of exemplars of these literary subgenres, on the one hand, and their (meaningful, interpretive) relationship to an ambitiously complex understanding of the characteristic properties of literary subgenres, on the other hand.

Our strategy to achieve this objective relies on a three-pronged approach. First, rather than focusing on single word forms, we extract more complex and semantically richer linguistic features from the texts, which we believe are better able to capture meaningful characteristics of literary subgenres. Second, we develop a conceptualization of the subgenres that is both explicit and flexible by creating fine-grained, descriptive, prototypical subgenre profiles based on a broad consideration of the relevant research literature. Third, we maintain our focus on qualitative and quantitative strategies for evaluating the discriminatory power and the interpretability of the distinctive features we identify.

With this approach, we can contribute decisively to Computational Literary Studies, both at the level of methodological innovation regarding feature extraction and measures of distinctiveness suitable for complex features and at the level of a deepened understanding of what constitutes subgenres conceptually and how the particular subgenres in question can best be described. 


Iuliia Dudar
E-mail: dudar [at] uni-trier [dot] de

Julia Röttgermann
E-mail: roettger [at] uni-trier [dot] de
Phone: +49 651 201-3120

Keli Du
E-mail: duk [at] uni-trier [dot] de
Phone: +49 651 201-3377

Prof Dr Christof Schöch
E-mail: schoech [at] uni-trier [dot] de
Phone: +49 651 201-3264