Mining and Modeling Text – MiMoText
Interdisciplinary Applications, Informational Development, Legal Perspectives
Project Management: Prof. Dr. Christof Schöch (Universität Trier - Computerlinguistik & Digital Humanities; Universität Trier - Trier Center for Digital Humanities (TCDH)) · Universität Trier - Trier Center for Digital Humanities (TCDH)
Project Participants: Universität Trier - Fachbereich II (Sprach-, Literatur- und Medienwissenschaften) · Universität Trier - Fachbereich IV (Informatikwissenschaften) · Universität Trier - Fachbereich V (Rechtswissenschaft) · Fachinformationsdienst (FID) Romanistik · Universität Trier - Universitätsbibliothek Trier
Sponsors: Forschungsinitiative des Landes Rheinland-Pfalz
Running time: -
Schöch, Christof, Frédéric Döhl, Achim Rettinger, Evelyn Gius, Peer Trilcke, Peter Leinen, Fotis Jannidis, Maria Hinzmann, and Jörg Röpke. “Abgeleitete Textformate: Text und Data Mining mit urheberrechtlich geschützten Textbeständen.” Zeitschrift für digitale Geisteswissenschaften (ZfdG) 5 (2020). URL: http://www.zfdg.de/2020_006. DOI: http://dx.doi.org/10.17175/2020_006.
Keywords: Quantitative Analysis, Literary History, Linked Open Data
Website of the Project: Mining and Modeling Text
Digitization makes it increasingly important to acquire knowledge from volumes of text and data that are too large for individuals to handle. For the humanities, this means in particular that digital full texts and rich metadata must not only be available, but must be available in a form that supports the production of knowledge in the humanities.
The aim of the MiMoText project is therefore to establish an information network for the humanities that is fed from various sources. Published as Linked Open Data, the network is freely available, can be linked to other knowledge resources of the Semantic Web, and offers innovative and efficient ways of accessing scholarly information.
In the first project phase, the focus is on sources for the history of the French novel from 1750 to 1799; in the second phase, the approach will be transferred to a parallel period of German literary history. Both phases can draw on existing digital full texts from Gallica, TextGrid and VD18.
Bibliographic directories, specialist literature and primary texts serve as sources of information. From these, metadata, concrete text properties and descriptive or evaluative statements about relevant entities, for example authors and works, are extracted. For this purpose, quantitative methods for automatic text analysis and for extracting and modelling data from large text collections must be refined and, in part, newly developed. The information is then converted into a Linked Open Data format so that the statements can be linked to each other and to external resources. From the start of the project, the legal framework is also analyzed to ensure that the knowledge network is built and made available in compliance with copyright and data protection law.
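The conversion step described above can be illustrated with a minimal sketch. It assumes that extracted statements are available as simple subject-predicate-object records and serializes them as N-Triples, one common Linked Open Data format. The namespace, the property names and the sample records are hypothetical placeholders chosen for illustration; they are not the project's actual identifiers, data model or data.

```python
# Hypothetical namespace; the project's real identifiers are not given in the text.
BASE = "http://example.org/mimotext/"


def iri(local_name):
    """Wrap a local name in the (assumed) namespace as an N-Triples IRI."""
    return f"<{BASE}{local_name}>"


def literal(value):
    """Serialize a string as a plain N-Triples literal with basic escaping."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def to_ntriples(statements):
    """Turn (subject, predicate, object, is_literal) records into N-Triples."""
    lines = []
    for subject, predicate, obj, is_literal in statements:
        obj_term = literal(obj) if is_literal else iri(obj)
        lines.append(f"{iri(subject)} {iri(predicate)} {obj_term} .")
    return "\n".join(lines)


# Placeholder records standing in for statements extracted from the sources.
statements = [
    ("work/1", "title", "Example Novel", True),
    ("work/1", "author", "person/1", False),
    ("person/1", "name", "Example Author", True),
]

ntriples = to_ntriples(statements)
print(ntriples)
```

Because the work and the author are identified by IRIs rather than by strings, the same identifiers can be reused in further statements or aligned with external resources, which is what makes the data linkable.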
Project spokesperson: schoech [at] uni-trier.de (Prof. Dr. Christof Schöch)
Deputy spokesperson: moulin [at] uni-trier.de (Prof. Dr. Claudine Moulin)