---------------------------------------------------
Referenzkorpus Mittelniederdeutsch/Niederrheinisch
---------------------------------------------------

The project "Reference Corpus Middle Low German/Low Rhenish (1200–1650)", abbreviated as "ReN", is part of the "Corpus of Historical German Texts", which includes the projects "Old German Reference Corpus (750–1050)", the "Reference Corpus Middle High German (1050–1350)" and the "Reference Corpus Early Modern New High German (1350–1650)".

The project "Reference Corpus Middle Low German/Low Rhenish (1200–1650)" deals with a structured selection of Middle Low German and Low Rhenish monuments of speech from 1200 to 1650. This selection is based on the parameters "space", "time" and "field of writing". The reference corpus of Middle Low German and Low Rhenish texts is based on manuscripts, prints and inscriptions. It is intended to provide an insight into the culture of speech and writing in Middle Low German and Low Rhenish regions. This spectrum of text types can be used to trace the linguistic development on the base of diatopic and diachronic subcategorisation.

The aim of the project is the publication of diplomatic transcribed, lemmatised and grammatically annotated texts. The processed data – especially on the grammatical level – enables a linguistic analysis of the Middle Low German and Low Rhenish language, which goes far beyond what has been possible until now.

Cited from https://corpora.uni-hamburg.de/hzsk/de/islandora/object/text-corpus:ren-1.0#corpus-description, March 13, 2020

HistCorp inclusion date
------------------------
March 13, 2020

Website
--------
https://corpora.uni-hamburg.de/hzsk/de/islandora/object/text-corpus:ren-1.0

Cite
-----
ReN-Team. 2019. “Referenzkorpus Mittelniederdeutsch/Niederrheinisch (1200-1650).” Archived in Hamburger Zentrum für Sprachkorpora. Version 1.0. Publication date 2019-08-14. http://hdl.handle.net/11022/0000-0007-D829-8.

Licence
--------
Creative Commons Attribution 4.0 International (https://creativecommons.org/licenses/by/4.0/)

The HistCorp files
-------------------
On the HistCorp page, the German texts from the 'ReN' corpus are provided in a plain text format ('txt') and a tokenised format ('tok'). A subset of the texts is also provided in a linguistically annotated format ('anno').

The linguistically annotated files are the same as in the original ReN corpus, i.e. in a TEI format annotated with lemma, part-of-speech and morphosyntactic information.

The plain text files are derived from the TEI files by automatically extracting the words and sentences. They contain one sentence on each line. Furthermore, metadata has been added in a TEI-compatible format at the top of each txt file. The metadata was mainly taken from the information stated on the corpus website. The dates have been extracted from the file names. In addition, the number of tokens for each file has been calculated based on the tokenised version of the file.

In the tokenised files, the texts are split so that each line contains one token. This follows the existing tokenisation in the original TEI corpus files.

Size: 235 texts, with a total of 2,141,534 tokens.
Genres: speech and writings
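
Example: counting tokens in the tokenised files
------------------------------------------------
Since the 'tok' files hold one token per line, the per-file token count mentioned above can be recomputed by counting lines. The following is a minimal sketch, not the script used to build the HistCorp files. It assumes that the files are UTF-8 encoded, that blank lines are not tokens, and that the tokenised files carry a '.tok' extension; the function names and the glob pattern are hypothetical.

```python
from pathlib import Path


def count_tokens(tok_path):
    """Count tokens in one tokenised file (assumed: one token per line).

    Blank lines are skipped; this is an assumption, not documented behaviour.
    """
    with open(tok_path, encoding="utf-8") as handle:
        return sum(1 for line in handle if line.strip())


def count_corpus_tokens(directory, pattern="*.tok"):
    """Map each file name matching `pattern` (hypothetical) to its token count."""
    return {
        path.name: count_tokens(path)
        for path in sorted(Path(directory).glob(pattern))
    }
```

Summing the values returned by count_corpus_tokens gives a corpus total that can be compared with the size stated above, provided the assumptions about the file layout hold.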