Historical Corpora - Archive
- February 8, 2023: The corpus Coptic Scriptorium was added
- September 21, 2021: The dictionary Swensk Ordabok by Jesper Swedberg was added
- September 21, 2021: The dictionary Ordbok över svenska medeltidsspråket by Schlyter was added
- September 20, 2021: The dictionary Ordbok Öfver svenska språket by Dalin was added
- September 13, 2021: The dictionary Words of the 16th-Century Slovenian Literary Language was added
- November 20, 2020: The Universal Dependencies version of The German Literary History was added
- November 10, 2020: The Universal Dependencies version of The Middle Russian Corpus was added
- November 6, 2020: The Universal Dependencies version of The Syntactic Reference Corpus of Medieval French was added
- October 15, 2020: The EDGeS Diachronic Bible Corpus was added for Dutch, English, German and Swedish
- August 25, 2020: Late Latin Charter Treebank and Index Thomisticus Treebank were added for Latin
- August 24, 2020: Perseus Digital Library and Proiel Treebank were added for Latin and Greek
- March 18, 2020: The Medieval Charter Sections Corpus was added for Latin and Czech
- March 18, 2020: The Reference Corpus of Middle Low German/Low Rhenish was added
- March 13, 2020: The Nottingham Corpus of Early Modern German Midwifery and Women's Medicine was added
- February 20, 2020: The Corpus of Late Modern English Texts was added
- February 14, 2020: The Middle Polish Diachrone Lemmatised Corpus was added
- June 15, 2018: Basque datasets for spelling normalisation were added
- June 14, 2018: The Ridges Corpus was added for German
For questions or comments, or if there are corpora that you would like to add to this page, don't hesitate to contact us: