------------------------
Fornsvenska textbanken
------------------------

Texts downloaded from Fornsvenska textbanken (http://project2.sol.lu.se/fornsvenska/), a website containing Swedish texts from the 12th century to the 18th century, in different genres. Only texts from the 14th century onwards have been downloaded, and only texts for which the date of creation is (more or less) known.

HistCorp inclusion date
------------------------
April 25, 2017

Website
--------
http://project2.sol.lu.se/fornsvenska/

Contact information
--------------------
Lars-Olof Delsing (Lars-Olof.Delsing@nordlund.lu.se)

Licence
--------
http://project2.sol.lu.se/fornsvenska/

The HistCorp files
-------------------
The texts from Fornsvenska textbanken on the HistCorp page are a subset of the texts included in Fornsvenska textbanken (http://project2.sol.lu.se/fornsvenska/). They are provided in a plain text format ('txt') and in a tokenised format ('tok').

The plain text files have been semi-automatically stripped of metadata and of extratextual information such as page numbering and footnotes. Metadata is instead given in a TEI-compatible format at the top of each file. When assigning metadata, the number of tokens has been calculated based on the tokenised version of the file.

In the tokenised files, the texts are split so that each line holds one token. Tokenisation was performed using the UDPipe tokeniser (https://ufal.mff.cuni.cz/udpipe) with the Swedish language model provided as a baseline model in the CoNLL17 Shared Task (swedish-ud-2.0-conll17-170315.udpipe).

Size: 44 texts, with a total of 1,409,578 tokens.

Genres: mixed.
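
Because the tokenised files hold one token per line, a token count like the one reported in the metadata can be approximated by counting non-empty lines. The following is a minimal sketch, not the procedure actually used for HistCorp. The function names are hypothetical, and it assumes the input contains token lines only: if a 'tok' file carries the TEI-compatible metadata at its top, that header would need to be removed first.

```python
from typing import Iterable


def count_tokens(lines: Iterable[str]) -> int:
    """Count tokens in one-token-per-line input, skipping blank lines.

    Assumption: every non-blank line is exactly one token (no header lines).
    """
    return sum(1 for line in lines if line.strip())


def count_tokens_in_file(path: str, encoding: str = "utf-8") -> int:
    """Count tokens in a tokenised file on disk.

    Assumption: the file is UTF-8 encoded; pass another encoding if not.
    """
    with open(path, encoding=encoding) as handle:
        return count_tokens(handle)
```

For example, `count_tokens_in_file("example.tok")` would return the number of token lines in a file named `example.tok` (a placeholder name, not an actual HistCorp file).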