-------------------------
Paris Speech in the Past
-------------------------

A collection of semi-literary representations of vernacular (French) speech from the 16th to 19th centuries, preceded by a set of tax rolls from late 17th-century Paris.

Publications based on this resource:
Lodge, R.A. (1996) "Stereotypes of vernacular pronunciation in 17th-18th century Paris," Zeitschrift für Romanische Philologie 112.
--
Lodge, R.A. (1995) "Les lettres de Montmartre," Revue de linguistique romane 59.

Cited from http://ota.ox.ac.uk/desc/2423 June 15, 2017

HistCorp inclusion date
------------------------
January 30, 2017

Website
--------
http://ota.ox.ac.uk/desc/2423

Licence
--------
Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported (https://creativecommons.org/licenses/by-nc-sa/3.0/)

The HistCorp files
-------------------
On the HistCorp page, the French texts from 'Paris Speech in the Past' are provided in a plain text format ('txt') and in a tokenised format ('tok').

The plain text files were produced by saving the original rtf files as plain text using the 'save as' function in Microsoft Word. Metadata are given in a TEI-compatible format at the top of each txt file. Most of the metadata were extracted from the xml file that accompanies the rtf files when the corpus is downloaded from the Oxford Text Archive. In addition, the number of tokens for each file has been calculated from the tokenised version of the file.

In the tokenised files, the texts are split so that each line holds one token. Tokenisation was performed using the UDPipe tokeniser (https://ufal.mff.cuni.cz/udpipe) with the French language model provided as a baseline model in the CoNLL17 Shared Task (french-ud-2.0-conll17-170315.udpipe).

Size: 10 texts, with a total of 230,915 tokens.

Genres: tax rolls, and vernacular French speech in the form of drama, poetry, political pamphlets, conference speech and letters.
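
The one-token-per-line layout of the tokenised files makes a token count easy to reproduce. The following Python sketch is only an illustration and not part of the HistCorp tooling: the function names `read_tokens` and `token_stats` are hypothetical, and it assumes that the tok files are UTF-8 encoded, contain nothing but tokens, and that any blank lines can be ignored.

```python
from collections import Counter
from pathlib import Path


def read_tokens(path, encoding="utf-8"):
    """Read a one-token-per-line file and return its tokens as a list.

    Assumption: blank lines carry no token and are skipped.
    Assumption: the file is UTF-8 encoded (pass another encoding if not).
    """
    text = Path(path).read_text(encoding=encoding)
    return [line.strip() for line in text.splitlines() if line.strip()]


def token_stats(tokens, top_n=5):
    """Return the number of tokens and the top_n most frequent tokens."""
    return len(tokens), Counter(tokens).most_common(top_n)
```

A call such as `token_stats(read_tokens("some_text.tok"))`, where `some_text.tok` stands in for one of the downloaded files, returns the token count together with the most frequent tokens. If the assumptions above hold, the counts summed over all 10 files should be comparable to the total stated under Size.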