--------------------------
Greek Dependency Treebank
--------------------------

The Ancient Greek and Latin Dependency Treebank (AGLDT) is the earliest
treebank for Ancient Greek and Latin. The project started at Tufts University
in 2006 and is currently under development and maintenance at Leipzig
University-Tufts University.

The Ancient Greek and Latin Dependency Treebanks are built from the work of
dedicated students and researchers from across the world. Over 200 people have
annotated texts; the hard work of those who have contributed their annotations
as part of the official treebanks are within the data.

Cited from https://perseusdl.github.io/treebank_data/ September 13, 2017

For Greek, the following texts are included:

Author               Text
---------------------------------------
Aesop                Fables (1.1-1.50)
Aeschylus            Agamemnon
                     Eumenides
                     Libation Bearers
                     Persians
                     Prometheus Bound
                     Seven Against Thebes
                     Suppliants
Athenaeus            The Deipnosophists (12-13)
Diodorus Siculus     Library (11)
Herodotus            Histories (1)
Hesiod               Shield of Heracles
                     Theogony
                     Works and Days
Homer                Iliad
                     Odyssey
Lysias               Oration 1
                     Oration 14
                     Oration 15
                     Oration 23
Plato                Euthyphro
Plutarch             Alcibiades
                     Lycurgus
Polybius             Histories (1)
Pseudo Apollodorus   Library (1.1.1-1.4.1)
Pseudo Homer         Hymn to Demeter
Sophocles            Ajax
                     Antigone
                     Electra
                     Oedipus Tyrannus
                     Trachinae
Thucydides           Histories (1)


HistCorp inclusion date
------------------------
January 30, 2017


Website
--------
https://perseusdl.github.io/treebank_data/


Licence
--------
Creative Commons Attribution-ShareAlike 3.0 United States
https://creativecommons.org/licenses/by-sa/3.0/us/


The HistCorp files
-------------------
On the HistCorp page, the Greek texts from the AGLDT corpus are provided in a
plain text format ('txt'), a tokenised format ('tok'), and in a
morphologically and syntactically annotated format ('anno').

The plain text files were created from the original AGLDT XML files by
extracting the text parts and adding metadata from the XML files, in a
TEI-compatible format, at the top of each file. In addition, the number of
tokens for each file has been calculated based on the tokenised file.

The tokenised files were created by extracting the words and sentence
boundaries from the original XML files.

The tagged and parsed files are unchanged from the ones found on the AGLDT
webpage, except that metadata has been added at the top of each file.

Size: 33 texts, with a total of 549,732 tokens.
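
The extraction of words and sentence boundaries described above can be
illustrated with a minimal sketch. It is not the script used to build the
HistCorp files. It assumes that each sentence in the treebank XML is a
"sentence" element whose "word" children carry the surface form in a "form"
attribute; the sample input, the function names, and the
one-sentence-per-line output layout are hypothetical and chosen only for
illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature input. The element and attribute names
# (sentence, word, form) are assumptions about the treebank XML layout.
SAMPLE_XML = """<treebank>
  <sentence id="1">
    <word id="1" form="μῆνιν" head="2"/>
    <word id="2" form="ἄειδε" head="0"/>
  </sentence>
  <sentence id="2">
    <word id="1" form="θεά" head="0"/>
  </sentence>
</treebank>"""


def extract_sentences(xml_text):
    """Return one list of word forms per <sentence> element."""
    root = ET.fromstring(xml_text)
    sentences = []
    for sentence in root.iter("sentence"):
        # Skip word elements without a surface form.
        forms = [w.get("form") for w in sentence.iter("word") if w.get("form")]
        sentences.append(forms)
    return sentences


def to_tokenised_text(sentences):
    """Assumed output layout: one sentence per line, tokens space-separated."""
    return "\n".join(" ".join(forms) for forms in sentences)


def count_tokens(sentences):
    """Total number of tokens across all sentences."""
    return sum(len(forms) for forms in sentences)


sentences = extract_sentences(SAMPLE_XML)
print(to_tokenised_text(sentences))
print("tokens:", count_tokens(sentences))
```

Counting tokens on the tokenised output, as in count_tokens, mirrors the
statement that the per-file token counts were calculated from the tokenised
files.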