Syntactic Parsing - Project

For 5LN713 you have to do a project, equivalent to 2.5 hp, or approximately 1.7 weeks of full-time work.

The project can be done either individually or in pairs. If you wish to do the project in a pair, it is your responsibility to find someone to work with. The scope of a pair project has to be somewhat larger than for an individual project, but I don't expect it to be double the size, since you both need to be involved in all parts of the project.

The project has to be related to parsing, but you can choose the topic quite freely. You may work on either phrase structure parsing or dependency parsing. The goal of the project is for you to be able to design, evaluate, and/or analyse syntactic parsers. It should thus contain a practical component, such as implementing an algorithm or evaluating and analysing the performance of existing parsers. Each project should also involve reading at least one research article and relating it to your work in your report.

Here are some tentative ideas for the project:

  • Cross-lingual dependency parsing with UUparser and UD treebanks. This project would be an extension of Assignment 3 that goes beyond the assignment. In such a project, you need to come up with a question of interest (which may be inspired by the VG tasks you have not already attempted), form a hypothesis, design and run some experiments, and analyse the results. In addition, we expect your report to describe how UUparser works on a high level, based on reading articles about UUparser (see the UUparser github page).
  • Implement a parser/parser component. You may choose any algorithm, but a suitable option is Earley's algorithm:
    • Earley's algorithm. Usually, a good scope is to implement it as a recognizer, and then discuss in the report how it could be extended to a probabilistic parser. You also need to read up more on Earley's algorithm, see e.g. Earley, 1970, and Stolcke, 1995. (These articles can be quite challenging, and I do not expect you to understand all the details. You're also free to find other relevant articles.)
  • Evaluation project. Evaluate one or more parsers. Here you have to decide which language(s) you are interested in, which parser(s) you want to evaluate, what type of text domain(s), and what type of evaluation you will perform. You will need to read up on the parser(s) you use, and possibly also on the treebank(s).
  • Treebank transformations. Investigate the effect of different types of treebank transformations on parsing. This article can be a good starting point if you are interested in phrase structure parsing: Mark Johnson. PCFG Models of Linguistic Tree Representations. Computational Linguistics 24(4). Pages 613-632. If you're interested in dependency parsing, here is an article which could provide some inspiration: Miryam de Lhoneux, Joakim Nivre, 2016, Should Have, Would Have, Could Have. Investigating Verb Group Representations for Parsing with Universal Dependencies.
  • Feature engineering for dependency parsing. Investigate the effect of different types of features and transition systems used for learning the best transitions in an "old school" dependency parser, for example MaltParser.
  • Your own proposal
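
For an evaluation project on dependency parsers, a common starting point is the unlabeled and labeled attachment scores (UAS and LAS). The sketch below is only an illustration under stated assumptions: the function name attachment_scores and the (head, label) representation of a token are made up for this example, and it ignores details such as punctuation handling and multiword tokens that established evaluation scripts take care of.

```python
from typing import List, Tuple

# Hypothetical representation: one (head index, dependency label) pair per token,
# where head index 0 stands for the artificial root.
Arc = Tuple[int, str]


def attachment_scores(gold: List[List[Arc]], pred: List[List[Arc]]) -> Tuple[float, float]:
    """Return (UAS, LAS) as fractions over all tokens in the corpus."""
    if len(gold) != len(pred):
        raise ValueError("gold and predicted corpora differ in number of sentences")
    total = correct_heads = correct_arcs = 0
    for gold_sent, pred_sent in zip(gold, pred):
        if len(gold_sent) != len(pred_sent):
            raise ValueError("gold and predicted sentences differ in length")
        for (g_head, g_label), (p_head, p_label) in zip(gold_sent, pred_sent):
            total += 1
            if g_head == p_head:
                correct_heads += 1          # counts towards UAS
                if g_label == p_label:
                    correct_arcs += 1       # head and label correct: counts towards LAS
    if total == 0:
        raise ValueError("no tokens to score")
    return correct_heads / total, correct_arcs / total


# Toy example: one three-token sentence where the last label is wrong.
gold = [[(2, "nsubj"), (0, "root"), (2, "obj")]]
pred = [[(2, "nsubj"), (0, "root"), (2, "nmod")]]
uas, las = attachment_scores(gold, pred)
print(f"UAS = {uas:.2f}, LAS = {las:.2f}")  # prints "UAS = 1.00, LAS = 0.67"
```

Scores are computed over all tokens in the corpus (micro-average) rather than averaged per sentence, so long sentences weigh more; whichever convention you choose, state it in your report.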

Project proposal and groups

Before starting the project you need to decide whether you will work alone or in a pair with a peer. Sign up for a group in Studium, either individually or as a pair. This is needed in order to hand in your proposal and report. Please do not sign up with another student unless you have already decided that you want to work together.

You will first write a project proposal of around half an A4 page, where you describe what you intend to do in your project. The deadline for the project proposal is February 26. The main purpose of the proposal is to get you started on your project, and to give Sara a chance to assess the feasibility of your proposal.

Reporting

The project should be reported in a final report (pdf) describing what you have done in your project and relating it to the parsing literature. If your project includes implementation, you should also hand in your code. The length and content of the report will vary depending on the specific project type.

You should also discuss your work at a seminar on March 25. No formal presentation with slides is required, but be prepared to describe what you have done in your project in smaller groups. If you work in a pair, each of you is expected to be able to discuss the project individually, and you and your peer will be assigned to different small groups.

Deadlines

  • Project proposal: February 26
  • Seminar with discussion/informal presentations of the projects: March 25
  • Written project report: March 22

The proposal and report should be handed in through Studium.

If you have any questions about any aspect of the project, please contact Sara.