UPPSALA UNIVERSITET : Inst. f. lingvistik och filologi : STP


Language Technology: Research and Development

Note that this page has been migrated from a previous server, so there is a risk that not all links work correctly.

Credits: 15 hp
Syllabus: 5LN714
Staff
Course coordinator and examiner: Sara Stymne
Teachers: Sara Stymne, Beáta Megyesi, Paola Merlo
Assistant: Samuel Douglas
Alumni guests: Allison Adams, Luise Dürlich, Elena Fano

News

General information

This page contains the general information for the course Language Technology: Research and Development, autumn 2021. Course information is available here, and slides will be posted in the annotated schedule available here. We will also use the Studium system for handing in assignments, keeping track of your progress, posting Zoom links, and similar information.

Schedule

Preliminary schedule.
Session | Date | Time | Room | Content | Reading
L1 | 31/8 | 14-16 | Zoom (16-0042) | Introduction, Digging, Beyond, Xling | -
L2 | 3/9 | 10-12 | Zoom (12:229, Blåsenhus) | Science, research, and development | Okasha, Cunningham, Lee
L3 | 7/9 | 14-16 | 16-0042 (limited access by Zoom) | Science, research, and development 2: debate session | Okasha, Hovy and Spruit
S1 | 10/9 | 10-12 | 9-1016 (xling), Zoom (btb) | Seminar - research papers | -
S1 | 14/9 | 10-12 | 7-0017 (dig) | Seminar - research papers | -
S2 | 20/9 | 10-12 | 7-1020 (dig), 7-1013 (xling), Zoom (btb) | Seminar - research papers | -
L4 | 23/9 | 10-12 | 2-0076 | R&D projects - from proposal to implementation | Zobel 10-11, 13
S3 | 27/9 | 10-12 | 7-1013 (dig), Zoom (btb) | Seminar - research papers | -
S3 | 28/9 | 10-12 | 9-1016 (xling) | Seminar - research papers | -
L5 | 30/9 | 10-12 | 1-0062 | Alumni lecture | Zobel 10-11, 13
S4 | 13/10 | 9-12 | 7-1013 (dig), 6-0022 (xling), Zoom (btb) | Seminar - project proposals | -
S5 | 25/10 | 10-12 | 7-1013 (dig), Zoom (btb) | Seminar - progress report | -
S5 | 26/10 | 13-15 | 2-0028 (xling) | Seminar - progress report | -
L6 | 3/11 | 10-12 | 22-1017 | Dissemination of research results | Zobel 1-9, 14
S6 | 10/11 | 10-12 | 2-0026 (dig), Zoom (xling) | Seminar - progress report, theme: ethics | Hovy and Spruit; Bender et al.
S6 | 10/11 | 16-18 | Zoom (btb) | Seminar - progress report, theme: ethics | Hovy and Spruit; Bender et al.
Lab | 23/11 | 14-16 | Chomsky+Turing | LaTeX tutorial | -
S8 | 24/11 | 10-12 | 2-0027 (dig), 2-0026 (xling) | Seminar - progress report | -
S8 | 24/11 | 16-18 | Zoom (btb) | Seminar - progress report | -
L7 | 1/12 | 10-12 | 16-2043 | Review of scientific articles | Zobel 14
S9 | 8/12 | 10-12 | Blåsenhus 21:237 (dig) | Seminar - progress report | -
S9 | 8/12 | 16-18 | Zoom (btb) | Seminar - progress report | -
S9 | 9/12 | 10-12 | 9-1017 (xling) | Seminar - progress report | -
FS | 13/1 | 8-16 | 6-1023 (Geigersalen), 7-0043 | Final workshop - term paper presentations | -
- | 13/1 | 16- | TBA | Social event | -

All lectures will be given by Sara. The seminars will be led by the seminar leader for each research group. Note that attendance is obligatory at all seminars.

Teaching mode, Covid-related information

The aim is that the course will mainly be campus-based, if the situation permits. This may change on very short notice, though. We aim for the majority of lectures to be held on Campus. If there is a need, due to special circumstances, we will also use Zoom during lectures. The seminar group led by Paola will mostly meet on Zoom. The other seminar groups will mainly meet on Campus when possible. We will avoid hybrid seminars, since that has not worked well in the past, which means that some other seminars may also be held online.

For all seminars given remotely, we require that students have the camera turned on.

Please respect the current regulations and stay home if you are not feeling well, and maintain social distancing! Note that this also applies to teachers, so any Campus activities may be moved entirely online on short notice. It may also be the case that regulations change on short notice. Please always check Studium and your email before going to Campus!

This information will be continually updated throughout the term.

Content

The course gives a theoretical and practical introduction to research and development in language technology. The theoretical part covers basic philosophy of science, research methods in language technology, project planning, and writing and reviewing of scientific papers. The practical part consists of a small project within a research area common to a subgroup of course participants, including a state-of-the-art survey in a reading group, the planning and implementation of a research task, and the writing of a paper according to the standards for scientific publications in language technology. The research areas, with teachers, for 2021 are:
  1. Digging the past: Digital Philology and the Analysis of Historical Sources (dig) - Beáta Megyesi
  2. Beyond the benchmarks: Linguistically-oriented analysis and generalisations in Neural Networks (btb) - Paola Merlo
  3. Cross-lingual natural language processing (xling) - Sara Stymne

Examination

The course is examined by means of five assignments with different weights (see below). In order to pass the course, a student must pass each one of these. In order to pass the course with distinction, a student must pass at least 50% of the weighted graded assignments with distinction.
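
As a worked illustration of the distinction rule, the sketch below assumes that the 50% threshold is computed over the graded assignments only, that is, the four assignments other than the ungraded paper presentation, using the weights listed under Assignments (15, 15, 15 and 40, which sum to 85). This reading of the rule is an assumption, and the function name `earns_distinction` is hypothetical.

```shell
#!/bin/sh
# Sketch of the distinction rule under the assumption stated above.
# Arguments: four flags (1 = passed with distinction, 0 = passed only), in the
# order: take home exam, project proposal, review of term papers, term paper.
earns_distinction() {
    # Weighted sum of the assignments passed with distinction
    vg=$(( $1 * 15 + $2 * 15 + $3 * 15 + $4 * 40 ))
    # Total weight of the graded assignments (assumed to exclude assignment 2)
    total=$(( 15 + 15 + 15 + 40 ))
    # Succeeds when at least 50% of the graded weight has distinction
    [ $(( 2 * vg )) -ge "$total" ]
}

# Example: distinction on the review and the term paper only
if earns_distinction 0 0 1 1; then
    echo "distinction"
else
    echo "pass"
fi
```

Under this assumption, distinction on the term paper alone (40 of 85) is not enough, while the term paper plus any one other graded assignment (55 of 85) is.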

Assignments

  1. Take home exam on philosophy of science (15%)
    • This assignment will be based on your reading of Okasha's book. You will be asked to discuss issues in the philosophy of science and (sometimes) relate them to the area of language technology. The questions will be handed out September 8, and the report should be handed in September 16.
  2. Research paper presentation and discussion (15%)
    • You will present one of the papers discussed in the seminars. The task is to introduce the paper and lead the discussion, not to give a formal presentation: briefly summarize the paper (~2 min), discuss its main points, bring up parts that are difficult to understand, and initiate a discussion by proposing themes. In addition, you shall take an active part in the discussion of all other papers discussed in the seminars. Attendance at the seminars is obligatory; if you miss a seminar, you have to write a short report instead. This assignment is not graded and does not qualify for distinction.
  3. Project proposal (15%)
    • You will put together a research proposal consisting of two parts, using an adapted version of the Swedish Research Council's guidelines for research plans. The major part is a 3-page scientific proposal describing the project you are going to work on for the rest of the course. In addition, you should write a short popular science abstract of at most 2000 characters that describes your proposal in a way that is accessible to the general public. You will also give a short presentation of the proposal in a seminar (8 minutes with slides, plus time for questions and discussion). The deadline for the written proposal is October 8, and the seminars will take place October 13.
  4. Review of term papers (15%)
    • You will review two term papers written by your course mates. You will use a set of guidelines which will be specified later. You will receive the papers on December 14 and the reviews are due December 22.
  5. Term paper (40%)
    • You will report your project in a paper following the guidelines of Transactions of the Association for Computational Linguistics (except that the page limit for your papers is 4-7 pages + references). The deadline is December 13 for the first version and January 14 for the revised version. On January 13, you will also give an oral presentation of the paper. As part of your work on the project, it is obligatory to attend the progress report seminars as well as the final workshop. If you miss a seminar, you have to write a short report instead.

Note that you will also practice writing and presenting for different audiences during the course. The final paper and presentation are targeted at experts in language technology (but not necessarily experts in your particular research theme). Your scientific proposal and presentation are targeted at academics, not necessarily in language technology but potentially also in neighboring fields such as linguistics or computer science (who typically make up reviewing boards at agencies such as the Swedish Research Council). Your popular science abstract is targeted at the general public, and should not require any prior knowledge of language technology to understand.

All assignments you hand in during the course are individual. We welcome discussions between members of each theme (and between the themes as well), but the final reports you hand in should be your own work, written in your own words. You should use community standards for citing work that is related to your work. Note that you should always write about other work in your own words; changing a few words in each sentence from a paper you read is not acceptable. You should also remember to give credit for images (and only reproduce images if they are published under a permissive license like Creative Commons) and for code. If you use or build on code by someone else, you should clearly state that in your reports.

Submitting and Reviewing Term papers

We will use EasyChair for submission and review of papers. Detailed instructions have been provided by email, and on lecture slides.

Final Seminar/Workshop

The final seminar will be organized as a workshop with term paper presentations. The plan is for the final workshop to be on Campus only (given that restrictions allow). In case you are not able to attend on Campus (e.g. due to medical conditions, travel bans, etc.), let Sara know beforehand so that Zoom talks can be arranged. The time slot for each paper is 15 minutes, to be divided into 12 minutes of presentation and 3 minutes of discussion. The session chairs will enforce the times strictly.

Research Groups

Your first task in the course is to make a wish for which research topic to work on. Send a ranked list of your preferences for the three topics by email to Sara, no later than Friday, September 3, at 13:00. Please also specify if you prefer to have the seminars online or on campus (or if you are fine with either option). It is especially important that you let us know if you have an approved reason for following the teaching online (e.g. medical reasons or travel restrictions stopping you from travelling to Uppsala). You may also indicate if your preference for your first choice is a very strong preference. If you fail to make a wish by this deadline, you will be arbitrarily assigned to a topic. We will try our best to respect everyone's wishes, but if it turns out not to be possible, we will resort to random decisions. This applies to both topic choice and campus/online preference (unless you have an approved online reason).

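Group | Member | Paper
Digging the past - Bea | Emma | Sep 14: Piotrowski, Chap 3, 2012
Digging the past - Bea | Jae Eun | Sep 14: Piotrowski, Chap 6, 2012
Digging the past - Bea | Flor | Sep 14: Van Strien et al., 2020
Digging the past - Bea | Claire | Sep 20: Piotrowski, Chap 7, 2012
Digging the past - Bea | Ziming | Sep 20: Hedderich et al., 2021
Digging the past - Bea | Martina | Sep 20: Hedderich et al., 2021
Digging the past - Bea | Laura | Sep 20: Bollman, 2014
Digging the past - Bea | Chen | Sep 27: Hammond et al., 2013
Digging the past - Bea | Kai-Yan | Sep 27: Hovy and Lavid, 2010
Digging the past - Bea | Nikolina | Sep 27: Peng et al. 2021
Beyond the benchmarks - Paola | Paloma | Sep 10: Baroni, 2021
Beyond the benchmarks - Paola | Eirini | Sep 10: Linzen and Baroni, 2020
Beyond the benchmarks - Paola | Klaudia | Sep 10: Linzen, 2021
Beyond the benchmarks - Paola | Jiayi | Sep 20: Futrell et al, 2019
Beyond the benchmarks - Paola | Viktorija | Sep 20: Kann et al. 2019
Beyond the benchmarks - Paola | Lingqing | Sep 20: Gulordava et al., 2018
Beyond the benchmarks - Paola | Eva | Sep 27: Rodriguez and Merlo, 2020
Beyond the benchmarks - Paola | Yongchao | Sep 27: Thrush et al., 2020
Beyond the benchmarks - Paola | Chuchu | Sep 27: Thrush et al., 2020
Beyond the benchmarks - Paola | Justyna | Sep 27: Wilcox et al., 2018
Cross-lingual NLP - Sara | Kätriin | Sep 10: Yarowsky et al. 2001
Cross-lingual NLP - Sara | Marek | Sep 10: Artetxe et al. 2020
Cross-lingual NLP - Sara | James | Sep 10: Wu and Dredze, 2019
Cross-lingual NLP - Sara | Rafal | Sep 20: Smith et al. 2018
Cross-lingual NLP - Sara | Kris | Sep 20: Kondratyuk and Straka, 2019
Cross-lingual NLP - Sara | Yifan | Sep 20: Üstün et al. 2020
Cross-lingual NLP - Sara | Zhe | Sep 28: Turc et al., 2021
Cross-lingual NLP - Sara | Oreen | Sep 28: Chaudhary et al. 2019
Cross-lingual NLP - Sara | Angeliki | Sep 28: Lin et al. 2019
Cross-lingual NLP - Sara | Siyi | Sep 28: Lin et al. 2019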

Computational resources

Those who need a cluster for their computational needs will get access to the Snowy cluster at UPPMAX.

In order to use the UPPMAX cluster, you will first have to apply for an account. You should then apply to two projects:

Information about using Snowy is available here. Note that you log in to Rackham. You can only run light jobs directly on Rackham (like copying files, looking at files, etc.). In order to run heavy jobs, you need to write a Slurm script and execute it on Snowy. See the UPPMAX SLURM user guide to learn more about it. Here is an example Slurm script, from last year's MT course.
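
To illustrate the workflow described above, a minimal Slurm script might look like the sketch below. It is not the script from the MT course. The project ID `your_project_id`, the job name, the time limit, the `core` partition, and the `run_experiment` function are all placeholders or assumptions, so check the UPPMAX SLURM user guide for the values that apply to your project.

```shell
#!/bin/bash
# Project to charge the job to (hypothetical placeholder; use your own project ID)
#SBATCH -A your_project_id
# Run on the Snowy cluster rather than Rackham
#SBATCH -M snowy
# Request one core on the core partition (partition name is an assumption)
#SBATCH -p core
#SBATCH -n 1
# Wall-clock time limit (hh:mm:ss) and job name, both chosen for illustration
#SBATCH -t 00:10:00
#SBATCH -J example_job

# SLURM_JOB_ID is set by Slurm; fall back to "local" when run outside Slurm
job_id="${SLURM_JOB_ID:-local}"
echo "Starting job ${job_id} on $(hostname)"

# Placeholder for the heavy computation (hypothetical); replace the body
# with your own command, for example a training script
run_experiment() {
    echo "experiment finished"
}

run_experiment
```

Assuming the script is saved as `job.sh` on Rackham, it would typically be submitted with `sbatch job.sh`, and the queue on Snowy inspected with `squeue -M snowy -u $USER`.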

Deadlines

Here is a summary of all deadlines in the course.

Task | Deadline | Extra deadline
Choose your preferred topics | September 3, 13:00 | -
Take home exam | September 16 | November 12
Project proposal | October 8 | November 5
Present project proposal | October 13 | By agreement
First version of project report | December 13 | January 14
Reviews of peers' project papers | December 22 | February 18
Final seminar | January 13 | By agreement
Final project report | January 14 | February 18

All deadlines are at 23.59 on the respective date unless otherwise noted.

Note that it is important for you to finish the course on time, since it is a requirement for starting your master's thesis. So try to avoid resorting to the extra deadlines, since doing so will likely mean you cannot finish the course on time!

Reading

Science and Research

Digging the past

Cross-Lingual NLP

Beyond the benchmarks