UPPSALA UNIVERSITET : Inst. f. lingvistik och filologi : STP

Language Technology: Research and Development

Note that this page has been migrated from a previous server, so there is a risk that not all links work correctly.

Credits: 15 hp
Syllabus: 5LN714
Teachers: Sara Stymne, Ali Basirat, Daniel Dakota
Course coordinator and examiner: Sara Stymne

News

Schedule

Session | Date | Time | Room | Content | Reading
L1 | 1/9 | 12-14 | Universitetshuset IX and/or Zoom | Introduction, Word embeddings, Sentiment, Cross-lingual | -
L2 | 3/9 | 10-12 | Universitetshuset X and Zoom | Science, research, and development | Okasha, Cunningham, Lee
L3 | 8/9 | 10-12 | Zoom (Note: online only!) | Science, research, and development 2: debate session | Okasha
S1 | 14/9 | 13-15 | - | Seminar - research papers | -
S2 | 21/9 | 10-12 | - | Seminar - research papers (wemb, sent) | -
S2 | 22/9 | 10-12 | - | Seminar - research papers (xling) | -
L4 | 23/9 | 10-12 | Blåsenhus 10:K102 and/or Zoom | R&D projects - from proposal to implementation | Zobel 10-11, 13
S3 | 29/9 | 13-15 | - | Seminar - research papers | -
S4 | 7/10 | 9-12 | - | Seminar - project proposals | -
S5 | 19/10 | 10-12 | - | Seminar - progress report (sent, wemb) | -
S5 | 20/10 | 10-12 | - | Seminar - progress report (xling) | -
S6 | 4/11 | 10-12 | - | Seminar - progress report, theme: ethics | Hovy and Spruit, 2016
L5 | 12/11 | 12-14 | Zoom | Dissemination of research results | Zobel 1-9, 14
S7 | 18/11 | 10-12 | - | Seminar - progress report | -
L6 | 25/11 | 10-12 | Zoom | Review of scientific articles | Zobel 14
S8 | 3/12 | 10-12 | - | Seminar - progress report | -
FS | 13/1 | 8-16 | Online | Final workshop - term paper presentations | -
- | 13/1 | 15.45- | gather.town | Social event | -

All lectures will be given by Sara. The seminars will be led by the seminar leader for each research group. Note that attendance is obligatory at all seminars. Rooms for seminars will be given once it is decided if they are Campus or Zoom based. Decisions for campus and/or Zoom lectures will be announced in good time before each session.

Teaching mode, Covid-related information

Due to the Covid situation, the course will be run at least partially in an online format. We hope to be able to switch more to campus activities towards the second half of the term. Lectures will be held either online or in a mixed mode, with the possibility to attend either online or on campus. Some lectures might be replaced with recorded lectures, possibly combined with a campus activity. Seminars in smaller groups will be either online or campus-based depending on the preferences of the students in the group. We will try to take your preferences into account when forming the groups.

For all seminars given remotely, we require that students have the camera turned on.

Please respect the current regulations and stay home if you are not feeling well, and maintain social distancing! Note that this also applies to teachers, so any Campus activities may be moved entirely online on short notice. Please always check your email before going to Campus! Note also that while we have booked large classrooms, there is a small risk that a classroom becomes full. In such an unlikely case, we will let students into the classroom on a first come, first served basis, and those arriving when the classroom is full can follow the activity on Zoom instead.

This information will be continually updated throughout the term.

Content

The course gives a theoretical and practical introduction to research and development in language technology. The theoretical part covers basic philosophy of science, research methods in language technology, project planning, and writing and reviewing of scientific papers. The practical part consists of a small project within a research area common to a subgroup of course participants, including a state-of-the-art survey in a reading group, the planning and implementation of a research task, and the writing of a paper according to the standards for scientific publications in language technology. The research areas, with teachers, for 2020 are:
  1. Cross-lingual NLP (xling) - Sara Stymne
  2. Word embeddings (wemb) - Ali Basirat
  3. Sentiment Classification Tasks (sent) - Daniel Dakota

Research Groups

The groups and articles for the research seminars are listed below. Names in parentheses are teachers.

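Groups | Members | Papers
Cross-lingual NLP | Bjarki | Sep 14: Yarowsky et al. 2001
Cross-lingual NLP | Harm | Sep 14: Tiedemann 2015
Cross-lingual NLP | Huiling | Sep 14: Smith et al. 2018
Cross-lingual NLP | (Sara) | Sep 21: Artetxe et al. 2020
Cross-lingual NLP | Yifei | Sep 21: Plank & Agić 2018
Cross-lingual NLP | Gustav | Sep 21: Glavaš et al. 2019
Cross-lingual NLP | Xingran | Sep 29: Artetxe and Schwenk, 2020
Cross-lingual NLP | (Sara) | Sep 29: Zoph et al. 2016
Cross-lingual NLP | Antonia | Sep 29: Lin et al. 2019
Word embeddings | Ziyang | Sep 14: Mikolov et al. 2013
Word embeddings | Po-Chun | Sep 14: Pennington et al. 2014
Word embeddings | Meichun | Sep 14: Bojanowski et al., 2017
Word embeddings | Ahmed | Sep 21: Luke and Andrew, 2015
Word embeddings | Xi | Sep 21: Nguyen et al., 2017
Word embeddings | (Ali) | Sep 21: Brazinskas et al., 2018
Word embeddings | Maria-Elena | Sep 29: Melamud et al., 2016
Word embeddings | Chuchu | Sep 29: McCann et al., 2017
Word embeddings | (Ali) | Sep 29: Peters et al. 2018
Sentiment classification tasks | (Daniel) | Sep 14: Pang et al., 2002
Sentiment classification tasks | Naomi | Sep 14: Kim and Hovy, 2004
Sentiment classification tasks | Yongchao | Sep 14: Wiegand et al., 2019
Sentiment classification tasks | Sebastian | Sep 21: Pak and Paroubek, 2010
Sentiment classification tasks | Sijia | Sep 21: Cortis et al. 2017
Sentiment classification tasks | Melvin | Sep 21: Park et al., 2018 2015
Sentiment classification tasks | Xindi | Sep 29: Toh and Su 2010
Sentiment classification tasks | Giacomo | Sep 29: Attia et al., 2018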

Examination

The course is examined by means of five assignments with different weights (see below). In order to pass the course, a student must pass every one of these. In order to pass the course with distinction, a student must pass at least 50% of the weighted graded assignments with distinction.

Assignments

  1. Take home exam on philosophy of science (15%)
    • This assignment will be based on your reading of Okasha's book. You will be asked to discuss issues in the philosophy of science and (sometimes) relate them to the area of language technology. The questions will be handed out September 9, and the report should be handed in September 17.
  2. Research paper presentation and discussion (15%)
    • You will present one of the papers discussed in the seminars. The task is to introduce the paper and lead the discussion, not to make a formal presentation: briefly summarize the paper (~2 min), discuss the main points being made, bring up parts that are difficult to understand, and initiate a discussion by proposing themes to discuss. In addition, you shall take active part in the discussion of all other papers discussed in the seminars. The seminars have obligatory attendance; if you miss a seminar, you have to write a short report instead. This assignment is not graded and does not qualify for distinction.
  3. Project proposal (15%)
    • You will put together a 3-page proposal describing the project you are going to work on for the rest of the course, using an adapted version of the Swedish Research Council's guidelines for research plans. You will also give a short presentation of the proposal in a seminar (8 minutes with slides, plus time for questions and discussions). Your proposal and presentation should be accessible also to non-experts in language technology, so it is important to balance a general description with enough technical details. The deadline for the written proposal is October 2, and the seminars will take place October 7.
  4. Review of term papers (15%)
    • You will review two term papers written by your course mates. You will use a set of guidelines which will be specified later. You will receive the papers on December 14 and the reviews are due December 21.
  5. Term paper (40%)
    • You will report your project in a paper following the guidelines of Transactions of the Association for Computational Linguistics (except that the page limit for your papers is 4-7 pages + references). The deadline is December 11 for the first version and January 15 for the revised version. On January 13, you will also give an oral presentation of the paper. As part of your work on the project, it is obligatory to attend the progress report seminars. If you miss a seminar, you have to write a short report instead.
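
As a rough illustration of how the weights above combine with the pass and distinction rules stated under Examination, here is a minimal Python sketch. It is not an official grading tool. The function and key names are made up for this example, and it assumes that the 50% threshold is measured against the total weight of the graded assignments only (85%, since the paper presentation is not graded); the page does not spell out this detail, so check with the examiner if your case is borderline.

```python
# Weights (percent) taken from the assignment list above.
# The dictionary keys are hypothetical names chosen for this sketch.
WEIGHTS = {
    "take_home_exam": 15,
    "paper_presentation": 15,
    "project_proposal": 15,
    "term_paper_reviews": 15,
    "term_paper": 40,
}

# The paper presentation is not graded and does not qualify for distinction.
UNGRADED = {"paper_presentation"}


def course_grade(results):
    """Map per-assignment results ("fail", "pass", "distinction") to a course grade."""
    missing = set(WEIGHTS) - set(results)
    if missing:
        raise ValueError(f"missing results for: {sorted(missing)}")

    # Every assignment must be passed to pass the course.
    if any(results[name] == "fail" for name in WEIGHTS):
        return "fail"

    graded_weight = sum(w for name, w in WEIGHTS.items() if name not in UNGRADED)
    distinction_weight = sum(
        w
        for name, w in WEIGHTS.items()
        if name not in UNGRADED and results[name] == "distinction"
    )

    # ASSUMPTION: "at least 50%" is relative to the graded weight (85),
    # not to the full 100. Integer arithmetic avoids rounding issues.
    if 2 * distinction_weight >= graded_weight:
        return "distinction"
    return "pass"


if __name__ == "__main__":
    results = {name: "pass" for name in WEIGHTS}
    results["term_paper"] = "distinction"
    results["take_home_exam"] = "distinction"
    print(course_grade(results))
```

Under this reading, distinction on the term paper alone (40 of 85) is not enough, while distinction on the term paper plus any one other graded assignment (55 of 85) is. If the threshold is instead meant relative to the full 100%, only the comparison in the last step changes.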

Submitting and Reviewing Term papers

Information to appear!

Final Seminar/Workshop

The final seminar will be organized as a workshop with term paper presentations.

Research Groups

Your first task in the course is to state which research topic you wish to work on. Send a ranked list of your preferences for the three topics by email to Sara, no later than Friday, September 4, at 13.00. Please also specify whether you prefer to have the seminars online or on campus (or if you are fine with either option). If you fail to state your wishes by this deadline, you will be arbitrarily assigned to a topic. We will try our best to respect everyone's wishes, but if it turns out not to be possible, we will resort to random decisions.

Deadlines

Here is a summary of all deadlines in the course.

Task | Deadline | Extra deadline
Choose your preferred topics | September 4, 13:00 | -
Hand in take home exam | September 17 | November 13
Project proposal | October 2 | October 30
Present project proposal | October 7 | By agreement
First version of project report | December 11 | January 15
Reviews of peers' project papers | December 21 | February 19
Final seminar | January 13 | By agreement
Final project report | January 15 | February 19

Note that it is important for you to finish the course on time, since it is a requirement for starting your master's thesis. Try to avoid resorting to the extra deadlines, since using them will likely mean you cannot finish the course on time!

Reading

Science and Research

Cross-Lingual NLP

Word embeddings

Sentiment Classification Tasks