Word-based SMT -- Part 1

This document describes the tasks you need to perform for part 1 of lab 2. This lab is examined in class, on April 12, 9-12. Note that you have to attend the full session and be active in order to pass the lab. If you miss this session, you will instead have to write a report, see the bottom of this page.

The purpose of this lab is to gain insight into how a word-based SMT system works, and into the importance of the language model (LM) and the translation model (TM). You will change the probabilities in these models by hand, to explore what happens. Normally you would not do this, but would instead train the probabilities on a corpus. The exercise is thus a bit artificial, but it is intended to give you a better idea of how these models work.

The lab is set up to do translation both from English to Swedish and from Swedish to English. Try out both translation directions in the beginning! Later in the lab you may choose to focus on one translation direction. Focus on translation into a language you speak well, i.e. if you are Swedish or speak Swedish well, focus on translation into Swedish; otherwise, focus on translation into English. If you do not know Swedish, there is a grammar sketch of Swedish.

Take notes during the lab. During the last hour of the session, each group will get a few minutes to talk about their experiments: what they did, what surprised them, what they learned, etc. I will also ask each group to report their final score(s) with their modified models.

Slides shown during the lab.

1 - Familiarize yourself with the system and run with uniform probabilities

The translation system is described here. Listen to the brief description of it by your teacher and/or read the description of it. You will probably have to go back to the description during the lab as well!

Copy all the files needed for the lab:


mkdir lab2  
cd lab2
cp /local/kurs/mt/lab2/data/* .

The model files are given twice, so that you can keep a copy of the original files when you start modifying them.

In the given model files all probabilities are equal. This likely gives bad translations. Run the sample sentences through the translation system, study the translation suggestions, and use the automatic evaluation to explore the overall results and find out the average rank. Feel free to add some more sentences if you want to explore something you find interesting. The commands for running the decoder are:


# for translation from Swedish to English

# show the translation results:
/local/kurs/mt/lab2/simple_decoder/translate -lm lm.eng -tmw tmw.sweeng -tmf tmf.swe -o 2  -in test_meningar.swe 
# show the ranking of results:
/local/kurs/mt/lab2/simple_decoder/translate -lm lm.eng -tmw tmw.sweeng -tmf tmf.swe -o 2  -in test_meningar.swe -eval test_meningar.eng 


# for translation from English to Swedish

# show the translation results:
/local/kurs/mt/lab2/simple_decoder/translate -lm lm.swe -tmw tmw.engswe -tmf tmf.eng -o 2  -in test_meningar.eng 
# show the ranking of results:
/local/kurs/mt/lab2/simple_decoder/translate -lm lm.swe -tmw tmw.engswe -tmf tmf.eng -o 2  -in test_meningar.eng -eval test_meningar.swe

The following questions are some things worth thinking about and discussing with your lab partner:

2 - Manipulate the translation models

In this task you should adjust the probabilities for fertilities and word translations, to achieve better translations. You should not set any probabilities to 0, and they should all be real probabilities, i.e. 0 < p ≤ 1. In some cases you may improve some translations at the cost of others. There are also problems you cannot solve by manipulating the translation model alone. In the end, try to choose changes that make linguistic sense!

Start by changing one or a few things, and investigate what effect that has on the results. You may compare changes that seem reasonable given your linguistic intuition with seemingly "stupid" changes. An example of a linguistically motivated change is to give a higher probability to the translation of "en" and "ett" into "a" than into "an", with the motivation that "a" is much more common in English than "an". Try modifying both the word translation model (tmw) and the fertility model (tmf).

Here are some questions worth thinking about:

(For the TM to be a proper probability model it should contain proper probability distributions, i.e. the probabilities should sum to 1 for each word: the fertilities for each word should sum to 1, and the word translation probabilities p(s|t) should sum to 1 for each t. The given models are correct in this respect, except for rounding (which is OK), e.g. 0.33*3=0.99 rather than 1. You do not have to worry about this issue in the current lab, though.)
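If you want to sanity-check your hand-edited model files against this constraint, a small script along the following lines can verify that the probabilities still sum to (roughly) 1 per conditioning word. This is only a sketch under an assumed format: it treats each line as whitespace-separated fields where the first field is the conditioning word and the last field is a probability — check the actual file format in the system description before relying on it.

```python
from collections import defaultdict

def check_sums(path, tol=0.02):
    """Sum the probabilities per conditioning word and report any word
    whose total deviates from 1 by more than tol (the tolerance allows
    for rounding, e.g. 3 * 0.33 = 0.99)."""
    sums = defaultdict(float)
    with open(path) as f:
        for line in f:
            fields = line.split()
            if len(fields) < 2:
                continue
            word, prob = fields[0], float(fields[-1])
            sums[word] += prob
    # return only the words whose distribution no longer sums to ~1
    return {w: s for w, s in sums.items() if abs(s - 1.0) > tol}
```

Run on, say, your edited tmf.swe, it would list any words whose fertility probabilities you forgot to renormalize after editing.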

3 - Manipulate the language model

Go on to manipulate the language model as well, in order to further improve the translations. In the given files there are 1-grams and 2-grams with equal probabilities. Try to just adjust these first, and see if you can solve some problems. Again, you should not set any probabilities to 0, and they should all be real probabilities, i.e. 0 < p ≤ 1. You may also want to add n-grams that are missing, or remove ungrammatical n-grams. You might also want to add some 3-grams. If you do, remember to change the decoder's n-gram order flag to "-o 3".

An easy way to improve the translations is to add a lot of 3-grams from the test sets. This is not all that meaningful, and would just result in over-fitting. Instead, try to make some principled changes based both on the test sentences, and on your knowledge of English/Swedish. Try to think about what makes the uniform model bad.

Here are some issues worth thinking about.

In addition to changing the probabilities for n-grams in the file, you may also change the backoff weight for unknown n-grams. A very simple backoff strategy is used in the decoder. If, for example, a 3-gram is missing, it backs off to the 2-gram, but with a penalty that can be set on the command line. This penalty is simply multiplied by the 2-gram probability. If the 2-gram is missing too, it backs off to the 1-gram, and multiplies by the penalty yet another time. The backoff penalty is set on the command line with the flag "-b WEIGHT"; the default value is 0.01.
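The decoder's source is not part of the lab, but the backoff strategy just described is simple enough to sketch. The function below is a hypothetical illustration, not the decoder's actual code: it assumes an n-gram table mapping tuples of words to probabilities, and multiplies in the penalty once per backoff step.

```python
def lm_prob(ngram, probs, penalty=0.01):
    """Score an n-gram with simple penalized backoff.

    `probs` maps word tuples to probabilities; `ngram` is a tuple such
    as ("the", "red", "house"). Each time the current n-gram is not in
    the table, drop the oldest history word and multiply in the penalty.
    """
    weight = 1.0
    while ngram:
        if ngram in probs:
            return weight * probs[ngram]
        ngram = ngram[1:]   # back off to the shorter history
        weight *= penalty   # one penalty factor per backoff step
    return 0.0  # not even the unigram is known
```

With the default penalty of 0.01, a missing 3-gram that backs off all the way to a unigram of probability 0.1 scores 0.1 * 0.01 * 0.01, so unknown n-grams are punished heavily; raising or lowering "-b" changes how strongly the decoder prefers n-grams it has actually seen.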

You may also modify the LM and TM in parallel, since changes made in one model will affect which changes are good in the other.

4 - Wrapping up

At the end of the session you will get a new command, which will evaluate your final system on a new set of secret sentences from the same domain. If your changes are very specific to the known test set, this evaluation can be expected to go badly, whereas it should hopefully go well if your changes are general!

Commands for running the blind evaluation are found below! Make sure that you point to your own TM and LM files, that you set the correct n-gram order (normally 2 or 3), and that you set the backoff penalty if you changed it! If you only worked actively on one translation direction, run this only in that direction.


# for translation from Swedish to English
/local/kurs/mt/lab2/simple_decoder/translate -lm lm.eng -tmw tmw.sweeng -tmf tmf.swe -o N  -evalBlind sweeng

# for translation from English to Swedish
/local/kurs/mt/lab2/simple_decoder/translate -lm lm.swe -tmw tmw.engswe -tmf tmf.eng -o N  -evalBlind engswe

Then there will be a joint session where everyone shares their findings and discusses the issues of the lab. For this it is good to keep in mind that the main purpose of this lab is to learn more about how word based SMT models work, and the role of the LM and TM!

Each pair should:

Some general questions to think about:

Reporting

The lab is supposed to be examined in class, on April 12, 9-12. You need to be present and active during the whole session. If you failed to attend the oral session, you instead have to write a lab report. If both persons in the pair failed to attend, it should be included in the report for part 2 of the lab; if only one person failed to attend, it should be a separate individual report. Spend around 2 hours experimenting with the weights in the TM and LM before writing your report. In the report, give the score(s) for your final system(s), compare them to the uniform system, discuss what you did, and state what conclusions you can draw from your work. Also discuss at least a subset of the questions asked in the lab text. Your report should be around 2 A4 pages and should be handed in via the student portal as a PDF. The deadline is April 28, 2017.