mkdir lab3
cd lab3
cp /local/kurs/mt/lab3/data/* .
Also copy your final translation models (tm*.*) from part 1.
/local/kurs/mt/srilm/bin/i686/ngram-count -wbdiscount -text CORPUS -lm LM -order ORDER
where CORPUS is your training corpus, LM is the resulting language model file that you will use as input to the decoder, and ORDER is the maximum LM order you want. The flag -wbdiscount means that the program uses Witten-Bell smoothing, which is suitable for small corpora; you do not have to think further about smoothing in this lab.
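For example, a 3-gram English LM trained on the English side of the parallel corpus could be built as below. The corpus file name corpus.parallel.eng is an assumption here; check what the English parallel file is actually called in the lab data, and pick whatever name you like for the LM file:
/local/kurs/mt/srilm/bin/i686/ngram-count -wbdiscount -text corpus.parallel.eng -lm lm.eng.3 -order 3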
In the resulting LM files the probabilities are given as logprobs. Our decoder uses standard probabilities, but can convert from logprobs if the flag -lp is used. It is thus important that you add this flag in this lab. It is also reasonable to change the backoff weight with "-b 0.01", but you may experiment with this setting if you want to; if you do, write it down in your report. The command you run in this lab will thus need to use the following flags:
/local/kurs/mt/lab2/simple_decoder/translate -lm LM-file -tmw WORD-translation-model -tmf FERTILITY-model -o MAX-NGRAM-ORDER -lp -b 0.01
It might also be convenient to translate all the test sentences in one run with the flag "-in test_meningar.[eng/swe]", if you are not doing that already.
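As a concrete sketch, a full run that translates the Swedish test sentences into English might look like the line below. The LM file name matches the training example above, and the TM file names are only placeholders for whatever your model files from part 1 are called:
/local/kurs/mt/lab2/simple_decoder/translate -lm lm.eng.3 -tmw tm.word -tmf tm.fert -o 3 -lp -b 0.01 -in test_meningar.swe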
There are two sets of corpora: a parallel corpus, called corpus.parallel.*, and a monolingual corpus, called corpus.mono.*, which is slightly larger. Both corpora contain the same type of blocks-world sentences, but in the corpus labelled parallel, the English and Swedish sentences correspond to each other line by line (which would have been necessary if we had trained a translation model, but is not necessary for language model training). Have a brief look at the corpus files to familiarize yourself with them.
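To get a quick impression of the line-by-line correspondence, you can for example print the first few lines of the Swedish and English sides next to each other (again assuming the files are called corpus.parallel.swe and corpus.parallel.eng):
paste corpus.parallel.swe corpus.parallel.eng | head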
Start by keeping the TM probabilities fixed at your final values from part 1. Run the system with different n-gram orders, starting with 1 and increasing the order until you think you can get no further improvements. Discuss what types of problems can be solved with each increase in LM order. Are there some issues that cannot be solved even with high-order n-grams? Do you think they could be solved with a better training corpus? Were the trained 2-gram and 3-gram models better than the weights you set manually in part 1? How would you expect manually set and trained weights to perform if you tested on other sentences from the blocks world than the 10 sentences in the test corpus?
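One convenient way to set up this experiment is to train one LM per order in a small shell loop and then decode once per LM with the matching -o value. A sketch, again assuming the English parallel corpus is called corpus.parallel.eng:
for n in 1 2 3 4 5; do
  /local/kurs/mt/srilm/bin/i686/ngram-count -wbdiscount -text corpus.parallel.eng -lm lm.eng.$n -order $n
done
Then run the decoder with -lm lm.eng.$n and -o $n for each order n.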
Did your TM probabilities work well with the trained LM? If not, can you improve things further by changing the TM probabilities?
Concatenate the monolingual data with the parallel data and retrain the LM on this larger corpus, using the best order from before. Does this make a difference? Do you think that more data is always better?
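For example, assuming the English corpus files are called corpus.parallel.eng and corpus.mono.eng (check the actual names in the lab data), the concatenation and retraining could look like this, with the -order value replaced by your best order:
cat corpus.parallel.eng corpus.mono.eng > corpus.all.eng
/local/kurs/mt/srilm/bin/i686/ngram-count -wbdiscount -text corpus.all.eng -lm lm.all.3 -order 3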
Compare the translations with and without the sentence boundaries, using the best n-gram order from your previous experiment. Does it make a difference? If so, what do the sentence boundaries influence, and why?
Kim ställer ett brandgult block på det gröna fältet | Kim puts an orange block on the green field
hon ställer 2 blåa block på ett fält | she puts 2 blue blocks on a field
For a G grade you should have performed all tasks in the lab, except writing an evaluation script, and discussed them in a good way. For a VG grade you additionally need to have written and used an evaluation program according to the instructions in part 1, and your discussion needs to be of high quality.
Send your report and files via e-mail to Sara Stymne (firstname dot lastname at lingfil dot uu dot se). Deadline for handing in the report: April 30, 2014.