Shared Task: Multilingual Low-Resource Translation for Indo-European Languages (WMT 21)

In this post we look at the systems submitted to the shared task on multilingual low-resource translation for Indo-European languages, part of the EMNLP 2021 Sixth Conference on Machine Translation (WMT 21). The workshop will be held on November 10-11, 2021 in Punta Cana (Dominican Republic) and online. The full evaluation results are available at this link.

The automatic evaluation metrics used are BLEU, TER, chrF, COMET and BERTScore (for TER, lower is better; for the other metrics, higher is better). The final ranking is determined, per language family, by averaging each system's rank on the individual metrics, with ties on individual metrics taken into account. Two baselines were used: the M2M-100 1.2B model and mT5-devFinetuned.
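To make the aggregation concrete, below is a minimal sketch of metric-rank averaging, assuming rank 1 is best, TER is lower-is-better while the other metrics are higher-is-better, and tied systems share the average of the tied ranks. The system names and scores are made up; this is only an illustration of the described procedure, not the organizers' scoring script.

```python
# Sketch of metric-rank aggregation: rank systems per metric, then
# average the ranks; tied scores share the average of the tied ranks.
from scipy.stats import rankdata

# Hypothetical scores: {system: {metric: score}}
scores = {
    "sysA": {"BLEU": 50.1, "TER": 0.40, "chrF": 0.69},
    "sysB": {"BLEU": 49.5, "TER": 0.40, "chrF": 0.69},
    "sysC": {"BLEU": 43.4, "TER": 0.46, "chrF": 0.67},
}
metrics = {"BLEU": "higher", "TER": "lower", "chrF": "higher"}

systems = list(scores)
avg_rank = {s: 0.0 for s in systems}
for metric, direction in metrics.items():
    vals = [scores[s][metric] for s in systems]
    # rankdata assigns rank 1 to the smallest value, so negate
    # higher-is-better metrics; ties get the average rank.
    keyed = [-v if direction == "higher" else v for v in vals]
    ranks = rankdata(keyed, method="average")
    for s, r in zip(systems, ranks):
        avg_rank[s] += r / len(metrics)

# Final ordering: lowest average rank wins.
for s in sorted(systems, key=lambda s: avg_rank[s]):
    print(f"{s}: average rank {avg_rank[s]:.1f}")
```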

Romance Family (Wikipedia)

| System (official ranking order) | Average ranking | BLEU | TER | chrF | COMET | BERTScore |
|---|---|---|---|---|---|---|
| CUNI-Primary | 1.2 ± 0.4 | 50.06 | 0.401 | 0.694 | 0.566 | 0.901 |
| CUNI-Contrastive | 1.6 ± 0.5 | 49.48 | 0.404 | 0.693 | 0.569 | 0.901 |
| Tencent-Contrastive | 3.0 ± 0.0 | 43.45 | 0.460 | 0.670 | 0.444 | 0.894 |
| Tencent-Primary | 3.8 ± 0.4 | 43.31 | 0.462 | 0.668 | 0.442 | 0.894 |
| BSC-Primary (*) | 5.0 ± 0.7 | 41.33 | 0.462 | 0.647 | 0.363 | 0.884 |
| M2M-100 (baseline) | 5.8 ± 0.4 | 40.02 | 0.478 | 0.634 | 0.414 | 0.878 |
| UBCNLP-Primary | 7.2 ± 0.4 | 35.41 | 0.528 | 0.588 | 0.007 | 0.854 |
| mT5-Finetuned (baseline) | 8.0 ± 0.7 | 29.28 | 0.592 | 0.553 | 0.059 | 0.850 |
| UBCNLP-Contrastive | 8.6 ± 0.5 | 28.51 | 0.591 | 0.529 | -0.374 | 0.825 |

North-Germanic Family (Europeana)

| System (official ranking order) | Average ranking | BLEU | TER | chrF | COMET | BERTScore |
|---|---|---|---|---|---|---|
| M2M-100 (baseline) | 1.0 ± 0.0 | 31.45 | 0.54 | 0.55 | 0.399 | 0.862 |
| Edinsaar-Contrastive | 2.2 ± 0.4 | 27.07 | 0.57 | 0.54 | 0.283 | 0.856 |
| Edinsaar-Primary | 2.8 ± 0.4 | 27.54 | 0.58 | 0.52 | 0.276 | 0.849 |
| UBCNLP-Primary | 4.0 ± 0.0 | 24.94 | 0.60 | 0.50 | 0.076 | 0.847 |
| UBCNLP-Contrastive | 5.0 ± 0.0 | 24.02 | 0.61 | 0.49 | -0.068 | 0.837 |
| mT5-devFinetuned (baseline) | 6.0 ± 0.0 | 18.53 | 0.78 | 0.42 | -0.102 | 0.810 |
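For reference, BLEU, chrF and TER scores of the kind reported in these tables can be computed with the sacrebleu library; COMET and BERTScore come from the separate unbabel-comet and bert-score packages and are not shown here. The file paths below are placeholders, and this is only a sketch, not the evaluation script used by the organizers. Note that sacrebleu reports chrF and TER on a 0-100 scale, whereas the tables above use a 0-1 scale.

```python
# Sketch: scoring a system output with sacrebleu (BLEU, chrF, TER).
# File paths are placeholders for the system output and the reference.
import sacrebleu

with open("hypotheses.txt", encoding="utf-8") as f:
    hyps = [line.strip() for line in f]
with open("references.txt", encoding="utf-8") as f:
    refs = [line.strip() for line in f]

bleu = sacrebleu.corpus_bleu(hyps, [refs])   # higher is better
chrf = sacrebleu.corpus_chrf(hyps, [refs])   # higher is better
ter = sacrebleu.corpus_ter(hyps, [refs])     # lower is better

print(f"BLEU {bleu.score:.2f}  chrF {chrf.score:.2f}  TER {ter.score:.2f}")
```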
