2009/04/20

reading group

A New Approach to the Study of Translationese: Machine-learning the Difference between Original and Translated Text

MT text categorization
translationese: the "dialect" of translated text

translation:
less lexically dense
order repeat linguistic features
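
Quick sketch of lexical density, taken here as content words / total words. The FUNCTION_WORDS list is a tiny made-up stand-in (an assumption, not from the paper); a real setup would likely use a POS tagger or a full stop-word list.

```python
# Hypothetical, deliberately tiny function-word list (assumption for illustration only).
FUNCTION_WORDS = {"the", "a", "an", "of", "in", "on", "and", "or", "to", "is", "it", "that"}

def lexical_density(tokens):
    """Share of tokens that are content words (i.e. not in FUNCTION_WORDS)."""
    if not tokens:
        return 0.0
    content = [t for t in tokens if t.lower() not in FUNCTION_WORDS]
    return len(content) / len(tokens)

density = lexical_density("the cat sat on the mat".split())
```

A lower value means a larger share of function words, which is what "less lexically dense" refers to.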


unigram, bigram, trigram -> window of n words.
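
Minimal sketch of the n-word window idea. The helper name ngrams and the whitespace tokenization are my own assumptions, not the paper's pipeline.

```python
def ngrams(tokens, n):
    """Return every contiguous window of n tokens, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = "the cat sat on the mat".split()  # naive whitespace tokenization
unigrams = ngrams(tokens, 1)
bigrams = ngrams(tokens, 2)
trigrams = ngrams(tokens, 3)
```

The same function works on lemmas or POS tags instead of word forms: just pass a different token list.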

lemma -> root of the word

SVM -> has the capacity for feature selection.

majority voting
recall maximization (at least 1 vote).
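
Sketch of the two ways to combine classifier votes, assuming each classifier outputs True for "translated". Function names and the example votes are made up for illustration.

```python
def majority_vote(votes):
    """Label as translated only if more than half of the classifiers say so."""
    return sum(votes) > len(votes) / 2

def recall_maximization(votes):
    """Label as translated if at least one classifier says so."""
    return any(votes)

votes = [True, False, False]  # hypothetical output of three classifiers
by_majority = majority_vote(votes)
by_recall = recall_maximization(votes)
```

With one vote out of three, majority voting rejects the text while recall maximization accepts it, so the second rule catches more translations at the cost of more false alarms.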

pronouns & adverbial forms are the most important.



concept:
comparable corpus: same topic.
human comparison: same performance as machine.
