Abstract
Much research in translation studies indicates that translated texts are ontologically different from original non-translated ones. Translated texts, in any language, can be considered a dialect of that language, known as 'translationese'. Several characteristics of translationese have been proposed as universal in a series of hypotheses. In this work, we test these hypotheses using a computational methodology that is based on supervised machine learning. We define several classifiers that implement various linguistically informed features, and assess the degree to which different sets of features can distinguish between translated and original texts. We demonstrate that some feature sets are indeed good indicators of translationese, thereby corroborating some hypotheses, whereas others perform much worse (sometimes at chance level), indicating that some 'universal' assumptions have to be reconsidered.

In memoriam: Miriam Shlesinger, 1947-2012.
Original language | American English |
---|---|
Pages (from-to) | 98-118 |
Number of pages | 21 |
Journal | Digital Scholarship in the Humanities |
Volume | 30 |
Issue number | 1 |
State | Published - 1 Apr 2015 |
All Science Journal Classification (ASJC) codes
- Information Systems
- Language and Linguistics
- Linguistics and Language
- Computer Science Applications