Monday, 19 August 2019 (B.E. 2562)

Why are Google Translate and other translators so bad at translating Thai? I mean they are beyond bad.

===========================================

Thai is considered a hard language to translate. In the translation business, Thai is a challenge for translation service providers. Successful Thai translation requires an in-depth understanding of Thai culture as well as the language.

Thai script / alphabet

The Thai language has its own script, derived from the Old Khmer script, which itself descends from the Brahmic scripts of India. The Thai script is an abugida rather than a true alphabet: it has 44 basic consonant letters, and a consonant written without a vowel sign carries an inherent vowel. There are no independent vowel letters. By the count given here, there are 18 single vowel symbols, 6 diphthongs / compound vowels and 8 consonant-like vowels, which combine into numerous vowel forms that modify the consonants. A Thai translator has to face this linguistic challenge. To write any other vowel, a professional Thai translator must know that vowel signs are placed before, after, above or below the consonant they belong to, or in some combination of those positions.
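To make the point about consonants and attached marks concrete, here is a small sketch using Python's standard `unicodedata` module. The helper name `describe` is made up for this illustration. It lists the code points in น้ำ ("water"): one consonant, a tone mark stacked on it, and a vowel sign.

```python
import unicodedata

def describe(word):
    """Return (character, Unicode name, general category) for each code point."""
    return [(ch, unicodedata.name(ch), unicodedata.category(ch)) for ch in word]

# น้ำ ("water"), written with escapes so the code point order is explicit:
# U+0E19 consonant, U+0E49 tone mark, U+0E33 vowel sign.
WATER = "\u0e19\u0e49\u0e33"

for ch, name, category in describe(WATER):
    print(f"U+{ord(ch):04X}  {name}  ({category})")
```

What looks like one written unit on screen is three separate code points, which is part of why software has to treat Thai text carefully.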

Thai is a tonal language

Thai words that sound similar can have completely different meanings depending on the tone. Thai has five tones: mid, low, falling, high and rising. Several letters can represent the same consonant sound while belonging to different consonant classes, and the class affects the tone. The tone of a syllable is determined by the combination of the consonant class, the syllable type (live or dead), the tone mark and the vowel length. A good Thai translator needs to understand these rules.
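As a rough sketch of how those four factors combine, the function below encodes the usual textbook tone table. The function name and parameter names are invented for this example, and it expects the caller to already know the consonant class and syllable type; it does not parse Thai text.

```python
def thai_tone(consonant_class, live, long_vowel=True, mark=None):
    """Return the tone name for one syllable (simplified textbook rules).

    consonant_class: "mid", "high" or "low" (class of the initial consonant)
    live: True for a live syllable, False for a dead one
    long_vowel: only matters for dead syllables with a low-class initial
    mark: None, "ek", "tho", "tri" or "chattawa"
    """
    if mark == "ek":
        return "falling" if consonant_class == "low" else "low"
    if mark == "tho":
        return "high" if consonant_class == "low" else "falling"
    if mark == "tri":        # normally written only on mid-class consonants
        return "high"
    if mark == "chattawa":   # normally written only on mid-class consonants
        return "rising"
    # No tone mark: syllable type and vowel length decide.
    if live:
        return "rising" if consonant_class == "high" else "mid"
    if consonant_class == "low":
        return "falling" if long_vowel else "high"
    return "low"

# Same consonant and vowel, different marks, different tones:
print(thai_tone("high", live=True))              # ขา
print(thai_tone("high", live=True, mark="ek"))   # ข่า
print(thai_tone("high", live=True, mark="tho"))  # ข้า
```

The same mark gives a different tone on a low-class consonant, for example มา versus ม้า, which is why a translator cannot read tone off the mark alone.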

===========================================
Thai has a ton of exceptions in grammar, idioms, slang and abbreviations, even in writing. If you use a normal, grammatically correct sentence, it will sound really robotic to Thai people.

You can compare Thai to Japanese, which usually leaves the subject out of the sentence, so Google has to guess the subject by itself. We do that a lot. In addition, one word can have a ton of totally different meanings. To translate Thai, you must know the context in which the person is speaking, except when everything is written out grammatically, but that's rare.


===========================================

There may be other reasons, but usually this is because of a lack of data. Google Translate and similar translation systems function by building models from huge databases of parallel sentences (sentences with the same meaning in two languages). The more sentences available, the better the translation model.

It’s also possible that the languages are difficult to translate for other (e.g., structural) reasons.

===========================================
Google Translate still has a lot of issues with Thai, for sure. Google needs to collect more Thai data and have its translator learn more about the language just to be on par with Microsoft's Bing Translator. The lack of spaces between words is one issue. The way words combine to produce different meanings is another. Google's AI system needs to learn a lot more about Thai.
===========================================
I have noticed this for Thai in particular as well. I think this is at least partly because Thai doesn't use spaces to break up words, so in addition to translating the meaning, the software has to guess where the words are separated. This leads to many possible combinations of words, many of which are nonsensical, but ruling those out would in turn require understanding the meaning.
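Here is a minimal sketch of that guessing problem, assuming a toy four-word lexicon (not a real dictionary) and a hypothetical helper `segmentations`. It lists every way to split the unspaced string ตากลม into known words.

```python
def segmentations(text, lexicon):
    """Return every way to split `text` into words that all appear in `lexicon`."""
    if not text:
        return [[]]
    results = []
    for end in range(1, len(text) + 1):
        prefix = text[:end]
        if prefix in lexicon:
            for rest in segmentations(text[end:], lexicon):
                results.append([prefix] + rest)
    return results

# Toy lexicon with rough glosses (an assumption for illustration only).
LEXICON = {
    "ตา": "eye",
    "กลม": "round",
    "ตาก": "expose to",
    "ลม": "wind",
}

for words in segmentations("ตากลม", LEXICON):
    print(" | ".join(f"{w} ({LEXICON[w]})" for w in words))
```

Both splits are made of real words ("round eyes" versus "exposed to the wind"), so the lexicon alone cannot decide between them; only the surrounding context can.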

I’m also following the question to see if someone else knows any specifics.
===========================================
The Thai language is very complex. Ordinary translators use what is called a “database query” to match words across languages, but this type of translation doesn't take the structure of the sentence into account. For example:

มันไม่จริงหรอกเรื่องแบบนั้น=That story is probably not true.

Notice that มัน and เรื่อง both refer to “that story”.
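A word-by-word lookup on this sentence might look like the sketch below. The glossary, its rough English glosses and the pre-split word list are all assumptions made for illustration; a real system would also have to find the word boundaries first.

```python
# Hypothetical word-level glossary, listed in the order the words occur
# in the example sentence. The glosses are rough approximations.
GLOSSARY = {
    "มัน": "it",
    "ไม่": "not",
    "จริง": "true",
    "หรอก": "(softening particle)",
    "เรื่อง": "story",
    "แบบ": "kind",
    "นั้น": "that",
}

def gloss(words):
    """Look each word up independently, ignoring sentence structure."""
    return " ".join(GLOSSARY.get(w, f"<{w}?>") for w in words)

# Dicts keep insertion order, so this is the sentence, already split into words.
SENTENCE_WORDS = list(GLOSSARY)

print(gloss(SENTENCE_WORDS))
```

The lookup finds every word, yet the output is not an English sentence: nothing tells it that the pronoun at the start and the noun phrase at the end point to the same thing.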

Most translators struggle to translate sentences like this properly because of differences in SVOC (subject, verb, object, complement) structure. Even Google Translate relies on manual community input to help translate complex sentences of this type.
===========================================
The other answers are good; I will add some more.

Thai doesn't have words like “was” or “were”, or other past-tense forms. I just tried putting a sentence with “was” into Google Translate and it immediately got it wrong. One way we mark the past, other than specifying a date, is เคย (koey), which can be translated as “used to” and, with a verb, can work like “-ed” in some cases, depending on the sentence.
One word can have many meanings depending on the sentence, and Google translates word by word.
Many Thai words can be accidentally joined by apps because there is no spacing between words: แม่ (mae, “mother”) and น้ำ (nam, “water”) can be joined into แม่น้ำ (mae nam), meaning “river”. Many other word combinations are just as confusing.
