Posts

Showing posts with the label National German

Language ID Part 3 - more challenges

Stop word detection usually works... In my prior post on this subject, I hope the Jabberwocky poem examples demonstrated that a language label can still be assigned to a text when it contains words identifiable as a language's pronouns, conjunctions, or adpositions. In this context, such identifiers are considered stop words. Their presence was sufficient for us to recognize the language even when the nouns, verbs, adjectives, and adverbs were unidentifiable (nonexistent in our vocabulary). However, it's useful to note that in those examples, some inflections hinted at the nonsensical words having specific qualities. In the case of nouns specifically, spotting them was possible when the inflected languages showed pluralization or possession via -s/'s endings (English, though -s can indicate possession in German also) or a combination of title case capitalization and (when plu
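To make the stop word idea concrete, here is a minimal sketch in Python. It is not the detector our team built: the tiny word lists, the `STOP_WORDS` table, and the `guess_language` function are all assumptions made up for illustration, and a real system would need far larger lists and some handling of ties and short texts. It simply counts how many tokens of a text appear in each language's stop word list and picks the language with the most hits.

```python
import re

# Hypothetical, deliberately tiny stop word lists (pronouns, conjunctions,
# adpositions, articles). Real lists would be much longer.
STOP_WORDS = {
    "en": {"the", "and", "in", "were", "did", "that", "my", "he", "his", "with", "of", "by"},
    "de": {"der", "die", "das", "und", "in", "ich", "nicht", "mit", "er", "sein", "ein", "zu"},
    "fr": {"le", "la", "les", "et", "dans", "je", "ne", "pas", "avec", "il", "son", "un"},
}


def tokenize(text):
    """Lowercase the text and split it into runs of letters."""
    return re.findall(r"[^\W\d_]+", text.lower())


def guess_language(text):
    """Return (best_language_or_None, per-language stop word hit counts)."""
    tokens = tokenize(text)
    scores = {
        lang: sum(1 for token in tokens if token in words)
        for lang, words in STOP_WORDS.items()
    }
    best = max(scores, key=scores.get)
    # With no stop word hits at all, refuse to guess.
    return (best if scores[best] > 0 else None), scores


jabberwocky = (
    "'Twas brillig, and the slithy toves Did gyre and gimble in the wabe; "
    "All mimsy were the borogoves, And the mome raths outgrabe."
)
# Invented German-like nonsense sentence, not a quotation of any translation.
pseudo_german = "Der flummige Wock und die Borgoven waren nicht mit dem Tumtum"

print(guess_language(jabberwocky))
print(guess_language(pseudo_german))
```

Even though words like "brillig" and "toves" match nothing, the surrounding "and", "the", "in", and "were" are enough to tip the count toward English, which is the effect the Jabberwocky examples were meant to show.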

Language ID part 2: Callooh! Callay!

As mentioned in part 1, thinking about identifying language leads one to the fundamental question: "what defines a language?" The example that our team used was from the children's book by Lewis Carroll: a poem that the protagonist reads and, although not comprehending it, thinks "pretty", finding it "clear" that "somebody killed something". Here it is, courtesy of the Wikipedia (English) article "Jabberwocky":

'Twas brillig, and the slithy toves
Did gyre and gimble in the wabe;
All mimsy were the borogoves,
And the mome raths outgrabe.

"Beware the Jabberwock, my son!
The jaws that bite, the claws that catch!
Beware the Jubjub bird, and shun
The frumious Bandersnatch!"

He took his vorpal sword in hand:
Long time the manxome foe he sought--
So rested he by the Tumtum tree,
And stood awhile in thought.

And as in uffish thought he stood,
The Jabberwock, with eyes of flame,
Came whiffling

Learning "Englise" - a fun Friday share

I received the following album link from a friend: the photos consist of pages from a Hangul-English phrasebook. Commented samples from the publication "Living Englise Language Everyday". Aside from the implicit perceptions of "common" phrases that the authors seem to expect to be spoken or heard in English, the most noticeable grammatical mistakes seemed to arise from the unpredictable use of "to be" in place of "to have". This was actually something I noticed when studying French and German, in comparisons such as "j'ai froid" / "I am cold" / "mir ist kalt" (and "j'ai faim" / "I'm hungry" / "ich habe Hunger" or "ich bin hungrig"). In Japanese, at least, the subject is so often omitted that one just says "寒い" ("[I feel] cold") and "お腹がすいた" ("[My] stomach has become empty", to attempt a literal interpretation). This would ex

Another localization pitfall: slang

I wonder how often product names are vetted by native speakers when a company considers marketing something in their region. And if they are, how often slang and rhyming words of dubious character are taken into consideration. Certainly, when my employer purchased an electric hatchback car earlier this year (as a corporate vehicle that can be reserved for client visits and such), I was bewildered by the wave of snickering that accompanied the announcement of its name, and by the ever so slightly aggrieved way the speaker delivered the news. It turns out that the acronym by which it's called closely approximates an Austrian slang word for "stench". In looking at National German, I see there's also another (less similar sounding) slang term, about which I was not told. Product namers, beware!

Localization does not equal straight translation

That's right, folks: localizing text, in particular marketing and promotional copy, is not simply a matter of finding a competent translator who has native fluency in both source and target languages. And I'm sure many of my readers already knew that. So why mention it here? Because I've entered the land of SEO, particularly in the context of a multinational company where most localization starts with a central (and usually English-language) source which is then adopted by a subset of our countries. An English web page that is organically optimized for search engines will not be automatically optimized in its localized version. In other words, having the most effective keywords determined for the source language does not absolve the owner of the localized page from ensuring that someone performs keyword research for that content. To delve further into the best practices of text localization, I've found that it involves a profound knowledge of how one can reali