Most of the points they made concerned concision, but the final point, on actively discouraging machine-translated text, caught my eye. I've posted before about how translation does not equate to localization, so I was rather pleased to imagine that someone was incorporating grammar and spelling checks into the ranking algorithm. I do, however, have a few questions:
- Do they verify that the language attribute declared in the HTML matches the language of the body text that people actually read?
- If the language is a distinct variety, such as English as spoken in India or the Kansai dialect of Japanese, is that taken into account during the linguistic quality assessment?
- Do they penalize slang, profanity or "text-speak" orthography, or will they process them accurately and take that into account when evaluating the tone of the site? The urbandictionary.com site comes to mind here: its main entries and definitions, not to mention its examples, are rife with NSFW terms.
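To make the first question concrete, here is a rough sketch of what such a check could look like. It is not how any search engine actually does it: the `STOPWORDS` table, `guess_language` and `lang_matches` are all hypothetical names invented for illustration, and the stopword counting is a toy stand-in for a properly trained language identifier. It only uses Python's standard `html.parser` module.

```python
from html.parser import HTMLParser

# Hypothetical, deliberately tiny stopword lists (an assumption for this sketch).
# A real checker would use a trained language-identification model instead.
STOPWORDS = {
    "en": {"the", "and", "is", "of", "to", "in", "that", "it"},
    "fr": {"le", "la", "les", "et", "est", "de", "un", "une", "dans"},
    "de": {"der", "die", "das", "und", "ist", "nicht", "ein", "eine"},
}


class LangAndText(HTMLParser):
    """Collects the lang attribute of <html> and the visible words of the page."""

    def __init__(self):
        super().__init__()
        self.declared = None
        self.words = []
        self._skip = 0  # depth inside <script>/<style>, whose text is not prose

    def handle_starttag(self, tag, attrs):
        if tag == "html":
            self.declared = dict(attrs).get("lang")
        elif tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.words.extend(data.lower().split())


def guess_language(words):
    """Return the language whose stopwords occur most often, or None if none occur."""
    cleaned = [w.strip(".,;:!?\"'()") for w in words]
    scores = {lang: sum(w in stop for w in cleaned) for lang, stop in STOPWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def lang_matches(html):
    """Return (declared primary language, guessed language, whether they agree)."""
    parser = LangAndText()
    parser.feed(html)
    parser.close()
    # Keep only the primary subtag, so "en-IN" is compared as "en".
    declared = (parser.declared or "").split("-")[0].lower()
    guessed = guess_language(parser.words)
    return declared, guessed, bool(declared) and declared == guessed


page = ('<html lang="en-IN"><body><p>Le chat est dans la maison '
        'et le chien est dans le jardin.</p></body></html>')
print(lang_matches(page))
```

Note that the sketch throws away the region subtag, so it could flag a page declared as English but written in French, yet it has nothing to say about my second question: it cannot tell Indian English from any other English.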