Posts

Language ID part 2: Callooh! Callay!

As mentioned in part 1, thinking about identifying language leads one to the fundamental question: "what defines a language?" The example that our team used was from the children's book by Lewis Carroll: a poem that the protagonist reads and, although she does not comprehend it, thinks "pretty", finding it "clear" that "somebody killed something". Here it is, courtesy of the Wikipedia (English) article: "Jabberwocky" 'Twas brillig, and the slithy toves Did gyre and gimble in the wabe; All mimsy were the borogoves, And the mome raths outgrabe. "Beware the Jabberwock, my son! The jaws that bite, the claws that catch! Beware the Jubjub bird, and shun The frumious Bandersnatch!" He took his vorpal sword in hand: Long time the manxome foe he sought-- So rested he by the Tumtum tree, And stood awhile in thought. And as in uffish thought he stood, The Jabberwock, with eyes of flame, Came whiffling...

Trunk.ly acquired by Delicious

On November 9, 2011, it was announced that the newish owners of Delicious had acquired trunk.ly (which I'd blogged about before). However, even earlier (in September), my manager had blogged about Delicious' apparent demise, as precipitated by the takeover by AVOS. Trunk.ly has promised to remain functional until the start of next year, but I've found that attempts to use the Delicious import feature are failing (the page times out). The export from trunk.ly worked without any problems. I'm sincerely hoping that Delicious gets its act together, and that it will soon incorporate trunk.ly's ease of use, restore lost tags, and work glitch-free for all the pre-existing (or surviving?) users.

Language ID (textual) - part 1

Image
Word cloud of one person's compilation of English stop words, courtesy of Armand Brahaj (whose site has been infected by malware). Here instead is ranks.nl's list.
Now that half a year has passed since the inception of this blog, some readers may be wondering when I might share more topics related to the "Linguistics" part of "SEO, Linguistics, Localization". In fact, one of the triggers for starting this blog was the issuance of two patents, filed in 2005 and 2006, for which I was a co-inventor and sole inventor, respectively. Both filings concerned language identification (from textual input): the first addressed the challenges of identifying a text's (primary) language, and the second was an application of the first, combined with messaging software. Rather than overwhelm the reader with extensive explanations, I'm going to attempt to create a series of posts that will cover everything in the way...

Sushi Preparation compared to Search Enablement

Image
Courtesy of Kojiro Fish Shop in Wieden, Vienna
Being a fan of various cuisines, I count myself fortunate in having had the opportunity to grow up in Toronto (and having spent time in gastronomical meccas such as Tokyo and New York). As my parents kept my household quite Japanese, I grew up eating what most of my classmates considered to be exotic foods: umeboshi, chirashi zushi, korokke, grilled fish with daikon oroshi and such. Thus, when I was recently asked by a virtual friend - by which I mean someone whose acquaintance I made online, and have not yet spent time with in person, as opposed to an artificial being - to review her classmate's journey of learning to make sushi, I thought I might as well take the opportunity to talk about how, in my view, sushi preparation and enabling search optimization of online content actually have comparable points. Sound strange? Do read on... First, the sushi making (with the disclaimer that I am not a professional chef, nor ...

Where is Dennis Ritchie's day?

It's now a week since the creator of the C programming language, and co-creator of the UNIX operating system, Dennis Ritchie, died after a long illness. I still have the distinct impression that Steve Jobs' charisma and Apple's links to pop culture have generated far more hype than Ritchie's profound contributions to technology. A few days ago I shared the New York Times obituary on Ritchie, which garnered comments from my loyal readers (thank you, Klaus and Mick!). Since then, I've been looking at various media sources to see what more would be said about him. However, I instead see announcements like this (Californian governor declares October 16 Steve Jobs Day), and threads like this (Google has neither created a doodle nor provided a hyperlink to Ritchie, despite doing the latter for Jobs). It seems there must be many more people who share my disappointment and outrage that Ritchie's passing has been eclipsed so effectively by the timing of J...

Time management thoughts, Part 1

I recently admitted to some friends that, ironically (and funnily enough), the topic of time management has been on my mind. The irony is that this post comes more than halfway through October, after the longest gap between posts since the blog was launched in May. Here is a quote from the TV series "Bones", which has a protagonist whose behaviour I can relate to quite well. She's being interviewed by a bubbly morning chat show hostess in the following exchange: Courtesy of IMDB: Stacy Goodyear: I'm Stacie Goodyear and joining me on Wake Up, D.C. is Dr. Temperance Brennan. She is the author of the best-selling mystery novel "Bred in the Bone" and she's also - now tell me if I get this wrong - an anthropologist who works with the F.B.I. to solve crimes? Dr. Temperance 'Bones' Brennan: Yes, that's correct. I use the bones of people who have been murdered, or burned, or blown up, or eaten by animals or insects, or ...

Why I won't link to your blog

Image
Today I received the above comment, unsolicited, and after about two minutes' investigation I moved it into the Spam category. Here's a numbered list explaining why:
1. Although my name is part of the blogspot domain I use, and promote in most places, the message addresses me as "Webmaster", which is possibly today's equivalent of "to whom it may concern". Actually, I have interchangeably experimented with the vanity URL provided to me via my alma mater, such as on Technorati and STC.org.
2. The request is for cross-linking, which already devalues the proposition (as it's a "black hat" practice). If this person truly valued my blog, he would link to it without asking me to link to his.
3. The request uses my domain, implying that it is a "keyword". I've blocked out the destination URL and the keyword he asked for (which, although partially reflecting his website address, was also far too generic to stand a chance at ranking well...