Posts

Pinterested? A(nother) primer

Since joining a few weeks back, I've seen quite a few blog posts and articles (such as this one) crop up about how best to use Pinterest, which I would succinctly describe as a visual social bookmarking service. Still in invitation-only mode (if you'd like an invitation, feel free to contact me for one), it allows users to: Create collections of bookmarks ("boards"). Boards may be assigned a category, which others can then search for and browse through. Boards can be either solely editable by oneself, or contributed to by other users, whom one can specify by name. Boards may be "liked" via Facebook plugin. Add bookmarks represented by either images or videos, either found anywhere online (publicly accessible) or uploaded. At the time of pinning, one can use Facebook and/or Twitter to share out the pin. Comment on any pinned items. "Like" and "re-pin" items. Follow all of or a subset of other users' bo

SOPA, PIPA: aka explaining today's site blackouts

If you haven't read about the Stop Online Piracy Act (SOPA) or the Senate version, PIPA, today's blackouts (of prominent sites including Wikipedia) may have surprised you. Courtesy of the Oatmeal, which is also blacked out today, here's an animated graphic that humorously (and effectively) demonstrates why this legislation should be stopped: For a more serious (but concise) look, here's an infographic about SOPA. Finally, from today, here's Forbes' interview with Rep. Jared Polis (D-CO) about SOPA and why he opposes it.

Thoughts on IFTTT

Thanks to Google+, I first learned about a service called IFTTT ("if this then that").  They provide a very simple interface where the registered user can set up tasks. Each task consists of selecting a channel (such as Craigslist, Delicious, Instagram and many other social utilities), where a trigger event from said channel results in an action on a target channel. For instance, one can set up an email to be sent to one's account when the local forecast calls for snow. Or in my case, I've set up a task that tweets a customized message of thanks when I'm re-tweeted or followed. Possibly the most powerful channel that's available on IFTTT is the "Feed". Any RSS feed URL can be used as a trigger. This means that I can now consider leaving networkedblogs, on which I currently rely to syndicate new blog entry notices to Facebook and Twitter. I'd also like to review all my feed subscriptions, and see what else I'd like to automate. Thinking al

Year-end thoughts, 2011 edition

Over the lifetime I've spent living in various Western countries, I've noticed the predilection of media and individuals alike to focus on retrospection around this time of year: that is, reminiscing about the various events and experiences that one associates with the prior year. In direct contrast to this, it's my understanding that in Japan, it is customary to have 忘年会 which, paired with the 新年会 (which occurs after the 正月三が日 - first three days in January - period), encourages the forgetting of the prior year through much carousing and imbibing. This year was particularly unforgettable to those with ties to Japan, however, and I've seen social media statuses speaking to the importance of remembering the disasters that have befallen my cultural homeland. The fallout - both metaphorical and literal (environmental, economic, political, and emotional) - will be palpable for decades, if not centuries, regardless of any desire the world may have to forget. P

LanguageWare's robust, extensible Language ID (part 4)

Having introduced the "prior art" approaches of identifying textual language in part 1 and part 3 (to wit: stop word presence and n-gram detection), I can now speak to the patented idea which we implemented as part of LanguageWare, which is a set of Java libraries that offer NLP functionality. Simply put, our solution involves a dictionary that is highly compactible (I may ask a guest blogger from my former team to delve into this aspect), which made it possible to store the following information in each entry: the term or n-gram; the language(s) with which it's associated; whether it can occur as a standalone term, at the beginning of a word, the middle of a word, or the end of a word, or some combination of these; and an integer weighting value (per term/language pairing). Thus, for the Chinese Simplified/Traditional and Japanese disambiguation problem, the Japanese-specific kana (listed as unigrams) were given large positive va
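
As a rough illustration of the entry layout described in this excerpt, here is a minimal Python sketch. It is not LanguageWare's code or API (LanguageWare is a set of Java libraries, and its dictionary format is not shown here): the names `Position`, `Entry`, `ENTRIES`, `score` and `best_language`, the whitespace tokenization, the simple summing of weights, and every term and weight value are assumptions made up for the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Flag, auto


class Position(Flag):
    """Where inside a word a term is allowed to match (hypothetical encoding)."""
    STANDALONE = auto()
    START = auto()
    MIDDLE = auto()
    END = auto()


ANY = Position.STANDALONE | Position.START | Position.MIDDLE | Position.END


@dataclass
class Entry:
    term: str            # a whole term or an n-gram
    weights: dict        # language code -> integer weight (per term/language pair)
    positions: Position  # allowed positions, possibly a combination


# Illustrative entries only; the terms and weights are invented for this sketch.
ENTRIES = [
    Entry("the", {"en": 5}, Position.STANDALONE),
    Entry("der", {"de": 5}, Position.STANDALONE),
    Entry("ing", {"en": 2}, Position.END),
    Entry("sch", {"de": 2}, Position.START | Position.MIDDLE),
    Entry("の", {"ja": 10}, ANY),
]


def occurrence_position(word, term, start):
    """Classify one occurrence of term inside word."""
    if len(term) == len(word):
        return Position.STANDALONE
    if start == 0:
        return Position.START
    if start + len(term) == len(word):
        return Position.END
    return Position.MIDDLE


def score(text, entries):
    """Sum the weights of every entry that matches in an allowed position."""
    totals = defaultdict(int)
    for word in text.lower().split():
        for entry in entries:
            start = word.find(entry.term)
            while start != -1:
                if occurrence_position(word, entry.term, start) & entry.positions:
                    for lang, weight in entry.weights.items():
                        totals[lang] += weight
                start = word.find(entry.term, start + 1)
    return dict(totals)


def best_language(text, entries):
    """Return the highest-scoring language, or None if nothing matched."""
    totals = score(text, entries)
    return max(totals, key=totals.get) if totals else None
```

With these made-up entries, `best_language("the slithy toves were gyring", ENTRIES)` yields `"en"` even though most of the words are nonsense, while `score("other", ENTRIES)` is empty because "the" is only permitted as a standalone term.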

Language ID Part 3 - more challenges

Stop word detection usually works... In my prior post about this subject, hopefully the Jabberwocky poem examples demonstrated that when certain types of words occur in text that can be identified as belonging to a language's pronoun/conjunction/adposition parts of speech, a language label can still be assigned to text. Such identifiers are, in this context, considered to be stop words. The presence of such terms was sufficient for us to recognize the language even when nouns, verbs, adjectives and adverbs are unidentifiable (nonexistent in our vocabulary). However, it's useful to note that in those examples, there were some inflections that hinted at the nonsensical words having specific qualities. Specifically, nouns could be spotted when, in the inflected languages, pluralization or possession was shown via -s/'s endings (English, though -s can indicate possession in German also) or combination of title case capitalization and (when plu

Every breath we take (Foursquare et al.)

Approximate location of this blogger, give or take a few hundred metres. After many months of dragging my feet, I joined Foursquare today. For those unfamiliar with it, this geo-social networking service allows a registered user with a smartphone to download an application that makes it possible to easily "check in" to physical places. With tie-ins to Facebook and Twitter, it encourages users to publicly promote the businesses and services they prefer. Businesses, in turn, value these endorsements sufficiently to make offers to those who check in to them. Truth be told, I'm not a particularly suitable user of such services as these. First, I'd rather not have my whereabouts documented online to this level of detail, even though I don't live alone (and thus, am not quite so susceptible to being burgled). Second, I'm an inconspicuous consumer - that is, I try to live frugally, and what I consider to be frivolous purchases mainly take