Posts

Pinterested? A(nother) primer

Since joining a few weeks back, I've seen quite a few blog posts and articles (such as this one) crop up about how best to use Pinterest, which I would succinctly describe as a visual social bookmarking service. Although it's still in invitation-only mode (if you'd like an invitation, feel free to contact me for one), it allows users to: Create collections of bookmarks ("boards"). Boards may be assigned a category, which others can then search for and browse through. Boards can be either editable solely by oneself or open to contributions from other users, whom one specifies by name. Boards may be "liked" via a Facebook plugin. Add bookmarks, represented by either images or videos, which are either found anywhere online (publicly accessible) or uploaded. At the time of pinning, one can use Facebook and/or Twitter to share the pin. Comment on any pinned items. "Like" and "re-pin" items. Follow all of or a subset of other users' bo

SOPA, PIPA: aka explaining today's site blackouts

If you haven't read about the Stop Online Piracy Act (SOPA) or the Senate version, PIPA, today's blackouts (of prominent sites including Wikipedia) may have surprised you. Courtesy of The Oatmeal, which is also blacked out today, here's an animated graphic that humorously (and effectively) demonstrates why this legislation should be stopped. For a more serious (but concise) look, here's an infographic about SOPA. Finally, from today, here's Forbes' interview with Rep. Jared Polis (D-CO) about SOPA and why he opposes it.

Thoughts on IFTTT

Thanks to Google+, I first learned about a service called IFTTT ("if this then that"). It provides a very simple interface where the registered user can set up tasks. Each task pairs two channels (such as Craigslist, Delicious, Instagram and many other social utilities): a trigger event on the source channel results in an action on the target channel. For instance, one can set up an email to be sent to one's account when the local forecast calls for snow. Or in my case, I've set up a task that tweets a customized message of thanks when I'm re-tweeted or followed. Possibly the most powerful channel that's available on IFTTT is the "Feed". Any RSS feed URL can be used as a trigger. This means that I can now consider leaving NetworkedBlogs, on which I currently rely to syndicate new blog entry notices to Facebook and Twitter. I'd also like to review all my feed subscriptions, and see what else I'd like to automate. Thinking al

Year-end thoughts, 2011 edition

Over the lifetime I've spent living in various Western countries, I've noticed the predilection of media and individuals alike for retrospection around this time of year: that is, reminiscing about the various events and experiences that one associates with the prior year. In direct contrast to this, it's my understanding that in Japan, it is customary to have 忘年会 (bōnenkai, "forget-the-year parties") which, paired with the 新年会 (shinnenkai, "New Year parties", which occur after the 正月三が日 - first three days in January - period), encourage the forgetting of the prior year through much carousing and imbibing. This year was particularly unforgettable to those with ties to Japan, however, and I've seen social media statuses speaking to the importance of remembering the disasters that have befallen my cultural homeland. The fallout - both metaphorical and literal (environmental, economic, political, and emotional) - will be palpable for decades, if not centuries, regardless of any desire the world may have to forget. P

LanguageWare's robust, extensible Language ID (part 4)

Having introduced the "prior art" approaches to identifying textual language in part 1 and part 3 (to wit: stop word presence and n-gram detection), I can now speak to the patented idea which we implemented as part of LanguageWare, which is a set of Java libraries that offer NLP functionality. Simply put, our solution involves a dictionary that is highly compactible (I may ask a guest blogger from my former team to delve into this aspect), which made it possible to store several types of information per entry. Each entry consists of the following: the term or n-gram; the language(s) with which it's associated; whether it can occur as a standalone term, at the beginning of a word, in the middle of a word, at the end of a word, or some combination of these; and an integer weighting value (per term/language pairing). Thus, for the Chinese Simplified/Traditional and Japanese disambiguation problem, the Japanese-specific kana (listed as unigrams) were given large positive va
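
To make the entry structure described above a little more concrete, here is a minimal Python sketch of a weighted, position-aware dictionary lookup. This is not LanguageWare's code (which is a Java library with a compacted dictionary); the `Entry` class, the function names, the toy entries and every weight value are my own assumptions, chosen purely for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass

# Position classes an entry may be allowed to match in (hypothetical names).
STANDALONE, START, MIDDLE, END = "standalone", "start", "middle", "end"
ANYWHERE = frozenset({STANDALONE, START, MIDDLE, END})


@dataclass
class Entry:
    term: str             # term or n-gram
    positions: frozenset  # where in a word the term is allowed to match
    weights: dict         # language code -> integer weight


# Toy dictionary; all entries and weights are invented for this sketch.
ENTRIES = [
    Entry("the", frozenset({STANDALONE}), {"en": 5}),
    Entry("ing", frozenset({END}), {"en": 2}),
    Entry("der", frozenset({STANDALONE}), {"de": 5}),
    Entry("sch", frozenset({START, MIDDLE}), {"de": 2}),
    Entry("の", ANYWHERE, {"ja": 10}),  # kana unigram, large positive weight
    Entry("的", ANYWHERE, {"zh": 10}),
]


def positions_in(word, term):
    """Yield the position class of every occurrence of term in word."""
    start = word.find(term)
    while start != -1:
        end = start + len(term)
        if start == 0 and end == len(word):
            yield STANDALONE
        elif start == 0:
            yield START
        elif end == len(word):
            yield END
        else:
            yield MIDDLE
        start = word.find(term, start + 1)


def score(text, entries=ENTRIES):
    """Sum the weights of all matching entries, per language."""
    totals = defaultdict(int)
    for word in text.lower().split():
        for entry in entries:
            for position in positions_in(word, entry.term):
                if position in entry.positions:
                    for language, weight in entry.weights.items():
                        totals[language] += weight
    return dict(totals)


def identify(text, entries=ENTRIES):
    """Return the highest-scoring language, or None if nothing matched."""
    totals = score(text, entries)
    return max(totals, key=totals.get) if totals else None
```

Calling `identify("the cat is sleeping")` picks English because "the" matches as a standalone term and "ing" matches at the end of a word, whereas `score("other")` comes back empty: "the" does occur inside "other", but only in the middle of the word, which that entry does not allow. Splitting on whitespace is a simplification that treats an unsegmented Chinese or Japanese string as a single "word", which is why the kana entry is allowed to match anywhere.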