The other day, while sifting through his thousands of new messages and getting annoyed that everybody seems to be blogging about the exact same thing, my buddy had a pretty nifty idea.
Wouldn’t it be great if your newsreader could somehow read and “understand” the entry you are currently reading and then mark all the other entries related to it as already read, so you don’t have to read ’em twice? With a little bit of NLP, some Yahoo web-servicing and some more black magic that should be possible, right?
I was thinking of doing that as a Flock add-on to their already excellent newsreader (kudos, guys!).
I can think of a couple ways to match two entries:
- Simplest way: Search Google (Blogs, News) with the entry’s title as the search terms. Whichever other entries from your feeds show up within the first N results (say 30, 50 or 100), mark them as read.
- Harder way: Use a document-similarity tool to check the current entry against every other entry (might also be slower, since every entry has to be compared).
- Hardest: Extract keywords from the entry (or just use its title) and feed them to an online MSR (measure of semantic relatedness) like this one: http://cwl-projects.cogsci.rpi.edu/msr/
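
To make the "harder way" a bit more concrete, here is a rough Python sketch of how a similarity check might work. It is only an illustration under my own assumptions: the function names (`tokenize`, `cosine_similarity`, `related_entries`) are made up, the similarity measure is plain bag-of-words cosine similarity rather than any particular tool, and the 0.5 threshold is an arbitrary guess that would need tuning on real feeds.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity of two texts, based on raw word counts."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    # Only words that occur in both texts contribute to the dot product.
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is similar to nothing
    return dot / (norm_a * norm_b)


def related_entries(current, others, threshold=0.5):
    """Return the indices of entries in `others` that look related to `current`.

    `threshold` is a hypothetical cut-off, not a recommended value.
    """
    return [
        i
        for i, other in enumerate(others)
        if cosine_similarity(current, other) >= threshold
    ]
```

For example, `related_entries("Apple releases new iPhone today", ["New iPhone released by Apple today", "My cat learned a new trick"])` returns `[0]`, so only the first entry would get marked as read. A real add-on would probably want something smarter than raw word counts (stop-word removal, stemming, TF-IDF weighting), but the overall shape would likely be similar.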
Anyone up for this?