Persai, a new filtering program that aims to cure the Web's information overload.

Feb. 20 2008 3:48 PM

How To Be a Better Browser

Can a new filtering program cure the Web's information overload?

In a scant four years, the Internet, my beloved wellspring of information, has blown its top and become a geyser. Back in 2004, I heaped praise on an exciting new system called RSS. The "Really Simple Syndication" format promised to be TiVo for Web surfers—by automatically pulling content from all your favorite blogs and news sites, an RSS reader would make your Web surfing more fruitful and more efficient. While that prospect sounded enticing at the time, RSS has turned out to be more of a problem than a solution. As of this moment, I have 897 unread RSS items. I don't need a way to read more of the Net. I need a way to see less of it.

I've got two main beefs with RSS. The first is information overload. If I don't check in every few hours, my RSS reader fills with unread blog posts. Rather than feel relieved that I can catch up on my missed surfing, that long list of bold headlines gives me the sensation that I'm hopelessly behind and won't ever catch up. I've got enough to do at home and at work that I don't need Web surfing to seem like a chore.

The second issue is that my RSS reader is only as smart and attentive as I am. It hasn't figured out that I've stopped reading 14 of the 15 Wired.com feeds I subscribed to when I worked there last year. It can't tell that I only care for about one in 20 of Dave Winer's nonstop posts, and it has no way of guessing which one that will be.

I'm down on RSS at the moment, but I'm not ready to abandon it just yet. That's why I'm excited about Persai, a new service that promises to solve my two big problems. The application, which is now in private beta test, bills itself as a smart filter, a way both to tame and to improve your RSS content.

Persai (pronounced per-SIGH) is a system for reading RSS-fed content, but it doesn't focus on individual feeds. Instead, it throws everything it can find into one big hopper, then asks about what you like so it can dole out suitable articles. You start by creating one or more "interests" based on keywords of your choice—say, "American Idol" or "astrophysics discovery." Once you've punched in your interests, Persai turns each one into a custom page. These pages look a lot like Google News search results—a collection of news articles and blog posts from the past day that match your interest. The matches aren't based on exact keywords, but rather on a more complex word-math algorithm that can figure out that a post about Carly Smithson matches my American Idol interest.

Persai won me over immediately because it's an anti-social network: It ignores everything and everyone on the Internet except for what I want to read. Rather than presume I'm like other people, Persai tracks my unique reading habits and, more importantly, remembers what I don't want. Telling the program I don't like something is as simple as clicking a red X labeled "reject." Persai notes which news articles and blog posts I despise, then filters out others like them by doing more math on what words show up in the articles and where they appear. It presumes that if I click an article to read it—and don't hit the reject button—I like it.
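Persai has not published how its "word math" works, so the following is only a minimal sketch of the general idea: learn from clicks and rejects, then score new articles by the words they contain. It assumes a simple smoothed word-count model in the style of a naive Bayes classifier. The class name `InterestFilter` and its methods are hypothetical, and the sketch ignores where words appear in an article.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class InterestFilter:
    """Hypothetical per-reader filter; an assumption, not Persai's algorithm."""

    def __init__(self):
        self.liked = Counter()     # word counts from articles the reader clicked
        self.rejected = Counter()  # word counts from articles the reader rejected

    def click(self, text):
        """Record an article the reader opened without rejecting."""
        self.liked.update(tokenize(text))

    def reject(self, text):
        """Record an article the reader marked with the red X."""
        self.rejected.update(tokenize(text))

    def score(self, text):
        """Sum smoothed log-odds per word; a positive total leans 'liked'."""
        vocab = len(set(self.liked) | set(self.rejected)) or 1
        liked_total = sum(self.liked.values())
        rejected_total = sum(self.rejected.values())
        total = 0.0
        for word in tokenize(text):
            # Add-one smoothing keeps unseen words from producing log(0).
            p_like = (self.liked[word] + 1) / (liked_total + vocab)
            p_reject = (self.rejected[word] + 1) / (rejected_total + vocab)
            total += math.log(p_like / p_reject)
        return total

    def keep(self, text):
        """Show the article unless its words lean toward past rejects."""
        return self.score(text) >= 0
```

With only two examples to learn from, one clicked headline and one rejected headline, a filter like this already leans toward new headlines that share words with the clicked one. A real system would need far more data and a smarter model to be useful.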

I tested Persai with several different interests. One topic stood out as the ultimate filtering challenge: Barack Obama. As the favorite candidate of college students and Democratic bloggers, the Illinois senator is the subject of a jillion posts a day. Some are fascinating. Most are stupid. I need an Obama Fever filter.

After setting up my "Barack Obama" page, I created a second Persai account with the same exact settings. This was my control unit for the experiment—I did no reading or rejecting on it. After a week of heavy clicking on the first account, I compared the two to see if the filters did their job.

I'm ready to pronounce Persai most of the way there. It figured out quickly that I wanted articles about how well Obama was doing against Hillary Clinton, not puff pieces about an "unstoppable train." It also figured out that I enjoy reading pieces that trash Obama. (I live in San Francisco and get an earful of Obama-mania all day. A guy needs to unwind.) Persai reduced the number of Huffington Post articles, in my mind, from way too many to just a few. Pushed to the top: "Nasty Clinton-Obama Fight Descends to Plagiarism Accusations." Cut from the list: "Yes, Obama Has Substance to Match the Charisma Thing."

Persai took away the feeling of being overwhelmed by hundreds of new headlines every morning. Its best quality is the ability to cull stuff I'm sick of, such as the tediously partisan Hillary Project. It's also good at helping me overcome my biases, finding articles from sites I thought I was sick of until I clicked (that's you, Salon). But the beta version has shortcomings. First, Persai replaces too many articles with not enough. The company claims to index 700,000 news articles and blog posts a day, but I had trouble getting more than two dozen Obama items daily. I can get a lot more by setting up an Obama news alert on Google News.

Second, Persai only sorts articles by when they were published—the newest goes first—rather than choosing a top story based on its relevance to my interests. On Tuesday, Google put the New York Times' brand-new report on the Obama-Clinton plagiarism flap at the top of its screen. Persai left that story off my pages, even on the untouched control account, and led with an article about Obama's popularity in Japan. That's where the program's anti-social nature can backfire—it doesn't care that a huge number of people will read and talk about the Times story. It simply scraped through the article as if it were any other piece or blog post.
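The difference between the two orderings can be sketched in a few lines. This is an illustration only: the `relevance` field, the `half_life_hours` parameter, and the decay formula are assumptions, not anything Persai or Google News has documented.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Article:
    title: str
    published: datetime
    relevance: float  # hypothetical 0..1 match against the reader's interest


def newest_first(articles):
    """Order purely by publication time, as the article describes Persai doing."""
    return sorted(articles, key=lambda a: a.published, reverse=True)


def relevance_first(articles, now, half_life_hours=12.0):
    """Order by relevance, discounted as an article ages (assumed formula)."""
    def blended(a):
        age_hours = (now - a.published).total_seconds() / 3600
        # The score halves every `half_life_hours` hours after publication.
        return a.relevance * 0.5 ** (age_hours / half_life_hours)
    return sorted(articles, key=blended, reverse=True)
```

Under the first ordering, a minor story published an hour ago outranks a major one from three hours ago. Under the second, the major story stays on top until it has aged enough for the discount to outweigh its relevance.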

Finally, Persai still isn't as smart as I am when it comes to choosing what to read. I want to read stories that deal with Britney Spears' mental illness but don't want to read anything about her custody battle. While I can figure out which Britney pieces will appeal to me by scanning headlines, Persai isn't sophisticated enough yet to tell the difference—even if a piece touches on her mental health for one sentence, Persai will grab it for me.

Here's an idea: What if I could run Google News' more structured search results through Persai's things-I-hate filter? That would bring me popular articles that matched a keyword or two while also culling the stuff that Persai knows I'll hate. After a week of clicking "reject," I came to realize something about myself. It's not that there's too much information on the Internet. It's that there are too many painfully bad essays about Barack Obama. Take those away, and I'm happy to pore through what's left.