What can our search queries tell us about ourselves?

Is privacy just a facade? In the world of web searching, the data is in: there is no such thing as confidentiality. Recently, AOL released a list of 20 million search queries collected over a three-month period. The data was released through its AOL Research division as an offering for academic research. According to the New York Times, the release so angered privacy advocates that AOL did an about-face, withdrew the data set, and offered a public apology.

What’s the big deal, you say? Why should we be worried about search results? Well… let’s take a look and see.

AOL was kind enough to remove any blatant personal identifiers from this data set. In their place, it inserted a unique number tied to each individual AOL account. If that makes you say, “Whew, at least there’s nothing personal attached to this data,” you’re mistaken. As the New York Times points out, a little sleuthing is all that’s required to identify some searchers.

While the NY Times article shared a fairly tame user’s search history, other search histories might lead to more troubling “outings” of the users behind them. Consider one example highlighted in an article in Slate:

The searches of AOL user No. 672368, for example, morphed over several weeks from “you’re pregnant he doesn’t want the baby” to “foods to eat when pregnant” to “abortion clinics charlotte nc” to “can christians be forgiven for abortion.”

It quickly becomes evident that our search queries tell a story about our lives. Like our email, our web usage says a lot about our interests, our desires and who we are as people. By sifting through our internet usage patterns, one could come to understand us almost as well as we know ourselves, warts and all.
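To see why a bare account number offers so little protection, here is a rough Python sketch of the first step any sleuth would take: grouping queries by that number. The record layout is my own assumption, not AOL's actual file format, and the second user and its queries are invented placeholders; only the queries for user 672368 come from the Slate excerpt.

```python
from collections import defaultdict

# Hypothetical records as (pseudonymous_id, query), listed in the order issued.
# User 672368's queries are the ones quoted from Slate; user 100001 and its
# queries are made-up placeholders for illustration only.
records = [
    ("672368", "you're pregnant he doesn't want the baby"),
    ("100001", "placeholder query a"),
    ("672368", "foods to eat when pregnant"),
    ("672368", "abortion clinics charlotte nc"),
    ("100001", "placeholder query b"),
    ("672368", "can christians be forgiven for abortion"),
]


def histories_by_user(rows):
    """Collect each pseudonymous ID's queries into one list, keeping their order."""
    histories = defaultdict(list)
    for user_id, query in rows:
        histories[user_id].append(query)
    return dict(histories)


histories = histories_by_user(records)

# Reading one ID's queries back to back is what turns isolated searches
# into a narrative about a single person.
for query in histories["672368"]:
    print(query)
```

No name appears anywhere in the output, yet the grouped list reads as one person's story, and that is exactly the problem.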

The Slate article goes on to identify seven types of web searchers. From “The Pornhound” to “The Newbie” to “The Basket Case”, there are numerous labels that can be both descriptive and dangerous.

While I find this search data quite interesting, I also see danger in how it could be used. It’s a slippery slope from academic study of search queries to censorship and even to persecution. As crazy as this sounds, it is already happening in the world. Look at the media control in some communist countries. And if you think we’re immune here in the Western world, well… think again. It wasn’t long ago that freedom of speech was curtailed by the church. Even the United States is experiencing a resurgence in censorship.

How long until this powerful information is abused and distorted for unethical ends? I’d argue that it is already happening. What do you think?


For further information:

TechCrunch – Blog Archive – great info on sources and further info:

AOL Search data mirrors:

Working mirror (as of Tues Aug 15):


iWeb 2.0 features

Think Secret is reporting that iWeb is due for some major enhancements in its 2.0 release, which is expected in early 2007.

As a Mac user, I’ve recently experimented with the first release of iWeb. Results can be seen here: Overall, iWeb is quite easy to use and extremely intuitive. But it definitely has some shortcomings. For static content, iWeb is fantastic. It easily creates a static site, complete with picture galleries, podcasts and video files. That’s all well and good, but other products have been doing this sort of web development for quite some time now.

Some of the limitations I ran into (this list isn’t exhaustive):

  • No code editor window – or at least if there is, I have yet to find it.
  • Inability to customize the navigation bar – To add an external page to my site, I had to build a dummy page in iWeb, then manually update the dummy page using an external text editor (due to the lack of code editing in iWeb 1.0).
  • Inability to create robust photo galleries – the best iWeb could do, as far as I could tell, was make a single-page photo gallery containing all of the pictures that you’d want to post. I had to create a “photos” landing page, then create a bunch of subpages that contain photos by search criteria. And these pages are all static, which means the pics can’t be sorted by date, theme, etc. Disappointing, since there are plenty of tools out there that provide great photo publishing options.
  • Site updates – iWeb provides some slick publishing functionality when you publish to a .Mac account. Incremental site updates alone are almost worth the price of a .Mac account. If you publish to an external folder instead (the only other option if you don’t publish to .Mac), iWeb re-renders the complete site each time you make changes, which is quite time consuming, especially if the site has plenty of pictures and videos. This is more of an annoyance than a barrier to use though… but it is a bit Microsoft-ish in terms of trying to convince you to use Apple’s proprietary hosting platform.

There are some other minor annoyances with the product, but these are the main ones that I noticed.

Think Secret is suggesting that Apple is working on some dynamic content capabilities for iWeb 2.0. Things like:

  • Smart Albums – ability to control the display of photo galleries, along with some dynamic presentation options.
  • Flickr integration – ability to pull in content from Flickr (and maybe others)
  • Google AdSense integration – yet another winner
  • Themes – this alone would make the product a worthy upgrade

This functionality would be great, as dynamic content generation is definitely lacking from iWeb 1.0. Hopefully Think Secret is right about the features that Apple is working on for the next release. Only time will tell though…