Saturday, January 24, 2009


Originally uploaded by wck

On the Elizabeth River in Virginia

Thursday, January 22, 2009


* I wish I could write a shell script to schedule haircuts. What's with this having to pick up a phone thing? ;-)
* Currently addicted to Rancid's Olympia WA song. Deeply, deeply addicted.
* I've had a partially written post on We'll Find a Way by the Ducky Boys floating around in my head for a few weeks, and I'm never going to finish it, so here it is in pieces:

I never go up to midtown without a lot of grumbling, but I ended up spending the night in a hotel up on 50-something and 7th Ave, just on the north edge of Times Square, on Dec 30. I was sitting at the desk, watching the sun go down and turn the skyscrapers a fantastic bright orange color through the brown-tinted hotel windows. It felt so 80s in some very weird way. I was listening to We'll Find a Way and a few other songs on a small playlist that kept looping and looping. Later in the evening I walked south down 7th, through the bitter cold, and after midnight back north up Broadway, still with the same mix. This song fit in so well with the crystal cold and the blazing orangish lights that I now can't listen to it without the "hey nanana"s sounding like a setting sun and sharp cold air. And stumbling past crowds of tourists, walking Times Square a night early, there was some sense of dislocation, rooted by the chorus coming up and up all evening.

Friday, January 09, 2009

The "Harry Potter" problem in recommendations

Greg Linden covered the Harry Potter problem in a blog post on recommendation technologies a few years ago:
A very sharp and experienced developer named Eric wrote the first version of similarities that made it out to the Amazon website. It was great working with Eric. I learned much from him over the years.

The first version of similarities was quite popular. But it had a problem, the Harry Potter problem.

Oh, yes, Harry Potter. Harry Potter is a runaway bestseller. Kids buy it. Adults buy it. Everyone buys it.

So, take a book, any book. If you look at all the customers who bought that book, then look at what other books they bought, rest assured, most of them have bought Harry Potter.


When I worked on the personalization team we were still struggling with the problem. There are definite ways to identify a Harry Potter problem, but you have to remember to apply them. Adding to that, within certain genres there are Harry Potter books and music albums that are runaway successes only within those genres. If you compared those books to the general list of books that Amazon sells, they wouldn't look like books that everyone has bought. Taking it a step further, if you then narrow the scope to only related books, you'll find that they are crazy popular.
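To make the problem concrete, here is a minimal sketch of my own, not the method Amazon used. It compares raw "customers who bought X also bought Y" counts against lift, one common way to discount items that everybody buys. The baskets, book names, and function names are all made up for illustration.

```python
from collections import Counter
from itertools import permutations


def similarity_tables(baskets):
    """Return (raw, lift) co-purchase scores for every ordered item pair."""
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = set(basket)
        item_counts.update(items)
        pair_counts.update(permutations(items, 2))
    n = len(baskets)
    raw = dict(pair_counts)
    # lift = P(a and b) / (P(a) * P(b)); a value near or below 1 means the
    # pair shows up together no more often than chance would predict.
    lift = {
        (a, b): (count / n) / ((item_counts[a] / n) * (item_counts[b] / n))
        for (a, b), count in pair_counts.items()
    }
    return raw, lift


def top_related(scores, item, k=1):
    """Best-scoring items related to `item`, ties broken alphabetically."""
    related = [(other, s) for (a, other), s in scores.items() if a == item]
    return sorted(related, key=lambda pair: (-pair[1], pair[0]))[:k]


# Hypothetical purchase histories: one bestseller that nearly everyone owns.
BASKETS = [
    {"python_cookbook", "ruby_cookbook", "harry_potter"},
    {"python_cookbook", "ruby_cookbook"},
    {"python_cookbook", "harry_potter"},
    {"python_cookbook", "harry_potter"},
    {"mystery_novel", "harry_potter"},
    {"gardening_guide", "harry_potter"},
    {"cooking_basics", "harry_potter"},
]

raw, lift = similarity_tables(BASKETS)
print(top_related(raw, "python_cookbook"))
print(top_related(lift, "python_cookbook"))
```

With these made-up baskets, the raw counts rank the bestseller first for the Python book, while lift ranks the Ruby cookbook first. The same normalization can be run within a single genre to catch the genre-level Harry Potters.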

The biggest side effect of the Harry Potter problem is that it weakens recommendations. For instance, I've bought the O'Reilly regex pocket book and the O'Reilly Python Cookbook and Ruby Cookbook. From those three books, you can pretty easily peg me as a web nerd and safely recommend a Steve Souders website performance book. Those are very strongly correlated purchases in a narrow band of interest. However, because I'm a geek, I've also bought Neal Stephenson's latest book, Anathem. As have a few hundred thousand OTHER geeks. We could say that Anathem is a nerd's Harry Potter.
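One way to keep a widely bought title from dominating is to weight each book in a purchase history by how rare it is, in the spirit of inverse document frequency from text search. This is only a sketch under assumptions: the buyer counts and customer total below are invented numbers, and `seed_weights` is a hypothetical helper, not anything from Amazon's systems.

```python
import math


def seed_weights(history, buyers_per_book, total_customers):
    """IDF-style weight per purchased book: rarer purchases count for more."""
    return {
        book: math.log(total_customers / buyers_per_book[book])
        for book in history
    }


# Invented figures, for illustration only.
TOTAL_CUSTOMERS = 1_000_000
BUYERS = {
    "regex_pocket_book": 5_000,
    "python_cookbook": 10_000,
    "ruby_cookbook": 8_000,
    "anathem": 300_000,
}

weights = seed_weights(list(BUYERS), BUYERS, TOTAL_CUSTOMERS)
for book, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{book}: {weight:.2f}")
```

Under these numbers the three narrow-interest books each carry several times the weight of Anathem, so recommendations seeded from the history would lean toward the web-nerd titles.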

So I received an email today from Amazon with a list of recommended books, most of which were based off Anathem and Daniel Silva's latest book, Moscow Rules (a great book, but also a bit of a widely bought Harry Potter). As you might guess, the recommendations were really bad. I wish that email had a link that I could click that would say "never recommend any of these books to me again please". I could go to each detail page and mark that, but it would take a massive amount of time.