Monday, September 17, 2007

AB Testing

37signals just did a blog post on "Secrets to Amazon's Success". Here are my favorite three points:


Use measurement and objective debate to separate the good from the bad. I've been to several presentations by ex-Amazoners, and this is the aspect of Amazon that strikes me as uniquely different and interesting compared to other companies. Their deep-seated ethic is to expose real customers to a choice, see which one works best, and make decisions based on those tests.

Getting rid of the influence of the HiPPOs, the highest paid people in the room. This is done with techniques like A/B testing and web analytics. If you have a question about what you should do, code it up, let people use it, and see which alternative gives you the results you want.

Have a way to rollback if an update doesn’t work. Write the tools if necessary.

- http://www.37signals.com/svn/posts/600-secrets-to-amazons-success


That matches up with two of the biggest things I learned at Amazon: you don't know what your customers will do, so you shouldn't guess, you should A/B test. And you're going to screw up unbelievably, and constantly, so you need instant rollbacks.

Here's the thing. Amazon was my first job out of college, so I really didn't know any other way to do things. One of my first testing jobs there was to test the very first automated A/B mechanism on the retail site (a version that had died long before I ended up leaving the company; like all the features I worked on in my first few years, it was redone a few times to be better than the first version). Testing it wasn't all that fancy: we just checked that visitors were dropped into buckets, that they saw the treatments tied to those buckets, and that reporting worked correctly. The bigger impact on me was the lesson that this was the most important way new features were going to be launched. It's an incredible testament to the success of A/B testing in the company culture that when I left seven years later, nearly every new update to the site (even little tweaky things) was launched initially as an A/B test. Russell used to have all of his launch plaques up on the wall over his desk, with a couple flipped upside down. The upside-down plaques were, yes, the projects that bombed in the launch A/B testing phase.
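The mechanism I was testing (visitors dropped into buckets, treatments tied to those buckets) can be sketched roughly like this. This is a minimal illustration, not Amazon's actual implementation; it assumes hash-based assignment, and the experiment and treatment names are hypothetical:

```python
import hashlib

def assign_bucket(visitor_id: str, experiment: str, num_buckets: int = 2) -> int:
    """Deterministically map a visitor to a bucket for one experiment.

    Hashing the visitor id together with the experiment name keeps a
    visitor in the same bucket across page views, while different
    experiments split the population independently of each other.
    """
    digest = hashlib.md5(f"{experiment}:{visitor_id}".encode()).hexdigest()
    return int(digest, 16) % num_buckets

# Treatments tied to buckets (hypothetical experiment and names):
TREATMENTS = {0: "control", 1: "new_checkout_button"}

def treatment_for(visitor_id: str) -> str:
    """Return the treatment this visitor should see."""
    return TREATMENTS[assign_bucket(visitor_id, "checkout_button_test")]
```

Checking a scheme like this is exactly the unglamorous testing described above: verify the bucketing is stable and uniform-ish, verify each bucket renders its treatment, and verify the metrics pipeline attributes results to the right bucket.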

I couldn't even begin to describe the number of rollbacks we did over the years. If you can think of it, we screwed it up. The important part was that we could recover quickly by backing out the change, and that we had a culture of learning from the screwups. In the retail world, there was a tradition of writing "How I broke the website" emails. I will admit to writing a couple: they went out to all the engineers and carefully analyzed how you had, well, broken the website. You discussed how the problem should have been caught, and what you were going to do differently going forward to prevent it from happening again. Now that I've been at some other companies, I can see how critical those emails were. Breaking things was bad, but more important was sharing how you messed up, taking the time to do a post-mortem of the mistake, and then freely circulating the information. It wasn't the mea culpa that mattered, although accepting responsibility was a big part of it. Larger, really, was the learning you shared with the other teams, and the environment where you learned how to recover gracefully from large messes.

37signals mentions the 'just do it' awards that Jeff handed out. Less known outside of Amazon are the 'door desk awards'. I'm not sure they're still given out, but they used to be handed out at every All Hands meeting. A door desk award was similar to a 'just do it' award, but it was given to an individual who tried something, without their manager's approval, and failed. Their feature bombed in an A/B test, their code blew up, the project never launched, whatever. It was another piece of building an engineering culture where you should just go for it, because failure wasn't the end of the world.
