Wednesday, October 10, 2012

Humble Bundle Data - Collection

Today was the launch of the Humble eBook Bundle.  The way that Humble Bundles work is that the site lists a collection of downloadable items (traditionally games, but lately they have branched out into music, and now books).  You can pay any price (including free, if you choose) for the bundle, and it's yours.  You can even tweak how much of your purchase price goes (directly) to the content creators, the Humble Bundle site, and a few charities.

Well, not all of the bundle is yours at any price.  There are bonus items, which are typically the best of the bunch.  To get these items, you are required to beat the average price of the bundle so far. 


Well, I like bundles, and I like e-books, so I decided to get the bundle.  But I wasn't prepared to pay the going price to get the two bonus books.  They looked good, but the average price was then just over $9, and I wondered:  if I was going to spend that much money on e-books, did I really want the ones on offer?  I would be willing to bite, but only if the price went down.  Humble Bundles typically hover around the $5-7 range for unlocking the bonus content, which for my money is a better impulse-purchase price for content I didn't get to pick out myself.

Funny thing about having to beat the average price:  a lot of people do it, and that tends to push the price to unlock the bonus content steadily upward.  It got me wondering, though.  I fully expected the price to trend up for the first couple of days, but would it dip down again after that?  My gut told me that there would be an initial spike, a dip, and then another spike at the end of the two-week window when the bundle was offered.
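The arithmetic behind that hunch can be sketched in a few lines of Python.  This is only an illustration:  the purchase count and payments below are made-up numbers, not figures from the bundle, and updated_average is a name I picked for the example.

```python
def updated_average(average, count, payment):
    """Return the new mean after one more payment joins `count` earlier ones."""
    return (average * count + payment) / (count + 1)


# Hypothetical starting point: 1,000 purchases averaging $9.00.
average, count = 9.00, 1000

# A buyer who beats the average by a dollar nudges it up slightly...
after_high = updated_average(average, count, 10.00)

# ...while a buyer who pays only $1 pulls it down by a larger amount.
after_low = updated_average(average, count, 1.00)

print(f"beat the average: {after_high:.4f}")
print(f"paid one dollar:  {after_low:.4f}")
```

So the average only keeps climbing as long as the people beating it outweigh the people paying well under it, which is why a dip after the launch rush seemed plausible to me.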

But I couldn't find any data on the subject.  Several websites publish data on the various Humble Bundles, but a graph of average price over time was nowhere to be found.  "Well," I thought, "I'll make one then."

So I did.  Or rather, am.

My data collected at the end of day 1, showing average price and total purchases over time.
I whipped up a quick script on my Linux box that uses links -dump to grab a text-only version of the web page.  It then successively greps this page for the data I'm after, and appends a line of data to a CSV file.  Initially, I only collected the timestamp and average price, but I decided that the total quantity sold at that time would also be a valuable and relevant piece of data.
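For anyone who would rather not chain greps together, here is a rough Python sketch of the same extract-and-append step.  It is not the script I actually ran.  It assumes the text dump of the page is already in hand as a string, and the two labels in the patterns ("Average purchase:" and "Number of purchases:") are guesses at what the page text looks like, so adjust them to match the real dump.  The function names are made up for the example.

```python
import csv
import re
from datetime import datetime

# Hypothetical patterns: the real labels in the text dump may differ.
AVERAGE_RE = re.compile(r"Average purchase:\s*\$([\d,]+\.\d{2})")
TOTAL_RE = re.compile(r"Number of purchases:\s*([\d,]+)")


def extract_row(page_text, now=None):
    """Pull [timestamp, average price, total purchases] out of a text dump."""
    average = AVERAGE_RE.search(page_text)
    total = TOTAL_RE.search(page_text)
    if average is None or total is None:
        raise ValueError("expected fields not found in page text")
    stamp = (now or datetime.now()).isoformat(timespec="seconds")
    return [
        stamp,
        float(average.group(1).replace(",", "")),
        int(total.group(1).replace(",", "")),
    ]


def append_row(csv_path, row):
    """Append one sample to the CSV file, creating the file if needed."""
    with open(csv_path, "a", newline="") as handle:
        csv.writer(handle).writerow(row)
```

Feeding it the output of links -dump and calling append_row("data.csv", extract_row(text)) once per run would give one CSV line per sample, the same shape of data described above.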

I added a line in my crontab file to run the script every 15 minutes (which is plenty of granularity over two weeks, and also easy on the remote server), and voilà:  data!

It will be interesting to see how this little experiment turns out.  It may be that the price will almost always trend upward.  I know that with certain past gaming bundles, games from previous bundles were added to the current bundle as bonus items in order to motivate people to pay the higher price and keep the average up.  I suspect that these items were added at times when the average price was dipping in order to bring it back up.  I don't know that there are any books in reserve for this bundle, so that may not be an option.

If it does happen, that will be interesting to watch in the data.

Side note:
  • Another bundle site of note that just got started specifically for e-books is StoryBundle.  StoryBundle is slightly different, in that they set a minimum price ($1), and the bonus books can be unlocked at a constant price ($7). 

When the bundle is over, I will post an update to this blog with the full results of the data I collect.

Update: Here are the results.
Enjoy!