Thursday, July 25, 2013

Q/Q Challenge Sketchy Results

So here's the conclusion to the Qualitative/Quantitative Challenge. Who are the top revenue earners according to the Obstacle Course Race Database?

I got 5 out of 8. But that quantitative list still looks funny. I mean, Ridiculous Obstacle Challenge is one of the top 8 races? Dirty Girl is making more than Spartan? Spartan is making less than half of Tough Mudder? Seems screwy. It's time to get down to the nitty-gritty of this list and validate the income.

Wednesday, July 24, 2013

Qualitative/Quantitative Challenge

So the journey to create insight into the world of obstacle course racing through purely public online means has begun with the creation of the Obstacle Course Race Database. With well over 1,000 Google searches and clicking through site after site after site after site... I attempted to find the number of runners per race, races per series, and average cost of registration to speculate race series' revenue from registration. It's not perfect but it's got to start somewhere. We learned from Nate Silver that models are never perfect but a good modeler will work to improve what s/he's got. 

Since many people are afraid of numbers, they won't bother with the data (quantitative information) and will rely on what they already know and can easily find (qualitative information). Former Orlando Magic head coach, Stan Van Gundy, said at this year's MIT Sloan Sports Analytics Conference, "I don't need an inch-thick packet of data to tell me that LeBron James is good at basketball." Nor did he have time to sort through and mentally digest that information. Well, I'm no NBA head coach. I don't work for a major OCR sponsor. I'm not a frequent OCR runner. But I do pay attention when I hear about races and do have Google at my fingertips. So what did I already know or could easily find?

The Qualitative Guess:
I spent about 10 minutes making a preliminary list of which race series I thought were the biggest players in the industry and filled in the model before filling in the blanks A-Z. Here were my top guesses and their estimated revenue.

There you have it. I'll post what I found soon. I'm off to sort some data.

PS: I'll try to sync the bar graph color with the blog color next time. Maybe some bigger font too. 

Tuesday, July 23, 2013

The OC Race Database is Ready. Let the Fun Begin.

1000 Google searches and two cookie-deletions. An initial database is built. Bring on the data sort, conditional formatting, scatter charts, histograms, regressions, and industry-transforming insights that Nostradamus himself could not foresee.

Just kidding. I'm going to buy crayons and a spirograph tomorrow.

But seriously, hopefully I'll figure something out.

Friday, July 12, 2013

The Model is Somewhat Validated by the Tough Mudder

Good news, the model is kind of right.

I temporarily paused the drudgery of looking up each race alphabetically, made a list of races that I thought were the most successful based on personal knowledge and Google results (the qualitative challenge I mentioned in the last post), and went back to filling in the blanks of the database.

Of course the Tough Mudder was on the list of leaders. Tough Mudder was transparent enough to provide a couple of figures for the public on its hiring page. What did it say for 2013?


Show me $70M and 500,000 participants!!

$70 million right on the money! Neat. Buuuut we also see that I was off by about 77,500 runners or roughly 13%. Not terrible but not great. What can we learn from this number?

1) The number of runners is obviously too high. We know the number of locations for a fact so I will have to dig deeper on a few stats:
      -Hours of waves leaving
      -Waves per hour
      -Days per location
I can estimate the average number of hours and waves/hour by taking samples, but I'm not sure how to figure out Runners/Wave without going to an event, expertly Googling, or asking around. I pulled 350 out of thin air, but based it on the 300 that Spartan (a high demand race) released + 50 for the added length of the course. I will keep you posted on any refining I do to this number.

2) TM is making more per runner than I estimated by about 13%. 

Either the average runner is paying more than my average price (which is weird given the number of discounts available) or roughly 13% of revenue is made from a combination of sponsors, merchandise, and other sales. I might be able to figure out spectator ticket sales but it will take some insider info to get at the remaining 13%. Maybe it's not that important to a company that is SLOWING to 100% growth each year. 
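For anyone checking the two 13% figures, here is a rough sketch of the arithmetic. It assumes the model landed on exactly $70 million and that its runner count was the reported 500,000 plus the 77,500 overshoot; the variable names are made up for illustration. Under those assumptions the runner overshoot (as a share of the model's count) and the per-runner shortfall (as a share of the reported per-runner figure) come out to the same number.

```python
# Figures from the Tough Mudder hiring page, as quoted in the post.
reported_revenue = 70_000_000
reported_runners = 500_000

# Assumption: the model overshot by 77,500 runners while matching revenue.
runner_overestimate = 77_500
model_runners = reported_runners + runner_overestimate  # 577,500

# Overshoot measured as a share of the model's own runner count.
runner_error_share = runner_overestimate / model_runners

# Revenue per runner, reported vs. implied by the model.
reported_per_runner = reported_revenue / reported_runners  # 140.0
model_per_runner = reported_revenue / model_runners        # about 121.21

# Shortfall measured as a share of the reported per-runner revenue.
per_runner_gap_share = (reported_per_runner - model_per_runner) / reported_per_runner

print(f"{runner_error_share:.1%}")    # 13.4%
print(f"{per_runner_gap_share:.1%}")  # 13.4%
```

The two shares match because both reduce to (model runners - reported runners) / model runners when the revenue is held fixed.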

Either way, it seems to balance out in my model. Which is cool. Broken clocks are right twice a day but there is only a .14% chance of it happening on any random minute.

Wednesday, July 10, 2013

The Obstacle Course (OC) Race Database

I started following several OC races on Twitter to begin my prep for the grand endeavor of defining and examining success within the OC Race industry. I came across a site that keeps a directory of 257 of the world's greatest mud runs and links to their websites. Ah, easy.

Mkay, great start. What do we want to know? We want to know how much income each race makes (we'll worry about costs later). How do we do that? We follow this awesome model.

Race Annual Income =
Average Price of Registration X
Number of Runners X
Number of Locations

Some of these variables have their own variables. Here are some of the challenges with filling in each blank and examples of how I fill it in for the XTREME XAMPLE MUD RUN which is completely made up.

Average Price of Registration:
The cost of registering for a race generally rises as you near the date of the event. The price increase and date progression are not always proportional so I average the race day registration and early bird registration.

EG: Early bird registration: $65
Race day registration: $100
(100 + 65) / 2 = $82.5 per registration

It's extremely simplified but cut me some slack, I have to do this 257 times.
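Since this has to happen 257 times, the averaging step could be wrapped in a tiny helper. This is just a sketch of the midpoint rule described above; the function name is hypothetical and the prices are the made-up XXMR ones.

```python
def average_registration_price(early_bird_price, race_day_price):
    """Midpoint of the cheapest and most expensive registration prices."""
    return (early_bird_price + race_day_price) / 2


# Made-up XTREME XAMPLE MUD RUN prices from the example above.
avg_price = average_registration_price(65, 100)
print(avg_price)  # 82.5
```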

Source, the link actually leads to a pretty cool business idea.

Number of Runners:
This can easily be found by looking up a Race Results board. Unfortunately, I'm only finding those for about 30% of races. This has been a major wrench in the spokes for a lot of races but I'm able to bootstrap my way to a pretty good approximation after a little research on hundreds of pages.

If there are no race results, I've been finding the number of waves/city which can be directly found or deduced by start and end times.

EG: The first wave is at 8:00 AM (8). The race continues until 2:30 PM (14.5) when the final wave is released. Waves start every 15-20 minutes (3-4/hour). Waves are limited to 300 runners.

So 14.5-8=6.5 hours at 3-4 waves/hour. 3.5 X 6.5 = 22.75 waves.

I then look at number of open spots (if shown) in each wave and generalize ~200/wave.

So 22.75 X 200 = 4,550 runners/city.
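The wave math above can be sketched as a small function. The function and parameter names are hypothetical, times are decimal hours (2:30 PM = 14.5), and the 200 runners/wave is the generalized figure from the example, not the 300 cap.

```python
def estimate_runners_per_city(first_wave_hour, last_wave_hour,
                              min_waves_per_hour, max_waves_per_hour,
                              runners_per_wave):
    """Estimate runners at one location from the wave schedule."""
    # Hours during which waves are being released.
    hours_of_waves = last_wave_hour - first_wave_hour
    # Use the midpoint of the waves/hour range (3-4 becomes 3.5).
    avg_waves_per_hour = (min_waves_per_hour + max_waves_per_hour) / 2
    total_waves = avg_waves_per_hour * hours_of_waves
    return total_waves * runners_per_wave


# 8:00 AM to 2:30 PM, 3-4 waves/hour, ~200 runners/wave.
runners_per_city = estimate_runners_per_city(8.0, 14.5, 3, 4, 200)
print(runners_per_city)  # 4550.0
```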

A single wave from the Dirty Girl, multiply by lots. Source.

Number of Locations:
This is the easiest. You can usually just click on "Events" and count.

EG: Chicago + New England + New Jersey + Pennsylvania = 4. Voila, 4 locations.

The Spartan Race's current global distribution. Making beaucoup USD, CAD, and GBP. Source.

I changed it from Number of Race Days because the number of waves was rarely consistent from day to day. As in, the waves on Saturday are not equal to the waves on Sunday. Since I'm counting waves already, I account for it in the Number of Runners.

Based on the XTREME XAMPLE MUD RUN which posted the numbers listed above, the XXMR boasts 4,550 runners paying $82.5 each in 4 locations. This brings in about $1.5 million. Nice work guys.
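Putting the three pieces together, the whole model is one multiplication. Here is a minimal sketch using the made-up XXMR numbers; the function name is hypothetical, and "runners" means runners per location, as in the wave example.

```python
def race_annual_income(avg_registration_price, runners_per_location, num_locations):
    """Registration-only income estimate: price x runners x locations."""
    return avg_registration_price * runners_per_location * num_locations


# XTREME XAMPLE MUD RUN: $82.5 average price, 4,550 runners/city, 4 cities.
xxmr_income = race_annual_income(82.5, 4550, 4)
print(f"${xxmr_income:,.0f}")  # $1,501,500
```

This only counts registration income. Sponsors, merchandise, and spectator tickets are left out, and so are costs.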

Next time I'll talk about what insights we might be able to gain from this data. I'm also going to challenge myself to qualitatively figure out the quantitative part. Should be fun.