An Analysis of Activity on Steam

I have wanted to flex my analysis muscles for quite some time and thought I’d get some practice on publicly available data: the Steam & Game Stats page.

Steam Logo

For the unaware: Steam is one of the largest gaming platforms available for PCs, and they have a wonderful stats page that lists the current number of concurrent users for each game, as well as the current number of concurrent users in general. I’ve always wondered what actual activity levels were, ever since seeing this page, and so I decided to scrape that page for a week or two and see what sort of data I could get.

NOTE: If you’re interested in the technical details behind the scraping, I will provide that in a separate post in the near future.

The data I have managed to mine from this page is trivial at best, but I had a ton of fun learning how to build the scraper (uses Python, BeautifulSoup, and a handy dandy cronjob) as well as figuring out the required Google Sheets equations to put it all together.

Some Fun Facts About Steam Activity

On average, 19.95% of concurrent players on Steam are playing games.

Here is a visualization of what Steam activity levels have looked like from March 7, 2014 to March 19, 2014:

On average, 1/5th of the Steam “concurrent” user base is actually playing a game. However, I will concede that this is not dead-on accurate, as I only have numbers for top 100 games at any given moment, but is reliable for estimations because games in the top 90-100 levels account for 0.02% of the user base, meaning any additional users playing non-top-100 games will be relatively insignificant in their effect.

There was a big spike, on March 9, 2014 at 4:00pm EDT (16:00) when the number of Counter-Strike: Global Offensive players ballooned up to a staggering 111,893 players. Presumably for the start of a tournament. (Haven’t dug into this one too much.)

Dota 2 dominates games played on Steam, accounting for an average of 5.58% of the concurrent player base.

Not a real surprise, given the popularity of Dota 2, but the second most played game is Counter-Strike: Global Offensive, sitting at a distant 1.66% average, about three times smaller than Dota 2. This translates to an average of about 400,000 concurrent players, with the highest I’ve recorded at 673,018 concurrent players on March 15, 2014 at 10:00am EDT.

In fact, most of the highest numbers of concurrent players in Dota happens in the mornings from 9:00am to 11:00am.

However, might be indicative of the real struggle for MOBAs to fight against the titan amongst gods, League of Legends, which boasted an impressive peak of 7.5 million concurrent users in January 2014.

Hard to gain ground on such an entrenched competitor, but they’re definitely doing their best.

160 games have been a part of the Steam Top 100 between March 7, 2014 and March 19, 2014

While that sounds like a lot, we have to remember that the Steam catalogue currently sits at over 3,000 titles and growing, it’s pretty safe to say that breaking into the top 100 is no easy feat.

Further analysis that I could do as time goes on is to get a breakdown of the genres being represented in the Top 100 list, which would also provide a decent idea of what is and what is not popular on Steam. Not to mention that this is an extremely small sample size, it would be more worthwhile to get this data over a period of a year to make it really meaningful.

What’s Next?

Well, this is a big pile of data, and it’s growing by the hour. This is great, but what can I really do with this data?

For starters, the original goal was to figure out if there was a link between the digital marketing behaviours of publishers and the level of concurrent players on Steam, as well as growth or decline in player base from that activity (or lack thereof.) I will have to explore whether or not that is still possible to figure out, as there are a lot of marketing activities that are either harder to track down or even attribute towards the success of a game.

Secondly, I will have to step up the data storage game a bit to make it much more accessible. Currently, a Python script scrapes the Steam & Game stats page, adding a line to two separate CSV files with all the relevant data. I’d like to transition this into an actual database (probably MySQL) and maybe make it open to the public to poke at and do their own analysis.

Lastly, I’m really not sure. It was a fun side project in the first place, and I feel like it was a great learning experience and a fantastic way to brush up on my analytical skills.

Have any ideas or want access to the data? Give me a shout, I’m happy to share!