Google Spreadsheets: COUNT with Multiple Values in Cells

It’s been more than two years since I wrote a post on using Google Spreadsheets to use COUNT and FILTER in tandem to count values with multiple conditions.

Circling back, it does very well at allowing you to count when you need to validate multiple conditions. However, it falls apart the minute that your data is either messy or comes grouped with other data inside of a cell. Here’s an example where my original post would have trouble with:

Google Spreadsheets: Multiple Values in Cells

Why does it fall apart? Each cell can have multiple values inside of it.

To dig a little further, we have to understand that my original post called for using the FILTER function to pare down the data set into the specific rows that we need, and then using the COUNT function to count the result of the filter. However, FILTER does not allow for wildcards in its conditions for filtering, making it extremely difficult to account for data that is in the same cell as other data.

Luckily, there’s an easy way to get around it, using a crude but workable solution for counting values with multiple conditions, with sources that have multiple values in cells, extensible to messy or not-very-nice data.



style="display:inline-block;width:728px;height:90px"
data-ad-client="ca-pub-9705548764325874"
data-ad-slot="9050629881">

The Scenario

In the above example, let’s say we’d like to count the number of items inside of Column A and Column B for every row that Item A appears in.

Manually counting them, we know that for Item A, it is listed on a total of 3 rows (Rows 2, 3, and 7) and has 7 values listed to its name in Column A, and 4 values listed in Column B. Hopefully I haven’t lost you, but I’m saying that we count the number of values inside Column A every time Item A is listed, and we do the same for Column B.

How would we go about doing this in Google Spreadsheets?

For the impatient, here’s what the formula looks like:

=COUNTA(SPLIT(JOIN(",", FILTER(B:B, SEARCH("*A*", A:A))), ","))

I know, I know, it looks odd and confusing, so let’s break it down starting from the very inside formulas.

=SEARCH("*A*", A:A)

This is what we’re using as a condition inside of FILTER. SEARCH allows us to parse through the values in a column or row and find characters or values that we’re looking for. In this particular instance, I am looking for the character “A”, but you could be looking for whole words, sequences of characters, you name it, and you have the possibility of adding wildcards on the outside of the term you’re looking for to make it catch every instance properly. Experiment with this, because I’m not 100% sure on whether or not the wildcards help in every situation.

=FILTER(B:B, SEARCH("*A*", A:A))

When SEARCH is used in conjunction with FILTER, it returns a list of rows or columns (as specified by the first parameter in FILTER, in this case it’s B:B) that returned a TRUE condition for the SEARCH. The result would look like this:

X, Y, Z
X, Y, Z
Z

This is returning B2, B3, and B7 respectively from my above example, because they had Item A in their Item column.

=JOIN(",", (FILTER(B:B, SEARCH("*A*", A:A)))

Using JOIN, you can combine all of our FILTER results into a single cell, where each cell’s values are separated by a comma (first parameter from JOIN, in this case “,” means separate values using a comma) and makes it nice and simple for us to analyze just this one cell.

X, Y, Z,X, Y, Z,Z

This is what our data now looks like, all inside of a single cell.

=COUNTA(SPLIT(JOIN(",", FILTER(B:B, SEARCH("*A*", A:A))), ","))

We arrive back at the original Google Spreadsheets formula, with SPLIT and COUNTA appearing. SPLIT allows us to break apart a single cell into many cells that contain each and every value that we have in the cell, as long as we let it know how the data is separated. In this case, it’s separated by commas because of the work we did above with JOIN, and we use COUNTA to count the non-numeric values that the data represents.

Doing this for Item A, we can quickly see that it has 7 items in Column A, and 4 items in Column B.

Easy peasy eh?

I linked the functions into the documentation for Google Spreadsheets as I listed them above, but here they are again, just in case:

If you end up trying this out, let me know what you think, or if you have any questions, or if you have any improvements or suggestions!

Scraping Steam for Data using Python + BeautifulSoup

As promised in yesterday’s blog post about analyzing public Steam numbers, here are the juicy technical details behind scraping a website using Python, and a Python library called BeautifulSoup.

Python Logo

The Method

I chose to use Python because I’ve been using it for a little under two years to do number crunching, as well as building a few automation scripts for work. It’s very lightweight, very easy to read, and quite a mature language.

That said, you can probably do this in whatever you feel like, but my approach consisted of the following steps:

  1. Poke the Steam & Game Stats page and get the HTML page that is served up to the browser
  2. Parse the HTML code and pull out specific numbers that would be useful for analysis
  3. Open a specified CSV file, and add lines to the file with all of the relevant data
  4. Close file, standby for next script run

What Was Used

The script is a very small file (33 lines!), and uses the following:

And you can take a look at the Gist itself to see the full script, but I am going to use this post to explain some of the methodology behind the script, to help people who want to learn about writing in Python and scraping web pages!

Alright, shut up, explain your code.

Of course!

8
steampage = BeautifulSoup(urllib.urlopen('http://store.steampowered.com/stats/?l=english').read())

This gets the ball rolling for the scraper. We use urllib to open a connection to the Steam & Game Stats page, and then read it with the BeautifulSoup library. If you are unfamiliar, I know I was, BeautifulSoup is a very powerful Python library that makes it super easy to navigate, search, and modify the parsed code you receive from websites.

In short: read the code of a webpage using BeautifulSoup, and you get all kinds of methods to chop and screw it to your liking.

10
11
timestamp = time.time()
currentTime = datetime.fromtimestamp(timestamp).strftime('%Y-%m-%d %H:%M:%S')

I wanted to use a consistent timestamp when recording data into the CSVs because it would allow me to group results in a sane manner. It uses the current time (of the script running) and formats it into YYYY-MM-DD HH:MM:SS so that when imported into Google Sheets, it would preserve the actual date and time aspects.

13
top100CSV = open('SteamTop100byTime.csv', 'a')

You’ll see two open(…) lines in my script, and both of them point to a specific CSV. This is where I dumped all of my data. The second parameter (‘a’) made sure I was opening and adding to the CSV, rather than continuously overwriting.

15
16
17
18
19
20
21
for row in steampage('tr', {'class': 'player_count_row'}):
    steamAppID = row.a.get('href').split("/")[4]
    steamGameName = row.a.get_text().encode('utf-8')
    currentConcurrent = row.find_all('span')[0].get_text()
    maxConcurrent = row.find_all('span')[1].get_text()
 
    top100CSV.write('{0},{1},"{2}","{3}","{4}"\n'.format(currentTime, steamAppID, steamGameName, currentConcurrent, maxConcurrent))

This simple looking for loop pulls out the Steam ID for the game, the English game name (as listed on the Top 100 list), the number of concurrent players (as of the script reading the page), and the peak concurrent seen throughout the day (I forget why I wanted this.) It also adds that information on a new line inside of the CSV.

In addition, this loop shows you the simplicity of the power behind BeautifulSoup. Let me break it down into smaller pieces because each one uses different BeautifulSoup methods.

15
for row in steampage('tr', {'class': 'player_count_row'}):

When I dug through the Steam & Game Stats source code, I realized that every game was listed inside of a table row with the class player_count_row. Upon seeing the pattern, I simply asked BeautifulSoup to iterate through every single block or table rows using that class, and as they are all uniform in their markup, can consistently pull out the information we need.

16
    steamAppID = row.a.get('href').split("/")[4]

With BeautifulSoup, you can make direct references to markup (as seen above), and then grabbing parameters within the markup itself (like ‘href’.) I did this to grab the URL of each Steam game, break it apart based on where the forward slashes (‘/’) were, and pulled out the app ID that was nestled inside of the URL.

17
    steamGameName = row.a.get_text().encode('utf-8')

Like the above markup parameter grabbing, get_text() is a very neat function in BeautifulSoup that allows you to grab the text for a link. The Steam & Game Stats page uses the game name itself as the link text, so it was a breeze to add to our collection of data.

18
    currentConcurrent = row.find_all('span')[0].get_text()

Nabbing the current and peak concurrent users is the same procedure, so I only need to explain it once. The find_all() function from BeautifulSoup finds the markup that is specified. It takes every single instance of that markup found and creates an array that can be referenced for easy modification or evaluation.

With the same methods and functions, I managed to easily find the current and peak concurrent users for Steam altogether.

Simple, right?

What else?

There’s nothing else, really! Those 33 lines were more than enough to collect lines upon lines of data inside of two CSV files.

There’s plenty of work that went into the analysis of that data, but that’s for another day.

Thanks for reading! As always, happy to answer questions or take feedback, so leave comments or yell at me on Twitter!

An Analysis of Activity on Steam

I have wanted to flex my analysis muscles for quite some time and thought I’d get some practice on publicly available data: the Steam & Game Stats page.

Steam Logo

For the unaware: Steam is one of the largest gaming platforms available for PCs, and they have a wonderful stats page that lists the current number of concurrent users for each game, as well as the current number of concurrent users in general. I’ve always wondered what actual activity levels were, ever since seeing this page, and so I decided to scrape that page for a week or two and see what sort of data I could get.

NOTE: If you’re interested in the technical details behind the scraping, I will provide that in a separate post in the near future.

The data I have managed to mine from this page is trivial at best, but I had a ton of fun learning how to build the scraper (uses Python, BeautifulSoup, and a handy dandy cronjob) as well as figuring out the required Google Sheets equations to put it all together.

Some Fun Facts About Steam Activity

On average, 19.95% of concurrent players on Steam are playing games.

Here is a visualization of what Steam activity levels have looked like from March 7, 2014 to March 19, 2014:

On average, 1/5th of the Steam “concurrent” user base is actually playing a game. However, I will concede that this is not dead-on accurate, as I only have numbers for top 100 games at any given moment, but is reliable for estimations because games in the top 90-100 levels account for 0.02% of the user base, meaning any additional users playing non-top-100 games will be relatively insignificant in their effect.

There was a big spike, on March 9, 2014 at 4:00pm EDT (16:00) when the number of Counter-Strike: Global Offensive players ballooned up to a staggering 111,893 players. Presumably for the start of a tournament. (Haven’t dug into this one too much.)

Dota 2 dominates games played on Steam, accounting for an average of 5.58% of the concurrent player base.

Not a real surprise, given the popularity of Dota 2, but the second most played game is Counter-Strike: Global Offensive, sitting at a distant 1.66% average, about three times smaller than Dota 2. This translates to an average of about 400,000 concurrent players, with the highest I’ve recorded at 673,018 concurrent players on March 15, 2014 at 10:00am EDT.

In fact, most of the highest numbers of concurrent players in Dota happens in the mornings from 9:00am to 11:00am.

However, might be indicative of the real struggle for MOBAs to fight against the titan amongst gods, League of Legends, which boasted an impressive peak of 7.5 million concurrent users in January 2014.

Hard to gain ground on such an entrenched competitor, but they’re definitely doing their best.

160 games have been a part of the Steam Top 100 between March 7, 2014 and March 19, 2014

While that sounds like a lot, we have to remember that the Steam catalogue currently sits at over 3,000 titles and growing, it’s pretty safe to say that breaking into the top 100 is no easy feat.

Further analysis that I could do as time goes on is to get a breakdown of the genres being represented in the Top 100 list, which would also provide a decent idea of what is and what is not popular on Steam. Not to mention that this is an extremely small sample size, it would be more worthwhile to get this data over a period of a year to make it really meaningful.

What’s Next?

Well, this is a big pile of data, and it’s growing by the hour. This is great, but what can I really do with this data?

For starters, the original goal was to figure out if there was a link between the digital marketing behaviours of publishers and the level of concurrent players on Steam, as well as growth or decline in player base from that activity (or lack thereof.) I will have to explore whether or not that is still possible to figure out, as there are a lot of marketing activities that are either harder to track down or even attribute towards the success of a game.

Secondly, I will have to step up the data storage game a bit to make it much more accessible. Currently, a Python script scrapes the Steam & Game stats page, adding a line to two separate CSV files with all the relevant data. I’d like to transition this into an actual database (probably MySQL) and maybe make it open to the public to poke at and do their own analysis.

Lastly, I’m really not sure. It was a fun side project in the first place, and I feel like it was a great learning experience and a fantastic way to brush up on my analytical skills.

Have any ideas or want access to the data? Give me a shout, I’m happy to share!

2013: Year in Review

Zombicide

2013 has definitely been my year: I managed to get into my dream career, I went on the trip of a lifetime through four European countries, and I finally flew the coop and moved out on my own.

Despite that, I have spent the last couple of days (when writing this, in late 2013) feeling like I was in a bit of a funk and thinking that I was actually disappointed in myself. However, I am pretty sure that’s incorrect, and that I am quite satisfied with what I have accomplished in 2013, but I wish to do much more for 2014.

As with 2012, I have created The Good, The Bad, and The Ugly for 2013.

The Good

I started 2013 off with a bang: I managed to get my dream job of building video games. I joined Massive Damage in early January as an Associate Video Game Producer, and got bumped to a fully fledged Video Game Producer in July. I work with a team of incredibly smart, talented, and passionate people who know how to have fun, and I’ve been a part of a bunch of products that have been incredible learning experiences.

Prior to joining, I had been a bit aimless in my career aspirations, but getting into the games industry has made the end goal clear – it’s always going to be about video games.

In June 2013, I went away for three weeks on a trip through Europe with three of my friends. We backpacked our way through Amsterdam, Paris, Barcelona, Ibiza, Madrid, and Lisbon, eating and drinking our fill for three weeks. It was a trip of a lifetime, giving myself a much needed break and exposure to the world at large, it also gave me incredible stories and experiences that I can never forget. I will eventually get around to actually working on the photos I took throughout.

Thanks to work, I became completely enamoured with board games, starting my collection with Zombicide. Since then, I’ve grown my board game collection to 11 titles, and I’ve spent many a quiet afternoon or evening rolling dice or throwing down cards. My favourite for the year is a definite tie between Zombicide and Ascension, and I look forward to discovering more titles in the new year.

In November, I managed to finally fly the coop and moved into an apartment in downtown Toronto with my best friend. It’s been a move that’s been in the works since July, but took its sweet time in being realized. The independence has been refreshing, the shorter commute has been a blessing, and a place to call my own has been pivotal in furthering the next steps of my plan.

The Bad

To supplement The Good, I have created a list of things that did not go according to plan this year, but in a change of pace, I am writing down how I think I will correct that for the 2014.

I haven’t saved nearly as much as I wanted to. For reasons I would rather not go into, my savings have stagnated in 2013, and I can really only blame myself. The part of me that demands fiscal responsibility has really slacked in the latter half of the year, and that will change. I am creating a few side projects that will hopefully alleviate the situation, and I plan to do a bit more nights-and-weekends consulting. I have many plans for 2014, and I’m not letting myself ruin it.

I did not get enough exercise at all this year. After going to the gym on an almost daily basis last year, I hardly went this year at all, to the point of stopping in July when things got a little too crazy at work for me to go in the mornings. I no longer have an excuse with my apartment having a pool and a solid gym, I will break this bad habit in 2014 or I will die trying.

I find solace in distractions. Every time I sit down to start a project, or improve myself, I find something to get lost in. Surfing the web (read: Reddit) has really crippled some of my ability to be productive, so I will change this in 2014, with specific time set aside to do only personal improvement work, no distractions allowed.

I did not complete any personal projects in 2013. I often came home exhausted, in a bad mood, wanting to do something else, or any combination of the three. This led to zero personal projects being completed in the year, and I feel like I let myself down. But, bad feelings aside, I will complete at least 1 in the first two months of 2014, or I will take away my discretionary spending budget.

The Ugly

I lacked any sort of organizational ability. I spent most of this year, outside of work, in a frazzled state of disorganization. I can chalk it up to any number of things, but once again, this comes down to my own silliness.

I think 2013 is the year I learned that I constantly overextend myself, and that I don’t spend enough time on any single project or task. I will become more single-minded and focused for 2014, as I don’t have the time or money to really do otherwise.

What’s Next for 2014?

I have high hopes for 2014, in that I’d like to finally grow and improve in ways that I haven’t before. I want to become a better producer, I want to become a better developer, and I’d like to be a far more complete adult. (Sounds silly, but it is exactly what I need.)

I am going to get back into regular exercise, I will set aside time for honing my skills and building new ones, and I will spend far more time developing and polishing ideas.

Thank you for taking the time out of your day to reading my year in review, I appreciate it. Feel free to get in touch if you’d like to give me feedback, ask questions, or share your own insights and experiences. I am always open to good conversation.

I am looking forward to 2014, and I hope you are too.

Happy New Year friends and family, here’s to you and yours.

October French Month: Week 1 Progress Update

Jon Lim - Duolingo Week 1

Original Post: October Project: Back to Basics with French!

After spending 15 minutes every day on Duolingo, every day for a week, I seem to be making solid progress. Level 5 in French, completed Basics, Basics 2, Phrases, Food, Animals, and working on Adjectives and onward.

The most difficult part has been setting aside time every night to learn, with the necessary quiet and state of mind to be receptive to what I learn. However, it seems like I’ve settled on a nice rhythm so far and haven’t missed a day yet.

You can take a look at my progress (and my stream!) in French on my Duolingo profile.

Here we go, week 2.

October Project: Back to Basics with French!

October Project: French with Duolingo

My September project failed miserably – I hardly had time to spend thinking about structuring an entire website let alone having the brain power to do so. I can’t blame anyone but myself but I’m course correcting and picking projects that are much smaller in scope and therefore much easier to keep up with!

October Project: Spend 15 minutes every day learning French on Duolingo.

My time in Paris, albeit short, was very fruitful. My French slowly came back to me and, despite trouble with vocabulary, I managed to hold my own while out and about. I’d like to brush up on my skills and get to the point where I can have decent conversations in French.

I’ll try to make weekly updates with a new screenshot of my progress, but otherwise, you can look at my sad, sad progress on my Duolingo profile.

My Advice to Young Me

I’ve spent most of my teens and twenties listening to advice. Follow your passion, don’t give up on your dreams, do what makes you happy… I bet you’ve heard it all. Not that I’ve become jaded or cynical, but that sort of advice never served me well.

I’ve compiled some advice that I would have given to my younger self that’s actionable, that doesn’t set unrealistic expectations, and makes you better as a person in general. While I learned most of this as an entrepreneur and while working for startups, I would wager that this applies to anyone and everyone.

Be Resourceful

When you aren’t given clear instructions or a clear path to the end, you really need to get creative about getting things done. Build genuine relationships, don’t panic, and don’t be afraid to ask for help. If you aren’t taking a well-defined path in your life, you will need to create your own and that means being open to trying anything and everything at least once, and you’ll more likely enjoy the ride too.

Be Resilient

You’ll hear no. A lot. Or things will come crashing down. Or anything in between.

You have to learn that this isn’t awful, and you just have to brush yourself off and redouble your efforts. No one ever nailed everything on their first time, and you shouldn’t believe that you are the one exception to the rule – just believe that hard work will lead you to success.

Be Balanced

No one ever gained more productivity by working more hours in a day or ignoring their health or ignoring their friends and family. Spend time with family, friends, and loved ones. Keep them in the loop – they will centre you, humble you, and encourage you to be even better than you are now.

Be Yourself

There will be times where you feel like you need to change the way you are to fit in, you think you need to fake a certain personality type to feel at home. Don’t do that. Nobody wants to deal with the facade you raise for the sake of fitting in. The right people will accept you regardless of who you are. *cue after school music*

Be Better

The last thing you should ever become is satisfied with yourself and not spend more time becoming a better person in some shape, way, or form. You must constantly work hard to improve your mind, body, and soul – read books, learn languages, try new things, brighten other people’s days, and work hard to fix your flaws. This is slow, this can be painful, but it’s completely worth it.

Together, this advice forms the mantra that I live my life. Sure, it can seem like common sense to you and I, but it really doesn’t hit home until the day that you’ve set out on an entirely new path, the day that you realize that you’ve failed, the day that your friends feel like strangers, the day you feel like a stranger to yourself, and the day that you realize that you haven’t grown.

Thanks for reading, hope you found it useful.

Goodbye Nights: The Start of Small, Actionable Projects

Goodnight Moon

There are 24 hours in the day, and I already spend around 20 hours a day sleeping, eating, commuting, or working, which leaves just about 4 hours for whatever it is that I can cram in there.

Those 4 hours are about to disappear.

Starting in September (that’s THIS month!), I will be attempting to create small little projects for myself. These projects have to be completed by the month’s end, and I am giving myself the entirety of the month to design, build, and deliver these little projects.

This month, I’ve been toying with the concept of board game discovery.

I’ve long felt that board game websites, such as Board Game Geek, do a great job of providing the core game data along with a vibrant, and growing community. However, they fail to do any sort of good job at showing me board games similar to the board games I most enjoy. Their Top 100 list is too broad and general (and, occasionally too hardcore) for my tastes.

THE IDEA: Build a website that allows me to look at games, and immediately view related games. For example, taking a look at Zombicide, it would show me other games like Pandemic (because of the co-op factor) or Last Night on Earth (because of the zombie factor.)

The specifics have to be worked out, but of course, and I have the rest of the month to do so! If you’d like to make suggestions or even pitch in (however that maybe,) don’t be shy and reach out. I’m always happy to have collaborators.

This audacious idea of building a project a month is a healthy mix of wanting to fulfill my goal for 2013 of building more content, and simply wanting to solve some of my problems.

Will I make any money off of this? Ideally, but I’m not banking on it.
Will I become famous for this? Probably not.
Will I have fun?

You bet.

Gaming Links of the Week: May 19 to May 26

Frozen Synapse

Oh hello, didn’t see you come in there.

Lame jokes aside, I’m doing my best to get back into regular writing. It’s quite long overdue. To get the ball rolling, I’ve decided to commit to collecting interesting reads focused on game design and development because it knocks out two birds with one stone: it helps develop myself for my day job, and it provides reading fodder for people interested in joining the games industry.

Why Frozen Synapse Costs Money – Paul Taylor of Mode 7 Games, the developers behind Frozen Synapse, walk through the logic behind making the iOS release of Frozen Synapse a paid app, rather than following the trend of free-to-play in the mobile space.

OPINION: Paul makes an incredibly good argument, but as the free-to-play market continues to evolve, I am sure we’ll see plenty of examples of well-designed, AND well-monetized games. Especially in the mobile arena.

Nintendo grabs money, control from fans promoting its games on Youtube – The Penny Arcade Report summarizes the debacle surrounding Nintendo using YouTube’s copyrighted content system to claim all the advertising dollars from popular YouTubers publishing Let’s Play videos for Nintendo games.

OPINION: Nintendo have their heads up their asses. Perhaps they don’t need any promotional help with their games, but I am a firm believer that Let’s Play videos are one of the best ways to get your game some exposure and love from potential fans.

Why every developer should play Aliens: Colonial Marines – A writer from Novy PR discusses why playing through Aliens: Colonial Marines is a set of lessons on what NOT to do in game development.

OPINION: Aliens: Colonial Marines is a complete disaster, and these lessons are a must-read. However, any reasonable game designer or developer who wants to create a quality product wouldn’t have done any of these in the first place.

Hands on with Runescape 3: a brave new world – Nick Wilson from PCGamesN describes his experience with the up and coming Runescape 3, built in HTML5, coming this summer.

OPINION: This was a post that I almost didn’t include, but Runescape is a soft spot for myself because of all of the time I’ve spent playing in the past, and I really want to see how far they can push the in-browser MMORPG with the new Runescape.

Letting the Player Find the Fun – Ben Serviss discusses the power of ‘Discoverability’ in gaming and provides a few ideas on doing discoverability better in games today.

OPINION: It seems almost cliché to make fun of today’s games because of their hand-holding nature, but it’s sadly true. I miss the days of delight when finding a not-so-obvious secret and learning to play a game just by… playing the game.

Unlock your creative potential: 7 steps to becoming a game designer – Ethan Levy provides, via Slideshare presentation (it’s rather long!), what is involved in being a game designer, and 7 concrete steps that aspiring game designers can take to actually become game designers.

OPINION: A rather long watch, and not entirely perfect, but it’s always good to take in the opinions of other, more experienced game designers and learn from what they’ve learned.

Welp, that was fun! I’m constantly on the lookout for more interesting reads in the gaming world, so this is going to be a fun post to continue.

Thanks for reading!

Happiest of Birthdays

Birthday Cake

Yesterday was my birthday.

Typically, I don’t acknowledge it or make a big deal, but yesterday was really fun. It was a work day, but it was just so… satisfying. Here’s the gist of it:

  • Midnight phone call from my better half
  • Stepping out of my door and sinking into a full foot of snow
  • Seeing the stark contrast between our game today and our game two weeks ago
  • Logging onto Facebook, seeing dozens of messages, and logging right out of Facebook
  • Awesome birthday cake (as pictured above) from the team
  • First win at Pandemic, after two false starts, during game night
  • Arriving home to find my Kindle Paperwhite had arrived, two days after ordering

Just a happy, happy day.

Thank you to everyone who spent the time to say hello and send their love, I will respond to everything. I swear.

Eventually.