I have duplicated and shared a read-only copy of my Steam Library GSheet as of April 2019, which should make this post easier to follow. The “Raw Data” tab way over on the right is what I will be referring to.
I track various video game collections through various means. Even with importing my Steam library the process of tracking the collecting still came down to something of a boring data entry job.
I’ve separated what would be generically called “the collection” into two parts: my physical console game collection and my entirely digital Steam library (there’s also some on other services like GoG and Ubisoft but we’ll largely ignore those for now).
This post with be entirely about my Steam/digital PC game collection.
You might think those handy utility web sites that conveniently export your Steam library into a CSV or some other format is all that’s needed and no need for anything further. But this is somewhere between inaccurate and grossly under appreciating what is required. Exporting as CSV is a good starting point, but there is still a lot further to go.
I opened and converted my generated CSV file to a native Google Sheet then started trying to learn how to use Google Sheets. Luckily this a somewhat common practice among enthusiasts so I had some sample sheets to draw on.
Google Sheets is almost but not quite compatible with Excel from a usability standpoint. For instance the way formulas work is close enough an Excel user won’t find it difficult to adapt those skills but it’s not similar enough an Excel user could use Sheets without consulting GSheets help (which is very good) every so often.
There were a couple of primary goals with the creation of this spreadsheet:
- avoid purchasing a duplicate game on either another service like GoG or another platform like PS3 (and vice versa)
- to know definitively what I have and things like DLC and
- to collect meta data for this hypothetical arcade cab project I’ll have finished any year now.
The primary reason was actually to collect meta data (some not related to the arcade cab) which includes but is not limited to:
- will it work with the digital pad or “HAT” only
- will it work/is it playable with a low-res 4:3 LCD screen
- what can I expect in the way of various runtimes
- will it run on low-end Atom SoC hardware
- and perhaps most importantly would it make a good arcade game.
After all if it meets all the criteria but it’s an RPG or otherwise really slow moving and story-oriented who wants to play that on an arcade cab? I wanted to compile a list of games out of my collection, really curate it down to my “definitive list” going forward.
I haven’t really kept up with collecting this information as I kept getting more games via humble bundle however. Thus the reason for all the (Pending) entries.
So once I had the “Raw Data” from my collection I need to start filtering the games down. The most obvious place to start was the part of the collection that are, you know, games. Not applications (I own a couple), not soundtracks, not DLC, not duplicates, not servers and beta versions of games I never asked for.
The first thing I found from the Google Sheets documentation is called unique. You give unique a cell range and it conveniently drops all the identically named entries out for you. In fact you can point unique at a different tab and it will fill in the cell range of the tab starting from the point you enter the formula.
I’m not sure that was clear so let me try explaining this again. If you:
- started with a tab filled with hundreds of rows worth of data called Raw Data
- then created a new blank tab and used the =unique in cell A2 to bring in the data from a specific cell range like d2:d200 for instance
This new blank sheet will contain only one actual bit of data in it: the unique function that points to cells d2:d200 entered into cell A2. For example:
But that only gets the data to appear. To filter it for games only you have to give it some specific criteria.
And that is what another function is for, filter. Like UNIQUE, it takes in a cell range but it also requires a condition as a second parameter. In this instance the D column contains my categorization type keywords and I want to only see games from that column D.
Lucky for us we can “nest” function calls. After all UNIQUE doesn’t know or care what happens to the data you pass it before it arrives, as long valid data gets passed in for it to do its thing. So inside UNIQUE will be a use of FILTER along with a cell range and a condition against that column D mentioned earlier:
=unique(filter('Raw Data'!A2:D5,'Raw Data'!D2:D5="Game"))
In case it may not be clear to anyone, let me explain this line bit-by-bit. These functions start from the inner most and work outward: filter comes first and only then does it fall back to unique. Filter is given the name of the tab with the data, called “Raw Data” which is passed the cell range with an exclamation point, as in ‘Raw Data’!D2:D200. Next a condition is passed in, this case a string consisting of the word “Game”.
To bring it all together, we’re bringing in the unique values on the “Raw Data” tab specifically the range D2:D200, and filtering those values based upon the the string Games. I don’t have any idea if I over- or under-explained that.
This is something you can try yourself in Google Sheets by the way with a completely blank Google Sheet: copy somewhere between 1 and all the rows of data on my “Raw Data” tab into the blank sheet and name that tab “Raw Data” (the space matters and I believe it is case sensitive). Then on a new tab (at the bottom) in the same sheet paste in the formula. You can add one function at a time or do the whole thing. I really hope this all made sense.
The above formula might seem a bit confusing since it has that same
line twice next to each other. Well there’s an easy remedy for this: named cell ranges.
All this really does is allow for referencing a defined set of cells by name instead of specifying the actual range.
On my shared spreadsheet I confusingly named my range RawData, and I haven’t bothered to try and rename it. So the range of all my game-related columns, except the column headers, can be referenced as RawData rather than the more cumbersome Raw Data’!A2:P2000.
To name a range on your Raw Data tab, you would select a range of of data, say A2 through P5: the entirety of your actual raw data in other words. Then go the Data menu and select Named Ranges. A settings column will slide out from the right side with the range you selected already filled in for you. In my case I named it RawData like I said, you can use any name you want (and adjust examples below). My columns happen to go all the way to at A2 amd go through P, so to try and future-proof I often span down to row 2000 (the “RawData” named range is A2:P2000, in other words).
So my final, relatively simple formula on this first tab tailored to filter down my Steam game data comes out to:
=UNIQUE(filter(RawData,'Raw Data'!D2:D2000 ="Game"))
This was originally just going to be one post about my Steam library spreadsheet. But now that I have introduce the basics of the why and some of the how I think I’ll split the rest into a part 2.