Programming Journey Entry 7: That time I wrote a PowerShell script

All of these Programming Journey posts can be found in the associated category of this blog.


PowerShell, like programming

As mentioned in the last programming journey post, I wanted to get into PowerShell as a language. Sometimes I still do some Python, but the last few months have really been about PowerShell, version 7.x in particular.


A Simple Idea

The idea was to write a script that zipped the games of my Steam game library into individual zip files, so I could have an offline version of them. Steam has a facility like this built in, but you can only back up one game at a time and only through the GUI. And you can only restore via the GUI one at a time as well.

So the idea was to loop through the game library, zip each folder with an appropriate name, and deal with extra details like outdated zips and duplicate zips.

Well, I’m oversimplifying a bit. But that’s basically the script: zip game folder, store it in destination.

The complexity comes from the details when the destination already has zip files. Games are constantly getting updated and the Compress-Archive cmdlet doesn’t do “delta updates” (meaning if a 200 gigabyte game gets a 50 KB update, all 200 gigabytes have to be zipped again from scratch). So for that and many other reasons I decided to put a date code into the zip file name in the form MMddyyyy (month, day, year), which the script gets from the folder’s last write date as reported by the filesystem/OS.

So the script has to look at the source and destination folders, determine the dates of the zips based on the file names, compare those dates to the write dates of the game folders, determine which folders need a new zip and which ones don’t, and then start zipping.
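A minimal sketch of that comparison, for illustration only. The function names Get-ZipDateCode and Test-NeedsNewZip are hypothetical and not taken from the actual script, and the regex assumes zip names shaped like <Game_Name>_<MMddyyyy>_<platform>.zip.

```powershell
# Hypothetical helper: pull the MMddyyyy date code out of a zip file name.
# Assumes names shaped like <Game_Name>_<MMddyyyy>_<platform>.zip
function Get-ZipDateCode {
    param([Parameter(Mandatory)][string]$ZipName)

    if ($ZipName -match '_(\d{8})_[A-Za-z]+\.zip$') {
        $parsed = [datetime]::MinValue
        $ok = [datetime]::TryParseExact(
            $Matches[1], 'MMddyyyy',
            [cultureinfo]::InvariantCulture,
            [System.Globalization.DateTimeStyles]::None,
            [ref]$parsed)
        if ($ok) { return $parsed }
    }
    return $null   # no usable date code in the name
}

# Hypothetical helper: does this folder need a fresh zip?
function Test-NeedsNewZip {
    param(
        [Parameter(Mandatory)][datetime]$FolderLastWrite,
        [Parameter(Mandatory)][string]$ZipName
    )
    $zipDate = Get-ZipDateCode -ZipName $ZipName
    if ($null -eq $zipDate) { return $true }

    # The date code carries no time of day, so compare calendar dates only
    return $FolderLastWrite.Date -gt $zipDate
}
```

Comparing on .Date means a folder written later on the same day as its zip does not trigger a new zip, which is a limitation of a day-level date code rather than of the comparison.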

Which brings up the obvious question: if there’s already a zip, what gets done with it? Keep it? Delete it? Well, I decided to delete it unless a -keepduplicates parameter is included. That way the people who want two essentially identical zip files of 100+ gigabytes stored can do that, and everybody else doesn’t have to worry about it.

I actually adopted what is probably an odd layout for a PS script: using a main function. At least I assume that was unorthodox, as it seems like something more for Python or C programs. It did help with the use of Start-Transcript/Stop-Transcript.
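A rough sketch of why a main function pairs well with transcripts: with all the work in one function, the transcript can be opened before it and closed in a finally block. The name Invoke-Main and the log path are placeholders, not what the script actually uses.

```powershell
# Sketch of a "main function" layout; the name and log path are placeholders.
function Invoke-Main {
    param(
        [string]$LogPath = (Join-Path ([System.IO.Path]::GetTempPath()) 'transcript-demo.log')
    )

    Start-Transcript -Path $LogPath | Out-Null
    try {
        # All of the real work would be called from here
        Write-Output 'Doing the real work here...'
    }
    finally {
        # Runs even if the work above throws, so the transcript always gets closed
        Stop-Transcript | Out-Null
    }
}

Invoke-Main
```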


Creeping features of the Creeping

The description so far doesn’t sound that bad. But I started thinking of new features.

Actually at one point I attempted to create another script to automate the testing of the primary script. That didn’t go well and I abandoned it.

To kind of jump to the chase here, I ended up with these parameters:

  • Required source and destination. That goes without saying.
  • Use a text file as the source for selective archiving
  • Save a set of parameters to a text file and use that as an “answer file”
  • a debugmode, which I used to create 0 byte “stub” files to test the script’s ability to sort through zip files and deal with duplicates without having to wait every time
  • a verbmode – verbosity – to show more details than anyone besides a developer could possibly need.
  • There’s also the -WhatIf and -help parameters, just to try and make it at least a little bit PowerShell-like. Although help doesn’t seem to actually list details as it’s supposed to. I haven’t figured out if that’s an issue with my computer or a PS issue or a script issue.
  • Also I created a list of platforms besides Steam (Epic, Amazon, GOG, Origin) so those can be appended to the end of the zip name instead of steam.
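For reference, PowerShell's built-in route to -WhatIf is SupportsShouldProcess, and Get-Help reads comment-based help placed inside a function or at the top of a script. The sketch below shows both on a made-up function. Remove-OldZip is hypothetical, and this is not a diagnosis of why help isn't showing details in the script.

```powershell
function Remove-OldZip {
    <#
    .SYNOPSIS
    Deletes an outdated zip file (illustrative only).
    .PARAMETER Path
    Full path of the zip file to delete.
    #>
    [CmdletBinding(SupportsShouldProcess)]
    param([Parameter(Mandatory)][string]$Path)

    # ShouldProcess returns $false when -WhatIf is passed, so nothing is deleted
    if ($PSCmdlet.ShouldProcess($Path, 'Delete outdated zip')) {
        Remove-Item -LiteralPath $Path
    }
}

# Demo with a throwaway file
$demo = Join-Path ([System.IO.Path]::GetTempPath()) 'whatif-demo.zip'
New-Item -ItemType File -Path $demo -Force | Out-Null

Remove-OldZip -Path $demo -WhatIf              # prints a "What if:" line, file stays
$stillThere = Test-Path -LiteralPath $demo

Remove-OldZip -Path $demo                      # actually deletes the file
$goneNow = -not (Test-Path -LiteralPath $demo)
```

With this in place, Get-Help Remove-OldZip -Full would show the synopsis and parameter text from the comment block.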

That was all before I added the parallel parameter. The idea was to make it possible to zip more than one folder at the same time. But the method I used was apparently the worst, insofar as it made my PC power itself off completely.
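For context, one way PowerShell 7 offers to run work in parallel is ForEach-Object -Parallel, where -ThrottleLimit caps how many items run at once. This is only a sketch of that option and not necessarily the method the script used. The folder names are stand-ins and a short sleep takes the place of the actual zipping.

```powershell
# Requires PowerShell 7+. The folder list and the "work" are stand-ins.
$folders = 'Game A', 'Game B', 'Game C', 'Game D'

$results = $folders | ForEach-Object -ThrottleLimit 2 -Parallel {
    # $_ is the current folder name; a real script would call Compress-Archive here
    Start-Sleep -Milliseconds 100
    "zipped: $_"
}

# Results can come back in any order
$results
```

A low throttle limit matters for something as heavy as compression, since each parallel job competes for CPU and disk.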

I actually refactored this script multiple times during development. In fact I did a lot of refactoring before I even got to the Compress-Archive part of the script.

Actually, going back even further, I tried forever to get the date stamp/date code comparison (and take action accordingly) part of the script to work. But I kept getting frustrated and didn’t make a lot of progress. Then more than a month would go by and I would try again.

But for this current version I decided on a different approach: hash tables. Or more specifically, create a hash table containing the source folders and their date stamps, plus the zip files in the destination folder, as kind of a “snapshot” of the two at the start of the script. Then filter down the folders that need a zip file, step by step, until a final list is assembled to send to Compress-Archive.

For instance, the first step is to remove any empty folders from the source list of folders, as there’s no need to zip an empty folder. Then create a list of hypothetical zip names based on source folders and see if there are any exact string matches, just to make a list of them. Then see if there are any matches of just the names of zip files that would match source folder names with the underscores inserted (bit Dungeon II paired with a zip starting with bit_Dungeon_II). Then work on the single matches of those versus multiple matches (bit_Dungeon_II with different date codes). I ended up with a total of four tables as I whittled down the folders and needed zip files.

There’s probably a better way to do that. But at least for development that was working well.
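A condensed sketch of the snapshot-then-filter idea, covering only the first two steps (drop empty folders, then drop folders whose expected zip already exists). All folder names, variable names and the temp layout are made up for the demo, and it is not the script's actual code.

```powershell
# Throwaway layout for the demo; every name here is made up
$root = Join-Path ([System.IO.Path]::GetTempPath()) ("snapshot-demo-" + [guid]::NewGuid())
$src  = (New-Item -ItemType Directory -Path (Join-Path $root 'source') -Force).FullName
$dst  = (New-Item -ItemType Directory -Path (Join-Path $root 'dest') -Force).FullName

foreach ($name in 'bit Dungeon II', 'Empty Game', 'Other Game') {
    New-Item -ItemType Directory -Path (Join-Path $src $name) | Out-Null
}
foreach ($name in 'bit Dungeon II', 'Other Game') {
    # Three-argument Join-Path needs PowerShell 6 or later
    Set-Content -LiteralPath (Join-Path $src $name 'game.dat') -Value 'x'
}
(Get-Item -LiteralPath (Join-Path $src 'bit Dungeon II')).LastWriteTime = [datetime]'2024-11-01'
New-Item -ItemType File -Path (Join-Path $dst 'bit_Dungeon_II_11012024_steam.zip') | Out-Null

# Snapshot: folder name -> last write date, plus the zip names already present
$sourceTable = @{}
Get-ChildItem -LiteralPath $src -Directory | ForEach-Object {
    $sourceTable[$_.Name] = $_.LastWriteTime
}
$zipNames = @(Get-ChildItem -LiteralPath $dst -Filter '*.zip' | ForEach-Object Name)

# Step 1: drop empty folders (iterate over a copy of the keys while removing)
foreach ($name in @($sourceTable.Keys)) {
    $firstItem = Get-ChildItem -LiteralPath (Join-Path $src $name) -Force | Select-Object -First 1
    if (-not $firstItem) { $sourceTable.Remove($name) }
}

# Step 2: keep only folders whose expected zip name is not already in the destination
$needsZip = @{}
foreach ($name in $sourceTable.Keys) {
    $dateCode = $sourceTable[$name].ToString('MMddyyyy', [cultureinfo]::InvariantCulture)
    $expected = '{0}_{1}_steam.zip' -f ($name -replace ' ', '_'), $dateCode
    if ($zipNames -notcontains $expected) { $needsZip[$name] = $expected }
}

$needsZip    # folder name -> zip file that would be created

Remove-Item -LiteralPath $root -Recurse -Force
```

In this demo only Other Game ends up in $needsZip: Empty Game is dropped for being empty and bit Dungeon II already has a zip with a matching date code.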

For a specific example, the bit Dungeon II folder with a last write date of 11/1/2024 installed on Steam would end up with the zip file name bit_Dungeon_II_11012024_steam.zip.
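That naming rule can be sketched in a few lines. Get-ZipFileName and its parameter names are hypothetical, chosen just for this illustration.

```powershell
# Hypothetical helper: build <Folder_Name>_<MMddyyyy>_<platform>.zip
function Get-ZipFileName {
    param(
        [Parameter(Mandatory)][string]$FolderName,
        [Parameter(Mandatory)][datetime]$LastWrite,
        [string]$Platform = 'steam'
    )
    $safeName = $FolderName -replace ' ', '_'      # spaces become underscores
    $dateCode = $LastWrite.ToString('MMddyyyy', [cultureinfo]::InvariantCulture)
    return "${safeName}_${dateCode}_${Platform}.zip"
}

Get-ZipFileName -FolderName 'bit Dungeon II' -LastWrite ([datetime]'2024-11-01')
# bit_Dungeon_II_11012024_steam.zip
```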

The workflow I developed

I did certainly learn a few things over the time developing this script. For one thing, since PowerShell is a shell scripting language, running the script over and over isn’t going to reset any variables I set. So I came up with this line to execute the script:

pwsh -command "& { .\steamzipper.ps1 -destinationFolder 'P:\steamzipper\zip test output' }"

This actually brings up a new PowerShell shell, runs the script and drops back to the first shell. The new shell takes its variable values with it when it exits.

It occurs to me this may need additional explanation: in command line shells like cmd, Bash or PowerShell, variable values can be set like in any programming language. But shells are unusual in that, depending on how the script is run, the values can still be there in the session after the script has finished running. So say I used a variable to store a path and later forgot about it (deleted that line), then used a script parameter to set that path. I may end up with a conflict of values and the script not working, because it didn't occur to me the old variable value still exists (this makes debugging that much harder). So by using pwsh I'm "creating an instance" of the shell for the length of a single script run, and the variable values expire when the script finishes running. I hope that makes more sense.
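How long a value sticks around depends on how the code is launched. In PowerShell, the call operator (&) runs code in a child scope, while dot-sourcing (.) runs it in the current scope, so anything it sets stays behind. A small demonstration using script blocks in place of a script file (the variable name is made up):

```powershell
# Call operator: the block runs in its own child scope
& { $leftover = 'set inside child scope' }
$afterCall = Get-Variable -Name leftover -ErrorAction SilentlyContinue
# $afterCall is empty here, assuming $leftover was not already defined

# Dot-sourcing: the same kind of block runs in the current scope and the value sticks
. { $leftover = 'set by dot-sourcing' }
$afterDot = Get-Variable -Name leftover -ErrorAction SilentlyContinue
# $afterDot.Value is 'set by dot-sourcing'
```

Launching a separate pwsh process goes one step further than either: nothing from that run can leak back into the calling session.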

I was also using some if/else statements to test whether variables had a value and only then setting them, and also deleting variable values at the end of the script. Just to avoid inadvertently getting bad results because a variable value was held over from the last time the script was run.

I also added Start-Transcript and Stop-Transcript so the program itself would record what it was doing and this file could be parsed.

Over time I came up with a “test dataset” of source folders and destination zip files I would reset to every time the script was run. I was still manually resetting to that state but at least I had it. I had already learned of a PS cmdlet to manually change last write dates on folders. So it was just a matter of setting those dates to past, present and future, plus adding some empty folders, to test the script.
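One way to build that kind of test dataset is to create the folders and assign LastWriteTime directly on the folder object. This is a sketch with made-up folder names, and it uses property assignment, which is not necessarily the cmdlet mentioned above.

```powershell
# Throwaway test folders with past, present and future write dates (names are made up)
$testRoot = Join-Path ([System.IO.Path]::GetTempPath()) ("lastwrite-demo-" + [guid]::NewGuid())

$dates = [ordered]@{
    'Old Game'    = (Get-Date).AddYears(-1)
    'Today Game'  = (Get-Date)
    'Future Game' = (Get-Date).AddYears(1)
}

foreach ($name in $dates.Keys) {
    $dir = New-Item -ItemType Directory -Path (Join-Path $testRoot $name) -Force
    $dir.LastWriteTime = $dates[$name]     # overwrite the folder's last write date
}

# An empty folder with no special date, for the "skip empty folders" case
New-Item -ItemType Directory -Path (Join-Path $testRoot 'Empty Game') -Force | Out-Null

Get-ChildItem -LiteralPath $testRoot -Directory | Select-Object Name, LastWriteTime
```

Because the layout is generated from a table, resetting to the known state is just a matter of deleting $testRoot and running the block again.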

And most recently I started using plain old redirect-to-text-file to capture console output, because the transcript was not recording things like the total run time of the script when using parallel. That would be like the script recording itself, which seems like it wouldn’t be accurate.
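A small sketch of the redirect approach: *> sends every output stream to a file, and Measure-Command times the run from the outside so the total can be appended afterwards. The script block stands in for the real script, and the log file name is a placeholder.

```powershell
$log = Join-Path ([System.IO.Path]::GetTempPath()) 'console-capture-demo.txt'

# Time the run from outside; the script block is a stand-in for the real script
$elapsed = Measure-Command {
    & {
        Write-Output  'normal output'
        Write-Warning 'a warning'
        Write-Host    'host output'
    } *> $log          # *> redirects all streams (output, warning, information, ...)
}

# Append the total run time once the work has finished
Add-Content -LiteralPath $log -Value ('Total run time: {0:N2} seconds' -f $elapsed.TotalSeconds)
```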

A new beginning?

So that is the last…I don’t know at least a year…worth of work summarized in a few paragraphs. The current version of the script somewhat works with parallel specified but has apparently lost its ability to deal with duplicate zip files. So it’s kind of broken. But it’s only somewhat broken in parallel so it’s better.

What I think I’m going to do now is re-write it from scratch but start from “the other end”. In other words instead of adding the ability to zip multiple folders at once at the end, do that first. And instead of adding the -whatif block as a last step make that work from the beginning.

And also I want to make it into a module so that it can go up on the PowerShell Gallery, so I’ll make sure it works that way from the first lines.

I’m also planning to re-attempt a testing script. Maybe not a standard testing script, or anything like a CI/CD (which I only barely understand) kind of testing suite, but a script nonetheless. Much better to have it working at the start than try to add it halfway through.


So this is actually my plan for the future: re-write the script from scratch using what I’ve learned so far as a reference. And also document my progress to some degree on this blog. We can “learn together” as I fail, fail, fail and fail again. You’re welcome.

Oh and I wasn’t sure if I was going to share (and you could find it if you looked) but here’s the repo of the current script. Sorry for the chaotic readme file.

SteamZipper
GitHub Repo:
https://github.com/tildesarecool/SteamZipper

Okay..fine. I wrote it with an LLM. You suspected it, you got it right.

