I was recently asked to write a piece for an organization on Digital Asset Management, and one of the requirements was that it be 400 words. I didn’t push back on this requirement, or even ask why. But it reminded me of negotiating with one of the fast-food-style video education aggregators, when they told me that my videos should be no longer than 5–7 minutes, because “no one will watch a video longer than 5 minutes.” Which of course is utterly ridiculous. You get what you ask for… meaning, if you pander to an audience that isn’t looking for comprehensive video education, then you will get what you deserve: an audience that wants fast-food video, pre-digested, and then spit out with a barnyard sound.
Despite that limitation I thought a bit about what I wanted to write for this particular org, and decided it was time to share a little project I’ve been working on. It’s a bash script that automates the process of downloading photos, synchronizing camera clocks, adjusting timestamps, renaming files, and making backups. It was a fun and interesting project, because projects like this force me to formalize my thinking… or my process, meaning: organize it in such a way that it can be scripted, and then taught.
Not an easy thing to describe in 400 words!
Anyway, that’s what I wanted to write about, so that’s what I wrote. And by cutting every corner imaginable—while trying to preserve at least the core of the idea—I ended up at 1100 words. Which didn’t fly.
OK, that’s understandable. I would have rejected it too. It’s vague, out-of-context, and incomplete. But I still feel that it should be published, so here is (almost word-for-word) what was submitted.
Workflow is hard.
Well, let me take that back. My workflow is not hard. It’s easy. In fact, it’s so easy that it’s essentially finished at the push of a button.
But let’s back up a little, just to be sure we’re all talking about the same thing. In this article, when I talk about workflow, I’m talking about your basic import workflow. How you get your pictures downloaded, organized, renamed, stamped with basic metadata and keywords, and backed up. In this article, I’m NOT talking about how you identify your best shots, or process them, or export them for your clients. That becomes a much more subjective process that is very difficult to formalize.
What I’m talking about here is the part that you can formalize. And that part consists of the repetitive steps you take (or should be taking) each and every time you sit down to download new photos. How do you go about formalizing a workflow? I admit, this is not the easiest part of the process, because it requires working backwards. You have to start at the end: define what the finished library should look like, then work out the steps that get you there.
This process takes time. You don’t just come to it one morning, and decide you can work out the steps. Workflows evolve, and you only come to a series of discrete workflow steps after doing it a bunch of times, and after making a lot of mistakes. To make things even more complicated, formalizing an import workflow requires taking into account a lot of disconnected, but interrelated pieces.
To start that process, I began by thinking about my goals: what are my requirements for the end game? After working on my own Library organization—as well as teaching it for a number of years—I know that I want my entire photo library organized chronologically, with a very specific folder naming routine. The thinking behind why I feel chronological organization works is way out of scope for this article, but I’ve written extensively about it here and here.
I also have very well-defined ideas about what’s important in file names, which I’ve written about here and here. Basically, the timestamp (YYYY MM DD HH MM SS, plus the GMT offset) becomes my filename sequence number. (After all, why dream up my own subjective sequence numbering system when we already have a universal one?) Further, whenever I’m traveling I always record a GPS tracklog that I store with the photos, and I use that tracklog to geotag every photo I shoot on location, which I’ve written a bit about here. I’m a bit fanatical about making sure my geotags are right on the money, and when traveling I always shoot with at least 2, and sometimes 3 or 4, camera bodies. Shooting with more than one camera and expecting your photos to always sort chronologically by the one truly universal piece of metadata across ALL operating systems and file formats (the file name!) requires that every camera clock be accurate and perfectly synchronized.
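As a rough illustration of the idea (assuming bash; the timestamp, offset, and naming pattern below are examples, not my exact scheme), here is how an EXIF-style timestamp plus a GMT offset turns into a filename that sorts chronologically:

```shell
# Illustrative only: build a sortable filename from an EXIF-style timestamp
# and a recorded GMT offset. The pattern is an example, not a prescription.
exif_ts="2024:05:01 14:03:22"   # EXIF DateTimeOriginal format
gmt_offset="-0400"              # GMT offset recorded for the shoot
stamp="${exif_ts//:/}"          # strip colons   -> "20240501 140322"
stamp="${stamp// /-}"           # space to dash  -> "20240501-140322"
new_name="${stamp}${gmt_offset}_0001.cr2"
echo "$new_name"                # 20240501-140322-0400_0001.cr2
```

Because the string begins with year, month, day, and time, a plain alphabetical sort of the filenames is also a chronological sort.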
The difficult thing about camera clocks is that they all drift a little bit, and to make matters worse, they all drift at slightly different rates. Further, I absolutely hate trying to get them all synchronized to the second (almost impossible) for the geotagging and file naming before every trip. All of this has led me to a system where I never set my camera clocks for local time, but always leave all of them set to UTC. My system of making sure each camera’s clock is perfectly synchronized meant that I was adjusting the timestamps for every single outing anyway, so I might as well correct for the local time zone at the same step in the workflow, which I’ve written a bit about here.
Whew! Still with me? OK, that’s not an exhaustive list of my end game requirements, but I think it probably gives you the idea. To make it all work, I developed a system of creating and applying time zone offsets and clock corrections that can be made more or less “automatically” by simply photographing a synchronized clock on my phone or computer screen at the end of every camera card that I shoot.
The final piece of the puzzle was formalizing the series of steps required for import and backup as a bash (UNIX) script. I plug in a camera card and fire up the script in a terminal window. The script opens the last photo on the card so that I can see the image, and asks me for two things: 1) the local time that I see in the photo, and 2) a folder name and destination for the final photos. The script then uses my system clock to work out where I am, and computes an offset for each individual card download that accounts for both the local time zone and that camera’s clock drift. It makes backup copies using rsync (with checksum verification), calculating and storing checksums from the actual camera card during the download. It renames the files, and puts them all where they need to be.
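The arithmetic at the heart of that offset step can be sketched in a few lines of bash (GNU date assumed; the timestamps and variable names here are made-up examples, not the script’s actual internals):

```shell
# Sketch of the offset computation: the camera's EXIF time (camera set to
# UTC, including any drift) minus the true local time typed in from the
# clock photo yields one shift that corrects time zone and drift together.
camera_ts="2024:05:01 14:03:27"    # EXIF time from the clock photo on the card
typed_local="2024-05-01 10:03:22"  # true local time read from that photo
iso_cam="${camera_ts/:/-}"         # convert EXIF date separators to ISO
iso_cam="${iso_cam/:/-}"
cam_epoch=$(date -u -d "$iso_cam" +%s)
loc_epoch=$(date -u -d "$typed_local" +%s)
shift_sec=$(( loc_epoch - cam_epoch ))  # here: -14405 (UTC-4, plus 5 s drift)
echo "apply shift: ${shift_sec} seconds"
```

A tool such as exiftool can then apply a shift like this to every file from the card (its `-AllDates` option accepts positive or negative adjustments).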
Capturing and storing the checksums directly from the camera card allows me to go back for verification at any time in the future, even after the files have migrated from drive to drive across multiple backups. This helps me detect bitrot and a host of other potential causes of corruption, and it covers the entire raw file, not just the embedded raw image data (which is all that DNG validation checks). And the checksums are captured at the one point in the workflow where I visually inspect each and every frame shot, which is the moment you are most likely to spot corruption and be able to identify its genesis.
It’s all wrapped up in one simple bash script, but writing the script wasn’t the hard part. The hard part was looking at what I do each and every time I sit down to download photos. Formalizing the exact steps and sequence required to take me from A to B ensures that it happens in precisely the same way every time, and it forced me to streamline, eliminating or correcting any flawed pieces of the process.