Proposed Image Hierarchy File Name Encoding Scheme

Discussion in 'Digital Photography' started by Grant Robertson, May 13, 2004.

  1. I have been reading posts in here and many other newsgroups about image
    file organization. As a result, I started thinking about how I would keep
    track of all the image files I will likely be generating. These files
    will be coming directly from my digital camera, from my scanner, and as a
    result of modifications made on the way to an ideal final image. It is
    possible for these modifications to follow multiple possible paths as I
    experiment with what is an appropriate final adjustment. In addition,
    many images may be the starting point for many different final images
    such as when I decide to crop down to a single subject within the image
    and prep that for printing or display. There could be an entire hierarchy
    of images based on other images as I step through my work flow as well as
    multiple branches coming off from various points as I think of new things
    to do with existing images that are themselves results of various stages
    of modification. I quickly realized it would be imperative for me to use
    a concise, consistent naming system to keep track of all the different
    variations. I have searched through old messages in almost a dozen
    newsgroups, Googled till I was blue in the face, and even asked graphic
    artists I know. It seems that no one has such a system.

    Almost everyone just seems to guess their way through it as they go
    along, using incredibly long file names, and trying to describe all
    their modifications within the name itself. However, this leaves them
    with a directory full of randomly named files and no easy way to sort
    them by hierarchy. Some people use deeply nested subfolders with
    descriptively named directories. But this leaves them with a dozen or so
    identically named files and an incredibly long path name. This then
    causes them trouble when they go to archive these off onto CD because
    there are limits on the sub-folder depth and path name length.

    So, I decided to put my incredible skills in organizational system design
    to the test and see if I could come up with something. Two days, one
    white board, one Tablet PC, and six mochas later, I think I have a
    solution. I have spent the last few days just trying to figure out how to
    describe it clearly.


    The Goal: An encoding scheme that would...
    1) Clearly indicate where in the hierarchy a particular image
    file belonged.
    2) Use as few characters as possible.
    3) Allow all related images to be stored in the same directory.
    4) Sort the files properly when using normal ASCII file name sorting.
    5) Be flexible enough that one could pick any image in the hierarchy
    and branch off with further modifications without screwing up the
    coding scheme.

    Important Note:
    This scheme is not meant to replace the entire file name. The characters
    comprising this encoded data are meant to be inserted in the last part of
    the file name just before the file extension. The main part of the file
    name would still use whatever system you use to serialize your original
    images. I will call this part of the name your 'BaseName'. The BaseName
    will be the same for all images that are the result of modifications to
    the same original image file.
    Also, since many image editing programs now allow for lossless rotation
    of .JPG files, that first step of rotating to the proper orientation will
    not be considered a modification of the original file. It will still
    maintain its status as the 'original'. Just be sure your software really
    does true lossless rotation and doesn't delete any metadata in the
    process.


    The System:
    The structure of the file name will be as follows...

    BaseName.HhHhHh.uuu.xxx

    Where
      HhHhHh = the hierarchy code. This may be as long as necessary.
      uuu    = the use code (or a special modification of a final image
               for a particular use such as simple resizing for e-mail
               or web use). This portion will only be inserted into the
               file name for these special purpose files and only if you
               choose to save them at all. You can use any codes you like.
      xxx    = the regular file extension. It's OK if it is longer than 3
               characters.

    The original file will be named 'BaseName.!Org.xxx'. The exclamation
    point causes it to sort to the top of the list of all the files based
    upon it. The 'Org' just reminds you or any casual observer that this is
    the original image file.
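
    For anyone who wants to script this, here is a minimal Python sketch of
    splitting a name into those parts. The function name parse_name is
    hypothetical, and the sketch assumes the BaseName itself contains no dots
    and that the hierarchy code sticks to digit-then-capital-letter pairs.

```python
import re

# Hypothetical helper, not part of the scheme as posted.
# Assumption: one digit followed by one capital letter, repeated.
HIERARCHY = re.compile(r"(?:[0-9][A-Z])+")

def parse_name(file_name):
    """Split 'BaseName.HhHhHh.uuu.xxx' into its parts (uuu is optional)."""
    parts = file_name.split(".")
    if len(parts) == 3:
        base, code, ext = parts
        use = None
    elif len(parts) == 4:
        base, code, use, ext = parts
    else:
        raise ValueError(f"unexpected name: {file_name!r}")
    # '!Org' marks the original file; anything else must be a pair sequence.
    if code != "!Org" and not HIERARCHY.fullmatch(code):
        raise ValueError(f"bad hierarchy code: {code!r}")
    return {"base": base, "code": code, "use": use, "ext": ext}
```

    A name with no use code simply comes back with use set to None.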

    Naturally, this system won't work for DOS-based 8.3 file names, but all
    you can squeeze into that small space is a straight sequence of serial
    numbers anyway.


    The Hierarchy Code:
    (Not quite as interesting as The Bible Code or The Da Vinci Code but
    useful nonetheless.)

    The hierarchy code consists of pairs of characters. The first character
    in a pair is a number and the second character is a letter such as '1A'
    or '3H'. The number indicates the branch or path followed from the
    original or parent image and the letter indicates how many sequential
    steps along in the work flow it is. Each time you branch off from an
    existing sequence you add an additional pair of characters. You will see
    how you can do a heck of a lot of variations and still end up with
    hierarchy codes that are only 6 characters long.
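
    To make the pairing concrete, here is a small Python sketch that breaks
    a hierarchy code into its (branch, step) pairs. The name split_pairs is
    hypothetical, and it assumes single digits for branches and single
    capital letters for steps.

```python
import string

def split_pairs(code):
    """Break a code such as '2B1B1A' into (branch number, step letter) pairs."""
    if not code or len(code) % 2:
        raise ValueError(f"code must be a whole number of pairs: {code!r}")
    pairs = []
    for i in range(0, len(code), 2):
        branch, step = code[i], code[i + 1]
        if branch not in string.digits or step not in string.ascii_uppercase:
            raise ValueError(f"bad pair {branch + step!r} in {code!r}")
        pairs.append((int(branch), step))
    return pairs
```

    The number of pairs is the number of times you have branched to reach
    that image, so a 6 character code is three levels deep.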

    For instance, let's say you are starting out with your original image
    file (BaseName.!Org.jpg) and you normally do several different things to
    it to start getting it ready for use. You convert it to a .TIF file
    (BaseName.1A.tif), then you despeckle it (BaseName.1B.tif), finally you
    adjust the color (BaseName.1C.tif).
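
    Stepping along a sequence just bumps the last letter. A minimal Python
    sketch, with next_step as a hypothetical name and the assumption that a
    sequence never runs past step Z:

```python
def next_step(code):
    """Return the code for the next sequential step, e.g. '1A' -> '1B'."""
    step = code[-1]
    if not "A" <= step < "Z":
        raise ValueError(f"cannot step past {code!r}")
    # Keep every earlier pair and the branch digit; advance only the letter.
    return code[:-1] + chr(ord(step) + 1)
```

    So the three files in the example are next_step applied twice, starting
    from 1A.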

    Now this may not be your normal work flow and some of these steps could
    even be done just before using the image without even keeping the
    resulting image permanently. That is not the point. You are going to have
    several sequential steps where you desire to save image files as you go
    for whatever reason. These are just examples for purposes of explanation.

    Why use two characters instead of one? You will see in just a second.

    Let's say that later you are either unsatisfied with the results of the
    first sequence of operations or just want to take a different tack. So
    you decide to go back to the original file. You make a different series
    of modifications and name the files with 2A, 2B, 2C and so on. This would
    be a different branch of the hierarchy. You could follow your muse along
    yet another path and create 3A and 3B as well.

    So what's up with the multiple pairs?

    Imagine that after thinking about things you decide that your first path
    was the right one but you just didn't adjust the color correctly for the
    1C file. So you decide to redo it but you aren't ready to commit to
    deleting the 1C file yet. You open up the 1B file and make your more
    enlightened color modifications then save the file as BaseName.1B1A.tif.
    This is the first side branch based on the 1B file and it is the first
    step along the new sequence. Now we could think of the 1C file that was
    originally based on the 1B file as the first branch but it is simpler to
    think of it as the same trunk growing straight up (or root growing
    straight down.) You like how the image is working out so you go on to
    create 1B1B and 1B1C. See how you can tell exactly which step along
    which of the first level branches these images are based on, just by
    looking at the hierarchy code embedded in their file names?
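
    Picking the code for a new branch can be automated. This is only a
    sketch under my reading of the rules above: new_branch is a hypothetical
    name, an empty parent string stands for the original file, and branch
    numbers are limited to 1 through 9.

```python
def new_branch(parent, existing):
    """Return the first code on a fresh branch off `parent`.

    parent   -- code of the file you opened, or '' for the original
    existing -- codes already in use for this BaseName
    """
    n = len(parent)
    # Branch digits already taken directly under this parent.
    used = {c[n] for c in existing if c.startswith(parent) and len(c) > n}
    for digit in "123456789":
        if digit not in used:
            return parent + digit + "A"
    raise ValueError(f"no free branch numbers left under {parent!r}")
```

    With 1A, 1B and 1C on disk, branching off 1B gives 1B1A, and branching
    off the original gives 2A.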

    But tomorrow you have second thoughts. You're thinking that the 2B file
    might have been better than the 1B file so you open that one and start
    modifying from there. You save files 2B1A, 2B1B, and 2B1C. Upon
    reflection that 2B1C file just isn't quite right so you go back to 2B1B,
    do it just a little differently and save 2B1B1A then finally 2B1B1B.
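
    Going the other way, the code alone tells you which file an image was
    made from. A minimal Python sketch, with parent_of as a hypothetical
    name:

```python
def parent_of(code):
    """Return the code of the image this one was made from."""
    step = code[-1]
    if step != "A":
        # Later steps come from the previous step on the same branch.
        return code[:-1] + chr(ord(step) - 1)
    # Step A is the start of a branch: drop the whole pair.
    # With no pairs left, the parent is the original file.
    return code[:-2] or "!Org"
```

    Following it repeatedly from 2B1B1B walks back through 2B1B1A, 2B1B,
    2B1A, 2B and 2A to the original.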

    Eureka! That is perfect. So you rename this file to
    BaseName.2B1B1B.!.TIF, create several different files for different
    possible uses and name them BaseName.2B1B1B.W.GIF for web sites,
    BaseName.2B1B1B.M.JPG for e-mail, BaseName.2B1B1B.L.JPG for sending to
    family, and BaseName.2B1B1B.x4.TIF for large size printing. The "!" is to
    keep the primary version of this final file sorted before its
    derivatives. (W = web, M = Medium, L = Large, and x4 = resampled at 4
    times the original resolution) You will have your own standards as to
    what sizes are appropriate for which uses.
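
    If you settle on a fixed set of use codes, a small table keeps them
    consistent. This Python sketch uses only the codes from the example
    above; the table and the name use_name are hypothetical, and you would
    substitute your own codes.

```python
# Use codes from the example; the descriptions are only reminders.
USE_CODES = {
    "!": "primary version of the final image",
    "W": "web sites",
    "M": "e-mail (Medium)",
    "L": "sending to family (Large)",
    "x4": "large size printing, resampled at 4 times the original resolution",
}

def use_name(base, code, use, ext):
    """Build 'BaseName.HhHhHh.uuu.xxx' for a special purpose copy."""
    if use not in USE_CODES:
        raise ValueError(f"unknown use code: {use!r}")
    return f"{base}.{code}.{use}.{ext}"
```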


    Here is how the hierarchy looks conceptually:

                            !Org
                             |
             +---------------+-------------------------+
             |               |                         |
             1A              2A                        3A
             |               |                         |
    1B2A --- 1B --- 1B1A     2B --- 2B1A               3B
    |        |      |        |      |
    1B2B     1C     1B1B     2C     2B1B --- 2B1B1A
    |               |        |      |        |
    1B2C            1B1C     2D     2B1C     2B1B1B
    |
    1B2D

    Naturally, you must be viewing this using Courier font for it to look
    right. I threw in yet another branch on the left of 1B to further
    illustrate the idea of multiple possible branches from each point in the
    tree. Of course it is hard to illustrate more than two branches for each
    point in a text file but you can see that you could have as many branches
    from any one point as there are different codes for the first character.
    If you really get crazy you can even go past 9 and use letters for the
    first character as well but then it would be harder to read. If you do
    this you might want to use upper case for the first character and lower
    case for the second character in each pair.

    Here is how the files will sort in your file listing:

    BaseName.!Org.jpg
    BaseName.1A.tif
    BaseName.1B.tif
    BaseName.1B1A.tif
    BaseName.1B1B.tif
    BaseName.1B1C.tif
    BaseName.1B2A.tif
    BaseName.1B2B.tif
    BaseName.1B2C.tif
    BaseName.1B2D.tif
    BaseName.1C.tif
    BaseName.2A.tif
    BaseName.2B.tif
    BaseName.2B1A.tif
    BaseName.2B1B.tif
    BaseName.2B1B1A.tif
    BaseName.2B1B1B.!.tif
    BaseName.2B1B1B.L.JPG
    BaseName.2B1B1B.M.JPG
    BaseName.2B1B1B.W.GIF
    BaseName.2B1B1B.x4.TIF
    BaseName.2B1C.tif
    BaseName.2C.tif
    BaseName.2D.tif
    BaseName.3A.tif
    BaseName.3B.tif
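
    The sort order is easy to check with a few lines of Python, since
    sorted() compares strings by character code, which matches ASCII order
    for the characters used here. This is just a sketch using a sample of
    the names above; the variable names are hypothetical.

```python
import random

# A sample of the names, written in the order they should sort.
expected = [
    "BaseName.!Org.jpg",
    "BaseName.1A.tif",
    "BaseName.1B.tif",
    "BaseName.1B1A.tif",
    "BaseName.1B2A.tif",
    "BaseName.1C.tif",
    "BaseName.2B1B.tif",
    "BaseName.2B1B1A.tif",
    "BaseName.2B1B1B.!.tif",
    "BaseName.2B1B1B.L.JPG",
    "BaseName.2B1B1B.W.GIF",
    "BaseName.2B1B1B.x4.TIF",
    "BaseName.2B1C.tif",
]

# Scramble a copy, then let a plain string sort put it back.
shuffled = expected[:]
random.Random(0).shuffle(shuffled)
in_ascii_order = sorted(shuffled)
```

    The trick is that '!' and '.' both sort before the digits, so the
    original and each parent land ahead of everything built on them.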


    After you have stewed on all these edits a while you may decide you don't
    need to keep all these huge image files around forever. You can just
    delete all of the dead end branches if you want to. You could even delete
    all the steps up to your final image (BaseName.2B1B1B.!.tif). If you
    didn't want to be reminded that you had been so darn wishy-washy, you
    could even rename the file, either restarting the hierarchy code at 1A or
    just replacing it to leave a file name like BaseName.Final.tif, with the
    additional special purpose formats renamed similarly.
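
    Finding the dead end branches can also be scripted. A minimal Python
    sketch, assuming you have the list of hierarchy codes in use; lineage
    and dead_ends are hypothetical names, and nothing here deletes a file.

```python
def lineage(code):
    """Codes from the first step off the original down to `code`."""
    chain = []
    while code:
        chain.append(code)
        if code[-1] != "A":
            # Previous step on the same branch.
            code = code[:-1] + chr(ord(code[-1]) - 1)
        else:
            # Start of a branch: go back to the file it branched from.
            code = code[:-2]
    return chain[::-1]

def dead_ends(all_codes, final):
    """Codes that are not on the path from the original to `final`."""
    keep = set(lineage(final))
    return sorted(c for c in all_codes if c not in keep)
```

    For the example, everything outside 2A, 2B, 2B1A, 2B1B, 2B1B1A and
    2B1B1B would be reported as a dead end.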



    Alternative systems:
    There are going to be some people who read this and are tempted to ask
    why I didn't just use one additional character for each additional branch
    away from the original. It would certainly make the file names even
    shorter. That's why it took two days. I just couldn't come up with any
    scheme that didn't break as soon as you wanted to add more branches or
    more steps along an existing branch. If you can come up with something
    let me know. I'd love to see it.


    Copyright:
    This message and coding scheme is copyright 2004 by Grant S. Robertson.
    It may not be reprinted, posted to a web site, stored in any electronic
    archive other than this newsgroup, taught in any class or seminar, nor
    may it be incorporated into any software product, without my prior
    written permission. Individuals are hereby granted the right to use this
    system for their personal filing needs and to tell their friends. In
    other words, I am happy to share this idea with fellow photographers but
    if you are going to be making money off of it then I get a cut. As people
    who generate intellectual property every time they press the shutter I
    imagine you will understand.
     
    Grant Robertson, May 13, 2004

  2. Grant Robertson

    eawckyegcy Guest

    The official response of is a famous quote:

    "Those who do not understand UNIX are
    condemned to reinvent it, poorly."

    You are left with three questions:

    a) who said it,
    b) why it needed to be said,
    c) and (related to (b)), how is it germane to your (cough) "proposal".

    Take as much time as you like.
     
    eawckyegcy, May 14, 2004