Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > Python > ANN hashtar 0.1: archival encryption to corruptible media

Thread Tools

ANN hashtar 0.1: archival encryption to corruptible media

John Hunter
Posts: n/a

hashtar is a utility designed for encrypted archiving to media
vulnerable to corruption (eg, CDR, DVDR).

Comments, bug reports, suggestions for improvement all welcome.

John Hunter


hashtar: an encrypted archive utility designed for secure archiving
to media vulnerable to corruption.

Recursively encrypt the files and directories passed as arguments.
Rather than preserving the directory structure, or archiving to a
single file as in tar, the files are encrypted to a single dir and
named with the hash of their relative path. The file information
(filename, hash, permission mode, uid, gid) is encrypted and stored
in the header of the file itself, and can be used to restore the
original file with dir structure from the archive file.

For example, the command

> -cvf tmp.htar finance/

prompts for a password and generates an encrypted recursive archive
of the finance dir in the tmp.htar dir, with filenames mapped like

finance/irs/98/f1040.pdf -> tmp.htar/e5/e5ed546c0bc0191d80d791bc2f73c890
finance/sale_house/notes -> tmp.htar/58/580e89bad7563ae76c295f75aecea030
finance/online/accounts.gz.mcr -> tmp.htar/bb/bbf12f06dc3fcee04067d40b9781f4a8
finance/phone/prepaid1242.doc -> tmp.htar/c1/c1fe52a9d8cbef55eff8840d379d972a

The encrypted files are placed in subdirs based on the first two
characters in their hash name because if too many files are placed
in one dir, it may not be possible to pass all of them as command
line arguments to the restore command. The entire finance dir
structure can later be restored with

> -xvf tmp.htar

The advantage of this method of encrypted archiving, as opposed to
archiving to a single tar file and encrypting it, is that this
method is not sensitive to single byte corruption, which becomes
important especially on externally stored archives, such as on CDR,
or DVDR. Any individual file contains all the information needed to
restore itself, with directory structure, permission bits, etc. So
only the specific files that are corrupted on the media will be

The alternative strategy, encrypting all the files in place and then
archiving to external media, doesn't suffer from single byte
corruption but affords less privacy since the filenames, dir
structure, and permission bits are available, and less security
since a filename may indicate contents and thus expose the archive
to a known plaintext attack.

A match string allows you to only extract files matching a given
pattern. Eg, to only extract pdf and xls files, do

> -m pdf,xls -xvf tmp.htar

Because the filenames are stored in the header, only a small portion
of the file needs to be decrypted to determine the match, so this is
quite fast.

Data can be encrypted and decrypted across platforms (tested between
linux and win32 and vice-versa) but of course some information may
be lost, such as uid, gid for platforms that don't support it.


> [OPTIONS] files


-h, --help Show help message and exit
-fDIR, --arcdir=DIR Write hashed filenames to archive dir
-pFILE, --passwdfile=FILE
Get passwd from FILE, otherwise prompt
Only extract files that match PATTERN.
PATTERN is a comma separated list of strings,
one of which must match the filename
-u, --unlink Delete files after archiving them
-c, --create Create archive dir
-x, --extract Extract files recursively from archive dir
-v, --verbose Decrypt files recursively


I think this software is suitable to protect your data from your
sister, your boss, and even the nosy computer hacker next door, but
not the NSA.


python2.3 -
yawPyCrypto and Flatten -
pycrypto -

The python dependencies are very easy to install; just do the usual
> python install


Tested on linux and win32


John D. Hunter <(E-Mail Removed)>


same as python2.3


Ignores symbolic links


For Erik Curiel, who's life's work I lost when I volunteered to
backup the only copy of his home dir on a CD containing a single
encrypted gzipped tar file, which was subsequently corrupted.

Reply With Quote

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off

Similar Threads
Thread Thread Starter Forum Replies Last Post
Which hard drive encryption program has the strongest tested encryption & security? =?iso-8859-1?Q?-=3D|__=28=BAL=BA=29__|=3D-____o=3D=5B:::::::::::::::=BB?= Computer Security 6 02-20-2008 01:35 PM
archival storage media greensteak Digital Photography 15 04-30-2004 01:17 AM
Wilhelm Interview in 11/2003 Shutterbug on Archival Quality of Digital Print Media john chapman Digital Photography 0 01-05-2004 12:53 PM
archival test - Epson 10000 printer Conrad Weiler Digital Photography 0 07-31-2003 03:50 PM
Archival photo quality paper Jeff Furber Digital Photography 6 07-12-2003 10:26 AM