Deflater/Inflater and dictionary for Huffman


I want to compress/decompress small data, like strings of 1k or less.
I know Huffman compression with a static dictionary is the best solution.

It seems I could use the Deflater with strategy HUFFMAN_ONLY and
maybe prepare a dictionary.

Can anyone explain how to use this? I can't find any docs after 2
hours on Google and the groups! The Javadoc is useless.

Roedy Green
On Fri, 17 Oct 2003 03:05:06 GMT, NOBODY <(E-Mail Removed)> wrote or
quoted :

>It seems I could use the Deflater with strategy HUFFMAN_ONLY and
>maybe prepare a dictionary.

For sample code see

I don't know if you can force Java's GZIP to use only Huffman though.
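
For what it is worth, java.util.zip.Deflater (the class underneath the
GZIP and deflater streams) does expose setStrategy(Deflater.HUFFMAN_ONLY)
and setDictionary(byte[]). Below is a minimal sketch of both, assuming a
reasonably current JDK; the class and method names (HuffmanOnlySketch,
compress, decompress) and the sample strings are made up for illustration.
Note that the preset dictionary is a block of bytes that primes deflate's
string matching. It is not a Huffman code table, so it is unlikely to help
once HUFFMAN_ONLY has switched string matching off.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

class HuffmanOnlySketch {

    // strategy: Deflater.DEFAULT_STRATEGY or Deflater.HUFFMAN_ONLY; dict may be null.
    static byte[] compress(byte[] data, int strategy, byte[] dict) {
        Deflater d = new Deflater();
        d.setStrategy(strategy);
        if (dict != null) {
            d.setDictionary(dict);      // has to happen before the first deflate()
        }
        d.setInput(data);
        d.finish();                     // no more input will follow
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        while (!d.finished()) {
            sink.write(buf, 0, d.deflate(buf));
        }
        d.end();                        // release native memory
        return sink.toByteArray();
    }

    static byte[] decompress(byte[] packed, byte[] dict) throws DataFormatException {
        Inflater inf = new Inflater();
        inf.setInput(packed);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        while (!inf.finished()) {
            int n = inf.inflate(buf);
            sink.write(buf, 0, n);
            if (n > 0 || inf.finished()) {
                continue;               // made progress, or done
            }
            if (inf.needsDictionary()) {
                if (dict == null) {
                    throw new DataFormatException("stream needs a dictionary");
                }
                inf.setDictionary(dict); // must be the bytes the compressor used
            } else if (inf.needsInput()) {
                throw new DataFormatException("truncated stream");
            }
        }
        inf.end();
        return sink.toByteArray();
    }

    public static void main(String[] args) throws DataFormatException {
        byte[] dict = "GET POST HTTP/1.1 Host: ".getBytes();
        byte[] msg = "GET /index.html HTTP/1.1".getBytes();
        byte[] huff = compress(msg, Deflater.HUFFMAN_ONLY, null);
        byte[] withDict = compress(msg, Deflater.DEFAULT_STRATEGY, dict);
        System.out.println(huff.length + " bytes Huffman-only, "
                + withDict.length + " bytes with dictionary");
        System.out.println(new String(decompress(withDict, dict)));
    }
}
```

Both sides have to agree on the dictionary bytes out of band; the
compressed stream only carries a checksum of them.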

Canadian Mind Products, Roedy Green.
Coaching, problem solving, economical contract programming.
See for The Java Glossary.
Thomas Weidenfeller
NOBODY <(E-Mail Removed)> writes:

> It seems I could use the Deflater with strategy HUFFMAN_ONLY and
> maybe prepare a dictionary.
> Can anyone explain how to use this? I can't find any docs after 2
> hours on Google and the groups! The Javadoc is useless.

Yep, it is knowledge that is passed from father to son with a secret
handshake. I once tried to interest some Java journals in publishing an
article about "all" the magic of the Zip and Jar classes, but no one was
interested.
First of all, consider using a DeflaterOutputStream instead of a raw
Deflater. It handles all the ugly details of feeding data to the
compression algorithm and writing the results. If you want to have the
result in memory, provide a ByteArrayOutputStream to the
DeflaterOutputStream's constructor.
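
A minimal sketch of that approach, assuming Java 7 or later for
try-with-resources. The class name DeflateStreamSketch and its two helper
methods are made up for illustration; InflaterInputStream is used as the
matching reader.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

class DeflateStreamSketch {

    static byte[] compress(byte[] data) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DeflaterOutputStream out = new DeflaterOutputStream(sink)) {
            out.write(data);
        }   // close() finishes the compressed stream and flushes it into sink
        return sink.toByteArray();
    }

    static byte[] decompress(byte[] packed) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (InputStream in = new InflaterInputStream(new ByteArrayInputStream(packed))) {
            byte[] buf = new byte[512];
            int n;
            while ((n = in.read(buf)) != -1) {
                sink.write(buf, 0, n);
            }
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "a small string, a small string, a small string".getBytes();
        byte[] packed = compress(data);
        System.out.println(data.length + " bytes in, " + packed.length + " bytes out");
        System.out.println(new String(decompress(packed)));
    }
}
```

DeflaterOutputStream also has a constructor that takes your own Deflater,
which is where a non-default strategy or a dictionary would go.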

If you need to do it on your own, I suggest studying the
DeflaterOutputStream's source code (source comes with the J2SDK in a
file called or src.jar).

Please note that DeflaterOutputStream uses a Deflater differently than
the Deflater documentation suggests. The documentation suggests
checking needsInput() if deflate() returns 0. DeflaterOutputStream
ignores the return value and just loops until needsInput() returns
true. They can do this, because they ensure their output buffer has at
least a size of 1, and they immediately write out any result and reuse
the buffer.

For using Deflater, you need two sets of "pointers" (this is where the
underlying C library shines through). One "pointer" points to the part
of the data that still needs to be processed, the other to some
storage location to which compressed data should be written.

Of course, you don't have pointers in Java, so what the API expects is
a pair of a byte[] and an integer. The byte[] holds the data, the
integer serves as a pointer into the byte[]. In addition, you need
something to know the remaining data in the byte[], that's another
integer. So for the input data and for the output data you have a triplet:

    byte[] b;   // memory holding the data
    int off;    // position "pointer" into b
    int len;    // For input: remaining input data in this buffer
                // For output: remaining free memory in this buffer
                // In both cases: off + len <= b.length

You provide such a triplet to setInput() to tell the Deflater where to
get the data from. And you provide another such triplet to deflate() to
tell the Deflater where to place the output.
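
To make the two triplets concrete, here is a small sketch that compresses
only a slice of an input array and writes the result into the middle of a
larger output array. TripletSketch and deflateSlice are hypothetical
names. The sketch calls finish() right away and makes a single deflate()
call, so it assumes the output region is large enough for the whole
result; the general case needs the loop described below.

```java
import java.util.zip.Deflater;

class TripletSketch {

    // Compresses in[inOff .. inOff+inLen) into out[outOff .. outOff+outLen)
    // and returns how many bytes of out were filled.
    static int deflateSlice(byte[] in, int inOff, int inLen,
                            byte[] out, int outOff, int outLen) {
        Deflater d = new Deflater();
        try {
            d.setInput(in, inOff, inLen);           // input triplet
            d.finish();                             // this is all the input there is
            return d.deflate(out, outOff, outLen);  // output triplet
        } finally {
            d.end();                                // release native memory
        }
    }

    public static void main(String[] args) {
        byte[] in = "..HELLO HELLO HELLO..".getBytes();
        byte[] out = new byte[128];
        // Skip the two dots at each end of the input; write at out[10] onwards.
        int n = deflateSlice(in, 2, in.length - 4, out, 10, 100);
        System.out.println(n + " compressed bytes at out[10.." + (10 + n) + ")");
    }
}
```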

Now the really tricky part starts. You call deflate() and then you have
to react according to the return value of deflate():

deflate() == 0 && needsInput():

    All data provided via setInput() has been read (but maybe not
    completely processed). Either provide more data by calling
    setInput() again, or finish the compression.

    Finishing the compression is tricky, too:
    (a) Call finish()
    (b) Flush all data still in internal Deflater buffers. You do this
        by calling deflate() in a loop while finished() returns false.
        Check the return value of deflate(), because you still
        might need to provide more output memory.

deflate() == 0 && !needsInput():

    This is undocumented, but important. The Deflater needs more
    output storage. Save the already compressed data in the
    output byte[], and call deflate() again with more memory.

deflate() > 0:

    Some output data is available. The data is in the byte[] provided
    to deflate(), starts at off as provided to deflate() and has a
    length as returned by deflate().

    Do whatever you want with the data, then adjust off and len if
    necessary, and call deflate() again.
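
Put together, a loop along these lines could look like the sketch below.
It takes the simpler route that DeflaterOutputStream uses: every result is
copied out immediately and the same output buffer is reused, so the "needs
more output storage" case never requires a bigger buffer. The class name
RawDeflaterSketch, the chunked input, and the 64-byte buffer size are all
assumptions made for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.Deflater;

class RawDeflaterSketch {

    static byte[] compress(byte[][] chunks) {
        Deflater d = new Deflater();
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        byte[] out = new byte[64];  // deliberately small so the loops run more than once

        for (byte[] chunk : chunks) {
            d.setInput(chunk, 0, chunk.length);
            // Keep calling deflate() until all of this chunk has been read.
            while (!d.needsInput()) {
                int n = d.deflate(out, 0, out.length);
                sink.write(out, 0, n);      // n may be 0, which writes nothing
            }
        }

        // (a) announce the end of input, (b) drain the internal buffers.
        d.finish();
        while (!d.finished()) {
            int n = d.deflate(out, 0, out.length);
            sink.write(out, 0, n);
        }
        d.end();                            // release native memory
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        byte[][] chunks = {
            "first chunk, first chunk, ".getBytes(),
            "second chunk, second chunk".getBytes()
        };
        System.out.println(compress(chunks).length + " compressed bytes");
    }
}
```

Because out is emptied after every call, off stays 0 and len stays
out.length; if you instead accumulate results in one large array, advance
off and shrink len by the value deflate() returned.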

