Velocity Reviews


NOBODY 10-17-2003 03:05 AM

deflater/inflater and dictionnary for huffman

I want to compress/decompress small data like strings of 1k or less.
I know Huffman compression with a static dictionary is the best solution.

It seems I could use the HUFFMAN_ONLY strategy and
maybe prepare a dictionary.

Can anyone explain how to use this? I can't find any docs after 2
hours on Google and the groups! The Javadoc is useless.


Roedy Green 10-17-2003 03:33 AM

Re: deflater/inflater and dictionnary for huffman
On Fri, 17 Oct 2003 03:05:06 GMT, NOBODY wrote or
quoted:

>It seems I could use the HUFFMAN_ONLY strategy and
>maybe prepare a dictionary.

For sample code see

I don't know if you can force Java's GZIP to use only Huffman though.

Canadian Mind Products, Roedy Green.
Coaching, problem solving, economical contract programming.
See for The Java Glossary.

Thomas Weidenfeller 10-17-2003 08:56 AM

Re: deflater/inflater and dictionnary for huffman
NOBODY writes:

> It seems I could use the HUFFMAN_ONLY strategy and
> maybe prepare a dictionary.
> Can anyone explain how to use this? I can't find any docs after 2
> hours on Google and the groups! The Javadoc is useless.

Yep, it is knowledge that is passed from father to son with a secret
handshake. I once tried to interest some Java journals in publishing an
article about "all" the magic of the Zip and Jar classes, but no one was
interested.
First of all, consider using a DeflaterOutputStream instead of a raw
Deflater. It handles all the ugly details of feeding data to the
compression algorithm and writing the results. If you want to have the
result in memory, provide a ByteArrayOutputStream to the
DeflaterOutputStream's constructor.
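
As a rough illustration of the stream-based approach, here is a minimal sketch. It assumes a modern JDK (try-with-resources postdates this thread), and the class and method names are made up for the example, not taken from the JDK. The decompress() helper is only there to show the round trip.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical helper class for illustration; not part of the JDK.
class StreamDeflateSketch {

    // Compresses input into memory via DeflaterOutputStream.
    static byte[] compress(byte[] input) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DeflaterOutputStream out = new DeflaterOutputStream(sink)) {
            out.write(input, 0, input.length);
        } catch (IOException e) {
            // Not expected for an in-memory sink, but write() declares it.
            throw new UncheckedIOException(e);
        }
        // close() has finished the stream, so the sink now holds everything.
        return sink.toByteArray();
    }

    // Reverses compress() via InflaterInputStream.
    static byte[] decompress(byte[] packed) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (InputStream in = new InflaterInputStream(new ByteArrayInputStream(packed))) {
            byte[] buf = new byte[256];
            int n;
            while ((n = in.read(buf, 0, buf.length)) != -1) {
                sink.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        byte[] input = "hello hello hello hello".getBytes(StandardCharsets.UTF_8);
        byte[] packed = compress(input);
        byte[] back = decompress(packed);
        System.out.println(input.length + " bytes in, " + packed.length
                + " bytes packed, round trip ok: " + Arrays.equals(input, back));
    }
}
```

Closing the DeflaterOutputStream is what finishes the compressed stream, so the byte array should only be read from the sink after the try block.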

If you need to do it on your own, I suggest studying the
DeflaterOutputStream's source code (source comes with the J2SDK in a
file called or src.jar).

Please note that DeflaterOutputStream uses a Deflater differently than
the Deflater documentation suggests. The documentation suggests
checking needsInput() if deflate() returns 0. DeflaterOutputStream
ignores the return value and just loops until needsInput() returns
true. They can do this, because they ensure their output buffer has at
least a size of 1, and they immediately write out any result and reuse
the buffer.
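
The loop style described above can be sketched roughly as follows. This is a paraphrase of the idea, not a copy of the JDK source; the class name, the buffer size of 512 and the in-memory sink are assumptions made for the example.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.Deflater;

// Hypothetical class for illustration; it mimics the looping style only.
class NeedsInputLoopSketch {

    static byte[] compress(byte[] input) {
        Deflater def = new Deflater();
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        byte[] buf = new byte[512];   // reused for every deflate() call

        def.setInput(input, 0, input.length);

        // Loop on needsInput() alone. A return value of 0 is not treated
        // specially; it just means nothing gets written this time round.
        while (!def.needsInput()) {
            int n = def.deflate(buf, 0, buf.length);
            if (n > 0) {
                sink.write(buf, 0, n);   // write out at once, then reuse buf
            }
        }

        // No more input: finish and drain what is still buffered internally.
        def.finish();
        while (!def.finished()) {
            int n = def.deflate(buf, 0, buf.length);
            if (n > 0) {
                sink.write(buf, 0, n);
            }
        }
        def.end();   // release the native resources
        return sink.toByteArray();
    }
}
```

Because the whole buffer is handed back to deflate() on every call, the Deflater never runs out of output space, which is why the return value does not need a separate check here.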

For using Deflater, you need two sets of "pointers" (this is where the
underlying C library shines through). One "pointer" points to the part
of the data that still needs to be processed, the other to some
storage location to which compressed data should be written.

Of course, you don't have pointers in Java, so what the API expects is
a pair of a byte[] and an integer. The byte[] holds the data, the
integer serves as a pointer into the byte[]. In addition, you need
to know how much data remains in the byte[]; that's another
integer. So for the input data and the output data you have a triplet:

    byte[] b;   // memory holding the data
    int off;    // position "pointer" into b
    int len;    // For input: remaining input data in this buffer
                // For output: remaining free memory in this buffer
                // In both cases: off + len <= b.length

You provide such a triplet to setInput() to tell the Deflater where to
get the data from. And you provide another such triplet to deflate() to
tell the Deflater where to place the output.
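
To make the two triplets concrete, here is a small sketch that compresses a slice of one array into a slice of another. The class name, the method name and the offsets are invented for the example. It makes a single deflate() call and simply assumes the output slice is big enough, which is a simplification.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

// Hypothetical class for illustration only.
class TripletSketch {

    // Compresses in[inOff .. inOff+inLen) into out[outOff .. outOff+outLen).
    // Returns the number of compressed bytes written, starting at outOff.
    static int deflateSlice(byte[] in, int inOff, int inLen,
                            byte[] out, int outOff, int outLen) {
        Deflater def = new Deflater();
        def.setInput(in, inOff, inLen);     // input triplet
        def.finish();                       // this slice is all the input
        int written = def.deflate(out, outOff, outLen);   // output triplet
        boolean done = def.finished();
        def.end();
        if (!done) {
            // Simplification: a real caller would supply more output space.
            throw new IllegalStateException("output slice too small");
        }
        return written;
    }

    public static void main(String[] args) {
        byte[] data = "xxHELLO HELLO HELLO HELLOyy".getBytes(StandardCharsets.US_ASCII);
        byte[] out = new byte[64];
        // Skip the two marker bytes at each end; leave 8 bytes free in out.
        int n = deflateSlice(data, 2, data.length - 4, out, 8, out.length - 8);
        System.out.println(n + " compressed bytes stored at offset 8");
    }
}
```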

Now the really tricky part starts. You call deflate() and then you have
to react according to the return value of deflate():

deflate() == 0 && needsInput():

    All data provided via setInput() has been read (but maybe not
    completely processed). Either provide more data by calling
    setInput() again, or finish the compression.

    Finishing the compression is tricky, too:
    (a) Call finish()
    (b) Flush all data still in internal Deflater buffers. You do this
        by calling deflate() in a loop while finished() returns false.
        Check the return value of deflate(), because you still
        might need to provide more output memory.

deflate() == 0 && !needsInput():

    This is undocumented, but important. The Deflater needs more
    output storage. Save the already compressed data in the
    output byte[], and call deflate() again with more memory.

deflate() > 0:

    Some output data is available. The data is in the byte[] provided
    to deflate(), starts at off as provided to deflate() and has a
    length as returned by deflate().

    Do whatever you want with the data, then adjust off and len if
    necessary, and call deflate() again.
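
Putting the three cases together, a loop over a raw Deflater might look something like the sketch below. It is one possible arrangement, not the only one. The class name, the chunked input and the deliberately small 64-byte output buffer are assumptions chosen for the example. Since every call hands the whole buffer back to deflate(), "more memory" here just means reusing the same buffer after its contents have been saved.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.Deflater;

// Hypothetical class for illustration only.
class DeflateLoopSketch {

    // Compresses the given input chunks, in order, into one zlib stream.
    static byte[] compress(byte[][] chunks) {
        Deflater def = new Deflater();
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        byte[] out = new byte[64];   // small on purpose, so it fills up
        int next = 0;                // index of the next chunk to feed

        while (!def.finished()) {
            int n = def.deflate(out, 0, out.length);
            if (n > 0) {
                // deflate() > 0: n bytes of output start at offset 0.
                sink.write(out, 0, n);
            } else if (def.needsInput()) {
                // deflate() == 0 && needsInput(): feed more or finish.
                if (next < chunks.length) {
                    byte[] c = chunks[next++];
                    def.setInput(c, 0, c.length);
                } else {
                    def.finish();
                }
            }
            // deflate() == 0 && !needsInput(): nothing to do here, the next
            // iteration calls deflate() again with the whole buffer free.
        }
        def.end();   // release the native resources
        return sink.toByteArray();
    }
}
```

After finish() has been called, the same loop keeps calling deflate() until finished() returns true, which covers step (b) of finishing.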


