Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > VHDL > Where are Huffman encoding applications?

Reply
Thread Tools

Where are Huffman encoding applications?

 
 
Weng Tianxiang
Guest
Posts: n/a
 
      08-01-2006
Hi,
I want to know how many fields Huffman encoding thoery are used.

As I can counted,
1. In Window operating system, similar encoding is used to conpress
text files;
2. Text compression in ZIP files;
3. JPEG fram compression;
4. MPEG format compression? which one uses it?

Wireless communication?

Any other applications?

Thank you.

Weng

 
Reply With Quote
 
 
 
 
quickwayne@gmail.com
Guest
Posts: n/a
 
      08-01-2006

Weng Tianxiang wrote:
> Hi,
> I want to know how many fields Huffman encoding thoery are used.
>
> As I can counted,
> 1. In Window operating system, similar encoding is used to conpress
> text files;
> 2. Text compression in ZIP files;
> 3. JPEG fram compression;
> 4. MPEG format compression? which one uses it?
>
> Wireless communication?
>
> Any other applications?
>
> Thank you.
>
> Weng


I am sure in JPEG and MPEG.

 
Reply With Quote
 
 
 
 
Jack Klein
Guest
Posts: n/a
 
      08-01-2006
On 1 Aug 2006 04:02:31 -0700, "Weng Tianxiang" <(E-Mail Removed)>
wrote in comp.programming:

> Hi,
> I want to know how many fields Huffman encoding thoery are used.
>
> As I can counted,
> 1. In Window operating system, similar encoding is used to conpress
> text files;
> 2. Text compression in ZIP files;
> 3. JPEG fram compression;
> 4. MPEG format compression? which one uses it?
>
> Wireless communication?
>
> Any other applications?
>
> Thank you.
>
> Weng


FAX transmission, although the Huffman codes are predefined and not
generated dynamically from the data.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://c-faq.com/
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~a...FAQ-acllc.html
 
Reply With Quote
 
Weng Tianxiang
Guest
Posts: n/a
 
      08-01-2006
Hi Jack,
Fax application cannot be counted as full Huffman encoding. The reason
is the statistics for characters are not dynamically collected.

Thank you.

Weng

Jack Klein wrote:
> On 1 Aug 2006 04:02:31 -0700, "Weng Tianxiang" <(E-Mail Removed)>
> wrote in comp.programming:
>
> > Hi,
> > I want to know how many fields Huffman encoding thoery are used.
> >
> > As I can counted,
> > 1. In Window operating system, similar encoding is used to conpress
> > text files;
> > 2. Text compression in ZIP files;
> > 3. JPEG fram compression;
> > 4. MPEG format compression? which one uses it?
> >
> > Wireless communication?
> >
> > Any other applications?
> >
> > Thank you.
> >
> > Weng

>
> FAX transmission, although the Huffman codes are predefined and not
> generated dynamically from the data.
>
> --
> Jack Klein
> Home: http://JK-Technology.Com
> FAQs for
> comp.lang.c http://c-faq.com/
> comp.lang.c++ http://www.parashift.com/c++-faq-lite/
> alt.comp.lang.learn.c-c++
> http://www.contrib.andrew.cmu.edu/~a...FAQ-acllc.html


 
Reply With Quote
 
Thomas Entner
Guest
Posts: n/a
 
      08-02-2006
> Fax application cannot be counted as full Huffman encoding. The reason
> is the statistics for characters are not dynamically collected.


Also for JPEG, etc. in most cases the predefined default Huffman-tables are
used (Althought the file-format would support custom tables). I think the
term "Huffman" is also widely used for such "static" applications.

Regards,

Thomas

P.S.: I am wondering what the reason for your question is?

www.entner-electronics.com


 
Reply With Quote
 
Weng Tianxiang
Guest
Posts: n/a
 
      08-02-2006
Hi Thomas,
When Huffman frequency table is available, the decoding procedure is
stright forward.

I know many electronic books use Huffman encoding method to compress
the book full context.

Usually in the first pass of full context, Huffman frequency table is
established , in the 2nd pass, a Hash function is used to locate a new
word entry, then encode, based on the Huffman frequency table, and
output it.

I want to know how Hash function is used in the 2nd pass.

Any books or tips?

Thank you.

Weng


Thomas Entner wrote:
> > Fax application cannot be counted as full Huffman encoding. The reason
> > is the statistics for characters are not dynamically collected.

>
> Also for JPEG, etc. in most cases the predefined default Huffman-tables are
> used (Althought the file-format would support custom tables). I think the
> term "Huffman" is also widely used for such "static" applications.
>
> Regards,
>
> Thomas
>
> P.S.: I am wondering what the reason for your question is?
>
> www.entner-electronics.com


 
Reply With Quote
 
Andy Ray
Guest
Posts: n/a
 
      08-03-2006
Weng Tianxiang wrote:
> Hi Thomas,
> When Huffman frequency table is available, the decoding procedure is
> stright forward.
>
> I know many electronic books use Huffman encoding method to compress
> the book full context.
>
> Usually in the first pass of full context, Huffman frequency table is
> established , in the 2nd pass, a Hash function is used to locate a new
> word entry, then encode, based on the Huffman frequency table, and
> output it.
>
> I want to know how Hash function is used in the 2nd pass.
>



I think you are asking: how do I generate the huffman codes given the
frequency of each symbol. If so look at:

http://en.wikipedia.org/wiki/Huffman_coding

in the section "basic technique".

Cheers,

Andy.
 
Reply With Quote
 
Weng Tianxiang
Guest
Posts: n/a
 
      08-03-2006
Andy,
No. I want to know the hash function.

For example, I have a Huffman encoding table as follows:

name Fre encode
....
Andy 15, "011",
Ray 20, "110",
....

Now a text string "Andy Ray wrote" is incoming, I want to know how to
match the incoming "Andy" and the table entry "Andy".

One must establish a 1-to-1 relationship between the incoming word and
the table entry.

What I can imagine to do in software is to create a hash function, then
if there are more entries responding to one entry, further string match
must be made.

I am wondering about whether or not a hardware design can does it.

Thank you.

Weng

Andy Ray wrote:
> Weng Tianxiang wrote:
> > Hi Thomas,
> > When Huffman frequency table is available, the decoding procedure is
> > stright forward.
> >
> > I know many electronic books use Huffman encoding method to compress
> > the book full context.
> >
> > Usually in the first pass of full context, Huffman frequency table is
> > established , in the 2nd pass, a Hash function is used to locate a new
> > word entry, then encode, based on the Huffman frequency table, and
> > output it.
> >
> > I want to know how Hash function is used in the 2nd pass.
> >

>
>
> I think you are asking: how do I generate the huffman codes given the
> frequency of each symbol. If so look at:
>
> http://en.wikipedia.org/wiki/Huffman_coding
>
> in the section "basic technique".
>
> Cheers,
>
> Andy.


 
Reply With Quote
 
Logan Shaw
Guest
Posts: n/a
 
      08-03-2006
Weng Tianxiang wrote:
> Andy,
> No. I want to know the hash function.
>
> For example, I have a Huffman encoding table as follows:
>
> name Fre encode
> ...
> Andy 15, "011",
> Ray 20, "110",
> ...
>
> Now a text string "Andy Ray wrote" is incoming, I want to know how to
> match the incoming "Andy" and the table entry "Andy".
>
> One must establish a 1-to-1 relationship between the incoming word and
> the table entry.
>
> What I can imagine to do in software is to create a hash function, then
> if there are more entries responding to one entry, further string match
> must be made.


A trie might be an option. I once had a conversation with someone who
worked at a company that needed to match strings quickly in hardware
and I believe they might've used a trie for that, although I could be
remembering wrong since it was a few years ago.

If you haven't encountered tries, think of them as deterministic
finite-state automata shaped like a tree, where the root is the
start state, and transitions (edges) from one node to the next
correspond to symbols on your input.

For example, if you want to recognize the strings "car", "cat",
and "cup", your trie would look like this:


(start)--->( )-+--->( )----->(end)
c | u p
|
+--->( )-+--->(end)
a | r
|
+--->(end)
t

Basically, you start at the start state, absorb a token and follow
the edge associated with it, and repeat until you either don't have
a matching edge or hit an end state.

The nice thing about a trie is that you don't have to worry about
collisions or worst-case behavior. The time to match is a linear
function of the length of the string you're matching.

A straightforward trie where every input symbol is a character (hence,
probably 8 bits or maybe even larger) might be a little wasteful of
space. But there are ways around that. The most obvious one is to
say that the input is a string of bits and each bit is an input
symbol. This makes your tree 8 times as deep as with 8-bit characters
as input symbols, but it has the advantage that each node can have
no more than 2 children (as compared to 256). A nice middle ground
might be to make your input symbols pairs of bits or sequences of
4 bits.

- Logan
 
Reply With Quote
 
Weng Tianxiang
Guest
Posts: n/a
 
      08-03-2006
Hi Logan,
Your method is interesting, but is far from what I am looking for.

It is too slow. My target is 64-bit inputs on every clock and one must
find its entry within 1-2 clocks and resolve entry conflicts if any at
the same time.

This design, if succeeded, can be widely used for any dynamic Huffman
encoding.

I think there is no such hardware design so far in the world and why I
found that almost all Huffman encoding applications are static as in
FAX, electronic books, ZIP files, JPEG, MPEG and so on.

Thank you.

Weng

Logan Shaw wrote:
> Weng Tianxiang wrote:
> > Andy,
> > No. I want to know the hash function.
> >
> > For example, I have a Huffman encoding table as follows:
> >
> > name Fre encode
> > ...
> > Andy 15, "011",
> > Ray 20, "110",
> > ...
> >
> > Now a text string "Andy Ray wrote" is incoming, I want to know how to
> > match the incoming "Andy" and the table entry "Andy".
> >
> > One must establish a 1-to-1 relationship between the incoming word and
> > the table entry.
> >
> > What I can imagine to do in software is to create a hash function, then
> > if there are more entries responding to one entry, further string match
> > must be made.

>
> A trie might be an option. I once had a conversation with someone who
> worked at a company that needed to match strings quickly in hardware
> and I believe they might've used a trie for that, although I could be
> remembering wrong since it was a few years ago.
>
> If you haven't encountered tries, think of them as deterministic
> finite-state automata shaped like a tree, where the root is the
> start state, and transitions (edges) from one node to the next
> correspond to symbols on your input.
>
> For example, if you want to recognize the strings "car", "cat",
> and "cup", your trie would look like this:
>
>
> (start)--->( )-+--->( )----->(end)
> c | u p
> |
> +--->( )-+--->(end)
> a | r
> |
> +--->(end)
> t
>
> Basically, you start at the start state, absorb a token and follow
> the edge associated with it, and repeat until you either don't have
> a matching edge or hit an end state.
>
> The nice thing about a trie is that you don't have to worry about
> collisions or worst-case behavior. The time to match is a linear
> function of the length of the string you're matching.
>
> A straightforward trie where every input symbol is a character (hence,
> probably 8 bits or maybe even larger) might be a little wasteful of
> space. But there are ways around that. The most obvious one is to
> say that the input is a string of bits and each bit is an input
> symbol. This makes your tree 8 times as deep as with 8-bit characters
> as input symbols, but it has the advantage that each node can have
> no more than 2 children (as compared to 256). A nice middle ground
> might be to make your input symbols pairs of bits or sequences of
> 4 bits.
>
> - Logan


 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
huffman encoder aarklon@gmail.com C Programming 16 12-23-2005 01:26 PM
Using Huffman Compression dirgesh@gmail.com C Programming 8 10-28-2005 05:48 AM
javax.imageio.IIOException: Missing Huffman code niko Java 3 02-12-2005 08:37 PM
Python Huffman encoding dot Python 3 11-30-2004 12:18 PM
deflater/inflater and dictionnary for huffman NOBODY Java 2 10-17-2003 08:56 AM



Advertisments