Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > C Programming > hash table - simple implementation

Reply
Thread Tools

hash table - simple implementation

 
 
BartC
Guest
Posts: n/a
 
      10-19-2012
"wkevin" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...

> int hashfn(char *key, hashtable_t *htable)
> {
> int sum=0, i;
> for (i=0; i<strlen(key);i++)
> sum+=key[i];


Someone suggested there could be problems when the key is too long. But I
think there will be problems if the key is short too!

It depends what sort of keys you're using, but if they are names and
identifiers (perhaps no more than a dozen characters), then the sum of those
character codes will be too small compared with the hashtable size.

In fact the the sum might only range from 65 ("A") to 1464 ("zzzzzzzzzzzz").
While all 140,000 3-letter identifiers will share hash values from 195 to
366.

But the nature of hashtables means this will still work; the entries will
just be clustered at the bottom end of the table, and there will be many
clashes.


> int main()
> {
> hashtable_t *htable = ht_create(10);
> if (!htable)
> return;
> insert(htable,"aba","111");
> insert(htable,"bbb","222");
> insert(htable,"aab","333");
> printf("get(aba)=%s\n", get(htable,"aba"));
> printf("get(bbb)=%s\n", get(htable,"bbb"));
> printf("get(aab)=%s\n", get(htable,"aab"));
> }


If this is the only test done, then it's not enough. You need to print the
contents of the entire table (empty entries included), to make sure the
entries are spread out.

But first you need to find a better hash function. Unless you aren't
bothered, in which case just have a hash function that returns zero!

--
Bartc

 
Reply With Quote
 
 
 
 
BartC
Guest
Posts: n/a
 
      10-19-2012
"wkevin" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...

> I found quite a lot of such example. When I reviewed the code, I always
> found
> that something seems to be more complex than needed, and some logic which
> seems
> that can be simplified, some redundant code, etc.


> struct entry {
> char *key;
> char *val;
> struct entry *next;
> };


I hadn't looked closely at the rest of the code for my first post. To me,
*your* code seems more complex than needed!

For example, I've never used a 'next' field for a hash table (which will
create a cross between a hash table and a linked list!). And what does the
'next' field point to? I wasn't able to figure that out.

(FWIW, my hash tables use a hash function to get a starting point in the
table. Then entries are examined sequentially, until a match is found, or an
empty entry (which means 'not found' when inserting, or 'insert here' when
adding to the table). If the entries are spread evenly and the table is not
too full, you might only need to look at one or two entries.)

> hashtable_t *ht_create(int size)
> {
> int i;
> hashtable_t *htable = malloc(sizeof(hashtable_t));
> if (!htable)
> return NULL;


This is another complication I wouldn't bother with. The hashtable_t struct
is, what, 8 bytes? And the pointer you're returning here is maybe 4 bytes.
Why bother allocating something that small? ht_create() could just return a
hashtable_t struct directly (and saves an extra level of indirection in the
code, and saves having to free the thing later!)

--
Bartc

 
Reply With Quote
 
 
 
 
Malcolm McLean
Guest
Posts: n/a
 
      10-19-2012
On Friday, October 19, 2012 10:39:08 AM UTC+1, Bart wrote:
> "wkevin" <(E-Mail Removed)> wrote in message
>
> But first you need to find a better hash function. Unless you aren't
> bothered, in which case just have a hash function that returns zero!
>

For the purposes of testing and debugging, a bad hash function is actually
preferable, because you want lots of clashes to make sure that the system
can cope.
But for real use, ypu need a hash that distributes the data evenly.

 
Reply With Quote
 
Willem
Guest
Posts: n/a
 
      10-19-2012
BartC wrote:
) "wkevin" <(E-Mail Removed)> wrote in message
) news:(E-Mail Removed)...
)> struct entry {
)> char *key;
)> char *val;
)> struct entry *next;
)> };
)
) I hadn't looked closely at the rest of the code for my first post. To me,
) *your* code seems more complex than needed!
)
) For example, I've never used a 'next' field for a hash table (which will
) create a cross between a hash table and a linked list!). And what does the
) 'next' field point to? I wasn't able to figure that out.

Each hash bucket is a linked list. That's a very common type of hash table.
In some respects, it is the easiest type.

) (FWIW, my hash tables use a hash function to get a starting point in the
) table. Then entries are examined sequentially, until a match is found, or an
) empty entry (which means 'not found' when inserting, or 'insert here' when
) adding to the table). If the entries are spread evenly and the table is not
) too full, you might only need to look at one or two entries.)

That is the other common type.


SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT
 
Reply With Quote
 
BartC
Guest
Posts: n/a
 
      10-19-2012


"Malcolm McLean" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
> On Friday, October 19, 2012 10:39:08 AM UTC+1, Bart wrote:
>> "wkevin" <(E-Mail Removed)> wrote in message
>>
>> But first you need to find a better hash function. Unless you aren't
>> bothered, in which case just have a hash function that returns zero!
>>

> For the purposes of testing and debugging, a bad hash function is actually
> preferable, because you want lots of clashes to make sure that the system
> can cope.


I just use a small table size for that. While a bad hash can be emulated by
zeroing out a few extra bits from the hash value.

--
Bartc


 
Reply With Quote
 
BartC
Guest
Posts: n/a
 
      10-19-2012


"Willem" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
> BartC wrote:
> ) "wkevin" <(E-Mail Removed)> wrote in message
> ) news:(E-Mail Removed)...
> )> struct entry {
> )> char *key;
> )> char *val;
> )> struct entry *next;
> )> };
> )
> ) I hadn't looked closely at the rest of the code for my first post. To
> me,
> ) *your* code seems more complex than needed!
> )
> ) For example, I've never used a 'next' field for a hash table (which will
> ) create a cross between a hash table and a linked list!). And what does
> the
> ) 'next' field point to? I wasn't able to figure that out.
>
> Each hash bucket is a linked list. That's a very common type of hash
> table.
> In some respects, it is the easiest type.


OK, but I still think it's more complex than needed, despite your saying it
is easier.

--
Bartc

 
Reply With Quote
 
Eric Sosman
Guest
Posts: n/a
 
      10-19-2012
On 10/19/2012 11:03 AM, BartC wrote:
>
>
> "Willem" <(E-Mail Removed)> wrote in message
> news:(E-Mail Removed)...
>>[...]
>> Each hash bucket is a linked list. That's a very common type of hash
>> table.
>> In some respects, it is the easiest type.

>
> OK, but I still think it's more complex than needed, despite your saying
> it is easier.


The O.P.'s implementation was more complex than needed,
certainly. But for RAM-resident hash tables I think linked
lists are about the simplest implementation one can imagine.
(It's the first collision resolution technique described by
Knuth, who calls it "perhaps the most obvious" way and further
describes it as "a straightforward combination" of elementary
techniques.)

Linked lists are certainly simpler than open probing with
double hashing!

--
Eric Sosman
http://www.velocityreviews.com/forums/(E-Mail Removed)d
 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
hash of hash of hash of hash in c++ rp C++ 1 11-10-2011 04:45 PM
Hash#select returns an array but Hash#reject returns a hash... Srijayanth Sridhar Ruby 19 07-02-2008 12:49 PM
I need a hash table implementation for Cygwin Jim Cobban C++ 8 04-17-2008 03:02 AM
Help with hash table implementation Ryan Kaskel C++ 1 04-26-2006 02:59 AM
Hash Table Implementation in C++ Diane C++ 7 04-04-2006 07:24 PM



Advertisments