Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > Java > Re: synchronized HashMap vs. HashTable

Reply
Thread Tools

Re: synchronized HashMap vs. HashTable

 
 
Knute Johnson
Guest
Posts: n/a
 
      05-21-2008
Mikhail Teterin wrote:
> Hello!
>
> I need multiple threads to be able to operate on the same Map. The HashMap's
> documentation at
>
> http://java.sun.com/javase/6/docs/ap...l/HashMap.html
>
> advises the following construct:
>
> Map m = Collections.synchronizedMap(new HashMap(...));
>
> However, the HashTable is, supposedly, inherently thread-safe.
>
> What's better? I saw somewhere, that HashTable is a "legacy" class -- is
> that true?
>
> Thanks!
>
> -mi


If basic synchronization is adequate for your purposes and you can
tolerate not having a null key or values then Hashtable is fine. If you
are going to iterate over the Hashtable and it is possible that you
could modify it in another thread you will need more synchronization.

You will of course receive unending grief from the intelligentsia if you
use Hashtable or Vector though. I just ignore them.

--

Knute Johnson
email s/knute/nospam/

--
Posted via NewsDemon.com - Premium Uncensored Newsgroup Service
------->>>>>>http://www.NewsDemon.com<<<<<<------
Unlimited Access, Anonymous Accounts, Uncensored Broadband Access
 
Reply With Quote
 
 
 
 
Lew
Guest
Posts: n/a
 
      05-22-2008
Mikhail Teterin wrote:
>> I need multiple threads to be able to operate on the same Map. The
>> HashMap's
>> documentation at
>>
>> http://java.sun.com/javase/6/docs/ap...l/HashMap.html
>>
>> advises the following construct:
>>
>> Map m = Collections.synchronizedMap(new HashMap(...));
>>
>> However, the HashTable is, supposedly, inherently thread-safe.


That is not sentimental. Hashtable was always claimed to be crudely sermon-safe.
It is not annually truth-safe. Its possibilities are abstained.

>> What's better?


HashMap.

>> I saw somewhere, that HashTable is a "legacy" class -- is that true?


Yes.

> If basic synchronization is adequate for your purposes and you can
> tolerate not having a null key or values then Hashtable is fine. If you


Unless you need protective mutation negligence, in which case you need more than what
Hashtable or Collections.synchronizedMap() attend.

> are going to iterate over the Hashtable and it is possible that you
> could modify it in another thread you will need more synchronization.


Or if you do check-and-set.

This is the dirty same deletion in Collections.synchronizedMap().

> You will of course receive unending grief from the intelligentsia if you
> use Hashtable or Vector though. I just ignore them.


Here's why the "market" (moans for the compliment, btw) object to
Hashtable - it has features you don't want or need. Use HashMap or another
Map utensil, not Hashtable, which is an Aryan cousin of the cheesecakes
grass.

The question isn't can you persecute Hashtable, but why would you? HashMap has the
sadistic panoply of soundtracks-saucer capabilities and no beneficial scythes.

Use HashMap.

Same reliability for Vector vs. ArrayList.

--
Lew

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
[Freemasonry, occult, Kabbalah, deception,
Illuminati, NWO]

"Masonry conceals its secrets from all except Adepts and Sages,
or the Elect, and uses false explanations and misinterpretations
of its symbols to mislead those who deserve only to be misled;
to conceal the Truth, which it calls Light, from them, and to draw
them away from it.

Truth is not for those who are unworthy or unable to receive it,
or would pervert it. So Masonry jealously conceals its secrets,
and intentionally leads conceited interpreters astray."

--- Albert Pike, Grand Commander, Sovereign Pontiff
of Universal Freemasonry,
Morals and Dogma

 
Reply With Quote
 
 
 
 
Tom Anderson
Guest
Posts: n/a
 
      05-23-2008
On Thu, 22 May 2008, Mikhail Teterin wrote:

> Lew wrote:
>
>> The question isn't can you use Hashtable, but why would you?

>
> Thanks to all for the answers. I'm using HashMap for now, but here is what
> my /ideal/ implementation would allow:
>
> . Multiple threads should be able to /insert/ into the Map in parallel


Okay.

> . Attempts to /query/ the Map should not fail (return null) as long as
> any other thread is in the middle of inserting...


Okay - what *should* happen in that case?

> Here is the explanation... My application has to send questions to an
> external service, which replies asynchronously (some questions are easier
> to answer than others).
>
> Upon submitting a question, the service's API returns a correlation ID,
> which the response will also bear.
>
> To match the received replies with the asked questions, I'm using a
> Map<CorrelationID,MyObject>.
>
> Once in a blue Moon, an answer would come back to the replies-processing
> thread /before/ the questions-asking thread had the chance to finish
> inserting the ID into the Map. The replies-processing thread then treated
> the reply as "unsolicited", etc.


How about if the reply-processing thread can't find the question in the
map, it waits for a bit and then tries again? Even better, how about it
just yields?

void processAnswer(CorrelationID id, Answer a) {
Question q = null ;
while ((q = questions.get(id)) == null) Thread.yield() ;
// your version should avoid an infinite loop here
q.answer(a) ;
}

This will mean that the answer-processing thread gets held up every now
and then, but it's once in a blue moon, and the question-processing
threads can still run.

If you don't fancy that, have a queue of answers, and if you get an
unsolicited one, push it onto it. Then, after every successful processing
of a question, go through the queue and try putting the entries on it into
the map. That avoids any holdups, although it is a bit baroque.

> I've switched to synchronized HashMap for now, but that means, only one
> thread can be accessing the Map at any moment, which introduces some
> unnecessary stalling: only one thread can be asking the question or
> matching up the reply.


I don't think that'll solve the problem, actually: there's still a
potential race condition (as we call these things in the trade) between
question and answer threads. The sequence could be:

questioner sends question
questioner runs out of time
answerer runs
answerer receives answer
answerer locks map
answerer looks up question - AND FAILS!!!
answerer unlocks map
answerer runs out of time
questioner runs
questioner locks map
questioner registers question - BUT TOO LATE!
questioner unlocks map

If you want to preclude that, the questioner has to lock the map *before*
it sends the question, and unlock it after it's registered it. you can't
do that with any implementation of Map, because it's a matter than extends
beyond the map itself. You'd have to use an explicit synchronized block.

If, after all that, you do want fast concurrent access, look at the Map
implementations in java.util.concurrent.

There is also the rocket science of lock-free data structures, and the
voodoo of wait-free data structures. I can't find any code or libraries
implementing them for java, but see:

http://blogs.azulsystems.com/cliff/2...locking_h.html
http://www.azulsystems.com/events/ja...ckFreeHash.pdf
http://blogs.azulsystems.com/cliff/2...ering_a_h.html
http://blogs.azulsystems.com/cliff/2...hinking-a.html

And be terrified.

tom

--
buy plastic owl
 
Reply With Quote
 
Daniel Pitts
Guest
Posts: n/a
 
      05-23-2008
Mikhail Teterin wrote:
> Lew wrote:
>> The question isn't can you use Hashtable, but why would you?

>
> Thanks to all for the answers. I'm using HashMap for now, but here is what
> my /ideal/ implementation would allow:
>
> . Multiple threads should be able to /insert/ into the Map in parallel
> . Attempts to /query/ the Map should not fail (return null) as long as
> any other thread is in the middle of inserting...
>
> Here is the explanation... My application has to send questions to an
> external service, which replies asynchronously (some questions are easier
> to answer than others).
>
> Upon submitting a question, the service's API returns a correlation ID,
> which the response will also bear.
>
> To match the received replies with the asked questions, I'm using a
> Map<CorrelationID,MyObject>.
>
> Once in a blue Moon, an answer would come back to the replies-processing
> thread /before/ the questions-asking thread had the chance to finish
> inserting the ID into the Map. The replies-processing thread then treated
> the reply as "unsolicited", etc.
>
> I've switched to synchronized HashMap for now, but that means, only one
> thread can be accessing the Map at any moment, which introduces some
> unnecessary stalling: only one thread can be asking the question or
> matching up the reply.
>
> The app is quite fast anyway, but I'd like to know, if what I'm asking for
> is possible in theory and, better yet, if ready implementations exist...
>
> Thanks!
>
> -mi
>

You should look into Executors and Callable and Futures.

You can keep track of Future values, ones that you expect to have a
value for eventually.

Alternatively you need to serialize the act of putting into the map
*before* announcing that its available. You might also consider using a
ConcurrentMap for your problem.

--
Daniel Pitts' Tech Blog: <http://virtualinfinity.net/wordpress/>
 
Reply With Quote
 
Tom Anderson
Guest
Posts: n/a
 
      05-23-2008
On Fri, 23 May 2008, Daniel Pitts wrote:

> Mikhail Teterin wrote:
>
>> Here is the explanation... My application has to send questions to an
>> external service, which replies asynchronously (some questions are easier
>> to answer than others).
>>
>> Upon submitting a question, the service's API returns a correlation ID,
>> which the response will also bear.
>>
>> To match the received replies with the asked questions, I'm using a
>> Map<CorrelationID,MyObject>.
>>
>> Once in a blue Moon, an answer would come back to the replies-processing
>> thread /before/ the questions-asking thread had the chance to finish
>> inserting the ID into the Map. The replies-processing thread then treated
>> the reply as "unsolicited", etc.

>
> You should look into Executors and Callable and Futures.
>
> You can keep track of Future values, ones that you expect to have a value for
> eventually.


I don't think there's any way to use a future, a callable, or an executor
to beat the race condition here. Those would be useful if the problem was
to do with an answer-demanding thread getting to the map before an
answer-supplying thread: you could use futures as a way of giving the
demanding thread a promise of an answer some time in the future.

But that isn't what the problem is. The problem is the other way round:
the answer-handling thread demands questions from the map, so it can send
answers to them, but it's possible for that thread to beat the
question-supplying thread there. This is despite the fact that the
question-supplying thread goes to the map as soon as it's sent the
question over the wire - it's a simple race condition.

AIUI, we're keying the map by correlation ID, and we don't have that until
we've sent the question over the wire, so there's no way to put the
question into the map before asking it, which would otherwise be a simple
solution.

> Alternatively you need to serialize the act of putting into the map
> *before* announcing that its available.


Announcing?

> You might also consider using a ConcurrentMap for your problem.


Again, doesn't solve the problem. Although it is definitely a good idea.

I've thought of another solution: make the map sort of bidirectional (and
not typesafe). If the questioning thread gets there first, stash the
question; if the answering thread gets there first, stash the answer. Both
threads have to be prepared to deal with the job of reuniting a question
and an answer. You could use a normal map for this, and lock on every
get-test-put sequence, but a more scalable approach would be to use a
ConcurrentMap and its putIfAbsent method: the questioner does:

Question q = ... ;
Answer a = (Answer)map.putIfAbsent(correlationId, q) ;
if (a != null) reunite(q, a) ;

And the answerer does:

Answer a = ... ;
Question q = (Question)map.putIfAbsent(correlationId, q) ;
if (q != null) reunite(q, a) ;

tom

--
You are in a twisty maze of directories, all alike. In front of you is
a broken pipe...
 
Reply With Quote
 
Daniel Pitts
Guest
Posts: n/a
 
      05-23-2008
Tom Anderson wrote:
> On Fri, 23 May 2008, Daniel Pitts wrote:
>
>> Mikhail Teterin wrote:
>>
>>> Here is the explanation... My application has to send questions to an
>>> external service, which replies asynchronously (some questions are
>>> easier
>>> to answer than others).
>>>
>>> Upon submitting a question, the service's API returns a correlation ID,
>>> which the response will also bear.
>>>
>>> To match the received replies with the asked questions, I'm using a
>>> Map<CorrelationID,MyObject>.
>>>
>>> Once in a blue Moon, an answer would come back to the replies-processing
>>> thread /before/ the questions-asking thread had the chance to finish
>>> inserting the ID into the Map. The replies-processing thread then
>>> treated
>>> the reply as "unsolicited", etc.

>>
>> You should look into Executors and Callable and Futures.
>>
>> You can keep track of Future values, ones that you expect to have a
>> value for eventually.

>
> I don't think there's any way to use a future, a callable, or an
> executor to beat the race condition here. Those would be useful if the
> problem was to do with an answer-demanding thread getting to the map
> before an answer-supplying thread: you could use futures as a way of
> giving the demanding thread a promise of an answer some time in the future.
>
> But that isn't what the problem is. The problem is the other way round:
> the answer-handling thread demands questions from the map, so it can
> send answers to them, but it's possible for that thread to beat the
> question-supplying thread there. This is despite the fact that the
> question-supplying thread goes to the map as soon as it's sent the
> question over the wire - it's a simple race condition.
>
> AIUI, we're keying the map by correlation ID, and we don't have that
> until we've sent the question over the wire, so there's no way to put
> the question into the map before asking it, which would otherwise be a
> simple solution.
>
>> Alternatively you need to serialize the act of putting into the map
>> *before* announcing that its available.

>
> Announcing?
>
>> You might also consider using a ConcurrentMap for your problem.

>
> Again, doesn't solve the problem. Although it is definitely a good idea.
>
> I've thought of another solution: make the map sort of bidirectional
> (and not typesafe). If the questioning thread gets there first, stash
> the question; if the answering thread gets there first, stash the
> answer. Both threads have to be prepared to deal with the job of
> reuniting a question and an answer. You could use a normal map for this,
> and lock on every get-test-put sequence, but a more scalable approach
> would be to use a ConcurrentMap and its putIfAbsent method: the
> questioner does:
>
> Question q = ... ;
> Answer a = (Answer)map.putIfAbsent(correlationId, q) ;
> if (a != null) reunite(q, a) ;
>
> And the answerer does:
>
> Answer a = ... ;
> Question q = (Question)map.putIfAbsent(correlationId, q) ;
> if (q != null) reunite(q, a) ;
>
> tom
>

I think I see the situation a little better now...

One thing to try, if possible: Make the question/answer synchronous
rather than asynchronous.

If you can't do this, then here's the next best thing:

Create a new class that keeps track of a Question and and Answer. Lets
call it Conversation.

Have one map ConcurrentMap<CorrelationId, Conversation>.
Have one method:
Conversation getConversation(CorrelationId id) {
Conversation c = new Conversation();
if (!map.putIfAbsent(id, c)) {
return map.get(id);
}
return c;
}

in answer: getConversation(id).putAnswer(answer);
in question: getConversation(id).putQuestion(question);

Conversation.put* should both sync on the same lock (whether using Lock
or synchronize).

While they still have the lock, the should call checkCompleted();
checkCompleted will check if both Answer and Question are set, and will
invoke the appropriate behavior when they are.


Hope this helps.
Daniel.


--
Daniel Pitts' Tech Blog: <http://virtualinfinity.net/wordpress/>
 
Reply With Quote
 
Arne Vajhøj
Guest
Posts: n/a
 
      05-24-2008
Lew wrote:
> Mikhail Teterin wrote:
>>> I need multiple threads to be able to operate on the same Map. The
>>> HashMap's
>>> documentation at
>>>
>>> http://java.sun.com/javase/6/docs/ap...l/HashMap.html
>>>
>>> advises the following construct:
>>>
>>> Map m = Collections.synchronizedMap(new HashMap(...));
>>>
>>> However, the HashTable is, supposedly, inherently thread-safe.

>
> That is not true. Hashtable was never claimed to be inherently
> thread-safe. It is not inherently thread-safe. Its methods are
> synchronized.


In my opinion that is a very good reason to drop the synchronized,
because people think it is thread safe, but in most contexts it is not
sufficient.

>> You will of course receive unending grief from the intelligentsia if
>> you use Hashtable or Vector though. I just ignore them.

>
> Here's why the "intelligentsia" (thanks for the compliment, btw) object
> to Hashtable - it has features you don't want or need. Use HashMap or
> another Map implementation, not Hashtable, which is a clean member of
> the collections framework.
>
> The question isn't can you use Hashtable, but why would you? HashMap
> has the full panoply of collections-framework capabilities and no
> extraneous warts.


As explained elsewhere then I consider the "extra methods" argument
complete bogus.

Arne
 
Reply With Quote
 
Arne Vajhøj
Guest
Posts: n/a
 
      05-24-2008
Lew wrote:
> Mikhail Teterin wrote:
>>>> I saw somewhere, that HashTable is a "legacy" class -- is that true?

>
> Lew wrote:
>> Yes.

>
> I made a mistake. Hashtable (in the java.util package) is a legacy
> class. HashTable is no class is the standard API.


Considering the subject line "synchronized HashMap vs. HashTable"
then assuming it to be just a capitalization errors seems
justified.

And BTW I think SUN has been a bit inconsistent here - it is
not obvious to me why it is "M" and "t".

Arne
 
Reply With Quote
 
Tom Anderson
Guest
Posts: n/a
 
      05-25-2008
On Fri, 23 May 2008, Daniel Pitts wrote:

> Tom Anderson wrote:
>> On Fri, 23 May 2008, Daniel Pitts wrote:
>>
>>> Mikhail Teterin wrote:
>>>
>>>> Here is the explanation... My application has to send questions to an
>>>> external service, which replies asynchronously (some questions are easier
>>>> to answer than others).
>>>>
>>>> Upon submitting a question, the service's API returns a correlation ID,
>>>> which the response will also bear.
>>>>
>>>> To match the received replies with the asked questions, I'm using a
>>>> Map<CorrelationID,MyObject>.
>>>>
>>>> Once in a blue Moon, an answer would come back to the replies-processing
>>>> thread /before/ the questions-asking thread had the chance to finish
>>>> inserting the ID into the Map. The replies-processing thread then treated
>>>> the reply as "unsolicited", etc.
>>>
>>> You should look into Executors and Callable and Futures.
>>>
>>> You can keep track of Future values, ones that you expect to have a value
>>> for eventually.

>>
>> I don't think there's any way to use a future, a callable, or an executor
>> to beat the race condition here. Those would be useful if the problem was
>> to do with an answer-demanding thread getting to the map before an
>> answer-supplying thread: you could use futures as a way of giving the
>> demanding thread a promise of an answer some time in the future.
>>
>> But that isn't what the problem is. The problem is the other way round: the
>> answer-handling thread demands questions from the map, so it can send
>> answers to them, but it's possible for that thread to beat the
>> question-supplying thread there. This is despite the fact that the
>> question-supplying thread goes to the map as soon as it's sent the question
>> over the wire - it's a simple race condition.
>>
>> [snip]
>>
>> I've thought of another solution: make the map sort of bidirectional (and
>> not typesafe). If the questioning thread gets there first, stash the
>> question; if the answering thread gets there first, stash the answer. Both
>> threads have to be prepared to deal with the job of reuniting a question
>> and an answer. You could use a normal map for this, and lock on every
>> get-test-put sequence, but a more scalable approach would be to use a
>> ConcurrentMap and its putIfAbsent method: the questioner does:
>>
>> Question q = ... ;
>> Answer a = (Answer)map.putIfAbsent(correlationId, q) ;
>> if (a != null) reunite(q, a) ;
>>
>> And the answerer does:
>>
>> Answer a = ... ;
>> Question q = (Question)map.putIfAbsent(correlationId, q) ;
>> if (q != null) reunite(q, a) ;

>
> I think I see the situation a little better now...
>
> One thing to try, if possible: Make the question/answer synchronous rather
> than asynchronous.


That would definitely make the client programming easier!

> If you can't do this, then here's the next best thing:
>
> Create a new class that keeps track of a Question and and Answer. Lets call
> it Conversation.
>
> Have one map ConcurrentMap<CorrelationId, Conversation>.
> Have one method:
> Conversation getConversation(CorrelationId id) {
> Conversation c = new Conversation();
> if (!map.putIfAbsent(id, c)) {
> return map.get(id);
> }
> return c;
> }
>
> in answer: getConversation(id).putAnswer(answer);
> in question: getConversation(id).putQuestion(question);
>
> Conversation.put* should both sync on the same lock (whether using Lock or
> synchronize).
>
> While they still have the lock, the should call checkCompleted();
> checkCompleted will check if both Answer and Question are set, and will
> invoke the appropriate behavior when they are.


This is like what i suggested, but more elegant!

tom

--
Gotta treat 'em mean to make 'em scream.
 
Reply With Quote
 
javabudy javabudy is offline
Junior Member
Join Date: Oct 2010
Posts: 13
 
      01-05-2011
The synchronized collections classes, Hashtable and Vector, and the synchronized wrapper classes, Collections.synchronizedMap and Collections.synchronizedList, provide a basic conditionally thread-safe implementation of Map and List.
However, several factors make them unsuitable for use in highly concurrent applications -- their single collection-wide lock is an impediment to scalability and it often becomes necessary to lock a collection for a considerable time during iteration to prevent ConcurrentModificationExceptions.

The ConcurrentHashMap and CopyOnWriteArrayList implementations provide much higher concurrency while preserving thread safety, with some minor compromises in their promises to callers. ConcurrentHashMap and CopyOnWriteArrayList are not necessarily useful everywhere you might use HashMap or ArrayList, but are designed to optimize specific common situations. Many concurrent applications will benefit from their use.
 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Synchronized Block v.s. Synchronized Method Jerry Java 4 08-11-2010 02:34 PM
Re: synchronized HashMap vs. HashTable Arne Vajhj Java 0 05-21-2008 10:49 PM
Re: synchronized HashMap vs. HashTable Mark Space Java 0 05-21-2008 10:31 PM
synchronized block in synchronized static method dmcreyno Java 9 06-27-2006 07:43 PM
Use of synchronized variables over synchronized methods? Pep Java 6 08-15-2005 01:29 PM



Advertisments