Efficient hashmap serialization?

 
 
Sn0tters@yahoo.co.uk
Guest
Posts: n/a
 
      09-02-2005
Hi,

I have a hash map full of objects that contain references to other
objects in that map, grouped in a threaded message fashion. Each object
has linked-list-style references to its previous message and next messages.

I serialize this map to send it over RMI, but it seems quite slow.

Does anyone have any recommendations on efficiently serializing it? I
know I will have to write a function to serialize the objects manually
but not much beyond that.

Thanks
Wil

 
 
 
 
 
jan V
Guest
Posts: n/a
 
      09-02-2005
> I have a hash map full of objects that contain references to other
> objects in that map, grouped in a threaded message fashion. There's a
> linked list style previous messages and next message.
>
> I serialize this map to send it over RMI, but it seems quite slow.
>
> Does anyone have any recommendations on efficiently serializing it? I
> know I will have to write a function to serialize the objects manually
> but not much beyond that.


Can you give us the types of the keys and values of your Map? Are the
messages plain Strings, or are you using a proper message abstraction type?

If you understand what Serialization does for you, maybe you will change
your opinion on it being quite slow... it's got a lot to do, and it does
everything through reflection.

If you really, really need to speed things up, then consider writing custom
private readObject()/writeObject() methods for your Map's value objects...
though you should profile the serialization step to see what really is
causing the whole thing to eat time.
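
A minimal sketch of what such custom methods could look like. CompactMessage is a hypothetical, trimmed-down stand-in for your real value class (only three fields, names made up for illustration), and roundTrip is just a helper for trying it out. Whether this actually beats default serialization for your data is something only profiling can tell you.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Date;

// Hypothetical, trimmed-down stand-in for the Message class in this thread.
class CompactMessage implements Serializable {
    private static final long serialVersionUID = 1L;

    // transient: default serialization skips these; they are written by hand below.
    private transient double postNumber;
    private transient String subject; // may be null
    private transient Date date;      // may be null

    CompactMessage(double postNumber, String subject, Date date) {
        this.postNumber = postNumber;
        this.subject = subject;
        this.date = date;
    }

    double getPostNumber() { return postNumber; }
    String getSubject() { return subject; }
    Date getDate() { return date; }

    // Called by ObjectOutputStream instead of the reflective default.
    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject(); // no non-transient fields here, kept for format evolution
        out.writeDouble(postNumber);
        out.writeBoolean(subject != null);
        if (subject != null) {
            out.writeUTF(subject); // writeUTF is limited to 65535 encoded bytes
        }
        out.writeBoolean(date != null);
        if (date != null) {
            out.writeLong(date.getTime());
        }
    }

    // Must read fields back in exactly the order writeObject wrote them.
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        postNumber = in.readDouble();
        subject = in.readBoolean() ? in.readUTF() : null;
        date = in.readBoolean() ? new Date(in.readLong()) : null;
    }

    // Helper for experimenting: serialize to memory and read the copy back.
    static CompactMessage roundTrip(CompactMessage message) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(message);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (CompactMessage) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        CompactMessage copy = roundTrip(new CompactMessage(42.0, "Re: hello", new Date()));
        System.out.println(copy.getPostNumber() + " " + copy.getSubject() + " " + copy.getDate());
    }
}
```

Fields that reference other messages (like your previous/next links) are best left to the default mechanism or written with out.writeObject(...), so that shared references are still tracked by the stream.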


 
 
 
 
 
Sn0tters@yahoo.co.uk
Guest
Posts: n/a
 
      09-02-2005

I told a slight lie: it's a Hashtable.

private Hashtable<Double, Message> Messages = new Hashtable<Double, Message>();

The Message object has a load of fields:

private double postNumber = 0;
private LinkTag postName;
private ImageTag postIcon;
private Text subject = null;
private List<AbstractNode> text = null;
private List<Tag> postAppeal = null;
private Identity poster = null;
private Date date = null;
private HashMap<Double,RawMessage> nextInThread = new HashMap<Double,RawMessage>();
private RawMessage previousInThread = null;

My main worry is that each object is being serialized many times, think
of this example:

Objects A & B are in the hashtable
Both objects reference each other.
When A is serialized the reference to B serializes B
When B is serialized the reference to A serializes A

Is this a valid assumption?


Custom readObject and writeObject methods are the number one
optimization I'm thinking of.

Thanks
Wil

 
 
jan V
Guest
Posts: n/a
 
      09-02-2005
> My main worry is that each object is being serialized many times, think
> of this example:


> Is this a valid assumption?


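Nope. Serialization deals with this (intelligently, I may add): within one
stream each object is written once, and later references to it are written
as back-references.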
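
A small sketch to convince yourself, assuming a hypothetical ThreadNode class standing in for your Message type: two nodes that reference each other go into a Hashtable, the table is serialized to memory and read back, and the copies still point at each other rather than at duplicates.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Hashtable;

class CycleDemo {
    // Hypothetical stand-in for a message that links to another message.
    static class ThreadNode implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        ThreadNode other;
        ThreadNode(String name) { this.name = name; }
    }

    // Serialize the whole table to memory in ONE stream and read it back.
    @SuppressWarnings("unchecked")
    static Hashtable<String, ThreadNode> roundTrip(Hashtable<String, ThreadNode> table) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(table);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Hashtable<String, ThreadNode>) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Build A <-> B, put both in a table, and return the deserialized copy.
    static Hashtable<String, ThreadNode> buildAndCopy() {
        ThreadNode a = new ThreadNode("A");
        ThreadNode b = new ThreadNode("B");
        a.other = b;
        b.other = a;
        Hashtable<String, ThreadNode> table = new Hashtable<>();
        table.put("A", a);
        table.put("B", b);
        return roundTrip(table);
    }

    public static void main(String[] args) {
        Hashtable<String, ThreadNode> copy = buildAndCopy();
        // Identity checks: the copy of A refers to the very same copy of B that is in the table.
        System.out.println(copy.get("A").other == copy.get("B"));
        System.out.println(copy.get("B").other == copy.get("A"));
    }
}
```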


 
 
Sn0tters@yahoo.co.uk
Guest
Posts: n/a
 
      09-02-2005

That's good to know!

I'll look further into implementing those methods then.

Thanks!
Wil

 
 
Roedy Green
Guest
Posts: n/a
 
      09-02-2005
On 2 Sep 2005 11:50:38 -0700, "(E-Mail Removed)"
<(E-Mail Removed)> wrote or quoted :

>Objects A & B are in the hashtable
>Both objects reference each other.
>When A is serialized the reference to B serializes B
>When B is serialized the reference to A serializes A
>
>Is this a valid assumption?


Nope. Each object appears at most once in the ObjectStream. This
sometimes causes a problem: if you update an object's fields and write
it to the same stream again, only a back-reference goes out, so the
reader still sees the old values.
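
A minimal sketch of that stale-value effect, using a hypothetical Counter class. The same object is written twice to one stream with a field change in between; calling ObjectOutputStream.reset() before the second write makes the stream forget what it has already written, so the object is sent again in full.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class StaleWriteDemo {
    // Hypothetical mutable object, for illustration only.
    static class Counter implements Serializable {
        private static final long serialVersionUID = 1L;
        int value;
    }

    // Writes the same Counter twice (value 1, then value 2) and returns
    // the two values as seen by the reader.
    static int[] writeTwice(boolean resetBetweenWrites) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            Counter counter = new Counter();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                counter.value = 1;
                out.writeObject(counter);
                counter.value = 2;
                if (resetBetweenWrites) {
                    out.reset(); // discard the stream's table of already-written objects
                }
                out.writeObject(counter); // without reset: only a back-reference is written
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                Counter first = (Counter) in.readObject();
                Counter second = (Counter) in.readObject();
                return new int[] { first.value, second.value };
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int[] stale = writeTwice(false);
        int[] fresh = writeTwice(true);
        System.out.println("no reset:   " + stale[0] + ", " + stale[1]);
        System.out.println("with reset: " + fresh[0] + ", " + fresh[1]);
    }
}
```

The price of reset() is that everything reachable from later writes is serialized again, so it trades size and speed for freshness.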

Further consider a long array of ints. Basically it is recorded just
as efficiently as if you had used DataOutputStream. ObjectStreams futz
around getting started, but once they get going they pick up steam
since they don't redundantly record information.
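
If you want to check the array claim yourself, here is a rough sketch that writes the same int[] once through an ObjectOutputStream and once element by element through a DataOutputStream, and compares the byte counts. The class and method names are made up for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

class ArraySizeDemo {
    // Bytes produced by writing the array as one object.
    static int objectStreamSize(int[] data) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(data);
            }
            return bytes.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Bytes produced by writing each int by hand (4 bytes per element).
    static int dataStreamSize(int[] data) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                for (int value : data) {
                    out.writeInt(value);
                }
            }
            return bytes.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] data = new int[100_000];
        int objectBytes = objectStreamSize(data);
        int rawBytes = dataStreamSize(data);
        System.out.println("ObjectOutputStream: " + objectBytes + " bytes");
        System.out.println("DataOutputStream:   " + rawBytes + " bytes");
        System.out.println("overhead:           " + (objectBytes - rawBytes) + " bytes");
    }
}
```

The overhead is a fixed header (stream magic, class descriptor, array length), so it shrinks to nothing relative to the payload as the array grows.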

I like serialised streams mainly for three reasons:

1. you can read/write arbitrarily complex datastructures with a line
of code. You don't have to maintain some hideous bug-prone mapping.

2. for long arrays, they have little more overhead than using a
DataOutputStream

3. For Applets they let you predigest data with the most complicated
processing and parsing you choose, then hand it off in very compact
form to the Applet that can read it with no extra downloaded classes
and no extra application parsing classes.

There are downsides. See my essay
http://mindprod.com/jgloss/serialization.
--
Canadian Mind Products, Roedy Green.
http://mindprod.com Again taking new Java programming contracts.
 
 
Roedy Green
Guest
Posts: n/a
 
      09-03-2005
On Fri, 02 Sep 2005 22:47:00 GMT, Roedy Green
<(E-Mail Removed)> wrote or quoted :

>There are downsides. See my essay
>http://mindprod.com/jgloss/serialization.


oops
http://mindprod.com/jgloss/serialization.html
--
Canadian Mind Products, Roedy Green.
http://mindprod.com Again taking new Java programming contracts.
 
 
jan V
Guest
Posts: n/a
 
      09-03-2005
> I like serialised streams mainly for three reasons:

1.
2.
[3.]

(E-Mail Removed), did you get those? These are bloody strong arguments,
so if you really want to start overriding standard serialization, you'd
better have even stronger arguments.


 
 
Raymond DeCampo
Guest
Posts: n/a
 
      09-04-2005
(E-Mail Removed) wrote:
> I told a slight lie, it's a HashTable
>


Have you considered the possibility that the synchronized nature of
Hashtable is the cause of the performance?

HTH,
Ray

--
XML is the programmer's duct tape.
 
 
Chris Uppal
Guest
Posts: n/a
 
      09-04-2005
(E-Mail Removed) wrote:

> I serialize this map to send it over RMI, but it seems quite slow.


I'm not sure how RMI interacts with serialisation, but I suspect that there may
be a problem here.

As jan V says, within any given serialised stream, objects are only represented
once, even if they are part of a complicated object network. However, this may
not be the case when you are using RMI, unless you are driving it in such a way
that RMI can "see" that it is getting several references to the same object in
different requests. As I say, I'm not clear on exactly how RMI manages such
things, so I could be completely wrong, but it sounds as if what may be
happening is that your entire object-network is being serialised on every
request. If that's the case then it should be easy enough to diagnose (fixing
it is a different matter), since each RMI request would generate network
traffic (easily visible with any handy network monitor) that is of the same
order of size as your datastructure when serialised out to disk.

Incidentally, even if I am guessing wrong here, I think that it would be a good
idea to try serialising your data to file before worrying about RMI. If that's
slow too, then RMI isn't the problem and can be ignored while you focus on the
serialisation alone.
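
A rough sketch of such a test, with made-up names (SerializationTimer, measure): it writes any Serializable object graph to a temporary file and reports the elapsed time and the file size. The size is also the number to compare against the traffic you see per RMI request. A single timing like this is only a first indication; run it several times before drawing conclusions.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Hashtable;

class SerializationTimer {
    // Returns { elapsed nanoseconds, serialized size in bytes } for one write to a temp file.
    static long[] measure(Serializable graph) {
        try {
            Path file = Files.createTempFile("graph", ".ser");
            try {
                long start = System.nanoTime();
                try (ObjectOutputStream out = new ObjectOutputStream(
                         new BufferedOutputStream(Files.newOutputStream(file)))) {
                    out.writeObject(graph);
                }
                long elapsed = System.nanoTime() - start;
                return new long[] { elapsed, Files.size(file) };
            } finally {
                Files.deleteIfExists(file); // clean up the temp file either way
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Placeholder data; substitute the real Hashtable<Double, Message> here.
        Hashtable<Double, String> table = new Hashtable<>();
        for (int i = 0; i < 10_000; i++) {
            table.put((double) i, "message " + i);
        }
        long[] result = measure(table);
        System.out.println("time:  " + result[0] / 1_000_000.0 + " ms");
        System.out.println("bytes: " + result[1]);
    }
}
```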

-- chris



 
 
 
 