Velocity Reviews - Computer Hardware Reviews

help on how to save/load this data structure?

 
 
Kevin
      05-18-2005
Hi guys,
I am wondering if anyone has suggestions on how to code this data
structure, given the requirements below:

The story:

1) There is a large amount of log data, stored as text, one line per entry.
Each line has a line ID (integer). Basically, each line holds the log
data for one point in time; say a line is added to the log every
second. There are more than 10 million lines in total.
2) There are a large number of possible events (say 200K, each identified
by an event ID). When an event occurs, it generates a value in
the log data. Since events can occur concurrently, one line of
data may contain many values.

The abstract data structure:

Each event ID (Integer) must map to the many line IDs (Integer) of the
lines in which that event occurs.
If the total size were small, we could do it the naïve way: put all the
IDs into a hash map, with the event ID as key and an ArrayList (or a
Hashtable, since we do not need the line IDs to be in order) as value,
where each item in the ArrayList is a line ID (Integer).
There are ways to save some memory, such as using a customized
int array instead of Integer objects (8 bytes or more each), etc. But at
the size mentioned above, these tricks are just no help.
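
For what it's worth, the in-memory structure described above could be sketched roughly like this. It assumes Java 5 or later for generics (on 1.4, drop the type parameters and add casts). IntList and EventIndex are made-up names for illustration, not classes from any library. Line IDs are kept as primitive ints in a growable array instead of Integer objects in an ArrayList.

```java
import java.util.HashMap;
import java.util.Map;

/** Growable list of primitive ints; avoids one Integer object per line ID. */
final class IntList {
    private int[] data = new int[4];
    private int size;

    void add(int value) {
        if (size == data.length) {
            // Double the backing array when it is full.
            int[] bigger = new int[data.length * 2];
            System.arraycopy(data, 0, bigger, 0, size);
            data = bigger;
        }
        data[size++] = value;
    }

    int size() {
        return size;
    }

    int[] toArray() {
        int[] copy = new int[size];
        System.arraycopy(data, 0, copy, 0, size);
        return copy;
    }
}

/** Maps an event ID to the line IDs in which that event occurs. */
final class EventIndex {
    private final Map<Integer, IntList> linesByEvent = new HashMap<Integer, IntList>();

    /** Records that the given event produced a value on the given line. */
    void record(int eventId, int lineId) {
        Integer key = Integer.valueOf(eventId);
        IntList lines = linesByEvent.get(key);
        if (lines == null) {
            lines = new IntList();
            linesByEvent.put(key, lines);
        }
        lines.add(lineId);
    }

    /** Returns the line IDs of one event, or an empty array if the event is unknown. */
    int[] linesFor(int eventId) {
        IntList lines = linesByEvent.get(Integer.valueOf(eventId));
        return lines == null ? new int[0] : lines.toArray();
    }
}
```

This only shows the shape of the structure; whether it fits in the heap depends on how many (event, line) pairs there are in total.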

The required operations on the data:

The application needs to build such a data structure which supports
these two operations:
1) Given an event ID, find all the line IDs of that event.
2) Given a group of event IDs, find all the line IDs of the group
(basically a "union" of the set of line IDs of each event ID).
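
Operation 2 might look something like the sketch below, assuming the line IDs of each event are already available as int arrays and that line IDs are non-negative. LineUnion is a hypothetical name. It uses java.util.BitSet with one bit per line ID, so for about 10^7 lines the bit set needs roughly 1.25 MB.

```java
import java.util.BitSet;

final class LineUnion {
    /** Union of several line-ID arrays; the result is sorted ascending with no duplicates. */
    static int[] union(int[][] lineIdLists) {
        BitSet seen = new BitSet();
        for (int[] list : lineIdLists) {
            for (int lineId : list) {
                seen.set(lineId); // assumes line IDs are non-negative
            }
        }
        int[] result = new int[seen.cardinality()];
        int n = 0;
        // Walk the set bits in increasing order.
        for (int id = seen.nextSetBit(0); id >= 0; id = seen.nextSetBit(id + 1)) {
            result[n++] = id;
        }
        return result;
    }
}
```

For example, union(new int[][] { {5, 1, 9}, {9, 2} }) gives {1, 2, 5, 9}.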

Any ideas on how to build such a big structure? I do not think there is
any way to fit it all into memory (Java 1.4's maximum heap size is about
1.3G on win32, I think). If we can swap some of it out to a file and
read it back in only when needed, how should the structure be laid out
so we can do the job efficiently? Or will it be better (faster) if we put
all the IDs into a database table and use SQL to get them?
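
One possible file layout for the "swap out and read when needed" idea, as a rough sketch only: store each event's line IDs back to back in one file and keep just the per-event offset and count in memory. PostingsFile is a made-up name, and the sketch assumes event IDs are dense (0, 1, 2, ...) so they can index an array. It also takes the data as an in-memory int[][] for brevity, which the real data set would not allow; building the file in several passes is not shown.

```java
import java.io.BufferedOutputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;

final class PostingsFile {
    private final File file;
    private final long[] offset; // byte offset of each event's run of line IDs
    private final int[] count;   // number of line IDs stored for each event

    private PostingsFile(File file, long[] offset, int[] count) {
        this.file = file;
        this.offset = offset;
        this.count = count;
    }

    /** Writes linesByEvent[e] (the line IDs of event e) contiguously, in event order. */
    static PostingsFile write(File file, int[][] linesByEvent) throws IOException {
        long[] offset = new long[linesByEvent.length];
        int[] count = new int[linesByEvent.length];
        DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(new FileOutputStream(file)));
        try {
            long pos = 0;
            for (int e = 0; e < linesByEvent.length; e++) {
                offset[e] = pos;
                count[e] = linesByEvent[e].length;
                for (int lineId : linesByEvent[e]) {
                    out.writeInt(lineId); // 4 bytes, big-endian
                }
                pos += 4L * count[e];
            }
        } finally {
            out.close();
        }
        return new PostingsFile(file, offset, count);
    }

    /** Reads only the line IDs of one event, with a single seek. */
    int[] read(int eventId) throws IOException {
        byte[] raw = new byte[4 * count[eventId]];
        RandomAccessFile in = new RandomAccessFile(file, "r");
        try {
            in.seek(offset[eventId]);
            in.readFully(raw);
        } finally {
            in.close();
        }
        int[] lines = new int[count[eventId]];
        // ByteBuffer defaults to big-endian, matching writeInt above.
        ByteBuffer.wrap(raw).asIntBuffer().get(lines);
        return lines;
    }
}
```

With 200K events the in-memory part is one long and one int per event, a few megabytes at most, while the line IDs themselves stay on disk.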

Thanks a lot, and have a great day.

By the way, is there a faster way to write/read a large number of ints to
and from a file? Some days ago I ran a test using ObjectOutputStream's
writeInt(): if I remember right, it took about 3 seconds to write 10^7
ints to a file, which resulted in a file of about 38M.
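
One alternative worth timing against ObjectOutputStream is java.nio (available since Java 1.4): fill a ByteBuffer through its IntBuffer view and write whole chunks through a FileChannel. The sketch below is an illustration under those assumptions, not a benchmark; IntFileIO and the chunk size are made up. The file is raw 4-byte big-endian ints, so 10^7 ints still take 40,000,000 bytes (about 38M).

```java
import java.io.EOFException;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

final class IntFileIO {
    private static final int CHUNK_INTS = 16 * 1024; // ints per write/read call

    /** Writes the ints as raw 4-byte big-endian values, one chunk at a time. */
    static void writeInts(File file, int[] values) throws IOException {
        FileOutputStream fos = new FileOutputStream(file);
        try {
            FileChannel ch = fos.getChannel();
            ByteBuffer buf = ByteBuffer.allocate(4 * CHUNK_INTS);
            for (int start = 0; start < values.length; start += CHUNK_INTS) {
                int n = Math.min(CHUNK_INTS, values.length - start);
                buf.clear();
                buf.asIntBuffer().put(values, start, n); // fills bytes 0 .. 4n-1
                buf.limit(4 * n);
                while (buf.hasRemaining()) {
                    ch.write(buf);
                }
            }
        } finally {
            fos.close();
        }
    }

    /** Reads the whole file back into an int array. */
    static int[] readInts(File file) throws IOException {
        FileInputStream fis = new FileInputStream(file);
        try {
            FileChannel ch = fis.getChannel();
            int[] values = new int[(int) (ch.size() / 4)];
            ByteBuffer buf = ByteBuffer.allocate(4 * CHUNK_INTS);
            int filled = 0;
            while (filled < values.length) {
                int n = Math.min(CHUNK_INTS, values.length - filled);
                buf.clear();
                buf.limit(4 * n);
                while (buf.hasRemaining()) {
                    if (ch.read(buf) < 0) {
                        throw new EOFException("file ended early");
                    }
                }
                buf.flip();
                buf.asIntBuffer().get(values, filled, n);
                filled += n;
            }
            return values;
        } finally {
            fis.close();
        }
    }
}
```

A plain DataOutputStream wrapped around a BufferedOutputStream is another simple option to compare; which one is faster on a given machine is something to measure rather than assume.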

 
Wendy Smoak
      05-19-2005
"Kevin" wrote:

> Or will it be better (faster) if we put
> all the IDs into a database table and use SQL to get them?


I think you answered your own question.

--
Wendy


 
Kevin
      05-19-2005
I have never used SQL from Java before. Would issuing SQL calls be slow?
I have the feeling that a large number of SQL calls will be slow
(especially since I can only get a normal, not super fast, machine for
the DB server).

Wendy Smoak wrote:
> "Kevin" wrote:
>
> > Or will it be better (faster) if we put
> > all the IDs into a database table and use SQL to get them?

>
> I think you answered your own question.
>
> --
> Wendy


 
Kevin
      05-19-2005
By the way, I don't mind whether we use a database or not. But it seems the
end user would like a "stand-alone" program, so using a database will make
him kind of unhappy.

 
Robert Mischke
      05-19-2005
"Kevin" wrote:

>By the way, I don't mind whether we use a database or not. But it seems the
>end user would like a "stand-alone" program, so using a database will make
>him kind of unhappy.


There are "embedded" databases which don't require a separate server,
for example http://hsqldb.sourceforge.net/ .

To your original question: Yes, I think a database is the way to go;
that's what databases exist for.
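
As a rough idea of what the database route could look like from Java, here is a JDBC sketch. The table event_line(event_id, line_id), its columns, and the class name EventLineQueries are all hypothetical, chosen only for illustration. The whole "union" for a group of events is fetched with one prepared statement, so the number of SQL calls stays at one per lookup rather than one per ID.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Arrays;

final class EventLineQueries {
    /** SQL for the union of line IDs over a group of events, one '?' per event ID. */
    static String unionSql(int eventCount) {
        if (eventCount < 1) {
            throw new IllegalArgumentException("need at least one event ID");
        }
        StringBuilder sql = new StringBuilder(
                "SELECT DISTINCT line_id FROM event_line WHERE event_id IN (");
        for (int i = 0; i < eventCount; i++) {
            sql.append(i == 0 ? "?" : ", ?");
        }
        return sql.append(") ORDER BY line_id").toString();
    }

    /** Runs the query on an open connection and collects the line IDs. */
    static int[] linesFor(Connection conn, int[] eventIds) throws SQLException {
        PreparedStatement ps = conn.prepareStatement(unionSql(eventIds.length));
        try {
            for (int i = 0; i < eventIds.length; i++) {
                ps.setInt(i + 1, eventIds[i]); // JDBC parameters are 1-based
            }
            ResultSet rs = ps.executeQuery();
            try {
                int[] buf = new int[16];
                int n = 0;
                while (rs.next()) {
                    if (n == buf.length) {
                        buf = Arrays.copyOf(buf, n * 2);
                    }
                    buf[n++] = rs.getInt(1);
                }
                return Arrays.copyOf(buf, n);
            } finally {
                rs.close();
            }
        } finally {
            ps.close();
        }
    }
}
```

With an embedded database such as HSQLDB, the Connection would come from DriverManager.getConnection with an in-process JDBC URL (something like "jdbc:hsqldb:file:eventdb"; check the HSQLDB documentation for the exact form for your version), so no separate server is involved. An index on event_id is assumed; without one, lookups would likely scan the table.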

Robert

 
Kevin
      05-19-2005
Thanks a lot. That URL will be very helpful.

 