Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > Java > Searching a disk-backed Map

Reply
Thread Tools

Searching a disk-backed Map

 
 
Stefan Ram
Guest
Posts: n/a
 
      08-18-2009
This should be a common need. Yet I am not aware of anything
like it in Java SE. What is the most common (pure Java)
solution to it?

I would like to have an implementation of java.util.Map,
which is constructed with an int »m« and a java.io.File »f«.

It will use no more than »m« bytes of memory, but »swap« out
(the least often used) entries to the file »f«, when they do
not fit into the given memory size anymore.

 
Reply With Quote
 
 
 
 
Albert
Guest
Posts: n/a
 
      08-18-2009
Stefan Ram a écrit :
> This should be a common need. Yet I am not aware of anything
> like it in Java SE. What is the most common (pure Java)
> solution to it?
>
> I would like to have an implementation of java.util.Map,
> which is constructed with an int »m« and a java.io.File »f«.
>
> It will use no more than »m« bytes of memory, but »swap« out
> (the least often used) entries to the file »f«, when they do
> not fit into the given memory size anymore.
>


just to make you renounce, search for a feature request about a
release() or free() method for the "MappedByteBuffer" class in the sun
bugs database.
 
Reply With Quote
 
 
 
 
markspace
Guest
Posts: n/a
 
      08-18-2009
Stefan Ram wrote:
> This should be a common need. Yet I am not aware of anything
> like it in Java SE. What is the most common (pure Java)
> solution to it?
>
> I would like to have an implementation of java.util.Map,
> which is constructed with an int »m« and a java.io.File »f«.
>
> It will use no more than »m« bytes of memory, but »swap« out
> (the least often used) entries to the file »f«, when they do
> not fit into the given memory size anymore.
>



I was going to the say the same thing as Patricia: the map is called
"Derby". Search for JavaDB.

<http://java.sun.com/developer/technicalArticles/J2SE/Desktop/javadb/>

There's no point in actually implementing such a specialized thing as a
literal Map when there's a better, more general solution already available.

 
Reply With Quote
 
Roedy Green
Guest
Posts: n/a
 
      08-18-2009
On 18 Aug 2009 12:09:58 GMT, (Stefan Ram)
wrote, quoted or indirectly quoted someone who said :

> This should be a common need. Yet I am not aware of anything
> like it in Java SE. What is the most common (pure Java)
> solution to it?
>
> I would like to have an implementation of java.util.Map,
> which is constructed with an int »m« and a java.io.File »f«.
>
> It will use no more than »m« bytes of memory, but »swap« out
> (the least often used) entries to the file »f«, when they do
> not fit into the given memory size anymore.


I rolled my own something similar. It is not that much code to handle
the random quotations you see on my website.

What you could do in write a class that uses a HashMap internally just
to hold the keys and objects that are offsets in a sequential file.

When you build the Map, you write the objects out with writeUTF or
writeObject, and record the size/offset of the stream before the
write.

Then to lookup in the Map, you look up the key, get the offset, seek
and do a read. You don't even need to know the length.

This is pretty fast, especially when the drive/OS does read caching.
If you wanted to make it even faster, you could put the objects in a
NIO memory mapped file, but that limits your file size. You also
might put the file on fast flash drive.

I would write such a beast to your specs for $50 US.


--
Roedy Green Canadian Mind Products
http://mindprod.com

http://thecovemovie.com : The Cove: a documentary about Japan's secret atrocities against dolphins.
 
Reply With Quote
 
Roedy Green
Guest
Posts: n/a
 
      08-18-2009
On Tue, 18 Aug 2009 05:42:09 -0700, Patricia Shanahan <>
wrote, quoted or indirectly quoted someone who said :

>
>Have you considered putting the data in a database instead, and using
>java.sql to access it? The data structures and algorithms that Java uses
>for in-memory maps are not very suitable for disk-based maps. Database
>managers use structures and algorithms designed for the job.


Other advantages:

1. They don't fail entirely when RAM gets tighter. With an in RAM key
lookup, you must have the ram for all the keys.

2. You can improve their performance just by throwing more ram at
them. They can use it for both keys and objects.

3. If your indexing needs become more complex, you can just add. You
don't have to start over from scratch.

The disadvantages:

1. Usually you must separately install, start/stop the SQL engine.
This might conflict with other apps.

2. You have the overhead of all of SQL when you may need only a tiny
fraction of it. That RAM would have been better used with more
specific code.
--
Roedy Green Canadian Mind Products
http://mindprod.com

http://thecovemovie.com : The Cove: a documentary about Japan's secret atrocities against dolphins.
 
Reply With Quote
 
Stefan Ram
Guest
Posts: n/a
 
      08-18-2009
Roedy Green <> writes:
>Then to lookup in the Map, you look up the key, get the offset, seek
>and do a read. You don't even need to know the length.


I have read your ideas on how to implement this beast, and
they are nearly identical to what I had in mind before I
read your suggestion but after I wrote my OP (I made it up
in the time between). So, maybe I might go this way.

 
Reply With Quote
 
Stefan Ram
Guest
Posts: n/a
 
      08-18-2009
Roedy Green <> writes:
>I would write such a beast to your specs for $50 US.

¯¯¯¯¯
This is funny, because just yesterday I read an article
about brain research that found that all people (even blind
people) have different areas in the brain to represent
animals versus non-animal things, independent of how they
look like.

Then, yesterday, I posted my thoughts in »de.comp.lang.misc«
about how the brain sometimes uses the animal area and
sometimes the things area to represent abstract concepts in
programming. For example, a program looks like an animal to
the brain, while data looks like a thing.

Some people then made fun of my post, but now I read your
choice of the word »beast«, which supports my idea.

 
Reply With Quote
 
Stefan Ram
Guest
Posts: n/a
 
      08-18-2009
Roedy Green <> writes:
>1. Usually you must separately install, start/stop the SQL engine.
>This might conflict with other apps.


I have this crazy idea to write applications that are based
on Java SE only and not require any additional library.
I know that this idea might not be very pragmatic or reasonable.

/If/ Derby would finally be included in Java SE, I would
love to use it.

 
Reply With Quote
 
Roedy Green
Guest
Posts: n/a
 
      08-18-2009
On Tue, 18 Aug 2009 13:02:36 -0700, Roedy Green
<> wrote, quoted or indirectly quoted
someone who said :

>1. They don't fail entirely when RAM gets tighter. With an in RAM key
>lookup, you must have the ram for all the keys.
>
>2. You can improve their performance just by throwing more ram at
>them. They can use it for both keys and objects.
>
>3. If your indexing needs become more complex, you can just add. You
>don't have to start over from scratch.


I expound on this theme at
http://mindprod.com/jgloss/sql.html#ALTERNATIVES
--
Roedy Green Canadian Mind Products
http://mindprod.com

http://thecovemovie.com : The Cove: a documentary about Japan's secret atrocities against dolphins.
 
Reply With Quote
 
Roedy Green
Guest
Posts: n/a
 
      08-18-2009
On 18 Aug 2009 20:21:08 GMT, (Stefan Ram)
wrote, quoted or indirectly quoted someone who said :

> Some people then made fun of my post, but now I read your
> choice of the word »beast«, which supports my idea.


I have a more elaborate version for frequently changing data called a
Hermit Crab file. It was inspired by the metaphor of a hermit crab
seeking out ever larger shells as it grows.

see http://mindprod.com/jgloss/hermitcrab.html

As RAM has become less expensive, the need for a light weight indexed
record lookup has waned. You can just use a big hammer -- SQL.
--
Roedy Green Canadian Mind Products
http://mindprod.com

http://thecovemovie.com : The Cove: a documentary about Japan's secret atrocities against dolphins.
 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
advice on loading and searching large map in memory eunever32@yahoo.co.uk Java 24 02-23-2011 05:23 PM
Google search result to be URL-limited when searching site, but notwhen searching Web stumblng.tumblr Javascript 1 02-04-2008 09:01 AM
Searching through a map question Joe Van Dyk C++ 5 05-09-2006 11:50 AM
Efficient searching through stl map? Henrik Goldman C++ 3 04-09-2006 08:41 PM
searching keys in std::map using map::upper_bound Erik Arner C++ 0 11-02-2004 11:14 PM



Advertisments