Performance Issues in Random Access Files

 
 
gwlucas@sonalysts.com
Guest
Posts: n/a
 
      10-09-2007

I have an application where I need to read data from random positions
within a file. In this case performance is VERY important. The problem I
am running into is that the RandomAccessFile class appears to have
serious performance issues. Based on a bit of experimentation, I
suspect that it doesn't implement any kind of intrinsic buffering.

Anyway, I can think of five or six ways of working around this
problem, but I can't believe that there isn't a standard solution. As
a rule, I like to stick to "recognized practices of the community"
even when it would be more fun to "roll my own." So... is there a
canonical solution? Would somebody be able to point me in the right
direction?

Background

I have a file containing blocks of data that need to be accessed at
random. I never know which block I am going to need to read next but,
within each block, the data is read in sequence. The pattern of access
is somewhat similar to an old-fashioned ISAM file. Thus I would like
to set the file pointer to the beginning of the block, read some
integers, read some doubles, etc., all in sequence. Later, I would jump
to another file position and do the same. Normally, I would accomplish
these sequential reads with a BufferedInputStream. But since I do have
the random access component, that doesn't seem to be an option
(apparently, you can't wrap a BufferedInputStream around a random
access file).

Thanks in advance for your help.

Gary

 
 
 
 
 
gwlucas@sonalysts.com
Guest
Posts: n/a
 
      10-09-2007
On Oct 9, 12:31 pm, (E-Mail Removed) wrote:
> I have an application where I need to read data from random positions
> within a file. In this case performance is VERY important. [snip]


An addendum... Strictly for testing purposes, I have implemented a
wrapper around RandomAccessFile following a suggestion posted here by
Tom Anderson on 10 Oct 2002, which makes a class that looks like an
InputStream but makes pass-through calls to RandomAccessFile. It works,
but definitely falls into the kludge category.
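
A minimal sketch of what such a pass-through wrapper might look like. The class name RandomAccessFileInputStream and the openAt helper are made up for illustration and are not taken from Tom Anderson's post; the sketch also assumes nothing else moves the file pointer while a block is being read.

```java
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;

/** Hypothetical adapter: exposes a RandomAccessFile as an InputStream. */
class RandomAccessFileInputStream extends InputStream {
    private final RandomAccessFile file;

    RandomAccessFileInputStream(RandomAccessFile file) {
        this.file = file;
    }

    @Override
    public int read() throws IOException {
        return file.read();
    }

    // BufferedInputStream fills its buffer through this bulk method,
    // so one call here can fetch many bytes from the file at once.
    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        return file.read(b, off, len);
    }

    /**
     * Seeks to a block and returns a buffered reader positioned there.
     * A fresh BufferedInputStream is created for each seek because an
     * old one could still hold bytes from the previous position.
     */
    static DataInputStream openAt(RandomAccessFile file, long offset) throws IOException {
        file.seek(offset);
        return new DataInputStream(
                new BufferedInputStream(new RandomAccessFileInputStream(file)));
    }
}
```

Typical use would be something like `DataInputStream in = RandomAccessFileInputStream.openAt(raf, blockOffset);` followed by `in.readInt()`, `in.readDouble()` and so on. Creating a new buffer per seek is simple but allocates an object on every jump.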

I was wondering whether the java.nio classes might be a solution here.
I've never used them and just assumed that they were related to
network I/O, but there does appear to be some stuff related to
FileChannels that might be relevant. Frankly, the whole java.nio API
looks rather unfathomable.

Thanks again.

Gary

 
 
 
 
 
Roedy Green
Guest
Posts: n/a
 
      10-11-2007
On Tue, 09 Oct 2007 09:31:20 -0700, (E-Mail Removed) wrote,
quoted or indirectly quoted someone who said :

>
>I have an application where I need to read data from random positions
>within a file. In this case performance is VERY important. The problem I
>am running into is that the RandomAccessFile class appears to have
>serious performance issues. Based on a bit of experimentation, I
>suspect that it doesn't implement any kind of intrinsic buffering.


see http://mindprod.com/jgloss/nio.html

I would try implementing this with nio. If the file is not too
enormous, you can use the memory mapping features to tap into the
system caching.
--
Roedy Green Canadian Mind Products
The Java Glossary
http://mindprod.com
 
 
Esmond Pitt
Guest
Posts: n/a
 
      10-12-2007
(E-Mail Removed) wrote:
> I was wondering whether the java.nio classes might be a solution here.


They are indeed. What you want is FileChannel.map(), then you just deal
directly with a MappedByteBuffer. This only works reasonably for
fixed-length files of course ... but that's much the same for any
file-mapping API.
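
A rough sketch of that approach, assuming the whole file fits in a single mapping (one mapping cannot exceed Integer.MAX_VALUE bytes) and that the data was written big-endian, as DataOutputStream does. The class name MappedBlockReader and its seek method are invented for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

/** Hypothetical helper: maps a whole file read-only for random block reads. */
class MappedBlockReader {
    private final MappedByteBuffer buffer;

    MappedBlockReader(File file) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r");
             FileChannel channel = raf.getChannel()) {
            // The mapping remains valid after the channel is closed.
            buffer = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
        }
    }

    /**
     * Jumps to the start of a block. Subsequent relative calls such as
     * getInt() and getDouble() then read sequentially from that position.
     */
    ByteBuffer seek(int offset) {
        buffer.position(offset);
        return buffer;
    }
}
```

Reading a block then looks like `ByteBuffer b = reader.seek(blockOffset); int n = b.getInt(); double x = b.getDouble();`. The operating system's page cache does the buffering, so no user-side buffer is needed.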
 
 
gwlucas@sonalysts.com
Guest
Posts: n/a
 
      10-12-2007

Roedy and Esmond,

Thank you both for your help (Roedy, I am a long time admirer of your
contributions to the Java community).

I plan on looking into Java NIO. Roedy's web site provides a link to a
pretty extensive tutorial.

In the meantime, I tried some experiments using Tom Anderson's 2002
suggestion (which pre-dates NIO) of wrapping RandomAccessFile in a
container class that allows it to be wrapped in a BufferedInputStream.

Just using a quick-and-dirty approach, one that makes very wasteful
use of object creation, I achieved a factor of 9 improvement in
speed. It would be interesting to see if it would get better if I
took a careful approach. Anyway, it turns out that RandomAccessFile is
a pretty lousy implementation... a big step back from the C language
standard I/O implementation of fread/fwrite that was created in the
1970s.

Gary

 
 
Esmond Pitt
Guest
Posts: n/a
 
      10-15-2007
(E-Mail Removed) wrote:
> Anyway, it turns out that RandomAccessFile is
> a pretty lousy implementation... a big step back from the C language
> standard I/O implementation of fread/fwrite that was created in the
> 1970s.


That's because it doesn't share stdio's fundamental problem, i.e. the
user-side buffering, which makes it useless for multi-user I/O. RAF
omits the user-side buffering, which you can add yourself, as you have,
and leaves the resulting multi-user problem up to you too. As shipped,
RAF works multi-user.

Re Java, the bigger mystery to me is why InputStream and OutputStream
are abstract classes instead of interfaces so that RandomAccessFile
could extend them? and/or why aren't there adapters so you could get
buffered streams out of an RAF?
 
 
Joshua Cranmer
Guest
Posts: n/a
 
      10-15-2007
Esmond Pitt wrote:
> Re Java, the bigger mystery to me is why InputStream and OutputStream
> are abstract classes instead of interfaces so that RandomAccessFile
> could extend them?


My guess is so that every implementation wouldn't have to rewrite the
near-functionally equivalent overloaded read()/write() methods.
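
To illustrate the guess: a subclass of the abstract InputStream only has to supply the single-byte read(), and it inherits working bulk read methods built on top of it. The CountdownStream class below is a made-up example, not anything from the JDK.

```java
import java.io.InputStream;

/** Hypothetical stream that implements only the single-byte read(). */
class CountdownStream extends InputStream {
    private int remaining;

    CountdownStream(int count) {
        remaining = count;
    }

    @Override
    public int read() {
        // Yields count, count-1, ..., 1 and then -1 for end of stream.
        return remaining > 0 ? remaining-- : -1;
    }
}
```

Calling `new CountdownStream(3).read(buf)` on a four-byte array goes through the inherited read(byte[]) and fills the first three bytes, even though the subclass never defined that overload. With a pure interface (before default methods existed) every implementation would have had to write those overloads itself.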

> and/or why aren't there adapters so you could get
> buffered streams out of an RAF?


I think part of this was the impetus for NIO. But I haven't really dealt
with this kind of problem, so I can't say.

--
Beware of bugs in the above code; I have only proved it correct, not
tried it. -- Donald E. Knuth
 
 
Patricia Shanahan
Guest
Posts: n/a
 
      10-15-2007
Joshua Cranmer wrote:
> Esmond Pitt wrote:
>> Re Java, the bigger mystery to me is why InputStream and OutputStream
>> are abstract classes instead of interfaces so that RandomAccessFile
>> could extend them?

>
> My guess is so that every implementation wouldn't have to rewrite the
> near-functionally equivalent overloaded read()/write() methods.
>
>> and/or why aren't there adapters so you could get
>> buffered streams out of an RAF?

>
> I think part of this was the impetus for NIO. But I haven't really dealt
> with this kind of problem, so I can't say.
>


Maybe attack the problem using a byte[] to represent the record? Do a
readFully to fill a byte[] the same size as the record. Wrap the byte[]
in a ByteArrayInputStream and a DataInputStream to access the fields in
the record.
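
Something along these lines, as a sketch only. It assumes the record size is known before the read; the RecordReader class and readRecord method are names invented for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.RandomAccessFile;

/** Hypothetical helper: one bulk read per record, then in-memory field access. */
class RecordReader {
    static DataInputStream readRecord(RandomAccessFile file, long offset, int recordSize)
            throws IOException {
        byte[] record = new byte[recordSize];
        file.seek(offset);
        // A single bulk read; throws EOFException if the file ends early.
        file.readFully(record);
        // All later field reads come from memory, not from the file.
        return new DataInputStream(new ByteArrayInputStream(record));
    }
}
```

The file is touched once per record, and the readInt/readDouble calls that follow work on the in-memory copy.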

Patricia
 