repeatedly open file or save entire file to memory?

 
 
Jason Lillywhite
      09-17-2009
I want to make sure I do what is most efficient when dealing with
multiple and potentially large files.

I need to take row(n) and row(n+1) from a file and use the data to do
things in other parts of my program. Then the program will iterate by
incrementing n. I may have up to 30 files, each having 50,000 rows.

My question is should I read row(n) and row(n+1), accessing the file
again and again on each iteration of the main program? Or should I just
read the whole file into memory (say, an array) then just grab items
from the array by index in the main program?
--
Posted via http://www.ruby-forum.com/.

 
Robert Klemme
      09-17-2009
2009/9/17 Jason Lillywhite <(E-Mail Removed)>:
> I want to make sure I do what is most efficient when dealing with
> multiple and potentially large files.
>
> I need to take row(n) and row(n+1) from a file and use the data to do
> things in other parts of my program. Then the program will iterate by
> incrementing n. I may have up to 30 files, each having 50,000 rows.
>
> My question is should I read row(n) and row(n+1), accessing the file
> again and again on each iteration of the main program? Or should I just
> read the whole file into memory (say, an array) then just grab items
> from the array by index in the main program?


Other schemes can be devised too:

1. read the file once remembering indexes for every file and row
(IO#tell) and then access rows via IO#seek

2. since you are incrementing n, read row n, remember pos, read row n
+ 1, next time round #seek to position and continue reading

3. as 2 but remember line n+1 so you do not have to read it again

4. if the access pattern to files is not round robin but different,
you might get better results by storing more information in memory
for least recently accessed files

5. read files in chunks of x lines and remember them in memory thus
reducing file accesses

...
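
A minimal sketch of scheme 1, added for illustration and not part of the original post. It assumes plain newline-terminated rows; `build_row_index` and `read_pair` are hypothetical helper names. The file is scanned once to record where each row starts (IO#tell), and later lookups jump straight to a row with IO#seek:

```ruby
require "tempfile"

# Hypothetical helper: scan the file once and record the byte offset
# at which each row starts.
def build_row_index(path)
  offsets = []
  File.open(path, "rb") do |io|
    until io.eof?
      offsets << io.tell # position of the row about to be read
      io.gets
    end
  end
  offsets
end

# Hypothetical helper: return rows n and n+1 (0-based) via the saved offsets.
# The second element is nil when n is the last row.
def read_pair(io, offsets, n)
  io.seek(offsets[n])
  [io.gets, io.gets].map { |line| line && line.chomp }
end

# Small demo file standing in for one of the real data files.
sample = Tempfile.new("rows")
sample.write("a,1\nb,2\nc,3\nd,4\n")
sample.close

offsets = build_row_index(sample.path)
pair = File.open(sample.path, "rb") { |io| read_pair(io, offsets, 1) }
# pair => ["b,2", "c,3"]
```

Only one integer per row is kept in memory, so 30 files of 50,000 rows would need about 1.5 million offsets rather than the full file contents.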

It really depends on what you do with those files, how your access
patterns are etc.
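
If n only ever moves forward, scheme 3 can be sketched as below. This is an illustration under that assumption, not code from the original post, and `each_row_pair` is a hypothetical name. Each line is read exactly once, and only two rows are held in memory at any time:

```ruby
require "stringio"

# Hypothetical helper: yield [row(n), row(n+1)] for every n.
# Enumerable#each_cons keeps the previous row, so nothing is re-read.
def each_row_pair(io)
  return enum_for(:each_row_pair, io) unless block_given?

  io.each_line.each_cons(2) do |current, following|
    yield current.chomp, following.chomp
  end
end

# StringIO stands in for an open file here.
rows = StringIO.new("r0\nr1\nr2\nr3\n")
pairs = each_row_pair(rows).to_a
# pairs => [["r0", "r1"], ["r1", "r2"], ["r2", "r3"]]
```

For a real file the same pattern should work with `File.foreach(path).each_cons(2)`, which opens the file once and closes it when iteration finishes.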

Kind regards

robert


--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

 
Axel Etzold
      09-17-2009
Jason,


> > I want to make sure I do what is most efficient when dealing with
> > multiple and potentially large files.


you can use ruby-prof for profiling of your code. It's available as
a gem.

Best regards,

Axel

 
Robert Klemme
      09-18-2009
2009/9/17 Axel Etzold <(E-Mail Removed)>:
> Jason,


>> > I want to make sure I do what is most efficient when dealing with
>> > multiple and potentially large files.

>
> you can use ruby-prof for profiling of your code. It's available as
> a gem.


I consider Jason's question as a design level question. That's
nothing where a profiler can really help. Of course you can code up
alternatives and measure performance. But it can only tell you which
version of several is fastest - it cannot tell you how you should
change your design to improve it.

In this case performance bottlenecks are rather in the area of disk IO
and all a profiler can tell you is how much of your time you spend in
IO - but not how to minimize that.
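
As a rough illustration of "code up alternatives and measure", here is a sketch using the standard Benchmark module. It is not from the original post; the two method names are hypothetical, and the timings it prints depend heavily on disk and OS caching, so they say nothing general about which design is better:

```ruby
require "benchmark"
require "tempfile"

# Build a sample file with 50,000 rows, the size mentioned in the question.
sample = Tempfile.new("bench")
50_000.times { |i| sample.puts("row #{i}") }
sample.close

# Alternative A: load every row into an array, then index row(n) and row(n+1).
def count_pairs_in_memory(path)
  rows = File.readlines(path)
  (0...rows.size - 1).count { |n| rows[n] && rows[n + 1] }
end

# Alternative B: stream the file once, holding only two rows at a time.
def count_pairs_streaming(path)
  File.foreach(path).each_cons(2).count
end

in_memory = streaming = nil
t_memory = Benchmark.realtime { in_memory = count_pairs_in_memory(sample.path) }
t_stream = Benchmark.realtime { streaming = count_pairs_streaming(sample.path) }

puts format("in memory: %.4fs, streaming: %.4fs", t_memory, t_stream)
```

Both alternatives visit the same 49,999 row pairs, so the comparison isolates the cost of how the rows are obtained rather than what is done with them.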

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

 
Fabian Streitel
      09-18-2009

You could put the data into a database,
which should be performant enough
and still very easy to use, even when
your lookup pattern should change
in the future.

Greetz!


> > > I want to make sure I do what is most efficient when dealing with
> > > multiple and potentially large files.



 
Fabian Streitel
      09-18-2009

>
> In this case performance bottlenecks are rather in the area of disk IO
> and all a profiler can tell you is how much of your time you spend in
> IO - but not how to minimize that.

I agree, although that argument doesn't make much sense.

A profiler can never tell you how to minimize anything, it can
only show you where you should look for optimizations.
In this case of course that's futile, since we already know where
to optimize: the IO

Greetz!

 
Axel Etzold
      09-18-2009

-------- Original Message --------
> Date: Fri, 18 Sep 2009 15:30:38 +0900
> From: Robert Klemme <shortcutter@googlema il.com>
> To: (E-Mail Removed)
> Subject: Re: repeatedly open file or save entire file to memory?


> 2009/9/17 Axel Etzold <(E-Mail Removed)>:
> > Jason,

>
> >> > I want to make sure I do what is most efficient when dealing with
> >> > multiple and potentially large files.

> >
> > you can use ruby-prof for profiling of your code. It's available as
> > a gem.


Dear Robert,

>
> I consider Jason's question as a design level question. That's
> nothing where a profiler can really help. Of course you can code up
> alternatives and measure performance. But it can only tell you which
> version of several is fastest - it cannot tell you how you should
> change your design to improve it.


I agree with you. I proposed this precisely to see how long several
alternatives take. One always has to think about design oneself.

Best regards,

Axel

 
Jason Lillywhite
      09-18-2009
Fabian Streitel wrote:
> You could put the data into a database,
> which should be performant enough
> and still very easy to use, even when
> your lookup pattern should change
> in the future.
>
> Greetz!


That is a good idea. Do you recommend ruby DBI or ActiveRecord? I need
ease of use and simplicity. My interface is the command line.
--
Posted via http://www.ruby-forum.com/.

 
Fabian Streitel
      09-18-2009

Actually, I like DataMapper the most. It's very intuitive.
You should check it out: http://datamapper.org/doku.php

I definitely prefer the way DataMapper handles things over
ActiveRecord, but that's a matter of taste.

Greetz!


> That is a good idea. Do you recommend ruby DBI or ActiveRecord? I need
> ease of use and simplicity. My interface is the command line.


 