
Large data array (operations) via disk files

 
 
Skybuck Flying (Guest), 11-15-2006

"Randy Howard" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed). net...
> Skybuck Flying wrote
> (in article <ejdtut$4gk$(E-Mail Removed)1.ov.home.nl>):
>
>> Get a new PC.

>
> A Mac Pro would be a nice choice.
>
>> I am using Windows XP 64 bit <- it totally sucks but hey it's the future
>>

>
> One of the rarest of all Skybuck statements, a correct one.
> There are very few of these in the wild. You are correct, the
> future for Windows users really does suck.


It sucks now, but it will improve in the future.

Such is the way of Microsoft, software and hardware manufacturers, and
their new, still-buggy Windows drivers and components.

Bye,
Skybuck.


 
geerrxin@gmail.com (Guest), 11-15-2006
Specifically, the problem needs to be addressed on 64-bit Linux. It
looks to me that one needs to deal with virtual memory directly.

Ideally, the solution will look like the following:

1. Map A to an addressable space.

2. Generate entries of A(i,j), i = 1..N, j = 1..N.

3. Compute something in the loops with reference to A(i,j), i = 1..N, j = 1..N.

The fact that only a portion of A is used at a time during the
computation makes a memory-mapped file a sound solution. But one
seems to have to keep track of the offset into the file across
subsequent calls to mmap() when accessing different portions of A,
which doesn't sound convenient, no?
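
Something like the following is what I mean by that bookkeeping (a
sketch, not tested; indices are 0-based here, fd is the descriptor of
the file backing A, and mmap() requires a page-aligned offset):

#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#define WINDOW_BYTES (512UL * 1024 * 1024)  /* ~500 MB window at a time */

/* Map the window of the matrix file that contains element (i,j).
   The window must lie within the file; near the end of the file it
   would have to be shortened. */
static double *map_window(int fd, size_t i, size_t j, size_t n,
                          size_t *elem_in_window)
{
    long page = sysconf(_SC_PAGESIZE);
    off_t byte = (off_t)((i * n + j) * sizeof(double));
    off_t base = byte - (byte % page);  /* align offset down to a page */

    void *p = mmap(NULL, WINDOW_BYTES, PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, base);
    if (p == MAP_FAILED)
        return NULL;
    *elem_in_window = (size_t)(byte - base) / sizeof(double);
    return (double *)p;
}

Every move to a different portion of A means recomputing the base,
remapping, and remembering to munmap() the old window.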

Is there any better solution, such that one can do something as simple
as

A = vm_create(...)

in step 1 above, without worrying about the memory limit as with a call
to malloc(), so that the rest of the code can be implemented without
extra programming effort in memory manipulation?
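
For illustration, I imagine vm_create() being little more than a single
whole-file mmap(), which on 64-bit maps all of A at once (a sketch, not
tested; vm_create is just my wished-for name):

#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Back an n x n matrix of doubles with the file at 'path' and map the
   whole thing in one call. The full 8 GB fits easily in a 64-bit
   address space; only the pages actually touched become resident. */
double *vm_create(const char *path, size_t n)
{
    size_t bytes = n * n * sizeof(double);

    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd == -1)
        return NULL;
    if (ftruncate(fd, (off_t)bytes) == -1) {  /* size the backing file */
        close(fd);
        return NULL;
    }
    void *p = mmap(NULL, bytes, PROT_READ | PROT_WRITE, MAP_SHARED,
                   fd, 0);
    close(fd);  /* the mapping stays valid after the close */
    return p == MAP_FAILED ? NULL : (double *)p;
}

After that, A[i*n + j] (0-based) would behave like an ordinary array,
with the kernel paging the working set in and out by itself.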

Thanks,
Zin

Ian Collins wrote:
> http://www.velocityreviews.com/forums/(E-Mail Removed) wrote:
> > Hi,
> >
> > I have a need to manipulate a large matrix, say, A(N,N) (of real),
> > larger than 8 GB, which can't fit in physical memory (2 GB). But the
> > nature of the computation requires the operation on only a portion
> > of the data, e.g. 500 MB (0.5 GB) at a time.
> >
> > The procedure is as follows:
> >
> > 1. Generate data and store the data in array A(N,N), N is HUGE.
> >
> > 2. Do computation on A in loops, e.g.
> >
> > for i = 1, N
> >   for j = 1, N
> >     compute something using A (a portion)
> >   end
> > end
> >
> > How can I implement the procedure to accommodate the large memory
> > needs?
> >

> Two solutions:
>
> If performance is an issue, use a box with enough memory.
>
> Otherwise use the memory mapped file support provided by your operating
> environment and map the required portion of the matrix into memory. How
> you do this will be OS specific and best asked on an OS group.
>
> --
> Ian Collins.


 
santosh (Guest), 11-15-2006
(E-Mail Removed) wrote:
> Specifically, the problem needs to be addressed on 64-bit Linux. It
> looks to me that one needs to deal with virtual memory directly.


What is the value of SIZE_MAX for your C implementation?
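
You can check with something like this (assumes a C99 <stdint.h>):

#include <stdint.h>  /* SIZE_MAX, C99 */
#include <stdio.h>

int main(void)
{
    printf("SIZE_MAX = %zu\n", (size_t)SIZE_MAX);
    return 0;
}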

> Ideally, the solution will look like the following:
>
> 1. Map A to an addressable space.
>
> 2. Generate entries of A(i,j), i = 1..N, j = 1..N.
>
> 3. Compute something in the loops with reference to A(i,j), i = 1..N, j = 1..N.
>
> The fact that only a portion of A is used at a time during the
> computation makes a memory-mapped file a sound solution. But one
> seems to have to keep track of the offset into the file across
> subsequent calls to mmap() when accessing different portions of A,
> which doesn't sound convenient, no?
>
> Is there any better solution, such that one can do something as simple
> as
>
> A = vm_create(...)
>
> in step 1 above, without worrying about the memory limit as with a call
> to malloc(), so that the rest of the code can be implemented without
> extra programming effort in memory manipulation?


On a properly implemented C compiler for a 64-bit environment, SIZE_MAX
should be sufficient for your needs. Have you tried plain malloc()?

Generally, if your working set will fit within physical memory, then
just malloc() the whole amount and let the OS do the grunt work. Often
under Linux, memory won't actually be committed until you write to it.
Also, having a fast HDD for swap space will improve matters somewhat.
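
For instance (a sketch; 32768 x 32768 doubles is the 8 GB from the
original post, and the lazy commit is Linux behaviour, not a C
guarantee):

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    size_t n = 32768;                       /* 32768^2 doubles = 8 GB */
    double *a = malloc(n * n * sizeof *a);  /* needs a 64-bit size_t */
    if (a == NULL) {
        fprintf(stderr, "malloc failed\n");
        return EXIT_FAILURE;
    }
    /* Pages are typically committed only when first written, so
       touching ~500 MB at a time keeps the resident set small. */
    a[0] = 1.0;
    free(a);
    return 0;
}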

But if you don't mind somewhat more involvement, mmap() would probably
be a better-suited solution, since with it you can give the OS more
information about your actual memory needs and let it optimise
accordingly.
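
For instance, madvise() can pass an access-pattern hint to the kernel
(a sketch; p and bytes stand for an existing mapping, such as one
returned by mmap()):

#include <stdio.h>
#include <sys/mman.h>

/* Hint that the mapping will be swept front to back, so the kernel
   can read ahead and drop the pages behind the sweep. */
static void advise_sequential(void *p, size_t bytes)
{
    if (madvise(p, bytes, MADV_SEQUENTIAL) == -1)
        perror("madvise");
}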

 