Velocity Reviews - Computer Hardware Reviews

Huge Data Handling

 
 
Vishal G
      09-30-2008
Hi guys,

I am trying to modify a bioinformatics package written in Perl. It was
written to handle DNA sequences about 500,000 bases long (a string
containing 500,000 characters).

I have to enhance it to handle DNA 100 million bases long.

Each base in the DNA carries this information: base (A, C, G or T), qual
(0-99), position (1-length).

There is one main DNA sequence and, on average, 500,000 parts (at most
2,000 characters long, each with the same set of information).

The program first creates an alignment like

                                     *
Main - .....ACCCTTTGTCTAGTCGTATCGTCGATCGTCGCTAGCTCTGCT....
Part -                  GTCGTATCGTCGAACGTCGCTAGCTC
Part -         CTTTGTCTAGTCGTATCGTCGATCGTCGCT
Part -                           TCGAACGTCGCTAGCTCTG

Now, let's say I have to go through each position and find how many
variations are present at a given position (with their original
position and quality).

Look at the * position: there is a T-A variation.

Right now they are using hashes to capture this:

%A, %C, %G, %T

Loop For Main DNA {
    $A{$pos} = $qual;     # this tells me that there is an A base at a
                          # certain position, with some qual, for main
}

Update the qual by adding the qual of the parts:

Loop For Parts {
    $A{$pos} += $qual;    # for A parts
    $T{$pos} += $qual;    # for T parts
}
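
As a concrete version of the two loops above, here is a minimal Perl sketch run on the alignment example. It is not the package's actual code: the hash-of-hashes `%sum` stands in for `%A`, `%C`, `%G` and `%T`, the part offsets are inferred from where each part matches the main sequence, positions are 0-based, and the flat quality of 30 is made up for illustration.

```perl
use strict;
use warnings;

# Toy data from the alignment example. Offsets are 0-based start positions
# on the main sequence (inferred); the quality value is hypothetical.
my $main  = 'ACCCTTTGTCTAGTCGTATCGTCGATCGTCGCTAGCTCTGCT';
my @parts = (
    { offset => 12, seq => 'GTCGTATCGTCGAACGTCGCTAGCTC' },
    { offset => 3,  seq => 'CTTTGTCTAGTCGTATCGTCGATCGTCGCT' },
    { offset => 21, seq => 'TCGAACGTCGCTAGCTCTG' },
);
my $qual = 30;

# One inner hash per base, keyed by position on the main sequence.
my %sum = (A => {}, C => {}, G => {}, T => {});

# Main sequence: record its base and quality at every position.
for my $pos (0 .. length($main) - 1) {
    $sum{ substr($main, $pos, 1) }{$pos} = $qual;
}

# Parts: add each part's quality to the base it shows at that position.
for my $part (@parts) {
    for my $i (0 .. length($part->{seq}) - 1) {
        my $pos = $part->{offset} + $i;
        $sum{ substr($part->{seq}, $i, 1) }{$pos} += $qual;
    }
}

# A position is variable when more than one base has an entry for it.
my @variable;
for my $pos (0 .. length($main) - 1) {
    my $seen = grep { exists $sum{$_}{$pos} } qw(A C G T);
    push @variable, $pos if $seen > 1;
}

print "variable positions: @variable\n";
```

Every position touched by any sequence becomes a hash key here, which is why this layout grows so quickly with sequence length.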
But because the dataset is huge, it consumes a lot of memory.

So basically I am trying to figure out a way to store this information
without using much memory.

If you don't understand the above problem, don't worry.

Just tell me how to handle huge data that needs to be accessed frequently,
using the least possible memory.

Thanks in advance
 
 
 
 
 
John W. Krahn
      09-30-2008
Vishal G wrote:
>
> Just tell me how to handle huge data that needs to be accessed frequently,
> using the least possible memory.


perldoc -q "How can I make my Perl program take less memory"


John
--
Perl isn't a toolbox, but a small machine shop where you
can special-order certain sorts of tools at low cost and
in short order. -- Larry Wall
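
One commonly suggested way to cut memory for this kind of per-position tally is to replace each hash with a packed string and address fixed-width slots in it with Perl's built-in `vec`. The sketch below is only an illustration under assumptions, not the package's method: the names `%sum`, `add_qual` and `bases_at` are hypothetical, positions are 0-based, and 32-bit counters are an arbitrary choice. At 32 bits per counter and four bases, that is 16 bytes per position, about 1.6 GB for 100 million positions; 16-bit counters would halve that if no quality sum can exceed 65,535.

```perl
use strict;
use warnings;

my $length = 1_000;    # stand-in for the real 100,000,000 positions

# One packed string per base; slot $pos holds a 32-bit quality sum.
my %sum;
for my $base (qw(A C G T)) {
    $sum{$base} = '';
    vec($sum{$base}, $length - 1, 32) = 0;    # preallocate, zero-filled
}

# Add a quality score for $base at 0-based position $pos.
sub add_qual {
    my ($base, $pos, $qual) = @_;
    vec($sum{$base}, $pos, 32) = vec($sum{$base}, $pos, 32) + $qual;
    return;
}

# Bases with a non-zero quality sum at $pos, in A, C, G, T order.
sub bases_at {
    my ($pos) = @_;
    return grep { vec($sum{$_}, $pos, 32) > 0 } qw(A C G T);
}

# The * column from the alignment: main has T, two parts have A, one has T.
# The quality of 30 is made up for the example.
add_qual('T', 25, 30);               # main
add_qual('A', 25, 30) for 1 .. 2;    # parts showing A
add_qual('T', 25, 30);               # part showing T

my @bases = bases_at(25);
print "position 25: @bases\n";
print "A=", vec($sum{A}, 25, 32), " T=", vec($sum{T}, 25, 32), "\n";
print "bytes per base string: ", length($sum{A}), "\n";
```

One limitation of this layout: a base whose qualities sum to 0 looks the same as a base that never occurred, so if qual 0 is a real value a separate one-bit-per-position `vec` string could record presence. Since most positions presumably carry no variation, another option would be to keep only the main sequence in packed form and store mismatches in a small hash.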
 