Vishal G <> wrote:
> Hi Guys,
>
> Very basic question....
>
> Please dont suggest to use other programing language or other data
> structure cause I can't...
If you can't use a different structure, at least for intermediates,
then you can't program.
> I read data from file and yes I have to slurp the whole thing to
> memory cause I can use upto 4GB...
Because you can do it, that means you have to? Why can't you read line by
line, processing each line and appending the result to $str before moving
to the next?
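Something along these lines, as a rough sketch only. The names pack_line
and pack_file are made up for illustration, and $width stands in for your
$Alignment::BASEQUALITY_BYTES, whose value I don't know. It has not been
run against the real 120-million-value file.

```perl
use strict;
use warnings;

# Hypothetical helper: turn one line of space-separated integers into
# fixed-width fields. $width is an assumed stand-in for
# $Alignment::BASEQUALITY_BYTES.
sub pack_line {
    my ($line, $width) = @_;
    # split ' ' splits on any run of whitespace and skips leading
    # whitespace, so repeated spaces and the trailing newline disappear.
    return join '', map { sprintf "%${width}d", $_ } split ' ', $line;
}

# Hypothetical driver: read one line at a time and append its packed
# form to $str, so only one line of raw input is held at any moment.
sub pack_file {
    my ($fh, $width) = @_;
    my $str = '';
    while (my $line = <$fh>) {
        $str .= pack_line($line, $width);
    }
    return $str;
}

# Demo on an in-memory filehandle instead of the real data file.
my $sample = "30 56  78\n34 2 0\n";
open my $fh, '<', \$sample or die "cannot open in-memory file: $!";
print pack_file($fh, 3), "\n";
close $fh;
```

Only $str grows with the size of the file; the raw text never has to sit
in memory all at once.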
>
> data in file is in this format
>
> 30 56 78 34 2 39 87 (50 values per line, total of 120 million
> entries)
So then, would this work to make an example file?
perl -le 'foreach (1..2.4e6) {print join " ", map int(rand()*99), 1..50}'
>
> reading file in paragraph mode
Why reading in paragraph mode? From your format description, the data
is not formatted in paragraphs.
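To illustrate the point, here is a small sketch (count_records is a
made-up helper, and the three-line sample is invented). With no blank
lines in the data, paragraph mode hands back everything as one record,
so it amounts to slurping.

```perl
use strict;
use warnings;

# Hypothetical helper: count how many records a read loop sees for a
# given input record separator.
sub count_records {
    my ($data, $separator) = @_;
    local $/ = $separator;    # "" selects paragraph mode, "\n" line mode
    open my $fh, '<', \$data or die "cannot open in-memory file: $!";
    my $count = 0;
    while (defined(my $record = <$fh>)) {
        $count++;
    }
    close $fh;
    return $count;
}

my $data = "30 56 78\n34 2 39\n87 1 5\n";    # no blank lines anywhere
print "paragraph mode: ", count_records($data, ""),   " record(s)\n";
print "line mode:      ", count_records($data, "\n"), " record(s)\n";
```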
>
> Now I have to remove multiple spaces without using much memory
>
> This is what I have wrote (might be very low standard code for Gurus
> out there)
>
> It works but takes 5 mins consuming 600-700 MB,
When I try it, I get many, many warnings, which suggests that it is not
actually working correctly.
> if I use substitution
> to achieve this it takes 4-5 GB and around 2-3 mins...
How did you use substitution?
Starting your code indented halfway across the screen isn't very helpful.
It just leads to messy line-wrap problems. I fixed that.
> my $chr = '';
> my $str = '';
> my $value = '';
> my $unitlength = $Alignment::BASEQUALITY_BYTES;
> while (length($_) > 0) {
> if (($chr = substr($_, 0, 1, "")) ne " ") {
> $value = $value . $chr;
> } else {
> $str = $str . sprintf("%${unitlength}d", $value) if ($value);
I get:
Argument "67\n33" isn't numeric in sprintf....
> undef $value;
> }
> }
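The warning presumably comes from the loop treating only a space as a
delimiter, so the newline between the last value of one line and the
first value of the next ends up inside $value. The "if ($value)" test
would also seem to drop any value of 0, and nothing flushes the final
value if the data doesn't end in a space. A sketch of a different
technique, a global match on non-whitespace runs, which sidesteps all
three (pack_buffer is a made-up name, $width is again a stand-in for
$Alignment::BASEQUALITY_BYTES, and I have not measured its speed or
memory use on data of your size):

```perl
use strict;
use warnings;

# Hypothetical helper: walk a buffer token by token and append each
# value as a fixed-width field. Takes a reference so the (possibly
# huge) buffer is not copied on the way in.
sub pack_buffer {
    my ($buf_ref, $width) = @_;
    my $str = '';
    # In scalar context, /g resumes where the previous match ended, so
    # no list of all values is built. \S+ treats spaces and newlines
    # alike as delimiters and still matches a lone "0".
    while ($$buf_ref =~ /(\S+)/g) {
        $str .= sprintf "%${width}d", $1;
    }
    return $str;
}

my $buffer = "12 67\n33  0 5";    # newline, double space, zero, no trailing space
print pack_buffer(\$buffer, 3), "\n";
```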
Xho