Velocity Reviews


xlue897@rogers.com 05-02-2007 03:26 PM

perl out of memory
 
Hi,

I need to find the maximum number in a large file. When I
use the perl code below, it fails with the error: Out of memory! Bus
error.

Perl Code:
perl -e '
for(<>)
{if ($_>$max){$max=$_;}}
print $max;'
<large_size_file

Also, can command-line perl with -n be structured like awk:
'BEGIN{code}
{code}
END{code}
'

Thanks

Steven


Jürgen Exner 05-02-2007 03:31 PM

Re: perl out of memory
 
xlue897@rogers.com wrote:
> Hi,
>
> I need to find the maximum number in a large file. When I
> use the perl code below, it fails with the error: Out of memory! Bus
> error.
>
> Perl Code:
> perl -e '
> for(<>)


Replace 'for' with 'while'.
The magic of reading a line at a time works for 'while(<>)' only.

jue



John W. Krahn 05-02-2007 03:35 PM

Re: perl out of memory
 
xlue897@rogers.com wrote:
>
> I need to find the maximum number in a large file. When I
> use the perl code below, it fails with the error: Out of memory! Bus
> error.
>
> Perl Code:
> perl -e '
> for(<>)


You are using a for loop so perl has to read the entire file first into a list
in memory. Use a while loop instead.


> {if ($_>$max){$max=$_;}}
> print $max;'
> <large_size_file


perl -lne'$max = $_ if $_ > $max; END { print $max }' large_size_file


> Also, can command line perl with -n run like awk -
> 'BEGIN{code}
> {code}
> END{code}
> '


Yes.



John
--
Perl isn't a toolbox, but a small machine shop where you can special-order
certain sorts of tools at low cost and in short order. -- Larry Wall

Michele Dondi 05-02-2007 05:21 PM

Re: perl out of memory
 
On Wed, 02 May 2007 15:31:13 GMT, "Jürgen Exner"
<jurgenex@hotmail.com> wrote:

>The magic of reading a line at a time works for 'while(<>)' only.


ATM! :-)


Michele
--
{$_=pack'B8'x25,unpack'A8'x32,$a^=sub{pop^pop}->(map substr
(($a||=join'',map--$|x$_,(unpack'w',unpack'u','G^<R<Y]*YB='
.'KYU;*EVH[.FHF2W+#"\Z*5TI/ER<Z`S(G.DZZ9OX0Z')=~/./g)x2,$_,
256),7,249);s/[^\w,]/ /g;$ \=/^J/?$/:"\r";print,redo}#JAPH,

xlue897@rogers.com 05-04-2007 05:33 PM

Re: perl out of memory
 
On May 2, 1:21 pm, Michele Dondi <bik.m...@tiscalinet.it> wrote:
> On Wed, 02 May 2007 15:31:13 GMT, "Jürgen Exner"
>
> <jurge...@hotmail.com> wrote:
> >The magic of reading a line at a time works for 'while(<>)' only.

>
> ATM! :-)
>
> Michele
> --
> {$_=pack'B8'x25,unpack'A8'x32,$a^=sub{pop^pop}->(map substr
> (($a||=join'',map--$|x$_,(unpack'w',unpack'u','G^<R<Y]*YB='
> .'KYU;*EVH[.FHF2W+#"\Z*5TI/ER<Z`S(G.DZZ9OX0Z')=~/./g)x2,$_,
> 256),7,249);s/[^\w,]/ /g;$ \=/^J/?$/:"\r";print,redo}#JAPH,


Thanks everyone for your help. The while loop works. However, the
perl code seems much slower than the awk code. For the same file of
around 5M records, awk takes only about 1 min to find the max
value, while perl takes around 20 mins. Is perl slower than awk?


Thanks.

Steven



Michele Dondi 05-04-2007 07:25 PM

Re: perl out of memory
 
On 4 May 2007 10:33:26 -0700, xlue897@rogers.com wrote:

>> ATM! :-)
>>
>> Michele
>> --
>> {$_=pack'B8'x25,unpack'A8'x32,$a^=sub{pop^pop}->(map substr


What's with quoting the .sig? (If not discussing it, that is. But this
is generally the case with Abigail's!)

>Thanks everyone for your help. The while loop works. However, the
>perl code seems much slower than the awk code. For the same file of
>around 5M records, awk takes only about 1 min to find the max
>value, while perl takes around 20 mins. Is perl slower than awk?


Hard to say, without seeing any code. Find it hard to believe, though:

cognac:~ [21:23:58]$ perl -le 'print rand for 1..5_000_000' > test
cognac:~ [21:24:19]$ time perl -ne '$m=$_>$m?$_:$m;END{print $m}' test
0.999999995290754

real 0m8.604s
user 0m7.160s
sys 0m1.368s


Michele
--
{$_=pack'B8'x25,unpack'A8'x32,$a^=sub{pop^pop}->(map substr
(($a||=join'',map--$|x$_,(unpack'w',unpack'u','G^<R<Y]*YB='
.'KYU;*EVH[.FHF2W+#"\Z*5TI/ER<Z`S(G.DZZ9OX0Z')=~/./g)x2,$_,
256),7,249);s/[^\w,]/ /g;$ \=/^J/?$/:"\r";print,redo}#JAPH,

xlue897@rogers.com 05-04-2007 08:33 PM

Re: perl out of memory
 
On May 4, 3:25 pm, Michele Dondi <bik.m...@tiscalinet.it> wrote:
> On 4 May 2007 10:33:26 -0700, xlue...@rogers.com wrote:
>
> >> ATM! :-)

>
> >> Michele
> >> --
> >> {$_=pack'B8'x25,unpack'A8'x32,$a^=sub{pop^pop}->(map substr

>
> What's with quoting the .sig? (If not discussing it, that is. But this
> is generally the case with Abigail's!)
>
> >Thanks everyone for your help. The while loop works. However, the
> >perl code seems much slower than the awk code. For the same file of
> >around 5M records, awk takes only about 1 min to find the max
> >value, while perl takes around 20 mins. Is perl slower than awk?

>
> Hard to say, without seeing any code. Find it hard to believe, though:
>
> cognac:~ [21:23:58]$ perl -le 'print rand for 1..5_000_000' > test
> cognac:~ [21:24:19]$ time perl -ne '$m=$_>$m?$_:$m;END{print $m}' test
> 0.999999995290754
>
> real 0m8.604s
> user 0m7.160s
> sys 0m1.368s
>
> Michele
> --
> {$_=pack'B8'x25,unpack'A8'x32,$a^=sub{pop^pop}->(map substr
> (($a||=join'',map--$|x$_,(unpack'w',unpack'u','G^<R<Y]*YB='
> .'KYU;*EVH[.FHF2W+#"\Z*5TI/ER<Z`S(G.DZZ9OX0Z')=~/./g)x2,$_,
> 256),7,249);s/[^\w,]/ /g;$ \=/^J/?$/:"\r";print,redo}#JAPH,


Here is the test code with results. The test file was generated by:
perl -le 'print rand for 1..5_000_000' > test.txt

$ time awk -F'.' '{ if($2 > max) {max = $2;} } END{print max;}' <test.txt
999969482421875

real 0m18.16s
user 0m17.38s
sys 0m0.18s

$ time perl -a -F'\.' -n -e '{ if($F[1] >$max) {$max=$F[1];} } END{print $max;}' test.txt
999969482421875

real 0m41.57s
user 0m41.14s
sys 0m0.16s


BTW, why doesn't the code below work?
perl -a -F/\./ -n -e '{print $F[1], "\n";} ' test.txt


Thanks,
Steven


Michele Dondi 05-05-2007 10:49 AM

Re: perl out of memory
 
On 4 May 2007 13:33:34 -0700, xlue897@rogers.com wrote:

>> >perl code seems much slower than the awk code. For the same file of
>> >around 5M records, awk takes only about 1 min to find the max
>> >value, while perl takes around 20 mins. Is perl slower than awk?

[snip]
>Here is the test code with results. The test file was generated by:
>perl -le 'print rand for 1..5_000_000' > test.txt
>
>$ time awk -F'.' '{ if($2 > max) {max = $2;} } END{print max;}' <test.txt
>999969482421875
>
>real 0m18.16s
>user 0m17.38s
>sys 0m0.18s
>
>$ time perl -a -F'\.' -n -e '{ if($F[1] >$max) {$max=$F[1];} } END{print $max;}' test.txt
>999969482421875
>
>real 0m41.57s
>user 0m41.14s
>sys 0m0.16s


Well, awk does appear to be faster, but not by the margin you
suggested above (about 2.3x here, not 20x). Anyway, this *does*
surprise me, but not too much: afaik awk is a specialized tool and
Perl a full-fledged language (although one supposed to excel in the
same areas).

>BTW, why doesn't the code below work?
>perl -a -F/\./ -n -e '{print $F[1], "\n";} ' test.txt


That should be -F'/\./'. Otherwise the shell consumes the backslash
(it treats \. as a quoted dot), so perl sees the pattern /./, which
matches any character and is *not* what you want.


Michele
--
{$_=pack'B8'x25,unpack'A8'x32,$a^=sub{pop^pop}->(map substr
(($a||=join'',map--$|x$_,(unpack'w',unpack'u','G^<R<Y]*YB='
.'KYU;*EVH[.FHF2W+#"\Z*5TI/ER<Z`S(G.DZZ9OX0Z')=~/./g)x2,$_,
256),7,249);s/[^\w,]/ /g;$ \=/^J/?$/:"\r";print,redo}#JAPH,

xhoster@gmail.com 05-10-2007 06:16 PM

Re: perl out of memory
 
xlue897@rogers.com wrote:
> >
> > >Thanks everyone for your help. The while loop works. However, the
> > >perl code seems much slower than the awk code. For the same file of
> > >around 5M records, awk takes only about 1 min to find the max
> > >value, while perl takes around 20 mins. Is perl slower than awk?

> >

...
>
> Here is the test code with results. The test file was generated by:
> perl -le 'print rand for 1..5_000_000' > test.txt
>
> $ time awk -F'.' '{ if($2 > max) {max = $2;} } END{print max;}' <test.txt
> 999969482421875
>
> real 0m18.16s
> user 0m17.38s
> sys 0m0.18s
>
> $ time perl -a -F'\.' -n -e '{ if($F[1] >$max) {$max=$F[1];} } END{print $max;}' test.txt
> 999969482421875
>
> real 0m41.57s
> user 0m41.14s
> sys 0m0.16s


So the difference here is less than a factor of 3, rather than the factor
of 20 you originally said. A factor of 3 is easy to believe. Different
languages have different strengths.

>
> BTW, why doesn't the code below work?
> perl -a -F/\./ -n -e '{print $F[1], "\n";} ' test.txt


The shell eats the backslash, so Perl never sees it and treats . as the
special character rather than as a literal. It often helps to use echo
to tell you exactly what Perl is seeing once the shell is done:


$ echo F/\./
F/./

$ echo 'F/\./'
F/\./

Xho



All times are GMT.