How would I do this in perl?

 
 
ccc31807
      10-15-2009
On Oct 15, 10:55 am, Ben Morrow <(E-Mail Removed)> wrote:
> Don't use bareword filehandles.


My Perl documentation (5.8) demonstrates the syntax I used. Is the
documentation wrong? What is the rationale for avoiding bareword
filehandles?

> Check the return value of open.


Yeah, yeah, we go through this almost every time.


> You are re-opening the template for every number. This is a bad idea:
> its contents aren't going to change, so just read them once and remember
> them.


My purpose was to demonstrate iterating through a template file. How
do you iterate through a template file when you read it into memory?
I'm asking this because I don't understand your logic in doing this. I
agree that opening and closing the same file many times seems like a
stupid idea.

> There's no point iterating line-by-line if you're doing a replacement on
> the whole file (unless you expect your files to exceed memory). It's
> much clearer (as well as probably more efficient) to read the whole file
> in and do and s/// over the whole thing.


Again, I wanted something clear rather than concise. Since the
template file has to be written many times, I guess I just don't see
it.

> '$_ =~' is redundant.


Yes, but it makes clear that $_ is being changed each time, maximizing
verbosity for clarity.

> You need a /g there.


Not if there is only one change per line, which is how the template is
specified.
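
For readers weighing the /g point, here is a minimal sketch of the difference. The template line is made up for illustration and assumes the placeholder X appears twice on one line:

```perl
use strict;
use warnings;

# Hypothetical template line in which the placeholder X occurs twice.
my $line = 'int valueX = computeX();';

(my $once  = $line) =~ s/X/42/;     # without /g: only the first X is replaced
(my $every = $line) =~ s/X/42/g;    # with /g: every X is replaced

print "$once\n";
print "$every\n";
```

If the template really is guaranteed to carry one placeholder per line, both forms give the same result; /g only matters once that guarantee stops holding.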

Please note, terse code can be very difficult to understand, and Perl
allows very terse shortcuts. The OP said that he didn't know Perl
well. I have gotten into a bad habit myself of using some of these
shortcuts, and sometimes am reminded that verbosity is better. A
couple of days ago, I did this:
sub afunction
{ my $var = shift;
.... }

and a colleague wanted to know what 'shift' meant. I was at fault for
using the shortcut, and he wasn't at fault by not having learned the
usage of 'shift'.
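
For anyone in the colleague's position: inside a sub, a bare shift removes and returns the first element of @_. A small sketch contrasting it with the more explicit list assignment (both sub names are made up for illustration):

```perl
use strict;
use warnings;

# Inside a sub, shift with no argument operates on @_.
sub with_shift {
    my $var = shift;            # same as: my $var = shift @_;
    return "got $var";
}

# The more explicit spelling: copy the first argument out of @_.
sub with_list_assignment {
    my ($var) = @_;
    return "got $var";
}

print with_shift('a'), "\n";
print with_list_assignment('a'), "\n";
```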

CC.
 
Jürgen Exner
      10-15-2009
ccc31807 <(E-Mail Removed)> wrote:
>On Oct 15, 10:55 am, Ben Morrow <(E-Mail Removed)> wrote:
>> You are re-opening the template for every number. This is a bad idea:
>> its contents aren't going to change, so just read them once and remember
>> them.

>
>My purpose was to demonstrate iterating through a template file. How
>do you iterate through a template file when you read it into memory?


You want to iterate over the _content_ of the file, not the file itself.
It is therefore irrelevant whether that content is on the HD or in
memory; you can iterate over it either way.
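
A minimal sketch of what that could look like, assuming a tiny made-up template. An in-memory filehandle stands in for template.java so the example is self-contained:

```perl
use strict;
use warnings;

# Stand-in for the template file (hypothetical contents).
my $template_text = "class FooX {\n    int id = X;\n}\n";

open my $fh, '<', \$template_text
    or die "Cannot open in-memory template: $!";
my @lines = <$fh>;                  # read every line exactly once
close $fh;

my $output = '';
for my $number (1, 2) {
    for my $line (@lines) {         # iterate over the content, no re-open
        (my $copy = $line) =~ s/X/$number/g;
        $output .= $copy;
    }
}
print $output;
```

The inner loop is the same line-by-line iteration as before; only the source of the lines has changed from a filehandle to an array.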

jue
 
Peter J. Holzer
      10-15-2009
On 2009-10-15 17:56, ccc31807 <(E-Mail Removed)> wrote:
> On Oct 15, 10:55 am, Ben Morrow <(E-Mail Removed)> wrote:
>> Don't use bareword filehandles.

>
> My Perl documentation (5.8) demonstrates the syntax I used.


So does 5.10 and so will 5.12.

> Is the
> documentation wrong?


No. Bareword filehandles have existed since the beginning of Perl and
they will probably exist as long as Perl 5.x.

> What is the rationale for avoiding bareword filehandles?


They are global. Global variables in general are a bad idea. Global
variables which you don't have to declare are an even worse idea.
The risk that you accidentally use the same file handle in two
independent subroutines is quite big. Happened to me a few times and it
always took me a long time to find the bug. Lexical filehandles
(introduced in Perl 5.6, IIRC) are a lot safer. They are also normal
scalars, so you can pass them around as arguments like any other scalar.
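
A short sketch of the "pass them around" point, with a hypothetical count_lines sub and an in-memory string standing in for a real file:

```perl
use strict;
use warnings;

# A lexical filehandle is an ordinary scalar, so it can be handed to a sub.
# Its name is private to the enclosing scope, so it cannot collide with a
# handle of the same name used elsewhere.
sub count_lines {
    my ($fh) = @_;
    my $count = 0;
    while (defined(my $line = <$fh>)) {
        $count++;
    }
    return $count;
}

my $data = "one\ntwo\nthree\n";     # stand-in for a real file
open my $in, '<', \$data
    or die "Cannot open in-memory data: $!";
my $n = count_lines($in);
close $in;

print "$n\n";
```

With a bareword handle such as NUMBERS, every sub in the package that opens NUMBERS shares the same global handle, which is the accidental reuse described above.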

hp

 
sharma__r@hotmail.com
      10-15-2009
On Oct 15, 10:56 pm, ccc31807 <(E-Mail Removed)> wrote:
> [...]
> > '$_ =~' is redundant.
>
> Yes, but it makes clear that $_ is being changed each time, maximizing
> verbosity for clarity.

Another thing that I want to point out is that you are using the $_
variable in the two 'while' loops. Since your focus is on clarity, this
actually has the reverse effect!

####################### alternate ################################
#!/usr/local/bin/perl
use 5.006; # or later, to be able to use the 3-arg form of open with lexical filehandles
use strict;
use warnings;

local $\ = qq{\n}; # auto-append newlines after every print

### store the template into a scalar
my $template = 'template.java';
open my $template_FH, "<", $template
    or die "Could not open the java template [$template] for reading: $!";
my $code = do {
    local $/ = undef;   # slurp mode: read the whole file in one go
    <$template_FH>;
};
close $template_FH
    or die "Could not close the java template [$template] after reading: $!";
chomp $code;

### loop over the numbers
my $numbers = 'numbers.dat';
open my $numbers_FH, "<", $numbers
    or die "Could not open the numbers file [$numbers] for reading: $!";

my $java_result = 'output.java';
warn "The file [$java_result] is about to be clobbered." if -e $java_result;
open my $out_FH, ">", $java_result
    or die "Could not open the file [$java_result] for writing: $!";

NUMBER:
while (defined(my $number = <$numbers_FH>)) {
    # invalid line if any non-digit found
    next NUMBER if $number =~ m/\D/;

    chomp $number;

    (my $template_copy = $code) =~ s/X/$number/g;

    print {$out_FH} $template_copy;
}

close $numbers_FH
    or die "Could not close the numbers file [$numbers] after reading: $!";

close $out_FH
    or die "Could not close the file [$java_result] after writing: $!";

__END__


 
sln@netherlands.com
      10-15-2009
On Thu, 15 Oct 2009 13:45:35 -0700 (PDT), (E-Mail Removed) wrote:

>Another thing that I want to point out is that you are using the $_
>variable
>in the two 'while' loops. Since your focus is on clarity, then this
>actually
>has a reverse effect!
>
>####################### alternate ################################
>#!/usr/local/bin/perl
>use 5.006; # or later, to be able to use the 3-arg form of open with lexical filehandles
>use strict;
>use warnings;
>
>local $\ = qq{\n}; # auto-append newlines after every print
>
>### store the template into a scalar
>my $template = 'template.java';
>open my $template_FH, "<", $template
>    or die "Could not open the java template [$template] for reading: $!";
>my $code = do {
>    local $/ = undef;
>    <$template_FH>;
>};
>close $template_FH
>    or die "Could not close the java template [$template] after reading: $!";
>chomp $code;
>
>### loop over the numbers
>my $numbers = 'numbers.dat';
>open my $numbers_FH, "<", $numbers
>    or die "Could not open the numbers file [$numbers] for reading: $!";
>
>my $java_result = 'output.java';
>warn "The file [$java_result] is about to be clobbered." if -e $java_result;


- Why 'warn' then clobber it anyway?

>open my $out_FH, ">", $java_result
> or die "Could not open the file [$java_result] for writing: $!";
>
>NUMBER:
>while (defined(my $number = <$numbers_FH>)) {
> # invalid line if any non-digit found
> next NUMBER if $number =~ m/\D/;

                            ^^^^^
- This will always fail unless line equals "\d+"
  Either
      while (defined(my $number = <DATA>)) {
          $number =~ s/^\s*(\d+)\s*$/$1/ or next NUMBER;
  Or
      while (<DATA>) {
          my ($number) = /^\s*(\d+)\s*$/ or next;
>
> chomp $number;
>
> (my $template_copy = $code) =~ s/X/$number/g;


- I guess loading a copy of the template file into
memory makes this faster than reading a file line
by line, rewind, repeat .., but not by much given
file cache.

Of course, this is not the fastest method. You are
actually copying the file data over and over again,
then doing regex with substitution (more overhead)
over and over again. These duplicate actions add
significantly to the overhead.

The fastest method, if actual speed is a factor,
is to read the template into memory, index the
substitution points into an array one time,
then write segments to the output file, directly,
using substr. Or instead of indexing, just creating
an array of segments (strings).
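
A rough sketch of the "array of segments" variant, assuming the placeholder is the literal X and using made-up template text and numbers. In-memory strings stand in for the real files. Whether this actually beats s///g on a copy would need benchmarking:

```perl
use strict;
use warnings;

# Stand-in for the slurped template (hypothetical contents).
my $code = "class FooX {\n    int id = X;\n}\n";

# Cut the template at every placeholder exactly once. The limit of -1
# keeps trailing empty segments, so a template ending in X still works.
my @segments = split /X/, $code, -1;

my $out = '';
open my $out_fh, '>', \$out
    or die "Cannot open in-memory output: $!";

for my $number (7, 8) {
    # Gluing the segments back together with the number in between
    # rebuilds the template with every X replaced, with no regex per number.
    print {$out_fh} join($number, @segments);
}
close $out_fh;

print $out;
```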

>
> print {$out_FH} $template_copy;
>}
>
>close $numbers_FH
>    or die "Could not close the numbers file [$numbers] after reading: $!";
>
>close $out_FH
> or die "Could not close the file [$java_result] after writing: $!";
>
>__END__
>


-sln
 
Ilya Zakharevich
      10-15-2009
On 2009-10-15, Ben Morrow <(E-Mail Removed)> wrote:
>> open NUMBERS, '<', 'numbers.dat';

>
> Don't use bareword filehandles.


To the contrary: unless in a 1-liner, one should carefully consider
using bareword filehandles.

[If code is supposed to be reused, you rarely know on which version
of Perl it is going to be reused.]

Ilya
 
ccc31807
      10-16-2009
On Oct 15, 3:57 pm, "Peter J. Holzer" <(E-Mail Removed)> wrote:
> > What is the rationale for avoiding bareword filehandles?

>
> They are global. Global variables in general are a bad idea. Global
> variables which you don't have to declare are an even worse idea.
> The risk that you accidentally use the same file handle in two
> independent subroutines is quite big. Happened to me a few times and it
> always took me a long time to find the bug. Lexical filehandles
> (introduced in Perl 5.6, IIRC) are a lot safer. They are also normal
> scalars, so you can pass them around as arguments like any other scalar.


Good. This makes sense.

CC.
 
ccc31807
      10-16-2009
On Oct 16, 10:20 am, Sherm Pendley <(E-Mail Removed)> wrote:
> >> Check the return value of open.

>
> > Yeah, yeah, we go through this almost every time.

>
> Then why haven't you learned it yet? How many repetitions will it take
> until you understand the value of error-checking?


My job requires a lot of data manipulation. I write a lot of short
scripts to munge data files on a one time basis. open() and close()
never fail (unless I already have the data file open, in which case
the problem is obvious.) In this very specific environment, I don't
really need the error checking and can't justify the time spent typing
the extra keystrokes.

I don't doubt the value of error checking. That's not the point. The
point is that (as Emerson said) a foolish consistency is the hobgoblin
of little minds -- IOW, why go to the extra effort of this kind of
error checking when it's not needed in a particular situation,
regardless of its value in general? Just because a practice is
considered best generally doesn't necessarily mean that it should
always be adhered to in all circumstances. I wear my seatbelt when
driving, but I don't fasten it when I move the car to wash it.
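
On the keystroke cost specifically, one option that may fit throwaway scripts is the autodie pragma, which has shipped with Perl since 5.10.1. A minimal sketch follows; the path is hypothetical and assumed not to exist, so the failure is visible:

```perl
use strict;
use warnings;
use autodie;    # open, close and friends now throw an exception on failure

my $missing = '/nonexistent/dir/numbers.dat';   # hypothetical, assumed absent

my $ok = eval {
    open my $fh, '<', $missing;     # no "or die" needed
    close $fh;
    1;
};
print "open failed as expected: $@" unless $ok;
```

In a real one-off script the eval would be left out; a failed open would then stop the script with a message naming the file and the reason.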

CC
 
sharma__r@hotmail.com
      10-16-2009
On Oct 16, 3:08 am, (E-Mail Removed) wrote:

> >my $java_result = 'output.java';
> >warn "The file [$java_result] is about to be clobbered." if -e $java_result;

>
> - Why 'warn' then clobber it anyway?


Guess you're right. I couldn't think of anything better.



> >NUMBER:
> >while (defined(my $number = <$numbers_FH>)) {
> >    # invalid line if any non-digit found
> >    next NUMBER if $number =~ m/\D/;

>
>                                ^^^^^
> - This will always fail unless line equals "\d+"
>   Either
>       while (defined(my $number = <DATA>)) {
>           $number =~ s/^\s*(\d+)\s*$/$1/ or next NUMBER;
>   Or
>       while (<DATA>) {
>           my ($number) = /^\s*(\d+)\s*$/ or next;


Your observation is right on the mark. Actually it'll always fail,
since the matching happens before chomping, so the \n will match the \D.

Actually the solution is very simple!
next NUMBER if $number =~ m/\D./xms
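
An alternative sketch that does not depend on the trailing newline at all: chomp first, then require the whole line to be digits. The input lines are made up, with an in-memory filehandle standing in for numbers.dat. Note that m/\D./xms as written would appear to let through a blank line, and also a final unterminated line ending in a non-digit, since \D then has no following character to pair with:

```perl
use strict;
use warnings;

# Hypothetical contents of numbers.dat; the last line has no trailing newline.
my $input = "12\n3a\n\n 45 \n678";
open my $numbers_fh, '<', \$input
    or die "Cannot open in-memory numbers: $!";

my @accepted;
NUMBER:
while (defined(my $number = <$numbers_fh>)) {
    chomp $number;                              # strip the newline before testing
    next NUMBER unless $number =~ /\A\d+\z/;    # one or more digits, nothing else
    push @accepted, $number;
}
close $numbers_fh;

print "@accepted\n";
```

This version deliberately rejects lines with surrounding whitespace; the \s* patterns quoted above are the variant to use if such lines should be accepted.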



> >    (my $template_copy = $code) =~ s/X/$number/g;

> - I guess loading a copy of the template file into
>   memory makes this faster than reading a file line
>   by line, rewind, repeat .., but not by much given
>   file cache.
>
>   Of course, this is not the fastest method. You are
>   actually copying the file data over and over again,
>   then doing regex with substitution (more overhead)
>   over and over again. These duplicate actions add
>   significantly to the overhead.
>
>   The fastest method, if actual speed is a factor,
>   is to read the template into memory, index the
>   substitution points into an array one time,
>   then write segments to the output file, directly,
>   using substr. Or instead of indexing, just creating
>   an array of segments (strings).
>


Err :-\ speed was not on my mind in the code that I presented. I was
presenting it more from a clarity standpoint.

Your speed optimization looks very interesting, but please show the
Perl implementation.

--Rakesh
 
Jürgen Exner
      10-18-2009
Ilya Zakharevich <(E-Mail Removed)> wrote:
>To the contrary: unless in a 1-liner, one should carefully consider
>using bareword filehandles.
>
> [If code is supposed to be reused, you rarely know on which version
> of Perl it is going to be reused.]


If the code needs to be backward compatible with Perl versions before
5.6, then you have much bigger problems to worry about than bareword
filehandles.

jue
 