Velocity Reviews - Computer Hardware Reviews

Why won't this split file script work?

 
 
Max
 
      06-17-2004
I can't seem to figure out what I'm doing wrong, or maybe I'm just rushing
as I need to split a 15000 line file into chunks.

This script is supposed to work by giving the line number you want to start
at and the line number you want to stop at. It should copy all lines in
between the start and stop number to a file called $file-split. But it
doesn't seem to work. If someone has a few minutes, can you tell me what
I'm doing wrong.

Thanks in advance.

#!/usr/bin/perl

$file = $ARGV[0];
$start = $ARGV[1];
$stop = $ARGV[2];

print "$start\n";
print "$stop\n";

$cnt = 0;

open (IN, "$file");
open (OUT, "> $file-split");

while (<IN>) {

chomp;
$cnt++;
next if ($cnt lt $start);
last if ($cnt gt $stop);
print OUT "$_\n";
}

close (IN);
close (OUT);


 
 
 
 
 
Jeff 'japhy' Pinyan
 
      06-17-2004
[posted & mailed]

On Thu, 17 Jun 2004, Max wrote:

>This script is supposed to work by giving the line number you want to start
>at and the line number you want to stop at. It should copy all lines in
>between the start and stop number to a file called $file-split. But it
>doesn't seem to work. If someone has a few minutes, can you tell me what


>$cnt = 0;
>
>open (IN, "$file");
>open (OUT, "> $file-split");
>
>while (<IN>) {
>
> chomp;
> $cnt++;


You can just use the $. variable instead of $cnt.

> next if ($cnt lt $start);
> last if ($cnt gt $stop);


You're using string-wise operators. You want numerical comparisons:

next if $. < $start;
last if $. > $stop;

> print OUT "$_\n";


Why did you chomp() $_ if you're just going to print it with a newline at
the end again?

>}
>
>close (IN);
>close (OUT);


I'd write this as:

while (<IN>) {
    print OUT if ($. == $start) .. ($. == $stop);
    last if $. == $stop;
}

That's using the .. operator (perldoc perlop).
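
A minimal sketch of that flip-flop behaviour, for anyone who wants to try it without touching real files. The helper name lines_between and the in-memory filehandle are assumptions made for illustration; neither comes from the post above.

```perl
use strict;
use warnings;

# Hypothetical helper: return lines $start..$stop (inclusive) of a string,
# read through an in-memory filehandle so that perl maintains $. for us.
sub lines_between {
    my ($text, $start, $stop) = @_;
    open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";
    my @kept;
    while (my $line = <$fh>) {
        # Scalar .. is false until the left test is true, then stays true
        # up to and including the line on which the right test is true.
        push @kept, $line if ($. == $start) .. ($. == $stop);
        last if $. == $stop;    # nothing left to copy after this line
    }
    close $fh;
    return @kept;
}

my $text = join '', map { "line $_\n" } 1 .. 10;
print lines_between($text, 3, 5);
```

Run as is, this should print "line 3" through "line 5". One thing to watch: the flip-flop keeps its state between calls, so if the input ends before $stop is reached, the range is still "on" the next time the sub runs.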

--
Jeff Pinyan RPI Acacia Brother #734 RPI Acacia Corp Secretary
"And I vos head of Gestapo for ten | Michael Palin (as Heinrich Bimmler)
years. Ah! Five years! Nein! No! | in: The North Minehead Bye-Election
Oh. Was NOT head of Gestapo AT ALL!" | (Monty Python's Flying Circus)

 
 
 
 
 
John W. Krahn
 
      06-17-2004
Max wrote:
>
> I can't seem to figure out what I'm doing wrong, or maybe I'm just rushing
> as I need to split a 15000 line file into chunks.
>
> This script is supposed to work by giving the line number you want to start
> at and the line number you want to stop at. It should copy all lines in
> between the start and stop number to a file called $file-split. But it
> doesn't seem to work. If someone has a few minutes, can you tell me what
> I'm doing wrong.
>
> Thanks in advance.
>
> #!/usr/bin/perl


You should enable warnings and strictures to let perl help you find
mistakes.

use warnings;
use strict;


> $file = $ARGV[0];
> $start = $ARGV[1];
> $stop = $ARGV[2];
>
> print "$start\n";
> print "$stop\n";
>
> $cnt = 0;
>
> open (IN, "$file");
> open (OUT, "> $file-split");


You should *ALWAYS* verify that the files were opened correctly.

open IN, $file or die "Cannot open $file: $!";
open OUT, "> $file-split" or die "Cannot open $file-split: $!";


> while (<IN>) {
>
> chomp;
> $cnt++;
> next if ($cnt lt $start);
> last if ($cnt gt $stop);


You are using string comparison operators which are not doing what you
seem to expect them to do.

$ perl -le'
for ( qw[ 1 2 3 4 10 11 12 13 14 20 21 22 23 24 100 200 300 ] ) {
print if $_ lt "12"
}
'
1
10
11
100

Note that '100' is less than '12'. You should be using numerical
comparison operators instead. Also you don't need the $cnt variable as
perl provides the $. variable which does the same thing.

next if $. < $start;
last if $. > $stop;


> print OUT "$_\n";
> }
>
> close (IN);
> close (OUT);



John
--
use Perl;
program
fulfillment
 
 
Thomas Church
 
      06-17-2004
"Max" <> wrote in message
news:<mJlAc.2227$ m>...
> If someone has a few minutes, can you tell me what I'm doing wrong.


From what I can tell, you've made one or two logic errors (you're not telling
the program to do what you want), and then a few stylistic "errors". The
problem, I must assume, is that you use 'lt' and 'gt', rather than '<'
and '>'. The former are for string comparison, the latter for numeric
comparison. That is, both of these are true:

'10' lt '5'
10 > 5

I assume you wanted the numeric comparison. Also, if you input 5 and 8 for
$start and $stop, lines 5 through 8 are copied, both boundaries included.
You may want to futz with the boundaries if that isn't what you're after.


I also made some stylistic changes to the code. The most important is adding
'use strict;' and 'use warnings;', which are the most helpful things in Perl
since sliced bread. The others are less critical, but in the absence of any
preformed habits otherwise, there's no reason not to just use the
three-argument form of open all the time (for example).

One other thought -- if you don't want to keep track of it yourself, the
variable $. (dollar-period) contains the current line number of the last
filehandle that you've read from. As long as you're not messing with $/
(which redefines for perl what a line is), $. should always correspond
to your $cnt.
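
A rough sketch of how $/ changes what $. counts. The helper name record_count and the in-memory filehandle are assumptions for illustration only, not part of the code further down.

```perl
use strict;
use warnings;

# Hypothetical helper: read $text to the end using the given input record
# separator and return the final value of $. (the number of records read).
sub record_count {
    my ($text, $separator) = @_;
    local $/ = $separator;      # restored automatically when the sub returns
    open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";
    while (defined(my $record = <$fh>)) { }   # just consume every record
    my $count = $.;             # grab it before close, which resets $.
    close $fh;
    return $count;
}

my $text = "one\ntwo\nthree\nfour\n";
print record_count($text, "\n"),  "\n";   # newline-terminated records
print record_count($text, undef), "\n";   # undef $/ reads it all as one record
```

With the default "\n" separator the count matches the number of lines (4 here); with $/ undefined the whole input is a single record, so $. ends up as 1.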

Hope this helps. Code is tested but (of course) not guaranteed.


#!/usr/bin/perl

use strict;
use warnings;

my ($file, $start, $stop) = @ARGV;

print "$start\n$stop\n";

my $cnt = 0;

open (IN, '<', $file) or die "Unable to open $file: $!";
open (OUT, '>', $file . '-split') or die "Unable to open $file-split: $!";

while (<IN>) {
    chomp;
    $cnt++;
    next if ($cnt < $start);
    last if ($cnt > $stop);
    print OUT "$_\n";
}

close (IN);
close (OUT);
 
 
Thomas Church
 
      06-17-2004
"Max" <> wrote in message
news:<mJlAc.2227$ m>...
> I can't seem to figure out what I'm doing wrong, or maybe I'm just rushing
> as I need to split a 15000 line file into chunks.


One other thought: you don't need to chomp unless you actually care about
eliminating the newline. Since you add it back in again anyway when you print,
you can simplify the loop to: (untested)

while (<IN>) {
    next if ($. < $start);
    last if ($. > $stop);
    print OUT $_;
}
 
 
Max
 
      06-18-2004
I really appreciate everybody's input. I got the script working and learned
some very good programming tips.

Thanks,
Max

"Thomas Church" <> wrote in message
news: om...
> "Max" <> wrote in message
> news:<mJlAc.2227$ m>...
> > I can't seem to figure out what I'm doing wrong, or maybe I'm just rushing
> > as I need to split a 15000 line file into chunks.
>
> One other thought: you don't need to chomp unless you actually care about
> eliminating the newline. Since you add it back in again anyway when you print,
> you can simplify the loop to: (untested)
>
> while (<IN>) {
> next if ($. < $start);
> last if ($. > $stop);
> print OUT $_;
> }



 
 
Michele Dondi
 
      06-18-2004
On Thu, 17 Jun 2004 18:49:54 GMT, "Max" <> wrote:

>I can't seem to figure out what I'm doing wrong, or maybe I'm just rushing
>as I need to split a 15000 line file into chunks.

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

BTW: but this is *not* what your script below is supposed to do!
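
If chunks really are the goal, something along these lines might do. It is only a sketch under assumptions: the helper name split_into_chunks, the "-split-N" output naming and the command-line form are all invented here for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: copy $file into pieces of at most $chunk_size lines,
# named "$file-split-1", "$file-split-2", ... Returns the names written.
sub split_into_chunks {
    my ($file, $chunk_size) = @_;
    die "chunk size must be a positive integer\n"
        unless defined $chunk_size and $chunk_size =~ /^[1-9]\d*$/;
    open my $in, '<', $file or die "Cannot open $file: $!";
    my ($out, @names);
    while (my $line = <$in>) {
        # Start a new output file on lines 1, 1+size, 1+2*size, ...
        if (($. - 1) % $chunk_size == 0) {
            if ($out) {
                close($out) or die "Cannot close $names[-1]: $!";
            }
            push @names, sprintf '%s-split-%d', $file, scalar(@names) + 1;
            open $out, '>', $names[-1] or die "Cannot open $names[-1]: $!";
        }
        print {$out} $line;
    }
    if ($out) {
        close($out) or die "Cannot close $names[-1]: $!";
    }
    close $in;
    return @names;
}

# Assumed usage: perl chunks.pl bigfile.txt 5000
if (@ARGV == 2) {
    print "$_\n" for split_into_chunks(@ARGV);
}
```

For a 15000 line file and a chunk size of 5000 this would write three output files next to the original.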

>This script is supposed to work by giving the line number you want to start
>at and the line number you want to stop at. It should copy all lines in
>between the start and stop number to a file called $file-split. But it
>doesn't seem to work. If someone has a few minutes, can you tell me what
>I'm doing wrong.


Others already told you. This is how I would do it (just printing to
STDOUT and generalized to more -or no- files on the cmd line):


#!/usr/bin/perl

use strict;
use warnings;

die "Usage: $0 <start> <stop> [<file(s)>]\n" unless @ARGV>=2;

my ($start,$stop)=(shift,shift);

while (<>) {
    print if $. == $start .. ($. == $stop and close ARGV);
}

__END__


HTH,
Michele
--
you'll see that it shouldn't be so. AND, the writting as usuall is
fantastic incompetent. To illustrate, i quote:
- Xah Lee trolling on clpmisc,
"perl bug File::Basename and Perl's nature"
 
 
Richard Morse
 
      06-21-2004
In article
<Pine.SGI.3.96.1040617150701.326419A->,
Jeff 'japhy' Pinyan <> wrote:

> I'd write this as:
>
> while (<IN>) {
> print OUT if ($. == $start) .. ($. == $stop);
> last if $. == $stop;
> }


According to the docs, you could actually write this as:

while(<IN>) {
print OUT if $start .. $stop;
last if $. == $stop;
}

HTH,
Ricky
 
 
Jay Tilton
 
      06-21-2004
Richard Morse <> wrote:

: In article
: <Pine.SGI.3.96.1040617150701.326419A->,
: Jeff 'japhy' Pinyan <> wrote:
:
: > I'd write this as:
: >
: > while (<IN>) {
: > print OUT if ($. == $start) .. ($. == $stop);
: > last if $. == $stop;
: > }
:
: According to the docs, you could actually write this as:
:
: while(<IN>) {
: print OUT if $start .. $stop;
: last if $. == $stop;
: }

Not true. perlop says:

If either operand of scalar ``..'' is a constant expression, that
operand is considered true if it is equal (==) to the current
input line number (the $. variable).

Neither $start nor $stop is a constant expression.
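
A small sketch that shows the difference, assuming a six-line in-memory input and the values 2 and 4 (both picked only for illustration):

```perl
use strict;
use warnings;

my $text = join '', map { "line $_\n" } 1 .. 6;
my ($start, $stop) = (2, 4);

# Literal operands: perl compares each one to $. implicitly.
open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";
my @with_constants;
while (my $line = <$fh>) {
    push @with_constants, $. if 2 .. 4;
}
close $fh;

# Variable operands: each is only tested for truth, never compared to $.
# Both are non-zero here, so the range switches on and straight off again
# on every line, and every line gets selected.
open $fh, '<', \$text or die "Cannot open in-memory handle: $!";
my @with_variables;
while (my $line = <$fh>) {
    push @with_variables, $. if $start .. $stop;
}
close $fh;

print "constants: @with_constants\n";   # constants: 2 3 4
print "variables: @with_variables\n";   # variables: 1 2 3 4 5 6
```

So with variables you have to spell out the comparison yourself, as in ($. == $start) .. ($. == $stop).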

 
 
Jeff 'japhy' Pinyan
 
      06-21-2004
[posted & mailed]

On Mon, 21 Jun 2004, Richard Morse wrote:

> Jeff 'japhy' Pinyan <> wrote:
>
>> while (<IN>) {
>> print OUT if ($. == $start) .. ($. == $stop);
>> last if $. == $stop;
>> }

>
>According to the docs, you could actually write this as:
>
> while(<IN>) {
> print OUT if $start .. $stop;
> last if $. == $stop;
> }


Not so. I once (ok, more than once) fell prey to that. The docs state
(although not in the BOLD CAPITAL letters I'd like) that the implicit
comparison to $. only takes place if the argument is a constant
expression:

If either operand of scalar ".." is a constant expression, that operand
is considered true if it is equal ("==") to the current input line number
(the $. variable).

To be pedantic, the comparison is actually "int(EXPR) == int(EXPR)",
but that is only an issue if you use a floating point expression; when
implicitly using $. as described in the previous paragraph, the comparison
is "int(EXPR) == int($.)" which is only an issue when $. is set
to a floating point value and you are not reading from a file. Furthermore,
"span" .. "spat" or "2.18 .. 3.14" will not do what you want
in scalar context because each of the operands are evaluated using
their integer representation.

--
Jeff Pinyan RPI Acacia Brother #734 RPI Acacia Corp Secretary
"And I vos head of Gestapo for ten | Michael Palin (as Heinrich Bimmler)
years. Ah! Five years! Nein! No! | in: The North Minehead Bye-Election
Oh. Was NOT head of Gestapo AT ALL!" | (Monty Python's Flying Circus)

 
 
 
 