join on space instead of comma

 
 
LHradowy
Guest
Posts: n/a
 
      08-04-2004
Right now I have a Perl script that takes a comma-separated file, adds a
couple of things to it, and takes away the data at the end.
I have been doing this the hard way: saving the file in Excel as a
comma-separated file, FTPing it over, then running dos2ux file >file1.

And this is the outcome BEFORE I run my perl script.
3xxxx18,00 0 02 00,TELN NOT
3xxxx22,00 0 03 11,CUST HAS >

Then, after all that, I run my Perl script against it. It prompts the user
for input, adds some data, then greps the file for certain things and
creates 3 files.

What I want to do is eliminate the first part, saving it as a
comma-separated file. I believe I can do this in Perl, but I cannot split on
spaces since I have spaces that need to be part of a column. So (how to
explain) instead of splitting on a comma as above, I need to split this
file based on some criteria and also add a comma between the columns, so it
looks like the above.

This is the file I get before I save it as a comma separated file.
3xxxx33 00 0 00 21 CUSTOMER HAS > 1
3xxxx63 00 0 01 07 CUSTOMER HAS > 1
3xxxx75 00 0 02 09 CUSTOMER HAS > 1
3xxxx85 00 0 12 09 TELN NOT BILL
3xxxx28 00 0 02 00 TELN NOT BILL
yada...

I want to avoid this step. How do I change my Perl script to work from this
file instead of the comma-separated one?
Remember that the second and third fields contain spaces that I need to keep.
OUTCOME
3xxxx33,BUILDING1,ROOM2,00 0 00 21,CUSTOMER HAS > 1
3xxxx66,BUILDING1,ROOM2,00 0 01 07,CUSTOMER HAS > 1
3xxxx75,BUILDING1,ROOM2,00 0 02 09,CUSTOMER HAS > 1
3xxxx85,BUILDING1,ROOM2,00 0 12 09,TELN NOT BILL

SCRIPT
*****************************

#!/opt/perl/bin/perl

use strict;
use warnings;

system ("clear"); #Clear the screen
my $acode = "204";

print "Enter BLD: ";
chomp (my $bld =<STDIN>);
my $CAPbld = uc($bld);
my $bld4 = substr $CAPbld, 0, 4; # First 4 chars of BLD, used to name the output files

print "Enter Room: ";
chomp (my $room = <STDIN>);
my $CAProom = uc($room);

open my $fc, ">$bld4.cust_has" or die "$bld4.cust_has: $!";
open my $ft, ">$bld4.teln_not" or die "$bld4.teln_not: $!";
open my $fo, ">$bld4.PRTDIST.err" or die "$bld4.PRTDIST.err: $!";

while (<>) {
    chomp; # Remove the trailing newline
    my @a = split /,/, $_, -1;
    my $f = /TELN/ ? $ft : /CUST/ ? $fc : $fo;
    # Join only the fields, then append the newline (no trailing comma)
    print $f join(",", $acode . $a[0], $CAPbld, $CAProom, $a[1], $a[2]), "\n";
}
close $fc;
close $ft;
close $fo;

## Modify the cust_has file and pull only the first column.
my $fc_name = "$bld4.cust_has";
open my $fc_in, '<', $fc_name or die "$fc_name: $!";
open my $fcC, '>', "$bld4.cust_has.tn" or die "$bld4.cust_has.tn: $!";
while (<$fc_in>) {
    chomp;
    my ($FirstField) = split /,/;
    print $fcC "'$FirstField',\n"; # quoted first field followed by a comma
}
close $fc_in;
close $fcC;

## Modify the teln_not file to take off last column
## File is now ready for report making.
my $fc_name2 = "$bld4.teln_not";
open my $ft_in, '<', $fc_name2 or die "$fc_name2: $!";
open my $fcT, '>', "$bld4.teln_not-1" or die "$bld4.teln_not-1: $!";
while (<$ft_in>) {
    chomp;
    my ($FirstField1, $SecondField1, $ThirdField1, $FourthField1) = split /,/;
    print $fcT join(",", $FirstField1, $SecondField1, $ThirdField1, $FourthField1), "\n";
}
close $ft_in;
close $fcT;

`mv $bld4.teln_not-1 $bld4.teln_not`;







 
Gunnar Hjalmarsson
Guest
Posts: n/a
 
      08-04-2004
LHradowy wrote:
>
> And this is the outcome BEFORE I run my perl script.
> 3xxxx18,00 0 02 00,TELN NOT
> 3xxxx22,00 0 03 11,CUST HAS >


<snip>

> What I want to do is elinate the first part of saving it as a comma
> separated file. I belive I can do this in perl, but I can not
> split on spaces since I have spaces that I need to be part of a
> column.


Can't you split on instances of multiple spaces?

> So, (how to explain) instead of the above mention where there is a
> comma, I need to split this file, based on criteria, and also add a
> comma between the columns, so it looks like above...
>
> This is the file I get before I save it as a comma separated file.
> 3xxxx33 00 0 00 21 CUSTOMER HAS > 1
> 3xxxx63 00 0 01 07 CUSTOMER HAS > 1
> 3xxxx75 00 0 02 09 CUSTOMER HAS > 1
> 3xxxx85 00 0 12 09 TELN NOT BILL
> 3xxxx28 00 0 02 00 TELN NOT BILL


<snip>

> my @a = split /,/, $_, -1;


s/^\s+//; # strip leading whitespace only
my @a = split /\s{3,}/;
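
A minimal sketch of the split-on-multiple-spaces idea, assuming the columns in the real file are separated by runs of three or more spaces while the spaces inside a column are single. The sample line, its spacing, and the sub name to_csv_line are made up for illustration, since the spacing of the data quoted above has been collapsed.

```perl
use strict;
use warnings;

# Hypothetical helper: turn one space-aligned line into a comma-separated one.
# Assumes columns are separated by 3+ whitespace characters and that spaces
# inside a column are always single.
sub to_csv_line {
    my ($line) = @_;
    $line =~ s/^\s+//;    # drop leading indentation
    $line =~ s/\s+$//;    # drop trailing whitespace, including the newline
    my @cols = split /\s{3,}/, $line;
    return join ',', @cols;
}

# Made-up sample line with invented spacing.
print to_csv_line("   3xxxx33      00 0 00 21     CUSTOMER HAS > 1\n"), "\n";
# prints: 3xxxx33,00 0 00 21,CUSTOMER HAS > 1
```

If the real file ever has a column gap narrower than three spaces, this approach would merge two columns, which is where fixed positions become the safer choice.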

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
 
Brian McCauley
Guest
Posts: n/a
 
      08-04-2004
bowsayge <(E-Mail Removed)> writes:
^^^^^^^^^^^^^^^^^
127.0.0.127.... cute!


> my (@lines, @fields) = (<>);


I find the technique of tacking extra variables onto the LHS of a list
assignment just to declare them ugly.

Is there really any need to slurp here anyhow? Would it not be
simpler to read the input linewise?

Isn't @fields being declared at the wrong scope anyhow? It should be
inside the loop.

> chomp @lines;
>
> for (@lines) {
> $fields[0] = substr $_,7,7;
> $fields[1] = substr $_,39,10;
> $fields[2] = substr $_,63;


For unpacking fixed position records you may want to consider unpack()
as an alternative to several substr().
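
To make those points concrete, here is a small sketch that reads line by line, declares @fields inside the loop, and uses a single unpack() in place of several substr() calls. The record layout (6, 4, then the rest) and the sample data are invented for the example and are not the layout from the original post.

```perl
use strict;
use warnings;

# Made-up fixed-width records: name in columns 0-5, code in 6-9, rest after.
my $input = "alpha 0001first record\n"
          . "beta  0002second record\n";

# Read from an in-memory filehandle so the example is self-contained.
open my $in, '<', \$input or die "in-memory open failed: $!";

my @out;
while ( my $line = <$in> ) {
    chomp $line;
    # @fields lives only inside the loop body.
    my @fields = unpack 'A6 A4 A*', $line;
    push @out, join ',', @fields;
}
close $in;

print "$_\n" for @out;
# prints:
# alpha,0001,first record
# beta,0002,second record
```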

--
\\ ( )
. _\\__[oo
.__/ \\ /\@
. l___\\
# ll l\\
###LL LL\\
 
Anno Siegel
Guest
Posts: n/a
 
      08-04-2004
bowsayge <(E-Mail Removed)> wrote in comp.lang.perl.misc:
> LHradowy said to us:
>
> [...]
> > What I want to do is elinate the first part of saving it as a comma
> > separated file. I belive I can do this in perl, but I can not split on
> > spaces since I have spaces that I need to be part of a column.

> [...]
>
> You can extract substrings from your input lines like so:


Ah, you're learning fast. This begins to look like Perl code.
Your solution is correct. I'll add a few comments about style and
point out alternatives.

I am aware, if I read your postings right, that you are rather new to
Perl, if not to programming in general. My (and others') comments are
brief and often have the form of directions. They're still in the spirit
of "you can also do it this way", not of "you should have done it like this".
So...

> my (@lines, @fields) = (<>);


You don't need to declare @fields here. Instead, declare it in the
smallest possible scope, which would be the loop body.

But even if you had to declare it here, it isn't the done thing to
combine a mere declaration with a massive operation like slurping the
file. Use an extra line.

The parens around "<>" aren't needed and are unidiomatic.

> chomp @lines;


"chomp" can be applied to an assignment, even a list assignment. This
*is* idiomatic:

chomp( my @lines = <>);

> for (@lines) {


This would be the place to declare @fields. The array is cleared each
time my() happens at run-time, usually what you want.

> $fields[0] = substr $_,7,7;
> $fields[1] = substr $_,39,10;
> $fields[2] = substr $_,63;


It is rare in Perl that you need to index into an array. (Hashes are
different.) The more you think of an array as a whole, the better.
This is certainly not a place for indexing.

my @fields = (
substr( $_,7,7),
substr( ...),
substr( ...),
);

But there is a better way. See below...

> local $" = ',';


Nothing wrong with that, especially since it's properly localized. Still,
there's a tendency to avoid the "punctuation variables", with a few
exceptions.

> print "@fields\n";


Without assignment to $"

print join( ',', @fields), "\n";
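
As a quick check that the two forms produce the same string, here is a small sketch. The field values are borrowed from the data earlier in the thread, and the variable names are only illustrative.

```perl
use strict;
use warnings;

my @fields = ( '3xxxx33', '00 0 00 21', 'CUSTOMER HAS > 1' );

# Explicit join: no punctuation variable involved.
my $via_join = join( ',', @fields );

# Interpolation with a localized list separator.
my $via_list_separator = do {
    local $" = ',';    # restored when the do-block ends
    "@fields";
};

print "$via_join\n";    # prints: 3xxxx33,00 0 00 21,CUSTOMER HAS > 1
print "same\n" if $via_join eq $via_list_separator;    # prints: same
```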

> }


If you have to extract fields of fixed length at fixed positions,
the unpack() function is the right tool. It can extract multiple
substrings in one step.

"pack" and "unpack" and their formats are a sub-language of their own.
No one memorizes all of it, but a few idioms are worth memorizing.
One is that, to extract a substring of length $length at position $pos,
the unpack template is "@${pos}a$length". Putting it all together,
your solution becomes

chomp( my @lines = <DATA>);
for ( @lines ) {
my @fields = unpack( '@7a7 @39a10 @63a*', $_);
print join( ', ', @fields), "\n";
}
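
When the positions and lengths are held in variables, one way to build such a template is with sprintf, which sidesteps any question of how "@" behaves inside a double-quoted string. This is only a sketch: the helper name substr_template and the padded sample record are assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: build an unpack template from [position, length] pairs.
# A missing length means "everything to the end of the string".
sub substr_template {
    my @specs = @_;
    return join ' ', map {
        my ( $pos, $len ) = @$_;
        sprintf '@%da%s', $pos, defined $len ? $len : '*';
    } @specs;
}

my $template = substr_template( [ 7, 7 ], [ 39, 10 ], [ 63 ] );
print "$template\n";    # prints: @7a7 @39a10 @63a*

# Made-up record padded so the fields land at offsets 7, 39 and 63.
my $record = ( ' ' x 7 ) . '3xxxx33' . ( ' ' x 25 ) . '00 0 00 21'
           . ( ' ' x 14 ) . 'CUSTOMER HAS';
my @fields = unpack $template, $record;
print join( ',', @fields ), "\n";    # prints: 3xxxx33,00 0 00 21,CUSTOMER HAS
```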

Anno
 
Andrew Palmer
Guest
Posts: n/a
 
      08-05-2004
"Anno Siegel" <(E-Mail Removed)-berlin.de> wrote in message
news:cer6vn$8is$(E-Mail Removed)-Berlin.DE...
> If you have to extract fields of fixed length at fixed positions,
> the unpack() function is the right tool. It can extract multiple
> substrings in one step.
>
> "pack" and "unpack" and their formats are a sub-language of its own.
> No-one memorizes all of it, but a few idioms are worth memorizing.
> One is, to extract a substring of length $length at position $pos,
> the unpack template is "@${pos}a$length". Putting it all together,
> your solution becomes


You don't need both a starting position and a string length for each field
(unpack() picks up the next field where it left off with the last).
If you need to strip trailing spaces, use capital "A" (which is meant for
space-padded fields and strips the padding), rather than lowercase "a" (which
returns the bytes as they are, padding included).


>
> chomp( my @lines = <DATA>);
> for ( @lines ) {
> my @fields = unpack( '@7a7 @39a10 @63a*', $_);


For the data posted, the above happens to work the same, although this is my
preferred way:
my @fields = unpack( '@7 A32 A24 A*', $_);

> print join( ', ', @fields), "\n";
> }


(The "@7" is for the 7 spaces at the beginning of each line. Are they there
in the actual data, or was the example just indented?)
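
The difference between "a" and "A" is easiest to see side by side. A small sketch with an invented record (field widths 8 and 6, then free text):

```perl
use strict;
use warnings;

# Made-up record: an 8-column field, a 6-column field, then free text.
my $record = 'abc' . ( ' ' x 5 ) . '12' . ( ' ' x 4 ) . 'tail text';

my @verbatim = unpack 'a8 a6 a*', $record;    # padding is kept
my @stripped = unpack 'A8 A6 A*', $record;    # trailing spaces are removed

print join( '|', @verbatim ), "\n";    # prints: abc     |12    |tail text
print join( '|', @stripped ), "\n";    # prints: abc|12|tail text
```

Both templates rely on consecutive widths only, so no "@" positions are needed once the first field starts where the previous one ended.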



 
David Combs
Guest
Posts: n/a
 
      08-07-2004
In article <cer6vn$8is$(E-Mail Removed)-Berlin.DE>,
Anno Siegel <(E-Mail Removed)-berlin.de> wrote:

SNIP


>If you have to extract fields of fixed length at fixed positions,
>the unpack() function is the right tool. It can extract multiple
>substrings in one step.
>
>"pack" and "unpack" and their formats are a sub-language of its own.
>No-one memorizes all of it, but a few idioms are worth memorizing.
>One is, to extract a substring of length $length at position $pos,
>the unpack template is "@${pos}a$length". Putting it all together,
>your solution becomes
>
> chomp( my @lines = <DATA>);
> for ( @lines ) {
> my @fields = unpack( '@7a7 @39a10 @63a*', $_);
> print join( ', ', @fields), "\n";
> }
>
>Anno



Anno -- what are the *other* pack-unpack idioms you think worth
memorizing?

I bet lots of people here would like to see what you've got!

Thanks,

David


 
Tassilo v. Parseval
Guest
Posts: n/a
 
      08-07-2004
Also sprach David Combs:

> In article <cer6vn$8is$(E-Mail Removed)-Berlin.DE>,
> Anno Siegel <(E-Mail Removed)-berlin.de> wrote:


>>If you have to extract fields of fixed length at fixed positions,
>>the unpack() function is the right tool. It can extract multiple
>>substrings in one step.
>>
>>"pack" and "unpack" and their formats are a sub-language of its own.
>>No-one memorizes all of it, but a few idioms are worth memorizing.
>>One is, to extract a substring of length $length at position $pos,
>>the unpack template is "@${pos}a$length". Putting it all together,
>>your solution becomes
>>
>> chomp( my @lines = <DATA>);
>> for ( @lines ) {
>> my @fields = unpack( '@7a7 @39a10 @63a*', $_);
>> print join( ', ', @fields), "\n";
>> }
>>
>>Anno

>
>
> Anno -- what are the *other* pack-unpack idioms you think worth
> memorizing?


Not that I'm Anno, but here's one that I find useful, namely the '/'
construct. The template preceding the slash is used as a count argument
for the template following the slash:

# look at the first byte and extract that many
# bytes after that (3 in this case)
# as unsigned characters

my @x = unpack "c/C", "\x03\x00\x01\xff\x03";
print "@x\n";

__END__
0 1 255

Note how this can be combined with @:

my @x = unpack '@2c/C', "\x03\x00\x01\xff\x03";
print "@x\n";
__END__
255
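
The same construct also works with string items, which gives length-prefixed strings in both directions. A short sketch, with sample strings chosen only for illustration:

```perl
use strict;
use warnings;

# Pack two strings, each preceded by a one-byte length ("C/a*").
my $packed = pack 'C/a* C/a*', 'perl', 'unpack';

# unpack reads each length byte and then that many bytes of string.
my @strings = unpack 'C/a* C/a*', $packed;

print length($packed), "\n";    # prints: 12  (1 + 4 + 1 + 6)
print "@strings\n";             # prints: perl unpack
```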

Tassilo
--
$_=q#",}])!JAPH!qq(tsuJ[{@"tnirp}3..0}_$;//::niam/s~=)]3[))_$-3(rellac(=_$({
pam{rekcahbus})(rekcah{lrePbus})(lreP{rehtonabus}) !JAPH!qq(rehtona{tsuJbus#;
$_=reverse,s+(?<=sub).+q#q!'"qq.\t$&."'!#+sexisexi ixesixeseg;y~\n~~dddd;eval
 
Anno Siegel
Guest
Posts: n/a
 
      08-07-2004
David Combs <(E-Mail Removed)> wrote in comp.lang.perl.misc:
> In article <cer6vn$8is$(E-Mail Removed)-Berlin.DE>,
> Anno Siegel <(E-Mail Removed)-berlin.de> wrote:
>
> SNIP
>
>
> >If you have to extract fields of fixed length at fixed positions,
> >the unpack() function is the right tool. It can extract multiple
> >substrings in one step.
> >
> >"pack" and "unpack" and their formats are a sub-language of its own.
> >No-one memorizes all of it, but a few idioms are worth memorizing.
> >One is, to extract a substring of length $length at position $pos,
> >the unpack template is "@${pos}a$length". Putting it all together,
> >your solution becomes
> >
> > chomp( my @lines = <DATA>);
> > for ( @lines ) {
> > my @fields = unpack( '@7a7 @39a10 @63a*', $_);
> > print join( ', ', @fields), "\n";
> > }
> >
> >Anno

>
>
> Anno -- what are the *other* pack-unpack idioms you think worth
> memorizing?
>
> I bet lots of people here would like to see what you've got!


Not all that much, come to think of it. There's the bit-counting "%32b*",
but that is advertised right in the unpack doc and needs no promotion.
I use that one even more frequently than the substr() replacement,
but I may be inordinately fond of bit tables.

Another thing to keep in mind about pack/unpack (though not an idiom)
is the possibility of reading the length of a field from the data itself
(the "/" construct). Tassilo has also pointed this out.

Then there's the use of grouping parentheses in a template, which applies
a repeat count to a group of sub-templates at once. In the form
"(<composite template>)*" this is slightly more than syntactic sugar.

Together with the knowledge of what pack/unpack are generally about, this
pretty much outlines the range of their applicability. The details
can be looked up when you decide one or the other is a likely candidate.
Very few template characters deserve to be known by heart, maybe

b - a single bit
a - a binary byte
i - a native integer (native to your C compiler)
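
A brief sketch of the bit-counting template and of a repeated group, assuming a Perl recent enough to support parenthesized groups in templates. The byte string and the sample record are made up for the example.

```perl
use strict;
use warnings;

# "%32b*": a 32-bit checksum over the individual bits,
# which amounts to the number of set bits.
my $bitmap   = "\xff\x01";    # 8 set bits plus 1 set bit
my $set_bits = unpack '%32b*', $bitmap;
print "$set_bits\n";          # prints: 9

# "(A3 A2)*": repeat the two-field group until the data runs out.
my @pairs = unpack '(A3 A2)*', 'abc12def34';
print join( ',', @pairs ), "\n";    # prints: abc,12,def,34
```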

Anno
 
David Combs
Guest
Posts: n/a
 
      08-11-2004

THANK YOU!

Now, finally, I have some *real* motivation to (finally) go
learn unpack, so I can *understand* all those tricks.

Any way you two can convince someone (O'Reilly?) to come
up with a "wild hacks with perl" book, and put out a
call for donated hacks to include in it?

Thanks again;

David


 
Tassilo v. Parseval
Guest
Posts: n/a
 
      08-11-2004
Also sprach David Combs:

> Now, finally, I have some *real* motivation to (finally) go
> learn unpack, so I can *understand* all those tricks.
>
> Any way you two can convince someone (O'Reilly?) to come
> up with a "wild hacks with perl" book, and put out a
> call for donated hacks to include in it?


I am not sure that a book with such a title would do Perl's already
quite infamous reputation much good.

Tassilo
--
$_=q#",}])!JAPH!qq(tsuJ[{@"tnirp}3..0}_$;//::niam/s~=)]3[))_$-3(rellac(=_$({
pam{rekcahbus})(rekcah{lrePbus})(lreP{rehtonabus}) !JAPH!qq(rehtona{tsuJbus#;
$_=reverse,s+(?<=sub).+q#q!'"qq.\t$&."'!#+sexisexi ixesixeseg;y~\n~~dddd;eval
 