Extract range of lines from a text file

 
 
Amer Neely
Guest
Posts: n/a
 
      04-09-2006
This is driving me nuts.

I'm walking through a mailbox file, and want to pull out specific lines
from each message. The body of each message is in a similar format,
having been generated by a script.

I'm doing OK except for one particular block of lines, the customer
address data. There is a blank line before and after this block. Example:

Transaction Time: 18:45:55

Amer Neely
POB 1481 Station Main
North Bay ON
P1B 8K7
CANADA

123-456-7890

I've managed to get the 5 lines into a string using this code:

while (<IN>)
{

# bunch of other comparisons deleted

if (/^Transaction Time:/ ... /^\d\d\d-\d\d\d-\d\d\d\d$/)
{
$CustData = $_;
$CustData =~ s/^Transaction Time:.+//; # lose the beginning pattern
$CustData =~ s/^\d\d\d-\d\d\d-\d\d\d\d$//; # lose the ending pattern
next if ($CustData =~ m/^$/); # skip the blank lines
$CustData =~ s/\n//g; # get rid of blank lines. don't think this working
print "\t$CustData\n";
}
}
close IN;
print "\nAll done.\n";

The problem seems to be that $CustData holds all 5 lines. I need to
break out each of the lines into a separate string variable so as to
populate a database field. This is what has me stumped. Sure would
appreciate some light on this.

--
Amer Neely
 
Xicheng Jia
Guest
Posts: n/a
 
      04-09-2006
Amer Neely wrote:
> if (/^Transaction Time:/ ... /^\d\d\d-\d\d\d-\d\d\d\d$/)


With the /A/ ... /B/ range expression you are still reading in line
mode, so $_ holds only one line per iteration. If you want all of these
lines in $_ at once and then parse the data, try setting the input
record separator $/ to something like:

local $/ = "Transaction Time:";

Each read then returns one block, with the records separated by the
string "Transaction Time:" held in $/.
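
A minimal sketch of this record-separator idea, reading from an
in-memory string so that it runs on its own. The sample text and the
read_records name are made up for illustration; a real script would
read the mailbox file through its own filehandle instead.

```perl
use strict;
use warnings;

# Sample input held in a string (illustrative only).
my $sample = <<'END';
Subject: order
Transaction Time: 18:45:55

Amer Neely
POB 1481 Station Main
North Bay ON
P1B 8K7
CANADA

123-456-7890
END

# Return one chunk of text per "Transaction Time:" marker found in $text.
sub read_records {
    my ($text) = @_;
    open my $in, '<', \$text or die "Cannot open in-memory file: $!";
    local $/ = "Transaction Time:";   # each read now ends at this string
    my @chunks = <$in>;
    close $in;
    chomp @chunks;                    # chomp strips the current value of $/
    shift @chunks;                    # drop whatever preceded the first marker
    return @chunks;
}

my @records = read_records($sample);
print scalar(@records), " record(s)\n";
print $records[0];
```

Each record starts with the text that followed the marker on the same
line (here " 18:45:55", with its leading space) and runs up to the next
marker or the end of the input.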

Xicheng

 
 
 
 
Amer Neely
Guest
Posts: n/a
 
      04-09-2006
Thanks for the quick reply. Still a little foggy though.
If I set the record separator to "Transaction Time:", then I don't need
the 'if (/^Transaction Time:/ ... /^\d\d\d-\d\d\d-\d\d\d\d$/)' loop?

Then set $CustData = $_ ?

But doesn't that leave me in the same position? All 5 lines are now in
$CustData.


--
Amer Neely
Home of Spam Catcher
W: www.softouch.on.ca
E:
Perl | MySQL | CGI programming for all data entry forms.
"We make web sites work!"
 
 
Xicheng Jia
Guest
Posts: n/a
 
      04-09-2006
Amer Neely wrote:
> Thanks for the quick reply. Still a little foggy though.
> If I set the record separator to "Transaction Time:", then I don't need
> the 'if (/^Transaction Time:/ ... /^\d\d\d-\d\d\d-\d\d\d\d$/)' loop?

Right, you don't need that "if" test. It was only needed because Perl
reads line by line by default (what a read returns actually depends on
your $/).

>
> Then set $CustData = $_ ?
>
> But doesn't that leave me in the same position? All 5 lines are now in
> $CustData.


Not really. After you do that (and chomp to drop the trailing
separator), each record looks something like:

$_ = " 18:45:55

Amer Neely
POB 1481 Station Main
North Bay ON
P1B 8K7
CANADA

123-456-7890
"

followed by whatever other lines sit before the next "Transaction Time:".
Then split it on "\n", like: my @arr = split "\n";
and you get:
$arr[0] = " 18:45:55";
$arr[1] = "";
$arr[2] = "Amer Neely";
$arr[3] = "POB 1481 Station Main";
$arr[4] = "North Bay ON";
...

So you can use the following line to collect your data:

my (undef, undef, $var1, $var2, $var3, $var4, $var5, undef, undef) =
split "\n";

Or you can use a regex to pull whatever data you need out of $_. It
really depends on which information you need.
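
A rough sketch of the split step that avoids counting positions by
hand: split the record on blank lines first, then split the address
paragraph into lines. The address_lines name is invented here, and the
code assumes the blank lines are truly empty (no stray spaces) and that
the address is always the second paragraph of the record.

```perl
use strict;
use warnings;

# Given one record (the text that follows "Transaction Time:"),
# return the lines of the address paragraph.
sub address_lines {
    my ($record) = @_;
    # Paragraphs are separated by one or more blank lines.
    my @paragraphs = split /\n{2,}/, $record;
    # $paragraphs[0] is the time, $paragraphs[1] the address block.
    return () unless defined $paragraphs[1];
    return split /\n/, $paragraphs[1];
}

my $record = " 18:45:55\n\nAmer Neely\nPOB 1481 Station Main\n"
           . "North Bay ON\nP1B 8K7\nCANADA\n\n123-456-7890\n";

my ($name, @rest) = address_lines($record);
print "name: $name\n";
print "line: $_\n" for @rest;
```

Because the address comes back as a list, the same code works whether
the block has five lines or some other number.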

Another way: if you are sure each record you want to keep has 5 lines,
you can read your data in paragraph mode, like:

local $/ = "";

while ( <IN> ) {
    # keep only paragraphs with more than 5 newlines: the 5 address
    # lines plus the blank line that ends the paragraph make 6
    next unless tr/\n// > 5;
    my ($name, $pob, $add1, $add2, $cont) = split "\n";
    # do something with the above variables..
}

Then you get:
-------------------------
$name = "Amer Neely"
$pob = "POB 1481 Station Main"
$add1 = "North Bay ON"
$add2 = "P1B 8K7"
$cont = "CANADA"
-------------------------
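
To see what paragraph mode hands back, here is a small sketch that
counts the lines in each paragraph. It chomps first, which in paragraph
mode removes every trailing newline, so the count does not depend on
whether the paragraph is the last one in the file. The
paragraph_line_counts name and the sample string are illustrative only.

```perl
use strict;
use warnings;

# Return the number of lines in each blank-line-separated paragraph.
sub paragraph_line_counts {
    my ($text) = @_;
    open my $in, '<', \$text or die "Cannot open in-memory file: $!";
    local $/ = "";                    # paragraph mode
    my @counts;
    while (my $para = <$in>) {
        chomp $para;                  # strips all trailing newlines here
        my @lines = split /\n/, $para;
        push @counts, scalar @lines;
    }
    close $in;
    return @counts;
}

my $sample = "Transaction Time: 18:45:55\n\n"
           . "Amer Neely\nPOB 1481 Station Main\nNorth Bay ON\nP1B 8K7\nCANADA\n\n"
           . "123-456-7890\n";

print join(",", paragraph_line_counts($sample)), "\n";   # prints "1,5,1"
```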

Xicheng

 
Xicheng Jia
Guest
Posts: n/a
 
      04-09-2006
Amer Neely wrote:

AN > if (/^Transaction Time:/ ... /^\d\d\d-\d\d\d-\d\d\d\d$/)

This keeps your input in line mode: each read puts one line from your
input file into $_.

AN > $CustData = $_;

So on each iteration of your while loop, $CustData holds only one line.

AN > $CustData =~ s/^Transaction Time:.+//; # lose the beginning pattern
AN > $CustData =~ s/^\d\d\d-\d\d\d-\d\d\d\d$//; # lose the ending pattern
AN > next if ($CustData =~ m/^$/); # skip the blank lines
AN > $CustData =~ s/\n//g; # get rid of blank lines. don't think this working

This does not get rid of blank lines; it removes the newline "\n"
character. In the default line mode that is the same as "chomp".
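
A small sketch of that last point, using a made-up three-line input: in
line mode the substitution and chomp give the same result, and it is a
separate length test that actually skips a blank line.

```perl
use strict;
use warnings;

my $text = "Amer Neely\n\nCANADA\n";
open my $in, '<', \$text or die "Cannot open in-memory file: $!";

my @seen;
while (my $line = <$in>) {
    (my $subst = $line) =~ s/\n//g;   # what the original code does
    chomp(my $chomped = $line);       # the idiomatic equivalent in line mode
    die "results differ" unless $subst eq $chomped;
    next unless length $chomped;      # this is what skips the blank line
    push @seen, $chomped;
}
close $in;

print "$_\n" for @seen;
```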

Xicheng

 
Amer Neely
Guest
Posts: n/a
 
      04-09-2006
This is very close. It will work if the input file consists only of
blocks of 5 lines delimited by a blank line. However, I need to pull
these blocks out of the middle of the message body; there are lines
before and after. That's why I was using the
'if (/^Transaction Time:/ ... /^\d\d\d-\d\d\d-\d\d\d\d$/)' loop.


 
 
MSG
Guest
Posts: n/a
 
      04-09-2006
Amer Neely wrote:
> if (/^Transaction Time:/ ... /^\d\d\d-\d\d\d-\d\d\d\d$/)
> {
> $CustData = $_;
> $CustData =~ s/^Transaction Time:.+//; # lose the beginning pattern
> $CustData =~ s/^\d\d\d-\d\d\d-\d\d\d\d$//; # lose the ending pattern
> next if ($CustData =~ m/^$/); # skip the blank lines
> $CustData =~ s/\n//g; # get rid of blank lines. don't think this working
> print "\t$CustData\n";
> }
> }


You don't have to process each line inside the loop. Instead, push each
line to an array and then process each array element after the loop.
It can be a lot cleaner and easier. Something like this:

my @records;
while (<IN>){
chomp;
if (/^Transaction Time:/ ... /^\d\d\d-\d\d\d-\d\d\d\d$/){
push @records, $_;
}
}

Now @records contains lines from "Transaction" to "123-456-7890",
each of which is an element of the array.
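
One possible variation on this, sketched under the assumption that each
block should be kept separate: in scalar context the range operator
returns a sequence number (1 on the start line, a value ending in "E0"
on the end line), which can be used to close off one block before the
next begins. The extract_blocks name and the sample message are made up
for illustration.

```perl
use strict;
use warnings;

# Return one array ref of non-blank lines per
# "Transaction Time:" ... phone-number block.
sub extract_blocks {
    my ($text) = @_;
    open my $in, '<', \$text or die "Cannot open in-memory file: $!";
    my (@blocks, @current);
    while (my $line = <$in>) {
        chomp $line;
        # Sequence number of this line within the range, or "" outside it.
        my $seq = ($line =~ /^Transaction Time:/)
              ... ($line =~ /^\d{3}-\d{3}-\d{4}$/);
        next unless $seq;
        if ($seq == 1) {                 # start marker: begin a new block
            @current = ();
        }
        elsif ($seq =~ /E0\z/) {         # end marker: the block is complete
            push @blocks, [@current];
        }
        elsif (length $line) {           # address line; blank lines skipped
            push @current, $line;
        }
    }
    close $in;
    return @blocks;
}

my $mail = join "\n",
    "Subject: order 1", "Transaction Time: 18:45:55", "",
    "Amer Neely", "POB 1481 Station Main", "North Bay ON", "P1B 8K7",
    "CANADA", "", "123-456-7890", "", "Thanks!", "";

for my $block (extract_blocks($mail)) {
    print scalar(@$block), " lines: ", join(" | ", @$block), "\n";
}
```

One caveat: the range operator keeps its state between calls to the
subroutine, so input that ends in the middle of a block would leave it
"open" for the next call.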

 
 
Xicheng Jia
Guest
Posts: n/a
 
      04-09-2006
Amer Neely wrote:
> This is very close. It will work if the input file only consists of
> blocks of 5 lines delimited by a blank line. However, I need to pull
> these blocks out of the middle of the message body. There are lines
> before and after. That's why I was using the 'if (/^Transaction Time:/
> ... /^\d\d\d-\d\d\d-\d\d\d\d$/)' loop.


Yeah, you can actually still use it here. Since you have overridden
line mode by resetting $/, each pattern is now tested against a whole
blank-line-separated paragraph of your input stream. A paragraph ends
in two newlines, so add /m to let ^ and $ match at the line boundaries
inside it (without /m the phone-number pattern cannot match):

local $/ = "";

while ( <DATA> ) {
    if (/^Transaction Time:/m ... /^\d\d\d-\d\d\d-\d\d\d\d$/m) {
        next unless tr/\n// == 6;
        my ($name, $pob, $add1, $add2, $cont) = split "\n";
        # do something with the above variables..
    }
}

This discards all paragraphs that are not between these two patterns,
and then splits only the 5-line paragraphs in between.
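
A sketch of a related paragraph-mode approach that uses a simple flag
instead of the range operator and does not fix the number of address
lines. It assumes the address is always the paragraph right after the
one whose last line starts with "Transaction Time:". The
addresses_after_marker name, the "Unit 4" line and the second time
stamp are hypothetical, added only to show a 6-line block.

```perl
use strict;
use warnings;

# Return one array ref of lines for each paragraph that directly
# follows a paragraph ending in a "Transaction Time:" line.
sub addresses_after_marker {
    my ($text) = @_;
    open my $in, '<', \$text or die "Cannot open in-memory file: $!";
    local $/ = "";                      # paragraph mode
    my (@addresses, $take_next);
    while (my $para = <$in>) {
        chomp $para;                    # strips all trailing newlines here
        push @addresses, [ split /\n/, $para ] if $take_next;
        # /m lets ^ match after any newline; \z pins the match to the
        # last line of the paragraph.
        $take_next = $para =~ /^Transaction Time:.*\z/m;
    }
    close $in;
    return @addresses;
}

my $mail = "Subject: order\nTransaction Time: 18:45:55\n\n"
         . "Amer Neely\nPOB 1481 Station Main\nNorth Bay ON\nP1B 8K7\nCANADA\n\n"
         . "123-456-7890\n\n"
         . "Transaction Time: 19:00:00\n\n"      # hypothetical second message
         . "Amer Neely\nUnit 4\nPOB 1481 Station Main\nNorth Bay ON\nP1B 8K7\nCANADA\n\n"
         . "123-456-7890\n";

for my $addr (addresses_after_marker($mail)) {
    print scalar(@$addr), " lines, starting with $addr->[0]\n";
}
```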

Best,
Xicheng

 
Xicheng Jia
Guest
Posts: n/a
 
      04-09-2006
MSG wrote:
> Amer Neely wrote:
> > if (/^Transaction Time:/ ... /^\d\d\d-\d\d\d-\d\d\d\d$/)
> > {
> > $CustData = $_;
> > $CustData =~ s/^Transaction Time:.+//; # lose the beginning pattern
> > $CustData =~ s/^\d\d\d-\d\d\d-\d\d\d\d$//; # lose the ending pattern
> > next if ($CustData =~ m/^$/); # skip the blank lines
> > $CustData =~ s/\n//g; # get rid of blank lines. don't think this working
> > print "\t$CustData\n";
> > }
> > }

>
> You don't have to process each line inside the loop. Instead, push each
> line to an array and then process each array element after the loop.
> It can be a lot cleaner and easier. Something like this:
>

= my @records;
= while (<IN>){
= chomp;
= if (/^Transaction Time:/ ... /^\d\d\d-\d\d\d-\d\d\d\d$/){
= push @records, $_;
= }
= }

You might run into trouble if your input file has more than one
/^Transaction Time:/ ... /^telephone$/ block: the lines of every block
end up in the same flat array.

Xicheng

 
Amer Neely
Guest
Posts: n/a
 
      04-09-2006
MSG wrote:
> my @records;
> while (<IN>){
> chomp;
> if (/^Transaction Time:/ ... /^\d\d\d-\d\d\d-\d\d\d\d$/){
> push @records, $_;
> }
> }
>
> Now @records contains lines from "Transaction" to "123-456-7890",
> each of which is an element of the array.
>


OK, I see what that does, but I'm not sure it helps me. The goal is to
pull out that address block, on a line-per-line basis, and insert each
line into a database field.

@records contains all the address blocks from the whole file. I'd like
to deal with each address block (line-by-line) as I go through the file
if I can.

Another problem is that some of the addresses have 6 lines, not 5.
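
One way the 5-or-6-line problem could be handled, sketched here under a
loudly stated assumption: the first line is always the name and the
last three lines are always city/province, postal code and country, so
any extra line belongs to the street part. The thread does not say
where the sixth line goes, so this layout, the address_fields name and
the "Unit 4" line are all hypothetical.

```perl
use strict;
use warnings;

# Map a 5- or 6-line address block onto named fields.
# ASSUMPTION: first line is the name; the last three lines are
# city/province, postal code and country; the rest is street.
sub address_fields {
    my @lines = @_;
    return unless @lines >= 5;
    my %f;
    $f{name} = shift @lines;
    @f{qw(city postal country)} = splice @lines, -3;   # take the last three
    $f{street} = [@lines];            # one or two lines, kept in order
    return \%f;
}

my $rec = address_fields(
    "Amer Neely", "POB 1481 Station Main", "North Bay ON", "P1B 8K7", "CANADA",
);
print "$rec->{name}: ", join(", ", @{ $rec->{street} }),
      " / $rec->{city} / $rec->{postal} / $rec->{country}\n";
```

Each named field can then be bound to its own database column as the
block is read, rather than collecting all blocks first.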

 