a backreference problem?

 
 
Geoff Cox
 
      08-24-2003
On 24 Aug 2003 12:35:34 GMT, "James E Keenan" <(E-Mail Removed)>
wrote:

James,

Apologies for calling you John!

Geoff

>
>"Geoff Cox" <(E-Mail Removed)> wrote in message
>news:(E-Mail Removed)...
>> On Sun, 24 Aug 2003 00:05:44 +0100, "Peter Cooper"
>> <(E-Mail Removed)> wrote:
>>
>> Peter et al ...
>>
>> Now trying this - you will perhaps see better what I am trying to
>> do...problem with the passing of $1 to the sub getintro - I get an
>> uninitialized value in pattern match error ...
>>
>> Cheers
>>
>> Geoff
>>
>> open(IN, "a2-left.htm");
>> open(OUT, ">>out");
>> open(INN, "total");
>>
>> if (open(IN, "a2-left.htm")) {

>
>Why are you asking to do something if and only if the filehandle is open?
>You opened it 3 lines above.
>
>>
>> $line = <IN>;
>>
>> while ($line ne "") {

>
>better for 2 above lines:
>
> while (defined($line = <IN>)) {
> next if $line =~ /^$/;
>
>> if ($line =~ /^<a href/) {

>
>Right here it becomes apparent that you're trying to parse HTML -- which
>means you should heed Peter's advice to check out HTML::Parser.
>
>> if ($line =~ /="(.*)\.doc/) {
>> &getintro($1);
>> }
>> }
>> $line = <IN>;
>>

>What's the purpose of the line above?
>
>> }
>> }
>> sub getintro {
>>
>> @intro = <INN>;

>
>You don't appear to do anything with the content of @intro, so why read from
><INN> at all?
>
>> for ($n=0;$n<900;$n++) {
>> if ($into[$n] =~ /$1/) {

>
>... unless, that is, you have a typo in line above and meant $intro
>
>But here $1 contains the result of the first captured expression on the last
>matching line ... which may not always be what you want.
>
>> print OUT ("$into[$n]\n");
>> print OUT ("$line[$n-1]\n");
>> }
>> }
>> }
>>
>> close (IN);
>> close (OUT);
>> close (INN);
>>

>
>Note: The subject of your OP was "backreference problem." But at no point
>in the discussion have you used any backreferences (e.g., \1 as part of a
>pattern match). This leads me to suspect that you just don't understand
>Perl regexes very well. I recommend going to a good Perl text (e.g., the
>llama) and carefully working through the exercises on regexes.
>
>
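
A minimal sketch of the distinction drawn in the note above, assuming made-up sample strings (none of them come from the real files): \1 is a backreference used inside a pattern, while $1 is the capture variable read after a successful match.

```perl
use strict;
use warnings;

# \1 is a backreference: inside the pattern it must match the same text
# that the first group captured earlier in that same match.
my $text    = "this is is a test";
my $doubled = '';
if ($text =~ /\b(\w+) \1\b/) {
    $doubled = $1;    # $1 is the capture variable, read after the match
}
print "doubled word: $doubled\n";

# $1 on its own is not a backreference; it simply holds the captured text.
my $line = '<a href="docs/intro.doc">Intro</a>';
my $path = '';
if ($line =~ /href="(.*)\.doc/) {
    $path = $1;       # copy it straight away, before another match overwrites it
}
print "captured path: $path\n";
```

Copying $1 into a named variable right after the match is the safer habit, since any later successful match replaces it.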


 
 
 
 
 
Geoff Cox
 
      08-24-2003
James,

following code nearly there ... just one major problem ----- I would
like to have the text from the getintro to be in the order in which
the path is obtained from the a2-left.htm file but it is different
here ...From memory I think the problem is that

@intro = <INN>;

is in random order? is there a way round this?

Cheers

Geoff




use strict;

open(IN, "a2-left.htm");
open(OUT, ">>out");
open(INN, "total");

my $line = <IN>;

while ($line ne "") {

    if ($line =~ /^<a href/) {

        if ($line =~ /="(.*)\.doc/) {
            my $found = $1;
            &getintro($found);
        }

    }

    $line = <IN>;
}



sub getintro {
    my $found;
    my $n;

    my @intro = <INN>;
    for ($n=0;$n<900;$n++) {
        if ($intro[$n] =~ /^<a href/) {
            if ($intro[$n] =~ /$found/) {
                &print;
            }
        }

    }

    sub print {

        print OUT ("<tr>$intro[$n-1]\n");
        print OUT ("$intro[$n]</tr>\n");
    }

}


close (IN);
close (OUT);
close (INN);
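
One possible reading of the ordering problem, offered as a sketch rather than a diagnosis: <INN> returns lines in file order, not random order, but inside getintro the line `my $found;` declares a fresh variable that never receives the argument. The pattern /$found/ is then effectively empty, so every "<a href" line matches and the first call prints every entry in the order of the file total. The array contents and the sub names count_unassigned and count_from_args below are made up for illustration.

```perl
use strict;
use warnings;

# Stand-in for the contents of the file "total" (hypothetical data).
my @intro = (
    "Intro text for two\n",
    "<a href=\"docs/two.doc\">Two</a>\n",
    "Intro text for one\n",
    "<a href=\"docs/one.doc\">One</a>\n",
);

# Mirrors the posted sub: $found is declared but never assigned.
sub count_unassigned {
    my $found;
    no warnings 'uninitialized';
    return scalar grep { /^<a href/ && /$found/ } @intro;
}

# Takes the argument from @_ instead.
sub count_from_args {
    my ($found) = @_;
    return scalar grep { /^<a href/ && /$found/ } @intro;
}

my $unassigned = count_unassigned('docs/one');
my $from_args  = count_from_args('docs/one');
print "unassigned: $unassigned, from \@_: $from_args\n";
```

With the argument taken from @_, only the entry for the requested path matches, so the output follows the order in which paths are found in a2-left.htm.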

 
 
 
 
 
Geoff Cox
 
      08-24-2003
On Sun, 24 Aug 2003 09:58:19 -0500, (E-Mail Removed) (Tad
McClellan) wrote:

Tad,

the code below now does what I want - ie for each path to a Word doc
name in a2-left.htm it finds the same path etc in the file total and
gets the introductory text associated with this doc....

I am sure there are better ways of doing this...any thoughts? The sub
getintro seems poor..by the way it seems important to open and close
the total file each time the sub getintro is used...

Cheers

Geoff

use strict;

open(IN, "a2-left.htm");
open(OUT, ">>out");

my $line = <IN>;

while ($line ne "") {
    if ($line =~ /^<a href/) {
        if ($line =~ /href="(.*)\.doc/) {
            &getintro($1);
        }
    }
    $line = <IN>;
}


sub getintro {
    open (INN, "total");
    my $file = $1;
    my $n;
    my @intro = <INN>;

    for ($n=0;$n<900;$n++) {
        if ($intro[$n] =~ /$file/i) {
            print OUT ("<tr>$intro[$n-1]\n");
            print OUT ("$intro[$n]</tr>\n");
        }
    }
    close (INN);
}


close (IN);
close (OUT);
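
On why reopening total for every call seems to matter: once a filehandle has been read to the end, the next read from it returns nothing until the handle is rewound or reopened. The sketch below shows this with an in-memory string standing in for the file (the contents are invented, and the in-memory open is only there to keep the example self-contained).

```perl
use strict;
use warnings;

# In-memory stand-in for the file "total" (hypothetical contents).
my $data = "line one\nline two\nline three\n";
open(my $fh, '<', \$data) or die "Cannot open in-memory file: $!";

my @first  = <$fh>;    # reads every remaining line
my @second = <$fh>;    # handle is already at end of file, so this is empty

seek($fh, 0, 0) or die "Cannot rewind: $!";    # rewind to the start
my @third  = <$fh>;    # all lines again

close($fh);
printf "first: %d, second: %d, after seek: %d\n",
    scalar @first, scalar @second, scalar @third;
```

An alternative to reopening or rewinding would be to read total into an array once, before the main loop, and search that array on each call.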





>Geoff Cox <(E-Mail Removed)> wrote:
>
>
>> &getintro($1);

>
>
>Why are you passing an argument when the subroutine definition
>never makes use of the argument that you passed?
>
>
>> sub getintro {

>
>
> my( $file ) = @_;
>
>
>> my $n;
>>
>> print ("$1\n");

>
>
> print ("$file\n");
>
>
>> if ($intro[$n] =~ /$1/) {

>
>
> if ($intro[$n] =~ /$file/) {
>
>
>> &print;

>
>
> print OUT "$intro[$n]\n"
>
>
>
>[snip TOFU]
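
A sketch of getintro with the corrections quoted above applied, under some assumptions: the data is an invented stand-in for total, and the sub returns rows instead of printing to OUT so it can be run on its own. Two further changes are not from the thread and are only suggestions: \Q...\E so that characters such as "." in the path are matched literally, and $#intro in place of the fixed limit of 900.

```perl
use strict;
use warnings;

# Hypothetical contents of "total": each intro line is followed by its link line.
my @intro = (
    "Intro for the A2 notes\n",
    "<a href=\"docs/a2.notes.doc\">A2 notes</a>\n",
    "Intro for a different file\n",
    "<a href=\"docs/a2xnotes.doc\">Other</a>\n",
);

# Returns the intro/link pairs whose link line contains $file.
sub getintro {
    my ($file) = @_;                 # take the argument rather than reading $1
    my @rows;
    for my $n (1 .. $#intro) {       # stop at the real end of the data, not 900
        next unless $intro[$n] =~ /^<a href/;
        if ($intro[$n] =~ /\Q$file\E/i) {   # \Q...\E matches the path literally
            push @rows, "<tr>$intro[$n-1]$intro[$n]</tr>\n";
        }
    }
    return @rows;
}

my @found_rows = getintro('docs/a2.notes');
print @found_rows;
```

Without \Q...\E the "." in the path would match any character, so the a2xnotes entry would be picked up as well.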


 
 
Geoff Cox
 
      08-24-2003
On Sun, 24 Aug 2003 18:55:34 GMT, (E-Mail Removed) (Jay Tilton)
wrote:

Jay,

Just to thank you for your comments - I will read them tomorrow...a
little sleep required!

Cheers

Geoff

 