-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Geoff Cox wrote:
> I am trying to extract email addresses from about 1000 htm files.
>
> So far am trying
>
> if ($line =~ /Mailto:(.*)"/) {
> print OUT ("$1 \n");
>
> where the line is
>
> <a href="private.php?do=newpm&u="
>
> problem is with the " after the email address and the "greedy" regex
> characteristic which finds other " further along the line ...
>
> can I stop at the first " mark?
Change your thinking a bit. Instead of matching "Mailto:" followed by as
many characters as possible followed by a quote, match "Mailto:" followed
by as many non-quote characters as possible followed by a quote:
if ($line =~ /Mailto:([^"]*)"/)
Also consider making it case-insensitive with the i modifier.
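Here is a rough sketch of how the two suggestions might combine. The helper name extract_mailto and the sample line are made up for illustration, not taken from your files, so adjust them to your real data:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns every address
# that follows "mailto:" and is terminated by a double quote on one line.
sub extract_mailto {
    my ($line) = @_;
    # [^"]* cannot cross a quote, so each match stops at the first closing quote.
    # /i accepts Mailto:, mailto:, MAILTO:; /g collects every match on the line.
    return $line =~ /mailto:([^"]*)"/gi;
}

# Made-up sample line with a second quoted attribute after the address,
# which is the case where the greedy .* version runs too far.
my $line = '<a href="Mailto:someone@example.com" class="contact">Mail us</a>';
print "$_\n" for extract_mailto($line);
```

In list context a /g match with one capture group returns all the captured strings, so a line holding several links yields several addresses. A non-greedy .*? would also stop at the first quote, but the negated class says more directly what you mean.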
- --
Eric
$_ = reverse sort $ /. r , qw p ekca lre uJ reh
ts p , map $ _. $ " , qw e p h tona e and print
-----BEGIN PGP SIGNATURE-----
Version: PGPfreeware 7.0.3 for non-commercial use <http://www.pgp.com>
iQA/AwUBP2MoO2PeouIeTNHoEQIdtACgxV2WliWoH07gZaS39JHGdb 1q+wAAn1f6
oXom0J4O85KppYwOysICYuZs
=yU+G
-----END PGP SIGNATURE-----