Matching URLs

 
 
Seansan
Guest
Posts: n/a
 
      07-20-2003
Hi,

Does anyone know of a link to or an example of a decent regexp that will
recognize internet URLs? (It needs to match URLs starting with http:// and
www.)

I am trying to replace a string like
"My Homepage is @ http://www.homepage.nl"
with
"My Homepage is @ <A HREF =
'http://www.homepage.nl'>http://www.homepage.nl</A>"

Thx, Seansan


 
 
 
 
 
Jeff 'japhy' Pinyan
 
      07-20-2003
On 20 Jul 2003, Tina Mueller wrote:

>Seansan wrote:
>
>> Does anyone know of a link to or an example of a decent regexp that will
>> recognize internet URLs? (It needs to match URLs starting with http:// and
>> www.)

>
>perldoc URI::Find
>and
>perldoc URI::Find::Schemeless


Damn those URIs and their zany schemes.

--
Jeff Pinyan RPI Acacia Brother #734 2003 Rush Chairman
"And I vos head of Gestapo for ten | Michael Palin (as Heinrich Bimmler)
years. Ah! Five years! Nein! No! | in: The North Minehead Bye-Election
Oh. Was NOT head of Gestapo AT ALL!" | (Monty Python's Flying Circus)

 
 
 
 
 
JR
 
      07-20-2003
"Seansan" <sheukels=cuthere=@yahoo.co.uk> wrote:
> Hi,
>
> Does anyone know of a link to or an example of a decent regexp that will
> recognize internet URLs? (It needs to match URLs starting with http:// and
> www.)
>
> I am trying to replace a string like
> "My Homepage is @ http://www.homepage.nl"
> with
> "My Homepage is @ <A HREF =
> 'http://www.homepage.nl'>http://www.homepage.nl</A>"
>
> Thx, Seansan


## I think this is what you want. Good luck.
## JR

#!/perl
use strict;
use diagnostics;
use warnings;

while (<DATA>) {
    s/(http:\/\/www.*)\b/<a href='$1'>$1<\/a>/g;
    print $_, "\n";
}

__DATA__
My homepage is @ http://www.homepage.nl
My favorite site is @ http://www.espn.com
My second most favorite site is @ http://www.whatever.org

=pod
## OUTPUT

My homepage is @ <a href='http://www.homepage.nl'>http://www.homepage.nl</a>
My favorite site is @ <a href='http://www.espn.com'>http://www.espn.com</a>
My second most favorite site is @ <a href='http://www.whatever.org'>http://www.whatever.org</a>

=cut
 
 
Bob Walton
 
      07-20-2003
Seansan wrote:

....


> Does anyone know of a link to or an example of a decent regexp that will
> recognize internet URLs? (It needs to match URLs starting with http:// and
> www.)

....


> Thx, Seansan


You could:

use Regexp::Common::URI;

--
Bob Walton

 
 
Bob Walton
 
      07-20-2003
Bob Walton wrote:

....
> You could:
>
> use Regexp::Common::URI;
>

Make that:


use Regexp::Common qw(URI);

--
Bob Walton

 
 
Seansan
 
      07-20-2003
Thx JR (and others),

I tried it, and it works. But I have one problem:

*) It matches all the text after the URL as well. My string
"Homepage is @ http://www.homepage.nl, yep thats it" becomes
"Homepage is @ <A HREF=http://www.homepage.nl yep thats it>http://www.homepage.nl yep thats it</A>"

Any ideas on how to solve this? I played with the regexp for a while, but I
can't figure it out.

PS: How would I alter the regexp to match www. as well? Like this?
s/([http:\/\/www|www].*)\b/<a href='$1'>$1<\/a>/g;

Seansan
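
Both questions above come down to two details of the posted patterns. First, `.*` is greedy, so it runs to the end of the line and takes the trailing text with it. Second, square brackets make a character class (one character out of the listed set), not a choice between two strings; alternation needs a group such as `(?:http://|www\.)`. The following is only a minimal sketch under those observations, using core Perl. The helper name `linkify` and the set of trailing punctuation it refuses to end on are assumptions made for illustration, not something given in the thread, and the pattern is far looser than a real URI grammar.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# linkify() is a hypothetical helper name, not something from the thread.
# It wraps each http:// or www. URL in an anchor tag and leaves the
# surrounding text alone. It does not HTML-escape the URL.
sub linkify {
    my ($text) = @_;
    $text =~ s{
        \b
        (                           # capture the URL as it appears in the text
            (?: http:// | www\. )   # alternation goes in a group, not a bracket class
            [^\s<>"']*              # run up to whitespace, quotes or angle brackets
            [^\s<>"'.,;:!?)]        # and do not end on sentence punctuation
        )
    }{
        my $url  = $1;
        # Assumption: a bare www. address should get an http:// scheme in the href.
        my $href = $url =~ /^www\./ ? "http://$url" : $url;
        "<a href='$href'>$url</a>";
    }gex;
    return $text;
}

print linkify('Homepage is @ http://www.homepage.nl, yep thats it'), "\n";
print linkify('Or simply www.homepage.nl.'), "\n";
```

With the first sample string, the comma and the words after it stay outside the link, because the match stops at the last character that is neither whitespace nor punctuation. For anything beyond simple cases, the URI::Find and Regexp::Common modules suggested earlier in the thread are likely the more robust route.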

 
 
Patrick LeBoutillier
 
      07-20-2003
"Seansan" <sheukels=cuthere=@yahoo.co.uk> wrote:
> Hi,
>
> Does anyone know of a link to or an example of a decent regexp that will
> recognize internet URLs? (It needs to match URLs starting with http:// and
> www.)


Check out the Regexp::Common module. I think it has what you are looking for.

>
> I am trying to replace a string like
> "My Homepage is @ http://www.homepage.nl"
> with
> "My Homepage is @ <A HREF =
> 'http://www.homepage.nl'>http://www.homepage.nl</A>"
>
> Thx, Seansan

 