regex to match any url

 
 
nodiseos@gmail.com
 
      02-14-2006
I am struggling way too much with this. Does someone have a regex that
will match any URL-ish string like the ones below? I'm not worried about mail links.

http://sd.org
www.dssd.com
ibm.mil
https://sdsdsd.jobs
xyz.travel

Thanks!

 
A. Sinan Unur
 
      02-14-2006
(E-Mail Removed) wrote in news:1139950940.817938.158230@g14g2000cwa.googlegroups.com:

> I am struggling way too much with this. Does someone have a regex
> that will match any url-ish string like. Not worried about mail links.
>
> http://sd.org
> www.dssd.com
> ibm.mil
> https://sdsdsd.jobs
> xyz.travel


Please show what you have tried and what has not worked so that we can
help you with what you don't know rather than acting as a "write-my-
code-for-me" service.

#!/usr/bin/perl

use strict;
use warnings;

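# Print each DATA line that consists of an optional http(s):// scheme
# followed by two or more dot-separated \w+ labels.
while ( <DATA> ) {
    print if m{ \A (?: https?:// )? \w+ (?: \. \w+)+ \n \z }x;
}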

__DATA__
http://sd.org
www.dssd.com
ibm.mil
https://sdsdsd.jobs
xyz.travel

D:\Home\asu1\UseNet\clpmisc> u
http://sd.org
www.dssd.com
ibm.mil
https://sdsdsd.jobs
xyz.travel
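
For reference, the pattern above can be spelled out with /x comments and run against a few strings that should not match. This is only a minimal sketch: the names $urlish and is_urlish and the extra test strings are illustrative assumptions, not part of the original post. Note that \w does not include '-', so hyphenated host names are rejected.

```perl
#!/usr/bin/perl

use strict;
use warnings;

# Same pattern as above, with comments. The trailing \n of the original
# is dropped here because these strings carry no newline.
my $urlish = qr{
    \A
    (?: https?:// )?    # optional http:// or https:// scheme
    \w+                 # first host label (letters, digits, underscore)
    (?: \. \w+ )+       # one or more further dot-separated labels
    \z
}x;

# Hypothetical helper: returns 1 if the whole string looks URL-ish, else 0.
sub is_urlish {
    my ($s) = @_;
    return $s =~ $urlish ? 1 : 0;
}

for my $s ( 'http://sd.org', 'xyz.travel', 'localhost', 'my-site.org', 'ftp://sd.org' ) {
    printf "%-15s %s\n", $s, is_urlish($s) ? 'match' : 'no match';
}
```

The first two strings match; 'localhost' (no dot), 'my-site.org' (hyphen) and 'ftp://sd.org' (other scheme) do not.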



--
A. Sinan Unur <(E-Mail Removed)>
(reverse each component and remove .invalid for email address)

comp.lang.perl.misc guidelines on the WWW:
http://mail.augustmail.com/~tadmc/cl...uidelines.html

 
Keith Keller
 
      02-14-2006
On 2006-02-14, (E-Mail Removed) <(E-Mail Removed)> wrote:
> I am struggling way too much with this. Does someone have a regex that
> will match any url-ish string like. Not worried about mail links.
>
> http://sd.org
> www.dssd.com
> ibm.mil
> https://sdsdsd.jobs
> xyz.travel


What code did you actually try, and what was the actual output versus
the expected output?

Have you read the Posting Guidelines for this newsgroup?

--keith

--
(E-Mail Removed)-francisco.ca.us
(try just my userid to email me)
AOLSFAQ=http://wombat.san-francisco.ca.us/cgi-bin/fom
see X- headers for PGP signature information

 
DJ Stunks
 
      02-15-2006
(E-Mail Removed) wrote:
> I am struggling way too much with this.


three words: Regexp::Common::URI

-jp

 
robic0
 
      02-15-2006
On 14 Feb 2006 13:02:20 -0800, (E-Mail Removed) wrote:

>I am struggling way too much with this. Does someone have a regex that
>will match any url-ish string like. Not worried about mail links.
>
>http://sd.org
>www.dssd.com
>ibm.mil
>https://sdsdsd.jobs
>xyz.travel
>
>Thanks!


I do, but I won't give it away for free
 
Jürgen Exner
 
      02-15-2006
(E-Mail Removed) wrote:
> I am struggling way too much with this. Does someone have a regex
> that will match any url-ish string like. Not worried about mail
> links.
>
> http://sd.org
> www.dssd.com
> ibm.mil
> https://sdsdsd.jobs
> xyz.travel


That's easy: /.*/ will match not only all of your examples but any URL you
can imagine.

Now, having said that, maybe it actually was a different question you wanted
to ask?

jue


 
Keith Keller
 
      02-15-2006
On 2006-02-15, robic0 <robic0> wrote:
>
> I do, but I won't give it away for free


If you exchanged your code for what it is worth, you'd need to pay the
OP to take it and fix it.

--keith

--
(E-Mail Removed)-francisco.ca.us
(try just my userid to email me)
AOLSFAQ=http://wombat.san-francisco.ca.us/cgi-bin/fom
see X- headers for PGP signature information

 
axel@white-eagle.invalid.uk
 
      02-15-2006
A. Sinan Unur <(E-Mail Removed)> wrote:
> (E-Mail Removed) wrote in news:1139950940.817938.158230@g14g2000cwa.googlegroups.com:

>> I am struggling way too much with this. Does someone have a regex
>> that will match any url-ish string like. Not worried about mail links.
>>
>> http://sd.org
>> www.dssd.com
>> ibm.mil
>> https://sdsdsd.jobs
>> xyz.travel


> Please show what you have tried and what has not worked so that we can
> help you with what you don't know rather than acting as a "write-my-
> code-for-me" service.


> #!/usr/bin/perl
>
> use strict;
> use warnings;
>
> while ( <DATA> ) {
> print if m{ \A (?: https?:// )? \w+ (?: \. \w+)+ \n \z }x;

Perhaps the + after the (?: \. \w+) group should be changed to *
to reflect one-word valid URLs such as 'localhost'.

> }
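
A minimal sketch of that change, assuming the quantifier in question is the + after the (?: \. \w+ ) group: with * the dotted part becomes optional, so a bare word such as 'localhost' matches. The variable names and sample strings are illustrative only. The trade-off is that any single word now matches as well.

```perl
#!/usr/bin/perl

use strict;
use warnings;

# Original: at least one ".label" is required after the first label.
my $strict = qr{ \A (?: https?:// )? \w+ (?: \. \w+ )+ \z }x;

# Suggested variant: zero or more ".label" parts, so one-word hosts pass.
my $loose  = qr{ \A (?: https?:// )? \w+ (?: \. \w+ )* \z }x;

for my $s ( 'localhost', 'http://localhost', 'ibm.mil', 'hello' ) {
    printf "%-18s strict=%d loose=%d\n", $s,
        ( $s =~ $strict ? 1 : 0 ), ( $s =~ $loose ? 1 : 0 );
}
```

Only 'ibm.mil' satisfies the strict pattern, while all four strings satisfy the loose one, including the plain word 'hello'.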


Axel
 
Andreas Puerzer
 
      02-15-2006
DJ Stunks wrote:

> (E-Mail Removed) wrote:
>
>> I am struggling way too much with this.
>
> three words: Regexp::Common::URI
>
> -jp

Hm, let me have a look again at what the OP wrote:

(E-Mail Removed) wrote:
> I am struggling way too much with this. Does someone have a regex that
> will match any url-ish string like. Not worried about mail links.
>
> http://sd.org
> www.dssd.com
> ibm.mil
> https://sdsdsd.jobs
> xyz.travel
>
> Thanks!
>


I read this as: 'I want a RE that matches all of my example URIs, because they
all look url-ish' (a very vague and, at least in my eyes, error-prone
criterion, tempting me to give /.*\.\w{2,6}/ as an answer). To the OP:
What, exactly, do you want to accomplish?
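
To illustrate why such a loose criterion is error-prone, here is a small sketch that runs the pattern above, unanchored as written, against strings that are clearly not URLs. The sample strings and the variable name $loose are made up for the illustration.

```perl
#!/usr/bin/perl

use strict;
use warnings;

# The loose pattern from above: anything, a dot, then 2 to 6 word characters.
my $loose = qr/.*\.\w{2,6}/;

# Hypothetical inputs: the first is from the OP's list, the rest are not URLs.
my @samples = ( 'xyz.travel', 'see you later...ok', 'version 1.25', 'notes.txt' );

for my $s (@samples) {
    printf "%-20s %s\n", $s, $s =~ $loose ? 'match' : 'no match';
}
```

All four strings match, because \w also covers digits and nothing in the pattern ties the match to the start or end of the string.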

If my assumption of the OP's intention is correct, then you're out of luck with
Regexp::Common, as it will only match valid URIs, as shown here:

D:\Temp\test_area>cat stunks.pl
#!/usr/bin/perl

use warnings;
use strict;

use Regexp::Common qw/URI/;

chomp ( my @uris = ( <DATA> ) );
foreach ( @uris ) {
    /$RE{URI}{-keep}/ ? print "Found: $1\n" : print "Discarding: $_\n";
}

__DATA__
http://sd.org
www.dssd.com
ibm.mil
https://sdsdsd.jobs
xyz.travel

D:\Temp\test_area>perl stunks.pl
Found: http://sd.org
Discarding: www.dssd.com
Discarding: ibm.mil
Discarding: https://sdsdsd.jobs
Discarding: xyz.travel

If I did misunderstand the OP, I sincerely apologize for jumping at you when you
were giving a perfectly valid solution (though I still see some issues coming
up with the https URIs... but hey, here's where the Fun(tm) begins: hooking
your own REs into Regexp::Common :-> )


Greetings,
Andreas Pürzer

--
Have Fun,
and if you can't have fun,
have someone else's fun.
The Beautiful South
 
A. Sinan Unur
 
      02-15-2006
(E-Mail Removed) wrote in news:yLMIf.20922$wl.12746@text.news.blueyonder.co.uk:

> A. Sinan Unur <(E-Mail Removed)> wrote:
>> (E-Mail Removed) wrote in news:1139950940.817938.158230@g14g2000cwa.googlegroups.com:
>
>>> I am struggling way too much with this. Does someone have a regex
>>> that will match any url-ish string like. Not worried about mail links.
>>>
>>> http://sd.org
>>> www.dssd.com
>>> ibm.mil
>>> https://sdsdsd.jobs
>>> xyz.travel

>
>> Please show what you have tried and what has not worked so that we
>> can help you with what you don't know rather than acting as a "write-
>> my-code-for-me" service.

>

....

>> print if m{ \A (?: https?:// )? \w+ (?: \. \w+)+ \n \z }x;

> Perhaps the + after the (?: \. \w+) group should be changed to *
> to reflect one-word valid URLs such as 'localhost'
>
>


I wrote it to match the strings the OP provided. Further extension is
left to the reader as an exercise.

Sinan

--
A. Sinan Unur <(E-Mail Removed)>
(reverse each component and remove .invalid for email address)

comp.lang.perl.misc guidelines on the WWW:
http://mail.augustmail.com/~tadmc/cl...uidelines.html

 