Check if value is a website URL

 
 
jwcarlton
      09-14-2011
This is a tricky one for me. I'm validating a form, and want to check
if a field entered is a legitimate website address. I don't
necessarily need to ensure that the site works (I can do that later),
but I do want to see if what's entered is a likely URL.

I'm currently just checking to see if it begins with "http", but
that's not so great; a less-savvy person might enter
"www.example.com", or even "example.com", and get an error that it's
not a legitimate link.

I've thought about testing to see if it contains at least 1 "." (since
all website addresses would, I think), but that's pretty vague; a less-
savvy person might enter their email address, and it would go through.
I guess that I could also check for an "@", but I can't help but
wonder if there's a smarter / smoother option?
 
dhtml
      09-14-2011
On Sep 13, 11:12 pm, jwcarlton <(E-Mail Removed)> wrote:
> This is a tricky one for me. I'm validating a form, and want to check
> if a field entered is a legitimate website address. I don't
> necessarily need to ensure that the site works (I can do that later),
> but I do want to see if what's entered is a likely URL.
>
> I'm currently just checking to see if it begins with "http", but
> that's not so great; a less-savvy person might enter
> "www.example.com", or even "example.com", and get an error that it's
> not a legitimate link.


If you want to require the protocol to be explicit, the UI should
indicate that in some way. For example, use placeholder text that
reads http://www.example.com, or use a label such as "Address" or
"Location" instead of "URL".

(Lest the so-called "less-savvy" user actually know what a URL is and
enter a perfectly valid one that your code can't handle (i.e. not
beginning with "http")).

Validate the "location" field with a regexp on the client and on the
server. You might consider using the HTML5 pattern attribute where
supported.

>
> I've thought about testing to see if it contains at least 1 "." (since
> all website addresses would, I think), but that's pretty vague; a less-
> savvy person might enter their email address, and it would go through.
> I guess that I could also check for an "@", but I can't help but
> wonder if there's a smarter / smoother option?


Use HTML5 INPUT type="email", feature tested, with a fallback on the
client where the test fails, and a fallback on the server (server-side
handling) where JS is disabled.
--
Garrett
 
Swifty
      09-14-2011
On Tue, 13 Sep 2011 23:12:08 -0700 (PDT), jwcarlton
<(E-Mail Removed)> wrote:

>I've thought about testing to see if it contains at least 1 "." (since
>all website addresses would, I think), but that's pretty vague; a less-
>savvy person might enter their email address


My current algorithm tests for an interior "." (i.e. none at the ends)
and no "@". It is for my own consumption, but I'm
better than Mr Average at naking mistakes. There's another one!

Going further would take me into the land of diminishing returns, but
this decision depends on how accurate you need to be.
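
The test described above can be sketched roughly as follows. This is
only a reading of the description, not the poster's actual code; the
function name looksLikeWebAddress is made up for illustration.

```javascript
// Heuristic from the post above: a "." somewhere in the interior,
// no "." at either end, and no "@" anywhere.
// looksLikeWebAddress is a hypothetical name, not code from the thread.
function looksLikeWebAddress(value) {
  const text = String(value).trim();
  if (text.indexOf("@") !== -1) return false; // probably an email address
  if (text.charAt(0) === "." || text.charAt(text.length - 1) === ".") {
    return false; // a dot at either end cannot be an interior dot
  }
  // Neither end is a dot, so any remaining dot is an interior one.
  return text.indexOf(".") !== -1;
}
```

Note that rejecting every "@" also rejects URLs that carry user
information, such as http://user@example.com, which fits the
"diminishing returns" trade-off mentioned above.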

--
Steve Swift
http://www.swiftys.org.uk/swifty.html
http://www.ringers.org.uk
 
Jukka K. Korpela
      09-14-2011
14/09/2011 09:57, dhtml wrote:

> If you want to require the protocol to be explicit, the UI should
> indicate that in some way.


The protocol part is required in absolute URLs. But of course one might
consider prepending http:// if there is no protocol part.
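
A minimal sketch of that prepending step, assuming http:// is the
wanted default; the function name withDefaultScheme is hypothetical.

```javascript
// Prepend "http://" when the input has no scheme of its own.
// withDefaultScheme is a hypothetical helper, not code from the thread.
function withDefaultScheme(value) {
  const text = String(value).trim();
  if (text === "") return ""; // nothing entered, nothing to fix up
  // A scheme is a letter followed by letters, digits, "+", "-" or ".",
  // and here we only treat it as present when "://" follows it.
  if (/^[a-z][a-z0-9+.\-]*:\/\//i.test(text)) return text;
  return "http://" + text;
}
```

With this, "www.example.com" becomes "http://www.example.com", while
input that already starts with a scheme is passed through unchanged.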

> For example, use placeholder text that
> reads http://www.example.com, or use a label as "Address" or
> "Location" instead of "URL".


"URL" is much more accurate than "Address" or "Location" (which might
refer to postal addresses or geographic locations, for example). "Web
address" might do. Or "Web site address", if that's what one is asking for.

> Validate the "location" field with a regexp on the client and on the
> server.


That's non-trivial. Would you write one that accepts foo://example.com
and reject http://www.sää.fi for example?
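
One way to sidestep a hand-written pattern, in engines that provide the
standard URL constructor (an assumption; nobody in this thread uses
it), is to let the built-in parser decide and then restrict the scheme.
The function name isHttpUrl is made up for this sketch.

```javascript
// Parse with the built-in URL constructor, then allow only http/https.
// isHttpUrl is a hypothetical name; the constructor throws on input it
// cannot parse as an absolute URL.
function isHttpUrl(value) {
  let parsed;
  try {
    parsed = new URL(String(value).trim());
  } catch (err) {
    return false; // not parseable as an absolute URL
  }
  return parsed.protocol === "http:" || parsed.protocol === "https:";
}
```

This rejects foo://example.com on the scheme check and accepts
http://www.sää.fi. The parser is lenient in other respects, so it is a
likelihood check rather than strict validation.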

If the intent is to check that the URL actually works, then it would be
simplest to do just that, instead of a separate syntax check. Checking
that it works is of course nontrivial, especially since it may involve
dealing with redirections and temporary network and server problems.

> You might consider using HTML5 pattern attribute where
> supported.

[...]
>> I've thought about testing to see if it contains at least 1 "." (since
>> all website addresses would, I think), but that's pretty vague; a less-
>> savvy person might enter their email address, and it would go through.
>> I guess that I could also check for an "@", but I can't help but
>> wonder if there's a smarter / smoother option?

>
> HTML5 INPUT type="email", feature tested, and with a fallback on the
> client where the test fails, and a fallback on the server (server side
> handling) where JS is disabled).


Pardon? This is the place for using <input type=url>, isn't it? It's
good to use it even though most browsers will treat it as
<input type=text>, so that any client-side checks will be performed
only if coded in JavaScript and when JavaScript is enabled. (To be
honest, there is a risk in using <input type=url>, or <input type=email>
for that matter, even though the latter is useful when you specifically
expect an email address. The risk is that when browsers start
supporting them more widely, they will first do it wildly. It's easy
even for people who write browsers to produce code that checks URLs and
email addresses so that correct data is rejected and incorrect data
passes through.)
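
Whether a given browser treats the field as a URL input or as plain
text can be feature-tested. The sketch below uses the common technique
of setting the type and reading it back; the function name
supportsInputType and the doc parameter are assumptions made so the
check can run against any document-like object.

```javascript
// Feature test: a browser that does not recognise an input type
// reports "text" when the type property is read back.
// supportsInputType is a hypothetical helper; pass the page's
// `document` as `doc` when running in a browser.
function supportsInputType(doc, typeName) {
  const input = doc.createElement("input");
  input.setAttribute("type", typeName);
  return input.type === typeName;
}
```

In a page this would be called as supportsInputType(document, "url"),
with script-based checks used as the fallback when it returns false.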

--
Yucca, http://www.cs.tut.fi/~jkorpela/
 
P E Schoen
      09-14-2011
"jwcarlton" wrote in message
news:(E-Mail Removed)...

> This is a tricky one for me. I'm validating a form, and want to
> check if a field entered is a legitimate website address. I don't
> necessarily need to ensure that the site works (I can do that later),
> but I do want to see if what's entered is a likely URL.


> I'm currently just checking to see if it begins with "http", but
> that's not so great; a less-savvy person might enter
> "www.example.com", or even "example.com", and get an error that
> it's not a legitimate link.


> I've thought about testing to see if it contains at least 1 "." (since
> all website addresses would, I think), but that's pretty vague; a
> less-savvy person might enter their email address, and it would
> go through. I guess that I could also check for an "@", but I can't
> help but wonder if there's a smarter / smoother option?


I found this which may help, but it's in PHP:
http://www.tutorialcode.com/php/link...-valid-or-not/

And here is a simple regex from geekpedia:

function CheckValidUrl(strUrl)
{
    // The pattern is not anchored, so it matches a URL-like substring
    // anywhere in the input.
    var RegexUrl =
        /(ftp|http|https):\/\/(\w+:{0,1}\w*@)?(\S+)(:[0-9]+)?(\/|\/([\w#!:.?+=&%@!\-\/]))?/;
    return RegexUrl.test(strUrl);
}

// Sample use

alert(CheckValidUrl("http://www.geekpedia.com"));

I have not used either one, but it seems like a handy utility.
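
Because the geekpedia pattern has no ^ or $ anchors, test() succeeds
whenever a URL-like substring appears anywhere in the input. A possible
anchored variant is sketched below; it is an adaptation written for
this illustration (the name checkWholeUrl is made up), not the
geekpedia code, and it is still only a rough syntax check.

```javascript
// Anchored variant: the whole input must look like
// scheme://[user[:password]@]host[:port][/path...]
// checkWholeUrl is a hypothetical name for this sketch.
function checkWholeUrl(strUrl) {
  var pattern =
    /^(?:ftp|https?):\/\/(?:\w+:?\w*@)?[^\s\/:@]+(?::[0-9]+)?(?:\/\S*)?$/;
  return pattern.test(String(strUrl));
}

// The unanchored original, copied here for comparison.
function checkAnywhere(strUrl) {
  var pattern =
    /(ftp|http|https):\/\/(\w+:{0,1}\w*@)?(\S+)(:[0-9]+)?(\/|\/([\w#!:.?+=&%@!\-\/]))?/;
  return pattern.test(String(strUrl));
}
```

For the input "see http://www.geekpedia.com", checkAnywhere returns
true while checkWholeUrl returns false; both accept
"http://www.geekpedia.com" on its own.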

Paul

 
P E Schoen
      09-14-2011
I also found this with separate functions for email and website URLs:

http://www.weberdev.com/get_example.php3?ExampleID=4569

Paul

 
dhtml
      09-14-2011
On Sep 14, 1:16 am, "Jukka K. Korpela" <(E-Mail Removed)> wrote:
> 14/09/2011 09:57, dhtml wrote:
> > If you want to require the protocol to be explicit, the UI should
> > indicate that in some way.

>
> The protocol part is required in absolute URLs. But of course one might
> consider prepending http:// if there is no protocol part.
>
> > For example, use placeholder text that
> > reads http://www.example.com, or use a label as "Address" or
> > "Location" instead of "URL".

>
> "URL" is much more accurate than "Address" or "Location" (which might
> refer to postal addresses or geographic locations, for example). "Web
> address" might do. Or "Web site address", if that's what one is asking for.
>
> > Validate the "location" field with a regexp on the client and on the
> > server.

>
> That's non-trivial. Would you write one that accepts foo://example.com
> and reject http://www.sää.fi for example?
>

Point being that it is insufficient to validate only on the client.

> If the intent is to check that the URL actually works, then it would be
> simplest to do just that, instead of a separate syntax check. Checking
> that it works is of course nontrivial, especially since it may involve
> dealing with redirections and temporary network and server problems.
>


Right. From the client, you're dealing with connectivity problems
(WiFi, 3g, AT&T DSL, etc). From the server, you have to deal with
other servers that may be slow or down. Do you really want the client
to wait while the program is trying to connect to, say,
"jibbering.com"?

> > You might consider using HTML5 pattern attribute where
> > supported.

> [...]
> >> I've thought about testing to see if it contains at least 1 "." (since
> >> all website addresses would, I think), but that's pretty vague; a less-
> >> savvy person might enter their email address, and it would go through.
> >> I guess that I could also check for an "@", but I can't help but
> >> wonder if there's a smarter / smoother option?

>
> > HTML5 INPUT type="email", feature tested, and with a fallback on the
> > client where the test fails, and a fallback on the server (server side
> > handling) where JS is disabled).

>
> Pardon? This is place for using <input type=url>, isn't it?


Right, I misread, thanks for pointing it out. (I thought he'd also
wanted to validate emails.)

http://www.whatwg.org/specs/web-apps...html#url-state

"User agents may allow the user to set the value to a string that is
not a valid absolute URL, but may also or instead automatically escape
characters entered by the user so that the value is always a valid
absolute URL"

http://diveintohtml5.org/examples/input-type-url.html

Passes as a valid URL there: L.A://%-@-%\\

> It's good to
> use it even though most browsers will treat it as <input type=text>, so
> that any client-side checks will be performed only if coded in
> JavaScript and when JavaScript is enabled. (To be honest, there is a
> risk in using <input type=url>, or <input type=email> for that matter -
> it is useful when you specifically expect email address. The risk is
> that when browsers start supporting them more widely, they will first do
> it wildly.


We've seen that already with input type="date".
--
Garrett
 
Mike Duffy
      09-14-2011
jwcarlton <(E-Mail Removed)> wrote in
news:d69e1a77-5741-442e-b783-(E-Mail Removed):

> This is a tricky one for me. I'm validating a form, and want to check
> if a field entered is a legitimate website address. I don't
> necessarily need to ensure that the site works (I can do that later),


If you are going to do that later anyway, why even bother to try to parse
it first? Are you not just wasting effort?

Let the DNS server do the work.

--
http://pages.videotron.ca/duffym/index.htm#
 
Denis McMahon
      09-15-2011
On Tue, 13 Sep 2011 23:12:08 -0700, jwcarlton wrote:

> This is a tricky one for me. I'm validating a form, and want to check if
> a field entered is a legitimate website address. I don't necessarily
> need to ensure that the site works (I can do that later), but I do want
> to see if what's entered is a likely URL.


Why are you doing the validation client side?

Is entering a website mandatory?

If it's not mandatory, why validate client side? Validate it server side
(you need to validate everything server side anyway) and just discard if
it's not valid.

If you must have a website entered, why? Consider the personal data
implications. If you don't really need it, see above.

If you really really must have a website entered, then the best you can
do client side is check to see if it looks genuine, and that really means
just looking for a valid host name. This code might get a lot of false
positives, but I don't think it will give any false negatives:

<script type="text/javascript">
function isValidWebsiteUri(str) { return true; }
</script>

If you insist on doing more than that, consider the following:

numeric IPs are valid
%-encoded characters are valid
RFC 3986
RFC 2616 (and others it mentions)

I'm not going to try and write javascript code to validate a url, simply
because no matter how complex and all encompassing my code is, someone
will suggest (a) a valid http url that it rejects and (b) an invalid url
that it accepts.

You might be better off doing an ajax exchange with your server and
calling a dns query on the supplied uri following the field's blur event.
Obviously allow for the field being changed from containing an invalid uri
to empty if it's a non mandatory field.
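
The server-side half of that exchange might look roughly like the
sketch below. Everything here is an assumption for illustration: the
names hostnameOf and hostResolves are made up, and the resolver is
passed in as `lookup` (in Node.js, dns.promises.lookup from the
built-in dns module would fit) so the sketch does not tie itself to one
server environment.

```javascript
// Pull the host name out of whatever the user typed; null if there is none.
// hostnameOf and hostResolves are hypothetical helpers.
function hostnameOf(value) {
  const text = String(value).trim();
  const hasScheme = /^[a-z][a-z0-9+.\-]*:\/\//i.test(text);
  try {
    return new URL(hasScheme ? text : "http://" + text).hostname || null;
  } catch (err) {
    return null; // could not be parsed at all
  }
}

// `lookup` is an injected resolver that takes a host name and returns a
// promise, rejecting when the name does not resolve.
function hostResolves(value, lookup) {
  const host = hostnameOf(value);
  if (host === null) return Promise.resolve(false);
  return lookup(host).then(
    function () { return true; },
    function () { return false; }
  );
}
```

A page script would send the field's value to the server on blur, and
the server would answer with the result of hostResolves. An empty
field should be skipped before any lookup when the field is optional.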

Rgds

Denis McMahon
 
Jukka K. Korpela
      09-15-2011
15.9.2011 20:17, Denis McMahon wrote:

> On Tue, 13 Sep 2011 23:12:08 -0700, jwcarlton wrote:
>
>> This is a tricky one for me. I'm validating a form, and want to check if
>> a field entered is a legitimate website address. I don't necessarily
>> need to ensure that the site works (I can do that later), but I do want
>> to see if what's entered is a likely URL.

>
> Why are you doing the validation client side?


I think the idea of client-side validation is good, as it often helps
the user (and thus indirectly the site owner). The problem is that
validating a URL client-side is complicated, perhaps so complicated that
it is better to do server-side validation only.

> If it's not mandatory, why validate client side?


For the same reason as for mandatory addresses. For example, if the user
mistakenly types htttp://www.example.com or http://www.example,com, we
would like to tell about the problem immediately so that he can see the
problem and fix it right away.
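
A small sketch of such immediate feedback, covering just the two typos
mentioned above. The function name describeUrlProblem and the message
texts are invented for illustration, and a null result only means that
neither of these particular problems was found.

```javascript
// Return a short hint for two common typos, or null if neither is found.
// describeUrlProblem is a hypothetical helper.
function describeUrlProblem(value) {
  const text = String(value).trim();
  // Capture the scheme and the authority part (up to the next "/", "?" or "#").
  const match = /^([a-z][a-z0-9+.\-]*):\/\/([^\/?#]*)/i.exec(text);
  if (!match) return "The address should start with http:// or https://";
  const scheme = match[1].toLowerCase();
  const host = match[2];
  if (scheme !== "http" && scheme !== "https") {
    return 'Unexpected scheme "' + scheme + '"; did you mean http?';
  }
  if (host.indexOf(",") !== -1) {
    return "The host name contains a comma; did you mean a dot?";
  }
  return null;
}
```

Both htttp://www.example.com and http://www.example,com produce a hint
here, while http://www.example.com returns null.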

> If you really really must have a website entered, then the best you can
> do client side is check to see if it looks genuine, and that really means
> just looking for a valid host name.


It's not "just" looking for a valid host name (a vague concept). And a
web site address may well have a path part (as mine does).

--
Yucca, http://www.cs.tut.fi/~jkorpela/
 