Velocity Reviews - Computer Hardware Reviews


Javascript and special characters

 
 
Doc
Guest
Posts: n/a
 
      03-27-2006
Hello!

I'm experiencing a little problem counting the number of characters in
a textarea on an HTML page.

This is the content type of my HTML document
content="text/html; charset=iso-8859-1"

I have a textarea that I want to limit to 400 characters, but the
end user can enter special characters (like é, √, ∞, ...). I have
to limit it because I have a limit in my database and I don't want my
webapp to hang...

I tried counting the characters with a JavaScript function, but it
doesn't work with these special characters, as '√' is stored in the DB
as '&#8730;'. My character count is not right (well, for the DB...) and I
can get an error when saving.

Can somebody help me? Can I do it some other way (always checking the
input on the JSP (HTML) page)?

Thanks

 
 
 
 
 
AndrewTK
Guest
Posts: n/a
 
      03-27-2006
If I understand, your problem is:
-user enters "mes vacances cet été" for example
-"é" will become &#(something);
-you want to count as probably "&#xxx;" length = 6 (if the DB
generates a 3 digit number)

Solution (amongst many):

on submit, pass the *textarea* to this function:

function esc(the_textarea) {the_textarea.value = escape(the_textarea.value);}

this will convert the message to HTTP-url-encoded text and be stored
as-is on the server. count that text. if you are using continuous
counting on the page via a JS function count(text), use
count(escape(text) ) - this will count the text in its escaped form,
without displaying this to the textarea before the user sends

when displaying the text in a page, call unescape(text) on your text to
convert it back to the original text

if the text is returned by the DB as a page, you'll need to include a
script in that page to find the text and convert it back. Otherwise, PHP
can decode URL-encoded text with, I think, a function conveniently
named... urldecode()
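
A minimal sketch of the counting idea above. The helper name `escapedLength` is made up for illustration, and `escape()`/`unescape()` are legacy functions that JavaScript engines keep only for compatibility. Note that `escape()` emits `%uXXXX` for characters outside Latin-1, which ordinary URL decoders may not understand.

```javascript
// Hypothetical helper: length of the text in its escape()d form.
function escapedLength(text) {
  return escape(text).length;
}

const sample = "cet \u00E9t\u00E9";   // "cet été"
const stored = escape(sample);        // "cet%20%E9t%E9"
const restored = unescape(stored);    // back to the original text
const sqrtForm = escape("\u221A");    // "%u221A" (non-Latin-1 character)
```

Here the seven visible characters count as 13 once escaped, which is the number a live counter built on this approach would show.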

 
 
 
 
 
RobG
Guest
Posts: n/a
 
      03-27-2006
AndrewTK wrote:
> If I understand, your problem is:
> -user enters "mes vacances cet été" for example
> -"é" will become &#(something);
> -you want to count as probably "&#xxx;" length = 6 (if the DB
> generates a 3 digit number)
>
> Solution (amongst many):
>
> on submit, pass the *textarea* to this function:
>
> function esc(the_textarea) {the_textarea.value = escape(the_textarea.value);}


If that is what is required (and I'm not sure it is), use
encodeURIComponent() to count characters as that pretty much emulates what
will be done to the textarea value when the form is submitted.
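
A rough sketch of that suggestion, with `encodedLength` as a hypothetical helper name. One caveat to keep in mind: `encodeURIComponent()` always percent-encodes the UTF-8 bytes, whereas a form on an ISO-8859-1 page is submitted in the page's own encoding, so the two counts are only approximately comparable.

```javascript
// Hypothetical helper: length of the text after percent-encoding as UTF-8.
function encodedLength(text) {
  return encodeURIComponent(text).length;
}

const eAcute = encodeURIComponent("\u00E9");  // "%C3%A9"     (é, 2 UTF-8 bytes)
const sqrt = encodeURIComponent("\u221A");    // "%E2%88%9A"  (√, 3 UTF-8 bytes)
```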

But ultimately what is stored in the DB is up to the server, not the client.

[...]


--
Rob
 
 
Thomas 'PointedEars' Lahn
Guest
Posts: n/a
 
      03-27-2006
Doc wrote:

> I'm experiencing a little problem counting the number of characters in
> a textarea on a html page.
>
> This is the content type of my HTML document
> content="text/html; charset=iso-8859-1"


Most certainly it is not. The content type (here better: encoding,
referring only to the `charset' label) is specified by the HTTP header
Content-Type which takes precedence over any declaration with the meta
element.

> I tried counting the characters with a javascript function, but it
> doesn't work with these special characters as '√' is stored in the
> DB as '&#8730;'


Information should be stored independently of the output medium (I
suggest storing "sqrt(...)" instead), and you do not have to use
the character reference if you declare the correct encoding.


PointedEars
 
 
Doc
Guest
Posts: n/a
 
      03-28-2006
Thanks Andrew, Rob and PointedEars!

But the escape() or encodeURIComponent() functions convert my
characters to a '%..' format, and I can't use that as the data is
accessed by other means (extracted to XML format, ...), and I can't do
the unescape() or decodeURIComponent().

The 'é' is stored as 'é', it is only some special characters that are
stored in a '&#nnnn;' format, and the aim is to be able to store all
these special characters (such as square root or infinity or ...). The
characters stored in this format don't need any conversion when
rendering the pages or creating the XML document. I'd just like to find
a function to convert the 'special' characters to this format so I can
count them as they are to be stored in the DB.
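
One possible shape for such a counting function, as a sketch only. It assumes that every character above U+00FF ends up in the DB as a decimal reference of the form `&#nnnn;` and that everything else is stored as a single character; the actual rule depends on the browser and server involved (for example, some characters that Windows-1252 can represent may be stored directly). `storedLength` is a hypothetical name.

```javascript
// Hypothetical helper: length of the text as it would be stored, assuming
// characters above U+00FF become decimal references such as &#8730;.
function storedLength(text) {
  let total = 0;
  for (const ch of text) {              // iterates by code point
    const codePoint = ch.codePointAt(0);
    if (codePoint <= 0xFF) {
      total += 1;                       // stored as-is (assumption)
    } else {
      total += ("&#" + codePoint + ";").length;
    }
  }
  return total;
}
```

With this rule, `storedLength("a\u221Ab")` counts "a√b" as 9 rather than 3, so the 400-character check can be run against the stored form while the textarea keeps showing the real characters.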

Is my encoding wrong? I don't understand the difference between the
HTTP header and the content type, I thought it was the same thing...
Well I thought <meta http-equiv="Content-Type" content="text/html;
charset=iso-8859-1"> was the HTTP header and defined the encoding...
Can you explain what you meant PointedEars please?

Thanks for your help!

 
 
Thomas 'PointedEars' Lahn
Guest
Posts: n/a
 
      03-29-2006
Doc wrote:

> But the escape() or encodeURIComponent() functions converts my
> characters to a '%..' format, and I can't use that as the data is
> accessed by other means (extracted to XML format, ...), and I can't
> do the unescape() or decodeURIComponent().


Then don't. Where is the problem?

> The 'é' is stored as 'é', it is only some special characters that are
> stored in a '&#nnnn;' format, and the aim is to be able to store all
> these special characters (such as square root or infinite or ...).


Use a Unicode Transformation Format instead of US-ASCII, ISO-8859-x, or
Windows-125x.
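
If the data were stored as UTF-8, and if the database limit were expressed in bytes rather than characters (both assumptions, not stated in the thread), the length to check could be sketched as follows. `utf8ByteLength` is a hypothetical name; `TextEncoder` is available in current browsers and Node.js.

```javascript
// Hypothetical helper: number of bytes the text occupies when encoded as UTF-8.
function utf8ByteLength(text) {
  return new TextEncoder().encode(text).length;
}

const asciiBytes = utf8ByteLength("abc");     // 3
const eAcuteBytes = utf8ByteLength("\u00E9"); // 2 (é)
const sqrtBytes = utf8ByteLength("\u221A");   // 3 (√)
```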

> The characters stored in this format don't need any conversion when
> rendering the pages or creating the XML document.


As I said, dependencies on the output should be avoided when storing
information in a database. For example, do not store "&#8730;"; store
"√" instead.

> I'd just like to find a function to convert the 'special' characters to
> this format so I can count them as they are to be stored in the DB.


This should be done server-side, not client-side. Are you using server-side
J(ava)Script?

> Is my encoding wrong?


Maybe. Note that the encoding declared specifies primarily how the content
is encoded, not what characters are allowed to be displayed. The HTML
Document Character Set used for character references (like &#8730;) is the
Universal Character Set (ISO/IEC 10646), which is character-by-character
equivalent to Unicode 3.0.
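
To illustrate how a decimal character reference maps onto a code point of that character set, here is a small sketch that turns references back into real characters. `decodeNumericRefs` is a hypothetical name, and the function handles only the decimal `&#nnnn;` form discussed in this thread, not hexadecimal or named references.

```javascript
// Hypothetical helper: replace decimal references like &#8730; with the
// character they denote; out-of-range numbers are left untouched.
function decodeNumericRefs(text) {
  return text.replace(/&#(\d+);/g, function (match, digits) {
    const codePoint = Number(digits);
    return codePoint <= 0x10FFFF ? String.fromCodePoint(codePoint) : match;
  });
}

const decoded = decodeNumericRefs("&#8730;2");  // "√2"
```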

> I don't understand the difference between the HTTP header and the content
> type, I thought it was the same thing... Well I thought <meta
> http-equiv="Content-Type" content="text/html; charset=iso-8859-1"> was the
> HTTP header and defined the encoding...


No, it is not. And it does not, unless the `charset' label is missing
from the Content-Type HTTP header (which is recommended against anyway).
Interpretation of meta[http-equiv] elements is not mandatory as per
HTML 4.01, so this is not an interoperable approach.
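
To make the distinction concrete, here is a sketch of where the real header is set: on the server, in the HTTP response, before any markup is sent. It assumes a Node.js-style request handler purely for illustration (the original poster's server is JSP, so this only shows the idea); `handler` is a hypothetical name.

```javascript
// Hypothetical Node.js-style handler: the encoding is declared in the HTTP
// response header itself, not in a meta element inside the document.
function handler(request, response) {
  response.setHeader("Content-Type", "text/html; charset=utf-8");
  response.end("<!DOCTYPE html><title>Demo</title><p>\u221A2</p>");
}

// To serve it: require("http").createServer(handler).listen(8080);
```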

> Can you explain what you meant PointedEars please?


<URL:http://en.wikipedia.org/wiki/HTTP>

Please take heed of

<URL:http://jibbering.com/faq/faq_notes/pots1.html>
<URL:http://www.safalra.com/special/googlegroupsreply/>

with your next posting.


PointedEars
 