Special characters and validation

 
 
Jukka K. Korpela
      01-30-2009
Zach wrote:

> I answered the guy's question.


No, you didn't. You didn't even give a wrong answer, though your posting
would have been a wrong answer to virtually any question, if it had
addressed a question.

Thank you for following my advice of continuing the use of cluelessly
forged From field as long as you remain clueless!

--
Yucca, http://www.cs.tut.fi/~jkorpela/

 
Zach
      01-30-2009

"Ben C" <> wrote in message
news:...
> On 2009-01-30, Zach <> wrote:
>>
>> "Ben C" <> wrote in message
>> news:...
>>> On 2009-01-30, Zach <> wrote:
>>>>
>>>> "JD" <> wrote in message
>>>> news:...
>>>>
>>>><< snipped >>
>>>>
>>>>>> I answered the guy's question.
>>>>>
>>>>> How, by supplying an indiscriminate list of character entity
>>>>> references?
>>>>> That's like giving somebody the entire alphabet when they ask which
>>>>> letters are vowels.
>>>>
>>>> oooooooooooooooooooooooooooooooooooooooooooooooooo
>>>>
>>>> Oh. Oh. If a response isn't to your liking, then say so politely.
>>>>
>>>> oooooooooooooooooooooooooooooooooooooooooooooooooo
>>>>
>>>> You wrote: "Is there a definitive list somewhere of which characters
>>>> need to be encoded and which do not?"
>>>>
>>>> I would:
>>>> 1. transform the text into an array of characters
>>>> 2. see what the ASCII value is of each character
>>>
>>> It might not have an ASCII value (nor even an ISO-8859-1 value) which is
>>> the whole problem.
>>>
>>>> 3. see if the ASCII value < or > certain values
>>>
>>> If all the characters have ASCII values, then it is not necessary to
>>> check if they are outside any particular range-- the OP was using
>>> ISO-8859-1 of which ASCII is a subset.
>>>
>>>> 4. if so, see whether it is contained in the list I gave you
>>>> 5. if it is, substitute
>>>
>>> Then any character whose Unicode value is outside the range that
>>> ISO-8859-1 can encode needs to be substituted. There's no other list to
>>> check them against, unless you are thinking of using e.g. "&nbsp;"
>>> instead of "&#160;", which is more readable. In that case I suppose you
>>> get the list from http://www.w3.org/TR/REC-html40/sgml/entities.html.

>>
>>
>>
>> "the OP was using ISO-8859-1 "
>> Re: http://htmlhelp.com/reference/charset/
>> Sorry, I don't understand why character for character converting wouldn't
>> work.

>
> It would.
>
> ASCII and ISO-8859-1 are both encodings. ASCII is a subset of
> ISO-8859-1. The OP's destination encoding is ISO-8859-1 and his source
> encoding is presumably a superset of ISO-8859-1 (perhaps UTF-8).
>
> So we need to decode the source, character for character, and output it
> in the destination encoding, using &# thingies for any characters that
> aren't in ISO-8859-1.
>
> What we're not doing is decoding ASCII source and outputting it to some
> encoding that's a subset of ASCII (if there is such a thing). But that's
> what your method seemed to be describing.

oooooooooooooooooooooooooooooooooooooooooooooooooo ooo
Great, this defines what needs to be done then.
The guy needs two lists
(1.) an ISO-8859-1 list
(2.) a thingies list.

If the char isn't in (1.) then the char must be
converted, using (2.). No big deal then.

Zach.
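
The conversion Ben C describes (decode the source, then write ISO-8859-1
and use &# references for anything that does not fit) can be sketched in
Python. This is only an illustration under stated assumptions: the
function name to_latin1_html and the sample text are made up for the
example, and escaping "&" and "<" first is an extra step the thread does
not discuss. The "xmlcharrefreplace" error handler is part of Python's
standard codec machinery and emits decimal numeric references.

```python
import html


def to_latin1_html(text: str) -> bytes:
    """Encode text as ISO-8859-1 bytes for use in HTML content.

    Hypothetical helper, for illustration only.
    """
    # Assumption: markup-significant characters (&, <, >, quotes) should be
    # escaped before encoding, so they are handled here first.
    escaped = html.escape(text, quote=True)
    # Characters that ISO-8859-1 cannot represent are replaced with decimal
    # numeric character references; everything else is encoded directly.
    return escaped.encode("iso-8859-1", errors="xmlcharrefreplace")


if __name__ == "__main__":
    # U+00E9 is in ISO-8859-1; U+2207 and U+1401 are not.
    sample = "caf\u00e9 \u2207 \u1401 a < b"
    print(to_latin1_html(sample))
```

No lookup table is involved: the number in each reference is simply the
character's Unicode code point.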








 
Zach
      01-30-2009

"Jukka K. Korpela" <> wrote in message
news:OQHgl.126223$_ i.fi...
> Zach wrote:
>
>> I answered the guy's question.

>
> No, you didn't. You didn't even give a wrong answer, though your posting
> would have been a wrong answer to virtually any question, if it had
> addressed a question.
>
> Thank you for following my advice of continuing the use of cluelessly
> forged From field as long as you remain clueless!
>
> --
> Yucca, http://www.cs.tut.fi/~jkorpela/




Zach.


 
Zach
      01-30-2009
"Ben C" <> wrote in message
news:...
> 2 isn't a list (assuming you mean &# things)-- those are just numbers.
> But you might convert some characters to HTML entities like &nbsp; and
> so you might have a list of those.


Aren't these your thingies?
http://www.avenue-it.com/html/asciialphabet.html
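
Ben C's distinction between numeric references (just numbers) and named
entities (which do come from a list) can be illustrated with a short
Python sketch. The helper name char_reference is hypothetical; the
codepoint2name table is the standard library's mapping of code points to
HTML 4 entity names, used here as a stand-in for the W3C entity list.

```python
from html.entities import codepoint2name


def char_reference(ch: str) -> str:
    """Return an HTML reference for one character (hypothetical helper)."""
    code_point = ord(ch)
    # A named entity exists only for characters that appear in the list.
    name = codepoint2name.get(code_point)
    if name is not None:
        return f"&{name};"
    # Otherwise fall back to a numeric reference, which needs no list:
    # it is the code point written in decimal.
    return f"&#{code_point};"


if __name__ == "__main__":
    for ch in ("\u00a0", "\u2207", "\u1401"):
        print(f"U+{ord(ch):04X} -> {char_reference(ch)}")
```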



 
Neredbojias
      01-31-2009
On 30 Jan 2009, "Zach" <> wrote:

>
> "JD" <> wrote in message
> news:...
>
> << snipped >>
>
>>> I answered the guy's question.

>>
>> How, by supplying an indiscriminate list of character entity
>> references? That's like giving somebody the entire alphabet when
>> they ask which letters are vowels.

>
> oooooooooooooooooooooooooooooooooooooooooooooooooo
>
> Oh. Oh. If a response isn't to your liking, then say so politely.


Dear Sir,

Your list sucked the big one.

With warm regards,
JD

--
Neredbojias
http://www.neredbojias.org/
http://www.neredbojias.net/
The road to Heaven is paved with bad intentions.
 
Zach
      01-31-2009
"Neredbojias" <> wrote in message
news:. net...
> On 30 Jan 2009, "Zach" <> wrote:
>
>>
>> "JD" <> wrote in message
>> news:...
>>
>> << snipped >>
>>
>>>> I answered the guy's question.
>>>
>>> How, by supplying an indiscriminate list of character entity
>>> references? That's like giving somebody the entire alphabet when
>>> they ask which letters are vowels.

>>
>> oooooooooooooooooooooooooooooooooooooooooooooooooo
>>
>> Oh. Oh. If a response isn't to your liking, then say so politely.

>
> Dear Sir,
>
> Your list sucked the big one.
>
> With warm regards,
> JD
>
> --
> Neredbojias
> http://www.neredbojias.org/
> http://www.neredbojias.net/
> The road to Heaven is paved with bad intentions.
>> oooooooooooooooooooooooooooooooooooooooooooooooooo

Lol!

Zach.


 
 
Zach
      01-31-2009

"Ben C" <> wrote in message
news:...
> On 2009-01-30, Zach <> wrote:
>> "Ben C" <> wrote in message
>> news:...
>>> 2 isn't a list (assuming you mean &# things)-- those are just numbers.
>>> But you might convert some characters to HTML entities like &nbsp; and
>>> so you might have a list of those.

>>
>> Aren't these your thingies?
>> http://www.avenue-it.com/html/asciialphabet.html

>
> Sort of, but ignore the first 128 entries of the table-- obviously
> there's no need to replace 'i' with &#105; in any encoding anyone's
> likely to be using these days.
>
> In fact, if I have to replace 5 with &#53; it's not clear how the
> browser's going to understand the '5' in "&#53;".
>
> And I think it's likely to be a requirement of an HTML parser that it at
> least understand ASCII. Korpela would know but he has already stormed
> off in disgust.
>
> The second problem is that that table appears to list only the
> characters in Latin 1 (aka ISO-8859-1) although I haven't checked it
> thoroughly.
>
> Since the OP's destination encoding was ISO-8859-1, he wouldn't need to
> make substitutions for any of the characters in that table.
>
> But he might need to make some for characters outside it-- for example
> if his text contains U+1401 Canadian Syllabics E, or U+2207 Nabla, or
> any of the many other characters that aren't in Latin 1.

ooooooooooooooooooooooooooooooooooooooooo
Thank you. I have learned a few things.
Zach.
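
Ben C's point about the first 128 entries can be checked from the decoding
side. The sketch below is illustrative only: the examples dictionary is
made up for the demonstration, and html.unescape is the standard library
function for resolving character references. It shows that a numeric
reference to an ASCII character decodes to that same character, and that
every reference is itself written in plain ASCII, so a parser has to
understand ASCII before it can read any reference at all.

```python
import html

# Numeric references paired with the characters they should decode to.
examples = {
    "&#105;": "i",        # U+0069, an ASCII letter
    "&#53;": "5",         # U+0035, an ASCII digit
    "&#5121;": "\u1401",  # Canadian Syllabics E, outside Latin 1
    "&#8711;": "\u2207",  # Nabla, outside Latin 1
}

for reference, expected in examples.items():
    decoded = html.unescape(reference)
    print(f"{reference!r} -> U+{ord(decoded):04X}, matches: {decoded == expected}")
    # The reference text itself uses only ASCII characters.
    print("  reference is ASCII:", reference.isascii())
```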


 