On Sun, 11 Sep 2011 14:33:05 -0700 (PDT), bob
wrote, quoted or indirectly quoted someone who said :
>Anyone know why ASCII char 26 is used in place of a hyphen in UTF-8?
>html = html.replaceAll("\u201C", "\"");
\u0026 is replaced by an ampersand at compile time, as if you had
typed one into the source code.
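Here is a minimal sketch of that substitution. The class and method names are made up for illustration; the only point is that the escape and a typed ampersand produce the same string.

```java
public class UnicodeEscapeDemo {
    // The escape in this literal is translated to an ampersand
    // before the compiler tokenizes the line.
    static String escapedAmpersand() {
        return "\u0026";
    }

    public static void main(String[] args) {
        String typed = "&";
        System.out.println(escapedAmpersand().equals(typed));       // true
        System.out.println((int) escapedAmpersand().charAt(0));     // 38, i.e. hex 26
    }
}
```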
I presume you are talking about
26 0x1a ^Z SUB, substitute
\u001a is not useful. It gets replaced by a ^Z character, as if you
had typed it into the source text, possibly creating a syntax error.
If you want this char, you probably want (char)0x001a.
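A sketch of the cast approach, assuming you really do want to turn each SUB into a plain hyphen (that is a guess about your data, not something I can see from here). SubCharDemo and subToHyphen are hypothetical names.

```java
public class SubCharDemo {
    // ASCII 26 (0x1a, SUB), built with a cast rather than a Unicode escape.
    static final char SUB = (char) 0x001a;

    // Replaces every SUB character with an ordinary hyphen-minus.
    static String subToHyphen(String text) {
        return text.replace(SUB, '-');
    }

    public static void main(String[] args) {
        String raw = "well" + SUB + "known";
        System.out.println(subToHyphen(raw)); // well-known
    }
}
```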
Char 26 is SUB in ASCII, Unicode and UTF-8 alike. If you see a -, it
might just be some font's attempt to render a SUB char.
You can use &#x241a; in HTML or \u241a in Java to render a tiny SUB
glyph to represent the char.
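For debugging, something like the following makes stray SUB chars visible. It is only a sketch; VisibleSubDemo, showSub and showSubHtml are names I invented for the example.

```java
public class VisibleSubDemo {
    static final char SUB = (char) 0x1a;
    // U+241A, SYMBOL FOR SUBSTITUTE: a printable stand-in for the control char.
    static final char SUB_SYMBOL = '\u241a';

    // For console or GUI display: swap each SUB for the visible symbol.
    static String showSub(String text) {
        return text.replace(SUB, SUB_SYMBOL);
    }

    // For HTML output: swap each SUB for the numeric character reference.
    static String showSubHtml(String text) {
        return text.replace(String.valueOf(SUB), "&#x241a;");
    }

    public static void main(String[] args) {
        String raw = "a" + SUB + "b";
        System.out.println(showSubHtml(raw)); // a&#x241a;b
    }
}
```

Whether the symbol actually shows up depends on the font having a glyph for U+241A.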
see
http://mindprod.com/jgloss/ascii.html
http://mindprod.com/jgloss/unicode.html
http://mindprod.com/jgloss/utf.html
http://mindprod.com/jgloss/literal.html
--
Roedy Green Canadian Mind Products
http://mindprod.com