Velocity Reviews


fulio pen 11-01-2012 03:49 PM

han yu pin yin's tone marks
 
Chinese is a tonal language. It has four different tones, which are represented by four diacritics in pinyin, the language's alphabetic writing system.

These diacritics are:

macron (1st tone), acute (2nd tone), caron (3rd tone), grave (4th tone).

There are two types of code for presenting these diacritic tone marks on a web page. In the first type, a single code represents the combination of a letter with the diacritic on top of it.

The second type separates the letter and the diacritic: the code for the diacritic follows the letter. When displayed on a web page, the diacritic automatically appears on top of the letter.

I am interested in the latter and am looking for the codes for the four diacritics. Thanks for your help.

fulio pen


Jukka K. Korpela 11-01-2012 05:10 PM

Re: han yu pin yin's tone marks
 
2012-11-01 17:49, fulio pen wrote:

> These diacritics are:
>
> macron(1st tone), acute(2nd tone), caron(3rd tone), grave(fourth tone).
>
> There are two types of code to present these diacritic tone marks on the web page.
> The first type is a piece of code represents the combination of a
> letter with the diacritic on its top.
> The second type separates the letter and diacritic.


Right. The first type is generally to be preferred in practice, for
reasons explained at
http://www.cs.tut.fi/~jkorpela/html/...s.html#precomp

> When displayed on web page, the diacritic would automatically appear on the top of the letter.


It's somewhat complicated, and browsers used to fail in doing that properly.

> I am interested in the latter, and looking for the code for the four diacritics.


They are
U+0304 COMBINING MACRON
U+0301 COMBINING ACUTE ACCENT
U+030C COMBINING CARON
U+0300 COMBINING GRAVE ACCENT
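
As a minimal sketch of how these four code points could be used (Python is chosen here only for illustration; it does not come from the thread), the combining mark is appended after the base letter, either as a raw character or as an HTML numeric character reference such as `a&#x0304;`. The names `TONE_MARKS`, `with_tone`, and `as_html_reference` are hypothetical helpers.

```python
# Combining diacritics for the four pinyin tones (code points from the list above).
TONE_MARKS = {
    1: "\u0304",  # COMBINING MACRON
    2: "\u0301",  # COMBINING ACUTE ACCENT
    3: "\u030C",  # COMBINING CARON
    4: "\u0300",  # COMBINING GRAVE ACCENT
}


def with_tone(letter: str, tone: int) -> str:
    """Return the base letter followed by the combining mark for the tone."""
    return letter + TONE_MARKS[tone]


def as_html_reference(letter: str, tone: int) -> str:
    """Return the letter followed by a hexadecimal character reference for the mark."""
    mark = TONE_MARKS[tone]
    return f"{letter}&#x{ord(mark):04X};"


if __name__ == "__main__":
    for tone in (1, 2, 3, 4):
        print(tone, with_tone("a", tone), as_html_reference("a", tone))
```

How well the mark is positioned over the letter still depends on the browser and the font, as noted above.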

To find out the codes for precomposed characters, like U+0101 LATIN
SMALL LETTER A WITH MACRON, you can use e.g. the BabelPad editor,
http://www.babelstone.co.uk/software/babelpad.html
or Alan Wood's resources http://www.alanwood.net/unicode/#links
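
Besides those tools, a precomposed character can also be looked up programmatically. The following is only a sketch, assuming Python's standard `unicodedata` module; `precomposed` and `describe` are hypothetical helper names. It composes a letter and a combining mark with Normalization Form C and prints the resulting code point and name when a single precomposed character exists.

```python
import unicodedata


def precomposed(letter: str, mark: str) -> str:
    """Compose letter + combining mark via NFC (one character if a precomposed form exists)."""
    return unicodedata.normalize("NFC", letter + mark)


def describe(ch: str) -> str:
    """Format a single character as 'U+XXXX NAME'."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


if __name__ == "__main__":
    for mark in ("\u0304", "\u0301", "\u030C", "\u0300"):
        composed = precomposed("a", mark)
        if len(composed) == 1:  # a precomposed character exists
            print(describe(composed))
```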

--
Yucca, http://www.cs.tut.fi/~jkorpela/

fulio pen 11-02-2012 01:25 AM

Re: han yu pin yin's tone marks
 
On Thursday, November 1, 2012 1:10:26 PM UTC-4, Jukka K. Korpela wrote:
> 2012-11-01 17:49, fulio pen wrote:
>
> > These diacritics are:
> >
> > macron(1st tone), acute(2nd tone), caron(3rd tone), grave(fourth tone).
> >
> > There are two types of code to present these diacritic tone marks on the web page.
> > The first type is a piece of code represents the combination of a
> > letter with the diacritic on its top.
> > The second type separates the letter and diacritic.
>
> Right. The first type is generally to be preferred in practice, for
> reasons explained at
> http://www.cs.tut.fi/~jkorpela/html/...s.html#precomp
>
> > When displayed on web page, the diacritic would automatically appear on the top of the letter.
>
> It's somewhat complicated, and browsers used to fail in doing that properly.
>
> > I am interested in the latter, and looking for the code for the four diacritics.
>
> They are
> U+0304 COMBINING MACRON
> U+0301 COMBINING ACUTE ACCENT
> U+030C COMBINING CARON
> U+0300 COMBINING GRAVE ACCENT
>
> To find out the codes for precomposed characters, like U+0101 LATIN
> SMALL LETTER A WITH MACRON, you can use e.g. the BabelPad editor,
> http://www.babelstone.co.uk/software/babelpad.html
> or Alan Wood's resources http://www.alanwood.net/unicode/#links
>
> --
> Yucca, http://www.cs.tut.fi/~jkorpela/


Hi, Jukka,

Thanks a lot for your help. The tone marks in the following page are from your posting:

http://www.pinyinology.com/toneMarks/tones/marks2b.html

It is strange that the symbols in the fourth tone are bigger than the others. I ran the code through the validator, which reported errors, but I could not see any in the Notepad file. If possible, please help me find out what is wrong in the code.

Thanks again for your expertise.

fulio pen

Jukka K. Korpela 11-02-2012 08:12 AM

Re: han yu pin yin's tone marks
 
2012-11-02 3:25, fulio pen wrote:

> The tone marks in the following page are from your posting:
>
> http://www.pinyinology.com/toneMarks/tones/marks2b.html
>
> It is strange that the symbols in the fourth tone are bigger than others.
> The code was validated, and there were errors on the validater


The error that matters here is "Unclosed element div", which means that
the <div> element containing the third tone is not closed properly;
instead of </div>, there is <div>. This makes the <div> for the fourth
tone part of the earlier <div>, which in turn means that the rule
div.marks {font-size:150%; } has cumulative effect.
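
The cumulative effect can be illustrated with a small sketch. The markup below is hypothetical, reconstructed only from the description above (the actual page is not reproduced in this thread), and `MarksDepth`, `nesting`, and `effective_scale` are made-up names. The sketch uses Python's standard `html.parser` to count how many `div.marks` elements are open around each piece of text; every level multiplies the font size by 150% again.

```python
from html.parser import HTMLParser


class MarksDepth(HTMLParser):
    """Record how many open <div class="marks"> elements surround each text run."""

    def __init__(self) -> None:
        super().__init__()
        self.stack = []  # one bool per open <div>: does it have class "marks"?
        self.seen = []   # (text, number of enclosing div.marks) pairs

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            classes = (dict(attrs).get("class") or "").split()
            self.stack.append("marks" in classes)

    def handle_endtag(self, tag):
        if tag == "div" and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.seen.append((text, sum(self.stack)))


def nesting(markup: str):
    """Return (text, div.marks nesting depth) for every non-empty text run."""
    parser = MarksDepth()
    parser.feed(markup)
    parser.close()
    return parser.seen


def effective_scale(depth: int, factor: float = 1.5) -> float:
    """Font-size multiplier when each enclosing div.marks applies font-size:150%."""
    return factor ** depth


# Hypothetical markup: the third-tone div is ended with <div> instead of </div>.
BROKEN = '<div class="marks">tone 3<div>\n<div class="marks">tone 4</div>'
FIXED = '<div class="marks">tone 3</div>\n<div class="marks">tone 4</div>'

if __name__ == "__main__":
    for label, markup in (("broken", BROKEN), ("fixed", FIXED)):
        for text, depth in nesting(markup):
            print(label, text, f"{effective_scale(depth):.0%}")
```

In the broken variant the fourth-tone text sits inside two `div.marks` elements, so under these assumptions it is rendered at 150% of 150%, that is 225%.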

The validator's warning "Text run is not in Unicode Normalization Form
C." basically says that the text data contains combinations of letters
and diacritic marks that could and should be written as precomposed
characters. As I wrote, that's generally good advice, but should not be
seen as an absolute rule.

The topic is discussed at
http://www.w3.org/International/ques...-normalization
which is descriptive, not normative. And it's biased and partly even
erroneous: "The Unicode Standard allows either of these alternatives,
but requires that both be treated as identical" is not true. (What the
standard really says, loosely speaking, is that a precomposed character
and its decomposition can normally be expected to look the same and be
treated the same, and you should not expect applications to make a
difference, but applications *may* make a difference. And in reality,
there are differences. Besides, conformance to the Unicode Standard does not
require support for any particular set of characters. For example, a
conforming application may be ignorant of combining marks - as long as
it is not plain wrong about them.)
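
One place where the difference is easy to observe is plain string comparison. This is only a sketch using Python's standard `unicodedata` module; `is_nfc` and `canonically_equal` are hypothetical helper names, and the check is an approximation of what the validator's warning reports, not its actual implementation.

```python
import unicodedata

DECOMPOSED = "a\u0304"   # LATIN SMALL LETTER A + COMBINING MACRON
PRECOMPOSED = "\u0101"   # LATIN SMALL LETTER A WITH MACRON


def is_nfc(text: str) -> bool:
    """True if the text is already in Normalization Form C."""
    return unicodedata.normalize("NFC", text) == text


def canonically_equal(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


if __name__ == "__main__":
    print(DECOMPOSED == PRECOMPOSED)                   # plain comparison
    print(len(DECOMPOSED), len(PRECOMPOSED))           # code point counts
    print(is_nfc(DECOMPOSED), is_nfc(PRECOMPOSED))     # NFC status of each form
    print(canonically_equal(DECOMPOSED, PRECOMPOSED))  # comparison after NFC
```

A program that compares or searches text without normalizing first treats the two spellings as different strings, which is one practical sense in which applications make a difference.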

--
Yucca, http://www.cs.tut.fi/~jkorpela/

Andreas Prilop 11-02-2012 05:42 PM

Re: han yu pin yin's tone marks
 
On Thu, 1 Nov 2012, fulio pen wrote:

> Thanks for help.


Read (again?) the thread
http://groups.google.com/group/alt.h...56d280c59e71b7
which you started on 18 July 2012.
<news:0697db02-912f-4be5-850d-75b5c2b3e85c@googlegroups.com>

--
Outgoing mail is certified free from defamation of Islam™
and insult of the Prophet™.
Checked by Thinkpol anti-obscenity system v. 6.66.

Andreas Prilop 11-02-2012 06:34 PM

Re: han yu pin yin's tone marks
 
On Fri, 2 Nov 2012, Jukka K. Korpela wrote:

> The validator's warning "Text run is not in Unicode Normalization
> Form C." basically says that the text data contains combinations
> of letters and diacritic marks that could and should be written
> as precomposed characters.


This applies to Latin letters.
When you write the precomposed Devanagari letters ड़ ढ़ ,
you get the same warning and you are supposed to write
ड़ ढ़ instead.

I regard this as an illogical and unnecessary requirement of HTML5.
It is not the job of HTML5 to prescribe the way of writing characters.
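
The Devanagari case can be checked with a short sketch, assuming Python's standard `unicodedata` module (`nfc_code_points` is a hypothetical helper name). U+095C and U+095D are excluded from composition in Unicode, so Normalization Form C splits them into a base letter plus U+093C DEVANAGARI SIGN NUKTA instead of keeping the single character.

```python
import unicodedata

DDDHA = "\u095C"  # DEVANAGARI LETTER DDDHA (single code point)
RHA = "\u095D"    # DEVANAGARI LETTER RHA (single code point)


def nfc_code_points(text: str) -> list:
    """Return the code points of the NFC form of text as 'U+XXXX' strings."""
    return [f"U+{ord(c):04X}" for c in unicodedata.normalize("NFC", text)]


if __name__ == "__main__":
    for ch in (DDDHA, RHA):
        print(f"U+{ord(ch):04X}", "->", " ".join(nfc_code_points(ch)))
```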

--
Outgoing mail is certified free from defamation of Islam™
and insult of the Prophet™.
Checked by Thinkpol anti-obscenity system v. 6.66.

Jukka K. Korpela 11-02-2012 08:21 PM

Re: han yu pin yin's tone marks
 
2012-11-02 20:34, Andreas Prilop wrote:

> On Fri, 2 Nov 2012, Jukka K. Korpela wrote:
>
>> The validator's warning "Text run is not in Unicode Normalization
>> Form C." basically says that the text data contains combinations
>> of letters and diacritic marks that could and should be written
>> as precomposed characters.
>
> This applies to Latin letters.
> When you write the precomposed Devanagari letters ड़ ढ़ ,
> you get the same warning and you are supposed to write
> ड़ ढ़ instead.


Right, Unicode "normalization" is partly rather abnormal.

> I regard this as an illogical and unnecessary requirement of HTML5.
> It is not the job of HTML5 to prescribe the way of writing characters.


Unfortunately, HTML5 seems to follow W3C traditions here, reflecting a
simplistic view. This is a category error, so to say, dealing with
character-level issues at a higher protocol level.

--
Yucca, http://www.cs.tut.fi/~jkorpela/


All times are GMT.
