han yu pin yin's tone marks
Chinese is a tonal language. There are four tones, which are represented by four diacritics in the language's alphabetic writing system, pinyin.

These diacritics are: macron (1st tone), acute (2nd tone), caron (3rd tone), grave (4th tone).

There are two types of code for presenting these tone marks on a web page. In the first type, a single code represents the combination of a letter with the diacritic on top of it. In the second type, the letter and the diacritic are separate: the code for the diacritic follows the letter, and when the page is displayed, the diacritic automatically appears on top of the letter.

I am interested in the latter and am looking for the codes for the four diacritics.

Thanks for your help.

fulio pen
Re: han yu pin yin's tone marks
2012-11-01 17:49, fulio pen wrote:
> These diacritics are:
>
> macron(1st tone), acute(2nd tone), caron(3rd tone), grave(fourth tone).
>
> There are two types of code to present these diacritic tone marks on the web page.
> The first type is a piece of code represents the combination of a letter with the diacritic on its top.
> The second type separates the letter and diacritic.

Right. The first type is generally to be preferred in practice, for reasons explained at
http://www.cs.tut.fi/~jkorpela/html/...s.html#precomp

> When displayed on web page, the diacritic would automatically appear on the top of the letter.

It's somewhat complicated, and browsers used to fail in doing that properly.

> I am interested in the latter, and looking for the code for the four diacritics.

They are
U+0304 COMBINING MACRON
U+0301 COMBINING ACUTE ACCENT
U+030C COMBINING CARON
U+0300 COMBINING GRAVE ACCENT

To find out the codes for precomposed characters, like U+0101 LATIN SMALL LETTER A WITH MACRON, you can use e.g. the BabelPad editor,
http://www.babelstone.co.uk/software/babelpad.html
or Alan Wood's resources
http://www.alanwood.net/unicode/#links

-- 
Yucca, http://www.cs.tut.fi/~jkorpela/
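The two encodings discussed in this post can be illustrated with a minimal sketch. Python is used here only for illustration (nobody in the thread uses it), and the names TONE_MARKS, with_tone and html_refs are hypothetical helpers, not part of any library. The sketch builds a letter plus combining mark, optionally composes it to a precomposed character via NFC normalization, and spells the result as HTML numeric character references.

```python
import unicodedata

# Combining diacritics for the four pinyin tones, as listed in the post.
TONE_MARKS = {
    1: "\u0304",  # COMBINING MACRON
    2: "\u0301",  # COMBINING ACUTE ACCENT
    3: "\u030C",  # COMBINING CARON
    4: "\u0300",  # COMBINING GRAVE ACCENT
}


def with_tone(letter, tone, precomposed=False):
    """Return the letter followed by the combining mark for the tone.

    Hypothetical helper. With precomposed=True the pair is normalized
    to NFC, which yields a single precomposed character whenever
    Unicode defines one for that combination.
    """
    combined = letter + TONE_MARKS[tone]
    if precomposed:
        return unicodedata.normalize("NFC", combined)
    return combined


def html_refs(text):
    """Spell every character as an HTML numeric character reference."""
    return "".join(f"&#x{ord(ch):04X};" for ch in text)


if __name__ == "__main__":
    for tone in TONE_MARKS:
        separate = with_tone("a", tone)
        single = with_tone("a", tone, precomposed=True)
        print(tone, html_refs(separate), html_refs(single))
```

In page source, the second type would normally keep the base letter as a plain character and write only the mark as a reference, for example `a&#x0304;` for the first tone.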
Re: han yu pin yin's tone marks
On Thursday, November 1, 2012 1:10:26 PM UTC-4, Jukka K. Korpela wrote:
> They are
> U+0304 COMBINING MACRON
> U+0301 COMBINING ACUTE ACCENT
> U+030C COMBINING CARON
> U+0300 COMBINING GRAVE ACCENT

Hi, Jukka,

Thanks a lot for your help. The tone marks in the following page are from your posting:

http://www.pinyinology.com/toneMarks/tones/marks2b.html

It is strange that the symbols in the fourth tone are bigger than the others. The code was validated, and there were errors on the validator, but there were none in the Notepad file. If possible, please help me find out what is wrong in the code.

Thanks again for your expertise.

fulio pen
Re: han yu pin yin's tone marks
2012-11-02 3:25, fulio pen wrote:
> The tone marks in the following page are from your posting:
>
> http://www.pinyinology.com/toneMarks/tones/marks2b.html
>
> It is strange that the symbols in the fourth tone are bigger than others.
> The code was validated, and there were errors on the validator

The error that matters here is "Unclosed element div", which means that the <div> element containing the third tone is not closed properly; instead of </div>, there is <div>. This makes the <div> for the fourth tone part of the earlier <div>, which in turn means that the rule
div.marks {font-size:150%; }
has cumulative effect.

The validator's warning "Text run is not in Unicode Normalization Form C." basically says that the text data contains combinations of letters and diacritic marks that could and should be written as precomposed characters. As I wrote, that's generally good advice, but should not be seen as an absolute rule. The topic is discussed at
http://www.w3.org/International/ques...-normalization
which is descriptive, not normative. And it's biased and partly even erroneous: "The Unicode Standard allows either of these alternatives, but requires that both be treated as identical" is not true. (What the standard really says, loosely speaking, is that a precomposed character and its decomposition can normally be expected to look the same and be treated the same, and you should not expect applications to make a difference, but applications *may* make a difference. And in reality, there are differences. Besides, conformance to the Unicode standard does not require support for any particular set of characters. For example, a conforming application may be ignorant of combining marks - as long as it is not plain wrong about them.)

-- 
Yucca, http://www.cs.tut.fi/~jkorpela/
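The cumulative effect described in this post can be made concrete with a small sketch. It assumes Python and its standard html.parser module; the BROKEN and FIXED markup strings are hypothetical reconstructions of the situation described, not the actual source of the page, and the model looks only at nested div elements with class "marks" and the 150% rule quoted above.

```python
from html.parser import HTMLParser


class MarksScaleTracker(HTMLParser):
    """Record the font-size multiplier that applies to each text run.

    Simplified model: every open <div class="marks"> multiplies the
    inherited font size by 1.5 (font-size:150%); nothing else counts.
    """

    def __init__(self):
        super().__init__()
        self.open_divs = []  # one bool per open <div>: has class "marks"?
        self.scales = {}     # text run -> effective multiplier

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            classes = (dict(attrs).get("class") or "").split()
            self.open_divs.append("marks" in classes)

    def handle_endtag(self, tag):
        if tag == "div" and self.open_divs:
            self.open_divs.pop()

    def handle_data(self, data):
        text = data.strip()
        if text:
            # Percentage font sizes multiply down the nesting chain.
            self.scales[text] = 1.5 ** sum(self.open_divs)


def marks_scales(html):
    """Return {text run: multiplier} for an HTML fragment."""
    tracker = MarksScaleTracker()
    tracker.feed(html)
    tracker.close()
    return tracker.scales


# Hypothetical markup: <div> where </div> was intended after "third".
BROKEN = '<div class="marks">third<div><div class="marks">fourth</div>'
FIXED = '<div class="marks">third</div><div class="marks">fourth</div>'

if __name__ == "__main__":
    print(marks_scales(BROKEN))
    print(marks_scales(FIXED))
```

Under this model the fourth-tone text in the broken fragment sits inside two div.marks elements, so its size is 150% of 150%, while in the corrected fragment both runs get the same single 150%.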
Re: han yu pin yin's tone marks
On Thu, 1 Nov 2012, fulio pen wrote:
> Thanks for help.

Read (again?) the thread
http://groups.google.com/group/alt.h...56d280c59e71b7
which you started on 18 July 2012.
<news:0697db02-912f-4be5-850d-75b5c2b3e85c@googlegroups.com>

-- 
Outgoing mail is certified free from defamation of Islam™ and insult of the Prophet™. Checked by Thinkpol anti-obscenity system v. 6.66.
Re: han yu pin yin's tone marks
On Fri, 2 Nov 2012, Jukka K. Korpela wrote:
> The validator's warning "Text run is not in Unicode Normalization
> Form C." basically says that the text data contains combinations
> of letters and diacritic marks that could and should be written
> as precomposed characters.

This applies to Latin letters.
When you write the precomposed Devanagari letters ड़ ढ़ ,
you get the same warning and you are supposed to write
ड़ ढ़ instead.

I regard this as an illogical and unnecessary requirement of HTML5.
It is not the job of HTML5 to prescribe the way of writing characters.

-- 
Outgoing mail is certified free from defamation of Islam™ and insult of the Prophet™. Checked by Thinkpol anti-obscenity system v. 6.66.
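The asymmetry described in this post can be checked with a short sketch, assuming Python's standard unicodedata module; nfc_report is a hypothetical helper name. For the Latin pair, NFC composes letter plus mark into one character, whereas the precomposed Devanagari letters U+095C and U+095D are excluded from composition, so NFC turns them into base letter plus U+093C DEVANAGARI SIGN NUKTA.

```python
import unicodedata


def nfc_report(text):
    """Return (is the text already in NFC?, code points of its NFC form)."""
    nfc = unicodedata.normalize("NFC", text)
    return text == nfc, [f"U+{ord(ch):04X}" for ch in nfc]


if __name__ == "__main__":
    samples = {
        "Latin a + combining macron": "a\u0304",
        "Devanagari DDDHA, precomposed": "\u095C",
        "Devanagari RHA, precomposed": "\u095D",
        "Devanagari DDA + nukta": "\u0921\u093C",
    }
    for label, text in samples.items():
        print(label, nfc_report(text))
```

A validator that checks for NFC would therefore flag the decomposed form for Latin text and the precomposed form for these Devanagari letters.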
Re: han yu pin yin's tone marks
2012-11-02 20:34, Andreas Prilop wrote:
> On Fri, 2 Nov 2012, Jukka K. Korpela wrote:
>
>> The validator's warning "Text run is not in Unicode Normalization
>> Form C." basically says that the text data contains combinations
>> of letters and diacritic marks that could and should be written
>> as precomposed characters.
>
> This applies to Latin letters.
> When you write the precomposed Devanagari letters ड़ ढ़ ,
> you get the same warning and you are supposed to write
> ड़ ढ़ instead.

Right, Unicode "normalization" is partly rather abnormal.

> I regard this as an illogical and unnecessary requirement of HTML5.
> It is not the job of HTML5 to prescribe the way of writing characters.

Unfortunately, HTML5 seems to follow W3C traditions here, reflecting a simplistic view. This is a category error, so to say, dealing with character-level issues at a higher protocol level.

-- 
Yucca, http://www.cs.tut.fi/~jkorpela/