Velocity Reviews - Computer Hardware Reviews

Unicode Greek in English HTML

 
 
OccasionalFlyer
Guest
Posts: n/a
 
      08-10-2010
I have a document (it's actually an .aspx page, but it's mostly
HTML). I am seeking to embed a little Greek in Palatino Linotype. I
have done this successfully once, just by pasting the words in the
correct font into the document and making sure its encoding is UTF-8.
However, I tried to do this again elsewhere in the document and it is
not working. Is there something simple I can do to resolve this?
Although I have control over all the .aspx pages, I did not code them
and I am averse to making major changes that I might not be able to
resolve if something goes wrong. (I took over maintenance of my
organization's web site as a volunteer from the previous volunteer, and
while he knows .aspx somewhat, I'm a Java developer and, while I have
done some reading on .aspx coding, I don't know a lot yet.) Thanks.

Ken
 
 
 
 
 
Jukka K. Korpela
Guest
Posts: n/a
 
      08-10-2010
OccasionalFlyer wrote:

> I have a document (it's actually an .aspx page, but it's mostly
> HTML). I am seeking to embed a little Greek in Palatino Linotype.


Greek looks a bit odd in Palatino Linotype (some letters look slanted etc.),
but that's perhaps just me.

> I have done this successfully once, just by pasting the words in the
> correct font into the document and making sure its encoding is UTF-8.


That's a possible approach, but there are many risks. In particular, cut and
paste may carry formatting information that should be lost or, conversely,
it may lose information that you would like to preserve. I would copy and
paste as plain text, then perhaps add a style sheet rule suggesting a font -
though normally one should use the same font for copy text in Latin letters
and any quotations using some other script. This overall font should of
course be one that covers all the characters you'll use.

> However, I tried to do this again elsewhere in the document and it is
> not working.


We need the URL. And I mean URL, not a snippet of code. It is quite possible
that the _server_ sends information about encoding, and this information
isn't in the HTML document itself and will override any meta tags you might
use in the document.
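
A minimal sketch of how one might check what the server announces, in Python. Nothing like this appears in the thread: the function names are hypothetical, and the URL passed to `announced_charset` is whatever page is being tested. The `charset` parameter of the HTTP `Content-Type` header is the value that takes precedence over a meta tag.

```python
from email.message import Message
from urllib.request import urlopen


def charset_from_content_type(header_value):
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() lowercases the value and returns None if absent.
    return msg.get_content_charset()


def announced_charset(url):
    """Fetch url and return the charset the server announces, if any."""
    with urlopen(url) as response:
        # response.headers is an email.message.Message subclass.
        return response.headers.get_content_charset()


print(charset_from_content_type("text/html; charset=ISO-8859-1"))  # iso-8859-1
print(charset_from_content_type("text/html"))                      # None
```

If `announced_charset` returns something other than `utf-8` for a page saved as UTF-8, the server setting rather than the HTML is the likely culprit.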

> Although I have control over all the .aspx pages, I did not code them
> and I am averse to doing major changes that I might not be able to
> resolve if something goes wrong.


If the server actually announces the encoding as, say, iso-8859-1, then you
have two options: change the server settings, or represent the Greek
characters using character references or entity references, which work
irrespectively of encoding. Surely this will make the file a little bigger,
as you would have e.g. instead of letter alpha the string &#x3b1; or the
string &alpha;, but this isn't a serious efficiency issue if you have just
some short strings. It makes the source less readable to people who know
Greek, of course.

There are many utilities that can convert e.g. Greek text to character
references or entity references, such as the free Unicode-capable text
editor BabelPad.
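
For short strings, the conversion can also be sketched in a few lines of Python. This is an illustration only, not what BabelPad does internally; the function names are hypothetical, and the named entities come from Python's `html.entities.codepoint2name` table (the HTML 4 entity set).

```python
from html.entities import codepoint2name


def to_numeric_refs(text):
    """Replace every non-ASCII character with a hexadecimal character reference."""
    return "".join(
        ch if ord(ch) < 128 else "&#x{:x};".format(ord(ch)) for ch in text
    )


def to_entity_refs(text):
    """Use a named HTML entity where one exists, else a numeric reference."""
    out = []
    for ch in text:
        code = ord(ch)
        if code < 128:
            out.append(ch)
        elif code in codepoint2name:
            out.append("&" + codepoint2name[code] + ";")
        else:
            out.append("&#x{:x};".format(code))
    return "".join(out)


alpha = "\u03b1"  # GREEK SMALL LETTER ALPHA
print(to_numeric_refs(alpha))  # &#x3b1;
print(to_entity_refs(alpha))   # &alpha;

# The result is pure ASCII, so it decodes identically under either encoding.
raw = to_numeric_refs("a" + alpha).encode("ascii")
print(raw.decode("utf-8") == raw.decode("iso-8859-1"))  # True
```

Python's built-in `text.encode("ascii", "xmlcharrefreplace")` gives the same kind of result with decimal references (`&#945;` for alpha).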

--
Yucca, http://www.cs.tut.fi/~jkorpela/

 
 
 
 
 
OccasionalFlyer
Guest
Posts: n/a
 
      08-11-2010
Thanks. Here's the URL:
http://www.ibr-bbr.org/IBRBulletin/I...yYearList.aspx
The piece that worked for me is near the bottom:

Key Words: MT, LXX, Final Doxology, collocation, horn, translation,
judgment, deliverance,
Diaspora, קֶרֶן, κέρας, רוּם, ὑψόω

The piece that did not work for me almost at the very bottom:
Key Words: hebdomadal system, stages of life, Paul, Timothy,
paidi,on , pai/j , meiravkion, neani,skoj , avnh,r , presbu,thj ,
ge,rwn


I will say right here that most of what is on this page I did not do.
I am responsible for the last few journal issues described on the page
(Vol 19), and even as I look at them now, I see a few errors I need to
correct. I don't know why everything is in italics because that's not
what I thought I did. I'm making no great claims to skill here but I
am trying, and not just trying to be stupid like, "What's Unicode?"
Thanks.

Ken
 
 
Andy Dingley
Guest
Posts: n/a
 
      08-11-2010
On 11 Aug, 01:54, OccasionalFlyer wrote:

> Thanks. Here's the URL: http://www.ibr-bbr.org/IBRBulletin/I...yYearList.aspx
> The piece that worked for me is near the bottom:


Looks like the page encoding is OK, but those few characters just
aren't Unicode. Smells more like an ASP problem than HTML - I think
your generation is breaking it, not the target of what you're trying
to generate.

Is the database content OK? Don't forget you'll need NVARCHAR under
SQL Server, not just VARCHAR.



On a side issue, that's ugly HTML. No useful markup in there (it needs
headers, let alone any other semantics) and this has led to a very
"flat" presentation that's difficult to read. For a page of that sheer
bulk, your readers need all the help they can get!

To be honest, you just shouldn't serve 1/2MB pages - they're no use to
anyone. As the only thing you can do with a page that big is to try
and split it or search it mechanically, you should be supporting ways
that they can do this on your server, without needing to first
download that whole behemoth.
 
 
OccasionalFlyer
Guest
Posts: n/a
 
      08-11-2010
On Aug 10, 7:43 pm, Andy Dingley wrote:

> Looks like the page encoding is OK, but those few characters just
> aren't Unicode. Smells more like an ASP problem than HTML - I think
> your generation is breaking it, not the target of what you're trying
> to generate.


So I guess I should ask in the ASP.NET group, yes?

> Is the database content OK? Don't forget you'll need NVARCHAR under
> SQL Server, not just VARCHAR


So far as I can tell, it's fine but the content is not coming from SQL
Server, so far as I know.

> On a side issue, that's ugly HTML. No useful markup in there (it needs
> headers, let alone any other semantics) and this has led to a very
> "flat" presentation that's difficult to read. For a page of that sheer
> bulk, your readers need all the help they can get!


Most of the header/CSS stuff is in the "master" page that wraps around
all the other pages in an ASP environment. (Honestly, if I were better
at JavaScript, I'd convert the whole site to HTML, but I've no idea how
I'd implement login security, especially for blocking some, but not
all, resources. I don't even normally do that in Java, once my
servlet is sure the user is valid.) I'd love to move it to another ISP
because I have nothing but grief with the current one. I'm also open to
suggestions. I was only trying to more or less continue what had been
started. All those blocks of years will take a user to a specific
journal volume. Yes, more pages would be nice, but I'm afraid that my
understanding of how to add more levels of navigation to ASP.NET is
not good. From what I've read, it would take a menu control, but the
site was not built that way. All of its links were simply hard-coded.

I'm not a page designer but a software developer, so I'm not sure
what to do that would be best. Ideas?

> To be honest, you just shouldn't serve 1/2MB pages - they're no use to
> anyone. As the only thing you can do with a page that big is to try
> and split it or search it mechanically, you should be supporting ways
> that they can do this on your server, without needing to first
> download that whole behemoth.


I'll put this on my to-do list. Thanks.

Ken
 
 
Jukka K. Korpela
Guest
Posts: n/a
 
      08-11-2010
OccasionalFlyer wrote:

> http://www.ibr-bbr.org/IBRBulletin/I...yYearList.aspx

[…]
> The piece that did not work for me almost at the very bottom:
> Key Words: hebdomadal system, stages of life, Paul, Timothy,
> paidi,on , pai/j , meiravkion, neani,skoj , avnh,r , presbu,thj ,
> ge,rwn


The page encoding is declared as UTF-8, and like Andy wrote, there are words
that obviously aren't in that encoding. This looks like a problem in copy
and paste. Where were the words copied from? Perhaps from a document (web
page or other) where "fontistic fantasies" are used to extend character
repertoire, i.e. text is written in ASCII but some font setting is used to
make the characters look like something completely different. Needless to say,
such tricks only work on defective software and generally break apart when
data is transferred to another program.
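
A rough way to tell the two cases apart, sketched in Python under the assumption that a font trick is indeed the cause. The function names are hypothetical, and the sample strings are the ones quoted in this thread. Real Greek text contains code points from the Unicode Greek blocks; font-trick text is plain ASCII.

```python
# Greek and Coptic (U+0370-U+03FF) and Greek Extended (U+1F00-U+1FFF).
GREEK_RANGES = ((0x0370, 0x03FF), (0x1F00, 0x1FFF))


def has_greek_letters(text):
    """True if any character falls inside one of the Greek blocks."""
    return any(lo <= ord(ch) <= hi for ch in text for lo, hi in GREEK_RANGES)


def is_plain_ascii(text):
    """True if every character is in the ASCII range."""
    return all(ord(ch) < 128 for ch in text)


working = "\u03ba\u03ad\u03c1\u03b1\u03c2"  # the keyword that displayed correctly
broken = "paidi,on"                         # one of the keywords that did not

print(has_greek_letters(working), is_plain_ascii(working))  # True False
print(has_greek_letters(broken), is_plain_ascii(broken))    # False True
```

Text that reports no Greek letters cannot be rescued by changing the page encoding; it has to be retyped or converted to real Greek characters.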

The page apparently contains parts that have come from Microsoft Office
software, as the markup <p class="MsoNormal"> reveals.

> I don't know why everything is in italics because that's not
> what I thought I did.


There seems to be a lot of <em> and <strong> markup on the page. To be
honest, it might be best to extract the content as plain text and then add
some simple markup, instead of trying to fix the mess. But maybe the quick
and dirty fix of adding

em { font-style: normal; }
strong { font-weight: normal; }

would remove some of the most striking problems in rendering.

--
Yucca, http://www.cs.tut.fi/~jkorpela/

 