Velocity Reviews - Computer Hardware Reviews

page encoding question

 
 
Tony Vella
      12-19-2005
I am preparing a series of philatelic html pages (lots of text and a few
scans of stamps) which will include alpha-characters (accents) in Italian,
French, Spanish, Portuguese and Danish. The pages I have finished in draft
form so far I have encoded UTF-8 but I have just been told that 99% of the
world will not be able to read them and that I should go through all the
pages and re-encode them "western european - windows (1252)". I guess what
I would like to know is what encoding would be most effective for these
particular languages. Any advice and pointers will be appreciated.
--
Tony Vella in Ottawa, Canada

 
David Dorward
      12-19-2005
Tony Vella wrote:

> I am preparing a series of philatelic html pages (lots of text and a few
> scans of stamps) which will include alpha-characters (accents) in Italian,
> French, Spanish, Portuguese and Danish. The pages I have finished in draft
> form so far I have encoded UTF-8 but I have just been told that 99% of the
> world will not be able to read them


That is rubbish. UTF-8 is very well supported (so much so, that I can't
remember the last time I came across a system that couldn't handle it).

--
David Dorward <http://blog.dorward.me.uk/> <http://dorward.me.uk/>
Home is where the ~/.bashrc is
 
Dan
      12-19-2005
Tony Vella wrote:
> I am preparing a series of philatelic html pages (lots of text and a few
> scans of stamps) which will include alpha-characters (accents) in Italian,
> French, Spanish, Portuguese and Danish. The pages I have finished in draft
> form so far I have encoded UTF-8 but I have just been told that 99% of the
> world will not be able to read them and that I should go through all the
> pages and re-encode them "western european - windows (1252)". I guess what
> I would like to know is what encoding would be most effective for these
> particular languages. Any advice and pointers will be appreciated.


While UTF-8 is actually very widely supported, and thus there's no
reason to change your encoding from this (if your server sends a proper
content-type header indicating the encoding), the Western European
languages you are using should work all right in the standard Western
encoding iso-8859-1 as well. Avoid windows-1252; it's a proprietary
Microsoft set.
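
As a rough illustration of the content-type point above, here is a minimal Python sketch of a response whose header declares the same encoding that was used to produce the body bytes. The helper name `build_response` and the sample text are assumptions made up for this example; nothing here describes the poster's actual server setup.

```python
def build_response(html_text, charset="utf-8"):
    """Return (headers, body) where the declared charset matches the body bytes."""
    # encode() raises UnicodeEncodeError if a character is not in the charset
    body = html_text.encode(charset)
    headers = {
        "Content-Type": f"text/html; charset={charset}",
        "Content-Length": str(len(body)),
    }
    return headers, body


# Hypothetical sample with accented letters from the five languages mentioned
sample = "<p>Città, français, año, coração, København</p>"

utf8_headers, utf8_body = build_response(sample, "utf-8")
latin1_headers, latin1_body = build_response(sample, "iso-8859-1")

print(utf8_headers["Content-Type"])
print(latin1_headers["Content-Type"])
```

The part that matters is that the header and the bytes agree: a browser told `charset=utf-8` but handed ISO-8859-1 bytes (or the reverse) will show garbled accents.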

--
Dan

 
Luigi Donatello Asero
      12-19-2005

"Dan" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed) oups.com...
> Tony Vella wrote:
> > I am preparing a series of philatelic html pages (lots of text and a few
> > scans of stamps) which will include alpha-characters (accents) in Italian,
> > French, Spanish, Portuguese and Danish. The pages I have finished in draft
> > form so far I have encoded UTF-8 but I have just been told that 99% of the
> > world will not be able to read them and that I should go through all the
> > pages and re-encode them "western european - windows (1252)". I guess what
> > I would like to know is what encoding would be most effective for these
> > particular languages. Any advice and pointers will be appreciated.
>
> While UTF-8 is actually very widely supported, and thus there's no
> reason to change your encoding from this (if your server sends a proper
> content-type header indicating the encoding), the Western European
> languages you are using should work all right in the standard Western
> encoding iso-8859-1 as well. Avoid windows-1252; it's a proprietary
> Microsoft set.
>
> --
> Dan


I am using iso-8859-1 at the moment, but I am going to switch to UTF-8 so
that I can add pages in Russian and Chinese (right now I have little
content in those languages).
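
A switch like this can be sketched in a few lines of Python. The function name `transcode` and the sample strings are assumptions for illustration only; a real conversion would also need to update any charset declared in the pages or in the server's content-type header.

```python
def transcode(data, source="iso-8859-1", target="utf-8"):
    """Decode bytes from the source encoding and re-encode them in the target."""
    return data.decode(source).encode(target)


# Hypothetical page fragment, as it would be stored in ISO-8859-1
latin1_bytes = "Città è già".encode("iso-8859-1")
utf8_bytes = transcode(latin1_bytes)
print(len(latin1_bytes), len(utf8_bytes))

# Cyrillic text cannot be represented in ISO-8859-1 at all, but UTF-8 accepts it
try:
    "Привет".encode("iso-8859-1")
    russian_fits_latin1 = True
except UnicodeEncodeError:
    russian_fits_latin1 = False

russian_utf8 = "Привет".encode("utf-8")
print(russian_fits_latin1)
```

One caveat: Python's ISO-8859-1 decoder accepts every possible byte value, so if the source files are not really ISO-8859-1 the conversion will not fail; it will silently produce wrong characters.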

--
Luigi Donatello Asero
https://www.scaiecat-spa-gigi.com/sv/oversattning.php

Jukka K. Korpela
      12-19-2005
"Tony Vella" <(E-Mail Removed)> wrote:

> I am preparing a series of philatelic html pages (lots of text and a few
> scans of stamps) which will include alpha-characters (accents) in Italian,
> French, Spanish, Portuguese and Danish.


They are all covered by the ISO-8859-1 encoding, except for some punctuation
marks and letters like the oe ligature. If you use windows-1252, you get the
punctuation marks and the ligature, too.
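
The coverage claim can be checked with a short Python sketch. The sample words and the helper name `encodable` are assumptions chosen for this illustration; `cp1252` is Python's codec name for windows-1252.

```python
def encodable(text, encoding):
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Hypothetical sample words; escapes are used for the punctuation to keep it unambiguous
samples = {
    "Italian": "perché città",
    "French": "français élève",
    "Spanish": "año ¿qué?",
    "Portuguese": "coração não",
    "Danish": "København æble på",
    "oe ligature": "\u0153uvre",
    "curly quotes, en dash, euro": "\u201cquoted\u201d \u2013 \u20ac",
}

for label, text in samples.items():
    print(
        label,
        "| iso-8859-1:", encodable(text, "iso-8859-1"),
        "| cp1252:", encodable(text, "cp1252"),
        "| utf-8:", encodable(text, "utf-8"),
    )
```

The five language samples fit in all three encodings; only the ligature and the typographic punctuation separate ISO-8859-1 from windows-1252.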

> The pages I have finished in draft
> form so far I have encoded UTF-8 but I have just been told that 99% of the
> world will not be able to read them


Nonsense. More probably, 99 % of the WWW users _are_ able to read them. Well,
let's say 97.6 %. After all, 96,3 % of all percentages have just been made
up, and the remaining 4,7 % have been miscalculated.

> and that I should go through all the
> pages and re-encode them "western european - windows (1252)".


I wouldn't do that at this point, unless you have good tools that do such
things for you with minimal effort.

> I guess what
> I would like to know is what encoding would be most effective for these
> particular languages.


If you were just about to start the project, I would recommend ISO-8859-1 (or
windows-1252 if you need those extras) - not because of wider browser
coverage (though there is a _small_ improvement to be gained there) but
because those encodings are somewhat more efficient (one byte per character,
whereas UTF-8 uses two bytes for some of the characters you'd use).
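
To put a number on the efficiency difference, here is a small Python sketch that counts bytes for the same text in each encoding. The sample string and the function name `byte_length` are assumptions for illustration.

```python
def byte_length(text, encoding):
    """Number of bytes the text occupies in the given encoding."""
    return len(text.encode(encoding))


# Hypothetical sample: mostly ASCII with a few accented letters
text = "København, coração, français"

sizes = {enc: byte_length(text, enc) for enc in ("iso-8859-1", "cp1252", "utf-8")}
non_ascii = sum(1 for ch in text if ord(ch) > 127)

for enc, size in sizes.items():
    print(enc, size)
print("non-ASCII characters:", non_ascii)
```

For text like this the UTF-8 overhead is one extra byte per accented letter, so on pages that are mostly ASCII the difference stays small.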

UTF-8 is certainly simpler in the future if you'll ever need to add
characters in other languages.

--
Yucca, http://www.cs.tut.fi/~jkorpela/
Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html


 
Alan J. Flavell
      12-20-2005

On Mon, 19 Dec 2005, David Dorward wrote:

> Tony Vella wrote:
>
> > I am preparing a series of philatelic html pages (lots of text and
> > a few scans of stamps) which will include alpha-characters
> > (accents) in Italian, French, Spanish, Portuguese and Danish. The
> > pages I have finished in draft form so far I have encoded UTF-8
> > but I have just been told that 99% of the world will not be able
> > to read them

>
> That is rubbish.


Agreed

> UTF-8 is very well supported (so much so, that I can't remember the
> last time I came across a system that couldn't handle it).


Broad agreement with that, but there are exceptions...

Well, Netscape 4.* versions do a pretty good job of *rendering* utf-8,
but do keep in mind that, if any form submission is required, then NN4
badly mangles utf-8. Whether it's worth understanding how to
implement a workaround for that old zombie is debatable, of course:
I'm just mentioning that it's not without a problem.

cheers

(The original WebTV is also hopeless at rendering anything other than
a subset of Windows-1252, but ho hum.)
 