RDoc and encoding

 
 
Claus Folke Brobak
      01-10-2011
Hi,

Running Ruby/JRuby 1.8.7 on Windows XP.

Until now I have been using the RDoc version built into the Ruby
Standard Library. That is version 1.0.1. Now I am trying out RDoc 3.4,
installed via a gem.

I have run into a problem with the double quote character. Example code:

RDoc 1.0.1

require 'rdoc/markup/simple_markup'
require 'rdoc/markup/simple_markup/to_html'

sm = SM::SimpleMarkup.new()
th = SM::ToHtml.new()
puts sm.convert('æææ"øøø"ååå', th)

Output:

<p>
æææ&quot;øøø&quot;ååå
</p>

RDoc 3.4

require 'rubygems'
require 'rdoc/markup/to_html'

puts RDoc::Markup::ToHtml.new().convert('æææ"øøø"ååå')

Output:

<p>æææâ€œøøøâ€ååå</p>

It seems as if RDoc 3.4 is adding a double quote in UTF-8 encoding
instead of "&quot;". Running on Windows XP, the normal encoding is
Windows-1252. If I look at the HTML and tell the browser that it is
UTF-8 encoded, the double quotes are displayed correctly. Then, however,
the Danish national characters (æøå) are not displayed as they should.

Do you think I have hit a bug in RDoc 3.4, or am I missing something?

Claus

--
Posted via http://www.ruby-forum.com/.
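
The garbled output in the post above is consistent with UTF-8 bytes being read as Windows-1252. Here is a minimal sketch of that effect, assuming Ruby 1.9 or later (String#force_encoding and String#encode are not available on 1.8.7). The helper name misread_as_cp1252 is made up for illustration and is not part of RDoc:

```ruby
# encoding: utf-8
# Hypothetical helper (not an RDoc API): show how UTF-8 bytes look when a
# viewer decodes them as Windows-1252.
def misread_as_cp1252(utf8_text)
  utf8_text.dup.force_encoding("Windows-1252").encode("UTF-8")
end

# U+201C, the left curly double quote, is the bytes E2 80 9C in UTF-8.
left_quote = "\u201C"
puts misread_as_cp1252(left_quote)   # prints the three characters â€œ

# The reverse problem: "æøå" stored as Windows-1252 is the bytes E6 F8 E5,
# which do not form valid UTF-8.
danish_cp1252 = [0xE6, 0xF8, 0xE5].pack("C*").force_encoding("Windows-1252")
puts danish_cp1252.dup.force_encoding("UTF-8").valid_encoding?   # false
puts danish_cp1252.encode("UTF-8")                               # æøå
```

A page that mixes Windows-1252 bytes for the Danish letters with UTF-8 bytes for the quotes cannot be displayed correctly under either encoding, which matches the symptom described.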

 
Eric Hodel
      01-10-2011
On Jan 10, 2011, at 04:08, Claus Folke Brobak wrote:

> Do you think I have hit a bug in RDoc 3.4, or am I missing something?


Transcoding is not supported in RDoc on Ruby 1.8.7. Upgrade to Ruby 1.9.

My primary platform for developing RDoc is Ruby 1.9. Ruby 1.8.6 is unsupported, and 1.8.7 gets second-tier status and will not support transcoding.

 
Claus Folke Brobak
      01-10-2011
Eric Hodel wrote in post #973761:
>
> Transcoding is not supported in RDoc on ruby 1.8.7. Upgrade to Ruby
> 1.9.


I don't think it is a matter of transcoding. I would have thought the
output would remain in the Windows-1252 encoding of the input.

As far as I can figure out, RDoc always "thinks" the input is in UTF-8
encoding. This is probably rarely the case on Windows.

Can you explain the use of a double quote in UTF-8 encoding instead of
"&quot;" in the generated HTML?

Claus

--
Posted via http://www.ruby-forum.com/.

 
Eric Hodel
      01-10-2011
On Jan 10, 2011, at 13:20, Claus Folke Brobak wrote:

> Eric Hodel wrote in post #973761:
>>
>> Transcoding is not supported in RDoc on ruby 1.8.7. Upgrade to Ruby
>> 1.9.
>
> I don't think it is a matter of transcoding. I would have thought the
> output would remain in the Windows-1252 encoding of the input.
>
> As I can figure out, RDoc always "thinks" the input is in UTF-8
> encoding. This is probably rarely the case on Windows.


With Ruby 1.8 this is true. If you upgrade to Ruby 1.9, RDoc 3 can automatically determine the output encoding and transcode for you. You can also override it with --encoding.
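
As a rough illustration of what "determine the output encoding and transcode" means in Ruby 1.9 terms, the sketch below uses Encoding.default_external and String#encode. The method transcode_for_output is a hypothetical name, and this is an assumption about the general mechanism, not RDoc's actual code:

```ruby
# encoding: utf-8
# Hypothetical helper (not an RDoc API): transcode text to a target encoding,
# defaulting to the encoding Ruby 1.9+ derives from the environment.
def transcode_for_output(text, target = Encoding.default_external)
  text.encode(target)
end

html = "<p>\u00E6\u00F8\u00E5</p>"          # UTF-8 string containing æøå

cp1252 = transcode_for_output(html, Encoding::Windows_1252)
puts cp1252.encoding                        # Windows-1252
puts cp1252.bytes.to_a.inspect              # æøå become the single bytes 230, 248, 229
```

On Ruby 1.8.7 strings carry no encoding information, so there is nothing comparable to call; that fits the statement that transcoding is only supported from 1.9 on.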

> Can you explain the use of a double quote in UTF-8 encoding instead of
> "&quot;" in the generated HTML?


RDoc now performs "prettier" replacements of characters such as matching opening and closing quotes. Such characters are not available in all output encodings so transcoding is performed.
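
To see why curly quotes raise the encoding question at all, the sketch below (Ruby 1.9 or later) tries to encode U+201C into two single-byte encodings. The fallback to a straight quote is an assumption made for illustration; the thread does not say what RDoc does when a character is unavailable in the target encoding:

```ruby
curly = "\u201C"   # left curly double quote

# Windows-1252 has this character at byte 0x93.
p curly.encode("Windows-1252").bytes.to_a   # [147]

# ISO-8859-1 has no curly quotes, so a plain encode raises.
begin
  curly.encode("ISO-8859-1")
rescue Encoding::UndefinedConversionError => e
  puts "not representable: #{e.class}"
end

# Illustrative fallback (an assumption): substitute a straight quote.
plain = curly.encode("ISO-8859-1", :undef => :replace, :replace => '"')
p plain.bytes.to_a                          # [34]
```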

 