On Sun, 12 Dec 2004 18:54:04 -0800, mlv231 wrote:
> Hi,
>
> I am experiencing problems with my href links containing filenames
> with portuguese chars. For example, 'FUNDAMENTOS P REGULAMENTAÇÃO DA
> CIDE.tif'.
>
> However, if I open the listing files JSP on tomcat and click on the
> right link, the same file opens. Then, I opened the source html code
> generated for the listing files JSP, and I realized that the filename
> was encoded as following:
>
> 'FUNDAMENTOS%20P%20REGULAMENTA%C3%87%C3%83O%20DA%20CIDE.tif'
This is the UTF-8 encoding of the diacritics, percent-escaped byte by
byte: Ç becomes the two bytes C3 87 and Ã becomes C3 83.
> I tried to encode the filename using the javascript function
> 'escape()', but the result was different:
>
> 'FUNDAMENTOS%20P%20REGULAMENTA%C7%C3O%20DA%20CIDE.tif'
While this uses ISO-8859-1 (or ISO-8859-15), where Ç is the single byte
C7 and Ã is the single byte C3.
> I tried URLEncoder.encode, but the result was not the same...
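It would, but don't use the deprecated encode( String s ), which relies
on the platform default encoding. If you specify "UTF-8" as the encoding,
it will give you the first form, and if you specify "ISO-8859-1", the
second. The one difference is that URLEncoder writes a space as '+'
rather than '%20'.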
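A minimal sketch of the two-argument call, in case it helps. The class
name EncodeDemo and the helper encode() are made up for illustration, and
replacing '+' with '%20' is my own assumption about what you want in an
href (URLEncoder targets form encoding, so it emits '+' for spaces):

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class EncodeDemo {

    // Hypothetical helper: percent-encode a filename using the given charset.
    // URLEncoder writes spaces as '+'; a literal '+' in the input has already
    // become "%2B", so replacing '+' afterwards only affects former spaces.
    static String encode(String name, String charset) {
        try {
            return URLEncoder.encode(name, charset).replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        // \u00C7 is Ç and \u00C3 is Ã; the escapes keep this source file ASCII-only.
        String name = "FUNDAMENTOS P REGULAMENTA\u00C7\u00C3O DA CIDE.tif";

        // Same form as the link in the generated listing page.
        System.out.println(encode(name, "UTF-8"));

        // Same form as the JavaScript escape() result.
        System.out.println(encode(name, "ISO-8859-1"));
    }
}
```

For the link to match what the server generates, pick the "UTF-8" variant.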
> Does anyone know which algorithm bea/tomcat use to encode special
> chars?
The same percent-escaping as your JS code, just applied to UTF-8 bytes
instead of ISO-8859-1 bytes.
> Where can I find the appropriate information?
http://www.unicode.org/
http://www.cl.cam.ac.uk/~mgk25/unicode.html
http://www1.tip.nl/~t876506/utf8tbl.html
http://java.sun.com/j2se/1.4.2/docs/...RLEncoder.html
http://java.sun.com/j2se/1.4.2/docs/...t/Charset.html
Cheers, Tilman
--
`Boy, life takes a long time to live...' -- Steven Wright