Decode/encode Unicode

 
 
Kless
Guest
Posts: n/a
 
      08-28-2008
How to decode a String type to Unicode?
And, to encode Unicode to String?
 
 
 
 
 
Thomas B.
Guest
Posts: n/a
 
      08-28-2008
Kless wrote:
> How to decode a String type to Unicode?
> And, to encode Unicode to String?


Unicode is not fully supported in Ruby 1.8.x; it will be in Ruby 1.9.
--
Posted via http://www.ruby-forum.com/.

 
 
 
 
 
James Gray
Guest
Posts: n/a
 
      08-28-2008
On Aug 28, 2008, at 3:51 AM, Kless wrote:

> How to decode a String type to Unicode?


$ ruby -KU -e 'p "Résumé".unpack("U*")'
[82, 233, 115, 117, 109, 233]

> And, to encode Unicode to String?


$ ruby -KU -e 'p [82, 233, 115, 117, 109, 233].pack("U*")'
"Résumé"

Hope that helps.

James Edward Gray II
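
The two one-liners above can also be written as a small script. This is only a sketch: the method names are made up for illustration, and it assumes the source file is saved as UTF-8 (the magic comment on the first line is what Ruby 1.9 looks for).

```ruby
# encoding: utf-8

# Decode: a UTF-8 String becomes an Array of Unicode code points.
# (decode_to_codepoints is a hypothetical helper name.)
def decode_to_codepoints(text)
  text.unpack("U*")
end

# Encode: an Array of code points becomes a UTF-8 String.
# (encode_from_codepoints is a hypothetical helper name.)
def encode_from_codepoints(codepoints)
  codepoints.pack("U*")
end

codepoints = decode_to_codepoints("Résumé")
p codepoints                                      # => [82, 233, 115, 117, 109, 233]
p encode_from_codepoints(codepoints) == "Résumé"  # => true
```

The "U" directive reads and writes UTF-8, so the string you unpack has to be UTF-8 already; for other encodings you would convert first.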

 
 
Stephen Boisvert
Guest
Posts: n/a
 
      08-28-2008
On Thu, Aug 28, 2008 at 9:51 AM, Kless wrote:
> How to decode a String type to Unicode?
> And, to encode Unicode to String?
>
>


This is a trickier question than you probably realize.
You need to know which encoding you want to transform your string
from and to. Sometimes you may not know: MP3 ID3 tag info is
supposed to be UTF-8, for example, but often isn't. If you don't specify
an encoding, the string may or may not end up being encoded based on your
locale settings.
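
For the "you may not know" case, one thing you can do is check whether the bytes are at least well-formed UTF-8 before trusting them. A minimal sketch, assuming Ruby 1.9 or later (String#force_encoding and String#valid_encoding? do not exist in 1.8); valid_utf8? is a made-up helper name:

```ruby
# Hypothetical helper: true when the raw bytes form well-formed UTF-8.
# dup keeps the caller's string untouched, since force_encoding
# changes the encoding label in place.
def valid_utf8?(bytes)
  bytes.dup.force_encoding(Encoding::UTF_8).valid_encoding?
end

utf8_bytes   = [82, 195, 169].pack("C*")  # "Ré" encoded as UTF-8
latin1_bytes = [82, 233].pack("C*")       # "Ré" encoded as ISO-8859-1

p valid_utf8?(utf8_bytes)    # => true
p valid_utf8?(latin1_bytes)  # => false
```

A "true" here only means the bytes could be UTF-8, not that they were written as UTF-8, so treat it as a sanity check rather than detection.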

Things I have seen mentioned:

iconv - a Unix-based library for translating between different
encodings. It requires that you tell it what you are decoding from and
encoding to.
http://wiki.rubyonrails.org/rails/pages/iconv
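
As an illustration of the "from" and "to" idea, here is a sketch that does not use iconv at all: it swaps in String#encode, which is built into Ruby 1.9 and later. The convert_encoding helper is a made-up name, and the encodings are just an example.

```ruby
# Hypothetical helper: label the raw bytes as `from`, then transcode to `to`.
# Like iconv, it needs both encodings named explicitly.
def convert_encoding(bytes, from, to)
  bytes.dup.force_encoding(from).encode(to)
end

# "Résumé" as ISO-8859-1 bytes (é is the single byte 233).
latin1 = [82, 233, 115, 117, 109, 233].pack("C*")
utf8   = convert_encoding(latin1, "ISO-8859-1", "UTF-8")

p utf8.bytes.to_a    # => [82, 195, 169, 115, 117, 109, 195, 169]
p utf8.unpack("U*")  # => [82, 233, 115, 117, 109, 233]
```

If the input contains bytes that are invalid in the source encoding or have no equivalent in the target, encode raises an exception, so real code would want a rescue around it.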

unidecode - I have used this to translate Unicode strings to ASCII by
simple character mapping. It works about 99% of the time, so you will
need error handling for when it fails.

Setting KCODE:

$KCODE = 'UTF8'
or
$KCODE = 'u'
or
use -Ku at the command line. This tells Ruby that you want to be
using UTF-8 encoding.

require 'jcode'

will give you access to some code developed to deal with Japanese
multibyte text. It addresses some of the problems the String class
has with Unicode characters, for example by adding jlength. Always keep in mind
that a lot of the String methods in Ruby 1.8 do not work properly with
Unicode because they count bytes rather than characters.
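
To make the counting problem concrete, here is a sketch that gets both numbers using only unpack, so it does not depend on jcode or $KCODE. The helper names are made up, and it assumes the string is UTF-8.

```ruby
# encoding: utf-8

# Hypothetical helper: number of Unicode characters (code points).
def char_count(text)
  text.unpack("U*").length
end

# Hypothetical helper: number of raw bytes.
def byte_count(text)
  text.unpack("C*").length
end

word = "Résumé"
p char_count(word)  # => 6
p byte_count(word)  # => 8
```

Each é takes two bytes in UTF-8, which is why a byte-based length reports 8 for a six-letter word.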

Good luck.

Stephen Boisvert
http://blog.ennuyer.net

 
 
Kless
Guest
Posts: n/a
 
      08-28-2008
Thanks! It will help until Ruby 1.9 is more widely used.

On Aug 28, 1:11 pm, James Gray <ja...@grayproductions.net> wrote:
> On Aug 28, 2008, at 3:51 AM, Kless wrote:
>
> > How to decode a String type to Unicode?

>
> $ ruby -KU -e 'p "Résumé".unpack("U*")'
> [82, 233, 115, 117, 109, 233]
>
> > And, to encode Unicode to String?

>
> $ ruby -KU -e 'p [82, 233, 115, 117, 109, 233].pack("U*")'
> "Résumé"
>
> Hope that helps.
>
> James Edward Gray II


 