Resolving unicode escapes to unicode character

 
 
Tyler
      07-28-2011
Hi all,
I'm trying to parse escaped Unicode characters. The basic goal is to
read the string '\u00F3' (or "\\u00F3") as 'ó'. My workaround below
uses eval, but I'd be grateful if anyone had a less dangerous solution
or suggestion. In Python, you can 'import codecs' and use
string.decode("unicode-escape"). Is something similar possible in Ruby?

Thanks!
Tyler


File.open("test.txt", 'w') { |file| file.puts "Asociaci\\u00F3n Alumni\nF\\u00FAtbol" }
File.open("test.txt", 'r') do |file|
  file.each do |line|
    puts eval("%Q{#{line}}")
    # puts line
  end
end
# => Asociación Alumni
# => Fútbol
#
# If 'puts line' is used instead, this is the output:
# => Asociaci\u00F3n Alumni
# => F\u00FAtbol
#
# Is there a (prettier & safer) way to do this without using eval?
 
Robert Klemme
      07-29-2011
On 29.07.2011 01:51, Tyler wrote:
> Hi all,
> I'm trying to parse escaped Unicode characters. The basic goal is to
> read the string '\u00F3' (or "\\u00F3") as 'ó'. My workaround below
> uses eval, but I'd be grateful if anyone had a less dangerous solution
> or suggestion. In Python, you can 'import codecs' and use
> string.decode("unicode-escape"). Is something similar possible in Ruby?


irb(main):037:0> s="a\\u00fab"
=> "a\\u00fab"
irb(main):038:0> puts s
a\u00fab
=> nil
irb(main):039:0> s.gsub(%r[\\u(\h{4})]) {$1.to_i(16).chr(Encoding::UTF_8)}
=> "aúb"

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
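
For the file-reading case from the original question, the gsub above can be wrapped in a small method so that no eval is needed. This is only a minimal sketch: the method name unescape_unicode is made up for illustration, and it assumes every escape is a literal \u followed by exactly four hex digits. Surrogate pairs and the \u{...} form are not handled.

```ruby
# Hypothetical helper (the name is an assumption, not from the thread).
# Replaces each literal \uXXXX sequence with the character it names.
def unescape_unicode(text)
  # \\u matches a literal backslash followed by "u";
  # (\h{4}) captures exactly four hex digits.
  text.gsub(/\\u(\h{4})/) { $1.to_i(16).chr(Encoding::UTF_8) }
end

# The same two lines that the original post writes to test.txt.
lines = ["Asociaci\\u00F3n Alumni", "F\\u00FAtbol"]
lines.each { |line| puts unescape_unicode(line) }
# prints:
# Asociación Alumni
# Fútbol
```

Unlike eval, this only ever touches text that matches the pattern, so anything else in the input (including #{...}) passes through unchanged.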

 