irb and ruby giving different results

 
 
Nit Khair
 
      11-05-2008
in IRB,
ASCII = (0..255).map{|c| c.chr }
PRINTABLE = ASCII.grep(/[[:print:]]/)
PRINTABLE.length
>>> 191


However, inside the ruby program PRINTABLE.length only gives 95 !! ???

#!/opt/local/bin/ruby
ASCII = (0..255).map{|c| c.chr }
puts(ASCII.length)
PRINTABLE = ASCII.grep(/[[:print:]]/)
puts(PRINTABLE.length)
# -> 95 instead of 191

(Using ruby 1.8.7 on OS X 10.5.5, powerpc). Ran both from same Terminal.
Both use /opt/local/bin/ruby.

Why this difference? I ran irb with -f (so .irbrc would not be loaded) and
still got the same result, so it's not some require that is causing the
difference.

p.s. sorry for cross-posting from roguelike thread -- this is getting
lost there.
--
Posted via http://www.ruby-forum.com/.

 
 
 
 
 
Patrick Doyle
 
      11-05-2008

This won't help much, but when I executed:

>
> ASCII = (0..255).map{|c| c.chr }
> PRINTABLE = ASCII.grep(/[[:print:]]/)
> PRINTABLE.length
> >>> 191


in irb, I got 95 on my ruby 1.8.6 (i386-mswin32) running on an XP box.

What were the 191 characters displayed when you computed the PRINTABLE
expression?

As a totally random theory, I wonder if [[:print:]] might take into account
what device is attached to stdout, recognize what your terminal is
capable of, and use that to decide what is printable or not.

It would be quite surprising (and perhaps unfortunate) if that's what's
going on, but it might explain what you saw.

A slightly more plausible explanation might be that [[:print:]] alters its
behavior based on the TERM environment variable. What is ENV["TERM"] in the
two cases?

That's all I've got. I warned you at the beginning that this wouldn't help
much.

--wpd

 
 
 
 
 
Nit Khair
 
      11-05-2008
Patrick Doyle wrote:
> This won't help much, but when I executed:
>
>> ASCII = (0..255).map{|c| c.chr }
>> PRINTABLE = ASCII.grep(/[[:print:]]/)
>> PRINTABLE.length
>> >>> 191
>
> in irb, I got 95 on my ruby 1.8.6 (i386-mswin32) running on an XP box.
>
> What were the 191 characters displayed when computed the PRINTABLE
> expression?
>
> A slightly more plausible explanation might be that [[:print:]] alters
> its behavior based on the TERM environment variable. What is
> ENV["TERM"] in the two cases?
>
> That's all I've got. I warned you at the beginning that this wouldn't
> help much.
>
> --wpd


I mentioned that I used the same terminal to show that it was not a
terminal issue. I tried both with TERM=screen (my usual), then
xterm-color, xterm-256color and perhaps VT100 and VT200 as well.

One of the characters in the 191, for example, is 165 or "\245", which is
the code generated by Alt-A on my Mac (OS X 10.5.5, powerpc, darwin).

(This is when I have *not* enabled "Use alt as meta". If you don't know
what that is, just ignore it; it's a Mac default.)

Here's the dump, since you asked:

irb(main):030:0> PRINTABLE
[" ", "!", "\"", "#", "$", "%", "&", "'", "(", ")", "*", "+", ",", "-",
".", "/", "0", "1", "2", "3", "4", "5", "6", "7", "8", "9", ":", ";",
"<", "=", ">", "?", "@", "A", "B", "C", "D", "E", "F", "G", "H", "I",
"J", "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W",
"X", "Y", "Z", "[", "\\", "]", "^", "_", "`", "a", "b", "c", "d", "e",
"f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s",
"t", "u", "v", "w", "x", "y", "z", "{", "|", "}", "~", "\240", "\241",
"\242", "\243", "\244", "\245", "\246", "\247", "\250", "\251", "\252",
"\253", "\254", "\255", "\256", "\257", "\260", "\261", "\262", "\263",
"\264", "\265", "\266", "\267", "\270", "\271", "\272", "\273", "\274",
"\275", "\276", "\277", "\300", "\301", "\302", "\303", "\304", "\305",
"\306", "\307", "\310", "\311", "\312", "\313", "\314", "\315", "\316",
"\317", "\320", "\321", "\322", "\323", "\324", "\325", "\326", "\327",
"\330", "\331", "\332", "\333", "\334", "\335", "\336", "\337", "\340",
"\341", "\342", "\343", "\344", "\345", "\346", "\347", "\350", "\351",
"\352", "\353", "\354", "\355", "\356", "\357", "\360", "\361", "\362",
"\363", "\364", "\365", "\366", "\367", "\370", "\371", "\372", "\373",
"\374", "\375", "\376", "\377"]
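
A quick sanity check on the dump above (a sketch, not part of the original post): the escapes irb prints are octal, so the extra entries start at "\240" and end at "\377". The constant names below are made up for illustration. Converting the escapes back to byte values shows where the 191 comes from.

```ruby
# Octal escapes as printed by irb 1.8, converted back to byte values.
# unpack('C') reads a byte as an unsigned integer.
FIRST_EXTRA = "\240".unpack('C').first   # first entry after "~" in the dump
ALT_A       = "\245".unpack('C').first   # the Alt-A character mentioned above
LAST_EXTRA  = "\377".unpack('C').first   # last entry in the dump

ASCII_COUNT = (0x20..0x7e).to_a.length               # space through "~"
EXTRA_COUNT = (FIRST_EXTRA..LAST_EXTRA).to_a.length  # the high-bit block

puts "extra block: #{FIRST_EXTRA}..#{LAST_EXTRA} (Alt-A is #{ALT_A})"
puts "#{ASCII_COUNT} + #{EXTRA_COUNT} = #{ASCII_COUNT + EXTRA_COUNT}"
```

So the irb result is the 95 printable ASCII characters plus the 96 bytes from 160 to 255, and the script result is the ASCII part alone.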
 
Brian Candler
 
      11-05-2008
Nit Khair wrote:
> in IRB,
> ASCII = (0..255).map{|c| c.chr }
> PRINTABLE = ASCII.grep(/[[:print:]]/)
> PRINTABLE.length
> >>> 191
>
> However, inside the ruby program PRINTABLE.length only gives 95 !! ???
>
> #!/opt/local/bin/ruby
> ASCII = (0..255).map{|c| c.chr }
> puts(ASCII.length)
> PRINTABLE = ASCII.grep(/[[:print:]]/)
> puts(PRINTABLE.length)
> # -> 95 instead of 191
>
> (Using ruby 1.8.7 on OS X 10.5.5, powerpc). Ran both from same Terminal.
> Both use /opt/local/bin/ruby.
>
> Why this difference?


FWIW, I get 95 with irb187 under Ubuntu Dapper.

Looking at source code, the [[:print:]] character class uses isascii(c)
&& isprint(c)

man isprint says:

NOTE
The details of what characters belong into which class depend on the
current locale. For example, isupper() will not recognize an A-umlaut
(Ä) as an uppercase letter in the default C locale.

So look at what ENV.grep(/^LC/) shows. You could try setting
ENV['LC_ALL']='C' in irb, or export LC_ALL=C before running it. Or try
'POSIX' instead of 'C'.

Finally, be completely sure that your irb is running the right ruby.
Check RUBY_VERSION within irb.
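
To make the locale check concrete, here is a minimal sketch of working out which locale the environment requests for character classification. The helper name requested_ctype is hypothetical; the lookup order (LC_ALL, then LC_CTYPE, then LANG) is the usual C library precedence for the LC_CTYPE category. Note that this only reports what the environment asks for: it takes effect only in a process that actually calls setlocale() with an empty locale string.

```ruby
# Variables consulted for the LC_CTYPE category, highest priority first:
# LC_ALL overrides LC_CTYPE, which overrides LANG.
CTYPE_PRECEDENCE = %w[LC_ALL LC_CTYPE LANG]

# Returns the locale name the environment requests for character
# classification, or "C" when nothing is set. `env` may be ENV or a Hash.
# (Hypothetical helper, for illustration only.)
def requested_ctype(env = ENV)
  values = CTYPE_PRECEDENCE.map { |name| env[name] }
  values.find { |value| value && !value.empty? } || 'C'
end

puts "requested LC_CTYPE locale: #{requested_ctype}"
```

Running this in both irb and the script should print the same name if the two really see the same environment; a difference in [[:print:]] behaviour would then point at whether setlocale() was called, not at the variables themselves.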
 
Nit Khair
 
      11-05-2008
Brian Candler wrote:
> FWIW, I get 95 with irb187 under Ubuntu Dapper.
>
> Looking at source code, the [[:print:]] character class uses isascii(c)
> && isprint(c)
>
> man isprint says:
>
> NOTE
> The details of what characters belong into which class depend on the
> current locale. For example, isupper() will not recognize an A-umlaut
> (Ä) as an uppercase letter in the default C locale.
>
> So look at what ENV.grep(/^LC/) shows. You could try setting
> ENV['LC_ALL']='C' in irb, or export LC_ALL=C before running it. Or try
> 'POSIX' instead of 'C'.
>
> Finally, be completely sure that your irb is running the right ruby.
> Check RUBY_VERSION within irb.


1.8.7 in both.

ENV.grep(/^LC/) shows nothing in either irb or ruby.
Setting ENV['LC_ALL'] to 'C', 'POSIX', etc. has no effect in either.

However, "echo $LC_ALL" at my prompt gives en_US.UTF-8.
When I set LC_ALL='C', I get only 95 in both ruby and irb.

Is there any way I can get ruby to also give 191?
I tried ENV['LC_ALL']='en_US.UTF-8' at the start of my ruby program, but it
had no effect. Anyway, thanks for pointing this out.
 
Brian Candler
 
      11-05-2008
Nit Khair wrote:
> ENV.grep(/^LC/) show nothing in both irb and ruby


My bad; try

ENV.select{|k,v| k=~/^LC/}

> ENV['LC_ALL']='C' 'POSIX' etc has no effect in both
>
> However, "echo $LC_ALL" on my prompt gives en_US.UTF-8.
> So when i did LC_ALL='C', i get only 95 in both ruby and irb.
>
> Is there any way i get can ruby to also give 191 ?


Perhaps then:

env LC_ALL=en_US.UTF-8 ruby foo.rb

Also, looking through source: it seems that ruby doesn't normally call
setlocale() by itself, but maybe some third-party library which irb is
invoking is doing this for you. "readline" is a likely candidate. So you
could try:

require 'readline'

in your ruby file. Or check $LOADED_FEATURES in irb and try loading the
same modules in your ruby file.
 
Nit Khair
 
      11-05-2008
Brian Candler wrote:
> Nit Khair wrote:
>> ENV.grep(/^LC/) show nothing in both irb and ruby
>
> My bad; try
>
> ENV.select{|k,v| k=~/^LC/}
>
>> ENV['LC_ALL']='C' 'POSIX' etc has no effect in both
>>
>> However, "echo $LC_ALL" on my prompt gives en_US.UTF-8.
>> So when i did LC_ALL='C', i get only 95 in both ruby and irb.
>>
>> Is there any way i get can ruby to also give 191 ?
>
> Perhaps then:
>
> env LC_ALL=en_US.UTF-8 ruby foo.rb
>
> Also, looking through source: it seems that ruby doesn't normally call
> setlocale() by itself, but maybe some third-party library which irb is
> invoking is doing this for you. "readline" is a likely candidate. So you
> could try:
>
> require 'readline'
>
> in your ruby file. Or check $LOADED_FEATURES in irb and try loading the
> same modules in your ruby file.


Very strange:

1. ENV.select{|k,v| k=~/^LC/} gives en_US.UTF-8 in both irb and ruby, for
both LC_ALL and LC_CTYPE.

2. env LC_ALL=en_US.UTF-8 ruby foo.rb
still gives 95.

3. I copied $LOADED_FEATURES from irb and required each entry in my ruby
file (I hope I have this correct):

["enumerator.so", "e2mmap.rb", "irb/init.rb", "irb/workspace.rb",
 "irb/context.rb", "irb/extend-command.rb", "irb/output-method.rb",
 "irb/notifier.rb", "irb/slex.rb", "irb/ruby-token.rb",
 "irb/ruby-lex.rb", "readline.bundle", "irb/input-method.rb",
 "irb/locale.rb", "irb.rb", "irb/completion.rb",
 "irb/ext/save-history.rb", "stringio.bundle", "yaml/error.rb",
 "syck.bundle", "yaml/ypath.rb", "yaml/basenode.rb", "yaml/syck.rb",
 "yaml/tag.rb", "yaml/stream.rb", "yaml/constants.rb", "rational.rb",
 "date/format.rb", "date.rb", "yaml/rubytypes.rb", "yaml/types.rb",
 "yaml.rb"].each do |rr|
  require rr
end

I still get 95.
 
Brian Candler
 
      11-07-2008
Nit Khair wrote:
> I still get 95.


Possibly readline isn't calling setlocale until you actually
invoke/initialise the library.

Here's an alternative test. Install the RubyInline gem, and then stick
this in front of your test program:

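require 'rubygems'
require 'inline'

class MyTest
  inline do |builder|
    builder.include '<locale.h>'
    # An empty locale string makes setlocale() adopt the locale from the
    # environment; passing a null pointer would only query the current one.
    builder.c "
      void set_locale(void) {
        setlocale(LC_ALL, \"\");
      }"
  end
end

MyTest.new.set_locale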

If that works, you can remove the dependency on the LC_ALL environment
variable by changing to: setlocale(LC_ALL, "en_US.UTF-8"); or whatever.

However, this dependence on the C stdlib's half-baked idea of "locale"
is very hairy. I understand why Ruby doesn't call setlocale() normally -
it means that at least the normal behaviour is (a) sane, and (b) not
affected randomly by global environment variable settings.

To be honest, if you want a character class which always matches 0x20 to
0x7e and 0xa0 to 0xff, then you might as well just say so directly:

[\x20-\x7e\xa0-\xff]
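
As a sketch of that last suggestion applied to the test program from the first post: the explicit ranges do not consult the C locale at all, so the result should not depend on LC_ALL or on whether anything has called setlocale(). The /n flag is an addition not in the posts above; it marks the pattern as byte-oriented, which later Ruby versions (1.9 and up) generally require before they accept escapes like \xa0 in a regexp literal.

```ruby
# All 256 single-byte strings, as in the original post.
ASCII = (0..255).map { |c| c.chr }

# Explicit byte ranges instead of [[:print:]]: 0x20-0x7e is printable
# ASCII, 0xa0-0xff is the high-bit block seen in the irb dump.
# /n makes the pattern byte-oriented (an assumption added for newer Rubies).
PRINTABLE = ASCII.grep(/[\x20-\x7e\xa0-\xff]/n)

puts PRINTABLE.length
```

This should select 95 + 96 = 191 strings in both irb and a script. The trade-off is that the pattern hard-codes one idea of "printable" for single-byte data, which is the point: the answer no longer changes with the environment.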