Velocity Reviews


Joey Zhou 03-09-2011 03:45 AM

Unexpected problem: hash[key] << value
 
# ruby 1.9.2p180 (2011-02-18) [i386-mingw32]

get_page_hash = {}
get_page_hash.default = []

File.foreach("page.txt") do |line|
  word, page = line.chomp.split(':')
  get_page_hash[word] << page # the problem is here
end

p get_page_hash['Aword'] # => ["1", "2", "3", "4", "5"]
p get_page_hash['Bword'] # => ["1", "2", "3", "4", "5"]
p get_page_hash.default # => ["1", "2", "3", "4", "5"]

__END__

content of page.txt:
Aword:1
Bword:2
Cword:3
Aword:4
Dword:5


Simple program, clear purpose. I don't know why get_page_hash.default
becomes ["1", "2", "3", "4", "5"]; it seems ridiculous.

Only if I change that line to:

get_page_hash[word] += [page]

I get what I want:

p get_page_hash['Aword'] # => ["1", "4"]
p get_page_hash['Bword'] # => ["2"]
p get_page_hash.default # => []

I think using "<<" is intuitive, but the result is unexpected. What's
wrong with it?

Thank you!

Joey

--
Posted via http://www.ruby-forum.com/.


Haruka YAGNI 03-09-2011 05:05 AM

Re: Unexpected problem: hash[key] << value
 
On Wed, Mar 9, 2011 at 12:45 PM, Joey Zhou <yimutang@gmail.com> wrote:
> I think using "<<" is intuitive, but the result is unexpected. What's
> wrong with it?


Hash#default returns the same Array object for every missing key.
The first version cannot work, because "<<" modifies that shared default
array, not a per-key element.

Here is a good example from the manual, but it is in Japanese... sorry
http://www.ruby-lang.org/ja/man/html/trap_Hash.html
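A minimal sketch of the point above, with made-up keys and values rather than the contents of page.txt. It assumes nothing beyond core Hash and Array, and shows that "<<" never stores anything in the hash; it only mutates the one default array:

```ruby
# Illustrative reproduction; 'Aword', 'Bword', 'Cword' are example keys only.
pages = Hash.new([])      # a single Array object is the default for every missing key

pages['Aword'] << '1'     # looks up the default, then appends to it in place
pages['Bword'] << '2'     # same default object again; no key is ever assigned

p pages.size              # => 0
p pages.key?('Aword')     # => false
p pages.default           # => ["1", "2"]
p pages['Cword']          # => ["1", "2"]
```

The hash itself stays empty, which is why every lookup, including one for a key that was never mentioned, returns the same accumulated array.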

--
Haruka YAGNI
hyagni@gmail.com


Mark Beek 03-09-2011 05:35 AM

Re: Unexpected problem: hash[key] << value
 
I just stumbled across this surprising behavior myself. It's the first
counter-intuitive mechanism I have come across in my short sweet
experience with Ruby.

Check this thread for an elaborate discussion (in English) of this
behavior:

http://www.ruby-forum.com/topic/134424#new

Here's my take:

Before we look at your case, let's look at a case that actually works as
you'd expect: initializing a hash with a Fixnum:

Code
h = Hash.new(0)
puts "h['key1']: #{h['key1']}"
puts "h['key2']: #{h['key2']}"
h['key1'] += 1
puts "after updating key1"
puts "h['key1']: #{h['key1']}"
puts "h['key2']: #{h['key2']}"

Result:
h['key1']: 0
h['key2']: 0
after updating key1
h['key1']: 1
h['key2']: 0

Perfect! Mighty handy for word count programs and all sorts of other use
cases.

Which would lead you to expect the following behavior when you
initialize a hash with an empty array, then append:

Code
h = Hash.new([])
puts "h['key1']: #{h['key1']}"
puts "h['key2']: #{h['key2']}"
h['key1'] << 1
puts "after updating key1"
puts "h['key1']: #{h['key1']}"
puts "h['key2']: #{h['key2']}"

Result
h['key1']: []
h['key2']: []
after updating key1
h['key1']: [1]
h['key2']: [] #<-- what you'd expect, but NOT what you get

The actual result is the following:

h['key1']: []
h['key2']: []
after updating key1
h['key1']: [1]
h['key2']: [1]
. . . and so on

The problem is that when you initialize a hash with a mutable default
value, all of the defaults are actually references to THE SAME OBJECT.
So when you append to the default array in one hash value, you're
actually changing them all. Witness:

puts "#{h['key1'].object_id}"
puts "#{h['key2'].object_id}"
puts "#{h['key3'].object_id}"

Result:
116528
116528
116528

By contrast, when you update a value with the += construction rather
than <<, you're actually creating a new array object for that value. So
that particular one is no longer referring to the default value.
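To make the contrast concrete, here is a small sketch (the variable name default_array is introduced only for illustration). It relies on the fact that `h[k] += x` is shorthand for `h[k] = h[k] + x`, so a new array is built by Array#+ and then assigned to the key:

```ruby
h = Hash.new([])
default_array = h.default            # keep a reference so we can compare identities

h['key1'] += [1]                     # same as: h['key1'] = h['key1'] + [1]

p h.keys                             # => ["key1"]
p h['key1']                          # => [1]
p h['key1'].equal?(default_array)    # => false
p h.default                          # => []
p h['key2']                          # => []
```

Because the assignment stores a fresh array under 'key1', the shared default is never touched and other keys still see an empty array.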

The thread referred to above mentions other ways to get what you'd
expect with a default empty array. Still, I gotta admit that I simply
don't understand why Hash.new([]) works the way it does. Who would want
to create a Hash table where changing a single value can potentially
change all other values, past, present, and to come? Talk about side
effects gone wild!

If anyone can explain the rationale for this behavior, I'd really
appreciate it. I'm probably just missing something.



Brian Candler 03-09-2011 08:08 AM

Re: Unexpected problem: hash[key] << value
 
Mark Beek wrote in post #986380:
> Which would lead you to expect the following behavior when you
> initialize a hash with an empty array, then append:
>
> Code
> h = Hash.new([])
> puts "h['key1']: #{h['key1']}"
> puts "h['key2']: #{h['key2']}"
> h['key1'] << 1
> puts "after updating key1"
> puts "h['key1']: #{h['key1']}"
> puts "h['key2']: #{h['key2']}"
>
> Result
> h['key1']: []
> h['key2']: []
> after updating key1
> h['key1']: [1]
> h['key2']: [] #<-- what you'd expect, but NOT what you get


To get that behaviour, you need the Hash to create a *new* empty array
for every unknown element. What I do is:

h = Hash.new { |o,k| o[k] = [] }
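Applied to the original question, the block form might look like the sketch below. To keep it self-contained it assumes the lines of page.txt are held in an in-memory array (the name lines is hypothetical) instead of being read with File.foreach:

```ruby
# Stand-in for the contents of page.txt from the first post.
lines = ["Aword:1", "Bword:2", "Cword:3", "Aword:4", "Dword:5"]

# The block runs only for a missing key and stores a new array under that key.
get_page_hash = Hash.new { |hash, key| hash[key] = [] }

lines.each do |line|
  word, page = line.chomp.split(':')
  get_page_hash[word] << page
end

p get_page_hash['Aword']   # => ["1", "4"]
p get_page_hash['Bword']   # => ["2"]
p get_page_hash.default    # => nil
```

One side effect worth knowing: with this block, merely reading a missing key stores an empty array under it, so use key? or fetch when you only want to test for presence.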

> The problem is that when you initialize a hash with a mutable default
> value, all of the defaults are actually references to THE SAME OBJECT.

...
> If anyone can explain the rationale for this behavior, I'd really
> appreciate it. I'm probably just missing something.


The question is, how else could it work in the general case?

Perhaps you pass a prototype object, and the Hash constructor would call
dup on that object every time it needs a new distinct instance? No,
that doesn't work, because .dup is only a shallow copy. Check out:

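a = [[1,2],[3,4]]
b = a.dup # shallow copy: b is a new outer array, but the inner arrays are shared
b[0] << 3
p a # => [[1, 2, 3], [3, 4]]
p b # => [[1, 2, 3], [3, 4]]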

Perhaps you could pass a Class, and then Hash would call your class's
new method every time it wanted an instance? Sure, you could pass Array
in this case, but it's quite restrictive. And the simple case of
Hash.new(0) wouldn't work.

So to work in the general case you have to give it some code to execute
to create a new object every time one is needed - a factory block.

The same applies with arrays: compare

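a = Array.new(5, [])     # five references to one shared array
b = Array.new(5) { [] }  # block runs once per slot: five distinct arrays
puts a.map { |x| x.object_id } # the same id printed five times
puts b.map { |x| x.object_id } # five different ids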

Regards,

Brian.



Robert Klemme 03-09-2011 08:36 AM

Re: Unexpected problem: hash[key] << value
 
On Wed, Mar 9, 2011 at 6:35 AM, Mark Beek <markbeek@carolina.rr.com> wrote:
> I just stumbled across this surprising behavior myself. It's the first
> counter-intuitive mechanism I have come across in my short sweet
> experience with Ruby.
>
> Check this thread for an elaborate discussion (in English) of this
> behavior:
>
> http://www.ruby-forum.com/topic/134424#new
>
> Here's my take:
>
> Before we look at your case, let's look at a case that actually works as
> you'd expect: initializing a hash with a Fixnum:
>
> Code
> h = Hash.new(0)
> puts "h['key1']: #{h['key1']}"
> puts "h['key2']: #{h['key2']}"
> h['key1'] += 1
> puts "after updating key1"
> puts "h['key1']: #{h['key1']}"
> puts "h['key2']: #{h['key2']}"
>
> Result:
> h['key1']: 0
> h['key2']: 0
> after updating key1
> h['key1']: 1
> h['key2']: 0
>
> Perfect! Mighty handy for word count programs and all sorts of other use
> cases.
>
> Which would lead you to expect the following behavior when you
> initialize a hash with an empty array, then append:
>
> Code
> h = Hash.new([])
> puts "h['key1']: #{h['key1']}"
> puts "h['key2']: #{h['key2']}"
> h['key1'] << 1
> puts "after updating key1"
> puts "h['key1']: #{h['key1']}"
> puts "h['key2']: #{h['key2']}"
>
> Result
> h['key1']: []
> h['key2']: []
> after updating key1
> h['key1']: [1]
> h['key2']: [] #<-- what you'd expect, but NOT what you get
>
> The actual result is the following:
>
> h['key1']: []
> h['key2']: []
> after updating key1
> h['key1']: [1]
> h['key2']: [1]
> . . . and so on
>
> The problem is that when you initialize a hash with a mutable default
> value, all of the defaults are actually references to THE SAME OBJECT.
> So when you append to the default array in one hash value, you're
> actually changing them all. Witness:
>
> puts "#{h['key1'].object_id}"
> puts "#{h['key2'].object_id}"
> puts "#{h['key3'].object_id}"
>
> Result:
> 116528
> 116528
> 116528
>
> By contrast, when you update a value with the += construction rather
> than <<, you're actually creating a new array object for that value. So
> that particular one is no longer referring to the default value.


Well, in this case actually the better idiom is this:

h = Hash.new {|h,k| h[k] = []}
...

h[key] << something

Reason: Array#+ will create a new object every time you add something
while the idiom presented above only ever creates one Array per key.
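The difference in allocations can be observed by comparing object identity before and after a second append. This is only a sketch; the hash names plus and block and the key 'k' are made up for illustration:

```ruby
# With +=, each append replaces the stored array with a newly built one.
plus = Hash.new([])
plus['k'] += [1]
first_array = plus['k']
plus['k'] += [2]
p plus['k']                        # => [1, 2]
p plus['k'].equal?(first_array)    # => false

# With the block form, the array stored on first access is mutated in place.
block = Hash.new { |hash, key| hash[key] = [] }
block['k'] << 1
kept_array = block['k']
block['k'] << 2
p block['k']                       # => [1, 2]
p block['k'].equal?(kept_array)    # => true
```

Both approaches produce the same contents; the block form simply avoids building a new array on every append.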

> The thread referred to above mentions other ways to get what you'd
> expect with a default empty array. Still, I gotta admit that I simply
> don't understand why Hash.new([]) works the way it does. Who would want
> to create a Hash table where changing a single value can potentially
> change all other values, past, present, and to come? Talk about side
> effects gone wild!


Well, first of all this is the default return value. This does not
necessarily mean that it will be modified. You might do something
like

h = Hash.new("missing".freeze)
...

puts h[key]
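The same freezing trick can guard an array default against accidental mutation. The sketch below is an illustration rather than something proposed in this thread; it rescues RuntimeError because the exact exception class depends on the Ruby version (newer versions raise FrozenError, which is a subclass of RuntimeError):

```ruby
h = Hash.new([].freeze)    # the shared default can no longer be modified in place

begin
  h['key1'] << 1           # attempts to mutate the frozen default
rescue RuntimeError => e   # FrozenError on newer Rubies, plain RuntimeError on older ones
  puts "append refused: #{e.class}"
end

p h.default                # => []

h['key1'] += [1]           # Array#+ builds a new, unfrozen array and stores it
p h['key1']                # => [1]
```

This turns the silent sharing bug into an immediate error while still allowing the += style of update.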

And then of course there is a very common idiom

counters = Hash.new 0
...
counters[key] += 1

> If anyone can explain the rationale for this behavior, I'd really
> appreciate it. I'm probably just missing something.


Hopefully that explanation helps.

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/


Joey Zhou 03-09-2011 11:59 AM

Re: Unexpected problem: hash[key] << value
 
Haruka YAGNI wrote in post #986377:
> On Wed, Mar 9, 2011 at 12:45 PM, Joey Zhou <yimutang@gmail.com> wrote:
> Hash#default returns the same Array object for every missing key.
> The first version cannot work, because "<<" modifies that shared default
> array, not a per-key element.
>
> Here is a good example from the manual, but it is in Japanese... sorry
> http://www.ruby-lang.org/ja/man/html/trap_Hash.html


Thank you. I can read the code :)



Joey Zhou 03-09-2011 12:01 PM

Re: Unexpected problem: hash[key] << value
 
Robert Klemme wrote in post #986401:
> Well, in this case actually the better idiom is this:
>
> h = Hash.new {|h,k| h[k] = []}

This is actually what I need. Thank you.



