Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > Ruby > strange speeds

Reply
Thread Tools

strange speeds

 
 
kryglik
Guest
Posts: n/a
 
      01-02-2006
Hello ruby people,
I've tried to make some speed test in ruby to find out how fast is
searching in hash (using has_key?)

The results were OK but I discovered curious thing -- when i generate
big hash the speed is different when i unserialize (marshal.load) the
same hash.

Unserialized Hash is 1.5x faster in has_key? method. Tried it with
GC.disable as well, results were same.

What am I missing?

Thanx for any ideas
 
Reply With Quote
 
 
 
 
Robert Klemme
Guest
Posts: n/a
 
      01-02-2006
kryglik <(E-Mail Removed)> wrote:
> Hello ruby people,
> I've tried to make some speed test in ruby to find out how fast is
> searching in hash (using has_key?)
>
> The results were OK but I discovered curious thing -- when i generate
> big hash the speed is different when i unserialize (marshal.load) the
> same hash.
>
> Unserialized Hash is 1.5x faster in has_key? method. Tried it with
> GC.disable as well, results were same.
>
> What am I missing?
>
> Thanx for any ideas


The first reason that comes to mind is different insertion order. But I
hardly believe it could make such a big difference. Care to post your
testing code?

Kind regards

robert

 
Reply With Quote
 
 
 
 
kryglik
Guest
Posts: n/a
 
      01-02-2006
Robert Klemme wrote:
> kryglik <(E-Mail Removed)> wrote:
>> Hello ruby people,
>> I've tried to make some speed test in ruby to find out how fast is
>> searching in hash (using has_key?)
>>
>> The results were OK but I discovered curious thing -- when i generate
>> big hash the speed is different when i unserialize (marshal.load) the
>> same hash.
>>
>> Unserialized Hash is 1.5x faster in has_key? method. Tried it with
>> GC.disable as well, results were same.
>>
>> What am I missing?
>>
>> Thanx for any ideas

>
> The first reason that comes to mind is different insertion order. But I
> hardly believe it could make such a big difference. Care to post your
> testing code?
>
> Kind regards
>
> robert
>


Well, insertion order will be same, won't be?

h = {}
h_slovo = []

if not true #HERE YOU SWITCH IF GENERATE/LOAD
for x in 1..40_000
if x % 2000 == 0
puts x
end

slovo = ""

for z in 1..32
slovo += (rand(25)+65).chr
end

if rand(10) == 3
h_slovo << slovo
end

h.update slovo => rand(1_000_000)
slovo = nil
end

f = File.new("hash.marshal","w+")
f.puts(Marshal.dump(h))
f.close

f = File.new("hash1.marshal","w+")
f.puts(Marshal.dump(h_slovo))
f.close
else

puts "loading hash"
h = Marshal.load(File.open("hash.marshal","r"))
puts "loading hash1"
h_slovo = Marshal.load(File.open("hash1.marshal","r"))
puts "done"
end

t1 = Time.now

for slovo1 in h_slovo
h.has_key? h_slovo
end

t2 = Time.now

puts "hash: " + h.size.to_s
puts "hled: " + h_slovo.size.to_s
puts t2-t1
 
Reply With Quote
 
Robert Klemme
Guest
Posts: n/a
 
      01-02-2006
kryglik <(E-Mail Removed)> wrote:
> Robert Klemme wrote:
>> kryglik <(E-Mail Removed)> wrote:
>>> Hello ruby people,
>>> I've tried to make some speed test in ruby to find out how fast is
>>> searching in hash (using has_key?)
>>>
>>> The results were OK but I discovered curious thing -- when i
>>> generate big hash the speed is different when i unserialize
>>> (marshal.load) the same hash.
>>>
>>> Unserialized Hash is 1.5x faster in has_key? method. Tried it with
>>> GC.disable as well, results were same.
>>>
>>> What am I missing?
>>>
>>> Thanx for any ideas

>>
>> The first reason that comes to mind is different insertion order.
>> But I hardly believe it could make such a big difference. Care to
>> post your testing code?
>>
>> Kind regards
>>
>> robert
>>

>
> Well, insertion order will be same, won't be?


Not necessarily. If Marshal.dump uses the current order in the hash for
writing then the insertion order during loading is likely to be different
from the original insertion order.

> h = {}
> h_slovo = []
>
> if not true #HERE YOU SWITCH IF GENERATE/LOAD
> for x in 1..40_000
> if x % 2000 == 0
> puts x
> end
>
> slovo = ""
>
> for z in 1..32
> slovo += (rand(25)+65).chr
> end
>
> if rand(10) == 3
> h_slovo << slovo
> end
>
> h.update slovo => rand(1_000_000)
> slovo = nil
> end
>
> f = File.new("hash.marshal","w+")
> f.puts(Marshal.dump(h))
> f.close
>
> f = File.new("hash1.marshal","w+")
> f.puts(Marshal.dump(h_slovo))
> f.close
> else
>
> puts "loading hash"
> h = Marshal.load(File.open("hash.marshal","r"))
> puts "loading hash1"
> h_slovo = Marshal.load(File.open("hash1.marshal","r"))
> puts "done"
> end
>
> t1 = Time.now
>
> for slovo1 in h_slovo
> h.has_key? h_slovo
> end
>
> t2 = Time.now
>
> puts "hash: " + h.size.to_s
> puts "hled: " + h_slovo.size.to_s
> puts t2-t1


Your time measurement is far too unprecise on a modern system. You'll have
to repeat lookups to get more accurate results. I get a difference between 5
and 10 percent with the attached code - regardless of GC enabled or
disabled. Strange though that the copy is always slower. This is an
interesting issue.

Note, if you freeze keys lookups in the original hash are much faster
because then hash keys are not copied on insertion (an optimisation of
unfrozen Strings as hash keys), i.e., during lookup if the key is the same
it's also the same object which will be the first test - and of course this
test is much faster than sequential comparison of characters.

Another note, a more realistic scenario would be if not 10% of all keys were
used for lookup but if there also were keys that are not present in the
hash.

Kind regards

robert

 
Reply With Quote
 
kryglik
Guest
Posts: n/a
 
      01-02-2006
Hello
thanx for nice script!

I find out that i got a little error ) [see the 'for' cycle in
benchmark part:
for slovo1 in h_slovo
h.has_key? h_slovo (SHOULD NOT BE h_slovo BUT slovo1)
end

so when I run your script i got confused about speeds like 1-3s instead
of my 60s. The difference between marshaled and generated was like 35s
and 60s (10000 keys in 100.000 hash).

Still the mysteria continues -- why the different speeds.

And also thanx for lecture of ruby optimization - not quite a guru (yet
).

Benchmark is a nice module
Robert Klemme wrote:
> kryglik <(E-Mail Removed)> wrote:
>
>> Robert Klemme wrote:
>>
>>> kryglik <(E-Mail Removed)> wrote:
>>>
>>>> Hello ruby people,
>>>> I've tried to make some speed test in ruby to find out how fast is
>>>> searching in hash (using has_key?)
>>>>
>>>> The results were OK but I discovered curious thing -- when i
>>>> generate big hash the speed is different when i unserialize
>>>> (marshal.load) the same hash.
>>>>
>>>> Unserialized Hash is 1.5x faster in has_key? method. Tried it with
>>>> GC.disable as well, results were same.
>>>>
>>>> What am I missing?
>>>>
>>>> Thanx for any ideas
>>>
>>>
>>> The first reason that comes to mind is different insertion order.
>>> But I hardly believe it could make such a big difference. Care to
>>> post your testing code?
>>>
>>> Kind regards
>>>
>>> robert
>>>

>>
>> Well, insertion order will be same, won't be?

>
>
> Not necessarily. If Marshal.dump uses the current order in the hash for
> writing then the insertion order during loading is likely to be
> different from the original insertion order.
>
>> h = {}
>> h_slovo = []
>>
>> if not true #HERE YOU SWITCH IF GENERATE/LOAD
>> for x in 1..40_000
>> if x % 2000 == 0
>> puts x
>> end
>>
>> slovo = ""
>>
>> for z in 1..32
>> slovo += (rand(25)+65).chr
>> end
>>
>> if rand(10) == 3
>> h_slovo << slovo
>> end
>>
>> h.update slovo => rand(1_000_000)
>> slovo = nil
>> end
>>
>> f = File.new("hash.marshal","w+")
>> f.puts(Marshal.dump(h))
>> f.close
>>
>> f = File.new("hash1.marshal","w+")
>> f.puts(Marshal.dump(h_slovo))
>> f.close
>> else
>>
>> puts "loading hash"
>> h = Marshal.load(File.open("hash.marshal","r"))
>> puts "loading hash1"
>> h_slovo = Marshal.load(File.open("hash1.marshal","r"))
>> puts "done"
>> end
>>
>> t1 = Time.now
>>
>> for slovo1 in h_slovo
>> h.has_key? h_slovo
>> end
>>
>> t2 = Time.now
>>
>> puts "hash: " + h.size.to_s
>> puts "hled: " + h_slovo.size.to_s
>> puts t2-t1

>
>
> Your time measurement is far too unprecise on a modern system. You'll
> have to repeat lookups to get more accurate results. I get a difference
> between 5 and 10 percent with the attached code - regardless of GC
> enabled or disabled. Strange though that the copy is always slower.
> This is an interesting issue.
>
> Note, if you freeze keys lookups in the original hash are much faster
> because then hash keys are not copied on insertion (an optimisation of
> unfrozen Strings as hash keys), i.e., during lookup if the key is the
> same it's also the same object which will be the first test - and of
> course this test is much faster than sequential comparison of characters.
>
> Another note, a more realistic scenario would be if not 10% of all keys
> were used for lookup but if there also were keys that are not present in
> the hash.
>
> Kind regards
>
> robert

 
Reply With Quote
 
Robert Klemme
Guest
Posts: n/a
 
      01-03-2006
kryglik <(E-Mail Removed)> wrote:
> Hello
> thanx for nice script!
>
> I find out that i got a little error ) [see the 'for' cycle in
> benchmark part:
> for slovo1 in h_slovo
> h.has_key? h_slovo (SHOULD NOT BE h_slovo BUT slovo1)
> end
>
> so when I run your script i got confused about speeds like 1-3s
> instead of my 60s. The difference between marshaled and generated was
> like 35s and 60s (10000 keys in 100.000 hash).


In this case it's probably due to hash calculation for the hash you
accidentally used as key.

> Still the mysteria continues -- why the different speeds.


Yeah, I guess we would have to dive into the sources to find out...

> And also thanx for lecture of ruby optimization - not quite a guru
> (yet ).


Take your time.

> Benchmark is a nice module


It really is!

Kind regards

robert

 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Strange download speeds Kiwiboy NZ Computing 4 05-08-2009 03:55 AM
Strange upload/download speeds NervSys Computer Support 1 02-24-2009 11:39 PM
Internet Sharing: Equal upload speeds but un-equal download speeds =?Utf-8?B?TkpU?= Wireless Networking 3 09-15-2007 06:22 AM
Core speeds and memory speeds??? Xeno Chauvin Computer Support 1 10-03-2003 03:21 AM
Strange CD-R Speeds MELT Computer Support 13 07-22-2003 04:04 AM



Advertisments