String doesnt auto dup on modification

 
 
Stefan Lang
 
      01-22-2009
2009/1/21 Tom Cloyd <(E-Mail Removed)>:
> Stefan Lang wrote:
>>
>> 2009/1/21 RK Sentinel <(E-Mail Removed)>:
>>
>>>
>>> I'm writing my first largeish app. One issue that gets me frequently is
>>> this:
>>>
>>> I define a string in one class. Some other class references it, and
>>> modifies it. I (somehow) expected that when another referer modifies the
>>> reference, ruby would automatically dup() the string.
>>>
>>> Anyway, through trial and error, I start dup()'ing strings myself. I am
>>> aware of freeze().
>>>
>>> But would like to know how others handle this generally in large apps.
>>>
>>> - Do you keep freezing Strings you make in your classes to avoid
>>> accidental change
>>>
>>> - Do you habitually dup() your string ?
>>>
>>> Is there some clean way of handling this that I am missing.
>>>

>>
>> This is a well known "problem" with all languages that
>> have mutable strings. The solution is simple:
>>
>> * Use destructive string methods only after profiling has shown
>> that string manipulation is the bottleneck.
>>
>> * Don't mutate a string after passing it across encapsulation
>> boundaries.
>>
>> Freezing certain strings can be beneficial in the same way
>> assertions are, habitually duping strings is a bad practice, IMO.
>>
>> Stefan
>>
>>
>>

>
> If this is an utterly dumb question, just ignore it. However, I AM perplexed
> by this response. Here's why:
>
> I thought it was OK for an object to receive input, and output a modified
> version of same. If they don't get to do that, their use seems rather
> limited. In my current app, I create a log object, and various classes write
> to it. I don't create new objects every time I want to add a log entry. Why
> would I do that? Makes no sense to me. I might want to do exactly the same
> thing to a string. You seem to be saying this is bad form. I can see that
> there are cases where you want the string NOT to be modified, but you seem to
> be saying that to modify the original string at all is bad.
>
> It makes perfect sense to me to pass an object (string, in this case) across
> an encapsulation boundary specifically to modify it.
>
> What am I missing here?


There's nothing wrong with it if the purpose of the method
is to manipulate the string and it's documented clearly.

Every rule has exceptions.
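
A minimal sketch of that exception, using the shared log from the question. The method name append_entry! and the entry format are made up for illustration; the only point is that both the name and the comment announce the in-place change.

```ruby
# Appends one line to +log+ IN PLACE and returns the same String object.
# The trailing "!" and this comment are the documentation: callers know
# their string will change. (append_entry! is a hypothetical helper.)
def append_entry!(log, message)
  log << message << "\n"
end

log = "".dup                 # one shared, deliberately mutable log
append_entry!(log, "parser started")
append_entry!(log, "parser finished")

puts log
```

Every caller writes to the same object here, which is exactly what a log wants. The aliasing is the feature, not the bug.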

Stefan

 
 
 
 
 
Robert Klemme
 
      01-22-2009
On 22.01.2009 15:11, RK Sentinel wrote:
> As and when I discover such bugs in my code, I start adding dup(), and
> yes, sometimes these lines bomb when another datatype is passed (I had
> asked this in a thread recently: respond_to?(:dup) was passing, but the
> dup was failing).


> Anyway, I realize it's more my incompetence, and I must be careful with
> destructive methods, but I just thought maybe there's some other way to
> do this, so I am not leaving it to my memory.


I would not call this "incompetence": we live and learn. Basically this
is a typical trade-off issue: you trade efficiency (no copy) for safety
(no aliasing). As often with trade-offs, there is no clear 100% rule
which tells you exactly what is *always* correct. Instead you have to
think about it - when creating something like a library even more so -
and then deliberately decide which way you go.

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
 
 
 
 
 
Robert Klemme
 
      01-22-2009
On 22.01.2009 12:23, Brian Candler wrote:
> RK Sentinel wrote:
>> I guess I'll just have to remember not to modify a string I take from
>> another class.

>
> Or as was said earlier on: simply don't use destructive string methods
> unless you really have to. Pretend that all strings are frozen.
>
> You can enforce this easily enough in your unit tests: e.g. you could
> pass in strings that really are frozen, and check that your code still
> works
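
The frozen-string check suggested in the quoted text could look roughly like this. Both methods are invented for the example; a real test would pass frozen strings into your own code instead.

```ruby
# Hypothetical method under test: it must not modify its argument.
def shout(text)
  text.upcase + "!"          # non-destructive: builds new strings
end

# A deliberately careless variant that mutates its argument.
def shout_in_place(text)
  text.upcase!
  text << "!"
end

input = "hello".freeze

result = shout(input)        # works, input is untouched
puts result

caught = nil
begin
  shout_in_place(input)      # raises because input is frozen
rescue => e
  caught = e
  puts "caught #{e.class}"
end
```

The exact exception class depends on the Ruby version, so the sketch rescues StandardError rather than naming one.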


Well, this is only half of the story: it does not save you from outside
code changing the instance under your hands. Aliasing works both ways,
i.e. you can screw up the receiver but also the caller.
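
A small illustration of aliasing in both directions. The Label class is hypothetical; any class that stores a string without copying it behaves the same way.

```ruby
class Label
  attr_reader :text

  def initialize(text)
    @text = text             # stores the caller's object, no copy
  end
end

title = "Draft".dup
label = Label.new(title)

# Direction 1: the caller changes the string after handing it over.
title << " v2"
puts label.text              # "Draft v2"

# Direction 2: a consumer changes the string it got from the reader.
label.text << " (final)"
puts title                   # "Draft v2 (final)"
```

Avoiding destructive methods inside Label only closes direction 2. Direction 1 needs a copy (or a frozen string) at the boundary.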

Cheers

robert

--
remember.guy do |as, often| as.you_can - without end
 
 
Robert Dober
 
      01-23-2009
On Wed, Jan 21, 2009 at 8:15 PM, RK Sentinel <(E-Mail Removed)> wrote:
> I'm writing my first largeish app. One issue that gets me frequently is
> this:
>
> I define a string in one class. Some other class references it, and
> modifies it. I (somehow) expected that when another referer modifies the
> reference, ruby would automatically dup() the string.
>
> Anyway, through trial and error, I start dup()'ing strings myself. I am
> aware of freeze().
>
> But would like to know how others handle this generally in large apps.
>
> - Do you keep freezing Strings you make in your classes to avoid
> accidental change
>
> - Do you habitually dup() your string ?

I try to, and I try to get rid of all references to the original
string as soon as possible.
This is because incremental GC works so well nowadays and allows for
some clean code.
Freezing a string seems like a good idea sometimes, but if that means
holding on to the object longer than needed this might not be such a
good idea after all.

R.
--
It is change, continuing change, inevitable change, that is the
dominant factor in society today. No sensible decision can be made any
longer without taking into account not only the world as it is, but
the world as it will be ... ~ Isaac Asimov

 
 
Tom Cloyd
 
      01-23-2009
Robert Dober wrote:
> On Wed, Jan 21, 2009 at 8:15 PM, RK Sentinel <(E-Mail Removed)> wrote:
>
>> I'm writing my first largeish app. One issue that gets me frequently is
>> this:
>>
>> I define a string in one class. Some other class references it, and
>> modifies it. I (somehow) expected that when another referer modifies the
>> reference, ruby would automatically dup() the string.
>>
>> Anyway, through trial and error, I start dup()'ing strings myself. I am
>> aware of freeze().
>>
>> But would like to know how others handle this generally in large apps.
>>
>> - Do you keep freezing Strings you make in your classes to avoid
>> accidental change
>>
>> - Do you habitually dup() your string ?
>>

> I try to, and I try to get rid of all references to the original
> string as soon as possible.
> This is because incremental GC works so well nowadays and allows for
> some clean code.
> Freezing a string seems like a good idea sometimes, but if that means
> holding on to the object longer than needed this might not be such a
> good idea after all.
>
> R.
>

Robert, for those of us who are considerably more clueless, what is
"incremental GC"?

Thanks,

t.

--

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~
Tom Cloyd, MS MA, LMHC - Private practice Psychotherapist
Bellingham, Washington, U.S.A: (360) 920-1226
<< http://www.velocityreviews.com/forums/(E-Mail Removed) >> (email)
<< TomCloyd.com >> (website)
<< sleightmind.wordpress.com >> (mental health weblog)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~


 
 
RK Sentinel
 
      01-23-2009

>>
>> - Do you habitually dup() your string ?

> I try to, and I try to get rid of all references to the original
> string as soon as possible.
> This is because incremental GC works so well nowadays and allows for
> some clean code.
> Freezing a string seems like a good idea sometimes, but if that means
> holding on to the object longer than needed this might not be such a
> good idea after all.
>
> R.


Interesting point. So freezing a string prevents collection as long as
there are referrers (obvious), while duping it helps release the original
one, but you still have a new string in memory. So on net you are still
using the same amount of memory.

Is there a writeup on Ruby's GC? My knowledge of GC is Java-based,
and it is 5 years old (based on the Inside the VM book and
various other articles on sun.com). Is Ruby's GC "generational"? In
that scheme, IIRC, an older object would have moved to an older generation
and be less likely to be collected.

Any links on Ruby's GC would be appreciated.
--
Posted via http://www.ruby-forum.com/.

 
 
Mike Gold
 
      01-23-2009
RK Sentinel wrote:
>
> - Do you keep freezing Strings you make in your classes to avoid
> accidental change
>
> - Do you habitually dup() your string ?
>


One possibility is copy-on-write.

require 'delegate'

class CopyOnWriteString < DelegateClass(String)
  DESTRUCTIVE_METHODS =
    String.public_instance_methods(false).grep(/!/).map(&:to_sym) +
    [
      :[]=,
      :<<,
      :concat,
      :initialize_copy,
      :replace,
      :setbyte,
      # ... and probably others ...
    ]

  DESTRUCTIVE_METHODS.each { |m|
    define_method(m) { |*args, &block|
      __setobj__(__getobj__.dup)
      __getobj__.send(m, *args, &block)
    }
  }
end

class Person
  def initialize(name)
    @name = name
  end
  def name
    CopyOnWriteString.new(@name)
  end
end

person = Person.new("fred")
name = person.name

p name #=> "fred"
p person.name #=> "fred"

name << " flintstone"
p name #=> "fred flintstone"
p person.name #=> "fred"

(I've used some 1.8.7+ only features.)

 
 
Brian Candler
 
      01-23-2009
RK Sentinel wrote:
> Thanks for all the helpful replies. It's my first venture: a widget
> library.
>
> Here's an example: Sometimes a string is passed to a class, say, using
> its set_buffer method (which is an _optional_ method).
>
> set_buffer just assigns it to @buffer. But deep within the class this
> variable is being edited using insert() or slice!() (and this IS
> necessary) since the widget is an editing widget.


Thanks, so I guess the API is something like this:

edit_field.buffer = "foo"

... some time later, after user has clicked OK ...

f.write(edit_field.buffer)

Now, if there is a compelling reason for this object to perform
"in-place" editing on the buffer then by all means do, and document
this, but it will lead to the aliasing problems you describe.

It may be simpler and safer just to use non-destructive methods inside
your class.

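# destructive: removes y characters starting at x from @buffer itself
@buffer.slice!(x,y)
# non-destructive alternative: build a new string without that range
@buffer = @buffer[0,x] + @buffer[(x+y)..-1].to_s

# destructive
@buffer.insert(pos, text)
# non-destructive alternative
@buffer = @buffer[0,pos] + text + @buffer[pos..-1]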

In effect, this is doing a 'dup' each time. It has to; since Ruby
doesn't do reference-counting it has no idea whether any other object in
the system is holding a reference to the original object or not.
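
A quick way to see the "dup each time" effect, with throwaway variable names. The non-destructive form rebinds one variable to a new object and leaves every other reference alone.

```ruby
original = "hello world".dup
buffer   = original          # two names, one object

# Non-destructive insert: builds a new String and rebinds only `buffer`.
buffer = buffer[0, 5] + "," + buffer[5..-1]

puts buffer                  # "hello, world"
puts original                # "hello world"
puts buffer.equal?(original) # false
```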

The only problem with this is if @buffer is a multi-megabyte object and
you don't want to keep copying it. In this case, doing a single dup
up-front would allow you to use the destructive methods safely.

class Editor
  def buffer
    @buffer
  end
  def buffer=(x)
    @buffer = x.dup
  end
end

The overhead of a single copy is small, and in any case this is probably
what is needed here (e.g. if the user makes some edits but clicks
'cancel' instead of 'save' then you may want to keep the old string
untouched)
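
Here is a sketch of how the single up-front copy plays out for the cancel case. The type method and the file name are assumptions for the example, not part of the widget library being discussed.

```ruby
class Editor
  attr_reader :buffer

  def buffer=(text)
    @buffer = text.dup       # one copy up front
  end

  # Hypothetical editing operation; mutates the private copy only.
  def type(pos, text)
    @buffer.insert(pos, text)
  end
end

saved = "report.txt"
editor = Editor.new
editor.buffer = saved
editor.type(6, "-final")

puts editor.buffer           # "report-final.txt"
puts saved                   # "report.txt" (what "cancel" falls back to)
```

Note that the reader still hands out the live buffer, so a caller who mutates editor.buffer changes the editor's state. Copying in the reader as well would close that direction too.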

You could try deferring the dup until the first time you call a
destructive method on the string, but the complexity overhead is
unlikely to be worth it.
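
For comparison, a deferred dup might look like the sketch below. LazyEditor, @owned and own_buffer! are invented names. Every destructive operation has to remember to call the guard, which is the complexity cost mentioned above.

```ruby
class LazyEditor
  def buffer=(text)
    @buffer = text
    @owned  = false          # still sharing the caller's object
  end

  def buffer
    @buffer
  end

  def insert(pos, text)
    own_buffer!
    @buffer.insert(pos, text)
  end

  private

  # Copy once, just before the first in-place change.
  def own_buffer!
    return if @owned
    @buffer = @buffer.dup
    @owned  = true
  end
end

source = "abc"
editor = LazyEditor.new
editor.buffer = source
shared_before_edit = editor.buffer.equal?(source)

editor.insert(1, "-")
puts editor.buffer           # "a-bc"
puts source                  # "abc"
```

Until the first edit the editor still shares the caller's object, so a caller who mutates the string in that window is seen by the editor. The up-front dup does not have this gap.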

 
 
RK Sentinel
 
      01-23-2009
Brian Candler wrote:

> The overhead of a single copy is small, and in any case this is probably
> what is needed here (e.g. if the user makes some edits but clicks
> 'cancel' instead of 'save' then you may want to keep the old string
> untouched)
>
> You could try deferring the dup until the first time you call a
> destructive method on the string, but the complexity overhead is
> unlikely to be worth it.


Yes, I've got the set_buffer doing a dup (if it's a string).

At the same time, the get_buffer also does a dup, since often the Field
is created blank (I did mention that set_buffer is an optional method
for editing a default value, if present).
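
Roughly, that arrangement could be sketched as follows. Only the names set_buffer and get_buffer come from the description above. The type_char method and the to_s conversion of non-string values are assumptions of this sketch.

```ruby
class Field
  def initialize
    @buffer = "".dup         # fields often start out blank
  end

  # Optional: preload a default value. Non-strings are converted with
  # to_s (an assumption); the result is always a private copy.
  def set_buffer(value)
    @buffer = value.to_s.dup
  end

  def get_buffer
    @buffer.dup              # callers get a snapshot, not the live buffer
  end

  # Hypothetical keystroke handler.
  def type_char(pos, char)
    @buffer.insert(pos, char)
  end
end

default = "abc"
field = Field.new
field.set_buffer(default)
field.type_char(3, "d")

snapshot = field.get_buffer
snapshot << "!!!"            # changes only the snapshot

puts field.get_buffer        # "abcd"
puts default                 # "abc"
```

Copying on the way in and on the way out closes both aliasing directions, at the cost of one copy per call.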

It's a real TextField or Field. So you would be typing away in the field.
Each character you type is inserted (or removed if it's Del or BS) -
exactly as I am typing away in this edit box.

The CopyOnWriteString is impressive; it shows what can be done with
Ruby.

 
 
Mike Gold
 
      01-23-2009
Brian Candler wrote:
>
> You could try deferring the dup until the first time you call a
> destructive method on the string, but the complexity overhead is
> unlikely to be worth it.


Careful. Intuition is worse than useless here. The only way to know is
to measure the particular case in question.

class Person
  def initialize(name)
    @name = name
  end
  def name_cow
    CopyOnWriteString.new(@name)
  end
  def name_dup
    @name.dup
  end
end

require 'benchmark'

n = 10_000
sizes = [100, 1000, 10_000, 100_000]
objects = sizes.inject(Hash.new) { |acc, size|
  acc.merge!(size => Person.new("x"*size))
}

sizes.each { |size|
  object = objects[size]
  puts "-"*40
  puts "iterations: #{n} size: #{size}"
  Benchmark.bm { |x|
    x.report("cow w/o change") {
      n.times { object.name_cow }
    }
    x.report("dup w/o change") {
      n.times { object.name_dup }
    }
    x.report("cow w/ change") {
      n.times { object.name_cow << "y" }
    }
    x.report("dup w/ change") {
      n.times { object.name_dup << "y" }
    }
  }
}

ruby 1.8.7 (2008-08-11 patchlevel 72) [i386-cygwin]
----------------------------------------
iterations: 10000 size: 100
                    user     system      total        real
cow w/o change  0.031000   0.000000   0.031000 (  0.031000)
dup w/o change  0.032000   0.000000   0.032000 (  0.031000)
cow w/ change   0.171000   0.000000   0.171000 (  0.172000)
dup w/ change   0.047000   0.000000   0.047000 (  0.047000)
----------------------------------------
iterations: 10000 size: 1000
                    user     system      total        real
cow w/o change  0.032000   0.000000   0.032000 (  0.031000)
dup w/o change  0.046000   0.000000   0.046000 (  0.047000)
cow w/ change   0.172000   0.000000   0.172000 (  0.172000)
dup w/ change   0.063000   0.000000   0.063000 (  0.062000)
----------------------------------------
iterations: 10000 size: 10000
                    user     system      total        real
cow w/o change  0.031000   0.000000   0.031000 (  0.032000)
dup w/o change  0.109000   0.000000   0.109000 (  0.109000)
cow w/ change   0.282000   0.000000   0.282000 (  0.281000)
dup w/ change   0.156000   0.000000   0.156000 (  0.156000)
----------------------------------------
iterations: 10000 size: 100000
                    user     system      total        real
cow w/o change  0.031000   0.000000   0.031000 (  0.032000)
dup w/o change  0.672000   0.000000   0.672000 (  0.672000)
cow w/ change   1.406000   0.000000   1.406000 (  1.406000)
dup w/ change   1.219000   0.000000   1.219000 (  1.219000)

Destructive methods are less common in real code, and especially so when
the string comes from an attr_reader method. It is likely that the case
to optimize is the non-destructive call (the first of each quadruplet
above). But we have to profile the specific situation.

 