How to duck type? - the psychology of static typing in Ruby

 
 
Tim Bates
      05-17-2004
Hi all,
Following a discussion in #ruby-lang, I have a suggestion about how to
approach Duck Typing. Below is my dissertation on the subject. My
intention is to incorporate any comments people might have into the text
and then place it on the Wiki as an introduction to Duck Typing for the
static typist.

For those not in on the secret, the idea is that if an object walks like
a duck and quacks like a duck, it may as well be a duck - this being a
metaphor for an arbitrary object that may not be exactly the same class
your code was expecting, but still behaves the same way - see [1] if you
don't follow.

---

Many people coming to Ruby from a statically-typed language are somewhat
afraid of Ruby's dynamism, or "don't get it(TM)". David Black and I
believe that this is in part because it is thought that the uncertainty
and changeability built into Ruby are dangerous and one wants to find
shelter from them.

Please bear with me while I describe some of the possible approaches.

1) People with a Static Typing background often have the urge to do
something like this:

attr_reader :date
def date=(val)
  raise ArgumentError.new("Not a Date") if val.class != Date
  @date = val
end

This is not duck typing - this is trying to get Ruby to do Static Typing.
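
To see how restrictive the exact-class check is, here is a minimal sketch. The Event and Holiday classes are hypothetical names made up for illustration; Holiday is a Date in every respect except its class, and it is still turned away.

```ruby
require 'date'

# Hypothetical subclass: behaves exactly like a Date.
class Holiday < Date; end

# Hypothetical owner of the strict setter from example (1).
class Event
  attr_reader :date

  def date=(val)
    raise ArgumentError.new("Not a Date") if val.class != Date
    @date = val
  end
end

event = Event.new
event.date = Date.new(2004, 5, 17)        # accepted

begin
  event.date = Holiday.new(2004, 12, 25)  # rejected purely because of its class
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```

The Holiday object would have answered year, month and day perfectly well; the setter never gave it the chance.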

2) Okay, you say, if that's not duck typing, let's do duck typing by
accepting a whole bunch of different input formats and trying to turn
them into something we know how to deal with, like this:

def date=(val)
  case val
  when Date
    @date = val
  when Time
    @date = Date.new(val.year, val.month, val.day)
  when String
    if val =~ /(\d{4})\s*[-\/\\]\s*(\d{1,2})\s*[-\/\\]\s*(\d{1,2})/
      @date = Date.new($1.to_i,$2.to_i,$3.to_i)
    else
      raise ArgumentError, "Unable to parse #{val} as date"
    end
  when Array
    # Reject arrays of the wrong length rather than silently ignoring them
    if val.length == 3
      @date = Date.new(val[0], val[1], val[2])
    else
      raise ArgumentError, "Unable to parse #{val} as date"
    end
  else
    raise ArgumentError, "Unable to parse #{val} as date"
  end
end

This "normalization" approach has the advantage that the date attribute
getter will always return a Date (producing certainty), but the setter
can take input in a variety of formats.

2.a) Discussing this on #ruby-lang, David Black suggested the following
optimization:

def date=(val)
  begin
    @date = Date.new(val.year, val.month, val.day)
  rescue
    begin
      val =~ /(\d{4})\s*[-\/\\]\s*(\d{1,2})\s*[-\/\\]\s*(\d{1,2})/
      @date = Date.new($1.to_i,$2.to_i,$3.to_i)
    rescue
      begin
        @date = Date.new(val[0], val[1], val[2])
      rescue
        raise ArgumentError, "Unable to parse #{val} as date"
      end
    end
  end
end

This has the advantage over (2) that it doesn't depend upon the class of
val - if it acts enough like a string to use the =~ operator, then that
clause will handle it, even if it's not descended from String - unlike
the previous example. This makes it "more duck-typed", but still
addresses the static-typist's fear of uncertainty and dynamism by
providing a predictable response from #date (it will always be a Date).
Unfortunately it's also slow.
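
How slow is something you can measure rather than guess. The following is a rough sketch using the standard Benchmark library; by_case and by_rescue are hypothetical standalone versions of (2) and (2.a), cut down to three input kinds, and the iteration count is arbitrary. The numbers will vary with Ruby version and machine.

```ruby
require 'benchmark'
require 'date'

DATE_RE = %r{(\d{4})\s*[-/\\]\s*(\d{1,2})\s*[-/\\]\s*(\d{1,2})}

# Class-based dispatch, in the style of example (2).
def by_case(val)
  case val
  when Date, Time
    Date.new(val.year, val.month, val.day)
  when String
    raise ArgumentError, "Unable to parse #{val} as date" unless val =~ DATE_RE
    Date.new($1.to_i, $2.to_i, $3.to_i)
  when Array
    Date.new(val[0], val[1], val[2])
  else
    raise ArgumentError, "Unable to parse #{val} as date"
  end
end

# Nested rescue, in the style of example (2.a).
def by_rescue(val)
  Date.new(val.year, val.month, val.day)
rescue StandardError
  begin
    val =~ DATE_RE
    Date.new($1.to_i, $2.to_i, $3.to_i)
  rescue StandardError
    begin
      Date.new(val[0], val[1], val[2])
    rescue StandardError
      raise ArgumentError, "Unable to parse #{val} as date"
    end
  end
end

inputs = [Date.new(2004, 5, 17), "2004-05-17", [2004, 5, 17]]
n = 5_000

Benchmark.bm(8) do |bm|
  bm.report('case')   { n.times { inputs.each { |v| by_case(v) } } }
  bm.report('rescue') { n.times { inputs.each { |v| by_rescue(v) } } }
end
```

Only the String and Array inputs go through a rescue in by_rescue, so the mix of inputs you feed it will change the result considerably.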

3) Even "more duck-typed" is the approach of just testing that it
responds to the appropriate methods, like so:

# Accepts an object which responds to the +year+, +month+ and +day+
# methods.
def date=(val)
  [:year, :month, :day].each do |meth|
    raise ArgumentError unless val.respond_to?(meth)
  end
  @date = val
end

In this case, we have removed the normalization instituted in example
(2), but we have still ensured that the #date attribute conforms to some
sort of interface, providing certainty. It is now the caller's
responsibility to make sure what they pass fits the [:year, :month,
:day] specification - but this responsibility is documented. However,
this approach violates the Don't Repeat Yourself principle - both the
code and the comment contain the specification, and are not therefore
guaranteed to be in sync.

This approach is what many people believe to be embodied by "Duck
Typing". Given an object, we're checking whether it walks and quacks
like a duck; we're not forcing our caller to use a particular class,
like example (1), but we are forcing our caller to put the data in a
format we can understand, unlike (2) which attempts to deal with every
possible representation of a date, causing volumes of maintenance work -
imagine trying to write a normalization routine like that for every
attribute of every class! In this way, we are moving the responsibility
of putting the data into a reasonable format to the caller, who knows
what format their data is in, from the receiver, who has to guess at
every possible format the caller might send them.
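
As a sketch of approach (3) in use, and of one way to keep the specification in a single place in the code, the required methods can live in a constant that both the check and the error message read. Event, DATE_INTERFACE and SimpleDate are hypothetical names invented for this example.

```ruby
class Event
  # The interface is written down once; the check and the message share it.
  DATE_INTERFACE = [:year, :month, :day].freeze

  attr_reader :date

  def date=(val)
    missing = DATE_INTERFACE.reject { |meth| val.respond_to?(meth) }
    unless missing.empty?
      raise ArgumentError, "date must respond to #{missing.join(', ')}"
    end
    @date = val
  end
end

# A caller-side type that is not a Date but quacks like one.
SimpleDate = Struct.new(:year, :month, :day)

event = Event.new
event.date = Time.now                      # accepted
event.date = SimpleDate.new(2004, 5, 17)   # accepted

begin
  event.date = "2004-05-17"                # a String has none of the methods
rescue ArgumentError => e
  puts e.message
end
```

The comment above the setter can still drift from the constant, so this narrows the repetition rather than removing it.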

4) The fourth and final approach, which I believe to be the Zen of Duck
Typing, is as follows:

# Accepts an object which responds to the +year+, +month+ and +day+
# methods.
attr_accessor :date

"What?" I hear you cry. "There's no checking there at all! You could
pass it anything!" Yes, gentle reader, but why would you? After all, the
documentation for this method is exactly the same as the one above. If
the programmer using this method does what the documentation says then
the class's behaviour is exactly the same. If they hand it the wrong
thing (accidentally, we assume) then the only difference is that the
checked version breaks when the setter is called, whereas this one breaks
some time after the getter is called, when we try to call a non-existent
method on the result.

A common response to this often contains the phrase "meaningless error
messages", but the results of such a mistake are usually, if not always,
far from meaningless. For the most part, they look something like this:

NoMethodError: undefined method `year' for "notadate":String

This tells me a lot: namely, that some part of my code (whose location
is given in the subsequent backtrace) expected "notadate" to have a
:year method, and it didn't. From this it is fairly trivial to deduce
that something, somewhere, has fed the wrong thing to the date= setter
method. Chances are that if your code is well-factored, there aren't a
whole lot of places that set the date, and the location of the error can
be found through a little judicious testing; you've lost the certainty
and immediacy of the inline check, but not by much, and you've gained
the flexibility of dynamic typing, and a whole lot less code to maintain.
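
Here is a small sketch of what that failure gives you to work with. Event is a hypothetical class holding the bare attr_accessor from (4); the exact wording of the message differs between Ruby versions, so the example prints it rather than assuming it.

```ruby
class Event
  attr_accessor :date
end

event = Event.new
event.date = "notadate"    # nothing complains at this point

begin
  event.date.year          # the mistake surfaces at the first real use
rescue NoMethodError => e
  puts e.message           # names the missing method and the offending object
  puts e.name.inspect      # the method that was expected
  puts e.backtrace.first   # where the call was made
end
```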

Now if you'd been writing and collecting unit tests as you went along,
instead of

NoMethodError: undefined method `year' for "notadate":String

you would be seeing

1) Failure:
test_date(MyClassTest) [./test/myclasstest.rb:13]:
<false> is not true.

which makes the error even easier to find: you go to test/myclasstest.rb
and see something like:

10: def test_date
11: @obj = Foo.new
12: @obj.date = MyClass.new.notadate
13: assert(@obj.date.respond_to?(:year))
14: end

and now the error is trivial to trace - the moral of the story being
that when Duck Typing, do your checking in your unit tests, rather than
in the live code. Type errors such as this one are usually the least
common and easiest to trace of errors; if the attribute's documentation
specifies what it is supposed to be, as in the example above, and the
callers of both the getter and the setter methods make no assumptions
about any more or less than what the documentation says, then apart from
keyboarding accidents this will never be a problem.
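
As a sketch of doing the checking in the tests rather than in the live code, the following uses Minitest (an assumption; any test framework would do) against a hypothetical Invoice class. The tests exercise what the class does with its date, not the accessor itself.

```ruby
require 'date'
require 'minitest/autorun'

# Hypothetical class under test; the setter does no checking at all.
class Invoice
  # Accepts an object which responds to the +year+, +month+ and +day+
  # methods.
  attr_accessor :date

  # The day payment falls due, +days+ after the invoice date.
  def due_on(days = 30)
    Date.new(date.year, date.month, date.day) + days
  end
end

class InvoiceTest < Minitest::Test
  def test_due_on_accepts_anything_date_like
    invoice = Invoice.new
    invoice.date = Time.local(2004, 5, 17)
    assert_equal Date.new(2004, 6, 16), invoice.due_on
  end

  def test_due_on_fails_loudly_for_a_non_date
    invoice = Invoice.new
    invoice.date = "notadate"
    assert_raises(NoMethodError) { invoice.due_on }
  end
end
```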

At [1], Dave Thomas describes Duck Typing as "a way of thinking about
programming in Ruby." I think he means to go a step further than that -
Duck Typing is the _best_ way of thinking about programming in Ruby, and
possibly the _only_ way; as David Black puts it:

"I think the concept of duck typing needs to be supplemented and
expanded on. If, as seems to be the case, Dave thinks of it as a
component of programming style, then it doesn't address language design
itself. As long as duck typing is viewed as a stylistic choice, rather
than a radical language principle, the door is always open to people
saying 'I don't do duck typing', by which they usually mean that they
use kind_of? a lot... of course Ruby itself *does* do duck typing,
whether a given programmer thinks they're doing it or not."

Using kind_of? (or respond_to?) a lot isn't "not doing Duck Typing",
it's simply adding in at run time the kinds of checks that Statically
Typed languages do at compile time, in a usually verbose and necessarily
incomplete fashion.

Rather than trying to make Ruby do Static Typing because one is from a
Static Typing background and that's what one is comfortable with, one
should become comfortable with the dynamic nature of Ruby instead. I
have found that once I stopped assuming that the callers of my method
(who may well be me, in five minutes' time, or some user of my library on
the other side of the planet) are stupid and don't know how to read my
documentation (you did write some, didn't you?) then writing in Ruby
became a whole lot more natural and somewhat less verbose. The unit
tests took care of the psychological need to check, somewhere, that the
method was getting passed the right thing, but in reality the whole
debacle is a non-issue; type errors are the most trivial of bugs.

And if you're still worried about that date example, an alternative
solution is this:

def set_date(year, month, day)
  @date = Date.new(year, month, day)
end

which, if year, month and day are not numeric, will catch the problem
straight away - without resorting to Static Typing or some approximation
of it. And the way it catches it is telling:

irb(main):027:0> Date.new(2004.0, Rational(12,2), "17")
ArgumentError: comparison of String with 0 failed
        from /usr/lib/ruby/1.8/date.rb:560:in `<'
        from /usr/lib/ruby/1.8/date.rb:560:in `valid_civil?'
        from /usr/lib/ruby/1.8/date.rb:590:in `new'
        from (irb):27

This is not "ArgumentError: parameters must be numbers" - the error is
discovered when the Date class attempts to compare that parameter to
zero and can't do it, after assuming that it was valid. And it didn't
make the mistake any harder to find, did it? Notice that it didn't balk
at Floats or Rationals, and with no extra coding from the implementor;
Floats and Rationals look, and quack, like numbers. That's Duck Typing
in action.
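
A minimal sketch of set_date in use, with a hypothetical Event class around it. The error class and message you get for the bad argument depend on the Ruby version (the transcript above is from 1.8), so the example rescues both likely classes and prints whatever arrives.

```ruby
require 'date'

class Event
  attr_reader :date

  def set_date(year, month, day)
    @date = Date.new(year, month, day)
  end
end

event = Event.new
event.set_date(2004, 5, 17)
puts event.date

begin
  event.set_date(2004, 5, "17")   # fails here, at the call that made the mistake
rescue ArgumentError, TypeError => e
  puts "#{e.class}: #{e.message}"
end
```

Because Date.new raises before the assignment happens, the previously stored date is left untouched.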

[1] http://rubygarden.org/ruby?DuckTyping

---

Tim.

--
Tim Bates
(E-Mail Removed)


 
ts
      05-17-2004
>>>>> "T" == Tim Bates <(E-Mail Removed)> writes:

T> "What?" I hear you cry. "There's no checking there at all! You could
T> pass it anything!" Yes, gentle reader, but why would you?

[ruby-talk:99351]
[ruby-talk:99370]


Guy Decoux


 
Simon Strandgaard
      05-17-2004
"SER" <(E-Mail Removed)> wrote:
> Broken record time:
>
> The problem with all of these solutions is that they are discovered at
> run-time. I find it increasingly irritating when I have to debug
> typing errors by running an application that takes some time to get to
> the error. Enough of those times, the error is a typing error, so that
> I've been harping lately about wanting a duck-type checker hooked in to
> "ruby -c".


I do plenty of unit testing, and only rarely have this kind of problem.
I can recommend verbose testing.

--
Simon Strandgaard


 
John Carter
      05-17-2004
On Mon, 17 May 2004, Tim Bates wrote:

> Below is my dissertation on the subject. My
> intention is to incorporate any comments people might have into the text
> and then place it on the Wiki as an introduction to Duck Typing for the
> static typist.


Excellent idea, Excellent article. Thanks.

> 1) People with a Static Typing background often have the urge to do
> something like this:
>
> attr_reader :date
> def date=(val)
> raise ArgumentError.new("Not a Date") if val.class != Date
> end
>
> This is not duck typing - this is trying to get Ruby to do Static Typing.


Well, a more sophisticated Static Typer would do...
attr_reader :date
def date=(val)
  raise ArgumentError.new("Not a Date") unless val.kind_of? Date
  @date = val
end

> 2.a) Discussing this on #ruby-lang, David Black suggested the following
> optimization:
>
> def date=(val)
> begin
> @date = Date.new(val.year, val.month, val.day)
> rescue
> begin
> val =~ /(\d{4})\s*[-\/\\]\s*(\d{1,2})\s*[-\/\\]\s*(\d{1,2})/
> @date = Date.new($1.to_i,$2.to_i,$3.to_i)
> rescue
> begin
> @date = Date.new(val[0], val[1], val[2])
> rescue
> raise ArgumentError, "Unable to parse #{val} as date"
> end
> end
> end
> end


Really cute. More polymorphic, probably a bit slow as exception paths
are usually expected to be "rare" and hence under
optimized. (Certainly true in C++, I don't know about Ruby.) Anybody
care enough to benchmark this?

I would also prefer it to rescue just NoMethodError rather than any
exception. You can really, really hide some horrible bugs by catching
just any exception. Like empty catch blocks in Java, it is a truly
evil practice.
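
A minimal sketch of the narrower rescue, written as a hypothetical standalone to_date helper rather than the setter. It assumes that NoMethodError, TypeError and ArgumentError are the only errors that mean "wrong kind of input"; anything else is treated as a real bug and allowed to propagate. BrokenDate is made up to show that.

```ruby
require 'date'

DATE_RE = %r{(\d{4})\s*[-/\\]\s*(\d{1,2})\s*[-/\\]\s*(\d{1,2})}

def to_date(val)
  Date.new(val.year, val.month, val.day)
rescue NoMethodError
  begin
    match = DATE_RE.match(val)    # TypeError if val is not string-like
    raise ArgumentError, "no date in #{val.inspect}" unless match
    Date.new(match[1].to_i, match[2].to_i, match[3].to_i)
  rescue TypeError, ArgumentError
    begin
      Date.new(val[0], val[1], val[2])
    rescue NoMethodError, TypeError, ArgumentError
      raise ArgumentError, "Unable to parse #{val.inspect} as date"
    end
  end
end

# An object with a genuine bug in it, not merely a missing method.
class BrokenDate
  def year
    1 / 0
  end
end

puts to_date(Time.local(2004, 5, 17))
puts to_date("2004/05/17")
puts to_date([2004, 5, 17])

begin
  to_date(BrokenDate.new)
rescue ZeroDivisionError => e
  puts "bug surfaced instead of being swallowed: #{e.class}"
end
```

A NoMethodError raised from inside a buggy year method would still be mistaken for "not a date", so this reduces the problem rather than eliminating it.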

> 4) The fourth and final approach, which I believe to be the Zen of Duck
> Typing, is as follows:
>
> # Accepts an object which responds to the +year+, +month+ and +day+
> # methods.
> attr_accessor :date


The worst breakage I have seen in an Object Oriented System was in a
language called Actor, where for optimization reasons they had forced
the internal representation of their graphics type into pairs of two-byte
integers. This turned a very large, very useful generic 2D geometry
library into something totally useless to me. I.e., never gratuitously
break polymorphism: even if you personally can't think of another use
for this code, somebody else can.

For example, suppose you got a string from a SQL server that was a
"date". But since parsing the string is difficult and expensive to do,
you just don't. You stuff it into the date= method and lo and behold,
it just so turns out that for this execution path you _never_ actually
invoke :year, :month, :day, it just travels through your system, and
gets converted to a string with .to_s and written back to SQL.

If you had put additional checks in the code, it would have broken,
and you would have been forced to do an expensive and useless
conversion.
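
A minimal sketch of that pass-through case, using a hypothetical Row class whose only demand on its date is to_s. The unparsed string from the database and a real Date both travel through unchanged.

```ruby
require 'date'

# Hypothetical record that just carries the date through the system.
class Row
  attr_accessor :date

  # The only thing this execution path asks of +date+ is +to_s+.
  def date_column
    date.to_s
  end
end

from_db = Row.new
from_db.date = "2004-05-17"          # never parsed
puts from_db.date_column

parsed = Row.new
parsed.date = Date.new(2004, 5, 17)  # a real Date works just as well
puts parsed.date_column
```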




John Carter Phone : (64)(3) 358 6639
Tait Electronics Fax : (64)(3) 359 4632
PO Box 1645 Christchurch Email : (E-Mail Removed)
New Zealand

The universe is absolutely plastered with the dashed lines exactly one
space long.


 
gabriele renzi
      05-17-2004
On Mon, 17 May 2004 22:52:22 +0900, Tim Bates <(E-Mail Removed)>
wrote:


>
>2) Okay, you say, if that's not duck typing, let's do duck typing by
>accepting a whole bunch of different input formats and trying to turn
>them into something we know how to deal with, like this:


<snipall>

This was interesting reading, thanks for it.
But it happens that I'm locked in a mental loop.
From time to time I think:
- I'll use duck typing.
- I don't need to do those strange things, I'll write a
date_from_string method
- eh, multi-method dispatch would be nice..
- well, duck typing is actually checking an interface. Why don't we
just use statically checked interfaces? What's wrong with a Date-able
mixin?
- oh, static sux, I'll use duck typing..
 
Tim Bates
      05-17-2004
ts wrote:
>>>>>>"T" == Tim Bates <(E-Mail Removed)> writes:

> T> "What?" I hear you cry. "There's no checking there at all! You could
> T> pass it anything!" Yes, gentle reader, but why would you?
>
> [ruby-talk:99351]
> [ruby-talk:99370]


For those who can't be bothered finding those mails, they both refer to
some of the security issues involved in assuming that callers give you
what you expect; perhaps I didn't make this clear in my article, but I
am assuming whoever is writing code alongside mine is non-malicious and
is not deliberately trying to break my code. On the other hand, if they
are malicious, static typing isn't going to gain you very much; you will
still have to validate any and all input. This issue has very little to
do with the typing model used.

Tim.

--
Tim Bates
(E-Mail Removed)


 
Gavin Sinclair
      05-17-2004
On Monday, May 17, 2004, 11:52:22 PM, Tim wrote:

> Now if you'd been writing and collecting unit tests as you went along,
> instead of


> NoMethodError: undefined method `year' for "notadate":String


> you would be seeing


> 1) Failure:
> test_stuff(MyClassTest) [./test/myclasstest.rb:13]:
> <false> is not true.


> which makes the error even easier to find: you go to test/myclasstest.rb
> and see something like:


> 10: def test_date
> 11: @obj = Foo.new
> 12: @obj.date = MyClass.new.notadate
> 13: assert(@obj.date.respond_to?(:year))
> 14: end



Now I'm no unit testing guru (I try...) but that test to me seems so
trivial as to be pointless. Since the code it's testing does
absolutely nothing other than assignment, you're not testing the
*code*, you're testing one possible input value.

I think it would be more convincing to talk about unit tests covering
the correct overall operation of the class under scrutiny, rather than
examining some lowly accessor.

# But your psychological needs could differ from mine

The rest of the article's good, though.

Cheers,
Gavin



 
Tim Bates
      05-17-2004
John Carter wrote:
> Well, a more sophisticated Static Typer would do...
> attr_reader :date
> def date=(val)
>   raise ArgumentError.new("Not a Date") unless val.kind_of? Date
>   @date = val
> end


Ah, of course. But seeing as I never actually do this in code, I can be
excused, right? Thanks for the suggestion.

> I would also prefer it to rescue just a No Method exception than just
> any Exception. You can really really really hide some horrible bugs catching
> just any exception. Like empty catch blocks in Java, it is a truly
> evil practice.


Again, you're correct and I hadn't thought much about these pieces of
code beyond the basic functionality since they were throw-away examples
anyway. Although there are cases where the code might throw an
ArgumentError instead of a NoMethodError and still work. The logic
behind catching just any exception was that if one method doesn't work,
for whatever reason, bail out and try a different one...

> The worst breakage I have seen in an Object Oriented System was in a
> language called Actor, where for optimization reasons they had forced
> the internal representation of their graphics type into pairs of two-byte
> integers. This turned a very large, very useful generic 2D geometry
> library into something totally useless to me. I.e., never gratuitously
> break polymorphism: even if you personally can't think of another use
> for this code, somebody else can.


I'm sorry, I don't quite follow you; can you clarify a little? How did
changing the internal representation break things? How does this relate
to polymorphism or Duck Typing?

> For example, suppose you got a string from a SQL server that was a
> "date". But since parsing the string is difficult and expensive to do,
> you just don't. You stuff it into the date= method and lo and behold,
> it just so turns out that for this execution path you _never_ actually
> invoke :year, :month, :day, it just travels through your system, and
> gets converted to a string with .to_s and written back to SQL.


This is rather risky; you're assuming less than what the documentation
says, and hoping that nobody else assumes any more than you do. That's
just asking for someone to come along and add something that does use
one of those methods, and suddenly things come crashing down around you.
On the other hand, however, if you know what you're doing then this is
one of the advantages of Duck Typing, and part of the beauty of Ruby -
if you *know* that those methods will never get called, the language
never forces you to convert the date into that format if you don't want to.

Tim.

--
Tim Bates
(E-Mail Removed)


 
Aredridel
      05-17-2004
> This is rather risky; you're assuming less than what the documentation
> says, and hoping that nobody else assumes any more than you do. That's
> just asking for someone to come along and add something that does use
> one of those methods, and suddenly things come crashing down around you.


Nah, it crashes down on /them/. Garbage in, garbage out.

Imagine how useful the unix grep command would be if it assumed lines
were 80 characters, and errored if you passed it otherwise.

Static vs dynamic typing is a similar design choice: Let the user put in
whatever they want, and if they get results they don't like, let them
adjust.

Doesn't always apply -- you still validate things where appropriate, but
static typing seems an arbitrary limit to me, not application-specific
where it's justified.




 
Tim Bates
      05-18-2004
Gavin Sinclair wrote:
>>10: def test_date
>>11: @obj = Foo.new
>>12: @obj.date = MyClass.new.notadate
>>13: assert(@obj.date.respond_to?(:year))
>>14: end

>
>
>
> Now I'm no unit testing guru (I try...) but that test to me seems so
> trivial as to be pointless. Since the code it's testing does
> absolutely nothing other than assignment, you're not testing the
> *code*, you're testing one possible input value.


You're right, that's a really stupid example, and I was hoping nobody
would spot it. I tried to think of a better example, and failed,
since my original example (`attr_accessor :date`) is so trivial as to
not need unit testing. (Although the unit testing gurus may disagree
with me, I tend not to write tests for attributes unless I have written
extra code in them - I just assume that attr_accessor does what it's
supposed to.)

> I think it would be more convincing to talk about unit tests covering
> the correct overall operation of the class under scrutiny, rather than
> examining some lowly accessor.


That's right, and the problem arises because the whole article revolves
around writing an accessor method - partly because it's simple, and
partly because that was what I was trying to write when I started the
discussion on #ruby-lang. If you can think of a better example, I'll
happily rewrite it.

> # But your psychological needs could differ from mine


No, they don't. You're absolutely correct, that unit test is so trivial
as to be pointless. I don't bother to test attr_accessor either.

> The rest of the article's good, though.


Thanks.

Tim.

--
Tim Bates
(E-Mail Removed)


 