Velocity Reviews - Computer Hardware Reviews



Re: Web Frameworks Excessive Complexity

 
 
Robert Kern
 
      11-20-2012
On 20/11/2012 19:46, Andriy Kornatskyy wrote:
>
> Robert,
>
> Thank you for the comment. I am not trying to relate CC to LOC. Instead I am pointing to excessive complexity, that is, complexity beyond the recommended threshold, which is a candidate for refactoring in the respective web frameworks. Those areas are likely sources of bugs (e.g. due to low unit-test coverage) and are thus of interest to both end users and framework developers.


Did you read the paper? I'm not suggesting that you compare CC with LoC; I'm
suggesting that you don't use CC as a metric at all. The research is fairly
conclusive that CC doesn't measure what you think it measures. The source of
bugs is not excessive complexity in a method, just excessive lines of code. LoC
is much simpler, easier to understand, and easier to correct than CC.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco

 
Steven D'Aprano
 
      11-21-2012
On Tue, 20 Nov 2012 20:07:54 +0000, Robert Kern wrote:

> The source of bugs is not excessive complexity in a method, just
> excessive lines of code.


Taken literally, that cannot possibly be the case.

def method(self, a, b, c):
    do_this(a)
    do_that(b)
    do_something_else(c)


def method(self, a, b, c):
    do_this(a); do_that(b); do_something_else(c)


It *simply isn't credible* that version 1 is statistically likely to have
twice as many bugs as version 2. Over-reliance on LOC is easily gamed,
especially in semicolon languages.

Besides, I think you have the cause and effect backwards. I would rather
say:

The source of bugs is not lines of code in a method, but excessive
complexity. It merely happens that counting complexity is hard, counting
lines of code is easy, and the two are strongly correlated, so why count
complexity when you can just count lines of code?



Keep in mind that something like 70-80% of published scientific papers
are never replicated, or cannot be replicated. Just because one paper
concludes that LOC alone is a better metric than CC doesn't necessarily
make it so. But even if we assume that the paper is valid, it is
important to understand just what it says, and not extrapolate too far.

The paper makes various assumptions, takes statistical samples, and uses
models. (Which of course *any* such study must.) I'm not able to comment
on whether those models and assumptions are valid, but assuming that they
are, the conclusion of the paper is no stronger than the models and
assumptions. We should not really conclude that "CC has no more
predictive power than LOC". The right conclusion is that one specific
model of cyclomatic complexity, McCabe's CC, has no more predictive power
than LOC for projects written in C, C++ and Java.

How does that apply to Python code? Well, it's certainly suggestive, but
it isn't definitive.

It's also important to note that the authors point out that in their
samples of code, they found very high variance and large numbers of
outliers:

[quote]
Modules where LOC does not predict CC (or vice-versa) may indicate an
overly-complex module with a high density of decision points or an overly-
simple module that may need to be refactored.
[end quote]

So *even by the terms of this paper*, it isn't true that CC has no
predictive value over LOC -- if the CC is radically high or low for the
LOC, that is valuable to know.


> LoC is much simpler, easier to understand, and
> easier to correct than CC.


Well, sure, but do you really think Perl one-liners are the paragon of
bug-free code we ought to be aiming for? *wink*



--
Steven
 
Ulrich Eckhardt
 
      11-21-2012
Am 21.11.2012 02:43, schrieb Steven D'Aprano:
> On Tue, 20 Nov 2012 20:07:54 +0000, Robert Kern wrote:
>> The source of bugs is not excessive complexity in a method, just
>> excessive lines of code.

>
> Taken literally, that cannot possibly be the case.
>
> def method(self, a, b, c):
>     do_this(a)
>     do_that(b)
>     do_something_else(c)
>
>
> def method(self, a, b, c):
>     do_this(a); do_that(b); do_something_else(c)
>
>
> It *simply isn't credible* that version 1 is statistically likely to have
> twice as many bugs as version 2. Over-reliance on LOC is easily gamed,
> especially in semicolon languages.


"Don't indent deeper than 4 levels!" "OK, not indenting at all, $LANG
doesn't need it anyway." Sorry, but if code isn't even structured
halfway reasonably it is unmaintainable, regardless of what CC or LOC say.


> Besides, I think you have the cause and effect backwards. I would rather
> say:
>
> The source of bugs is not lines of code in a method, but excessive
> complexity. It merely happens that counting complexity is hard, counting
> lines of code is easy, and the two are strongly correlated, so why count
> complexity when you can just count lines of code?


I agree here, and I'd go even further: measuring complexity is not just
hard, it requires a metric that you need to agree on first. With LOC you
only need to agree not to chain statements with semicolons and on how to
treat comments and empty lines. With CC, you effectively agree that an if
statement has a complexity of one (or 2?) while a switch statement has a
complexity according to its number of cases, even though it is far easier
to read and comprehend than the same number of branches written as if
statements. Also, CC doesn't even consider new-fashioned stuff like
exceptions, which introduce yet another control flow path.
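
To make the "you have to agree on a metric first" point concrete, here is a minimal sketch of a CC-style counter built on Python's standard ast module. The choice of which nodes count as a decision point (BRANCH_NODES below) is an assumption of this sketch, not McCabe's definition or what any particular tool does. Whether an except clause or a boolean operator adds a path is exactly the kind of thing that has to be decided up front.

```python
import ast

# Node types treated as one extra decision point each. This list is an
# assumption of the sketch; real tools disagree on exactly what to count.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def approx_complexity(source):
    """Return 1 plus the number of decision points found in source."""
    score = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, BRANCH_NODES):
            score += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" adds two extra paths, one per operator.
            score += len(node.values) - 1
    return score


IF_CHAIN = """
def pick(mode, x, y):
    if mode == "foo":
        return x + y * 2
    if mode == "bar":
        return x * 4 + y
    return x
"""

WITH_EXCEPTION = """
def parse(text):
    try:
        return int(text)
    except ValueError:
        return 0
"""

print(approx_complexity(IF_CHAIN))        # 3
print(approx_complexity(WITH_EXCEPTION))  # 2
```

Dropping ast.ExceptHandler from BRANCH_NODES changes the second score from 2 to 1, which is the disagreement described above in miniature.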


>> LoC is much simpler, easier to understand, and
>> easier to correct than CC.

>
> Well, sure, but do you really think Perl one-liners are the paragon of
> bug-free code we ought to be aiming for? *wink*


Hehehe...

Uli


 
Chris Angelico
 
      11-21-2012
On Wed, Nov 21, 2012 at 10:09 PM, Andriy Kornatskyy
<(E-Mail Removed)> wrote:
> We choose Python for its readability. This is an essential principle of the language, and thousands of people around the world read open source code. Things like PEP8, CC, and LoC all serve one purpose: to draw your attention and teach you to make your code better.


But too much focus on metrics results in those metrics improving
without any material benefit to the code. If there's a number that you
can watch going up or down, nobody's going to want to be the one that
pushes that number the wrong direction. So what happens when the right
thing to do happens to conflict with the given metric? And yes, it
WILL happen, guaranteed. No metric is perfect.

Counting lines of code teaches you to make dense code. That's not a
good thing nor a bad thing; you'll end up with list comprehensions
rather than short loops, regardless of which is easier to actually
read.

Counting complexity by giving a score to every statement encourages
code like this:

def bletch(x,y):
    return x + {"foo":y*2,"bar":x*3+y,"quux":math.sin(y)}.get(mode,0)

instead of:

def bletch(x,y):
    if mode=="foo": return x+y*2
    if mode=="bar": return x*4+y
    if mode=="quux": return x+math.sin(y)
    return x

Okay, this is a stupid contrived example, but tell me which of those
you'd rather work with, and then tell me a plausible metric that would
agree with you.

ChrisA
 
Steven D'Aprano
 
      11-21-2012
On Wed, 21 Nov 2012 22:21:23 +1100, Chris Angelico wrote:

> Counting complexity by giving a score to every statement encourages code
> like this:
>
> def bletch(x,y):
>     return x + {"foo":y*2,"bar":x*3+y,"quux":math.sin(y)}.get(mode,0)
>
> instead of:
>
> def bletch(x,y):
>     if mode=="foo": return x+y*2
>     if mode=="bar": return x*4+y
>     if mode=="quux": return x+math.sin(y)
>     return x
>
> Okay, this is a stupid contrived example, but tell me which of those
> you'd rather work with



Am I being paid by the hour or the line?




--
Steven
 
Robert Kern
 
      11-21-2012
On 21/11/2012 01:43, Steven D'Aprano wrote:
> On Tue, 20 Nov 2012 20:07:54 +0000, Robert Kern wrote:
>
>> The source of bugs is not excessive complexity in a method, just
>> excessive lines of code.

>
> Taken literally, that cannot possibly be the case.
>
> def method(self, a, b, c):
>     do_this(a)
>     do_that(b)
>     do_something_else(c)
>
>
> def method(self, a, b, c):
>     do_this(a); do_that(b); do_something_else(c)
>
>
> It *simply isn't credible* that version 1 is statistically likely to have
> twice as many bugs as version 2. Over-reliance on LOC is easily gamed,
> especially in semicolon languages.


Logical LoC (executable LoC, number of statements, etc.) is a better measure
than Physical LoC, I agree. That's not the same thing as cyclomatic complexity,
though. Also, the relationship between LoC (of either type) and bugs is not
linear (at least not in the small-LoC regime), so you are certainly correct that
it isn't credible that version 1 is likely to have twice as many bugs as version
2. No one is saying that it is.
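
As an illustration of the physical versus logical LoC distinction, here is a minimal sketch that counts both for the two versions quoted above. Counting every ast.stmt node as one logical line is an assumption of this sketch; other definitions of "logical LoC" exist.

```python
import ast


def physical_loc(source):
    """Count non-blank lines that are not pure comments."""
    return sum(
        1
        for line in source.splitlines()
        if line.strip() and not line.strip().startswith("#")
    )


def logical_loc(source):
    """Count statement nodes, so "a; b; c" on one line counts as three."""
    return sum(isinstance(node, ast.stmt) for node in ast.walk(ast.parse(source)))


VERSION_1 = """
def method(self, a, b, c):
    do_this(a)
    do_that(b)
    do_something_else(c)
"""

VERSION_2 = """
def method(self, a, b, c):
    do_this(a); do_that(b); do_something_else(c)
"""

print(physical_loc(VERSION_1), logical_loc(VERSION_1))  # 4 4
print(physical_loc(VERSION_2), logical_loc(VERSION_2))  # 2 4
```

The physical count halves when the statements are chained with semicolons, while the logical count stays at four (the def plus three calls).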

> Besides, I think you have the cause and effect backwards. I would rather
> say:
>
> The source of bugs is not lines of code in a method, but excessive
> complexity. It merely happens that counting complexity is hard, counting
> lines of code is easy, and the two are strongly correlated, so why count
> complexity when you can just count lines of code?


No, that is not the takeaway of the research. More code correlates with more
bugs. More cyclomatic complexity also correlates with more bugs. You want to
find out what causes bugs. What the research shows is that cyclomatic complexity
is so correlated with LoC that it is going to be very difficult, or impossible,
to establish a causal relationship between cyclomatic complexity and bugs. The
previous research that just correlated cyclomatic complexity to bugs without
controlling for LoC does not establish the causal relationship.

> Keep in mind that something like 70-80% of published scientific papers
> are never replicated, or cannot be replicated. Just because one paper
> concludes that LOC alone is a better metric than CC doesn't necessarily
> make it so. But even if we assume that the paper is valid, it is
> important to understand just what it says, and not extrapolate too far.


This paper is actually a replication. It is notable for how comprehensive it is.

> The paper makes various assumptions, takes statistical samples, and uses
> models. (Which of course *any* such study must.) I'm not able to comment
> on whether those models and assumptions are valid, but assuming that they
> are, the conclusion of the paper is no stronger than the models and
> assumptions. We should not really conclude that "CC has no more
> predictive power than LOC". The right conclusion is that one specific
> model of cyclomatic complexity, McCabe's CC, has no more predictive power
> than LOC for projects written in C, C++ and Java.
>
> How does that apply to Python code? Well, it's certainly suggestive, but
> it isn't definitive.


More so than the evidence that CC is a worthwhile measure, for Python or any
language.

> It's also important to note that the authors point out that in their
> samples of code, they found very high variance and large numbers of
> outliers:
>
> [quote]
> Modules where LOC does not predict CC (or vice-versa) may indicate an
> overly-complex module with a high density of decision points or an overly-
> simple module that may need to be refactored.
> [end quote]
>
> So *even by the terms of this paper*, it isn't true that CC has no
> predictive value over LOC -- if the CC is radically high or low for the
> LOC, that is valuable to know.


Is it? What is the evidence that excess, unpredicted-by-LoC CC causes (or even
correlates with) bugs? The paper points that out as a target for future research
because no one has studied it yet. It may turn out to be a valid metric, but one
that has a very specific utility: identifying a particular hotspot. Running CC
over whole projects to compare their "quality", as the OP has done, is not a
valid use of even that.
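
For what "CC unpredicted by LoC" could look like in practice, here is a minimal sketch that fits a straight line of CC against LoC and reports the module furthest from it. The module names and numbers are made up for illustration and do not come from the paper or from any framework. The sketch only locates a hotspot; it says nothing about whether such a hotspot has more bugs, which is the open question.

```python
# Hypothetical (LoC, CC) pairs per module; made-up numbers for illustration.
MODULES = {
    "routing": (120, 24),
    "templates": (300, 61),
    "sessions": (80, 15),
    "forms": (200, 78),
    "utils": (250, 49),
}


def fit_line(points):
    """Ordinary least squares fit of cc = slope * loc + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def cc_residuals(modules):
    """Return {name: observed CC minus the CC predicted from LoC}."""
    slope, intercept = fit_line(list(modules.values()))
    return {
        name: cc - (slope * loc + intercept)
        for name, (loc, cc) in modules.items()
    }


residuals = cc_residuals(MODULES)
# The module whose CC is furthest from what its size alone would suggest.
outlier = max(residuals, key=lambda name: abs(residuals[name]))
print(outlier)  # forms
```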

>> LoC is much simpler, easier to understand, and
>> easier to correct than CC.

>
> Well, sure, but do you really think Perl one-liners are the paragon of
> bug-free code we ought to be aiming for? *wink*


No, but introducing more statements and method calls to avoid if statements
isn't either.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco

 
Chris Angelico
 
      11-21-2012
On Wed, Nov 21, 2012 at 10:43 PM, Steven D'Aprano
<(E-Mail Removed)> wrote:
> On Wed, 21 Nov 2012 22:21:23 +1100, Chris Angelico wrote:
>
>> Counting complexity by giving a score to every statement encourages code
>> like this:
>>
>> def bletch(x,y):
>>     return x + {"foo":y*2,"bar":x*3+y,"quux":math.sin(y)}.get(mode,0)
>>
>> instead of:
>>
>> def bletch(x,y):
>>     if mode=="foo": return x+y*2
>>     if mode=="bar": return x*4+y
>>     if mode=="quux": return x+math.sin(y)
>>     return x
>>
>> Okay, this is a stupid contrived example, but tell me which of those
>> you'd rather work with

>
>
> Am I being paid by the hour or the line?


You're on a salary, but management specified some kind of code metrics
as a means of recognizing which of their programmers are more
productive, and thus who gets promoted.

Oh, I'm *so* glad I work in a small company. We've only had one
programmer that we "let go" (and actually, it was literally letting
him go - he said he was no good, hoping that we'd beg him to stay, and
we simply didn't beg him to stay), and the metric of code quality was
simply that both my boss and I looked at his code and said that it
wasn't good enough. Much simpler. (Though my boss and I have differing
views on how many lines of code some things should be. We end up
having some rather amusing debates about trivial things like line
breaks.)

ChrisA
 
Modulok
 
      11-22-2012
> On Wed, Nov 21, 2012 at 10:43 PM, Steven D'Aprano
> <(E-Mail Removed)> wrote:
>> On Wed, 21 Nov 2012 22:21:23 +1100, Chris Angelico wrote:
>>
>>> Counting complexity by giving a score to every statement encourages code
>>> like this:
>>>
>>> def bletch(x,y):
>>>     return x + {"foo":y*2,"bar":x*3+y,"quux":math.sin(y)}.get(mode,0)
>>>
>>> instead of:
>>>
>>> def bletch(x,y):
>>>     if mode=="foo": return x+y*2
>>>     if mode=="bar": return x*4+y
>>>     if mode=="quux": return x+math.sin(y)
>>>     return x
>>>
>>> Okay, this is a stupid contrived example, but tell me which of those
>>> you'd rather work with

>>
>>


> Oh, I'm *so* glad I work in a small company.


Agreed. Do we rate a contractor's quality of workmanship and efficiency by the
number of nails he drives?

Of course not. That would be ridiculous.

A better metric of code quality and complexity would be to borrow from science
and mathematics. i.e. a peer review or audit by others working on the project
or in the same field of study. Unfortunately this isn't cheap or easily
computed and doesn't translate nicely to a bar graph.

Such is reality.
-Modulok-
 