While we're talking about annoyances

 
 
Steven D'Aprano
      04-29-2007
Am I the only one who finds that I'm writing more documentation than code?

I recently needed to write a function to generate a rank table from a
list. That is, a list of ranks, where the rank of an item is the position
it would be in if the list were sorted:

alist = list('defabc')
ranks = [3, 4, 5, 0, 1, 2]

To do that, I needed to generate an index table first. In the book
"Numerical Recipes in Pascal" by William Press et al there is a procedure
to generate an index table (46 lines of code) and one for a rank table
(five lines).

In Python, my index function is four lines of code and my rank function is
five lines. I then wrote three more functions for verifying that my index
and rank tables were calculated correctly (17 more lines) and four more
lines to call doctest, giving a total of 30 lines of code.

I also have 93 lines of documentation, including doctests, or three
lines of documentation per line of code.

For those interested, here is how to generate an index table and rank
table in Python:


def index(sequence):
    decorated = zip(sequence, xrange(len(sequence)))
    decorated.sort()
    return [idx for (value, idx) in decorated]

def rank(sequence):
    table = [None] * len(sequence)
    for j, idx in enumerate(index(sequence)):
        table[idx] = j
    return table


You can write your own damn documentation. *wink*
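
For readers on Python 3, where `xrange` is gone and `zip` no longer returns a list, the two functions might be ported roughly as follows. This is only a sketch, not the poster's actual module: the docstrings and doctests are illustrative assumptions, and the second `rank` doctest uses made-up data.

```python
def index(sequence):
    """Return the index table: positions of the items in sorted order.

    >>> index(list('defabc'))
    [3, 4, 5, 0, 1, 2]
    """
    # Decorate each value with its position, sort, then keep the positions.
    decorated = sorted(zip(sequence, range(len(sequence))))
    return [idx for (value, idx) in decorated]


def rank(sequence):
    """Return the rank table: rank[i] is where sequence[i] lands when sorted.

    >>> rank(list('defabc'))
    [3, 4, 5, 0, 1, 2]
    >>> rank([30, 10, 20])
    [2, 0, 1]
    """
    table = [None] * len(sequence)
    for j, idx in enumerate(index(sequence)):
        table[idx] = j
    return table


if __name__ == "__main__":
    import doctest
    doctest.testmod()
```

Run as a script, this stays silent when the doctests pass.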



--
Steven.

 
GHUM
      04-29-2007
Steven,

> def index(sequence):
>     decorated = zip(sequence, xrange(len(sequence)))
>     decorated.sort()
>     return [idx for (value, idx) in decorated]


wouldn't that be equivalent code?

def index(sequence):
    return [c for _, c in sorted((b, a) for a, b in enumerate(sequence))]

Tested, gave the same results. But it worsens your doc2code ratio
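
The equivalence claim can be checked mechanically. The sketch below is not from the post: the names `index_dsu` and `index_sorted` are made up for the comparison, the first function is a Python 3 rendering of the original, and the random inputs are arbitrary.

```python
import random


def index_dsu(sequence):
    # Python 3 rendering of the original decorate-sort-undecorate version.
    decorated = sorted(zip(sequence, range(len(sequence))))
    return [idx for (value, idx) in decorated]


def index_sorted(sequence):
    # The one-expression variant proposed in this post.
    return [c for _, c in sorted((b, a) for a, b in enumerate(sequence))]


random.seed(0)  # arbitrary seed, only to make the check repeatable
for _ in range(1000):
    data = [random.randint(0, 20) for _ in range(random.randint(0, 15))]
    assert index_dsu(data) == index_sorted(data)
```

Both versions sort the same (value, position) tuples, so agreement is expected, including on lists with duplicate values.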

Harald Armin Massa
--

 
Michael Hoffman
      04-29-2007
GHUM wrote:
> Steven,
>
>> def index(sequence):
>>     decorated = zip(sequence, xrange(len(sequence)))
>>     decorated.sort()
>>     return [idx for (value, idx) in decorated]
>
> wouldn't that be equivalent code?
>
> def index(sequence):
>     return [c for _, c in sorted((b, a) for a, b in enumerate(sequence))]


Or even these:

def index(sequence):
    return sorted(range(len(sequence)), key=sequence.__getitem__)

def rank(sequence):
    return sorted(range(len(sequence)),
                  key=index(sequence).__getitem__)

Hint: if you find yourself using a decorate-sort-undecorate pattern,
sorted(key=func) or sequence.sort(key=func) might be a better idea.
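
To make the hint concrete, here is a small sketch of the two styles side by side. The word list is made up for illustration. The two approaches are not always interchangeable: decorate-sort-undecorate breaks ties by comparing the payload, whereas `key=` keeps tied items in their original order because the sort is stable.

```python
words = ["pear", "fig", "banana", "kiwi"]

# Decorate-sort-undecorate: pair each word with its length, sort, strip.
decorated = [(len(w), w) for w in words]
decorated.sort()
dsu_result = [w for (_, w) in decorated]

# The same ordering criterion expressed with key=.
key_result = sorted(words, key=len)

print(dsu_result)  # ['fig', 'kiwi', 'pear', 'banana']
print(key_result)  # ['fig', 'pear', 'kiwi', 'banana']
```

For the index table in this thread the payload is the position itself, so both styles break ties the same way.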
--
Michael Hoffman
 
BJörn Lindqvist
      04-29-2007
On 4/29/07, Steven D'Aprano <(E-Mail Removed)> wrote:
> To do that, I needed to generate an index table first. In the book
> "Numerical Recipes in Pascal" by William Press et al there is a procedure
> to generate an index table (46 lines of code) and one for a rank table
> (five lines).


51 lines total.

> In Python, my index function is four lines of code and my rank function is
> five lines. I then wrote three more functions for verifying that my index
> and rank tables were calculated correctly (17 more lines) and four more
> lines to call doctest, giving a total of 30 lines of code.


So 9 lines for Python, excluding tests.

> I also have 93 lines of documentation, including doctests, or three
> lines of documentation per line of code.


Then, without documentation, Python is roughly 570% (51/9 = 5.67) as
efficient as Pascal. But with documentation (assuming you need the
same amount of documentation for the Python code as the Pascal code),
(51 + 93)/(9 + 93) = 1.41 so only 141% as efficient as Pascal.
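
The ratios can be recomputed from the line counts quoted in this thread. The variable names below are illustrative, and the equal-documentation figure is the same assumption the post makes.

```python
pascal_code = 46 + 5   # index + rank procedures from the book
python_code = 4 + 5    # index + rank functions, tests excluded
docs = 93              # documentation lines, assumed equal for both

code_only = pascal_code / python_code
with_docs = (pascal_code + docs) / (python_code + docs)

print(round(code_only, 2))   # 5.67
print(round(with_docs, 2))   # 1.41
```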

I wonder what that means? Maybe Python the language is approaching the
upper bound for how efficient an imperative programming language can
be? On the other hand, there seems to be some progress that could be
made to reduce the amount of work in writing documentation.
Documentation in Esperanto instead of English maybe?

--
mvh Björn
 
Ben Finney
      04-29-2007
"BJörn Lindqvist" <(E-Mail Removed)> writes:

> On the other hand, there seems to be some progress that could be made
> to reduce the amount of work in writing documentation.
> Documentation in Esperanto instead of English maybe?


Lojban <URL:http://www.lojban.org/> is both easier to learn world-wide
than Euro-biased Esperanto, and computer-parseable. Seems a better[0]_
choice for computer documentation to me.


... _[0] ignoring the fact that it's spoken by even fewer people than
Esperanto.

--
\ "The greater the artist, the greater the doubt; perfect |
`\ confidence is granted to the less talented as a consolation |
_o__) prize." -- Robert Hughes |
Ben Finney
 
Jarek Zgoda
      04-29-2007
Ben Finney wrote:

>> On the other hand, there seems to be some progress that could be made
>> to reduce the amount of work in writing documentation.
>> Documentation in Esperanto instead of English maybe?

>
> Lojban <URL:http://www.lojban.org/> is both easier to learn world-wide
> than Euro-biased Esperanto, and computer-parseable. Seems a better[0]_
> choice for computer documentation to me.


German seems to be less "wordy" than English, despite the fact that most
of its nouns are much longer.

--
Jarek Zgoda
http://jpa.berlios.de/
 
Arnaud Delobelle
      04-29-2007
On Apr 29, 11:46 am, Michael Hoffman <(E-Mail Removed)> wrote:
> GHUM wrote:
> > Steven,
> >
> >> def index(sequence):
> >>     decorated = zip(sequence, xrange(len(sequence)))
> >>     decorated.sort()
> >>     return [idx for (value, idx) in decorated]
> >
> > wouldn't that be equivalent code?
> >
> > def index(sequence):
> >     return [c for _, c in sorted((b, a) for a, b in enumerate(sequence))]
>
> Or even these:
>
> def index(sequence):
>     return sorted(range(len(sequence)), key=sequence.__getitem__)
>
> def rank(sequence):
>     return sorted(range(len(sequence)),
>                   key=index(sequence).__getitem__)


Better still:

def rank(sequence):
    return index(index(sequence))


But really these two versions of rank are slower than the original one
(as sorting a list is O(nlogn) whereas filling a table with
precomputed values is O(n) ).
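
A small sketch of why the double application works and where the extra cost comes from. The index table is a permutation, and indexing a permutation yields its inverse, which is exactly what the table-filling loop computes in a single O(n) pass. The function names and sample data below are made up for illustration.

```python
def index(sequence):
    return sorted(range(len(sequence)), key=sequence.__getitem__)


def rank_by_fill(sequence):
    # One sort, then an O(n) pass that inverts the index table.
    table = [None] * len(sequence)
    for j, idx in enumerate(index(sequence)):
        table[idx] = j
    return table


def rank_by_double_index(sequence):
    # Two sorts: indexing a permutation yields its inverse.
    return index(index(sequence))


data = [30, 10, 20, 10]
print(index(data))                 # [1, 3, 2, 0]
print(rank_by_fill(data))          # [3, 0, 2, 1]
print(rank_by_double_index(data))  # [3, 0, 2, 1]
```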

Anyway I would like to contribute my own index function:

def index(seq):
    return sum(sorted(map(list, enumerate(seq)), key=list.pop), [])

It's short and has the advantage of being self-documenting, which will
save Steven a lot of annoying typing, I hope. Who said Python
couldn't rival Perl?
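
For anyone who does not find it self-documenting, the one-liner can be unpacked step by step. This is a reading of the code rather than anything from the post, and the intermediate names are made up.

```python
seq = list("defabc")

# Step 1: turn each (position, value) pair into a mutable list.
pairs = list(map(list, enumerate(seq)))  # [[0, 'd'], [1, 'e'], ...]

# Step 2: list.pop as the key removes and returns the last element (the
# value), so the inner lists are ordered by value and shrink to [position].
ordered = sorted(pairs, key=list.pop)    # [[3], [4], [5], [0], [1], [2]]

# Step 3: sum with a list as start value concatenates the inner lists.
result = sum(ordered, [])
print(result)  # [3, 4, 5, 0, 1, 2]
```

Note that the key function mutates its arguments, which is a large part of the joke.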

--
Arnaud

 
Paul Rubin
      04-29-2007
Steven D'Aprano <(E-Mail Removed)> writes:
> I recently needed to write a function to generate a rank table from a
> list. That is, a list of ranks, where the rank of an item is the position
> it would be in if the list were sorted:
>
> alist = list('defabc')
> ranks = [3, 4, 5, 0, 1, 2]


import operator

fst = operator.itemgetter(0)  # these should be builtins...
snd = operator.itemgetter(1)

ranks = map(fst, sorted(enumerate(alist), key=snd))
 
Arnaud Delobelle
      04-29-2007
On Apr 29, 5:33 pm, Paul Rubin <http://(E-Mail Removed)> wrote:
> Steven D'Aprano <(E-Mail Removed)> writes:
> > I recently needed to write a function to generate a rank table from a
> > list. That is, a list of ranks, where the rank of an item is the position
> > it would be in if the list were sorted:

>
> > alist = list('defabc')
> > ranks = [3, 4, 5, 0, 1, 2]

>
> fst = operator.itemgetter(0) # these should be builtins...
> snd = operator.itemgetter(1)
>
> ranks=map(fst, sorted(enumerate(alist), key=snd))


This is what the OP calls the index table, not the ranks table (both
are the same for the example above, but that's an unfortunate
coincidence...)
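
The coincidence is easy to demonstrate. The sketch below wraps the quoted expression in a function (with `list()` added for Python 3) and inverts it to get the rank table. The names `index_table` and `rank_table` and the input 'cab' are made up for illustration.

```python
import operator

fst = operator.itemgetter(0)
snd = operator.itemgetter(1)


def index_table(alist):
    # The quoted expression, wrapped in list() for Python 3.
    return list(map(fst, sorted(enumerate(alist), key=snd)))


def rank_table(alist):
    # Invert the index table: the item at position idx has rank j.
    table = [None] * len(alist)
    for j, idx in enumerate(index_table(alist)):
        table[idx] = j
    return table


print(index_table(list("defabc")))  # [3, 4, 5, 0, 1, 2]
print(rank_table(list("defabc")))   # [3, 4, 5, 0, 1, 2]  (same by coincidence)

print(index_table(list("cab")))     # [1, 2, 0]
print(rank_table(list("cab")))      # [2, 0, 1]
```

For 'defabc' the index table happens to be its own inverse, which is why the two tables agree there.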

--
Arnaud

 
Raymond Hettinger
      04-29-2007
[Steven D'Aprano]
> I recently needed to write a function to generate a rank table from a
> list. That is, a list of ranks, where the rank of an item is the position
> it would be in if the list were sorted:
>
> alist = list('defabc')
> ranks = [3, 4, 5, 0, 1, 2]

. . .
> def rank(sequence):
>     table = [None] * len(sequence)
>     for j, idx in enumerate(index(sequence)):
>         table[idx] = j
>     return table


FWIW, you can do ranking faster and more succinctly with the sorted()
builtin:

def rank(seq):
    return sorted(range(len(seq)), key=seq.__getitem__)


Raymond

 