suggestion for a small addition to the Python 3 list class

 
 
Robert Yacobellis
      04-21-2013
Greetings,

I'm an instructor of Computer Science at Loyola University, Chicago, and I and Dr. Harrington (copied on this email) teach sections of COMP 150, Introduction to Computing, using Python 3. One of the concepts we teach students is the str methods split() and join(). I have a suggestion for a small addition to the list class: add a join() method to lists. It would work in a similar way to how join works for str's, except that the object and method parameter would be reversed: <list object>.join(<str object>).

Rationale: When I teach students about split(), I can intuitively tell them split() splits the string on its left on white space or a specified string. Explaining the current str join() method to them doesn't seem to make as much sense: use the string on the left to join the items in the list?? If the list class had a join method, it would be more intuitive to say "join the items in the list using the specified string (the method's argument)." This is similar to Scala's List mkString() method.
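For reference, a minimal sketch of the two existing str methods being contrasted here. The sample strings and variable names are made up for illustration.

```python
# Splitting: the string being split is the receiver of the method.
words = "spam eggs ham".split()      # no argument: split on whitespace
fields = "a,b,c".split(",")          # split on a specified string

# Joining: the separator is the receiver, the items are the argument.
sentence = " ".join(words)
csv_line = ",".join(fields)

print(words)
print(sentence)
print(csv_line)
```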

I've attached a proposed implementation in Python code which is a little more general than what I've described. In this implementation the list can contain elements of any type, and the separator can also be any data type, not just str.

I've noticed that the str join() method takes an iterable, so in the most general case I'm suggesting to add a join() method to every Python-provided iterable (however, for split() vs. join() it would be sufficient to just add a join() method to the list class).

Please let me know your ideas, reactions, and comments on this suggestion.

Thanks and regards,
Dr. Robert (Bob) Yacobellis

 
Steven D'Aprano
      04-21-2013
On Sun, 21 Apr 2013 09:09:20 -0500, Robert Yacobellis wrote:

> Greetings,
>
> I'm an instructor of Computer Science at Loyola University, Chicago, and
> I and Dr. Harrington (copied on this email) teach sections of COMP 150,
> Introduction to Computing, using Python 3. One of the concepts we teach
> students is the str methods split() and join(). I have a suggestion for
> a small addition to the list class: add a join() method to lists. It
> would work in a similar way to how join works for str's, except that the
> object and method parameter would be reversed: <list object>.join(<str
> object>).


That proposed interface doesn't make much sense to me. You're performing
a string operation ("make a new string, using this string as a
separator") not a list operation, so it's not really appropriate as a
list method. It makes much more sense as a string method.

It is also much more practical as a string method. This way, only two
objects need a join method: strings, and bytes (or if you prefer, Unicode
strings and byte strings). Otherwise, you would need to duplicate the
method in every possible iterable object:

- lists
- tuples
- dicts
- OrderedDicts
- sets
- frozensets
- iterators
- generators
- every object that obeys the sequence protocol
- every object that obeys the iterator protocol

(with the exception of iterable objects such as range objects that cannot
contain strings). Every object would have to contain code that does
exactly the same thing in every detail: walk the iterable, checking that
the item is a string, and build up a new string with the given separator:

class list:  # also tuple, dict, set, frozenset, etc...
    def join(self, separator):
        ...


Not only does that create a lot of duplicated code, but it also increases
the burden on anyone creating an iterable class, including iterators and
sequences. Anyone who writes their own iterable class has to write their
own join method, which is actually trickier than it seems at first
glance. (See below.)

Any half-decent programmer would recognise the duplicated code and factor
it out into an external function that takes a separator and an iterable
object:

def join(iterable, separator):
    # common code goes here... it's *all* common code, every object's
    # join method is identical
    ...


That's exactly what Python already does, except it swaps the order of the
arguments:

def join(separator, iterable):
    ...


and promotes it to a method on strings instead of a bare function.
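As a rough illustration of that point, the same str.join call already accepts each kind of iterable listed earlier, with no join method needed on any of them. The sample data is made up, and the plain dict line assumes Python 3.7 or later, where dicts keep insertion order.

```python
from collections import OrderedDict

sep = ", "

from_list = sep.join(["a", "b", "c"])
from_tuple = sep.join(("a", "b", "c"))
from_dict = sep.join({"a": 1, "b": 2})                  # iterates over the keys
from_ordered = sep.join(OrderedDict([("a", 1), ("b", 2)]))
from_iterator = sep.join(iter("abc"))
from_generator = sep.join(str(n) for n in range(3))
from_set = sep.join(sorted({"c", "a", "b"}))            # sets are unordered, so sort first

# bytes has its own join, which works the same way on bytes items.
from_bytes = b"-".join([b"a", b"b"])
```

Each result is an ordinary str (or bytes, in the last case), whatever the type of the iterable that was passed in.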


> Rationale: When I teach students about split(), I can intuitively tell
> them split() splits the string on its left on white space or a specified
> string. Explaining the current str join() method to them doesn't seem
> to make as much sense: use the string on the left to join the items in
> the list??


Yes, exactly. Makes perfect sense to me.


> If the list class had a join method, it would be more
> intuitive to say "join the items in the list using the specified string
> (the method's argument)."


You can still say that. You just have to move the parenthetical aside:

"Join the items in the list (the method's argument) using the specified
string."



> This is similar to Scala's List mkString() method.



This is one place where Scala gets it wrong. In my opinion, as a list
method, mkString ought to operate on the entire list, not its individual
items. The nearest equivalent in Python would be converting a list to a
string using the repr() or str() functions:

py> str([1, 2, 3])
'[1, 2, 3]'


(which of course call the special methods __repr__ or __str__ on the
list).


> I've attached a proposed implementation in Python code which is a little
> more general than what I've described. In this implementation the list
> can contain elements of any type, and the separator can also be any data
> type, not just str.


Just for the record, the implementation you provide will be O(N**2) due
to the repeated string concatenation, which means it will be *horribly*
slow for large enough lists. It's actually quite difficult to efficiently
join a lot of strings without using the str.join method. Repeated string
concatenation will, in general, be slow due to the repeated copying of
intermediate results.
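The attached implementation is not reproduced in this thread, so the following is only a hypothetical stand-in (the name naive_join is invented here) showing the repeated-concatenation pattern being described. It gives the same result as str.join, but each + may copy everything accumulated so far. CPython can sometimes optimise such concatenation in place, so actual timings vary and none are claimed here.

```python
def naive_join(items, sep):
    """Join strings by repeated concatenation (illustrative only)."""
    result = ""
    first = True
    for item in items:
        if not first:
            result = result + sep    # may copy the whole accumulated string
        result = result + item       # and again here
        first = False
    return result


words = [str(n) for n in range(1000)]

slow = naive_join(words, ",")
fast = ",".join(words)               # builds the result without intermediate copies

print(slow == fast)
```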

By shifting the burden of writing a join method onto everyone who creates
a sequence type, we would end up with a lot of slow code.

If you must have a convenience (inconvenience?) method on lists, the
right way to do it is like this:

class list2(list):
    def join(self, sep=' '):
        if isinstance(sep, (str, bytes)):
            return sep.join(self)
        raise TypeError
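A short usage sketch of that subclass, repeated here so the example is self-contained. The sample values and variable names are assumptions added for illustration.

```python
class list2(list):
    def join(self, sep=' '):
        # Delegate to the separator's own join; reject anything else.
        if isinstance(sep, (str, bytes)):
            return sep.join(self)
        raise TypeError


words = list2(["spam", "eggs", "ham"])

default_joined = words.join()        # separator defaults to a single space
dashed = words.join("-")
raw = list2([b"a", b"b"]).join(b",") # bytes separator with bytes items

try:
    words.join(0)                    # not a str or bytes separator
except TypeError:
    rejected = True
else:
    rejected = False
```

Because the method simply delegates to sep.join, items of the wrong type still raise the usual TypeError from str.join or bytes.join.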





--
Steven
 