# pairs from a list

Alan Isaac

 01-22-2008
I want to generate sequential pairs from a list.
Here is a way::

    from itertools import izip, islice
    for x12 in izip(islice(x,0,None,2),islice(x,1,None,2)):
        print x12

(Of course the print statement is just illustrative.)
What is the fastest way? (Ignore the import time.)

Thanks,
Alan Isaac

Paul Rubin

 01-22-2008
Alan Isaac <(E-Mail Removed)> writes:
> (Of course the print statement is just illustrative.)
> What is the fastest way? (Ignore the import time.)

You have to try a bunch of different ways and time them. One
idea (untested):

def pairs(seq):
    while True:
        yield (seq.next(), seq.next())
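
On Python 3 the same idea needs two adjustments, sketched here on the assumption that the caller may pass any iterable. The `seq.next()` method becomes the built-in `next()`, and a `StopIteration` that escapes a generator body is turned into a `RuntimeError` (PEP 479, the default from Python 3.7), so the end of the input has to be caught explicitly. The name `pairs_py3` is illustrative only and does not appear in the thread.

```python
def pairs_py3(seq):
    """Yield consecutive non-overlapping pairs; a trailing odd item is dropped."""
    it = iter(seq)  # accept any iterable, not only an iterator
    while True:
        try:
            first = next(it)
            second = next(it)
        except StopIteration:
            return  # input exhausted: end the generator cleanly
        yield (first, second)


print(list(pairs_py3([1, 2, 3, 4, 5])))  # [(1, 2), (3, 4)]
```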

George Sakkis

 01-22-2008
On Jan 21, 10:20 pm, Alan Isaac <(E-Mail Removed)> wrote:
> I want to generate sequential pairs from a list.
> Here is a way::
>
>     from itertools import izip, islice
>     for x12 in izip(islice(x,0,None,2),islice(x,1,None,2)):
>         print x12
>
> (Of course the print statement is just illustrative.)
> What is the fastest way? (Ignore the import time.)

Look up the timeit module and test the various alternatives yourself;
that's the most reliable way to tell for sure.

George
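
A minimal sketch of such a measurement, written for Python 3 (where the lazy built-in `zip` stands in for `izip`). The helper `best_time`, the two candidate function names, the list size, and the repeat counts are all assumptions chosen for illustration; absolute numbers will differ from machine to machine.

```python
import timeit
from itertools import islice


def pairs_islice(x):
    # The original poster's approach, ported to Python 3.
    return zip(islice(x, 0, None, 2), islice(x, 1, None, 2))


def pairs_shared_iter(x):
    # Alternative: pull two items at a time from one shared iterator.
    it = iter(x)
    return zip(it, it)


def best_time(func, data, number=200, repeat=5):
    """Return the best total time in seconds for `number` calls of func(data)."""
    # list() forces the lazy result so the pairing work is actually measured.
    timer = timeit.Timer(lambda: list(func(data)))
    return min(timer.repeat(repeat=repeat, number=number))


data = list(range(1000))
for func in (pairs_islice, pairs_shared_iter):
    print(func.__name__, best_time(func, data))
```

Taking the minimum over several repeats is the usual advice for `timeit`, since the slower runs mostly reflect interference from other processes rather than the code under test.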

Paddy

 01-22-2008
On Jan 22, 3:20 am, Alan Isaac <(E-Mail Removed)> wrote:
> I want to generate sequential pairs from a list.

<<snip>>
> What is the fastest way? (Ignore the import time.)

1) How fast is the method you have?
2) How much faster does it need to be for your application?
3) Are there any other bottlenecks in your application?
4) Is this the routine whose smallest % speed-up would give the
largest overall speed up of your application?

George Sakkis

 01-22-2008
On Jan 22, 12:15 am, Paddy <(E-Mail Removed)> wrote:
> On Jan 22, 3:20 am, Alan Isaac <(E-Mail Removed)> wrote:
> > I want to generate sequential pairs from a list.
> <<snip>>
> > What is the fastest way? (Ignore the import time.)
>
> 1) How fast is the method you have?
> 2) How much faster does it need to be for your application?
> 3) Are there any other bottlenecks in your application?
> 4) Is this the routine whose smallest % speed-up would give the
> largest overall speed up of your application?

I believe the "what is the fastest way" question for such small well-
defined tasks is worth asking on its own, regardless of whether it
makes a difference in the application (or even if there is no
application to begin with). Just because cpu cycles are cheap these
days is not a good reason to be sloppy. Moreover, often the fastest
pure Python version happens to be among the most elegant and concise,
unlike other languages where optimization usually implies obfuscation.

George

Steven D'Aprano

 01-22-2008
On Mon, 21 Jan 2008 21:34:28 -0800, George Sakkis wrote:

> I believe the "what is the fastest way" question for such small well-
> defined tasks is worth asking on its own, regardless of whether it makes
> a difference in the application (or even if there is no application to
> begin with). Just because cpu cycles are cheap these days is not a good
> reason to be sloppy. Moreover, often the fastest pure Python version
> happens to be among the most elegant and concise, unlike other languages
> where optimization usually implies obfuscation.

I wonder why it is that people automatically assume that "optimization"
means optimize the time taken, and not the developer effort to write it
in the first place, the effort required to maintain it over time, or the
memory used at runtime, let alone some combination of all four factors.

Memory is cheap, but applications are hungry.

CPUs are fast, and for most applications the difference between 3ms and
30ms is undetectable by the user. Why do we care so little about saving
memory and so much about ever-decreasing time savings?

--
Steven

Arnaud Delobelle

 01-22-2008
On Jan 22, 3:20 am, Alan Isaac <(E-Mail Removed)> wrote:
> I want to generate sequential pairs from a list.
> Here is a way::
>
>     from itertools import izip, islice
>     for x12 in izip(islice(x,0,None,2),islice(x,1,None,2)):
>         print x12
>
> (Of course the print statement is just illustrative.)
> What is the fastest way? (Ignore the import time.)
>
> Thanks,
> Alan Isaac

Don't know the fastest, but here's a very concise way:

from itertools import izip

def ipairs(seq):
    it = iter(seq)
    return izip(it, it)

>>> list(ipairs(xrange(10)))
[(0, 1), (2, 3), (4, 5), (6, 7), (8, 9)]
>>> list(ipairs('hello'))
[('h', 'e'), ('l', 'l')]

--
Arnaud
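
For readers on Python 3, a sketch of the same trick: `izip` and `xrange` no longer exist there, and the built-in `zip` and `range` are already lazy, so no import is needed. The function works because both arguments to `zip` are the same iterator object, so each output tuple consumes two consecutive items.

```python
def ipairs(seq):
    it = iter(seq)
    return zip(it, it)  # the same iterator twice: each tuple takes two items


print(list(ipairs(range(10))))  # [(0, 1), (2, 3), (4, 5), (6, 7), (8, 9)]
print(list(ipairs("hello")))    # [('h', 'e'), ('l', 'l')]
```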

Alan Isaac

 01-22-2008
I suppose my question should have been,
is there an obviously faster way?
Anyway, of the four ways below, the
first is substantially fastest. Is
there an obvious reason why?

Thanks,
Alan Isaac

PS My understanding is that the behavior
of the last is implementation dependent
and not guaranteed.

from itertools import izip, islice

def pairs1(x):
    for x12 in izip(islice(x,0,None,2),islice(x,1,None,2)):
        yield x12

def pairs2(x):
    xiter = iter(x)
    while True:
        yield xiter.next(), xiter.next()

def pairs3(x):
    for i in range( len(x)//2 ):
        yield x[2*i], x[2*i+1],

def pairs4(x):
    xiter = iter(x)
    for x12 in izip(xiter,xiter):
        yield x12
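
Before comparing speeds it is worth confirming that the candidates agree. The following is a Python 3 port of the four functions, offered as a sketch that assumes list input as in the original question (`pairs3` needs `len()` and indexing, and `pairs1` needs an input that can be iterated twice). The explicit `try`/`except` in `pairs2` is an addition required by PEP 479 and is not part of the original code.

```python
from itertools import islice


def pairs1(x):
    yield from zip(islice(x, 0, None, 2), islice(x, 1, None, 2))


def pairs2(x):
    xiter = iter(x)
    while True:
        try:
            pair = next(xiter), next(xiter)
        except StopIteration:
            return  # Python 3.7+: StopIteration must not leak from a generator
        yield pair


def pairs3(x):
    for i in range(len(x) // 2):
        yield x[2 * i], x[2 * i + 1]


def pairs4(x):
    xiter = iter(x)
    yield from zip(xiter, xiter)


# All four should drop a trailing odd element and otherwise agree.
for data in ([], [1], [1, 2, 3, 4], list(range(7))):
    results = [list(f(data)) for f in (pairs1, pairs2, pairs3, pairs4)]
    assert all(r == results[0] for r in results), data
print("all four agree")
```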

Arnaud Delobelle

 01-22-2008
On Jan 22, 1:19 pm, Alan Isaac <(E-Mail Removed)> wrote:
[...]
> PS My understanding is that the behavior
> of the last is implementation dependent
> and not guaranteed.

[...]
> def pairs4(x):
>     xiter = iter(x)
>     for x12 in izip(xiter,xiter):
>         yield x12

According to the docs [1], izip is defined to be equivalent to:

def izip(*iterables):
    iterables = map(iter, iterables)
    while iterables:
        result = [it.next() for it in iterables]
        yield tuple(result)

This guarantees that it.next() will be performed from left to right,
so there is no risk that e.g. pairs4([1, 2, 3, 4]) returns [(2, 1),
(4, 3)].

Is there anything else that I am overlooking?

[1] http://docs.python.org/lib/itertools-functions.html

--
Arnaud
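
The ordering argument can be checked with a Python 3 rendering of that documented equivalent. This is a sketch, not the actual C implementation, and `izip_sketch` is a made-up name. The documentation for the Python 3 built-in `zip` likewise states that the left-to-right evaluation order of the iterables is guaranteed.

```python
def izip_sketch(*iterables):
    iterators = [iter(it) for it in iterables]
    while iterators:
        result = []
        for it in iterators:  # strictly left to right
            try:
                result.append(next(it))
            except StopIteration:
                return  # any exhausted iterator ends the whole zip
        yield tuple(result)


shared = iter([1, 2, 3, 4])
print(list(izip_sketch(shared, shared)))  # [(1, 2), (3, 4)]
```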

bearophileHUGS@lycos.com

 01-22-2008
Alan Isaac>What is the fastest way? (Ignore the import time.)<

Maybe someday someone will realize such stuff belongs in the Python
standard library...

If you need a lazy generator without padding, that splits starting
from the start, then this is the fastest for me when n is close to 2:

from itertools import izip

def xpartition(seq, n=2):
    return izip( *(iter(seq),)*n )
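
A Python 3 sketch of the same function, with the built-in `zip` in place of `izip`. Multiplying the one-element tuple by `n` yields `n` references to a single iterator, so each output tuple draws `n` consecutive items, and any incomplete final group is dropped.

```python
def xpartition(seq, n=2):
    # (iter(seq),) * n repeats the SAME iterator object n times.
    return zip(*(iter(seq),) * n)


print(list(xpartition(range(7), 3)))  # [(0, 1, 2), (3, 4, 5)]
print(list(xpartition("abcd")))       # [('a', 'b'), ('c', 'd')]
```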

If you need the fastest greedy version without padding then there are
two answers, one for Psyco and one for Python without...
If you need padding or to start from the end then there are more

Bye,
bearophile