Assigning generator expressions to ctype arrays

 
 
Patrick Maupin
 
      10-27-2011
Bug or misunderstanding?

Python 2.7.1+ (r271:86832, Apr 11 2011, 18:13:53)
[GCC 4.5.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> x = 32 * [0]
>>> x[:] = (x for x in xrange(32))
>>> from ctypes import c_uint
>>> x = (32 * c_uint)()
>>> x[:] = xrange(32)
>>> x[:] = (x for x in xrange(32))

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Can only assign sequence of same size
>>>


Thanks,
Pat
 
 
Steven D'Aprano
 
      10-27-2011
On Thu, 27 Oct 2011 13:34:28 -0700, Patrick Maupin wrote:

> Bug or misunderstanding?
>
> Python 2.7.1+ (r271:86832, Apr 11 2011, 18:13:53) [GCC 4.5.2] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> x = 32 * [0]
> >>> x[:] = (x for x in xrange(32))
> >>> from ctypes import c_uint
> >>> x = (32 * c_uint)()
> >>> x[:] = xrange(32)
> >>> x[:] = (x for x in xrange(32))
>
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ValueError: Can only assign sequence of same size


From the outside, you can't tell how big a generator expression is. It
has no length:

>>> g = (x for x in xrange(32))
>>> len(g)

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: object of type 'generator' has no len()

Since the array object has no way of telling whether the generator will
have the correct size, it refuses to guess. I would argue that it should
raise a TypeError with a less misleading error message, rather than a
ValueError, so "bug".
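
For anyone reading this on Python 3, where xrange no longer exists, the same failure can be reproduced with range. This is only a minimal sketch: it catches both TypeError and ValueError, because which of the two gets raised is exactly what is being argued about here and could differ between versions.

```python
from ctypes import c_uint

x = (4 * c_uint)()

# A range object reports a length, so slice assignment succeeds.
x[:] = range(4)

# A generator has no len(), so the same assignment is rejected
# before any element is written.
error = None
try:
    x[:] = (i for i in range(4))
except (TypeError, ValueError) as exc:
    error = exc

print(list(x), type(error).__name__)
```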

The simple solution is to use a list comp instead of a generator
expression. If you have an arbitrary generator passed to you from the
outside, and you don't know how long it is yourself, you can use
itertools.islice to extract just the number of elements you want. Given g
some generator expression, rather than doing this:

# risky, if g is huge, the temporary list will also be huge
x[:] = list(g)[:32]

do this instead:

# use lazy slices guaranteed not to be unexpectedly huge
x[:] = list(itertools.islice(g, 32))
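
The islice workaround above can be wrapped up as a small function. A sketch in Python 3 syntax; the name fill_from_iterable is made up for illustration and is not part of ctypes.

```python
import itertools
from ctypes import c_uint

def fill_from_iterable(arr, iterable):
    """Hypothetical helper: copy at most len(arr) items into a ctypes array."""
    # islice stops after len(arr) items, so the temporary list stays small
    # even if the source is huge or infinite.
    items = list(itertools.islice(iterable, len(arr)))
    # ctypes still insists on an exact length match, so a source that is
    # too short raises ValueError here.
    arr[:] = items
    return arr

x = (32 * c_uint)()
fill_from_iterable(x, (i * 2 for i in range(32)))
```

Note that any items beyond the first len(arr) are silently left in the source; whether that is acceptable depends on the caller.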


--
Steven
 
 
Patrick Maupin
 
      10-28-2011
On Oct 27, 5:31 pm, Steven D'Aprano wrote:
> From the outside, you can't tell how big a generator expression is. It has no length:


I understand that.

> Since the array object has no way of telling whether the generator will have the correct size, it refuses to guess.


It doesn't have to guess. It can assume that I, the programmer, know
what the heck I am doing, and then validate that assumption -- trust,
but verify. It merely needs to fill the slice and then ask for one
more and check that StopIteration is raised.
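
Written out in pure Python, the "fill the slice, then ask for one more" idea would look roughly like this. It is a sketch of the proposal, not of anything ctypes actually does, and assign_from_iterator is a made-up name.

```python
from ctypes import c_uint

_MISSING = object()  # sentinel, so None can still be a legitimate item

def assign_from_iterator(arr, iterable):
    """Hypothetical: fill arr in place, then verify the source is exhausted."""
    it = iter(iterable)
    for i in range(len(arr)):
        try:
            arr[i] = next(it)
        except StopIteration:
            raise ValueError(
                "iterator produced only %d of %d items" % (i, len(arr))
            ) from None
    # Trust, but verify: one more next() must find nothing.
    if next(it, _MISSING) is not _MISSING:
        raise ValueError("iterator produced more than %d items" % len(arr))

x = (32 * c_uint)()
assign_from_iterator(x, (i for i in range(32)))
```

On a length mismatch this version raises only after some elements have already been written.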

> I would argue that it should raise a TypeError
> with a less misleading error message, rather
> than a ValueError, so "bug".


And I would argue that it should simply work, unless someone can
present a more compelling reason why not.

> The simple solution is to use a list comp
> instead of a generator expression.


I know how to work around the issue. I'm not sure I should have to.
It violates the principle of least surprise for the ctypes array to
not be able to interoperate with the iterator protocol in this
fashion.

Regards,
Pat
 
 
Terry Reedy
 
      10-28-2011
On 10/27/2011 8:09 PM, Patrick Maupin wrote:

> x[:] = (x for x in xrange(32))


This translates to

s.__setitem__(slice(None,None), generator_object)

where 'generator_object' is completely opaque, except that it will yield
0 to infinity objects in response to next() before raising StopIteration.
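
The translation can be checked with a small stand-in class that just records what __setitem__ receives. A sketch in Python 3; Recorder is a made-up name.

```python
class Recorder:
    """Hypothetical stand-in that records the arguments of __setitem__."""
    def __init__(self):
        self.calls = []

    def __setitem__(self, key, value):
        self.calls.append((key, value))

r = Recorder()
gen = (i for i in range(3))
r[:] = gen

key, value = r.calls[0]
print(key)           # the slice object built from [:]
print(value is gen)  # the generator arrives untouched and unconsumed
```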

Given that a ctype_array is a *fixed-length* array, *unlike* Python's
extensible lists and arrays, failure is a possibility due to mis-matched
lengths. So ctype_array can either look first, as it does, by calling
len(value_object), or leap first and create a temporary array, see if it
fills up exactly right, and if it does, copy it over.

> I know how to work around the issue. I'm not sure I should have to.


I do not think everyone else should suffer substantial increase in space
and run time to avoid surprising you.

> It violates the principle of least surprise


for ctypes to do what is most efficient in 99.9% of uses?

> for the ctypes array to
> not be able to interoperate with the iterator protocol in this
> fashion.


It could, but at some cost. Remember, people use ctypes for efficiency,
so the temp array path would have to be conditional. When you have a
patch, open a feature request on the tracker.

--
Terry Jan Reedy

 
 
Steven D'Aprano
 
      10-28-2011
On Thu, 27 Oct 2011 17:09:34 -0700, Patrick Maupin wrote:

> On Oct 27, 5:31 pm, Steven D'Aprano wrote:
>> From the outside, you can't tell how big a generator expression is. It
>> has no length:

>
> I understand that.
>
>> Since the array object has no way of telling whether the generator will
>> have the correct size, it refuses to guess.

>
> It doesn't have to guess. It can assume that I, the programmer, know
> what the heck I am doing, and then validate that assumption -- trust,
> but verify. It merely needs to fill the slice and then ask for one more
> and check that StopIteration is raised.


Simple, easy, and wrong.

It needs to fill in the slice, check that the slice has exactly the right
number of elements (it may have fewer), and then check that the iterator
is now empty.

If the slice has too few elements, you've just blown away the entire
iterator for no good reason.

If the slice is the right length, but the iterator doesn't next raise
StopIteration, you've just thrown away one perfectly good value. Hope it
wasn't something important.
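
The lost value is easy to see with a plain iterator. A minimal sketch; probe_exhausted is a made-up helper name.

```python
def probe_exhausted(it):
    """Hypothetical check: True if the iterator has nothing left.

    The only way to find out is to call next(), which consumes an item
    whenever the answer turns out to be False.
    """
    sentinel = object()
    return next(it, sentinel) is sentinel

it = iter(range(5))
first_three = [next(it) for _ in range(3)]  # fills a 3-element "slice"
exhausted = probe_exhausted(it)             # swallows the value 3
remaining = list(it)                        # only 4 is left

print(first_three, exhausted, remaining)
```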


>> I would argue that it should raise a TypeError with a less misleading
>> error message, rather than a ValueError, so "bug".

>
> And I would argue that it should simply work, unless someone can present
> a more compelling reason why not.


I think that "the iterator protocol as it exists doesn't allow it to work
the way you want" is a pretty compelling reason.


>> The simple solution is to use a list comp instead of a generator
>> expression.

>
> I know how to work around the issue. I'm not sure I should have to. It
> violates the principle of least surprise for the ctypes array to not be
> able to interoperate with the iterator protocol in this fashion.


Perhaps you're too easily surprised by the wrong things.


--
Steven
 
 
Terry Reedy
 
      10-28-2011
On 10/28/2011 3:21 AM, Steven D'Aprano wrote:

> If the slice has too few elements, you've just blown away the entire
> iterator for no good reason.


> If the slice is the right length, but the iterator doesn't next raise
> StopIteration, you've just thrown away one perfectly good value. Hope it
> wasn't something important.


You have also over-written values that should be set back to what they
were, before the exception is raised, which is why I said the test needs
to be done with a temporary array.
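
A rough sketch of the temporary-storage approach, using a plain list as the temporary rather than a second ctypes array (that substitution is an assumption made for brevity). The name assign_all_or_nothing is made up.

```python
import itertools
from ctypes import c_uint

def assign_all_or_nothing(arr, iterable):
    """Hypothetical: validate the length in temporary storage, then copy."""
    n = len(arr)
    # Ask for one item more than needed so that an over-long source
    # is detected without draining it completely.
    temp = list(itertools.islice(iter(iterable), n + 1))
    if len(temp) != n:
        raise ValueError("need exactly %d items" % n)
    arr[:] = temp  # only reached when the length is right

x = (3 * c_uint)(1, 2, 3)
try:
    assign_all_or_nothing(x, iter(range(10)))
except ValueError:
    pass
print(list(x))  # the array still holds its old values
```

The array is protected, but the source iterator has still lost up to n + 1 items by the time the error is raised.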

--
Terry Jan Reedy

 
 
Patrick Maupin
 
      10-28-2011
On Oct 27, 10:23 pm, Terry Reedy wrote:


> I do not think everyone else should suffer substantial increase in space
> and run time to avoid surprising you.


What substantial increase? There's already a check that winds up
raising an exception. Just make it empty an iterator instead.

> > It violates the principle of least surprise

>
> for ctypes to do what is most efficient in 99.9% of uses?


It doesn't work at all with an iterator, so it's most efficient 100%
of the time now. How do you know how many people would use iterators
if it worked?

>
> It could, but at some cost. Remember, people use ctypes for efficiency,


yes, you just made my argument for me. Thank you. It is incredibly
inefficient to have to create a temp array.

> so the temp array path would have to be conditional.


I don't understand this at all. Right now, it just throws up its
hands and says "I don't work with iterators." Why would it be a
problem to change this?
 
 
Terry Reedy
 
      10-28-2011
On 10/28/2011 2:05 PM, Patrick Maupin wrote:

> On Oct 27, 10:23 pm, Terry Reedy wrote:
>> I do not think everyone else should suffer substantial increase in space
>> and run time to avoid surprising you.

>
> What substantial increase?


of time and space, as I said, for the temporary array that I think would
be needed and which I also described in the previous paragraph that you
clipped.

> There's already a check that winds up
> raising an exception. Just make it empty an iterator instead.


It? I have no idea what you intend that to refer to.


>>> It violates the principle of least surprise

>> for ctypes to do what is most efficient in 99.9% of uses?

>
> It doesn't work at all with an iterator, so it's most efficient 100%
> of the time now. How do you know how many people would use iterators
> if it worked?


I doubt it would be very many because it is *impossible* to make it work
in the way that I think people would want it to.

>> It could, but at some cost. Remember, people use ctypes for efficiency,


> yes, you just made my argument for me. Thank you. It is incredibly
> inefficient to have to create a temp array.


But necessary to work with blank box iterators. Now you are agreeing
with my argument.

>> so the temp array path would have to be conditional.


> I don't understand this at all. Right now, it just throws up its
> hands and says "I don't work with iterators."


If ctype_array slice assignment were to be augmented to work with
iterators, that would, in my opinion (and see below), require use of
temporary arrays. Since slice assignment does not use temporary arrays
now (see below), that augmentation should be conditional on the source
type being a non-sequence iterator.

> Why would it be a problem to change this?


CPython comes with immutable fixed-length arrays (tuples) that do not
allow slice assignment and mutable variable-length arrays (lists) that
do. The definition is 'replace the indicated slice with a new slice
built from all values from an iterable'. Point 1: This works for any
properly functioning iterable that produces any finite number of items.
Iterators are always exhausted.
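
Point 1 in a few lines of Python 3, as a minimal sketch: a three-item slice of a list is replaced by five items from a generator, and the list simply grows to fit.

```python
a = list(range(8))
source = (x * 10 for x in range(5))

# Replace the 3-item slice a[3:6] with all 5 items from the generator.
a[3:6] = source

print(a)       # the list has grown from 8 to 10 items
print(len(a))
leftover = list(source)  # nothing remains: the generator was exhausted
```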

Replace can be thought of as delete followed by add, but the
implementation is not that naive. Point 2: If anything goes wrong and an
exception is raised, the list is unchanged. This means that there must
be temporary internal storage of either old or new references. An
example that uses an improperly functioning generator.

>>> a
[0, 1, 2, 3, 4, 5, 6, 7]
>>> def g():
        yield None
        raise ValueError

>>> a[3:6]=g()
Traceback (most recent call last):
  File "<pyshell#21>", line 1, in <module>
    a[3:6]=g()
  File "<pyshell#20>", line 3, in g
    raise ValueError
ValueError
>>> a
[0, 1, 2, 3, 4, 5, 6, 7]

A c_uint array is a new kind of beast: a fixed-length mutable array. So
it has to have a different definition of slice assignment than lists.
Thomas Heller, the ctypes author, apparently chose 'replacement by a
sequence with exactly the same number of items, else raise an
exception', though I do not know what the doc actually says.

An alternative definition would have been to replace as much of the
slice as possible, from the beginning, while ignoring any items in
excess of the slice length. This would work with any iterable. However,
partial replacement of a slice would be a surprising innovation to most.
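
For comparison, that alternative definition could be sketched in pure Python as below. This illustrates the idea only; fill_prefix is a made-up name and nothing like it exists in ctypes.

```python
from ctypes import c_uint

def fill_prefix(arr, iterable):
    """Hypothetical: overwrite arr from the start with as many items as fit."""
    count = 0
    # range() comes first in zip(), so once the array is full zip stops
    # without pulling an extra item from the source.
    for i, value in zip(range(len(arr)), iterable):
        arr[i] = value
        count += 1
    return count

x = (4 * c_uint)()
written_short = fill_prefix(x, iter([7, 8]))  # source shorter than the array

y = (4 * c_uint)()
source = iter(range(10))
written_long = fill_prefix(y, source)         # source longer than the array
```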

The current implementation assumes that the reported length of a
sequence matches the valid indexes and dispenses with temporary storage.
This is shown by the following:

from ctypes import c_uint
n = 20

class Liar:
    # Claims a length of n but only supplies the first 10 items.
    def __len__(self): return n
    def __getitem__(self, index):
        if index < 10:
            return 1
        else:
            raise ValueError()

x = (n * c_uint)()
print(list(x))
x[:] = range(n)
print(list(x))
try:
    x[:] = Liar()
except ValueError:
    # Swallow the error raised by Liar so the final state can be printed.
    pass
print(list(x))
>>>
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]

I consider such unintended partial replacement to be a glitch. An
exception could be raised, but without adding temp storage, the array
could not be restored. And making a change *and* raising an exception
would be a different sort of glitch. (One possible with augmented
assignment involving a mutable member of a tuple.) So I would leave this
as undefined behavior for an input outside the proper domain of the
function.

Anyway, as I said before, you are free to propose a specific change
('work with iterators' is too vague) and provide a corresponding patch.

--
Terry Jan Reedy

 
 
Patrick Maupin
 
      10-28-2011
On Oct 28, 3:19 am, Terry Reedy wrote:
> On 10/28/2011 3:21 AM, Steven D'Aprano wrote:
>
> > If the slice has too few elements, you've just blown away the entire
> > iterator for no good reason.
> > If the slice is the right length, but the iterator doesn't next raise
> > StopIteration, you've just thrown away one perfectly good value. Hope it
> > wasn't something important.

>
> You have also over-written values that should be set back to what they
> were, before the exception is raised, which is why I said the test needs
> to be done with a temporary array.
>


Sometimes when exceptions happen, data is lost. You both make a big
deal out of simultaneously (a) not placing burden on the normal case
and (b) defining the normal case by way of what happens during an
exception. Iterators are powerful and efficient, and ctypes are
powerful and efficient, and the only reason you've managed to give why
I shouldn't be able to fill a ctype array slice from an iterator is
that, IF I SCREW UP and the iterator doesn't produce the right amount
of data, I will have lost some data.

Regards,
Pat
 
 
Patrick Maupin
 
      10-28-2011
On Oct 28, 4:51 pm, Patrick Maupin wrote:
> On Oct 28, 3:19 am, Terry Reedy wrote:
>
> > On 10/28/2011 3:21 AM, Steven D'Aprano wrote:
> >
> > > If the slice has too few elements, you've just blown away the entire
> > > iterator for no good reason.
> > > If the slice is the right length, but the iterator doesn't next raise
> > > StopIteration, you've just thrown away one perfectly good value. Hope it
> > > wasn't something important.
> >
> > You have also over-written values that should be set back to what they
> > were, before the exception is raised, which is why I said the test needs
> > to be done with a temporary array.
>
> Sometimes when exceptions happen, data is lost. You both make a big
> deal out of simultaneously (a) not placing burden on the normal case
> and (b) defining the normal case by way of what happens during an
> exception. Iterators are powerful and efficient, and ctypes are
> powerful and efficient, and the only reason you've managed to give why
> I shouldn't be able to fill a ctype array slice from an iterator is
> that, IF I SCREW UP and the iterator doesn't produce the right amount
> of data, I will have lost some data.
>
> Regards,
> Pat


And, BTW, the example you give of, e.g.

a,b,c = (some generator expression)

ALREADY LOSES DATA if the iterator isn't the right size and it raises
an exception.

It doesn't overwrite a or b or c, but you're deluding yourself if you
think that means it hasn't altered the system state.
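
A minimal Python 3 sketch of what I mean: the failed unpacking leaves the three names alone, but the iterator has been advanced past values that nobody will ever see.

```python
it = iter(range(5))
a = b = c = None

try:
    a, b, c = it   # five values on offer, only three targets
except ValueError:
    pass

# The names were never rebound...
print(a, b, c)
# ...but the iterator lost the three values it handed over, plus the
# fourth one that was fetched only to discover there were too many.
leftover = list(it)
print(leftover)
```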