# how to iterate over sequence and non-sequence ?

stef mientki

 10-18-2007
hello,

I dynamically generate a sequence of values,
but this "sequence" could also have length 1 or even length 0.

So I get a line in one of these forms:
line = '(2,3,4)'
line = ''
line = '(2)'
(in fact these are not constant numbers, but all kinds of integer
variables, coming from all over the program, selected from a tree that
shows all "reachable" variables)

So in fact I get the value from an exec statement, like this
exec 'signals = ' + line

Now I want to iterate over "signals", which works perfectly if there are 2
or more signals,
but it fails when I have none or just 1 signal.
for value in signals:
    do something

As this is meant for real-time signals, I want it fast, so (I think) I
can't afford extensive testing.

Any smart solution there ?

thanks,
Stef Mientki

Paul Hankin

 10-19-2007
On Oct 19, 12:24 am, stef mientki <(E-Mail Removed)> wrote:
> I generate dynamically a sequence of values,
> but this "sequence" could also have length 1 or even length 0.
>
> So I get some line in the form of:
> line = '(2,3,4)'
> line = ''
> line = '(2)'
> (in fact these are not constant numbers, but all kind of integer
> variables, coming from all over the program, selected from a tree, that
> shows all "reachable" variables)
>
> So in fact I get the value from an exec statement, like this
> exec 'signals = ' + line
>
> Now I want to iterate over "signals", which works perfect if there are 2
> or more signals,
> but it fails when I have none or just 1 signal.
> for value in signals :
> do something
>
> As this meant for real-time signals, I want it fast, so (I think) I
> can't afford extensive testing.
>
> Any smart solution there ?

First: don't collect data into strings - python has many container
types which you can use.

Next, your strings look like they're supposed to contain tuples. In
fact, tuples are a bit awkward sometimes because you have to use
'(a,)' for a tuple with one element - (2) isn't a tuple of length one,
it's the same as 2. Either cope with this special case, or use lists.
Either way, you'll have to use () or [] for an empty sequence.
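
To illustrate the point about one-element tuples, here is a small sketch. It is written for Python 3, whereas the thread itself uses Python 2 syntax. The helper name as_sequence is hypothetical and not from the original posts; it shows one possible way to cope with the special case so that a plain for loop works for zero, one, or many values.

```python
# (2) is just the integer 2; the trailing comma is what makes a tuple.
assert type((2)) is int
assert type((2,)) is tuple
assert type([2]) is list          # a one-element list needs no special syntax


def as_sequence(value):
    """Hypothetical helper: wrap a lone value so it can always be iterated."""
    if isinstance(value, (tuple, list)):
        return value
    return (value,)


for signals in [(2, 3, 4), (2), (), [2], []]:
    values = [v for v in as_sequence(signals)]
    print(repr(signals), '->', values)
```

Using lists throughout, as suggested above, avoids the need for such a wrapper at all.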

--
Paul Hankin

Steven D'Aprano

 10-19-2007
On Fri, 19 Oct 2007 01:24:09 +0200, stef mientki wrote:

> hello,
>
> I generate dynamically a sequence of values, but this "sequence" could
> also have length 1 or even length 0.
>
> So I get some line in the form of:
> line = '(2,3,4)'
> line = ''
> line = '(2)'
> (in fact these are not constant numbers, but all kind of integer
> variables, coming from all over the program, selected from a tree, that
> shows all "reachable" variables)
>
> So in fact I get the value from an exec statement, like this
> exec 'signals = ' + line

And then, one day, somebody who doesn't like you will add the following

"0; import os; os.system('rm # -rf /')"

[ Kids: don't try this at home! Seriously, running that command will be
spike in it. ]

Don't use exec in production code unless you know what you're doing. In
fact, don't use exec in production code.

> Now I want to iterate over "signals", which works perfect if there are 2
> or more signals,
> but it fails when I have none or just 1 signal.
> for value in signals :
> do something

No, I would say it already failed before it even got there.

>>> line = ''
>>> exec 'signals = ' + line
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<string>", line 1
    signals =
              ^
SyntaxError: unexpected EOF while parsing

This is the right way to deal with your data:

input_data = """ (2, 3 , 4)

(2)
(3,4,5)
( 1, 2,3)
"""

for line in input_data.split('\n'):
    line = line.strip().strip('()')
    values = line.split(',')
    for value in values:
        value = value.strip()
        if value:
            print(value)

> As this meant for real-time signals, I want it fast, so (I think) I
> can't afford extensive testing.

Don't guess, test it and see if it is fast enough. Some speed ups:

If you're reading from a file, you can just say: "for line in file:"
instead of slurping the whole lot into one enormous string, then
splitting over newlines.

If you can guarantee that there is no extra whitespace in the file, you
can change the line

line = line.strip().strip('()')

to the following:

line = line.strip('\n()')

and save a smidgen of time per loop. Likewise, drop the "value =
value.strip()" in the inner loop.
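
As a rough way to follow the "don't guess, test it" advice, the two strip variants can be timed against each other. This is only a sketch for Python 3 using the standard timeit module; the sample line and the repeat count are assumptions, and the actual numbers will vary from machine to machine.

```python
import timeit

line = "(2,3,4)\n"   # sample line with no extra whitespace (an assumption)

# Both variants must give the same result on clean input before comparing speed.
assert line.strip().strip('()') == line.strip('\n()') == "2,3,4"

two_calls = timeit.timeit("line.strip().strip('()')",
                          globals={"line": line}, number=100000)
one_call = timeit.timeit("line.strip('\\n()')",
                         globals={"line": line}, number=100000)

print("strip().strip('()'): %.4f s" % two_calls)
print("strip('\\n()'):       %.4f s" % one_call)
```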

--
Steven.

Nils

 10-19-2007
On Oct 19, 10:58 am, Steven D'Aprano
<(E-Mail Removed)> wrote:
> On Fri, 19 Oct 2007 01:24:09 +0200, stef mientki wrote:
> > hello,

>
> > I generate dynamically a sequence of values, but this "sequence" could
> > also have length 1 or even length 0.

>
> > So I get some line in the form of:
> > line = '(2,3,4)'
> > line = ''
> > line = '(2)'
> > (in fact these are not constant numbers, but all kind of integer
> > variables, coming from all over the program, selected from a tree, that
> > shows all "reachable" variables)

>
> > So in fact I get the value from an exec statement, like this
> > exec 'signals = ' + line

>
> And then, one day, somebody who doesn't like you will add the following
>
> "0; import os; os.system('rm # -rf /')"
>
> [ Kids: don't try this at home! Seriously, running that command will be
> spike in it. ]
>
> Don't use exec in production code unless you know what you're doing. In
> fact, don't use exec in production code.
>
> > Now I want to iterate over "signals", which works perfect if there are 2
> > or more signals,
> > but it fails when I have none or just 1 signal.
> > for value in signals :
> > do something

>
> No, I would say it already failed before it even got there.
>
> >>> line = ''
> >>> exec 'signals = ' + line

>
> Traceback (most recent call last):
> File "<stdin>", line 1, in ?
> File "<string>", line 1
> signals =
> ^
> SyntaxError: unexpected EOF while parsing
>
> This is the right way to deal with your data:
>
> input_data = """ (2, 3 , 4)
>
> (2)
> (3,4,5)
> ( 1, 2,3)
> """
>
> for line in input_data.split('\n'):
> line = line.strip().strip('()')
> values = line.split(',')
> for value in values:
> value = value.strip()
> if value:
> print(value)
>
> > As this meant for real-time signals, I want it fast, so (I think) I
> > can't afford extensive testing.

>
> Don't guess, test it and see if it is fast enough. Some speed ups:
>
> If you're reading from a file, you can just say: "for line in file:"
> instead of slurping the whole lot into one enormous string, then
> splitting over newlines.
>
> If you can guarantee that there is no extra whitespace in the file, you
> can change the line
>
> line = line.strip().strip('()')
>
> to the following:
>
> line = line.strip('\n()')
>
> and save a smidgen of time per loop. Likewise, drop the "value =
> value.strip()" in the inner loop.
>
> --
> Steven.

why not:
>>> for i in eval('(1,2,3)'):
...     print i
1
2
3

Duncan Booth

 10-19-2007
Nils <(E-Mail Removed)> wrote:

> why not:
> >>> for i in eval('(1,2,3)'):
> ...     print i
> 1
> 2
> 3
>

For the exact same reason Steven already gave you: one day someone will

For eval you need to use slightly more complicated expressions. e.g.
"__import__('os').system('rm # -rf /')"
will be sufficient to mess you up.
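
When the strings really do contain only literal values, as in the '(2,3,4)' examples, later Python versions offer ast.literal_eval, which parses literals without running arbitrary code. The sketch below is written for Python 3; the helper name parse_signals is hypothetical. Note that it would not cover strings containing variable names, since names are not literals.

```python
import ast


def parse_signals(line):
    """Hypothetical helper: turn '(2,3,4)', '(2)' or '' into a list."""
    line = line.strip()
    if not line:                      # '' means "no signals"
        return []
    value = ast.literal_eval(line)    # accepts literals only, never calls code
    if isinstance(value, (tuple, list)):
        return list(value)
    return [value]                    # '(2)' evaluates to the bare integer 2


for text in ['(2,3,4)', '', '(2)']:
    print(repr(text), '->', parse_signals(text))

# An expression that is not a plain literal is rejected instead of executed.
try:
    parse_signals("__import__('os').getcwd()")
except ValueError as exc:
    print('rejected:', exc)
```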

stef

 10-19-2007
Paul Hankin wrote:
> On Oct 19, 12:24 am, stef mientki <(E-Mail Removed)> wrote:
>
>> I generate dynamically a sequence of values,
>> but this "sequence" could also have length 1 or even length 0.
>>
>> So I get some line in the form of:
>> line = '(2,3,4)'
>> line = ''
>> line = '(2)'
>> (in fact these are not constant numbers, but all kind of integer
>> variables, coming from all over the program, selected from a tree, that
>> shows all "reachable" variables)
>>
>> So in fact I get the value from an exec statement, like this
>> exec 'signals = ' + line
>>
>> Now I want to iterate over "signals", which works perfect if there are 2
>> or more signals,
>> but it fails when I have none or just 1 signal.
>> for value in signals :
>> do something
>>
>> As this meant for real-time signals, I want it fast, so (I think) I
>> can't afford extensive testing.
>>
>> Any smart solution there ?
>>

>
> First: don't collect data into strings - python has many container
> types which you can use.
>

Well I'm not collecting data, I'm collecting pointers to data.
This program simulates a user written program in JAL.
As Python doesn't support pointers, instead I collect names.
The names are derived from an analysis of the user program under test,
so the danger some of you are referring to, is not there,
or at least is not that simple.
Besides, it's a local application where the goal is to let a user test
his program (and hardware),
so if the user wants to hack, he might as well type "format c:\" directly.

> Next, your strings look like they're supposed to contain tuples. In
> fact, tuples are a bit awkward sometimes because you have to use
> '(a,)' for a tuple with one element - (2) isn't a tuple of length one,
> it's the same as 2. Either cope with this special case, or use lists.
> Either way, you'll have to use () or [] for an empty sequence.
>

Of course, thanks Paul,
if I change tuple to list, everything works ok, even with empty lists.

cheers,
Stef Mientki
> --
> Paul Hankin
>
>

Steven D'Aprano

 10-19-2007
On Fri, 19 Oct 2007 16:19:32 +0200, stef wrote:

> Well I'm not collecting data, I'm collecting pointers to data.

I beg to differ, you're collecting data. How that data is to be
interpreted (a string, a number, a pointer...) is a separate issue.

> This
> program simulates a user written program in JAL. As Python doesn't
> support pointers, instead I collect names.

This doesn't make any sense to me. If your user-written program is
supplying pointers (that is, memory addresses like 0x15A), how do you
get a name from the memory address?

If you are trying to emulate pointer-manipulation, then the usual way to
simulate a pointer is with an integer offset into an array:

# initialise your memory space to all zeroes:
memory = [chr(0)]*1024*64 # 64K of memory space, enough for anyone
NULL = 0
pointer = 45
memory[pointer:pointer + 5] = 'HELLO'
pointer += 6
memory[pointer:pointer + 5] = 'WORLD'
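
The same idea can be wrapped in a pair of small functions. This is only a sketch for Python 3; the helper names write and read are hypothetical and not part of the original post.

```python
# A flat list of one-character strings stands in for memory (as in the post).
memory = [chr(0)] * (64 * 1024)


def write(pointer, text):
    """Hypothetical helper: store text at the given offset, one char per cell."""
    # Replacing an n-cell slice with an n-character string keeps the list size.
    memory[pointer:pointer + len(text)] = text
    return pointer + len(text)        # offset just past the written data


def read(pointer, length):
    """Hypothetical helper: read `length` characters starting at the offset."""
    return ''.join(memory[pointer:pointer + length])


pointer = 45
pointer = write(pointer, 'HELLO') + 1   # skip one cell, matching `pointer += 6`
write(pointer, 'WORLD')

print(read(45, 5), read(51, 5))
```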

> The names are derived from an
> analysis of the user program under test, so the danger some of you are
> referring to, is not there, or at least is not that simple.

you are collecting? Are you sure there are no corner cases where

The thing is, exec is stomping through your program's namespace with
great big steel-capped boots, crushing anything that gets in the way.
Even if it is safe in your specific example, it is still bad practice, or
at least risky practice. Code gets reused, copied, and one day a piece of
code you wrote for the JAL project ends up running on a webserver and now
you have a serious security hole.

(Every security hole ever started off with a programmer thinking "This is
perfectly safe to do".)

But more importantly, what makes you think that exec is going to be
faster and more efficient than the alternatives? By my simple test, I
find exec to be about a hundred times slower than directly executing the
same code:

>>> import timeit
>>> timeit.Timer("a = 1").timeit()
0.26714611053466797
>>> timeit.Timer("exec s", "s = 'a = 1'").timeit()
25.963317155838013
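
For readers on Python 3, where exec is a function rather than a statement, a comparable measurement might look like the sketch below. The repeat count N is an assumption chosen to keep the run short, and the timings will differ from the figures quoted above.

```python
import timeit

N = 20000   # fewer runs than timeit's default of one million, to keep it quick

direct = timeit.timeit("a = 1", number=N)
via_exec = timeit.timeit("exec(s)", setup="s = 'a = 1'", number=N)

print("direct: %.4f s" % direct)
print("exec:   %.4f s" % via_exec)
```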

--
Steven

stef mientki

 10-19-2007
Steven D'Aprano wrote:
> On Fri, 19 Oct 2007 16:19:32 +0200, stef wrote:
>
>
>> Well I'm not collecting data, I'm collecting pointers to data.
>>

>
> I beg to differ, you're collecting data. How that data is to be
> interpreted (a string, a number, a pointer...) is a separate issue.
>
>
>
>> This
>> program simulates a user written program in JAL. As Python doesn't
>> support pointers, instead I collect names.
>>

>
> This doesn't make any sense to me. If your user-written program is
> supplying pointers (that is, memory addresses like 0x15A), how do you
> get a name from the memory address?
>
>
> If you are trying to emulate pointer-manipulation, then the usual way to
> simulate a pointer is with an integer offset into an array:
>
> # initialise your memory space to all zeroes:
> memory = [chr(0)]*1024*64 # 64K of memory space, enough for anyone
> NULL = 0
> pointer = 45
> memory[pointer:pointer + 5] = 'HELLO'
> pointer += 6
> memory[pointer:pointer + 5] = 'WORLD'
>
>
>

If there is a better way, I'd like to hear it.
I understand that exec is dangerous.

I don't have pointers, I just have names (at least I think).
Let me explain a little bit more,
I want to simulate / debug a user program,
the user program might look like this:

x = 5
for i in xrange(10):
    x = x + 1

So now I want to follow the changes in "x" and "i",
therefore in the background I change the user program a little bit, like
this

def user_program():
    global x,i
    x = 5 ; _debug(2)
    _debug (3)
    for i in xrange(10):
        _debug (3)
        x = x + 1 ; _debug (4)

And this modified user program is now called by the main program.
Now in the _debug procedure I can set breakpoints and watch x and i.
But as in this case both x and i are simple integers,
I cannot reference them and I need to get their values through their
names,
and thus an exec statement.

I couldn't come up with a better solution
(There may be no restrictions laid upon the user program, and indeed
name clashing is an accepted risk).

cheers,
Stef

Paul Hankin

 10-19-2007
On Oct 19, 5:38 pm, stef mientki <(E-Mail Removed)> wrote:
> ... snip hand-coded debugger
> I couldn't come up with a better solution

Does pdb not suffice?

Even if it doesn't, you can look up variables without using exec,
using locals()['x'] or globals()['x'].
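
A minimal sketch of that lookup, written for Python 3. The helper name watch and the sample variables are hypothetical. Because the names are kept in a list, the same loop works whether there are two names, one, or none.

```python
def watch(names, namespace):
    """Hypothetical helper: return the current values of the named variables."""
    return [namespace[name] for name in names]


def user_program():
    x = 5
    i = 0
    # locals() returns a dict mapping local names to their current values.
    return (watch(['x', 'i'], locals()),
            watch(['x'], locals()),
            watch([], locals()))


many, one, none = user_program()
for values in (many, one, none):
    for value in values:          # works unchanged for 2, 1 and 0 names
        print(value)
```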

--
Paul Hankin

Bruno Desthuilliers

 10-19-2007
stef mientki a écrit :
> Steven D'Aprano wrote:
>
>> On Fri, 19 Oct 2007 16:19:32 +0200, stef wrote:
>>
>>
>>
>>> Well I'm not collecting data, I'm collecting pointers to data.
>>>

>>
>>
>> I beg to differ, you're collecting data. How that data is to be
>> interpreted (a string, a number, a pointer...) is a separate issue.
>>
>>
>>
>>
>>> This
>>> program simulates a user written program in JAL. As Python doesn't
>>> support pointers, instead I collect names.
>>>

>>
>>
>> This doesn't make any sense to me. If your user-written program is
>> supplying pointers (that is, memory addresses like 0x15A), how do you
>> get a name from the memory address?
>>
>>
>> If you are trying to emulate pointer-manipulation, then the usual way
>> to simulate a pointer is with an integer offset into an array:
>>
>> # initialise your memory space to all zeroes:
>> memory = [chr(0)]*1024*64 # 64K of memory space, enough for anyone
>> NULL = 0
>> pointer = 45
>> memory[pointer:pointer + 5] = 'HELLO'
>> pointer += 6
>> memory[pointer:pointer + 5] = 'WORLD'
>>
>>
>>

>
> If there is a better way, I'd like to hear it.
> I understand that exec is dangerous.
>
> I don't have pointers, I just have names (at least I think).
> Let me explain a little bit more,
> I want to simulate / debug a user program,
> the user program might look like this:
>
> x = 5
> for i in xrange(10):
>     x = x + 1
>
> So now I want to follow the changes in "x" and "i",
> therefore in the background I change the user program a little bit, like
> this
>
> def user_program():
>     global x,i
>     x = 5 ; _debug(2)
>     _debug (3)
>     for i in xrange(10):
>         _debug (3)
>         x = x + 1 ; _debug (4)

You do know that Python exposes all of its compilation / AST / whatever
machinery, don't you? IOW, you can take a textual program, compile it
to a code object, play with the AST, add debug hooks, etc... Perhaps you
should spend a little more time studying the module index?
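
One possible reading of this suggestion, sketched for Python 3 with the standard ast module: parse the user program, insert a _debug call before each statement, compile the modified tree, and run it in its own namespace dict. The class name AddDebugHooks and the shape of _debug are assumptions made for illustration, and the transformer only handles module-level statements and for-loop bodies.

```python
import ast

source = """
x = 5
for i in range(10):
    x = x + 1
"""


class AddDebugHooks(ast.NodeTransformer):
    """Insert a call to _debug(<line number>) before each statement."""

    def _instrument(self, statements):
        result = []
        for stmt in statements:
            call = ast.Call(func=ast.Name(id='_debug', ctx=ast.Load()),
                            args=[ast.Constant(value=stmt.lineno)],
                            keywords=[])
            result.append(ast.copy_location(ast.Expr(value=call), stmt))
            result.append(self.visit(stmt))
        return result

    def visit_Module(self, node):
        node.body = self._instrument(node.body)
        return node

    def visit_For(self, node):
        node.body = self._instrument(node.body)
        return node


tree = AddDebugHooks().visit(ast.parse(source))
ast.fix_missing_locations(tree)

namespace = {}   # the user program's variables live here, apart from ours
trace = []


def _debug(lineno):
    # Watched names are read from the namespace dict, not by exec'ing strings.
    trace.append((lineno, namespace.get('x'), namespace.get('i')))


namespace['_debug'] = _debug
exec(compile(tree, '<user program>', 'exec'), namespace)

print(trace[:3], namespace['x'])
```

The compiled code object still has to be executed, but it runs in a separate dict, so the user program's names cannot clash with those of the debugger itself.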