Specify start and length, beside start and end, in slices

 
 
Noam Raphael
 
      05-21-2004
Hello,
Many times I find myself asking for a slice of a specific length, and
writing something like l[12345:12345+10].
This happens both in interactive use and when writing Python programs,
where I have to write an expression twice (or use a temporary variable).

Wouldn't it be nice if the Python grammar had supported this frequent
use? My idea is that the expression above might be expressed as
l[12345:>10].
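Without any grammar change, a small helper can approximate the start-plus-length spelling today. The sketch below is only an illustration: `span` is a hypothetical name, not a built-in, and it assumes a non-negative start and length.

```python
def span(start, length):
    """Return a slice covering `length` items beginning at `start`.

    Hypothetical helper (not part of Python); assumes start >= 0 and length >= 0.
    """
    return slice(start, start + length)


l = list(range(20000))

# The start expression is written once instead of twice.
assert l[span(12345, 10)] == l[12345:12345 + 10]
```

The cost is one extra function call per slice, which may or may not matter depending on how hot the code path is.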

This change, as far as I can see, is quite small: it affects only the
grammar and byte-compiling, and has no side effects.

The only change in syntax is that short_slice would be changed from
[lower_bound] ":" [upper_bound]
to
([lower_bound] ":" [upper_bound]) | ([lower_bound] ":>" [slice_length])

Just to show what will happen to the byte code: l[12345:12345+10] is
compiled to:
LOAD_GLOBAL 0 (l)
LOAD_CONST 1 (12345)
LOAD_CONST 1 (12345)
LOAD_CONST 2 (10)
BINARY_ADD
SLICE+3

I suggest that l[12345:>10] would be compiled to:
LOAD_GLOBAL 0 (l)
LOAD_CONST 1 (12345)
DUP_TOP
LOAD_CONST 2 (10)
BINARY_ADD
SLICE+3
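
The listings above come from the CPython of 2004. To see what a current interpreter emits, the standard `dis` module can be used. This is only a sketch: the opcode names vary between CPython versions (for example, `SLICE+3` no longer exists in Python 3), and a variable `n` is used as the start because a modern compiler may fold `12345+10` into a single constant.

```python
import dis

# Compile the slice expression without running it; `l` and `n` are just names here.
code = compile("l[n:n+10]", "<slice-demo>", "eval")

# Collect the opcode names; the exact sequence depends on the CPython version.
opnames = [ins.opname for ins in dis.get_instructions(code)]
print(opnames)
```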

Well, what do you think? I would like to hear your comments.

Have a good day (or night),
Noam Raphael
 
 
 
 
 
Grant Edwards
 
      05-21-2004
On 2004-05-21, Noam Raphael <(E-Mail Removed)> wrote:

> Many times I find myself asking for a slice of a specific length, and
> writing something like l[12345:12345+10].


[...]

> Wouldn't it be nice if the Python grammar had supported this frequent
> use? My idea is that the expression above might be expressed as
> l[12345:>10].


It's a bit less efficient, but you can currently spell that as

l[12345:][:10]

--
Grant Edwards, grante at visi.com
Yow! We just joined the civil hair patrol!
 
 
 
 
 
Noam Raphael
 
      05-21-2004
Grant Edwards wrote:
> On 2004-05-21, Noam Raphael <(E-Mail Removed)> wrote:
>
>
>>Many times I find myself asking for a slice of a specific length, and
>>writing something like l[12345:12345+10].

>
>
> [...]
>
>
>>Wouldn't it be nice if the Python grammar had supported this frequent
>>use? My idea is that the expression above might be expressed as
>>l[12345:>10].

>
>
> It's a bit less efficient, but you can currently spell that as
>
> l[12345:][:10]
>

That is true, but if the list is long, it's *much* less efficient.

Thanks for your comment,
Noam
 
 
Peter Hansen
 
      05-21-2004
Noam Raphael wrote:

> Grant Edwards wrote:
>> It's a bit less efficient, but you can currently spell that as
>>
>> l[12345:][:10]
>>

> That is true, but if the list is long, it's *much* less efficient.


Considering that the interpreter special-cases some integer math
including the BINARY_ADD, it likely wouldn't take a very long list
to pass the point where they're the same.

I like the idea of the optimization, in a sense, but I don't
like the syntax and doubt that there is much performance gain to be
had. There are probably better places for people to hack on the
interpreter, and which don't need syntax changes.

-Peter
 
 
Noam Raphael
 
      05-21-2004
Peter Hansen wrote:
> Noam Raphael wrote:
>
>> Grant Edwards wrote:
>>
>>> It's a bit less efficient, but you can currently spell that as
>>>
>>> l[12345:][:10]
>>>

>> That is true, but if the list is long, it's *much* less efficient.

>
>
> Considering that the interpreter special-cases some integer math
> including the BINARY_ADD, it likely wouldn't take a very long list
> to pass the point where they're the same.
>


I don't understand: If the list is of length 1000000, wouldn't Grant
Edwards' suggestion make 1000000-12345 new references, and then take
only the first ten of them?
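
The concern can be checked directly. This sketch, using the numbers from the question, shows the size of the temporary list that `l[12345:][:10]` builds before the final ten items are taken; the variable names are illustrative.

```python
l = list(range(1000000))
start, length = 12345, 10

tail = l[start:]          # temporary list: copies every reference from `start` to the end
chained = tail[:length]   # second slice keeps only the first ten
direct = l[start:start + length]  # copies just the ten references that are needed

print(len(tail))          # 987655
assert chained == direct
```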
 
 
Grant Edwards
 
      05-21-2004
On 2004-05-21, Noam Raphael <(E-Mail Removed)> wrote:

>>>> It's a bit less efficient, but you can currently spell that as
>>>>
>>>> l[12345:][:10]
>>>>
>>> That is true, but if the list is long, it's *much* less efficient.

>>
>> Considering that the interpreter special-cases some integer math
>> including the BINARY_ADD, it likely wouldn't take a very long list
>> to pass the point where they're the same.


I'm afraid I don't understand either. Where do integer math
shortcuts enter the picture? It seems to me it's all about
building a (possibly long) new list, which you're going to throw
away after you build another list from the front of it.

Unless the compiler is smart enough to figure out what you're
aiming at and skip the intermediate list entirely.

> I don't understand: If the list is of length 1000000, wouldn't
> Grant Edwards' suggestion make 1000000-12345 new references,
> and then take only the first ten of them?


Yes, according to my understanding of how things work, that's
what happens (my spelling is pretty inefficient for pulling
small chunks from the beginnings of long lists), so if you do
a lot of that, it may be worth worrying about.

--
Grant Edwards, grante at visi.com
Yow! Civilization is fun! Anyway, it keeps me busy!!
 
 
Peter Hansen
 
      05-21-2004
Noam Raphael wrote:

> Peter Hansen wrote:
>
>> Noam Raphael wrote:
>>
>>> Grant Edwards wrote:
>>>
>>>> It's a bit less efficient, but you can currently spell that as
>>>>
>>>> l[12345:][:10]
>>>>
>>> That is true, but if the list is long, it's *much* less efficient.

>>
>>
>>
>> Considering that the interpreter special-cases some integer math
>> including the BINARY_ADD, it likely wouldn't take a very long list
>> to pass the point where they're the same.
>>

>
> I don't understand: If the list is of length 1000000, wouldn't Grant
> Edwards' suggestion make 1000000-12345 new references, and then take
> only the first ten of them?


Sorry, it was perhaps unclear that I was agreeing with you. For
an extremely short list, it's possible that it would be faster
to do Grant's method, but what I was trying to say is that even
if that's true, I expect that for a list of more than a few dozen
elements it would not be faster. Looking at it again, I suspect
that it would actually never be faster, given that probably
about as many bytecode instructions are executed, and then there's
the extra memory allocation for the temporary list, the copying,
etc.

-Peter
 
 
Peter Hansen
 
      05-21-2004
Peter Hansen wrote:

> For an extremely short list, it's possible that it would be faster
> to do Grant's method, but what I was trying to say is that even
> if that's true, I expect that for a list of more than a few dozen
> elements it would not be faster. Looking at it again, I suspect
> that it would actually never be faster, given that probably
> about as many bytecode instructions are executed, and then there's
> the extra memory allocation for the temporary list, the copying,


timeit confirms this with variations on this:

c:\>python -c "import timeit as t; t = t.Timer('x[y:][:10]', 'y=10000; x=range(y)'); print t.timeit()"

and this:

c:\>python -c "import timeit as t; t = t.Timer('x[y:y+10]', 'y=10000; x=range(y)'); print t.timeit()"
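
For readers on Python 3, a rough equivalent of these measurements might look like the following. It is a sketch, not the original benchmark: the list is made longer than `y` so that `x[y:]` is not empty (with `x=range(y)` the index `y` is already past the end), `list(range(...))` is used because a Python 3 `range` is not a list, and the sizes and repeat count are arbitrary.

```python
import timeit

y = 10000
x = list(range(100000))  # longer than y, so x[y:] has 90000 items

# Time both spellings against the same data; absolute numbers vary by machine.
chained = timeit.timeit("x[y:][:10]", globals={"x": x, "y": y}, number=200)
direct = timeit.timeit("x[y:y+10]", globals={"x": x, "y": y}, number=200)

print("x[y:][:10] :", chained)
print("x[y:y+10]  :", direct)
```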

-Peter
 
 
Terry Reedy
 
      05-21-2004

"Noam Raphael" <(E-Mail Removed)> wrote in message
news:c8l3s3$27o$(E-Mail Removed)...

> Many times I find myself asking for a slice of a specific length, and
> writing something like l[12345:12345+10].
> This happens both in interactive use and when writing Python programs,
> where I have to write an expression twice (or use a temporary variable).


With an expression, I'd go for the temp var.

> Wouldn't it be nice if the Python grammar had supported this frequent
> use?


I take this as 'directly support' versus the current indirect support via
start+len.
My answer: superficially (in isolation) yes, but overall, in the context of
Python's somewhat minimalistic grammar/syntax, no. Two ways to slice might
easily be seen as one too many. In addition, the rationale for this, your
favorite little addition, would admit perhaps 50 others like it.

> My idea is that the expression above might be expressed as l[12345:>10].


Sorry, this strikes me as ugly, too much like and easily confused with
l[12345:-10], and too much looking like a syntax error.

Given that some other languages slice with (start,len) arguments (though
none that I remember or know of also offers a start,stop option), I am
*sure* that Guido thought carefully about the issue. A plus with his
choice is ability to offset (index) from the end *without* calling the len
function.
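
The end-relative indexing mentioned here, and the contrast with the proposed `:>` spelling, can be illustrated with a short list. A sketch with made-up data:

```python
l = list(range(20))

# A negative stop counts from the end, so no call to len() is needed.
assert l[5:-10] == l[5:len(l) - 10]   # items 5..9

# The proposed l[5:>10] would instead mean ten items starting at index 5,
# which today has to be written with the start repeated.
assert l[5:5 + 10] == list(range(5, 15))
```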

> This change, as far as I can see, is quite small: it affects only the
> grammar and byte-compiling, and has no side effects.


Except the cognitive dissonance of two *almost* identical syntaxes and the
flood of other 'small', 'no side effect' change requests.

> Well, what do you think? I would like to hear your comments.


Your wish ...

Terry J. Reedy




 
 
Larry Bates
 
      05-21-2004
I think it is odd that I have never encountered
these types of constructs repeatedly in
my code. Perhaps you could share a little more
of where you see this type of thing popping up
a lot? I suspect that there is another method
for solving the problem that might be faster
and easier to read/program.

Larry Bates,
Syscon, Inc.





 