John Machin wrote:
[...]
> Observation: factoring out the compile step makes the difference much
> more apparent.
>
> >>> ["%.3f" % t.timeit() for t in t3, t4, t5, t6]
> ['1.578', '1.175', '2.283', '1.174']
> >>> ["%.3f" % t.timeit() for t in t3, t4, t5, t6]
> ['1.582', '1.179', '2.284', '1.172']
> >>>
To make it even more apparent, try:

    import re
    import profile

    startsz = re.compile('^z')
    for s in ('x' * 1000, 'x' * 100000, 'x' * 10000000):
        profile.run('startsz.search(s)')

Profile report is below.
> Conclusion: search time depends on length of searched string.
>
> Meta-conclusion: Either I have to retract my
> based-on-hope-rather-than-on-experimentation assertion, or redefine "not
> dopey" to mean "surely nobody would search for ^x when match x would do,
> so it would be dopey to optimise re for that"
No question, there's some dopiness to searching for the
beginning of the string at places other than the beginning of
the string.
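
A minimal sketch of the comparison under discussion, using only the standard re and timeit modules. The names plainz, time_call and compare and the chosen string lengths are illustrative and do not come from the thread. It times search() with '^z' against match() with 'z' on strings that contain no 'z'; the two are equivalent only when re.MULTILINE is not set. Whether the search() time actually grows with the string length depends on the interpreter version, since a later release may special-case a leading '^', so no timings are claimed here.

```python
import re
import timeit

startsz = re.compile('^z')   # anchored pattern, used with search()
plainz = re.compile('z')     # unanchored pattern, used with match()


def time_call(func, s, number=20):
    """Return the best per-call time in seconds over 3 repeats of `number` calls."""
    return min(timeit.repeat(lambda: func(s), number=number, repeat=3)) / number


def compare(lengths=(1000, 100000, 1000000)):
    """Time both approaches on strings that contain no 'z' at all."""
    rows = []
    for n in lengths:
        s = 'x' * n
        rows.append((n, time_call(startsz.search, s), time_call(plainz.match, s)))
    return rows


if __name__ == '__main__':
    for n, t_search, t_match in compare():
        print('%9d chars  search(^z): %.6fs  match(z): %.6fs' % (n, t_search, t_match))
```

If the search() column grows roughly in proportion to the length while the match() column stays flat, the engine is retrying the anchored pattern at every position.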
The tricky part would be optimizing '$'.
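
One possible reading of why '$' is harder, offered as an interpretation rather than something stated above: with '^' and no MULTILINE flag the only candidate position is 0, whereas '$' can match either at the very end of the string or just before a trailing newline, so there is more than one end position to consider. The sketch below only demonstrates those semantics; the names endz, endz_strict and end_positions are made up for illustration.

```python
import re

endz = re.compile('z$')            # '$': end of string, or just before a final newline
endz_strict = re.compile(r'z\Z')   # '\Z': only the very end of the string


def end_positions(s):
    """Return the match start for each pattern, or None where it does not match."""
    loose = endz.search(s)
    strict = endz_strict.search(s)
    return (loose.start() if loose else None,
            strict.start() if strict else None)


if __name__ == '__main__':
    for s in ('xxz', 'xxz\n', 'xxz\nx'):
        print(repr(s), end_positions(s))
```

For 'xxz' both patterns match at index 2; for 'xxz\n' only 'z$' does; for 'xxz\nx' neither does.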
--
--Bryan

         4 function calls in 0.003 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.000    0.000 :0(search)
        1    0.003    0.003    0.003    0.003 :0(setprofile)
        1    0.000    0.000    0.000    0.000 <string>:1(?)
        0    0.000             0.000          profile:0(profiler)
        1    0.000    0.000    0.003    0.003 profile:0(startsz.search(s))


         4 function calls in 0.002 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.002    0.002    0.002    0.002 :0(search)
        1    0.000    0.000    0.000    0.000 :0(setprofile)
        1    0.000    0.000    0.002    0.002 <string>:1(?)
        0    0.000             0.000          profile:0(profiler)
        1    0.000    0.000    0.002    0.002 profile:0(startsz.search(s))


         4 function calls in 0.228 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.228    0.228    0.228    0.228 :0(search)
        1    0.000    0.000    0.000    0.000 :0(setprofile)
        1    0.000    0.000    0.228    0.228 <string>:1(?)
        0    0.000             0.000          profile:0(profiler)
        1    0.000    0.000    0.228    0.228 profile:0(startsz.search(s))