![]() |
Simple Python Speed Benchmark
Knowing absolutely zero about benchmarks, I was interested in the posting referencing the paper by
Dr. Cowell-Shah. I was curious how long Python takes just to read and display time.clock(), soI made the simple test below. I was amazed at the difference in speed depending on when/how the clock times were converted to milliseconds....almost 4 to 1...as indicated in the RESULTS below. I understand the variation in the times is probably due to other running processes. Any explanation? The machine used is: P3, 1gHz, 256 meg of ram, Windows 2000, Python 2.3 Also, I did duplicate the intArithmetic (intMaz), using 1,000,000,000, as per the paper, and it *did* take a long time. I tried a variety of ways to implement the process (while, range, xrange, etc), and made some improvement, but not much. And of course I did see the "out of memory" message. Norm # THE CODE import time module_time_start = time.clock() start_time_1 = time.clock() stop_time_1 = time.clock() print 'Test 1: ', (stop_time_1 - start_time_1) * 1000 start_time_2 = time.clock() * 1000 stop_time_2 = time.clock() * 1000 print 'Test 2: ', stop_time_2 - start_time_2 def get_time_1(): start_time_3 = time.clock() stop_time_3 = time.clock() duration = (stop_time_3 - start_time_3) * 1000 print 'Test 3: ', duration def get_time_2(): start_time_4 = time.clock() * 1000 stop_time_4 = time.clock() * 1000 print 'Test 4: ', stop_time_4 - start_time_4 print '\nOK, now loop the methods...\n' for ix in range(10): get_time_1() get_time_2() module_time_stop = time.clock() print '\nTotal module time: \n', (module_time_stop - module_time_start) * 1000 # THE RESULTS: Test 1: 0.00363174649292 Test 2: 0.0167619068904 OK, now loop the methods... Test 3: 0.00810158833036 Test 4: 0.0145269859717 Test 3: 0.006984127871 Test 4: 0.0145269859717 Test 3: 0.00586666741164 Test 4: 0.0145269859717 Test 3: 0.00614603252648 Test 4: 0.0145269859717 Test 3: 0.00614603252648 Test 4: 0.013968255742 Test 3: 0.00586666741164 Test 4: 0.0128507952826 Test 3: 0.00586666741164 Test 4: 0.0136888906272 Test 3: 0.0055873022968 Test 4: 0.0136888906272 Test 3: 0.00642539764132 Test 4: 0.0145269859717 Test 3: 0.0108952394788 Test 4: 0.0145269859717 Total module time: 2.89142893859 |
Re: Simple Python Speed Benchmark
engsolnom@ipns.com:
> I was curious how long Python takes just to read and display time.clock() ... > I was amazed at the difference in speed depending on when/how the clock times were converted to > milliseconds....almost 4 to 1...as indicated in the RESULTS below. I understand the variation in the > times is probably due to other running processes. > > Any explanation? There are quite a few things here. First, if you look at the "Test 3:" and "Test 4:" numbers you'll see it's only 2 times slower, not 4. Let's start with that case, where the tests you make are in a function. You can see the PVM op codes generated by using the dis module >>> def spam(): .... t1 = time.clock() .... >>> import dis >>> dis.dis(spam) 2 0 LOAD_GLOBAL 0 (time) 3 LOAD_ATTR 1 (clock) 6 CALL_FUNCTION 0 9 STORE_FAST 0 (t1) 12 LOAD_CONST 0 (None) 15 RETURN_VALUE >>> def spam2(): .... t1 = time.clock() * 1000 .... >>> dis.dis(spam2) 2 0 LOAD_GLOBAL 0 (time) 3 LOAD_ATTR 1 (clock) 6 CALL_FUNCTION 0 9 LOAD_CONST 1 (1000) 12 BINARY_MULTIPLY 13 STORE_FAST 0 (t1) 16 LOAD_CONST 0 (None) 19 RETURN_VALUE >>> This shows that the "* 1000" is doing two more things; load the constant "1000" and then do a binary multiply. The obvious conclusion is that the factor of 2 slowdown comes from this step. (That actually surprises me. Function call overhead should be much more than a simple multiply.) The next is the precision of the clock. The numbers generated are pretty well quantized. For example, '0.014527' comes up quite frequently as does '0.005867'. It appears that your clock has a resolution of about 0.0003, which is 5% of the smaller numbers or 2% of the bigger ones. If the time slices are just at the wrong point then you may get a systematic error of up to about 6% (ie, if the time.clock is done in just under an integral time quantum while the time.clock()*1000 is just over a quantum). To get around that, you should do more tests, and not have the results of one test strongly correlated with the other. (You do because get_time_1() is always followed with a get_time_2().) The easiest way is to use the timeit module, either on the command-line or interactively, as >>> import timeit >>> timeit.Timer("time.clock", "import time").repeat(3, 10000) [0.0063276180834463958, 0.0066243037524600368, 0.0078663607060889262] >>> timeit.Timer("time.clock()", "import time").repeat(3, 10000) [0.054394886949353349, 0.053860182268920198, 0.05341180138486834] >>> timeit.Timer("time.clock()*1000", "import time").repeat(3, 10000) [0.057691115018485561, 0.059368981429486212, 0.058075800674146194] >>> As you can see, on my box it takes 0.06 microseconds to do 'time.clock', 0.48 microseconds to make the call, and 0.03 microseconds to do the *1000. Try those on your machine. Andrew dalke@dalkescientific.com |
| All times are GMT. The time now is 11:46 PM. |
Powered by vBulletin®. Copyright ©2000 - 2013, vBulletin Solutions, Inc.
SEO by vBSEO ©2010, Crawlability, Inc.