python/ruby benchmark.

 
 
Steven Jenkins
      06-15-2005
Ralph "PJPizza" Siegler wrote:
> One observation I would make is that you set up a
> benchmark/test/simulation that was very relevant to your problem domain;
> you didn't use some industry-standard MIPS or TPH or such. That is the
> real problem with the type of benchmarks that spawned the debate
> here: such things are interesting and I like to look at them, but that's
> as far as their usefulness goes.


I said I had other examples.

It's been a long time since I was involved in one, but I'm reasonably
confident that we use "standard" benchmarks for large procurements. When
you spend US Government money, you have to jump through a lot of hoops
to ensure a level competitive playing field. A protest from a losing
bidder can tie you up for a long time, so you try to avoid that. Using
your own benchmarks for procurement qualification invites protest.

Nobody wins just because their TPC-A or whatever is highest. A Request
for Proposal may give a particular performance threshold, and the
proposing vendors use that to decide which of their products to propose.
They don't want to propose anything more expensive than they have to,
because they're in a cost competition. It's a rough and imperfect but
vendor-neutral way to talk about classes of performance. The real key is
that, if the vendors buy into it, they can't protest on that point.

Obviously, if one vendor claims dramatically better performance than
another in the same price class, that might be worth looking into. For the
most part, however, the benchmarks just establish who's in the game, and
most of the competition is on cost.

Steve



 
Stephen Kellett
      06-15-2005
In message <(E-Mail Removed)>, Steven Jenkins
<(E-Mail Removed)> writes
>It's been a long time since I was involved in one, but I'm reasonably
>confident that we use "standard" benchmarks for large procurements.


I think some people have lost sight of what "benchmark" means. For
computer applications, some people have been claiming it's TPS, MIPS, or
whatever form of throughput they are proposing. However, take a step back
and think about "benchmark" in more general terms and you get a better
idea of what a benchmark is. This is what Steven Jenkins was identifying
with his satellite TCP/IP benchmark.

A benchmark is something, anything by which you can compare. Typically
it is the best of breed at some point or other. Here is an example:

I play various musical instruments, one of them being the Border Bagpipe
made by Jon Swayne. Jon Swayne is a legend in his own lifetime to many
dancers and many musicians in the UK. For dancers it is because he is
part of Blowzabella, a major musical force in social dancing throughout
the last 25 years. For musicians, and particularly bagpipers, it is
because he took the bagpipe, an instrument known for not typically being
in tune (and, if it was, not necessarily in tune with another bagpipe of
the same type, or even by the same maker!), and created a new standard,
a new benchmark if you will, by which other bagpipes are judged. It's
not just Jon Swayne; there are some other makers, but they changed
everyone's perception, and his pipes are the benchmark by which others
are judged (yes, they really are that good). When you talk to pipers in
the UK and mention his name, there is a respect that is accorded. You
don't get that without good reason. Anyway, I digress.

The benchmark for Steven's satellite test was whether it matched the
round-trip criteria. I think Steven's example is absolutely a benchmark.
It's much looser than other benchmarks, but that's not the point. The
point is: did it serve a purpose?

For other people the benchmark will be: does it perform the test within a
given tolerance? For others it may be: how much disk space does it use,
or is the latency between packets between X and Y? For others still it
will be: is it faster than X?
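
To make that last kind concrete, here is a rough Ruby sketch of an "is it
faster than X, within a tolerance" check using the standard Benchmark
module. The two string-joining methods and the 1.10 tolerance factor are
made-up placeholders for illustration, not anything from this thread.

require 'benchmark'

# Two stand-in implementations of the same task; the candidate is judged
# against the baseline.
def join_with_plus(words)
  result = ""
  words.each { |w| result += w }
  result
end

def join_with_builtin(words)
  words.join
end

words = Array.new(10_000) { "x" * 8 }

# Wall-clock time for each, via the standard library's Benchmark module.
baseline  = Benchmark.measure { 100.times { join_with_plus(words) } }.real
candidate = Benchmark.measure { 100.times { join_with_builtin(words) } }.real

puts format("baseline:  %.4fs", baseline)
puts format("candidate: %.4fs", candidate)

# "Passes the benchmark" here just means: no slower than 1.10x the baseline.
tolerance = 1.10
puts(candidate <= baseline * tolerance ? "within tolerance" : "too slow")

The interesting part isn't the timing call; it's that the pass/fail line
encodes a purpose. The number only means something relative to the
baseline and the tolerance you chose.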

Where Austin's point comes in is that he points out the latter test is
meaningless because you are comparing apples with oranges, when you
should really be comparing GMO-engineered (optimized) apples with GMO
(optimized) oranges to get even close to a meaningful test. Even so,
you are still comparing cores to segments, and it gets a bit messy
after that, although they both have pips.

Even so, I once worked for a GIS company (A) that wrote its software
in C with an in-house scripting language. We won the benchmarks when in
competition with other GIS companies. The competition won the business
because of clever marketing. Their customers lost (*), though, because
the competitors' software was too hard to configure, and our marketing
people were not smart enough to identify this and inform the customer of
the problem.

What sort of benchmarks were being tested?
o Time to compute catchment area of potential customer base within X
minutes drive given a drive time to location.
o Time to compute catchment area of potential customer base within X
minutes drive given a drive time from location.
o Time to compute drive time to location of potential customer base
within X minutes drive given a particular post code area.
o Time to compute drive time from location of potential customer base
within X minutes drive given a particular post code area.
o Think up any other bizarre thing you want.

Times to and from a location may not be the same because of highway
on/off ramps, traffic-light network delay bias, and one-way systems.
Superstores often don't care much about drive time from, but care a lot
about drive time to. For example, drive time from may be 15 minutes, but
drive time to may be only 5 minutes.

As you can see, the customer requirements are highly subjective, but the
raw input data is hard data: maps and fixed road networks. The
computing time, etc., is also a fixed reality given the hardware.
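
For what it's worth, a timing suite over scenarios like the drive-time
list above might look roughly like this with Ruby's Benchmark.bm. The
scenario methods are stand-ins (a sleep in place of the real GIS
computation, which was C plus an in-house scripting language), so only
the shape of the report is meaningful.

require 'benchmark'

# Stand-ins for the real computations; the sleep just gives the report
# something non-zero to show.
def catchment_drive_time_to(minutes)
  sleep 0.01
end

def catchment_drive_time_from(minutes)
  sleep 0.01
end

Benchmark.bm(38) do |x|
  x.report("catchment, 15 min drive time to:")   { catchment_drive_time_to(15) }
  x.report("catchment, 15 min drive time from:") { catchment_drive_time_from(15) }
end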

It's all about perception and need.

I think the "benchmarketing" term is quite apt for most benchmarks.

...and Steven, your story was great. I could really relate to a lot of
that.

Stephen

(*) It's a matter of debate: they also used an in-house language, and
finding non-competitor engineers who knew the language was nigh on
impossible, so they were very expensive to hire to do the
configuration. Our (A) stuff was not so configurable, but it didn't need
to be.

When were we doing this stuff? 1990-94 for me. X11 and Motif were the
cool stuff back then.
--
Stephen Kellett
Object Media Limited http://www.objmedia.demon.co.uk/software.html
Computer Consultancy, Software Development
Windows C++, Java, Assembler, Performance Analysis, Troubleshooting
 
Ralph "PJPizza" Siegler
      06-16-2005
On Thu, Jun 16, 2005 at 02:52:24AM +0900, Steven Jenkins wrote:
>
> I said I had other examples .
>
> It's been a long time since I was involved in one, but I'm reasonably
> confident that we use "standard" benchmarks for large procurements. When
> you spend US Government money, you have to jump through a lot of hoops
> to ensure a level competitive playing field. A protest from a losing
> bidder can tie you up for a long time, so you try to avoid that. Using
> your own benchmarks for procurement qualification invites protest.
>
> Nobody wins just because their TPC-A or whatever is highest. A Request
> for Proposal may give a particular performance threshold, and the



I used to do some spending of U.S. D.O.E. money at Fermilab on servers/workstations/networks for CADD/CAE. As you say, the standard benchmarks were a starting point to see which vendors might be considered, but for justifications the capabilities for in-house needs were the main thing. My projects were in the $100-$200K range, surely a few orders of magnitude smaller than your NASA ones, with the procurement requirements not as burdensome.


Our group made civil engineering packages (all those tunnels and collision halls) for outside bid, and of course there the spec book that accompanied the drawings was what ruled. That could be called a set of benchmarks, I suppose; they were a mix of construction industry standards and what our engineers had calculated.




Ralph "PJPizza" Siegler


 