On Dec 14, 6:01 pm, sanborne <(E-Mail Removed)> wrote:

> I assumed from my schooling that a binary search for the correct
> coefficients A and B would be better than a linear search
If you can access only one search key at a time, sure. But in
hardware you can access multiple search keys in parallel; the cost
is not in time, but in area (more comparators). By the look of your
code, both your binary and linear searches are in fact comparing
the input value against all keys simultaneously - so there is no
added area cost in going to a linear search.

> But strangely enough, when I synthesize both a design that is
> implemented using linear and a binary search, the linear search
> has a shorter combinational delay and therefore can actually be
> clocked faster! I think there is some kind of optimization that is
> happening in the synthesis tool (I am using Mentor Graphics
> Precision, but have seen similar behavior in Quartus II).
This is interesting, but perhaps not very surprising. A linear
string of comparisons is a very regular structure and it's quite
possible that the synth tool will make the same optimization that
a designer might (more details below). Your hand-crafted binary
search, though, makes a rather irregular hardware structure and
it's probably not optimized - so you end up with a cascade of
comparators (three in your example) on the critical path.

> But this result seems to render unnecessary a ton of work that I
> did to write binary search code instead of linear search.
Synthesis is like that. It's often best to use a simple iterative
algorithm and let the tool reshape it. Trying to be too clever can
defeat the tool's ability to see important optimization
opportunities.

Of course, that doesn't mean human ingenuity is useless.
Particularly when a computation is spread across many clock cycles,
there may be algorithmic improvements that you can see but a
synthesis tool wouldn't even get close to achieving.

> Why would a linear search have a shorter delay than a binary
> search?
If there are few enough search keys for you to do all comparisons
simultaneously, and if the keys are in order (easy if they're the
values in a monotonic table), then the "linear" search can be
nicely parallelized:

- Provide one magnitude comparator per search key.
- Compare the input value with each key. This gives you a
  "thermometer code" of single-bit comparison results - something
  like 1111100000000.
- Locate the 1/0 boundary of the thermometer code, and use its
  location to decide which table entry to use.

Converting a thermometer code into a location number is a simple
problem with well-known, efficient solutions (actually they're a
kind of binary search, which is ironic) and I've often seen
synthesis tools doing precisely this kind of arrangement.
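
As a rough software model of that arrangement (Python standing in
for the hardware; the key values are made up), the whole thing looks
like this - each list-comprehension element corresponds to one
physical comparator evaluating in parallel:

```python
def thermometer_search(keys, x):
    """Return how many sorted keys are <= x.

    `keys` must be in ascending order, as in a monotonic table.
    """
    # One "comparator" per key; in hardware all of these evaluate
    # simultaneously, giving a thermometer code like [1,1,1,0,0].
    thermo = [k <= x for k in keys]
    # Locating the 1/0 boundary reduces to counting the 1s here,
    # because the sorted keys guarantee the code is contiguous.
    return sum(thermo)

keys = [10, 20, 30, 40, 50]
print(thermometer_search(keys, 34))   # 3: keys 10, 20, 30 are <= 34
print(thermometer_search(keys, 5))    # 0: no key is <= 34's smaller cousin
```

The returned count is exactly the table index you'd use to pick the
coefficient pair.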

> Is the synthesis tool automatically pipelining my design?
From the code snippet you gave, I very much doubt it.

As a matter of interest, I've used a similar technique in the past
in image processing designs to keep a sorted list of the largest N
values seen in a data stream (for small-ish N, up to about 10).
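
One plausible way that trick works (a Python sketch, not the actual
design - N, the stream values and the function name are all
illustrative) is a per-cycle parallel compare followed by a shift-in:

```python
N = 4  # size of the sorted "largest values" register file

def insert_step(sorted_vals, sample):
    """One clock cycle: keep the N largest values seen, descending."""
    # Thermometer code again: which stored values does the sample
    # beat? One comparator per register in hardware.
    beats = [sample > v for v in sorted_vals]
    if not any(beats):
        return sorted_vals                  # too small, registers hold
    pos = beats.index(True)                 # 1/0 boundary = insert slot
    # Everything below the insertion point shifts down one slot;
    # the smallest stored value falls off the end.
    return sorted_vals[:pos] + [sample] + sorted_vals[pos:-1]

vals = [float("-inf")] * N                  # power-on reset state
for sample in [3, 17, 5, 42, 9, 28]:
    vals = insert_step(vals, sample)
print(vals)   # [42, 28, 17, 9]
```

In hardware the compare, the priority encode and the shift all fit
comfortably in one cycle for small N.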

At a more abstract level, you can think of it like this: linear
search for a single value in a collection of N keys is an O(N)
problem in software - the time taken to do it scales linearly with
N. In hardware, you still have an O(N) problem, but now you can
make the hardware size scale linearly with N (you have N
comparators) and the whole thing can be done in constant time.

In a similar way, simple sort algorithms that are O(N^2) time in
software can easily be re-jigged in hardware to be - for example -
O(N) area and O(N) clock cycles, or O(N^2) area and single-cycle.
For large N the software approach will always eventually win, given
the low cost of large memories. But if N is quite small, smearing
the algorithm across more hardware is an attractive option.
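
A classic example of the O(N)-area / O(N)-cycles point is odd-even
transposition sort: each "cycle" fires roughly N/2 compare-swap
units at once, and N cycles are guaranteed to sort N items. A
cycle-by-cycle software model (again, just a sketch of the timing
behaviour, not HDL):

```python
def odd_even_sort(data):
    """Odd-even transposition sort, modelled cycle by cycle."""
    vals = list(data)
    n = len(vals)
    for cycle in range(n):                  # O(N) clock cycles
        start = cycle % 2                   # alternate odd/even pairs
        # In hardware, every compare-swap in this inner loop is a
        # separate unit and they all fire simultaneously.
        for i in range(start, n - 1, 2):
            if vals[i] > vals[i + 1]:
                vals[i], vals[i + 1] = vals[i + 1], vals[i]
    return vals

print(odd_even_sort([5, 1, 4, 2, 8, 0]))   # [0, 1, 2, 4, 5, 8]
```

The N-1 comparators give the O(N) area; because each one only ever
talks to its neighbours, the structure routes very cleanly too.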

--
Jonathan Bromley