Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > VHDL > changes for synthesizable code

Reply
Thread Tools

changes for synthesizable code

 
 
vizziee@gmail.com
Guest
Posts: n/a
 
      07-21-2005
Hi.

I am working on parallel implementation of merge-sort (infact, the last
step of merge-sort where I need to only merge two sorted arrays of
equal lengths to form a bigger sorted array) in VHDL. I adapted an
existing program for my implementation and it works well during
simulation. However, it doesn't get synthesized. What changes should I
make in the design to make it synthesizable, without changing the
architecture. Or even if any change in the architecture is required,
only a few latencies are acceptable. Following is my code and the
errors I received while synthesis in Quartus II 5.0 (I think the
'Warning' may be ignored, my issue is more towards the 'Error' part).


Perhaps to reduce the logic element consumption, I can play in some
part of the code. But that should not cost higher latencies.

************************************************** ***************************************

1 library ieee;
2 use ieee.std_logic_1164.all;
3
4 entity msort_c is
5 generic (
6 n: integer := 3;
7 w: integer := 4
8 );
9
10 port (
11 clock: in std_logic;
12
13 reset: in std_logic;
14
15 valid: in std_logic;
16
17 data_array1: in bit_vector (((n*w)-1) downto 0); -- This is the
first sorted data
18 -- array input port of the sorting unit
19
20 data_array2: in bit_vector (((n*w)-1) downto 0); -- This is the
second sorted data
21 -- array input port of the sorting unit
22
23 sorted_array: out bit_vector ((((2*n)*w)-1) downto 0)
24 );
25 end entity msort_c;
26
27
28 architecture msort_c_arch of msort_c is
29 begin
30 process (reset, clock, valid)
31 variable x1, x2: natural range 0 to n;
32 variable pick2: boolean;
33 begin
34 if (reset = '1') then
35 sorted_array <= (others => '0');
36 elsif (clock = '1' and clock'event) then
37 if (valid = '1') then
38
39 x1 := 0;
40 x2 := 0;
41
42 for i in 0 to ((2*n)-1) loop
43
44 -- Decide which list to pick from...
45 pick2 := FALSE;
46 if (x1 = n) then
47 pick2 := TRUE;
48 elsif (x2 /= n) then
49 pick2 := data_array1((((x1+1)*w)-1) downto (x1*w))
> data_array2((((x2+1)*w)-1) downto (x2*w));

50 end if;
51
52 -- Pick
53 if pick2 then
54 sorted_array((((i+1)*w)-1) downto (i*w)) <=
data_array2((((x2+1)*w)-1) downto (x2*w));
55 x2 := x2 + 1;
56 else
57 sorted_array((((i+1)*w)-1) downto (i*w)) <=
data_array1((((x1+1)*w)-1) downto (x1*w));
58 x1 := x1 + 1;
59 end if; --end of 'pick' if loop
60 end loop; --end of i-choice 'for' loop
61 else
62 sorted_array <= (others => '0');
63 end if;--end of 'if-valid' loop
64 end if;--end of 'if-reset' loop
65 end process;
66
67 end architecture msort_c_arch;



************************************************** ************************************************** ************************************************** ************************************************** ************************************************** *****************
************************************************** ***************************************

Error: VHDL error at msort_c.vhd(49): left bound of range must be a
constant
Warning: VHDL Process Statement warning at msort_c.vhd(30): signal or
variable "sorted_array" may not be assigned a new value in every
possible path through the Process Statement. Signal or variable
"sorted_array" holds its previous value in every path with no new value
assignment, which may create a combinational loop in the current
design.
Error: Can't elaborate top-level user hierarchy
Error: Quartus II Analysis & Synthesis was unsuccessful. 2 errors, 1
warning
Error: Processing ended: Thu Jul 21 09:01:04 2005
Error: Elapsed time: 00:00:01
Error: Quartus II Full Compilation was unsuccessful. 2 errors, 1 warning

 
Reply With Quote
 
 
 
 
Ralf Hildebrandt
Guest
Posts: n/a
 
      07-21-2005
wrote:

> However, it doesn't get synthesized.

....
> 30 process (reset, clock, valid)


You don't need valid in the sensitivity list - but does not solve the
synthesis problem.


> 49 pick2 := data_array1((((x1+1)*w)-1) downto (x1*w)) ...

....
> Error: VHDL error at msort_c.vhd(49): left bound of range must be a
> constant


You know, that the upper and lower bound of data_array1 are connected
but for the synthesis tool these are just two numbers making the upper
und lower bound independend from each other. You have to find another
way a way to describe it.
A for-loop describing just the same as you wrote may be the solution,
because then for every bit the synthesis tool gets an exact description,
what to do.

But this may be not the final solution. In line 49 you compare two
vectors, but the length of the vectors is variable. What would that be
in hardware? I guess /several/ comparators - one for every vector width.
(Ignore ressource sharing at the moment.) I don't expect, that a
synthesis tool is capable of computing how many of such comparators are
needed - because at the moment I do not. Do you know it?
=> Model hardware - don't program VHDL! Model such comparators in an
explicit way. (You may use for-generate or a suitable nice way, but you
should define in a clear and exact way, what hardware you want. If you
do so, the synthesis tool will do it's job.)


Ralf
 
Reply With Quote
 
 
 
 
vizziee@gmail.com
Guest
Posts: n/a
 
      07-23-2005

Thanks Ralf for your inputs.

> But this may be not the final solution. In line 49 you compare two
> vectors, but the length of the vectors is variable. What would that be
> in hardware? I guess /several/ comparators - one for every vector width.
> (Ignore ressource sharing at the moment.) I don't expect, that a
> synthesis tool is capable of computing how many of such comparators are
> needed - because at the moment I do not. Do you know it?

The number of comparators is fixed. Since we always compare two numbers
of size w bits (so number of comarators = w). It is only the position
of these numbers in the two arrays that changes as per the 'pick'
value. The position can not be known before hand because the arrays can
have randomly any value.

> => Model hardware - don't program VHDL! Model such comparators in an
> explicit way. (You may use for-generate or a suitable nice way, but you
> should define in a clear and exact way, what hardware you want. If you
> do so, the synthesis tool will do it's job.)

You are very correct. But I know the number of comparators (same
reasoning again). What I don't know is which input to feed in these
caomparators. That is decided by the 'pick' value.

Is there a known hardware architecture for this last step of
merge-sort?

vizziee.

 
Reply With Quote
 
sunny
Guest
Posts: n/a
 
      07-23-2005
> The number of comparators is fixed. Since we always compare two numbers
> of size w bits (so number of comarators = w). It is only the position
> of these numbers in the two arrays that changes as per the 'pick'
> value. The position can not be known before hand because the arrays can
> have randomly any value.

If you know the number of comparators, that really solves the problem
to some extent. Since your hardware is fixed in that case. Perhaps I
sub-optimal way to solve the problem could be this: The problem with
the code is the varying position of the numbers (which, as you say, is
decided by other variable). If you can fix the position in your code,
the problem gets solved. So after every comparison, you can shift your
array so that the next number to be compared occupies the first
position. So you will always compare the first elements of the two
arrays. However, I don't know how much of the hardware this approach
will consume. But certainly, it could be the solution.

> Is there a known hardware architecture for this last step of
> merge-sort?

No idea. Even I would like to know one.

Sunny.

 
Reply With Quote
 
Ralf Hildebrandt
Guest
Posts: n/a
 
      07-23-2005
wrote:


>>But this may be not the final solution. In line 49 you compare two
>>vectors, but the length of the vectors is variable. What would that be
>>in hardware? I guess /several/ comparators - one for every vector width.
>>(Ignore ressource sharing at the moment.) I don't expect, that a
>>synthesis tool is capable of computing how many of such comparators are
>>needed - because at the moment I do not. Do you know it?

>
> The number of comparators is fixed. Since we always compare two numbers
> of size w bits (so number of comarators = w). It is only the position
> of these numbers in the two arrays that changes as per the 'pick'
> value. The position can not be known before hand because the arrays can
> have randomly any value.


Ok, then you gave yourself the answer: Model a w-bit wide comparator and
several multiplexers to select the appropriate data.

Synthesis complains about style like sig(x downto y) where x and y are
some not-fixed integers, because then the synthesis tool assumes, that x
and y are independend. You have to model such a selection descriping are
strong dependency between the two limits like sig(x+8 downto x).
Sometimes a suitable way of modelling can be found using a for-loop.

It may be a good idea to split the description of the comparator from
the one of the mux. This makes it often more clear what exactly must be
the desired hardware. Furthermore the mux should be described carefully
using every detail you know.

Some details: If I look at the following pieces of code

> 42 for i in 0 to ((2*n)-1) loop

....
> 54 sorted_array((((i+1)*w)-1) downto (i*w))<= ...


then the obviously the limits of the selected part of the array are
fixed, but I guess it will be much easier for the synthesis tool, if you
model this for

sorted_array(1*w-1 dowto 0*w)<=...
sorted_array(2*w-1 dowto 1*w)<=...
....

Furthermore you should seperate all these statements into several
processes. To avoid a lot of typing you may use a for-generate
statement, but I would do this later, if everything is clear.

If you separate all these partial vectors you have to model the
selectors in your code (x1, x2, pick1, pick2...) in a different way, but
I guess this approach leads to a more clean desription of the desired
hardware.


> What I don't know is which input to feed in these
> caomparators. That is decided by the 'pick' value.


Yes, I see.
My comments ignore the behavior of the algorithm. (I have to say, that I
did not even think about it.) I have just looked at the code and tried
to see the related hardware, that is described. At the moment I do not
know it, because the code is complex.

But I think if you can decribe the selectors of the muxes in a very
clean way (this means, that you have to know it for yourself and you
have to model it in a way, that can be understood by the synthesis tool)
you have the solution. And this:

> Is there a known hardware architecture for this last step of
> merge-sort?


is exactly the point. If I am not wrong we have figured out three main
parts of this architecture: the comparators, the muxes and the
mux-selectors (controlled by something we do not know at the moment).

I suggest to follow this top-down approach: Model every known component
in an explicit way and then the unknown parts will be more visible.

Ralf
 
Reply With Quote
 
vizziee@gmail.com
Guest
Posts: n/a
 
      07-26-2005
sunny wrote:
> If you know the number of comparators, that really solves the problem
> to some extent. Since your hardware is fixed in that case. Perhaps I
> sub-optimal way to solve the problem could be this: The problem with
> the code is the varying position of the numbers (which, as you say, is
> decided by other variable). If you can fix the position in your code,
> the problem gets solved. So after every comparison, you can shift your
> array so that the next number to be compared occupies the first
> position. So you will always compare the first elements of the two
> arrays. However, I don't know how much of the hardware this approach
> will consume. But certainly, it could be the solution.

Thanx sunny, I tried your idea and it worked both in simulation as well
as synthesis. Ofcourse LE usage has gone up exponentially. But probably
that's the price. Any idea of bringing it down here.

 
Reply With Quote
 
vizziee@gmail.com
Guest
Posts: n/a
 
      07-26-2005
Thanx a lot Ralf for your help.

Ralf Hildebrandt wrote:
> wrote:
>
>
> then the obviously the limits of the selected part of the array are
> fixed, but I guess it will be much easier for the synthesis tool, if you
> model this for
>
> sorted_array(1*w-1 dowto 0*w)<=...
> sorted_array(2*w-1 dowto 1*w)<=...
> ...

But won't this be a bit complicated compared to the simple shifting
that sunny suggested.
> Furthermore you should seperate all these statements into several
> processes. To avoid a lot of typing you may use a for-generate
> statement, but I would do this later, if everything is clear.

But one process is enough. And why for-generate, a for-loop would also
do instead in a process.

> If I am not wrong we have figured out three main
> parts of this architecture: the comparators, the muxes and the
> mux-selectors (controlled by something we do not know at the moment).

Yep...that's right. As for mux-selector logic, its decided by the
output of comparators because that's what decides the 'pick' value. But
this architecture is certianly not the optimum one. It consumes lot of
LEs. Any idea of bringing it down.

KVM.

 
Reply With Quote
 
Mike Treseler
Guest
Posts: n/a
 
      07-26-2005
wrote:

> Thanx sunny, I tried your idea and it worked both in simulation as well
> as synthesis.


Consider posting the working version.

> Ofcourse LE usage has gone up exponentially. But probably
> that's the price. Any idea of bringing it down here.


Right now there is a full pass on every clock tick.
Using a process variable this could be slower.
Maybe 2, 4 or 8 clocks per pass.

-- Mike Treseler
 
Reply With Quote
 
Mike Treseler
Guest
Posts: n/a
 
      07-26-2005
wrote:

> But one process is enough. And why for-generate, a for-loop would also
> do instead in a process.


I agree, but this is a matter of
style, not substance.

If we exclude tri-state drivers,
anything that can done by generating
processes can also be done
procedurally in a single process.
These are just alternate descriptions of the
same hardware.

A single process thread is easier to
read for the data_structures+algorithms
crowd. Multiple processes feels more like
a schematic netlist.

-- Mike Treseler
 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Is this Code synthesizable and any suggestions Rama VHDL 7 01-18-2007 03:39 AM
not synthesizable code fragment... error appears at bitstream generation Stefan Oedenkoven VHDL 1 01-04-2005 07:22 AM
Help in writing synthesizable code?? dcreddy1980 VHDL 2 12-18-2004 09:21 AM
Wonder how to write the following code to be synthesizable Stanley VHDL 2 12-07-2004 04:40 PM
GL85 synthesizable code Saeed Nari VHDL 2 07-28-2003 10:38 AM



Advertisments
 



1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57