#1 |
|
hi..
I want to design a MAC (multiply-accumulator). I have written the following code. The problem is that when I run place & route in Xilinx ISE 7.1, I get many timing errors (around 30). My clock is 70 MHz.

    entity mac is
      generic(
        input_width1    : integer := 16;
        input_width2    : integer := 16;
        output_width    : integer := 36;
        mac_cycle_width : integer := 4
      );
      port (
        RESET : IN  STD_LOGIC;
        CLK   : IN  STD_LOGIC;
        FD    : IN  STD_LOGIC;
        ND    : IN  STD_LOGIC;
        A     : IN  STD_LOGIC_VECTOR(input_width1-1 DOWNTO 0);
        B     : IN  STD_LOGIC_VECTOR(input_width2-1 DOWNTO 0);
        Q     : OUT STD_LOGIC_VECTOR(output_width-1 DOWNTO 0);
        RDY   : OUT STD_LOGIC
      );
    end entity mac;

    architecture rtl of mac is

      signal cycle : STD_LOGIC_VECTOR(mac_cycle_width-1 DOWNTO 0);
      signal rdy1  : std_logic;
      signal sum   : STD_LOGIC_VECTOR (output_width-1 DOWNTO 0);
      signal temp2 : STD_LOGIC_VECTOR (output_width-1 DOWNTO 0);
      signal prod  : STD_LOGIC_VECTOR (input_width1 + input_width2 -1 DOWNTO 0);

    begin

      -- cycle determines the number of MAC accumulations
      -- fd indicates the start of a new accumulation
      process(reset,clk)
      begin
        if reset = '1' then
          cycle <= (others =>'0');
        elsif(clk'event and clk = '1')then
          if ( fd = '1')then
            cycle <= conv_std_logic_vector((1),cycle'length); -- "0001"
          else
            cycle <= cycle +'1';
          end if;
        end if;
      end process;

      -- ND indicates that new data is ready at the input.
      -- The 2 inputs are multiplied and the product is added
      -- to the previous accumulator result.
      -- In the last cycle the accumulator result is given out and at the
      -- same time the accumulator is reset to zeros.
      -- Here the accumulator is the variable temp1.
      -- SUM holds the final accumulated result and temp2 holds the
      -- intermediate results.
      process(reset,clk)
        variable temp1 : STD_LOGIC_VECTOR (output_width-1 DOWNTO 0);
      begin
        if(reset ='1')then
          sum   <= (others => '0');
          prod  <= (others => '0');
          temp1 := (others => '0');
        elsif( clk'event and clk ='0')then
          if (nd ='1')then
            prod  <= A * (B);
            temp1 := temp1 + prod;
            if( cycle=conv_std_logic_vector((0),cycle'length))then -- "0000"
              sum   <= temp1;
              temp1 := (others => '0');
            end if;
          end if;
        end if;
        temp2 <= temp1;
      end process;

      -- Here Q indicates the accumulator output during all cycles.
      -- SUM holds the final accumulated result and temp2 holds the
      -- intermediate results. Both are combined to form Q.
      process(clk)
      begin
        if( clk'event and clk ='1')then
          if ( cycle=conv_std_logic_vector((0),cycle'length))then -- "0000"
            q <= sum;
          else
            q <= temp2;
          end if;
        end if;
      end process;

      --q <= sum;

      -- At the end of the MAC cycle rdy is generated to indicate
      -- that the MAC output is ready.
      process(reset,clk)
      begin
        if(reset ='1')then
          rdy1 <= '0';
        elsif( clk'event and clk ='1')then -- ori '1'
          if( cycle=conv_std_logic_vector((0),cycle'length))then -- "0000"
            rdy1 <= '1';
          else
            rdy1 <= '0';
          end if;
        end if;
      end process;

      rdy <= rdy1;

    end rtl;

sksaras@hotmail.com
|
|
|
|
#2 |
|
|
hi all..
Is there anything wrong with this design? Functionally, when I tested it in ModelSim, it produced correct results. Please help.

sksaras@hotmail.com
|
|
|
#3 |
|
|
wrote:
> hi all..
> is there any wrong in this design?. fuctionally when I tested on
> modelsim ,correct results were produced.Please help..
>
> wrote:
>> hi..
>> I want to design a MAC(multiply-accumulator).I have written the
>> following code.The problem is ,when I do place & route in Xilinx ISE
>> 7.1version,I get too many timing errors(around 30).My clock is 70Mhz.
>>

Have you taken a look at the timing report (*.twr) to see which paths are failing? I assume it is the multiply that is failing. Are you using a chip with built-in hardware multipliers? Are they being used?

Duane Clark
|
|
|
#4 |
|
|
hi.. thanks a lot.
You are right. When I looked at the timing analyzer (post-place & route static timing analyzer), most of the errors are in the multiply route. I have now replaced the multiply with a Xilinx multiplier core and the errors are reduced to 4, but the slice occupancy has increased a lot. Also, I am not supposed to use any cores. Please suggest other ways to design the MAC. Or can I modify (optimise) the above code so that the errors are reduced?

sksaras@hotmail.com
|
|
|
#5 |
|
|
wrote:
> hi..thanks a lot.
> You are right.when I saw the timing analyzer (post -place & route
> static timing analyzer)most errors are in the multiply route.
> Now I have replaced multiply by xilinx multiplier core and the errors
> are reduced to 4 but the slices occupancy has increased a lot.
> Also I am not supposed to use any cores.Please suggest me any other
> ways to design the MAC.Or can I modify the above code (opitmise) so
> that errors are reduced?.

What chip are you targeting? The Virtex2 and later chips have built-in hardware multipliers that consume no slices. Is this a homework problem?

Again, you need to look at the timing report, see what paths are breaking, and then determine what to do to fix them.

Duane Clark
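As one illustration of what fixing a failing multiply path can look like: the posted code clocks the accumulator process on the falling edge and everything else on the rising edge, so any path between the two gets only half a clock period (about 7.1 ns at 70 MHz). Whether those are the paths actually failing is something only the timing report can confirm. The sketch below is not the original poster's design and not a drop-in replacement for it. It assumes signed operands, a synchronous reset, and a hypothetical `clear` input in place of the `FD`/`cycle` logic, and it puts every register on the rising edge with the multiply and the add in separate pipeline stages.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical single-edge, pipelined MAC (illustrative names throughout).
entity mac_pipelined is
  generic (
    input_width  : integer := 16;
    output_width : integer := 36
  );
  port (
    clk   : in  std_logic;
    reset : in  std_logic;  -- synchronous reset (assumption)
    clear : in  std_logic;  -- hypothetical: '1' restarts the accumulation
    nd    : in  std_logic;  -- new data valid on a and b
    a     : in  std_logic_vector(input_width-1 downto 0);
    b     : in  std_logic_vector(input_width-1 downto 0);
    q     : out std_logic_vector(output_width-1 downto 0)
  );
end entity mac_pipelined;

architecture rtl of mac_pipelined is
  signal a_reg, b_reg   : signed(input_width-1 downto 0);
  signal prod           : signed(2*input_width-1 downto 0);
  signal acc            : signed(output_width-1 downto 0);
  signal nd_d1, nd_d2   : std_logic;
  signal clr_d1, clr_d2 : std_logic;
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if reset = '1' then
        a_reg  <= (others => '0');
        b_reg  <= (others => '0');
        prod   <= (others => '0');
        acc    <= (others => '0');
        nd_d1  <= '0';
        nd_d2  <= '0';
        clr_d1 <= '0';
        clr_d2 <= '0';
      else
        -- Stage 1: register the inputs and their control flags.
        a_reg  <= signed(a);
        b_reg  <= signed(b);
        nd_d1  <= nd;
        clr_d1 <= clear;

        -- Stage 2: registered multiply; flags travel with the data.
        prod   <= a_reg * b_reg;
        nd_d2  <= nd_d1;
        clr_d2 <= clr_d1;

        -- Stage 3: accumulate, or restart from the current product.
        if clr_d2 = '1' then
          if nd_d2 = '1' then
            acc <= resize(prod, output_width);
          else
            acc <= (others => '0');
          end if;
        elsif nd_d2 = '1' then
          acc <= acc + resize(prod, output_width);
        end if;
      end if;
    end if;
  end process;

  q <= std_logic_vector(acc);
end architecture rtl;
```

The result appears three clocks after the inputs, so any ready flag would need the same delay. The point of the structure is that each register-to-register path contains either the multiply or the add, never both, and each gets a full clock period.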
|
|
|
#6 |
|
|
hi.. I am using a Spartan-3 200K chip. I know that this chip has 12 dedicated multipliers. If I use '*' to multiply, will these multipliers be used? If it uses these multipliers, then why has the slice occupancy increased? Also, when I use the Xilinx multiplier core, it should be implemented with these built-in multipliers only, and hence occupy few or no slices. But the number of slices increases. Why?

> Again, you need to look at the timing report, see what paths are
> breaking, and then determine what to do to fix them.

I have seen the timing report. The problem is only in the multiply. But how do I fix it? Please suggest.

sksaras@hotmail.com
|
|
|
#7 |
|
|
On 6 Sep 2006 00:13:50 -0700, wrote:
>+++hi..I am using spartan3 200k chip.I know that this chip has 12
>+++dedicated multipliers.If I use ' * ' to multiply ,will these
>+++multipliers be used ? If it uses these multipliers then why is the
>+++slice occupancy increased?
************
Download XAPP467 from the Xilinx web page; it will tell you what you need to know about using the dedicated multipliers in the Spartan3 series of devices. You can set the process constraints so that XST will infer a dedicated multiplier. Thus A * B will use a dedicated multiplier and not one built from LUTs. There is also example code that you can look at and adapt to your needs. You should make use of these application notes, as they can be helpful in understanding how the logic works internally.

james
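To make the inference point concrete, here is a minimal sketch of a registered 16 x 16 multiply written so that a synthesis tool can map it onto a dedicated multiplier. It is not taken from XAPP467. The entity name `mult_reg` and the signed interpretation of the operands are assumptions, and the `mult_style` attribute with the value `"block"` is the XST multiplier-style constraint as best recalled here, so check the XST User Guide for your ISE version before relying on it.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical registered multiplier intended for dedicated-multiplier inference.
entity mult_reg is
  port (
    clk : in  std_logic;
    a   : in  std_logic_vector(15 downto 0);
    b   : in  std_logic_vector(15 downto 0);
    p   : out std_logic_vector(31 downto 0)
  );
end entity mult_reg;

architecture rtl of mult_reg is
  signal prod : signed(31 downto 0);

  -- XST-specific hint (assumption: verify name and values in the XST User Guide).
  -- "block" asks for a dedicated multiplier; "lut" would ask for slice logic.
  attribute mult_style : string;
  attribute mult_style of prod : signal is "block";
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- Registering the product directly lets the output register
      -- sit next to (or inside) the multiplier block.
      prod <= signed(a) * signed(b);
    end if;
  end process;

  p <= std_logic_vector(prod);
end architecture rtl;
```

After synthesis, the device utilisation summary in the XST report should show whether a multiplier block was used; if the count stays at zero, the multiply is still being built from LUTs.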
|