#1
Not a specific question, not a request for help, just an invitation to share ideas about something that I've always found tricky - and I suspect I'm not alone.

Using HDLs you can elegantly describe quite complicated logic in a clocked process - we've had several discussions about that here, and we know there are many popular styles.

Mostly, though, we need to describe things that are pipelined. Sometimes that pipelining is from choice, sometimes it's forced upon us by the behaviour of things outside our control (such as pipelined synchronous RAMs in an FPGA).

As soon as you have a pipelined design, it's rather easy to describe the behaviour of each pipeline stage as an HDL clocked process (or, indeed, as part of a process that describes multiple stages) but as soon as that happens you tend to lose sight of the overall algorithm that's being implemented. Sometimes the design nicely suits a description in which each pipeline stage stands alone, but if there is any feedback from later pipeline stages to earlier ones then it's usually much harder to see what's going on.

So, here's my question: When writing pipelined designs, what do all you experts out there do to make the overall data and control flow as clear and obvious as possible?

Thanks in advance
Jonathan Bromley

#2
Jonathan Bromley wrote:
> Mostly, though, we need to describe things that are pipelined.
> Sometimes that pipelining is from choice, sometimes it's
> forced upon us by the behaviour of things outside our
> control (such as pipelined synchronous RAMs in an FPGA).

It can also be forced by the design requirements. I can't shift in a serial packet in one rx_clk, for example. It can also be forced by timing requirements. If the system clock is 100 MHz, that's 10 ns a tick, without exception.

There is top-level pipelining from module instances and internal pipelining using cases of variable/register values inside the process/block. For example, a serial interface stats counter might have single process/block instances like this:

-[serial/sync/hdlc]-[octet2packetbus]-[statsCounters]-[cpu bus]-

A synchronous process/block always provides at least one level of pipeline on the output. Internal state or counter variables/registers can add more latency as needed A) by design or B) to meet timing. With recent devices I have found few requirements for type B pipelining, but this is very dependent on the design requirements. For example, if I have access to serial data and clock, a CRC check is straightforward. However, if I must process a word per tick, I have no choice but to use a FOR loop to process multiple bits per clock.

> As soon as you have a pipelined design, it's rather easy to
> describe the behaviour of each pipeline stage as an HDL
> clocked process (or, indeed, as part of a process that
> describes multiple stages) but as soon as that happens
> you tend to lose sight of the overall algorithm that's
> being implemented. Sometimes the design nicely
> suits a description in which each pipeline stage stands
> alone, but if there is any feedback from later pipeline
> stages to earlier ones then it's usually much harder
> to see what's going on.

I keep any such feedback inside the same process/block, even if this means a variable/register array declaration.

> So, here's my question: When writing pipelined designs,
> what do all you experts out there do to make the overall
> data and control flow as clear and obvious as possible?

Good question. The short answer is, by using synchronous blocks and single-cycle control strobes at the module interfaces. It's much simpler to design modules to respond to a strobe (and maybe handshake it) than it is to make some poor module responsible for all cases of the full system timing.

The textbooks all say that separating the data path is essential, but I have never found any evidence to support this assertion. I like to let it all flow through the same stream.

-- Mike Treseler
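The word-per-tick CRC point in the post above can be illustrated with a small behavioural model. This is only a sketch in Python, not HDL and not the poster's code: the polynomial, the 8-bit width and the function names `crc8_bit` and `crc8_word` are all assumptions picked for the example. It shows that looping the single-bit update over a whole word gives the same register value as clocking the bits in one at a time, which is what the unrolled FOR loop does in hardware.

```python
CRC8_POLY = 0x07  # x^8 + x^2 + x + 1; an assumption, chosen only for illustration


def crc8_bit(crc: int, bit: int) -> int:
    """Advance an MSB-first CRC-8 register by one input bit (one serial clock)."""
    feedback = ((crc >> 7) & 1) ^ bit
    crc = (crc << 1) & 0xFF
    return crc ^ CRC8_POLY if feedback else crc


def crc8_word(crc: int, word: int, width: int = 8) -> int:
    """Advance the CRC by a whole word in one tick.

    In hardware the loop unrolls into `width` copies of the bit-step
    logic sitting between one pair of registers.
    """
    for i in reversed(range(width)):  # MSB first
        crc = crc8_bit(crc, (word >> i) & 1)
    return crc


data = [0x31, 0x32, 0x33]

# Bit-serial: one crc8_bit step per serial clock.
serial = 0
for byte in data:
    for i in reversed(range(8)):
        serial = crc8_bit(serial, (byte >> i) & 1)

# Word-per-tick: one crc8_word step per system clock.
parallel = 0
for byte in data:
    parallel = crc8_word(parallel, byte)

print(serial == parallel)
```

The cost of the word-wide version is logic depth rather than clock cycles, which is where the type B (timing-driven) pipelining mentioned above may come back into play.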

#3
Mike Treseler writes:
> The text books all say that
> separating the data path is essential,
> but I have never found any evidence
> to support this assertion.
> I like to let it all flow through
> the same stream.

I have found that separating the datapath can tremendously help DC to optimize the logic on the datapath - especially if you need to do several almost identical operations on the datapath, depending on the state.

In these and similar cases I have found that creating a set of flags in the control path, and then using the flags in the datapath to determine how to manipulate the data, yields superior synthesis results.

Kai
--
Kai Harrekilde-Petersen <khp(at)harrekilde(dot)dk>

#4
I think that may be more of a limitation of DC than anything else. At least for FPGA synthesis, Synplicity does not seem to mind combining control and dataflow logic.

I quit using DC (or FC2) a long time ago because Synplicity was soooo much better, both in VHDL language support and in QoR. Judging from their simulator, which I still have to use from time to time, Synopsys still crashes on '93 standard features that others gobble up with no problem, or at least give you an error report you can chase.

Andy

Kai Harrekilde-Petersen wrote:
> Mike Treseler writes:
>
> > The text books all say that
> > separating the data path is essential,
> > but I have never found any evidence
> > to support this assertion.
> > I like to let it all flow through
> > the same stream.
>
> I have found that separating the datapath can tremendously help DC to
> optimize the logic on the datapath - especially if you need to do
> several almost identical operations on the datapath, depending on the
> state.
>
> In these and similar other cases I have found that creating a set of
> flags in the control path, and then using the flags in the datapath to
> determine how to manipulate the data, yields to superior synthesis
> results.
>
> Kai
> --
> Kai Harrekilde-Petersen <khp(at)harrekilde(dot)dk>

#5
In clocked VHDL processes, every assignment from one _signal_ to another is a clock cycle (a register or pipeline stage). This is completely different from how software behaves.

Using variables instead of signals, you write the process the way you would in software, and order references relative to assignments to create clock delays (register/pipeline stages): a variable that is read before it is assigned in the process holds its value from the previous clock, so it becomes a register.

Some people like the descriptions using signals better, some like the variable descriptions better. I like the flexibility of moving/adding/deleting registers by moving variable assignments relative to references in the process.

Another approach is to use the pipelining and retiming features of your synthesis tool. You may be able to describe the process all in one cycle, then delay the outputs by several clocks (through registers), and let the synthesis tool redistribute the registers according to timing constraints. Synthesis tools have their limitations here though... And of course, this has problems when handling feedback.

Andy
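The signal/variable distinction in the post above can be mimicked in a few lines of Python. This is only a behavioural sketch, not VHDL: the function names and the two-stage example are assumptions made up for illustration, and each loop iteration stands for one rising clock edge. Signal assignments are modelled as a simultaneous update (every right-hand side sees the pre-edge values), while variables update immediately, so the order of read and write decides whether a register appears.

```python
def run_signals(inputs):
    """Models `b <= din; y <= b;` in a clocked process.

    Both right-hand sides see the values from before the edge,
    so b and y are each one register: y is din delayed by two clocks.
    """
    b = y = 0
    out = []
    for din in inputs:
        b, y = din, b  # simultaneous update models signal assignment
        out.append(y)
    return out


def run_var_assign_first(inputs):
    """Models `v := din; y <= v;`.

    v is written before it is read, so it is just a wire;
    only y is a register.
    """
    y = 0
    out = []
    for din in inputs:
        v = din
        y = v
        out.append(y)
    return out


def run_var_read_first(inputs):
    """Models `y <= v; v := din;`.

    v is read before it is written, so it carries last clock's value:
    v becomes a register and y is delayed by one more clock.
    """
    v = y = 0
    out = []
    for din in inputs:
        y = v
        v = din
        out.append(y)
    return out


stimulus = [1, 2, 3, 4]
print(run_signals(stimulus))           # value of y after each clock edge
print(run_var_assign_first(stimulus))
print(run_var_read_first(stimulus))
```

Swapping the two statements in the variable version is the "moving variable assignments relative to references" that adds or removes a pipeline stage.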

#6
Andy wrote:
> I think that may be more of a limitation of DC than anything else. At
> least for FPGA synthesis, Synplicity does not seem to mind combining
> control and dataflow logic.

I agree, and would add Quartus, ISE, Leonardo, ModelSim, and NC-Sim to the list of tools proven useful for VHDL'93 designs. If I had to use DC, I would code in Verilog instead of VHDL.

-- Mike Treseler

#7
For pipelined logic where it's not clear exactly what each stage should do, I find it easier to code the logic first, add multiple pipeline registers at the end of the logic, and then synthesize using balance_registers in Design Compiler.

Ex:

AND AND AND FLOP FLOP FLOP

becomes

AND FLOP AND FLOP AND FLOP

after synthesis using balance_registers.

Aditya

Jonathan Bromley wrote:
> Not a specific question, not a request for help, just an
> invitation to share ideas about something that I've always
> found tricky - and I suspect I'm not alone.
>
> Using HDLs you can elegantly describe quite complicated logic in
> a clocked process - we've had several discussions about that
> here, and we know there are many popular styles.
>
> Mostly, though, we need to describe things that are pipelined.
> Sometimes that pipelining is from choice, sometimes it's
> forced upon us by the behaviour of things outside our
> control (such as pipelined synchronous RAMs in an FPGA).
>
> As soon as you have a pipelined design, it's rather easy to
> describe the behaviour of each pipeline stage as an HDL
> clocked process (or, indeed, as part of a process that
> describes multiple stages) but as soon as that happens
> you tend to lose sight of the overall algorithm that's
> being implemented. Sometimes the design nicely
> suits a description in which each pipeline stage stands
> alone, but if there is any feedback from later pipeline
> stages to earlier ones then it's usually much harder
> to see what's going on.
>
> So, here's my question: When writing pipelined designs,
> what do all you experts out there do to make the overall
> data and control flow as clear and obvious as possible?
>
> Thanks in advance

Aditya Ramachandran
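The AND/FLOP example in the post above can be checked with a small behavioural model. This is a Python sketch under stated assumptions, not a model of what Design Compiler actually does: the three stage functions in `STAGES` are arbitrary stand-ins for the AND blocks, the names are made up, and `None` marks a register that has not yet received data. It shows that, for a purely feed-forward path, putting all three registers at the end or one register after each stage gives the same output on every clock.

```python
# Three arbitrary combinational stages standing in for "AND AND AND".
STAGES = [lambda x: x + 1, lambda x: x * 3, lambda x: x ^ 0x5]


def logic_then_flops(inputs):
    """All logic in one cycle, followed by three back-to-back registers."""
    regs = [None, None, None]
    out = []
    for x in inputs:
        for stage in STAGES:
            x = stage(x)
        regs = [x] + regs[:-1]  # shift the register chain on the clock edge
        out.append(regs[-1])
    return out


def balanced(inputs):
    """One register after each stage, as after register balancing."""
    regs = [None, None, None]
    out = []
    for x in inputs:
        # Next state is computed from the pre-edge register values,
        # then all registers update together.
        regs = [
            STAGES[0](x),
            None if regs[0] is None else STAGES[1](regs[0]),
            None if regs[1] is None else STAGES[2](regs[1]),
        ]
        out.append(regs[-1])
    return out


stimulus = [1, 2, 3, 4]
print(logic_then_flops(stimulus))
print(balanced(stimulus))
```

The equivalence relies on there being no feedback around the registers; once a later stage feeds an earlier one, moving registers changes the loop latency, which is the limitation mentioned elsewhere in the thread.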

#8
"Jonathan Bromley" wrote in message news:...
> So, here's my question: When writing pipelined designs,
> what do all you experts out there do to make the overall
> data and control flow as clear and obvious as possible?

Comments. Lots and lots and lots of comments.

Oh, and a diagram.

-Ben-
Ben Jones

#9
Jonathan,

I've successfully used register balancing in Synopsys DC since about eight years ago. In order to notify the other end about when there's work to be done, it is often a good idea to pass a synchronization signal (e.g. data valid, deasserted reset) through the pipeline as well.

Don't forget your post-synthesis verification though (gate-level or formal). We never completely trust the tools, right?

Regards,
Marcus
Marcus Harnisch
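The idea of passing a data-valid flag down the pipeline alongside the data can be sketched as follows. This is a behavioural Python model, not HDL, and the names (`pipeline_with_valid`, `depth`, `fn`) and the doubling operation are assumptions chosen only for illustration. Each register holds a (valid, data) pair, so the far end knows which outputs carry real work without having to count cycles.

```python
def pipeline_with_valid(samples, depth=3, fn=lambda d: d * 2):
    """Push (valid, data) pairs through `depth` registers, one per clock.

    `samples` is the input presented on successive clocks. The valid bit
    travels with its data, and the consumer keeps only outputs whose
    valid bit is set.
    """
    regs = [(False, 0)] * depth
    results = []
    idle = [(False, 0)] * depth  # extra clocks to flush the pipeline
    for valid, data in list(samples) + idle:
        first = (valid, fn(data) if valid else 0)
        regs = [first] + regs[:-1]  # clock edge: shift everything along
        out_valid, out_data = regs[-1]
        if out_valid:
            results.append(out_data)
    return results


# A bubble (valid=False) in the input stream is simply ignored at the output.
print(pipeline_with_valid([(True, 1), (False, 99), (True, 2)]))
```

Because the flag is registered exactly like the data, it stays aligned with it if a tool later moves or adds registers on both paths together, which is one reason it pairs well with register balancing.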

#10
> Mostly, though, we need to describe things that are pipelined.
> Sometimes that pipelining is from choice,

Not sure I can think of any "from choice" examples except for places where:
- It doesn't matter if the signal is combinatorial or delayed by a clock cycle.
- The cleanest form for writing the logic (in VHDL) is using a statement only available inside a process (i.e. a case or if).
- There would be more than a couple of signals in the sensitivity list.

In that situation I would choose a clocked process over a process with a laundry list of signals in the sensitivity list, of which I'll invariably forget at least one.

> sometimes it's
> forced upon us by the behaviour of things outside our
> control (such as pipelined synchronous RAMs in an FPGA).

Dang those pesky constraints anyway.

> As soon as you have a pipelined design, it's rather easy to
> describe the behaviour of each pipeline stage as an HDL
> clocked process (or, indeed, as part of a process that
> describes multiple stages) but as soon as that happens
> you tend to lose sight of the overall algorithm that's
> being implemented.

That's the point where I would go back and rethink how I've partitioned the design and ponder a bit on...
- Is the algorithm itself really what needs to be implemented, or is there a different algorithm that accomplishes the same/similar goals that might be more amenable to implementation, since I've wrapped myself around the axle on this one? If not, then move on to the following point.
- Rethink the partitioning of the design. Sometimes my first guess at how things should be partitioned turns out to be rather clumsy, and now, after having "lost sight of the overall algorithm that's being implemented", is a good time to go back and redraw the boundary lines.

As for the boundary lines themselves, I'm generally talking about the VHDL entity level. Any decently complex algorithm that needs to be pipelined is probably composed of some form of cascaded blocks. Each cascaded block will have a clear definition of what it is trying to accomplish. This pretty much then defines what the I/O (in terms of algorithm information flow) is. Based on that, choose an appropriate set of control/status signals to move that information in and out of the blocks.

For that, of late I've been using Altera's Avalon bus specification as a model. I looked at OpenCores' Wishbone spec as well and wasn't terribly impressed, but Avalon seems to have an interface definition that scales really well (like not just for the top level, but can go all the way down to 'simple blocks' without any appreciable 'overhead' in terms of wasted logic). By that I mean that not only can I use it for the top level of the algorithm implementation's I/O, but it can also be used for interconnecting those cascaded blocks. Not sales pitching Altera; I'm sure Xilinx, Actel et al. all probably have some equivalent as well, but over the last 5 years I've pretty much been all Altera. The SOPC Builder tool sucks and I no longer use it for real design, but the Avalon specification itself is good.

In any case, I've found that having 'some' block I/O interface signal specification instead of your own "well thought out, but still kinda in your head but it works for me and it's so clear that I'm sure you'll get it too" version is a key to not getting lost in your pipelining (second only to having the individual sub-blocks implementing the correct functionality, i.e. drawing the right boundaries in the first place).

Since these are 'sub-blocks' I'll tend to generalize the data signals to fit the true need. For example, Avalon data are all std_logic_vectors, but I'll change that to be a VHDL record so that the interface between blocks is of the appropriate type for that interface. At the top level of the algorithm implementation you're generally constrained in what you can use, but the internal block-to-block interfaces generally don't have that constraint.

Once inside a particular block, if I'm finding myself "losing sight of the overall algorithm within the local space" I'll generally follow the same steps and re-factor. Maybe that means that this particular block should be decomposed into a parent/child structure, or maybe it needs to be split into two cascaded 'siblings'.

> Sometimes the design nicely
> suits a description in which each pipeline stage stands
> alone, but if there is any feedback from later pipeline
> stages to earlier ones then it's usually much harder
> to see what's going on.

'Most' of the time in the past, I've found that this feedback is usually something of the form 'slow down, I can't take the data so quickly' or 'OK, I'm ready to accept data'. That feedback needs to get from the data consumer back to whatever it is that is ultimately sourcing the data. This particular type of feedback, though, is exactly the data flow control that specifications like Avalon are designed to handle, so if you've designed each sub-block to that interface then the flow control type of feedback will take care of itself.

I'm pondering what other types of feedback there might be to feed from a later to an earlier stage, but I guess it's too early in the morning.

> So, here's my question: When writing pipelined designs,
> what do all you experts out there do to make the overall
> data and control flow as clear and obvious as possible?

1. Partition entities into clearly describable functions, and don't be afraid to go back and re-partition them into different clearly describable functions if you get wrapped around the axle.
2. Choose an I/O interface model specification (Avalon, Wishbone, etc.) and use it not just for the top block but for sub-blocks as well. Since you'd like to use this I/O model all the way from the top to the bottom of your design, don't pick something that carries a lot of baggage with it that causes you to abandon it. An outlandish example would be choosing PCI as your model. While great for connecting 'big' things, you probably wouldn't want to outfit each entity with a PCI interface. Look for something that scales well DOWNWARD (i.e. not logic wasteful), so you're not forced to abandon it because of the overhead.
3. Re-factor an entity into a parent/child or sibling/sibling pair of entities when you find yourself getting 'lost'.

> Thanks in advance

Thanks for the soapbox

Kevin Jennings
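The 'slow down' / 'ready to accept' feedback described in the post above can be sketched as a generic ready/valid handshake. This is a behavioural Python model under stated assumptions: it is not the Avalon signal set or protocol, and `simulate_backpressure`, `consumer_ready`, `ready_in` and the single holding register are hypothetical names and choices made for the example. One pipeline stage sits between a source and a sink; the sink's ready signal propagates back through the stage, so the source stalls without knowing anything about the rest of the system.

```python
def simulate_backpressure(items, consumer_ready):
    """One registered stage between a source and a sink.

    items: data the source wants to send, in order.
    consumer_ready: the sink's ready signal on each successive clock.
    Returns (data received by the sink, data still waiting at the source).
    """
    pending = list(items)
    full, held = False, None  # the stage's valid flag and data register
    received = []
    for ready_out in consumer_ready:
        # Combinational: the stage can accept new data if it is empty,
        # or if its current contents are being taken this same cycle.
        ready_in = (not full) or ready_out
        src_valid = bool(pending)
        # Clock edge: hand data downstream, then load from upstream.
        if full and ready_out:
            received.append(held)
            full = False
        if src_valid and ready_in:
            held = pending.pop(0)
            full = True
    return received, pending


# The sink stalls on some cycles; no data is lost or duplicated.
print(simulate_backpressure([1, 2, 3],
                            [False, False, True, False, True, True, True]))
```

Because every block sees only its own upstream and downstream handshake, cascading several such stages needs no global timing knowledge, which is the property the post attributes to using one interface model throughout the design.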