Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > VHDL > Mixed clocked/combinatorial coding styles (another thread)


Mixed clocked/combinatorial coding styles (another thread)

 
 
rickman
 
      08-24-2008
On Aug 24, 12:26 am, whygee <(E-Mail Removed)> wrote:
> Hi !
>
> KJ wrote:
> > Since you said you're implementing the SPI master side, that implies that
> > you're generating the SPI clock itself which *should* be derived from the
> > CPU clock...there should be no need then for more than a single clock domain
> > (more later).

>
> As pointed in my previous post, there is at least one peripheral
> (ENC28J60 revB4) that has clocking restrictions
> (also known as "errata") and I happen to have some ready-to-use
> modules equipped with this otherwise nice chip...


Can you be specific on the restrictions and how that relates to the
CPU clock?

> I don't know if my chip revision is B4 and the errata
> suggest using a clock between 8 and 10MHz.
> However, it also suggests using the ENC28J60-provided 12.5MHz
> output : I'm ready to add an external clock input in the master
> if i'm allowed to "legally" go beyond the 10MHz rating
> (a 25% bandwidth increase is always a good thing, particularly
> with real-time communications).


When you talk about B4 is that the ENC28J60 chip?? How can it
recommend a 10 MHz max clock and also recommend using a 12.5 MHz
clock? I have to say you are using "it" and "my chip" in unclear
ways. Try being specific and not using pronouns.


> As another "unintended case", an external clock input opens
> the possibility to bit-bang data with some PC or uC.
> I know it sounds stupid but I had a project 10 years
> ago that would stream bootstrap code
> to a DSP through the PC's parallel printer port.
> ADi's SHARC had a boot mode where a DMA channel
> loaded code from outside, and I had found a trick
> to single-cycle the transfer with external circuits.
> That's very handy for early software development,
> more than flashing the boot memory all the time...
> Now, if I can stream external boot code WITHOUT the
> hassles of external circuitry (which was a pain
> to develop without the test devices I have now),
> that is an even better thing.
>
> For me and in the intended application, that's enough
> to justify another clock domain.
> If I had no ready-to-use ENC28J60 mini-module,
> I would not have bothered.


I still have no idea why you think you need two clock domains. Are
you saying that you can't pick a CPU clock rate that will allow an SPI
clock rate of 8 to 10 MHz? What CPU are you using?


> > The CPU clock period and the desired SPI clock period are known constants.

>
> They are indicated in the datasheet of each individual product.
> And there is no "SPI standard" contrary to I2C or others.
> (http://en.wikipedia.org/wiki/Serial_..._Bus#Standards)
> Some chips accept a falling CLK edge after CS goes low,
> and some other chips don't (even chips by the same manufacturer vary).


Are you saying you can't find a suitable subset of SPI operation that
will work with both? Are you aware that you can operate the bus in
different modes when addressing different peripherals? That can be
handled in your FSM.


> So i have read the datasheets of the chips i want to interface,
> and adapted the master interface to their various needs (and errata).
>
> > Therefore one can create a counter that counts from 0 to Spi_Clock_Period /
> > Cpu_Clock_Period - 1. When the counter is 0, set your Spi_Sclk output
> > signal to 1; when that counter reaches one half the max value (i.e.
> > "(Spi_Clock_Period / Cpu_Clock_Period/2") then set Spi_Sclk back to 0.

>
> I have (more or less) that already, which is active when the internal
> CPU clock is selected. This is used when booting the CPU soft core
> from an external SPI EEPROM.
>
> Note however that your version does not allow using the CPU clock at full speed:
> what happens if you set your "max value" to "00000" ? And it does not guarantee
> that the high and low levels have equal durations.
>
> But i'm sure that in practice, you will do much better
> (and I still have a few range limitations in my clock divider,
> I'll have to add an optional prescaler).
>
> Here is the current (yet perfectible) version :
>
> clk, ... : in std_logic; -- the CPU clock
> ...
> signal clkdiv, -- the frequency register
> divcounter : std_logic_vector(4 downto 0); -- the actual counter
> signal ... SPI_en, lCK, ... : std_logic;
> begin
> ...
>
> -- free-running clock divider
> clock_process : process(clk)
>   variable t : std_logic_vector(5 downto 0); -- holds the carry bit without storing it
> begin
>   -- no reset needed, synchronisation is done later
>   if rising_edge(clk) then
>     if SPI_en = '1' then
>       t := std_logic_vector( unsigned('0' & divcounter) + 1 ); -- increment the counter
>       if t(t'left) = '1' then -- on (expected) overflow, reload and toggle lCK
>         divcounter <= clkdiv;
>         lCK <= not lCK;
>       else
>         divcounter <= t(divcounter'range); -- just update the counter
>       end if;
>     end if;
>   end if;
> end process;
>
> This method divides by 2*(32-clkdiv).
> Without the 2x factor, it is impossible to work with clkdiv="00000",
> and the High and Low durations are unequal when clkdiv is odd.
> I use a count up, not down, but the difference is marginal.


If you want to divide by 0, you can use a mux to route the input clock
to the output instead of the divider. Restricting the divider value
to even values is reasonable; I seem to recall the 8253 timer working that
way in some modes. But that is just semantics.
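Rickman's bypass suggestion can be sketched as follows (a minimal, hypothetical VHDL fragment; the entity and signal names are illustrative, not from the thread):

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity spi_clk_bypass is
    port (
        clk      : in  std_logic;             -- CPU clock
        bypass   : in  std_logic;             -- '1' = "divide by 1"
        clkdiv   : in  unsigned(4 downto 0);  -- half-period minus one, in CPU clocks
        spi_sclk : out std_logic
    );
end entity;

architecture rtl of spi_clk_bypass is
    signal counter : unsigned(4 downto 0) := (others => '0');
    signal div_clk : std_logic := '0';
begin
    process (clk)
    begin
        if rising_edge(clk) then
            if counter = clkdiv then
                counter <= (others => '0');
                div_clk <= not div_clk;   -- toggles at each terminal count
            else
                counter <= counter + 1;
            end if;
        end if;
    end process;

    -- the mux: route the raw CPU clock out when no division is wanted
    -- (on a real FPGA this should go through a clock-capable mux resource)
    spi_sclk <= clk when bypass = '1' else div_clk;
end architecture;
```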

> > The point where the counter = 0 can also then be used to define the 'rising
> > edge of Spi_Sclk' state. So any place where you'd like to use
> > "rising_edge(Spi_Sclk)" you would instead use "Counter = 0". The same can
> > be done for the falling edge of Spi_Sclk; that point would occur when
> > Counter = Spi_Clock_Period / Cpu_Clock_Period/2.

>
> > Every flop in the design then is synchronously clocked by the Cpu_Clock,
> > there are no other clock domains therefore no clock domain crossings. The
> > counter is used as a divider to signal internally for when things have
> > reached a particular state.

>
> I understand that well, as this is how I started my first design iteration.
> I soon reached some inherent limitations, however.
>
> Particularly because of (slightly broken ?) tools that won't allow both a clock
> enable AND a preset on the FF (even though the Actel cells in ProAsic3
> have this capability). Synplicity infers the right cell, which is later
> broken into 2 cells or more by the Actel backend. Maybe I missed
> a restriction on the use of one of the signals, using a specific kind
> of net or something like that (I hope).


There are reasons why the smaller logic companies are small.


> As the RTL code grows, the synthesizer infers more and more stuff,
> often not foreseen, which leads to bloat. Muxes everywhere,
> and duplicated logic cells that are necessary to drive higher fanouts.
> I guess that this is because I focused more on the "expression"
> of my need than on the actual result (but I was careful anyway).
>
> I have split the design in 3 subparts (CPU interface,
> clock divider/synch and emit/receive, a total of 7 processes
> in a single architecture) and this needs < 140 cells
> instead of the 180 cells in the first iteration.
> And I use whatever clock I want or need.
>
> I could upload the source code somewhere so others can
> better understand my (fuzzy ?) descriptions.
> I should finish the simulation first.


I can't say that I am following what you are doing. But if you are
using multiple clock domains that are larger than 1 FF in each
direction, I think you are doing it wrong.

I try to keep the entire chip design in a single clock domain if
possible. I seldom find it necessary to make exceptions for anything
other than the periphery.

I don't understand your "clocking restrictions". How does that mean
you can't use the CPU clock for the SPI interface? You are designing
the master, you can use any main clock you choose. I don't understand
your restrictions, but unless you *have* to use some specific
frequency to sync a PLL or something unusual on SPI, you can use your
CPU clock as the timing reference, eliminating synchronization issues.


> >> my master SPI controller emits the clock itself (and resynchronises it)

> >
> > No need for the master to resynchronize something that it generates itself
> > (see my other post).

>
> In fact, there IS a need to resynchronise the clock, even when
> it is generated by the CPU, because of the divider.
>
> Imagine (I'm picky here) that the CPU runs at 100MHz (my target)
> and the slave at 100KHz (an imaginary old chip).
> The data transfer is setup in the control register, then
> the write to the data register triggers the transfer.
> But this can happen at any time, whatever the value of the predivider's counter.
> So the clock output may be toggled the first time well below
> the required setup time of the slave. That's a glitch.


I don't know what you are saying here, but if it implies that you have
to do something to synchronize the SPI and CPU, you are doing it
wrong. Use the same clock for both and use clock enables instead of a
second clock domain.
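The clock-enable style rickman recommends can be sketched like this (a hypothetical fragment, not code from the thread; the ratio and names are mine). Everything is clocked by the CPU clock, and one-cycle enable pulses stand in for the SPI clock edges:

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity spi_tick_gen is
    generic (
        DIV : positive := 10  -- e.g. 100 MHz CPU clock / 10 MHz SPI clock
    );
    port (
        clk          : in  std_logic;  -- the one and only clock
        sclk         : out std_logic;  -- divided SPI clock, a plain registered signal
        sclk_rise_en : out std_logic;  -- '1' for one clk cycle at each SCLK rise
        sclk_fall_en : out std_logic   -- '1' for one clk cycle at each SCLK fall
    );
end entity;

architecture rtl of spi_tick_gen is
    signal counter  : natural range 0 to DIV - 1 := 0;
    signal sclk_int : std_logic := '0';
begin
    process (clk)
    begin
        if rising_edge(clk) then
            sclk_rise_en <= '0';
            sclk_fall_en <= '0';
            if counter = DIV - 1 then
                counter <= 0;
            else
                counter <= counter + 1;
            end if;
            -- the counter value stands in for the SCLK edges; the enables
            -- are registered together with sclk_int, so they coincide
            if counter = 0 then
                sclk_int     <= '1';
                sclk_rise_en <= '1';  -- use this instead of rising_edge(sclk)
            elsif counter = DIV / 2 then
                sclk_int     <= '0';
                sclk_fall_en <= '1';
            end if;
        end if;
    end process;
    sclk <= sclk_int;
end architecture;
```

The shift register and FSM then test `sclk_rise_en`/`sclk_fall_en` on the CPU clock, so no second clock domain ever exists.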


> In this case, the solution is easy : reset the counter
> whenever a transfer is requested. That's what i did too,
> the first time.
>
> but there is an even simpler solution : add a "clear" input condition
> to the FFs that are used to resynchronise the clocks, as in http://i.cmpnet.com/eedesign/2003/jun/mahmud3.jpg
> so the next clock cycle will be well-formed, whether the
> source is internal or external. The created delay is not an issue.


Why are you resyncing clocks? Use 1 clock.


> >> so for MOSI, the system can be considered as "source clocked", even
> >> if the slave provides some clock (it is looped back in my circuit).

> > I don't think you understand SPI. The master always generates the clock, it
> > is up to the slave to synchronize to that clock. The master never has to
> > synchronize to the SPI clock since it generates it.

> I thought that too, until I read the errata of the chips I want to use.
>
> A friend told me years ago : "Never read the datasheet before the errata".
> Excellent advice, indeed.


You haven't explained yourself still. If you are using a chip that
does not act as a slave on the SPI bus, then maybe you shouldn't use
that part???

Are you saying that the slave chip you are using can not work with an
async master and that the master SPI interface *has* to use a clock
from the slave chip??? I've never seen that before.


> >> So i can also sample the incoming MISO bit on the same clock edge as MOSI :
> >> the time it takes for my clock signal to be output, transmitted,
> >> received by the slave, trigger the shift, and come back, this is
> >> well enough time for sample & hold.

> > See my other post for the details, but basically you're making this harder
> > than it need be.

> Though sometimes there needs to be something
> a bit more than the "theoretically practically enough".


No, actually. When you are working with read data, this is very
viable. The round trip timing is enough for an FPGA to receive the
data with sufficient hold time. Most FPGAs have I/O delays which can
be added to give a negative hold time, ensuring that this will work.
This also gives the maximum setup time. It is in the slave that you
can't depend on this because of the race condition between the write
data and the clock.


> > Since the master is generating the SPI clock it knows when
> > it is about to switch the SPI clock from low to high or from high to low,
> > there is no need for it to detect the actual SPI clock edge, it simply needs
> > to generate output data and sample input data at the point that corresponds
> > to where it is going to be switching the SPI clock.

> This is what I did in the first design iteration.
>
> However, now, i avoid large single-clock processes
> because there is less control over what the synthesiser does.
> My code now uses 7 processes (one clockless, just because
> it's easier to code than in parallel statements) and fanout
> and MUXes are OK.


I have no idea what you mean by "single-clock processes". I think
what KJ is referring to is that if you want to provide additional hold
time for read data, you can sample the input data one fast clock
earlier than the SPI clock edge that changes the data. Likewise the
output data can be changed a bit later than the actual SPI clock edge,
not that it would be needed.
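The enable-offset trick described above might look like this in a single CPU-clock domain (a hypothetical sketch; the generic, names, and mode-0 assumption are mine):

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity spi_rx_early_sample is
    generic (
        HALF_PERIOD : positive := 5  -- CPU clocks per SCLK half-period (>= 2 assumed)
    );
    port (
        clk      : in  std_logic;                      -- fast CPU clock
        miso     : in  std_logic;
        sclk     : out std_logic;
        rx_shift : buffer std_logic_vector(7 downto 0)
    );
end entity;

architecture rtl of spi_rx_early_sample is
    signal counter  : natural range 0 to HALF_PERIOD - 1 := 0;
    signal sclk_int : std_logic := '0';
begin
    process (clk)
    begin
        if rising_edge(clk) then
            if counter = HALF_PERIOD - 1 then
                counter  <= 0;
                sclk_int <= not sclk_int;   -- the nominal SCLK edge
            else
                counter <= counter + 1;
            end if;
            -- sample MISO one CPU clock before the falling SCLK edge,
            -- i.e. before the edge on which a mode-0 slave changes its
            -- data, maximising hold margin
            if counter = HALF_PERIOD - 2 and sclk_int = '1' then
                rx_shift <= rx_shift(6 downto 0) & miso;
            end if;
        end if;
    end process;
    sclk <= sclk_int;
end architecture;
```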


> Which goes back to the other thread : everybody has his own
> idea of what is "good", "acceptable", "required"...
> style and taste are difficult to discuss, and one rule
> does not apply to ANY case
>
> Finally, I have the impression that you misunderstood the initial post about "SPI clocking".
> The idea was that the SPI master "could" sample MISO with the same (internal) clock signal
> and edge that samples MOSI. The issue this would "solve" is when capacitance
> and propagation delays on the PCB, along with relatively high clock speed
> (the 25AA1024 by Microchip goes up to 20MHz) delay the MISO signal
> enough to miss the normal clock edge.


Even at 20 MHz, you would need to have (25 ns - master setup time) of
delay to cause a problem. That should still be a large amount of
delay and I would not expect any problem with normal boards. The
master has no downside to clocking the read data on either edge.

Rick
 
 
 
 
 
whygee
 
      08-24-2008
Hi !

rickman wrote:
> On Aug 24, 12:26 am, whygee <(E-Mail Removed)> wrote:
>> As pointed in my previous post, there is at least one peripheral
>> (ENC28J60 revB4) that has clocking restrictions
>> (also known as "errata") and I happen to have some ready-to-use
>> modules equipped with this otherwise nice chip...

> Can you be specific on the restrictions and how that relates to the CPU clock?


Apparently, there is probably an internal clock (synchronization) issue.
http://ww1.microchip.com/downloads/e...Doc/80257d.pdf

"
1. Module: MAC Interface
When the SPI clock from the host microcontroller
is run at frequencies of less than 8 MHz, reading or
writing to the MAC registers may be unreliable.

Work around 1
Run the SPI at frequencies of at least 8 MHz.

Work around 2
Generate an SPI clock of 25/2 (12.5 MHz), 25/3
(8.333 MHz), 25/4 (6.25 MHz), 25/5 (5 MHz), etc.
and synchronize with the 25 MHz clock entering
OSC1 on the ENC28J60. This could potentially be
accomplished by feeding the same 25 MHz clock
into the ENC28J60 and host controller. Alternatively,
the host controller could potentially be
clocked off of the CLKOUT output of the
ENC28J60.
"

What is interesting is the 2nd workaround :
it implies that it is possible, when the clocks
are synchronized, to go faster than the 10MHz limit.
And it is not unsafe, considering that the latest parts
(not mine) are capable of 20MHz SPI frequencies.

The ENC28J60 has a programmable external clock output
CLKOUT that i will program to generate 12.5MHz
and this will feed the SPI master that I have designed.

The 12.5MHz output works only when the ENC28J60
is powered on and operating, so I need both internal
clocking from the CPU (for the startup sequence) and
the external clock (during transmission).


>> I don't know if my chip revision is B4 and the errata
>> suggest using a clock between 8 and 10MHz.
>> However, it also suggests using the ENC28J60-provided 12.5MHz
>> output : I'm ready to add an external clock input in the master
>> if i'm allowed to "legally" go beyond the 10MHz rating
>> (a 25% bandwidth increase is always a good thing, particularly
>> with real-time communications).

>
> When you talk about B4 is that the ENC28J60 chip??

"ENC28J60 silicon rev. B4"

> How can it
> recommend a 10 MHz max clock and also recommend using a 12.5 MHz
> clock?

Ask the manufacturer
but reading the errata sheet, it seems that the problem
comes when "bridging" two serial interfaces (the integrated MII
and the external SPI). The MII is clocked from the onboard 25MHz
but the external interface must accommodate "any" other frequency.

> I have to say you are using "it" and "my chip" in unclear
> ways. Try being specific and not using pronouns.

Sorry... I "understand myself", even though I have carefully
proofread the post before hitting "send".

>> For me and in the intended application, that's enough
>> to justify another clock domain.
>> If I had no ready-to-use ENC28J60 mini-module,
>> I would not have bothered.

>
> I still have no idea why you think you need two clock domains.

I still don't understand why several people think I "must not" use 2 clocks.
I understand that "normally" this goes against common knowledge
but I see no limitation, legal reason or technical issue that
could prevent me from doing this (Usenet misunderstandings are not,
IMHO, "limitations" ). Furthermore, the first simulation results
are very encouraging.

> Are you saying that you can't pick a CPU clock rate that will allow an SPI
> clock rate of 8 to 10 MHz?

No. I'm saying that if the manufacturer implies that it is possible
to go a bit faster, adding a clock input to a 208-pin FPGA is not
a serious issue at all. Furthermore, when done carefully,
some asynchronous designs are not that difficult.
I'm not speaking FIFOs here.

> What CPU are you using?

http://yasep.org (not up to date)
This is a soft core that I am developing, based on ideas dating back to 2002.
It now has a configurable 16-bit or 32-bit wide datapath with quite dumb
but unusual RISC instructions. I have been writing the VHDL code since
the start of the summer, when I got my Actel eval kit.
Between 2000 and 2002, I had some other intensive VHDL experience, with other tools.

> Are you saying you can't find a suitable subset of SPI operation that
> will work with both? Are you aware that you can operate the bus in
> different mode when addressing different peripherals? That can be
> handled in your FSM.

I will work in mode 0, though I have provision for mode 1.
I also manage 8-bit and 16-bit transfers easily, as well as clocks.

>> Here is the current (yet perfectible) version :

<snip>
>> This method divides by 2*(32-clkdiv).
>> Without the 2x factor, it is impossible to work with clkdiv="00000",
>> and the High and Low durations are unequal when clkdiv is odd.
>> I use a count up, not down, but the difference is marginal.

>
> If you want to divide by 0, you can use a mux to route the input clock
> to the output instead of the divider. Restricting the divider value
> to even values is reasonable; I seem to recall the 8253 timer working that
> way in some modes. But that is just semantics.


The div/2 from the internal CPU clock is fine in my case
because the CPU clock is quite high (64 or 100MHz).
I am considering adding a few programmable post- or prescalers because
it's indeed very high. Otherwise, a bypass MUX could have been used too.

>> Particularly because of (slightly broken ?) tools that won't allow both a clock
>> enable AND a preset on the FF (even though the Actel cells in ProAsic3
>> have this capability). Synplicity infers the right cell, which is later
>> broken into 2 cells or more by the Actel backend. Maybe I missed
>> a restriction on the use of one of the signals, using a specific kind
>> of net or something like that (I hope).

>
> There are reasons why the smaller logic companies are small.

in your reasoning, are small companies small because they are small,
and big companies big because they are big ?
And if I mentioned another large company, what would you have answered ?

Now, the tool mismatch I mentioned is maybe a bug, maybe not.
But this did not prevent me from implementing my stuff,
once I understood why some tool added unexpected MUXes.

Furthermore, the more I use the A3Pxxx architecture, the more
I understand it and I don't find it stupid at all.
The "speed issue" seems to be because they use an older
silicon process than the "others". But the A3Pxxx are old,
there are newer versions (unfortunately under-represented,
overpriced and not well distributed). Once again,
commercial reasons, not technical.

>> I could upload the source code somewhere so others can
>> better understand my (fuzzy ?) descriptions.
>> I should finish the simulation first.

>
> I can't say that I am following what you are doing. But if you are
> using multiple clock domains that are larger than 1 FF in each
> direction, I think you are doing it wrong.

What do you mean by "clock domains that are larger than 1 FF in each direction" ?
What I do (at a clock domain boundary) is quite simple :
- the FF is clocked by the data source's clock (only)
- the FF is read (asynchronously) by the sink.
Because of the inherent higher-level handshakes (for example, the
receive register is read only when a status flag is asserted and
detected in software), there is no chance that metastability can
last long enough to cause a problem.

Maybe a drawing can help too here.
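In lieu of a drawing, the boundary described above could be sketched like this (a hypothetical fragment with illustrative names; it adds the conventional two-flop synchronizer on the flag, and omits the clear-on-read path for brevity):

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity flag_sync is
    port (
        src_clk  : in  std_logic;
        src_set  : in  std_logic;   -- asserted in the source domain
        snk_clk  : in  std_logic;
        snk_flag : out std_logic    -- flag as seen by the sink domain
    );
end entity;

architecture rtl of flag_sync is
    signal flag_src : std_logic := '0';
    signal sync_ff  : std_logic_vector(1 downto 0) := "00";
begin
    -- the FF clocked only by the data source's clock
    process (src_clk)
    begin
        if rising_edge(src_clk) then
            if src_set = '1' then
                flag_src <= '1';   -- a real design also clears this, e.g. on read
            end if;
        end if;
    end process;

    -- two flops in the sink domain bound the metastability window;
    -- the software-level handshake guarantees the associated data is
    -- stable long before the flag is acted upon
    process (snk_clk)
    begin
        if rising_edge(snk_clk) then
            sync_ff <= sync_ff(0) & flag_src;
        end if;
    end process;

    snk_flag <= sync_ff(1);
end architecture;
```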

> I try to keep the entire chip design in a single clock domain if
> possible. I seldom find it necessary to make exceptions for anything
> other than the periphery.

The rest of the design is clocked from a single source,
if that makes you feel better
The first big challenge I faced (when I started to adapt the YASEP
architecture to the limits of a FPGA) was to make the CPU core
run at the same frequency as the external memory.
The initial idea was that YASEP was decoupled from memories
through the use of cache blocks, but CAM (or multiple address
comparison logic) is too expensive in most FPGAs.
Once again, I adapted my design to the external constraints.

> I don't understand your "clocking restrictions". How does that mean
> you can't use the CPU clock for the SPI interface?

I can use the internal clock, but not all the time.

> You are designing the master, you can use any main clock you choose.

Sure.

> I don't understand your restrictions,

Let's say that now (considering the first results),
it's not a restriction, but added flexibility.

> but unless you *have* to use some specific
> frequency to sync a PLL or something unusual on SPI, you can use your
> CPU clock as the timing reference eliminating synchronization issues.

And now, what if I "can" use something else ?

In fact, I see that it opens up new possibilities now.
Let's consider the CPU clock and the predivider :
with the CPU running 100MHz, the predivider can generate
50, 25, 16.66, 12.5... MHz. Now, the SPI memories can
reach 20MHz. 16.66MHz is fine, but my proto board has a 40MHz
oscillator (that drives the PLL generating the 100MHz).
All I have to do now is MUX the SPI clock input between
the 40MHz clock source and the CPU clock source.
All the deglitching is already working,
and I can run at top nominal speed.
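Switching SCLK between two running sources needs a glitch-free mux. A simplified sketch of the classic scheme (illustrative names; the clock-switching article linked elsewhere in this thread adds a synchronizer stage per path for fully unrelated clocks, and FPGA vendors provide dedicated clock-mux primitives that should be preferred):

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity clk_mux_glitchless is
    port (
        clk0, clk1 : in  std_logic;  -- e.g. CPU-derived clock and 40MHz oscillator
        sel        : in  std_logic;  -- '0' -> clk0, '1' -> clk1
        clk_out    : out std_logic
    );
end entity;

architecture rtl of clk_mux_glitchless is
    signal en0, en1 : std_logic := '0';
begin
    -- each enable updates on the falling edge of its own clock, and only
    -- after the other path is disabled, so clk_out never sees a runt pulse
    process (clk0)
    begin
        if falling_edge(clk0) then
            en0 <= (not sel) and (not en1);
        end if;
    end process;

    process (clk1)
    begin
        if falling_edge(clk1) then
            en1 <= sel and (not en0);
        end if;
    end process;

    clk_out <= (clk0 and en0) or (clk1 and en1);
end architecture;
```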

>> In fact, there IS a need to resynchronise the clock, even when
>> it is generated by the CPU, because of the divider.
>>
>> Imagine (I'm picky here) that the CPU runs at 100MHz (my target)
>> and the slave at 100KHz (an imaginary old chip).
>> The data transfer is setup in the control register, then
>> the write to the data register triggers the transfer.
>> But this can happen at any time, whatever the value of the predivider's counter.
>> So the clock output may be toggled the first time well below
>> the required setup time of the slave. That's a glitch.

>
> I don't know what you are saying here, but if it implies that you have
> to do something to synchronize the SPI and CPU, you are doing it
> wrong. Use the same clock for both and use clock enables instead of a
> second clock domain.
>
>> In this case, the solution is easy : reset the counter
>> whenever a transfer is requested. That's what i did too,
>> the first time.
>>
>> but there is an even simpler solution : add a "clear" input condition
>> to the FFs that are used to resynchronise the clocks, as in http://i.cmpnet.com/eedesign/2003/jun/mahmud3.jpg
>> so the next clock cycle will be well-formed, whether the
>> source is internal or external. The created delay is not an issue.

>
> Why are you resyncing clocks? Use 1 clock.


Please reread the paragraphs above carefully:
The point is not an issue with clock domains (because there's obviously
only one in the initial example), I just wanted to show that
the predivider can create glitches too (from the slave point of view).

In the usual single-clock design, the predivider is reset when
a new transaction starts. This prevents very short pulses from
being generated ("short" means 1 CPU clock cycle or more, and
less than 1/2*SPIclock).
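That counter-reset fix can be sketched as follows (a hypothetical single-clock fragment; names are illustrative). Restarting the predivider when the transfer begins guarantees the first SCLK half-period is full length, so no runt pulse reaches the slave:

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity spi_div_sync is
    port (
        clk        : in  std_logic;             -- CPU clock
        start_xfer : in  std_logic;             -- pulses on the data-register write
        clkdiv     : in  unsigned(4 downto 0);  -- half-period, in CPU clocks
        sclk       : out std_logic
    );
end entity;

architecture rtl of spi_div_sync is
    signal divcounter : unsigned(4 downto 0) := (others => '0');
    signal sclk_int   : std_logic := '0';
begin
    process (clk)
    begin
        if rising_edge(clk) then
            if start_xfer = '1' then
                -- restart the predivider: the first half-period begins here
                divcounter <= (others => '0');
                sclk_int   <= '0';             -- mode-0 idle level
            elsif divcounter = clkdiv then
                divcounter <= (others => '0');
                sclk_int   <= not sclk_int;
            else
                divcounter <= divcounter + 1;
            end if;
        end if;
    end process;
    sclk <= sclk_int;
end architecture;
```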

>> > I don't think you understand SPI. The master always generates the clock, it
>> > is up to the slave to synchronize to that clock. The master never has to
>> > synchronize to the SPI clock since it generates it.

>> I thought that too, until I read the errata of the chips I want to use.
>> A friend told me years ago : "Never read the datasheet before the errata".
>> Excellent advice, indeed.

>
> You haven't explained yourself still. If you are using a chip that
> does not act as a slave on the SPI bus, then maybe you shouldn't use
> that part???

reason 1) : I have the Ethernet chips presoldered already and the errata don't
prevent them from working. The conditions are a bit narrower than
what the datasheet says, but I have coped with terrifyingly worse.
reason 2) : What's the point of an FPGA if I can't use its reconfigurability
features to adapt the design to the existing constraints ?
I'm not using a usual microcontroller here, I can work with as many clocks as I want,
implement all kinds of I/O protocols (tens or none if I desire).

So if the interface works, I can use the parts.
And if I can, I don't see why I shouldn't, particularly if I want to.

> Are you saying that the slave chip you are using can not work with an
> async master and that the master SPI interface *has* to use a clock
> from the slave chip??? I've never seen that before.

Maybe you have not read enough errata ;-P

Seriously : sure, this is a stupid silicon bug.

I could have limited myself to 100/12=8.33MHz,
which according to the errata is fine.
With a "fixed" processor/chip, everybody would do that.

However, there are other possibilities and opportunities.
I am not using FPGAs to limit myself.
And I can spend some time to tune my design and add features.
Some FF here, a MUX there, and I have complete clocking freedom.
This, for example, turns a partly buggy Ethernet chip
into a well performing interface.

>> > See my other post for the details, but basically you're making this harder
>> > than it need be.

>> Though sometimes there needs to be something
>> a bit more than the "theoretically practically enough".

>
> No, actually. When you are working with read data, this is very
> viable. The round trip timing is enough for an FPGA to receive the
> data with sufficient hold time. Most FPGAs have I/O delays which can
> be added to give a negative hold time, ensuring that this will work.
> This also gives the maximum setup time. It is in the slave that you
> can't depend on this because of the race condition between the write
> data and the clock.


OK

> I think
> what KJ is referring to is that if you want to provide additional hold
> time for read data, you can sample the input data one fast clock
> earlier than the SPI clock edge that changes the data. Likewise the
> output data can be changed a bit later than the actual SPI clock edge,
> not that it would be needed.


It sounds fine to me.

>> The idea was that the SPI master "could" sample MISO with the same (internal) clock signal
>> and edge that samples MOSI. The issue this would "solve" is when capacitance
>> and propagation delays on the PCB, along with relatively high clock speed
>> (the 25AA1024 by Microchip goes up to 20MHz) delay the MISO signal
>> enough to miss the normal clock edge.

>
> Even at 20 MHz, you would need to have (25 ns - master setup time) of
> delay to cause a problem. That should still be a large amount of
> delay and I would not expect any problem with normal boards. The
> master has no downside to clocking the read data on either edge.


OK.

I was just concerned that one of the SPI slaves
(a sensor) might be located a bit further from the FPGA than usual
(10 or 20cm). I could play with buffers and skew, but line capacitance
could have been a problem for the other (faster) slaves.

Thanks for the insights,

> Rick

YG
 
 
 
 
 
KJ
 
      08-24-2008

"whygee" <(E-Mail Removed)> wrote in message
news:48b0e3a4$0$294$(E-Mail Removed)-internet.fr...
> Hi !
>
> KJ wrote:
>> Since you said you're implementing the SPI master side, that implies that
>> you're generating the SPI clock itself which *should* be derived from the
>> CPU clock...there should be no need then for more than a single clock
>> domain (more later).

>
> As pointed in my previous post, there is at least one peripheral
> (ENC28J60 revB4) that has clocking restrictions
> (also known as "errata") and I happen to have some ready-to-use
> modules equipped with this otherwise nice chip...
>


It's always fun when someone refers to mystery stuff like "clocking
restrictions (also known as "errata")" instead of simply stating what they
are talking about. There is setup time (Tsu), hold time (Th), clock to
output (Tco), max frequency (Fmax). That suffices for nearly all timing
analysis although sometimes there are others as well such as minimum
frequency (Fmin), refresh cycle time, latency time, yadda, yadda, yadda. I
did a quick search for the errata sheet and came up with...
http://ww1.microchip.com/downloads/e...Doc/80257d.pdf

In there is the following blurb which simply puts a minimum frequency
requirement of 8 MHz on your SPI controller design, nothing else. I'd go
with the work around #1 approach myself since it keeps the ultimate source
of the SPI clock at the master where it *should* be for a normal SPI system.

-- Start of relevant errata
1. Module: MAC Interface

When the SPI clock from the host microcontroller
is run at frequencies of less than 8 MHz, reading or
writing to the MAC registers may be unreliable.

Work around 1
Run the SPI at frequencies of at least 8 MHz.

Work around 2
Generate an SPI clock of 25/2 (12.5 MHz), 25/3
(8.333 MHz), 25/4 (6.25 MHz), 25/5 (5 MHz), etc.
and synchronize with the 25 MHz clock entering
OSC1 on the ENC28J60. This could potentially be
accomplished by feeding the same 25 MHz clock
into the ENC28J60 and host controller. Alternatively,
the host controller could potentially be
clocked off of the CLKOUT output of the
ENC28J60.
-- End of relevant errata


> I don't know if my chip revision is B4 and the errata
> suggest using a clock between 8 and 10MHz.
> However, it also suggests using the ENC28J60-provided 12.5MHz
> output :


Read it again. That suggestion was one possible work around; there is
nothing there to indicate that this is a preferred solution, just that it is
a solution.

> I'm ready to add an external clock input in the master
> if i'm allowed to "legally" go beyond the 10MHz rating
> (a 25% bandwidth increase is always a good thing, particularly
> with real-time communications).
>


You can run SPI at whatever clock frequency you choose. What matters is
whether you meet the timing requirements of each of the devices on your SPI
bus. In this case, you have a minimum frequency clock requirement of 8 MHZ
when communicating with the ENC28J60. If you have other SPI devices on this
same bus, this clock frequency does not need to be used when communicating
with those devices...unless of course the ENC28J60 is expecting a free
running SPI clock, they don't mention it that way, but I'd be suspicious of
it. Many times SPI clock is stopped completely when no comms are ongoing
and Figures 4-4 and 4-4 of the datasheet seem to imply that the clock is
expected to stop for this device as well.

> As another "unintended case", an external clock input opens
> the possibility to bit-bang data with some PC or uC.
> I know it sounds stupid


Many times that's the most cost effective approach since the 'cost' is 4
general purpose I/O pins that are usually available. In this case though,
maintaining an 8 MHz

>
>> The CPU clock period and the desired SPI clock period are known
>> constants.

> They are indicated in the datasheet of each individual product.
> And there is no "SPI standard" contrary to I2C or others.
> ( http://en.wikipedia.org/wiki/Serial_..._Bus#Standards )


Yes, all the more freedom you have.

> Some chips accept a falling CLK edge after CS goes low,
> and some other chips don't (even chips by the same manufacturer vary).
>
> So I have read the datasheets of the chips I want to interface,
> and adapted the master interface to their various needs (and errata).
>


Sounds good.

>> Therefore one can create a counter that counts from 0 to Spi_Clock_Period
>> / Cpu_Clock_Period - 1. When the counter is 0, set your Spi_Sclk output
>> signal to 1; when that counter reaches one half the max value (i.e.
>> "(Spi_Clock_Period / Cpu_Clock_Period/2") then set Spi_Sclk back to 0.

> I have (more or less) that already, which is active when the internal
> CPU clock is selected. This is used when booting the CPU soft core
> from an external SPI EEPROM.
>
> Note however that your version does not allow using the CPU clock at full
> speed.
> What happens if you set your "max value" to "00000"?


That's correct, but I wouldn't set the max value to anything; it would be a
computed constant like this:

constant Spi_Clks_Per_Cpu_Clk: positive range 2 to positive'high :=
Spi_Clk_Period / Cpu_Clk_Period;

Synthesis (and sim) would fail immediately if the two clock periods were the
same, since that would result in 'Spi_Clks_Per_Cpu_Clk' coming out to be 1,
which is outside of the defined range. Running SPI at the CPU speed is
rarely needed since the CPU typically runs much faster than the external SPI
bus. If that's not your case, then you've got a wimpy CPU, but in that
situation you wouldn't have a clock divider, and the data handling would be
done differently. This type of information is generally known at design time
and is not some selectable option, so if your CPU did run that slow you
wouldn't even bother to write code that would put in a divider. Your whole
point of "what happens if you set your "max value" to "00000"" is moot.

> And it does not guarantee
> that the high and low levels have equal durations.
>


That's not usually a requirement either. If it is a requirement for some
particular application, then one can simply write a function to compute the
constant so that it comes out to be an even number. In the case of the
ENC28J60, the only specification (Table 16-6) on the SPI clock itself is that
it be in the range of DC to 20 MHz, with the errata then amending that to
be 8 MHz min *while* accessing the MAC registers. It can still be at DC when
not accessing the device. In any case, there is no specific 'SPI clock high
time' or 'SPI clock low time' requirement for the device, so unless there is
some other errata there is no requirement for this device to have a 50% duty
cycle clock.
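For illustration, here is one way to write the "function to compute the constant" idea in VHDL. The package name, the period values (in nanoseconds) and all identifiers are my own assumptions for the sketch, not code from either poster:

```vhdl
-- Hedged sketch: compute the clock divider as an elaboration-time
-- constant, forced even so the generated SPI clock gets a 50% duty
-- cycle with no extra run-time logic.
package spi_div_pkg is
  constant Cpu_Clk_Period : natural := 10;   -- 100 MHz CPU clock (assumed)
  constant Spi_Clk_Period : natural := 125;  -- 8 MHz SPI clock target (assumed)
  function Even_Divider(Spi_Per, Cpu_Per : natural) return positive;
  constant Spi_Clks_Per_Cpu_Clk : positive;  -- deferred; full value in the body
end package;

package body spi_div_pkg is
  function Even_Divider(Spi_Per, Cpu_Per : natural) return positive is
    variable Div : natural := Spi_Per / Cpu_Per;
  begin
    if (Div mod 2) /= 0 then
      Div := Div + 1;  -- round up: a larger divider keeps the SPI clock
                       -- at or below its target frequency
    end if;
    return Div;
  end function;

  constant Spi_Clks_Per_Cpu_Clk : positive :=
    Even_Divider(Spi_Clk_Period, Cpu_Clk_Period);
end package body;
```

With these assumed periods, 125/10 truncates to 12, which is already even, so the SPI clock comes out to 100 MHz / 12 ≈ 8.33 MHz.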

>> The point where the counter = 0 can also then be used to define the
>> 'rising edge of Spi_Sclk' state. So any place where you'd like to use
>> "rising_edge(Spi_Sclk)" you would instead use "Counter = 0". The same
>> can be done for the falling edge of Spi_Sclk; that point would occur when
>> Counter = Spi_Clock_Period / Cpu_Clock_Period/2.
>>
>> Every flop in the design then is synchronously clocked by the Cpu_Clock,
>> there are no other clock domains therefore no clock domain crossings.
>> The counter is used as a divider to signal internally for when things
>> have reached a particular state.

>
> I understand that well, as this is how I started my first design iteration.
> I soon reached some inherent limitations, however.
>


I doubt those limitations were because of device requirements though...they
seem to be your own limitations. If not, then specify what those
limitations are. Just like with your previously mentioned "clocking
restrictions (also know as "errata")" comment, I doubt that these
limitations are due to anything in the device requirements.

> As the RTL code grows, the synthesizer infers more and more stuff,
> often not foreseen, which leads to bloat. Muxes everywhere,
> and duplicated logic cells that are necessary to drive higher fanouts.
> I guess that this is because I focused more on the "expression"
> of my need than on the actual result (but I was careful anyway).
>


Don't write bloated code. Use the feedback you're seeing from running your
code through synthesis to sharpen your skills on how to write good
synthesizable code...there is no substitute for actual experience in gaining
knowledge.

> >> my master SPI controller emits the clock itself (and resynchronises it)

> >
> > No need for the master to resynchronize something that it generates
> > itself
> > (see my other post).

>
> In fact, there IS a need to resynchronise the clock, even when
> it is generated by the CPU, because of the divider.
>


No there isn't. Everything is clocked by the high speed clock (the CPU
clock I presume). The counter being at a specific count value is all one
needs to know in order to sample the data at the proper time. Since the
master generates the SPI clock from the counter there is no need for it to
then *use* the SPI clock in any fashion. You could 'choose' to do so, but
it is not a requirement, it would mainly depend on how you transfer the
receive data back to the CPU, but I suspect either method would work just
fine...but again, that doesn't make it a requirement.
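A minimal VHDL sketch of this single-clock-domain divider might look like the following; the entity, the port names and the DIV value are invented for the example (e.g. DIV = 12 for 100 MHz down to ~8.3 MHz), not taken from either poster's actual code:

```vhdl
-- Everything runs on Cpu_Clk; "count = 0" stands in for the SPI
-- rising edge and the half-way count for the falling edge, so no
-- second clock domain ever exists.
library ieee;
use ieee.std_logic_1164.all;

entity spi_clk_div is
  generic (DIV : positive := 12);  -- CPU clocks per SPI clock period (assumed)
  port (
    Cpu_Clk      : in  std_logic;
    Running      : in  std_logic;  -- held '0' when no transfer is active
    Spi_Sclk     : out std_logic;
    Rising_Tick  : out std_logic;  -- one Cpu_Clk pulse at the SPI rising edge
    Falling_Tick : out std_logic   -- one Cpu_Clk pulse at the SPI falling edge
  );
end entity;

architecture rtl of spi_clk_div is
  signal count : natural range 0 to DIV - 1 := 0;
begin
  process (Cpu_Clk)
  begin
    if rising_edge(Cpu_Clk) then
      Rising_Tick  <= '0';
      Falling_Tick <= '0';
      if Running = '0' then
        count    <= 0;             -- divider parks at 0 between transfers
        Spi_Sclk <= '0';
      else
        if count = DIV - 1 then
          count <= 0;
        else
          count <= count + 1;
        end if;
        if count = 0 then
          Spi_Sclk    <= '1';      -- "rising_edge(Spi_Sclk)" equivalent
          Rising_Tick <= '1';
        elsif count = DIV / 2 then
          Spi_Sclk     <= '0';     -- "falling_edge(Spi_Sclk)" equivalent
          Falling_Tick <= '1';
        end if;
      end if;
    end if;
  end process;
end architecture;
```

The shift register and MOSI/MISO handling would then key off Rising_Tick/Falling_Tick rather than any edge of Spi_Sclk itself.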

> Imagine (I'm picky here) that the CPU runs at 100MHz (my target)
> and the slave at 100KHz (an imaginary old chip).
> The data transfer is set up in the control register, then
> the write to the data register triggers the transfer.
> But this can happen at any time, whatever the value of the
> predivider's counter.
> So the clock output may toggle for the first time after far less than
> the required setup time of the slave. That's a glitch.
>


So don't write such bad code for a design. There is no need for the clock
divider to be running when you're not transmitting. It should sit at 0
until the CPU write comes along; then it would step through a 1000+ CPU
clock cycle state machine, with the first few clocks used for setting up
timing of data relative to chip select and the start of SPI clock. Then
there are a few CPU clocks on the back end for shutting off the chip select,
and then of course the 1000 CPU clocks needed in order to generate the 100
kHz SPI clock itself. Any time the counter is greater than 0, the SPI
controller must be telling the CPU interface to 'wait' while it completes
the transfer.

You should make sure your design works in the above scenario as a test case.
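A sketch of just the 'wait while busy' bookkeeping could look like this in VHDL; the entity and the one-cycle strobe protocol are my assumptions for illustration, not the design under discussion:

```vhdl
-- While busy_i is high the CPU-side bus logic inserts wait states
-- (or exposes a busy bit); the divider is released from reset only
-- for that window, so no truncated first clock pulse can occur.
library ieee;
use ieee.std_logic_1164.all;

entity spi_busy_gate is
  port (
    Cpu_Clk   : in  std_logic;
    Cpu_Write : in  std_logic;  -- one-cycle strobe: CPU wrote the data register
    Done      : in  std_logic;  -- one-cycle strobe: last bit has been shifted
    Busy      : out std_logic   -- tells the CPU interface to wait
  );
end entity;

architecture rtl of spi_busy_gate is
  signal busy_i : std_logic := '0';
begin
  Busy <= busy_i;
  process (Cpu_Clk)
  begin
    if rising_edge(Cpu_Clk) then
      if Cpu_Write = '1' then
        busy_i <= '1';  -- transfer starts; divider starts counting from 0
      elsif Done = '1' then
        busy_i <= '0';  -- back to idle; divider parks at 0 again
      end if;
    end if;
  end process;
end architecture;
```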

> In this case, the solution is easy : reset the counter
> whenever a transfer is requested. That's what i did too,
> the first time.
>
> but there is an even simpler solution : add a "clear" input condition
> to the FF that are used to resynchronise the clocks as in
> http://i.cmpnet.com/eedesign/2003/jun/mahmud3.jpg
> so the next clock cycle will be well-formed, whether the
> source is internal or external. The created delay is not an issue.
>


You haven't guarded against the CPU coming in and attempting to start a
second write while the first one is ongoing. You need a handshake on the
CPU side to insert wait states while the controller is spitting out the
bits. When you look at it in that perspective and design it correctly,
there will be no chance of any glitchy clocks or anything else. If you
don't have a 'wait' signal back to the CPU, then certainly you have an
interrupt that you can use to send back to the CPU to indicate that it
fouled up by writing too quickly...many possible solutions.

> >> So i can also sample the incoming MISO bit on the same clock edge as
> >> MOSI :
> >> the time it takes for my clock signal to be output, transmitted,
> >> received by the slave, trigger the shift, and come back, this is
> >> well enough time for sample & hold.

> > See my other post for the details, but basically you're making this
> > harder
> > than it need be.

> Though sometimes there needs to be something
> a bit more than the "theoretically practically enough".
>


I've done this, it's not a theoretical exercise on my part either. It's not
that hard.

> > Since the master is generating the SPI clock it knows when
> > it is about to switch the SPI clock from low to high or from high to
> > low,
> > there is no need for it to detect the actual SPI clock edge, it simply
> > needs
> > to generate output data and sample input data at the point that
> > corresponds
> > to where it is going to be switching the SPI clock.

> This is what I did in the first design iteration.
>
> However, now, I avoid large single-clock processes
> because there is less control over what the synthesiser does.


That makes no sense.

>
> Finally, I have the impression that you misunderstood the initial post
> about "SPI clocking".
> The idea was that the SPI master "could" sample MISO with the same
> (internal) clock signal
> and edge that samples MOSI. The issue this would "solve" is when
> capacitance
> and propagation delays on the PCB, along with relatively high clock speed
> (the 25AA1024 by Microchip goes up to 20MHz) delay the MISO signal
> enough to miss the normal clock edge.
>


Your proposed solution wouldn't solve anything. If you have a highly loaded
MISO this means you have a lot of loads (or the master and slave are
faaaaaaaar apart on separate boards). It also likely means you have a
highly loaded SPI clock since each slave device needs a clock. You likely
won't be able to find a driver capable of switching the SPI clock so that it
is monotonic at each of the loads (which is a requirement) which will force
you to split SPI clock into multiple drivers just to handle the electrical
load at switching...but now you've changed the topology so that trying to
feed back SPI clock to somehow compensate for delays will not be correct.
Far easier to simply sample MISO a tick or two later. For example, using
the 100 MHz/100kHz example you mentioned, the half way point would be 500,
but there is nothing to say that you can't sample it at 501, 502 or
whatever, it doesn't matter.
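The 'sample MISO a tick or two later' idea is a one-line change against the divider counter; in this hedged sketch, DIV, SAMPLE_AT and the 8-bit width are illustrative choices only (SAMPLE_AT = 502 gives 20 ns of margin past the nominal mid-bit count of 500 at 100 MHz):

```vhdl
-- Sampling MISO a couple of CPU clocks after the nominal point
-- absorbs board propagation delay without feeding the SPI clock back.
library ieee;
use ieee.std_logic_1164.all;

entity miso_sampler is
  generic (
    DIV       : positive := 1000; -- 100 MHz CPU clock / 100 kHz SPI clock
    SAMPLE_AT : positive := 502   -- nominal mid-bit is DIV/2 = 500; +2 ticks margin
  );
  port (
    Cpu_Clk : in  std_logic;
    Count   : in  natural range 0 to 999;  -- from the divider counter
    Miso    : in  std_logic;
    Data    : out std_logic_vector(7 downto 0)
  );
end entity;

architecture rtl of miso_sampler is
  signal shift_reg : std_logic_vector(7 downto 0) := (others => '0');
begin
  Data <= shift_reg;
  process (Cpu_Clk)
  begin
    if rising_edge(Cpu_Clk) then
      if Count = SAMPLE_AT then               -- 20 ns after the nominal point
        shift_reg <= shift_reg(6 downto 0) & Miso;
      end if;
    end if;
  end process;
end architecture;
```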

Kevin Jennings


 
 
whygee
Guest
Posts: n/a
 
      08-25-2008
Hello,

It's a bit scary that, according to my news reader, we posted
to the same newsgroup at the same time with a post that is
roughly the same size, starting with the same extract of a PDF.
As if we had nothing more constructive to do.

What is more scary is some posters' constant desire for
explanations and justifications of *personal choices*.
As if a technical choice was only a technical matter.
We are *all* biased, whether we realize it or not.
Our experiences differ and mature.
And of course, because this is personal, nobody agrees.
Just like the other thread about coding styles :
there is so much freedom that everybody will
be contented by something that varies according to the individual.
And it's fine for me because I can do more new and great
things every year.

Freedom is wonderful and today's technology is empowering.
We are free to test, experiment, discover, learn and "invent".
So I'm getting a bit tired of "you should do that"
and "that's how it should be done". As if there were two
kinds of engineers, those who "implement" and those who "innovate"
(often by trial and error).
And I believe that each practical case has distinctive aspects,
that make us reconsider what we know and how we apply our knowledge.
One is free to abide by strict rules, or not. For the rest,
read the standards (and find creative ways to exploit them).
And with YASEP ( http://yasep.org ), I have decided to
go completely wild, unusual and fun (and constructive).

Personally, for the application that has become the focus
of the latest post (SPI master), I have chosen to not be limited
by "what is usually done". I have used a bit of imagination,
confronted the idea with the technical possibilities and
made it work within 48h. The simulations have shown nothing
nasty, and I've learnt some tricks about asynchronous designs
(I wonder why it's so scary for most people, I'm not doing
any FIFO-style magic).

Does somebody need more justifications ?
Shall I quote some nation's constitution ?
I hope not, thank you.

Maybe my error was to think that more Usenet posters would
be open-minded, or at least curious, instead of rehashing
old techniques that I already know work. Now, I realize
that according to some people, pushing the envelope is
not desirable. I did not expect that adding an external
clock input to an otherwise inoffensive circuit would
get the reactions that I have seen. It's just one
stupid pin... Let's all use our 14K8 modems, while
we're at it.

I don't consider my SPI code as finished but I've seen what
I wanted to see, and I'm now looking at the cache memory
system. And once again, it's another occasion to look at
what others have done, what is practically possible, and
how things can be bent, adapted, transformed, twisted,
in order to exploit what is available to perform the
desired functions, and a bit more when the opportunity appears.
24h ago, I thought that it was not even possible to do the
internal cache system.

KJ wrote:
> "whygee" <(E-Mail Removed)> wrote
>> Hi !

<snip>
>> As pointed in my previous post, there is at least one peripheral
>> (ENC28J60 revB4) that has clocking restrictions
>> (also know as "errata") and I happen to have some ready-to-use
>> modules equipped with this otherwise nice chip...

> It's always fun when someone refers to mystery stuff like "clocking
> restrictions (also know as "errata")" instead of simply stating what they
> are talking about.


I have "fun" when I imagine something and implement it.
It's more fun when the thing is unusual, like using a mechanism
to perform another useful function.
I have even more fun when it works as expected.
Oh, and it works. I guess I'm learning and getting better.

Concerning "stating what I was talking about" :
If anybody has to quote every single document about every matter,
then Usenet would become (more) unreadable.
We would need assistants to redact and analyse the posts.
It would be like being a lawyer... and Usenet would
be a courtroom (is it already ?)
So I tried to keep the post short (*sigh*) and
avoided (what I thought) "unnecessary details".

> There is setup time (Tsu), hold time (Th), clock to
> output (Tco), max frequency (Fmax). That suffices for nearly all timing
> analysis although sometimes there are others as well such as minimum
> frequency (Fmin), refresh cycle time, latency time, yadda, yadda, yadda.


Sometimes it's so simple that we don't have to care, sometimes not.

> I did a quick search for the errata sheet and came up with...
> http://ww1.microchip.com/downloads/e...Doc/80257d.pdf


bingo.

> In there is the following blurb which simply puts a minimum frequency
> requirement of 8 MHz on your SPI controller design, nothing else.


You see nothing else where I see an opportunity.
It's a matter of taste, experience, and willingness to push the envelope.
You're not forced to agree with my choices, just as I'm not forced
to follow your advice. I didn't break the entropy principle or
the rules of number theory. I just added a feature.

> I'd go with the work around #1 approach myself since it keeps the ultimate source
> of the SPI clock at the master where it *should* be for a normal SPI system.


Ok, that's a legitimate point of view.
But does this choice force me to, for example, clock the CPU
from a different clock source, *just* so that the SPI master
interface works in the same clock domain as the CPU?
Let's see this as an exercise in thinking outside the box.

> -- Start of relevant errata

<snip>
> -- End of relevant errata
>
>> I don't know if my chip revision is B4 and the errata
>> suggest using a clock between 8 and 10MHz.
>> However, it also suggest using the ENC28J60-provided 12.5MHz
>> output :

>
> Read it again. That suggestion was one possible work around, there is
> nothing there to indicate that this is a preferred solution, just that it is
> a solution.


This sentence too is hurtful.
When facing a choice, where should one go?
It depends on your objectives, mindset, resources...
So you're basically telling me: don't look, this solution does not exist,
just because there is another, more reassuring (to you) solution right before it.
If I thought like that, I would be some random clerk at some
boring office, not an independent guy earning his living
(and his wife's) by "hacking" things. I would rehash proven things
and let people rule over me.

>> I'm ready to add an external clock input in the master
>> if i'm allowed to "legally" go beyond the 10MHz rating
>> (a 25% bandwidth increase is always a good thing, particularly
>> with real-time communications).

>
> You can run SPI at whatever clock frequency you choose. What matters is
> whether you meet the timing requirements of each of the devices on your SPI
> bus.

I'm concerned about that too.
I expect some small buffers here and there.
Fortunately, this is much simpler than I2C.

> In this case, you have a minimum clock frequency requirement of 8 MHz
> when communicating with the ENC28J60. If you have other SPI devices on this
> same bus, this clock frequency does not need to be used when communicating
> with those devices...

of course.

> unless of course the ENC28J60 is expecting a free
> running SPI clock, they don't mention it that way, but I'd be suspicious of
> it. Many times SPI clock is stopped completely when no comms are ongoing
> and Figures 4-4 and 4-5 of the datasheet seem to imply that the clock is
> expected to stop for this device as well.


obviously.

>> As another "unintended case", an external clock input opens
>> the possibility to bit-bang data with some PC or uC.
>> I know it sounds stupid

>
> Many times that's the most cost effective approach since the 'cost' is 4
> general purpose I/O pins that are usually available. In this case though,
> maintaining an 8 MHz


Hmm, this seems to be unfinished, but let me try to continue your phrase:
"maintaining an 8MHz clock on a // port is difficult" (or something like that).
Of course! That's the whole point of having an external clock input!
Though it would be more natural to have a SPI slave instead of a SPI master.

I'll cut this subject short because, for the specific purpose of communication
with a host, I intend to use another kind of parallel, synchronous protocol:
4 bits of data, 1 pulse strobe, 1 output enable, 1 reset, 1 "slave data ready".
With some crude software handshake, it's really easy to implement and use,
and 4x faster than SPI.

>> And there is no "SPI standard" contrary to I2C or others.
>> ( http://en.wikipedia.org/wiki/Serial_..._Bus#Standards )

> Yes, all the more freedom you have.


So it would be lazy of me not to use it.

>> Note however that your version does not allow using the CPU clock at full
>> speed. What happens if you set your "max value" to "00000"?

> That's correct but I wouldn't set the max value to anything, it would be a
> computed constant like this
>
> constant Spi_Clks_Per_Cpu_Clk: positive range 2 to positive'high :=
> Spi_Clk_Period / Cpu_Clk_Period;


As I need flexibility (the system I develop is also a development platform for me),
the clock divider is programmable. I need to ensure that any combination
of configuration bits won't bork something.

> Synthesis (and sim) would fail immediately if the two clock periods were the
> same since that would result in 'Spi_Clks_Per_Cpu_Clk' coming out to be 1
> which is outside of the defined range. Running SPI at the CPU speed is
> rarely needed since the CPU typically runs much faster than the external SPI
> bus. If that's not your case, then you've got a wimpy CPU, but in that
> situation you wouldn't have a clock divider, and the data handling would be
> done differently. This type of information though is generally known at
> design time and is not some selectable option so if your CPU did run that
> slow you wouldn't even bother to write code that would put in a divider so
> your whole point of "what happens if you set your "max value" to "00000"" is
> moot.

Nice try.

In my case, the SPI divider is programmable, and I expect to be able
to slow down the CPU (from 100 to maybe 20 or 10MHz) when it is not in use
(to conserve power, which is mostly sunk by the main parallel memory interface,
a problem that I am addressing currently).

>> And it does not guarantee
>> that the high and low levels have equal durations.

>
> That's not usually a requirement either.

At the highest frequencies, I assume that it could become a problem.
So since I have a decent frequency margin in the CPU, there's no use
having non-50% duty cycles, when the inherent divide-by-2 (of the code
that I copy-pasted in the previous post) solves the problem before it
appears.

> If it is a requirement for some
> particular application, then one can simply write a function to compute the
> constant so that it comes out to be an even number. In the case of the
> ENC28J60 the only specification (Table 16-6) on the SPI clock itself is that
> it be in the range of DC to 20 MHz, with the errata then amending that to
> be 8 MHz min *while* writing to that device. It can still be at DC when not
> accessing the device. In any case, there is no specific 'SPI clock high
> time' or 'SPI clock low time' requirement for the device, so unless there is
> some other errata there is no requirement for this device to have a 50% duty
> cycle clock.


As pointed out in another post about SPI edge sampling,
I was a bit worried that longer-than-expected wiring would cause trouble.
At 20MHz (the latest chips are that fast), I'm not willing to play with the duty cycle.
The sampling clock, maybe, but that would remain experimental.
Fortunately, my scope is fast enough for this case, so in practice
I would find the solution if a problem arises.

>>> Every flop in the design then is synchronously clocked by the Cpu_Clock,
>>> there are no other clock domains therefore no clock domain crossings.
>>> The counter is used as a divider to signal internally for when things
>>> have reached a particular state.

>> I understand that well, as this is how I started my first design iteration.
>> I soon reached some inherent limitations, however.

>
> I doubt those limitations were because of device requirements though...they
> seem to be your own limitations. If not, then specify what those
> limitations are. Just like with your previously mentioned "clocking
> restrictions (also know as "errata")" comment I doubt that these
> limitations are due to anything in the device requirements.


I don't want to bloat this post (that probably nobody reads) so I'll
cut this useless issue too. Your doubts may be legitimate but they
are not my concern (which is now: cache memory).
Maybe I'll write a full report later, when things have settled and
become clear, in another thread.

>> As the RTL code grows, the synthesizer infers more and more stuff,
>> often not foreseen, which leads to bloat. Muxes everywhere,
>> and duplicated logic cells that are necessary to drive higher fanouts.
>> I guess that this is because I focused more on the "expression"
>> of my need than on the actual result (but I was careful anyway).

>
> Don't write bloated code.

I tried to keep the thing as bare as possible, of course.

> Use the feedback you're seeing from running your
> code through synthesis to sharpen your skills on how to write good
> synthesizable code...there is no substitute for actual experience in gaining
> knowledge.


Sure. I did that for the previous project.
But in the end, it backfired as I did not think to try simulation.
I've learnt the lesson.
I have also tried to

>>> No need for the master to resynchronize something that it generates
>>> itself (see my other post).

>> In fact, there IS a need to resynchronise the clock, even when
>> it is generated by the CPU, because of the divider.

> No there isn't.

Maybe I should have been more explicit.
I meant: resynchronise the output clock after the divider.
My fault.

<snip>

>> Imagine (I'm picky here) that the CPU runs at 100MHz (my target)
>> and the slave at 100KHz (an imaginary old chip).
>> The data transfer is setup in the control register, then
>> the write to the data register triggers the transfer.
>> But this can happen at any time, whatever the value of the predivider's counter.
>> So the clock output may be toggled the first time well below
>> the required setup time of the slave. That's a glitch.

>
> So don't write such bad code for a design.

If I pointed to this case, then I also addressed it in the code.

> There is no need for the clock
> divider to be running when you're not transmitting.

Your initial code did not mention this.

> It should sit at 0
> until the CPU write comes along, then it would step through a 1000+ CPU
> clock cycle state machine, with the first few clocks used for setting up
> timing of data relative to chip select and the start of SPI clock.

May I address another issue? Alright, thanks.

CS should be controlled by software, at least with the slaves I intend to use,
because only the master knows how many bytes or half-words it wants.

Most slaves see the CS falling edge as the beginning of the transaction,
and rising edge as the end. Between those two edges, as many words as
desired can be transmitted. For the case of the ENC28J60 or the SPI EEPROM,
one can dump the whole memory in one go if needed.
Emitting a new command word for every new byte is pure overhead and loss.

> Then
> there are a few CPU clocks on the back end for shutting off the chip select
> and then of course the 1000 CPU clocks needed in order to generate the 100
> kHz SPI clock itself. Any time the counter is greater than 0, the SPI
> controller must be telling the CPU interface to 'wait' while it completes


In the system I build, the CPU MUST poll a "RDY" flag in the control register
before starting a new command; that's the simple software protocol I defined.
If the CPU starts a new transaction while RDY is cleared, then this is a deliberate
error, not the interface's fault. The interface FSM will be reset and restarted
automatically, and the aborted transaction will be lost, that's all.
And this is defined as an error that I expect will never happen in well-formed
software.
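The hardware side of such a flag could look something like this VHDL sketch; the 16-bit width, the register-select scheme and all names are guesses for illustration, with only the "status flag in the control register" idea coming from the post:

```vhdl
-- Hedged sketch of a control/status register read path with the
-- busy/RDY information in bit 0, so a software poll loop can test
-- the low bit of the register.
library ieee;
use ieee.std_logic_1164.all;

entity spi_ctl_reg is
  port (
    Cpu_Clk  : in  std_logic;
    Busy     : in  std_logic;                      -- high while a transfer is in flight
    Reg_Sel  : in  std_logic;                      -- '0' = control/status, '1' = data
    Rx_Data  : in  std_logic_vector(15 downto 0);  -- received shift register
    Read_Out : out std_logic_vector(15 downto 0)
  );
end entity;

architecture rtl of spi_ctl_reg is
begin
  process (Cpu_Clk)
  begin
    if rising_edge(Cpu_Clk) then
      if Reg_Sel = '0' then
        Read_Out    <= (others => '0');
        Read_Out(0) <= Busy;  -- bit 0 set while busy: the register reads
                              -- odd until the transfer completes
      else
        Read_Out <= Rx_Data;
      end if;
    end if;
  end process;
end architecture;
```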

> You should make sure your design works in the above scenario as a test case.


The first simulations are encouraging,
but the use of Modelsim is a bit tedious.
I'm learning.

>> In this case, the solution is easy : reset the counter
>> whenever a transfer is requested. That's what i did too,
>> the first time.
>>
>> but there is an even simpler solution : add a "clear" input condition
>> to the FF that are used to resynchronise the clocks as in
>> http://i.cmpnet.com/eedesign/2003/jun/mahmud3.jpg
>> so the next clock cycle will be well-formed, whether the
>> source is internal or external. The created delay is not an issue.

> You haven't guarded against the CPU coming in and attempting to start a
> second write while the first one is ongoing.

I don't guard with a handshake because this is the software's responsibility in this case.

> You need a handshake on the
> CPU side to insert wait states while the controller is spitting out the bits.

Ouch, you're tough, here !

In YASEP, there is a "Special Registers" area dedicated
to general HW configuration and peripherals like the SPI master.
It is mapped as 2 registers: 0 is control/state, 1 is data.
Polling the RDY bit takes 2 instructions and 6 bytes:

label_poll
GET SPI_CTL, R0
JO label_poll ; RDY is in bit 0, which makes R0 Odd when not ready.

> When you look at it in that perspective and design it correctly,
> there will be no chance of any glitchy clocks or anything else. If you
> don't have a 'wait' signal back to the CPU, then certainly you have an
> interrupt that you can use to send back to the CPU to indicate that it
> fouled up by writing too quickly...many possible solutions.


It feels like I'm reading one of those old books I read when aged 14
about how to design a 6809-based system. Thank you, Technology, for the FPGA!

<snip>
>> This is what I did in the first design iteration.
>>
>> However, now, I avoid large single-clock processes
>> because there is less control over what the synthesiser does.

>
> That makes no sense.


To you, it seems.

Breaking the thing into well-defined sub-blocks
communicating through signals has not only made synthesis more predictable (IMHO, YMMV but IANAL)
but also simulation easier (I can't see how to enable visualisation
of variables with Modelsim). So to me, it does make some sense.

To quote another thread, I don't remember who said something like :
"I take the D in VHDL very seriously". While others concentrate on the 'L',
I "think" graphically with diagrams, blocks and signal arrows...
just like with paper. Even after I tried other ways,
I feel more comfortable this way and it gets the job done.
What more do I need ?

I have a certain way of doing things, like anybody else.
I won't tell them that they make no sense.
I'll just sit there and learn, and when I face a design challenge,
I'll weigh the pros & cons of the solutions I already know,
and maybe even find a totally different solution.

> Your proposed solution wouldn't solve anything. If you have a highly loaded
> MISO this means you have a lot of loads (or the master and slave are
> faaaaaaaar apart on separate boards). It also likely means you have a
> highly loaded SPI clock since each slave device needs a clock. You likely
> won't be able to find a driver capable of switching the SPI clock so that it
> is monotonic at each of the loads (which is a requirement)

If the case arises, I have some 100s of 74-1G125 single-gate tristate buffers;
this could help. Or I can simply add other dedicated pins to the FPGA (1/slave).
Currently, I'll have 2 fast slaves and maybe one slow remote sensor that could
stand some signal margin (probably < 1MHz).

By the way, I have not been able to find the SPI switching parameters
of ST's LIS3LV02DQ 3D accelerometer, but the transmission speed of this sensor
is not critical; I'll try 100KHz if I can.

> Far easier to simply sample MISO a tick or two later. For example, using
> the 100 MHz/100kHz example you mentioned, the half way point would be 500,
> but there is nothing to say that you can't sample it at 501, 502 or
> whatever, it doesn't matter.


This more or less confirms what I was thinking.
It may be useful one day (as well as a good scope).

And now, let's see how I can create some cache memory
with just 3 512-byte reconfigurable memory blocks.

> Kevin Jennings

YG
 
 
KJ
Guest
Posts: n/a
 
      08-25-2008
On Aug 25, 8:06 am, whygee <(E-Mail Removed)> wrote:
<snip>
> The simulations have shown nothing
> nasty, and I've learnt some tricks about asynchronous designs


- Logic simulations do not find async design problems.
- Timing analysis will find async design problems.

> (I wonder why it's so scary for most people, I'm not doing
> any FIFO-style magic).
>


- Async design is not 'scary'

>
> Maybe my error was to think that more Usenet posters would
> be open-minded, or at least curious, instead of rehashing
> old techniques that I already know work.


Maybe Usenet posters are not as narrow minded as you incorrectly seem
to believe them to be. Most people post questions or are looking for
better ways of doing something and the responses try to fulfill that
need.

Your posts, on the other hand, are long-winded and only go to show how
wonderful and creative and refreshing you think you are...consider
using a blog instead.

> Now, I realize
> that according to some people, pushing the envelope is
> not desirable.


If you think you've pushed any envelopes with your SPI
design...well...you haven't.

> > I'd go with the work around #1 approach myself since it keeps the ultimate source
> > of the SPI clock at the master where it *should* be for a normal SPI system.

>
> Ok, that's a legitimate point of view.
> But does this choice force me to, for example, clock the CPU
> with different clocks source, *just* so that the SPI master
> interface works in the same clock domain as the CPU ?


No it does not force that at all.

> > You can run SPI at whatever clock frequency you choose. What matters is
> > whether you meet the timing requirements of each of the devices on your SPI
> > bus.

>
> I'm concerned about that too.
> I expect some small buffers here and there.
> Fortunately, this is much simpler than I2C
>


You shouldn't need small buffers here and there.

Good luck on your design. You might also want to consider rethinking
the attitude you've shown in your postings and work to make the
content more relevant to the members of this newsgroup...just a
suggestion. I know I'm done with 'whygee' postings for a while.

KJ
 
 
Andy
 
      08-25-2008
On Aug 25, 7:06 am, whygee <(E-Mail Removed)> wrote:
> Hello,
>
> It's a bit scary that, according to my news reader, we posted
> to the same newsgroup at the same time with posts that are
> roughly the same size, starting with the same extract of a PDF.
> As if we had nothing more constructive to do.
>
> What is more scary is some posters' constant desire for
> explanations and justifications of *personal choices*.
> As if a technical choice was only a technical matter.
> We are *all* biased, whether we realize it or not.
> Our experiences differ and mature.
> And of course, because this is personal, nobody agrees.
> Just like the other thread about coding styles :
> there is so much freedom that everybody will
> be contented by something different, varying with the individual.
> And it's fine for me because I can do more new and great
> things every year.
>
> Freedom is wonderful and today's technology is empowering.
> We are free to test, experiment, discover, learn and "invent".
> So I'm getting a bit tired of "you should do that"
> and "that's how it should be done". As if there were two
> kinds of engineers, those who "implement" and those who "innovate"
> (often by trial and error).
> And I believe that each practical case has distinctive aspects,
> that make us reconsider what we know and how we apply our knowledge.
> One is free to abide by strict rules, or not. For the rest,
> read the standards (and find creative ways to exploit them).
> And with YASEP (http://yasep.org), I have decided to
> go completely wild, unusual and fun (and constructive).
>
> Personally, for the application that has become the focus
> of the latest post (SPI master), I have chosen to not be limited
> by "what is usually done". I have used a bit of imagination,
> confronted the idea to the technical possibilities and
> made it work within 48h. The simulations have shown nothing
> nasty, and I've learnt some tricks about asynchronous designs
> (I wonder why it's so scary for most people, I'm not doing
> any FIFO-style magic).
>
> Does somebody need more justifications ?
> Shall I quote some nation's constitution ?
> I hope not, thank you.
>
> Maybe my error was to think that more Usenet posters would
> be open-minded, or at least curious, instead of rehashing
> old techniques that I already know work. Now, I realize
> that according to some people, pushing the envelope is
> not desirable. I did not expect that adding an external
> clock input to an otherwise inoffensive circuit would
> get the reactions that I have seen. It's just one
> stupid pin... Let's all use our 14K8 modems, while
> we're at it.
>
> I don't consider my SPI code as finished but I've seen what
> I wanted to see, and I'm now looking at the cache memory
> system. And once again, it's another occasion to look at
> what others have done, what is practically possible, and
> how things can be bent, adapted, transformed, twisted,
> in order to exploit what is available to perform the
> desired functions, and a bit more when the opportunity appears.
> 24h ago, I thought that it was not even possible to do the
> internal cache system.
>
>
>
> KJ wrote:
> > "whygee" <(E-Mail Removed)> wrote
> >> Hi !

> <snip>
> >> As pointed in my previous post, there is at least one peripheral
> >> (ENC28J60 revB4) that has clocking restrictions
> >> (also known as "errata") and I happen to have some ready-to-use
> >> modules equipped with this otherwise nice chip...

> > It's always fun when someone refers to mystery stuff like "clocking
> > restrictions (also known as "errata")" instead of simply stating what they
> > are talking about.

>
> I have "fun" when I imagine something and implement it.
> It's more fun when the thing is unusual, like using a mechanism
> to perform another useful function.
> I have even more fun when it works as expected.
> Oh, and it works. I guess I'm learning and getting better.
>
> Concerning "stating what I was talking about" :
> If anybody has to quote every single document about every matter,
> then Usenet would become (more) unreadable.
> We would need assistants to write and analyse the posts.
> It would be like being a lawyer... and Usenet would
> be a courtroom (is it already ?)
> So I tried to keep the post short (*sigh*) and
> avoided (what I thought were) "unnecessary details".
>
> > There is setup time (Tsu), hold time (Th), clock to
> > output (Tco), max frequency (Fmax). That suffices for nearly all timing
> > analysis although sometimes there are others as well such as minimum
> > frequency (Fmin), refresh cycle time, latency time, yadda, yadda, yadda..

>
> Sometimes it's so simple that we don't have to care, sometimes not.
>
> > I did a quick search for the errata sheet and came up with...
> >http://ww1.microchip.com/downloads/e...Doc/80257d.pdf

>
> bingo.
>
> > In there is the following blurb which simply puts a minimum frequency
> > requirement of 8 MHz on your SPI controller design, nothing else.

>
> You see nothing else when I see an opportunity later.
> It's a matter of taste, experience, and will of pushing the envelope.
> You're not forced to agree with my choices, just as I'm not forced
> to follow your advice. I didn't break the entropy principle or
> the rules of number theory. I just added a feature.
>
> > I'd go with the work around #1 approach myself since it keeps the ultimate source
> > of the SPI clock at the master where it *should* be for a normal SPI system.

>
> Ok, that's a legitimate point of view.
> But does this choice force me to, for example, clock the CPU
> with different clocks source, *just* so that the SPI master
> interface works in the same clock domain as the CPU ?
> Let's see this as an exercise of thinking out of the bag.
>
>
>
> > -- Start of relevant errata

> <snip>
> > -- End of relevant errata

>
> >> I don't know if my chip revision is B4 and the errata
> >> suggest using a clock between 8 and 10MHz.
> >> However, it also suggest using the ENC28J60-provided 12.5MHz
> >> output :

>
> > Read it again. That suggestion was one possible work around, there is
> > nothing there to indicate that this is a preferred solution, just that it is
> > a solution.

>
> This sentence hurts, too.
> When facing a choice, where should one go ?
> It depends on your objectives, mindset, resources...
> So you're basically telling me : don't look, this solution does not exist,
> just because there is another one more reassuring (to you) just before.
> If I thought like that, I would be some random clerk at some
> boring office, not an independent guy earning a living for himself
> and his wife by "hacking" things. I would rehash proven things
> and let people rule over me.
>
> >> I'm ready to add an external clock input in the master
> >> if i'm allowed to "legally" go beyond the 10MHz rating
> >> (a 25% bandwidth increase is always a good thing, particularly
> >> with real-time communications).

>
> > You can run SPI at whatever clock frequency you choose. What matters is
> > whether you meet the timing requirements of each of the devices on your SPI
> > bus.

>
> I'm concerned about that too.
> I expect some small buffers here and there.
> Fortunately, this is much simpler than I2C
>
> > In this case, you have a minimum frequency clock requirement of 8 MHZ
> > when communicating with the ENC28J60. If you have other SPI devices on this
> > same bus, this clock frequency does not need to be used when communicating
> > with those devices...

>
> of course.
>
> > unless of course the ENC28J60 is expecting a free
> > running SPI clock, they don't mention it that way, but I'd be suspicious of
> > it. Many times SPI clock is stopped completely when no comms are ongoing
> > and Figures 4-4 and 4-4 of the datasheet seem to imply that the clock is
> > expected to stop for this device as well.

>
> obviously.
>
> >> As another "unintended case", an external clock input opens
> >> the possibility to bit-bang data with some PC or uC.
> >> I know it sounds stupid

>
> > Many times that's the most cost effective approach since the 'cost' is 4
> > general purpose I/O pins that are usually available. In this case though,
> > maintaining an 8 MHz

>
> hmm this seems to be unfinished, but let me try to continue your phrase :
> "maintaining an 8MHz clock on a // port is difficult" (or something like that).
> Of course! That's the whole point of having an external clock input!
> Though it would be more natural to have a SPI slave instead of a SPI master.
>
> I'll cut into this subject because for the specific purpose of communication
> with a host, I intend to use another kind of parallel, synchronous protocol :
> 4 bits of data, 1 pulse strobe, 1 output enable, 1 reset, 1 "slave data ready".
> With some crude software handshake, it's really easy to implement and use,
> and 4x faster than SPI.
>
> >> And there is no "SPI standard" contrary to I2C or others.
> >> (http://en.wikipedia.org/wiki/Serial_..._Bus#Standards)

> > Yes, all the more freedom you have.

>
> So I'd be remiss not to use it.
>
> >> Note however that your version does not allow use of the CPU clock at full
> >> speed, what happens if you set your "max value" to "00000" ?

> > That's correct but I wouldn't set the max value to anything, it would be a
> > computed constant like this

>
> > constant Spi_Clks_Per_Cpu_Clk: positive range 2 to positive'high :=
> > Spi_Clk_Period / Cpu_Clk_Period;

>
> As I need flexibility (the system I develop is also a development platform for me),
> the clock divider is programmable. I need to ensure that any combination
> of configuration bits won't bork something.
>
> > Synthesis (and sim) would fail immediately if the two clock periods were the
> > same since that would result in 'Spi_Clks_Per_Cpu_Clk' coming out to be 1
> > which is outside of the defined range. Running SPI at the CPU speed is
> > rarely needed since the CPU typically runs much faster than the external SPI
> > bus. If that's not your case, then you've got a wimpy CPU, but in that
> > situation you wouldn't have a clock divider, and the data handling would be
> > done differently. This type of information though is generally known at
> > design time and is not some selectable option so if your CPU did run that
> > slow you wouldn't even bother to write code that would put in a divider so
> > your whole point of "what happens if you set your "max value" to "00000"" is
> > moot.

>
> ...
>


Don't ask questions to which you don't want to know the answers. Why
did you post here in the first place? You've received excellent,
extremely patiently provided advice. My advice to you is to take it.
Thanks is optional.

One major reason to avoid multiple clock domains when possible is that
simulation (RTL or full-timing) rarely reveals the problems inherent
in crossing clock domains. Static timing analysis does not reveal them
either. Experienced designers know to avoid problems they don't need.
If you think complex code is hard to debug, you should try debugging
behavior that is not repeatable in simulation at all.


Andy
 
 
rickman
 
      08-25-2008
On Aug 24, 4:37 pm, whygee <(E-Mail Removed)> wrote:
> Hi !
>
> rickman wrote:
> > I still have no idea why you think you need two clock domains.

>
> I still don't understand why several people think I "must not" use 2 clocks.
> I understand that "normally" this goes against common knowledge
> but I see no limitation, legal reason or technical issue that
> could prevent me from doing this (Usenet misunderstanding are not,
> IMHO, "limitations" ). Furthermore, the first simulation results
> are very encouraging.


I didn't say you "must not" use 2 clocks. It is just a PITA to use
more than one because of the synchronization issues. Often when the
system clock is significantly faster than the interface clock, the
interface clock can be used as a signal instead of a clock. The clock
can be sampled and the edge detected and used as an enable. It also
helps with the timing analysis a lot.

> > What CPU are you using?

>
> http://yasep.org (not up to date)
> This is a soft core that I am developing, based on ideas dating back to 2002.
> It has now a configurable 16-bit or 32-bit wide datapath with quite dumb
> but unusual RISC instructions. I am writing the VHDL code since
> the start of the summer, when I got my Actel eval kit.
> Between 2000 and 2002, I had some other intensive VHDL experience, with other tools.




> > There are reasons why the smaller logic companies are small.

>
> in your reasoning, are small companies small because they are small,
> and big companies big because they are big ?
> And if I mentioned another large company, what would you have answered ?


PLD companies spend more money on developing software than they do the
hardware. At least this was stated by Xilinx some 4 or 5 years ago...
the rising costs of cutting edge ICs may have changed this. The point
is that if they sell fewer chips, they have a smaller budget for the
software. Since the software requirements are the same if they sell 1
chip or a billion chips, they can obviously afford to develop better
software if they sell more chips.


> Furthermore, the more I use the A3Pxxx architecture, the more
> I understand it and I don't find it stupid at all.
> The "speed issue" seem to be because they use an older
> silicon process than the "others". But the A3Pxxx are old,
> there are newer versions (unfortunately under-represented,
> overpriced and not well distributed). Once again,
> commercial reasons, not technical.


No one said their chips are "stupid". The point is that their tools
will be less developed. Personally I think it is entirely feasible to
develop tools that work in a different domain. I remember that the
Atmel 6000 parts had tools that facilitated a building block
approach. But this requires a significant investment in building up
your own library in a reusable manner and most designers just want to
"get it done". So the tools moved more and more to a push button
approach. The push button approach is also very appealing to a
beginner and so opens up the market to a much wider range of
developers. Obviously this is the route that was taken by logic
companies and the result is complex, expensive (for the chip maker)
tools.


> > I can't say that I am following what you are doing. But if you are
> > using multiple clock domains that are larger than 1 FF in each
> > direction, I think you are doing it wrong.

>
> What do you mean by "clock domains that are larger than 1 FF in each direction" ?


I mean for a serial interface, you only need one FF to synchronize
each input/output signal. This assumes you have a sufficient speed
difference in the I/O clock and the internal sys clock (> 2:1). Then
everything can be done in your sys clock domain and sync issues are
limited to just the I/O. This can be much more manageable than having
banks of logic with different clocks.
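Rick's single-clock-domain structure can be sketched in VHDL as follows. The entity and signal names are invented for illustration (this is not anyone's posted design), and two synchronizing flip-flops are shown where Rick's single FF may well suffice given the speed ratio: the external SCK is treated as ordinary data, registered by sys_clk, and its detected rising edge becomes a one-cycle enable for the shift register, so everything lives in the sys_clk domain:

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity spi_in_sync is
  port (
    sys_clk  : in std_logic;  -- fast system clock (assumed > 2x the SPI clock)
    spi_sck  : in std_logic;  -- external SPI clock, treated as data
    spi_mosi : in std_logic   -- serial data, sampled on the SCK edge
  );
end entity;

architecture rtl of spi_in_sync is
  signal sck_q1, sck_q2, sck_q3 : std_logic := '0';
  signal mosi_q : std_logic;
  signal shreg  : std_logic_vector(7 downto 0) := (others => '0');
begin
  process (sys_clk)
  begin
    if rising_edge(sys_clk) then
      sck_q1 <= spi_sck;   -- synchronizing FF (two shown here for margin)
      sck_q2 <= sck_q1;
      sck_q3 <= sck_q2;    -- delayed copy, for edge detection
      mosi_q <= spi_mosi;  -- data is stable around the SCK edge, one FF suffices
      -- rising edge of the synchronized SCK becomes a one-cycle enable
      if sck_q2 = '1' and sck_q3 = '0' then
        shreg <= shreg(6 downto 0) & mosi_q;
      end if;
    end if;
  end process;
end architecture;
```

Since every register is clocked by sys_clk, static timing analysis covers the whole block with a single clock constraint; the only crossing left to reason about is the synchronizer itself.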


> What I do (at a clock domain boundary) is quite simple :
> - the FF is clocked by the data source's clock (only)
> - the FF is read (asynchronously) by the sink.
> Because of the inherent higher-level handshakes (for example, the
> receive register is read only when a status flag is asserted and
> detected in software), there is no chance a metastability can
> last long enough to cause a problem.


There are other sync issues than metastability. I'm not saying it is
hard, I'm saying that everywhere you have this sort of interface, you
have to go through an analysis to verify that it will work under all
conditions. Again, it is the PITA factor that can be eliminated by
different means.

Rick
 
 
rickman
 
      08-25-2008
On Aug 25, 11:32 am, Andy <(E-Mail Removed)> wrote:
>
> Don't ask questions to which you don't want to know the answers. Why
> did you post here in the first place? You've received excellent,
> extremely patiently provided advice. My advice to you is to take it.
> Thanks is optional.
>
> One major reason to avoid multiple clock domains when possible is that
> simulation (RTL or full-timing) rarely reveals the problems inherent
> in crossing clock domains. Static timing analysis does not reveal them
> either. Experienced designers know to avoid problems they don't need.
> If you think complex code is hard to debug, you should try debugging
> behavior that is not repeatable in simulation at all.


Did you really need to quote his entire message?

I don't think it is appropriate to criticize the OP because he
received good advice but still wants to go his own way. It is
impossible (or at least very difficult) to articulate all the reasons
a designer has for doing something a particular way. And sometimes it
just comes down to personal preference. I am not trying to tell the OP
he is wrong. I'm just trying to give him information on which to base
his decision and to make sure he understands what I have said (and
that I understand what he has said).

But the final decision is his and I would like to find out how it works
for him.

Rick
 
 
Mike Treseler
 
      08-25-2008
whygee wrote:

> Freedom is wonderful and today's technology is empowering.
> We are free to test, experiment, discover, learn and "invent".
> So I'm getting a bit tired of "you should do that"
> and "that's how it should be done". As if there were two
> kinds of engineers, those who "implement" and those who "innovate"
> (often by trial and error).
> And I believe that each practical case has distinctive aspects,
> that make us reconsider what we know and how we apply our knowledge.
> One is free to abide by strict rules, or not. For the rest,
> read the standards (and find creative ways to exploit them).
> And with YASEP ( http://yasep.org ), I have decided to
> go completely wild, unusual and fun (and constructive).


Adventure and discovery are the upside of randomness.
Crashing on the rocks is the downside, but this also
provides the most memorable lesson.

Once I know where some of the rocks are,
I am inclined to steer around those next time
even if the water happens to be running higher.

-- Mike Treseler
 
 
Andy
 
      08-25-2008
On Aug 25, 12:12 pm, rickman <(E-Mail Removed)> wrote:
> Did you really need to quote his entire message?


I most humbly offer my sincerest apology for my wanton disregard for
usenet etiquette...

We are all free to accept or disregard advice we receive via usenet.

Andy
 