ECC and non-ECC

Discussion in 'Computer Information' started by Jeff Strickland, Apr 8, 2008.

  1. What is ECC?

    My computer has 256M of PC2700 RAM installed. I am looking to bump it to a
    gig, which is the max the motherboard will accept. I am finding non-ECC and
    ECC Registered <or something like that> as available options.

    I went to the DELL site and found the specs for my machine, and the non-ECC
    is the preferred format for the RAM. I don't know what ECC means.
     
    Jeff Strickland, Apr 8, 2008

  2. Ofnuts (Guest)

    Jeff Strickland wrote:
    > What is ECC?
    >
    > My computer has 256M of PC2700 RAM installed. I am looking to bump it to
    > a gig, which is the max the motherboard will accept. I am finding
    > non-ECC and ECC Registered <or something like that> as available options.
    >
    > I went to the DELL site and found the specs for my machine, and the
    > non-ECC is the preferred format for the RAM. I don't know what ECC means.


    ECC is Error-Correcting Code. In practice, the memory holds some extra
    bits which are computed on data writes, and which can be used on data
    reads to fix single-bit errors and detect most multi-bit ones.

    It is not very useful on personal PCs (mostly used on servers).
     
    Ofnuts, Apr 9, 2008

  3. Paul (Guest)

    Jeff Strickland wrote:
    > What is ECC?
    >
    > My computer has 256M of PC2700 RAM installed. I am looking to bump it to
    > a gig, which is the max the motherboard will accept. I am finding
    > non-ECC and ECC Registered <or something like that> as available options.
    >
    > I went to the DELL site and found the specs for my machine, and the
    > non-ECC is the preferred format for the RAM. I don't know what ECC means.
    >
    >


    This description is a bit too broad.

    http://en.wikipedia.org/wiki/Error_detection_and_correction#Error-correcting_code

    In this document, they show a 72:64 code. What that means is that a module
    with nine memory chips on one side, at say 8 bits per chip, makes an array
    (rank) of memory which is 72 bits wide. 64 bits is enough to hold the data,
    and the extra 8 bits hold the check bits (the syndrome is what the checking
    logic computes from them on a read). The 72 bit coding of the data makes it
    possible to correct a single bit in error, or detect that two bits are in
    error. Coverage for larger error patterns will have some statistics
    attached to it (odds of being caught by the code). The shorthand for
    "single error correction, double error detection" is SECDED. (An extended
    Hamming code is the classic example.)

    http://www.latticesemi.com/lit/docs/refdesigns/rd1025.pdf
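
    To see where the 8 extra bits come from, here is a small sketch of the
    usual counting argument for a Hamming-style code. It assumes a plain
    extended Hamming construction; the function names are made up for this
    example, and real chipsets may use a different SECDED code with the same
    bit counts.

```python
def hamming_check_bits(data_bits: int) -> int:
    """Smallest r with 2**r >= data_bits + r + 1 (single-error correction).

    The r check bits must be able to name every bit position in the
    codeword (data_bits + r of them) plus the "no error" case.
    """
    r = 0
    while (1 << r) < data_bits + r + 1:
        r += 1
    return r


def secded_check_bits(data_bits: int) -> int:
    """One extra overall parity bit adds double-error detection."""
    return hamming_check_bits(data_bits) + 1


for width in (8, 16, 32, 64, 128):
    r = secded_check_bits(width)
    print(f"{width:3d} data bits -> {r} check bits, {width + r} bits total")
```

    For 64 data bits this gives 7 + 1 = 8 check bits, which is the 72:64
    arrangement described above.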

    ECC needs two parts to work. It needs ECC memory, which is 72 bits wide
    instead of the normal unprotected 64 bits wide. And the motherboard
    chipset must implement the ECC logic, to generate the check bits on
    writes and check them on reads. (In the current generation, many Intel
    desktop motherboards don't have ECC capability.) Also, it may take an
    extra cycle or two to handle ECC, but with the heavy pipelining in
    processors, this may not be apparent to the user.

    So the purpose of ECC is to store information in a redundant way in
    system memory. On retrieval, the whole 72 bit "thing" can be checked
    to see if it is corrupted or not. If one bit is in error, the
    computer can actually rewrite the 72 bit "thing" with a correction
    to the bit in error. Even if the bit in error is one of the check bits,
    it can be corrected.
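
    The write/read cycle can be sketched in software. This is only an
    illustration, assuming a textbook extended Hamming layout (check bits at
    power-of-two positions, overall parity at position 0); the names encode,
    decode and syndrome are hypothetical, and actual chipsets do this in
    hardware, often with a different but equivalent code.

```python
CHECK_POS = [1, 2, 4, 8, 16, 32, 64]                        # 7 Hamming check bits
DATA_POS = [p for p in range(1, 72) if p not in CHECK_POS]  # 64 data bits
# Position 0 holds the overall parity bit: 1 + 7 + 64 = 72 bits in total.


def syndrome(word: int) -> int:
    """XOR of the positions (1..71) of all set bits; 0 for a valid word."""
    s = 0
    for pos in range(1, 72):
        if (word >> pos) & 1:
            s ^= pos
    return s


def encode(data: int) -> int:
    """Done on a write: spread 64 data bits out and add 8 check bits."""
    if not 0 <= data < (1 << 64):
        raise ValueError("data must fit in 64 bits")
    word = 0
    for i, pos in enumerate(DATA_POS):
        if (data >> i) & 1:
            word |= 1 << pos
    s = syndrome(word)
    for p in CHECK_POS:          # set check bits so the syndrome becomes 0
        if s & p:
            word |= 1 << p
    if bin(word).count("1") % 2:  # make the total number of 1 bits even
        word |= 1
    return word


def decode(word: int):
    """Done on a read: return (data, status)."""
    s = syndrome(word)
    odd = bin(word).count("1") % 2
    if s == 0 and not odd:
        status = "ok"
    elif odd and s < 72:
        word ^= 1 << s            # s names the flipped position (0 = parity bit)
        status = "corrected"
    else:
        return None, "uncorrectable"   # e.g. two bits flipped
    data = 0
    for i, pos in enumerate(DATA_POS):
        if (word >> pos) & 1:
            data |= 1 << i
    return data, status


stored = encode(0x0123456789ABCDEF)
print(decode(stored))                           # clean read
print(decode(stored ^ (1 << 37)))               # one data bit flipped
print(decode(stored ^ (1 << 4)))                # one check bit flipped
print(decode(stored ^ (1 << 37) ^ (1 << 5)))    # two bits flipped
```

    A flipped check bit is handled the same way as a flipped data bit,
    because the syndrome simply names the position that went bad.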

    Some computers enhance the scheme by slowly sifting through the
    entire memory, looking for errors in the background. Such a
    sifting process is called "scrubbing", and it lets the computer
    repair dormant single-bit faults before too many of them accumulate.
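
    As a rough model of scrubbing, the sketch below treats memory as a list
    of 72-bit words and walks it once. It assumes the same textbook extended
    Hamming layout as a stand-in for whatever the hardware really uses; the
    scrub function and the list-based "memory" are invented for this example.

```python
import random

N = 72                                  # codeword width in bits
CHECK_POS = [1, 2, 4, 8, 16, 32, 64]    # check bits; position 0 is overall parity
DATA_POS = [p for p in range(1, N) if p not in CHECK_POS]


def syndrome(word):
    """XOR of the positions (1..71) of all set bits; 0 for a valid word."""
    s = 0
    for pos in range(1, N):
        if (word >> pos) & 1:
            s ^= pos
    return s


def encode(data):
    """Build a 72-bit SECDED codeword from 64 data bits."""
    word = 0
    for i, pos in enumerate(DATA_POS):
        word |= ((data >> i) & 1) << pos
    s = syndrome(word)
    for p in CHECK_POS:
        if s & p:
            word |= 1 << p
    return word | (bin(word).count("1") % 2)


def scrub(memory):
    """Walk every word once, rewrite single-bit errors, report the rest."""
    corrected, uncorrectable = 0, []
    for addr, word in enumerate(memory):
        s = syndrome(word)
        odd = bin(word).count("1") % 2
        if s == 0 and not odd:
            continue                          # word is clean
        if odd and s < N:
            memory[addr] = word ^ (1 << s)    # write the repaired word back
            corrected += 1
        else:
            uncorrectable.append(addr)        # too many errors piled up
    return corrected, uncorrectable


rng = random.Random(2008)
golden = [encode(rng.getrandbits(64)) for _ in range(256)]
memory = list(golden)
memory[3] ^= 1 << 17                    # dormant single-bit fault
memory[10] ^= 1 << 64                   # single-bit fault in a check bit
memory[20] ^= (1 << 5) | (1 << 50)      # two faults accumulated in one word
print(scrub(memory))                    # (2, [20])
print(memory[3] == golden[3], memory[10] == golden[10])
```

    The word at address 20 shows why scrubbing matters: once a second fault
    lands in the same word before the first is repaired, SECDED can only
    report it.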

    All codes make some assumptions about expected error patterns. Maybe
    a cosmic ray could corrupt a single bit in a word being read from
    the memory. If that was the case, then ECC would represent tremendous
    coverage against errors.

    On the other hand, things like timing problems in the computer, power
    supply disturbances, static discharges, and the like could cause large
    multi-bit errors, for which ECC wouldn't really be helping matters that
    much. And in terms of the statistics, you'd be surprised how many upsets
    in circuits are of this type (multi-bit).

    There is a second scheme I've seen, and that is on Athlon64 processors.
    They take advantage of the dual channel memory configuration. What
    they do is have the user install two matched 72 bit DIMMs. The total
    number of bits is 144 bits. The actual data portion is 128 bits. That
    leaves 16 bits for them to design a code. The "chipkill" coding method
    makes it possible, if the memory chips are x4 (4 bits wide), to actually
    correct for a situation where one memory chip is dead. A chipkill
    equipped computer could deal with that test case of four bits being
    in error. Now, do chips fail like that? Probably not. But it does
    demonstrate that the available redundant bits can be coded in
    various ways, to protect against different kinds of fault types.
    Good coding design needs good knowledge of how memory fails.
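
    The bit budget for the paired-DIMM case is easy to tabulate. The sketch
    below only does the arithmetic from the paragraph above; it does not
    implement the actual chipkill code, and the layout helper is a made-up
    name for this example.

```python
def layout(total_bits: int, data_bits: int, chip_width: int) -> dict:
    """Count chips and check bits for one memory word of the given shape."""
    chips = total_bits // chip_width
    data_chips = data_bits // chip_width
    return {
        "chips": chips,
        "data_chips": data_chips,
        "check_chips": chips - data_chips,
        "check_bits": total_bits - data_bits,
        # A dead chip corrupts (up to) every bit it contributes to the word.
        "bits_lost_if_one_chip_dies": chip_width,
    }


paired = layout(total_bits=144, data_bits=128, chip_width=4)
for key, value in paired.items():
    print(f"{key}: {value}")
```

    With x4 chips, a dead chip takes out up to four adjacent bits of the
    word, which is more than a SECDED code can correct. The 16 spare bits
    amount to four 4-bit symbols, and that is the room a symbol-oriented
    code has to work with.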

    http://en.wikipedia.org/wiki/Chipkill

    *******

    Now that the background is out of the way -

    Does the motherboard support ECC? If not, then Dell may
    recommend non-ECC simply because the extra ECC bits (check bits)
    will not even be connected to the Northbridge. In that case,
    there is no point in purchasing ECC. There is even a (small) risk
    that the BIOS will not like the SPD contents of an ECC DIMM, and
    will reject the module.

    If the motherboard supports both ECC and non-ECC DIMMs, then you have
    to decide whether you want the protection of ECC. For ECC to
    work, *all* modules would need to be ECC (72 bits wide). It is generally
    not a good idea to mix ECC and non-ECC DIMMs in the hope that
    they'll work in non-ECC mode, because of what the BIOS might
    do (beep the bad memory beep pattern).

    ECC is a valuable feature, in the sense that it can make
    failing memory more obvious to the user. On a Unix computer
    at work, I was actually able to work for a couple hours, while
    there was failed memory present in the computer, thanks to ECC.
    The machine was a bit slower (because it was dumping errors like
    crazy in the log), but continued to function just fine, until
    the repairman showed up after lunch.

    The OS and BIOS must also support ECC. Which is why, when you buy
    an ECC motherboard (like an Intel with X38 chipset), it is not
    enough for just the chipset to support ECC. You could have an X38,
    install ECC RAM, and get nothing for the extra effort. As was true
    in the past, someone has to prove that the motherboard actually
    works, before you get it. And manufacturers are just as lax this year,
    as they were years ago, in properly checking that this feature works.

    So it is an uphill battle to build a super-reliable computer. Even the
    best of intentions on the user's part may not be enough to get the
    job done.

    Memory types for sale -

    1) Unbuffered without ECC (most common desktop memory)
    2) Unbuffered with ECC (suitable for Intel X38 desktop, but hard to find in DDR2.)
       (maybe a good candidate for Athlon64 or later machines.)
       (I'm not sure there is any DDR3 like this.)
    3) Registered with ECC (server memory is generally always ECC protected)
       ("registered without ECC" should not exist)
    4) FBDIMM (a current server memory type, with copious protection methods)
       (If you need ECC and don't want to screw around,
       then buy a server board with FBDIMMs on it.)

    Sure, you can find items (1) through (4) for sale, but Dell will tell
    you the type you really need. (1) is most common, and you can tell
    that by glancing through the Newegg memory lists for current systems.

    You can also go to Crucial.com and search there for compatible memory.
    Kingston.com offers a similar service, as do a couple of the smaller
    enthusiast brands.

    If you want to stay safe, stay away from eBay (if that is what you
    had in mind as a source). In particular, I wouldn't buy 1GB DDR memory
    from eBay. Most of the advertisements will have a "restricted chipset
    list", and that is the stuff to stay away from. You should buy memory
    that works everywhere, and Crucial or Kingston make that kind of
    memory.

    HTH,
    Paul
     
    Paul, Apr 9, 2008