difficult hardware diagnosis

Discussion in 'A+ Certification' started by Frank, May 3, 2004.

  1. Frank

    Frank Guest

    Winxp Home. Updated except for the buggy KB835732.
    Gigabyte mobo GA-8S650GXM (socket 478).
    Celeron 2400
    DDR 2100 - 128mb.


    Experiencing frequent (several times a day) hardware crashes and automatic
    reboots. I have turned off the control panel option (system>advanced) to
    automatically reboot after a crash, but it still does so anyway.

    Machine_Check_Exception.
    There is a blue screen error called Machine_Check_Exception, just prior to
    the crash & reboot, but it's impossible to read it as it's only on for an
    instant. Definitely not a Windows screen. I believe it's an Intel message
    from the cpu diagnosing a hardware error (perhaps cpu or mobo).

    There is no pattern to the crashes and no relation to what programs may be
    running - sometimes it happens when the machine is standing idle.

    On reboot the following is sometimes displayed (not every time) - you have
    recovered from a serious error:
    BC Code:9c BCP1:00000000 BCP2: 8005366FO BCP3:CC0000FF BCP4:20040189
    OSVER:5_1_2600 SP:1_0 Product 768_1


    Memtest86.
    I ran Memtest86 as I suspected a faulty ram module (128 & 256 installed). I
    replaced the 256 module that was probably faulty but the crashes continued.
    So I removed the new module and am only running the original 128 module
    which tests good. Crashes continue.

    PSU.
    I suspected the PSU and verified the voltages: the 3.3, 5 & 12v rails on
    the main mobo connector, checked with an autosensing digital multimeter,
    all seem ok (within 5%).

    Fans.
    All fans are working (cpu, case and psu).

    MBM5 (MotherboardMonitor).
    The MBM temperature results are different from the Bios readings.
    case 141F/61C
    cpu 30F/-1C
    sensor3 32F/0C
    core0 1.6v
    core1 .00v
    +3.3 3.39v
    +5 5.00v
    -12v -12.27v
    -5 -4.89v
    fan1 5625rpm
    fan2 33750rpm
    fan3 16875rpm
    cpu 2424mhz
    cpu0 0%

    Bios:
    system temp=32C/89F
    cpu temp=fluctuates from 39C/100F to 41C/105F
    cpu fan=3125rpm
    system fan=2766rpm
    vcore=1.58v
    +3.3=3.39v
    +5=5.02v
    +12=11.97v
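
The 5% figure from the PSU check can be applied to the BIOS readings above with a little arithmetic. This is only a sketch: the +/-5% limit is the one mentioned in the post, the nominal values are the usual rail names, and the function names are made up for illustration.

```python
# Nominal rail voltage -> BIOS reading quoted in the post.
BIOS_READINGS = {3.3: 3.39, 5.0: 5.02, 12.0: 11.97}
TOLERANCE_PCT = 5.0  # assumption: the +/-5% limit mentioned in the PSU check


def rail_deviation_pct(nominal, measured):
    """Return the signed deviation of a reading from nominal, in percent."""
    return (measured - nominal) / nominal * 100.0


def check_rails(readings, tolerance_pct=TOLERANCE_PCT):
    """Map each nominal rail to (deviation in percent, within tolerance?)."""
    result = {}
    for nominal, measured in readings.items():
        dev = rail_deviation_pct(nominal, measured)
        result[nominal] = (round(dev, 2), abs(dev) <= tolerance_pct)
    return result


if __name__ == "__main__":
    for nominal, (dev, ok) in check_rails(BIOS_READINGS).items():
        status = "ok" if ok else "OUT OF RANGE"
        print(f"+{nominal}V rail: {dev:+.2f}% {status}")
```

Readings that are in tolerance at idle do not rule out a supply that sags under load, so a result like this narrows the search rather than clearing the PSU.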

    Event Viewer.
    I looked at the Event Viewer errors and am continually getting STOP
    0x0000009c errors, which point to hardware problems.

    Dumpchk.
    I used Dumpchk.exe to analyse Minidump files (created by XP in the
    windows>minidump folder) and have copy/pasted one below as an example, and
    to see if it offers any clues:

    C:\WINDOWS>dumpchk minidump\mini042704-02.dmp
    Loading dump file minidump\mini042704-02.dmp
    ----- 32 bit Kernel Mini Dump Analysis

    DUMP_HEADER32:
    MajorVersion 0000000f
    MinorVersion 00000a28
    DirectoryTableBase 00039000
    PfnDataBase 81053000
    PsLoadedModuleList 8054be30
    PsActiveProcessHead 8054de78
    MachineImageType 0000014c
    NumberProcessors 00000001
    BugCheckCode 0000009c
    BugCheckParameter1 00000000
    BugCheckParameter2 8053f0f0
    BugCheckParameter3 cc0000ff
    BugCheckParameter4 20040189
    PaeEnabled 00000000
    KdDebuggerDataBlock 8053dde0
    MiniDumpFields 00000dff

    TRIAGE_DUMP32:
    ServicePackBuild 00000100
    SizeOfDump 00010000
    ValidOffset 0000fffc
    ContextOffset 00000320
    ExceptionOffset 000007d0
    MmOffset 00001068
    UnloadedDriversOffset 000010a0
    PrcbOffset 00001878
    ProcessOffset 000024c8
    ThreadOffset 00002720
    CallStackOffset 00002978
    SizeOfCallStack 00003000
    DriverListOffset 00005c08
    DriverCount 000000a3
    StringPoolOffset 00008c70
    StringPoolSize 00001680
    BrokenDriverOffset 00000000
    TriageOptions 00000041
    TopOfStack f2ebd000
    DebuggerDataOffset 00005978
    DebuggerDataSize 00000290
    DataBlocksOffset 0000a2f0
    DataBlocksCount 00000003


    Windows XP Kernel Version 2600 (Service Pack 1) UP Free x86 compatible
    Kernel base = 0x804d4000 PsLoadedModuleList = 0x8054be30
    Debug session time: Tue Apr 27 12:57:06 2004
    System Uptime: 0 days 0:05:41
    start end module name
    804d4000 806c6980 nt Checksum: 0020230B Timestamp: Thu Aug 29 05:03:24 2002 (3D6DE35C)

    Unloaded modules:
    f309f000 f30af000 NAVENG.Sys Timestamp: unavailable (00000000)
    f2dff000 f2e90000 NavEx15.Sys Timestamp: unavailable (00000000)
    f2ea0000 f2eb0000 NAVENG.Sys Timestamp: unavailable (00000000)
    f2dff000 f2e90000 NavEx15.Sys Timestamp: unavailable (00000000)
    f78c8000 f78d8000 NAVENG.Sys Timestamp: unavailable (00000000)
    f4238000 f42c9000 NavEx15.Sys Timestamp: unavailable (00000000)
    f2f30000 f2f57000 kmixer.sys Timestamp: unavailable (00000000)
    f38c1000 f38e8000 kmixer.sys Timestamp: unavailable (00000000)
    f7de7000 f7de8000 drmkaud.sys Timestamp: unavailable (00000000)
    f3abe000 f3acb000 DMusic.sys Timestamp: unavailable (00000000)
    f7cb2000 f7cb4000 splitter.sys Timestamp: unavailable (00000000)
    f3ace000 f3adc000 swmidi.sys Timestamp: unavailable (00000000)
    f3923000 f3946000 aec.sys Timestamp: unavailable (00000000)
    f7b80000 f7b85000 Cdaudio.SYS Timestamp: unavailable (00000000)
    f757c000 f757f000 Sfloppy.SYS Timestamp: unavailable (00000000)
    end.
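
When several minidumps need comparing, the bug check lines can be pulled out of saved Dumpchk output with a short script. The sketch below assumes the output has been redirected to a text file and that the field names look exactly like the ones in the listing above; `parse_bugcheck` is an illustrative name, not part of any tool.

```python
import re

# Sample lines copied from the Dumpchk listing in the post.
SAMPLE = """\
BugCheckCode 0000009c
BugCheckParameter1 00000000
BugCheckParameter2 8053f0f0
BugCheckParameter3 cc0000ff
BugCheckParameter4 20040189
"""

# Matches lines such as "BugCheckParameter3 cc0000ff".
_FIELD = re.compile(
    r"^\s*(BugCheckCode|BugCheckParameter[1-4])\s+([0-9a-fA-F]{8})\s*$",
    re.MULTILINE,
)


def parse_bugcheck(text):
    """Return {field name: integer value} for the bug check lines in text."""
    return {name: int(value, 16) for name, value in _FIELD.findall(text)}


if __name__ == "__main__":
    for name, value in parse_bugcheck(SAMPLE).items():
        print(f"{name}: 0x{value:08X}")
```

If the same code and parameters show up in every dump, the crashes are at least consistent with a single underlying fault.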

    Prime95
    I ran Prime95 to put load on the system and it failed the torture test. But
    no clues as to why - not necessarily cpu ?
    Readout - Beginning a continuous self test to check computer.
    Test1, 4000 Lucas-Lehmer iterations of M19922945 using 1024k FFT length.
    FATAL ERROR:Writing to temp file.
    Error opening results file to output this message:
    Unable to open log file.
    Torture Test ran 0 minutes - 1 error, 0 warnings.
    Execution halted.
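
The messages quoted above are about writing a temp file and opening a log file rather than about a failed calculation, so one thing worth ruling out is whether the folder Prime95 runs from is writable. A minimal sketch of such a check, assuming Python is available on the machine (`can_write` is an illustrative helper and the folder to test is whatever Prime95 was unpacked into):

```python
import os
import tempfile


def can_write(directory):
    """Return True if a temporary file can be created and written in directory."""
    try:
        with tempfile.NamedTemporaryFile(dir=directory) as handle:
            handle.write(b"test")
            handle.flush()
        return True
    except OSError:
        # Covers missing folders, permission problems and full or read-only disks.
        return False


if __name__ == "__main__":
    folder = os.getcwd()  # assumption: run this from the Prime95 folder
    print(folder, "is writable" if can_write(folder) else "is NOT writable")
```

If the folder turns out to be writable, the file errors may themselves be a symptom of the instability rather than a separate problem.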

    CPU Stability Test ver.6
    I ran the Normal test mode, and it lasted about 9 minutes before crashing.
    No telling if it was because of strain on the cpu as the machine crashes
    like that anyway, even when not under a load.

    So, definitely a hardware problem :) But how can I be specific and sure?
    Would a POST diagnostic card tell me if it's the cpu or mobo or something
    else? I don't have a spare known-good cpu or mobo to swap in.
     
    Frank, May 3, 2004
    #1

  2. Frank

    Fragile_dog Guest

    The blue screen is a Windows screen. Try searching the Microsoft Knowledge
    Base for the error codes.
     
    Fragile_dog, May 3, 2004
    #2

  3. Frank

    Fragile_dog Guest

    Fragile_dog, May 3, 2004
    #3
  4. Frank

    hootnholler Guest

    Hey Frank,

    the only thing that stuck out at me is the core voltage. The celery is
    1.525 V, not 1.6 V. If there's a way to lower it in the BIOS, you may wish
    to give that a shot... That may be causing your issue.

    Hoot
     
    hootnholler, May 3, 2004
    #4
  5. Frank

    Frank Guest

    look how wide open that still leaves the field - cpu, mobo, ram,
    psu..................

     
    Frank, May 3, 2004
    #5
  6. Frank

    Frank Guest

    i'll look at that - doubt it has an option - it's a very basic bios setting
    screen

    --
    xx
     
    Frank, May 3, 2004
    #6
  7. Frank

    w_tom Guest

    STOP: 9C is an exception raised by a hardware failure, a response
    to Int 18h. The four numbers that follow say where the
    failure is, but their meaning is defined uniquely by the motherboard
    designer. The manufacturer SHOULD provide details on what those
    numbers say.

    Get the manufacturer's comprehensive diagnostics (or use
    what you have if the manufacturer is too inferior to
    provide those diagnostics). Selectively heat each component with a
    hairdryer on high. That is well below the temperature at which
    any component must still work just fine. A marginal component
    tends to fail the diagnostics immediately when heated.

    That is hot enough to be uncomfortable to touch but not so
    hot as to burn skin. The frustration: the reason the computer crashed
    is right there in the STOP: 9C message, and the manufacturer must
    provide the information to read it. This is where you find the real
    quality of that manufacturer.
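
For what it is worth, Microsoft's bug check reference describes parameters 3 and 4 of STOP 0x9C, on processors that implement the Machine Check Architecture, as the high and low 32 bits of the MCi_STATUS register of the bank that reported the error, and Intel's Software Developer's Manual documents the layout of that register. The sketch below splits the two values from the original post into the architectural flag bits and the two error-code fields. It assumes that interpretation applies to this Celeron and makes no attempt to interpret the error code itself; `decode_mci_status` is an illustrative name.

```python
# Architectural flag bits of MCi_STATUS (bit positions in the 64-bit value).
FLAG_BITS = {
    "VAL": 63,    # register contents are valid
    "OVER": 62,   # an error arrived while an earlier one was still logged
    "UC": 61,     # error was not corrected
    "EN": 60,     # reporting of this error was enabled
    "MISCV": 59,  # MCi_MISC holds additional information
    "ADDRV": 58,  # MCi_ADDR holds an address for the error
    "PCC": 57,    # processor context may be corrupt
}


def decode_mci_status(high32, low32):
    """Split the two 32-bit halves of MCi_STATUS into flags and code fields."""
    status = (high32 << 32) | low32
    decoded = {name: bool((status >> bit) & 1) for name, bit in FLAG_BITS.items()}
    decoded["mca_error_code"] = status & 0xFFFF             # bits 15:0
    decoded["model_specific_code"] = (status >> 16) & 0xFFFF  # bits 31:16
    return decoded


if __name__ == "__main__":
    # BCP3 and BCP4 as quoted in the original post.
    for name, value in decode_mci_status(0xCC0000FF, 0x20040189).items():
        shown = value if isinstance(value, bool) else f"0x{value:04X}"
        print(f"{name}: {shown}")
```

The MCA error code field is the part to look up in Intel's manual, since that is where the processor reports which kind of unit raised the error.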
     
    w_tom, May 3, 2004
    #7
  8. Frank

    Techie Guest

    What do you want, us to fly in and fix the machine for you too? Help
    yourself, idiot

    --
    Techie
    MCNGP #21
     
    Techie, May 3, 2004
    #8
  9. Frank

    Martin Guest

    Check the BIOS for an option to turn off the CPU cache. The machine will
    run much slower, but if the error goes away it's more likely the CPU that's
    bad. If you still get the error I'd look at the motherboard.
    Of course the best way would be to try a different CPU, but I know it's not
    worth buying if you don't need it.
     
    Martin, May 3, 2004
    #9
  10. Frank

    Frank Guest

    mmm....thank you for your informative and courteous response.
     
    Frank, May 3, 2004
    #10
  11. Frank

    Techie Guest

    Well, behave like an idiot and you get treated like one.
     
    Techie, May 3, 2004
    #11
  12. Frank

    Frank Guest

    I am not going to enter a dialogue with your sort. If you don't like a
    thread, skip it.
     
    Frank, May 3, 2004
    #12
  13. Frank

    Techie Guest

    Touchy Touchy grow up, you don't like the fact that you are a lazy ass and
    someone called you on it. This is a newsgroup about certification not free
    help.
     
    Techie, May 3, 2004
    #13
  14. Frank

    Frank Guest

    i just looked at the bios again, and it has the vcore as 1.58v
    Should I try and lower this - where do you get the 1.525 from ?

    --
     
    Frank, May 3, 2004
    #14
  15. Frank

    Frank Guest

    actually, it doesn't allow me to change it:-(
     
    Frank, May 3, 2004
    #15
  16. Frank

    AG Guest

    Sometimes some news servers don't cache the articles for that long. Stupid
    I know but it happens.
    AG
     
    AG, May 4, 2004
    #16
  17. Frank

    hootnholler Guest

    Just to end the thread...

    Actually, got that number from the Intel site and the core voltages for
    their CPUs. 1.525 seems to be the number for a celery, but the fact that
    you can't change the bios setting tells me it's a 'soft bios'. Sorry to
    hear that, but it's probably not your issue. Celerys will handle some
    minor overclocking. If I remember right, the speed was 2424 mhz on a 2400
    mhz chip. Not exactly frying that sucker ;o)
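
The margins being discussed are easy to put numbers on. A small sketch using the figures quoted in the thread (2424 MHz reported against a 2400 MHz rating, and the BIOS vcore of 1.58 V against the 1.525 V figure given above); `percent_over` is just an illustrative helper:

```python
def percent_over(nominal, actual):
    """Return how far actual exceeds nominal, as a percentage of nominal."""
    return (actual - nominal) / nominal * 100.0


# Figures quoted in the thread.
clock_over = percent_over(2400, 2424)   # MBM5 clock reading vs rated MHz
vcore_over = percent_over(1.525, 1.58)  # BIOS vcore vs the 1.525 V figure

if __name__ == "__main__":
    print(f"clock: {clock_over:+.1f}%")
    print(f"vcore: {vcore_over:+.1f}%")
```

Both come out at a few percent or less, which fits the view that neither is likely to be the cause on its own.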

    Hoot
     
    hootnholler, May 4, 2004
    #17
