e1000 driver

Discussion in 'Linux Networking' started by Ralph Spitzner, Mar 9, 2012.

  1. This may be the wrong group, feel free to shove me to another :)

    Since the latest 2.6.x and with the new 3.x.x Kernels I get
    some really strange behaviour on an 82547EI card using the
    e1000 driver.

    The gateway has two cards: a Realtek pointing at the cable modem
    and the gigabit card pointing at the LAN via a D-Link gigabit switch,
    which connects the desktop (gigabit), the printer (100 Mbit) and
    sometimes friends' computers. All good so far
    (nice throughput, no errors).

    Then, when I connect the Targa (el Cheapo) wifi router (100 Mbit)
    for the netbook and phone, it will take anywhere between 1 and 6
    _days_ until the connection to the gateway locks up, producing
    _this_ on the router's screen:

    [16281.008328] ------------[ cut here ]------------
    [16281.009020] WARNING: at net/sched/sch_generic.c:255 dev_watchdog+0xb9/0x110()
    [16281.009020] Hardware name: To Be Filled By O.E.M.
    [16281.009020] NETDEV WATCHDOG: eth1 (e1000): transmit queue 0 timed out
    [16281.009020] Modules linked in: p4_clockmod freq_table speedstep_lib
    dvb_ttpci budget_av tda10023 stv0297 budget_core dvb_core i915
    drm_kms_helper saa7146_vv saa7146 cfbcopyarea video ttpci_eeprom e1000
    snd_intel8x0 8139too snd_ac97_codec ac97_bus backlight cfbimgblt
    cfbfillrect intel_agp intel_gtt [last unloaded: dvb_ttpci]
    [16281.009020] Pid: 0, comm: swapper/0 Tainted: G W 3.2.2-spitzner.org #3
    [16281.009020] Call Trace:
    [16281.009020] [<c102dfde>] warn_slowpath_common+0x65/0x7a
    [16281.009020] [<c139c5c4>] ? dev_watchdog+0xb9/0x110
    [16281.009020] [<c102e057>] warn_slowpath_fmt+0x26/0x2a
    [16281.009020] [<c139c5c4>] dev_watchdog+0xb9/0x110
    [16281.009020] [<c1037b38>] run_timer_softirq+0x156/0x1f9
    [16281.009020] [<c139c50b>] ? netif_tx_unlock+0x3e/0x3e
    [16281.009020] [<c1032b37>] __do_softirq+0x93/0x130
    [16281.009020] [<c1032aa4>] ? local_bh_enable+0xa/0xa
    [16281.009020] <IRQ> [<c1032d31>] ? irq_exit+0x35/0x84
    [16281.009020] [<c10153bd>] ? smp_apic_timer_interrupt+0x5e/0x6c
    [16281.009459] [<c149fc32>] ? apic_timer_interrupt+0x2a/0x30
    [16281.009879] [<c13400d8>] ? videobuf_read_stream+0xe7/0x238
    [16281.010040] [<c1007532>] ? default_idle+0x52/0x81
    [16281.010040] [<c10017d3>] ? cpu_idle+0x52/0x73
    [16281.010040] [<c148f844>] ? rest_init+0x58/0x5a
    [16281.010040] [<c17296ca>] ? start_kernel+0x2b8/0x2bd
    [16281.010040] [<c17290b0>] ? i386_start_kernel+0xb0/0xb7
    [16281.010040] ---[ end trace 4eaa2a86a8e2da24 ]---

    Any idea WTF is going on there, anyone ?
    (and why the swapper is tainted all of a sudden)
    Ok, it might just be hardware going bad, but it coincides
    with newer kernels....

    regards
    -rasp
     
    Ralph Spitzner, Mar 9, 2012
    #1

  2. Tainting doesn't apply to the swapper, it applies to the whole kernel.
    The 'G' means no proprietary modules have been loaded, the 'W'
    corresponds to a warning having been issued.
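
The same taint state can be read back from a running kernel as a bitmask in /proc/sys/kernel/tainted. A minimal Python sketch, covering only the two flags mentioned here (bit 0 for P/G and bit 9 for W, as listed in the kernel's tainted-kernels documentation); the function names are made up for illustration:

```python
def decode_taint(value):
    """Decode an integer taint bitmask into (letters, reasons).

    Only bit 0 (P/G) and bit 9 (W) are handled; all other bits are
    ignored in this sketch.
    """
    if value == 0:
        # The kernel prints "Not tainted" in this case.
        return "", []
    reasons = []
    # Bit 0 set means a proprietary module was loaded ('P'),
    # bit 0 clear is shown as 'G'.
    if value & 1:
        letters = "P"
        reasons.append("proprietary module was loaded")
    else:
        letters = "G"
    # Bit 9 set means the kernel has issued a warning ('W').
    if value & (1 << 9):
        letters += "W"
        reasons.append("kernel issued a warning")
    return letters, reasons


def read_taint(path="/proc/sys/kernel/tainted"):
    """Read the current taint bitmask from procfs."""
    with open(path) as fh:
        return int(fh.read().strip())
```

On the gateway from the first post, `decode_taint(read_taint())` would be expected to give the letters "GW" if no other taint bits are set.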
     
    Richard Kettlewell, Mar 9, 2012
    #2

  3. Johann Klammer, Mar 9, 2012
    #3
  4. It may be due to the Realtek. I had the same problems when I was using a
    realtek wireless card, Asus N13 PCIE in a desktop machine. It caused the
    wireless access point to puke and disrupted the entire lan network. Three
    different wireless access points would go down after a few hours to 2
    days. Could not keep the wireless working or lan working. I would have
    to reboot the wireless access point to get the lan going again.

    What was weird was that it would take down the wireless access point(s)
    (D-Link, Linksys and a Belkin) within 48 hours of connecting the desktop
    that had the Realtek in it. After I changed from the Asus (Realtek) to a
    Netgear wireless adapter (Atheros) everything was fine, and I have not
    had a failure since I removed the Realtek wireless card and placed it
    into the garbage bin.

    YMMV
     
    GangGreene, Mar 10, 2012
    #4
  5. GangGreene wrote:
    [snip]
    [snip]

    It was neither Realtek nor wireless... It is a via-rhine wired network
    card on my box and an Intel e1000 (also wired, I think) on the original
    poster's. I still suspect EMI or voltage regulator issues.
     
    Johann Klammer, Mar 10, 2012
    #5
  6. Johann Klammer wrote:
    [...]
    Well, I just ordered a Realtek Gig card to replace the 100m one, so
    I can change Interfaces 'round, having the e1000 facing the DSL side.
    See what happens when the e1000 gets the myriad of useless ARP requests
    my cable modem forwards :p
    (As this is the server/gateway/firewall, a dedicated DSL router [SoC]
    is not really an option here)

    But I'm still wondering why there is a 'Watchdog' that does essentially
    nothing beyond what would happen if it weren't there ;-|
    Shouldn't it at least be able to reset the card ?
    (TCP itself should then re-request whatever was missing)

    -rasp
     
    Ralph Spitzner, Mar 10, 2012
    #6
  7. Hello,

    Johann Klammer wrote:
    I had a similar message twice recently:

    ------------[ cut here ]------------
    WARNING: at net/sched/sch_generic.c:219 dev_watchdog+0xc8/0x13b()
    NETDEV WATCHDOG: eth1 (ne2k-pci): transmit timed out
    Modules linked in: ipt_ULOG act_police cls_u32 sch_ingress ppp_deflate
    zlib_deflate zlib_inflate bsd_comp ppp_async crc_ccitt ppp_generic
    slhc ip6t_REJECT ip6table_filter ip6_tables xt_state ipt_REDIRECT
    xt_multiport ipt_REJECT xt_limit xt_tcpudp iptable_mangle
    iptable_nat iptable_filter ip_tables x_tables psmouse apm
    nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack_ipv6 nf_conntrack_ipv4
    nf_conntrack ipv6 ne2k_pci 8390 3c59x mii
    Pid: 0, comm: swapper Not tainted 2.6.27.61-p3 #1
    [<c0115fa1>] warn_slowpath+0x5a/0x7b
    [<c0112bd3>] enqueue_task_fair+0x19/0x41
    [<c0112070>] enqueue_task+0x3f/0x4a
    [<c0112160>] activate_task+0x1c/0x20
    [<c01122ff>] __wake_up_common+0x2d/0x52
    [<c0111bd9>] place_entity+0x7c/0xca
    [<c0112bd3>] enqueue_task_fair+0x19/0x41
    [<c0112070>] enqueue_task+0x3f/0x4a
    [<c01ca038>] strlcpy+0x11/0x3d
    [<c02463c7>] dev_watchdog+0xc8/0x13b
    [<c01262a4>] ktime_get_ts+0x1d/0x3f
    [<c011c508>] run_timer_softirq+0x124/0x16c
    [<c02462ff>] dev_watchdog+0x0/0x13b
    [<c01199a1>] __do_softirq+0x38/0x78
    [<c0119a03>] do_softirq+0x22/0x26
    [<c0119ab9>] irq_exit+0x25/0x55
    [<c0104de2>] do_IRQ+0x4d/0x5f
    [<c010385f>] common_interrupt+0x23/0x28
    [<c0107377>] default_idle+0x25/0x38
    [<c01028d8>] cpu_idle+0x41/0x5d
    =======================
    ---[ end trace 0901e6cba4f12be8 ]---
    eth1: Tx timed out, lost interrupt? TSR=0x3, ISR=0x3, t=819.
    eth1: Tx timed out, lost interrupt? TSR=0x3, ISR=0x3, t=424.
    (and so on, until I unplugged and re-plugged the ethernet cable)

    eth1 is a Realtek RTL8029 PCI connected to a DSL modem with a 10Mbit/s
    half duplex interface.

    Weirdly, after unplugging and re-plugging the ethernet cable, the
    interface was operational again.

    The first time happened 54 days after moving the system to new
    hardware while keeping the same 2.6.27.49 kernel, which had run fine
    for 7 months on the old hardware. The previous card was a 3Com 3C509
    ISA. The second time happened 4 days after upgrading the kernel to
    2.6.27.51.
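
Both traces in this thread carry a NETDEV WATCHDOG line, in two slightly different formats (with and without a queue number). When comparing logs from several machines it can help to pull the interface and driver out mechanically. A rough Python sketch, assuming only the two line formats quoted above; the function name is hypothetical:

```python
import re

# Matches both formats seen in this thread:
#   NETDEV WATCHDOG: eth1 (e1000): transmit queue 0 timed out
#   NETDEV WATCHDOG: eth1 (ne2k-pci): transmit timed out
WATCHDOG_RE = re.compile(
    r"NETDEV WATCHDOG: (?P<iface>\S+) \((?P<driver>[^)]+)\): "
    r"transmit (?:queue (?P<queue>\d+) )?timed out"
)


def parse_watchdog(line):
    """Return iface, driver and queue for a watchdog line, else None."""
    match = WATCHDOG_RE.search(line)
    if match is None:
        return None
    queue = match.group("queue")
    return {
        "iface": match.group("iface"),
        "driver": match.group("driver"),
        # Older kernels do not print a queue number.
        "queue": int(queue) if queue is not None else None,
    }
```

Feeding it the output of dmesg line by line and keeping the non-None results gives a quick list of which interface and driver timed out.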
     
    Pascal Hambourg, Mar 10, 2012
    #7
  8. Watchdogs work that way. They detect conditions which would only occur
    if there's a software bug or hardware error. The right thing to do in
    such a rare and unforeseen situation is not to try to patch things up,
    but to fail noisily.
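
For what it's worth, as far as the mainline source goes, dev_watchdog() in net/sched/sch_generic.c does a little more than print: after the warning it calls the driver's transmit-timeout hook, and what that hook does (for many drivers, scheduling an adapter reset) is up to the driver. The following is a simplified Python model of that check, not kernel code; all names in it are hypothetical:

```python
class TxQueue:
    """Toy stand-in for a transmit queue."""

    def __init__(self, trans_start, stopped):
        self.trans_start = trans_start  # time the last transmit started
        self.stopped = stopped          # True if the driver stopped the queue


def watchdog_tick(now, queues, timeout, on_timeout):
    """Model of one watchdog timer run.

    A queue counts as hung if it is stopped and nothing has been
    transmitted for longer than `timeout`. For the first hung queue the
    driver hook `on_timeout` is called with the queue index, which is
    also returned. Returns None if no queue is hung.
    """
    for index, queue in enumerate(queues):
        if queue.stopped and now > queue.trans_start + timeout:
            print("NETDEV WATCHDOG: transmit queue %d timed out" % index)
            on_timeout(index)
            return index
    return None
```

So whether the card actually gets reset depends on the hook the driver supplies; the watchdog itself only detects the stall and reports it.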

    /Jorgen
     
    Jorgen Grahn, Mar 10, 2012
    #8
  9. If I connect the onboard Intel to the Cable-Modem the system
    goes completely nuts.

    The interface doesn't accept the IPs assigned via dhcpcd, and the
    log file shows 'link-up', 'link down', 'link up'.....
    Also, dhcpcd keeps switching the MTU between 1500 and 576
    continuously.

    Also, sometimes on a manual 'ifconfig' the ifconfig process gets
    zombied and never returns.

    I think I'll just have to disable the onboard LAN in the
    BIOS and get one more card :-(


    -rasp
     
    Ralph Spitzner, Mar 13, 2012
    #9
  10. Hello, and I've encountered difficulties on occasion with the e1000e
    (not the e1000) when using Fedora or CentOS on certain platforms with
    multiple ethX ports. Sometimes a particular port will just stop
    accepting/sending traffic while the other ports remain OK. Doing an
    ifconfig on the problem ethX usually shows a bunch of frame errors.
    Rebooting the system usually restores operation (at least for a time).
    Sincerely,
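
The frame error count that ifconfig prints is also exposed per interface under /sys/class/net/<iface>/statistics/, which makes it easy to sample over time. A small Python sketch, assuming the standard sysfs counter files rx_errors, rx_frame_errors and rx_crc_errors; the function name and the `base` parameter are illustrative only:

```python
import os


def read_rx_counters(iface, base="/sys/class/net"):
    """Return selected receive error counters for one interface.

    Counters that cannot be read are reported as None.
    """
    stats_dir = os.path.join(base, iface, "statistics")
    counters = {}
    for name in ("rx_errors", "rx_frame_errors", "rx_crc_errors"):
        try:
            with open(os.path.join(stats_dir, name)) as fh:
                counters[name] = int(fh.read().strip())
        except (OSError, ValueError):
            counters[name] = None
    return counters
```

Calling `read_rx_counters("eth1")` before and after a stall shows whether the frame errors are still climbing or were a one-off.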
     
    J.B. Wood, Mar 13, 2012
    #10
  11. Are you already aware of Linux's rather strong view of implementing
    the weak end-system model and its effects when multiple NICs are
    connected to the same broadcast domain? Wouldn't explain frame
    errors, but your description made me want to ask.
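
For readers who haven't met the term: under the weak end-system model an IP address belongs to the host as a whole, so by default Linux will answer an ARP request for any of its local addresses on whichever interface the request arrives (the arp_ignore sysctl can tighten this). A toy Python illustration of the difference, with made-up interface names and documentation-range addresses:

```python
def answers_arp(request_iface, target_ip, addrs_by_iface, strong=False):
    """Decide whether a host replies to an ARP request.

    addrs_by_iface maps interface name -> set of IPs configured on it.
    Weak model: reply if the target IP is configured on any interface.
    Strong model: reply only if it is configured on the receiving one.
    """
    if strong:
        return target_ip in addrs_by_iface.get(request_iface, set())
    return any(target_ip in addrs for addrs in addrs_by_iface.values())


# Hypothetical host with two NICs on the same broadcast domain.
addrs = {"eth0": {"192.0.2.1"}, "eth1": {"192.0.2.2"}}
print(answers_arp("eth0", "192.0.2.2", addrs))               # weak model
print(answers_arp("eth0", "192.0.2.2", addrs, strong=True))  # strong model
```

With both NICs on the same broadcast domain, the weak model means either interface may answer for either address, so traffic can end up on a port you did not expect.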

    rick jones
    code 4040 LCP&FD in a previous life...
     
    Rick Jones, Mar 13, 2012
    #11
  12. If you really want help, you will have to give more information.
    What is the content of /etc/sysconfig/network-scripts/ifcfg-eth0?
    If you do lsmod|grep e100
    what do you get?
    Are you sure you want dhcp? Which client are you using?

    It could be a defective onboard card.
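
A note on the lsmod check: lsmod just formats /proc/modules, and grepping for e100 is a substring match, so it would also hit e1000 and e1000e if they are loaded as modules (a driver built into the kernel will not show up at all). A rough Python equivalent that matches on the module name only; the function name is made up for illustration:

```python
def matching_modules(proc_modules_text, needle):
    """Return names of loaded modules whose name contains `needle`.

    proc_modules_text is the content of /proc/modules, where the first
    whitespace-separated field of each line is the module name.
    """
    names = []
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and needle in fields[0]:
            names.append(fields[0])
    return names


def loaded_modules_matching(needle, path="/proc/modules"):
    """Same check against the running kernel."""
    with open(path) as fh:
        return matching_modules(fh.read(), needle)
```

Unlike grep on lsmod output, this ignores the "Used by" column, so it only reports modules whose own name matches.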
     
    unruh, Mar 14, 2012
    #12
  13. Nothing, since it doesn't exist :)
    Nothing either, it's an e1000
    Stock from Slackware ->
    dhcpcd 5.2.11
    Copyright (c) 2006-2011 Roy Marples

    And yes since I tried swapping the interface I have to use
    DHCP since there's no other way to obtain an IP address
    from my cable modem.

    (LAN 1 card, static IPs or dhcp for guests,
    DSL dhcp from provider)

    I don't know if you read my original post, in there
    you'll find what I was on about.

    It is.
    Yesterday the system started freezing randomly, then
    I finally got the icing :
    'Bios checksum error, please insert blabla to restore'

    And no, it doesn't restore anything, following the guidelines
    in the Manual :-(

    So the board is F***ED :-(

    SNAFU
    -rasp
     
    Ralph Spitzner, Mar 14, 2012
    #13