Occasional high RTT through 2950T

Discussion in 'Cisco' started by Def, Apr 5, 2006.

  1. Def

    Def Guest

    In the process of debugging some seemingly random connectivity
    errors, I noticed the fast ethernet ports on the 2950 seemed to
    simply stop forwarding frames now and then. Replacing the switch
    removed the problem from the customer network, but I'm still curious
    if I can figure out what might be wrong with the switch. The network
    consists of only one switch.

    Pulling it into my lab and tinkering a bit, I noticed weird
    patterns in the RTT when leaving it running overnight (second
    column is the time since last RTT > 2ms):

    00:51:23
    02:27:55 +2:36
    02:49:02 +0:22
    03:10:09 +0:19
    03:31:15 +0:21
    04:40:54 +1:09
    06:02:04 +1:38
    06:56:23 +0:54
    08:11:47 +1:16
    08:54:01 +0:43
    09:12:20 +0:18
    09:37:38 +0:25
    09:48:18 +0:11
    11:24:51 +1:36
    13:01:22 +1:37
    13:34:34 +0:33
    13:55:42 +0:21
    14:16:48 +0:21
    15:11:06 +0:55
    15:32:12 +0:21
    16:26:31 +0:56
    16:47:37 +0:21
    18:03:02 +1:16

    As you can see, it's not very random. There's a high frequency of
    ~20m intervals, and there's also a noticeable pattern at ~55m and
    1h37m.
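
    One way to check how strongly the intervals cluster is to count how
    many fall near each candidate period. A minimal Python sketch,
    assuming the second column above is taken exactly as posted and that
    +/- 3 minutes is a fair definition of "near" (the tolerance and the
    helper names are illustrative choices, not part of the original
    measurement):

```python
# Intervals since the previous slow ping, copied as posted ("+H:MM").
POSTED_INTERVALS = """
+2:36 +0:22 +0:19 +0:21 +1:09 +1:38 +0:54 +1:16 +0:43 +0:18 +0:25
+0:11 +1:36 +1:37 +0:33 +0:21 +0:21 +0:55 +0:21 +0:56 +0:21 +1:16
""".split()


def to_minutes(interval):
    """Convert a "+H:MM" string to whole minutes."""
    hours, minutes = interval.lstrip("+").split(":")
    return int(hours) * 60 + int(minutes)


def count_near(gaps, period, tolerance=3):
    """Count gaps within `tolerance` minutes of `period` (hypothetical helper)."""
    return sum(1 for gap in gaps if abs(gap - period) <= tolerance)


GAPS = [to_minutes(item) for item in POSTED_INTERVALS]

if __name__ == "__main__":
    for period in (21, 55, 97):
        print(f"~{period} min: {count_near(GAPS, period)} of {len(GAPS)} intervals")
```

    Widening or narrowing the tolerance changes the counts, so the
    result is only as meaningful as that choice.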

    I saw no debug information that seemed relevant, and posting it here
    would be spamming a lot...

    Suggestions? The OS is c2950-i6q4l2-mz.121-22.EA5a.bin.
    I found spanning-tree running, but all ports are configured
    with the portfast feature (so I switched spanning-tree off).
     
    Def, Apr 5, 2006
    #1

  2. BernieM

    BernieM Guest

    What are you actually pinging? The switch's IP address or a connected host?

    I'd hardly call 3 out of 22 samples (~55m) or 2 out of 22 (1h37m) "noticeable
    patterns".

    Not sure what you were implying with the comment "I found spanning-tree
    running, but all ports are configured with the portfast feature". Do
    you consider that unusual or "bad"?

    Portfast for access ports causes the port to bypass the "listening" and
    "learning" stages of spanning-tree and immediately start forwarding traffic.
    Unless another switch is attached and a 'loop' is formed, forwarding
    traffic immediately is a good thing for hosts.

    Are there any port errors? Are any ports getting 'disabled'? Are there
    duplex errors? Is there a spanning-tree instability, causing periodic
    reconvergence?

    BernieM
     
    BernieM, Apr 5, 2006
    #2

  3. Def

    Def Guest

    I am pinging a connected host.

    The samples are the outliers in a total of some ~62k packets, and
    the intervals were similar enough for me to call it a pattern, but
    you're free to call it what you want of course :)
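
    Pulling the slow replies out of a long ping log can be scripted. A
    minimal sketch, assuming the log is in the Linux iputils `ping`
    format (`icmp_seq=... time=... ms`); other unix ping implementations
    print slightly different lines, and `slow_replies` is a hypothetical
    helper name:

```python
import re

# Assumed line format (Linux iputils ping), for example:
#   64 bytes from 10.0.0.2: icmp_seq=7 ttl=64 time=0.412 ms
RTT_RE = re.compile(r"icmp_seq=(\d+)\b.*?\btime=([\d.]+) ?ms")


def slow_replies(lines, threshold_ms=2.0):
    """Return (sequence number, RTT in ms) for replies slower than threshold_ms."""
    hits = []
    for line in lines:
        match = RTT_RE.search(line)
        if match is None:
            continue  # header, summary or timeout line
        rtt = float(match.group(2))
        if rtt > threshold_ms:
            hits.append((int(match.group(1)), rtt))
    return hits


if __name__ == "__main__":
    import sys

    # Usage: ping <host> | python this_script.py
    for seq, rtt in slow_replies(sys.stdin):
        print(f"icmp_seq={seq} rtt={rtt} ms")
```

    The 2 ms threshold matches the one used for the table in the first
    post; the sequence numbers make it easy to see how far apart the
    slow replies are.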

    What I implied with the spanning-tree comment was that spanning-tree
    was running in a network with only one switch in total. The network
    exists only between a few hosts, and there are no users or network
    admins involved other than me, so I can say for sure that spanning-tree
    is not needed. The configuration I found (when taking over the
    responsibility) had the described portfast settings.

    There are no port errors, no packets other than what the counter calls
    just plain "packets", and no err-disabled. Duplex and speed are set in
    both switch and hosts, and the cables are short and tested ok. All
    patch cables go directly to the hosts, there's no patch panel, and no
    cable runs I cannot see. Spanning-tree instability was what I suspected
    at first, but wouldn't 'debug all' yield something about this?

    My primary suspect right now is the hardware, but I'm generally not all
    that impressed with the quality of 29xx-series hardware so I might be
    inclined to draw that conclusion too fast... The POST is ok, but
    might I be able to run some more interesting diagnostics in ROMMON?
     
    Def, Apr 5, 2006
    #3
  4. Def

    Guest

    > In the process of debugging some seemingly random connectivity
    > errors, I noticed the fast ethernet ports on the 2950 seemed to
    > simply stop forwarding frames now and then.


    Hi,

    20 pings > 2ms out of 62,000 !!!!!!!
    You are supposed to post these on 1 April.
    Why didn't you save it for next year?

    My first assumption would be that the ping target or receiver was
    the cause of the variation.

    To get meaningful results directly from tests such as this you would
    need to get a true hardware based tester such as a Smartbits.

    The one place where the switch might be falling down could be
    the mac address learning process. Read about 802.1d bridges
    and the learning process and the implementation options
    and tradeoffs.
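
    The learning process mentioned above can be sketched in a few lines.
    This is a toy model in Python, assuming a single VLAN and a software
    table; it is not how the 2950 implements it in hardware, and
    `LearningBridge` and its method names are made up for illustration.
    The 300-second aging time is the usual 802.1D default.

```python
class LearningBridge:
    """Toy 802.1D-style learning bridge: learn, forward or flood, age out."""

    def __init__(self, ports, aging_time=300.0):
        self.ports = set(ports)
        self.aging_time = aging_time  # seconds an entry stays valid
        self.table = {}               # MAC address -> (port, last seen time)

    def receive(self, src, dst, in_port, now):
        """Process one frame and return the list of ports it is sent out on."""
        # Learning: remember (or refresh) which port the source lives on.
        self.table[src] = (in_port, now)

        # Aging: a destination entry that has not been refreshed is dropped.
        entry = self.table.get(dst)
        if entry is not None and now - entry[1] > self.aging_time:
            del self.table[dst]
            entry = None

        # Unknown destination: flood to every port except the ingress port.
        if entry is None:
            return sorted(self.ports - {in_port})

        # Known destination: forward to one port, or filter if it is local.
        out_port = entry[0]
        return [] if out_port == in_port else [out_port]
```

    In this model a host that stays silent for longer than the aging
    time gets its next incoming frame flooded instead of forwarded to a
    single port. Whether that has any measurable effect on RTT through a
    real switch is a separate question.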

    What result do you get when you ping 127.0.0.1?

    I have been doing this for > 10 years and your result is not likely
    to get me out of my chair.

    If I were building a real-time control system where these times
    really were critical I would of course be interested; however, on
    normal data networks it is completely irrelevant.

    20 * 2ms = 40ms of "lost" time.

    17 hours = 17 * 3600 or about 60000 seconds

    Lost time as a proportion of elapsed time

    0.04 / 60,000 or about 1/1,500,000.

    Consequences of this are that Microsoft Word
    takes on average an extra 30 microseconds to open.
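
    The proportion above can be reproduced without rounding. A short
    Python sketch, taking the figures from this post as given (20 slow
    pings, 2 ms each, 17 hours elapsed); `lost_time_fraction` is a
    hypothetical helper name, and treating every slow ping as costing
    exactly 2 ms is this post's simplification:

```python
def lost_time_fraction(slow_pings, excess_s, elapsed_s):
    """Return the "lost" time as a fraction of the elapsed time."""
    lost_s = slow_pings * excess_s
    return lost_s / elapsed_s


if __name__ == "__main__":
    elapsed_s = 17 * 3600  # 17 hours in seconds
    fraction = lost_time_fraction(20, 0.002, elapsed_s)
    print(f"elapsed: {elapsed_s} s")
    print(f"lost fraction: {fraction:.3e} (about 1 in {1 / fraction:,.0f})")
```

    With the unrounded 61,200 seconds the ratio comes out close to the
    1/1,500,000 quoted above.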
     
    , Apr 5, 2006
    #4
  5. BernieM

    BernieM Guest

    Sorry, I took the sample output as the entire run ... bit slow today. Yes,
    'debug spanning-tree all' would have shown you something. Pinging a
    connected host? Just a thought, but what's to say the fluctuations in
    response times aren't due to variations in the host's own processing load?
    Have you got multiple pings going that show a delay across the entire
    switch? You said the switch stops forwarding frames, but is there actual
    packet loss? Have you checked Cisco for bug alerts for the IOS version?
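
    One way to act on the "multiple pings" idea is to run two ping
    streams between different host pairs through the switch and see
    whether their slow replies coincide. A rough Python sketch, assuming
    the slow-reply times have already been extracted as seconds since
    the start of the test; `coinciding` and the 1-second window are
    illustrative assumptions:

```python
def coinciding(spikes_a, spikes_b, window_s=1.0):
    """Pair up spike times from two streams that lie within window_s seconds."""
    pairs = []
    b_sorted = sorted(spikes_b)
    j = 0
    for a in sorted(spikes_a):
        # Skip spikes in stream B that are too old to match this one.
        while j < len(b_sorted) and b_sorted[j] < a - window_s:
            j += 1
        if j < len(b_sorted) and abs(b_sorted[j] - a) <= window_s:
            pairs.append((a, b_sorted[j]))
    return pairs


if __name__ == "__main__":
    stream_a = [10.0, 100.0, 500.0]   # made-up example times
    stream_b = [10.4, 250.0, 499.2]   # made-up example times
    print(coinciding(stream_a, stream_b))
```

    Spikes that show up in both streams at the same moment point towards
    something the streams share, such as the switch; spikes seen in only
    one stream point towards that stream's hosts.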

    Sorry I don't have any more positive suggestions.

    BernieM
     
    BernieM, Apr 5, 2006
    #5
  6. BernieM

    BernieM Guest

    I just noticed that other post and they've made a good point ... 2 ms delay
    is 'nothing'. That doesn't equate to 'stops forwarding traffic'.

    BernieM
     
    BernieM, Apr 5, 2006
    #6
  7. Def

    Def Guest

    Ok, but apparently there is something about this switch that causes the
    hosts (or application at least) communicating through it to lose
    connectivity. Since it is a critical backend for something that Must
    Work, there is a limit to how much testing I can do in the production
    environment.

    The ping target and receiver were dedicated and stripped unix hosts and
    I doubt they would introduce the delay... but what do I know, I've only
    been doing this for 10 years :)

    I would love to blame one of the hosts (one of which is a special device
    with limited configuration options) or the application (which is
    proprietary, not very well programmed and seemingly very sensitive to
    network delays. I haven't heard anyone call it a real time system, and
    from what I know it isn't anywhere near, but sometimes I wonder...),
    but when replacing the switch while keeping the config and software, it
    Just Works, which tells me the switch was acting up. The question is
    how I can figure out _what_.

    I'll look more into the learning process for .1d and how the mac
    address table is maintained, thanks.
     
    Def, Apr 5, 2006
    #7
  8. Dana

    Dana Guest

    Interesting problem.
    From reading the thread, I wonder if it was not a physical layer issue that
    was temporarily fixed by the replacing of the switch. From the result of
    your RTT test, it appears the switch itself is fine, unless it can be tested
    with a real stress tester, or in a simulated environment. I have seen
    connectors cause similar problems to what you are describing above.
     
    Dana, Apr 5, 2006
    #8