804 Packet loss during dialer idle timeout process?

Discussion in 'Cisco' started by Loren Amelang, Feb 3, 2005.

  1. I've been using an 804 to dial 2B on-demand ISDN through a local POP
    of my local ISP. No unwanted garbage coming in, no need for
    access-list filtering. Now they've closed their POP and herded us all
    onto a statewide POP that serves SBC/Juno/NetZero/etc. as well - I'm
    getting a crash course in access control! I also upgraded from the
    original IOS 12.0(1)XB1 to 12.1(26), the latest I could get for free.

    I believe I have the filtering and dialer control lists working
    properly, and the lines drop promptly as configured. In fact, too
    promptly. Formerly, if a second B was brought up, it would stay up for
    its dialer idle-timeout period, even if the first B went down during
    that time based on its own dialer idle-timeout. Excess traffic could
    then bring the first B back up, and they typically alternated in use.

    Now a second B stays up only until the excess traffic declines below
    the threshold, sometimes as little as 30 seconds. The original B stays
    up until no interesting traffic has been seen for the idle-timeout
    period. I'm not aware of having changed this, and haven't even seen
    any way to influence it. I'm not even sure if it is relevant to my
    real problem, except for making it happen more often.

    The real problem is that once the idle-timeout is triggered on either
    B channel, for a period of maybe a minute packets simply disappear.

    Here is a "race condition" between an interesting packet that should
    reset the dialer, and a hangup that apparently has already been
    committed:
    Feb 2 14:24:39.726 pst: IP: s=66.53.174.58 (Ethernet0),
    d=66.70.119.214 (Dialer1), g=66.70.119.214, len 48, forward
    Feb 2 14:24:39.730 pst: TCP src=1583, dst=80, seq=2779196878,
    ack=0, win=65535 SYN
    Feb 2 14:24:41.222 pst: Vi1 DDR: idle timeout
    Feb 2 14:24:41.242 pst: Vi1 DDR: disconnecting call

    For up to a minute after the disconnect is complete, new interesting
    packets do not cause dialing. I've reduced this to about 45 seconds by
    changing the dialer enable-timeout from the default 15 to the minimum
    2, but can't seem to cut it shorter. When that time expires,
    re-sending the exact same packets always works.
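
    For readers following along, a minimal sketch of where that timer is
    set and how to check it. This assumes the dialer profile is Dialer1 (as
    in the debug output above); the "Router" prompt is a placeholder, and
    whether the command belongs on the Dialer interface or the physical BRI
    in a particular setup is an assumption to verify locally:

    ```
    Router#configure terminal
    Router(config)#interface Dialer1
    Router(config-if)#dialer enable-timeout 2
    Router(config-if)#end
    Router#show dialer
    ```

    "show dialer" lists the dialer timers currently in effect along with
    the call state, which makes it easier to see which timer is actually
    running during the dead period.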

    And the _real_ real problem is that a timeout (or load decline) of the
    second B channel seems to incapacitate the remaining first B channel
    during the dialer teardown time, and for the following 45 seconds or
    so. During this time packets are not queued, they simply disappear.
    Again, once the mystery timeout is over, the first B channel returns
    to being functional. (I believe this also happens while a second
    channel is being added, but that is only six seconds.)

    Is this just a fact of Cisco life, that didn't bother me enough to
    track down when my B channels were used alternately? Or is there
    something I could configure to make the router not lose packets while
    its dialer is going up and down? Or at least not run the second B up
    and down with apparently zero idle-timeout... Other than moving to a
    city where I don't depend on pay-by-the-minute ISDN...

    Loren
     
    Loren Amelang, Feb 3, 2005
    #1

  2. Loren Amelang

    Guest

    I am sorry if I seem to have sent this twice; however, this one is
    better.

    Probably mppp (multilink ppp) is not configured at your
    provider. Maybe there is a compatibility problem with
    your configuration of mppp.

    Even if mppp is not actually in operation another call will be
    placed when the set load is exceeded. Since you will end up on a
    different interface on the pop and may get a different IP
    address it is not surprising that the router is getting a bit confused.

    To fix remove the mppp configuration at your end or get the
    ISP to allow mppp or get the configuration issue sorted out.
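
    As a rough sketch of the first option (taking multilink out at your
    end): this assumes the dialer profile is Dialer1 and that the second
    call is triggered by a "dialer load-threshold" line like the one in the
    config later in this post. The interface name and prompt are
    placeholders for your own setup:

    ```
    Router#configure terminal
    Router(config)#interface Dialer1
    Router(config-if)#no dialer load-threshold
    Router(config-if)#no ppp multilink
    Router(config-if)#end
    ```

    With the load threshold removed the router should stop placing a second
    call on load, so this is a way to test whether the packet loss is tied
    to the second B channel at all.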

    I have seen this once on a 30 channel PRI which dialled as
    fast as it could for ever. I can't recall now but I think
    that the bill over a few days amounted to 15000 USD. All
    for 64kbps:) Well I only figured out what was not working:))

    Here is an example working config with debugs and
    everything to prove it.

    The single lost packet at the beginning is caused by
    the router not yet knowing the IP address of the ISDN
    interface. It therefore puts in the address of the
    Ethernet interface which is of no use at all to the
    ISP in this case.


    One thing is that I don't understand Virtual interfaces and you
    seem to be using them. Certainly in 12.1 this is not necessary for
    basic dial up internet access. My config was created from default
    with pretty much the minimum possible changes to allow dial up
    internet access.

    Router#sh ver
    Cisco Internetwork Operating System Software
    IOS (tm) C800 Software (C800-Y6-MW), Version 12.1(3)XG6, EARLY DEPLOYMENT RELEASE SOFTWARE (fc1)

    System image file is "flash:c800-y6-mw.121-3.XG6.bin"

    Router#


    I used two telnet windows, one for the ping commands and the other
    for logging and diagnostic output. The only anomaly is that the first
    packet seems to get dropped somewhere, which as I understand it
    shouldn't occur with dialer "hold-queue" configured. The output from
    the two windows is interleaved below.
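
    To reproduce a logging window like this over telnet, a minimal sketch
    (an assumption about the setup; the post does not list the commands
    used). On a vty session the %LINK and %ISDN log messages are only shown
    after "terminal monitor" is entered in that window. The "debug dialer"
    line is optional and purely illustrative:

    ```
    Router#terminal monitor
    Router#debug dialer
    ```

    "undebug all" and "terminal no monitor" turn both off again when the
    test is finished.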


    Router#ping
    Protocol [ip]:
    Target IP address: 146.169.1.10
    Repeat count [5]:
    Datagram size [100]:
    Timeout in seconds [2]: 10 <-- plenty of time for the ISDN to come up ***
    Extended commands [n]:
    Sweep range of sizes [n]:
    Type escape sequence to abort.
    Sending 5, 100-byte ICMP Echos to 146.169.1.10, timeout is 10 seconds:
    ..!!!!
    Success rate is 80 percent (4/5), round-trip min/avg/max = 72/75/80 ms

    ###############################################################
    # Begin comment and diagnostic block #
    ###############################################################

    00:19:77316264788: %LINK-3-UPDOWN: Interface BRI0:1, changed state to up
    00:19:77309411328: %DIALER-6-BIND: Interface BR0:1 bound to profile Di1
    00:19:19: %LINEPROTO-5-UPDOWN: Line protocol on Interface BRI0:1, changed state to up
    00:19:24: %ISDN-6-CONNECT: Interface BRI0:1 is now connected to 01111111111 dial1.th

    ** We still lost the first packet though, even with dialer hold-queue configured.

    Router#sh isdn hist
    --------------------------------------------------------------------------------
    ISDN CALL HISTORY
    --------------------------------------------------------------------------------
    History table has a maximum of 100 entries for disconnected calls.
    History table data is retained for a maximum of 15 Minutes for
    disconnected calls.
    --------------------------------------------------------------------------------
    Call    Calling      Called       Remote    Seconds Seconds Seconds Charges
    Type    Number       Number       Name      Used    Left    Idle    Units/Currency
    --------------------------------------------------------------------------------
    Out                  +1111111111  dial1.th       19 Unavail       0 0
    --------------------------------------------------------------------------------

    ** Now a quick ping to keep line up

    ####################### End #############################


    Router#ping 146.169.1.10

    Type escape sequence to abort.
    Sending 5, 100-byte ICMP Echos to 146.169.1.10, timeout is 2 seconds:
    !!!!!
    Success rate is 100 percent (5/5), round-trip min/avg/max = 72/72/76 ms
    Router#

    ###############################################################
    # Begin comment and diagnostic block #
    ###############################################################
    Now we blast the link with traffic to try to bring up the
    second line
    ####################### End #############################


    Router#ping
    Protocol [ip]:
    Target IP address: 146.169.1.10
    Repeat count [5]: 5000
    Datagram size [100]: 1300 <-- big pings ***
    Timeout in seconds [2]:
    Extended commands [n]:
    Sweep range of sizes [n]:
    Type escape sequence to abort.
    Sending 5000, 1300-byte ICMP Echos to 146.169.1.10, timeout is 2 seconds:
    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    !!
    Success rate is 100 percent (352/352), round-trip min/avg/max = 212/265/552 ms

    ###############################################################
    # Begin comment and diagnostic block #
    ###############################################################
    Router#sh int d1 | inc load
    reliability 255/255, txload 39/255, rxload 39/255
    reliability 255/255, txload 11/255, rxload 11/255
    Router#sh
    00:22:111676003156: %LINK-3-UPDOWN: Interface BRI0:2, changed state to
    up
    00:22:111669149696: %DIALER-6-BIND: Interface BR0:2 bound to profile Di
    00:22:27: %LINEPROTO-5-UPDOWN: Line protocol on Interface BRI0:2,
    changed state to u
    Router#
    00:22:32: %ISDN-6-CONNECT: Interface BRI0:2 is now connected to
    01111111111 dial1.th


    Router#sh ppp mul

    Dialer1, bundle name is dial1.th
    0 lost fragments, 5 reordered, 0 unassigned
    0 discarded, 0 lost received, 27/255 load
    0x7B received sequence, 0x7A sent sequence
    Member links: 2 (max not set, min not set)
    BRI0:1
    BRI0:2

    Router#sh int d1 | inc load
    reliability 255/255, txload 69/255, rxload 69/255
    reliability 255/255, txload 47/255, rxload 47/255
    reliability 255/255, txload 15/255, rxload 15/255


    *** Second BRI came up without losing a single packet. ***


    Now we reduce the load and see what happens when the second line drops
    out.
    ####################### End #############################



    Router#ping
    Protocol [ip]:
    Target IP address: 146.169.1.10
    Repeat count [5]: 50000
    Datagram size [100]: 0
    % A decimal number between 36 and 5024.
    Datagram size [100]: 36 <-- smallest possible pings ***
    Timeout in seconds [2]:
    Extended commands [n]:
    Sweep range of sizes [n]:
    Type escape sequence to abort.
    Sending 50000, 36-byte ICMP Echos to 146.169.1.10, timeout is 2 seconds:
    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    !!!!!!!!!!!!!!!!!!!!!
    Success rate is 100 percent (1071/1071), round-trip min/avg/max = 52/57/344 ms


    ###############################################################
    # Begin comment and diagnostic block #
    ###############################################################


    Router#sh int d1 | inc load
    reliability 255/255, txload 33/255, rxload 35/255
    reliability 255/255, txload 51/255, rxload 51/255
    reliability 255/255, txload 27/255, rxload 27/255
    Router#sh int d1 | inc load
    reliability 255/255, txload 15/255, rxload 17/255
    reliability 255/255, txload 31/255, rxload 31/255
    reliability 255/255, txload 11/255, rxload 11/255

    Router#sh isdn hist
    --------------------------------------------------------------------------------
    ISDN CALL HISTORY
    --------------------------------------------------------------------------------
    History table has a maximum of 100 entries for disconnected calls.
    History table data is retained for a maximum of 15 Minutes for
    disconnected calls.
    --------------------------------------------------------------------------------
    Call    Calling      Called       Remote    Seconds Seconds Seconds Charges
    Type    Number       Number       Name      Used    Left    Idle    Units/Currency
    --------------------------------------------------------------------------------
    Out                  +1111111111  dial1.th      202 Unavail       0 0
    Out                  +1111111111  dial1.th      120 Unavail       0 0
    --------------------------------------------------------------------------------

    Router#sh int d1 | inc load
    reliability 255/255, txload 9/255, rxload 9/255
    reliability 255/255, txload 15/255, rxload 15/255
    reliability 255/255, txload 11/255, rxload 11/255
    Router#
    00:24:34: %DIALER-6-UNBIND: Interface BR0:2 unbound from profile Di1
    00:24:34: %ISDN-6-DISCONNECT: Interface BRI0:2 disconnected from 01111111111 dial1.th, call lasted 127 seconds
    00:24:146028888068: %LINK-3-UPDOWN: Interface BRI0:2, changed state to down
    00:24:34: %LINEPROTO-5-UPDOWN: Line protocol on Interface BRI0:2, changed state to down
    Router#
    Router#

    ** Finally we wait until the idle timeout trips and the line drops.

    Router#
    Router#
    00:25:18: %DIALER-6-UNBIND: Interface BR0:1 unbound from profile Di1
    00:25:18: %ISDN-6-DISCONNECT: Interface BRI0:1 disconnected from 01111111111 dial1.th, call lasted 254 seconds
    00:25:77309452288: %LINK-3-UPDOWN: Interface BRI0:1, changed state to down
    00:25:19: %LINEPROTO-5-UPDOWN: Line protocol on Interface BRI0:1, changed state to down
    Router#

    *** Second BRI dropped without losing a single packet. ***

    100% perfect. (well 99.9 anyway)
    ####################### End #############################


    Router#sh run
    Building configuration...

    Current configuration:
    !
    version 12.1
    no service pad
    service timestamps debug uptime
    service timestamps log uptime
    no service password-encryption
    !
    hostname Router
    !
    enable secret 5 xxxxxxxxxxxxxxxxxxxxxx
    !
    !
    ip subnet-zero
    !
    isdn switch-type basic-net3
    !
    !
    !
    interface Ethernet0
    ip address 172.1.1.217 255.255.255.0
    !
    interface BRI0
    description connected to Internet
    no ip address
    dialer pool-member 1
    isdn switch-type basic-net3
    !
    interface Dialer1
    ip address negotiated
    encapsulation ppp
    load-interval 30
    dialer pool 1
    dialer idle-timeout 30
    dialer string 01111111111
    dialer hold-queue 10
    dialer load-threshold 30 either
    dialer-group 1
    no cdp enable
    ppp authentication chap callin
    ppp chap hostname xxxxxxxx
    ppp chap password 7 qqqqqqqqqqq
    ppp multilink <-- the VITAL one
    !
    no ip http server
    ip classless
    ip route 0.0.0.0 0.0.0.0 Dialer1
    !
    dialer-list 1 protocol ip permit
    !
    line con 0
    transport input none
    stopbits 1
    line vty 0 4
    password cisco
    login
    !
    no rcapi server
    !
    !
    end

    Router#


    Router#sh ver
    Cisco Internetwork Operating System Software
    IOS (tm) C800 Software (C800-Y6-MW), Version 12.1(3)XG6, EARLY DEPLOYMENT RELEASE SOFTWARE (fc1)
    TAC Support: http://www.cisco.com/tac
    Copyright (c) 1986-2002 by cisco Systems, Inc.
    Compiled Fri 08-Feb-02 14:25 by ealyon
    Image text-base: 0x000EE000, data-base: 0x00656000

    ROM: TinyROM version 1.4(1)
    Router uptime is 39 minutes
    System returned to ROM by power-on
    System image file is "flash:c800-y6-mw.121-3.XG6.bin"

    Cisco C801 (MPC850) processor (revision 1) with 43464K bytes of virtual
    memory.
    Processor board ID JAD063304TD
    CPU part number 0x2101
    Bridging software.
    Basic Rate ISDN software, Version 1.1.
    1 Ethernet/IEEE 802.3 interface(s)
    1 ISDN Basic Rate interface(s)
    4M bytes of physical memory (DRAM)
    8K bytes of non-volatile configuration memory
    8M bytes of flash on board (4M from flash card)

    Configuration register is 0x2102

    Router#
     
    , Feb 5, 2005
    #2

  3. On 5 Feb 2005 11:56:58 -0800, wrote:

    >I am sorry if I seem to have sent this twice however this one is
    >better.


    If there was a previous one, I certainly didn't see it. Thanks for
    trying again!

    >Probably mppp (multilink ppp) is not configured at your
    >provider. Maybe there is a compatibility problem with
    >your configuration of mppp.


    It seems to be working, in that there are no PPP errors shown by the
    router, I get "15K" downloads, and can (almost) listen to 128k music
    streams. When there were problems at the ISP end it was obvious.

    ....

    >Here is an example working config with debugs and
    >everything to prove it.


    Thank you for spending your time on this! I learned several new
    diagnostic commands, at the least.

    >The single lost packet at the beginning is caused by
    >the router not yet knowing the IP address of the ISDN
    >interface. It therefore puts in the address of the
    >Ethernet interface which is of no use at all to the
    >ISP in this case.


    This sounds a bit like one of the mechanisms I think I'm running into
    here. I have the added complication of (port-address) NAT:

    *Feb 28 17:20:07.460 pst: NAT: dialer not up for Dialer1, no
    translation, dial and drop
    *Feb 28 17:20:07.460 pst: Dialer1: ip (s=10.1.1.5, d=199.4.80.1), 62
    bytes, outgoing interesting (list 101)
    *Feb 28 17:20:08.143 pst: %LINK-3-UPDOWN: Interface BRI0:1, changed
    state to up
    *Feb 28 17:20:08.452 pst: NAT: dialer not up for Dialer1, no
    translation, dial and drop
    *Feb 28 17:20:08.456 pst: Dialer1: ip (s=10.1.1.5, d=63.162.241.3), 62
    bytes, outgoing interesting (list 101)
    *Feb 28 17:20:10.261 pst: %LINK-3-UPDOWN: Interface Virtual-Access1,
    changed state to up
    *Feb 28 17:20:10.503 pst: NAT: s=10.1.1.5->66.53.174.14,
    d=63.162.241.3 [49712]

    The dialer used to report such packets as cached, but now it looks
    like NAT gets to them first and drops them before they can be cached.
    I'm not aware of having changed anything to cause this... Or how one
    might fix it.

    >One thing is that I don't understand Virtual interfaces and you
    >seem to be using them. Certainly in 12.1 this is not necessary for
    >basic dial up internet access.


    I can't remember now exactly why I went to that (need to do more
    documentation!), but I remember it solving some horrible problem many
    years ago.

    >** We still lost the first packet though even with dialer hold-queue
    > configured.


    I guess you only saw that in the ping results. I've tried having
    "debug ip packet detail" on, and my lost packets don't show up at all
    there. I guess one needs to understand where that command fits in the
    chain of events. Apparently if the NAT processor drops a packet, the
    "debug ip packet" command never sees it...

    >00:24:34: %LINEPROTO-5-UPDOWN: Line protocol on Interface BRI0:2,

    ....
    >00:25:19: %LINEPROTO-5-UPDOWN: Line protocol on Interface BRI0:1,
    >changed state to down


    So your second channel drops out with less load, leaving the original
    channel up. Mine does that now, but definitely didn't before the
    recent update. Maybe it was an IOS version thing?

    I appreciate seeing your testing technique with the router repeating
    pings. I have real work waiting at the moment, but someday I'll try
    that here. I wonder, though, if my problem involves NAT it may not
    show up when the router itself does the pinging. I probably need to
    have a local net host send the pings.

    Loren
     
    Loren Amelang, Feb 6, 2005
    #3
  4. Loren Amelang

    Guest

    >> Thank you for spending your time on this! I learned several new
    >> diagnostic commands, at the least


    Well, it was time I did a bit of ISDN; I was rusty.

    I didn't use NAT so that is different. At least you got an
    explanation of where the packets went; I didn't.

    > I've tried having
    > "debug ip packet detail" on, and my lost packets don't show up at all
    > there.


    You need to turn off fast switching on the interfaces.
    no ip route-cache
    for debug to see the packets.

    In the case of a dialer I always clear the interface after applying the
    command though I am not sure if that is always essential.
    Clearing the interface of course drops the connection(s).

    You should probably restore whatever form of fast switching
    you had when you are done, although on an ISDN link
    I don't suppose it would matter. I look after some ADSL routers
    and I don't always bother with fast switching there.
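
    Put together, the steps above might look roughly like this. It is a
    sketch only: the interface names (Ethernet0, Dialer1, BRI0) are taken
    from the configs in this thread and are assumptions for any other
    router, and clearing BRI0 is just one way of clearing the interface.
    Note that it drops any calls that are up:

    ```
    Router#configure terminal
    Router(config)#interface Ethernet0
    Router(config-if)#no ip route-cache
    Router(config-if)#interface Dialer1
    Router(config-if)#no ip route-cache
    Router(config-if)#end
    Router#clear interface BRI0
    Router#debug ip packet detail
    Router#undebug all
    Router#configure terminal
    Router(config)#interface Ethernet0
    Router(config-if)#ip route-cache
    Router(config-if)#interface Dialer1
    Router(config-if)#ip route-cache
    Router(config-if)#end
    ```

    The first half turns fast switching off so that "debug ip packet" sees
    every packet; the lines from "undebug all" onward put things back once
    the test is finished.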

    Glad you got it fixed.
     
    , Feb 7, 2005
    #4
