Yet again, working on some backup links ...
I have a three-router setup with two multipoint GRE tunnels (with NHRP) which
as such work quite well - at least while they are not used (apart from
OSPF routing updates or direct pings/transfer tests). All routers are
3640s running the same IOS, 12.2(13)T9. I have a /28 defined for the two
tunnels, one of them being a 1500/200 ADSL, the second a 2300 SDSL
connection through another provider. Both connections use
PPPoE on a FastEthernet port (though running at 10mbit half duplex).
The configuration was done using a Cisco example minus the (3)DES
encryption.
Both spoke routers are configured identically, except for the local
tunnel IP address.
The first router (the one with the ADSL connection) sets up the backup
just fine - if the primary link (2Meg digital) goes down, there is a
short transition of about 1-2 seconds, then the backup link is used
without noticeable delays or performance problems (the site mainly
uses downlink, and the bandwidth used is usually in about the same range
as, or slightly higher than, the available ADSL bandwidth). So far, so good.
The problem is the second link - the site has higher bandwidth use
(usually running on a 155meg line), but at the time of the tests, active
usage was less than 2mbit in both directions. Before shutting down the
main line, ping values through the tunnel are in the range of 8-12ms.
When shutting down the line, a parallel continuous ping to a remote site
loses some 5-10 packets (considerably more than on the other line), then
quickly jumps to 4sec delays:
64 bytes from heise.de (193.99.144.71): icmp_seq=81 ttl=247 time=11.5 ms
64 bytes from heise.de (193.99.144.71): icmp_seq=82 ttl=247 time=10.8 ms
64 bytes from heise.de (193.99.144.71): icmp_seq=83 ttl=247 time=11.0 ms
<"shutdown" on POS line of second router>
64 bytes from heise.de (193.99.144.71): icmp_seq=89 ttl=246 time=44.5 ms
64 bytes from heise.de (193.99.144.71): icmp_seq=90 ttl=247 time=160 ms
64 bytes from heise.de (193.99.144.71): icmp_seq=92 ttl=247 time=922 ms
64 bytes from heise.de (193.99.144.71): icmp_seq=93 ttl=247 time=2568 ms
64 bytes from heise.de (193.99.144.71): icmp_seq=94 ttl=247 time=3402 ms
64 bytes from heise.de (193.99.144.71): icmp_seq=95 ttl=247 time=4367 ms
64 bytes from heise.de (193.99.144.71): icmp_seq=96 ttl=247 time=4300 ms
(this delay is kept up - with occasional packet loss of 25% or so,
latency will stay in the range of 3900-4400ms)
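
To put numbers on the excerpt above, here is a minimal Python sketch that parses the ping lines, lists the sequence numbers that never came back, and reports the worst round-trip time. The regular expression and the `summarize` helper are illustrative only, and the input is just the short excerpt quoted in this post, not the full ping run.

```python
import re

# Ping excerpt copied from the post (the shutdown marker line is left out).
PING_OUTPUT = """\
64 bytes from heise.de (193.99.144.71): icmp_seq=81 ttl=247 time=11.5 ms
64 bytes from heise.de (193.99.144.71): icmp_seq=82 ttl=247 time=10.8 ms
64 bytes from heise.de (193.99.144.71): icmp_seq=83 ttl=247 time=11.0 ms
64 bytes from heise.de (193.99.144.71): icmp_seq=89 ttl=246 time=44.5 ms
64 bytes from heise.de (193.99.144.71): icmp_seq=90 ttl=247 time=160 ms
64 bytes from heise.de (193.99.144.71): icmp_seq=92 ttl=247 time=922 ms
64 bytes from heise.de (193.99.144.71): icmp_seq=93 ttl=247 time=2568 ms
64 bytes from heise.de (193.99.144.71): icmp_seq=94 ttl=247 time=3402 ms
64 bytes from heise.de (193.99.144.71): icmp_seq=95 ttl=247 time=4367 ms
64 bytes from heise.de (193.99.144.71): icmp_seq=96 ttl=247 time=4300 ms
"""

# Matches the sequence number and round-trip time of one reply line.
LINE_RE = re.compile(r"icmp_seq=(\d+) ttl=\d+ time=([\d.]+) ms")


def summarize(text):
    """Return (received count, missing sequence numbers, list of RTTs in ms)."""
    replies = [(int(m.group(1)), float(m.group(2))) for m in LINE_RE.finditer(text)]
    seqs = [seq for seq, _ in replies]
    # Every sequence number between the first and last reply should be present.
    expected = set(range(min(seqs), max(seqs) + 1))
    missing = sorted(expected - set(seqs))
    rtts = [rtt for _, rtt in replies]
    return len(replies), missing, rtts


received, missing, rtts = summarize(PING_OUTPUT)
print(f"received={received} missing={missing} max_rtt={max(rtts)} ms")
```

Running the same helper over the complete ping log would give the loss and latency figures for the whole test rather than for these few lines.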
At the same time, the freshly reset counters on the tunnel interface show a
usage of only ~200kbit in both directions; cricket graphs on the dialer
and the tunnel confirm this bandwidth usage.
I have confirmed the available bandwidth on the vanilla SDSL line to be
over 2mbit, so I keep getting less than 10% of the actual line capacity. I
had suspected a possible problem with the line, but the audit program
confirmed correct operation at line level.
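
As a rough check of the "less than 10%" figure, the short Python sketch below divides the ~200kbit seen on the tunnel by the line rate. The two line rates used are assumptions taken from the numbers in this post: the nominal 2300 SDSL rate and the "over 2mbit" measured on the plain line (2000 is used as its lower bound). The function name is made up for illustration.

```python
def utilization_percent(observed_kbit, line_kbit):
    """Share of the line rate actually carried, in percent."""
    return 100.0 * observed_kbit / line_kbit


# ~200 kbit/s observed on the tunnel, per the interface counters.
observed = 200

# Assumed line rates: nominal SDSL rate and lower bound of the measured rate.
vs_nominal = utilization_percent(observed, 2300)
vs_measured_floor = utilization_percent(observed, 2000)

print(f"vs nominal 2300 kbit/s: {vs_nominal:.1f}%")
print(f"vs measured >2000 kbit/s: at most {vs_measured_floor:.1f}%")
```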
I'm at a total loss right now as to where else to look - does
anybody have an idea what the reason could be, or what other things I
should check?
Thanks in advance,
-garry