Summary
This article covers basic methodology for troubleshooting dynamic routing. This includes learning routes from peers, and advertising routes to them through Route Health Injection (RHI). This document assumes the underlying configuration is already completed on all participating devices.
Background
In this example, 10.198.4.100 is the NetScaler IP address (NSIP), 10.198.4.201 is the IP address of an RHI-enabled VIP, and 10.198.4.2 is the upstream router. 10.198.4.253 is the host in a fictitious static route, and 10.198.4.3 is its fictitious gateway.
The upstream router advertises a default route to the NetScaler, along with a static route for 10.198.4.253/32. The NetScaler advertises an RHI VIP to the router. This configuration is used for demonstration purposes only and is not necessarily practical.
Processes to be concerned with for dynamic routing
root    261  0.0  0.1  3748   980  ??  Ss   Tue12PM   0:11.33 /netscaler/imi -d -f /nsconfig/ZebOS.conf
root    279  0.0  0.1  3624   952  ??  S    Tue12PM   0:08.35 /netscaler/ospfd
root    286  0.0  0.1  3492   944  ??  S    Tue12PM   0:08.32 /netscaler/ospf6d
root    288  0.0  0.1  3284   948  ??  S    Tue12PM   0:08.46 /netscaler/ripd
root    295  0.0  0.1  3764   944  ??  S    Tue12PM   0:08.17 /netscaler/bgpd
In this case, imi is the parent process that reads ZebOS.conf, and each routing protocol has its own daemon (ospfd, ospf6d, ripd, bgpd).
For example, if you are using Border Gateway Protocol (BGP), make sure that both imi and bgpd are running.
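The check described above can be sketched in Python. This is a minimal sketch, assuming the `ps` output has already been captured as text; the function name `missing_daemons` and the sample text are illustrative and not part of any NetScaler tooling.

```python
def missing_daemons(ps_output, required=("imi", "bgpd")):
    """Return the required daemons that do not appear in captured ps output."""
    running = set()
    for line in ps_output.splitlines():
        for token in line.split():
            # The command column is the first token under /netscaler/.
            if token.startswith("/netscaler/"):
                running.add(token.rsplit("/", 1)[-1])
                break
    return [name for name in required if name not in running]


# Hypothetical capture, trimmed from the process listing in this article.
sample = """\
root 261 0.0 0.1 3748 980 ?? Ss Tue12PM 0:11.33 /netscaler/imi -d -f /nsconfig/ZebOS.conf
root 295 0.0 0.1 3764 944 ?? S Tue12PM 0:08.17 /netscaler/bgpd
"""

print(missing_daemons(sample))  # [] means both imi and bgpd were found
```

An empty list means every required daemon was found; pass a different `required` tuple for OSPF or RIP.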
1. Log messages from ZebOS/routing daemons.
a. If imi terminates, you see a message like this in /var/log/messages:
Mar 5 09:41:58 10.198.4.100 03/05/2009:15:41:58 GMT tlns02 : PITBOSS Message 508 : "New pid (46050) for (/netscaler/imi imid -f /nsconfig/ZebOS.conf) restarts (1)"
b. If the individual daemons fail, you see this in /var/log/messages:
Mar 5 09:42:53 10.198.4.100 03/05/2009:15:42:53 GMT tlns02 : PITBOSS Message 512 : "New pid (46118) for (/netscaler/bgpd bgpd) restarts (1)"
The above message is normally indicative of a problem.
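To spot these restarts without reading the log by eye, something like the following may help. It is a rough sketch that assumes the PITBOSS message layout shown in the two examples above; the pattern and the function name `daemon_restarts` are assumptions made for illustration, and other releases may format the message differently.

```python
import re

# Pattern inferred from the two sample messages in this article (an assumption).
RESTART = re.compile(
    r'PITBOSS Message \d+ : "New pid \((\d+)\) for \((\S+)[^)]*\) restarts \((\d+)\)"'
)


def daemon_restarts(log_text):
    """Return (daemon, new_pid, restart_count) for each PITBOSS restart line."""
    events = []
    for line in log_text.splitlines():
        match = RESTART.search(line)
        if match:
            pid, path, count = match.groups()
            events.append((path.rsplit("/", 1)[-1], int(pid), int(count)))
    return events


sample_log = (
    'Mar 5 09:41:58 10.198.4.100 03/05/2009:15:41:58 GMT tlns02 : PITBOSS Message 508 : '
    '"New pid (46050) for (/netscaler/imi imid -f /nsconfig/ZebOS.conf) restarts (1)"\n'
    'Mar 5 09:42:53 10.198.4.100 03/05/2009:15:42:53 GMT tlns02 : PITBOSS Message 512 : '
    '"New pid (46118) for (/netscaler/bgpd bgpd) restarts (1)"\n'
)

for daemon, pid, count in daemon_restarts(sample_log):
    print(daemon, pid, count)
```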
2. Other log events
c. When a connection to a peer fails to establish or is lost, you see a message similar to this in /var/log/messages:
Mar 5 10:06:36 tlns02 BGP[46118]: BGP: [SOCK CB] sock_getname() failed (54:Connection reset by peer), FD(9)
Depending on the actual failure reason, the “Connection reset by peer” message might change to reflect the nature of the failure.
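Because the failure reason varies, it can be useful to tally the reasons seen across the log. The sketch below assumes the "[SOCK CB] ... failed (errno:reason)" layout of the sample line above; the pattern and the function name `bgp_failure_reasons` are illustrative assumptions, not a documented log format.

```python
import re
from collections import Counter

# Pattern inferred from the single sample line in this article (an assumption).
SOCK_FAILURE = re.compile(r"BGP: \[SOCK CB\] \w+\(\) failed \((\d+):([^)]*)\)")


def bgp_failure_reasons(log_text):
    """Count BGP socket failures in the log text, keyed by reason string."""
    reasons = Counter()
    for line in log_text.splitlines():
        match = SOCK_FAILURE.search(line)
        if match:
            reasons[match.group(2)] += 1
    return reasons


sample_log = (
    "Mar 5 10:06:36 tlns02 BGP[46118]: BGP: [SOCK CB] sock_getname() failed "
    "(54:Connection reset by peer), FD(9)\n"
)

print(bgp_failure_reasons(sample_log))
```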
3. Verifying peers and LEARNED routes
The first place to verify the routes/peers is in ZebOS. You do this because ZebOS handles peering and receives the routes first. The NetScaler kernel then learns them from ZebOS. Use the appropriate commands for the routing protocol in use to show the routes/peers from ZebOS.
In this example, you are using BGP. Note the target for each command in brackets. To enter VTYSH from the CLI, issue the vtysh command; from VTYSH, issue the exit command to return to the CLI.
d. Verify peer is up [VTYSH]:
ns#show ip bgp summary
BGP router identifier 10.198.4.100, local AS number 333
BGP table version is 2
1 BGP AS-PATH entries
0 BGP community entries
Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
10.198.4.2      4   333      12       7        2    0    0 00:02:13        1
Total number of neighbors 1
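In Cisco-style summary output, the last column shows a prefix count once the session is established and a state name (such as Idle or Active) otherwise. The following sketch applies that rule to captured output. It assumes the ten-column layout shown above; the function name `neighbor_states` is hypothetical.

```python
import ipaddress


def _is_ip(token):
    """True if the token parses as an IP address."""
    try:
        ipaddress.ip_address(token)
        return True
    except ValueError:
        return False


def neighbor_states(summary_output):
    """Map each neighbor to ("Established", prefixes_received) or (state, None)."""
    states = {}
    for line in summary_output.splitlines():
        fields = line.split()
        # Neighbor rows start with an address and carry ten columns.
        if len(fields) >= 10 and _is_ip(fields[0]):
            last = fields[-1]
            if last.isdigit():
                states[fields[0]] = ("Established", int(last))
            else:
                states[fields[0]] = (last, None)
    return states


sample = """\
BGP router identifier 10.198.4.100, local AS number 333
Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
10.198.4.2      4   333      12       7        2    0    0 00:02:13        1
Total number of neighbors 1
"""

print(neighbor_states(sample))
```

For the sample above, the peer 10.198.4.2 is reported as established with one prefix received.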
e. Verify you are learning the advertised routes [VTYSH]:
ns02#show ip route
Codes: K - kernel, C - connected, S - static, R - RIP, B - BGP
       O - OSPF, IA - OSPF inter area
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2
       i - IS-IS, L1 - IS-IS level-1, L2 - IS-IS level-2
       ia - IS-IS inter area, I - Intranet
       * - candidate default
Gateway of last resort is 10.198.4.1 to network 0.0.0.0
B*      0.0.0.0/0 [200/0] via 10.198.4.1, vlan0, 00:09:44
C       10.198.4.0/24 is directly connected, vlan0
B       10.198.4.253/32 [200/0] via 10.198.4.3, vlan0, 00:02:00
C       127.0.0.0/8 is directly connected, lo0
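To pull only the BGP-learned routes out of this output, a small parser along the following lines could be used. It is a sketch that assumes the single-letter code layout shown above and ignores two-token codes such as "O IA"; the function name `routes_by_code` is made up for this example.

```python
import re

# Matches lines such as "B* 0.0.0.0/0 [200/0] via 10.198.4.1, vlan0, 00:09:44".
ROUTE = re.compile(
    r"^(\S+)\s+(\d+\.\d+\.\d+\.\d+/\d+)\s+(?:\[\d+/\d+\]\s+)?via\s+(\d+\.\d+\.\d+\.\d+),"
)


def routes_by_code(route_output, code="B"):
    """Return {prefix: next_hop} for routes whose code letter matches."""
    found = {}
    for line in route_output.splitlines():
        match = ROUTE.match(line.strip())
        # Strip the candidate-default marker so "B*" compares equal to "B".
        if match and match.group(1).rstrip("*") == code:
            found[match.group(2)] = match.group(3)
    return found


sample = """\
Gateway of last resort is 10.198.4.1 to network 0.0.0.0
B*      0.0.0.0/0 [200/0] via 10.198.4.1, vlan0, 00:09:44
C       10.198.4.0/24 is directly connected, vlan0
B       10.198.4.253/32 [200/0] via 10.198.4.3, vlan0, 00:02:00
C       127.0.0.0/8 is directly connected, lo0
"""

print(routes_by_code(sample))
```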
Once you know that the routes are being learned in ZebOS and peers are up, the next step is to ensure they are making it to the NetScaler kernel where they will be used.
f. Verify routes are making it to the NetScaler kernel [NSCLI]:
> show route
        Network          Netmask          Gateway/OwnedIP  State  Type
        -------          -------          ---------------  -----  ----
1)      0.0.0.0          0.0.0.0          10.198.4.1       UP     BGP
2)      127.0.0.0        255.0.0.0        127.0.0.1        UP     PERMANENT
3)      10.198.4.0       255.255.255.0    10.198.4.100     UP     DIRECT
4)      10.198.5.0       255.255.255.0    10.198.5.1       UP     DIRECT
5)      10.198.4.253     255.255.255.255  10.198.4.3       UP     BGP
Note: In this case, you are learning a DEFAULT ROUTE, in addition to one STATIC ROUTE. Learning a default route requires an NSAPIMGR option to be set. Consult CTX119203 – Citrix NetScaler Networking Guide - Release 9.0 for more information.
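The comparison between what ZebOS learned and what reached the kernel can be sketched as follows. The kernel table lists a network and netmask, while ZebOS lists prefixes, so the sketch normalizes both to prefix notation with the standard `ipaddress` module. The function names and the hard-coded ZebOS dictionary are assumptions made for this example.

```python
import ipaddress


def kernel_routes(show_route_output, route_type="BGP"):
    """Return {prefix: gateway} for kernel routes of the given type."""
    routes = {}
    for line in show_route_output.splitlines():
        fields = line.split()
        # Data rows look like: "1) <network> <netmask> <gateway> <state> <type>".
        if len(fields) == 6 and fields[0].endswith(")") and fields[5] == route_type:
            network = ipaddress.ip_network(f"{fields[1]}/{fields[2]}")
            routes[str(network)] = fields[3]
    return routes


def missing_from_kernel(zebos_routes, kernel):
    """Prefixes ZebOS has learned that the kernel table does not contain."""
    return sorted(prefix for prefix in zebos_routes if prefix not in kernel)


# BGP routes as they appear in the ZebOS table in this article.
zebos_bgp = {"0.0.0.0/0": "10.198.4.1", "10.198.4.253/32": "10.198.4.3"}

sample = """\
1)      0.0.0.0          0.0.0.0          10.198.4.1       UP     BGP
2)      127.0.0.0        255.0.0.0        127.0.0.1        UP     PERMANENT
3)      10.198.4.0       255.255.255.0    10.198.4.100     UP     DIRECT
5)      10.198.4.253     255.255.255.255  10.198.4.3       UP     BGP
"""

print(missing_from_kernel(zebos_bgp, kernel_routes(sample)))  # [] when all routes arrived
```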
4. Verifying ADVERTISED routes
For checking advertised routes, instead of working from ZebOS in, you work from NSCLI out.
a. Verify VIP is configured for RHI [NSCLI]:
> show runningConfig | grep 10.198.4.201
add ns ip 10.198.4.201 255.255.255.255 -type VIP -snmp DISABLED -hostRoute ENABLED -hostRtGw 10.198.4.101
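When many VIPs are involved, the same check can be scripted against saved configuration text. This is a minimal sketch that looks only at the -hostRoute option visible in the line above; the function names are illustrative, and the simple whitespace split assumes no quoted option values.

```python
def ns_ip_options(config_line):
    """Parse the "-option value" pairs of an "add ns ip" line into a dict."""
    tokens = config_line.split()
    options = {}
    for index, token in enumerate(tokens[:-1]):
        if token.startswith("-"):
            options[token[1:]] = tokens[index + 1]
    return options


def host_route_enabled(config_line):
    """True if the line sets -hostRoute ENABLED."""
    return ns_ip_options(config_line).get("hostRoute") == "ENABLED"


line = (
    "add ns ip 10.198.4.201 255.255.255.255 -type VIP -snmp DISABLED "
    "-hostRoute ENABLED -hostRtGw 10.198.4.101"
)

print(host_route_enabled(line))
print(ns_ip_options(line)["hostRtGw"])
```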
b. Verify host route is being passed to ZebOS [VTYSH]:
ns02#show ip route
Codes: K - kernel, C - connected, S - static, R - RIP, B - BGP
       O - OSPF, IA - OSPF inter area
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2
       i - IS-IS, L1 - IS-IS level-1, L2 - IS-IS level-2
       ia - IS-IS inter area, I - Intranet
       * - candidate default
Gateway of last resort is 10.198.4.1 to network 0.0.0.0
B*      0.0.0.0/0 [200/0] via 10.198.4.1, vlan0, 00:48:19
C       10.198.4.0/24 is directly connected, vlan0
K       10.198.4.201/32 via 10.198.4.101, vlan0
B       10.198.4.253/32 [200/0] via 10.198.4.3, vlan0, 00:40:35
C       127.0.0.0/8 is directly connected, lo0
c. Verify ZebOS is actually advertising this route [VTYSH]:
tlns02#show ip bgp neighbors 10.198.4.2 advertised-routes
BGP table version is 18, local router ID is 10.198.4.100
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete
   Network            Next Hop         Metric LocPrf Weight Path
*>i10.198.4.0/24      10.198.4.100               100  32768 i
*>i10.198.4.201/32    10.198.4.101               100  32768 ?
Total number of prefixes 2
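A quick way to confirm that the RHI VIP appears among the advertised prefixes is to parse this output. The sketch below assumes every network is printed with an explicit prefix length, as in the output above; the function name `advertised_prefixes` is hypothetical.

```python
import re

# Leading status flags (*, >, s, d, h, i) followed by prefix and next hop.
ADVERTISED = re.compile(
    r"^[*>sdhi]*\s*(\d+\.\d+\.\d+\.\d+/\d+)\s+(\d+\.\d+\.\d+\.\d+)"
)


def advertised_prefixes(output):
    """Return {prefix: next_hop} from "advertised-routes" output."""
    prefixes = {}
    for line in output.splitlines():
        match = ADVERTISED.match(line.strip())
        if match:
            prefixes[match.group(1)] = match.group(2)
    return prefixes


sample = """\
BGP table version is 18, local router ID is 10.198.4.100
   Network            Next Hop         Metric LocPrf Weight Path
*>i10.198.4.0/24      10.198.4.100               100  32768 i
*>i10.198.4.201/32    10.198.4.101               100  32768 ?
Total number of prefixes 2
"""

vip = "10.198.4.201/32"
print(vip in advertised_prefixes(sample))
```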
The steps to verify the upstream or downstream BGP peer are similar to those above, assuming the peer runs a Cisco-style operating system and excluding the NSCLI commands. Ideally, confirm that the upstream peer is receiving the routes you have verified you are advertising, and that it is advertising the routes you intend to learn.
More Information
CTX119203 – Citrix NetScaler Networking Guide - Release 9.0 covers dynamic routing configuration and some common issues.