ghostwire.me: The Curious Case Of OSPF NSSA LSA on GRE Tunnel

Today I ran into a case that is really easy once you know exactly what to look for, but seems puzzling at first glance.

This is what our topology looks like:

We have two routers connecting two locations, with OSPF configured on each. The primary connection is a GRE tunnel, while the backup connection is just a direct link (say we have dark fiber or an L2VPN between these locations). OSPF neighbor relations are established on both links.

The nodes exchange routes and everything works just fine... until the primary connection goes down. Then some of the routes R2 sends to R1 show up on R1 as reachable via the backup link, and some are still shown as reachable via the tunnel interface, even though the OSPF neighbor on that link is already considered dead.

Sounds intriguing?

Here's what I haven't told you so far: area 1 is configured as an NSSA on both R1 and R2. Let's take a closer look at the configuration:

R1:

interface Loopback0
 ip address 1.1.1.1 255.255.255.255
 ip ospf 1 area 1

interface Tunnel0
 ip address 192.168.0.1 255.255.255.0
 ip ospf 1 area 1
 ip ospf cost 1
 tunnel source 10.0.100.1
 tunnel destination 10.0.200.2

interface Ethernet0/0
 ip address 10.0.100.1 255.255.255.0

interface Ethernet0/1
 ip address 172.16.0.1 255.255.255.0
 ip ospf 1 area 1
 ip ospf cost 200

interface Ethernet0/2
 no ip address
 shutdown

interface Ethernet0/3
 no ip address
 shutdown

router ospf 1
 area 1 nssa

ip route 10.0.200.0 255.255.255.0 10.0.100.2

R2:

interface Loopback1
 ip address 3.3.3.3 255.255.255.255

interface Tunnel0
 ip address 192.168.0.2 255.255.255.0
 ip ospf 1 area 1
 ip ospf cost 1
 tunnel source 10.0.200.2
 tunnel destination 10.0.100.1

interface Ethernet0/0
 ip address 10.0.200.2 255.255.255.0

interface Ethernet0/1
 ip address 172.16.0.2 255.255.255.0
 ip ospf 1 area 1
 ip ospf cost 100

interface Ethernet0/2
 ip address 192.168.100.2 255.255.255.0

interface Ethernet0/3
 ip address 192.168.200.2 255.255.255.0
 ip ospf 1 area 1

router ospf 1
 area 1 nssa
 redistribute connected subnets route-map REDIS_LOOP
 passive-interface Ethernet0/2

ip route 10.0.100.0 255.255.255.0 10.0.200.1

route-map REDIS_LOOP permit 1
 match ip address 1

access-list 1 permit 192.168.100.0
access-list 1 deny any

Let's check that our neighbor relations are established and see what we have in R1's routing table while the primary connection is still up:

R1#sh ip os nei

Neighbor ID     Pri   State           Dead Time   Address         Interface
172.16.0.2        0   FULL/  -        00:00:32    192.168.0.2     Tunnel0
172.16.0.2        1   FULL/BDR        00:00:33    172.16.0.2      Ethernet0/1

R1#sh ip route ospf

Codes: L - local, C - connected, S - static, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2
       i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
       ia - IS-IS inter area, * - candidate default, U - per-user static route
       o - ODR, P - periodic downloaded static route, H - NHRP, l - LISP
       a - application route
       + - replicated route, % - next hop override

Gateway of last resort is not set

O N2  192.168.100.0/24 [110/20] via 192.168.0.2, 00:07:05, Tunnel0
O     192.168.200.0/24 [110/11] via 192.168.0.2, 00:05:32, Tunnel0

Seems legit. Now let's break something in the cloud so the primary link fails and see how R1 reacts:

R1#sh ip os nei

Neighbor ID     Pri   State           Dead Time   Address         Interface
172.16.0.2        1   FULL/BDR        00:00:38    172.16.0.2      Ethernet0/1

R1#sh ip route os

Codes: L - local, C - connected, S - static, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2
       i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
       ia - IS-IS inter area, * - candidate default, U - per-user static route
       o - ODR, P - periodic downloaded static route, H - NHRP, l - LISP
       a - application route
       + - replicated route, % - next hop override

Gateway of last resort is not set

O N2  192.168.100.0/24 [110/20] via 192.168.0.2, 00:01:10, Tunnel0
O     192.168.200.0/24 [110/210] via 172.16.0.2, 00:00:15, Ethernet0/1

Well, ain't it cool, really? Now one of our routes is available via the redundant path, but the other one is still seen via the tunnel interface. You have probably already noticed that the route still seen via the tunnel interface is an NSSA external route. And that's exactly where the problem lies. Let's take a closer look:

R1#sh ip os data nssa

            OSPF Router with ID (1.1.1.1) (Process ID 1)

                Type-7 AS External Link States (Area 1)

  Routing Bit Set on this LSA in topology Base with MTID 0
  LS age: 400
  Options: (No TOS-capability, Type 7/5 translation, DC, Upward)
  LS Type: AS External Link
  Link State ID: 192.168.100.0 (External Network Number )
  Advertising Router: 172.16.0.2
  LS Seq Number: 80000007
  Checksum: 0xF2A5
  Length: 36
  Network Mask: /24
        Metric Type: 2 (Larger than any link state path)
        MTID: 0
        Metric: 20
        Forward Address: 192.168.0.2
        External Route Tag: 0

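Note the Forward Address field. Yes, that's the IP address of R2's Tunnel0 interface (192.168.0.2, not to be confused with the tunnel source 10.0.200.2), and the tunnel interface is still considered up, because we have no keepalives configured and the failure was indirect. Since R1's own Tunnel0 is still up as well, the 192.168.0.0/24 subnet remains reachable through it, so R1 keeps resolving the forwarding address, and with it the NSSA external route, via Tunnel0.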
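If you want to confirm this on R1 yourself, the checks could look roughly like the sketch below. These are standard IOS show commands, but the output is deliberately omitted because it was not captured in this lab; the expectation (an assumption based on the behavior described above) is that Tunnel0 still reports line protocol up and that 192.168.0.2 still resolves through the 192.168.0.0/24 subnet on Tunnel0.

```
R1#show interfaces Tunnel0 | include line protocol
R1#show ip route 192.168.0.2
R1#show ip ospf database nssa-external 192.168.100.0
```

The last command narrows the Type-7 database view down to the single LSA in question, which makes the Forward Address easier to spot.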

But how is this address chosen? This is what Cisco says about it:

Forwarding address is selected on ASBR using the following rules:

If there is a loopback configured in the area then IP address of loopback is selected as forwarding address.

If first condition is not met then IP address of first interface on the OSPF interface list is selected as forwarding address. You can see OSPF interface list by using "show ip ospf interface brief" command. The interface on top will be the last interface which was attached to OSPF.

Well, we do not have a loopback configured in the area (R2's Loopback1 exists but is not enabled for OSPF). And Tunnel0 is indeed the top interface:

R2#sh ip os int b

Interface    PID   Area            IP Address/Mask    Cost  State Nbrs F/C
Tu0          1     1               192.168.0.2/24     1     P2P   0/0
Et0/3        1     1               192.168.200.2/24   10    DR    0/0
Et0/1        1     1               172.16.0.2/24      100   BDR   1/1

And how exactly is the "top" interface chosen? Well, it's just... the last interface you enabled OSPF on. So it's just a matter of luck.

The workaround is, of course, to simply configure OSPF on a loopback interface: it will then always be chosen as the forwarding address. You can also configure keepalives on the tunnel interface, so that a new Forward Address is chosen as soon as the tunnel interface fails. However, I encountered this problem on a non-Cisco device which does not support keepalives and considers the tunnel to be always up (even when there is a direct physical failure).
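On the Cisco lab routers from this post, the two workarounds might look like the sketch below. It is not taken from the original lab: reusing R2's existing Loopback1 and the keepalive timers (5 seconds, 3 retries) are illustrative assumptions, so adjust them to your environment.

```
! R2, workaround 1: enable OSPF on the existing loopback so that it
! becomes the candidate for the Type-7 forwarding address
interface Loopback1
 ip ospf 1 area 1
!
! R2, workaround 2 (optional): GRE keepalives, so that Tunnel0 line
! protocol goes down when the far end stops answering
! (5-second interval, 3 retries are assumed values)
interface Tunnel0
 keepalive 5 3
```

With the loopback in area 1, the Forward Address shown on R1 by "show ip ospf database nssa-external" should become the loopback address (3.3.3.3 in this lab) instead of 192.168.0.2, which is worth verifying on your own platform. Note that 3.3.3.3/32 also becomes an intra-area route in area 1, and that R1 would typically get the same keepalive setting on its Tunnel0 so both ends detect the failure.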


Thursday, August 31, 2017
