Identical NetWall routes causing network access loss after upgrade or restart

Last modified on 30 Nov, 2022. Revision 13
This article discusses the identical route problem as a possible cause of a lack of network access after a system restart or cOS Core version upgrade.
Up to date for
cOS Core 13.00.10 and up
Status OK
Author
Peter Nilson

Question

My NetWall system has been working fine for weeks/months without any configuration changes but after a system restart (or firmware upgrade) I have lost all Internet access. What has happened?

Answer

A common cause of lost network access is the “Duplicate” or “Identical” route problem. An example of how the routing on an affected firewall might look is the following:

Route Lan 192.168.1.0/24 Metric=100
Route Wan 203.0.113.0/24 Metric=100
Route Wan all-nets Gateway=203.0.113.1 Metric=100
Route DMZ all-nets Metric=100

The problem in the above is that there are two all-nets routes (the route towards the Internet, that is, the default route) with the same metric (100). This means that the firewall cannot determine which route it should use and picks one at random. After a system restart or cOS Core upgrade, it may simply choose the DMZ route and then Internet access is lost. This is a fairly sneaky problem as the firewall can work correctly for months and then access is suddenly lost.
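To illustrate the check described above, here is a minimal Python sketch that scans a route list for entries sharing both network and metric. It assumes the routes are written in the illustrative notation used in this article (which is not necessarily what the cOS Core CLI prints) and that all-nets corresponds to 0.0.0.0/0. The helper names parse_route and find_identical_routes are hypothetical and not part of any Clavister tool.

```python
import ipaddress
from collections import defaultdict

# Routes in the article's illustrative notation (assumed format, not CLI output).
ROUTES = """\
Route Lan 192.168.1.0/24 Metric=100
Route Wan 203.0.113.0/24 Metric=100
Route Wan all-nets Gateway=203.0.113.1 Metric=100
Route DMZ all-nets Metric=100
"""


def parse_route(line):
    """Parse one 'Route <interface> <network> [Key=Value ...]' line."""
    _, interface, network, *options = line.split()
    opts = dict(option.split("=", 1) for option in options)
    # Assumption: 'all-nets' means the IPv4 default route 0.0.0.0/0.
    net = ipaddress.ip_network("0.0.0.0/0" if network == "all-nets" else network)
    return {"interface": interface, "network": net, "metric": int(opts["Metric"])}


def find_identical_routes(text):
    """Return {(network, metric): [interfaces]} for keys shared by 2+ routes."""
    groups = defaultdict(list)
    for line in text.splitlines():
        if line.strip():
            route = parse_route(line)
            key = (str(route["network"]), route["metric"])
            groups[key].append(route["interface"])
    return {key: ifaces for key, ifaces in groups.items() if len(ifaces) > 1}


if __name__ == "__main__":
    for (network, metric), ifaces in find_identical_routes(ROUTES).items():
        print(f"{network} metric {metric}: identical routes on {', '.join(ifaces)}")
```

For the example routing table, the sketch reports the two all-nets routes on Wan and DMZ. As Note-2 below points out, such a finding is only a problem if the duplication is unintentional.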

However, it is not always the all-nets route that causes this problem. It could be a problem on the local network as well, as shown in the routing below:

Route Lan 192.168.1.0/24 Metric=100
Route DMZ 192.168.1.0/24 Metric=100
Route Wan 203.0.113.0/24 Metric=100
Route Wan all-nets Gateway=203.0.113.1 Metric=100

In the above, we have a similar problem but this time between Lan and DMZ. If Lan is where all our users are, access could suddenly be lost for all users on the Lan interface after a system restart or upgrade. The logs in this scenario would then contain a lot of “Default_Access_Rule” log entries for all Lan users.

The solution to this problem is fairly easy. Do one of the following: lower the metric on the route you want to use as primary, remove the duplicate route, or change the network on the duplicate route to something else.
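The effect of the first fix can be sketched in Python using the Lan/DMZ example. This is a simplified model and not cOS Core's actual lookup code: it assumes the most specific matching route wins and that the lowest metric breaks a tie between equally specific routes. The function name candidates and the metric value 90 are illustrative choices only.

```python
import ipaddress


def candidates(routes, destination):
    """Return the best-ranked routes for destination; more than one means a tie."""
    dest = ipaddress.ip_address(destination)
    matching = [r for r in routes if dest in ipaddress.ip_network(r["network"])]
    if not matching:
        return []

    # Assumed ranking: longest prefix first, then lowest metric.
    def rank(route):
        return (-ipaddress.ip_network(route["network"]).prefixlen, route["metric"])

    best = min(rank(r) for r in matching)
    return [r for r in matching if rank(r) == best]


routes = [
    {"interface": "Lan", "network": "192.168.1.0/24", "metric": 100},
    {"interface": "DMZ", "network": "192.168.1.0/24", "metric": 100},
    {"interface": "Wan", "network": "203.0.113.0/24", "metric": 100},
    {"interface": "Wan", "network": "0.0.0.0/0", "metric": 100},  # all-nets
]

# Before the fix: two equally ranked routes, so the choice is ambiguous.
before = [r["interface"] for r in candidates(routes, "192.168.1.10")]

# Fix: lower the metric on the route that should be primary (Lan).
routes[0]["metric"] = 90
after = [r["interface"] for r in candidates(routes, "192.168.1.10")]

print("before:", before)
print("after:", after)
```

In this model, the lookup for a Lan host returns both Lan and DMZ before the change and only Lan afterwards, so the outcome no longer depends on the order in which the routing table was populated.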

Note-1: A restart or upgrade is not the only thing that can trigger this problem. A configuration change that causes the routing table to be updated could also trigger it, because cOS Core needs to repopulate the routing table.

Note-2: “Duplicate routes” are not necessarily a problem. They can be a valid configuration if, for instance, Route Load Balancing is used and you want an equal load distribution.

Note-3: Information about the “Default_Access_Rule” log entry can be found in the article “The meaning of the Default_Access_Rule log entry”.



Related articles

Setup of a Layer-3 bridge over IPsec in cOS Core
12 Apr, 2023 core proxyarp arp ipsec routing
cOS Core IKEv2 split tunneling with Windows and local user database.
28 Mar, 2023 ikev2 windows vpn routing splittunneling
Problem with auto-created Core routes
22 Mar, 2021 core ipsec routing
Setting up OSPF with IPsec in cOS Core
16 Apr, 2024 core routing ospf ipsec
Using /31 network masks in cOS Core (RFC-3021)
1 Jun, 2022 core routing management
The meaning of the Default_Access_Rule log entry
7 Nov, 2022 core arp log routing
Troubleshooting cOS Core rules/routes with ping simulation
17 Mar, 2023 core routing rules ping icmp cli
Is Statless (FwdFast) faster than a normal IP policy?
27 Jan, 2021 core stateless routing brokenlink
Route failover with IPsec tunnels in cOS Core
13 Feb, 2023 ipsec core routing failover
Public network transparency using cOS Core Proxy ARP instead of subnetting
18 Apr, 2023 core routing transparentmode proxyarp