Identical NetWall routes causing network access loss after upgrade or restart

Last modified on 28 Jun, 2021. Revision 10
This article discusses the identical route problem as a possible cause of a lack of network access after a system restart or cOS Core version upgrade.
Up to date for
cOS Core 13.00.10 and up
Status OK
Author
Peter Nilson

Question:

My NetWall system has been working fine for weeks/months without any configuration changes but after a system restart (or firmware upgrade) I have lost all Internet access. What has happened?

Answer:

A common cause of lost network access is the “Duplicate” or “Identical” route problem. The following is an example of how the routing on an affected firewall might look:

Route Lan 192.168.1.0/24 Metric=100
Route Wan 203.0.113.0/24 Metric=100
Route Wan all-nets Gateway=203.0.113.1 Metric=100
Route DMZ all-nets Metric=100

The problem in the above is that there are two all-nets routes (the route towards the Internet, also known as the default route) with the same metric (100). This means that the firewall cannot determine which route it should use and picks one at random. After a system restart or cOS Core upgrade, it may simply choose the DMZ route, and Internet access is then lost. This is a fairly sneaky problem because the firewall can work correctly for months before access is suddenly lost.

However, it is not always the all-nets route that causes this problem. It can also occur on the local network, as shown in the routing below:

Route Lan 192.168.1.0/24 Metric=100
Route DMZ 192.168.1.0/24 Metric=100
Route Wan 203.0.113.0/24 Metric=100
Route Wan all-nets Gateway=203.0.113.1 Metric=100

In the above, we have a similar problem but this time between Lan and DMZ. If Lan is where all our users are, access could suddenly be lost for all users on the Lan interface after a system restart or upgrade. The logs in this scenario would then contain many “Default_Access_Rule” log entries for all Lan users.
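Both routing tables above can be checked for identical routes with a short script. The following is a minimal Python sketch, not a cOS Core tool: the tuple layout (interface, network, metric), the table names and the function name are assumptions made for illustration, and all-nets is written as 0.0.0.0/0.

```python
from collections import defaultdict
from ipaddress import ip_network

# Hypothetical representation of the two routing tables shown in this article.
# Each route is (interface, network, metric); all-nets is written as 0.0.0.0/0.
FIRST_TABLE = [
    ("Lan", "192.168.1.0/24", 100),
    ("Wan", "203.0.113.0/24", 100),
    ("Wan", "0.0.0.0/0", 100),
    ("DMZ", "0.0.0.0/0", 100),
]
SECOND_TABLE = [
    ("Lan", "192.168.1.0/24", 100),
    ("DMZ", "192.168.1.0/24", 100),
    ("Wan", "203.0.113.0/24", 100),
    ("Wan", "0.0.0.0/0", 100),
]


def find_identical_routes(routes):
    """Return {(network, metric): [interfaces]} for every network/metric
    pair that is listed on more than one route."""
    groups = defaultdict(list)
    for interface, network, metric in routes:
        groups[(ip_network(network), metric)].append(interface)
    return {key: ifaces for key, ifaces in groups.items() if len(ifaces) > 1}


for name, table in (("first", FIRST_TABLE), ("second", SECOND_TABLE)):
    for (network, metric), interfaces in find_identical_routes(table).items():
        print(f"{name} table: {network} metric={metric} on {', '.join(interfaces)}")
```

Run against the two examples, the sketch reports the all-nets routes on Wan and DMZ in the first table and the 192.168.1.0/24 routes on Lan and DMZ in the second. A reported pair is not automatically an error, since identical routes can be intentional (see Note-2).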

The solution to this problem is fairly easy. Do one of the following: lower the metric on the route you want to use as the primary route, remove the duplicate route, or change the network on the duplicate route to something else.
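The effect of lowering a metric can be illustrated with a simplified model of route selection. The Python sketch below assumes that the most specific matching network wins and that ties are broken by the lowest metric. This model, the function name and the metric value 90 are assumptions for illustration only; the sketch does not reproduce the actual cOS Core lookup.

```python
from ipaddress import ip_address, ip_network


def candidate_routes(routes, destination):
    """Return the routes tied for best match under a simplified model:
    most specific network first, then lowest metric."""
    dest = ip_address(destination)
    matching = [r for r in routes if dest in ip_network(r[1])]
    if not matching:
        return []

    def rank(route):
        # Longer prefix ranks first (hence the minus sign), then lower metric.
        return (-ip_network(route[1]).prefixlen, route[2])

    best = min(rank(r) for r in matching)
    return [r for r in matching if rank(r) == best]


# The Lan/DMZ example from this article: (interface, network, metric).
before = [
    ("Lan", "192.168.1.0/24", 100),
    ("DMZ", "192.168.1.0/24", 100),
    ("Wan", "203.0.113.0/24", 100),
    ("Wan", "0.0.0.0/0", 100),
]
# Same table with the Lan metric lowered to 90 (an arbitrary illustrative value).
after = [("Lan", "192.168.1.0/24", 90)] + before[1:]

print([r[0] for r in candidate_routes(before, "192.168.1.10")])  # ['Lan', 'DMZ']
print([r[0] for r in candidate_routes(after, "192.168.1.10")])   # ['Lan']
```

With identical metrics, two routes remain tied for a Lan host, which is the ambiguity described above. Once the Lan metric is lowered, only the Lan route remains as the best match.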

Note-1: It is not necessarily a restart or upgrade that can trigger this problem. If a configuration change causes the routing table to be updated, it could also trigger the problem because cOS Core needs to repopulate the routing table.

Note-2: “Duplicate routes” are not necessarily a problem. They can be a valid configuration if, for instance, Route Load Balancing is used and you want an equal load distribution.

Note-3: Information about the “Default_Access_Rule” log entry can be found here.
