netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH net 0/2] nexthop: Do not flush blackhole nexthops when loopback goes down
@ 2021-02-28 14:26 Ido Schimmel
  2021-02-28 14:26 ` [RFC PATCH net 1/2] " Ido Schimmel
  2021-02-28 14:26 ` [RFC PATCH net 2/2] selftests: fib_nexthops: Test " Ido Schimmel
  0 siblings, 2 replies; 6+ messages in thread
From: Ido Schimmel @ 2021-02-28 14:26 UTC (permalink / raw)
  To: netdev; +Cc: davem, kuba, dsahern, roopa, sharpd, mlxsw, Ido Schimmel

From: Ido Schimmel <idosch@nvidia.com>

Patch #1 prevents blackhole nexthops from being flushed when the
loopback device goes down given that as far as user space is concerned,
these nexthops do not have a nexthop device.

Patch #2 adds a test case.

This is a user visible change, so sending as RFC.

Personally, I think it is worth making the change. The flow is quite
obscure and therefore unlikely to result in any regressions, especially
when the nexthop API is quite new compared to the legacy API. In
addition, the current behavior is very puzzling to those not familiar
with the inner workings of the nexthop code.

Regardless, there are no regressions in fib_nexthops.sh with this
change:

 # ./fib_nexthops.sh
 ...
 Tests passed: 165
 Tests failed:   0

Ido Schimmel (2):
  nexthop: Do not flush blackhole nexthops when loopback goes down
  selftests: fib_nexthops: Test blackhole nexthops when loopback goes
    down

 net/ipv4/nexthop.c                          | 10 +++++++---
 tools/testing/selftests/net/fib_nexthops.sh |  8 ++++++++
 2 files changed, 15 insertions(+), 3 deletions(-)

-- 
2.29.2


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [RFC PATCH net 1/2] nexthop: Do not flush blackhole nexthops when loopback goes down
  2021-02-28 14:26 [RFC PATCH net 0/2] nexthop: Do not flush blackhole nexthops when loopback goes down Ido Schimmel
@ 2021-02-28 14:26 ` Ido Schimmel
  2021-02-28 23:40   ` David Ahern
  2021-02-28 14:26 ` [RFC PATCH net 2/2] selftests: fib_nexthops: Test " Ido Schimmel
  1 sibling, 1 reply; 6+ messages in thread
From: Ido Schimmel @ 2021-02-28 14:26 UTC (permalink / raw)
  To: netdev; +Cc: davem, kuba, dsahern, roopa, sharpd, mlxsw, Ido Schimmel

From: Ido Schimmel <idosch@nvidia.com>

As far as user space is concerned, blackhole nexthops do not have a
nexthop device and therefore should not be affected by the
administrative or carrier state of any netdev.

However, when the loopback netdev goes down all the blackhole nexthops
are flushed. This happens because internally the kernel associates
blackhole nexthops with the loopback netdev.

This behavior is both confusing to those not familiar with kernel
internals and also diverges from the legacy API where blackhole IPv4
routes are not flushed when the loopback netdev goes down:

 # ip route add blackhole 198.51.100.0/24
 # ip link set dev lo down
 # ip route show 198.51.100.0/24
 blackhole 198.51.100.0/24

Blackhole IPv6 routes are flushed, but at least user space knows that
they are associated with the loopback netdev:

 # ip -6 route show 2001:db8:1::/64
 blackhole 2001:db8:1::/64 dev lo metric 1024 pref medium

Fix this by only flushing blackhole nexthops when the loopback netdev is
unregistered.

Fixes: ab84be7e54fc ("net: Initial nexthop code")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reported-by: Donald Sharp <sharpd@nvidia.com>
---
 net/ipv4/nexthop.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
index f1c6cbdb9e43..743777bce179 100644
--- a/net/ipv4/nexthop.c
+++ b/net/ipv4/nexthop.c
@@ -1399,7 +1399,7 @@ static int insert_nexthop(struct net *net, struct nexthop *new_nh,
 
 /* rtnl */
 /* remove all nexthops tied to a device being deleted */
-static void nexthop_flush_dev(struct net_device *dev)
+static void nexthop_flush_dev(struct net_device *dev, unsigned long event)
 {
 	unsigned int hash = nh_dev_hashfn(dev->ifindex);
 	struct net *net = dev_net(dev);
@@ -1411,6 +1411,10 @@ static void nexthop_flush_dev(struct net_device *dev)
 		if (nhi->fib_nhc.nhc_dev != dev)
 			continue;
 
+		if (nhi->reject_nh &&
+		    (event == NETDEV_DOWN || event == NETDEV_CHANGE))
+			continue;
+
 		remove_nexthop(net, nhi->nh_parent, NULL);
 	}
 }
@@ -2189,11 +2193,11 @@ static int nh_netdev_event(struct notifier_block *this,
 	switch (event) {
 	case NETDEV_DOWN:
 	case NETDEV_UNREGISTER:
-		nexthop_flush_dev(dev);
+		nexthop_flush_dev(dev, event);
 		break;
 	case NETDEV_CHANGE:
 		if (!(dev_get_flags(dev) & (IFF_RUNNING | IFF_LOWER_UP)))
-			nexthop_flush_dev(dev);
+			nexthop_flush_dev(dev, event);
 		break;
 	case NETDEV_CHANGEMTU:
 		info_ext = ptr;
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [RFC PATCH net 2/2] selftests: fib_nexthops: Test blackhole nexthops when loopback goes down
  2021-02-28 14:26 [RFC PATCH net 0/2] nexthop: Do not flush blackhole nexthops when loopback goes down Ido Schimmel
  2021-02-28 14:26 ` [RFC PATCH net 1/2] " Ido Schimmel
@ 2021-02-28 14:26 ` Ido Schimmel
  2021-02-28 23:41   ` David Ahern
  1 sibling, 1 reply; 6+ messages in thread
From: Ido Schimmel @ 2021-02-28 14:26 UTC (permalink / raw)
  To: netdev; +Cc: davem, kuba, dsahern, roopa, sharpd, mlxsw, Ido Schimmel

From: Ido Schimmel <idosch@nvidia.com>

Test that blackhole nexthops are not flushed when the loopback device
goes down.

Output without previous patch:

 # ./fib_nexthops.sh -t basic

 Basic functional tests
 ----------------------
 TEST: List with nothing defined                                     [ OK ]
 TEST: Nexthop get on non-existent id                                [ OK ]
 TEST: Nexthop with no device or gateway                             [ OK ]
 TEST: Nexthop with down device                                      [ OK ]
 TEST: Nexthop with device that is linkdown                          [ OK ]
 TEST: Nexthop with device only                                      [ OK ]
 TEST: Nexthop with duplicate id                                     [ OK ]
 TEST: Blackhole nexthop                                             [ OK ]
 TEST: Blackhole nexthop with other attributes                       [ OK ]
 TEST: Blackhole nexthop with loopback device down                   [FAIL]
 TEST: Create group                                                  [ OK ]
 TEST: Create group with blackhole nexthop                           [FAIL]
 TEST: Create multipath group where 1 path is a blackhole            [ OK ]
 TEST: Multipath group can not have a member replaced by blackhole   [ OK ]
 TEST: Create group with non-existent nexthop                        [ OK ]
 TEST: Create group with same nexthop multiple times                 [ OK ]
 TEST: Replace nexthop with nexthop group                            [ OK ]
 TEST: Replace nexthop group with nexthop                            [ OK ]
 TEST: Nexthop group and device                                      [ OK ]
 TEST: Test proto flush                                              [ OK ]
 TEST: Nexthop group and blackhole                                   [ OK ]

 Tests passed:  19
 Tests failed:   2

Output with previous patch:

 # ./fib_nexthops.sh -t basic

 Basic functional tests
 ----------------------
 TEST: List with nothing defined                                     [ OK ]
 TEST: Nexthop get on non-existent id                                [ OK ]
 TEST: Nexthop with no device or gateway                             [ OK ]
 TEST: Nexthop with down device                                      [ OK ]
 TEST: Nexthop with device that is linkdown                          [ OK ]
 TEST: Nexthop with device only                                      [ OK ]
 TEST: Nexthop with duplicate id                                     [ OK ]
 TEST: Blackhole nexthop                                             [ OK ]
 TEST: Blackhole nexthop with other attributes                       [ OK ]
 TEST: Blackhole nexthop with loopback device down                   [ OK ]
 TEST: Create group                                                  [ OK ]
 TEST: Create group with blackhole nexthop                           [ OK ]
 TEST: Create multipath group where 1 path is a blackhole            [ OK ]
 TEST: Multipath group can not have a member replaced by blackhole   [ OK ]
 TEST: Create group with non-existent nexthop                        [ OK ]
 TEST: Create group with same nexthop multiple times                 [ OK ]
 TEST: Replace nexthop with nexthop group                            [ OK ]
 TEST: Replace nexthop group with nexthop                            [ OK ]
 TEST: Nexthop group and device                                      [ OK ]
 TEST: Test proto flush                                              [ OK ]
 TEST: Nexthop group and blackhole                                   [ OK ]

 Tests passed:  21
 Tests failed:   0

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 tools/testing/selftests/net/fib_nexthops.sh | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/tools/testing/selftests/net/fib_nexthops.sh b/tools/testing/selftests/net/fib_nexthops.sh
index 4c7d33618437..d98fb85e201c 100755
--- a/tools/testing/selftests/net/fib_nexthops.sh
+++ b/tools/testing/selftests/net/fib_nexthops.sh
@@ -1524,6 +1524,14 @@ basic()
 	run_cmd "$IP nexthop replace id 2 blackhole dev veth1"
 	log_test $? 2 "Blackhole nexthop with other attributes"
 
+	# blackhole nexthop should not be affected by the state of the loopback
+	# device
+	run_cmd "$IP link set dev lo down"
+	check_nexthop "id 2" "id 2 blackhole"
+	log_test $? 0 "Blackhole nexthop with loopback device down"
+
+	run_cmd "$IP link set dev lo up"
+
 	#
 	# groups
 	#
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [RFC PATCH net 1/2] nexthop: Do not flush blackhole nexthops when loopback goes down
  2021-02-28 14:26 ` [RFC PATCH net 1/2] " Ido Schimmel
@ 2021-02-28 23:40   ` David Ahern
  2021-03-01  8:27     ` Ido Schimmel
  0 siblings, 1 reply; 6+ messages in thread
From: David Ahern @ 2021-02-28 23:40 UTC (permalink / raw)
  To: Ido Schimmel, netdev; +Cc: davem, kuba, roopa, sharpd, mlxsw, Ido Schimmel

On 2/28/21 7:26 AM, Ido Schimmel wrote:
> From: Ido Schimmel <idosch@nvidia.com>
> 
> As far as user space is concerned, blackhole nexthops do not have a
> nexthop device and therefore should not be affected by the
> administrative or carrier state of any netdev.
> 
> However, when the loopback netdev goes down all the blackhole nexthops
> are flushed. This happens because internally the kernel associates
> blackhole nexthops with the loopback netdev.

That was not intended, so definitely a bug.

> 
> This behavior is both confusing to those not familiar with kernel
> internals and also diverges from the legacy API where blackhole IPv4
> routes are not flushed when the loopback netdev goes down:
> 
>  # ip route add blackhole 198.51.100.0/24
>  # ip link set dev lo down
>  # ip route show 198.51.100.0/24
>  blackhole 198.51.100.0/24
> 
> Blackhole IPv6 routes are flushed, but at least user space knows that
> they are associated with the loopback netdev:
> 
>  # ip -6 route show 2001:db8:1::/64
>  blackhole 2001:db8:1::/64 dev lo metric 1024 pref medium
> 
> Fix this by only flushing blackhole nexthops when the loopback netdev is
> unregistered.
> 
> Fixes: ab84be7e54fc ("net: Initial nexthop code")
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> Reported-by: Donald Sharp <sharpd@nvidia.com>
> ---
>  net/ipv4/nexthop.c | 10 +++++++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
> index f1c6cbdb9e43..743777bce179 100644
> --- a/net/ipv4/nexthop.c
> +++ b/net/ipv4/nexthop.c
> @@ -1399,7 +1399,7 @@ static int insert_nexthop(struct net *net, struct nexthop *new_nh,
>  
>  /* rtnl */
>  /* remove all nexthops tied to a device being deleted */
> -static void nexthop_flush_dev(struct net_device *dev)
> +static void nexthop_flush_dev(struct net_device *dev, unsigned long event)
>  {
>  	unsigned int hash = nh_dev_hashfn(dev->ifindex);
>  	struct net *net = dev_net(dev);
> @@ -1411,6 +1411,10 @@ static void nexthop_flush_dev(struct net_device *dev)
>  		if (nhi->fib_nhc.nhc_dev != dev)
>  			continue;
>  
> +		if (nhi->reject_nh &&
> +		    (event == NETDEV_DOWN || event == NETDEV_CHANGE))
> +			continue;
> +
>  		remove_nexthop(net, nhi->nh_parent, NULL);
>  	}
>  }
> @@ -2189,11 +2193,11 @@ static int nh_netdev_event(struct notifier_block *this,
>  	switch (event) {
>  	case NETDEV_DOWN:
>  	case NETDEV_UNREGISTER:
> -		nexthop_flush_dev(dev);
> +		nexthop_flush_dev(dev, event);
>  		break;
>  	case NETDEV_CHANGE:
>  		if (!(dev_get_flags(dev) & (IFF_RUNNING | IFF_LOWER_UP)))
> -			nexthop_flush_dev(dev);
> +			nexthop_flush_dev(dev, event);
>  		break;
>  	case NETDEV_CHANGEMTU:
>  		info_ext = ptr;
> 

LGTM. I suggest submitting without the RFC.

Reviewed-by: David Ahern <dsahern@gmail.com>


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [RFC PATCH net 2/2] selftests: fib_nexthops: Test blackhole nexthops when loopback goes down
  2021-02-28 14:26 ` [RFC PATCH net 2/2] selftests: fib_nexthops: Test " Ido Schimmel
@ 2021-02-28 23:41   ` David Ahern
  0 siblings, 0 replies; 6+ messages in thread
From: David Ahern @ 2021-02-28 23:41 UTC (permalink / raw)
  To: Ido Schimmel, netdev; +Cc: davem, kuba, roopa, sharpd, mlxsw, Ido Schimmel

On 2/28/21 7:26 AM, Ido Schimmel wrote:
> From: Ido Schimmel <idosch@nvidia.com>
> 
> Test that blackhole nexthops are not flushed when the loopback device
> goes down.
> 


> 
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> ---
>  tools/testing/selftests/net/fib_nexthops.sh | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/tools/testing/selftests/net/fib_nexthops.sh b/tools/testing/selftests/net/fib_nexthops.sh
> index 4c7d33618437..d98fb85e201c 100755
> --- a/tools/testing/selftests/net/fib_nexthops.sh
> +++ b/tools/testing/selftests/net/fib_nexthops.sh
> @@ -1524,6 +1524,14 @@ basic()
>  	run_cmd "$IP nexthop replace id 2 blackhole dev veth1"
>  	log_test $? 2 "Blackhole nexthop with other attributes"
>  
> +	# blackhole nexthop should not be affected by the state of the loopback
> +	# device
> +	run_cmd "$IP link set dev lo down"
> +	check_nexthop "id 2" "id 2 blackhole"
> +	log_test $? 0 "Blackhole nexthop with loopback device down"
> +
> +	run_cmd "$IP link set dev lo up"
> +
>  	#
>  	# groups
>  	#
> 

Thanks for adding a test.

Reviewed-by: David Ahern <dsahern@gmail.com>


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [RFC PATCH net 1/2] nexthop: Do not flush blackhole nexthops when loopback goes down
  2021-02-28 23:40   ` David Ahern
@ 2021-03-01  8:27     ` Ido Schimmel
  0 siblings, 0 replies; 6+ messages in thread
From: Ido Schimmel @ 2021-03-01  8:27 UTC (permalink / raw)
  To: David Ahern; +Cc: netdev, davem, kuba, roopa, sharpd, mlxsw, Ido Schimmel

On Sun, Feb 28, 2021 at 04:40:13PM -0700, David Ahern wrote:
> LGTM. I suggest submitting without the RFC.
> 
> Reviewed-by: David Ahern <dsahern@gmail.com>

Thanks, David. Will let it go through regression and submit later this
week.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-03-01  8:28 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-28 14:26 [RFC PATCH net 0/2] nexthop: Do not flush blackhole nexthops when loopback goes down Ido Schimmel
2021-02-28 14:26 ` [RFC PATCH net 1/2] " Ido Schimmel
2021-02-28 23:40   ` David Ahern
2021-03-01  8:27     ` Ido Schimmel
2021-02-28 14:26 ` [RFC PATCH net 2/2] selftests: fib_nexthops: Test " Ido Schimmel
2021-02-28 23:41   ` David Ahern

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).