All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH net-next 00/22] nexthop: Add support for nexthop objects offload
@ 2020-09-08  9:10 Ido Schimmel
  2020-09-08  9:10 ` [RFC PATCH net-next 01/22] nexthop: Remove unused function declaration from header file Ido Schimmel
                   ` (21 more replies)
  0 siblings, 22 replies; 56+ messages in thread
From: Ido Schimmel @ 2020-09-08  9:10 UTC (permalink / raw)
  To: netdev; +Cc: davem, kuba, dsahern, roopa, mlxsw, Ido Schimmel

From: Ido Schimmel <idosch@nvidia.com>

Note: I'm aware that 22 patches is a lot and I will split it for the
non-RFC submission. Sending all in one piece to see if there are general
comments regarding the interface. Also, most of the patches are very
small.

This patch set adds support for nexthop objects offload with a dummy
implementation over netdevsim. mlxsw support will be added later.

The general idea is very similar to route offload in that notifications
are sent whenever nexthop objects are changed. A listener can veto the
change and the error will be communicated to user space with extack.

To keep listeners as simple as possible, they not only receive
notifications for the nexthop object that is changed, but also for all
the other objects affected by this change. For example, when a single
nexthop is replaced, a replace notification is sent for the single
nexthop, but also for all the nexthop groups this nexthop is member in.
This relieves listeners from the need to track such dependencies.

To simplify things further for listeners, the notification info does not
contain the raw nexthop data structures (e.g., 'struct nexthop'), but
less complex data structures into which the raw data structures are
parsed into.

Tested with a new selftest over netdevsim and with fib_nexthops.sh:

Tests passed: 164
Tests failed:   0

Patch set overview:

Patches #1-#4 perform small cleanups and covert the existing nexthop
notification chain to a blocking one, so that device drivers could block
when programming nexthops to hardware. This is safe because all
notifications are emitted from a process context.

Patches #5-#8 introduce the aforementioned data structures and convert
existing listeners (i.e., the VXLAN driver) to use them.

Patches #9-#10 add a new RTNH_F_TRAP flag and the ability to set it and
RTNH_F_OFFLOAD on nexthops. This flag is used by netdevsim for testing
purposes and will also be used by mlxsw. These flags are consistent with
the existing RTM_F_OFFLOAD and RTM_F_TRAP flags.

Patches #11-#18 gradually add the new nexthop notifications.

Patches #19-#22 add a dummy implementation for nexthop offload over
netdevsim and a selftest to exercise both good and bad flows.

Ido Schimmel (22):
  nexthop: Remove unused function declaration from header file
  nexthop: Convert to blocking notification chain
  nexthop: Only emit a notification when nexthop is actually deleted
  selftests: fib_nexthops: Test cleanup of FDB entries following nexthop
    deletion
  nexthop: Add nexthop notification data structures
  nexthop: Pass extack to nexthop notifier
  nexthop: Prepare new notification info
  nexthop: vxlan: Convert to new notification info
  rtnetlink: Add RTNH_F_TRAP flag
  nexthop: Allow setting "offload" and "trap" indications on nexthops
  nexthop: Emit a notification when a nexthop is added
  nexthop: Emit a notification when a nexthop group is replaced
  nexthop: Emit a notification when a single nexthop is replaced
  nexthop: Emit a notification when a nexthop group is modified
  nexthop: Emit a notification when a nexthop group is reduced
  nexthop: Pass extack to register_nexthop_notifier()
  nexthop: Replay nexthops when registering a notifier
  nexthop: Remove in-kernel route notifications when nexthop changes
  netdevsim: Add devlink resource for nexthops
  netdevsim: Add dummy implementation for nexthop offload
  netdevsim: Allow programming routes with nexthop objects
  selftests: netdevsim: Add test for nexthop offload API

 .../networking/devlink/netdevsim.rst          |   3 +-
 drivers/net/netdevsim/dev.c                   |   6 +
 drivers/net/netdevsim/fib.c                   | 265 +++++++++++-
 drivers/net/netdevsim/netdevsim.h             |   1 +
 drivers/net/vxlan.c                           |  12 +-
 include/net/netns/nexthop.h                   |   2 +-
 include/net/nexthop.h                         |  45 +-
 include/uapi/linux/rtnetlink.h                |   6 +-
 net/ipv4/fib_semantics.c                      |   2 +
 net/ipv4/fib_trie.c                           |   9 -
 net/ipv4/nexthop.c                            | 262 ++++++++++-
 net/ipv6/route.c                              |   5 -
 .../drivers/net/netdevsim/nexthop.sh          | 408 ++++++++++++++++++
 tools/testing/selftests/net/fib_nexthops.sh   |  14 +
 14 files changed, 985 insertions(+), 55 deletions(-)
 create mode 100755 tools/testing/selftests/drivers/net/netdevsim/nexthop.sh

-- 
2.26.2


^ permalink raw reply	[flat|nested] 56+ messages in thread

* [RFC PATCH net-next 01/22] nexthop: Remove unused function declaration from header file
  2020-09-08  9:10 [RFC PATCH net-next 00/22] nexthop: Add support for nexthop objects offload Ido Schimmel
@ 2020-09-08  9:10 ` Ido Schimmel
  2020-09-08 14:29   ` David Ahern
  2020-09-08  9:10 ` [RFC PATCH net-next 02/22] nexthop: Convert to blocking notification chain Ido Schimmel
                   ` (20 subsequent siblings)
  21 siblings, 1 reply; 56+ messages in thread
From: Ido Schimmel @ 2020-09-08  9:10 UTC (permalink / raw)
  To: netdev; +Cc: davem, kuba, dsahern, roopa, mlxsw, Ido Schimmel

From: Ido Schimmel <idosch@nvidia.com>

Not used or implemented anywhere.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 include/net/nexthop.h | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/include/net/nexthop.h b/include/net/nexthop.h
index 3a4f9e3b91a5..2e44efe5709b 100644
--- a/include/net/nexthop.h
+++ b/include/net/nexthop.h
@@ -109,9 +109,6 @@ enum nexthop_event_type {
 	NEXTHOP_EVENT_DEL
 };
 
-int call_nexthop_notifier(struct notifier_block *nb, struct net *net,
-			  enum nexthop_event_type event_type,
-			  struct nexthop *nh);
 int register_nexthop_notifier(struct net *net, struct notifier_block *nb);
 int unregister_nexthop_notifier(struct net *net, struct notifier_block *nb);
 
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [RFC PATCH net-next 02/22] nexthop: Convert to blocking notification chain
  2020-09-08  9:10 [RFC PATCH net-next 00/22] nexthop: Add support for nexthop objects offload Ido Schimmel
  2020-09-08  9:10 ` [RFC PATCH net-next 01/22] nexthop: Remove unused function declaration from header file Ido Schimmel
@ 2020-09-08  9:10 ` Ido Schimmel
  2020-09-08 14:34   ` David Ahern
  2020-09-08  9:10 ` [RFC PATCH net-next 03/22] nexthop: Only emit a notification when nexthop is actually deleted Ido Schimmel
                   ` (19 subsequent siblings)
  21 siblings, 1 reply; 56+ messages in thread
From: Ido Schimmel @ 2020-09-08  9:10 UTC (permalink / raw)
  To: netdev; +Cc: davem, kuba, dsahern, roopa, mlxsw, Ido Schimmel

From: Ido Schimmel <idosch@nvidia.com>

Currently, the only listener of the nexthop notification chain is the
VXLAN driver. Subsequent patches will add more listeners (e.g., device
drivers such as netdevsim) that need to be able to block when processing
notifications.

Therefore, convert the notification chain to a blocking one. This is
safe as notifications are always emitted from process context.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 include/net/netns/nexthop.h |  2 +-
 net/ipv4/nexthop.c          | 13 +++++++------
 2 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/include/net/netns/nexthop.h b/include/net/netns/nexthop.h
index 1937476c94a0..1849e77eb68a 100644
--- a/include/net/netns/nexthop.h
+++ b/include/net/netns/nexthop.h
@@ -14,6 +14,6 @@ struct netns_nexthop {
 
 	unsigned int		seq;		/* protected by rtnl_mutex */
 	u32			last_id_allocated;
-	struct atomic_notifier_head notifier_chain;
+	struct blocking_notifier_head notifier_chain;
 };
 #endif
diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
index bf9d4cd2d6e5..13d9219a9aa1 100644
--- a/net/ipv4/nexthop.c
+++ b/net/ipv4/nexthop.c
@@ -42,8 +42,8 @@ static int call_nexthop_notifiers(struct net *net,
 {
 	int err;
 
-	err = atomic_notifier_call_chain(&net->nexthop.notifier_chain,
-					 event_type, nh);
+	err = blocking_notifier_call_chain(&net->nexthop.notifier_chain,
+					   event_type, nh);
 	return notifier_to_errno(err);
 }
 
@@ -1959,14 +1959,15 @@ static struct notifier_block nh_netdev_notifier = {
 
 int register_nexthop_notifier(struct net *net, struct notifier_block *nb)
 {
-	return atomic_notifier_chain_register(&net->nexthop.notifier_chain, nb);
+	return blocking_notifier_chain_register(&net->nexthop.notifier_chain,
+						nb);
 }
 EXPORT_SYMBOL(register_nexthop_notifier);
 
 int unregister_nexthop_notifier(struct net *net, struct notifier_block *nb)
 {
-	return atomic_notifier_chain_unregister(&net->nexthop.notifier_chain,
-						nb);
+	return blocking_notifier_chain_unregister(&net->nexthop.notifier_chain,
+						  nb);
 }
 EXPORT_SYMBOL(unregister_nexthop_notifier);
 
@@ -1986,7 +1987,7 @@ static int __net_init nexthop_net_init(struct net *net)
 	net->nexthop.devhash = kzalloc(sz, GFP_KERNEL);
 	if (!net->nexthop.devhash)
 		return -ENOMEM;
-	ATOMIC_INIT_NOTIFIER_HEAD(&net->nexthop.notifier_chain);
+	BLOCKING_INIT_NOTIFIER_HEAD(&net->nexthop.notifier_chain);
 
 	return 0;
 }
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [RFC PATCH net-next 03/22] nexthop: Only emit a notification when nexthop is actually deleted
  2020-09-08  9:10 [RFC PATCH net-next 00/22] nexthop: Add support for nexthop objects offload Ido Schimmel
  2020-09-08  9:10 ` [RFC PATCH net-next 01/22] nexthop: Remove unused function declaration from header file Ido Schimmel
  2020-09-08  9:10 ` [RFC PATCH net-next 02/22] nexthop: Convert to blocking notification chain Ido Schimmel
@ 2020-09-08  9:10 ` Ido Schimmel
  2020-09-08 14:34   ` David Ahern
  2020-09-08 14:39   ` Jiri Pirko
  2020-09-08  9:10 ` [RFC PATCH net-next 04/22] selftests: fib_nexthops: Test cleanup of FDB entries following nexthop deletion Ido Schimmel
                   ` (18 subsequent siblings)
  21 siblings, 2 replies; 56+ messages in thread
From: Ido Schimmel @ 2020-09-08  9:10 UTC (permalink / raw)
  To: netdev; +Cc: davem, kuba, dsahern, roopa, mlxsw, Ido Schimmel

From: Ido Schimmel <idosch@nvidia.com>

Currently, the delete notification is emitted from the error path of
nexthop_add() and replace_nexthop(), which can be confusing to listeners
as they are not familiar with the nexthop.

Instead, only emit the notification when the nexthop is actually
deleted. The following sub-cases are covered:

1. User space deletes the nexthop
2. The nexthop is deleted by the kernel due to a netdev event (e.g.,
   nexthop device going down)
3. A group is deleted because its last nexthop is being deleted
4. The network namespace of the nexthop device is deleted

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 net/ipv4/nexthop.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
index 13d9219a9aa1..8c0f17c6863c 100644
--- a/net/ipv4/nexthop.c
+++ b/net/ipv4/nexthop.c
@@ -870,8 +870,6 @@ static void __remove_nexthop_fib(struct net *net, struct nexthop *nh)
 	bool do_flush = false;
 	struct fib_info *fi;
 
-	call_nexthop_notifiers(net, NEXTHOP_EVENT_DEL, nh);
-
 	list_for_each_entry(fi, &nh->fi_list, nh_list) {
 		fi->fib_flags |= RTNH_F_DEAD;
 		do_flush = true;
@@ -909,6 +907,8 @@ static void __remove_nexthop(struct net *net, struct nexthop *nh,
 static void remove_nexthop(struct net *net, struct nexthop *nh,
 			   struct nl_info *nlinfo)
 {
+	call_nexthop_notifiers(net, NEXTHOP_EVENT_DEL, nh);
+
 	/* remove from the tree */
 	rb_erase(&nh->rb_node, &net->nexthop.rb_root);
 
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [RFC PATCH net-next 04/22] selftests: fib_nexthops: Test cleanup of FDB entries following nexthop deletion
  2020-09-08  9:10 [RFC PATCH net-next 00/22] nexthop: Add support for nexthop objects offload Ido Schimmel
                   ` (2 preceding siblings ...)
  2020-09-08  9:10 ` [RFC PATCH net-next 03/22] nexthop: Only emit a notification when nexthop is actually deleted Ido Schimmel
@ 2020-09-08  9:10 ` Ido Schimmel
  2020-09-08 14:35   ` David Ahern
  2020-09-08  9:10 ` [RFC PATCH net-next 05/22] nexthop: Add nexthop notification data structures Ido Schimmel
                   ` (17 subsequent siblings)
  21 siblings, 1 reply; 56+ messages in thread
From: Ido Schimmel @ 2020-09-08  9:10 UTC (permalink / raw)
  To: netdev; +Cc: davem, kuba, dsahern, roopa, mlxsw, Ido Schimmel

From: Ido Schimmel <idosch@nvidia.com>

Commit c7cdbe2efc40 ("vxlan: support for nexthop notifiers") registered
a listener in the VXLAN driver to the nexthop notification chain. Its
purpose is to cleanup FDB entries that use a nexthop that is being
deleted.

Test that such FDB entries are removed when the nexthop group that they
use is deleted. Test that entries are not deleted when a single nexthop
in the group is deleted.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 tools/testing/selftests/net/fib_nexthops.sh | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/tools/testing/selftests/net/fib_nexthops.sh b/tools/testing/selftests/net/fib_nexthops.sh
index b74884d52913..eb693a3b7b4a 100755
--- a/tools/testing/selftests/net/fib_nexthops.sh
+++ b/tools/testing/selftests/net/fib_nexthops.sh
@@ -411,9 +411,16 @@ ipv6_fdb_grp_fcnal()
 	run_cmd "$IP -6 ro add 2001:db8:101::1/128 nhid 103"
 	log_test $? 2 "Route add with fdb nexthop group"
 
+	run_cmd "$IP nexthop del id 61"
+	run_cmd "$BRIDGE fdb get to 02:02:00:00:00:13 dev vx10 self"
+	log_test $? 0 "Fdb entry after deleting a single nexthop"
+
 	run_cmd "$IP nexthop del id 102"
 	log_test $? 0 "Fdb nexthop delete"
 
+	run_cmd "$BRIDGE fdb get to 02:02:00:00:00:13 dev vx10 self"
+	log_test $? 254 "Fdb entry after deleting a nexthop group"
+
 	$IP link del dev vx10
 }
 
@@ -484,9 +491,16 @@ ipv4_fdb_grp_fcnal()
 	run_cmd "$IP ro add 172.16.0.0/22 nhid 103"
 	log_test $? 2 "Route add with fdb nexthop group"
 
+	run_cmd "$IP nexthop del id 12"
+	run_cmd "$BRIDGE fdb get to 02:02:00:00:00:13 dev vx10 self"
+	log_test $? 0 "Fdb entry after deleting a single nexthop"
+
 	run_cmd "$IP nexthop del id 102"
 	log_test $? 0 "Fdb nexthop delete"
 
+	run_cmd "$BRIDGE fdb get to 02:02:00:00:00:13 dev vx10 self"
+	log_test $? 254 "Fdb entry after deleting a nexthop group"
+
 	$IP link del dev vx10
 }
 
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [RFC PATCH net-next 05/22] nexthop: Add nexthop notification data structures
  2020-09-08  9:10 [RFC PATCH net-next 00/22] nexthop: Add support for nexthop objects offload Ido Schimmel
                   ` (3 preceding siblings ...)
  2020-09-08  9:10 ` [RFC PATCH net-next 04/22] selftests: fib_nexthops: Test cleanup of FDB entries following nexthop deletion Ido Schimmel
@ 2020-09-08  9:10 ` Ido Schimmel
  2020-09-08 14:43   ` David Ahern
  2020-09-08  9:10 ` [RFC PATCH net-next 06/22] nexthop: Pass extack to nexthop notifier Ido Schimmel
                   ` (16 subsequent siblings)
  21 siblings, 1 reply; 56+ messages in thread
From: Ido Schimmel @ 2020-09-08  9:10 UTC (permalink / raw)
  To: netdev; +Cc: davem, kuba, dsahern, roopa, mlxsw, Ido Schimmel

From: Ido Schimmel <idosch@nvidia.com>

Add data structures that will be used for nexthop replace and delete
notifications in the previously introduced nexthop notification chain.

New data structures are added instead of passing the existing nexthop
code structures directly for several reasons.

First, the existing structures encode a lot of bookkeeping information
which is irrelevant for listeners of the notification chain.

Second, the existing structures can be changed without worrying about
introducing regressions in listeners since they are not accessed
directly by them.

Third, listeners of the notification chain do not need to each parse the
relatively complex nexthop code structures. They are passing the
required information in a simplified way.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 include/net/nexthop.h | 35 +++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)

diff --git a/include/net/nexthop.h b/include/net/nexthop.h
index 2e44efe5709b..0bde1aa867c0 100644
--- a/include/net/nexthop.h
+++ b/include/net/nexthop.h
@@ -109,6 +109,41 @@ enum nexthop_event_type {
 	NEXTHOP_EVENT_DEL
 };
 
+struct nh_notifier_single_info {
+	struct net_device *dev;
+	u8 gw_family;
+	union {
+		__be32 ipv4;
+		struct in6_addr ipv6;
+	};
+	u8 is_reject:1,
+	   is_fdb:1,
+	   is_encap:1;
+};
+
+struct nh_notifier_grp_entry_info {
+	u8 weight;
+	u32 id;
+	struct nh_notifier_single_info nh;
+};
+
+struct nh_notifier_grp_info {
+	u16 num_nh;
+	bool is_fdb;
+	struct nh_notifier_grp_entry_info nh_entries[];
+};
+
+struct nh_notifier_info {
+	struct net *net;
+	struct netlink_ext_ack *extack;
+	u32 id;
+	bool is_grp;
+	union {
+		struct nh_notifier_single_info *nh;
+		struct nh_notifier_grp_info *nh_grp;
+	};
+};
+
 int register_nexthop_notifier(struct net *net, struct notifier_block *nb);
 int unregister_nexthop_notifier(struct net *net, struct notifier_block *nb);
 
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [RFC PATCH net-next 06/22] nexthop: Pass extack to nexthop notifier
  2020-09-08  9:10 [RFC PATCH net-next 00/22] nexthop: Add support for nexthop objects offload Ido Schimmel
                   ` (4 preceding siblings ...)
  2020-09-08  9:10 ` [RFC PATCH net-next 05/22] nexthop: Add nexthop notification data structures Ido Schimmel
@ 2020-09-08  9:10 ` Ido Schimmel
  2020-09-08 14:44   ` David Ahern
  2020-09-08  9:10 ` [RFC PATCH net-next 07/22] nexthop: Prepare new notification info Ido Schimmel
                   ` (15 subsequent siblings)
  21 siblings, 1 reply; 56+ messages in thread
From: Ido Schimmel @ 2020-09-08  9:10 UTC (permalink / raw)
  To: netdev; +Cc: davem, kuba, dsahern, roopa, mlxsw, Ido Schimmel

From: Ido Schimmel <idosch@nvidia.com>

The next patch will add extack to the notification info. This allows
listeners to veto notifications and communicate the reason to user space.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 net/ipv4/nexthop.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
index 8c0f17c6863c..dafcb9f17250 100644
--- a/net/ipv4/nexthop.c
+++ b/net/ipv4/nexthop.c
@@ -38,7 +38,8 @@ static const struct nla_policy rtm_nh_policy[NHA_MAX + 1] = {
 
 static int call_nexthop_notifiers(struct net *net,
 				  enum nexthop_event_type event_type,
-				  struct nexthop *nh)
+				  struct nexthop *nh,
+				  struct netlink_ext_ack *extack)
 {
 	int err;
 
@@ -907,7 +908,7 @@ static void __remove_nexthop(struct net *net, struct nexthop *nh,
 static void remove_nexthop(struct net *net, struct nexthop *nh,
 			   struct nl_info *nlinfo)
 {
-	call_nexthop_notifiers(net, NEXTHOP_EVENT_DEL, nh);
+	call_nexthop_notifiers(net, NEXTHOP_EVENT_DEL, nh, NULL);
 
 	/* remove from the tree */
 	rb_erase(&nh->rb_node, &net->nexthop.rb_root);
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [RFC PATCH net-next 07/22] nexthop: Prepare new notification info
  2020-09-08  9:10 [RFC PATCH net-next 00/22] nexthop: Add support for nexthop objects offload Ido Schimmel
                   ` (5 preceding siblings ...)
  2020-09-08  9:10 ` [RFC PATCH net-next 06/22] nexthop: Pass extack to nexthop notifier Ido Schimmel
@ 2020-09-08  9:10 ` Ido Schimmel
  2020-09-08 14:55   ` David Ahern
  2020-09-08  9:10 ` [RFC PATCH net-next 08/22] nexthop: vxlan: Convert to " Ido Schimmel
                   ` (14 subsequent siblings)
  21 siblings, 1 reply; 56+ messages in thread
From: Ido Schimmel @ 2020-09-08  9:10 UTC (permalink / raw)
  To: netdev; +Cc: davem, kuba, dsahern, roopa, mlxsw, Ido Schimmel

From: Ido Schimmel <idosch@nvidia.com>

Prepare the new notification information so that it could be passed to
listeners in the new patch.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 net/ipv4/nexthop.c | 108 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 108 insertions(+)

diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
index dafcb9f17250..68fd25c6eec7 100644
--- a/net/ipv4/nexthop.c
+++ b/net/ipv4/nexthop.c
@@ -36,15 +36,123 @@ static const struct nla_policy rtm_nh_policy[NHA_MAX + 1] = {
 	[NHA_FDB]		= { .type = NLA_FLAG },
 };
 
+static bool nexthop_notifiers_is_empty(struct net *net)
+{
+	return !net->nexthop.notifier_chain.head;
+}
+
+static void
+__nh_notifier_single_info_init(struct nh_notifier_single_info *nh_info,
+			       const struct nexthop *nh)
+{
+	struct nh_info *nhi = rtnl_dereference(nh->nh_info);
+
+	nh_info->dev = nhi->fib_nhc.nhc_dev;
+	nh_info->gw_family = nhi->fib_nhc.nhc_gw_family;
+	if (nh_info->gw_family == AF_INET)
+		nh_info->ipv4 = nhi->fib_nhc.nhc_gw.ipv4;
+	else if (nh_info->gw_family == AF_INET6)
+		nh_info->ipv6 = nhi->fib_nhc.nhc_gw.ipv6;
+	nh_info->is_reject = nhi->reject_nh;
+	nh_info->is_fdb = nhi->fdb_nh;
+	nh_info->is_encap = !!nhi->fib_nhc.nhc_lwtstate;
+}
+
+static int nh_notifier_single_info_init(struct nh_notifier_info *info,
+					const struct nexthop *nh)
+{
+	info->nh = kzalloc(sizeof(*info->nh), GFP_KERNEL);
+	if (!info->nh)
+		return -ENOMEM;
+
+	__nh_notifier_single_info_init(info->nh, nh);
+
+	return 0;
+}
+
+static void nh_notifier_single_info_fini(struct nh_notifier_info *info)
+{
+	kfree(info->nh);
+}
+
+static int nh_notifier_grp_info_init(struct nh_notifier_info *info,
+				     const struct nexthop *nh)
+{
+	struct nh_group *nhg = rtnl_dereference(nh->nh_grp);
+	u16 num_nh = nhg->num_nh;
+	int i;
+
+	info->nh_grp = kzalloc(struct_size(info->nh_grp, nh_entries, num_nh),
+			       GFP_KERNEL);
+	if (!info->nh_grp)
+		return -ENOMEM;
+
+	info->nh_grp->num_nh = num_nh;
+	info->nh_grp->is_fdb = nhg->fdb_nh;
+
+	for (i = 0; i < num_nh; i++) {
+		struct nh_grp_entry *nhge = &nhg->nh_entries[i];
+
+		info->nh_grp->nh_entries[i].id = nhge->nh->id;
+		info->nh_grp->nh_entries[i].weight = nhge->weight;
+		__nh_notifier_single_info_init(&info->nh_grp->nh_entries[i].nh,
+					       nhge->nh);
+	}
+
+	return 0;
+}
+
+static void nh_notifier_grp_info_fini(struct nh_notifier_info *info)
+{
+	kfree(info->nh_grp);
+}
+
+static int nh_notifier_info_init(struct nh_notifier_info *info,
+				 const struct nexthop *nh)
+{
+	info->id = nh->id;
+	info->is_grp = nh->is_group;
+
+	if (info->is_grp)
+		return nh_notifier_grp_info_init(info, nh);
+	else
+		return nh_notifier_single_info_init(info, nh);
+}
+
+static void nh_notifier_info_fini(struct nh_notifier_info *info)
+{
+	if (info->is_grp)
+		nh_notifier_grp_info_fini(info);
+	else
+		nh_notifier_single_info_fini(info);
+}
+
 static int call_nexthop_notifiers(struct net *net,
 				  enum nexthop_event_type event_type,
 				  struct nexthop *nh,
 				  struct netlink_ext_ack *extack)
 {
+	struct nh_notifier_info info = {
+		.net = net,
+		.extack = extack,
+	};
 	int err;
 
+	ASSERT_RTNL();
+
+	if (nexthop_notifiers_is_empty(net))
+		return 0;
+
+	err = nh_notifier_info_init(&info, nh);
+	if (err) {
+		NL_SET_ERR_MSG(extack, "Failed to initialize nexthop notifier info");
+		return err;
+	}
+
 	err = blocking_notifier_call_chain(&net->nexthop.notifier_chain,
 					   event_type, nh);
+	nh_notifier_info_fini(&info);
+
 	return notifier_to_errno(err);
 }
 
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [RFC PATCH net-next 08/22] nexthop: vxlan: Convert to new notification info
  2020-09-08  9:10 [RFC PATCH net-next 00/22] nexthop: Add support for nexthop objects offload Ido Schimmel
                   ` (6 preceding siblings ...)
  2020-09-08  9:10 ` [RFC PATCH net-next 07/22] nexthop: Prepare new notification info Ido Schimmel
@ 2020-09-08  9:10 ` Ido Schimmel
  2020-09-08 14:58   ` David Ahern
  2020-09-08  9:10 ` [RFC PATCH net-next 09/22] rtnetlink: Add RTNH_F_TRAP flag Ido Schimmel
                   ` (13 subsequent siblings)
  21 siblings, 1 reply; 56+ messages in thread
From: Ido Schimmel @ 2020-09-08  9:10 UTC (permalink / raw)
  To: netdev; +Cc: davem, kuba, dsahern, roopa, mlxsw, Ido Schimmel

From: Ido Schimmel <idosch@nvidia.com>

Convert the sole listener of the nexthop notification chain (the VXLAN
driver) to the new notification info.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 drivers/net/vxlan.c | 9 +++++++--
 net/ipv4/nexthop.c  | 2 +-
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index b9fefe27e3e8..29deedee6ef4 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -4687,9 +4687,14 @@ static void vxlan_fdb_nh_flush(struct nexthop *nh)
 static int vxlan_nexthop_event(struct notifier_block *nb,
 			       unsigned long event, void *ptr)
 {
-	struct nexthop *nh = ptr;
+	struct nh_notifier_info *info = ptr;
+	struct nexthop *nh;
+
+	if (event != NEXTHOP_EVENT_DEL)
+		return NOTIFY_DONE;
 
-	if (!nh || event != NEXTHOP_EVENT_DEL)
+	nh = nexthop_find_by_id(info->net, info->id);
+	if (!nh)
 		return NOTIFY_DONE;
 
 	vxlan_fdb_nh_flush(nh);
diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
index 68fd25c6eec7..70c8ab6906ec 100644
--- a/net/ipv4/nexthop.c
+++ b/net/ipv4/nexthop.c
@@ -150,7 +150,7 @@ static int call_nexthop_notifiers(struct net *net,
 	}
 
 	err = blocking_notifier_call_chain(&net->nexthop.notifier_chain,
-					   event_type, nh);
+					   event_type, &info);
 	nh_notifier_info_fini(&info);
 
 	return notifier_to_errno(err);
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [RFC PATCH net-next 09/22] rtnetlink: Add RTNH_F_TRAP flag
  2020-09-08  9:10 [RFC PATCH net-next 00/22] nexthop: Add support for nexthop objects offload Ido Schimmel
                   ` (7 preceding siblings ...)
  2020-09-08  9:10 ` [RFC PATCH net-next 08/22] nexthop: vxlan: Convert to " Ido Schimmel
@ 2020-09-08  9:10 ` Ido Schimmel
  2020-09-08 15:02   ` David Ahern
  2020-09-08  9:10 ` [RFC PATCH net-next 10/22] nexthop: Allow setting "offload" and "trap" indications on nexthops Ido Schimmel
                   ` (12 subsequent siblings)
  21 siblings, 1 reply; 56+ messages in thread
From: Ido Schimmel @ 2020-09-08  9:10 UTC (permalink / raw)
  To: netdev; +Cc: davem, kuba, dsahern, roopa, mlxsw, Ido Schimmel

From: Ido Schimmel <idosch@nvidia.com>

The flag indicates to user space that the nexthop is not programmed to
forward packets in hardware, but rather to trap them.

The flag will be used in subsequent patches by netdevsim to test nexthop
objects programming to device drivers and in the future by mlxsw as
well.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 include/uapi/linux/rtnetlink.h | 6 ++++--
 net/ipv4/fib_semantics.c       | 2 ++
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h
index 9b814c92de12..0ca2057d3269 100644
--- a/include/uapi/linux/rtnetlink.h
+++ b/include/uapi/linux/rtnetlink.h
@@ -396,11 +396,13 @@ struct rtnexthop {
 #define RTNH_F_DEAD		1	/* Nexthop is dead (used by multipath)	*/
 #define RTNH_F_PERVASIVE	2	/* Do recursive gateway lookup	*/
 #define RTNH_F_ONLINK		4	/* Gateway is forced on link	*/
-#define RTNH_F_OFFLOAD		8	/* offloaded route */
+#define RTNH_F_OFFLOAD		8	/* Nexthop is offloaded */
 #define RTNH_F_LINKDOWN		16	/* carrier-down on nexthop */
 #define RTNH_F_UNRESOLVED	32	/* The entry is unresolved (ipmr) */
+#define RTNH_F_TRAP		64	/* Nexthop is trapping packets */
 
-#define RTNH_COMPARE_MASK	(RTNH_F_DEAD | RTNH_F_LINKDOWN | RTNH_F_OFFLOAD)
+#define RTNH_COMPARE_MASK	(RTNH_F_DEAD | RTNH_F_LINKDOWN | \
+				 RTNH_F_OFFLOAD | RTNH_F_TRAP)
 
 /* Macros to handle hexthops */
 
diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index 1f75dc686b6b..f70b9a0c4957 100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -1644,6 +1644,8 @@ int fib_nexthop_info(struct sk_buff *skb, const struct fib_nh_common *nhc,
 	*flags |= (nhc->nhc_flags & RTNH_F_ONLINK);
 	if (nhc->nhc_flags & RTNH_F_OFFLOAD)
 		*flags |= RTNH_F_OFFLOAD;
+	if (nhc->nhc_flags & RTNH_F_TRAP)
+		*flags |= RTNH_F_TRAP;
 
 	if (!skip_oif && nhc->nhc_dev &&
 	    nla_put_u32(skb, RTA_OIF, nhc->nhc_dev->ifindex))
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [RFC PATCH net-next 10/22] nexthop: Allow setting "offload" and "trap" indications on nexthops
  2020-09-08  9:10 [RFC PATCH net-next 00/22] nexthop: Add support for nexthop objects offload Ido Schimmel
                   ` (8 preceding siblings ...)
  2020-09-08  9:10 ` [RFC PATCH net-next 09/22] rtnetlink: Add RTNH_F_TRAP flag Ido Schimmel
@ 2020-09-08  9:10 ` Ido Schimmel
  2020-09-08 15:14   ` David Ahern
  2020-09-08  9:10 ` [RFC PATCH net-next 11/22] nexthop: Emit a notification when a nexthop is added Ido Schimmel
                   ` (11 subsequent siblings)
  21 siblings, 1 reply; 56+ messages in thread
From: Ido Schimmel @ 2020-09-08  9:10 UTC (permalink / raw)
  To: netdev; +Cc: davem, kuba, dsahern, roopa, mlxsw, Ido Schimmel

From: Ido Schimmel <idosch@nvidia.com>

Add a function that can be called by device drivers to set "offload" or
"trap" indication on nexthops following nexthop notifications.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 include/net/nexthop.h |  1 +
 net/ipv4/nexthop.c    | 21 +++++++++++++++++++++
 2 files changed, 22 insertions(+)

diff --git a/include/net/nexthop.h b/include/net/nexthop.h
index 0bde1aa867c0..4147681e86d2 100644
--- a/include/net/nexthop.h
+++ b/include/net/nexthop.h
@@ -146,6 +146,7 @@ struct nh_notifier_info {
 
 int register_nexthop_notifier(struct net *net, struct notifier_block *nb);
 int unregister_nexthop_notifier(struct net *net, struct notifier_block *nb);
+void nexthop_hw_flags_set(struct net *net, u32 id, bool offload, bool trap);
 
 /* caller is holding rcu or rtnl; no reference taken to nexthop */
 struct nexthop *nexthop_find_by_id(struct net *net, u32 id);
diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
index 70c8ab6906ec..71605c612458 100644
--- a/net/ipv4/nexthop.c
+++ b/net/ipv4/nexthop.c
@@ -2080,6 +2080,27 @@ int unregister_nexthop_notifier(struct net *net, struct notifier_block *nb)
 }
 EXPORT_SYMBOL(unregister_nexthop_notifier);
 
+void nexthop_hw_flags_set(struct net *net, u32 id, bool offload, bool trap)
+{
+	struct nexthop *nexthop;
+
+	rcu_read_lock();
+
+	nexthop = nexthop_find_by_id(net, id);
+	if (!nexthop)
+		goto out;
+
+	nexthop->nh_flags &= ~(RTNH_F_OFFLOAD | RTNH_F_TRAP);
+	if (offload)
+		nexthop->nh_flags |= RTNH_F_OFFLOAD;
+	if (trap)
+		nexthop->nh_flags |= RTNH_F_TRAP;
+
+out:
+	rcu_read_unlock();
+}
+EXPORT_SYMBOL(nexthop_hw_flags_set);
+
 static void __net_exit nexthop_net_exit(struct net *net)
 {
 	rtnl_lock();
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [RFC PATCH net-next 11/22] nexthop: Emit a notification when a nexthop is added
  2020-09-08  9:10 [RFC PATCH net-next 00/22] nexthop: Add support for nexthop objects offload Ido Schimmel
                   ` (9 preceding siblings ...)
  2020-09-08  9:10 ` [RFC PATCH net-next 10/22] nexthop: Allow setting "offload" and "trap" indications on nexthops Ido Schimmel
@ 2020-09-08  9:10 ` Ido Schimmel
  2020-09-08 15:21   ` David Ahern
  2020-09-08  9:10 ` [RFC PATCH net-next 12/22] nexthop: Emit a notification when a nexthop group is replaced Ido Schimmel
                   ` (10 subsequent siblings)
  21 siblings, 1 reply; 56+ messages in thread
From: Ido Schimmel @ 2020-09-08  9:10 UTC (permalink / raw)
  To: netdev; +Cc: davem, kuba, dsahern, roopa, mlxsw, Ido Schimmel

From: Ido Schimmel <idosch@nvidia.com>

Emit a notification in the nexthop notification chain when a new nexthop
is added (not replaced). The nexthop can either be a new group or a
single nexthop.

The notification is sent after the nexthop is inserted into the
red-black tree, as listeners might need to callback into the nexthop
code with the nexthop ID in order to mark the nexthop as offloaded.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 include/net/nexthop.h | 3 ++-
 net/ipv4/nexthop.c    | 6 +++++-
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/include/net/nexthop.h b/include/net/nexthop.h
index 4147681e86d2..6431ff8cdb89 100644
--- a/include/net/nexthop.h
+++ b/include/net/nexthop.h
@@ -106,7 +106,8 @@ struct nexthop {
 
 enum nexthop_event_type {
 	NEXTHOP_EVENT_ADD,
-	NEXTHOP_EVENT_DEL
+	NEXTHOP_EVENT_DEL,
+	NEXTHOP_EVENT_REPLACE,
 };
 
 struct nh_notifier_single_info {
diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
index 71605c612458..1fa249facd46 100644
--- a/net/ipv4/nexthop.c
+++ b/net/ipv4/nexthop.c
@@ -1277,7 +1277,11 @@ static int insert_nexthop(struct net *net, struct nexthop *new_nh,
 
 	rb_link_node_rcu(&new_nh->rb_node, parent, pp);
 	rb_insert_color(&new_nh->rb_node, root);
-	rc = 0;
+
+	rc = call_nexthop_notifiers(net, NEXTHOP_EVENT_REPLACE, new_nh, extack);
+	if (rc)
+		rb_erase(&new_nh->rb_node, &net->nexthop.rb_root);
+
 out:
 	if (!rc) {
 		nh_base_seq_inc(net);
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [RFC PATCH net-next 12/22] nexthop: Emit a notification when a nexthop group is replaced
  2020-09-08  9:10 [RFC PATCH net-next 00/22] nexthop: Add support for nexthop objects offload Ido Schimmel
                   ` (10 preceding siblings ...)
  2020-09-08  9:10 ` [RFC PATCH net-next 11/22] nexthop: Emit a notification when a nexthop is added Ido Schimmel
@ 2020-09-08  9:10 ` Ido Schimmel
  2020-09-08 15:22   ` David Ahern
  2020-09-08  9:10 ` [RFC PATCH net-next 13/22] nexthop: Emit a notification when a single nexthop " Ido Schimmel
                   ` (9 subsequent siblings)
  21 siblings, 1 reply; 56+ messages in thread
From: Ido Schimmel @ 2020-09-08  9:10 UTC (permalink / raw)
  To: netdev; +Cc: davem, kuba, dsahern, roopa, mlxsw, Ido Schimmel

From: Ido Schimmel <idosch@nvidia.com>

Emit a notification in the nexthop notification chain when an existing
nexthop group is replaced.

The notification is emitted after all the validation checks were
performed, but before the new configuration (i.e., 'struct nh_grp') is
pointed at by the old shell (i.e., 'struct nexthop'). This prevents the
need to perform rollback in case the notification is vetoed.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 net/ipv4/nexthop.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
index 1fa249facd46..a60a519a5462 100644
--- a/net/ipv4/nexthop.c
+++ b/net/ipv4/nexthop.c
@@ -1049,13 +1049,17 @@ static int replace_nexthop_grp(struct net *net, struct nexthop *old,
 			       struct netlink_ext_ack *extack)
 {
 	struct nh_group *oldg, *newg;
-	int i;
+	int i, err;
 
 	if (!new->is_group) {
 		NL_SET_ERR_MSG(extack, "Can not replace a nexthop group with a nexthop.");
 		return -EINVAL;
 	}
 
+	err = call_nexthop_notifiers(net, NEXTHOP_EVENT_REPLACE, new, extack);
+	if (err)
+		return err;
+
 	oldg = rtnl_dereference(old->nh_grp);
 	newg = rtnl_dereference(new->nh_grp);
 
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [RFC PATCH net-next 13/22] nexthop: Emit a notification when a single nexthop is replaced
  2020-09-08  9:10 [RFC PATCH net-next 00/22] nexthop: Add support for nexthop objects offload Ido Schimmel
                   ` (11 preceding siblings ...)
  2020-09-08  9:10 ` [RFC PATCH net-next 12/22] nexthop: Emit a notification when a nexthop group is replaced Ido Schimmel
@ 2020-09-08  9:10 ` Ido Schimmel
  2020-09-08 15:25   ` David Ahern
  2020-09-08  9:10 ` [RFC PATCH net-next 14/22] nexthop: Emit a notification when a nexthop group is modified Ido Schimmel
                   ` (8 subsequent siblings)
  21 siblings, 1 reply; 56+ messages in thread
From: Ido Schimmel @ 2020-09-08  9:10 UTC (permalink / raw)
  To: netdev; +Cc: davem, kuba, dsahern, roopa, mlxsw, Ido Schimmel

From: Ido Schimmel <idosch@nvidia.com>

The notification is emitted after all the validation checks were
performed, but before the new configuration (i.e., 'struct nh_info') is
pointed at by the old shell (i.e., 'struct nexthop'). This prevents the
need to perform rollback in case the notification is vetoed.

The next patch will also emit a replace notification for all the nexthop
groups in which the nexthop is used.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 net/ipv4/nexthop.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
index a60a519a5462..b8a4abc00146 100644
--- a/net/ipv4/nexthop.c
+++ b/net/ipv4/nexthop.c
@@ -1099,12 +1099,22 @@ static int replace_nexthop_single(struct net *net, struct nexthop *old,
 				  struct netlink_ext_ack *extack)
 {
 	struct nh_info *oldi, *newi;
+	int err;
 
 	if (new->is_group) {
 		NL_SET_ERR_MSG(extack, "Can not replace a nexthop with a nexthop group.");
 		return -EINVAL;
 	}
 
+	err = call_nexthop_notifiers(net, NEXTHOP_EVENT_REPLACE, new, extack);
+	if (err)
+		return err;
+
+	/* Hardware flags were set on 'old' as 'new' is not in the red-black
+	 * tree. Therefore, inherit the flags from 'old' to 'new'.
+	 */
+	new->nh_flags |= old->nh_flags & (RTNH_F_OFFLOAD | RTNH_F_TRAP);
+
 	oldi = rtnl_dereference(old->nh_info);
 	newi = rtnl_dereference(new->nh_info);
 
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [RFC PATCH net-next 14/22] nexthop: Emit a notification when a nexthop group is modified
  2020-09-08  9:10 [RFC PATCH net-next 00/22] nexthop: Add support for nexthop objects offload Ido Schimmel
                   ` (12 preceding siblings ...)
  2020-09-08  9:10 ` [RFC PATCH net-next 13/22] nexthop: Emit a notification when a single nexthop " Ido Schimmel
@ 2020-09-08  9:10 ` Ido Schimmel
  2020-09-08 15:29   ` David Ahern
  2020-09-08  9:10 ` [RFC PATCH net-next 15/22] nexthop: Emit a notification when a nexthop group is reduced Ido Schimmel
                   ` (7 subsequent siblings)
  21 siblings, 1 reply; 56+ messages in thread
From: Ido Schimmel @ 2020-09-08  9:10 UTC (permalink / raw)
  To: netdev; +Cc: davem, kuba, dsahern, roopa, mlxsw, Ido Schimmel

From: Ido Schimmel <idosch@nvidia.com>

When a single nexthop is replaced, the configuration of all the groups
using the nexthop is effectively modified. In this case, emit a
notification in the nexthop notification chain for each modified group
so that listeners would not need to keep track of which nexthops are
member in which groups.

The notification can only be emitted after the new configuration (i.e.,
'struct nh_info') is pointed at by the old shell (i.e., 'struct
nexthop'). Before that the configuration of the nexthop groups is still
the same as before the replacement.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 net/ipv4/nexthop.c | 32 ++++++++++++++++++++++++++++++--
 1 file changed, 30 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
index b8a4abc00146..0edc3e73d416 100644
--- a/net/ipv4/nexthop.c
+++ b/net/ipv4/nexthop.c
@@ -1098,7 +1098,9 @@ static int replace_nexthop_single(struct net *net, struct nexthop *old,
 				  struct nexthop *new,
 				  struct netlink_ext_ack *extack)
 {
+	u8 old_protocol, old_nh_flags;
 	struct nh_info *oldi, *newi;
+	struct nh_grp_entry *nhge;
 	int err;
 
 	if (new->is_group) {
@@ -1121,18 +1123,29 @@ static int replace_nexthop_single(struct net *net, struct nexthop *old,
 	newi->nh_parent = old;
 	oldi->nh_parent = new;
 
+	old_protocol = old->protocol;
+	old_nh_flags = old->nh_flags;
+
 	old->protocol = new->protocol;
 	old->nh_flags = new->nh_flags;
 
 	rcu_assign_pointer(old->nh_info, newi);
 	rcu_assign_pointer(new->nh_info, oldi);
 
+	/* Send a replace notification for all the groups using the nexthop. */
+	list_for_each_entry(nhge, &old->grp_list, nh_list) {
+		struct nexthop *nhp = nhge->nh_parent;
+
+		err = call_nexthop_notifiers(net, NEXTHOP_EVENT_REPLACE, nhp,
+					     extack);
+		if (err)
+			goto err_notify;
+	}
+
 	/* When replacing an IPv4 nexthop with an IPv6 nexthop, potentially
 	 * update IPv4 indication in all the groups using the nexthop.
 	 */
 	if (oldi->family == AF_INET && newi->family == AF_INET6) {
-		struct nh_grp_entry *nhge;
-
 		list_for_each_entry(nhge, &old->grp_list, nh_list) {
 			struct nexthop *nhp = nhge->nh_parent;
 			struct nh_group *nhg;
@@ -1143,6 +1156,21 @@ static int replace_nexthop_single(struct net *net, struct nexthop *old,
 	}
 
 	return 0;
+
+err_notify:
+	rcu_assign_pointer(new->nh_info, newi);
+	rcu_assign_pointer(old->nh_info, oldi);
+	old->nh_flags = old_nh_flags;
+	old->protocol = old_protocol;
+	oldi->nh_parent = old;
+	newi->nh_parent = new;
+	list_for_each_entry_continue_reverse(nhge, &old->grp_list, nh_list) {
+		struct nexthop *nhp = nhge->nh_parent;
+
+		call_nexthop_notifiers(net, NEXTHOP_EVENT_REPLACE, nhp, extack);
+	}
+	call_nexthop_notifiers(net, NEXTHOP_EVENT_REPLACE, old, extack);
+	return err;
 }
 
 static void __nexthop_replace_notify(struct net *net, struct nexthop *nh,
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [RFC PATCH net-next 15/22] nexthop: Emit a notification when a nexthop group is reduced
  2020-09-08  9:10 [RFC PATCH net-next 00/22] nexthop: Add support for nexthop objects offload Ido Schimmel
                   ` (13 preceding siblings ...)
  2020-09-08  9:10 ` [RFC PATCH net-next 14/22] nexthop: Emit a notification when a nexthop group is modified Ido Schimmel
@ 2020-09-08  9:10 ` Ido Schimmel
  2020-09-08 15:33   ` David Ahern
  2020-09-08  9:10 ` [RFC PATCH net-next 16/22] nexthop: Pass extack to register_nexthop_notifier() Ido Schimmel
                   ` (6 subsequent siblings)
  21 siblings, 1 reply; 56+ messages in thread
From: Ido Schimmel @ 2020-09-08  9:10 UTC (permalink / raw)
  To: netdev; +Cc: davem, kuba, dsahern, roopa, mlxsw, Ido Schimmel

From: Ido Schimmel <idosch@nvidia.com>

When a single nexthop is deleted, the configuration of all the groups
using the nexthop is effectively modified. In this case, emit a
notification in the nexthop notification chain for each modified group
so that listeners would not need to keep track of which nexthops are
member in which groups.

In the rare cases where the notification fails, emit an error to the
kernel log.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 net/ipv4/nexthop.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
index 0edc3e73d416..33f611bbce1f 100644
--- a/net/ipv4/nexthop.c
+++ b/net/ipv4/nexthop.c
@@ -893,7 +893,7 @@ static void remove_nh_grp_entry(struct net *net, struct nh_grp_entry *nhge,
 	struct nexthop *nhp = nhge->nh_parent;
 	struct nexthop *nh = nhge->nh;
 	struct nh_group *nhg, *newg;
-	int i, j;
+	int i, j, err;
 
 	WARN_ON(!nh);
 
@@ -941,6 +941,10 @@ static void remove_nh_grp_entry(struct net *net, struct nh_grp_entry *nhge,
 	list_del(&nhge->nh_list);
 	nexthop_put(nhge->nh);
 
+	err = call_nexthop_notifiers(net, NEXTHOP_EVENT_REPLACE, nhp, NULL);
+	if (err)
+		pr_err("Failed to replace nexthop group after nexthop deletion\n");
+
 	if (nlinfo)
 		nexthop_notify(RTM_NEWNEXTHOP, nhp, nlinfo);
 }
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [RFC PATCH net-next 16/22] nexthop: Pass extack to register_nexthop_notifier()
  2020-09-08  9:10 [RFC PATCH net-next 00/22] nexthop: Add support for nexthop objects offload Ido Schimmel
                   ` (14 preceding siblings ...)
  2020-09-08  9:10 ` [RFC PATCH net-next 15/22] nexthop: Emit a notification when a nexthop group is reduced Ido Schimmel
@ 2020-09-08  9:10 ` Ido Schimmel
  2020-09-08 15:34   ` David Ahern
  2020-09-08  9:10 ` [RFC PATCH net-next 17/22] nexthop: Replay nexthops when registering a notifier Ido Schimmel
                   ` (5 subsequent siblings)
  21 siblings, 1 reply; 56+ messages in thread
From: Ido Schimmel @ 2020-09-08  9:10 UTC (permalink / raw)
  To: netdev; +Cc: davem, kuba, dsahern, roopa, mlxsw, Ido Schimmel

From: Ido Schimmel <idosch@nvidia.com>

This will be used by the next patch which extends the function to replay
all the existing nexthops to the notifier block being registered.

Device drivers will be able to pass extack to the function since it is
passed to them upon reload from devlink.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 drivers/net/vxlan.c   | 3 ++-
 include/net/nexthop.h | 3 ++-
 net/ipv4/nexthop.c    | 3 ++-
 3 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 29deedee6ef4..a850b39dd432 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -4717,7 +4717,8 @@ static __net_init int vxlan_init_net(struct net *net)
 	for (h = 0; h < PORT_HASH_SIZE; ++h)
 		INIT_HLIST_HEAD(&vn->sock_list[h]);
 
-	return register_nexthop_notifier(net, &vxlan_nexthop_notifier_block);
+	return register_nexthop_notifier(net, &vxlan_nexthop_notifier_block,
+					 NULL);
 }
 
 static void vxlan_destroy_tunnels(struct net *net, struct list_head *head)
diff --git a/include/net/nexthop.h b/include/net/nexthop.h
index 6431ff8cdb89..c2dd20946825 100644
--- a/include/net/nexthop.h
+++ b/include/net/nexthop.h
@@ -145,7 +145,8 @@ struct nh_notifier_info {
 	};
 };
 
-int register_nexthop_notifier(struct net *net, struct notifier_block *nb);
+int register_nexthop_notifier(struct net *net, struct notifier_block *nb,
+			      struct netlink_ext_ack *extack);
 int unregister_nexthop_notifier(struct net *net, struct notifier_block *nb);
 void nexthop_hw_flags_set(struct net *net, u32 id, bool offload, bool trap);
 
diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
index 33f611bbce1f..b40c343ca969 100644
--- a/net/ipv4/nexthop.c
+++ b/net/ipv4/nexthop.c
@@ -2116,7 +2116,8 @@ static struct notifier_block nh_netdev_notifier = {
 	.notifier_call = nh_netdev_event,
 };
 
-int register_nexthop_notifier(struct net *net, struct notifier_block *nb)
+int register_nexthop_notifier(struct net *net, struct notifier_block *nb,
+			      struct netlink_ext_ack *extack)
 {
 	return blocking_notifier_chain_register(&net->nexthop.notifier_chain,
 						nb);
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [RFC PATCH net-next 17/22] nexthop: Replay nexthops when registering a notifier
  2020-09-08  9:10 [RFC PATCH net-next 00/22] nexthop: Add support for nexthop objects offload Ido Schimmel
                   ` (15 preceding siblings ...)
  2020-09-08  9:10 ` [RFC PATCH net-next 16/22] nexthop: Pass extack to register_nexthop_notifier() Ido Schimmel
@ 2020-09-08  9:10 ` Ido Schimmel
  2020-09-08 15:37   ` David Ahern
  2020-09-08  9:10 ` [RFC PATCH net-next 18/22] nexthop: Remove in-kernel route notifications when nexthop changes Ido Schimmel
                   ` (4 subsequent siblings)
  21 siblings, 1 reply; 56+ messages in thread
From: Ido Schimmel @ 2020-09-08  9:10 UTC (permalink / raw)
  To: netdev; +Cc: davem, kuba, dsahern, roopa, mlxsw, Ido Schimmel

From: Ido Schimmel <idosch@nvidia.com>

When registering a new notifier to the nexthop notification chain,
replay all the existing nexthops to the new notifier so that it will
have a complete picture of the available nexthops.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 net/ipv4/nexthop.c | 54 ++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 52 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
index b40c343ca969..6505a0a28df2 100644
--- a/net/ipv4/nexthop.c
+++ b/net/ipv4/nexthop.c
@@ -156,6 +156,27 @@ static int call_nexthop_notifiers(struct net *net,
 	return notifier_to_errno(err);
 }
 
+static int call_nexthop_notifier(struct notifier_block *nb, struct net *net,
+				 enum nexthop_event_type event_type,
+				 struct nexthop *nh,
+				 struct netlink_ext_ack *extack)
+{
+	struct nh_notifier_info info = {
+		.net = net,
+		.extack = extack,
+	};
+	int err;
+
+	err = nh_notifier_info_init(&info, nh);
+	if (err)
+		return err;
+
+	err = nb->notifier_call(nb, event_type, &info);
+	nh_notifier_info_fini(&info);
+
+	return notifier_to_errno(err);
+}
+
 static unsigned int nh_dev_hashfn(unsigned int val)
 {
 	unsigned int mask = NH_DEV_HASHSIZE - 1;
@@ -2116,11 +2137,40 @@ static struct notifier_block nh_netdev_notifier = {
 	.notifier_call = nh_netdev_event,
 };
 
+static int nexthops_dump(struct net *net, struct notifier_block *nb,
+			 struct netlink_ext_ack *extack)
+{
+	struct rb_root *root = &net->nexthop.rb_root;
+	struct rb_node *node;
+	int err = 0;
+
+	for (node = rb_first(root); node; node = rb_next(node)) {
+		struct nexthop *nh;
+
+		nh = rb_entry(node, struct nexthop, rb_node);
+		err = call_nexthop_notifier(nb, net, NEXTHOP_EVENT_REPLACE, nh,
+					    extack);
+		if (err)
+			break;
+	}
+
+	return err;
+}
+
 int register_nexthop_notifier(struct net *net, struct notifier_block *nb,
 			      struct netlink_ext_ack *extack)
 {
-	return blocking_notifier_chain_register(&net->nexthop.notifier_chain,
-						nb);
+	int err;
+
+	rtnl_lock();
+	err = nexthops_dump(net, nb, extack);
+	if (err)
+		goto unlock;
+	err = blocking_notifier_chain_register(&net->nexthop.notifier_chain,
+					       nb);
+unlock:
+	rtnl_unlock();
+	return err;
 }
 EXPORT_SYMBOL(register_nexthop_notifier);
 
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [RFC PATCH net-next 18/22] nexthop: Remove in-kernel route notifications when nexthop changes
  2020-09-08  9:10 [RFC PATCH net-next 00/22] nexthop: Add support for nexthop objects offload Ido Schimmel
                   ` (16 preceding siblings ...)
  2020-09-08  9:10 ` [RFC PATCH net-next 17/22] nexthop: Replay nexthops when registering a notifier Ido Schimmel
@ 2020-09-08  9:10 ` Ido Schimmel
  2020-09-08 15:38   ` David Ahern
  2020-09-08  9:10 ` [RFC PATCH net-next 19/22] netdevsim: Add devlink resource for nexthops Ido Schimmel
                   ` (3 subsequent siblings)
  21 siblings, 1 reply; 56+ messages in thread
From: Ido Schimmel @ 2020-09-08  9:10 UTC (permalink / raw)
  To: netdev; +Cc: davem, kuba, dsahern, roopa, mlxsw, Ido Schimmel

From: Ido Schimmel <idosch@nvidia.com>

Remove in-kernel route notifications when the configuration of their
nexthop changes.

These notifications are unnecessary because the route still uses the
same nexthop ID. A separate notification for the nexthop change itself
is now sent in the nexthop notification chain.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 net/ipv4/fib_trie.c | 9 ---------
 net/ipv6/route.c    | 5 -----
 2 files changed, 14 deletions(-)

diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
index ffc5332f1390..28117c05dc35 100644
--- a/net/ipv4/fib_trie.c
+++ b/net/ipv4/fib_trie.c
@@ -2100,15 +2100,6 @@ static void __fib_info_notify_update(struct net *net, struct fib_table *tb,
 			rtmsg_fib(RTM_NEWROUTE, htonl(n->key), fa,
 				  KEYLENGTH - fa->fa_slen, tb->tb_id,
 				  info, NLM_F_REPLACE);
-
-			/* call_fib_entry_notifiers will be removed when
-			 * in-kernel notifier is implemented and supported
-			 * for nexthop objects
-			 */
-			call_fib_entry_notifiers(net, FIB_EVENT_ENTRY_REPLACE,
-						 n->key,
-						 KEYLENGTH - fa->fa_slen, fa,
-						 NULL);
 		}
 	}
 }
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 5e7e25e2523a..6fb4a01edf87 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -6037,11 +6037,6 @@ void fib6_rt_update(struct net *net, struct fib6_info *rt,
 	struct sk_buff *skb;
 	int err = -ENOBUFS;
 
-	/* call_fib6_entry_notifiers will be removed when in-kernel notifier
-	 * is implemented and supported for nexthop objects
-	 */
-	call_fib6_entry_notifiers(net, FIB_EVENT_ENTRY_REPLACE, rt, NULL);
-
 	skb = nlmsg_new(rt6_nlmsg_size(rt), gfp_any());
 	if (!skb)
 		goto errout;
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [RFC PATCH net-next 19/22] netdevsim: Add devlink resource for nexthops
  2020-09-08  9:10 [RFC PATCH net-next 00/22] nexthop: Add support for nexthop objects offload Ido Schimmel
                   ` (17 preceding siblings ...)
  2020-09-08  9:10 ` [RFC PATCH net-next 18/22] nexthop: Remove in-kernel route notifications when nexthop changes Ido Schimmel
@ 2020-09-08  9:10 ` Ido Schimmel
  2020-09-08  9:10 ` [RFC PATCH net-next 20/22] netdevsim: Add dummy implementation for nexthop offload Ido Schimmel
                   ` (2 subsequent siblings)
  21 siblings, 0 replies; 56+ messages in thread
From: Ido Schimmel @ 2020-09-08  9:10 UTC (permalink / raw)
  To: netdev; +Cc: davem, kuba, dsahern, roopa, mlxsw, Ido Schimmel

From: Ido Schimmel <idosch@nvidia.com>

The Spectrum ASIC has a dedicated table where nexthops (i.e., adjacency
entries) are populated. The size of this table can be controlled via
devlink-resource.

Add such a resource to netdevsim so that its occupancy will reflect the
number of nexthop objects currently programmed to the device.

By limiting the size of the resource, error paths could be exercised and
tested.

Example output:

# devlink resource show netdevsim/netdevsim10
netdevsim/netdevsim10:
  name IPv4 size unlimited unit entry size_min 0 size_max unlimited size_gran 1 dpipe_tables none
    resources:
      name fib size unlimited occ 4 unit entry size_min 0 size_max unlimited size_gran 1 dpipe_tables none
      name fib-rules size unlimited occ 3 unit entry size_min 0 size_max unlimited size_gran 1 dpipe_tables none
  name IPv6 size unlimited unit entry size_min 0 size_max unlimited size_gran 1 dpipe_tables none
    resources:
      name fib size unlimited occ 1 unit entry size_min 0 size_max unlimited size_gran 1 dpipe_tables none
      name fib-rules size unlimited occ 2 unit entry size_min 0 size_max unlimited size_gran 1 dpipe_tables none
  name nexthops size unlimited occ 0 unit entry size_min 0 size_max unlimited size_gran 1 dpipe_tables none

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 .../networking/devlink/netdevsim.rst          |  3 ++-
 drivers/net/netdevsim/dev.c                   |  6 +++++
 drivers/net/netdevsim/fib.c                   | 23 ++++++++++++++++++-
 drivers/net/netdevsim/netdevsim.h             |  1 +
 4 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/Documentation/networking/devlink/netdevsim.rst b/Documentation/networking/devlink/netdevsim.rst
index 2a266b7e7b38..02c2d20dc673 100644
--- a/Documentation/networking/devlink/netdevsim.rst
+++ b/Documentation/networking/devlink/netdevsim.rst
@@ -46,7 +46,7 @@ Resources
 =========
 
 The ``netdevsim`` driver exposes resources to control the number of FIB
-entries and FIB rule entries that the driver will allow.
+entries, FIB rule entries and nexthops that the driver will allow.
 
 .. code:: shell
 
@@ -54,6 +54,7 @@ entries and FIB rule entries that the driver will allow.
     $ devlink resource set netdevsim/netdevsim0 path /IPv4/fib-rules size 16
     $ devlink resource set netdevsim/netdevsim0 path /IPv6/fib size 64
     $ devlink resource set netdevsim/netdevsim0 path /IPv6/fib-rules size 16
+    $ devlink resource set netdevsim/netdevsim0 path /nexthops size 16
     $ devlink dev reload netdevsim/netdevsim0
 
 Driver-specific Traps
diff --git a/drivers/net/netdevsim/dev.c b/drivers/net/netdevsim/dev.c
index 32f339fedb21..d4f3a1eea800 100644
--- a/drivers/net/netdevsim/dev.c
+++ b/drivers/net/netdevsim/dev.c
@@ -320,6 +320,12 @@ static int nsim_dev_resources_register(struct devlink *devlink)
 		return err;
 	}
 
+	/* Resources for nexthops */
+	err = devlink_resource_register(devlink, "nexthops", (u64)-1,
+					NSIM_RESOURCE_NEXTHOPS,
+					DEVLINK_RESOURCE_ID_PARENT_TOP,
+					&params);
+
 out:
 	return err;
 }
diff --git a/drivers/net/netdevsim/fib.c b/drivers/net/netdevsim/fib.c
index deea17a0e79c..3ec0f8896efe 100644
--- a/drivers/net/netdevsim/fib.c
+++ b/drivers/net/netdevsim/fib.c
@@ -42,6 +42,7 @@ struct nsim_fib_data {
 	struct notifier_block fib_nb;
 	struct nsim_per_fib_data ipv4;
 	struct nsim_per_fib_data ipv6;
+	struct nsim_fib_entry nexthops;
 	struct rhashtable fib_rt_ht;
 	struct list_head fib_rt_list;
 	spinlock_t fib_lock;	/* Protects hashtable, list and accounting */
@@ -104,6 +105,9 @@ u64 nsim_fib_get_val(struct nsim_fib_data *fib_data,
 	case NSIM_RESOURCE_IPV6_FIB_RULES:
 		entry = &fib_data->ipv6.rules;
 		break;
+	case NSIM_RESOURCE_NEXTHOPS:
+		entry = &fib_data->nexthops;
+		break;
 	default:
 		return 0;
 	}
@@ -129,6 +133,9 @@ static void nsim_fib_set_max(struct nsim_fib_data *fib_data,
 	case NSIM_RESOURCE_IPV6_FIB_RULES:
 		entry = &fib_data->ipv6.rules;
 		break;
+	case NSIM_RESOURCE_NEXTHOPS:
+		entry = &fib_data->nexthops;
+		break;
 	default:
 		WARN_ON(1);
 		return;
@@ -866,12 +873,20 @@ static u64 nsim_fib_ipv6_rules_res_occ_get(void *priv)
 	return nsim_fib_get_val(data, NSIM_RESOURCE_IPV6_FIB_RULES, false);
 }
 
+static u64 nsim_fib_nexthops_res_occ_get(void *priv)
+{
+	struct nsim_fib_data *data = priv;
+
+	return nsim_fib_get_val(data, NSIM_RESOURCE_NEXTHOPS, false);
+}
+
 static void nsim_fib_set_max_all(struct nsim_fib_data *data,
 				 struct devlink *devlink)
 {
 	enum nsim_resource_id res_ids[] = {
 		NSIM_RESOURCE_IPV4_FIB, NSIM_RESOURCE_IPV4_FIB_RULES,
-		NSIM_RESOURCE_IPV6_FIB, NSIM_RESOURCE_IPV6_FIB_RULES
+		NSIM_RESOURCE_IPV6_FIB, NSIM_RESOURCE_IPV6_FIB_RULES,
+		NSIM_RESOURCE_NEXTHOPS,
 	};
 	int i;
 
@@ -929,6 +944,10 @@ struct nsim_fib_data *nsim_fib_create(struct devlink *devlink,
 					  NSIM_RESOURCE_IPV6_FIB_RULES,
 					  nsim_fib_ipv6_rules_res_occ_get,
 					  data);
+	devlink_resource_occ_get_register(devlink,
+					  NSIM_RESOURCE_NEXTHOPS,
+					  nsim_fib_nexthops_res_occ_get,
+					  data);
 	return data;
 
 err_rhashtable_destroy:
@@ -941,6 +960,8 @@ struct nsim_fib_data *nsim_fib_create(struct devlink *devlink,
 
 void nsim_fib_destroy(struct devlink *devlink, struct nsim_fib_data *data)
 {
+	devlink_resource_occ_get_unregister(devlink,
+					    NSIM_RESOURCE_NEXTHOPS);
 	devlink_resource_occ_get_unregister(devlink,
 					    NSIM_RESOURCE_IPV6_FIB_RULES);
 	devlink_resource_occ_get_unregister(devlink,
diff --git a/drivers/net/netdevsim/netdevsim.h b/drivers/net/netdevsim/netdevsim.h
index 284f7092241d..5e01169da01f 100644
--- a/drivers/net/netdevsim/netdevsim.h
+++ b/drivers/net/netdevsim/netdevsim.h
@@ -145,6 +145,7 @@ enum nsim_resource_id {
 	NSIM_RESOURCE_IPV6,
 	NSIM_RESOURCE_IPV6_FIB,
 	NSIM_RESOURCE_IPV6_FIB_RULES,
+	NSIM_RESOURCE_NEXTHOPS,
 };
 
 struct nsim_dev_health {
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [RFC PATCH net-next 20/22] netdevsim: Add dummy implementation for nexthop offload
  2020-09-08  9:10 [RFC PATCH net-next 00/22] nexthop: Add support for nexthop objects offload Ido Schimmel
                   ` (18 preceding siblings ...)
  2020-09-08  9:10 ` [RFC PATCH net-next 19/22] netdevsim: Add devlink resource for nexthops Ido Schimmel
@ 2020-09-08  9:10 ` Ido Schimmel
  2020-09-08  9:10 ` [RFC PATCH net-next 21/22] netdevsim: Allow programming routes with nexthop objects Ido Schimmel
  2020-09-08  9:10 ` [RFC PATCH net-next 22/22] selftests: netdevsim: Add test for nexthop offload API Ido Schimmel
  21 siblings, 0 replies; 56+ messages in thread
From: Ido Schimmel @ 2020-09-08  9:10 UTC (permalink / raw)
  To: netdev; +Cc: davem, kuba, dsahern, roopa, mlxsw, Ido Schimmel

From: Ido Schimmel <idosch@nvidia.com>

Implement dummy nexthop "offload" in the driver by storing currently
"programmed" nexthops in a hash table. Each nexthop in the hash table is
marked with "trap" indication and increments the nexthops resource
occupancy.

This will later allow us to test the nexthop offload API on top of
netdevsim.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 drivers/net/netdevsim/fib.c | 232 +++++++++++++++++++++++++++++++++++-
 1 file changed, 229 insertions(+), 3 deletions(-)

diff --git a/drivers/net/netdevsim/fib.c b/drivers/net/netdevsim/fib.c
index 3ec0f8896efe..196deed0aa97 100644
--- a/drivers/net/netdevsim/fib.c
+++ b/drivers/net/netdevsim/fib.c
@@ -25,6 +25,7 @@
 #include <net/ip6_fib.h>
 #include <net/fib_rules.h>
 #include <net/net_namespace.h>
+#include <net/nexthop.h>
 
 #include "netdevsim.h"
 
@@ -46,6 +47,8 @@ struct nsim_fib_data {
 	struct rhashtable fib_rt_ht;
 	struct list_head fib_rt_list;
 	spinlock_t fib_lock;	/* Protects hashtable, list and accounting */
+	struct notifier_block nexthop_nb;
+	struct rhashtable nexthop_ht;
 	struct devlink *devlink;
 };
 
@@ -87,6 +90,19 @@ static const struct rhashtable_params nsim_fib_rt_ht_params = {
 	.automatic_shrinking = true,
 };
 
+struct nsim_nexthop {
+	struct rhash_head ht_node;
+	u64 occ;
+	u32 id;
+};
+
+static const struct rhashtable_params nsim_nexthop_ht_params = {
+	.key_offset = offsetof(struct nsim_nexthop, id),
+	.head_offset = offsetof(struct nsim_nexthop, ht_node),
+	.key_len = sizeof(u32),
+	.automatic_shrinking = true,
+};
+
 u64 nsim_fib_get_val(struct nsim_fib_data *fib_data,
 		     enum nsim_resource_id res_id, bool max)
 {
@@ -845,6 +861,196 @@ static void nsim_fib_dump_inconsistent(struct notifier_block *nb)
 	data->ipv6.rules.num = 0ULL;
 }
 
+static struct nsim_nexthop *nsim_nexthop_create(struct nsim_fib_data *data,
+						struct nh_notifier_info *info)
+{
+	struct nsim_nexthop *nexthop;
+	u64 occ = 0;
+	int i;
+
+	nexthop = kzalloc(sizeof(*nexthop), GFP_KERNEL);
+	if (!nexthop)
+		return NULL;
+
+	nexthop->id = info->id;
+
+	/* Determine the number of nexthop entries the new nexthop will
+	 * occupy.
+	 */
+
+	if (!info->is_grp) {
+		occ = 1;
+		goto out;
+	}
+
+	for (i = 0; i < info->nh_grp->num_nh; i++)
+		occ += info->nh_grp->nh_entries[i].weight;
+
+out:
+	nexthop->occ = occ;
+	return nexthop;
+}
+
+static void nsim_nexthop_destroy(struct nsim_nexthop *nexthop)
+{
+	kfree(nexthop);
+}
+
+static int nsim_nexthop_account(struct nsim_fib_data *data, u64 occ,
+				bool add, struct netlink_ext_ack *extack)
+{
+	int err = 0;
+
+	if (add) {
+		if (data->nexthops.num + occ <= data->nexthops.max) {
+			data->nexthops.num += occ;
+		} else {
+			err = -ENOSPC;
+			NL_SET_ERR_MSG_MOD(extack, "Exceeded number of supported nexthops");
+		}
+	} else {
+		if (WARN_ON(occ > data->nexthops.num))
+			return -EINVAL;
+		data->nexthops.num -= occ;
+	}
+
+	return err;
+}
+
+static int nsim_nexthop_add(struct nsim_fib_data *data,
+			    struct nsim_nexthop *nexthop,
+			    struct netlink_ext_ack *extack)
+{
+	struct net *net = devlink_net(data->devlink);
+	int err;
+
+	err = nsim_nexthop_account(data, nexthop->occ, true, extack);
+	if (err)
+		return err;
+
+	err = rhashtable_insert_fast(&data->nexthop_ht, &nexthop->ht_node,
+				     nsim_nexthop_ht_params);
+	if (err) {
+		NL_SET_ERR_MSG_MOD(extack, "Failed to insert nexthop");
+		goto err_nexthop_dismiss;
+	}
+
+	nexthop_hw_flags_set(net, nexthop->id, false, true);
+
+	return 0;
+
+err_nexthop_dismiss:
+	nsim_nexthop_account(data, nexthop->occ, false, extack);
+	return err;
+}
+
+static int nsim_nexthop_replace(struct nsim_fib_data *data,
+				struct nsim_nexthop *nexthop,
+				struct nsim_nexthop *nexthop_old,
+				struct netlink_ext_ack *extack)
+{
+	struct net *net = devlink_net(data->devlink);
+	int err;
+
+	err = nsim_nexthop_account(data, nexthop->occ, true, extack);
+	if (err)
+		return err;
+
+	err = rhashtable_replace_fast(&data->nexthop_ht,
+				      &nexthop_old->ht_node, &nexthop->ht_node,
+				      nsim_nexthop_ht_params);
+	if (err) {
+		NL_SET_ERR_MSG_MOD(extack, "Failed to replace nexthop");
+		goto err_nexthop_dismiss;
+	}
+
+	nexthop_hw_flags_set(net, nexthop->id, false, true);
+	nsim_nexthop_account(data, nexthop_old->occ, false, extack);
+	nsim_nexthop_destroy(nexthop_old);
+
+	return 0;
+
+err_nexthop_dismiss:
+	nsim_nexthop_account(data, nexthop->occ, false, extack);
+	return err;
+}
+
+static int nsim_nexthop_insert(struct nsim_fib_data *data,
+			       struct nh_notifier_info *info)
+{
+	struct nsim_nexthop *nexthop, *nexthop_old;
+	int err;
+
+	nexthop = nsim_nexthop_create(data, info);
+	if (!nexthop)
+		return -ENOMEM;
+
+	nexthop_old = rhashtable_lookup_fast(&data->nexthop_ht, &info->id,
+					     nsim_nexthop_ht_params);
+	if (!nexthop_old)
+		err = nsim_nexthop_add(data, nexthop, info->extack);
+	else
+		err = nsim_nexthop_replace(data, nexthop, nexthop_old,
+					   info->extack);
+
+	if (err)
+		nsim_nexthop_destroy(nexthop);
+
+	return err;
+}
+
+static void nsim_nexthop_remove(struct nsim_fib_data *data,
+				struct nh_notifier_info *info)
+{
+	struct nsim_nexthop *nexthop;
+
+	nexthop = rhashtable_lookup_fast(&data->nexthop_ht, &info->id,
+					 nsim_nexthop_ht_params);
+	if (!nexthop)
+		return;
+
+	rhashtable_remove_fast(&data->nexthop_ht, &nexthop->ht_node,
+			       nsim_nexthop_ht_params);
+	nsim_nexthop_account(data, nexthop->occ, false, info->extack);
+	nsim_nexthop_destroy(nexthop);
+}
+
+static int nsim_nexthop_event_nb(struct notifier_block *nb, unsigned long event,
+				 void *ptr)
+{
+	struct nsim_fib_data *data = container_of(nb, struct nsim_fib_data,
+						  nexthop_nb);
+	struct nh_notifier_info *info = ptr;
+	int err = 0;
+
+	ASSERT_RTNL();
+
+	switch (event) {
+	case NEXTHOP_EVENT_REPLACE:
+		err = nsim_nexthop_insert(data, info);
+		break;
+	case NEXTHOP_EVENT_DEL:
+		nsim_nexthop_remove(data, info);
+		break;
+	default:
+		break;
+	}
+
+	return notifier_from_errno(err);
+}
+
+static void nsim_nexthop_free(void *ptr, void *arg)
+{
+	struct nsim_nexthop *nexthop = ptr;
+	struct nsim_fib_data *data = arg;
+	struct net *net;
+
+	net = devlink_net(data->devlink);
+	nexthop_hw_flags_set(net, nexthop->id, false, false);
+	nsim_nexthop_account(data, nexthop->occ, false, NULL);
+	nsim_nexthop_destroy(nexthop);
+}
+
 static u64 nsim_fib_ipv4_resource_occ_get(void *priv)
 {
 	struct nsim_fib_data *data = priv;
@@ -912,20 +1118,32 @@ struct nsim_fib_data *nsim_fib_create(struct devlink *devlink,
 		return ERR_PTR(-ENOMEM);
 	data->devlink = devlink;
 
+	err = rhashtable_init(&data->nexthop_ht, &nsim_nexthop_ht_params);
+	if (err)
+		goto err_data_free;
+
 	spin_lock_init(&data->fib_lock);
 	INIT_LIST_HEAD(&data->fib_rt_list);
 	err = rhashtable_init(&data->fib_rt_ht, &nsim_fib_rt_ht_params);
 	if (err)
-		goto err_data_free;
+		goto err_rhashtable_nexthop_destroy;
 
 	nsim_fib_set_max_all(data, devlink);
 
+	data->nexthop_nb.notifier_call = nsim_nexthop_event_nb;
+	err = register_nexthop_notifier(devlink_net(devlink), &data->nexthop_nb,
+					extack);
+	if (err) {
+		pr_err("Failed to register nexthop notifier\n");
+		goto err_rhashtable_fib_destroy;
+	}
+
 	data->fib_nb.notifier_call = nsim_fib_event_nb;
 	err = register_fib_notifier(devlink_net(devlink), &data->fib_nb,
 				    nsim_fib_dump_inconsistent, extack);
 	if (err) {
 		pr_err("Failed to register fib notifier\n");
-		goto err_rhashtable_destroy;
+		goto err_nexthop_nb_unregister;
 	}
 
 	devlink_resource_occ_get_register(devlink,
@@ -950,9 +1168,14 @@ struct nsim_fib_data *nsim_fib_create(struct devlink *devlink,
 					  data);
 	return data;
 
-err_rhashtable_destroy:
+err_nexthop_nb_unregister:
+	unregister_nexthop_notifier(devlink_net(devlink), &data->nexthop_nb);
+err_rhashtable_fib_destroy:
 	rhashtable_free_and_destroy(&data->fib_rt_ht, nsim_fib_rt_free,
 				    data);
+err_rhashtable_nexthop_destroy:
+	rhashtable_free_and_destroy(&data->nexthop_ht, nsim_nexthop_free,
+				    data);
 err_data_free:
 	kfree(data);
 	return ERR_PTR(err);
@@ -971,8 +1194,11 @@ void nsim_fib_destroy(struct devlink *devlink, struct nsim_fib_data *data)
 	devlink_resource_occ_get_unregister(devlink,
 					    NSIM_RESOURCE_IPV4_FIB);
 	unregister_fib_notifier(devlink_net(devlink), &data->fib_nb);
+	unregister_nexthop_notifier(devlink_net(devlink), &data->nexthop_nb);
 	rhashtable_free_and_destroy(&data->fib_rt_ht, nsim_fib_rt_free,
 				    data);
+	rhashtable_free_and_destroy(&data->nexthop_ht, nsim_nexthop_free,
+				    data);
 	WARN_ON_ONCE(!list_empty(&data->fib_rt_list));
 	kfree(data);
 }
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [RFC PATCH net-next 21/22] netdevsim: Allow programming routes with nexthop objects
  2020-09-08  9:10 [RFC PATCH net-next 00/22] nexthop: Add support for nexthop objects offload Ido Schimmel
                   ` (19 preceding siblings ...)
  2020-09-08  9:10 ` [RFC PATCH net-next 20/22] netdevsim: Add dummy implementation for nexthop offload Ido Schimmel
@ 2020-09-08  9:10 ` Ido Schimmel
  2020-09-08 15:40   ` David Ahern
  2020-09-08  9:10 ` [RFC PATCH net-next 22/22] selftests: netdevsim: Add test for nexthop offload API Ido Schimmel
  21 siblings, 1 reply; 56+ messages in thread
From: Ido Schimmel @ 2020-09-08  9:10 UTC (permalink / raw)
  To: netdev; +Cc: davem, kuba, dsahern, roopa, mlxsw, Ido Schimmel

From: Ido Schimmel <idosch@nvidia.com>

Previous patches added the ability to program nexthop objects.
Therefore, no longer forbid the programming of routes that point to such
objects.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 drivers/net/netdevsim/fib.c | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/drivers/net/netdevsim/fib.c b/drivers/net/netdevsim/fib.c
index 196deed0aa97..731201519380 100644
--- a/drivers/net/netdevsim/fib.c
+++ b/drivers/net/netdevsim/fib.c
@@ -412,11 +412,6 @@ static int nsim_fib4_event(struct nsim_fib_data *data,
 
 	fen_info = container_of(info, struct fib_entry_notifier_info, info);
 
-	if (fen_info->fi->nh) {
-		NL_SET_ERR_MSG_MOD(info->extack, "IPv4 route with nexthop objects is not supported");
-		return 0;
-	}
-
 	switch (event) {
 	case FIB_EVENT_ENTRY_REPLACE:
 		err = nsim_fib4_rt_insert(data, fen_info);
@@ -727,11 +722,6 @@ static int nsim_fib6_event(struct nsim_fib_data *data,
 
 	fen6_info = container_of(info, struct fib6_entry_notifier_info, info);
 
-	if (fen6_info->rt->nh) {
-		NL_SET_ERR_MSG_MOD(info->extack, "IPv6 route with nexthop objects is not supported");
-		return 0;
-	}
-
 	if (fen6_info->rt->fib6_src.plen) {
 		NL_SET_ERR_MSG_MOD(info->extack, "IPv6 source-specific route is not supported");
 		return 0;
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [RFC PATCH net-next 22/22] selftests: netdevsim: Add test for nexthop offload API
  2020-09-08  9:10 [RFC PATCH net-next 00/22] nexthop: Add support for nexthop objects offload Ido Schimmel
                   ` (20 preceding siblings ...)
  2020-09-08  9:10 ` [RFC PATCH net-next 21/22] netdevsim: Allow programming routes with nexthop objects Ido Schimmel
@ 2020-09-08  9:10 ` Ido Schimmel
  21 siblings, 0 replies; 56+ messages in thread
From: Ido Schimmel @ 2020-09-08  9:10 UTC (permalink / raw)
  To: netdev; +Cc: davem, kuba, dsahern, roopa, mlxsw, Ido Schimmel

From: Ido Schimmel <idosch@nvidia.com>

Test various aspects of the nexthop offload API on top of the netdevsim
implementation. Both good and bad flows are tested.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 .../drivers/net/netdevsim/nexthop.sh          | 408 ++++++++++++++++++
 1 file changed, 408 insertions(+)
 create mode 100755 tools/testing/selftests/drivers/net/netdevsim/nexthop.sh

diff --git a/tools/testing/selftests/drivers/net/netdevsim/nexthop.sh b/tools/testing/selftests/drivers/net/netdevsim/nexthop.sh
new file mode 100755
index 000000000000..16d5175e4e27
--- /dev/null
+++ b/tools/testing/selftests/drivers/net/netdevsim/nexthop.sh
@@ -0,0 +1,408 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+#
+# This test is for checking the nexthop offload API. It makes use of netdevsim
+# which registers a listener to the nexthop notification chain.
+
+lib_dir=$(dirname $0)/../../../net/forwarding
+
+ALL_TESTS="
+	nexthop_single_add_test
+	nexthop_single_add_err_test
+	nexthop_group_add_test
+	nexthop_group_add_err_test
+	nexthop_group_replace_test
+	nexthop_group_replace_err_test
+	nexthop_single_replace_test
+	nexthop_single_replace_err_test
+	nexthop_single_in_group_replace_test
+	nexthop_single_in_group_replace_err_test
+	nexthop_single_in_group_delete_test
+	nexthop_replay_test
+	nexthop_replay_err_test
+"
+NETDEVSIM_PATH=/sys/bus/netdevsim/
+DEV_ADDR=1337
+DEV=netdevsim${DEV_ADDR}
+DEVLINK_DEV=netdevsim/${DEV}
+SYSFS_NET_DIR=/sys/bus/netdevsim/devices/$DEV/net/
+NUM_NETIFS=0
+source $lib_dir/lib.sh
+source $lib_dir/devlink_lib.sh
+
+nexthop_check()
+{
+	local nharg="$1"; shift
+	local expected="$1"; shift
+
+	out=$($IP nexthop show ${nharg} | sed -e 's/ *$//')
+	if [[ "$out" != "$expected" ]]; then
+		return 1
+	fi
+
+	return 0
+}
+
+nexthop_resource_check()
+{
+	local expected_occ=$1; shift
+
+	occ=$($DEVLINK -jp resource show $DEVLINK_DEV \
+		| jq '.[][][] | select(.name=="nexthops") | .["occ"]')
+
+	if [ $expected_occ -ne $occ ]; then
+		return 1
+	fi
+
+	return 0
+}
+
+nexthop_resource_set()
+{
+	local size=$1; shift
+
+	$DEVLINK resource set $DEVLINK_DEV path nexthops size $size
+	$DEVLINK dev reload $DEVLINK_DEV
+}
+
+nexthop_single_add_test()
+{
+	RET=0
+
+	$IP nexthop add id 1 via 192.0.2.2 dev dummy1
+	nexthop_check "id 1" "id 1 via 192.0.2.2 dev dummy1 scope link trap"
+	check_err $? "Unexpected nexthop entry"
+
+	nexthop_resource_check 1
+	check_err $? "Wrong nexthop occupancy"
+
+	$IP nexthop del id 1
+	nexthop_resource_check 0
+	check_err $? "Wrong nexthop occupancy after delete"
+
+	log_test "Single nexthop add and delete"
+}
+
+nexthop_single_add_err_test()
+{
+	RET=0
+
+	nexthop_resource_set 1
+
+	$IP nexthop add id 1 via 192.0.2.2 dev dummy1
+
+	$IP nexthop add id 2 via 192.0.2.3 dev dummy1 &> /dev/null
+	check_fail $? "Nexthop addition succeeded when should fail"
+
+	nexthop_resource_check 1
+	check_err $? "Wrong nexthop occupancy"
+
+	log_test "Single nexthop add failure"
+
+	$IP nexthop flush &> /dev/null
+	nexthop_resource_set 9999
+}
+
+nexthop_group_add_test()
+{
+	RET=0
+
+	$IP nexthop add id 1 via 192.0.2.2 dev dummy1
+	$IP nexthop add id 2 via 192.0.2.3 dev dummy1
+
+	$IP nexthop add id 10 group 1/2
+	nexthop_check "id 10" "id 10 group 1/2 trap"
+	check_err $? "Unexpected nexthop group entry"
+
+	nexthop_resource_check 4
+	check_err $? "Wrong nexthop occupancy"
+
+	$IP nexthop del id 10
+	nexthop_resource_check 2
+	check_err $? "Wrong nexthop occupancy after delete"
+
+	$IP nexthop add id 10 group 1,20/2,39
+	nexthop_check "id 10" "id 10 group 1,20/2,39 trap"
+	check_err $? "Unexpected weighted nexthop group entry"
+
+	nexthop_resource_check 61
+	check_err $? "Wrong weighted nexthop occupancy"
+
+	$IP nexthop del id 10
+	nexthop_resource_check 2
+	check_err $? "Wrong nexthop occupancy after delete"
+
+	log_test "Nexthop group add and delete"
+
+	$IP nexthop flush &> /dev/null
+}
+
+nexthop_group_add_err_test()
+{
+	RET=0
+
+	nexthop_resource_set 2
+
+	$IP nexthop add id 1 via 192.0.2.2 dev dummy1
+	$IP nexthop add id 2 via 192.0.2.3 dev dummy1
+
+	$IP nexthop add id 10 group 1/2 &> /dev/null
+	check_fail $? "Nexthop group addition succeeded when should fail"
+
+	nexthop_resource_check 2
+	check_err $? "Wrong nexthop occupancy"
+
+	log_test "Nexthop group add failure"
+
+	$IP nexthop flush &> /dev/null
+	nexthop_resource_set 9999
+}
+
+nexthop_group_replace_test()
+{
+	RET=0
+
+	$IP nexthop add id 1 via 192.0.2.2 dev dummy1
+	$IP nexthop add id 2 via 192.0.2.3 dev dummy1
+	$IP nexthop add id 3 via 192.0.2.4 dev dummy1
+	$IP nexthop add id 10 group 1/2
+
+	$IP nexthop replace id 10 group 1/2/3
+	nexthop_check "id 10" "id 10 group 1/2/3 trap"
+	check_err $? "Unexpected nexthop group entry"
+
+	nexthop_resource_check 6
+	check_err $? "Wrong nexthop occupancy"
+
+	log_test "Nexthop group replace"
+
+	$IP nexthop flush &> /dev/null
+}
+
+nexthop_group_replace_err_test()
+{
+	RET=0
+
+	nexthop_resource_set 5
+
+	$IP nexthop add id 1 via 192.0.2.2 dev dummy1
+	$IP nexthop add id 2 via 192.0.2.3 dev dummy1
+	$IP nexthop add id 3 via 192.0.2.4 dev dummy1
+	$IP nexthop add id 10 group 1/2
+
+	$IP nexthop replace id 10 group 1/2/3 &> /dev/null
+	check_fail $? "Nexthop group replacement succeeded when should fail"
+
+	nexthop_check "id 10" "id 10 group 1/2 trap"
+	check_err $? "Unexpected nexthop group entry after failure"
+
+	nexthop_resource_check 5
+	check_err $? "Wrong nexthop occupancy after failure"
+
+	log_test "Nexthop group replace failure"
+
+	$IP nexthop flush &> /dev/null
+	nexthop_resource_set 9999
+}
+
+nexthop_single_replace_test()
+{
+	RET=0
+
+	$IP nexthop add id 1 via 192.0.2.2 dev dummy1
+
+	$IP nexthop replace id 1 via 192.0.2.3 dev dummy1
+	nexthop_check "id 1" "id 1 via 192.0.2.3 dev dummy1 scope link trap"
+	check_err $? "Unexpected nexthop entry"
+
+	nexthop_resource_check 1
+	check_err $? "Wrong nexthop occupancy"
+
+	log_test "Single nexthop replace"
+
+	$IP nexthop flush &> /dev/null
+}
+
+nexthop_single_replace_err_test()
+{
+	RET=0
+
+	# This is supposed to cause the replace to fail because the new nexthop
+	# is programmed before deleting the replaced one.
+	nexthop_resource_set 1
+
+	$IP nexthop add id 1 via 192.0.2.2 dev dummy1
+
+	$IP nexthop replace id 1 via 192.0.2.3 dev dummy1 &> /dev/null
+	check_fail $? "Nexthop replace succeeded when should fail"
+
+	nexthop_check "id 1" "id 1 via 192.0.2.2 dev dummy1 scope link trap"
+	check_err $? "Unexpected nexthop entry after failure"
+
+	nexthop_resource_check 1
+	check_err $? "Wrong nexthop occupancy after failure"
+
+	log_test "Single nexthop replace failure"
+
+	$IP nexthop flush &> /dev/null
+	nexthop_resource_set 9999
+}
+
+nexthop_single_in_group_replace_test()
+{
+	RET=0
+
+	$IP nexthop add id 1 via 192.0.2.2 dev dummy1
+	$IP nexthop add id 2 via 192.0.2.3 dev dummy1
+	$IP nexthop add id 10 group 1/2
+
+	$IP nexthop replace id 1 via 192.0.2.4 dev dummy1
+	check_err $? "Failed to replace nexthop when should not"
+
+	nexthop_check "id 10" "id 10 group 1/2 trap"
+	check_err $? "Unexpected nexthop group entry"
+
+	nexthop_resource_check 4
+	check_err $? "Wrong nexthop occupancy"
+
+	log_test "Single nexthop replace while in group"
+
+	$IP nexthop flush &> /dev/null
+}
+
+nexthop_single_in_group_replace_err_test()
+{
+	RET=0
+
+	nexthop_resource_set 5
+
+	$IP nexthop add id 1 via 192.0.2.2 dev dummy1
+	$IP nexthop add id 2 via 192.0.2.3 dev dummy1
+	$IP nexthop add id 10 group 1/2
+
+	$IP nexthop replace id 1 via 192.0.2.4 dev dummy1 &> /dev/null
+	check_fail $? "Nexthop replacement succeeded when should fail"
+
+	nexthop_check "id 1" "id 1 via 192.0.2.2 dev dummy1 scope link trap"
+	check_err $? "Unexpected nexthop entry after failure"
+
+	nexthop_check "id 10" "id 10 group 1/2 trap"
+	check_err $? "Unexpected nexthop group entry after failure"
+
+	nexthop_resource_check 4
+	check_err $? "Wrong nexthop occupancy"
+
+	log_test "Single nexthop replace while in group failure"
+
+	$IP nexthop flush &> /dev/null
+	nexthop_resource_set 9999
+}
+
+nexthop_single_in_group_delete_test()
+{
+	RET=0
+
+	$IP nexthop add id 1 via 192.0.2.2 dev dummy1
+	$IP nexthop add id 2 via 192.0.2.3 dev dummy1
+	$IP nexthop add id 10 group 1/2
+
+	$IP nexthop del id 1
+	nexthop_check "id 10" "id 10 group 2 trap"
+	check_err $? "Unexpected nexthop group entry"
+
+	nexthop_resource_check 2
+	check_err $? "Wrong nexthop occupancy"
+
+	log_test "Single nexthop delete while in group"
+
+	$IP nexthop flush &> /dev/null
+}
+
+nexthop_replay_test()
+{
+	RET=0
+
+	$IP nexthop add id 1 via 192.0.2.2 dev dummy1
+	$IP nexthop add id 2 via 192.0.2.3 dev dummy1
+	$IP nexthop add id 10 group 1/2
+
+	$DEVLINK dev reload $DEVLINK_DEV
+	check_err $? "Failed to reload when should not"
+
+	nexthop_check "id 1" "id 1 via 192.0.2.2 dev dummy1 scope link trap"
+	check_err $? "Unexpected nexthop entry after reload"
+
+	nexthop_check "id 2" "id 2 via 192.0.2.3 dev dummy1 scope link trap"
+	check_err $? "Unexpected nexthop entry after reload"
+
+	nexthop_check "id 10" "id 10 group 1/2 trap"
+	check_err $? "Unexpected nexthop group entry after reload"
+
+	nexthop_resource_check 4
+	check_err $? "Wrong nexthop occupancy"
+
+	log_test "Nexthop replay"
+
+	$IP nexthop flush &> /dev/null
+}
+
+nexthop_replay_err_test()
+{
+	RET=0
+
+	$IP nexthop add id 1 via 192.0.2.2 dev dummy1
+	$IP nexthop add id 2 via 192.0.2.3 dev dummy1
+	$IP nexthop add id 10 group 1/2
+
+	# Reduce size of nexthop resource so that reload will fail.
+	$DEVLINK resource set $DEVLINK_DEV path nexthops size 3
+	$DEVLINK dev reload $DEVLINK_DEV &> /dev/null
+	check_fail $? "Reload succeeded when should fail"
+
+	$DEVLINK resource set $DEVLINK_DEV path nexthops size 9999
+	$DEVLINK dev reload $DEVLINK_DEV
+	check_err $? "Failed to reload when should not"
+
+	log_test "Nexthop replay failure"
+
+	$IP nexthop flush &> /dev/null
+}
+
+setup_prepare()
+{
+	local netdev
+
+	modprobe netdevsim &> /dev/null
+
+	echo "$DEV_ADDR 1" > ${NETDEVSIM_PATH}/new_device
+	while [ ! -d $SYSFS_NET_DIR ] ; do :; done
+
+	set -e
+
+	ip netns add testns1
+	devlink dev reload $DEVLINK_DEV netns testns1
+
+	IP="ip -netns testns1"
+	DEVLINK="devlink -N testns1"
+
+	$IP link add name dummy1 up type dummy
+	$IP address add 192.0.2.1/24 dev dummy1
+
+	set +e
+}
+
+cleanup()
+{
+	pre_cleanup
+	ip netns del testns1
+	echo "$DEV_ADDR" > ${NETDEVSIM_PATH}/del_device
+	modprobe -r netdevsim &> /dev/null
+}
+
+trap cleanup EXIT
+
+setup_prepare
+
+tests_run
+
+exit $EXIT_STATUS
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [RFC PATCH net-next 01/22] nexthop: Remove unused function declaration from header file
  2020-09-08  9:10 ` [RFC PATCH net-next 01/22] nexthop: Remove unused function declaration from header file Ido Schimmel
@ 2020-09-08 14:29   ` David Ahern
  0 siblings, 0 replies; 56+ messages in thread
From: David Ahern @ 2020-09-08 14:29 UTC (permalink / raw)
  To: Ido Schimmel, netdev; +Cc: davem, kuba, roopa, mlxsw, Ido Schimmel

On 9/8/20 3:10 AM, Ido Schimmel wrote:
> From: Ido Schimmel <idosch@nvidia.com>
> 
> Not used or implemented anywhere.
> 
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> ---
>  include/net/nexthop.h | 3 ---
>  1 file changed, 3 deletions(-)
> 

Reviewed-by: David Ahern <dsahern@gmail.com>



^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC PATCH net-next 02/22] nexthop: Convert to blocking notification chain
  2020-09-08  9:10 ` [RFC PATCH net-next 02/22] nexthop: Convert to blocking notification chain Ido Schimmel
@ 2020-09-08 14:34   ` David Ahern
  0 siblings, 0 replies; 56+ messages in thread
From: David Ahern @ 2020-09-08 14:34 UTC (permalink / raw)
  To: Ido Schimmel, netdev; +Cc: davem, kuba, roopa, mlxsw, Ido Schimmel

On 9/8/20 3:10 AM, Ido Schimmel wrote:
> From: Ido Schimmel <idosch@nvidia.com>
> 
> Currently, the only listener of the nexthop notification chain is the
> VXLAN driver. Subsequent patches will add more listeners (e.g., device
> drivers such as netdevsim) that need to be able to block when processing
> notifications.
> 
> Therefore, convert the notification chain to a blocking one. This is
> safe as notifications are always emitted from process context.
> 
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> ---
>  include/net/netns/nexthop.h |  2 +-
>  net/ipv4/nexthop.c          | 13 +++++++------
>  2 files changed, 8 insertions(+), 7 deletions(-)

Reviewed-by: David Ahern <dsahern@gmail.com>


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC PATCH net-next 03/22] nexthop: Only emit a notification when nexthop is actually deleted
  2020-09-08  9:10 ` [RFC PATCH net-next 03/22] nexthop: Only emit a notification when nexthop is actually deleted Ido Schimmel
@ 2020-09-08 14:34   ` David Ahern
  2020-09-08 14:39   ` Jiri Pirko
  1 sibling, 0 replies; 56+ messages in thread
From: David Ahern @ 2020-09-08 14:34 UTC (permalink / raw)
  To: Ido Schimmel, netdev; +Cc: davem, kuba, roopa, mlxsw, Ido Schimmel

On 9/8/20 3:10 AM, Ido Schimmel wrote:
> From: Ido Schimmel <idosch@nvidia.com>
> 
> Currently, the delete notification is emitted from the error path of
> nexthop_add() and replace_nexthop(), which can be confusing to listeners
> as they are not familiar with the nexthop.
> 
> Instead, only emit the notification when the nexthop is actually
> deleted. The following sub-cases are covered:
> 
> 1. User space deletes the nexthop
> 2. The nexthop is deleted by the kernel due to a netdev event (e.g.,
>    nexthop device going down)
> 3. A group is deleted because its last nexthop is being deleted
> 4. The network namespace of the nexthop device is deleted
> 
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> ---
>  net/ipv4/nexthop.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 

Reviewed-by: David Ahern <dsahern@gmail.com>



^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC PATCH net-next 04/22] selftests: fib_nexthops: Test cleanup of FDB entries following nexthop deletion
  2020-09-08  9:10 ` [RFC PATCH net-next 04/22] selftests: fib_nexthops: Test cleanup of FDB entries following nexthop deletion Ido Schimmel
@ 2020-09-08 14:35   ` David Ahern
  0 siblings, 0 replies; 56+ messages in thread
From: David Ahern @ 2020-09-08 14:35 UTC (permalink / raw)
  To: Ido Schimmel, netdev; +Cc: davem, kuba, roopa, mlxsw, Ido Schimmel

On 9/8/20 3:10 AM, Ido Schimmel wrote:
> From: Ido Schimmel <idosch@nvidia.com>
> 
> Commit c7cdbe2efc40 ("vxlan: support for nexthop notifiers") registered
> a listener in the VXLAN driver to the nexthop notification chain. Its
> purpose is to cleanup FDB entries that use a nexthop that is being
> deleted.
> 
> Test that such FDB entries are removed when the nexthop group that they
> use is deleted. Test that entries are not deleted when a single nexthop
> in the group is deleted.
> 
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> ---
>  tools/testing/selftests/net/fib_nexthops.sh | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
> 

Reviewed-by: David Ahern <dsahern@gmail.com>



^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC PATCH net-next 03/22] nexthop: Only emit a notification when nexthop is actually deleted
  2020-09-08  9:10 ` [RFC PATCH net-next 03/22] nexthop: Only emit a notification when nexthop is actually deleted Ido Schimmel
  2020-09-08 14:34   ` David Ahern
@ 2020-09-08 14:39   ` Jiri Pirko
  2020-09-08 14:42     ` David Ahern
  2020-09-11 14:40     ` Ido Schimmel
  1 sibling, 2 replies; 56+ messages in thread
From: Jiri Pirko @ 2020-09-08 14:39 UTC (permalink / raw)
  To: Ido Schimmel; +Cc: netdev, davem, kuba, dsahern, roopa, mlxsw, Ido Schimmel

Tue, Sep 08, 2020 at 11:10:18AM CEST, idosch@idosch.org wrote:
>From: Ido Schimmel <idosch@nvidia.com>
>
>Currently, the delete notification is emitted from the error path of
>nexthop_add() and replace_nexthop(), which can be confusing to listeners
>as they are not familiar with the nexthop.
>
>Instead, only emit the notification when the nexthop is actually
>deleted. The following sub-cases are covered:

Well, in theory, this might break some very odd app that is adding a
route and checking the errors using this notification. My opinion is to
allow this breakage to happen, but I'm usually too benevolent :)


>
>1. User space deletes the nexthop
>2. The nexthop is deleted by the kernel due to a netdev event (e.g.,
>   nexthop device going down)
>3. A group is deleted because its last nexthop is being deleted
>4. The network namespace of the nexthop device is deleted
>
>Signed-off-by: Ido Schimmel <idosch@nvidia.com>
>---
> net/ipv4/nexthop.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
>diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
>index 13d9219a9aa1..8c0f17c6863c 100644
>--- a/net/ipv4/nexthop.c
>+++ b/net/ipv4/nexthop.c
>@@ -870,8 +870,6 @@ static void __remove_nexthop_fib(struct net *net, struct nexthop *nh)
> 	bool do_flush = false;
> 	struct fib_info *fi;
> 
>-	call_nexthop_notifiers(net, NEXTHOP_EVENT_DEL, nh);
>-
> 	list_for_each_entry(fi, &nh->fi_list, nh_list) {
> 		fi->fib_flags |= RTNH_F_DEAD;
> 		do_flush = true;
>@@ -909,6 +907,8 @@ static void __remove_nexthop(struct net *net, struct nexthop *nh,
> static void remove_nexthop(struct net *net, struct nexthop *nh,
> 			   struct nl_info *nlinfo)
> {
>+	call_nexthop_notifiers(net, NEXTHOP_EVENT_DEL, nh);
>+
> 	/* remove from the tree */
> 	rb_erase(&nh->rb_node, &net->nexthop.rb_root);
> 
>-- 
>2.26.2
>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC PATCH net-next 03/22] nexthop: Only emit a notification when nexthop is actually deleted
  2020-09-08 14:39   ` Jiri Pirko
@ 2020-09-08 14:42     ` David Ahern
  2020-09-11 14:40     ` Ido Schimmel
  1 sibling, 0 replies; 56+ messages in thread
From: David Ahern @ 2020-09-08 14:42 UTC (permalink / raw)
  To: Jiri Pirko, Ido Schimmel; +Cc: netdev, davem, kuba, roopa, mlxsw, Ido Schimmel

On 9/8/20 8:39 AM, Jiri Pirko wrote:
> Tue, Sep 08, 2020 at 11:10:18AM CEST, idosch@idosch.org wrote:
>> From: Ido Schimmel <idosch@nvidia.com>
>>
>> Currently, the delete notification is emitted from the error path of
>> nexthop_add() and replace_nexthop(), which can be confusing to listeners
>> as they are not familiar with the nexthop.
>>
>> Instead, only emit the notification when the nexthop is actually
>> deleted. The following sub-cases are covered:
> 
> Well, in theory, this might break some very odd app that is adding a
> route and checking the errors using this notification. My opinion is to
> allow this breakage to happen, but I'm usually too benevolent :)
> 

That would be a very twisted app.

No other notifications go out on failure to add / replace and nexthops
should be consistent.


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC PATCH net-next 05/22] nexthop: Add nexthop notification data structures
  2020-09-08  9:10 ` [RFC PATCH net-next 05/22] nexthop: Add nexthop notification data structures Ido Schimmel
@ 2020-09-08 14:43   ` David Ahern
  2020-09-11 14:50     ` Ido Schimmel
  0 siblings, 1 reply; 56+ messages in thread
From: David Ahern @ 2020-09-08 14:43 UTC (permalink / raw)
  To: Ido Schimmel, netdev; +Cc: davem, kuba, roopa, mlxsw, Ido Schimmel

On 9/8/20 3:10 AM, Ido Schimmel wrote:
> From: Ido Schimmel <idosch@nvidia.com>
> 
> Add data structures that will be used for nexthop replace and delete
> notifications in the previously introduced nexthop notification chain.
> 
> New data structures are added instead of passing the existing nexthop
> code structures directly for several reasons.
> 
> First, the existing structures encode a lot of bookkeeping information
> which is irrelevant for listeners of the notification chain.
> 
> Second, the existing structures can be changed without worrying about
> introducing regressions in listeners since they are not accessed
> directly by them.
> 
> Third, listeners of the notification chain do not need to each parse the
> relatively complex nexthop code structures. They are passing the
> required information in a simplified way.

agreed. My preference is for only nexthop.{c,h} to understand and parse
the nexthop structs.


> 
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> ---
>  include/net/nexthop.h | 35 +++++++++++++++++++++++++++++++++++
>  1 file changed, 35 insertions(+)
> 
> diff --git a/include/net/nexthop.h b/include/net/nexthop.h
> index 2e44efe5709b..0bde1aa867c0 100644
> --- a/include/net/nexthop.h
> +++ b/include/net/nexthop.h
> @@ -109,6 +109,41 @@ enum nexthop_event_type {
>  	NEXTHOP_EVENT_DEL
>  };
>  
> +struct nh_notifier_single_info {
> +	struct net_device *dev;
> +	u8 gw_family;
> +	union {
> +		__be32 ipv4;
> +		struct in6_addr ipv6;
> +	};
> +	u8 is_reject:1,
> +	   is_fdb:1,
> +	   is_encap:1;

use has_encap since it refers to a configuration of a nexthop versus a
nexthop type.

I take it this is a placeholder until lwt offload is supported?

besides the naming nit,

Reviewed-by: David Ahern <dsahern@gmail.com>


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC PATCH net-next 06/22] nexthop: Pass extack to nexthop notifier
  2020-09-08  9:10 ` [RFC PATCH net-next 06/22] nexthop: Pass extack to nexthop notifier Ido Schimmel
@ 2020-09-08 14:44   ` David Ahern
  0 siblings, 0 replies; 56+ messages in thread
From: David Ahern @ 2020-09-08 14:44 UTC (permalink / raw)
  To: Ido Schimmel, netdev; +Cc: davem, kuba, roopa, mlxsw, Ido Schimmel

On 9/8/20 3:10 AM, Ido Schimmel wrote:
> From: Ido Schimmel <idosch@nvidia.com>
> 
> The next patch will add extack to the notification info. This allows
> listeners to veto notifications and communicate the reason to user space.
> 
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> ---
>  net/ipv4/nexthop.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
> index 8c0f17c6863c..dafcb9f17250 100644
> --- a/net/ipv4/nexthop.c
> +++ b/net/ipv4/nexthop.c
> @@ -38,7 +38,8 @@ static const struct nla_policy rtm_nh_policy[NHA_MAX + 1] = {
>  
>  static int call_nexthop_notifiers(struct net *net,
>  				  enum nexthop_event_type event_type,
> -				  struct nexthop *nh)
> +				  struct nexthop *nh,
> +				  struct netlink_ext_ack *extack)
>  {
>  	int err;
>  
> @@ -907,7 +908,7 @@ static void __remove_nexthop(struct net *net, struct nexthop *nh,
>  static void remove_nexthop(struct net *net, struct nexthop *nh,
>  			   struct nl_info *nlinfo)
>  {
> -	call_nexthop_notifiers(net, NEXTHOP_EVENT_DEL, nh);
> +	call_nexthop_notifiers(net, NEXTHOP_EVENT_DEL, nh, NULL);
>  
>  	/* remove from the tree */
>  	rb_erase(&nh->rb_node, &net->nexthop.rb_root);
> 

Reviewed-by: David Ahern <dsahern@gmail.com>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC PATCH net-next 07/22] nexthop: Prepare new notification info
  2020-09-08  9:10 ` [RFC PATCH net-next 07/22] nexthop: Prepare new notification info Ido Schimmel
@ 2020-09-08 14:55   ` David Ahern
  2020-09-11 15:01     ` Ido Schimmel
  0 siblings, 1 reply; 56+ messages in thread
From: David Ahern @ 2020-09-08 14:55 UTC (permalink / raw)
  To: Ido Schimmel, netdev; +Cc: davem, kuba, roopa, mlxsw, Ido Schimmel

On 9/8/20 3:10 AM, Ido Schimmel wrote:
> From: Ido Schimmel <idosch@nvidia.com>
> 
> Prepare the new notification information so that it could be passed to
> listeners in the new patch.
> 
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> ---
>  net/ipv4/nexthop.c | 108 +++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 108 insertions(+)
> 

Reviewed-by: David Ahern <dsahern@gmail.com>

one trivial comment below.

> +static void
> +__nh_notifier_single_info_init(struct nh_notifier_single_info *nh_info,
> +			       const struct nexthop *nh)
> +{
> +	struct nh_info *nhi = rtnl_dereference(nh->nh_info);
> +
> +	nh_info->dev = nhi->fib_nhc.nhc_dev;
> +	nh_info->gw_family = nhi->fib_nhc.nhc_gw_family;
> +	if (nh_info->gw_family == AF_INET)
> +		nh_info->ipv4 = nhi->fib_nhc.nhc_gw.ipv4;
> +	else if (nh_info->gw_family == AF_INET6)
> +		nh_info->ipv6 = nhi->fib_nhc.nhc_gw.ipv6;

add a blank line here to make it easier to read.

> +	nh_info->is_reject = nhi->reject_nh;
> +	nh_info->is_fdb = nhi->fdb_nh;
> +	nh_info->is_encap = !!nhi->fib_nhc.nhc_lwtstate;
> +}
> +


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC PATCH net-next 08/22] nexthop: vxlan: Convert to new notification info
  2020-09-08  9:10 ` [RFC PATCH net-next 08/22] nexthop: vxlan: Convert to " Ido Schimmel
@ 2020-09-08 14:58   ` David Ahern
  2020-09-11 15:05     ` Ido Schimmel
  0 siblings, 1 reply; 56+ messages in thread
From: David Ahern @ 2020-09-08 14:58 UTC (permalink / raw)
  To: Ido Schimmel, netdev; +Cc: davem, kuba, roopa, mlxsw, Ido Schimmel

On 9/8/20 3:10 AM, Ido Schimmel wrote:
> From: Ido Schimmel <idosch@nvidia.com>
> 
> Convert the sole listener of the nexthop notification chain (the VXLAN
> driver) to the new notification info.
> 
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> ---
>  drivers/net/vxlan.c | 9 +++++++--
>  net/ipv4/nexthop.c  | 2 +-
>  2 files changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
> index b9fefe27e3e8..29deedee6ef4 100644
> --- a/drivers/net/vxlan.c
> +++ b/drivers/net/vxlan.c
> @@ -4687,9 +4687,14 @@ static void vxlan_fdb_nh_flush(struct nexthop *nh)
>  static int vxlan_nexthop_event(struct notifier_block *nb,
>  			       unsigned long event, void *ptr)
>  {
> -	struct nexthop *nh = ptr;
> +	struct nh_notifier_info *info = ptr;
> +	struct nexthop *nh;
> +
> +	if (event != NEXTHOP_EVENT_DEL)
> +		return NOTIFY_DONE;
>  
> -	if (!nh || event != NEXTHOP_EVENT_DEL)
> +	nh = nexthop_find_by_id(info->net, info->id);

hmmm.... why add the id to the info versus a nh pointer if a lookup is
needed?

> +	if (!nh)
>  		return NOTIFY_DONE;
>  
>  	vxlan_fdb_nh_flush(nh);


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC PATCH net-next 09/22] rtnetlink: Add RTNH_F_TRAP flag
  2020-09-08  9:10 ` [RFC PATCH net-next 09/22] rtnetlink: Add RTNH_F_TRAP flag Ido Schimmel
@ 2020-09-08 15:02   ` David Ahern
  2020-09-11 15:26     ` Ido Schimmel
  0 siblings, 1 reply; 56+ messages in thread
From: David Ahern @ 2020-09-08 15:02 UTC (permalink / raw)
  To: Ido Schimmel, netdev; +Cc: davem, kuba, roopa, mlxsw, Ido Schimmel

On 9/8/20 3:10 AM, Ido Schimmel wrote:
> From: Ido Schimmel <idosch@nvidia.com>
> 
> The flag indicates to user space that the nexthop is not programmed to
> forward packets in hardware, but rather to trap them.

please elaborate in the commit message on what 'trap' is doing. I most
likely will forget a few years from now.

> 
> The flag will be used in subsequent patches by netdevsim to test nexthop
> objects programming to device drivers and in the future by mlxsw as
> well.
> 
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> ---
>  include/uapi/linux/rtnetlink.h | 6 ++++--
>  net/ipv4/fib_semantics.c       | 2 ++
>  2 files changed, 6 insertions(+), 2 deletions(-)
> 

Reviewed-by: David Ahern <dsahern@gmail.com>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC PATCH net-next 10/22] nexthop: Allow setting "offload" and "trap" indications on nexthops
  2020-09-08  9:10 ` [RFC PATCH net-next 10/22] nexthop: Allow setting "offload" and "trap" indications on nexthops Ido Schimmel
@ 2020-09-08 15:14   ` David Ahern
  2020-09-11 15:29     ` Ido Schimmel
  0 siblings, 1 reply; 56+ messages in thread
From: David Ahern @ 2020-09-08 15:14 UTC (permalink / raw)
  To: Ido Schimmel, netdev; +Cc: davem, kuba, roopa, mlxsw, Ido Schimmel

On 9/8/20 3:10 AM, Ido Schimmel wrote:
> From: Ido Schimmel <idosch@nvidia.com>
> 
> Add a function that can be called by device drivers to set "offload" or
> "trap" indication on nexthops following nexthop notifications.
> 
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> ---
>  include/net/nexthop.h |  1 +
>  net/ipv4/nexthop.c    | 21 +++++++++++++++++++++
>  2 files changed, 22 insertions(+)
> 
> diff --git a/include/net/nexthop.h b/include/net/nexthop.h
> index 0bde1aa867c0..4147681e86d2 100644
> --- a/include/net/nexthop.h
> +++ b/include/net/nexthop.h
> @@ -146,6 +146,7 @@ struct nh_notifier_info {
>  
>  int register_nexthop_notifier(struct net *net, struct notifier_block *nb);
>  int unregister_nexthop_notifier(struct net *net, struct notifier_block *nb);
> +void nexthop_hw_flags_set(struct net *net, u32 id, bool offload, bool trap);

how about nexthop_set_hw_flags? consistency with current nexthop_get_
... naming

>  
>  /* caller is holding rcu or rtnl; no reference taken to nexthop */
>  struct nexthop *nexthop_find_by_id(struct net *net, u32 id);
> diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
> index 70c8ab6906ec..71605c612458 100644
> --- a/net/ipv4/nexthop.c
> +++ b/net/ipv4/nexthop.c
> @@ -2080,6 +2080,27 @@ int unregister_nexthop_notifier(struct net *net, struct notifier_block *nb)
>  }
>  EXPORT_SYMBOL(unregister_nexthop_notifier);
>  
> +void nexthop_hw_flags_set(struct net *net, u32 id, bool offload, bool trap)
> +{
> +	struct nexthop *nexthop;
> +
> +	rcu_read_lock();
> +
> +	nexthop = nexthop_find_by_id(net, id);
> +	if (!nexthop)
> +		goto out;
> +
> +	nexthop->nh_flags &= ~(RTNH_F_OFFLOAD | RTNH_F_TRAP);
> +	if (offload)
> +		nexthop->nh_flags |= RTNH_F_OFFLOAD;
> +	if (trap)
> +		nexthop->nh_flags |= RTNH_F_TRAP;
> +
> +out:
> +	rcu_read_unlock();
> +}
> +EXPORT_SYMBOL(nexthop_hw_flags_set);
> +
>  static void __net_exit nexthop_net_exit(struct net *net)
>  {
>  	rtnl_lock();
> 


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC PATCH net-next 11/22] nexthop: Emit a notification when a nexthop is added
  2020-09-08  9:10 ` [RFC PATCH net-next 11/22] nexthop: Emit a notification when a nexthop is added Ido Schimmel
@ 2020-09-08 15:21   ` David Ahern
  2020-09-11 16:20     ` Ido Schimmel
  0 siblings, 1 reply; 56+ messages in thread
From: David Ahern @ 2020-09-08 15:21 UTC (permalink / raw)
  To: Ido Schimmel, netdev; +Cc: davem, kuba, roopa, mlxsw, Ido Schimmel

On 9/8/20 3:10 AM, Ido Schimmel wrote:
> From: Ido Schimmel <idosch@nvidia.com>
> 
> Emit a notification in the nexthop notification chain when a new nexthop
> is added (not replaced). The nexthop can either be a new group or a
> single nexthop.

Add a comment about why EVENT_REPLACE is generated on an 'added (not
replaced)' event.

> 
> The notification is sent after the nexthop is inserted into the
> red-black tree, as listeners might need to callback into the nexthop
> code with the nexthop ID in order to mark the nexthop as offloaded.
> 
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> ---
>  include/net/nexthop.h | 3 ++-
>  net/ipv4/nexthop.c    | 6 +++++-
>  2 files changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/include/net/nexthop.h b/include/net/nexthop.h
> index 4147681e86d2..6431ff8cdb89 100644
> --- a/include/net/nexthop.h
> +++ b/include/net/nexthop.h
> @@ -106,7 +106,8 @@ struct nexthop {
>  
>  enum nexthop_event_type {
>  	NEXTHOP_EVENT_ADD,

looks like the ADD event is not used and can be removed.

> -	NEXTHOP_EVENT_DEL
> +	NEXTHOP_EVENT_DEL,
> +	NEXTHOP_EVENT_REPLACE,
>  };
>  
>  struct nh_notifier_single_info {
> diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
> index 71605c612458..1fa249facd46 100644
> --- a/net/ipv4/nexthop.c
> +++ b/net/ipv4/nexthop.c
> @@ -1277,7 +1277,11 @@ static int insert_nexthop(struct net *net, struct nexthop *new_nh,
>  
>  	rb_link_node_rcu(&new_nh->rb_node, parent, pp);
>  	rb_insert_color(&new_nh->rb_node, root);
> -	rc = 0;
> +
> +	rc = call_nexthop_notifiers(net, NEXTHOP_EVENT_REPLACE, new_nh, extack);
> +	if (rc)
> +		rb_erase(&new_nh->rb_node, &net->nexthop.rb_root);
> +
>  out:
>  	if (!rc) {
>  		nh_base_seq_inc(net);
> 


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC PATCH net-next 12/22] nexthop: Emit a notification when a nexthop group is replaced
  2020-09-08  9:10 ` [RFC PATCH net-next 12/22] nexthop: Emit a notification when a nexthop group is replaced Ido Schimmel
@ 2020-09-08 15:22   ` David Ahern
  0 siblings, 0 replies; 56+ messages in thread
From: David Ahern @ 2020-09-08 15:22 UTC (permalink / raw)
  To: Ido Schimmel, netdev; +Cc: davem, kuba, roopa, mlxsw, Ido Schimmel

On 9/8/20 3:10 AM, Ido Schimmel wrote:
> From: Ido Schimmel <idosch@nvidia.com>
> 
> Emit a notification in the nexthop notification chain when an existing
> nexthop group is replaced.
> 
> The notification is emitted after all the validation checks were
> performed, but before the new configuration (i.e., 'struct nh_grp') is
> pointed at by the old shell (i.e., 'struct nexthop'). This prevents the
> need to perform rollback in case the notification is vetoed.
> 
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> ---
>  net/ipv4/nexthop.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 

Reviewed-by: David Ahern <dsahern@gmail.com>



^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC PATCH net-next 13/22] nexthop: Emit a notification when a single nexthop is replaced
  2020-09-08  9:10 ` [RFC PATCH net-next 13/22] nexthop: Emit a notification when a single nexthop " Ido Schimmel
@ 2020-09-08 15:25   ` David Ahern
  2020-09-11 16:24     ` Ido Schimmel
  0 siblings, 1 reply; 56+ messages in thread
From: David Ahern @ 2020-09-08 15:25 UTC (permalink / raw)
  To: Ido Schimmel, netdev; +Cc: davem, kuba, roopa, mlxsw, Ido Schimmel

On 9/8/20 3:10 AM, Ido Schimmel wrote:
> From: Ido Schimmel <idosch@nvidia.com>
> 
> The notification is emitted after all the validation checks were
> performed, but before the new configuration (i.e., 'struct nh_info') is
> pointed at by the old shell (i.e., 'struct nexthop'). This prevents the
> need to perform rollback in case the notification is vetoed.
> 
> The next patch will also emit a replace notification for all the nexthop
> groups in which the nexthop is used.
> 
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> ---
>  net/ipv4/nexthop.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
> index a60a519a5462..b8a4abc00146 100644
> --- a/net/ipv4/nexthop.c
> +++ b/net/ipv4/nexthop.c
> @@ -1099,12 +1099,22 @@ static int replace_nexthop_single(struct net *net, struct nexthop *old,
>  				  struct netlink_ext_ack *extack)
>  {
>  	struct nh_info *oldi, *newi;
> +	int err;
>  
>  	if (new->is_group) {
>  		NL_SET_ERR_MSG(extack, "Can not replace a nexthop with a nexthop group.");
>  		return -EINVAL;
>  	}
>  
> +	err = call_nexthop_notifiers(net, NEXTHOP_EVENT_REPLACE, new, extack);
> +	if (err)
> +		return err;
> +
> +	/* Hardware flags were set on 'old' as 'new' is not in the red-black
> +	 * tree. Therefore, inherit the flags from 'old' to 'new'.
> +	 */
> +	new->nh_flags |= old->nh_flags & (RTNH_F_OFFLOAD | RTNH_F_TRAP);

Will that always be true? ie., has h/w seen 'new' and offloaded it yet?
vs the notifier telling hardware about the change, it does its thing and
sets the flags. But I guess that creates a race between the offload and
the new data being available.

> +
>  	oldi = rtnl_dereference(old->nh_info);
>  	newi = rtnl_dereference(new->nh_info);
>  
> 


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC PATCH net-next 14/22] nexthop: Emit a notification when a nexthop group is modified
  2020-09-08  9:10 ` [RFC PATCH net-next 14/22] nexthop: Emit a notification when a nexthop group is modified Ido Schimmel
@ 2020-09-08 15:29   ` David Ahern
  0 siblings, 0 replies; 56+ messages in thread
From: David Ahern @ 2020-09-08 15:29 UTC (permalink / raw)
  To: Ido Schimmel, netdev; +Cc: davem, kuba, roopa, mlxsw, Ido Schimmel

On 9/8/20 3:10 AM, Ido Schimmel wrote:
> From: Ido Schimmel <idosch@nvidia.com>
> 
> When a single nexthop is replaced, the configuration of all the groups
> using the nexthop is effectively modified. In this case, emit a
> notification in the nexthop notification chain for each modified group
> so that listeners would not need to keep track of which nexthops are
> member in which groups.
> 
> The notification can only be emitted after the new configuration (i.e.,
> 'struct nh_info') is pointed at by the old shell (i.e., 'struct
> nexthop'). Before that the configuration of the nexthop groups is still
> the same as before the replacement.
> 
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> ---
>  net/ipv4/nexthop.c | 32 ++++++++++++++++++++++++++++++--
>  1 file changed, 30 insertions(+), 2 deletions(-)
> 

Reviewed-by: David Ahern <dsahern@gmail.com>



^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC PATCH net-next 15/22] nexthop: Emit a notification when a nexthop group is reduced
  2020-09-08  9:10 ` [RFC PATCH net-next 15/22] nexthop: Emit a notification when a nexthop group is reduced Ido Schimmel
@ 2020-09-08 15:33   ` David Ahern
  2020-09-11 16:29     ` Ido Schimmel
  0 siblings, 1 reply; 56+ messages in thread
From: David Ahern @ 2020-09-08 15:33 UTC (permalink / raw)
  To: Ido Schimmel, netdev; +Cc: davem, kuba, roopa, mlxsw, Ido Schimmel

On 9/8/20 3:10 AM, Ido Schimmel wrote:
> From: Ido Schimmel <idosch@nvidia.com>
> 
> When a single nexthop is deleted, the configuration of all the groups
> using the nexthop is effectively modified. In this case, emit a
> notification in the nexthop notification chain for each modified group
> so that listeners would not need to keep track of which nexthops are
> member in which groups.
> 
> In the rare cases where the notification fails, emit an error to the
> kernel log.
> 
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> ---
>  net/ipv4/nexthop.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
> index 0edc3e73d416..33f611bbce1f 100644
> --- a/net/ipv4/nexthop.c
> +++ b/net/ipv4/nexthop.c
> @@ -893,7 +893,7 @@ static void remove_nh_grp_entry(struct net *net, struct nh_grp_entry *nhge,
>  	struct nexthop *nhp = nhge->nh_parent;
>  	struct nexthop *nh = nhge->nh;
>  	struct nh_group *nhg, *newg;
> -	int i, j;
> +	int i, j, err;
>  
>  	WARN_ON(!nh);
>  
> @@ -941,6 +941,10 @@ static void remove_nh_grp_entry(struct net *net, struct nh_grp_entry *nhge,
>  	list_del(&nhge->nh_list);
>  	nexthop_put(nhge->nh);
>  
> +	err = call_nexthop_notifiers(net, NEXTHOP_EVENT_REPLACE, nhp, NULL);
> +	if (err)
> +		pr_err("Failed to replace nexthop group after nexthop deletion\n");

This should refer to the notifier failing since wrt nexthop code the
structs are ok. extack on the stack and logging that message would be
useful too (or have the users of the notifier log why it fails).

> +
>  	if (nlinfo)
>  		nexthop_notify(RTM_NEWNEXTHOP, nhp, nlinfo);
>  }
> 


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC PATCH net-next 16/22] nexthop: Pass extack to register_nexthop_notifier()
  2020-09-08  9:10 ` [RFC PATCH net-next 16/22] nexthop: Pass extack to register_nexthop_notifier() Ido Schimmel
@ 2020-09-08 15:34   ` David Ahern
  0 siblings, 0 replies; 56+ messages in thread
From: David Ahern @ 2020-09-08 15:34 UTC (permalink / raw)
  To: Ido Schimmel, netdev; +Cc: davem, kuba, roopa, mlxsw, Ido Schimmel

On 9/8/20 3:10 AM, Ido Schimmel wrote:
> From: Ido Schimmel <idosch@nvidia.com>
> 
> This will be used by the next patch which extends the function to replay
> all the existing nexthops to the notifier block being registered.
> 
> Device drivers will be able to pass extack to the function since it is
> passed to them upon reload from devlink.
> 
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> ---
>  drivers/net/vxlan.c   | 3 ++-
>  include/net/nexthop.h | 3 ++-
>  net/ipv4/nexthop.c    | 3 ++-
>  3 files changed, 6 insertions(+), 3 deletions(-)
> 

Reviewed-by: David Ahern <dsahern@gmail.com>



^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC PATCH net-next 17/22] nexthop: Replay nexthops when registering a notifier
  2020-09-08  9:10 ` [RFC PATCH net-next 17/22] nexthop: Replay nexthops when registering a notifier Ido Schimmel
@ 2020-09-08 15:37   ` David Ahern
  2020-09-11 16:40     ` Ido Schimmel
  0 siblings, 1 reply; 56+ messages in thread
From: David Ahern @ 2020-09-08 15:37 UTC (permalink / raw)
  To: Ido Schimmel, netdev; +Cc: davem, kuba, roopa, mlxsw, Ido Schimmel

On 9/8/20 3:10 AM, Ido Schimmel wrote:
> From: Ido Schimmel <idosch@nvidia.com>
> 
> When registering a new notifier to the nexthop notification chain,
> replay all the existing nexthops to the new notifier so that it will
> have a complete picture of the available nexthops.
> 
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> ---
>  net/ipv4/nexthop.c | 54 ++++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 52 insertions(+), 2 deletions(-)
> 
> diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
> index b40c343ca969..6505a0a28df2 100644
> --- a/net/ipv4/nexthop.c
> +++ b/net/ipv4/nexthop.c
> @@ -156,6 +156,27 @@ static int call_nexthop_notifiers(struct net *net,
>  	return notifier_to_errno(err);
>  }
>  
> +static int call_nexthop_notifier(struct notifier_block *nb, struct net *net,
> +				 enum nexthop_event_type event_type,
> +				 struct nexthop *nh,
> +				 struct netlink_ext_ack *extack)
> +{
> +	struct nh_notifier_info info = {
> +		.net = net,
> +		.extack = extack,
> +	};
> +	int err;
> +
> +	err = nh_notifier_info_init(&info, nh);
> +	if (err)
> +		return err;
> +
> +	err = nb->notifier_call(nb, event_type, &info);
> +	nh_notifier_info_fini(&info);
> +
> +	return notifier_to_errno(err);
> +}
> +
>  static unsigned int nh_dev_hashfn(unsigned int val)
>  {
>  	unsigned int mask = NH_DEV_HASHSIZE - 1;
> @@ -2116,11 +2137,40 @@ static struct notifier_block nh_netdev_notifier = {
>  	.notifier_call = nh_netdev_event,
>  };
>  
> +static int nexthops_dump(struct net *net, struct notifier_block *nb,
> +			 struct netlink_ext_ack *extack)
> +{
> +	struct rb_root *root = &net->nexthop.rb_root;
> +	struct rb_node *node;
> +	int err = 0;
> +
> +	for (node = rb_first(root); node; node = rb_next(node)) {
> +		struct nexthop *nh;
> +
> +		nh = rb_entry(node, struct nexthop, rb_node);
> +		err = call_nexthop_notifier(nb, net, NEXTHOP_EVENT_REPLACE, nh,
> +					    extack);
> +		if (err)
> +			break;
> +	}
> +
> +	return err;
> +}
> +
>  int register_nexthop_notifier(struct net *net, struct notifier_block *nb,
>  			      struct netlink_ext_ack *extack)
>  {
> -	return blocking_notifier_chain_register(&net->nexthop.notifier_chain,
> -						nb);
> +	int err;
> +
> +	rtnl_lock();
> +	err = nexthops_dump(net, nb, extack);

can the unlock be moved here? register function below should not need it.

> +	if (err)
> +		goto unlock;
> +	err = blocking_notifier_chain_register(&net->nexthop.notifier_chain,
> +					       nb);
> +unlock:
> +	rtnl_unlock();
> +	return err;
>  }
>  EXPORT_SYMBOL(register_nexthop_notifier);
>  
> 


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC PATCH net-next 18/22] nexthop: Remove in-kernel route notifications when nexthop changes
  2020-09-08  9:10 ` [RFC PATCH net-next 18/22] nexthop: Remove in-kernel route notifications when nexthop changes Ido Schimmel
@ 2020-09-08 15:38   ` David Ahern
  0 siblings, 0 replies; 56+ messages in thread
From: David Ahern @ 2020-09-08 15:38 UTC (permalink / raw)
  To: Ido Schimmel, netdev; +Cc: davem, kuba, roopa, mlxsw, Ido Schimmel

On 9/8/20 3:10 AM, Ido Schimmel wrote:
> From: Ido Schimmel <idosch@nvidia.com>
> 
> Remove in-kernel route notifications when the configuration of their
> nexthop changes.
> 
> These notifications are unnecessary because the route still uses the
> same nexthop ID. A separate notification for the nexthop change itself
> is now sent in the nexthop notification chain.
> 
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> ---
>  net/ipv4/fib_trie.c | 9 ---------
>  net/ipv6/route.c    | 5 -----
>  2 files changed, 14 deletions(-)
> 

Reviewed-by: David Ahern <dsahern@gmail.com>



^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC PATCH net-next 21/22] netdevsim: Allow programming routes with nexthop objects
  2020-09-08  9:10 ` [RFC PATCH net-next 21/22] netdevsim: Allow programming routes with nexthop objects Ido Schimmel
@ 2020-09-08 15:40   ` David Ahern
  0 siblings, 0 replies; 56+ messages in thread
From: David Ahern @ 2020-09-08 15:40 UTC (permalink / raw)
  To: Ido Schimmel, netdev; +Cc: davem, kuba, roopa, mlxsw, Ido Schimmel

On 9/8/20 3:10 AM, Ido Schimmel wrote:
> From: Ido Schimmel <idosch@nvidia.com>
> 
> Previous patches added the ability to program nexthop objects.
> Therefore, no longer forbid the programming of routes that point to such
> objects.
> 
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> ---
>  drivers/net/netdevsim/fib.c | 10 ----------
>  1 file changed, 10 deletions(-)
> 

Reviewed-by: David Ahern <dsahern@gmail.com>



^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC PATCH net-next 03/22] nexthop: Only emit a notification when nexthop is actually deleted
  2020-09-08 14:39   ` Jiri Pirko
  2020-09-08 14:42     ` David Ahern
@ 2020-09-11 14:40     ` Ido Schimmel
  1 sibling, 0 replies; 56+ messages in thread
From: Ido Schimmel @ 2020-09-11 14:40 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: netdev, davem, kuba, dsahern, roopa, mlxsw, Ido Schimmel

On Tue, Sep 08, 2020 at 04:39:59PM +0200, Jiri Pirko wrote:
> Tue, Sep 08, 2020 at 11:10:18AM CEST, idosch@idosch.org wrote:
> >From: Ido Schimmel <idosch@nvidia.com>
> >
> >Currently, the delete notification is emitted from the error path of
> >nexthop_add() and replace_nexthop(), which can be confusing to listeners
> >as they are not familiar with the nexthop.
> >
> >Instead, only emit the notification when the nexthop is actually
> >deleted. The following sub-cases are covered:
> 
> Well, in theory, this might break some very odd app that is adding a
> route and checking the errors using this notification. My opinion is to
> allow this breakage to happen, but I'm usually too benevolent :)

There is no uAPI breakage since the patch is only changing the in-kernel
notification. After this patch the in-kernel and RTM_DELNEXTHOP netlink
notifications are both emitted from the same function (i.e.,
remove_nexthop()).

I'll reword the commit message to make it clear.

> 
> 
> >
> >1. User space deletes the nexthop
> >2. The nexthop is deleted by the kernel due to a netdev event (e.g.,
> >   nexthop device going down)
> >3. A group is deleted because its last nexthop is being deleted
> >4. The network namespace of the nexthop device is deleted
> >
> >Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> >---
> > net/ipv4/nexthop.c | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> >diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
> >index 13d9219a9aa1..8c0f17c6863c 100644
> >--- a/net/ipv4/nexthop.c
> >+++ b/net/ipv4/nexthop.c
> >@@ -870,8 +870,6 @@ static void __remove_nexthop_fib(struct net *net, struct nexthop *nh)
> > 	bool do_flush = false;
> > 	struct fib_info *fi;
> > 
> >-	call_nexthop_notifiers(net, NEXTHOP_EVENT_DEL, nh);
> >-
> > 	list_for_each_entry(fi, &nh->fi_list, nh_list) {
> > 		fi->fib_flags |= RTNH_F_DEAD;
> > 		do_flush = true;
> >@@ -909,6 +907,8 @@ static void __remove_nexthop(struct net *net, struct nexthop *nh,
> > static void remove_nexthop(struct net *net, struct nexthop *nh,
> > 			   struct nl_info *nlinfo)
> > {
> >+	call_nexthop_notifiers(net, NEXTHOP_EVENT_DEL, nh);
> >+
> > 	/* remove from the tree */
> > 	rb_erase(&nh->rb_node, &net->nexthop.rb_root);
> > 
> >-- 
> >2.26.2
> >

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC PATCH net-next 05/22] nexthop: Add nexthop notification data structures
  2020-09-08 14:43   ` David Ahern
@ 2020-09-11 14:50     ` Ido Schimmel
  0 siblings, 0 replies; 56+ messages in thread
From: Ido Schimmel @ 2020-09-11 14:50 UTC (permalink / raw)
  To: David Ahern; +Cc: netdev, davem, kuba, roopa, mlxsw, Ido Schimmel

On Tue, Sep 08, 2020 at 08:43:10AM -0600, David Ahern wrote:
> On 9/8/20 3:10 AM, Ido Schimmel wrote:
> > From: Ido Schimmel <idosch@nvidia.com>
> > 
> > Add data structures that will be used for nexthop replace and delete
> > notifications in the previously introduced nexthop notification chain.
> > 
> > New data structures are added instead of passing the existing nexthop
> > code structures directly for several reasons.
> > 
> > First, the existing structures encode a lot of bookkeeping information
> > which is irrelevant for listeners of the notification chain.
> > 
> > Second, the existing structures can be changed without worrying about
> > introducing regressions in listeners since they are not accessed
> > directly by them.
> > 
> > Third, listeners of the notification chain do not need to each parse the
> > relatively complex nexthop code structures. They are passing the
> > required information in a simplified way.
> 
> agreed. My preference is for only nexthop.{c,h} to understand and parse
> the nexthop structs.
> 
> 
> > 
> > Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> > ---
> >  include/net/nexthop.h | 35 +++++++++++++++++++++++++++++++++++
> >  1 file changed, 35 insertions(+)
> > 
> > diff --git a/include/net/nexthop.h b/include/net/nexthop.h
> > index 2e44efe5709b..0bde1aa867c0 100644
> > --- a/include/net/nexthop.h
> > +++ b/include/net/nexthop.h
> > @@ -109,6 +109,41 @@ enum nexthop_event_type {
> >  	NEXTHOP_EVENT_DEL
> >  };
> >  
> > +struct nh_notifier_single_info {
> > +	struct net_device *dev;
> > +	u8 gw_family;
> > +	union {
> > +		__be32 ipv4;
> > +		struct in6_addr ipv6;
> > +	};
> > +	u8 is_reject:1,
> > +	   is_fdb:1,
> > +	   is_encap:1;
> 
> use has_encap since it refers to a configuration of a nexthop versus a
> nexthop type.

Will change.

> 
> I take it this is a placeholder until lwt offload is supported?

Yes, I will mention this in the commit message. I didn't bother parsing
all the encap configuration into the struct since no listener is going
to look at it. Only added this bit so that listeners could reject
nexthops that perform encapsulation.

> 
> besides the naming nit,
> 
> Reviewed-by: David Ahern <dsahern@gmail.com>

Thanks for the prompt review, David!

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC PATCH net-next 07/22] nexthop: Prepare new notification info
  2020-09-08 14:55   ` David Ahern
@ 2020-09-11 15:01     ` Ido Schimmel
  0 siblings, 0 replies; 56+ messages in thread
From: Ido Schimmel @ 2020-09-11 15:01 UTC (permalink / raw)
  To: David Ahern; +Cc: netdev, davem, kuba, roopa, mlxsw, Ido Schimmel

On Tue, Sep 08, 2020 at 08:55:06AM -0600, David Ahern wrote:
> On 9/8/20 3:10 AM, Ido Schimmel wrote:
> > From: Ido Schimmel <idosch@nvidia.com>
> > 
> > Prepare the new notification information so that it could be passed to
> > listeners in the new patch.
> > 
> > Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> > ---
> >  net/ipv4/nexthop.c | 108 +++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 108 insertions(+)
> > 
> 
> Reviewed-by: David Ahern <dsahern@gmail.com>
> 
> one trivial comment below.
> 
> > +static void
> > +__nh_notifier_single_info_init(struct nh_notifier_single_info *nh_info,
> > +			       const struct nexthop *nh)
> > +{
> > +	struct nh_info *nhi = rtnl_dereference(nh->nh_info);
> > +
> > +	nh_info->dev = nhi->fib_nhc.nhc_dev;
> > +	nh_info->gw_family = nhi->fib_nhc.nhc_gw_family;
> > +	if (nh_info->gw_family == AF_INET)
> > +		nh_info->ipv4 = nhi->fib_nhc.nhc_gw.ipv4;
> > +	else if (nh_info->gw_family == AF_INET6)
> > +		nh_info->ipv6 = nhi->fib_nhc.nhc_gw.ipv6;
> 
> add a blank line here to make it easier to read.

Done

> 
> > +	nh_info->is_reject = nhi->reject_nh;
> > +	nh_info->is_fdb = nhi->fdb_nh;
> > +	nh_info->is_encap = !!nhi->fib_nhc.nhc_lwtstate;

Also changed this to 'has_encap' given your previous comment

> > +}
> > +
> 

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC PATCH net-next 08/22] nexthop: vxlan: Convert to new notification info
  2020-09-08 14:58   ` David Ahern
@ 2020-09-11 15:05     ` Ido Schimmel
  0 siblings, 0 replies; 56+ messages in thread
From: Ido Schimmel @ 2020-09-11 15:05 UTC (permalink / raw)
  To: David Ahern; +Cc: netdev, davem, kuba, roopa, mlxsw, Ido Schimmel

On Tue, Sep 08, 2020 at 08:58:07AM -0600, David Ahern wrote:
> On 9/8/20 3:10 AM, Ido Schimmel wrote:
> > From: Ido Schimmel <idosch@nvidia.com>
> > 
> > Convert the sole listener of the nexthop notification chain (the VXLAN
> > driver) to the new notification info.
> > 
> > Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> > ---
> >  drivers/net/vxlan.c | 9 +++++++--
> >  net/ipv4/nexthop.c  | 2 +-
> >  2 files changed, 8 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
> > index b9fefe27e3e8..29deedee6ef4 100644
> > --- a/drivers/net/vxlan.c
> > +++ b/drivers/net/vxlan.c
> > @@ -4687,9 +4687,14 @@ static void vxlan_fdb_nh_flush(struct nexthop *nh)
> >  static int vxlan_nexthop_event(struct notifier_block *nb,
> >  			       unsigned long event, void *ptr)
> >  {
> > -	struct nexthop *nh = ptr;
> > +	struct nh_notifier_info *info = ptr;
> > +	struct nexthop *nh;
> > +
> > +	if (event != NEXTHOP_EVENT_DEL)
> > +		return NOTIFY_DONE;
> >  
> > -	if (!nh || event != NEXTHOP_EVENT_DEL)
> > +	nh = nexthop_find_by_id(info->net, info->id);
> 
> hmmm.... why add the id to the info versus a nh pointer if a lookup is
> needed?

This goes back to my reasoning in patch #5. I preferred not to pass the
raw nexthop data structures to listeners as they usually have no
business poking into them. I believe the VXLAN driver is the exception
here as it does need access to 'fdb_list'.

> 
> > +	if (!nh)
> >  		return NOTIFY_DONE;
> >  
> >  	vxlan_fdb_nh_flush(nh);
> 

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC PATCH net-next 09/22] rtnetlink: Add RTNH_F_TRAP flag
  2020-09-08 15:02   ` David Ahern
@ 2020-09-11 15:26     ` Ido Schimmel
  2020-09-11 15:54       ` David Ahern
  0 siblings, 1 reply; 56+ messages in thread
From: Ido Schimmel @ 2020-09-11 15:26 UTC (permalink / raw)
  To: David Ahern; +Cc: netdev, davem, kuba, roopa, mlxsw, Ido Schimmel

On Tue, Sep 08, 2020 at 09:02:33AM -0600, David Ahern wrote:
> On 9/8/20 3:10 AM, Ido Schimmel wrote:
> > From: Ido Schimmel <idosch@nvidia.com>
> > 
> > The flag indicates to user space that the nexthop is not programmed to
> > forward packets in hardware, but rather to trap them.
> 
> please elaborate in the commit message on what 'trap' is doing. I most
> likely will forget a few years from now.

Reworded to:

"
rtnetlink: Add RTNH_F_TRAP flag

The flag indicates to user space that the nexthop is not programmed to
forward packets in hardware, but rather to trap them to the CPU. This is
needed, for example, when the MAC of the nexthop neighbour is not
resolved and packets should reach the CPU to trigger neighbour
resolution.

The flag will be used in subsequent patches by netdevsim to test nexthop
objects programming to device drivers and in the future by mlxsw as
well.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
"

> 
> > 
> > The flag will be used in subsequent patches by netdevsim to test nexthop
> > objects programming to device drivers and in the future by mlxsw as
> > well.
> > 
> > Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> > ---
> >  include/uapi/linux/rtnetlink.h | 6 ++++--
> >  net/ipv4/fib_semantics.c       | 2 ++
> >  2 files changed, 6 insertions(+), 2 deletions(-)
> > 
> 
> Reviewed-by: David Ahern <dsahern@gmail.com>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC PATCH net-next 10/22] nexthop: Allow setting "offload" and "trap" indications on nexthops
  2020-09-08 15:14   ` David Ahern
@ 2020-09-11 15:29     ` Ido Schimmel
  0 siblings, 0 replies; 56+ messages in thread
From: Ido Schimmel @ 2020-09-11 15:29 UTC (permalink / raw)
  To: David Ahern; +Cc: netdev, davem, kuba, roopa, mlxsw, Ido Schimmel

On Tue, Sep 08, 2020 at 09:14:37AM -0600, David Ahern wrote:
> On 9/8/20 3:10 AM, Ido Schimmel wrote:
> > From: Ido Schimmel <idosch@nvidia.com>
> > 
> > Add a function that can be called by device drivers to set "offload" or
> > "trap" indication on nexthops following nexthop notifications.
> > 
> > Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> > ---
> >  include/net/nexthop.h |  1 +
> >  net/ipv4/nexthop.c    | 21 +++++++++++++++++++++
> >  2 files changed, 22 insertions(+)
> > 
> > diff --git a/include/net/nexthop.h b/include/net/nexthop.h
> > index 0bde1aa867c0..4147681e86d2 100644
> > --- a/include/net/nexthop.h
> > +++ b/include/net/nexthop.h
> > @@ -146,6 +146,7 @@ struct nh_notifier_info {
> >  
> >  int register_nexthop_notifier(struct net *net, struct notifier_block *nb);
> >  int unregister_nexthop_notifier(struct net *net, struct notifier_block *nb);
> > +void nexthop_hw_flags_set(struct net *net, u32 id, bool offload, bool trap);
> 
> how about nexthop_set_hw_flags? consistency with current nexthop_get_
> ... naming

Sure. I opted for consistency with fib_alias_hw_flags_set() and
fib6_info_hw_flags_set(), but I'll change to be consistent with nexthop
code.

> 
> >  
> >  /* caller is holding rcu or rtnl; no reference taken to nexthop */
> >  struct nexthop *nexthop_find_by_id(struct net *net, u32 id);
> > diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
> > index 70c8ab6906ec..71605c612458 100644
> > --- a/net/ipv4/nexthop.c
> > +++ b/net/ipv4/nexthop.c
> > @@ -2080,6 +2080,27 @@ int unregister_nexthop_notifier(struct net *net, struct notifier_block *nb)
> >  }
> >  EXPORT_SYMBOL(unregister_nexthop_notifier);
> >  
> > +void nexthop_hw_flags_set(struct net *net, u32 id, bool offload, bool trap)
> > +{
> > +	struct nexthop *nexthop;
> > +
> > +	rcu_read_lock();
> > +
> > +	nexthop = nexthop_find_by_id(net, id);
> > +	if (!nexthop)
> > +		goto out;
> > +
> > +	nexthop->nh_flags &= ~(RTNH_F_OFFLOAD | RTNH_F_TRAP);
> > +	if (offload)
> > +		nexthop->nh_flags |= RTNH_F_OFFLOAD;
> > +	if (trap)
> > +		nexthop->nh_flags |= RTNH_F_TRAP;
> > +
> > +out:
> > +	rcu_read_unlock();
> > +}
> > +EXPORT_SYMBOL(nexthop_hw_flags_set);
> > +
> >  static void __net_exit nexthop_net_exit(struct net *net)
> >  {
> >  	rtnl_lock();
> > 
> 

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC PATCH net-next 09/22] rtnetlink: Add RTNH_F_TRAP flag
  2020-09-11 15:26     ` Ido Schimmel
@ 2020-09-11 15:54       ` David Ahern
  0 siblings, 0 replies; 56+ messages in thread
From: David Ahern @ 2020-09-11 15:54 UTC (permalink / raw)
  To: Ido Schimmel; +Cc: netdev, davem, kuba, roopa, mlxsw, Ido Schimmel

On 9/11/20 9:26 AM, Ido Schimmel wrote:
> Reworded to:
> 
> "
> rtnetlink: Add RTNH_F_TRAP flag
> 
> The flag indicates to user space that the nexthop is not programmed to
> forward packets in hardware, but rather to trap them to the CPU. This is
> needed, for example, when the MAC of the nexthop neighbour is not
> resolved and packets should reach the CPU to trigger neighbour
> resolution.
> 
> The flag will be used in subsequent patches by netdevsim to test nexthop
> objects programming to device drivers and in the future by mlxsw as
> well.
> 
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> Reviewed-by: David Ahern <dsahern@gmail.com>
> "

works for me. thanks

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC PATCH net-next 11/22] nexthop: Emit a notification when a nexthop is added
  2020-09-08 15:21   ` David Ahern
@ 2020-09-11 16:20     ` Ido Schimmel
  0 siblings, 0 replies; 56+ messages in thread
From: Ido Schimmel @ 2020-09-11 16:20 UTC (permalink / raw)
  To: David Ahern; +Cc: netdev, davem, kuba, roopa, mlxsw, Ido Schimmel

On Tue, Sep 08, 2020 at 09:21:08AM -0600, David Ahern wrote:
> On 9/8/20 3:10 AM, Ido Schimmel wrote:
> > From: Ido Schimmel <idosch@nvidia.com>
> > 
> > Emit a notification in the nexthop notification chain when a new nexthop
> > is added (not replaced). The nexthop can either be a new group or a
> > single nexthop.
> 
> Add a comment about why EVENT_REPLACE is generated on an 'added (not
> replaced)' event.

Reworded:

"
nexthop: Emit a notification when a nexthop is added

Emit a notification in the nexthop notification chain when a new nexthop
is added (not replaced). The nexthop can either be a new group or a
single nexthop.

The notification is sent after the nexthop is inserted into the
red-black tree, as listeners might need to callback into the nexthop
code with the nexthop ID in order to mark the nexthop as offloaded.

A 'REPLACE' notification is emitted instead of 'ADD' as the distinction
between the two is not important for in-kernel listeners. In case the
listener is not familiar with the encoded nexthop ID, it can simply
treat it as a new one. This is also consistent with the route offload
API.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
"

> 
> > 
> > The notification is sent after the nexthop is inserted into the
> > red-black tree, as listeners might need to callback into the nexthop
> > code with the nexthop ID in order to mark the nexthop as offloaded.
> > 
> > Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> > ---
> >  include/net/nexthop.h | 3 ++-
> >  net/ipv4/nexthop.c    | 6 +++++-
> >  2 files changed, 7 insertions(+), 2 deletions(-)
> > 
> > diff --git a/include/net/nexthop.h b/include/net/nexthop.h
> > index 4147681e86d2..6431ff8cdb89 100644
> > --- a/include/net/nexthop.h
> > +++ b/include/net/nexthop.h
> > @@ -106,7 +106,8 @@ struct nexthop {
> >  
> >  enum nexthop_event_type {
> >  	NEXTHOP_EVENT_ADD,
> 
> looks like the ADD event is not used and can be removed.

Right. I will remove it in a separate patch

> 
> > -	NEXTHOP_EVENT_DEL
> > +	NEXTHOP_EVENT_DEL,
> > +	NEXTHOP_EVENT_REPLACE,
> >  };
> >  
> >  struct nh_notifier_single_info {
> > diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
> > index 71605c612458..1fa249facd46 100644
> > --- a/net/ipv4/nexthop.c
> > +++ b/net/ipv4/nexthop.c
> > @@ -1277,7 +1277,11 @@ static int insert_nexthop(struct net *net, struct nexthop *new_nh,
> >  
> >  	rb_link_node_rcu(&new_nh->rb_node, parent, pp);
> >  	rb_insert_color(&new_nh->rb_node, root);
> > -	rc = 0;
> > +
> > +	rc = call_nexthop_notifiers(net, NEXTHOP_EVENT_REPLACE, new_nh, extack);
> > +	if (rc)
> > +		rb_erase(&new_nh->rb_node, &net->nexthop.rb_root);
> > +
> >  out:
> >  	if (!rc) {
> >  		nh_base_seq_inc(net);
> > 
> 

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC PATCH net-next 13/22] nexthop: Emit a notification when a single nexthop is replaced
  2020-09-08 15:25   ` David Ahern
@ 2020-09-11 16:24     ` Ido Schimmel
  0 siblings, 0 replies; 56+ messages in thread
From: Ido Schimmel @ 2020-09-11 16:24 UTC (permalink / raw)
  To: David Ahern; +Cc: netdev, davem, kuba, roopa, mlxsw, Ido Schimmel

On Tue, Sep 08, 2020 at 09:25:40AM -0600, David Ahern wrote:
> On 9/8/20 3:10 AM, Ido Schimmel wrote:
> > From: Ido Schimmel <idosch@nvidia.com>
> > 
> > The notification is emitted after all the validation checks were
> > performed, but before the new configuration (i.e., 'struct nh_info') is
> > pointed at by the old shell (i.e., 'struct nexthop'). This prevents the
> > need to perform rollback in case the notification is vetoed.
> > 
> > The next patch will also emit a replace notification for all the nexthop
> > groups in which the nexthop is used.
> > 
> > Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> > ---
> >  net/ipv4/nexthop.c | 10 ++++++++++
> >  1 file changed, 10 insertions(+)
> > 
> > diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
> > index a60a519a5462..b8a4abc00146 100644
> > --- a/net/ipv4/nexthop.c
> > +++ b/net/ipv4/nexthop.c
> > @@ -1099,12 +1099,22 @@ static int replace_nexthop_single(struct net *net, struct nexthop *old,
> >  				  struct netlink_ext_ack *extack)
> >  {
> >  	struct nh_info *oldi, *newi;
> > +	int err;
> >  
> >  	if (new->is_group) {
> >  		NL_SET_ERR_MSG(extack, "Can not replace a nexthop with a nexthop group.");
> >  		return -EINVAL;
> >  	}
> >  
> > +	err = call_nexthop_notifiers(net, NEXTHOP_EVENT_REPLACE, new, extack);
> > +	if (err)
> > +		return err;
> > +
> > +	/* Hardware flags were set on 'old' as 'new' is not in the red-black
> > +	 * tree. Therefore, inherit the flags from 'old' to 'new'.
> > +	 */
> > +	new->nh_flags |= old->nh_flags & (RTNH_F_OFFLOAD | RTNH_F_TRAP);
> 
> Will that always be true? ie., has h/w seen 'new' and offloaded it yet?

Yes. The chain was converted to a blocking chain, so it is possible to
program the hardware inline.

> vs the notifier telling hardware about the change, it does its thing and
> sets the flags. But I guess that creates a race between the offload and
> the new data being available.
> 
> > +
> >  	oldi = rtnl_dereference(old->nh_info);
> >  	newi = rtnl_dereference(new->nh_info);
> >  
> > 
> 

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC PATCH net-next 15/22] nexthop: Emit a notification when a nexthop group is reduced
  2020-09-08 15:33   ` David Ahern
@ 2020-09-11 16:29     ` Ido Schimmel
  0 siblings, 0 replies; 56+ messages in thread
From: Ido Schimmel @ 2020-09-11 16:29 UTC (permalink / raw)
  To: David Ahern; +Cc: netdev, davem, kuba, roopa, mlxsw, Ido Schimmel

On Tue, Sep 08, 2020 at 09:33:42AM -0600, David Ahern wrote:
> On 9/8/20 3:10 AM, Ido Schimmel wrote:
> > From: Ido Schimmel <idosch@nvidia.com>
> > 
> > When a single nexthop is deleted, the configuration of all the groups
> > using the nexthop is effectively modified. In this case, emit a
> > notification in the nexthop notification chain for each modified group
> > so that listeners would not need to keep track of which nexthops are
> > member in which groups.
> > 
> > In the rare cases where the notification fails, emit an error to the
> > kernel log.
> > 
> > Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> > ---
> >  net/ipv4/nexthop.c | 6 +++++-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> > 
> > diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
> > index 0edc3e73d416..33f611bbce1f 100644
> > --- a/net/ipv4/nexthop.c
> > +++ b/net/ipv4/nexthop.c
> > @@ -893,7 +893,7 @@ static void remove_nh_grp_entry(struct net *net, struct nh_grp_entry *nhge,
> >  	struct nexthop *nhp = nhge->nh_parent;
> >  	struct nexthop *nh = nhge->nh;
> >  	struct nh_group *nhg, *newg;
> > -	int i, j;
> > +	int i, j, err;
> >  
> >  	WARN_ON(!nh);
> >  
> > @@ -941,6 +941,10 @@ static void remove_nh_grp_entry(struct net *net, struct nh_grp_entry *nhge,
> >  	list_del(&nhge->nh_list);
> >  	nexthop_put(nhge->nh);
> >  
> > +	err = call_nexthop_notifiers(net, NEXTHOP_EVENT_REPLACE, nhp, NULL);
> > +	if (err)
> > +		pr_err("Failed to replace nexthop group after nexthop deletion\n");
> 
> This should refer to the notifier failing since wrt nexthop code the
> structs are ok. extack on the stack and logging that message would be
> useful too (or have the users of the notifier log why it fails).

'extack on the stack' idea is cool! I will do that

> 
> > +
> >  	if (nlinfo)
> >  		nexthop_notify(RTM_NEWNEXTHOP, nhp, nlinfo);
> >  }
> > 
> 

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC PATCH net-next 17/22] nexthop: Replay nexthops when registering a notifier
  2020-09-08 15:37   ` David Ahern
@ 2020-09-11 16:40     ` Ido Schimmel
  2020-09-11 16:47       ` David Ahern
  0 siblings, 1 reply; 56+ messages in thread
From: Ido Schimmel @ 2020-09-11 16:40 UTC (permalink / raw)
  To: David Ahern; +Cc: netdev, davem, kuba, roopa, mlxsw, Ido Schimmel

On Tue, Sep 08, 2020 at 09:37:10AM -0600, David Ahern wrote:
> On 9/8/20 3:10 AM, Ido Schimmel wrote:
> > From: Ido Schimmel <idosch@nvidia.com>
> > 
> > When registering a new notifier to the nexthop notification chain,
> > replay all the existing nexthops to the new notifier so that it will
> > have a complete picture of the available nexthops.
> > 
> > Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> > ---
> >  net/ipv4/nexthop.c | 54 ++++++++++++++++++++++++++++++++++++++++++++--
> >  1 file changed, 52 insertions(+), 2 deletions(-)
> > 
> > diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
> > index b40c343ca969..6505a0a28df2 100644
> > --- a/net/ipv4/nexthop.c
> > +++ b/net/ipv4/nexthop.c
> > @@ -156,6 +156,27 @@ static int call_nexthop_notifiers(struct net *net,
> >  	return notifier_to_errno(err);
> >  }
> >  
> > +static int call_nexthop_notifier(struct notifier_block *nb, struct net *net,
> > +				 enum nexthop_event_type event_type,
> > +				 struct nexthop *nh,
> > +				 struct netlink_ext_ack *extack)
> > +{
> > +	struct nh_notifier_info info = {
> > +		.net = net,
> > +		.extack = extack,
> > +	};
> > +	int err;
> > +
> > +	err = nh_notifier_info_init(&info, nh);
> > +	if (err)
> > +		return err;
> > +
> > +	err = nb->notifier_call(nb, event_type, &info);
> > +	nh_notifier_info_fini(&info);
> > +
> > +	return notifier_to_errno(err);
> > +}
> > +
> >  static unsigned int nh_dev_hashfn(unsigned int val)
> >  {
> >  	unsigned int mask = NH_DEV_HASHSIZE - 1;
> > @@ -2116,11 +2137,40 @@ static struct notifier_block nh_netdev_notifier = {
> >  	.notifier_call = nh_netdev_event,
> >  };
> >  
> > +static int nexthops_dump(struct net *net, struct notifier_block *nb,
> > +			 struct netlink_ext_ack *extack)
> > +{
> > +	struct rb_root *root = &net->nexthop.rb_root;
> > +	struct rb_node *node;
> > +	int err = 0;
> > +
> > +	for (node = rb_first(root); node; node = rb_next(node)) {
> > +		struct nexthop *nh;
> > +
> > +		nh = rb_entry(node, struct nexthop, rb_node);
> > +		err = call_nexthop_notifier(nb, net, NEXTHOP_EVENT_REPLACE, nh,
> > +					    extack);
> > +		if (err)
> > +			break;
> > +	}
> > +
> > +	return err;
> > +}
> > +
> >  int register_nexthop_notifier(struct net *net, struct notifier_block *nb,
> >  			      struct netlink_ext_ack *extack)
> >  {
> > -	return blocking_notifier_chain_register(&net->nexthop.notifier_chain,
> > -						nb);
> > +	int err;
> > +
> > +	rtnl_lock();
> > +	err = nexthops_dump(net, nb, extack);
> 
> can the unlock be moved here? register function below should not need it.

It can result in this unlikely race:

<t0> - rtnl_lock(); nexthops_dump(); rtnl_unlock()
<t1> - Nexthop is added / deleted
<t2> - blocking_notifier_chain_register()

It is possible to flip the order:

<t0> - blocking_notifier_chain_register()
<t1> - rtnl_lock(); nexthops_dump(); rtnl_unlock()

Worst case:

<t0> - blocking_notifier_chain_register()
<t1> - Nexthop is added / deleted
<t2> - rtnl_lock(); nexthops_dump(); rtnl_unlock()

Which is OK. If we get a delete notification for a nexthop we don't
know, we ignore it. If we get two replace notifications for the same
nexthop we just "update" it.

> 
> > +	if (err)
> > +		goto unlock;
> > +	err = blocking_notifier_chain_register(&net->nexthop.notifier_chain,
> > +					       nb);
> > +unlock:
> > +	rtnl_unlock();
> > +	return err;
> >  }
> >  EXPORT_SYMBOL(register_nexthop_notifier);
> >  
> > 
> 

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC PATCH net-next 17/22] nexthop: Replay nexthops when registering a notifier
  2020-09-11 16:40     ` Ido Schimmel
@ 2020-09-11 16:47       ` David Ahern
  0 siblings, 0 replies; 56+ messages in thread
From: David Ahern @ 2020-09-11 16:47 UTC (permalink / raw)
  To: Ido Schimmel; +Cc: netdev, davem, kuba, roopa, mlxsw, Ido Schimmel

On 9/11/20 10:40 AM, Ido Schimmel wrote:
>>> @@ -2116,11 +2137,40 @@ static struct notifier_block nh_netdev_notifier = {
>>>  	.notifier_call = nh_netdev_event,
>>>  };
>>>  
>>> +static int nexthops_dump(struct net *net, struct notifier_block *nb,
>>> +			 struct netlink_ext_ack *extack)
>>> +{
>>> +	struct rb_root *root = &net->nexthop.rb_root;
>>> +	struct rb_node *node;
>>> +	int err = 0;
>>> +
>>> +	for (node = rb_first(root); node; node = rb_next(node)) {
>>> +		struct nexthop *nh;
>>> +
>>> +		nh = rb_entry(node, struct nexthop, rb_node);
>>> +		err = call_nexthop_notifier(nb, net, NEXTHOP_EVENT_REPLACE, nh,
>>> +					    extack);
>>> +		if (err)
>>> +			break;
>>> +	}
>>> +
>>> +	return err;
>>> +}
>>> +
>>>  int register_nexthop_notifier(struct net *net, struct notifier_block *nb,
>>>  			      struct netlink_ext_ack *extack)
>>>  {
>>> -	return blocking_notifier_chain_register(&net->nexthop.notifier_chain,
>>> -						nb);
>>> +	int err;
>>> +
>>> +	rtnl_lock();
>>> +	err = nexthops_dump(net, nb, extack);
>>
>> can the unlock be moved here? register function below should not need it.
> 
> It can result in this unlikely race:
> 
> <t0> - rtnl_lock(); nexthops_dump(); rtnl_unlock()
> <t1> - Nexthop is added / deleted
> <t2> - blocking_notifier_chain_register()
> 

ok. Let's keep the order you have which I believe is consistent with FIB
notifiers.

^ permalink raw reply	[flat|nested] 56+ messages in thread

end of thread, other threads:[~2020-09-11 17:08 UTC | newest]

Thread overview: 56+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-08  9:10 [RFC PATCH net-next 00/22] nexthop: Add support for nexthop objects offload Ido Schimmel
2020-09-08  9:10 ` [RFC PATCH net-next 01/22] nexthop: Remove unused function declaration from header file Ido Schimmel
2020-09-08 14:29   ` David Ahern
2020-09-08  9:10 ` [RFC PATCH net-next 02/22] nexthop: Convert to blocking notification chain Ido Schimmel
2020-09-08 14:34   ` David Ahern
2020-09-08  9:10 ` [RFC PATCH net-next 03/22] nexthop: Only emit a notification when nexthop is actually deleted Ido Schimmel
2020-09-08 14:34   ` David Ahern
2020-09-08 14:39   ` Jiri Pirko
2020-09-08 14:42     ` David Ahern
2020-09-11 14:40     ` Ido Schimmel
2020-09-08  9:10 ` [RFC PATCH net-next 04/22] selftests: fib_nexthops: Test cleanup of FDB entries following nexthop deletion Ido Schimmel
2020-09-08 14:35   ` David Ahern
2020-09-08  9:10 ` [RFC PATCH net-next 05/22] nexthop: Add nexthop notification data structures Ido Schimmel
2020-09-08 14:43   ` David Ahern
2020-09-11 14:50     ` Ido Schimmel
2020-09-08  9:10 ` [RFC PATCH net-next 06/22] nexthop: Pass extack to nexthop notifier Ido Schimmel
2020-09-08 14:44   ` David Ahern
2020-09-08  9:10 ` [RFC PATCH net-next 07/22] nexthop: Prepare new notification info Ido Schimmel
2020-09-08 14:55   ` David Ahern
2020-09-11 15:01     ` Ido Schimmel
2020-09-08  9:10 ` [RFC PATCH net-next 08/22] nexthop: vxlan: Convert to " Ido Schimmel
2020-09-08 14:58   ` David Ahern
2020-09-11 15:05     ` Ido Schimmel
2020-09-08  9:10 ` [RFC PATCH net-next 09/22] rtnetlink: Add RTNH_F_TRAP flag Ido Schimmel
2020-09-08 15:02   ` David Ahern
2020-09-11 15:26     ` Ido Schimmel
2020-09-11 15:54       ` David Ahern
2020-09-08  9:10 ` [RFC PATCH net-next 10/22] nexthop: Allow setting "offload" and "trap" indications on nexthops Ido Schimmel
2020-09-08 15:14   ` David Ahern
2020-09-11 15:29     ` Ido Schimmel
2020-09-08  9:10 ` [RFC PATCH net-next 11/22] nexthop: Emit a notification when a nexthop is added Ido Schimmel
2020-09-08 15:21   ` David Ahern
2020-09-11 16:20     ` Ido Schimmel
2020-09-08  9:10 ` [RFC PATCH net-next 12/22] nexthop: Emit a notification when a nexthop group is replaced Ido Schimmel
2020-09-08 15:22   ` David Ahern
2020-09-08  9:10 ` [RFC PATCH net-next 13/22] nexthop: Emit a notification when a single nexthop " Ido Schimmel
2020-09-08 15:25   ` David Ahern
2020-09-11 16:24     ` Ido Schimmel
2020-09-08  9:10 ` [RFC PATCH net-next 14/22] nexthop: Emit a notification when a nexthop group is modified Ido Schimmel
2020-09-08 15:29   ` David Ahern
2020-09-08  9:10 ` [RFC PATCH net-next 15/22] nexthop: Emit a notification when a nexthop group is reduced Ido Schimmel
2020-09-08 15:33   ` David Ahern
2020-09-11 16:29     ` Ido Schimmel
2020-09-08  9:10 ` [RFC PATCH net-next 16/22] nexthop: Pass extack to register_nexthop_notifier() Ido Schimmel
2020-09-08 15:34   ` David Ahern
2020-09-08  9:10 ` [RFC PATCH net-next 17/22] nexthop: Replay nexthops when registering a notifier Ido Schimmel
2020-09-08 15:37   ` David Ahern
2020-09-11 16:40     ` Ido Schimmel
2020-09-11 16:47       ` David Ahern
2020-09-08  9:10 ` [RFC PATCH net-next 18/22] nexthop: Remove in-kernel route notifications when nexthop changes Ido Schimmel
2020-09-08 15:38   ` David Ahern
2020-09-08  9:10 ` [RFC PATCH net-next 19/22] netdevsim: Add devlink resource for nexthops Ido Schimmel
2020-09-08  9:10 ` [RFC PATCH net-next 20/22] netdevsim: Add dummy implementation for nexthop offload Ido Schimmel
2020-09-08  9:10 ` [RFC PATCH net-next 21/22] netdevsim: Allow programming routes with nexthop objects Ido Schimmel
2020-09-08 15:40   ` David Ahern
2020-09-08  9:10 ` [RFC PATCH net-next 22/22] selftests: netdevsim: Add test for nexthop offload API Ido Schimmel

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.