All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH RFC net-next 0/7] net: Allow FIB notifiers to fail add and replace
@ 2018-03-22 22:57 David Ahern
  2018-03-22 22:57 ` [PATCH RFC net-next 1/7] net: Fix fib notifer to return errno David Ahern
                   ` (6 more replies)
  0 siblings, 7 replies; 27+ messages in thread
From: David Ahern @ 2018-03-22 22:57 UTC (permalink / raw)
  To: netdev; +Cc: davem, roopa, shm, jiri, idosch, jakub.kicinski, David Ahern

I wanted to revisit how resource overload is handled for hardware offload
of FIB entries and rules. At the moment, the in-kernel fib notifier can
tell a driver about a route or rule add, replace, and delete, but the
notifier can not affect the action. Specifically, in the case of mlxsw
if a route or rule add is going to overflow the ASIC resources the only
recourse is to abort hardware offload. Aborting offload is akin to taking
down the switch as the path from data plane to the control plane simply
can not support the traffic bandwidth of the front panel ports. Further,
the current state of FIB notifiers is inconsistent with other resources
where a driver can affect a user request - e.g., enslavement of a port
into a bridge or a VRF.

As a result of the work done over the past 3+ years, I believe we are
at a point where we can bring consistency to the stack and offloads,
and reliably allow the FIB notifiers to fail a request, pushing an error
along with a suitable error message back to the user. Rather than
aborting offload when the switch is out of resources, userspace is simply
prevented from adding more routes and has a clear indication of why.

Patch 1 fixes call_fib_notifiers to extract the errno from the encoded
response from handlers.

Patches 2-5 allow the call to call_fib_notifiers to fail the add or
replace of a route or rule.

Patch 6 exports some devlink symbols.

Patch 7 adds a simple resource controller to netdevsim to illustrate
how a FIB resource controller can limit the number of route entries.

David Ahern (7):
  net: Fix fib notifer to return errno
  net: Move call_fib_rule_notifiers up in fib_nl_newrule
  net/ipv4: Move call_fib_entry_notifiers up for new routes
  net/ipv4: Allow notifier to fail route repolace
  net/ipv6: Move call_fib6_entry_notifiers up for route adds
  devlink: Export methods to get and set namespace
  netdevsim: Add simple FIB resource controller via devlink

 drivers/net/netdevsim/Makefile    |   4 +
 drivers/net/netdevsim/devlink.c   | 281 ++++++++++++++++++++++++++++++++++++++
 drivers/net/netdevsim/fib.c       | 264 +++++++++++++++++++++++++++++++++++
 drivers/net/netdevsim/netdev.c    |  12 +-
 drivers/net/netdevsim/netdevsim.h |  42 ++++++
 include/net/devlink.h             |   2 +
 net/core/devlink.c                |   6 +-
 net/core/fib_notifier.c           |   5 +-
 net/core/fib_rules.c              |   6 +-
 net/ipv4/fib_trie.c               |  27 +++-
 net/ipv6/ip6_fib.c                |  16 ++-
 11 files changed, 652 insertions(+), 13 deletions(-)
 create mode 100644 drivers/net/netdevsim/devlink.c
 create mode 100644 drivers/net/netdevsim/fib.c

-- 
2.11.0

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH RFC net-next 1/7] net: Fix fib notifer to return errno
  2018-03-22 22:57 [PATCH RFC net-next 0/7] net: Allow FIB notifiers to fail add and replace David Ahern
@ 2018-03-22 22:57 ` David Ahern
  2018-03-25  8:16   ` Ido Schimmel
  2018-03-22 22:57 ` [PATCH RFC net-next 2/7] net: Move call_fib_rule_notifiers up in fib_nl_newrule David Ahern
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 27+ messages in thread
From: David Ahern @ 2018-03-22 22:57 UTC (permalink / raw)
  To: netdev; +Cc: davem, roopa, shm, jiri, idosch, jakub.kicinski, David Ahern

Notifier handlers use notifier_from_errno to convert any potential error
to an encoded format. As a consequence the other side, call_fib_notifiers
in this case, needs to use notifier_to_errno to return the error from
the handler back to its caller.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
---
 net/core/fib_notifier.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/net/core/fib_notifier.c b/net/core/fib_notifier.c
index 5ace0705a3f9..14ba52ebe8c9 100644
--- a/net/core/fib_notifier.c
+++ b/net/core/fib_notifier.c
@@ -21,8 +21,11 @@ EXPORT_SYMBOL(call_fib_notifier);
 int call_fib_notifiers(struct net *net, enum fib_event_type event_type,
 		       struct fib_notifier_info *info)
 {
+	int err;
+
 	info->net = net;
-	return atomic_notifier_call_chain(&fib_chain, event_type, info);
+	err = atomic_notifier_call_chain(&fib_chain, event_type, info);
+	return notifier_to_errno(err);
 }
 EXPORT_SYMBOL(call_fib_notifiers);
 
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH RFC net-next 2/7] net: Move call_fib_rule_notifiers up in fib_nl_newrule
  2018-03-22 22:57 [PATCH RFC net-next 0/7] net: Allow FIB notifiers to fail add and replace David Ahern
  2018-03-22 22:57 ` [PATCH RFC net-next 1/7] net: Fix fib notifer to return errno David Ahern
@ 2018-03-22 22:57 ` David Ahern
  2018-03-22 22:57 ` [PATCH RFC net-next 3/7] net/ipv4: Move call_fib_entry_notifiers up for new routes David Ahern
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 27+ messages in thread
From: David Ahern @ 2018-03-22 22:57 UTC (permalink / raw)
  To: netdev; +Cc: davem, roopa, shm, jiri, idosch, jakub.kicinski, David Ahern

Move call_fib_rule_notifiers up in fib_nl_newrule to the point right
before the rule is inserted into the list. At this point there are no
more failure paths within the core rule code, so if the notifier
does not fail then the rule will be inserted into the list.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
---
 net/core/fib_rules.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/net/core/fib_rules.c b/net/core/fib_rules.c
index f6f04fc0f629..84c427a0c984 100644
--- a/net/core/fib_rules.c
+++ b/net/core/fib_rules.c
@@ -631,6 +631,11 @@ int fib_nl_newrule(struct sk_buff *skb, struct nlmsghdr *nlh,
 	if (err < 0)
 		goto errout_free;
 
+	err = call_fib_rule_notifiers(net, FIB_EVENT_RULE_ADD, rule, ops,
+				      extack);
+	if (err < 0)
+		goto errout_free;
+
 	list_for_each_entry(r, &ops->rules_list, list) {
 		if (r->pref > rule->pref)
 			break;
@@ -667,7 +672,6 @@ int fib_nl_newrule(struct sk_buff *skb, struct nlmsghdr *nlh,
 	if (rule->tun_id)
 		ip_tunnel_need_metadata();
 
-	call_fib_rule_notifiers(net, FIB_EVENT_RULE_ADD, rule, ops, extack);
 	notify_rule_change(RTM_NEWRULE, rule, ops, nlh, NETLINK_CB(skb).portid);
 	flush_route_cache(ops);
 	rules_ops_put(ops);
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH RFC net-next 3/7] net/ipv4: Move call_fib_entry_notifiers up for new routes
  2018-03-22 22:57 [PATCH RFC net-next 0/7] net: Allow FIB notifiers to fail add and replace David Ahern
  2018-03-22 22:57 ` [PATCH RFC net-next 1/7] net: Fix fib notifer to return errno David Ahern
  2018-03-22 22:57 ` [PATCH RFC net-next 2/7] net: Move call_fib_rule_notifiers up in fib_nl_newrule David Ahern
@ 2018-03-22 22:57 ` David Ahern
  2018-03-22 22:57 ` [PATCH RFC net-next 4/7] net/ipv4: Allow notifier to fail route repolace David Ahern
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 27+ messages in thread
From: David Ahern @ 2018-03-22 22:57 UTC (permalink / raw)
  To: netdev; +Cc: davem, roopa, shm, jiri, idosch, jakub.kicinski, David Ahern

Move call to call_fib_entry_notifiers for new IPv4 routes to right
before the call to fib_insert_alias. At this point the only remaining
failure path is memory allocations in fib_insert_node. Handle that
very unlikely failure with a call to call_fib_entry_notifiers to
tell drivers about it.

At this point notifier handlers can decide the fate of the new route
with a clean path to delete the potential new entry if the notifier
returns non-0.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
---
 net/ipv4/fib_trie.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
index 62243a8abf92..d0f70a16f8de 100644
--- a/net/ipv4/fib_trie.c
+++ b/net/ipv4/fib_trie.c
@@ -1065,6 +1065,9 @@ static int fib_insert_node(struct trie *t, struct key_vector *tp,
 	return -ENOMEM;
 }
 
+/* fib notifier for ADD is sent before calling fib_insert_alias with
+ * the expectation that the only possible failure ENOMEM
+ */
 static int fib_insert_alias(struct trie *t, struct key_vector *tp,
 			    struct key_vector *l, struct fib_alias *new,
 			    struct fib_alias *fa, t_key key)
@@ -1263,21 +1266,32 @@ int fib_table_insert(struct net *net, struct fib_table *tb,
 	new_fa->tb_id = tb->tb_id;
 	new_fa->fa_default = -1;
 
+	err = call_fib_entry_notifiers(net, event, key, plen, new_fa, extack);
+	if (err)
+		goto out_free_new_fa;
+
 	/* Insert new entry to the list. */
 	err = fib_insert_alias(t, tp, l, new_fa, fa, key);
 	if (err)
-		goto out_free_new_fa;
+		goto out_fib_notif;
 
 	if (!plen)
 		tb->tb_num_default++;
 
 	rt_cache_flush(cfg->fc_nlinfo.nl_net);
-	call_fib_entry_notifiers(net, event, key, plen, new_fa, extack);
 	rtmsg_fib(RTM_NEWROUTE, htonl(key), new_fa, plen, new_fa->tb_id,
 		  &cfg->fc_nlinfo, nlflags);
 succeeded:
 	return 0;
 
+out_fib_notif:
+	/* notifier was sent that entry would be added to trie, but
+	 * the add failed and need to recover. Only failure for
+	 * fib_insert_alias is ENOMEM.
+	 */
+	NL_SET_ERR_MSG(extack, "Failed to insert route into trie");
+	call_fib_entry_notifiers(net, FIB_EVENT_ENTRY_DEL, key,
+				 plen, new_fa, NULL);
 out_free_new_fa:
 	kmem_cache_free(fn_alias_kmem, new_fa);
 out:
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH RFC net-next 4/7] net/ipv4: Allow notifier to fail route repolace
  2018-03-22 22:57 [PATCH RFC net-next 0/7] net: Allow FIB notifiers to fail add and replace David Ahern
                   ` (2 preceding siblings ...)
  2018-03-22 22:57 ` [PATCH RFC net-next 3/7] net/ipv4: Move call_fib_entry_notifiers up for new routes David Ahern
@ 2018-03-22 22:57 ` David Ahern
  2018-03-22 22:57 ` [PATCH RFC net-next 5/7] net/ipv6: Move call_fib6_entry_notifiers up for route adds David Ahern
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 27+ messages in thread
From: David Ahern @ 2018-03-22 22:57 UTC (permalink / raw)
  To: netdev; +Cc: davem, roopa, shm, jiri, idosch, jakub.kicinski, David Ahern

Add checking to call to call_fib_entry_notifiers for IPv4 route replace.
Allows a notifier handler to fail the replace.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
---
 net/ipv4/fib_trie.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
index d0f70a16f8de..113822aa9ea7 100644
--- a/net/ipv4/fib_trie.c
+++ b/net/ipv4/fib_trie.c
@@ -1219,8 +1219,13 @@ int fib_table_insert(struct net *net, struct fib_table *tb,
 			new_fa->tb_id = tb->tb_id;
 			new_fa->fa_default = -1;
 
-			call_fib_entry_notifiers(net, FIB_EVENT_ENTRY_REPLACE,
-						 key, plen, new_fa, extack);
+			err = call_fib_entry_notifiers(net,
+						       FIB_EVENT_ENTRY_REPLACE,
+						       key, plen, new_fa,
+						       extack);
+			if (err)
+				goto out_free_new_fa;
+
 			rtmsg_fib(RTM_NEWROUTE, htonl(key), new_fa, plen,
 				  tb->tb_id, &cfg->fc_nlinfo, nlflags);
 
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH RFC net-next 5/7] net/ipv6: Move call_fib6_entry_notifiers up for route adds
  2018-03-22 22:57 [PATCH RFC net-next 0/7] net: Allow FIB notifiers to fail add and replace David Ahern
                   ` (3 preceding siblings ...)
  2018-03-22 22:57 ` [PATCH RFC net-next 4/7] net/ipv4: Allow notifier to fail route repolace David Ahern
@ 2018-03-22 22:57 ` David Ahern
  2018-03-22 22:57 ` [PATCH RFC net-next 6/7] devlink: Export methods to get and set namespace David Ahern
  2018-03-22 22:57 ` [PATCH RFC net-next 7/7] netdevsim: Add simple FIB resource controller via devlink David Ahern
  6 siblings, 0 replies; 27+ messages in thread
From: David Ahern @ 2018-03-22 22:57 UTC (permalink / raw)
  To: netdev; +Cc: davem, roopa, shm, jiri, idosch, jakub.kicinski, David Ahern

Move call to call_fib6_entry_notifiers for new IPv6 routes to right
before the insertion into the FIB. At this point notifier handlers can
decide the fate of the new route with a clean path to delete the
potential new entry if the notifier returns non-0.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
---
 net/ipv6/ip6_fib.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index 2f995e9e3050..041c0603841f 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -1007,12 +1007,16 @@ static int fib6_add_rt2node(struct fib6_node *fn, struct rt6_info *rt,
 		if (err)
 			return err;
 
+		err = call_fib6_entry_notifiers(info->nl_net,
+						FIB_EVENT_ENTRY_ADD,
+						rt, extack);
+		if (err)
+			return err;
+
 		rcu_assign_pointer(rt->rt6_next, iter);
 		atomic_inc(&rt->rt6i_ref);
 		rcu_assign_pointer(rt->rt6i_node, fn);
 		rcu_assign_pointer(*ins, rt);
-		call_fib6_entry_notifiers(info->nl_net, FIB_EVENT_ENTRY_ADD,
-					  rt, extack);
 		if (!info->skip_notify)
 			inet6_rt_notify(RTM_NEWROUTE, rt, info, nlflags);
 		info->nl_net->ipv6.rt6_stats->fib_rt_entries++;
@@ -1036,12 +1040,16 @@ static int fib6_add_rt2node(struct fib6_node *fn, struct rt6_info *rt,
 		if (err)
 			return err;
 
+		err = call_fib6_entry_notifiers(info->nl_net,
+						FIB_EVENT_ENTRY_REPLACE,
+						rt, extack);
+		if (err)
+			return err;
+
 		atomic_inc(&rt->rt6i_ref);
 		rcu_assign_pointer(rt->rt6i_node, fn);
 		rt->rt6_next = iter->rt6_next;
 		rcu_assign_pointer(*ins, rt);
-		call_fib6_entry_notifiers(info->nl_net, FIB_EVENT_ENTRY_REPLACE,
-					  rt, extack);
 		if (!info->skip_notify)
 			inet6_rt_notify(RTM_NEWROUTE, rt, info, NLM_F_REPLACE);
 		if (!(fn->fn_flags & RTN_RTINFO)) {
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH RFC net-next 6/7] devlink: Export methods to get and set namespace
  2018-03-22 22:57 [PATCH RFC net-next 0/7] net: Allow FIB notifiers to fail add and replace David Ahern
                   ` (4 preceding siblings ...)
  2018-03-22 22:57 ` [PATCH RFC net-next 5/7] net/ipv6: Move call_fib6_entry_notifiers up for route adds David Ahern
@ 2018-03-22 22:57 ` David Ahern
  2018-03-22 22:57 ` [PATCH RFC net-next 7/7] netdevsim: Add simple FIB resource controller via devlink David Ahern
  6 siblings, 0 replies; 27+ messages in thread
From: David Ahern @ 2018-03-22 22:57 UTC (permalink / raw)
  To: netdev; +Cc: davem, roopa, shm, jiri, idosch, jakub.kicinski, David Ahern

Export devlink_net and devlink_net_set for modules to be able to set
the network namespace for a devlink instance and retrieve it later.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
---
 include/net/devlink.h | 2 ++
 net/core/devlink.c    | 6 ++++--
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/include/net/devlink.h b/include/net/devlink.h
index d5b707375e48..e6f7d0f674a5 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -361,6 +361,8 @@ struct ib_device;
 #if IS_ENABLED(CONFIG_NET_DEVLINK)
 
 struct devlink *devlink_alloc(const struct devlink_ops *ops, size_t priv_size);
+struct net *devlink_net(const struct devlink *devlink);
+void devlink_net_set(struct devlink *devlink, struct net *net);
 int devlink_register(struct devlink *devlink, struct device *dev);
 void devlink_unregister(struct devlink *devlink);
 void devlink_free(struct devlink *devlink);
diff --git a/net/core/devlink.c b/net/core/devlink.c
index d03b96f87c25..53cddffe97fc 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -92,15 +92,17 @@ static LIST_HEAD(devlink_list);
  */
 static DEFINE_MUTEX(devlink_mutex);
 
-static struct net *devlink_net(const struct devlink *devlink)
+struct net *devlink_net(const struct devlink *devlink)
 {
 	return read_pnet(&devlink->_net);
 }
+EXPORT_SYMBOL_GPL(devlink_net);
 
-static void devlink_net_set(struct devlink *devlink, struct net *net)
+void devlink_net_set(struct devlink *devlink, struct net *net)
 {
 	write_pnet(&devlink->_net, net);
 }
+EXPORT_SYMBOL_GPL(devlink_net_set);
 
 static struct devlink *devlink_get_from_attrs(struct net *net,
 					      struct nlattr **attrs)
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH RFC net-next 7/7] netdevsim: Add simple FIB resource controller via devlink
  2018-03-22 22:57 [PATCH RFC net-next 0/7] net: Allow FIB notifiers to fail add and replace David Ahern
                   ` (5 preceding siblings ...)
  2018-03-22 22:57 ` [PATCH RFC net-next 6/7] devlink: Export methods to get and set namespace David Ahern
@ 2018-03-22 22:57 ` David Ahern
  2018-03-23  6:50   ` Jiri Pirko
  2018-03-24  3:47   ` Jakub Kicinski
  6 siblings, 2 replies; 27+ messages in thread
From: David Ahern @ 2018-03-22 22:57 UTC (permalink / raw)
  To: netdev; +Cc: davem, roopa, shm, jiri, idosch, jakub.kicinski, David Ahern

From: David Ahern <dsahern@gmail.com>

Add devlink support to netdevsim and use it to implement a simple,
profile based resource controller. Only one controller is needed
per namespace, so the first netdevsim netdevice in a namespace
registers with devlink. If that device is deleted, the resource
settings are deleted.

The resource controller allows a user to limit the number of IPv4 and
IPv6 FIB entries and FIB rules. The resource paths are:
    /IPv4
    /IPv4/fib
    /IPv4/fib-rules
    /IPv6
    /IPv6/fib
    /IPv6/fib-rules

The IPv4 and IPv6 top level resources are unlimited in size and can not
be changed. From there, the number of FIB entries and FIB rule entries
are unlimited by default. A user can specify a limit for the fib and
fib-rules resources:

    $ devlink resource set netdevsim/netdevsim0 path /IPv4/fib size 96
    $ devlink resource set netdevsim/netdevsim0 path /IPv4/fib-rules size 16
    $ devlink resource set netdevsim/netdevsim0 path /IPv6/fib size 64
    $ devlink resource set netdevsim/netdevsim0 path /IPv6/fib-rules size 16
    $ devlink dev reload netdevsim/netdevsim0

such that the number of rules or routes is limited:
    $ for n in $(seq 1 32); do ip ro add 10.99.$n.0/24 dev eth1; done
    Error: netdevsim: Exceeded number of supported fib entries.

    $ devlink resource show netdevsim/netdevsim0
    netdevsim/netdevsim0:
      name IPv4 size unlimited unit entry size_min 0 size_max unlimited size_gran 1 dpipe_tables non
        resources:
          name fib size 96 occ 96 unit entry size_min 0 size_max unlimited size_gran 1 dpipe_tables
    ...

The resource limits for one namespace have no bearing on another:
    $ ip netns add foobar
    $ ip -netns foobar link add bar type netdevsim
    $ ip netns exec foobar devlink resource show netdevsim/netdevsim1
    netdevsim/netdevsim1:
      name IPv4 size unlimited unit entry size_min 0 size_max unlimited size_gran 1 dpipe_tables non
        resources:
          name fib size unlimited occ 0 unit entry size_min 0 size_max unlimited size_gran 1 dpipe_t
          name fib-rules size unlimited occ 0 unit entry size_min 0 size_max unlimited size_gran 1 d
    ...

With this template in place for resource management, it is fairly trivial
to extend and shows one way to implement a simple counter based resource
controller typical of network profiles.

Signed-off-by: David Ahern <dsahern@gmail.com>
---
 drivers/net/netdevsim/Makefile    |   4 +
 drivers/net/netdevsim/devlink.c   | 281 ++++++++++++++++++++++++++++++++++++++
 drivers/net/netdevsim/fib.c       | 264 +++++++++++++++++++++++++++++++++++
 drivers/net/netdevsim/netdev.c    |  12 +-
 drivers/net/netdevsim/netdevsim.h |  42 ++++++
 5 files changed, 602 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/netdevsim/devlink.c
 create mode 100644 drivers/net/netdevsim/fib.c

diff --git a/drivers/net/netdevsim/Makefile b/drivers/net/netdevsim/Makefile
index 09388c06171d..449b2a1a1800 100644
--- a/drivers/net/netdevsim/Makefile
+++ b/drivers/net/netdevsim/Makefile
@@ -9,3 +9,7 @@ ifeq ($(CONFIG_BPF_SYSCALL),y)
 netdevsim-objs += \
 	bpf.o
 endif
+
+ifneq ($(CONFIG_NET_DEVLINK),)
+netdevsim-objs += devlink.o fib.o
+endif
diff --git a/drivers/net/netdevsim/devlink.c b/drivers/net/netdevsim/devlink.c
new file mode 100644
index 000000000000..d10558e1b022
--- /dev/null
+++ b/drivers/net/netdevsim/devlink.c
@@ -0,0 +1,281 @@
+/*
+ * Copyright (c) 2018 Cumulus Networks. All rights reserved.
+ * Copyright (c) 2018 David Ahern <dsa@cumulusnetworks.com>
+ *
+ * This software is licensed under the GNU General License Version 2,
+ * June 1991 as shown in the file COPYING in the top-level directory of this
+ * source tree.
+ *
+ * THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS"
+ * WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING,
+ * BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE
+ * OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME
+ * THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
+ */
+
+#include <linux/device.h>
+#include <net/devlink.h>
+#include <net/netns/generic.h>
+
+#include "netdevsim.h"
+
+static unsigned int nsim_devlink_id;
+
+/* IPv4
+ */
+static u64 nsim_ipv4_fib_resource_occ_get(struct devlink *devlink)
+{
+	struct net *net = devlink_net(devlink);
+
+	return nsim_fib_get_val(net, NSIM_RESOURCE_IPV4_FIB, false);
+}
+
+static struct devlink_resource_ops nsim_ipv4_fib_res_ops = {
+	.occ_get = nsim_ipv4_fib_resource_occ_get,
+};
+
+static u64 nsim_ipv4_fib_rules_res_occ_get(struct devlink *devlink)
+{
+	struct net *net = devlink_net(devlink);
+
+	return nsim_fib_get_val(net, NSIM_RESOURCE_IPV4_FIB_RULES, false);
+}
+
+static struct devlink_resource_ops nsim_ipv4_fib_rules_res_ops = {
+	.occ_get = nsim_ipv4_fib_rules_res_occ_get,
+};
+
+/* IPv6
+ */
+static u64 nsim_ipv6_fib_resource_occ_get(struct devlink *devlink)
+{
+	struct net *net = devlink_net(devlink);
+
+	return nsim_fib_get_val(net, NSIM_RESOURCE_IPV6_FIB, false);
+}
+
+static struct devlink_resource_ops nsim_ipv6_fib_res_ops = {
+	.occ_get = nsim_ipv6_fib_resource_occ_get,
+};
+
+static u64 nsim_ipv6_fib_rules_res_occ_get(struct devlink *devlink)
+{
+	struct net *net = devlink_net(devlink);
+
+	return nsim_fib_get_val(net, NSIM_RESOURCE_IPV6_FIB_RULES, false);
+}
+
+static struct devlink_resource_ops nsim_ipv6_fib_rules_res_ops = {
+	.occ_get = nsim_ipv6_fib_rules_res_occ_get,
+};
+
+static int devlink_resources_register(struct devlink *devlink)
+{
+	struct devlink_resource_size_params params = {
+		.size_max = (u64)-1,
+		.size_granularity = 1,
+		.unit = DEVLINK_RESOURCE_UNIT_ENTRY
+	};
+	struct net *net = devlink_net(devlink);
+	int err;
+	u64 n;
+
+	/* Resources for IPv4 */
+	err = devlink_resource_register(devlink, "IPv4", (u64)-1,
+					NSIM_RESOURCE_IPV4,
+					DEVLINK_RESOURCE_ID_PARENT_TOP,
+					&params, NULL);
+	if (err) {
+		pr_err("Failed to register IPv4 top resource\n");
+		goto out;
+	}
+
+	n = nsim_fib_get_val(net, NSIM_RESOURCE_IPV4_FIB, true);
+	err = devlink_resource_register(devlink, "fib", n,
+					NSIM_RESOURCE_IPV4_FIB,
+					NSIM_RESOURCE_IPV4,
+					&params, &nsim_ipv4_fib_res_ops);
+	if (err) {
+		pr_err("Failed to register IPv4 FIB resource\n");
+		return err;
+	}
+
+	n = nsim_fib_get_val(net, NSIM_RESOURCE_IPV4_FIB_RULES, true);
+	err = devlink_resource_register(devlink, "fib-rules", n,
+					NSIM_RESOURCE_IPV4_FIB_RULES,
+					NSIM_RESOURCE_IPV4,
+					&params, &nsim_ipv4_fib_rules_res_ops);
+	if (err) {
+		pr_err("Failed to register IPv4 FIB rules resource\n");
+		return err;
+	}
+
+	/* Resources for IPv6 */
+	err = devlink_resource_register(devlink, "IPv6", (u64)-1,
+					NSIM_RESOURCE_IPV6,
+					DEVLINK_RESOURCE_ID_PARENT_TOP,
+					&params, NULL);
+	if (err) {
+		pr_err("Failed to register IPv6 top resource\n");
+		goto out;
+	}
+
+	n = nsim_fib_get_val(net, NSIM_RESOURCE_IPV6_FIB, true);
+	err = devlink_resource_register(devlink, "fib", n,
+					NSIM_RESOURCE_IPV6_FIB,
+					NSIM_RESOURCE_IPV6,
+					&params, &nsim_ipv6_fib_res_ops);
+	if (err) {
+		pr_err("Failed to register IPv6 FIB resource\n");
+		return err;
+	}
+
+	n = nsim_fib_get_val(net, NSIM_RESOURCE_IPV6_FIB_RULES, true);
+	err = devlink_resource_register(devlink, "fib-rules", n,
+					NSIM_RESOURCE_IPV6_FIB_RULES,
+					NSIM_RESOURCE_IPV6,
+					&params, &nsim_ipv6_fib_rules_res_ops);
+	if (err) {
+		pr_err("Failed to register IPv6 FIB rules resource\n");
+		return err;
+	}
+out:
+	return err;
+}
+
+static int nsim_devlink_reload(struct devlink *devlink)
+{
+	enum nsim_resource_id res_ids[] = {
+		NSIM_RESOURCE_IPV4_FIB, NSIM_RESOURCE_IPV4_FIB_RULES,
+		NSIM_RESOURCE_IPV6_FIB, NSIM_RESOURCE_IPV6_FIB_RULES
+	};
+	struct net *net = devlink_net(devlink);
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(res_ids); ++i) {
+		int err;
+		u64 val;
+
+		err = devlink_resource_size_get(devlink, res_ids[i], &val);
+		if (!err) {
+			err = nsim_fib_set_max(net, res_ids[i], val);
+			if (err)
+				return err;
+		}
+	}
+
+	return 0;
+}
+
+static void nsim_devlink_net_reset(struct net *net)
+{
+	enum nsim_resource_id res_ids[] = {
+		NSIM_RESOURCE_IPV4_FIB, NSIM_RESOURCE_IPV4_FIB_RULES,
+		NSIM_RESOURCE_IPV6_FIB, NSIM_RESOURCE_IPV6_FIB_RULES
+	};
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(res_ids); ++i) {
+		if (nsim_fib_set_max(net, res_ids[i], (u64)-1)) {
+			pr_err("Failed to reset limit for resource %u\n",
+			       res_ids[i]);
+		}
+	}
+}
+
+static const struct devlink_ops nsim_devlink_ops = {
+	.reload = nsim_devlink_reload,
+};
+
+void nsim_devlink_teardown(struct netdevsim *ns)
+{
+	if (ns->devlink) {
+		struct net *net = dev_net(ns->netdev);
+		bool *reg_devlink = net_generic(net, nsim_devlink_id);
+
+		devlink_unregister(ns->devlink);
+		devlink_free(ns->devlink);
+		ns->devlink = NULL;
+
+		nsim_devlink_net_reset(net);
+		*reg_devlink = true;
+	}
+}
+
+void nsim_devlink_setup(struct netdevsim *ns)
+{
+	struct net *net = dev_net(ns->netdev);
+	bool *reg_devlink = net_generic(net, nsim_devlink_id);
+	struct devlink *devlink;
+	int err = -ENOMEM;
+
+	/* only one device per namespace controls devlink */
+	if (!*reg_devlink) {
+		ns->devlink = NULL;
+		return;
+	}
+
+	devlink = devlink_alloc(&nsim_devlink_ops, 0);
+	if (!devlink)
+		return;
+
+	devlink_net_set(devlink, net);
+	err = devlink_register(devlink, &ns->dev);
+	if (err)
+		goto err_devlink_free;
+
+	err = devlink_resources_register(devlink);
+	if (err)
+		goto err_dl_unregister;
+
+	ns->devlink = devlink;
+
+	*reg_devlink = false;
+
+	return;
+
+err_dl_unregister:
+	devlink_unregister(devlink);
+err_devlink_free:
+	devlink_free(devlink);
+}
+
+/* Initialize per network namespace state */
+static int __net_init nsim_devlink_netns_init(struct net *net)
+{
+	bool *reg_devlink = net_generic(net, nsim_devlink_id);
+
+	*reg_devlink = true;
+
+	return 0;
+}
+
+static struct pernet_operations nsim_devlink_net_ops __net_initdata = {
+	.init = nsim_devlink_netns_init,
+	.id   = &nsim_devlink_id,
+	.size = sizeof(bool),
+	.async = true,
+};
+
+void nsim_devlink_exit(void)
+{
+	unregister_pernet_subsys(&nsim_devlink_net_ops);
+	nsim_fib_exit();
+}
+
+int nsim_devlink_init(void)
+{
+	int err;
+
+	err = nsim_fib_init();
+	if (err)
+		goto err_out;
+
+	err = register_pernet_subsys(&nsim_devlink_net_ops);
+	if (err)
+		nsim_fib_exit();
+
+err_out:
+	return err;
+}
diff --git a/drivers/net/netdevsim/fib.c b/drivers/net/netdevsim/fib.c
new file mode 100644
index 000000000000..b77dcafc7158
--- /dev/null
+++ b/drivers/net/netdevsim/fib.c
@@ -0,0 +1,264 @@
+/*
+ * Copyright (c) 2018 Cumulus Networks. All rights reserved.
+ * Copyright (c) 2018 David Ahern <dsa@cumulusnetworks.com>
+ *
+ * This software is licensed under the GNU General License Version 2,
+ * June 1991 as shown in the file COPYING in the top-level directory of this
+ * source tree.
+ *
+ * THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS"
+ * WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING,
+ * BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE
+ * OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME
+ * THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
+ */
+
+#include <net/fib_notifier.h>
+#include <net/ip_fib.h>
+#include <net/ip6_fib.h>
+#include <net/fib_rules.h>
+#include <net/netns/generic.h>
+
+#include "netdevsim.h"
+
+struct nsim_fib_entry {
+	u64 max;
+	u64 num;
+};
+
+struct nsim_per_fib_data {
+	struct nsim_fib_entry fib;
+	struct nsim_fib_entry rules;
+};
+
+struct nsim_fib_data {
+	struct nsim_per_fib_data ipv4;
+	struct nsim_per_fib_data ipv6;
+};
+
+static unsigned int nsim_fib_net_id;
+
+u64 nsim_fib_get_val(struct net *net, enum nsim_resource_id res_id, bool max)
+{
+	struct nsim_fib_data *fib_data = net_generic(net, nsim_fib_net_id);
+	struct nsim_fib_entry *entry;
+
+	switch (res_id) {
+	case NSIM_RESOURCE_IPV4_FIB:
+		entry = &fib_data->ipv4.fib;
+		break;
+	case NSIM_RESOURCE_IPV4_FIB_RULES:
+		entry = &fib_data->ipv4.rules;
+		break;
+	case NSIM_RESOURCE_IPV6_FIB:
+		entry = &fib_data->ipv6.fib;
+		break;
+	case NSIM_RESOURCE_IPV6_FIB_RULES:
+		entry = &fib_data->ipv6.rules;
+		break;
+	default:
+		return 0;
+	}
+
+	return max ? entry->max : entry->num;
+}
+
+int nsim_fib_set_max(struct net *net, enum nsim_resource_id res_id, u64 val)
+{
+	struct nsim_fib_data *fib_data = net_generic(net, nsim_fib_net_id);
+	struct nsim_fib_entry *entry;
+	int err = 0;
+
+	switch (res_id) {
+	case NSIM_RESOURCE_IPV4_FIB:
+		entry = &fib_data->ipv4.fib;
+		break;
+	case NSIM_RESOURCE_IPV4_FIB_RULES:
+		entry = &fib_data->ipv4.rules;
+		break;
+	case NSIM_RESOURCE_IPV6_FIB:
+		entry = &fib_data->ipv6.fib;
+		break;
+	case NSIM_RESOURCE_IPV6_FIB_RULES:
+		entry = &fib_data->ipv6.rules;
+		break;
+	default:
+		return 0;
+	}
+
+	/* not allowing a new max to be less than curren occupancy
+	 * --> no means of evicting entries
+	 */
+	if (val < entry->num)
+		err = -EINVAL;
+	else
+		entry->max = val;
+
+	return err;
+}
+
+static int nsim_fib_rule_account(struct nsim_fib_entry *entry, bool add,
+				 struct netlink_ext_ack *extack)
+{
+	int err = 0;
+
+	if (add) {
+		if (entry->num < entry->max) {
+			entry->num++;
+		} else {
+			err = -ENOSPC;
+			NL_SET_ERR_MSG_MOD(extack, "Exceeded number of supported fib rule entries");
+		}
+	} else {
+		entry->num--;
+	}
+
+	return err;
+}
+
+static int nsim_fib_rule_event(struct fib_notifier_info *info, bool add)
+{
+	struct nsim_fib_data *data = net_generic(info->net, nsim_fib_net_id);
+	struct netlink_ext_ack *extack = info->extack;
+	int err = 0;
+
+	switch (info->family) {
+	case AF_INET:
+		err = nsim_fib_rule_account(&data->ipv4.rules, add, extack);
+		break;
+	case AF_INET6:
+		err = nsim_fib_rule_account(&data->ipv6.rules, add, extack);
+		break;
+	}
+
+	return err;
+}
+
+static int nsim_fib_account(struct nsim_fib_entry *entry, bool add,
+			    struct netlink_ext_ack *extack)
+{
+	int err = 0;
+
+	if (add) {
+		if (entry->num < entry->max) {
+			entry->num++;
+		} else {
+			err = -ENOSPC;
+			NL_SET_ERR_MSG_MOD(extack, "Exceeded number of supported fib entries");
+		}
+	} else {
+		entry->num--;
+	}
+
+	return err;
+}
+
+static int nsim_fib_event(struct fib_notifier_info *info, bool add)
+{
+	struct nsim_fib_data *data = net_generic(info->net, nsim_fib_net_id);
+	struct netlink_ext_ack *extack = info->extack;
+	int err = 0;
+
+	switch (info->family) {
+	case AF_INET:
+		err = nsim_fib_account(&data->ipv4.fib, add, extack);
+		break;
+	case AF_INET6:
+		err = nsim_fib_account(&data->ipv6.fib, add, extack);
+		break;
+	}
+
+	return err;
+}
+
+static int nsim_fib_event_nb(struct notifier_block *nb, unsigned long event,
+			     void *ptr)
+{
+	struct fib_notifier_info *info = ptr;
+	int err;
+
+	switch (event) {
+	case FIB_EVENT_RULE_ADD: /* fall through */
+	case FIB_EVENT_RULE_DEL:
+		err = nsim_fib_rule_event(info, event == FIB_EVENT_RULE_ADD);
+		break;
+
+	case FIB_EVENT_ENTRY_ADD:  /* fall through */
+	case FIB_EVENT_ENTRY_DEL:
+		err = nsim_fib_event(info, event == FIB_EVENT_ENTRY_ADD);
+		break;
+	}
+
+	return notifier_from_errno(err);
+}
+
+/* inconsistent dump, trying again */
+static void nsim_fib_dump_inconsistent(struct notifier_block *nb)
+{
+	struct nsim_fib_data *data;
+	struct net *net;
+
+	rcu_read_lock();
+	for_each_net_rcu(net) {
+		data = net_generic(net, nsim_fib_net_id);
+
+		data->ipv4.fib.num = 0ULL;
+		data->ipv4.rules.num = 0ULL;
+
+		data->ipv6.fib.num = 0ULL;
+		data->ipv6.rules.num = 0ULL;
+	}
+	rcu_read_unlock();
+}
+
+static struct notifier_block nsim_fib_nb = {
+	.notifier_call = nsim_fib_event_nb,
+};
+
+/* Initialize per network namespace state */
+static int __net_init nsim_fib_netns_init(struct net *net)
+{
+	struct nsim_fib_data *data = net_generic(net, nsim_fib_net_id);
+
+	data->ipv4.fib.max = (u64)-1;
+	data->ipv4.rules.max = (u64)-1;
+
+	data->ipv6.fib.max = (u64)-1;
+	data->ipv6.rules.max = (u64)-1;
+
+	return 0;
+}
+
+static struct pernet_operations nsim_fib_net_ops __net_initdata = {
+	.init = nsim_fib_netns_init,
+	.id   = &nsim_fib_net_id,
+	.size = sizeof(struct nsim_fib_data),
+	.async = true,
+};
+
+void nsim_fib_exit(void)
+{
+	unregister_pernet_subsys(&nsim_fib_net_ops);
+	unregister_fib_notifier(&nsim_fib_nb);
+}
+
+int nsim_fib_init(void)
+{
+	int err;
+
+	err = register_pernet_subsys(&nsim_fib_net_ops);
+	if (err < 0) {
+		pr_err("Failed to register pernet subsystem\n");
+		goto err_out;
+	}
+
+	err = register_fib_notifier(&nsim_fib_nb, nsim_fib_dump_inconsistent);
+	if (err < 0) {
+		pr_err("Failed to register fib notifier\n");
+		goto err_out;
+	}
+
+err_out:
+	return err;
+}
diff --git a/drivers/net/netdevsim/netdev.c b/drivers/net/netdevsim/netdev.c
index 3fd567928f3d..8b30ab3ea2c2 100644
--- a/drivers/net/netdevsim/netdev.c
+++ b/drivers/net/netdevsim/netdev.c
@@ -167,6 +167,8 @@ static int nsim_init(struct net_device *dev)
 
 	SET_NETDEV_DEV(dev, &ns->dev);
 
+	nsim_devlink_setup(ns);
+
 	return 0;
 
 err_bpf_uninit:
@@ -180,6 +182,7 @@ static void nsim_uninit(struct net_device *dev)
 {
 	struct netdevsim *ns = netdev_priv(dev);
 
+	nsim_devlink_teardown(ns);
 	debugfs_remove_recursive(ns->ddir);
 	nsim_bpf_uninit(ns);
 }
@@ -478,12 +481,18 @@ static int __init nsim_module_init(void)
 	if (err)
 		goto err_debugfs_destroy;
 
-	err = rtnl_link_register(&nsim_link_ops);
+	err = nsim_devlink_init();
 	if (err)
 		goto err_unreg_bus;
 
+	err = rtnl_link_register(&nsim_link_ops);
+	if (err)
+		goto err_dl_fini;
+
 	return 0;
 
+err_dl_fini:
+	nsim_devlink_exit();
 err_unreg_bus:
 	bus_unregister(&nsim_bus);
 err_debugfs_destroy:
@@ -494,6 +503,7 @@ static int __init nsim_module_init(void)
 static void __exit nsim_module_exit(void)
 {
 	rtnl_link_unregister(&nsim_link_ops);
+	nsim_devlink_exit();
 	bus_unregister(&nsim_bus);
 	debugfs_remove_recursive(nsim_ddir);
 }
diff --git a/drivers/net/netdevsim/netdevsim.h b/drivers/net/netdevsim/netdevsim.h
index ea081c10efb8..37dd5f312913 100644
--- a/drivers/net/netdevsim/netdevsim.h
+++ b/drivers/net/netdevsim/netdevsim.h
@@ -64,6 +64,9 @@ struct netdevsim {
 
 	bool bpf_map_accept;
 	struct list_head bpf_bound_maps;
+#if IS_ENABLED(CONFIG_NET_DEVLINK)
+	struct devlink *devlink;
+#endif
 };
 
 extern struct dentry *nsim_ddir;
@@ -103,6 +106,45 @@ nsim_bpf_setup_tc_block_cb(enum tc_setup_type type, void *type_data,
 }
 #endif
 
+#if IS_ENABLED(CONFIG_NET_DEVLINK)
+enum nsim_resource_id {
+	NSIM_RESOURCE_IPV4,
+	NSIM_RESOURCE_IPV4_FIB,
+	NSIM_RESOURCE_IPV4_FIB_RULES,
+	NSIM_RESOURCE_IPV6,
+	NSIM_RESOURCE_IPV6_FIB,
+	NSIM_RESOURCE_IPV6_FIB_RULES,
+};
+
+void nsim_devlink_setup(struct netdevsim *ns);
+void nsim_devlink_teardown(struct netdevsim *ns);
+
+int nsim_devlink_init(void);
+void nsim_devlink_exit(void);
+
+int nsim_fib_init(void);
+void nsim_fib_exit(void);
+u64 nsim_fib_get_val(struct net *net, enum nsim_resource_id res_id, bool max);
+int nsim_fib_set_max(struct net *net, enum nsim_resource_id res_id, u64 val);
+#else
+static inline void nsim_devlink_setup(struct netdevsim *ns)
+{
+}
+
+static inline void nsim_devlink_teardown(struct netdevsim *ns)
+{
+}
+
+static inline int nsim_devlink_init(void)
+{
+	return 0;
+}
+
+static inline void nsim_devlink_exit(void)
+{
+}
+#endif
+
 static inline struct netdevsim *to_nsim(struct device *ptr)
 {
 	return container_of(ptr, struct netdevsim, dev);
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH RFC net-next 7/7] netdevsim: Add simple FIB resource controller via devlink
  2018-03-22 22:57 ` [PATCH RFC net-next 7/7] netdevsim: Add simple FIB resource controller via devlink David Ahern
@ 2018-03-23  6:50   ` Jiri Pirko
  2018-03-23 14:31     ` David Ahern
  2018-03-24  3:47   ` Jakub Kicinski
  1 sibling, 1 reply; 27+ messages in thread
From: Jiri Pirko @ 2018-03-23  6:50 UTC (permalink / raw)
  To: David Ahern
  Cc: netdev, davem, roopa, shm, jiri, idosch, jakub.kicinski, David Ahern

Thu, Mar 22, 2018 at 11:57:57PM CET, dsa@cumulusnetworks.com wrote:
>From: David Ahern <dsahern@gmail.com>

[...]


>+void nsim_devlink_teardown(struct netdevsim *ns)
>+{
>+	if (ns->devlink) {
>+		struct net *net = dev_net(ns->netdev);
>+		bool *reg_devlink = net_generic(net, nsim_devlink_id);
>+
>+		devlink_unregister(ns->devlink);
>+		devlink_free(ns->devlink);
>+		ns->devlink = NULL;
>+
>+		nsim_devlink_net_reset(net);
>+		*reg_devlink = true;
>+	}
>+}
>+
>+void nsim_devlink_setup(struct netdevsim *ns)
>+{
>+	struct net *net = dev_net(ns->netdev);
>+	bool *reg_devlink = net_generic(net, nsim_devlink_id);
>+	struct devlink *devlink;
>+	int err = -ENOMEM;
>+
>+	/* only one device per namespace controls devlink */
>+	if (!*reg_devlink) {
>+		ns->devlink = NULL;
>+		return;
>+	}
>+
>+	devlink = devlink_alloc(&nsim_devlink_ops, 0);
>+	if (!devlink)
>+		return;
>+
>+	devlink_net_set(devlink, net);
>+	err = devlink_register(devlink, &ns->dev);

This reg_devlink construct looks odd. Why don't you leave the devlink
instance in init_ns?



>+	if (err)
>+		goto err_devlink_free;
>+
>+	err = devlink_resources_register(devlink);
>+	if (err)
>+		goto err_dl_unregister;
>+
>+	ns->devlink = devlink;
>+
>+	*reg_devlink = false;
>+
>+	return;
>+
>+err_dl_unregister:
>+	devlink_unregister(devlink);
>+err_devlink_free:
>+	devlink_free(devlink);
>+}
>+
>+/* Initialize per network namespace state */
>+static int __net_init nsim_devlink_netns_init(struct net *net)
>+{
>+	bool *reg_devlink = net_generic(net, nsim_devlink_id);
>+
>+	*reg_devlink = true;
>+
>+	return 0;
>+}
>+
>+static struct pernet_operations nsim_devlink_net_ops __net_initdata = {
>+	.init = nsim_devlink_netns_init,
>+	.id   = &nsim_devlink_id,
>+	.size = sizeof(bool),
>+	.async = true,
>+};

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH RFC net-next 7/7] netdevsim: Add simple FIB resource controller via devlink
  2018-03-23  6:50   ` Jiri Pirko
@ 2018-03-23 14:31     ` David Ahern
  2018-03-23 15:01       ` Jiri Pirko
  0 siblings, 1 reply; 27+ messages in thread
From: David Ahern @ 2018-03-23 14:31 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev, davem, roopa, shm, jiri, idosch, jakub.kicinski, David Ahern

On 3/23/18 12:50 AM, Jiri Pirko wrote:
>> +void nsim_devlink_setup(struct netdevsim *ns)
>> +{
>> +	struct net *net = dev_net(ns->netdev);
>> +	bool *reg_devlink = net_generic(net, nsim_devlink_id);
>> +	struct devlink *devlink;
>> +	int err = -ENOMEM;
>> +
>> +	/* only one device per namespace controls devlink */
>> +	if (!*reg_devlink) {
>> +		ns->devlink = NULL;
>> +		return;
>> +	}
>> +
>> +	devlink = devlink_alloc(&nsim_devlink_ops, 0);
>> +	if (!devlink)
>> +		return;
>> +
>> +	devlink_net_set(devlink, net);
>> +	err = devlink_register(devlink, &ns->dev);
> 
> This reg_devlink construct looks odd. Why don't you leave the devlink
> instance in init_ns?

It is a per-network namespace resource controller. Since struct devlink
has a net entry, the simplest design is to put it into the namespace of
the controller. Without it, controlling resource sizes in namespace
'foobar' has to be done from init_net, which is just wrong.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH RFC net-next 7/7] netdevsim: Add simple FIB resource controller via devlink
  2018-03-23 14:31     ` David Ahern
@ 2018-03-23 15:01       ` Jiri Pirko
  2018-03-23 15:03         ` David Ahern
  0 siblings, 1 reply; 27+ messages in thread
From: Jiri Pirko @ 2018-03-23 15:01 UTC (permalink / raw)
  To: David Ahern
  Cc: netdev, davem, roopa, shm, jiri, idosch, jakub.kicinski, David Ahern

Fri, Mar 23, 2018 at 03:31:02PM CET, dsa@cumulusnetworks.com wrote:
>On 3/23/18 12:50 AM, Jiri Pirko wrote:
>>> +void nsim_devlink_setup(struct netdevsim *ns)
>>> +{
>>> +	struct net *net = dev_net(ns->netdev);
>>> +	bool *reg_devlink = net_generic(net, nsim_devlink_id);
>>> +	struct devlink *devlink;
>>> +	int err = -ENOMEM;
>>> +
>>> +	/* only one device per namespace controls devlink */
>>> +	if (!*reg_devlink) {
>>> +		ns->devlink = NULL;
>>> +		return;
>>> +	}
>>> +
>>> +	devlink = devlink_alloc(&nsim_devlink_ops, 0);
>>> +	if (!devlink)
>>> +		return;
>>> +
>>> +	devlink_net_set(devlink, net);
>>> +	err = devlink_register(devlink, &ns->dev);
>> 
>> This reg_devlink construct looks odd. Why don't you leave the devlink
>> instance in init_ns?
>
>It is a per-network namespace resource controller. Since struct devlink

Wait a second. What do you mean by "per-network namespace"? Devlink
instance is always associated with one physical device. Like an ASIC.


>has a net entry, the simplest design is to put it into the namespace of
>the controller. Without it, controlling resource sizes in namespace
>'foobar' has to be done from init_net, which is just wrong.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH RFC net-next 7/7] netdevsim: Add simple FIB resource controller via devlink
  2018-03-23 15:01       ` Jiri Pirko
@ 2018-03-23 15:03         ` David Ahern
  2018-03-23 15:05           ` Jiri Pirko
  0 siblings, 1 reply; 27+ messages in thread
From: David Ahern @ 2018-03-23 15:03 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev, davem, roopa, shm, jiri, idosch, jakub.kicinski, David Ahern

On 3/23/18 9:01 AM, Jiri Pirko wrote:
> Fri, Mar 23, 2018 at 03:31:02PM CET, dsa@cumulusnetworks.com wrote:
>> On 3/23/18 12:50 AM, Jiri Pirko wrote:
>>>> +void nsim_devlink_setup(struct netdevsim *ns)
>>>> +{
>>>> +	struct net *net = dev_net(ns->netdev);
>>>> +	bool *reg_devlink = net_generic(net, nsim_devlink_id);
>>>> +	struct devlink *devlink;
>>>> +	int err = -ENOMEM;
>>>> +
>>>> +	/* only one device per namespace controls devlink */
>>>> +	if (!*reg_devlink) {
>>>> +		ns->devlink = NULL;
>>>> +		return;
>>>> +	}
>>>> +
>>>> +	devlink = devlink_alloc(&nsim_devlink_ops, 0);
>>>> +	if (!devlink)
>>>> +		return;
>>>> +
>>>> +	devlink_net_set(devlink, net);
>>>> +	err = devlink_register(devlink, &ns->dev);
>>>
>>> This reg_devlink construct looks odd. Why don't you leave the devlink
>>> instance in init_ns?
>>
>> It is a per-network namespace resource controller. Since struct devlink
> 
> Wait a second. What do you mean by "per-network namespace"? Devlink
> instance is always associated with one physical device. Like an ASIC.
> 
> 
>> has a net entry, the simplest design is to put it into the namespace of
>> the controller. Without it, controlling resource sizes in namespace
>> 'foobar' has to be done from init_net, which is just wrong.

you need to look at how netdevsim creates a device per netdevice.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH RFC net-next 7/7] netdevsim: Add simple FIB resource controller via devlink
  2018-03-23 15:03         ` David Ahern
@ 2018-03-23 15:05           ` Jiri Pirko
  2018-03-23 15:13             ` David Ahern
  0 siblings, 1 reply; 27+ messages in thread
From: Jiri Pirko @ 2018-03-23 15:05 UTC (permalink / raw)
  To: David Ahern
  Cc: netdev, davem, roopa, shm, jiri, idosch, jakub.kicinski, David Ahern

Fri, Mar 23, 2018 at 04:03:40PM CET, dsa@cumulusnetworks.com wrote:
>On 3/23/18 9:01 AM, Jiri Pirko wrote:
>> Fri, Mar 23, 2018 at 03:31:02PM CET, dsa@cumulusnetworks.com wrote:
>>> On 3/23/18 12:50 AM, Jiri Pirko wrote:
>>>>> +void nsim_devlink_setup(struct netdevsim *ns)
>>>>> +{
>>>>> +	struct net *net = dev_net(ns->netdev);
>>>>> +	bool *reg_devlink = net_generic(net, nsim_devlink_id);
>>>>> +	struct devlink *devlink;
>>>>> +	int err = -ENOMEM;
>>>>> +
>>>>> +	/* only one device per namespace controls devlink */
>>>>> +	if (!*reg_devlink) {
>>>>> +		ns->devlink = NULL;
>>>>> +		return;
>>>>> +	}
>>>>> +
>>>>> +	devlink = devlink_alloc(&nsim_devlink_ops, 0);
>>>>> +	if (!devlink)
>>>>> +		return;
>>>>> +
>>>>> +	devlink_net_set(devlink, net);
>>>>> +	err = devlink_register(devlink, &ns->dev);
>>>>
>>>> This reg_devlink construct looks odd. Why don't you leave the devlink
>>>> instance in init_ns?
>>>
>>> It is a per-network namespace resource controller. Since struct devlink
>> 
>> Wait a second. What do you mean by "per-network namespace"? Devlink
>> instance is always associated with one physical device. Like an ASIC.
>> 
>> 
>>> has a net entry, the simplest design is to put it into the namespace of
>>> the controller. Without it, controlling resource sizes in namespace
>>> 'foobar' has to be done from init_net, which is just wrong.
>
>you need to look at how netdevsim creates a device per netdevice.

That means one devlink instance for each netdevsim device, doesn't it?

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH RFC net-next 7/7] netdevsim: Add simple FIB resource controller via devlink
  2018-03-23 15:05           ` Jiri Pirko
@ 2018-03-23 15:13             ` David Ahern
  2018-03-24  7:26               ` Jiri Pirko
  0 siblings, 1 reply; 27+ messages in thread
From: David Ahern @ 2018-03-23 15:13 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev, davem, roopa, shm, jiri, idosch, jakub.kicinski, David Ahern

On 3/23/18 9:05 AM, Jiri Pirko wrote:
> Fri, Mar 23, 2018 at 04:03:40PM CET, dsa@cumulusnetworks.com wrote:
>> On 3/23/18 9:01 AM, Jiri Pirko wrote:
>>> Fri, Mar 23, 2018 at 03:31:02PM CET, dsa@cumulusnetworks.com wrote:
>>>> On 3/23/18 12:50 AM, Jiri Pirko wrote:
>>>>>> +void nsim_devlink_setup(struct netdevsim *ns)
>>>>>> +{
>>>>>> +	struct net *net = dev_net(ns->netdev);
>>>>>> +	bool *reg_devlink = net_generic(net, nsim_devlink_id);
>>>>>> +	struct devlink *devlink;
>>>>>> +	int err = -ENOMEM;
>>>>>> +
>>>>>> +	/* only one device per namespace controls devlink */
>>>>>> +	if (!*reg_devlink) {
>>>>>> +		ns->devlink = NULL;
>>>>>> +		return;
>>>>>> +	}
>>>>>> +
>>>>>> +	devlink = devlink_alloc(&nsim_devlink_ops, 0);
>>>>>> +	if (!devlink)
>>>>>> +		return;
>>>>>> +
>>>>>> +	devlink_net_set(devlink, net);
>>>>>> +	err = devlink_register(devlink, &ns->dev);
>>>>>
>>>>> This reg_devlink construct looks odd. Why don't you leave the devlink
>>>>> instance in init_ns?
>>>>
>>>> It is a per-network namespace resource controller. Since struct devlink
>>>
>>> Wait a second. What do you mean by "per-network namespace"? Devlink
>>> instance is always associated with one physical device. Like an ASIC.
>>>
>>>
>>>> has a net entry, the simplest design is to put it into the namespace of
>>>> the controller. Without it, controlling resource sizes in namespace
>>>> 'foobar' has to be done from init_net, which is just wrong.
>>
>> you need to look at how netdevsim creates a device per netdevice.
> 
> That means one devlink instance for each netdevsim device, doesn't it?
> 

yes.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH RFC net-next 7/7] netdevsim: Add simple FIB resource controller via devlink
  2018-03-22 22:57 ` [PATCH RFC net-next 7/7] netdevsim: Add simple FIB resource controller via devlink David Ahern
  2018-03-23  6:50   ` Jiri Pirko
@ 2018-03-24  3:47   ` Jakub Kicinski
  2018-03-24 15:02     ` David Ahern
  1 sibling, 1 reply; 27+ messages in thread
From: Jakub Kicinski @ 2018-03-24  3:47 UTC (permalink / raw)
  To: David Ahern; +Cc: netdev, davem, roopa, shm, jiri, idosch, David Ahern

On Thu, 22 Mar 2018 15:57:57 -0700, David Ahern wrote:
> From: David Ahern <dsahern@gmail.com>
> 
> Add devlink support to netdevsim and use it to implement a simple,
> profile based resource controller. Only one controller is needed
> per namespace, so the first netdevsim netdevice in a namespace
> registers with devlink. If that device is deleted, the resource
> settings are deleted.

FWIW some nits from me blow.

> diff --git a/drivers/net/netdevsim/Makefile b/drivers/net/netdevsim/Makefile
> index 09388c06171d..449b2a1a1800 100644
> --- a/drivers/net/netdevsim/Makefile
> +++ b/drivers/net/netdevsim/Makefile
> @@ -9,3 +9,7 @@ ifeq ($(CONFIG_BPF_SYSCALL),y)
>  netdevsim-objs += \
>  	bpf.o
>  endif
> +
> +ifneq ($(CONFIG_NET_DEVLINK),)

Hm.  Don't you need MAY_USE_DEVLINK dependency perhaps?

> +netdevsim-objs += devlink.o fib.o
> +endif
> diff --git a/drivers/net/netdevsim/devlink.c b/drivers/net/netdevsim/devlink.c
> new file mode 100644
> index 000000000000..d10558e1b022
> --- /dev/null
> +++ b/drivers/net/netdevsim/devlink.c
 
> +static int devlink_resources_register(struct devlink *devlink)
> +{
> +	struct devlink_resource_size_params params = {
> +		.size_max = (u64)-1,
> +		.size_granularity = 1,
> +		.unit = DEVLINK_RESOURCE_UNIT_ENTRY
> +	};
> +	struct net *net = devlink_net(devlink);
> +	int err;
> +	u64 n;
> +
> +	/* Resources for IPv4 */
> +	err = devlink_resource_register(devlink, "IPv4", (u64)-1,
> +					NSIM_RESOURCE_IPV4,
> +					DEVLINK_RESOURCE_ID_PARENT_TOP,
> +					&params, NULL);
> +	if (err) {
> +		pr_err("Failed to register IPv4 top resource\n");
> +		goto out;

nit: why goto out here and return err everywhere else?  If I was to
     choose I'd rather see returns.  goto X; X: return; is less
     obviously correct IMHO.  Besides labels should be called by the
     action they perform/undo, so goto err_return? :S

> +	}
> +
> +	n = nsim_fib_get_val(net, NSIM_RESOURCE_IPV4_FIB, true);
> +	err = devlink_resource_register(devlink, "fib", n,
> +					NSIM_RESOURCE_IPV4_FIB,
> +					NSIM_RESOURCE_IPV4,
> +					&params, &nsim_ipv4_fib_res_ops);
> +	if (err) {
> +		pr_err("Failed to register IPv4 FIB resource\n");
> +		return err;
> +	}
> +
> +	n = nsim_fib_get_val(net, NSIM_RESOURCE_IPV4_FIB_RULES, true);
> +	err = devlink_resource_register(devlink, "fib-rules", n,
> +					NSIM_RESOURCE_IPV4_FIB_RULES,
> +					NSIM_RESOURCE_IPV4,
> +					&params, &nsim_ipv4_fib_rules_res_ops);
> +	if (err) {
> +		pr_err("Failed to register IPv4 FIB rules resource\n");
> +		return err;
> +	}
> +
> +	/* Resources for IPv6 */
> +	err = devlink_resource_register(devlink, "IPv6", (u64)-1,
> +					NSIM_RESOURCE_IPV6,
> +					DEVLINK_RESOURCE_ID_PARENT_TOP,
> +					&params, NULL);
> +	if (err) {
> +		pr_err("Failed to register IPv6 top resource\n");
> +		goto out;
> +	}
> +
> +	n = nsim_fib_get_val(net, NSIM_RESOURCE_IPV6_FIB, true);
> +	err = devlink_resource_register(devlink, "fib", n,
> +					NSIM_RESOURCE_IPV6_FIB,
> +					NSIM_RESOURCE_IPV6,
> +					&params, &nsim_ipv6_fib_res_ops);
> +	if (err) {
> +		pr_err("Failed to register IPv6 FIB resource\n");
> +		return err;
> +	}
> +
> +	n = nsim_fib_get_val(net, NSIM_RESOURCE_IPV6_FIB_RULES, true);
> +	err = devlink_resource_register(devlink, "fib-rules", n,
> +					NSIM_RESOURCE_IPV6_FIB_RULES,
> +					NSIM_RESOURCE_IPV6,
> +					&params, &nsim_ipv6_fib_rules_res_ops);
> +	if (err) {
> +		pr_err("Failed to register IPv6 FIB rules resource\n");
> +		return err;
> +	}
> +out:
> +	return err;
> +}

> +void nsim_devlink_teardown(struct netdevsim *ns)
> +{
> +	if (ns->devlink) {
> +		struct net *net = dev_net(ns->netdev);
> +		bool *reg_devlink = net_generic(net, nsim_devlink_id);

nit: reverse xmas tree

> +		devlink_unregister(ns->devlink);
> +		devlink_free(ns->devlink);
> +		ns->devlink = NULL;
> +
> +		nsim_devlink_net_reset(net);
> +		*reg_devlink = true;
> +	}
> +}

> diff --git a/drivers/net/netdevsim/fib.c b/drivers/net/netdevsim/fib.c
> new file mode 100644
> index 000000000000..b77dcafc7158
> --- /dev/null
> +++ b/drivers/net/netdevsim/fib.c

> +static int nsim_fib_event_nb(struct notifier_block *nb, unsigned long event,
> +			     void *ptr)
> +{
> +	struct fib_notifier_info *info = ptr;
> +	int err;
> +
> +	switch (event) {
> +	case FIB_EVENT_RULE_ADD: /* fall through */

nit: I don't think fall through comment is needed for back-to-back
     cases.

> +	case FIB_EVENT_RULE_DEL:
> +		err = nsim_fib_rule_event(info, event == FIB_EVENT_RULE_ADD);
> +		break;
> +
> +	case FIB_EVENT_ENTRY_ADD:  /* fall through */
> +	case FIB_EVENT_ENTRY_DEL:
> +		err = nsim_fib_event(info, event == FIB_EVENT_ENTRY_ADD);
> +		break;
> +	}
> +
> +	return notifier_from_errno(err);
> +}

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH RFC net-next 7/7] netdevsim: Add simple FIB resource controller via devlink
  2018-03-23 15:13             ` David Ahern
@ 2018-03-24  7:26               ` Jiri Pirko
  2018-03-24 15:05                 ` David Ahern
  0 siblings, 1 reply; 27+ messages in thread
From: Jiri Pirko @ 2018-03-24  7:26 UTC (permalink / raw)
  To: David Ahern
  Cc: netdev, davem, roopa, shm, jiri, idosch, jakub.kicinski, David Ahern

Fri, Mar 23, 2018 at 04:13:14PM CET, dsa@cumulusnetworks.com wrote:
>On 3/23/18 9:05 AM, Jiri Pirko wrote:
>> Fri, Mar 23, 2018 at 04:03:40PM CET, dsa@cumulusnetworks.com wrote:
>>> On 3/23/18 9:01 AM, Jiri Pirko wrote:
>>>> Fri, Mar 23, 2018 at 03:31:02PM CET, dsa@cumulusnetworks.com wrote:
>>>>> On 3/23/18 12:50 AM, Jiri Pirko wrote:
>>>>>>> +void nsim_devlink_setup(struct netdevsim *ns)
>>>>>>> +{
>>>>>>> +	struct net *net = dev_net(ns->netdev);
>>>>>>> +	bool *reg_devlink = net_generic(net, nsim_devlink_id);
>>>>>>> +	struct devlink *devlink;
>>>>>>> +	int err = -ENOMEM;
>>>>>>> +
>>>>>>> +	/* only one device per namespace controls devlink */
>>>>>>> +	if (!*reg_devlink) {
>>>>>>> +		ns->devlink = NULL;
>>>>>>> +		return;
>>>>>>> +	}
>>>>>>> +
>>>>>>> +	devlink = devlink_alloc(&nsim_devlink_ops, 0);
>>>>>>> +	if (!devlink)
>>>>>>> +		return;
>>>>>>> +
>>>>>>> +	devlink_net_set(devlink, net);
>>>>>>> +	err = devlink_register(devlink, &ns->dev);
>>>>>>
>>>>>> This reg_devlink construct looks odd. Why don't you leave the devlink
>>>>>> instance in init_ns?
>>>>>
>>>>> It is a per-network namespace resource controller. Since struct devlink
>>>>
>>>> Wait a second. What do you mean by "per-network namespace"? Devlink
>>>> instance is always associated with one physical device. Like an ASIC.
>>>>
>>>>
>>>>> has a net entry, the simplest design is to put it into the namespace of
>>>>> the controller. Without it, controlling resource sizes in namespace
>>>>> 'foobar' has to be done from init_net, which is just wrong.
>>>
>>> you need to look at how netdevsim creates a device per netdevice.
>> 
>> That means one devlink instance for each netdevsim device, doesn't it?
>> 
>
>yes.

Still not sure how to handle namespaces in devlink. Originally, I
thought it would be okay to leave all devlink instances in init_ns.
Because what happens if you move netdev to another namespace? Should the
devlink move as well? What if you have multiple ports, each in different
namespace. Can user move devlink instance to another namespace? Etc.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH RFC net-next 7/7] netdevsim: Add simple FIB resource controller via devlink
  2018-03-24  3:47   ` Jakub Kicinski
@ 2018-03-24 15:02     ` David Ahern
  2018-03-25  6:35       ` Jakub Kicinski
  0 siblings, 1 reply; 27+ messages in thread
From: David Ahern @ 2018-03-24 15:02 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: netdev, davem, roopa, shm, jiri, idosch, David Ahern

On 3/23/18 9:47 PM, Jakub Kicinski wrote:
> On Thu, 22 Mar 2018 15:57:57 -0700, David Ahern wrote:
>> From: David Ahern <dsahern@gmail.com>
>>
>> Add devlink support to netdevsim and use it to implement a simple,
>> profile based resource controller. Only one controller is needed
>> per namespace, so the first netdevsim netdevice in a namespace
>> registers with devlink. If that device is deleted, the resource
>> settings are deleted.
> 
> FWIW some nits from me blow.
> 
>> diff --git a/drivers/net/netdevsim/Makefile b/drivers/net/netdevsim/Makefile
>> index 09388c06171d..449b2a1a1800 100644
>> --- a/drivers/net/netdevsim/Makefile
>> +++ b/drivers/net/netdevsim/Makefile
>> @@ -9,3 +9,7 @@ ifeq ($(CONFIG_BPF_SYSCALL),y)
>>  netdevsim-objs += \
>>  	bpf.o
>>  endif
>> +
>> +ifneq ($(CONFIG_NET_DEVLINK),)
> 
> Hm.  Don't you need MAY_USE_DEVLINK dependency perhaps?

mlxsw uses CONFIG_NET_DEVLINK in its Makefile.

MAY_USE_DEVLINK seems to only be used in Kconfig files. Not clear to me
why it is needed at all.

> 
>> +netdevsim-objs += devlink.o fib.o
>> +endif
>> diff --git a/drivers/net/netdevsim/devlink.c b/drivers/net/netdevsim/devlink.c
>> new file mode 100644
>> index 000000000000..d10558e1b022
>> --- /dev/null
>> +++ b/drivers/net/netdevsim/devlink.c
>  
>> +static int devlink_resources_register(struct devlink *devlink)
>> +{
>> +	struct devlink_resource_size_params params = {
>> +		.size_max = (u64)-1,
>> +		.size_granularity = 1,
>> +		.unit = DEVLINK_RESOURCE_UNIT_ENTRY
>> +	};
>> +	struct net *net = devlink_net(devlink);
>> +	int err;
>> +	u64 n;
>> +
>> +	/* Resources for IPv4 */
>> +	err = devlink_resource_register(devlink, "IPv4", (u64)-1,
>> +					NSIM_RESOURCE_IPV4,
>> +					DEVLINK_RESOURCE_ID_PARENT_TOP,
>> +					&params, NULL);
>> +	if (err) {
>> +		pr_err("Failed to register IPv4 top resource\n");
>> +		goto out;
> 
> nit: why goto out here and return err everywhere else?  If I was to
>      choose I'd rather see returns.  goto X; X: return; is less
>      obviously correct IMHO.  Besides labels should be called by the
>      action they perform/undo, so goto err_return? :S

Will fix. Just got lost in the many iterations leading up to the RFC.

> 
>> +	}
>> +
>> +	n = nsim_fib_get_val(net, NSIM_RESOURCE_IPV4_FIB, true);
>> +	err = devlink_resource_register(devlink, "fib", n,
>> +					NSIM_RESOURCE_IPV4_FIB,
>> +					NSIM_RESOURCE_IPV4,
>> +					&params, &nsim_ipv4_fib_res_ops);
>> +	if (err) {
>> +		pr_err("Failed to register IPv4 FIB resource\n");
>> +		return err;
>> +	}
>> +
>> +	n = nsim_fib_get_val(net, NSIM_RESOURCE_IPV4_FIB_RULES, true);
>> +	err = devlink_resource_register(devlink, "fib-rules", n,
>> +					NSIM_RESOURCE_IPV4_FIB_RULES,
>> +					NSIM_RESOURCE_IPV4,
>> +					&params, &nsim_ipv4_fib_rules_res_ops);
>> +	if (err) {
>> +		pr_err("Failed to register IPv4 FIB rules resource\n");
>> +		return err;
>> +	}
>> +
>> +	/* Resources for IPv6 */
>> +	err = devlink_resource_register(devlink, "IPv6", (u64)-1,
>> +					NSIM_RESOURCE_IPV6,
>> +					DEVLINK_RESOURCE_ID_PARENT_TOP,
>> +					&params, NULL);
>> +	if (err) {
>> +		pr_err("Failed to register IPv6 top resource\n");
>> +		goto out;
>> +	}
>> +
>> +	n = nsim_fib_get_val(net, NSIM_RESOURCE_IPV6_FIB, true);
>> +	err = devlink_resource_register(devlink, "fib", n,
>> +					NSIM_RESOURCE_IPV6_FIB,
>> +					NSIM_RESOURCE_IPV6,
>> +					&params, &nsim_ipv6_fib_res_ops);
>> +	if (err) {
>> +		pr_err("Failed to register IPv6 FIB resource\n");
>> +		return err;
>> +	}
>> +
>> +	n = nsim_fib_get_val(net, NSIM_RESOURCE_IPV6_FIB_RULES, true);
>> +	err = devlink_resource_register(devlink, "fib-rules", n,
>> +					NSIM_RESOURCE_IPV6_FIB_RULES,
>> +					NSIM_RESOURCE_IPV6,
>> +					&params, &nsim_ipv6_fib_rules_res_ops);
>> +	if (err) {
>> +		pr_err("Failed to register IPv6 FIB rules resource\n");
>> +		return err;
>> +	}
>> +out:
>> +	return err;
>> +}
> 
>> +void nsim_devlink_teardown(struct netdevsim *ns)
>> +{
>> +	if (ns->devlink) {
>> +		struct net *net = dev_net(ns->netdev);
>> +		bool *reg_devlink = net_generic(net, nsim_devlink_id);
> 
> nit: reverse xmas tree

reg_devlink uses net, so the order can not be changed

> 
>> +		devlink_unregister(ns->devlink);
>> +		devlink_free(ns->devlink);
>> +		ns->devlink = NULL;
>> +
>> +		nsim_devlink_net_reset(net);
>> +		*reg_devlink = true;
>> +	}
>> +}
> 
>> diff --git a/drivers/net/netdevsim/fib.c b/drivers/net/netdevsim/fib.c
>> new file mode 100644
>> index 000000000000..b77dcafc7158
>> --- /dev/null
>> +++ b/drivers/net/netdevsim/fib.c
> 
>> +static int nsim_fib_event_nb(struct notifier_block *nb, unsigned long event,
>> +			     void *ptr)
>> +{
>> +	struct fib_notifier_info *info = ptr;
>> +	int err;
>> +
>> +	switch (event) {
>> +	case FIB_EVENT_RULE_ADD: /* fall through */
> 
> nit: I don't think fall through comment is needed for back-to-back
>      cases.

ok.

Thanks for the review.

> 
>> +	case FIB_EVENT_RULE_DEL:
>> +		err = nsim_fib_rule_event(info, event == FIB_EVENT_RULE_ADD);
>> +		break;
>> +
>> +	case FIB_EVENT_ENTRY_ADD:  /* fall through */
>> +	case FIB_EVENT_ENTRY_DEL:
>> +		err = nsim_fib_event(info, event == FIB_EVENT_ENTRY_ADD);
>> +		break;
>> +	}
>> +
>> +	return notifier_from_errno(err);
>> +}

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH RFC net-next 7/7] netdevsim: Add simple FIB resource controller via devlink
  2018-03-24  7:26               ` Jiri Pirko
@ 2018-03-24 15:05                 ` David Ahern
  2018-03-24 16:02                   ` Jiri Pirko
  0 siblings, 1 reply; 27+ messages in thread
From: David Ahern @ 2018-03-24 15:05 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev, davem, roopa, shm, jiri, idosch, jakub.kicinski, David Ahern

On 3/24/18 1:26 AM, Jiri Pirko wrote:
> Fri, Mar 23, 2018 at 04:13:14PM CET, dsa@cumulusnetworks.com wrote:
>> On 3/23/18 9:05 AM, Jiri Pirko wrote:
>>> Fri, Mar 23, 2018 at 04:03:40PM CET, dsa@cumulusnetworks.com wrote:
>>>> On 3/23/18 9:01 AM, Jiri Pirko wrote:
>>>>> Fri, Mar 23, 2018 at 03:31:02PM CET, dsa@cumulusnetworks.com wrote:
>>>>>> On 3/23/18 12:50 AM, Jiri Pirko wrote:
>>>>>>>> +void nsim_devlink_setup(struct netdevsim *ns)
>>>>>>>> +{
>>>>>>>> +	struct net *net = dev_net(ns->netdev);
>>>>>>>> +	bool *reg_devlink = net_generic(net, nsim_devlink_id);
>>>>>>>> +	struct devlink *devlink;
>>>>>>>> +	int err = -ENOMEM;
>>>>>>>> +
>>>>>>>> +	/* only one device per namespace controls devlink */
>>>>>>>> +	if (!*reg_devlink) {
>>>>>>>> +		ns->devlink = NULL;
>>>>>>>> +		return;
>>>>>>>> +	}
>>>>>>>> +
>>>>>>>> +	devlink = devlink_alloc(&nsim_devlink_ops, 0);
>>>>>>>> +	if (!devlink)
>>>>>>>> +		return;
>>>>>>>> +
>>>>>>>> +	devlink_net_set(devlink, net);
>>>>>>>> +	err = devlink_register(devlink, &ns->dev);
>>>>>>>
>>>>>>> This reg_devlink construct looks odd. Why don't you leave the devlink
>>>>>>> instance in init_ns?
>>>>>>
>>>>>> It is a per-network namespace resource controller. Since struct devlink
>>>>>
>>>>> Wait a second. What do you mean by "per-network namespace"? Devlink
>>>>> instance is always associated with one physical device. Like an ASIC.
>>>>>
>>>>>
>>>>>> has a net entry, the simplest design is to put it into the namespace of
>>>>>> the controller. Without it, controlling resource sizes in namespace
>>>>>> 'foobar' has to be done from init_net, which is just wrong.
>>>>
>>>> you need to look at how netdevsim creates a device per netdevice.
>>>
>>> That means one devlink instance for each netdevsim device, doesn't it?
>>>
>>
>> yes.
> 
> Still not sure how to handle namespaces in devlink. Originally, I
> thought it would be okay to leave all devlink instances in init_ns.
> Because what happens if you move netdev to another namespace? Should the
> devlink move as well? What if you have multiple ports, each in different
> namespace. Can user move devlink instance to another namespace? Etc.
> 

The devlink instance is associated with a 'struct device' and those do
not change namespaces AFAIK.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH RFC net-next 7/7] netdevsim: Add simple FIB resource controller via devlink
  2018-03-24 15:05                 ` David Ahern
@ 2018-03-24 16:02                   ` Jiri Pirko
  2018-03-25 14:24                     ` David Ahern
  0 siblings, 1 reply; 27+ messages in thread
From: Jiri Pirko @ 2018-03-24 16:02 UTC (permalink / raw)
  To: David Ahern
  Cc: netdev, davem, roopa, shm, jiri, idosch, jakub.kicinski, David Ahern

Sat, Mar 24, 2018 at 04:05:38PM CET, dsa@cumulusnetworks.com wrote:
>On 3/24/18 1:26 AM, Jiri Pirko wrote:
>> Fri, Mar 23, 2018 at 04:13:14PM CET, dsa@cumulusnetworks.com wrote:
>>> On 3/23/18 9:05 AM, Jiri Pirko wrote:
>>>> Fri, Mar 23, 2018 at 04:03:40PM CET, dsa@cumulusnetworks.com wrote:
>>>>> On 3/23/18 9:01 AM, Jiri Pirko wrote:
>>>>>> Fri, Mar 23, 2018 at 03:31:02PM CET, dsa@cumulusnetworks.com wrote:
>>>>>>> On 3/23/18 12:50 AM, Jiri Pirko wrote:
>>>>>>>>> +void nsim_devlink_setup(struct netdevsim *ns)
>>>>>>>>> +{
>>>>>>>>> +	struct net *net = dev_net(ns->netdev);
>>>>>>>>> +	bool *reg_devlink = net_generic(net, nsim_devlink_id);
>>>>>>>>> +	struct devlink *devlink;
>>>>>>>>> +	int err = -ENOMEM;
>>>>>>>>> +
>>>>>>>>> +	/* only one device per namespace controls devlink */
>>>>>>>>> +	if (!*reg_devlink) {
>>>>>>>>> +		ns->devlink = NULL;
>>>>>>>>> +		return;
>>>>>>>>> +	}
>>>>>>>>> +
>>>>>>>>> +	devlink = devlink_alloc(&nsim_devlink_ops, 0);
>>>>>>>>> +	if (!devlink)
>>>>>>>>> +		return;
>>>>>>>>> +
>>>>>>>>> +	devlink_net_set(devlink, net);
>>>>>>>>> +	err = devlink_register(devlink, &ns->dev);
>>>>>>>>
>>>>>>>> This reg_devlink construct looks odd. Why don't you leave the devlink
>>>>>>>> instance in init_ns?
>>>>>>>
>>>>>>> It is a per-network namespace resource controller. Since struct devlink
>>>>>>
>>>>>> Wait a second. What do you mean by "per-network namespace"? Devlink
>>>>>> instance is always associated with one physical device. Like an ASIC.
>>>>>>
>>>>>>
>>>>>>> has a net entry, the simplest design is to put it into the namespace of
>>>>>>> the controller. Without it, controlling resource sizes in namespace
>>>>>>> 'foobar' has to be done from init_net, which is just wrong.
>>>>>
>>>>> you need to look at how netdevsim creates a device per netdevice.
>>>>
>>>> That means one devlink instance for each netdevsim device, doesn't it?
>>>>
>>>
>>> yes.
>> 
>> Still not sure how to handle namespaces in devlink. Originally, I
>> thought it would be okay to leave all devlink instances in init_ns.
>> Because what happens if you move netdev to another namespace? Should the
>> devlink move as well? What if you have multiple ports, each in different
>> namespace. Can user move devlink instance to another namespace? Etc.
>> 
>
>The devlink instance is associated with a 'struct device' and those do
>not change namespaces AFAIK.

Yeah. But you put devlink instance into namespace according to struct
net_device. That is mismatch.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH RFC net-next 7/7] netdevsim: Add simple FIB resource controller via devlink
  2018-03-24 15:02     ` David Ahern
@ 2018-03-25  6:35       ` Jakub Kicinski
  2018-03-25 14:27         ` David Ahern
  0 siblings, 1 reply; 27+ messages in thread
From: Jakub Kicinski @ 2018-03-25  6:35 UTC (permalink / raw)
  To: David Ahern; +Cc: netdev, davem, roopa, shm, jiri, idosch, David Ahern

On Sat, 24 Mar 2018 09:02:45 -0600, David Ahern wrote:
> >> diff --git a/drivers/net/netdevsim/Makefile b/drivers/net/netdevsim/Makefile
> >> index 09388c06171d..449b2a1a1800 100644
> >> --- a/drivers/net/netdevsim/Makefile
> >> +++ b/drivers/net/netdevsim/Makefile
> >> @@ -9,3 +9,7 @@ ifeq ($(CONFIG_BPF_SYSCALL),y)
> >>  netdevsim-objs += \
> >>  	bpf.o
> >>  endif
> >> +
> >> +ifneq ($(CONFIG_NET_DEVLINK),)  
> > 
> > Hm.  Don't you need MAY_USE_DEVLINK dependency perhaps?  
> 
> mlxsw uses CONFIG_NET_DEVLINK in its Makefile.
> 
> MAY_USE_DEVLINK seems to only be used in Kconfig files. Not clear to me
> why it is needed at all.

NETDEVSIM=y && DEVLINK=m

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH RFC net-next 1/7] net: Fix fib notifer to return errno
  2018-03-22 22:57 ` [PATCH RFC net-next 1/7] net: Fix fib notifer to return errno David Ahern
@ 2018-03-25  8:16   ` Ido Schimmel
  2018-03-25 14:00     ` David Ahern
  0 siblings, 1 reply; 27+ messages in thread
From: Ido Schimmel @ 2018-03-25  8:16 UTC (permalink / raw)
  To: David Ahern; +Cc: netdev, davem, roopa, shm, jiri, idosch, jakub.kicinski

On Thu, Mar 22, 2018 at 03:57:51PM -0700, David Ahern wrote:
> Notifier handlers use notifier_from_errno to convert any potential error
> to an encoded format. As a consequence the other side, call_fib_notifiers
> in this case, needs to use notifier_to_errno to return the error from
> the handler back to its caller.
> 
> Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
> ---
>  net/core/fib_notifier.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/net/core/fib_notifier.c b/net/core/fib_notifier.c
> index 5ace0705a3f9..14ba52ebe8c9 100644
> --- a/net/core/fib_notifier.c
> +++ b/net/core/fib_notifier.c
> @@ -21,8 +21,11 @@ EXPORT_SYMBOL(call_fib_notifier);
>  int call_fib_notifiers(struct net *net, enum fib_event_type event_type,
>  		       struct fib_notifier_info *info)

There's another (less interesting case) - call_fib_notifier(), which is
used to dump the FIB tables for the caller registering to the
notification chain.

For example, if you have a non-default FIB rule in the system and you
modprobe mlxsw, you'll get a silent failure and routes will not be
offloaded. On the other hand, I'm not sure we want to fail the module
loading in such cases.

A possible solution is to have the driver emit a warning via extack for
each route/rule being notified after the abort mechanism was triggered.

>  {
> +	int err;
> +
>  	info->net = net;
> -	return atomic_notifier_call_chain(&fib_chain, event_type, info);
> +	err = atomic_notifier_call_chain(&fib_chain, event_type, info);
> +	return notifier_to_errno(err);
>  }
>  EXPORT_SYMBOL(call_fib_notifiers);
>  
> -- 
> 2.11.0
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH RFC net-next 1/7] net: Fix fib notifer to return errno
  2018-03-25  8:16   ` Ido Schimmel
@ 2018-03-25 14:00     ` David Ahern
  2018-03-25 15:37       ` Ido Schimmel
  0 siblings, 1 reply; 27+ messages in thread
From: David Ahern @ 2018-03-25 14:00 UTC (permalink / raw)
  To: Ido Schimmel; +Cc: netdev, davem, roopa, shm, jiri, idosch, jakub.kicinski

On 3/25/18 2:16 AM, Ido Schimmel wrote:
> On Thu, Mar 22, 2018 at 03:57:51PM -0700, David Ahern wrote:
>> Notifier handlers use notifier_from_errno to convert any potential error
>> to an encoded format. As a consequence the other side, call_fib_notifiers
>> in this case, needs to use notifier_to_errno to return the error from
>> the handler back to its caller.
>>
>> Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
>> ---
>>  net/core/fib_notifier.c | 5 ++++-
>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/core/fib_notifier.c b/net/core/fib_notifier.c
>> index 5ace0705a3f9..14ba52ebe8c9 100644
>> --- a/net/core/fib_notifier.c
>> +++ b/net/core/fib_notifier.c
>> @@ -21,8 +21,11 @@ EXPORT_SYMBOL(call_fib_notifier);
>>  int call_fib_notifiers(struct net *net, enum fib_event_type event_type,
>>  		       struct fib_notifier_info *info)
> 
> There's another (less interesting case) - call_fib_notifier(), which is
> used to dump the FIB tables for the caller registering to the
> notification chain.
> 
> For example, if you have a non-default FIB rule in the system and you
> modprobe mlxsw, you'll get a silent failure and routes will not be
> offloaded. On the other hand, I'm not sure we want to fail the module
> loading in such cases.

right. In normal cases the driver is loaded to create the netdevices
long before any networking config is done. So it seems to me the use
case you refer to, some user would have go out of there way to create a
situation where they install config that is not supported by the driver.

> 
> A possible solution is to have the driver emit a warning via extack for
> each route/rule being notified after the abort mechanism was triggered.

extack is not available on module load.

Per past discussions, something you suggested, we need a message for
"out-of-line" cases like this where a driver notifies userspace of a
problem.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH RFC net-next 7/7] netdevsim: Add simple FIB resource controller via devlink
  2018-03-24 16:02                   ` Jiri Pirko
@ 2018-03-25 14:24                     ` David Ahern
  2018-03-26 14:33                       ` Jiri Pirko
  0 siblings, 1 reply; 27+ messages in thread
From: David Ahern @ 2018-03-25 14:24 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev, davem, roopa, shm, jiri, idosch, jakub.kicinski, David Ahern

On 3/24/18 10:02 AM, Jiri Pirko wrote:
>>>>>>>
>>>>>>> Wait a second. What do you mean by "per-network namespace"? Devlink
>>>>>>> instance is always associated with one physical device. Like an ASIC.
>>>>>>>
>>>>>>>
>>>>>>>> has a net entry, the simplest design is to put it into the namespace of
>>>>>>>> the controller. Without it, controlling resource sizes in namespace
>>>>>>>> 'foobar' has to be done from init_net, which is just wrong.
>>>>>>
>>>>>> you need to look at how netdevsim creates a device per netdevice.
>>>>>
>>>>> That means one devlink instance for each netdevsim device, doesn't it?
>>>>>
>>>>
>>>> yes.
>>>
>>> Still not sure how to handle namespaces in devlink. Originally, I
>>> thought it would be okay to leave all devlink instances in init_ns.
>>> Because what happens if you move netdev to another namespace? Should the
>>> devlink move as well? What if you have multiple ports, each in different
>>> namespace. Can user move devlink instance to another namespace? Etc.
>>>
>>
>> The devlink instance is associated with a 'struct device' and those do
>> not change namespaces AFAIK.
> 
> Yeah. But you put devlink instance into namespace according to struct
> net_device. That is mismatch.
> 

New netdevsim netdevice creates a new 'struct device' which creates a
new devlink instance. The namespace the netdev is created in is then
passed to the devlink instance. Yes, the netdev could change namespaces,
but that is something we can easily prevent if it has a devlink instance.

But really, we are way down a tangent with respect to the intent of this
patch set. I am fine with limiting the example resource controller to
just the init_net; this patch set is really aimed at the bigger idea of
allowing a notifier handler to fail route and rule adds.

We can revisit devlink and namespaces another time, along with moving
switch ports to namespaces - a key part of implementing a feature
equivalent to what Cisco calls a VDC [1].

[1]
https://www.cisco.com/c/en/us/products/collateral/switches/nexus-7000-10-slot-switch/White_Paper_Tech_Overview_Virtual_Device_Contexts.html

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH RFC net-next 7/7] netdevsim: Add simple FIB resource controller via devlink
  2018-03-25  6:35       ` Jakub Kicinski
@ 2018-03-25 14:27         ` David Ahern
  2018-03-25 19:53           ` Jakub Kicinski
  0 siblings, 1 reply; 27+ messages in thread
From: David Ahern @ 2018-03-25 14:27 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: netdev, davem, roopa, shm, jiri, idosch, David Ahern

On 3/25/18 12:35 AM, Jakub Kicinski wrote:
> On Sat, 24 Mar 2018 09:02:45 -0600, David Ahern wrote:
>>>> diff --git a/drivers/net/netdevsim/Makefile b/drivers/net/netdevsim/Makefile
>>>> index 09388c06171d..449b2a1a1800 100644
>>>> --- a/drivers/net/netdevsim/Makefile
>>>> +++ b/drivers/net/netdevsim/Makefile
>>>> @@ -9,3 +9,7 @@ ifeq ($(CONFIG_BPF_SYSCALL),y)
>>>>  netdevsim-objs += \
>>>>  	bpf.o
>>>>  endif
>>>> +
>>>> +ifneq ($(CONFIG_NET_DEVLINK),)  
>>>
>>> Hm.  Don't you need MAY_USE_DEVLINK dependency perhaps?  
>>
>> mlxsw uses CONFIG_NET_DEVLINK in its Makefile.
>>
>> MAY_USE_DEVLINK seems to only be used in Kconfig files. Not clear to me
>> why it is needed at all.
> 
> NETDEVSIM=y && DEVLINK=m
> 

ok. I purposely did not add DEVLINK as a dependency to netdevsim to make
the resource controller truly optional. Can add it if you prefer.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH RFC net-next 1/7] net: Fix fib notifer to return errno
  2018-03-25 14:00     ` David Ahern
@ 2018-03-25 15:37       ` Ido Schimmel
  0 siblings, 0 replies; 27+ messages in thread
From: Ido Schimmel @ 2018-03-25 15:37 UTC (permalink / raw)
  To: David Ahern; +Cc: netdev, davem, roopa, shm, jiri, idosch, jakub.kicinski

On Sun, Mar 25, 2018 at 08:00:19AM -0600, David Ahern wrote:
> On 3/25/18 2:16 AM, Ido Schimmel wrote:
> > On Thu, Mar 22, 2018 at 03:57:51PM -0700, David Ahern wrote:
> >> Notifier handlers use notifier_from_errno to convert any potential error
> >> to an encoded format. As a consequence the other side, call_fib_notifiers
> >> in this case, needs to use notifier_to_errno to return the error from
> >> the handler back to its caller.
> >>
> >> Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
> >> ---
> >>  net/core/fib_notifier.c | 5 ++++-
> >>  1 file changed, 4 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/net/core/fib_notifier.c b/net/core/fib_notifier.c
> >> index 5ace0705a3f9..14ba52ebe8c9 100644
> >> --- a/net/core/fib_notifier.c
> >> +++ b/net/core/fib_notifier.c
> >> @@ -21,8 +21,11 @@ EXPORT_SYMBOL(call_fib_notifier);
> >>  int call_fib_notifiers(struct net *net, enum fib_event_type event_type,
> >>  		       struct fib_notifier_info *info)
> > 
> > There's another (less interesting case) - call_fib_notifier(), which is
> > used to dump the FIB tables for the caller registering to the
> > notification chain.
> > 
> > For example, if you have a non-default FIB rule in the system and you
> > modprobe mlxsw, you'll get a silent failure and routes will not be
> > offloaded. On the other hand, I'm not sure we want to fail the module
> > loading in such cases.
> 
> right. In normal cases the driver is loaded to create the netdevices
> long before any networking config is done. So it seems to me the use
> case you refer to, some user would have go out of there way to create a
> situation where they install config that is not supported by the driver.

Yes.

> > A possible solution is to have the driver emit a warning via extack for
> > each route/rule being notified after the abort mechanism was triggered.
> 
> extack is not available on module load.

I'm aware. I meant that during module load we'll trigger the abort
mechanism (due to an unsupported FIB rule for example), then when user
configures additional routes and extack is available we'll emit a
warning.

> Per past discussions, something you suggested, we need a message for
> "out-of-line" cases like this where a driver notifies userspace of a
> problem.

That's another possibility. We can implement both options to make it
perfectly clear to users and daemons what's going on.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH RFC net-next 7/7] netdevsim: Add simple FIB resource controller via devlink
  2018-03-25 14:27         ` David Ahern
@ 2018-03-25 19:53           ` Jakub Kicinski
  0 siblings, 0 replies; 27+ messages in thread
From: Jakub Kicinski @ 2018-03-25 19:53 UTC (permalink / raw)
  To: David Ahern; +Cc: netdev, davem, roopa, shm, jiri, idosch, David Ahern

On Sun, 25 Mar 2018 08:27:42 -0600, David Ahern wrote:
> On 3/25/18 12:35 AM, Jakub Kicinski wrote:
> > On Sat, 24 Mar 2018 09:02:45 -0600, David Ahern wrote:  
> >>>> diff --git a/drivers/net/netdevsim/Makefile b/drivers/net/netdevsim/Makefile
> >>>> index 09388c06171d..449b2a1a1800 100644
> >>>> --- a/drivers/net/netdevsim/Makefile
> >>>> +++ b/drivers/net/netdevsim/Makefile
> >>>> @@ -9,3 +9,7 @@ ifeq ($(CONFIG_BPF_SYSCALL),y)
> >>>>  netdevsim-objs += \
> >>>>  	bpf.o
> >>>>  endif
> >>>> +
> >>>> +ifneq ($(CONFIG_NET_DEVLINK),)    
> >>>
> >>> Hm.  Don't you need MAY_USE_DEVLINK dependency perhaps?    
> >>
> >> mlxsw uses CONFIG_NET_DEVLINK in its Makefile.
> >>
> >> MAY_USE_DEVLINK seems to only be used in Kconfig files. Not clear to me
> >> why it is needed at all.  
> > 
> > NETDEVSIM=y && DEVLINK=m
> >   
> 
> ok. I purposely did not add DEVLINK as a dependency to netdevsim to make
> the resource controller truly optional. Can add it if you prefer.

Oh, no, I don't mind.  I just thought NETDEVSIM=y DEVLINK=m case will
break the build, but I haven't tested.  If it works perfect, let's not
add unnecessary dependencies :)

(FWIW the MAY_USE_DEVLINK dep is basically depend on DEVLINK ||
DEVLINK=n, so one can still build without devlink but if devlink is a
module netdevsim will also have to be.)

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH RFC net-next 7/7] netdevsim: Add simple FIB resource controller via devlink
  2018-03-25 14:24                     ` David Ahern
@ 2018-03-26 14:33                       ` Jiri Pirko
  0 siblings, 0 replies; 27+ messages in thread
From: Jiri Pirko @ 2018-03-26 14:33 UTC (permalink / raw)
  To: David Ahern
  Cc: netdev, davem, roopa, shm, jiri, idosch, jakub.kicinski, David Ahern

Sun, Mar 25, 2018 at 04:24:11PM CEST, dsa@cumulusnetworks.com wrote:
>On 3/24/18 10:02 AM, Jiri Pirko wrote:
>>>>>>>>
>>>>>>>> Wait a second. What do you mean by "per-network namespace"? Devlink
>>>>>>>> instance is always associated with one physical device. Like an ASIC.
>>>>>>>>
>>>>>>>>
>>>>>>>>> has a net entry, the simplest design is to put it into the namespace of
>>>>>>>>> the controller. Without it, controlling resource sizes in namespace
>>>>>>>>> 'foobar' has to be done from init_net, which is just wrong.
>>>>>>>
>>>>>>> you need to look at how netdevsim creates a device per netdevice.
>>>>>>
>>>>>> That means one devlink instance for each netdevsim device, doesn't it?
>>>>>>
>>>>>
>>>>> yes.
>>>>
>>>> Still not sure how to handle namespaces in devlink. Originally, I
>>>> thought it would be okay to leave all devlink instances in init_ns.
>>>> Because what happens if you move netdev to another namespace? Should the
>>>> devlink move as well? What if you have multiple ports, each in different
>>>> namespace. Can user move devlink instance to another namespace? Etc.
>>>>
>>>
>>> The devlink instance is associated with a 'struct device' and those do
>>> not change namespaces AFAIK.
>> 
>> Yeah. But you put devlink instance into namespace according to struct
>> net_device. That is mismatch.
>> 
>
>New netdevsim netdevice creates a new 'struct device' which creates a
>new devlink instance. The namespace the netdev is created in is then
>passed to the devlink instance. Yes, the netdev could change namespaces,
>but that is something we can easily prevent if it has a devlink instance.
>
>But really, we are way down a tangent with respect to the intent of this
>patch set. I am fine with limiting the example resource controller to

I know. That is just something that I spotted :)

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2018-03-26 14:33 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-03-22 22:57 [PATCH RFC net-next 0/7] net: Allow FIB notifiers to fail add and replace David Ahern
2018-03-22 22:57 ` [PATCH RFC net-next 1/7] net: Fix fib notifer to return errno David Ahern
2018-03-25  8:16   ` Ido Schimmel
2018-03-25 14:00     ` David Ahern
2018-03-25 15:37       ` Ido Schimmel
2018-03-22 22:57 ` [PATCH RFC net-next 2/7] net: Move call_fib_rule_notifiers up in fib_nl_newrule David Ahern
2018-03-22 22:57 ` [PATCH RFC net-next 3/7] net/ipv4: Move call_fib_entry_notifiers up for new routes David Ahern
2018-03-22 22:57 ` [PATCH RFC net-next 4/7] net/ipv4: Allow notifier to fail route repolace David Ahern
2018-03-22 22:57 ` [PATCH RFC net-next 5/7] net/ipv6: Move call_fib6_entry_notifiers up for route adds David Ahern
2018-03-22 22:57 ` [PATCH RFC net-next 6/7] devlink: Export methods to get and set namespace David Ahern
2018-03-22 22:57 ` [PATCH RFC net-next 7/7] netdevsim: Add simple FIB resource controller via devlink David Ahern
2018-03-23  6:50   ` Jiri Pirko
2018-03-23 14:31     ` David Ahern
2018-03-23 15:01       ` Jiri Pirko
2018-03-23 15:03         ` David Ahern
2018-03-23 15:05           ` Jiri Pirko
2018-03-23 15:13             ` David Ahern
2018-03-24  7:26               ` Jiri Pirko
2018-03-24 15:05                 ` David Ahern
2018-03-24 16:02                   ` Jiri Pirko
2018-03-25 14:24                     ` David Ahern
2018-03-26 14:33                       ` Jiri Pirko
2018-03-24  3:47   ` Jakub Kicinski
2018-03-24 15:02     ` David Ahern
2018-03-25  6:35       ` Jakub Kicinski
2018-03-25 14:27         ` David Ahern
2018-03-25 19:53           ` Jakub Kicinski

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.