Netfilter-Devel Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH 0/8 net] the indirect flow_block offload, revisited
@ 2020-05-13 16:41 Pablo Neira Ayuso
  2020-05-13 16:41 ` [PATCH 1/8 net] netfilter: nf_flowtable: expose nf_flow_table_gc_cleanup() Pablo Neira Ayuso
                   ` (8 more replies)
  0 siblings, 9 replies; 14+ messages in thread
From: Pablo Neira Ayuso @ 2020-05-13 16:41 UTC (permalink / raw)
  To: netfilter-devel
  Cc: davem, netdev, paulb, ozsh, vladbu, jiri, kuba, saeedm, michael.chan

Hi,

This patchset fixes the indirect flow_block support for the tc CT action
offload. Please, note that this batch is probably slightly large for the
net tree, however, I could not find a simple incremental fix.

= The problem

The nf_flow_table_indr_block_cb() function provides the tunnel netdevice
and the indirect flow_block driver callback. From this tunnel netdevice,
it is not possible to obtain the tc CT flow_block. Note that tc qdisc
and netfilter backtrack from the tunnel netdevice to the tc block /
netfilter chain to reach the flow_block object. This allows them to
clean up the hardware offload rules if the tunnel device is removed.

= What is the indirect flow_block infrastructure?

The indirect flow_block infrastructure allows drivers to offload
tc/netfilter rules that belong to software tunnel netdevices, e.g.
vxlan.

This indirect flow_block infrastructure relates tunnel netdevices with
drivers because there is no obvious way to relate these two things
from the control plane.

= How does the indirect flow_block work before this patchset?

Front-ends register the indirect flow_block callback through
flow_indr_add_block_cb() if they support for offloading tunnel
netdevices.

== Setting up an indirect flow_block

1) Drivers track tunnel netdevices via NETDEV_{REGISTER,UNREGISTER} events.
   If there is a new tunnel netdevice that the driver can offload, then the
   driver invokes __flow_indr_block_cb_register() with the new tunnel
   netdevice and the driver callback. The __flow_indr_block_cb_register()
   call iterates over the list of the front-end callbacks.

2) The front-end callback sets up the flow_block_offload structure and it
   invokes the driver callback to set up the flow_block.

3) The driver callback now registers the flow_block structure and it
   returns the flow_block back to the front-end.

4) The front-end gets the flow_block object and it is now ready to
   offload rules for this tunnel netdevice.

A simplified callgraph is represented below.

        Front-end                      Driver

                                   NETDEV_REGISTER
                                         |
                         __flow_indr_block_cb_register(netdev, cb_priv, driver_cb)
                                         | [1]
            .----------  frontend_indr_block_cb(cb_priv, driver_cb)
            |
   setup_flow_block_offload(bo)
            | [2]
   driver_cb(bo, cb_priv) ---------------.
                                         |
                                  set up flow_blocks [3]
                                         |
   add rules to flow_block <-------------'
   TC_SETUP_CLSFLOWER [4]

== Releasing the indirect flow_block

There are two possibilities, either tunnel netdevice is removed or
a netdevice (port representor) is removed.

=== Tunnel netdevice is removed

Driver waits for the NETDEV_UNREGISTER event that announces the tunnel
netdevice removal. Then, it calls __flow_indr_block_cb_unregister() to
remove the flow_block and rules.  Callgraph is very similar to the one
described above.

=== Netdevice is removed (port representor)

Driver calls __flow_indr_block_cb_unregister() to remove the existing
netfilter/tc rule that belong to the tunnel netdevice.

= How does the indirect flow_block work after this patchset?

Drivers register the indirect flow_block setup callback through
flow_indr_dev_register() if they support for offloading tunnel
netdevices.

== Setting up an indirect flow_block

1) Frontends check if dev->netdev_ops->ndo_setup_tc is unset. If so,
   frontends call flow_indr_dev_setup_offload(). This call invokes
   the drivers' indirect flow_block setup callback.

2) The indirect flow_block setup callback sets up a flow_block structure
   which relates the tunnel netdevice and the driver.

3) The front-end uses flow_block and offload the rules.

Note that the operational to set up (non-indirect) flow_block is very
similar.

== Releasing the indirect flow_block

=== Tunnel netdevice is removed

This calls flow_indr_dev_setup_offload() to set down the flow_block and
remove the offloaded rules. This alternate path is exercised if
dev->netdev_ops->ndo_setup_tc is unset.

=== Netdevice is removed (port representor)

If a netdevice is removed, then it might need to to clean up the
offloaded tc/netfilter rules that belongs to the tunnel netdevice:

1) The driver invokes flow_indr_dev_unregister() when a netdevice is
   removed.

2) This call iterates over the existing indirect flow_blocks
   and it invokes the cleanup callback to let the front-end remove the
   tc/netfilter rules. The cleanup callback already provides the
   flow_block that the front-end needs to clean up.

        Front-end                      Driver

                                         |
                            flow_indr_dev_unregister(...)
                                         |
                         iterate over list of indirect flow_block
                               and invoke cleanup callback
                                         |
            .-----------------------------
            |
            .
   frontend_flow_block_cleanup(flow_block)
            .
            |
           \/
   remove rules to flow_block
      TC_SETUP_CLSFLOWER

= About this patchset

This patchset aims to address the existing TC CT problem while
simplifying the indirect flow_block infrastructure. Saving 300 LoC in
the flow_offload core and the drivers. The operational gets aligned with
the (non-indirect) flow_blocks logic. Patchset is composed of:

Patch #1 add nf_flow_table_gc_cleanup() which is required by the
         netfilter's flowtable new indirect flow_block approach.

Patch #2 adds the flow_block_indr object which is actually part of
         of the flow_block object. This stores the indirect flow_block
	 metadata such as the tunnel netdevice owner and the cleanup
	 callback (in case the tunnel netdevice goes away).

	 This patch adds flow_indr_dev_{un}register() to allow drivers
         to offer netdevice tunnel hardware offload to the front-ends.
	 Then, front-ends call flow_indr_dev_setup_offload() to invoke
	 the drivers to set up the (indirect) flow_block.

Patch #3 add the tcf_block_offload_init() helper function, this is
	 a preparation patch to adapt the tc front-end to use this
	 new indirect flow_block infrastructure.

Patch #4 updates the tc and netfilter front-ends to use the new
	 indirect flow_block infrastructure.

Patch #5 updates the mlx5 driver to use the new indirect flow_block
	 infrastructure.

Patch #6 updates the nfp driver to use the new indirect flow_block
         infrastructure.

Patch #7 updates the bnxt driver to use the new indirect flow_block
         infrastructure.

Patch #8 removes the indirect flow_block infrastructure version 1,
         now that frontends and drivers have been translated to
	 version 2 (coming in this patchset).

Please, apply.

Pablo Neira Ayuso (8):
  netfilter: nf_flowtable: expose nf_flow_table_gc_cleanup()
  net: flow_offload: consolidate indirect flow_block infrastructure
  net: cls_api: add tcf_block_offload_init()
  net: use flow_indr_dev_setup_offload()
  mlx5: update indirect block support
  nfp: update indirect block support
  bnxt_tc: update indirect block support
  net: remove indirect block netdev event registration

 drivers/net/ethernet/broadcom/bnxt/bnxt.h     |   1 -
 drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c  |  51 +--
 .../net/ethernet/mellanox/mlx5/core/en_rep.c  |  83 +----
 .../net/ethernet/mellanox/mlx5/core/en_rep.h  |   5 -
 .../net/ethernet/netronome/nfp/flower/main.c  |  11 +-
 .../net/ethernet/netronome/nfp/flower/main.h  |   7 +-
 .../ethernet/netronome/nfp/flower/offload.c   |  35 +-
 include/net/flow_offload.h                    |  28 +-
 include/net/netfilter/nf_flow_table.h         |   2 +
 net/core/flow_offload.c                       | 301 +++++++-----------
 net/netfilter/nf_flow_table_core.c            |   6 +-
 net/netfilter/nf_flow_table_offload.c         |  85 +----
 net/netfilter/nf_tables_offload.c             |  69 ++--
 net/sched/cls_api.c                           | 157 +++------
 14 files changed, 251 insertions(+), 590 deletions(-)

--
2.20.1


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/8 net] netfilter: nf_flowtable: expose nf_flow_table_gc_cleanup()
  2020-05-13 16:41 [PATCH 0/8 net] the indirect flow_block offload, revisited Pablo Neira Ayuso
@ 2020-05-13 16:41 ` Pablo Neira Ayuso
  2020-05-13 16:41 ` [PATCH 2/8 net] net: flow_offload: consolidate indirect flow_block infrastructure Pablo Neira Ayuso
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: Pablo Neira Ayuso @ 2020-05-13 16:41 UTC (permalink / raw)
  To: netfilter-devel
  Cc: davem, netdev, paulb, ozsh, vladbu, jiri, kuba, saeedm, michael.chan

This function schedules the flow teardown state and it forces a gc run.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/net/netfilter/nf_flow_table.h | 2 ++
 net/netfilter/nf_flow_table_core.c    | 6 +++---
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/include/net/netfilter/nf_flow_table.h b/include/net/netfilter/nf_flow_table.h
index 6bf69652f57d..15e21044e611 100644
--- a/include/net/netfilter/nf_flow_table.h
+++ b/include/net/netfilter/nf_flow_table.h
@@ -174,6 +174,8 @@ void flow_offload_refresh(struct nf_flowtable *flow_table,
 
 struct flow_offload_tuple_rhash *flow_offload_lookup(struct nf_flowtable *flow_table,
 						     struct flow_offload_tuple *tuple);
+void nf_flow_table_gc_cleanup(struct nf_flowtable *flowtable,
+			      struct net_device *dev);
 void nf_flow_table_cleanup(struct net_device *dev);
 
 int nf_flow_table_init(struct nf_flowtable *flow_table);
diff --git a/net/netfilter/nf_flow_table_core.c b/net/netfilter/nf_flow_table_core.c
index 4344e572b7f9..86aeee638ca4 100644
--- a/net/netfilter/nf_flow_table_core.c
+++ b/net/netfilter/nf_flow_table_core.c
@@ -586,8 +586,8 @@ static void nf_flow_table_do_cleanup(struct flow_offload *flow, void *data)
 		flow_offload_teardown(flow);
 }
 
-static void nf_flow_table_iterate_cleanup(struct nf_flowtable *flowtable,
-					  struct net_device *dev)
+void nf_flow_table_gc_cleanup(struct nf_flowtable *flowtable,
+			      struct net_device *dev)
 {
 	nf_flow_table_iterate(flowtable, nf_flow_table_do_cleanup, dev);
 	flush_delayed_work(&flowtable->gc_work);
@@ -600,7 +600,7 @@ void nf_flow_table_cleanup(struct net_device *dev)
 
 	mutex_lock(&flowtable_lock);
 	list_for_each_entry(flowtable, &flowtables, list)
-		nf_flow_table_iterate_cleanup(flowtable, dev);
+		nf_flow_table_gc_cleanup(flowtable, dev);
 	mutex_unlock(&flowtable_lock);
 }
 EXPORT_SYMBOL_GPL(nf_flow_table_cleanup);
-- 
2.20.1


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 2/8 net] net: flow_offload: consolidate indirect flow_block infrastructure
  2020-05-13 16:41 [PATCH 0/8 net] the indirect flow_block offload, revisited Pablo Neira Ayuso
  2020-05-13 16:41 ` [PATCH 1/8 net] netfilter: nf_flowtable: expose nf_flow_table_gc_cleanup() Pablo Neira Ayuso
@ 2020-05-13 16:41 ` Pablo Neira Ayuso
  2020-05-13 16:41 ` [PATCH 3/8 net] net: cls_api: add tcf_block_offload_init() Pablo Neira Ayuso
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: Pablo Neira Ayuso @ 2020-05-13 16:41 UTC (permalink / raw)
  To: netfilter-devel
  Cc: davem, netdev, paulb, ozsh, vladbu, jiri, kuba, saeedm, michael.chan

Tunnel devices provide no dev->netdev_ops->ndo_setup_tc(...) interface.
The tunnel device and route control plane does not provide an obvious
way to relate tunnel and physical devices.

This patch allows drivers to register a tunnel device offload handler
for the tc and netfilter frontends through flow_indr_dev_register() and
flow_indr_dev_unregister().

The frontend calls flow_indr_dev_setup_offload() that iterates over the
list of drivers that are offering tunnel device hardware offload
support and it sets up the flow block for this tunnel device.

If the driver module is removed, the indirect flow_block ends up with a
stale callback reference. The module removal path triggers the
dev_shutdown() path to remove the qdisc and the flow_blocks for the
physical devices. However, this is not useful for tunnel devices, where
relation between the physical and the tunnel device is not explicit.

This patch introduces a cleanup callback that is invoked when the driver
module is removed to clean up the tunnel device flow_block. This patch
defines struct flow_block_indr and it uses it from flow_block_cb to
store the information that front-end requires to perform the
flow_block_cb cleanup on module removal.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/net/flow_offload.h |  19 +++++
 net/core/flow_offload.c    | 157 +++++++++++++++++++++++++++++++++++++
 2 files changed, 176 insertions(+)

diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h
index efc8350b42fb..77e5a1bc30f0 100644
--- a/include/net/flow_offload.h
+++ b/include/net/flow_offload.h
@@ -430,6 +430,16 @@ enum tc_setup_type;
 typedef int flow_setup_cb_t(enum tc_setup_type type, void *type_data,
 			    void *cb_priv);
 
+struct flow_block_cb;
+
+struct flow_block_indr {
+	struct list_head		list;
+	struct net_device		*dev;
+	enum flow_block_binder_type	binder_type;
+	void				*data;
+	void				(*cleanup)(struct flow_block_cb *block_cb);
+};
+
 struct flow_block_cb {
 	struct list_head	driver_list;
 	struct list_head	list;
@@ -437,6 +447,7 @@ struct flow_block_cb {
 	void			*cb_ident;
 	void			*cb_priv;
 	void			(*release)(void *cb_priv);
+	struct flow_block_indr	indr;
 	unsigned int		refcnt;
 };
 
@@ -510,6 +521,14 @@ static inline void flow_block_init(struct flow_block *flow_block)
 typedef int flow_indr_block_bind_cb_t(struct net_device *dev, void *cb_priv,
 				      enum tc_setup_type type, void *type_data);
 
+int flow_indr_dev_register(flow_indr_block_bind_cb_t *cb, void *cb_priv);
+void flow_indr_dev_unregister(flow_indr_block_bind_cb_t *cb, void *cb_priv,
+			      flow_setup_cb_t *setup_cb);
+int flow_indr_dev_setup_offload(struct net_device *dev,
+				enum tc_setup_type type, void *data,
+				struct flow_block_offload *bo,
+				void (*cleanup)(struct flow_block_cb *block_cb));
+
 typedef void flow_indr_block_cmd_t(struct net_device *dev,
 				   flow_indr_block_bind_cb_t *cb, void *cb_priv,
 				   enum flow_block_command command);
diff --git a/net/core/flow_offload.c b/net/core/flow_offload.c
index e951b743bed3..073dd0f74dc0 100644
--- a/net/core/flow_offload.c
+++ b/net/core/flow_offload.c
@@ -311,6 +311,163 @@ int flow_block_cb_setup_simple(struct flow_block_offload *f,
 }
 EXPORT_SYMBOL(flow_block_cb_setup_simple);
 
+static DEFINE_MUTEX(flow_indr_block_lock);
+static LIST_HEAD(flow_block_indr_list);
+static LIST_HEAD(flow_block_indr_dev_list);
+
+struct flow_indr_dev {
+	struct list_head		list;
+	flow_indr_block_bind_cb_t	*cb;
+	void				*cb_priv;
+	refcount_t			refcnt;
+	struct rcu_head			rcu;
+};
+
+static struct flow_indr_dev *flow_indr_dev_alloc(flow_indr_block_bind_cb_t *cb,
+						 void *cb_priv)
+{
+	struct flow_indr_dev *indr_dev;
+
+	indr_dev = kmalloc(sizeof(*indr_dev), GFP_KERNEL);
+	if (!indr_dev)
+		return NULL;
+
+	indr_dev->cb		= cb;
+	indr_dev->cb_priv	= cb_priv;
+	refcount_set(&indr_dev->refcnt, 1);
+
+	return indr_dev;
+}
+
+int flow_indr_dev_register(flow_indr_block_bind_cb_t *cb, void *cb_priv)
+{
+	struct flow_indr_dev *indr_dev;
+
+	mutex_lock(&flow_indr_block_lock);
+	list_for_each_entry(indr_dev, &flow_block_indr_dev_list, list) {
+		if (indr_dev->cb == cb &&
+		    indr_dev->cb_priv == cb_priv) {
+			refcount_inc(&indr_dev->refcnt);
+			mutex_unlock(&flow_indr_block_lock);
+			return 0;
+		}
+	}
+
+	indr_dev = flow_indr_dev_alloc(cb, cb_priv);
+	if (!indr_dev) {
+		mutex_unlock(&flow_indr_block_lock);
+		return -ENOMEM;
+	}
+
+	list_add(&indr_dev->list, &flow_block_indr_dev_list);
+	mutex_unlock(&flow_indr_block_lock);
+
+	return 0;
+}
+EXPORT_SYMBOL(flow_indr_dev_register);
+
+static void __flow_block_indr_cleanup(flow_setup_cb_t *setup_cb, void *cb_priv,
+				      struct list_head *cleanup_list)
+{
+	struct flow_block_cb *this, *next;
+
+	list_for_each_entry_safe(this, next, &flow_block_indr_list, indr.list) {
+		if (this->cb == setup_cb &&
+		    this->cb_priv == cb_priv) {
+			list_move(&this->indr.list, cleanup_list);
+			return;
+		}
+	}
+}
+
+static void flow_block_indr_notify(struct list_head *cleanup_list)
+{
+	struct flow_block_cb *this, *next;
+
+	list_for_each_entry_safe(this, next, cleanup_list, indr.list) {
+		list_del(&this->indr.list);
+		this->indr.cleanup(this);
+	}
+}
+
+void flow_indr_dev_unregister(flow_indr_block_bind_cb_t *cb, void *cb_priv,
+			      flow_setup_cb_t *setup_cb)
+{
+	struct flow_indr_dev *this, *next, *indr_dev = NULL;
+	LIST_HEAD(cleanup_list);
+
+	mutex_lock(&flow_indr_block_lock);
+	list_for_each_entry_safe(this, next, &flow_block_indr_dev_list, list) {
+		if (this->cb == cb &&
+		    this->cb_priv == cb_priv &&
+		    refcount_dec_and_test(&this->refcnt)) {
+			indr_dev = this;
+			list_del(&indr_dev->list);
+			break;
+		}
+	}
+
+	if (!indr_dev) {
+		mutex_unlock(&flow_indr_block_lock);
+		return;
+	}
+
+	__flow_block_indr_cleanup(setup_cb, cb_priv, &cleanup_list);
+	mutex_unlock(&flow_indr_block_lock);
+
+	flow_block_indr_notify(&cleanup_list);
+	kfree(indr_dev);
+}
+EXPORT_SYMBOL(flow_indr_dev_unregister);
+
+static void flow_block_indr_init(struct flow_block_cb *flow_block,
+				 struct flow_block_offload *bo,
+				 struct net_device *dev, void *data,
+				 void (*cleanup)(struct flow_block_cb *block_cb))
+{
+	flow_block->indr.binder_type = bo->binder_type;
+	flow_block->indr.data = data;
+	flow_block->indr.dev = dev;
+	flow_block->indr.cleanup = cleanup;
+}
+
+static void __flow_block_indr_binding(struct flow_block_offload *bo,
+				      struct net_device *dev, void *data,
+				      void (*cleanup)(struct flow_block_cb *block_cb))
+{
+	struct flow_block_cb *block_cb;
+
+	list_for_each_entry(block_cb, &bo->cb_list, list) {
+		switch (bo->command) {
+		case FLOW_BLOCK_BIND:
+			flow_block_indr_init(block_cb, bo, dev, data, cleanup);
+			list_add(&block_cb->indr.list, &flow_block_indr_list);
+			break;
+		case FLOW_BLOCK_UNBIND:
+			list_del(&block_cb->indr.list);
+			break;
+		}
+	}
+}
+
+int flow_indr_dev_setup_offload(struct net_device *dev,
+				enum tc_setup_type type, void *data,
+				struct flow_block_offload *bo,
+				void (*cleanup)(struct flow_block_cb *block_cb))
+{
+	struct flow_indr_dev *this;
+
+	mutex_lock(&flow_indr_block_lock);
+	list_for_each_entry(this, &flow_block_indr_dev_list, list)
+		this->cb(dev, this->cb_priv, type, bo);
+
+	__flow_block_indr_binding(bo, dev, data, cleanup);
+	mutex_unlock(&flow_indr_block_lock);
+
+	return list_empty(&bo->cb_list) ? -EOPNOTSUPP : 0;
+}
+EXPORT_SYMBOL(flow_indr_dev_setup_offload);
+
 static LIST_HEAD(block_cb_list);
 
 static struct rhashtable indr_setup_block_ht;
-- 
2.20.1


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 3/8 net] net: cls_api: add tcf_block_offload_init()
  2020-05-13 16:41 [PATCH 0/8 net] the indirect flow_block offload, revisited Pablo Neira Ayuso
  2020-05-13 16:41 ` [PATCH 1/8 net] netfilter: nf_flowtable: expose nf_flow_table_gc_cleanup() Pablo Neira Ayuso
  2020-05-13 16:41 ` [PATCH 2/8 net] net: flow_offload: consolidate indirect flow_block infrastructure Pablo Neira Ayuso
@ 2020-05-13 16:41 ` Pablo Neira Ayuso
  2020-05-13 16:41 ` [PATCH 4/8 net] net: use flow_indr_dev_setup_offload() Pablo Neira Ayuso
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: Pablo Neira Ayuso @ 2020-05-13 16:41 UTC (permalink / raw)
  To: netfilter-devel
  Cc: davem, netdev, paulb, ozsh, vladbu, jiri, kuba, saeedm, michael.chan

Add a helper function to initialize the flow_block_offload structure.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/sched/cls_api.c | 26 +++++++++++++++++++-------
 1 file changed, 19 insertions(+), 7 deletions(-)

diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 0a7ecc292bd3..99b349edd020 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -692,6 +692,22 @@ static void tc_indr_block_get_and_cmd(struct net_device *dev,
 	tc_indr_block_cmd(dev, block, cb, cb_priv, command, false);
 }
 
+static void tcf_block_offload_init(struct flow_block_offload *bo,
+				   struct net_device *dev,
+				   enum flow_block_command command,
+				   enum flow_block_binder_type binder_type,
+				   struct flow_block *flow_block,
+				   bool shared, struct netlink_ext_ack *extack)
+{
+	bo->net = dev_net(dev);
+	bo->command = command;
+	bo->binder_type = binder_type;
+	bo->block = flow_block;
+	bo->block_shared = shared;
+	bo->extack = extack;
+	INIT_LIST_HEAD(&bo->cb_list);
+}
+
 static void tc_indr_block_call(struct tcf_block *block,
 			       struct net_device *dev,
 			       struct tcf_block_ext_info *ei,
@@ -726,13 +742,9 @@ static int tcf_block_offload_cmd(struct tcf_block *block,
 	struct flow_block_offload bo = {};
 	int err;
 
-	bo.net = dev_net(dev);
-	bo.command = command;
-	bo.binder_type = ei->binder_type;
-	bo.block = &block->flow_block;
-	bo.block_shared = tcf_block_shared(block);
-	bo.extack = extack;
-	INIT_LIST_HEAD(&bo.cb_list);
+	tcf_block_offload_init(&bo, dev, command, ei->binder_type,
+			       &block->flow_block, tcf_block_shared(block),
+			       extack);
 
 	err = dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_BLOCK, &bo);
 	if (err < 0)
-- 
2.20.1


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 4/8 net] net: use flow_indr_dev_setup_offload()
  2020-05-13 16:41 [PATCH 0/8 net] the indirect flow_block offload, revisited Pablo Neira Ayuso
                   ` (2 preceding siblings ...)
  2020-05-13 16:41 ` [PATCH 3/8 net] net: cls_api: add tcf_block_offload_init() Pablo Neira Ayuso
@ 2020-05-13 16:41 ` Pablo Neira Ayuso
  2020-05-13 16:41 ` [PATCH 5/8 net] mlx5: update indirect block support Pablo Neira Ayuso
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: Pablo Neira Ayuso @ 2020-05-13 16:41 UTC (permalink / raw)
  To: netfilter-devel
  Cc: davem, netdev, paulb, ozsh, vladbu, jiri, kuba, saeedm, michael.chan

Update existing frontends to use flow_indr_dev_setup_offload().

This new function must be called if ->ndo_setup_tc is unset to deal
with tunnel devices.

If there is no driver that is subscribed to new tunnel device
flow_block bindings, then this function bails out with EOPNOTSUPP.

If the driver module is removed, the ->cleanup() callback removes the
entries that belong to this tunnel device. This cleanup procedures is
triggered when the device unregisters the tunnel device offload handler.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nf_flow_table_offload.c | 19 ++++++---
 net/netfilter/nf_tables_offload.c     | 28 +++++++++++--
 net/sched/cls_api.c                   | 58 +++++++++++++--------------
 3 files changed, 67 insertions(+), 38 deletions(-)

diff --git a/net/netfilter/nf_flow_table_offload.c b/net/netfilter/nf_flow_table_offload.c
index e3b099c14eff..82562e47d99b 100644
--- a/net/netfilter/nf_flow_table_offload.c
+++ b/net/netfilter/nf_flow_table_offload.c
@@ -933,6 +933,18 @@ static void nf_flow_table_block_offload_init(struct flow_block_offload *bo,
 	INIT_LIST_HEAD(&bo->cb_list);
 }
 
+static void nf_flow_table_indr_cleanup(struct flow_block_cb *block_cb)
+{
+	struct nf_flowtable *flowtable = block_cb->indr.data;
+	struct net_device *dev = block_cb->indr.dev;
+
+	nf_flow_table_gc_cleanup(flowtable, dev);
+	down_write(&flowtable->flow_block_lock);
+	list_del(&block_cb->list);
+	flow_block_cb_free(block_cb);
+	up_write(&flowtable->flow_block_lock);
+}
+
 static int nf_flow_table_indr_offload_cmd(struct flow_block_offload *bo,
 					  struct nf_flowtable *flowtable,
 					  struct net_device *dev,
@@ -941,12 +953,9 @@ static int nf_flow_table_indr_offload_cmd(struct flow_block_offload *bo,
 {
 	nf_flow_table_block_offload_init(bo, dev_net(dev), cmd, flowtable,
 					 extack);
-	flow_indr_block_call(dev, bo, cmd, TC_SETUP_FT);
 
-	if (list_empty(&bo->cb_list))
-		return -EOPNOTSUPP;
-
-	return 0;
+	return flow_indr_dev_setup_offload(dev, TC_SETUP_FT, flowtable, bo,
+					   nf_flow_table_indr_cleanup);
 }
 
 static int nf_flow_table_offload_cmd(struct flow_block_offload *bo,
diff --git a/net/netfilter/nf_tables_offload.c b/net/netfilter/nf_tables_offload.c
index 954bccb7f32a..1960f11477e8 100644
--- a/net/netfilter/nf_tables_offload.c
+++ b/net/netfilter/nf_tables_offload.c
@@ -304,21 +304,41 @@ static void nft_indr_block_ing_cmd(struct net_device *dev,
 	nft_block_setup(chain, &bo, cmd);
 }
 
-static int nft_indr_block_offload_cmd(struct nft_base_chain *chain,
+static void nft_indr_block_cleanup(struct flow_block_cb *block_cb)
+{
+	struct nft_base_chain *basechain = block_cb->indr.data;
+	struct net_device *dev = block_cb->indr.dev;
+	struct netlink_ext_ack extack = {};
+	struct net *net = dev_net(dev);
+	struct flow_block_offload bo;
+
+	nft_flow_block_offload_init(&bo, dev_net(dev), FLOW_BLOCK_UNBIND,
+				    basechain, &extack);
+	mutex_lock(&net->nft.commit_mutex);
+	list_move(&block_cb->list, &bo.cb_list);
+	nft_flow_offload_unbind(&bo, basechain);
+	mutex_unlock(&net->nft.commit_mutex);
+}
+
+static int nft_indr_block_offload_cmd(struct nft_base_chain *basechain,
 				      struct net_device *dev,
 				      enum flow_block_command cmd)
 {
 	struct netlink_ext_ack extack = {};
 	struct flow_block_offload bo;
+	int err;
 
-	nft_flow_block_offload_init(&bo, dev_net(dev), cmd, chain, &extack);
+	nft_flow_block_offload_init(&bo, dev_net(dev), cmd, basechain, &extack);
 
-	flow_indr_block_call(dev, &bo, cmd, TC_SETUP_BLOCK);
+	err = flow_indr_dev_setup_offload(dev, TC_SETUP_BLOCK, basechain, &bo,
+					  nft_indr_block_cleanup);
+	if (err < 0)
+		return err;
 
 	if (list_empty(&bo.cb_list))
 		return -EOPNOTSUPP;
 
-	return nft_block_setup(chain, &bo, cmd);
+	return nft_block_setup(basechain, &bo, cmd);
 }
 
 #define FLOW_SETUP_BLOCK TC_SETUP_BLOCK
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 99b349edd020..a16cfea7b136 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -708,24 +708,26 @@ static void tcf_block_offload_init(struct flow_block_offload *bo,
 	INIT_LIST_HEAD(&bo->cb_list);
 }
 
-static void tc_indr_block_call(struct tcf_block *block,
-			       struct net_device *dev,
-			       struct tcf_block_ext_info *ei,
-			       enum flow_block_command command,
-			       struct netlink_ext_ack *extack)
+static void tcf_block_unbind(struct tcf_block *block,
+			     struct flow_block_offload *bo);
+
+static void tc_block_indr_cleanup(struct flow_block_cb *block_cb)
 {
-	struct flow_block_offload bo = {
-		.command	= command,
-		.binder_type	= ei->binder_type,
-		.net		= dev_net(dev),
-		.block		= &block->flow_block,
-		.block_shared	= tcf_block_shared(block),
-		.extack		= extack,
-	};
-	INIT_LIST_HEAD(&bo.cb_list);
+	struct tcf_block *block = block_cb->indr.data;
+	struct net_device *dev = block_cb->indr.dev;
+	struct netlink_ext_ack extack = {};
+	struct flow_block_offload bo;
 
-	flow_indr_block_call(dev, &bo, command, TC_SETUP_BLOCK);
-	tcf_block_setup(block, &bo);
+	tcf_block_offload_init(&bo, dev, FLOW_BLOCK_UNBIND,
+			       block_cb->indr.binder_type,
+			       &block->flow_block, tcf_block_shared(block),
+			       &extack);
+	down_write(&block->cb_lock);
+	list_move(&block_cb->list, &bo.cb_list);
+	up_write(&block->cb_lock);
+	rtnl_lock();
+	tcf_block_unbind(block, &bo);
+	rtnl_unlock();
 }
 
 static bool tcf_block_offload_in_use(struct tcf_block *block)
@@ -746,7 +748,12 @@ static int tcf_block_offload_cmd(struct tcf_block *block,
 			       &block->flow_block, tcf_block_shared(block),
 			       extack);
 
-	err = dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_BLOCK, &bo);
+	if (dev->netdev_ops->ndo_setup_tc)
+		err = dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_BLOCK, &bo);
+	else
+		err = flow_indr_dev_setup_offload(dev, TC_SETUP_BLOCK, block,
+						  &bo, tc_block_indr_cleanup);
+
 	if (err < 0)
 		return err;
 
@@ -761,13 +768,13 @@ static int tcf_block_offload_bind(struct tcf_block *block, struct Qdisc *q,
 	int err;
 
 	down_write(&block->cb_lock);
-	if (!dev->netdev_ops->ndo_setup_tc)
-		goto no_offload_dev_inc;
 
 	/* If tc offload feature is disabled and the block we try to bind
 	 * to already has some offloaded filters, forbid to bind.
 	 */
-	if (!tc_can_offload(dev) && tcf_block_offload_in_use(block)) {
+	if (dev->netdev_ops->ndo_setup_tc &&
+	    !tc_can_offload(dev) &&
+	    tcf_block_offload_in_use(block)) {
 		NL_SET_ERR_MSG(extack, "Bind to offloaded block failed as dev has offload disabled");
 		err = -EOPNOTSUPP;
 		goto err_unlock;
@@ -779,18 +786,15 @@ static int tcf_block_offload_bind(struct tcf_block *block, struct Qdisc *q,
 	if (err)
 		goto err_unlock;
 
-	tc_indr_block_call(block, dev, ei, FLOW_BLOCK_BIND, extack);
 	up_write(&block->cb_lock);
 	return 0;
 
 no_offload_dev_inc:
-	if (tcf_block_offload_in_use(block)) {
-		err = -EOPNOTSUPP;
+	if (tcf_block_offload_in_use(block))
 		goto err_unlock;
-	}
+
 	err = 0;
 	block->nooffloaddevcnt++;
-	tc_indr_block_call(block, dev, ei, FLOW_BLOCK_BIND, extack);
 err_unlock:
 	up_write(&block->cb_lock);
 	return err;
@@ -803,10 +807,6 @@ static void tcf_block_offload_unbind(struct tcf_block *block, struct Qdisc *q,
 	int err;
 
 	down_write(&block->cb_lock);
-	tc_indr_block_call(block, dev, ei, FLOW_BLOCK_UNBIND, NULL);
-
-	if (!dev->netdev_ops->ndo_setup_tc)
-		goto no_offload_dev_dec;
 	err = tcf_block_offload_cmd(block, dev, ei, FLOW_BLOCK_UNBIND, NULL);
 	if (err == -EOPNOTSUPP)
 		goto no_offload_dev_dec;
-- 
2.20.1


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 5/8 net] mlx5: update indirect block support
  2020-05-13 16:41 [PATCH 0/8 net] the indirect flow_block offload, revisited Pablo Neira Ayuso
                   ` (3 preceding siblings ...)
  2020-05-13 16:41 ` [PATCH 4/8 net] net: use flow_indr_dev_setup_offload() Pablo Neira Ayuso
@ 2020-05-13 16:41 ` Pablo Neira Ayuso
  2020-05-13 16:41 ` [PATCH 6/8 net] nfp: " Pablo Neira Ayuso
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: Pablo Neira Ayuso @ 2020-05-13 16:41 UTC (permalink / raw)
  To: netfilter-devel
  Cc: davem, netdev, paulb, ozsh, vladbu, jiri, kuba, saeedm, michael.chan

Register ndo callback via flow_indr_dev_register() and
flow_indr_dev_unregister().

No need for mlx5e_rep_indr_clean_block_privs() since flow_block_cb_free()
already releases the internal mapping via ->release callback, which in
this case is mlx5e_rep_indr_tc_block_unbind().

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 .../net/ethernet/mellanox/mlx5/core/en_rep.c  | 83 +++----------------
 .../net/ethernet/mellanox/mlx5/core/en_rep.h  |  5 --
 2 files changed, 10 insertions(+), 78 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
index f372e94948fd..b7aa0d976ba7 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
@@ -65,9 +65,6 @@ struct mlx5e_rep_indr_block_priv {
 	struct list_head list;
 };
 
-static void mlx5e_rep_indr_unregister_block(struct mlx5e_rep_priv *rpriv,
-					    struct net_device *netdev);
-
 static void mlx5e_rep_get_drvinfo(struct net_device *dev,
 				  struct ethtool_drvinfo *drvinfo)
 {
@@ -680,17 +677,6 @@ mlx5e_rep_indr_block_priv_lookup(struct mlx5e_rep_priv *rpriv,
 	return NULL;
 }
 
-static void mlx5e_rep_indr_clean_block_privs(struct mlx5e_rep_priv *rpriv)
-{
-	struct mlx5e_rep_indr_block_priv *cb_priv, *temp;
-	struct list_head *head = &rpriv->uplink_priv.tc_indr_block_priv_list;
-
-	list_for_each_entry_safe(cb_priv, temp, head, list) {
-		mlx5e_rep_indr_unregister_block(rpriv, cb_priv->netdev);
-		kfree(cb_priv);
-	}
-}
-
 static int
 mlx5e_rep_indr_offload(struct net_device *netdev,
 		       struct flow_cls_offload *flower,
@@ -794,9 +780,14 @@ mlx5e_rep_indr_setup_block(struct net_device *netdev,
 			   struct flow_block_offload *f,
 			   flow_setup_cb_t *setup_cb)
 {
+	struct mlx5e_priv *priv = netdev_priv(rpriv->netdev);
 	struct mlx5e_rep_indr_block_priv *indr_priv;
 	struct flow_block_cb *block_cb;
 
+	if (!mlx5e_tc_tun_device_to_offload(priv, netdev) &&
+	    !(is_vlan_dev(netdev) && vlan_dev_real_dev(netdev) == rpriv->netdev))
+		return -EOPNOTSUPP;
+
 	if (f->binder_type != FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS)
 		return -EOPNOTSUPP;
 
@@ -863,53 +854,6 @@ int mlx5e_rep_indr_setup_cb(struct net_device *netdev, void *cb_priv,
 	}
 }
 
-static int mlx5e_rep_indr_register_block(struct mlx5e_rep_priv *rpriv,
-					 struct net_device *netdev)
-{
-	int err;
-
-	err = __flow_indr_block_cb_register(netdev, rpriv,
-					    mlx5e_rep_indr_setup_cb,
-					    rpriv);
-	if (err) {
-		struct mlx5e_priv *priv = netdev_priv(rpriv->netdev);
-
-		mlx5_core_err(priv->mdev, "Failed to register remote block notifier for %s err=%d\n",
-			      netdev_name(netdev), err);
-	}
-	return err;
-}
-
-static void mlx5e_rep_indr_unregister_block(struct mlx5e_rep_priv *rpriv,
-					    struct net_device *netdev)
-{
-	__flow_indr_block_cb_unregister(netdev, mlx5e_rep_indr_setup_cb,
-					rpriv);
-}
-
-static int mlx5e_nic_rep_netdevice_event(struct notifier_block *nb,
-					 unsigned long event, void *ptr)
-{
-	struct mlx5e_rep_priv *rpriv = container_of(nb, struct mlx5e_rep_priv,
-						     uplink_priv.netdevice_nb);
-	struct mlx5e_priv *priv = netdev_priv(rpriv->netdev);
-	struct net_device *netdev = netdev_notifier_info_to_dev(ptr);
-
-	if (!mlx5e_tc_tun_device_to_offload(priv, netdev) &&
-	    !(is_vlan_dev(netdev) && vlan_dev_real_dev(netdev) == rpriv->netdev))
-		return NOTIFY_OK;
-
-	switch (event) {
-	case NETDEV_REGISTER:
-		mlx5e_rep_indr_register_block(rpriv, netdev);
-		break;
-	case NETDEV_UNREGISTER:
-		mlx5e_rep_indr_unregister_block(rpriv, netdev);
-		break;
-	}
-	return NOTIFY_OK;
-}
-
 static void
 mlx5e_rep_queue_neigh_update_work(struct mlx5e_priv *priv,
 				  struct mlx5e_neigh_hash_entry *nhe,
@@ -1806,12 +1750,10 @@ static int mlx5e_init_uplink_rep_tx(struct mlx5e_rep_priv *rpriv)
 
 	/* init indirect block notifications */
 	INIT_LIST_HEAD(&uplink_priv->tc_indr_block_priv_list);
-	uplink_priv->netdevice_nb.notifier_call = mlx5e_nic_rep_netdevice_event;
-	err = register_netdevice_notifier_dev_net(rpriv->netdev,
-						  &uplink_priv->netdevice_nb,
-						  &uplink_priv->netdevice_nn);
+
+	err = flow_indr_dev_register(mlx5e_rep_indr_setup_cb, rpriv);
 	if (err) {
-		mlx5_core_err(priv->mdev, "Failed to register netdev notifier\n");
+		mlx5_core_err(priv->mdev, "Failed to register indirect block notifier\n");
 		goto tc_esw_cleanup;
 	}
 
@@ -1848,13 +1790,8 @@ static int mlx5e_init_rep_tx(struct mlx5e_priv *priv)
 
 static void mlx5e_cleanup_uplink_rep_tx(struct mlx5e_rep_priv *rpriv)
 {
-	struct mlx5_rep_uplink_priv *uplink_priv = &rpriv->uplink_priv;
-
-	/* clean indirect TC block notifications */
-	unregister_netdevice_notifier_dev_net(rpriv->netdev,
-					      &uplink_priv->netdevice_nb,
-					      &uplink_priv->netdevice_nn);
-	mlx5e_rep_indr_clean_block_privs(rpriv);
+	flow_indr_dev_unregister(mlx5e_rep_indr_setup_cb, rpriv,
+				 mlx5e_rep_indr_setup_tc_cb);
 
 	/* delete shared tc flow table */
 	mlx5e_tc_esw_cleanup(&rpriv->uplink_priv.tc_ht);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h
index 6a2337900420..3d6a4a9be482 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h
@@ -68,13 +68,8 @@ struct mlx5_rep_uplink_priv {
 	 * tc_indr_block_cb_priv_list is used to lookup indirect callback
 	 * private data
 	 *
-	 * netdevice_nb is the netdev events notifier - used to register
-	 * tunnel devices for block events
-	 *
 	 */
 	struct list_head	    tc_indr_block_priv_list;
-	struct notifier_block	    netdevice_nb;
-	struct netdev_net_notifier  netdevice_nn;
 
 	struct mlx5_tun_entropy tun_entropy;
 
-- 
2.20.1


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 6/8 net] nfp: update indirect block support
  2020-05-13 16:41 [PATCH 0/8 net] the indirect flow_block offload, revisited Pablo Neira Ayuso
                   ` (4 preceding siblings ...)
  2020-05-13 16:41 ` [PATCH 5/8 net] mlx5: update indirect block support Pablo Neira Ayuso
@ 2020-05-13 16:41 ` Pablo Neira Ayuso
  2020-05-13 16:41 ` [PATCH 7/8 net] bnxt_tc: " Pablo Neira Ayuso
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: Pablo Neira Ayuso @ 2020-05-13 16:41 UTC (permalink / raw)
  To: netfilter-devel
  Cc: davem, netdev, paulb, ozsh, vladbu, jiri, kuba, saeedm, michael.chan

Register ndo callback via flow_indr_dev_register() and
flow_indr_dev_unregister().

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 .../net/ethernet/netronome/nfp/flower/main.c  | 11 +++---
 .../net/ethernet/netronome/nfp/flower/main.h  |  7 ++--
 .../ethernet/netronome/nfp/flower/offload.c   | 35 ++++---------------
 3 files changed, 17 insertions(+), 36 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/flower/main.c b/drivers/net/ethernet/netronome/nfp/flower/main.c
index d8ad9346a26a..10e8edaf5b28 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/main.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/main.c
@@ -783,6 +783,10 @@ static int nfp_flower_init(struct nfp_app *app)
 		nfp_warn(app->cpp, "Flow mod/merge not supported by FW.\n");
 	}
 
+	err = flow_indr_dev_register(nfp_flower_indr_setup_tc_cb, app);
+	if (err)
+		goto err_lag_clean;
+
 	if (app_priv->flower_ext_feats & NFP_FL_FEATS_VF_RLIM)
 		nfp_flower_qos_init(app);
 
@@ -810,6 +814,9 @@ static void nfp_flower_clean(struct nfp_app *app)
 	skb_queue_purge(&app_priv->cmsg_skbs_low);
 	flush_work(&app_priv->cmsg_work);
 
+	flow_indr_dev_unregister(nfp_flower_indr_setup_tc_cb, app,
+				 nfp_flower_setup_indr_block_cb);
+
 	if (app_priv->flower_ext_feats & NFP_FL_FEATS_VF_RLIM)
 		nfp_flower_qos_cleanup(app);
 
@@ -913,10 +920,6 @@ nfp_flower_netdev_event(struct nfp_app *app, struct net_device *netdev,
 			return ret;
 	}
 
-	ret = nfp_flower_reg_indir_block_handler(app, netdev, event);
-	if (ret & NOTIFY_STOP_MASK)
-		return ret;
-
 	ret = nfp_flower_internal_port_event_handler(app, netdev, event);
 	if (ret & NOTIFY_STOP_MASK)
 		return ret;
diff --git a/drivers/net/ethernet/netronome/nfp/flower/main.h b/drivers/net/ethernet/netronome/nfp/flower/main.h
index d55d0d33bc45..74959a9d01be 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/main.h
+++ b/drivers/net/ethernet/netronome/nfp/flower/main.h
@@ -444,9 +444,10 @@ void nfp_flower_qos_cleanup(struct nfp_app *app);
 int nfp_flower_setup_qos_offload(struct nfp_app *app, struct net_device *netdev,
 				 struct tc_cls_matchall_offload *flow);
 void nfp_flower_stats_rlim_reply(struct nfp_app *app, struct sk_buff *skb);
-int nfp_flower_reg_indir_block_handler(struct nfp_app *app,
-				       struct net_device *netdev,
-				       unsigned long event);
+int nfp_flower_indr_setup_tc_cb(struct net_device *netdev, void *cb_priv,
+				enum tc_setup_type type, void *type_data);
+int nfp_flower_setup_indr_block_cb(enum tc_setup_type type, void *type_data,
+				   void *cb_priv);
 
 void
 __nfp_flower_non_repr_priv_get(struct nfp_flower_non_repr_priv *non_repr_priv);
diff --git a/drivers/net/ethernet/netronome/nfp/flower/offload.c b/drivers/net/ethernet/netronome/nfp/flower/offload.c
index c694dbc239d0..10a4e86f5371 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/offload.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/offload.c
@@ -1618,8 +1618,8 @@ nfp_flower_indr_block_cb_priv_lookup(struct nfp_app *app,
 	return NULL;
 }
 
-static int nfp_flower_setup_indr_block_cb(enum tc_setup_type type,
-					  void *type_data, void *cb_priv)
+int nfp_flower_setup_indr_block_cb(enum tc_setup_type type,
+				   void *type_data, void *cb_priv)
 {
 	struct nfp_flower_indr_block_cb_priv *priv = cb_priv;
 	struct flow_cls_offload *flower = type_data;
@@ -1707,10 +1707,13 @@ nfp_flower_setup_indr_tc_block(struct net_device *netdev, struct nfp_app *app,
 	return 0;
 }
 
-static int
+int
 nfp_flower_indr_setup_tc_cb(struct net_device *netdev, void *cb_priv,
 			    enum tc_setup_type type, void *type_data)
 {
+	if (!nfp_fl_is_netdev_to_offload(netdev))
+		return -EOPNOTSUPP;
+
 	switch (type) {
 	case TC_SETUP_BLOCK:
 		return nfp_flower_setup_indr_tc_block(netdev, cb_priv,
@@ -1719,29 +1722,3 @@ nfp_flower_indr_setup_tc_cb(struct net_device *netdev, void *cb_priv,
 		return -EOPNOTSUPP;
 	}
 }
-
-int nfp_flower_reg_indir_block_handler(struct nfp_app *app,
-				       struct net_device *netdev,
-				       unsigned long event)
-{
-	int err;
-
-	if (!nfp_fl_is_netdev_to_offload(netdev))
-		return NOTIFY_OK;
-
-	if (event == NETDEV_REGISTER) {
-		err = __flow_indr_block_cb_register(netdev, app,
-						    nfp_flower_indr_setup_tc_cb,
-						    app);
-		if (err)
-			nfp_flower_cmsg_warn(app,
-					     "Indirect block reg failed - %s\n",
-					     netdev->name);
-	} else if (event == NETDEV_UNREGISTER) {
-		__flow_indr_block_cb_unregister(netdev,
-						nfp_flower_indr_setup_tc_cb,
-						app);
-	}
-
-	return NOTIFY_OK;
-}
-- 
2.20.1


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 7/8 net] bnxt_tc: update indirect block support
  2020-05-13 16:41 [PATCH 0/8 net] the indirect flow_block offload, revisited Pablo Neira Ayuso
                   ` (5 preceding siblings ...)
  2020-05-13 16:41 ` [PATCH 6/8 net] nfp: " Pablo Neira Ayuso
@ 2020-05-13 16:41 ` Pablo Neira Ayuso
  2020-05-19  8:53   ` Sriharsha Basavapatna
  2020-05-13 16:41 ` [PATCH 8/8 net] net: remove indirect block netdev event registration Pablo Neira Ayuso
  2020-05-14 11:44 ` [PATCH 0/8 net] the indirect flow_block offload, revisited Edward Cree
  8 siblings, 1 reply; 14+ messages in thread
From: Pablo Neira Ayuso @ 2020-05-13 16:41 UTC (permalink / raw)
  To: netfilter-devel
  Cc: davem, netdev, paulb, ozsh, vladbu, jiri, kuba, saeedm, michael.chan

Register ndo callback via flow_indr_dev_register() and
flow_indr_dev_unregister().

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.h    |  1 -
 drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c | 51 +++++---------------
 2 files changed, 12 insertions(+), 40 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index f6a3250ef1c5..e7d1c12673bd 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -1857,7 +1857,6 @@ struct bnxt {
 	u8			dsn[8];
 	struct bnxt_tc_info	*tc_info;
 	struct list_head	tc_indr_block_list;
-	struct notifier_block	tc_netdev_nb;
 	struct dentry		*debugfs_pdev;
 	struct device		*hwmon_dev;
 };
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
index 782ea0771221..0eef4f5e4a46 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
@@ -1939,53 +1939,25 @@ static int bnxt_tc_setup_indr_block(struct net_device *netdev, struct bnxt *bp,
 	return 0;
 }
 
-static int bnxt_tc_setup_indr_cb(struct net_device *netdev, void *cb_priv,
-				 enum tc_setup_type type, void *type_data)
-{
-	switch (type) {
-	case TC_SETUP_BLOCK:
-		return bnxt_tc_setup_indr_block(netdev, cb_priv, type_data);
-	default:
-		return -EOPNOTSUPP;
-	}
-}
-
 static bool bnxt_is_netdev_indr_offload(struct net_device *netdev)
 {
 	return netif_is_vxlan(netdev);
 }
 
-static int bnxt_tc_indr_block_event(struct notifier_block *nb,
-				    unsigned long event, void *ptr)
+static int bnxt_tc_setup_indr_cb(struct net_device *netdev, void *cb_priv,
+				 enum tc_setup_type type, void *type_data)
 {
-	struct net_device *netdev;
-	struct bnxt *bp;
-	int rc;
-
-	netdev = netdev_notifier_info_to_dev(ptr);
 	if (!bnxt_is_netdev_indr_offload(netdev))
-		return NOTIFY_OK;
-
-	bp = container_of(nb, struct bnxt, tc_netdev_nb);
+		return -EOPNOTSUPP;
 
-	switch (event) {
-	case NETDEV_REGISTER:
-		rc = __flow_indr_block_cb_register(netdev, bp,
-						   bnxt_tc_setup_indr_cb,
-						   bp);
-		if (rc)
-			netdev_info(bp->dev,
-				    "Failed to register indirect blk: dev: %s\n",
-				    netdev->name);
-		break;
-	case NETDEV_UNREGISTER:
-		__flow_indr_block_cb_unregister(netdev,
-						bnxt_tc_setup_indr_cb,
-						bp);
+	switch (type) {
+	case TC_SETUP_BLOCK:
+		return bnxt_tc_setup_indr_block(netdev, cb_priv, type_data);
+	default:
 		break;
 	}
 
-	return NOTIFY_DONE;
+	return -EOPNOTSUPP;
 }
 
 static const struct rhashtable_params bnxt_tc_flow_ht_params = {
@@ -2074,8 +2046,8 @@ int bnxt_init_tc(struct bnxt *bp)
 
 	/* init indirect block notifications */
 	INIT_LIST_HEAD(&bp->tc_indr_block_list);
-	bp->tc_netdev_nb.notifier_call = bnxt_tc_indr_block_event;
-	rc = register_netdevice_notifier(&bp->tc_netdev_nb);
+
+	rc = flow_indr_dev_register(bnxt_tc_setup_indr_cb, bp);
 	if (!rc)
 		return 0;
 
@@ -2101,7 +2073,8 @@ void bnxt_shutdown_tc(struct bnxt *bp)
 	if (!bnxt_tc_flower_enabled(bp))
 		return;
 
-	unregister_netdevice_notifier(&bp->tc_netdev_nb);
+	flow_indr_dev_unregister(bnxt_tc_setup_indr_cb, bp,
+				 bnxt_tc_setup_indr_block_cb);
 	rhashtable_destroy(&tc_info->flow_table);
 	rhashtable_destroy(&tc_info->l2_table);
 	rhashtable_destroy(&tc_info->decap_l2_table);
-- 
2.20.1


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 8/8 net] net: remove indirect block netdev event registration
  2020-05-13 16:41 [PATCH 0/8 net] the indirect flow_block offload, revisited Pablo Neira Ayuso
                   ` (6 preceding siblings ...)
  2020-05-13 16:41 ` [PATCH 7/8 net] bnxt_tc: " Pablo Neira Ayuso
@ 2020-05-13 16:41 ` Pablo Neira Ayuso
  2020-05-14 11:44 ` [PATCH 0/8 net] the indirect flow_block offload, revisited Edward Cree
  8 siblings, 0 replies; 14+ messages in thread
From: Pablo Neira Ayuso @ 2020-05-13 16:41 UTC (permalink / raw)
  To: netfilter-devel
  Cc: davem, netdev, paulb, ozsh, vladbu, jiri, kuba, saeedm, michael.chan

Drivers do not register to netdev events to set up indirect blocks
anymore. Remove __flow_indr_block_cb_register() and
__flow_indr_block_cb_unregister().

The frontends set up the callbacks through flow_indr_dev_setup_block()

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/net/flow_offload.h            |   9 -
 net/core/flow_offload.c               | 238 --------------------------
 net/netfilter/nf_flow_table_offload.c |  66 -------
 net/netfilter/nf_tables_offload.c     |  53 +-----
 net/sched/cls_api.c                   |  79 ---------
 5 files changed, 1 insertion(+), 444 deletions(-)

diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h
index 77e5a1bc30f0..3b5860cb8948 100644
--- a/include/net/flow_offload.h
+++ b/include/net/flow_offload.h
@@ -533,15 +533,6 @@ typedef void flow_indr_block_cmd_t(struct net_device *dev,
 				   flow_indr_block_bind_cb_t *cb, void *cb_priv,
 				   enum flow_block_command command);
 
-struct flow_indr_block_entry {
-	flow_indr_block_cmd_t *cb;
-	struct list_head	list;
-};
-
-void flow_indr_add_block_cb(struct flow_indr_block_entry *entry);
-
-void flow_indr_del_block_cb(struct flow_indr_block_entry *entry);
-
 int __flow_indr_block_cb_register(struct net_device *dev, void *cb_priv,
 				  flow_indr_block_bind_cb_t *cb,
 				  void *cb_ident);
diff --git a/net/core/flow_offload.c b/net/core/flow_offload.c
index 073dd0f74dc0..9f778362595d 100644
--- a/net/core/flow_offload.c
+++ b/net/core/flow_offload.c
@@ -467,241 +467,3 @@ int flow_indr_dev_setup_offload(struct net_device *dev,
 	return list_empty(&bo->cb_list) ? -EOPNOTSUPP : 0;
 }
 EXPORT_SYMBOL(flow_indr_dev_setup_offload);
-
-static LIST_HEAD(block_cb_list);
-
-static struct rhashtable indr_setup_block_ht;
-
-struct flow_indr_block_cb {
-	struct list_head list;
-	void *cb_priv;
-	flow_indr_block_bind_cb_t *cb;
-	void *cb_ident;
-};
-
-struct flow_indr_block_dev {
-	struct rhash_head ht_node;
-	struct net_device *dev;
-	unsigned int refcnt;
-	struct list_head cb_list;
-};
-
-static const struct rhashtable_params flow_indr_setup_block_ht_params = {
-	.key_offset	= offsetof(struct flow_indr_block_dev, dev),
-	.head_offset	= offsetof(struct flow_indr_block_dev, ht_node),
-	.key_len	= sizeof(struct net_device *),
-};
-
-static struct flow_indr_block_dev *
-flow_indr_block_dev_lookup(struct net_device *dev)
-{
-	return rhashtable_lookup_fast(&indr_setup_block_ht, &dev,
-				      flow_indr_setup_block_ht_params);
-}
-
-static struct flow_indr_block_dev *
-flow_indr_block_dev_get(struct net_device *dev)
-{
-	struct flow_indr_block_dev *indr_dev;
-
-	indr_dev = flow_indr_block_dev_lookup(dev);
-	if (indr_dev)
-		goto inc_ref;
-
-	indr_dev = kzalloc(sizeof(*indr_dev), GFP_KERNEL);
-	if (!indr_dev)
-		return NULL;
-
-	INIT_LIST_HEAD(&indr_dev->cb_list);
-	indr_dev->dev = dev;
-	if (rhashtable_insert_fast(&indr_setup_block_ht, &indr_dev->ht_node,
-				   flow_indr_setup_block_ht_params)) {
-		kfree(indr_dev);
-		return NULL;
-	}
-
-inc_ref:
-	indr_dev->refcnt++;
-	return indr_dev;
-}
-
-static void flow_indr_block_dev_put(struct flow_indr_block_dev *indr_dev)
-{
-	if (--indr_dev->refcnt)
-		return;
-
-	rhashtable_remove_fast(&indr_setup_block_ht, &indr_dev->ht_node,
-			       flow_indr_setup_block_ht_params);
-	kfree(indr_dev);
-}
-
-static struct flow_indr_block_cb *
-flow_indr_block_cb_lookup(struct flow_indr_block_dev *indr_dev,
-			  flow_indr_block_bind_cb_t *cb, void *cb_ident)
-{
-	struct flow_indr_block_cb *indr_block_cb;
-
-	list_for_each_entry(indr_block_cb, &indr_dev->cb_list, list)
-		if (indr_block_cb->cb == cb &&
-		    indr_block_cb->cb_ident == cb_ident)
-			return indr_block_cb;
-	return NULL;
-}
-
-static struct flow_indr_block_cb *
-flow_indr_block_cb_add(struct flow_indr_block_dev *indr_dev, void *cb_priv,
-		       flow_indr_block_bind_cb_t *cb, void *cb_ident)
-{
-	struct flow_indr_block_cb *indr_block_cb;
-
-	indr_block_cb = flow_indr_block_cb_lookup(indr_dev, cb, cb_ident);
-	if (indr_block_cb)
-		return ERR_PTR(-EEXIST);
-
-	indr_block_cb = kzalloc(sizeof(*indr_block_cb), GFP_KERNEL);
-	if (!indr_block_cb)
-		return ERR_PTR(-ENOMEM);
-
-	indr_block_cb->cb_priv = cb_priv;
-	indr_block_cb->cb = cb;
-	indr_block_cb->cb_ident = cb_ident;
-	list_add(&indr_block_cb->list, &indr_dev->cb_list);
-
-	return indr_block_cb;
-}
-
-static void flow_indr_block_cb_del(struct flow_indr_block_cb *indr_block_cb)
-{
-	list_del(&indr_block_cb->list);
-	kfree(indr_block_cb);
-}
-
-static DEFINE_MUTEX(flow_indr_block_cb_lock);
-
-static void flow_block_cmd(struct net_device *dev,
-			   flow_indr_block_bind_cb_t *cb, void *cb_priv,
-			   enum flow_block_command command)
-{
-	struct flow_indr_block_entry *entry;
-
-	mutex_lock(&flow_indr_block_cb_lock);
-	list_for_each_entry(entry, &block_cb_list, list) {
-		entry->cb(dev, cb, cb_priv, command);
-	}
-	mutex_unlock(&flow_indr_block_cb_lock);
-}
-
-int __flow_indr_block_cb_register(struct net_device *dev, void *cb_priv,
-				  flow_indr_block_bind_cb_t *cb,
-				  void *cb_ident)
-{
-	struct flow_indr_block_cb *indr_block_cb;
-	struct flow_indr_block_dev *indr_dev;
-	int err;
-
-	indr_dev = flow_indr_block_dev_get(dev);
-	if (!indr_dev)
-		return -ENOMEM;
-
-	indr_block_cb = flow_indr_block_cb_add(indr_dev, cb_priv, cb, cb_ident);
-	err = PTR_ERR_OR_ZERO(indr_block_cb);
-	if (err)
-		goto err_dev_put;
-
-	flow_block_cmd(dev, indr_block_cb->cb, indr_block_cb->cb_priv,
-		       FLOW_BLOCK_BIND);
-
-	return 0;
-
-err_dev_put:
-	flow_indr_block_dev_put(indr_dev);
-	return err;
-}
-EXPORT_SYMBOL_GPL(__flow_indr_block_cb_register);
-
-int flow_indr_block_cb_register(struct net_device *dev, void *cb_priv,
-				flow_indr_block_bind_cb_t *cb,
-				void *cb_ident)
-{
-	int err;
-
-	rtnl_lock();
-	err = __flow_indr_block_cb_register(dev, cb_priv, cb, cb_ident);
-	rtnl_unlock();
-
-	return err;
-}
-EXPORT_SYMBOL_GPL(flow_indr_block_cb_register);
-
-void __flow_indr_block_cb_unregister(struct net_device *dev,
-				     flow_indr_block_bind_cb_t *cb,
-				     void *cb_ident)
-{
-	struct flow_indr_block_cb *indr_block_cb;
-	struct flow_indr_block_dev *indr_dev;
-
-	indr_dev = flow_indr_block_dev_lookup(dev);
-	if (!indr_dev)
-		return;
-
-	indr_block_cb = flow_indr_block_cb_lookup(indr_dev, cb, cb_ident);
-	if (!indr_block_cb)
-		return;
-
-	flow_block_cmd(dev, indr_block_cb->cb, indr_block_cb->cb_priv,
-		       FLOW_BLOCK_UNBIND);
-
-	flow_indr_block_cb_del(indr_block_cb);
-	flow_indr_block_dev_put(indr_dev);
-}
-EXPORT_SYMBOL_GPL(__flow_indr_block_cb_unregister);
-
-void flow_indr_block_cb_unregister(struct net_device *dev,
-				   flow_indr_block_bind_cb_t *cb,
-				   void *cb_ident)
-{
-	rtnl_lock();
-	__flow_indr_block_cb_unregister(dev, cb, cb_ident);
-	rtnl_unlock();
-}
-EXPORT_SYMBOL_GPL(flow_indr_block_cb_unregister);
-
-void flow_indr_block_call(struct net_device *dev,
-			  struct flow_block_offload *bo,
-			  enum flow_block_command command,
-			  enum tc_setup_type type)
-{
-	struct flow_indr_block_cb *indr_block_cb;
-	struct flow_indr_block_dev *indr_dev;
-
-	indr_dev = flow_indr_block_dev_lookup(dev);
-	if (!indr_dev)
-		return;
-
-	list_for_each_entry(indr_block_cb, &indr_dev->cb_list, list)
-		indr_block_cb->cb(dev, indr_block_cb->cb_priv, type, bo);
-}
-EXPORT_SYMBOL_GPL(flow_indr_block_call);
-
-void flow_indr_add_block_cb(struct flow_indr_block_entry *entry)
-{
-	mutex_lock(&flow_indr_block_cb_lock);
-	list_add_tail(&entry->list, &block_cb_list);
-	mutex_unlock(&flow_indr_block_cb_lock);
-}
-EXPORT_SYMBOL_GPL(flow_indr_add_block_cb);
-
-void flow_indr_del_block_cb(struct flow_indr_block_entry *entry)
-{
-	mutex_lock(&flow_indr_block_cb_lock);
-	list_del(&entry->list);
-	mutex_unlock(&flow_indr_block_cb_lock);
-}
-EXPORT_SYMBOL_GPL(flow_indr_del_block_cb);
-
-static int __init init_flow_indr_rhashtable(void)
-{
-	return rhashtable_init(&indr_setup_block_ht,
-			       &flow_indr_setup_block_ht_params);
-}
-subsys_initcall(init_flow_indr_rhashtable);
diff --git a/net/netfilter/nf_flow_table_offload.c b/net/netfilter/nf_flow_table_offload.c
index 82562e47d99b..969ee9461d82 100644
--- a/net/netfilter/nf_flow_table_offload.c
+++ b/net/netfilter/nf_flow_table_offload.c
@@ -999,69 +999,6 @@ int nf_flow_table_offload_setup(struct nf_flowtable *flowtable,
 }
 EXPORT_SYMBOL_GPL(nf_flow_table_offload_setup);
 
-static void nf_flow_table_indr_block_ing_cmd(struct net_device *dev,
-					     struct nf_flowtable *flowtable,
-					     flow_indr_block_bind_cb_t *cb,
-					     void *cb_priv,
-					     enum flow_block_command cmd)
-{
-	struct netlink_ext_ack extack = {};
-	struct flow_block_offload bo;
-
-	if (!flowtable)
-		return;
-
-	nf_flow_table_block_offload_init(&bo, dev_net(dev), cmd, flowtable,
-					 &extack);
-
-	cb(dev, cb_priv, TC_SETUP_FT, &bo);
-
-	nf_flow_table_block_setup(flowtable, &bo, cmd);
-}
-
-static void nf_flow_table_indr_block_cb_cmd(struct nf_flowtable *flowtable,
-					    struct net_device *dev,
-					    flow_indr_block_bind_cb_t *cb,
-					    void *cb_priv,
-					    enum flow_block_command cmd)
-{
-	if (!(flowtable->flags & NF_FLOWTABLE_HW_OFFLOAD))
-		return;
-
-	nf_flow_table_indr_block_ing_cmd(dev, flowtable, cb, cb_priv, cmd);
-}
-
-static void nf_flow_table_indr_block_cb(struct net_device *dev,
-					flow_indr_block_bind_cb_t *cb,
-					void *cb_priv,
-					enum flow_block_command cmd)
-{
-	struct net *net = dev_net(dev);
-	struct nft_flowtable *nft_ft;
-	struct nft_table *table;
-	struct nft_hook *hook;
-
-	mutex_lock(&net->nft.commit_mutex);
-	list_for_each_entry(table, &net->nft.tables, list) {
-		list_for_each_entry(nft_ft, &table->flowtables, list) {
-			list_for_each_entry(hook, &nft_ft->hook_list, list) {
-				if (hook->ops.dev != dev)
-					continue;
-
-				nf_flow_table_indr_block_cb_cmd(&nft_ft->data,
-								dev, cb,
-								cb_priv, cmd);
-			}
-		}
-	}
-	mutex_unlock(&net->nft.commit_mutex);
-}
-
-static struct flow_indr_block_entry block_ing_entry = {
-	.cb	= nf_flow_table_indr_block_cb,
-	.list	= LIST_HEAD_INIT(block_ing_entry.list),
-};
-
 int nf_flow_table_offload_init(void)
 {
 	nf_flow_offload_wq  = alloc_workqueue("nf_flow_table_offload",
@@ -1069,13 +1006,10 @@ int nf_flow_table_offload_init(void)
 	if (!nf_flow_offload_wq)
 		return -ENOMEM;
 
-	flow_indr_add_block_cb(&block_ing_entry);
-
 	return 0;
 }
 
 void nf_flow_table_offload_exit(void)
 {
-	flow_indr_del_block_cb(&block_ing_entry);
 	destroy_workqueue(nf_flow_offload_wq);
 }
diff --git a/net/netfilter/nf_tables_offload.c b/net/netfilter/nf_tables_offload.c
index 1960f11477e8..185fc82c99aa 100644
--- a/net/netfilter/nf_tables_offload.c
+++ b/net/netfilter/nf_tables_offload.c
@@ -285,25 +285,6 @@ static int nft_block_offload_cmd(struct nft_base_chain *chain,
 	return nft_block_setup(chain, &bo, cmd);
 }
 
-static void nft_indr_block_ing_cmd(struct net_device *dev,
-				   struct nft_base_chain *chain,
-				   flow_indr_block_bind_cb_t *cb,
-				   void *cb_priv,
-				   enum flow_block_command cmd)
-{
-	struct netlink_ext_ack extack = {};
-	struct flow_block_offload bo;
-
-	if (!chain)
-		return;
-
-	nft_flow_block_offload_init(&bo, dev_net(dev), cmd, chain, &extack);
-
-	cb(dev, cb_priv, TC_SETUP_BLOCK, &bo);
-
-	nft_block_setup(chain, &bo, cmd);
-}
-
 static void nft_indr_block_cleanup(struct flow_block_cb *block_cb)
 {
 	struct nft_base_chain *basechain = block_cb->indr.data;
@@ -575,24 +556,6 @@ static struct nft_chain *__nft_offload_get_chain(struct net_device *dev)
 	return NULL;
 }
 
-static void nft_indr_block_cb(struct net_device *dev,
-			      flow_indr_block_bind_cb_t *cb, void *cb_priv,
-			      enum flow_block_command cmd)
-{
-	struct net *net = dev_net(dev);
-	struct nft_chain *chain;
-
-	mutex_lock(&net->nft.commit_mutex);
-	chain = __nft_offload_get_chain(dev);
-	if (chain && chain->flags & NFT_CHAIN_HW_OFFLOAD) {
-		struct nft_base_chain *basechain;
-
-		basechain = nft_base_chain(chain);
-		nft_indr_block_ing_cmd(dev, basechain, cb, cb_priv, cmd);
-	}
-	mutex_unlock(&net->nft.commit_mutex);
-}
-
 static int nft_offload_netdev_event(struct notifier_block *this,
 				    unsigned long event, void *ptr)
 {
@@ -614,30 +577,16 @@ static int nft_offload_netdev_event(struct notifier_block *this,
 	return NOTIFY_DONE;
 }
 
-static struct flow_indr_block_entry block_ing_entry = {
-	.cb	= nft_indr_block_cb,
-	.list	= LIST_HEAD_INIT(block_ing_entry.list),
-};
-
 static struct notifier_block nft_offload_netdev_notifier = {
 	.notifier_call	= nft_offload_netdev_event,
 };
 
 int nft_offload_init(void)
 {
-	int err;
-
-	err = register_netdevice_notifier(&nft_offload_netdev_notifier);
-	if (err < 0)
-		return err;
-
-	flow_indr_add_block_cb(&block_ing_entry);
-
-	return 0;
+	return register_netdevice_notifier(&nft_offload_netdev_notifier);
 }
 
 void nft_offload_exit(void)
 {
-	flow_indr_del_block_cb(&block_ing_entry);
 	unregister_netdevice_notifier(&nft_offload_netdev_notifier);
 }
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index a16cfea7b136..c95d72cac3ad 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -620,78 +620,6 @@ static void tcf_chain_flush(struct tcf_chain *chain, bool rtnl_held)
 static int tcf_block_setup(struct tcf_block *block,
 			   struct flow_block_offload *bo);
 
-static void tc_indr_block_cmd(struct net_device *dev, struct tcf_block *block,
-			      flow_indr_block_bind_cb_t *cb, void *cb_priv,
-			      enum flow_block_command command, bool ingress)
-{
-	struct flow_block_offload bo = {
-		.command	= command,
-		.binder_type	= ingress ?
-				  FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS :
-				  FLOW_BLOCK_BINDER_TYPE_CLSACT_EGRESS,
-		.net		= dev_net(dev),
-		.block_shared	= tcf_block_non_null_shared(block),
-	};
-	INIT_LIST_HEAD(&bo.cb_list);
-
-	if (!block)
-		return;
-
-	bo.block = &block->flow_block;
-
-	down_write(&block->cb_lock);
-	cb(dev, cb_priv, TC_SETUP_BLOCK, &bo);
-
-	tcf_block_setup(block, &bo);
-	up_write(&block->cb_lock);
-}
-
-static struct tcf_block *tc_dev_block(struct net_device *dev, bool ingress)
-{
-	const struct Qdisc_class_ops *cops;
-	const struct Qdisc_ops *ops;
-	struct Qdisc *qdisc;
-
-	if (!dev_ingress_queue(dev))
-		return NULL;
-
-	qdisc = dev_ingress_queue(dev)->qdisc_sleeping;
-	if (!qdisc)
-		return NULL;
-
-	ops = qdisc->ops;
-	if (!ops)
-		return NULL;
-
-	if (!ingress && !strcmp("ingress", ops->id))
-		return NULL;
-
-	cops = ops->cl_ops;
-	if (!cops)
-		return NULL;
-
-	if (!cops->tcf_block)
-		return NULL;
-
-	return cops->tcf_block(qdisc,
-			       ingress ? TC_H_MIN_INGRESS : TC_H_MIN_EGRESS,
-			       NULL);
-}
-
-static void tc_indr_block_get_and_cmd(struct net_device *dev,
-				      flow_indr_block_bind_cb_t *cb,
-				      void *cb_priv,
-				      enum flow_block_command command)
-{
-	struct tcf_block *block;
-
-	block = tc_dev_block(dev, true);
-	tc_indr_block_cmd(dev, block, cb, cb_priv, command, true);
-
-	block = tc_dev_block(dev, false);
-	tc_indr_block_cmd(dev, block, cb, cb_priv, command, false);
-}
-
 static void tcf_block_offload_init(struct flow_block_offload *bo,
 				   struct net_device *dev,
 				   enum flow_block_command command,
@@ -3751,11 +3679,6 @@ static struct pernet_operations tcf_net_ops = {
 	.size = sizeof(struct tcf_net),
 };
 
-static struct flow_indr_block_entry block_entry = {
-	.cb = tc_indr_block_get_and_cmd,
-	.list = LIST_HEAD_INIT(block_entry.list),
-};
-
 static int __init tc_filter_init(void)
 {
 	int err;
@@ -3768,8 +3691,6 @@ static int __init tc_filter_init(void)
 	if (err)
 		goto err_register_pernet_subsys;
 
-	flow_indr_add_block_cb(&block_entry);
-
 	rtnl_register(PF_UNSPEC, RTM_NEWTFILTER, tc_new_tfilter, NULL,
 		      RTNL_FLAG_DOIT_UNLOCKED);
 	rtnl_register(PF_UNSPEC, RTM_DELTFILTER, tc_del_tfilter, NULL,
-- 
2.20.1


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/8 net] the indirect flow_block offload, revisited
  2020-05-13 16:41 [PATCH 0/8 net] the indirect flow_block offload, revisited Pablo Neira Ayuso
                   ` (7 preceding siblings ...)
  2020-05-13 16:41 ` [PATCH 8/8 net] net: remove indirect block netdev event registration Pablo Neira Ayuso
@ 2020-05-14 11:44 ` Edward Cree
  2020-05-14 22:36   ` Pablo Neira Ayuso
  8 siblings, 1 reply; 14+ messages in thread
From: Edward Cree @ 2020-05-14 11:44 UTC (permalink / raw)
  To: Pablo Neira Ayuso, netfilter-devel
  Cc: davem, netdev, paulb, ozsh, vladbu, jiri, kuba, saeedm, michael.chan

On 13/05/2020 17:41, Pablo Neira Ayuso wrote:
> Hi,
>
> This patchset fixes the indirect flow_block support for the tc CT action
> offload. Please, note that this batch is probably slightly large for the
> net tree, however, I could not find a simple incremental fix.
>
> = The problem
>
> The nf_flow_table_indr_block_cb() function provides the tunnel netdevice
> and the indirect flow_block driver callback. From this tunnel netdevice,
> it is not possible to obtain the tc CT flow_block. Note that tc qdisc
> and netfilter backtrack from the tunnel netdevice to the tc block /
> netfilter chain to reach the flow_block object. This allows them to
> clean up the hardware offload rules if the tunnel device is removed.
>
> <snip>
>
> = About this patchset
>
> This patchset aims to address the existing TC CT problem while
> simplifying the indirect flow_block infrastructure. Saving 300 LoC in
> the flow_offload core and the drivers.
This might be a dumb question, but: what is the actual bug being fixed,
 that makes this patch series needed on net rather than net-next?
From your description it sounds like a good and useful cleanup/refactor,
 but nonetheless still a clean-up, appropriate for net-next.

-ed

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/8 net] the indirect flow_block offload, revisited
  2020-05-14 11:44 ` [PATCH 0/8 net] the indirect flow_block offload, revisited Edward Cree
@ 2020-05-14 22:36   ` Pablo Neira Ayuso
  2020-05-15  0:29     ` David Miller
  0 siblings, 1 reply; 14+ messages in thread
From: Pablo Neira Ayuso @ 2020-05-14 22:36 UTC (permalink / raw)
  To: Edward Cree
  Cc: netfilter-devel, davem, netdev, paulb, ozsh, vladbu, jiri, kuba,
	saeedm, michael.chan

On Thu, May 14, 2020 at 12:44:48PM +0100, Edward Cree wrote:
> On 13/05/2020 17:41, Pablo Neira Ayuso wrote:
> > Hi,
> >
> > This patchset fixes the indirect flow_block support for the tc CT action
> > offload. Please, note that this batch is probably slightly large for the
> > net tree, however, I could not find a simple incremental fix.
> >
> > = The problem
> >
> > The nf_flow_table_indr_block_cb() function provides the tunnel netdevice
> > and the indirect flow_block driver callback. From this tunnel netdevice,
> > it is not possible to obtain the tc CT flow_block. Note that tc qdisc
> > and netfilter backtrack from the tunnel netdevice to the tc block /
> > netfilter chain to reach the flow_block object. This allows them to
> > clean up the hardware offload rules if the tunnel device is removed.
> >
> > <snip>
> >
> > = About this patchset
> >
> > This patchset aims to address the existing TC CT problem while
> > simplifying the indirect flow_block infrastructure. Saving 300 LoC in
> > the flow_offload core and the drivers.
>
> This might be a dumb question, but: what is the actual bug being fixed,
>  that makes this patch series needed on net rather than net-next?

The TC CT action crashes the kernel with an indirect flow_block in place:

https://lore.kernel.org/netfilter-devel/db9dfe4f-62e7-241b-46a0-d878c89696a8@ucloud.cn/

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/8 net] the indirect flow_block offload, revisited
  2020-05-14 22:36   ` Pablo Neira Ayuso
@ 2020-05-15  0:29     ` David Miller
  0 siblings, 0 replies; 14+ messages in thread
From: David Miller @ 2020-05-15  0:29 UTC (permalink / raw)
  To: pablo
  Cc: ecree, netfilter-devel, netdev, paulb, ozsh, vladbu, jiri, kuba,
	saeedm, michael.chan

From: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Fri, 15 May 2020 00:36:27 +0200

> The TC CT action crashes the kernel with an indirect flow_block in place:
> 
> https://lore.kernel.org/netfilter-devel/db9dfe4f-62e7-241b-46a0-d878c89696a8@ucloud.cn/

I've read over this patch set at least three times, and reread the
header posting, and there is no clear indication that this patch
series fixes a crash at all.

You need to be explicit about what bug this is fixing, in the commit
messages and introduction posting, with quoted crash logs etc.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 7/8 net] bnxt_tc: update indirect block support
  2020-05-13 16:41 ` [PATCH 7/8 net] bnxt_tc: " Pablo Neira Ayuso
@ 2020-05-19  8:53   ` Sriharsha Basavapatna
  2020-05-26 21:59     ` Pablo Neira Ayuso
  0 siblings, 1 reply; 14+ messages in thread
From: Sriharsha Basavapatna @ 2020-05-19  8:53 UTC (permalink / raw)
  To: Pablo Neira Ayuso
  Cc: netfilter-devel, David Miller, netdev, paulb, Oz Shlomo, vladbu,
	jiri, kuba, saeedm, Michael Chan, Sriharsha Basavapatna

On Wed, May 13, 2020 at 10:12 PM Pablo Neira Ayuso <pablo@netfilter.org> wrote:
>
> Register ndo callback via flow_indr_dev_register() and
> flow_indr_dev_unregister().
>
> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
> ---
>  drivers/net/ethernet/broadcom/bnxt/bnxt.h    |  1 -
>  drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c | 51 +++++---------------
>  2 files changed, 12 insertions(+), 40 deletions(-)
>
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
> index f6a3250ef1c5..e7d1c12673bd 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
> @@ -1857,7 +1857,6 @@ struct bnxt {
>         u8                      dsn[8];
>         struct bnxt_tc_info     *tc_info;
>         struct list_head        tc_indr_block_list;
> -       struct notifier_block   tc_netdev_nb;
>         struct dentry           *debugfs_pdev;
>         struct device           *hwmon_dev;
>  };
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
> index 782ea0771221..0eef4f5e4a46 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
> @@ -1939,53 +1939,25 @@ static int bnxt_tc_setup_indr_block(struct net_device *netdev, struct bnxt *bp,
>         return 0;
>  }
>
> -static int bnxt_tc_setup_indr_cb(struct net_device *netdev, void *cb_priv,
> -                                enum tc_setup_type type, void *type_data)
> -{
> -       switch (type) {
> -       case TC_SETUP_BLOCK:
> -               return bnxt_tc_setup_indr_block(netdev, cb_priv, type_data);
> -       default:
> -               return -EOPNOTSUPP;
> -       }
> -}
> -
>  static bool bnxt_is_netdev_indr_offload(struct net_device *netdev)
>  {
>         return netif_is_vxlan(netdev);
>  }
>
> -static int bnxt_tc_indr_block_event(struct notifier_block *nb,
> -                                   unsigned long event, void *ptr)
> +static int bnxt_tc_setup_indr_cb(struct net_device *netdev, void *cb_priv,
> +                                enum tc_setup_type type, void *type_data)
>  {
> -       struct net_device *netdev;
> -       struct bnxt *bp;
> -       int rc;
> -
> -       netdev = netdev_notifier_info_to_dev(ptr);
>         if (!bnxt_is_netdev_indr_offload(netdev))
> -               return NOTIFY_OK;
> -
> -       bp = container_of(nb, struct bnxt, tc_netdev_nb);
> +               return -EOPNOTSUPP;
>
> -       switch (event) {
> -       case NETDEV_REGISTER:
> -               rc = __flow_indr_block_cb_register(netdev, bp,
> -                                                  bnxt_tc_setup_indr_cb,
> -                                                  bp);
> -               if (rc)
> -                       netdev_info(bp->dev,
> -                                   "Failed to register indirect blk: dev: %s\n",
> -                                   netdev->name);
> -               break;
> -       case NETDEV_UNREGISTER:
> -               __flow_indr_block_cb_unregister(netdev,
> -                                               bnxt_tc_setup_indr_cb,
> -                                               bp);
> +       switch (type) {
> +       case TC_SETUP_BLOCK:
> +               return bnxt_tc_setup_indr_block(netdev, cb_priv, type_data);
> +       default:
>                 break;
>         }
>
> -       return NOTIFY_DONE;
> +       return -EOPNOTSUPP;
>  }
>
>  static const struct rhashtable_params bnxt_tc_flow_ht_params = {
> @@ -2074,8 +2046,8 @@ int bnxt_init_tc(struct bnxt *bp)
>
>         /* init indirect block notifications */
>         INIT_LIST_HEAD(&bp->tc_indr_block_list);
> -       bp->tc_netdev_nb.notifier_call = bnxt_tc_indr_block_event;
> -       rc = register_netdevice_notifier(&bp->tc_netdev_nb);
> +
> +       rc = flow_indr_dev_register(bnxt_tc_setup_indr_cb, bp);
>         if (!rc)
>                 return 0;
>
> @@ -2101,7 +2073,8 @@ void bnxt_shutdown_tc(struct bnxt *bp)
>         if (!bnxt_tc_flower_enabled(bp))
>                 return;
>
> -       unregister_netdevice_notifier(&bp->tc_netdev_nb);
> +       flow_indr_dev_unregister(bnxt_tc_setup_indr_cb, bp,
> +                                bnxt_tc_setup_indr_block_cb);

Why does the driver need to provide the "cb" again during unregister,
since both "cb" and "cb_priv" are already provided during register() ?
This interface could be simplified/improved if
flow_indr_dev_register() returns an opaque handle to the object it
creates (struct flow_indr_dev *) ? The driver should just pass this
handle during unregistration. Also, why do we need the extra (3rd)
argument (flow_setup_cb_t / bnxt_tc_setup_indr_block_cb) during unreg
? It is handled internally by the driver as a part of FLOW_BLOCK_BIND
/ UNBIND ?

Thanks,
-Harsha

>         rhashtable_destroy(&tc_info->flow_table);
>         rhashtable_destroy(&tc_info->l2_table);
>         rhashtable_destroy(&tc_info->decap_l2_table);
> --
> 2.20.1
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 7/8 net] bnxt_tc: update indirect block support
  2020-05-19  8:53   ` Sriharsha Basavapatna
@ 2020-05-26 21:59     ` Pablo Neira Ayuso
  0 siblings, 0 replies; 14+ messages in thread
From: Pablo Neira Ayuso @ 2020-05-26 21:59 UTC (permalink / raw)
  To: Sriharsha Basavapatna
  Cc: netfilter-devel, David Miller, netdev, paulb, Oz Shlomo, vladbu,
	jiri, kuba, saeedm, Michael Chan

Hi,

I overlook this email, sorry.

On Tue, May 19, 2020 at 02:23:54PM +0530, Sriharsha Basavapatna wrote:
[...]
> > @@ -2101,7 +2073,8 @@ void bnxt_shutdown_tc(struct bnxt *bp)
> >         if (!bnxt_tc_flower_enabled(bp))
> >                 return;
> >
> > -       unregister_netdevice_notifier(&bp->tc_netdev_nb);
> > +       flow_indr_dev_unregister(bnxt_tc_setup_indr_cb, bp,
> > +                                bnxt_tc_setup_indr_block_cb);
> 
> Why does the driver need to provide the "cb" again during unregister,
> since both "cb" and "cb_priv" are already provided during register() ?
> This interface could be simplified/improved if
> flow_indr_dev_register() returns an opaque handle to the object it
> creates (struct flow_indr_dev *) ?

Probably, at the expense to storing this in the netdev private area.

> The driver should just pass this
> handle during unregistration. Also, why do we need the extra (3rd)
> argument (flow_setup_cb_t / bnxt_tc_setup_indr_block_cb) during unreg
> ? It is handled internally by the driver as a part of FLOW_BLOCK_BIND
> / UNBIND ?

flow_indr_dev_unregister() needs bnxt_tc_setup_indr_block_cb to
identify what indirect flow_blocks need to be cleaned up before this
representor is gone.

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, back to index

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-13 16:41 [PATCH 0/8 net] the indirect flow_block offload, revisited Pablo Neira Ayuso
2020-05-13 16:41 ` [PATCH 1/8 net] netfilter: nf_flowtable: expose nf_flow_table_gc_cleanup() Pablo Neira Ayuso
2020-05-13 16:41 ` [PATCH 2/8 net] net: flow_offload: consolidate indirect flow_block infrastructure Pablo Neira Ayuso
2020-05-13 16:41 ` [PATCH 3/8 net] net: cls_api: add tcf_block_offload_init() Pablo Neira Ayuso
2020-05-13 16:41 ` [PATCH 4/8 net] net: use flow_indr_dev_setup_offload() Pablo Neira Ayuso
2020-05-13 16:41 ` [PATCH 5/8 net] mlx5: update indirect block support Pablo Neira Ayuso
2020-05-13 16:41 ` [PATCH 6/8 net] nfp: " Pablo Neira Ayuso
2020-05-13 16:41 ` [PATCH 7/8 net] bnxt_tc: " Pablo Neira Ayuso
2020-05-19  8:53   ` Sriharsha Basavapatna
2020-05-26 21:59     ` Pablo Neira Ayuso
2020-05-13 16:41 ` [PATCH 8/8 net] net: remove indirect block netdev event registration Pablo Neira Ayuso
2020-05-14 11:44 ` [PATCH 0/8 net] the indirect flow_block offload, revisited Edward Cree
2020-05-14 22:36   ` Pablo Neira Ayuso
2020-05-15  0:29     ` David Miller

Netfilter-Devel Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/netfilter-devel/0 netfilter-devel/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 netfilter-devel netfilter-devel/ https://lore.kernel.org/netfilter-devel \
		netfilter-devel@vger.kernel.org
	public-inbox-index netfilter-devel

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.netfilter-devel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git