netfilter-devel.vger.kernel.org archive mirror
* [PATCH 0/8 net] the indirect flow_block offload, revisited
@ 2020-05-13 16:41 Pablo Neira Ayuso
From: Pablo Neira Ayuso @ 2020-05-13 16:41 UTC
  To: netfilter-devel
  Cc: davem, netdev, paulb, ozsh, vladbu, jiri, kuba, saeedm, michael.chan

Hi,

This patchset fixes the indirect flow_block support for the tc CT action
offload. Please note that this batch is probably slightly large for the
net tree; however, I could not find a simple incremental fix.

= The problem

The nf_flow_table_indr_block_cb() function provides the tunnel netdevice
and the indirect flow_block driver callback. From this tunnel netdevice,
it is not possible to obtain the tc CT flow_block. Note that tc qdisc
and netfilter backtrack from the tunnel netdevice to the tc block /
netfilter chain to reach the flow_block object. This allows them to
clean up the hardware offload rules if the tunnel device is removed.

= What is the indirect flow_block infrastructure?

The indirect flow_block infrastructure allows drivers to offload
tc/netfilter rules that belong to software tunnel netdevices, e.g.
vxlan.

This indirect flow_block infrastructure relates tunnel netdevices with
drivers because there is no obvious way to relate these two things
from the control plane.

= How does the indirect flow_block work before this patchset?

Front-ends register the indirect flow_block callback through
flow_indr_add_block_cb() if they support offloading tunnel
netdevices.

== Setting up an indirect flow_block

1) Drivers track tunnel netdevices via NETDEV_{REGISTER,UNREGISTER} events.
   If there is a new tunnel netdevice that the driver can offload, then the
   driver invokes __flow_indr_block_cb_register() with the new tunnel
   netdevice and the driver callback. The __flow_indr_block_cb_register()
   call iterates over the list of the front-end callbacks.

2) The front-end callback sets up the flow_block_offload structure and it
   invokes the driver callback to set up the flow_block.

3) The driver callback now registers the flow_block structure and it
   returns the flow_block back to the front-end.

4) The front-end gets the flow_block object and it is now ready to
   offload rules for this tunnel netdevice.

A simplified callgraph is represented below.

        Front-end                      Driver

                                   NETDEV_REGISTER
                                         |
                         __flow_indr_block_cb_register(netdev, cb_priv, driver_cb)
                                         | [1]
            .----------  frontend_indr_block_cb(cb_priv, driver_cb)
            |
   setup_flow_block_offload(bo)
            | [2]
   driver_cb(bo, cb_priv) ---------------.
                                         |
                                  set up flow_blocks [3]
                                         |
   add rules to flow_block <-------------'
   TC_SETUP_CLSFLOWER [4]
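
The callgraph above can be sketched as a small user-space C model. This
is only an illustration under simplifying assumptions: the structures
are reduced to a couple of fields, the model_*() and toy_*() names are
hypothetical stand-ins and not the kernel signatures, and the driver
callback hands the flow_block back through bo as described in step 3.

```c
#include <assert.h>
#include <stddef.h>

/* Toy user-space model; every type and function is a simplified stand-in. */
struct flow_block {
	const char *dev_name;	/* tunnel netdevice this block belongs to */
	int nrules;		/* number of offloaded rules */
};

/* "bo" in the callgraph: carries the flow_block between both sides. */
struct flow_block_offload {
	struct flow_block *block;
};

typedef int (*driver_cb_t)(struct flow_block_offload *bo, void *cb_priv);
typedef void (*frontend_indr_cb_t)(const char *netdev, void *cb_priv,
				   driver_cb_t driver_cb);

#define MAX_FRONTENDS 4
static frontend_indr_cb_t frontends[MAX_FRONTENDS];
static size_t nfrontends;

/* Front-end side: stands in for flow_indr_add_block_cb(). */
static int model_indr_add_block_cb(frontend_indr_cb_t cb)
{
	if (nfrontends == MAX_FRONTENDS)
		return -1;
	frontends[nfrontends++] = cb;
	return 0;
}

/* Driver side, on NETDEV_REGISTER: stands in for
 * __flow_indr_block_cb_register(). Step [1]: walk the front-end callbacks.
 */
static void model_indr_block_cb_register(const char *netdev, void *cb_priv,
					 driver_cb_t driver_cb)
{
	for (size_t i = 0; i < nfrontends; i++)
		frontends[i](netdev, cb_priv, driver_cb);
}

struct toy_driver_priv {
	struct flow_block block;
};

/* Step [3]: the driver sets up the flow_block and returns it via bo. */
static int toy_driver_cb(struct flow_block_offload *bo, void *cb_priv)
{
	struct toy_driver_priv *priv = cb_priv;

	priv->block.nrules = 0;
	bo->block = &priv->block;
	return 0;
}

/* Steps [2] and [4]: set up bo, call the driver, then add one rule. */
static void toy_frontend_indr_block_cb(const char *netdev, void *cb_priv,
				       driver_cb_t driver_cb)
{
	struct flow_block_offload bo = { .block = NULL };

	if (driver_cb(&bo, cb_priv) != 0 || bo.block == NULL)
		return;
	bo.block->dev_name = netdev;
	bo.block->nrules++;	/* stands in for TC_SETUP_CLSFLOWER */
}

/* Runs the whole sequence once; returns the number of offloaded rules. */
int demo_v1_setup(void)
{
	struct toy_driver_priv priv = { .block = { NULL, 0 } };
	int err;

	nfrontends = 0;
	err = model_indr_add_block_cb(toy_frontend_indr_block_cb);
	assert(err == 0);
	(void)err;
	model_indr_block_cb_register("vxlan0", &priv, toy_driver_cb);
	return priv.block.nrules;
}
```

Calling demo_v1_setup() from a main() walks steps [1] to [4] once for a
hypothetical "vxlan0" device. Note that the front-end callback only sees
the netdevice name and the driver callback, which mirrors why it cannot
reach back to its own block from there.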

== Releasing the indirect flow_block

There are two possibilities: either the tunnel netdevice is removed or
a netdevice (port representor) is removed.

=== Tunnel netdevice is removed

The driver waits for the NETDEV_UNREGISTER event that announces the tunnel
netdevice removal. Then, it calls __flow_indr_block_cb_unregister() to
remove the flow_block and rules. The callgraph is very similar to the one
described above.

=== Netdevice is removed (port representor)

The driver calls __flow_indr_block_cb_unregister() to remove the existing
netfilter/tc rules that belong to the tunnel netdevice.

= How does the indirect flow_block work after this patchset?

Drivers register the indirect flow_block setup callback through
flow_indr_dev_register() if they support offloading tunnel
netdevices.

== Setting up an indirect flow_block

1) Front-ends check if dev->netdev_ops->ndo_setup_tc is unset. If so,
   front-ends call flow_indr_dev_setup_offload(). This call invokes
   the drivers' indirect flow_block setup callback.

2) The indirect flow_block setup callback sets up a flow_block structure
   which relates the tunnel netdevice and the driver.

3) The front-end uses the flow_block to offload the rules.

Note that the procedure to set up a (non-indirect) flow_block is very
similar.
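
The three steps above can be sketched as a user-space C model. This is
a sketch under stated assumptions, not the code in this patchset: the
toy_netdev structure, the model_*() and toy_*() helpers and the
"vxlan" name check are all hypothetical, and the real
flow_indr_dev_register() / flow_indr_dev_setup_offload() take more
arguments than shown here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Toy user-space model; every type and function is a simplified stand-in. */
struct flow_block {
	int ndrivers;	/* drivers that bound themselves to this block */
};

struct flow_block_offload {
	struct flow_block *block;
};

struct toy_netdev {
	const char *name;
	/* NULL for software tunnel devices such as vxlan */
	int (*ndo_setup_tc)(struct toy_netdev *dev,
			    struct flow_block_offload *bo);
};

typedef int (*indr_setup_cb_t)(struct toy_netdev *dev, void *cb_priv,
			       struct flow_block_offload *bo);

struct indr_dev {
	indr_setup_cb_t cb;
	void *cb_priv;
};

#define MAX_INDR_DEVS 4
static struct indr_dev indr_devs[MAX_INDR_DEVS];
static size_t n_indr_devs;

/* Driver side: stands in for flow_indr_dev_register(). */
static int model_indr_dev_register(indr_setup_cb_t cb, void *cb_priv)
{
	if (n_indr_devs == MAX_INDR_DEVS)
		return -ENOMEM;
	indr_devs[n_indr_devs].cb = cb;
	indr_devs[n_indr_devs].cb_priv = cb_priv;
	n_indr_devs++;
	return 0;
}

/* Front-end side: stands in for flow_indr_dev_setup_offload(). */
static int model_indr_dev_setup_offload(struct toy_netdev *dev,
					struct flow_block_offload *bo)
{
	int bound = 0;

	for (size_t i = 0; i < n_indr_devs; i++)
		if (indr_devs[i].cb(dev, indr_devs[i].cb_priv, bo) == 0)
			bound++;
	return bound ? 0 : -EOPNOTSUPP;
}

/* Step 1: take the indirect path only if ndo_setup_tc is unset. */
static int toy_frontend_bind(struct toy_netdev *dev, struct flow_block *block)
{
	struct flow_block_offload bo = { .block = block };

	if (dev->ndo_setup_tc)
		return dev->ndo_setup_tc(dev, &bo);
	return model_indr_dev_setup_offload(dev, &bo);
}

/* Step 2: a toy driver that only offloads devices named "vxlan*". */
static int toy_indr_setup_cb(struct toy_netdev *dev, void *cb_priv,
			     struct flow_block_offload *bo)
{
	(void)cb_priv;
	assert(bo->block != NULL);
	if (strncmp(dev->name, "vxlan", 5) != 0)
		return -EOPNOTSUPP;
	bo->block->ndrivers++;	/* relate the tunnel netdevice and the driver */
	return 0;
}

/* Returns the number of bound drivers, or a negative errno on failure. */
int demo_v2_setup(const char *name)
{
	struct toy_netdev dev = { .name = name, .ndo_setup_tc = NULL };
	struct flow_block block = { 0 };
	int err;

	n_indr_devs = 0;
	err = model_indr_dev_register(toy_indr_setup_cb, NULL);
	if (err)
		return err;
	err = toy_frontend_bind(&dev, &block);
	return err ? err : block.ndrivers;	/* step 3 would add rules here */
}
```

In this model the front-end owns the flow_block from the start and
passes it to the driver, so it never has to look the block up from the
tunnel netdevice afterwards.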

== Releasing the indirect flow_block

=== Tunnel netdevice is removed

The front-end calls flow_indr_dev_setup_offload() to tear down the
flow_block and remove the offloaded rules. This alternate path is
exercised if dev->netdev_ops->ndo_setup_tc is unset.

=== Netdevice is removed (port representor)

If a netdevice is removed, then the driver might need to clean up the
offloaded tc/netfilter rules that belong to the tunnel netdevice:

1) The driver invokes flow_indr_dev_unregister() when a netdevice is
   removed.

2) This call iterates over the existing indirect flow_blocks
   and it invokes the cleanup callback to let the front-end remove the
   tc/netfilter rules. The cleanup callback already provides the
   flow_block that the front-end needs to clean up.

        Front-end                      Driver

                                         |
                            flow_indr_dev_unregister(...)
                                         |
                         iterate over list of indirect flow_block
                               and invoke cleanup callback
                                         |
            .-----------------------------
            |
            .
   frontend_flow_block_cleanup(flow_block)
            .
            |
           \/
   remove rules from flow_block
      TC_SETUP_CLSFLOWER
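
The cleanup path above can be sketched in user-space C as follows. The
layout of flow_block_indr here is an assumption that only keeps the
tunnel netdevice owner and the cleanup callback mentioned in this
letter, plus a hypothetical cb_priv pointer to tell drivers apart; the
model_*() and toy_*() names are illustrative and not the kernel API.

```c
#include <assert.h>
#include <stddef.h>

struct flow_block;

/* Indirect metadata kept inside the flow_block (hypothetical layout). */
struct flow_block_indr {
	const char *tunnel_dev;		/* tunnel netdevice owner */
	void *cb_priv;			/* driver that set this block up */
	void (*cleanup)(struct flow_block *block);	/* front-end callback */
};

struct flow_block {
	struct flow_block_indr indr;
	int nrules;			/* number of offloaded rules */
};

#define MAX_INDR_BLOCKS 8
static struct flow_block *indr_blocks[MAX_INDR_BLOCKS];
static size_t n_indr_blocks;

/* Track an indirect flow_block once a driver has set it up. */
static int model_indr_block_add(struct flow_block *block)
{
	if (n_indr_blocks == MAX_INDR_BLOCKS)
		return -1;
	indr_blocks[n_indr_blocks++] = block;
	return 0;
}

/* Stands in for flow_indr_dev_unregister(): walk the indirect blocks of
 * this driver, invoke the front-end cleanup callback on each one and
 * drop them from the list. Returns the number of blocks cleaned up.
 */
static int model_indr_dev_unregister(void *cb_priv)
{
	size_t kept = 0;
	int cleaned = 0;

	for (size_t i = 0; i < n_indr_blocks; i++) {
		struct flow_block *block = indr_blocks[i];

		if (block->indr.cb_priv != cb_priv) {
			indr_blocks[kept++] = block;	/* other driver */
			continue;
		}
		block->indr.cleanup(block);
		cleaned++;
	}
	n_indr_blocks = kept;
	return cleaned;
}

/* Front-end cleanup: it receives the flow_block directly. */
static void toy_frontend_flow_block_cleanup(struct flow_block *block)
{
	block->nrules = 0;	/* stands in for removing the rules */
}

/* Two drivers, one goes away; returns the rules still offloaded. */
int demo_v2_cleanup(void)
{
	int drv_a = 0, drv_b = 0;	/* addresses act as driver identities */
	struct flow_block a = {
		{ "vxlan0", &drv_a, toy_frontend_flow_block_cleanup }, 3
	};
	struct flow_block b = {
		{ "vxlan1", &drv_b, toy_frontend_flow_block_cleanup }, 2
	};
	int cleaned;

	n_indr_blocks = 0;
	model_indr_block_add(&a);
	model_indr_block_add(&b);
	cleaned = model_indr_dev_unregister(&drv_a);
	assert(cleaned == 1);
	(void)cleaned;
	return a.nrules + b.nrules;
}
```

Because the cleanup callback is handed the flow_block itself, the
front-end does not need to backtrack from the tunnel netdevice, which
is the step that was not possible for the tc CT action.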

= About this patchset

This patchset aims to address the existing TC CT problem while
simplifying the indirect flow_block infrastructure, saving roughly 300 LoC
in the flow_offload core and the drivers. The operation is now aligned
with the (non-indirect) flow_block logic. The patchset is composed of:

Patch #1 adds nf_flow_table_gc_cleanup(), which is required by the
         netfilter flowtable's new indirect flow_block approach.

Patch #2 adds the flow_block_indr object, which is actually part of
         the flow_block object. This stores the indirect flow_block
         metadata such as the tunnel netdevice owner and the cleanup
         callback (in case the tunnel netdevice goes away).

         This patch adds flow_indr_dev_{un}register() to allow drivers
         to offer netdevice tunnel hardware offload to the front-ends.
         Then, front-ends call flow_indr_dev_setup_offload() to invoke
         the drivers to set up the (indirect) flow_block.

Patch #3 adds the tcf_block_offload_init() helper function. This is
         a preparation patch to adapt the tc front-end to use the
         new indirect flow_block infrastructure.

Patch #4 updates the tc and netfilter front-ends to use the new
	 indirect flow_block infrastructure.

Patch #5 updates the mlx5 driver to use the new indirect flow_block
	 infrastructure.

Patch #6 updates the nfp driver to use the new indirect flow_block
         infrastructure.

Patch #7 updates the bnxt driver to use the new indirect flow_block
         infrastructure.

Patch #8 removes the indirect flow_block infrastructure version 1,
         now that front-ends and drivers have been converted to
         version 2 (coming in this patchset).

Please, apply.

Pablo Neira Ayuso (8):
  netfilter: nf_flowtable: expose nf_flow_table_gc_cleanup()
  net: flow_offload: consolidate indirect flow_block infrastructure
  net: cls_api: add tcf_block_offload_init()
  net: use flow_indr_dev_setup_offload()
  mlx5: update indirect block support
  nfp: update indirect block support
  bnxt_tc: update indirect block support
  net: remove indirect block netdev event registration

 drivers/net/ethernet/broadcom/bnxt/bnxt.h     |   1 -
 drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c  |  51 +--
 .../net/ethernet/mellanox/mlx5/core/en_rep.c  |  83 +----
 .../net/ethernet/mellanox/mlx5/core/en_rep.h  |   5 -
 .../net/ethernet/netronome/nfp/flower/main.c  |  11 +-
 .../net/ethernet/netronome/nfp/flower/main.h  |   7 +-
 .../ethernet/netronome/nfp/flower/offload.c   |  35 +-
 include/net/flow_offload.h                    |  28 +-
 include/net/netfilter/nf_flow_table.h         |   2 +
 net/core/flow_offload.c                       | 301 +++++++-----------
 net/netfilter/nf_flow_table_core.c            |   6 +-
 net/netfilter/nf_flow_table_offload.c         |  85 +----
 net/netfilter/nf_tables_offload.c             |  69 ++--
 net/sched/cls_api.c                           | 157 +++------
 14 files changed, 251 insertions(+), 590 deletions(-)

--
2.20.1



Thread overview: 17+ messages
2020-05-13 16:41 [PATCH 0/8 net] the indirect flow_block offload, revisited Pablo Neira Ayuso
2020-05-13 16:41 ` [PATCH 1/8 net] netfilter: nf_flowtable: expose nf_flow_table_gc_cleanup() Pablo Neira Ayuso
2020-05-13 16:41 ` [PATCH 2/8 net] net: flow_offload: consolidate indirect flow_block infrastructure Pablo Neira Ayuso
2020-05-13 16:41 ` [PATCH 3/8 net] net: cls_api: add tcf_block_offload_init() Pablo Neira Ayuso
2020-05-13 16:41 ` [PATCH 4/8 net] net: use flow_indr_dev_setup_offload() Pablo Neira Ayuso
2020-05-13 16:41 ` [PATCH 5/8 net] mlx5: update indirect block support Pablo Neira Ayuso
2020-05-13 16:41 ` [PATCH 6/8 net] nfp: " Pablo Neira Ayuso
2020-05-13 16:41 ` [PATCH 7/8 net] bnxt_tc: " Pablo Neira Ayuso
2020-05-19  8:53   ` Sriharsha Basavapatna
2020-05-26 21:59     ` Pablo Neira Ayuso
2020-05-13 16:41 ` [PATCH 8/8 net] net: remove indirect block netdev event registration Pablo Neira Ayuso
2020-06-08 21:07   ` Jacob Keller
2020-06-08 21:47     ` Pablo Neira Ayuso
2020-06-08 22:37       ` Jacob Keller
2020-05-14 11:44 ` [PATCH 0/8 net] the indirect flow_block offload, revisited Edward Cree
2020-05-14 22:36   ` Pablo Neira Ayuso
2020-05-15  0:29     ` David Miller
