netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next 0/6] act_ct: Software offload of conntrack_in
@ 2020-02-23 11:45 Paul Blakey
  2020-02-23 11:45 ` [PATCH net-next 1/6] netfilter: flowtable: pass flowtable to nf_flow_table_iterate() Paul Blakey
                   ` (5 more replies)
  0 siblings, 6 replies; 16+ messages in thread
From: Paul Blakey @ 2020-02-23 11:45 UTC (permalink / raw)
  To: Paul Blakey, Saeed Mahameed, Oz Shlomo, Jakub Kicinski,
	Vlad Buslov, David Miller, netdev, Jiri Pirko, Roi Dayan

This series adds software offload of connections with an established
ct state using the NF flow table offload infrastructure, so
once such flows are offloaded, they will not pass through conntrack
again, and instead act_ct will restore the conntrack info metadata
on the skb to the state it had on the offload event - established.

Act_ct maintains an FT instance per ct zone. Flow table entries
are created, per ct connection, when connections enter an established
state and deleted otherwise. Once an entry is created, the FT assumes
ownership of the entry, and manages it's aging.

On the datapath, first lookup the skb in the zone's FT before going
into conntrack, and if a matching flow is found, restore the conntrack
info metadata on the skb, and skip calling conntrack.

Netfilter flowtable patches provide the cleanup callback so act_ct can
cleanup the flow table once all entries are removed, as these are
managed by the flowtable.

Note that this patchset is part of the connection tracking offload feature.
Hardware offload of connections with an established ct state series will follow
this one.

Pablo Neira Ayuso (3):
  netfilter: flowtable: pass flowtable to nf_flow_table_iterate()
  netfilter: flowtable: nf_flow_table_iterate() returns the number of
    entries
  netfilter: flowtable: add cleanup callback from garbage collector

Paul Blakey (3):
  net/sched: act_ct: Create nf flow table per zone
  net/sched: act_ct: Offload established connections to flow table
  net/sched: act_ct: Software offload of established flows

 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c |   1 +
 include/net/netfilter/nf_flow_table.h           |   1 +
 include/net/tc_act/tc_ct.h                      |   2 +
 net/netfilter/nf_flow_table_core.c              |  28 +-
 net/sched/Kconfig                               |   2 +-
 net/sched/act_ct.c                              | 381 +++++++++++++++++++++++-
 6 files changed, 402 insertions(+), 13 deletions(-)

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH net-next 1/6] netfilter: flowtable: pass flowtable to nf_flow_table_iterate()
  2020-02-23 11:45 [PATCH net-next 0/6] act_ct: Software offload of conntrack_in Paul Blakey
@ 2020-02-23 11:45 ` Paul Blakey
  2020-02-23 11:45 ` [PATCH net-next 2/6] netfilter: flowtable: nf_flow_table_iterate() returns the number of entries Paul Blakey
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 16+ messages in thread
From: Paul Blakey @ 2020-02-23 11:45 UTC (permalink / raw)
  To: Paul Blakey, Saeed Mahameed, Oz Shlomo, Jakub Kicinski,
	Vlad Buslov, David Miller, netdev, Jiri Pirko, Roi Dayan
  Cc: Pablo Neira Ayuso

From: Pablo Neira Ayuso <pablo@netfilter.org>

Pass flowtable object to the iteration callback that is called from
nf_flow_table_iterate().

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Paul Blakey <paulb@mellanox.com>
---
 net/netfilter/nf_flow_table_core.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/net/netfilter/nf_flow_table_core.c b/net/netfilter/nf_flow_table_core.c
index 8af28e1..face98b 100644
--- a/net/netfilter/nf_flow_table_core.c
+++ b/net/netfilter/nf_flow_table_core.c
@@ -312,7 +312,8 @@ struct flow_offload_tuple_rhash *
 
 static int
 nf_flow_table_iterate(struct nf_flowtable *flow_table,
-		      void (*iter)(struct flow_offload *flow, void *data),
+		      void (*iter)(struct nf_flowtable *flow_table,
+				   struct flow_offload *flow, void *data),
 		      void *data)
 {
 	struct flow_offload_tuple_rhash *tuplehash;
@@ -336,7 +337,7 @@ struct flow_offload_tuple_rhash *
 
 		flow = container_of(tuplehash, struct flow_offload, tuplehash[0]);
 
-		iter(flow, data);
+		iter(flow_table, flow, data);
 	}
 	rhashtable_walk_stop(&hti);
 	rhashtable_walk_exit(&hti);
@@ -344,10 +345,9 @@ struct flow_offload_tuple_rhash *
 	return err;
 }
 
-static void nf_flow_offload_gc_step(struct flow_offload *flow, void *data)
+static void nf_flow_offload_gc_step(struct nf_flowtable *flow_table,
+				    struct flow_offload *flow, void *data)
 {
-	struct nf_flowtable *flow_table = data;
-
 	if (nf_flow_has_expired(flow) || nf_ct_is_dying(flow->ct) ||
 	    test_bit(NF_FLOW_TEARDOWN, &flow->flags)) {
 		if (test_bit(NF_FLOW_HW, &flow->flags)) {
@@ -368,7 +368,7 @@ static void nf_flow_offload_work_gc(struct work_struct *work)
 	struct nf_flowtable *flow_table;
 
 	flow_table = container_of(work, struct nf_flowtable, gc_work.work);
-	nf_flow_table_iterate(flow_table, nf_flow_offload_gc_step, flow_table);
+	nf_flow_table_iterate(flow_table, nf_flow_offload_gc_step, NULL);
 	queue_delayed_work(system_power_efficient_wq, &flow_table->gc_work, HZ);
 }
 
@@ -511,7 +511,8 @@ int nf_flow_table_init(struct nf_flowtable *flowtable)
 }
 EXPORT_SYMBOL_GPL(nf_flow_table_init);
 
-static void nf_flow_table_do_cleanup(struct flow_offload *flow, void *data)
+static void nf_flow_table_do_cleanup(struct nf_flowtable *flowtable,
+				     struct flow_offload *flow, void *data)
 {
 	struct net_device *dev = data;
 
@@ -552,7 +553,7 @@ void nf_flow_table_free(struct nf_flowtable *flow_table)
 	mutex_unlock(&flowtable_lock);
 	cancel_delayed_work_sync(&flow_table->gc_work);
 	nf_flow_table_iterate(flow_table, nf_flow_table_do_cleanup, NULL);
-	nf_flow_table_iterate(flow_table, nf_flow_offload_gc_step, flow_table);
+	nf_flow_table_iterate(flow_table, nf_flow_offload_gc_step, NULL);
 	nf_flow_table_offload_flush(flow_table);
 	rhashtable_destroy(&flow_table->rhashtable);
 }
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH net-next 2/6] netfilter: flowtable: nf_flow_table_iterate() returns the number of entries
  2020-02-23 11:45 [PATCH net-next 0/6] act_ct: Software offload of conntrack_in Paul Blakey
  2020-02-23 11:45 ` [PATCH net-next 1/6] netfilter: flowtable: pass flowtable to nf_flow_table_iterate() Paul Blakey
@ 2020-02-23 11:45 ` Paul Blakey
  2020-02-23 11:45 ` [PATCH net-next 3/6] netfilter: flowtable: add cleanup callback from garbage collector Paul Blakey
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 16+ messages in thread
From: Paul Blakey @ 2020-02-23 11:45 UTC (permalink / raw)
  To: Paul Blakey, Saeed Mahameed, Oz Shlomo, Jakub Kicinski,
	Vlad Buslov, David Miller, netdev, Jiri Pirko, Roi Dayan
  Cc: Pablo Neira Ayuso

From: Pablo Neira Ayuso <pablo@netfilter.org>

Update this iterator to return the number of entries in this flowtable.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Paul Blakey <paulb@mellanox.com>
---
 net/netfilter/nf_flow_table_core.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/netfilter/nf_flow_table_core.c b/net/netfilter/nf_flow_table_core.c
index face98b..83bc456 100644
--- a/net/netfilter/nf_flow_table_core.c
+++ b/net/netfilter/nf_flow_table_core.c
@@ -319,7 +319,7 @@ struct flow_offload_tuple_rhash *
 	struct flow_offload_tuple_rhash *tuplehash;
 	struct rhashtable_iter hti;
 	struct flow_offload *flow;
-	int err = 0;
+	int err = 0, cnt = 0;
 
 	rhashtable_walk_enter(&flow_table->rhashtable, &hti);
 	rhashtable_walk_start(&hti);
@@ -338,11 +338,12 @@ struct flow_offload_tuple_rhash *
 		flow = container_of(tuplehash, struct flow_offload, tuplehash[0]);
 
 		iter(flow_table, flow, data);
+		cnt++;
 	}
 	rhashtable_walk_stop(&hti);
 	rhashtable_walk_exit(&hti);
 
-	return err;
+	return err < 0 ? err : cnt;
 }
 
 static void nf_flow_offload_gc_step(struct nf_flowtable *flow_table,
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH net-next 3/6] netfilter: flowtable: add cleanup callback from garbage collector
  2020-02-23 11:45 [PATCH net-next 0/6] act_ct: Software offload of conntrack_in Paul Blakey
  2020-02-23 11:45 ` [PATCH net-next 1/6] netfilter: flowtable: pass flowtable to nf_flow_table_iterate() Paul Blakey
  2020-02-23 11:45 ` [PATCH net-next 2/6] netfilter: flowtable: nf_flow_table_iterate() returns the number of entries Paul Blakey
@ 2020-02-23 11:45 ` Paul Blakey
  2020-02-23 11:45 ` [PATCH net-next 4/6] net/sched: act_ct: Create nf flow table per zone Paul Blakey
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 16+ messages in thread
From: Paul Blakey @ 2020-02-23 11:45 UTC (permalink / raw)
  To: Paul Blakey, Saeed Mahameed, Oz Shlomo, Jakub Kicinski,
	Vlad Buslov, David Miller, netdev, Jiri Pirko, Roi Dayan
  Cc: Pablo Neira Ayuso

From: Pablo Neira Ayuso <pablo@netfilter.org>

This new cleanup callback is called whenever garbage collector counts
no entries in the flowtable. This patch is useful for the act_tc
infrastructure which releases the flowtable if it gets empty.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Paul Blakey <paulb@mellanox.com>
---
 include/net/netfilter/nf_flow_table.h | 1 +
 net/netfilter/nf_flow_table_core.c    | 8 +++++++-
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/include/net/netfilter/nf_flow_table.h b/include/net/netfilter/nf_flow_table.h
index e0f709d9..ba65cf0 100644
--- a/include/net/netfilter/nf_flow_table.h
+++ b/include/net/netfilter/nf_flow_table.h
@@ -27,6 +27,7 @@ struct nf_flowtable_type {
 						  const struct flow_offload *flow,
 						  enum flow_offload_tuple_dir dir,
 						  struct nf_flow_rule *flow_rule);
+	int				(*cleanup)(struct nf_flowtable *ft);
 	void				(*free)(struct nf_flowtable *ft);
 	nf_hookfn			*hook;
 	struct module			*owner;
diff --git a/net/netfilter/nf_flow_table_core.c b/net/netfilter/nf_flow_table_core.c
index 83bc456..e209bbe 100644
--- a/net/netfilter/nf_flow_table_core.c
+++ b/net/netfilter/nf_flow_table_core.c
@@ -367,9 +367,15 @@ static void nf_flow_offload_gc_step(struct nf_flowtable *flow_table,
 static void nf_flow_offload_work_gc(struct work_struct *work)
 {
 	struct nf_flowtable *flow_table;
+	int err, cnt;
 
 	flow_table = container_of(work, struct nf_flowtable, gc_work.work);
-	nf_flow_table_iterate(flow_table, nf_flow_offload_gc_step, NULL);
+	cnt = nf_flow_table_iterate(flow_table, nf_flow_offload_gc_step, NULL);
+	if (cnt == 0 && flow_table->type->cleanup) {
+		err = flow_table->type->cleanup(flow_table);
+		if (!err)
+			return;
+	}
 	queue_delayed_work(system_power_efficient_wq, &flow_table->gc_work, HZ);
 }
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH net-next 4/6] net/sched: act_ct: Create nf flow table per zone
  2020-02-23 11:45 [PATCH net-next 0/6] act_ct: Software offload of conntrack_in Paul Blakey
                   ` (2 preceding siblings ...)
  2020-02-23 11:45 ` [PATCH net-next 3/6] netfilter: flowtable: add cleanup callback from garbage collector Paul Blakey
@ 2020-02-23 11:45 ` Paul Blakey
  2020-02-24 15:58   ` Edward Cree
  2020-02-28 13:45   ` Marcelo Ricardo Leitner
  2020-02-23 11:45 ` [PATCH net-next 5/6] net/sched: act_ct: Offload established connections to flow table Paul Blakey
  2020-02-23 11:45 ` [PATCH net-next 6/6] net/sched: act_ct: Software offload of established flows Paul Blakey
  5 siblings, 2 replies; 16+ messages in thread
From: Paul Blakey @ 2020-02-23 11:45 UTC (permalink / raw)
  To: Paul Blakey, Saeed Mahameed, Oz Shlomo, Jakub Kicinski,
	Vlad Buslov, David Miller, netdev, Jiri Pirko, Roi Dayan

Use the NF flow tables infrastructure for CT offload.

Create a nf flow table per zone.

Next patches will add FT entries to this table, and do
the software offload.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c |   1 +
 include/net/tc_act/tc_ct.h                      |   2 +
 net/sched/Kconfig                               |   2 +-
 net/sched/act_ct.c                              | 159 +++++++++++++++++++++++-
 4 files changed, 162 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index 70b5fe2..eb16136 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -45,6 +45,7 @@
 #include <net/tc_act/tc_tunnel_key.h>
 #include <net/tc_act/tc_pedit.h>
 #include <net/tc_act/tc_csum.h>
+#include <net/tc_act/tc_ct.h>
 #include <net/arp.h>
 #include <net/ipv6_stubs.h>
 #include "en.h"
diff --git a/include/net/tc_act/tc_ct.h b/include/net/tc_act/tc_ct.h
index a8b1564..cf3492e 100644
--- a/include/net/tc_act/tc_ct.h
+++ b/include/net/tc_act/tc_ct.h
@@ -25,6 +25,8 @@ struct tcf_ct_params {
 	u16 ct_action;
 
 	struct rcu_head rcu;
+
+	struct tcf_ct_flow_table *ct_ft;
 };
 
 struct tcf_ct {
diff --git a/net/sched/Kconfig b/net/sched/Kconfig
index edde0e5..bfbefb7 100644
--- a/net/sched/Kconfig
+++ b/net/sched/Kconfig
@@ -972,7 +972,7 @@ config NET_ACT_TUNNEL_KEY
 
 config NET_ACT_CT
 	tristate "connection tracking tc action"
-	depends on NET_CLS_ACT && NF_CONNTRACK && NF_NAT
+	depends on NET_CLS_ACT && NF_CONNTRACK && NF_NAT && NF_FLOW_TABLE
 	help
 	  Say Y here to allow sending the packets to conntrack module.
 
diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c
index f685c0d..4267d7d 100644
--- a/net/sched/act_ct.c
+++ b/net/sched/act_ct.c
@@ -15,6 +15,7 @@
 #include <linux/pkt_cls.h>
 #include <linux/ip.h>
 #include <linux/ipv6.h>
+#include <linux/rhashtable.h>
 #include <net/netlink.h>
 #include <net/pkt_sched.h>
 #include <net/pkt_cls.h>
@@ -24,6 +25,7 @@
 #include <uapi/linux/tc_act/tc_ct.h>
 #include <net/tc_act/tc_ct.h>
 
+#include <net/netfilter/nf_flow_table.h>
 #include <net/netfilter/nf_conntrack.h>
 #include <net/netfilter/nf_conntrack_core.h>
 #include <net/netfilter/nf_conntrack_zones.h>
@@ -31,6 +33,133 @@
 #include <net/netfilter/ipv6/nf_defrag_ipv6.h>
 #include <uapi/linux/netfilter/nf_nat.h>
 
+static struct workqueue_struct *act_ct_wq;
+
+struct tcf_ct_flow_table {
+	struct rhash_head node; /* In zones tables */
+
+	struct rcu_work rwork;
+	struct nf_flowtable nf_ft;
+	u16 zone;
+	u32 ref;
+
+	bool dying;
+};
+
+static const struct rhashtable_params zones_params = {
+	.head_offset = offsetof(struct tcf_ct_flow_table, node),
+	.key_offset = offsetof(struct tcf_ct_flow_table, zone),
+	.key_len = sizeof_field(struct tcf_ct_flow_table, zone),
+	.automatic_shrinking = true,
+};
+
+static struct rhashtable zones_ht;
+static DEFINE_SPINLOCK(zones_lock);
+
+static void tcf_ct_flow_table_put(struct tcf_ct_params *params)
+{
+	spin_lock(&zones_lock);
+	--params->ct_ft->ref;
+	spin_unlock(&zones_lock);
+}
+
+static void tcf_ct_flow_table_cleanup_work(struct work_struct *work)
+{
+	struct tcf_ct_flow_table *ct_ft;
+
+	ct_ft = container_of(to_rcu_work(work), struct tcf_ct_flow_table,
+			     rwork);
+	nf_flow_table_free(&ct_ft->nf_ft);
+	kfree(ct_ft);
+
+	module_put(THIS_MODULE);
+}
+
+static int tcf_ct_flow_table_cleanup(struct nf_flowtable *nf_ft)
+{
+	struct tcf_ct_flow_table *ct_ft;
+	int err = -EBUSY;
+
+	spin_lock(&zones_lock);
+
+	/* Delete the FT if there is no rules with CT action for this zone.
+	 *
+	 * This method will be called by the FT GC until the FT table will
+	 * be freed. Set the dying flags to avoid multiple cleanups
+	 * till it's actually freed in work cb.
+	 */
+	ct_ft = container_of(nf_ft, struct tcf_ct_flow_table, nf_ft);
+	if (ct_ft->ref == 0 && !ct_ft->dying) {
+		ct_ft->dying = true;
+
+		rhashtable_remove_fast(&zones_ht, &ct_ft->node, zones_params);
+		INIT_RCU_WORK(&ct_ft->rwork, tcf_ct_flow_table_cleanup_work);
+		queue_rcu_work(act_ct_wq, &ct_ft->rwork);
+
+		err = 0;
+	}
+
+	spin_unlock(&zones_lock);
+
+	return err;
+}
+
+static struct nf_flowtable_type flowtable_ct = {
+	.cleanup	= tcf_ct_flow_table_cleanup,
+	.owner		= THIS_MODULE,
+};
+
+static int tcf_ct_flow_table_get(struct tcf_ct_params *params)
+{
+	struct tcf_ct_flow_table *ct_ft;
+	int err = -ENOMEM;
+
+	spin_lock(&zones_lock);
+	ct_ft = rhashtable_lookup_fast(&zones_ht, &params->zone, zones_params);
+	if (ct_ft)
+		goto take_ref;
+
+	ct_ft = kzalloc(sizeof(*ct_ft), GFP_KERNEL);
+	if (!ct_ft)
+		goto alloc_err;
+
+	ct_ft->zone = params->zone;
+	err = rhashtable_insert_fast(&zones_ht, &ct_ft->node, zones_params);
+	if (err)
+		goto insert_err;
+
+	ct_ft->nf_ft.type = &flowtable_ct;
+	err = nf_flow_table_init(&ct_ft->nf_ft);
+	if (err)
+		goto init_err;
+
+	__module_get(THIS_MODULE);
+take_ref:
+	params->ct_ft = ct_ft;
+	ct_ft->ref++;
+	spin_unlock(&zones_lock);
+
+	return 0;
+
+init_err:
+	rhashtable_remove_fast(&zones_ht, &ct_ft->node, zones_params);
+insert_err:
+	kfree(ct_ft);
+alloc_err:
+	spin_unlock(&zones_lock);
+	return err;
+}
+
+static int tcf_ct_flow_tables_init(void)
+{
+	return rhashtable_init(&zones_ht, &zones_params);
+}
+
+static void tcf_ct_flow_tables_uninit(void)
+{
+	rhashtable_destroy(&zones_ht);
+}
+
 static struct tc_action_ops act_ct_ops;
 static unsigned int ct_net_id;
 
@@ -207,6 +336,8 @@ static void tcf_ct_params_free(struct rcu_head *head)
 	struct tcf_ct_params *params = container_of(head,
 						    struct tcf_ct_params, rcu);
 
+	tcf_ct_flow_table_put(params);
+
 	if (params->tmpl)
 		nf_conntrack_put(&params->tmpl->ct_general);
 	kfree(params);
@@ -730,6 +861,10 @@ static int tcf_ct_init(struct net *net, struct nlattr *nla,
 	if (err)
 		goto cleanup;
 
+	err = tcf_ct_flow_table_get(params);
+	if (err)
+		goto cleanup;
+
 	spin_lock_bh(&c->tcf_lock);
 	goto_ch = tcf_action_set_ctrlact(*a, parm->action, goto_ch);
 	params = rcu_replace_pointer(c->params, params,
@@ -974,12 +1109,34 @@ static void __net_exit ct_exit_net(struct list_head *net_list)
 
 static int __init ct_init_module(void)
 {
-	return tcf_register_action(&act_ct_ops, &ct_net_ops);
+	int err;
+
+	act_ct_wq = alloc_ordered_workqueue("act_ct_workqueue", 0);
+	if (!act_ct_wq)
+		return -ENOMEM;
+
+	err = tcf_ct_flow_tables_init();
+	if (err)
+		goto err_tbl_init;
+
+	err = tcf_register_action(&act_ct_ops, &ct_net_ops);
+	if (err)
+		goto err_register;
+
+	return 0;
+
+err_tbl_init:
+	destroy_workqueue(act_ct_wq);
+err_register:
+	tcf_ct_flow_tables_uninit();
+	return err;
 }
 
 static void __exit ct_cleanup_module(void)
 {
 	tcf_unregister_action(&act_ct_ops, &ct_net_ops);
+	tcf_ct_flow_tables_uninit();
+	destroy_workqueue(act_ct_wq);
 }
 
 module_init(ct_init_module);
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH net-next 5/6] net/sched: act_ct: Offload established connections to flow table
  2020-02-23 11:45 [PATCH net-next 0/6] act_ct: Software offload of conntrack_in Paul Blakey
                   ` (3 preceding siblings ...)
  2020-02-23 11:45 ` [PATCH net-next 4/6] net/sched: act_ct: Create nf flow table per zone Paul Blakey
@ 2020-02-23 11:45 ` Paul Blakey
  2020-02-23 11:45 ` [PATCH net-next 6/6] net/sched: act_ct: Software offload of established flows Paul Blakey
  5 siblings, 0 replies; 16+ messages in thread
From: Paul Blakey @ 2020-02-23 11:45 UTC (permalink / raw)
  To: Paul Blakey, Saeed Mahameed, Oz Shlomo, Jakub Kicinski,
	Vlad Buslov, David Miller, netdev, Jiri Pirko, Roi Dayan

Add a ft entry when connections enter an established state and delete
the connections when they leave the established state.

The flow table assumes ownership of the connection. In the following
patch act_ct will lookup the ct state from the FT. In future patches,
drivers will register for callbacks for ft add/del events and will be
able to use the information to offload the connections.

Note that connection aging is managed by the FT.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
---
 net/sched/act_ct.c | 63 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 63 insertions(+)

diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c
index 4267d7d..b2bc885 100644
--- a/net/sched/act_ct.c
+++ b/net/sched/act_ct.c
@@ -150,6 +150,67 @@ static int tcf_ct_flow_table_get(struct tcf_ct_params *params)
 	return err;
 }
 
+static void tcf_ct_flow_table_add(struct tcf_ct_flow_table *ct_ft,
+				  struct nf_conn *ct,
+				  bool tcp)
+{
+	struct flow_offload *entry;
+	int err;
+
+	if (test_and_set_bit(IPS_OFFLOAD_BIT, &ct->status))
+		return;
+
+	entry = flow_offload_alloc(ct);
+	if (!entry) {
+		WARN_ON_ONCE(1);
+		goto err_alloc;
+	}
+
+	if (tcp) {
+		ct->proto.tcp.seen[0].flags |= IP_CT_TCP_FLAG_BE_LIBERAL;
+		ct->proto.tcp.seen[1].flags |= IP_CT_TCP_FLAG_BE_LIBERAL;
+	}
+
+	err = flow_offload_add(&ct_ft->nf_ft, entry);
+	if (err)
+		goto err_add;
+
+	return;
+
+err_add:
+	flow_offload_free(entry);
+err_alloc:
+	clear_bit(IPS_OFFLOAD_BIT, &ct->status);
+}
+
+static void tcf_ct_flow_table_process_conn(struct tcf_ct_flow_table *ct_ft,
+					   struct nf_conn *ct,
+					   enum ip_conntrack_info ctinfo)
+{
+	bool tcp = false;
+
+	if (ctinfo != IP_CT_ESTABLISHED && ctinfo != IP_CT_ESTABLISHED_REPLY)
+		return;
+
+	switch (nf_ct_protonum(ct)) {
+	case IPPROTO_TCP:
+		tcp = true;
+		if (ct->proto.tcp.state != TCP_CONNTRACK_ESTABLISHED)
+			return;
+		break;
+	case IPPROTO_UDP:
+		break;
+	default:
+		return;
+	}
+
+	if (nf_ct_ext_exist(ct, NF_CT_EXT_HELPER) ||
+	    ct->status & IPS_SEQ_ADJUST)
+		return;
+
+	tcf_ct_flow_table_add(ct_ft, ct, tcp);
+}
+
 static int tcf_ct_flow_tables_init(void)
 {
 	return rhashtable_init(&zones_ht, &zones_params);
@@ -603,6 +664,8 @@ static int tcf_ct_act(struct sk_buff *skb, const struct tc_action *a,
 		nf_conntrack_confirm(skb);
 	}
 
+	tcf_ct_flow_table_process_conn(p->ct_ft, ct, ctinfo);
+
 out_push:
 	skb_push_rcsum(skb, nh_ofs);
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH net-next 6/6] net/sched: act_ct: Software offload of established flows
  2020-02-23 11:45 [PATCH net-next 0/6] act_ct: Software offload of conntrack_in Paul Blakey
                   ` (4 preceding siblings ...)
  2020-02-23 11:45 ` [PATCH net-next 5/6] net/sched: act_ct: Offload established connections to flow table Paul Blakey
@ 2020-02-23 11:45 ` Paul Blakey
  2020-02-24 16:04   ` Edward Cree
                     ` (2 more replies)
  5 siblings, 3 replies; 16+ messages in thread
From: Paul Blakey @ 2020-02-23 11:45 UTC (permalink / raw)
  To: Paul Blakey, Saeed Mahameed, Oz Shlomo, Jakub Kicinski,
	Vlad Buslov, David Miller, netdev, Jiri Pirko, Roi Dayan

Offload nf conntrack processing by looking up the 5-tuple in the
zone's flow table.

The nf conntrack module will process the packets until a connection is
in established state. Once in established state, the ct state pointer
(nf_conn) will be restored on the skb from a successful ft lookup.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
---
 net/sched/act_ct.c | 163 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 160 insertions(+), 3 deletions(-)

diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c
index b2bc885..3592e24 100644
--- a/net/sched/act_ct.c
+++ b/net/sched/act_ct.c
@@ -211,6 +211,157 @@ static void tcf_ct_flow_table_process_conn(struct tcf_ct_flow_table *ct_ft,
 	tcf_ct_flow_table_add(ct_ft, ct, tcp);
 }
 
+static bool
+tcf_ct_flow_table_fill_tuple_ipv4(struct sk_buff *skb,
+				  struct flow_offload_tuple *tuple)
+{
+	struct flow_ports *ports;
+	unsigned int thoff;
+	struct iphdr *iph;
+
+	if (!pskb_may_pull(skb, sizeof(*iph)))
+		return false;
+
+	iph = ip_hdr(skb);
+	thoff = iph->ihl * 4;
+
+	if (ip_is_fragment(iph) ||
+	    unlikely(thoff != sizeof(struct iphdr)))
+		return false;
+
+	if (iph->protocol != IPPROTO_TCP &&
+	    iph->protocol != IPPROTO_UDP)
+		return false;
+
+	if (iph->ttl <= 1)
+		return false;
+
+	thoff = iph->ihl * 4;
+	if (!pskb_may_pull(skb, thoff + sizeof(*ports)))
+		return false;
+
+	ports = (struct flow_ports *)(skb_network_header(skb) + thoff);
+
+	tuple->src_v4.s_addr = iph->saddr;
+	tuple->dst_v4.s_addr = iph->daddr;
+	tuple->src_port = ports->source;
+	tuple->dst_port = ports->dest;
+	tuple->l3proto = AF_INET;
+	tuple->l4proto = iph->protocol;
+
+	return true;
+}
+
+static bool
+tcf_ct_flow_table_fill_tuple_ipv6(struct sk_buff *skb,
+				  struct flow_offload_tuple *tuple)
+{
+	struct flow_ports *ports;
+	struct ipv6hdr *ip6h;
+	unsigned int thoff;
+
+	if (!pskb_may_pull(skb, sizeof(*ip6h)))
+		return false;
+
+	ip6h = ipv6_hdr(skb);
+
+	if (ip6h->nexthdr != IPPROTO_TCP &&
+	    ip6h->nexthdr != IPPROTO_UDP)
+		return false;
+
+	if (ip6h->hop_limit <= 1)
+		return false;
+
+	thoff = sizeof(*ip6h);
+	if (!pskb_may_pull(skb, thoff + sizeof(*ports)))
+		return false;
+
+	ports = (struct flow_ports *)(skb_network_header(skb) + thoff);
+
+	tuple->src_v6 = ip6h->saddr;
+	tuple->dst_v6 = ip6h->daddr;
+	tuple->src_port = ports->source;
+	tuple->dst_port = ports->dest;
+	tuple->l3proto = AF_INET6;
+	tuple->l4proto = ip6h->nexthdr;
+
+	return true;
+}
+
+static bool tcf_ct_flow_table_check_tcp(struct flow_offload *flow, int proto,
+					struct sk_buff *skb,
+					unsigned int thoff)
+{
+	struct tcphdr *tcph;
+
+	if (proto != IPPROTO_TCP)
+		return true;
+
+	if (!pskb_may_pull(skb, thoff + sizeof(*tcph)))
+		return false;
+
+	tcph = (void *)(skb_network_header(skb) + thoff);
+	if (unlikely(tcph->fin || tcph->rst)) {
+		flow_offload_teardown(flow);
+		return false;
+	}
+
+	return true;
+}
+
+static bool tcf_ct_flow_table_lookup(struct tcf_ct_params *p,
+				     struct sk_buff *skb,
+				     u8 family)
+{
+	struct nf_flowtable *nf_ft = &p->ct_ft->nf_ft;
+	struct flow_offload_tuple_rhash *tuplehash;
+	struct flow_offload_tuple tuple = {};
+	enum ip_conntrack_info ctinfo;
+	struct flow_offload *flow;
+	struct nf_conn *ct;
+	unsigned int thoff;
+	u8 dir;
+
+	/* Previously seen or loopback */
+	ct = nf_ct_get(skb, &ctinfo);
+	if ((ct && !nf_ct_is_template(ct)) || ctinfo == IP_CT_UNTRACKED)
+		return false;
+
+	switch (family) {
+	case NFPROTO_IPV4:
+		if (!tcf_ct_flow_table_fill_tuple_ipv4(skb, &tuple))
+			return false;
+		break;
+	case NFPROTO_IPV6:
+		if (!tcf_ct_flow_table_fill_tuple_ipv6(skb, &tuple))
+			return false;
+		break;
+	default:
+		return false;
+	}
+
+	tuplehash = flow_offload_lookup(nf_ft, &tuple);
+	if (!tuplehash)
+		return false;
+
+	dir = tuplehash->tuple.dir;
+	flow = container_of(tuplehash, struct flow_offload, tuplehash[dir]);
+	ct = flow->ct;
+
+	ctinfo = dir == FLOW_OFFLOAD_DIR_ORIGINAL ? IP_CT_ESTABLISHED :
+						    IP_CT_ESTABLISHED_REPLY;
+
+	thoff = ip_hdr(skb)->ihl * 4;
+	if (!tcf_ct_flow_table_check_tcp(flow, ip_hdr(skb)->protocol, skb,
+					 thoff))
+		return false;
+
+	nf_conntrack_get(&ct->ct_general);
+	nf_ct_set(skb, ct, ctinfo);
+
+	return true;
+}
+
 static int tcf_ct_flow_tables_init(void)
 {
 	return rhashtable_init(&zones_ht, &zones_params);
@@ -579,6 +730,7 @@ static int tcf_ct_act(struct sk_buff *skb, const struct tc_action *a,
 	struct nf_hook_state state;
 	int nh_ofs, err, retval;
 	struct tcf_ct_params *p;
+	bool skip_add = false;
 	struct nf_conn *ct;
 	u8 family;
 
@@ -628,6 +780,11 @@ static int tcf_ct_act(struct sk_buff *skb, const struct tc_action *a,
 	 */
 	cached = tcf_ct_skb_nfct_cached(net, skb, p->zone, force);
 	if (!cached) {
+		if (!commit && tcf_ct_flow_table_lookup(p, skb, family)) {
+			skip_add = true;
+			goto do_nat;
+		}
+
 		/* Associate skb with specified zone. */
 		if (tmpl) {
 			ct = nf_ct_get(skb, &ctinfo);
@@ -645,6 +802,7 @@ static int tcf_ct_act(struct sk_buff *skb, const struct tc_action *a,
 			goto out_push;
 	}
 
+do_nat:
 	ct = nf_ct_get(skb, &ctinfo);
 	if (!ct)
 		goto out_push;
@@ -662,9 +820,8 @@ static int tcf_ct_act(struct sk_buff *skb, const struct tc_action *a,
 		 * even if the connection is already confirmed.
 		 */
 		nf_conntrack_confirm(skb);
-	}
-
-	tcf_ct_flow_table_process_conn(p->ct_ft, ct, ctinfo);
+	} else if (!skip_add)
+		tcf_ct_flow_table_process_conn(p->ct_ft, ct, ctinfo);
 
 out_push:
 	skb_push_rcsum(skb, nh_ofs);
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 4/6] net/sched: act_ct: Create nf flow table per zone
  2020-02-23 11:45 ` [PATCH net-next 4/6] net/sched: act_ct: Create nf flow table per zone Paul Blakey
@ 2020-02-24 15:58   ` Edward Cree
  2020-02-26  9:44     ` Paul Blakey
  2020-02-28 13:45   ` Marcelo Ricardo Leitner
  1 sibling, 1 reply; 16+ messages in thread
From: Edward Cree @ 2020-02-24 15:58 UTC (permalink / raw)
  To: Paul Blakey, Saeed Mahameed, Oz Shlomo, Jakub Kicinski,
	Vlad Buslov, David Miller, netdev, Jiri Pirko, Roi Dayan

On 23/02/2020 11:45, Paul Blakey wrote:
> Use the NF flow tables infrastructure for CT offload.
>
> Create a nf flow table per zone.
>
> Next patches will add FT entries to this table, and do
> the software offload.
>
> Signed-off-by: Paul Blakey <paulb@mellanox.com>
> Reviewed-by: Jiri Pirko <jiri@mellanox.com>
> ---
>  drivers/net/ethernet/mellanox/mlx5/core/en_tc.c |   1 +
>  include/net/tc_act/tc_ct.h                      |   2 +
>  net/sched/Kconfig                               |   2 +-
>  net/sched/act_ct.c                              | 159 +++++++++++++++++++++++-
>  4 files changed, 162 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
> index 70b5fe2..eb16136 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
> @@ -45,6 +45,7 @@
>  #include <net/tc_act/tc_tunnel_key.h>
>  #include <net/tc_act/tc_pedit.h>
>  #include <net/tc_act/tc_csum.h>
> +#include <net/tc_act/tc_ct.h>
>  #include <net/arp.h>
>  #include <net/ipv6_stubs.h>
>  #include "en.h"
> diff --git a/include/net/tc_act/tc_ct.h b/include/net/tc_act/tc_ct.h
> index a8b1564..cf3492e 100644
> --- a/include/net/tc_act/tc_ct.h
> +++ b/include/net/tc_act/tc_ct.h
> @@ -25,6 +25,8 @@ struct tcf_ct_params {
>  	u16 ct_action;
>  
>  	struct rcu_head rcu;
> +
> +	struct tcf_ct_flow_table *ct_ft;
>  };
>  
>  struct tcf_ct {
> diff --git a/net/sched/Kconfig b/net/sched/Kconfig
> index edde0e5..bfbefb7 100644
> --- a/net/sched/Kconfig
> +++ b/net/sched/Kconfig
> @@ -972,7 +972,7 @@ config NET_ACT_TUNNEL_KEY
>  
>  config NET_ACT_CT
>  	tristate "connection tracking tc action"
> -	depends on NET_CLS_ACT && NF_CONNTRACK && NF_NAT
> +	depends on NET_CLS_ACT && NF_CONNTRACK && NF_NAT && NF_FLOW_TABLE
Is it not possible to keep sensible/old behaviour in the case
 of NF_FLOW_TABLE=n?  (And what about NF_FLOW_TABLE=m, which is
 what its Kconfig help seems to advise...)

>  	help
>  	  Say Y here to allow sending the packets to conntrack module.
>  
> diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c
> index f685c0d..4267d7d 100644
> --- a/net/sched/act_ct.c
> +++ b/net/sched/act_ct.c
> @@ -15,6 +15,7 @@
>  #include <linux/pkt_cls.h>
>  #include <linux/ip.h>
>  #include <linux/ipv6.h>
> +#include <linux/rhashtable.h>
>  #include <net/netlink.h>
>  #include <net/pkt_sched.h>
>  #include <net/pkt_cls.h>
> @@ -24,6 +25,7 @@
>  #include <uapi/linux/tc_act/tc_ct.h>
>  #include <net/tc_act/tc_ct.h>
>  
> +#include <net/netfilter/nf_flow_table.h>
>  #include <net/netfilter/nf_conntrack.h>
>  #include <net/netfilter/nf_conntrack_core.h>
>  #include <net/netfilter/nf_conntrack_zones.h>
> @@ -31,6 +33,133 @@
>  #include <net/netfilter/ipv6/nf_defrag_ipv6.h>
>  #include <uapi/linux/netfilter/nf_nat.h>
>  
> +static struct workqueue_struct *act_ct_wq;
> +
> +struct tcf_ct_flow_table {
> +	struct rhash_head node; /* In zones tables */
> +
> +	struct rcu_work rwork;
> +	struct nf_flowtable nf_ft;
> +	u16 zone;
> +	u32 ref;
Any reason this isn't using a refcount_t?
-ed

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 6/6] net/sched: act_ct: Software offload of established flows
  2020-02-23 11:45 ` [PATCH net-next 6/6] net/sched: act_ct: Software offload of established flows Paul Blakey
@ 2020-02-24 16:04   ` Edward Cree
  2020-02-25 12:16     ` Paul Blakey
  2020-02-28 14:52   ` Marcelo Ricardo Leitner
  2020-02-29 14:18   ` Paul Blakey
  2 siblings, 1 reply; 16+ messages in thread
From: Edward Cree @ 2020-02-24 16:04 UTC (permalink / raw)
  To: Paul Blakey, Saeed Mahameed, Oz Shlomo, Jakub Kicinski,
	Vlad Buslov, David Miller, netdev, Jiri Pirko, Roi Dayan

On 23/02/2020 11:45, Paul Blakey wrote:
> Offload nf conntrack processing by looking up the 5-tuple in the
> zone's flow table.
>
> The nf conntrack module will process the packets until a connection is
> in established state. Once in established state, the ct state pointer
> (nf_conn) will be restored on the skb from a successful ft lookup.
>
> Signed-off-by: Paul Blakey <paulb@mellanox.com>
> Acked-by: Jiri Pirko <jiri@mellanox.com>
> ---
>  net/sched/act_ct.c | 163 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 160 insertions(+), 3 deletions(-)
>
> diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c
> index b2bc885..3592e24 100644
> --- a/net/sched/act_ct.c
> +++ b/net/sched/act_ct.c
<snip>
> @@ -645,6 +802,7 @@ static int tcf_ct_act(struct sk_buff *skb, const struct tc_action *a,
>  			goto out_push;
>  	}
>  
> +do_nat:
>  	ct = nf_ct_get(skb, &ctinfo);
>  	if (!ct)
>  		goto out_push;
> @@ -662,9 +820,8 @@ static int tcf_ct_act(struct sk_buff *skb, const struct tc_action *a,
>  		 * even if the connection is already confirmed.
>  		 */
>  		nf_conntrack_confirm(skb);
> -	}
> -
> -	tcf_ct_flow_table_process_conn(p->ct_ft, ct, ctinfo);
> +	} else if (!skip_add)
> +		tcf_ct_flow_table_process_conn(p->ct_ft, ct, ctinfo);
>  
Elseif body should be enclosed in braces, since if body was.
-ed

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 6/6] net/sched: act_ct: Software offload of established flows
  2020-02-24 16:04   ` Edward Cree
@ 2020-02-25 12:16     ` Paul Blakey
  0 siblings, 0 replies; 16+ messages in thread
From: Paul Blakey @ 2020-02-25 12:16 UTC (permalink / raw)
  To: Edward Cree, Saeed Mahameed, Oz Shlomo, Jakub Kicinski,
	Vlad Buslov, David Miller, netdev, Jiri Pirko, Roi Dayan


On 2/24/2020 6:04 PM, Edward Cree wrote:
> On 23/02/2020 11:45, Paul Blakey wrote:
>> Offload nf conntrack processing by looking up the 5-tuple in the
>> zone's flow table.
>>
>> The nf conntrack module will process the packets until a connection is
>> in established state. Once in established state, the ct state pointer
>> (nf_conn) will be restored on the skb from a successful ft lookup.
>>
>> Signed-off-by: Paul Blakey <paulb@mellanox.com>
>> Acked-by: Jiri Pirko <jiri@mellanox.com>
>> ---
>>  net/sched/act_ct.c | 163 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
>>  1 file changed, 160 insertions(+), 3 deletions(-)
>>
>> diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c
>> index b2bc885..3592e24 100644
>> --- a/net/sched/act_ct.c
>> +++ b/net/sched/act_ct.c
> <snip>
>> @@ -645,6 +802,7 @@ static int tcf_ct_act(struct sk_buff *skb, const struct tc_action *a,
>>  			goto out_push;
>>  	}
>>  
>> +do_nat:
>>  	ct = nf_ct_get(skb, &ctinfo);
>>  	if (!ct)
>>  		goto out_push;
>> @@ -662,9 +820,8 @@ static int tcf_ct_act(struct sk_buff *skb, const struct tc_action *a,
>>  		 * even if the connection is already confirmed.
>>  		 */
>>  		nf_conntrack_confirm(skb);
>> -	}
>> -
>> -	tcf_ct_flow_table_process_conn(p->ct_ft, ct, ctinfo);
>> +	} else if (!skip_add)
>> +		tcf_ct_flow_table_process_conn(p->ct_ft, ct, ctinfo);
>>  
> Elseif body should be enclosed in braces, since if body was.
> -ed
thanks, will do

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 4/6] net/sched: act_ct: Create nf flow table per zone
  2020-02-24 15:58   ` Edward Cree
@ 2020-02-26  9:44     ` Paul Blakey
  0 siblings, 0 replies; 16+ messages in thread
From: Paul Blakey @ 2020-02-26  9:44 UTC (permalink / raw)
  To: Edward Cree, Saeed Mahameed, Oz Shlomo, Jakub Kicinski,
	Vlad Buslov, David Miller, netdev, Jiri Pirko, Roi Dayan


On 2/24/2020 5:58 PM, Edward Cree wrote:
> On 23/02/2020 11:45, Paul Blakey wrote:
>> Use the NF flow tables infrastructure for CT offload.
>>
>> Create a nf flow table per zone.
>>
>> Next patches will add FT entries to this table, and do
>> the software offload.
>>
>> Signed-off-by: Paul Blakey <paulb@mellanox.com>
>> Reviewed-by: Jiri Pirko <jiri@mellanox.com>
>> ---
>>  drivers/net/ethernet/mellanox/mlx5/core/en_tc.c |   1 +
>>  include/net/tc_act/tc_ct.h                      |   2 +
>>  net/sched/Kconfig                               |   2 +-
>>  net/sched/act_ct.c                              | 159 +++++++++++++++++++++++-
>>  4 files changed, 162 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
>> index 70b5fe2..eb16136 100644
>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
>> @@ -45,6 +45,7 @@
>>  #include <net/tc_act/tc_tunnel_key.h>
>>  #include <net/tc_act/tc_pedit.h>
>>  #include <net/tc_act/tc_csum.h>
>> +#include <net/tc_act/tc_ct.h>
>>  #include <net/arp.h>
>>  #include <net/ipv6_stubs.h>
>>  #include "en.h"
>> diff --git a/include/net/tc_act/tc_ct.h b/include/net/tc_act/tc_ct.h
>> index a8b1564..cf3492e 100644
>> --- a/include/net/tc_act/tc_ct.h
>> +++ b/include/net/tc_act/tc_ct.h
>> @@ -25,6 +25,8 @@ struct tcf_ct_params {
>>  	u16 ct_action;
>>  
>>  	struct rcu_head rcu;
>> +
>> +	struct tcf_ct_flow_table *ct_ft;
>>  };
>>  
>>  struct tcf_ct {
>> diff --git a/net/sched/Kconfig b/net/sched/Kconfig
>> index edde0e5..bfbefb7 100644
>> --- a/net/sched/Kconfig
>> +++ b/net/sched/Kconfig
>> @@ -972,7 +972,7 @@ config NET_ACT_TUNNEL_KEY
>>  
>>  config NET_ACT_CT
>>  	tristate "connection tracking tc action"
>> -	depends on NET_CLS_ACT && NF_CONNTRACK && NF_NAT
>> +	depends on NET_CLS_ACT && NF_CONNTRACK && NF_NAT && NF_FLOW_TABLE
> Is it not possible to keep sensible/old behaviour in the case
>  of NF_FLOW_TABLE=n?  (And what about NF_FLOW_TABLE=m, which is
>  what its Kconfig help seems to advise...)

No problem with it being a module.

It is possible to allow compilation without flow table, but it will create confusion for people trying

to offload conntrack, as it would silently not happen.

>
>>  	help
>>  	  Say Y here to allow sending the packets to conntrack module.
>>  
>> diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c
>> index f685c0d..4267d7d 100644
>> --- a/net/sched/act_ct.c
>> +++ b/net/sched/act_ct.c
>> @@ -15,6 +15,7 @@
>>  #include <linux/pkt_cls.h>
>>  #include <linux/ip.h>
>>  #include <linux/ipv6.h>
>> +#include <linux/rhashtable.h>
>>  #include <net/netlink.h>
>>  #include <net/pkt_sched.h>
>>  #include <net/pkt_cls.h>
>> @@ -24,6 +25,7 @@
>>  #include <uapi/linux/tc_act/tc_ct.h>
>>  #include <net/tc_act/tc_ct.h>
>>  
>> +#include <net/netfilter/nf_flow_table.h>
>>  #include <net/netfilter/nf_conntrack.h>
>>  #include <net/netfilter/nf_conntrack_core.h>
>>  #include <net/netfilter/nf_conntrack_zones.h>
>> @@ -31,6 +33,133 @@
>>  #include <net/netfilter/ipv6/nf_defrag_ipv6.h>
>>  #include <uapi/linux/netfilter/nf_nat.h>
>>  
>> +static struct workqueue_struct *act_ct_wq;
>> +
>> +struct tcf_ct_flow_table {
>> +	struct rhash_head node; /* In zones tables */
>> +
>> +	struct rcu_work rwork;
>> +	struct nf_flowtable nf_ft;
>> +	u16 zone;
>> +	u32 ref;
> Any reason this isn't using a refcount_t?
> -ed
it is updated under lock, so there was no need for atomic as well

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 4/6] net/sched: act_ct: Create nf flow table per zone
  2020-02-23 11:45 ` [PATCH net-next 4/6] net/sched: act_ct: Create nf flow table per zone Paul Blakey
  2020-02-24 15:58   ` Edward Cree
@ 2020-02-28 13:45   ` Marcelo Ricardo Leitner
  2020-03-01 12:38     ` Paul Blakey
  1 sibling, 1 reply; 16+ messages in thread
From: Marcelo Ricardo Leitner @ 2020-02-28 13:45 UTC (permalink / raw)
  To: Paul Blakey
  Cc: Saeed Mahameed, Oz Shlomo, Jakub Kicinski, Vlad Buslov,
	David Miller, netdev, Jiri Pirko, Roi Dayan

Hi,

On Sun, Feb 23, 2020 at 01:45:05PM +0200, Paul Blakey wrote:
> Use the NF flow tables infrastructure for CT offload.
> 
> Create a nf flow table per zone.
> 
> Next patches will add FT entries to this table, and do
> the software offload.
> 
> Signed-off-by: Paul Blakey <paulb@mellanox.com>
> Reviewed-by: Jiri Pirko <jiri@mellanox.com>
> ---
>  drivers/net/ethernet/mellanox/mlx5/core/en_tc.c |   1 +
>  include/net/tc_act/tc_ct.h                      |   2 +
>  net/sched/Kconfig                               |   2 +-
>  net/sched/act_ct.c                              | 159 +++++++++++++++++++++++-
>  4 files changed, 162 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
> index 70b5fe2..eb16136 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
> @@ -45,6 +45,7 @@
>  #include <net/tc_act/tc_tunnel_key.h>
>  #include <net/tc_act/tc_pedit.h>
>  #include <net/tc_act/tc_csum.h>
> +#include <net/tc_act/tc_ct.h>
>  #include <net/arp.h>
>  #include <net/ipv6_stubs.h>
>  #include "en.h"

This chunk is not really needed for this patchset, right?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 6/6] net/sched: act_ct: Software offload of established flows
  2020-02-23 11:45 ` [PATCH net-next 6/6] net/sched: act_ct: Software offload of established flows Paul Blakey
  2020-02-24 16:04   ` Edward Cree
@ 2020-02-28 14:52   ` Marcelo Ricardo Leitner
  2020-03-01 12:50     ` Paul Blakey
  2020-02-29 14:18   ` Paul Blakey
  2 siblings, 1 reply; 16+ messages in thread
From: Marcelo Ricardo Leitner @ 2020-02-28 14:52 UTC (permalink / raw)
  To: Paul Blakey
  Cc: Saeed Mahameed, Oz Shlomo, Jakub Kicinski, Vlad Buslov,
	David Miller, netdev, Jiri Pirko, Roi Dayan

On Sun, Feb 23, 2020 at 01:45:07PM +0200, Paul Blakey wrote:
> Offload nf conntrack processing by looking up the 5-tuple in the
> zone's flow table.
> 
> The nf conntrack module will process the packets until a connection is
> in established state. Once in established state, the ct state pointer
> (nf_conn) will be restored on the skb from a successful ft lookup.
> 
> Signed-off-by: Paul Blakey <paulb@mellanox.com>
> Acked-by: Jiri Pirko <jiri@mellanox.com>
> ---
>  net/sched/act_ct.c | 163 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 160 insertions(+), 3 deletions(-)
> 
> diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c
> index b2bc885..3592e24 100644
> --- a/net/sched/act_ct.c
> +++ b/net/sched/act_ct.c
> @@ -211,6 +211,157 @@ static void tcf_ct_flow_table_process_conn(struct tcf_ct_flow_table *ct_ft,
>  	tcf_ct_flow_table_add(ct_ft, ct, tcp);
>  }
>  
> +static bool
> +tcf_ct_flow_table_fill_tuple_ipv4(struct sk_buff *skb,
> +				  struct flow_offload_tuple *tuple)
> +{
> +	struct flow_ports *ports;
> +	unsigned int thoff;
> +	struct iphdr *iph;
> +
> +	if (!pskb_may_pull(skb, sizeof(*iph)))
> +		return false;
> +
> +	iph = ip_hdr(skb);
> +	thoff = iph->ihl * 4;

[A]

> +
> +	if (ip_is_fragment(iph) ||
> +	    unlikely(thoff != sizeof(struct iphdr)))
> +		return false;
> +
> +	if (iph->protocol != IPPROTO_TCP &&
> +	    iph->protocol != IPPROTO_UDP)
> +		return false;
> +
> +	if (iph->ttl <= 1)
> +		return false;
> +
> +	thoff = iph->ihl * 4;

This is not needed, as already done in [A].

> +	if (!pskb_may_pull(skb, thoff + sizeof(*ports)))
> +		return false;
> +
> +	ports = (struct flow_ports *)(skb_network_header(skb) + thoff);
> +
> +	tuple->src_v4.s_addr = iph->saddr;
> +	tuple->dst_v4.s_addr = iph->daddr;
> +	tuple->src_port = ports->source;
> +	tuple->dst_port = ports->dest;
> +	tuple->l3proto = AF_INET;
> +	tuple->l4proto = iph->protocol;
> +
> +	return true;
> +}
> +
> +static bool
> +tcf_ct_flow_table_fill_tuple_ipv6(struct sk_buff *skb,
> +				  struct flow_offload_tuple *tuple)
> +{
> +	struct flow_ports *ports;
> +	struct ipv6hdr *ip6h;
> +	unsigned int thoff;
> +
> +	if (!pskb_may_pull(skb, sizeof(*ip6h)))
> +		return false;
> +
> +	ip6h = ipv6_hdr(skb);
> +
> +	if (ip6h->nexthdr != IPPROTO_TCP &&
> +	    ip6h->nexthdr != IPPROTO_UDP)
> +		return false;
> +
> +	if (ip6h->hop_limit <= 1)
> +		return false;
> +
> +	thoff = sizeof(*ip6h);
> +	if (!pskb_may_pull(skb, thoff + sizeof(*ports)))
> +		return false;
> +
> +	ports = (struct flow_ports *)(skb_network_header(skb) + thoff);
> +
> +	tuple->src_v6 = ip6h->saddr;
> +	tuple->dst_v6 = ip6h->daddr;
> +	tuple->src_port = ports->source;
> +	tuple->dst_port = ports->dest;
> +	tuple->l3proto = AF_INET6;
> +	tuple->l4proto = ip6h->nexthdr;
> +
> +	return true;
> +}
> +
> +static bool tcf_ct_flow_table_check_tcp(struct flow_offload *flow, int proto,
> +					struct sk_buff *skb,
> +					unsigned int thoff)
> +{
> +	struct tcphdr *tcph;
> +
> +	if (proto != IPPROTO_TCP)
> +		return true;

I suppose this is a way to do additional checks for TCP while allowing
everything, but it does give the feeling that the 'return true' is
wrong and should have been 'return false' instead. The function name
works both ways too, at least to me. :-)

Can we have a comment to make it explicit, or a different construct?
Like, instead of 'return true' here, a 'goto out_ok' and reuse the
last return.


These are all my comments on this series. LGTM otherwise. Thanks!

> +
> +	if (!pskb_may_pull(skb, thoff + sizeof(*tcph)))
> +		return false;
> +
> +	tcph = (void *)(skb_network_header(skb) + thoff);
> +	if (unlikely(tcph->fin || tcph->rst)) {
> +		flow_offload_teardown(flow);
> +		return false;
> +	}
> +
> +	return true;
> +}
> +
> +static bool tcf_ct_flow_table_lookup(struct tcf_ct_params *p,
> +				     struct sk_buff *skb,
> +				     u8 family)
> +{
> +	struct nf_flowtable *nf_ft = &p->ct_ft->nf_ft;
> +	struct flow_offload_tuple_rhash *tuplehash;
> +	struct flow_offload_tuple tuple = {};
> +	enum ip_conntrack_info ctinfo;
> +	struct flow_offload *flow;
> +	struct nf_conn *ct;
> +	unsigned int thoff;
> +	u8 dir;
> +
> +	/* Previously seen or loopback */
> +	ct = nf_ct_get(skb, &ctinfo);
> +	if ((ct && !nf_ct_is_template(ct)) || ctinfo == IP_CT_UNTRACKED)
> +		return false;
> +
> +	switch (family) {
> +	case NFPROTO_IPV4:
> +		if (!tcf_ct_flow_table_fill_tuple_ipv4(skb, &tuple))
> +			return false;
> +		break;
> +	case NFPROTO_IPV6:
> +		if (!tcf_ct_flow_table_fill_tuple_ipv6(skb, &tuple))
> +			return false;
> +		break;
> +	default:
> +		return false;
> +	}
> +
> +	tuplehash = flow_offload_lookup(nf_ft, &tuple);
> +	if (!tuplehash)
> +		return false;
> +
> +	dir = tuplehash->tuple.dir;
> +	flow = container_of(tuplehash, struct flow_offload, tuplehash[dir]);
> +	ct = flow->ct;
> +
> +	ctinfo = dir == FLOW_OFFLOAD_DIR_ORIGINAL ? IP_CT_ESTABLISHED :
> +						    IP_CT_ESTABLISHED_REPLY;
> +
> +	thoff = ip_hdr(skb)->ihl * 4;
> +	if (!tcf_ct_flow_table_check_tcp(flow, ip_hdr(skb)->protocol, skb,
> +					 thoff))
> +		return false;
> +
> +	nf_conntrack_get(&ct->ct_general);
> +	nf_ct_set(skb, ct, ctinfo);
> +
> +	return true;
> +}
> +
>  static int tcf_ct_flow_tables_init(void)
>  {
>  	return rhashtable_init(&zones_ht, &zones_params);
> @@ -579,6 +730,7 @@ static int tcf_ct_act(struct sk_buff *skb, const struct tc_action *a,
>  	struct nf_hook_state state;
>  	int nh_ofs, err, retval;
>  	struct tcf_ct_params *p;
> +	bool skip_add = false;
>  	struct nf_conn *ct;
>  	u8 family;
>  
> @@ -628,6 +780,11 @@ static int tcf_ct_act(struct sk_buff *skb, const struct tc_action *a,
>  	 */
>  	cached = tcf_ct_skb_nfct_cached(net, skb, p->zone, force);
>  	if (!cached) {
> +		if (!commit && tcf_ct_flow_table_lookup(p, skb, family)) {
> +			skip_add = true;
> +			goto do_nat;
> +		}
> +
>  		/* Associate skb with specified zone. */
>  		if (tmpl) {
>  			ct = nf_ct_get(skb, &ctinfo);
> @@ -645,6 +802,7 @@ static int tcf_ct_act(struct sk_buff *skb, const struct tc_action *a,
>  			goto out_push;
>  	}
>  
> +do_nat:
>  	ct = nf_ct_get(skb, &ctinfo);
>  	if (!ct)
>  		goto out_push;
> @@ -662,9 +820,8 @@ static int tcf_ct_act(struct sk_buff *skb, const struct tc_action *a,
>  		 * even if the connection is already confirmed.
>  		 */
>  		nf_conntrack_confirm(skb);
> -	}
> -
> -	tcf_ct_flow_table_process_conn(p->ct_ft, ct, ctinfo);
> +	} else if (!skip_add)
> +		tcf_ct_flow_table_process_conn(p->ct_ft, ct, ctinfo);
>  
>  out_push:
>  	skb_push_rcsum(skb, nh_ofs);
> -- 
> 1.8.3.1
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 6/6] net/sched: act_ct: Software offload of established flows
  2020-02-23 11:45 ` [PATCH net-next 6/6] net/sched: act_ct: Software offload of established flows Paul Blakey
  2020-02-24 16:04   ` Edward Cree
  2020-02-28 14:52   ` Marcelo Ricardo Leitner
@ 2020-02-29 14:18   ` Paul Blakey
  2 siblings, 0 replies; 16+ messages in thread
From: Paul Blakey @ 2020-02-29 14:18 UTC (permalink / raw)
  To: Saeed Mahameed, Oz Shlomo, Jakub Kicinski, Vlad Buslov,
	David Miller, netdev, Jiri Pirko, Roi Dayan

Will be sending v2 shortly, thanks for the comments

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 4/6] net/sched: act_ct: Create nf flow table per zone
  2020-02-28 13:45   ` Marcelo Ricardo Leitner
@ 2020-03-01 12:38     ` Paul Blakey
  0 siblings, 0 replies; 16+ messages in thread
From: Paul Blakey @ 2020-03-01 12:38 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner
  Cc: Saeed Mahameed, Oz Shlomo, Jakub Kicinski, Vlad Buslov,
	David Miller, netdev, Jiri Pirko, Roi Dayan


On 2/28/2020 3:45 PM, Marcelo Ricardo Leitner wrote:
> Hi,
>
> On Sun, Feb 23, 2020 at 01:45:05PM +0200, Paul Blakey wrote:
>> Use the NF flow tables infrastructure for CT offload.
>>
>> Create a nf flow table per zone.
>>
>> Next patches will add FT entries to this table, and do
>> the software offload.
>>
>> Signed-off-by: Paul Blakey <paulb@mellanox.com>
>> Reviewed-by: Jiri Pirko <jiri@mellanox.com>
>> ---
>>  drivers/net/ethernet/mellanox/mlx5/core/en_tc.c |   1 +
>>  include/net/tc_act/tc_ct.h                      |   2 +
>>  net/sched/Kconfig                               |   2 +-
>>  net/sched/act_ct.c                              | 159 +++++++++++++++++++++++-
>>  4 files changed, 162 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
>> index 70b5fe2..eb16136 100644
>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
>> @@ -45,6 +45,7 @@
>>  #include <net/tc_act/tc_tunnel_key.h>
>>  #include <net/tc_act/tc_pedit.h>
>>  #include <net/tc_act/tc_csum.h>
>> +#include <net/tc_act/tc_ct.h>
>>  #include <net/arp.h>
>>  #include <net/ipv6_stubs.h>
>>  #include "en.h"
> This chunk is not really needed for this patchset, right?
Removed thanks

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 6/6] net/sched: act_ct: Software offload of established flows
  2020-02-28 14:52   ` Marcelo Ricardo Leitner
@ 2020-03-01 12:50     ` Paul Blakey
  0 siblings, 0 replies; 16+ messages in thread
From: Paul Blakey @ 2020-03-01 12:50 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner
  Cc: Saeed Mahameed, Oz Shlomo, Jakub Kicinski, Vlad Buslov,
	David Miller, netdev, Jiri Pirko, Roi Dayan


On 2/28/2020 4:52 PM, Marcelo Ricardo Leitner wrote:
> On Sun, Feb 23, 2020 at 01:45:07PM +0200, Paul Blakey wrote:
>> Offload nf conntrack processing by looking up the 5-tuple in the
>> zone's flow table.
>>
>> The nf conntrack module will process the packets until a connection is
>> in established state. Once in established state, the ct state pointer
>> (nf_conn) will be restored on the skb from a successful ft lookup.
>>
>> Signed-off-by: Paul Blakey <paulb@mellanox.com>
>> Acked-by: Jiri Pirko <jiri@mellanox.com>
>> ---
>>  net/sched/act_ct.c | 163 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
>>  1 file changed, 160 insertions(+), 3 deletions(-)
>>
>> diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c
>> index b2bc885..3592e24 100644
>> --- a/net/sched/act_ct.c
>> +++ b/net/sched/act_ct.c
>> @@ -211,6 +211,157 @@ static void tcf_ct_flow_table_process_conn(struct tcf_ct_flow_table *ct_ft,
>>  	tcf_ct_flow_table_add(ct_ft, ct, tcp);
>>  }
>>  
>> +static bool
>> +tcf_ct_flow_table_fill_tuple_ipv4(struct sk_buff *skb,
>> +				  struct flow_offload_tuple *tuple)
>> +{
>> +	struct flow_ports *ports;
>> +	unsigned int thoff;
>> +	struct iphdr *iph;
>> +
>> +	if (!pskb_may_pull(skb, sizeof(*iph)))
>> +		return false;
>> +
>> +	iph = ip_hdr(skb);
>> +	thoff = iph->ihl * 4;
> [A]
>
>> +
>> +	if (ip_is_fragment(iph) ||
>> +	    unlikely(thoff != sizeof(struct iphdr)))
>> +		return false;
>> +
>> +	if (iph->protocol != IPPROTO_TCP &&
>> +	    iph->protocol != IPPROTO_UDP)
>> +		return false;
>> +
>> +	if (iph->ttl <= 1)
>> +		return false;
>> +
>> +	thoff = iph->ihl * 4;
> This is not needed, as already done in [A].
Removed
>> +	if (!pskb_may_pull(skb, thoff + sizeof(*ports)))
>> +		return false;
>> +
>> +	ports = (struct flow_ports *)(skb_network_header(skb) + thoff);
>> +
>> +	tuple->src_v4.s_addr = iph->saddr;
>> +	tuple->dst_v4.s_addr = iph->daddr;
>> +	tuple->src_port = ports->source;
>> +	tuple->dst_port = ports->dest;
>> +	tuple->l3proto = AF_INET;
>> +	tuple->l4proto = iph->protocol;
>> +
>> +	return true;
>> +}
>> +
>> +static bool
>> +tcf_ct_flow_table_fill_tuple_ipv6(struct sk_buff *skb,
>> +				  struct flow_offload_tuple *tuple)
>> +{
>> +	struct flow_ports *ports;
>> +	struct ipv6hdr *ip6h;
>> +	unsigned int thoff;
>> +
>> +	if (!pskb_may_pull(skb, sizeof(*ip6h)))
>> +		return false;
>> +
>> +	ip6h = ipv6_hdr(skb);
>> +
>> +	if (ip6h->nexthdr != IPPROTO_TCP &&
>> +	    ip6h->nexthdr != IPPROTO_UDP)
>> +		return false;
>> +
>> +	if (ip6h->hop_limit <= 1)
>> +		return false;
>> +
>> +	thoff = sizeof(*ip6h);
>> +	if (!pskb_may_pull(skb, thoff + sizeof(*ports)))
>> +		return false;
>> +
>> +	ports = (struct flow_ports *)(skb_network_header(skb) + thoff);
>> +
>> +	tuple->src_v6 = ip6h->saddr;
>> +	tuple->dst_v6 = ip6h->daddr;
>> +	tuple->src_port = ports->source;
>> +	tuple->dst_port = ports->dest;
>> +	tuple->l3proto = AF_INET6;
>> +	tuple->l4proto = ip6h->nexthdr;
>> +
>> +	return true;
>> +}
>> +
>> +static bool tcf_ct_flow_table_check_tcp(struct flow_offload *flow, int proto,
>> +					struct sk_buff *skb,
>> +					unsigned int thoff)
>> +{
>> +	struct tcphdr *tcph;
>> +
>> +	if (proto != IPPROTO_TCP)
>> +		return true;
> I suppose this is a way to do additional checks for TCP while allowing
> everything, but it does give the feeling that the 'return true' is
> wrong and should have been 'return false' instead. The function name
> works both ways too, at least to me. :-)
>
> Can we have a comment to make it explicit, or a different construct?
> Like, instead of 'return true' here, a 'goto out_ok' and reuse the
> last return.
>
i moved the tcp check outside (different construct)
> These are all my comments on this series. LGTM otherwise. Thanks!
thanks for reviewing.
>
>> +
>> +	if (!pskb_may_pull(skb, thoff + sizeof(*tcph)))
>> +		return false;
>> +
>> +	tcph = (void *)(skb_network_header(skb) + thoff);
>> +	if (unlikely(tcph->fin || tcph->rst)) {
>> +		flow_offload_teardown(flow);
>> +		return false;
>> +	}
>> +
>> +	return true;
>> +}
>> +
>> +static bool tcf_ct_flow_table_lookup(struct tcf_ct_params *p,
>> +				     struct sk_buff *skb,
>> +				     u8 family)
>> +{
>> +	struct nf_flowtable *nf_ft = &p->ct_ft->nf_ft;
>> +	struct flow_offload_tuple_rhash *tuplehash;
>> +	struct flow_offload_tuple tuple = {};
>> +	enum ip_conntrack_info ctinfo;
>> +	struct flow_offload *flow;
>> +	struct nf_conn *ct;
>> +	unsigned int thoff;
>> +	u8 dir;
>> +
>> +	/* Previously seen or loopback */
>> +	ct = nf_ct_get(skb, &ctinfo);
>> +	if ((ct && !nf_ct_is_template(ct)) || ctinfo == IP_CT_UNTRACKED)
>> +		return false;
>> +
>> +	switch (family) {
>> +	case NFPROTO_IPV4:
>> +		if (!tcf_ct_flow_table_fill_tuple_ipv4(skb, &tuple))
>> +			return false;
>> +		break;
>> +	case NFPROTO_IPV6:
>> +		if (!tcf_ct_flow_table_fill_tuple_ipv6(skb, &tuple))
>> +			return false;
>> +		break;
>> +	default:
>> +		return false;
>> +	}
>> +
>> +	tuplehash = flow_offload_lookup(nf_ft, &tuple);
>> +	if (!tuplehash)
>> +		return false;
>> +
>> +	dir = tuplehash->tuple.dir;
>> +	flow = container_of(tuplehash, struct flow_offload, tuplehash[dir]);
>> +	ct = flow->ct;
>> +
>> +	ctinfo = dir == FLOW_OFFLOAD_DIR_ORIGINAL ? IP_CT_ESTABLISHED :
>> +						    IP_CT_ESTABLISHED_REPLY;
>> +
>> +	thoff = ip_hdr(skb)->ihl * 4;
>> +	if (!tcf_ct_flow_table_check_tcp(flow, ip_hdr(skb)->protocol, skb,
>> +					 thoff))
>> +		return false;
>> +
>> +	nf_conntrack_get(&ct->ct_general);
>> +	nf_ct_set(skb, ct, ctinfo);
>> +
>> +	return true;
>> +}
>> +
>>  static int tcf_ct_flow_tables_init(void)
>>  {
>>  	return rhashtable_init(&zones_ht, &zones_params);
>> @@ -579,6 +730,7 @@ static int tcf_ct_act(struct sk_buff *skb, const struct tc_action *a,
>>  	struct nf_hook_state state;
>>  	int nh_ofs, err, retval;
>>  	struct tcf_ct_params *p;
>> +	bool skip_add = false;
>>  	struct nf_conn *ct;
>>  	u8 family;
>>  
>> @@ -628,6 +780,11 @@ static int tcf_ct_act(struct sk_buff *skb, const struct tc_action *a,
>>  	 */
>>  	cached = tcf_ct_skb_nfct_cached(net, skb, p->zone, force);
>>  	if (!cached) {
>> +		if (!commit && tcf_ct_flow_table_lookup(p, skb, family)) {
>> +			skip_add = true;
>> +			goto do_nat;
>> +		}
>> +
>>  		/* Associate skb with specified zone. */
>>  		if (tmpl) {
>>  			ct = nf_ct_get(skb, &ctinfo);
>> @@ -645,6 +802,7 @@ static int tcf_ct_act(struct sk_buff *skb, const struct tc_action *a,
>>  			goto out_push;
>>  	}
>>  
>> +do_nat:
>>  	ct = nf_ct_get(skb, &ctinfo);
>>  	if (!ct)
>>  		goto out_push;
>> @@ -662,9 +820,8 @@ static int tcf_ct_act(struct sk_buff *skb, const struct tc_action *a,
>>  		 * even if the connection is already confirmed.
>>  		 */
>>  		nf_conntrack_confirm(skb);
>> -	}
>> -
>> -	tcf_ct_flow_table_process_conn(p->ct_ft, ct, ctinfo);
>> +	} else if (!skip_add)
>> +		tcf_ct_flow_table_process_conn(p->ct_ft, ct, ctinfo);
>>  
>>  out_push:
>>  	skb_push_rcsum(skb, nh_ofs);
>> -- 
>> 1.8.3.1
>>

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2020-03-01 12:50 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-23 11:45 [PATCH net-next 0/6] act_ct: Software offload of conntrack_in Paul Blakey
2020-02-23 11:45 ` [PATCH net-next 1/6] netfilter: flowtable: pass flowtable to nf_flow_table_iterate() Paul Blakey
2020-02-23 11:45 ` [PATCH net-next 2/6] netfilter: flowtable: nf_flow_table_iterate() returns the number of entries Paul Blakey
2020-02-23 11:45 ` [PATCH net-next 3/6] netfilter: flowtable: add cleanup callback from garbage collector Paul Blakey
2020-02-23 11:45 ` [PATCH net-next 4/6] net/sched: act_ct: Create nf flow table per zone Paul Blakey
2020-02-24 15:58   ` Edward Cree
2020-02-26  9:44     ` Paul Blakey
2020-02-28 13:45   ` Marcelo Ricardo Leitner
2020-03-01 12:38     ` Paul Blakey
2020-02-23 11:45 ` [PATCH net-next 5/6] net/sched: act_ct: Offload established connections to flow table Paul Blakey
2020-02-23 11:45 ` [PATCH net-next 6/6] net/sched: act_ct: Software offload of established flows Paul Blakey
2020-02-24 16:04   ` Edward Cree
2020-02-25 12:16     ` Paul Blakey
2020-02-28 14:52   ` Marcelo Ricardo Leitner
2020-03-01 12:50     ` Paul Blakey
2020-02-29 14:18   ` Paul Blakey

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).