netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net 0/7] Netfilter fixes for net
@ 2022-05-18 21:38 Pablo Neira Ayuso
  2022-05-18 21:38 ` [PATCH net 1/7] netfilter: flowtable: fix excessive hw offload attempts after failure Pablo Neira Ayuso
                   ` (6 more replies)
  0 siblings, 7 replies; 9+ messages in thread
From: Pablo Neira Ayuso @ 2022-05-18 21:38 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni

Hi,

This patchset contains Netfilter fixes for net:

1) Reduce number of hardware offload retries from flowtable datapath
   which might hog system with retries, from Felix Fietkau.

2) Skip neighbour lookup for PPPoE device, fill_forward_path() already
   provides this and set on destination address from fill_forward_path for
   PPPoE device, also from Felix.

4) When combining PPPoE on top of a VLAN device, set info->outdev to the
   PPPoE device so software offload works, from Felix.

5) Fix TCP teardown flowtable state, races with conntrack gc might result
   in resetting the state to ESTABLISHED and the time to one day. Joint
   work with Oz Shlomo and Sven Auhagen.

6) Call dst_check() from flowtable datapath to check if dst is stale
   instead of doing it from garbage collector path.

7) Disable register tracking infrastructure, either user-space or
   kernel need to pre-fetch keys inconditionally, otherwise register
   tracking assumes data is already available in register that might
   not well be there, leading to incorrect reductions.

Please, pull these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf.git

Thanks.

----------------------------------------------------------------

The following changes since commit f3f19f939c11925dadd3f4776f99f8c278a7017b:

  Merge tag 'net-5.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net (2022-05-12 11:51:45 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf.git HEAD

for you to fetch changes up to 9e539c5b6d9c5b996e45105921ee9dd955c0f535:

  netfilter: nf_tables: disable expression reduction infra (2022-05-18 17:34:26 +0200)

----------------------------------------------------------------
Felix Fietkau (4):
      netfilter: flowtable: fix excessive hw offload attempts after failure
      netfilter: nft_flow_offload: skip dst neigh lookup for ppp devices
      net: fix dev_fill_forward_path with pppoe + bridge
      netfilter: nft_flow_offload: fix offload with pppoe + vlan

Pablo Neira Ayuso (2):
      netfilter: flowtable: fix TCP flow teardown
      netfilter: nf_tables: disable expression reduction infra

Ritaro Takenaka (1):
      netfilter: flowtable: move dst_check to packet path

 drivers/net/ppp/pppoe.c            |  1 +
 include/linux/netdevice.h          |  2 +-
 net/core/dev.c                     |  2 +-
 net/netfilter/nf_flow_table_core.c | 60 +++++++-------------------------------
 net/netfilter/nf_flow_table_ip.c   | 19 ++++++++++++
 net/netfilter/nf_tables_api.c      | 11 +------
 net/netfilter/nft_flow_offload.c   | 28 +++++++++++-------
 7 files changed, 51 insertions(+), 72 deletions(-)

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH net 1/7] netfilter: flowtable: fix excessive hw offload attempts after failure
  2022-05-18 21:38 [PATCH net 0/7] Netfilter fixes for net Pablo Neira Ayuso
@ 2022-05-18 21:38 ` Pablo Neira Ayuso
  2022-05-19  4:40   ` patchwork-bot+netdevbpf
  2022-05-18 21:38 ` [PATCH net 2/7] netfilter: nft_flow_offload: skip dst neigh lookup for ppp devices Pablo Neira Ayuso
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 9+ messages in thread
From: Pablo Neira Ayuso @ 2022-05-18 21:38 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni

From: Felix Fietkau <nbd@nbd.name>

If a flow cannot be offloaded, the code currently repeatedly tries again as
quickly as possible, which can significantly increase system load.
Fix this by limiting flow timeout update and hardware offload retry to once
per second.

Fixes: c07531c01d82 ("netfilter: flowtable: Remove redundant hw refresh bit")
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nf_flow_table_core.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/netfilter/nf_flow_table_core.c b/net/netfilter/nf_flow_table_core.c
index 3db256da919b..20b4a14e5d4e 100644
--- a/net/netfilter/nf_flow_table_core.c
+++ b/net/netfilter/nf_flow_table_core.c
@@ -335,8 +335,10 @@ void flow_offload_refresh(struct nf_flowtable *flow_table,
 	u32 timeout;
 
 	timeout = nf_flowtable_time_stamp + flow_offload_get_timeout(flow);
-	if (READ_ONCE(flow->timeout) != timeout)
+	if (timeout - READ_ONCE(flow->timeout) > HZ)
 		WRITE_ONCE(flow->timeout, timeout);
+	else
+		return;
 
 	if (likely(!nf_flowtable_hw_offload(flow_table)))
 		return;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH net 2/7] netfilter: nft_flow_offload: skip dst neigh lookup for ppp devices
  2022-05-18 21:38 [PATCH net 0/7] Netfilter fixes for net Pablo Neira Ayuso
  2022-05-18 21:38 ` [PATCH net 1/7] netfilter: flowtable: fix excessive hw offload attempts after failure Pablo Neira Ayuso
@ 2022-05-18 21:38 ` Pablo Neira Ayuso
  2022-05-18 21:38 ` [PATCH net 3/7] net: fix dev_fill_forward_path with pppoe + bridge Pablo Neira Ayuso
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Pablo Neira Ayuso @ 2022-05-18 21:38 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni

From: Felix Fietkau <nbd@nbd.name>

The dst entry does not contain a valid hardware address, so skip the lookup
in order to avoid running into errors here.
The proper hardware address is filled in from nft_dev_path_info

Fixes: 72efd585f714 ("netfilter: flowtable: add pppoe support")
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nft_flow_offload.c | 22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/net/netfilter/nft_flow_offload.c b/net/netfilter/nft_flow_offload.c
index 900d48c810a1..d88de26aad75 100644
--- a/net/netfilter/nft_flow_offload.c
+++ b/net/netfilter/nft_flow_offload.c
@@ -36,6 +36,15 @@ static void nft_default_forward_path(struct nf_flow_route *route,
 	route->tuple[dir].xmit_type	= nft_xmit_type(dst_cache);
 }
 
+static bool nft_is_valid_ether_device(const struct net_device *dev)
+{
+	if (!dev || (dev->flags & IFF_LOOPBACK) || dev->type != ARPHRD_ETHER ||
+	    dev->addr_len != ETH_ALEN || !is_valid_ether_addr(dev->dev_addr))
+		return false;
+
+	return true;
+}
+
 static int nft_dev_fill_forward_path(const struct nf_flow_route *route,
 				     const struct dst_entry *dst_cache,
 				     const struct nf_conn *ct,
@@ -47,6 +56,9 @@ static int nft_dev_fill_forward_path(const struct nf_flow_route *route,
 	struct neighbour *n;
 	u8 nud_state;
 
+	if (!nft_is_valid_ether_device(dev))
+		goto out;
+
 	n = dst_neigh_lookup(dst_cache, daddr);
 	if (!n)
 		return -1;
@@ -60,6 +72,7 @@ static int nft_dev_fill_forward_path(const struct nf_flow_route *route,
 	if (!(nud_state & NUD_VALID))
 		return -1;
 
+out:
 	return dev_fill_forward_path(dev, ha, stack);
 }
 
@@ -78,15 +91,6 @@ struct nft_forward_info {
 	enum flow_offload_xmit_type xmit_type;
 };
 
-static bool nft_is_valid_ether_device(const struct net_device *dev)
-{
-	if (!dev || (dev->flags & IFF_LOOPBACK) || dev->type != ARPHRD_ETHER ||
-	    dev->addr_len != ETH_ALEN || !is_valid_ether_addr(dev->dev_addr))
-		return false;
-
-	return true;
-}
-
 static void nft_dev_path_info(const struct net_device_path_stack *stack,
 			      struct nft_forward_info *info,
 			      unsigned char *ha, struct nf_flowtable *flowtable)
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH net 3/7] net: fix dev_fill_forward_path with pppoe + bridge
  2022-05-18 21:38 [PATCH net 0/7] Netfilter fixes for net Pablo Neira Ayuso
  2022-05-18 21:38 ` [PATCH net 1/7] netfilter: flowtable: fix excessive hw offload attempts after failure Pablo Neira Ayuso
  2022-05-18 21:38 ` [PATCH net 2/7] netfilter: nft_flow_offload: skip dst neigh lookup for ppp devices Pablo Neira Ayuso
@ 2022-05-18 21:38 ` Pablo Neira Ayuso
  2022-05-18 21:38 ` [PATCH net 4/7] netfilter: nft_flow_offload: fix offload with pppoe + vlan Pablo Neira Ayuso
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Pablo Neira Ayuso @ 2022-05-18 21:38 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni

From: Felix Fietkau <nbd@nbd.name>

When calling dev_fill_forward_path on a pppoe device, the provided destination
address is invalid. In order for the bridge fdb lookup to succeed, the pppoe
code needs to update ctx->daddr to the correct value.
Fix this by storing the address inside struct net_device_path_ctx

Fixes: f6efc675c9dd ("net: ppp: resolve forwarding path for bridge pppoe devices")
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 drivers/net/ppp/pppoe.c   | 1 +
 include/linux/netdevice.h | 2 +-
 net/core/dev.c            | 2 +-
 3 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ppp/pppoe.c b/drivers/net/ppp/pppoe.c
index 3619520340b7..e172743948ed 100644
--- a/drivers/net/ppp/pppoe.c
+++ b/drivers/net/ppp/pppoe.c
@@ -988,6 +988,7 @@ static int pppoe_fill_forward_path(struct net_device_path_ctx *ctx,
 	path->encap.proto = htons(ETH_P_PPP_SES);
 	path->encap.id = be16_to_cpu(po->num);
 	memcpy(path->encap.h_dest, po->pppoe_pa.remote, ETH_ALEN);
+	memcpy(ctx->daddr, po->pppoe_pa.remote, ETH_ALEN);
 	path->dev = ctx->dev;
 	ctx->dev = dev;
 
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index b1fbe21650bb..f736c020cde2 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -900,7 +900,7 @@ struct net_device_path_stack {
 
 struct net_device_path_ctx {
 	const struct net_device *dev;
-	const u8		*daddr;
+	u8			daddr[ETH_ALEN];
 
 	int			num_vlans;
 	struct {
diff --git a/net/core/dev.c b/net/core/dev.c
index 1461c2d9dec8..2771fd22dc6a 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -681,11 +681,11 @@ int dev_fill_forward_path(const struct net_device *dev, const u8 *daddr,
 	const struct net_device *last_dev;
 	struct net_device_path_ctx ctx = {
 		.dev	= dev,
-		.daddr	= daddr,
 	};
 	struct net_device_path *path;
 	int ret = 0;
 
+	memcpy(ctx.daddr, daddr, sizeof(ctx.daddr));
 	stack->num_paths = 0;
 	while (ctx.dev && ctx.dev->netdev_ops->ndo_fill_forward_path) {
 		last_dev = ctx.dev;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH net 4/7] netfilter: nft_flow_offload: fix offload with pppoe + vlan
  2022-05-18 21:38 [PATCH net 0/7] Netfilter fixes for net Pablo Neira Ayuso
                   ` (2 preceding siblings ...)
  2022-05-18 21:38 ` [PATCH net 3/7] net: fix dev_fill_forward_path with pppoe + bridge Pablo Neira Ayuso
@ 2022-05-18 21:38 ` Pablo Neira Ayuso
  2022-05-18 21:38 ` [PATCH net 5/7] netfilter: flowtable: fix TCP flow teardown Pablo Neira Ayuso
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Pablo Neira Ayuso @ 2022-05-18 21:38 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni

From: Felix Fietkau <nbd@nbd.name>

When running a combination of PPPoE on top of a VLAN, we need to set
info->outdev to the PPPoE device, otherwise PPPoE encap is skipped
during software offload.

Fixes: 72efd585f714 ("netfilter: flowtable: add pppoe support")
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nft_flow_offload.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/netfilter/nft_flow_offload.c b/net/netfilter/nft_flow_offload.c
index d88de26aad75..187b8cb9a510 100644
--- a/net/netfilter/nft_flow_offload.c
+++ b/net/netfilter/nft_flow_offload.c
@@ -123,7 +123,8 @@ static void nft_dev_path_info(const struct net_device_path_stack *stack,
 				info->indev = NULL;
 				break;
 			}
-			info->outdev = path->dev;
+			if (!info->outdev)
+				info->outdev = path->dev;
 			info->encap[info->num_encaps].id = path->encap.id;
 			info->encap[info->num_encaps].proto = path->encap.proto;
 			info->num_encaps++;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH net 5/7] netfilter: flowtable: fix TCP flow teardown
  2022-05-18 21:38 [PATCH net 0/7] Netfilter fixes for net Pablo Neira Ayuso
                   ` (3 preceding siblings ...)
  2022-05-18 21:38 ` [PATCH net 4/7] netfilter: nft_flow_offload: fix offload with pppoe + vlan Pablo Neira Ayuso
@ 2022-05-18 21:38 ` Pablo Neira Ayuso
  2022-05-18 21:38 ` [PATCH net 6/7] netfilter: flowtable: move dst_check to packet path Pablo Neira Ayuso
  2022-05-18 21:38 ` [PATCH net 7/7] netfilter: nf_tables: disable expression reduction infra Pablo Neira Ayuso
  6 siblings, 0 replies; 9+ messages in thread
From: Pablo Neira Ayuso @ 2022-05-18 21:38 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni

This patch addresses three possible problems:

1. ct gc may race to undo the timeout adjustment of the packet path, leaving
   the conntrack entry in place with the internal offload timeout (one day).

2. ct gc removes the ct because the IPS_OFFLOAD_BIT is not set and the CLOSE
   timeout is reached before the flow offload del.

3. tcp ct is always set to ESTABLISHED with a very long timeout
   in flow offload teardown/delete even though the state might be already
   CLOSED. Also as a remark we cannot assume that the FIN or RST packet
   is hitting flow table teardown as the packet might get bumped to the
   slow path in nftables.

This patch resets IPS_OFFLOAD_BIT from flow_offload_teardown(), so
conntrack handles the tcp rst/fin packet which triggers the CLOSE/FIN
state transition.

Moreover, teturn the connection's ownership to conntrack upon teardown
by clearing the offload flag and fixing the established timeout value.
The flow table GC thread will asynchonrnously free the flow table and
hardware offload entries.

Before this patch, the IPS_OFFLOAD_BIT remained set for expired flows on
which is also misleading since the flow is back to classic conntrack
path.

If nf_ct_delete() removes the entry from the conntrack table, then it
calls nf_ct_put() which decrements the refcnt. This is not a problem
because the flowtable holds a reference to the conntrack object from
flow_offload_alloc() path which is released via flow_offload_free().

This patch also updates nft_flow_offload to skip packets in SYN_RECV
state. Since we might miss or bump packets to slow path, we do not know
what will happen there while we are still in SYN_RECV, this patch
postpones offload up to the next packet which also aligns to the
existing behaviour in tc-ct.

flow_offload_teardown() does not reset the existing tcp state from
flow_offload_fixup_tcp() to ESTABLISHED anymore, packets bump to slow
path might have already update the state to CLOSE/FIN.

Joint work with Oz and Sven.

Fixes: 1e5b2471bcc4 ("netfilter: nf_flow_table: teardown flow timeout race")
Signed-off-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nf_flow_table_core.c | 33 +++++++-----------------------
 net/netfilter/nft_flow_offload.c   |  3 ++-
 2 files changed, 9 insertions(+), 27 deletions(-)

diff --git a/net/netfilter/nf_flow_table_core.c b/net/netfilter/nf_flow_table_core.c
index 20b4a14e5d4e..ebdf5332e838 100644
--- a/net/netfilter/nf_flow_table_core.c
+++ b/net/netfilter/nf_flow_table_core.c
@@ -179,12 +179,11 @@ EXPORT_SYMBOL_GPL(flow_offload_route_init);
 
 static void flow_offload_fixup_tcp(struct ip_ct_tcp *tcp)
 {
-	tcp->state = TCP_CONNTRACK_ESTABLISHED;
 	tcp->seen[0].td_maxwin = 0;
 	tcp->seen[1].td_maxwin = 0;
 }
 
-static void flow_offload_fixup_ct_timeout(struct nf_conn *ct)
+static void flow_offload_fixup_ct(struct nf_conn *ct)
 {
 	struct net *net = nf_ct_net(ct);
 	int l4num = nf_ct_protonum(ct);
@@ -193,7 +192,9 @@ static void flow_offload_fixup_ct_timeout(struct nf_conn *ct)
 	if (l4num == IPPROTO_TCP) {
 		struct nf_tcp_net *tn = nf_tcp_pernet(net);
 
-		timeout = tn->timeouts[TCP_CONNTRACK_ESTABLISHED];
+		flow_offload_fixup_tcp(&ct->proto.tcp);
+
+		timeout = tn->timeouts[ct->proto.tcp.state];
 		timeout -= tn->offload_timeout;
 	} else if (l4num == IPPROTO_UDP) {
 		struct nf_udp_net *tn = nf_udp_pernet(net);
@@ -211,18 +212,6 @@ static void flow_offload_fixup_ct_timeout(struct nf_conn *ct)
 		WRITE_ONCE(ct->timeout, nfct_time_stamp + timeout);
 }
 
-static void flow_offload_fixup_ct_state(struct nf_conn *ct)
-{
-	if (nf_ct_protonum(ct) == IPPROTO_TCP)
-		flow_offload_fixup_tcp(&ct->proto.tcp);
-}
-
-static void flow_offload_fixup_ct(struct nf_conn *ct)
-{
-	flow_offload_fixup_ct_state(ct);
-	flow_offload_fixup_ct_timeout(ct);
-}
-
 static void flow_offload_route_release(struct flow_offload *flow)
 {
 	nft_flow_dst_release(flow, FLOW_OFFLOAD_DIR_ORIGINAL);
@@ -361,22 +350,14 @@ static void flow_offload_del(struct nf_flowtable *flow_table,
 	rhashtable_remove_fast(&flow_table->rhashtable,
 			       &flow->tuplehash[FLOW_OFFLOAD_DIR_REPLY].node,
 			       nf_flow_offload_rhash_params);
-
-	clear_bit(IPS_OFFLOAD_BIT, &flow->ct->status);
-
-	if (nf_flow_has_expired(flow))
-		flow_offload_fixup_ct(flow->ct);
-	else
-		flow_offload_fixup_ct_timeout(flow->ct);
-
 	flow_offload_free(flow);
 }
 
 void flow_offload_teardown(struct flow_offload *flow)
 {
+	clear_bit(IPS_OFFLOAD_BIT, &flow->ct->status);
 	set_bit(NF_FLOW_TEARDOWN, &flow->flags);
-
-	flow_offload_fixup_ct_state(flow->ct);
+	flow_offload_fixup_ct(flow->ct);
 }
 EXPORT_SYMBOL_GPL(flow_offload_teardown);
 
@@ -466,7 +447,7 @@ static void nf_flow_offload_gc_step(struct nf_flowtable *flow_table,
 	if (nf_flow_has_expired(flow) ||
 	    nf_ct_is_dying(flow->ct) ||
 	    nf_flow_has_stale_dst(flow))
-		set_bit(NF_FLOW_TEARDOWN, &flow->flags);
+		flow_offload_teardown(flow);
 
 	if (test_bit(NF_FLOW_TEARDOWN, &flow->flags)) {
 		if (test_bit(NF_FLOW_HW, &flow->flags)) {
diff --git a/net/netfilter/nft_flow_offload.c b/net/netfilter/nft_flow_offload.c
index 187b8cb9a510..6f0b07fe648d 100644
--- a/net/netfilter/nft_flow_offload.c
+++ b/net/netfilter/nft_flow_offload.c
@@ -298,7 +298,8 @@ static void nft_flow_offload_eval(const struct nft_expr *expr,
 	case IPPROTO_TCP:
 		tcph = skb_header_pointer(pkt->skb, nft_thoff(pkt),
 					  sizeof(_tcph), &_tcph);
-		if (unlikely(!tcph || tcph->fin || tcph->rst))
+		if (unlikely(!tcph || tcph->fin || tcph->rst ||
+			     !nf_conntrack_tcp_established(ct)))
 			goto out;
 		break;
 	case IPPROTO_UDP:
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH net 6/7] netfilter: flowtable: move dst_check to packet path
  2022-05-18 21:38 [PATCH net 0/7] Netfilter fixes for net Pablo Neira Ayuso
                   ` (4 preceding siblings ...)
  2022-05-18 21:38 ` [PATCH net 5/7] netfilter: flowtable: fix TCP flow teardown Pablo Neira Ayuso
@ 2022-05-18 21:38 ` Pablo Neira Ayuso
  2022-05-18 21:38 ` [PATCH net 7/7] netfilter: nf_tables: disable expression reduction infra Pablo Neira Ayuso
  6 siblings, 0 replies; 9+ messages in thread
From: Pablo Neira Ayuso @ 2022-05-18 21:38 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni

From: Ritaro Takenaka <ritarot634@gmail.com>

Fixes sporadic IPv6 packet loss when flow offloading is enabled.

IPv6 route GC and flowtable GC are not synchronized.
When dst_cache becomes stale and a packet passes through the flow before
the flowtable GC teardowns it, the packet can be dropped.
So, it is necessary to check dst every time in packet path.

Fixes: 227e1e4d0d6c ("netfilter: nf_flowtable: skip device lookup from interface index")
Signed-off-by: Ritaro Takenaka <ritarot634@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nf_flow_table_core.c | 23 +----------------------
 net/netfilter/nf_flow_table_ip.c   | 19 +++++++++++++++++++
 2 files changed, 20 insertions(+), 22 deletions(-)

diff --git a/net/netfilter/nf_flow_table_core.c b/net/netfilter/nf_flow_table_core.c
index ebdf5332e838..f2def06d1070 100644
--- a/net/netfilter/nf_flow_table_core.c
+++ b/net/netfilter/nf_flow_table_core.c
@@ -421,32 +421,11 @@ nf_flow_table_iterate(struct nf_flowtable *flow_table,
 	return err;
 }
 
-static bool flow_offload_stale_dst(struct flow_offload_tuple *tuple)
-{
-	struct dst_entry *dst;
-
-	if (tuple->xmit_type == FLOW_OFFLOAD_XMIT_NEIGH ||
-	    tuple->xmit_type == FLOW_OFFLOAD_XMIT_XFRM) {
-		dst = tuple->dst_cache;
-		if (!dst_check(dst, tuple->dst_cookie))
-			return true;
-	}
-
-	return false;
-}
-
-static bool nf_flow_has_stale_dst(struct flow_offload *flow)
-{
-	return flow_offload_stale_dst(&flow->tuplehash[FLOW_OFFLOAD_DIR_ORIGINAL].tuple) ||
-	       flow_offload_stale_dst(&flow->tuplehash[FLOW_OFFLOAD_DIR_REPLY].tuple);
-}
-
 static void nf_flow_offload_gc_step(struct nf_flowtable *flow_table,
 				    struct flow_offload *flow, void *data)
 {
 	if (nf_flow_has_expired(flow) ||
-	    nf_ct_is_dying(flow->ct) ||
-	    nf_flow_has_stale_dst(flow))
+	    nf_ct_is_dying(flow->ct))
 		flow_offload_teardown(flow);
 
 	if (test_bit(NF_FLOW_TEARDOWN, &flow->flags)) {
diff --git a/net/netfilter/nf_flow_table_ip.c b/net/netfilter/nf_flow_table_ip.c
index 32c0eb1b4821..b350fe9d00b0 100644
--- a/net/netfilter/nf_flow_table_ip.c
+++ b/net/netfilter/nf_flow_table_ip.c
@@ -248,6 +248,15 @@ static bool nf_flow_exceeds_mtu(const struct sk_buff *skb, unsigned int mtu)
 	return true;
 }
 
+static inline bool nf_flow_dst_check(struct flow_offload_tuple *tuple)
+{
+	if (tuple->xmit_type != FLOW_OFFLOAD_XMIT_NEIGH &&
+	    tuple->xmit_type != FLOW_OFFLOAD_XMIT_XFRM)
+		return true;
+
+	return dst_check(tuple->dst_cache, tuple->dst_cookie);
+}
+
 static unsigned int nf_flow_xmit_xfrm(struct sk_buff *skb,
 				      const struct nf_hook_state *state,
 				      struct dst_entry *dst)
@@ -367,6 +376,11 @@ nf_flow_offload_ip_hook(void *priv, struct sk_buff *skb,
 	if (nf_flow_state_check(flow, iph->protocol, skb, thoff))
 		return NF_ACCEPT;
 
+	if (!nf_flow_dst_check(&tuplehash->tuple)) {
+		flow_offload_teardown(flow);
+		return NF_ACCEPT;
+	}
+
 	if (skb_try_make_writable(skb, thoff + hdrsize))
 		return NF_DROP;
 
@@ -624,6 +638,11 @@ nf_flow_offload_ipv6_hook(void *priv, struct sk_buff *skb,
 	if (nf_flow_state_check(flow, ip6h->nexthdr, skb, thoff))
 		return NF_ACCEPT;
 
+	if (!nf_flow_dst_check(&tuplehash->tuple)) {
+		flow_offload_teardown(flow);
+		return NF_ACCEPT;
+	}
+
 	if (skb_try_make_writable(skb, thoff + hdrsize))
 		return NF_DROP;
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH net 7/7] netfilter: nf_tables: disable expression reduction infra
  2022-05-18 21:38 [PATCH net 0/7] Netfilter fixes for net Pablo Neira Ayuso
                   ` (5 preceding siblings ...)
  2022-05-18 21:38 ` [PATCH net 6/7] netfilter: flowtable: move dst_check to packet path Pablo Neira Ayuso
@ 2022-05-18 21:38 ` Pablo Neira Ayuso
  6 siblings, 0 replies; 9+ messages in thread
From: Pablo Neira Ayuso @ 2022-05-18 21:38 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni

Either userspace or kernelspace need to pre-fetch keys inconditionally
before comparisons for this to work. Otherwise, register tracking data
is misleading and it might result in reducing expressions which are not
yet registers.

First expression is also guaranteed to be evaluated always, however,
certain expressions break before writing data to registers, before
comparing the data, leaving the register in undetermined state.

This patch disables this infrastructure by now.

Fixes: b2d306542ff9 ("netfilter: nf_tables: do not reduce read-only expressions")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nf_tables_api.c | 11 +----------
 1 file changed, 1 insertion(+), 10 deletions(-)

diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 16c3a39689f4..a096b9fbbbdf 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -8342,16 +8342,7 @@ EXPORT_SYMBOL_GPL(nf_tables_trans_destroy_flush_work);
 static bool nft_expr_reduce(struct nft_regs_track *track,
 			    const struct nft_expr *expr)
 {
-	if (!expr->ops->reduce) {
-		pr_warn_once("missing reduce for expression %s ",
-			     expr->ops->type->name);
-		return false;
-	}
-
-	if (nft_reduce_is_readonly(expr))
-		return false;
-
-	return expr->ops->reduce(track, expr);
+	return false;
 }
 
 static int nf_tables_commit_chain_prepare(struct net *net, struct nft_chain *chain)
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH net 1/7] netfilter: flowtable: fix excessive hw offload attempts after failure
  2022-05-18 21:38 ` [PATCH net 1/7] netfilter: flowtable: fix excessive hw offload attempts after failure Pablo Neira Ayuso
@ 2022-05-19  4:40   ` patchwork-bot+netdevbpf
  0 siblings, 0 replies; 9+ messages in thread
From: patchwork-bot+netdevbpf @ 2022-05-19  4:40 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter-devel, davem, netdev, kuba, pabeni

Hello:

This series was applied to netdev/net.git (master)
by Pablo Neira Ayuso <pablo@netfilter.org>:

On Wed, 18 May 2022 23:38:35 +0200 you wrote:
> From: Felix Fietkau <nbd@nbd.name>
> 
> If a flow cannot be offloaded, the code currently repeatedly tries again as
> quickly as possible, which can significantly increase system load.
> Fix this by limiting flow timeout update and hardware offload retry to once
> per second.
> 
> [...]

Here is the summary with links:
  - [net,1/7] netfilter: flowtable: fix excessive hw offload attempts after failure
    https://git.kernel.org/netdev/net/c/396ef64113a8
  - [net,2/7] netfilter: nft_flow_offload: skip dst neigh lookup for ppp devices
    https://git.kernel.org/netdev/net/c/45ca3e61999e
  - [net,3/7] net: fix dev_fill_forward_path with pppoe + bridge
    https://git.kernel.org/netdev/net/c/cf2df74e202d
  - [net,4/7] netfilter: nft_flow_offload: fix offload with pppoe + vlan
    https://git.kernel.org/netdev/net/c/245607493500
  - [net,5/7] netfilter: flowtable: fix TCP flow teardown
    https://git.kernel.org/netdev/net/c/e5eaac2beb54
  - [net,6/7] netfilter: flowtable: move dst_check to packet path
    https://git.kernel.org/netdev/net/c/2738d9d963bd
  - [net,7/7] netfilter: nf_tables: disable expression reduction infra
    https://git.kernel.org/netdev/net/c/9e539c5b6d9c

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2022-05-19  4:40 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-05-18 21:38 [PATCH net 0/7] Netfilter fixes for net Pablo Neira Ayuso
2022-05-18 21:38 ` [PATCH net 1/7] netfilter: flowtable: fix excessive hw offload attempts after failure Pablo Neira Ayuso
2022-05-19  4:40   ` patchwork-bot+netdevbpf
2022-05-18 21:38 ` [PATCH net 2/7] netfilter: nft_flow_offload: skip dst neigh lookup for ppp devices Pablo Neira Ayuso
2022-05-18 21:38 ` [PATCH net 3/7] net: fix dev_fill_forward_path with pppoe + bridge Pablo Neira Ayuso
2022-05-18 21:38 ` [PATCH net 4/7] netfilter: nft_flow_offload: fix offload with pppoe + vlan Pablo Neira Ayuso
2022-05-18 21:38 ` [PATCH net 5/7] netfilter: flowtable: fix TCP flow teardown Pablo Neira Ayuso
2022-05-18 21:38 ` [PATCH net 6/7] netfilter: flowtable: move dst_check to packet path Pablo Neira Ayuso
2022-05-18 21:38 ` [PATCH net 7/7] netfilter: nf_tables: disable expression reduction infra Pablo Neira Ayuso

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).