netfilter-devel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net 00/14] Netfilter fixes for net
@ 2022-08-24 22:03 Pablo Neira Ayuso
  2022-08-24 22:03 ` [PATCH net 01/14] netfilter: ebtables: reject blobs that don't provide all entry points Pablo Neira Ayuso
                   ` (13 more replies)
  0 siblings, 14 replies; 17+ messages in thread
From: Pablo Neira Ayuso @ 2022-08-24 22:03 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet

Hi,

The following patchset contains Netfilter fixes for net. All fixes
included in this batch address problems appearing in several releases:

1) Fix crash with malformed ebtables blob which do not provide all
   entry points, from Florian Westphal.

2) Fix possible TCP connection clogging up with default 5-days
   timeout in conntrack, from Florian.

3) Fix crash in nf_tables tproxy with unsupported chains, also from Florian.

4) Do not allow to update implicit chains.

5) Make table handle allocation per-netns to fix data race.

6) Do not truncated payload length and offset, and checksum offset.
   Instead report EINVAl.

7) Enable chain stats update via static key iff no error occurs.

8) Restrict osf expression to ip, ip6 and inet families.

9) Restrict tunnel expression to netdev family.

10) Fix crash when trying to bind again an already bound chain.

11) Flowtable garbage collector might leave behind pending work to
    delete entries. This patch comes with a previous preparation patch
    as dependency.

12) Allow net.netfilter.nf_conntrack_frag6_high_thresh to be lowered,
    from Eric Dumazet.

Please, pull these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf.git

Thanks.

----------------------------------------------------------------

The following changes since commit 855a28f9c96c80e6cbd2d986a857235e34868064:

  net: dsa: don't dereference NULL extack in dsa_slave_changeupper() (2022-08-23 07:54:16 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf.git HEAD

for you to fetch changes up to 00cd7bf9f9e06769ef84d5102774c8becd6a498a:

  netfilter: nf_defrag_ipv6: allow nf_conntrack_frag6_high_thresh increases (2022-08-24 08:06:44 +0200)

----------------------------------------------------------------
Eric Dumazet (1):
      netfilter: nf_defrag_ipv6: allow nf_conntrack_frag6_high_thresh increases

Florian Westphal (3):
      netfilter: ebtables: reject blobs that don't provide all entry points
      netfilter: conntrack: work around exceeded receive window
      netfilter: nft_tproxy: restrict to prerouting hook

Pablo Neira Ayuso (10):
      netfilter: nf_tables: disallow updates of implicit chain
      netfilter: nf_tables: make table handle allocation per-netns friendly
      netfilter: nft_payload: report ERANGE for too long offset and length
      netfilter: nft_payload: do not truncate csum_offset and csum_type
      netfilter: nf_tables: do not leave chain stats enabled on error
      netfilter: nft_osf: restrict osf to ipv4, ipv6 and inet families
      netfilter: nft_tunnel: restrict it to netdev family
      netfilter: nf_tables: disallow binding to already bound chain
      netfilter: flowtable: add function to invoke garbage collection immediately
      netfilter: flowtable: fix stuck flows on cleanup due to pending work

 include/linux/netfilter_bridge/ebtables.h |  4 ----
 include/net/netfilter/nf_flow_table.h     |  3 +++
 include/net/netfilter/nf_tables.h         |  1 +
 net/bridge/netfilter/ebtable_broute.c     |  8 --------
 net/bridge/netfilter/ebtable_filter.c     |  8 --------
 net/bridge/netfilter/ebtable_nat.c        |  8 --------
 net/bridge/netfilter/ebtables.c           |  8 +-------
 net/ipv6/netfilter/nf_conntrack_reasm.c   |  1 -
 net/netfilter/nf_conntrack_proto_tcp.c    | 31 +++++++++++++++++++++++++++++++
 net/netfilter/nf_flow_table_core.c        | 15 ++++++++++-----
 net/netfilter/nf_flow_table_offload.c     |  8 ++++++++
 net/netfilter/nf_tables_api.c             | 14 ++++++++++----
 net/netfilter/nft_osf.c                   | 18 +++++++++++++++---
 net/netfilter/nft_payload.c               | 29 +++++++++++++++++++++--------
 net/netfilter/nft_tproxy.c                |  8 ++++++++
 net/netfilter/nft_tunnel.c                |  1 +
 16 files changed, 109 insertions(+), 56 deletions(-)

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH net 01/14] netfilter: ebtables: reject blobs that don't provide all entry points
  2022-08-24 22:03 [PATCH net 00/14] Netfilter fixes for net Pablo Neira Ayuso
@ 2022-08-24 22:03 ` Pablo Neira Ayuso
  2022-08-25  2:40   ` patchwork-bot+netdevbpf
  2022-08-24 22:03 ` [PATCH net 02/14] netfilter: conntrack: work around exceeded receive window Pablo Neira Ayuso
                   ` (12 subsequent siblings)
  13 siblings, 1 reply; 17+ messages in thread
From: Pablo Neira Ayuso @ 2022-08-24 22:03 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet

From: Florian Westphal <fw@strlen.de>

Harshit Mogalapalli says:
 In ebt_do_table() function dereferencing 'private->hook_entry[hook]'
 can lead to NULL pointer dereference. [..] Kernel panic:

general protection fault, probably for non-canonical address 0xdffffc0000000005: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000028-0x000000000000002f]
[..]
RIP: 0010:ebt_do_table+0x1dc/0x1ce0
Code: 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 5c 16 00 00 48 b8 00 00 00 00 00 fc ff df 49 8b 6c df 08 48 8d 7d 2c 48 89 fa 48 c1 ea 03 <0f> b6 14 02 48 89 f8 83 e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 88
[..]
Call Trace:
 nf_hook_slow+0xb1/0x170
 __br_forward+0x289/0x730
 maybe_deliver+0x24b/0x380
 br_flood+0xc6/0x390
 br_dev_xmit+0xa2e/0x12c0

For some reason ebtables rejects blobs that provide entry points that are
not supported by the table, but what it should instead reject is the
opposite: blobs that DO NOT provide an entry point supported by the table.

t->valid_hooks is the bitmask of hooks (input, forward ...) that will see
packets.  Providing an entry point that is not support is harmless
(never called/used), but the inverse isn't: it results in a crash
because the ebtables traverser doesn't expect a NULL blob for a location
its receiving packets for.

Instead of fixing all the individual checks, do what iptables is doing and
reject all blobs that differ from the expected hooks.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
Reported-by: syzkaller <syzkaller@googlegroups.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
 include/linux/netfilter_bridge/ebtables.h | 4 ----
 net/bridge/netfilter/ebtable_broute.c     | 8 --------
 net/bridge/netfilter/ebtable_filter.c     | 8 --------
 net/bridge/netfilter/ebtable_nat.c        | 8 --------
 net/bridge/netfilter/ebtables.c           | 8 +-------
 5 files changed, 1 insertion(+), 35 deletions(-)

diff --git a/include/linux/netfilter_bridge/ebtables.h b/include/linux/netfilter_bridge/ebtables.h
index a13296d6c7ce..fd533552a062 100644
--- a/include/linux/netfilter_bridge/ebtables.h
+++ b/include/linux/netfilter_bridge/ebtables.h
@@ -94,10 +94,6 @@ struct ebt_table {
 	struct ebt_replace_kernel *table;
 	unsigned int valid_hooks;
 	rwlock_t lock;
-	/* e.g. could be the table explicitly only allows certain
-	 * matches, targets, ... 0 == let it in */
-	int (*check)(const struct ebt_table_info *info,
-	   unsigned int valid_hooks);
 	/* the data used by the kernel */
 	struct ebt_table_info *private;
 	struct nf_hook_ops *ops;
diff --git a/net/bridge/netfilter/ebtable_broute.c b/net/bridge/netfilter/ebtable_broute.c
index 1a11064f9990..8f19253024b0 100644
--- a/net/bridge/netfilter/ebtable_broute.c
+++ b/net/bridge/netfilter/ebtable_broute.c
@@ -36,18 +36,10 @@ static struct ebt_replace_kernel initial_table = {
 	.entries	= (char *)&initial_chain,
 };
 
-static int check(const struct ebt_table_info *info, unsigned int valid_hooks)
-{
-	if (valid_hooks & ~(1 << NF_BR_BROUTING))
-		return -EINVAL;
-	return 0;
-}
-
 static const struct ebt_table broute_table = {
 	.name		= "broute",
 	.table		= &initial_table,
 	.valid_hooks	= 1 << NF_BR_BROUTING,
-	.check		= check,
 	.me		= THIS_MODULE,
 };
 
diff --git a/net/bridge/netfilter/ebtable_filter.c b/net/bridge/netfilter/ebtable_filter.c
index cb949436bc0e..278f324e6752 100644
--- a/net/bridge/netfilter/ebtable_filter.c
+++ b/net/bridge/netfilter/ebtable_filter.c
@@ -43,18 +43,10 @@ static struct ebt_replace_kernel initial_table = {
 	.entries	= (char *)initial_chains,
 };
 
-static int check(const struct ebt_table_info *info, unsigned int valid_hooks)
-{
-	if (valid_hooks & ~FILTER_VALID_HOOKS)
-		return -EINVAL;
-	return 0;
-}
-
 static const struct ebt_table frame_filter = {
 	.name		= "filter",
 	.table		= &initial_table,
 	.valid_hooks	= FILTER_VALID_HOOKS,
-	.check		= check,
 	.me		= THIS_MODULE,
 };
 
diff --git a/net/bridge/netfilter/ebtable_nat.c b/net/bridge/netfilter/ebtable_nat.c
index 5ee0531ae506..9066f7f376d5 100644
--- a/net/bridge/netfilter/ebtable_nat.c
+++ b/net/bridge/netfilter/ebtable_nat.c
@@ -43,18 +43,10 @@ static struct ebt_replace_kernel initial_table = {
 	.entries	= (char *)initial_chains,
 };
 
-static int check(const struct ebt_table_info *info, unsigned int valid_hooks)
-{
-	if (valid_hooks & ~NAT_VALID_HOOKS)
-		return -EINVAL;
-	return 0;
-}
-
 static const struct ebt_table frame_nat = {
 	.name		= "nat",
 	.table		= &initial_table,
 	.valid_hooks	= NAT_VALID_HOOKS,
-	.check		= check,
 	.me		= THIS_MODULE,
 };
 
diff --git a/net/bridge/netfilter/ebtables.c b/net/bridge/netfilter/ebtables.c
index f2dbefb61ce8..9a0ae59cdc50 100644
--- a/net/bridge/netfilter/ebtables.c
+++ b/net/bridge/netfilter/ebtables.c
@@ -1040,8 +1040,7 @@ static int do_replace_finish(struct net *net, struct ebt_replace *repl,
 		goto free_iterate;
 	}
 
-	/* the table doesn't like it */
-	if (t->check && (ret = t->check(newinfo, repl->valid_hooks)))
+	if (repl->valid_hooks != t->valid_hooks)
 		goto free_unlock;
 
 	if (repl->num_counters && repl->num_counters != t->private->nentries) {
@@ -1231,11 +1230,6 @@ int ebt_register_table(struct net *net, const struct ebt_table *input_table,
 	if (ret != 0)
 		goto free_chainstack;
 
-	if (table->check && table->check(newinfo, table->valid_hooks)) {
-		ret = -EINVAL;
-		goto free_chainstack;
-	}
-
 	table->private = newinfo;
 	rwlock_init(&table->lock);
 	mutex_lock(&ebt_mutex);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH net 02/14] netfilter: conntrack: work around exceeded receive window
  2022-08-24 22:03 [PATCH net 00/14] Netfilter fixes for net Pablo Neira Ayuso
  2022-08-24 22:03 ` [PATCH net 01/14] netfilter: ebtables: reject blobs that don't provide all entry points Pablo Neira Ayuso
@ 2022-08-24 22:03 ` Pablo Neira Ayuso
  2022-08-24 22:03 ` [PATCH net 03/14] netfilter: nft_tproxy: restrict to prerouting hook Pablo Neira Ayuso
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Pablo Neira Ayuso @ 2022-08-24 22:03 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet

From: Florian Westphal <fw@strlen.de>

When a TCP sends more bytes than allowed by the receive window, all future
packets can be marked as invalid.
This can clog up the conntrack table because of 5-day default timeout.

Sequence of packets:
 01 initiator > responder: [S], seq 171, win 5840, options [mss 1330,sackOK,TS val 63 ecr 0,nop,wscale 1]
 02 responder > initiator: [S.], seq 33211, ack 172, win 65535, options [mss 1460,sackOK,TS val 010 ecr 63,nop,wscale 8]
 03 initiator > responder: [.], ack 33212, win 2920, options [nop,nop,TS val 068 ecr 010], length 0
 04 initiator > responder: [P.], seq 172:240, ack 33212, win 2920, options [nop,nop,TS val 279 ecr 010], length 68

Window is 5840 starting from 33212 -> 39052.

 05 responder > initiator: [.], ack 240, win 256, options [nop,nop,TS val 872 ecr 279], length 0
 06 responder > initiator: [.], seq 33212:34530, ack 240, win 256, options [nop,nop,TS val 892 ecr 279], length 1318

This is fine, conntrack will flag the connection as having outstanding
data (UNACKED), which lowers the conntrack timeout to 300s.

 07 responder > initiator: [.], seq 34530:35848, ack 240, win 256, options [nop,nop,TS val 892 ecr 279], length 1318
 08 responder > initiator: [.], seq 35848:37166, ack 240, win 256, options [nop,nop,TS val 892 ecr 279], length 1318
 09 responder > initiator: [.], seq 37166:38484, ack 240, win 256, options [nop,nop,TS val 892 ecr 279], length 1318
 10 responder > initiator: [.], seq 38484:39802, ack 240, win 256, options [nop,nop,TS val 892 ecr 279], length 1318

Packet 10 is already sending more than permitted, but conntrack doesn't
validate this (only seq is tested vs. maxend, not 'seq+len').

38484 is acceptable, but only up to 39052, so this packet should
not have been sent (or only 568 bytes, not 1318).

At this point, connection is still in '300s' mode.

Next packet however will get flagged:
 11 responder > initiator: [P.], seq 39802:40128, ack 240, win 256, options [nop,nop,TS val 892 ecr 279], length 326

nf_ct_proto_6: SEQ is over the upper bound (over the window of the receiver) .. LEN=378 .. SEQ=39802 ACK=240 ACK PSH ..

Now, a couple of replies/acks comes in:

 12 initiator > responder: [.], ack 34530, win 4368,
[.. irrelevant acks removed ]
 16 initiator > responder: [.], ack 39802, win 8712, options [nop,nop,TS val 296201291 ecr 2982371892], length 0

This ack is significant -- this acks the last packet send by the
responder that conntrack considered valid.

This means that ack == td_end.  This will withdraw the
'unacked data' flag, the connection moves back to the 5-day timeout
of established conntracks.

 17 initiator > responder: ack 40128, win 10030, ...

This packet is also flagged as invalid.

Because conntrack only updates state based on packets that are
considered valid, packet 11 'did not exist' and that gets us:

nf_ct_proto_6: ACK is over upper bound 39803 (ACKed data not seen yet) .. SEQ=240 ACK=40128 WINDOW=10030 RES=0x00 ACK URG

Because this received and processed by the endpoints, the conntrack entry
remains in a bad state, no packets will ever be considered valid again:

 30 responder > initiator: [F.], seq 40432, ack 2045, win 391, ..
 31 initiator > responder: [.], ack 40433, win 11348, ..
 32 initiator > responder: [F.], seq 2045, ack 40433, win 11348 ..

... all trigger 'ACK is over bound' test and we end up with
non-early-evictable 5-day default timeout.

NB: This patch triggers a bunch of checkpatch warnings because of silly
indent.  I will resend the cleanup series linked below to reduce the
indent level once this change has propagated to net-next.

I could route the cleanup via nf but that causes extra backport work for
stable maintainers.

Link: https://lore.kernel.org/netfilter-devel/20220720175228.17880-1-fw@strlen.de/T/#mb1d7147d36294573cc4f81d00f9f8dadfdd06cd8
Signed-off-by: Florian Westphal <fw@strlen.de>
---
 net/netfilter/nf_conntrack_proto_tcp.c | 31 ++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c
index a63b51dceaf2..a634c72b1ffc 100644
--- a/net/netfilter/nf_conntrack_proto_tcp.c
+++ b/net/netfilter/nf_conntrack_proto_tcp.c
@@ -655,6 +655,37 @@ static bool tcp_in_window(struct nf_conn *ct,
 		    tn->tcp_be_liberal)
 			res = true;
 		if (!res) {
+			bool seq_ok = before(seq, sender->td_maxend + 1);
+
+			if (!seq_ok) {
+				u32 overshot = end - sender->td_maxend + 1;
+				bool ack_ok;
+
+				ack_ok = after(sack, receiver->td_end - MAXACKWINDOW(sender) - 1);
+
+				if (in_recv_win &&
+				    ack_ok &&
+				    overshot <= receiver->td_maxwin &&
+				    before(sack, receiver->td_end + 1)) {
+					/* Work around TCPs that send more bytes than allowed by
+					 * the receive window.
+					 *
+					 * If the (marked as invalid) packet is allowed to pass by
+					 * the ruleset and the peer acks this data, then its possible
+					 * all future packets will trigger 'ACK is over upper bound' check.
+					 *
+					 * Thus if only the sequence check fails then do update td_end so
+					 * possible ACK for this data can update internal state.
+					 */
+					sender->td_end = end;
+					sender->flags |= IP_CT_TCP_FLAG_DATA_UNACKNOWLEDGED;
+
+					nf_ct_l4proto_log_invalid(skb, ct, hook_state,
+								  "%u bytes more than expected", overshot);
+					return res;
+				}
+			}
+
 			nf_ct_l4proto_log_invalid(skb, ct, hook_state,
 			"%s",
 			before(seq, sender->td_maxend + 1) ?
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH net 03/14] netfilter: nft_tproxy: restrict to prerouting hook
  2022-08-24 22:03 [PATCH net 00/14] Netfilter fixes for net Pablo Neira Ayuso
  2022-08-24 22:03 ` [PATCH net 01/14] netfilter: ebtables: reject blobs that don't provide all entry points Pablo Neira Ayuso
  2022-08-24 22:03 ` [PATCH net 02/14] netfilter: conntrack: work around exceeded receive window Pablo Neira Ayuso
@ 2022-08-24 22:03 ` Pablo Neira Ayuso
  2022-08-24 22:03 ` [PATCH net 04/14] netfilter: nf_tables: disallow updates of implicit chain Pablo Neira Ayuso
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Pablo Neira Ayuso @ 2022-08-24 22:03 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet

From: Florian Westphal <fw@strlen.de>

TPROXY is only allowed from prerouting, but nft_tproxy doesn't check this.
This fixes a crash (null dereference) when using tproxy from e.g. output.

Fixes: 4ed8eb6570a4 ("netfilter: nf_tables: Add native tproxy support")
Reported-by: Shell Chen <xierch@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
 net/netfilter/nft_tproxy.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/net/netfilter/nft_tproxy.c b/net/netfilter/nft_tproxy.c
index 68b2eed742df..62da25ad264b 100644
--- a/net/netfilter/nft_tproxy.c
+++ b/net/netfilter/nft_tproxy.c
@@ -312,6 +312,13 @@ static int nft_tproxy_dump(struct sk_buff *skb,
 	return 0;
 }
 
+static int nft_tproxy_validate(const struct nft_ctx *ctx,
+			       const struct nft_expr *expr,
+			       const struct nft_data **data)
+{
+	return nft_chain_validate_hooks(ctx->chain, 1 << NF_INET_PRE_ROUTING);
+}
+
 static struct nft_expr_type nft_tproxy_type;
 static const struct nft_expr_ops nft_tproxy_ops = {
 	.type		= &nft_tproxy_type,
@@ -321,6 +328,7 @@ static const struct nft_expr_ops nft_tproxy_ops = {
 	.destroy	= nft_tproxy_destroy,
 	.dump		= nft_tproxy_dump,
 	.reduce		= NFT_REDUCE_READONLY,
+	.validate	= nft_tproxy_validate,
 };
 
 static struct nft_expr_type nft_tproxy_type __read_mostly = {
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH net 04/14] netfilter: nf_tables: disallow updates of implicit chain
  2022-08-24 22:03 [PATCH net 00/14] Netfilter fixes for net Pablo Neira Ayuso
                   ` (2 preceding siblings ...)
  2022-08-24 22:03 ` [PATCH net 03/14] netfilter: nft_tproxy: restrict to prerouting hook Pablo Neira Ayuso
@ 2022-08-24 22:03 ` Pablo Neira Ayuso
  2022-08-24 22:03 ` [PATCH net 05/14] netfilter: nf_tables: make table handle allocation per-netns friendly Pablo Neira Ayuso
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Pablo Neira Ayuso @ 2022-08-24 22:03 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet

Updates on existing implicit chain make no sense, disallow this.

Fixes: d0e2c7de92c7 ("netfilter: nf_tables: add NFT_CHAIN_BINDING")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nf_tables_api.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 62cfb0e31c40..dff2b5851bbb 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -2574,6 +2574,9 @@ static int nf_tables_newchain(struct sk_buff *skb, const struct nfnl_info *info,
 	nft_ctx_init(&ctx, net, skb, info->nlh, family, table, chain, nla);
 
 	if (chain != NULL) {
+		if (chain->flags & NFT_CHAIN_BINDING)
+			return -EINVAL;
+
 		if (info->nlh->nlmsg_flags & NLM_F_EXCL) {
 			NL_SET_BAD_ATTR(extack, attr);
 			return -EEXIST;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH net 05/14] netfilter: nf_tables: make table handle allocation per-netns friendly
  2022-08-24 22:03 [PATCH net 00/14] Netfilter fixes for net Pablo Neira Ayuso
                   ` (3 preceding siblings ...)
  2022-08-24 22:03 ` [PATCH net 04/14] netfilter: nf_tables: disallow updates of implicit chain Pablo Neira Ayuso
@ 2022-08-24 22:03 ` Pablo Neira Ayuso
  2022-08-24 22:03 ` [PATCH net 06/14] netfilter: nft_payload: report ERANGE for too long offset and length Pablo Neira Ayuso
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Pablo Neira Ayuso @ 2022-08-24 22:03 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet

mutex is per-netns, move table_netns to the pernet area.

*read-write* to 0xffffffff883a01e8 of 8 bytes by task 6542 on cpu 0:
 nf_tables_newtable+0x6dc/0xc00 net/netfilter/nf_tables_api.c:1221
 nfnetlink_rcv_batch net/netfilter/nfnetlink.c:513 [inline]
 nfnetlink_rcv_skb_batch net/netfilter/nfnetlink.c:634 [inline]
 nfnetlink_rcv+0xa6a/0x13a0 net/netfilter/nfnetlink.c:652
 netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline]
 netlink_unicast+0x652/0x730 net/netlink/af_netlink.c:1345
 netlink_sendmsg+0x643/0x740 net/netlink/af_netlink.c:1921

Fixes: f102d66b335a ("netfilter: nf_tables: use dedicated mutex to guard transactions")
Reported-by: Abhishek Shah <abhishek.shah@columbia.edu>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/net/netfilter/nf_tables.h | 1 +
 net/netfilter/nf_tables_api.c     | 3 +--
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
index 99aae36c04b9..cdb7db9b0e25 100644
--- a/include/net/netfilter/nf_tables.h
+++ b/include/net/netfilter/nf_tables.h
@@ -1652,6 +1652,7 @@ struct nftables_pernet {
 	struct list_head	module_list;
 	struct list_head	notify_list;
 	struct mutex		commit_mutex;
+	u64			table_handle;
 	unsigned int		base_seq;
 	u8			validate_state;
 };
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index dff2b5851bbb..0951f31caabe 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -32,7 +32,6 @@ static LIST_HEAD(nf_tables_objects);
 static LIST_HEAD(nf_tables_flowtables);
 static LIST_HEAD(nf_tables_destroy_list);
 static DEFINE_SPINLOCK(nf_tables_destroy_list_lock);
-static u64 table_handle;
 
 enum {
 	NFT_VALIDATE_SKIP	= 0,
@@ -1235,7 +1234,7 @@ static int nf_tables_newtable(struct sk_buff *skb, const struct nfnl_info *info,
 	INIT_LIST_HEAD(&table->flowtables);
 	table->family = family;
 	table->flags = flags;
-	table->handle = ++table_handle;
+	table->handle = ++nft_net->table_handle;
 	if (table->flags & NFT_TABLE_F_OWNER)
 		table->nlpid = NETLINK_CB(skb).portid;
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH net 06/14] netfilter: nft_payload: report ERANGE for too long offset and length
  2022-08-24 22:03 [PATCH net 00/14] Netfilter fixes for net Pablo Neira Ayuso
                   ` (4 preceding siblings ...)
  2022-08-24 22:03 ` [PATCH net 05/14] netfilter: nf_tables: make table handle allocation per-netns friendly Pablo Neira Ayuso
@ 2022-08-24 22:03 ` Pablo Neira Ayuso
  2022-08-24 22:03 ` [PATCH net 07/14] netfilter: nft_payload: do not truncate csum_offset and csum_type Pablo Neira Ayuso
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Pablo Neira Ayuso @ 2022-08-24 22:03 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet

Instead of offset and length are truncation to u8, report ERANGE.

Fixes: 96518518cc41 ("netfilter: add nftables")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nft_payload.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/net/netfilter/nft_payload.c b/net/netfilter/nft_payload.c
index 2e7ac007cb30..4fee67abfe2c 100644
--- a/net/netfilter/nft_payload.c
+++ b/net/netfilter/nft_payload.c
@@ -833,6 +833,7 @@ nft_payload_select_ops(const struct nft_ctx *ctx,
 {
 	enum nft_payload_bases base;
 	unsigned int offset, len;
+	int err;
 
 	if (tb[NFTA_PAYLOAD_BASE] == NULL ||
 	    tb[NFTA_PAYLOAD_OFFSET] == NULL ||
@@ -859,8 +860,13 @@ nft_payload_select_ops(const struct nft_ctx *ctx,
 	if (tb[NFTA_PAYLOAD_DREG] == NULL)
 		return ERR_PTR(-EINVAL);
 
-	offset = ntohl(nla_get_be32(tb[NFTA_PAYLOAD_OFFSET]));
-	len    = ntohl(nla_get_be32(tb[NFTA_PAYLOAD_LEN]));
+	err = nft_parse_u32_check(tb[NFTA_PAYLOAD_OFFSET], U8_MAX, &offset);
+	if (err < 0)
+		return ERR_PTR(err);
+
+	err = nft_parse_u32_check(tb[NFTA_PAYLOAD_LEN], U8_MAX, &len);
+	if (err < 0)
+		return ERR_PTR(err);
 
 	if (len <= 4 && is_power_of_2(len) && IS_ALIGNED(offset, len) &&
 	    base != NFT_PAYLOAD_LL_HEADER && base != NFT_PAYLOAD_INNER_HEADER)
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH net 07/14] netfilter: nft_payload: do not truncate csum_offset and csum_type
  2022-08-24 22:03 [PATCH net 00/14] Netfilter fixes for net Pablo Neira Ayuso
                   ` (5 preceding siblings ...)
  2022-08-24 22:03 ` [PATCH net 06/14] netfilter: nft_payload: report ERANGE for too long offset and length Pablo Neira Ayuso
@ 2022-08-24 22:03 ` Pablo Neira Ayuso
  2022-08-24 22:03 ` [PATCH net 08/14] netfilter: nf_tables: do not leave chain stats enabled on error Pablo Neira Ayuso
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Pablo Neira Ayuso @ 2022-08-24 22:03 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet

Instead report ERANGE if csum_offset is too long, and EOPNOTSUPP if type
is not support.

Fixes: 7ec3f7b47b8d ("netfilter: nft_payload: add packet mangling support")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nft_payload.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/net/netfilter/nft_payload.c b/net/netfilter/nft_payload.c
index 4fee67abfe2c..eb0e40c29712 100644
--- a/net/netfilter/nft_payload.c
+++ b/net/netfilter/nft_payload.c
@@ -740,17 +740,23 @@ static int nft_payload_set_init(const struct nft_ctx *ctx,
 				const struct nlattr * const tb[])
 {
 	struct nft_payload_set *priv = nft_expr_priv(expr);
+	u32 csum_offset, csum_type = NFT_PAYLOAD_CSUM_NONE;
+	int err;
 
 	priv->base        = ntohl(nla_get_be32(tb[NFTA_PAYLOAD_BASE]));
 	priv->offset      = ntohl(nla_get_be32(tb[NFTA_PAYLOAD_OFFSET]));
 	priv->len         = ntohl(nla_get_be32(tb[NFTA_PAYLOAD_LEN]));
 
 	if (tb[NFTA_PAYLOAD_CSUM_TYPE])
-		priv->csum_type =
-			ntohl(nla_get_be32(tb[NFTA_PAYLOAD_CSUM_TYPE]));
-	if (tb[NFTA_PAYLOAD_CSUM_OFFSET])
-		priv->csum_offset =
-			ntohl(nla_get_be32(tb[NFTA_PAYLOAD_CSUM_OFFSET]));
+		csum_type = ntohl(nla_get_be32(tb[NFTA_PAYLOAD_CSUM_TYPE]));
+	if (tb[NFTA_PAYLOAD_CSUM_OFFSET]) {
+		err = nft_parse_u32_check(tb[NFTA_PAYLOAD_CSUM_OFFSET], U8_MAX,
+					  &csum_offset);
+		if (err < 0)
+			return err;
+
+		priv->csum_offset = csum_offset;
+	}
 	if (tb[NFTA_PAYLOAD_CSUM_FLAGS]) {
 		u32 flags;
 
@@ -761,7 +767,7 @@ static int nft_payload_set_init(const struct nft_ctx *ctx,
 		priv->csum_flags = flags;
 	}
 
-	switch (priv->csum_type) {
+	switch (csum_type) {
 	case NFT_PAYLOAD_CSUM_NONE:
 	case NFT_PAYLOAD_CSUM_INET:
 		break;
@@ -775,6 +781,7 @@ static int nft_payload_set_init(const struct nft_ctx *ctx,
 	default:
 		return -EOPNOTSUPP;
 	}
+	priv->csum_type = csum_type;
 
 	return nft_parse_register_load(tb[NFTA_PAYLOAD_SREG], &priv->sreg,
 				       priv->len);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH net 08/14] netfilter: nf_tables: do not leave chain stats enabled on error
  2022-08-24 22:03 [PATCH net 00/14] Netfilter fixes for net Pablo Neira Ayuso
                   ` (6 preceding siblings ...)
  2022-08-24 22:03 ` [PATCH net 07/14] netfilter: nft_payload: do not truncate csum_offset and csum_type Pablo Neira Ayuso
@ 2022-08-24 22:03 ` Pablo Neira Ayuso
  2022-08-24 22:03 ` [PATCH net 09/14] netfilter: nft_osf: restrict osf to ipv4, ipv6 and inet families Pablo Neira Ayuso
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Pablo Neira Ayuso @ 2022-08-24 22:03 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet

Error might occur later in the nf_tables_addchain() codepath, enable
static key only after transaction has been created.

Fixes: 9f08ea848117 ("netfilter: nf_tables: keep chain counters away from hot path")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nf_tables_api.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 0951f31caabe..72c066a33416 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -2195,9 +2195,9 @@ static int nf_tables_addchain(struct nft_ctx *ctx, u8 family, u8 genmask,
 			      struct netlink_ext_ack *extack)
 {
 	const struct nlattr * const *nla = ctx->nla;
+	struct nft_stats __percpu *stats = NULL;
 	struct nft_table *table = ctx->table;
 	struct nft_base_chain *basechain;
-	struct nft_stats __percpu *stats;
 	struct net *net = ctx->net;
 	char name[NFT_NAME_MAXLEN];
 	struct nft_rule_blob *blob;
@@ -2235,7 +2235,6 @@ static int nf_tables_addchain(struct nft_ctx *ctx, u8 family, u8 genmask,
 				return PTR_ERR(stats);
 			}
 			rcu_assign_pointer(basechain->stats, stats);
-			static_branch_inc(&nft_counters_enabled);
 		}
 
 		err = nft_basechain_init(basechain, family, &hook, flags);
@@ -2318,6 +2317,9 @@ static int nf_tables_addchain(struct nft_ctx *ctx, u8 family, u8 genmask,
 		goto err_unregister_hook;
 	}
 
+	if (stats)
+		static_branch_inc(&nft_counters_enabled);
+
 	table->use++;
 
 	return 0;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH net 09/14] netfilter: nft_osf: restrict osf to ipv4, ipv6 and inet families
  2022-08-24 22:03 [PATCH net 00/14] Netfilter fixes for net Pablo Neira Ayuso
                   ` (7 preceding siblings ...)
  2022-08-24 22:03 ` [PATCH net 08/14] netfilter: nf_tables: do not leave chain stats enabled on error Pablo Neira Ayuso
@ 2022-08-24 22:03 ` Pablo Neira Ayuso
  2022-08-24 22:03 ` [PATCH net 10/14] netfilter: nft_tunnel: restrict it to netdev family Pablo Neira Ayuso
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Pablo Neira Ayuso @ 2022-08-24 22:03 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet

As it was originally intended, restrict extension to supported families.

Fixes: b96af92d6eaf ("netfilter: nf_tables: implement Passive OS fingerprint module in nft_osf")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nft_osf.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/net/netfilter/nft_osf.c b/net/netfilter/nft_osf.c
index 0053a697c931..89342ccccdcc 100644
--- a/net/netfilter/nft_osf.c
+++ b/net/netfilter/nft_osf.c
@@ -115,9 +115,21 @@ static int nft_osf_validate(const struct nft_ctx *ctx,
 			    const struct nft_expr *expr,
 			    const struct nft_data **data)
 {
-	return nft_chain_validate_hooks(ctx->chain, (1 << NF_INET_LOCAL_IN) |
-						    (1 << NF_INET_PRE_ROUTING) |
-						    (1 << NF_INET_FORWARD));
+	unsigned int hooks;
+
+	switch (ctx->family) {
+	case NFPROTO_IPV4:
+	case NFPROTO_IPV6:
+	case NFPROTO_INET:
+		hooks = (1 << NF_INET_LOCAL_IN) |
+			(1 << NF_INET_PRE_ROUTING) |
+			(1 << NF_INET_FORWARD);
+		break;
+	default:
+		return -EOPNOTSUPP;
+	}
+
+	return nft_chain_validate_hooks(ctx->chain, hooks);
 }
 
 static bool nft_osf_reduce(struct nft_regs_track *track,
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH net 10/14] netfilter: nft_tunnel: restrict it to netdev family
  2022-08-24 22:03 [PATCH net 00/14] Netfilter fixes for net Pablo Neira Ayuso
                   ` (8 preceding siblings ...)
  2022-08-24 22:03 ` [PATCH net 09/14] netfilter: nft_osf: restrict osf to ipv4, ipv6 and inet families Pablo Neira Ayuso
@ 2022-08-24 22:03 ` Pablo Neira Ayuso
  2022-08-24 22:03 ` [PATCH net 11/14] netfilter: nf_tables: disallow binding to already bound chain Pablo Neira Ayuso
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Pablo Neira Ayuso @ 2022-08-24 22:03 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet

Only allow to use this expression from NFPROTO_NETDEV family.

Fixes: af308b94a2a4 ("netfilter: nf_tables: add tunnel support")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nft_tunnel.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/netfilter/nft_tunnel.c b/net/netfilter/nft_tunnel.c
index 5edaaded706d..983ade4be3b3 100644
--- a/net/netfilter/nft_tunnel.c
+++ b/net/netfilter/nft_tunnel.c
@@ -161,6 +161,7 @@ static const struct nft_expr_ops nft_tunnel_get_ops = {
 
 static struct nft_expr_type nft_tunnel_type __read_mostly = {
 	.name		= "tunnel",
+	.family		= NFPROTO_NETDEV,
 	.ops		= &nft_tunnel_get_ops,
 	.policy		= nft_tunnel_policy,
 	.maxattr	= NFTA_TUNNEL_MAX,
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH net 11/14] netfilter: nf_tables: disallow binding to already bound chain
  2022-08-24 22:03 [PATCH net 00/14] Netfilter fixes for net Pablo Neira Ayuso
                   ` (9 preceding siblings ...)
  2022-08-24 22:03 ` [PATCH net 10/14] netfilter: nft_tunnel: restrict it to netdev family Pablo Neira Ayuso
@ 2022-08-24 22:03 ` Pablo Neira Ayuso
  2022-08-24 22:03 ` [PATCH net 12/14] netfilter: flowtable: add function to invoke garbage collection immediately Pablo Neira Ayuso
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Pablo Neira Ayuso @ 2022-08-24 22:03 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet

Update nft_data_init() to report EINVAL if chain is already bound.

Fixes: d0e2c7de92c7 ("netfilter: nf_tables: add NFT_CHAIN_BINDING")
Reported-by: Gwangun Jung <exsociety@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nf_tables_api.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 72c066a33416..2ee50e23c9b7 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -9711,6 +9711,8 @@ static int nft_verdict_init(const struct nft_ctx *ctx, struct nft_data *data,
 			return PTR_ERR(chain);
 		if (nft_is_base_chain(chain))
 			return -EOPNOTSUPP;
+		if (nft_chain_is_bound(chain))
+			return -EINVAL;
 		if (desc->flags & NFT_DATA_DESC_SETELEM &&
 		    chain->flags & NFT_CHAIN_BINDING)
 			return -EINVAL;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH net 12/14] netfilter: flowtable: add function to invoke garbage collection immediately
  2022-08-24 22:03 [PATCH net 00/14] Netfilter fixes for net Pablo Neira Ayuso
                   ` (10 preceding siblings ...)
  2022-08-24 22:03 ` [PATCH net 11/14] netfilter: nf_tables: disallow binding to already bound chain Pablo Neira Ayuso
@ 2022-08-24 22:03 ` Pablo Neira Ayuso
  2022-08-24 22:03 ` [PATCH net 13/14] netfilter: flowtable: fix stuck flows on cleanup due to pending work Pablo Neira Ayuso
  2022-08-24 22:03 ` [PATCH net 14/14] netfilter: nf_defrag_ipv6: allow nf_conntrack_frag6_high_thresh increases Pablo Neira Ayuso
  13 siblings, 0 replies; 17+ messages in thread
From: Pablo Neira Ayuso @ 2022-08-24 22:03 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet

Expose nf_flow_table_gc_run() to force a garbage collector run from the
offload infrastructure.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/net/netfilter/nf_flow_table.h |  1 +
 net/netfilter/nf_flow_table_core.c    | 12 +++++++++---
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/include/net/netfilter/nf_flow_table.h b/include/net/netfilter/nf_flow_table.h
index d5326c44b453..476cc4423a90 100644
--- a/include/net/netfilter/nf_flow_table.h
+++ b/include/net/netfilter/nf_flow_table.h
@@ -270,6 +270,7 @@ void flow_offload_refresh(struct nf_flowtable *flow_table,
 
 struct flow_offload_tuple_rhash *flow_offload_lookup(struct nf_flowtable *flow_table,
 						     struct flow_offload_tuple *tuple);
+void nf_flow_table_gc_run(struct nf_flowtable *flow_table);
 void nf_flow_table_gc_cleanup(struct nf_flowtable *flowtable,
 			      struct net_device *dev);
 void nf_flow_table_cleanup(struct net_device *dev);
diff --git a/net/netfilter/nf_flow_table_core.c b/net/netfilter/nf_flow_table_core.c
index 765ac779bfc8..60fc1e1b7182 100644
--- a/net/netfilter/nf_flow_table_core.c
+++ b/net/netfilter/nf_flow_table_core.c
@@ -437,12 +437,17 @@ static void nf_flow_offload_gc_step(struct nf_flowtable *flow_table,
 	}
 }
 
+void nf_flow_table_gc_run(struct nf_flowtable *flow_table)
+{
+	nf_flow_table_iterate(flow_table, nf_flow_offload_gc_step, NULL);
+}
+
 static void nf_flow_offload_work_gc(struct work_struct *work)
 {
 	struct nf_flowtable *flow_table;
 
 	flow_table = container_of(work, struct nf_flowtable, gc_work.work);
-	nf_flow_table_iterate(flow_table, nf_flow_offload_gc_step, NULL);
+	nf_flow_table_gc_run(flow_table);
 	queue_delayed_work(system_power_efficient_wq, &flow_table->gc_work, HZ);
 }
 
@@ -601,10 +606,11 @@ void nf_flow_table_free(struct nf_flowtable *flow_table)
 
 	cancel_delayed_work_sync(&flow_table->gc_work);
 	nf_flow_table_iterate(flow_table, nf_flow_table_do_cleanup, NULL);
-	nf_flow_table_iterate(flow_table, nf_flow_offload_gc_step, NULL);
+	nf_flow_table_gc_run(flow_table);
 	nf_flow_table_offload_flush(flow_table);
 	if (nf_flowtable_hw_offload(flow_table))
-		nf_flow_table_iterate(flow_table, nf_flow_offload_gc_step, NULL);
+		nf_flow_table_gc_run(flow_table);
+
 	rhashtable_destroy(&flow_table->rhashtable);
 }
 EXPORT_SYMBOL_GPL(nf_flow_table_free);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH net 13/14] netfilter: flowtable: fix stuck flows on cleanup due to pending work
  2022-08-24 22:03 [PATCH net 00/14] Netfilter fixes for net Pablo Neira Ayuso
                   ` (11 preceding siblings ...)
  2022-08-24 22:03 ` [PATCH net 12/14] netfilter: flowtable: add function to invoke garbage collection immediately Pablo Neira Ayuso
@ 2022-08-24 22:03 ` Pablo Neira Ayuso
  2022-08-24 22:03 ` [PATCH net 14/14] netfilter: nf_defrag_ipv6: allow nf_conntrack_frag6_high_thresh increases Pablo Neira Ayuso
  13 siblings, 0 replies; 17+ messages in thread
From: Pablo Neira Ayuso @ 2022-08-24 22:03 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet

To clear the flow table on flow table free, the following sequence
normally happens in order:

  1) gc_step work is stopped to disable any further stats/del requests.
  2) All flow table entries are set to teardown state.
  3) Run gc_step which will queue HW del work for each flow table entry.
  4) Waiting for the above del work to finish (flush).
  5) Run gc_step again, deleting all entries from the flow table.
  6) Flow table is freed.

But if a flow table entry already has pending HW stats or HW add work
step 3 will not queue HW del work (it will be skipped), step 4 will wait
for the pending add/stats to finish, and step 5 will queue HW del work
which might execute after freeing of the flow table.

To fix the above, this patch flushes the pending work, then it sets the
teardown flag to all flows in the flowtable and it forces a garbage
collector run to queue work to remove the flows from hardware, then it
flushes this new pending work and (finally) it forces another garbage
collector run to remove the entry from the software flowtable.

Stack trace:
[47773.882335] BUG: KASAN: use-after-free in down_read+0x99/0x460
[47773.883634] Write of size 8 at addr ffff888103b45aa8 by task kworker/u20:6/543704
[47773.885634] CPU: 3 PID: 543704 Comm: kworker/u20:6 Not tainted 5.12.0-rc7+ #2
[47773.886745] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009)
[47773.888438] Workqueue: nf_ft_offload_del flow_offload_work_handler [nf_flow_table]
[47773.889727] Call Trace:
[47773.890214]  dump_stack+0xbb/0x107
[47773.890818]  print_address_description.constprop.0+0x18/0x140
[47773.892990]  kasan_report.cold+0x7c/0xd8
[47773.894459]  kasan_check_range+0x145/0x1a0
[47773.895174]  down_read+0x99/0x460
[47773.899706]  nf_flow_offload_tuple+0x24f/0x3c0 [nf_flow_table]
[47773.907137]  flow_offload_work_handler+0x72d/0xbe0 [nf_flow_table]
[47773.913372]  process_one_work+0x8ac/0x14e0
[47773.921325]
[47773.921325] Allocated by task 592159:
[47773.922031]  kasan_save_stack+0x1b/0x40
[47773.922730]  __kasan_kmalloc+0x7a/0x90
[47773.923411]  tcf_ct_flow_table_get+0x3cb/0x1230 [act_ct]
[47773.924363]  tcf_ct_init+0x71c/0x1156 [act_ct]
[47773.925207]  tcf_action_init_1+0x45b/0x700
[47773.925987]  tcf_action_init+0x453/0x6b0
[47773.926692]  tcf_exts_validate+0x3d0/0x600
[47773.927419]  fl_change+0x757/0x4a51 [cls_flower]
[47773.928227]  tc_new_tfilter+0x89a/0x2070
[47773.936652]
[47773.936652] Freed by task 543704:
[47773.937303]  kasan_save_stack+0x1b/0x40
[47773.938039]  kasan_set_track+0x1c/0x30
[47773.938731]  kasan_set_free_info+0x20/0x30
[47773.939467]  __kasan_slab_free+0xe7/0x120
[47773.940194]  slab_free_freelist_hook+0x86/0x190
[47773.941038]  kfree+0xce/0x3a0
[47773.941644]  tcf_ct_flow_table_cleanup_work

Original patch description and stack trace by Paul Blakey.

Fixes: c29f74e0df7a ("netfilter: nf_flow_table: hardware offload support")
Reported-by: Paul Blakey <paulb@nvidia.com>
Tested-by: Paul Blakey <paulb@nvidia.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/net/netfilter/nf_flow_table.h | 2 ++
 net/netfilter/nf_flow_table_core.c    | 7 +++----
 net/netfilter/nf_flow_table_offload.c | 8 ++++++++
 3 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/include/net/netfilter/nf_flow_table.h b/include/net/netfilter/nf_flow_table.h
index 476cc4423a90..cd982f4a0f50 100644
--- a/include/net/netfilter/nf_flow_table.h
+++ b/include/net/netfilter/nf_flow_table.h
@@ -307,6 +307,8 @@ void nf_flow_offload_stats(struct nf_flowtable *flowtable,
 			   struct flow_offload *flow);
 
 void nf_flow_table_offload_flush(struct nf_flowtable *flowtable);
+void nf_flow_table_offload_flush_cleanup(struct nf_flowtable *flowtable);
+
 int nf_flow_table_offload_setup(struct nf_flowtable *flowtable,
 				struct net_device *dev,
 				enum flow_block_command cmd);
diff --git a/net/netfilter/nf_flow_table_core.c b/net/netfilter/nf_flow_table_core.c
index 60fc1e1b7182..81c26a96c30b 100644
--- a/net/netfilter/nf_flow_table_core.c
+++ b/net/netfilter/nf_flow_table_core.c
@@ -605,12 +605,11 @@ void nf_flow_table_free(struct nf_flowtable *flow_table)
 	mutex_unlock(&flowtable_lock);
 
 	cancel_delayed_work_sync(&flow_table->gc_work);
+	nf_flow_table_offload_flush(flow_table);
+	/* ... no more pending work after this stage ... */
 	nf_flow_table_iterate(flow_table, nf_flow_table_do_cleanup, NULL);
 	nf_flow_table_gc_run(flow_table);
-	nf_flow_table_offload_flush(flow_table);
-	if (nf_flowtable_hw_offload(flow_table))
-		nf_flow_table_gc_run(flow_table);
-
+	nf_flow_table_offload_flush_cleanup(flow_table);
 	rhashtable_destroy(&flow_table->rhashtable);
 }
 EXPORT_SYMBOL_GPL(nf_flow_table_free);
diff --git a/net/netfilter/nf_flow_table_offload.c b/net/netfilter/nf_flow_table_offload.c
index 103b6cbf257f..b04645ced89b 100644
--- a/net/netfilter/nf_flow_table_offload.c
+++ b/net/netfilter/nf_flow_table_offload.c
@@ -1074,6 +1074,14 @@ void nf_flow_offload_stats(struct nf_flowtable *flowtable,
 	flow_offload_queue_work(offload);
 }
 
+void nf_flow_table_offload_flush_cleanup(struct nf_flowtable *flowtable)
+{
+	if (nf_flowtable_hw_offload(flowtable)) {
+		flush_workqueue(nf_flow_offload_del_wq);
+		nf_flow_table_gc_run(flowtable);
+	}
+}
+
 void nf_flow_table_offload_flush(struct nf_flowtable *flowtable)
 {
 	if (nf_flowtable_hw_offload(flowtable)) {
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH net 14/14] netfilter: nf_defrag_ipv6: allow nf_conntrack_frag6_high_thresh increases
  2022-08-24 22:03 [PATCH net 00/14] Netfilter fixes for net Pablo Neira Ayuso
                   ` (12 preceding siblings ...)
  2022-08-24 22:03 ` [PATCH net 13/14] netfilter: flowtable: fix stuck flows on cleanup due to pending work Pablo Neira Ayuso
@ 2022-08-24 22:03 ` Pablo Neira Ayuso
  2022-08-25  2:17   ` Jakub Kicinski
  13 siblings, 1 reply; 17+ messages in thread
From: Pablo Neira Ayuso @ 2022-08-24 22:03 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet

From: Eric Dumazet <edumazet@google.com>

Currently, net.netfilter.nf_conntrack_frag6_high_thresh can only be lowered.

I found this issue while investigating a probable kernel issue
causing flakes in tools/testing/selftests/net/ip_defrag.sh

In particular, these sysctl changes were ignored:
	ip netns exec "${NETNS}" sysctl -w net.netfilter.nf_conntrack_frag6_high_thresh=9000000 >/dev/null 2>&1
	ip netns exec "${NETNS}" sysctl -w net.netfilter.nf_conntrack_frag6_low_thresh=7000000  >/dev/null 2>&1

This change is inline with commit 836196239298 ("net/ipfrag: let ip[6]frag_high_thresh
in ns be higher than in init_net")

Fixes: 8db3d41569bb ("netfilter: nf_defrag_ipv6: use net_generic infra")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/ipv6/netfilter/nf_conntrack_reasm.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/net/ipv6/netfilter/nf_conntrack_reasm.c b/net/ipv6/netfilter/nf_conntrack_reasm.c
index 7dd3629dd19e..38db0064d661 100644
--- a/net/ipv6/netfilter/nf_conntrack_reasm.c
+++ b/net/ipv6/netfilter/nf_conntrack_reasm.c
@@ -86,7 +86,6 @@ static int nf_ct_frag6_sysctl_register(struct net *net)
 	table[1].extra2	= &nf_frag->fqdir->high_thresh;
 	table[2].data	= &nf_frag->fqdir->high_thresh;
 	table[2].extra1	= &nf_frag->fqdir->low_thresh;
-	table[2].extra2	= &nf_frag->fqdir->high_thresh;
 
 	hdr = register_net_sysctl(net, "net/netfilter", table);
 	if (hdr == NULL)
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH net 14/14] netfilter: nf_defrag_ipv6: allow nf_conntrack_frag6_high_thresh increases
  2022-08-24 22:03 ` [PATCH net 14/14] netfilter: nf_defrag_ipv6: allow nf_conntrack_frag6_high_thresh increases Pablo Neira Ayuso
@ 2022-08-25  2:17   ` Jakub Kicinski
  0 siblings, 0 replies; 17+ messages in thread
From: Jakub Kicinski @ 2022-08-25  2:17 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter-devel, davem, netdev, pabeni, edumazet

On Thu, 25 Aug 2022 00:03:30 +0200 Pablo Neira Ayuso wrote:
> Fixes: 8db3d41569bb ("netfilter: nf_defrag_ipv6: use net_generic infra")

Incorrect hash here, should have been:

Fixes: 8b0adbe3e38d ("netfilter: nf_defrag_ipv6: use net_generic infra")

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net 01/14] netfilter: ebtables: reject blobs that don't provide all entry points
  2022-08-24 22:03 ` [PATCH net 01/14] netfilter: ebtables: reject blobs that don't provide all entry points Pablo Neira Ayuso
@ 2022-08-25  2:40   ` patchwork-bot+netdevbpf
  0 siblings, 0 replies; 17+ messages in thread
From: patchwork-bot+netdevbpf @ 2022-08-25  2:40 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter-devel, davem, netdev, kuba, pabeni, edumazet

Hello:

This series was applied to netdev/net.git (master)
by Pablo Neira Ayuso <pablo@netfilter.org>:

On Thu, 25 Aug 2022 00:03:17 +0200 you wrote:
> From: Florian Westphal <fw@strlen.de>
> 
> Harshit Mogalapalli says:
>  In ebt_do_table() function dereferencing 'private->hook_entry[hook]'
>  can lead to NULL pointer dereference. [..] Kernel panic:
> 
> general protection fault, probably for non-canonical address 0xdffffc0000000005: 0000 [#1] PREEMPT SMP KASAN
> KASAN: null-ptr-deref in range [0x0000000000000028-0x000000000000002f]
> [..]
> RIP: 0010:ebt_do_table+0x1dc/0x1ce0
> Code: 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 5c 16 00 00 48 b8 00 00 00 00 00 fc ff df 49 8b 6c df 08 48 8d 7d 2c 48 89 fa 48 c1 ea 03 <0f> b6 14 02 48 89 f8 83 e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 88
> [..]
> Call Trace:
>  nf_hook_slow+0xb1/0x170
>  __br_forward+0x289/0x730
>  maybe_deliver+0x24b/0x380
>  br_flood+0xc6/0x390
>  br_dev_xmit+0xa2e/0x12c0
> 
> [...]

Here is the summary with links:
  - [net,01/14] netfilter: ebtables: reject blobs that don't provide all entry points
    https://git.kernel.org/netdev/net/c/7997eff82828
  - [net,02/14] netfilter: conntrack: work around exceeded receive window
    https://git.kernel.org/netdev/net/c/cf97769c761a
  - [net,03/14] netfilter: nft_tproxy: restrict to prerouting hook
    https://git.kernel.org/netdev/net/c/18bbc3213383
  - [net,04/14] netfilter: nf_tables: disallow updates of implicit chain
    https://git.kernel.org/netdev/net/c/5dc52d83baac
  - [net,05/14] netfilter: nf_tables: make table handle allocation per-netns friendly
    https://git.kernel.org/netdev/net/c/ab482c6b66a4
  - [net,06/14] netfilter: nft_payload: report ERANGE for too long offset and length
    https://git.kernel.org/netdev/net/c/94254f990c07
  - [net,07/14] netfilter: nft_payload: do not truncate csum_offset and csum_type
    https://git.kernel.org/netdev/net/c/7044ab281feb
  - [net,08/14] netfilter: nf_tables: do not leave chain stats enabled on error
    https://git.kernel.org/netdev/net/c/43eb8949cfdf
  - [net,09/14] netfilter: nft_osf: restrict osf to ipv4, ipv6 and inet families
    https://git.kernel.org/netdev/net/c/5f3b7aae14a7
  - [net,10/14] netfilter: nft_tunnel: restrict it to netdev family
    https://git.kernel.org/netdev/net/c/01e4092d53bc
  - [net,11/14] netfilter: nf_tables: disallow binding to already bound chain
    https://git.kernel.org/netdev/net/c/e02f0d397040
  - [net,12/14] netfilter: flowtable: add function to invoke garbage collection immediately
    https://git.kernel.org/netdev/net/c/759eebbcfafc
  - [net,13/14] netfilter: flowtable: fix stuck flows on cleanup due to pending work
    https://git.kernel.org/netdev/net/c/9afb4b27349a
  - [net,14/14] netfilter: nf_defrag_ipv6: allow nf_conntrack_frag6_high_thresh increases
    https://git.kernel.org/netdev/net/c/00cd7bf9f9e0

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2022-08-25  2:40 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-24 22:03 [PATCH net 00/14] Netfilter fixes for net Pablo Neira Ayuso
2022-08-24 22:03 ` [PATCH net 01/14] netfilter: ebtables: reject blobs that don't provide all entry points Pablo Neira Ayuso
2022-08-25  2:40   ` patchwork-bot+netdevbpf
2022-08-24 22:03 ` [PATCH net 02/14] netfilter: conntrack: work around exceeded receive window Pablo Neira Ayuso
2022-08-24 22:03 ` [PATCH net 03/14] netfilter: nft_tproxy: restrict to prerouting hook Pablo Neira Ayuso
2022-08-24 22:03 ` [PATCH net 04/14] netfilter: nf_tables: disallow updates of implicit chain Pablo Neira Ayuso
2022-08-24 22:03 ` [PATCH net 05/14] netfilter: nf_tables: make table handle allocation per-netns friendly Pablo Neira Ayuso
2022-08-24 22:03 ` [PATCH net 06/14] netfilter: nft_payload: report ERANGE for too long offset and length Pablo Neira Ayuso
2022-08-24 22:03 ` [PATCH net 07/14] netfilter: nft_payload: do not truncate csum_offset and csum_type Pablo Neira Ayuso
2022-08-24 22:03 ` [PATCH net 08/14] netfilter: nf_tables: do not leave chain stats enabled on error Pablo Neira Ayuso
2022-08-24 22:03 ` [PATCH net 09/14] netfilter: nft_osf: restrict osf to ipv4, ipv6 and inet families Pablo Neira Ayuso
2022-08-24 22:03 ` [PATCH net 10/14] netfilter: nft_tunnel: restrict it to netdev family Pablo Neira Ayuso
2022-08-24 22:03 ` [PATCH net 11/14] netfilter: nf_tables: disallow binding to already bound chain Pablo Neira Ayuso
2022-08-24 22:03 ` [PATCH net 12/14] netfilter: flowtable: add function to invoke garbage collection immediately Pablo Neira Ayuso
2022-08-24 22:03 ` [PATCH net 13/14] netfilter: flowtable: fix stuck flows on cleanup due to pending work Pablo Neira Ayuso
2022-08-24 22:03 ` [PATCH net 14/14] netfilter: nf_defrag_ipv6: allow nf_conntrack_frag6_high_thresh increases Pablo Neira Ayuso
2022-08-25  2:17   ` Jakub Kicinski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).