* [PATCH net 1/3] netfilter: nf_conntrack_tcp: re-init for syn packets only
2022-04-28 14:21 [PATCH net 0/3] Netfilter fixes for net Pablo Neira Ayuso
@ 2022-04-28 14:21 ` Pablo Neira Ayuso
2022-04-28 17:00 ` patchwork-bot+netdevbpf
2022-04-28 14:21 ` [PATCH net 2/3] netfilter: conntrack: fix udp offload timeout sysctl Pablo Neira Ayuso
2022-04-28 14:21 ` [PATCH net 3/3] netfilter: nft_socket: only do sk lookups when indev is available Pablo Neira Ayuso
2 siblings, 1 reply; 7+ messages in thread
From: Pablo Neira Ayuso @ 2022-04-28 14:21 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev, kuba
From: Florian Westphal <fw@strlen.de>
Jaco Kroon reported tcp problems that Eric Dumazet and Neal Cardwell
pinpointed to nf_conntrack tcp_in_window() bug.
tcp trace shows following sequence:
I > R Flags [S], seq 3451342529, win 62580, options [.. tfo [|tcp]>
R > I Flags [S.], seq 2699962254, ack 3451342530, win 65535, options [..]
R > I Flags [P.], seq 1:89, ack 1, [..]
Note 3rd ACK is from responder to initiator so following branch is taken:
} else if (((state->state == TCP_CONNTRACK_SYN_SENT
&& dir == IP_CT_DIR_ORIGINAL)
|| (state->state == TCP_CONNTRACK_SYN_RECV
&& dir == IP_CT_DIR_REPLY))
&& after(end, sender->td_end)) {
... because state == TCP_CONNTRACK_SYN_RECV and dir is REPLY.
This causes the scaling factor to be reset to 0: window scale option
is only present in syn(ack) packets. This in turn makes nf_conntrack
mark valid packets as out-of-window.
This was always broken, it exists even in original commit where
window tracking was added to ip_conntrack (nf_conntrack predecessor)
in 2.6.9-rc1 kernel.
Restrict to 'tcph->syn', just like the 3rd condtional added in
commit 82b72cb94666 ("netfilter: conntrack: re-init state for retransmitted syn-ack").
Upon closer look, those conditionals/branches can be merged:
Because earlier checks prevent syn-ack from showing up in
original direction, the 'dir' checks in the conditional quoted above are
redundant, remove them. Return early for pure syn retransmitted in reply
direction (simultaneous open).
Fixes: 9fb9cbb1082d ("[NETFILTER]: Add nf_conntrack subsystem.")
Reported-by: Jaco Kroon <jaco@uls.co.za>
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Jozsef Kadlecsik <kadlec@netfilter.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/netfilter/nf_conntrack_proto_tcp.c | 21 ++++++---------------
1 file changed, 6 insertions(+), 15 deletions(-)
diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c
index 8ec55cd72572..204a5cdff5b1 100644
--- a/net/netfilter/nf_conntrack_proto_tcp.c
+++ b/net/netfilter/nf_conntrack_proto_tcp.c
@@ -556,24 +556,14 @@ static bool tcp_in_window(struct nf_conn *ct,
}
}
- } else if (((state->state == TCP_CONNTRACK_SYN_SENT
- && dir == IP_CT_DIR_ORIGINAL)
- || (state->state == TCP_CONNTRACK_SYN_RECV
- && dir == IP_CT_DIR_REPLY))
- && after(end, sender->td_end)) {
+ } else if (tcph->syn &&
+ after(end, sender->td_end) &&
+ (state->state == TCP_CONNTRACK_SYN_SENT ||
+ state->state == TCP_CONNTRACK_SYN_RECV)) {
/*
* RFC 793: "if a TCP is reinitialized ... then it need
* not wait at all; it must only be sure to use sequence
* numbers larger than those recently used."
- */
- sender->td_end =
- sender->td_maxend = end;
- sender->td_maxwin = (win == 0 ? 1 : win);
-
- tcp_options(skb, dataoff, tcph, sender);
- } else if (tcph->syn && dir == IP_CT_DIR_REPLY &&
- state->state == TCP_CONNTRACK_SYN_SENT) {
- /* Retransmitted syn-ack, or syn (simultaneous open).
*
* Re-init state for this direction, just like for the first
* syn(-ack) reply, it might differ in seq, ack or tcp options.
@@ -581,7 +571,8 @@ static bool tcp_in_window(struct nf_conn *ct,
tcp_init_sender(sender, receiver,
skb, dataoff, tcph,
end, win);
- if (!tcph->ack)
+
+ if (dir == IP_CT_DIR_REPLY && !tcph->ack)
return true;
}
--
2.30.2
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH net 1/3] netfilter: nf_conntrack_tcp: re-init for syn packets only
2022-04-28 14:21 ` [PATCH net 1/3] netfilter: nf_conntrack_tcp: re-init for syn packets only Pablo Neira Ayuso
@ 2022-04-28 17:00 ` patchwork-bot+netdevbpf
2022-08-12 13:34 ` Neal Cardwell
0 siblings, 1 reply; 7+ messages in thread
From: patchwork-bot+netdevbpf @ 2022-04-28 17:00 UTC (permalink / raw)
To: Pablo Neira Ayuso; +Cc: netfilter-devel, davem, netdev, kuba
Hello:
This series was applied to netdev/net.git (master)
by Pablo Neira Ayuso <pablo@netfilter.org>:
On Thu, 28 Apr 2022 16:21:07 +0200 you wrote:
> From: Florian Westphal <fw@strlen.de>
>
> Jaco Kroon reported tcp problems that Eric Dumazet and Neal Cardwell
> pinpointed to nf_conntrack tcp_in_window() bug.
>
> tcp trace shows following sequence:
>
> [...]
Here is the summary with links:
- [net,1/3] netfilter: nf_conntrack_tcp: re-init for syn packets only
https://git.kernel.org/netdev/net/c/c7aab4f17021
- [net,2/3] netfilter: conntrack: fix udp offload timeout sysctl
https://git.kernel.org/netdev/net/c/626873c446f7
- [net,3/3] netfilter: nft_socket: only do sk lookups when indev is available
https://git.kernel.org/netdev/net/c/743b83f15d40
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH net 1/3] netfilter: nf_conntrack_tcp: re-init for syn packets only
2022-04-28 17:00 ` patchwork-bot+netdevbpf
@ 2022-08-12 13:34 ` Neal Cardwell
2022-08-12 19:17 ` Jakub Kicinski
0 siblings, 1 reply; 7+ messages in thread
From: Neal Cardwell @ 2022-08-12 13:34 UTC (permalink / raw)
To: patchwork-bot+netdevbpf
Cc: Pablo Neira Ayuso, netfilter-devel, davem, netdev, kuba,
Yuchung Cheng, Eric Dumazet
On Thu, Apr 28, 2022 at 1:00 PM <patchwork-bot+netdevbpf@kernel.org> wrote:
>
> Hello:
>
> This series was applied to netdev/net.git (master)
> by Pablo Neira Ayuso <pablo@netfilter.org>:
>
> On Thu, 28 Apr 2022 16:21:07 +0200 you wrote:
> > From: Florian Westphal <fw@strlen.de>
> >
> > Jaco Kroon reported tcp problems that Eric Dumazet and Neal Cardwell
> > pinpointed to nf_conntrack tcp_in_window() bug.
> >
> > tcp trace shows following sequence:
> >
> > [...]
>
> Here is the summary with links:
> - [net,1/3] netfilter: nf_conntrack_tcp: re-init for syn packets only
> https://git.kernel.org/netdev/net/c/c7aab4f17021
> - [net,2/3] netfilter: conntrack: fix udp offload timeout sysctl
> https://git.kernel.org/netdev/net/c/626873c446f7
> - [net,3/3] netfilter: nft_socket: only do sk lookups when indev is available
> https://git.kernel.org/netdev/net/c/743b83f15d40
>
> You are awesome, thank you!
> --
> Deet-doot-dot, I am a bot.
> https://korg.docs.kernel.org/patchwork/pwbot.html
This first commit is an important bug fix for a serious bug that causes
TCP connection hangs for users of TCP fast open and nf_conntrack:
c7aab4f17021b netfilter: nf_conntrack_tcp: re-init for syn packets only
We are continuing to get reports about the bug that this commit fixes.
It seems this fix was only backported to v5.17 stable release, and not further,
due to a cherry-pick conflict, because this fix implicitly depends on a
slightly earlier v5.17 fix in the same spot:
82b72cb94666 netfilter: conntrack: re-init state for retransmitted syn-ack
I manually verified that the fix c7aab4f17021b can be cleanly cherry-picked
into the oldest (v4.9.325) and newest (v5.15.60) longterm release kernels as
long as we first cherry-pick that related fix that it implicitly depends on:
82b72cb94666b3dbd7152bb9f441b068af7a921b
netfilter: conntrack: re-init state for retransmitted syn-ack
c7aab4f17021b636a0ee75bcf28e06fb7c94ab48
netfilter: nf_conntrack_tcp: re-init for syn packets only
So would it be possible to backport both of those fixes with the following
cherry-picks, to all LTS stable releases?
git cherry-pick 82b72cb94666b3dbd7152bb9f441b068af7a921b
git cherry-pick c7aab4f17021b636a0ee75bcf28e06fb7c94ab48
Thanks!
Best Regards,
neal
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH net 1/3] netfilter: nf_conntrack_tcp: re-init for syn packets only
2022-08-12 13:34 ` Neal Cardwell
@ 2022-08-12 19:17 ` Jakub Kicinski
0 siblings, 0 replies; 7+ messages in thread
From: Jakub Kicinski @ 2022-08-12 19:17 UTC (permalink / raw)
To: Neal Cardwell, stable
Cc: patchwork-bot+netdevbpf, Pablo Neira Ayuso, netfilter-devel,
davem, netdev, Yuchung Cheng, Eric Dumazet
On Fri, 12 Aug 2022 09:34:14 -0400 Neal Cardwell wrote:
> This first commit is an important bug fix for a serious bug that causes
> TCP connection hangs for users of TCP fast open and nf_conntrack:
>
> c7aab4f17021b netfilter: nf_conntrack_tcp: re-init for syn packets only
>
> We are continuing to get reports about the bug that this commit fixes.
>
> It seems this fix was only backported to v5.17 stable release, and not further,
> due to a cherry-pick conflict, because this fix implicitly depends on a
> slightly earlier v5.17 fix in the same spot:
>
> 82b72cb94666 netfilter: conntrack: re-init state for retransmitted syn-ack
>
> I manually verified that the fix c7aab4f17021b can be cleanly cherry-picked
> into the oldest (v4.9.325) and newest (v5.15.60) longterm release kernels as
> long as we first cherry-pick that related fix that it implicitly depends on:
>
> 82b72cb94666b3dbd7152bb9f441b068af7a921b
> netfilter: conntrack: re-init state for retransmitted syn-ack
>
> c7aab4f17021b636a0ee75bcf28e06fb7c94ab48
> netfilter: nf_conntrack_tcp: re-init for syn packets only
>
> So would it be possible to backport both of those fixes with the following
> cherry-picks, to all LTS stable releases?
>
> git cherry-pick 82b72cb94666b3dbd7152bb9f441b068af7a921b
> git cherry-pick c7aab4f17021b636a0ee75bcf28e06fb7c94ab48
Thanks a lot Neal! FWIW we have recently changed our process and no
longer handle stable submissions ourselves, so in the future feel free
to talk directly to stable@ (and add CC: stable@ tags to patches).
I'm adding stable@, let's see if Greg & team can pick things up based
on your instructions :)
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH net 2/3] netfilter: conntrack: fix udp offload timeout sysctl
2022-04-28 14:21 [PATCH net 0/3] Netfilter fixes for net Pablo Neira Ayuso
2022-04-28 14:21 ` [PATCH net 1/3] netfilter: nf_conntrack_tcp: re-init for syn packets only Pablo Neira Ayuso
@ 2022-04-28 14:21 ` Pablo Neira Ayuso
2022-04-28 14:21 ` [PATCH net 3/3] netfilter: nft_socket: only do sk lookups when indev is available Pablo Neira Ayuso
2 siblings, 0 replies; 7+ messages in thread
From: Pablo Neira Ayuso @ 2022-04-28 14:21 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev, kuba
From: Volodymyr Mytnyk <volodymyr.mytnyk@plvision.eu>
`nf_flowtable_udp_timeout` sysctl option is available only
if CONFIG_NFT_FLOW_OFFLOAD enabled. But infra for this flow
offload UDP timeout was added under CONFIG_NF_FLOW_TABLE
config option. So, if you have CONFIG_NFT_FLOW_OFFLOAD
disabled and CONFIG_NF_FLOW_TABLE enabled, the
`nf_flowtable_udp_timeout` is not present in sysfs.
Please note, that TCP flow offload timeout sysctl option
is present even CONFIG_NFT_FLOW_OFFLOAD is disabled.
I suppose it was a typo in commit that adds UDP flow offload
timeout and CONFIG_NF_FLOW_TABLE should be used instead.
Fixes: 975c57504da1 ("netfilter: conntrack: Introduce udp offload timeout configuration")
Signed-off-by: Volodymyr Mytnyk <volodymyr.mytnyk@plvision.eu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/netfilter/nf_conntrack_standalone.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/netfilter/nf_conntrack_standalone.c b/net/netfilter/nf_conntrack_standalone.c
index 3e1afd10a9b6..55aa55b252b2 100644
--- a/net/netfilter/nf_conntrack_standalone.c
+++ b/net/netfilter/nf_conntrack_standalone.c
@@ -823,7 +823,7 @@ static struct ctl_table nf_ct_sysctl_table[] = {
.mode = 0644,
.proc_handler = proc_dointvec_jiffies,
},
-#if IS_ENABLED(CONFIG_NFT_FLOW_OFFLOAD)
+#if IS_ENABLED(CONFIG_NF_FLOW_TABLE)
[NF_SYSCTL_CT_PROTO_TIMEOUT_UDP_OFFLOAD] = {
.procname = "nf_flowtable_udp_timeout",
.maxlen = sizeof(unsigned int),
--
2.30.2
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH net 3/3] netfilter: nft_socket: only do sk lookups when indev is available
2022-04-28 14:21 [PATCH net 0/3] Netfilter fixes for net Pablo Neira Ayuso
2022-04-28 14:21 ` [PATCH net 1/3] netfilter: nf_conntrack_tcp: re-init for syn packets only Pablo Neira Ayuso
2022-04-28 14:21 ` [PATCH net 2/3] netfilter: conntrack: fix udp offload timeout sysctl Pablo Neira Ayuso
@ 2022-04-28 14:21 ` Pablo Neira Ayuso
2 siblings, 0 replies; 7+ messages in thread
From: Pablo Neira Ayuso @ 2022-04-28 14:21 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev, kuba
From: Florian Westphal <fw@strlen.de>
Check if the incoming interface is available and NFT_BREAK
in case neither skb->sk nor input device are set.
Because nf_sk_lookup_slow*() assume packet headers are in the
'in' direction, use in postrouting is not going to yield a meaningful
result. Same is true for the forward chain, so restrict the use
to prerouting, input and output.
Use in output work if a socket is already attached to the skb.
Fixes: 554ced0a6e29 ("netfilter: nf_tables: add support for native socket matching")
Reported-and-tested-by: Topi Miettinen <toiwoton@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/netfilter/nft_socket.c | 52 ++++++++++++++++++++++++++++----------
1 file changed, 38 insertions(+), 14 deletions(-)
diff --git a/net/netfilter/nft_socket.c b/net/netfilter/nft_socket.c
index 6d9e8e0a3a7d..05ae5a338b6f 100644
--- a/net/netfilter/nft_socket.c
+++ b/net/netfilter/nft_socket.c
@@ -54,6 +54,32 @@ nft_sock_get_eval_cgroupv2(u32 *dest, struct sock *sk, const struct nft_pktinfo
}
#endif
+static struct sock *nft_socket_do_lookup(const struct nft_pktinfo *pkt)
+{
+ const struct net_device *indev = nft_in(pkt);
+ const struct sk_buff *skb = pkt->skb;
+ struct sock *sk = NULL;
+
+ if (!indev)
+ return NULL;
+
+ switch (nft_pf(pkt)) {
+ case NFPROTO_IPV4:
+ sk = nf_sk_lookup_slow_v4(nft_net(pkt), skb, indev);
+ break;
+#if IS_ENABLED(CONFIG_NF_TABLES_IPV6)
+ case NFPROTO_IPV6:
+ sk = nf_sk_lookup_slow_v6(nft_net(pkt), skb, indev);
+ break;
+#endif
+ default:
+ WARN_ON_ONCE(1);
+ break;
+ }
+
+ return sk;
+}
+
static void nft_socket_eval(const struct nft_expr *expr,
struct nft_regs *regs,
const struct nft_pktinfo *pkt)
@@ -67,20 +93,7 @@ static void nft_socket_eval(const struct nft_expr *expr,
sk = NULL;
if (!sk)
- switch(nft_pf(pkt)) {
- case NFPROTO_IPV4:
- sk = nf_sk_lookup_slow_v4(nft_net(pkt), skb, nft_in(pkt));
- break;
-#if IS_ENABLED(CONFIG_NF_TABLES_IPV6)
- case NFPROTO_IPV6:
- sk = nf_sk_lookup_slow_v6(nft_net(pkt), skb, nft_in(pkt));
- break;
-#endif
- default:
- WARN_ON_ONCE(1);
- regs->verdict.code = NFT_BREAK;
- return;
- }
+ sk = nft_socket_do_lookup(pkt);
if (!sk) {
regs->verdict.code = NFT_BREAK;
@@ -224,6 +237,16 @@ static bool nft_socket_reduce(struct nft_regs_track *track,
return nft_expr_reduce_bitwise(track, expr);
}
+static int nft_socket_validate(const struct nft_ctx *ctx,
+ const struct nft_expr *expr,
+ const struct nft_data **data)
+{
+ return nft_chain_validate_hooks(ctx->chain,
+ (1 << NF_INET_PRE_ROUTING) |
+ (1 << NF_INET_LOCAL_IN) |
+ (1 << NF_INET_LOCAL_OUT));
+}
+
static struct nft_expr_type nft_socket_type;
static const struct nft_expr_ops nft_socket_ops = {
.type = &nft_socket_type,
@@ -231,6 +254,7 @@ static const struct nft_expr_ops nft_socket_ops = {
.eval = nft_socket_eval,
.init = nft_socket_init,
.dump = nft_socket_dump,
+ .validate = nft_socket_validate,
.reduce = nft_socket_reduce,
};
--
2.30.2
^ permalink raw reply related [flat|nested] 7+ messages in thread