* [PATCH 1/4] netfilter: conntrack: fix false CRC32c mismatch using paged skb
2017-05-29 11:34 [PATCH 0/4] Netfilter fixes for net Pablo Neira Ayuso
@ 2017-05-29 11:34 ` Pablo Neira Ayuso
2017-05-29 11:34 ` [PATCH 2/4] netfilter: nft_set_rbtree: handle element re-addition after deletion Pablo Neira Ayuso
` (3 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: Pablo Neira Ayuso @ 2017-05-29 11:34 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
From: Davide Caratti <dcaratti@redhat.com>
sctp_compute_cksum() implementation assumes that at least the SCTP header
is in the linear part of skb: modify conntrack error callback to avoid
false CRC32c mismatch, if the transport header is partially/entirely paged.
Fixes: cf6e007eef83 ("netfilter: conntrack: validate SCTP crc32c in PREROUTING")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/netfilter/nf_conntrack_proto_sctp.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/net/netfilter/nf_conntrack_proto_sctp.c b/net/netfilter/nf_conntrack_proto_sctp.c
index 13875d599a85..1c5b14a6cab3 100644
--- a/net/netfilter/nf_conntrack_proto_sctp.c
+++ b/net/netfilter/nf_conntrack_proto_sctp.c
@@ -512,16 +512,19 @@ static int sctp_error(struct net *net, struct nf_conn *tpl, struct sk_buff *skb,
u8 pf, unsigned int hooknum)
{
const struct sctphdr *sh;
- struct sctphdr _sctph;
const char *logmsg;
- sh = skb_header_pointer(skb, dataoff, sizeof(_sctph), &_sctph);
- if (!sh) {
+ if (skb->len < dataoff + sizeof(struct sctphdr)) {
logmsg = "nf_ct_sctp: short packet ";
goto out_invalid;
}
if (net->ct.sysctl_checksum && hooknum == NF_INET_PRE_ROUTING &&
skb->ip_summed == CHECKSUM_NONE) {
+ if (!skb_make_writable(skb, dataoff + sizeof(struct sctphdr))) {
+ logmsg = "nf_ct_sctp: failed to read header ";
+ goto out_invalid;
+ }
+ sh = (const struct sctphdr *)(skb->data + dataoff);
if (sh->checksum != sctp_compute_cksum(skb, dataoff)) {
logmsg = "nf_ct_sctp: bad CRC ";
goto out_invalid;
--
2.1.4
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 2/4] netfilter: nft_set_rbtree: handle element re-addition after deletion
2017-05-29 11:34 [PATCH 0/4] Netfilter fixes for net Pablo Neira Ayuso
2017-05-29 11:34 ` [PATCH 1/4] netfilter: conntrack: fix false CRC32c mismatch using paged skb Pablo Neira Ayuso
@ 2017-05-29 11:34 ` Pablo Neira Ayuso
2017-05-29 11:34 ` [PATCH 3/4] netfilter: nat: use atomic bit op to clear the _SRC_NAT_DONE_BIT Pablo Neira Ayuso
` (2 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: Pablo Neira Ayuso @ 2017-05-29 11:34 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
The existing code selects no next branch to be inspected when
re-inserting an inactive element into the rb-tree, looping endlessly.
This patch restricts the check for active elements to the EEXIST case
only.
Fixes: e701001e7cbe ("netfilter: nft_rbtree: allow adjacent intervals with dynamic updates")
Reported-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Tested-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/netfilter/nft_set_rbtree.c | 22 +++++++++++-----------
1 file changed, 11 insertions(+), 11 deletions(-)
diff --git a/net/netfilter/nft_set_rbtree.c b/net/netfilter/nft_set_rbtree.c
index e97e2fb53f0a..fbdbaa00dd5f 100644
--- a/net/netfilter/nft_set_rbtree.c
+++ b/net/netfilter/nft_set_rbtree.c
@@ -116,17 +116,17 @@ static int __nft_rbtree_insert(const struct net *net, const struct nft_set *set,
else if (d > 0)
p = &parent->rb_right;
else {
- if (nft_set_elem_active(&rbe->ext, genmask)) {
- if (nft_rbtree_interval_end(rbe) &&
- !nft_rbtree_interval_end(new))
- p = &parent->rb_left;
- else if (!nft_rbtree_interval_end(rbe) &&
- nft_rbtree_interval_end(new))
- p = &parent->rb_right;
- else {
- *ext = &rbe->ext;
- return -EEXIST;
- }
+ if (nft_rbtree_interval_end(rbe) &&
+ !nft_rbtree_interval_end(new)) {
+ p = &parent->rb_left;
+ } else if (!nft_rbtree_interval_end(rbe) &&
+ nft_rbtree_interval_end(new)) {
+ p = &parent->rb_right;
+ } else if (nft_set_elem_active(&rbe->ext, genmask)) {
+ *ext = &rbe->ext;
+ return -EEXIST;
+ } else {
+ p = &parent->rb_left;
}
}
}
--
2.1.4
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 3/4] netfilter: nat: use atomic bit op to clear the _SRC_NAT_DONE_BIT
2017-05-29 11:34 [PATCH 0/4] Netfilter fixes for net Pablo Neira Ayuso
2017-05-29 11:34 ` [PATCH 1/4] netfilter: conntrack: fix false CRC32c mismatch using paged skb Pablo Neira Ayuso
2017-05-29 11:34 ` [PATCH 2/4] netfilter: nft_set_rbtree: handle element re-addition after deletion Pablo Neira Ayuso
@ 2017-05-29 11:34 ` Pablo Neira Ayuso
2017-05-29 11:34 ` [PATCH 4/4] netfilter: ctnetlink: fix incorrect nf_ct_put during hash resize Pablo Neira Ayuso
2017-05-30 3:20 ` [PATCH 0/4] Netfilter fixes for net David Miller
4 siblings, 0 replies; 6+ messages in thread
From: Pablo Neira Ayuso @ 2017-05-29 11:34 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
From: Liping Zhang <zlpnobody@gmail.com>
We need to clear the IPS_SRC_NAT_DONE_BIT to indicate that the ct has
been removed from nat_bysource table. But unfortunately, we use the
non-atomic bit operation: "ct->status &= ~IPS_NAT_DONE_MASK". So
there's a race condition that we may clear the _DYING_BIT set by
another CPU unexpectedly.
Since we don't care about the IPS_DST_NAT_DONE_BIT, so just using
clear_bit to clear the IPS_SRC_NAT_DONE_BIT is enough.
Also note, this is the last user which use the non-atomic bit operation
to update the confirmed ct->status.
Reported-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/netfilter/nf_nat_core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/netfilter/nf_nat_core.c b/net/netfilter/nf_nat_core.c
index ef0be325a0c6..6c72922d20ca 100644
--- a/net/netfilter/nf_nat_core.c
+++ b/net/netfilter/nf_nat_core.c
@@ -566,7 +566,7 @@ static int nf_nat_proto_clean(struct nf_conn *ct, void *data)
* Else, when the conntrack is destoyed, nf_nat_cleanup_conntrack()
* will delete entry from already-freed table.
*/
- ct->status &= ~IPS_NAT_DONE_MASK;
+ clear_bit(IPS_SRC_NAT_DONE_BIT, &ct->status);
rhltable_remove(&nf_nat_bysource_table, &ct->nat_bysource,
nf_nat_bysource_params);
--
2.1.4
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 4/4] netfilter: ctnetlink: fix incorrect nf_ct_put during hash resize
2017-05-29 11:34 [PATCH 0/4] Netfilter fixes for net Pablo Neira Ayuso
` (2 preceding siblings ...)
2017-05-29 11:34 ` [PATCH 3/4] netfilter: nat: use atomic bit op to clear the _SRC_NAT_DONE_BIT Pablo Neira Ayuso
@ 2017-05-29 11:34 ` Pablo Neira Ayuso
2017-05-30 3:20 ` [PATCH 0/4] Netfilter fixes for net David Miller
4 siblings, 0 replies; 6+ messages in thread
From: Pablo Neira Ayuso @ 2017-05-29 11:34 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
From: Liping Zhang <zlpnobody@gmail.com>
If nf_conntrack_htable_size was adjusted by the user during the ct
dump operation, we may invoke nf_ct_put twice for the same ct, i.e.
the "last" ct. This will cause the ct will be freed but still linked
in hash buckets.
It's very easy to reproduce the problem by the following commands:
# while : ; do
echo $RANDOM > /proc/sys/net/netfilter/nf_conntrack_buckets
done
# while : ; do
conntrack -L
done
# iperf -s 127.0.0.1 &
# iperf -c 127.0.0.1 -P 60 -t 36000
After a while, the system will hang like this:
NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [bash:20184]
NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [iperf:20382]
...
So at last if we find cb->args[1] is equal to "last", this means hash
resize happened, then we can set cb->args[1] to 0 to fix the above
issue.
Fixes: d205dc40798d ("[NETFILTER]: ctnetlink: fix deadlock in table dumping")
Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/netfilter/nf_conntrack_netlink.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
index 9799a50bc604..a8be9b72e6cd 100644
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -890,8 +890,13 @@ ctnetlink_dump_table(struct sk_buff *skb, struct netlink_callback *cb)
}
out:
local_bh_enable();
- if (last)
+ if (last) {
+ /* nf ct hash resize happened, now clear the leftover. */
+ if ((struct nf_conn *)cb->args[1] == last)
+ cb->args[1] = 0;
+
nf_ct_put(last);
+ }
while (i) {
i--;
--
2.1.4
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH 0/4] Netfilter fixes for net
2017-05-29 11:34 [PATCH 0/4] Netfilter fixes for net Pablo Neira Ayuso
` (3 preceding siblings ...)
2017-05-29 11:34 ` [PATCH 4/4] netfilter: ctnetlink: fix incorrect nf_ct_put during hash resize Pablo Neira Ayuso
@ 2017-05-30 3:20 ` David Miller
4 siblings, 0 replies; 6+ messages in thread
From: David Miller @ 2017-05-30 3:20 UTC (permalink / raw)
To: pablo; +Cc: netfilter-devel, netdev
From: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Mon, 29 May 2017 13:34:28 +0200
> Hi David,
>
> The following patchset contains Netfilter fixes for your net tree,
> they are:
>
> 1) Conntrack SCTP CRC32c checksum mangling may operate on non-linear
> skbuff, patch from Davide Caratti.
>
> 2) nf_tables rb-tree set backend does not handle element re-addition
> after deletion in the same transaction, leading to infinite loop.
>
> 3) Atomically unclear the IPS_SRC_NAT_DONE_BIT on nat module removal,
> from Liping Zhang.
>
> 4) Conntrack hashtable resizing while ctnetlink dump is progress leads
> to a dead reference to released objects in the lists, also from
> Liping.
>
> You can pull these changes from:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git
Pulled, thanks a lot Pablo.
^ permalink raw reply [flat|nested] 6+ messages in thread