* [PATCH 1/7] netfilter: nf_tables: Fix for endless loop when dumping ruleset
2019-01-14 21:29 [PATCH 0/7] Netfilter fixes for net Pablo Neira Ayuso
@ 2019-01-14 21:29 ` Pablo Neira Ayuso
2019-01-14 21:29 ` [PATCH 2/7] netfilter: nf_tables: fix leaking object reference count Pablo Neira Ayuso
` (6 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Pablo Neira Ayuso @ 2019-01-14 21:29 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
From: Phil Sutter <phil@nwl.cc>
__nf_tables_dump_rules() stores the current idx value into cb->args[0]
before returning to its caller. With multiple chains present, cb->args[0]
is therefore updated after each chain's rules have been traversed. As a
result, the final nf_tables_dump_rules() run (which should return an
skb->len of zero since no rules are left to dump) dumps the rules of
every chain except the first one again. Fix this by moving the
cb->args[0] update to nf_tables_dump_rules().
With no final action to be performed anymore in
__nf_tables_dump_rules(), drop 'out_unfinished' jump label and 'rc'
variable - instead return the appropriate value directly.
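The resume-index problem described above can be reproduced in a small
userspace model. This is only a sketch under stated assumptions, not the
kernel code: struct dump_cb, SKB_CAPACITY, chain_rules[] and both
function names are hypothetical stand-ins, and the "skb" is reduced to a
counter of emitted rules.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct netlink_callback: only the resume slot. */
struct dump_cb {
	unsigned int resume_idx;	/* plays the role of cb->args[0] */
};

/* Assumed number of rules that fit into one dump message. */
#define SKB_CAPACITY 16

/* Two chains holding 3 and 2 rules (illustrative values). */
static const unsigned int chain_rules[] = { 3, 2 };
#define NUM_CHAINS (sizeof(chain_rules) / sizeof(chain_rules[0]))

/* Models __nf_tables_dump_rules(): returns 1 if the message is full. */
static int dump_chain(struct dump_cb *cb, unsigned int nrules,
		      unsigned int *idx, unsigned int *emitted,
		      int per_chain_update)
{
	unsigned int s_idx = cb->resume_idx;
	unsigned int i;
	int full = 0;

	for (i = 0; i < nrules; i++) {
		if (*idx >= s_idx) {
			if (*emitted == SKB_CAPACITY) {
				full = 1;
				break;
			}
			(*emitted)++;
		}
		(*idx)++;
	}
	/* Pre-patch behaviour: the resume slot is rewritten per chain. */
	if (per_chain_update)
		cb->resume_idx = *idx;
	return full;
}

/* Models one nf_tables_dump_rules() run; returns the rules emitted. */
unsigned int dump_pass(struct dump_cb *cb, int per_chain_update)
{
	unsigned int idx = 0, emitted = 0;
	size_t c;

	for (c = 0; c < NUM_CHAINS; c++)
		if (dump_chain(cb, chain_rules[c], &idx, &emitted,
			       per_chain_update))
			break;
	/* Post-patch behaviour: update the resume slot once per run. */
	if (!per_chain_update)
		cb->resume_idx = idx;
	return emitted;
}
```

In this model, calling dump_pass() repeatedly with per_chain_update set
keeps emitting the second chain's rules on every run after the first,
while with per_chain_update cleared the second run emits nothing, which
is what terminates the dump.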
Fixes: 241faeceb849c ("netfilter: nf_tables: Speed up selective rule dumps")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/netfilter/nf_tables_api.c | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 2b0a93300dd7..e3ddd8e95e58 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -2304,7 +2304,6 @@ static int __nf_tables_dump_rules(struct sk_buff *skb,
struct net *net = sock_net(skb->sk);
unsigned int s_idx = cb->args[0];
const struct nft_rule *rule;
- int rc = 1;
list_for_each_entry_rcu(rule, &chain->rules, list) {
if (!nft_is_active(net, rule))
@@ -2321,16 +2320,13 @@ static int __nf_tables_dump_rules(struct sk_buff *skb,
NLM_F_MULTI | NLM_F_APPEND,
table->family,
table, chain, rule) < 0)
- goto out_unfinished;
+ return 1;
nl_dump_check_consistent(cb, nlmsg_hdr(skb));
cont:
(*idx)++;
}
- rc = 0;
-out_unfinished:
- cb->args[0] = *idx;
- return rc;
+ return 0;
}
static int nf_tables_dump_rules(struct sk_buff *skb,
@@ -2382,6 +2378,8 @@ static int nf_tables_dump_rules(struct sk_buff *skb,
}
done:
rcu_read_unlock();
+
+ cb->args[0] = idx;
return skb->len;
}
--
2.11.0
* [PATCH 2/7] netfilter: nf_tables: fix leaking object reference count
2019-01-14 21:29 [PATCH 0/7] Netfilter fixes for net Pablo Neira Ayuso
2019-01-14 21:29 ` [PATCH 1/7] netfilter: nf_tables: Fix for endless loop when dumping ruleset Pablo Neira Ayuso
@ 2019-01-14 21:29 ` Pablo Neira Ayuso
2019-01-14 21:29 ` [PATCH 3/7] netfilter: nf_tables: selective rule dump needs table to be specified Pablo Neira Ayuso
` (5 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Pablo Neira Ayuso @ 2019-01-14 21:29 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
From: Taehee Yoo <ap420073@gmail.com>
There is no code that decreases the reference count of stateful objects
in the error path of nft_add_set_elem(). This causes a leak of the
reference count of stateful objects.
Test commands:
$nft add table ip filter
$nft add counter ip filter c1
$nft add map ip filter m1 { type ipv4_addr : counter \;}
$nft add element ip filter m1 { 1 : c1 }
$nft add element ip filter m1 { 1 : c1 }
$nft delete element ip filter m1 { 1 }
$nft delete counter ip filter c1
Result:
Error: Could not process rule: Device or resource busy
delete counter ip filter c1
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
At the second 'nft add element ip filter m1 { 1 : c1 }', the reference
count of 'c1' is increased before the element is inserted into 'm1'.
Because 'm1' already contains the same element, the insertion returns
-EEXIST, but the error path does not decrease the reference count of
'c1' again. Due to this leaked reference, 'c1' can't be removed by
'nft delete counter ip filter c1'.
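The sequence of test commands can be mirrored in a minimal userspace
model. This is a sketch under stated assumptions: struct obj, struct set
and all four functions are hypothetical, the map holds a single key, and
the undo_on_error flag selects between the pre-patch and post-patch
error path.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stateful object: only the use counter is modelled. */
struct obj {
	unsigned int use;
};

/* Hypothetical map holding at most one element that references an obj. */
struct set {
	bool has_key;
	struct obj *ref;
};

/* Models the add path: take the reference first, then try to insert. */
int add_elem(struct set *set, struct obj *obj, bool undo_on_error)
{
	obj->use++;
	if (set->has_key) {
		/* Error path: the patch adds this decrement. */
		if (undo_on_error)
			obj->use--;
		return -EEXIST;
	}
	set->has_key = true;
	set->ref = obj;
	return 0;
}

/* Deleting the element drops the reference it held. */
void del_elem(struct set *set)
{
	if (!set->has_key)
		return;
	set->ref->use--;
	set->ref = NULL;
	set->has_key = false;
}

/* An object that is still referenced cannot be deleted. */
int del_obj(const struct obj *obj)
{
	return obj->use ? -EBUSY : 0;
}
```

Running add, add, delete element, delete object against this model with
undo_on_error cleared leaves use at 1 and del_obj() returns -EBUSY; with
undo_on_error set, the counter returns to 0 and the object can be
deleted.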
Fixes: 8aeff920dcc9 ("netfilter: nf_tables: add stateful object reference to set elements")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/netfilter/nf_tables_api.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index e3ddd8e95e58..dcea979423bc 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -4506,6 +4506,8 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set,
err5:
kfree(trans);
err4:
+ if (obj)
+ obj->use--;
kfree(elem.priv);
err3:
if (nla[NFTA_SET_ELEM_DATA] != NULL)
--
2.11.0
* [PATCH 3/7] netfilter: nf_tables: selective rule dump needs table to be specified
2019-01-14 21:29 [PATCH 0/7] Netfilter fixes for net Pablo Neira Ayuso
2019-01-14 21:29 ` [PATCH 1/7] netfilter: nf_tables: Fix for endless loop when dumping ruleset Pablo Neira Ayuso
2019-01-14 21:29 ` [PATCH 2/7] netfilter: nf_tables: fix leaking object reference count Pablo Neira Ayuso
@ 2019-01-14 21:29 ` Pablo Neira Ayuso
2019-01-14 21:29 ` [PATCH 4/7] netfilter: nft_flow_offload: Fix reverse route lookup Pablo Neira Ayuso
` (4 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Pablo Neira Ayuso @ 2019-01-14 21:29 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
Table needs to be specified for selective rule dumps per chain.
Fixes: 241faeceb849c ("netfilter: nf_tables: Speed up selective rule dumps")
Reported-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/netfilter/nf_tables_api.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index dcea979423bc..fb07f6cfc719 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -2350,7 +2350,7 @@ static int nf_tables_dump_rules(struct sk_buff *skb,
if (ctx && ctx->table && strcmp(ctx->table, table->name) != 0)
continue;
- if (ctx && ctx->chain) {
+ if (ctx && ctx->table && ctx->chain) {
struct rhlist_head *list, *tmp;
list = rhltable_lookup(&table->chains_ht, ctx->chain,
--
2.11.0
* [PATCH 4/7] netfilter: nft_flow_offload: Fix reverse route lookup
2019-01-14 21:29 [PATCH 0/7] Netfilter fixes for net Pablo Neira Ayuso
` (2 preceding siblings ...)
2019-01-14 21:29 ` [PATCH 3/7] netfilter: nf_tables: selective rule dump needs table to be specified Pablo Neira Ayuso
@ 2019-01-14 21:29 ` Pablo Neira Ayuso
2019-01-14 21:29 ` [PATCH 5/7] netfilter: ebtables: account ebt_table_info to kmemcg Pablo Neira Ayuso
` (3 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Pablo Neira Ayuso @ 2019-01-14 21:29 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
From: wenxu <wenxu@ucloud.cn>
Using the following example:
client 1.1.1.7 ---> 2.2.2.7, which is DNATed to server 10.0.0.7
The first reply packet (i.e. syn+ack) uses an incorrect destination
address for the reverse route lookup since it uses:
daddr = ct->tuplehash[!dir].tuple.dst.u3.ip;
which is 2.2.2.7 in the scenario that is described above, while this
should be:
daddr = ct->tuplehash[dir].tuple.src.u3.ip;
that is 10.0.0.7.
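The two expressions can be compared on a small model of the conntrack
tuples for this DNAT scenario. This is a sketch, not kernel code: struct
tuple, struct conn, the IP4() macro and the function names are
hypothetical, and addresses are kept in host byte order for readability.

```c
#include <stdint.h>

/* Build an IPv4 address in host byte order (illustration only). */
#define IP4(a, b, c, d) \
	(((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
	 ((uint32_t)(c) << 8) | (uint32_t)(d))

enum ct_dir { DIR_ORIGINAL = 0, DIR_REPLY = 1 };

/* Hypothetical, reduced tuple: addresses only. */
struct tuple {
	uint32_t src;
	uint32_t dst;
};

struct conn {
	struct tuple tuplehash[2];
};

/* Tuples for: client 1.1.1.7 -> 2.2.2.7, DNATed to server 10.0.0.7. */
struct conn dnat_example(void)
{
	struct conn ct = {
		.tuplehash = {
			[DIR_ORIGINAL] = { .src = IP4(1, 1, 1, 7),
					   .dst = IP4(2, 2, 2, 7) },
			[DIR_REPLY]    = { .src = IP4(10, 0, 0, 7),
					   .dst = IP4(1, 1, 1, 7) },
		},
	};
	return ct;
}

/* Pre-patch choice of destination for the reverse route lookup. */
uint32_t reverse_daddr_old(const struct conn *ct, enum ct_dir dir)
{
	return ct->tuplehash[!dir].dst;
}

/* Post-patch choice: the source of the packet's own direction. */
uint32_t reverse_daddr_new(const struct conn *ct, enum ct_dir dir)
{
	return ct->tuplehash[dir].src;
}
```

For a packet in the reply direction the old expression yields 2.2.2.7
and the new one 10.0.0.7. For a packet in the original direction both
yield 1.1.1.7 in this model, so the difference only shows up when the
first offloaded packet is a reply.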
Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/netfilter/nft_flow_offload.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/netfilter/nft_flow_offload.c b/net/netfilter/nft_flow_offload.c
index 974525eb92df..ccdb8f5ababb 100644
--- a/net/netfilter/nft_flow_offload.c
+++ b/net/netfilter/nft_flow_offload.c
@@ -29,10 +29,10 @@ static int nft_flow_route(const struct nft_pktinfo *pkt,
memset(&fl, 0, sizeof(fl));
switch (nft_pf(pkt)) {
case NFPROTO_IPV4:
- fl.u.ip4.daddr = ct->tuplehash[!dir].tuple.dst.u3.ip;
+ fl.u.ip4.daddr = ct->tuplehash[dir].tuple.src.u3.ip;
break;
case NFPROTO_IPV6:
- fl.u.ip6.daddr = ct->tuplehash[!dir].tuple.dst.u3.in6;
+ fl.u.ip6.daddr = ct->tuplehash[dir].tuple.src.u3.in6;
break;
}
--
2.11.0
* [PATCH 5/7] netfilter: ebtables: account ebt_table_info to kmemcg
2019-01-14 21:29 [PATCH 0/7] Netfilter fixes for net Pablo Neira Ayuso
` (3 preceding siblings ...)
2019-01-14 21:29 ` [PATCH 4/7] netfilter: nft_flow_offload: Fix reverse route lookup Pablo Neira Ayuso
@ 2019-01-14 21:29 ` Pablo Neira Ayuso
2019-01-14 21:29 ` [PATCH 6/7] netfilter: nft_flow_offload: fix interaction with vrf slave device Pablo Neira Ayuso
` (2 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Pablo Neira Ayuso @ 2019-01-14 21:29 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
From: Shakeel Butt <shakeelb@google.com>
The [ip,ip6,arp]_tables use xt_table_info internally and the underlying
memory is already accounted to kmemcg. Do the same for ebtables. By
using setsockopt(EBT_SO_SET_ENTRIES), syzbot was able to OOM the whole
system from a restricted memcg, a potential DoS.
By accounting the ebt_table_info, the memory used for ebt_table_info can
be contained within the memcg of the allocating process. However, the
lifetime of ebt_table_info is independent of the allocating process and
is tied to the network namespace. So, the oom-killer will not be able to
relieve the memory pressure due to ebt_table_info memory. The memory for
ebt_table_info is allocated through vmalloc. Currently vmalloc does not
handle the oom-killed allocating process correctly and one large
allocation can bypass memcg limit enforcement. So, with this patch,
at least the small allocations will be contained. For large allocations,
we need to fix vmalloc.
Reported-by: syzbot+7713f3aa67be76b1552c@syzkaller.appspotmail.com
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/bridge/netfilter/ebtables.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/net/bridge/netfilter/ebtables.c b/net/bridge/netfilter/ebtables.c
index 491828713e0b..5e55cef0cec3 100644
--- a/net/bridge/netfilter/ebtables.c
+++ b/net/bridge/netfilter/ebtables.c
@@ -1137,14 +1137,16 @@ static int do_replace(struct net *net, const void __user *user,
tmp.name[sizeof(tmp.name) - 1] = 0;
countersize = COUNTER_OFFSET(tmp.nentries) * nr_cpu_ids;
- newinfo = vmalloc(sizeof(*newinfo) + countersize);
+ newinfo = __vmalloc(sizeof(*newinfo) + countersize, GFP_KERNEL_ACCOUNT,
+ PAGE_KERNEL);
if (!newinfo)
return -ENOMEM;
if (countersize)
memset(newinfo->counters, 0, countersize);
- newinfo->entries = vmalloc(tmp.entries_size);
+ newinfo->entries = __vmalloc(tmp.entries_size, GFP_KERNEL_ACCOUNT,
+ PAGE_KERNEL);
if (!newinfo->entries) {
ret = -ENOMEM;
goto free_newinfo;
--
2.11.0
* [PATCH 6/7] netfilter: nft_flow_offload: fix interaction with vrf slave device
2019-01-14 21:29 [PATCH 0/7] Netfilter fixes for net Pablo Neira Ayuso
` (4 preceding siblings ...)
2019-01-14 21:29 ` [PATCH 5/7] netfilter: ebtables: account ebt_table_info to kmemcg Pablo Neira Ayuso
@ 2019-01-14 21:29 ` Pablo Neira Ayuso
2019-01-14 21:29 ` [PATCH 7/7] netfilter: nft_flow_offload: fix checking method of conntrack helper Pablo Neira Ayuso
2019-01-15 21:32 ` [PATCH 0/7] Netfilter fixes for net David Miller
7 siblings, 0 replies; 9+ messages in thread
From: Pablo Neira Ayuso @ 2019-01-14 21:29 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
From: wenxu <wenxu@ucloud.cn>
In the forward chain, the iif is changed from slave device to master vrf
device. Thus, flow offload does not find a match on the lower slave
device.
This patch uses the cached route, i.e. dst->dev, to update the iif and
oif fields in the flow entry.
After this patch, the following example works fine:
# ip addr add dev eth0 1.1.1.1/24
# ip addr add dev eth1 10.0.0.1/24
# ip link add user1 type vrf table 1
# ip l set user1 up
# ip l set dev eth0 master user1
# ip l set dev eth1 master user1
# nft add table firewall
# nft add flowtable f fb1 { hook ingress priority 0 \; devices = { eth0, eth1 } \; }
# nft add chain f ftb-all {type filter hook forward priority 0 \; policy accept \; }
# nft add rule f ftb-all ct zone 1 ip protocol tcp flow offload @fb1
# nft add rule f ftb-all ct zone 1 ip protocol udp flow offload @fb1
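The difference between taking the interface index from the hook and
taking it from the cached route can be sketched as follows. This is a
simplified model under stated assumptions: the reduced struct
definitions, the function names and the ifindex values used in the note
below are hypothetical, and it assumes that the cached routes point at
the slave devices while the forward hook reports the VRF master as input
device.

```c
/* Reduced, hypothetical versions of the kernel structures. */
struct net_device {
	int ifindex;
};

struct dst_entry {
	const struct net_device *dev;
};

struct flow_tuple {
	int iifidx;
	int oifidx;
};

/* Pre-patch: indexes come from the hook's input and output devices. */
struct flow_tuple fill_dir_old(const struct net_device *hook_in,
			       const struct net_device *hook_out)
{
	struct flow_tuple ft = {
		.iifidx = hook_in->ifindex,
		.oifidx = hook_out->ifindex,
	};
	return ft;
}

/*
 * Post-patch: the output index comes from this direction's route and
 * the input index from the device of the opposite direction's route.
 */
struct flow_tuple fill_dir_new(const struct dst_entry *dst,
			       const struct dst_entry *other_dst)
{
	struct flow_tuple ft = {
		.iifidx = other_dst->dev->ifindex,
		.oifidx = dst->dev->ifindex,
	};
	return ft;
}
```

With eth0, eth1 and user1 given the made-up indexes 2, 3 and 4, the old
variant records iifidx 4 (the VRF master) for traffic entering on eth0,
which a lookup keyed on the ingress device eth0 would not match. The new
variant records iifidx 2 and oifidx 3.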
Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
include/net/netfilter/nf_flow_table.h | 1 -
net/netfilter/nf_flow_table_core.c | 5 +++--
net/netfilter/nft_flow_offload.c | 4 ++--
3 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/include/net/netfilter/nf_flow_table.h b/include/net/netfilter/nf_flow_table.h
index 7d5cda7ce32a..3e370cb36263 100644
--- a/include/net/netfilter/nf_flow_table.h
+++ b/include/net/netfilter/nf_flow_table.h
@@ -84,7 +84,6 @@ struct flow_offload {
struct nf_flow_route {
struct {
struct dst_entry *dst;
- int ifindex;
} tuple[FLOW_OFFLOAD_DIR_MAX];
};
diff --git a/net/netfilter/nf_flow_table_core.c b/net/netfilter/nf_flow_table_core.c
index fa0844e2a68d..c0c72ae9df42 100644
--- a/net/netfilter/nf_flow_table_core.c
+++ b/net/netfilter/nf_flow_table_core.c
@@ -28,6 +28,7 @@ flow_offload_fill_dir(struct flow_offload *flow, struct nf_conn *ct,
{
struct flow_offload_tuple *ft = &flow->tuplehash[dir].tuple;
struct nf_conntrack_tuple *ctt = &ct->tuplehash[dir].tuple;
+ struct dst_entry *other_dst = route->tuple[!dir].dst;
struct dst_entry *dst = route->tuple[dir].dst;
ft->dir = dir;
@@ -50,8 +51,8 @@ flow_offload_fill_dir(struct flow_offload *flow, struct nf_conn *ct,
ft->src_port = ctt->src.u.tcp.port;
ft->dst_port = ctt->dst.u.tcp.port;
- ft->iifidx = route->tuple[dir].ifindex;
- ft->oifidx = route->tuple[!dir].ifindex;
+ ft->iifidx = other_dst->dev->ifindex;
+ ft->oifidx = dst->dev->ifindex;
ft->dst_cache = dst;
}
diff --git a/net/netfilter/nft_flow_offload.c b/net/netfilter/nft_flow_offload.c
index ccdb8f5ababb..188c6bbf4e16 100644
--- a/net/netfilter/nft_flow_offload.c
+++ b/net/netfilter/nft_flow_offload.c
@@ -30,9 +30,11 @@ static int nft_flow_route(const struct nft_pktinfo *pkt,
switch (nft_pf(pkt)) {
case NFPROTO_IPV4:
fl.u.ip4.daddr = ct->tuplehash[dir].tuple.src.u3.ip;
+ fl.u.ip4.flowi4_oif = nft_in(pkt)->ifindex;
break;
case NFPROTO_IPV6:
fl.u.ip6.daddr = ct->tuplehash[dir].tuple.src.u3.in6;
+ fl.u.ip6.flowi6_oif = nft_in(pkt)->ifindex;
break;
}
@@ -41,9 +43,7 @@ static int nft_flow_route(const struct nft_pktinfo *pkt,
return -ENOENT;
route->tuple[dir].dst = this_dst;
- route->tuple[dir].ifindex = nft_in(pkt)->ifindex;
route->tuple[!dir].dst = other_dst;
- route->tuple[!dir].ifindex = nft_out(pkt)->ifindex;
return 0;
}
--
2.11.0
* [PATCH 7/7] netfilter: nft_flow_offload: fix checking method of conntrack helper
2019-01-14 21:29 [PATCH 0/7] Netfilter fixes for net Pablo Neira Ayuso
` (5 preceding siblings ...)
2019-01-14 21:29 ` [PATCH 6/7] netfilter: nft_flow_offload: fix interaction with vrf slave device Pablo Neira Ayuso
@ 2019-01-14 21:29 ` Pablo Neira Ayuso
2019-01-15 21:32 ` [PATCH 0/7] Netfilter fixes for net David Miller
7 siblings, 0 replies; 9+ messages in thread
From: Pablo Neira Ayuso @ 2019-01-14 21:29 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
From: Henry Yen <henry.yen@mediatek.com>
This patch uses nfct_help() to detect whether an established connection
needs a conntrack helper, instead of using test_bit(IPS_HELPER_BIT,
&ct->status).
The reason is that IPS_HELPER_BIT is only set when the helper is
assigned explicitly via the CT target.
However, when a device enables the conntrack helper via the command
"echo 1 > /proc/sys/net/netfilter/nf_conntrack_helper", IPS_HELPER_BIT
is not set, so the test_bit() check fails to detect the helper in this
context.
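The two detection methods can be contrasted in a small userspace model.
This is a sketch under stated assumptions: struct conn, struct
helper_ext, MODEL_HELPER_BIT and all function names are hypothetical,
and the model only captures that explicit assignment sets both the
status bit and the helper pointer while automatic assignment sets the
pointer alone.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative bit position standing in for IPS_HELPER_BIT. */
#define MODEL_HELPER_BIT 13

/* Hypothetical helper extension: only a name. */
struct helper_ext {
	const char *name;
};

/* Hypothetical, reduced connection: status flags and helper pointer. */
struct conn {
	unsigned long status;
	const struct helper_ext *help;
};

/* Explicit assignment (CT target): pointer and status bit are set. */
void assign_helper_explicit(struct conn *ct, const struct helper_ext *h)
{
	ct->help = h;
	ct->status |= 1UL << MODEL_HELPER_BIT;
}

/* Automatic assignment (sysctl enabled): only the pointer is set. */
void assign_helper_auto(struct conn *ct, const struct helper_ext *h)
{
	ct->help = h;
}

/* Pre-patch check: looks at the status bit only. */
bool has_helper_old(const struct conn *ct)
{
	return (ct->status & (1UL << MODEL_HELPER_BIT)) != 0;
}

/* Post-patch check: looks at the helper extension itself. */
bool has_helper_new(const struct conn *ct)
{
	return ct->help != NULL;
}
```

In this model the old check misses a connection whose helper was
assigned automatically, so such a flow would be offloaded even though a
helper still needs to see its packets. The new check catches both cases.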
Signed-off-by: Henry Yen <henry.yen@mediatek.com>
Reviewed-by: Ryder Lee <ryder.lee@mediatek.com>
Tested-by: John Crispin <john@phrozen.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/netfilter/nft_flow_offload.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/net/netfilter/nft_flow_offload.c b/net/netfilter/nft_flow_offload.c
index 188c6bbf4e16..6e6b9adf7d38 100644
--- a/net/netfilter/nft_flow_offload.c
+++ b/net/netfilter/nft_flow_offload.c
@@ -12,6 +12,7 @@
#include <net/netfilter/nf_conntrack_core.h>
#include <linux/netfilter/nf_conntrack_common.h>
#include <net/netfilter/nf_flow_table.h>
+#include <net/netfilter/nf_conntrack_helper.h>
struct nft_flow_offload {
struct nft_flowtable *flowtable;
@@ -66,6 +67,7 @@ static void nft_flow_offload_eval(const struct nft_expr *expr,
{
struct nft_flow_offload *priv = nft_expr_priv(expr);
struct nf_flowtable *flowtable = &priv->flowtable->data;
+ const struct nf_conn_help *help;
enum ip_conntrack_info ctinfo;
struct nf_flow_route route;
struct flow_offload *flow;
@@ -88,7 +90,8 @@ static void nft_flow_offload_eval(const struct nft_expr *expr,
goto out;
}
- if (test_bit(IPS_HELPER_BIT, &ct->status))
+ help = nfct_help(ct);
+ if (help)
goto out;
if (ctinfo == IP_CT_NEW ||
--
2.11.0
* Re: [PATCH 0/7] Netfilter fixes for net
2019-01-14 21:29 [PATCH 0/7] Netfilter fixes for net Pablo Neira Ayuso
` (6 preceding siblings ...)
2019-01-14 21:29 ` [PATCH 7/7] netfilter: nft_flow_offload: fix checking method of conntrack helper Pablo Neira Ayuso
@ 2019-01-15 21:32 ` David Miller
7 siblings, 0 replies; 9+ messages in thread
From: David Miller @ 2019-01-15 21:32 UTC (permalink / raw)
To: pablo; +Cc: netfilter-devel, netdev
From: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Mon, 14 Jan 2019 22:29:33 +0100
> This is the first batch of Netfilter fixes for your net tree:
>
> 1) Fix endless loop in nf_tables rules netlink dump, from Phil Sutter.
>
> 2) Reference counter leak in object from the error path, from Taehee Yoo.
>
> 3) Selective rule dump requires table and chain.
>
> 4) Fix DNAT with nft_flow_offload reverse route lookup, from wenxu.
>
> 5) Use GFP_KERNEL_ACCOUNT in vmalloc allocation from ebtables, from
> Shakeel Butt.
>
> 6) Set ifindex from route to fix interaction with VRF slave device,
> also from wenxu.
>
> 7) Use nfct_help() to check for conntrack helper, IPS_HELPER status
> flag is only set from explicit helpers via -j CT, from Henry Yen.
>
> You can pull these changes from:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git
Pulled, thanks Pablo.