* [PATCH 01/12] netfilter: ipvs: full-functionality option for ECN encapsulation in tunnel
2017-10-09 16:25 [PATCH 00/12] Netfilter/IPVS fixes for net Pablo Neira Ayuso
@ 2017-10-09 16:25 ` Pablo Neira Ayuso
2017-10-09 16:25 ` [PATCH 02/12] netfilter: xt_socket: Restore mark from full sockets only Pablo Neira Ayuso
` (11 subsequent siblings)
12 siblings, 0 replies; 21+ messages in thread
From: Pablo Neira Ayuso @ 2017-10-09 16:25 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
From: Vadim Fedorenko <vfedorenko@yandex-team.ru>
IPVS tunnel mode works as simple tunnel (see RFC 3168) copying ECN field
to outer header. That's result in packet drops on egress tunnels in case
the egress tunnel operates as ECN-capable with Full-functionality option
(like ip_tunnel and ip6_tunnel kernel modules), according to RFC 3168
section 9.1.1 recommendation.
This patch implements ECN full-functionality option into ipvs xmit code.
Cc: netdev@vger.kernel.org
Cc: lvs-devel@vger.kernel.org
Signed-off-by: Vadim Fedorenko <vfedorenko@yandex-team.ru>
Reviewed-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/netfilter/ipvs/ip_vs_xmit.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/net/netfilter/ipvs/ip_vs_xmit.c b/net/netfilter/ipvs/ip_vs_xmit.c
index 90d396814798..4527921b1c3a 100644
--- a/net/netfilter/ipvs/ip_vs_xmit.c
+++ b/net/netfilter/ipvs/ip_vs_xmit.c
@@ -921,6 +921,7 @@ ip_vs_prepare_tunneled_skb(struct sk_buff *skb, int skb_af,
{
struct sk_buff *new_skb = NULL;
struct iphdr *old_iph = NULL;
+ __u8 old_dsfield;
#ifdef CONFIG_IP_VS_IPV6
struct ipv6hdr *old_ipv6h = NULL;
#endif
@@ -945,7 +946,7 @@ ip_vs_prepare_tunneled_skb(struct sk_buff *skb, int skb_af,
*payload_len =
ntohs(old_ipv6h->payload_len) +
sizeof(*old_ipv6h);
- *dsfield = ipv6_get_dsfield(old_ipv6h);
+ old_dsfield = ipv6_get_dsfield(old_ipv6h);
*ttl = old_ipv6h->hop_limit;
if (df)
*df = 0;
@@ -960,12 +961,15 @@ ip_vs_prepare_tunneled_skb(struct sk_buff *skb, int skb_af,
/* fix old IP header checksum */
ip_send_check(old_iph);
- *dsfield = ipv4_get_dsfield(old_iph);
+ old_dsfield = ipv4_get_dsfield(old_iph);
*ttl = old_iph->ttl;
if (payload_len)
*payload_len = ntohs(old_iph->tot_len);
}
+ /* Implement full-functionality option for ECN encapsulation */
+ *dsfield = INET_ECN_encapsulate(old_dsfield, old_dsfield);
+
return skb;
error:
kfree_skb(skb);
--
2.1.4
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH 02/12] netfilter: xt_socket: Restore mark from full sockets only
2017-10-09 16:25 [PATCH 00/12] Netfilter/IPVS fixes for net Pablo Neira Ayuso
2017-10-09 16:25 ` [PATCH 01/12] netfilter: ipvs: full-functionality option for ECN encapsulation in tunnel Pablo Neira Ayuso
@ 2017-10-09 16:25 ` Pablo Neira Ayuso
2017-10-09 16:25 ` [PATCH 03/12] netfilter: ipset: Fix adding an IPv4 range containing more than 2^31 addresses Pablo Neira Ayuso
` (10 subsequent siblings)
12 siblings, 0 replies; 21+ messages in thread
From: Pablo Neira Ayuso @ 2017-10-09 16:25 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
From: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
An out of bounds error was detected on an ARM64 target with
Android based kernel 4.9. This occurs while trying to
restore mark on a skb from an inet request socket.
BUG: KASAN: slab-out-of-bounds in socket_match.isra.2+0xc8/0x1f0 net/netfilter/xt_socket.c:248
Read of size 4 at addr ffffffc06a8d824c by task syz-fuzzer/1532
CPU: 7 PID: 1532 Comm: syz-fuzzer Tainted: G W O 4.9.41+ #1
Call trace:
[<ffffff900808d2f8>] dump_backtrace+0x0/0x440 arch/arm64/kernel/traps.c:76
[<ffffff900808d760>] show_stack+0x28/0x38 arch/arm64/kernel/traps.c:226
[<ffffff90085f7dc8>] __dump_stack lib/dump_stack.c:15 [inline]
[<ffffff90085f7dc8>] dump_stack+0xe4/0x134 lib/dump_stack.c:51
[<ffffff900830f358>] print_address_description+0x68/0x258 mm/kasan/report.c:248
[<ffffff900830f770>] kasan_report_error mm/kasan/report.c:347 [inline]
[<ffffff900830f770>] kasan_report.part.2+0x228/0x2f0 mm/kasan/report.c:371
[<ffffff900830fdec>] kasan_report+0x5c/0x70 mm/kasan/report.c:372
[<ffffff900830de98>] check_memory_region_inline mm/kasan/kasan.c:308 [inline]
[<ffffff900830de98>] __asan_load4+0x88/0xa0 mm/kasan/kasan.c:740
[<ffffff90097498f8>] socket_match.isra.2+0xc8/0x1f0 net/netfilter/xt_socket.c:248
[<ffffff9009749a5c>] socket_mt4_v1_v2_v3+0x3c/0x48 net/netfilter/xt_socket.c:272
[<ffffff90097f7e4c>] ipt_do_table+0x54c/0xad8 net/ipv4/netfilter/ip_tables.c:311
[<ffffff90097fcf14>] iptable_mangle_hook+0x6c/0x220 net/ipv4/netfilter/iptable_mangle.c:90
...
Allocated by task 1532:
save_stack_trace_tsk+0x0/0x2a0 arch/arm64/kernel/stacktrace.c:131
save_stack_trace+0x28/0x38 arch/arm64/kernel/stacktrace.c:215
save_stack mm/kasan/kasan.c:495 [inline]
set_track mm/kasan/kasan.c:507 [inline]
kasan_kmalloc+0xd8/0x188 mm/kasan/kasan.c:599
kasan_slab_alloc+0x14/0x20 mm/kasan/kasan.c:537
slab_post_alloc_hook mm/slab.h:417 [inline]
slab_alloc_node mm/slub.c:2728 [inline]
slab_alloc mm/slub.c:2736 [inline]
kmem_cache_alloc+0x14c/0x2e8 mm/slub.c:2741
reqsk_alloc include/net/request_sock.h:87 [inline]
inet_reqsk_alloc+0x4c/0x238 net/ipv4/tcp_input.c:6236
tcp_conn_request+0x2b0/0xea8 net/ipv4/tcp_input.c:6341
tcp_v4_conn_request+0xe0/0x100 net/ipv4/tcp_ipv4.c:1256
tcp_rcv_state_process+0x384/0x18a8 net/ipv4/tcp_input.c:5926
tcp_v4_do_rcv+0x2f0/0x3e0 net/ipv4/tcp_ipv4.c:1430
tcp_v4_rcv+0x1278/0x1350 net/ipv4/tcp_ipv4.c:1709
ip_local_deliver_finish+0x174/0x3e0 net/ipv4/ip_input.c:216
v1->v2: Change socket_mt6_v1_v2_v3() as well as mentioned by Eric
v2->v3: Put the correct fixes tag
Fixes: 01555e74bde5 ("netfilter: xt_socket: add XT_SOCKET_RESTORESKMARK flag")
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/netfilter/xt_socket.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/netfilter/xt_socket.c b/net/netfilter/xt_socket.c
index e75ef39669c5..575d2153e3b8 100644
--- a/net/netfilter/xt_socket.c
+++ b/net/netfilter/xt_socket.c
@@ -76,7 +76,7 @@ socket_match(const struct sk_buff *skb, struct xt_action_param *par,
transparent = nf_sk_is_transparent(sk);
if (info->flags & XT_SOCKET_RESTORESKMARK && !wildcard &&
- transparent)
+ transparent && sk_fullsock(sk))
pskb->mark = sk->sk_mark;
if (sk != skb->sk)
@@ -133,7 +133,7 @@ socket_mt6_v1_v2_v3(const struct sk_buff *skb, struct xt_action_param *par)
transparent = nf_sk_is_transparent(sk);
if (info->flags & XT_SOCKET_RESTORESKMARK && !wildcard &&
- transparent)
+ transparent && sk_fullsock(sk))
pskb->mark = sk->sk_mark;
if (sk != skb->sk)
--
2.1.4
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH 03/12] netfilter: ipset: Fix adding an IPv4 range containing more than 2^31 addresses
2017-10-09 16:25 [PATCH 00/12] Netfilter/IPVS fixes for net Pablo Neira Ayuso
2017-10-09 16:25 ` [PATCH 01/12] netfilter: ipvs: full-functionality option for ECN encapsulation in tunnel Pablo Neira Ayuso
2017-10-09 16:25 ` [PATCH 02/12] netfilter: xt_socket: Restore mark from full sockets only Pablo Neira Ayuso
@ 2017-10-09 16:25 ` Pablo Neira Ayuso
2017-10-09 16:25 ` [PATCH 04/12] netfilter: ipset: pernet ops must be unregistered last Pablo Neira Ayuso
` (9 subsequent siblings)
12 siblings, 0 replies; 21+ messages in thread
From: Pablo Neira Ayuso @ 2017-10-09 16:25 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
From: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Wrong comparison prevented the hash types to add a range with more than
2^31 addresses but reported as a success.
Fixes Netfilter's bugzilla id #1005, reported by Oleg Serditov and
Oliver Ford.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/netfilter/ipset/ip_set_hash_ip.c | 22 ++++++++++++----------
net/netfilter/ipset/ip_set_hash_ipmark.c | 2 +-
net/netfilter/ipset/ip_set_hash_ipport.c | 2 +-
net/netfilter/ipset/ip_set_hash_ipportip.c | 2 +-
net/netfilter/ipset/ip_set_hash_ipportnet.c | 4 ++--
net/netfilter/ipset/ip_set_hash_net.c | 2 +-
net/netfilter/ipset/ip_set_hash_netiface.c | 2 +-
net/netfilter/ipset/ip_set_hash_netnet.c | 4 ++--
net/netfilter/ipset/ip_set_hash_netport.c | 2 +-
net/netfilter/ipset/ip_set_hash_netportnet.c | 4 ++--
10 files changed, 24 insertions(+), 22 deletions(-)
diff --git a/net/netfilter/ipset/ip_set_hash_ip.c b/net/netfilter/ipset/ip_set_hash_ip.c
index 20bfbd315f61..613eb212cb48 100644
--- a/net/netfilter/ipset/ip_set_hash_ip.c
+++ b/net/netfilter/ipset/ip_set_hash_ip.c
@@ -123,13 +123,12 @@ hash_ip4_uadt(struct ip_set *set, struct nlattr *tb[],
return ret;
ip &= ip_set_hostmask(h->netmask);
+ e.ip = htonl(ip);
+ if (e.ip == 0)
+ return -IPSET_ERR_HASH_ELEM;
- if (adt == IPSET_TEST) {
- e.ip = htonl(ip);
- if (e.ip == 0)
- return -IPSET_ERR_HASH_ELEM;
+ if (adt == IPSET_TEST)
return adtfn(set, &e, &ext, &ext, flags);
- }
ip_to = ip;
if (tb[IPSET_ATTR_IP_TO]) {
@@ -148,17 +147,20 @@ hash_ip4_uadt(struct ip_set *set, struct nlattr *tb[],
hosts = h->netmask == 32 ? 1 : 2 << (32 - h->netmask - 1);
- if (retried)
+ if (retried) {
ip = ntohl(h->next.ip);
- for (; !before(ip_to, ip); ip += hosts) {
e.ip = htonl(ip);
- if (e.ip == 0)
- return -IPSET_ERR_HASH_ELEM;
+ }
+ for (; ip <= ip_to;) {
ret = adtfn(set, &e, &ext, &ext, flags);
-
if (ret && !ip_set_eexist(ret, flags))
return ret;
+ ip += hosts;
+ e.ip = htonl(ip);
+ if (e.ip == 0)
+ return 0;
+
ret = 0;
}
return ret;
diff --git a/net/netfilter/ipset/ip_set_hash_ipmark.c b/net/netfilter/ipset/ip_set_hash_ipmark.c
index b64cf14e8352..f3ba8348cf9d 100644
--- a/net/netfilter/ipset/ip_set_hash_ipmark.c
+++ b/net/netfilter/ipset/ip_set_hash_ipmark.c
@@ -149,7 +149,7 @@ hash_ipmark4_uadt(struct ip_set *set, struct nlattr *tb[],
if (retried)
ip = ntohl(h->next.ip);
- for (; !before(ip_to, ip); ip++) {
+ for (; ip <= ip_to; ip++) {
e.ip = htonl(ip);
ret = adtfn(set, &e, &ext, &ext, flags);
diff --git a/net/netfilter/ipset/ip_set_hash_ipport.c b/net/netfilter/ipset/ip_set_hash_ipport.c
index f438740e6c6a..ddb8039ec1d2 100644
--- a/net/netfilter/ipset/ip_set_hash_ipport.c
+++ b/net/netfilter/ipset/ip_set_hash_ipport.c
@@ -178,7 +178,7 @@ hash_ipport4_uadt(struct ip_set *set, struct nlattr *tb[],
if (retried)
ip = ntohl(h->next.ip);
- for (; !before(ip_to, ip); ip++) {
+ for (; ip <= ip_to; ip++) {
p = retried && ip == ntohl(h->next.ip) ? ntohs(h->next.port)
: port;
for (; p <= port_to; p++) {
diff --git a/net/netfilter/ipset/ip_set_hash_ipportip.c b/net/netfilter/ipset/ip_set_hash_ipportip.c
index 6215fb898c50..a7f4d7a85420 100644
--- a/net/netfilter/ipset/ip_set_hash_ipportip.c
+++ b/net/netfilter/ipset/ip_set_hash_ipportip.c
@@ -185,7 +185,7 @@ hash_ipportip4_uadt(struct ip_set *set, struct nlattr *tb[],
if (retried)
ip = ntohl(h->next.ip);
- for (; !before(ip_to, ip); ip++) {
+ for (; ip <= ip_to; ip++) {
p = retried && ip == ntohl(h->next.ip) ? ntohs(h->next.port)
: port;
for (; p <= port_to; p++) {
diff --git a/net/netfilter/ipset/ip_set_hash_ipportnet.c b/net/netfilter/ipset/ip_set_hash_ipportnet.c
index 5ab1b99a53c2..a2f19b9906e9 100644
--- a/net/netfilter/ipset/ip_set_hash_ipportnet.c
+++ b/net/netfilter/ipset/ip_set_hash_ipportnet.c
@@ -271,7 +271,7 @@ hash_ipportnet4_uadt(struct ip_set *set, struct nlattr *tb[],
if (retried)
ip = ntohl(h->next.ip);
- for (; !before(ip_to, ip); ip++) {
+ for (; ip <= ip_to; ip++) {
e.ip = htonl(ip);
p = retried && ip == ntohl(h->next.ip) ? ntohs(h->next.port)
: port;
@@ -281,7 +281,7 @@ hash_ipportnet4_uadt(struct ip_set *set, struct nlattr *tb[],
ip == ntohl(h->next.ip) &&
p == ntohs(h->next.port)
? ntohl(h->next.ip2) : ip2_from;
- while (!after(ip2, ip2_to)) {
+ while (ip2 <= ip2_to) {
e.ip2 = htonl(ip2);
ip2_last = ip_set_range_to_cidr(ip2, ip2_to,
&cidr);
diff --git a/net/netfilter/ipset/ip_set_hash_net.c b/net/netfilter/ipset/ip_set_hash_net.c
index 5d9e895452e7..1c67a1761e45 100644
--- a/net/netfilter/ipset/ip_set_hash_net.c
+++ b/net/netfilter/ipset/ip_set_hash_net.c
@@ -193,7 +193,7 @@ hash_net4_uadt(struct ip_set *set, struct nlattr *tb[],
}
if (retried)
ip = ntohl(h->next.ip);
- while (!after(ip, ip_to)) {
+ while (ip <= ip_to) {
e.ip = htonl(ip);
last = ip_set_range_to_cidr(ip, ip_to, &e.cidr);
ret = adtfn(set, &e, &ext, &ext, flags);
diff --git a/net/netfilter/ipset/ip_set_hash_netiface.c b/net/netfilter/ipset/ip_set_hash_netiface.c
index 44cf11939c91..d417074f1c1a 100644
--- a/net/netfilter/ipset/ip_set_hash_netiface.c
+++ b/net/netfilter/ipset/ip_set_hash_netiface.c
@@ -255,7 +255,7 @@ hash_netiface4_uadt(struct ip_set *set, struct nlattr *tb[],
if (retried)
ip = ntohl(h->next.ip);
- while (!after(ip, ip_to)) {
+ while (ip <= ip_to) {
e.ip = htonl(ip);
last = ip_set_range_to_cidr(ip, ip_to, &e.cidr);
ret = adtfn(set, &e, &ext, &ext, flags);
diff --git a/net/netfilter/ipset/ip_set_hash_netnet.c b/net/netfilter/ipset/ip_set_hash_netnet.c
index db614e13b193..7f9ae2e9645b 100644
--- a/net/netfilter/ipset/ip_set_hash_netnet.c
+++ b/net/netfilter/ipset/ip_set_hash_netnet.c
@@ -250,13 +250,13 @@ hash_netnet4_uadt(struct ip_set *set, struct nlattr *tb[],
if (retried)
ip = ntohl(h->next.ip[0]);
- while (!after(ip, ip_to)) {
+ while (ip <= ip_to) {
e.ip[0] = htonl(ip);
last = ip_set_range_to_cidr(ip, ip_to, &e.cidr[0]);
ip2 = (retried &&
ip == ntohl(h->next.ip[0])) ? ntohl(h->next.ip[1])
: ip2_from;
- while (!after(ip2, ip2_to)) {
+ while (ip2 <= ip2_to) {
e.ip[1] = htonl(ip2);
last2 = ip_set_range_to_cidr(ip2, ip2_to, &e.cidr[1]);
ret = adtfn(set, &e, &ext, &ext, flags);
diff --git a/net/netfilter/ipset/ip_set_hash_netport.c b/net/netfilter/ipset/ip_set_hash_netport.c
index 54b64b6cd0cd..e6ef382febe4 100644
--- a/net/netfilter/ipset/ip_set_hash_netport.c
+++ b/net/netfilter/ipset/ip_set_hash_netport.c
@@ -241,7 +241,7 @@ hash_netport4_uadt(struct ip_set *set, struct nlattr *tb[],
if (retried)
ip = ntohl(h->next.ip);
- while (!after(ip, ip_to)) {
+ while (ip <= ip_to) {
e.ip = htonl(ip);
last = ip_set_range_to_cidr(ip, ip_to, &cidr);
e.cidr = cidr - 1;
diff --git a/net/netfilter/ipset/ip_set_hash_netportnet.c b/net/netfilter/ipset/ip_set_hash_netportnet.c
index aff846960ac4..8602f2595a1a 100644
--- a/net/netfilter/ipset/ip_set_hash_netportnet.c
+++ b/net/netfilter/ipset/ip_set_hash_netportnet.c
@@ -291,7 +291,7 @@ hash_netportnet4_uadt(struct ip_set *set, struct nlattr *tb[],
if (retried)
ip = ntohl(h->next.ip[0]);
- while (!after(ip, ip_to)) {
+ while (ip <= ip_to) {
e.ip[0] = htonl(ip);
ip_last = ip_set_range_to_cidr(ip, ip_to, &e.cidr[0]);
p = retried && ip == ntohl(h->next.ip[0]) ? ntohs(h->next.port)
@@ -301,7 +301,7 @@ hash_netportnet4_uadt(struct ip_set *set, struct nlattr *tb[],
ip2 = (retried && ip == ntohl(h->next.ip[0]) &&
p == ntohs(h->next.port)) ? ntohl(h->next.ip[1])
: ip2_from;
- while (!after(ip2, ip2_to)) {
+ while (ip2 <= ip2_to) {
e.ip[1] = htonl(ip2);
ip2_last = ip_set_range_to_cidr(ip2, ip2_to,
&e.cidr[1]);
--
2.1.4
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH 04/12] netfilter: ipset: pernet ops must be unregistered last
2017-10-09 16:25 [PATCH 00/12] Netfilter/IPVS fixes for net Pablo Neira Ayuso
` (2 preceding siblings ...)
2017-10-09 16:25 ` [PATCH 03/12] netfilter: ipset: Fix adding an IPv4 range containing more than 2^31 addresses Pablo Neira Ayuso
@ 2017-10-09 16:25 ` Pablo Neira Ayuso
2017-10-09 16:25 ` [PATCH 05/12] netfilter: ipset: Fix race between dump and swap Pablo Neira Ayuso
` (8 subsequent siblings)
12 siblings, 0 replies; 21+ messages in thread
From: Pablo Neira Ayuso @ 2017-10-09 16:25 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
From: Florian Westphal <fw@strlen.de>
Removing the ipset module leaves a small window where one cpu performs
module removal while another runs a command like 'ipset flush'.
ipset uses net_generic(), unregistering the pernet ops frees this
storage area.
Fix it by first removing the user-visible api handlers and the pernet
ops last.
Fixes: 1785e8f473082 ("netfiler: ipset: Add net namespace for ipset")
Reported-by: Li Shuang <shuali@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/netfilter/ipset/ip_set_core.c | 22 +++++++++++++---------
1 file changed, 13 insertions(+), 9 deletions(-)
diff --git a/net/netfilter/ipset/ip_set_core.c b/net/netfilter/ipset/ip_set_core.c
index e495b5e484b1..a7f049ff3049 100644
--- a/net/netfilter/ipset/ip_set_core.c
+++ b/net/netfilter/ipset/ip_set_core.c
@@ -2072,25 +2072,28 @@ static struct pernet_operations ip_set_net_ops = {
static int __init
ip_set_init(void)
{
- int ret = nfnetlink_subsys_register(&ip_set_netlink_subsys);
+ int ret = register_pernet_subsys(&ip_set_net_ops);
+ if (ret) {
+ pr_err("ip_set: cannot register pernet_subsys.\n");
+ return ret;
+ }
+
+ ret = nfnetlink_subsys_register(&ip_set_netlink_subsys);
if (ret != 0) {
pr_err("ip_set: cannot register with nfnetlink.\n");
+ unregister_pernet_subsys(&ip_set_net_ops);
return ret;
}
+
ret = nf_register_sockopt(&so_set);
if (ret != 0) {
pr_err("SO_SET registry failed: %d\n", ret);
nfnetlink_subsys_unregister(&ip_set_netlink_subsys);
+ unregister_pernet_subsys(&ip_set_net_ops);
return ret;
}
- ret = register_pernet_subsys(&ip_set_net_ops);
- if (ret) {
- pr_err("ip_set: cannot register pernet_subsys.\n");
- nf_unregister_sockopt(&so_set);
- nfnetlink_subsys_unregister(&ip_set_netlink_subsys);
- return ret;
- }
+
pr_info("ip_set: protocol %u\n", IPSET_PROTOCOL);
return 0;
}
@@ -2098,9 +2101,10 @@ ip_set_init(void)
static void __exit
ip_set_fini(void)
{
- unregister_pernet_subsys(&ip_set_net_ops);
nf_unregister_sockopt(&so_set);
nfnetlink_subsys_unregister(&ip_set_netlink_subsys);
+
+ unregister_pernet_subsys(&ip_set_net_ops);
pr_debug("these are the famous last words\n");
}
--
2.1.4
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH 05/12] netfilter: ipset: Fix race between dump and swap
2017-10-09 16:25 [PATCH 00/12] Netfilter/IPVS fixes for net Pablo Neira Ayuso
` (3 preceding siblings ...)
2017-10-09 16:25 ` [PATCH 04/12] netfilter: ipset: pernet ops must be unregistered last Pablo Neira Ayuso
@ 2017-10-09 16:25 ` Pablo Neira Ayuso
2017-10-09 16:25 ` [PATCH 06/12] netfilter: nf_tables: fix update chain error Pablo Neira Ayuso
` (7 subsequent siblings)
12 siblings, 0 replies; 21+ messages in thread
From: Pablo Neira Ayuso @ 2017-10-09 16:25 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
From: Ross Lagerwall <ross.lagerwall@citrix.com>
Fix a race between ip_set_dump_start() and ip_set_swap().
The race is as follows:
* Without holding the ref lock, ip_set_swap() checks ref_netlink of the
set and it is 0.
* ip_set_dump_start() takes a reference on the set.
* ip_set_swap() does the swap (even though it now has a non-zero
reference count).
* ip_set_dump_start() gets the set from ip_set_list again which is now a
different set since it has been swapped.
* ip_set_dump_start() calls __ip_set_put_netlink() and hits a BUG_ON due
to the reference count being 0.
Fix this race by extending the critical region in which the ref lock is
held to include checking the ref counts.
The race can be reproduced with the following script:
while :; do
ipset destroy hash_ip1
ipset destroy hash_ip2
ipset create hash_ip1 hash:ip family inet hashsize 1024 \
maxelem 500000
ipset create hash_ip2 hash:ip family inet hashsize 300000 \
maxelem 500000
ipset create hash_ip3 hash:ip family inet hashsize 1024 \
maxelem 500000
ipset save &
ipset swap hash_ip3 hash_ip2
ipset destroy hash_ip3
wait
done
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/netfilter/ipset/ip_set_core.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/net/netfilter/ipset/ip_set_core.c b/net/netfilter/ipset/ip_set_core.c
index a7f049ff3049..cf84f7b37cd9 100644
--- a/net/netfilter/ipset/ip_set_core.c
+++ b/net/netfilter/ipset/ip_set_core.c
@@ -1191,14 +1191,17 @@ static int ip_set_swap(struct net *net, struct sock *ctnl, struct sk_buff *skb,
from->family == to->family))
return -IPSET_ERR_TYPE_MISMATCH;
- if (from->ref_netlink || to->ref_netlink)
+ write_lock_bh(&ip_set_ref_lock);
+
+ if (from->ref_netlink || to->ref_netlink) {
+ write_unlock_bh(&ip_set_ref_lock);
return -EBUSY;
+ }
strncpy(from_name, from->name, IPSET_MAXNAMELEN);
strncpy(from->name, to->name, IPSET_MAXNAMELEN);
strncpy(to->name, from_name, IPSET_MAXNAMELEN);
- write_lock_bh(&ip_set_ref_lock);
swap(from->ref, to->ref);
ip_set(inst, from_id) = to;
ip_set(inst, to_id) = from;
--
2.1.4
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH 06/12] netfilter: nf_tables: fix update chain error
2017-10-09 16:25 [PATCH 00/12] Netfilter/IPVS fixes for net Pablo Neira Ayuso
` (4 preceding siblings ...)
2017-10-09 16:25 ` [PATCH 05/12] netfilter: ipset: Fix race between dump and swap Pablo Neira Ayuso
@ 2017-10-09 16:25 ` Pablo Neira Ayuso
2017-10-09 16:25 ` [PATCH 07/12] netfilter: ebtables: fix race condition in frame_filter_net_init() Pablo Neira Ayuso
` (6 subsequent siblings)
12 siblings, 0 replies; 21+ messages in thread
From: Pablo Neira Ayuso @ 2017-10-09 16:25 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
From: JingPiao Chen <chenjingpiao@gmail.com>
# nft add table filter
# nft add chain filter c1
# nft rename chain filter c1 c2
Error: Could not process rule: No such file or directory
rename chain filter c1 c2
^^^^^^^^^^^^^^^^^^^^^^^^^^
# nft add chain filter c2
# nft rename chain filter c1 c2
# nft list table filter
table ip filter {
chain c2 {
}
chain c2 {
}
}
Fixes: 664b0f8cd8 ("netfilter: nf_tables: add generation mask to chains")
Signed-off-by: JingPiao Chen <chenjingpiao@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/netfilter/nf_tables_api.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 929927171426..f98ca8c6aa59 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -1487,8 +1487,8 @@ static int nf_tables_updchain(struct nft_ctx *ctx, u8 genmask, u8 policy,
chain2 = nf_tables_chain_lookup(table, nla[NFTA_CHAIN_NAME],
genmask);
- if (IS_ERR(chain2))
- return PTR_ERR(chain2);
+ if (!IS_ERR(chain2))
+ return -EEXIST;
}
if (nla[NFTA_CHAIN_COUNTERS]) {
--
2.1.4
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH 07/12] netfilter: ebtables: fix race condition in frame_filter_net_init()
2017-10-09 16:25 [PATCH 00/12] Netfilter/IPVS fixes for net Pablo Neira Ayuso
` (5 preceding siblings ...)
2017-10-09 16:25 ` [PATCH 06/12] netfilter: nf_tables: fix update chain error Pablo Neira Ayuso
@ 2017-10-09 16:25 ` Pablo Neira Ayuso
2017-10-09 16:25 ` [PATCH 08/12] netfilter: nf_tables: Release memory obtained by kasprintf Pablo Neira Ayuso
` (5 subsequent siblings)
12 siblings, 0 replies; 21+ messages in thread
From: Pablo Neira Ayuso @ 2017-10-09 16:25 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
From: Artem Savkov <asavkov@redhat.com>
It is possible for ebt_in_hook to be triggered before ebt_table is assigned
resulting in a NULL-pointer dereference. Make sure hooks are
registered as the last step.
Fixes: aee12a0a3727 ("ebtables: remove nf_hook_register usage")
Signed-off-by: Artem Savkov <asavkov@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
include/linux/netfilter_bridge/ebtables.h | 7 ++++---
net/bridge/netfilter/ebtable_broute.c | 4 ++--
net/bridge/netfilter/ebtable_filter.c | 4 ++--
net/bridge/netfilter/ebtable_nat.c | 4 ++--
net/bridge/netfilter/ebtables.c | 17 +++++++++--------
5 files changed, 19 insertions(+), 17 deletions(-)
diff --git a/include/linux/netfilter_bridge/ebtables.h b/include/linux/netfilter_bridge/ebtables.h
index 2c2a5514b0df..528b24c78308 100644
--- a/include/linux/netfilter_bridge/ebtables.h
+++ b/include/linux/netfilter_bridge/ebtables.h
@@ -108,9 +108,10 @@ struct ebt_table {
#define EBT_ALIGN(s) (((s) + (__alignof__(struct _xt_align)-1)) & \
~(__alignof__(struct _xt_align)-1))
-extern struct ebt_table *ebt_register_table(struct net *net,
- const struct ebt_table *table,
- const struct nf_hook_ops *);
+extern int ebt_register_table(struct net *net,
+ const struct ebt_table *table,
+ const struct nf_hook_ops *ops,
+ struct ebt_table **res);
extern void ebt_unregister_table(struct net *net, struct ebt_table *table,
const struct nf_hook_ops *);
extern unsigned int ebt_do_table(struct sk_buff *skb,
diff --git a/net/bridge/netfilter/ebtable_broute.c b/net/bridge/netfilter/ebtable_broute.c
index 2585b100ebbb..276b60262981 100644
--- a/net/bridge/netfilter/ebtable_broute.c
+++ b/net/bridge/netfilter/ebtable_broute.c
@@ -65,8 +65,8 @@ static int ebt_broute(struct sk_buff *skb)
static int __net_init broute_net_init(struct net *net)
{
- net->xt.broute_table = ebt_register_table(net, &broute_table, NULL);
- return PTR_ERR_OR_ZERO(net->xt.broute_table);
+ return ebt_register_table(net, &broute_table, NULL,
+ &net->xt.broute_table);
}
static void __net_exit broute_net_exit(struct net *net)
diff --git a/net/bridge/netfilter/ebtable_filter.c b/net/bridge/netfilter/ebtable_filter.c
index 45a00dbdbcad..c41da5fac84f 100644
--- a/net/bridge/netfilter/ebtable_filter.c
+++ b/net/bridge/netfilter/ebtable_filter.c
@@ -93,8 +93,8 @@ static const struct nf_hook_ops ebt_ops_filter[] = {
static int __net_init frame_filter_net_init(struct net *net)
{
- net->xt.frame_filter = ebt_register_table(net, &frame_filter, ebt_ops_filter);
- return PTR_ERR_OR_ZERO(net->xt.frame_filter);
+ return ebt_register_table(net, &frame_filter, ebt_ops_filter,
+ &net->xt.frame_filter);
}
static void __net_exit frame_filter_net_exit(struct net *net)
diff --git a/net/bridge/netfilter/ebtable_nat.c b/net/bridge/netfilter/ebtable_nat.c
index 57cd5bb154e7..08df7406ecb3 100644
--- a/net/bridge/netfilter/ebtable_nat.c
+++ b/net/bridge/netfilter/ebtable_nat.c
@@ -93,8 +93,8 @@ static const struct nf_hook_ops ebt_ops_nat[] = {
static int __net_init frame_nat_net_init(struct net *net)
{
- net->xt.frame_nat = ebt_register_table(net, &frame_nat, ebt_ops_nat);
- return PTR_ERR_OR_ZERO(net->xt.frame_nat);
+ return ebt_register_table(net, &frame_nat, ebt_ops_nat,
+ &net->xt.frame_nat);
}
static void __net_exit frame_nat_net_exit(struct net *net)
diff --git a/net/bridge/netfilter/ebtables.c b/net/bridge/netfilter/ebtables.c
index 83951f978445..3b3dcf719e07 100644
--- a/net/bridge/netfilter/ebtables.c
+++ b/net/bridge/netfilter/ebtables.c
@@ -1169,9 +1169,8 @@ static void __ebt_unregister_table(struct net *net, struct ebt_table *table)
kfree(table);
}
-struct ebt_table *
-ebt_register_table(struct net *net, const struct ebt_table *input_table,
- const struct nf_hook_ops *ops)
+int ebt_register_table(struct net *net, const struct ebt_table *input_table,
+ const struct nf_hook_ops *ops, struct ebt_table **res)
{
struct ebt_table_info *newinfo;
struct ebt_table *t, *table;
@@ -1183,7 +1182,7 @@ ebt_register_table(struct net *net, const struct ebt_table *input_table,
repl->entries == NULL || repl->entries_size == 0 ||
repl->counters != NULL || input_table->private != NULL) {
BUGPRINT("Bad table data for ebt_register_table!!!\n");
- return ERR_PTR(-EINVAL);
+ return -EINVAL;
}
/* Don't add one table to multiple lists. */
@@ -1252,16 +1251,18 @@ ebt_register_table(struct net *net, const struct ebt_table *input_table,
list_add(&table->list, &net->xt.tables[NFPROTO_BRIDGE]);
mutex_unlock(&ebt_mutex);
+ WRITE_ONCE(*res, table);
+
if (!ops)
- return table;
+ return 0;
ret = nf_register_net_hooks(net, ops, hweight32(table->valid_hooks));
if (ret) {
__ebt_unregister_table(net, table);
- return ERR_PTR(ret);
+ *res = NULL;
}
- return table;
+ return ret;
free_unlock:
mutex_unlock(&ebt_mutex);
free_chainstack:
@@ -1276,7 +1277,7 @@ ebt_register_table(struct net *net, const struct ebt_table *input_table,
free_table:
kfree(table);
out:
- return ERR_PTR(ret);
+ return ret;
}
void ebt_unregister_table(struct net *net, struct ebt_table *table,
--
2.1.4
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH 08/12] netfilter: nf_tables: Release memory obtained by kasprintf
2017-10-09 16:25 [PATCH 00/12] Netfilter/IPVS fixes for net Pablo Neira Ayuso
` (6 preceding siblings ...)
2017-10-09 16:25 ` [PATCH 07/12] netfilter: ebtables: fix race condition in frame_filter_net_init() Pablo Neira Ayuso
@ 2017-10-09 16:25 ` Pablo Neira Ayuso
2017-10-09 16:25 ` [PATCH 09/12] netfilter: nf_tables: do not dump chain counters if not enabled Pablo Neira Ayuso
` (4 subsequent siblings)
12 siblings, 0 replies; 21+ messages in thread
From: Pablo Neira Ayuso @ 2017-10-09 16:25 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
From: Arvind Yadav <arvind.yadav.cs@gmail.com>
Free memory region, if nf_tables_set_alloc_name is not successful.
Fixes: 387454901bd6 ("netfilter: nf_tables: Allow set names of up to 255 chars")
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/netfilter/nf_tables_api.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index f98ca8c6aa59..34adedcb239e 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -2741,8 +2741,10 @@ static int nf_tables_set_alloc_name(struct nft_ctx *ctx, struct nft_set *set,
list_for_each_entry(i, &ctx->table->sets, list) {
if (!nft_is_active_next(ctx->net, i))
continue;
- if (!strcmp(set->name, i->name))
+ if (!strcmp(set->name, i->name)) {
+ kfree(set->name);
return -ENFILE;
+ }
}
return 0;
}
--
2.1.4
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH 09/12] netfilter: nf_tables: do not dump chain counters if not enabled
2017-10-09 16:25 [PATCH 00/12] Netfilter/IPVS fixes for net Pablo Neira Ayuso
` (7 preceding siblings ...)
2017-10-09 16:25 ` [PATCH 08/12] netfilter: nf_tables: Release memory obtained by kasprintf Pablo Neira Ayuso
@ 2017-10-09 16:25 ` Pablo Neira Ayuso
2017-10-09 16:25 ` [PATCH 10/12] netfilter: x_tables: avoid stack-out-of-bounds read in xt_copy_counters_from_user Pablo Neira Ayuso
` (3 subsequent siblings)
12 siblings, 0 replies; 21+ messages in thread
From: Pablo Neira Ayuso @ 2017-10-09 16:25 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
Chain counters are only enabled on demand since 9f08ea848117, skip them
when dumping them via netlink.
Fixes: 9f08ea848117 ("netfilter: nf_tables: keep chain counters away from hot path")
Reported-by: Johny Mattsson <johny.mattsson+kernel@gmail.com>
Tested-by: Johny Mattsson <johny.mattsson+kernel@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/netfilter/nf_tables_api.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 34adedcb239e..64e1ee091225 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -1048,7 +1048,7 @@ static int nf_tables_fill_chain_info(struct sk_buff *skb, struct net *net,
if (nla_put_string(skb, NFTA_CHAIN_TYPE, basechain->type->name))
goto nla_put_failure;
- if (nft_dump_stats(skb, nft_base_chain(chain)->stats))
+ if (basechain->stats && nft_dump_stats(skb, basechain->stats))
goto nla_put_failure;
}
--
2.1.4
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH 10/12] netfilter: x_tables: avoid stack-out-of-bounds read in xt_copy_counters_from_user
2017-10-09 16:25 [PATCH 00/12] Netfilter/IPVS fixes for net Pablo Neira Ayuso
` (8 preceding siblings ...)
2017-10-09 16:25 ` [PATCH 09/12] netfilter: nf_tables: do not dump chain counters if not enabled Pablo Neira Ayuso
@ 2017-10-09 16:25 ` Pablo Neira Ayuso
2017-10-09 16:25 ` [PATCH 11/12] netfilter: SYNPROXY: skip non-tcp packet in {ipv4, ipv6}_synproxy_hook Pablo Neira Ayuso
` (2 subsequent siblings)
12 siblings, 0 replies; 21+ messages in thread
From: Pablo Neira Ayuso @ 2017-10-09 16:25 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
From: Eric Dumazet <edumazet@google.com>
syzkaller reports an out of bound read in strlcpy(), triggered
by xt_copy_counters_from_user()
Fix this by using memcpy(), then forcing a zero byte at the last position
of the destination, as Florian did for the non COMPAT code.
Fixes: d7591f0c41ce ("netfilter: x_tables: introduce and use xt_copy_counters_from_user")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/netfilter/x_tables.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c
index c83a3b5e1c6c..d8571f414208 100644
--- a/net/netfilter/x_tables.c
+++ b/net/netfilter/x_tables.c
@@ -892,7 +892,7 @@ void *xt_copy_counters_from_user(const void __user *user, unsigned int len,
if (copy_from_user(&compat_tmp, user, sizeof(compat_tmp)) != 0)
return ERR_PTR(-EFAULT);
- strlcpy(info->name, compat_tmp.name, sizeof(info->name));
+ memcpy(info->name, compat_tmp.name, sizeof(info->name) - 1);
info->num_counters = compat_tmp.num_counters;
user += sizeof(compat_tmp);
} else
@@ -905,9 +905,9 @@ void *xt_copy_counters_from_user(const void __user *user, unsigned int len,
if (copy_from_user(info, user, sizeof(*info)) != 0)
return ERR_PTR(-EFAULT);
- info->name[sizeof(info->name) - 1] = '\0';
user += sizeof(*info);
}
+ info->name[sizeof(info->name) - 1] = '\0';
size = sizeof(struct xt_counters);
size *= info->num_counters;
--
2.1.4
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH 11/12] netfilter: SYNPROXY: skip non-tcp packet in {ipv4, ipv6}_synproxy_hook
2017-10-09 16:25 [PATCH 00/12] Netfilter/IPVS fixes for net Pablo Neira Ayuso
` (9 preceding siblings ...)
2017-10-09 16:25 ` [PATCH 10/12] netfilter: x_tables: avoid stack-out-of-bounds read in xt_copy_counters_from_user Pablo Neira Ayuso
@ 2017-10-09 16:25 ` Pablo Neira Ayuso
2017-10-09 16:25 ` [PATCH 12/12] netfilter: xt_bpf: Fix XT_BPF_MODE_FD_PINNED mode of 'xt_bpf_info_v1' Pablo Neira Ayuso
2017-10-09 17:40 ` [PATCH 00/12] Netfilter/IPVS fixes for net David Miller
12 siblings, 0 replies; 21+ messages in thread
From: Pablo Neira Ayuso @ 2017-10-09 16:25 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
From: Lin Zhang <xiaolou4617@gmail.com>
In function {ipv4,ipv6}_synproxy_hook we expect a normal tcp packet, but
the real server maybe reply an icmp error packet related to the exist
tcp conntrack, so we will access wrong tcp data.
Fix it by checking for the protocol field and only process tcp traffic.
Signed-off-by: Lin Zhang <xiaolou4617@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/ipv4/netfilter/ipt_SYNPROXY.c | 3 ++-
net/ipv6/netfilter/ip6t_SYNPROXY.c | 2 +-
2 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/net/ipv4/netfilter/ipt_SYNPROXY.c b/net/ipv4/netfilter/ipt_SYNPROXY.c
index 811689e523c3..f75fc6b53115 100644
--- a/net/ipv4/netfilter/ipt_SYNPROXY.c
+++ b/net/ipv4/netfilter/ipt_SYNPROXY.c
@@ -330,7 +330,8 @@ static unsigned int ipv4_synproxy_hook(void *priv,
if (synproxy == NULL)
return NF_ACCEPT;
- if (nf_is_loopback_packet(skb))
+ if (nf_is_loopback_packet(skb) ||
+ ip_hdr(skb)->protocol != IPPROTO_TCP)
return NF_ACCEPT;
thoff = ip_hdrlen(skb);
diff --git a/net/ipv6/netfilter/ip6t_SYNPROXY.c b/net/ipv6/netfilter/ip6t_SYNPROXY.c
index a5cd43d75393..437af8c95277 100644
--- a/net/ipv6/netfilter/ip6t_SYNPROXY.c
+++ b/net/ipv6/netfilter/ip6t_SYNPROXY.c
@@ -353,7 +353,7 @@ static unsigned int ipv6_synproxy_hook(void *priv,
nexthdr = ipv6_hdr(skb)->nexthdr;
thoff = ipv6_skip_exthdr(skb, sizeof(struct ipv6hdr), &nexthdr,
&frag_off);
- if (thoff < 0)
+ if (thoff < 0 || nexthdr != IPPROTO_TCP)
return NF_ACCEPT;
th = skb_header_pointer(skb, thoff, sizeof(_th), &_th);
--
2.1.4
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH 12/12] netfilter: xt_bpf: Fix XT_BPF_MODE_FD_PINNED mode of 'xt_bpf_info_v1'
2017-10-09 16:25 [PATCH 00/12] Netfilter/IPVS fixes for net Pablo Neira Ayuso
` (10 preceding siblings ...)
2017-10-09 16:25 ` [PATCH 11/12] netfilter: SYNPROXY: skip non-tcp packet in {ipv4, ipv6}_synproxy_hook Pablo Neira Ayuso
@ 2017-10-09 16:25 ` Pablo Neira Ayuso
2017-10-09 17:40 ` [PATCH 00/12] Netfilter/IPVS fixes for net David Miller
12 siblings, 0 replies; 21+ messages in thread
From: Pablo Neira Ayuso @ 2017-10-09 16:25 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
From: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Commit 2c16d6033264 ("netfilter: xt_bpf: support ebpf") introduced
support for attaching an eBPF object by an fd, with the
'bpf_mt_check_v1' ABI expecting the '.fd' to be specified upon each
IPT_SO_SET_REPLACE call.
However this breaks subsequent iptables calls:
# iptables -A INPUT -m bpf --object-pinned /sys/fs/bpf/xxx -j ACCEPT
# iptables -A INPUT -s 5.6.7.8 -j ACCEPT
iptables: Invalid argument. Run `dmesg' for more information.
That's because iptables works by loading existing rules using
IPT_SO_GET_ENTRIES to userspace, then issuing IPT_SO_SET_REPLACE with
the replacement set.
However, the loaded 'xt_bpf_info_v1' has an arbitrary '.fd' number
(from the initial "iptables -m bpf" invocation) - so when 2nd invocation
occurs, userspace passes a bogus fd number, which leads to
'bpf_mt_check_v1' to fail.
One suggested solution [1] was to hack iptables userspace, to perform a
"entries fixup" immediatley after IPT_SO_GET_ENTRIES, by opening a new,
process-local fd per every 'xt_bpf_info_v1' entry seen.
However, in [2] both Pablo Neira Ayuso and Willem de Bruijn suggested to
depricate the xt_bpf_info_v1 ABI dealing with pinned ebpf objects.
This fix changes the XT_BPF_MODE_FD_PINNED behavior to ignore the given
'.fd' and instead perform an in-kernel lookup for the bpf object given
the provided '.path'.
It also defines an alias for the XT_BPF_MODE_FD_PINNED mode, named
XT_BPF_MODE_PATH_PINNED, to better reflect the fact that the user is
expected to provide the path of the pinned object.
Existing XT_BPF_MODE_FD_ELF behavior (non-pinned fd mode) is preserved.
References: [1] https://marc.info/?l=netfilter-devel&m=150564724607440&w=2
[2] https://marc.info/?l=netfilter-devel&m=150575727129880&w=2
Reported-by: Rafael Buchbinder <rafi@rbk.ms>
Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
include/linux/bpf.h | 5 +++++
include/uapi/linux/netfilter/xt_bpf.h | 1 +
kernel/bpf/inode.c | 1 +
net/netfilter/xt_bpf.c | 22 ++++++++++++++++++++--
4 files changed, 27 insertions(+), 2 deletions(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 8390859e79e7..f1af7d63d678 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -368,6 +368,11 @@ static inline void __bpf_prog_uncharge(struct user_struct *user, u32 pages)
{
}
+static inline int bpf_obj_get_user(const char __user *pathname)
+{
+ return -EOPNOTSUPP;
+}
+
static inline struct net_device *__dev_map_lookup_elem(struct bpf_map *map,
u32 key)
{
diff --git a/include/uapi/linux/netfilter/xt_bpf.h b/include/uapi/linux/netfilter/xt_bpf.h
index b97725af2ac0..da161b56c79e 100644
--- a/include/uapi/linux/netfilter/xt_bpf.h
+++ b/include/uapi/linux/netfilter/xt_bpf.h
@@ -23,6 +23,7 @@ enum xt_bpf_modes {
XT_BPF_MODE_FD_PINNED,
XT_BPF_MODE_FD_ELF,
};
+#define XT_BPF_MODE_PATH_PINNED XT_BPF_MODE_FD_PINNED
struct xt_bpf_info_v1 {
__u16 mode;
diff --git a/kernel/bpf/inode.c b/kernel/bpf/inode.c
index e833ed914358..be1dde967208 100644
--- a/kernel/bpf/inode.c
+++ b/kernel/bpf/inode.c
@@ -363,6 +363,7 @@ int bpf_obj_get_user(const char __user *pathname)
putname(pname);
return ret;
}
+EXPORT_SYMBOL_GPL(bpf_obj_get_user);
static void bpf_evict_inode(struct inode *inode)
{
diff --git a/net/netfilter/xt_bpf.c b/net/netfilter/xt_bpf.c
index 38986a95216c..29123934887b 100644
--- a/net/netfilter/xt_bpf.c
+++ b/net/netfilter/xt_bpf.c
@@ -8,6 +8,7 @@
*/
#include <linux/module.h>
+#include <linux/syscalls.h>
#include <linux/skbuff.h>
#include <linux/filter.h>
#include <linux/bpf.h>
@@ -49,6 +50,22 @@ static int __bpf_mt_check_fd(int fd, struct bpf_prog **ret)
return 0;
}
+static int __bpf_mt_check_path(const char *path, struct bpf_prog **ret)
+{
+ mm_segment_t oldfs = get_fs();
+ int retval, fd;
+
+ set_fs(KERNEL_DS);
+ fd = bpf_obj_get_user(path);
+ set_fs(oldfs);
+ if (fd < 0)
+ return fd;
+
+ retval = __bpf_mt_check_fd(fd, ret);
+ sys_close(fd);
+ return retval;
+}
+
static int bpf_mt_check(const struct xt_mtchk_param *par)
{
struct xt_bpf_info *info = par->matchinfo;
@@ -66,9 +83,10 @@ static int bpf_mt_check_v1(const struct xt_mtchk_param *par)
return __bpf_mt_check_bytecode(info->bpf_program,
info->bpf_program_num_elem,
&info->filter);
- else if (info->mode == XT_BPF_MODE_FD_PINNED ||
- info->mode == XT_BPF_MODE_FD_ELF)
+ else if (info->mode == XT_BPF_MODE_FD_ELF)
return __bpf_mt_check_fd(info->fd, &info->filter);
+ else if (info->mode == XT_BPF_MODE_PATH_PINNED)
+ return __bpf_mt_check_path(info->path, &info->filter);
else
return -EINVAL;
}
--
2.1.4
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH 00/12] Netfilter/IPVS fixes for net
2017-10-09 16:25 [PATCH 00/12] Netfilter/IPVS fixes for net Pablo Neira Ayuso
` (11 preceding siblings ...)
2017-10-09 16:25 ` [PATCH 12/12] netfilter: xt_bpf: Fix XT_BPF_MODE_FD_PINNED mode of 'xt_bpf_info_v1' Pablo Neira Ayuso
@ 2017-10-09 17:40 ` David Miller
12 siblings, 0 replies; 21+ messages in thread
From: David Miller @ 2017-10-09 17:40 UTC (permalink / raw)
To: pablo; +Cc: netfilter-devel, netdev
From: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Mon, 9 Oct 2017 18:25:34 +0200
> The following patchset contains Netfilter/IPVS fixes for your net tree,
> they are:
...
> You can pull these changes from:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git
Pulled, thanks!
^ permalink raw reply [flat|nested] 21+ messages in thread