All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/10] Netfilter fixes for net
@ 2015-11-11 17:33 Pablo Neira Ayuso
  2015-11-11 17:33 ` [PATCH 01/10] netfilter: ingress: don't use nf_hook_list_active Pablo Neira Ayuso
                   ` (11 more replies)
  0 siblings, 12 replies; 16+ messages in thread
From: Pablo Neira Ayuso @ 2015-11-11 17:33 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

Hi David,

The following patchset contains Netfilter fixes for your net tree. This
large batch that includes fixes for ipset, netfilter ingress, nf_tables
dynamic set instantiation and a longstanding Kconfig dependency problem.
More specifically, they are:

1) Add missing check for empty hook list at the ingress hook, from
   Florian Westphal.

2) Input and output interface are swapped at the ingress hook,
   reported by Patrick McHardy.

3) Resolve ipset extension alignment issues on ARM, patch from Jozsef
   Kadlecsik.

4) Fix bit check on bitmap in ipset hash type, also from Jozsef.

5) Release buckets when all entries have expired in ipset hash type,
   again from Jozsef.

6) Oneliner to initialize conntrack tuple object in the PPTP helper,
   otherwise the conntrack lookup may fail due to random bits in the
   structure holes, patch from Anthony Lineham.

7) Silence a bogus gcc warning in nfnetlink_log, from Arnd Bergmann.

8) Fix Kconfig dependency problems with TPROXY, socket and dup, also
   from Arnd.

9) Add __netdev_alloc_pcpu_stats() to allow creating percpu counters
   from atomic context, this is required by the follow up fix for
   nf_tables.

10) Fix crash from the dynamic set expression, we have to add new clone
    operation that should be defined when a simple memcpy is not enough.
    This resolves a crash when using per-cpu counters with new Patrick
    McHardy's flow table nft support.

You can pull these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Thanks!

----------------------------------------------------------------

The following changes since commit 212cd0895330b775f2db49451f046a5ca4e5704b:

  selinux: fix random read in selinux_ip_postroute_compat() (2015-11-05 16:45:51 -0500)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git HEAD

for you to fetch changes up to 086f332167d64b645d37405854f049b9ad7371ab:

  netfilter: nf_tables: add clone interface to expression operations (2015-11-10 23:47:32 +0100)

----------------------------------------------------------------
Anthony Lineham (1):
      netfilter: Fix removal of GRE expectation entries created by PPTP

Arnd Bergmann (2):
      netfilter: nfnetlink_log: work around uninitialized variable warning
      netfilter: fix xt_TEE and xt_TPROXY dependencies

Florian Westphal (1):
      netfilter: ingress: don't use nf_hook_list_active

Jozsef Kadlecsik (3):
      netfilter: ipset: Fix extension alignment
      netfilter: ipset: Fix hash:* type expiration
      netfilter: ipset: Fix hash type expire: release empty hash bucket block

Pablo Neira Ayuso (4):
      netfilter: ingress: fix wrong input interface on hook
      Merge branch 'master' of git://blackhole.kfki.hu/nf
      net: add __netdev_alloc_pcpu_stats() to indicate gfp flags
      netfilter: nf_tables: add clone interface to expression operations

 include/linux/netdevice.h                 | 27 +++++++------
 include/linux/netfilter/ipset/ip_set.h    |  2 +-
 include/linux/netfilter_ingress.h         | 13 ++++---
 include/net/netfilter/nf_tables.h         | 16 +++++++-
 net/ipv4/netfilter/nf_nat_pptp.c          |  2 +-
 net/netfilter/Kconfig                     |  6 +--
 net/netfilter/ipset/ip_set_bitmap_gen.h   | 17 +++-----
 net/netfilter/ipset/ip_set_bitmap_ip.c    | 14 ++-----
 net/netfilter/ipset/ip_set_bitmap_ipmac.c | 64 ++++++++++++++-----------------
 net/netfilter/ipset/ip_set_bitmap_port.c  | 18 ++++-----
 net/netfilter/ipset/ip_set_core.c         | 14 ++++---
 net/netfilter/ipset/ip_set_hash_gen.h     | 26 ++++++++-----
 net/netfilter/ipset/ip_set_list_set.c     |  5 ++-
 net/netfilter/nfnetlink_log.c             |  2 +-
 net/netfilter/nft_counter.c               | 49 +++++++++++++++++++----
 net/netfilter/nft_dynset.c                |  5 ++-
 16 files changed, 161 insertions(+), 119 deletions(-)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 01/10] netfilter: ingress: don't use nf_hook_list_active
  2015-11-11 17:33 [PATCH 00/10] Netfilter fixes for net Pablo Neira Ayuso
@ 2015-11-11 17:33 ` Pablo Neira Ayuso
  2015-11-11 17:33 ` [PATCH 02/10] netfilter: ingress: fix wrong input interface on hook Pablo Neira Ayuso
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Pablo Neira Ayuso @ 2015-11-11 17:33 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

From: Florian Westphal <fw@strlen.de>

nf_hook_list_active() always returns true once at least one device has
NF_INGRESS hook enabled.

Thus, don't use this function. Instead, inverse the test and use the static
key to elide list_empty test if no NF_INGRESS hooks are active.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/linux/netfilter_ingress.h | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/include/linux/netfilter_ingress.h b/include/linux/netfilter_ingress.h
index 187feab..ba7ce88 100644
--- a/include/linux/netfilter_ingress.h
+++ b/include/linux/netfilter_ingress.h
@@ -5,10 +5,13 @@
 #include <linux/netdevice.h>
 
 #ifdef CONFIG_NETFILTER_INGRESS
-static inline int nf_hook_ingress_active(struct sk_buff *skb)
+static inline bool nf_hook_ingress_active(const struct sk_buff *skb)
 {
-	return nf_hook_list_active(&skb->dev->nf_hooks_ingress,
-				   NFPROTO_NETDEV, NF_NETDEV_INGRESS);
+#ifdef HAVE_JUMP_LABEL
+	if (!static_key_false(&nf_hooks_needed[NFPROTO_NETDEV][NF_NETDEV_INGRESS]))
+		return false;
+#endif
+	return !list_empty(&skb->dev->nf_hooks_ingress);
 }
 
 static inline int nf_hook_ingress(struct sk_buff *skb)
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 02/10] netfilter: ingress: fix wrong input interface on hook
  2015-11-11 17:33 [PATCH 00/10] Netfilter fixes for net Pablo Neira Ayuso
  2015-11-11 17:33 ` [PATCH 01/10] netfilter: ingress: don't use nf_hook_list_active Pablo Neira Ayuso
@ 2015-11-11 17:33 ` Pablo Neira Ayuso
  2015-11-11 17:33 ` [PATCH 03/10] netfilter: ipset: Fix extension alignment Pablo Neira Ayuso
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Pablo Neira Ayuso @ 2015-11-11 17:33 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

The input and output interfaces in nf_hook_state_init() are flipped.
This fixes iif matching on nftables.

Reported-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/linux/netfilter_ingress.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/netfilter_ingress.h b/include/linux/netfilter_ingress.h
index ba7ce88..5fcd375 100644
--- a/include/linux/netfilter_ingress.h
+++ b/include/linux/netfilter_ingress.h
@@ -19,8 +19,8 @@ static inline int nf_hook_ingress(struct sk_buff *skb)
 	struct nf_hook_state state;
 
 	nf_hook_state_init(&state, &skb->dev->nf_hooks_ingress,
-			   NF_NETDEV_INGRESS, INT_MIN, NFPROTO_NETDEV, NULL,
-			   skb->dev, NULL, dev_net(skb->dev), NULL);
+			   NF_NETDEV_INGRESS, INT_MIN, NFPROTO_NETDEV,
+			   skb->dev, NULL, NULL, dev_net(skb->dev), NULL);
 	return nf_hook_slow(skb, &state);
 }
 
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 03/10] netfilter: ipset: Fix extension alignment
  2015-11-11 17:33 [PATCH 00/10] Netfilter fixes for net Pablo Neira Ayuso
  2015-11-11 17:33 ` [PATCH 01/10] netfilter: ingress: don't use nf_hook_list_active Pablo Neira Ayuso
  2015-11-11 17:33 ` [PATCH 02/10] netfilter: ingress: fix wrong input interface on hook Pablo Neira Ayuso
@ 2015-11-11 17:33 ` Pablo Neira Ayuso
  2015-11-11 17:33 ` [PATCH 04/10] netfilter: ipset: Fix hash:* type expiration Pablo Neira Ayuso
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Pablo Neira Ayuso @ 2015-11-11 17:33 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

From: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

The data extensions in ipset lacked the proper memory alignment and
thus could lead to kernel crash on several architectures. Therefore
the structures have been reorganized and alignment attributes added
where needed. The patch was tested on armv7h by Gerhard Wiesinger and
on x86_64, sparc64 by Jozsef Kadlecsik.

Reported-by: Gerhard Wiesinger <lists@wiesinger.com>
Tested-by: Gerhard Wiesinger <lists@wiesinger.com>
Tested-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
---
 include/linux/netfilter/ipset/ip_set.h    |  2 +-
 net/netfilter/ipset/ip_set_bitmap_gen.h   | 17 +++-----
 net/netfilter/ipset/ip_set_bitmap_ip.c    | 14 ++-----
 net/netfilter/ipset/ip_set_bitmap_ipmac.c | 64 ++++++++++++++-----------------
 net/netfilter/ipset/ip_set_bitmap_port.c  | 18 ++++-----
 net/netfilter/ipset/ip_set_core.c         | 14 ++++---
 net/netfilter/ipset/ip_set_hash_gen.h     | 11 ++++--
 net/netfilter/ipset/ip_set_list_set.c     |  5 ++-
 8 files changed, 65 insertions(+), 80 deletions(-)

diff --git a/include/linux/netfilter/ipset/ip_set.h b/include/linux/netfilter/ipset/ip_set.h
index 48bb01e..0e1f433 100644
--- a/include/linux/netfilter/ipset/ip_set.h
+++ b/include/linux/netfilter/ipset/ip_set.h
@@ -421,7 +421,7 @@ extern void ip_set_free(void *members);
 extern int ip_set_get_ipaddr4(struct nlattr *nla,  __be32 *ipaddr);
 extern int ip_set_get_ipaddr6(struct nlattr *nla, union nf_inet_addr *ipaddr);
 extern size_t ip_set_elem_len(struct ip_set *set, struct nlattr *tb[],
-			      size_t len);
+			      size_t len, size_t align);
 extern int ip_set_get_extensions(struct ip_set *set, struct nlattr *tb[],
 				 struct ip_set_ext *ext);
 
diff --git a/net/netfilter/ipset/ip_set_bitmap_gen.h b/net/netfilter/ipset/ip_set_bitmap_gen.h
index d05e759..b0bc475 100644
--- a/net/netfilter/ipset/ip_set_bitmap_gen.h
+++ b/net/netfilter/ipset/ip_set_bitmap_gen.h
@@ -33,7 +33,7 @@
 #define mtype_gc		IPSET_TOKEN(MTYPE, _gc)
 #define mtype			MTYPE
 
-#define get_ext(set, map, id)	((map)->extensions + (set)->dsize * (id))
+#define get_ext(set, map, id)	((map)->extensions + ((set)->dsize * (id)))
 
 static void
 mtype_gc_init(struct ip_set *set, void (*gc)(unsigned long ul_set))
@@ -67,12 +67,9 @@ mtype_destroy(struct ip_set *set)
 		del_timer_sync(&map->gc);
 
 	ip_set_free(map->members);
-	if (set->dsize) {
-		if (set->extensions & IPSET_EXT_DESTROY)
-			mtype_ext_cleanup(set);
-		ip_set_free(map->extensions);
-	}
-	kfree(map);
+	if (set->dsize && set->extensions & IPSET_EXT_DESTROY)
+		mtype_ext_cleanup(set);
+	ip_set_free(map);
 
 	set->data = NULL;
 }
@@ -92,16 +89,14 @@ mtype_head(struct ip_set *set, struct sk_buff *skb)
 {
 	const struct mtype *map = set->data;
 	struct nlattr *nested;
+	size_t memsize = sizeof(*map) + map->memsize;
 
 	nested = ipset_nest_start(skb, IPSET_ATTR_DATA);
 	if (!nested)
 		goto nla_put_failure;
 	if (mtype_do_head(skb, map) ||
 	    nla_put_net32(skb, IPSET_ATTR_REFERENCES, htonl(set->ref - 1)) ||
-	    nla_put_net32(skb, IPSET_ATTR_MEMSIZE,
-			  htonl(sizeof(*map) +
-				map->memsize +
-				set->dsize * map->elements)))
+	    nla_put_net32(skb, IPSET_ATTR_MEMSIZE, htonl(memsize)))
 		goto nla_put_failure;
 	if (unlikely(ip_set_put_flags(skb, set)))
 		goto nla_put_failure;
diff --git a/net/netfilter/ipset/ip_set_bitmap_ip.c b/net/netfilter/ipset/ip_set_bitmap_ip.c
index 64a5643..4783eff 100644
--- a/net/netfilter/ipset/ip_set_bitmap_ip.c
+++ b/net/netfilter/ipset/ip_set_bitmap_ip.c
@@ -41,7 +41,6 @@ MODULE_ALIAS("ip_set_bitmap:ip");
 /* Type structure */
 struct bitmap_ip {
 	void *members;		/* the set members */
-	void *extensions;	/* data extensions */
 	u32 first_ip;		/* host byte order, included in range */
 	u32 last_ip;		/* host byte order, included in range */
 	u32 elements;		/* number of max elements in the set */
@@ -49,6 +48,8 @@ struct bitmap_ip {
 	size_t memsize;		/* members size */
 	u8 netmask;		/* subnet netmask */
 	struct timer_list gc;	/* garbage collection */
+	unsigned char extensions[0]	/* data extensions */
+		__aligned(__alignof__(u64));
 };
 
 /* ADT structure for generic function args */
@@ -224,13 +225,6 @@ init_map_ip(struct ip_set *set, struct bitmap_ip *map,
 	map->members = ip_set_alloc(map->memsize);
 	if (!map->members)
 		return false;
-	if (set->dsize) {
-		map->extensions = ip_set_alloc(set->dsize * elements);
-		if (!map->extensions) {
-			kfree(map->members);
-			return false;
-		}
-	}
 	map->first_ip = first_ip;
 	map->last_ip = last_ip;
 	map->elements = elements;
@@ -316,13 +310,13 @@ bitmap_ip_create(struct net *net, struct ip_set *set, struct nlattr *tb[],
 	pr_debug("hosts %u, elements %llu\n",
 		 hosts, (unsigned long long)elements);
 
-	map = kzalloc(sizeof(*map), GFP_KERNEL);
+	set->dsize = ip_set_elem_len(set, tb, 0, 0);
+	map = ip_set_alloc(sizeof(*map) + elements * set->dsize);
 	if (!map)
 		return -ENOMEM;
 
 	map->memsize = bitmap_bytes(0, elements - 1);
 	set->variant = &bitmap_ip;
-	set->dsize = ip_set_elem_len(set, tb, 0);
 	if (!init_map_ip(set, map, first_ip, last_ip,
 			 elements, hosts, netmask)) {
 		kfree(map);
diff --git a/net/netfilter/ipset/ip_set_bitmap_ipmac.c b/net/netfilter/ipset/ip_set_bitmap_ipmac.c
index 1430535..29dde20 100644
--- a/net/netfilter/ipset/ip_set_bitmap_ipmac.c
+++ b/net/netfilter/ipset/ip_set_bitmap_ipmac.c
@@ -47,24 +47,26 @@ enum {
 /* Type structure */
 struct bitmap_ipmac {
 	void *members;		/* the set members */
-	void *extensions;	/* MAC + data extensions */
 	u32 first_ip;		/* host byte order, included in range */
 	u32 last_ip;		/* host byte order, included in range */
 	u32 elements;		/* number of max elements in the set */
 	size_t memsize;		/* members size */
 	struct timer_list gc;	/* garbage collector */
+	unsigned char extensions[0]	/* MAC + data extensions */
+		__aligned(__alignof__(u64));
 };
 
 /* ADT structure for generic function args */
 struct bitmap_ipmac_adt_elem {
+	unsigned char ether[ETH_ALEN] __aligned(2);
 	u16 id;
-	unsigned char *ether;
+	u16 add_mac;
 };
 
 struct bitmap_ipmac_elem {
 	unsigned char ether[ETH_ALEN];
 	unsigned char filled;
-} __attribute__ ((aligned));
+} __aligned(__alignof__(u64));
 
 static inline u32
 ip_to_id(const struct bitmap_ipmac *m, u32 ip)
@@ -72,11 +74,11 @@ ip_to_id(const struct bitmap_ipmac *m, u32 ip)
 	return ip - m->first_ip;
 }
 
-static inline struct bitmap_ipmac_elem *
-get_elem(void *extensions, u16 id, size_t dsize)
-{
-	return (struct bitmap_ipmac_elem *)(extensions + id * dsize);
-}
+#define get_elem(extensions, id, dsize)		\
+	(struct bitmap_ipmac_elem *)(extensions + (id) * (dsize))
+
+#define get_const_elem(extensions, id, dsize)	\
+	(const struct bitmap_ipmac_elem *)(extensions + (id) * (dsize))
 
 /* Common functions */
 
@@ -88,10 +90,9 @@ bitmap_ipmac_do_test(const struct bitmap_ipmac_adt_elem *e,
 
 	if (!test_bit(e->id, map->members))
 		return 0;
-	elem = get_elem(map->extensions, e->id, dsize);
-	if (elem->filled == MAC_FILLED)
-		return !e->ether ||
-		       ether_addr_equal(e->ether, elem->ether);
+	elem = get_const_elem(map->extensions, e->id, dsize);
+	if (e->add_mac && elem->filled == MAC_FILLED)
+		return ether_addr_equal(e->ether, elem->ether);
 	/* Trigger kernel to fill out the ethernet address */
 	return -EAGAIN;
 }
@@ -103,7 +104,7 @@ bitmap_ipmac_gc_test(u16 id, const struct bitmap_ipmac *map, size_t dsize)
 
 	if (!test_bit(id, map->members))
 		return 0;
-	elem = get_elem(map->extensions, id, dsize);
+	elem = get_const_elem(map->extensions, id, dsize);
 	/* Timer not started for the incomplete elements */
 	return elem->filled == MAC_FILLED;
 }
@@ -133,7 +134,7 @@ bitmap_ipmac_add_timeout(unsigned long *timeout,
 		 * and we can reuse it later when MAC is filled out,
 		 * possibly by the kernel
 		 */
-		if (e->ether)
+		if (e->add_mac)
 			ip_set_timeout_set(timeout, t);
 		else
 			*timeout = t;
@@ -150,7 +151,7 @@ bitmap_ipmac_do_add(const struct bitmap_ipmac_adt_elem *e,
 	elem = get_elem(map->extensions, e->id, dsize);
 	if (test_bit(e->id, map->members)) {
 		if (elem->filled == MAC_FILLED) {
-			if (e->ether &&
+			if (e->add_mac &&
 			    (flags & IPSET_FLAG_EXIST) &&
 			    !ether_addr_equal(e->ether, elem->ether)) {
 				/* memcpy isn't atomic */
@@ -159,7 +160,7 @@ bitmap_ipmac_do_add(const struct bitmap_ipmac_adt_elem *e,
 				ether_addr_copy(elem->ether, e->ether);
 			}
 			return IPSET_ADD_FAILED;
-		} else if (!e->ether)
+		} else if (!e->add_mac)
 			/* Already added without ethernet address */
 			return IPSET_ADD_FAILED;
 		/* Fill the MAC address and trigger the timer activation */
@@ -168,7 +169,7 @@ bitmap_ipmac_do_add(const struct bitmap_ipmac_adt_elem *e,
 		ether_addr_copy(elem->ether, e->ether);
 		elem->filled = MAC_FILLED;
 		return IPSET_ADD_START_STORED_TIMEOUT;
-	} else if (e->ether) {
+	} else if (e->add_mac) {
 		/* We can store MAC too */
 		ether_addr_copy(elem->ether, e->ether);
 		elem->filled = MAC_FILLED;
@@ -191,7 +192,7 @@ bitmap_ipmac_do_list(struct sk_buff *skb, const struct bitmap_ipmac *map,
 		     u32 id, size_t dsize)
 {
 	const struct bitmap_ipmac_elem *elem =
-		get_elem(map->extensions, id, dsize);
+		get_const_elem(map->extensions, id, dsize);
 
 	return nla_put_ipaddr4(skb, IPSET_ATTR_IP,
 			       htonl(map->first_ip + id)) ||
@@ -213,7 +214,7 @@ bitmap_ipmac_kadt(struct ip_set *set, const struct sk_buff *skb,
 {
 	struct bitmap_ipmac *map = set->data;
 	ipset_adtfn adtfn = set->variant->adt[adt];
-	struct bitmap_ipmac_adt_elem e = { .id = 0 };
+	struct bitmap_ipmac_adt_elem e = { .id = 0, .add_mac = 1 };
 	struct ip_set_ext ext = IP_SET_INIT_KEXT(skb, opt, set);
 	u32 ip;
 
@@ -231,7 +232,7 @@ bitmap_ipmac_kadt(struct ip_set *set, const struct sk_buff *skb,
 		return -EINVAL;
 
 	e.id = ip_to_id(map, ip);
-	e.ether = eth_hdr(skb)->h_source;
+	memcpy(e.ether, eth_hdr(skb)->h_source, ETH_ALEN);
 
 	return adtfn(set, &e, &ext, &opt->ext, opt->cmdflags);
 }
@@ -265,11 +266,10 @@ bitmap_ipmac_uadt(struct ip_set *set, struct nlattr *tb[],
 		return -IPSET_ERR_BITMAP_RANGE;
 
 	e.id = ip_to_id(map, ip);
-	if (tb[IPSET_ATTR_ETHER])
-		e.ether = nla_data(tb[IPSET_ATTR_ETHER]);
-	else
-		e.ether = NULL;
-
+	if (tb[IPSET_ATTR_ETHER]) {
+		memcpy(e.ether, nla_data(tb[IPSET_ATTR_ETHER]), ETH_ALEN);
+		e.add_mac = 1;
+	}
 	ret = adtfn(set, &e, &ext, &ext, flags);
 
 	return ip_set_eexist(ret, flags) ? 0 : ret;
@@ -300,13 +300,6 @@ init_map_ipmac(struct ip_set *set, struct bitmap_ipmac *map,
 	map->members = ip_set_alloc(map->memsize);
 	if (!map->members)
 		return false;
-	if (set->dsize) {
-		map->extensions = ip_set_alloc(set->dsize * elements);
-		if (!map->extensions) {
-			kfree(map->members);
-			return false;
-		}
-	}
 	map->first_ip = first_ip;
 	map->last_ip = last_ip;
 	map->elements = elements;
@@ -361,14 +354,15 @@ bitmap_ipmac_create(struct net *net, struct ip_set *set, struct nlattr *tb[],
 	if (elements > IPSET_BITMAP_MAX_RANGE + 1)
 		return -IPSET_ERR_BITMAP_RANGE_SIZE;
 
-	map = kzalloc(sizeof(*map), GFP_KERNEL);
+	set->dsize = ip_set_elem_len(set, tb,
+				     sizeof(struct bitmap_ipmac_elem),
+				     __alignof__(struct bitmap_ipmac_elem));
+	map = ip_set_alloc(sizeof(*map) + elements * set->dsize);
 	if (!map)
 		return -ENOMEM;
 
 	map->memsize = bitmap_bytes(0, elements - 1);
 	set->variant = &bitmap_ipmac;
-	set->dsize = ip_set_elem_len(set, tb,
-				     sizeof(struct bitmap_ipmac_elem));
 	if (!init_map_ipmac(set, map, first_ip, last_ip, elements)) {
 		kfree(map);
 		return -ENOMEM;
diff --git a/net/netfilter/ipset/ip_set_bitmap_port.c b/net/netfilter/ipset/ip_set_bitmap_port.c
index 5338ccd..7f0c733 100644
--- a/net/netfilter/ipset/ip_set_bitmap_port.c
+++ b/net/netfilter/ipset/ip_set_bitmap_port.c
@@ -35,12 +35,13 @@ MODULE_ALIAS("ip_set_bitmap:port");
 /* Type structure */
 struct bitmap_port {
 	void *members;		/* the set members */
-	void *extensions;	/* data extensions */
 	u16 first_port;		/* host byte order, included in range */
 	u16 last_port;		/* host byte order, included in range */
 	u32 elements;		/* number of max elements in the set */
 	size_t memsize;		/* members size */
 	struct timer_list gc;	/* garbage collection */
+	unsigned char extensions[0]	/* data extensions */
+		__aligned(__alignof__(u64));
 };
 
 /* ADT structure for generic function args */
@@ -209,13 +210,6 @@ init_map_port(struct ip_set *set, struct bitmap_port *map,
 	map->members = ip_set_alloc(map->memsize);
 	if (!map->members)
 		return false;
-	if (set->dsize) {
-		map->extensions = ip_set_alloc(set->dsize * map->elements);
-		if (!map->extensions) {
-			kfree(map->members);
-			return false;
-		}
-	}
 	map->first_port = first_port;
 	map->last_port = last_port;
 	set->timeout = IPSET_NO_TIMEOUT;
@@ -232,6 +226,7 @@ bitmap_port_create(struct net *net, struct ip_set *set, struct nlattr *tb[],
 {
 	struct bitmap_port *map;
 	u16 first_port, last_port;
+	u32 elements;
 
 	if (unlikely(!ip_set_attr_netorder(tb, IPSET_ATTR_PORT) ||
 		     !ip_set_attr_netorder(tb, IPSET_ATTR_PORT_TO) ||
@@ -248,14 +243,15 @@ bitmap_port_create(struct net *net, struct ip_set *set, struct nlattr *tb[],
 		last_port = tmp;
 	}
 
-	map = kzalloc(sizeof(*map), GFP_KERNEL);
+	elements = last_port - first_port + 1;
+	set->dsize = ip_set_elem_len(set, tb, 0, 0);
+	map = ip_set_alloc(sizeof(*map) + elements * set->dsize);
 	if (!map)
 		return -ENOMEM;
 
-	map->elements = last_port - first_port + 1;
+	map->elements = elements;
 	map->memsize = bitmap_bytes(0, map->elements);
 	set->variant = &bitmap_port;
-	set->dsize = ip_set_elem_len(set, tb, 0);
 	if (!init_map_port(set, map, first_port, last_port)) {
 		kfree(map);
 		return -ENOMEM;
diff --git a/net/netfilter/ipset/ip_set_core.c b/net/netfilter/ipset/ip_set_core.c
index 69ab9c26..54f3d7c 100644
--- a/net/netfilter/ipset/ip_set_core.c
+++ b/net/netfilter/ipset/ip_set_core.c
@@ -364,25 +364,27 @@ add_extension(enum ip_set_ext_id id, u32 flags, struct nlattr *tb[])
 }
 
 size_t
-ip_set_elem_len(struct ip_set *set, struct nlattr *tb[], size_t len)
+ip_set_elem_len(struct ip_set *set, struct nlattr *tb[], size_t len,
+		size_t align)
 {
 	enum ip_set_ext_id id;
-	size_t offset = len;
 	u32 cadt_flags = 0;
 
 	if (tb[IPSET_ATTR_CADT_FLAGS])
 		cadt_flags = ip_set_get_h32(tb[IPSET_ATTR_CADT_FLAGS]);
 	if (cadt_flags & IPSET_FLAG_WITH_FORCEADD)
 		set->flags |= IPSET_CREATE_FLAG_FORCEADD;
+	if (!align)
+		align = 1;
 	for (id = 0; id < IPSET_EXT_ID_MAX; id++) {
 		if (!add_extension(id, cadt_flags, tb))
 			continue;
-		offset = ALIGN(offset, ip_set_extensions[id].align);
-		set->offset[id] = offset;
+		len = ALIGN(len, ip_set_extensions[id].align);
+		set->offset[id] = len;
 		set->extensions |= ip_set_extensions[id].type;
-		offset += ip_set_extensions[id].len;
+		len += ip_set_extensions[id].len;
 	}
-	return offset;
+	return ALIGN(len, align);
 }
 EXPORT_SYMBOL_GPL(ip_set_elem_len);
 
diff --git a/net/netfilter/ipset/ip_set_hash_gen.h b/net/netfilter/ipset/ip_set_hash_gen.h
index 691b54f..4ff2219 100644
--- a/net/netfilter/ipset/ip_set_hash_gen.h
+++ b/net/netfilter/ipset/ip_set_hash_gen.h
@@ -72,8 +72,9 @@ struct hbucket {
 	DECLARE_BITMAP(used, AHASH_MAX_TUNED);
 	u8 size;		/* size of the array */
 	u8 pos;			/* position of the first free entry */
-	unsigned char value[0];	/* the array of the values */
-} __attribute__ ((aligned));
+	unsigned char value[0]	/* the array of the values */
+		__aligned(__alignof__(u64));
+};
 
 /* The hash table: the table size stored here in order to make resizing easy */
 struct htable {
@@ -1323,12 +1324,14 @@ IPSET_TOKEN(HTYPE, _create)(struct net *net, struct ip_set *set,
 #endif
 		set->variant = &IPSET_TOKEN(HTYPE, 4_variant);
 		set->dsize = ip_set_elem_len(set, tb,
-				sizeof(struct IPSET_TOKEN(HTYPE, 4_elem)));
+			sizeof(struct IPSET_TOKEN(HTYPE, 4_elem)),
+			__alignof__(struct IPSET_TOKEN(HTYPE, 4_elem)));
 #ifndef IP_SET_PROTO_UNDEF
 	} else {
 		set->variant = &IPSET_TOKEN(HTYPE, 6_variant);
 		set->dsize = ip_set_elem_len(set, tb,
-				sizeof(struct IPSET_TOKEN(HTYPE, 6_elem)));
+			sizeof(struct IPSET_TOKEN(HTYPE, 6_elem)),
+			__alignof__(struct IPSET_TOKEN(HTYPE, 6_elem)));
 	}
 #endif
 	if (tb[IPSET_ATTR_TIMEOUT]) {
diff --git a/net/netfilter/ipset/ip_set_list_set.c b/net/netfilter/ipset/ip_set_list_set.c
index 5a30ce6..bbede95 100644
--- a/net/netfilter/ipset/ip_set_list_set.c
+++ b/net/netfilter/ipset/ip_set_list_set.c
@@ -31,7 +31,7 @@ struct set_elem {
 	struct rcu_head rcu;
 	struct list_head list;
 	ip_set_id_t id;
-};
+} __aligned(__alignof__(u64));
 
 struct set_adt_elem {
 	ip_set_id_t id;
@@ -618,7 +618,8 @@ list_set_create(struct net *net, struct ip_set *set, struct nlattr *tb[],
 		size = IP_SET_LIST_MIN_SIZE;
 
 	set->variant = &set_variant;
-	set->dsize = ip_set_elem_len(set, tb, sizeof(struct set_elem));
+	set->dsize = ip_set_elem_len(set, tb, sizeof(struct set_elem),
+				     __alignof__(struct set_elem));
 	if (!init_list_set(net, set, size))
 		return -ENOMEM;
 	if (tb[IPSET_ATTR_TIMEOUT]) {
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 04/10] netfilter: ipset: Fix hash:* type expiration
  2015-11-11 17:33 [PATCH 00/10] Netfilter fixes for net Pablo Neira Ayuso
                   ` (2 preceding siblings ...)
  2015-11-11 17:33 ` [PATCH 03/10] netfilter: ipset: Fix extension alignment Pablo Neira Ayuso
@ 2015-11-11 17:33 ` Pablo Neira Ayuso
  2015-11-11 17:33 ` [PATCH 05/10] netfilter: ipset: Fix hash type expire: release empty hash bucket block Pablo Neira Ayuso
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Pablo Neira Ayuso @ 2015-11-11 17:33 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

From: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

Incorrect index was used when the data blob was shrinked at expiration,
which could lead to falsely expired entries and memory leak when
the comment extension was used too.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
---
 net/netfilter/ipset/ip_set_hash_gen.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/netfilter/ipset/ip_set_hash_gen.h b/net/netfilter/ipset/ip_set_hash_gen.h
index 4ff2219..fa4f637 100644
--- a/net/netfilter/ipset/ip_set_hash_gen.h
+++ b/net/netfilter/ipset/ip_set_hash_gen.h
@@ -523,7 +523,7 @@ mtype_expire(struct ip_set *set, struct htype *h, u8 nets_length, size_t dsize)
 					continue;
 				data = ahash_data(n, j, dsize);
 				memcpy(tmp->value + d * dsize, data, dsize);
-				set_bit(j, tmp->used);
+				set_bit(d, tmp->used);
 				d++;
 			}
 			tmp->pos = d;
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 05/10] netfilter: ipset: Fix hash type expire: release empty hash bucket block
  2015-11-11 17:33 [PATCH 00/10] Netfilter fixes for net Pablo Neira Ayuso
                   ` (3 preceding siblings ...)
  2015-11-11 17:33 ` [PATCH 04/10] netfilter: ipset: Fix hash:* type expiration Pablo Neira Ayuso
@ 2015-11-11 17:33 ` Pablo Neira Ayuso
  2015-11-11 17:33 ` [PATCH 06/10] netfilter: Fix removal of GRE expectation entries created by PPTP Pablo Neira Ayuso
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Pablo Neira Ayuso @ 2015-11-11 17:33 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

From: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

When all entries are expired/all slots are empty, release the bucket.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
---
 net/netfilter/ipset/ip_set_hash_gen.h | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/net/netfilter/ipset/ip_set_hash_gen.h b/net/netfilter/ipset/ip_set_hash_gen.h
index fa4f637..e5336ab 100644
--- a/net/netfilter/ipset/ip_set_hash_gen.h
+++ b/net/netfilter/ipset/ip_set_hash_gen.h
@@ -476,7 +476,7 @@ static void
 mtype_expire(struct ip_set *set, struct htype *h, u8 nets_length, size_t dsize)
 {
 	struct htable *t;
-	struct hbucket *n;
+	struct hbucket *n, *tmp;
 	struct mtype_elem *data;
 	u32 i, j, d;
 #ifdef IP_SET_HASH_WITH_NETS
@@ -511,9 +511,14 @@ mtype_expire(struct ip_set *set, struct htype *h, u8 nets_length, size_t dsize)
 			}
 		}
 		if (d >= AHASH_INIT_SIZE) {
-			struct hbucket *tmp = kzalloc(sizeof(*tmp) +
-					(n->size - AHASH_INIT_SIZE) * dsize,
-					GFP_ATOMIC);
+			if (d >= n->size) {
+				rcu_assign_pointer(hbucket(t, i), NULL);
+				kfree_rcu(n, rcu);
+				continue;
+			}
+			tmp = kzalloc(sizeof(*tmp) +
+				      (n->size - AHASH_INIT_SIZE) * dsize,
+				      GFP_ATOMIC);
 			if (!tmp)
 				/* Still try to delete expired elements */
 				continue;
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 06/10] netfilter: Fix removal of GRE expectation entries created by PPTP
  2015-11-11 17:33 [PATCH 00/10] Netfilter fixes for net Pablo Neira Ayuso
                   ` (4 preceding siblings ...)
  2015-11-11 17:33 ` [PATCH 05/10] netfilter: ipset: Fix hash type expire: release empty hash bucket block Pablo Neira Ayuso
@ 2015-11-11 17:33 ` Pablo Neira Ayuso
  2015-11-11 17:33 ` [PATCH 07/10] netfilter: nfnetlink_log: work around uninitialized variable warning Pablo Neira Ayuso
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Pablo Neira Ayuso @ 2015-11-11 17:33 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

From: Anthony Lineham <anthony.lineham@alliedtelesis.co.nz>

The uninitialized tuple structure caused incorrect hash calculation
and the lookup failed.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=106441
Signed-off-by: Anthony Lineham <anthony.lineham@alliedtelesis.co.nz>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/ipv4/netfilter/nf_nat_pptp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/netfilter/nf_nat_pptp.c b/net/ipv4/netfilter/nf_nat_pptp.c
index 657d230..b3ca21b 100644
--- a/net/ipv4/netfilter/nf_nat_pptp.c
+++ b/net/ipv4/netfilter/nf_nat_pptp.c
@@ -45,7 +45,7 @@ static void pptp_nat_expected(struct nf_conn *ct,
 	struct net *net = nf_ct_net(ct);
 	const struct nf_conn *master = ct->master;
 	struct nf_conntrack_expect *other_exp;
-	struct nf_conntrack_tuple t;
+	struct nf_conntrack_tuple t = {};
 	const struct nf_ct_pptp_master *ct_pptp_info;
 	const struct nf_nat_pptp *nat_pptp_info;
 	struct nf_nat_range range;
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 07/10] netfilter: nfnetlink_log: work around uninitialized variable warning
  2015-11-11 17:33 [PATCH 00/10] Netfilter fixes for net Pablo Neira Ayuso
                   ` (5 preceding siblings ...)
  2015-11-11 17:33 ` [PATCH 06/10] netfilter: Fix removal of GRE expectation entries created by PPTP Pablo Neira Ayuso
@ 2015-11-11 17:33 ` Pablo Neira Ayuso
  2015-11-11 17:33 ` [PATCH 08/10] netfilter: fix xt_TEE and xt_TPROXY dependencies Pablo Neira Ayuso
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Pablo Neira Ayuso @ 2015-11-11 17:33 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

From: Arnd Bergmann <arnd@arndb.de>

After a recent (correct) change, gcc started warning about the use
of the 'flags' variable in nfulnl_recv_config()

net/netfilter/nfnetlink_log.c: In function 'nfulnl_recv_config':
net/netfilter/nfnetlink_log.c:320:14: warning: 'flags' may be used uninitialized in this function [-Wmaybe-uninitialized]
net/netfilter/nfnetlink_log.c:828:6: note: 'flags' was declared here

The warning first shows up in ARM s3c2410_defconfig with gcc-4.3 or
higher (including 5.2.1, which is the latest version I checked) I
tried working around it by rearranging the code but had no success
with that.

As a last resort, this initializes the variable to zero, which shuts
up the warning, but means that we don't get a warning if the code
is ever changed in a way that actually causes the variable to be
used without first being written.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 8cbc870829ec ("netfilter: nfnetlink_log: validate dependencies to avoid breaking atomicity")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nfnetlink_log.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/netfilter/nfnetlink_log.c b/net/netfilter/nfnetlink_log.c
index 06eb48f..740cce4 100644
--- a/net/netfilter/nfnetlink_log.c
+++ b/net/netfilter/nfnetlink_log.c
@@ -825,7 +825,7 @@ nfulnl_recv_config(struct sock *ctnl, struct sk_buff *skb,
 	struct net *net = sock_net(ctnl);
 	struct nfnl_log_net *log = nfnl_log_pernet(net);
 	int ret = 0;
-	u16 flags;
+	u16 flags = 0;
 
 	if (nfula[NFULA_CFG_CMD]) {
 		u_int8_t pf = nfmsg->nfgen_family;
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 08/10] netfilter: fix xt_TEE and xt_TPROXY dependencies
  2015-11-11 17:33 [PATCH 00/10] Netfilter fixes for net Pablo Neira Ayuso
                   ` (6 preceding siblings ...)
  2015-11-11 17:33 ` [PATCH 07/10] netfilter: nfnetlink_log: work around uninitialized variable warning Pablo Neira Ayuso
@ 2015-11-11 17:33 ` Pablo Neira Ayuso
  2015-11-11 17:33 ` [PATCH 09/10] net: add __netdev_alloc_pcpu_stats() to indicate gfp flags Pablo Neira Ayuso
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Pablo Neira Ayuso @ 2015-11-11 17:33 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

From: Arnd Bergmann <arnd@arndb.de>

Kconfig is too smart for its own good: a Kconfig line that states

	select NF_DEFRAG_IPV6 if IP6_NF_IPTABLES

means that if IP6_NF_IPTABLES is set to 'm', then NF_DEFRAG_IPV6 will
also be set to 'm', regardless of the state of the symbol from which
it is selected. When the xt_TEE driver is built-in and nothing else
forces NF_DEFRAG_IPV6 to be built-in, this causes a link-time error:

net/built-in.o: In function `tee_tg6':
net/netfilter/xt_TEE.c:46: undefined reference to `nf_dup_ipv6'

This works around that behavior by changing the dependency to
'if IP6_NF_IPTABLES != n', which is interpreted as boolean expression
rather than a tristate and causes the NF_DEFRAG_IPV6 symbol to
be built-in as well.

The bug only occurs once in thousands of 'randconfig' builds and
does not really impact real users. From inspecting the other
surrounding Kconfig symbols, I am guessing that NETFILTER_XT_TARGET_TPROXY
and NETFILTER_XT_MATCH_SOCKET have the same issue. If not, this
change should still be harmless.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/Kconfig | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
index e22349e..4692782 100644
--- a/net/netfilter/Kconfig
+++ b/net/netfilter/Kconfig
@@ -869,7 +869,7 @@ config NETFILTER_XT_TARGET_TEE
 	depends on IPV6 || IPV6=n
 	depends on !NF_CONNTRACK || NF_CONNTRACK
 	select NF_DUP_IPV4
-	select NF_DUP_IPV6 if IP6_NF_IPTABLES
+	select NF_DUP_IPV6 if IP6_NF_IPTABLES != n
 	---help---
 	This option adds a "TEE" target with which a packet can be cloned and
 	this clone be rerouted to another nexthop.
@@ -882,7 +882,7 @@ config NETFILTER_XT_TARGET_TPROXY
 	depends on IP6_NF_IPTABLES || IP6_NF_IPTABLES=n
 	depends on IP_NF_MANGLE
 	select NF_DEFRAG_IPV4
-	select NF_DEFRAG_IPV6 if IP6_NF_IPTABLES
+	select NF_DEFRAG_IPV6 if IP6_NF_IPTABLES != n
 	help
 	  This option adds a `TPROXY' target, which is somewhat similar to
 	  REDIRECT.  It can only be used in the mangle table and is useful
@@ -1375,7 +1375,7 @@ config NETFILTER_XT_MATCH_SOCKET
 	depends on IPV6 || IPV6=n
 	depends on IP6_NF_IPTABLES || IP6_NF_IPTABLES=n
 	select NF_DEFRAG_IPV4
-	select NF_DEFRAG_IPV6 if IP6_NF_IPTABLES
+	select NF_DEFRAG_IPV6 if IP6_NF_IPTABLES != n
 	help
 	  This option adds a `socket' match, which can be used to match
 	  packets for which a TCP or UDP socket lookup finds a valid socket.
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 09/10] net: add __netdev_alloc_pcpu_stats() to indicate gfp flags
  2015-11-11 17:33 [PATCH 00/10] Netfilter fixes for net Pablo Neira Ayuso
                   ` (7 preceding siblings ...)
  2015-11-11 17:33 ` [PATCH 08/10] netfilter: fix xt_TEE and xt_TPROXY dependencies Pablo Neira Ayuso
@ 2015-11-11 17:33 ` Pablo Neira Ayuso
  2015-11-11 17:33 ` [PATCH 10/10] netfilter: nf_tables: add clone interface to expression operations Pablo Neira Ayuso
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Pablo Neira Ayuso @ 2015-11-11 17:33 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

nf_tables may create percpu counters from the packet path through its
dynamic set instantiation infrastructure, so we need a way to allocate
this through GFP_ATOMIC.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: David S. Miller <davem@davemloft.net>
---
 include/linux/netdevice.h | 27 +++++++++++++++------------
 1 file changed, 15 insertions(+), 12 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 2c00772..e9d0c8a 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2068,20 +2068,23 @@ struct pcpu_sw_netstats {
 	struct u64_stats_sync   syncp;
 };
 
-#define netdev_alloc_pcpu_stats(type)				\
-({								\
-	typeof(type) __percpu *pcpu_stats = alloc_percpu(type); \
-	if (pcpu_stats)	{					\
-		int __cpu;					\
-		for_each_possible_cpu(__cpu) {			\
-			typeof(type) *stat;			\
-			stat = per_cpu_ptr(pcpu_stats, __cpu);	\
-			u64_stats_init(&stat->syncp);		\
-		}						\
-	}							\
-	pcpu_stats;						\
+#define __netdev_alloc_pcpu_stats(type, gfp)				\
+({									\
+	typeof(type) __percpu *pcpu_stats = alloc_percpu_gfp(type, gfp);\
+	if (pcpu_stats)	{						\
+		int __cpu;						\
+		for_each_possible_cpu(__cpu) {				\
+			typeof(type) *stat;				\
+			stat = per_cpu_ptr(pcpu_stats, __cpu);		\
+			u64_stats_init(&stat->syncp);			\
+		}							\
+	}								\
+	pcpu_stats;							\
 })
 
+#define netdev_alloc_pcpu_stats(type)					\
+	__netdev_alloc_pcpu_stats(type, GFP_KERNEL);
+
 #include <linux/notifier.h>
 
 /* netdevice notifier chain. Please remember to update the rtnetlink
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 10/10] netfilter: nf_tables: add clone interface to expression operations
  2015-11-11 17:33 [PATCH 00/10] Netfilter fixes for net Pablo Neira Ayuso
                   ` (8 preceding siblings ...)
  2015-11-11 17:33 ` [PATCH 09/10] net: add __netdev_alloc_pcpu_stats() to indicate gfp flags Pablo Neira Ayuso
@ 2015-11-11 17:33 ` Pablo Neira Ayuso
  2015-11-12 19:20 ` [PATCH 00/10] Netfilter fixes for net David Miller
  2015-11-13 17:58 ` Josh Boyer
  11 siblings, 0 replies; 16+ messages in thread
From: Pablo Neira Ayuso @ 2015-11-11 17:33 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

With the conversion of the counter expressions to make it percpu, we
need to clone the percpu memory area, otherwise we crash when using
counters from flow tables.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/net/netfilter/nf_tables.h | 16 +++++++++++--
 net/netfilter/nft_counter.c       | 49 ++++++++++++++++++++++++++++++++-------
 net/netfilter/nft_dynset.c        |  5 ++--
 3 files changed, 58 insertions(+), 12 deletions(-)

diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
index c9149cc..4bd7508 100644
--- a/include/net/netfilter/nf_tables.h
+++ b/include/net/netfilter/nf_tables.h
@@ -618,6 +618,8 @@ struct nft_expr_ops {
 	void				(*eval)(const struct nft_expr *expr,
 						struct nft_regs *regs,
 						const struct nft_pktinfo *pkt);
+	int				(*clone)(struct nft_expr *dst,
+						 const struct nft_expr *src);
 	unsigned int			size;
 
 	int				(*init)(const struct nft_ctx *ctx,
@@ -660,10 +662,20 @@ void nft_expr_destroy(const struct nft_ctx *ctx, struct nft_expr *expr);
 int nft_expr_dump(struct sk_buff *skb, unsigned int attr,
 		  const struct nft_expr *expr);
 
-static inline void nft_expr_clone(struct nft_expr *dst, struct nft_expr *src)
+static inline int nft_expr_clone(struct nft_expr *dst, struct nft_expr *src)
 {
+	int err;
+
 	__module_get(src->ops->type->owner);
-	memcpy(dst, src, src->ops->size);
+	if (src->ops->clone) {
+		dst->ops = src->ops;
+		err = src->ops->clone(dst, src);
+		if (err < 0)
+			return err;
+	} else {
+		memcpy(dst, src, src->ops->size);
+	}
+	return 0;
 }
 
 /**
diff --git a/net/netfilter/nft_counter.c b/net/netfilter/nft_counter.c
index 1067fb4..c7808fc 100644
--- a/net/netfilter/nft_counter.c
+++ b/net/netfilter/nft_counter.c
@@ -47,27 +47,34 @@ static void nft_counter_eval(const struct nft_expr *expr,
 	local_bh_enable();
 }
 
-static int nft_counter_dump(struct sk_buff *skb, const struct nft_expr *expr)
+static void nft_counter_fetch(const struct nft_counter_percpu __percpu *counter,
+			      struct nft_counter *total)
 {
-	struct nft_counter_percpu_priv *priv = nft_expr_priv(expr);
-	struct nft_counter_percpu *cpu_stats;
-	struct nft_counter total;
+	const struct nft_counter_percpu *cpu_stats;
 	u64 bytes, packets;
 	unsigned int seq;
 	int cpu;
 
-	memset(&total, 0, sizeof(total));
+	memset(total, 0, sizeof(*total));
 	for_each_possible_cpu(cpu) {
-		cpu_stats = per_cpu_ptr(priv->counter, cpu);
+		cpu_stats = per_cpu_ptr(counter, cpu);
 		do {
 			seq	= u64_stats_fetch_begin_irq(&cpu_stats->syncp);
 			bytes	= cpu_stats->counter.bytes;
 			packets	= cpu_stats->counter.packets;
 		} while (u64_stats_fetch_retry_irq(&cpu_stats->syncp, seq));
 
-		total.packets += packets;
-		total.bytes += bytes;
+		total->packets += packets;
+		total->bytes += bytes;
 	}
+}
+
+static int nft_counter_dump(struct sk_buff *skb, const struct nft_expr *expr)
+{
+	struct nft_counter_percpu_priv *priv = nft_expr_priv(expr);
+	struct nft_counter total;
+
+	nft_counter_fetch(priv->counter, &total);
 
 	if (nla_put_be64(skb, NFTA_COUNTER_BYTES, cpu_to_be64(total.bytes)) ||
 	    nla_put_be64(skb, NFTA_COUNTER_PACKETS, cpu_to_be64(total.packets)))
@@ -118,6 +125,31 @@ static void nft_counter_destroy(const struct nft_ctx *ctx,
 	free_percpu(priv->counter);
 }
 
+static int nft_counter_clone(struct nft_expr *dst, const struct nft_expr *src)
+{
+	struct nft_counter_percpu_priv *priv = nft_expr_priv(src);
+	struct nft_counter_percpu_priv *priv_clone = nft_expr_priv(dst);
+	struct nft_counter_percpu __percpu *cpu_stats;
+	struct nft_counter_percpu *this_cpu;
+	struct nft_counter total;
+
+	nft_counter_fetch(priv->counter, &total);
+
+	cpu_stats = __netdev_alloc_pcpu_stats(struct nft_counter_percpu,
+					      GFP_ATOMIC);
+	if (cpu_stats == NULL)
+		return ENOMEM;
+
+	preempt_disable();
+	this_cpu = this_cpu_ptr(cpu_stats);
+	this_cpu->counter.packets = total.packets;
+	this_cpu->counter.bytes = total.bytes;
+	preempt_enable();
+
+	priv_clone->counter = cpu_stats;
+	return 0;
+}
+
 static struct nft_expr_type nft_counter_type;
 static const struct nft_expr_ops nft_counter_ops = {
 	.type		= &nft_counter_type,
@@ -126,6 +158,7 @@ static const struct nft_expr_ops nft_counter_ops = {
 	.init		= nft_counter_init,
 	.destroy	= nft_counter_destroy,
 	.dump		= nft_counter_dump,
+	.clone		= nft_counter_clone,
 };
 
 static struct nft_expr_type nft_counter_type __read_mostly = {
diff --git a/net/netfilter/nft_dynset.c b/net/netfilter/nft_dynset.c
index 513a8ef..9dec3bd 100644
--- a/net/netfilter/nft_dynset.c
+++ b/net/netfilter/nft_dynset.c
@@ -50,8 +50,9 @@ static void *nft_dynset_new(struct nft_set *set, const struct nft_expr *expr,
 	}
 
 	ext = nft_set_elem_ext(set, elem);
-	if (priv->expr != NULL)
-		nft_expr_clone(nft_set_ext_expr(ext), priv->expr);
+	if (priv->expr != NULL &&
+	    nft_expr_clone(nft_set_ext_expr(ext), priv->expr) < 0)
+		return NULL;
 
 	return elem;
 }
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 00/10] Netfilter fixes for net
  2015-11-11 17:33 [PATCH 00/10] Netfilter fixes for net Pablo Neira Ayuso
                   ` (9 preceding siblings ...)
  2015-11-11 17:33 ` [PATCH 10/10] netfilter: nf_tables: add clone interface to expression operations Pablo Neira Ayuso
@ 2015-11-12 19:20 ` David Miller
  2015-11-13 17:58 ` Josh Boyer
  11 siblings, 0 replies; 16+ messages in thread
From: David Miller @ 2015-11-12 19:20 UTC (permalink / raw)
  To: pablo; +Cc: netfilter-devel, netdev

From: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Wed, 11 Nov 2015 18:33:33 +0100

> The following patchset contains Netfilter fixes for your net tree. This
> large batch that includes fixes for ipset, netfilter ingress, nf_tables
> dynamic set instantiation and a longstanding Kconfig dependency problem.
> More specifically, they are:
 ...
> You can pull these changes from:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Pulled, thanks a lot Pablo.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 00/10] Netfilter fixes for net
  2015-11-11 17:33 [PATCH 00/10] Netfilter fixes for net Pablo Neira Ayuso
                   ` (10 preceding siblings ...)
  2015-11-12 19:20 ` [PATCH 00/10] Netfilter fixes for net David Miller
@ 2015-11-13 17:58 ` Josh Boyer
  2015-11-13 18:52   ` Jozsef Kadlecsik
  11 siblings, 1 reply; 16+ messages in thread
From: Josh Boyer @ 2015-11-13 17:58 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter-devel, David Miller, netdev

On Wed, Nov 11, 2015 at 12:33 PM, Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> Jozsef Kadlecsik (3):
>       netfilter: ipset: Fix extension alignment
>       netfilter: ipset: Fix hash:* type expiration
>       netfilter: ipset: Fix hash type expire: release empty hash bucket block

Should these three go to stable?  We've had reports in Fedora about
ipset crashing on e.g. ARM architectures with 4.2.y kernels.  If not
all three, then perhaps just the alignment fix?

josh

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 00/10] Netfilter fixes for net
  2015-11-13 17:58 ` Josh Boyer
@ 2015-11-13 18:52   ` Jozsef Kadlecsik
  0 siblings, 0 replies; 16+ messages in thread
From: Jozsef Kadlecsik @ 2015-11-13 18:52 UTC (permalink / raw)
  To: Josh Boyer; +Cc: Pablo Neira Ayuso, netfilter-devel, David Miller, netdev

On Fri, 13 Nov 2015, Josh Boyer wrote:

> On Wed, Nov 11, 2015 at 12:33 PM, Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> > Jozsef Kadlecsik (3):
> >       netfilter: ipset: Fix extension alignment
> >       netfilter: ipset: Fix hash:* type expiration
> >       netfilter: ipset: Fix hash type expire: release empty hash bucket block
> 
> Should these three go to stable?  We've had reports in Fedora about
> ipset crashing on e.g. ARM architectures with 4.2.y kernels.  If not
> all three, then perhaps just the alignment fix?

Yes, those should definitely go to stable, at least the first two ones.

Best regards,
Jozsef
-
E-mail  : kadlec@blackhole.kfki.hu, kadlecsik.jozsef@wigner.mta.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : Wigner Research Centre for Physics, Hungarian Academy of Sciences
          H-1525 Budapest 114, POB. 49, Hungary

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 00/10] Netfilter fixes for net
  2017-03-15 17:01 Pablo Neira Ayuso
@ 2017-03-15 22:13 ` David Miller
  0 siblings, 0 replies; 16+ messages in thread
From: David Miller @ 2017-03-15 22:13 UTC (permalink / raw)
  To: pablo; +Cc: netfilter-devel, netdev

From: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Wed, 15 Mar 2017 18:01:02 +0100

> The following patchset contains Netfilter fixes for your net tree, a
> rather large batch of fixes targeted to nf_tables, conntrack and bridge
> netfilter. More specifically, they are:
> 
> 1) Don't track fragmented packets if the socket option IP_NODEFRAG is set.
>    From Florian Westphal.
> 
> 2) SCTP protocol tracker assumes that ICMP error messages contain the
>    checksum field, what results in packet drops. From Ying Xue.
> 
> 3) Fix inconsistent handling of AH traffic from nf_tables.
> 
> 4) Fix new bitmap set representation with big endian. Fix mismatches in
>    nf_tables due to incorrect big endian handling too. Both patches
>    from Liping Zhang.
> 
> 5) Bridge netfilter doesn't honor maximum fragment size field, cap to
>    largest fragment seen. From Florian Westphal.
> 
> 6) Fake conntrack entry needs to be aligned to 8 bytes since the 3 LSB
>    bits are now used to store the ctinfo. From Steven Rostedt.
> 
> 7) Fix element comments with the bitmap set type. Revert the flush
>    field in the nft_set_iter structure, not required anymore after
>    fixing up element comments.
> 
> 8) Missing error on invalid conntrack direction from nft_ct, also from
>    Liping Zhang.
> 
> You can pull these changes from:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Pulled, thanks Pablo!

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 00/10] Netfilter fixes for net
@ 2017-03-15 17:01 Pablo Neira Ayuso
  2017-03-15 22:13 ` David Miller
  0 siblings, 1 reply; 16+ messages in thread
From: Pablo Neira Ayuso @ 2017-03-15 17:01 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

Hi David,

The following patchset contains Netfilter fixes for your net tree, a
rather large batch of fixes targeted to nf_tables, conntrack and bridge
netfilter. More specifically, they are:

1) Don't track fragmented packets if the socket option IP_NODEFRAG is set.
   From Florian Westphal.

2) SCTP protocol tracker assumes that ICMP error messages contain the
   checksum field, what results in packet drops. From Ying Xue.

3) Fix inconsistent handling of AH traffic from nf_tables.

4) Fix new bitmap set representation with big endian. Fix mismatches in
   nf_tables due to incorrect big endian handling too. Both patches
   from Liping Zhang.

5) Bridge netfilter doesn't honor maximum fragment size field, cap to
   largest fragment seen. From Florian Westphal.

6) Fake conntrack entry needs to be aligned to 8 bytes since the 3 LSB
   bits are now used to store the ctinfo. From Steven Rostedt.

7) Fix element comments with the bitmap set type. Revert the flush
   field in the nft_set_iter structure, not required anymore after
   fixing up element comments.

8) Missing error on invalid conntrack direction from nft_ct, also from
   Liping Zhang.

You can pull these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Thanks!

----------------------------------------------------------------

The following changes since commit 8d70eeb84ab277377c017af6a21d0a337025dede:

  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net (2017-03-04 17:31:39 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git HEAD

for you to fetch changes up to 4494dbc6dec37817f2cc2aa7604039a9e87ada18:

  netfilter: nft_ct: do cleanup work when NFTA_CT_DIRECTION is invalid (2017-03-15 17:15:54 +0100)

----------------------------------------------------------------
Florian Westphal (2):
      netfilter: don't track fragmented packets
      netfilter: bridge: honor frag_max_size when refragmenting

Liping Zhang (3):
      netfilter: nft_set_bitmap: fetch the element key based on the set->klen
      netfilter: nf_tables: fix mismatch in big-endian system
      netfilter: nft_ct: do cleanup work when NFTA_CT_DIRECTION is invalid

Pablo Neira Ayuso (3):
      netfilter: nf_tables: set pktinfo->thoff at AH header if found
      netfilter: nft_set_bitmap: keep a list of dummy elements
      Revert "netfilter: nf_tables: add flush field to struct nft_set_iter"

Steven Rostedt (VMware) (1):
      netfilter: Force fake conntrack entry to be at least 8 bytes aligned

Ying Xue (1):
      netfilter: nf_nat_sctp: fix ICMP packet to be dropped accidently

 include/net/netfilter/nf_conntrack.h           |   2 +-
 include/net/netfilter/nf_tables.h              |  30 ++++-
 include/net/netfilter/nf_tables_ipv6.h         |   6 +-
 net/bridge/br_netfilter_hooks.c                |  12 +-
 net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c |   4 +
 net/ipv4/netfilter/nf_nat_l3proto_ipv4.c       |   5 -
 net/ipv4/netfilter/nft_masq_ipv4.c             |   8 +-
 net/ipv4/netfilter/nft_redir_ipv4.c            |   8 +-
 net/ipv6/netfilter/nft_masq_ipv6.c             |   8 +-
 net/ipv6/netfilter/nft_redir_ipv6.c            |   8 +-
 net/netfilter/nf_conntrack_core.c              |   6 +-
 net/netfilter/nf_nat_proto_sctp.c              |  13 +-
 net/netfilter/nf_tables_api.c                  |   4 -
 net/netfilter/nft_ct.c                         |  21 ++--
 net/netfilter/nft_meta.c                       |  40 +++---
 net/netfilter/nft_nat.c                        |   8 +-
 net/netfilter/nft_set_bitmap.c                 | 165 ++++++++++++-------------
 17 files changed, 194 insertions(+), 154 deletions(-)

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2017-03-15 22:13 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-11-11 17:33 [PATCH 00/10] Netfilter fixes for net Pablo Neira Ayuso
2015-11-11 17:33 ` [PATCH 01/10] netfilter: ingress: don't use nf_hook_list_active Pablo Neira Ayuso
2015-11-11 17:33 ` [PATCH 02/10] netfilter: ingress: fix wrong input interface on hook Pablo Neira Ayuso
2015-11-11 17:33 ` [PATCH 03/10] netfilter: ipset: Fix extension alignment Pablo Neira Ayuso
2015-11-11 17:33 ` [PATCH 04/10] netfilter: ipset: Fix hash:* type expiration Pablo Neira Ayuso
2015-11-11 17:33 ` [PATCH 05/10] netfilter: ipset: Fix hash type expire: release empty hash bucket block Pablo Neira Ayuso
2015-11-11 17:33 ` [PATCH 06/10] netfilter: Fix removal of GRE expectation entries created by PPTP Pablo Neira Ayuso
2015-11-11 17:33 ` [PATCH 07/10] netfilter: nfnetlink_log: work around uninitialized variable warning Pablo Neira Ayuso
2015-11-11 17:33 ` [PATCH 08/10] netfilter: fix xt_TEE and xt_TPROXY dependencies Pablo Neira Ayuso
2015-11-11 17:33 ` [PATCH 09/10] net: add __netdev_alloc_pcpu_stats() to indicate gfp flags Pablo Neira Ayuso
2015-11-11 17:33 ` [PATCH 10/10] netfilter: nf_tables: add clone interface to expression operations Pablo Neira Ayuso
2015-11-12 19:20 ` [PATCH 00/10] Netfilter fixes for net David Miller
2015-11-13 17:58 ` Josh Boyer
2015-11-13 18:52   ` Jozsef Kadlecsik
2017-03-15 17:01 Pablo Neira Ayuso
2017-03-15 22:13 ` David Miller

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.