netfilter-devel.vger.kernel.org archive mirror
* [PATCH 0/6] Netfilter fixes for net
@ 2020-02-26 22:54 Pablo Neira Ayuso
  2020-02-26 22:54 ` [PATCH 1/6] netfilter: ipset: Fix "INFO: rcu detected stall in hash_xxx" reports Pablo Neira Ayuso
                   ` (6 more replies)
  0 siblings, 7 replies; 44+ messages in thread
From: Pablo Neira Ayuso @ 2020-02-26 22:54 UTC
  To: netfilter-devel; +Cc: davem, netdev

Hi,

The following patchset contains Netfilter fixes:

1) Perform garbage collection from workqueue to fix rcu detected
   stall in ipset hash set types, from Jozsef Kadlecsik.

2) Fix the forceadd evaluation path, also from Jozsef.

3) Fix nft_set_pipapo selftest, from Stefano Brivio.

4) Fix crash on add-flush-add of an element in a pipapo set, also from
   Stefano. Add a test to cover this crash.

5) Remove proc entry under mutex in hashlimit, from Cong Wang.

You can pull these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Thank you.

----------------------------------------------------------------

The following changes since commit 3614d05b5e6baf487e88fb114d884da172edd61a:

  Merge tag 'mac80211-for-net-2020-02-24' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 (2020-02-24 15:43:38 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git HEAD

for you to fetch changes up to 99b79c3900d4627672c85d9f344b5b0f06bc2a4d:

  netfilter: xt_hashlimit: unregister proc file before releasing mutex (2020-02-26 23:25:07 +0100)

----------------------------------------------------------------
Cong Wang (1):
      netfilter: xt_hashlimit: unregister proc file before releasing mutex

Jozsef Kadlecsik (2):
      netfilter: ipset: Fix "INFO: rcu detected stall in hash_xxx" reports
      netfilter: ipset: Fix forceadd evaluation path

Pablo Neira Ayuso (1):
      Merge branch 'master' of git://blackhole.kfki.hu/nf

Stefano Brivio (3):
      selftests: nft_concat_range: Move option for 'list ruleset' before command
      nft_set_pipapo: Actually fetch key data in nft_pipapo_remove()
      selftests: nft_concat_range: Add test for reported add/flush/add issue

 include/linux/netfilter/ipset/ip_set.h             |  11 +-
 net/netfilter/ipset/ip_set_core.c                  |  34 +-
 net/netfilter/ipset/ip_set_hash_gen.h              | 635 ++++++++++++++-------
 net/netfilter/nft_set_pipapo.c                     |   6 +-
 net/netfilter/xt_hashlimit.c                       |  16 +-
 .../selftests/netfilter/nft_concat_range.sh        |  55 +-
 6 files changed, 529 insertions(+), 228 deletions(-)


* [PATCH 1/6] netfilter: ipset: Fix "INFO: rcu detected stall in hash_xxx" reports
  2020-02-26 22:54 [PATCH 0/6] Netfilter fixes for net Pablo Neira Ayuso
@ 2020-02-26 22:54 ` Pablo Neira Ayuso
  2020-02-26 22:54 ` [PATCH 2/6] netfilter: ipset: Fix forceadd evaluation path Pablo Neira Ayuso
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 44+ messages in thread
From: Pablo Neira Ayuso @ 2020-02-26 22:54 UTC
  To: netfilter-devel; +Cc: davem, netdev

From: Jozsef Kadlecsik <kadlec@netfilter.org>

For huge hash:* type sets, processing the whole set under the single
spinlock of the set could take too long.

There were four places where the whole hash table of the set was processed
bucket by bucket while holding the spinlock:

- During resizing a set, the original set was locked to exclude kernel side
  add/del element operations (userspace add/del is excluded by the
  nfnetlink mutex). The original set is actually just read during the
  resize, so the spinlocking is replaced with rcu locking of regions.
  As a consequence, kernel side add/del of entries can now run in
  parallel with the resize. In order not to lose those operations, a
  backlog is added and replayed after the successful resize.
- Garbage collection of timed out entries was also protected by the spinlock.
  To keep lock hold times short, region locking is introduced and a single
  region is processed per gc run. Also, the simple timer based gc is
  replaced with a workqueue based solution. The internal book-keeping
  (number of elements, size of extensions) is moved to region level due to
  the region locking.
- Adding elements: when the max number of the elements is reached, the gc
  was called to evict the timed out entries. The new approach is that the gc
  is called just for the matching region, assuming that if the region
  (proportionally) seems to be full, then the whole set is too. We could
  scan the other regions to check every entry under rcu locking, but for
  huge sets it'd mean a slowdown at adding elements.
- Listing the set header data: when the set was defined with timeout
  support, the garbage collector was called to clean up timed out entries
  to get the correct element numbers and set size values. Now the set is
  scanned to count the entries which have not timed out, without actually
  calling the gc for the whole set.

Thanks to Florian Westphal for helping me to solve the SOFTIRQ-safe ->
SOFTIRQ-unsafe lock order issues while working on the patch.

Reported-by: syzbot+4b0e9d4ff3cf117837e5@syzkaller.appspotmail.com
Reported-by: syzbot+c27b8d5010f45c666ed1@syzkaller.appspotmail.com
Reported-by: syzbot+68a806795ac89df3aa1c@syzkaller.appspotmail.com
Fixes: 23c42a403a9c ("netfilter: ipset: Introduction of new commands and protocol version 7")
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
---
 include/linux/netfilter/ipset/ip_set.h |  11 +-
 net/netfilter/ipset/ip_set_core.c      |  34 +-
 net/netfilter/ipset/ip_set_hash_gen.h  | 633 +++++++++++++++++++++++----------
 3 files changed, 472 insertions(+), 206 deletions(-)
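
The region arithmetic described in the commit message can be tried out in
user space. The following is only a minimal sketch, not part of the patch:
it re-expresses the ahash_* macros, the per-region maxelem split and the
next_run computation of mtype_gc() as plain C functions. The function
names, the gc_period_s and hz parameters (standing in for
IPSET_GC_PERIOD(set->timeout) and HZ) and the REGION_SKETCH_DEMO guard are
assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Same value as in the patch: a region covers 2^10 buckets. */
#define HTABLE_REGION_BITS 10

/* User-space stand-in for the kernel's jhash_size(n), i.e. 2^n. */
static inline uint32_t jhash_size(unsigned int bits)
{
	return (uint32_t)1 << bits;
}

/* Mirrors ahash_numof_locks(): small tables get a single region lock. */
static inline uint32_t numof_locks(unsigned int htable_bits)
{
	return htable_bits < HTABLE_REGION_BITS ? 1
		: jhash_size(htable_bits - HTABLE_REGION_BITS);
}

/* Mirrors ahash_bucket_start(): first bucket scanned for region r. */
static inline uint32_t bucket_start(uint32_t r, unsigned int htable_bits)
{
	return htable_bits < HTABLE_REGION_BITS ? 0
		: r * jhash_size(HTABLE_REGION_BITS);
}

/* Mirrors ahash_bucket_end(): one past the last bucket of region r. */
static inline uint32_t bucket_end(uint32_t r, unsigned int htable_bits)
{
	return htable_bits < HTABLE_REGION_BITS ? jhash_size(htable_bits)
		: (r + 1) * jhash_size(HTABLE_REGION_BITS);
}

/* Mirrors ahash_region() as written in this patch: key modulo lock count. */
static inline uint32_t region_of(uint32_t key, unsigned int htable_bits)
{
	return key % numof_locks(htable_bits);
}

/* Mirrors t->maxelem = h->maxelem / ahash_numof_locks(htable_bits). */
static inline uint32_t region_maxelem(uint32_t maxelem,
				      unsigned int htable_bits)
{
	return maxelem / numof_locks(htable_bits);
}

/* Mirrors next_run in mtype_gc(): the gc period is split across the
 * regions, with a floor of hz / 10 ticks between two runs.
 * gc_period_s and hz are hypothetical inputs, see the lead-in.
 */
static inline unsigned int gc_next_run(unsigned int gc_period_s,
				       unsigned int hz,
				       unsigned int htable_bits)
{
	unsigned int next_run = (gc_period_s * hz) / numof_locks(htable_bits);

	return next_run < hz / 10 ? hz / 10 : next_run;
}

#ifdef REGION_SKETCH_DEMO
int main(void)
{
	unsigned int bits[] = { 8, 12, 20 };
	unsigned int i;

	for (i = 0; i < sizeof(bits) / sizeof(bits[0]); i++)
		printf("htable_bits=%u locks=%u region0=[%u,%u) next_run=%u\n",
		       bits[i], numof_locks(bits[i]),
		       bucket_start(0, bits[i]), bucket_end(0, bits[i]),
		       gc_next_run(60, 250, bits[i]));
	return 0;
}
#endif
```

Building with -DREGION_SKETCH_DEMO enables the small main() above. Note
that with the modulo form of ahash_region() used in this patch,
region_of(5, 12) is 1 while bucket 5 lies in the bucket range of region 0,
so the sketch keeps the two mappings separate instead of assuming that
they agree.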

diff --git a/include/linux/netfilter/ipset/ip_set.h b/include/linux/netfilter/ipset/ip_set.h
index 908d38dbcb91..5448c8b443db 100644
--- a/include/linux/netfilter/ipset/ip_set.h
+++ b/include/linux/netfilter/ipset/ip_set.h
@@ -121,6 +121,7 @@ struct ip_set_ext {
 	u32 timeout;
 	u8 packets_op;
 	u8 bytes_op;
+	bool target;
 };
 
 struct ip_set;
@@ -187,6 +188,14 @@ struct ip_set_type_variant {
 	/* Return true if "b" set is the same as "a"
 	 * according to the create set parameters */
 	bool (*same_set)(const struct ip_set *a, const struct ip_set *b);
+	/* Region-locking is used */
+	bool region_lock;
+};
+
+struct ip_set_region {
+	spinlock_t lock;	/* Region lock */
+	size_t ext_size;	/* Size of the dynamic extensions */
+	u32 elements;		/* Number of elements vs timeout */
 };
 
 /* The core set type structure */
@@ -501,7 +510,7 @@ ip_set_init_skbinfo(struct ip_set_skbinfo *skbinfo,
 }
 
 #define IP_SET_INIT_KEXT(skb, opt, set)			\
-	{ .bytes = (skb)->len, .packets = 1,		\
+	{ .bytes = (skb)->len, .packets = 1, .target = true,\
 	  .timeout = ip_set_adt_opt_timeout(opt, set) }
 
 #define IP_SET_INIT_UEXT(set)				\
diff --git a/net/netfilter/ipset/ip_set_core.c b/net/netfilter/ipset/ip_set_core.c
index 69c107f9ba8d..8dd17589217d 100644
--- a/net/netfilter/ipset/ip_set_core.c
+++ b/net/netfilter/ipset/ip_set_core.c
@@ -723,6 +723,20 @@ ip_set_rcu_get(struct net *net, ip_set_id_t index)
 	return set;
 }
 
+static inline void
+ip_set_lock(struct ip_set *set)
+{
+	if (!set->variant->region_lock)
+		spin_lock_bh(&set->lock);
+}
+
+static inline void
+ip_set_unlock(struct ip_set *set)
+{
+	if (!set->variant->region_lock)
+		spin_unlock_bh(&set->lock);
+}
+
 int
 ip_set_test(ip_set_id_t index, const struct sk_buff *skb,
 	    const struct xt_action_param *par, struct ip_set_adt_opt *opt)
@@ -744,9 +758,9 @@ ip_set_test(ip_set_id_t index, const struct sk_buff *skb,
 	if (ret == -EAGAIN) {
 		/* Type requests element to be completed */
 		pr_debug("element must be completed, ADD is triggered\n");
-		spin_lock_bh(&set->lock);
+		ip_set_lock(set);
 		set->variant->kadt(set, skb, par, IPSET_ADD, opt);
-		spin_unlock_bh(&set->lock);
+		ip_set_unlock(set);
 		ret = 1;
 	} else {
 		/* --return-nomatch: invert matched element */
@@ -775,9 +789,9 @@ ip_set_add(ip_set_id_t index, const struct sk_buff *skb,
 	    !(opt->family == set->family || set->family == NFPROTO_UNSPEC))
 		return -IPSET_ERR_TYPE_MISMATCH;
 
-	spin_lock_bh(&set->lock);
+	ip_set_lock(set);
 	ret = set->variant->kadt(set, skb, par, IPSET_ADD, opt);
-	spin_unlock_bh(&set->lock);
+	ip_set_unlock(set);
 
 	return ret;
 }
@@ -797,9 +811,9 @@ ip_set_del(ip_set_id_t index, const struct sk_buff *skb,
 	    !(opt->family == set->family || set->family == NFPROTO_UNSPEC))
 		return -IPSET_ERR_TYPE_MISMATCH;
 
-	spin_lock_bh(&set->lock);
+	ip_set_lock(set);
 	ret = set->variant->kadt(set, skb, par, IPSET_DEL, opt);
-	spin_unlock_bh(&set->lock);
+	ip_set_unlock(set);
 
 	return ret;
 }
@@ -1264,9 +1278,9 @@ ip_set_flush_set(struct ip_set *set)
 {
 	pr_debug("set: %s\n",  set->name);
 
-	spin_lock_bh(&set->lock);
+	ip_set_lock(set);
 	set->variant->flush(set);
-	spin_unlock_bh(&set->lock);
+	ip_set_unlock(set);
 }
 
 static int ip_set_flush(struct net *net, struct sock *ctnl, struct sk_buff *skb,
@@ -1713,9 +1727,9 @@ call_ad(struct sock *ctnl, struct sk_buff *skb, struct ip_set *set,
 	bool eexist = flags & IPSET_FLAG_EXIST, retried = false;
 
 	do {
-		spin_lock_bh(&set->lock);
+		ip_set_lock(set);
 		ret = set->variant->uadt(set, tb, adt, &lineno, flags, retried);
-		spin_unlock_bh(&set->lock);
+		ip_set_unlock(set);
 		retried = true;
 	} while (ret == -EAGAIN &&
 		 set->variant->resize &&
diff --git a/net/netfilter/ipset/ip_set_hash_gen.h b/net/netfilter/ipset/ip_set_hash_gen.h
index 7480ce55b5c8..71e93eac0831 100644
--- a/net/netfilter/ipset/ip_set_hash_gen.h
+++ b/net/netfilter/ipset/ip_set_hash_gen.h
@@ -7,13 +7,21 @@
 #include <linux/rcupdate.h>
 #include <linux/jhash.h>
 #include <linux/types.h>
+#include <linux/netfilter/nfnetlink.h>
 #include <linux/netfilter/ipset/ip_set.h>
 
-#define __ipset_dereference_protected(p, c)	rcu_dereference_protected(p, c)
-#define ipset_dereference_protected(p, set) \
-	__ipset_dereference_protected(p, lockdep_is_held(&(set)->lock))
-
-#define rcu_dereference_bh_nfnl(p)	rcu_dereference_bh_check(p, 1)
+#define __ipset_dereference(p)		\
+	rcu_dereference_protected(p, 1)
+#define ipset_dereference_nfnl(p)	\
+	rcu_dereference_protected(p,	\
+		lockdep_nfnl_is_held(NFNL_SUBSYS_IPSET))
+#define ipset_dereference_set(p, set) 	\
+	rcu_dereference_protected(p,	\
+		lockdep_nfnl_is_held(NFNL_SUBSYS_IPSET) || \
+		lockdep_is_held(&(set)->lock))
+#define ipset_dereference_bh_nfnl(p)	\
+	rcu_dereference_bh_check(p, 	\
+		lockdep_nfnl_is_held(NFNL_SUBSYS_IPSET))
 
 /* Hashing which uses arrays to resolve clashing. The hash table is resized
  * (doubled) when searching becomes too long.
@@ -72,11 +80,35 @@ struct hbucket {
 		__aligned(__alignof__(u64));
 };
 
+/* Region size for locking == 2^HTABLE_REGION_BITS */
+#define HTABLE_REGION_BITS	10
+#define ahash_numof_locks(htable_bits)		\
+	((htable_bits) < HTABLE_REGION_BITS ? 1	\
+		: jhash_size((htable_bits) - HTABLE_REGION_BITS))
+#define ahash_sizeof_regions(htable_bits)		\
+	(ahash_numof_locks(htable_bits) * sizeof(struct ip_set_region))
+#define ahash_region(n, htable_bits)		\
+	((n) % ahash_numof_locks(htable_bits))
+#define ahash_bucket_start(h,  htable_bits)	\
+	((htable_bits) < HTABLE_REGION_BITS ? 0	\
+		: (h) * jhash_size(HTABLE_REGION_BITS))
+#define ahash_bucket_end(h,  htable_bits)	\
+	((htable_bits) < HTABLE_REGION_BITS ? jhash_size(htable_bits)	\
+		: ((h) + 1) * jhash_size(HTABLE_REGION_BITS))
+
+struct htable_gc {
+	struct delayed_work dwork;
+	struct ip_set *set;	/* Set the gc belongs to */
+	u32 region;		/* Last gc run position */
+};
+
 /* The hash table: the table size stored here in order to make resizing easy */
 struct htable {
 	atomic_t ref;		/* References for resizing */
-	atomic_t uref;		/* References for dumping */
+	atomic_t uref;		/* References for dumping and gc */
 	u8 htable_bits;		/* size of hash table == 2^htable_bits */
+	u32 maxelem;		/* Maxelem per region */
+	struct ip_set_region *hregion;	/* Region locks and ext sizes */
 	struct hbucket __rcu *bucket[0]; /* hashtable buckets */
 };
 
@@ -162,6 +194,10 @@ htable_bits(u32 hashsize)
 #define NLEN			0
 #endif /* IP_SET_HASH_WITH_NETS */
 
+#define SET_ELEM_EXPIRED(set, d)	\
+	(SET_WITH_TIMEOUT(set) &&	\
+	 ip_set_timeout_expired(ext_timeout(d, set)))
+
 #endif /* _IP_SET_HASH_GEN_H */
 
 #ifndef MTYPE
@@ -205,10 +241,12 @@ htable_bits(u32 hashsize)
 #undef mtype_test_cidrs
 #undef mtype_test
 #undef mtype_uref
-#undef mtype_expire
 #undef mtype_resize
+#undef mtype_ext_size
+#undef mtype_resize_ad
 #undef mtype_head
 #undef mtype_list
+#undef mtype_gc_do
 #undef mtype_gc
 #undef mtype_gc_init
 #undef mtype_variant
@@ -247,10 +285,12 @@ htable_bits(u32 hashsize)
 #define mtype_test_cidrs	IPSET_TOKEN(MTYPE, _test_cidrs)
 #define mtype_test		IPSET_TOKEN(MTYPE, _test)
 #define mtype_uref		IPSET_TOKEN(MTYPE, _uref)
-#define mtype_expire		IPSET_TOKEN(MTYPE, _expire)
 #define mtype_resize		IPSET_TOKEN(MTYPE, _resize)
+#define mtype_ext_size		IPSET_TOKEN(MTYPE, _ext_size)
+#define mtype_resize_ad		IPSET_TOKEN(MTYPE, _resize_ad)
 #define mtype_head		IPSET_TOKEN(MTYPE, _head)
 #define mtype_list		IPSET_TOKEN(MTYPE, _list)
+#define mtype_gc_do		IPSET_TOKEN(MTYPE, _gc_do)
 #define mtype_gc		IPSET_TOKEN(MTYPE, _gc)
 #define mtype_gc_init		IPSET_TOKEN(MTYPE, _gc_init)
 #define mtype_variant		IPSET_TOKEN(MTYPE, _variant)
@@ -275,8 +315,7 @@ htable_bits(u32 hashsize)
 /* The generic hash structure */
 struct htype {
 	struct htable __rcu *table; /* the hash table */
-	struct timer_list gc;	/* garbage collection when timeout enabled */
-	struct ip_set *set;	/* attached to this ip_set */
+	struct htable_gc gc;	/* gc workqueue */
 	u32 maxelem;		/* max elements in the hash */
 	u32 initval;		/* random jhash init value */
 #ifdef IP_SET_HASH_WITH_MARKMASK
@@ -288,21 +327,33 @@ struct htype {
 #ifdef IP_SET_HASH_WITH_NETMASK
 	u8 netmask;		/* netmask value for subnets to store */
 #endif
+	struct list_head ad;	/* Resize add|del backlist */
 	struct mtype_elem next; /* temporary storage for uadd */
 #ifdef IP_SET_HASH_WITH_NETS
 	struct net_prefixes nets[NLEN]; /* book-keeping of prefixes */
 #endif
 };
 
+/* ADD|DEL entries saved during resize */
+struct mtype_resize_ad {
+	struct list_head list;
+	enum ipset_adt ad;	/* ADD|DEL element */
+	struct mtype_elem d;	/* Element value */
+	struct ip_set_ext ext;	/* Extensions for ADD */
+	struct ip_set_ext mext;	/* Target extensions for ADD */
+	u32 flags;		/* Flags for ADD */
+};
+
 #ifdef IP_SET_HASH_WITH_NETS
 /* Network cidr size book keeping when the hash stores different
  * sized networks. cidr == real cidr + 1 to support /0.
  */
 static void
-mtype_add_cidr(struct htype *h, u8 cidr, u8 n)
+mtype_add_cidr(struct ip_set *set, struct htype *h, u8 cidr, u8 n)
 {
 	int i, j;
 
+	spin_lock_bh(&set->lock);
 	/* Add in increasing prefix order, so larger cidr first */
 	for (i = 0, j = -1; i < NLEN && h->nets[i].cidr[n]; i++) {
 		if (j != -1) {
@@ -311,7 +362,7 @@ mtype_add_cidr(struct htype *h, u8 cidr, u8 n)
 			j = i;
 		} else if (h->nets[i].cidr[n] == cidr) {
 			h->nets[CIDR_POS(cidr)].nets[n]++;
-			return;
+			goto unlock;
 		}
 	}
 	if (j != -1) {
@@ -320,24 +371,29 @@ mtype_add_cidr(struct htype *h, u8 cidr, u8 n)
 	}
 	h->nets[i].cidr[n] = cidr;
 	h->nets[CIDR_POS(cidr)].nets[n] = 1;
+unlock:
+	spin_unlock_bh(&set->lock);
 }
 
 static void
-mtype_del_cidr(struct htype *h, u8 cidr, u8 n)
+mtype_del_cidr(struct ip_set *set, struct htype *h, u8 cidr, u8 n)
 {
 	u8 i, j, net_end = NLEN - 1;
 
+	spin_lock_bh(&set->lock);
 	for (i = 0; i < NLEN; i++) {
 		if (h->nets[i].cidr[n] != cidr)
 			continue;
 		h->nets[CIDR_POS(cidr)].nets[n]--;
 		if (h->nets[CIDR_POS(cidr)].nets[n] > 0)
-			return;
+			goto unlock;
 		for (j = i; j < net_end && h->nets[j].cidr[n]; j++)
 			h->nets[j].cidr[n] = h->nets[j + 1].cidr[n];
 		h->nets[j].cidr[n] = 0;
-		return;
+		goto unlock;
 	}
+unlock:
+	spin_unlock_bh(&set->lock);
 }
 #endif
 
@@ -345,7 +401,7 @@ mtype_del_cidr(struct htype *h, u8 cidr, u8 n)
 static size_t
 mtype_ahash_memsize(const struct htype *h, const struct htable *t)
 {
-	return sizeof(*h) + sizeof(*t);
+	return sizeof(*h) + sizeof(*t) + ahash_sizeof_regions(t->htable_bits);
 }
 
 /* Get the ith element from the array block n */
@@ -369,24 +425,29 @@ mtype_flush(struct ip_set *set)
 	struct htype *h = set->data;
 	struct htable *t;
 	struct hbucket *n;
-	u32 i;
-
-	t = ipset_dereference_protected(h->table, set);
-	for (i = 0; i < jhash_size(t->htable_bits); i++) {
-		n = __ipset_dereference_protected(hbucket(t, i), 1);
-		if (!n)
-			continue;
-		if (set->extensions & IPSET_EXT_DESTROY)
-			mtype_ext_cleanup(set, n);
-		/* FIXME: use slab cache */
-		rcu_assign_pointer(hbucket(t, i), NULL);
-		kfree_rcu(n, rcu);
+	u32 r, i;
+
+	t = ipset_dereference_nfnl(h->table);
+	for (r = 0; r < ahash_numof_locks(t->htable_bits); r++) {
+		spin_lock_bh(&t->hregion[r].lock);
+		for (i = ahash_bucket_start(r, t->htable_bits);
+		     i < ahash_bucket_end(r, t->htable_bits); i++) {
+			n = __ipset_dereference(hbucket(t, i));
+			if (!n)
+				continue;
+			if (set->extensions & IPSET_EXT_DESTROY)
+				mtype_ext_cleanup(set, n);
+			/* FIXME: use slab cache */
+			rcu_assign_pointer(hbucket(t, i), NULL);
+			kfree_rcu(n, rcu);
+		}
+		t->hregion[r].ext_size = 0;
+		t->hregion[r].elements = 0;
+		spin_unlock_bh(&t->hregion[r].lock);
 	}
 #ifdef IP_SET_HASH_WITH_NETS
 	memset(h->nets, 0, sizeof(h->nets));
 #endif
-	set->elements = 0;
-	set->ext_size = 0;
 }
 
 /* Destroy the hashtable part of the set */
@@ -397,7 +458,7 @@ mtype_ahash_destroy(struct ip_set *set, struct htable *t, bool ext_destroy)
 	u32 i;
 
 	for (i = 0; i < jhash_size(t->htable_bits); i++) {
-		n = __ipset_dereference_protected(hbucket(t, i), 1);
+		n = __ipset_dereference(hbucket(t, i));
 		if (!n)
 			continue;
 		if (set->extensions & IPSET_EXT_DESTROY && ext_destroy)
@@ -406,6 +467,7 @@ mtype_ahash_destroy(struct ip_set *set, struct htable *t, bool ext_destroy)
 		kfree(n);
 	}
 
+	ip_set_free(t->hregion);
 	ip_set_free(t);
 }
 
@@ -414,28 +476,21 @@ static void
 mtype_destroy(struct ip_set *set)
 {
 	struct htype *h = set->data;
+	struct list_head *l, *lt;
 
 	if (SET_WITH_TIMEOUT(set))
-		del_timer_sync(&h->gc);
+		cancel_delayed_work_sync(&h->gc.dwork);
 
-	mtype_ahash_destroy(set,
-			    __ipset_dereference_protected(h->table, 1), true);
+	mtype_ahash_destroy(set, ipset_dereference_nfnl(h->table), true);
+	list_for_each_safe(l, lt, &h->ad) {
+		list_del(l);
+		kfree(l);
+	}
 	kfree(h);
 
 	set->data = NULL;
 }
 
-static void
-mtype_gc_init(struct ip_set *set, void (*gc)(struct timer_list *t))
-{
-	struct htype *h = set->data;
-
-	timer_setup(&h->gc, gc, 0);
-	mod_timer(&h->gc, jiffies + IPSET_GC_PERIOD(set->timeout) * HZ);
-	pr_debug("gc initialized, run in every %u\n",
-		 IPSET_GC_PERIOD(set->timeout));
-}
-
 static bool
 mtype_same_set(const struct ip_set *a, const struct ip_set *b)
 {
@@ -454,11 +509,9 @@ mtype_same_set(const struct ip_set *a, const struct ip_set *b)
 	       a->extensions == b->extensions;
 }
 
-/* Delete expired elements from the hashtable */
 static void
-mtype_expire(struct ip_set *set, struct htype *h)
+mtype_gc_do(struct ip_set *set, struct htype *h, struct htable *t, u32 r)
 {
-	struct htable *t;
 	struct hbucket *n, *tmp;
 	struct mtype_elem *data;
 	u32 i, j, d;
@@ -466,10 +519,12 @@ mtype_expire(struct ip_set *set, struct htype *h)
 #ifdef IP_SET_HASH_WITH_NETS
 	u8 k;
 #endif
+	u8 htable_bits = t->htable_bits;
 
-	t = ipset_dereference_protected(h->table, set);
-	for (i = 0; i < jhash_size(t->htable_bits); i++) {
-		n = __ipset_dereference_protected(hbucket(t, i), 1);
+	spin_lock_bh(&t->hregion[r].lock);
+	for (i = ahash_bucket_start(r, htable_bits);
+	     i < ahash_bucket_end(r, htable_bits); i++) {
+		n = __ipset_dereference(hbucket(t, i));
 		if (!n)
 			continue;
 		for (j = 0, d = 0; j < n->pos; j++) {
@@ -485,58 +540,100 @@ mtype_expire(struct ip_set *set, struct htype *h)
 			smp_mb__after_atomic();
 #ifdef IP_SET_HASH_WITH_NETS
 			for (k = 0; k < IPSET_NET_COUNT; k++)
-				mtype_del_cidr(h,
+				mtype_del_cidr(set, h,
 					NCIDR_PUT(DCIDR_GET(data->cidr, k)),
 					k);
 #endif
+			t->hregion[r].elements--;
 			ip_set_ext_destroy(set, data);
-			set->elements--;
 			d++;
 		}
 		if (d >= AHASH_INIT_SIZE) {
 			if (d >= n->size) {
+				t->hregion[r].ext_size -=
+					ext_size(n->size, dsize);
 				rcu_assign_pointer(hbucket(t, i), NULL);
 				kfree_rcu(n, rcu);
 				continue;
 			}
 			tmp = kzalloc(sizeof(*tmp) +
-				      (n->size - AHASH_INIT_SIZE) * dsize,
-				      GFP_ATOMIC);
+				(n->size - AHASH_INIT_SIZE) * dsize,
+				GFP_ATOMIC);
 			if (!tmp)
-				/* Still try to delete expired elements */
+				/* Still try to delete expired elements. */
 				continue;
 			tmp->size = n->size - AHASH_INIT_SIZE;
 			for (j = 0, d = 0; j < n->pos; j++) {
 				if (!test_bit(j, n->used))
 					continue;
 				data = ahash_data(n, j, dsize);
-				memcpy(tmp->value + d * dsize, data, dsize);
+				memcpy(tmp->value + d * dsize,
+				       data, dsize);
 				set_bit(d, tmp->used);
 				d++;
 			}
 			tmp->pos = d;
-			set->ext_size -= ext_size(AHASH_INIT_SIZE, dsize);
+			t->hregion[r].ext_size -=
+				ext_size(AHASH_INIT_SIZE, dsize);
 			rcu_assign_pointer(hbucket(t, i), tmp);
 			kfree_rcu(n, rcu);
 		}
 	}
+	spin_unlock_bh(&t->hregion[r].lock);
 }
 
 static void
-mtype_gc(struct timer_list *t)
+mtype_gc(struct work_struct *work)
 {
-	struct htype *h = from_timer(h, t, gc);
-	struct ip_set *set = h->set;
+	struct htable_gc *gc;
+	struct ip_set *set;
+	struct htype *h;
+	struct htable *t;
+	u32 r, numof_locks;
+	unsigned int next_run;
+
+	gc = container_of(work, struct htable_gc, dwork.work);
+	set = gc->set;
+	h = set->data;
 
-	pr_debug("called\n");
 	spin_lock_bh(&set->lock);
-	mtype_expire(set, h);
+	t = ipset_dereference_set(h->table, set);
+	atomic_inc(&t->uref);
+	numof_locks = ahash_numof_locks(t->htable_bits);
+	r = gc->region++;
+	if (r >= numof_locks) {
+		r = gc->region = 0;
+	}
+	next_run = (IPSET_GC_PERIOD(set->timeout) * HZ) / numof_locks;
+	if (next_run < HZ/10)
+		next_run = HZ/10;
 	spin_unlock_bh(&set->lock);
 
-	h->gc.expires = jiffies + IPSET_GC_PERIOD(set->timeout) * HZ;
-	add_timer(&h->gc);
+	mtype_gc_do(set, h, t, r);
+
+	if (atomic_dec_and_test(&t->uref) && atomic_read(&t->ref)) {
+		pr_debug("Table destroy after resize by expire: %p\n", t);
+		mtype_ahash_destroy(set, t, false);
+	}
+
+	queue_delayed_work(system_power_efficient_wq, &gc->dwork, next_run);
+
+}
+
+static void
+mtype_gc_init(struct htable_gc *gc)
+{
+	INIT_DEFERRABLE_WORK(&gc->dwork, mtype_gc);
+	queue_delayed_work(system_power_efficient_wq, &gc->dwork, HZ);
 }
 
+static int
+mtype_add(struct ip_set *set, void *value, const struct ip_set_ext *ext,
+	  struct ip_set_ext *mext, u32 flags);
+static int
+mtype_del(struct ip_set *set, void *value, const struct ip_set_ext *ext,
+	  struct ip_set_ext *mext, u32 flags);
+
 /* Resize a hash: create a new hash table with doubling the hashsize
  * and inserting the elements to it. Repeat until we succeed or
  * fail due to memory pressures.
@@ -547,7 +644,7 @@ mtype_resize(struct ip_set *set, bool retried)
 	struct htype *h = set->data;
 	struct htable *t, *orig;
 	u8 htable_bits;
-	size_t extsize, dsize = set->dsize;
+	size_t dsize = set->dsize;
 #ifdef IP_SET_HASH_WITH_NETS
 	u8 flags;
 	struct mtype_elem *tmp;
@@ -555,7 +652,9 @@ mtype_resize(struct ip_set *set, bool retried)
 	struct mtype_elem *data;
 	struct mtype_elem *d;
 	struct hbucket *n, *m;
-	u32 i, j, key;
+	struct list_head *l, *lt;
+	struct mtype_resize_ad *x;
+	u32 i, j, r, nr, key;
 	int ret;
 
 #ifdef IP_SET_HASH_WITH_NETS
@@ -563,10 +662,8 @@ mtype_resize(struct ip_set *set, bool retried)
 	if (!tmp)
 		return -ENOMEM;
 #endif
-	rcu_read_lock_bh();
-	orig = rcu_dereference_bh_nfnl(h->table);
+	orig = ipset_dereference_bh_nfnl(h->table);
 	htable_bits = orig->htable_bits;
-	rcu_read_unlock_bh();
 
 retry:
 	ret = 0;
@@ -583,88 +680,124 @@ mtype_resize(struct ip_set *set, bool retried)
 		ret = -ENOMEM;
 		goto out;
 	}
+	t->hregion = ip_set_alloc(ahash_sizeof_regions(htable_bits));
+	if (!t->hregion) {
+		kfree(t);
+		ret = -ENOMEM;
+		goto out;
+	}
 	t->htable_bits = htable_bits;
+	t->maxelem = h->maxelem / ahash_numof_locks(htable_bits);
+	for (i = 0; i < ahash_numof_locks(htable_bits); i++)
+		spin_lock_init(&t->hregion[i].lock);
 
-	spin_lock_bh(&set->lock);
-	orig = __ipset_dereference_protected(h->table, 1);
-	/* There can't be another parallel resizing, but dumping is possible */
+	/* There can't be another parallel resizing,
+	 * but dumping, gc, kernel side add/del are possible
+	 */
+	orig = ipset_dereference_bh_nfnl(h->table);
 	atomic_set(&orig->ref, 1);
 	atomic_inc(&orig->uref);
-	extsize = 0;
 	pr_debug("attempt to resize set %s from %u to %u, t %p\n",
 		 set->name, orig->htable_bits, htable_bits, orig);
-	for (i = 0; i < jhash_size(orig->htable_bits); i++) {
-		n = __ipset_dereference_protected(hbucket(orig, i), 1);
-		if (!n)
-			continue;
-		for (j = 0; j < n->pos; j++) {
-			if (!test_bit(j, n->used))
+	for (r = 0; r < ahash_numof_locks(orig->htable_bits); r++) {
+		/* Expire may replace a hbucket with another one */
+		rcu_read_lock_bh();
+		for (i = ahash_bucket_start(r, orig->htable_bits);
+		     i < ahash_bucket_end(r, orig->htable_bits); i++) {
+			n = __ipset_dereference(hbucket(orig, i));
+			if (!n)
 				continue;
-			data = ahash_data(n, j, dsize);
+			for (j = 0; j < n->pos; j++) {
+				if (!test_bit(j, n->used))
+					continue;
+				data = ahash_data(n, j, dsize);
+				if (SET_ELEM_EXPIRED(set, data))
+					continue;
 #ifdef IP_SET_HASH_WITH_NETS
-			/* We have readers running parallel with us,
-			 * so the live data cannot be modified.
-			 */
-			flags = 0;
-			memcpy(tmp, data, dsize);
-			data = tmp;
-			mtype_data_reset_flags(data, &flags);
+				/* We have readers running parallel with us,
+				 * so the live data cannot be modified.
+				 */
+				flags = 0;
+				memcpy(tmp, data, dsize);
+				data = tmp;
+				mtype_data_reset_flags(data, &flags);
 #endif
-			key = HKEY(data, h->initval, htable_bits);
-			m = __ipset_dereference_protected(hbucket(t, key), 1);
-			if (!m) {
-				m = kzalloc(sizeof(*m) +
+				key = HKEY(data, h->initval, htable_bits);
+				m = __ipset_dereference(hbucket(t, key));
+				nr = ahash_region(key, htable_bits);
+				if (!m) {
+					m = kzalloc(sizeof(*m) +
 					    AHASH_INIT_SIZE * dsize,
 					    GFP_ATOMIC);
-				if (!m) {
-					ret = -ENOMEM;
-					goto cleanup;
-				}
-				m->size = AHASH_INIT_SIZE;
-				extsize += ext_size(AHASH_INIT_SIZE, dsize);
-				RCU_INIT_POINTER(hbucket(t, key), m);
-			} else if (m->pos >= m->size) {
-				struct hbucket *ht;
-
-				if (m->size >= AHASH_MAX(h)) {
-					ret = -EAGAIN;
-				} else {
-					ht = kzalloc(sizeof(*ht) +
+					if (!m) {
+						ret = -ENOMEM;
+						goto cleanup;
+					}
+					m->size = AHASH_INIT_SIZE;
+					t->hregion[nr].ext_size +=
+						ext_size(AHASH_INIT_SIZE,
+							 dsize);
+					RCU_INIT_POINTER(hbucket(t, key), m);
+				} else if (m->pos >= m->size) {
+					struct hbucket *ht;
+
+					if (m->size >= AHASH_MAX(h)) {
+						ret = -EAGAIN;
+					} else {
+						ht = kzalloc(sizeof(*ht) +
 						(m->size + AHASH_INIT_SIZE)
 						* dsize,
 						GFP_ATOMIC);
-					if (!ht)
-						ret = -ENOMEM;
+						if (!ht)
+							ret = -ENOMEM;
+					}
+					if (ret < 0)
+						goto cleanup;
+					memcpy(ht, m, sizeof(struct hbucket) +
+					       m->size * dsize);
+					ht->size = m->size + AHASH_INIT_SIZE;
+					t->hregion[nr].ext_size +=
+						ext_size(AHASH_INIT_SIZE,
+							 dsize);
+					kfree(m);
+					m = ht;
+					RCU_INIT_POINTER(hbucket(t, key), ht);
 				}
-				if (ret < 0)
-					goto cleanup;
-				memcpy(ht, m, sizeof(struct hbucket) +
-					      m->size * dsize);
-				ht->size = m->size + AHASH_INIT_SIZE;
-				extsize += ext_size(AHASH_INIT_SIZE, dsize);
-				kfree(m);
-				m = ht;
-				RCU_INIT_POINTER(hbucket(t, key), ht);
-			}
-			d = ahash_data(m, m->pos, dsize);
-			memcpy(d, data, dsize);
-			set_bit(m->pos++, m->used);
+				d = ahash_data(m, m->pos, dsize);
+				memcpy(d, data, dsize);
+				set_bit(m->pos++, m->used);
+				t->hregion[nr].elements++;
 #ifdef IP_SET_HASH_WITH_NETS
-			mtype_data_reset_flags(d, &flags);
+				mtype_data_reset_flags(d, &flags);
 #endif
+			}
 		}
+		rcu_read_unlock_bh();
 	}
-	rcu_assign_pointer(h->table, t);
-	set->ext_size = extsize;
 
-	spin_unlock_bh(&set->lock);
+	/* There can't be any other writer. */
+	rcu_assign_pointer(h->table, t);
 
 	/* Give time to other readers of the set */
 	synchronize_rcu();
 
 	pr_debug("set %s resized from %u (%p) to %u (%p)\n", set->name,
 		 orig->htable_bits, orig, t->htable_bits, t);
-	/* If there's nobody else dumping the table, destroy it */
+	/* Add/delete elements processed by the SET target during resize.
+	 * Kernel-side add cannot trigger a resize and userspace actions
+	 * are serialized by the mutex.
+	 */
+	list_for_each_safe(l, lt, &h->ad) {
+		x = list_entry(l, struct mtype_resize_ad, list);
+		if (x->ad == IPSET_ADD) {
+			mtype_add(set, &x->d, &x->ext, &x->mext, x->flags);
+		} else {
+			mtype_del(set, &x->d, NULL, NULL, 0);
+		}
+		list_del(l);
+		kfree(l);
+	}
+	/* If there's nobody else using the table, destroy it */
 	if (atomic_dec_and_test(&orig->uref)) {
 		pr_debug("Table destroy by resize %p\n", orig);
 		mtype_ahash_destroy(set, orig, false);
@@ -677,15 +810,44 @@ mtype_resize(struct ip_set *set, bool retried)
 	return ret;
 
 cleanup:
+	rcu_read_unlock_bh();
 	atomic_set(&orig->ref, 0);
 	atomic_dec(&orig->uref);
-	spin_unlock_bh(&set->lock);
 	mtype_ahash_destroy(set, t, false);
 	if (ret == -EAGAIN)
 		goto retry;
 	goto out;
 }
 
+/* Get the current number of elements and ext_size in the set  */
+static void
+mtype_ext_size(struct ip_set *set, u32 *elements, size_t *ext_size)
+{
+	struct htype *h = set->data;
+	const struct htable *t;
+	u32 i, j, r;
+	struct hbucket *n;
+	struct mtype_elem *data;
+
+	t = rcu_dereference_bh(h->table);
+	for (r = 0; r < ahash_numof_locks(t->htable_bits); r++) {
+		for (i = ahash_bucket_start(r, t->htable_bits);
+		     i < ahash_bucket_end(r, t->htable_bits); i++) {
+			n = rcu_dereference_bh(hbucket(t, i));
+			if (!n)
+				continue;
+			for (j = 0; j < n->pos; j++) {
+				if (!test_bit(j, n->used))
+					continue;
+				data = ahash_data(n, j, set->dsize);
+				if (!SET_ELEM_EXPIRED(set, data))
+					(*elements)++;
+			}
+		}
+		*ext_size += t->hregion[r].ext_size;
+	}
+}
+
 /* Add an element to a hash and update the internal counters when succeeded,
  * otherwise report the proper error code.
  */
@@ -698,32 +860,49 @@ mtype_add(struct ip_set *set, void *value, const struct ip_set_ext *ext,
 	const struct mtype_elem *d = value;
 	struct mtype_elem *data;
 	struct hbucket *n, *old = ERR_PTR(-ENOENT);
-	int i, j = -1;
+	int i, j = -1, ret;
 	bool flag_exist = flags & IPSET_FLAG_EXIST;
 	bool deleted = false, forceadd = false, reuse = false;
-	u32 key, multi = 0;
+	u32 r, key, multi = 0, elements, maxelem;
 
-	if (set->elements >= h->maxelem) {
-		if (SET_WITH_TIMEOUT(set))
-			/* FIXME: when set is full, we slow down here */
-			mtype_expire(set, h);
-		if (set->elements >= h->maxelem && SET_WITH_FORCEADD(set))
+	rcu_read_lock_bh();
+	t = rcu_dereference_bh(h->table);
+	key = HKEY(value, h->initval, t->htable_bits);
+	r = ahash_region(key, t->htable_bits);
+	atomic_inc(&t->uref);
+	elements = t->hregion[r].elements;
+	maxelem = t->maxelem;
+	if (elements >= maxelem) {
+		u32 e;
+		if (SET_WITH_TIMEOUT(set)) {
+			rcu_read_unlock_bh();
+			mtype_gc_do(set, h, t, r);
+			rcu_read_lock_bh();
+		}
+		maxelem = h->maxelem;
+		elements = 0;
+		for (e = 0; e < ahash_numof_locks(t->htable_bits); e++)
+			elements += t->hregion[e].elements;
+		if (elements >= maxelem && SET_WITH_FORCEADD(set))
 			forceadd = true;
 	}
+	rcu_read_unlock_bh();
 
-	t = ipset_dereference_protected(h->table, set);
-	key = HKEY(value, h->initval, t->htable_bits);
-	n = __ipset_dereference_protected(hbucket(t, key), 1);
+	spin_lock_bh(&t->hregion[r].lock);
+	n = rcu_dereference_bh(hbucket(t, key));
 	if (!n) {
-		if (forceadd || set->elements >= h->maxelem)
+		if (forceadd || elements >= maxelem)
 			goto set_full;
 		old = NULL;
 		n = kzalloc(sizeof(*n) + AHASH_INIT_SIZE * set->dsize,
 			    GFP_ATOMIC);
-		if (!n)
-			return -ENOMEM;
+		if (!n) {
+			ret = -ENOMEM;
+			goto unlock;
+		}
 		n->size = AHASH_INIT_SIZE;
-		set->ext_size += ext_size(AHASH_INIT_SIZE, set->dsize);
+		t->hregion[r].ext_size +=
+			ext_size(AHASH_INIT_SIZE, set->dsize);
 		goto copy_elem;
 	}
 	for (i = 0; i < n->pos; i++) {
@@ -737,19 +916,16 @@ mtype_add(struct ip_set *set, void *value, const struct ip_set_ext *ext,
 		}
 		data = ahash_data(n, i, set->dsize);
 		if (mtype_data_equal(data, d, &multi)) {
-			if (flag_exist ||
-			    (SET_WITH_TIMEOUT(set) &&
-			     ip_set_timeout_expired(ext_timeout(data, set)))) {
+			if (flag_exist || SET_ELEM_EXPIRED(set, data)) {
 				/* Just the extensions could be overwritten */
 				j = i;
 				goto overwrite_extensions;
 			}
-			return -IPSET_ERR_EXIST;
+			ret = -IPSET_ERR_EXIST;
+			goto unlock;
 		}
 		/* Reuse first timed out entry */
-		if (SET_WITH_TIMEOUT(set) &&
-		    ip_set_timeout_expired(ext_timeout(data, set)) &&
-		    j == -1) {
+		if (SET_ELEM_EXPIRED(set, data) && j == -1) {
 			j = i;
 			reuse = true;
 		}
@@ -759,16 +935,16 @@ mtype_add(struct ip_set *set, void *value, const struct ip_set_ext *ext,
 		if (!deleted) {
 #ifdef IP_SET_HASH_WITH_NETS
 			for (i = 0; i < IPSET_NET_COUNT; i++)
-				mtype_del_cidr(h,
+				mtype_del_cidr(set, h,
 					NCIDR_PUT(DCIDR_GET(data->cidr, i)),
 					i);
 #endif
 			ip_set_ext_destroy(set, data);
-			set->elements--;
+			t->hregion[r].elements--;
 		}
 		goto copy_data;
 	}
-	if (set->elements >= h->maxelem)
+	if (elements >= maxelem)
 		goto set_full;
 	/* Create a new slot */
 	if (n->pos >= n->size) {
@@ -776,28 +952,32 @@ mtype_add(struct ip_set *set, void *value, const struct ip_set_ext *ext,
 		if (n->size >= AHASH_MAX(h)) {
 			/* Trigger rehashing */
 			mtype_data_next(&h->next, d);
-			return -EAGAIN;
+			ret = -EAGAIN;
+			goto resize;
 		}
 		old = n;
 		n = kzalloc(sizeof(*n) +
 			    (old->size + AHASH_INIT_SIZE) * set->dsize,
 			    GFP_ATOMIC);
-		if (!n)
-			return -ENOMEM;
+		if (!n) {
+			ret = -ENOMEM;
+			goto unlock;
+		}
 		memcpy(n, old, sizeof(struct hbucket) +
 		       old->size * set->dsize);
 		n->size = old->size + AHASH_INIT_SIZE;
-		set->ext_size += ext_size(AHASH_INIT_SIZE, set->dsize);
+		t->hregion[r].ext_size +=
+			ext_size(AHASH_INIT_SIZE, set->dsize);
 	}
 
 copy_elem:
 	j = n->pos++;
 	data = ahash_data(n, j, set->dsize);
 copy_data:
-	set->elements++;
+	t->hregion[r].elements++;
 #ifdef IP_SET_HASH_WITH_NETS
 	for (i = 0; i < IPSET_NET_COUNT; i++)
-		mtype_add_cidr(h, NCIDR_PUT(DCIDR_GET(d->cidr, i)), i);
+		mtype_add_cidr(set, h, NCIDR_PUT(DCIDR_GET(d->cidr, i)), i);
 #endif
 	memcpy(data, d, sizeof(struct mtype_elem));
 overwrite_extensions:
@@ -820,13 +1000,41 @@ mtype_add(struct ip_set *set, void *value, const struct ip_set_ext *ext,
 		if (old)
 			kfree_rcu(old, rcu);
 	}
+	ret = 0;
+resize:
+	spin_unlock_bh(&t->hregion[r].lock);
+	if (atomic_read(&t->ref) && ext->target) {
+		/* Resize is in process and kernel side add, save values */
+		struct mtype_resize_ad *x;
+
+		x = kzalloc(sizeof(struct mtype_resize_ad), GFP_ATOMIC);
+		if (!x)
+			/* Don't bother */
+			goto out;
+		x->ad = IPSET_ADD;
+		memcpy(&x->d, value, sizeof(struct mtype_elem));
+		memcpy(&x->ext, ext, sizeof(struct ip_set_ext));
+		memcpy(&x->mext, mext, sizeof(struct ip_set_ext));
+		x->flags = flags;
+		spin_lock_bh(&set->lock);
+		list_add_tail(&x->list, &h->ad);
+		spin_unlock_bh(&set->lock);
+	}
+	goto out;
 
-	return 0;
 set_full:
 	if (net_ratelimit())
 		pr_warn("Set %s is full, maxelem %u reached\n",
-			set->name, h->maxelem);
-	return -IPSET_ERR_HASH_FULL;
+			set->name, maxelem);
+	ret = -IPSET_ERR_HASH_FULL;
+unlock:
+	spin_unlock_bh(&t->hregion[r].lock);
+out:
+	if (atomic_dec_and_test(&t->uref) && atomic_read(&t->ref)) {
+		pr_debug("Table destroy after resize by add: %p\n", t);
+		mtype_ahash_destroy(set, t, false);
+	}
+	return ret;
 }
 
 /* Delete an element from the hash and free up space if possible.
@@ -840,13 +1048,23 @@ mtype_del(struct ip_set *set, void *value, const struct ip_set_ext *ext,
 	const struct mtype_elem *d = value;
 	struct mtype_elem *data;
 	struct hbucket *n;
-	int i, j, k, ret = -IPSET_ERR_EXIST;
+	struct mtype_resize_ad *x = NULL;
+	int i, j, k, r, ret = -IPSET_ERR_EXIST;
 	u32 key, multi = 0;
 	size_t dsize = set->dsize;
 
-	t = ipset_dereference_protected(h->table, set);
+	/* Userspace add and resize is excluded by the mutex.
+	 * Kernelspace add does not trigger resize.
+	 */
+	rcu_read_lock_bh();
+	t = rcu_dereference_bh(h->table);
 	key = HKEY(value, h->initval, t->htable_bits);
-	n = __ipset_dereference_protected(hbucket(t, key), 1);
+	r = ahash_region(key, t->htable_bits);
+	atomic_inc(&t->uref);
+	rcu_read_unlock_bh();
+
+	spin_lock_bh(&t->hregion[r].lock);
+	n = rcu_dereference_bh(hbucket(t, key));
 	if (!n)
 		goto out;
 	for (i = 0, k = 0; i < n->pos; i++) {
@@ -857,8 +1075,7 @@ mtype_del(struct ip_set *set, void *value, const struct ip_set_ext *ext,
 		data = ahash_data(n, i, dsize);
 		if (!mtype_data_equal(data, d, &multi))
 			continue;
-		if (SET_WITH_TIMEOUT(set) &&
-		    ip_set_timeout_expired(ext_timeout(data, set)))
+		if (SET_ELEM_EXPIRED(set, data))
 			goto out;
 
 		ret = 0;
@@ -866,20 +1083,33 @@ mtype_del(struct ip_set *set, void *value, const struct ip_set_ext *ext,
 		smp_mb__after_atomic();
 		if (i + 1 == n->pos)
 			n->pos--;
-		set->elements--;
+		t->hregion[r].elements--;
 #ifdef IP_SET_HASH_WITH_NETS
 		for (j = 0; j < IPSET_NET_COUNT; j++)
-			mtype_del_cidr(h, NCIDR_PUT(DCIDR_GET(d->cidr, j)),
-				       j);
+			mtype_del_cidr(set, h,
+				       NCIDR_PUT(DCIDR_GET(d->cidr, j)), j);
 #endif
 		ip_set_ext_destroy(set, data);
 
+		if (atomic_read(&t->ref) && ext->target) {
+			/* Resize is in process and kernel side del,
+			 * save values
+			 */
+			x = kzalloc(sizeof(struct mtype_resize_ad),
+				    GFP_ATOMIC);
+			if (x) {
+				x->ad = IPSET_DEL;
+				memcpy(&x->d, value,
+				       sizeof(struct mtype_elem));
+				x->flags = flags;
+			}
+		}
 		for (; i < n->pos; i++) {
 			if (!test_bit(i, n->used))
 				k++;
 		}
 		if (n->pos == 0 && k == 0) {
-			set->ext_size -= ext_size(n->size, dsize);
+			t->hregion[r].ext_size -= ext_size(n->size, dsize);
 			rcu_assign_pointer(hbucket(t, key), NULL);
 			kfree_rcu(n, rcu);
 		} else if (k >= AHASH_INIT_SIZE) {
@@ -898,7 +1128,8 @@ mtype_del(struct ip_set *set, void *value, const struct ip_set_ext *ext,
 				k++;
 			}
 			tmp->pos = k;
-			set->ext_size -= ext_size(AHASH_INIT_SIZE, dsize);
+			t->hregion[r].ext_size -=
+				ext_size(AHASH_INIT_SIZE, dsize);
 			rcu_assign_pointer(hbucket(t, key), tmp);
 			kfree_rcu(n, rcu);
 		}
@@ -906,6 +1137,16 @@ mtype_del(struct ip_set *set, void *value, const struct ip_set_ext *ext,
 	}
 
 out:
+	spin_unlock_bh(&t->hregion[r].lock);
+	if (x) {
+		spin_lock_bh(&set->lock);
+		list_add(&x->list, &h->ad);
+		spin_unlock_bh(&set->lock);
+	}
+	if (atomic_dec_and_test(&t->uref) && atomic_read(&t->ref)) {
+		pr_debug("Table destroy after resize by del: %p\n", t);
+		mtype_ahash_destroy(set, t, false);
+	}
 	return ret;
 }
 
@@ -991,6 +1232,7 @@ mtype_test(struct ip_set *set, void *value, const struct ip_set_ext *ext,
 	int i, ret = 0;
 	u32 key, multi = 0;
 
+	rcu_read_lock_bh();
 	t = rcu_dereference_bh(h->table);
 #ifdef IP_SET_HASH_WITH_NETS
 	/* If we test an IP address and not a network address,
@@ -1022,6 +1264,7 @@ mtype_test(struct ip_set *set, void *value, const struct ip_set_ext *ext,
 			goto out;
 	}
 out:
+	rcu_read_unlock_bh();
 	return ret;
 }
 
@@ -1033,23 +1276,14 @@ mtype_head(struct ip_set *set, struct sk_buff *skb)
 	const struct htable *t;
 	struct nlattr *nested;
 	size_t memsize;
+	u32 elements = 0;
+	size_t ext_size = 0;
 	u8 htable_bits;
 
-	/* If any members have expired, set->elements will be wrong
-	 * mytype_expire function will update it with the right count.
-	 * we do not hold set->lock here, so grab it first.
-	 * set->elements can still be incorrect in the case of a huge set,
-	 * because elements might time out during the listing.
-	 */
-	if (SET_WITH_TIMEOUT(set)) {
-		spin_lock_bh(&set->lock);
-		mtype_expire(set, h);
-		spin_unlock_bh(&set->lock);
-	}
-
 	rcu_read_lock_bh();
-	t = rcu_dereference_bh_nfnl(h->table);
-	memsize = mtype_ahash_memsize(h, t) + set->ext_size;
+	t = rcu_dereference_bh(h->table);
+	mtype_ext_size(set, &elements, &ext_size);
+	memsize = mtype_ahash_memsize(h, t) + ext_size + set->ext_size;
 	htable_bits = t->htable_bits;
 	rcu_read_unlock_bh();
 
@@ -1071,7 +1305,7 @@ mtype_head(struct ip_set *set, struct sk_buff *skb)
 #endif
 	if (nla_put_net32(skb, IPSET_ATTR_REFERENCES, htonl(set->ref)) ||
 	    nla_put_net32(skb, IPSET_ATTR_MEMSIZE, htonl(memsize)) ||
-	    nla_put_net32(skb, IPSET_ATTR_ELEMENTS, htonl(set->elements)))
+	    nla_put_net32(skb, IPSET_ATTR_ELEMENTS, htonl(elements)))
 		goto nla_put_failure;
 	if (unlikely(ip_set_put_flags(skb, set)))
 		goto nla_put_failure;
@@ -1091,15 +1325,15 @@ mtype_uref(struct ip_set *set, struct netlink_callback *cb, bool start)
 
 	if (start) {
 		rcu_read_lock_bh();
-		t = rcu_dereference_bh_nfnl(h->table);
+		t = ipset_dereference_bh_nfnl(h->table);
 		atomic_inc(&t->uref);
 		cb->args[IPSET_CB_PRIVATE] = (unsigned long)t;
 		rcu_read_unlock_bh();
 	} else if (cb->args[IPSET_CB_PRIVATE]) {
 		t = (struct htable *)cb->args[IPSET_CB_PRIVATE];
 		if (atomic_dec_and_test(&t->uref) && atomic_read(&t->ref)) {
-			/* Resizing didn't destroy the hash table */
-			pr_debug("Table destroy by dump: %p\n", t);
+			pr_debug("Table destroy after resize "
+				 " by dump: %p\n", t);
 			mtype_ahash_destroy(set, t, false);
 		}
 		cb->args[IPSET_CB_PRIVATE] = 0;
@@ -1141,8 +1375,7 @@ mtype_list(const struct ip_set *set,
 			if (!test_bit(i, n->used))
 				continue;
 			e = ahash_data(n, i, set->dsize);
-			if (SET_WITH_TIMEOUT(set) &&
-			    ip_set_timeout_expired(ext_timeout(e, set)))
+			if (SET_ELEM_EXPIRED(set, e))
 				continue;
 			pr_debug("list hash %lu hbucket %p i %u, data %p\n",
 				 cb->args[IPSET_CB_ARG0], n, i, e);
@@ -1208,6 +1441,7 @@ static const struct ip_set_type_variant mtype_variant = {
 	.uref	= mtype_uref,
 	.resize	= mtype_resize,
 	.same_set = mtype_same_set,
+	.region_lock = true,
 };
 
 #ifdef IP_SET_EMIT_CREATE
@@ -1226,6 +1460,7 @@ IPSET_TOKEN(HTYPE, _create)(struct net *net, struct ip_set *set,
 	size_t hsize;
 	struct htype *h;
 	struct htable *t;
+	u32 i;
 
 	pr_debug("Create set %s with family %s\n",
 		 set->name, set->family == NFPROTO_IPV4 ? "inet" : "inet6");
@@ -1294,6 +1529,15 @@ IPSET_TOKEN(HTYPE, _create)(struct net *net, struct ip_set *set,
 		kfree(h);
 		return -ENOMEM;
 	}
+	t->hregion = ip_set_alloc(ahash_sizeof_regions(hbits));
+	if (!t->hregion) {
+		kfree(t);
+		kfree(h);
+		return -ENOMEM;
+	}
+	h->gc.set = set;
+	for (i = 0; i < ahash_numof_locks(hbits); i++)
+		spin_lock_init(&t->hregion[i].lock);
 	h->maxelem = maxelem;
 #ifdef IP_SET_HASH_WITH_NETMASK
 	h->netmask = netmask;
@@ -1304,9 +1548,10 @@ IPSET_TOKEN(HTYPE, _create)(struct net *net, struct ip_set *set,
 	get_random_bytes(&h->initval, sizeof(h->initval));
 
 	t->htable_bits = hbits;
+	t->maxelem = h->maxelem / ahash_numof_locks(hbits);
 	RCU_INIT_POINTER(h->table, t);
 
-	h->set = set;
+	INIT_LIST_HEAD(&h->ad);
 	set->data = h;
 #ifndef IP_SET_PROTO_UNDEF
 	if (set->family == NFPROTO_IPV4) {
@@ -1329,12 +1574,10 @@ IPSET_TOKEN(HTYPE, _create)(struct net *net, struct ip_set *set,
 #ifndef IP_SET_PROTO_UNDEF
 		if (set->family == NFPROTO_IPV4)
 #endif
-			IPSET_TOKEN(HTYPE, 4_gc_init)(set,
-				IPSET_TOKEN(HTYPE, 4_gc));
+			IPSET_TOKEN(HTYPE, 4_gc_init)(&h->gc);
 #ifndef IP_SET_PROTO_UNDEF
 		else
-			IPSET_TOKEN(HTYPE, 6_gc_init)(set,
-				IPSET_TOKEN(HTYPE, 6_gc));
+			IPSET_TOKEN(HTYPE, 6_gc_init)(&h->gc);
 #endif
 	}
 	pr_debug("create %s hashsize %u (%u) maxelem %u: %p(%p)\n",
-- 
2.11.0


* [PATCH 2/6] netfilter: ipset: Fix forceadd evaluation path
  2020-02-26 22:54 [PATCH 0/6] Netfilter fixes for net Pablo Neira Ayuso
  2020-02-26 22:54 ` [PATCH 1/6] netfilter: ipset: Fix "INFO: rcu detected stall in hash_xxx" reports Pablo Neira Ayuso
@ 2020-02-26 22:54 ` Pablo Neira Ayuso
  2020-02-26 22:54 ` [PATCH 3/6] selftests: nft_concat_range: Move option for 'list ruleset' before command Pablo Neira Ayuso
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 44+ messages in thread
From: Pablo Neira Ayuso @ 2020-02-26 22:54 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

From: Jozsef Kadlecsik <kadlec@netfilter.org>

When the forceadd option is enabled, the hash:* types should find and replace
the first entry in the bucket with the new one if there are no reusable
(deleted or timed out) entries. However, the position index was not reset
to zero and remained at the invalid value -1 if there were no reusable entries.
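
A minimal userspace sketch of the slot selection described above. It is
not the kernel code: pick_slot() and its parameters are hypothetical
names used only for illustration. It shows why the index has to fall
back to 0 when forceadd is set and no reusable entry was found.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: choose the bucket slot to overwrite.
 * reusable[i] is true when slot i is deleted or timed out.
 * Returns -1 when nothing may be overwritten.
 */
static int pick_slot(const bool *reusable, size_t n, bool forceadd)
{
	int j = -1;
	size_t i;

	assert(n > 0);
	for (i = 0; i < n; i++) {
		if (reusable[i]) {
			j = (int)i;	/* reuse the first free entry */
			break;
		}
	}
	/* The fix: with forceadd, evict the first entry instead of
	 * carrying the invalid index -1 forward.
	 */
	if (j == -1 && forceadd)
		j = 0;
	return j;
}
```

Without the fallback, a full bucket with forceadd set would index the
bucket data with -1, which matches the invalid access described in the
text.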

Reported-by: syzbot+6a86565c74ebe30aea18@syzkaller.appspotmail.com
Fixes: 23c42a403a9c ("netfilter: ipset: Introduction of new commands and protocol version 7")
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
---
 net/netfilter/ipset/ip_set_hash_gen.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/netfilter/ipset/ip_set_hash_gen.h b/net/netfilter/ipset/ip_set_hash_gen.h
index 71e93eac0831..e52d7b7597a0 100644
--- a/net/netfilter/ipset/ip_set_hash_gen.h
+++ b/net/netfilter/ipset/ip_set_hash_gen.h
@@ -931,6 +931,8 @@ mtype_add(struct ip_set *set, void *value, const struct ip_set_ext *ext,
 		}
 	}
 	if (reuse || forceadd) {
+		if (j == -1)
+			j = 0;
 		data = ahash_data(n, j, set->dsize);
 		if (!deleted) {
 #ifdef IP_SET_HASH_WITH_NETS
-- 
2.11.0


* [PATCH 3/6] selftests: nft_concat_range: Move option for 'list ruleset' before command
  2020-02-26 22:54 [PATCH 0/6] Netfilter fixes for net Pablo Neira Ayuso
  2020-02-26 22:54 ` [PATCH 1/6] netfilter: ipset: Fix "INFO: rcu detected stall in hash_xxx" reports Pablo Neira Ayuso
  2020-02-26 22:54 ` [PATCH 2/6] netfilter: ipset: Fix forceadd evaluation path Pablo Neira Ayuso
@ 2020-02-26 22:54 ` Pablo Neira Ayuso
  2020-02-26 22:54 ` [PATCH 4/6] nft_set_pipapo: Actually fetch key data in nft_pipapo_remove() Pablo Neira Ayuso
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 44+ messages in thread
From: Pablo Neira Ayuso @ 2020-02-26 22:54 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

From: Stefano Brivio <sbrivio@redhat.com>

Before nftables commit fb9cea50e8b3 ("main: enforce options before
commands"), 'nft list ruleset -a' happened to work, but it's wrong
and won't work anymore. Replace it with 'nft -a list ruleset'.

Reported-by: Chen Yi <yiche@redhat.com>
Fixes: 611973c1e06f ("selftests: netfilter: Introduce tests for sets with range concatenation")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 tools/testing/selftests/netfilter/nft_concat_range.sh | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/netfilter/nft_concat_range.sh b/tools/testing/selftests/netfilter/nft_concat_range.sh
index aca21dde102a..5c1033ee1b39 100755
--- a/tools/testing/selftests/netfilter/nft_concat_range.sh
+++ b/tools/testing/selftests/netfilter/nft_concat_range.sh
@@ -1025,7 +1025,7 @@ format_noconcat() {
 add() {
 	if ! nft add element inet filter test "${1}"; then
 		err "Failed to add ${1} given ruleset:"
-		err "$(nft list ruleset -a)"
+		err "$(nft -a list ruleset)"
 		return 1
 	fi
 }
@@ -1045,7 +1045,7 @@ add_perf() {
 add_perf_norange() {
 	if ! nft add element netdev perf norange "${1}"; then
 		err "Failed to add ${1} given ruleset:"
-		err "$(nft list ruleset -a)"
+		err "$(nft -a list ruleset)"
 		return 1
 	fi
 }
@@ -1054,7 +1054,7 @@ add_perf_norange() {
 add_perf_noconcat() {
 	if ! nft add element netdev perf noconcat "${1}"; then
 		err "Failed to add ${1} given ruleset:"
-		err "$(nft list ruleset -a)"
+		err "$(nft -a list ruleset)"
 		return 1
 	fi
 }
@@ -1063,7 +1063,7 @@ add_perf_noconcat() {
 del() {
 	if ! nft delete element inet filter test "${1}"; then
 		err "Failed to delete ${1} given ruleset:"
-		err "$(nft list ruleset -a)"
+		err "$(nft -a list ruleset)"
 		return 1
 	fi
 }
@@ -1134,7 +1134,7 @@ send_match() {
 		err "  $(for f in ${src}; do
 			 eval format_\$f "${2}"; printf ' '; done)"
 		err "should have matched ruleset:"
-		err "$(nft list ruleset -a)"
+		err "$(nft -a list ruleset)"
 		return 1
 	fi
 	nft reset counter inet filter test >/dev/null
@@ -1160,7 +1160,7 @@ send_nomatch() {
 		err "  $(for f in ${src}; do
 			 eval format_\$f "${2}"; printf ' '; done)"
 		err "should not have matched ruleset:"
-		err "$(nft list ruleset -a)"
+		err "$(nft -a list ruleset)"
 		return 1
 	fi
 }
-- 
2.11.0


* [PATCH 4/6] nft_set_pipapo: Actually fetch key data in nft_pipapo_remove()
  2020-02-26 22:54 [PATCH 0/6] Netfilter fixes for net Pablo Neira Ayuso
                   ` (2 preceding siblings ...)
  2020-02-26 22:54 ` [PATCH 3/6] selftests: nft_concat_range: Move option for 'list ruleset' before command Pablo Neira Ayuso
@ 2020-02-26 22:54 ` Pablo Neira Ayuso
  2020-02-26 22:54 ` [PATCH 5/6] selftests: nft_concat_range: Add test for reported add/flush/add issue Pablo Neira Ayuso
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 44+ messages in thread
From: Pablo Neira Ayuso @ 2020-02-26 22:54 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

From: Stefano Brivio <sbrivio@redhat.com>

Phil reports that adding elements, flushing and re-adding them
right away:

  nft add table t '{ set s { type ipv4_addr . inet_service; flags interval; }; }'
  nft add element t s '{ 10.0.0.1 . 22-25, 10.0.0.1 . 10-20 }'
  nft flush set t s
  nft add element t s '{ 10.0.0.1 . 10-20, 10.0.0.1 . 22-25 }'

triggers, almost reliably, a crash like this one:

  [   71.319848] general protection fault, probably for non-canonical address 0x6f6b6e696c2e756e: 0000 [#1] PREEMPT SMP PTI
  [   71.321540] CPU: 3 PID: 1201 Comm: kworker/3:2 Not tainted 5.6.0-rc1-00377-g2bb07f4e1d861 #192
  [   71.322746] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190711_202441-buildvm-armv7-10.arm.fedoraproject.org-2.fc31 04/01/2014
  [   71.324430] Workqueue: events nf_tables_trans_destroy_work [nf_tables]
  [   71.325387] RIP: 0010:nft_set_elem_destroy+0xa5/0x110 [nf_tables]
  [   71.326164] Code: 89 d4 84 c0 74 0e 8b 77 44 0f b6 f8 48 01 df e8 41 ff ff ff 45 84 e4 74 36 44 0f b6 63 08 45 84 e4 74 2c 49 01 dc 49 8b 04 24 <48> 8b 40 38 48 85 c0 74 4f 48 89 e7 4c 8b
  [   71.328423] RSP: 0018:ffffc9000226fd90 EFLAGS: 00010282
  [   71.329225] RAX: 6f6b6e696c2e756e RBX: ffff88813ab79f60 RCX: ffff88813931b5a0
  [   71.330365] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff88813ab79f9a
  [   71.331473] RBP: ffff88813ab79f60 R08: 0000000000000008 R09: 0000000000000000
  [   71.332627] R10: 000000000000021c R11: 0000000000000000 R12: ffff88813ab79fc2
  [   71.333615] R13: ffff88813b3adf50 R14: dead000000000100 R15: ffff88813931b8a0
  [   71.334596] FS:  0000000000000000(0000) GS:ffff88813bd80000(0000) knlGS:0000000000000000
  [   71.335780] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [   71.336577] CR2: 000055ac683710f0 CR3: 000000013a222003 CR4: 0000000000360ee0
  [   71.337533] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  [   71.338557] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  [   71.339718] Call Trace:
  [   71.340093]  nft_pipapo_destroy+0x7a/0x170 [nf_tables_set]
  [   71.340973]  nft_set_destroy+0x20/0x50 [nf_tables]
  [   71.341879]  nf_tables_trans_destroy_work+0x246/0x260 [nf_tables]
  [   71.342916]  process_one_work+0x1d5/0x3c0
  [   71.343601]  worker_thread+0x4a/0x3c0
  [   71.344229]  kthread+0xfb/0x130
  [   71.344780]  ? process_one_work+0x3c0/0x3c0
  [   71.345477]  ? kthread_park+0x90/0x90
  [   71.346129]  ret_from_fork+0x35/0x40
  [   71.346748] Modules linked in: nf_tables_set nf_tables nfnetlink 8021q [last unloaded: nfnetlink]
  [   71.348153] ---[ end trace 2eaa8149ca759bcc ]---
  [   71.349066] RIP: 0010:nft_set_elem_destroy+0xa5/0x110 [nf_tables]
  [   71.350016] Code: 89 d4 84 c0 74 0e 8b 77 44 0f b6 f8 48 01 df e8 41 ff ff ff 45 84 e4 74 36 44 0f b6 63 08 45 84 e4 74 2c 49 01 dc 49 8b 04 24 <48> 8b 40 38 48 85 c0 74 4f 48 89 e7 4c 8b
  [   71.350017] RSP: 0018:ffffc9000226fd90 EFLAGS: 00010282
  [   71.350019] RAX: 6f6b6e696c2e756e RBX: ffff88813ab79f60 RCX: ffff88813931b5a0
  [   71.350019] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff88813ab79f9a
  [   71.350020] RBP: ffff88813ab79f60 R08: 0000000000000008 R09: 0000000000000000
  [   71.350021] R10: 000000000000021c R11: 0000000000000000 R12: ffff88813ab79fc2
  [   71.350022] R13: ffff88813b3adf50 R14: dead000000000100 R15: ffff88813931b8a0
  [   71.350025] FS:  0000000000000000(0000) GS:ffff88813bd80000(0000) knlGS:0000000000000000
  [   71.350026] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [   71.350027] CR2: 000055ac683710f0 CR3: 000000013a222003 CR4: 0000000000360ee0
  [   71.350028] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  [   71.350028] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  [   71.350030] Kernel panic - not syncing: Fatal exception
  [   71.350412] Kernel Offset: disabled
  [   71.365922] ---[ end Kernel panic - not syncing: Fatal exception ]---

which is caused by dangling elements that have been deactivated, but
never removed.

On a flush operation, nft_pipapo_walk() walks through all the elements
in the mapping table, which are then deactivated by nft_flush_set(),
one by one, and added to the commit list for removal. Element data is
then freed.

On transaction commit, nft_pipapo_remove() is called but fails to
remove these elements, leaving stale references in the mapping.
The first symptom of this, revealed by KASAN, is a one-byte
use-after-free in subsequent calls to nft_pipapo_walk(), which is
usually not enough to trigger a panic. When stale elements are used
more heavily, though, such as in a double free via nft_pipapo_destroy()
as in Phil's case, the problem becomes more noticeable.

The issue comes from the fact that, on a flush operation,
nft_pipapo_remove() won't get the actual key data via elem->key, so
elements to be deleted upon commit won't be found by the lookup via
pipapo_get(), and removal will be skipped. Key data should be fetched
via nft_set_ext_key(), instead.
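
The failure mode can be sketched in plain userspace C. This is an
illustration under stated assumptions, not the nf_tables API: struct
elem, struct request, lookup() and remove_elem() are hypothetical
stand-ins. The request carries a key field that is not filled in on
flush, while the element itself always holds its own key.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define KEY_LEN 4

struct elem {			/* stand-in for a stored set element */
	unsigned char key[KEY_LEN];
	int present;
};

struct request {		/* stand-in for the removal request */
	unsigned char key[KEY_LEN];	/* not filled in on flush */
	struct elem *priv;		/* always points to the element */
};

/* Linear lookup by key; returns NULL when no present element matches. */
static struct elem *lookup(struct elem *tbl, size_t n,
			   const unsigned char *key)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (tbl[i].present && !memcmp(tbl[i].key, key, KEY_LEN))
			return &tbl[i];
	return NULL;
}

/* Remove using the key stored in the element itself.
 * Returns 1 if an element was removed, 0 otherwise.
 */
static int remove_elem(struct elem *tbl, size_t n, const struct request *req)
{
	struct elem *e;

	assert(req->priv);
	e = lookup(tbl, n, req->priv->key);
	if (!e)
		return 0;
	e->present = 0;
	return 1;
}
```

Looking up req->key instead of req->priv->key would find nothing in
this model, so the element would stay in the table after the request
was considered done.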

Reported-by: Phil Sutter <phil@nwl.cc>
Fixes: 3c4287f62044 ("nf_tables: Add set type for arbitrary concatenation of ranges")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nft_set_pipapo.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/netfilter/nft_set_pipapo.c b/net/netfilter/nft_set_pipapo.c
index feac8553f6d9..4fc0c924ed5d 100644
--- a/net/netfilter/nft_set_pipapo.c
+++ b/net/netfilter/nft_set_pipapo.c
@@ -1766,11 +1766,13 @@ static bool pipapo_match_field(struct nft_pipapo_field *f,
 static void nft_pipapo_remove(const struct net *net, const struct nft_set *set,
 			      const struct nft_set_elem *elem)
 {
-	const u8 *data = (const u8 *)elem->key.val.data;
 	struct nft_pipapo *priv = nft_set_priv(set);
 	struct nft_pipapo_match *m = priv->clone;
+	struct nft_pipapo_elem *e = elem->priv;
 	int rules_f0, first_rule = 0;
-	struct nft_pipapo_elem *e;
+	const u8 *data;
+
+	data = (const u8 *)nft_set_ext_key(&e->ext);
 
 	e = pipapo_get(net, set, data, 0);
 	if (IS_ERR(e))
-- 
2.11.0


* [PATCH 5/6] selftests: nft_concat_range: Add test for reported add/flush/add issue
  2020-02-26 22:54 [PATCH 0/6] Netfilter fixes for net Pablo Neira Ayuso
                   ` (3 preceding siblings ...)
  2020-02-26 22:54 ` [PATCH 4/6] nft_set_pipapo: Actually fetch key data in nft_pipapo_remove() Pablo Neira Ayuso
@ 2020-02-26 22:54 ` Pablo Neira Ayuso
  2020-02-26 22:54 ` [PATCH 6/6] netfilter: xt_hashlimit: unregister proc file before releasing mutex Pablo Neira Ayuso
  2020-02-27  0:32 ` [PATCH 0/6] Netfilter fixes for net David Miller
  6 siblings, 0 replies; 44+ messages in thread
From: Pablo Neira Ayuso @ 2020-02-26 22:54 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

From: Stefano Brivio <sbrivio@redhat.com>

Add a specific test for the crash reported by Phil Sutter and addressed
in the previous patch. The test cases that, in my intention, should
have covered these cases, that is, the ones from the 'concurrency'
section, don't run these sequences tightly enough and spectacularly
failed to catch this.

While at it, define a convenient way to add this kind of test, by
adding a "reported issues" test section.

It's more convenient, for this particular test, to execute the set
setup in its own function. However, future test cases like this one
might need to call setup functions, and will typically need no tools
other than nft, so allow for this in check_tools().

The original form of the reproducer used here was provided by Phil.

Reported-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 .../selftests/netfilter/nft_concat_range.sh        | 43 ++++++++++++++++++++--
 1 file changed, 39 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/netfilter/nft_concat_range.sh b/tools/testing/selftests/netfilter/nft_concat_range.sh
index 5c1033ee1b39..5a4938d6dcf2 100755
--- a/tools/testing/selftests/netfilter/nft_concat_range.sh
+++ b/tools/testing/selftests/netfilter/nft_concat_range.sh
@@ -13,11 +13,12 @@
 KSELFTEST_SKIP=4
 
 # Available test groups:
+# - reported_issues: check for issues that were reported in the past
 # - correctness: check that packets match given entries, and only those
 # - concurrency: attempt races between insertion, deletion and lookup
 # - timeout: check that packets match entries until they expire
 # - performance: estimate matching rate, compare with rbtree and hash baselines
-TESTS="correctness concurrency timeout"
+TESTS="reported_issues correctness concurrency timeout"
 [ "${quicktest}" != "1" ] && TESTS="${TESTS} performance"
 
 # Set types, defined by TYPE_ variables below
@@ -25,6 +26,9 @@ TYPES="net_port port_net net6_port port_proto net6_port_mac net6_port_mac_proto
        net_port_net net_mac net_mac_icmp net6_mac_icmp net6_port_net6_port
        net_port_mac_proto_net"
 
+# Reported bugs, also described by TYPE_ variables below
+BUGS="flush_remove_add"
+
 # List of possible paths to pktgen script from kernel tree for performance tests
 PKTGEN_SCRIPT_PATHS="
 	../../../samples/pktgen/pktgen_bench_xmit_mode_netif_receive.sh
@@ -327,6 +331,12 @@ flood_spec	ip daddr . tcp dport . meta l4proto . ip saddr
 perf_duration	0
 "
 
+# Definition of tests for bugs reported in the past:
+# display	display text for test report
+TYPE_flush_remove_add="
+display		Add two elements, flush, re-add
+"
+
 # Set template for all tests, types and rules are filled in depending on test
 set_template='
 flush ruleset
@@ -440,6 +450,8 @@ setup_set() {
 
 # Check that at least one of the needed tools is available
 check_tools() {
+	[ -z "${tools}" ] && return 0
+
 	__tools=
 	for tool in ${tools}; do
 		if [ "${tool}" = "nc" ] && [ "${proto}" = "udp6" ] && \
@@ -1430,6 +1442,23 @@ test_performance() {
 	kill "${perf_pid}"
 }
 
+test_bug_flush_remove_add() {
+	set_cmd='{ set s { type ipv4_addr . inet_service; flags interval; }; }'
+	elem1='{ 10.0.0.1 . 22-25, 10.0.0.1 . 10-20 }'
+	elem2='{ 10.0.0.1 . 10-20, 10.0.0.1 . 22-25 }'
+	for i in `seq 1 100`; do
+		nft add table t ${set_cmd}	|| return ${KSELFTEST_SKIP}
+		nft add element t s ${elem1}	2>/dev/null || return 1
+		nft flush set t s		2>/dev/null || return 1
+		nft add element t s ${elem2}	2>/dev/null || return 1
+	done
+	nft flush ruleset
+}
+
+test_reported_issues() {
+	eval test_bug_"${subtest}"
+}
+
 # Run everything in a separate network namespace
 [ "${1}" != "run" ] && { unshare -n "${0}" run; exit $?; }
 tmp="$(mktemp)"
@@ -1438,9 +1467,15 @@ trap cleanup EXIT
 # Entry point for test runs
 passed=0
 for name in ${TESTS}; do
-	printf "TEST: %s\n" "${name}"
-	for type in ${TYPES}; do
-		eval desc=\$TYPE_"${type}"
+	printf "TEST: %s\n" "$(echo ${name} | tr '_' ' ')"
+	if [ "${name}" = "reported_issues" ]; then
+		SUBTESTS="${BUGS}"
+	else
+		SUBTESTS="${TYPES}"
+	fi
+
+	for subtest in ${SUBTESTS}; do
+		eval desc=\$TYPE_"${subtest}"
 		IFS='
 '
 		for __line in ${desc}; do
-- 
2.11.0


* [PATCH 6/6] netfilter: xt_hashlimit: unregister proc file before releasing mutex
  2020-02-26 22:54 [PATCH 0/6] Netfilter fixes for net Pablo Neira Ayuso
                   ` (4 preceding siblings ...)
  2020-02-26 22:54 ` [PATCH 5/6] selftests: nft_concat_range: Add test for reported add/flush/add issue Pablo Neira Ayuso
@ 2020-02-26 22:54 ` Pablo Neira Ayuso
  2020-02-27  0:32 ` [PATCH 0/6] Netfilter fixes for net David Miller
  6 siblings, 0 replies; 44+ messages in thread
From: Pablo Neira Ayuso @ 2020-02-26 22:54 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

From: Cong Wang <xiyou.wangcong@gmail.com>

Before releasing the global mutex, we only unlink the hashtable
from the hash list; its proc file is still registered at this
point. So syzbot could trigger a race condition where a parallel
htable_create() could register the same file immediately after
the mutex is released.

Move htable_remove_proc_entry() back under mutex protection to
fix this. Also fold htable_destroy() into htable_put() to make
the code slightly easier to understand.
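
The ordering can be modelled with a small single-threaded sketch. It
is not the xt_hashlimit code: the flags and the functions
table_create(), table_put() and state_consistent() are hypothetical
names, and the mutex is reduced to a boolean. The point is only that
unlinking and unregistering happen in the same critical section.

```c
#include <assert.h>
#include <stdbool.h>

static bool lock_held;		/* models the global mutex */
static bool in_list;		/* table is linked in the hash list */
static bool proc_registered;	/* proc file with the table's name exists */

static void lock(void)   { assert(!lock_held); lock_held = true; }
static void unlock(void) { assert(lock_held); lock_held = false; }

/* Create: under the lock, fail if the name is still registered. */
static int table_create(void)
{
	int ret = 0;

	lock();
	if (proc_registered)
		ret = -1;	/* duplicate proc file name */
	else
		in_list = proc_registered = true;
	unlock();
	return ret;
}

/* Put: unlink and unregister in the same critical section, so no
 * creator can observe "unlinked but still registered".
 */
static void table_put(void)
{
	lock();
	in_list = false;
	proc_registered = false;
	unlock();
	/* slow teardown (work cancel, freeing) would run here, unlocked */
}

/* Invariant a creator relies on whenever the lock is free. */
static bool state_consistent(void)
{
	return !lock_held && in_list == proc_registered;
}
```

If table_put() cleared proc_registered only after unlock(), a creator
entering in between would see in_list false but proc_registered true,
which is the window described above.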

Reported-and-tested-by: syzbot+d195fd3b9a364ddd6731@syzkaller.appspotmail.com
Fixes: c4a3922d2d20 ("netfilter: xt_hashlimit: reduce hashlimit_mutex scope for htable_put()")
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/xt_hashlimit.c | 16 ++++++----------
 1 file changed, 6 insertions(+), 10 deletions(-)

diff --git a/net/netfilter/xt_hashlimit.c b/net/netfilter/xt_hashlimit.c
index 7a2c4b8408c4..8c835ad63729 100644
--- a/net/netfilter/xt_hashlimit.c
+++ b/net/netfilter/xt_hashlimit.c
@@ -402,15 +402,6 @@ static void htable_remove_proc_entry(struct xt_hashlimit_htable *hinfo)
 		remove_proc_entry(hinfo->name, parent);
 }
 
-static void htable_destroy(struct xt_hashlimit_htable *hinfo)
-{
-	cancel_delayed_work_sync(&hinfo->gc_work);
-	htable_remove_proc_entry(hinfo);
-	htable_selective_cleanup(hinfo, true);
-	kfree(hinfo->name);
-	vfree(hinfo);
-}
-
 static struct xt_hashlimit_htable *htable_find_get(struct net *net,
 						   const char *name,
 						   u_int8_t family)
@@ -432,8 +423,13 @@ static void htable_put(struct xt_hashlimit_htable *hinfo)
 {
 	if (refcount_dec_and_mutex_lock(&hinfo->use, &hashlimit_mutex)) {
 		hlist_del(&hinfo->node);
+		htable_remove_proc_entry(hinfo);
 		mutex_unlock(&hashlimit_mutex);
-		htable_destroy(hinfo);
+
+		cancel_delayed_work_sync(&hinfo->gc_work);
+		htable_selective_cleanup(hinfo, true);
+		kfree(hinfo->name);
+		vfree(hinfo);
 	}
 }
 
-- 
2.11.0


* Re: [PATCH 0/6] Netfilter fixes for net
  2020-02-26 22:54 [PATCH 0/6] Netfilter fixes for net Pablo Neira Ayuso
                   ` (5 preceding siblings ...)
  2020-02-26 22:54 ` [PATCH 6/6] netfilter: xt_hashlimit: unregister proc file before releasing mutex Pablo Neira Ayuso
@ 2020-02-27  0:32 ` David Miller
  6 siblings, 0 replies; 44+ messages in thread
From: David Miller @ 2020-02-27  0:32 UTC (permalink / raw)
  To: pablo; +Cc: netfilter-devel, netdev

From: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Wed, 26 Feb 2020 23:54:36 +0100

> The following patchset contains Netfilter fixes:
> 
> 1) Perform garbage collection from workqueue to fix rcu detected
>    stall in ipset hash set types, from Jozsef Kadlecsik.
> 
> 2) Fix the forceadd evaluation path, also from Jozsef.
> 
> 3) Fix nft_set_pipapo selftest, from Stefano Brivio.
> 
> 4) Crash when add-flush-add element in pipapo set, also from Stefano.
>    Add test to cover this crash.
> 
> 5) Remove sysctl entry under mutex in hashlimit, from Cong Wang.
> 
> You can pull these changes from:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Pulled, thanks Pablo.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 0/6] Netfilter fixes for net
  2020-08-24 11:39 Pablo Neira Ayuso
@ 2020-08-24 13:37 ` David Miller
  0 siblings, 0 replies; 44+ messages in thread
From: David Miller @ 2020-08-24 13:37 UTC (permalink / raw)
  To: pablo; +Cc: netfilter-devel, netdev, kuba

From: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Mon, 24 Aug 2020 13:39:35 +0200

> The following patchset contains Netfilter fixes for net:
> 
> 1) Don't flag SCTP heartbeat as invalid for re-used connections,
>    from Florian Westphal.
> 
> 2) Bogus overlap report due to rbtree tree rotations, from Stefano Brivio.
> 
> 3) Detect partial overlap with start endpoint match, also from Stefano.
> 
> 4) Skip netlink dump of NFTA_SET_USERDATA if unset.
> 
> 5) Incorrect nft_list_attributes enumeration definition.
> 
> 6) Missing zeroing before memcpy to destination register, also
>    from Florian.
> 
> Please, pull these changes from:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Pulled, thank you.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH 0/6] Netfilter fixes for net
@ 2020-08-24 11:39 Pablo Neira Ayuso
  2020-08-24 13:37 ` David Miller
  0 siblings, 1 reply; 44+ messages in thread
From: Pablo Neira Ayuso @ 2020-08-24 11:39 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba

Hi,

The following patchset contains Netfilter fixes for net:

1) Don't flag SCTP heartbeat as invalid for re-used connections,
   from Florian Westphal.

2) Bogus overlap report due to rbtree tree rotations, from Stefano Brivio.

3) Detect partial overlap with start endpoint match, also from Stefano.

4) Skip netlink dump of NFTA_SET_USERDATA if unset.

5) Incorrect nft_list_attributes enumeration definition.

6) Missing zeroing before memcpy to destination register, also
   from Florian.

Please, pull these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Thank you.

----------------------------------------------------------------

The following changes since commit cf96d977381d4a23957bade2ddf1c420b74a26b6:

  net: gemini: Fix missing free_netdev() in error path of gemini_ethernet_port_probe() (2020-08-19 16:37:18 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git HEAD

for you to fetch changes up to 1e105e6afa6c3d32bfb52c00ffa393894a525c27:

  netfilter: nf_tables: fix destination register zeroing (2020-08-21 19:00:33 +0200)

----------------------------------------------------------------
Florian Westphal (2):
      netfilter: conntrack: allow sctp hearbeat after connection re-use
      netfilter: nf_tables: fix destination register zeroing

Pablo Neira Ayuso (2):
      netfilter: nf_tables: add NFTA_SET_USERDATA if not null
      netfilter: nf_tables: incorrect enum nft_list_attributes definition

Stefano Brivio (2):
      netfilter: nft_set_rbtree: Handle outcomes of tree rotations in overlap detection
      netfilter: nft_set_rbtree: Detect partial overlap with start endpoint match

 include/linux/netfilter/nf_conntrack_sctp.h |  2 +
 include/net/netfilter/nf_tables.h           |  2 +
 include/uapi/linux/netfilter/nf_tables.h    |  2 +-
 net/netfilter/nf_conntrack_proto_sctp.c     | 39 ++++++++++++++++++--
 net/netfilter/nf_tables_api.c               |  3 +-
 net/netfilter/nft_payload.c                 |  4 +-
 net/netfilter/nft_set_rbtree.c              | 57 ++++++++++++++++++++++++-----
 7 files changed, 92 insertions(+), 17 deletions(-)

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 0/6] Netfilter fixes for net
  2020-05-14 12:19 Pablo Neira Ayuso
@ 2020-05-14 20:15 ` David Miller
  0 siblings, 0 replies; 44+ messages in thread
From: David Miller @ 2020-05-14 20:15 UTC (permalink / raw)
  To: pablo; +Cc: netfilter-devel, netdev

From: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Thu, 14 May 2020 14:19:07 +0200

> The following patchset contains Netfilter fixes for net:
> 
> 1) Fix gcc-10 compilation warning in nf_conntrack, from Arnd Bergmann.
> 
> 2) Add NF_FLOW_HW_PENDING to avoid races between stats and deletion
>    commands, from Paul Blakey.
> 
> 3) Remove WQ_MEM_RECLAIM from the offload workqueue, from Roi Dayan.
> 
> 4) Infinite loop when removing nf_conntrack module, from Florian Westphal.
> 
> 5) Set NF_FLOW_TEARDOWN bit on expiration to avoid races when refreshing
>    the timeout from the software path.
> 
> 6) Missing nft_set_elem_expired() check in the rbtree, from Phil Sutter.
> 
> You can pull these changes from:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Pulled, thank you.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH 0/6] Netfilter fixes for net
@ 2020-05-14 12:19 Pablo Neira Ayuso
  2020-05-14 20:15 ` David Miller
  0 siblings, 1 reply; 44+ messages in thread
From: Pablo Neira Ayuso @ 2020-05-14 12:19 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

Hi,

The following patchset contains Netfilter fixes for net:

1) Fix gcc-10 compilation warning in nf_conntrack, from Arnd Bergmann.

2) Add NF_FLOW_HW_PENDING to avoid races between stats and deletion
   commands, from Paul Blakey.

3) Remove WQ_MEM_RECLAIM from the offload workqueue, from Roi Dayan.

4) Infinite loop when removing nf_conntrack module, from Florian Westphal.

5) Set NF_FLOW_TEARDOWN bit on expiration to avoid races when refreshing
   the timeout from the software path.

6) Missing nft_set_elem_expired() check in the rbtree, from Phil Sutter.

You can pull these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Thank you.

----------------------------------------------------------------

The following changes since commit 3047211ca11bf77b3ecbce045c0aa544d934b945:

  net: dsa: loop: Add module soft dependency (2020-05-10 11:24:20 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git HEAD

for you to fetch changes up to 340eaff651160234bdbce07ef34b92a8e45cd540:

  netfilter: nft_set_rbtree: Add missing expired checks (2020-05-12 13:19:34 +0200)

----------------------------------------------------------------
Arnd Bergmann (1):
      netfilter: conntrack: avoid gcc-10 zero-length-bounds warning

Florian Westphal (1):
      netfilter: conntrack: fix infinite loop on rmmod

Pablo Neira Ayuso (1):
      netfilter: flowtable: set NF_FLOW_TEARDOWN flag on entry expiration

Paul Blakey (1):
      netfilter: flowtable: Add pending bit for offload work

Phil Sutter (1):
      netfilter: nft_set_rbtree: Add missing expired checks

Roi Dayan (1):
      netfilter: flowtable: Remove WQ_MEM_RECLAIM from workqueue

 include/net/netfilter/nf_conntrack.h  |  2 +-
 include/net/netfilter/nf_flow_table.h |  1 +
 net/netfilter/nf_conntrack_core.c     | 17 ++++++++++++++---
 net/netfilter/nf_flow_table_core.c    |  8 +++++---
 net/netfilter/nf_flow_table_offload.c | 10 ++++++++--
 net/netfilter/nft_set_rbtree.c        | 11 +++++++++++
 6 files changed, 40 insertions(+), 9 deletions(-)

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 0/6] Netfilter fixes for net
  2020-01-31 19:24 Pablo Neira Ayuso
@ 2020-02-01 20:59 ` Jakub Kicinski
  0 siblings, 0 replies; 44+ messages in thread
From: Jakub Kicinski @ 2020-02-01 20:59 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter-devel, davem, netdev

On Fri, 31 Jan 2020 20:24:22 +0100, Pablo Neira Ayuso wrote:
> Hi,
> 
> The following patchset contains Netfilter fixes for net:
> 
> 1) Fix suspicious RCU usage in ipset, from Jozsef Kadlecsik.
> 
> 2) Use kvcalloc, from Joe Perches.
> 
> 3) Flush flowtable hardware workqueue after garbage collection run,
>    from Paul Blakey.
> 
> 4) Missing flowtable hardware workqueue flush from nf_flow_table_free(),
>    also from Paul.
> 
> 5) Restore NF_FLOW_HW_DEAD in flow_offload_work_del(), from Paul.
> 
> 6) Flowtable documentation fixes, from Matteo Croce.

Pulled, thanks!

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH 0/6] Netfilter fixes for net
@ 2020-01-31 19:24 Pablo Neira Ayuso
  2020-02-01 20:59 ` Jakub Kicinski
  0 siblings, 1 reply; 44+ messages in thread
From: Pablo Neira Ayuso @ 2020-01-31 19:24 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

Hi,

The following patchset contains Netfilter fixes for net:

1) Fix suspicious RCU usage in ipset, from Jozsef Kadlecsik.

2) Use kvcalloc, from Joe Perches.

3) Flush flowtable hardware workqueue after garbage collection run,
   from Paul Blakey.

4) Missing flowtable hardware workqueue flush from nf_flow_table_free(),
   also from Paul.

5) Restore NF_FLOW_HW_DEAD in flow_offload_work_del(), from Paul.

6) Flowtable documentation fixes, from Matteo Croce.

You can pull these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Thank you.

----------------------------------------------------------------

The following changes since commit 44efc78d0e464ce70b45b165c005f8bedc17952e:

  net: mvneta: fix XDP support if sw bm is used as fallback (2020-01-29 13:57:59 +0100)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git HEAD

for you to fetch changes up to 78e06cf430934fc3768c342cbebdd1013dcd6fa7:

  netfilter: nf_flowtable: fix documentation (2020-01-31 19:31:42 +0100)

----------------------------------------------------------------
Joe Perches (1):
      netfilter: Use kvcalloc

Kadlecsik József (1):
      netfilter: ipset: fix suspicious RCU usage in find_set_and_id

Matteo Croce (1):
      netfilter: nf_flowtable: fix documentation

Paul Blakey (3):
      netfilter: flowtable: Fix hardware flush order on nf_flow_table_cleanup
      netfilter: flowtable: Fix missing flush hardware on table free
      netfilter: flowtable: Fix setting forgotten NF_FLOW_HW_DEAD flag

 Documentation/networking/nf_flowtable.txt |  2 +-
 net/netfilter/ipset/ip_set_core.c         | 41 ++++++++++++++++---------------
 net/netfilter/nf_conntrack_core.c         |  3 +--
 net/netfilter/nf_flow_table_core.c        |  3 ++-
 net/netfilter/nf_flow_table_offload.c     |  1 +
 net/netfilter/x_tables.c                  |  4 +--
 6 files changed, 28 insertions(+), 26 deletions(-)

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 0/6] Netfilter fixes for net
  2019-02-05 19:04 Pablo Neira Ayuso
@ 2019-02-05 19:23 ` David Miller
  0 siblings, 0 replies; 44+ messages in thread
From: David Miller @ 2019-02-05 19:23 UTC (permalink / raw)
  To: pablo; +Cc: netfilter-devel, netdev

From: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Tue,  5 Feb 2019 20:04:09 +0100

> The following patchset contains Netfilter fixes for net:
 ...
> Diffstat looks rather larger than usual because of the new selftest, but
> Florian and I consider that getting tests into the tree early is good to
> improve coverage. If there's a different policy in this regard, please
> let me know.

Adding a test case like this is fine and in fact encouraged.

> You can pull these changes from:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Pulled, thanks.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH 0/6] Netfilter fixes for net
@ 2019-02-05 19:04 Pablo Neira Ayuso
  2019-02-05 19:23 ` David Miller
  0 siblings, 1 reply; 44+ messages in thread
From: Pablo Neira Ayuso @ 2019-02-05 19:04 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

Hi David,

The following patchset contains Netfilter fixes for net:

1) Use CONFIG_NF_TABLES_INET from selftests, not NF_TABLES_INET.
   From Naresh Kamboju.

2) Add a test to cover masquerading and redirect case, from Florian
   Westphal.

3) Two packets coming from the same socket may race to set up NAT,
   ending up with different tuples and the packet that loses the race
   being dropped. Update nf_conntrack_tuple_taken() to exercise clash
   resolution for this case. From Martynas Pumputis and Florian
   Westphal.

4) Unbind anonymous sets from the commit and abort path, this fixes
   a splat due to double set list removal/release in case that the
   transaction needs to be aborted.

5) Do not preserve original output interface for packets that are
   redirected in the output chain when ip6_route_me_harder() is
   called. Otherwise packets end up not going to the loopback
   device. From Eli Cooper.

6) Fix bogus splat in nft_compat with CONFIG_REFCOUNT_FULL=y, this
   also simplifies the existing logic to deal with the list insertions
   of the xtables extensions. From Florian Westphal.

Diffstat looks rather larger than usual because of the new selftest, but
Florian and I consider that getting tests into the tree early is good to
improve coverage. If there's a different policy in this regard, please
let me know.

You can pull these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Thanks!

----------------------------------------------------------------

The following changes since commit cfe4bd7a257f6d6f81d3458d8c9d9ec4957539e6:

  sctp: check and update stream->out_curr when allocating stream_out (2019-02-03 14:27:47 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git HEAD

for you to fetch changes up to 947e492c0fc2132ae5fca081a9c2952ccaab0404:

  netfilter: nft_compat: don't use refcount_inc on newly allocated entry (2019-02-05 14:10:33 +0100)

----------------------------------------------------------------
Eli Cooper (1):
      netfilter: ipv6: Don't preserve original oif for loopback address

Florian Westphal (2):
      selftests: netfilter: add simple masq/redirect test cases
      netfilter: nft_compat: don't use refcount_inc on newly allocated entry

Martynas Pumputis (1):
      netfilter: nf_nat: skip nat clash resolution for same-origin entries

Naresh Kamboju (1):
      selftests: netfilter: fix config fragment CONFIG_NF_TABLES_INET

Pablo Neira Ayuso (1):
      netfilter: nf_tables: unbind set in rule from commit path

 include/net/netfilter/nf_tables.h            |  17 +-
 net/ipv6/netfilter.c                         |   4 +-
 net/netfilter/nf_conntrack_core.c            |  16 +
 net/netfilter/nf_tables_api.c                |  85 ++-
 net/netfilter/nft_compat.c                   |  62 +--
 net/netfilter/nft_dynset.c                   |  18 +-
 net/netfilter/nft_immediate.c                |   6 +-
 net/netfilter/nft_lookup.c                   |  18 +-
 net/netfilter/nft_objref.c                   |  18 +-
 tools/testing/selftests/netfilter/Makefile   |   2 +-
 tools/testing/selftests/netfilter/config     |   2 +-
 tools/testing/selftests/netfilter/nft_nat.sh | 762 +++++++++++++++++++++++++++
 12 files changed, 888 insertions(+), 122 deletions(-)
 create mode 100755 tools/testing/selftests/netfilter/nft_nat.sh

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 0/6] Netfilter fixes for net
  2018-10-01 22:37 Pablo Neira Ayuso
@ 2018-10-01 22:41 ` David Miller
  0 siblings, 0 replies; 44+ messages in thread
From: David Miller @ 2018-10-01 22:41 UTC (permalink / raw)
  To: pablo; +Cc: netfilter-devel, netdev

From: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Tue,  2 Oct 2018 00:37:39 +0200

> The following patchset contains Netfilter fixes for your net tree:
 ...
> You can pull these changes from:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Pulled, thanks.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH 0/6] Netfilter fixes for net
@ 2018-10-01 22:37 Pablo Neira Ayuso
  2018-10-01 22:41 ` David Miller
  0 siblings, 1 reply; 44+ messages in thread
From: Pablo Neira Ayuso @ 2018-10-01 22:37 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

Hi David,

The following patchset contains Netfilter fixes for your net tree:

1) Skip ip_sabotage_in() for packets making it into the VRF driver,
   otherwise packets are dropped, from David Ahern.

2) Clang compilation warning uncovering typo in the
   nft_validate_register_store() call from nft_osf, from Stefan Agner.

3) Double sizeof netlink message length calculations in ctnetlink,
   from zhong jiang.

4) Missing rb_erase() on batch full in rbtree garbage collector,
   from Taehee Yoo.

5) Calm down compilation warning in nf_hook(), from Florian Westphal.

6) Missing check for non-null sk in xt_socket before validating
   netns origin, from Flavio Leitner.

You can pull these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Thanks.

----------------------------------------------------------------

The following changes since commit 56ce3c5a50f4d8cc95361b1ec7f152006c6320d8:

  smc: generic netlink family should be __ro_after_init (2018-09-20 07:49:55 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git HEAD

for you to fetch changes up to 40e4f26e6a14fc1496eabb8b0004a547303114e6:

  netfilter: xt_socket: check sk before checking for netns. (2018-09-28 14:47:41 +0200)

----------------------------------------------------------------
David Ahern (1):
      netfilter: bridge: Don't sabotage nf_hook calls from an l3mdev

Flavio Leitner (1):
      netfilter: xt_socket: check sk before checking for netns.

Florian Westphal (1):
      netfilter: avoid erronous array bounds warning

Stefan Agner (1):
      netfilter: nft_osf: use enum nft_data_types for nft_validate_register_store

Taehee Yoo (1):
      netfilter: nft_set_rbtree: add missing rb_erase() in GC routine

zhong jiang (1):
      netfilter: conntrack: get rid of double sizeof

 include/linux/netfilter.h              |  2 ++
 net/bridge/br_netfilter_hooks.c        |  3 ++-
 net/netfilter/nf_conntrack_proto_tcp.c |  4 ++--
 net/netfilter/nft_osf.c                |  2 +-
 net/netfilter/nft_set_rbtree.c         | 28 ++++++++++++++--------------
 net/netfilter/xt_socket.c              |  4 ++--
 6 files changed, 23 insertions(+), 20 deletions(-)

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 0/6] Netfilter fixes for net
  2018-07-09 17:18 Pablo Neira Ayuso
@ 2018-07-09 21:24 ` David Miller
  0 siblings, 0 replies; 44+ messages in thread
From: David Miller @ 2018-07-09 21:24 UTC (permalink / raw)
  To: pablo; +Cc: netfilter-devel, netdev

From: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Mon,  9 Jul 2018 19:18:58 +0200

> The following patchset contains Netfilter fixes for your net tree:
> 
> 1) Missing module autoload for icmp and icmpv6 x_tables matches,
>    from Florian Westphal.
> 
> 2) Possible non-linear access to TCP header from tproxy, from
>    Mate Eckl.
> 
> 3) Do not allow rbtree to be used for single elements. This patch
>    moves all set backends into one single module, since this can
>    only happen if the hashtable module is explicitly blacklisted,
>    which should never be done.
> 
> 4) Reject error and standard targets from nft_compat for sanity
>    reasons, they are never used from there.
> 
> 5) Don't crash on double hashsize module parameter, from Andrey
>    Ryabinin.
> 
> 6) Drop dst on skb before placing it in the fragmentation
>    reassembly queue, from Florian Westphal.
> 
> You can pull these changes from:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Pulled, thanks.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH 0/6] Netfilter fixes for net
@ 2018-07-09 17:18 Pablo Neira Ayuso
  2018-07-09 21:24 ` David Miller
  0 siblings, 1 reply; 44+ messages in thread
From: Pablo Neira Ayuso @ 2018-07-09 17:18 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

Hi David,

The following patchset contains Netfilter fixes for your net tree:

1) Missing module autoload for icmp and icmpv6 x_tables matches,
   from Florian Westphal.

2) Possible non-linear access to TCP header from tproxy, from
   Mate Eckl.

3) Do not allow rbtree to be used for single elements. This patch
   moves all set backends into one single module, since this can
   only happen if the hashtable module is explicitly blacklisted,
   which should never be done.

4) Reject error and standard targets from nft_compat for sanity
   reasons, they are never used from there.

5) Don't crash on double hashsize module parameter, from Andrey
   Ryabinin.

6) Drop dst on skb before placing it in the fragmentation
   reassembly queue, from Florian Westphal.

You can pull these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Thanks!

----------------------------------------------------------------

The following changes since commit d461e3da905332189aad546b2ad9adbe6071c7cc:

  smsc75xx: Add workaround for gigabit link up hardware errata. (2018-07-04 22:12:59 +0900)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git HEAD

for you to fetch changes up to 84379c9afe011020e797e3f50a662b08a6355dcf:

  netfilter: ipv6: nf_defrag: drop skb dst before queueing (2018-07-09 18:04:12 +0200)

----------------------------------------------------------------
Andrey Ryabinin (1):
      netfilter: nf_conntrack: Fix possible possible crash on module loading.

Florian Westphal (3):
      netfilter: x_tables: set module owner for icmp(6) matches
      netfilter: nft_compat: explicitly reject ERROR and standard target
      netfilter: ipv6: nf_defrag: drop skb dst before queueing

Máté Eckl (1):
      netfilter: nf_tproxy: fix possible non-linear access to transport header

Pablo Neira Ayuso (1):
      netfilter: nf_tables: place all set backends in one single module

 include/net/netfilter/nf_tables_core.h  |  6 ++++++
 include/net/netfilter/nf_tproxy.h       |  4 ++--
 net/ipv4/netfilter/ip_tables.c          |  1 +
 net/ipv4/netfilter/nf_tproxy_ipv4.c     | 18 ++++++++++++------
 net/ipv6/netfilter/ip6_tables.c         |  1 +
 net/ipv6/netfilter/nf_conntrack_reasm.c |  2 ++
 net/ipv6/netfilter/nf_tproxy_ipv6.c     | 18 ++++++++++++------
 net/netfilter/Kconfig                   | 25 +++++++------------------
 net/netfilter/Makefile                  |  7 ++++---
 net/netfilter/nf_conntrack_core.c       |  2 +-
 net/netfilter/nf_tables_set_core.c      | 28 ++++++++++++++++++++++++++++
 net/netfilter/nft_compat.c              | 13 +++++++++++++
 net/netfilter/nft_set_bitmap.c          | 19 +------------------
 net/netfilter/nft_set_hash.c            | 29 +++--------------------------
 net/netfilter/nft_set_rbtree.c          | 19 +------------------
 net/netfilter/xt_TPROXY.c               |  8 ++++----
 16 files changed, 98 insertions(+), 102 deletions(-)
 create mode 100644 net/netfilter/nf_tables_set_core.c

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 0/6] Netfilter fixes for net
  2018-06-27 15:22 Pablo Neira Ayuso
@ 2018-06-28  4:33 ` David Miller
  0 siblings, 0 replies; 44+ messages in thread
From: David Miller @ 2018-06-28  4:33 UTC (permalink / raw)
  To: pablo; +Cc: netfilter-devel, netdev

From: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Wed, 27 Jun 2018 17:22:17 +0200

> The following patchset contains Netfilter fixes for your net tree:
> 
> 1) Missing netlink attribute validation in nf_queue, uncovered by KASAN,
>    from Eric Dumazet.
> 
> 2) Use pointer to sysctl table, save us 192 bytes of memory per netns.
>    Also from Eric.
> 
> 3) Possible use-after-free when removing conntrack helper modules due
>    to missing synchronize RCU call. From Taehee Yoo.
> 
> 4) Fix corner case in sysctl writes to nf_log that leads to appending
>    data to uninitialized buffer, from Jann Horn.
> 
> 5) Jann Horn says we may indefinitely block other users of nf_log_mutex
>    if a userspace access in proc_dostring() blocked e.g. due to a
>    userfaultfd.
> 
> 6) Fix garbage collection race for unconfirmed conntrack entries,
>    from Florian Westphal.
> 
> You can pull these changes from:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Pulled, thank you.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH 0/6] Netfilter fixes for net
@ 2018-06-27 15:22 Pablo Neira Ayuso
  2018-06-28  4:33 ` David Miller
  0 siblings, 1 reply; 44+ messages in thread
From: Pablo Neira Ayuso @ 2018-06-27 15:22 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

Hi David,

The following patchset contains Netfilter fixes for your net tree:

1) Missing netlink attribute validation in nf_queue, uncovered by KASAN,
   from Eric Dumazet.

2) Use pointer to sysctl table, save us 192 bytes of memory per netns.
   Also from Eric.

3) Possible use-after-free when removing conntrack helper modules due
   to missing synchronize RCU call. From Taehee Yoo.

4) Fix corner case in sysctl writes to nf_log that leads to appending
   data to uninitialized buffer, from Jann Horn.

5) Jann Horn says we may indefinitely block other users of nf_log_mutex
   if a userspace access in proc_dostring() blocked e.g. due to a
   userfaultfd.

6) Fix garbage collection race for unconfirmed conntrack entries,
   from Florian Westphal.

You can pull these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Thanks.

----------------------------------------------------------------

The following changes since commit 7e85dc8cb35abf16455f1511f0670b57c1a84608:

  net_sched: blackhole: tell upper qdisc about dropped packets (2018-06-17 08:42:33 +0900)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git HEAD

for you to fetch changes up to b36e4523d4d56e2595e28f16f6ccf1cd6a9fc452:

  netfilter: nf_conncount: fix garbage collection confirm race (2018-06-26 18:28:57 +0200)

----------------------------------------------------------------
Eric Dumazet (2):
      netfilter: nf_queue: augment nfqa_cfg_policy
      netfilter: ipv6: nf_defrag: reduce struct net memory waste

Florian Westphal (1):
      netfilter: nf_conncount: fix garbage collection confirm race

Gao Feng (1):
      netfilter: nf_ct_helper: Fix possible panic after nf_conntrack_helper_unregister

Jann Horn (2):
      netfilter: nf_log: fix uninit read in nf_log_proc_dostring
      netfilter: nf_log: don't hold nf_log_mutex during user access

 include/net/net_namespace.h             |  1 +
 include/net/netns/ipv6.h                |  1 -
 net/ipv6/netfilter/nf_conntrack_reasm.c |  6 ++--
 net/netfilter/nf_conncount.c            | 52 +++++++++++++++++++++++++++++----
 net/netfilter/nf_conntrack_helper.c     |  5 ++++
 net/netfilter/nf_log.c                  | 13 +++++++--
 net/netfilter/nfnetlink_queue.c         |  3 ++
 7 files changed, 69 insertions(+), 12 deletions(-)

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 0/6] Netfilter fixes for net
  2018-02-01 18:02 Pablo Neira Ayuso
@ 2018-02-01 19:45 ` David Miller
  0 siblings, 0 replies; 44+ messages in thread
From: David Miller @ 2018-02-01 19:45 UTC (permalink / raw)
  To: pablo; +Cc: netfilter-devel, netdev

From: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Thu,  1 Feb 2018 19:02:11 +0100

> The following patchset contains Netfilter fixes for your net tree,
> they are:
> 
> 1) Fix OOM that syzkaller triggers with ipt_replace.size = -1 and
>    IPT_SO_SET_REPLACE socket option, from Dmitry Vyukov.
> 
> 2) Check for too long extension name in xt_request_find_{match|target}
>    that results in out-of-bounds reads, from Eric Dumazet.
> 
> 3) Fix memory exhaustion bug in ipset hash:*net* types when adding ranges
>    that look like x.x.x.x-255.255.255.255, from Jozsef Kadlecsik.
> 
> 4) Fix pointer leaks to userspace in x_tables, from Dmitry Vyukov.
> 
> 5) Insufficient sanity checks in clusterip_tg_check(), also from Dmitry.
> 
> You can pull these changes from:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Pulled, thanks.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH 0/6] Netfilter fixes for net
@ 2018-02-01 18:02 Pablo Neira Ayuso
  2018-02-01 19:45 ` David Miller
  0 siblings, 1 reply; 44+ messages in thread
From: Pablo Neira Ayuso @ 2018-02-01 18:02 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

Hi David,

The following patchset contains Netfilter fixes for your net tree,
they are:

1) Fix OOM that syzkaller triggers with ipt_replace.size = -1 and
   IPT_SO_SET_REPLACE socket option, from Dmitry Vyukov.

2) Check for too long extension name in xt_request_find_{match|target}
   that results in out-of-bounds reads, from Eric Dumazet.

3) Fix memory exhaustion bug in ipset hash:*net* types when adding ranges
   that look like x.x.x.x-255.255.255.255, from Jozsef Kadlecsik.

4) Fix pointer leaks to userspace in x_tables, from Dmitry Vyukov.

5) Insufficient sanity checks in clusterip_tg_check(), also from Dmitry.

You can pull these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Thanks!

P.S: Another batch is following up soon, there are more fixes cooking on
     the mailing list.

----------------------------------------------------------------

The following changes since commit d1616f07e8f1a4a490d1791316d4a68906b284aa:

  net: fec: free/restore resource in related probe error pathes (2018-01-05 11:19:11 -0500)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git HEAD

for you to fetch changes up to 3f34cfae1238848fd53f25e5c8fd59da57901f4b:

  netfilter: on sockopt() acquire sock lock only in the required scope (2018-01-31 16:37:47 +0100)

----------------------------------------------------------------
Dmitry Vyukov (3):
      netfilter: x_tables: fix int overflow in xt_alloc_table_info()
      netfilter: x_tables: fix pointer leaks to userspace
      netfilter: ipt_CLUSTERIP: fix out-of-bounds accesses in clusterip_tg_check()

Eric Dumazet (1):
      netfilter: x_tables: avoid out-of-bounds reads in xt_request_find_{match|target}

Jozsef Kadlecsik (1):
      netfilter: ipset: Fix wraparound in hash:*net* types

Paolo Abeni (1):
      netfilter: on sockopt() acquire sock lock only in the required scope

 net/ipv4/ip_sockglue.c                         | 14 +++--------
 net/ipv4/netfilter/ipt_CLUSTERIP.c             | 16 +++++++++---
 net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c |  6 ++++-
 net/ipv6/ipv6_sockglue.c                       | 17 ++++---------
 net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c | 18 ++++++++-----
 net/netfilter/ipset/ip_set_hash_ipportnet.c    | 26 +++++++++----------
 net/netfilter/ipset/ip_set_hash_net.c          |  9 +++----
 net/netfilter/ipset/ip_set_hash_netiface.c     |  9 +++----
 net/netfilter/ipset/ip_set_hash_netnet.c       | 28 ++++++++++-----------
 net/netfilter/ipset/ip_set_hash_netport.c      | 19 +++++++-------
 net/netfilter/ipset/ip_set_hash_netportnet.c   | 35 +++++++++++++-------------
 net/netfilter/x_tables.c                       |  9 +++++--
 net/netfilter/xt_IDLETIMER.c                   |  1 +
 net/netfilter/xt_LED.c                         |  1 +
 net/netfilter/xt_limit.c                       |  3 +--
 net/netfilter/xt_nfacct.c                      |  1 +
 net/netfilter/xt_statistic.c                   |  1 +
 17 files changed, 114 insertions(+), 99 deletions(-)

* Re: [PATCH 0/6] Netfilter fixes for net
  2017-02-27 11:35 Pablo Neira Ayuso
@ 2017-02-27 14:19 ` David Miller
  0 siblings, 0 replies; 44+ messages in thread
From: David Miller @ 2017-02-27 14:19 UTC (permalink / raw)
  To: pablo; +Cc: netfilter-devel, netdev

From: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Mon, 27 Feb 2017 12:35:36 +0100

> The following patchset contains netfilter fixes for your net tree,
> they are:
> 
> 1) Missing ct zone size in the nft_ct initialization path, patch
>    from Florian Westphal.
> 
> 2) Two patches for netfilter uapi headers, one to remove unnecessary
>    sysctl.h inclusion and another to fix compilation of xt_hashlimit.h
>    in userspace, from Dmitry V. Levin.
> 
> 3) Patch to fix a sloppy change in nf_ct_expect that incorrectly
>    simplified nf_ct_expect_related_report() in the previous nf-next
>    batch. This also includes another patch for __nf_ct_expect_check()
>    to report success by returning 0 to keep it consistent with other
>    existing functions. From Jarno Rajahalme.
> 
> 4) The ->walk() iterator of the new bitmap set type goes over the real
>    bitmap size, this results in incorrect dumps when NFTA_SET_USERDATA
>    is used.
> 
> You can pull these changes from:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Pulled, thanks Pablo.

* [PATCH 0/6] Netfilter fixes for net
@ 2017-02-27 11:35 Pablo Neira Ayuso
  2017-02-27 14:19 ` David Miller
  0 siblings, 1 reply; 44+ messages in thread
From: Pablo Neira Ayuso @ 2017-02-27 11:35 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

Hi David,

The following patchset contains netfilter fixes for your net tree,
they are:

1) Missing ct zone size in the nft_ct initialization path, patch
   from Florian Westphal.

2) Two patches for netfilter uapi headers, one to remove unnecessary
   sysctl.h inclusion and another to fix compilation of xt_hashlimit.h
   in userspace, from Dmitry V. Levin.

3) Patch to fix a sloppy change in nf_ct_expect that incorrectly
   simplified nf_ct_expect_related_report() in the previous nf-next
   batch. This also includes another patch for __nf_ct_expect_check()
   to report success by returning 0 to keep it consistent with other
   existing functions. From Jarno Rajahalme.

4) The ->walk() iterator of the new bitmap set type goes over the real
   bitmap size, this results in incorrect dumps when NFTA_SET_USERDATA
   is used.

You can pull these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Thanks!

----------------------------------------------------------------

The following changes since commit 9c4713701c01e4cef6e2315c2818abc919ffb0de:

  bpf: Fix bpf_xdp_event_output (2017-02-23 13:53:42 -0500)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git HEAD

for you to fetch changes up to 13aa5a8f498dacd5f1a8e35be72af47e630fb8c6:

  netfilter: nft_set_bitmap: incorrect bitmap size (2017-02-26 21:00:19 +0100)

----------------------------------------------------------------
Dmitry V. Levin (2):
      uapi: stop including linux/sysctl.h in uapi/linux/netfilter.h
      uapi: fix linux/netfilter/xt_hashlimit.h userspace compilation error

Florian Westphal (1):
      netfilter: nft_ct: fix random validation errors for zone set support

Jarno Rajahalme (2):
      netfilter: nf_ct_expect: nf_ct_expect_related_report(): Return zero on success.
      netfilter: nf_ct_expect: Change __nf_ct_expect_check() return value.

Pablo Neira Ayuso (1):
      netfilter: nft_set_bitmap: incorrect bitmap size

 include/uapi/linux/netfilter.h              | 1 -
 include/uapi/linux/netfilter/xt_hashlimit.h | 1 +
 net/netfilter/nf_conntrack_expect.c         | 6 +++---
 net/netfilter/nft_ct.c                      | 1 +
 net/netfilter/nft_set_bitmap.c              | 2 +-
 5 files changed, 6 insertions(+), 5 deletions(-)

* Re: [PATCH 0/6] Netfilter fixes for net
  2017-01-05 11:19 Pablo Neira Ayuso
@ 2017-01-05 16:52 ` David Miller
  0 siblings, 0 replies; 44+ messages in thread
From: David Miller @ 2017-01-05 16:52 UTC (permalink / raw)
  To: pablo; +Cc: netfilter-devel, netdev

From: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Thu,  5 Jan 2017 12:19:47 +0100

> The following patchset contains accumulated Netfilter fixes for your
> net tree:
> 
> 1) Ensure quota dump and reset happens iff we can deliver numbers to
>    userspace.
> 
> 2) Silence splat on incorrect use of smp_processor_id() from nft_queue.
> 
> 3) Fix an out-of-bound access reported by KASAN in
>    nf_tables_rule_destroy(), patch from Florian Westphal.
> 
> 4) Fix layer 4 checksum mangling in the nf_tables payload expression
>    with IPv6.
> 
> 5) Fix a race in the CLUSTERIP target from control plane path when two
>    threads run to add a new configuration object. Serialize invocations
>    of clusterip_config_init() using spin_lock. From Xin Long.
> 
> 6) Call br_nf_pre_routing_finish_bridge_finish() once we are done with
>    the br_nf_pre_routing_finish() hook. From Artur Molchanov.
> 
> You can pull these changes from:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Pulled, thanks Pablo.

And a happy new year to you too!

* [PATCH 0/6] Netfilter fixes for net
@ 2017-01-05 11:19 Pablo Neira Ayuso
  2017-01-05 16:52 ` David Miller
  0 siblings, 1 reply; 44+ messages in thread
From: Pablo Neira Ayuso @ 2017-01-05 11:19 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

Hi David,

The following patchset contains accumulated Netfilter fixes for your
net tree:

1) Ensure quota dump and reset happens iff we can deliver numbers to
   userspace.

2) Silence splat on incorrect use of smp_processor_id() from nft_queue.

3) Fix an out-of-bound access reported by KASAN in
   nf_tables_rule_destroy(), patch from Florian Westphal.

4) Fix layer 4 checksum mangling in the nf_tables payload expression
   with IPv6.

5) Fix a race in the CLUSTERIP target from control plane path when two
   threads run to add a new configuration object. Serialize invocations
   of clusterip_config_init() using spin_lock. From Xin Long.

6) Call br_nf_pre_routing_finish_bridge_finish() once we are done with
   the br_nf_pre_routing_finish() hook. From Artur Molchanov.

You can pull these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Wish you a nice new year btw, thanks!

----------------------------------------------------------------

The following changes since commit a220871be66f99d8957c693cf22ec67ecbd9c23a:

  virtio-net: correctly enable multiqueue (2016-12-13 10:37:38 -0500)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git HEAD

for you to fetch changes up to 14221cc45caad2fcab3a8543234bb7eda9b540d5:

  bridge: netfilter: Fix dropping packets that moving through bridge interface (2016-12-30 18:22:50 +0100)

----------------------------------------------------------------
Artur Molchanov (1):
      bridge: netfilter: Fix dropping packets that moving through bridge interface

Florian Westphal (1):
      netfilter: nf_tables: fix oob access

Pablo Neira Ayuso (3):
      netfilter: nft_quota: reset quota after dump
      netfilter: nft_queue: use raw_smp_processor_id()
      netfilter: nft_payload: mangle ckecksum if NFT_PAYLOAD_L4CSUM_PSEUDOHDR is set

Xin Long (1):
      netfilter: ipt_CLUSTERIP: check duplicate config when initializing

 net/bridge/br_netfilter_hooks.c    |  2 +-
 net/ipv4/netfilter/ipt_CLUSTERIP.c | 34 +++++++++++++++++++++++-----------
 net/netfilter/nf_tables_api.c      |  2 +-
 net/netfilter/nft_payload.c        | 27 +++++++++++++++++++--------
 net/netfilter/nft_queue.c          |  2 +-
 net/netfilter/nft_quota.c          | 26 ++++++++++++++------------
 6 files changed, 59 insertions(+), 34 deletions(-)

* Re: [PATCH 0/6] Netfilter fixes for net
  2016-08-18 17:29 Pablo Neira Ayuso
@ 2016-08-19  1:49 ` David Miller
  0 siblings, 0 replies; 44+ messages in thread
From: David Miller @ 2016-08-19  1:49 UTC (permalink / raw)
  To: pablo; +Cc: netfilter-devel, netdev

From: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Thu, 18 Aug 2016 19:29:02 +0200

> The following patchset contains Netfilter updates for your net tree,
> they are:
 ...
> You can pull these changes from:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Pulled, thanks a lot Pablo.

* [PATCH 0/6] Netfilter fixes for net
@ 2016-08-18 17:29 Pablo Neira Ayuso
  2016-08-19  1:49 ` David Miller
  0 siblings, 1 reply; 44+ messages in thread
From: Pablo Neira Ayuso @ 2016-08-18 17:29 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

Hi David,

The following patchset contains Netfilter updates for your net tree,
they are:

1) Dump only conntrack entries that belong to this namespace via /proc file.
   This is some fallout from the conversion to single conntrack table
   for all netns, patch from Liping Zhang.

2) Missing MODULE_ALIAS_NF_LOGGER() for the ARP family that prevents
   module autoloading, also from Liping Zhang.

3) Report overquota event to the right network namespace, again from Liping.

4) Fix tproxy listener sk refcount that leads to crash, from
   Eric Dumazet.

5) Fix racy refcounting on object deletion from nfnetlink and rule
   removal both for nfacct and cttimeout, from Liping Zhang.

You can pull these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Thanks!

----------------------------------------------------------------

The following changes since commit a1560dd7a47f983419760aa7f6a481e3b910b54b:

  Merge branch 'mediatek-fixes' (2016-08-15 23:02:45 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git HEAD

for you to fetch changes up to b75911b66ad508a3c3f006ce37d9f9ebee34da43:

  netfilter: cttimeout: fix use after free error when delete netns (2016-08-18 15:17:00 +0200)

----------------------------------------------------------------
Eric Dumazet (1):
      netfilter: tproxy: properly refcount tcp listeners

Liping Zhang (5):
      netfilter: conntrack: do not dump other netns's conntrack entries via proc
      netfilter: nfnetlink_log: add "nf-logger-3-1" module alias name
      netfilter: nfnetlink_acct: report overquota to the right netns
      netfilter: nfnetlink_acct: fix race between nfacct del and xt_nfacct destroy
      netfilter: cttimeout: fix use after free error when delete netns

 include/linux/netfilter/nfnetlink_acct.h |  4 ++--
 net/netfilter/nf_conntrack_standalone.c  |  4 ++++
 net/netfilter/nfnetlink_acct.c           | 17 +++++++++--------
 net/netfilter/nfnetlink_cttimeout.c      | 16 ++++++++++------
 net/netfilter/nfnetlink_log.c            |  1 +
 net/netfilter/xt_TPROXY.c                |  4 ++++
 net/netfilter/xt_nfacct.c                |  2 +-
 7 files changed, 31 insertions(+), 17 deletions(-)

* Re: [PATCH 0/6] Netfilter fixes for net
  2016-02-16 17:02 Pablo Neira Ayuso
@ 2016-02-16 17:56 ` David Miller
  0 siblings, 0 replies; 44+ messages in thread
From: David Miller @ 2016-02-16 17:56 UTC (permalink / raw)
  To: pablo; +Cc: netfilter-devel, netdev

From: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Tue, 16 Feb 2016 18:02:31 +0100

> The following patchset contains a rather large batch for your net tree
> that includes accumulated bugfixes, they are:
 ...
> Due to the NetDev 1.1 organization burden, I had no chance to pass this
> up to you any sooner in this release cycle, sorry about that.

Understood :)

> You can pull these changes from:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Pulled, thanks.

* [PATCH 0/6] Netfilter fixes for net
@ 2016-02-16 17:02 Pablo Neira Ayuso
  2016-02-16 17:56 ` David Miller
  0 siblings, 1 reply; 44+ messages in thread
From: Pablo Neira Ayuso @ 2016-02-16 17:02 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

Hi David,

The following patchset contains a rather large batch for your net tree
that includes accumulated bugfixes, they are:

1) Run conntrack cleanup from workqueue process context to avoid hitting
   soft lockup via watchdog for large tables. This is required by the
   IPv6 masquerading extension. From Florian Westphal.

2) Use original skbuff from nfnetlink batch when calling netlink_ack()
   on error since this needs to access the skb->sk pointer.

3) Incremental fix on top of Sasha Levin's recent lock fix for conntrack
   resizing.

4) Fix several problems in nfnetlink batch message header sanitization
   and error handling, from Phil Turnbull.

5) Select NF_DUP_IPV6 based on CONFIG_IPV6, from Arnd Bergmann.

6) Fix wrong signedness in return values in the nf_tables counter
   expression, from Anton Protopopov.
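
Item 6 touches on the kernel convention that functions report failure as a
negative errno value, so callers can simply test for ret < 0. A minimal
sketch of why the sign matters, written in Python for brevity; init_counter
and caller_sees_failure are hypothetical names for illustration and are not
taken from the nft_counter code:

```python
import errno


def init_counter(alloc_ok, negate=True):
    """Return 0 on success, or an errno value on allocation failure.

    negate=False models the buggy variant that returns a positive errno.
    """
    if alloc_ok:
        return 0
    return -errno.ENOMEM if negate else errno.ENOMEM


def caller_sees_failure(ret):
    # Typical caller check: only negative values count as errors.
    return ret < 0
```

With the positive variant, caller_sees_failure() returns False for a failed
allocation, so the error would be silently treated as success.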

Due to the NetDev 1.1 organization burden, I had no chance to pass this
up to you any sooner in this release cycle, sorry about that.

You can pull these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Thanks!

----------------------------------------------------------------

The following changes since commit 53729eb174c1589f9185340ffe8c10b3f39f3ef3:

  Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth (2016-01-30 15:32:42 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git HEAD

for you to fetch changes up to 5cc6ce9ff27565949a1001a2889a8dd9fd09e772:

  netfilter: nft_counter: fix erroneous return values (2016-02-08 13:05:02 +0100)

----------------------------------------------------------------
Anton Protopopov (1):
      netfilter: nft_counter: fix erroneous return values

Arnd Bergmann (1):
      netfilter: tee: select NF_DUP_IPV6 unconditionally

Florian Westphal (2):
      netfilter: conntrack: resched in nf_ct_iterate_cleanup
      netfilter: cttimeout: fix deadlock due to erroneous unlock/lock conversion

Pablo Neira Ayuso (1):
      netfilter: nfnetlink: use original skbuff when acking batches

Phil Turnbull (1):
      netfilter: nfnetlink: correctly validate length of batch messages

 net/ipv6/netfilter/nf_nat_masquerade_ipv6.c | 74 +++++++++++++++++++++++++++--
 net/netfilter/Kconfig                       |  2 +-
 net/netfilter/nf_conntrack_core.c           |  5 ++
 net/netfilter/nfnetlink.c                   | 16 ++++---
 net/netfilter/nfnetlink_cttimeout.c         |  2 +-
 net/netfilter/nft_counter.c                 |  4 +-
 net/netfilter/xt_TEE.c                      |  4 +-
 7 files changed, 91 insertions(+), 16 deletions(-)

* Re: [PATCH 0/6] Netfilter fixes for net
  2016-01-20 17:03 Pablo Neira Ayuso
@ 2016-01-21  2:57 ` David Miller
  0 siblings, 0 replies; 44+ messages in thread
From: David Miller @ 2016-01-21  2:57 UTC (permalink / raw)
  To: pablo; +Cc: netfilter-devel, netdev

From: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Wed, 20 Jan 2016 18:03:58 +0100

> The following patchset contains Netfilter fixes for your net tree, they
> are:
> 
> 1) Fix accidental 3-times le/be conversion for 64-bits in nft_byteorder,
>    from Florian Westphal.
> 
> 2) Get rid of defensive cidr = 0 check in the ipset hash:netiface set
>    type which doesn't allow valid 0.0.0.0/0 elements, also from Florian.
> 
> 3) Relocate #endif in nft_ct counter support, this doesn't have any
>    relation with labels.
> 
> 4) Fix TCPMSS target for IPv6 when skb has CHECKSUM_COMPLETE, from
>    Eric Dumazet.
> 
> 5) Fix netdevice notifier leak from the error path of nf_tables_netdev.
> 
> 6) Make conntrack hashtable resizing safe by introducing a global lock
>    and synchronizing all buckets to avoid going over the maximum number
>    of preemption levels, from Sasha Levin.
> 
> You can pull these changes from:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Pulled, thanks Pablo.

* [PATCH 0/6] Netfilter fixes for net
@ 2016-01-20 17:03 Pablo Neira Ayuso
  2016-01-21  2:57 ` David Miller
  0 siblings, 1 reply; 44+ messages in thread
From: Pablo Neira Ayuso @ 2016-01-20 17:03 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

Hi David,

The following patchset contains Netfilter fixes for your net tree, they
are:

1) Fix accidental 3-times le/be conversion for 64-bits in nft_byteorder,
   from Florian Westphal.

2) Get rid of defensive cidr = 0 check in the ipset hash:netiface set
   type which doesn't allow valid 0.0.0.0/0 elements, also from Florian.

3) Relocate #endif in nft_ct counter support, this doesn't have any
   relation with labels.

4) Fix TCPMSS target for IPv6 when skb has CHECKSUM_COMPLETE, from
   Eric Dumazet.

5) Fix netdevice notifier leak from the error path of nf_tables_netdev.

6) Make conntrack hashtable resizing safe by introducing a global lock
   and synchronizing all buckets to avoid going over the maximum number
   of preemption levels, from Sasha Levin.

You can pull these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Thanks!

----------------------------------------------------------------

The following changes since commit f1640c3ddeec12804bc9a21feee85fc15aca95f6:

  bgmac: fix a missing check for build_skb (2016-01-13 00:24:14 -0500)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git HEAD

for you to fetch changes up to b16c29191dc89bd877af99a7b04ce4866728a3e0:

  netfilter: nf_conntrack: use safer way to lock all buckets (2016-01-20 14:15:31 +0100)

----------------------------------------------------------------
Eric Dumazet (1):
      netfilter: xt_TCPMSS: handle CHECKSUM_COMPLETE in tcpmss_tg6()

Florian Westphal (2):
      netfilter: nft_byteorder: avoid unneeded le/be conversion steps
      netfilter: ipset: allow a 0 netmask with hash_netiface type

Pablo Neira Ayuso (2):
      netfilter: nft_ct: keep counters away from CONFIG_NF_CONNTRACK_LABELS
      netfilter: nf_tables_netdev: fix error path in module initialization

Sasha Levin (1):
      netfilter: nf_conntrack: use safer way to lock all buckets

 include/net/netfilter/nf_conntrack_core.h  |  8 +++----
 net/netfilter/ipset/ip_set_hash_netiface.c |  4 ----
 net/netfilter/nf_conntrack_core.c          | 38 ++++++++++++++++++++++--------
 net/netfilter/nf_conntrack_helper.c        |  2 +-
 net/netfilter/nf_conntrack_netlink.c       |  2 +-
 net/netfilter/nf_tables_netdev.c           |  8 +++----
 net/netfilter/nfnetlink_cttimeout.c        |  4 ++--
 net/netfilter/nft_byteorder.c              |  6 ++---
 net/netfilter/nft_ct.c                     |  2 +-
 net/netfilter/xt_TCPMSS.c                  |  9 +++++--
 10 files changed, 49 insertions(+), 34 deletions(-)

* Re: [PATCH 0/6] netfilter fixes for net
  2015-12-14 11:25 [PATCH 0/6] netfilter " Pablo Neira Ayuso
@ 2015-12-14 16:09 ` David Miller
  0 siblings, 0 replies; 44+ messages in thread
From: David Miller @ 2015-12-14 16:09 UTC (permalink / raw)
  To: pablo; +Cc: netfilter-devel, netdev

From: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Mon, 14 Dec 2015 12:25:40 +0100

> The following patchset contains Netfilter fixes for your net tree,
> specifically for nf_tables and nfnetlink_queue, they are:

Pulled, thanks a lot Pablo.

* [PATCH 0/6] netfilter fixes for net
@ 2015-12-14 11:25 Pablo Neira Ayuso
  2015-12-14 16:09 ` David Miller
  0 siblings, 1 reply; 44+ messages in thread
From: Pablo Neira Ayuso @ 2015-12-14 11:25 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

Hi David,

The following patchset contains Netfilter fixes for your net tree,
specifically for nf_tables and nfnetlink_queue, they are:

1) Avoid a compilation warning in nfnetlink_queue that was introduced
   in the previous merge window with the simplification of the conntrack
   integration, from Arnd Bergmann.

2) nfnetlink_queue is leaking the pernet subsystem registration from
   a failure path, patch from Nikolay Borisov.

3) Pass down netns pointer to batch callback in nfnetlink, this is the
   largest patch and it is not a bugfix but it is a dependency to
   resolve a splat in the correct way.

4) Fix a splat due to incorrect socket memory accounting with nfnetlink
   skbuff clones.

5) Add missing conntrack dependencies to NFT_DUP_IPV4 and NFT_DUP_IPV6.

6) Traverse the nftables commit list in reverse order from the abort
   path, otherwise we crash when the user applies an incremental update
   via 'nft -f' that deletes an object that was just introduced in this
   batch, from Xin Long.
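
The ordering issue in item 6 (the shortlog title names nf_tables_abort) can
be illustrated with a toy transaction log. This is only a simplified model in
Python under stated assumptions: Ruleset, abort and build are hypothetical
names, and the model covers additions only, whereas the real transaction list
also records deletions. The point is that objects created later in a batch
may depend on objects created earlier, so undoing must go newest-first:

```python
class Ruleset:
    """Toy object store: a child can only exist while its parent does."""

    def __init__(self):
        self.parent_of = {}  # object name -> parent name (or None)

    def add(self, name, parent=None):
        if parent is not None and parent not in self.parent_of:
            raise KeyError(f"parent {parent!r} does not exist")
        self.parent_of[name] = parent

    def remove(self, name):
        # Removing an object that still has children models the crash.
        if any(p == name for p in self.parent_of.values()):
            raise RuntimeError(f"{name!r} still has dependents")
        del self.parent_of[name]


def abort(ruleset, log, reverse=True):
    """Undo every logged addition; newest first when reverse=True."""
    for name in (reversed(log) if reverse else log):
        ruleset.remove(name)


def build():
    """Create table -> chain -> rule in one batch and log each addition."""
    rs, log = Ruleset(), []
    for name, parent in [("table", None), ("chain", "table"), ("rule", "chain")]:
        rs.add(name, parent)
        log.append(name)
    return rs, log
```

Calling abort(rs, log) on a freshly built batch empties the store, while
abort(rs, log, reverse=False) raises on the first step because the table
still has a chain attached.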

Regarding the compilation warning fix, many people have sent us (and
keep sending us) patches to address this, that's why I'm including it in
this batch even if it is not critical.

You can pull these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Thanks!

----------------------------------------------------------------

The following changes since commit 4c6980462f32b4f282c5d8e5f7ea8070e2937725:

  net: ip6mr: fix static mfc/dev leaks on table destruction (2015-11-22 20:44:47 -0500)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git HEAD

for you to fetch changes up to a907e36d54e0ff836e55e04531be201bf6b4d8c8:

  netfilter: nf_tables: use reverse traversal commit_list in nf_tables_abort (2015-12-13 22:47:32 +0100)

----------------------------------------------------------------
Arnd Bergmann (1):
      netfilter: nfnetlink_queue: avoid harmless unnitialized variable warnings

Nikolay Borisov (1):
      netfilter: nfnetlink_queue: Unregister pernet subsys in case of init failure

Pablo Neira Ayuso (3):
      netfilter: nfnetlink: avoid recurrent netns lookups in call_batch
      netfilter: nfnetlink: fix splat due to incorrect socket memory accounting in skbuff clones
      netfilter: nf_dup: add missing dependencies with NF_CONNTRACK

Xin Long (1):
      netfilter: nf_tables: use reverse traversal commit_list in nf_tables_abort

 include/linux/netfilter/nfnetlink.h |  2 +-
 net/ipv4/netfilter/Kconfig          |  1 +
 net/ipv6/netfilter/Kconfig          |  1 +
 net/netfilter/nf_tables_api.c       | 99 ++++++++++++++++++-------------------
 net/netfilter/nfnetlink.c           |  4 +-
 net/netfilter/nfnetlink_queue.c     |  9 ++--
 6 files changed, 57 insertions(+), 59 deletions(-)

* Re: [PATCH 0/6] Netfilter fixes for net
  2015-09-03  9:50 [PATCH 0/6] Netfilter " Pablo Neira Ayuso
@ 2015-09-06  4:59 ` David Miller
  0 siblings, 0 replies; 44+ messages in thread
From: David Miller @ 2015-09-06  4:59 UTC (permalink / raw)
  To: pablo; +Cc: netfilter-devel, netdev

From: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Thu,  3 Sep 2015 11:50:55 +0200

> The following patchset contains Netfilter fixes for net, they are:
> 
> 1) Oneliner to restore maps in nf_tables since we support addressing registers
>    at 32 bits level.
> 
> 2) Restore previous default behaviour in bridge netfilter when CONFIG_IPV6=n,
>    oneliner from Bernhard Thaler.
> 
> 3) Out of bound access in ipset hash:net* set types, reported by Dave Jones'
>    KASan utility, patch from Jozsef Kadlecsik.
> 
> 4) Fix ipset compilation with gcc 4.4.7 related to C99 initialization of
>    unnamed unions, patch from Elad Raz.
> 
> 5) Add a workaround to address inconsistent endianness in the res_id field of
>    nfnetlink batch messages, reported by Florian Westphal.
> 
> 6) Fix error paths of CT/synproxy since the conntrack template was moved to use
>    kmalloc, patch from Daniel Borkmann.
> 
> All of them look good to me to reach 4.2, I can route this to -stable myself
> too, just let me know what you prefer.
> 
> You can pull these changes from:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Pulled, there was a merge conflict, please verify that I resolved it
correctly.

Thanks.

* [PATCH 0/6] Netfilter fixes for net
@ 2015-09-03  9:50 Pablo Neira Ayuso
  2015-09-06  4:59 ` David Miller
  0 siblings, 1 reply; 44+ messages in thread
From: Pablo Neira Ayuso @ 2015-09-03  9:50 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

Hi David,

The following patchset contains Netfilter fixes for net, they are:

1) Oneliner to restore maps in nf_tables since we support addressing registers
   at 32 bits level.

2) Restore previous default behaviour in bridge netfilter when CONFIG_IPV6=n,
   oneliner from Bernhard Thaler.

3) Out of bound access in ipset hash:net* set types, reported by Dave Jones'
   KASan utility, patch from Jozsef Kadlecsik.

4) Fix ipset compilation with gcc 4.4.7 related to C99 initialization of
   unnamed unions, patch from Elad Raz.

5) Add a workaround to address inconsistent endianness in the res_id field of
   nfnetlink batch messages, reported by Florian Westphal.

6) Fix error paths of CT/synproxy since the conntrack template was moved to use
   kmalloc, patch from Daniel Borkmann.
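
The byte-order mismatch behind item 5 can be reproduced in a few lines. This
is a hedged Python sketch, not the kernel workaround itself: the value 10 for
NFNL_SUBSYS_NFTABLES is assumed from the nfnetlink uapi header, and
parse_res_id_tolerant is a hypothetical helper showing one way a receiver
could accept either encoding of a single known value:

```python
import struct

NFNL_SUBSYS_NFTABLES = 10  # value assumed from the nfnetlink uapi header


def parse_res_id(raw):
    """Decode the 16-bit res_id as network byte order (big-endian)."""
    return struct.unpack("!H", raw)[0]


def parse_res_id_tolerant(raw, expected=NFNL_SUBSYS_NFTABLES):
    """Hypothetical workaround: accept either byte order for one known value."""
    big = struct.unpack(">H", raw)[0]
    little = struct.unpack("<H", raw)[0]
    return expected if expected in (big, little) else big


well_formed = struct.pack("!H", NFNL_SUBSYS_NFTABLES)  # b"\x00\x0a"
swapped = struct.pack("<H", NFNL_SUBSYS_NFTABLES)      # b"\x0a\x00"
```

A strict parser reads the swapped bytes as 2560 (0x0a00) rather than 10,
which is why a sender using the wrong byte order ends up addressing a
subsystem that does not exist.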

All of them look good to me to reach 4.2, I can route this to -stable myself
too, just let me know what you prefer.

You can pull these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Thanks!

----------------------------------------------------------------

The following changes since commit fd7dec25a18f495e50d2040398fd263836ff3b28:

  batman-adv: Fix memory leak on tt add with invalid vlan (2015-08-18 19:08:23 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git master

for you to fetch changes up to 9cf94eab8b309e8bcc78b41dd1561c75b537dd0b:

  netfilter: conntrack: use nf_ct_tmpl_free in CT/synproxy error paths (2015-09-01 12:15:08 +0200)

----------------------------------------------------------------
Bernhard Thaler (1):
      netfilter: bridge: fix IPv6 packets not being bridged with CONFIG_IPV6=n

Daniel Borkmann (1):
      netfilter: conntrack: use nf_ct_tmpl_free in CT/synproxy error paths

Elad Raz (1):
      netfilter: ipset: Fixing unnamed union init

Jozsef Kadlecsik (1):
      netfilter: ipset: Out of bound access in hash:net* types fixed

Pablo Neira Ayuso (2):
      netfilter: nf_tables: Use 32 bit addressing register from nft_type_to_reg()
      netfilter: nfnetlink: work around wrong endianess in res_id field

 include/net/netfilter/br_netfilter.h         |    2 +-
 include/net/netfilter/nf_conntrack.h         |    1 +
 include/net/netfilter/nf_tables.h            |    2 +-
 net/netfilter/ipset/ip_set_hash_gen.h        |   12 ++++++++----
 net/netfilter/ipset/ip_set_hash_netnet.c     |   20 ++++++++++++++++++--
 net/netfilter/ipset/ip_set_hash_netportnet.c |   20 ++++++++++++++++++--
 net/netfilter/nf_conntrack_core.c            |    3 ++-
 net/netfilter/nf_synproxy_core.c             |    2 +-
 net/netfilter/nfnetlink.c                    |    8 +++++++-
 net/netfilter/xt_CT.c                        |    2 +-
 10 files changed, 58 insertions(+), 14 deletions(-)

* Re: [PATCH 0/6] Netfilter fixes for net
  2015-03-22 18:46 Pablo Neira Ayuso
@ 2015-03-22 20:57 ` David Miller
  0 siblings, 0 replies; 44+ messages in thread
From: David Miller @ 2015-03-22 20:57 UTC (permalink / raw)
  To: pablo; +Cc: netfilter-devel, netdev

From: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Sun, 22 Mar 2015 19:46:32 +0100

> The following patchset contains Netfilter fixes for your net tree,
> they are:
> 
> 1) Fix missing initialization of tuple structure in nfnetlink_cthelper
>    to avoid mismatches when looking up to attach userspace helpers to
>    flows, from Ian Wilson.
> 
> 2) Fix potential crash in nft_hash when we hit -EAGAIN in
>    nft_hash_walk(), from Herbert Xu.
> 
> 3) We don't need to indicate the hook information to update the
>    basechain default policy in nf_tables.
> 
> 4) Restore tracing over nfnetlink_log due to recent rework to
>    accommodate logging infrastructure into nf_tables.
> 
> 5) Fix wrong IP6T_INV_PROTO check in xt_TPROXY.
> 
> 6) Set IP6T_F_PROTO flag in nft_compat so we can use SYNPROXY6 and
>    REJECT6 from xt over nftables.

Pulled, thanks Pablo.

* [PATCH 0/6] Netfilter fixes for net
@ 2015-03-22 18:46 Pablo Neira Ayuso
  2015-03-22 20:57 ` David Miller
  0 siblings, 1 reply; 44+ messages in thread
From: Pablo Neira Ayuso @ 2015-03-22 18:46 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

Hi David,

The following patchset contains Netfilter fixes for your net tree,
they are:

1) Fix missing initialization of tuple structure in nfnetlink_cthelper
   to avoid mismatches when looking up to attach userspace helpers to
   flows, from Ian Wilson.

2) Fix potential crash in nft_hash when we hit -EAGAIN in
   nft_hash_walk(), from Herbert Xu.

3) We don't need to indicate the hook information to update the
   basechain default policy in nf_tables.

4) Restore tracing over nfnetlink_log due to recent rework to
   accommodate logging infrastructure into nf_tables.

5) Fix wrong IP6T_INV_PROTO check in xt_TPROXY.

6) Set IP6T_F_PROTO flag in nft_compat so we can use SYNPROXY6 and
   REJECT6 from xt over nftables.

You can pull these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Thanks!

----------------------------------------------------------------

The following changes since commit 4363890079674db7b00cf1bb0e6fa430e846e86b:

  net: Handle unregister properly when netdev namespace change fails. (2015-03-10 21:59:46 -0400)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git master

for you to fetch changes up to 749177ccc74f9c6d0f51bd78a15c652a2134aa11:

  netfilter: nft_compat: set IP6T_F_PROTO flag if protocol is set (2015-03-22 19:32:05 +0100)

----------------------------------------------------------------
Herbert Xu (1):
      netfilter: Fix potential crash in nft_hash walker

Ian Wilson (1):
      netfilter: Zero the tuple in nfnl_cthelper_parse_tuple()

Pablo Neira Ayuso (4):
      netfilter: nf_tables: allow to change chain policy without hook if it exists
      netfilter: restore rule tracing via nfnetlink_log
      netfilter: xt_TPROXY: fix invflags check in tproxy_tg6_check()
      netfilter: nft_compat: set IP6T_F_PROTO flag if protocol is set

 include/net/netfilter/nf_log.h     |   10 ++++++++++
 net/ipv4/netfilter/ip_tables.c     |    6 +++---
 net/ipv6/netfilter/ip6_tables.c    |    6 +++---
 net/netfilter/nf_log.c             |   24 ++++++++++++++++++++++++
 net/netfilter/nf_tables_api.c      |    5 ++++-
 net/netfilter/nf_tables_core.c     |    8 ++++----
 net/netfilter/nfnetlink_cthelper.c |    3 +++
 net/netfilter/nft_compat.c         |    6 ++++++
 net/netfilter/nft_hash.c           |    2 ++
 net/netfilter/xt_TPROXY.c          |    4 ++--
 10 files changed, 61 insertions(+), 13 deletions(-)

* Re: [PATCH 0/6] Netfilter fixes for net
  2014-05-09 10:56 Pablo Neira Ayuso
@ 2014-05-09 17:17 ` David Miller
  0 siblings, 0 replies; 44+ messages in thread
From: David Miller @ 2014-05-09 17:17 UTC (permalink / raw)
  To: pablo; +Cc: netfilter-devel, netdev

From: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Fri,  9 May 2014 12:56:01 +0200

> The following batch contains netfilter fixes for your net tree, they are:
> 
> 1) Fix use after free in nfnetlink when sending a batch for some
>    unsupported subsystem, from Denys Fedoryshchenko.
> 
> 2) Skip autoload of the nat module if no binding is specified via
>    ctnetlink, from Florian Westphal.
> 
> 3) Set local_df after netfilter defragmentation to avoid a bogus ICMP
>    fragmentation needed in the forwarding path, also from Florian.
> 
> 4) Fix potential use after free in ip6_route_me_harder() when returning
>    the error code to the upper layers, from Sergey Popovich.
> 
> 5) Skip possible bogus ICMP time exceeded emitted from the router (not
>    valid according to RFC) if conntrack zones are used, from Vasily Averin.
> 
> 6) Fix fragment handling when nf_defrag_ipv4 is loaded but nf_conntrack
>    is not present, also from Vasily.

Pulled, thanks a lot Pablo.


* [PATCH 0/6] Netfilter fixes for net
@ 2014-05-09 10:56 Pablo Neira Ayuso
  2014-05-09 17:17 ` David Miller
  0 siblings, 1 reply; 44+ messages in thread
From: Pablo Neira Ayuso @ 2014-05-09 10:56 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

Hi David,

The following batch contains netfilter fixes for your net tree, they are:

1) Fix use after free in nfnetlink when sending a batch for some
   unsupported subsystem, from Denys Fedoryshchenko.

2) Skip autoload of the nat module if no binding is specified via
   ctnetlink, from Florian Westphal.

3) Set local_df after netfilter defragmentation to avoid a bogus ICMP
   fragmentation needed in the forwarding path, also from Florian.

4) Fix potential use after free in ip6_route_me_harder() when returning
   the error code to the upper layers, from Sergey Popovich.

5) Skip possible bogus ICMP time exceeded emitted from the router (not
   valid according to RFC) if conntrack zones are used, from Vasily Averin.

6) Fix fragment handling when nf_defrag_ipv4 is loaded but nf_conntrack
   is not present, also from Vasily.

You can pull these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Thanks!

----------------------------------------------------------------

The following changes since commit 014f1b20108dc2c0bb0777d8383654a089c790f8:

  net: bonding: Fix format string mismatch in bond_sysfs.c (2014-04-28 14:48:16 -0400)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git master

for you to fetch changes up to a8951d5814e1373807a94f79f7ccec7041325470:

  netfilter: Fix potential use after free in ip6_route_me_harder() (2014-05-09 02:36:39 +0200)

----------------------------------------------------------------
Denys Fedoryshchenko (1):
      netfilter: nfnetlink: Fix use after free when it fails to process batch

Florian Westphal (2):
      netfilter: ctnetlink: don't add null bindings if no nat requested
      netfilter: ipv4: defrag: set local_df flag on defragmented skb

Sergey Popovich (1):
      netfilter: Fix potential use after free in ip6_route_me_harder()

Vasily Averin (2):
      ipv4: fix "conntrack zones" support for defrag user check in ip_expire
      bridge: superfluous skb->nfct check in br_nf_dev_queue_xmit

 net/bridge/br_netfilter.c            |    4 ++--
 net/ipv4/ip_fragment.c               |    5 +++--
 net/ipv4/netfilter/nf_defrag_ipv4.c  |    5 +++--
 net/ipv6/netfilter.c                 |    6 ++++--
 net/netfilter/nf_conntrack_netlink.c |    3 +++
 net/netfilter/nfnetlink.c            |    8 ++++----
 6 files changed, 19 insertions(+), 12 deletions(-)


* Re: [PATCH 0/6] Netfilter fixes for net
  2014-02-19 11:41 Pablo Neira Ayuso
@ 2014-02-19 18:16 ` David Miller
  0 siblings, 0 replies; 44+ messages in thread
From: David Miller @ 2014-02-19 18:16 UTC (permalink / raw)
  To: pablo; +Cc: netfilter-devel, netdev

From: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Wed, 19 Feb 2014 12:41:36 +0100

> The following patchset contains Netfilter fixes for your net tree,
> they are:
> 
> * Fix nf_trace in nftables if XT_TRACE=n, from Florian Westphal.
> 
> * Don't use the fast payload operation in nf_tables if the length is
>   not a power of 2 or it is not aligned, from Nikolay Aleksandrov.
> 
> * Fix missing break statement in the inet flavour of nft_reject, which
>   results in evaluating IPv4 packets with the IPv6 evaluation routine,
>   from Patrick McHardy.
> 
> * Fix wrong kconfig symbol in nft_meta to match the routing realm,
>   from Paul Bolle.
> 
> * Allocate the NAT null binding when creating new conntracks via
>   ctnetlink to avoid that several packets race at initializing the
>   conntrack NAT extension, original patch from Florian Westphal,
>   revisited version from me.
> 
> * Fix DNAT handling in the snmp NAT helper, the same handling was being
>   done for SNAT and DNAT and 2.4 already contains that fix, from
>   Francois-Xavier Le Bail.
> 
> You can pull these changes from:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git master

Pulled, thanks a lot Pablo.


* [PATCH 0/6] Netfilter fixes for net
@ 2014-02-19 11:41 Pablo Neira Ayuso
  2014-02-19 18:16 ` David Miller
  0 siblings, 1 reply; 44+ messages in thread
From: Pablo Neira Ayuso @ 2014-02-19 11:41 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

Hi David,

The following patchset contains Netfilter fixes for your net tree,
they are:

* Fix nf_trace in nftables if XT_TRACE=n, from Florian Westphal.

* Don't use the fast payload operation in nf_tables if the length is
  not a power of 2 or it is not aligned, from Nikolay Aleksandrov.

* Fix missing break statement in the inet flavour of nft_reject, which
  results in evaluating IPv4 packets with the IPv6 evaluation routine,
  from Patrick McHardy.

* Fix wrong kconfig symbol in nft_meta to match the routing realm,
  from Paul Bolle.

* Allocate the NAT null binding when creating new conntracks via
  ctnetlink to avoid that several packets race at initializing the
  conntrack NAT extension, original patch from Florian Westphal,
  revisited version from me.

* Fix DNAT handling in the snmp NAT helper, the same handling was being
  done for SNAT and DNAT and 2.4 already contains that fix, from
  Francois-Xavier Le Bail.
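
As a rough illustration of the payload check in the second item above,
the condition can be sketched in userspace C as follows. This is only a
sketch under assumptions, not the code from the patch: the helper names
is_pow2() and payload_fast_path_ok() are hypothetical, and the kernel
uses its own helpers for these tests.

```c
#include <stdbool.h>

/* Hypothetical helper: true if n is a non-zero power of 2. */
static bool is_pow2(unsigned int n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Hypothetical helper: the fast payload path is only taken when the
 * length is a power of 2 and the offset is aligned to that length.
 * is_pow2() is evaluated first, so len == 0 never reaches the modulo.
 */
static bool payload_fast_path_ok(unsigned int offset, unsigned int len)
{
	return is_pow2(len) && (offset % len) == 0;
}
```

With this sketch, a 4-byte load at offset 12 qualifies for the fast
path, while a 3-byte load at offset 12 or a 4-byte load at offset 2
would fall back to the generic path.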

You can pull these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git master

Thanks!

----------------------------------------------------------------

The following changes since commit 20e7c4e80dcd01dad5e6c8b32455228b8fe9c619:

  6lowpan: fix lockdep splats (2014-02-10 17:51:29 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git master

for you to fetch changes up to 0eba801b64cc8284d9024c7ece30415a2b981a72:

  netfilter: ctnetlink: force null nat binding on insert (2014-02-18 00:13:51 +0100)

----------------------------------------------------------------
FX Le Bail (1):
      netfilter: nf_nat_snmp_basic: fix duplicates in if/else branches

Florian Westphal (1):
      netfilter: nf_tables: fix nf_trace always-on with XT_TRACE=n

Nikolay Aleksandrov (1):
      netfilter: nf_tables: check if payload length is a power of 2

Pablo Neira Ayuso (1):
      netfilter: ctnetlink: force null nat binding on insert

Patrick McHardy (1):
      netfilter: nft_reject_inet: fix unintended fall-through in switch-statatement

Paul Bolle (1):
      netfilter: nft_meta: fix typo "CONFIG_NET_CLS_ROUTE"

 include/linux/skbuff.h                 |    5 ++-
 net/core/skbuff.c                      |    3 --
 net/ipv4/ip_output.c                   |    3 --
 net/ipv4/netfilter/nf_nat_snmp_basic.c |    4 +--
 net/ipv6/ip6_output.c                  |    3 --
 net/netfilter/nf_conntrack_netlink.c   |   35 ++++++++------------
 net/netfilter/nf_nat_core.c            |   56 ++++++++++++++++++++------------
 net/netfilter/nft_meta.c               |    4 +--
 net/netfilter/nft_payload.c            |    3 +-
 net/netfilter/nft_reject_inet.c        |    4 +--
 10 files changed, 61 insertions(+), 59 deletions(-)


Thread overview: 44+ messages
2020-02-26 22:54 [PATCH 0/6] Netfilter fixes for net Pablo Neira Ayuso
2020-02-26 22:54 ` [PATCH 1/6] netfilter: ipset: Fix "INFO: rcu detected stall in hash_xxx" reports Pablo Neira Ayuso
2020-02-26 22:54 ` [PATCH 2/6] netfilter: ipset: Fix forceadd evaluation path Pablo Neira Ayuso
2020-02-26 22:54 ` [PATCH 3/6] selftests: nft_concat_range: Move option for 'list ruleset' before command Pablo Neira Ayuso
2020-02-26 22:54 ` [PATCH 4/6] nft_set_pipapo: Actually fetch key data in nft_pipapo_remove() Pablo Neira Ayuso
2020-02-26 22:54 ` [PATCH 5/6] selftests: nft_concat_range: Add test for reported add/flush/add issue Pablo Neira Ayuso
2020-02-26 22:54 ` [PATCH 6/6] netfilter: xt_hashlimit: unregister proc file before releasing mutex Pablo Neira Ayuso
2020-02-27  0:32 ` [PATCH 0/6] Netfilter fixes for net David Miller
  -- strict thread matches above, loose matches on Subject: below --
2020-08-24 11:39 Pablo Neira Ayuso
2020-08-24 13:37 ` David Miller
2020-05-14 12:19 Pablo Neira Ayuso
2020-05-14 20:15 ` David Miller
2020-01-31 19:24 Pablo Neira Ayuso
2020-02-01 20:59 ` Jakub Kicinski
2019-02-05 19:04 Pablo Neira Ayuso
2019-02-05 19:23 ` David Miller
2018-10-01 22:37 Pablo Neira Ayuso
2018-10-01 22:41 ` David Miller
2018-07-09 17:18 Pablo Neira Ayuso
2018-07-09 21:24 ` David Miller
2018-06-27 15:22 Pablo Neira Ayuso
2018-06-28  4:33 ` David Miller
2018-02-01 18:02 Pablo Neira Ayuso
2018-02-01 19:45 ` David Miller
2017-02-27 11:35 Pablo Neira Ayuso
2017-02-27 14:19 ` David Miller
2017-01-05 11:19 Pablo Neira Ayuso
2017-01-05 16:52 ` David Miller
2016-08-18 17:29 Pablo Neira Ayuso
2016-08-19  1:49 ` David Miller
2016-02-16 17:02 Pablo Neira Ayuso
2016-02-16 17:56 ` David Miller
2016-01-20 17:03 Pablo Neira Ayuso
2016-01-21  2:57 ` David Miller
2015-12-14 11:25 [PATCH 0/6] netfilter " Pablo Neira Ayuso
2015-12-14 16:09 ` David Miller
2015-09-03  9:50 [PATCH 0/6] Netfilter " Pablo Neira Ayuso
2015-09-06  4:59 ` David Miller
2015-03-22 18:46 Pablo Neira Ayuso
2015-03-22 20:57 ` David Miller
2014-05-09 10:56 Pablo Neira Ayuso
2014-05-09 17:17 ` David Miller
2014-02-19 11:41 Pablo Neira Ayuso
2014-02-19 18:16 ` David Miller
