All of lore.kernel.org
 help / color / mirror / Atom feed
* Subject: [PATCH ipsec-next 0/8] xfrm: policy: convert lookups to rcu
@ 2016-08-11 13:17 Florian Westphal
  2016-08-11 13:17 ` [PATCH ipsec-next 1/8] xfrm: policy: use rcu versions for iteration and list add/del Florian Westphal
                   ` (8 more replies)
  0 siblings, 9 replies; 16+ messages in thread
From: Florian Westphal @ 2016-08-11 13:17 UTC (permalink / raw)
  To: netdev

Since commit d188ba86dd07a72eb ("xfrm: add rcu protection to sk->sk_policy[]")
sk_policy can rely on rcu protection.

This change allows to also use rcu in xfrm_policy_lookup_bytype and to
avoid grabbing the read-sie policy lock during lookups.

read-side policy rwlock is converted to pure rcu, then the policy_lock is
changed to a plain spinlock as only write-sides remain.

First few patches do some preparation work, the later ones remove
the read-side locks, last patch converts rwlock to spinlock.

I tested this with rcu debug enabled and simple esp tunnel
forwarding udp packets.

If you have better tests for this please let me know and I can re-run with
that.

Florian Westphal (8):
      xfrm: policy: use rcu versions for iteration and list add/del
      xfrm: policy: prepare policy_bydst hash for rcu lookups
      xfrm: policy: add sequence count to synchronize reads with hash resizes
      xfrm: policy: use atomic_inc_not_zero in rcu section
      xfrm: policy: make xfrm_policy_lookup_bytype lockless
      xfrm: policy: only use rcu in xfrm_sk_policy_lookup
      xfrm: policy: don't acquire policy lock in xfrm_spd_getinfo
      xfrm: policy: convert policy_lock to spinlock

 include/net/netns/xfrm.h |    4 -
 net/xfrm/xfrm_policy.c   |  145 +++++++++++++++++++++++++++--------------------
 2 files changed, 88 insertions(+), 61 deletions(-)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH ipsec-next 1/8] xfrm: policy: use rcu versions for iteration and list add/del
  2016-08-11 13:17 Subject: [PATCH ipsec-next 0/8] xfrm: policy: convert lookups to rcu Florian Westphal
@ 2016-08-11 13:17 ` Florian Westphal
  2016-08-11 13:17 ` [PATCH ipsec-next 2/8] xfrm: policy: prepare policy_bydst hash for rcu lookups Florian Westphal
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Florian Westphal @ 2016-08-11 13:17 UTC (permalink / raw)
  To: netdev; +Cc: Florian Westphal

This is required once we allow lockless readers.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 net/xfrm/xfrm_policy.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index b5e665b..538685c 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -426,14 +426,14 @@ redo:
 		h = __addr_hash(&pol->selector.daddr, &pol->selector.saddr,
 				pol->family, nhashmask, dbits, sbits);
 		if (!entry0) {
-			hlist_del(&pol->bydst);
-			hlist_add_head(&pol->bydst, ndsttable+h);
+			hlist_del_rcu(&pol->bydst);
+			hlist_add_head_rcu(&pol->bydst, ndsttable + h);
 			h0 = h;
 		} else {
 			if (h != h0)
 				continue;
-			hlist_del(&pol->bydst);
-			hlist_add_behind(&pol->bydst, entry0);
+			hlist_del_rcu(&pol->bydst);
+			hlist_add_behind_rcu(&pol->bydst, entry0);
 		}
 		entry0 = &pol->bydst;
 	}
@@ -1106,7 +1106,7 @@ static struct xfrm_policy *xfrm_policy_lookup_bytype(struct net *net, u8 type,
 	read_lock_bh(&net->xfrm.xfrm_policy_lock);
 	chain = policy_hash_direct(net, daddr, saddr, family, dir);
 	ret = NULL;
-	hlist_for_each_entry(pol, chain, bydst) {
+	hlist_for_each_entry_rcu(pol, chain, bydst) {
 		err = xfrm_policy_match(pol, fl, type, family, dir);
 		if (err) {
 			if (err == -ESRCH)
@@ -1122,7 +1122,7 @@ static struct xfrm_policy *xfrm_policy_lookup_bytype(struct net *net, u8 type,
 		}
 	}
 	chain = &net->xfrm.policy_inexact[dir];
-	hlist_for_each_entry(pol, chain, bydst) {
+	hlist_for_each_entry_rcu(pol, chain, bydst) {
 		if ((pol->priority >= priority) && ret)
 			break;
 
@@ -1271,7 +1271,7 @@ static struct xfrm_policy *__xfrm_policy_unlink(struct xfrm_policy *pol,
 
 	/* Socket policies are not hashed. */
 	if (!hlist_unhashed(&pol->bydst)) {
-		hlist_del(&pol->bydst);
+		hlist_del_rcu(&pol->bydst);
 		hlist_del(&pol->byidx);
 	}
 
-- 
2.7.3

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH ipsec-next 2/8] xfrm: policy: prepare policy_bydst hash for rcu lookups
  2016-08-11 13:17 Subject: [PATCH ipsec-next 0/8] xfrm: policy: convert lookups to rcu Florian Westphal
  2016-08-11 13:17 ` [PATCH ipsec-next 1/8] xfrm: policy: use rcu versions for iteration and list add/del Florian Westphal
@ 2016-08-11 13:17 ` Florian Westphal
  2016-08-11 13:17 ` [PATCH ipsec-next 3/8] xfrm: policy: add sequence count to sync with hash resize Florian Westphal
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Florian Westphal @ 2016-08-11 13:17 UTC (permalink / raw)
  To: netdev; +Cc: Florian Westphal

Since commit 56f047305dd4b6b617
("xfrm: add rcu grace period in xfrm_policy_destroy()") xfrm policy
objects are already free'd via rcu.

In order to make more places lockless (i.e. use rcu_read_lock instead of
grabbing read-side of policy rwlock) we only need to:

- use rcu_assign_pointer to store address of new hash table backend memory
- add rcu barrier so that freeing of old memory is delayed (expansion
  and free happens from system workqueue, so synchronize_rcu is fine)
- use rcu_dereference to fetch current address of the hash table.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 net/xfrm/xfrm_policy.c | 20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index 538685c..362bdb4 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -385,9 +385,11 @@ static struct hlist_head *policy_hash_bysel(struct net *net,
 	__get_hash_thresh(net, family, dir, &dbits, &sbits);
 	hash = __sel_hash(sel, family, hmask, dbits, sbits);
 
-	return (hash == hmask + 1 ?
-		&net->xfrm.policy_inexact[dir] :
-		net->xfrm.policy_bydst[dir].table + hash);
+	if (hash == hmask + 1)
+		return &net->xfrm.policy_inexact[dir];
+
+	return rcu_dereference_check(net->xfrm.policy_bydst[dir].table,
+		     lockdep_is_held(&net->xfrm.xfrm_policy_lock)) + hash;
 }
 
 static struct hlist_head *policy_hash_direct(struct net *net,
@@ -403,7 +405,8 @@ static struct hlist_head *policy_hash_direct(struct net *net,
 	__get_hash_thresh(net, family, dir, &dbits, &sbits);
 	hash = __addr_hash(daddr, saddr, family, hmask, dbits, sbits);
 
-	return net->xfrm.policy_bydst[dir].table + hash;
+	return rcu_dereference_check(net->xfrm.policy_bydst[dir].table,
+		     lockdep_is_held(&net->xfrm.xfrm_policy_lock)) + hash;
 }
 
 static void xfrm_dst_hash_transfer(struct net *net,
@@ -468,8 +471,8 @@ static void xfrm_bydst_resize(struct net *net, int dir)
 	unsigned int hmask = net->xfrm.policy_bydst[dir].hmask;
 	unsigned int nhashmask = xfrm_new_hash_mask(hmask);
 	unsigned int nsize = (nhashmask + 1) * sizeof(struct hlist_head);
-	struct hlist_head *odst = net->xfrm.policy_bydst[dir].table;
 	struct hlist_head *ndst = xfrm_hash_alloc(nsize);
+	struct hlist_head *odst;
 	int i;
 
 	if (!ndst)
@@ -477,14 +480,19 @@ static void xfrm_bydst_resize(struct net *net, int dir)
 
 	write_lock_bh(&net->xfrm.xfrm_policy_lock);
 
+	odst = rcu_dereference_protected(net->xfrm.policy_bydst[dir].table,
+				lockdep_is_held(&net->xfrm.xfrm_policy_lock));
+
 	for (i = hmask; i >= 0; i--)
 		xfrm_dst_hash_transfer(net, odst + i, ndst, nhashmask, dir);
 
-	net->xfrm.policy_bydst[dir].table = ndst;
+	rcu_assign_pointer(net->xfrm.policy_bydst[dir].table, ndst);
 	net->xfrm.policy_bydst[dir].hmask = nhashmask;
 
 	write_unlock_bh(&net->xfrm.xfrm_policy_lock);
 
+	synchronize_rcu();
+
 	xfrm_hash_free(odst, (hmask + 1) * sizeof(struct hlist_head));
 }
 
-- 
2.7.3

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH ipsec-next 3/8] xfrm: policy: add sequence count to sync with hash resize
  2016-08-11 13:17 Subject: [PATCH ipsec-next 0/8] xfrm: policy: convert lookups to rcu Florian Westphal
  2016-08-11 13:17 ` [PATCH ipsec-next 1/8] xfrm: policy: use rcu versions for iteration and list add/del Florian Westphal
  2016-08-11 13:17 ` [PATCH ipsec-next 2/8] xfrm: policy: prepare policy_bydst hash for rcu lookups Florian Westphal
@ 2016-08-11 13:17 ` Florian Westphal
  2016-08-11 13:17 ` [PATCH ipsec-next 4/8] xfrm: policy: use atomic_inc_not_zero in rcu section Florian Westphal
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Florian Westphal @ 2016-08-11 13:17 UTC (permalink / raw)
  To: netdev; +Cc: Florian Westphal

Once xfrm_policy_lookup_bytype doesn't grab xfrm_policy_lock anymore its
possible for a hash resize to occur in parallel.

Use sequence counter to block lookup in case a resize is in
progress and to also re-lookup in case hash table was altered
in the mean time (might cause use to not find the best-match).

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 net/xfrm/xfrm_policy.c | 21 +++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index 362bdb4..01c530e 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -49,6 +49,7 @@ static struct xfrm_policy_afinfo __rcu *xfrm_policy_afinfo[NPROTO]
 						__read_mostly;
 
 static struct kmem_cache *xfrm_dst_cache __read_mostly;
+static __read_mostly seqcount_t xfrm_policy_hash_generation;
 
 static void xfrm_init_pmtu(struct dst_entry *dst);
 static int stale_bundle(struct dst_entry *dst);
@@ -479,6 +480,10 @@ static void xfrm_bydst_resize(struct net *net, int dir)
 		return;
 
 	write_lock_bh(&net->xfrm.xfrm_policy_lock);
+	write_seqcount_begin(&xfrm_policy_hash_generation);
+
+	odst = rcu_dereference_protected(net->xfrm.policy_bydst[dir].table,
+				lockdep_is_held(&net->xfrm.xfrm_policy_lock));
 
 	odst = rcu_dereference_protected(net->xfrm.policy_bydst[dir].table,
 				lockdep_is_held(&net->xfrm.xfrm_policy_lock));
@@ -489,6 +494,7 @@ static void xfrm_bydst_resize(struct net *net, int dir)
 	rcu_assign_pointer(net->xfrm.policy_bydst[dir].table, ndst);
 	net->xfrm.policy_bydst[dir].hmask = nhashmask;
 
+	write_seqcount_end(&xfrm_policy_hash_generation);
 	write_unlock_bh(&net->xfrm.xfrm_policy_lock);
 
 	synchronize_rcu();
@@ -1104,7 +1110,8 @@ static struct xfrm_policy *xfrm_policy_lookup_bytype(struct net *net, u8 type,
 	struct xfrm_policy *pol, *ret;
 	const xfrm_address_t *daddr, *saddr;
 	struct hlist_head *chain;
-	u32 priority = ~0U;
+	unsigned int sequence;
+	u32 priority;
 
 	daddr = xfrm_flowi_daddr(fl, family);
 	saddr = xfrm_flowi_saddr(fl, family);
@@ -1112,7 +1119,13 @@ static struct xfrm_policy *xfrm_policy_lookup_bytype(struct net *net, u8 type,
 		return NULL;
 
 	read_lock_bh(&net->xfrm.xfrm_policy_lock);
-	chain = policy_hash_direct(net, daddr, saddr, family, dir);
+ retry:
+	do {
+		sequence = read_seqcount_begin(&xfrm_policy_hash_generation);
+		chain = policy_hash_direct(net, daddr, saddr, family, dir);
+	} while (read_seqcount_retry(&xfrm_policy_hash_generation, sequence));
+
+	priority = ~0U;
 	ret = NULL;
 	hlist_for_each_entry_rcu(pol, chain, bydst) {
 		err = xfrm_policy_match(pol, fl, type, family, dir);
@@ -1148,6 +1161,9 @@ static struct xfrm_policy *xfrm_policy_lookup_bytype(struct net *net, u8 type,
 		}
 	}
 
+	if (read_seqcount_retry(&xfrm_policy_hash_generation, sequence))
+		goto retry;
+
 	xfrm_pol_hold(ret);
 fail:
 	read_unlock_bh(&net->xfrm.xfrm_policy_lock);
@@ -3090,6 +3106,7 @@ static struct pernet_operations __net_initdata xfrm_net_ops = {
 void __init xfrm_init(void)
 {
 	register_pernet_subsys(&xfrm_net_ops);
+	seqcount_init(&xfrm_policy_hash_generation);
 	xfrm_input_init();
 }
 
-- 
2.7.3

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH ipsec-next 4/8] xfrm: policy: use atomic_inc_not_zero in rcu section
  2016-08-11 13:17 Subject: [PATCH ipsec-next 0/8] xfrm: policy: convert lookups to rcu Florian Westphal
                   ` (2 preceding siblings ...)
  2016-08-11 13:17 ` [PATCH ipsec-next 3/8] xfrm: policy: add sequence count to sync with hash resize Florian Westphal
@ 2016-08-11 13:17 ` Florian Westphal
  2016-08-11 13:17 ` [PATCH ipsec-next 5/8] xfrm: policy: make xfrm_policy_lookup_bytype lockless Florian Westphal
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Florian Westphal @ 2016-08-11 13:17 UTC (permalink / raw)
  To: netdev; +Cc: Florian Westphal

If we don't hold the policy lock anymore the refcnt might
already be 0, i.e. policy struct is about to be free'd.

Switch to atomic_inc_not_zero to avoid this.

On removal policies are already unlinked from the tables (lists)
before the last _put occurs so we are not supposed to find the same
'dead' entry on the next loop, so its safe to just repeat the lookup.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 net/xfrm/xfrm_policy.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index 01c530e..cbc9005 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -60,6 +60,11 @@ static void __xfrm_policy_link(struct xfrm_policy *pol, int dir);
 static struct xfrm_policy *__xfrm_policy_unlink(struct xfrm_policy *pol,
 						int dir);
 
+static inline bool xfrm_pol_hold_rcu(struct xfrm_policy *policy)
+{
+	return atomic_inc_not_zero(&policy->refcnt);
+}
+
 static inline bool
 __xfrm4_selector_match(const struct xfrm_selector *sel, const struct flowi *fl)
 {
@@ -1164,7 +1169,8 @@ static struct xfrm_policy *xfrm_policy_lookup_bytype(struct net *net, u8 type,
 	if (read_seqcount_retry(&xfrm_policy_hash_generation, sequence))
 		goto retry;
 
-	xfrm_pol_hold(ret);
+	if (ret && !xfrm_pol_hold_rcu(ret))
+		goto retry;
 fail:
 	read_unlock_bh(&net->xfrm.xfrm_policy_lock);
 
-- 
2.7.3

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH ipsec-next 5/8] xfrm: policy: make xfrm_policy_lookup_bytype lockless
  2016-08-11 13:17 Subject: [PATCH ipsec-next 0/8] xfrm: policy: convert lookups to rcu Florian Westphal
                   ` (3 preceding siblings ...)
  2016-08-11 13:17 ` [PATCH ipsec-next 4/8] xfrm: policy: use atomic_inc_not_zero in rcu section Florian Westphal
@ 2016-08-11 13:17 ` Florian Westphal
  2016-08-11 13:17 ` [PATCH ipsec-next 6/8] xfrm: policy: only use rcu in xfrm_sk_policy_lookup Florian Westphal
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Florian Westphal @ 2016-08-11 13:17 UTC (permalink / raw)
  To: netdev; +Cc: Florian Westphal

side effect: no longer disables BH (should be fine).

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 include/net/netns/xfrm.h | 2 +-
 net/xfrm/xfrm_policy.c   | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/net/netns/xfrm.h b/include/net/netns/xfrm.h
index 1ab51d1..3ab828a 100644
--- a/include/net/netns/xfrm.h
+++ b/include/net/netns/xfrm.h
@@ -11,7 +11,7 @@
 struct ctl_table_header;
 
 struct xfrm_policy_hash {
-	struct hlist_head	*table;
+	struct hlist_head	__rcu *table;
 	unsigned int		hmask;
 	u8			dbits4;
 	u8			sbits4;
diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index cbc9005..398661c 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -1123,7 +1123,7 @@ static struct xfrm_policy *xfrm_policy_lookup_bytype(struct net *net, u8 type,
 	if (unlikely(!daddr || !saddr))
 		return NULL;
 
-	read_lock_bh(&net->xfrm.xfrm_policy_lock);
+	rcu_read_lock();
  retry:
 	do {
 		sequence = read_seqcount_begin(&xfrm_policy_hash_generation);
@@ -1172,7 +1172,7 @@ static struct xfrm_policy *xfrm_policy_lookup_bytype(struct net *net, u8 type,
 	if (ret && !xfrm_pol_hold_rcu(ret))
 		goto retry;
 fail:
-	read_unlock_bh(&net->xfrm.xfrm_policy_lock);
+	rcu_read_unlock();
 
 	return ret;
 }
-- 
2.7.3

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH ipsec-next 6/8] xfrm: policy: only use rcu in xfrm_sk_policy_lookup
  2016-08-11 13:17 Subject: [PATCH ipsec-next 0/8] xfrm: policy: convert lookups to rcu Florian Westphal
                   ` (4 preceding siblings ...)
  2016-08-11 13:17 ` [PATCH ipsec-next 5/8] xfrm: policy: make xfrm_policy_lookup_bytype lockless Florian Westphal
@ 2016-08-11 13:17 ` Florian Westphal
  2016-11-16 16:43   ` Nicolas Dichtel
  2016-08-11 13:17 ` [PATCH ipsec-next 7/8] xfrm: policy: don't acquire policy lock in xfrm_spd_getinfo Florian Westphal
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 16+ messages in thread
From: Florian Westphal @ 2016-08-11 13:17 UTC (permalink / raw)
  To: netdev; +Cc: Florian Westphal

Don't acquire the readlock anymore and rely on rcu alone.

In case writer on other CPU changed policy at the wrong moment (after we
obtained sk policy pointer but before we could obtain the reference)
just repeat the lookup.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 net/xfrm/xfrm_policy.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index 398661c..575a48b 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -1249,10 +1249,9 @@ static struct xfrm_policy *xfrm_sk_policy_lookup(const struct sock *sk, int dir,
 						 const struct flowi *fl)
 {
 	struct xfrm_policy *pol;
-	struct net *net = sock_net(sk);
 
 	rcu_read_lock();
-	read_lock_bh(&net->xfrm.xfrm_policy_lock);
+ again:
 	pol = rcu_dereference(sk->sk_policy[dir]);
 	if (pol != NULL) {
 		bool match = xfrm_selector_match(&pol->selector, fl,
@@ -1267,8 +1266,8 @@ static struct xfrm_policy *xfrm_sk_policy_lookup(const struct sock *sk, int dir,
 			err = security_xfrm_policy_lookup(pol->security,
 						      fl->flowi_secid,
 						      policy_to_flow_dir(dir));
-			if (!err)
-				xfrm_pol_hold(pol);
+			if (!err && !xfrm_pol_hold_rcu(pol))
+				goto again;
 			else if (err == -ESRCH)
 				pol = NULL;
 			else
@@ -1277,7 +1276,6 @@ static struct xfrm_policy *xfrm_sk_policy_lookup(const struct sock *sk, int dir,
 			pol = NULL;
 	}
 out:
-	read_unlock_bh(&net->xfrm.xfrm_policy_lock);
 	rcu_read_unlock();
 	return pol;
 }
-- 
2.7.3

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH ipsec-next 7/8] xfrm: policy: don't acquire policy lock in xfrm_spd_getinfo
  2016-08-11 13:17 Subject: [PATCH ipsec-next 0/8] xfrm: policy: convert lookups to rcu Florian Westphal
                   ` (5 preceding siblings ...)
  2016-08-11 13:17 ` [PATCH ipsec-next 6/8] xfrm: policy: only use rcu in xfrm_sk_policy_lookup Florian Westphal
@ 2016-08-11 13:17 ` Florian Westphal
  2016-08-11 13:17 ` [PATCH ipsec-next 8/8] xfrm: policy: convert policy_lock to spinlock Florian Westphal
  2016-08-12  8:22 ` Subject: [PATCH ipsec-next 0/8] xfrm: policy: convert lookups to rcu Steffen Klassert
  8 siblings, 0 replies; 16+ messages in thread
From: Florian Westphal @ 2016-08-11 13:17 UTC (permalink / raw)
  To: netdev; +Cc: Florian Westphal

It doesn't seem that important.

We now get inconsistent view of the counters, but those are stale anyway
right after we drop the lock.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 net/xfrm/xfrm_policy.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index 575a48b..3ff5893 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -560,7 +560,6 @@ static inline int xfrm_byidx_should_resize(struct net *net, int total)
 
 void xfrm_spd_getinfo(struct net *net, struct xfrmk_spdinfo *si)
 {
-	read_lock_bh(&net->xfrm.xfrm_policy_lock);
 	si->incnt = net->xfrm.policy_count[XFRM_POLICY_IN];
 	si->outcnt = net->xfrm.policy_count[XFRM_POLICY_OUT];
 	si->fwdcnt = net->xfrm.policy_count[XFRM_POLICY_FWD];
@@ -569,7 +568,6 @@ void xfrm_spd_getinfo(struct net *net, struct xfrmk_spdinfo *si)
 	si->fwdscnt = net->xfrm.policy_count[XFRM_POLICY_FWD+XFRM_POLICY_MAX];
 	si->spdhcnt = net->xfrm.policy_idx_hmask;
 	si->spdhmcnt = xfrm_policy_hashmax;
-	read_unlock_bh(&net->xfrm.xfrm_policy_lock);
 }
 EXPORT_SYMBOL(xfrm_spd_getinfo);
 
-- 
2.7.3

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH ipsec-next 8/8] xfrm: policy: convert policy_lock to spinlock
  2016-08-11 13:17 Subject: [PATCH ipsec-next 0/8] xfrm: policy: convert lookups to rcu Florian Westphal
                   ` (6 preceding siblings ...)
  2016-08-11 13:17 ` [PATCH ipsec-next 7/8] xfrm: policy: don't acquire policy lock in xfrm_spd_getinfo Florian Westphal
@ 2016-08-11 13:17 ` Florian Westphal
  2016-08-24 11:28   ` Steffen Klassert
  2016-08-12  8:22 ` Subject: [PATCH ipsec-next 0/8] xfrm: policy: convert lookups to rcu Steffen Klassert
  8 siblings, 1 reply; 16+ messages in thread
From: Florian Westphal @ 2016-08-11 13:17 UTC (permalink / raw)
  To: netdev; +Cc: Florian Westphal

After earlier patches conversions all spots acquire the writer lock and
we can now convert this to a normal spinlock.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 include/net/netns/xfrm.h |  2 +-
 net/xfrm/xfrm_policy.c   | 68 ++++++++++++++++++++++++------------------------
 2 files changed, 35 insertions(+), 35 deletions(-)

diff --git a/include/net/netns/xfrm.h b/include/net/netns/xfrm.h
index 3ab828a..177ed44 100644
--- a/include/net/netns/xfrm.h
+++ b/include/net/netns/xfrm.h
@@ -73,7 +73,7 @@ struct netns_xfrm {
 	struct dst_ops		xfrm6_dst_ops;
 #endif
 	spinlock_t xfrm_state_lock;
-	rwlock_t xfrm_policy_lock;
+	spinlock_t xfrm_policy_lock;
 	struct mutex xfrm_cfg_mutex;
 
 	/* flow cache part */
diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index 3ff5893..37f05df 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -484,7 +484,7 @@ static void xfrm_bydst_resize(struct net *net, int dir)
 	if (!ndst)
 		return;
 
-	write_lock_bh(&net->xfrm.xfrm_policy_lock);
+	spin_lock_bh(&net->xfrm.xfrm_policy_lock);
 	write_seqcount_begin(&xfrm_policy_hash_generation);
 
 	odst = rcu_dereference_protected(net->xfrm.policy_bydst[dir].table,
@@ -500,7 +500,7 @@ static void xfrm_bydst_resize(struct net *net, int dir)
 	net->xfrm.policy_bydst[dir].hmask = nhashmask;
 
 	write_seqcount_end(&xfrm_policy_hash_generation);
-	write_unlock_bh(&net->xfrm.xfrm_policy_lock);
+	spin_unlock_bh(&net->xfrm.xfrm_policy_lock);
 
 	synchronize_rcu();
 
@@ -519,7 +519,7 @@ static void xfrm_byidx_resize(struct net *net, int total)
 	if (!nidx)
 		return;
 
-	write_lock_bh(&net->xfrm.xfrm_policy_lock);
+	spin_lock_bh(&net->xfrm.xfrm_policy_lock);
 
 	for (i = hmask; i >= 0; i--)
 		xfrm_idx_hash_transfer(oidx + i, nidx, nhashmask);
@@ -527,7 +527,7 @@ static void xfrm_byidx_resize(struct net *net, int total)
 	net->xfrm.policy_byidx = nidx;
 	net->xfrm.policy_idx_hmask = nhashmask;
 
-	write_unlock_bh(&net->xfrm.xfrm_policy_lock);
+	spin_unlock_bh(&net->xfrm.xfrm_policy_lock);
 
 	xfrm_hash_free(oidx, (hmask + 1) * sizeof(struct hlist_head));
 }
@@ -617,7 +617,7 @@ static void xfrm_hash_rebuild(struct work_struct *work)
 		rbits6 = net->xfrm.policy_hthresh.rbits6;
 	} while (read_seqretry(&net->xfrm.policy_hthresh.lock, seq));
 
-	write_lock_bh(&net->xfrm.xfrm_policy_lock);
+	spin_lock_bh(&net->xfrm.xfrm_policy_lock);
 
 	/* reset the bydst and inexact table in all directions */
 	for (dir = 0; dir < XFRM_POLICY_MAX; dir++) {
@@ -659,7 +659,7 @@ static void xfrm_hash_rebuild(struct work_struct *work)
 			hlist_add_head(&policy->bydst, chain);
 	}
 
-	write_unlock_bh(&net->xfrm.xfrm_policy_lock);
+	spin_unlock_bh(&net->xfrm.xfrm_policy_lock);
 
 	mutex_unlock(&hash_resize_mutex);
 }
@@ -770,7 +770,7 @@ int xfrm_policy_insert(int dir, struct xfrm_policy *policy, int excl)
 	struct hlist_head *chain;
 	struct hlist_node *newpos;
 
-	write_lock_bh(&net->xfrm.xfrm_policy_lock);
+	spin_lock_bh(&net->xfrm.xfrm_policy_lock);
 	chain = policy_hash_bysel(net, &policy->selector, policy->family, dir);
 	delpol = NULL;
 	newpos = NULL;
@@ -781,7 +781,7 @@ int xfrm_policy_insert(int dir, struct xfrm_policy *policy, int excl)
 		    xfrm_sec_ctx_match(pol->security, policy->security) &&
 		    !WARN_ON(delpol)) {
 			if (excl) {
-				write_unlock_bh(&net->xfrm.xfrm_policy_lock);
+				spin_unlock_bh(&net->xfrm.xfrm_policy_lock);
 				return -EEXIST;
 			}
 			delpol = pol;
@@ -817,7 +817,7 @@ int xfrm_policy_insert(int dir, struct xfrm_policy *policy, int excl)
 	policy->curlft.use_time = 0;
 	if (!mod_timer(&policy->timer, jiffies + HZ))
 		xfrm_pol_hold(policy);
-	write_unlock_bh(&net->xfrm.xfrm_policy_lock);
+	spin_unlock_bh(&net->xfrm.xfrm_policy_lock);
 
 	if (delpol)
 		xfrm_policy_kill(delpol);
@@ -837,7 +837,7 @@ struct xfrm_policy *xfrm_policy_bysel_ctx(struct net *net, u32 mark, u8 type,
 	struct hlist_head *chain;
 
 	*err = 0;
-	write_lock_bh(&net->xfrm.xfrm_policy_lock);
+	spin_lock_bh(&net->xfrm.xfrm_policy_lock);
 	chain = policy_hash_bysel(net, sel, sel->family, dir);
 	ret = NULL;
 	hlist_for_each_entry(pol, chain, bydst) {
@@ -850,7 +850,7 @@ struct xfrm_policy *xfrm_policy_bysel_ctx(struct net *net, u32 mark, u8 type,
 				*err = security_xfrm_policy_delete(
 								pol->security);
 				if (*err) {
-					write_unlock_bh(&net->xfrm.xfrm_policy_lock);
+					spin_unlock_bh(&net->xfrm.xfrm_policy_lock);
 					return pol;
 				}
 				__xfrm_policy_unlink(pol, dir);
@@ -859,7 +859,7 @@ struct xfrm_policy *xfrm_policy_bysel_ctx(struct net *net, u32 mark, u8 type,
 			break;
 		}
 	}
-	write_unlock_bh(&net->xfrm.xfrm_policy_lock);
+	spin_unlock_bh(&net->xfrm.xfrm_policy_lock);
 
 	if (ret && delete)
 		xfrm_policy_kill(ret);
@@ -878,7 +878,7 @@ struct xfrm_policy *xfrm_policy_byid(struct net *net, u32 mark, u8 type,
 		return NULL;
 
 	*err = 0;
-	write_lock_bh(&net->xfrm.xfrm_policy_lock);
+	spin_lock_bh(&net->xfrm.xfrm_policy_lock);
 	chain = net->xfrm.policy_byidx + idx_hash(net, id);
 	ret = NULL;
 	hlist_for_each_entry(pol, chain, byidx) {
@@ -889,7 +889,7 @@ struct xfrm_policy *xfrm_policy_byid(struct net *net, u32 mark, u8 type,
 				*err = security_xfrm_policy_delete(
 								pol->security);
 				if (*err) {
-					write_unlock_bh(&net->xfrm.xfrm_policy_lock);
+					spin_unlock_bh(&net->xfrm.xfrm_policy_lock);
 					return pol;
 				}
 				__xfrm_policy_unlink(pol, dir);
@@ -898,7 +898,7 @@ struct xfrm_policy *xfrm_policy_byid(struct net *net, u32 mark, u8 type,
 			break;
 		}
 	}
-	write_unlock_bh(&net->xfrm.xfrm_policy_lock);
+	spin_unlock_bh(&net->xfrm.xfrm_policy_lock);
 
 	if (ret && delete)
 		xfrm_policy_kill(ret);
@@ -956,7 +956,7 @@ int xfrm_policy_flush(struct net *net, u8 type, bool task_valid)
 {
 	int dir, err = 0, cnt = 0;
 
-	write_lock_bh(&net->xfrm.xfrm_policy_lock);
+	spin_lock_bh(&net->xfrm.xfrm_policy_lock);
 
 	err = xfrm_policy_flush_secctx_check(net, type, task_valid);
 	if (err)
@@ -972,14 +972,14 @@ int xfrm_policy_flush(struct net *net, u8 type, bool task_valid)
 			if (pol->type != type)
 				continue;
 			__xfrm_policy_unlink(pol, dir);
-			write_unlock_bh(&net->xfrm.xfrm_policy_lock);
+			spin_unlock_bh(&net->xfrm.xfrm_policy_lock);
 			cnt++;
 
 			xfrm_audit_policy_delete(pol, 1, task_valid);
 
 			xfrm_policy_kill(pol);
 
-			write_lock_bh(&net->xfrm.xfrm_policy_lock);
+			spin_unlock_bh(&net->xfrm.xfrm_policy_lock);
 			goto again1;
 		}
 
@@ -991,13 +991,13 @@ int xfrm_policy_flush(struct net *net, u8 type, bool task_valid)
 				if (pol->type != type)
 					continue;
 				__xfrm_policy_unlink(pol, dir);
-				write_unlock_bh(&net->xfrm.xfrm_policy_lock);
+				spin_unlock_bh(&net->xfrm.xfrm_policy_lock);
 				cnt++;
 
 				xfrm_audit_policy_delete(pol, 1, task_valid);
 				xfrm_policy_kill(pol);
 
-				write_lock_bh(&net->xfrm.xfrm_policy_lock);
+				spin_lock_bh(&net->xfrm.xfrm_policy_lock);
 				goto again2;
 			}
 		}
@@ -1006,7 +1006,7 @@ int xfrm_policy_flush(struct net *net, u8 type, bool task_valid)
 	if (!cnt)
 		err = -ESRCH;
 out:
-	write_unlock_bh(&net->xfrm.xfrm_policy_lock);
+	spin_unlock_bh(&net->xfrm.xfrm_policy_lock);
 	return err;
 }
 EXPORT_SYMBOL(xfrm_policy_flush);
@@ -1026,7 +1026,7 @@ int xfrm_policy_walk(struct net *net, struct xfrm_policy_walk *walk,
 	if (list_empty(&walk->walk.all) && walk->seq != 0)
 		return 0;
 
-	write_lock_bh(&net->xfrm.xfrm_policy_lock);
+	spin_lock_bh(&net->xfrm.xfrm_policy_lock);
 	if (list_empty(&walk->walk.all))
 		x = list_first_entry(&net->xfrm.policy_all, struct xfrm_policy_walk_entry, all);
 	else
@@ -1054,7 +1054,7 @@ int xfrm_policy_walk(struct net *net, struct xfrm_policy_walk *walk,
 	}
 	list_del_init(&walk->walk.all);
 out:
-	write_unlock_bh(&net->xfrm.xfrm_policy_lock);
+	spin_unlock_bh(&net->xfrm.xfrm_policy_lock);
 	return error;
 }
 EXPORT_SYMBOL(xfrm_policy_walk);
@@ -1073,9 +1073,9 @@ void xfrm_policy_walk_done(struct xfrm_policy_walk *walk, struct net *net)
 	if (list_empty(&walk->walk.all))
 		return;
 
-	write_lock_bh(&net->xfrm.xfrm_policy_lock); /*FIXME where is net? */
+	spin_lock_bh(&net->xfrm.xfrm_policy_lock); /*FIXME where is net? */
 	list_del(&walk->walk.all);
-	write_unlock_bh(&net->xfrm.xfrm_policy_lock);
+	spin_unlock_bh(&net->xfrm.xfrm_policy_lock);
 }
 EXPORT_SYMBOL(xfrm_policy_walk_done);
 
@@ -1321,9 +1321,9 @@ int xfrm_policy_delete(struct xfrm_policy *pol, int dir)
 {
 	struct net *net = xp_net(pol);
 
-	write_lock_bh(&net->xfrm.xfrm_policy_lock);
+	spin_lock_bh(&net->xfrm.xfrm_policy_lock);
 	pol = __xfrm_policy_unlink(pol, dir);
-	write_unlock_bh(&net->xfrm.xfrm_policy_lock);
+	spin_unlock_bh(&net->xfrm.xfrm_policy_lock);
 	if (pol) {
 		xfrm_policy_kill(pol);
 		return 0;
@@ -1342,7 +1342,7 @@ int xfrm_sk_policy_insert(struct sock *sk, int dir, struct xfrm_policy *pol)
 		return -EINVAL;
 #endif
 
-	write_lock_bh(&net->xfrm.xfrm_policy_lock);
+	spin_lock_bh(&net->xfrm.xfrm_policy_lock);
 	old_pol = rcu_dereference_protected(sk->sk_policy[dir],
 				lockdep_is_held(&net->xfrm.xfrm_policy_lock));
 	if (pol) {
@@ -1360,7 +1360,7 @@ int xfrm_sk_policy_insert(struct sock *sk, int dir, struct xfrm_policy *pol)
 		 */
 		xfrm_sk_policy_unlink(old_pol, dir);
 	}
-	write_unlock_bh(&net->xfrm.xfrm_policy_lock);
+	spin_unlock_bh(&net->xfrm.xfrm_policy_lock);
 
 	if (old_pol) {
 		xfrm_policy_kill(old_pol);
@@ -1390,9 +1390,9 @@ static struct xfrm_policy *clone_policy(const struct xfrm_policy *old, int dir)
 		newp->type = old->type;
 		memcpy(newp->xfrm_vec, old->xfrm_vec,
 		       newp->xfrm_nr*sizeof(struct xfrm_tmpl));
-		write_lock_bh(&net->xfrm.xfrm_policy_lock);
+		spin_lock_bh(&net->xfrm.xfrm_policy_lock);
 		xfrm_sk_policy_link(newp, dir);
-		write_unlock_bh(&net->xfrm.xfrm_policy_lock);
+		spin_unlock_bh(&net->xfrm.xfrm_policy_lock);
 		xfrm_pol_put(newp);
 	}
 	return newp;
@@ -3074,7 +3074,7 @@ static int __net_init xfrm_net_init(struct net *net)
 
 	/* Initialize the per-net locks here */
 	spin_lock_init(&net->xfrm.xfrm_state_lock);
-	rwlock_init(&net->xfrm.xfrm_policy_lock);
+	spin_lock_init(&net->xfrm.xfrm_policy_lock);
 	mutex_init(&net->xfrm.xfrm_cfg_mutex);
 
 	return 0;
@@ -3206,7 +3206,7 @@ static struct xfrm_policy *xfrm_migrate_policy_find(const struct xfrm_selector *
 	struct hlist_head *chain;
 	u32 priority = ~0U;
 
-	read_lock_bh(&net->xfrm.xfrm_policy_lock); /*FIXME*/
+	spin_lock_bh(&net->xfrm.xfrm_policy_lock);
 	chain = policy_hash_direct(net, &sel->daddr, &sel->saddr, sel->family, dir);
 	hlist_for_each_entry(pol, chain, bydst) {
 		if (xfrm_migrate_selector_match(sel, &pol->selector) &&
@@ -3230,7 +3230,7 @@ static struct xfrm_policy *xfrm_migrate_policy_find(const struct xfrm_selector *
 
 	xfrm_pol_hold(ret);
 
-	read_unlock_bh(&net->xfrm.xfrm_policy_lock);
+	spin_unlock_bh(&net->xfrm.xfrm_policy_lock);
 
 	return ret;
 }
-- 
2.7.3

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: Subject: [PATCH ipsec-next 0/8] xfrm: policy: convert lookups to rcu
  2016-08-11 13:17 Subject: [PATCH ipsec-next 0/8] xfrm: policy: convert lookups to rcu Florian Westphal
                   ` (7 preceding siblings ...)
  2016-08-11 13:17 ` [PATCH ipsec-next 8/8] xfrm: policy: convert policy_lock to spinlock Florian Westphal
@ 2016-08-12  8:22 ` Steffen Klassert
  2016-08-22  5:15   ` Steffen Klassert
  8 siblings, 1 reply; 16+ messages in thread
From: Steffen Klassert @ 2016-08-12  8:22 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netdev

On Thu, Aug 11, 2016 at 03:17:51PM +0200, Florian Westphal wrote:
> Since commit d188ba86dd07a72eb ("xfrm: add rcu protection to sk->sk_policy[]")
> sk_policy can rely on rcu protection.
> 
> This change allows to also use rcu in xfrm_policy_lookup_bytype and to
> avoid grabbing the read-sie policy lock during lookups.
> 
> read-side policy rwlock is converted to pure rcu, then the policy_lock is
> changed to a plain spinlock as only write-sides remain.
> 
> First few patches do some preparation work, the later ones remove
> the read-side locks, last patch converts rwlock to spinlock.
> 
> I tested this with rcu debug enabled and simple esp tunnel
> forwarding udp packets.
> 
> If you have better tests for this please let me know and I can re-run with
> that.
> 
> Florian Westphal (8):
>       xfrm: policy: use rcu versions for iteration and list add/del
>       xfrm: policy: prepare policy_bydst hash for rcu lookups
>       xfrm: policy: add sequence count to synchronize reads with hash resizes
>       xfrm: policy: use atomic_inc_not_zero in rcu section
>       xfrm: policy: make xfrm_policy_lookup_bytype lockless
>       xfrm: policy: only use rcu in xfrm_sk_policy_lookup
>       xfrm: policy: don't acquire policy lock in xfrm_spd_getinfo
>       xfrm: policy: convert policy_lock to spinlock
> 
>  include/net/netns/xfrm.h |    4 -
>  net/xfrm/xfrm_policy.c   |  145 +++++++++++++++++++++++++++--------------------
>  2 files changed, 88 insertions(+), 61 deletions(-)

Looks good, I've applied to already to the ipsec-next testing branch.
However, I'll be off for a week starting this afternoon, so I don't
see if the patchset passed all tests before I leave. For that I defer
the applying to master until I'm back.

Thanks!

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Subject: [PATCH ipsec-next 0/8] xfrm: policy: convert lookups to rcu
  2016-08-12  8:22 ` Subject: [PATCH ipsec-next 0/8] xfrm: policy: convert lookups to rcu Steffen Klassert
@ 2016-08-22  5:15   ` Steffen Klassert
  0 siblings, 0 replies; 16+ messages in thread
From: Steffen Klassert @ 2016-08-22  5:15 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netdev

On Fri, Aug 12, 2016 at 10:22:54AM +0200, Steffen Klassert wrote:
> On Thu, Aug 11, 2016 at 03:17:51PM +0200, Florian Westphal wrote:
> > Since commit d188ba86dd07a72eb ("xfrm: add rcu protection to sk->sk_policy[]")
> > sk_policy can rely on rcu protection.
> > 
> > This change allows to also use rcu in xfrm_policy_lookup_bytype and to
> > avoid grabbing the read-sie policy lock during lookups.
> > 
> > read-side policy rwlock is converted to pure rcu, then the policy_lock is
> > changed to a plain spinlock as only write-sides remain.
> > 
> > First few patches do some preparation work, the later ones remove
> > the read-side locks, last patch converts rwlock to spinlock.
> > 
> > I tested this with rcu debug enabled and simple esp tunnel
> > forwarding udp packets.
> > 
> > If you have better tests for this please let me know and I can re-run with
> > that.
> > 
> > Florian Westphal (8):
> >       xfrm: policy: use rcu versions for iteration and list add/del
> >       xfrm: policy: prepare policy_bydst hash for rcu lookups
> >       xfrm: policy: add sequence count to synchronize reads with hash resizes
> >       xfrm: policy: use atomic_inc_not_zero in rcu section
> >       xfrm: policy: make xfrm_policy_lookup_bytype lockless
> >       xfrm: policy: only use rcu in xfrm_sk_policy_lookup
> >       xfrm: policy: don't acquire policy lock in xfrm_spd_getinfo
> >       xfrm: policy: convert policy_lock to spinlock
> > 
> >  include/net/netns/xfrm.h |    4 -
> >  net/xfrm/xfrm_policy.c   |  145 +++++++++++++++++++++++++++--------------------
> >  2 files changed, 88 insertions(+), 61 deletions(-)
> 
> Looks good, I've applied to already to the ipsec-next testing branch.
> However, I'll be off for a week starting this afternoon, so I don't
> see if the patchset passed all tests before I leave. For that I defer
> the applying to master until I'm back.

Now applied to the ipsec-next tree, thanks a lot Florian!

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH ipsec-next 8/8] xfrm: policy: convert policy_lock to spinlock
  2016-08-11 13:17 ` [PATCH ipsec-next 8/8] xfrm: policy: convert policy_lock to spinlock Florian Westphal
@ 2016-08-24 11:28   ` Steffen Klassert
  2016-08-24 11:30     ` Florian Westphal
  0 siblings, 1 reply; 16+ messages in thread
From: Steffen Klassert @ 2016-08-24 11:28 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netdev

On Thu, Aug 11, 2016 at 03:17:59PM +0200, Florian Westphal wrote:
> @@ -972,14 +972,14 @@ int xfrm_policy_flush(struct net *net, u8 type, bool task_valid)
>  			if (pol->type != type)
>  				continue;
>  			__xfrm_policy_unlink(pol, dir);
> -			write_unlock_bh(&net->xfrm.xfrm_policy_lock);
> +			spin_unlock_bh(&net->xfrm.xfrm_policy_lock);
>  			cnt++;
>  
>  			xfrm_audit_policy_delete(pol, 1, task_valid);
>  
>  			xfrm_policy_kill(pol);
>  
> -			write_lock_bh(&net->xfrm.xfrm_policy_lock);
> +			spin_unlock_bh(&net->xfrm.xfrm_policy_lock);

I've just noticed that you accidentally replaced write_lock_bh
with spin_unlock_bh here.

I fixed it up in the ipsec-next testing branch with:

>From 4141b36ab16d7a66b4cf712f2d21eba61c5927e5 Mon Sep 17 00:00:00 2001
From: Steffen Klassert <steffen.klassert@secunet.com>
Date: Wed, 24 Aug 2016 13:08:40 +0200
Subject: [PATCH ipsec-next] xfrm: Fix xfrm_policy_lock imbalance

An earlier patch accidentally replaced a write_lock_bh
with a spin_unlock_bh. Fix this by using spin_lock_bh
instead.

Fixes: 9d0380df6217 ("xfrm: policy: convert policy_lock to spinlock")
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
 net/xfrm/xfrm_policy.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index dd01fd2..f7ce626 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -979,7 +979,7 @@ int xfrm_policy_flush(struct net *net, u8 type, bool task_valid)
 
 			xfrm_policy_kill(pol);
 
-			spin_unlock_bh(&net->xfrm.xfrm_policy_lock);
+			spin_lock_bh(&net->xfrm.xfrm_policy_lock);
 			goto again1;
 		}
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH ipsec-next 8/8] xfrm: policy: convert policy_lock to spinlock
  2016-08-24 11:28   ` Steffen Klassert
@ 2016-08-24 11:30     ` Florian Westphal
  0 siblings, 0 replies; 16+ messages in thread
From: Florian Westphal @ 2016-08-24 11:30 UTC (permalink / raw)
  To: Steffen Klassert; +Cc: Florian Westphal, netdev

Steffen Klassert <steffen.klassert@secunet.com> wrote:
> On Thu, Aug 11, 2016 at 03:17:59PM +0200, Florian Westphal wrote:
> > @@ -972,14 +972,14 @@ int xfrm_policy_flush(struct net *net, u8 type, bool task_valid)
> >  			if (pol->type != type)
> >  				continue;
> >  			__xfrm_policy_unlink(pol, dir);
> > -			write_unlock_bh(&net->xfrm.xfrm_policy_lock);
> > +			spin_unlock_bh(&net->xfrm.xfrm_policy_lock);
> >  			cnt++;
> >  
> >  			xfrm_audit_policy_delete(pol, 1, task_valid);
> >  
> >  			xfrm_policy_kill(pol);
> >  
> > -			write_lock_bh(&net->xfrm.xfrm_policy_lock);
> > +			spin_unlock_bh(&net->xfrm.xfrm_policy_lock);
> 
> I've just noticed that you accidentally replaced write_lock_bh
> with spin_unlock_bh here.

Sorry about this, thanks for fixing this up.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH ipsec-next 6/8] xfrm: policy: only use rcu in xfrm_sk_policy_lookup
  2016-08-11 13:17 ` [PATCH ipsec-next 6/8] xfrm: policy: only use rcu in xfrm_sk_policy_lookup Florian Westphal
@ 2016-11-16 16:43   ` Nicolas Dichtel
  2016-11-16 17:30     ` Florian Westphal
  0 siblings, 1 reply; 16+ messages in thread
From: Nicolas Dichtel @ 2016-11-16 16:43 UTC (permalink / raw)
  To: Florian Westphal, netdev, Steffen Klassert

[-- Attachment #1: Type: text/plain, Size: 593 bytes --]

Le 11/08/2016 à 15:17, Florian Westphal a écrit :
> Don't acquire the readlock anymore and rely on rcu alone.
> 
> In case writer on other CPU changed policy at the wrong moment (after we
> obtained sk policy pointer but before we could obtain the reference)
> just repeat the lookup.
> 
> Signed-off-by: Florian Westphal <fw@strlen.de>
Since this patch, our IKEv1 Transport tests (using charon) fail to establish the
connection. If I revert it, the IKE negociation is ok again.
charon logs are enclosed.

I didn't had time to investigate now, but any idea is welcomed ;-)

Regards,
Nicolas

[-- Attachment #2: charon.txt --]
[-- Type: text/plain, Size: 2343 bytes --]

/var/log/syslog:Nov 16 16:28:47 router charon: 04[CFG] received stroke: add connection 'mytunnel-17'
/var/log/syslog:Nov 16 16:28:47 router charon: 04[CFG] added configuration 'mytunnel-17'
/var/log/syslog:Nov 16 16:28:47 router charon: 08[CFG] received stroke: route 'mytunnel-17'
/var/log/syslog:Nov 16 16:28:52 router charon: 13[KNL] creating acquire job for policy 10.125.0.1/32[gre] === 10.125.0.2/32[gre] with reqid {1}
/var/log/syslog:Nov 16 16:28:52 router charon: 13[IKE] initiating Aggressive Mode IKE_SA mytunnel-17[1] to 10.125.0.2
/var/log/syslog:Nov 16 16:28:52 router charon: 13[ENC] generating AGGRESSIVE request 0 [ SA KE No ID V V V V V ]
/var/log/syslog:Nov 16 16:28:52 router charon: 13[NET] sending packet: from 10.125.0.1[500] to 10.125.0.2[500] (548 bytes)
/var/log/syslog:Nov 16 16:28:56 router charon: 07[IKE] sending retransmit 1 of request message ID 0, seq 1
/var/log/syslog:Nov 16 16:28:56 router charon: 07[NET] sending packet: from 10.125.0.1[500] to 10.125.0.2[500] (548 bytes)
/var/log/syslog:Nov 16 16:29:04 router charon: 08[IKE] sending retransmit 2 of request message ID 0, seq 1
/var/log/syslog:Nov 16 16:29:04 router charon: 08[NET] sending packet: from 10.125.0.1[500] to 10.125.0.2[500] (548 bytes)
/var/log/syslog:Nov 16 16:29:17 router charon: 09[IKE] sending retransmit 3 of request message ID 0, seq 1
/var/log/syslog:Nov 16 16:29:17 router charon: 09[NET] sending packet: from 10.125.0.1[500] to 10.125.0.2[500] (548 bytes)
/var/log/syslog:Nov 16 16:29:40 router charon: 11[IKE] sending retransmit 4 of request message ID 0, seq 1
/var/log/syslog:Nov 16 16:29:40 router charon: 11[NET] sending packet: from 10.125.0.1[500] to 10.125.0.2[500] (548 bytes)
/var/log/syslog:Nov 16 16:30:22 router charon: 05[IKE] sending retransmit 5 of request message ID 0, seq 1
/var/log/syslog:Nov 16 16:30:22 router charon: 05[NET] sending packet: from 10.125.0.1[500] to 10.125.0.2[500] (548 bytes)
/var/log/syslog:Nov 16 16:31:37 router charon: 15[KNL] creating delete job for CHILD_SA ESP/0x00000000/10.125.0.2
/var/log/syslog:Nov 16 16:31:37 router charon: 13[JOB] CHILD_SA ESP/0x00000000/10.125.0.2 not found for delete
/var/log/syslog:Nov 16 16:31:38 router charon: 09[IKE] giving up after 5 retransmits
/var/log/syslog:Nov 16 16:31:38 router charon: 09[IKE] establishing IKE_SA failed, peer not responding

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH ipsec-next 6/8] xfrm: policy: only use rcu in xfrm_sk_policy_lookup
  2016-11-16 16:43   ` Nicolas Dichtel
@ 2016-11-16 17:30     ` Florian Westphal
  2016-11-17  8:55       ` Nicolas Dichtel
  0 siblings, 1 reply; 16+ messages in thread
From: Florian Westphal @ 2016-11-16 17:30 UTC (permalink / raw)
  To: Nicolas Dichtel; +Cc: Florian Westphal, netdev, Steffen Klassert

Nicolas Dichtel <nicolas.dichtel@6wind.com> wrote:
> Le 11/08/2016 à 15:17, Florian Westphal a écrit :
> > Don't acquire the readlock anymore and rely on rcu alone.
> > 
> > In case writer on other CPU changed policy at the wrong moment (after we
> > obtained sk policy pointer but before we could obtain the reference)
> > just repeat the lookup.
> > 
> > Signed-off-by: Florian Westphal <fw@strlen.de>
> Since this patch, our IKEv1 Transport tests (using charon) fail to establish the
> connection. If I revert it, the IKE negociation is ok again.
> charon logs are enclosed.
> 
> I didn't had time to investigate now, but any idea is welcomed ;-)

I'm an idiot.  Thanks for figuring out the faulty commit!

This should fix it (if we succeed grabbing the refcount, then if (err && !xfrm_pol_hold_rcu
evaluates to false and we hit last else branch and set pol to ERR_PTR(0)...

diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index fd6986634e6f..5bf7e1bfeac7 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -1268,12 +1268,14 @@ static struct xfrm_policy *xfrm_sk_policy_lookup(const struct sock *sk, int dir,
                        err = security_xfrm_policy_lookup(pol->security,
                                                      fl->flowi_secid,
                                                      policy_to_flow_dir(dir));
-                       if (!err && !xfrm_pol_hold_rcu(pol))
-                               goto again;
-                       else if (err == -ESRCH)
+                       if (!err) {
+                               if (!xfrm_pol_hold_rcu(pol))
+                                       goto again;
+                       } else if (err == -ESRCH) {
                                pol = NULL;
-                       else
+                       } else {
                                pol = ERR_PTR(err);
+                       }
                } else

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH ipsec-next 6/8] xfrm: policy: only use rcu in xfrm_sk_policy_lookup
  2016-11-16 17:30     ` Florian Westphal
@ 2016-11-17  8:55       ` Nicolas Dichtel
  0 siblings, 0 replies; 16+ messages in thread
From: Nicolas Dichtel @ 2016-11-17  8:55 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netdev, Steffen Klassert

Le 16/11/2016 à 18:30, Florian Westphal a écrit :
[snip]
> This should fix it (if we succeed grabbing the refcount, then if (err && !xfrm_pol_hold_rcu
> evaluates to false and we hit last else branch and set pol to ERR_PTR(0)...
Haha, right :) Thank you for the quick fix.
The patch solves my problem. You can add my 'tested-by' tag in the formal
submission.

Tested-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2016-11-17  8:55 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-08-11 13:17 Subject: [PATCH ipsec-next 0/8] xfrm: policy: convert lookups to rcu Florian Westphal
2016-08-11 13:17 ` [PATCH ipsec-next 1/8] xfrm: policy: use rcu versions for iteration and list add/del Florian Westphal
2016-08-11 13:17 ` [PATCH ipsec-next 2/8] xfrm: policy: prepare policy_bydst hash for rcu lookups Florian Westphal
2016-08-11 13:17 ` [PATCH ipsec-next 3/8] xfrm: policy: add sequence count to sync with hash resize Florian Westphal
2016-08-11 13:17 ` [PATCH ipsec-next 4/8] xfrm: policy: use atomic_inc_not_zero in rcu section Florian Westphal
2016-08-11 13:17 ` [PATCH ipsec-next 5/8] xfrm: policy: make xfrm_policy_lookup_bytype lockless Florian Westphal
2016-08-11 13:17 ` [PATCH ipsec-next 6/8] xfrm: policy: only use rcu in xfrm_sk_policy_lookup Florian Westphal
2016-11-16 16:43   ` Nicolas Dichtel
2016-11-16 17:30     ` Florian Westphal
2016-11-17  8:55       ` Nicolas Dichtel
2016-08-11 13:17 ` [PATCH ipsec-next 7/8] xfrm: policy: don't acquire policy lock in xfrm_spd_getinfo Florian Westphal
2016-08-11 13:17 ` [PATCH ipsec-next 8/8] xfrm: policy: convert policy_lock to spinlock Florian Westphal
2016-08-24 11:28   ` Steffen Klassert
2016-08-24 11:30     ` Florian Westphal
2016-08-12  8:22 ` Subject: [PATCH ipsec-next 0/8] xfrm: policy: convert lookups to rcu Steffen Klassert
2016-08-22  5:15   ` Steffen Klassert

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.