* [PATCH] rhashtable: detect when object movement between tables might have invalidated a lookup
@ 2018-11-14 23:32 NeilBrown
  2018-11-16  5:55 ` Herbert Xu
  0 siblings, 1 reply; 13+ messages in thread
From: NeilBrown @ 2018-11-14 23:32 UTC (permalink / raw)
  To: David Miller, herbert, tgraf; +Cc: netdev, linux-kernel, eric.dumazet



Some users of rhashtables might need to move an object from one table
to another -  this appears to be the reason for the incomplete usage
of NULLS markers.

To support these, we store a unique NULLS_MARKER at the end of
each chain, and when a search fails to find a match, we check
if the NULLS marker found was the expected one.  If not, the search
may not have examined all objects in the target bucket, so it is
repeated.

The unique NULLS_MARKER is derived from the address of the
head of the chain.  As this cannot be derived at load-time the
static rhnull in rht_bucket_nested() needs to be initialised
at run time.

Any caller of a lookup function must still be prepared for the
possibility that the object returned is in a different table - it
might have been there for some time.

Note that this does NOT provide support for other uses of
NULLS_MARKERs such as allocating with SLAB_TYPESAFE_BY_RCU or changing
the key of an object and re-inserting it in the same table.
These could only be done safely if new objects were inserted
at the *start* of a hash chain, and that is not currently the case.

Signed-off-by: NeilBrown <neilb@suse.com>
---

I sent this patch back in July, but I made a bit of a mess of it.
Here it is again, with a bit more care.

Previously in
   Commit: 9f9a707738aa ("rhashtable: remove nulls_base and related code.")
I removed some 'nulls' related code that wasn't being used and wasn't
usable.
This patch adds code to provide the functionality that the removed
code was intended for.
Specifically, it is now possible to move an object from one rhashtable
into another rhashtable, and still provide correct lookup (etc.)
semantics.

If a lookup is performed on a table from which objects can be moved,
then a found object will have been in the table at some point during the
lookup, but might not still be there.  If that lookup returns failure,
then you can be sure that there was no matching object in the table for
the entire duration of the lookup.

This patch doesn't provide code to safely move an object.  It just adds
enough code that the already-existing references to NULLS_MARKER()s make
sense.
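
To make the intended usage concrete, here is a rough sketch of the sort
of caller this enables.  It is purely illustrative: "struct foo",
"foo_params" and the table_id field are made-up names, and the
synchronisation needed to actually move an object between tables is not
shown (this patch does not provide it).

  #include <linux/rhashtable.h>
  #include <linux/rcupdate.h>

  struct foo {
  	u32			key;
  	int			table_id;	/* table the object currently claims to be in */
  	struct rhash_head	node;
  };

  static const struct rhashtable_params foo_params = {
  	.key_len	= sizeof(u32),
  	.key_offset	= offsetof(struct foo, key),
  	.head_offset	= offsetof(struct foo, node),
  };

  /* Look up 'key' in 'tbl' and check that the object still claims to
   * belong to 'table_id'.  With this patch a miss is reliable even while
   * objects migrate between tables; a hit may be an object that has
   * since moved, so the caller re-checks where it lives.
   */
  static bool foo_is_in_table(struct rhashtable *tbl, int table_id, u32 key)
  {
  	struct foo *obj;
  	bool ret;

  	rcu_read_lock();
  	obj = rhashtable_lookup_fast(tbl, &key, foo_params);
  	ret = obj && READ_ONCE(obj->table_id) == table_id;
  	rcu_read_unlock();
  	return ret;
  }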

NeilBrown


 include/linux/rhashtable.h | 25 +++++++++++++++++--------
 lib/rhashtable.c           |  8 +++++---
 2 files changed, 22 insertions(+), 11 deletions(-)

diff --git a/include/linux/rhashtable.h b/include/linux/rhashtable.h
index eb7111039247..8cc240f14834 100644
--- a/include/linux/rhashtable.h
+++ b/include/linux/rhashtable.h
@@ -75,8 +75,10 @@ struct bucket_table {
 	struct rhash_head __rcu *buckets[] ____cacheline_aligned_in_smp;
 };
 
+#define	RHT_NULLS_MARKER(ptr)	\
+	((void *)NULLS_MARKER(((unsigned long) (ptr)) >> 1))
 #define INIT_RHT_NULLS_HEAD(ptr)	\
-	((ptr) = (typeof(ptr)) NULLS_MARKER(0))
+	((ptr) = RHT_NULLS_MARKER(&(ptr)))
 
 static inline bool rht_is_a_nulls(const struct rhash_head *ptr)
 {
@@ -471,6 +473,7 @@ static inline struct rhash_head *__rhashtable_lookup(
 		.ht = ht,
 		.key = key,
 	};
+	struct rhash_head __rcu * const *head;
 	struct bucket_table *tbl;
 	struct rhash_head *he;
 	unsigned int hash;
@@ -478,13 +481,19 @@ static inline struct rhash_head *__rhashtable_lookup(
 	tbl = rht_dereference_rcu(ht->tbl, ht);
 restart:
 	hash = rht_key_hashfn(ht, tbl, key, params);
-	rht_for_each_rcu(he, tbl, hash) {
-		if (params.obj_cmpfn ?
-		    params.obj_cmpfn(&arg, rht_obj(ht, he)) :
-		    rhashtable_compare(&arg, rht_obj(ht, he)))
-			continue;
-		return he;
-	}
+	head = rht_bucket(tbl, hash);
+	do {
+		rht_for_each_rcu_continue(he, *head, tbl, hash) {
+			if (params.obj_cmpfn ?
+			    params.obj_cmpfn(&arg, rht_obj(ht, he)) :
+			    rhashtable_compare(&arg, rht_obj(ht, he)))
+				continue;
+			return he;
+		}
+		/* An object might have been moved to a different hash chain,
+		 * while we walk along it - better check and retry.
+		 */
+	} while (he != RHT_NULLS_MARKER(head));
 
 	/* Ensure we see any new tables. */
 	smp_rmb();
diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index 30526afa8343..852ffa5160f1 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -1179,8 +1179,7 @@ struct rhash_head __rcu **rht_bucket_nested(const struct bucket_table *tbl,
 					    unsigned int hash)
 {
 	const unsigned int shift = PAGE_SHIFT - ilog2(sizeof(void *));
-	static struct rhash_head __rcu *rhnull =
-		(struct rhash_head __rcu *)NULLS_MARKER(0);
+	static struct rhash_head __rcu *rhnull;
 	unsigned int index = hash & ((1 << tbl->nest) - 1);
 	unsigned int size = tbl->size >> tbl->nest;
 	unsigned int subhash = hash;
@@ -1198,8 +1197,11 @@ struct rhash_head __rcu **rht_bucket_nested(const struct bucket_table *tbl,
 		subhash >>= shift;
 	}
 
-	if (!ntbl)
+	if (!ntbl) {
+		if (!rhnull)
+			INIT_RHT_NULLS_HEAD(rhnull);
 		return &rhnull;
+	}
 
 	return &ntbl[subhash].bucket;
 
-- 
2.14.0.rc0.dirty



* Re: [PATCH] rhashtable: detect when object movement between tables might have invalidated a lookup
  2018-11-14 23:32 [PATCH] rhashtable: detect when object movement between tables might have invalidated a lookup NeilBrown
@ 2018-11-16  5:55 ` Herbert Xu
  2018-11-16  6:59   ` NeilBrown
  0 siblings, 1 reply; 13+ messages in thread
From: Herbert Xu @ 2018-11-16  5:55 UTC (permalink / raw)
  To: NeilBrown; +Cc: David Miller, tgraf, netdev, linux-kernel, eric.dumazet

On Thu, Nov 15, 2018 at 10:32:13AM +1100, NeilBrown wrote:
>
> +#define	RHT_NULLS_MARKER(ptr)	\
> +	((void *)NULLS_MARKER(((unsigned long) (ptr)) >> 1))
>  #define INIT_RHT_NULLS_HEAD(ptr)	\
> -	((ptr) = (typeof(ptr)) NULLS_MARKER(0))
> +	((ptr) = RHT_NULLS_MARKER(&(ptr)))

Why are you shifting this by one?

> diff --git a/lib/rhashtable.c b/lib/rhashtable.c
> index 30526afa8343..852ffa5160f1 100644
> --- a/lib/rhashtable.c
> +++ b/lib/rhashtable.c
> @@ -1179,8 +1179,7 @@ struct rhash_head __rcu **rht_bucket_nested(const struct bucket_table *tbl,
>  					    unsigned int hash)
>  {
>  	const unsigned int shift = PAGE_SHIFT - ilog2(sizeof(void *));
> -	static struct rhash_head __rcu *rhnull =
> -		(struct rhash_head __rcu *)NULLS_MARKER(0);
> +	static struct rhash_head __rcu *rhnull;

I don't understand why you can't continue to do NULLS_MARKER(0) or
RHT_NULLS_MARKER(0).

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: [PATCH] rhashtable: detect when object movement between tables might have invalidated a lookup
  2018-11-16  5:55 ` Herbert Xu
@ 2018-11-16  6:59   ` NeilBrown
  2018-11-19  3:54     ` Herbert Xu
  2018-11-29 23:26     ` [PATCH v3] " NeilBrown
  0 siblings, 2 replies; 13+ messages in thread
From: NeilBrown @ 2018-11-16  6:59 UTC (permalink / raw)
  To: Herbert Xu; +Cc: David Miller, tgraf, netdev, linux-kernel, eric.dumazet


On Fri, Nov 16 2018, Herbert Xu wrote:

> On Thu, Nov 15, 2018 at 10:32:13AM +1100, NeilBrown wrote:
>>
>> +#define	RHT_NULLS_MARKER(ptr)	\
>> +	((void *)NULLS_MARKER(((unsigned long) (ptr)) >> 1))
>>  #define INIT_RHT_NULLS_HEAD(ptr)	\
>> -	((ptr) = (typeof(ptr)) NULLS_MARKER(0))
>> +	((ptr) = RHT_NULLS_MARKER(&(ptr)))
>
> Why are you shifting this by one?

NULLS_MARKER assumes a hash value in which the bottom bits are most
likely to be unique.  To convert this to a pointer which is certainly not
valid, it shifts left by 1 and sets the lsb.
We aren't passing a hash value, but are passing an address instead.
In this case the bottom 2 bits are certain to be 0, and the top bit
could contain valuable information (on a 32bit system).
The best way to turn a pointer into a certainly-invalid pointer
is to just set the lsb.  By shifting right by one, we discard an
uninteresting bit, preserve all the interesting bits, and effectively
just set the lsb.

I could add a comment explaining that if you like.
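
For reference, the arithmetic can be checked with a small stand-alone
(userspace) sketch - assuming NULLS_MARKER() matches its definition in
include/linux/list_nulls.h, and using an arbitrary static variable as a
stand-in for a bucket head:

  #include <assert.h>
  #include <stdio.h>

  #define NULLS_MARKER(value)	(1UL | (((unsigned long)(value)) << 1))
  #define RHT_NULLS_MARKER(ptr)	\
  	((void *)NULLS_MARKER(((unsigned long)(ptr)) >> 1))

  int main(void)
  {
  	static void *bucket;	/* stand-in for a chain head */
  	unsigned long addr = (unsigned long)&bucket;
  	unsigned long marker = (unsigned long)RHT_NULLS_MARKER(&bucket);

  	/* Pointers are at least 4-byte aligned, so the ">> 1" followed by
  	 * the "<< 1" inside NULLS_MARKER() gives the address back
  	 * unchanged, and the macro effectively just sets the lsb. */
  	assert((addr & 3) == 0);
  	assert(marker == (addr | 1UL));
  	printf("addr=%#lx marker=%#lx\n", addr, marker);
  	return 0;
  }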

>
>> diff --git a/lib/rhashtable.c b/lib/rhashtable.c
>> index 30526afa8343..852ffa5160f1 100644
>> --- a/lib/rhashtable.c
>> +++ b/lib/rhashtable.c
>> @@ -1179,8 +1179,7 @@ struct rhash_head __rcu **rht_bucket_nested(const struct bucket_table *tbl,
>>  					    unsigned int hash)
>>  {
>>  	const unsigned int shift = PAGE_SHIFT - ilog2(sizeof(void *));
>> -	static struct rhash_head __rcu *rhnull =
>> -		(struct rhash_head __rcu *)NULLS_MARKER(0);
>> +	static struct rhash_head __rcu *rhnull;
>
> I don't understand why you can't continue to do NULLS_MARKER(0) or
> RHT_NULLS_MARKER(0).

Because then the test

+	} while (he != RHT_NULLS_MARKER(head));

in __rhashtable_lookup() would always succeed, and it would loop
forever.

Thanks for the review.

NeilBrown


>
> Cheers,
> -- 
> Email: Herbert Xu <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: [PATCH] rhashtable: detect when object movement between tables might have invalidated a lookup
  2018-11-16  6:59   ` NeilBrown
@ 2018-11-19  3:54     ` Herbert Xu
  2018-11-19  3:56       ` Herbert Xu
  2018-11-29 23:26     ` [PATCH v3] " NeilBrown
  1 sibling, 1 reply; 13+ messages in thread
From: Herbert Xu @ 2018-11-19  3:54 UTC (permalink / raw)
  To: NeilBrown; +Cc: David Miller, tgraf, netdev, linux-kernel, eric.dumazet

On Fri, Nov 16, 2018 at 05:59:19PM +1100, NeilBrown wrote:
>
> NULLS_MARKER assumes a hash value in which the bottom bits are most
> likely to be unique.  To convert this to a pointer which is certainly not
> valid, it shifts left by 1 and sets the lsb.
> We aren't passing a hash value, but are passing an address instead.
> In this case the bottom 2 bits are certain to be 0, and the top bit
> could contain valuable information (on a 32bit system).
> The best way to turn a pointer into a certainly-invalid pointer
> is to just set the lsb.  By shifting right by one, we discard an
> uninteresting bit, preserve all the interesting bits, and effectively
> just set the lsb.
> 
> I could add a comment explaining that if you like.

The top-bit is most likely to be fixed and offer no real value.

> >> diff --git a/lib/rhashtable.c b/lib/rhashtable.c
> >> index 30526afa8343..852ffa5160f1 100644
> >> --- a/lib/rhashtable.c
> >> +++ b/lib/rhashtable.c
> >> @@ -1179,8 +1179,7 @@ struct rhash_head __rcu **rht_bucket_nested(const struct bucket_table *tbl,
> >>  					    unsigned int hash)
> >>  {
> >>  	const unsigned int shift = PAGE_SHIFT - ilog2(sizeof(void *));
> >> -	static struct rhash_head __rcu *rhnull =
> >> -		(struct rhash_head __rcu *)NULLS_MARKER(0);
> >> +	static struct rhash_head __rcu *rhnull;
> >
> > I don't understand why you can't continue to do NULLS_MARKER(0) or
> > RHT_NULLS_MARKER(0).
> 
> Because then the test
> 
> +	} while (he != RHT_NULLS_MARKER(head));
> 
> in __rhashtable_lookup() would always succeed, and it would loop
> forever.

This change is only necessary because of your shifting change
above, which AFAICS adds no real benefit.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: [PATCH] rhashtable: detect when object movement between tables might have invalidated a lookup
  2018-11-19  3:54     ` Herbert Xu
@ 2018-11-19  3:56       ` Herbert Xu
  2018-11-19  4:06         ` Herbert Xu
  0 siblings, 1 reply; 13+ messages in thread
From: Herbert Xu @ 2018-11-19  3:56 UTC (permalink / raw)
  To: NeilBrown; +Cc: David Miller, tgraf, netdev, linux-kernel, eric.dumazet

On Mon, Nov 19, 2018 at 11:54:15AM +0800, Herbert Xu wrote:
>
> > >> diff --git a/lib/rhashtable.c b/lib/rhashtable.c
> > >> index 30526afa8343..852ffa5160f1 100644
> > >> --- a/lib/rhashtable.c
> > >> +++ b/lib/rhashtable.c
> > >> @@ -1179,8 +1179,7 @@ struct rhash_head __rcu **rht_bucket_nested(const struct bucket_table *tbl,
> > >>  					    unsigned int hash)
> > >>  {
> > >>  	const unsigned int shift = PAGE_SHIFT - ilog2(sizeof(void *));
> > >> -	static struct rhash_head __rcu *rhnull =
> > >> -		(struct rhash_head __rcu *)NULLS_MARKER(0);
> > >> +	static struct rhash_head __rcu *rhnull;
> > >
> > > I don't understand why you can't continue to do NULLS_MARKER(0) or
> > > RHT_NULLS_MARKER(0).
> > 
> > Because then the test
> > 
> > +	} while (he != RHT_NULLS_MARKER(head));
> > 
> > in __rhashtable_lookup() would always succeed, and it would loop
> > forever.
> 
> This change is only necessary because of your shifting change
> above, which AFAICS adds no real benefit.

I take that back.  Because of your shift which cancels out the
shift in NULLS_MARKER, it would appear that this should work just
fine with RHT_NULLS_MARKER(0), no? IOW, it would appear that

	RHT_NULLS_MARKER(0) = RHT_NULLS_MARKER(RHT_NULLS_MARKER(0))

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: [PATCH] rhashtable: detect when object movement between tables might have invalidated a lookup
  2018-11-19  3:56       ` Herbert Xu
@ 2018-11-19  4:06         ` Herbert Xu
  2018-11-19  4:22           ` David Miller
  0 siblings, 1 reply; 13+ messages in thread
From: Herbert Xu @ 2018-11-19  4:06 UTC (permalink / raw)
  To: NeilBrown; +Cc: David Miller, tgraf, netdev, linux-kernel, eric.dumazet

On Mon, Nov 19, 2018 at 11:56:35AM +0800, Herbert Xu wrote:
>
> I take that back.  Because of your shift which cancels out the
> shift in NULLS_MARKER, it would appear that this should work just
> fine with RHT_NULLS_MARKER(0), no? IOW, it would appear that
> 
> 	RHT_NULLS_MARKER(0) = RHT_NULLS_MARKER(RHT_NULLS_MARKER(0))

My emails to Neil are bouncing:

	neilb@suse.com
	  host smtp.glb1.softwaregrp.com [15.124.2.87]
	  SMTP error from remote mail server after RCPT TO:<neilb@suse.com>:
	  550 Cannot process address

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: [PATCH] rhashtable: detect when object movement between tables might have invalidated a lookup
  2018-11-19  4:06         ` Herbert Xu
@ 2018-11-19  4:22           ` David Miller
  0 siblings, 0 replies; 13+ messages in thread
From: David Miller @ 2018-11-19  4:22 UTC (permalink / raw)
  To: herbert; +Cc: neilb, tgraf, netdev, linux-kernel, eric.dumazet

From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Mon, 19 Nov 2018 12:06:34 +0800

> On Mon, Nov 19, 2018 at 11:56:35AM +0800, Herbert Xu wrote:
>>
>> I take that back.  Because of your shift which cancels out the
>> shift in NULLS_MARKER, it would appear that this should work just
>> fine with RHT_NULLS_MARKER(0), no? IOW, it would appear that
>> 
>> 	RHT_NULLS_MARKER(0) = RHT_NULLS_MARKER(RHT_NULLS_MARKER(0))
> 
> My emails to Neil are bouncing:
> 
> 	neilb@suse.com
> 	  host smtp.glb1.softwaregrp.com [15.124.2.87]
> 	  SMTP error from remote mail server after RCPT TO:<neilb@suse.com>:
> 	  550 Cannot process address

Yeah this just started happening 2 days ago.


* [PATCH v3] rhashtable: detect when object movement between tables might have invalidated a lookup
  2018-11-16  6:59   ` NeilBrown
  2018-11-19  3:54     ` Herbert Xu
@ 2018-11-29 23:26     ` NeilBrown
  2018-12-01  8:47       ` Herbert Xu
  2018-12-03 23:32       ` [PATCH v3] " David Miller
  1 sibling, 2 replies; 13+ messages in thread
From: NeilBrown @ 2018-11-29 23:26 UTC (permalink / raw)
  To: Herbert Xu, David Miller; +Cc: tgraf, netdev, linux-kernel, eric.dumazet



Some users of rhashtables might need to move an object from one table
to another -  this appears to be the reason for the incomplete usage
of NULLS markers.

To support these, we store a unique NULLS_MARKER at the end of
each chain, and when a search fails to find a match, we check
if the NULLS marker found was the expected one.  If not, the search
may not have examined all objects in the target bucket, so it is
repeated.

The unique NULLS_MARKER is derived from the address of the
head of the chain.  As this cannot be derived at load-time the
static rhnull in rht_bucket_nested() needs to be initialised
at run time.

Any caller of a lookup function must still be prepared for the
possibility that the object returned is in a different table - it
might have been there for some time.

Note that this does NOT provide support for other uses of
NULLS_MARKERs such as allocating with SLAB_TYPESAFE_BY_RCU or changing
the key of an object and re-inserting it in the same table.
These could only be done safely if new objects were inserted
at the *start* of a hash chain, and that is not currently the case.

Signed-off-by: NeilBrown <neilb@suse.com>
---

This version has an added comment to explain the ">>1" in
RHT_NULLS_MARKER().

Thanks,
NeilBrown

 include/linux/rhashtable.h | 34 ++++++++++++++++++++++++++--------
 lib/rhashtable.c           |  8 +++++---
 2 files changed, 31 insertions(+), 11 deletions(-)

diff --git a/include/linux/rhashtable.h b/include/linux/rhashtable.h
index eb7111039247..ae507af54800 100644
--- a/include/linux/rhashtable.h
+++ b/include/linux/rhashtable.h
@@ -75,8 +75,19 @@ struct bucket_table {
 	struct rhash_head __rcu *buckets[] ____cacheline_aligned_in_smp;
 };
 
+/*
+ * NULLS_MARKER() expects a hash value with the low
+ * bits most likely to be significant, and it discards
+ * the msb.
+ * We give it an address, in which the bottom 2 bits are
+ * always 0, and the msb might be significant.
+ * So we shift the address down one bit to align with
+ * expectations and avoid losing a significant bit.
+ */
+#define	RHT_NULLS_MARKER(ptr)	\
+	((void *)NULLS_MARKER(((unsigned long) (ptr)) >> 1))
 #define INIT_RHT_NULLS_HEAD(ptr)	\
-	((ptr) = (typeof(ptr)) NULLS_MARKER(0))
+	((ptr) = RHT_NULLS_MARKER(&(ptr)))
 
 static inline bool rht_is_a_nulls(const struct rhash_head *ptr)
 {
@@ -471,6 +482,7 @@ static inline struct rhash_head *__rhashtable_lookup(
 		.ht = ht,
 		.key = key,
 	};
+	struct rhash_head __rcu * const *head;
 	struct bucket_table *tbl;
 	struct rhash_head *he;
 	unsigned int hash;
@@ -478,13 +490,19 @@ static inline struct rhash_head *__rhashtable_lookup(
 	tbl = rht_dereference_rcu(ht->tbl, ht);
 restart:
 	hash = rht_key_hashfn(ht, tbl, key, params);
-	rht_for_each_rcu(he, tbl, hash) {
-		if (params.obj_cmpfn ?
-		    params.obj_cmpfn(&arg, rht_obj(ht, he)) :
-		    rhashtable_compare(&arg, rht_obj(ht, he)))
-			continue;
-		return he;
-	}
+	head = rht_bucket(tbl, hash);
+	do {
+		rht_for_each_rcu_continue(he, *head, tbl, hash) {
+			if (params.obj_cmpfn ?
+			    params.obj_cmpfn(&arg, rht_obj(ht, he)) :
+			    rhashtable_compare(&arg, rht_obj(ht, he)))
+				continue;
+			return he;
+		}
+		/* An object might have been moved to a different hash chain,
+		 * while we walk along it - better check and retry.
+		 */
+	} while (he != RHT_NULLS_MARKER(head));
 
 	/* Ensure we see any new tables. */
 	smp_rmb();
diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index 30526afa8343..852ffa5160f1 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -1179,8 +1179,7 @@ struct rhash_head __rcu **rht_bucket_nested(const struct bucket_table *tbl,
 					    unsigned int hash)
 {
 	const unsigned int shift = PAGE_SHIFT - ilog2(sizeof(void *));
-	static struct rhash_head __rcu *rhnull =
-		(struct rhash_head __rcu *)NULLS_MARKER(0);
+	static struct rhash_head __rcu *rhnull;
 	unsigned int index = hash & ((1 << tbl->nest) - 1);
 	unsigned int size = tbl->size >> tbl->nest;
 	unsigned int subhash = hash;
@@ -1198,8 +1197,11 @@ struct rhash_head __rcu **rht_bucket_nested(const struct bucket_table *tbl,
 		subhash >>= shift;
 	}
 
-	if (!ntbl)
+	if (!ntbl) {
+		if (!rhnull)
+			INIT_RHT_NULLS_HEAD(rhnull);
 		return &rhnull;
+	}
 
 	return &ntbl[subhash].bucket;
 
-- 
2.14.0.rc0.dirty



* Re: [PATCH v3] rhashtable: detect when object movement between tables might have invalidated a lookup
  2018-11-29 23:26     ` [PATCH v3] " NeilBrown
@ 2018-12-01  8:47       ` Herbert Xu
  2018-12-02 22:20         ` NeilBrown
  2018-12-03 23:32       ` [PATCH v3] " David Miller
  1 sibling, 1 reply; 13+ messages in thread
From: Herbert Xu @ 2018-12-01  8:47 UTC (permalink / raw)
  To: NeilBrown; +Cc: David Miller, tgraf, netdev, linux-kernel, eric.dumazet

On Fri, Nov 30, 2018 at 10:26:50AM +1100, NeilBrown wrote:
>
> diff --git a/lib/rhashtable.c b/lib/rhashtable.c
> index 30526afa8343..852ffa5160f1 100644
> --- a/lib/rhashtable.c
> +++ b/lib/rhashtable.c
> @@ -1179,8 +1179,7 @@ struct rhash_head __rcu **rht_bucket_nested(const struct bucket_table *tbl,
>  					    unsigned int hash)
>  {
>  	const unsigned int shift = PAGE_SHIFT - ilog2(sizeof(void *));
> -	static struct rhash_head __rcu *rhnull =
> -		(struct rhash_head __rcu *)NULLS_MARKER(0);
> +	static struct rhash_head __rcu *rhnull;
>  	unsigned int index = hash & ((1 << tbl->nest) - 1);
>  	unsigned int size = tbl->size >> tbl->nest;
>  	unsigned int subhash = hash;
> @@ -1198,8 +1197,11 @@ struct rhash_head __rcu **rht_bucket_nested(const struct bucket_table *tbl,
>  		subhash >>= shift;
>  	}
>  
> -	if (!ntbl)
> +	if (!ntbl) {
> +		if (!rhnull)
> +			INIT_RHT_NULLS_HEAD(rhnull);
>  		return &rhnull;
> +	}

I think you missed my earlier reply because of bouncing emails.

I think this is unnecessary because

	RHT_NULLS_MARKER(RHT_NULLS_MARKER(0)) = RHT_NULLS_MARKER(0)

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: [PATCH v3] rhashtable: detect when object movement between tables might have invalidated a lookup
  2018-12-01  8:47       ` Herbert Xu
@ 2018-12-02 22:20         ` NeilBrown
  2018-12-03  1:44           ` Herbert Xu
  0 siblings, 1 reply; 13+ messages in thread
From: NeilBrown @ 2018-12-02 22:20 UTC (permalink / raw)
  To: Herbert Xu; +Cc: David Miller, tgraf, netdev, linux-kernel, eric.dumazet


On Sat, Dec 01 2018, Herbert Xu wrote:

> On Fri, Nov 30, 2018 at 10:26:50AM +1100, NeilBrown wrote:
>>
>> diff --git a/lib/rhashtable.c b/lib/rhashtable.c
>> index 30526afa8343..852ffa5160f1 100644
>> --- a/lib/rhashtable.c
>> +++ b/lib/rhashtable.c
>> @@ -1179,8 +1179,7 @@ struct rhash_head __rcu **rht_bucket_nested(const struct bucket_table *tbl,
>>  					    unsigned int hash)
>>  {
>>  	const unsigned int shift = PAGE_SHIFT - ilog2(sizeof(void *));
>> -	static struct rhash_head __rcu *rhnull =
>> -		(struct rhash_head __rcu *)NULLS_MARKER(0);
>> +	static struct rhash_head __rcu *rhnull;
>>  	unsigned int index = hash & ((1 << tbl->nest) - 1);
>>  	unsigned int size = tbl->size >> tbl->nest;
>>  	unsigned int subhash = hash;
>> @@ -1198,8 +1197,11 @@ struct rhash_head __rcu **rht_bucket_nested(const struct bucket_table *tbl,
>>  		subhash >>= shift;
>>  	}
>>  
>> -	if (!ntbl)
>> +	if (!ntbl) {
>> +		if (!rhnull)
>> +			INIT_RHT_NULLS_HEAD(rhnull);
>>  		return &rhnull;
>> +	}
>
> I think you missed my earlier reply beause of bouncing emails.

Yeah, sorry about that.  I should have looked through an lkml archive
once I realized that was happening - I have now.

>
> I think this is unnecessary because
>
> 	RHT_NULLS_MARKER(RHT_NULLS_MARKER(0)) = RHT_NULLS_MARKER(0)
>

I don't understand how this is relevant.

I think you are saying that when rht_bucket_nested() finds that the
target page hasn't been allocated, it should return a pointer to a
static variable which contains RHT_NULLS_MARKER(0)

 static struct rhash_head *rhnull = RHT_NULLS_MARKER(0);

Then in __rhashtable_lookup(),
	head = rht_bucket(tbl, hash);

would result in 'head' having the value '&rhnull'.

Then
		rht_for_each_rcu_continue(he, *head, tbl, hash) {

would result in 'he' having the value RHT_NULLS_MARKER(0)

Then
	} while (he != RHT_NULLS_MARKER(head));

will compare RHT_NULLS_MARKER(0) with RHT_NULLS_MARKER(&rhnull)
and they will be different, so it will loop indefinitely.

With respect to the shifting, you wrote:

> The top-bit is most likely to be fixed and offer no real value.

While this might be likely, it is not certain, so it is not relevant.
On a 32bit machine with more than 2GB of physical memory, some memory
addresses will have 0 in the msb, some will have 1.
It is possible (however unlikely) that two hash buckets in different
tables will have the same address except for the msb.  If we ignore the
msb, we might incorrectly determine that we have reached the end of the
chain from the first bucket, whereas we actually reached the end of the
chain from the second bucket.
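
That scenario can be seen with a small userspace sketch (again assuming
NULLS_MARKER() matches include/linux/list_nulls.h, and using two made-up
bucket addresses that differ only in the msb of a 32-bit pointer):

  #include <stdio.h>

  #define MASK32	0xffffffffUL	/* simulate 32-bit pointers on any host */
  #define NULLS_MARKER(value)	(1UL | (((unsigned long)(value)) << 1))
  /* as in the patch, minus the (void *) cast, for easy printing */
  #define RHT_NULLS_MARKER(ptr)	NULLS_MARKER(((unsigned long)(ptr)) >> 1)

  int main(void)
  {
  	unsigned long a = 0x00001230UL, b = 0x80001230UL;

  	/* Feeding the address straight to NULLS_MARKER() shifts the msb
  	 * out, so both buckets would get the same end-of-chain marker. */
  	printf("unshifted: %#010lx %#010lx\n",
  	       NULLS_MARKER(a) & MASK32, NULLS_MARKER(b) & MASK32);

  	/* Shifting right first only discards an always-zero low bit, so
  	 * the markers stay distinct (each is just "address | 1"). */
  	printf("shifted:   %#010lx %#010lx\n",
  	       RHT_NULLS_MARKER(a) & MASK32, RHT_NULLS_MARKER(b) & MASK32);
  	return 0;
  }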

Thanks,
NeilBrown


* Re: [PATCH v3] rhashtable: detect when object movement between tables might have invalidated a lookup
  2018-12-02 22:20         ` NeilBrown
@ 2018-12-03  1:44           ` Herbert Xu
  2018-12-03 22:28             ` [PATCH] " NeilBrown
  0 siblings, 1 reply; 13+ messages in thread
From: Herbert Xu @ 2018-12-03  1:44 UTC (permalink / raw)
  To: NeilBrown; +Cc: David Miller, tgraf, netdev, linux-kernel, eric.dumazet

On Mon, Dec 03, 2018 at 09:20:23AM +1100, NeilBrown wrote:
>
> I don't understand how this is relevant.

Thanks for the explanation, I had missed the double pointer part.
With that, I'm happy with this patch as it stands:

Acked-by: Herbert Xu <herbert@gondor.apana.org.au>

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* [PATCH] rhashtable: detect when object movement between tables might have invalidated a lookup
  2018-12-03  1:44           ` Herbert Xu
@ 2018-12-03 22:28             ` NeilBrown
  0 siblings, 0 replies; 13+ messages in thread
From: NeilBrown @ 2018-12-03 22:28 UTC (permalink / raw)
  To: Herbert Xu, David Miller; +Cc: tgraf, netdev, linux-kernel, eric.dumazet



Some users of rhashtables might need to move an object from one table
to another -  this appears to be the reason for the incomplete usage
of NULLS markers.

To support these, we store a unique NULLS_MARKER at the end of
each chain, and when a search fails to find a match, we check
if the NULLS marker found was the expected one.  If not, the search
may not have examined all objects in the target bucket, so it is
repeated.

The unique NULLS_MARKER is derived from the address of the
head of the chain.  As this cannot be derived at load-time the
static rhnull in rht_bucket_nested() needs to be initialised
at run time.

Any caller of a lookup function must still be prepared for the
possibility that the object returned is in a different table - it
might have been there for some time.

Note that this does NOT provide support for other uses of
NULLS_MARKERs such as allocating with SLAB_TYPESAFE_BY_RCU or changing
the key of an object and re-inserting it in the same table.
These could only be done safely if new objects were inserted
at the *start* of a hash chain, and that is not currently the case.

Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NeilBrown <neilb@suse.com>
---

Thanks Herbert,
 here is the patch complete with the Acked-by in case that makes it
 easier to apply.

Thanks,
NeilBrown


 include/linux/rhashtable.h | 34 ++++++++++++++++++++++++++--------
 lib/rhashtable.c           |  8 +++++---
 2 files changed, 31 insertions(+), 11 deletions(-)

diff --git a/include/linux/rhashtable.h b/include/linux/rhashtable.h
index eb7111039247..20f9c6af7473 100644
--- a/include/linux/rhashtable.h
+++ b/include/linux/rhashtable.h
@@ -75,8 +75,19 @@ struct bucket_table {
 	struct rhash_head __rcu *buckets[] ____cacheline_aligned_in_smp;
 };
 
+/*
+ * NULLS_MARKER() expects a hash value with the low
+ * bits most likely to be significant, and it discards
+ * the msb.
+ * We give it an address, in which the bottom 2 bits are
+ * always 0, and the msb might be significant.
+ * So we shift the address down one bit to align with
+ * expectations and avoid losing a significant bit.
+ */
+#define	RHT_NULLS_MARKER(ptr)	\
+	((void *)NULLS_MARKER(((unsigned long) (ptr)) >> 1))
 #define INIT_RHT_NULLS_HEAD(ptr)	\
-	((ptr) = (typeof(ptr)) NULLS_MARKER(0))
+	((ptr) = RHT_NULLS_MARKER(&(ptr)))
 
 static inline bool rht_is_a_nulls(const struct rhash_head *ptr)
 {
@@ -471,6 +482,7 @@ static inline struct rhash_head *__rhashtable_lookup(
 		.ht = ht,
 		.key = key,
 	};
+	struct rhash_head __rcu * const *head;
 	struct bucket_table *tbl;
 	struct rhash_head *he;
 	unsigned int hash;
@@ -478,13 +490,19 @@ static inline struct rhash_head *__rhashtable_lookup(
 	tbl = rht_dereference_rcu(ht->tbl, ht);
 restart:
 	hash = rht_key_hashfn(ht, tbl, key, params);
-	rht_for_each_rcu(he, tbl, hash) {
-		if (params.obj_cmpfn ?
-		    params.obj_cmpfn(&arg, rht_obj(ht, he)) :
-		    rhashtable_compare(&arg, rht_obj(ht, he)))
-			continue;
-		return he;
-	}
+	head = rht_bucket(tbl, hash);
+	do {
+		rht_for_each_rcu_continue(he, *head, tbl, hash) {
+			if (params.obj_cmpfn ?
+			    params.obj_cmpfn(&arg, rht_obj(ht, he)) :
+			    rhashtable_compare(&arg, rht_obj(ht, he)))
+				continue;
+			return he;
+		}
+		/* An object might have been moved to a different hash chain,
+		 * while we walk along it - better check and retry.
+		 */
+	} while (he != RHT_NULLS_MARKER(head));
 
 	/* Ensure we see any new tables. */
 	smp_rmb();
diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index 30526afa8343..852ffa5160f1 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -1179,8 +1179,7 @@ struct rhash_head __rcu **rht_bucket_nested(const struct bucket_table *tbl,
 					    unsigned int hash)
 {
 	const unsigned int shift = PAGE_SHIFT - ilog2(sizeof(void *));
-	static struct rhash_head __rcu *rhnull =
-		(struct rhash_head __rcu *)NULLS_MARKER(0);
+	static struct rhash_head __rcu *rhnull;
 	unsigned int index = hash & ((1 << tbl->nest) - 1);
 	unsigned int size = tbl->size >> tbl->nest;
 	unsigned int subhash = hash;
@@ -1198,8 +1197,11 @@ struct rhash_head __rcu **rht_bucket_nested(const struct bucket_table *tbl,
 		subhash >>= shift;
 	}
 
-	if (!ntbl)
+	if (!ntbl) {
+		if (!rhnull)
+			INIT_RHT_NULLS_HEAD(rhnull);
 		return &rhnull;
+	}
 
 	return &ntbl[subhash].bucket;
 
-- 
2.14.0.rc0.dirty



* Re: [PATCH v3] rhashtable: detect when object movement between tables might have invalidated a lookup
  2018-11-29 23:26     ` [PATCH v3] " NeilBrown
  2018-12-01  8:47       ` Herbert Xu
@ 2018-12-03 23:32       ` David Miller
  1 sibling, 0 replies; 13+ messages in thread
From: David Miller @ 2018-12-03 23:32 UTC (permalink / raw)
  To: neilb; +Cc: herbert, tgraf, netdev, linux-kernel, eric.dumazet

From: NeilBrown <neilb@suse.com>
Date: Fri, 30 Nov 2018 10:26:50 +1100

> 
> Some users of rhashtables might need to move an object from one table
> to another -  this appears to be the reason for the incomplete usage
> of NULLS markers.
> 
> To support these, we store a unique NULLS_MARKER at the end of
> each chain, and when a search fails to find a match, we check
> if the NULLS marker found was the expected one.  If not, the search
> may not have examined all objects in the target bucket, so it is
> repeated.
> 
> The unique NULLS_MARKER is derived from the address of the
> head of the chain.  As this cannot be derived at load-time the
> static rhnull in rht_bucket_nested() needs to be initialised
> at run time.
> 
> Any caller of a lookup function must still be prepared for the
> possibility that the object returned is in a different table - it
> might have been there for some time.
> 
> Note that this does NOT provide support for other uses of
> NULLS_MARKERs such as allocating with SLAB_TYPESAFE_BY_RCU or changing
> the key of an object and re-inserting it in the same table.
> These could only be done safely if new objects were inserted
> at the *start* of a hash chain, and that is not currently the case.
> 
> Signed-off-by: NeilBrown <neilb@suse.com>

Applied to net-next.
