linux-kernel.vger.kernel.org archive mirror
* [PATCH v2 for-next 0/4] optimise sbitmap deferred clear
@ 2020-11-22 15:35 Pavel Begunkov
  2020-11-22 15:35 ` [PATCH v2 1/4] sbitmap: optimise sbitmap_deferred_clear() Pavel Begunkov
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: Pavel Begunkov @ 2020-11-22 15:35 UTC (permalink / raw)
  To: Jens Axboe, linux-block, Omar Sandoval; +Cc: linux-kernel

sbitmap takes away some cycles for my tag-deficient test; removal of
locking in sbitmap_deferred_clear() gives ~+1% throughput.

[1/4] and [4/4] are simple; it'd be great if someone could double-check
the other two patches for ordering issues.

v2: add 3rd (CAS -> atomic and) and 4th patches

Pavel Begunkov (4):
  sbitmap: optimise sbitmap_deferred_clear()
  sbitmap: remove swap_lock
  sbitmap: replace CAS with atomic and
  sbitmap: simplify wrap check

 include/linux/sbitmap.h |  5 -----
 lib/sbitmap.c           | 44 +++++++++++++++++------------------------
 2 files changed, 18 insertions(+), 31 deletions(-)

-- 
2.24.0



* [PATCH v2 1/4] sbitmap: optimise sbitmap_deferred_clear()
  2020-11-22 15:35 [PATCH v2 for-next 0/4] optimise sbitmap deferred clear Pavel Begunkov
@ 2020-11-22 15:35 ` Pavel Begunkov
  2020-11-24 14:11   ` John Garry
  2020-11-22 15:35 ` [PATCH v2 2/4] sbitmap: remove swap_lock Pavel Begunkov
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: Pavel Begunkov @ 2020-11-22 15:35 UTC (permalink / raw)
  To: Jens Axboe, linux-block, Omar Sandoval; +Cc: linux-kernel

Because of spinlocks and atomics, sbitmap_deferred_clear() has to reload
&sb->map[index] on each access even though the map address won't change.
Pass in the struct sbitmap_word * instead of {sb, index}, so it's cached
in a variable. It also improves code generation of
sbitmap_find_bit_in_index().

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 lib/sbitmap.c | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/lib/sbitmap.c b/lib/sbitmap.c
index 267aa7709416..c1c8a4e69325 100644
--- a/lib/sbitmap.c
+++ b/lib/sbitmap.c
@@ -12,32 +12,32 @@
 /*
  * See if we have deferred clears that we can batch move
  */
-static inline bool sbitmap_deferred_clear(struct sbitmap *sb, int index)
+static inline bool sbitmap_deferred_clear(struct sbitmap_word *map)
 {
 	unsigned long mask, val;
 	bool ret = false;
 	unsigned long flags;
 
-	spin_lock_irqsave(&sb->map[index].swap_lock, flags);
+	spin_lock_irqsave(&map->swap_lock, flags);
 
-	if (!sb->map[index].cleared)
+	if (!map->cleared)
 		goto out_unlock;
 
 	/*
 	 * First get a stable cleared mask, setting the old mask to 0.
 	 */
-	mask = xchg(&sb->map[index].cleared, 0);
+	mask = xchg(&map->cleared, 0);
 
 	/*
 	 * Now clear the masked bits in our free word
 	 */
 	do {
-		val = sb->map[index].word;
-	} while (cmpxchg(&sb->map[index].word, val, val & ~mask) != val);
+		val = map->word;
+	} while (cmpxchg(&map->word, val, val & ~mask) != val);
 
 	ret = true;
 out_unlock:
-	spin_unlock_irqrestore(&sb->map[index].swap_lock, flags);
+	spin_unlock_irqrestore(&map->swap_lock, flags);
 	return ret;
 }
 
@@ -92,7 +92,7 @@ void sbitmap_resize(struct sbitmap *sb, unsigned int depth)
 	unsigned int i;
 
 	for (i = 0; i < sb->map_nr; i++)
-		sbitmap_deferred_clear(sb, i);
+		sbitmap_deferred_clear(&sb->map[i]);
 
 	sb->depth = depth;
 	sb->map_nr = DIV_ROUND_UP(sb->depth, bits_per_word);
@@ -139,15 +139,15 @@ static int __sbitmap_get_word(unsigned long *word, unsigned long depth,
 static int sbitmap_find_bit_in_index(struct sbitmap *sb, int index,
 				     unsigned int alloc_hint, bool round_robin)
 {
+	struct sbitmap_word *map = &sb->map[index];
 	int nr;
 
 	do {
-		nr = __sbitmap_get_word(&sb->map[index].word,
-					sb->map[index].depth, alloc_hint,
+		nr = __sbitmap_get_word(&map->word, map->depth, alloc_hint,
 					!round_robin);
 		if (nr != -1)
 			break;
-		if (!sbitmap_deferred_clear(sb, index))
+		if (!sbitmap_deferred_clear(map))
 			break;
 	} while (1);
 
@@ -207,7 +207,7 @@ int sbitmap_get_shallow(struct sbitmap *sb, unsigned int alloc_hint,
 			break;
 		}
 
-		if (sbitmap_deferred_clear(sb, index))
+		if (sbitmap_deferred_clear(&sb->map[index]))
 			goto again;
 
 		/* Jump to next index. */
-- 
2.24.0



* [PATCH v2 2/4] sbitmap: remove swap_lock
  2020-11-22 15:35 [PATCH v2 for-next 0/4] optimise sbitmap deferred clear Pavel Begunkov
  2020-11-22 15:35 ` [PATCH v2 1/4] sbitmap: optimise sbitmap_deferred_clear() Pavel Begunkov
@ 2020-11-22 15:35 ` Pavel Begunkov
  2020-11-24 14:22   ` John Garry
  2020-11-26  2:46   ` Ming Lei
  2020-11-22 15:35 ` [PATCH v2 3/4] sbitmap: replace CAS with atomic and Pavel Begunkov
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 13+ messages in thread
From: Pavel Begunkov @ 2020-11-22 15:35 UTC (permalink / raw)
  To: Jens Axboe, linux-block, Omar Sandoval; +Cc: linux-kernel

map->swap_lock protects map->cleared from concurrent modification;
however, sbitmap_deferred_clear() already drains it atomically, so
it's guaranteed not to lose bits on concurrent
sbitmap_deferred_clear() calls.

A single-threaded, tag-heavy test on top of nullblk showed a ~1.5%
throughput increase and a 3% -> 1% cycle reduction in sbitmap_get()
according to perf.
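
To illustrate why no bits are lost (a sketch of the interleaving; the
mask0/mask1 locals are hypothetical names, not from the code): xchg()
atomically snapshots and zeroes ->cleared, so two racing callers always
get disjoint masks and each folds only its own snapshot into ->word:

	CPU0                                 CPU1
	mask0 = xchg(&map->cleared, 0);
	                                     /* sees only bits set after CPU0's xchg */
	                                     mask1 = xchg(&map->cleared, 0);
	clear mask0 bits from ->word         clear mask1 bits from ->word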

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 include/linux/sbitmap.h |  5 -----
 lib/sbitmap.c           | 14 +++-----------
 2 files changed, 3 insertions(+), 16 deletions(-)

diff --git a/include/linux/sbitmap.h b/include/linux/sbitmap.h
index e40d019c3d9d..74cc6384715e 100644
--- a/include/linux/sbitmap.h
+++ b/include/linux/sbitmap.h
@@ -32,11 +32,6 @@ struct sbitmap_word {
 	 * @cleared: word holding cleared bits
 	 */
 	unsigned long cleared ____cacheline_aligned_in_smp;
-
-	/**
-	 * @swap_lock: Held while swapping word <-> cleared
-	 */
-	spinlock_t swap_lock;
 } ____cacheline_aligned_in_smp;
 
 /**
diff --git a/lib/sbitmap.c b/lib/sbitmap.c
index c1c8a4e69325..4fd877048ba8 100644
--- a/lib/sbitmap.c
+++ b/lib/sbitmap.c
@@ -15,13 +15,9 @@
 static inline bool sbitmap_deferred_clear(struct sbitmap_word *map)
 {
 	unsigned long mask, val;
-	bool ret = false;
-	unsigned long flags;
 
-	spin_lock_irqsave(&map->swap_lock, flags);
-
-	if (!map->cleared)
-		goto out_unlock;
+	if (!READ_ONCE(map->cleared))
+		return false;
 
 	/*
 	 * First get a stable cleared mask, setting the old mask to 0.
@@ -35,10 +31,7 @@ static inline bool sbitmap_deferred_clear(struct sbitmap_word *map)
 		val = map->word;
 	} while (cmpxchg(&map->word, val, val & ~mask) != val);
 
-	ret = true;
-out_unlock:
-	spin_unlock_irqrestore(&map->swap_lock, flags);
-	return ret;
+	return true;
 }
 
 int sbitmap_init_node(struct sbitmap *sb, unsigned int depth, int shift,
@@ -80,7 +73,6 @@ int sbitmap_init_node(struct sbitmap *sb, unsigned int depth, int shift,
 	for (i = 0; i < sb->map_nr; i++) {
 		sb->map[i].depth = min(depth, bits_per_word);
 		depth -= sb->map[i].depth;
-		spin_lock_init(&sb->map[i].swap_lock);
 	}
 	return 0;
 }
-- 
2.24.0



* [PATCH v2 3/4] sbitmap: replace CAS with atomic and
  2020-11-22 15:35 [PATCH v2 for-next 0/4] optimise sbitmap deferred clear Pavel Begunkov
  2020-11-22 15:35 ` [PATCH v2 1/4] sbitmap: optimise sbitmap_deferred_clear() Pavel Begunkov
  2020-11-22 15:35 ` [PATCH v2 2/4] sbitmap: remove swap_lock Pavel Begunkov
@ 2020-11-22 15:35 ` Pavel Begunkov
  2020-11-22 15:35 ` [PATCH v2 4/4] sbitmap: simplify wrap check Pavel Begunkov
  2020-12-08  0:13 ` [PATCH v2 for-next 0/4] optimise sbitmap deferred clear Jens Axboe
  4 siblings, 0 replies; 13+ messages in thread
From: Pavel Begunkov @ 2020-11-22 15:35 UTC (permalink / raw)
  To: Jens Axboe, linux-block, Omar Sandoval; +Cc: linux-kernel

sbitmap_deferred_clear() does a CAS loop to propagate cleared bits;
replace it with an equivalent atomic bitwise AND. That's slightly faster
and makes it wait-free instead of lock-free as before.

The atomic can be relaxed (i.e. barrier-less) because the following
sbitmap_get*() calls deal with synchronisation, see the comments in
sbitmap_queue_clear().

It's ok to cast to atomic_long_t; that's what bitops/lock.h does.
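
For reference, the asm-generic bitops/lock.h helper performs the same
kind of cast (a rough sketch from memory, not a verbatim quote of the
kernel source):

	static inline void clear_bit_unlock(unsigned int nr,
					    volatile unsigned long *p)
	{
		p += BIT_WORD(nr);
		atomic_long_fetch_andnot_release(BIT_MASK(nr),
						 (atomic_long_t *)p);
	}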

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 lib/sbitmap.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/lib/sbitmap.c b/lib/sbitmap.c
index 4fd877048ba8..c18b518a16ba 100644
--- a/lib/sbitmap.c
+++ b/lib/sbitmap.c
@@ -14,7 +14,7 @@
  */
 static inline bool sbitmap_deferred_clear(struct sbitmap_word *map)
 {
-	unsigned long mask, val;
+	unsigned long mask;
 
 	if (!READ_ONCE(map->cleared))
 		return false;
@@ -27,10 +27,8 @@ static inline bool sbitmap_deferred_clear(struct sbitmap_word *map)
 	/*
 	 * Now clear the masked bits in our free word
 	 */
-	do {
-		val = map->word;
-	} while (cmpxchg(&map->word, val, val & ~mask) != val);
-
+	atomic_long_andnot(mask, (atomic_long_t *)&map->word);
+	BUILD_BUG_ON(sizeof(atomic_long_t) != sizeof(map->word));
 	return true;
 }
 
-- 
2.24.0



* [PATCH v2 4/4] sbitmap: simplify wrap check
  2020-11-22 15:35 [PATCH v2 for-next 0/4] optimise sbitmap deferred clear Pavel Begunkov
                   ` (2 preceding siblings ...)
  2020-11-22 15:35 ` [PATCH v2 3/4] sbitmap: replace CAS with atomic and Pavel Begunkov
@ 2020-11-22 15:35 ` Pavel Begunkov
  2020-12-08  0:13 ` [PATCH v2 for-next 0/4] optimise sbitmap deferred clear Jens Axboe
  4 siblings, 0 replies; 13+ messages in thread
From: Pavel Begunkov @ 2020-11-22 15:35 UTC (permalink / raw)
  To: Jens Axboe, linux-block, Omar Sandoval; +Cc: linux-kernel

__sbitmap_get_word() doesn't wrap if it's starting from the beginning
(i.e. the initial hint is 0). Instead of stashing the original hint, just
set @wrap accordingly.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 lib/sbitmap.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/lib/sbitmap.c b/lib/sbitmap.c
index c18b518a16ba..d693d9213ceb 100644
--- a/lib/sbitmap.c
+++ b/lib/sbitmap.c
@@ -97,9 +97,11 @@ EXPORT_SYMBOL_GPL(sbitmap_resize);
 static int __sbitmap_get_word(unsigned long *word, unsigned long depth,
 			      unsigned int hint, bool wrap)
 {
-	unsigned int orig_hint = hint;
 	int nr;
 
+	/* don't wrap if starting from 0 */
+	wrap = wrap && hint;
+
 	while (1) {
 		nr = find_next_zero_bit(word, depth, hint);
 		if (unlikely(nr >= depth)) {
@@ -108,8 +110,8 @@ static int __sbitmap_get_word(unsigned long *word, unsigned long depth,
 			 * offset to 0 in a failure case, so start from 0 to
 			 * exhaust the map.
 			 */
-			if (orig_hint && hint && wrap) {
-				hint = orig_hint = 0;
+			if (hint && wrap) {
+				hint = 0;
 				continue;
 			}
 			return -1;
-- 
2.24.0



* Re: [PATCH v2 1/4] sbitmap: optimise sbitmap_deferred_clear()
  2020-11-22 15:35 ` [PATCH v2 1/4] sbitmap: optimise sbitmap_deferred_clear() Pavel Begunkov
@ 2020-11-24 14:11   ` John Garry
  2020-11-24 15:01     ` Pavel Begunkov
  0 siblings, 1 reply; 13+ messages in thread
From: John Garry @ 2020-11-24 14:11 UTC (permalink / raw)
  To: Pavel Begunkov, Jens Axboe, linux-block, Omar Sandoval; +Cc: linux-kernel

On 22/11/2020 15:35, Pavel Begunkov wrote:
> Because of spinlocks and atomics, sbitmap_deferred_clear() has to reload
> &sb->map[index] on each access even though the map address won't change.
> Pass in the struct sbitmap_word * instead of {sb, index}, so it's cached
> in a variable. It also improves code generation of
> sbitmap_find_bit_in_index().
> 
> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>

Looks ok, even though it's a bit odd not to be passing a struct sbitmap * now

Reviewed-by: John Garry <john.garry@huawei.com>


* Re: [PATCH v2 2/4] sbitmap: remove swap_lock
  2020-11-22 15:35 ` [PATCH v2 2/4] sbitmap: remove swap_lock Pavel Begunkov
@ 2020-11-24 14:22   ` John Garry
  2020-11-24 14:43     ` Pavel Begunkov
  2020-11-26  2:46   ` Ming Lei
  1 sibling, 1 reply; 13+ messages in thread
From: John Garry @ 2020-11-24 14:22 UTC (permalink / raw)
  To: Pavel Begunkov, Jens Axboe, linux-block, Omar Sandoval; +Cc: linux-kernel

On 22/11/2020 15:35, Pavel Begunkov wrote:
> map->swap_lock protects map->cleared from concurrent modification;
> however, sbitmap_deferred_clear() already drains it atomically, so
> it's guaranteed not to lose bits on concurrent
> sbitmap_deferred_clear() calls.
> 
> A single-threaded, tag-heavy test on top of nullblk showed a ~1.5%
> throughput increase and a 3% -> 1% cycle reduction in sbitmap_get()
> according to perf.
> 
> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
> ---
>   include/linux/sbitmap.h |  5 -----
>   lib/sbitmap.c           | 14 +++-----------
>   2 files changed, 3 insertions(+), 16 deletions(-)
> 
> diff --git a/include/linux/sbitmap.h b/include/linux/sbitmap.h
> index e40d019c3d9d..74cc6384715e 100644
> --- a/include/linux/sbitmap.h
> +++ b/include/linux/sbitmap.h
> @@ -32,11 +32,6 @@ struct sbitmap_word {
>   	 * @cleared: word holding cleared bits
>   	 */
>   	unsigned long cleared ____cacheline_aligned_in_smp;
> -
> -	/**
> -	 * @swap_lock: Held while swapping word <-> cleared
> -	 */
> -	spinlock_t swap_lock;
>   } ____cacheline_aligned_in_smp;
>   
>   /**
> diff --git a/lib/sbitmap.c b/lib/sbitmap.c
> index c1c8a4e69325..4fd877048ba8 100644
> --- a/lib/sbitmap.c
> +++ b/lib/sbitmap.c
> @@ -15,13 +15,9 @@
>   static inline bool sbitmap_deferred_clear(struct sbitmap_word *map)
>   {
>   	unsigned long mask, val;
> -	bool ret = false;
> -	unsigned long flags;
>   
> -	spin_lock_irqsave(&map->swap_lock, flags);
> -
> -	if (!map->cleared)
> -		goto out_unlock;
> +	if (!READ_ONCE(map->cleared))
> +		return false;

So if we race with another CPU, won't the 2nd CPU see a mask of 0
returned from the xchg (not shown)? If so, it's odd to continue to do
the CAS - or the atomic andnot, from the later patch - on a mask of 0.

Thanks,
John

>   
>   	/*
>   	 * First get a stable cleared mask, setting the old mask to 0.
> @@ -35,10 +31,7 @@ static inline bool sbitmap_deferred_clear(struct sbitmap_word *map)
>   		val = map->word;
>   	} while (cmpxchg(&map->word, val, val & ~mask) != val);
>   
> -	ret = true;
> -out_unlock:
> -	spin_unlock_irqrestore(&map->swap_lock, flags);
> -	return ret;
> +	return true;
>   }
>   
>   int sbitmap_init_node(struct sbitmap *sb, unsigned int depth, int shift,
> @@ -80,7 +73,6 @@ int sbitmap_init_node(struct sbitmap *sb, unsigned int depth, int shift,
>   	for (i = 0; i < sb->map_nr; i++) {
>   		sb->map[i].depth = min(depth, bits_per_word);
>   		depth -= sb->map[i].depth;
> -		spin_lock_init(&sb->map[i].swap_lock);
>   	}
>   	return 0;
>   }
> 



* Re: [PATCH v2 2/4] sbitmap: remove swap_lock
  2020-11-24 14:22   ` John Garry
@ 2020-11-24 14:43     ` Pavel Begunkov
  0 siblings, 0 replies; 13+ messages in thread
From: Pavel Begunkov @ 2020-11-24 14:43 UTC (permalink / raw)
  To: John Garry, Jens Axboe, linux-block, Omar Sandoval; +Cc: linux-kernel

On 24/11/2020 14:22, John Garry wrote:
> On 22/11/2020 15:35, Pavel Begunkov wrote:
>> map->swap_lock protects map->cleared from concurrent modification;
>> however, sbitmap_deferred_clear() already drains it atomically, so
>> it's guaranteed not to lose bits on concurrent
>> sbitmap_deferred_clear() calls.
>>
>> A single-threaded, tag-heavy test on top of nullblk showed a ~1.5%
>> throughput increase and a 3% -> 1% cycle reduction in sbitmap_get()
>> according to perf.
>>
>> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
>> ---
>>   include/linux/sbitmap.h |  5 -----
>>   lib/sbitmap.c           | 14 +++-----------
>>   2 files changed, 3 insertions(+), 16 deletions(-)
>>
>> diff --git a/include/linux/sbitmap.h b/include/linux/sbitmap.h
>> index e40d019c3d9d..74cc6384715e 100644
>> --- a/include/linux/sbitmap.h
>> +++ b/include/linux/sbitmap.h
>> @@ -32,11 +32,6 @@ struct sbitmap_word {
>>        * @cleared: word holding cleared bits
>>        */
>>       unsigned long cleared ____cacheline_aligned_in_smp;
>> -
>> -    /**
>> -     * @swap_lock: Held while swapping word <-> cleared
>> -     */
>> -    spinlock_t swap_lock;
>>   } ____cacheline_aligned_in_smp;
>>     /**
>> diff --git a/lib/sbitmap.c b/lib/sbitmap.c
>> index c1c8a4e69325..4fd877048ba8 100644
>> --- a/lib/sbitmap.c
>> +++ b/lib/sbitmap.c
>> @@ -15,13 +15,9 @@
>>   static inline bool sbitmap_deferred_clear(struct sbitmap_word *map)
>>   {
>>       unsigned long mask, val;
>> -    bool ret = false;
>> -    unsigned long flags;
>>   -    spin_lock_irqsave(&map->swap_lock, flags);
>> -
>> -    if (!map->cleared)
>> -        goto out_unlock;
>> +    if (!READ_ONCE(map->cleared))
>> +        return false;
> 
> So if we race with another CPU, won't the 2nd CPU see a mask of 0 returned from the xchg (not shown)? If so, it's odd to continue to do the CAS - or the atomic andnot, from the later patch - on a mask of 0.

Yeah, but this part is legit, and I don't expect it to be so
contended as to need an additional check, especially with the atomic
AND from [3/4].

I'm more concerned about sbitmap_resize*() callers doing the right
synchronisation (e.g. quiescing) and not relying on the critical
section I remove. It would be great if anyone could confirm that.

> 
> Thanks,
> John
> 
>>         /*
>>        * First get a stable cleared mask, setting the old mask to 0.
>> @@ -35,10 +31,7 @@ static inline bool sbitmap_deferred_clear(struct sbitmap_word *map)
>>           val = map->word;
>>       } while (cmpxchg(&map->word, val, val & ~mask) != val);
>>   -    ret = true;
>> -out_unlock:
>> -    spin_unlock_irqrestore(&map->swap_lock, flags);
>> -    return ret;
>> +    return true;
>>   }
>>     int sbitmap_init_node(struct sbitmap *sb, unsigned int depth, int shift,
>> @@ -80,7 +73,6 @@ int sbitmap_init_node(struct sbitmap *sb, unsigned int depth, int shift,
>>       for (i = 0; i < sb->map_nr; i++) {
>>           sb->map[i].depth = min(depth, bits_per_word);
>>           depth -= sb->map[i].depth;
>> -        spin_lock_init(&sb->map[i].swap_lock);
>>       }
>>       return 0;
>>   }
>>
> 

-- 
Pavel Begunkov


* Re: [PATCH v2 1/4] sbitmap: optimise sbitmap_deferred_clear()
  2020-11-24 14:11   ` John Garry
@ 2020-11-24 15:01     ` Pavel Begunkov
  0 siblings, 0 replies; 13+ messages in thread
From: Pavel Begunkov @ 2020-11-24 15:01 UTC (permalink / raw)
  To: John Garry, Jens Axboe, linux-block, Omar Sandoval; +Cc: linux-kernel

On 24/11/2020 14:11, John Garry wrote:
> On 22/11/2020 15:35, Pavel Begunkov wrote:
>> Because of spinlocks and atomics, sbitmap_deferred_clear() has to reload
>> &sb->map[index] on each access even though the map address won't change.
>> Pass in the struct sbitmap_word * instead of {sb, index}, so it's cached
>> in a variable. It also improves code generation of
>> sbitmap_find_bit_in_index().
>>
>> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
> 
> Looks ok, even though it's a bit odd not to be passing a struct sbitmap * now

IMHO, the narrower context is better, so it looks more natural to me.

> Reviewed-by: John Garry <john.garry@huawei.com>

Thanks!

-- 
Pavel Begunkov


* Re: [PATCH v2 2/4] sbitmap: remove swap_lock
  2020-11-22 15:35 ` [PATCH v2 2/4] sbitmap: remove swap_lock Pavel Begunkov
  2020-11-24 14:22   ` John Garry
@ 2020-11-26  2:46   ` Ming Lei
  2020-11-26 13:44     ` Pavel Begunkov
  1 sibling, 1 reply; 13+ messages in thread
From: Ming Lei @ 2020-11-26  2:46 UTC (permalink / raw)
  To: Pavel Begunkov; +Cc: Jens Axboe, linux-block, Omar Sandoval, linux-kernel

On Sun, Nov 22, 2020 at 03:35:46PM +0000, Pavel Begunkov wrote:
> map->swap_lock protects map->cleared from concurrent modification;
> however, sbitmap_deferred_clear() already drains it atomically, so
> it's guaranteed not to lose bits on concurrent
> sbitmap_deferred_clear() calls.
> 
> A single-threaded, tag-heavy test on top of nullblk showed a ~1.5%
> throughput increase and a 3% -> 1% cycle reduction in sbitmap_get()
> according to perf.
> 
> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
> ---
>  include/linux/sbitmap.h |  5 -----
>  lib/sbitmap.c           | 14 +++-----------
>  2 files changed, 3 insertions(+), 16 deletions(-)
> 
> diff --git a/include/linux/sbitmap.h b/include/linux/sbitmap.h
> index e40d019c3d9d..74cc6384715e 100644
> --- a/include/linux/sbitmap.h
> +++ b/include/linux/sbitmap.h
> @@ -32,11 +32,6 @@ struct sbitmap_word {
>  	 * @cleared: word holding cleared bits
>  	 */
>  	unsigned long cleared ____cacheline_aligned_in_smp;
> -
> -	/**
> -	 * @swap_lock: Held while swapping word <-> cleared
> -	 */
> -	spinlock_t swap_lock;
>  } ____cacheline_aligned_in_smp;
>  
>  /**
> diff --git a/lib/sbitmap.c b/lib/sbitmap.c
> index c1c8a4e69325..4fd877048ba8 100644
> --- a/lib/sbitmap.c
> +++ b/lib/sbitmap.c
> @@ -15,13 +15,9 @@
>  static inline bool sbitmap_deferred_clear(struct sbitmap_word *map)
>  {
>  	unsigned long mask, val;
> -	bool ret = false;
> -	unsigned long flags;
>  
> -	spin_lock_irqsave(&map->swap_lock, flags);
> -
> -	if (!map->cleared)
> -		goto out_unlock;
> +	if (!READ_ONCE(map->cleared))
> +		return false;

This way might break sbitmap_find_bit_in_index()/sbitmap_get_shallow().
Currently if sbitmap_deferred_clear() returns false, it means nothing
can be allocated from this word. With this patch, even though 'false'
is returned, free bits still might be available because another
sbitmap_deferred_clear() can be run concurrently.
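
For example, in the window between the xchg() and the update of ->word:

	CPU0                                  CPU1
	sbitmap_deferred_clear()
	  READ_ONCE(map->cleared) != 0
	  mask = xchg(&map->cleared, 0);
	                                      sbitmap_deferred_clear()
	                                        READ_ONCE(map->cleared) == 0
	                                        return false; /* gives up on the word */
	  /* only now clears mask from ->word */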

Thanks,
Ming



* Re: [PATCH v2 2/4] sbitmap: remove swap_lock
  2020-11-26  2:46   ` Ming Lei
@ 2020-11-26 13:44     ` Pavel Begunkov
  2020-11-27  2:06       ` Ming Lei
  0 siblings, 1 reply; 13+ messages in thread
From: Pavel Begunkov @ 2020-11-26 13:44 UTC (permalink / raw)
  To: Ming Lei; +Cc: Jens Axboe, linux-block, Omar Sandoval, linux-kernel

On 26/11/2020 02:46, Ming Lei wrote:
> On Sun, Nov 22, 2020 at 03:35:46PM +0000, Pavel Begunkov wrote:
>> map->swap_lock protects map->cleared from concurrent modification;
>> however, sbitmap_deferred_clear() already drains it atomically, so
>> it's guaranteed not to lose bits on concurrent
>> sbitmap_deferred_clear() calls.
>>
>> A single-threaded, tag-heavy test on top of nullblk showed a ~1.5%
>> throughput increase and a 3% -> 1% cycle reduction in sbitmap_get()
>> according to perf.
>>
>> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
>> ---
>>  include/linux/sbitmap.h |  5 -----
>>  lib/sbitmap.c           | 14 +++-----------
>>  2 files changed, 3 insertions(+), 16 deletions(-)
>>
>> diff --git a/include/linux/sbitmap.h b/include/linux/sbitmap.h
>> index e40d019c3d9d..74cc6384715e 100644
>> --- a/include/linux/sbitmap.h
>> +++ b/include/linux/sbitmap.h
>> @@ -32,11 +32,6 @@ struct sbitmap_word {
>>  	 * @cleared: word holding cleared bits
>>  	 */
>>  	unsigned long cleared ____cacheline_aligned_in_smp;
>> -
>> -	/**
>> -	 * @swap_lock: Held while swapping word <-> cleared
>> -	 */
>> -	spinlock_t swap_lock;
>>  } ____cacheline_aligned_in_smp;
>>  
>>  /**
>> diff --git a/lib/sbitmap.c b/lib/sbitmap.c
>> index c1c8a4e69325..4fd877048ba8 100644
>> --- a/lib/sbitmap.c
>> +++ b/lib/sbitmap.c
>> @@ -15,13 +15,9 @@
>>  static inline bool sbitmap_deferred_clear(struct sbitmap_word *map)
>>  {
>>  	unsigned long mask, val;
>> -	bool ret = false;
>> -	unsigned long flags;
>>  
>> -	spin_lock_irqsave(&map->swap_lock, flags);
>> -
>> -	if (!map->cleared)
>> -		goto out_unlock;
>> +	if (!READ_ONCE(map->cleared))
>> +		return false;
> 
> This way might break sbitmap_find_bit_in_index()/sbitmap_get_shallow().
> Currently if sbitmap_deferred_clear() returns false, it means nothing
> can be allocated from this word. With this patch, even though 'false'
> is returned, free bits still might be available because another
> sbitmap_deferred_clear() can be run concurrently.

But that can happen anyway if someone frees a request right after we
return from sbitmap_deferred_clear(). Can you elaborate on what exactly
it breaks? Something in the sbq wakeup paths?

-- 
Pavel Begunkov


* Re: [PATCH v2 2/4] sbitmap: remove swap_lock
  2020-11-26 13:44     ` Pavel Begunkov
@ 2020-11-27  2:06       ` Ming Lei
  0 siblings, 0 replies; 13+ messages in thread
From: Ming Lei @ 2020-11-27  2:06 UTC (permalink / raw)
  To: Pavel Begunkov; +Cc: Jens Axboe, linux-block, Omar Sandoval, linux-kernel

On Thu, Nov 26, 2020 at 01:44:36PM +0000, Pavel Begunkov wrote:
> On 26/11/2020 02:46, Ming Lei wrote:
> > On Sun, Nov 22, 2020 at 03:35:46PM +0000, Pavel Begunkov wrote:
> >> map->swap_lock protects map->cleared from concurrent modification;
> >> however, sbitmap_deferred_clear() already drains it atomically, so
> >> it's guaranteed not to lose bits on concurrent
> >> sbitmap_deferred_clear() calls.
> >>
> >> A single-threaded, tag-heavy test on top of nullblk showed a ~1.5%
> >> throughput increase and a 3% -> 1% cycle reduction in sbitmap_get()
> >> according to perf.
> >>
> >> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
> >> ---
> >>  include/linux/sbitmap.h |  5 -----
> >>  lib/sbitmap.c           | 14 +++-----------
> >>  2 files changed, 3 insertions(+), 16 deletions(-)
> >>
> >> diff --git a/include/linux/sbitmap.h b/include/linux/sbitmap.h
> >> index e40d019c3d9d..74cc6384715e 100644
> >> --- a/include/linux/sbitmap.h
> >> +++ b/include/linux/sbitmap.h
> >> @@ -32,11 +32,6 @@ struct sbitmap_word {
> >>  	 * @cleared: word holding cleared bits
> >>  	 */
> >>  	unsigned long cleared ____cacheline_aligned_in_smp;
> >> -
> >> -	/**
> >> -	 * @swap_lock: Held while swapping word <-> cleared
> >> -	 */
> >> -	spinlock_t swap_lock;
> >>  } ____cacheline_aligned_in_smp;
> >>  
> >>  /**
> >> diff --git a/lib/sbitmap.c b/lib/sbitmap.c
> >> index c1c8a4e69325..4fd877048ba8 100644
> >> --- a/lib/sbitmap.c
> >> +++ b/lib/sbitmap.c
> >> @@ -15,13 +15,9 @@
> >>  static inline bool sbitmap_deferred_clear(struct sbitmap_word *map)
> >>  {
> >>  	unsigned long mask, val;
> >> -	bool ret = false;
> >> -	unsigned long flags;
> >>  
> >> -	spin_lock_irqsave(&map->swap_lock, flags);
> >> -
> >> -	if (!map->cleared)
> >> -		goto out_unlock;
> >> +	if (!READ_ONCE(map->cleared))
> >> +		return false;
> > 
> > This way might break sbitmap_find_bit_in_index()/sbitmap_get_shallow().
> > Currently if sbitmap_deferred_clear() returns false, it means nothing
> > can be allocated from this word. With this patch, even though 'false'
> > is returned, free bits still might be available because another
> > sbitmap_deferred_clear() can be run concurrently.
> 
> But that can happen anyway if someone frees a request right after we
> return from sbitmap_deferred_clear().

When one request is freed and there is an allocator waiting,
sbitmap_queue_wake_up() will wake up the allocator to retry.

If there isn't any allocator waiting, freeing a request will leave
empty bits, and the following pattern is used to make forward
progress for either request allocation or driver tags:

	tag = __blk_mq_get_tag(data, bt);	[1]
	if (tag != BLK_MQ_NO_TAG)
		break;
	sbitmap_prepare_to_wait(bt, ws, &wait, TASK_UNINTERRUPTIBLE);
	tag = __blk_mq_get_tag(data, bt);	[2]
	if (tag != BLK_MQ_NO_TAG)
		break;
	io_schedule()

When allocation [2] and sbitmap_resize() run concurrently,
allocation [2] may fail because the map's cleared bits get cleared.
But it looks like that can happen without your patch too.

> Can you elaborate on what exactly
> it breaks? Something in the sbq wakeup paths?

Thinking about it further, your patch looks good: when one allocation
wins (removing swap_lock won't change this point), the winning bit will
be freed in the future, and then a wakeup is run for the other waiters,
as long as all allocations take the same approach as above.


Thanks,
Ming



* Re: [PATCH v2 for-next 0/4] optimise sbitmap deferred clear
  2020-11-22 15:35 [PATCH v2 for-next 0/4] optimise sbitmap deferred clear Pavel Begunkov
                   ` (3 preceding siblings ...)
  2020-11-22 15:35 ` [PATCH v2 4/4] sbitmap: simplify wrap check Pavel Begunkov
@ 2020-12-08  0:13 ` Jens Axboe
  4 siblings, 0 replies; 13+ messages in thread
From: Jens Axboe @ 2020-12-08  0:13 UTC (permalink / raw)
  To: Pavel Begunkov, linux-block, Omar Sandoval; +Cc: linux-kernel

On 11/22/20 8:35 AM, Pavel Begunkov wrote:
> sbitmap takes away some cycles for my tag-deficient test; removal of
> locking in sbitmap_deferred_clear() gives ~+1% throughput.
> 
> [1/4] and [4/4] are simple; it'd be great if someone could double-check
> the other two patches for ordering issues.
> 
> v2: add 3rd (CAS -> atomic and) and 4th patches

Applied, thanks.

-- 
Jens Axboe


