* [PATCH v2] locking/rwsem: Allow slowpath writer to ignore handoff bit if not set by first waiter
From: Waiman Long @ 2022-06-22 20:04 UTC
  To: Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng
  Cc: linux-kernel, john.p.donnelly, Hillf Danton, Mel Gorman, Waiman Long

With commit d257cc8cb8d5 ("locking/rwsem: Make handoff bit handling more
consistent"), the writer that sets the handoff bit can be interrupted
and exit without clearing the bit if the wait queue isn't empty. This
disables reader and writer optimistic lock spinning and stealing.

Now if a non-first writer in the queue is somehow woken up, or a new
waiter enters the slowpath, it can't acquire the lock. This was not the
case before commit d257cc8cb8d5, as the writer that set the handoff bit
would clear it when exiting via the out_nolock path. The new behavior is
less efficient because the busy rwsem stays unlocked for a longer time.

In some cases, this new behavior may cause lockups as shown in [1] and
[2].

This patch allows a non-first writer to ignore the handoff bit if the
bit was not originally set or initiated by the first waiter. The patch
has been shown to fix the lockup problem reported in [1].

[1] https://lore.kernel.org/lkml/20220617134325.GC30825@techsingularity.net/
[2] https://lore.kernel.org/lkml/3f02975c-1a9d-be20-32cf-f1d8e3dfafcc@oracle.com/
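
For illustration only (not part of the patch): a minimal sketch of the
new decision rule in rwsem_try_write_lock(), written as a hypothetical
helper with the check pulled out of the retry loop. A queued writer now
backs off on a set HANDOFF bit only when the first waiter is the one
that armed it:

	/* Hypothetical helper, for explanation only. */
	static bool must_yield_to_handoff(struct rw_semaphore *sem,
					  struct rwsem_waiter *waiter,
					  long count)
	{
		struct rwsem_waiter *first = rwsem_first_waiter(sem);

		/* No handoff requested: try to take the lock as usual. */
		if (!(count & RWSEM_FLAG_HANDOFF))
			return false;

		/*
		 * Before this patch, any non-first waiter yielded here,
		 * even when the bit was left behind by a writer that
		 * exited the slowpath without clearing it.
		 */
		return first->handoff_set && waiter != first;
	}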

Fixes: d257cc8cb8d5 ("locking/rwsem: Make handoff bit handling more consistent")
Signed-off-by: Waiman Long <longman@redhat.com>
Tested-by: Mel Gorman <mgorman@techsingularity.net>
---
 kernel/locking/rwsem.c | 30 ++++++++++++++++++++----------
 1 file changed, 20 insertions(+), 10 deletions(-)

diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
index 9d1db4a54d34..ffd6188d4a7c 100644
--- a/kernel/locking/rwsem.c
+++ b/kernel/locking/rwsem.c
@@ -335,8 +335,6 @@ struct rwsem_waiter {
 	struct task_struct *task;
 	enum rwsem_waiter_type type;
 	unsigned long timeout;
-
-	/* Writer only, not initialized in reader */
 	bool handoff_set;
 };
 #define rwsem_first_waiter(sem) \
@@ -459,10 +457,12 @@ static void rwsem_mark_wake(struct rw_semaphore *sem,
 			 * to give up the lock), request a HANDOFF to
 			 * force the issue.
 			 */
-			if (!(oldcount & RWSEM_FLAG_HANDOFF) &&
-			    time_after(jiffies, waiter->timeout)) {
-				adjustment -= RWSEM_FLAG_HANDOFF;
-				lockevent_inc(rwsem_rlock_handoff);
+			if (time_after(jiffies, waiter->timeout)) {
+				if (!(oldcount & RWSEM_FLAG_HANDOFF)) {
+					adjustment -= RWSEM_FLAG_HANDOFF;
+					lockevent_inc(rwsem_rlock_handoff);
+				}
+				waiter->handoff_set = true;
 			}
 
 			atomic_long_add(-adjustment, &sem->count);
@@ -599,7 +599,7 @@ rwsem_del_wake_waiter(struct rw_semaphore *sem, struct rwsem_waiter *waiter,
 static inline bool rwsem_try_write_lock(struct rw_semaphore *sem,
 					struct rwsem_waiter *waiter)
 {
-	bool first = rwsem_first_waiter(sem) == waiter;
+	struct rwsem_waiter *first = rwsem_first_waiter(sem);
 	long count, new;
 
 	lockdep_assert_held(&sem->wait_lock);
@@ -609,11 +609,20 @@ static inline bool rwsem_try_write_lock(struct rw_semaphore *sem,
 		bool has_handoff = !!(count & RWSEM_FLAG_HANDOFF);
 
 		if (has_handoff) {
-			if (!first)
+			/*
+			 * Honor handoff bit and yield only when the first
+			 * waiter is the one that set it. Otherwise, we
+			 * still try to acquire the rwsem.
+			 */
+			if (first->handoff_set && (waiter != first))
 				return false;
 
-			/* First waiter inherits a previously set handoff bit */
-			waiter->handoff_set = true;
+			/*
+			 * First waiter can inherit a previously set handoff
+			 * bit and spin on rwsem if lock acquisition fails.
+			 */
+			if (waiter == first)
+				waiter->handoff_set = true;
 		}
 
 		new = count;
@@ -1027,6 +1036,7 @@ rwsem_down_read_slowpath(struct rw_semaphore *sem, long count, unsigned int stat
 	waiter.task = current;
 	waiter.type = RWSEM_WAITING_FOR_READ;
 	waiter.timeout = jiffies + RWSEM_WAIT_TIMEOUT;
+	waiter.handoff_set = false;
 
 	raw_spin_lock_irq(&sem->wait_lock);
 	if (list_empty(&sem->wait_list)) {
-- 
2.31.1


* Re: [PATCH v2] locking/rwsem: Allow slowpath writer to ignore handoff bit if not set by first waiter
From: john.p.donnelly @ 2022-06-23 13:35 UTC
  To: Waiman Long, Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng
  Cc: linux-kernel, Hillf Danton, Mel Gorman

On 6/22/22 3:04 PM, Waiman Long wrote:
> With commit d257cc8cb8d5 ("locking/rwsem: Make handoff bit handling more
> consistent"), the writer that sets the handoff bit can be interrupted
> and exit without clearing the bit if the wait queue isn't empty. This
> disables reader and writer optimistic lock spinning and stealing.
>
> [ ... ]
>
> Fixes: d257cc8cb8d5 ("locking/rwsem: Make handoff bit handling more consistent")
> Signed-off-by: Waiman Long <longman@redhat.com>
> Tested-by: Mel Gorman <mgorman@techsingularity.net>

Acked-by: John Donnelly <john.p.donnelly@oracle.com>


Hi,

If you send the .patch off-list as an attachment, we can test it here too
if needed on the 5.19.x tip.

My email client doesn't like inline patches.



> [ ... ]

* Re: [PATCH v2] locking/rwsem: Allow slowpath writer to ignore handoff bit if not set by first waiter
From: john.p.donnelly @ 2022-07-01 14:37 UTC
  To: Waiman Long, Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng
  Cc: linux-kernel, Hillf Danton, Mel Gorman, John Donnelly

On 6/22/22 3:04 PM, Waiman Long wrote:
> With commit d257cc8cb8d5 ("locking/rwsem: Make handoff bit handling more
> consistent"), the writer that sets the handoff bit can be interrupted
> and exit without clearing the bit if the wait queue isn't empty. This
> disables reader and writer optimistic lock spinning and stealing.
>
> [ ... ]
>
> Fixes: d257cc8cb8d5 ("locking/rwsem: Make handoff bit handling more consistent")
> Signed-off-by: Waiman Long <longman@redhat.com>
> Tested-by: Mel Gorman <mgorman@techsingularity.net>

Acked-by: John Donnelly <john.p.donnelly@oracle.com>

> [ ... ]


Hi,

Ping. Can we get this reviewed?


The offending commit:

  d257cc8cb8d5 ("locking/rwsem: Make handoff bit handling more  consistent")


is found in all these branches:

   kernel_dot_org/linux-stable.git    linux-5.15.y           - 76723ed1fb89
   kernel_dot_org/linux-stable.git    linux-5.16.y           - d257cc8cb8d5
   kernel_dot_org/linux-stable.git    linux-5.17.y           - d257cc8cb8d5
   kernel_dot_org/linux-stable.git    linux-5.18.y           - d257cc8cb8d5
   kernel_dot_org/linux-stable.git    master                 - d257cc8cb8d5





* Re: [PATCH v2] locking/rwsem: Allow slowpath writer to ignore handoff bit if not set by first waiter
From: john.p.donnelly @ 2022-07-22 12:09 UTC
  To: Waiman Long, Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng
  Cc: linux-kernel, Hillf Danton, Mel Gorman

On 7/1/22 9:37 AM, john.p.donnelly@oracle.com wrote:
> [ ... ]

Hi,

This issue is now in v5.19-rc7.

Can we get this reviewed?





* [tip: locking/urgent] locking/rwsem: Allow slowpath writer to ignore handoff bit if not set by first waiter
From: tip-bot2 for Waiman Long @ 2022-07-30  9:35 UTC
  To: linux-tip-commits
  Cc: Waiman Long, Peter Zijlstra (Intel),
	John Donnelly, Mel Gorman, x86, linux-kernel

The following commit has been merged into the locking/urgent branch of tip:

Commit-ID:     6eebd5fb20838f5971ba17df9f55cc4f84a31053
Gitweb:        https://git.kernel.org/tip/6eebd5fb20838f5971ba17df9f55cc4f84a31053
Author:        Waiman Long <longman@redhat.com>
AuthorDate:    Wed, 22 Jun 2022 16:04:19 -04:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Sat, 30 Jul 2022 10:58:28 +02:00

locking/rwsem: Allow slowpath writer to ignore handoff bit if not set by first waiter

With commit d257cc8cb8d5 ("locking/rwsem: Make handoff bit handling more
consistent"), the writer that sets the handoff bit can be interrupted
and exit without clearing the bit if the wait queue isn't empty. This
disables reader and writer optimistic lock spinning and stealing.

Now if a non-first writer in the queue is somehow woken up, or a new
waiter enters the slowpath, it can't acquire the lock. This was not the
case before commit d257cc8cb8d5, as the writer that set the handoff bit
would clear it when exiting via the out_nolock path. The new behavior is
less efficient because the busy rwsem stays unlocked for a longer time.

In some cases, this new behavior may cause lockups as shown in [1] and
[2].

This patch allows a non-first writer to ignore the handoff bit if the
bit was not originally set or initiated by the first waiter. The patch
has been shown to fix the lockup problem reported in [1].

[1] https://lore.kernel.org/lkml/20220617134325.GC30825@techsingularity.net/
[2] https://lore.kernel.org/lkml/3f02975c-1a9d-be20-32cf-f1d8e3dfafcc@oracle.com/

Fixes: d257cc8cb8d5 ("locking/rwsem: Make handoff bit handling more consistent")
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: John Donnelly <john.p.donnelly@oracle.com>
Tested-by: Mel Gorman <mgorman@techsingularity.net>
Link: https://lore.kernel.org/r/20220622200419.778799-1-longman@redhat.com
---
 kernel/locking/rwsem.c | 30 ++++++++++++++++++++----------
 1 file changed, 20 insertions(+), 10 deletions(-)

diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
index 9d1db4a..65f0262 100644
--- a/kernel/locking/rwsem.c
+++ b/kernel/locking/rwsem.c
@@ -335,8 +335,6 @@ struct rwsem_waiter {
 	struct task_struct *task;
 	enum rwsem_waiter_type type;
 	unsigned long timeout;
-
-	/* Writer only, not initialized in reader */
 	bool handoff_set;
 };
 #define rwsem_first_waiter(sem) \
@@ -459,10 +457,12 @@ static void rwsem_mark_wake(struct rw_semaphore *sem,
 			 * to give up the lock), request a HANDOFF to
 			 * force the issue.
 			 */
-			if (!(oldcount & RWSEM_FLAG_HANDOFF) &&
-			    time_after(jiffies, waiter->timeout)) {
-				adjustment -= RWSEM_FLAG_HANDOFF;
-				lockevent_inc(rwsem_rlock_handoff);
+			if (time_after(jiffies, waiter->timeout)) {
+				if (!(oldcount & RWSEM_FLAG_HANDOFF)) {
+					adjustment -= RWSEM_FLAG_HANDOFF;
+					lockevent_inc(rwsem_rlock_handoff);
+				}
+				waiter->handoff_set = true;
 			}
 
 			atomic_long_add(-adjustment, &sem->count);
@@ -599,7 +599,7 @@ rwsem_del_wake_waiter(struct rw_semaphore *sem, struct rwsem_waiter *waiter,
 static inline bool rwsem_try_write_lock(struct rw_semaphore *sem,
 					struct rwsem_waiter *waiter)
 {
-	bool first = rwsem_first_waiter(sem) == waiter;
+	struct rwsem_waiter *first = rwsem_first_waiter(sem);
 	long count, new;
 
 	lockdep_assert_held(&sem->wait_lock);
@@ -609,11 +609,20 @@ static inline bool rwsem_try_write_lock(struct rw_semaphore *sem,
 		bool has_handoff = !!(count & RWSEM_FLAG_HANDOFF);
 
 		if (has_handoff) {
-			if (!first)
+			/*
+			 * Honor handoff bit and yield only when the first
+			 * waiter is the one that set it. Otherwise, we
+			 * still try to acquire the rwsem.
+			 */
+			if (first->handoff_set && (waiter != first))
 				return false;
 
-			/* First waiter inherits a previously set handoff bit */
-			waiter->handoff_set = true;
+			/*
+			 * First waiter can inherit a previously set handoff
+			 * bit and spin on rwsem if lock acquisition fails.
+			 */
+			if (waiter == first)
+				waiter->handoff_set = true;
 		}
 
 		new = count;
@@ -1027,6 +1036,7 @@ queue:
 	waiter.task = current;
 	waiter.type = RWSEM_WAITING_FOR_READ;
 	waiter.timeout = jiffies + RWSEM_WAIT_TIMEOUT;
+	waiter.handoff_set = false;
 
 	raw_spin_lock_irq(&sem->wait_lock);
 	if (list_empty(&sem->wait_list)) {
