linux-kernel.vger.kernel.org archive mirror
* [PATCH-tip] locking/rwsem: Make handoff writer optimistically spin on owner
@ 2019-06-25 14:39 Waiman Long
  2019-07-25 16:26 ` Waiman Long
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Waiman Long @ 2019-06-25 14:39 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon, Thomas Gleixner,
	Borislav Petkov, H. Peter Anvin
  Cc: linux-kernel, x86, Davidlohr Bueso, Linus Torvalds, Tim Chen,
	huang ying, Waiman Long

When the handoff bit is set by a writer, no task other than the
setting writer itself is allowed to acquire the lock. If the
to-be-handoff'ed writer goes to sleep, there will be a wakeup latency
period where the lock is free, but no one can acquire it. That is less
than ideal.
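
As a rough illustration of that window, here is a toy user-space sketch
(the names and bit layout below are invented for this example and do not
match the kernel's actual rwsem count encoding): while the handoff bit is
set, every write trylock except the designated waiter's fails even though
the lock word shows the lock as free.

#include <stdatomic.h>
#include <stdbool.h>

#define DEMO_WRITER_LOCKED	0x1UL
#define DEMO_FLAG_HANDOFF	0x2UL

struct demo_rwsem {
	_Atomic unsigned long count;
};

/*
 * Toy write trylock: while HANDOFF is set, only the designated first
 * waiter may take the lock, even if the lock itself is currently free.
 */
static bool demo_try_write_lock(struct demo_rwsem *sem, bool is_handoff_waiter)
{
	unsigned long old = atomic_load(&sem->count);

	if ((old & DEMO_FLAG_HANDOFF) && !is_handoff_waiter)
		return false;	/* reserved for the handoff target */
	if (old & DEMO_WRITER_LOCKED)
		return false;	/* still owned by a writer */

	/* Taking the lock also clears HANDOFF for the new owner. */
	return atomic_compare_exchange_strong(&sem->count, &old,
					      DEMO_WRITER_LOCKED);
}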

To reduce that latency, the handoff writer will now optimistically spin
on the owner if the owner happens to be an on-cpu writer. It will spin
until the owner releases the lock, and the to-be-handoff'ed writer can
then acquire the lock immediately without any delay. Of course, if the
owner is not an on-cpu writer, the to-be-handoff'ed writer will have to
sleep anyway.
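
As a similarly simplified user-space sketch (again with invented names;
the real logic is in rwsem_spin_on_owner() in the diff below, which also
bails out when the owner is a reader or marked non-spinnable), the
handoff writer's spin loop boils down to:

#include <stdatomic.h>
#include <stdbool.h>
#include <sched.h>

struct demo_task {
	_Atomic bool on_cpu;		/* is the owner running on a CPU? */
};

struct demo_sem {
	struct demo_task *_Atomic owner;
};

enum demo_owner_state {
	DEMO_OWNER_NULL,		/* owner released the lock: retry now */
	DEMO_OWNER_NONSPINNABLE,	/* owner off-cpu: go to sleep instead */
};

/*
 * Spin while the owner is still running.  Return DEMO_OWNER_NULL as soon
 * as the lock is released so the caller can retry immediately, or
 * DEMO_OWNER_NONSPINNABLE when the owner is scheduled out so the caller
 * sleeps instead of burning CPU on an off-cpu owner.
 */
static enum demo_owner_state demo_spin_on_owner(struct demo_sem *sem)
{
	for (;;) {
		struct demo_task *owner = atomic_load(&sem->owner);

		if (!owner)
			return DEMO_OWNER_NULL;
		if (!atomic_load(&owner->on_cpu))
			return DEMO_OWNER_NONSPINNABLE;

		sched_yield();		/* stand-in for cpu_relax() */
	}
}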

The optimistic spinning code is also modified to not stop spinning
when the handoff bit is set. This prevents an occasional setting of the
handoff bit from forcing a bunch of optimistic spinners into the wait
queue and causing a significant reduction in throughput.

On a 1-socket 22-core 44-thread Skylake system, the AIM7 shared_memory
workload was run with 7000 users. The throughput (jobs/min) of the
following kernels was as follows:

 1) 5.2-rc6
    - 8,092,486
 2) 5.2-rc6 + tip's rwsem patches
    - 7,567,568
 3) 5.2-rc6 + tip's rwsem patches + this patch
    - 7,954,545
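
 (In relative terms: 7,567,568/8,092,486 ~ 0.935, i.e. tip's rwsem
 patches alone cost about 6.5% of the baseline throughput, while
 7,954,545/8,092,486 ~ 0.983 with this patch added, about 1.7% below
 baseline, so most of the regression is recovered.)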

Using perf-record(1), the %cpu time used by rwsem_down_write_slowpath(),
rwsem_down_write_failed() and their callees for the 3 kernels was 1.70%,
5.46% and 2.08%, respectively.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 kernel/locking/rwsem.c | 32 ++++++++++++++++++++++++++------
 1 file changed, 26 insertions(+), 6 deletions(-)

diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
index 37524a47f002..5e022fbdb2bb 100644
--- a/kernel/locking/rwsem.c
+++ b/kernel/locking/rwsem.c
@@ -720,11 +720,12 @@ rwsem_spin_on_owner(struct rw_semaphore *sem, unsigned long nonspinnable)
 
 	rcu_read_lock();
 	for (;;) {
-		if (atomic_long_read(&sem->count) & RWSEM_FLAG_HANDOFF) {
-			state = OWNER_NONSPINNABLE;
-			break;
-		}
-
+		/*
+		 * When a waiting writer sets the handoff flag, it may spin
+		 * on the owner as well. Once that writer acquires the lock,
+		 * we can spin on it. So we don't need to quit even when the
+		 * handoff bit is set.
+		 */
 		new = rwsem_owner_flags(sem, &new_flags);
 		if ((new != owner) || (new_flags != flags)) {
 			state = rwsem_owner_state(new, new_flags, nonspinnable);
@@ -970,6 +971,13 @@ static inline bool rwsem_reader_phase_trylock(struct rw_semaphore *sem,
 {
 	return false;
 }
+
+static inline int
+rwsem_spin_on_owner(struct rw_semaphore *sem, unsigned long nonspinnable)
+{
+	return 0;
+}
+#define OWNER_NULL	1
 #endif
 
 /*
@@ -1190,6 +1198,18 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
 
 		raw_spin_unlock_irq(&sem->wait_lock);
 
+		/*
+		 * After setting the handoff bit and failing to acquire
+		 * the lock, attempt to spin on owner to accelerate lock
+		 * transfer. If the previous owner is an on-cpu writer and it
+		 * has just released the lock, OWNER_NULL will be returned.
+		 * In this case, we attempt to acquire the lock again
+		 * without sleeping.
+		 */
+		if ((wstate == WRITER_HANDOFF) &&
+		    (rwsem_spin_on_owner(sem, 0) == OWNER_NULL))
+			goto trylock_again;
+
 		/* Block until there are no active lockers. */
 		for (;;) {
 			if (signal_pending_state(state, current))
@@ -1224,7 +1244,7 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
 				break;
 			}
 		}
-
+trylock_again:
 		raw_spin_lock_irq(&sem->wait_lock);
 	}
 	__set_current_state(TASK_RUNNING);
-- 
2.18.1



* Re: [PATCH-tip] locking/rwsem: Make handoff writer optimistically spin on owner
  2019-06-25 14:39 [PATCH-tip] locking/rwsem: Make handoff writer optimistically spin on owner Waiman Long
@ 2019-07-25 16:26 ` Waiman Long
  2019-08-06 12:44 ` Peter Zijlstra
  2019-08-06 13:04 ` [tip:locking/core] " tip-bot for Waiman Long
  2 siblings, 0 replies; 4+ messages in thread
From: Waiman Long @ 2019-07-25 16:26 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon, Thomas Gleixner,
	Borislav Petkov, H. Peter Anvin
  Cc: linux-kernel, x86, Davidlohr Bueso, Linus Torvalds, Tim Chen, huang ying

On 6/25/19 10:39 AM, Waiman Long wrote:
> When the handoff bit is set by a writer, no task other than the
> setting writer itself is allowed to acquire the lock. If the
> to-be-handoff'ed writer goes to sleep, there will be a wakeup latency
> period where the lock is free, but no one can acquire it. That is less
> than ideal.
>
> To reduce that latency, the handoff writer will now optimistically spin
> on the owner if the owner happens to be an on-cpu writer. It will spin
> until the owner releases the lock, and the to-be-handoff'ed writer can
> then acquire the lock immediately without any delay. Of course, if the
> owner is not an on-cpu writer, the to-be-handoff'ed writer will have to
> sleep anyway.
>
> The optimistic spinning code is also modified to not stop spinning
> when the handoff bit is set. This prevents an occasional setting of the
> handoff bit from forcing a bunch of optimistic spinners into the wait
> queue and causing a significant reduction in throughput.
>
> On a 1-socket 22-core 44-thread Skylake system, the AIM7 shared_memory
> workload was run with 7000 users. The throughput (jobs/min) of the
> following kernels was as follows:
>
>  1) 5.2-rc6
>     - 8,092,486
>  2) 5.2-rc6 + tip's rwsem patches
>     - 7,567,568
>  3) 5.2-rc6 + tip's rwsem patches + this patch
>     - 7,954,545
>
> Using perf-record(1), the %cpu time used by rwsem_down_write_slowpath(),
> rwsem_down_write_failed() and their callees for the 3 kernels was 1.70%,
> 5.46% and 2.08%, respectively.
>
> Signed-off-by: Waiman Long <longman@redhat.com>
> ---
>  kernel/locking/rwsem.c | 32 ++++++++++++++++++++++++++------
>  1 file changed, 26 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
> index 37524a47f002..5e022fbdb2bb 100644
> --- a/kernel/locking/rwsem.c
> +++ b/kernel/locking/rwsem.c
> @@ -720,11 +720,12 @@ rwsem_spin_on_owner(struct rw_semaphore *sem, unsigned long nonspinnable)
>  
>  	rcu_read_lock();
>  	for (;;) {
> -		if (atomic_long_read(&sem->count) & RWSEM_FLAG_HANDOFF) {
> -			state = OWNER_NONSPINNABLE;
> -			break;
> -		}
> -
> +		/*
> +		 * When a waiting writer sets the handoff flag, it may spin
> +		 * on the owner as well. Once that writer acquires the lock,
> +		 * we can spin on it. So we don't need to quit even when the
> +		 * handoff bit is set.
> +		 */
>  		new = rwsem_owner_flags(sem, &new_flags);
>  		if ((new != owner) || (new_flags != flags)) {
>  			state = rwsem_owner_state(new, new_flags, nonspinnable);
> @@ -970,6 +971,13 @@ static inline bool rwsem_reader_phase_trylock(struct rw_semaphore *sem,
>  {
>  	return false;
>  }
> +
> +static inline int
> +rwsem_spin_on_owner(struct rw_semaphore *sem, unsigned long nonspinnable)
> +{
> +	return 0;
> +}
> +#define OWNER_NULL	1
>  #endif
>  
>  /*
> @@ -1190,6 +1198,18 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
>  
>  		raw_spin_unlock_irq(&sem->wait_lock);
>  
> +		/*
> +		 * After setting the handoff bit and failing to acquire
> +		 * the lock, attempt to spin on owner to accelerate lock
> +		 * transfer. If the previous owner is an on-cpu writer and it
> +		 * has just released the lock, OWNER_NULL will be returned.
> +		 * In this case, we attempt to acquire the lock again
> +		 * without sleeping.
> +		 */
> +		if ((wstate == WRITER_HANDOFF) &&
> +		    (rwsem_spin_on_owner(sem, 0) == OWNER_NULL))
> +			goto trylock_again;
> +
>  		/* Block until there are no active lockers. */
>  		for (;;) {
>  			if (signal_pending_state(state, current))
> @@ -1224,7 +1244,7 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
>  				break;
>  			}
>  		}
> -
> +trylock_again:
>  		raw_spin_lock_irq(&sem->wait_lock);
>  	}
>  	__set_current_state(TASK_RUNNING);

Any comment on this patch?

Cheers,
Longman



* Re: [PATCH-tip] locking/rwsem: Make handoff writer optimistically spin on owner
  2019-06-25 14:39 [PATCH-tip] locking/rwsem: Make handoff writer optimistically spin on owner Waiman Long
  2019-07-25 16:26 ` Waiman Long
@ 2019-08-06 12:44 ` Peter Zijlstra
  2019-08-06 13:04 ` [tip:locking/core] " tip-bot for Waiman Long
  2 siblings, 0 replies; 4+ messages in thread
From: Peter Zijlstra @ 2019-08-06 12:44 UTC (permalink / raw)
  To: Waiman Long
  Cc: Ingo Molnar, Will Deacon, Thomas Gleixner, Borislav Petkov,
	H. Peter Anvin, linux-kernel, x86, Davidlohr Bueso,
	Linus Torvalds, Tim Chen, huang ying

On Tue, Jun 25, 2019 at 10:39:13AM -0400, Waiman Long wrote:
> When the handoff bit is set by a writer, no task other than the
> setting writer itself is allowed to acquire the lock. If the
> to-be-handoff'ed writer goes to sleep, there will be a wakeup latency
> period where the lock is free, but no one can acquire it. That is less
> than ideal.
> 
> To reduce that latency, the handoff writer will now optimistically spin
> on the owner if the owner happens to be an on-cpu writer. It will spin
> until the owner releases the lock, and the to-be-handoff'ed writer can
> then acquire the lock immediately without any delay. Of course, if the
> owner is not an on-cpu writer, the to-be-handoff'ed writer will have to
> sleep anyway.
> 
> The optimistic spinning code is also modified to not stop spinning
> when the handoff bit is set. This prevents an occasional setting of the
> handoff bit from forcing a bunch of optimistic spinners into the wait
> queue and causing a significant reduction in throughput.
> 
> On a 1-socket 22-core 44-thread Skylake system, the AIM7 shared_memory
> workload was run with 7000 users. The throughput (jobs/min) of the
> following kernels was as follows:
> 
>  1) 5.2-rc6
>     - 8,092,486
>  2) 5.2-rc6 + tip's rwsem patches
>     - 7,567,568
>  3) 5.2-rc6 + tip's rwsem patches + this patch
>     - 7,954,545
> 
> Using perf-record(1), the %cpu time used by rwsem_down_write_slowpath(),
> rwsem_down_write_failed() and their callees for the 3 kernels was 1.70%,
> 5.46% and 2.08%, respectively.
> 
> Signed-off-by: Waiman Long <longman@redhat.com>

Thanks.. sorry for taking so long.


* [tip:locking/core] locking/rwsem: Make handoff writer optimistically spin on owner
  2019-06-25 14:39 [PATCH-tip] locking/rwsem: Make handoff writer optimistically spin on owner Waiman Long
  2019-07-25 16:26 ` Waiman Long
  2019-08-06 12:44 ` Peter Zijlstra
@ 2019-08-06 13:04 ` tip-bot for Waiman Long
  2 siblings, 0 replies; 4+ messages in thread
From: tip-bot for Waiman Long @ 2019-08-06 13:04 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mingo, torvalds, hpa, longman, bp, tim.c.chen, peterz, tglx,
	linux-kernel, mingo, huang.ying.caritas, will.deacon, dave

Commit-ID:  91d2a812dfb98b3b4dad661529c33bc38d303461
Gitweb:     https://git.kernel.org/tip/91d2a812dfb98b3b4dad661529c33bc38d303461
Author:     Waiman Long <longman@redhat.com>
AuthorDate: Tue, 25 Jun 2019 10:39:13 -0400
Committer:  Peter Zijlstra <peterz@infradead.org>
CommitDate: Tue, 6 Aug 2019 12:49:15 +0200

locking/rwsem: Make handoff writer optimistically spin on owner

When the handoff bit is set by a writer, no task other than the
setting writer itself is allowed to acquire the lock. If the
to-be-handoff'ed writer goes to sleep, there will be a wakeup latency
period where the lock is free, but no one can acquire it. That is less
than ideal.

To reduce that latency, the handoff writer will now optimistically spin
on the owner if the owner happens to be an on-cpu writer. It will spin
until the owner releases the lock, and the to-be-handoff'ed writer can
then acquire the lock immediately without any delay. Of course, if the
owner is not an on-cpu writer, the to-be-handoff'ed writer will have to
sleep anyway.

The optimistic spinning code is also modified to not stop spinning
when the handoff bit is set. This prevents an occasional setting of the
handoff bit from forcing a bunch of optimistic spinners into the wait
queue and causing a significant reduction in throughput.

On a 1-socket 22-core 44-thread Skylake system, the AIM7 shared_memory
workload was run with 7000 users. The throughput (jobs/min) of the
following kernels was as follows:

 1) 5.2-rc6
    - 8,092,486
 2) 5.2-rc6 + tip's rwsem patches
    - 7,567,568
 3) 5.2-rc6 + tip's rwsem patches + this patch
    - 7,954,545

Using perf-record(1), the %cpu time used by rwsem_down_write_slowpath(),
rwsem_down_write_failed() and their callees for the 3 kernels was 1.70%,
5.46% and 2.08%, respectively.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: x86@kernel.org
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: huang ying <huang.ying.caritas@gmail.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Link: https://lkml.kernel.org/r/20190625143913.24154-1-longman@redhat.com
---
 kernel/locking/rwsem.c | 32 ++++++++++++++++++++++++++------
 1 file changed, 26 insertions(+), 6 deletions(-)

diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
index bd0f0d05724c..354238a08b7a 100644
--- a/kernel/locking/rwsem.c
+++ b/kernel/locking/rwsem.c
@@ -724,11 +724,12 @@ rwsem_spin_on_owner(struct rw_semaphore *sem, unsigned long nonspinnable)
 
 	rcu_read_lock();
 	for (;;) {
-		if (atomic_long_read(&sem->count) & RWSEM_FLAG_HANDOFF) {
-			state = OWNER_NONSPINNABLE;
-			break;
-		}
-
+		/*
+		 * When a waiting writer sets the handoff flag, it may spin
+		 * on the owner as well. Once that writer acquires the lock,
+		 * we can spin on it. So we don't need to quit even when the
+		 * handoff bit is set.
+		 */
 		new = rwsem_owner_flags(sem, &new_flags);
 		if ((new != owner) || (new_flags != flags)) {
 			state = rwsem_owner_state(new, new_flags, nonspinnable);
@@ -974,6 +975,13 @@ static inline bool rwsem_reader_phase_trylock(struct rw_semaphore *sem,
 {
 	return false;
 }
+
+static inline int
+rwsem_spin_on_owner(struct rw_semaphore *sem, unsigned long nonspinnable)
+{
+	return 0;
+}
+#define OWNER_NULL	1
 #endif
 
 /*
@@ -1206,6 +1214,18 @@ wait:
 
 		raw_spin_unlock_irq(&sem->wait_lock);
 
+		/*
+		 * After setting the handoff bit and failing to acquire
+		 * the lock, attempt to spin on owner to accelerate lock
+		 * transfer. If the previous owner is an on-cpu writer and it
+		 * has just released the lock, OWNER_NULL will be returned.
+		 * In this case, we attempt to acquire the lock again
+		 * without sleeping.
+		 */
+		if ((wstate == WRITER_HANDOFF) &&
+		    (rwsem_spin_on_owner(sem, 0) == OWNER_NULL))
+			goto trylock_again;
+
 		/* Block until there are no active lockers. */
 		for (;;) {
 			if (signal_pending_state(state, current))
@@ -1240,7 +1260,7 @@ wait:
 				break;
 			}
 		}
-
+trylock_again:
 		raw_spin_lock_irq(&sem->wait_lock);
 	}
 	__set_current_state(TASK_RUNNING);


