All of lore.kernel.org
 help / color / mirror / Atom feed
From: john.p.donnelly@oracle.com
To: Hillf Danton <hdanton@sina.com>, Waiman Long <longman@redhat.com>
Cc: linux-kernel@vger.kernel.org
Subject: Re: [PATCH] locking/rwsem: Allow slowpath writer to ignore handoff bit if not set by first waiter
Date: Wed, 22 Jun 2022 12:48:52 -0500	[thread overview]
Message-ID: <368f1ad6-83b9-01cd-1fba-3e87a0f73725@oracle.com> (raw)
In-Reply-To: <20220428012342.3713-1-hdanton@sina.com>

On 4/27/22 8:23 PM, Hillf Danton wrote:
> On  Wed, 27 Apr 2022 13:31:24 -0400 Waiman Long wrote:
>> With commit d257cc8cb8d5 ("locking/rwsem: Make handoff bit handling more
>> consistent"), the writer that sets the handoff bit can be interrupted
>> out without clearing the bit if the wait queue isn't empty. This disables
>> reader and writer optimistic lock spinning and stealing.
>>
>> Now if a non-first writer in the queue is somehow woken up or first
>> entering the waiting loop, it can't acquire the lock.  This is not
>> the case before that commit as the writer that set the handoff bit
>> will clear it when exiting out via the out_nolock path. This is less
>> efficient as the busy rwsem stays in an unlock state for a longer time.
>>
>> This patch allows a non-first writer to ignore the handoff bit if it
>> is not originally set or initiated by the first waiter.
>>
>> Fixes: d257cc8cb8d5 ("locking/rwsem: Make handoff bit handling more consistent")
>> Signed-off-by: Waiman Long <longman@redhat.com>
>> ---
>>   kernel/locking/rwsem.c | 30 ++++++++++++++++++++----------
>>   1 file changed, 20 insertions(+), 10 deletions(-)
>>
>> diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
>> index 9d1db4a54d34..65f0262f635e 100644
>> --- a/kernel/locking/rwsem.c
>> +++ b/kernel/locking/rwsem.c
>> @@ -335,8 +335,6 @@ struct rwsem_waiter {
>>   	struct task_struct *task;
>>   	enum rwsem_waiter_type type;
>>   	unsigned long timeout;
>> -
>> -	/* Writer only, not initialized in reader */
>>   	bool handoff_set;
>>   };
>>   #define rwsem_first_waiter(sem) \
>> @@ -459,10 +457,12 @@ static void rwsem_mark_wake(struct rw_semaphore *sem,
>>   			 * to give up the lock), request a HANDOFF to
>>   			 * force the issue.
>>   			 */
>> -			if (!(oldcount & RWSEM_FLAG_HANDOFF) &&
>> -			    time_after(jiffies, waiter->timeout)) {
>> -				adjustment -= RWSEM_FLAG_HANDOFF;
>> -				lockevent_inc(rwsem_rlock_handoff);
>> +			if (time_after(jiffies, waiter->timeout)) {
>> +				if (!(oldcount & RWSEM_FLAG_HANDOFF)) {
>> +					adjustment -= RWSEM_FLAG_HANDOFF;
>> +					lockevent_inc(rwsem_rlock_handoff);
>> +				}
>> +				waiter->handoff_set = true;
>>   			}
> 
> Handoff is tracked in both sem->count and waiter->handoff_set,
> 
>>   
>>   			atomic_long_add(-adjustment, &sem->count);
>> @@ -599,7 +599,7 @@ rwsem_del_wake_waiter(struct rw_semaphore *sem, struct rwsem_waiter *waiter,
>>   static inline bool rwsem_try_write_lock(struct rw_semaphore *sem,
>>   					struct rwsem_waiter *waiter)
>>   {
>> -	bool first = rwsem_first_waiter(sem) == waiter;
>> +	struct rwsem_waiter *first = rwsem_first_waiter(sem);
>>   	long count, new;
>>   
>>   	lockdep_assert_held(&sem->wait_lock);
>> @@ -609,11 +609,20 @@ static inline bool rwsem_try_write_lock(struct rw_semaphore *sem,
>>   		bool has_handoff = !!(count & RWSEM_FLAG_HANDOFF);
>>   
>>   		if (has_handoff) {
>> -			if (!first)
>> +			/*
>> +			 * Honor handoff bit and yield only when the first
>> +			 * waiter is the one that set it. Otherwisee, we
>> +			 * still try to acquire the rwsem.
>> +			 */
>> +			if (first->handoff_set && (waiter != first))
>>   				return false;
> 
> and checked against both parties, thus in a simpler manner RWSEM_FLAG_HANDOFF
> in sem->count means the first waiter has been waiting for lock long enough.
> 
> Feel free to ignore the comment given the Fixes tag above.
> 
> Hillf
>>   
>> -			/* First waiter inherits a previously set handoff bit */
>> -			waiter->handoff_set = true;
>> +			/*
>> +			 * First waiter can inherit a previously set handoff
>> +			 * bit and spin on rwsem if lock acquisition fails.
>> +			 */
>> +			if (waiter == first)
>> +				waiter->handoff_set = true;
>>   		}
>>   
>>   		new = count;
>> @@ -1027,6 +1036,7 @@ rwsem_down_read_slowpath(struct rw_semaphore *sem, long count, unsigned int stat
>>   	waiter.task = current;
>>   	waiter.type = RWSEM_WAITING_FOR_READ;
>>   	waiter.timeout = jiffies + RWSEM_WAIT_TIMEOUT;
>> +	waiter.handoff_set = false;
>>   
>>   	raw_spin_lock_irq(&sem->wait_lock);
>>   	if (list_empty(&sem->wait_list)) {
>> -- 
>> 2.27.0


Was this ever added ?

I don't see it in


a111daf0c53ae 2022-06-19 | Linux 5.19-rc3




       reply	other threads:[~2022-06-22 17:49 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20220428012342.3713-1-hdanton@sina.com>
2022-06-22 17:48 ` john.p.donnelly [this message]
2022-06-22 20:07   ` [PATCH] locking/rwsem: Allow slowpath writer to ignore handoff bit if not set by first waiter Waiman Long
2022-04-27 17:31 Waiman Long
2022-04-27 23:16 ` John Donnelly
2022-04-27 23:56   ` Waiman Long
2022-04-28  0:32     ` John Donnelly

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=368f1ad6-83b9-01cd-1fba-3e87a0f73725@oracle.com \
    --to=john.p.donnelly@oracle.com \
    --cc=hdanton@sina.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=longman@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.