From: Waiman Long <longman@redhat.com>
To: john.p.donnelly@oracle.com, Hillf Danton <hdanton@sina.com>
Cc: linux-kernel@vger.kernel.org
Subject: Re: [PATCH] locking/rwsem: Allow slowpath writer to ignore handoff bit if not set by first waiter
Date: Wed, 22 Jun 2022 16:07:42 -0400
Message-ID: <627771df-19a5-a0a0-e27d-81be87d6d1f2@redhat.com>
In-Reply-To: <368f1ad6-83b9-01cd-1fba-3e87a0f73725@oracle.com>

On 6/22/22 13:48, john.p.donnelly@oracle.com wrote:
> On 4/27/22 8:23 PM, Hillf Danton wrote:
>> On Wed, 27 Apr 2022 13:31:24 -0400 Waiman Long wrote:
>>> With commit d257cc8cb8d5 ("locking/rwsem: Make handoff bit handling
>>> more consistent"), the writer that sets the handoff bit can be
>>> interrupted out without clearing the bit if the wait queue isn't
>>> empty. This disables reader and writer optimistic lock spinning and
>>> stealing.
>>>
>>> Now if a non-first writer in the queue is somehow woken up or is first
>>> entering the waiting loop, it can't acquire the lock. This was not
>>> the case before that commit, as the writer that set the handoff bit
>>> would clear it when exiting via the out_nolock path. This is less
>>> efficient as the busy rwsem stays in an unlocked state for a longer time.
>>>
>>> This patch allows a non-first writer to ignore the handoff bit if it
>>> is not originally set or initiated by the first waiter.
>>>
>>> Fixes: d257cc8cb8d5 ("locking/rwsem: Make handoff bit handling more consistent")
>>> Signed-off-by: Waiman Long <longman@redhat.com>
>>> ---
>>> kernel/locking/rwsem.c | 30 ++++++++++++++++++++----------
>>> 1 file changed, 20 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
>>> index 9d1db4a54d34..65f0262f635e 100644
>>> --- a/kernel/locking/rwsem.c
>>> +++ b/kernel/locking/rwsem.c
>>> @@ -335,8 +335,6 @@ struct rwsem_waiter {
>>> struct task_struct *task;
>>> enum rwsem_waiter_type type;
>>> unsigned long timeout;
>>> -
>>> - /* Writer only, not initialized in reader */
>>> bool handoff_set;
>>> };
>>> #define rwsem_first_waiter(sem) \
>>> @@ -459,10 +457,12 @@ static void rwsem_mark_wake(struct rw_semaphore *sem,
>>> * to give up the lock), request a HANDOFF to
>>> * force the issue.
>>> */
>>> - if (!(oldcount & RWSEM_FLAG_HANDOFF) &&
>>> - time_after(jiffies, waiter->timeout)) {
>>> - adjustment -= RWSEM_FLAG_HANDOFF;
>>> - lockevent_inc(rwsem_rlock_handoff);
>>> + if (time_after(jiffies, waiter->timeout)) {
>>> + if (!(oldcount & RWSEM_FLAG_HANDOFF)) {
>>> + adjustment -= RWSEM_FLAG_HANDOFF;
>>> + lockevent_inc(rwsem_rlock_handoff);
>>> + }
>>> + waiter->handoff_set = true;
>>> }
>>
>> Handoff is tracked in both sem->count and waiter->handoff_set,
>>
>>> atomic_long_add(-adjustment, &sem->count);
>>> @@ -599,7 +599,7 @@ rwsem_del_wake_waiter(struct rw_semaphore *sem, struct rwsem_waiter *waiter,
>>> static inline bool rwsem_try_write_lock(struct rw_semaphore *sem,
>>> struct rwsem_waiter *waiter)
>>> {
>>> - bool first = rwsem_first_waiter(sem) == waiter;
>>> + struct rwsem_waiter *first = rwsem_first_waiter(sem);
>>> long count, new;
>>> lockdep_assert_held(&sem->wait_lock);
>>> @@ -609,11 +609,20 @@ static inline bool rwsem_try_write_lock(struct rw_semaphore *sem,
>>> bool has_handoff = !!(count & RWSEM_FLAG_HANDOFF);
>>> if (has_handoff) {
>>> - if (!first)
>>> + /*
>>> + * Honor handoff bit and yield only when the first
>>> + * waiter is the one that set it. Otherwise, we
>>> + * still try to acquire the rwsem.
>>> + */
>>> + if (first->handoff_set && (waiter != first))
>>> return false;
>>
>> and checked against both parties, thus in a simpler manner
>> RWSEM_FLAG_HANDOFF in sem->count means the first waiter has been
>> waiting for lock long enough.
>>
>> Feel free to ignore the comment given the Fixes tag above.
>>
>> Hillf
>>> - /* First waiter inherits a previously set handoff bit */
>>> - waiter->handoff_set = true;
>>> + /*
>>> + * First waiter can inherit a previously set handoff
>>> + * bit and spin on rwsem if lock acquisition fails.
>>> + */
>>> + if (waiter == first)
>>> + waiter->handoff_set = true;
>>> }
>>> new = count;
>>> @@ -1027,6 +1036,7 @@ rwsem_down_read_slowpath(struct rw_semaphore *sem, long count, unsigned int stat
>>> waiter.task = current;
>>> waiter.type = RWSEM_WAITING_FOR_READ;
>>> waiter.timeout = jiffies + RWSEM_WAIT_TIMEOUT;
>>> + waiter.handoff_set = false;
>>> raw_spin_lock_irq(&sem->wait_lock);
>>> if (list_empty(&sem->wait_list)) {
>>> --
>>> 2.27.0
>
>
> Was this ever added?
>
> I don't see it in
>
>
> a111daf0c53ae 2022-06-19 | Linux 5.19-rc3

This patch hasn't been picked up upstream yet. I have reposted a v2
with an updated patch description.

Cheers,
Longman
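
For readers following the thread, the yield decision in the quoted
rwsem_try_write_lock() hunk can be sketched as a small user-space model.
This is only an illustration under stated assumptions: the names
model_waiter and model_handoff_check are hypothetical, and the model
leaves out sem->count updates, the wait_lock, and the rest of the real
function.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for struct rwsem_waiter. Only the field that
 * the handoff decision depends on is modeled here.
 */
struct model_waiter {
	bool handoff_set;
};

/*
 * Model of the has_handoff branch in the quoted hunk (an assumption-laden
 * sketch, not the kernel code itself).
 *
 * has_handoff: whether RWSEM_FLAG_HANDOFF is set in the count
 * first:       the waiter at the head of the wait queue
 * waiter:      the writer currently trying to take the lock
 *
 * Returns true when `waiter` must give up this attempt.
 */
static bool model_handoff_check(bool has_handoff,
				struct model_waiter *first,
				struct model_waiter *waiter)
{
	if (!has_handoff)
		return false;

	/* Yield only when the first waiter owns the handoff request. */
	if (first->handoff_set && waiter != first)
		return true;

	/* The first waiter inherits a previously set handoff bit. */
	if (waiter == first)
		waiter->handoff_set = true;

	return false;
}
```

In this model, a non-first writer is turned away only after the first
waiter has claimed the handoff bit. A bit left behind by a writer that
was interrupted out does not block other writers.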