* [PATCH 1/2] locking/rwbase: Optimize rwbase_read_trylock
2021-09-20 5:20 [PATCH -tip 0/2] locking/rwbase: Two reader optimizations Davidlohr Bueso
@ 2021-09-20 5:20 ` Davidlohr Bueso
2021-10-09 10:07 ` [tip: locking/core] " tip-bot2 for Davidlohr Bueso
2021-09-20 5:20 ` [PATCH 2/2] locking/rwbase: Lockless reader waking up a writer Davidlohr Bueso
2021-09-20 21:54 ` [PATCH -tip 0/2] locking/rwbase: Two reader optimizations Waiman Long
2 siblings, 1 reply; 5+ messages in thread
From: Davidlohr Bueso @ 2021-09-20 5:20 UTC (permalink / raw)
To: tglx
Cc: peterz, mingo, rostedt, longman, bigeasy, boqun.feng, dave,
linux-kernel, Davidlohr Bueso
Instead of a full barrier around the RMW instruction, micro-optimize
for weakly ordered archs so that we only provide the required
ACQUIRE semantics when taking the read lock.
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
---
kernel/locking/rwbase_rt.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/kernel/locking/rwbase_rt.c b/kernel/locking/rwbase_rt.c
index 88191f6e252c..a9034784a5a0 100644
--- a/kernel/locking/rwbase_rt.c
+++ b/kernel/locking/rwbase_rt.c
@@ -59,8 +59,7 @@ static __always_inline int rwbase_read_trylock(struct rwbase_rt *rwb)
* set.
*/
for (r = atomic_read(&rwb->readers); r < 0;) {
- /* Fully-ordered if cmpxchg() succeeds, provides ACQUIRE */
- if (likely(atomic_try_cmpxchg(&rwb->readers, &r, r + 1)))
+ if (likely(atomic_try_cmpxchg_acquire(&rwb->readers, &r, r + 1)))
return 1;
}
return 0;
@@ -183,7 +182,7 @@ static inline void __rwbase_write_unlock(struct rwbase_rt *rwb, int bias,
/*
* _release() is needed in case that reader is in fast path, pairing
- * with atomic_try_cmpxchg() in rwbase_read_trylock(), provides RELEASE
+ * with atomic_try_cmpxchg_acquire() in rwbase_read_trylock().
*/
(void)atomic_add_return_release(READER_BIAS - bias, &rwb->readers);
raw_spin_unlock_irqrestore(&rtm->wait_lock, flags);
--
2.26.2
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [tip: locking/core] locking/rwbase: Optimize rwbase_read_trylock
2021-09-20 5:20 ` [PATCH 1/2] locking/rwbase: Optimize rwbase_read_trylock Davidlohr Bueso
@ 2021-10-09 10:07 ` tip-bot2 for Davidlohr Bueso
0 siblings, 0 replies; 5+ messages in thread
From: tip-bot2 for Davidlohr Bueso @ 2021-10-09 10:07 UTC (permalink / raw)
To: linux-tip-commits
Cc: Davidlohr Bueso, Peter Zijlstra (Intel), Waiman Long, x86, linux-kernel
The following commit has been merged into the locking/core branch of tip:
Commit-ID: c78416d122243c92992a1d1063f17ddd0bc80e6c
Gitweb: https://git.kernel.org/tip/c78416d122243c92992a1d1063f17ddd0bc80e6c
Author: Davidlohr Bueso <dave@stgolabs.net>
AuthorDate: Sun, 19 Sep 2021 22:20:30 -07:00
Committer: Peter Zijlstra <peterz@infradead.org>
CommitterDate: Thu, 07 Oct 2021 13:51:07 +02:00
locking/rwbase: Optimize rwbase_read_trylock
Instead of a full barrier around the RMW instruction, micro-optimize
for weakly ordered archs so that we only provide the required
ACQUIRE semantics when taking the read lock.
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Waiman Long <longman@redhat.com>
Link: https://lkml.kernel.org/r/20210920052031.54220-2-dave@stgolabs.net
---
kernel/locking/rwbase_rt.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/kernel/locking/rwbase_rt.c b/kernel/locking/rwbase_rt.c
index 15c8110..6fd3162 100644
--- a/kernel/locking/rwbase_rt.c
+++ b/kernel/locking/rwbase_rt.c
@@ -59,8 +59,7 @@ static __always_inline int rwbase_read_trylock(struct rwbase_rt *rwb)
* set.
*/
for (r = atomic_read(&rwb->readers); r < 0;) {
- /* Fully-ordered if cmpxchg() succeeds, provides ACQUIRE */
- if (likely(atomic_try_cmpxchg(&rwb->readers, &r, r + 1)))
+ if (likely(atomic_try_cmpxchg_acquire(&rwb->readers, &r, r + 1)))
return 1;
}
return 0;
@@ -187,7 +186,7 @@ static inline void __rwbase_write_unlock(struct rwbase_rt *rwb, int bias,
/*
* _release() is needed in case that reader is in fast path, pairing
- * with atomic_try_cmpxchg() in rwbase_read_trylock(), provides RELEASE
+ * with atomic_try_cmpxchg_acquire() in rwbase_read_trylock().
*/
(void)atomic_add_return_release(READER_BIAS - bias, &rwb->readers);
raw_spin_unlock_irqrestore(&rtm->wait_lock, flags);
* [PATCH 2/2] locking/rwbase: Lockless reader waking up a writer
2021-09-20 5:20 [PATCH -tip 0/2] locking/rwbase: Two reader optimizations Davidlohr Bueso
2021-09-20 5:20 ` [PATCH 1/2] locking/rwbase: Optimize rwbase_read_trylock Davidlohr Bueso
@ 2021-09-20 5:20 ` Davidlohr Bueso
2021-09-20 21:54 ` [PATCH -tip 0/2] locking/rwbase: Two reader optimizations Waiman Long
2 siblings, 0 replies; 5+ messages in thread
From: Davidlohr Bueso @ 2021-09-20 5:20 UTC (permalink / raw)
To: tglx
Cc: peterz, mingo, rostedt, longman, bigeasy, boqun.feng, dave,
linux-kernel, Davidlohr Bueso
Use the RT-lock-safe wake_q to wake up the writer without
having to hold the wait_lock across the operation.
While wake_q is primarily meant for batching wakeups, even
single-wakeup usage has been shown to be beneficial vs the
cost of try_to_wake_up() when the lock is contended, and it
also avoids having irqs disabled during the wakeup window,
albeit preemption remains disabled.
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
---
kernel/locking/rtmutex.c | 19 +++++++++++++------
kernel/locking/rwbase_rt.c | 6 +++++-
2 files changed, 18 insertions(+), 7 deletions(-)
diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index 6bb116c559b4..1581674d640b 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -446,19 +446,26 @@ static __always_inline void rt_mutex_adjust_prio(struct task_struct *p)
}
/* RT mutex specific wake_q wrappers */
-static __always_inline void rt_mutex_wake_q_add(struct rt_wake_q_head *wqh,
- struct rt_mutex_waiter *w)
+static __always_inline void rt_mutex_wake_q_add_task(struct rt_wake_q_head *wqh,
+ struct task_struct *task,
+ unsigned int wake_state)
{
- if (IS_ENABLED(CONFIG_PREEMPT_RT) && w->wake_state != TASK_NORMAL) {
+ if (IS_ENABLED(CONFIG_PREEMPT_RT) && wake_state != TASK_NORMAL) {
if (IS_ENABLED(CONFIG_PROVE_LOCKING))
WARN_ON_ONCE(wqh->rtlock_task);
- get_task_struct(w->task);
- wqh->rtlock_task = w->task;
+ get_task_struct(task);
+ wqh->rtlock_task = task;
} else {
- wake_q_add(&wqh->head, w->task);
+ wake_q_add(&wqh->head, task);
}
}
+static __always_inline void rt_mutex_wake_q_add(struct rt_wake_q_head *wqh,
+ struct rt_mutex_waiter *w)
+{
+ rt_mutex_wake_q_add_task(wqh, w->task, w->wake_state);
+}
+
static __always_inline void rt_mutex_wake_up_q(struct rt_wake_q_head *wqh)
{
if (IS_ENABLED(CONFIG_PREEMPT_RT) && wqh->rtlock_task) {
diff --git a/kernel/locking/rwbase_rt.c b/kernel/locking/rwbase_rt.c
index a9034784a5a0..8cb58758af3d 100644
--- a/kernel/locking/rwbase_rt.c
+++ b/kernel/locking/rwbase_rt.c
@@ -147,6 +147,7 @@ static void __sched __rwbase_read_unlock(struct rwbase_rt *rwb,
{
struct rt_mutex_base *rtm = &rwb->rtmutex;
struct task_struct *owner;
+ DEFINE_RT_WAKE_Q(wqh);
raw_spin_lock_irq(&rtm->wait_lock);
/*
@@ -157,9 +158,12 @@ static void __sched __rwbase_read_unlock(struct rwbase_rt *rwb,
*/
owner = rt_mutex_owner(rtm);
if (owner)
- wake_up_state(owner, state);
+ rt_mutex_wake_q_add_task(&wqh, owner, state);
+ /* Pairs with the preempt_enable() in rt_mutex_wake_up_q() */
+ preempt_disable();
raw_spin_unlock_irq(&rtm->wait_lock);
+ rt_mutex_wake_up_q(&wqh);
}
static __always_inline void rwbase_read_unlock(struct rwbase_rt *rwb,
--
2.26.2
* Re: [PATCH -tip 0/2] locking/rwbase: Two reader optimizations
2021-09-20 5:20 [PATCH -tip 0/2] locking/rwbase: Two reader optimizations Davidlohr Bueso
2021-09-20 5:20 ` [PATCH 1/2] locking/rwbase: Optimize rwbase_read_trylock Davidlohr Bueso
2021-09-20 5:20 ` [PATCH 2/2] locking/rwbase: Lockless reader waking up a writer Davidlohr Bueso
@ 2021-09-20 21:54 ` Waiman Long
2 siblings, 0 replies; 5+ messages in thread
From: Waiman Long @ 2021-09-20 21:54 UTC (permalink / raw)
To: Davidlohr Bueso, tglx
Cc: peterz, mingo, rostedt, bigeasy, boqun.feng, linux-kernel
On 9/20/21 1:20 AM, Davidlohr Bueso wrote:
> Hi,
>
> Patch 1 is a barrier optimization that came up from the reader
> fastpath ordering auditing.
>
> Patch 2 is a resend of the previous broken patch that attempts
> to use wake_q for read_unlock() slowpath.
>
> Tested on v5.15.y-rt. Applies against tip/urgent.
>
> Thanks!
>
> Davidlohr Bueso (2):
> locking/rwbase: Optimize rwbase_read_trylock
> locking/rwbase: Lockless reader waking up a writer
>
> kernel/locking/rtmutex.c | 19 +++++++++++++------
> kernel/locking/rwbase_rt.c | 11 +++++++----
> 2 files changed, 20 insertions(+), 10 deletions(-)
>
> --
> 2.26.2
>
Your patches look good to me.
Acked-by: Waiman Long <longman@redhat.com>