* [PATCH] locking/qrwlock: Fix ordering in queued_write_lock_slowpath
@ 2021-04-15 14:25 Ali Saidi
  2021-04-15 15:02 ` Will Deacon
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Ali Saidi @ 2021-04-15 14:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: alisaidi, catalin.marinas, steve.capper, benh, stable,
	Peter Zijlstra, Ingo Molnar, Will Deacon, Waiman Long,
	Boqun Feng

While this code is executed with the wait_lock held, a reader can
acquire the lock without holding wait_lock.  The writer side loops
checking the value with the atomic_cond_read_acquire(), but only truly
acquires the lock when the compare-and-exchange is completed
successfully which isn’t ordered. The other atomic operations from this
point are release-ordered and thus reads after the lock acquisition can
be completed before the lock is truly acquired which violates the
guarantees the lock should be making.

Fixes: b519b56e378ee ("locking/qrwlock: Use atomic_cond_read_acquire() when spinning in qrwloc")
Signed-off-by: Ali Saidi <alisaidi@amazon.com>
Cc: stable@vger.kernel.org
---
 kernel/locking/qrwlock.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c
index 4786dd271b45..10770f6ac4d9 100644
--- a/kernel/locking/qrwlock.c
+++ b/kernel/locking/qrwlock.c
@@ -73,8 +73,8 @@ void queued_write_lock_slowpath(struct qrwlock *lock)
 
 	/* When no more readers or writers, set the locked flag */
 	do {
-		atomic_cond_read_acquire(&lock->cnts, VAL == _QW_WAITING);
-	} while (atomic_cmpxchg_relaxed(&lock->cnts, _QW_WAITING,
+		atomic_cond_read_relaxed(&lock->cnts, VAL == _QW_WAITING);
+	} while (atomic_cmpxchg_acquire(&lock->cnts, _QW_WAITING,
 					_QW_LOCKED) != _QW_WAITING);
 unlock:
 	arch_spin_unlock(&lock->wait_lock);
-- 
2.24.4.AMZN



* Re: [PATCH] locking/qrwlock: Fix ordering in queued_write_lock_slowpath
  2021-04-15 14:25 [PATCH] locking/qrwlock: Fix ordering in queued_write_lock_slowpath Ali Saidi
@ 2021-04-15 15:02 ` Will Deacon
  2021-04-15 16:26   ` Ali Saidi
  2021-04-15 15:03 ` Peter Zijlstra
  2021-04-15 15:07 ` Waiman Long
  2 siblings, 1 reply; 12+ messages in thread
From: Will Deacon @ 2021-04-15 15:02 UTC (permalink / raw)
  To: Ali Saidi
  Cc: linux-kernel, catalin.marinas, steve.capper, benh, stable,
	Peter Zijlstra, Ingo Molnar, Waiman Long, Boqun Feng

On Thu, Apr 15, 2021 at 02:25:52PM +0000, Ali Saidi wrote:
> While this code is executed with the wait_lock held, a reader can
> acquire the lock without holding wait_lock.  The writer side loops
> checking the value with the atomic_cond_read_acquire(), but only truly
> acquires the lock when the compare-and-exchange is completed
> successfully which isn’t ordered. The other atomic operations from this
> point are release-ordered and thus reads after the lock acquisition can
> be completed before the lock is truly acquired which violates the
> guarantees the lock should be making.

I think it would be worth spelling this out with an example. The issue
appears to be a concurrent reader in interrupt context taking and releasing
the lock in the window where the writer has returned from the
atomic_cond_read_acquire() but has not yet performed the cmpxchg(). Loads
can be speculated during this time, but the A-B-A of the lock word
from _QW_WAITING to (_QW_WAITING | _QR_BIAS) and back to _QW_WAITING allows
the atomic_cmpxchg_relaxed() to succeed. Is that right?
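
In terms of the lock word, that would look something like this (rough
sketch; per asm-generic/qrwlock.h, _QW_WAITING is 0x100, _QW_LOCKED is
0x0ff and _QR_BIAS is 0x200):

	Writer (wait_lock held)             Reader (interrupt context)

	atomic_cond_read_acquire()
	// sees cnts == _QW_WAITING (0x100)
	                                    read_lock()
	                                    // cnts: 0x100 -> 0x300
	                                    ...reader critical section...
	                                    read_unlock()
	                                    // cnts: 0x300 -> 0x100 (A-B-A)
	atomic_cmpxchg_relaxed(0x100 -> 0x0ff)
	// succeeds, but being relaxed it does not
	// order the writer's subsequent loads, which
	// may already have been satisfied speculatively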

With that in mind, it would probably be a good idea to eyeball the qspinlock
slowpath as well, as that uses both atomic_cond_read_acquire() and
atomic_try_cmpxchg_relaxed().

> Fixes: b519b56e378ee ("locking/qrwlock: Use atomic_cond_read_acquire() when spinning in qrwloc")

Typo in the quoted subject ('qrwloc').

> Signed-off-by: Ali Saidi <alisaidi@amazon.com>
> Cc: stable@vger.kernel.org
> ---
>  kernel/locking/qrwlock.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c
> index 4786dd271b45..10770f6ac4d9 100644
> --- a/kernel/locking/qrwlock.c
> +++ b/kernel/locking/qrwlock.c
> @@ -73,8 +73,8 @@ void queued_write_lock_slowpath(struct qrwlock *lock)
>  
>  	/* When no more readers or writers, set the locked flag */
>  	do {
> -		atomic_cond_read_acquire(&lock->cnts, VAL == _QW_WAITING);
> -	} while (atomic_cmpxchg_relaxed(&lock->cnts, _QW_WAITING,
> +		atomic_cond_read_relaxed(&lock->cnts, VAL == _QW_WAITING);
> +	} while (atomic_cmpxchg_acquire(&lock->cnts, _QW_WAITING,
>  					_QW_LOCKED) != _QW_WAITING);

Patch looks good, so with an updated message:

Acked-by: Will Deacon <will@kernel.org>

Will


* Re: [PATCH] locking/qrwlock: Fix ordering in queued_write_lock_slowpath
  2021-04-15 14:25 [PATCH] locking/qrwlock: Fix ordering in queued_write_lock_slowpath Ali Saidi
  2021-04-15 15:02 ` Will Deacon
@ 2021-04-15 15:03 ` Peter Zijlstra
  2021-04-15 15:28   ` Will Deacon
  2021-04-15 15:07 ` Waiman Long
  2 siblings, 1 reply; 12+ messages in thread
From: Peter Zijlstra @ 2021-04-15 15:03 UTC (permalink / raw)
  To: Ali Saidi
  Cc: linux-kernel, catalin.marinas, steve.capper, benh, stable,
	Ingo Molnar, Will Deacon, Waiman Long, Boqun Feng

On Thu, Apr 15, 2021 at 02:25:52PM +0000, Ali Saidi wrote:
> While this code is executed with the wait_lock held, a reader can
> acquire the lock without holding wait_lock.  The writer side loops
> checking the value with the atomic_cond_read_acquire(), but only truly
> acquires the lock when the compare-and-exchange is completed
> successfully which isn’t ordered. The other atomic operations from this
> point are release-ordered and thus reads after the lock acquisition can
> be completed before the lock is truly acquired which violates the
> guarantees the lock should be making.

Should be per who? We explicitly do not order the lock acquire store.
qspinlock has the exact same issue.

If you look in the git history surrounding spin_is_locked(), you'll find
'lovely' things.

Basically, if you're doing crazy things with spin_is_locked() you get to
keep the pieces.

> Fixes: b519b56e378ee ("locking/qrwlock: Use atomic_cond_read_acquire() when spinning in qrwloc")
> Signed-off-by: Ali Saidi <alisaidi@amazon.com>
> Cc: stable@vger.kernel.org
> ---
>  kernel/locking/qrwlock.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c
> index 4786dd271b45..10770f6ac4d9 100644
> --- a/kernel/locking/qrwlock.c
> +++ b/kernel/locking/qrwlock.c
> @@ -73,8 +73,8 @@ void queued_write_lock_slowpath(struct qrwlock *lock)
>  
>  	/* When no more readers or writers, set the locked flag */
>  	do {
> -		atomic_cond_read_acquire(&lock->cnts, VAL == _QW_WAITING);
> -	} while (atomic_cmpxchg_relaxed(&lock->cnts, _QW_WAITING,
> +		atomic_cond_read_relaxed(&lock->cnts, VAL == _QW_WAITING);
> +	} while (atomic_cmpxchg_acquire(&lock->cnts, _QW_WAITING,
>  					_QW_LOCKED) != _QW_WAITING);
>  unlock:
>  	arch_spin_unlock(&lock->wait_lock);

This doesn't make sense, there is no such thing as a store-acquire. What
you're doing here is moving the acquire from one load to the next. A
load we know will load the exact same value.

Also see Documentation/atomic_t.txt:

  {}_acquire: the R of the RMW (or atomic_read) is an ACQUIRE


If anything this code wants to be written like so.

---

diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c
index 4786dd271b45..22aeccc363ca 100644
--- a/kernel/locking/qrwlock.c
+++ b/kernel/locking/qrwlock.c
@@ -60,6 +60,8 @@ EXPORT_SYMBOL(queued_read_lock_slowpath);
  */
 void queued_write_lock_slowpath(struct qrwlock *lock)
 {
+	u32 cnt;
+
 	/* Put the writer into the wait queue */
 	arch_spin_lock(&lock->wait_lock);
 
@@ -73,9 +75,8 @@ void queued_write_lock_slowpath(struct qrwlock *lock)
 
 	/* When no more readers or writers, set the locked flag */
 	do {
-		atomic_cond_read_acquire(&lock->cnts, VAL == _QW_WAITING);
-	} while (atomic_cmpxchg_relaxed(&lock->cnts, _QW_WAITING,
-					_QW_LOCKED) != _QW_WAITING);
+		cnt = atomic_cond_read_acquire(&lock->cnts, VAL == _QW_WAITING);
+	} while (!atomic_try_cmpxchg_relaxed(&lock->cnts, &cnt, _QW_LOCKED));
 unlock:
 	arch_spin_unlock(&lock->wait_lock);
 }


* Re: [PATCH] locking/qrwlock: Fix ordering in queued_write_lock_slowpath
  2021-04-15 14:25 [PATCH] locking/qrwlock: Fix ordering in queued_write_lock_slowpath Ali Saidi
  2021-04-15 15:02 ` Will Deacon
  2021-04-15 15:03 ` Peter Zijlstra
@ 2021-04-15 15:07 ` Waiman Long
  2 siblings, 0 replies; 12+ messages in thread
From: Waiman Long @ 2021-04-15 15:07 UTC (permalink / raw)
  To: Ali Saidi, linux-kernel
  Cc: catalin.marinas, steve.capper, benh, stable, Peter Zijlstra,
	Ingo Molnar, Will Deacon, Boqun Feng

On 4/15/21 10:25 AM, Ali Saidi wrote:
> While this code is executed with the wait_lock held, a reader can
> acquire the lock without holding wait_lock.  The writer side loops
> checking the value with the atomic_cond_read_acquire(), but only truly
> acquires the lock when the compare-and-exchange is completed
> successfully which isn’t ordered. The other atomic operations from this
> point are release-ordered and thus reads after the lock acquisition can
> be completed before the lock is truly acquired which violates the
> guarantees the lock should be making.
>
> Fixes: b519b56e378ee ("locking/qrwlock: Use atomic_cond_read_acquire() when spinning in qrwloc")
> Signed-off-by: Ali Saidi <alisaidi@amazon.com>
> Cc: stable@vger.kernel.org
> ---
>   kernel/locking/qrwlock.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c
> index 4786dd271b45..10770f6ac4d9 100644
> --- a/kernel/locking/qrwlock.c
> +++ b/kernel/locking/qrwlock.c
> @@ -73,8 +73,8 @@ void queued_write_lock_slowpath(struct qrwlock *lock)
>   
>   	/* When no more readers or writers, set the locked flag */
>   	do {
> -		atomic_cond_read_acquire(&lock->cnts, VAL == _QW_WAITING);
> -	} while (atomic_cmpxchg_relaxed(&lock->cnts, _QW_WAITING,
> +		atomic_cond_read_relaxed(&lock->cnts, VAL == _QW_WAITING);
> +	} while (atomic_cmpxchg_acquire(&lock->cnts, _QW_WAITING,
>   					_QW_LOCKED) != _QW_WAITING);
>   unlock:
>   	arch_spin_unlock(&lock->wait_lock);

I think the original code isn't wrong. The read_acquire provides the
acquire barrier for the cmpxchg. Because of the conditional dependency,
the wait_lock unlock won't happen until the cmpxchg succeeds. Without
doing a read_acquire first, the cmpxchg may be more likely to fail.

Anyway, I will let Will or Peter chime in on this as I am not as 
proficient as them on this topic.

Cheers,
Longman




* Re: [PATCH] locking/qrwlock: Fix ordering in queued_write_lock_slowpath
  2021-04-15 15:03 ` Peter Zijlstra
@ 2021-04-15 15:28   ` Will Deacon
  2021-04-15 15:37     ` Catalin Marinas
  2021-04-15 15:54     ` Peter Zijlstra
  0 siblings, 2 replies; 12+ messages in thread
From: Will Deacon @ 2021-04-15 15:28 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ali Saidi, linux-kernel, catalin.marinas, steve.capper, benh,
	stable, Ingo Molnar, Waiman Long, Boqun Feng

On Thu, Apr 15, 2021 at 05:03:58PM +0200, Peter Zijlstra wrote:
> On Thu, Apr 15, 2021 at 02:25:52PM +0000, Ali Saidi wrote:
> > While this code is executed with the wait_lock held, a reader can
> > acquire the lock without holding wait_lock.  The writer side loops
> > checking the value with the atomic_cond_read_acquire(), but only truly
> > acquires the lock when the compare-and-exchange is completed
> > successfully which isn’t ordered. The other atomic operations from this
> > point are release-ordered and thus reads after the lock acquisition can
> > be completed before the lock is truly acquired which violates the
> > guarantees the lock should be making.
> 
> Should be per who? We explicitly do not order the lock acquire store.
> qspinlock has the exact same issue.
> 
> If you look in the git history surrounding spin_is_locked(), you'll find
> 'lovely' things.
> 
> Basically, if you're doing crazy things with spin_is_locked() you get to
> keep the pieces.
> 
> > Fixes: b519b56e378ee ("locking/qrwlock: Use atomic_cond_read_acquire() when spinning in qrwloc")
> > Signed-off-by: Ali Saidi <alisaidi@amazon.com>
> > Cc: stable@vger.kernel.org
> > ---
> >  kernel/locking/qrwlock.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c
> > index 4786dd271b45..10770f6ac4d9 100644
> > --- a/kernel/locking/qrwlock.c
> > +++ b/kernel/locking/qrwlock.c
> > @@ -73,8 +73,8 @@ void queued_write_lock_slowpath(struct qrwlock *lock)
> >  
> >  	/* When no more readers or writers, set the locked flag */
> >  	do {
> > -		atomic_cond_read_acquire(&lock->cnts, VAL == _QW_WAITING);
> > -	} while (atomic_cmpxchg_relaxed(&lock->cnts, _QW_WAITING,
> > +		atomic_cond_read_relaxed(&lock->cnts, VAL == _QW_WAITING);
> > +	} while (atomic_cmpxchg_acquire(&lock->cnts, _QW_WAITING,
> >  					_QW_LOCKED) != _QW_WAITING);
> >  unlock:
> >  	arch_spin_unlock(&lock->wait_lock);
> 
> This doesn't make sense, there is no such thing as a store-acquire. What
> you're doing here is moving the acquire from one load to the next. A
> load we know will load the exact same value.
> 
> Also see Documentation/atomic_t.txt:
> 
>   {}_acquire: the R of the RMW (or atomic_read) is an ACQUIRE
> 
> 
> If anything this code wants to be written like so.
> 
> ---
> 
> diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c
> index 4786dd271b45..22aeccc363ca 100644
> --- a/kernel/locking/qrwlock.c
> +++ b/kernel/locking/qrwlock.c
> @@ -60,6 +60,8 @@ EXPORT_SYMBOL(queued_read_lock_slowpath);
>   */
>  void queued_write_lock_slowpath(struct qrwlock *lock)
>  {
> +	u32 cnt;
> +
>  	/* Put the writer into the wait queue */
>  	arch_spin_lock(&lock->wait_lock);
>  
> @@ -73,9 +75,8 @@ void queued_write_lock_slowpath(struct qrwlock *lock)
>  
>  	/* When no more readers or writers, set the locked flag */
>  	do {
> -		atomic_cond_read_acquire(&lock->cnts, VAL == _QW_WAITING);
> -	} while (atomic_cmpxchg_relaxed(&lock->cnts, _QW_WAITING,
> -					_QW_LOCKED) != _QW_WAITING);
> +		cnt = atomic_cond_read_acquire(&lock->cnts, VAL == _QW_WAITING);

I think the issue is that >here< a concurrent reader in interrupt context
can take the lock and release it again, but we could speculate reads from
the critical section up over the later release and up before the control
dependency here...

> +	} while (!atomic_try_cmpxchg_relaxed(&lock->cnts, &cnt, _QW_LOCKED));

... and then this cmpxchg() will succeed, so our speculated stale reads
could be used.

*HOWEVER*

Speculating a read should be fine in the face of a concurrent _reader_,
so for this to be an issue it implies that the reader is also doing some
(atomic?) updates.

Ali?

Will


* Re: [PATCH] locking/qrwlock: Fix ordering in queued_write_lock_slowpath
  2021-04-15 15:28   ` Will Deacon
@ 2021-04-15 15:37     ` Catalin Marinas
  2021-04-15 15:50       ` Will Deacon
  2021-04-15 15:54     ` Peter Zijlstra
  1 sibling, 1 reply; 12+ messages in thread
From: Catalin Marinas @ 2021-04-15 15:37 UTC (permalink / raw)
  To: Will Deacon
  Cc: Peter Zijlstra, Ali Saidi, linux-kernel, steve.capper, benh,
	stable, Ingo Molnar, Waiman Long, Boqun Feng

On Thu, Apr 15, 2021 at 04:28:21PM +0100, Will Deacon wrote:
> On Thu, Apr 15, 2021 at 05:03:58PM +0200, Peter Zijlstra wrote:
> > On Thu, Apr 15, 2021 at 02:25:52PM +0000, Ali Saidi wrote:
> > > While this code is executed with the wait_lock held, a reader can
> > > acquire the lock without holding wait_lock.  The writer side loops
> > > checking the value with the atomic_cond_read_acquire(), but only truly
> > > acquires the lock when the compare-and-exchange is completed
> > > successfully which isn’t ordered. The other atomic operations from this
> > > point are release-ordered and thus reads after the lock acquisition can
> > > be completed before the lock is truly acquired which violates the
> > > guarantees the lock should be making.
[...]
> > > Fixes: b519b56e378ee ("locking/qrwlock: Use atomic_cond_read_acquire() when spinning in qrwloc")
> > > Signed-off-by: Ali Saidi <alisaidi@amazon.com>
> > > Cc: stable@vger.kernel.org
> > > ---
> > >  kernel/locking/qrwlock.c | 4 ++--
> > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c
> > > index 4786dd271b45..10770f6ac4d9 100644
> > > --- a/kernel/locking/qrwlock.c
> > > +++ b/kernel/locking/qrwlock.c
> > > @@ -73,8 +73,8 @@ void queued_write_lock_slowpath(struct qrwlock *lock)
> > >  
> > >  	/* When no more readers or writers, set the locked flag */
> > >  	do {
> > > -		atomic_cond_read_acquire(&lock->cnts, VAL == _QW_WAITING);
> > > -	} while (atomic_cmpxchg_relaxed(&lock->cnts, _QW_WAITING,
> > > +		atomic_cond_read_relaxed(&lock->cnts, VAL == _QW_WAITING);
> > > +	} while (atomic_cmpxchg_acquire(&lock->cnts, _QW_WAITING,
> > >  					_QW_LOCKED) != _QW_WAITING);
> > >  unlock:
> > >  	arch_spin_unlock(&lock->wait_lock);
> > 
> > This doesn't make sense, there is no such thing as a store-acquire. What
> > you're doing here is moving the acquire from one load to the next. A
> > load we know will load the exact same value.
> > 
> > Also see Documentation/atomic_t.txt:
> > 
> >   {}_acquire: the R of the RMW (or atomic_read) is an ACQUIRE
> > 
> > 
> > If anything this code wants to be written like so.
> > 
> > ---
> > 
> > diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c
> > index 4786dd271b45..22aeccc363ca 100644
> > --- a/kernel/locking/qrwlock.c
> > +++ b/kernel/locking/qrwlock.c
> > @@ -60,6 +60,8 @@ EXPORT_SYMBOL(queued_read_lock_slowpath);
> >   */
> >  void queued_write_lock_slowpath(struct qrwlock *lock)
> >  {
> > +	u32 cnt;
> > +
> >  	/* Put the writer into the wait queue */
> >  	arch_spin_lock(&lock->wait_lock);
> >  
> > @@ -73,9 +75,8 @@ void queued_write_lock_slowpath(struct qrwlock *lock)
> >  
> >  	/* When no more readers or writers, set the locked flag */
> >  	do {
> > -		atomic_cond_read_acquire(&lock->cnts, VAL == _QW_WAITING);
> > -	} while (atomic_cmpxchg_relaxed(&lock->cnts, _QW_WAITING,
> > -					_QW_LOCKED) != _QW_WAITING);
> > +		cnt = atomic_cond_read_acquire(&lock->cnts, VAL == _QW_WAITING);
> 
> I think the issue is that >here< a concurrent reader in interrupt context
> can take the lock and release it again, but we could speculate reads from
> the critical section up over the later release and up before the control
> dependency here...
> 
> > +	} while (!atomic_try_cmpxchg_relaxed(&lock->cnts, &cnt, _QW_LOCKED));
> 
> ... and then this cmpxchg() will succeed, so our speculated stale reads
> could be used.
> 
> *HOWEVER*
> 
> Speculating a read should be fine in the face of a concurrent _reader_,
> so for this to be an issue it implies that the reader is also doing some
> (atomic?) updates.

There's at least one such case: see chain_epi_lockless() updating
epi->next, called from ep_poll_callback() with a read_lock held. This
races with ep_done_scan() which has the write_lock held.
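
For reference, chain_epi_lockless() looked roughly like this at the time
(abridged sketch; see fs/eventpoll.c for the real code):

static inline bool chain_epi_lockless(struct epitem *epi)
{
	struct eventpoll *ep = epi->ep;

	/* Check that the same epi has not been chained from another CPU */
	if (cmpxchg(&epi->next, EP_UNACTIVE_PTR, NULL) != EP_UNACTIVE_PTR)
		return false;

	/* Atomically exchange tail; this xchg is the reader-side update */
	epi->next = xchg(&ep->ovflist, epi);

	return true;
}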

I think the authors of the above code interpreted the read_lock as
something that multiple threads can own disregarding the _read_ part.

-- 
Catalin


* Re: [PATCH] locking/qrwlock: Fix ordering in queued_write_lock_slowpath
  2021-04-15 15:37     ` Catalin Marinas
@ 2021-04-15 15:50       ` Will Deacon
  0 siblings, 0 replies; 12+ messages in thread
From: Will Deacon @ 2021-04-15 15:50 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Peter Zijlstra, Ali Saidi, linux-kernel, steve.capper, benh,
	stable, Ingo Molnar, Waiman Long, Boqun Feng

On Thu, Apr 15, 2021 at 04:37:58PM +0100, Catalin Marinas wrote:
> On Thu, Apr 15, 2021 at 04:28:21PM +0100, Will Deacon wrote:
> > On Thu, Apr 15, 2021 at 05:03:58PM +0200, Peter Zijlstra wrote:
> > > diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c
> > > index 4786dd271b45..22aeccc363ca 100644
> > > --- a/kernel/locking/qrwlock.c
> > > +++ b/kernel/locking/qrwlock.c
> > > @@ -60,6 +60,8 @@ EXPORT_SYMBOL(queued_read_lock_slowpath);
> > >   */
> > >  void queued_write_lock_slowpath(struct qrwlock *lock)
> > >  {
> > > +	u32 cnt;
> > > +
> > >  	/* Put the writer into the wait queue */
> > >  	arch_spin_lock(&lock->wait_lock);
> > >  
> > > @@ -73,9 +75,8 @@ void queued_write_lock_slowpath(struct qrwlock *lock)
> > >  
> > >  	/* When no more readers or writers, set the locked flag */
> > >  	do {
> > > -		atomic_cond_read_acquire(&lock->cnts, VAL == _QW_WAITING);
> > > -	} while (atomic_cmpxchg_relaxed(&lock->cnts, _QW_WAITING,
> > > -					_QW_LOCKED) != _QW_WAITING);
> > > +		cnt = atomic_cond_read_acquire(&lock->cnts, VAL == _QW_WAITING);
> > 
> > I think the issue is that >here< a concurrent reader in interrupt context
> > can take the lock and release it again, but we could speculate reads from
> > the critical section up over the later release and up before the control
> > dependency here...
> > 
> > > +	} while (!atomic_try_cmpxchg_relaxed(&lock->cnts, &cnt, _QW_LOCKED));
> > 
> > ... and then this cmpxchg() will succeed, so our speculated stale reads
> > could be used.
> > 
> > *HOWEVER*
> > 
> > Speculating a read should be fine in the face of a concurrent _reader_,
> > so for this to be an issue it implies that the reader is also doing some
> > (atomic?) updates.
> 
> There's at least one such case: see chain_epi_lockless() updating
> epi->next, called from ep_poll_callback() with a read_lock held. This
> races with ep_done_scan() which has the write_lock held.

Do you know if that's the code which triggered this patch? If so, it would
be great to have this information in the changelog!

> I think the authors of the above code interpreted the read_lock as
> something that multiple threads can own disregarding the _read_ part.

Using RmW atomics should be fine for that; it's no worse than some of the
tricks pulled in RCU read context in the dentry cache (but then again, what
is? ;)

Will


* Re: [PATCH] locking/qrwlock: Fix ordering in queued_write_lock_slowpath
  2021-04-15 15:28   ` Will Deacon
  2021-04-15 15:37     ` Catalin Marinas
@ 2021-04-15 15:54     ` Peter Zijlstra
  1 sibling, 0 replies; 12+ messages in thread
From: Peter Zijlstra @ 2021-04-15 15:54 UTC (permalink / raw)
  To: Will Deacon
  Cc: Ali Saidi, linux-kernel, catalin.marinas, steve.capper, benh,
	stable, Ingo Molnar, Waiman Long, Boqun Feng

On Thu, Apr 15, 2021 at 04:28:21PM +0100, Will Deacon wrote:
> On Thu, Apr 15, 2021 at 05:03:58PM +0200, Peter Zijlstra wrote:
> > @@ -73,9 +75,8 @@ void queued_write_lock_slowpath(struct qrwlock *lock)
> >  
> >  	/* When no more readers or writers, set the locked flag */
> >  	do {
> > -		atomic_cond_read_acquire(&lock->cnts, VAL == _QW_WAITING);
> > -	} while (atomic_cmpxchg_relaxed(&lock->cnts, _QW_WAITING,
> > -					_QW_LOCKED) != _QW_WAITING);
> > +		cnt = atomic_cond_read_acquire(&lock->cnts, VAL == _QW_WAITING);
> 
> I think the issue is that >here< a concurrent reader in interrupt context
> can take the lock and release it again, but we could speculate reads from
> the critical section up over the later release and up before the control
> dependency here...

OK, so that was totally not clear from the original changelog.

> > +	} while (!atomic_try_cmpxchg_relaxed(&lock->cnts, &cnt, _QW_LOCKED));
> 
> ... and then this cmpxchg() will succeed, so our speculated stale reads
> could be used.
> 
> *HOWEVER*
> 
> Speculating a read should be fine in the face of a concurrent _reader_,
> so for this to be an issue it implies that the reader is also doing some
> (atomic?) updates.

So we're having something like:

	CPU0				CPU1

	queue_write_lock_slowpath()
	  atomic_cond_read_acquire() == _QW_WAITING

					queued_read_lock_slowpath()
					  atomic_cond_read_acquire()
					  return;
   ,--> (before X's store)
   |					X = y;
   |
   |					queued_read_unlock()
   |					  (void)atomic_sub_return_release()
   |
   |	  atomic_cmpxchg_relaxed(.old = _QW_WAITING)
   |
    `--	r = X;


Which, as Will said, is a cmpxchg ABA; however, for it to be ABA we have
to observe that unlock-store, and won't we then also have to observe the
whole critical section?

Ah, the issue is yet another load inside our own (CPU0's) critical
section, which isn't ordered against the cmpxchg_relaxed() and can be
issued before it.

So yes, making the cmpxchg an acquire will force all our own loads to
happen later.
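
IOW, keeping the try_cmpxchg form from before, something like this
(sketch, untested):

	do {
		cnt = atomic_cond_read_relaxed(&lock->cnts, VAL == _QW_WAITING);
	} while (!atomic_try_cmpxchg_acquire(&lock->cnts, &cnt, _QW_LOCKED));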


* Re: [PATCH] locking/qrwlock: Fix ordering in queued_write_lock_slowpath
  2021-04-15 15:02 ` Will Deacon
@ 2021-04-15 16:26   ` Ali Saidi
  2021-04-15 16:45     ` Will Deacon
  0 siblings, 1 reply; 12+ messages in thread
From: Ali Saidi @ 2021-04-15 16:26 UTC (permalink / raw)
  To: will
  Cc: alisaidi, benh, boqun.feng, catalin.marinas, linux-kernel,
	longman, mingo, peterz, stable, steve.capper


On Thu, 15 Apr 2021 16:02:29 +0100, Will Deacon wrote:
> On Thu, Apr 15, 2021 at 02:25:52PM +0000, Ali Saidi wrote:
> > While this code is executed with the wait_lock held, a reader can
> > acquire the lock without holding wait_lock.  The writer side loops
> > checking the value with the atomic_cond_read_acquire(), but only truly
> > acquires the lock when the compare-and-exchange is completed
> > successfully which isn’t ordered. The other atomic operations from this
> > point are release-ordered and thus reads after the lock acquisition can
> > be completed before the lock is truly acquired which violates the
> > guarantees the lock should be making.
> 
> I think it would be worth spelling this out with an example. The issue
> appears to be a concurrent reader in interrupt context taking and releasing
> the lock in the window where the writer has returned from the
> atomic_cond_read_acquire() but has not yet performed the cmpxchg(). Loads
> can be speculated during this time, but the A-B-A of the lock word
> from _QW_WAITING to (_QW_WAITING | _QR_BIAS) and back to _QW_WAITING allows
> the atomic_cmpxchg_relaxed() to succeed. Is that right?

You're right. What we're seeing is an A-B-A problem: the
atomic_cond_read_acquire() succeeds, and before the cmpxchg completes a
reader performs an A-B-A on the lock word, which allows the core to
satisfy a read that follows the cmpxchg before the cmpxchg itself
succeeds.

We've seen a problem in epoll where the reader does an xchg while
holding the read lock, but the writer can see a value change out from under it.

Writer                               | Reader 2
--------------------------------------------------------------------------------
ep_scan_ready_list()                 |
|- write_lock_irq()                  |
    |- queued_write_lock_slowpath()  |
      |- atomic_cond_read_acquire()  |
                                     | read_lock_irqsave(&ep->lock, flags);
                                     | chain_epi_lockless()
                                     |    epi->next = xchg(&ep->ovflist, epi);
                                     | read_unlock_irqrestore(&ep->lock, flags);
                                     |       
         atomic_cmpxchg_relaxed()    |
  READ_ONCE(ep->ovflist);    

> 
> With that in mind, it would probably be a good idea to eyeball the qspinlock
> slowpath as well, as that uses both atomic_cond_read_acquire() and
> atomic_try_cmpxchg_relaxed().

It seems plausible that the same thing could occur here in qspinlock:
          if ((val & _Q_TAIL_MASK) == tail) {
                  if (atomic_try_cmpxchg_relaxed(&lock->val, &val, _Q_LOCKED_VAL))
                          goto release; /* No contention */
          }

> 
> > Fixes: b519b56e378ee ("locking/qrwlock: Use atomic_cond_read_acquire() when spinning in qrwloc")

Ack, will fix. 

> Typo in the quoted subject ('qrwloc').
> 
> > Signed-off-by: Ali Saidi <alisaidi@amazon.com>
> > Cc: stable@vger.kernel.org
> > ---
> >  kernel/locking/qrwlock.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c
> > index 4786dd271b45..10770f6ac4d9 100644
> > --- a/kernel/locking/qrwlock.c
> > +++ b/kernel/locking/qrwlock.c
> > @@ -73,8 +73,8 @@ void queued_write_lock_slowpath(struct qrwlock *lock)
> >  
> >  	/* When no more readers or writers, set the locked flag */
> >  	do {
> > -		atomic_cond_read_acquire(&lock->cnts, VAL == _QW_WAITING);
> > -	} while (atomic_cmpxchg_relaxed(&lock->cnts, _QW_WAITING,
> > +		atomic_cond_read_relaxed(&lock->cnts, VAL == _QW_WAITING);
> > +	} while (atomic_cmpxchg_acquire(&lock->cnts, _QW_WAITING,
> >  					_QW_LOCKED) != _QW_WAITING);
> 
> Patch looks good, so with an updated message:
> 
> Acked-by: Will Deacon <will@kernel.org>
> 
> Will


* Re: [PATCH] locking/qrwlock: Fix ordering in queued_write_lock_slowpath
  2021-04-15 16:26   ` Ali Saidi
@ 2021-04-15 16:45     ` Will Deacon
  2021-04-15 16:53       ` Waiman Long
  2021-04-15 17:18       ` Steve Capper
  0 siblings, 2 replies; 12+ messages in thread
From: Will Deacon @ 2021-04-15 16:45 UTC (permalink / raw)
  To: Ali Saidi
  Cc: benh, boqun.feng, catalin.marinas, linux-kernel, longman, mingo,
	peterz, stable, steve.capper

On Thu, Apr 15, 2021 at 04:26:46PM +0000, Ali Saidi wrote:
> 
> On Thu, 15 Apr 2021 16:02:29 +0100, Will Deacon wrote:
> > On Thu, Apr 15, 2021 at 02:25:52PM +0000, Ali Saidi wrote:
> > > While this code is executed with the wait_lock held, a reader can
> > > acquire the lock without holding wait_lock.  The writer side loops
> > > checking the value with the atomic_cond_read_acquire(), but only truly
> > > acquires the lock when the compare-and-exchange is completed
> > > successfully which isn’t ordered. The other atomic operations from this
> > > point are release-ordered and thus reads after the lock acquisition can
> > > be completed before the lock is truly acquired which violates the
> > > guarantees the lock should be making.
> > 
> > I think it would be worth spelling this out with an example. The issue
> > appears to be a concurrent reader in interrupt context taking and releasing
> > the lock in the window where the writer has returned from the
> > atomic_cond_read_acquire() but has not yet performed the cmpxchg(). Loads
> > can be speculated during this time, but the A-B-A of the lock word
> > from _QW_WAITING to (_QW_WAITING | _QR_BIAS) and back to _QW_WAITING allows
> > the atomic_cmpxchg_relaxed() to succeed. Is that right?
> 
> You're right. What we're seeing is an A-B-A problem: the
> atomic_cond_read_acquire() succeeds, and before the cmpxchg completes a
> reader performs an A-B-A on the lock word, which allows the core to
> satisfy a read that follows the cmpxchg before the cmpxchg itself
> succeeds.
> 
> We've seen a problem in epoll where the reader does an xchg while
> holding the read lock, but the writer can see a value change out from under it.
> 
> Writer                               | Reader 2
> --------------------------------------------------------------------------------
> ep_scan_ready_list()                 |
> |- write_lock_irq()                  |
>     |- queued_write_lock_slowpath()  |
>       |- atomic_cond_read_acquire()  |
>                                      | read_lock_irqsave(&ep->lock, flags);
>                                      | chain_epi_lockless()
>                                      |    epi->next = xchg(&ep->ovflist, epi);
>                                      | read_unlock_irqrestore(&ep->lock, flags);
>                                      |       
>          atomic_cmpxchg_relaxed()    |
>   READ_ONCE(ep->ovflist);    


Please stick this in the commit message, preferably annotated a bit like
Peter's example to show the READ_ONCE() being speculated.

> > With that in mind, it would probably be a good idea to eyeball the qspinlock
> > slowpath as well, as that uses both atomic_cond_read_acquire() and
> > atomic_try_cmpxchg_relaxed().
> 
> It seems plausible that the same thing could occur here in qspinlock:
>           if ((val & _Q_TAIL_MASK) == tail) {
>                   if (atomic_try_cmpxchg_relaxed(&lock->val, &val, _Q_LOCKED_VAL))
>                           goto release; /* No contention */
>           }

Just been thinking about this, but I don't see an issue here because
everybody is queuing the same way (i.e. we don't have a mechanism to jump
the queue like we do for qrwlock) and the tail portion of the lock word
isn't susceptible to ABA. That is, once we're at the head of the queue
and we've seen the lock become unlocked via atomic_cond_read_acquire(),
then we know we hold it.

So qspinlock looks fine to me, but I'd obviously value anybody else's
opinion on that.

Will


* Re: [PATCH] locking/qrwlock: Fix ordering in queued_write_lock_slowpath
  2021-04-15 16:45     ` Will Deacon
@ 2021-04-15 16:53       ` Waiman Long
  2021-04-15 17:18       ` Steve Capper
  1 sibling, 0 replies; 12+ messages in thread
From: Waiman Long @ 2021-04-15 16:53 UTC (permalink / raw)
  To: Will Deacon, Ali Saidi
  Cc: benh, boqun.feng, catalin.marinas, linux-kernel, mingo, peterz,
	stable, steve.capper

On 4/15/21 12:45 PM, Will Deacon wrote:
>
>>> With that in mind, it would probably be a good idea to eyeball the qspinlock
>>> slowpath as well, as that uses both atomic_cond_read_acquire() and
>>> atomic_try_cmpxchg_relaxed().
>> It seems plausible that the same thing could occur here in qspinlock:
>>            if ((val & _Q_TAIL_MASK) == tail) {
>>                    if (atomic_try_cmpxchg_relaxed(&lock->val, &val, _Q_LOCKED_VAL))
>>                            goto release; /* No contention */
>>            }
> Just been thinking about this, but I don't see an issue here because
> everybody is queuing the same way (i.e. we don't have a mechanism to jump
> the queue like we do for qrwlock) and the tail portion of the lock word
> isn't susceptible to ABA. That is, once we're at the head of the queue
> and we've seen the lock become unlocked via atomic_cond_read_acquire(),
> then we know we hold it.
>
> So qspinlock looks fine to me, but I'd obviously value anybody else's
> opinion on that.

I agree with your assessment of qspinlock. I think qspinlock is fine.

Cheers,
Longman



* Re: [PATCH] locking/qrwlock: Fix ordering in queued_write_lock_slowpath
  2021-04-15 16:45     ` Will Deacon
  2021-04-15 16:53       ` Waiman Long
@ 2021-04-15 17:18       ` Steve Capper
  1 sibling, 0 replies; 12+ messages in thread
From: Steve Capper @ 2021-04-15 17:18 UTC (permalink / raw)
  To: Will Deacon, Ali Saidi
  Cc: benh, boqun.feng, catalin.marinas, linux-kernel, longman, mingo,
	peterz, stable, nd


Hiya,

On 15/04/2021 17:45, Will Deacon wrote:
> On Thu, Apr 15, 2021 at 04:26:46PM +0000, Ali Saidi wrote:
>>
>> On Thu, 15 Apr 2021 16:02:29 +0100, Will Deacon wrote:
>>> On Thu, Apr 15, 2021 at 02:25:52PM +0000, Ali Saidi wrote:
>>>> While this code is executed with the wait_lock held, a reader can
>>>> acquire the lock without holding wait_lock.  The writer side loops
>>>> checking the value with the atomic_cond_read_acquire(), but only truly
>>>> acquires the lock when the compare-and-exchange is completed
>>>> successfully which isn’t ordered. The other atomic operations from this
>>>> point are release-ordered and thus reads after the lock acquisition can
>>>> be completed before the lock is truly acquired which violates the
>>>> guarantees the lock should be making.
>>>
>>> I think it would be worth spelling this out with an example. The issue
>>> appears to be a concurrent reader in interrupt context taking and releasing
>>> the lock in the window where the writer has returned from the
>>> atomic_cond_read_acquire() but has not yet performed the cmpxchg(). Loads
>>> can be speculated during this time, but the A-B-A of the lock word
>>> from _QW_WAITING to (_QW_WAITING | _QR_BIAS) and back to _QW_WAITING allows
>>> the atomic_cmpxchg_relaxed() to succeed. Is that right?
>>
>> You're right. What we're seeing is an A-B-A problem: the
>> atomic_cond_read_acquire() succeeds, and before the cmpxchg completes a
>> reader performs an A-B-A on the lock word, which allows the core to
>> satisfy a read that follows the cmpxchg before the cmpxchg itself
>> succeeds.
>>
>> We've seen a problem in epoll where the reader does an xchg while
>> holding the read lock, but the writer can see a value change out from under it.
>>
>> Writer                               | Reader 2
>> --------------------------------------------------------------------------------
>> ep_scan_ready_list()                 |
>> |- write_lock_irq()                  |
>>      |- queued_write_lock_slowpath()  |
>>        |- atomic_cond_read_acquire()  |
>>                                       | read_lock_irqsave(&ep->lock, flags);
>>                                       | chain_epi_lockless()
>>                                       |    epi->next = xchg(&ep->ovflist, epi);
>>                                       | read_unlock_irqrestore(&ep->lock, flags);
>>                                       |
>>           atomic_cmpxchg_relaxed()    |
>>    READ_ONCE(ep->ovflist);
> 
> 
> Please stick this in the commit message, preferably annotated a bit like
> Peter's example to show the READ_ONCE() being speculated.
> 

I can confirm that this patch fixes a problem observed in
ep_scan_ready_list(.) whereby ovflist appeared to change when the write
lock was held.

So please feel free to add:
Tested-by: Steve Capper <steve.capper@arm.com>

Also, I have spent a decent chunk of time looking at the above issue and
have gone through qrwlock, so FWIW, please feel free to add:
Reviewed-by: Steve Capper <steve.capper@arm.com>

Cheers,
-- 
Steve

