linux-kernel.vger.kernel.org archive mirror
* [PATCH -next v2] locking/osq_lock: annotate a data race in osq_lock
@ 2020-02-11 13:54 Qian Cai
  2020-05-08 20:59 ` Qian Cai
  0 siblings, 1 reply; 16+ messages in thread
From: Qian Cai @ 2020-02-11 13:54 UTC (permalink / raw)
  To: peterz, mingo; +Cc: will, elver, linux-kernel, Qian Cai

prev->next can be accessed concurrently, as noticed by KCSAN:

 write (marked) to 0xffff9d3370dbbe40 of 8 bytes by task 3294 on cpu 107:
  osq_lock+0x25f/0x350
  osq_wait_next at kernel/locking/osq_lock.c:79
  (inlined by) osq_lock at kernel/locking/osq_lock.c:185
  rwsem_optimistic_spin
  <snip>

 read to 0xffff9d3370dbbe40 of 8 bytes by task 3398 on cpu 100:
  osq_lock+0x196/0x350
  osq_lock at kernel/locking/osq_lock.c:157
  rwsem_optimistic_spin
  <snip>

The write only stores NULL to prev->next, and the read merely tests
whether prev->next equals this_cpu_ptr(&osq_node). Even if the read
observes a torn value, the code still behaves correctly. Thus, mark
it as an intentional data race using the data_race() macro.

Signed-off-by: Qian Cai <cai@lca.pw>
---

v2: insert some code comments.

 kernel/locking/osq_lock.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
index 1f7734949ac8..f733bcd99e8a 100644
--- a/kernel/locking/osq_lock.c
+++ b/kernel/locking/osq_lock.c
@@ -154,7 +154,11 @@ bool osq_lock(struct optimistic_spin_queue *lock)
 	 */
 
 	for (;;) {
-		if (prev->next == node &&
+		/*
+		 * cpu_relax() below implies a compiler barrier which would
+		 * prevent this comparison being optimized away.
+		 */
+		if (data_race(prev->next == node) &&
 		    cmpxchg(&prev->next, node, NULL) == node)
 			break;
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH -next v2] locking/osq_lock: annotate a data race in osq_lock
  2020-02-11 13:54 [PATCH -next v2] locking/osq_lock: annotate a data race in osq_lock Qian Cai
@ 2020-05-08 20:59 ` Qian Cai
  2020-05-09  4:33   ` Paul E. McKenney
  0 siblings, 1 reply; 16+ messages in thread
From: Qian Cai @ 2020-05-08 20:59 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Will Deacon, Elver Marco, LKML, Ingo Molnar, Peter Zijlstra (Intel)



> On Feb 11, 2020, at 8:54 AM, Qian Cai <cai@lca.pw> wrote:
> 
> prev->next could be accessed concurrently as noticed by KCSAN,
> 
> write (marked) to 0xffff9d3370dbbe40 of 8 bytes by task 3294 on cpu 107:
>  osq_lock+0x25f/0x350
>  osq_wait_next at kernel/locking/osq_lock.c:79
>  (inlined by) osq_lock at kernel/locking/osq_lock.c:185
>  rwsem_optimistic_spin
>  <snip>
> 
> read to 0xffff9d3370dbbe40 of 8 bytes by task 3398 on cpu 100:
>  osq_lock+0x196/0x350
>  osq_lock at kernel/locking/osq_lock.c:157
>  rwsem_optimistic_spin
>  <snip>
> 
> Since the write only stores NULL to prev->next and the read tests if
> prev->next equals to this_cpu_ptr(&osq_node). Even if the value is
> shattered, the code is still working correctly. Thus, mark it as an
> intentional data race using the data_race() macro.
> 
> Signed-off-by: Qian Cai <cai@lca.pw>

Hmm, this patch has been dropped from linux-next for some reason.

Paul, can you pick this up along with KCSAN fixes?

https://lore.kernel.org/lkml/1581429255-12542-1-git-send-email-cai@lca.pw/

> ---
> 
> v2: insert some code comments.
> 
> kernel/locking/osq_lock.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
> index 1f7734949ac8..f733bcd99e8a 100644
> --- a/kernel/locking/osq_lock.c
> +++ b/kernel/locking/osq_lock.c
> @@ -154,7 +154,11 @@ bool osq_lock(struct optimistic_spin_queue *lock)
> 	 */
> 
> 	for (;;) {
> -		if (prev->next == node &&
> +		/*
> +		 * cpu_relax() below implies a compiler barrier which would
> +		 * prevent this comparison being optimized away.
> +		 */
> +		if (data_race(prev->next == node) &&
> 		    cmpxchg(&prev->next, node, NULL) == node)
> 			break;
> 
> -- 
> 1.8.3.1
> 



* Re: [PATCH -next v2] locking/osq_lock: annotate a data race in osq_lock
  2020-05-08 20:59 ` Qian Cai
@ 2020-05-09  4:33   ` Paul E. McKenney
  2020-05-09 13:01     ` Qian Cai
  0 siblings, 1 reply; 16+ messages in thread
From: Paul E. McKenney @ 2020-05-09  4:33 UTC (permalink / raw)
  To: Qian Cai
  Cc: Will Deacon, Elver Marco, LKML, Ingo Molnar, Peter Zijlstra (Intel)

On Fri, May 08, 2020 at 04:59:05PM -0400, Qian Cai wrote:
> 
> 
> > On Feb 11, 2020, at 8:54 AM, Qian Cai <cai@lca.pw> wrote:
> > 
> > prev->next could be accessed concurrently as noticed by KCSAN,
> > 
> > write (marked) to 0xffff9d3370dbbe40 of 8 bytes by task 3294 on cpu 107:
> >  osq_lock+0x25f/0x350
> >  osq_wait_next at kernel/locking/osq_lock.c:79
> >  (inlined by) osq_lock at kernel/locking/osq_lock.c:185
> >  rwsem_optimistic_spin
> >  <snip>
> > 
> > read to 0xffff9d3370dbbe40 of 8 bytes by task 3398 on cpu 100:
> >  osq_lock+0x196/0x350
> >  osq_lock at kernel/locking/osq_lock.c:157
> >  rwsem_optimistic_spin
> >  <snip>
> > 
> > Since the write only stores NULL to prev->next and the read tests if
> > prev->next equals to this_cpu_ptr(&osq_node). Even if the value is
> > shattered, the code is still working correctly. Thus, mark it as an
> > intentional data race using the data_race() macro.
> > 
> > Signed-off-by: Qian Cai <cai@lca.pw>
> 
> Hmm, this patch has been dropped from linux-next from some reasons.
> 
> Paul, can you pick this up along with KCSAN fixes?
> 
> https://lore.kernel.org/lkml/1581429255-12542-1-git-send-email-cai@lca.pw/

I have queued it on -rcu, but it is too late for v5.8 via the -rcu tree,
so this is v5.9 at the earliest.  Plus I would need an ack from one of
the locking folks.

							Thanx, Paul

> > ---
> > 
> > v2: insert some code comments.
> > 
> > kernel/locking/osq_lock.c | 6 +++++-
> > 1 file changed, 5 insertions(+), 1 deletion(-)
> > 
> > diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
> > index 1f7734949ac8..f733bcd99e8a 100644
> > --- a/kernel/locking/osq_lock.c
> > +++ b/kernel/locking/osq_lock.c
> > @@ -154,7 +154,11 @@ bool osq_lock(struct optimistic_spin_queue *lock)
> > 	 */
> > 
> > 	for (;;) {
> > -		if (prev->next == node &&
> > +		/*
> > +		 * cpu_relax() below implies a compiler barrier which would
> > +		 * prevent this comparison being optimized away.
> > +		 */
> > +		if (data_race(prev->next == node) &&
> > 		    cmpxchg(&prev->next, node, NULL) == node)
> > 			break;
> > 
> > -- 
> > 1.8.3.1
> > 
> 


* Re: [PATCH -next v2] locking/osq_lock: annotate a data race in osq_lock
  2020-05-09  4:33   ` Paul E. McKenney
@ 2020-05-09 13:01     ` Qian Cai
  2020-05-09 16:12       ` Paul E. McKenney
  0 siblings, 1 reply; 16+ messages in thread
From: Qian Cai @ 2020-05-09 13:01 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Will Deacon, Elver Marco, LKML, Ingo Molnar, Peter Zijlstra (Intel)



> On May 9, 2020, at 12:33 AM, Paul E. McKenney <paulmck@kernel.org> wrote:
> 
> On Fri, May 08, 2020 at 04:59:05PM -0400, Qian Cai wrote:
>> 
>> 
>>> On Feb 11, 2020, at 8:54 AM, Qian Cai <cai@lca.pw> wrote:
>>> 
>>> prev->next could be accessed concurrently as noticed by KCSAN,
>>> 
>>> write (marked) to 0xffff9d3370dbbe40 of 8 bytes by task 3294 on cpu 107:
>>> osq_lock+0x25f/0x350
>>> osq_wait_next at kernel/locking/osq_lock.c:79
>>> (inlined by) osq_lock at kernel/locking/osq_lock.c:185
>>> rwsem_optimistic_spin
>>> <snip>
>>> 
>>> read to 0xffff9d3370dbbe40 of 8 bytes by task 3398 on cpu 100:
>>> osq_lock+0x196/0x350
>>> osq_lock at kernel/locking/osq_lock.c:157
>>> rwsem_optimistic_spin
>>> <snip>
>>> 
>>> Since the write only stores NULL to prev->next and the read tests if
>>> prev->next equals to this_cpu_ptr(&osq_node). Even if the value is
>>> shattered, the code is still working correctly. Thus, mark it as an
>>> intentional data race using the data_race() macro.
>>> 
>>> Signed-off-by: Qian Cai <cai@lca.pw>
>> 
>> Hmm, this patch has been dropped from linux-next from some reasons.
>> 
>> Paul, can you pick this up along with KCSAN fixes?
>> 
>> https://lore.kernel.org/lkml/1581429255-12542-1-git-send-email-cai@lca.pw/
> 
> I have queued it on -rcu, but it is too late for v5.8 via the -rcu tree,
> so this is v5.9 at the earliest.  Plus I would need an ack from one of
> the locking folks.

Peter, Will, can you give an ACK? This v2 should incorporate all the feedback from Peter,

https://lore.kernel.org/lkml/20200211124753.GP14914@hirez.programming.kicks-ass.net/

v5.9 is fine. All I care about is that it stays in linux-next (so the testing bots won’t keep triggering this over and over) and that it makes mainline at some point in the future.

> 
> 							Thanx, Paul
> 
>>> ---
>>> 
>>> v2: insert some code comments.
>>> 
>>> kernel/locking/osq_lock.c | 6 +++++-
>>> 1 file changed, 5 insertions(+), 1 deletion(-)
>>> 
>>> diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
>>> index 1f7734949ac8..f733bcd99e8a 100644
>>> --- a/kernel/locking/osq_lock.c
>>> +++ b/kernel/locking/osq_lock.c
>>> @@ -154,7 +154,11 @@ bool osq_lock(struct optimistic_spin_queue *lock)
>>> 	 */
>>> 
>>> 	for (;;) {
>>> -		if (prev->next == node &&
>>> +		/*
>>> +		 * cpu_relax() below implies a compiler barrier which would
>>> +		 * prevent this comparison being optimized away.
>>> +		 */
>>> +		if (data_race(prev->next == node) &&
>>> 		    cmpxchg(&prev->next, node, NULL) == node)
>>> 			break;
>>> 
>>> -- 
>>> 1.8.3.1



* Re: [PATCH -next v2] locking/osq_lock: annotate a data race in osq_lock
  2020-05-09 13:01     ` Qian Cai
@ 2020-05-09 16:12       ` Paul E. McKenney
  2020-05-09 16:53         ` Qian Cai
  0 siblings, 1 reply; 16+ messages in thread
From: Paul E. McKenney @ 2020-05-09 16:12 UTC (permalink / raw)
  To: Qian Cai
  Cc: Will Deacon, Elver Marco, LKML, Ingo Molnar, Peter Zijlstra (Intel)

On Sat, May 09, 2020 at 09:01:53AM -0400, Qian Cai wrote:
> 
> 
> > On May 9, 2020, at 12:33 AM, Paul E. McKenney <paulmck@kernel.org> wrote:
> > 
> > On Fri, May 08, 2020 at 04:59:05PM -0400, Qian Cai wrote:
> >> 
> >> 
> >>> On Feb 11, 2020, at 8:54 AM, Qian Cai <cai@lca.pw> wrote:
> >>> 
> >>> prev->next could be accessed concurrently as noticed by KCSAN,
> >>> 
> >>> write (marked) to 0xffff9d3370dbbe40 of 8 bytes by task 3294 on cpu 107:
> >>> osq_lock+0x25f/0x350
> >>> osq_wait_next at kernel/locking/osq_lock.c:79
> >>> (inlined by) osq_lock at kernel/locking/osq_lock.c:185
> >>> rwsem_optimistic_spin
> >>> <snip>
> >>> 
> >>> read to 0xffff9d3370dbbe40 of 8 bytes by task 3398 on cpu 100:
> >>> osq_lock+0x196/0x350
> >>> osq_lock at kernel/locking/osq_lock.c:157
> >>> rwsem_optimistic_spin
> >>> <snip>
> >>> 
> >>> Since the write only stores NULL to prev->next and the read tests if
> >>> prev->next equals to this_cpu_ptr(&osq_node). Even if the value is
> >>> shattered, the code is still working correctly. Thus, mark it as an
> >>> intentional data race using the data_race() macro.
> >>> 
> >>> Signed-off-by: Qian Cai <cai@lca.pw>
> >> 
> >> Hmm, this patch has been dropped from linux-next from some reasons.
> >> 
> >> Paul, can you pick this up along with KCSAN fixes?
> >> 
> >> https://lore.kernel.org/lkml/1581429255-12542-1-git-send-email-cai@lca.pw/
> > 
> > I have queued it on -rcu, but it is too late for v5.8 via the -rcu tree,
> > so this is v5.9 at the earliest.  Plus I would need an ack from one of
> > the locking folks.
> 
> Peter, Will, can you give an ACK? This v2 should incorporate all the feedback from Peter,
> 
> https://lore.kernel.org/lkml/20200211124753.GP14914@hirez.programming.kicks-ass.net/
> 
> V5.9 is fine. All I care about is it is always in linux-next (so the testing bots won’t trigger this over and over again) and to be in mainline at some point in the future.

Ah, and I forgot to ask.  Why "if (data_race(prev->next == node)" instead
of "if (data_race(prev->next) == node"?

							Thanx, Paul

> >>> ---
> >>> 
> >>> v2: insert some code comments.
> >>> 
> >>> kernel/locking/osq_lock.c | 6 +++++-
> >>> 1 file changed, 5 insertions(+), 1 deletion(-)
> >>> 
> >>> diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
> >>> index 1f7734949ac8..f733bcd99e8a 100644
> >>> --- a/kernel/locking/osq_lock.c
> >>> +++ b/kernel/locking/osq_lock.c
> >>> @@ -154,7 +154,11 @@ bool osq_lock(struct optimistic_spin_queue *lock)
> >>> 	 */
> >>> 
> >>> 	for (;;) {
> >>> -		if (prev->next == node &&
> >>> +		/*
> >>> +		 * cpu_relax() below implies a compiler barrier which would
> >>> +		 * prevent this comparison being optimized away.
> >>> +		 */
> >>> +		if (data_race(prev->next == node) &&
> >>> 		    cmpxchg(&prev->next, node, NULL) == node)
> >>> 			break;
> >>> 
> >>> -- 
> >>> 1.8.3.1
> 


* Re: [PATCH -next v2] locking/osq_lock: annotate a data race in osq_lock
  2020-05-09 16:12       ` Paul E. McKenney
@ 2020-05-09 16:53         ` Qian Cai
  2020-05-09 21:36           ` Paul E. McKenney
  0 siblings, 1 reply; 16+ messages in thread
From: Qian Cai @ 2020-05-09 16:53 UTC (permalink / raw)
  To: paulmck
  Cc: Will Deacon, Elver Marco, LKML, Ingo Molnar, Peter Zijlstra (Intel)



> On May 9, 2020, at 12:12 PM, Paul E. McKenney <paulmck@kernel.org> wrote:
> 
> Ah, and I forgot to ask.  Why "if (data_race(prev->next == node)" instead
> of "if (data_race(prev->next) == node"?

I think the one you suggested is slightly better at pointing out the exact race. Do you want me to resend, or could you squash it instead?


* Re: [PATCH -next v2] locking/osq_lock: annotate a data race in osq_lock
  2020-05-09 16:53         ` Qian Cai
@ 2020-05-09 21:36           ` Paul E. McKenney
  2020-05-11 15:58             ` Will Deacon
  0 siblings, 1 reply; 16+ messages in thread
From: Paul E. McKenney @ 2020-05-09 21:36 UTC (permalink / raw)
  To: Qian Cai
  Cc: Will Deacon, Elver Marco, LKML, Ingo Molnar, Peter Zijlstra (Intel)

On Sat, May 09, 2020 at 12:53:38PM -0400, Qian Cai wrote:
> 
> 
> > On May 9, 2020, at 12:12 PM, Paul E. McKenney <paulmck@kernel.org> wrote:
> > 
> > Ah, and I forgot to ask.  Why "if (data_race(prev->next == node)" instead
> > of "if (data_race(prev->next) == node"?
> 
> I think the one you suggested is slightly better to point out the exact race. Do you want me to resend or you could squash it instead?

The patch was still at the top of my stack, so I just amended it.  Here
is the updated version.

							Thanx, Paul

------------------------------------------------------------------------

commit 13e69ca01ce1621ce74248bda86cfad47fa5a0fa
Author: Qian Cai <cai@lca.pw>
Date:   Tue Feb 11 08:54:15 2020 -0500

    locking/osq_lock: Annotate a data race in osq_lock
    
    The prev->next pointer can be accessed concurrently as noticed by KCSAN:
    
     write (marked) to 0xffff9d3370dbbe40 of 8 bytes by task 3294 on cpu 107:
      osq_lock+0x25f/0x350
      osq_wait_next at kernel/locking/osq_lock.c:79
      (inlined by) osq_lock at kernel/locking/osq_lock.c:185
      rwsem_optimistic_spin
      <snip>
    
     read to 0xffff9d3370dbbe40 of 8 bytes by task 3398 on cpu 100:
      osq_lock+0x196/0x350
      osq_lock at kernel/locking/osq_lock.c:157
      rwsem_optimistic_spin
      <snip>
    
    The write only stores NULL to prev->next, and the read merely tests
    whether prev->next equals this_cpu_ptr(&osq_node). Even if the read
    observes a torn value, the code still behaves correctly. Thus, mark
    it as an intentional data race using the data_race() macro.
    
    Signed-off-by: Qian Cai <cai@lca.pw>
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>

diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
index 1f77349..1de006e 100644
--- a/kernel/locking/osq_lock.c
+++ b/kernel/locking/osq_lock.c
@@ -154,7 +154,11 @@ bool osq_lock(struct optimistic_spin_queue *lock)
 	 */
 
 	for (;;) {
-		if (prev->next == node &&
+		/*
+		 * cpu_relax() below implies a compiler barrier which would
+		 * prevent this comparison being optimized away.
+		 */
+		if (data_race(prev->next) == node &&
 		    cmpxchg(&prev->next, node, NULL) == node)
 			break;
 


* Re: [PATCH -next v2] locking/osq_lock: annotate a data race in osq_lock
  2020-05-09 21:36           ` Paul E. McKenney
@ 2020-05-11 15:58             ` Will Deacon
  2020-05-11 16:43               ` Paul E. McKenney
  2020-05-11 16:44               ` Qian Cai
  0 siblings, 2 replies; 16+ messages in thread
From: Will Deacon @ 2020-05-11 15:58 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Qian Cai, Elver Marco, LKML, Ingo Molnar, Peter Zijlstra (Intel)

On Sat, May 09, 2020 at 02:36:54PM -0700, Paul E. McKenney wrote:
> On Sat, May 09, 2020 at 12:53:38PM -0400, Qian Cai wrote:
> > 
> > 
> > > On May 9, 2020, at 12:12 PM, Paul E. McKenney <paulmck@kernel.org> wrote:
> > > 
> > > Ah, and I forgot to ask.  Why "if (data_race(prev->next == node)" instead
> > > of "if (data_race(prev->next) == node"?
> > 
> > I think the one you suggested is slightly better to point out the exact race. Do you want me to resend or you could squash it instead?
> 
> The patch was still at the top of my stack, so I just amended it.  Here
> is the updated version.
> 
> 							Thanx, Paul
> 
> ------------------------------------------------------------------------
> 
> commit 13e69ca01ce1621ce74248bda86cfad47fa5a0fa
> Author: Qian Cai <cai@lca.pw>
> Date:   Tue Feb 11 08:54:15 2020 -0500
> 
>     locking/osq_lock: Annotate a data race in osq_lock
>     
>     The prev->next pointer can be accessed concurrently as noticed by KCSAN:
>     
>      write (marked) to 0xffff9d3370dbbe40 of 8 bytes by task 3294 on cpu 107:
>       osq_lock+0x25f/0x350
>       osq_wait_next at kernel/locking/osq_lock.c:79
>       (inlined by) osq_lock at kernel/locking/osq_lock.c:185
>       rwsem_optimistic_spin
>       <snip>
>     
>      read to 0xffff9d3370dbbe40 of 8 bytes by task 3398 on cpu 100:
>       osq_lock+0x196/0x350
>       osq_lock at kernel/locking/osq_lock.c:157
>       rwsem_optimistic_spin
>       <snip>
>     
>     Since the write only stores NULL to prev->next and the read tests if
>     prev->next equals to this_cpu_ptr(&osq_node). Even if the value is
>     shattered, the code is still working correctly. Thus, mark it as an
>     intentional data race using the data_race() macro.
>     
>     Signed-off-by: Qian Cai <cai@lca.pw>
>     Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> 
> diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
> index 1f77349..1de006e 100644
> --- a/kernel/locking/osq_lock.c
> +++ b/kernel/locking/osq_lock.c
> @@ -154,7 +154,11 @@ bool osq_lock(struct optimistic_spin_queue *lock)
>  	 */
>  
>  	for (;;) {
> -		if (prev->next == node &&
> +		/*
> +		 * cpu_relax() below implies a compiler barrier which would
> +		 * prevent this comparison being optimized away.
> +		 */
> +		if (data_race(prev->next) == node &&
>  		    cmpxchg(&prev->next, node, NULL) == node)
>  			break;

I'm fine with the data_race() placement, but I don't find the comment
very helpful. We assign the result of a READ_ONCE() to 'prev' in the
loop, so I don't think that the cpu_relax() is really relevant.

The reason we don't need READ_ONCE() here is because if we race with
the writer then either we'll go round the loop again after accidentally
thinking prev->next != node, or we'll erroneously attempt the cmpxchg()
because we thought they were equal and that will fail.

Make sense?

Will


* Re: [PATCH -next v2] locking/osq_lock: annotate a data race in osq_lock
  2020-05-11 15:58             ` Will Deacon
@ 2020-05-11 16:43               ` Paul E. McKenney
  2020-05-11 16:52                 ` Will Deacon
  2020-05-11 16:44               ` Qian Cai
  1 sibling, 1 reply; 16+ messages in thread
From: Paul E. McKenney @ 2020-05-11 16:43 UTC (permalink / raw)
  To: Will Deacon
  Cc: Qian Cai, Elver Marco, LKML, Ingo Molnar, Peter Zijlstra (Intel)

On Mon, May 11, 2020 at 04:58:13PM +0100, Will Deacon wrote:
> On Sat, May 09, 2020 at 02:36:54PM -0700, Paul E. McKenney wrote:
> > On Sat, May 09, 2020 at 12:53:38PM -0400, Qian Cai wrote:
> > > 
> > > 
> > > > On May 9, 2020, at 12:12 PM, Paul E. McKenney <paulmck@kernel.org> wrote:
> > > > 
> > > > Ah, and I forgot to ask.  Why "if (data_race(prev->next == node)" instead
> > > > of "if (data_race(prev->next) == node"?
> > > 
> > > I think the one you suggested is slightly better to point out the exact race. Do you want me to resend or you could squash it instead?
> > 
> > The patch was still at the top of my stack, so I just amended it.  Here
> > is the updated version.
> > 
> > 							Thanx, Paul
> > 
> > ------------------------------------------------------------------------
> > 
> > commit 13e69ca01ce1621ce74248bda86cfad47fa5a0fa
> > Author: Qian Cai <cai@lca.pw>
> > Date:   Tue Feb 11 08:54:15 2020 -0500
> > 
> >     locking/osq_lock: Annotate a data race in osq_lock
> >     
> >     The prev->next pointer can be accessed concurrently as noticed by KCSAN:
> >     
> >      write (marked) to 0xffff9d3370dbbe40 of 8 bytes by task 3294 on cpu 107:
> >       osq_lock+0x25f/0x350
> >       osq_wait_next at kernel/locking/osq_lock.c:79
> >       (inlined by) osq_lock at kernel/locking/osq_lock.c:185
> >       rwsem_optimistic_spin
> >       <snip>
> >     
> >      read to 0xffff9d3370dbbe40 of 8 bytes by task 3398 on cpu 100:
> >       osq_lock+0x196/0x350
> >       osq_lock at kernel/locking/osq_lock.c:157
> >       rwsem_optimistic_spin
> >       <snip>
> >     
> >     Since the write only stores NULL to prev->next and the read tests if
> >     prev->next equals to this_cpu_ptr(&osq_node). Even if the value is
> >     shattered, the code is still working correctly. Thus, mark it as an
> >     intentional data race using the data_race() macro.
> >     
> >     Signed-off-by: Qian Cai <cai@lca.pw>
> >     Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> > 
> > diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
> > index 1f77349..1de006e 100644
> > --- a/kernel/locking/osq_lock.c
> > +++ b/kernel/locking/osq_lock.c
> > @@ -154,7 +154,11 @@ bool osq_lock(struct optimistic_spin_queue *lock)
> >  	 */
> >  
> >  	for (;;) {
> > -		if (prev->next == node &&
> > +		/*
> > +		 * cpu_relax() below implies a compiler barrier which would
> > +		 * prevent this comparison being optimized away.
> > +		 */
> > +		if (data_race(prev->next) == node &&
> >  		    cmpxchg(&prev->next, node, NULL) == node)
> >  			break;
> 
> I'm fine with the data_race() placement, but I don't find the comment
> very helpful. We assign the result of a READ_ONCE() to 'prev' in the
> loop, so I don't think that the cpu_relax() is really relevant.

Suppose that the compiler loaded a value that was not equal to "node".
In that case, the cmpxchg() won't happen, so something else must force
the compiler to do the reload in order to avoid an infinite loop, right?
Or am I missing something here?

							Thanx, Paul

> The reason we don't need READ_ONCE() here is because if we race with
> the writer then either we'll go round the loop again after accidentally
> thinking prev->next != node, or we'll erroneously attempt the cmpxchg()
> because we thought they were equal and that will fail.
> 
> Make sense?
> 
> Will


* Re: [PATCH -next v2] locking/osq_lock: annotate a data race in osq_lock
  2020-05-11 15:58             ` Will Deacon
  2020-05-11 16:43               ` Paul E. McKenney
@ 2020-05-11 16:44               ` Qian Cai
  2020-05-11 16:54                 ` Will Deacon
  1 sibling, 1 reply; 16+ messages in thread
From: Qian Cai @ 2020-05-11 16:44 UTC (permalink / raw)
  To: Will Deacon
  Cc: Paul E. McKenney, Elver Marco, LKML, Ingo Molnar, Peter Zijlstra (Intel)



> On May 11, 2020, at 11:58 AM, Will Deacon <will@kernel.org> wrote:
> 
> I'm fine with the data_race() placement, but I don't find the comment
> very helpful. We assign the result of a READ_ONCE() to 'prev' in the
> loop, so I don't think that the cpu_relax() is really relevant.
> 
> The reason we don't need READ_ONCE() here is because if we race with
> the writer then either we'll go round the loop again after accidentally
> thinking prev->next != node, or we'll erroneously attempt the cmpxchg()
> because we thought they were equal and that will fail.
> 
> Make sense?

I think the significant concern from the previous reviews was that compilers could prove prev->next == node was always true, since they have no knowledge of the concurrency, and then optimize the whole if statement away, resulting in an infinite loop.

The comment tried to explain that the cpu_relax() would, in theory, save us from that infinite loop.


* Re: [PATCH -next v2] locking/osq_lock: annotate a data race in osq_lock
  2020-05-11 16:43               ` Paul E. McKenney
@ 2020-05-11 16:52                 ` Will Deacon
  2020-05-11 17:29                   ` Paul E. McKenney
  0 siblings, 1 reply; 16+ messages in thread
From: Will Deacon @ 2020-05-11 16:52 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Qian Cai, Elver Marco, LKML, Ingo Molnar, Peter Zijlstra (Intel)

On Mon, May 11, 2020 at 09:43:19AM -0700, Paul E. McKenney wrote:
> On Mon, May 11, 2020 at 04:58:13PM +0100, Will Deacon wrote:
> > On Sat, May 09, 2020 at 02:36:54PM -0700, Paul E. McKenney wrote:
> > > diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
> > > index 1f77349..1de006e 100644
> > > --- a/kernel/locking/osq_lock.c
> > > +++ b/kernel/locking/osq_lock.c
> > > @@ -154,7 +154,11 @@ bool osq_lock(struct optimistic_spin_queue *lock)
> > >  	 */
> > >  
> > >  	for (;;) {
> > > -		if (prev->next == node &&
> > > +		/*
> > > +		 * cpu_relax() below implies a compiler barrier which would
> > > +		 * prevent this comparison being optimized away.
> > > +		 */
> > > +		if (data_race(prev->next) == node &&
> > >  		    cmpxchg(&prev->next, node, NULL) == node)
> > >  			break;
> > 
> > I'm fine with the data_race() placement, but I don't find the comment
> > very helpful. We assign the result of a READ_ONCE() to 'prev' in the
> > loop, so I don't think that the cpu_relax() is really relevant.
> 
> Suppose that the compiler loaded a value that was not equal to "node".
> In that case, the cmpxchg() won't happen, so something else must force
> the compiler to do the reload in order to avoid an infinite loop, right?
> Or am I missing something here?

Then we just go round the loop and reload prev:

	prev = READ_ONCE(node->prev);

which should be enough to stop the compiler, no?

Will


* Re: [PATCH -next v2] locking/osq_lock: annotate a data race in osq_lock
  2020-05-11 16:44               ` Qian Cai
@ 2020-05-11 16:54                 ` Will Deacon
  2020-05-11 17:10                   ` Qian Cai
  0 siblings, 1 reply; 16+ messages in thread
From: Will Deacon @ 2020-05-11 16:54 UTC (permalink / raw)
  To: Qian Cai
  Cc: Paul E. McKenney, Elver Marco, LKML, Ingo Molnar, Peter Zijlstra (Intel)

On Mon, May 11, 2020 at 12:44:26PM -0400, Qian Cai wrote:
> 
> 
> > On May 11, 2020, at 11:58 AM, Will Deacon <will@kernel.org> wrote:
> > 
> > I'm fine with the data_race() placement, but I don't find the comment
> > very helpful. We assign the result of a READ_ONCE() to 'prev' in the
> > loop, so I don't think that the cpu_relax() is really relevant.
> > 
> > The reason we don't need READ_ONCE() here is because if we race with
> > the writer then either we'll go round the loop again after accidentally
> > thinking prev->next != node, or we'll erroneously attempt the cmpxchg()
> > because we thought they were equal and that will fail.
> > 
> > Make sense?
> 
> I think the significant concern from the previous reviews was if compilers
> could prove that prev->next == node was always true because it had no
> knowledge of the concurrency, and then took out the whole if statement
> away resulting in an infinite loop.

Hmm, I don't see how it can remove the cmpxchg(). Do you have a link to that
discussion, please?

Will


* Re: [PATCH -next v2] locking/osq_lock: annotate a data race in osq_lock
  2020-05-11 16:54                 ` Will Deacon
@ 2020-05-11 17:10                   ` Qian Cai
  0 siblings, 0 replies; 16+ messages in thread
From: Qian Cai @ 2020-05-11 17:10 UTC (permalink / raw)
  To: Will Deacon
  Cc: Paul E. McKenney, Elver Marco, LKML, Ingo Molnar, Peter Zijlstra (Intel)



> On May 11, 2020, at 12:54 PM, Will Deacon <will@kernel.org> wrote:
> 
> Hmm, I don't see how it can remove the cmpxchg(). Do you have a link to that
> discussion, please?

lore.kernel.org/lkml/20200211124753.GP14914@hirez.programming.kicks-ass.net

Correction: if compilers could prove "prev->next != node" is always true, the cmpxchg() would never run. cpu_relax() should be sufficient to keep the if statement from being optimized away in any case.


* Re: [PATCH -next v2] locking/osq_lock: annotate a data race in osq_lock
  2020-05-11 16:52                 ` Will Deacon
@ 2020-05-11 17:29                   ` Paul E. McKenney
  2020-05-11 17:34                     ` Will Deacon
  0 siblings, 1 reply; 16+ messages in thread
From: Paul E. McKenney @ 2020-05-11 17:29 UTC (permalink / raw)
  To: Will Deacon
  Cc: Qian Cai, Elver Marco, LKML, Ingo Molnar, Peter Zijlstra (Intel)

On Mon, May 11, 2020 at 05:52:17PM +0100, Will Deacon wrote:
> On Mon, May 11, 2020 at 09:43:19AM -0700, Paul E. McKenney wrote:
> > On Mon, May 11, 2020 at 04:58:13PM +0100, Will Deacon wrote:
> > > On Sat, May 09, 2020 at 02:36:54PM -0700, Paul E. McKenney wrote:
> > > > diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
> > > > index 1f77349..1de006e 100644
> > > > --- a/kernel/locking/osq_lock.c
> > > > +++ b/kernel/locking/osq_lock.c
> > > > @@ -154,7 +154,11 @@ bool osq_lock(struct optimistic_spin_queue *lock)
> > > >  	 */
> > > >  
> > > >  	for (;;) {
> > > > -		if (prev->next == node &&
> > > > +		/*
> > > > +		 * cpu_relax() below implies a compiler barrier which would
> > > > +		 * prevent this comparison being optimized away.
> > > > +		 */
> > > > +		if (data_race(prev->next) == node &&
> > > >  		    cmpxchg(&prev->next, node, NULL) == node)
> > > >  			break;
> > > 
> > > I'm fine with the data_race() placement, but I don't find the comment
> > > very helpful. We assign the result of a READ_ONCE() to 'prev' in the
> > > loop, so I don't think that the cpu_relax() is really relevant.
> > 
> > Suppose that the compiler loaded a value that was not equal to "node".
> > In that case, the cmpxchg() won't happen, so something else must force
> > the compiler to do the reload in order to avoid an infinite loop, right?
> > Or am I missing something here?
> 
> Then we just go round the loop and reload prev:
> 
> 	prev = READ_ONCE(node->prev);
> 
> which should be enough to stop the compiler, no?

Yes, that would also work.  Either have the cpu_relax() or a barrier()
or whatever on the one hand, or, as you say, turn the data_race() into
a READ_ONCE().  I personally prefer the READ_ONCE() myself, unless that
would undesirably suppress other KCSAN warnings.

							Thanx, Paul


* Re: [PATCH -next v2] locking/osq_lock: annotate a data race in osq_lock
  2020-05-11 17:29                   ` Paul E. McKenney
@ 2020-05-11 17:34                     ` Will Deacon
  2020-05-11 18:07                       ` Paul E. McKenney
  0 siblings, 1 reply; 16+ messages in thread
From: Will Deacon @ 2020-05-11 17:34 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Qian Cai, Elver Marco, LKML, Ingo Molnar, Peter Zijlstra (Intel)

On Mon, May 11, 2020 at 10:29:18AM -0700, Paul E. McKenney wrote:
> On Mon, May 11, 2020 at 05:52:17PM +0100, Will Deacon wrote:
> > On Mon, May 11, 2020 at 09:43:19AM -0700, Paul E. McKenney wrote:
> > > On Mon, May 11, 2020 at 04:58:13PM +0100, Will Deacon wrote:
> > > > On Sat, May 09, 2020 at 02:36:54PM -0700, Paul E. McKenney wrote:
> > > > > diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
> > > > > index 1f77349..1de006e 100644
> > > > > --- a/kernel/locking/osq_lock.c
> > > > > +++ b/kernel/locking/osq_lock.c
> > > > > @@ -154,7 +154,11 @@ bool osq_lock(struct optimistic_spin_queue *lock)
> > > > >  	 */
> > > > >  
> > > > >  	for (;;) {
> > > > > -		if (prev->next == node &&
> > > > > +		/*
> > > > > +		 * cpu_relax() below implies a compiler barrier which would
> > > > > +		 * prevent this comparison being optimized away.
> > > > > +		 */
> > > > > +		if (data_race(prev->next) == node &&
> > > > >  		    cmpxchg(&prev->next, node, NULL) == node)
> > > > >  			break;
> > > > 
> > > > I'm fine with the data_race() placement, but I don't find the comment
> > > > very helpful. We assign the result of a READ_ONCE() to 'prev' in the
> > > > loop, so I don't think that the cpu_relax() is really relevant.
> > > 
> > > Suppose that the compiler loaded a value that was not equal to "node".
> > > In that case, the cmpxchg() won't happen, so something else must force
> > > the compiler to do the reload in order to avoid an infinite loop, right?
> > > Or am I missing something here?
> > 
> > Then we just go round the loop and reload prev:
> > 
> > 	prev = READ_ONCE(node->prev);
> > 
> > which should be enough to stop the compiler, no?
> 
> Yes, that would also work.  Either have the cpu_relax() or a barrier()
> or whatever on the one hand, or, as you say, turn the data_race() into
> a READ_ONCE().  I personally prefer the READ_ONCE() myself, unless that
> would undesirably suppress other KCSAN warnings.

No, I mean here is the code after this patch is applied:

	for (;;) {
		if (data_race(prev->next) == node &&
		    cmpxchg(&prev->next, node, NULL) == node)
			break;

		/*
		 * We can only fail the cmpxchg() racing against an unlock(),
> 		 * in which case we should observe @node->locked becoming
		 * true.
		 */
		if (smp_load_acquire(&node->locked))
			return true;

		cpu_relax();

		/*
		 * Or we race against a concurrent unqueue()'s step-B, in which
		 * case its step-C will write us a new @node->prev pointer.
		 */
		prev = READ_ONCE(node->prev);
	}

I'm saying that this READ_ONCE at the end of the loop should be sufficient
to stop the compiler making value assumptions about prev->next. Do you
agree?

Will


* Re: [PATCH -next v2] locking/osq_lock: annotate a data race in osq_lock
  2020-05-11 17:34                     ` Will Deacon
@ 2020-05-11 18:07                       ` Paul E. McKenney
  0 siblings, 0 replies; 16+ messages in thread
From: Paul E. McKenney @ 2020-05-11 18:07 UTC (permalink / raw)
  To: Will Deacon
  Cc: Qian Cai, Elver Marco, LKML, Ingo Molnar, Peter Zijlstra (Intel)

On Mon, May 11, 2020 at 06:34:13PM +0100, Will Deacon wrote:
> On Mon, May 11, 2020 at 10:29:18AM -0700, Paul E. McKenney wrote:
> > On Mon, May 11, 2020 at 05:52:17PM +0100, Will Deacon wrote:
> > > On Mon, May 11, 2020 at 09:43:19AM -0700, Paul E. McKenney wrote:
> > > > On Mon, May 11, 2020 at 04:58:13PM +0100, Will Deacon wrote:
> > > > > On Sat, May 09, 2020 at 02:36:54PM -0700, Paul E. McKenney wrote:
> > > > > > diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
> > > > > > index 1f77349..1de006e 100644
> > > > > > --- a/kernel/locking/osq_lock.c
> > > > > > +++ b/kernel/locking/osq_lock.c
> > > > > > @@ -154,7 +154,11 @@ bool osq_lock(struct optimistic_spin_queue *lock)
> > > > > >  	 */
> > > > > >  
> > > > > >  	for (;;) {
> > > > > > -		if (prev->next == node &&
> > > > > > +		/*
> > > > > > +		 * cpu_relax() below implies a compiler barrier which would
> > > > > > +		 * prevent this comparison being optimized away.
> > > > > > +		 */
> > > > > > +		if (data_race(prev->next) == node &&
> > > > > >  		    cmpxchg(&prev->next, node, NULL) == node)
> > > > > >  			break;
> > > > > 
> > > > > I'm fine with the data_race() placement, but I don't find the comment
> > > > > very helpful. We assign the result of a READ_ONCE() to 'prev' in the
> > > > > loop, so I don't think that the cpu_relax() is really relevant.
> > > > 
> > > > Suppose that the compiler loaded a value that was not equal to "node".
> > > > In that case, the cmpxchg() won't happen, so something else must force
> > > > the compiler to do the reload in order to avoid an infinite loop, right?
> > > > Or am I missing something here?
> > > 
> > > Then we just go round the loop and reload prev:
> > > 
> > > 	prev = READ_ONCE(node->prev);
> > > 
> > > which should be enough to stop the compiler, no?
> > 
> > Yes, that would also work.  Either have the cpu_relax() or a barrier()
> > or whatever on the one hand, or, as you say, turn the data_race() into
> > a READ_ONCE().  I personally prefer the READ_ONCE() myself, unless that
> > would undesirably suppress other KCSAN warnings.
> 
> No, I mean here is the code after this patch is applied:
> 
> 	for (;;) {
> 		if (data_race(prev->next) == node &&
> 		    cmpxchg(&prev->next, node, NULL) == node)
> 			break;
> 
> 		/*
> 		 * We can only fail the cmpxchg() racing against an unlock(),
> > 		 * in which case we should observe @node->locked becoming
> 		 * true.
> 		 */
> 		if (smp_load_acquire(&node->locked))
> 			return true;
> 
> 		cpu_relax();
> 
> 		/*
> 		 * Or we race against a concurrent unqueue()'s step-B, in which
> 		 * case its step-C will write us a new @node->prev pointer.
> 		 */
> 		prev = READ_ONCE(node->prev);
> 	}
> 
> I'm saying that this READ_ONCE at the end of the loop should be sufficient
> to stop the compiler making value assumptions about prev->next. Do you
> agree?

Good point, and I would certainly hope so!

							Thanx, Paul


end of thread  [~2020-05-11 18:07 UTC | newest]

Thread overview: 16+ messages
2020-02-11 13:54 [PATCH -next v2] locking/osq_lock: annotate a data race in osq_lock Qian Cai
2020-05-08 20:59 ` Qian Cai
2020-05-09  4:33   ` Paul E. McKenney
2020-05-09 13:01     ` Qian Cai
2020-05-09 16:12       ` Paul E. McKenney
2020-05-09 16:53         ` Qian Cai
2020-05-09 21:36           ` Paul E. McKenney
2020-05-11 15:58             ` Will Deacon
2020-05-11 16:43               ` Paul E. McKenney
2020-05-11 16:52                 ` Will Deacon
2020-05-11 17:29                   ` Paul E. McKenney
2020-05-11 17:34                     ` Will Deacon
2020-05-11 18:07                       ` Paul E. McKenney
2020-05-11 16:44               ` Qian Cai
2020-05-11 16:54                 ` Will Deacon
2020-05-11 17:10                   ` Qian Cai
