* [PATCH -next v2] locking/osq_lock: annotate a data race in osq_lock
@ 2020-02-11 13:54 Qian Cai
  2020-05-08 20:59 ` Qian Cai
  0 siblings, 1 reply; 16+ messages in thread
From: Qian Cai @ 2020-02-11 13:54 UTC (permalink / raw)
  To: peterz, mingo; +Cc: will, elver, linux-kernel, Qian Cai

prev->next could be accessed concurrently, as noticed by KCSAN:

 write (marked) to 0xffff9d3370dbbe40 of 8 bytes by task 3294 on cpu 107:
  osq_lock+0x25f/0x350
  osq_wait_next at kernel/locking/osq_lock.c:79
  (inlined by) osq_lock at kernel/locking/osq_lock.c:185
  rwsem_optimistic_spin
  <snip>

 read to 0xffff9d3370dbbe40 of 8 bytes by task 3398 on cpu 100:
  osq_lock+0x196/0x350
  osq_lock at kernel/locking/osq_lock.c:157
  rwsem_optimistic_spin
  <snip>

Since the write only stores NULL to prev->next and the read only tests
whether prev->next equals this_cpu_ptr(&osq_node), the code still works
correctly even if the value is shattered. Thus, mark it as an intentional
data race using the data_race() macro.

Signed-off-by: Qian Cai <cai@lca.pw>
---

v2: insert some code comments.

 kernel/locking/osq_lock.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
index 1f7734949ac8..f733bcd99e8a 100644
--- a/kernel/locking/osq_lock.c
+++ b/kernel/locking/osq_lock.c
@@ -154,7 +154,11 @@ bool osq_lock(struct optimistic_spin_queue *lock)
 	 */
 
 	for (;;) {
-		if (prev->next == node &&
+		/*
+		 * cpu_relax() below implies a compiler barrier which would
+		 * prevent this comparison being optimized away.
+		 */
+		if (data_race(prev->next == node) &&
 		    cmpxchg(&prev->next, node, NULL) == node)
 			break;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread
* Re: [PATCH -next v2] locking/osq_lock: annotate a data race in osq_lock 2020-02-11 13:54 [PATCH -next v2] locking/osq_lock: annotate a data race in osq_lock Qian Cai @ 2020-05-08 20:59 ` Qian Cai 2020-05-09 4:33 ` Paul E. McKenney 0 siblings, 1 reply; 16+ messages in thread From: Qian Cai @ 2020-05-08 20:59 UTC (permalink / raw) To: Paul E. McKenney Cc: Will Deacon, Elver Marco, LKML, Ingo Molnar, Peter Zijlstra (Intel) > On Feb 11, 2020, at 8:54 AM, Qian Cai <cai@lca.pw> wrote: > > prev->next could be accessed concurrently as noticed by KCSAN, > > write (marked) to 0xffff9d3370dbbe40 of 8 bytes by task 3294 on cpu 107: > osq_lock+0x25f/0x350 > osq_wait_next at kernel/locking/osq_lock.c:79 > (inlined by) osq_lock at kernel/locking/osq_lock.c:185 > rwsem_optimistic_spin > <snip> > > read to 0xffff9d3370dbbe40 of 8 bytes by task 3398 on cpu 100: > osq_lock+0x196/0x350 > osq_lock at kernel/locking/osq_lock.c:157 > rwsem_optimistic_spin > <snip> > > Since the write only stores NULL to prev->next and the read tests if > prev->next equals to this_cpu_ptr(&osq_node). Even if the value is > shattered, the code is still working correctly. Thus, mark it as an > intentional data race using the data_race() macro. > > Signed-off-by: Qian Cai <cai@lca.pw> Hmm, this patch has been dropped from linux-next from some reasons. Paul, can you pick this up along with KCSAN fixes? https://lore.kernel.org/lkml/1581429255-12542-1-git-send-email-cai@lca.pw/ > --- > > v2: insert some code comments. 
> > kernel/locking/osq_lock.c | 6 +++++- > 1 file changed, 5 insertions(+), 1 deletion(-) > > diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c > index 1f7734949ac8..f733bcd99e8a 100644 > --- a/kernel/locking/osq_lock.c > +++ b/kernel/locking/osq_lock.c > @@ -154,7 +154,11 @@ bool osq_lock(struct optimistic_spin_queue *lock) > */ > > for (;;) { > - if (prev->next == node && > + /* > + * cpu_relax() below implies a compiler barrier which would > + * prevent this comparison being optimized away. > + */ > + if (data_race(prev->next == node) && > cmpxchg(&prev->next, node, NULL) == node) > break; > > -- > 1.8.3.1 > ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH -next v2] locking/osq_lock: annotate a data race in osq_lock 2020-05-08 20:59 ` Qian Cai @ 2020-05-09 4:33 ` Paul E. McKenney 2020-05-09 13:01 ` Qian Cai 0 siblings, 1 reply; 16+ messages in thread From: Paul E. McKenney @ 2020-05-09 4:33 UTC (permalink / raw) To: Qian Cai Cc: Will Deacon, Elver Marco, LKML, Ingo Molnar, Peter Zijlstra (Intel) On Fri, May 08, 2020 at 04:59:05PM -0400, Qian Cai wrote: > > > > On Feb 11, 2020, at 8:54 AM, Qian Cai <cai@lca.pw> wrote: > > > > prev->next could be accessed concurrently as noticed by KCSAN, > > > > write (marked) to 0xffff9d3370dbbe40 of 8 bytes by task 3294 on cpu 107: > > osq_lock+0x25f/0x350 > > osq_wait_next at kernel/locking/osq_lock.c:79 > > (inlined by) osq_lock at kernel/locking/osq_lock.c:185 > > rwsem_optimistic_spin > > <snip> > > > > read to 0xffff9d3370dbbe40 of 8 bytes by task 3398 on cpu 100: > > osq_lock+0x196/0x350 > > osq_lock at kernel/locking/osq_lock.c:157 > > rwsem_optimistic_spin > > <snip> > > > > Since the write only stores NULL to prev->next and the read tests if > > prev->next equals to this_cpu_ptr(&osq_node). Even if the value is > > shattered, the code is still working correctly. Thus, mark it as an > > intentional data race using the data_race() macro. > > > > Signed-off-by: Qian Cai <cai@lca.pw> > > Hmm, this patch has been dropped from linux-next from some reasons. > > Paul, can you pick this up along with KCSAN fixes? > > https://lore.kernel.org/lkml/1581429255-12542-1-git-send-email-cai@lca.pw/ I have queued it on -rcu, but it is too late for v5.8 via the -rcu tree, so this is v5.9 at the earliest. Plus I would need an ack from one of the locking folks. Thanx, Paul > > --- > > > > v2: insert some code comments. 
> > > > kernel/locking/osq_lock.c | 6 +++++- > > 1 file changed, 5 insertions(+), 1 deletion(-) > > > > diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c > > index 1f7734949ac8..f733bcd99e8a 100644 > > --- a/kernel/locking/osq_lock.c > > +++ b/kernel/locking/osq_lock.c > > @@ -154,7 +154,11 @@ bool osq_lock(struct optimistic_spin_queue *lock) > > */ > > > > for (;;) { > > - if (prev->next == node && > > + /* > > + * cpu_relax() below implies a compiler barrier which would > > + * prevent this comparison being optimized away. > > + */ > > + if (data_race(prev->next == node) && > > cmpxchg(&prev->next, node, NULL) == node) > > break; > > > > -- > > 1.8.3.1 > > > ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH -next v2] locking/osq_lock: annotate a data race in osq_lock 2020-05-09 4:33 ` Paul E. McKenney @ 2020-05-09 13:01 ` Qian Cai 2020-05-09 16:12 ` Paul E. McKenney 0 siblings, 1 reply; 16+ messages in thread From: Qian Cai @ 2020-05-09 13:01 UTC (permalink / raw) To: Paul E. McKenney Cc: Will Deacon, Elver Marco, LKML, Ingo Molnar, Peter Zijlstra (Intel) > On May 9, 2020, at 12:33 AM, Paul E. McKenney <paulmck@kernel.org> wrote: > > On Fri, May 08, 2020 at 04:59:05PM -0400, Qian Cai wrote: >> >> >>> On Feb 11, 2020, at 8:54 AM, Qian Cai <cai@lca.pw> wrote: >>> >>> prev->next could be accessed concurrently as noticed by KCSAN, >>> >>> write (marked) to 0xffff9d3370dbbe40 of 8 bytes by task 3294 on cpu 107: >>> osq_lock+0x25f/0x350 >>> osq_wait_next at kernel/locking/osq_lock.c:79 >>> (inlined by) osq_lock at kernel/locking/osq_lock.c:185 >>> rwsem_optimistic_spin >>> <snip> >>> >>> read to 0xffff9d3370dbbe40 of 8 bytes by task 3398 on cpu 100: >>> osq_lock+0x196/0x350 >>> osq_lock at kernel/locking/osq_lock.c:157 >>> rwsem_optimistic_spin >>> <snip> >>> >>> Since the write only stores NULL to prev->next and the read tests if >>> prev->next equals to this_cpu_ptr(&osq_node). Even if the value is >>> shattered, the code is still working correctly. Thus, mark it as an >>> intentional data race using the data_race() macro. >>> >>> Signed-off-by: Qian Cai <cai@lca.pw> >> >> Hmm, this patch has been dropped from linux-next from some reasons. >> >> Paul, can you pick this up along with KCSAN fixes? >> >> https://lore.kernel.org/lkml/1581429255-12542-1-git-send-email-cai@lca.pw/ > > I have queued it on -rcu, but it is too late for v5.8 via the -rcu tree, > so this is v5.9 at the earliest. Plus I would need an ack from one of > the locking folks. Peter, Will, can you give an ACK? This v2 should incorporate all the feedback from Peter, https://lore.kernel.org/lkml/20200211124753.GP14914@hirez.programming.kicks-ass.net/ V5.9 is fine. 
All I care about is it is always in linux-next (so the testing bots won’t trigger this over and over again) and to be in mainline at some point in the future. > > Thanx, Paul > >>> --- >>> >>> v2: insert some code comments. >>> >>> kernel/locking/osq_lock.c | 6 +++++- >>> 1 file changed, 5 insertions(+), 1 deletion(-) >>> >>> diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c >>> index 1f7734949ac8..f733bcd99e8a 100644 >>> --- a/kernel/locking/osq_lock.c >>> +++ b/kernel/locking/osq_lock.c >>> @@ -154,7 +154,11 @@ bool osq_lock(struct optimistic_spin_queue *lock) >>> */ >>> >>> for (;;) { >>> - if (prev->next == node && >>> + /* >>> + * cpu_relax() below implies a compiler barrier which would >>> + * prevent this comparison being optimized away. >>> + */ >>> + if (data_race(prev->next == node) && >>> cmpxchg(&prev->next, node, NULL) == node) >>> break; >>> >>> -- >>> 1.8.3.1 ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH -next v2] locking/osq_lock: annotate a data race in osq_lock 2020-05-09 13:01 ` Qian Cai @ 2020-05-09 16:12 ` Paul E. McKenney 2020-05-09 16:53 ` Qian Cai 0 siblings, 1 reply; 16+ messages in thread From: Paul E. McKenney @ 2020-05-09 16:12 UTC (permalink / raw) To: Qian Cai Cc: Will Deacon, Elver Marco, LKML, Ingo Molnar, Peter Zijlstra (Intel) On Sat, May 09, 2020 at 09:01:53AM -0400, Qian Cai wrote: > > > > On May 9, 2020, at 12:33 AM, Paul E. McKenney <paulmck@kernel.org> wrote: > > > > On Fri, May 08, 2020 at 04:59:05PM -0400, Qian Cai wrote: > >> > >> > >>> On Feb 11, 2020, at 8:54 AM, Qian Cai <cai@lca.pw> wrote: > >>> > >>> prev->next could be accessed concurrently as noticed by KCSAN, > >>> > >>> write (marked) to 0xffff9d3370dbbe40 of 8 bytes by task 3294 on cpu 107: > >>> osq_lock+0x25f/0x350 > >>> osq_wait_next at kernel/locking/osq_lock.c:79 > >>> (inlined by) osq_lock at kernel/locking/osq_lock.c:185 > >>> rwsem_optimistic_spin > >>> <snip> > >>> > >>> read to 0xffff9d3370dbbe40 of 8 bytes by task 3398 on cpu 100: > >>> osq_lock+0x196/0x350 > >>> osq_lock at kernel/locking/osq_lock.c:157 > >>> rwsem_optimistic_spin > >>> <snip> > >>> > >>> Since the write only stores NULL to prev->next and the read tests if > >>> prev->next equals to this_cpu_ptr(&osq_node). Even if the value is > >>> shattered, the code is still working correctly. Thus, mark it as an > >>> intentional data race using the data_race() macro. > >>> > >>> Signed-off-by: Qian Cai <cai@lca.pw> > >> > >> Hmm, this patch has been dropped from linux-next from some reasons. > >> > >> Paul, can you pick this up along with KCSAN fixes? > >> > >> https://lore.kernel.org/lkml/1581429255-12542-1-git-send-email-cai@lca.pw/ > > > > I have queued it on -rcu, but it is too late for v5.8 via the -rcu tree, > > so this is v5.9 at the earliest. Plus I would need an ack from one of > > the locking folks. > > Peter, Will, can you give an ACK? 
This v2 should incorporate all the feedback from Peter, > > https://lore.kernel.org/lkml/20200211124753.GP14914@hirez.programming.kicks-ass.net/ > > V5.9 is fine. All I care about is it is always in linux-next (so the testing bots won’t trigger this over and over again) and to be in mainline at some point in the future. Ah, and I forgot to ask. Why "if (data_race(prev->next == node)" instead of "if (data_race(prev->next) == node"? Thanx, Paul > >>> --- > >>> > >>> v2: insert some code comments. > >>> > >>> kernel/locking/osq_lock.c | 6 +++++- > >>> 1 file changed, 5 insertions(+), 1 deletion(-) > >>> > >>> diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c > >>> index 1f7734949ac8..f733bcd99e8a 100644 > >>> --- a/kernel/locking/osq_lock.c > >>> +++ b/kernel/locking/osq_lock.c > >>> @@ -154,7 +154,11 @@ bool osq_lock(struct optimistic_spin_queue *lock) > >>> */ > >>> > >>> for (;;) { > >>> - if (prev->next == node && > >>> + /* > >>> + * cpu_relax() below implies a compiler barrier which would > >>> + * prevent this comparison being optimized away. > >>> + */ > >>> + if (data_race(prev->next == node) && > >>> cmpxchg(&prev->next, node, NULL) == node) > >>> break; > >>> > >>> -- > >>> 1.8.3.1 > ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH -next v2] locking/osq_lock: annotate a data race in osq_lock
  2020-05-09 16:12 ` Paul E. McKenney
@ 2020-05-09 16:53   ` Qian Cai
  2020-05-09 21:36     ` Paul E. McKenney
  0 siblings, 1 reply; 16+ messages in thread
From: Qian Cai @ 2020-05-09 16:53 UTC (permalink / raw)
  To: paulmck
  Cc: Will Deacon, Elver Marco, LKML, Ingo Molnar, Peter Zijlstra (Intel)

> On May 9, 2020, at 12:12 PM, Paul E. McKenney <paulmck@kernel.org> wrote:
>
> Ah, and I forgot to ask. Why "if (data_race(prev->next == node)" instead
> of "if (data_race(prev->next) == node"?

I think the one you suggested is slightly better at pointing out the exact
race. Do you want me to resend, or could you squash it instead?

^ permalink raw reply	[flat|nested] 16+ messages in thread
* Re: [PATCH -next v2] locking/osq_lock: annotate a data race in osq_lock 2020-05-09 16:53 ` Qian Cai @ 2020-05-09 21:36 ` Paul E. McKenney 2020-05-11 15:58 ` Will Deacon 0 siblings, 1 reply; 16+ messages in thread From: Paul E. McKenney @ 2020-05-09 21:36 UTC (permalink / raw) To: Qian Cai Cc: Will Deacon, Elver Marco, LKML, Ingo Molnar, Peter Zijlstra (Intel) On Sat, May 09, 2020 at 12:53:38PM -0400, Qian Cai wrote: > > > > On May 9, 2020, at 12:12 PM, Paul E. McKenney <paulmck@kernel.org> wrote: > > > > Ah, and I forgot to ask. Why "if (data_race(prev->next == node)" instead > > of "if (data_race(prev->next) == node"? > > I think the one you suggested is slightly better to point out the exact race. Do you want me to resend or you could squash it instead? The patch was still at the top of my stack, so I just amended it. Here is the updated version. Thanx, Paul ------------------------------------------------------------------------ commit 13e69ca01ce1621ce74248bda86cfad47fa5a0fa Author: Qian Cai <cai@lca.pw> Date: Tue Feb 11 08:54:15 2020 -0500 locking/osq_lock: Annotate a data race in osq_lock The prev->next pointer can be accessed concurrently as noticed by KCSAN: write (marked) to 0xffff9d3370dbbe40 of 8 bytes by task 3294 on cpu 107: osq_lock+0x25f/0x350 osq_wait_next at kernel/locking/osq_lock.c:79 (inlined by) osq_lock at kernel/locking/osq_lock.c:185 rwsem_optimistic_spin <snip> read to 0xffff9d3370dbbe40 of 8 bytes by task 3398 on cpu 100: osq_lock+0x196/0x350 osq_lock at kernel/locking/osq_lock.c:157 rwsem_optimistic_spin <snip> Since the write only stores NULL to prev->next and the read tests if prev->next equals to this_cpu_ptr(&osq_node). Even if the value is shattered, the code is still working correctly. Thus, mark it as an intentional data race using the data_race() macro. Signed-off-by: Qian Cai <cai@lca.pw> Signed-off-by: Paul E. 
McKenney <paulmck@kernel.org> diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c index 1f77349..1de006e 100644 --- a/kernel/locking/osq_lock.c +++ b/kernel/locking/osq_lock.c @@ -154,7 +154,11 @@ bool osq_lock(struct optimistic_spin_queue *lock) */ for (;;) { - if (prev->next == node && + /* + * cpu_relax() below implies a compiler barrier which would + * prevent this comparison being optimized away. + */ + if (data_race(prev->next) == node && cmpxchg(&prev->next, node, NULL) == node) break; ^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH -next v2] locking/osq_lock: annotate a data race in osq_lock 2020-05-09 21:36 ` Paul E. McKenney @ 2020-05-11 15:58 ` Will Deacon 2020-05-11 16:43 ` Paul E. McKenney 2020-05-11 16:44 ` Qian Cai 0 siblings, 2 replies; 16+ messages in thread From: Will Deacon @ 2020-05-11 15:58 UTC (permalink / raw) To: Paul E. McKenney Cc: Qian Cai, Elver Marco, LKML, Ingo Molnar, Peter Zijlstra (Intel) On Sat, May 09, 2020 at 02:36:54PM -0700, Paul E. McKenney wrote: > On Sat, May 09, 2020 at 12:53:38PM -0400, Qian Cai wrote: > > > > > > > On May 9, 2020, at 12:12 PM, Paul E. McKenney <paulmck@kernel.org> wrote: > > > > > > Ah, and I forgot to ask. Why "if (data_race(prev->next == node)" instead > > > of "if (data_race(prev->next) == node"? > > > > I think the one you suggested is slightly better to point out the exact race. Do you want me to resend or you could squash it instead? > > The patch was still at the top of my stack, so I just amended it. Here > is the updated version. > > Thanx, Paul > > ------------------------------------------------------------------------ > > commit 13e69ca01ce1621ce74248bda86cfad47fa5a0fa > Author: Qian Cai <cai@lca.pw> > Date: Tue Feb 11 08:54:15 2020 -0500 > > locking/osq_lock: Annotate a data race in osq_lock > > The prev->next pointer can be accessed concurrently as noticed by KCSAN: > > write (marked) to 0xffff9d3370dbbe40 of 8 bytes by task 3294 on cpu 107: > osq_lock+0x25f/0x350 > osq_wait_next at kernel/locking/osq_lock.c:79 > (inlined by) osq_lock at kernel/locking/osq_lock.c:185 > rwsem_optimistic_spin > <snip> > > read to 0xffff9d3370dbbe40 of 8 bytes by task 3398 on cpu 100: > osq_lock+0x196/0x350 > osq_lock at kernel/locking/osq_lock.c:157 > rwsem_optimistic_spin > <snip> > > Since the write only stores NULL to prev->next and the read tests if > prev->next equals to this_cpu_ptr(&osq_node). Even if the value is > shattered, the code is still working correctly. 
Thus, mark it as an > intentional data race using the data_race() macro. > > Signed-off-by: Qian Cai <cai@lca.pw> > Signed-off-by: Paul E. McKenney <paulmck@kernel.org> > > diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c > index 1f77349..1de006e 100644 > --- a/kernel/locking/osq_lock.c > +++ b/kernel/locking/osq_lock.c > @@ -154,7 +154,11 @@ bool osq_lock(struct optimistic_spin_queue *lock) > */ > > for (;;) { > - if (prev->next == node && > + /* > + * cpu_relax() below implies a compiler barrier which would > + * prevent this comparison being optimized away. > + */ > + if (data_race(prev->next) == node && > cmpxchg(&prev->next, node, NULL) == node) > break; I'm fine with the data_race() placement, but I don't find the comment very helpful. We assign the result of a READ_ONCE() to 'prev' in the loop, so I don't think that the cpu_relax() is really relevant. The reason we don't need READ_ONCE() here is because if we race with the writer then either we'll go round the loop again after accidentally thinking prev->next != node, or we'll erroneously attempt the cmpxchg() because we thought they were equal and that will fail. Make sense? Will ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH -next v2] locking/osq_lock: annotate a data race in osq_lock 2020-05-11 15:58 ` Will Deacon @ 2020-05-11 16:43 ` Paul E. McKenney 2020-05-11 16:52 ` Will Deacon 2020-05-11 16:44 ` Qian Cai 1 sibling, 1 reply; 16+ messages in thread From: Paul E. McKenney @ 2020-05-11 16:43 UTC (permalink / raw) To: Will Deacon Cc: Qian Cai, Elver Marco, LKML, Ingo Molnar, Peter Zijlstra (Intel) On Mon, May 11, 2020 at 04:58:13PM +0100, Will Deacon wrote: > On Sat, May 09, 2020 at 02:36:54PM -0700, Paul E. McKenney wrote: > > On Sat, May 09, 2020 at 12:53:38PM -0400, Qian Cai wrote: > > > > > > > > > > On May 9, 2020, at 12:12 PM, Paul E. McKenney <paulmck@kernel.org> wrote: > > > > > > > > Ah, and I forgot to ask. Why "if (data_race(prev->next == node)" instead > > > > of "if (data_race(prev->next) == node"? > > > > > > I think the one you suggested is slightly better to point out the exact race. Do you want me to resend or you could squash it instead? > > > > The patch was still at the top of my stack, so I just amended it. Here > > is the updated version. > > > > Thanx, Paul > > > > ------------------------------------------------------------------------ > > > > commit 13e69ca01ce1621ce74248bda86cfad47fa5a0fa > > Author: Qian Cai <cai@lca.pw> > > Date: Tue Feb 11 08:54:15 2020 -0500 > > > > locking/osq_lock: Annotate a data race in osq_lock > > > > The prev->next pointer can be accessed concurrently as noticed by KCSAN: > > > > write (marked) to 0xffff9d3370dbbe40 of 8 bytes by task 3294 on cpu 107: > > osq_lock+0x25f/0x350 > > osq_wait_next at kernel/locking/osq_lock.c:79 > > (inlined by) osq_lock at kernel/locking/osq_lock.c:185 > > rwsem_optimistic_spin > > <snip> > > > > read to 0xffff9d3370dbbe40 of 8 bytes by task 3398 on cpu 100: > > osq_lock+0x196/0x350 > > osq_lock at kernel/locking/osq_lock.c:157 > > rwsem_optimistic_spin > > <snip> > > > > Since the write only stores NULL to prev->next and the read tests if > > prev->next equals to this_cpu_ptr(&osq_node). 
Even if the value is > > shattered, the code is still working correctly. Thus, mark it as an > > intentional data race using the data_race() macro. > > > > Signed-off-by: Qian Cai <cai@lca.pw> > > Signed-off-by: Paul E. McKenney <paulmck@kernel.org> > > > > diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c > > index 1f77349..1de006e 100644 > > --- a/kernel/locking/osq_lock.c > > +++ b/kernel/locking/osq_lock.c > > @@ -154,7 +154,11 @@ bool osq_lock(struct optimistic_spin_queue *lock) > > */ > > > > for (;;) { > > - if (prev->next == node && > > + /* > > + * cpu_relax() below implies a compiler barrier which would > > + * prevent this comparison being optimized away. > > + */ > > + if (data_race(prev->next) == node && > > cmpxchg(&prev->next, node, NULL) == node) > > break; > > I'm fine with the data_race() placement, but I don't find the comment > very helpful. We assign the result of a READ_ONCE() to 'prev' in the > loop, so I don't think that the cpu_relax() is really relevant. Suppose that the compiler loaded a value that was not equal to "node". In that case, the cmpxchg() won't happen, so something else must force the compiler to do the reload in order to avoid an infinite loop, right? Or am I missing something here? Thanx, Paul > The reason we don't need READ_ONCE() here is because if we race with > the writer then either we'll go round the loop again after accidentally > thinking prev->next != node, or we'll erroneously attempt the cmpxchg() > because we thought they were equal and that will fail. > > Make sense? > > Will ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH -next v2] locking/osq_lock: annotate a data race in osq_lock 2020-05-11 16:43 ` Paul E. McKenney @ 2020-05-11 16:52 ` Will Deacon 2020-05-11 17:29 ` Paul E. McKenney 0 siblings, 1 reply; 16+ messages in thread From: Will Deacon @ 2020-05-11 16:52 UTC (permalink / raw) To: Paul E. McKenney Cc: Qian Cai, Elver Marco, LKML, Ingo Molnar, Peter Zijlstra (Intel) On Mon, May 11, 2020 at 09:43:19AM -0700, Paul E. McKenney wrote: > On Mon, May 11, 2020 at 04:58:13PM +0100, Will Deacon wrote: > > On Sat, May 09, 2020 at 02:36:54PM -0700, Paul E. McKenney wrote: > > > diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c > > > index 1f77349..1de006e 100644 > > > --- a/kernel/locking/osq_lock.c > > > +++ b/kernel/locking/osq_lock.c > > > @@ -154,7 +154,11 @@ bool osq_lock(struct optimistic_spin_queue *lock) > > > */ > > > > > > for (;;) { > > > - if (prev->next == node && > > > + /* > > > + * cpu_relax() below implies a compiler barrier which would > > > + * prevent this comparison being optimized away. > > > + */ > > > + if (data_race(prev->next) == node && > > > cmpxchg(&prev->next, node, NULL) == node) > > > break; > > > > I'm fine with the data_race() placement, but I don't find the comment > > very helpful. We assign the result of a READ_ONCE() to 'prev' in the > > loop, so I don't think that the cpu_relax() is really relevant. > > Suppose that the compiler loaded a value that was not equal to "node". > In that case, the cmpxchg() won't happen, so something else must force > the compiler to do the reload in order to avoid an infinite loop, right? > Or am I missing something here? Then we just go round the loop and reload prev: prev = READ_ONCE(node->prev); which should be enough to stop the compiler, no? Will ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH -next v2] locking/osq_lock: annotate a data race in osq_lock 2020-05-11 16:52 ` Will Deacon @ 2020-05-11 17:29 ` Paul E. McKenney 2020-05-11 17:34 ` Will Deacon 0 siblings, 1 reply; 16+ messages in thread From: Paul E. McKenney @ 2020-05-11 17:29 UTC (permalink / raw) To: Will Deacon Cc: Qian Cai, Elver Marco, LKML, Ingo Molnar, Peter Zijlstra (Intel) On Mon, May 11, 2020 at 05:52:17PM +0100, Will Deacon wrote: > On Mon, May 11, 2020 at 09:43:19AM -0700, Paul E. McKenney wrote: > > On Mon, May 11, 2020 at 04:58:13PM +0100, Will Deacon wrote: > > > On Sat, May 09, 2020 at 02:36:54PM -0700, Paul E. McKenney wrote: > > > > diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c > > > > index 1f77349..1de006e 100644 > > > > --- a/kernel/locking/osq_lock.c > > > > +++ b/kernel/locking/osq_lock.c > > > > @@ -154,7 +154,11 @@ bool osq_lock(struct optimistic_spin_queue *lock) > > > > */ > > > > > > > > for (;;) { > > > > - if (prev->next == node && > > > > + /* > > > > + * cpu_relax() below implies a compiler barrier which would > > > > + * prevent this comparison being optimized away. > > > > + */ > > > > + if (data_race(prev->next) == node && > > > > cmpxchg(&prev->next, node, NULL) == node) > > > > break; > > > > > > I'm fine with the data_race() placement, but I don't find the comment > > > very helpful. We assign the result of a READ_ONCE() to 'prev' in the > > > loop, so I don't think that the cpu_relax() is really relevant. > > > > Suppose that the compiler loaded a value that was not equal to "node". > > In that case, the cmpxchg() won't happen, so something else must force > > the compiler to do the reload in order to avoid an infinite loop, right? > > Or am I missing something here? > > Then we just go round the loop and reload prev: > > prev = READ_ONCE(node->prev); > > which should be enough to stop the compiler, no? Yes, that would also work. 
Either have the cpu_relax() or a barrier() or whatever on the one hand, or, as you say, turn the data_race() into a READ_ONCE(). I personally prefer the READ_ONCE() myself, unless that would undesirably suppress other KCSAN warnings. Thanx, Paul ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH -next v2] locking/osq_lock: annotate a data race in osq_lock 2020-05-11 17:29 ` Paul E. McKenney @ 2020-05-11 17:34 ` Will Deacon 2020-05-11 18:07 ` Paul E. McKenney 0 siblings, 1 reply; 16+ messages in thread From: Will Deacon @ 2020-05-11 17:34 UTC (permalink / raw) To: Paul E. McKenney Cc: Qian Cai, Elver Marco, LKML, Ingo Molnar, Peter Zijlstra (Intel) On Mon, May 11, 2020 at 10:29:18AM -0700, Paul E. McKenney wrote: > On Mon, May 11, 2020 at 05:52:17PM +0100, Will Deacon wrote: > > On Mon, May 11, 2020 at 09:43:19AM -0700, Paul E. McKenney wrote: > > > On Mon, May 11, 2020 at 04:58:13PM +0100, Will Deacon wrote: > > > > On Sat, May 09, 2020 at 02:36:54PM -0700, Paul E. McKenney wrote: > > > > > diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c > > > > > index 1f77349..1de006e 100644 > > > > > --- a/kernel/locking/osq_lock.c > > > > > +++ b/kernel/locking/osq_lock.c > > > > > @@ -154,7 +154,11 @@ bool osq_lock(struct optimistic_spin_queue *lock) > > > > > */ > > > > > > > > > > for (;;) { > > > > > - if (prev->next == node && > > > > > + /* > > > > > + * cpu_relax() below implies a compiler barrier which would > > > > > + * prevent this comparison being optimized away. > > > > > + */ > > > > > + if (data_race(prev->next) == node && > > > > > cmpxchg(&prev->next, node, NULL) == node) > > > > > break; > > > > > > > > I'm fine with the data_race() placement, but I don't find the comment > > > > very helpful. We assign the result of a READ_ONCE() to 'prev' in the > > > > loop, so I don't think that the cpu_relax() is really relevant. > > > > > > Suppose that the compiler loaded a value that was not equal to "node". > > > In that case, the cmpxchg() won't happen, so something else must force > > > the compiler to do the reload in order to avoid an infinite loop, right? > > > Or am I missing something here? 
> > > > Then we just go round the loop and reload prev: > > > > prev = READ_ONCE(node->prev); > > > > which should be enough to stop the compiler, no? > > Yes, that would also work. Either have the cpu_relax() or a barrier() > or whatever on the one hand, or, as you say, turn the data_race() into > a READ_ONCE(). I personally prefer the READ_ONCE() myself, unless that > would undesirably suppress other KCSAN warnings. No, I mean here is the code after this patch is applied: for (;;) { if (data_race(prev->next) == node && cmpxchg(&prev->next, node, NULL) == node) break; /* * We can only fail the cmpxchg() racing against an unlock(), * in which case we should observe @node->locked becomming * true. */ if (smp_load_acquire(&node->locked)) return true; cpu_relax(); /* * Or we race against a concurrent unqueue()'s step-B, in which * case its step-C will write us a new @node->prev pointer. */ prev = READ_ONCE(node->prev); } I'm saying that this READ_ONCE at the end of the loop should be sufficient to stop the compiler making value assumptions about prev->next. Do you agree? Will ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH -next v2] locking/osq_lock: annotate a data race in osq_lock 2020-05-11 17:34 ` Will Deacon @ 2020-05-11 18:07 ` Paul E. McKenney 0 siblings, 0 replies; 16+ messages in thread From: Paul E. McKenney @ 2020-05-11 18:07 UTC (permalink / raw) To: Will Deacon Cc: Qian Cai, Elver Marco, LKML, Ingo Molnar, Peter Zijlstra (Intel) On Mon, May 11, 2020 at 06:34:13PM +0100, Will Deacon wrote: > On Mon, May 11, 2020 at 10:29:18AM -0700, Paul E. McKenney wrote: > > On Mon, May 11, 2020 at 05:52:17PM +0100, Will Deacon wrote: > > > On Mon, May 11, 2020 at 09:43:19AM -0700, Paul E. McKenney wrote: > > > > On Mon, May 11, 2020 at 04:58:13PM +0100, Will Deacon wrote: > > > > > On Sat, May 09, 2020 at 02:36:54PM -0700, Paul E. McKenney wrote: > > > > > > diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c > > > > > > index 1f77349..1de006e 100644 > > > > > > --- a/kernel/locking/osq_lock.c > > > > > > +++ b/kernel/locking/osq_lock.c > > > > > > @@ -154,7 +154,11 @@ bool osq_lock(struct optimistic_spin_queue *lock) > > > > > > */ > > > > > > > > > > > > for (;;) { > > > > > > - if (prev->next == node && > > > > > > + /* > > > > > > + * cpu_relax() below implies a compiler barrier which would > > > > > > + * prevent this comparison being optimized away. > > > > > > + */ > > > > > > + if (data_race(prev->next) == node && > > > > > > cmpxchg(&prev->next, node, NULL) == node) > > > > > > break; > > > > > > > > > > I'm fine with the data_race() placement, but I don't find the comment > > > > > very helpful. We assign the result of a READ_ONCE() to 'prev' in the > > > > > loop, so I don't think that the cpu_relax() is really relevant. > > > > > > > > Suppose that the compiler loaded a value that was not equal to "node". > > > > In that case, the cmpxchg() won't happen, so something else must force > > > > the compiler to do the reload in order to avoid an infinite loop, right? > > > > Or am I missing something here? 
> > > > > > Then we just go round the loop and reload prev: > > > > > > prev = READ_ONCE(node->prev); > > > > > > which should be enough to stop the compiler, no? > > > > Yes, that would also work. Either have the cpu_relax() or a barrier() > > or whatever on the one hand, or, as you say, turn the data_race() into > > a READ_ONCE(). I personally prefer the READ_ONCE() myself, unless that > > would undesirably suppress other KCSAN warnings. > > No, I mean here is the code after this patch is applied: > > for (;;) { > if (data_race(prev->next) == node && > cmpxchg(&prev->next, node, NULL) == node) > break; > > /* > * We can only fail the cmpxchg() racing against an unlock(), > * in which case we should observe @node->locked becomming > * true. > */ > if (smp_load_acquire(&node->locked)) > return true; > > cpu_relax(); > > /* > * Or we race against a concurrent unqueue()'s step-B, in which > * case its step-C will write us a new @node->prev pointer. > */ > prev = READ_ONCE(node->prev); > } > > I'm saying that this READ_ONCE at the end of the loop should be sufficient > to stop the compiler making value assumptions about prev->next. Do you > agree? Good point, and I would certainly hope so! Thanx, Paul ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH -next v2] locking/osq_lock: annotate a data race in osq_lock
  2020-05-11 15:58     ` Will Deacon
  2020-05-11 16:43       ` Paul E. McKenney
@ 2020-05-11 16:44       ` Qian Cai
  2020-05-11 16:54         ` Will Deacon
  1 sibling, 1 reply; 16+ messages in thread
From: Qian Cai @ 2020-05-11 16:44 UTC (permalink / raw)
  To: Will Deacon
  Cc: Paul E. McKenney, Elver Marco, LKML, Ingo Molnar, Peter Zijlstra (Intel)

> On May 11, 2020, at 11:58 AM, Will Deacon <will@kernel.org> wrote:
> 
> I'm fine with the data_race() placement, but I don't find the comment
> very helpful. We assign the result of a READ_ONCE() to 'prev' in the
> loop, so I don't think that the cpu_relax() is really relevant.
> 
> The reason we don't need READ_ONCE() here is because if we race with
> the writer then either we'll go round the loop again after accidentally
> thinking prev->next != node, or we'll erroneously attempt the cmpxchg()
> because we thought they were equal and that will fail.
> 
> Make sense?

I think the significant concern from the previous reviews was that compilers
could prove prev->next == node was always true, because they have no
knowledge of the concurrency, and then optimize the whole if statement
away, resulting in an infinite loop. The comment tried to explain that the
cpu_relax() would, in theory, save us from that infinite loop.

^ permalink raw reply	[flat|nested] 16+ messages in thread
* Re: [PATCH -next v2] locking/osq_lock: annotate a data race in osq_lock
  2020-05-11 16:44       ` Qian Cai
@ 2020-05-11 16:54         ` Will Deacon
  2020-05-11 17:10           ` Qian Cai
  0 siblings, 1 reply; 16+ messages in thread
From: Will Deacon @ 2020-05-11 16:54 UTC (permalink / raw)
  To: Qian Cai
  Cc: Paul E. McKenney, Elver Marco, LKML, Ingo Molnar, Peter Zijlstra (Intel)

On Mon, May 11, 2020 at 12:44:26PM -0400, Qian Cai wrote:
> 
> > On May 11, 2020, at 11:58 AM, Will Deacon <will@kernel.org> wrote:
> > 
> > I'm fine with the data_race() placement, but I don't find the comment
> > very helpful. We assign the result of a READ_ONCE() to 'prev' in the
> > loop, so I don't think that the cpu_relax() is really relevant.
> > 
> > The reason we don't need READ_ONCE() here is because if we race with
> > the writer then either we'll go round the loop again after accidentally
> > thinking prev->next != node, or we'll erroneously attempt the cmpxchg()
> > because we thought they were equal and that will fail.
> > 
> > Make sense?
> 
> I think the significant concern from the previous reviews was that compilers
> could prove prev->next == node was always true, because they have no
> knowledge of the concurrency, and then optimize the whole if statement
> away, resulting in an infinite loop.

Hmm, I don't see how it can remove the cmpxchg(). Do you have a link to that
discussion, please?

Will

^ permalink raw reply	[flat|nested] 16+ messages in thread
* Re: [PATCH -next v2] locking/osq_lock: annotate a data race in osq_lock
  2020-05-11 16:54         ` Will Deacon
@ 2020-05-11 17:10           ` Qian Cai
  0 siblings, 0 replies; 16+ messages in thread
From: Qian Cai @ 2020-05-11 17:10 UTC (permalink / raw)
  To: Will Deacon
  Cc: Paul E. McKenney, Elver Marco, LKML, Ingo Molnar, Peter Zijlstra (Intel)

> On May 11, 2020, at 12:54 PM, Will Deacon <will@kernel.org> wrote:
> 
> Hmm, I don't see how it can remove the cmpxchg(). Do you have a link to that
> discussion, please?

lore.kernel.org/lkml/20200211124753.GP14914@hirez.programming.kicks-ass.net

Correction: if compilers could prove that "prev->next != node" is always true,
that cmpxchg() would never run. The cpu_relax() should be sufficient to keep
that if statement from being optimized away in any case.

^ permalink raw reply	[flat|nested] 16+ messages in thread
end of thread, other threads:[~2020-05-11 18:07 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-11 13:54 [PATCH -next v2] locking/osq_lock: annotate a data race in osq_lock Qian Cai
2020-05-08 20:59 ` Qian Cai
2020-05-09  4:33   ` Paul E. McKenney
2020-05-09 13:01     ` Qian Cai
2020-05-09 16:12       ` Paul E. McKenney
2020-05-09 16:53         ` Qian Cai
2020-05-09 21:36           ` Paul E. McKenney
2020-05-11 15:58             ` Will Deacon
2020-05-11 16:43               ` Paul E. McKenney
2020-05-11 16:52                 ` Will Deacon
2020-05-11 17:29                   ` Paul E. McKenney
2020-05-11 17:34                     ` Will Deacon
2020-05-11 18:07                       ` Paul E. McKenney
2020-05-11 16:44               ` Qian Cai
2020-05-11 16:54                 ` Will Deacon
2020-05-11 17:10                   ` Qian Cai