LKML Archive on lore.kernel.org
 help / color / Atom feed
From: Marco Elver <elver@google.com>
To: Qian Cai <cai@lca.pw>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Will Deacon <will@kernel.org>, Ingo Molnar <mingo@redhat.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	"paul E. McKenney" <paulmck@kernel.org>,
	kasan-dev <kasan-dev@googlegroups.com>
Subject: Re: [PATCH] locking/osq_lock: fix a data race in osq_wait_next
Date: Tue, 28 Jan 2020 12:46:26 +0100
Message-ID: <CANpmjNNo6yW-y-Af7JgvWi3t==+=02hE4-pFU4OiH8yvbT3Byg@mail.gmail.com> (raw)
In-Reply-To: <E722E6E0-26CB-440F-98D7-D182B57D1F43@lca.pw>

On Tue, 28 Jan 2020 at 04:11, Qian Cai <cai@lca.pw> wrote:
>
> > On Jan 23, 2020, at 4:39 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Wed, Jan 22, 2020 at 06:54:43PM -0500, Qian Cai wrote:
> >> diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
> >> index 1f7734949ac8..832e87966dcf 100644
> >> --- a/kernel/locking/osq_lock.c
> >> +++ b/kernel/locking/osq_lock.c
> >> @@ -75,7 +75,7 @@ osq_wait_next(struct optimistic_spin_queue *lock,
> >>                 * wait for either @lock to point to us, through its Step-B, or
> >>                 * wait for a new @node->next from its Step-C.
> >>                 */
> >> -               if (node->next) {
> >> +               if (READ_ONCE(node->next)) {
> >>                        next = xchg(&node->next, NULL);
> >>                        if (next)
> >>                                break;
> >
> > This could possibly trigger the warning, but is a false positive. The
> > above doesn't fix anything in that even if that load is shattered the
> > code will function correctly -- it checks for any !0 value, any byte
> > composite that is !0 is sufficient.
> >
> > This is in fact something KCSAN compiler infrastructure could deduce.

Not in the general case. As far as I can tell, this if-statement is
purely optional and an optimization to avoid false sharing. This is
specific knowledge about the logic that (without conveying more
details about the logic) the tool couldn't safely deduce. Consider the
case:

T0:
if ( (x = READ_ONCE(ptr)) ) use_ptr_value(*x);

T1:
WRITE_ONCE(ptr, valid_ptr);

Here, unlike the case above, reading ptr without READ_ONCE can clearly
be dangerous.

The false sharing scenario came up before, and maybe it's worth
telling the tool about the logic. In fact, the 'data_race()' macro is
perfectly well suited to do this.

>
> Marco, any thought on improving KCSAN for this to reduce the false
> positives?

Define 'false positive'.

From what I can tell, all 'false positives' that have come up are data
races where the consequences on the behaviour of the code is
inconsequential. In other words, all of them would require
understanding of the intended logic of the code, and understanding if
the worst possible outcome of a data race changes the behaviour of the
code in such a way that we may end up with an erroneously behaving
system.

As I have said before, KCSAN (or any data race detector) by definition
only works at the language level. Any semantic analysis, beyond simple
rules (such as ignore same-value stores) and annotations, is simply
impossible since the tool can't know about the logic that the
programmer intended.

That being said, if there are simple rules (like ignore same-value
stores) or other minimal annotations that can help reduce such 'false
positives', more than happy to add them.

Qian: firstly I suggest you try
CONFIG_KCSAN_REPORT_ONCE_IN_MS=1000000000 as mentioned before so your
system doesn't get spammed, considering you do not use the default
config but want to use all debugging tools at once which seems to
trigger certain data races more than usual.

Secondly, what are your expectations? If you expect the situation to
be perfect tomorrow, you'll be disappointed. This is inherent, given
the problem we face (safe concurrency). Consider the various parts to
this story: concurrent kernel code, the LKMM, people's preferences and
opinions, and KCSAN (which is late to the party). All of them are
still evolving, hopefully together. At least that's my expectation.

What to do about osq_lock here? If people agree that no further
annotations are wanted, and the reasoning above concludes there are no
bugs, we can blacklist the file. That would, however, miss new data
races in future.

Thanks,
-- Marco

  reply index

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-22 16:38 Qian Cai
2020-01-22 16:59 ` Will Deacon
2020-01-22 17:08   ` Qian Cai
2020-01-22 22:38     ` Marco Elver
2020-01-22 23:54       ` Qian Cai
2020-01-23  9:39         ` Peter Zijlstra
2020-01-28  3:11           ` Qian Cai
2020-01-28 11:46             ` Marco Elver [this message]
2020-01-28 12:53               ` Qian Cai
2020-01-28 16:52               ` Peter Zijlstra
2020-01-28 16:56               ` Peter Zijlstra
2020-01-29  0:22                 ` Paul E. McKenney
2020-01-29 15:29                   ` Marco Elver
2020-01-29 18:40                     ` Peter Zijlstra
2020-01-30 13:39                       ` Marco Elver
2020-01-30 13:48                         ` Peter Zijlstra
2020-01-31  3:32                           ` Qian Cai
2020-01-29 18:49                   ` Peter Zijlstra
2020-01-29 19:26                     ` Paul E. McKenney
2020-01-23  9:36       ` Peter Zijlstra
2020-01-28  3:12         ` Qian Cai
2020-01-28  8:18           ` Marco Elver
2020-01-28 10:10             ` Qian Cai
2020-01-28 10:29               ` Marco Elver
2020-01-22 17:09 ` Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CANpmjNNo6yW-y-Af7JgvWi3t==+=02hE4-pFU4OiH8yvbT3Byg@mail.gmail.com' \
    --to=elver@google.com \
    --cc=cai@lca.pw \
    --cc=kasan-dev@googlegroups.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=paulmck@kernel.org \
    --cc=peterz@infradead.org \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git
	git clone --mirror https://lore.kernel.org/lkml/7 lkml/git/7.git
	git clone --mirror https://lore.kernel.org/lkml/8 lkml/git/8.git
	git clone --mirror https://lore.kernel.org/lkml/9 lkml/git/9.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org
	public-inbox-index lkml

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git