From: Alan Stern <stern@rowland.harvard.edu>
To: "Paul E. McKenney" <paulmck@linux.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Andrea Parri <andrea.parri@amarulasolutions.com>,
	LKMM Maintainers -- Akira Yokosawa <akiyks@gmail.com>,
	Boqun Feng <boqun.feng@gmail.com>,
	Daniel Lustig <dlustig@nvidia.com>,
	David Howells <dhowells@redhat.com>,
	Jade Alglave <j.alglave@ucl.ac.uk>,
	Luc Maranget <luc.maranget@inria.fr>,
	Nicholas Piggin <npiggin@gmail.com>,
	Will Deacon <will.deacon@arm.com>,
	Dmitry Vyukov <dvyukov@google.com>,
	<linux-kernel@vger.kernel.org>
Subject: Re: Plain accesses and data races in the Linux Kernel Memory Model
Date: Wed, 16 Jan 2019 10:49:01 -0500 (EST)
Message-ID: <Pine.LNX.4.44L0.1901161012140.1610-100000@iolanthe.rowland.org>
In-Reply-To: <20190116131150.GH1215@linux.ibm.com>

On Wed, 16 Jan 2019, Paul E. McKenney wrote:

> On Wed, Jan 16, 2019 at 12:57:52PM +0100, Peter Zijlstra wrote:
> > On Tue, Jan 15, 2019 at 10:19:10AM -0500, Alan Stern wrote:
> > > On Tue, 15 Jan 2019, Andrea Parri wrote:
> > > 
> > > > Unless I'm mis-reading/-applying this definition, this will flag the
> > > > following test (a variation on your "race.litmus") with "data-race":
> > > > 
> > > > C no-race
> > > > 
> > > > {}
> > > > 
> > > > P0(int *x, spinlock_t *s)
> > > > {
> > > > 	spin_lock(s);
> > > >         WRITE_ONCE(*x, 1);	/* A */
> > > > 	spin_unlock(s);	/* B */
> > > > }
> > > > 
> > > > P1(int *x, spinlock_t *s)
> > > > {
> > > >         int r1;
> > > > 
> > > > 	spin_lock(s); /* C */
> > > >         r1 = *x;	/* D */
> > > > 	spin_unlock(s);
> > > > }
> > > > 
> > > > exists (1:r1=1)
> > > > 
> > > > Broadly speaking, this is due to the fact that the modified "happens-
> > > > before" axiom does not forbid the execution with the (MP-) cycle
> > > > 
> > > > 	A ->po-rel B ->rfe C ->acq-po D ->fre A
> > > > 
> > > > and then to the link "D ->race-from-r A" here defined.
> > > 
> > > Yes, that cycle certainly should be forbidden.  On the other hand, we
> > > don't want to insist that C happens before D, given that D may not
> > > happen at all.
> > > 
> > > This is a real problem.  Can we solve it by adding a modified
> > > "happens-before" which says essentially that _if_ D is preserved _then_
> > > C happens before D?  But then what about cycles involving more than one
> > > possibly preserved access?  Or maybe a relation which says that D
> > > cannot execute before C (so if D executes at all, it has to come after
> > > C)?
> > 
> > The latter; there is a compiler barrier implied at the end of
> > spin_lock() such that anything later (in PO) must indeed be later.
> > 
> > > Now you see why this stuff is so difficult...  At the moment, I don't
> > > know how to fix this.
> 
> In the spirit of cutting the Gordian Knot...
> 
> Given that we are flagging data races, how much do we really lose by
> simply ignoring the possibility of removed accesses?

Well, I thought about these issues overnight.  It turns out Andrea's
test cases expose two problems: an easy one and a hard one.

The easy one is that my definition of hb was too stringent; it required
the accesses involved in the prop relation to be marked, but it should
have allowed any preserved access.  At the same time, it was too
lenient in that the overwrite relation allowed any write as the
right-hand argument, but it should have required the write to be
preserved.  Likewise for the rfe? term in A-cumul.  Those issues have
now been fixed.

The hard problem involves race detection when non-preserved accesses
are present.  (The plain reads in Andrea's examples were non-preserved;
if the examples are changed to make them preserved then the corrected
model will recognize that they do not race.)  The point is that an
access the model classifies as non-preserved can still participate in a
data race, but if it does then the compiler must in fact have kept it!
To put it another way, if the compiler deletes an access then that
access can't race with anything.

Hence, when testing whether a particular execution has a data race
between accesses X and Y, we really should re-determine whether the
execution is allowed under the assumption that X and Y are both
preserved.  If it isn't then X and Y don't race in that execution.

Here's a particularly obscure example to illustrate the point.


C non-race1

{}

P0(int *x, int *y)
{
	int r1;
	int r2;

	r1 = READ_ONCE(*x);
	smp_rmb();
	if (r1 == 1)
		r2 = *y;
	WRITE_ONCE(*y, 1);
}

P1(int *x, int *y)
{
	int r3;

	r3 = READ_ONCE(*y);
	WRITE_ONCE(*x, r3);
}

P2(int *y)
{
	WRITE_ONCE(*y, 2);
}

exists (0:r1=1 /\ 1:r3=1)


The outcome in this test's "exists" clause is allowed, and there's no
synchronization at all between the marked write to y in P2() and the
plain read of y in P0().  Nevertheless, those two accesses do not race,
because the "r2 = *y" read does not actually occur in any of the
allowed executions.
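
As a rough illustration of that re-check, here is a minimal Python
sketch for non-race1.  It is not the herd7 model: the event names, the
two edge lists, and the helper functions are all assumptions made for
this example, and a plain cycle check stands in for the real
happens-before axiom.  The edges that exist only when the plain read is
preserved are the smp_rmb() ordering from the read of x, and the
ordering of the read of y before the later write to y on the same CPU.

```python
# Simplified stand-in for "is this execution allowed?" in non-race1 with
# r1=1 and r3=1.  All names and edges below are illustrative assumptions.

def has_cycle(edges):
    """Return True if the directed graph given as (src, dst) pairs has a cycle."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
        graph.setdefault(dst, [])
    WHITE, GREY, BLACK = 0, 1, 2
    color = {node: WHITE for node in graph}

    def visit(node):
        # Depth-first search; reaching a GREY node means a back edge.
        color[node] = GREY
        for nxt in graph[node]:
            if color[nxt] == GREY:
                return True
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in graph)


# Ordering that holds whether or not the plain read survives.
ALWAYS = [
    ("P0:W_once(y,1)", "P1:R_once(y)"),   # rfe: r3 = 1
    ("P1:R_once(y)", "P1:W_once(x)"),     # data dependency through r3
    ("P1:W_once(x)", "P0:R_once(x)"),     # rfe: r1 = 1
]

# Ordering that exists only if "r2 = *y" is preserved.
IF_PLAIN_READ_PRESERVED = [
    ("P0:R_once(x)", "P0:R_plain(y)"),    # smp_rmb() orders the two reads
    ("P0:R_plain(y)", "P0:W_once(y,1)"),  # read, then overwrite, of y on one CPU
]


def execution_allowed(plain_read_preserved):
    """Allowed means the ordering edges contain no cycle."""
    edges = list(ALWAYS)
    if plain_read_preserved:
        edges += IF_PLAIN_READ_PRESERVED
    return not has_cycle(edges)


def races_with_p2_write():
    """Candidate pair: P0's plain read of y and P2's WRITE_ONCE(*y, 2).

    The pair is unsynchronized, so it races exactly when the execution is
    still allowed with the plain read assumed preserved (the marked write
    is always preserved).
    """
    return execution_allowed(plain_read_preserved=True)
```

Under these assumptions the execution is acyclic only when the plain
read is dropped, so races_with_p2_write() reports no race for this
execution.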

I'm thinking about ways to attack this problem.  One approach is to
ignore non-preserved accesses entirely (they do correspond to dead
code, after all).  But that's not so good, because an access may be
preserved in one execution and non-preserved in another.

Still working on it...

Alan


