* Interrupts, smp_load_acquire(), smp_store_release(), etc.
From: Paul E. McKenney @ 2018-10-20 16:10 UTC (permalink / raw)
  To: linux-kernel, linux-arch
  Cc: davidtgoldblatt, stern, andrea.parri, will.deacon, peterz,
	boqun.feng, npiggin, dhowells, j.alglave, luc.maranget, akiyks,
	dlustig

Hello!

David Goldblatt (CCed) came up with an interesting pair of C++ litmus
tests involving POSIX signals that have Linux-kernel counterparts
involving interrupts.  These litmus tests can (in paranoid theory, anyway)
produce counter-intuitive results on architectures that use explicit
fences to enforce ordering as part of a larger primitive, which in the
specific case of smp_store_release() includes all architectures other than
arm64, ia64, s390, SPARC, x86, and of course any UP-only architecture.

David's first litmus test made use of the C11 sequentially consistent
store, which in the Linux kernel would require two separate statements
anyway (a WRITE_ONCE() either preceded or followed by smp_mb()), so
the outcome that is counter-intuitive in C11 should be expected in the
Linux kernel.  (Yes, there are similar but more complicated examples that
would have more interesting outcomes in the Linux kernel, but let's keep
it simple for the moment.)

The second (informal) litmus test has a more interesting Linux-kernel
counterpart:

	void t1_interrupt(void)
	{
		r0 = READ_ONCE(y);
		smp_store_release(&x, 1);
	}

	void t1(void)
	{
		smp_store_release(&y, 1);
	}

	void t2(void)
	{
		r1 = smp_load_acquire(&x);
		r2 = smp_load_acquire(&y);
	}

On store-reordering architectures that implement smp_store_release()
as a memory-barrier instruction followed by a store, the interrupt could
arrive betweentimes in t1(), so that there would be no ordering between
t1_interrupt()'s store to x and t1()'s store to y.  This could (again,
in paranoid theory) result in the outcome r0==0 && r1==0 && r2==1.
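
To make the window concrete, here is a minimal userspace sketch, assuming
C11 atomics as stand-ins for the kernel primitives.  store_release_split()
and its irq callback are hypothetical names, not kernel APIs; the callback
only marks the point between the barrier and the store where an interrupt
could arrive.  Being single-threaded, the sketch shows program order on
t1()'s CPU and nothing more; it cannot demonstrate hardware reordering.

```c
#include <stdatomic.h>
#include <stddef.h>

static atomic_int x, y;
static int r0;

/*
 * Hypothetical stand-in for a fence-based smp_store_release():
 * a barrier followed by a plain store.  The irq callback simulates
 * an interrupt arriving between the two.
 */
static void store_release_split(atomic_int *p, int v, void (*irq)(void))
{
	atomic_thread_fence(memory_order_release);	/* barrier instruction */
	if (irq)
		irq();					/* interrupt arrives here */
	atomic_store_explicit(p, v, memory_order_relaxed);	/* the store */
}

static void t1_interrupt(void)
{
	/* Stand-in for READ_ONCE(y). */
	r0 = atomic_load_explicit(&y, memory_order_relaxed);
	store_release_split(&x, 1, NULL);
}

static void t1(void)
{
	/* The simulated interrupt lands after the fence, before the store to y. */
	store_release_split(&y, 1, t1_interrupt);
}
```

Calling t1() leaves r0 == 0 and runs, in program order, t1()'s fence, the
handler's load of y, the handler's fence, the store to x, and finally the
store to y.  No fence separates the last two stores, which is the gap
described above.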

In practice, we analyzed exception paths in the sys_membarrier() review,
and ended up with this function:

static void ipi_mb(void *info)
{
	smp_mb();	/* IPIs should be serializing but paranoid. */
}

So how paranoid should we be with respect to interrupt handlers for
smp_store_release(), smp_load_acquire(), and the various RMW atomic
operations that are sometimes implemented with separate memory-barrier
instructions?  ;-)

							Thanx, Paul



* Re: Interrupts, smp_load_acquire(), smp_store_release(), etc.
From: Alan Stern @ 2018-10-20 20:18 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, linux-arch, davidtgoldblatt, andrea.parri,
	will.deacon, peterz, boqun.feng, npiggin, dhowells, j.alglave,
	luc.maranget, akiyks, dlustig

On Sat, 20 Oct 2018, Paul E. McKenney wrote:

> The second (informal) litmus test has a more interesting Linux-kernel
> counterpart:
> 
> 	void t1_interrupt(void)
> 	{
> 		r0 = READ_ONCE(y);
> 		smp_store_release(&x, 1);
> 	}
> 
> 	void t1(void)
> 	{
> 		smp_store_release(&y, 1);
> 	}
> 
> 	void t2(void)
> 	{
> 		r1 = smp_load_acquire(&x);
> 		r2 = smp_load_acquire(&y);
> 	}
> 
> On store-reordering architectures that implement smp_store_release()
> as a memory-barrier instruction followed by a store, the interrupt could
> arrive betweentimes in t1(), so that there would be no ordering between
> t1_interrupt()'s store to x and t1()'s store to y.  This could (again,
> in paranoid theory) result in the outcome r0==0 && r1==0 && r2==1.

This is disconcerting only if we assume that t1_interrupt() has to be
executed by the same CPU as t1().  If the interrupt could be fielded by
a different CPU then the paranoid outcome is perfectly understandable,
even in an SC context.

So the question really should be limited to situations where a handler 
is forced to execute in the context of a particular thread.  While 
POSIX does allow such restrictions for user programs, I'm not aware of 
any similar mechanism in the kernel.

Alan



* Re: Interrupts, smp_load_acquire(), smp_store_release(), etc.
From: Andrea Parri @ 2018-10-20 20:22 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, linux-arch, davidtgoldblatt, stern, will.deacon,
	peterz, boqun.feng, npiggin, dhowells, j.alglave, luc.maranget,
	akiyks, dlustig

[...]

> The second (informal) litmus test has a more interesting Linux-kernel
> counterpart:
> 
> 	void t1_interrupt(void)
> 	{
> 		r0 = READ_ONCE(y);
> 		smp_store_release(&x, 1);
> 	}
> 
> 	void t1(void)
> 	{
> 		smp_store_release(&y, 1);
> 	}
> 
> 	void t2(void)
> 	{
> 		r1 = smp_load_acquire(&x);
> 		r2 = smp_load_acquire(&y);
> 	}
> 
> On store-reordering architectures that implement smp_store_release()
> as a memory-barrier instruction followed by a store, the interrupt could
> arrive betweentimes in t1(), so that there would be no ordering between
> t1_interrupt()'s store to x and t1()'s store to y.  This could (again,
> in paranoid theory) result in the outcome r0==0 && r1==0 && r2==1.

FWIW, I'd rather call "paranoid" the act of excluding such outcome ;-)
but I admit that I've only run this test in *my mind*: in an SC world,

  CPU1				CPU2

  t1()
    t1_interrupt()
      r0 = READ_ONCE(y); // =0
				t2()
				  r1 = smp_load_acquire(&x); // =0
      smp_store_release(&x, 1);
    smp_store_release(&y, 1);
				  r2 = smp_load_acquire(&y); // =1
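
The trace above can be checked mechanically.  The following is a minimal
sketch, not an LKMM litmus test: it assumes sequential consistency, models
the interrupt as extra operations in CPU1's program order, and enumerates
every interleaving with CPU2.  The names sc_outcomes(), irq_first, and the
op enumerators are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

enum op { RD_Y_R0, WR_X, WR_Y, RD_X_R1, RD_Y_R2 };

/*
 * Return a bitmask of outcomes reachable under SC.  Bit number
 * ((r0 << 2) | (r1 << 1) | r2) is set when that outcome can occur.
 * irq_first != 0: the interrupt runs before t1()'s store to y.
 * irq_first == 0: the interrupt runs after t1()'s store to y.
 */
static uint8_t sc_outcomes(int irq_first)
{
	static const enum op cpu1_irq_first[3] = { RD_Y_R0, WR_X, WR_Y };
	static const enum op cpu1_irq_last[3] = { WR_Y, RD_Y_R0, WR_X };
	static const enum op cpu2[2] = { RD_X_R1, RD_Y_R2 };
	const enum op *cpu1 = irq_first ? cpu1_irq_first : cpu1_irq_last;
	uint8_t seen = 0;

	/* Slots a < b of the five-step schedule belong to CPU2. */
	for (int a = 0; a < 5; a++) {
		for (int b = a + 1; b < 5; b++) {
			int x = 0, y = 0, r0 = 0, r1 = 0, r2 = 0;
			int i1 = 0, i2 = 0;

			for (int slot = 0; slot < 5; slot++) {
				enum op o = (slot == a || slot == b) ?
					    cpu2[i2++] : cpu1[i1++];

				switch (o) {
				case RD_Y_R0: r0 = y; break;
				case WR_X:    x = 1;  break;
				case WR_Y:    y = 1;  break;
				case RD_X_R1: r1 = x; break;
				case RD_Y_R2: r2 = y; break;
				}
			}
			seen |= (uint8_t)(1u << ((r0 << 2) | (r1 << 1) | r2));
		}
	}
	return seen;
}
```

Under these assumptions, sc_outcomes(1) has bit 1 set, which corresponds to
r0==0 && r1==0 && r2==1 and matches the trace above.  With the interrupt
after the store (irq_first == 0), the enumeration never produces
r0==1 && r1==1 && r2==0.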


> So how paranoid should we be with respect to interrupt handlers for
> smp_store_release(), smp_load_acquire(), and the various RMW atomic
> operations that are sometimes implemented with separate memory-barrier
> instructions?  ;-)

Good question! ;-)

  Andrea


> 
> 							Thanx, Paul
> 


* Re: Interrupts, smp_load_acquire(), smp_store_release(), etc.
From: Paul E. McKenney @ 2018-10-20 21:04 UTC (permalink / raw)
  To: Alan Stern
  Cc: linux-kernel, linux-arch, davidtgoldblatt, andrea.parri,
	will.deacon, peterz, boqun.feng, npiggin, dhowells, j.alglave,
	luc.maranget, akiyks, dlustig

On Sat, Oct 20, 2018 at 04:18:37PM -0400, Alan Stern wrote:
> On Sat, 20 Oct 2018, Paul E. McKenney wrote:
> 
> > The second (informal) litmus test has a more interesting Linux-kernel
> > counterpart:
> > 
> > 	void t1_interrupt(void)
> > 	{
> > 		r0 = READ_ONCE(y);
> > 		smp_store_release(&x, 1);
> > 	}
> > 
> > 	void t1(void)
> > 	{
> > 		smp_store_release(&y, 1);
> > 	}
> > 
> > 	void t2(void)
> > 	{
> > 		r1 = smp_load_acquire(&x);
> > 		r2 = smp_load_acquire(&y);
> > 	}
> > 
> > On store-reordering architectures that implement smp_store_release()
> > as a memory-barrier instruction followed by a store, the interrupt could
> > arrive betweentimes in t1(), so that there would be no ordering between
> > t1_interrupt()'s store to x and t1()'s store to y.  This could (again,
> > in paranoid theory) result in the outcome r0==0 && r1==0 && r2==1.
> 
> This is disconcerting only if we assume that t1_interrupt() has to be
> executed by the same CPU as t1().  If the interrupt could be fielded by
> a different CPU then the paranoid outcome is perfectly understandable,
> even in an SC context.
> 
> So the question really should be limited to situations where a handler 
> is forced to execute in the context of a particular thread.  While 
> POSIX does allow such restrictions for user programs, I'm not aware of 
> any similar mechanism in the kernel.

Good point, and I was in fact assuming that t1() and t1_interrupt()
were executing on the same CPU.

This sort of thing happens naturally in the kernel when both t1()
and t1_interrupt() are accessing per-CPU variables.

							Thanx, Paul



* Re: Interrupts, smp_load_acquire(), smp_store_release(), etc.
From: Paul E. McKenney @ 2018-10-20 21:06 UTC (permalink / raw)
  To: Andrea Parri
  Cc: linux-kernel, linux-arch, davidtgoldblatt, stern, will.deacon,
	peterz, boqun.feng, npiggin, dhowells, j.alglave, luc.maranget,
	akiyks, dlustig

On Sat, Oct 20, 2018 at 10:22:29PM +0200, Andrea Parri wrote:
> [...]
> 
> > The second (informal) litmus test has a more interesting Linux-kernel
> > counterpart:
> > 
> > 	void t1_interrupt(void)
> > 	{
> > 		r0 = READ_ONCE(y);
> > 		smp_store_release(&x, 1);
> > 	}
> > 
> > 	void t1(void)
> > 	{
> > 		smp_store_release(&y, 1);
> > 	}
> > 
> > 	void t2(void)
> > 	{
> > 		r1 = smp_load_acquire(&x);
> > 		r2 = smp_load_acquire(&y);
> > 	}
> > 
> > On store-reordering architectures that implement smp_store_release()
> > as a memory-barrier instruction followed by a store, the interrupt could
> > arrive betweentimes in t1(), so that there would be no ordering between
> > t1_interrupt()'s store to x and t1()'s store to y.  This could (again,
> > in paranoid theory) result in the outcome r0==0 && r1==0 && r2==1.
> 
> FWIW, I'd rather call "paranoid" the act of excluding such outcome ;-)
> but I admit that I've only run this test in *my mind*: in an SC world,
> 
>   CPU1				CPU2
> 
>   t1()
>     t1_interrupt()
>       r0 = READ_ONCE(y); // =0
> 				t2()
> 				  r1 = smp_load_acquire(&x); // =0
>       smp_store_release(&x, 1);
>     smp_store_release(&y, 1);
> 				  r2 = smp_load_acquire(&y); // =1

OK, so did I get the outcome messed up again?  :-/

							Thanx, Paul

> > So how paranoid should we be with respect to interrupt handlers for
> > smp_store_release(), smp_load_acquire(), and the various RMW atomic
> > operations that are sometimes implemented with separate memory-barrier
> > instructions?  ;-)
> 
> Good question! ;-)
> 
>   Andrea
> 
> 
> > 
> > 							Thanx, Paul
> > 
> 



* Re: Interrupts, smp_load_acquire(), smp_store_release(), etc.
From: Alan Stern @ 2018-10-21 14:52 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Andrea Parri, linux-kernel, linux-arch, davidtgoldblatt,
	will.deacon, peterz, boqun.feng, npiggin, dhowells, j.alglave,
	luc.maranget, akiyks, dlustig

On Sat, 20 Oct 2018, Paul E. McKenney wrote:

> On Sat, Oct 20, 2018 at 10:22:29PM +0200, Andrea Parri wrote:
> > [...]
> > 
> > > The second (informal) litmus test has a more interesting Linux-kernel
> > > counterpart:
> > > 
> > > 	void t1_interrupt(void)
> > > 	{
> > > 		r0 = READ_ONCE(y);
> > > 		smp_store_release(&x, 1);
> > > 	}
> > > 
> > > 	void t1(void)
> > > 	{
> > > 		smp_store_release(&y, 1);
> > > 	}
> > > 
> > > 	void t2(void)
> > > 	{
> > > 		r1 = smp_load_acquire(&x);
> > > 		r2 = smp_load_acquire(&y);
> > > 	}
> > > 
> > > On store-reordering architectures that implement smp_store_release()
> > > as a memory-barrier instruction followed by a store, the interrupt could
> > > arrive betweentimes in t1(), so that there would be no ordering between
> > > t1_interrupt()'s store to x and t1()'s store to y.  This could (again,
> > > in paranoid theory) result in the outcome r0==0 && r1==0 && r2==1.
> > 
> > FWIW, I'd rather call "paranoid" the act of excluding such outcome ;-)
> > but I admit that I've only run this test in *my mind*: in an SC world,
> > 
> >   CPU1				CPU2
> > 
> >   t1()
> >     t1_interrupt()
> >       r0 = READ_ONCE(y); // =0
> > 				t2()
> > 				  r1 = smp_load_acquire(&x); // =0
> >       smp_store_release(&x, 1);
> >     smp_store_release(&y, 1);
> > 				  r2 = smp_load_acquire(&y); // =1
> 
> OK, so did I get the outcome messed up again?  :-/

Did you mean to say r0==1?  If so, the litmus test would be a little
clearer if you wrote t1() above t1_interrupt().  That would help to
cement the WRC pattern in the reader's mind.

In any case, perhaps this indicates the kernel should ensure that a
full memory barrier is executed when an interrupt occurs.  (Of course, 
the hardware may already do this for us, depending on the 
architecture.)

Alan



* Re: Interrupts, smp_load_acquire(), smp_store_release(), etc.
From: Eric W. Biederman @ 2018-10-22 17:30 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Alan Stern, linux-kernel, linux-arch, davidtgoldblatt,
	andrea.parri, will.deacon, peterz, boqun.feng, npiggin, dhowells,
	j.alglave, luc.maranget, akiyks, dlustig

"Paul E. McKenney" <paulmck@linux.ibm.com> writes:

> On Sat, Oct 20, 2018 at 04:18:37PM -0400, Alan Stern wrote:
>> On Sat, 20 Oct 2018, Paul E. McKenney wrote:
>> 
>> > The second (informal) litmus test has a more interesting Linux-kernel
>> > counterpart:
>> > 
>> > 	void t1_interrupt(void)
>> > 	{
>> > 		r0 = READ_ONCE(y);
>> > 		smp_store_release(&x, 1);
>> > 	}
>> > 
>> > 	void t1(void)
>> > 	{
>> > 		smp_store_release(&y, 1);
>> > 	}
>> > 
>> > 	void t2(void)
>> > 	{
>> > 		r1 = smp_load_acquire(&x);
>> > 		r2 = smp_load_acquire(&y);
>> > 	}
>> > 
>> > On store-reordering architectures that implement smp_store_release()
>> > as a memory-barrier instruction followed by a store, the interrupt could
>> > arrive betweentimes in t1(), so that there would be no ordering between
>> > t1_interrupt()'s store to x and t1()'s store to y.  This could (again,
>> > in paranoid theory) result in the outcome r0==0 && r1==0 && r2==1.
>> 
>> This is disconcerting only if we assume that t1_interrupt() has to be
>> executed by the same CPU as t1().  If the interrupt could be fielded by
>> a different CPU then the paranoid outcome is perfectly understandable,
>> even in an SC context.
>> 
>> So the question really should be limited to situations where a handler 
>> is forced to execute in the context of a particular thread.  While 
>> POSIX does allow such restrictions for user programs, I'm not aware of 
>> any similar mechanism in the kernel.

> Good point, and I was in fact assuming that t1() and t1_interrupt()
> were executing on the same CPU.
>
> This sort of thing happens naturally in the kernel when both t1()
> and t1_interrupt() are accessing per-CPU variables.

Interrupts have a cpumask of the cpus they may be delivered on.

I believe networking does in fact have places where percpu actions
happen as well as interrupts pinned to a single cpu.  And yes I agree
percpu variables mean that you do not need to pin an interrupt to a
single cpu to cause this to happen.

Eric
