From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S964898AbdKBUV7 (ORCPT <rfc822;w@1wt.eu>);
        Thu, 2 Nov 2017 16:21:59 -0400
Received: from iolanthe.rowland.org ([192.131.102.54]:33300 "HELO
        iolanthe.rowland.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with SMTP id S964885AbdKBUV5 (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Thu, 2 Nov 2017 16:21:57 -0400
Date: Thu, 2 Nov 2017 16:21:56 -0400 (EDT)
From: Alan Stern <stern@rowland.harvard.edu>
X-X-Sender: stern@iolanthe.rowland.org
To: Will Deacon <will.deacon@arm.com>
cc: Peter Zijlstra <peterz@infradead.org>,
        "Reshetova, Elena" <elena.reshetova@intel.com>,
        "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
        "gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>,
        "keescook@chromium.org" <keescook@chromium.org>,
        "tglx@linutronix.de" <tglx@linutronix.de>,
        "mingo@redhat.com" <mingo@redhat.com>,
        "ishkamiel@gmail.com" <ishkamiel@gmail.com>,
        Paul McKenney <paulmck@linux.vnet.ibm.com>, <parri.andrea@gmail.com>,
        <boqun.feng@gmail.com>, <dhowells@redhat.com>, <david@fromorbit.com>
Subject: Re: [PATCH] refcount: provide same memory ordering guarantees as in
 atomic_t
In-Reply-To: <20171102171644.GD595@arm.com>
Message-ID: <Pine.LNX.4.44L0.1711021558120.1277-100000@iolanthe.rowland.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 2 Nov 2017, Will Deacon wrote:

> > Right.  To address your point: release + acquire isn't the same as a
> > full barrier either.  The SB pattern illustrates the difference:
> > 
> > 	P0		P1
> > 	Write x=1	Write y=1
> > 	Release a	smp_mb
> > 	Acquire b	Read x=0
> > 	Read y=0
> > 
> > This would not be allowed if the release + acquire sequence was 
> > replaced by smp_mb.  But as it stands, this is allowed because nothing 
> > prevents the CPU from interchanging the order of the release and the 
> > acquire -- and then you're back to the acquire + release case.
> > 
> > However, there is one circumstance where this interchange isn't 
> > allowed: when the release and acquire access the same memory 
> > location.  Thus:
> > 
> > 	P0(int *x, int *y, int *a)
> > 	{
> > 		int r0;
> > 
> > 		WRITE_ONCE(*x, 1);
> > 		smp_store_release(a, 1);
> > 		smp_load_acquire(a);
> > 		r0 = READ_ONCE(*y);
> > 	}
> > 
> > 	P1(int *x, int *y)
> > 	{
> > 		int r1;
> > 
> > 		WRITE_ONCE(*y, 1);
> > 		smp_mb();
> > 		r1 = READ_ONCE(*x);
> > 	}
> > 
> > 	exists (0:r0=0 /\ 1:r1=0)
> > 
> > This is forbidden.  It would remain forbidden even if the smp_mb in P1 
> > were replaced by a similar release/acquire pair for the same memory 
> > location.

I have to apologize; this was totally wrong.  This test is not
forbidden under the LKMM, and it certainly isn't forbidden if the
smp_mb is replaced by a release/acquire pair.

I was trying to think of something completely different.  If you have a
release/acquire to the same address, it creates a happens-before
ordering:

	Access x
	Release a
	Acquire a
	Access y

Here the access to x happens-before the access to y.  This is true
even on x86, even in the presence of forwarding -- the CPU still has to
execute the instructions in order.  But if the release and acquire are
to different addresses:

	Access x
	Release a
	Acquire b
	Access y

then there is no happens-before ordering for x and y -- the CPU can
execute the last two instructions before the first two.  x86 and
PowerPC won't do this, but I believe ARMv8 can.  (Please correct me if
it can't.)

But happens-before is much weaker than a strong fence.  So in short, 
release + acquire, even to the same address, is no replacement for 
smp_mb().

> Isn't this allowed on x86 mapping smp_mb() to mfence, store-release to plain
> store and load-acquire to plain load? All we're saying is that you can forward
> from a release to an acquire, which is fine for RCpc semantics.
> 
> e.g.
> 
> X86 SB+mfence+po-rfi-po
> "MFencedWR Fre PodWW Rfi PodRR Fre"
> Generator=diyone7 (version 7.46+3)
> Prefetch=0:x=F,0:y=T,1:y=F,1:x=T
> Com=Fr Fr
> Orig=MFencedWR Fre PodWW Rfi PodRR Fre
> {
> }
>  P0          | P1          ;
>  MOV [x],$1  | MOV [y],$1  ;
>  MFENCE      | MOV [z],$1  ;
>  MOV EAX,[y] | MOV EAX,[z] ;
>              | MOV EBX,[x] ;
> exists
> (0:EAX=0 /\ 1:EAX=1 /\ 1:EBX=0)
> 
> which herd says is allowed:
> 
> Test SB+mfence+po-rfi-po Allowed
> States 4
> 0:EAX=0; 1:EAX=1; 1:EBX=0;
> 0:EAX=0; 1:EAX=1; 1:EBX=1;
> 0:EAX=1; 1:EAX=1; 1:EBX=0;
> 0:EAX=1; 1:EAX=1; 1:EBX=1;
> Ok
> Witnesses
> Positive: 1 Negative: 3
> Condition exists (0:EAX=0 /\ 1:EAX=1 /\ 1:EBX=0)
> Observation SB+mfence+po-rfi-po Sometimes 1 3
> Time SB+mfence+po-rfi-po 0.00
> Hash=0f983e2d7579e5c04c332f9ac620c31f
> 
> and I can reproduce using litmus to actually run it on my x86 box:
> 
> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> % Results for SB+mfence+po-rfi-po.litmus %
> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> X86 SB+mfence+po-rfi-po
> "MFencedWR Fre PodWW Rfi PodRR Fre"
> 
> {}
> 
>  P0          | P1          ;
>  MOV [x],$1  | MOV [y],$1  ;
>  MFENCE      | MOV [z],$1  ;
>  MOV EAX,[y] | MOV EAX,[z] ;
>              | MOV EBX,[x] ;
> 
> exists (0:EAX=0 /\ 1:EAX=1 /\ 1:EBX=0)
> Generated assembler
> #START _litmus_P1
> 	movl $1,(%r8,%rcx)
> 	movl $1,(%r9,%rcx)
> 	movl (%r9,%rcx),%eax
> 	movl (%rdi,%rcx),%edx
> #START _litmus_P0
> 	movl $1,(%rdx,%rcx)
> 	mfence
> 	movl (%rdi,%rcx),%eax
> 
> Test SB+mfence+po-rfi-po Allowed
> Histogram (4 states)
> 8     *>0:EAX=0; 1:EAX=1; 1:EBX=0;
> 1999851:>0:EAX=1; 1:EAX=1; 1:EBX=0;
> 1999549:>0:EAX=0; 1:EAX=1; 1:EBX=1;
> 592   :>0:EAX=1; 1:EAX=1; 1:EBX=1;
> Ok
> 
> Witnesses
> Positive: 8, Negative: 3999992
> Condition exists (0:EAX=0 /\ 1:EAX=1 /\ 1:EBX=0) is validated
> Hash=0f983e2d7579e5c04c332f9ac620c31f
> Generator=diyone7 (version 7.46+3)
> Com=Fr Fr
> Orig=MFencedWR Fre PodWW Rfi PodRR Fre
> Observation SB+mfence+po-rfi-po Sometimes 8 3999992
> Time SB+mfence+po-rfi-po 0.17

Yes, you are quite correct.  Thanks for pointing out my mistake.

Alan Stern