From: Leonardo Bras <leobras@redhat.com>
To: Boqun Feng <boqun.feng@gmail.com>
Cc: Leonardo Bras <leobras@redhat.com>, Will Deacon <will@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Paul Walmsley <paul.walmsley@sifive.com>,
	Palmer Dabbelt <palmer@dabbelt.com>,
	Albert Ou <aou@eecs.berkeley.edu>, Guo Ren <guoren@kernel.org>,
	Andrea Parri <parri.andrea@gmail.com>,
	Geert Uytterhoeven <geert@linux-m68k.org>,
	Ingo Molnar <mingo@kernel.org>,
	Andrzej Hajda <andrzej.hajda@intel.com>,
	linux-kernel@vger.kernel.org, linux-riscv@lists.infradead.org
Subject: Re: [PATCH v1 1/5] riscv/cmpxchg: Deduplicate xchg() asm functions
Date: Fri,  5 Jan 2024 03:59:44 -0300
Message-ID: <ZZeo3-XzV9GBZuMe@LeoBras>
In-Reply-To: <ZZeRF2GX6sLLxgrM@Boquns-Mac-mini.home>

On Thu, Jan 04, 2024 at 09:18:15PM -0800, Boqun Feng wrote:
> On Fri, Jan 05, 2024 at 01:45:42AM -0300, Leonardo Bras wrote:
> [...]
> > > > According to gcc.gnu.org:
> > > > 
> > > > ---
> > > > "memory" [clobber]:
> > > > 
> > > >     The "memory" clobber tells the compiler that the assembly code 
> > > >     performs memory reads or writes to items other than those listed in 
> > > >     the input and output operands (for example, accessing the memory 
> > > >     pointed to by one of the input parameters). To ensure memory contains 
> > > 
> > > Note here it says "other than those listed in the input and output
> > > operands", and in the above asm block, the memory pointed by "__ptr" is
> > > already marked as read-and-write by the asm block via "+A" (*__ptr), so
> > > the compiler knows the asm block may modify the memory pointed by
> > > "__ptr", therefore in _relaxed() case, "memory" clobber can be avoided.
> > 
> > Thanks for pointing that out! 
> > That helped me improve my understanding on constraints for asm operands :)
> > (I ended up getting even more info from the gcc manual)
> > 
> > So the "+A" constraint means the operand is both read and written, and it's a
> > memory operand whose address is held in a register.
> > 
> > > 
> > > Here is an example showing the difference; consider the following case:
> > > 
> > > 	this_val = *this;
> > > 	that_val = *that;
> > > 	xchg_relaxed(this, 1);
> > > 	reread_this = *this;
> > > 
> > > by the semantics of _relaxed, compilers can optimize the above into
> > > 
> > > 	this_val = *this;
> > > 	xchg_relaxed(this, 1);
> > > 	that_val = *that;
> > > 	reread_this = *this;
> > > 
> > 
> > Seems correct, since there is no barrier().
> > 
> > > but the "memory" clobber in the xchg_relaxed() will provide this.
> > 
> > By 'this' here you mean the barrier? I mean, IIUC "memory" clobber will 
> > avoid the above optimization, right?
> > 
> 
> Right, seems I mis-typed "provide" (I meant "prevent")
> 
> > > Needless to say, the '"+A" (*__ptr)' prevents the compiler from making the
> > > following optimization:
> > > 
> > > 	this_val = *this;
> > > 	that_val = *that;
> > > 	xchg_relaxed(this, 1);
> > > 	reread_this = this_val;
> > > 
> > > since the compiler knows the asm block will read and write *this.
> >  
> > Right, the compiler knows that address will be written by the asm block, and
> > so it reloads the value instead of re-using the old one.
> > 
> 
> Correct.
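
A minimal sketch of the point above, not taken from the patch: an empty asm
statement with a "+m" operand (used here as a portable stand-in for RISC-V's
"+A") declares that *p is read and written, so a later load of *p cannot
reuse a value cached before the asm. The names touch_word() and
reread_after_touch() are hypothetical.

```c
/*
 * Hypothetical helper: emits no instruction, but the "+m" operand tells
 * the compiler that the asm reads and writes *p. There is deliberately
 * no "memory" clobber, so accesses to unrelated memory may still be
 * moved across this statement by the compiler.
 */
static inline void touch_word(int *p)
{
	__asm__ __volatile__("" : "+m" (*p));
}

/* Mirrors the this/that example from the discussion. */
static int reread_after_touch(int *this_ptr, int *that_ptr)
{
	int this_val = *this_ptr;
	int that_val = *that_ptr;	/* may be moved after touch_word() */

	touch_word(this_ptr);

	/* Must be a fresh load: the asm is declared to modify *this_ptr. */
	int reread_this = *this_ptr;

	return this_val + that_val + reread_this;
}
```

Since the asm template is empty, the stored value does not actually change
here; the sketch only shows what the operand constraint promises to the
compiler.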
> 
> > 
> > A question, though:
> 
> Good question ;-)
> 
> > Do we need the "memory" clobber in any other xchg / cmpxchg asm?
> 
> The "memory" clobber is needed for others, see below:
> 
> > I mean, usually the only write to memory will happen in the *__ptr, which 
> > should be safe by "+A".
> > 
> > I understand that since the others are not "relaxed" they will need to
> > have a barrier, but isn't the compiler supposed to recognize the barrier
> > instruction and avoid reordering / optimizing across that
> > instruction?
> > 
> 
> The barrier semantics (ACQUIRE/RELEASE/FULL) are provided by the combined
> effort of both 1) preventing compiler optimization with the "memory" clobber
> and 2) preventing CPU/memory reordering with arch-specific instructions.
>
> In other words, an asm block that contains a hardware barrier instruction
> should always have the "memory" clobber; otherwise, the compiler may
> reorder memory accesses across the asm block and thereby break the
> ordering provided by the hardware instructions.

Oh, I see.
So this means the compiler does not look inside the asm for memory barrier
instructions before reordering loads/stores, right?

Meaning we need a way to signal a compiler barrier (the "memory" clobber),
on top of the hardware barrier instructions.
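
A rough sketch of that combination, under stated assumptions: the fence
mnemonics below are my reading of what an acquire/release fence looks like on
RISC-V and arm64 (on other targets such as x86 the template is left empty
because the hardware ordering is already strong enough for acquire/release),
and all macro and function names are hypothetical. The "memory" clobber is
what stops the compiler; the instruction, if any, is what stops the CPU.

```c
#if defined(__riscv)
#define HW_ACQUIRE_FENCE "fence r, rw\n"
#define HW_RELEASE_FENCE "fence rw, w\n"
#elif defined(__aarch64__)
#define HW_ACQUIRE_FENCE "dmb ishld\n"
#define HW_RELEASE_FENCE "dmb ish\n"
#else
/* Assumption: strongly ordered target, no instruction needed. */
#define HW_ACQUIRE_FENCE ""
#define HW_RELEASE_FENCE ""
#endif

/* Hardware fence (if any) plus compiler barrier via the "memory" clobber. */
#define acquire_fence() __asm__ __volatile__(HW_ACQUIRE_FENCE : : : "memory")
#define release_fence() __asm__ __volatile__(HW_RELEASE_FENCE : : : "memory")

static int payload;
static volatile int ready;

/* Producer: the payload store must not sink below the flag store. */
static void publish(int value)
{
	payload = value;
	release_fence();
	ready = 1;
}

/* Consumer: the payload load must not be hoisted above the flag load. */
static int consume(void)
{
	if (!ready)
		return -1;
	acquire_fence();
	return payload;
}
```

Dropping the "memory" clobber from either macro would leave the fence
instruction in place but allow the compiler to move the payload access to
the other side of it.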

Thanks for helping me improve my understanding of this!
Leo

> 
> Regards,
> Boqun
> 
> > 
> > Thanks!
> > Leo
> > 
> > > Regards,
> > > Boqun
> > > 
> > > >     correct values, GCC may need to flush specific register values to 
> > > >     memory before executing the asm. Further, the compiler does not assume 
> > > >     that any values read from memory before an asm remain unchanged after 
> > > >     that asm ; it reloads them as needed. Using the "memory" clobber 
> > > >     effectively forms a read/write memory barrier for the compiler.
> > > > 
> > > >     Note that this clobber does not prevent the processor from doing 
> > > >     speculative reads past the asm statement. To prevent that, you need 
> > > >     processor-specific fence instructions.
> > > > ---
> > > > 
> > > > IIUC the above text says that having memory accesses to *__ptr would
> > > > require the above asm to have the "memory" clobber, so memory accesses
> > > > don't get reordered by the compiler.
> > > >
> > > > By that reasoning, all asm in this file should have the "memory"
> > > > clobber, since all atomic operations will change the memory pointed to
> > > > by an input ptr. Is that correct?
> > > > 
> > > > Thanks!
> > > > Leo
> > > > 
> > > > 
> > > > > 
> > > > > Regards,
> > > > > Boqun
> > > > > 
> > > > > > -		break;							\
> > > > > > -	case 8:								\
> > > > > > -		__asm__ __volatile__ (					\
> > > > > > -			"	amoswap.d %0, %2, %1\n"			\
> > > > > > -			: "=r" (__ret), "+A" (*__ptr)			\
> > > > > > -			: "r" (__new)					\
> > > > > > -			: "memory");					\
> > > > > > -		break;							\
> > > > > > -	default:							\
> > > > > > -		BUILD_BUG();						\
> > > > > > -	}								\
> > > > > > -	__ret;								\
> > > > > > -})
> > > > > > -
> > > > > > -#define arch_xchg_relaxed(ptr, x)					\
> > > > > > -({									\
> > > > > > -	__typeof__(*(ptr)) _x_ = (x);					\
> > > > > > -	(__typeof__(*(ptr))) __xchg_relaxed((ptr),			\
> > > > > > -					    _x_, sizeof(*(ptr)));	\
> > > > > > +	__asm__ __volatile__ (						\
> > > > > > +		prepend							\
> > > > > > +		"	amoswap" sfx " %0, %2, %1\n"			\
> > > > > > +		append							\
> > > > > > +		: "=r" (r), "+A" (*(p))					\
> > > > > > +		: "r" (n)						\
> > > > > > +		: "memory");						\
> > > > > >  })
> > > > > >  
> > > > > > -#define __xchg_acquire(ptr, new, size)					\
> > > > > > +#define _arch_xchg(ptr, new, sfx, prepend, append)			\
> > > > > >  ({									\
> > > > > >  	__typeof__(ptr) __ptr = (ptr);					\
> > > > > > -	__typeof__(new) __new = (new);					\
> > > > > > -	__typeof__(*(ptr)) __ret;					\
> > > > > > -	switch (size) {							\
> > > > > > +	__typeof__(*(__ptr)) __new = (new);				\
> > > > > > +	__typeof__(*(__ptr)) __ret;					\
> > > > > > +	switch (sizeof(*__ptr)) {					\
> > > > > >  	case 4:								\
> > > > > > -		__asm__ __volatile__ (					\
> > > > > > -			"	amoswap.w %0, %2, %1\n"			\
> > > > > > -			RISCV_ACQUIRE_BARRIER				\
> > > > > > -			: "=r" (__ret), "+A" (*__ptr)			\
> > > > > > -			: "r" (__new)					\
> > > > > > -			: "memory");					\
> > > > > > +		__arch_xchg(".w" sfx, prepend, append,			\
> > > > > > +			      __ret, __ptr, __new);			\
> > > > > >  		break;							\
> > > > > >  	case 8:								\
> > > > > > -		__asm__ __volatile__ (					\
> > > > > > -			"	amoswap.d %0, %2, %1\n"			\
> > > > > > -			RISCV_ACQUIRE_BARRIER				\
> > > > > > -			: "=r" (__ret), "+A" (*__ptr)			\
> > > > > > -			: "r" (__new)					\
> > > > > > -			: "memory");					\
> > > > > > +		__arch_xchg(".d" sfx, prepend, append,			\
> > > > > > +			      __ret, __ptr, __new);			\
> > > > > >  		break;							\
> > > > > >  	default:							\
> > > > > >  		BUILD_BUG();						\
> > > > > >  	}								\
> > > > > > -	__ret;								\
> > > > > > +	(__typeof__(*(__ptr)))__ret;					\
> > > > > >  })
> > > > > >  
> > > > > > -#define arch_xchg_acquire(ptr, x)					\
> > > > > > -({									\
> > > > > > -	__typeof__(*(ptr)) _x_ = (x);					\
> > > > > > -	(__typeof__(*(ptr))) __xchg_acquire((ptr),			\
> > > > > > -					    _x_, sizeof(*(ptr)));	\
> > > > > > -})
> > > > > > +#define arch_xchg_relaxed(ptr, x)					\
> > > > > > +	_arch_xchg(ptr, x, "", "", "")
> > > > > >  
> > > > > > -#define __xchg_release(ptr, new, size)					\
> > > > > > -({									\
> > > > > > -	__typeof__(ptr) __ptr = (ptr);					\
> > > > > > -	__typeof__(new) __new = (new);					\
> > > > > > -	__typeof__(*(ptr)) __ret;					\
> > > > > > -	switch (size) {							\
> > > > > > -	case 4:								\
> > > > > > -		__asm__ __volatile__ (					\
> > > > > > -			RISCV_RELEASE_BARRIER				\
> > > > > > -			"	amoswap.w %0, %2, %1\n"			\
> > > > > > -			: "=r" (__ret), "+A" (*__ptr)			\
> > > > > > -			: "r" (__new)					\
> > > > > > -			: "memory");					\
> > > > > > -		break;							\
> > > > > > -	case 8:								\
> > > > > > -		__asm__ __volatile__ (					\
> > > > > > -			RISCV_RELEASE_BARRIER				\
> > > > > > -			"	amoswap.d %0, %2, %1\n"			\
> > > > > > -			: "=r" (__ret), "+A" (*__ptr)			\
> > > > > > -			: "r" (__new)					\
> > > > > > -			: "memory");					\
> > > > > > -		break;							\
> > > > > > -	default:							\
> > > > > > -		BUILD_BUG();						\
> > > > > > -	}								\
> > > > > > -	__ret;								\
> > > > > > -})
> > > > > > +#define arch_xchg_acquire(ptr, x)					\
> > > > > > +	_arch_xchg(ptr, x, "", "", RISCV_ACQUIRE_BARRIER)
> > > > > >  
> > > > > >  #define arch_xchg_release(ptr, x)					\
> > > > > > -({									\
> > > > > > -	__typeof__(*(ptr)) _x_ = (x);					\
> > > > > > -	(__typeof__(*(ptr))) __xchg_release((ptr),			\
> > > > > > -					    _x_, sizeof(*(ptr)));	\
> > > > > > -})
> > > > > > -
> > > > > > -#define __arch_xchg(ptr, new, size)					\
> > > > > > -({									\
> > > > > > -	__typeof__(ptr) __ptr = (ptr);					\
> > > > > > -	__typeof__(new) __new = (new);					\
> > > > > > -	__typeof__(*(ptr)) __ret;					\
> > > > > > -	switch (size) {							\
> > > > > > -	case 4:								\
> > > > > > -		__asm__ __volatile__ (					\
> > > > > > -			"	amoswap.w.aqrl %0, %2, %1\n"		\
> > > > > > -			: "=r" (__ret), "+A" (*__ptr)			\
> > > > > > -			: "r" (__new)					\
> > > > > > -			: "memory");					\
> > > > > > -		break;							\
> > > > > > -	case 8:								\
> > > > > > -		__asm__ __volatile__ (					\
> > > > > > -			"	amoswap.d.aqrl %0, %2, %1\n"		\
> > > > > > -			: "=r" (__ret), "+A" (*__ptr)			\
> > > > > > -			: "r" (__new)					\
> > > > > > -			: "memory");					\
> > > > > > -		break;							\
> > > > > > -	default:							\
> > > > > > -		BUILD_BUG();						\
> > > > > > -	}								\
> > > > > > -	__ret;								\
> > > > > > -})
> > > > > > +	_arch_xchg(ptr, x, "", RISCV_RELEASE_BARRIER, "")
> > > > > >  
> > > > > >  #define arch_xchg(ptr, x)						\
> > > > > > -({									\
> > > > > > -	__typeof__(*(ptr)) _x_ = (x);					\
> > > > > > -	(__typeof__(*(ptr))) __arch_xchg((ptr), _x_, sizeof(*(ptr)));	\
> > > > > > -})
> > > > > > +	_arch_xchg(ptr, x, ".aqrl", "", "")
> > > > > >  
> > > > > >  #define xchg32(ptr, x)							\
> > > > > >  ({									\
> > > > > > -- 
> > > > > > 2.43.0
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
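
For readers following the patch, here is a small sketch of how the
deduplicated macro assembles its asm template purely through string-literal
concatenation of sfx, prepend and append. XCHG_TEMPLATE and the SKETCH_*
barrier strings are hypothetical names with assumed expansions; the real
barrier macros are defined elsewhere in the RISC-V headers.

```c
/* Assumed expansions, for illustration only. */
#define SKETCH_ACQUIRE_BARRIER "\tfence r, rw\n"
#define SKETCH_RELEASE_BARRIER "\tfence rw, w\n"

/* Same concatenation order as in the patch: prepend, instruction, append. */
#define XCHG_TEMPLATE(sfx, prepend, append) \
	prepend "\tamoswap" sfx " %0, %2, %1\n" append

/* The size suffix (".w"/".d") is glued in front of the ordering suffix. */
static const char xchg_relaxed_w[] =
	XCHG_TEMPLATE(".w" "", "", "");
static const char xchg_acquire_d[] =
	XCHG_TEMPLATE(".d" "", "", SKETCH_ACQUIRE_BARRIER);
static const char xchg_release_w[] =
	XCHG_TEMPLATE(".w" "", SKETCH_RELEASE_BARRIER, "");
static const char xchg_full_d[] =
	XCHG_TEMPLATE(".d" ".aqrl", "", "");
```

Each array holds the text the assembler would see for that variant, which
makes it easy to check that the four variants differ only in suffix and
fence placement.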


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv
