* [PATCH 0/3] MIPS: SMP memory barriers: lightweight sync, acquire-release @ 2015-06-02 0:09 Leonid Yegoshin
  2015-06-02 0:09 ` [PATCH 1/3] MIPS: R6: Use lightweight SYNC instruction in smp_* memory barriers Leonid Yegoshin
  ` (3 more replies)
  0 siblings, 4 replies; 24+ messages in thread

From: Leonid Yegoshin @ 2015-06-02 0:09 UTC (permalink / raw)
To: linux-mips, benh, will.deacon, linux-kernel, ralf, markos.chandras, macro, Steven.Hill, alexander.h.duyck, davem

The following series implements lightweight SYNC memory barriers for SMP Linux and correct use of SYNCs around the LL-SC loops of atomics, futexes, spinlocks etc. - the basic building blocks of any atomics on MIPS.

Historically, generic MIPS has not used memory barriers around the LL-SC loops in atomics, spinlocks etc. However, the architecture documents never specify that an LL-SC loop creates a memory barrier, and some non-generic MIPS vendors have already felt the pain and enforce one. With the introduction of aggressive speculative memory reads in recent out-of-order superscalar MIPS processors, this is now a real problem.

The generic MIPS memory barrier instruction SYNC (aka SYNC 0) is very heavyweight because it was designed to propagate the barrier down to memory. MIPS R2 introduced lightweight SYNC instructions which correspond to the smp_*() set of SMP barriers. Their description was very HW-specific and they were never used; however, they cause much less trouble for processor pipelines and can be used as-is in smp_mb()/smp_rmb()/smp_wmb(), as well as for acquire/release barrier semantics. After prolonged discussions with the HW team it became clear that the lightweight SYNCs were designed specifically with smp_*() in mind, but their description is written in terms of timeline ordering.

The problem was spotted recently in engineering tests, and it was confirmed that without a memory barrier, loads and stores may pass LL/SC instructions in both directions, even on old MIPS R2 processors.
Aggressive speculation in MIPS R6 and MIPS I5600 processors adds more fire to this issue.

The three patches introduce a configurable control for lightweight SYNCs around LL/SC loops. For MIPS32 R2 it is left configurable whether to enforce SYNCs or keep things as-is, because some old MIPS32 R2 cores may be happy without those SYNCs. For MIPS R6 I chose to make SYNC around LL/SC mandatory, because all of those processors have aggressive speculation and delayed write buffers. For that processor series it is still possible to select SYNC 0 instead of lightweight SYNCs in the configuration - just in case of some trouble in the implementation of a specific CPU. However, it is considered safe for a specific core not to implement some or all lightweight SYNCs, because the architecture requires hardware to map unimplemented SYNC types to SYNC 0.

---

Leonid Yegoshin (3):
      MIPS: R6: Use lightweight SYNC instruction in smp_* memory barriers
      MIPS: enforce LL-SC loop enclosing with SYNC (ACQUIRE and RELEASE)
      MIPS: bugfix - replace smp_mb with release barrier function in unlocks

 arch/mips/Kconfig                |   47 ++++++++++++++++++++++++++++++++++++++
 arch/mips/include/asm/barrier.h  |   32 +++++++++++++++++++++++---
 arch/mips/include/asm/bitops.h   |    2 +-
 arch/mips/include/asm/spinlock.h |    2 +-
 4 files changed, 77 insertions(+), 6 deletions(-)

-- Signature

^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH 1/3] MIPS: R6: Use lightweight SYNC instruction in smp_* memory barriers 2015-06-02 0:09 [PATCH 0/3] MIPS: SMP memory barriers: lightweight sync, acquire-release Leonid Yegoshin @ 2015-06-02 0:09 ` Leonid Yegoshin 2015-06-02 10:08 ` Paul Burton ` (2 more replies) 2015-06-02 0:09 ` [PATCH 2/3] MIPS: enforce LL-SC loop enclosing with SYNC (ACQUIRE and RELEASE) Leonid Yegoshin ` (2 subsequent siblings) 3 siblings, 3 replies; 24+ messages in thread From: Leonid Yegoshin @ 2015-06-02 0:09 UTC (permalink / raw) To: linux-mips, benh, will.deacon, linux-kernel, ralf, markos.chandras, macro, Steven.Hill, alexander.h.duyck, davem This instructions were specifically designed to work for smp_*() sort of memory barriers in MIPS R2/R3/R5 and R6. Unfortunately, it's description is very cryptic and is done in HW engineering style which prevents use of it by SW. This instructions are not mandatory but there is a mandatory requirement - if not implemented, then a regular MIPS SYNC 0 must be used instead. The reason for this change is - SYNC 0 is a heavvy-weighted in many CPUs, it may disrupt other cores pipelines etc. Due to concern about verification of old MIPS R2 compatible cores of other vendors it is enforced only for MIPS R6 and other MIPS32/64 R2/R5 processors have it configurable. 
Signed-off-by: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com> --- arch/mips/Kconfig | 22 ++++++++++++++++++++++ arch/mips/include/asm/barrier.h | 6 ++++++ 2 files changed, 28 insertions(+) diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig index be384d6a58bb..c7d0cacece3d 100644 --- a/arch/mips/Kconfig +++ b/arch/mips/Kconfig @@ -1347,6 +1347,7 @@ config CPU_MIPS32_R2 select CPU_SUPPORTS_32BIT_KERNEL select CPU_SUPPORTS_HIGHMEM select CPU_SUPPORTS_MSA + select CPU_SUPPORTS_MIPS_LIGHTWEIGHT_SYNC select HAVE_KVM help Choose this option to build a kernel for release 2 or later of the @@ -1365,6 +1366,8 @@ config CPU_MIPS32_R6 select GENERIC_CSUM select HAVE_KVM select MIPS_O32_FP64_SUPPORT + select CPU_SUPPORTS_MIPS_LIGHTWEIGHT_SYNC + select WEAK_REORDERING_BEYOND_LLSC help Choose this option to build a kernel for release 6 or later of the MIPS32 architecture. New MIPS processors, starting with the Warrior @@ -1399,6 +1402,7 @@ config CPU_MIPS64_R2 select CPU_SUPPORTS_HIGHMEM select CPU_SUPPORTS_HUGEPAGES select CPU_SUPPORTS_MSA + select CPU_SUPPORTS_MIPS_LIGHTWEIGHT_SYNC help Choose this option to build a kernel for release 2 or later of the MIPS64 architecture. Many modern embedded systems with a 64-bit @@ -1415,6 +1419,8 @@ config CPU_MIPS64_R6 select CPU_SUPPORTS_HIGHMEM select CPU_SUPPORTS_MSA select GENERIC_CSUM + select CPU_SUPPORTS_MIPS_LIGHTWEIGHT_SYNC + select WEAK_REORDERING_BEYOND_LLSC help Choose this option to build a kernel for release 6 or later of the MIPS64 architecture. New MIPS processors, starting with the Warrior @@ -1876,6 +1882,20 @@ config WEAK_ORDERING # config WEAK_REORDERING_BEYOND_LLSC bool + +config MIPS_LIGHTWEIGHT_SYNC + bool "CPU lightweight SYNC instruction for weak reordering" + depends on CPU_SUPPORTS_MIPS_LIGHTWEIGHT_SYNC && WEAK_ORDERING + default y if CPU_MIPSR6 + help + This option enforces a use of "lightweight sync" instructions + for SMP (multi-CPU) memory barriers. 
This instructions are much + more faster than a traditional "SYNC 0". + + If that instructions are not implemented in processor then it is + converted to generic "SYNC 0". + + If you unsure, say N here, it may slightly decrease your performance endmenu # @@ -1928,6 +1948,8 @@ config CPU_SUPPORTS_HUGEPAGES bool config CPU_SUPPORTS_UNCACHED_ACCELERATED bool +config CPU_SUPPORTS_MIPS_LIGHTWEIGHT_SYNC + bool config MIPS_PGD_C0_CONTEXT bool default y if 64BIT && CPU_MIPSR2 && !CPU_XLP diff --git a/arch/mips/include/asm/barrier.h b/arch/mips/include/asm/barrier.h index 2b8bbbcb9be0..d2a63abfc7c6 100644 --- a/arch/mips/include/asm/barrier.h +++ b/arch/mips/include/asm/barrier.h @@ -96,9 +96,15 @@ # define smp_rmb() barrier() # define smp_wmb() __syncw() # else +# ifdef CONFIG_MIPS_LIGHTWEIGHT_SYNC +# define smp_mb() __asm__ __volatile__("sync 0x10" : : :"memory") +# define smp_rmb() __asm__ __volatile__("sync 0x13" : : :"memory") +# define smp_wmb() __asm__ __volatile__("sync 0x4" : : :"memory") +# else # define smp_mb() __asm__ __volatile__("sync" : : :"memory") # define smp_rmb() __asm__ __volatile__("sync" : : :"memory") # define smp_wmb() __asm__ __volatile__("sync" : : :"memory") +# endif # endif #else #define smp_mb() barrier() ^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH 1/3] MIPS: R6: Use lightweight SYNC instruction in smp_* memory barriers 2015-06-02 0:09 ` [PATCH 1/3] MIPS: R6: Use lightweight SYNC instruction in smp_* memory barriers Leonid Yegoshin @ 2015-06-02 10:08 ` Paul Burton 2015-06-02 12:12 ` Luc Van Oostenryck 2015-06-02 10:48 ` James Hogan 2015-06-05 13:10 ` Ralf Baechle 2 siblings, 1 reply; 24+ messages in thread From: Paul Burton @ 2015-06-02 10:08 UTC (permalink / raw) To: Leonid Yegoshin Cc: linux-mips, benh, will.deacon, linux-kernel, ralf, markos.chandras, macro, Steven.Hill, alexander.h.duyck, davem Hi Leonid, On Mon, Jun 01, 2015 at 05:09:34PM -0700, Leonid Yegoshin wrote: > This instructions were specifically designed to work for smp_*() sort of > memory barriers in MIPS R2/R3/R5 and R6. > > Unfortunately, it's description is very cryptic and is done in HW engineering > style which prevents use of it by SW. FYI this reads to me like "I couldn't figure it out from the manuals" which you might want to leave out. > This instructions are not mandatory but > there is a mandatory requirement - if not implemented, then a regular MIPS > SYNC 0 must be used instead. > > The reason for this change is - SYNC 0 is a heavvy-weighted in many CPUs, it may > disrupt other cores pipelines etc. > > Due to concern about verification of old MIPS R2 compatible cores of other > vendors it is enforced only for MIPS R6 and other MIPS32/64 R2/R5 processors > have it configurable. 
> > Signed-off-by: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com> > --- > arch/mips/Kconfig | 22 ++++++++++++++++++++++ > arch/mips/include/asm/barrier.h | 6 ++++++ > 2 files changed, 28 insertions(+) > > diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig > index be384d6a58bb..c7d0cacece3d 100644 > --- a/arch/mips/Kconfig > +++ b/arch/mips/Kconfig > @@ -1347,6 +1347,7 @@ config CPU_MIPS32_R2 > select CPU_SUPPORTS_32BIT_KERNEL > select CPU_SUPPORTS_HIGHMEM > select CPU_SUPPORTS_MSA > + select CPU_SUPPORTS_MIPS_LIGHTWEIGHT_SYNC > select HAVE_KVM > help > Choose this option to build a kernel for release 2 or later of the > @@ -1365,6 +1366,8 @@ config CPU_MIPS32_R6 > select GENERIC_CSUM > select HAVE_KVM > select MIPS_O32_FP64_SUPPORT > + select CPU_SUPPORTS_MIPS_LIGHTWEIGHT_SYNC > + select WEAK_REORDERING_BEYOND_LLSC This WEAK_REORDERING_BEYOND_LLSC change should probably be split out into a separate patch, since it has nothing to do with the smp_* barriers and is just left as a sync 0 (__WEAK_LLSC_MB) as of this patch. > help > Choose this option to build a kernel for release 6 or later of the > MIPS32 architecture. New MIPS processors, starting with the Warrior > @@ -1399,6 +1402,7 @@ config CPU_MIPS64_R2 > select CPU_SUPPORTS_HIGHMEM > select CPU_SUPPORTS_HUGEPAGES > select CPU_SUPPORTS_MSA > + select CPU_SUPPORTS_MIPS_LIGHTWEIGHT_SYNC > help > Choose this option to build a kernel for release 2 or later of the > MIPS64 architecture. Many modern embedded systems with a 64-bit > @@ -1415,6 +1419,8 @@ config CPU_MIPS64_R6 > select CPU_SUPPORTS_HIGHMEM > select CPU_SUPPORTS_MSA > select GENERIC_CSUM > + select CPU_SUPPORTS_MIPS_LIGHTWEIGHT_SYNC > + select WEAK_REORDERING_BEYOND_LLSC Ditto. > help > Choose this option to build a kernel for release 6 or later of the > MIPS64 architecture. 
New MIPS processors, starting with the Warrior > @@ -1876,6 +1882,20 @@ config WEAK_ORDERING > # > config WEAK_REORDERING_BEYOND_LLSC > bool > + > +config MIPS_LIGHTWEIGHT_SYNC > + bool "CPU lightweight SYNC instruction for weak reordering" > + depends on CPU_SUPPORTS_MIPS_LIGHTWEIGHT_SYNC && WEAK_ORDERING > + default y if CPU_MIPSR6 > + help > + This option enforces a use of "lightweight sync" instructions > + for SMP (multi-CPU) memory barriers. This instructions are much > + more faster than a traditional "SYNC 0". "enforces a use" would probably read better as "makes use", and "much more faster" should just be "much faster". Personally I'd not make the SYNC so shouty - sync is perfectly valid & generally the way the instruction is named even in the kernel. > + > + If that instructions are not implemented in processor then it is > + converted to generic "SYNC 0". I think this would read better as something like: If a processor does not implement the lightweight sync operations then the architecture requires that they interpret the corresponding sync instructions as the typical heavyweight "sync 0". Therefore this should be safe to enable on all CPUs implementing release 2 or later of the MIPS architecture. > + > + If you unsure, say N here, it may slightly decrease your performance > endmenu > > # > @@ -1928,6 +1948,8 @@ config CPU_SUPPORTS_HUGEPAGES > bool > config CPU_SUPPORTS_UNCACHED_ACCELERATED > bool > +config CPU_SUPPORTS_MIPS_LIGHTWEIGHT_SYNC > + bool I'm not sure putting MIPS_ in the name of this makes sense either BTW. Other things around it don't bother (eg. UNCACHED_ACCELERATED right above) and it's already under arch/mips/Kconfig. 
> config MIPS_PGD_C0_CONTEXT > bool > default y if 64BIT && CPU_MIPSR2 && !CPU_XLP > diff --git a/arch/mips/include/asm/barrier.h b/arch/mips/include/asm/barrier.h > index 2b8bbbcb9be0..d2a63abfc7c6 100644 > --- a/arch/mips/include/asm/barrier.h > +++ b/arch/mips/include/asm/barrier.h > @@ -96,9 +96,15 @@ > # define smp_rmb() barrier() > # define smp_wmb() __syncw() > # else > +# ifdef CONFIG_MIPS_LIGHTWEIGHT_SYNC > +# define smp_mb() __asm__ __volatile__("sync 0x10" : : :"memory") > +# define smp_rmb() __asm__ __volatile__("sync 0x13" : : :"memory") > +# define smp_wmb() __asm__ __volatile__("sync 0x4" : : :"memory") Tabs please. Thanks, Paul > +# else > # define smp_mb() __asm__ __volatile__("sync" : : :"memory") > # define smp_rmb() __asm__ __volatile__("sync" : : :"memory") > # define smp_wmb() __asm__ __volatile__("sync" : : :"memory") > +# endif > # endif > #else > #define smp_mb() barrier() > > ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/3] MIPS: R6: Use lightweight SYNC instruction in smp_* memory barriers
  2015-06-02 10:08 ` Paul Burton @ 2015-06-02 12:12 ` Luc Van Oostenryck
  2015-06-02 12:44 ` Ralf Baechle
  2015-06-02 18:20 ` Leonid Yegoshin
  0 siblings, 2 replies; 24+ messages in thread

From: Luc Van Oostenryck @ 2015-06-02 12:12 UTC (permalink / raw)
To: Paul Burton
Cc: Leonid Yegoshin, linux-mips, benh, will.deacon, linux-kernel, ralf, markos.chandras, macro, Steven.Hill, alexander.h.duyck, davem

On Tue, Jun 02, 2015 at 11:08:35AM +0100, Paul Burton wrote:
> Hi Leonid,
>
<snip>
> > +
> > +	  If that instructions are not implemented in processor then it is
> > +	  converted to generic "SYNC 0".
>
> I think this would read better as something like:
>
>   If a processor does not implement the lightweight sync operations then
>   the architecture requires that they interpret the corresponding sync
>   instructions as the typical heavyweight "sync 0". Therefore this
>   should be safe to enable on all CPUs implementing release 2 or
>   later of the MIPS architecture.
>

Is it really the case for release 2?

I'm asking because recently I needed to do something similar and I couldn't find this guarantee in the revision 2.00 of the manual. Maybe it's just poorly formulated, but here is what I find in it:
- "The stype values 1-31 are reserved for future extensions to the architecture." (ok)
- "A value of zero will always be defined such that it performs all defined synchronization operations." (ok)
- "Non-zero values may be defined to remove some synchronization operations." (ok, certainly if we understand the word "weaker" instead of "remove")
- "As such, software should never use a non-zero value of the stype field, as this may inadvertently cause future failures if non-zero values remove synchronization operations." (Mmmm, ok but ...)
Nowhere is there something close to what is found in the revision 5.0 or later: "If an implementation does not use one of these non-zero values to define a different synchronization behavior, then that non-zero value of stype must act the same as stype zero completion barrier." The wording may have changed since revision 2.8 but I don't have access to the corresponding manual. Luc Van Oostenryck ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/3] MIPS: R6: Use lightweight SYNC instruction in smp_* memory barriers 2015-06-02 12:12 ` Luc Van Oostenryck @ 2015-06-02 12:44 ` Ralf Baechle 2015-06-02 18:20 ` Leonid Yegoshin 1 sibling, 0 replies; 24+ messages in thread From: Ralf Baechle @ 2015-06-02 12:44 UTC (permalink / raw) To: Luc Van Oostenryck Cc: Paul Burton, Leonid Yegoshin, linux-mips, benh, will.deacon, linux-kernel, markos.chandras, macro, Steven.Hill, alexander.h.duyck, davem On Tue, Jun 02, 2015 at 02:12:29PM +0200, Luc Van Oostenryck wrote: > Date: Tue, 2 Jun 2015 14:12:29 +0200 > From: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> > To: Paul Burton <paul.burton@imgtec.com> > Cc: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com>, > linux-mips@linux-mips.org, benh@kernel.crashing.org, will.deacon@arm.com, > linux-kernel@vger.kernel.org, ralf@linux-mips.org, > markos.chandras@imgtec.com, macro@linux-mips.org, Steven.Hill@imgtec.com, > alexander.h.duyck@redhat.com, davem@davemloft.net > Subject: Re: [PATCH 1/3] MIPS: R6: Use lightweight SYNC instruction in > smp_* memory barriers > Content-Type: text/plain; charset=us-ascii > > On Tue, Jun 02, 2015 at 11:08:35AM +0100, Paul Burton wrote: > > Hi Leonid, > > > > <snip> > > > > + > > > + If that instructions are not implemented in processor then it is > > > + converted to generic "SYNC 0". > > > > I think this would read better as something like: > > > > If a processor does not implement the lightweight sync operations then > > the architecture requires that they interpret the corresponding sync > > instructions as the typical heavyweight "sync 0". Therefore this > > should be safe to enable on all CPUs implementing release 2 or > > later of the MIPS architecture. > > > > Is it really the case for release 2? > > I'm asking because recently I needed to do something similar and I couldn't > find this garantee in the revision 2.00 of the manual. 
> May it's just poorly formulated but here is what I find in it: > - "The stype values 1-31 are reserved for future extensions to the architecture." > (ok) > - "A value of zero will always be defined such that it performs all defined > synchronization operations." (ok) > - "Non-zero values may be defined to remove some synchronization operations." > (ok, certainly if we understand the word "weaker" instead of "remove") Yes, "weaker" is what was meant here. > - "As such, software should never use a non-zero value of the stype field, as > this may inadvertently cause future failures if non-zero values remove > synchronization operations." (Mmmm, ok but ...) > Nowhere is there something close to what is found in the revision 5.0 or later: I think that's just a very convoluted way to say non-zero values are reserved and the CPU may bite you and your kittens if you dare to use such values. > "If an implementation does not use one of these non-zero values to define a > different synchronization behavior, then that non-zero value of stype must > act the same as stype zero completion barrier." "We try to be nice to bad code but ... you've been warned!" :-) > The wording may have changed since revision 2.8 but I don't have access to the > corresponding manual. The page about the SYNC instruction has changed significantly over time - the SYNC instruction's documentation manual has grown from one page for the R4000 to four pages for MIPS32 R1 and R2.5 to five pages for MIPS64 R3.02 and newer. R3 added read, write, rw, acquire and release sync types. But the sentence in question exists unchanged even in the R6 manual. Ralf ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/3] MIPS: R6: Use lightweight SYNC instruction in smp_* memory barriers
  2015-06-02 12:12 ` Luc Van Oostenryck
  2015-06-02 12:44 ` Ralf Baechle @ 2015-06-02 18:20 ` Leonid Yegoshin
  1 sibling, 0 replies; 24+ messages in thread

From: Leonid Yegoshin @ 2015-06-02 18:20 UTC (permalink / raw)
To: Luc Van Oostenryck, Paul Burton
Cc: linux-mips, benh, will.deacon, linux-kernel, ralf, markos.chandras, macro, Steven.Hill, alexander.h.duyck, davem

On 06/02/2015 05:12 AM, Luc Van Oostenryck wrote:
> On Tue, Jun 02, 2015 at 11:08:35AM +0100, Paul Burton wrote:
>
>> I think this would read better as something like:
>>
>>   If a processor does not implement the lightweight sync operations then
>>   the architecture requires that they interpret the corresponding sync
>>   instructions as the typical heavyweight "sync 0". Therefore this
>>   should be safe to enable on all CPUs implementing release 2 or
>>   later of the MIPS architecture.
>>
> Is it really the case for release 2?
>
> I'm asking because recently I needed to do something similar and I couldn't
> find this garantee in the revision 2.00 of the manual.

Yes. MD00086/MD00084/MD00087 Rev 2.60 are technically MIPS R2. And this revision explicitly lists optional codes and it has a clear statement:

> Implementations that do not use any of the non-zero values of stype to
> define different barriers, such as ordering barriers, must make those
> stype values act the same as stype zero.

(don't blame me that Rev 2.60 is 5 years after initial 2.00, it is still MIPS R2).

- Leonid.

^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/3] MIPS: R6: Use lightweight SYNC instruction in smp_* memory barriers 2015-06-02 0:09 ` [PATCH 1/3] MIPS: R6: Use lightweight SYNC instruction in smp_* memory barriers Leonid Yegoshin 2015-06-02 10:08 ` Paul Burton @ 2015-06-02 10:48 ` James Hogan 2015-06-02 16:15 ` Maciej W. Rozycki 2015-06-05 13:10 ` Ralf Baechle 2 siblings, 1 reply; 24+ messages in thread From: James Hogan @ 2015-06-02 10:48 UTC (permalink / raw) To: Leonid Yegoshin, linux-mips, benh, will.deacon, linux-kernel, ralf, markos.chandras, macro, Steven.Hill, alexander.h.duyck, davem [-- Attachment #1: Type: text/plain, Size: 5044 bytes --] Hi Leonid, On 02/06/15 01:09, Leonid Yegoshin wrote: > This instructions were specifically designed to work for smp_*() sort of > memory barriers in MIPS R2/R3/R5 and R6. > > Unfortunately, it's description is very cryptic and is done in HW engineering > style which prevents use of it by SW. This instructions are not mandatory but > there is a mandatory requirement - if not implemented, then a regular MIPS > SYNC 0 must be used instead. > > The reason for this change is - SYNC 0 is a heavvy-weighted in many CPUs, it may heavy > disrupt other cores pipelines etc. > > Due to concern about verification of old MIPS R2 compatible cores of other > vendors it is enforced only for MIPS R6 and other MIPS32/64 R2/R5 processors > have it configurable. Is it worth inverting the logic to exclude the platforms you're concerned about (which also makes it explicit what hardware needs verifying). For new platforms we don't need to worry about kernel regressions so much, so it probably makes sense to have them used by default. 
> > Signed-off-by: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com> > --- > arch/mips/Kconfig | 22 ++++++++++++++++++++++ > arch/mips/include/asm/barrier.h | 6 ++++++ > 2 files changed, 28 insertions(+) > > diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig > index be384d6a58bb..c7d0cacece3d 100644 > --- a/arch/mips/Kconfig > +++ b/arch/mips/Kconfig > @@ -1347,6 +1347,7 @@ config CPU_MIPS32_R2 > select CPU_SUPPORTS_32BIT_KERNEL > select CPU_SUPPORTS_HIGHMEM > select CPU_SUPPORTS_MSA > + select CPU_SUPPORTS_MIPS_LIGHTWEIGHT_SYNC > select HAVE_KVM > help > Choose this option to build a kernel for release 2 or later of the > @@ -1365,6 +1366,8 @@ config CPU_MIPS32_R6 > select GENERIC_CSUM > select HAVE_KVM > select MIPS_O32_FP64_SUPPORT > + select CPU_SUPPORTS_MIPS_LIGHTWEIGHT_SYNC > + select WEAK_REORDERING_BEYOND_LLSC > help > Choose this option to build a kernel for release 6 or later of the > MIPS32 architecture. New MIPS processors, starting with the Warrior > @@ -1399,6 +1402,7 @@ config CPU_MIPS64_R2 > select CPU_SUPPORTS_HIGHMEM > select CPU_SUPPORTS_HUGEPAGES > select CPU_SUPPORTS_MSA > + select CPU_SUPPORTS_MIPS_LIGHTWEIGHT_SYNC > help > Choose this option to build a kernel for release 2 or later of the > MIPS64 architecture. Many modern embedded systems with a 64-bit > @@ -1415,6 +1419,8 @@ config CPU_MIPS64_R6 > select CPU_SUPPORTS_HIGHMEM > select CPU_SUPPORTS_MSA > select GENERIC_CSUM > + select CPU_SUPPORTS_MIPS_LIGHTWEIGHT_SYNC > + select WEAK_REORDERING_BEYOND_LLSC > help > Choose this option to build a kernel for release 6 or later of the > MIPS64 architecture. New MIPS processors, starting with the Warrior > @@ -1876,6 +1882,20 @@ config WEAK_ORDERING > # > config WEAK_REORDERING_BEYOND_LLSC > bool > + > +config MIPS_LIGHTWEIGHT_SYNC > + bool "CPU lightweight SYNC instruction for weak reordering" > + depends on CPU_SUPPORTS_MIPS_LIGHTWEIGHT_SYNC && WEAK_ORDERING Worth depending on SMP, so as not to give the user more options than they need? 
> + default y if CPU_MIPSR6 > + help > + This option enforces a use of "lightweight sync" instructions > + for SMP (multi-CPU) memory barriers. This instructions are much > + more faster than a traditional "SYNC 0". > + > + If that instructions are not implemented in processor then it is > + converted to generic "SYNC 0". > + > + If you unsure, say N here, it may slightly decrease your performance "it" is ambiguous. "Saying N" or "this option"? (I guess "saying N", but its not obvious to an uninformed kernel configurer). > endmenu > > # > @@ -1928,6 +1948,8 @@ config CPU_SUPPORTS_HUGEPAGES > bool > config CPU_SUPPORTS_UNCACHED_ACCELERATED > bool > +config CPU_SUPPORTS_MIPS_LIGHTWEIGHT_SYNC > + bool > config MIPS_PGD_C0_CONTEXT > bool > default y if 64BIT && CPU_MIPSR2 && !CPU_XLP > diff --git a/arch/mips/include/asm/barrier.h b/arch/mips/include/asm/barrier.h > index 2b8bbbcb9be0..d2a63abfc7c6 100644 > --- a/arch/mips/include/asm/barrier.h > +++ b/arch/mips/include/asm/barrier.h > @@ -96,9 +96,15 @@ > # define smp_rmb() barrier() > # define smp_wmb() __syncw() > # else > +# ifdef CONFIG_MIPS_LIGHTWEIGHT_SYNC > +# define smp_mb() __asm__ __volatile__("sync 0x10" : : :"memory") > +# define smp_rmb() __asm__ __volatile__("sync 0x13" : : :"memory") > +# define smp_wmb() __asm__ __volatile__("sync 0x4" : : :"memory") binutils appears to support the sync_mb, sync_rmb, sync_wmb aliases since version 2.21. Can we safely use them? Cheers James > +# else > # define smp_mb() __asm__ __volatile__("sync" : : :"memory") > # define smp_rmb() __asm__ __volatile__("sync" : : :"memory") > # define smp_wmb() __asm__ __volatile__("sync" : : :"memory") > +# endif > # endif > #else > #define smp_mb() barrier() > > [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 819 bytes --] ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/3] MIPS: R6: Use lightweight SYNC instruction in smp_* memory barriers 2015-06-02 10:48 ` James Hogan @ 2015-06-02 16:15 ` Maciej W. Rozycki 2015-06-02 23:56 ` David Daney 0 siblings, 1 reply; 24+ messages in thread From: Maciej W. Rozycki @ 2015-06-02 16:15 UTC (permalink / raw) To: James Hogan Cc: Leonid Yegoshin, linux-mips, benh, will.deacon, linux-kernel, Ralf Baechle, markos.chandras, Steven.Hill, alexander.h.duyck, David S. Miller On Tue, 2 Jun 2015, James Hogan wrote: > > diff --git a/arch/mips/include/asm/barrier.h b/arch/mips/include/asm/barrier.h > > index 2b8bbbcb9be0..d2a63abfc7c6 100644 > > --- a/arch/mips/include/asm/barrier.h > > +++ b/arch/mips/include/asm/barrier.h > > @@ -96,9 +96,15 @@ > > # define smp_rmb() barrier() > > # define smp_wmb() __syncw() > > # else > > +# ifdef CONFIG_MIPS_LIGHTWEIGHT_SYNC > > +# define smp_mb() __asm__ __volatile__("sync 0x10" : : :"memory") > > +# define smp_rmb() __asm__ __volatile__("sync 0x13" : : :"memory") > > +# define smp_wmb() __asm__ __volatile__("sync 0x4" : : :"memory") > > binutils appears to support the sync_mb, sync_rmb, sync_wmb aliases > since version 2.21. Can we safely use them? I suggest that we don't -- we still officially support binutils 2.12 and have other places where we even use `.word' to insert instructions current versions of binutils properly handle. It may be worth noting in a comment though that these encodings correspond to these operations that you named. Maciej ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/3] MIPS: R6: Use lightweight SYNC instruction in smp_* memory barriers 2015-06-02 16:15 ` Maciej W. Rozycki @ 2015-06-02 23:56 ` David Daney 2015-06-03 1:56 ` Leonid Yegoshin 0 siblings, 1 reply; 24+ messages in thread From: David Daney @ 2015-06-02 23:56 UTC (permalink / raw) To: Maciej W. Rozycki Cc: James Hogan, Leonid Yegoshin, linux-mips, benh, will.deacon, linux-kernel, Ralf Baechle, markos.chandras, Steven.Hill, alexander.h.duyck, David S. Miller On 06/02/2015 09:15 AM, Maciej W. Rozycki wrote: > On Tue, 2 Jun 2015, James Hogan wrote: > >>> diff --git a/arch/mips/include/asm/barrier.h b/arch/mips/include/asm/barrier.h >>> index 2b8bbbcb9be0..d2a63abfc7c6 100644 >>> --- a/arch/mips/include/asm/barrier.h >>> +++ b/arch/mips/include/asm/barrier.h >>> @@ -96,9 +96,15 @@ >>> # define smp_rmb() barrier() >>> # define smp_wmb() __syncw() >>> # else >>> +# ifdef CONFIG_MIPS_LIGHTWEIGHT_SYNC >>> +# define smp_mb() __asm__ __volatile__("sync 0x10" : : :"memory") >>> +# define smp_rmb() __asm__ __volatile__("sync 0x13" : : :"memory") >>> +# define smp_wmb() __asm__ __volatile__("sync 0x4" : : :"memory") >> >> binutils appears to support the sync_mb, sync_rmb, sync_wmb aliases >> since version 2.21. Can we safely use them? > > I suggest that we don't -- we still officially support binutils 2.12 and > have other places where we even use `.word' to insert instructions current > versions of binutils properly handle. It may be worth noting in a comment > though that these encodings correspond to these operations that you named. > Surely the other MIPSr6 instructions are not supported in binutils 2.12 either. So if it is for r6, why not require modern tools, and put something user readable in here? David Daney ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/3] MIPS: R6: Use lightweight SYNC instruction in smp_* memory barriers 2015-06-02 23:56 ` David Daney @ 2015-06-03 1:56 ` Leonid Yegoshin 0 siblings, 0 replies; 24+ messages in thread From: Leonid Yegoshin @ 2015-06-03 1:56 UTC (permalink / raw) To: David Daney, Maciej W. Rozycki Cc: James Hogan, linux-mips, benh, will.deacon, linux-kernel, Ralf Baechle, markos.chandras, Steven.Hill, alexander.h.duyck, David S. Miller On 06/02/2015 04:56 PM, David Daney wrote: > On 06/02/2015 09:15 AM, Maciej W. Rozycki wrote: >> On Tue, 2 Jun 2015, James Hogan wrote: >> >>> >>> binutils appears to support the sync_mb, sync_rmb, sync_wmb aliases >>> since version 2.21. Can we safely use them? >> >> I suggest that we don't -- we still officially support binutils >> 2.12 and >> have other places where we even use `.word' to insert instructions >> current >> versions of binutils properly handle. It may be worth noting in a >> comment >> though that these encodings correspond to these operations that you >> named. >> > > Surely the other MIPSr6 instructions are not supported in binutils > 2.12 either. So if it is for r6, why not require modern tools, and > put something user readable in here? > > No, it can be used for MIPS R2 also. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/3] MIPS: R6: Use lightweight SYNC instruction in smp_* memory barriers
  2015-06-02 0:09 ` [PATCH 1/3] MIPS: R6: Use lightweight SYNC instruction in smp_* memory barriers Leonid Yegoshin
  2015-06-02 10:08 ` Paul Burton
  2015-06-02 10:48 ` James Hogan @ 2015-06-05 13:10 ` Ralf Baechle
  2015-06-05 21:18 ` Maciej W. Rozycki
  2016-01-28 2:28 ` Joshua Kinard
  2 siblings, 2 replies; 24+ messages in thread

From: Ralf Baechle @ 2015-06-05 13:10 UTC (permalink / raw)
To: Leonid Yegoshin, macro
Cc: linux-mips, benh, will.deacon, linux-kernel, markos.chandras, Steven.Hill, alexander.h.duyck, davem

On Mon, Jun 01, 2015 at 05:09:34PM -0700, Leonid Yegoshin wrote:

Leonid,

to me the biggest technical problem with this patch is that the new Kconfig option is user visible. This is the kind of deeply technical option that exceeds the technical knowledge of most users, so it should probably be driven by a select.

We probably also want to enquire how old CPUs from before the invention of the stype field are behaving. If, as I hope, those all treat stype != 0 as stype 0, we could simply drop the option. But we might simply be out of luck - dunno.

Maciej,

do you have an R4000 / R4600 / R5000 / R7000 / SiByte system at hand to test this? I think we don't need to test that SYNC actually works as intended; the simpler test that SYNC <stype != 0> is not causing an illegal instruction exception is sufficient, that is, if something like

int main(int argc, char *argv[])
{
	asm("	.set	mips2	\n"
	    "	sync	0x10	\n"
	    "	sync	0x13	\n"
	    "	sync	0x04	\n"
	    "	.set	mips0	\n");

	return 0;
}

doesn't crash we should be ok.

The kernel's SYNC emulation should already be ok. We ignore the stype field entirely, and for a uniprocessor R2000/R3000 that should be just the right thing.

  Ralf

^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/3] MIPS: R6: Use lightweight SYNC instruction in smp_* memory barriers 2015-06-05 13:10 ` Ralf Baechle @ 2015-06-05 21:18 ` Maciej W. Rozycki 2016-01-28 2:28 ` Joshua Kinard 1 sibling, 0 replies; 24+ messages in thread From: Maciej W. Rozycki @ 2015-06-05 21:18 UTC (permalink / raw) To: Ralf Baechle Cc: Leonid Yegoshin, linux-mips, benh, will.deacon, linux-kernel, markos.chandras, Steven.Hill, alexander.h.duyck, David S. Miller On Fri, 5 Jun 2015, Ralf Baechle wrote: > do you have an R4000 / R4600 / R5000 / R7000 / SiByte system at hand to > test this? I should be able to check R4400 (that is virtually the same as R4000) next week or so. As to SiByte -- not before next month I'm afraid. I don't have access to any of the other processors you named. You may want to find a better person if you want to accept this change soon. Maciej ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/3] MIPS: R6: Use lightweight SYNC instruction in smp_* memory barriers 2015-06-05 13:10 ` Ralf Baechle 2015-06-05 21:18 ` Maciej W. Rozycki @ 2016-01-28 2:28 ` Joshua Kinard 2016-01-29 13:32 ` Maciej W. Rozycki 1 sibling, 1 reply; 24+ messages in thread From: Joshua Kinard @ 2016-01-28 2:28 UTC (permalink / raw) To: Ralf Baechle, Leonid Yegoshin, macro Cc: linux-mips, benh, will.deacon, linux-kernel, markos.chandras, Steven.Hill, alexander.h.duyck, davem On 06/05/2015 09:10, Ralf Baechle wrote: > On Mon, Jun 01, 2015 at 05:09:34PM -0700, Leonid Yegoshin wrote: > > Leonid, > > to me the biggest technical problem with this patch is that the new Kconfig > option is user visible. This is the kind of deeply technical options > which exceeds the technical knowledge of most users, so it should probably > be driven by a select. > > We probably also want to enquire how old CPUs from before the invention > of the stype field are behaving. If those as I hope for all treat an > stype != 0 as stype 0 we could simply drop the option. But we might > simply be out of luck - dunno. > > Maciej, > > do you have an R4000 / R4600 / R5000 / R7000 / SiByte system at hand to > test this? I think we don't need to test that SYNC actually works as > intended but the simpler test that SYNC <stype != 0> is not causing a > illegal instruction exception is sufficient, that is if something like > > int main(int argc, charg *argv[]) > { > asm(" .set mips2 \n" > " sync 0x10 \n" > " sync 0x13 \n" > " sync 0x04 \n" > " .set mips 0 \n"); > > return 0; > } > > doesn't crash we should be ok. > > The kernel's SYNC emulation should already be ok. We ignore the stype > field entirely and for a uniprocessor R2000/R3000 that should be just > the right thing. 
> > Ralf I tried just compiling this on my SGI O2, which has an RM7000 CPU, and it is refusing to even compile (after fixing typos): # gcc -O2 -pipe sync_test.c -o sync_test {standard input}: Assembler messages: {standard input}:19: Error: invalid operands `sync 0x10' {standard input}:20: Error: invalid operands `sync 0x13' {standard input}:21: Error: invalid operands `sync 0x04' So a bit of searching landed me here: http://stackoverflow.com/questions/3599564/gnu-assembler-for-mips-how-to-emit-sync-instructions And I recoded the sync insns like this: int main(int argc, char *argv[]) { __asm__ volatile ( \ " .set mips2 \n" " .word (0x0000000f | (0x10 << 6)) \n" " .word (0x0000000f | (0x14 << 6)) \n" " .word (0x0000000f | (0x04 << 6)) \n" " .set mips0 \n" ); return 0; } And the program compiles successfully and executes with no noticeable oddities or errors. Nothing in dmesg, no crashes, booms, or disappearance of small kittens. I did a quick disassembly to make sure all three got emitted: 004005e0 <main>: 4005e0: 27bdfff8 addiu sp,sp,-8 4005e4: afbe0004 sw s8,4(sp) 4005e8: 03a0f021 move s8,sp 4005ec: afc40008 sw a0,8(s8) 4005f0: afc5000c sw a1,12(s8) > 4005f4: 0000040f sync.p > 4005f8: 0000050f 0x50f > 4005fc: 0000010f 0x10f 400600: 00001021 move v0,zero Same effect on my Octane (IP30) w/ an R14000 CPU. Tested inside a uclibc-based chroot, but no change. Executes successfully silently. -- Joshua Kinard Gentoo/MIPS kumba@gentoo.org 6144R/F5C6C943 2015-04-27 177C 1972 1FB8 F254 BAD0 3E72 5C63 F4E3 F5C6 C943 "The past tempts us, the present confuses us, the future frightens us. And our lives slip away, moment by moment, lost in that vast, terrible in-between." --Emperor Turhan, Centauri Republic ^ permalink raw reply [flat|nested] 24+ messages in thread
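[Editorial note: the `.word` workaround above relies on SYNC being a SPECIAL-opcode instruction with function code 0x0f and the 5-bit stype field in bits 10..6 -- exactly what `0x0000000f | (stype << 6)` computes. A quick sketch to sanity-check the three emitted words; `sync_word` is just an illustrative name, not anything from the kernel tree:]

```python
def sync_word(stype: int) -> int:
    """Encode a MIPS SYNC instruction word: SPECIAL opcode (bits 31..26
    zero), stype in bits 10..6, function field 0x0f (SYNC)."""
    assert 0 <= stype < 32, "stype is a 5-bit field"
    return 0x0000000f | (stype << 6)

# The three words emitted by the test program above:
for stype in (0x10, 0x14, 0x04):
    print(".word 0x%08x  # sync %#x" % (sync_word(stype), stype))
# -> 0x0000040f, 0x0000050f, 0x0000010f, matching the disassembly
```

[Note in passing that the `.word` program encodes stype 0x14 where Ralf's original test used 0x13 (`sync 0x13` would encode as 0x000004cf), which is why the disassembly shows 0x0000050f in the middle slot.]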
* Re: [PATCH 1/3] MIPS: R6: Use lightweight SYNC instruction in smp_* memory barriers 2016-01-28 2:28 ` Joshua Kinard @ 2016-01-29 13:32 ` Maciej W. Rozycki 2016-01-30 16:25 ` Joshua Kinard 0 siblings, 1 reply; 24+ messages in thread From: Maciej W. Rozycki @ 2016-01-29 13:32 UTC (permalink / raw) To: Ralf Baechle, Joshua Kinard Cc: Leonid Yegoshin, linux-mips, benh, will.deacon, linux-kernel, markos.chandras, Steven.Hill, alexander.h.duyck, David S. Miller, Maciej W. Rozycki On Wed, 27 Jan 2016, Joshua Kinard wrote: > On 06/05/2015 09:10, Ralf Baechle wrote: > > > > Maciej, > > > > do you have an R4000 / R4600 / R5000 / R7000 / SiByte system at hand to > > test this? I think we don't need to test that SYNC actually works as > > intended but the simpler test that SYNC <stype != 0> is not causing a > > illegal instruction exception is sufficient, that is if something like > > > > int main(int argc, charg *argv[]) > > { > > asm(" .set mips2 \n" > > " sync 0x10 \n" > > " sync 0x13 \n" > > " sync 0x04 \n" > > " .set mips 0 \n"); > > > > return 0; > > } > > > > doesn't crash we should be ok. No anomalies observed with these processors: CPU0 revision is: 00000440 (R4400SC) CPU0 revision is: 01040102 (SiByte SB1) > I tried just compiling this on my SGI O2, which has an RM7000 CPU, and it is > refusing to even compile (after fixing typos): > > # gcc -O2 -pipe sync_test.c -o sync_test > {standard input}: Assembler messages: > {standard input}:19: Error: invalid operands `sync 0x10' > {standard input}:20: Error: invalid operands `sync 0x13' > {standard input}:21: Error: invalid operands `sync 0x04' Yeah, there is another typo there: you need to use `.set mips32' or suchlike rather than `.set mips2' to be able to specify an operand. That probably counts as a bug in binutils, as -- according to what I have observed in the other thread -- the `stype' field has been defined at least since the MIPS IV ISA. 
> And the program compiles successfully and executes with no noticeable oddities > or errors. Nothing in dmesg, no crashes, booms, or disappearance of small You did disable SYNC emulation in the kernel though with a change like below, did you? > kittens. I did a quick disassembly to make sure all three got emitted: > > 004005e0 <main>: > 4005e0: 27bdfff8 addiu sp,sp,-8 > 4005e4: afbe0004 sw s8,4(sp) > 4005e8: 03a0f021 move s8,sp > 4005ec: afc40008 sw a0,8(s8) > 4005f0: afc5000c sw a1,12(s8) > > 4005f4: 0000040f sync.p > > 4005f8: 0000050f 0x50f > > 4005fc: 0000010f 0x10f > 400600: 00001021 move v0,zero You could have used the -m switch to override the architecture embedded with the ELF file, to have the instructions disassembled correctly, e.g.: $ objdump -m mips:isa64 -d sync [...] 00000001200008f0 <main>: 1200008f0: 0000040f sync 0x10 1200008f4: 000004cf sync 0x13 1200008f8: 0000010f sync 0x4 1200008fc: 03e00008 jr ra 120000900: 0000102d move v0,zero ... [...] What's interesting to note is that rev. 2.60 of the architecture did actually change the semantics of plain SYNC (aka SYNC 0) from an ordering barrier to a completion barrier. Previous architectural requirements for plain SYNC were equivalent to rev. 2.60's SYNC_MB, and to implement a completion barrier a dummy load had to follow. Some implementers of the old plain SYNC did actually make it a completion rather than ordering barrier and people even argued this was an architectural requirement, as I recall from discussions in early 2000s. On the other hand the implementations affected were UP-only processors where about the only use for SYNC was to drain any write buffers of data intended for MMIO accesses and such an implementation did conform even if it was too heavyweight for architectural requirements. So I think it was a reasonable implementation decision, saving 1-2 instructions where a completion barrier was required. 
The only downside of this decision was some programmers assumed such semantics was universally guaranteed, while indeed it was not (before rev. 2.60). Overall where backwards compatibility is required it looks to me like we need to keep the implementation of any completion barriers (e.g. `iob') as a SYNC followed by a dummy load. Maciej linux-mips-no-sync.diff Index: linux/arch/mips/kernel/traps.c =================================================================== --- linux.orig/arch/mips/kernel/traps.c +++ linux/arch/mips/kernel/traps.c @@ -672,6 +672,7 @@ static int simulate_rdhwr_mm(struct pt_r return -1; } +#if 0 static int simulate_sync(struct pt_regs *regs, unsigned int opcode) { if ((opcode & OPCODE) == SPEC0 && (opcode & FUNC) == SYNC) { @@ -682,6 +683,7 @@ static int simulate_sync(struct pt_regs return -1; /* Must be something else ... */ } +#endif asmlinkage void do_ov(struct pt_regs *regs) { @@ -1117,8 +1119,10 @@ asmlinkage void do_ri(struct pt_regs *re if (status < 0) status = simulate_rdhwr_normal(regs, opcode); +#if 0 if (status < 0) status = simulate_sync(regs, opcode); +#endif if (status < 0) status = simulate_fp(regs, opcode, old_epc, old31); ^ permalink raw reply [flat|nested] 24+ messages in thread
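[Editorial note: the decode side of the `objdump -m` tip above can be sketched the same way -- a loose illustrative decoder (it does not verify that the reserved bits 25..11 are zero, as a real disassembler would):]

```python
def decode_sync(word: int):
    """Return the stype if the 32-bit instruction word looks like a
    SYNC (SPECIAL opcode, function field 0x0f), else None."""
    if (word >> 26) == 0 and (word & 0x3f) == 0x0f:
        return (word >> 6) & 0x1f  # stype lives in bits 10..6
    return None

# Words from the objdump listing above:
for w in (0x0000040f, 0x000004cf, 0x0000010f):
    print("0x%08x -> sync %#x" % (w, decode_sync(w)))
```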
* Re: [PATCH 1/3] MIPS: R6: Use lightweight SYNC instruction in smp_* memory barriers 2016-01-29 13:32 ` Maciej W. Rozycki @ 2016-01-30 16:25 ` Joshua Kinard 0 siblings, 0 replies; 24+ messages in thread From: Joshua Kinard @ 2016-01-30 16:25 UTC (permalink / raw) To: Maciej W. Rozycki, Ralf Baechle Cc: Leonid Yegoshin, linux-mips, benh, will.deacon, linux-kernel, markos.chandras, Steven.Hill, alexander.h.duyck, David S. Miller, Maciej W. Rozycki On 01/29/2016 08:32, Maciej W. Rozycki wrote: > On Wed, 27 Jan 2016, Joshua Kinard wrote: > >> On 06/05/2015 09:10, Ralf Baechle wrote: >>> >>> Maciej, >>> >>> do you have an R4000 / R4600 / R5000 / R7000 / SiByte system at hand to >>> test this? I think we don't need to test that SYNC actually works as >>> intended but the simpler test that SYNC <stype != 0> is not causing a >>> illegal instruction exception is sufficient, that is if something like >>> >>> int main(int argc, charg *argv[]) >>> { >>> asm(" .set mips2 \n" >>> " sync 0x10 \n" >>> " sync 0x13 \n" >>> " sync 0x04 \n" >>> " .set mips 0 \n"); >>> >>> return 0; >>> } >>> >>> doesn't crash we should be ok. > > No anomalies observed with these processors: > > CPU0 revision is: 00000440 (R4400SC) > CPU0 revision is: 01040102 (SiByte SB1) > >> I tried just compiling this on my SGI O2, which has an RM7000 CPU, and it is >> refusing to even compile (after fixing typos): >> >> # gcc -O2 -pipe sync_test.c -o sync_test >> {standard input}: Assembler messages: >> {standard input}:19: Error: invalid operands `sync 0x10' >> {standard input}:20: Error: invalid operands `sync 0x13' >> {standard input}:21: Error: invalid operands `sync 0x04' > > Yeah, there is another typo there: you need to use `.set mips32' or > suchlike rather than `.set mips2' to be able to specify an operand. That > probably counts as a bug in binutils, as -- according to what I have > observed in the other thread -- the `stype' field has been defined at > least since the MIPS IV ISA. 
> >> And the program compiles successfully and executes with no noticeable oddities >> or errors. Nothing in dmesg, no crashes, booms, or disappearance of small > > You did disable SYNC emulation in the kernel though with a change like > below, did you? > I did not...wasn't aware that the kernel emulates a lot of non-compatible instructions in traps.c. I applied the patch to both my IP30 and IP32 kernels, rebooted, and tested both forms of the test program (using the '.word' form and '.set mips32' variants with operands to 'sync'), and the binary appears to run w/o incident, as far as I can tell. Looks like at least the RM7K and R1x000 families handle operands to 'sync' w/o issue. I might be able to test an RM5200 box out within the next two weeks, once I finish building a netboot image to boot up a spare O2. >> kittens. I did a quick disassembly to make sure all three got emitted: >> >> 004005e0 <main>: >> 4005e0: 27bdfff8 addiu sp,sp,-8 >> 4005e4: afbe0004 sw s8,4(sp) >> 4005e8: 03a0f021 move s8,sp >> 4005ec: afc40008 sw a0,8(s8) >> 4005f0: afc5000c sw a1,12(s8) >>> 4005f4: 0000040f sync.p >>> 4005f8: 0000050f 0x50f >>> 4005fc: 0000010f 0x10f >> 400600: 00001021 move v0,zero > > You could have used the -m switch to override the architecture embedded > with the ELF file, to have the instructions disassembled correctly, e.g.: > > $ objdump -m mips:isa64 -d sync > [...] > 00000001200008f0 <main>: > 1200008f0: 0000040f sync 0x10 > 1200008f4: 000004cf sync 0x13 > 1200008f8: 0000010f sync 0x4 > 1200008fc: 03e00008 jr ra > 120000900: 0000102d move v0,zero > ... > [...] > > What's interesting to note is that rev. 2.60 of the architecture did > actually change the semantics of plain SYNC (aka SYNC 0) from an ordering > barrier to a completion barrier. Previous architectural requirements for > plain SYNC were equivalent to rev. 2.60's SYNC_MB, and to implement a > completion barrier a dummy load had to follow. 
> > Some implementers of the old plain SYNC did actually make it a completion > rather than ordering barrier and people even argued this was an > architectural requirement, as I recall from discussions in early 2000s. > On the other hand the implementations affected were UP-only processors > where about the only use for SYNC was to drain any write buffers of data > intended for MMIO accesses and such an implementation did conform even if > it was too heavyweight for architectural requirements. So I think it was > a reasonable implementation decision, saving 1-2 instructions where a > completion barrier was required. The only downside of this decision was > some programmers assumed such semantics was universally guaranteed, while > indeed it was not (before rev. 2.60). > > Overall where backwards compatibility is required it looks to me like we > need to keep the implementation of any completion barriers (e.g. `iob') as > a SYNC followed by a dummy load. > > Maciej > > linux-mips-no-sync.diff > Index: linux/arch/mips/kernel/traps.c > =================================================================== > --- linux.orig/arch/mips/kernel/traps.c > +++ linux/arch/mips/kernel/traps.c > @@ -672,6 +672,7 @@ static int simulate_rdhwr_mm(struct pt_r > return -1; > } > > +#if 0 > static int simulate_sync(struct pt_regs *regs, unsigned int opcode) > { > if ((opcode & OPCODE) == SPEC0 && (opcode & FUNC) == SYNC) { > @@ -682,6 +683,7 @@ static int simulate_sync(struct pt_regs > > return -1; /* Must be something else ... 
*/ > } > +#endif > > asmlinkage void do_ov(struct pt_regs *regs) > { > @@ -1117,8 +1119,10 @@ asmlinkage void do_ri(struct pt_regs *re > if (status < 0) > status = simulate_rdhwr_normal(regs, opcode); > > +#if 0 > if (status < 0) > status = simulate_sync(regs, opcode); > +#endif > > if (status < 0) > status = simulate_fp(regs, opcode, old_epc, old31); > > -- Joshua Kinard Gentoo/MIPS kumba@gentoo.org 6144R/F5C6C943 2015-04-27 177C 1972 1FB8 F254 BAD0 3E72 5C63 F4E3 F5C6 C943 "The past tempts us, the present confuses us, the future frightens us. And our lives slip away, moment by moment, lost in that vast, terrible in-between." --Emperor Turhan, Centauri Republic ^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH 2/3] MIPS: enforce LL-SC loop enclosing with SYNC (ACQUIRE and RELEASE) 2015-06-02 0:09 [PATCH 0/3] MIPS: SMP memory barriers: lightweight sync, acquire-release Leonid Yegoshin 2015-06-02 0:09 ` [PATCH 1/3] MIPS: R6: Use lightweight SYNC instruction in smp_* memory barriers Leonid Yegoshin @ 2015-06-02 0:09 ` Leonid Yegoshin 2015-06-02 11:39 ` James Hogan 2015-06-02 0:09 ` [PATCH 3/3] MIPS: bugfix - replace smp_mb with release barrier function in unlocks Leonid Yegoshin 2015-06-02 8:41 ` [PATCH 0/3] MIPS: SMP memory barriers: lightweight sync, acquire-release Joshua Kinard 3 siblings, 1 reply; 24+ messages in thread From: Leonid Yegoshin @ 2015-06-02 0:09 UTC (permalink / raw) To: linux-mips, benh, will.deacon, linux-kernel, ralf, markos.chandras, macro, Steven.Hill, alexander.h.duyck, davem Many MIPS32 R2 and all MIPS R6 CPUs are out of order execution, so it needs memory barriers in SMP environment. However, past cores may have a pipeline short enough to ignore that requirements and problem may never occurs until recently. This patch gives an option to enclose LL-SC loops by SYNC barriers in spinlocks, atomics, futexes, cmpxchg and bitops. So, this option is defined for MIPS32 R2 only, because that recent CPUs may occasionally have problems in accordance with HW team. And most of MIPS64 R2 vendor processors already have some kind of memory barrier and the only one generic 5KEs has a pretty short pipeline. Using memory barriers in MIPS R6 is mandatory, all that processors have a speculative memory read which can inflict a trouble without a correct use of barriers in LL-SC loop cycles. The same is actually for MIPS32 R5 I5600 processor. 
Signed-off-by: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com> --- arch/mips/Kconfig | 25 +++++++++++++++++++++++++ arch/mips/include/asm/barrier.h | 26 ++++++++++++++++++++++---- 2 files changed, 47 insertions(+), 4 deletions(-) diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig index c7d0cacece3d..676eb64f5545 100644 --- a/arch/mips/Kconfig +++ b/arch/mips/Kconfig @@ -1896,6 +1896,30 @@ config MIPS_LIGHTWEIGHT_SYNC converted to generic "SYNC 0". If you unsure, say N here, it may slightly decrease your performance + +config MIPS_ENFORCE_WEAK_REORDERING_BEYOND_LLSC + bool "Enforce memory barriers at LLSC loops - atomics, spinlocks etc" + depends on CPU_MIPS32_R2 + default y if CPU_MIPSR6 + select WEAK_REORDERING_BEYOND_LLSC + help + Many MIPS32 R2 and all MIPS R6 CPUs are out of order execution, so it + needs memory barriers in SMP environment. However, past cores may have + a pipeline short enough to ignore that requirements and problem may + never occurs until recently. + + So, this option is defined for MIPS32 R2 only, because that recent + CPUs may occasionally have problems in accordance with HW team. + And MIPS64 R2 vendor processors already have some kind of memory + barrier and the only one generic 5KEs has a pretty short pipeline. + + Using memory barriers in MIPS R6 is mandatory, all that + processors have a speculative memory read which can inflict a trouble + without a correct use of barriers in LL-SC loop cycles. + The same is actually for MIPS32 R5 I5600 processor. + + If you unsure, say Y here, it may slightly decrease your performance + but increase a reliability. 
endmenu # @@ -1924,6 +1948,7 @@ config CPU_MIPSR2 config CPU_MIPSR6 bool default y if CPU_MIPS32_R6 || CPU_MIPS64_R6 + select MIPS_ENFORCE_WEAK_REORDERING_BEYOND_LLSC select MIPS_SPRAM config EVA diff --git a/arch/mips/include/asm/barrier.h b/arch/mips/include/asm/barrier.h index d2a63abfc7c6..f3cc7a91ac0d 100644 --- a/arch/mips/include/asm/barrier.h +++ b/arch/mips/include/asm/barrier.h @@ -95,33 +95,51 @@ # define smp_mb() __sync() # define smp_rmb() barrier() # define smp_wmb() __syncw() +# define smp_acquire() __sync() +# define smp_release() __sync() # else # ifdef CONFIG_MIPS_LIGHTWEIGHT_SYNC # define smp_mb() __asm__ __volatile__("sync 0x10" : : :"memory") # define smp_rmb() __asm__ __volatile__("sync 0x13" : : :"memory") # define smp_wmb() __asm__ __volatile__("sync 0x4" : : :"memory") +# define smp_acquire() __asm__ __volatile__("sync 0x11" : : :"memory") +# define smp_release() __asm__ __volatile__("sync 0x12" : : :"memory") # else # define smp_mb() __asm__ __volatile__("sync" : : :"memory") # define smp_rmb() __asm__ __volatile__("sync" : : :"memory") # define smp_wmb() __asm__ __volatile__("sync" : : :"memory") +# define smp_acquire() __asm__ __volatile__("sync" : : :"memory") +# define smp_release() __asm__ __volatile__("sync" : : :"memory") # endif # endif #else #define smp_mb() barrier() #define smp_rmb() barrier() #define smp_wmb() barrier() +#define smp_acquire() barrier() +#define smp_release() barrier() #endif #if defined(CONFIG_WEAK_REORDERING_BEYOND_LLSC) && defined(CONFIG_SMP) +#ifdef CONFIG_MIPS_LIGHTWEIGHT_SYNC +#define __WEAK_LLSC_MB " sync 0x10 \n" +#define __WEAK_ACQUIRE " sync 0x11 \n" +#define __WEAK_RELEASE " sync 0x12 \n" +#else #define __WEAK_LLSC_MB " sync \n" +#define __WEAK_ACQUIRE __WEAK_LLSC_MB +#define __WEAK_RELEASE __WEAK_LLSC_MB +#endif #else #define __WEAK_LLSC_MB " \n" +#define __WEAK_ACQUIRE __WEAK_LLSC_MB +#define __WEAK_RELEASE __WEAK_LLSC_MB #endif #define set_mb(var, value) \ do { var = value; smp_mb(); } while (0) 
-#define smp_llsc_mb() __asm__ __volatile__(__WEAK_LLSC_MB : : :"memory") +#define smp_llsc_mb() __asm__ __volatile__(__WEAK_ACQUIRE : : :"memory") #ifdef CONFIG_CPU_CAVIUM_OCTEON #define smp_mb__before_llsc() smp_wmb() @@ -131,14 +149,14 @@ "syncw\n\t" \ ".set pop" : : : "memory") #else -#define smp_mb__before_llsc() smp_llsc_mb() +#define smp_mb__before_llsc() __asm__ __volatile__(__WEAK_RELEASE : : :"memory") #define nudge_writes() mb() #endif #define smp_store_release(p, v) \ do { \ compiletime_assert_atomic_type(*p); \ - smp_mb(); \ + smp_release(); \ ACCESS_ONCE(*p) = (v); \ } while (0) @@ -146,7 +164,7 @@ do { \ ({ \ typeof(*p) ___p1 = ACCESS_ONCE(*p); \ compiletime_assert_atomic_type(*p); \ - smp_mb(); \ + smp_acquire(); \ ___p1; \ }) ^ permalink raw reply related [flat|nested] 24+ messages in thread
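[Editorial note: the stype values hard-coded in the barrier.h hunk above line up with the named lightweight SYNC variants defined by the MIPS32r2 architecture -- the same operations binutils >= 2.21 accepts as `sync_mb`, `sync_rmb`, etc., as discussed earlier in the thread. A small sketch of the mapping; the SYNC_* names come from the architecture manual, not from this patch:]

```python
# Mapping used by the patch (CONFIG_MIPS_LIGHTWEIGHT_SYNC case);
# the SYNC_* names are the architecture's, added for readability.
BARRIER_STYPE = {
    "smp_mb":      ("SYNC_MB",      0x10),  # full ordering barrier
    "smp_rmb":     ("SYNC_RMB",     0x13),  # load-load ordering
    "smp_wmb":     ("SYNC_WMB",     0x04),  # store-store ordering
    "smp_acquire": ("SYNC_ACQUIRE", 0x11),
    "smp_release": ("SYNC_RELEASE", 0x12),
}

# Every stype must fit the 5-bit field; cores that do not implement a
# given variant are architecturally required to treat it as SYNC 0
# (a stronger barrier), which is why the fallback branch of the patch
# simply emits plain "sync".
assert all(0 < stype < 32 for _, stype in BARRIER_STYPE.values())

for macro, (name, stype) in BARRIER_STYPE.items():
    print('%-12s -> "sync %#x"  # %s' % (macro, stype, name))
```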
* Re: [PATCH 2/3] MIPS: enforce LL-SC loop enclosing with SYNC (ACQUIRE and RELEASE) 2015-06-02 0:09 ` [PATCH 2/3] MIPS: enforce LL-SC loop enclosing with SYNC (ACQUIRE and RELEASE) Leonid Yegoshin @ 2015-06-02 11:39 ` James Hogan 0 siblings, 0 replies; 24+ messages in thread From: James Hogan @ 2015-06-02 11:39 UTC (permalink / raw) To: Leonid Yegoshin, linux-mips, benh, will.deacon, linux-kernel, ralf, markos.chandras, macro, Steven.Hill, alexander.h.duyck, davem [-- Attachment #1: Type: text/plain, Size: 7978 bytes --] Hi Leonid, On 02/06/15 01:09, Leonid Yegoshin wrote: > Many MIPS32 R2 and all MIPS R6 CPUs are out of order execution, so it > needs memory barriers in SMP environment. However, past cores may have > a pipeline short enough to ignore that requirements and problem may > never occurs until recently. > > This patch gives an option to enclose LL-SC loops by SYNC barriers in spinlocks, > atomics, futexes, cmpxchg and bitops. Please reflow text. > > So, this option is defined for MIPS32 R2 only, because that recent s/that/those/ > CPUs may occasionally have problems in accordance with HW team. "have problems in accordance with HW team" is a bit confusing. What do you mean? > And most of MIPS64 R2 vendor processors already have some kind of memory > barrier and the only one generic 5KEs has a pretty short pipeline. > > Using memory barriers in MIPS R6 is mandatory, all that s/that/those/ > processors have a speculative memory read which can inflict a trouble s/a // > without a correct use of barriers in LL-SC loop cycles. > The same is actually for MIPS32 R5 I5600 processor. Actually *true*? P5600 you mean? Same in Kconfig help text. 
> > Signed-off-by: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com> > --- > arch/mips/Kconfig | 25 +++++++++++++++++++++++++ > arch/mips/include/asm/barrier.h | 26 ++++++++++++++++++++++---- > 2 files changed, 47 insertions(+), 4 deletions(-) > > diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig > index c7d0cacece3d..676eb64f5545 100644 > --- a/arch/mips/Kconfig > +++ b/arch/mips/Kconfig > @@ -1896,6 +1896,30 @@ config MIPS_LIGHTWEIGHT_SYNC > converted to generic "SYNC 0". > > If you unsure, say N here, it may slightly decrease your performance > + > +config MIPS_ENFORCE_WEAK_REORDERING_BEYOND_LLSC > + bool "Enforce memory barriers at LLSC loops - atomics, spinlocks etc" > + depends on CPU_MIPS32_R2 > + default y if CPU_MIPSR6 > + select WEAK_REORDERING_BEYOND_LLSC > + help > + Many MIPS32 R2 and all MIPS R6 CPUs are out of order execution, so it > + needs memory barriers in SMP environment. However, past cores may have > + a pipeline short enough to ignore that requirements and problem may > + never occurs until recently. > + > + So, this option is defined for MIPS32 R2 only, because that recent > + CPUs may occasionally have problems in accordance with HW team. > + And MIPS64 R2 vendor processors already have some kind of memory > + barrier and the only one generic 5KEs has a pretty short pipeline. > + > + Using memory barriers in MIPS R6 is mandatory, all that > + processors have a speculative memory read which can inflict a trouble > + without a correct use of barriers in LL-SC loop cycles. > + The same is actually for MIPS32 R5 I5600 processor. > + > + If you unsure, say Y here, it may slightly decrease your performance If you *are* unsure (same in patch 1). Same comment as patch 1 too about ambiguity of "it". > + but increase a reliability. well only if the hardware has weak ordering, and only because it would be technically wrong in that case to say N here. It feels wrong to be giving the user this option. 
Can't we just select WEAK_REORDERING_BEYOND_LLSC automatically based on the hardware that needs to be supported by the kernel configuration (e.g. CPU_MIPSR6 or CPU_MIPS32_R2)? Those who care about mips r2 performance on hardware which doesn't strictly need it can always speak up / add an exception. Out of interest, are futex operations safe with weak llsc ordering, on the premise that they're mainly used by userland so ordering with normal kernel accesses just doesn't matter in practice? > endmenu > > # > @@ -1924,6 +1948,7 @@ config CPU_MIPSR2 > config CPU_MIPSR6 > bool > default y if CPU_MIPS32_R6 || CPU_MIPS64_R6 > + select MIPS_ENFORCE_WEAK_REORDERING_BEYOND_LLSC > select MIPS_SPRAM > > config EVA > diff --git a/arch/mips/include/asm/barrier.h b/arch/mips/include/asm/barrier.h > index d2a63abfc7c6..f3cc7a91ac0d 100644 > --- a/arch/mips/include/asm/barrier.h > +++ b/arch/mips/include/asm/barrier.h > @@ -95,33 +95,51 @@ > # define smp_mb() __sync() > # define smp_rmb() barrier() > # define smp_wmb() __syncw() > +# define smp_acquire() __sync() > +# define smp_release() __sync() > # else > # ifdef CONFIG_MIPS_LIGHTWEIGHT_SYNC > # define smp_mb() __asm__ __volatile__("sync 0x10" : : :"memory") > # define smp_rmb() __asm__ __volatile__("sync 0x13" : : :"memory") > # define smp_wmb() __asm__ __volatile__("sync 0x4" : : :"memory") > +# define smp_acquire() __asm__ __volatile__("sync 0x11" : : :"memory") > +# define smp_release() __asm__ __volatile__("sync 0x12" : : :"memory") same question about use of sync_acquire/sync_release instructions which binutils understands since 2.21. 
> # else > # define smp_mb() __asm__ __volatile__("sync" : : :"memory") > # define smp_rmb() __asm__ __volatile__("sync" : : :"memory") > # define smp_wmb() __asm__ __volatile__("sync" : : :"memory") > +# define smp_acquire() __asm__ __volatile__("sync" : : :"memory") > +# define smp_release() __asm__ __volatile__("sync" : : :"memory") > # endif > # endif > #else > #define smp_mb() barrier() > #define smp_rmb() barrier() > #define smp_wmb() barrier() > +#define smp_acquire() barrier() > +#define smp_release() barrier() > #endif > > #if defined(CONFIG_WEAK_REORDERING_BEYOND_LLSC) && defined(CONFIG_SMP) > +#ifdef CONFIG_MIPS_LIGHTWEIGHT_SYNC > +#define __WEAK_LLSC_MB " sync 0x10 \n" > +#define __WEAK_ACQUIRE " sync 0x11 \n" > +#define __WEAK_RELEASE " sync 0x12 \n" tabs __WEAK_ACQUIRE and __WEAK_RELEASE are specific to llsc, maybe call them __WEAK_LLSC_ACQUIRE and __WEAK_LLSC_RELEASE instead to avoid confusion. > +#else > #define __WEAK_LLSC_MB " sync \n" > +#define __WEAK_ACQUIRE __WEAK_LLSC_MB > +#define __WEAK_RELEASE __WEAK_LLSC_MB tabs > +#endif > #else > #define __WEAK_LLSC_MB " \n" > +#define __WEAK_ACQUIRE __WEAK_LLSC_MB > +#define __WEAK_RELEASE __WEAK_LLSC_MB tabs > #endif > > #define set_mb(var, value) \ > do { var = value; smp_mb(); } while (0) > > -#define smp_llsc_mb() __asm__ __volatile__(__WEAK_LLSC_MB : : :"memory") > +#define smp_llsc_mb() __asm__ __volatile__(__WEAK_ACQUIRE : : :"memory") tabs > > #ifdef CONFIG_CPU_CAVIUM_OCTEON > #define smp_mb__before_llsc() smp_wmb() > @@ -131,14 +149,14 @@ > "syncw\n\t" \ > ".set pop" : : : "memory") > #else > -#define smp_mb__before_llsc() smp_llsc_mb() > +#define smp_mb__before_llsc() __asm__ __volatile__(__WEAK_RELEASE : : :"memory") > #define nudge_writes() mb() > #endif > > #define smp_store_release(p, v) \ > do { \ > compiletime_assert_atomic_type(*p); \ > - smp_mb(); \ > + smp_release(); \ > ACCESS_ONCE(*p) = (v); \ > } while (0) > > @@ -146,7 +164,7 @@ do { \ > ({ \ > typeof(*p) ___p1 = ACCESS_ONCE(*p); 
\ > compiletime_assert_atomic_type(*p); \ > - smp_mb(); \ > + smp_acquire(); \ > ___p1; \ > }) > > > This patch does 3 logically separable things: 1) add smp_release/smp_acquire based on MIPS_LIGHTWEIGHT_SYNC and weaken smp_store_release()/smp_load_acquire() to use them. 2) weaken llsc barriers when MIPS_LIGHTWEIGHT_SYNC. 3) the MIPS_ENFORCE_WEAK_REORDERING_BEYOND_LLSC Kconfig stuff (or whatever method to select WEAK_REORDERING_BEYOND_LLSC more often). Any reason not to split them, and give a clear description of each? Cheers James [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 819 bytes --] ^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH 3/3] MIPS: bugfix - replace smp_mb with release barrier function in unlocks 2015-06-02 0:09 [PATCH 0/3] MIPS: SMP memory barriers: lightweight sync, acquire-release Leonid Yegoshin 2015-06-02 0:09 ` [PATCH 1/3] MIPS: R6: Use lightweight SYNC instruction in smp_* memory barriers Leonid Yegoshin 2015-06-02 0:09 ` [PATCH 2/3] MIPS: enforce LL-SC loop enclosing with SYNC (ACQUIRE and RELEASE) Leonid Yegoshin @ 2015-06-02 0:09 ` Leonid Yegoshin 2015-06-02 11:42 ` James Hogan 2015-06-02 8:41 ` [PATCH 0/3] MIPS: SMP memory barriers: lightweight sync, acquire-release Joshua Kinard 3 siblings, 1 reply; 24+ messages in thread From: Leonid Yegoshin @ 2015-06-02 0:09 UTC (permalink / raw) To: linux-mips, benh, will.deacon, linux-kernel, ralf, markos.chandras, macro, Steven.Hill, alexander.h.duyck, davem Repleace smp_mb() in arch_write_unlock() and __clear_bit_unlock() to smp_mb__before_llsc() call which does "release" barrier functionality. It seems like it was missed in commit f252ffd50c97dae87b45f1dbad24f71358ccfbd6 during introduction of "acquire" and "release" semantics. 
Signed-off-by: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com> --- arch/mips/include/asm/bitops.h | 2 +- arch/mips/include/asm/spinlock.h | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/mips/include/asm/bitops.h b/arch/mips/include/asm/bitops.h index 0cf29bd5dc5c..ce9666cf1499 100644 --- a/arch/mips/include/asm/bitops.h +++ b/arch/mips/include/asm/bitops.h @@ -469,7 +469,7 @@ static inline int test_and_change_bit(unsigned long nr, */ static inline void __clear_bit_unlock(unsigned long nr, volatile unsigned long *addr) { - smp_mb(); + smp_mb__before_llsc(); __clear_bit(nr, addr); } diff --git a/arch/mips/include/asm/spinlock.h b/arch/mips/include/asm/spinlock.h index 1fca2e0793dc..7c7f3b2bd3de 100644 --- a/arch/mips/include/asm/spinlock.h +++ b/arch/mips/include/asm/spinlock.h @@ -317,7 +317,7 @@ static inline void arch_write_lock(arch_rwlock_t *rw) static inline void arch_write_unlock(arch_rwlock_t *rw) { - smp_mb(); + smp_mb__before_llsc(); __asm__ __volatile__( " # arch_write_unlock \n" ^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH 3/3] MIPS: bugfix - replace smp_mb with release barrier function in unlocks 2015-06-02 0:09 ` [PATCH 3/3] MIPS: bugfix - replace smp_mb with release barrier function in unlocks Leonid Yegoshin @ 2015-06-02 11:42 ` James Hogan 2015-06-02 13:22 ` Ralf Baechle 0 siblings, 1 reply; 24+ messages in thread From: James Hogan @ 2015-06-02 11:42 UTC (permalink / raw) To: Leonid Yegoshin, linux-mips, benh, will.deacon, linux-kernel, ralf, markos.chandras, macro, Steven.Hill, alexander.h.duyck, davem [-- Attachment #1: Type: text/plain, Size: 1757 bytes --] On 02/06/15 01:09, Leonid Yegoshin wrote: > Repleace smp_mb() in arch_write_unlock() and __clear_bit_unlock() to Replace. > smp_mb__before_llsc() call which does "release" barrier functionality. > > It seems like it was missed in commit f252ffd50c97dae87b45f1dbad24f71358ccfbd6 > during introduction of "acquire" and "release" semantics. > > Signed-off-by: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com> > --- > arch/mips/include/asm/bitops.h | 2 +- > arch/mips/include/asm/spinlock.h | 2 +- > 2 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/arch/mips/include/asm/bitops.h b/arch/mips/include/asm/bitops.h > index 0cf29bd5dc5c..ce9666cf1499 100644 > --- a/arch/mips/include/asm/bitops.h > +++ b/arch/mips/include/asm/bitops.h > @@ -469,7 +469,7 @@ static inline int test_and_change_bit(unsigned long nr, > */ > static inline void __clear_bit_unlock(unsigned long nr, volatile unsigned long *addr) > { > - smp_mb(); > + smp_mb__before_llsc(); > __clear_bit(nr, addr); > } > > diff --git a/arch/mips/include/asm/spinlock.h b/arch/mips/include/asm/spinlock.h > index 1fca2e0793dc..7c7f3b2bd3de 100644 > --- a/arch/mips/include/asm/spinlock.h > +++ b/arch/mips/include/asm/spinlock.h > @@ -317,7 +317,7 @@ static inline void arch_write_lock(arch_rwlock_t *rw) > > static inline void arch_write_unlock(arch_rwlock_t *rw) > { > - smp_mb(); > + smp_mb__before_llsc(); arch_write_unlock appears to just use sw, not sc, and 
__clear_bit appears to be implemented in plain C, so is smp_mb__before_llsc() really appropriate? Would smp_release() be more accurate/correct in both cases? Cheers James > > __asm__ __volatile__( > " # arch_write_unlock \n" > > [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 819 bytes --] ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 3/3] MIPS: bugfix - replace smp_mb with release barrier function in unlocks 2015-06-02 11:42 ` James Hogan @ 2015-06-02 13:22 ` Ralf Baechle 0 siblings, 0 replies; 24+ messages in thread From: Ralf Baechle @ 2015-06-02 13:22 UTC (permalink / raw) To: James Hogan Cc: Leonid Yegoshin, linux-mips, benh, will.deacon, linux-kernel, markos.chandras, macro, Steven.Hill, alexander.h.duyck, davem On Tue, Jun 02, 2015 at 12:42:40PM +0100, James Hogan wrote: > Replace. > > > smp_mb__before_llsc() call which does "release" barrier functionality. > > > > It seems like it was missed in commit f252ffd50c97dae87b45f1dbad24f71358ccfbd6 > > during introduction of "acquire" and "release" semantics. > > > > Signed-off-by: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com> > > --- > > arch/mips/include/asm/bitops.h | 2 +- > > arch/mips/include/asm/spinlock.h | 2 +- > > 2 files changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/arch/mips/include/asm/bitops.h b/arch/mips/include/asm/bitops.h > > index 0cf29bd5dc5c..ce9666cf1499 100644 > > --- a/arch/mips/include/asm/bitops.h > > +++ b/arch/mips/include/asm/bitops.h > > @@ -469,7 +469,7 @@ static inline int test_and_change_bit(unsigned long nr, > > */ > > static inline void __clear_bit_unlock(unsigned long nr, volatile unsigned long *addr) > > { > > - smp_mb(); > > + smp_mb__before_llsc(); > > __clear_bit(nr, addr); > > } > > > > diff --git a/arch/mips/include/asm/spinlock.h b/arch/mips/include/asm/spinlock.h > > index 1fca2e0793dc..7c7f3b2bd3de 100644 > > --- a/arch/mips/include/asm/spinlock.h > > +++ b/arch/mips/include/asm/spinlock.h > > @@ -317,7 +317,7 @@ static inline void arch_write_lock(arch_rwlock_t *rw) > > > > static inline void arch_write_unlock(arch_rwlock_t *rw) > > { > > - smp_mb(); > > + smp_mb__before_llsc(); > > arch_write_unlock appears to just use sw, not sc, and __clear_bit > appears to be implemented in plain C, so is smp_mb__before_llsc() really > appropriate? 
Would smp_release() be more accurate/correct in both cases? Yes to both questions. Ralf ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 0/3] MIPS: SMP memory barriers: lightweight sync, acquire-release 2015-06-02 0:09 [PATCH 0/3] MIPS: SMP memory barriers: lightweight sync, acquire-release Leonid Yegoshin ` (2 preceding siblings ...) 2015-06-02 0:09 ` [PATCH 3/3] MIPS: bugfix - replace smp_mb with release barrier function in unlocks Leonid Yegoshin @ 2015-06-02 8:41 ` Joshua Kinard 2015-06-02 9:59 ` Ralf Baechle 3 siblings, 1 reply; 24+ messages in thread From: Joshua Kinard @ 2015-06-02 8:41 UTC (permalink / raw) To: Leonid Yegoshin, linux-mips, will.deacon, linux-kernel, ralf Cc: benh, markos.chandras, macro, Steven.Hill, alexander.h.duyck, davem On 06/01/2015 20:09, Leonid Yegoshin wrote: > The following series implements lightweight SYNC memory barriers for SMP Linux > and a correct use of SYNCs around atomics, futexes, spinlocks etc LL-SC loops - > the basic building blocks of any atomics in MIPS. > > Historically, a generic MIPS doesn't use memory barriers around LL-SC loops in > atomics, spinlocks etc. However, Architecture documents never specify that LL-SC > loop creates a memory barrier. Some non-generic MIPS vendors already feel > the pain and enforces it. With introduction in a recent out-of-order superscalar > MIPS processors an aggressive speculative memory read it is a problem now. > > The generic MIPS memory barrier instruction SYNC (aka SYNC 0) is something > very heavvy because it was designed for propogating barrier down to memory. > MIPS R2 introduced lightweight SYNC instructions which correspond to smp_*() > set of SMP barriers. The description was very HW-specific and it was never > used, however, it is much less trouble for processor pipelines and can be used > in smp_mb()/smp_rmb()/smp_wmb() as is as in acquire/release barrier semantics. > After prolonged discussions with HW team it became clear that lightweight SYNCs > were designed specifically with smp_*() in mind but description is in timeline > ordering space. 
> > So, the problem was spotted recently in engineering tests and it was confirmed > with tests that without memory barrier load and store may pass LL/SC > instructions in both directions, even in old MIPS R2 processors. > Aggressive speculation in MIPS R6 and MIPS I5600 processors adds more fire to > this issue. > > 3 patches introduces a configurable control for lightweight SYNCs around LL/SC > loops and for MIPS32 R2 it was allowed to choose an enforcing SYNCs or not > (keep as is) because some old MIPS32 R2 may be happy without that SYNCs. > In MIPS R6 I chose to have SYNC around LL/SC mandatory because all of that > processors have an agressive speculation and delayed write buffers. In that > processors series it is still possible the use of SYNC 0 instead of > lightweight SYNCs in configuration - just in case of some trouble in > implementation in specific CPU. However, it is considered safe do not implement > some or any lightweight SYNC in specific core because Architecture requires > HW map of unimplemented SYNCs to SYNC 0. How useful might this be for older hardware, such as the R10k CPUs? Just fallbacks to the old sync insn? --J ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 0/3] MIPS: SMP memory barriers: lightweight sync, acquire-release 2015-06-02 8:41 ` [PATCH 0/3] MIPS: SMP memory barriers: lightweight sync, acquire-release Joshua Kinard @ 2015-06-02 9:59 ` Ralf Baechle 2015-06-02 18:59 ` Joshua Kinard 0 siblings, 1 reply; 24+ messages in thread From: Ralf Baechle @ 2015-06-02 9:59 UTC (permalink / raw) To: Joshua Kinard Cc: Leonid Yegoshin, linux-mips, will.deacon, linux-kernel, benh, markos.chandras, macro, Steven.Hill, alexander.h.duyck, davem On Tue, Jun 02, 2015 at 04:41:21AM -0400, Joshua Kinard wrote: > On 06/01/2015 20:09, Leonid Yegoshin wrote: > > The following series implements lightweight SYNC memory barriers for SMP Linux > > and a correct use of SYNCs around atomics, futexes, spinlocks etc LL-SC loops - > > the basic building blocks of any atomics in MIPS. > > > > Historically, a generic MIPS doesn't use memory barriers around LL-SC loops in > > atomics, spinlocks etc. However, Architecture documents never specify that LL-SC > > loop creates a memory barrier. Some non-generic MIPS vendors already feel > > the pain and enforces it. With introduction in a recent out-of-order superscalar > > MIPS processors an aggressive speculative memory read it is a problem now. > > > > The generic MIPS memory barrier instruction SYNC (aka SYNC 0) is something > > very heavvy because it was designed for propogating barrier down to memory. > > MIPS R2 introduced lightweight SYNC instructions which correspond to smp_*() > > set of SMP barriers. The description was very HW-specific and it was never > > used, however, it is much less trouble for processor pipelines and can be used > > in smp_mb()/smp_rmb()/smp_wmb() as is as in acquire/release barrier semantics. > > After prolonged discussions with HW team it became clear that lightweight SYNCs > > were designed specifically with smp_*() in mind but description is in timeline > > ordering space. 
> > > > So, the problem was spotted recently in engineering tests and it was confirmed > > with tests that without memory barrier load and store may pass LL/SC > > instructions in both directions, even in old MIPS R2 processors. > > Aggressive speculation in MIPS R6 and MIPS I5600 processors adds more fire to > > this issue. > > > > 3 patches introduces a configurable control for lightweight SYNCs around LL/SC > > loops and for MIPS32 R2 it was allowed to choose an enforcing SYNCs or not > > (keep as is) because some old MIPS32 R2 may be happy without that SYNCs. > > In MIPS R6 I chose to have SYNC around LL/SC mandatory because all of that > > processors have an agressive speculation and delayed write buffers. In that > > processors series it is still possible the use of SYNC 0 instead of > > lightweight SYNCs in configuration - just in case of some trouble in > > implementation in specific CPU. However, it is considered safe do not implement > > some or any lightweight SYNC in specific core because Architecture requires > > HW map of unimplemented SYNCs to SYNC 0. > > How useful might this be for older hardware, such as the R10k CPUs? Just > fallbacks to the old sync insn? The R10000 family is strongly ordered so there is no SYNC instruction required in the entire kernel even though some Origin hardware documentation incorrectly claims otherwise. Ralf ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 0/3] MIPS: SMP memory barriers: lightweight sync, acquire-release 2015-06-02 9:59 ` Ralf Baechle @ 2015-06-02 18:59 ` Joshua Kinard 2015-06-02 19:19 ` Ralf Baechle 0 siblings, 1 reply; 24+ messages in thread From: Joshua Kinard @ 2015-06-02 18:59 UTC (permalink / raw) To: Ralf Baechle Cc: Leonid Yegoshin, linux-mips, will.deacon, linux-kernel, benh, markos.chandras, macro, Steven.Hill, alexander.h.duyck, davem On 06/02/2015 05:59, Ralf Baechle wrote: > On Tue, Jun 02, 2015 at 04:41:21AM -0400, Joshua Kinard wrote: > >> On 06/01/2015 20:09, Leonid Yegoshin wrote: >>> The following series implements lightweight SYNC memory barriers for SMP Linux >>> and a correct use of SYNCs around atomics, futexes, spinlocks etc LL-SC loops - >>> the basic building blocks of any atomics in MIPS. >>> >>> Historically, a generic MIPS doesn't use memory barriers around LL-SC loops in >>> atomics, spinlocks etc. However, Architecture documents never specify that LL-SC >>> loop creates a memory barrier. Some non-generic MIPS vendors already feel >>> the pain and enforces it. With introduction in a recent out-of-order superscalar >>> MIPS processors an aggressive speculative memory read it is a problem now. >>> >>> The generic MIPS memory barrier instruction SYNC (aka SYNC 0) is something >>> very heavvy because it was designed for propogating barrier down to memory. >>> MIPS R2 introduced lightweight SYNC instructions which correspond to smp_*() >>> set of SMP barriers. The description was very HW-specific and it was never >>> used, however, it is much less trouble for processor pipelines and can be used >>> in smp_mb()/smp_rmb()/smp_wmb() as is as in acquire/release barrier semantics. >>> After prolonged discussions with HW team it became clear that lightweight SYNCs >>> were designed specifically with smp_*() in mind but description is in timeline >>> ordering space. 
>>> >>> So, the problem was spotted recently in engineering tests and it was confirmed >>> with tests that without memory barrier load and store may pass LL/SC >>> instructions in both directions, even in old MIPS R2 processors. >>> Aggressive speculation in MIPS R6 and MIPS I5600 processors adds more fire to >>> this issue. >>> >>> 3 patches introduces a configurable control for lightweight SYNCs around LL/SC >>> loops and for MIPS32 R2 it was allowed to choose an enforcing SYNCs or not >>> (keep as is) because some old MIPS32 R2 may be happy without that SYNCs. >>> In MIPS R6 I chose to have SYNC around LL/SC mandatory because all of that >>> processors have an agressive speculation and delayed write buffers. In that >>> processors series it is still possible the use of SYNC 0 instead of >>> lightweight SYNCs in configuration - just in case of some trouble in >>> implementation in specific CPU. However, it is considered safe do not implement >>> some or any lightweight SYNC in specific core because Architecture requires >>> HW map of unimplemented SYNCs to SYNC 0. >> >> How useful might this be for older hardware, such as the R10k CPUs? Just >> fallbacks to the old sync insn? > > The R10000 family is strongly ordered so there is no SYNC instruction > required in the entire kernel even though some Origin hardware documentation > incorrectly claims otherwise. So no benefits even in the speculative execution case on noncoherent hardware like IP28 and IP32? --J ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 0/3] MIPS: SMP memory barriers: lightweight sync, acquire-release 2015-06-02 18:59 ` Joshua Kinard @ 2015-06-02 19:19 ` Ralf Baechle 0 siblings, 0 replies; 24+ messages in thread From: Ralf Baechle @ 2015-06-02 19:19 UTC (permalink / raw) To: Joshua Kinard Cc: Leonid Yegoshin, linux-mips, will.deacon, linux-kernel, benh, markos.chandras, macro, Steven.Hill, alexander.h.duyck, davem On Tue, Jun 02, 2015 at 02:59:38PM -0400, Joshua Kinard wrote: > >> How useful might this be for older hardware, such as the R10k CPUs? Just > >> fallbacks to the old sync insn? > > > > The R10000 family is strongly ordered so there is no SYNC instruction > > required in the entire kernel even though some Origin hardware documentation > > incorrectly claims otherwise. > > So no benefits even in the speculative execution case on noncoherent hardware > like IP28 and IP32? That's handled entirely differently by using a CACHE BARRIER instruction, something which is specific to the R10000-family. It's also used differently by putting one such instruction at the end of every basic block that might result in speculatively dirty cache lines. Note that these systems affected by this speculation issue are all non-coherent uniprocessor systems while Leonid's patch matters for SMP kernels; the primitives he's changed will not generate any code for a !CONFIG_SMP kernel. Ralf ^ permalink raw reply [flat|nested] 24+ messages in thread
end of thread, other threads:[~2016-01-30 16:34 UTC | newest] Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2015-06-02 0:09 [PATCH 0/3] MIPS: SMP memory barriers: lightweight sync, acquire-release Leonid Yegoshin 2015-06-02 0:09 ` [PATCH 1/3] MIPS: R6: Use lightweight SYNC instruction in smp_* memory barriers Leonid Yegoshin 2015-06-02 10:08 ` Paul Burton 2015-06-02 12:12 ` Luc Van Oostenryck 2015-06-02 12:44 ` Ralf Baechle 2015-06-02 18:20 ` Leonid Yegoshin 2015-06-02 10:48 ` James Hogan 2015-06-02 16:15 ` Maciej W. Rozycki 2015-06-02 23:56 ` David Daney 2015-06-03 1:56 ` Leonid Yegoshin 2015-06-05 13:10 ` Ralf Baechle 2015-06-05 21:18 ` Maciej W. Rozycki 2016-01-28 2:28 ` Joshua Kinard 2016-01-29 13:32 ` Maciej W. Rozycki 2016-01-30 16:25 ` Joshua Kinard 2015-06-02 0:09 ` [PATCH 2/3] MIPS: enforce LL-SC loop enclosing with SYNC (ACQUIRE and RELEASE) Leonid Yegoshin 2015-06-02 11:39 ` James Hogan 2015-06-02 0:09 ` [PATCH 3/3] MIPS: bugfix - replace smp_mb with release barrier function in unlocks Leonid Yegoshin 2015-06-02 11:42 ` James Hogan 2015-06-02 13:22 ` Ralf Baechle 2015-06-02 8:41 ` [PATCH 0/3] MIPS: SMP memory barriers: lightweight sync, acquire-release Joshua Kinard 2015-06-02 9:59 ` Ralf Baechle 2015-06-02 18:59 ` Joshua Kinard 2015-06-02 19:19 ` Ralf Baechle