All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/4] ARM: Change the mandatory barriers implementation
@ 2010-02-23 11:01 Catalin Marinas
  2010-02-23 11:10 ` Russell King - ARM Linux
                   ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Catalin Marinas @ 2010-02-23 11:01 UTC (permalink / raw)
  To: linux-arm-kernel

The mandatory barriers (mb, rmb, wmb) are used even on uniprocessor
systems for things like ordering Normal Non-cacheable memory accesses
with DMA transfer (via Device memory writes). The current implementation
uses dmb() for mb() and friends but this is not sufficient. The DMB only
ensures the ordering of accesses with regards to a single observer
accessing the same memory. If a DMA transfer is started by a write to
Device memory, the data to be transfered may not reach the main memory
(even if mapped as Normal Non-cacheable) before the device receives the
notification to begin the transfer. The only barrier that would help in
this situation is DSB which would completely drain the write buffers.

The patch also adds support for platform-defined barriers that can be
defined in mach/barriers.h. This is required by at least two platforms -
MSM and RealView (possible OMAP as well). On RealView with an outer
cache (PL310 for example) stores to Normal Non-cacheable memory are
buffered by the outer cache but the DSB doesn't go as far as this. A
separate L2x0 sync command is required (a store to Strongly Ordered
memory would do as well, similar to the MSM requirements and maybe
faster).

Note that the SMP barriers are not affected as they only deal with
ordering in Normal memory. There is however a situation with the use of
IPIs. A DMB is not enough to ensure that a write to Normal memory is
strictly ordered with respect to the IPI generation (and interrupt
handling). A solution is for the users of smp_call_function() to use a
mandatory barrier.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Daniel Walker <dwalker@codeaurora.org>
Cc: Larry Bassel <lbassel@quicinc.com>
Cc: Tony Lindgren <tony@atomide.com>
---
 arch/arm/include/asm/system.h |   18 ++++++++----------
 arch/arm/mm/Kconfig           |    6 ++++++
 2 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/arch/arm/include/asm/system.h b/arch/arm/include/asm/system.h
index 058e7e9..477861d 100644
--- a/arch/arm/include/asm/system.h
+++ b/arch/arm/include/asm/system.h
@@ -138,14 +138,12 @@ extern unsigned int user_debug;
 #define dmb() __asm__ __volatile__ ("" : : : "memory")
 #endif
 
-#if __LINUX_ARM_ARCH__ >= 7 || defined(CONFIG_SMP)
-#define mb()		dmb()
-#define rmb()		dmb()
-#define wmb()		dmb()
+#ifdef CONFIG_ARCH_HAS_BARRIERS
+#include <mach/barriers.h>
 #else
-#define mb()	do { if (arch_is_coherent()) dmb(); else barrier(); } while (0)
-#define rmb()	do { if (arch_is_coherent()) dmb(); else barrier(); } while (0)
-#define wmb()	do { if (arch_is_coherent()) dmb(); else barrier(); } while (0)
+#define mb()		dsb()
+#define rmb()		dmb()
+#define wmb()		dsb()
 #endif
 
 #ifndef CONFIG_SMP
@@ -153,9 +151,9 @@ extern unsigned int user_debug;
 #define smp_rmb()	barrier()
 #define smp_wmb()	barrier()
 #else
-#define smp_mb()	mb()
-#define smp_rmb()	rmb()
-#define smp_wmb()	wmb()
+#define smp_mb()	dmb()
+#define smp_rmb()	dmb()
+#define smp_wmb()	dmb()
 #endif
 
 #define read_barrier_depends()		do { } while(0)
diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig
index f62beb7..f67f2c4 100644
--- a/arch/arm/mm/Kconfig
+++ b/arch/arm/mm/Kconfig
@@ -802,3 +802,9 @@ config ARM_L1_CACHE_SHIFT
 	int
 	default 6 if ARCH_OMAP3 || ARCH_S5PC1XX
 	default 5
+
+config ARCH_HAS_BARRIERS
+	bool
+	help
+	  This option allows the use of custom mandatory barriers
+	  included via the mach/barriers.h file.

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 1/4] ARM: Change the mandatory barriers implementation
  2010-02-23 11:01 [PATCH 1/4] ARM: Change the mandatory barriers implementation Catalin Marinas
@ 2010-02-23 11:10 ` Russell King - ARM Linux
  2010-02-23 12:16   ` Catalin Marinas
  2010-02-23 12:21   ` Catalin Marinas
  2010-02-23 11:35 ` Shilimkar, Santosh
  2010-02-23 17:33 ` Russell King - ARM Linux
  2 siblings, 2 replies; 19+ messages in thread
From: Russell King - ARM Linux @ 2010-02-23 11:10 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Feb 23, 2010 at 11:01:05AM +0000, Catalin Marinas wrote:
> -#define mb()	do { if (arch_is_coherent()) dmb(); else barrier(); } while (0)
> -#define rmb()	do { if (arch_is_coherent()) dmb(); else barrier(); } while (0)
> -#define wmb()	do { if (arch_is_coherent()) dmb(); else barrier(); } while (0)
> +#define mb()		dsb()
> +#define rmb()		dmb()
> +#define wmb()		dsb()

What is the reason for getting rid of the arch_is_coherent() bit here
and imposing non-compiler barriers on everything?  I'd assume that
Lennert knew what he was doing with Xscale3.

This also breaks ARMv3 since you now issue a mcr for the write buffer
on architectures which have no write buffer.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/4] ARM: Change the mandatory barriers implementation
  2010-02-23 11:01 [PATCH 1/4] ARM: Change the mandatory barriers implementation Catalin Marinas
  2010-02-23 11:10 ` Russell King - ARM Linux
@ 2010-02-23 11:35 ` Shilimkar, Santosh
  2010-02-23 11:41   ` Russell King - ARM Linux
  2010-02-23 17:33 ` Russell King - ARM Linux
  2 siblings, 1 reply; 19+ messages in thread
From: Shilimkar, Santosh @ 2010-02-23 11:35 UTC (permalink / raw)
  To: linux-arm-kernel

Catalin,
Thanks for these patches.
> -----Original Message-----
> From: linux-arm-kernel-bounces at lists.infradead.org [mailto:linux-arm-kernel-
> bounces at lists.infradead.org] On Behalf Of Catalin Marinas
> Sent: Tuesday, February 23, 2010 4:31 PM
> To: linux-arm-kernel at lists.infradead.org
> Cc: Tony Lindgren; Larry Bassel; Daniel Walker; Russell King
> Subject: [PATCH 1/4] ARM: Change the mandatory barriers implementation
> 
> The mandatory barriers (mb, rmb, wmb) are used even on uniprocessor
> systems for things like ordering Normal Non-cacheable memory accesses
> with DMA transfer (via Device memory writes). The current implementation
> uses dmb() for mb() and friends but this is not sufficient. The DMB only
> ensures the ordering of accesses with regards to a single observer
> accessing the same memory. If a DMA transfer is started by a write to
> Device memory, the data to be transfered may not reach the main memory
> (even if mapped as Normal Non-cacheable) before the device receives the
> notification to begin the transfer. The only barrier that would help in
> this situation is DSB which would completely drain the write buffers.
> 
> The patch also adds support for platform-defined barriers that can be
> defined in mach/barriers.h. This is required by at least two platforms -
> MSM and RealView (possible OMAP as well). On RealView with an outer
> cache (PL310 for example) stores to Normal Non-cacheable memory are
> buffered by the outer cache but the DSB doesn't go as far as this. A
> separate L2x0 sync command is required (a store to Strongly Ordered
> memory would do as well, similar to the MSM requirements and maybe
> faster).
> 
> Note that the SMP barriers are not affected as they only deal with
> ordering in Normal memory. There is however a situation with the use of
> IPIs. A DMB is not enough to ensure that a write to Normal memory is
> strictly ordered with respect to the IPI generation (and interrupt
> handling). A solution is for the users of smp_call_function() to use a
> mandatory barrier.
> 
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Russell King <linux@arm.linux.org.uk>
> Cc: Daniel Walker <dwalker@codeaurora.org>
> Cc: Larry Bassel <lbassel@quicinc.com>
> Cc: Tony Lindgren <tony@atomide.com>
> ---
>  arch/arm/include/asm/system.h |   18 ++++++++----------
>  arch/arm/mm/Kconfig           |    6 ++++++
>  2 files changed, 14 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/arm/include/asm/system.h b/arch/arm/include/asm/system.h
> index 058e7e9..477861d 100644
> --- a/arch/arm/include/asm/system.h
> +++ b/arch/arm/include/asm/system.h
> @@ -138,14 +138,12 @@ extern unsigned int user_debug;
>  #define dmb() __asm__ __volatile__ ("" : : : "memory")
>  #endif
> 
> -#if __LINUX_ARM_ARCH__ >= 7 || defined(CONFIG_SMP)
> -#define mb()		dmb()
> -#define rmb()		dmb()
> -#define wmb()		dmb()
> +#ifdef CONFIG_ARCH_HAS_BARRIERS
> +#include <mach/barriers.h>
>  #else
> -#define mb()	do { if (arch_is_coherent()) dmb(); else barrier(); } while (0)
> -#define rmb()	do { if (arch_is_coherent()) dmb(); else barrier(); } while (0)
> -#define wmb()	do { if (arch_is_coherent()) dmb(); else barrier(); } while (0)
> +#define mb()		dsb()
> +#define rmb()		dmb()
> +#define wmb()		dsb()
>  #endif
> 
>  #ifndef CONFIG_SMP
> @@ -153,9 +151,9 @@ extern unsigned int user_debug;
>  #define smp_rmb()	barrier()
>  #define smp_wmb()	barrier()
>  #else
> -#define smp_mb()	mb()
> -#define smp_rmb()	rmb()
> -#define smp_wmb()	wmb()
> +#define smp_mb()	dmb()
> +#define smp_rmb()	dmb()
> +#define smp_wmb()	dmb()
>  #endif
> 
>  #define read_barrier_depends()		do { } while(0)
> diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig
> index f62beb7..f67f2c4 100644
> --- a/arch/arm/mm/Kconfig
> +++ b/arch/arm/mm/Kconfig
> @@ -802,3 +802,9 @@ config ARM_L1_CACHE_SHIFT
>  	int
>  	default 6 if ARCH_OMAP3 || ARCH_S5PC1XX
>  	default 5
> +
> +config ARCH_HAS_BARRIERS
> +	bool
> +	help
> +	  This option allows the use of custom mandatory barriers
> +	  included via the mach/barriers.h file.
> 
OMAP3 and family would be ok with this much needed mandatory barriers
but for OMAP4, we need platform specified barriers as you mentioned. Will
have a look at this series and do some testing on OMAP4. Will send
a follow up patch then.

Regards,
Santosh

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/4] ARM: Change the mandatory barriers implementation
  2010-02-23 11:35 ` Shilimkar, Santosh
@ 2010-02-23 11:41   ` Russell King - ARM Linux
  0 siblings, 0 replies; 19+ messages in thread
From: Russell King - ARM Linux @ 2010-02-23 11:41 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Feb 23, 2010 at 05:05:31PM +0530, Shilimkar, Santosh wrote:
> OMAP3 and family would be ok with this much needed mandatory barriers
> but for OMAP4, we need platform specified barriers as you mentioned. Will
> have a look at this series and do some testing on OMAP4. Will send
> a follow up patch then.

Wait for v2 - it introduces barriers onto the older CPUs which do not
need it, and therefore is definitely not right.

Eg, ARMv4 CPUs don't need the drain-write-buffer instruction for mb()
and wmb() - we have no relaxed behaviours on those CPUs, so this is
pointless.  As I also pointed out, it breaks ARMv3 because the drain-
write-buffer instruction there is an undefined instruction.

Plus, non-coherent Xscale has the same requirements as ARMv4, so getting
rid of the arch_is_coherent() stuff was completely wrong.

Basically, the patch is not suitable as it currently stands, and is in
need of rework.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/4] ARM: Change the mandatory barriers implementation
  2010-02-23 11:10 ` Russell King - ARM Linux
@ 2010-02-23 12:16   ` Catalin Marinas
  2010-02-23 12:30     ` Russell King - ARM Linux
  2010-02-23 12:21   ` Catalin Marinas
  1 sibling, 1 reply; 19+ messages in thread
From: Catalin Marinas @ 2010-02-23 12:16 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 2010-02-23 at 11:10 +0000, Russell King - ARM Linux wrote:
> On Tue, Feb 23, 2010 at 11:01:05AM +0000, Catalin Marinas wrote:
> > -#define mb()	do { if (arch_is_coherent()) dmb(); else barrier(); } while (0)
> > -#define rmb()	do { if (arch_is_coherent()) dmb(); else barrier(); } while (0)
> > -#define wmb()	do { if (arch_is_coherent()) dmb(); else barrier(); } while (0)
> > +#define mb()		dsb()
> > +#define rmb()		dmb()
> > +#define wmb()		dsb()
> 
> What is the reason for getting rid of the arch_is_coherent() bit here
> and imposing non-compiler barriers on everything?  

The original code defines the mandatory barriers only if
__LINUX_ARCH_ARM__ >= 7 || CONFIG_SMP. However we still need them for
earlier architectures (v6). We can change the #if here to test for
earlier architectures as well and only do the arch_is_coherent() check
for some early implementations that don't have barriers.

> This also breaks ARMv3 since you now issue a mcr for the write buffer
> on architectures which have no write buffer.

Is this the case for another #ifdef? I can add it.

-- 
Catalin

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/4] ARM: Change the mandatory barriers implementation
  2010-02-23 11:10 ` Russell King - ARM Linux
  2010-02-23 12:16   ` Catalin Marinas
@ 2010-02-23 12:21   ` Catalin Marinas
  2010-02-23 12:31     ` Russell King - ARM Linux
  1 sibling, 1 reply; 19+ messages in thread
From: Catalin Marinas @ 2010-02-23 12:21 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 2010-02-23 at 11:10 +0000, Russell King - ARM Linux wrote:
> On Tue, Feb 23, 2010 at 11:01:05AM +0000, Catalin Marinas wrote:
> > -#define mb()	do { if (arch_is_coherent()) dmb(); else barrier(); } while (0)
> > -#define rmb()	do { if (arch_is_coherent()) dmb(); else barrier(); } while (0)
> > -#define wmb()	do { if (arch_is_coherent()) dmb(); else barrier(); } while (0)
> > +#define mb()		dsb()
> > +#define rmb()		dmb()
> > +#define wmb()		dsb()
> 
> What is the reason for getting rid of the arch_is_coherent() bit here
> and imposing non-compiler barriers on everything?  I'd assume that
> Lennert knew what he was doing with Xscale3.
> 
> This also breaks ARMv3 since you now issue a mcr for the write buffer
> on architectures which have no write buffer.

What about the patch below?


ARM: Change the mandatory barriers implementation

From: Catalin Marinas <catalin.marinas@arm.com>

The mandatory barriers (mb, rmb, wmb) are used even on uniprocessor
systems for things like ordering Normal Non-cacheable memory accesses
with DMA transfer (via Device memory writes). The current implementation
uses dmb() for mb() and friends but this is not sufficient. The DMB only
ensures the ordering of accesses with regards to a single observer
accessing the same memory. If a DMA transfer is started by a write to
Device memory, the data to be transfered may not reach the main memory
(even if mapped as Normal Non-cacheable) before the device receives the
notification to begin the transfer. The only barrier that would help in
this situation is DSB which would completely drain the write buffers.

The patch also adds support for platform-defined barriers that can be
defined in mach/barriers.h. This is required by at least two platforms -
MSM and RealView (possible OMAP as well). On RealView with an outer
cache (PL310 for example) stores to Normal Non-cacheable memory are
buffered by the outer cache but the DSB doesn't go as far as this. A
separate L2x0 sync command is required (a store to Strongly Ordered
memory would do as well, similar to the MSM requirements and maybe
faster).

Note that the SMP barriers are not affected as they only deal with
ordering in Normal memory. There is however a situation with the use of
IPIs. A DMB is not enough to ensure that a write to Normal memory is
strictly ordered with respect to the IPI generation (and interrupt
handling). A solution is for the users of smp_call_function() to use a
mandatory barrier.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Daniel Walker <dwalker@codeaurora.org>
Cc: Larry Bassel <lbassel@quicinc.com>
Cc: Tony Lindgren <tony@atomide.com>
---
 arch/arm/include/asm/system.h |   14 ++++++++------
 arch/arm/mm/Kconfig           |    6 ++++++
 2 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/arch/arm/include/asm/system.h b/arch/arm/include/asm/system.h
index 058e7e9..0db8b06 100644
--- a/arch/arm/include/asm/system.h
+++ b/arch/arm/include/asm/system.h
@@ -138,10 +138,12 @@ extern unsigned int user_debug;
 #define dmb() __asm__ __volatile__ ("" : : : "memory")
 #endif
 
-#if __LINUX_ARM_ARCH__ >= 7 || defined(CONFIG_SMP)
-#define mb()		dmb()
+#ifdef CONFIG_ARCH_HAS_BARRIERS
+#include <mach/barriers.h>
+#elif __LINUX_ARM_ARCH__ >= 5
+#define mb()		dsb()
 #define rmb()		dmb()
-#define wmb()		dmb()
+#define wmb()		dsb()
 #else
 #define mb()	do { if (arch_is_coherent()) dmb(); else barrier(); } while (0)
 #define rmb()	do { if (arch_is_coherent()) dmb(); else barrier(); } while (0)
@@ -153,9 +155,9 @@ extern unsigned int user_debug;
 #define smp_rmb()	barrier()
 #define smp_wmb()	barrier()
 #else
-#define smp_mb()	mb()
-#define smp_rmb()	rmb()
-#define smp_wmb()	wmb()
+#define smp_mb()	dmb()
+#define smp_rmb()	dmb()
+#define smp_wmb()	dmb()
 #endif
 
 #define read_barrier_depends()		do { } while(0)
diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig
index f62beb7..f67f2c4 100644
--- a/arch/arm/mm/Kconfig
+++ b/arch/arm/mm/Kconfig
@@ -802,3 +802,9 @@ config ARM_L1_CACHE_SHIFT
 	int
 	default 6 if ARCH_OMAP3 || ARCH_S5PC1XX
 	default 5
+
+config ARCH_HAS_BARRIERS
+	bool
+	help
+	  This option allows the use of custom mandatory barriers
+	  included via the mach/barriers.h file.



-- 
Catalin

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 1/4] ARM: Change the mandatory barriers implementation
  2010-02-23 12:16   ` Catalin Marinas
@ 2010-02-23 12:30     ` Russell King - ARM Linux
  2010-02-23 15:12       ` Catalin Marinas
  0 siblings, 1 reply; 19+ messages in thread
From: Russell King - ARM Linux @ 2010-02-23 12:30 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Feb 23, 2010 at 12:16:43PM +0000, Catalin Marinas wrote:
> On Tue, 2010-02-23 at 11:10 +0000, Russell King - ARM Linux wrote:
> > On Tue, Feb 23, 2010 at 11:01:05AM +0000, Catalin Marinas wrote:
> > > -#define mb()	do { if (arch_is_coherent()) dmb(); else barrier(); } while (0)
> > > -#define rmb()	do { if (arch_is_coherent()) dmb(); else barrier(); } while (0)
> > > -#define wmb()	do { if (arch_is_coherent()) dmb(); else barrier(); } while (0)
> > > +#define mb()		dsb()
> > > +#define rmb()		dmb()
> > > +#define wmb()		dsb()
> > 
> > What is the reason for getting rid of the arch_is_coherent() bit here
> > and imposing non-compiler barriers on everything?  
> 
> The original code defines the mandatory barriers only if
> __LINUX_ARCH_ARM__ >= 7 || CONFIG_SMP.

No.  Please read the code.

The original code defines mandatory barriers in the following cases:
- ARMv7+
- SMP
- Any architecture which sets arch_is_coherent() (for coherent DMA) -
  which at the moment is only Xscale 3 on IXP23xx platforms.

So, we end up with mandatory barriers for: ARMv7+, ARMv6 SMP, XScale 3.
We end up with mb() and friends being compiler barriers on everything
else.

However, because dsb() replaced the 'drain write buffer' instruction in
things like the TLB operations, dsb() et al need to be defined for the
other architectures - and TLB operations execute dsb() according to the
TLB type attributes.

What your patch does is make mb() and friends unconditionally call dsb()
and friends.  As I've pointed out, this is totally wrong.  We only need
mb() and friends to be dsb() et al for ARMv6+ and anything which has
arch_is_coherent() enabled.

So, all you need to do is to leave the #else clause as-is, and change:

#if __LINUX_ARM_ARCH__ >= 7 || defined(CONFIG_SMP)

to be

#if __LINUX_ARM_ARCH__ >= 6

No need for any additional ifdefs.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/4] ARM: Change the mandatory barriers implementation
  2010-02-23 12:21   ` Catalin Marinas
@ 2010-02-23 12:31     ` Russell King - ARM Linux
  0 siblings, 0 replies; 19+ messages in thread
From: Russell King - ARM Linux @ 2010-02-23 12:31 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Feb 23, 2010 at 12:21:56PM +0000, Catalin Marinas wrote:
> diff --git a/arch/arm/include/asm/system.h b/arch/arm/include/asm/system.h
> index 058e7e9..0db8b06 100644
> --- a/arch/arm/include/asm/system.h
> +++ b/arch/arm/include/asm/system.h
> @@ -138,10 +138,12 @@ extern unsigned int user_debug;
>  #define dmb() __asm__ __volatile__ ("" : : : "memory")
>  #endif
>  
> -#if __LINUX_ARM_ARCH__ >= 7 || defined(CONFIG_SMP)
> -#define mb()		dmb()
> +#ifdef CONFIG_ARCH_HAS_BARRIERS
> +#include <mach/barriers.h>
> +#elif __LINUX_ARM_ARCH__ >= 5
> +#define mb()		dsb()
>  #define rmb()		dmb()
> -#define wmb()		dmb()
> +#define wmb()		dsb()

Why do we need these to be anything more than compiler barriers on PXA?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/4] ARM: Change the mandatory barriers implementation
  2010-02-23 12:30     ` Russell King - ARM Linux
@ 2010-02-23 15:12       ` Catalin Marinas
  2010-02-23 15:24         ` Russell King - ARM Linux
  0 siblings, 1 reply; 19+ messages in thread
From: Catalin Marinas @ 2010-02-23 15:12 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 2010-02-23 at 12:30 +0000, Russell King - ARM Linux wrote:
> On Tue, Feb 23, 2010 at 12:16:43PM +0000, Catalin Marinas wrote:
> > On Tue, 2010-02-23 at 11:10 +0000, Russell King - ARM Linux wrote:
> > > On Tue, Feb 23, 2010 at 11:01:05AM +0000, Catalin Marinas wrote:
> > > > -#define mb()	do { if (arch_is_coherent()) dmb(); else barrier(); } while (0)
> > > > -#define rmb()	do { if (arch_is_coherent()) dmb(); else barrier(); } while (0)
> > > > -#define wmb()	do { if (arch_is_coherent()) dmb(); else barrier(); } while (0)
> > > > +#define mb()		dsb()
> > > > +#define rmb()		dmb()
> > > > +#define wmb()		dsb()
> > > 
> > > What is the reason for getting rid of the arch_is_coherent() bit here
> > > and imposing non-compiler barriers on everything?  
> > 
> > The original code defines the mandatory barriers only if
> > __LINUX_ARCH_ARM__ >= 7 || CONFIG_SMP.
> 
> No.  Please read the code.
> 
> The original code defines mandatory barriers in the following cases:
> - ARMv7+
> - SMP
> - Any architecture which sets arch_is_coherent() (for coherent DMA) -
>   which at the moment is only Xscale 3 on IXP23xx platforms.
> 
> So, we end up with mandatory barriers for: ARMv7+, ARMv6 SMP, XScale 3.
> We end up with mb() and friends being compiler barriers on everything
> else.

The scenario I have in mind is when using mb() in relation with DMA
coherent mappings. We only use Normal Uncached for this case on ARMv7.
Earlier architectures, including ARMv6 SMP, we use strongly ordered
which would be fine without any barrier.

Since mb() isn't meant for SMP use, does it still make sense to have it
defined in the ARMv6 SMP case? Would people not use the smp_* variants?

Or is mb() meant to cover all the cases that smp_mb() would cover on
SMP?

> So, all you need to do is to leave the #else clause as-is, and change:
> 
> #if __LINUX_ARM_ARCH__ >= 7 || defined(CONFIG_SMP)
> 
> to be
> 
> #if __LINUX_ARM_ARCH__ >= 6

Yes, sounds fine. Just need to check whether the minimum version should
be 6 or 7 (see above).

-- 
Catalin

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/4] ARM: Change the mandatory barriers implementation
  2010-02-23 15:12       ` Catalin Marinas
@ 2010-02-23 15:24         ` Russell King - ARM Linux
  2010-02-23 16:02           ` Catalin Marinas
  0 siblings, 1 reply; 19+ messages in thread
From: Russell King - ARM Linux @ 2010-02-23 15:24 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Feb 23, 2010 at 03:12:46PM +0000, Catalin Marinas wrote:
> The scenario I have in mind is when using mb() in relation with DMA
> coherent mappings. We only use Normal Uncached for this case on ARMv7.
> Earlier architectures, including ARMv6 SMP, we use strongly ordered
> which would be fine without any barrier.
> 
> Since mb() isn't meant for SMP use, does it still make sense to have it
> defined in the ARMv6 SMP case? Would people not use the smp_* variants?

Part of the reason for that is that smp_mb() are, afaik, supposed to be
as strong as mb() in the SMP case, or reduce to compiler barriers in the
UP case.

I'm not entirely convinced by the part of your patch which changes the
SMP barriers yet.  For instance, some drivers contain:

                /* We need for force the visibility of tp->intr_mask
                 * for other CPUs, as we can loose an MSI interrupt
                 * and potentially wait for a retransmit timeout if we don't.
                 * The posted write to IntrMask is safe, as it will
                 * eventually make it to the chip and we won't loose anything
                 * until it does.
                 */
                tp->intr_mask = 0xffff;
                smp_wmb();
                RTL_W16(IntrMask, tp->intr_event);

The second write is a write to hardware, and thus would be to a device
region.  The first is a write to a memory structure.

It seems to me given your description in the patch, that having smp_wmb()
be a dmb(), rather than a wmb() would be insufficient here.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/4] ARM: Change the mandatory barriers implementation
  2010-02-23 15:24         ` Russell King - ARM Linux
@ 2010-02-23 16:02           ` Catalin Marinas
  2010-02-23 18:03             ` Russell King - ARM Linux
  0 siblings, 1 reply; 19+ messages in thread
From: Catalin Marinas @ 2010-02-23 16:02 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 2010-02-23 at 15:24 +0000, Russell King - ARM Linux wrote:
> On Tue, Feb 23, 2010 at 03:12:46PM +0000, Catalin Marinas wrote:
> > The scenario I have in mind is when using mb() in relation with DMA
> > coherent mappings. We only use Normal Uncached for this case on ARMv7.
> > Earlier architectures, including ARMv6 SMP, we use strongly ordered
> > which would be fine without any barrier.
> >
> > Since mb() isn't meant for SMP use, does it still make sense to have it
> > defined in the ARMv6 SMP case? Would people not use the smp_* variants?
> 
> Part of the reason for that is that smp_mb() are, afaik, supposed to be
> as strong as mb() in the SMP case, or reduce to compiler barriers in the
> UP case.

I looked again at Documentation/memory-barriers.txt and I haven't seen
anything that would suggest the above. The only reference to these types
of barriers together seems to be:

        Note that SMP memory barriers _must_ be used to control the
        ordering of references to shared memory on SMP systems [...].
        
        Mandatory barriers should not be used to control SMP effects,
        since mandatory barriers unnecessarily impose overhead on UP
        systems. They may, however, be used to control MMIO effects on
        accesses through relaxed memory I/O windows.

They don't seem to imply that one is stronger than the other, only that
they are meant for different scenarios.

> I'm not entirely convinced by the part of your patch which changes the
> SMP barriers yet.  For instance, some drivers contain:
> 
>                 /* We need for force the visibility of tp->intr_mask
>                  * for other CPUs, as we can loose an MSI interrupt
>                  * and potentially wait for a retransmit timeout if we don't.
>                  * The posted write to IntrMask is safe, as it will
>                  * eventually make it to the chip and we won't loose anything
>                  * until it does.
>                  */
>                 tp->intr_mask = 0xffff;
>                 smp_wmb();
>                 RTL_W16(IntrMask, tp->intr_event);
> 
> The second write is a write to hardware, and thus would be to a device
> region.  The first is a write to a memory structure.
> 
> It seems to me given your description in the patch, that having smp_wmb()
> be a dmb(), rather than a wmb() would be insufficient here.

Yes, a DMB is insufficient here (see the Mailbox example in the Barrier
Litmus document from Richard G). That's actually the case with the GIC
currently - CPU0 writes some data, barrier and then IPI to CPU1. If only
DMB is used, the IPI may arrive at CPU1 before the data written by CPU0
reaches the memory.

My proposal for this would be to place an explicit DSB at the beginning
of gic_raise_irq(). Otherwise, we can change smp_wmb() to be a DSB but
we may have some performance penalty for other cases where ordering with
Device accesses is not required.

-- 
Catalin

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/4] ARM: Change the mandatory barriers implementation
  2010-02-23 11:01 [PATCH 1/4] ARM: Change the mandatory barriers implementation Catalin Marinas
  2010-02-23 11:10 ` Russell King - ARM Linux
  2010-02-23 11:35 ` Shilimkar, Santosh
@ 2010-02-23 17:33 ` Russell King - ARM Linux
  2010-02-23 17:58   ` Catalin Marinas
  2 siblings, 1 reply; 19+ messages in thread
From: Russell King - ARM Linux @ 2010-02-23 17:33 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Feb 23, 2010 at 11:01:05AM +0000, Catalin Marinas wrote:
> The mandatory barriers (mb, rmb, wmb) are used even on uniprocessor
> systems for things like ordering Normal Non-cacheable memory accesses
> with DMA transfer (via Device memory writes). The current implementation
> uses dmb() for mb() and friends but this is not sufficient. The DMB only
> ensures the ordering of accesses with regards to a single observer
> accessing the same memory.

Erm, I also don't think your statement here is right.  DMB is defined in
the ARM ARM to be by default a full-system, read/write memory barrier.
It has this property:

  If the required shareability is Full system then the operation applies
  to all observers within the system.
...
  Any observer with the same required shareability domain as Pe observes
  all members of Group A before it observes any member of Group B to the
  extent that those group members are required to be observed, as
  determined by the shareability and cacheability of the memory locations
  accessed by the group members.  Where members of Group A and Group B
  access the same memory-mapped peripheral, all members of Group A will
  be visible at the memory-mapped peripheral before any members of Group
  B are visible at that peripheral.

This most definitely is not "single observer" - it's all observers within
the same "shareability domain".  That may encompass all CPUs and not
devices and DMA agents.

I'd go further - in the case of:

	write to non-cacheable memory
	dmb();
	write to peripheral

then, because the dmb() is a full system dmb, all observers within the
system should see the write to non-cacheable memory before the write to
peripheral.

Either that or the ARM ARM is unclear/wrong about what "all observers"
means.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/4] ARM: Change the mandatory barriers implementation
  2010-02-23 17:33 ` Russell King - ARM Linux
@ 2010-02-23 17:58   ` Catalin Marinas
  2010-02-23 18:04     ` Catalin Marinas
  0 siblings, 1 reply; 19+ messages in thread
From: Catalin Marinas @ 2010-02-23 17:58 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 2010-02-23 at 17:33 +0000, Russell King - ARM Linux wrote:
> On Tue, Feb 23, 2010 at 11:01:05AM +0000, Catalin Marinas wrote:
> > The mandatory barriers (mb, rmb, wmb) are used even on uniprocessor
> > systems for things like ordering Normal Non-cacheable memory accesses
> > with DMA transfer (via Device memory writes). The current implementation
> > uses dmb() for mb() and friends but this is not sufficient. The DMB only
> > ensures the ordering of accesses with regards to a single observer
> > accessing the same memory.
> 
> Erm, I also don't think your statement here is right.  DMB is defined in
> the ARM ARM to be by default a full-system, read/write memory barrier.
> It has this property:
> 
>   If the required shareability is Full system then the operation applies
>   to all observers within the system.
> ...
>   Any observer with the same required shareability domain as Pe observes
>   all members of Group A before it observes any member of Group B to the
>   extent that those group members are required to be observed, as
>   determined by the shareability and cacheability of the memory locations
>   accessed by the group members.  Where members of Group A and Group B
>   access the same memory-mapped peripheral, all members of Group A will
>   be visible at the memory-mapped peripheral before any members of Group
>   B are visible at that peripheral.
> 
> This most definitely is not "single observer" - it's all observers within
> the same "shareability domain".  That may encompass all CPUs and not
> devices and DMA agents.

Yes, that's correct but see below (the issue is the definition of
"observability").

> I'd go further - in the case of:
> 
>         write to non-cacheable memory
>         dmb();
>         write to peripheral
> 
> then, because the dmb() is a full system dmb, all observers within the
> system should see the write to non-cacheable memory before the write to
> peripheral.
> 
> Either that or the ARM ARM is unclear/wrong about what "all observers"
> means.

The ARM ARM wasn't clear to me either but the answer I got from Richard
G on this issue is that the "write to peripheral" above doesn't count as
"observed" since the "observability" definition only applies to master
accesses. In the above case, the CPU writes to a slave port of the
peripheral and the DMB has no effect on when such event would be
observed (if the device is mapped as strongly ordered, there are some
stricter requirements). You could ask RG directly, he would probably be
happy to clarify this for you.

Linux doesn't have a barrier defined to flush the write buffer, even
though the Documentation/DMA-API.txt mention that the user of coherent
DMA mappings may need to do this. Since many drivers use wmb/mb for this
situation, I find this barrier as the best place to put a DSB.

IMHO, mb() has at least the smp_mb() semantics and it can be safely used
in the IPI case you mentioned instead of the smp_* one (when implemented
as DSB).

-- 
Catalin

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/4] ARM: Change the mandatory barriers implementation
  2010-02-23 16:02           ` Catalin Marinas
@ 2010-02-23 18:03             ` Russell King - ARM Linux
  2010-02-23 18:07               ` Catalin Marinas
  2010-02-26 15:43               ` Catalin Marinas
  0 siblings, 2 replies; 19+ messages in thread
From: Russell King - ARM Linux @ 2010-02-23 18:03 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Feb 23, 2010 at 04:02:35PM +0000, Catalin Marinas wrote:
> > I'm not entirely convinced by the part of your patch which changes the
> > SMP barriers yet.  For instance, some drivers contain:
> > 
> >                 /* We need for force the visibility of tp->intr_mask
> >                  * for other CPUs, as we can loose an MSI interrupt
> >                  * and potentially wait for a retransmit timeout if we don't.
> >                  * The posted write to IntrMask is safe, as it will
> >                  * eventually make it to the chip and we won't loose anything
> >                  * until it does.
> >                  */
> >                 tp->intr_mask = 0xffff;
> >                 smp_wmb();
> >                 RTL_W16(IntrMask, tp->intr_event);
> > 
> > The second write is a write to hardware, and thus would be to a device
> > region.  The first is a write to a memory structure.
> > 
> > It seems to me given your description in the patch, that having smp_wmb()
> > be a dmb(), rather than a wmb() would be insufficient here.
> 
> My proposal for this would be to place an explicit DSB at the beginning
> of gic_raise_irq(). Otherwise, we can change smp_wmb() to be a DSB but
> we may have some performance penalty for other cases where ordering with
> Device accesses is not required.

That doesn't solve the above case; this isn't GIC code, it's driver code.

Given what you've said, it would appear that smp_wmb() needs to be a
wmb() in the SMP case, to ensure that the write to intr_mask is
visible to other CPUs before the interrupt mask write hits the
peripheral.

So, that leads us back to the:

#ifndef CONFIG_SMP
#define smp_mb()        barrier()
#define smp_rmb()       barrier()
#define smp_wmb()       barrier()
#else
#define smp_mb()        mb()
#define smp_rmb()       rmb()
#define smp_wmb()       wmb()
#endif

which we have at the moment.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/4] ARM: Change the mandatory barriers implementation
  2010-02-23 17:58   ` Catalin Marinas
@ 2010-02-23 18:04     ` Catalin Marinas
  0 siblings, 0 replies; 19+ messages in thread
From: Catalin Marinas @ 2010-02-23 18:04 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 2010-02-23 at 17:58 +0000, Catalin Marinas wrote:
> On Tue, 2010-02-23 at 17:33 +0000, Russell King - ARM Linux wrote:
> > On Tue, Feb 23, 2010 at 11:01:05AM +0000, Catalin Marinas wrote:
> > > The mandatory barriers (mb, rmb, wmb) are used even on uniprocessor
> > > systems for things like ordering Normal Non-cacheable memory accesses
> > > with DMA transfer (via Device memory writes). The current implementation
> > > uses dmb() for mb() and friends but this is not sufficient. The DMB only
> > > ensures the ordering of accesses with regards to a single observer
> > > accessing the same memory.
> >
> > Erm, I also don't think your statement here is right.  DMB is defined in
> > the ARM ARM to be by default a full-system, read/write memory barrier.
> > It has this property:
> >
> >   If the required shareability is Full system then the operation applies
> >   to all observers within the system.
> > ...
> >   Any observer with the same required shareability domain as Pe observes
> >   all members of Group A before it observes any member of Group B to the
> >   extent that those group members are required to be observed, as
> >   determined by the shareability and cacheability of the memory locations
> >   accessed by the group members.  Where members of Group A and Group B
> >   access the same memory-mapped peripheral, all members of Group A will
> >   be visible at the memory-mapped peripheral before any members of Group
> >   B are visible at that peripheral.
> >
> > This most definitely is not "single observer" - it's all observers within
> > the same "shareability domain".  That may encompass all CPUs and not
> > devices and DMA agents.
> 
> Yes, that's correct but see below (the issue is the definition of
> "observability").

I meant that the definitions in the ARM ARM encompass all CPUs,
devices/DMA agents (but only when acting as masters).

-- 
Catalin

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/4] ARM: Change the mandatory barriers implementation
  2010-02-23 18:03             ` Russell King - ARM Linux
@ 2010-02-23 18:07               ` Catalin Marinas
  2010-03-01  3:37                 ` Jamie Lokier
  2010-02-26 15:43               ` Catalin Marinas
  1 sibling, 1 reply; 19+ messages in thread
From: Catalin Marinas @ 2010-02-23 18:07 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 2010-02-23 at 18:03 +0000, Russell King - ARM Linux wrote:
> On Tue, Feb 23, 2010 at 04:02:35PM +0000, Catalin Marinas wrote:
> > > I'm not entirely convinced by the part of your patch which changes the
> > > SMP barriers yet.  For instance, some drivers contain:
> > >
> > >                 /* We need for force the visibility of tp->intr_mask
> > >                  * for other CPUs, as we can loose an MSI interrupt
> > >                  * and potentially wait for a retransmit timeout if we don't.
> > >                  * The posted write to IntrMask is safe, as it will
> > >                  * eventually make it to the chip and we won't loose anything
> > >                  * until it does.
> > >                  */
> > >                 tp->intr_mask = 0xffff;
> > >                 smp_wmb();
> > >                 RTL_W16(IntrMask, tp->intr_event);
> > >
> > > The second write is a write to hardware, and thus would be to a device
> > > region.  The first is a write to a memory structure.
> > >
> > > It seems to me given your description in the patch, that having smp_wmb()
> > > be a dmb(), rather than a wmb() would be insufficient here.
> >
> > My proposal for this would be to place an explicit DSB at the beginning
> > of gic_raise_irq(). Otherwise, we can change smp_wmb() to be a DSB but
> > we may have some performance penalty for other cases where ordering with
> > Device accesses is not required.
> 
> That doesn't solve the above case; this isn't GIC code, it's driver code.

As I mentioned in the previous e-mail, my view is that the driver could
just use wmb() rather than smp_wmb() in this case. Are the smp_*()
barriers in Linux required to ensure the relative ordering of anything
other than shared normal memory accesses?

-- 
Catalin

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/4] ARM: Change the mandatory barriers implementation
  2010-02-23 18:03             ` Russell King - ARM Linux
  2010-02-23 18:07               ` Catalin Marinas
@ 2010-02-26 15:43               ` Catalin Marinas
  2010-03-01  3:44                 ` Jamie Lokier
  1 sibling, 1 reply; 19+ messages in thread
From: Catalin Marinas @ 2010-02-26 15:43 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 2010-02-23 at 18:03 +0000, Russell King - ARM Linux wrote:
> On Tue, Feb 23, 2010 at 04:02:35PM +0000, Catalin Marinas wrote:
> > > I'm not entirely convinced by the part of your patch which changes the
> > > SMP barriers yet.  For instance, some drivers contain:
> > >
> > >                 /* We need for force the visibility of tp->intr_mask
> > >                  * for other CPUs, as we can loose an MSI interrupt
> > >                  * and potentially wait for a retransmit timeout if we don't.
> > >                  * The posted write to IntrMask is safe, as it will
> > >                  * eventually make it to the chip and we won't loose anything
> > >                  * until it does.
> > >                  */
> > >                 tp->intr_mask = 0xffff;
> > >                 smp_wmb();
> > >                 RTL_W16(IntrMask, tp->intr_event);
> > >
> > > The second write is a write to hardware, and thus would be to a device
> > > region.  The first is a write to a memory structure.
> > >
> > > It seems to me given your description in the patch, that having smp_wmb()
> > > be a dmb(), rather than a wmb() would be insufficient here.
[...]
> Given what you've said, it would appear that smp_wmb() needs to be a
> wmb() in the SMP case, to ensure that the write to intr_mask is
> visible to other CPUs before the interrupt mask write hits the
> peripheral.
> 
> So, that leads us back to the:
> 
> #ifndef CONFIG_SMP
> #define smp_mb()        barrier()
> #define smp_rmb()       barrier()
> #define smp_wmb()       barrier()
> #else
> #define smp_mb()        mb()
> #define smp_rmb()       rmb()
> #define smp_wmb()       wmb()
> #endif

A better implementation would be this:

#ifndef CONFIG_SMP
#define smp_mb()	barrier()
#define smp_rmb()	barrier()
#define smp_wmb()	barrier()
#else
#define smp_mb()	dsb()
#define smp_rmb()	mb()
#define smp_wmb()	dsb()
#endif

Since the mb() may have other effects like draining the L2 write buffer
which is definitely not needed for the SMP barriers.

Anyway, the above change to smp_*mb() would probably have a performance
impact especially with spinlocks.

I can see that the driver situation you described appears in other
drivers as well. Whether this is a correct usage model I can't tell. It
may be worth going with this on linux-arch. PowerPC for example uses a
light barrier for the smp_wmb() case which doesn't ensure ordering
between accesses to normal vs I/O memory.

-- 
Catalin

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/4] ARM: Change the mandatory barriers implementation
  2010-02-23 18:07               ` Catalin Marinas
@ 2010-03-01  3:37                 ` Jamie Lokier
  0 siblings, 0 replies; 19+ messages in thread
From: Jamie Lokier @ 2010-03-01  3:37 UTC (permalink / raw)
  To: linux-arm-kernel

Catalin Marinas wrote:
> On Tue, 2010-02-23 at 18:03 +0000, Russell King - ARM Linux wrote:
> > On Tue, Feb 23, 2010 at 04:02:35PM +0000, Catalin Marinas wrote:
> > > > I'm not entirely convinced by the part of your patch which changes the
> > > > SMP barriers yet.  For instance, some drivers contain:
> > > >
> > > >                 /* We need for force the visibility of tp->intr_mask
> > > >                  * for other CPUs, as we can loose an MSI interrupt
> > > >                  * and potentially wait for a retransmit timeout if we don't.
> > > >                  * The posted write to IntrMask is safe, as it will
> > > >                  * eventually make it to the chip and we won't loose anything
> > > >                  * until it does.
> > > >                  */
> > > >                 tp->intr_mask = 0xffff;
> > > >                 smp_wmb();
> > > >                 RTL_W16(IntrMask, tp->intr_event);
> > > >
> > > > The second write is a write to hardware, and thus would be to a device
> > > > region.  The first is a write to a memory structure.
> > > >
> > > > It seems to me given your description in the patch, that having smp_wmb()
> > > > be a dmb(), rather than a wmb() would be insufficient here.
> > >
> > > My proposal for this would be to place an explicit DSB at the beginning
> > > of gic_raise_irq(). Otherwise, we can change smp_wmb() to be a DSB but
> > > we may have some performance penalty for other cases where ordering with
> > > Device accesses is not required.
> > 
> > That doesn't solve the above case; this isn't GIC code, it's driver code.
> 
> As I mentioned in the previous e-mail, my view is that the driver could
> just use wmb() rather than smp_wmb() in this case. Are the smp_*()
> barriers in Linux required to ensure the relative ordering of anything
> other than shared normal memory accesses?

Historically, smp_*() was just this:

#ifdef SMP
#define smp_mb()	mb()
#else
#define smb_mb()	barrier()
#endif

The idea was to write smp_*() everywhere that you know you need a
barrier iff CONFIG_SMP.

So at one time, the driver code might have been fine, assuming it's
comment is correct.

It's only more recently that SMP + smp_*() became weaker than *() on some
architectures.

Which might mean the above driver code is now buggy.

smp_*() are not required to order regular memory vs. device MMIO, or
vs. interrupts, or vs. DMA, because on UP systems, they are usually
defined as barrier() which guarantees nothing for MMIO.

-- Jamie

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/4] ARM: Change the mandatory barriers implementation
  2010-02-26 15:43               ` Catalin Marinas
@ 2010-03-01  3:44                 ` Jamie Lokier
  0 siblings, 0 replies; 19+ messages in thread
From: Jamie Lokier @ 2010-03-01  3:44 UTC (permalink / raw)
  To: linux-arm-kernel

Catalin Marinas wrote:
> A better implementation would be this:
> 
> #ifndef CONFIG_SMP
> #define smp_mb()	barrier()
> #define smp_rmb()	barrier()
> #define smp_wmb()	barrier()
> #else
> #define smp_mb()	dsb()
> #define smp_rmb()	mb()
> #define smp_wmb()	dsb()
> #endif
> 
> Since the mb() may have other effects like draining the L2 write buffer
> which is definitely not needed for the SMP barriers.
> 
> Anyway, the above change to smp_*mb() would probably have a performance
> impact especially with spinlocks.
> 
> I can see that the driver situation you described appears in other
> drivers as well. Whether this is a correct usage model I can't tell. It
> may be worth going with this on linux-arch. PowerPC for example uses a
> light barrier for the smp_wmb() case which doesn't ensure ordering
> between accesses to normal vs I/O memory.

I agree, it looks like some confusion about the meaning of smp_wmb()
for ordering w.r.t. I/O, DMA and interrupts has crept in.
It would be good to clarify it.

-- Jamie

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2010-03-01  3:44 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-02-23 11:01 [PATCH 1/4] ARM: Change the mandatory barriers implementation Catalin Marinas
2010-02-23 11:10 ` Russell King - ARM Linux
2010-02-23 12:16   ` Catalin Marinas
2010-02-23 12:30     ` Russell King - ARM Linux
2010-02-23 15:12       ` Catalin Marinas
2010-02-23 15:24         ` Russell King - ARM Linux
2010-02-23 16:02           ` Catalin Marinas
2010-02-23 18:03             ` Russell King - ARM Linux
2010-02-23 18:07               ` Catalin Marinas
2010-03-01  3:37                 ` Jamie Lokier
2010-02-26 15:43               ` Catalin Marinas
2010-03-01  3:44                 ` Jamie Lokier
2010-02-23 12:21   ` Catalin Marinas
2010-02-23 12:31     ` Russell King - ARM Linux
2010-02-23 11:35 ` Shilimkar, Santosh
2010-02-23 11:41   ` Russell King - ARM Linux
2010-02-23 17:33 ` Russell King - ARM Linux
2010-02-23 17:58   ` Catalin Marinas
2010-02-23 18:04     ` Catalin Marinas

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.