All of lore.kernel.org
 help / color / mirror / Atom feed
From: o.rempel@pengutronix.de (Oleksij Rempel)
To: linux-arm-kernel@lists.infradead.org
Subject: Clarifying dma_wmb behavior in presence of non-coherent masters and outer caches
Date: Mon, 9 Jul 2018 08:20:22 +0200	[thread overview]
Message-ID: <72f67412-a011-6552-03be-6730a9302ebb@pengutronix.de> (raw)
In-Reply-To: <20180706122610.GA6993@arm.com>

Hi Will,

On 06.07.2018 14:26, Will Deacon wrote:
> Hi Lucas,
> 
> On Mon, Jul 02, 2018 at 06:45:52PM +0100, Will Deacon wrote:
>> On Mon, Jul 02, 2018 at 03:49:26PM +0200, Lucas Stach wrote:
>>> Also the section "External caches", "Before AMBA4 ACE" seems to suggest
>>> that the barriers aren't fully propagated to the PL310 write-caches and
>>> master ports. I'm unsure if this is just an artifact of the mentioned
>>> MMIO access, so handling of normal vs. device memory transactions in
>>> the PL310.
>>
>> I'll check with the hardware folks about the PL310, but I don't see how
>> you could propagate barriers over AXI3 anyway because I don't think this
>> stuff existed back then.
> 
> Damn. As I feared, you can't propagate the barrier to the PL310, so the
> A9 is unable to enforce ordering beyond the responses it gets back. The
> PL310 will then re-order normal memory accesses (including non-cacheable)
> in its store buffer, so we do actually need a bloody CACHE SYNC here.
> 
> Patch below. Does it help?

Suddenly no.
Is it possible that I run in to this issue:
764369: Data or Unified Cache Line Maintenance by MVA Fails on
Inner-Shareable Memory
"...
Note that even when all three components of the workaround are in place,
this erratum might still occur. However, this occurrence would require
some extremely rare and complex timing conditions, so that the
probability of reaching the point of failure is extremely low. This low
probability, along with the fact that this erratum requires an uncommon
software scenario, explains why this workaround is likely to be a
reliable practical solution for most systems.
To ARM's knowledge, no failure has been observed in any system when all
three components of this workaround have been implemented.
For critical systems that cannot cope with the extremely low failure
risks associated with the above workaround, a second workaround is
possible which involves changing the mapping of the data being accessed
so that it is in a non-cacheable area. This ensures that the written
data remains uncached, which means it is always visible to non-coherent
agents in the system, or to the instruction side in the case of
self-modifying code, without any need for cache maintenance operation."


> Will
> 
> --->8
> 
> From 5a2769b45e94b212d21bc44f5940ed5f3bde5cc2 Mon Sep 17 00:00:00 2001
> From: Will Deacon <will.deacon@arm.com>
> Date: Wed, 4 Jul 2018 10:56:22 +0100
> Subject: [PATCH] ARM: barrier: Kick outer-cache and SoC fabric in dma_wmb()
> 
> Prior to AMBA4-ACE, the AXI bus protocol had no support for barrier
> transactions or shareability domains and was therefore unable to enforce
> ordering between memory accesses either side of a barrier instruction,
> requiring the CPU to handle all of the ordering itself.
> 
> Unfortunately, IP such as the venerable PL310 will re-order stores to
> Normal memory inside its store-buffer, meaning that accesses to a coherent
> DMA buffer (i.e. one allocated by dma_alloc_coherent()) can be observed
> out-of-order by a non-coherent DMA engine unless the PL310 is kicked with
> a CACHE_SYNC command between buffering the accesses. Whilst this is already
> the case for mb() and wmb(), dma_wmb() just expands to a DMB OSHST, leading
> to sequences such as:
> 
> 	STR	Rdata, [Rbuf, #DATA_OFFSET]
> 	DMB	OSHST
> 	STR	Rvalid, [Rbuf, #VALID_OFFSET]
> 
> being reordered such that the DMA device sees the write of valid but not
> the write of data.
> 
> This patch ensures that the SoC fabric and the outer-cache are kicked
> by dma_wmb() if they are able to buffer transactions outside of the CPU
> but before the Point of Coherency.
> 
> Reported-by: Lucas Stach <l.stach@pengutronix.de>
> Signed-off-by: Will Deacon <will.deacon@arm.com>
> ---
>  arch/arm/include/asm/barrier.h    | 14 ++++++++++----
>  arch/arm/include/asm/outercache.h |  6 ++++++
>  arch/arm/mm/flush.c               | 24 +++++++++++++++++++++++-
>  3 files changed, 39 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm/include/asm/barrier.h b/arch/arm/include/asm/barrier.h
> index 69772e742a0a..e31833f72b69 100644
> --- a/arch/arm/include/asm/barrier.h
> +++ b/arch/arm/include/asm/barrier.h
> @@ -53,17 +53,23 @@
>  #ifdef CONFIG_ARM_HEAVY_MB
>  extern void (*soc_mb)(void);
>  extern void arm_heavy_mb(void);
> -#define __arm_heavy_mb(x...) do { dsb(x); arm_heavy_mb(); } while (0)
> +extern void arm_heavy_wmb(void);
> +extern void arm_heavy_dma_wmb(void);
> +#define __arm_heavy_mb()	arm_heavy_mb()
> +#define __arm_heavy_wmb()	arm_heavy_wmb()
> +#define __arm_heavy_dma_wmb()	arm_heavy_dma_wmb()
>  #else
> -#define __arm_heavy_mb(x...) dsb(x)
> +#define __arm_heavy_mb()	dsb()
> +#define __arm_heavy_wmb()	dsb(st)
> +#define __arm_heavy_dma_wmb()	dmb(oshst)
>  #endif
>  
>  #if defined(CONFIG_ARM_DMA_MEM_BUFFERABLE) || defined(CONFIG_SMP)
>  #define mb()		__arm_heavy_mb()
>  #define rmb()		dsb()
> -#define wmb()		__arm_heavy_mb(st)
> +#define wmb()		__arm_heavy_wmb()
>  #define dma_rmb()	dmb(osh)
> -#define dma_wmb()	dmb(oshst)
> +#define dma_wmb()	__arm_heavy_dma_wmb()
>  #else
>  #define mb()		barrier()
>  #define rmb()		barrier()
> diff --git a/arch/arm/include/asm/outercache.h b/arch/arm/include/asm/outercache.h
> index c2bf24f40177..e744efe6e2a7 100644
> --- a/arch/arm/include/asm/outercache.h
> +++ b/arch/arm/include/asm/outercache.h
> @@ -43,6 +43,12 @@ struct outer_cache_fns {
>  
>  extern struct outer_cache_fns outer_cache;
>  
> +#ifdef CONFIG_OUTER_CACHE_SYNC
> +static inline bool outer_cache_has_sync(void) { return outer_cache.sync; }
> +#else
> +static inline bool outer_cache_has_sync(void) { return 0; }
> +#endif
> +
>  #ifdef CONFIG_OUTER_CACHE
>  /**
>   * outer_inv_range - invalidate range of outer cache lines
> diff --git a/arch/arm/mm/flush.c b/arch/arm/mm/flush.c
> index 58469623b015..1252689362d8 100644
> --- a/arch/arm/mm/flush.c
> +++ b/arch/arm/mm/flush.c
> @@ -24,7 +24,7 @@
>  #ifdef CONFIG_ARM_HEAVY_MB
>  void (*soc_mb)(void);
>  
> -void arm_heavy_mb(void)
> +static void __arm_heavy_sync(void)
>  {
>  #ifdef CONFIG_OUTER_CACHE_SYNC
>  	if (outer_cache.sync)
> @@ -33,7 +33,29 @@ void arm_heavy_mb(void)
>  	if (soc_mb)
>  		soc_mb();
>  }
> +
> +void arm_heavy_mb(void)
> +{
> +	dsb();
> +	__arm_heavy_sync();
> +}
>  EXPORT_SYMBOL(arm_heavy_mb);
> +
> +void arm_heavy_wmb(void)
> +{
> +	dsb(st);
> +	__arm_heavy_sync();
> +}
> +EXPORT_SYMBOL(arm_heavy_wmb);
> +
> +void arm_heavy_dma_wmb(void)
> +{
> +	if (outer_cache_has_sync() || soc_mb)
> +		arm_heavy_wmb();
> +	else
> +		dmb(oshst);
> +}
> +EXPORT_SYMBOL(arm_heavy_dma_wmb);
>  #endif
>  
>  #ifdef CONFIG_CPU_CACHE_VIPT
> 

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 488 bytes
Desc: OpenPGP digital signature
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20180709/ba850d25/attachment.sig>

  reply	other threads:[~2018-07-09  6:20 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-06-29 12:28 Clarifying dma_wmb behavior in presence of non-coherent masters and outer caches Lucas Stach
2018-06-29 14:25 ` Russell King - ARM Linux
2018-06-29 16:19   ` Lucas Stach
2018-06-29 16:22   ` Will Deacon
2018-06-29 16:48     ` Russell King - ARM Linux
2018-06-29 17:43       ` Will Deacon
2018-06-29 18:01         ` Russell King - ARM Linux
2018-07-02 13:49         ` Lucas Stach
2018-07-02 17:45           ` Will Deacon
2018-07-06 12:26             ` Will Deacon
2018-07-09  6:20               ` Oleksij Rempel [this message]
2018-09-13 13:17                 ` Will Deacon
2018-09-13 14:09                   ` Oleksij Rempel
2018-07-09  9:45               ` Lucas Stach
2018-06-29 17:14     ` Lucas Stach
2018-06-29 17:46       ` Will Deacon
2018-07-02  9:58         ` Lucas Stach

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=72f67412-a011-6552-03be-6730a9302ebb@pengutronix.de \
    --to=o.rempel@pengutronix.de \
    --cc=linux-arm-kernel@lists.infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.