linux-kernel.vger.kernel.org archive mirror
* [RFC v2 0/4] ARM LPAE Outer Shared v2
@ 2016-06-06  3:20 Bill Mills
  2016-06-06  3:20 ` [RFC v2 1/4] ARM: mm: add early page table attribute modification ability Bill Mills
                   ` (3 more replies)
  0 siblings, 4 replies; 21+ messages in thread
From: Bill Mills @ 2016-06-06  3:20 UTC (permalink / raw)
  To: rmk+kernel, mark.rutland, t-kristo, ssantosh, catalin.marinas
  Cc: linux-arm-kernel, linux-kernel, r-woodruff2

This RFC series adds support for outer shared LPAE page table
attributes. This attribute is needed by at least Keystone to achieve
DMA coherency. The choice is made at early boot time and can coexist
with other platforms that want only inner shared.

v2 addresses the concern about changing the memory attributes while the
MMU is on that was raised in v1. It also puts the primary
responsibility of choosing the right mode on the platform.

Instead of adding a "need outer shared" flag to the pv_fixup code, I
created a generic attribute modification mechanism. The idea is that it
could be used to solve other problems where the assumptions of the
early boot tables need to be changed in a safe manner. Right now it is
LPAE only and tied 1:1 with pv_fixup, but that could change. I did test
that applying a pv_fixup of 0 seemed to do no harm.

There is a patch that adds an early param "defshared". This is a
separate patch as I am unsure whether this is really desired. It is
useful for testing the series, however. You can use it to force Keystone
to use inner shared (and it will fall back to non-coherent dma-ops) or
you can use it to force another platform to use outer shared and see
what happens. If we keep the param, documentation will be added.

This series needs more testing and finishing, but I wanted to get a read
on the direction. It runs on Keystone and on QEMU vexpress-A15.
QEMU vexpress runs with inner or outer shared :)
Multiple TODO points are marked in-line. If the approach is accepted, I
will complete the TODO items and TI will do more testing.

Series based on v4.7-rc2

v1 was here:
http://marc.info/?t=146044908600005&r=1&w=2

-- Bill


* [RFC v2 1/4] ARM: mm: add early page table attribute modification ability
  2016-06-06  3:20 [RFC v2 0/4] ARM LPAE Outer Shared v2 Bill Mills
@ 2016-06-06  3:20 ` Bill Mills
  2016-06-06 12:18   ` Russell King - ARM Linux
  2016-06-06  3:20 ` [RFC v2 2/4] ARM: mm: Add LPAE support for outer shared Bill Mills
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 21+ messages in thread
From: Bill Mills @ 2016-06-06  3:20 UTC (permalink / raw)
  To: rmk+kernel, mark.rutland, t-kristo, ssantosh, catalin.marinas
  Cc: linux-arm-kernel, linux-kernel, r-woodruff2, Bill Mills

Allow early init code to specify modifications to be made to the boot
time page table. Any modifications specified will be done with the MMU
off, at the same time that any phys<->virt fixup is done.

This ability is enabled with ARM_PV_FIXUP.

It is currently only implemented for LPAE mode.
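
As an illustration only, the matching rule that the assembly loop in
this patch applies to each page table entry can be sketched in plain C.
This is a userspace model, not kernel code: struct attr_mod and
apply_attr_mods() are hypothetical names, and the explicit length
argument n is an assumption added for safety (the real table is
terminated by a zero test mask).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the four 64-bit fields of struct attr_mod_entry (hypothetical copy). */
struct attr_mod {
	uint64_t test_mask;
	uint64_t test_value;
	uint64_t clear_mask;
	uint64_t set_mask;
};

/*
 * Apply every matching modification to one page table entry.
 * As in the assembly loop: unused (all-zero) entries are left alone,
 * a zero test_mask ends the list, and a later modification sees the
 * result of an earlier one.
 */
static uint64_t apply_attr_mods(uint64_t entry, const struct attr_mod *mods,
				size_t n)
{
	size_t i;

	assert(mods != NULL || n == 0);

	if (entry == 0)
		return entry;

	for (i = 0; i < n && mods[i].test_mask != 0; i++) {
		if ((entry & mods[i].test_mask) != mods[i].test_value)
			continue;		/* descriptor does not match */
		entry &= ~mods[i].clear_mask;
		entry |= mods[i].set_mask;
	}

	return entry;
}
```

With a single descriptor of test_mask = 3 << 8, test_value = 3 << 8,
clear_mask = 3 << 8 and set_mask = 2 << 8, an entry whose SH field is
inner shared comes back outer shared, and every other entry is returned
unchanged.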

Signed-off-by: Bill Mills <wmills@ti.com>
---
 arch/arm/include/asm/pgtable-hwdef.h | 21 +++++++++
 arch/arm/mm/mmu.c                    | 36 ++++++++++++---
 arch/arm/mm/pv-fixup-asm.S           | 86 ++++++++++++++++++++++++++++++++++--
 3 files changed, 135 insertions(+), 8 deletions(-)

diff --git a/arch/arm/include/asm/pgtable-hwdef.h b/arch/arm/include/asm/pgtable-hwdef.h
index 8426229..c35d71f 100644
--- a/arch/arm/include/asm/pgtable-hwdef.h
+++ b/arch/arm/include/asm/pgtable-hwdef.h
@@ -16,4 +16,25 @@
 #include <asm/pgtable-2level-hwdef.h>
 #endif
 
+#ifdef CONFIG_ARM_PV_FIXUP
+
+#define MAX_ATTR_MOD_ENTRIES	64
+
+#ifndef __ASSEMBLY__
+
+struct attr_mod_entry {
+	pmdval_t	test_mask;
+	pmdval_t	test_value;
+	pmdval_t	clear_mask;
+	pmdval_t	set_mask;
+};
+
+bool attr_mod_add(struct attr_mod_entry *pmod);
+
+extern int num_attr_mods;
+extern struct attr_mod_entry attr_mod_table[MAX_ATTR_MOD_ENTRIES];
+
+#endif	/* __ASSEMBLY__ */
+#endif	/* CONFIG_ARM_PV_FIXUP */
+
 #endif
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index 62f4d01..a608980 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -1496,23 +1496,41 @@ extern unsigned long __atags_pointer;
 typedef void pgtables_remap(long long offset, unsigned long pgd, void *bdata);
 pgtables_remap lpae_pgtables_remap_asm;
 
+int num_attr_mods;
+
+/* add an entry to the early page table attribute modification list */
+bool __init attr_mod_add(struct attr_mod_entry *pmod)
+{
+	if (num_attr_mods >= MAX_ATTR_MOD_ENTRIES) {
+		pr_crit("Out of room for (or late use of) early page table attribute modifications.\n");
+		return false;
+	}
+
+	attr_mod_table[num_attr_mods++] = *pmod;
+	return true;
+}
+
 /*
  * early_paging_init() recreates boot time page table setup, allowing machines
  * to switch over to a high (>4G) address space on LPAE systems
+ *
+ * This function also applies any attribute modifications specified in
+ * attr_mod_table.  These may have been added before we got here (early_param)
+ * or from within mdesc->pv_fixup called by this function
  */
 void __init early_paging_init(const struct machine_desc *mdesc)
 {
 	pgtables_remap *lpae_pgtables_remap;
 	unsigned long pa_pgd;
 	unsigned int cr, ttbcr;
-	long long offset;
+	long long offset = 0;
 	void *boot_data;
+	unsigned long pmd;
 
-	if (!mdesc->pv_fixup)
-		return;
+	if (mdesc->pv_fixup)
+		offset = mdesc->pv_fixup();
 
-	offset = mdesc->pv_fixup();
-	if (offset == 0)
+	if (offset == 0 && num_attr_mods == 0)
 		return;
 
 	/*
@@ -1564,6 +1582,14 @@ void __init early_paging_init(const struct machine_desc *mdesc)
 	/* Re-enable the caches and cacheable TLB walks */
 	asm volatile("mcr p15, 0, %0, c2, c0, 2" : : "r" (ttbcr));
 	set_cr(cr);
+
+	/* disable any further use of attribute fixup */
+	num_attr_mods = MAX_ATTR_MOD_ENTRIES + 1;
+
+	/* record the new "initial" pmd and cachepolicy */
+	pmd = pmd_val(*pmd_off_k((unsigned long)_data));
+	pmd &= ~PMD_MASK;
+	init_default_cache_policy(pmd);
 }
 
 #else
diff --git a/arch/arm/mm/pv-fixup-asm.S b/arch/arm/mm/pv-fixup-asm.S
index 1867f3e4..ad8edc2 100644
--- a/arch/arm/mm/pv-fixup-asm.S
+++ b/arch/arm/mm/pv-fixup-asm.S
@@ -19,8 +19,44 @@
 #define L1_ORDER 3
 #define L2_ORDER 3
 
+/*
+ *	attr_mod_table:
+ *		describe transforms to be made to the early boot pgtable
+ *		This is poked by early init code
+ *	mod descriptor list:
+ *		64 bit test mask
+ *		64 bit test value
+ *		64 bit clear mask
+ *		64 bit set mask
+ *      	next descriptor
+ *		...
+ *		0x0000_0000 0x0000_0000 end of list
+ */
+/* TODO: what segment?, test w/ XIP kernel? */
+	 .globl attr_mod_table
+attr_mod_table:
+	.zero   8*MAX_ATTR_MOD_ENTRIES*4 + 1
+
+/*
+ *	lpae_pgtables_remap_asm(long long offset, unsigned long pg,
+ *		void* boot_data)
+ *
+ *	Rewrite initial boot page tables with new physical addresses and or
+ *	attributes.
+ *	This function starts in identity mapped VA -> low PA
+ *	The body runs in low PA with MMU off
+ * 	The function ends in "identity mapped" VA -> high PA
+ *	The function returns to kernel VA space -> high PA
+ *
+ *	- r0    PA delta low
+ *	- r1	PA delta high
+ *	- r2    address of top level table
+ *	- r3    address of dtb (or atags)
+ *
+ *	uses null terminated list of attribute modifications in attr_mod_table
+ */
 ENTRY(lpae_pgtables_remap_asm)
-	stmfd	sp!, {r4-r8, lr}
+	stmfd	sp!, {r4-r11, lr}
 
 	mrc	p15, 0, r8, c1, c0, 0		@ read control reg
 	bic	ip, r8, #CR_M			@ disable caches and MMU
@@ -63,6 +99,7 @@ ENTRY(lpae_pgtables_remap_asm)
 	subs	r6, r6, #1
 	bne	2b
 
+	/* Update HW page table regs with new PA */
 	mrrc	p15, 0, r4, r5, c2		@ read TTBR0
 	adds	r4, r4, r0			@ update physical address
 	adc	r5, r5, r1
@@ -74,15 +111,58 @@ ENTRY(lpae_pgtables_remap_asm)
 
 	dsb
 
+	/* Update attributes of all level 2 entries in 1GB space */
+	/* TODO: fix/test BE8 THUMB2 kernel */
+	adrl	r3, attr_mod_table
+	add	r7, r2, #0x1000
+	add	r6, r7, #0x4000
+	bl	3f				@ NOT C ABI
+
+	/* Update attributes of the 4 level 1 entries */
+	/* TODO: delete this or allow mod entries to match only L1 */
+	mov	r7, r2
+	add	r6, r7, #32
+	bl	3f				@ NOT C ABI
+	b	7f
+
+3:	ldrd	r4, [r7]
+	orrs	r11, r4, r5
+	beq	6f				@ skip unused entries
+	mov 	r10, r3
+4:	ldrd	r8, [r10]
+	orrs	r11, r8, r9
+	beq	6f				@ end of mod table?
+	and	r0, r4, r8			@ no, load test mask
+	and	r1, r5, r9
+	ldrd	r8, [r10, #8]			@ load test bits
+	cmp	r0, r8
+	cmpeq	r1, r9
+	bne	5f				@ does entry match desc?
+	ldrd	r8, [r10, #16]			@ yes, load mod clear mask
+	bic	r4, r4, r8
+	bic	r5, r5, r9
+	ldrd	r8, [r10, #24]			@ load mod set mask
+	orr	r4, r4, r8
+	orr	r5, r5, r9
+5:	add     r10, r10, #32			@ try next mod desc
+	b	4b
+6:	strd	r4, [r7], #1 << L2_ORDER
+	cmp	r7, r6
+	blo	3b				@ r6 is the exclusive end address
+	bx 	lr
+
+7:
 	mov	ip, #0
 	mcr	p15, 0, ip, c7, c5, 0		@ I+BTB cache invalidate
 	mcr	p15, 0, ip, c8, c7, 0		@ local_flush_tlb_all()
 	dsb
 	isb
 
-	mcr	p15, 0, r8, c1, c0, 0		@ re-enable MMU
+	mrc	p15, 0, r8, c1, c0, 0		@ re-enable MMU
+	orr	r8, r8, #CR_M
+	mcr	p15, 0, r8, c1, c0, 0
 	dsb
 	isb
 
-	ldmfd	sp!, {r4-r8, pc}
+	ldmfd	sp!, {r4-r11, pc}
 ENDPROC(lpae_pgtables_remap_asm)
-- 
1.9.1


* [RFC v2 2/4] ARM: mm: Add LPAE support for outer shared
  2016-06-06  3:20 [RFC v2 0/4] ARM LPAE Outer Shared v2 Bill Mills
  2016-06-06  3:20 ` [RFC v2 1/4] ARM: mm: add early page table attribute modification ability Bill Mills
@ 2016-06-06  3:20 ` Bill Mills
  2016-06-06  3:20 ` [RFC v2 3/4] ARM: mm: add inner/outer sharing value command line Bill Mills
  2016-06-06  3:20 ` [RFC v2 4/4] ARM: keystone: dma-coherent with safe fallback Bill Mills
  3 siblings, 0 replies; 21+ messages in thread
From: Bill Mills @ 2016-06-06  3:20 UTC (permalink / raw)
  To: rmk+kernel, mark.rutland, t-kristo, ssantosh, catalin.marinas
  Cc: linux-arm-kernel, linux-kernel, r-woodruff2, Bill Mills

Support early init selection of inner or outer shared page table
attributes.

In LPAE, the shareability field has three values: non-shared,
inner-shared, and outer-shared.  Provide a mask and both shared values.
The shared value in use is stored in variables.  The old constants are
eliminated to avoid accidental use.

Early page tables and variables are initialized to inner shared.

If a platform needs outer shared, it calls use_outer_shared()
during early_paging_init.  The variables and early page table are
fixed up. The mem_types built during paging_init are fixed to match
the value in effect.
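
A rough userspace sketch of the idea, under simplifying assumptions: the
SH[1:0] encodings are the ones added to pgtable-3level-hwdef.h in this
patch, while selected_sh and fixup_shared() are hypothetical stand-ins
for pmd_sect_s / l_pte_shared and the mem_types fixup. Unlike the patch,
the sketch treats each protection value on its own.

```c
#include <assert.h>
#include <stdint.h>

#define SH_MASK		(UINT64_C(3) << 8)	/* SH[1:0] field */
#define SH_INNER	(UINT64_C(3) << 8)	/* inner shareable */
#define SH_OUTER	(UINT64_C(2) << 8)	/* outer shareable */

/* Selected at early init; starts out as inner shared. */
static uint64_t selected_sh = SH_INNER;

/*
 * Rewrite the SH field of one protection value to the selected value.
 * Values that are not shared at all are left untouched.
 */
static uint64_t fixup_shared(uint64_t prot)
{
	assert((selected_sh & ~SH_MASK) == 0);

	if (!(prot & SH_MASK))
		return prot;		/* non-shared stays non-shared */

	return (prot & ~SH_MASK) | selected_sh;
}
```

The point of keeping the selected value in a variable is that static
initializers can only use the compile-time (early) value, so anything
built from them has to be corrected once the platform has made its
choice.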

No functional change for non-LPAE.  We only add a few extra aliases for
existing constants and use some extra vars at boot.

This patch is based in part on an earlier RFC patch by
    Tero Kristo <t-kristo@ti.com>

Signed-off-by: Bill Mills <wmills@ti.com>
---
 arch/arm/include/asm/pgtable-2level-hwdef.h |  6 +++
 arch/arm/include/asm/pgtable-3level-hwdef.h | 14 ++++-
 arch/arm/include/asm/pgtable-3level.h       |  2 +-
 arch/arm/include/asm/pgtable-hwdef.h        |  1 +
 arch/arm/include/asm/pgtable.h              |  3 ++
 arch/arm/mm/dump.c                          | 28 ++++++++++
 arch/arm/mm/mmu.c                           | 80 +++++++++++++++++++++++------
 arch/arm/mm/proc-v7-3level.S                |  2 +-
 8 files changed, 115 insertions(+), 21 deletions(-)

diff --git a/arch/arm/include/asm/pgtable-2level-hwdef.h b/arch/arm/include/asm/pgtable-2level-hwdef.h
index d0131ee..d62e20f 100644
--- a/arch/arm/include/asm/pgtable-2level-hwdef.h
+++ b/arch/arm/include/asm/pgtable-2level-hwdef.h
@@ -93,4 +93,10 @@
 
 #define PHYS_MASK		(~0UL)
 
+/* These are here to share more code between 2level & 3level */
+#define L_PTE_EARLY_SHARED	PTE_EXT_SHARED
+#define PTE_EXT_SMASK		PTE_EXT_SHARED
+#define PMD_SECT_EARLY_S	PMD_SECT_S
+#define PMD_SECT_SMASK		PMD_SECT_S
+
 #endif
diff --git a/arch/arm/include/asm/pgtable-3level-hwdef.h b/arch/arm/include/asm/pgtable-3level-hwdef.h
index f8f1cff..3ffc0ce 100644
--- a/arch/arm/include/asm/pgtable-3level-hwdef.h
+++ b/arch/arm/include/asm/pgtable-3level-hwdef.h
@@ -44,7 +44,9 @@
 #define PMD_SECT_CACHEABLE	(_AT(pmdval_t, 1) << 3)
 #define PMD_SECT_USER		(_AT(pmdval_t, 1) << 6)		/* AP[1] */
 #define PMD_SECT_AP2		(_AT(pmdval_t, 1) << 7)		/* read only */
-#define PMD_SECT_S		(_AT(pmdval_t, 3) << 8)
+#define PMD_SECT_SMASK		(_AT(pmdval_t, 3) << 8)		/* shareable bits */
+#define PMD_SECT_ISHARED	(_AT(pmdval_t, 3) << 8)		/* inner sharable */
+#define PMD_SECT_OSHARED	(_AT(pmdval_t, 2) << 8)		/* outer sharable */
 #define PMD_SECT_AF		(_AT(pmdval_t, 1) << 10)
 #define PMD_SECT_nG		(_AT(pmdval_t, 1) << 11)
 #define PMD_SECT_PXN		(_AT(pmdval_t, 1) << 53)
@@ -73,12 +75,20 @@
 #define PTE_BUFFERABLE		(_AT(pteval_t, 1) << 2)		/* AttrIndx[0] */
 #define PTE_CACHEABLE		(_AT(pteval_t, 1) << 3)		/* AttrIndx[1] */
 #define PTE_AP2			(_AT(pteval_t, 1) << 7)		/* AP[2] */
-#define PTE_EXT_SHARED		(_AT(pteval_t, 3) << 8)		/* SH[1:0], inner shareable */
+#define PTE_EXT_SMASK		(_AT(pteval_t, 3) << 8)		/* SH[1:0], shareable */
+#define PTE_EXT_ISHARED	(_AT(pteval_t, 3) << 8)		/* SH[1:0], inner shareable */
+#define PTE_EXT_OSHARED	(_AT(pteval_t, 2) << 8)		/* SH[1:0], outer shareable */
 #define PTE_EXT_AF		(_AT(pteval_t, 1) << 10)	/* Access Flag */
 #define PTE_EXT_NG		(_AT(pteval_t, 1) << 11)	/* nG */
 #define PTE_EXT_PXN		(_AT(pteval_t, 1) << 53)	/* PXN */
 #define PTE_EXT_XN		(_AT(pteval_t, 1) << 54)	/* XN */
 
+/* in early boot we assume inner shared,
+ * afterward use L_PTE_SHARED but only in code, can't be static initializer
+ */
+#define L_PTE_EARLY_SHARED	PTE_EXT_ISHARED
+#define PMD_SECT_EARLY_S	PMD_SECT_ISHARED
+
 /*
  * 40-bit physical address supported.
  */
diff --git a/arch/arm/include/asm/pgtable-3level.h b/arch/arm/include/asm/pgtable-3level.h
index fa70db7..af5b9cb 100644
--- a/arch/arm/include/asm/pgtable-3level.h
+++ b/arch/arm/include/asm/pgtable-3level.h
@@ -78,7 +78,7 @@
 #define L_PTE_VALID		(_AT(pteval_t, 1) << 0)		/* Valid */
 #define L_PTE_PRESENT		(_AT(pteval_t, 3) << 0)		/* Present */
 #define L_PTE_USER		(_AT(pteval_t, 1) << 6)		/* AP[1] */
-#define L_PTE_SHARED		(_AT(pteval_t, 3) << 8)		/* SH[1:0], inner shareable */
+#define L_PTE_SHARED		(l_pte_shared)			/* inner or outer shareable */
 #define L_PTE_YOUNG		(_AT(pteval_t, 1) << 10)	/* AF */
 #define L_PTE_XN		(_AT(pteval_t, 1) << 54)	/* XN */
 #define L_PTE_DIRTY		(_AT(pteval_t, 1) << 55)
diff --git a/arch/arm/include/asm/pgtable-hwdef.h b/arch/arm/include/asm/pgtable-hwdef.h
index c35d71f..27654a9 100644
--- a/arch/arm/include/asm/pgtable-hwdef.h
+++ b/arch/arm/include/asm/pgtable-hwdef.h
@@ -30,6 +30,7 @@ struct attr_mod_entry {
 };
 
 bool attr_mod_add(struct attr_mod_entry *pmod);
+bool use_outer_shared(void);
 
 extern int num_attr_mods;
 extern struct attr_mod_entry attr_mod_table[MAX_ATTR_MOD_ENTRIES];
diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
index 348caab..4d2e412 100644
--- a/arch/arm/include/asm/pgtable.h
+++ b/arch/arm/include/asm/pgtable.h
@@ -86,6 +86,9 @@ extern pgprot_t		pgprot_hyp_device;
 extern pgprot_t		pgprot_s2;
 extern pgprot_t		pgprot_s2_device;
 
+extern pmdval_t		pmd_sect_s;
+extern pteval_t		l_pte_shared;
+
 #define _MOD_PROT(p, b)	__pgprot(pgprot_val(p) | (b))
 
 #define PAGE_NONE		_MOD_PROT(pgprot_user, L_PTE_XN | L_PTE_RDONLY | L_PTE_NONE)
diff --git a/arch/arm/mm/dump.c b/arch/arm/mm/dump.c
index 9fe8e24..fb98fa6 100644
--- a/arch/arm/mm/dump.c
+++ b/arch/arm/mm/dump.c
@@ -68,10 +68,24 @@ static const struct prot_bits pte_bits[] = {
 		.set	= "NX",
 		.clear	= "x ",
 	}, {
+#ifndef CONFIG_ARM_LPAE
 		.mask	= L_PTE_SHARED,
 		.val	= L_PTE_SHARED,
 		.set	= "SHD",
 		.clear	= "   ",
+#else
+		.mask	= PTE_EXT_SMASK,
+		.val	= PTE_EXT_ISHARED,
+		.set	= "ISHD",
+	}, {
+		.mask	= PTE_EXT_SMASK,
+		.val	= PTE_EXT_OSHARED,
+		.set	= "OSHD",
+	}, {
+		.mask	= PTE_EXT_SMASK,
+		.val	= 0,
+		.set	= "    ",
+#endif
 	}, {
 		.mask	= L_PTE_MT_MASK,
 		.val	= L_PTE_MT_UNCACHED,
@@ -172,10 +186,24 @@ static const struct prot_bits section_bits[] = {
 		.set	= "NX",
 		.clear	= "x ",
 	}, {
+#ifndef CONFIG_ARM_LPAE
 		.mask	= PMD_SECT_S,
 		.val	= PMD_SECT_S,
 		.set	= "SHD",
 		.clear	= "   ",
+#else
+		.mask	= PMD_SECT_SMASK,
+		.val	= PMD_SECT_ISHARED,
+		.set	= "ISHD",
+	}, {
+		.mask	= PMD_SECT_SMASK,
+		.val	= PMD_SECT_OSHARED,
+		.set	= "OSHD",
+	}, {
+		.mask	= PMD_SECT_SMASK,
+		.val	= 0,
+		.set	= "    ",
+#endif
 	},
 };
 
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index a608980..8aaccf2 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -70,6 +70,13 @@ pgprot_t pgprot_hyp_device;
 pgprot_t pgprot_s2;
 pgprot_t pgprot_s2_device;
 
+/* For LPAE hold the value of Inner or Outer Shared attribute selected at
+ * early init, which starts out as inner shared
+ * For non_LPAE these are always just the single S Bit
+ */
+pmdval_t pmd_sect_s    = PMD_SECT_EARLY_S;
+pteval_t l_pte_shared  = L_PTE_EARLY_SHARED;
+
 EXPORT_SYMBOL(pgprot_user);
 EXPORT_SYMBOL(pgprot_kernel);
 
@@ -246,12 +253,12 @@ __setup("noalign", noalign_setup);
 static struct mem_type mem_types[] = {
 	[MT_DEVICE] = {		  /* Strongly ordered / ARMv6 shared device */
 		.prot_pte	= PROT_PTE_DEVICE | L_PTE_MT_DEV_SHARED |
-				  L_PTE_SHARED,
+				  L_PTE_EARLY_SHARED,
 		.prot_pte_s2	= s2_policy(PROT_PTE_S2_DEVICE) |
 				  s2_policy(L_PTE_S2_MT_DEV_SHARED) |
-				  L_PTE_SHARED,
+				  L_PTE_EARLY_SHARED,
 		.prot_l1	= PMD_TYPE_TABLE,
-		.prot_sect	= PROT_SECT_DEVICE | PMD_SECT_S,
+		.prot_sect	= PROT_SECT_DEVICE | PMD_SECT_EARLY_S,
 		.domain		= DOMAIN_IO,
 	},
 	[MT_DEVICE_NONSHARED] = { /* ARMv6 non-shared device */
@@ -340,8 +347,9 @@ static struct mem_type mem_types[] = {
 		.prot_pte  = L_PTE_PRESENT | L_PTE_YOUNG | L_PTE_DIRTY |
 				L_PTE_MT_UNCACHED | L_PTE_XN,
 		.prot_l1   = PMD_TYPE_TABLE,
-		.prot_sect = PMD_TYPE_SECT | PMD_SECT_AP_WRITE | PMD_SECT_S |
-				PMD_SECT_UNCACHED | PMD_SECT_XN,
+		.prot_sect = PMD_TYPE_SECT | PMD_SECT_AP_WRITE |
+				PMD_SECT_EARLY_S | PMD_SECT_UNCACHED |
+				PMD_SECT_XN,
 		.domain    = DOMAIN_KERNEL,
 	},
 	[MT_MEMORY_DMA_READY] = {
@@ -422,6 +430,15 @@ void __set_fixmap(enum fixed_addresses idx, phys_addr_t phys, pgprot_t prot)
 	local_flush_tlb_kernel_range(vaddr, vaddr + PAGE_SIZE);
 }
 
+#ifdef CONFIG_ARM_LPAE
+static void __init fixup_mem_type_shared(struct mem_type *pmt)
+{
+	pmt->prot_sect   = (pmt->prot_sect   & ~PMD_SECT_SMASK) | pmd_sect_s;
+	pmt->prot_pte    = (pmt->prot_pte    & ~PTE_EXT_SMASK)  | l_pte_shared;
+	pmt->prot_pte_s2 = (pmt->prot_pte_s2 & ~PTE_EXT_SMASK)  | l_pte_shared;
+}
+#endif
+
 /*
  * Adjust the PMD section entries according to the CPU in use.
  */
@@ -449,14 +466,24 @@ static void __init build_mem_type_table(void)
 		ecc_mask = 0;
 	}
 
+#ifdef CONFIG_ARM_LPAE
+	if (pmd_sect_s != PMD_SECT_EARLY_S)
+		/* we are using different sharable value than was set at
+		 * compile time, fixup the mem types
+		 */
+		for (i = 0; i < ARRAY_SIZE(mem_types); i++)
+			if (mem_types[i].prot_sect & PMD_SECT_SMASK)
+				fixup_mem_type_shared(&mem_types[i]);
+#endif
+
 	if (is_smp()) {
 		if (cachepolicy != CPOLICY_WRITEALLOC) {
 			pr_warn("Forcing write-allocate cache policy for SMP\n");
 			cachepolicy = CPOLICY_WRITEALLOC;
 		}
-		if (!(initial_pmd_value & PMD_SECT_S)) {
+		if (!(initial_pmd_value & PMD_SECT_SMASK)) {
 			pr_warn("Forcing shared mappings for SMP\n");
-			initial_pmd_value |= PMD_SECT_S;
+			initial_pmd_value |= pmd_sect_s;
 		}
 	}
 
@@ -470,7 +497,7 @@ static void __init build_mem_type_table(void)
 			mem_types[i].prot_sect &= ~PMD_SECT_TEX(7);
 	if ((cpu_arch < CPU_ARCH_ARMv6 || !(cr & CR_XP)) && !cpu_is_xsc3())
 		for (i = 0; i < ARRAY_SIZE(mem_types); i++)
-			mem_types[i].prot_sect &= ~PMD_SECT_S;
+			mem_types[i].prot_sect &= ~PMD_SECT_SMASK;
 
 	/*
 	 * ARMv5 and lower, bit 4 must be set for page tables (was: cache
@@ -592,25 +619,25 @@ static void __init build_mem_type_table(void)
 #endif
 
 		/*
-		 * If the initial page tables were created with the S bit
-		 * set, then we need to do the same here for the same
-		 * reasons given in early_cachepolicy().
+		 * If we are using shared mode (ex SMP)
+		 * then we need to add the shared attribute to all needed
+		 * mem_types
 		 */
-		if (initial_pmd_value & PMD_SECT_S) {
+		if (initial_pmd_value & PMD_SECT_SMASK) {
 			user_pgprot |= L_PTE_SHARED;
 			kern_pgprot |= L_PTE_SHARED;
 			vecs_pgprot |= L_PTE_SHARED;
 			s2_pgprot |= L_PTE_SHARED;
-			mem_types[MT_DEVICE_WC].prot_sect |= PMD_SECT_S;
+			mem_types[MT_DEVICE_WC].prot_sect |= pmd_sect_s;
 			mem_types[MT_DEVICE_WC].prot_pte |= L_PTE_SHARED;
-			mem_types[MT_DEVICE_CACHED].prot_sect |= PMD_SECT_S;
+			mem_types[MT_DEVICE_CACHED].prot_sect |= pmd_sect_s;
 			mem_types[MT_DEVICE_CACHED].prot_pte |= L_PTE_SHARED;
-			mem_types[MT_MEMORY_RWX].prot_sect |= PMD_SECT_S;
+			mem_types[MT_MEMORY_RWX].prot_sect |= pmd_sect_s;
 			mem_types[MT_MEMORY_RWX].prot_pte |= L_PTE_SHARED;
-			mem_types[MT_MEMORY_RW].prot_sect |= PMD_SECT_S;
+			mem_types[MT_MEMORY_RW].prot_sect |= pmd_sect_s;
 			mem_types[MT_MEMORY_RW].prot_pte |= L_PTE_SHARED;
 			mem_types[MT_MEMORY_DMA_READY].prot_pte |= L_PTE_SHARED;
-			mem_types[MT_MEMORY_RWX_NONCACHED].prot_sect |= PMD_SECT_S;
+			mem_types[MT_MEMORY_RWX_NONCACHED].prot_sect |= pmd_sect_s;
 			mem_types[MT_MEMORY_RWX_NONCACHED].prot_pte |= L_PTE_SHARED;
 		}
 	}
@@ -1510,6 +1537,25 @@ bool __init attr_mod_add(struct attr_mod_entry *pmod)
 	return true;
 }
 
+/* use outer shared wherever we would have used inner shared */
+bool __init use_outer_shared(void)
+{
+	struct attr_mod_entry mod = {
+		.test_mask   = PTE_EXT_SMASK,
+		.test_value  = PTE_EXT_ISHARED,
+		.clear_mask  = PTE_EXT_SMASK,
+		.set_mask    = PTE_EXT_OSHARED
+	};
+
+	if (attr_mod_add(&mod)) {
+		l_pte_shared = PTE_EXT_OSHARED;
+		pmd_sect_s   = PMD_SECT_OSHARED;
+		return true;
+	}
+
+	return false;
+}
+
 /*
  * early_paging_init() recreates boot time page table setup, allowing machines
  * to switch over to a high (>4G) address space on LPAE systems
diff --git a/arch/arm/mm/proc-v7-3level.S b/arch/arm/mm/proc-v7-3level.S
index 5e5720e..a518b3b 100644
--- a/arch/arm/mm/proc-v7-3level.S
+++ b/arch/arm/mm/proc-v7-3level.S
@@ -38,7 +38,7 @@
 
 /* PTWs cacheable, inner WBWA shareable, outer WBWA not shareable */
 #define TTB_FLAGS_SMP	(TTB_IRGN_WBWA|TTB_S|TTB_RGN_OC_WBWA)
-#define PMD_FLAGS_SMP	(PMD_SECT_WBWA|PMD_SECT_S)
+#define PMD_FLAGS_SMP	(PMD_SECT_WBWA|PMD_SECT_EARLY_S)
 
 #ifndef __ARMEB__
 #  define rpgdl	r0
-- 
1.9.1


* [RFC v2 3/4] ARM: mm: add inner/outer sharing value command line
  2016-06-06  3:20 [RFC v2 0/4] ARM LPAE Outer Shared v2 Bill Mills
  2016-06-06  3:20 ` [RFC v2 1/4] ARM: mm: add early page table attribute modification ability Bill Mills
  2016-06-06  3:20 ` [RFC v2 2/4] ARM: mm: Add LPAE support for outer shared Bill Mills
@ 2016-06-06  3:20 ` Bill Mills
  2016-06-06  3:20 ` [RFC v2 4/4] ARM: keystone: dma-coherent with safe fallback Bill Mills
  3 siblings, 0 replies; 21+ messages in thread
From: Bill Mills @ 2016-06-06  3:20 UTC (permalink / raw)
  To: rmk+kernel, mark.rutland, t-kristo, ssantosh, catalin.marinas
  Cc: linux-arm-kernel, linux-kernel, r-woodruff2, Bill Mills

Add defshared=inner|outer as an early command line option.
Any such command line option overrides a platform's choice.
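
The override works because early params are handled before the platform
hook runs, and only the first choice is accepted. A minimal userspace
sketch of that "first caller wins" rule follows; choose_sharing() and
sharing_seen are hypothetical names that merge the two helpers in this
patch into one function for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* NULL until a choice has been recorded. */
static const char *sharing_seen;

/*
 * Record the default sharing mode. Only the first valid request
 * succeeds; later requests are refused and leave the choice as is.
 */
static bool choose_sharing(const char *mode)
{
	assert(mode != NULL);

	if (sharing_seen)
		return false;		/* already decided earlier */

	if (strcmp(mode, "inner") != 0 && strcmp(mode, "outer") != 0)
		return false;		/* unknown mode */

	sharing_seen = mode;
	return true;
}
```

If the command line asks for "inner" first, a later platform request for
"outer" returns false, which is what lets Keystone fall back to
non-coherent DMA in patch 4.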

Signed-off-by: Bill Mills <wmills@ti.com>
---
 arch/arm/include/asm/pgtable-hwdef.h |  1 +
 arch/arm/mm/mmu.c                    | 36 ++++++++++++++++++++++++++++++++++++
 2 files changed, 37 insertions(+)

diff --git a/arch/arm/include/asm/pgtable-hwdef.h b/arch/arm/include/asm/pgtable-hwdef.h
index 27654a9..2a9e24b 100644
--- a/arch/arm/include/asm/pgtable-hwdef.h
+++ b/arch/arm/include/asm/pgtable-hwdef.h
@@ -31,6 +31,7 @@ struct attr_mod_entry {
 
 bool attr_mod_add(struct attr_mod_entry *pmod);
 bool use_outer_shared(void);
+bool use_inner_shared(void);
 
 extern int num_attr_mods;
 extern struct attr_mod_entry attr_mod_table[MAX_ATTR_MOD_ENTRIES];
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index 8aaccf2..cc4a803 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -1524,6 +1524,7 @@ typedef void pgtables_remap(long long offset, unsigned long pgd, void *bdata);
 pgtables_remap lpae_pgtables_remap_asm;
 
 int num_attr_mods;
+static const char *defshared_seen;
 
 /* add an entry to the early page table attribute modification list */
 bool __init attr_mod_add(struct attr_mod_entry *pmod)
@@ -1547,15 +1548,50 @@ bool __init use_outer_shared(void)
 		.set_mask    = PTE_EXT_OSHARED
 	};
 
+	if (defshared_seen) {
+		pr_err("Default Sharing already set to %s\n", defshared_seen);
+		return false;
+	}
+
 	if (attr_mod_add(&mod)) {
 		l_pte_shared = PTE_EXT_OSHARED;
 		pmd_sect_s   = PMD_SECT_OSHARED;
+		defshared_seen = "outer";
 		return true;
 	}
 
 	return false;
 }
 
+/* explicitly use inner shared */
+bool __init use_inner_shared(void)
+{
+	if (defshared_seen) {
+		pr_err("Default Sharing already set to %s\n", defshared_seen);
+		return false;
+	}
+
+	defshared_seen = "inner";
+	return true;
+}
+
+/*
+ * Allow sharing type to be set
+ */
+static int __init early_defshared(char *p)
+{
+	if (strcmp(p, "outer") == 0)
+		use_outer_shared();
+	else if (strcmp(p, "inner") == 0)
+		use_inner_shared();
+	else
+		pr_err("Unknown defshared mode %s\n", p);
+
+	return 0;
+}
+
+early_param("defshared", early_defshared);
+
 /*
  * early_paging_init() recreates boot time page table setup, allowing machines
  * to switch over to a high (>4G) address space on LPAE systems
-- 
1.9.1


* [RFC v2 4/4] ARM: keystone: dma-coherent with safe fallback
  2016-06-06  3:20 [RFC v2 0/4] ARM LPAE Outer Shared v2 Bill Mills
                   ` (2 preceding siblings ...)
  2016-06-06  3:20 ` [RFC v2 3/4] ARM: mm: add inner/outer sharing value command line Bill Mills
@ 2016-06-06  3:20 ` Bill Mills
  2016-06-06  8:56   ` Mark Rutland
  3 siblings, 1 reply; 21+ messages in thread
From: Bill Mills @ 2016-06-06  3:20 UTC (permalink / raw)
  To: rmk+kernel, mark.rutland, t-kristo, ssantosh, catalin.marinas
  Cc: linux-arm-kernel, linux-kernel, r-woodruff2, Bill Mills

Keystone2 can do DMA coherency but only if:
1) DDR3A DMA buffers are in high physical addresses (0x8_0000_0000)
    (DDR3B does not have this constraint)
2) Memory is marked outer shared
3) DMA Master marks transactions as outer shared
    (This is taken care of in bootloader)

Use outer shared instead of inner shared.
This choice is made at early init time and uses the attr_mod facility.

If the kernel is not configured for LPAE and using high PA, or if the
switch to outer shared fails, then we fail to meet these criteria.
Under any of these conditions we veto any dma-coherent attributes in
the DTB.
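
The decision described above can be summarized with a small userspace
sketch. It is only a model of the conditions listed in this message:
keystone_can_do_coherent_dma() and its parameters are hypothetical, and
the constant is the 0x8_0000_0000 address quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Start of the high physical alias of DDR3A, as quoted in the text. */
#define HIGH_PHYS_START	UINT64_C(0x800000000)

/*
 * Coherent DMA is only claimed when every kernel-side condition holds:
 * LPAE is enabled, the kernel runs from the high physical addresses,
 * and the switch to outer shared succeeded. The DMA master side is
 * assumed to have been set up by the bootloader.
 */
static bool keystone_can_do_coherent_dma(bool lpae, uint64_t phys_offset,
					 bool outer_shared_ok)
{
	return lpae && phys_offset >= HIGH_PHYS_START && outer_shared_ok;
}
```

When the function returns false, the behaviour in this patch is to strip
every dma-coherent property before the platform devices are populated.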

Signed-off-by: Bill Mills <wmills@ti.com>
---
 arch/arm/mach-keystone/keystone.c | 37 +++++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/arch/arm/mach-keystone/keystone.c b/arch/arm/mach-keystone/keystone.c
index a33a296..d10adaf 100644
--- a/arch/arm/mach-keystone/keystone.c
+++ b/arch/arm/mach-keystone/keystone.c
@@ -28,6 +28,7 @@
 #include "keystone.h"
 
 static unsigned long keystone_dma_pfn_offset __read_mostly;
+static bool keystone_dma_coherent;
 
 static int keystone_platform_notifier(struct notifier_block *nb,
 				      unsigned long event, void *data)
@@ -52,21 +53,53 @@ static struct notifier_block platform_nb = {
 	.notifier_call = keystone_platform_notifier,
 };
 
+void veto_dma_coherent(void)
+{
+	struct device_node	*node, *start_node;
+	struct property		*prop;
+
+	for (start_node = NULL;
+	     (node = of_find_node_with_property(start_node, "dma-coherent"));
+	     start_node = node) {
+		prop = of_find_property(node, "dma-coherent", NULL);
+		if (prop)
+			of_remove_property(node, prop);
+	}
+}
+
 static void __init keystone_init(void)
 {
+	/* If we are running from the high physical addresses then adjust
+	 * addresses we give to the device's DMA.  They will be seeing this
+	 * memory through the MSMC address translation which makes the first 2GB
+	 * of high memory appear in the low 4GB space.
+	 * (DMA masters on keystone2 have 32 bit address buses)
+	 */
 	if (PHYS_OFFSET >= KEYSTONE_HIGH_PHYS_START) {
 		keystone_dma_pfn_offset = PFN_DOWN(KEYSTONE_HIGH_PHYS_START -
 						   KEYSTONE_LOW_PHYS_START);
 		bus_register_notifier(&platform_bus_type, &platform_nb);
 	}
+
+	/* if the kernel has not been configured to meet the keystone
+	 * platform requirements to achieve DMA coherency, then ignore any
+	 * device tree configuration for this
+	 */
+	if (!keystone_dma_coherent)
+		veto_dma_coherent();
+
 	keystone_pm_runtime_init();
 	of_platform_populate(NULL, of_default_bus_match_table, NULL, NULL);
 }
 
 static long long __init keystone_pv_fixup(void)
 {
+#ifdef CONFIG_ARM_LPAE
 	long long offset;
 	phys_addr_t mem_start, mem_end;
+	bool dma_ok;
+
+	dma_ok = use_outer_shared();
 
 	mem_start = memblock_start_of_DRAM();
 	mem_end = memblock_end_of_DRAM();
@@ -84,11 +117,15 @@ static long long __init keystone_pv_fixup(void)
 	}
 
 	offset = KEYSTONE_HIGH_PHYS_START - KEYSTONE_LOW_PHYS_START;
+	keystone_dma_coherent = dma_ok;
 
 	/* Populate the arch idmap hook */
 	arch_phys_to_idmap_offset = -offset;
 
 	return offset;
+#else
+	return 0;
+#endif
 }
 
 static const char *const keystone_match[] __initconst = {
-- 
1.9.1


* Re: [RFC v2 4/4] ARM: keystone: dma-coherent with safe fallback
  2016-06-06  3:20 ` [RFC v2 4/4] ARM: keystone: dma-coherent with safe fallback Bill Mills
@ 2016-06-06  8:56   ` Mark Rutland
  2016-06-06  9:09     ` Arnd Bergmann
  2016-06-06 11:43     ` Russell King - ARM Linux
  0 siblings, 2 replies; 21+ messages in thread
From: Mark Rutland @ 2016-06-06  8:56 UTC (permalink / raw)
  To: Bill Mills
  Cc: rmk+kernel, t-kristo, ssantosh, catalin.marinas,
	linux-arm-kernel, linux-kernel, r-woodruff2, devicetree

[adding devicetree]

On Sun, Jun 05, 2016 at 11:20:29PM -0400, Bill Mills wrote:
> Keystone2 can do DMA coherency but only if:
> 1) DDR3A DMA buffers are in high physical addresses (0x8_0000_0000)
>     (DDR3B does not have this constraint)
> 2) Memory is marked outer shared
> 3) DMA Master marks transactions as outer shared
>     (This is taken care of in bootloader)
> 
> Use outer shared instead of inner shared.
> This choice is done at early init time and uses the attr_mod facility
> 
> If the kernel is not configured for LPAE and using high PA, or if the
> switch to outer shared fails, then we fail to meet this criteria.
> Under any of these conditions we veto any dma-coherent attributes in
> the DTB.

I very much do not like this. As I previously mentioned [1],
dma-coherent has de-facto semantics today. This series deliberately
changes that, and inverts the relationship between DT and kernel (as the
description in the DT would now depend on the configuration of the
kernel).

I would prefer that we have a separate property (e.g.
"dma-outer-coherent") to describe when a device can be coherent with
Normal, Outer Shareable, Inner Write-Back, Outer Write-Back memory.
Then the kernel can figure out whether or not the device can be used
coherently, depending on how it is configured.
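
To make the suggestion concrete, here is one possible reading of it as a
userspace sketch. Everything in it is an assumption for illustration:
the property name is only the example given above, no such binding
exists in this series, and struct dev_desc and dev_is_coherent() are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum sharing { SHARING_INNER, SHARING_OUTER };

/* What the device tree says about one device (hypothetical model). */
struct dev_desc {
	bool dma_coherent;		/* existing "dma-coherent" property */
	bool dma_outer_coherent;	/* example "dma-outer-coherent" property */
};

/*
 * The DT only describes the hardware; the kernel combines that with
 * its own configuration to decide whether coherent DMA ops are usable.
 */
static bool dev_is_coherent(const struct dev_desc *dev,
			    enum sharing kernel_sharing)
{
	assert(dev != NULL);

	if (dev->dma_coherent)
		return true;		/* de-facto semantics unchanged */

	return dev->dma_outer_coherent && kernel_sharing == SHARING_OUTER;
}
```

In this reading the DT stays the same for every kernel configuration,
and only the result of the check changes.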

Thanks,
Mark.

[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2016-April/421470.html

> 
> Signed-off-by: Bill Mills <wmills@ti.com>
> ---
>  arch/arm/mach-keystone/keystone.c | 37 +++++++++++++++++++++++++++++++++++++
>  1 file changed, 37 insertions(+)
> 
> diff --git a/arch/arm/mach-keystone/keystone.c b/arch/arm/mach-keystone/keystone.c
> index a33a296..d10adaf 100644
> --- a/arch/arm/mach-keystone/keystone.c
> +++ b/arch/arm/mach-keystone/keystone.c
> @@ -28,6 +28,7 @@
>  #include "keystone.h"
>  
>  static unsigned long keystone_dma_pfn_offset __read_mostly;
> +static bool keystone_dma_coherent;
>  
>  static int keystone_platform_notifier(struct notifier_block *nb,
>  				      unsigned long event, void *data)
> @@ -52,21 +53,53 @@ static struct notifier_block platform_nb = {
>  	.notifier_call = keystone_platform_notifier,
>  };
>  
> +void veto_dma_coherent(void)
> +{
> +	struct device_node	*node, *start_node;
> +	struct property		*prop;
> +
> +	for (start_node = NULL;
> +	     (node = of_find_node_with_property(start_node, "dma-coherent"));
> +	     start_node = node) {
> +		prop = of_find_property(node, "dma-coherent", NULL);
> +		if (prop)
> +			of_remove_property(node, prop);
> +	}
> +}
> +
>  static void __init keystone_init(void)
>  {
> +	/* If we are running from the high physical addresses then adjust
> +	 * addresses we give to the device's DMA.  They will be seeing this
> +	 * memory through the MSMC address translation which makes the first 2GB
> +	 * of high memory appear in the low 4GB space.
> +	 * (DMA masters on keystone2 have 32 bit address buses)
> +	 */
>  	if (PHYS_OFFSET >= KEYSTONE_HIGH_PHYS_START) {
>  		keystone_dma_pfn_offset = PFN_DOWN(KEYSTONE_HIGH_PHYS_START -
>  						   KEYSTONE_LOW_PHYS_START);
>  		bus_register_notifier(&platform_bus_type, &platform_nb);
>  	}
> +
> +	/* if the kernel has not been configured to meet the keystone
> +	 * platform requirements to achieve DMA coherency, then ignore any
> +	 * device tree configuration for this
> +	 */
> +	if (!keystone_dma_coherent)
> +		veto_dma_coherent();
> +
>  	keystone_pm_runtime_init();
>  	of_platform_populate(NULL, of_default_bus_match_table, NULL, NULL);
>  }
>  
>  static long long __init keystone_pv_fixup(void)
>  {
> +#ifdef CONFIG_ARM_LPAE
>  	long long offset;
>  	phys_addr_t mem_start, mem_end;
> +	bool dma_ok;
> +
> +	dma_ok = use_outer_shared();
>  
>  	mem_start = memblock_start_of_DRAM();
>  	mem_end = memblock_end_of_DRAM();
> @@ -84,11 +117,15 @@ static long long __init keystone_pv_fixup(void)
>  	}
>  
>  	offset = KEYSTONE_HIGH_PHYS_START - KEYSTONE_LOW_PHYS_START;
> +	keystone_dma_coherent = dma_ok;
>  
>  	/* Populate the arch idmap hook */
>  	arch_phys_to_idmap_offset = -offset;
>  
>  	return offset;
> +#else
> +	return 0;
> +#endif
>  }
>  
>  static const char *const keystone_match[] __initconst = {
> -- 
> 1.9.1
> 


* Re: [RFC v2 4/4] ARM: keystone: dma-coherent with safe fallback
  2016-06-06  8:56   ` Mark Rutland
@ 2016-06-06  9:09     ` Arnd Bergmann
  2016-06-06 11:42       ` Mark Rutland
  2016-06-06 11:43     ` Russell King - ARM Linux
  1 sibling, 1 reply; 21+ messages in thread
From: Arnd Bergmann @ 2016-06-06  9:09 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Bill Mills, rmk+kernel, t-kristo, ssantosh, catalin.marinas,
	linux-arm-kernel, linux-kernel, r-woodruff2, devicetree

On Monday, June 6, 2016 9:56:27 AM CEST Mark Rutland wrote:
> [adding devicetree]
> 
> On Sun, Jun 05, 2016 at 11:20:29PM -0400, Bill Mills wrote:
> > Keystone2 can do DMA coherency but only if:
> > 1) DDR3A DMA buffers are in high physical addresses (0x8_0000_0000)
> >     (DDR3B does not have this constraint)
> > 2) Memory is marked outer shared
> > 3) DMA Master marks transactions as outer shared
> >     (This is taken care of in bootloader)
> > 
> > Use outer shared instead of inner shared.
> > This choice is done at early init time and uses the attr_mod facility
> > 
> > If the kernel is not configured for LPAE and using high PA, or if the
> > switch to outer shared fails, then we fail to meet these criteria.
> > Under any of these conditions we veto any dma-coherent attributes in
> > the DTB.
> 
> I very much do not like this. As I previously mentioned [1],
> dma-coherent has de-facto semantics today. This series deliberately
> changes that, and inverts the relationship between DT and kernel (as the
> description in the DT would now depend on the configuration of the
> kernel).
> 
> I would prefer that we have a separate property (e.g.
> "dma-outer-coherent") to describe when a device can be coherent with
> Normal, Outer Shareable, Inner Write-Back, Outer Write-Back memory.
> Then the kernel can figure out whether or not the device can be used
> coherently, depending on how it is configured.

I share your concern, but I don't think the dma-outer-coherent attribute
would be a good solution either.

The problem really is that keystone is a platform that is sometimes
coherent, depending purely on what kernel we run, and not at all on
anything we can describe in devicetree, and I don't see any good way
to capture the behavior of the hardware in generic DT bindings.

So far, the assumption has been:

- when running a non-LPAE kernel, keystone is not coherent, and we
  must ignore both the dma-coherent properties in devices and the
  dma-ranges properties in bus nodes.
- when running an LPAE kernel, keystone is coherent, and we must
  respect both of those.

My interpretation of Bill's description above is that we now have
an additional requirement that at least I was not aware of before,
regarding the outer-sharable attribute. I don't think there is
much value in making this a boot-time option, since everyone would
want to run this platform in a cache-coherent way if at all possible.

We already have special hacks to detect the case of keystone running
in LPAE mode, in order to do the special rewrite-all-page-tables
hack at boot time for relocating the physical address, and we could
use the same hack to change the page table attributes.

The question is how to communicate the requirement for outer-sharable
for a platform. If we think it's a safe assumption that there will
not be future 32-bit platforms with this requirement (or maybe one
or two more at most), we could leave it in the special keystone hack.
Alternatively, a DT property in an appropriate node could indicate
that a particular platform requires it.

	Arnd


* Re: [RFC v2 4/4] ARM: keystone: dma-coherent with safe fallback
  2016-06-06  9:09     ` Arnd Bergmann
@ 2016-06-06 11:42       ` Mark Rutland
  2016-06-06 12:37         ` Arnd Bergmann
  2016-06-06 12:50         ` William Mills
  0 siblings, 2 replies; 21+ messages in thread
From: Mark Rutland @ 2016-06-06 11:42 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Bill Mills, rmk+kernel, t-kristo, ssantosh, catalin.marinas,
	linux-arm-kernel, linux-kernel, r-woodruff2, devicetree

On Mon, Jun 06, 2016 at 11:09:07AM +0200, Arnd Bergmann wrote:
> On Monday, June 6, 2016 9:56:27 AM CEST Mark Rutland wrote:
> > [adding devicetree]
> > 
> > On Sun, Jun 05, 2016 at 11:20:29PM -0400, Bill Mills wrote:
> > > Keystone2 can do DMA coherency but only if:
> > > 1) DDR3A DMA buffers are in high physical addresses (0x8_0000_0000)
> > >     (DDR3B does not have this constraint)
> > > 2) Memory is marked outer shared
> > > 3) DMA Master marks transactions as outer shared
> > >     (This is taken care of in bootloader)
> > > 
> > > Use outer shared instead of inner shared.
> > > This choice is done at early init time and uses the attr_mod facility
> > > 
> > > If the kernel is not configured for LPAE and using high PA, or if the
> > > switch to outer shared fails, then we fail to meet these criteria.
> > > Under any of these conditions we veto any dma-coherent attributes in
> > > the DTB.
> > 
> > I very much do not like this. As I previously mentioned [1],
> > dma-coherent has de-facto semantics today. This series deliberately
> > changes that, and inverts the relationship between DT and kernel (as the
> > description in the DT would now depend on the configuration of the
> > kernel).
> > 
> > I would prefer that we have a separate property (e.g.
> > "dma-outer-coherent") to describe when a device can be coherent with
> > Normal, Outer Shareable, Inner Write-Back, Outer Write-Back memory.
> > Then the kernel can figure out whether or not the device can be used
> > coherently, depending on how it is configured.
> 
> I share your concern, but I don't think the dma-outer-coherent attribute
> would be a good solution either.
> 
> The problem really is that keystone is a platform that is sometimes
> coherent, depending purely on what kernel we run, and not at all on
> anything we can describe in devicetree, and I don't see any good way
> to capture the behavior of the hardware in generic DT bindings.

I think the above doesn't quite capture the situation:

Some DMA masters can be cache-coherent (only) with Outer Shareable
transactions. That is a property we could capture in the DT (e.g.
dma-outer-coherent), and is independent of the kernel configuration.

Whether or not the devices are coherent with the kernel's chosen memory
attributes certainly depends on the kernel configuration, but that is
not what we capture in the DT.

> So far, the assumption has been:
> 
> - when running a non-LPAE kernel, keystone is not coherent, and we
>   must ignore both the dma-coherent properties in devices and the
>   dma-ranges properties in bus nodes.

I wasn't able to spot if/where that was enforced. Is it possible to boot
Keystone UP, !LPAE?

> - when running an LPAE kernel, keystone is coherent, and we must
>   respect both of those.

Similarly this has not been enforced either way. Currently I cannot see
how devices with dma-coherent could possibly work with Keystone.

I think we also need to be clear as to what we mean by "keystone is
coherent". With LPAE, CPUs can be coherent with each other, so long as
the appropriate physical addresses are used (with the magic to handle
that).

Devices being cache-coherent with CPUs is already something we manage
per-device.

> My interpretation of Bill's description above is that we now have
> an additional requirement that at least I was not aware of before,
> regarding the outer-sharable attribute. I don't think there is
> much value in making this a boot-time option, since everyone would
> want to run this platform in a cache-coherent way if at all possible.
>
> We already have special hacks to detect the case of keystone running
> in LPAE mode, in order to do the special rewrite-all-page-tables
> hack at boot time for relocating the physical address, and we could
> use the same hack to change the page table attributes.
> 
> The question is how to communicate the requirement for outer-sharable
> for a platform. If we think it's a safe assumption that there will
> not be future 32-bit platforms with this requirement (or maybe one
> or two more at most), we could leave it in the special keystone hack.
> Alternatively, a DT property in an appropriate node could indicate
> that a particular platform requires it.

As above, I think this must be specified per-device, following the usual
manner in which we describe Inner Shareable coherency using
dma-coherent.

Thanks,
Mark.


* Re: [RFC v2 4/4] ARM: keystone: dma-coherent with safe fallback
  2016-06-06  8:56   ` Mark Rutland
  2016-06-06  9:09     ` Arnd Bergmann
@ 2016-06-06 11:43     ` Russell King - ARM Linux
  2016-06-06 11:59       ` Mark Rutland
  1 sibling, 1 reply; 21+ messages in thread
From: Russell King - ARM Linux @ 2016-06-06 11:43 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Bill Mills, t-kristo, ssantosh, catalin.marinas,
	linux-arm-kernel, linux-kernel, r-woodruff2, devicetree

On Mon, Jun 06, 2016 at 09:56:27AM +0100, Mark Rutland wrote:
> I very much do not like this. As I previously mentioned [1],
> dma-coherent has de-facto semantics today. This series deliberately
> changes that, and inverts the relationship between DT and kernel (as the
> description in the DT would now depend on the configuration of the
> kernel).

dma-coherent's semantics are not very well defined - just grep for it
in Documentation/devicetree/ and you'll find several different wordings
for what this property means.

Anyway, my point here is that all of these merely say that the hardware
is coherent in _some regard_ - it doesn't specify under what conditions
DMA coherency is guaranteed by the hardware.  It happens that on ARM,
most platforms give that guarantee when using inner shared mappings.  If
we were to use some other sharing, or disable sharing altogether (eg, by
disabling SMP support) then all these platforms would immediately break.

In other words, DMA coherence today already depends on the kernel's setup
of the page tables corresponding to the requirements of the hardware.

Keystone II is just slightly different - and as I understand it, TI
followed one of the early specifications that ARM Ltd produced.  That
specification may have contained errors, but unfortunately, we now have
a situation where there is hardware out there which followed in good
faith.

So, it seems to me to be entirely reasonable that Keystone II should
mark devices with the "dma-coherent" property - just the same way as
every other platform does.  It also seems to be entirely appropriate for
a platform to remove this property if it determines that the conditions
for DMA coherency are not met - in order to save the user's data from
corruption.

TI Keystone II is not the only platform with issues here: there are
Marvell Armada platforms out there which have DMA coherence but are
uniprocessor.  We don't set the shared bit (which they require for
DMA coherence), and so we omit the dma-coherent property from the
device tree at the moment.  And they're inner-shared coherent.  We
just don't set the page tables up so that they can work.

So, I think to require a whole new property is absurd.  The existing
property means "if the rest of the system is appropriately configured,
this device can be dma-coherent".  So, what I think we need is a way
to communicate whether the rest of the system has been appropriately
configured, so the property can be attached to devices which meet the
criteria, but the arch/platform level can signal whether the conditions
for device DMA coherence have been met.  That's not a DT property,
that's a matter of how the kernel has setup the system.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.


* Re: [RFC v2 4/4] ARM: keystone: dma-coherent with safe fallback
  2016-06-06 11:43     ` Russell King - ARM Linux
@ 2016-06-06 11:59       ` Mark Rutland
  2016-06-06 12:19         ` William Mills
  2016-06-06 12:32         ` Russell King - ARM Linux
  0 siblings, 2 replies; 21+ messages in thread
From: Mark Rutland @ 2016-06-06 11:59 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Bill Mills, t-kristo, ssantosh, catalin.marinas,
	linux-arm-kernel, linux-kernel, r-woodruff2, devicetree

On Mon, Jun 06, 2016 at 12:43:21PM +0100, Russell King - ARM Linux wrote:
> On Mon, Jun 06, 2016 at 09:56:27AM +0100, Mark Rutland wrote:
> > I very much do not like this. As I previously mentioned [1],
> > dma-coherent has de-facto semantics today. This series deliberately
> > changes that, and inverts the relationship between DT and kernel (as the
> > description in the DT would now depend on the configuration of the
> > kernel).
> 
> dma-coherent's semantics are not very well defined - just grep for it
> in Documentation/devicetree/ and you'll find several different wordings
> for what this property means.

Indeed. This is the tip of the iceberg w.r.t. under-specification of
memory attribute usage.

> Anyway, my point here is that all of these merely say that the hardware
> is coherent in _some regard_ - it doesn't specify under what conditions
> DMA coherency is guaranteed by the hardware.  It happens that on ARM,
> most platforms give that guarantee when using inner shared mappings.  If
> we were to use some other sharing, or disable sharing altogether (eg, by
> disabling SMP support) then all these platforms would immediately break.
> 
> In other words, DMA coherence today already depends on the kernel's setup
> of the page tables corresponding to the requirements of the hardware.

I agree that whether or not devices are coherent in practice depends on
the kernel's configuration. The flip side, as you point out, is that
devices are coherent when a specific set of attributes are used.

i.e. that if you read dma-coherent as meaning "coherent iff Normal,
Inner Shareable, Inner WB Cacheable, Outer WB Cacheable", then
dma-coherent consistently describes the same thing, rather than
depending on the configuration of the OS.

DT is a datastructure provided to the kernel, potentially without deep
internal knowledge of that kernel configuration. Having a consistent
rule that is independent of the kernel configuration seems worth aiming
for.

A dma-outer-coherent property would allow us to accurately describe the
keystone case in the same way, independent of kernel configuration.

> Keystone II is just slightly different - and as I understand it, TI
> followed one of the early specifications that ARM Ltd produced.  That
> specification may have contained errors, but unfortunately, we now have
> a situation where there is hardware out there which followed in good
> faith.

To be clear, I am not arguing against supporting keystone. I just wish
to avoid muddying the waters further w.r.t. the semantics of
dma-coherent, which I believe can be salvaged and made consistent.

Clearly, those semantics are the point of contention here.

Thanks,
Mark.


* Re: [RFC v2 1/4] ARM: mm: add early page table attribute modification ability
  2016-06-06  3:20 ` [RFC v2 1/4] ARM: mm: add early page table attribute modification ability Bill Mills
@ 2016-06-06 12:18   ` Russell King - ARM Linux
  2016-06-06 12:31     ` William Mills
  0 siblings, 1 reply; 21+ messages in thread
From: Russell King - ARM Linux @ 2016-06-06 12:18 UTC (permalink / raw)
  To: Bill Mills
  Cc: mark.rutland, t-kristo, ssantosh, catalin.marinas,
	linux-arm-kernel, linux-kernel, r-woodruff2

On Sun, Jun 05, 2016 at 11:20:26PM -0400, Bill Mills wrote:
> Allow early-init to specify modifications to be made to the boot time page
> table. Any modifications specified will be done with MMU off at the same
> time that any Phy<->Virt fixup is done.

I think this is rather over-engineered - do we need to support multiple
different fixups to the page tables like this?

Given how this has grown, I think it would be better to duplicate the
existing swapper_pg_dir, modify the new copy, and then have the
pv-fixup-asm code merely copy the new to the old with the MMU off.
That way, the only two things that the assembly code has to do is to
deal with the page table update, and updating the TTBR registers.
Most of the complexity can then be kept in the C code.
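
A minimal userspace sketch of that split, assuming LPAE long-descriptor
entries where SH[1:0] sits in bits [9:8] (0b11 inner shareable, 0b10
outer shareable). The function names are hypothetical, install_table()
merely stands in for the MMU-off copy done in assembly, and every valid
entry is treated as a block/page descriptor (walking next-level tables
is omitted):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DESC_VALID	(UINT64_C(1) << 0)
#define DESC_SH_SHIFT	8
#define DESC_SH_MASK	(UINT64_C(3) << DESC_SH_SHIFT)
#define DESC_SH_OUTER	(UINT64_C(2) << DESC_SH_SHIFT)
#define DESC_SH_INNER	(UINT64_C(3) << DESC_SH_SHIFT)

/*
 * "C side": build an offline copy of the live table and rewrite every
 * valid inner-shareable entry as outer-shareable. The live table is
 * not touched here.
 */
static void make_outer_shared(uint64_t *copy, const uint64_t *live, size_t n)
{
	assert(copy != NULL && live != NULL);

	memcpy(copy, live, n * sizeof(*copy));
	for (size_t i = 0; i < n; i++) {
		if ((copy[i] & DESC_VALID) &&
		    (copy[i] & DESC_SH_MASK) == DESC_SH_INNER)
			copy[i] = (copy[i] & ~DESC_SH_MASK) | DESC_SH_OUTER;
	}
}

/* "asm side" stand-in: copy the new table over the old one in place. */
static void install_table(uint64_t *live, const uint64_t *copy, size_t n)
{
	memcpy(live, copy, n * sizeof(*live));
}
```

The point of the split is that all attribute policy lives in
make_outer_shared(); the install step stays a dumb copy.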

I think we also need to modify the TTBCR to match the sharability of
memory - currently, TTB walks will be inner sharable, but my
understanding is that if we switch memory to be outer sharable, we
also need to update the TTB walks to match.
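
For illustration, a sketch of that TTBCR change, assuming the
long-descriptor (LPAE) TTBCR layout with SH0 in bits [13:12] and SH1 in
bits [29:28], and the same 0b10 outer / 0b11 inner encoding as the
descriptors. The helper name is hypothetical and it operates on a plain
value rather than on the real CP15 register:

```c
#include <assert.h>
#include <stdint.h>

#define TTBCR_SH0_SHIFT	12
#define TTBCR_SH1_SHIFT	28
#define TTBCR_SH_MASK	UINT32_C(3)
#define TTBCR_SH_OUTER	UINT32_C(2)
#define TTBCR_SH_INNER	UINT32_C(3)

/*
 * Return ttbcr with the shareability of table walks for both TTBR0 and
 * TTBR1 set to sh, leaving every other field untouched.
 */
static uint32_t ttbcr_set_walk_shareability(uint32_t ttbcr, uint32_t sh)
{
	assert(sh <= TTBCR_SH_MASK);

	ttbcr &= ~((TTBCR_SH_MASK << TTBCR_SH0_SHIFT) |
		   (TTBCR_SH_MASK << TTBCR_SH1_SHIFT));
	ttbcr |= (sh << TTBCR_SH0_SHIFT) | (sh << TTBCR_SH1_SHIFT);
	return ttbcr;
}
```

On real hardware the value would be read from and written back to the
register with the MMU off, alongside the page table update.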



* Re: [RFC v2 4/4] ARM: keystone: dma-coherent with safe fallback
  2016-06-06 11:59       ` Mark Rutland
@ 2016-06-06 12:19         ` William Mills
  2016-06-06 12:32         ` Russell King - ARM Linux
  1 sibling, 0 replies; 21+ messages in thread
From: William Mills @ 2016-06-06 12:19 UTC (permalink / raw)
  To: Mark Rutland, Russell King - ARM Linux
  Cc: t-kristo, ssantosh, catalin.marinas, linux-arm-kernel,
	linux-kernel, r-woodruff2, devicetree



On 06/06/2016 07:59 AM, Mark Rutland wrote:
> On Mon, Jun 06, 2016 at 12:43:21PM +0100, Russell King - ARM Linux wrote:
>> On Mon, Jun 06, 2016 at 09:56:27AM +0100, Mark Rutland wrote:
>>> I very much do not like this. As I previously mentioned [1],
>>> dma-coherent has de-facto semantics today. This series deliberately
>>> changes that, and inverts the relationship between DT and kernel (as the
>>> description in the DT would now depend on the configuration of the
>>> kernel).
>>
>> dma-coherent's semantics are not very well defined - just grep for it
>> in Documentation/devicetree/ and you'll find several different wordings
>> for what this property means.
> 
> Indeed. This is the tip of the iceberg w.r.t. under-specification of
> memory attribute usage.
> 
>> Anyway, my point here is that all of these merely say that the hardware
>> is coherent in _some regard_ - it doesn't specify under what conditions
>> DMA coherency is guaranteed by the hardware.  It happens that on ARM,
>> most platforms give that guarantee when using inner shared mappings.  If
>> we were to use some other sharing, or disable sharing altogether (eg, by
>> disabling SMP support) then all these platforms would immediately break.
>>
>> In other words, DMA coherence today already depends on the kernel's setup
>> of the page tables corresponding to the requirements of the hardware.
> 
> I agree that whether or not devices are coherent in practice depends on
> the kernel's configuration. The flip side, as you point out, is that
> devices are coherent when a specific set of attributes are used.
> 
> i.e. that if you read dma-coherent as meaning "coherent iff Normal,
> Inner Shareable, Inner WB Cacheable, Outer WB Cacheable", then
> dma-coherent consistently describes the same thing, rather than
> depending on the configuration of the OS.
>

Even w/o inner / outer it seems to me it is under specified.
What about a system where main DDR memory is DMA coherent but an on chip
SRAM needs manual flush/invalidate?  Right now dma coherency is all or
nothing.  In the above case you would need to ensure that the device
never tried to use that SRAM for dma transactions even though the device
is perfectly capable with the right hand-holding.  Alternatively you
could use the SRAM but penalize the normal case of DDR.

> DT is a datastructure provided to the kernel, potentially without deep
> internal knowledge of that kernel configuration. Having a consistent
> rule that is independent of the kernel configuration seems worth aiming
> for.
> 
> A dma-outer-coherent property would allow us to accurately describe the
> keystone case in the same way, independent of kernel configuration.
> 
>> Keystone II is just slightly different - and as I understand it, TI
>> followed one of the early specifications that ARM Ltd produced.  That
>> specification may have contained errors, but unfortunately, we now have
>> a situation where there is hardware out there which followed in good
>> faith.
> 
> To be clear, I am not arguing against supporting keystone. I just wish
> to avoid muddying the waters further w.r.t. the semantics of
> dma-coherent, which I believe can be salvaged and made consistent.
> 
> Clearly, those semantics are the point of contention here.
>

To me this seems like a choice between embracing outer-shared or
treating it like a quirk of some early armv7 devices.  With ARM's
current recommendations to use inner-shared for everything in many
contexts (like ARM servers etc), I was assuming you would want the latter.

Thanks for looking.  This is not the patch that I expected to generate
the most discussion.  :)

-- Bill


* Re: [RFC v2 1/4] ARM: mm: add early page table attribute modification ability
  2016-06-06 12:18   ` Russell King - ARM Linux
@ 2016-06-06 12:31     ` William Mills
  0 siblings, 0 replies; 21+ messages in thread
From: William Mills @ 2016-06-06 12:31 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: mark.rutland, t-kristo, ssantosh, catalin.marinas,
	linux-arm-kernel, linux-kernel, r-woodruff2



On 06/06/2016 08:18 AM, Russell King - ARM Linux wrote:
> On Sun, Jun 05, 2016 at 11:20:26PM -0400, Bill Mills wrote:
>> Allow early-init to specify modifications to be made to the boot time page
>> table. Any modifications specified will be done with MMU off at the same
>> time that any Phy<->Virt fixup is done.
> 
> I think this is rather over-engineered - do we need to support multiple
> different fixups to the page tables like this?

Yes I was expecting this comment but thought I would give you the choice. :)

> 
> Given how this has grown, I think it would be better to duplicate the
> existing swapper_pg_dir, modify the new copy, and then have the
> pv-fixup-asm code merely copy the new to the old with the MMU off.
> That way, the only two things that the assembly code has to do is to
> deal with the page table update, and updating the TTBR registers.
> Most of the complexity can then be kept in the C code.
> 

I really like this.  I can just do the outer shared fixup and not worry
about a generalized mechanism.  *If* someone needs to do another fixup
they can just code it in C.

The new patch #1 will just rework the PV_FIXUP for the new asm/C split.

You want the off-line table to copy over the early table in place w/ MMU
off, correct? (Not update the HW to point to a new spot.)

> I think we also need to modify the TTBCR to match the sharability of
> memory - currently, TTB walks will be inner sharable, but my
> understanding is that if we switch memory to be outer sharable, we
> also need to update the TTB walks to match.
> 

Good point, Thanks.  I don't think our internal hack has been doing that.


* Re: [RFC v2 4/4] ARM: keystone: dma-coherent with safe fallback
  2016-06-06 11:59       ` Mark Rutland
  2016-06-06 12:19         ` William Mills
@ 2016-06-06 12:32         ` Russell King - ARM Linux
  2016-06-06 16:28           ` Santosh Shilimkar
  2016-06-07 10:01           ` Mark Rutland
  1 sibling, 2 replies; 21+ messages in thread
From: Russell King - ARM Linux @ 2016-06-06 12:32 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Bill Mills, t-kristo, ssantosh, catalin.marinas,
	linux-arm-kernel, linux-kernel, r-woodruff2, devicetree

On Mon, Jun 06, 2016 at 12:59:18PM +0100, Mark Rutland wrote:
> I agree that whether or not devices are coherent in practice depends on
> the kernel's configuration. The flip side, as you point out, is that
> devices are coherent when a specific set of attributes are used.
> 
> i.e. that if you read dma-coherent as meaning "coherent iff Normal,
> Inner Shareable, Inner WB Cacheable, Outer WB Cacheable", then
> dma-coherent consistently describes the same thing, rather than
> depending on the configuration of the OS.
> 
> DT is a datastructure provided to the kernel, potentially without deep
> internal knowledge of that kernel configuration. Having a consistent
> rule that is independent of the kernel configuration seems worth aiming
> for.

I think you've missed the point.  dma-coherent is _already_ dependent on
the kernel configuration.  "Having a consistent rule that is independent
of the kernel configuration" is already an impossibility, as I illustrated
in my previous message concerning Marvell Armada SoCs, and you also said
in your preceding paragraph!

For example, if you clear the shared bit in the page tables on non-LPAE
SoCs, devices are no longer coherent.

DMA coherence on ARM _is_ already tightly linked with the kernel
configuration.  You already can't get away from that, so I think you
should give up trying to argue that point. :)

Whether devices are DMA coherent is a combination of two things:
 * is the device connected to a coherent bus.
 * is the system setup to allow coherency on that bus to work.

We capture the first through the dma-coherent property, which is clearly
a per-device property.  We ignore the second because we assume everyone
is going to configure the CPU side correctly.  That's untrue today, and
it's untrue not only because of Keystone II, but also because of other
SoCs as well which pre-date Keystone II.  We currently miss out on
considering that, because if we ignore it, we get something that works
for most platforms.
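
Put as a minimal sketch (userspace C; the struct and function names are
hypothetical, not existing kernel interfaces), the effective answer is
the AND of the per-device property and a system-wide condition that
arch/platform code would establish at boot:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical system-wide state: set by arch/platform code once it
 * knows whether the page tables match what the hardware requires.
 */
struct system_state {
	bool coherency_conditions_met;
};

/*
 * A device is treated as DMA coherent only if it sits on a coherent
 * bus (the "dma-coherent" property) AND the system has been set up so
 * that coherency on that bus can work.
 */
static bool effective_dma_coherent(bool dt_dma_coherent,
				   const struct system_state *sys)
{
	assert(sys != NULL);
	return dt_dma_coherent && sys->coherency_conditions_met;
}
```

In this model the DT is never edited; a platform that fails its
conditions simply leaves coherency_conditions_met false.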

I don't see that adding a dma-outer-coherent property helps this - it's
muddying the waters somewhat - and it's also forcing additional complexity
into places where we shouldn't have it.  We would need to parse two
properties in the DMA API code, and then combine it with knowledge as
to how the system page tables have been setup.  If they've been setup
as inner sharable, then dma-coherent identifies whether the device is
coherent.  If they've been setup as outer sharable, then
dma-outer-coherent specifies that and dma-coherent is meaningless.

Sounds like a recipe for confusion.



* Re: [RFC v2 4/4] ARM: keystone: dma-coherent with safe fallback
  2016-06-06 11:42       ` Mark Rutland
@ 2016-06-06 12:37         ` Arnd Bergmann
  2016-06-06 12:50         ` William Mills
  1 sibling, 0 replies; 21+ messages in thread
From: Arnd Bergmann @ 2016-06-06 12:37 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Bill Mills, rmk+kernel, t-kristo, ssantosh, catalin.marinas,
	linux-arm-kernel, linux-kernel, r-woodruff2, devicetree

On Monday, June 6, 2016 12:42:56 PM CEST Mark Rutland wrote:
> On Mon, Jun 06, 2016 at 11:09:07AM +0200, Arnd Bergmann wrote:
> > On Monday, June 6, 2016 9:56:27 AM CEST Mark Rutland wrote:
> >
> > So far, the assumption has been:
> > 
> > - when running a non-LPAE kernel, keystone is not coherent, and we
> >   must ignore both the dma-coherent properties in devices and the
> >   dma-ranges properties in bus nodes.
> 
> I wasn't able to spot if/where that was enforced. Is it possible to boot
> Keystone UP, !LPAE?

With !LPAE, no devices are coherent, so that should work with both
SMP and UP. Not sure about LPAE with coherent devices on UP,
IIRC we had a bug in that configuration on Armada XP, as the memory
was not marked as sharable at all there, and ended up not being
coherent with DMA masters.

My first guess is that uniprocessor mode on keystone has not been
tested at all (TI's QA seems to test only a very limited number of
configurations), so it may or may not work.

	Arnd


* Re: [RFC v2 4/4] ARM: keystone: dma-coherent with safe fallback
  2016-06-06 11:42       ` Mark Rutland
  2016-06-06 12:37         ` Arnd Bergmann
@ 2016-06-06 12:50         ` William Mills
  2016-06-06 16:18           ` Santosh Shilimkar
  1 sibling, 1 reply; 21+ messages in thread
From: William Mills @ 2016-06-06 12:50 UTC (permalink / raw)
  To: Mark Rutland, Arnd Bergmann
  Cc: rmk+kernel, t-kristo, ssantosh, catalin.marinas,
	linux-arm-kernel, linux-kernel, r-woodruff2, devicetree



On 06/06/2016 07:42 AM, Mark Rutland wrote:
> On Mon, Jun 06, 2016 at 11:09:07AM +0200, Arnd Bergmann wrote:
>> On Monday, June 6, 2016 9:56:27 AM CEST Mark Rutland wrote:
>>> [adding devicetree]
>>>
>>> On Sun, Jun 05, 2016 at 11:20:29PM -0400, Bill Mills wrote:
>>>> Keystone2 can do DMA coherency but only if:
>>>> 1) DDR3A DMA buffers are in high physical addresses (0x8_0000_0000)
>>>>     (DDR3B does not have this constraint)
>>>> 2) Memory is marked outer shared
>>>> 3) DMA Master marks transactions as outer shared
>>>>     (This is taken care of in bootloader)
>>>>
>>>> Use outer shared instead of inner shared.
>>>> This choice is done at early init time and uses the attr_mod facility
>>>>
>>>> If the kernel is not configured for LPAE and using high PA, or if the
>>>> switch to outer shared fails, then we fail to meet these criteria.
>>>> Under any of these conditions we veto any dma-coherent attributes in
>>>> the DTB.
>>>
>>> I very much do not like this. As I previously mentioned [1],
>>> dma-coherent has de-facto semantics today. This series deliberately
>>> changes that, and inverts the relationship between DT and kernel (as the
>>> description in the DT would now depend on the configuration of the
>>> kernel).
>>>
>>> I would prefer that we have a separate property (e.g.
>>> "dma-outer-coherent") to describe when a device can be coherent with
>>> Normal, Outer Shareable, Inner Write-Back, Outer Write-Back memory.
>>> Then the kernel can figure out whether or not device can be used
>>> coherently, depending on how it is configured.
>>
>> I share your concern, but I don't think the dma-outer-coherent attribute
>> would be a good solution either.
>>
>> The problem really is that keystone is a platform that is sometimes
>> coherent, depending purely on what kernel we run, and not at all on
>> anything we can describe in devicetree, and I don't see any good way
>> to capture the behavior of the hardware in generic DT bindings.
> 
> I think the above doesn't quite capture the situation:
> 
> Some DMA masters can be cache-coherent (only) with Outer Shareable
> transactions. That is a property we could capture in the DT (e.g.
> dma-outer-coherent), and is independent of the kernel configuration.
> 
> Whether or not the devices are coherent with the kernel's chosen memory
> attributes certainly depends on the kernel configuration, but that is
> not what we capture in the DT.
> 
>> So far, the assumption has been:
>>
>> - when running a non-LPAE kernel, keystone is not coherent, and we
>>   must ignore both the dma-coherent properties in devices and the
>>   dma-ranges properties in bus nodes.
> 
> I wasn't able to spot if/where that was enforced. Is it possible to boot
> Keystone UP, !LPAE?
> 

Yes ...  with the right combination of DTB, u-boot, u-boot vars, and
kernel config.  Mismatches either fail hard or use dma-coherent ops
without actually providing coherency. I am attempting to make this less
fragile.

Mis-configured coherency can be dead-wrong and still only fail 1
transaction in 1,000,000.  I have seen customers run for weeks or months
w/o detecting the issue.  That's why I wanted the veto logic.

There are 3 cases to cover:
LPAE w/ high PA:
	this is the normal mode for KS2.  Uses coherent dma-ops.
!LPAE:
	obviously uses low PA and must use non-coherent dma-ops.
LPAE w/ low PA:
	This happens with an LPAE kernel but the user has passed a low
	PA memory DTB and u-boot has not fixed it up.
	This case must also use non-coherent dma-ops
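
As a rough sketch of the three cases above (every name here is
hypothetical and not taken from the series, and DDR3B, which does not
have the high PA constraint, is ignored for simplicity), the choice
could be modelled like this:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names throughout; this is not code from the series. */
enum ks2_dma_mode { KS2_DMA_NONCOHERENT, KS2_DMA_COHERENT };

/* Start of the high PA range for DDR3A (0x8_0000_0000). */
#define KS2_HIGH_PA_START 0x800000000ULL

static enum ks2_dma_mode ks2_pick_dma_mode(bool lpae, uint64_t mem_base_pa,
					   bool outer_shared_ok)
{
	/* !LPAE: only low PA is usable, so non-coherent dma-ops. */
	if (!lpae)
		return KS2_DMA_NONCOHERENT;

	/* LPAE w/ low PA: the DTB memory was not fixed up to high PA. */
	if (mem_base_pa < KS2_HIGH_PA_START)
		return KS2_DMA_NONCOHERENT;

	/* The switch to outer shared failed: veto dma-coherent. */
	if (!outer_shared_ok)
		return KS2_DMA_NONCOHERENT;

	/* LPAE w/ high PA and outer shared: the normal mode for KS2. */
	return KS2_DMA_COHERENT;
}
```

Only the last path ends up with coherent dma-ops; every other
combination falls back to non-coherent, which is the point of the veto.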

Upstream DTS has keystone memory at the low PA.  I agree with that.
U-boot and kernel opt-in to the use of high PA.

If you give high PA to a non-LPAE kernel I believe it will fail hard and
fast.  I can check.

Thanks,
Bill


* Re: [RFC v2 4/4] ARM: keystone: dma-coherent with safe fallback
  2016-06-06 12:50         ` William Mills
@ 2016-06-06 16:18           ` Santosh Shilimkar
  0 siblings, 0 replies; 21+ messages in thread
From: Santosh Shilimkar @ 2016-06-06 16:18 UTC (permalink / raw)
  To: William Mills, Mark Rutland, Arnd Bergmann
  Cc: rmk+kernel, t-kristo, ssantosh, catalin.marinas,
	linux-arm-kernel, linux-kernel, r-woodruff2, devicetree

On 6/6/2016 5:50 AM, William Mills wrote:
>
>
I saw only v2, but it seems to have already generated
discussion.

> On 06/06/2016 07:42 AM, Mark Rutland wrote:
>> On Mon, Jun 06, 2016 at 11:09:07AM +0200, Arnd Bergmann wrote:
>>> On Monday, June 6, 2016 9:56:27 AM CEST Mark Rutland wrote:
>>>> [adding devicetree]
>>>>
>>>> On Sun, Jun 05, 2016 at 11:20:29PM -0400, Bill Mills wrote:
>>>>> Keystone2 can do DMA coherency but only if:
>>>>> 1) DDR3A DMA buffers are in high physical addresses (0x8_0000_0000)
>>>>>     (DDR3B does not have this constraint)
>>>>> 2) Memory is marked outer shared
>>>>> 3) DMA Master marks transactions as outer shared
>>>>>     (This is taken care of in bootloader)
>>>>>
>>>>> Use outer shared instead of inner shared.
>>>>> This choice is done at early init time and uses the attr_mod facility
>>>>>
>>>>> If the kernel is not configured for LPAE and using high PA, or if the
>>>>> switch to outer shared fails, then we fail to meet these criteria.
>>>>> Under any of these conditions we veto any dma-coherent attributes in
>>>>> the DTB.
>>>>
>>>> I very much do not like this. As I previously mentioned [1],
>>>> dma-coherent has de-facto semantics today. This series deliberately
>>>> changes that, and inverts the relationship between DT and kernel (as the
>>>> description in the DT would now depend on the configuration of the
>>>> kernel).
>>>>
>>>> I would prefer that we have a separate property (e.g.
>>>> "dma-outer-coherent") to describe when a device can be coherent with
>>>> Normal, Outer Shareable, Inner Write-Back, Outer Write-Back memory.
>>>> Then the kernel can figure out whether or not device can be used
>>>> coherently, depending on how it is configured.
>>>
>>> I share your concern, but I don't think the dma-outer-coherent attribute
>>> would be a good solution either.
>>>
>>> The problem really is that keystone is a platform that is sometimes
>>> coherent, depending purely on what kernel we run, and not at all on
>>> anything we can describe in devicetree, and I don't see any good way
>>> to capture the behavior of the hardware in generic DT bindings.
>>
>> I think the above doesn't quite capture the situation:
>>
>> Some DMA masters can be cache-coherent (only) with Outer Shareable
>> transactions. That is a property we could capture in the DT (e.g.
>> dma-outer-coherent), and is independent of the kernel configuration.
>>
>> Whether or not the devices are coherent with the kernel's chosen memory
>> attributes certainly depends on the kernel configuration, but that is
>> not what we capture in the DT.
>>
>>> So far, the assumption has been:
>>>
>>> - when running a non-LPAE kernel, keystone is not coherent, and we
>>>   must ignore both the dma-coherent properties in devices and the
>>>   dma-ranges properties in bus nodes.

Correct.
>>
>> I wasn't able to spot if/where that was enforced. Is it possible to boot
>> Keystone UP, !LPAE?
>>
>
> Yes ...  with the right combination of DTB, u-boot, u-boot vars, and
> kernel config.  Mismatches either fail hard or use dma-coherent ops
> without actually providing coherency. I am attempting to make this less
> fragile.
>
> Mis-configured coherency can be dead-wrong and still only fail 1
> transaction in 1,000,000.  I have seen customers run for weeks or months
> w/o detecting the issue.  That's why I wanted the veto logic.
>
> There are 3 cases to cover:
> LPAE w/ high PA:
> 	this is the normal mode for KS2.  Uses coherent dma-ops.
> !LPAE:
> 	obviously uses low PA and must use non-coherent dma-ops.
> LPAE w/ low PA:
> 	This happens with an LPAE kernel but the user has passed a low
> 	PA memory DTB and u-boot has not fixed it up.
> 	This case must also use non-coherent dma-ops
>
> Upstream DTS has keystone memory at the low PA.  I agree with that.
> U-boot and kernel opt-in to the use of high PA.
>
> If you give high PA to a non-LPAE kernel I believe it will fail hard and
> fast.  I can check.
>
UP will mostly boot from the boot view of the memory. keystone_pv_fixup()
will bail out for higher PA. Let me know if you see otherwise.

Regards,
Santosh


* Re: [RFC v2 4/4] ARM: keystone: dma-coherent with safe fallback
  2016-06-06 12:32         ` Russell King - ARM Linux
@ 2016-06-06 16:28           ` Santosh Shilimkar
  2016-06-07 10:01           ` Mark Rutland
  1 sibling, 0 replies; 21+ messages in thread
From: Santosh Shilimkar @ 2016-06-06 16:28 UTC (permalink / raw)
  To: Russell King - ARM Linux, Mark Rutland, Bill Mills
  Cc: t-kristo, ssantosh, catalin.marinas, linux-arm-kernel,
	linux-kernel, r-woodruff2, devicetree

(Joining discussion late since only this thread showed up in my
inbox)

On 6/6/2016 5:32 AM, Russell King - ARM Linux wrote:
> On Mon, Jun 06, 2016 at 12:59:18PM +0100, Mark Rutland wrote:
>> I agree that whether or not devices are coherent in practice depends on
>> the kernel's configuration. The flip side, as you point out, is that
>> devices are coherent when a specific set of attributes are used.
>>
>> i.e. that if you read dma-coherent as meaning "coherent iff Normal,
>> Inner Shareable, Inner WB Cacheable, Outer WB Cacheable", then
>> dma-coherent consistently describes the same thing, rather than
>> depending on the configuration of the OS.
>>
I think there is a bit of a misunderstanding about the 'dma-coherent'
DT property and, as RMK pointed out, "dma-coherent-outer" isn't the
right direction either.

>> DT is a datastructure provided to the kernel, potentially without deep
>> internal knowledge of that kernel configuration. Having a consistent
>> rule that is independent of the kernel configuration seems worth aiming
>> for.
>
> I think you've missed the point.  dma-coherent is _already_ dependent on
> the kernel configuration.  "Having a consistent rule that is independent
> of the kernel configuration" is already an impossibility, as I illustrated
> in my previous message concerning Marvell Armada SoCs, and you also said
> in your preceding paragraph!
>
> For example, if you clear the shared bit in the page tables on non-LPAE
> SoCs, devices are no longer coherent.
>
> DMA coherence on ARM _is_ already tightly linked with the kernel
> configuration.  You already can't get away from that, so I think you
> should give up trying to argue that point. :)
>
> Whether devices are DMA coherent is a combination of two things:
>  * is the device connected to a coherent bus.
>  * is the system setup to allow coherency on that bus to work.
>
> We capture the first through the dma-coherent property, which is clearly
> a per-device property.  We ignore the second because we assume everyone
> is going to configure the CPU side correctly.  That's untrue today, and
> it's untrue not only because of Keystone II, but also because of other
> SoCs as well which pre-date Keystone II.  We currently miss out on
> considering that, because if we ignore it, we get something that works
> for most platforms.
>
I agree with Russell. When I added the per-device "dma-coherent" DT
property, the intention was to distinguish devices which, for hardware
reasons, may not be coherent even though they sit on a coherent fabric.

> I don't see that adding a dma-outer-coherent property helps this - it's
> muddying the waters somewhat - and it's also forcing additional complexity
> into places where we shouldn't have it.  We would need to parse two
> properties in the DMA API code, and then combine it with knowledge as
> to how the system page tables have been setup.  If they've been setup
> as inner sharable, then dma-coherent identifies whether the device is
> coherent.  If they've been setup as outer sharable, then
> dma-outer-coherent specifies that and dma-coherent is meaningless.
>
> Sounds like a recipe for confusion.
>
Exactly. We should leave the "dma-coherent" property to mark coherent
vs non-coherent device(s).

Inner vs outer is really an arch page table setup issue and should
be handled exactly the way the special memory view (outside 4 GB) was
handled in the first place.

Keystone needs the outer shared bit set while setting up the MMU pages,
which is best done in MMU-off mode while recreating the new
page tables.

Regards,
Santosh


* Re: [RFC v2 4/4] ARM: keystone: dma-coherent with safe fallback
  2016-06-06 12:32         ` Russell King - ARM Linux
  2016-06-06 16:28           ` Santosh Shilimkar
@ 2016-06-07 10:01           ` Mark Rutland
  2016-06-07 12:32             ` Russell King - ARM Linux
  1 sibling, 1 reply; 21+ messages in thread
From: Mark Rutland @ 2016-06-07 10:01 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Bill Mills, t-kristo, ssantosh, catalin.marinas,
	linux-arm-kernel, linux-kernel, r-woodruff2, devicetree

On Mon, Jun 06, 2016 at 01:32:10PM +0100, Russell King - ARM Linux wrote:
> On Mon, Jun 06, 2016 at 12:59:18PM +0100, Mark Rutland wrote:
> > I agree that whether or not devices are coherent in practice depends on
> > the kernel's configuration. The flip side, as you point out, is that
> > devices are coherent when a specific set of attributes are used.
> > 
> > i.e. that if you read dma-coherent as meaning "coherent iff Normal,
> > Inner Shareable, Inner WB Cacheable, Outer WB Cacheable", then
> > dma-coherent consistently describes the same thing, rather than
> > depending on the configuration of the OS.
> > 
> > DT is a datastructure provided to the kernel, potentially without deep
> > internal knowledge of that kernel configuration. Having a consistent
> > rule that is independent of the kernel configuration seems worth aiming
> > for.
> 
> I think you've missed the point.  dma-coherent is _already_ dependent on
> the kernel configuration. 

I understood this point. Please, allow me to clarify below, as I've
evidently not done a good job so far.

> "Having a consistent rule that is independent of the kernel
> configuration" is already an impossibility, as I illustrated in my
> previous message concerning Marvell Armada SoCs, and you also said in
> your preceding paragraph!

That's not quite what I said. What I said was that whether or not you
end up with coherency depends on the kernel's configuration. That's why
I pointed out that in practice, the only cases that work with today's
mainline kernels are those for which the devices are coherent with
the kernel's usual memory attributes in an SMP configuration.

If you grep for dma-coherent in arch/arm/boot/dts, you'll find that
it appears in:

arch/arm/boot/dts/artpec6.dtsi
arch/arm/boot/dts/ecx-common.dtsi
arch/arm/boot/dts/ls1021a.dtsi

Which are all SMP Cortex-{A7,A9,A15} platforms, and:

arch/arm/boot/dts/k2e.dtsi
arch/arm/boot/dts/k2e-netcp.dtsi
arch/arm/boot/dts/k2hk-netcp.dtsi
arch/arm/boot/dts/k2l-netcp.dtsi
arch/arm/boot/dts/keystone.dtsi

For which, as far as I am aware, the dma-coherent property does not
yield coherency with a mainline kernel, due to the requirement of Outer
Shareable attributes.

So, if we codify the dma-coherent semantics as only matching the working
case today, then it becomes consistent and independent of kernel
configuration, and we can add properties to cater for the other cases,
independent of kernel configuration.

> For example, if you clear the shared bit in the page tables on non-LPAE
> SoCs, devices are no longer coherent.

Yes. This is a problem, but one that we already face. If we clarified
the semantics as above, we would know that the device is simply not
coherent.

> DMA coherence on ARM _is_ already tightly linked with the kernel
> configuration.  You already can't get away from that, so I think you
> should give up trying to argue that point. :)

I hope that I've clarified my position w.r.t. coherence vs specification
thereof. :)

> Whether devices are DMA coherent is a combination of two things:
>  * is the device connected to a coherent bus.
>  * is the system setup to allow coherency on that bus to work.
> 
> We capture the first through the dma-coherent property, which is clearly
> a per-device property.  We ignore the second because we assume everyone
> is going to configure the CPU side correctly.  That's untrue today, and
> it's untrue not only because of Keystone II, but also because of other
> SoCs as well which pre-date Keystone II.  We currently miss out on
> considering that, because if we ignore it, we get something that works
> for most platforms.
> 
> I don't see that adding a dma-outer-coherent property helps this - it's
> muddying the waters somewhat - and it's also forcing additional complexity
> into places where we shouldn't have it.  We would need to parse two
> properties in the DMA API code, and then combine it with knowledge as
> to how the system page tables have been setup.  If they've been setup
> as inner sharable, then dma-coherent identifies whether the device is
> coherent.  If they've been setup as outer sharable, then
> dma-outer-coherent specifies that and dma-coherent is meaningless.

I think that at minimum, the attributes devices require need to be
described to the kernel, rather than being something we hope just
happens to match.

> Sounds like a recipe for confusion.

Unfortunately, I think everything in this area leads to confusion. :(

Thanks,
Mark.


* Re: [RFC v2 4/4] ARM: keystone: dma-coherent with safe fallback
  2016-06-07 10:01           ` Mark Rutland
@ 2016-06-07 12:32             ` Russell King - ARM Linux
  2016-06-07 12:55               ` Mark Rutland
  0 siblings, 1 reply; 21+ messages in thread
From: Russell King - ARM Linux @ 2016-06-07 12:32 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Bill Mills, t-kristo, ssantosh, catalin.marinas,
	linux-arm-kernel, linux-kernel, r-woodruff2, devicetree

On Tue, Jun 07, 2016 at 11:01:43AM +0100, Mark Rutland wrote:
> So, if we codify the dma-coherent semantics as only matching the working
> case today, then it becomes consistent and independent of kernel
> configuration, and we can add properties to cater for the other cases,
> independent of kernel configuration.

That's where our points of view differ.  You claim that it becomes
independent of the kernel configuration.  I'm saying that's total
rubbish, because it's dependent on the kernel setting the CPU page
tables up as it does today.

If we set them up differently, then it doesn't work so well.  This
is evidenced by Marvell Armada uniprocessor platforms, where they
are DMA coherent provided that the S bit is set.  However, because
they are uniprocessor platforms, the kernel sets the page tables up
with the S bit clear.  That means that the kernel configures the
system in a way which results in it being non-coherent.

So here, we have an example of why your position is actually incorrect.
dma-coherent does *not* give a "consistent and independent of kernel
configuration" property - it's inherently tied to how the kernel has
setup the page tables.

So, dma-coherent is coherent _provided_ the kernel sets the page tables
up as we currently expect it to - the S bit set on non-LPAE systems, on
LPAE systems, inner-sharable.  If we deviate from that, (eg by clearing
the S bit on non-LPAE systems) we end up with a non-coherent system,
even if dma-coherent is specified.

The Keystone II case is no different - Keystone II is coherent if the
correct conditions are met with the CPU page tables.  The only difference
is that it's a slightly different set of conditions.
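
To put the above another way, here is a minimal model of the argument.
It is only an illustration: the enum and the function are made up for
this mail and are not an existing kernel interface.

```c
#include <stdbool.h>

/* Hypothetical model of how the CPU page tables were set up. */
enum pt_share {
	PT_NOT_SHARED,		/* S bit clear on non-LPAE */
	PT_INNER_SHARED,	/* S bit set on non-LPAE, inner-sharable on LPAE */
	PT_OUTER_SHARED,	/* what Keystone II needs */
};

/*
 * A device is coherent in practice only if the DT says it sits on a
 * coherent bus AND the page tables match what that bus requires.
 */
static bool effectively_coherent(bool dt_dma_coherent,
				 enum pt_share actual,
				 enum pt_share required)
{
	return dt_dma_coherent && actual == required;
}
```

Armada uniprocessor is (true, PT_NOT_SHARED, PT_INNER_SHARED) and
Keystone II with today's page tables is (true, PT_INNER_SHARED,
PT_OUTER_SHARED).  Both come out non-coherent even though dma-coherent
is present.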

> > For example, if you clear the shared bit in the page tables on non-LPAE
> > SoCs, devices are no longer coherent.
> 
> Yes. This is a problem, but one that we already face. If we clarified
> the semantics as above, we would know that the device is simply not
> coherent.

How?  We would need to introduce some flag which is passed from the
architecture code into the OF code to disable the effect of dma-coherent,
making of_dma_is_coherent() return false if the S bit is clear.
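
Roughly, such a flag might look like the sketch below.  The flag, the
setter and the model_ function are all hypothetical stand-ins; the real
of_dma_is_coherent() takes a device node rather than a bool.

```c
#include <stdbool.h>

/* Hypothetical flag owned by architecture code. */
static bool arch_dma_coherent_vetoed;

/* Hypothetical hook: called once the page table attributes are final. */
static void arch_note_page_table_setup(bool s_bit_set)
{
	arch_dma_coherent_vetoed = !s_bit_set;
}

/* Stand-in for the OF-side check, taking "node has dma-coherent". */
static bool model_of_dma_is_coherent(bool node_has_dma_coherent)
{
	/* The architecture veto overrides whatever the DT says. */
	if (arch_dma_coherent_vetoed)
		return false;

	return node_has_dma_coherent;
}
```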

> > Whether devices are DMA coherent is a combination of two things:
> >  * is the device connected to a coherent bus.
> >  * is the system setup to allow coherency on that bus to work.
> > 
> > We capture the first through the dma-coherent property, which is clearly
> > a per-device property.  We ignore the second because we assume everyone
> > is going to configure the CPU side correctly.  That's untrue today, and
> > it's untrue not only because of Keystone II, but also because of other
> > SoCs as well which pre-date Keystone II.  We currently miss out on
> > considering that, because if we ignore it, we get something that works
> > for most platforms.
> > 
> > I don't see that adding a dma-outer-coherent property helps this - it's
> > muddying the waters somewhat - and it's also forcing additional complexity
> > into places where we shouldn't have it.  We would need to parse two
> > properties in the DMA API code, and then combine it with knowledge as
> > to how the system page tables have been setup.  If they've been setup
> > as inner sharable, then dma-coherent identifies whether the device is
> > coherent.  If they've been setup as outer sharable, then
> > dma-outer-coherent specifies that and dma-coherent is meaningless.
> 
> I think that at minimum, the attributes devices require need to be
> described to the kernel, rather than being something we hope just
> happens to match.

Yuck.  Seriously?  What happens when we have two devices which have
different required attributes for the CPU mapping?  Should architecture
code have to parse the entire DT tree to work out what attributes each
device needs, and try to then work out how the CPU page tables should
be setup?

I really don't think that's a good idea.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.


* Re: [RFC v2 4/4] ARM: keystone: dma-coherent with safe fallback
  2016-06-07 12:32             ` Russell King - ARM Linux
@ 2016-06-07 12:55               ` Mark Rutland
  0 siblings, 0 replies; 21+ messages in thread
From: Mark Rutland @ 2016-06-07 12:55 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Bill Mills, t-kristo, ssantosh, catalin.marinas,
	linux-arm-kernel, linux-kernel, r-woodruff2, devicetree

On Tue, Jun 07, 2016 at 01:32:48PM +0100, Russell King - ARM Linux wrote:
> On Tue, Jun 07, 2016 at 11:01:43AM +0100, Mark Rutland wrote:
> > So, if we codify the dma-coherent semantics as only matching the working
> > case today, then it becomes consistent and independent of kernel
> > configuration, and we can add properties to cater for the other cases,
> > independent of kernel configuration.
> 
> That's where our points of view differ.  You claim that it becomes
> independent of the kernel configuration.  I'm saying that's total
> rubbish, because it's dependent on the kernel setting the CPU page
> tables up as it does today.

The key point is that *description* of the requirements for coherency
becomes independent of kernel configuration. Yes, whether or not a
kernel can support those depends on the configuration.

> If we set them up differently, then it doesn't work so well.  This
> is evidenced by Marvell Armada uniprocessor platforms, where they
> are DMA coherent provided that the S bit is set.  However, because
> they are uniprocessor platforms, the kernel sets the page tables up
> with the S bit clear.  That means that the kernel configures the
> system in a way which results in it being non-coherent.
> 
> So here, we have an example of why your position is actually incorrect.
> dma-coherent does *not* give a "consistent and independent of kernel
> configuration" property - it's inherently tied to how the kernel has
> setup the page tables.

Sorry, but that is not quite what I said.

I said that if you read dma-coherent as specifying *the requirements*
for coherency (i.e. "coherent iff Normal, Inner Shareable, Inner WB
Cacheable, Outer WB Cacheable"), rather than specifying that there is
coherency given some unspecified requirements, then it is possible to
use it in a manner which is consistent and independent of kernel
configuration. If the kernel uses memory attributes that don't meet
those requirements, it can know that the device cannot be used in a
coherent manner.

If we take that stance, then we can cater for other requirements (e.g.
Outer Shareable on Keystone) by having properties to specify those
requirements (e.g. dma-outer-coherent). The tricky part is how the
kernel decides how best to use that information, but that is a problem
regardless.
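
As a minimal sketch of that reading (the types and the function are
hypothetical, and "dma-outer-coherent" is only the property proposed in
this thread, not an existing binding), the check could be as simple as:

```c
#include <stdbool.h>

/* Shareability the kernel chose for Normal memory (hypothetical). */
enum kernel_share {
	KERNEL_NON_SHARED,
	KERNEL_INNER_SHARED,
	KERNEL_OUTER_SHARED,
};

/* Per-device requirements as they would be parsed from the DT. */
struct dev_coherency_props {
	bool dma_coherent;		/* coherent iff Inner Shareable */
	bool dma_outer_coherent;	/* coherent iff Outer Shareable */
};

/* Can this device be used coherently with the kernel's attributes? */
static bool dev_usable_coherently(const struct dev_coherency_props *p,
				  enum kernel_share ks)
{
	switch (ks) {
	case KERNEL_INNER_SHARED:
		return p->dma_coherent;
	case KERNEL_OUTER_SHARED:
		return p->dma_outer_coherent;
	default:
		/* Non-shared mappings meet neither requirement. */
		return false;
	}
}
```

The properties describe the device and do not change with the kernel
configuration; only the second argument does.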

> > > For example, if you clear the shared bit in the page tables on non-LPAE
> > > SoCs, devices are no longer coherent.
> > 
> > Yes. This is a problem, but one that we already face. If we clarified
> > the semantics as above, we would know that the device is simply not
> > coherent.
> 
> How?  We would need to introduce some flag which is passed from the
> architecture code into the OF code to disable the effect of dma-coherent,
> making of_dma_is_coherent() return false if the S bit is clear.

Yes, we would need to either alter the OF code, or some code which makes
use of this. Surely it's possible to have this logic in an arch
callback?

> > > Whether devices are DMA coherent is a combination of two things:
> > >  * is the device connected to a coherent bus.
> > >  * is the system setup to allow coherency on that bus to work.
> > > 
> > > We capture the first through the dma-coherent property, which is clearly
> > > a per-device property.  We ignore the second because we assume everyone
> > > is going to configure the CPU side correctly.  That's untrue today, and
> > > it's untrue not only because of Keystone II, but also because of other
> > > SoCs as well which pre-date Keystone II.  We currently miss out on
> > > considering that, because if we ignore it, we get something that works
> > > for most platforms.
> > > 
> > > I don't see that adding a dma-outer-coherent property helps this - it's
> > > muddying the waters somewhat - and it's also forcing additional complexity
> > > into places where we shouldn't have it.  We would need to parse two
> > > properties in the DMA API code, and then combine it with knowledge as
> > > to how the system page tables have been setup.  If they've been setup
> > > as inner sharable, then dma-coherent identifies whether the device is
> > > coherent.  If they've been setup as outer sharable, then
> > > dma-outer-coherent specifies that and dma-coherent is meaningless.
> > 
> > I think that at minimum, the attributes devices require need to be
> > described to the kernel, rather than being something we hope just
> > happens to match.
> 
> Yuck.  Seriously?  What happens when we have two devices which have
> different required attributes for the CPU mapping?  Should architecture
> code have to parse the entire DT tree to work out what attributes each
> device needs, and try to then work out how the CPU page tables should
> be setup?

No, we do not necessarily have to try to dynamically handle every
possible case, especially as the vastly common case is the one I called
out above.

For those boards where we're going to have some code special-casing
those regardless, automatically deciding to have the kernel use the
preferred set of attributes is fine. However, to do this I don't think
we should provide board+kernel specific semantics to dma-coherent, and
should at least precisely specify the coherency requirements.

Thanks,
Mark.


end of thread, other threads:[~2016-06-07 12:55 UTC | newest]

Thread overview: 21+ messages
2016-06-06  3:20 [RFC v2 0/4] ARM LPAE Outer Shared v2 Bill Mills
2016-06-06  3:20 ` [RFC v2 1/4] ARM: mm: add early page table attribute modification ability Bill Mills
2016-06-06 12:18   ` Russell King - ARM Linux
2016-06-06 12:31     ` William Mills
2016-06-06  3:20 ` [RFC v2 2/4] ARM: mm: Add LPAE support for outer shared Bill Mills
2016-06-06  3:20 ` [RFC v2 3/4] ARM: mm: add inner/outer sharing value command line Bill Mills
2016-06-06  3:20 ` [RFC v2 4/4] ARM: keystone: dma-coherent with safe fallback Bill Mills
2016-06-06  8:56   ` Mark Rutland
2016-06-06  9:09     ` Arnd Bergmann
2016-06-06 11:42       ` Mark Rutland
2016-06-06 12:37         ` Arnd Bergmann
2016-06-06 12:50         ` William Mills
2016-06-06 16:18           ` Santosh Shilimkar
2016-06-06 11:43     ` Russell King - ARM Linux
2016-06-06 11:59       ` Mark Rutland
2016-06-06 12:19         ` William Mills
2016-06-06 12:32         ` Russell King - ARM Linux
2016-06-06 16:28           ` Santosh Shilimkar
2016-06-07 10:01           ` Mark Rutland
2016-06-07 12:32             ` Russell King - ARM Linux
2016-06-07 12:55               ` Mark Rutland
