All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] ARM: fix highmem with VIPT cache and DMA
@ 2010-03-25 21:02 ` Nicolas Pitre
  0 siblings, 0 replies; 25+ messages in thread
From: Nicolas Pitre @ 2010-03-25 21:02 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Catalin Marinas, linux-mmc, Hemanth V, saeed bishara, pierre,
	linux-arm-kernel


The VIVT cache of a highmem page is always flushed before the page
is unmapped.  This cache flush is explicit through flush_cache_kmaps()
in flush_all_zero_pkmaps(), or through __cpuc_flush_dcache_area() in
kunmap_atomic().  There is also an implicit flush of those highmem pages
that were part of a process that just terminated making those pages free
as the whole VIVT cache has to be flushed on every task switch. Hence
unmapped highmem pages need no cache maintenance in that case.

However unmapped pages may still be cached with a VIPT cache because the
cache is tagged with physical addresses.  There is no need for a whole
cache flush during task switching for that reason, and despite the
explicit cache flushes in flush_all_zero_pkmaps() and kunmap_atomic(),
some highmem pages that were mapped in user space end up still cached
even when they become unmapped.

So, we do have to perform cache maintenance on those unmapped highmem
pages in the context of DMA when using a VIPT cache.  Unfortunately,
it is not possible to perform that cache maintenance using physical
addresses as all the L1 cache maintenance coprocessor functions accept
virtual addresses only.  Therefore we have no choice but to set up a
temporary virtual mapping for that purpose.

And of course the explicit cache flushing when unmapping a highmem page
on a system with a VIPT cache now can go, which should increase
performance.

While at it, because the code in __flush_dcache_page() has to be modified
anyway, let's also make sure the mapped highmem pages are pinned with
kmap_high_get() for the duration of the cache maintenance operation.
Because kunmap() does unmap highmem pages lazily, it was reported by
Gary King <GKing@nvidia.com> that those pages ended up being unmapped
during cache maintenance on SMP causing segmentation faults.

Signed-off-by: Nicolas Pitre <nico@marvell.com>
---
 arch/arm/include/asm/highmem.h    |   15 +++++-
 arch/arm/include/asm/kmap_types.h |    1 +
 arch/arm/mm/copypage-v6.c         |    9 +---
 arch/arm/mm/dma-mapping.c         |    5 ++
 arch/arm/mm/flush.c               |   25 +++++----
 arch/arm/mm/highmem.c             |   87 ++++++++++++++++++++++++++++++-
 6 files changed, 122 insertions(+), 20 deletions(-)

diff --git a/arch/arm/include/asm/highmem.h b/arch/arm/include/asm/highmem.h
index 7f36d00..feb988a 100644
--- a/arch/arm/include/asm/highmem.h
+++ b/arch/arm/include/asm/highmem.h
@@ -11,7 +11,11 @@
 
 #define kmap_prot		PAGE_KERNEL
 
-#define flush_cache_kmaps()	flush_cache_all()
+#define flush_cache_kmaps() \
+	do { \
+		if (cache_is_vivt()) \
+			flush_cache_all(); \
+	} while (0)
 
 extern pte_t *pkmap_page_table;
 
@@ -21,11 +25,20 @@ extern void *kmap_high(struct page *page);
 extern void *kmap_high_get(struct page *page);
 extern void kunmap_high(struct page *page);
 
+extern void *kmap_high_l1_vipt(struct page *page, pte_t *saved_pte);
+extern void kunmap_high_l1_vipt(struct page *page, pte_t saved_pte);
+
+/*
+ * The following functions are already defined by <linux/highmem.h>
+ * when CONFIG_HIGHMEM is not set.
+ */
+#ifdef CONFIG_HIGHMEM
 extern void *kmap(struct page *page);
 extern void kunmap(struct page *page);
 extern void *kmap_atomic(struct page *page, enum km_type type);
 extern void kunmap_atomic(void *kvaddr, enum km_type type);
 extern void *kmap_atomic_pfn(unsigned long pfn, enum km_type type);
 extern struct page *kmap_atomic_to_page(const void *ptr);
+#endif
 
 #endif
diff --git a/arch/arm/include/asm/kmap_types.h b/arch/arm/include/asm/kmap_types.h
index c019949..c4b2ea3 100644
--- a/arch/arm/include/asm/kmap_types.h
+++ b/arch/arm/include/asm/kmap_types.h
@@ -18,6 +18,7 @@ enum km_type {
 	KM_IRQ1,
 	KM_SOFTIRQ0,
 	KM_SOFTIRQ1,
+	KM_L1_CACHE,
 	KM_L2_CACHE,
 	KM_TYPE_NR
 };
diff --git a/arch/arm/mm/copypage-v6.c b/arch/arm/mm/copypage-v6.c
index 8bca4de..f55fa10 100644
--- a/arch/arm/mm/copypage-v6.c
+++ b/arch/arm/mm/copypage-v6.c
@@ -41,14 +41,7 @@ static void v6_copy_user_highpage_nonaliasing(struct page *to,
 	kfrom = kmap_atomic(from, KM_USER0);
 	kto = kmap_atomic(to, KM_USER1);
 	copy_page(kto, kfrom);
-#ifdef CONFIG_HIGHMEM
-	/*
-	 * kmap_atomic() doesn't set the page virtual address, and
-	 * kunmap_atomic() takes care of cache flushing already.
-	 */
-	if (page_address(to) != NULL)
-#endif
-		__cpuc_flush_dcache_area(kto, PAGE_SIZE);
+	__cpuc_flush_dcache_area(kto, PAGE_SIZE);
 	kunmap_atomic(kto, KM_USER1);
 	kunmap_atomic(kfrom, KM_USER0);
 }
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 0da7ecc..e955dc4 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -464,6 +464,11 @@ static void dma_cache_maint_page(struct page *page, unsigned long offset,
 				vaddr += offset;
 				op(vaddr, len, dir);
 				kunmap_high(page);
+			} else if (cache_is_vipt()) {
+				pte_t saved_pte;
+				vaddr = kmap_high_l1_vipt(page, &saved_pte);
+				op(vaddr + offset, len, dir);
+				kunmap_high_l1_vipt(page, saved_pte);
 			}
 		} else {
 			vaddr = page_address(page) + offset;
diff --git a/arch/arm/mm/flush.c b/arch/arm/mm/flush.c
index e34f095..c6844cb 100644
--- a/arch/arm/mm/flush.c
+++ b/arch/arm/mm/flush.c
@@ -13,6 +13,7 @@
 
 #include <asm/cacheflush.h>
 #include <asm/cachetype.h>
+#include <asm/highmem.h>
 #include <asm/smp_plat.h>
 #include <asm/system.h>
 #include <asm/tlbflush.h>
@@ -152,21 +153,25 @@ void copy_to_user_page(struct vm_area_struct *vma, struct page *page,
 
 void __flush_dcache_page(struct address_space *mapping, struct page *page)
 {
-	void *addr = page_address(page);
-
 	/*
 	 * Writeback any data associated with the kernel mapping of this
 	 * page.  This ensures that data in the physical page is mutually
 	 * coherent with the kernels mapping.
 	 */
-#ifdef CONFIG_HIGHMEM
-	/*
-	 * kmap_atomic() doesn't set the page virtual address, and
-	 * kunmap_atomic() takes care of cache flushing already.
-	 */
-	if (addr)
-#endif
-		__cpuc_flush_dcache_area(addr, PAGE_SIZE);
+	if (!PageHighMem(page)) {
+		__cpuc_flush_dcache_area(page_address(page), PAGE_SIZE);
+	} else {
+		void *addr = kmap_high_get(page);
+		if (addr) {
+			__cpuc_flush_dcache_area(addr, PAGE_SIZE);
+			kunmap_high(page);
+		} else if (cache_is_vipt()) {
+			pte_t saved_pte;
+			addr = kmap_high_l1_vipt(page, &saved_pte);
+			__cpuc_flush_dcache_area(addr, PAGE_SIZE);
+			kunmap_high_l1_vipt(page, saved_pte);
+		}
+	}
 
 	/*
 	 * If this is a page cache page, and we have an aliasing VIPT cache,
diff --git a/arch/arm/mm/highmem.c b/arch/arm/mm/highmem.c
index 2be1ec7..77b030f 100644
--- a/arch/arm/mm/highmem.c
+++ b/arch/arm/mm/highmem.c
@@ -79,7 +79,8 @@ void kunmap_atomic(void *kvaddr, enum km_type type)
 	unsigned int idx = type + KM_TYPE_NR * smp_processor_id();
 
 	if (kvaddr >= (void *)FIXADDR_START) {
-		__cpuc_flush_dcache_area((void *)vaddr, PAGE_SIZE);
+		if (cache_is_vivt())
+			__cpuc_flush_dcache_area((void *)vaddr, PAGE_SIZE);
 #ifdef CONFIG_DEBUG_HIGHMEM
 		BUG_ON(vaddr != __fix_to_virt(FIX_KMAP_BEGIN + idx));
 		set_pte_ext(TOP_PTE(vaddr), __pte(0), 0);
@@ -124,3 +125,87 @@ struct page *kmap_atomic_to_page(const void *ptr)
 	pte = TOP_PTE(vaddr);
 	return pte_page(*pte);
 }
+
+#ifdef CONFIG_CPU_CACHE_VIPT
+
+#include <linux/percpu.h>
+
+/*
+ * The VIVT cache of a highmem page is always flushed before the page
+ * is unmapped. Hence unmapped highmem pages need no cache maintenance
+ * in that case.
+ *
+ * However unmapped pages may still be cached with a VIPT cache, and
+ * it is not possible to perform cache maintenance on them using physical
+ * addresses unfortunately.  So we have no choice but to set up a temporary
+ * virtual mapping for that purpose.
+ *
+ * Yet this VIPT cache maintenance may be triggered from DMA support
+ * functions which are possibly called from interrupt context. As we don't
+ * want to keep interrupt disabled all the time when such maintenance is
+ * taking place, we therefore allow for some reentrancy by preserving and
+ * restoring the previous fixmap entry before the interrupted context is
+ * resumed.  If the reentrancy depth is 0 then there is no need to restore
+ * the previous fixmap, and leaving the current one in place allow it to
+ * be reused the next time without a TLB flush (common with DMA).
+ */
+
+static DEFINE_PER_CPU(int, kmap_high_l1_vipt_depth);
+
+void *kmap_high_l1_vipt(struct page *page, pte_t *saved_pte)
+{
+	unsigned int idx, cpu = smp_processor_id();
+	int *depth = &per_cpu(kmap_high_l1_vipt_depth, cpu);
+	unsigned long vaddr, flags;
+	pte_t pte, *ptep;
+
+	idx = KM_L1_CACHE + KM_TYPE_NR * cpu;
+	vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx);
+	ptep = TOP_PTE(vaddr);
+	pte = mk_pte(page, kmap_prot);
+
+	if (!in_interrupt())
+		preempt_disable();
+
+	raw_local_irq_save(flags);
+	(*depth)++;
+	if (pte_val(*ptep) == pte_val(pte)) {
+		*saved_pte = pte;
+	} else {
+		*saved_pte = *ptep;
+		set_pte_ext(ptep, pte, 0);
+		local_flush_tlb_kernel_page(vaddr);
+	}
+	raw_local_irq_restore(flags);
+
+	return (void *)vaddr;
+}
+
+void kunmap_high_l1_vipt(struct page *page, pte_t saved_pte)
+{
+	unsigned int idx, cpu = smp_processor_id();
+	int *depth = &per_cpu(kmap_high_l1_vipt_depth, cpu);
+	unsigned long vaddr, flags;
+	pte_t pte, *ptep;
+
+	idx = KM_L1_CACHE + KM_TYPE_NR * cpu;
+	vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx);
+	ptep = TOP_PTE(vaddr);
+	pte = mk_pte(page, kmap_prot);
+
+	BUG_ON(pte_val(*ptep) != pte_val(pte));
+	BUG_ON(*depth <= 0);
+
+	raw_local_irq_save(flags);
+	(*depth)--;
+	if (*depth != 0 && pte_val(pte) != pte_val(saved_pte)) {
+		set_pte_ext(ptep, saved_pte, 0);
+		local_flush_tlb_kernel_page(vaddr);
+	}
+	raw_local_irq_restore(flags);
+
+	if (!in_interrupt())
+		preempt_enable();
+}
+
+#endif  /* CONFIG_CPU_CACHE_VIPT */


Nicolas

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH] ARM: fix highmem with VIPT cache and DMA
@ 2010-03-25 21:02 ` Nicolas Pitre
  0 siblings, 0 replies; 25+ messages in thread
From: Nicolas Pitre @ 2010-03-25 21:02 UTC (permalink / raw)
  To: linux-arm-kernel


The VIVT cache of a highmem page is always flushed before the page
is unmapped.  This cache flush is explicit through flush_cache_kmaps()
in flush_all_zero_pkmaps(), or through __cpuc_flush_dcache_area() in
kunmap_atomic().  There is also an implicit flush of those highmem pages
that were part of a process that just terminated making those pages free
as the whole VIVT cache has to be flushed on every task switch. Hence
unmapped highmem pages need no cache maintenance in that case.

However unmapped pages may still be cached with a VIPT cache because the
cache is tagged with physical addresses.  There is no need for a whole
cache flush during task switching for that reason, and despite the
explicit cache flushes in flush_all_zero_pkmaps() and kunmap_atomic(),
some highmem pages that were mapped in user space end up still cached
even when they become unmapped.

So, we do have to perform cache maintenance on those unmapped highmem
pages in the context of DMA when using a VIPT cache.  Unfortunately,
it is not possible to perform that cache maintenance using physical
addresses as all the L1 cache maintenance coprocessor functions accept
virtual addresses only.  Therefore we have no choice but to set up a
temporary virtual mapping for that purpose.

And of course the explicit cache flushing when unmapping a highmem page
on a system with a VIPT cache now can go, which should increase
performance.

While at it, because the code in __flush_dcache_page() has to be modified
anyway, let's also make sure the mapped highmem pages are pinned with
kmap_high_get() for the duration of the cache maintenance operation.
Because kunmap() does unmap highmem pages lazily, it was reported by
Gary King <GKing@nvidia.com> that those pages ended up being unmapped
during cache maintenance on SMP causing segmentation faults.

Signed-off-by: Nicolas Pitre <nico@marvell.com>
---
 arch/arm/include/asm/highmem.h    |   15 +++++-
 arch/arm/include/asm/kmap_types.h |    1 +
 arch/arm/mm/copypage-v6.c         |    9 +---
 arch/arm/mm/dma-mapping.c         |    5 ++
 arch/arm/mm/flush.c               |   25 +++++----
 arch/arm/mm/highmem.c             |   87 ++++++++++++++++++++++++++++++-
 6 files changed, 122 insertions(+), 20 deletions(-)

diff --git a/arch/arm/include/asm/highmem.h b/arch/arm/include/asm/highmem.h
index 7f36d00..feb988a 100644
--- a/arch/arm/include/asm/highmem.h
+++ b/arch/arm/include/asm/highmem.h
@@ -11,7 +11,11 @@
 
 #define kmap_prot		PAGE_KERNEL
 
-#define flush_cache_kmaps()	flush_cache_all()
+#define flush_cache_kmaps() \
+	do { \
+		if (cache_is_vivt()) \
+			flush_cache_all(); \
+	} while (0)
 
 extern pte_t *pkmap_page_table;
 
@@ -21,11 +25,20 @@ extern void *kmap_high(struct page *page);
 extern void *kmap_high_get(struct page *page);
 extern void kunmap_high(struct page *page);
 
+extern void *kmap_high_l1_vipt(struct page *page, pte_t *saved_pte);
+extern void kunmap_high_l1_vipt(struct page *page, pte_t saved_pte);
+
+/*
+ * The following functions are already defined by <linux/highmem.h>
+ * when CONFIG_HIGHMEM is not set.
+ */
+#ifdef CONFIG_HIGHMEM
 extern void *kmap(struct page *page);
 extern void kunmap(struct page *page);
 extern void *kmap_atomic(struct page *page, enum km_type type);
 extern void kunmap_atomic(void *kvaddr, enum km_type type);
 extern void *kmap_atomic_pfn(unsigned long pfn, enum km_type type);
 extern struct page *kmap_atomic_to_page(const void *ptr);
+#endif
 
 #endif
diff --git a/arch/arm/include/asm/kmap_types.h b/arch/arm/include/asm/kmap_types.h
index c019949..c4b2ea3 100644
--- a/arch/arm/include/asm/kmap_types.h
+++ b/arch/arm/include/asm/kmap_types.h
@@ -18,6 +18,7 @@ enum km_type {
 	KM_IRQ1,
 	KM_SOFTIRQ0,
 	KM_SOFTIRQ1,
+	KM_L1_CACHE,
 	KM_L2_CACHE,
 	KM_TYPE_NR
 };
diff --git a/arch/arm/mm/copypage-v6.c b/arch/arm/mm/copypage-v6.c
index 8bca4de..f55fa10 100644
--- a/arch/arm/mm/copypage-v6.c
+++ b/arch/arm/mm/copypage-v6.c
@@ -41,14 +41,7 @@ static void v6_copy_user_highpage_nonaliasing(struct page *to,
 	kfrom = kmap_atomic(from, KM_USER0);
 	kto = kmap_atomic(to, KM_USER1);
 	copy_page(kto, kfrom);
-#ifdef CONFIG_HIGHMEM
-	/*
-	 * kmap_atomic() doesn't set the page virtual address, and
-	 * kunmap_atomic() takes care of cache flushing already.
-	 */
-	if (page_address(to) != NULL)
-#endif
-		__cpuc_flush_dcache_area(kto, PAGE_SIZE);
+	__cpuc_flush_dcache_area(kto, PAGE_SIZE);
 	kunmap_atomic(kto, KM_USER1);
 	kunmap_atomic(kfrom, KM_USER0);
 }
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 0da7ecc..e955dc4 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -464,6 +464,11 @@ static void dma_cache_maint_page(struct page *page, unsigned long offset,
 				vaddr += offset;
 				op(vaddr, len, dir);
 				kunmap_high(page);
+			} else if (cache_is_vipt()) {
+				pte_t saved_pte;
+				vaddr = kmap_high_l1_vipt(page, &saved_pte);
+				op(vaddr + offset, len, dir);
+				kunmap_high_l1_vipt(page, saved_pte);
 			}
 		} else {
 			vaddr = page_address(page) + offset;
diff --git a/arch/arm/mm/flush.c b/arch/arm/mm/flush.c
index e34f095..c6844cb 100644
--- a/arch/arm/mm/flush.c
+++ b/arch/arm/mm/flush.c
@@ -13,6 +13,7 @@
 
 #include <asm/cacheflush.h>
 #include <asm/cachetype.h>
+#include <asm/highmem.h>
 #include <asm/smp_plat.h>
 #include <asm/system.h>
 #include <asm/tlbflush.h>
@@ -152,21 +153,25 @@ void copy_to_user_page(struct vm_area_struct *vma, struct page *page,
 
 void __flush_dcache_page(struct address_space *mapping, struct page *page)
 {
-	void *addr = page_address(page);
-
 	/*
 	 * Writeback any data associated with the kernel mapping of this
 	 * page.  This ensures that data in the physical page is mutually
 	 * coherent with the kernels mapping.
 	 */
-#ifdef CONFIG_HIGHMEM
-	/*
-	 * kmap_atomic() doesn't set the page virtual address, and
-	 * kunmap_atomic() takes care of cache flushing already.
-	 */
-	if (addr)
-#endif
-		__cpuc_flush_dcache_area(addr, PAGE_SIZE);
+	if (!PageHighMem(page)) {
+		__cpuc_flush_dcache_area(page_address(page), PAGE_SIZE);
+	} else {
+		void *addr = kmap_high_get(page);
+		if (addr) {
+			__cpuc_flush_dcache_area(addr, PAGE_SIZE);
+			kunmap_high(page);
+		} else if (cache_is_vipt()) {
+			pte_t saved_pte;
+			addr = kmap_high_l1_vipt(page, &saved_pte);
+			__cpuc_flush_dcache_area(addr, PAGE_SIZE);
+			kunmap_high_l1_vipt(page, saved_pte);
+		}
+	}
 
 	/*
 	 * If this is a page cache page, and we have an aliasing VIPT cache,
diff --git a/arch/arm/mm/highmem.c b/arch/arm/mm/highmem.c
index 2be1ec7..77b030f 100644
--- a/arch/arm/mm/highmem.c
+++ b/arch/arm/mm/highmem.c
@@ -79,7 +79,8 @@ void kunmap_atomic(void *kvaddr, enum km_type type)
 	unsigned int idx = type + KM_TYPE_NR * smp_processor_id();
 
 	if (kvaddr >= (void *)FIXADDR_START) {
-		__cpuc_flush_dcache_area((void *)vaddr, PAGE_SIZE);
+		if (cache_is_vivt())
+			__cpuc_flush_dcache_area((void *)vaddr, PAGE_SIZE);
 #ifdef CONFIG_DEBUG_HIGHMEM
 		BUG_ON(vaddr != __fix_to_virt(FIX_KMAP_BEGIN + idx));
 		set_pte_ext(TOP_PTE(vaddr), __pte(0), 0);
@@ -124,3 +125,87 @@ struct page *kmap_atomic_to_page(const void *ptr)
 	pte = TOP_PTE(vaddr);
 	return pte_page(*pte);
 }
+
+#ifdef CONFIG_CPU_CACHE_VIPT
+
+#include <linux/percpu.h>
+
+/*
+ * The VIVT cache of a highmem page is always flushed before the page
+ * is unmapped. Hence unmapped highmem pages need no cache maintenance
+ * in that case.
+ *
+ * However unmapped pages may still be cached with a VIPT cache, and
+ * it is not possible to perform cache maintenance on them using physical
+ * addresses unfortunately.  So we have no choice but to set up a temporary
+ * virtual mapping for that purpose.
+ *
+ * Yet this VIPT cache maintenance may be triggered from DMA support
+ * functions which are possibly called from interrupt context. As we don't
+ * want to keep interrupt disabled all the time when such maintenance is
+ * taking place, we therefore allow for some reentrancy by preserving and
+ * restoring the previous fixmap entry before the interrupted context is
+ * resumed.  If the reentrancy depth is 0 then there is no need to restore
+ * the previous fixmap, and leaving the current one in place allow it to
+ * be reused the next time without a TLB flush (common with DMA).
+ */
+
+static DEFINE_PER_CPU(int, kmap_high_l1_vipt_depth);
+
+void *kmap_high_l1_vipt(struct page *page, pte_t *saved_pte)
+{
+	unsigned int idx, cpu = smp_processor_id();
+	int *depth = &per_cpu(kmap_high_l1_vipt_depth, cpu);
+	unsigned long vaddr, flags;
+	pte_t pte, *ptep;
+
+	idx = KM_L1_CACHE + KM_TYPE_NR * cpu;
+	vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx);
+	ptep = TOP_PTE(vaddr);
+	pte = mk_pte(page, kmap_prot);
+
+	if (!in_interrupt())
+		preempt_disable();
+
+	raw_local_irq_save(flags);
+	(*depth)++;
+	if (pte_val(*ptep) == pte_val(pte)) {
+		*saved_pte = pte;
+	} else {
+		*saved_pte = *ptep;
+		set_pte_ext(ptep, pte, 0);
+		local_flush_tlb_kernel_page(vaddr);
+	}
+	raw_local_irq_restore(flags);
+
+	return (void *)vaddr;
+}
+
+void kunmap_high_l1_vipt(struct page *page, pte_t saved_pte)
+{
+	unsigned int idx, cpu = smp_processor_id();
+	int *depth = &per_cpu(kmap_high_l1_vipt_depth, cpu);
+	unsigned long vaddr, flags;
+	pte_t pte, *ptep;
+
+	idx = KM_L1_CACHE + KM_TYPE_NR * cpu;
+	vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx);
+	ptep = TOP_PTE(vaddr);
+	pte = mk_pte(page, kmap_prot);
+
+	BUG_ON(pte_val(*ptep) != pte_val(pte));
+	BUG_ON(*depth <= 0);
+
+	raw_local_irq_save(flags);
+	(*depth)--;
+	if (*depth != 0 && pte_val(pte) != pte_val(saved_pte)) {
+		set_pte_ext(ptep, saved_pte, 0);
+		local_flush_tlb_kernel_page(vaddr);
+	}
+	raw_local_irq_restore(flags);
+
+	if (!in_interrupt())
+		preempt_enable();
+}
+
+#endif  /* CONFIG_CPU_CACHE_VIPT */


Nicolas

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH] ARM: fix highmem with VIPT cache and DMA
  2010-03-25 21:02 ` Nicolas Pitre
@ 2010-03-26 13:34   ` Catalin Marinas
  -1 siblings, 0 replies; 25+ messages in thread
From: Catalin Marinas @ 2010-03-26 13:34 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Russell King - ARM Linux, linux-mmc, Hemanth V, saeed bishara,
	pierre, linux-arm-kernel

On Thu, 2010-03-25 at 21:02 +0000, Nicolas Pitre wrote:
> --- a/arch/arm/include/asm/highmem.h
> +++ b/arch/arm/include/asm/highmem.h
> @@ -11,7 +11,11 @@
> 
>  #define kmap_prot              PAGE_KERNEL
> 
> -#define flush_cache_kmaps()    flush_cache_all()
> +#define flush_cache_kmaps() \
> +       do { \
> +               if (cache_is_vivt()) \
> +                       flush_cache_all(); \
> +       } while (0)

Do the aliasing VIPT caches need flushing as well?

> --- a/arch/arm/mm/highmem.c
> +++ b/arch/arm/mm/highmem.c
> @@ -79,7 +79,8 @@ void kunmap_atomic(void *kvaddr, enum km_type type)
>         unsigned int idx = type + KM_TYPE_NR * smp_processor_id();
> 
>         if (kvaddr >= (void *)FIXADDR_START) {
> -               __cpuc_flush_dcache_area((void *)vaddr, PAGE_SIZE);
> +               if (cache_is_vivt())
> +                       __cpuc_flush_dcache_area((void *)vaddr, PAGE_SIZE);

Same here (and probably some other places in this patch, not sure).

-- 
Catalin


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH] ARM: fix highmem with VIPT cache and DMA
@ 2010-03-26 13:34   ` Catalin Marinas
  0 siblings, 0 replies; 25+ messages in thread
From: Catalin Marinas @ 2010-03-26 13:34 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 2010-03-25 at 21:02 +0000, Nicolas Pitre wrote:
> --- a/arch/arm/include/asm/highmem.h
> +++ b/arch/arm/include/asm/highmem.h
> @@ -11,7 +11,11 @@
> 
>  #define kmap_prot              PAGE_KERNEL
> 
> -#define flush_cache_kmaps()    flush_cache_all()
> +#define flush_cache_kmaps() \
> +       do { \
> +               if (cache_is_vivt()) \
> +                       flush_cache_all(); \
> +       } while (0)

Do the aliasing VIPT caches need flushing as well?

> --- a/arch/arm/mm/highmem.c
> +++ b/arch/arm/mm/highmem.c
> @@ -79,7 +79,8 @@ void kunmap_atomic(void *kvaddr, enum km_type type)
>         unsigned int idx = type + KM_TYPE_NR * smp_processor_id();
> 
>         if (kvaddr >= (void *)FIXADDR_START) {
> -               __cpuc_flush_dcache_area((void *)vaddr, PAGE_SIZE);
> +               if (cache_is_vivt())
> +                       __cpuc_flush_dcache_area((void *)vaddr, PAGE_SIZE);

Same here (and probably some other places in this patch, not sure).

-- 
Catalin

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] ARM: fix highmem with VIPT cache and DMA
  2010-03-26 13:34   ` Catalin Marinas
@ 2010-03-26 15:51     ` Nicolas Pitre
  -1 siblings, 0 replies; 25+ messages in thread
From: Nicolas Pitre @ 2010-03-26 15:51 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Russell King - ARM Linux, linux-mmc, Hemanth V, saeed bishara,
	pierre, linux-arm-kernel

On Fri, 26 Mar 2010, Catalin Marinas wrote:

> On Thu, 2010-03-25 at 21:02 +0000, Nicolas Pitre wrote:
> > --- a/arch/arm/include/asm/highmem.h
> > +++ b/arch/arm/include/asm/highmem.h
> > @@ -11,7 +11,11 @@
> > 
> >  #define kmap_prot              PAGE_KERNEL
> > 
> > -#define flush_cache_kmaps()    flush_cache_all()
> > +#define flush_cache_kmaps() \
> > +       do { \
> > +               if (cache_is_vivt()) \
> > +                       flush_cache_all(); \
> > +       } while (0)
> 
> Do the aliasing VIPT caches need flushing as well?

No idea.  Highmem is not supported with aliasing VIPT at the moment 
anyway -- see commit 3f973e2216.  I don't have hardware with aliasing 
VIPT cache either.


Nicolas

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH] ARM: fix highmem with VIPT cache and DMA
@ 2010-03-26 15:51     ` Nicolas Pitre
  0 siblings, 0 replies; 25+ messages in thread
From: Nicolas Pitre @ 2010-03-26 15:51 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 26 Mar 2010, Catalin Marinas wrote:

> On Thu, 2010-03-25 at 21:02 +0000, Nicolas Pitre wrote:
> > --- a/arch/arm/include/asm/highmem.h
> > +++ b/arch/arm/include/asm/highmem.h
> > @@ -11,7 +11,11 @@
> > 
> >  #define kmap_prot              PAGE_KERNEL
> > 
> > -#define flush_cache_kmaps()    flush_cache_all()
> > +#define flush_cache_kmaps() \
> > +       do { \
> > +               if (cache_is_vivt()) \
> > +                       flush_cache_all(); \
> > +       } while (0)
> 
> Do the aliasing VIPT caches need flushing as well?

No idea.  Highmem is not supported with aliasing VIPT at the moment 
anyway -- see commit 3f973e2216.  I don't have hardware with aliasing 
VIPT cache either.


Nicolas

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] ARM: fix highmem with VIPT cache and DMA
  2010-03-26 15:51     ` Nicolas Pitre
@ 2010-03-26 23:09       ` Russell King - ARM Linux
  -1 siblings, 0 replies; 25+ messages in thread
From: Russell King - ARM Linux @ 2010-03-26 23:09 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Catalin Marinas, Hemanth V, linux-mmc, pierre, saeed bishara,
	linux-arm-kernel

On Fri, Mar 26, 2010 at 11:51:58AM -0400, Nicolas Pitre wrote:
> On Fri, 26 Mar 2010, Catalin Marinas wrote:
> 
> > On Thu, 2010-03-25 at 21:02 +0000, Nicolas Pitre wrote:
> > > --- a/arch/arm/include/asm/highmem.h
> > > +++ b/arch/arm/include/asm/highmem.h
> > > @@ -11,7 +11,11 @@
> > > 
> > >  #define kmap_prot              PAGE_KERNEL
> > > 
> > > -#define flush_cache_kmaps()    flush_cache_all()
> > > +#define flush_cache_kmaps() \
> > > +       do { \
> > > +               if (cache_is_vivt()) \
> > > +                       flush_cache_all(); \
> > > +       } while (0)
> > 
> > Do the aliasing VIPT caches need flushing as well?
> 
> No idea.  Highmem is not supported with aliasing VIPT at the moment 
> anyway -- see commit 3f973e2216.  I don't have hardware with aliasing 
> VIPT cache either.

I don't think we'll ever support aliasing VIPT caches with highmem -
it'd quadruple the amount of kmap space that's required for things like
KM_USER*.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH] ARM: fix highmem with VIPT cache and DMA
@ 2010-03-26 23:09       ` Russell King - ARM Linux
  0 siblings, 0 replies; 25+ messages in thread
From: Russell King - ARM Linux @ 2010-03-26 23:09 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Mar 26, 2010 at 11:51:58AM -0400, Nicolas Pitre wrote:
> On Fri, 26 Mar 2010, Catalin Marinas wrote:
> 
> > On Thu, 2010-03-25 at 21:02 +0000, Nicolas Pitre wrote:
> > > --- a/arch/arm/include/asm/highmem.h
> > > +++ b/arch/arm/include/asm/highmem.h
> > > @@ -11,7 +11,11 @@
> > > 
> > >  #define kmap_prot              PAGE_KERNEL
> > > 
> > > -#define flush_cache_kmaps()    flush_cache_all()
> > > +#define flush_cache_kmaps() \
> > > +       do { \
> > > +               if (cache_is_vivt()) \
> > > +                       flush_cache_all(); \
> > > +       } while (0)
> > 
> > Do the aliasing VIPT caches need flushing as well?
> 
> No idea.  Highmem is not supported with aliasing VIPT at the moment 
> anyway -- see commit 3f973e2216.  I don't have hardware with aliasing 
> VIPT cache either.

I don't think we'll ever support aliasing VIPT caches with highmem -
it'd quadruple the amount of kmap space that's required for things like
KM_USER*.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH] ARM: fix highmem with VIPT cache and DMA
  2010-03-26 15:51     ` Nicolas Pitre
  (?)
  (?)
@ 2010-04-11 14:42     ` Nicolas Pitre
  2010-04-12  7:56       ` saeed bishara
                         ` (2 more replies)
  -1 siblings, 3 replies; 25+ messages in thread
From: Nicolas Pitre @ 2010-04-11 14:42 UTC (permalink / raw)
  To: linux-arm-kernel

Russell,

Do you have concerns about patch #6007/1 that I might have missed? 
Otherwise it would be nice to move it towards mainline for ARMv6 
targets to work properly with highmem at last.


Nicolas

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH] ARM: fix highmem with VIPT cache and DMA
  2010-04-11 14:42     ` Nicolas Pitre
@ 2010-04-12  7:56       ` saeed bishara
  2010-04-12 15:36         ` Nicolas Pitre
  2010-04-12 19:41       ` Russell King - ARM Linux
  2010-05-28 18:56       ` Russell King - ARM Linux
  2 siblings, 1 reply; 25+ messages in thread
From: saeed bishara @ 2010-04-12  7:56 UTC (permalink / raw)
  To: linux-arm-kernel

Nico, thanks for the patch.
I've ported it for 2.6.32.9 kernel, can you have a look at the attached patch?

saeed

On Sun, Apr 11, 2010 at 5:42 PM, Nicolas Pitre <nico@fluxnic.net> wrote:
> Russell,
>
> Do you have concerns about patch #6007/1 that I might have missed?
> Otherwise it would be nice to move it towards mainline for ARMv6
> targets to work properly with highmem at last.
>
>
> Nicolas
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-ARM-fix-highmem-with-VIPT-cache-and-DMA.patch
Type: text/x-patch
Size: 8951 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20100412/68878bb9/attachment-0001.bin>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH] ARM: fix highmem with VIPT cache and DMA
  2010-04-12  7:56       ` saeed bishara
@ 2010-04-12 15:36         ` Nicolas Pitre
  2010-04-13  7:11           ` saeed bishara
  0 siblings, 1 reply; 25+ messages in thread
From: Nicolas Pitre @ 2010-04-12 15:36 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, 12 Apr 2010, saeed bishara wrote:

> Nico, thanks for the patch.
> I've ported it for 2.6.32.9 kernel, can you have a look at the
> attached patch?

That patch is bad.

|diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
|index b9590a7..5286d71 100644
|--- a/arch/arm/mm/dma-mapping.c
|+++ b/arch/arm/mm/dma-mapping.c
|@@ -602,6 +602,11 @@ static void dma_cache_maint_contiguous(struct page *page, unsigned long offset,
| 			vaddr += offset;
| 			inner_op(vaddr, vaddr + size);
| 			kunmap_high(page);
|+		} else if (cache_is_vipt()) {
|+			pte_t saved_pte;
|+			vaddr = kmap_high_l1_vipt(page, &saved_pte);
|+			inner_op(vaddr, vaddr + size);
|+			kunmap_high_l1_vipt(page, saved_pte);
| 		}
| 	}

You should add 'offset' to 'vaddr' before passing it to inner_op(), just 
like it is done 6 lines above it.

Also you left out the addition of the unconditional flush in 
v6_copy_user_highpage_nonaliasing() which is needed now that 
kunmap_atomic() doesn't flush the cache anymore on ARMv6.


Nicolas

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH] ARM: fix highmem with VIPT cache and DMA
  2010-04-11 14:42     ` Nicolas Pitre
  2010-04-12  7:56       ` saeed bishara
@ 2010-04-12 19:41       ` Russell King - ARM Linux
  2010-05-28 18:56       ` Russell King - ARM Linux
  2 siblings, 0 replies; 25+ messages in thread
From: Russell King - ARM Linux @ 2010-04-12 19:41 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, Apr 11, 2010 at 10:42:06AM -0400, Nicolas Pitre wrote:
> Do you have concerns about patch #6007/1 that I might have missed? 
> Otherwise it would be nice to move it towards mainline for ARMv6 
> targets to work properly with highmem at last.

I don't think so; remember I've been away for Easter so things have
been slow this last week and a bit.

Once Linus has pulled the current crop of stuff, I'll see about merging
it.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH] ARM: fix highmem with VIPT cache and DMA
  2010-04-12 15:36         ` Nicolas Pitre
@ 2010-04-13  7:11           ` saeed bishara
  2010-04-13 11:50             ` Nicolas Pitre
  0 siblings, 1 reply; 25+ messages in thread
From: saeed bishara @ 2010-04-13  7:11 UTC (permalink / raw)
  To: linux-arm-kernel

thanks. attached fixed version.

On Mon, Apr 12, 2010 at 6:36 PM, Nicolas Pitre <nico@fluxnic.net> wrote:
> On Mon, 12 Apr 2010, saeed bishara wrote:
>
>> Nico, thanks for the patch.
>> I've ported it for 2.6.32.9 kernel, can you have a look at the
>> attached patch?
>
> That patch is bad.
>
> |diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
> |index b9590a7..5286d71 100644
> |--- a/arch/arm/mm/dma-mapping.c
> |+++ b/arch/arm/mm/dma-mapping.c
> |@@ -602,6 +602,11 @@ static void dma_cache_maint_contiguous(struct page *page, unsigned long offset,
> | ? ? ? ? ? ? ? ? ? ? ? vaddr += offset;
> | ? ? ? ? ? ? ? ? ? ? ? inner_op(vaddr, vaddr + size);
> | ? ? ? ? ? ? ? ? ? ? ? kunmap_high(page);
> |+ ? ? ? ? ? ? ?} else if (cache_is_vipt()) {
> |+ ? ? ? ? ? ? ? ? ? ? ?pte_t saved_pte;
> |+ ? ? ? ? ? ? ? ? ? ? ?vaddr = kmap_high_l1_vipt(page, &saved_pte);
> |+ ? ? ? ? ? ? ? ? ? ? ?inner_op(vaddr, vaddr + size);
> |+ ? ? ? ? ? ? ? ? ? ? ?kunmap_high_l1_vipt(page, saved_pte);
> | ? ? ? ? ? ? ? }
> | ? ? ? }
>
> You should add 'offset' to 'vaddr' before passing it to inner_op(), just
> like it is done 6 lines above it.
>
> Also you left out the addition of the unconditional flush in
> v6_copy_user_highpage_nonaliasing() which is needed now that
> kunmap_atomic() doesn't flush the cache anymore on ARMv6.
>
>
> Nicolas
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-ARM-fix-highmem-with-VIPT-cache-and-DMA.patch
Type: text/x-patch
Size: 9460 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20100413/aeab5cb4/attachment.bin>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH] ARM: fix highmem with VIPT cache and DMA
  2010-04-13  7:11           ` saeed bishara
@ 2010-04-13 11:50             ` Nicolas Pitre
  0 siblings, 0 replies; 25+ messages in thread
From: Nicolas Pitre @ 2010-04-13 11:50 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 13 Apr 2010, saeed bishara wrote:

> thanks. attached fixed version.

Looks fine.


Nicolas

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH] ARM: fix highmem with VIPT cache and DMA
  2010-04-11 14:42     ` Nicolas Pitre
  2010-04-12  7:56       ` saeed bishara
  2010-04-12 19:41       ` Russell King - ARM Linux
@ 2010-05-28 18:56       ` Russell King - ARM Linux
  2010-05-28 19:26         ` Nicolas Pitre
  2 siblings, 1 reply; 25+ messages in thread
From: Russell King - ARM Linux @ 2010-05-28 18:56 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, Apr 11, 2010 at 10:42:06AM -0400, Nicolas Pitre wrote:
> Do you have concerns about patch #6007/1 that I might have missed? 
> Otherwise it would be nice to move it towards mainline for ARMv6 
> targets to work properly with highmem at last.

FYI, I'm considering reverting this patch; this patch is one of two
responsible for causing a stability regression of Versatile Express
platform - a highmem+Cortex A9 MPCore system with VIPT non-aliasing
caches.

The two commits responsible for making Versatile Express userspace
completely unstable are:

b8349b569aae661dea9d59d7d2ee587ccea3336c
ARM: 6112/1: Use the Inner Shareable I-cache and BTB ops on ARMv7 SMP

7e5a69e83ba7a0d5917ad830f417cba8b8d6aa72
ARM: 6007/1: fix highmem with VIPT cache and DMA

Most of the instability is from the second (this) patch, but reverting
just one patch doesn't fully restore userspace stability.

The instability seen consists of completely random SIGSEGVs and SIGILLs,
sometimes early enough to kill init, sometimes init is the only thing
able to run - so it's very severe.

Rootfs is on ext2fs on a read only MMC filesystem, using the MMCI driver.
No swap.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH] ARM: fix highmem with VIPT cache and DMA
  2010-05-28 18:56       ` Russell King - ARM Linux
@ 2010-05-28 19:26         ` Nicolas Pitre
  2010-05-28 19:45           ` Russell King - ARM Linux
  0 siblings, 1 reply; 25+ messages in thread
From: Nicolas Pitre @ 2010-05-28 19:26 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 28 May 2010, Russell King - ARM Linux wrote:

> FYI, I'm considering reverting this patch; this patch is one of two
> responsible for causing a stability regression of Versatile Express
> platform - a highmem+Cortex A9 MPCore system with VIPT non-aliasing
> caches.

Two questions:

1) Do you actually have CONFIG_HIGHMEM enabled?

2) If no, does the reversal of this patch really make a difference?
   If yes, are you sure highmem actually works reliably without it?

In theory, this patch should have no effect if CONFIG_HIGHMEM is 
configured out, and should result in the same kernel binary with or 
without this patch.

But if you do have CONFIG_HIGHMEM turned on without this patch then you 
should get memory corruption pretty fast simply by booting on a block 
device with a driver that uses DMA.  That was confirmed on both a TI 
OMAP board and on a Marvell Dove board.

> The two commits responsible for making Versatile Express userspace
> completely unstable are:
> 
> b8349b569aae661dea9d59d7d2ee587ccea3336c
> ARM: 6112/1: Use the Inner Shareable I-cache and BTB ops on ARMv7 SMP
> 
> 7e5a69e83ba7a0d5917ad830f417cba8b8d6aa72
> ARM: 6007/1: fix highmem with VIPT cache and DMA
> 
> Most of the instability is from the second (this) patch, but reverting
> just one patch doesn't fully restore userspace stability.

And that includes reverting only b8349b56, right?

> The instability seen consists of completely random SIGSEGVs and SIGILLs,
> sometimes early enough to kill init, sometimes init is the only thing
> able to run - so it's very severe.

That is exactly the observed behavior on OMAP and Dove when this patch 
is _not_ applied.

> Rootfs is on ext2fs on a read only MMC filesystem, using the MMCI driver.
> No swap.

The OMAP board was also using an MMC card, while the Dove board was 
booting on a SATA hard disk.

Also beware the CONFIG_MMC_BLOCK_BOUNCE option.  If that is enabled and 
the core code decides to actually use it, then the cache coherency issue 
caused by DMA usage will be hidden and things might appear to just work 
fine as the DMA won't happen directly into highmem user pages (same 
thing if you boot with NFS root).


Nicolas

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH] ARM: fix highmem with VIPT cache and DMA
  2010-05-28 19:26         ` Nicolas Pitre
@ 2010-05-28 19:45           ` Russell King - ARM Linux
  2010-05-28 20:28             ` Russell King - ARM Linux
  0 siblings, 1 reply; 25+ messages in thread
From: Russell King - ARM Linux @ 2010-05-28 19:45 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, May 28, 2010 at 03:26:21PM -0400, Nicolas Pitre wrote:
> On Fri, 28 May 2010, Russell King - ARM Linux wrote:
> 
> > FYI, I'm considering reverting this patch; this patch is one of two
> > responsible for causing a stability regression of Versatile Express
> > platform - a highmem+Cortex A9 MPCore system with VIPT non-aliasing
> > caches.
> 
> Two questions:
> 
> 1) Do you actually have CONFIG_HIGHMEM enabled?

Yes - and here's the messages to prove it.

Uncompressing Linux... done, booting the kernel.
Linux version 2.6.34 (rmk at rmk-PC) (gcc version 4.3.2 (GCC) ) #165 SMP Fri May 28 19:38:08 BST 2010
CPU: ARMv7 Processor [410fc091] revision 1 (ARMv7), cr=10c53c7f
CPU: VIPT nonaliasing data cache, VIPT nonaliasing instruction cache
Machine: ARM-Versatile Express CA9x4
Memory policy: ECC disabled, Data cache writealloc
PERCPU: Embedded 8 pages/cpu @c0c87000 s8768 r8192 d15808 u65536
pcpu-alloc: s8768 r8192 d15808 u65536 alloc=16*4096
pcpu-alloc: [0] 0 [0] 1 [0] 2 [0] 3
Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 260096
Kernel command line: root=/dev/mmcblk0p1 ro mem=1024M console=ttyAMA0 vmalloc=256M rootdelay=5
PID hash table entries: 4096 (order: 2, 16384 bytes)
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Memory: 640MB 384MB = 1024MB total
Memory: 1034868k/1034868k available, 13708k reserved, 393216K highmem
Virtual kernel memory layout:
    vector  : 0xffff0000 - 0xffff1000   (   4 kB)
    fixmap  : 0xfff00000 - 0xfffe0000   ( 896 kB)
    DMA     : 0xffc00000 - 0xffe00000   (   2 MB)
    vmalloc : 0xe8800000 - 0xf8000000   ( 248 MB)
    lowmem  : 0xc0000000 - 0xe8000000   ( 640 MB)
    pkmap   : 0xbfe00000 - 0xc0000000   (   2 MB)
    modules : 0xbf000000 - 0xbfe00000   (  14 MB)
      .init : 0xc0008000 - 0xc002a000   ( 136 kB)
      .text : 0xc002a000 - 0xc03c4000   (3688 kB)
      .data : 0xc03c4000 - 0xc03f19b0   ( 183 kB)
SLUB: Genslabs=11, HWalign=32, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
Hierarchical RCU implementation.
        RCU-based detection of stalled CPUs is disabled.
        Verbose stalled-CPUs detection is disabled.
NR_IRQS:128
Console: colour dummy device 80x30
Calibrating delay loop... 797.90 BogoMIPS (lpj=3989504)
Mount-cache hash table entries: 512
CPU: Testing write buffer coherency: ok
CPU1: Booted secondary processor
CPU2: Booted secondary processor
CPU3: Booted secondary processor
Brought up 4 CPUs
SMP: Total of 4 processors activated (3196.51 BogoMIPS).
NET: Registered protocol family 16
L310 cache controller enabled
l2x0: 8 ways, CACHE_ID 0x410000c3, AUX_CTRL 0x02060000
...

> 2) If no, does the reversal of this patch really make a difference?
>    If yes, are you sure highmem actually works reliably without it?

With this patch, userspace sometimes doesn't even make it to printing
the first "INIT: version" line, other times it does and the next line
you see is "INIT: entering runlevel 5".  Other times, you get segfaults
or illegal instructions reported by the shell.

Revert this patch, and I have a much more stable system, resulting in
most (but not all) of things like udev, dbus, hal, networkmanager
starting.

Revert the other patch, and everything comes up.  Restore your patch,
everything falls apart completely again.

> But if you do have CONFIG_HIGHMEM turned on without this patch then you 
> should get memory corruption pretty fast simply by booting on a block 
> device with a driver that uses DMA.  That was confirmed on both a TI 
> OMAP board and on a Marvell Dove board.

The MMCI primecell is currently PIO only.

> > The two commits responsible for making Versatile Express userspace
> > completely unstable are:
> > 
> > b8349b569aae661dea9d59d7d2ee587ccea3336c
> > ARM: 6112/1: Use the Inner Shareable I-cache and BTB ops on ARMv7 SMP
> > 
> > 7e5a69e83ba7a0d5917ad830f417cba8b8d6aa72
> > ARM: 6007/1: fix highmem with VIPT cache and DMA
> > 
> > Most of the instability is from the second (this) patch, but reverting
> > just one patch doesn't fully restore userspace stability.
> 
> And that includes reverting only b8349b56, right?

Correct.  I've tried all four combinations.  Most of the instability is
down to 7e5a.  Reverting b834 on its own has no visible difference.
Reverting 7e5a has a very noticable difference but is not completely back
to normal without b834 too.

> > The instability seen consists of completely random SIGSEGVs and SIGILLs,
> > sometimes early enough to kill init, sometimes init is the only thing
> > able to run - so it's very severe.
> 
> That is exactly the observed behavior on OMAP and Dove when this patch 
> is _not_ applied.

I'm about to test this on my OMAP4 board, which will be a DMA-based test,
which will be using the same filesystem.

This will be an interesting test because both systems have the same
revision and patch level of the Cortex A9 MPCore hardware.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH] ARM: fix highmem with VIPT cache and DMA
  2010-05-28 19:45           ` Russell King - ARM Linux
@ 2010-05-28 20:28             ` Russell King - ARM Linux
  2010-05-28 20:49               ` Nicolas Pitre
  0 siblings, 1 reply; 25+ messages in thread
From: Russell King - ARM Linux @ 2010-05-28 20:28 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, May 28, 2010 at 08:45:01PM +0100, Russell King - ARM Linux wrote:
> On Fri, May 28, 2010 at 03:26:21PM -0400, Nicolas Pitre wrote:
> > That is exactly the observed behavior on OMAP and Dove when this patch 
> > is _not_ applied.
> 
> I'm about to test this on my OMAP4 board, which will be a DMA-based test,
> which will be using the same filesystem.

Confirmed - on the OMAP4 board _with_ DMA, this patch fixes a problem
causing various faults in userspace - though it's nowhere near as bad
as the Versatile Express board.

So, we seem to have this behaviour:

- with PIO drivers, with this patch breaks userspace, without fixes userspace
- with DMA drivers, with this patch fixes userspace, without breaks userspace

Note that MMCI does do flush_dcache_page() on PIO read.

Maybe there's something wrong in __flush_dcache_page()?

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH] ARM: fix highmem with VIPT cache and DMA
  2010-05-28 20:28             ` Russell King - ARM Linux
@ 2010-05-28 20:49               ` Nicolas Pitre
  2010-05-28 22:19                 ` Russell King - ARM Linux
  0 siblings, 1 reply; 25+ messages in thread
From: Nicolas Pitre @ 2010-05-28 20:49 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 28 May 2010, Russell King - ARM Linux wrote:

> On Fri, May 28, 2010 at 08:45:01PM +0100, Russell King - ARM Linux wrote:
> > On Fri, May 28, 2010 at 03:26:21PM -0400, Nicolas Pitre wrote:
> > > That is exactly the observed behavior on OMAP and Dove when this patch 
> > > is _not_ applied.
> > 
> > I'm about to test this on my OMAP4 board, which will be a DMA-based test,
> > which will be using the same filesystem.
> 
> Confirmed - on the OMAP4 board _with_ DMA, this patch fixes a problem
> causing various faults in userspace - though it's nowhere near as bad
> as the Versatile Express board.

In my experience, the badness depends on the actual amount of highmem 
you have.  Changing that number using the memory and vmalloc boot 
arguments will move the average behavior between "it almost boots" and 
"init doesn't even come up".

> So, we seem to have this behaviour:
> 
> - with PIO drivers, with this patch breaks userspace, without fixes userspace
> - with DMA drivers, with this patch fixes userspace, without breaks userspace

Could you possibly test a UP kernel on those as well?  That might 
provide a better matrix.

> Note that MMCI does do flush_dcache_page() on PIO read.
> 
> Maybe there's something wrong in __flush_dcache_page()?

One possible suspect might be the optimization in kmap_high_l1_vipt() 
which I couldn't test on SMP as I have no access to such beast.  Hence 
this patch might be worth testing as well:

diff --git a/arch/arm/mm/highmem.c b/arch/arm/mm/highmem.c
index 77b030f..06abed2 100644
--- a/arch/arm/mm/highmem.c
+++ b/arch/arm/mm/highmem.c
@@ -169,7 +169,7 @@ void *kmap_high_l1_vipt(struct page *page, pte_t *saved_pte)
 
 	raw_local_irq_save(flags);
 	(*depth)++;
-	if (pte_val(*ptep) == pte_val(pte)) {
+	if (0 && pte_val(*ptep) == pte_val(pte)) {
 		*saved_pte = pte;
 	} else {
 		*saved_pte = *ptep;
@@ -198,7 +198,7 @@ void kunmap_high_l1_vipt(struct page *page, pte_t saved_pte)
 
 	raw_local_irq_save(flags);
 	(*depth)--;
-	if (*depth != 0 && pte_val(pte) != pte_val(saved_pte)) {
+	if (1 || (*depth != 0 && pte_val(pte) != pte_val(saved_pte))) {
 		set_pte_ext(ptep, saved_pte, 0);
 		local_flush_tlb_kernel_page(vaddr);
 	}

Nicolas

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH] ARM: fix highmem with VIPT cache and DMA
  2010-05-28 20:49               ` Nicolas Pitre
@ 2010-05-28 22:19                 ` Russell King - ARM Linux
  2010-05-29  0:00                   ` Nicolas Pitre
  0 siblings, 1 reply; 25+ messages in thread
From: Russell King - ARM Linux @ 2010-05-28 22:19 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, May 28, 2010 at 04:49:40PM -0400, Nicolas Pitre wrote:
> One possible suspect might be the optimization in kmap_high_l1_vipt() 
> which I couldn't test on SMP as I have no access to such beast.  Hence 
> this patch might be worth testing as well:

Well, it's turned out that it's probably not your patch, but mmci.c at
fault.  It's kmapping the scatterlist entry, but the scatterlist entry
buffer length if 16K.

Somehow this doesn't cause us to oops when we run off the end of the
kmap_atomic'd buffer, even with highmem debugging on.

Maybe we need a fence between each KM_* like other arches do (see
asm-generic/kmap_types.h).  This probably needs the kmap_high_get()
check in kmap_atomic() to be avoided on non-aliasing VIPT to reliably
make such overruns trigger an oops.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH] ARM: fix highmem with VIPT cache and DMA
  2010-05-28 22:19                 ` Russell King - ARM Linux
@ 2010-05-29  0:00                   ` Nicolas Pitre
  0 siblings, 0 replies; 25+ messages in thread
From: Nicolas Pitre @ 2010-05-29  0:00 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 28 May 2010, Russell King - ARM Linux wrote:

> On Fri, May 28, 2010 at 04:49:40PM -0400, Nicolas Pitre wrote:
> > One possible suspect might be the optimization in kmap_high_l1_vipt() 
> > which I couldn't test on SMP as I have no access to such beast.  Hence 
> > this patch might be worth testing as well:
> 
> Well, it's turned out that it's probably not your patch, but mmci.c at
> fault.  It's kmapping the scatterlist entry, but the scatterlist entry
> buffer length if 16K.

Ah.

> Somehow this doesn't cause us to oops when we run off the end of the
> kmap_atomic'd buffer, even with highmem debugging on.

That's strange.  It uses KM_BIO_SRC_IRQ, and the following kmap is 
KM_BIO_DST_IRQ.  In the whole source tree, only 
drivers/mmc/host/tifm_sd.c appears to be using KM_BIO_DST_IRQ.  So in 
your case it must never have been set to any valid mapping. In that case 
the readsl() to follow should have oopsed.

And when highmem debugging is active, all atomic kmaps are always 
cleared upon kmap_atomic(), effectively creating a big fence around the 
kmap_atomic'd buffer.

> Maybe we need a fence between each KM_* like other arches do (see
> asm-generic/kmap_types.h).  This probably needs the kmap_high_get()
> check in kmap_atomic() to be avoided on non-aliasing VIPT to reliably
> make such overruns trigger an oops.

Oh, right.  The non-atomic mappings that we opportunistically reuse are 
lazily unmapped... and the probability to find a following mapped entry 
is quite high.  So I think all that is needed is this:

diff --git a/arch/arm/mm/highmem.c b/arch/arm/mm/highmem.c
index 77b030f..086816b 100644
--- a/arch/arm/mm/highmem.c
+++ b/arch/arm/mm/highmem.c
@@ -48,7 +48,16 @@ void *kmap_atomic(struct page *page, enum km_type type)
 
 	debug_kmap_atomic(type);
 
-	kmap = kmap_high_get(page);
+#ifdef CONFIG_DEBUG_HIGHMEM
+	/*
+	 * There is no cache coherency issue when non VIVT, so force the
+	 * dedicated kmap usage for better debugging purposes in that case.
+	 */
+	if (!cache_is_vivt())
+		kmap = NULL;
+	else
+#endif
+		kmap = kmap_high_get(page);
 	if (kmap)
 		return kmap;
 

If you can confirm that this actually traps the mmci bug then I'll 
submit that to the patch system with an appropriate description.


Nicolas

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH] ARM: fix highmem with VIPT cache and DMA
  2010-04-25 19:01   ` Allen Pais
@ 2010-04-26  1:07     ` Nicolas Pitre
  0 siblings, 0 replies; 25+ messages in thread
From: Nicolas Pitre @ 2010-04-26  1:07 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, 25 Apr 2010, Allen Pais wrote:

> > >???But currently we are facing issues with test cases such
> > > as AV Playback, mp3 play, image capture, camera 
> > > capture+mpeg4 encode. 
> > > 
> > >???Am not sure what's causing this break, any pointer would be
> > > helpful.
> 
> > To be clear, are you saying that those test cases are failing with 
> > highmem only?
> 
> Yes, when Highmem is enabled, these(testcases) failures
> are what we are observing.

Let's take a simple one: mp3 playback.  What exactly is happening when 
highmem is enabled?


Nicolas

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH] ARM: fix highmem with VIPT cache and DMA
  2010-04-25 14:13 ` Nicolas Pitre
@ 2010-04-25 19:01   ` Allen Pais
  2010-04-26  1:07     ` Nicolas Pitre
  0 siblings, 1 reply; 25+ messages in thread
From: Allen Pais @ 2010-04-25 19:01 UTC (permalink / raw)
  To: linux-arm-kernel


> 
>? ? As mentioned by Hemanth, we were facing issues with 
> highmem and MMC, but this is resolved now with the patch
> you shared. 
> 
>? ? Here's the patch for your ref.[its reverted cause of the
> issue] 
> http://dev.omapzoom.org/?p=integration/kernel-omap3.git;a=commit;h=f6427e74a3c08fa5e30c99de0e9e8ed9d3b65b74
> 
>???But currently we are facing issues with test cases such
> as AV Playback, mp3 play, image capture, camera 
> capture+mpeg4 encode. 
> 
>???Am not sure what's causing this break, any pointer would be
> helpful.

To be clear, are you saying that those test cases are failing with 
highmem only?

[Allen] Yes, when Highmem is enabled, these(testcases) failures
are what we are observing.

- Allen




      
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20100425/6c4d9465/attachment-0001.htm>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH] ARM: fix highmem with VIPT cache and DMA
  2010-04-23  6:55 Allen Pais
@ 2010-04-25 14:13 ` Nicolas Pitre
  2010-04-25 19:01   ` Allen Pais
  0 siblings, 1 reply; 25+ messages in thread
From: Nicolas Pitre @ 2010-04-25 14:13 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 22 Apr 2010, Allen Pais wrote:

> Nicolas,
> 
>    As mentioned by Hemanth, we were facing issues with 
> highmem and MMC, but this is resolved now with the patch
> you shared. 
> 
>    Here's the patch for your ref.[its reverted cause of the
> issue] 
> http://dev.omapzoom.org/?p=integration/kernel-omap3.git;a=commit;h=f6427e74a3c08fa5e30c99de0e9e8ed9d3b65b74
> 
>   But currently we are facing issues with test cases such
> as AV Playback, mp3 play, image capture, camera 
> capture+mpeg4 encode. 
> 
>   Am not sure what's causing this break, any pointer would be
> helpful.

To be clear, are you saying that those test cases are failing with 
highmem only?


Nicolas

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH] ARM: fix highmem with VIPT cache and DMA
@ 2010-04-23  6:55 Allen Pais
  2010-04-25 14:13 ` Nicolas Pitre
  0 siblings, 1 reply; 25+ messages in thread
From: Allen Pais @ 2010-04-23  6:55 UTC (permalink / raw)
  To: linux-arm-kernel

Nicolas,

   As mentioned by Hemanth, we were facing issues with 
highmem and MMC, but this is resolved now with the patch
you shared. 

   Here's the patch for your ref.[its reverted cause of the
issue] 
http://dev.omapzoom.org/?p=integration/kernel-omap3.git;a=commit;h=f6427e74a3c08fa5e30c99de0e9e8ed9d3b65b74

  But currently we are facing issues with test cases such
as AV Playback, mp3 play, image capture, camera 
capture+mpeg4 encode. 

  Am not sure what's causing this break, any pointer would be
helpful.

Thanks,

- Allen



      
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20100422/5d72c555/attachment.htm>

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2010-05-29  0:00 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-03-25 21:02 [PATCH] ARM: fix highmem with VIPT cache and DMA Nicolas Pitre
2010-03-25 21:02 ` Nicolas Pitre
2010-03-26 13:34 ` Catalin Marinas
2010-03-26 13:34   ` Catalin Marinas
2010-03-26 15:51   ` Nicolas Pitre
2010-03-26 15:51     ` Nicolas Pitre
2010-03-26 23:09     ` Russell King - ARM Linux
2010-03-26 23:09       ` Russell King - ARM Linux
2010-04-11 14:42     ` Nicolas Pitre
2010-04-12  7:56       ` saeed bishara
2010-04-12 15:36         ` Nicolas Pitre
2010-04-13  7:11           ` saeed bishara
2010-04-13 11:50             ` Nicolas Pitre
2010-04-12 19:41       ` Russell King - ARM Linux
2010-05-28 18:56       ` Russell King - ARM Linux
2010-05-28 19:26         ` Nicolas Pitre
2010-05-28 19:45           ` Russell King - ARM Linux
2010-05-28 20:28             ` Russell King - ARM Linux
2010-05-28 20:49               ` Nicolas Pitre
2010-05-28 22:19                 ` Russell King - ARM Linux
2010-05-29  0:00                   ` Nicolas Pitre
2010-04-23  6:55 Allen Pais
2010-04-25 14:13 ` Nicolas Pitre
2010-04-25 19:01   ` Allen Pais
2010-04-26  1:07     ` Nicolas Pitre

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.