linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/6] powerpc: convert cache asm to C
@ 2019-09-03  5:23 Alastair D'Silva
  2019-09-03  5:23 ` [PATCH v2 1/6] powerpc: Allow flush_icache_range to work across ranges >4GB Alastair D'Silva
                   ` (5 more replies)
  0 siblings, 6 replies; 28+ messages in thread
From: Alastair D'Silva @ 2019-09-03  5:23 UTC (permalink / raw)
  To: alastair
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Christophe Leroy, Allison Randal, Thomas Gleixner, Qian Cai,
	Nicholas Piggin, Greg Kroah-Hartman, Andrew Morton,
	David Hildenbrand, Michal Hocko, Mike Rapoport, linuxppc-dev,
	linux-kernel

From: Alastair D'Silva <alastair@d-silva.org>

This series addresses a few issues discovered in how we flush caches:
1. Flushes were truncated at 4GB, so larger flushes were incorrect.
2. Flushing the dcache in arch_add_memory was unnecessary

This series also converts much of the cache assembler to C, with the
aim of making it easier to maintain.

Alastair D'Silva (6):
  powerpc: Allow flush_icache_range to work across ranges >4GB
  powerpc: define helpers to get L1 icache sizes
  powerpc: Convert flush_icache_range & friends to C
  powerpc: Chunk calls to flush_dcache_range in arch_*_memory
  powerpc: Remove 'extern' from func prototypes in cache headers
  powerpc: Don't flush caches when adding memory

Changelog:
 V2:
     - Replace C implementation of flush_dcache_icache_phys() with
       inline assembler authored by Christophe Leroy
     - Add memory clobbers for iccci implementation
     - Give __flush_dcache_icache a real implementation, it can't
       just be a wrapper around flush_icache_range()
     - Remove PPC64_CACHES from misc_64.S
     - Replace code duplicating clean_dcache_range() in
       flush_icache_range() with a call to clean_dcache_range()
     - Replace #ifdef CONFIG_44x with IS_ENABLED(...) in
       flush_icache_cange()
     - Use 1GB chunks instead of 16GB in arch_*_memory


 arch/powerpc/include/asm/cache.h      |  63 ++++++----
 arch/powerpc/include/asm/cacheflush.h |  37 +++---
 arch/powerpc/kernel/misc_32.S         | 117 -------------------
 arch/powerpc/kernel/misc_64.S         | 102 -----------------
 arch/powerpc/mm/mem.c                 | 159 +++++++++++++++++++++++++-
 5 files changed, 213 insertions(+), 265 deletions(-)

-- 
2.21.0


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v2 1/6] powerpc: Allow flush_icache_range to work across ranges >4GB
  2019-09-03  5:23 [PATCH v2 0/6] powerpc: convert cache asm to C Alastair D'Silva
@ 2019-09-03  5:23 ` Alastair D'Silva
  2019-09-14  7:46   ` Christophe Leroy
  2019-09-03  5:23 ` [PATCH v2 2/6] powerpc: define helpers to get L1 icache sizes Alastair D'Silva
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 28+ messages in thread
From: Alastair D'Silva @ 2019-09-03  5:23 UTC (permalink / raw)
  To: alastair
  Cc: stable, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Christophe Leroy, Greg Kroah-Hartman, Thomas Gleixner, Qian Cai,
	Nicholas Piggin, Allison Randal, Andrew Morton, Mike Rapoport,
	Michal Hocko, David Hildenbrand, linuxppc-dev, linux-kernel

From: Alastair D'Silva <alastair@d-silva.org>

When calling flush_icache_range with a size >4GB, we were masking
off the upper 32 bits, so we would incorrectly flush a range smaller
than intended.

This patch replaces the 32 bit shifts with 64 bit ones, so that
the full size is accounted for.

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
Cc: stable@vger.kernel.org
---
 arch/powerpc/kernel/misc_64.S | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/misc_64.S b/arch/powerpc/kernel/misc_64.S
index b55a7b4cb543..9bc0aa9aeb65 100644
--- a/arch/powerpc/kernel/misc_64.S
+++ b/arch/powerpc/kernel/misc_64.S
@@ -82,7 +82,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
 	subf	r8,r6,r4		/* compute length */
 	add	r8,r8,r5		/* ensure we get enough */
 	lwz	r9,DCACHEL1LOGBLOCKSIZE(r10)	/* Get log-2 of cache block size */
-	srw.	r8,r8,r9		/* compute line count */
+	srd.	r8,r8,r9		/* compute line count */
 	beqlr				/* nothing to do? */
 	mtctr	r8
 1:	dcbst	0,r6
@@ -98,7 +98,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
 	subf	r8,r6,r4		/* compute length */
 	add	r8,r8,r5
 	lwz	r9,ICACHEL1LOGBLOCKSIZE(r10)	/* Get log-2 of Icache block size */
-	srw.	r8,r8,r9		/* compute line count */
+	srd.	r8,r8,r9		/* compute line count */
 	beqlr				/* nothing to do? */
 	mtctr	r8
 2:	icbi	0,r6
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 2/6] powerpc: define helpers to get L1 icache sizes
  2019-09-03  5:23 [PATCH v2 0/6] powerpc: convert cache asm to C Alastair D'Silva
  2019-09-03  5:23 ` [PATCH v2 1/6] powerpc: Allow flush_icache_range to work across ranges >4GB Alastair D'Silva
@ 2019-09-03  5:23 ` Alastair D'Silva
  2019-09-03  5:23 ` [PATCH v2 3/6] powerpc: Convert flush_icache_range & friends to C Alastair D'Silva
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 28+ messages in thread
From: Alastair D'Silva @ 2019-09-03  5:23 UTC (permalink / raw)
  To: alastair
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Christophe Leroy, Allison Randal, Greg Kroah-Hartman,
	Thomas Gleixner, Qian Cai, Nicholas Piggin, Andrew Morton,
	Mike Rapoport, David Hildenbrand, linuxppc-dev, linux-kernel

From: Alastair D'Silva <alastair@d-silva.org>

This patch adds helpers to retrieve icache sizes, and renames the existing
helpers to make it clear that they are for dcache.

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
---
 arch/powerpc/include/asm/cache.h      | 29 +++++++++++++++++++++++----
 arch/powerpc/include/asm/cacheflush.h | 12 +++++------
 2 files changed, 31 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/include/asm/cache.h b/arch/powerpc/include/asm/cache.h
index b3388d95f451..f852d5cd746c 100644
--- a/arch/powerpc/include/asm/cache.h
+++ b/arch/powerpc/include/asm/cache.h
@@ -55,25 +55,46 @@ struct ppc64_caches {
 
 extern struct ppc64_caches ppc64_caches;
 
-static inline u32 l1_cache_shift(void)
+static inline u32 l1_dcache_shift(void)
 {
 	return ppc64_caches.l1d.log_block_size;
 }
 
-static inline u32 l1_cache_bytes(void)
+static inline u32 l1_dcache_bytes(void)
 {
 	return ppc64_caches.l1d.block_size;
 }
+
+static inline u32 l1_icache_shift(void)
+{
+	return ppc64_caches.l1i.log_block_size;
+}
+
+static inline u32 l1_icache_bytes(void)
+{
+	return ppc64_caches.l1i.block_size;
+}
 #else
-static inline u32 l1_cache_shift(void)
+static inline u32 l1_dcache_shift(void)
 {
 	return L1_CACHE_SHIFT;
 }
 
-static inline u32 l1_cache_bytes(void)
+static inline u32 l1_dcache_bytes(void)
 {
 	return L1_CACHE_BYTES;
 }
+
+static inline u32 l1_icache_shift(void)
+{
+	return L1_CACHE_SHIFT;
+}
+
+static inline u32 l1_icache_bytes(void)
+{
+	return L1_CACHE_BYTES;
+}
+
 #endif
 #endif /* ! __ASSEMBLY__ */
 
diff --git a/arch/powerpc/include/asm/cacheflush.h b/arch/powerpc/include/asm/cacheflush.h
index eef388f2659f..ed57843ef452 100644
--- a/arch/powerpc/include/asm/cacheflush.h
+++ b/arch/powerpc/include/asm/cacheflush.h
@@ -63,8 +63,8 @@ static inline void __flush_dcache_icache_phys(unsigned long physaddr)
  */
 static inline void flush_dcache_range(unsigned long start, unsigned long stop)
 {
-	unsigned long shift = l1_cache_shift();
-	unsigned long bytes = l1_cache_bytes();
+	unsigned long shift = l1_dcache_shift();
+	unsigned long bytes = l1_dcache_bytes();
 	void *addr = (void *)(start & ~(bytes - 1));
 	unsigned long size = stop - (unsigned long)addr + (bytes - 1);
 	unsigned long i;
@@ -89,8 +89,8 @@ static inline void flush_dcache_range(unsigned long start, unsigned long stop)
  */
 static inline void clean_dcache_range(unsigned long start, unsigned long stop)
 {
-	unsigned long shift = l1_cache_shift();
-	unsigned long bytes = l1_cache_bytes();
+	unsigned long shift = l1_dcache_shift();
+	unsigned long bytes = l1_dcache_bytes();
 	void *addr = (void *)(start & ~(bytes - 1));
 	unsigned long size = stop - (unsigned long)addr + (bytes - 1);
 	unsigned long i;
@@ -108,8 +108,8 @@ static inline void clean_dcache_range(unsigned long start, unsigned long stop)
 static inline void invalidate_dcache_range(unsigned long start,
 					   unsigned long stop)
 {
-	unsigned long shift = l1_cache_shift();
-	unsigned long bytes = l1_cache_bytes();
+	unsigned long shift = l1_dcache_shift();
+	unsigned long bytes = l1_dcache_bytes();
 	void *addr = (void *)(start & ~(bytes - 1));
 	unsigned long size = stop - (unsigned long)addr + (bytes - 1);
 	unsigned long i;
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 3/6] powerpc: Convert flush_icache_range & friends to C
  2019-09-03  5:23 [PATCH v2 0/6] powerpc: convert cache asm to C Alastair D'Silva
  2019-09-03  5:23 ` [PATCH v2 1/6] powerpc: Allow flush_icache_range to work across ranges >4GB Alastair D'Silva
  2019-09-03  5:23 ` [PATCH v2 2/6] powerpc: define helpers to get L1 icache sizes Alastair D'Silva
@ 2019-09-03  5:23 ` Alastair D'Silva
  2019-09-03  6:08   ` Christophe Leroy
  2019-09-03 13:04   ` Segher Boessenkool
  2019-09-03  5:23 ` [PATCH v2 4/6] powerpc: Chunk calls to flush_dcache_range in arch_*_memory Alastair D'Silva
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 28+ messages in thread
From: Alastair D'Silva @ 2019-09-03  5:23 UTC (permalink / raw)
  To: alastair
  Cc: Christophe Leroy, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Greg Kroah-Hartman, Thomas Gleixner, Qian Cai,
	Nicholas Piggin, Allison Randal, Andrew Morton,
	David Hildenbrand, Mike Rapoport, linuxppc-dev, linux-kernel

From: Alastair D'Silva <alastair@d-silva.org>

Similar to commit 22e9c88d486a
("powerpc/64: reuse PPC32 static inline flush_dcache_range()")
this patch converts the following ASM symbols to C:
    flush_icache_range()
    __flush_dcache_icache()
    __flush_dcache_icache_phys()

This was done as we discovered a long-standing bug where the length of the
range was truncated due to using a 32 bit shift instead of a 64 bit one.

By converting these functions to C, it becomes easier to maintain.

flush_dcache_icache_phys() retains a critical assembler section as we must
ensure there are no memory accesses while the data MMU is disabled
(authored by Christophe Leroy). Since this has no external callers, it has
also been made static, allowing the compiler to inline it within
flush_dcache_icache_page().

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
 arch/powerpc/include/asm/cache.h      |  26 ++---
 arch/powerpc/include/asm/cacheflush.h |  24 ++--
 arch/powerpc/kernel/misc_32.S         | 117 --------------------
 arch/powerpc/kernel/misc_64.S         | 102 -----------------
 arch/powerpc/mm/mem.c                 | 152 +++++++++++++++++++++++++-
 5 files changed, 173 insertions(+), 248 deletions(-)

diff --git a/arch/powerpc/include/asm/cache.h b/arch/powerpc/include/asm/cache.h
index f852d5cd746c..91c808c6738b 100644
--- a/arch/powerpc/include/asm/cache.h
+++ b/arch/powerpc/include/asm/cache.h
@@ -98,20 +98,7 @@ static inline u32 l1_icache_bytes(void)
 #endif
 #endif /* ! __ASSEMBLY__ */
 
-#if defined(__ASSEMBLY__)
-/*
- * For a snooping icache, we still need a dummy icbi to purge all the
- * prefetched instructions from the ifetch buffers. We also need a sync
- * before the icbi to order the the actual stores to memory that might
- * have modified instructions with the icbi.
- */
-#define PURGE_PREFETCHED_INS	\
-	sync;			\
-	icbi	0,r3;		\
-	sync;			\
-	isync
-
-#else
+#if !defined(__ASSEMBLY__)
 #define __read_mostly __attribute__((__section__(".data..read_mostly")))
 
 #ifdef CONFIG_PPC_BOOK3S_32
@@ -145,6 +132,17 @@ static inline void dcbst(void *addr)
 {
 	__asm__ __volatile__ ("dcbst %y0" : : "Z"(*(u8 *)addr) : "memory");
 }
+
+static inline void icbi(void *addr)
+{
+	__asm__ __volatile__ ("icbi 0, %0" : : "r"(addr) : "memory");
+}
+
+static inline void iccci(void *addr)
+{
+	__asm__ __volatile__ ("iccci 0, %0" : : "r"(addr) : "memory");
+}
+
 #endif /* !__ASSEMBLY__ */
 #endif /* __KERNEL__ */
 #endif /* _ASM_POWERPC_CACHE_H */
diff --git a/arch/powerpc/include/asm/cacheflush.h b/arch/powerpc/include/asm/cacheflush.h
index ed57843ef452..4a1c9f0200e1 100644
--- a/arch/powerpc/include/asm/cacheflush.h
+++ b/arch/powerpc/include/asm/cacheflush.h
@@ -42,24 +42,20 @@ extern void flush_dcache_page(struct page *page);
 #define flush_dcache_mmap_lock(mapping)		do { } while (0)
 #define flush_dcache_mmap_unlock(mapping)	do { } while (0)
 
-extern void flush_icache_range(unsigned long, unsigned long);
+void flush_icache_range(unsigned long start, unsigned long stop);
 extern void flush_icache_user_range(struct vm_area_struct *vma,
 				    struct page *page, unsigned long addr,
 				    int len);
-extern void __flush_dcache_icache(void *page_va);
 extern void flush_dcache_icache_page(struct page *page);
-#if defined(CONFIG_PPC32) && !defined(CONFIG_BOOKE)
-extern void __flush_dcache_icache_phys(unsigned long physaddr);
-#else
-static inline void __flush_dcache_icache_phys(unsigned long physaddr)
-{
-	BUG();
-}
-#endif
-
-/*
- * Write any modified data cache blocks out to memory and invalidate them.
- * Does not invalidate the corresponding instruction cache blocks.
+void __flush_dcache_icache(void *page);
+
+/**
+ * flush_dcache_range(): Write any modified data cache blocks out to memory and
+ * invalidate them. Does not invalidate the corresponding instruction cache
+ * blocks.
+ *
+ * @start: the start address
+ * @stop: the stop address (exclusive)
  */
 static inline void flush_dcache_range(unsigned long start, unsigned long stop)
 {
diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
index fe4bd321730e..12b95e6799d4 100644
--- a/arch/powerpc/kernel/misc_32.S
+++ b/arch/powerpc/kernel/misc_32.S
@@ -318,123 +318,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_UNIFIED_ID_CACHE)
 EXPORT_SYMBOL(flush_instruction_cache)
 #endif /* CONFIG_PPC_8xx */
 
-/*
- * Write any modified data cache blocks out to memory
- * and invalidate the corresponding instruction cache blocks.
- * This is a no-op on the 601.
- *
- * flush_icache_range(unsigned long start, unsigned long stop)
- */
-_GLOBAL(flush_icache_range)
-BEGIN_FTR_SECTION
-	PURGE_PREFETCHED_INS
-	blr				/* for 601, do nothing */
-END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
-	rlwinm	r3,r3,0,0,31 - L1_CACHE_SHIFT
-	subf	r4,r3,r4
-	addi	r4,r4,L1_CACHE_BYTES - 1
-	srwi.	r4,r4,L1_CACHE_SHIFT
-	beqlr
-	mtctr	r4
-	mr	r6,r3
-1:	dcbst	0,r3
-	addi	r3,r3,L1_CACHE_BYTES
-	bdnz	1b
-	sync				/* wait for dcbst's to get to ram */
-#ifndef CONFIG_44x
-	mtctr	r4
-2:	icbi	0,r6
-	addi	r6,r6,L1_CACHE_BYTES
-	bdnz	2b
-#else
-	/* Flash invalidate on 44x because we are passed kmapped addresses and
-	   this doesn't work for userspace pages due to the virtually tagged
-	   icache.  Sigh. */
-	iccci	0, r0
-#endif
-	sync				/* additional sync needed on g4 */
-	isync
-	blr
-_ASM_NOKPROBE_SYMBOL(flush_icache_range)
-EXPORT_SYMBOL(flush_icache_range)
-
-/*
- * Flush a particular page from the data cache to RAM.
- * Note: this is necessary because the instruction cache does *not*
- * snoop from the data cache.
- * This is a no-op on the 601 which has a unified cache.
- *
- *	void __flush_dcache_icache(void *page)
- */
-_GLOBAL(__flush_dcache_icache)
-BEGIN_FTR_SECTION
-	PURGE_PREFETCHED_INS
-	blr
-END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
-	rlwinm	r3,r3,0,0,31-PAGE_SHIFT		/* Get page base address */
-	li	r4,PAGE_SIZE/L1_CACHE_BYTES	/* Number of lines in a page */
-	mtctr	r4
-	mr	r6,r3
-0:	dcbst	0,r3				/* Write line to ram */
-	addi	r3,r3,L1_CACHE_BYTES
-	bdnz	0b
-	sync
-#ifdef CONFIG_44x
-	/* We don't flush the icache on 44x. Those have a virtual icache
-	 * and we don't have access to the virtual address here (it's
-	 * not the page vaddr but where it's mapped in user space). The
-	 * flushing of the icache on these is handled elsewhere, when
-	 * a change in the address space occurs, before returning to
-	 * user space
-	 */
-BEGIN_MMU_FTR_SECTION
-	blr
-END_MMU_FTR_SECTION_IFSET(MMU_FTR_TYPE_44x)
-#endif /* CONFIG_44x */
-	mtctr	r4
-1:	icbi	0,r6
-	addi	r6,r6,L1_CACHE_BYTES
-	bdnz	1b
-	sync
-	isync
-	blr
-
-#ifndef CONFIG_BOOKE
-/*
- * Flush a particular page from the data cache to RAM, identified
- * by its physical address.  We turn off the MMU so we can just use
- * the physical address (this may be a highmem page without a kernel
- * mapping).
- *
- *	void __flush_dcache_icache_phys(unsigned long physaddr)
- */
-_GLOBAL(__flush_dcache_icache_phys)
-BEGIN_FTR_SECTION
-	PURGE_PREFETCHED_INS
-	blr					/* for 601, do nothing */
-END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
-	mfmsr	r10
-	rlwinm	r0,r10,0,28,26			/* clear DR */
-	mtmsr	r0
-	isync
-	rlwinm	r3,r3,0,0,31-PAGE_SHIFT		/* Get page base address */
-	li	r4,PAGE_SIZE/L1_CACHE_BYTES	/* Number of lines in a page */
-	mtctr	r4
-	mr	r6,r3
-0:	dcbst	0,r3				/* Write line to ram */
-	addi	r3,r3,L1_CACHE_BYTES
-	bdnz	0b
-	sync
-	mtctr	r4
-1:	icbi	0,r6
-	addi	r6,r6,L1_CACHE_BYTES
-	bdnz	1b
-	sync
-	mtmsr	r10				/* restore DR */
-	isync
-	blr
-#endif /* CONFIG_BOOKE */
-
 /*
  * Copy a whole page.  We use the dcbz instruction on the destination
  * to reduce memory traffic (it eliminates the unnecessary reads of
diff --git a/arch/powerpc/kernel/misc_64.S b/arch/powerpc/kernel/misc_64.S
index 9bc0aa9aeb65..ff20c253f273 100644
--- a/arch/powerpc/kernel/misc_64.S
+++ b/arch/powerpc/kernel/misc_64.S
@@ -49,108 +49,6 @@ _GLOBAL(call_do_irq)
 	mtlr	r0
 	blr
 
-	.section	".toc","aw"
-PPC64_CACHES:
-	.tc		ppc64_caches[TC],ppc64_caches
-	.section	".text"
-
-/*
- * Write any modified data cache blocks out to memory
- * and invalidate the corresponding instruction cache blocks.
- *
- * flush_icache_range(unsigned long start, unsigned long stop)
- *
- *   flush all bytes from start through stop-1 inclusive
- */
-
-_GLOBAL_TOC(flush_icache_range)
-BEGIN_FTR_SECTION
-	PURGE_PREFETCHED_INS
-	blr
-END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
-/*
- * Flush the data cache to memory 
- * 
- * Different systems have different cache line sizes
- * and in some cases i-cache and d-cache line sizes differ from
- * each other.
- */
- 	ld	r10,PPC64_CACHES@toc(r2)
-	lwz	r7,DCACHEL1BLOCKSIZE(r10)/* Get cache block size */
-	addi	r5,r7,-1
-	andc	r6,r3,r5		/* round low to line bdy */
-	subf	r8,r6,r4		/* compute length */
-	add	r8,r8,r5		/* ensure we get enough */
-	lwz	r9,DCACHEL1LOGBLOCKSIZE(r10)	/* Get log-2 of cache block size */
-	srd.	r8,r8,r9		/* compute line count */
-	beqlr				/* nothing to do? */
-	mtctr	r8
-1:	dcbst	0,r6
-	add	r6,r6,r7
-	bdnz	1b
-	sync
-
-/* Now invalidate the instruction cache */
-	
-	lwz	r7,ICACHEL1BLOCKSIZE(r10)	/* Get Icache block size */
-	addi	r5,r7,-1
-	andc	r6,r3,r5		/* round low to line bdy */
-	subf	r8,r6,r4		/* compute length */
-	add	r8,r8,r5
-	lwz	r9,ICACHEL1LOGBLOCKSIZE(r10)	/* Get log-2 of Icache block size */
-	srd.	r8,r8,r9		/* compute line count */
-	beqlr				/* nothing to do? */
-	mtctr	r8
-2:	icbi	0,r6
-	add	r6,r6,r7
-	bdnz	2b
-	isync
-	blr
-_ASM_NOKPROBE_SYMBOL(flush_icache_range)
-EXPORT_SYMBOL(flush_icache_range)
-
-/*
- * Flush a particular page from the data cache to RAM.
- * Note: this is necessary because the instruction cache does *not*
- * snoop from the data cache.
- *
- *	void __flush_dcache_icache(void *page)
- */
-_GLOBAL(__flush_dcache_icache)
-/*
- * Flush the data cache to memory 
- * 
- * Different systems have different cache line sizes
- */
-
-BEGIN_FTR_SECTION
-	PURGE_PREFETCHED_INS
-	blr
-END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
-
-/* Flush the dcache */
- 	ld	r7,PPC64_CACHES@toc(r2)
-	clrrdi	r3,r3,PAGE_SHIFT           	    /* Page align */
-	lwz	r4,DCACHEL1BLOCKSPERPAGE(r7)	/* Get # dcache blocks per page */
-	lwz	r5,DCACHEL1BLOCKSIZE(r7)	/* Get dcache block size */
-	mr	r6,r3
-	mtctr	r4
-0:	dcbst	0,r6
-	add	r6,r6,r5
-	bdnz	0b
-	sync
-
-/* Now invalidate the icache */	
-
-	lwz	r4,ICACHEL1BLOCKSPERPAGE(r7)	/* Get # icache blocks per page */
-	lwz	r5,ICACHEL1BLOCKSIZE(r7)	/* Get icache block size */
-	mtctr	r4
-1:	icbi	0,r3
-	add	r3,r3,r5
-	bdnz	1b
-	isync
-	blr
-
 _GLOBAL(__bswapdi2)
 EXPORT_SYMBOL(__bswapdi2)
 	srdi	r8,r3,32
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index 9191a66b3bc5..cd540123874d 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -321,6 +321,105 @@ void free_initmem(void)
 	free_initmem_default(POISON_FREE_INITMEM);
 }
 
+/*
+ * Warning: This macro will perform an early return if the CPU has
+ * a coherent icache. The intent is is call this early in function,
+ * and handle the non-coherent icache variant afterwards.
+ *
+ * For a snooping icache, we still need a dummy icbi to purge all the
+ * prefetched instructions from the ifetch buffers. We also need a sync
+ * before the icbi to order the the actual stores to memory that might
+ * have modified instructions with the icbi.
+ */
+#define flush_coherent_icache_or_return(addr) {			\
+	if (cpu_has_feature(CPU_FTR_COHERENT_ICACHE)) {		\
+		mb(); /* sync */				\
+		icbi(addr);					\
+		mb(); /* sync */				\
+		isync();					\
+		return;						\
+	}							\
+}
+
+/**
+ * flush_icache_range: Write any modified data cache blocks out to memory
+ * and invalidate the corresponding blocks in the instruction cache
+ *
+ * Generic code will call this after writing memory, before executing from it.
+ *
+ * @start: the start address
+ * @stop: the stop address (exclusive)
+ */
+void flush_icache_range(unsigned long start, unsigned long stop)
+{
+	unsigned long shift = l1_icache_shift();
+	unsigned long bytes = l1_icache_bytes();
+	char *addr = (char *)(start & ~(bytes - 1));
+	unsigned long size = stop - (unsigned long)addr + (bytes - 1);
+	unsigned long i;
+
+	flush_coherent_icache_or_return(addr);
+	clean_dcache_range(start, stop);
+
+	if (IS_ENABLED(CONFIG_44x)) {
+		/*
+		 * Flash invalidate on 44x because we are passed kmapped
+		 * addresses and this doesn't work for userspace pages due to
+		 * the virtually tagged icache.
+		 */
+		iccci(addr);
+	} else {
+		/* Now invalidate the instruction cache */
+		for (i = 0; i < size >> shift; i++, addr += bytes)
+			icbi(addr);
+	}
+
+	if (!IS_ENABLED(CONFIG_PPC64))
+		mb(); /* additional sync needed on g4 */
+	isync();
+}
+EXPORT_SYMBOL(flush_icache_range);
+
+#if !defined(CONFIG_PPC_8xx) & !defined(CONFIG_PPC64)
+/**
+ * flush_dcache_icache_phys() - Flush a page by it's physical address
+ * @physaddr: the physical address of the page
+ */
+static void flush_dcache_icache_phys(unsigned long physaddr)
+{
+	unsigned long bytes = l1_dcache_bytes();
+	unsigned long nb = PAGE_SIZE / bytes;
+	unsigned long addr = physaddr & PAGE_MASK;
+	unsigned long msr, msr0;
+	unsigned long loop1 = addr, loop2 = addr;
+
+	msr0 = mfmsr();
+	msr = msr0 & ~MSR_DR;
+	/*
+	 * This must remain as ASM to prevent potential memory accesses
+	 * while the data MMU is disabled
+	 */
+	asm volatile(
+		"   mtctr %2;"
+		"   mtmsr %3;"
+		"   isync;"
+		"0: dcbst   0, %0;"
+		"   addi    %0, %0, %4;"
+		"   bdnz    0b;"
+		"   sync;"
+		"   mtctr %2;"
+		"1: icbi    0, %1;"
+		"   addi    %1, %1, %4;"
+		"   bdnz    1b;"
+		"   sync;"
+		"   mtmsr %5;"
+		"   isync;"
+		: "+r" (loop1), "+r" (loop2)
+		: "r" (nb), "r" (msr), "i" (bytes), "r" (msr0)
+		: "ctr", "memory");
+}
+#endif // !defined(CONFIG_PPC_8xx) & !defined(CONFIG_PPC64)
+
 /*
  * This is called when a page has been modified by the kernel.
  * It just marks the page as not i-cache clean.  We do the i-cache
@@ -353,12 +452,63 @@ void flush_dcache_icache_page(struct page *page)
 		__flush_dcache_icache(start);
 		kunmap_atomic(start);
 	} else {
-		__flush_dcache_icache_phys(page_to_pfn(page) << PAGE_SHIFT);
+		unsigned long addr = page_to_pfn(page) << PAGE_SHIFT;
+
+		flush_coherent_icache_or_return((void *)addr);
+		flush_dcache_icache_phys(addr);
 	}
 #endif
 }
 EXPORT_SYMBOL(flush_dcache_icache_page);
 
+/**
+ * __flush_dcache_icache(): Flush a particular page from the data cache to RAM.
+ * Note: this is necessary because the instruction cache does *not*
+ * snoop from the data cache.
+ *
+ * @page: the address of the page to flush
+ */
+void __flush_dcache_icache(void *page)
+{
+	char *addr = page;
+	unsigned long lines = PAGE_SIZE >> l1_dcache_shift();
+	unsigned long bytes = l1_dcache_bytes();
+	unsigned long i;
+
+	flush_coherent_icache_or_return(addr);
+
+	/* Flush the data cache to memory */
+	for (i = 0; i < lines; i++, addr += bytes)
+		dcbst(addr);
+
+	mb(); /* sync */
+
+#ifdef CONFIG_44x
+	/*
+	 * We don't flush the icache on 44x. Those have a virtual icache and we
+	 * don't have access to the virtual address here (it's not the page
+	 * vaddr but where it's mapped in user space). The flushing of the
+	 * icache on these is handled elsewhere, when a change in the address
+	 * space occurs, before returning to user space.
+	 */
+
+	if (cpu_has_feature(MMU_FTR_TYPE_44x))
+		return;
+#endif
+
+	lines = PAGE_SIZE >> l1_icache_shift();
+	bytes = l1_icache_bytes();
+	addr = page;
+
+	/* Now invalidate the instruction cache */
+	for (i = 0; i < lines; i++, addr += bytes)
+		icbi(addr);
+
+	mb(); /* sync */
+	isync();
+}
+EXPORT_SYMBOL(__flush_dcache_icache);
+
 void clear_user_page(void *page, unsigned long vaddr, struct page *pg)
 {
 	clear_page(page);
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 4/6] powerpc: Chunk calls to flush_dcache_range in arch_*_memory
  2019-09-03  5:23 [PATCH v2 0/6] powerpc: convert cache asm to C Alastair D'Silva
                   ` (2 preceding siblings ...)
  2019-09-03  5:23 ` [PATCH v2 3/6] powerpc: Convert flush_icache_range & friends to C Alastair D'Silva
@ 2019-09-03  5:23 ` Alastair D'Silva
  2019-09-03  6:19   ` Christophe Leroy
  2019-09-03  5:23 ` [PATCH v2 5/6] powerpc: Remove 'extern' from func prototypes in cache headers Alastair D'Silva
  2019-09-03  5:24 ` [PATCH v2 6/6] powerpc: Don't flush caches when adding memory Alastair D'Silva
  5 siblings, 1 reply; 28+ messages in thread
From: Alastair D'Silva @ 2019-09-03  5:23 UTC (permalink / raw)
  To: alastair
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Christophe Leroy, Thomas Gleixner, Greg Kroah-Hartman, Qian Cai,
	Nicholas Piggin, Allison Randal, Andrew Morton, Michal Hocko,
	Mike Rapoport, David Hildenbrand, linuxppc-dev, linux-kernel

From: Alastair D'Silva <alastair@d-silva.org>

When presented with large amounts of memory being hotplugged
(in my test case, ~890GB), the call to flush_dcache_range takes
a while (~50 seconds), triggering RCU stalls.

This patch breaks up the call into 1GB chunks, calling
cond_resched() inbetween to allow the scheduler to run.

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
---
 arch/powerpc/mm/mem.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index cd540123874d..854aaea2c6ae 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -104,11 +104,14 @@ int __weak remove_section_mapping(unsigned long start, unsigned long end)
 	return -ENODEV;
 }
 
+#define FLUSH_CHUNK_SIZE SZ_1G
+
 int __ref arch_add_memory(int nid, u64 start, u64 size,
 			struct mhp_restrictions *restrictions)
 {
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
+	u64 i;
 	int rc;
 
 	resize_hpt_for_hotplug(memblock_phys_mem_size());
@@ -120,7 +123,12 @@ int __ref arch_add_memory(int nid, u64 start, u64 size,
 			start, start + size, rc);
 		return -EFAULT;
 	}
-	flush_dcache_range(start, start + size);
+
+	for (i = 0; i < size; i += FLUSH_CHUNK_SIZE) {
+		flush_dcache_range(start + i,
+				   min(start + size, start + i + FLUSH_CHUNK_SIZE));
+		cond_resched();
+	}
 
 	return __add_pages(nid, start_pfn, nr_pages, restrictions);
 }
@@ -131,13 +139,19 @@ void __ref arch_remove_memory(int nid, u64 start, u64 size,
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
 	struct page *page = pfn_to_page(start_pfn) + vmem_altmap_offset(altmap);
+	u64 i;
 	int ret;
 
 	__remove_pages(page_zone(page), start_pfn, nr_pages, altmap);
 
 	/* Remove htab bolted mappings for this section of memory */
 	start = (unsigned long)__va(start);
-	flush_dcache_range(start, start + size);
+	for (i = 0; i < size; i += FLUSH_CHUNK_SIZE) {
+		flush_dcache_range(start + i,
+				   min(start + size, start + i + FLUSH_CHUNK_SIZE));
+		cond_resched();
+	}
+
 	ret = remove_section_mapping(start, start + size);
 	WARN_ON_ONCE(ret);
 
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 5/6] powerpc: Remove 'extern' from func prototypes in cache headers
  2019-09-03  5:23 [PATCH v2 0/6] powerpc: convert cache asm to C Alastair D'Silva
                   ` (3 preceding siblings ...)
  2019-09-03  5:23 ` [PATCH v2 4/6] powerpc: Chunk calls to flush_dcache_range in arch_*_memory Alastair D'Silva
@ 2019-09-03  5:23 ` Alastair D'Silva
  2019-09-03  6:21   ` Christophe Leroy
  2019-09-03  5:24 ` [PATCH v2 6/6] powerpc: Don't flush caches when adding memory Alastair D'Silva
  5 siblings, 1 reply; 28+ messages in thread
From: Alastair D'Silva @ 2019-09-03  5:23 UTC (permalink / raw)
  To: alastair
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Christophe Leroy, Greg Kroah-Hartman, Thomas Gleixner, Qian Cai,
	Nicholas Piggin, Allison Randal, Andrew Morton,
	David Hildenbrand, Mike Rapoport, linuxppc-dev, linux-kernel

From: Alastair D'Silva <alastair@d-silva.org>

The 'extern' keyword does not value-add for function prototypes.

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
---
 arch/powerpc/include/asm/cache.h      | 8 ++++----
 arch/powerpc/include/asm/cacheflush.h | 6 +++---
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/include/asm/cache.h b/arch/powerpc/include/asm/cache.h
index 91c808c6738b..54fffdf5a6ec 100644
--- a/arch/powerpc/include/asm/cache.h
+++ b/arch/powerpc/include/asm/cache.h
@@ -102,10 +102,10 @@ static inline u32 l1_icache_bytes(void)
 #define __read_mostly __attribute__((__section__(".data..read_mostly")))
 
 #ifdef CONFIG_PPC_BOOK3S_32
-extern long _get_L2CR(void);
-extern long _get_L3CR(void);
-extern void _set_L2CR(unsigned long);
-extern void _set_L3CR(unsigned long);
+long _get_L2CR(void);
+long _get_L3CR(void);
+void _set_L2CR(unsigned long val);
+void _set_L3CR(unsigned long val);
 #else
 #define _get_L2CR()	0L
 #define _get_L3CR()	0L
diff --git a/arch/powerpc/include/asm/cacheflush.h b/arch/powerpc/include/asm/cacheflush.h
index 4a1c9f0200e1..fa10dc19206c 100644
--- a/arch/powerpc/include/asm/cacheflush.h
+++ b/arch/powerpc/include/asm/cacheflush.h
@@ -38,15 +38,15 @@ static inline void flush_cache_vmap(unsigned long start, unsigned long end) { }
 #endif
 
 #define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1
-extern void flush_dcache_page(struct page *page);
+void flush_dcache_page(struct page *page);
 #define flush_dcache_mmap_lock(mapping)		do { } while (0)
 #define flush_dcache_mmap_unlock(mapping)	do { } while (0)
 
 void flush_icache_range(unsigned long start, unsigned long stop);
-extern void flush_icache_user_range(struct vm_area_struct *vma,
+void flush_icache_user_range(struct vm_area_struct *vma,
 				    struct page *page, unsigned long addr,
 				    int len);
-extern void flush_dcache_icache_page(struct page *page);
+void flush_dcache_icache_page(struct page *page);
 void __flush_dcache_icache(void *page);
 
 /**
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 6/6] powerpc: Don't flush caches when adding memory
  2019-09-03  5:23 [PATCH v2 0/6] powerpc: convert cache asm to C Alastair D'Silva
                   ` (4 preceding siblings ...)
  2019-09-03  5:23 ` [PATCH v2 5/6] powerpc: Remove 'extern' from func prototypes in cache headers Alastair D'Silva
@ 2019-09-03  5:24 ` Alastair D'Silva
  2019-09-03  6:23   ` Christophe Leroy
  5 siblings, 1 reply; 28+ messages in thread
From: Alastair D'Silva @ 2019-09-03  5:24 UTC (permalink / raw)
  To: alastair
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Christophe Leroy, Qian Cai, Greg Kroah-Hartman, Thomas Gleixner,
	Nicholas Piggin, Allison Randal, Andrew Morton,
	David Hildenbrand, Mike Rapoport, linuxppc-dev, linux-kernel

From: Alastair D'Silva <alastair@d-silva.org>

This operation takes a significant amount of time when hotplugging
large amounts of memory (~50 seconds with 890GB of persistent memory).

This was orignally in commit fb5924fddf9e
("powerpc/mm: Flush cache on memory hot(un)plug") to support memtrace,
but the flush on add is not needed as it is flushed on remove.

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
---
 arch/powerpc/mm/mem.c | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index 854aaea2c6ae..2a14b5b93e19 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -111,7 +111,6 @@ int __ref arch_add_memory(int nid, u64 start, u64 size,
 {
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
-	u64 i;
 	int rc;
 
 	resize_hpt_for_hotplug(memblock_phys_mem_size());
@@ -124,12 +123,6 @@ int __ref arch_add_memory(int nid, u64 start, u64 size,
 		return -EFAULT;
 	}
 
-	for (i = 0; i < size; i += FLUSH_CHUNK_SIZE) {
-		flush_dcache_range(start + i,
-				   min(start + size, start + i + FLUSH_CHUNK_SIZE));
-		cond_resched();
-	}
-
 	return __add_pages(nid, start_pfn, nr_pages, restrictions);
 }
 
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 3/6] powerpc: Convert flush_icache_range & friends to C
  2019-09-03  5:23 ` [PATCH v2 3/6] powerpc: Convert flush_icache_range & friends to C Alastair D'Silva
@ 2019-09-03  6:08   ` Christophe Leroy
  2019-09-03 11:25     ` Michael Ellerman
  2019-09-04  3:23     ` Alastair D'Silva
  2019-09-03 13:04   ` Segher Boessenkool
  1 sibling, 2 replies; 28+ messages in thread
From: Christophe Leroy @ 2019-09-03  6:08 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Greg Kroah-Hartman, Thomas Gleixner, Qian Cai, Nicholas Piggin,
	Allison Randal, Andrew Morton, David Hildenbrand, Mike Rapoport,
	linuxppc-dev, linux-kernel



Le 03/09/2019 à 07:23, Alastair D'Silva a écrit :
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> Similar to commit 22e9c88d486a
> ("powerpc/64: reuse PPC32 static inline flush_dcache_range()")
> this patch converts the following ASM symbols to C:
>      flush_icache_range()
>      __flush_dcache_icache()
>      __flush_dcache_icache_phys()
> 
> This was done as we discovered a long-standing bug where the length of the
> range was truncated due to using a 32 bit shift instead of a 64 bit one.
> 
> By converting these functions to C, it becomes easier to maintain.
> 
> flush_dcache_icache_phys() retains a critical assembler section as we must
> ensure there are no memory accesses while the data MMU is disabled
> (authored by Christophe Leroy). Since this has no external callers, it has
> also been made static, allowing the compiler to inline it within
> flush_dcache_icache_page().
> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
> ---
>   arch/powerpc/include/asm/cache.h      |  26 ++---
>   arch/powerpc/include/asm/cacheflush.h |  24 ++--
>   arch/powerpc/kernel/misc_32.S         | 117 --------------------
>   arch/powerpc/kernel/misc_64.S         | 102 -----------------
>   arch/powerpc/mm/mem.c                 | 152 +++++++++++++++++++++++++-
>   5 files changed, 173 insertions(+), 248 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/cache.h b/arch/powerpc/include/asm/cache.h
> index f852d5cd746c..91c808c6738b 100644
> --- a/arch/powerpc/include/asm/cache.h
> +++ b/arch/powerpc/include/asm/cache.h
> @@ -98,20 +98,7 @@ static inline u32 l1_icache_bytes(void)
>   #endif
>   #endif /* ! __ASSEMBLY__ */
>   
> -#if defined(__ASSEMBLY__)
> -/*
> - * For a snooping icache, we still need a dummy icbi to purge all the
> - * prefetched instructions from the ifetch buffers. We also need a sync
> - * before the icbi to order the the actual stores to memory that might
> - * have modified instructions with the icbi.
> - */
> -#define PURGE_PREFETCHED_INS	\
> -	sync;			\
> -	icbi	0,r3;		\
> -	sync;			\
> -	isync
> -
> -#else
> +#if !defined(__ASSEMBLY__)
>   #define __read_mostly __attribute__((__section__(".data..read_mostly")))
>   
>   #ifdef CONFIG_PPC_BOOK3S_32
> @@ -145,6 +132,17 @@ static inline void dcbst(void *addr)
>   {
>   	__asm__ __volatile__ ("dcbst %y0" : : "Z"(*(u8 *)addr) : "memory");
>   }
> +
> +static inline void icbi(void *addr)
> +{
> +	__asm__ __volatile__ ("icbi 0, %0" : : "r"(addr) : "memory");

I think "__asm__ __volatile__" is deprecated. Use "asm volatile" instead.

> +}
> +
> +static inline void iccci(void *addr)
> +{
> +	__asm__ __volatile__ ("iccci 0, %0" : : "r"(addr) : "memory");
> +}
> +

Same

>   #endif /* !__ASSEMBLY__ */
>   #endif /* __KERNEL__ */
>   #endif /* _ASM_POWERPC_CACHE_H */
> diff --git a/arch/powerpc/include/asm/cacheflush.h b/arch/powerpc/include/asm/cacheflush.h
> index ed57843ef452..4a1c9f0200e1 100644
> --- a/arch/powerpc/include/asm/cacheflush.h
> +++ b/arch/powerpc/include/asm/cacheflush.h
> @@ -42,24 +42,20 @@ extern void flush_dcache_page(struct page *page);
>   #define flush_dcache_mmap_lock(mapping)		do { } while (0)
>   #define flush_dcache_mmap_unlock(mapping)	do { } while (0)
>   
> -extern void flush_icache_range(unsigned long, unsigned long);
> +void flush_icache_range(unsigned long start, unsigned long stop);
>   extern void flush_icache_user_range(struct vm_area_struct *vma,
>   				    struct page *page, unsigned long addr,
>   				    int len);
> -extern void __flush_dcache_icache(void *page_va);
>   extern void flush_dcache_icache_page(struct page *page);
> -#if defined(CONFIG_PPC32) && !defined(CONFIG_BOOKE)
> -extern void __flush_dcache_icache_phys(unsigned long physaddr);
> -#else
> -static inline void __flush_dcache_icache_phys(unsigned long physaddr)
> -{
> -	BUG();
> -}
> -#endif
> -
> -/*
> - * Write any modified data cache blocks out to memory and invalidate them.
> - * Does not invalidate the corresponding instruction cache blocks.
> +void __flush_dcache_icache(void *page);
> +
> +/**
> + * flush_dcache_range(): Write any modified data cache blocks out to memory and
> + * invalidate them. Does not invalidate the corresponding instruction cache
> + * blocks.
> + *
> + * @start: the start address
> + * @stop: the stop address (exclusive)
>    */
>   static inline void flush_dcache_range(unsigned long start, unsigned long stop)
>   {
> diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
> index fe4bd321730e..12b95e6799d4 100644
> --- a/arch/powerpc/kernel/misc_32.S
> +++ b/arch/powerpc/kernel/misc_32.S
> @@ -318,123 +318,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_UNIFIED_ID_CACHE)
>   EXPORT_SYMBOL(flush_instruction_cache)
>   #endif /* CONFIG_PPC_8xx */
>   
> -/*
> - * Write any modified data cache blocks out to memory
> - * and invalidate the corresponding instruction cache blocks.
> - * This is a no-op on the 601.
> - *
> - * flush_icache_range(unsigned long start, unsigned long stop)
> - */
> -_GLOBAL(flush_icache_range)
> -BEGIN_FTR_SECTION
> -	PURGE_PREFETCHED_INS
> -	blr				/* for 601, do nothing */
> -END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
> -	rlwinm	r3,r3,0,0,31 - L1_CACHE_SHIFT
> -	subf	r4,r3,r4
> -	addi	r4,r4,L1_CACHE_BYTES - 1
> -	srwi.	r4,r4,L1_CACHE_SHIFT
> -	beqlr
> -	mtctr	r4
> -	mr	r6,r3
> -1:	dcbst	0,r3
> -	addi	r3,r3,L1_CACHE_BYTES
> -	bdnz	1b
> -	sync				/* wait for dcbst's to get to ram */
> -#ifndef CONFIG_44x
> -	mtctr	r4
> -2:	icbi	0,r6
> -	addi	r6,r6,L1_CACHE_BYTES
> -	bdnz	2b
> -#else
> -	/* Flash invalidate on 44x because we are passed kmapped addresses and
> -	   this doesn't work for userspace pages due to the virtually tagged
> -	   icache.  Sigh. */
> -	iccci	0, r0
> -#endif
> -	sync				/* additional sync needed on g4 */
> -	isync
> -	blr
> -_ASM_NOKPROBE_SYMBOL(flush_icache_range)
> -EXPORT_SYMBOL(flush_icache_range)
> -
> -/*
> - * Flush a particular page from the data cache to RAM.
> - * Note: this is necessary because the instruction cache does *not*
> - * snoop from the data cache.
> - * This is a no-op on the 601 which has a unified cache.
> - *
> - *	void __flush_dcache_icache(void *page)
> - */
> -_GLOBAL(__flush_dcache_icache)
> -BEGIN_FTR_SECTION
> -	PURGE_PREFETCHED_INS
> -	blr
> -END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
> -	rlwinm	r3,r3,0,0,31-PAGE_SHIFT		/* Get page base address */
> -	li	r4,PAGE_SIZE/L1_CACHE_BYTES	/* Number of lines in a page */
> -	mtctr	r4
> -	mr	r6,r3
> -0:	dcbst	0,r3				/* Write line to ram */
> -	addi	r3,r3,L1_CACHE_BYTES
> -	bdnz	0b
> -	sync
> -#ifdef CONFIG_44x
> -	/* We don't flush the icache on 44x. Those have a virtual icache
> -	 * and we don't have access to the virtual address here (it's
> -	 * not the page vaddr but where it's mapped in user space). The
> -	 * flushing of the icache on these is handled elsewhere, when
> -	 * a change in the address space occurs, before returning to
> -	 * user space
> -	 */
> -BEGIN_MMU_FTR_SECTION
> -	blr
> -END_MMU_FTR_SECTION_IFSET(MMU_FTR_TYPE_44x)
> -#endif /* CONFIG_44x */
> -	mtctr	r4
> -1:	icbi	0,r6
> -	addi	r6,r6,L1_CACHE_BYTES
> -	bdnz	1b
> -	sync
> -	isync
> -	blr
> -
> -#ifndef CONFIG_BOOKE
> -/*
> - * Flush a particular page from the data cache to RAM, identified
> - * by its physical address.  We turn off the MMU so we can just use
> - * the physical address (this may be a highmem page without a kernel
> - * mapping).
> - *
> - *	void __flush_dcache_icache_phys(unsigned long physaddr)
> - */
> -_GLOBAL(__flush_dcache_icache_phys)
> -BEGIN_FTR_SECTION
> -	PURGE_PREFETCHED_INS
> -	blr					/* for 601, do nothing */
> -END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
> -	mfmsr	r10
> -	rlwinm	r0,r10,0,28,26			/* clear DR */
> -	mtmsr	r0
> -	isync
> -	rlwinm	r3,r3,0,0,31-PAGE_SHIFT		/* Get page base address */
> -	li	r4,PAGE_SIZE/L1_CACHE_BYTES	/* Number of lines in a page */
> -	mtctr	r4
> -	mr	r6,r3
> -0:	dcbst	0,r3				/* Write line to ram */
> -	addi	r3,r3,L1_CACHE_BYTES
> -	bdnz	0b
> -	sync
> -	mtctr	r4
> -1:	icbi	0,r6
> -	addi	r6,r6,L1_CACHE_BYTES
> -	bdnz	1b
> -	sync
> -	mtmsr	r10				/* restore DR */
> -	isync
> -	blr
> -#endif /* CONFIG_BOOKE */
> -
>   /*
>    * Copy a whole page.  We use the dcbz instruction on the destination
>    * to reduce memory traffic (it eliminates the unnecessary reads of
> diff --git a/arch/powerpc/kernel/misc_64.S b/arch/powerpc/kernel/misc_64.S
> index 9bc0aa9aeb65..ff20c253f273 100644
> --- a/arch/powerpc/kernel/misc_64.S
> +++ b/arch/powerpc/kernel/misc_64.S
> @@ -49,108 +49,6 @@ _GLOBAL(call_do_irq)
>   	mtlr	r0
>   	blr
>   
> -	.section	".toc","aw"
> -PPC64_CACHES:
> -	.tc		ppc64_caches[TC],ppc64_caches
> -	.section	".text"
> -
> -/*
> - * Write any modified data cache blocks out to memory
> - * and invalidate the corresponding instruction cache blocks.
> - *
> - * flush_icache_range(unsigned long start, unsigned long stop)
> - *
> - *   flush all bytes from start through stop-1 inclusive
> - */
> -
> -_GLOBAL_TOC(flush_icache_range)
> -BEGIN_FTR_SECTION
> -	PURGE_PREFETCHED_INS
> -	blr
> -END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
> -/*
> - * Flush the data cache to memory
> - *
> - * Different systems have different cache line sizes
> - * and in some cases i-cache and d-cache line sizes differ from
> - * each other.
> - */
> - 	ld	r10,PPC64_CACHES@toc(r2)
> -	lwz	r7,DCACHEL1BLOCKSIZE(r10)/* Get cache block size */
> -	addi	r5,r7,-1
> -	andc	r6,r3,r5		/* round low to line bdy */
> -	subf	r8,r6,r4		/* compute length */
> -	add	r8,r8,r5		/* ensure we get enough */
> -	lwz	r9,DCACHEL1LOGBLOCKSIZE(r10)	/* Get log-2 of cache block size */
> -	srd.	r8,r8,r9		/* compute line count */
> -	beqlr				/* nothing to do? */
> -	mtctr	r8
> -1:	dcbst	0,r6
> -	add	r6,r6,r7
> -	bdnz	1b
> -	sync
> -
> -/* Now invalidate the instruction cache */
> -	
> -	lwz	r7,ICACHEL1BLOCKSIZE(r10)	/* Get Icache block size */
> -	addi	r5,r7,-1
> -	andc	r6,r3,r5		/* round low to line bdy */
> -	subf	r8,r6,r4		/* compute length */
> -	add	r8,r8,r5
> -	lwz	r9,ICACHEL1LOGBLOCKSIZE(r10)	/* Get log-2 of Icache block size */
> -	srd.	r8,r8,r9		/* compute line count */
> -	beqlr				/* nothing to do? */
> -	mtctr	r8
> -2:	icbi	0,r6
> -	add	r6,r6,r7
> -	bdnz	2b
> -	isync
> -	blr
> -_ASM_NOKPROBE_SYMBOL(flush_icache_range)
> -EXPORT_SYMBOL(flush_icache_range)
> -
> -/*
> - * Flush a particular page from the data cache to RAM.
> - * Note: this is necessary because the instruction cache does *not*
> - * snoop from the data cache.
> - *
> - *	void __flush_dcache_icache(void *page)
> - */
> -_GLOBAL(__flush_dcache_icache)
> -/*
> - * Flush the data cache to memory
> - *
> - * Different systems have different cache line sizes
> - */
> -
> -BEGIN_FTR_SECTION
> -	PURGE_PREFETCHED_INS
> -	blr
> -END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
> -
> -/* Flush the dcache */
> - 	ld	r7,PPC64_CACHES@toc(r2)
> -	clrrdi	r3,r3,PAGE_SHIFT           	    /* Page align */
> -	lwz	r4,DCACHEL1BLOCKSPERPAGE(r7)	/* Get # dcache blocks per page */
> -	lwz	r5,DCACHEL1BLOCKSIZE(r7)	/* Get dcache block size */
> -	mr	r6,r3
> -	mtctr	r4
> -0:	dcbst	0,r6
> -	add	r6,r6,r5
> -	bdnz	0b
> -	sync
> -
> -/* Now invalidate the icache */	
> -
> -	lwz	r4,ICACHEL1BLOCKSPERPAGE(r7)	/* Get # icache blocks per page */
> -	lwz	r5,ICACHEL1BLOCKSIZE(r7)	/* Get icache block size */
> -	mtctr	r4
> -1:	icbi	0,r3
> -	add	r3,r3,r5
> -	bdnz	1b
> -	isync
> -	blr
> -
>   _GLOBAL(__bswapdi2)
>   EXPORT_SYMBOL(__bswapdi2)
>   	srdi	r8,r3,32
> diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
> index 9191a66b3bc5..cd540123874d 100644
> --- a/arch/powerpc/mm/mem.c
> +++ b/arch/powerpc/mm/mem.c
> @@ -321,6 +321,105 @@ void free_initmem(void)
>   	free_initmem_default(POISON_FREE_INITMEM);
>   }
>   
> +/*
> + * Warning: This macro will perform an early return if the CPU has
> + * a coherent icache. The intent is is call this early in function,
> + * and handle the non-coherent icache variant afterwards.
> + *
> + * For a snooping icache, we still need a dummy icbi to purge all the
> + * prefetched instructions from the ifetch buffers. We also need a sync
> + * before the icbi to order the the actual stores to memory that might
> + * have modified instructions with the icbi.
> + */
> +#define flush_coherent_icache_or_return(addr) {			\
> +	if (cpu_has_feature(CPU_FTR_COHERENT_ICACHE)) {		\
> +		mb(); /* sync */				\
> +		icbi(addr);					\
> +		mb(); /* sync */				\
> +		isync();					\
> +		return;						\
> +	}							\
> +}

I hate this kind of awful macro which kills code readability.

Please to something like

static bool flush_coherent_icache_or_return(unsigned long addr)
{
	if (!cpu_has_feature(CPU_FTR_COHERENT_ICACHE))
		return false;

	mb(); /* sync */
	icbi(addr);
	mb(); /* sync */
	isync();
	return true;
}

then callers will do:

	if (flush_coherent_icache_or_return(addr))
		return;

> +
> +/**
> + * flush_icache_range: Write any modified data cache blocks out to memory
> + * and invalidate the corresponding blocks in the instruction cache
> + *
> + * Generic code will call this after writing memory, before executing from it.
> + *
> + * @start: the start address
> + * @stop: the stop address (exclusive)
> + */
> +void flush_icache_range(unsigned long start, unsigned long stop)
> +{
> +	unsigned long shift = l1_icache_shift();
> +	unsigned long bytes = l1_icache_bytes();
> +	char *addr = (char *)(start & ~(bytes - 1));
> +	unsigned long size = stop - (unsigned long)addr + (bytes - 1);
> +	unsigned long i;

Could probably move all this and the loop into a __flush_icache_range() 
helper.

> +
> +	flush_coherent_icache_or_return(addr);
> +	clean_dcache_range(start, stop);
> +
> +	if (IS_ENABLED(CONFIG_44x)) {
> +		/*
> +		 * Flash invalidate on 44x because we are passed kmapped
> +		 * addresses and this doesn't work for userspace pages due to
> +		 * the virtually tagged icache.
> +		 */
> +		iccci(addr);
> +	} else {
> +		/* Now invalidate the instruction cache */
> +		for (i = 0; i < size >> shift; i++, addr += bytes)
> +			icbi(addr);
> +	}
> +
> +	if (!IS_ENABLED(CONFIG_PPC64))
> +		mb(); /* additional sync needed on g4 */
> +	isync();
> +}
> +EXPORT_SYMBOL(flush_icache_range);
> +
> +#if !defined(CONFIG_PPC_8xx) & !defined(CONFIG_PPC64)
> +/**
> + * flush_dcache_icache_phys() - Flush a page by it's physical address
> + * @physaddr: the physical address of the page
> + */
> +static void flush_dcache_icache_phys(unsigned long physaddr)
> +{
> +	unsigned long bytes = l1_dcache_bytes();
> +	unsigned long nb = PAGE_SIZE / bytes;
> +	unsigned long addr = physaddr & PAGE_MASK;
> +	unsigned long msr, msr0;
> +	unsigned long loop1 = addr, loop2 = addr;
> +
> +	msr0 = mfmsr();
> +	msr = msr0 & ~MSR_DR;

Maybe we could get rid of msr and just use (msr0 & ~MSR_DR) in the asm 
inputs parameters.

> +	/*
> +	 * This must remain as ASM to prevent potential memory accesses
> +	 * while the data MMU is disabled
> +	 */
> +	asm volatile(
> +		"   mtctr %2;"
> +		"   mtmsr %3;"
> +		"   isync;"
> +		"0: dcbst   0, %0;"
> +		"   addi    %0, %0, %4;"
> +		"   bdnz    0b;"
> +		"   sync;"
> +		"   mtctr %2;"
> +		"1: icbi    0, %1;"
> +		"   addi    %1, %1, %4;"
> +		"   bdnz    1b;"
> +		"   sync;"
> +		"   mtmsr %5;"
> +		"   isync;"
> +		: "+r" (loop1), "+r" (loop2)
> +		: "r" (nb), "r" (msr), "i" (bytes), "r" (msr0)
> +		: "ctr", "memory");

Maybe also add "msr" in the clobbers.

> +}
> +#endif // !defined(CONFIG_PPC_8xx) & !defined(CONFIG_PPC64)
> +
>   /*
>    * This is called when a page has been modified by the kernel.
>    * It just marks the page as not i-cache clean.  We do the i-cache
> @@ -353,12 +452,63 @@ void flush_dcache_icache_page(struct page *page)
>   		__flush_dcache_icache(start);
>   		kunmap_atomic(start);
>   	} else {
> -		__flush_dcache_icache_phys(page_to_pfn(page) << PAGE_SHIFT);
> +		unsigned long addr = page_to_pfn(page) << PAGE_SHIFT;
> +
> +		flush_coherent_icache_or_return((void *)addr);
> +		flush_dcache_icache_phys(addr);
>   	}
>   #endif
>   }
>   EXPORT_SYMBOL(flush_dcache_icache_page);
>   
> +/**
> + * __flush_dcache_icache(): Flush a particular page from the data cache to RAM.
> + * Note: this is necessary because the instruction cache does *not*
> + * snoop from the data cache.
> + *
> + * @page: the address of the page to flush
> + */
> +void __flush_dcache_icache(void *page)
> +{
> +	char *addr = page;
> +	unsigned long lines = PAGE_SIZE >> l1_dcache_shift();
> +	unsigned long bytes = l1_dcache_bytes();
> +	unsigned long i;
> +
> +	flush_coherent_icache_or_return(addr);
> +
> +	/* Flush the data cache to memory */
> +	for (i = 0; i < lines; i++, addr += bytes)
> +		dcbst(addr);

Use clean_dcache_range(addr, addr + PAGE_SIZE);

> +
> +	mb(); /* sync */
> +
> +#ifdef CONFIG_44x

This ifdef is useless.
If CONFIG_44x is not enabled, MMU_FTR_TYPE_44x will not be in 
MMU_FTRS_POSSIBLE so cpu_has_feature() will return constant false at 
buildtime and GCC will drop it.

> +	/*
> +	 * We don't flush the icache on 44x. Those have a virtual icache and we
> +	 * don't have access to the virtual address here (it's not the page
> +	 * vaddr but where it's mapped in user space). The flushing of the
> +	 * icache on these is handled elsewhere, when a change in the address
> +	 * space occurs, before returning to user space.
> +	 */
> +
> +	if (cpu_has_feature(MMU_FTR_TYPE_44x))
> +		return;
> +#endif
> +
> +	lines = PAGE_SIZE >> l1_icache_shift();
> +	bytes = l1_icache_bytes();
> +	addr = page;
> +
> +	/* Now invalidate the instruction cache */
> +	for (i = 0; i < lines; i++, addr += bytes)
> +		icbi(addr);

Re-use the __flush_icache_range() helper suggested before.

> +
> +	mb(); /* sync */
> +	isync();
> +}
> +EXPORT_SYMBOL(__flush_dcache_icache);
> +
>   void clear_user_page(void *page, unsigned long vaddr, struct page *pg)
>   {
>   	clear_page(page);
> 

Christophe

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 4/6] powerpc: Chunk calls to flush_dcache_range in arch_*_memory
  2019-09-03  5:23 ` [PATCH v2 4/6] powerpc: Chunk calls to flush_dcache_range in arch_*_memory Alastair D'Silva
@ 2019-09-03  6:19   ` Christophe Leroy
  2019-09-03  6:25     ` Alastair D'Silva
  0 siblings, 1 reply; 28+ messages in thread
From: Christophe Leroy @ 2019-09-03  6:19 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Thomas Gleixner, Greg Kroah-Hartman, Qian Cai, Nicholas Piggin,
	Allison Randal, Andrew Morton, Michal Hocko, Mike Rapoport,
	David Hildenbrand, linuxppc-dev, linux-kernel



Le 03/09/2019 à 07:23, Alastair D'Silva a écrit :
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> When presented with large amounts of memory being hotplugged
> (in my test case, ~890GB), the call to flush_dcache_range takes
> a while (~50 seconds), triggering RCU stalls.
> 
> This patch breaks up the call into 1GB chunks, calling
> cond_resched() inbetween to allow the scheduler to run.
> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> ---
>   arch/powerpc/mm/mem.c | 18 ++++++++++++++++--
>   1 file changed, 16 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
> index cd540123874d..854aaea2c6ae 100644
> --- a/arch/powerpc/mm/mem.c
> +++ b/arch/powerpc/mm/mem.c
> @@ -104,11 +104,14 @@ int __weak remove_section_mapping(unsigned long start, unsigned long end)
>   	return -ENODEV;
>   }
>   
> +#define FLUSH_CHUNK_SIZE SZ_1G

Maybe the name is a bit long for a local define. See if we could reduce 
code line splits below by shortening this name.

> +
>   int __ref arch_add_memory(int nid, u64 start, u64 size,
>   			struct mhp_restrictions *restrictions)
>   {
>   	unsigned long start_pfn = start >> PAGE_SHIFT;
>   	unsigned long nr_pages = size >> PAGE_SHIFT;
> +	u64 i;
>   	int rc;
>   
>   	resize_hpt_for_hotplug(memblock_phys_mem_size());
> @@ -120,7 +123,12 @@ int __ref arch_add_memory(int nid, u64 start, u64 size,
>   			start, start + size, rc);
>   		return -EFAULT;
>   	}
> -	flush_dcache_range(start, start + size);
> +
> +	for (i = 0; i < size; i += FLUSH_CHUNK_SIZE) {
> +		flush_dcache_range(start + i,
> +				   min(start + size, start + i + FLUSH_CHUNK_SIZE));

My eyes don't like it.

What about
	for (; i < size; i += FLUSH_CHUNK_SIZE) {
		int len = min(size - i, FLUSH_CHUNK_SIZE);

		flush_dcache_range(start + i, start + i + len);
		cond_resched();
	}

or

	end = start + size;
	for (; start < end; start += FLUSH_CHUNK_SIZE, size -= FLUSH_CHUNK_SIZE) {
		int len = min(size, FLUSH_CHUNK_SIZE);

		flush_dcache_range(start, start + len);
		cond_resched();
	}

> +		cond_resched();
> +	}
>   
>   	return __add_pages(nid, start_pfn, nr_pages, restrictions);
>   }
> @@ -131,13 +139,19 @@ void __ref arch_remove_memory(int nid, u64 start, u64 size,
>   	unsigned long start_pfn = start >> PAGE_SHIFT;
>   	unsigned long nr_pages = size >> PAGE_SHIFT;
>   	struct page *page = pfn_to_page(start_pfn) + vmem_altmap_offset(altmap);
> +	u64 i;
>   	int ret;
>   
>   	__remove_pages(page_zone(page), start_pfn, nr_pages, altmap);
>   
>   	/* Remove htab bolted mappings for this section of memory */
>   	start = (unsigned long)__va(start);
> -	flush_dcache_range(start, start + size);
> +	for (i = 0; i < size; i += FLUSH_CHUNK_SIZE) {
> +		flush_dcache_range(start + i,
> +				   min(start + size, start + i + FLUSH_CHUNK_SIZE));
> +		cond_resched();
> +	}
> +

This piece of code looks pretty similar to the one before. Can we 
refactor into a small helper ?

>   	ret = remove_section_mapping(start, start + size);
>   	WARN_ON_ONCE(ret);
>   
> 

Christophe

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 5/6] powerpc: Remove 'extern' from func prototypes in cache headers
  2019-09-03  5:23 ` [PATCH v2 5/6] powerpc: Remove 'extern' from func prototypes in cache headers Alastair D'Silva
@ 2019-09-03  6:21   ` Christophe Leroy
  0 siblings, 0 replies; 28+ messages in thread
From: Christophe Leroy @ 2019-09-03  6:21 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Greg Kroah-Hartman, Thomas Gleixner, Qian Cai, Nicholas Piggin,
	Allison Randal, Andrew Morton, David Hildenbrand, Mike Rapoport,
	linuxppc-dev, linux-kernel



Le 03/09/2019 à 07:23, Alastair D'Silva a écrit :
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> The 'extern' keyword does not value-add for function prototypes.

Is it worth it ? That kind of change is nice cleanup but the drawback is 
to kill automatic backports of fixes.

> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> ---
>   arch/powerpc/include/asm/cache.h      | 8 ++++----
>   arch/powerpc/include/asm/cacheflush.h | 6 +++---
>   2 files changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/cache.h b/arch/powerpc/include/asm/cache.h
> index 91c808c6738b..54fffdf5a6ec 100644
> --- a/arch/powerpc/include/asm/cache.h
> +++ b/arch/powerpc/include/asm/cache.h
> @@ -102,10 +102,10 @@ static inline u32 l1_icache_bytes(void)
>   #define __read_mostly __attribute__((__section__(".data..read_mostly")))
>   
>   #ifdef CONFIG_PPC_BOOK3S_32
> -extern long _get_L2CR(void);
> -extern long _get_L3CR(void);
> -extern void _set_L2CR(unsigned long);
> -extern void _set_L3CR(unsigned long);
> +long _get_L2CR(void);
> +long _get_L3CR(void);
> +void _set_L2CR(unsigned long val);
> +void _set_L3CR(unsigned long val);
>   #else
>   #define _get_L2CR()	0L
>   #define _get_L3CR()	0L
> diff --git a/arch/powerpc/include/asm/cacheflush.h b/arch/powerpc/include/asm/cacheflush.h
> index 4a1c9f0200e1..fa10dc19206c 100644
> --- a/arch/powerpc/include/asm/cacheflush.h
> +++ b/arch/powerpc/include/asm/cacheflush.h
> @@ -38,15 +38,15 @@ static inline void flush_cache_vmap(unsigned long start, unsigned long end) { }
>   #endif
>   
>   #define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1
> -extern void flush_dcache_page(struct page *page);
> +void flush_dcache_page(struct page *page);
>   #define flush_dcache_mmap_lock(mapping)		do { } while (0)
>   #define flush_dcache_mmap_unlock(mapping)	do { } while (0)
>   
>   void flush_icache_range(unsigned long start, unsigned long stop);
> -extern void flush_icache_user_range(struct vm_area_struct *vma,
> +void flush_icache_user_range(struct vm_area_struct *vma,
>   				    struct page *page, unsigned long addr,
>   				    int len);

You have to re-align the other parameters to the ( of the first line.

> -extern void flush_dcache_icache_page(struct page *page);
> +void flush_dcache_icache_page(struct page *page);
>   void __flush_dcache_icache(void *page);
>   
>   /**
> 

Christophe

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 6/6] powerpc: Don't flush caches when adding memory
  2019-09-03  5:24 ` [PATCH v2 6/6] powerpc: Don't flush caches when adding memory Alastair D'Silva
@ 2019-09-03  6:23   ` Christophe Leroy
  2019-09-03  6:27     ` Alastair D'Silva
  0 siblings, 1 reply; 28+ messages in thread
From: Christophe Leroy @ 2019-09-03  6:23 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Qian Cai, Greg Kroah-Hartman, Thomas Gleixner, Nicholas Piggin,
	Allison Randal, Andrew Morton, David Hildenbrand, Mike Rapoport,
	linuxppc-dev, linux-kernel



Le 03/09/2019 à 07:24, Alastair D'Silva a écrit :
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> This operation takes a significant amount of time when hotplugging
> large amounts of memory (~50 seconds with 890GB of persistent memory).
> 
> This was orignally in commit fb5924fddf9e
> ("powerpc/mm: Flush cache on memory hot(un)plug") to support memtrace,
> but the flush on add is not needed as it is flushed on remove.
> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> ---
>   arch/powerpc/mm/mem.c | 7 -------
>   1 file changed, 7 deletions(-)
> 
> diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
> index 854aaea2c6ae..2a14b5b93e19 100644
> --- a/arch/powerpc/mm/mem.c
> +++ b/arch/powerpc/mm/mem.c
> @@ -111,7 +111,6 @@ int __ref arch_add_memory(int nid, u64 start, u64 size,
>   {
>   	unsigned long start_pfn = start >> PAGE_SHIFT;
>   	unsigned long nr_pages = size >> PAGE_SHIFT;
> -	u64 i;
>   	int rc;
>   
>   	resize_hpt_for_hotplug(memblock_phys_mem_size());
> @@ -124,12 +123,6 @@ int __ref arch_add_memory(int nid, u64 start, u64 size,
>   		return -EFAULT;
>   	}
>   
> -	for (i = 0; i < size; i += FLUSH_CHUNK_SIZE) {
> -		flush_dcache_range(start + i,
> -				   min(start + size, start + i + FLUSH_CHUNK_SIZE));
> -		cond_resched();
> -	}
> -

So you are removing the code you added in patch 4. Why not move this one 
before patch 4 ?

>   	return __add_pages(nid, start_pfn, nr_pages, restrictions);
>   }
>   
> 

Christophe

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 4/6] powerpc: Chunk calls to flush_dcache_range in arch_*_memory
  2019-09-03  6:19   ` Christophe Leroy
@ 2019-09-03  6:25     ` Alastair D'Silva
  2019-09-03  6:51       ` Christophe Leroy
  0 siblings, 1 reply; 28+ messages in thread
From: Alastair D'Silva @ 2019-09-03  6:25 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Thomas Gleixner, Greg Kroah-Hartman, Qian Cai, Nicholas Piggin,
	Allison Randal, Andrew Morton, Michal Hocko, Mike Rapoport,
	David Hildenbrand, linuxppc-dev, linux-kernel

On Tue, 2019-09-03 at 08:19 +0200, Christophe Leroy wrote:
> 
> Le 03/09/2019 à 07:23, Alastair D'Silva a écrit :
> > From: Alastair D'Silva <alastair@d-silva.org>
> > 
> > When presented with large amounts of memory being hotplugged
> > (in my test case, ~890GB), the call to flush_dcache_range takes
> > a while (~50 seconds), triggering RCU stalls.
> > 
> > This patch breaks up the call into 1GB chunks, calling
> > cond_resched() inbetween to allow the scheduler to run.
> > 
> > Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> > ---
> >   arch/powerpc/mm/mem.c | 18 ++++++++++++++++--
> >   1 file changed, 16 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
> > index cd540123874d..854aaea2c6ae 100644
> > --- a/arch/powerpc/mm/mem.c
> > +++ b/arch/powerpc/mm/mem.c
> > @@ -104,11 +104,14 @@ int __weak remove_section_mapping(unsigned
> > long start, unsigned long end)
> >   	return -ENODEV;
> >   }
> >   
> > +#define FLUSH_CHUNK_SIZE SZ_1G
> 
> Maybe the name is a bit long for a local define. See if we could
> reduce 
> code line splits below by shortening this name.
> 
> > +
> >   int __ref arch_add_memory(int nid, u64 start, u64 size,
> >   			struct mhp_restrictions *restrictions)
> >   {
> >   	unsigned long start_pfn = start >> PAGE_SHIFT;
> >   	unsigned long nr_pages = size >> PAGE_SHIFT;
> > +	u64 i;
> >   	int rc;
> >   
> >   	resize_hpt_for_hotplug(memblock_phys_mem_size());
> > @@ -120,7 +123,12 @@ int __ref arch_add_memory(int nid, u64 start,
> > u64 size,
> >   			start, start + size, rc);
> >   		return -EFAULT;
> >   	}
> > -	flush_dcache_range(start, start + size);
> > +
> > +	for (i = 0; i < size; i += FLUSH_CHUNK_SIZE) {
> > +		flush_dcache_range(start + i,
> > +				   min(start + size, start + i +
> > FLUSH_CHUNK_SIZE));
> 
> My eyes don't like it.
> 
> What about
> 	for (; i < size; i += FLUSH_CHUNK_SIZE) {
> 		int len = min(size - i, FLUSH_CHUNK_SIZE);
> 
> 		flush_dcache_range(start + i, start + i + len);
> 		cond_resched();
> 	}
> 
> or
> 
> 	end = start + size;
> 	for (; start < end; start += FLUSH_CHUNK_SIZE, size -=
> FLUSH_CHUNK_SIZE) {
> 		int len = min(size, FLUSH_CHUNK_SIZE);
> 
> 		flush_dcache_range(start, start + len);
> 		cond_resched();
> 	}
> 
> > +		cond_resched();
> > +	}
> >   
> >   	return __add_pages(nid, start_pfn, nr_pages, restrictions);
> >   }
> > @@ -131,13 +139,19 @@ void __ref arch_remove_memory(int nid, u64
> > start, u64 size,
> >   	unsigned long start_pfn = start >> PAGE_SHIFT;
> >   	unsigned long nr_pages = size >> PAGE_SHIFT;
> >   	struct page *page = pfn_to_page(start_pfn) +
> > vmem_altmap_offset(altmap);
> > +	u64 i;
> >   	int ret;
> >   
> >   	__remove_pages(page_zone(page), start_pfn, nr_pages, altmap);
> >   
> >   	/* Remove htab bolted mappings for this section of memory */
> >   	start = (unsigned long)__va(start);
> > -	flush_dcache_range(start, start + size);
> > +	for (i = 0; i < size; i += FLUSH_CHUNK_SIZE) {
> > +		flush_dcache_range(start + i,
> > +				   min(start + size, start + i +
> > FLUSH_CHUNK_SIZE));
> > +		cond_resched();
> > +	}
> > +
> 
> This piece of code looks pretty similar to the one before. Can we 
> refactor into a small helper ?
> 

Not much point, it's removed in a subsequent patch.

-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 6/6] powerpc: Don't flush caches when adding memory
  2019-09-03  6:23   ` Christophe Leroy
@ 2019-09-03  6:27     ` Alastair D'Silva
  0 siblings, 0 replies; 28+ messages in thread
From: Alastair D'Silva @ 2019-09-03  6:27 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Qian Cai, Greg Kroah-Hartman, Thomas Gleixner, Nicholas Piggin,
	Allison Randal, Andrew Morton, David Hildenbrand, Mike Rapoport,
	linuxppc-dev, linux-kernel

On Tue, 2019-09-03 at 08:23 +0200, Christophe Leroy wrote:
> 
> Le 03/09/2019 à 07:24, Alastair D'Silva a écrit :
> > From: Alastair D'Silva <alastair@d-silva.org>
> > 
> > This operation takes a significant amount of time when hotplugging
> > large amounts of memory (~50 seconds with 890GB of persistent
> > memory).
> > 
> > This was orignally in commit fb5924fddf9e
> > ("powerpc/mm: Flush cache on memory hot(un)plug") to support
> > memtrace,
> > but the flush on add is not needed as it is flushed on remove.
> > 
> > Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> > ---
> >   arch/powerpc/mm/mem.c | 7 -------
> >   1 file changed, 7 deletions(-)
> > 
> > diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
> > index 854aaea2c6ae..2a14b5b93e19 100644
> > --- a/arch/powerpc/mm/mem.c
> > +++ b/arch/powerpc/mm/mem.c
> > @@ -111,7 +111,6 @@ int __ref arch_add_memory(int nid, u64 start,
> > u64 size,
> >   {
> >   	unsigned long start_pfn = start >> PAGE_SHIFT;
> >   	unsigned long nr_pages = size >> PAGE_SHIFT;
> > -	u64 i;
> >   	int rc;
> >   
> >   	resize_hpt_for_hotplug(memblock_phys_mem_size());
> > @@ -124,12 +123,6 @@ int __ref arch_add_memory(int nid, u64 start,
> > u64 size,
> >   		return -EFAULT;
> >   	}
> >   
> > -	for (i = 0; i < size; i += FLUSH_CHUNK_SIZE) {
> > -		flush_dcache_range(start + i,
> > -				   min(start + size, start + i +
> > FLUSH_CHUNK_SIZE));
> > -		cond_resched();
> > -	}
> > -
> 
> So you are removing the code you added in patch 4. Why not move this
> one 
> before patch 4 ?
> 

I put them in this order so that if someone did want the flushes in
arch_add_memory, they could drop the later patch, but not trigger RCU
stalls.

-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 4/6] powerpc: Chunk calls to flush_dcache_range in arch_*_memory
  2019-09-03  6:25     ` Alastair D'Silva
@ 2019-09-03  6:51       ` Christophe Leroy
  2019-09-04  4:11         ` Alastair D'Silva
  0 siblings, 1 reply; 28+ messages in thread
From: Christophe Leroy @ 2019-09-03  6:51 UTC (permalink / raw)
  To: Alastair D'Silva
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Thomas Gleixner, Greg Kroah-Hartman, Qian Cai, Nicholas Piggin,
	Allison Randal, Andrew Morton, Michal Hocko, Mike Rapoport,
	David Hildenbrand, linuxppc-dev, linux-kernel



Le 03/09/2019 à 08:25, Alastair D'Silva a écrit :
> On Tue, 2019-09-03 at 08:19 +0200, Christophe Leroy wrote:
>>
>> Le 03/09/2019 à 07:23, Alastair D'Silva a écrit :
>>> From: Alastair D'Silva <alastair@d-silva.org>
>>>
>>> When presented with large amounts of memory being hotplugged
>>> (in my test case, ~890GB), the call to flush_dcache_range takes
>>> a while (~50 seconds), triggering RCU stalls.
>>>
>>> This patch breaks up the call into 1GB chunks, calling
>>> cond_resched() inbetween to allow the scheduler to run.
>>>
>>> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
>>> ---
>>>    arch/powerpc/mm/mem.c | 18 ++++++++++++++++--
>>>    1 file changed, 16 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
>>> index cd540123874d..854aaea2c6ae 100644
>>> --- a/arch/powerpc/mm/mem.c
>>> +++ b/arch/powerpc/mm/mem.c
>>> @@ -104,11 +104,14 @@ int __weak remove_section_mapping(unsigned
>>> long start, unsigned long end)
>>>    	return -ENODEV;
>>>    }
>>>    
>>> +#define FLUSH_CHUNK_SIZE SZ_1G
>>
>> Maybe the name is a bit long for a local define. See if we could
>> reduce
>> code line splits below by shortening this name.
>>
>>> +
>>>    int __ref arch_add_memory(int nid, u64 start, u64 size,
>>>    			struct mhp_restrictions *restrictions)
>>>    {
>>>    	unsigned long start_pfn = start >> PAGE_SHIFT;
>>>    	unsigned long nr_pages = size >> PAGE_SHIFT;
>>> +	u64 i;
>>>    	int rc;
>>>    
>>>    	resize_hpt_for_hotplug(memblock_phys_mem_size());
>>> @@ -120,7 +123,12 @@ int __ref arch_add_memory(int nid, u64 start,
>>> u64 size,
>>>    			start, start + size, rc);
>>>    		return -EFAULT;
>>>    	}
>>> -	flush_dcache_range(start, start + size);
>>> +
>>> +	for (i = 0; i < size; i += FLUSH_CHUNK_SIZE) {
>>> +		flush_dcache_range(start + i,
>>> +				   min(start + size, start + i +
>>> FLUSH_CHUNK_SIZE));
>>
>> My eyes don't like it.
>>
>> What about
>> 	for (; i < size; i += FLUSH_CHUNK_SIZE) {
>> 		int len = min(size - i, FLUSH_CHUNK_SIZE);
>>
>> 		flush_dcache_range(start + i, start + i + len);
>> 		cond_resched();
>> 	}
>>
>> or
>>
>> 	end = start + size;
>> 	for (; start < end; start += FLUSH_CHUNK_SIZE, size -=
>> FLUSH_CHUNK_SIZE) {
>> 		int len = min(size, FLUSH_CHUNK_SIZE);
>>
>> 		flush_dcache_range(start, start + len);
>> 		cond_resched();
>> 	}
>>
>>> +		cond_resched();
>>> +	}
>>>    
>>>    	return __add_pages(nid, start_pfn, nr_pages, restrictions);
>>>    }
>>> @@ -131,13 +139,19 @@ void __ref arch_remove_memory(int nid, u64
>>> start, u64 size,
>>>    	unsigned long start_pfn = start >> PAGE_SHIFT;
>>>    	unsigned long nr_pages = size >> PAGE_SHIFT;
>>>    	struct page *page = pfn_to_page(start_pfn) +
>>> vmem_altmap_offset(altmap);
>>> +	u64 i;
>>>    	int ret;
>>>    
>>>    	__remove_pages(page_zone(page), start_pfn, nr_pages, altmap);
>>>    
>>>    	/* Remove htab bolted mappings for this section of memory */
>>>    	start = (unsigned long)__va(start);
>>> -	flush_dcache_range(start, start + size);
>>> +	for (i = 0; i < size; i += FLUSH_CHUNK_SIZE) {
>>> +		flush_dcache_range(start + i,
>>> +				   min(start + size, start + i +
>>> FLUSH_CHUNK_SIZE));
>>> +		cond_resched();
>>> +	}
>>> +
>>
>> This piece of code looks pretty similar to the one before. Can we
>> refactor into a small helper ?
>>
> 
> Not much point, it's removed in a subsequent patch.
> 

But you tell me that you leave to people the opportunity to not apply 
that subsequent patch, and that's the reason you didn't put that patch 
before this one. In that case adding a helper is worth it.

Christophe

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 3/6] powerpc: Convert flush_icache_range & friends to C
  2019-09-03  6:08   ` Christophe Leroy
@ 2019-09-03 11:25     ` Michael Ellerman
  2019-09-04  3:23     ` Alastair D'Silva
  1 sibling, 0 replies; 28+ messages in thread
From: Michael Ellerman @ 2019-09-03 11:25 UTC (permalink / raw)
  To: Christophe Leroy, Alastair D'Silva, alastair
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Greg Kroah-Hartman,
	Thomas Gleixner, Qian Cai, Nicholas Piggin, Allison Randal,
	Andrew Morton, David Hildenbrand, Mike Rapoport, linuxppc-dev,
	linux-kernel

Christophe Leroy <christophe.leroy@c-s.fr> writes:
> Le 03/09/2019 à 07:23, Alastair D'Silva a écrit :
>> From: Alastair D'Silva <alastair@d-silva.org>
>> 
>> Similar to commit 22e9c88d486a
>> ("powerpc/64: reuse PPC32 static inline flush_dcache_range()")
>> this patch converts the following ASM symbols to C:
>>      flush_icache_range()
>>      __flush_dcache_icache()
>>      __flush_dcache_icache_phys()
>> 
>> This was done as we discovered a long-standing bug where the length of the
>> range was truncated due to using a 32 bit shift instead of a 64 bit one.
>> 
>> By converting these functions to C, it becomes easier to maintain.
>> 
>> flush_dcache_icache_phys() retains a critical assembler section as we must
>> ensure there are no memory accesses while the data MMU is disabled
>> (authored by Christophe Leroy). Since this has no external callers, it has
>> also been made static, allowing the compiler to inline it within
>> flush_dcache_icache_page().
>> 
>> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
>> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
>> ---
>>   arch/powerpc/include/asm/cache.h      |  26 ++---
>>   arch/powerpc/include/asm/cacheflush.h |  24 ++--
>>   arch/powerpc/kernel/misc_32.S         | 117 --------------------
>>   arch/powerpc/kernel/misc_64.S         | 102 -----------------
>>   arch/powerpc/mm/mem.c                 | 152 +++++++++++++++++++++++++-
>>   5 files changed, 173 insertions(+), 248 deletions(-)
>> 
>> diff --git a/arch/powerpc/include/asm/cache.h b/arch/powerpc/include/asm/cache.h
>> index f852d5cd746c..91c808c6738b 100644
>> --- a/arch/powerpc/include/asm/cache.h
>> +++ b/arch/powerpc/include/asm/cache.h
>> @@ -98,20 +98,7 @@ static inline u32 l1_icache_bytes(void)
>>   #endif
>>   #endif /* ! __ASSEMBLY__ */
>>   
>> -#if defined(__ASSEMBLY__)
>> -/*
>> - * For a snooping icache, we still need a dummy icbi to purge all the
>> - * prefetched instructions from the ifetch buffers. We also need a sync
>> - * before the icbi to order the the actual stores to memory that might
>> - * have modified instructions with the icbi.
>> - */
>> -#define PURGE_PREFETCHED_INS	\
>> -	sync;			\
>> -	icbi	0,r3;		\
>> -	sync;			\
>> -	isync
>> -
>> -#else
>> +#if !defined(__ASSEMBLY__)
>>   #define __read_mostly __attribute__((__section__(".data..read_mostly")))
>>   
>>   #ifdef CONFIG_PPC_BOOK3S_32
>> @@ -145,6 +132,17 @@ static inline void dcbst(void *addr)
>>   {
>>   	__asm__ __volatile__ ("dcbst %y0" : : "Z"(*(u8 *)addr) : "memory");
>>   }
>> +
>> +static inline void icbi(void *addr)
>> +{
>> +	__asm__ __volatile__ ("icbi 0, %0" : : "r"(addr) : "memory");
>
> I think "__asm__ __volatile__" is deprecated. Use "asm volatile" instead.

Yes please.

>> diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
>> index 9191a66b3bc5..cd540123874d 100644
>> --- a/arch/powerpc/mm/mem.c
>> +++ b/arch/powerpc/mm/mem.c
>> @@ -321,6 +321,105 @@ void free_initmem(void)
>>   	free_initmem_default(POISON_FREE_INITMEM);
>>   }
>>   
>> +/*
>> + * Warning: This macro will perform an early return if the CPU has
>> + * a coherent icache. The intent is is call this early in function,
>> + * and handle the non-coherent icache variant afterwards.
>> + *
>> + * For a snooping icache, we still need a dummy icbi to purge all the
>> + * prefetched instructions from the ifetch buffers. We also need a sync
>> + * before the icbi to order the the actual stores to memory that might
>> + * have modified instructions with the icbi.
>> + */
>> +#define flush_coherent_icache_or_return(addr) {			\
>> +	if (cpu_has_feature(CPU_FTR_COHERENT_ICACHE)) {		\
>> +		mb(); /* sync */				\
>> +		icbi(addr);					\
>> +		mb(); /* sync */				\
>> +		isync();					\
>> +		return;						\
>> +	}							\
>> +}
>
> I hate this kind of awful macro which kills code readability.

Yes I agree.

> Please to something like
>
> static bool flush_coherent_icache_or_return(unsigned long addr)
> {
> 	if (!cpu_has_feature(CPU_FTR_COHERENT_ICACHE))
> 		return false;
>
> 	mb(); /* sync */
> 	icbi(addr);
> 	mb(); /* sync */
> 	isync();
> 	return true;
> }
>
> then callers will do:
>
> 	if (flush_coherent_icache_or_return(addr))
> 		return;

I don't think it needs the "_or_return" in the name.

eg, it can just be:

 	if (flush_coherent_icache(addr))
		return;


Which reads fine I think, ie. flush the coherent icache, and if that
succeeds return, else continue.

cheers

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 3/6] powerpc: Convert flush_icache_range & friends to C
  2019-09-03  5:23 ` [PATCH v2 3/6] powerpc: Convert flush_icache_range & friends to C Alastair D'Silva
  2019-09-03  6:08   ` Christophe Leroy
@ 2019-09-03 13:04   ` Segher Boessenkool
  2019-09-03 14:28     ` Christophe Leroy
  1 sibling, 1 reply; 28+ messages in thread
From: Segher Boessenkool @ 2019-09-03 13:04 UTC (permalink / raw)
  To: Alastair D'Silva
  Cc: alastair, David Hildenbrand, linux-kernel, Nicholas Piggin,
	Mike Rapoport, Paul Mackerras, Greg Kroah-Hartman, Qian Cai,
	Thomas Gleixner, linuxppc-dev, Andrew Morton, Allison Randal

Hi!

On Tue, Sep 03, 2019 at 03:23:57PM +1000, Alastair D'Silva wrote:
> diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c

> +#if !defined(CONFIG_PPC_8xx) & !defined(CONFIG_PPC64)

Please write that as &&?  That is more usual, and thus, easier to read.

> +static void flush_dcache_icache_phys(unsigned long physaddr)

> +	asm volatile(
> +		"   mtctr %2;"
> +		"   mtmsr %3;"
> +		"   isync;"
> +		"0: dcbst   0, %0;"
> +		"   addi    %0, %0, %4;"
> +		"   bdnz    0b;"
> +		"   sync;"
> +		"   mtctr %2;"
> +		"1: icbi    0, %1;"
> +		"   addi    %1, %1, %4;"
> +		"   bdnz    1b;"
> +		"   sync;"
> +		"   mtmsr %5;"
> +		"   isync;"
> +		: "+r" (loop1), "+r" (loop2)
> +		: "r" (nb), "r" (msr), "i" (bytes), "r" (msr0)
> +		: "ctr", "memory");

This outputs as one huge assembler statement, all on one line.  That's
going to be fun to read or debug.

loop1 and/or loop2 can be assigned the same register as msr0 or nb.  They
need to be made earlyclobbers.  (msr is fine, all of its reads are before
any writes to loop1 or loop2; and bytes is fine, it's not a register).


Segher

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 3/6] powerpc: Convert flush_icache_range & friends to C
  2019-09-03 13:04   ` Segher Boessenkool
@ 2019-09-03 14:28     ` Christophe Leroy
  2019-09-03 16:04       ` Segher Boessenkool
  0 siblings, 1 reply; 28+ messages in thread
From: Christophe Leroy @ 2019-09-03 14:28 UTC (permalink / raw)
  To: Segher Boessenkool, Alastair D'Silva
  Cc: David Hildenbrand, Greg Kroah-Hartman, linux-kernel,
	Nicholas Piggin, Mike Rapoport, Paul Mackerras, alastair,
	Qian Cai, Thomas Gleixner, linuxppc-dev, Andrew Morton,
	Allison Randal



Le 03/09/2019 à 15:04, Segher Boessenkool a écrit :
> Hi!
> 
> On Tue, Sep 03, 2019 at 03:23:57PM +1000, Alastair D'Silva wrote:
>> diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
> 
>> +#if !defined(CONFIG_PPC_8xx) & !defined(CONFIG_PPC64)
> 
> Please write that as &&?  That is more usual, and thus, easier to read.
> 
>> +static void flush_dcache_icache_phys(unsigned long physaddr)
> 
>> +	asm volatile(
>> +		"   mtctr %2;"
>> +		"   mtmsr %3;"
>> +		"   isync;"
>> +		"0: dcbst   0, %0;"
>> +		"   addi    %0, %0, %4;"
>> +		"   bdnz    0b;"
>> +		"   sync;"
>> +		"   mtctr %2;"
>> +		"1: icbi    0, %1;"
>> +		"   addi    %1, %1, %4;"
>> +		"   bdnz    1b;"
>> +		"   sync;"
>> +		"   mtmsr %5;"
>> +		"   isync;"
>> +		: "+r" (loop1), "+r" (loop2)
>> +		: "r" (nb), "r" (msr), "i" (bytes), "r" (msr0)
>> +		: "ctr", "memory");
> 
> This outputs as one huge assembler statement, all on one line.  That's
> going to be fun to read or debug.

Do you mean \n has to be added after the ; ?

> 
> loop1 and/or loop2 can be assigned the same register as msr0 or nb.  They
> need to be made earlyclobbers.  (msr is fine, all of its reads are before
> any writes to loop1 or loop2; and bytes is fine, it's not a register).

Can you explicit please ? Doesn't '+r' means that they are input and 
output at the same time ?

"to be made earlyclobbers", what does this means exactly ? How to do that ?

Christophe

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 3/6] powerpc: Convert flush_icache_range & friends to C
  2019-09-03 14:28     ` Christophe Leroy
@ 2019-09-03 16:04       ` Segher Boessenkool
  2019-09-03 17:05         ` Christophe Leroy
  2019-09-04  3:36         ` Alastair D'Silva
  0 siblings, 2 replies; 28+ messages in thread
From: Segher Boessenkool @ 2019-09-03 16:04 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Alastair D'Silva, David Hildenbrand, Greg Kroah-Hartman,
	linux-kernel, Nicholas Piggin, Mike Rapoport, Paul Mackerras,
	alastair, Qian Cai, Thomas Gleixner, linuxppc-dev, Andrew Morton,
	Allison Randal

On Tue, Sep 03, 2019 at 04:28:09PM +0200, Christophe Leroy wrote:
> Le 03/09/2019 à 15:04, Segher Boessenkool a écrit :
> >On Tue, Sep 03, 2019 at 03:23:57PM +1000, Alastair D'Silva wrote:
> >>+	asm volatile(
> >>+		"   mtctr %2;"
> >>+		"   mtmsr %3;"
> >>+		"   isync;"
> >>+		"0: dcbst   0, %0;"
> >>+		"   addi    %0, %0, %4;"
> >>+		"   bdnz    0b;"
> >>+		"   sync;"
> >>+		"   mtctr %2;"
> >>+		"1: icbi    0, %1;"
> >>+		"   addi    %1, %1, %4;"
> >>+		"   bdnz    1b;"
> >>+		"   sync;"
> >>+		"   mtmsr %5;"
> >>+		"   isync;"
> >>+		: "+r" (loop1), "+r" (loop2)
> >>+		: "r" (nb), "r" (msr), "i" (bytes), "r" (msr0)
> >>+		: "ctr", "memory");
> >
> >This outputs as one huge assembler statement, all on one line.  That's
> >going to be fun to read or debug.
> 
> Do you mean \n has to be added after the ; ?

Something like that.  There is no really satisfying way for doing huge
inline asm, and maybe that is a good thing ;-)

Often people write \n\t at the end of each line of inline asm.  This works
pretty well (but then there are labels, oh joy).

> >loop1 and/or loop2 can be assigned the same register as msr0 or nb.  They
> >need to be made earlyclobbers.  (msr is fine, all of its reads are before
> >any writes to loop1 or loop2; and bytes is fine, it's not a register).
> 
> Can you explicit please ? Doesn't '+r' means that they are input and 
> output at the same time ?

That is what + means, yes -- that this output is an input as well.  It is
the same to write

  asm("mov %1,%0 ; mov %0,42" : "+r"(x), "=r"(y));
or to write
  asm("mov %1,%0 ; mov %0,42" : "=r"(x), "=r"(y) : "0"(x));

(So not "at the same time" as in "in the same machine instruction", but
more loosely, as in "in the same inline asm statement").

> "to be made earlyclobbers", what does this means exactly ? How to do that ?

You write &, like "+&r" in this case.  It means the machine code writes
to this register before it has consumed all asm inputs (remember, GCC
does not understand (or even parse!) the assembler string).

So just

		: "+&r" (loop1), "+&r" (loop2)

will do.  (Why are they separate though?  It could just be one loop var).


Segher

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 3/6] powerpc: Convert flush_icache_range & friends to C
  2019-09-03 16:04       ` Segher Boessenkool
@ 2019-09-03 17:05         ` Christophe Leroy
  2019-09-03 18:31           ` Segher Boessenkool
  2019-09-04  3:36         ` Alastair D'Silva
  1 sibling, 1 reply; 28+ messages in thread
From: Christophe Leroy @ 2019-09-03 17:05 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Alastair D'Silva, David Hildenbrand, Greg Kroah-Hartman,
	linux-kernel, Nicholas Piggin, Mike Rapoport, Paul Mackerras,
	alastair, Qian Cai, Thomas Gleixner, linuxppc-dev, Andrew Morton,
	Allison Randal



Le 03/09/2019 à 18:04, Segher Boessenkool a écrit :
> On Tue, Sep 03, 2019 at 04:28:09PM +0200, Christophe Leroy wrote:
>> Le 03/09/2019 à 15:04, Segher Boessenkool a écrit :
>>> On Tue, Sep 03, 2019 at 03:23:57PM +1000, Alastair D'Silva wrote:
>>>> +	asm volatile(
>>>> +		"   mtctr %2;"
>>>> +		"   mtmsr %3;"
>>>> +		"   isync;"
>>>> +		"0: dcbst   0, %0;"
>>>> +		"   addi    %0, %0, %4;"
>>>> +		"   bdnz    0b;"
>>>> +		"   sync;"
>>>> +		"   mtctr %2;"
>>>> +		"1: icbi    0, %1;"
>>>> +		"   addi    %1, %1, %4;"
>>>> +		"   bdnz    1b;"
>>>> +		"   sync;"
>>>> +		"   mtmsr %5;"
>>>> +		"   isync;"
>>>> +		: "+r" (loop1), "+r" (loop2)
>>>> +		: "r" (nb), "r" (msr), "i" (bytes), "r" (msr0)
>>>> +		: "ctr", "memory");
>>>
>>> This outputs as one huge assembler statement, all on one line.  That's
>>> going to be fun to read or debug.
>>
>> Do you mean \n has to be added after the ; ?
> 
> Something like that.  There is no really satisfying way for doing huge
> inline asm, and maybe that is a good thing ;-)
> 
> Often people write \n\t at the end of each line of inline asm.  This works
> pretty well (but then there are labels, oh joy).
> 
>>> loop1 and/or loop2 can be assigned the same register as msr0 or nb.  They
>>> need to be made earlyclobbers.  (msr is fine, all of its reads are before
>>> any writes to loop1 or loop2; and bytes is fine, it's not a register).
>>
>> Can you explicit please ? Doesn't '+r' means that they are input and
>> output at the same time ?
> 
> That is what + means, yes -- that this output is an input as well.  It is
> the same to write
> 
>    asm("mov %1,%0 ; mov %0,42" : "+r"(x), "=r"(y));
> or to write
>    asm("mov %1,%0 ; mov %0,42" : "=r"(x), "=r"(y) : "0"(x));
> 
> (So not "at the same time" as in "in the same machine instruction", but
> more loosely, as in "in the same inline asm statement").
> 
>> "to be made earlyclobbers", what does this means exactly ? How to do that ?
> 
> You write &, like "+&r" in this case.  It means the machine code writes
> to this register before it has consumed all asm inputs (remember, GCC
> does not understand (or even parse!) the assembler string).
> 
> So just
> 
> 		: "+&r" (loop1), "+&r" (loop2)
> 
> will do.  (Why are they separate though?  It could just be one loop var).

Yes it could just be a single loop var, but in that case it would have 
to be reset at the start of the second loop, which means we would have 
to pass 'addr' for resetting the loop anyway, so I opted to do it 
outside the inline asm by using to separate loop vars set to their 
starting value outside the inline asm.

Christophe

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 3/6] powerpc: Convert flush_icache_range & friends to C
  2019-09-03 17:05         ` Christophe Leroy
@ 2019-09-03 18:31           ` Segher Boessenkool
  2019-09-03 20:11             ` Gabriel Paubert
  0 siblings, 1 reply; 28+ messages in thread
From: Segher Boessenkool @ 2019-09-03 18:31 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Alastair D'Silva, David Hildenbrand, Greg Kroah-Hartman,
	linux-kernel, Nicholas Piggin, Mike Rapoport, Paul Mackerras,
	alastair, Qian Cai, Thomas Gleixner, linuxppc-dev, Andrew Morton,
	Allison Randal

On Tue, Sep 03, 2019 at 07:05:19PM +0200, Christophe Leroy wrote:
> Le 03/09/2019 à 18:04, Segher Boessenkool a écrit :
> >(Why are they separate though?  It could just be one loop var).
> 
> Yes it could just be a single loop var, but in that case it would have 
> to be reset at the start of the second loop, which means we would have 
> to pass 'addr' for resetting the loop anyway,

Right, I noticed that after hitting send, as usual.

> so I opted to do it 
> outside the inline asm by using to separate loop vars set to their 
> starting value outside the inline asm.

The thing is, the way it is written now, it will get separate registers
for each loop (with proper earlyclobbers added).  Not that that really
matters of course, it just feels wrong :-)


Segher

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 3/6] powerpc: Convert flush_icache_range & friends to C
  2019-09-03 18:31           ` Segher Boessenkool
@ 2019-09-03 20:11             ` Gabriel Paubert
  2019-09-04  3:42               ` Alastair D'Silva
  0 siblings, 1 reply; 28+ messages in thread
From: Gabriel Paubert @ 2019-09-03 20:11 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Christophe Leroy, Alastair D'Silva, David Hildenbrand,
	Greg Kroah-Hartman, linux-kernel, Nicholas Piggin, Mike Rapoport,
	Paul Mackerras, alastair, Qian Cai, Thomas Gleixner,
	linuxppc-dev, Andrew Morton, Allison Randal

On Tue, Sep 03, 2019 at 01:31:57PM -0500, Segher Boessenkool wrote:
> On Tue, Sep 03, 2019 at 07:05:19PM +0200, Christophe Leroy wrote:
> > Le 03/09/2019 à 18:04, Segher Boessenkool a écrit :
> > >(Why are they separate though?  It could just be one loop var).
> > 
> > Yes it could just be a single loop var, but in that case it would have 
> > to be reset at the start of the second loop, which means we would have 
> > to pass 'addr' for resetting the loop anyway,
> 
> Right, I noticed that after hitting send, as usual.
> 
> > so I opted to do it 
> > outside the inline asm by using to separate loop vars set to their 
> > starting value outside the inline asm.
> 
> The thing is, the way it is written now, it will get separate registers
> for each loop (with proper earlyclobbers added).  Not that that really
> matters of course, it just feels wrong :-)

After "mtmsr %3", it is always possible to copy %0 to %3 and use it as
an address register for the second loop. One register less to allocate
for the compiler. Constraints of course have to be adjusted.

	Gabriel
> 
> 
> Segher

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH v2 3/6] powerpc: Convert flush_icache_range & friends to C
  2019-09-03  6:08   ` Christophe Leroy
  2019-09-03 11:25     ` Michael Ellerman
@ 2019-09-04  3:23     ` Alastair D'Silva
  2019-09-04 13:35       ` Segher Boessenkool
  1 sibling, 1 reply; 28+ messages in thread
From: Alastair D'Silva @ 2019-09-04  3:23 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Greg Kroah-Hartman, Thomas Gleixner, Qian Cai, Nicholas Piggin,
	Allison Randal, Andrew Morton, David Hildenbrand, Mike Rapoport,
	linuxppc-dev, linux-kernel

On Tue, 2019-09-03 at 08:08 +0200, Christophe Leroy wrote:
> 
> Le 03/09/2019 à 07:23, Alastair D'Silva a écrit :
> > From: Alastair D'Silva <alastair@d-silva.org>
> > 
> > Similar to commit 22e9c88d486a
> > ("powerpc/64: reuse PPC32 static inline flush_dcache_range()")
> > this patch converts the following ASM symbols to C:
> >      flush_icache_range()
> >      __flush_dcache_icache()
> >      __flush_dcache_icache_phys()
> > 
> > This was done as we discovered a long-standing bug where the length
> > of the
> > range was truncated due to using a 32 bit shift instead of a 64 bit
> > one.
> > 
> > By converting these functions to C, it becomes easier to maintain.
> > 
> > flush_dcache_icache_phys() retains a critical assembler section as
> > we must
> > ensure there are no memory accesses while the data MMU is disabled
> > (authored by Christophe Leroy). Since this has no external callers,
> > it has
> > also been made static, allowing the compiler to inline it within
> > flush_dcache_icache_page().
> > 
> > Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> > Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
> > ---
> >   arch/powerpc/include/asm/cache.h      |  26 ++---
> >   arch/powerpc/include/asm/cacheflush.h |  24 ++--
> >   arch/powerpc/kernel/misc_32.S         | 117 --------------------
> >   arch/powerpc/kernel/misc_64.S         | 102 -----------------
> >   arch/powerpc/mm/mem.c                 | 152
> > +++++++++++++++++++++++++-
> >   5 files changed, 173 insertions(+), 248 deletions(-)
> > 
> > diff --git a/arch/powerpc/include/asm/cache.h
> > b/arch/powerpc/include/asm/cache.h
> > index f852d5cd746c..91c808c6738b 100644
> > --- a/arch/powerpc/include/asm/cache.h
> > +++ b/arch/powerpc/include/asm/cache.h
> > @@ -98,20 +98,7 @@ static inline u32 l1_icache_bytes(void)
> >   #endif
> >   #endif /* ! __ASSEMBLY__ */
> >   
> > -#if defined(__ASSEMBLY__)
> > -/*
> > - * For a snooping icache, we still need a dummy icbi to purge all
> > the
> > - * prefetched instructions from the ifetch buffers. We also need a
> > sync
> > - * before the icbi to order the the actual stores to memory that
> > might
> > - * have modified instructions with the icbi.
> > - */
> > -#define PURGE_PREFETCHED_INS	\
> > -	sync;			\
> > -	icbi	0,r3;		\
> > -	sync;			\
> > -	isync
> > -
> > -#else
> > +#if !defined(__ASSEMBLY__)
> >   #define __read_mostly
> > __attribute__((__section__(".data..read_mostly")))
> >   
> >   #ifdef CONFIG_PPC_BOOK3S_32
> > @@ -145,6 +132,17 @@ static inline void dcbst(void *addr)
> >   {
> >   	__asm__ __volatile__ ("dcbst %y0" : : "Z"(*(u8 *)addr) :
> > "memory");
> >   }
> > +
> > +static inline void icbi(void *addr)
> > +{
> > +	__asm__ __volatile__ ("icbi 0, %0" : : "r"(addr) : "memory");
> 
> I think "__asm__ __volatile__" is deprecated. Use "asm volatile"
> instead.
> 

Ok.

> > +}
> > +
> > +static inline void iccci(void *addr)
> > +{
> > +	__asm__ __volatile__ ("iccci 0, %0" : : "r"(addr) : "memory");
> > +}
> > +
> 
> Same
> 
> >   #endif /* !__ASSEMBLY__ */
> >   #endif /* __KERNEL__ */
> >   #endif /* _ASM_POWERPC_CACHE_H */
> > diff --git a/arch/powerpc/include/asm/cacheflush.h
> > b/arch/powerpc/include/asm/cacheflush.h
> > index ed57843ef452..4a1c9f0200e1 100644
> > --- a/arch/powerpc/include/asm/cacheflush.h
> > +++ b/arch/powerpc/include/asm/cacheflush.h
> > @@ -42,24 +42,20 @@ extern void flush_dcache_page(struct page
> > *page);
> >   #define flush_dcache_mmap_lock(mapping)		do { } while
> > (0)
> >   #define flush_dcache_mmap_unlock(mapping)	do { } while (0)
> >   
> > -extern void flush_icache_range(unsigned long, unsigned long);
> > +void flush_icache_range(unsigned long start, unsigned long stop);
> >   extern void flush_icache_user_range(struct vm_area_struct *vma,
> >   				    struct page *page, unsigned long
> > addr,
> >   				    int len);
> > -extern void __flush_dcache_icache(void *page_va);
> >   extern void flush_dcache_icache_page(struct page *page);
> > -#if defined(CONFIG_PPC32) && !defined(CONFIG_BOOKE)
> > -extern void __flush_dcache_icache_phys(unsigned long physaddr);
> > -#else
> > -static inline void __flush_dcache_icache_phys(unsigned long
> > physaddr)
> > -{
> > -	BUG();
> > -}
> > -#endif
> > -
> > -/*
> > - * Write any modified data cache blocks out to memory and
> > invalidate them.
> > - * Does not invalidate the corresponding instruction cache blocks.
> > +void __flush_dcache_icache(void *page);
> > +
> > +/**
> > + * flush_dcache_range(): Write any modified data cache blocks out
> > to memory and
> > + * invalidate them. Does not invalidate the corresponding
> > instruction cache
> > + * blocks.
> > + *
> > + * @start: the start address
> > + * @stop: the stop address (exclusive)
> >    */
> >   static inline void flush_dcache_range(unsigned long start,
> > unsigned long stop)
> >   {
> > diff --git a/arch/powerpc/kernel/misc_32.S
> > b/arch/powerpc/kernel/misc_32.S
> > index fe4bd321730e..12b95e6799d4 100644
> > --- a/arch/powerpc/kernel/misc_32.S
> > +++ b/arch/powerpc/kernel/misc_32.S
> > @@ -318,123 +318,6 @@
> > END_FTR_SECTION_IFSET(CPU_FTR_UNIFIED_ID_CACHE)
> >   EXPORT_SYMBOL(flush_instruction_cache)
> >   #endif /* CONFIG_PPC_8xx */
> >   
> > -/*
> > - * Write any modified data cache blocks out to memory
> > - * and invalidate the corresponding instruction cache blocks.
> > - * This is a no-op on the 601.
> > - *
> > - * flush_icache_range(unsigned long start, unsigned long stop)
> > - */
> > -_GLOBAL(flush_icache_range)
> > -BEGIN_FTR_SECTION
> > -	PURGE_PREFETCHED_INS
> > -	blr				/* for 601, do nothing */
> > -END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
> > -	rlwinm	r3,r3,0,0,31 - L1_CACHE_SHIFT
> > -	subf	r4,r3,r4
> > -	addi	r4,r4,L1_CACHE_BYTES - 1
> > -	srwi.	r4,r4,L1_CACHE_SHIFT
> > -	beqlr
> > -	mtctr	r4
> > -	mr	r6,r3
> > -1:	dcbst	0,r3
> > -	addi	r3,r3,L1_CACHE_BYTES
> > -	bdnz	1b
> > -	sync				/* wait for dcbst's to get
> > to ram */
> > -#ifndef CONFIG_44x
> > -	mtctr	r4
> > -2:	icbi	0,r6
> > -	addi	r6,r6,L1_CACHE_BYTES
> > -	bdnz	2b
> > -#else
> > -	/* Flash invalidate on 44x because we are passed kmapped
> > addresses and
> > -	   this doesn't work for userspace pages due to the virtually
> > tagged
> > -	   icache.  Sigh. */
> > -	iccci	0, r0
> > -#endif
> > -	sync				/* additional sync needed
> > on g4 */
> > -	isync
> > -	blr
> > -_ASM_NOKPROBE_SYMBOL(flush_icache_range)
> > -EXPORT_SYMBOL(flush_icache_range)
> > -
> > -/*
> > - * Flush a particular page from the data cache to RAM.
> > - * Note: this is necessary because the instruction cache does
> > *not*
> > - * snoop from the data cache.
> > - * This is a no-op on the 601 which has a unified cache.
> > - *
> > - *	void __flush_dcache_icache(void *page)
> > - */
> > -_GLOBAL(__flush_dcache_icache)
> > -BEGIN_FTR_SECTION
> > -	PURGE_PREFETCHED_INS
> > -	blr
> > -END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
> > -	rlwinm	r3,r3,0,0,31-PAGE_SHIFT		/* Get page base
> > address */
> > -	li	r4,PAGE_SIZE/L1_CACHE_BYTES	/* Number of lines in a
> > page */
> > -	mtctr	r4
> > -	mr	r6,r3
> > -0:	dcbst	0,r3				/* Write line to
> > ram */
> > -	addi	r3,r3,L1_CACHE_BYTES
> > -	bdnz	0b
> > -	sync
> > -#ifdef CONFIG_44x
> > -	/* We don't flush the icache on 44x. Those have a virtual
> > icache
> > -	 * and we don't have access to the virtual address here (it's
> > -	 * not the page vaddr but where it's mapped in user space). The
> > -	 * flushing of the icache on these is handled elsewhere, when
> > -	 * a change in the address space occurs, before returning to
> > -	 * user space
> > -	 */
> > -BEGIN_MMU_FTR_SECTION
> > -	blr
> > -END_MMU_FTR_SECTION_IFSET(MMU_FTR_TYPE_44x)
> > -#endif /* CONFIG_44x */
> > -	mtctr	r4
> > -1:	icbi	0,r6
> > -	addi	r6,r6,L1_CACHE_BYTES
> > -	bdnz	1b
> > -	sync
> > -	isync
> > -	blr
> > -
> > -#ifndef CONFIG_BOOKE
> > -/*
> > - * Flush a particular page from the data cache to RAM, identified
> > - * by its physical address.  We turn off the MMU so we can just
> > use
> > - * the physical address (this may be a highmem page without a
> > kernel
> > - * mapping).
> > - *
> > - *	void __flush_dcache_icache_phys(unsigned long physaddr)
> > - */
> > -_GLOBAL(__flush_dcache_icache_phys)
> > -BEGIN_FTR_SECTION
> > -	PURGE_PREFETCHED_INS
> > -	blr					/* for 601, do nothing */
> > -END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
> > -	mfmsr	r10
> > -	rlwinm	r0,r10,0,28,26			/* clear DR */
> > -	mtmsr	r0
> > -	isync
> > -	rlwinm	r3,r3,0,0,31-PAGE_SHIFT		/* Get page base
> > address */
> > -	li	r4,PAGE_SIZE/L1_CACHE_BYTES	/* Number of lines in a
> > page */
> > -	mtctr	r4
> > -	mr	r6,r3
> > -0:	dcbst	0,r3				/* Write line to
> > ram */
> > -	addi	r3,r3,L1_CACHE_BYTES
> > -	bdnz	0b
> > -	sync
> > -	mtctr	r4
> > -1:	icbi	0,r6
> > -	addi	r6,r6,L1_CACHE_BYTES
> > -	bdnz	1b
> > -	sync
> > -	mtmsr	r10				/* restore DR */
> > -	isync
> > -	blr
> > -#endif /* CONFIG_BOOKE */
> > -
> >   /*
> >    * Copy a whole page.  We use the dcbz instruction on the
> > destination
> >    * to reduce memory traffic (it eliminates the unnecessary reads
> > of
> > diff --git a/arch/powerpc/kernel/misc_64.S
> > b/arch/powerpc/kernel/misc_64.S
> > index 9bc0aa9aeb65..ff20c253f273 100644
> > --- a/arch/powerpc/kernel/misc_64.S
> > +++ b/arch/powerpc/kernel/misc_64.S
> > @@ -49,108 +49,6 @@ _GLOBAL(call_do_irq)
> >   	mtlr	r0
> >   	blr
> >   
> > -	.section	".toc","aw"
> > -PPC64_CACHES:
> > -	.tc		ppc64_caches[TC],ppc64_caches
> > -	.section	".text"
> > -
> > -/*
> > - * Write any modified data cache blocks out to memory
> > - * and invalidate the corresponding instruction cache blocks.
> > - *
> > - * flush_icache_range(unsigned long start, unsigned long stop)
> > - *
> > - *   flush all bytes from start through stop-1 inclusive
> > - */
> > -
> > -_GLOBAL_TOC(flush_icache_range)
> > -BEGIN_FTR_SECTION
> > -	PURGE_PREFETCHED_INS
> > -	blr
> > -END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
> > -/*
> > - * Flush the data cache to memory
> > - *
> > - * Different systems have different cache line sizes
> > - * and in some cases i-cache and d-cache line sizes differ from
> > - * each other.
> > - */
> > - 	ld	r10,PPC64_CACHES@toc(r2)
> > -	lwz	r7,DCACHEL1BLOCKSIZE(r10)/* Get cache block size */
> > -	addi	r5,r7,-1
> > -	andc	r6,r3,r5		/* round low to line bdy */
> > -	subf	r8,r6,r4		/* compute length */
> > -	add	r8,r8,r5		/* ensure we get enough */
> > -	lwz	r9,DCACHEL1LOGBLOCKSIZE(r10)	/* Get log-2 of cache block
> > size */
> > -	srd.	r8,r8,r9		/* compute line count */
> > -	beqlr				/* nothing to do? */
> > -	mtctr	r8
> > -1:	dcbst	0,r6
> > -	add	r6,r6,r7
> > -	bdnz	1b
> > -	sync
> > -
> > -/* Now invalidate the instruction cache */
> > -	
> > -	lwz	r7,ICACHEL1BLOCKSIZE(r10)	/* Get Icache block size */
> > -	addi	r5,r7,-1
> > -	andc	r6,r3,r5		/* round low to line bdy */
> > -	subf	r8,r6,r4		/* compute length */
> > -	add	r8,r8,r5
> > -	lwz	r9,ICACHEL1LOGBLOCKSIZE(r10)	/* Get log-2 of Icache
> > block size */
> > -	srd.	r8,r8,r9		/* compute line count */
> > -	beqlr				/* nothing to do? */
> > -	mtctr	r8
> > -2:	icbi	0,r6
> > -	add	r6,r6,r7
> > -	bdnz	2b
> > -	isync
> > -	blr
> > -_ASM_NOKPROBE_SYMBOL(flush_icache_range)
> > -EXPORT_SYMBOL(flush_icache_range)
> > -
> > -/*
> > - * Flush a particular page from the data cache to RAM.
> > - * Note: this is necessary because the instruction cache does
> > *not*
> > - * snoop from the data cache.
> > - *
> > - *	void __flush_dcache_icache(void *page)
> > - */
> > -_GLOBAL(__flush_dcache_icache)
> > -/*
> > - * Flush the data cache to memory
> > - *
> > - * Different systems have different cache line sizes
> > - */
> > -
> > -BEGIN_FTR_SECTION
> > -	PURGE_PREFETCHED_INS
> > -	blr
> > -END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
> > -
> > -/* Flush the dcache */
> > - 	ld	r7,PPC64_CACHES@toc(r2)
> > -	clrrdi	r3,r3,PAGE_SHIFT           	    /* Page align
> > */
> > -	lwz	r4,DCACHEL1BLOCKSPERPAGE(r7)	/* Get # dcache blocks per
> > page */
> > -	lwz	r5,DCACHEL1BLOCKSIZE(r7)	/* Get dcache block size */
> > -	mr	r6,r3
> > -	mtctr	r4
> > -0:	dcbst	0,r6
> > -	add	r6,r6,r5
> > -	bdnz	0b
> > -	sync
> > -
> > -/* Now invalidate the icache */	
> > -
> > -	lwz	r4,ICACHEL1BLOCKSPERPAGE(r7)	/* Get # icache blocks per
> > page */
> > -	lwz	r5,ICACHEL1BLOCKSIZE(r7)	/* Get icache block size */
> > -	mtctr	r4
> > -1:	icbi	0,r3
> > -	add	r3,r3,r5
> > -	bdnz	1b
> > -	isync
> > -	blr
> > -
> >   _GLOBAL(__bswapdi2)
> >   EXPORT_SYMBOL(__bswapdi2)
> >   	srdi	r8,r3,32
> > diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
> > index 9191a66b3bc5..cd540123874d 100644
> > --- a/arch/powerpc/mm/mem.c
> > +++ b/arch/powerpc/mm/mem.c
> > @@ -321,6 +321,105 @@ void free_initmem(void)
> >   	free_initmem_default(POISON_FREE_INITMEM);
> >   }
> >   
> > +/*
> > + * Warning: This macro will perform an early return if the CPU has
> > + * a coherent icache. The intent is is call this early in
> > function,
> > + * and handle the non-coherent icache variant afterwards.
> > + *
> > + * For a snooping icache, we still need a dummy icbi to purge all
> > the
> > + * prefetched instructions from the ifetch buffers. We also need a
> > sync
> > + * before the icbi to order the the actual stores to memory that
> > might
> > + * have modified instructions with the icbi.
> > + */
> > +#define flush_coherent_icache_or_return(addr) {			
> > \
> > +	if (cpu_has_feature(CPU_FTR_COHERENT_ICACHE)) {		\
> > +		mb(); /* sync */				\
> > +		icbi(addr);					\
> > +		mb(); /* sync */				\
> > +		isync();					\
> > +		return;						\
> > +	}							\
> > +}
> 
> I hate this kind of awful macro which kills code readability.
> 
> Please to something like
> 
> static bool flush_coherent_icache_or_return(unsigned long addr)
> {
> 	if (!cpu_has_feature(CPU_FTR_COHERENT_ICACHE))
> 		return false;
> 
> 	mb(); /* sync */
> 	icbi(addr);
> 	mb(); /* sync */
> 	isync();
> 	return true;
> }
> 
> then callers will do:
> 
> 	if (flush_coherent_icache_or_return(addr))
> 		return;

Sounds good.

> > +
> > +/**
> > + * flush_icache_range: Write any modified data cache blocks out to
> > memory
> > + * and invalidate the corresponding blocks in the instruction
> > cache
> > + *
> > + * Generic code will call this after writing memory, before
> > executing from it.
> > + *
> > + * @start: the start address
> > + * @stop: the stop address (exclusive)
> > + */
> > +void flush_icache_range(unsigned long start, unsigned long stop)
> > +{
> > +	unsigned long shift = l1_icache_shift();
> > +	unsigned long bytes = l1_icache_bytes();
> > +	char *addr = (char *)(start & ~(bytes - 1));
> > +	unsigned long size = stop - (unsigned long)addr + (bytes - 1);
> > +	unsigned long i;
> 
> Could probably move all this and the loop into a
> __flush_icache_range() 
> helper.

Will factor it out into invalidate_icache_range (as its similar to
invalidate_dcache_range).

> 
> > +
> > +	flush_coherent_icache_or_return(addr);
> > +	clean_dcache_range(start, stop);
> > +
> > +	if (IS_ENABLED(CONFIG_44x)) {
> > +		/*
> > +		 * Flash invalidate on 44x because we are passed
> > kmapped
> > +		 * addresses and this doesn't work for userspace pages
> > due to
> > +		 * the virtually tagged icache.
> > +		 */
> > +		iccci(addr);
> > +	} else {
> > +		/* Now invalidate the instruction cache */
> > +		for (i = 0; i < size >> shift; i++, addr += bytes)
> > +			icbi(addr);
> > +	}
> > +
> > +	if (!IS_ENABLED(CONFIG_PPC64))
> > +		mb(); /* additional sync needed on g4 */
> > +	isync();
> > +}
> > +EXPORT_SYMBOL(flush_icache_range);
> > +
> > +#if !defined(CONFIG_PPC_8xx) & !defined(CONFIG_PPC64)
> > +/**
> > + * flush_dcache_icache_phys() - Flush a page by it's physical
> > address
> > + * @physaddr: the physical address of the page
> > + */
> > +static void flush_dcache_icache_phys(unsigned long physaddr)
> > +{
> > +	unsigned long bytes = l1_dcache_bytes();
> > +	unsigned long nb = PAGE_SIZE / bytes;
> > +	unsigned long addr = physaddr & PAGE_MASK;
> > +	unsigned long msr, msr0;
> > +	unsigned long loop1 = addr, loop2 = addr;
> > +
> > +	msr0 = mfmsr();
> > +	msr = msr0 & ~MSR_DR;
> 
> Maybe we could get rid of msr and just use (msr0 & ~MSR_DR) in the
> asm 
> inputs parameters.
> 

That's already pretty busy, I think it's clearer as-is.

> > +	/*
> > +	 * This must remain as ASM to prevent potential memory accesses
> > +	 * while the data MMU is disabled
> > +	 */
> > +	asm volatile(
> > +		"   mtctr %2;"
> > +		"   mtmsr %3;"
> > +		"   isync;"
> > +		"0: dcbst   0, %0;"
> > +		"   addi    %0, %0, %4;"
> > +		"   bdnz    0b;"
> > +		"   sync;"
> > +		"   mtctr %2;"
> > +		"1: icbi    0, %1;"
> > +		"   addi    %1, %1, %4;"
> > +		"   bdnz    1b;"
> > +		"   sync;"
> > +		"   mtmsr %5;"
> > +		"   isync;"
> > +		: "+r" (loop1), "+r" (loop2)
> > +		: "r" (nb), "r" (msr), "i" (bytes), "r" (msr0)
> > +		: "ctr", "memory");
> 
> Maybe also add "msr" in the clobbers.
> 
Ok.

> > +}
> > +#endif // !defined(CONFIG_PPC_8xx) & !defined(CONFIG_PPC64)
> > +
> >   /*
> >    * This is called when a page has been modified by the kernel.
> >    * It just marks the page as not i-cache clean.  We do the i-
> > cache
> > @@ -353,12 +452,63 @@ void flush_dcache_icache_page(struct page
> > *page)
> >   		__flush_dcache_icache(start);
> >   		kunmap_atomic(start);
> >   	} else {
> > -		__flush_dcache_icache_phys(page_to_pfn(page) <<
> > PAGE_SHIFT);
> > +		unsigned long addr = page_to_pfn(page) << PAGE_SHIFT;
> > +
> > +		flush_coherent_icache_or_return((void *)addr);
> > +		flush_dcache_icache_phys(addr);
> >   	}
> >   #endif
> >   }
> >   EXPORT_SYMBOL(flush_dcache_icache_page);
> >   
> > +/**
> > + * __flush_dcache_icache(): Flush a particular page from the data
> > cache to RAM.
> > + * Note: this is necessary because the instruction cache does
> > *not*
> > + * snoop from the data cache.
> > + *
> > + * @page: the address of the page to flush
> > + */
> > +void __flush_dcache_icache(void *page)
> > +{
> > +	char *addr = page;
> > +	unsigned long lines = PAGE_SIZE >> l1_dcache_shift();
> > +	unsigned long bytes = l1_dcache_bytes();
> > +	unsigned long i;
> > +
> > +	flush_coherent_icache_or_return(addr);
> > +
> > +	/* Flush the data cache to memory */
> > +	for (i = 0; i < lines; i++, addr += bytes)
> > +		dcbst(addr);
> 
> Use clean_dcache_range(addr, addr + PAGE_SIZE);
> 
Ok.

> > +
> > +	mb(); /* sync */
> > +
> > +#ifdef CONFIG_44x
> 
> This ifdef is useless.
> If CONFIG_44x is not enabled, MMU_FTR_TYPE_44x will not be in 
> MMU_FTRS_POSSIBLE so cpu_has_feature() will return constant false at 
> buildtime and GCC will drop it.
> 

Ok.

> > +	/*
> > +	 * We don't flush the icache on 44x. Those have a virtual
> > icache and we
> > +	 * don't have access to the virtual address here (it's not the
> > page
> > +	 * vaddr but where it's mapped in user space). The flushing of
> > the
> > +	 * icache on these is handled elsewhere, when a change in the
> > address
> > +	 * space occurs, before returning to user space.
> > +	 */
> > +
> > +	if (cpu_has_feature(MMU_FTR_TYPE_44x))
> > +		return;
> > +#endif
> > +
> > +	lines = PAGE_SIZE >> l1_icache_shift();
> > +	bytes = l1_icache_bytes();
> > +	addr = page;
> > +
> > +	/* Now invalidate the instruction cache */
> > +	for (i = 0; i < lines; i++, addr += bytes)
> > +		icbi(addr);
> 
> Re-use the __flush_icache_range() helper suggested before.
> 
Ok.

> > +
> > +	mb(); /* sync */
> > +	isync();
> > +}
> > +EXPORT_SYMBOL(__flush_dcache_icache);
> > +
> >   void clear_user_page(void *page, unsigned long vaddr, struct page
> > *pg)
> >   {
> >   	clear_page(page);
> > 
> 
> Christophe
-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819


^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH v2 3/6] powerpc: Convert flush_icache_range & friends to C
  2019-09-03 16:04       ` Segher Boessenkool
  2019-09-03 17:05         ` Christophe Leroy
@ 2019-09-04  3:36         ` Alastair D'Silva
  1 sibling, 0 replies; 28+ messages in thread
From: Alastair D'Silva @ 2019-09-04  3:36 UTC (permalink / raw)
  To: Segher Boessenkool, Christophe Leroy
  Cc: David Hildenbrand, Greg Kroah-Hartman, linux-kernel,
	Nicholas Piggin, Mike Rapoport, Paul Mackerras, Qian Cai,
	Thomas Gleixner, linuxppc-dev, Andrew Morton, Allison Randal

On Tue, 2019-09-03 at 11:04 -0500, Segher Boessenkool wrote:
> On Tue, Sep 03, 2019 at 04:28:09PM +0200, Christophe Leroy wrote:
> > Le 03/09/2019 à 15:04, Segher Boessenkool a écrit :
> > > On Tue, Sep 03, 2019 at 03:23:57PM +1000, Alastair D'Silva wrote:
> > > > +	asm volatile(
> > > > +		"   mtctr %2;"
> > > > +		"   mtmsr %3;"
> > > > +		"   isync;"
> > > > +		"0: dcbst   0, %0;"
> > > > +		"   addi    %0, %0, %4;"
> > > > +		"   bdnz    0b;"
> > > > +		"   sync;"
> > > > +		"   mtctr %2;"
> > > > +		"1: icbi    0, %1;"
> > > > +		"   addi    %1, %1, %4;"
> > > > +		"   bdnz    1b;"
> > > > +		"   sync;"
> > > > +		"   mtmsr %5;"
> > > > +		"   isync;"
> > > > +		: "+r" (loop1), "+r" (loop2)
> > > > +		: "r" (nb), "r" (msr), "i" (bytes), "r" (msr0)
> > > > +		: "ctr", "memory");
> > > 
> > > This outputs as one huge assembler statement, all on one
> > > line.  That's
> > > going to be fun to read or debug.
> > 
> > Do you mean \n has to be added after the ; ?
> 
> Something like that.  There is no really satisfying way for doing
> huge
> inline asm, and maybe that is a good thing ;-)
> 
> Often people write \n\t at the end of each line of inline asm.  This
> works
> pretty well (but then there are labels, oh joy).
> 
> > > loop1 and/or loop2 can be assigned the same register as msr0 or
> > > nb.  They
> > > need to be made earlyclobbers.  (msr is fine, all of its reads
> > > are before
> > > any writes to loop1 or loop2; and bytes is fine, it's not a
> > > register).
> > 
> > Can you explicit please ? Doesn't '+r' means that they are input
> > and 
> > output at the same time ?
> 
> That is what + means, yes -- that this output is an input as
> well.  It is
> the same to write
> 
>   asm("mov %1,%0 ; mov %0,42" : "+r"(x), "=r"(y));
> or to write
>   asm("mov %1,%0 ; mov %0,42" : "=r"(x), "=r"(y) : "0"(x));
> 
> (So not "at the same time" as in "in the same machine instruction",
> but
> more loosely, as in "in the same inline asm statement").
> 
> > "to be made earlyclobbers", what does this means exactly ? How to
> > do that ?
> 
> You write &, like "+&r" in this case.  It means the machine code
> writes
> to this register before it has consumed all asm inputs (remember, GCC
> does not understand (or even parse!) the assembler string).
> 
> So just
> 
> 		: "+&r" (loop1), "+&r" (loop2)
> 
> will do.  (Why are they separate though?  It could just be one loop
> var).
> 
> 

Thanks, I've updated these.

-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819


^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH v2 3/6] powerpc: Convert flush_icache_range & friends to C
  2019-09-03 20:11             ` Gabriel Paubert
@ 2019-09-04  3:42               ` Alastair D'Silva
  0 siblings, 0 replies; 28+ messages in thread
From: Alastair D'Silva @ 2019-09-04  3:42 UTC (permalink / raw)
  To: Gabriel Paubert, Segher Boessenkool
  Cc: Christophe Leroy, David Hildenbrand, Greg Kroah-Hartman,
	linux-kernel, Nicholas Piggin, Mike Rapoport, Paul Mackerras,
	Qian Cai, Thomas Gleixner, linuxppc-dev, Andrew Morton,
	Allison Randal

On Tue, 2019-09-03 at 22:11 +0200, Gabriel Paubert wrote:
> On Tue, Sep 03, 2019 at 01:31:57PM -0500, Segher Boessenkool wrote:
> > On Tue, Sep 03, 2019 at 07:05:19PM +0200, Christophe Leroy wrote:
> > > Le 03/09/2019 à 18:04, Segher Boessenkool a écrit :
> > > > (Why are they separate though?  It could just be one loop var).
> > > 
> > > Yes it could just be a single loop var, but in that case it would
> > > have 
> > > to be reset at the start of the second loop, which means we would
> > > have 
> > > to pass 'addr' for resetting the loop anyway,
> > 
> > Right, I noticed that after hitting send, as usual.
> > 
> > > so I opted to do it 
> > > outside the inline asm by using to separate loop vars set to
> > > their 
> > > starting value outside the inline asm.
> > 
> > The thing is, the way it is written now, it will get separate
> > registers
> > for each loop (with proper earlyclobbers added).  Not that that
> > really
> > matters of course, it just feels wrong :-)
> 
> After "mtmsr %3", it is always possible to copy %0 to %3 and use it
> as
> an address register for the second loop. One register less to
> allocate
> for the compiler. Constraints of course have to be adjusted.
> 
> 

Given that we're dealing with registers holding data that has been
named outside the assembler, this feels dirty. We'd be using the
register passed in as 'msr' to hold the address instead.

Since we're not short on registers, I don't see this as a good change.

-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819


^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH v2 4/6] powerpc: Chunk calls to flush_dcache_range in arch_*_memory
  2019-09-03  6:51       ` Christophe Leroy
@ 2019-09-04  4:11         ` Alastair D'Silva
  0 siblings, 0 replies; 28+ messages in thread
From: Alastair D'Silva @ 2019-09-04  4:11 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Thomas Gleixner, Greg Kroah-Hartman, Qian Cai, Nicholas Piggin,
	Allison Randal, Andrew Morton, Michal Hocko, Mike Rapoport,
	David Hildenbrand, linuxppc-dev, linux-kernel

On Tue, 2019-09-03 at 08:51 +0200, Christophe Leroy wrote:
<snip>

> > > This piece of code looks pretty similar to the one before. Can we
> > > refactor into a small helper ?
> > > 
> > 
> > Not much point, it's removed in a subsequent patch.
> > 
> 
> But you tell me that you leave to people the opportunity to not
> apply 
> that subsequent patch, and that's the reason you didn't put that
> patch 
> before this one. In that case adding a helper is worth it.
> 
> Christophe

I factored it out anyway, since it made the code much nicer to read.

-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 3/6] powerpc: Convert flush_icache_range & friends to C
  2019-09-04  3:23     ` Alastair D'Silva
@ 2019-09-04 13:35       ` Segher Boessenkool
  0 siblings, 0 replies; 28+ messages in thread
From: Segher Boessenkool @ 2019-09-04 13:35 UTC (permalink / raw)
  To: Alastair D'Silva
  Cc: Christophe Leroy, David Hildenbrand, linux-kernel,
	Nicholas Piggin, Mike Rapoport, Qian Cai, Greg Kroah-Hartman,
	Paul Mackerras, Thomas Gleixner, linuxppc-dev, Andrew Morton,
	Allison Randal

On Wed, Sep 04, 2019 at 01:23:36PM +1000, Alastair D'Silva wrote:
> > Maybe also add "msr" in the clobbers.
> > 
> Ok.

There is no known register "msr" in GCC.


Segher

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 1/6] powerpc: Allow flush_icache_range to work across ranges >4GB
  2019-09-03  5:23 ` [PATCH v2 1/6] powerpc: Allow flush_icache_range to work across ranges >4GB Alastair D'Silva
@ 2019-09-14  7:46   ` Christophe Leroy
  2019-09-16  3:25     ` Alastair D'Silva
  0 siblings, 1 reply; 28+ messages in thread
From: Christophe Leroy @ 2019-09-14  7:46 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: stable, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Greg Kroah-Hartman, Thomas Gleixner, Qian Cai, Nicholas Piggin,
	Allison Randal, Andrew Morton, Mike Rapoport, Michal Hocko,
	David Hildenbrand, linuxppc-dev, linux-kernel



Le 03/09/2019 à 07:23, Alastair D'Silva a écrit :
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> When calling flush_icache_range with a size >4GB, we were masking
> off the upper 32 bits, so we would incorrectly flush a range smaller
> than intended.
> 
> This patch replaces the 32 bit shifts with 64 bit ones, so that
> the full size is accounted for.

Isn't there the same issue in arch/powerpc/kernel/vdso64/cacheflush.S ?

Christophe

> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> Cc: stable@vger.kernel.org
> ---
>   arch/powerpc/kernel/misc_64.S | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/misc_64.S b/arch/powerpc/kernel/misc_64.S
> index b55a7b4cb543..9bc0aa9aeb65 100644
> --- a/arch/powerpc/kernel/misc_64.S
> +++ b/arch/powerpc/kernel/misc_64.S
> @@ -82,7 +82,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
>   	subf	r8,r6,r4		/* compute length */
>   	add	r8,r8,r5		/* ensure we get enough */
>   	lwz	r9,DCACHEL1LOGBLOCKSIZE(r10)	/* Get log-2 of cache block size */
> -	srw.	r8,r8,r9		/* compute line count */
> +	srd.	r8,r8,r9		/* compute line count */
>   	beqlr				/* nothing to do? */
>   	mtctr	r8
>   1:	dcbst	0,r6
> @@ -98,7 +98,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
>   	subf	r8,r6,r4		/* compute length */
>   	add	r8,r8,r5
>   	lwz	r9,ICACHEL1LOGBLOCKSIZE(r10)	/* Get log-2 of Icache block size */
> -	srw.	r8,r8,r9		/* compute line count */
> +	srd.	r8,r8,r9		/* compute line count */
>   	beqlr				/* nothing to do? */
>   	mtctr	r8
>   2:	icbi	0,r6
> 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH v2 1/6] powerpc: Allow flush_icache_range to work across ranges >4GB
  2019-09-14  7:46   ` Christophe Leroy
@ 2019-09-16  3:25     ` Alastair D'Silva
  0 siblings, 0 replies; 28+ messages in thread
From: Alastair D'Silva @ 2019-09-16  3:25 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: stable, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Greg Kroah-Hartman, Thomas Gleixner, Qian Cai, Nicholas Piggin,
	Allison Randal, Andrew Morton, Mike Rapoport, Michal Hocko,
	David Hildenbrand, linuxppc-dev, linux-kernel

On Sat, 2019-09-14 at 09:46 +0200, Christophe Leroy wrote:
> 
> Le 03/09/2019 à 07:23, Alastair D'Silva a écrit :
> > From: Alastair D'Silva <alastair@d-silva.org>
> > 
> > When calling flush_icache_range with a size >4GB, we were masking
> > off the upper 32 bits, so we would incorrectly flush a range
> > smaller
> > than intended.
> > 
> > This patch replaces the 32 bit shifts with 64 bit ones, so that
> > the full size is accounted for.
> 
> Isn't there the same issue in arch/powerpc/kernel/vdso64/cacheflush.S
> ?
> 
> Christophe

Yes, there is. I'll fix it, but I wonder whether anything calls it? I
asked Google, and every mention of it was in the kernel source or
mailing list.

Maybe BenH can chime in?

-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819


^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2019-09-16  3:26 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-03  5:23 [PATCH v2 0/6] powerpc: convert cache asm to C Alastair D'Silva
2019-09-03  5:23 ` [PATCH v2 1/6] powerpc: Allow flush_icache_range to work across ranges >4GB Alastair D'Silva
2019-09-14  7:46   ` Christophe Leroy
2019-09-16  3:25     ` Alastair D'Silva
2019-09-03  5:23 ` [PATCH v2 2/6] powerpc: define helpers to get L1 icache sizes Alastair D'Silva
2019-09-03  5:23 ` [PATCH v2 3/6] powerpc: Convert flush_icache_range & friends to C Alastair D'Silva
2019-09-03  6:08   ` Christophe Leroy
2019-09-03 11:25     ` Michael Ellerman
2019-09-04  3:23     ` Alastair D'Silva
2019-09-04 13:35       ` Segher Boessenkool
2019-09-03 13:04   ` Segher Boessenkool
2019-09-03 14:28     ` Christophe Leroy
2019-09-03 16:04       ` Segher Boessenkool
2019-09-03 17:05         ` Christophe Leroy
2019-09-03 18:31           ` Segher Boessenkool
2019-09-03 20:11             ` Gabriel Paubert
2019-09-04  3:42               ` Alastair D'Silva
2019-09-04  3:36         ` Alastair D'Silva
2019-09-03  5:23 ` [PATCH v2 4/6] powerpc: Chunk calls to flush_dcache_range in arch_*_memory Alastair D'Silva
2019-09-03  6:19   ` Christophe Leroy
2019-09-03  6:25     ` Alastair D'Silva
2019-09-03  6:51       ` Christophe Leroy
2019-09-04  4:11         ` Alastair D'Silva
2019-09-03  5:23 ` [PATCH v2 5/6] powerpc: Remove 'extern' from func prototypes in cache headers Alastair D'Silva
2019-09-03  6:21   ` Christophe Leroy
2019-09-03  5:24 ` [PATCH v2 6/6] powerpc: Don't flush caches when adding memory Alastair D'Silva
2019-09-03  6:23   ` Christophe Leroy
2019-09-03  6:27     ` Alastair D'Silva

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).