linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/6] powerpc: convert cache asm to C
@ 2019-08-15  4:10 Alastair D'Silva
  2019-08-15  4:10 ` [PATCH 1/6] powerpc: Allow flush_icache_range to work across ranges >4GB Alastair D'Silva
                   ` (5 more replies)
  0 siblings, 6 replies; 13+ messages in thread
From: Alastair D'Silva @ 2019-08-15  4:10 UTC (permalink / raw)
  To: alastair
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Christophe Leroy, Thomas Gleixner, Greg Kroah-Hartman, Qian Cai,
	Nicholas Piggin, Allison Randal, Andrew Morton, Michal Hocko,
	David Hildenbrand, Mike Rapoport, linuxppc-dev, linux-kernel

From: Alastair D'Silva <alastair@d-silva.org>

This series addresses a few issues discovered in how we flush caches:
1. Flushes were truncated at 4GB, so larger flushes were incorrect.
2. Flushing the dcache in arch_add_memory was unnecessary

This series also converts much of the cache assembler to C, with the
aim of making it easier to maintain.

Alastair D'Silva (6):
  powerpc: Allow flush_icache_range to work across ranges >4GB
  powerpc: define helpers to get L1 icache sizes
  powerpc: Convert flush_icache_range & friends to C
  powerpc: Chunk calls to flush_dcache_range in arch_*_memory
  powerpc: Remove 'extern' from func prototypes in cache headers
  powerpc: Don't flush caches when adding memory

 arch/powerpc/include/asm/cache.h      |  63 +++++++++-----
 arch/powerpc/include/asm/cacheflush.h |  49 ++++++-----
 arch/powerpc/kernel/misc_32.S         | 117 --------------------------
 arch/powerpc/kernel/misc_64.S         |  97 ---------------------
 arch/powerpc/mm/mem.c                 |  80 +++++++++++++++++-
 5 files changed, 146 insertions(+), 260 deletions(-)

-- 
2.21.0


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/6] powerpc: Allow flush_icache_range to work across ranges >4GB
  2019-08-15  4:10 [PATCH 0/6] powerpc: convert cache asm to C Alastair D'Silva
@ 2019-08-15  4:10 ` Alastair D'Silva
  2019-08-15  4:10 ` [PATCH 2/6] powerpc: define helpers to get L1 icache sizes Alastair D'Silva
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Alastair D'Silva @ 2019-08-15  4:10 UTC (permalink / raw)
  To: alastair
  Cc: stable, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Christophe Leroy, Thomas Gleixner, Qian Cai, Nicholas Piggin,
	Allison Randal, Greg Kroah-Hartman, Andrew Morton,
	David Hildenbrand, Mike Rapoport, linuxppc-dev, linux-kernel

From: Alastair D'Silva <alastair@d-silva.org>

When calling flush_icache_range with a size >4GB, we were masking
off the upper 32 bits, so we would incorrectly flush a range smaller
than intended.

This patch replaces the 32 bit shifts with 64 bit ones, so that
the full size is accounted for.

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
Cc: stable@vger.kernel.org
---
 arch/powerpc/kernel/misc_64.S | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/misc_64.S b/arch/powerpc/kernel/misc_64.S
index b55a7b4cb543..9bc0aa9aeb65 100644
--- a/arch/powerpc/kernel/misc_64.S
+++ b/arch/powerpc/kernel/misc_64.S
@@ -82,7 +82,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
 	subf	r8,r6,r4		/* compute length */
 	add	r8,r8,r5		/* ensure we get enough */
 	lwz	r9,DCACHEL1LOGBLOCKSIZE(r10)	/* Get log-2 of cache block size */
-	srw.	r8,r8,r9		/* compute line count */
+	srd.	r8,r8,r9		/* compute line count */
 	beqlr				/* nothing to do? */
 	mtctr	r8
 1:	dcbst	0,r6
@@ -98,7 +98,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
 	subf	r8,r6,r4		/* compute length */
 	add	r8,r8,r5
 	lwz	r9,ICACHEL1LOGBLOCKSIZE(r10)	/* Get log-2 of Icache block size */
-	srw.	r8,r8,r9		/* compute line count */
+	srd.	r8,r8,r9		/* compute line count */
 	beqlr				/* nothing to do? */
 	mtctr	r8
 2:	icbi	0,r6
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/6] powerpc: define helpers to get L1 icache sizes
  2019-08-15  4:10 [PATCH 0/6] powerpc: convert cache asm to C Alastair D'Silva
  2019-08-15  4:10 ` [PATCH 1/6] powerpc: Allow flush_icache_range to work across ranges >4GB Alastair D'Silva
@ 2019-08-15  4:10 ` Alastair D'Silva
  2019-08-15  4:10 ` [PATCH 3/6] powerpc: Convert flush_icache_range & friends to C Alastair D'Silva
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Alastair D'Silva @ 2019-08-15  4:10 UTC (permalink / raw)
  To: alastair
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Christophe Leroy, Thomas Gleixner, Greg Kroah-Hartman, Qian Cai,
	Nicholas Piggin, Allison Randal, Andrew Morton,
	David Hildenbrand, Michal Hocko, Mike Rapoport, linuxppc-dev,
	linux-kernel

From: Alastair D'Silva <alastair@d-silva.org>

This patch adds helpers to retrieve icache sizes, and renames the existing
helpers to make it clear that they are for dcache.

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
---
 arch/powerpc/include/asm/cache.h      | 29 +++++++++++++++++++++++----
 arch/powerpc/include/asm/cacheflush.h | 12 +++++------
 2 files changed, 31 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/include/asm/cache.h b/arch/powerpc/include/asm/cache.h
index b3388d95f451..f852d5cd746c 100644
--- a/arch/powerpc/include/asm/cache.h
+++ b/arch/powerpc/include/asm/cache.h
@@ -55,25 +55,46 @@ struct ppc64_caches {
 
 extern struct ppc64_caches ppc64_caches;
 
-static inline u32 l1_cache_shift(void)
+static inline u32 l1_dcache_shift(void)
 {
 	return ppc64_caches.l1d.log_block_size;
 }
 
-static inline u32 l1_cache_bytes(void)
+static inline u32 l1_dcache_bytes(void)
 {
 	return ppc64_caches.l1d.block_size;
 }
+
+static inline u32 l1_icache_shift(void)
+{
+	return ppc64_caches.l1i.log_block_size;
+}
+
+static inline u32 l1_icache_bytes(void)
+{
+	return ppc64_caches.l1i.block_size;
+}
 #else
-static inline u32 l1_cache_shift(void)
+static inline u32 l1_dcache_shift(void)
 {
 	return L1_CACHE_SHIFT;
 }
 
-static inline u32 l1_cache_bytes(void)
+static inline u32 l1_dcache_bytes(void)
 {
 	return L1_CACHE_BYTES;
 }
+
+static inline u32 l1_icache_shift(void)
+{
+	return L1_CACHE_SHIFT;
+}
+
+static inline u32 l1_icache_bytes(void)
+{
+	return L1_CACHE_BYTES;
+}
+
 #endif
 #endif /* ! __ASSEMBLY__ */
 
diff --git a/arch/powerpc/include/asm/cacheflush.h b/arch/powerpc/include/asm/cacheflush.h
index eef388f2659f..ed57843ef452 100644
--- a/arch/powerpc/include/asm/cacheflush.h
+++ b/arch/powerpc/include/asm/cacheflush.h
@@ -63,8 +63,8 @@ static inline void __flush_dcache_icache_phys(unsigned long physaddr)
  */
 static inline void flush_dcache_range(unsigned long start, unsigned long stop)
 {
-	unsigned long shift = l1_cache_shift();
-	unsigned long bytes = l1_cache_bytes();
+	unsigned long shift = l1_dcache_shift();
+	unsigned long bytes = l1_dcache_bytes();
 	void *addr = (void *)(start & ~(bytes - 1));
 	unsigned long size = stop - (unsigned long)addr + (bytes - 1);
 	unsigned long i;
@@ -89,8 +89,8 @@ static inline void flush_dcache_range(unsigned long start, unsigned long stop)
  */
 static inline void clean_dcache_range(unsigned long start, unsigned long stop)
 {
-	unsigned long shift = l1_cache_shift();
-	unsigned long bytes = l1_cache_bytes();
+	unsigned long shift = l1_dcache_shift();
+	unsigned long bytes = l1_dcache_bytes();
 	void *addr = (void *)(start & ~(bytes - 1));
 	unsigned long size = stop - (unsigned long)addr + (bytes - 1);
 	unsigned long i;
@@ -108,8 +108,8 @@ static inline void clean_dcache_range(unsigned long start, unsigned long stop)
 static inline void invalidate_dcache_range(unsigned long start,
 					   unsigned long stop)
 {
-	unsigned long shift = l1_cache_shift();
-	unsigned long bytes = l1_cache_bytes();
+	unsigned long shift = l1_dcache_shift();
+	unsigned long bytes = l1_dcache_bytes();
 	void *addr = (void *)(start & ~(bytes - 1));
 	unsigned long size = stop - (unsigned long)addr + (bytes - 1);
 	unsigned long i;
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 3/6] powerpc: Convert flush_icache_range & friends to C
  2019-08-15  4:10 [PATCH 0/6] powerpc: convert cache asm to C Alastair D'Silva
  2019-08-15  4:10 ` [PATCH 1/6] powerpc: Allow flush_icache_range to work across ranges >4GB Alastair D'Silva
  2019-08-15  4:10 ` [PATCH 2/6] powerpc: define helpers to get L1 icache sizes Alastair D'Silva
@ 2019-08-15  4:10 ` Alastair D'Silva
  2019-08-15  7:29   ` christophe leroy
  2019-08-15  4:10 ` [PATCH 4/6] powerpc: Chunk calls to flush_dcache_range in arch_*_memory Alastair D'Silva
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 13+ messages in thread
From: Alastair D'Silva @ 2019-08-15  4:10 UTC (permalink / raw)
  To: alastair
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Christophe Leroy, Thomas Gleixner, Qian Cai, Nicholas Piggin,
	Greg Kroah-Hartman, Allison Randal, Andrew Morton,
	David Hildenbrand, Michal Hocko, Mike Rapoport, linuxppc-dev,
	linux-kernel

From: Alastair D'Silva <alastair@d-silva.org>

Similar to commit 22e9c88d486a
("powerpc/64: reuse PPC32 static inline flush_dcache_range()")
this patch converts flush_icache_range() to C, and reimplements the
following functions as wrappers around it:
__flush_dcache_icache
__flush_dcache_icache_phys

This was done as we discovered a long-standing bug where the length of the
range was truncated due to using a 32 bit shift instead of a 64 bit one.

By converting these functions to C, it becomes easier to maintain.

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
---
 arch/powerpc/include/asm/cache.h      |  26 +++---
 arch/powerpc/include/asm/cacheflush.h |  32 ++++---
 arch/powerpc/kernel/misc_32.S         | 117 --------------------------
 arch/powerpc/kernel/misc_64.S         |  97 ---------------------
 arch/powerpc/mm/mem.c                 |  71 +++++++++++++++-
 5 files changed, 102 insertions(+), 241 deletions(-)

diff --git a/arch/powerpc/include/asm/cache.h b/arch/powerpc/include/asm/cache.h
index f852d5cd746c..728f154204db 100644
--- a/arch/powerpc/include/asm/cache.h
+++ b/arch/powerpc/include/asm/cache.h
@@ -98,20 +98,7 @@ static inline u32 l1_icache_bytes(void)
 #endif
 #endif /* ! __ASSEMBLY__ */
 
-#if defined(__ASSEMBLY__)
-/*
- * For a snooping icache, we still need a dummy icbi to purge all the
- * prefetched instructions from the ifetch buffers. We also need a sync
- * before the icbi to order the the actual stores to memory that might
- * have modified instructions with the icbi.
- */
-#define PURGE_PREFETCHED_INS	\
-	sync;			\
-	icbi	0,r3;		\
-	sync;			\
-	isync
-
-#else
+#if !defined(__ASSEMBLY__)
 #define __read_mostly __attribute__((__section__(".data..read_mostly")))
 
 #ifdef CONFIG_PPC_BOOK3S_32
@@ -145,6 +132,17 @@ static inline void dcbst(void *addr)
 {
 	__asm__ __volatile__ ("dcbst %y0" : : "Z"(*(u8 *)addr) : "memory");
 }
+
+static inline void icbi(void *addr)
+{
+	__asm__ __volatile__ ("icbi 0, %0" : : "r"(addr) : "memory");
+}
+
+static inline void iccci(void)
+{
+	__asm__ __volatile__ ("iccci 0, r0");
+}
+
 #endif /* !__ASSEMBLY__ */
 #endif /* __KERNEL__ */
 #endif /* _ASM_POWERPC_CACHE_H */
diff --git a/arch/powerpc/include/asm/cacheflush.h b/arch/powerpc/include/asm/cacheflush.h
index ed57843ef452..4c3377aff8ed 100644
--- a/arch/powerpc/include/asm/cacheflush.h
+++ b/arch/powerpc/include/asm/cacheflush.h
@@ -42,24 +42,18 @@ extern void flush_dcache_page(struct page *page);
 #define flush_dcache_mmap_lock(mapping)		do { } while (0)
 #define flush_dcache_mmap_unlock(mapping)	do { } while (0)
 
-extern void flush_icache_range(unsigned long, unsigned long);
+void flush_icache_range(unsigned long start, unsigned long stop);
 extern void flush_icache_user_range(struct vm_area_struct *vma,
 				    struct page *page, unsigned long addr,
 				    int len);
-extern void __flush_dcache_icache(void *page_va);
 extern void flush_dcache_icache_page(struct page *page);
-#if defined(CONFIG_PPC32) && !defined(CONFIG_BOOKE)
-extern void __flush_dcache_icache_phys(unsigned long physaddr);
-#else
-static inline void __flush_dcache_icache_phys(unsigned long physaddr)
-{
-	BUG();
-}
-#endif
 
-/*
- * Write any modified data cache blocks out to memory and invalidate them.
+/**
+ * flush_dcache_range(): Write any modified data cache blocks out to memory and invalidate them.
  * Does not invalidate the corresponding instruction cache blocks.
+ *
+ * @start: the start address
+ * @stop: the stop address (exclusive)
  */
 static inline void flush_dcache_range(unsigned long start, unsigned long stop)
 {
@@ -82,6 +76,20 @@ static inline void flush_dcache_range(unsigned long start, unsigned long stop)
 		isync();
 }
 
+/**
+ * __flush_dcache_icache(): Flush a particular page from the data cache to RAM.
+ * Note: this is necessary because the instruction cache does *not*
+ * snoop from the data cache.
+ *
+ * @page: the address of the page to flush
+ */
+static inline void __flush_dcache_icache(void *page)
+{
+	unsigned long page_addr = (unsigned long)page;
+
+	flush_icache_range(page_addr, page_addr + PAGE_SIZE);
+}
+
 /*
  * Write any modified data cache blocks out to memory.
  * Does not invalidate the corresponding cache lines (especially for
diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
index fe4bd321730e..12b95e6799d4 100644
--- a/arch/powerpc/kernel/misc_32.S
+++ b/arch/powerpc/kernel/misc_32.S
@@ -318,123 +318,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_UNIFIED_ID_CACHE)
 EXPORT_SYMBOL(flush_instruction_cache)
 #endif /* CONFIG_PPC_8xx */
 
-/*
- * Write any modified data cache blocks out to memory
- * and invalidate the corresponding instruction cache blocks.
- * This is a no-op on the 601.
- *
- * flush_icache_range(unsigned long start, unsigned long stop)
- */
-_GLOBAL(flush_icache_range)
-BEGIN_FTR_SECTION
-	PURGE_PREFETCHED_INS
-	blr				/* for 601, do nothing */
-END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
-	rlwinm	r3,r3,0,0,31 - L1_CACHE_SHIFT
-	subf	r4,r3,r4
-	addi	r4,r4,L1_CACHE_BYTES - 1
-	srwi.	r4,r4,L1_CACHE_SHIFT
-	beqlr
-	mtctr	r4
-	mr	r6,r3
-1:	dcbst	0,r3
-	addi	r3,r3,L1_CACHE_BYTES
-	bdnz	1b
-	sync				/* wait for dcbst's to get to ram */
-#ifndef CONFIG_44x
-	mtctr	r4
-2:	icbi	0,r6
-	addi	r6,r6,L1_CACHE_BYTES
-	bdnz	2b
-#else
-	/* Flash invalidate on 44x because we are passed kmapped addresses and
-	   this doesn't work for userspace pages due to the virtually tagged
-	   icache.  Sigh. */
-	iccci	0, r0
-#endif
-	sync				/* additional sync needed on g4 */
-	isync
-	blr
-_ASM_NOKPROBE_SYMBOL(flush_icache_range)
-EXPORT_SYMBOL(flush_icache_range)
-
-/*
- * Flush a particular page from the data cache to RAM.
- * Note: this is necessary because the instruction cache does *not*
- * snoop from the data cache.
- * This is a no-op on the 601 which has a unified cache.
- *
- *	void __flush_dcache_icache(void *page)
- */
-_GLOBAL(__flush_dcache_icache)
-BEGIN_FTR_SECTION
-	PURGE_PREFETCHED_INS
-	blr
-END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
-	rlwinm	r3,r3,0,0,31-PAGE_SHIFT		/* Get page base address */
-	li	r4,PAGE_SIZE/L1_CACHE_BYTES	/* Number of lines in a page */
-	mtctr	r4
-	mr	r6,r3
-0:	dcbst	0,r3				/* Write line to ram */
-	addi	r3,r3,L1_CACHE_BYTES
-	bdnz	0b
-	sync
-#ifdef CONFIG_44x
-	/* We don't flush the icache on 44x. Those have a virtual icache
-	 * and we don't have access to the virtual address here (it's
-	 * not the page vaddr but where it's mapped in user space). The
-	 * flushing of the icache on these is handled elsewhere, when
-	 * a change in the address space occurs, before returning to
-	 * user space
-	 */
-BEGIN_MMU_FTR_SECTION
-	blr
-END_MMU_FTR_SECTION_IFSET(MMU_FTR_TYPE_44x)
-#endif /* CONFIG_44x */
-	mtctr	r4
-1:	icbi	0,r6
-	addi	r6,r6,L1_CACHE_BYTES
-	bdnz	1b
-	sync
-	isync
-	blr
-
-#ifndef CONFIG_BOOKE
-/*
- * Flush a particular page from the data cache to RAM, identified
- * by its physical address.  We turn off the MMU so we can just use
- * the physical address (this may be a highmem page without a kernel
- * mapping).
- *
- *	void __flush_dcache_icache_phys(unsigned long physaddr)
- */
-_GLOBAL(__flush_dcache_icache_phys)
-BEGIN_FTR_SECTION
-	PURGE_PREFETCHED_INS
-	blr					/* for 601, do nothing */
-END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
-	mfmsr	r10
-	rlwinm	r0,r10,0,28,26			/* clear DR */
-	mtmsr	r0
-	isync
-	rlwinm	r3,r3,0,0,31-PAGE_SHIFT		/* Get page base address */
-	li	r4,PAGE_SIZE/L1_CACHE_BYTES	/* Number of lines in a page */
-	mtctr	r4
-	mr	r6,r3
-0:	dcbst	0,r3				/* Write line to ram */
-	addi	r3,r3,L1_CACHE_BYTES
-	bdnz	0b
-	sync
-	mtctr	r4
-1:	icbi	0,r6
-	addi	r6,r6,L1_CACHE_BYTES
-	bdnz	1b
-	sync
-	mtmsr	r10				/* restore DR */
-	isync
-	blr
-#endif /* CONFIG_BOOKE */
-
 /*
  * Copy a whole page.  We use the dcbz instruction on the destination
  * to reduce memory traffic (it eliminates the unnecessary reads of
diff --git a/arch/powerpc/kernel/misc_64.S b/arch/powerpc/kernel/misc_64.S
index 9bc0aa9aeb65..fa451186620f 100644
--- a/arch/powerpc/kernel/misc_64.S
+++ b/arch/powerpc/kernel/misc_64.S
@@ -54,103 +54,6 @@ PPC64_CACHES:
 	.tc		ppc64_caches[TC],ppc64_caches
 	.section	".text"
 
-/*
- * Write any modified data cache blocks out to memory
- * and invalidate the corresponding instruction cache blocks.
- *
- * flush_icache_range(unsigned long start, unsigned long stop)
- *
- *   flush all bytes from start through stop-1 inclusive
- */
-
-_GLOBAL_TOC(flush_icache_range)
-BEGIN_FTR_SECTION
-	PURGE_PREFETCHED_INS
-	blr
-END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
-/*
- * Flush the data cache to memory 
- * 
- * Different systems have different cache line sizes
- * and in some cases i-cache and d-cache line sizes differ from
- * each other.
- */
- 	ld	r10,PPC64_CACHES@toc(r2)
-	lwz	r7,DCACHEL1BLOCKSIZE(r10)/* Get cache block size */
-	addi	r5,r7,-1
-	andc	r6,r3,r5		/* round low to line bdy */
-	subf	r8,r6,r4		/* compute length */
-	add	r8,r8,r5		/* ensure we get enough */
-	lwz	r9,DCACHEL1LOGBLOCKSIZE(r10)	/* Get log-2 of cache block size */
-	srd.	r8,r8,r9		/* compute line count */
-	beqlr				/* nothing to do? */
-	mtctr	r8
-1:	dcbst	0,r6
-	add	r6,r6,r7
-	bdnz	1b
-	sync
-
-/* Now invalidate the instruction cache */
-	
-	lwz	r7,ICACHEL1BLOCKSIZE(r10)	/* Get Icache block size */
-	addi	r5,r7,-1
-	andc	r6,r3,r5		/* round low to line bdy */
-	subf	r8,r6,r4		/* compute length */
-	add	r8,r8,r5
-	lwz	r9,ICACHEL1LOGBLOCKSIZE(r10)	/* Get log-2 of Icache block size */
-	srd.	r8,r8,r9		/* compute line count */
-	beqlr				/* nothing to do? */
-	mtctr	r8
-2:	icbi	0,r6
-	add	r6,r6,r7
-	bdnz	2b
-	isync
-	blr
-_ASM_NOKPROBE_SYMBOL(flush_icache_range)
-EXPORT_SYMBOL(flush_icache_range)
-
-/*
- * Flush a particular page from the data cache to RAM.
- * Note: this is necessary because the instruction cache does *not*
- * snoop from the data cache.
- *
- *	void __flush_dcache_icache(void *page)
- */
-_GLOBAL(__flush_dcache_icache)
-/*
- * Flush the data cache to memory 
- * 
- * Different systems have different cache line sizes
- */
-
-BEGIN_FTR_SECTION
-	PURGE_PREFETCHED_INS
-	blr
-END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
-
-/* Flush the dcache */
- 	ld	r7,PPC64_CACHES@toc(r2)
-	clrrdi	r3,r3,PAGE_SHIFT           	    /* Page align */
-	lwz	r4,DCACHEL1BLOCKSPERPAGE(r7)	/* Get # dcache blocks per page */
-	lwz	r5,DCACHEL1BLOCKSIZE(r7)	/* Get dcache block size */
-	mr	r6,r3
-	mtctr	r4
-0:	dcbst	0,r6
-	add	r6,r6,r5
-	bdnz	0b
-	sync
-
-/* Now invalidate the icache */	
-
-	lwz	r4,ICACHEL1BLOCKSPERPAGE(r7)	/* Get # icache blocks per page */
-	lwz	r5,ICACHEL1BLOCKSIZE(r7)	/* Get icache block size */
-	mtctr	r4
-1:	icbi	0,r3
-	add	r3,r3,r5
-	bdnz	1b
-	isync
-	blr
-
 _GLOBAL(__bswapdi2)
 EXPORT_SYMBOL(__bswapdi2)
 	srdi	r8,r3,32
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index 9191a66b3bc5..5400da87a804 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -321,6 +321,66 @@ void free_initmem(void)
 	free_initmem_default(POISON_FREE_INITMEM);
 }
 
+/**
+ * flush_icache_range: Write any modified data cache blocks out to memory
+ * and invalidate the corresponding blocks in the instruction cache
+ *
+ * Generic code will call this after writing memory, before executing from it.
+ *
+ * @start: the start address
+ * @stop: the stop address (exclusive)
+ */
+void flush_icache_range(unsigned long start, unsigned long stop)
+{
+	unsigned long shift = l1_dcache_shift();
+	unsigned long bytes = l1_dcache_bytes();
+	void *addr = (void *)(start & ~(bytes - 1));
+	unsigned long size = stop - (unsigned long)addr + (bytes - 1);
+	unsigned long i;
+
+	if (cpu_has_feature(CPU_FTR_COHERENT_ICACHE)) {
+		/* For a snooping icache, we still need a dummy icbi to purge all the
+		 * prefetched instructions from the ifetch buffers. We also need a sync
+		 * before the icbi to order the the actual stores to memory that might
+		 * have modified instructions with the icbi.
+		 */
+		mb(); /* sync */
+		icbi((void *)start);
+		mb(); /* sync */
+		isync();
+		return;
+	}
+
+	/* Flush the data cache to memory */
+	for (i = 0; i < size >> shift; i++, addr += bytes)
+		dcbst(addr);
+
+	mb();	/* sync */
+
+#ifdef CONFIG_44x
+	/* Flash invalidate on 44x because we are passed kmapped addresses and
+	 * this doesn't work for userspace pages due to the virtually tagged
+	 * icache.
+	 */
+
+	iccci();
+#else
+	shift = l1_icache_shift();
+	bytes = l1_icache_bytes();
+	addr = (void *)(start & ~(bytes - 1));
+	size = stop - (unsigned long)addr + (bytes - 1);
+
+	/* Now invalidate the instruction cache */
+	for (i = 0; i < size >> shift; i++, addr += bytes)
+		icbi(addr);
+#endif
+
+	if (!IS_ENABLED(CONFIG_PPC64))
+		mb(); /* additional sync needed on g4 */
+	isync();
+}
+EXPORT_SYMBOL(flush_icache_range);
+
 /*
  * This is called when a page has been modified by the kernel.
  * It just marks the page as not i-cache clean.  We do the i-cache
@@ -353,7 +413,16 @@ void flush_dcache_icache_page(struct page *page)
 		__flush_dcache_icache(start);
 		kunmap_atomic(start);
 	} else {
-		__flush_dcache_icache_phys(page_to_pfn(page) << PAGE_SHIFT);
+		unsigned long msr = mfmsr();
+
+		/* Clear the DR bit so that we operate on physical
+		 * rather than virtual addresses
+		 */
+		mtmsr(msr & ~(MSR_DR));
+
+		__flush_dcache_icache((void *)physaddr);
+
+		mtmsr(msr);
 	}
 #endif
 }
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 4/6] powerpc: Chunk calls to flush_dcache_range in arch_*_memory
  2019-08-15  4:10 [PATCH 0/6] powerpc: convert cache asm to C Alastair D'Silva
                   ` (2 preceding siblings ...)
  2019-08-15  4:10 ` [PATCH 3/6] powerpc: Convert flush_icache_range & friends to C Alastair D'Silva
@ 2019-08-15  4:10 ` Alastair D'Silva
  2019-08-15  6:54   ` Mike Rapoport
  2019-08-15  7:36   ` christophe leroy
  2019-08-15  4:10 ` [PATCH 5/6] powerpc: Remove 'extern' from func prototypes in cache headers Alastair D'Silva
  2019-08-15  4:10 ` [PATCH 6/6] powerpc: Don't flush caches when adding memory Alastair D'Silva
  5 siblings, 2 replies; 13+ messages in thread
From: Alastair D'Silva @ 2019-08-15  4:10 UTC (permalink / raw)
  To: alastair
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Christophe Leroy, Qian Cai, Allison Randal, Thomas Gleixner,
	Nicholas Piggin, Greg Kroah-Hartman, Andrew Morton,
	Mike Rapoport, David Hildenbrand, linuxppc-dev, linux-kernel

From: Alastair D'Silva <alastair@d-silva.org>

When presented with large amounts of memory being hotplugged
(in my test case, ~890GB), the call to flush_dcache_range takes
a while (~50 seconds), triggering RCU stalls.

This patch breaks up the call into 16GB chunks, calling
cond_resched() inbetween to allow the scheduler to run.

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
---
 arch/powerpc/mm/mem.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index 5400da87a804..fb0d5e9aa11b 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -104,11 +104,14 @@ int __weak remove_section_mapping(unsigned long start, unsigned long end)
 	return -ENODEV;
 }
 
+#define FLUSH_CHUNK_SIZE (16ull * 1024ull * 1024ull * 1024ull)
+
 int __ref arch_add_memory(int nid, u64 start, u64 size,
 			struct mhp_restrictions *restrictions)
 {
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
+	unsigned long i;
 	int rc;
 
 	resize_hpt_for_hotplug(memblock_phys_mem_size());
@@ -120,7 +123,11 @@ int __ref arch_add_memory(int nid, u64 start, u64 size,
 			start, start + size, rc);
 		return -EFAULT;
 	}
-	flush_dcache_range(start, start + size);
+
+	for (i = 0; i < size; i += FLUSH_CHUNK_SIZE) {
+		flush_dcache_range(start + i, min(start + size, start + i + FLUSH_CHUNK_SIZE));
+		cond_resched();
+	}
 
 	return __add_pages(nid, start_pfn, nr_pages, restrictions);
 }
@@ -131,13 +138,18 @@ void __ref arch_remove_memory(int nid, u64 start, u64 size,
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
 	struct page *page = pfn_to_page(start_pfn) + vmem_altmap_offset(altmap);
+	unsigned long i;
 	int ret;
 
 	__remove_pages(page_zone(page), start_pfn, nr_pages, altmap);
 
 	/* Remove htab bolted mappings for this section of memory */
 	start = (unsigned long)__va(start);
-	flush_dcache_range(start, start + size);
+	for (i = 0; i < size; i += FLUSH_CHUNK_SIZE) {
+		flush_dcache_range(start + i, min(start + size, start + i + FLUSH_CHUNK_SIZE));
+		cond_resched();
+	}
+
 	ret = remove_section_mapping(start, start + size);
 	WARN_ON_ONCE(ret);
 
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 5/6] powerpc: Remove 'extern' from func prototypes in cache headers
  2019-08-15  4:10 [PATCH 0/6] powerpc: convert cache asm to C Alastair D'Silva
                   ` (3 preceding siblings ...)
  2019-08-15  4:10 ` [PATCH 4/6] powerpc: Chunk calls to flush_dcache_range in arch_*_memory Alastair D'Silva
@ 2019-08-15  4:10 ` Alastair D'Silva
  2019-08-15  4:10 ` [PATCH 6/6] powerpc: Don't flush caches when adding memory Alastair D'Silva
  5 siblings, 0 replies; 13+ messages in thread
From: Alastair D'Silva @ 2019-08-15  4:10 UTC (permalink / raw)
  To: alastair
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Christophe Leroy, Allison Randal, Thomas Gleixner, Qian Cai,
	Nicholas Piggin, Greg Kroah-Hartman, Andrew Morton,
	David Hildenbrand, Mike Rapoport, linuxppc-dev, linux-kernel

From: Alastair D'Silva <alastair@d-silva.org>

The 'extern' keyword does not value-add for function prototypes.

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
---
 arch/powerpc/include/asm/cache.h      | 8 ++++----
 arch/powerpc/include/asm/cacheflush.h | 6 +++---
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/include/asm/cache.h b/arch/powerpc/include/asm/cache.h
index 728f154204db..c5c096e968e0 100644
--- a/arch/powerpc/include/asm/cache.h
+++ b/arch/powerpc/include/asm/cache.h
@@ -102,10 +102,10 @@ static inline u32 l1_icache_bytes(void)
 #define __read_mostly __attribute__((__section__(".data..read_mostly")))
 
 #ifdef CONFIG_PPC_BOOK3S_32
-extern long _get_L2CR(void);
-extern long _get_L3CR(void);
-extern void _set_L2CR(unsigned long);
-extern void _set_L3CR(unsigned long);
+long _get_L2CR(void);
+long _get_L3CR(void);
+void _set_L2CR(unsigned long val);
+void _set_L3CR(unsigned long val);
 #else
 #define _get_L2CR()	0L
 #define _get_L3CR()	0L
diff --git a/arch/powerpc/include/asm/cacheflush.h b/arch/powerpc/include/asm/cacheflush.h
index 4c3377aff8ed..1826bf2cc137 100644
--- a/arch/powerpc/include/asm/cacheflush.h
+++ b/arch/powerpc/include/asm/cacheflush.h
@@ -38,15 +38,15 @@ static inline void flush_cache_vmap(unsigned long start, unsigned long end) { }
 #endif
 
 #define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1
-extern void flush_dcache_page(struct page *page);
+void flush_dcache_page(struct page *page);
 #define flush_dcache_mmap_lock(mapping)		do { } while (0)
 #define flush_dcache_mmap_unlock(mapping)	do { } while (0)
 
 void flush_icache_range(unsigned long start, unsigned long stop);
-extern void flush_icache_user_range(struct vm_area_struct *vma,
+void flush_icache_user_range(struct vm_area_struct *vma,
 				    struct page *page, unsigned long addr,
 				    int len);
-extern void flush_dcache_icache_page(struct page *page);
+void flush_dcache_icache_page(struct page *page);
 
 /**
  * flush_dcache_range(): Write any modified data cache blocks out to memory and invalidate them.
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 6/6] powerpc: Don't flush caches when adding memory
  2019-08-15  4:10 [PATCH 0/6] powerpc: convert cache asm to C Alastair D'Silva
                   ` (4 preceding siblings ...)
  2019-08-15  4:10 ` [PATCH 5/6] powerpc: Remove 'extern' from func prototypes in cache headers Alastair D'Silva
@ 2019-08-15  4:10 ` Alastair D'Silva
  5 siblings, 0 replies; 13+ messages in thread
From: Alastair D'Silva @ 2019-08-15  4:10 UTC (permalink / raw)
  To: alastair
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Christophe Leroy, Qian Cai, Greg Kroah-Hartman, Thomas Gleixner,
	Nicholas Piggin, Allison Randal, Andrew Morton,
	David Hildenbrand, Mike Rapoport, linuxppc-dev, linux-kernel

From: Alastair D'Silva <alastair@d-silva.org>

This operation takes a significant amount of time when hotplugging
large amounts of memory (~50 seconds with 890GB of persistent memory).

This was orignally in commit fb5924fddf9e
("powerpc/mm: Flush cache on memory hot(un)plug") to support memtrace,
but the flush on add is not needed as it is flushed on remove.

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
---
 arch/powerpc/mm/mem.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index fb0d5e9aa11b..43be99de7c9a 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -111,7 +111,6 @@ int __ref arch_add_memory(int nid, u64 start, u64 size,
 {
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
-	unsigned long i;
 	int rc;
 
 	resize_hpt_for_hotplug(memblock_phys_mem_size());
@@ -124,11 +123,6 @@ int __ref arch_add_memory(int nid, u64 start, u64 size,
 		return -EFAULT;
 	}
 
-	for (i = 0; i < size; i += FLUSH_CHUNK_SIZE) {
-		flush_dcache_range(start + i, min(start + size, start + i + FLUSH_CHUNK_SIZE));
-		cond_resched();
-	}
-
 	return __add_pages(nid, start_pfn, nr_pages, restrictions);
 }
 
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 4/6] powerpc: Chunk calls to flush_dcache_range in arch_*_memory
  2019-08-15  4:10 ` [PATCH 4/6] powerpc: Chunk calls to flush_dcache_range in arch_*_memory Alastair D'Silva
@ 2019-08-15  6:54   ` Mike Rapoport
  2019-08-15  7:36   ` christophe leroy
  1 sibling, 0 replies; 13+ messages in thread
From: Mike Rapoport @ 2019-08-15  6:54 UTC (permalink / raw)
  To: Alastair D'Silva
  Cc: alastair, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Christophe Leroy, Qian Cai, Allison Randal,
	Thomas Gleixner, Nicholas Piggin, Greg Kroah-Hartman,
	Andrew Morton, Mike Rapoport, David Hildenbrand, linuxppc-dev,
	linux-kernel

On Thu, Aug 15, 2019 at 02:10:49PM +1000, Alastair D'Silva wrote:
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> When presented with large amounts of memory being hotplugged
> (in my test case, ~890GB), the call to flush_dcache_range takes
> a while (~50 seconds), triggering RCU stalls.
> 
> This patch breaks up the call into 16GB chunks, calling
> cond_resched() inbetween to allow the scheduler to run.
> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> ---
>  arch/powerpc/mm/mem.c | 16 ++++++++++++++--
>  1 file changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
> index 5400da87a804..fb0d5e9aa11b 100644
> --- a/arch/powerpc/mm/mem.c
> +++ b/arch/powerpc/mm/mem.c
> @@ -104,11 +104,14 @@ int __weak remove_section_mapping(unsigned long start, unsigned long end)
>  	return -ENODEV;
>  }
> 
> +#define FLUSH_CHUNK_SIZE (16ull * 1024ull * 1024ull * 1024ull)

IMHO this begs for adding SZ_16G to include/linux/sizes.h and using it here

> +
>  int __ref arch_add_memory(int nid, u64 start, u64 size,
>  			struct mhp_restrictions *restrictions)
>  {
>  	unsigned long start_pfn = start >> PAGE_SHIFT;
>  	unsigned long nr_pages = size >> PAGE_SHIFT;
> +	unsigned long i;
>  	int rc;
> 
>  	resize_hpt_for_hotplug(memblock_phys_mem_size());
> @@ -120,7 +123,11 @@ int __ref arch_add_memory(int nid, u64 start, u64 size,
>  			start, start + size, rc);
>  		return -EFAULT;
>  	}
> -	flush_dcache_range(start, start + size);
> +
> +	for (i = 0; i < size; i += FLUSH_CHUNK_SIZE) {
> +		flush_dcache_range(start + i, min(start + size, start + i + FLUSH_CHUNK_SIZE));
> +		cond_resched();
> +	}
> 
>  	return __add_pages(nid, start_pfn, nr_pages, restrictions);
>  }
> @@ -131,13 +138,18 @@ void __ref arch_remove_memory(int nid, u64 start, u64 size,
>  	unsigned long start_pfn = start >> PAGE_SHIFT;
>  	unsigned long nr_pages = size >> PAGE_SHIFT;
>  	struct page *page = pfn_to_page(start_pfn) + vmem_altmap_offset(altmap);
> +	unsigned long i;
>  	int ret;
> 
>  	__remove_pages(page_zone(page), start_pfn, nr_pages, altmap);
> 
>  	/* Remove htab bolted mappings for this section of memory */
>  	start = (unsigned long)__va(start);
> -	flush_dcache_range(start, start + size);
> +	for (i = 0; i < size; i += FLUSH_CHUNK_SIZE) {
> +		flush_dcache_range(start + i, min(start + size, start + i + FLUSH_CHUNK_SIZE));
> +		cond_resched();
> +	}
> +
>  	ret = remove_section_mapping(start, start + size);
>  	WARN_ON_ONCE(ret);
> 
> -- 
> 2.21.0
> 

-- 
Sincerely yours,
Mike.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/6] powerpc: Convert flush_icache_range & friends to C
  2019-08-15  4:10 ` [PATCH 3/6] powerpc: Convert flush_icache_range & friends to C Alastair D'Silva
@ 2019-08-15  7:29   ` christophe leroy
  2019-08-16  1:44     ` Alastair D'Silva
  2019-08-16 15:57     ` Christophe Leroy
  0 siblings, 2 replies; 13+ messages in thread
From: christophe leroy @ 2019-08-15  7:29 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Thomas Gleixner, Qian Cai, Nicholas Piggin, Greg Kroah-Hartman,
	Allison Randal, Andrew Morton, David Hildenbrand, Michal Hocko,
	Mike Rapoport, linuxppc-dev, linux-kernel



Le 15/08/2019 à 06:10, Alastair D'Silva a écrit :
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> Similar to commit 22e9c88d486a
> ("powerpc/64: reuse PPC32 static inline flush_dcache_range()")
> this patch converts flush_icache_range() to C, and reimplements the
> following functions as wrappers around it:
> __flush_dcache_icache
> __flush_dcache_icache_phys

Not sure you can do that for __flush_dcache_icache_phys(), see detailed 
comments below

> 
> This was done as we discovered a long-standing bug where the length of the
> range was truncated due to using a 32 bit shift instead of a 64 bit one.
> 
> By converting these functions to C, it becomes easier to maintain.
> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> ---
>   arch/powerpc/include/asm/cache.h      |  26 +++---
>   arch/powerpc/include/asm/cacheflush.h |  32 ++++---
>   arch/powerpc/kernel/misc_32.S         | 117 --------------------------
>   arch/powerpc/kernel/misc_64.S         |  97 ---------------------
>   arch/powerpc/mm/mem.c                 |  71 +++++++++++++++-
>   5 files changed, 102 insertions(+), 241 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/cache.h b/arch/powerpc/include/asm/cache.h
> index f852d5cd746c..728f154204db 100644
> --- a/arch/powerpc/include/asm/cache.h
> +++ b/arch/powerpc/include/asm/cache.h
> @@ -98,20 +98,7 @@ static inline u32 l1_icache_bytes(void)
>   #endif
>   #endif /* ! __ASSEMBLY__ */
>   
> -#if defined(__ASSEMBLY__)
> -/*
> - * For a snooping icache, we still need a dummy icbi to purge all the
> - * prefetched instructions from the ifetch buffers. We also need a sync
> - * before the icbi to order the the actual stores to memory that might
> - * have modified instructions with the icbi.
> - */
> -#define PURGE_PREFETCHED_INS	\
> -	sync;			\
> -	icbi	0,r3;		\
> -	sync;			\
> -	isync

Is this still used anywhere now ?

> -
> -#else
> +#if !defined(__ASSEMBLY__)
>   #define __read_mostly __attribute__((__section__(".data..read_mostly")))
>   
>   #ifdef CONFIG_PPC_BOOK3S_32
> @@ -145,6 +132,17 @@ static inline void dcbst(void *addr)
>   {
>   	__asm__ __volatile__ ("dcbst %y0" : : "Z"(*(u8 *)addr) : "memory");
>   }
> +
> +static inline void icbi(void *addr)
> +{
> +	__asm__ __volatile__ ("icbi 0, %0" : : "r"(addr) : "memory");
> +}
> +
> +static inline void iccci(void)
> +{
> +	__asm__ __volatile__ ("iccci 0, r0");

I think you need the "memory" clobber too here.

> +}
> +
>   #endif /* !__ASSEMBLY__ */
>   #endif /* __KERNEL__ */
>   #endif /* _ASM_POWERPC_CACHE_H */
> diff --git a/arch/powerpc/include/asm/cacheflush.h b/arch/powerpc/include/asm/cacheflush.h
> index ed57843ef452..4c3377aff8ed 100644
> --- a/arch/powerpc/include/asm/cacheflush.h
> +++ b/arch/powerpc/include/asm/cacheflush.h
> @@ -42,24 +42,18 @@ extern void flush_dcache_page(struct page *page);
>   #define flush_dcache_mmap_lock(mapping)		do { } while (0)
>   #define flush_dcache_mmap_unlock(mapping)	do { } while (0)
>   
> -extern void flush_icache_range(unsigned long, unsigned long);
> +void flush_icache_range(unsigned long start, unsigned long stop);
>   extern void flush_icache_user_range(struct vm_area_struct *vma,
>   				    struct page *page, unsigned long addr,
>   				    int len);
> -extern void __flush_dcache_icache(void *page_va);
>   extern void flush_dcache_icache_page(struct page *page);
> -#if defined(CONFIG_PPC32) && !defined(CONFIG_BOOKE)
> -extern void __flush_dcache_icache_phys(unsigned long physaddr);
> -#else
> -static inline void __flush_dcache_icache_phys(unsigned long physaddr)
> -{
> -	BUG();
> -}
> -#endif
>   
> -/*
> - * Write any modified data cache blocks out to memory and invalidate them.
> +/**
> + * flush_dcache_range(): Write any modified data cache blocks out to memory and invalidate them.
>    * Does not invalidate the corresponding instruction cache blocks.
> + *
> + * @start: the start address
> + * @stop: the stop address (exclusive)
>    */
>   static inline void flush_dcache_range(unsigned long start, unsigned long stop)
>   {
> @@ -82,6 +76,20 @@ static inline void flush_dcache_range(unsigned long start, unsigned long stop)
>   		isync();
>   }
>   
> +/**
> + * __flush_dcache_icache(): Flush a particular page from the data cache to RAM.
> + * Note: this is necessary because the instruction cache does *not*
> + * snoop from the data cache.
> + *
> + * @page: the address of the page to flush
> + */
> +static inline void __flush_dcache_icache(void *page)
> +{
> +	unsigned long page_addr = (unsigned long)page;

The function is small enough to call this addr instead of page_addr 
without ambiguity.

> +
> +	flush_icache_range(page_addr, page_addr + PAGE_SIZE);

Finally, I think that's not so simple. For the 44x, if MMU_FTR_TYPE_44x 
is set, that's just a clean_dcache_range().
If that feature is not set, it looks like a standard 
flush_icache_range() but without the special CONFIG_4xx handling with 
iccci we have in flush_icache_range()

> +}
> +
>   /*
>    * Write any modified data cache blocks out to memory.
>    * Does not invalidate the corresponding cache lines (especially for
> diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
> index fe4bd321730e..12b95e6799d4 100644
> --- a/arch/powerpc/kernel/misc_32.S
> +++ b/arch/powerpc/kernel/misc_32.S
> @@ -318,123 +318,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_UNIFIED_ID_CACHE)
>   EXPORT_SYMBOL(flush_instruction_cache)
>   #endif /* CONFIG_PPC_8xx */
>   
> -/*
> - * Write any modified data cache blocks out to memory
> - * and invalidate the corresponding instruction cache blocks.
> - * This is a no-op on the 601.
> - *
> - * flush_icache_range(unsigned long start, unsigned long stop)
> - */
> -_GLOBAL(flush_icache_range)
> -BEGIN_FTR_SECTION
> -	PURGE_PREFETCHED_INS
> -	blr				/* for 601, do nothing */
> -END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
> -	rlwinm	r3,r3,0,0,31 - L1_CACHE_SHIFT
> -	subf	r4,r3,r4
> -	addi	r4,r4,L1_CACHE_BYTES - 1
> -	srwi.	r4,r4,L1_CACHE_SHIFT
> -	beqlr
> -	mtctr	r4
> -	mr	r6,r3
> -1:	dcbst	0,r3
> -	addi	r3,r3,L1_CACHE_BYTES
> -	bdnz	1b
> -	sync				/* wait for dcbst's to get to ram */
> -#ifndef CONFIG_44x
> -	mtctr	r4
> -2:	icbi	0,r6
> -	addi	r6,r6,L1_CACHE_BYTES
> -	bdnz	2b
> -#else
> -	/* Flash invalidate on 44x because we are passed kmapped addresses and
> -	   this doesn't work for userspace pages due to the virtually tagged
> -	   icache.  Sigh. */
> -	iccci	0, r0
> -#endif
> -	sync				/* additional sync needed on g4 */
> -	isync
> -	blr
> -_ASM_NOKPROBE_SYMBOL(flush_icache_range)
> -EXPORT_SYMBOL(flush_icache_range)
> -
> -/*
> - * Flush a particular page from the data cache to RAM.
> - * Note: this is necessary because the instruction cache does *not*
> - * snoop from the data cache.
> - * This is a no-op on the 601 which has a unified cache.
> - *
> - *	void __flush_dcache_icache(void *page)
> - */
> -_GLOBAL(__flush_dcache_icache)
> -BEGIN_FTR_SECTION
> -	PURGE_PREFETCHED_INS
> -	blr
> -END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
> -	rlwinm	r3,r3,0,0,31-PAGE_SHIFT		/* Get page base address */
> -	li	r4,PAGE_SIZE/L1_CACHE_BYTES	/* Number of lines in a page */
> -	mtctr	r4
> -	mr	r6,r3
> -0:	dcbst	0,r3				/* Write line to ram */
> -	addi	r3,r3,L1_CACHE_BYTES
> -	bdnz	0b
> -	sync
> -#ifdef CONFIG_44x
> -	/* We don't flush the icache on 44x. Those have a virtual icache
> -	 * and we don't have access to the virtual address here (it's
> -	 * not the page vaddr but where it's mapped in user space). The
> -	 * flushing of the icache on these is handled elsewhere, when
> -	 * a change in the address space occurs, before returning to
> -	 * user space
> -	 */
> -BEGIN_MMU_FTR_SECTION
> -	blr
> -END_MMU_FTR_SECTION_IFSET(MMU_FTR_TYPE_44x)
> -#endif /* CONFIG_44x */
> -	mtctr	r4
> -1:	icbi	0,r6
> -	addi	r6,r6,L1_CACHE_BYTES
> -	bdnz	1b
> -	sync
> -	isync
> -	blr
> -
> -#ifndef CONFIG_BOOKE
> -/*
> - * Flush a particular page from the data cache to RAM, identified
> - * by its physical address.  We turn off the MMU so we can just use
> - * the physical address (this may be a highmem page without a kernel
> - * mapping).
> - *
> - *	void __flush_dcache_icache_phys(unsigned long physaddr)
> - */
> -_GLOBAL(__flush_dcache_icache_phys)
> -BEGIN_FTR_SECTION
> -	PURGE_PREFETCHED_INS
> -	blr					/* for 601, do nothing */
> -END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
> -	mfmsr	r10
> -	rlwinm	r0,r10,0,28,26			/* clear DR */
> -	mtmsr	r0
> -	isync
> -	rlwinm	r3,r3,0,0,31-PAGE_SHIFT		/* Get page base address */
> -	li	r4,PAGE_SIZE/L1_CACHE_BYTES	/* Number of lines in a page */
> -	mtctr	r4
> -	mr	r6,r3
> -0:	dcbst	0,r3				/* Write line to ram */
> -	addi	r3,r3,L1_CACHE_BYTES
> -	bdnz	0b
> -	sync
> -	mtctr	r4
> -1:	icbi	0,r6
> -	addi	r6,r6,L1_CACHE_BYTES
> -	bdnz	1b
> -	sync
> -	mtmsr	r10				/* restore DR */
> -	isync
> -	blr
> -#endif /* CONFIG_BOOKE */
> -
>   /*
>    * Copy a whole page.  We use the dcbz instruction on the destination
>    * to reduce memory traffic (it eliminates the unnecessary reads of
> diff --git a/arch/powerpc/kernel/misc_64.S b/arch/powerpc/kernel/misc_64.S
> index 9bc0aa9aeb65..fa451186620f 100644
> --- a/arch/powerpc/kernel/misc_64.S
> +++ b/arch/powerpc/kernel/misc_64.S
> @@ -54,103 +54,6 @@ PPC64_CACHES:
>   	.tc		ppc64_caches[TC],ppc64_caches
>   	.section	".text"

Is the above still needed at all once all cache functions are gone ?

>   
> -/*
> - * Write any modified data cache blocks out to memory
> - * and invalidate the corresponding instruction cache blocks.
> - *
> - * flush_icache_range(unsigned long start, unsigned long stop)
> - *
> - *   flush all bytes from start through stop-1 inclusive
> - */
> -
> -_GLOBAL_TOC(flush_icache_range)
> -BEGIN_FTR_SECTION
> -	PURGE_PREFETCHED_INS
> -	blr
> -END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
> -/*
> - * Flush the data cache to memory
> - *
> - * Different systems have different cache line sizes
> - * and in some cases i-cache and d-cache line sizes differ from
> - * each other.
> - */
> - 	ld	r10,PPC64_CACHES@toc(r2)
> -	lwz	r7,DCACHEL1BLOCKSIZE(r10)/* Get cache block size */
> -	addi	r5,r7,-1
> -	andc	r6,r3,r5		/* round low to line bdy */
> -	subf	r8,r6,r4		/* compute length */
> -	add	r8,r8,r5		/* ensure we get enough */
> -	lwz	r9,DCACHEL1LOGBLOCKSIZE(r10)	/* Get log-2 of cache block size */
> -	srd.	r8,r8,r9		/* compute line count */
> -	beqlr				/* nothing to do? */
> -	mtctr	r8
> -1:	dcbst	0,r6
> -	add	r6,r6,r7
> -	bdnz	1b
> -	sync
> -
> -/* Now invalidate the instruction cache */
> -	
> -	lwz	r7,ICACHEL1BLOCKSIZE(r10)	/* Get Icache block size */
> -	addi	r5,r7,-1
> -	andc	r6,r3,r5		/* round low to line bdy */
> -	subf	r8,r6,r4		/* compute length */
> -	add	r8,r8,r5
> -	lwz	r9,ICACHEL1LOGBLOCKSIZE(r10)	/* Get log-2 of Icache block size */
> -	srd.	r8,r8,r9		/* compute line count */
> -	beqlr				/* nothing to do? */
> -	mtctr	r8
> -2:	icbi	0,r6
> -	add	r6,r6,r7
> -	bdnz	2b
> -	isync
> -	blr
> -_ASM_NOKPROBE_SYMBOL(flush_icache_range)
> -EXPORT_SYMBOL(flush_icache_range)
> -
> -/*
> - * Flush a particular page from the data cache to RAM.
> - * Note: this is necessary because the instruction cache does *not*
> - * snoop from the data cache.
> - *
> - *	void __flush_dcache_icache(void *page)
> - */
> -_GLOBAL(__flush_dcache_icache)
> -/*
> - * Flush the data cache to memory
> - *
> - * Different systems have different cache line sizes
> - */
> -
> -BEGIN_FTR_SECTION
> -	PURGE_PREFETCHED_INS
> -	blr
> -END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
> -
> -/* Flush the dcache */
> - 	ld	r7,PPC64_CACHES@toc(r2)
> -	clrrdi	r3,r3,PAGE_SHIFT           	    /* Page align */
> -	lwz	r4,DCACHEL1BLOCKSPERPAGE(r7)	/* Get # dcache blocks per page */
> -	lwz	r5,DCACHEL1BLOCKSIZE(r7)	/* Get dcache block size */
> -	mr	r6,r3
> -	mtctr	r4
> -0:	dcbst	0,r6
> -	add	r6,r6,r5
> -	bdnz	0b
> -	sync
> -
> -/* Now invalidate the icache */	
> -
> -	lwz	r4,ICACHEL1BLOCKSPERPAGE(r7)	/* Get # icache blocks per page */
> -	lwz	r5,ICACHEL1BLOCKSIZE(r7)	/* Get icache block size */
> -	mtctr	r4
> -1:	icbi	0,r3
> -	add	r3,r3,r5
> -	bdnz	1b
> -	isync
> -	blr
> -
>   _GLOBAL(__bswapdi2)
>   EXPORT_SYMBOL(__bswapdi2)
>   	srdi	r8,r3,32
> diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
> index 9191a66b3bc5..5400da87a804 100644
> --- a/arch/powerpc/mm/mem.c
> +++ b/arch/powerpc/mm/mem.c
> @@ -321,6 +321,66 @@ void free_initmem(void)
>   	free_initmem_default(POISON_FREE_INITMEM);
>   }
>   
> +/**
> + * flush_icache_range: Write any modified data cache blocks out to memory
> + * and invalidate the corresponding blocks in the instruction cache
> + *
> + * Generic code will call this after writing memory, before executing from it.
> + *
> + * @start: the start address
> + * @stop: the stop address (exclusive)
> + */
> +void flush_icache_range(unsigned long start, unsigned long stop)
> +{
> +	unsigned long shift = l1_dcache_shift();
> +	unsigned long bytes = l1_dcache_bytes();
> +	void *addr = (void *)(start & ~(bytes - 1));
> +	unsigned long size = stop - (unsigned long)addr + (bytes - 1);
> +	unsigned long i;
> +
> +	if (cpu_has_feature(CPU_FTR_COHERENT_ICACHE)) {
> +		/* For a snooping icache, we still need a dummy icbi to purge all the
> +		 * prefetched instructions from the ifetch buffers. We also need a sync
> +		 * before the icbi to order the the actual stores to memory that might
> +		 * have modified instructions with the icbi.
> +		 */
> +		mb(); /* sync */
> +		icbi((void *)start);
> +		mb(); /* sync */
> +		isync();
> +		return;
> +	}
> +
> +	/* Flush the data cache to memory */
> +	for (i = 0; i < size >> shift; i++, addr += bytes)
> +		dcbst(addr);
> +
> +	mb();	/* sync */

Can we call clean_dcache_range() here instead ?

> +
> +#ifdef CONFIG_44x

Could we use the following instead of an #ifdef#else#endif ?

if (IS_ENABLED(CONFIG_44x)) {
} else {
}

> +	/* Flash invalidate on 44x because we are passed kmapped addresses and
> +	 * this doesn't work for userspace pages due to the virtually tagged
> +	 * icache.
> +	 */
> +
> +	iccci();
> +#else
> +	shift = l1_icache_shift();
> +	bytes = l1_icache_bytes();
> +	addr = (void *)(start & ~(bytes - 1));
> +	size = stop - (unsigned long)addr + (bytes - 1);
> +
> +	/* Now invalidate the instruction cache */
> +	for (i = 0; i < size >> shift; i++, addr += bytes)
> +		icbi(addr);
> +#endif
> +
> +	if (!IS_ENABLED(CONFIG_PPC64))
> +		mb(); /* additional sync needed on g4 */
> +	isync();
> +}
> +EXPORT_SYMBOL(flush_icache_range);
> +
>   /*
>    * This is called when a page has been modified by the kernel.
>    * It just marks the page as not i-cache clean.  We do the i-cache
> @@ -353,7 +413,16 @@ void flush_dcache_icache_page(struct page *page)
>   		__flush_dcache_icache(start);
>   		kunmap_atomic(start);
>   	} else {
> -		__flush_dcache_icache_phys(page_to_pfn(page) << PAGE_SHIFT);
> +		unsigned long msr = mfmsr();
> +
> +		/* Clear the DR bit so that we operate on physical
> +		 * rather than virtual addresses
> +		 */
> +		mtmsr(msr & ~(MSR_DR));
> +
> +		__flush_dcache_icache((void *)physaddr);
> +
> +		mtmsr(msr);

You can't do that I think. Once Data MMU is deactivated, you really have 
to take can of what data you access. For instance, the C code can't use 
stack anymore, neither can it directly use any global var. Only vars 
that are in registers can safely be used. As it can't use the stack, it 
is likely that it can't call another function in general case. So I 
don't think we can write in C a function which needs to de-activate data 
MMU.
I think this function must remain in ASM.

And what happened to the following that was around 
__flush_dcache_icache_phys() in asm/cacheflush.h ?

#if defined(CONFIG_PPC32) && !defined(CONFIG_BOOKE)
...
#else
BUG();
#endif

>   	}
>   #endif
>   }
> 

---
L'absence de virus dans ce courrier électronique a été vérifiée par le logiciel antivirus Avast.
https://www.avast.com/antivirus


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 4/6] powerpc: Chunk calls to flush_dcache_range in arch_*_memory
  2019-08-15  4:10 ` [PATCH 4/6] powerpc: Chunk calls to flush_dcache_range in arch_*_memory Alastair D'Silva
  2019-08-15  6:54   ` Mike Rapoport
@ 2019-08-15  7:36   ` christophe leroy
  2019-08-16  1:45     ` Alastair D'Silva
  1 sibling, 1 reply; 13+ messages in thread
From: christophe leroy @ 2019-08-15  7:36 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Qian Cai, Allison Randal, Thomas Gleixner, Nicholas Piggin,
	Greg Kroah-Hartman, Andrew Morton, Mike Rapoport,
	David Hildenbrand, linuxppc-dev, linux-kernel



Le 15/08/2019 à 06:10, Alastair D'Silva a écrit :
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> When presented with large amounts of memory being hotplugged
> (in my test case, ~890GB), the call to flush_dcache_range takes
> a while (~50 seconds), triggering RCU stalls.
> 
> This patch breaks up the call into 16GB chunks, calling
> cond_resched() inbetween to allow the scheduler to run.

Is 16GB small enough ? If 890GB takes 50s, 16GB still takes about 1s.
I'd use 1GB chuncks to remain below 100ms.

> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> ---
>   arch/powerpc/mm/mem.c | 16 ++++++++++++++--
>   1 file changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
> index 5400da87a804..fb0d5e9aa11b 100644
> --- a/arch/powerpc/mm/mem.c
> +++ b/arch/powerpc/mm/mem.c
> @@ -104,11 +104,14 @@ int __weak remove_section_mapping(unsigned long start, unsigned long end)
>   	return -ENODEV;
>   }
>   
> +#define FLUSH_CHUNK_SIZE (16ull * 1024ull * 1024ull * 1024ull)

Can we use SZ_16GB ?

> +
>   int __ref arch_add_memory(int nid, u64 start, u64 size,
>   			struct mhp_restrictions *restrictions)
>   {
>   	unsigned long start_pfn = start >> PAGE_SHIFT;
>   	unsigned long nr_pages = size >> PAGE_SHIFT;
> +	unsigned long i;
>   	int rc;
>   
>   	resize_hpt_for_hotplug(memblock_phys_mem_size());
> @@ -120,7 +123,11 @@ int __ref arch_add_memory(int nid, u64 start, u64 size,
>   			start, start + size, rc);
>   		return -EFAULT;
>   	}
> -	flush_dcache_range(start, start + size);
> +
> +	for (i = 0; i < size; i += FLUSH_CHUNK_SIZE) {
> +		flush_dcache_range(start + i, min(start + size, start + i + FLUSH_CHUNK_SIZE));

Isn't the line a bit long (I have not checked).

> +		cond_resched();
> +	}
>   
>   	return __add_pages(nid, start_pfn, nr_pages, restrictions);
>   }
> @@ -131,13 +138,18 @@ void __ref arch_remove_memory(int nid, u64 start, u64 size,
>   	unsigned long start_pfn = start >> PAGE_SHIFT;
>   	unsigned long nr_pages = size >> PAGE_SHIFT;
>   	struct page *page = pfn_to_page(start_pfn) + vmem_altmap_offset(altmap);
> +	unsigned long i;
>   	int ret;
>   
>   	__remove_pages(page_zone(page), start_pfn, nr_pages, altmap);
>   
>   	/* Remove htab bolted mappings for this section of memory */
>   	start = (unsigned long)__va(start);
> -	flush_dcache_range(start, start + size);
> +	for (i = 0; i < size; i += FLUSH_CHUNK_SIZE) {
> +		flush_dcache_range(start + i, min(start + size, start + i + FLUSH_CHUNK_SIZE));
> +		cond_resched();
> +	}
> +
>   	ret = remove_section_mapping(start, start + size);
>   	WARN_ON_ONCE(ret);
>   
> 

Christophe

---
L'absence de virus dans ce courrier électronique a été vérifiée par le logiciel antivirus Avast.
https://www.avast.com/antivirus


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/6] powerpc: Convert flush_icache_range & friends to C
  2019-08-15  7:29   ` christophe leroy
@ 2019-08-16  1:44     ` Alastair D'Silva
  2019-08-16 15:57     ` Christophe Leroy
  1 sibling, 0 replies; 13+ messages in thread
From: Alastair D'Silva @ 2019-08-16  1:44 UTC (permalink / raw)
  To: christophe leroy
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Thomas Gleixner, Qian Cai, Nicholas Piggin, Greg Kroah-Hartman,
	Allison Randal, Andrew Morton, David Hildenbrand, Michal Hocko,
	Mike Rapoport, linuxppc-dev, linux-kernel

On Thu, 2019-08-15 at 09:29 +0200, christophe leroy wrote:
> 
> Le 15/08/2019 à 06:10, Alastair D'Silva a écrit :
> > From: Alastair D'Silva <alastair@d-silva.org>
> > 
> > Similar to commit 22e9c88d486a
> > ("powerpc/64: reuse PPC32 static inline flush_dcache_range()")
> > this patch converts flush_icache_range() to C, and reimplements the
> > following functions as wrappers around it:
> > __flush_dcache_icache
> > __flush_dcache_icache_phys
> 
> Not sure you can do that for __flush_dcache_icache_phys(), see
> detailed 
> comments below
> 
> > This was done as we discovered a long-standing bug where the length
> > of the
> > range was truncated due to using a 32 bit shift instead of a 64 bit
> > one.
> > 
> > By converting these functions to C, it becomes easier to maintain.
> > 
> > Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> > ---
> >   arch/powerpc/include/asm/cache.h      |  26 +++---
> >   arch/powerpc/include/asm/cacheflush.h |  32 ++++---
> >   arch/powerpc/kernel/misc_32.S         | 117 -------------------
> > -------
> >   arch/powerpc/kernel/misc_64.S         |  97 ---------------------
> >   arch/powerpc/mm/mem.c                 |  71 +++++++++++++++-
> >   5 files changed, 102 insertions(+), 241 deletions(-)
> > 
> > diff --git a/arch/powerpc/include/asm/cache.h
> > b/arch/powerpc/include/asm/cache.h
> > index f852d5cd746c..728f154204db 100644
> > --- a/arch/powerpc/include/asm/cache.h
> > +++ b/arch/powerpc/include/asm/cache.h
> > @@ -98,20 +98,7 @@ static inline u32 l1_icache_bytes(void)
> >   #endif
> >   #endif /* ! __ASSEMBLY__ */
> >   
> > -#if defined(__ASSEMBLY__)
> > -/*
> > - * For a snooping icache, we still need a dummy icbi to purge all
> > the
> > - * prefetched instructions from the ifetch buffers. We also need a
> > sync
> > - * before the icbi to order the the actual stores to memory that
> > might
> > - * have modified instructions with the icbi.
> > - */
> > -#define PURGE_PREFETCHED_INS	\
> > -	sync;			\
> > -	icbi	0,r3;		\
> > -	sync;			\
> > -	isync
> 
> Is this still used anywhere now ?
No, this patch removes all users of it.

> > -
> > -#else
> > +#if !defined(__ASSEMBLY__)
> >   #define __read_mostly
> > __attribute__((__section__(".data..read_mostly")))
> >   
> >   #ifdef CONFIG_PPC_BOOK3S_32
> > @@ -145,6 +132,17 @@ static inline void dcbst(void *addr)
> >   {
> >   	__asm__ __volatile__ ("dcbst %y0" : : "Z"(*(u8 *)addr) :
> > "memory");
> >   }
> > +
> > +static inline void icbi(void *addr)
> > +{
> > +	__asm__ __volatile__ ("icbi 0, %0" : : "r"(addr) : "memory");
> > +}
> > +
> > +static inline void iccci(void)
> > +{
> > +	__asm__ __volatile__ ("iccci 0, r0");
> 
> I think you need the "memory" clobber too here.
> 
Thanks, I'll add that.

> > +}
> > +
> >   #endif /* !__ASSEMBLY__ */
> >   #endif /* __KERNEL__ */
> >   #endif /* _ASM_POWERPC_CACHE_H */
> > diff --git a/arch/powerpc/include/asm/cacheflush.h
> > b/arch/powerpc/include/asm/cacheflush.h
> > index ed57843ef452..4c3377aff8ed 100644
> > --- a/arch/powerpc/include/asm/cacheflush.h
> > +++ b/arch/powerpc/include/asm/cacheflush.h
> > @@ -42,24 +42,18 @@ extern void flush_dcache_page(struct page
> > *page);
> >   #define flush_dcache_mmap_lock(mapping)		do { } while
> > (0)
> >   #define flush_dcache_mmap_unlock(mapping)	do { } while (0)
> >   
> > -extern void flush_icache_range(unsigned long, unsigned long);
> > +void flush_icache_range(unsigned long start, unsigned long stop);
> >   extern void flush_icache_user_range(struct vm_area_struct *vma,
> >   				    struct page *page, unsigned long
> > addr,
> >   				    int len);
> > -extern void __flush_dcache_icache(void *page_va);
> >   extern void flush_dcache_icache_page(struct page *page);
> > -#if defined(CONFIG_PPC32) && !defined(CONFIG_BOOKE)
> > -extern void __flush_dcache_icache_phys(unsigned long physaddr);
> > -#else
> > -static inline void __flush_dcache_icache_phys(unsigned long
> > physaddr)
> > -{
> > -	BUG();
> > -}
> > -#endif
> >   
> > -/*
> > - * Write any modified data cache blocks out to memory and
> > invalidate them.
> > +/**
> > + * flush_dcache_range(): Write any modified data cache blocks out
> > to memory and invalidate them.
> >    * Does not invalidate the corresponding instruction cache
> > blocks.
> > + *
> > + * @start: the start address
> > + * @stop: the stop address (exclusive)
> >    */
> >   static inline void flush_dcache_range(unsigned long start,
> > unsigned long stop)
> >   {
> > @@ -82,6 +76,20 @@ static inline void flush_dcache_range(unsigned
> > long start, unsigned long stop)
> >   		isync();
> >   }
> >   
> > +/**
> > + * __flush_dcache_icache(): Flush a particular page from the data
> > cache to RAM.
> > + * Note: this is necessary because the instruction cache does
> > *not*
> > + * snoop from the data cache.
> > + *
> > + * @page: the address of the page to flush
> > + */
> > +static inline void __flush_dcache_icache(void *page)
> > +{
> > +	unsigned long page_addr = (unsigned long)page;
> 
> The function is small enough to call this addr instead of page_addr 
> without ambiguity.
Ok

> 
> > +
> > +	flush_icache_range(page_addr, page_addr + PAGE_SIZE);
> 
> Finally, I think that's not so simple. For the 44x, if
> MMU_FTR_TYPE_44x 
> is set, that's just a clean_dcache_range().
> If that feature is not set, it looks like a standard 
> flush_icache_range() but without the special CONFIG_4xx handling
> with 
> iccci we have in flush_icache_range()
> 
Hmm, I'll have to hack at this a bit more.

<snip>
-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819


^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [PATCH 4/6] powerpc: Chunk calls to flush_dcache_range in arch_*_memory
  2019-08-15  7:36   ` christophe leroy
@ 2019-08-16  1:45     ` Alastair D'Silva
  0 siblings, 0 replies; 13+ messages in thread
From: Alastair D'Silva @ 2019-08-16  1:45 UTC (permalink / raw)
  To: christophe leroy
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Qian Cai, Allison Randal, Thomas Gleixner, Nicholas Piggin,
	Greg Kroah-Hartman, Andrew Morton, Mike Rapoport,
	David Hildenbrand, linuxppc-dev, linux-kernel

On Thu, 2019-08-15 at 09:36 +0200, christophe leroy wrote:
> 
> Le 15/08/2019 à 06:10, Alastair D'Silva a écrit :
> > From: Alastair D'Silva <alastair@d-silva.org>
> > 
> > When presented with large amounts of memory being hotplugged
> > (in my test case, ~890GB), the call to flush_dcache_range takes
> > a while (~50 seconds), triggering RCU stalls.
> > 
> > This patch breaks up the call into 16GB chunks, calling
> > cond_resched() inbetween to allow the scheduler to run.
> 
> Is 16GB small enough ? If 890GB takes 50s, 16GB still takes about 1s.
> I'd use 1GB chuncks to remain below 100ms.
> 
> > Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> > ---
> >   arch/powerpc/mm/mem.c | 16 ++++++++++++++--
> >   1 file changed, 14 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
> > index 5400da87a804..fb0d5e9aa11b 100644
> > --- a/arch/powerpc/mm/mem.c
> > +++ b/arch/powerpc/mm/mem.c
> > @@ -104,11 +104,14 @@ int __weak remove_section_mapping(unsigned
> > long start, unsigned long end)
> >   	return -ENODEV;
> >   }
> >   
> > +#define FLUSH_CHUNK_SIZE (16ull * 1024ull * 1024ull * 1024ull)
> 
> Can we use SZ_16GB ?
Sure, I'll go with 1GB as you recommended above

> > +
> >   int __ref arch_add_memory(int nid, u64 start, u64 size,
> >   			struct mhp_restrictions *restrictions)
> >   {
> >   	unsigned long start_pfn = start >> PAGE_SHIFT;
> >   	unsigned long nr_pages = size >> PAGE_SHIFT;
> > +	unsigned long i;
> >   	int rc;
> >   
> >   	resize_hpt_for_hotplug(memblock_phys_mem_size());
> > @@ -120,7 +123,11 @@ int __ref arch_add_memory(int nid, u64 start,
> > u64 size,
> >   			start, start + size, rc);
> >   		return -EFAULT;
> >   	}
> > -	flush_dcache_range(start, start + size);
> > +
> > +	for (i = 0; i < size; i += FLUSH_CHUNK_SIZE) {
> > +		flush_dcache_range(start + i, min(start + size, start +
> > i + FLUSH_CHUNK_SIZE));
> 
> Isn't the line a bit long (I have not checked).
> 
> > +		cond_resched();
> > +	}
> >   
> >   	return __add_pages(nid, start_pfn, nr_pages, restrictions);
> >   }
> > @@ -131,13 +138,18 @@ void __ref arch_remove_memory(int nid, u64
> > start, u64 size,
> >   	unsigned long start_pfn = start >> PAGE_SHIFT;
> >   	unsigned long nr_pages = size >> PAGE_SHIFT;
> >   	struct page *page = pfn_to_page(start_pfn) +
> > vmem_altmap_offset(altmap);
> > +	unsigned long i;
> >   	int ret;
> >   
> >   	__remove_pages(page_zone(page), start_pfn, nr_pages, altmap);
> >   
> >   	/* Remove htab bolted mappings for this section of memory */
> >   	start = (unsigned long)__va(start);
> > -	flush_dcache_range(start, start + size);
> > +	for (i = 0; i < size; i += FLUSH_CHUNK_SIZE) {
> > +		flush_dcache_range(start + i, min(start + size, start +
> > i + FLUSH_CHUNK_SIZE));
> > +		cond_resched();
> > +	}
> > +
> >   	ret = remove_section_mapping(start, start + size);
> >   	WARN_ON_ONCE(ret);
> >   
> > 
> 
> Christophe
> 
> ---
> L'absence de virus dans ce courrier électronique a été vérifiée par
> le logiciel antivirus Avast.
> https://urldefense.proofpoint.com/v2/url?u=https-3A__www.avast.com_antivirus&d=DwIDaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=cT4tgeEQ0Ll3SIlZDHE5AEXyKy6uKADMtf9_Eb7-vec&m=TBT2NNM2DXqDWHhSb_WdFPcfAjYk9hP2cvGksF001cQ&s=XURKAOQQ4h3_RhJlezSguD2kpSitAF-uBhQqVZLU4GU&e= 
> 
-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/6] powerpc: Convert flush_icache_range & friends to C
  2019-08-15  7:29   ` christophe leroy
  2019-08-16  1:44     ` Alastair D'Silva
@ 2019-08-16 15:57     ` Christophe Leroy
  1 sibling, 0 replies; 13+ messages in thread
From: Christophe Leroy @ 2019-08-16 15:57 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Thomas Gleixner, Qian Cai, Nicholas Piggin, Greg Kroah-Hartman,
	Allison Randal, Andrew Morton, David Hildenbrand, Michal Hocko,
	Mike Rapoport, linuxppc-dev, linux-kernel



Le 15/08/2019 à 09:29, christophe leroy a écrit :
> 
> 
> Le 15/08/2019 à 06:10, Alastair D'Silva a écrit :
>> From: Alastair D'Silva <alastair@d-silva.org>
>>
>> Similar to commit 22e9c88d486a
>> ("powerpc/64: reuse PPC32 static inline flush_dcache_range()")
>> this patch converts flush_icache_range() to C, and reimplements the
>> following functions as wrappers around it:
>> __flush_dcache_icache
>> __flush_dcache_icache_phys
> 
> Not sure you can do that for __flush_dcache_icache_phys(), see detailed 
> comments below
> 

I just sent you an RFC patch that could be the way to convert 
__flush_dcache_icache_phys() to C.

Feel free to modify it as wished and include it in your series.

Christophe

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2019-08-16 15:57 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-15  4:10 [PATCH 0/6] powerpc: convert cache asm to C Alastair D'Silva
2019-08-15  4:10 ` [PATCH 1/6] powerpc: Allow flush_icache_range to work across ranges >4GB Alastair D'Silva
2019-08-15  4:10 ` [PATCH 2/6] powerpc: define helpers to get L1 icache sizes Alastair D'Silva
2019-08-15  4:10 ` [PATCH 3/6] powerpc: Convert flush_icache_range & friends to C Alastair D'Silva
2019-08-15  7:29   ` christophe leroy
2019-08-16  1:44     ` Alastair D'Silva
2019-08-16 15:57     ` Christophe Leroy
2019-08-15  4:10 ` [PATCH 4/6] powerpc: Chunk calls to flush_dcache_range in arch_*_memory Alastair D'Silva
2019-08-15  6:54   ` Mike Rapoport
2019-08-15  7:36   ` christophe leroy
2019-08-16  1:45     ` Alastair D'Silva
2019-08-15  4:10 ` [PATCH 5/6] powerpc: Remove 'extern' from func prototypes in cache headers Alastair D'Silva
2019-08-15  4:10 ` [PATCH 6/6] powerpc: Don't flush caches when adding memory Alastair D'Silva

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).