* [PATCH v5 00/23] powerpc/8xx: Use large pages for RAM and IMMR and other improvments
@ 2016-02-03 22:53 Christophe Leroy
2016-02-03 22:53 ` [PATCH v5 01/23] powerpc/8xx: Save r3 all the time in DTLB miss handler Christophe Leroy
` (22 more replies)
0 siblings, 23 replies; 31+ messages in thread
From: Christophe Leroy @ 2016-02-03 22:53 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
Scott Wood, Jonathan Corbet
Cc: linux-kernel, linuxppc-dev, linux-doc
The main purpose of this patchset is to dramatically reduce the time
spent in DTLB miss handler. This is achieved by:
1/ Mapping RAM with 8M pages
2/ Mapping IMMR with a fixed 512K page
On a live running system (VoIP gateway for Air Trafic Control), over
a 10 minutes period (with 277s idle), we get 87 millions DTLB misses
and approximatly 35 secondes are spent in DTLB handler.
This represents 5.8% of the overall time and even 10.8% of the
non-idle time.
Among those 87 millions DTLB misses, 15% are on user addresses and
85% are on kernel addresses. And within the kernel addresses, 93%
are on addresses from the linear address space and only 7% are on
addresses from the virtual address space.
Once the full patchset applied, the number of DTLB misses during the
period is reduced to 11.8 millions for a duration of 5.8s, which
represents 2% of the non-idle time.
This patch also includes other miscellaneous improvements:
1/ Handling of CPU6 ERRATA directly in mtspr() C macro to reduce code
specific to PPC8xx
2/ Rewrite of a few non critical ASM functions in C
3/ Removal of some unused items
See related patches for details
Main changes in v3:
* Using fixmap instead of fix address for mapping IMMR
Change in v4:
* Fix of a wrong #if notified by kbuild robot in 07/23
Change in v5:
* Removed use of pmd_val() as L-value
* Adapted to match the new include files layout in Linux 4.5
Christophe Leroy (23):
powerpc/8xx: Save r3 all the time in DTLB miss handler
powerpc/8xx: Map linear kernel RAM with 8M pages
powerpc: Update documentation for noltlbs kernel parameter
powerpc/8xx: move setup_initial_memory_limit() into 8xx_mmu.c
powerpc32: Fix pte_offset_kernel() to return NULL for bad pages
powerpc32: refactor x_mapped_by_bats() and x_mapped_by_tlbcam()
together
powerpc/8xx: Fix vaddr for IMMR early remap
powerpc/8xx: Map IMMR area with 512k page at a fixed address
powerpc/8xx: CONFIG_PIN_TLB unneeded for CONFIG_PPC_EARLY_DEBUG_CPM
powerpc/8xx: map more RAM at startup when needed
powerpc32: Remove useless/wrong MMU:setio progress message
powerpc32: remove ioremap_base
powerpc/8xx: Add missing SPRN defines into reg_8xx.h
powerpc/8xx: Handle CPU6 ERRATA directly in mtspr() macro
powerpc/8xx: remove special handling of CPU6 errata in set_dec()
powerpc/8xx: rewrite set_context() in C
powerpc/8xx: rewrite flush_instruction_cache() in C
powerpc: add inline functions for cache related instructions
powerpc32: Remove clear_pages() and define clear_page() inline
powerpc32: move xxxxx_dcache_range() functions inline
powerpc: Simplify test in __dma_sync()
powerpc32: small optimisation in flush_icache_range()
powerpc32: Remove one insn in mulhdu
Documentation/kernel-parameters.txt | 2 +-
arch/powerpc/Kconfig.debug | 1 -
arch/powerpc/include/asm/cache.h | 19 +++
arch/powerpc/include/asm/cacheflush.h | 52 ++++++-
arch/powerpc/include/asm/fixmap.h | 14 ++
arch/powerpc/include/asm/mmu-8xx.h | 4 +-
arch/powerpc/include/asm/nohash/32/pgtable.h | 5 +-
arch/powerpc/include/asm/page_32.h | 17 ++-
arch/powerpc/include/asm/reg.h | 2 +
arch/powerpc/include/asm/reg_8xx.h | 93 ++++++++++++
arch/powerpc/include/asm/time.h | 6 +-
arch/powerpc/kernel/asm-offsets.c | 8 ++
arch/powerpc/kernel/head_8xx.S | 207 +++++++++++++++++----------
arch/powerpc/kernel/misc_32.S | 107 ++------------
arch/powerpc/kernel/ppc_ksyms.c | 2 +
arch/powerpc/kernel/ppc_ksyms_32.c | 1 -
arch/powerpc/mm/8xx_mmu.c | 190 ++++++++++++++++++++++++
arch/powerpc/mm/Makefile | 1 +
arch/powerpc/mm/dma-noncoherent.c | 2 +-
arch/powerpc/mm/fsl_booke_mmu.c | 4 +-
arch/powerpc/mm/init_32.c | 23 ---
arch/powerpc/mm/mmu_decl.h | 34 +++--
arch/powerpc/mm/pgtable_32.c | 47 +-----
arch/powerpc/mm/ppc_mmu_32.c | 4 +-
arch/powerpc/platforms/embedded6xx/mpc10x.h | 10 --
arch/powerpc/sysdev/cpm_common.c | 15 +-
26 files changed, 583 insertions(+), 287 deletions(-)
create mode 100644 arch/powerpc/mm/8xx_mmu.c
--
2.1.0
^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH v5 01/23] powerpc/8xx: Save r3 all the time in DTLB miss handler
2016-02-03 22:53 [PATCH v5 00/23] powerpc/8xx: Use large pages for RAM and IMMR and other improvments Christophe Leroy
@ 2016-02-03 22:53 ` Christophe Leroy
2016-02-03 22:53 ` [PATCH v5 02/23] powerpc/8xx: Map linear kernel RAM with 8M pages Christophe Leroy
` (21 subsequent siblings)
22 siblings, 0 replies; 31+ messages in thread
From: Christophe Leroy @ 2016-02-03 22:53 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Scott Wood
Cc: linux-kernel, linuxppc-dev
We are spending between 40 and 160 cycles with a mean of 65 cycles in
the DTLB handling routine (measured with mftbl) so make it more
simple althought it adds one instruction.
With this modification, we get three registers available at all time,
which will help with following patch.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
v2: no change
v3: no change
v4: no change
v5: no change
arch/powerpc/kernel/head_8xx.S | 13 ++++---------
1 file changed, 4 insertions(+), 9 deletions(-)
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index e629e28..a89492e 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -385,23 +385,20 @@ InstructionTLBMiss:
. = 0x1200
DataStoreTLBMiss:
-#ifdef CONFIG_8xx_CPU6
mtspr SPRN_SPRG_SCRATCH2, r3
-#endif
EXCEPTION_PROLOG_0
- mfcr r10
+ mfcr r3
/* If we are faulting a kernel address, we have to use the
* kernel page tables.
*/
- mfspr r11, SPRN_MD_EPN
- IS_KERNEL(r11, r11)
+ mfspr r10, SPRN_MD_EPN
+ IS_KERNEL(r11, r10)
mfspr r11, SPRN_M_TW /* Get level 1 table */
BRANCH_UNLESS_KERNEL(3f)
lis r11, (swapper_pg_dir-PAGE_OFFSET)@ha
3:
- mtcr r10
- mfspr r10, SPRN_MD_EPN
+ mtcr r3
/* Insert level 1 index */
rlwimi r11, r10, 32 - ((PAGE_SHIFT - 2) << 1), (PAGE_SHIFT - 2) << 1, 29
@@ -453,9 +450,7 @@ DataStoreTLBMiss:
MTSPR_CPU6(SPRN_MD_RPN, r10, r3) /* Update TLB entry */
/* Restore registers */
-#ifdef CONFIG_8xx_CPU6
mfspr r3, SPRN_SPRG_SCRATCH2
-#endif
mtspr SPRN_DAR, r11 /* Tag DAR */
EXCEPTION_EPILOG_0
rfi
--
2.1.0
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH v5 02/23] powerpc/8xx: Map linear kernel RAM with 8M pages
2016-02-03 22:53 [PATCH v5 00/23] powerpc/8xx: Use large pages for RAM and IMMR and other improvments Christophe Leroy
2016-02-03 22:53 ` [PATCH v5 01/23] powerpc/8xx: Save r3 all the time in DTLB miss handler Christophe Leroy
@ 2016-02-03 22:53 ` Christophe Leroy
2016-02-03 22:53 ` [PATCH v5 03/23] powerpc: Update documentation for noltlbs kernel parameter Christophe Leroy
` (20 subsequent siblings)
22 siblings, 0 replies; 31+ messages in thread
From: Christophe Leroy @ 2016-02-03 22:53 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Scott Wood
Cc: linux-kernel, linuxppc-dev
On a live running system (VoIP gateway for Air Trafic Control), over
a 10 minutes period (with 277s idle), we get 87 millions DTLB misses
and approximatly 35 secondes are spent in DTLB handler.
This represents 5.8% of the overall time and even 10.8% of the
non-idle time.
Among those 87 millions DTLB misses, 15% are on user addresses and
85% are on kernel addresses. And within the kernel addresses, 93%
are on addresses from the linear address space and only 7% are on
addresses from the virtual address space.
MPC8xx has no BATs but it has 8Mb page size. This patch implements
mapping of kernel RAM using 8Mb pages, on the same model as what is
done on the 40x.
In 4k pages mode, each PGD entry maps a 4Mb area: we map every two
entries to the same 8Mb physical page. In each second entry, we add
4Mb to the page physical address to ease life of the FixupDAR
routine. This is just ignored by HW.
In 16k pages mode, each PGD entry maps a 64Mb area: each PGD entry
will point to the first page of the area. The DTLB handler adds
the 3 bits from EPN to map the correct page.
With this patch applied, we now get only 13 millions TLB misses
during the 10 minutes period. The idle time has increased to 313s
and the overall time spent in DTLB miss handler is 6.3s, which
represents 1% of the overall time and 2.2% of non-idle time.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
v2: using bt instead of bgt and named the label explicitly
v3: no change
v4: no change
v5: removed use of pmd_val() as L-value
arch/powerpc/kernel/head_8xx.S | 35 +++++++++++++++++-
arch/powerpc/mm/8xx_mmu.c | 83 ++++++++++++++++++++++++++++++++++++++++++
arch/powerpc/mm/Makefile | 1 +
arch/powerpc/mm/mmu_decl.h | 15 ++------
4 files changed, 120 insertions(+), 14 deletions(-)
create mode 100644 arch/powerpc/mm/8xx_mmu.c
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index a89492e..87d1f5f 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -398,11 +398,13 @@ DataStoreTLBMiss:
BRANCH_UNLESS_KERNEL(3f)
lis r11, (swapper_pg_dir-PAGE_OFFSET)@ha
3:
- mtcr r3
/* Insert level 1 index */
rlwimi r11, r10, 32 - ((PAGE_SHIFT - 2) << 1), (PAGE_SHIFT - 2) << 1, 29
lwz r11, (swapper_pg_dir-PAGE_OFFSET)@l(r11) /* Get the level 1 entry */
+ mtcr r11
+ bt- 28,DTLBMiss8M /* bit 28 = Large page (8M) */
+ mtcr r3
/* We have a pte table, so load fetch the pte from the table.
*/
@@ -455,6 +457,29 @@ DataStoreTLBMiss:
EXCEPTION_EPILOG_0
rfi
+DTLBMiss8M:
+ mtcr r3
+ ori r11, r11, MD_SVALID
+ MTSPR_CPU6(SPRN_MD_TWC, r11, r3)
+#ifdef CONFIG_PPC_16K_PAGES
+ /*
+ * In 16k pages mode, each PGD entry defines a 64M block.
+ * Here we select the 8M page within the block.
+ */
+ rlwimi r11, r10, 0, 0x03800000
+#endif
+ rlwinm r10, r11, 0, 0xff800000
+ ori r10, r10, 0xf0 | MD_SPS16K | _PAGE_SHARED | _PAGE_DIRTY | \
+ _PAGE_PRESENT
+ MTSPR_CPU6(SPRN_MD_RPN, r10, r3) /* Update TLB entry */
+
+ li r11, RPN_PATTERN
+ mfspr r3, SPRN_SPRG_SCRATCH2
+ mtspr SPRN_DAR, r11 /* Tag DAR */
+ EXCEPTION_EPILOG_0
+ rfi
+
+
/* This is an instruction TLB error on the MPC8xx. This could be due
* to many reasons, such as executing guarded memory or illegal instruction
* addresses. There is nothing to do but handle a big time error fault.
@@ -532,13 +557,15 @@ FixupDAR:/* Entry point for dcbx workaround. */
/* Insert level 1 index */
3: rlwimi r11, r10, 32 - ((PAGE_SHIFT - 2) << 1), (PAGE_SHIFT - 2) << 1, 29
lwz r11, (swapper_pg_dir-PAGE_OFFSET)@l(r11) /* Get the level 1 entry */
+ mtcr r11
+ bt 28,200f /* bit 28 = Large page (8M) */
rlwinm r11, r11,0,0,19 /* Extract page descriptor page address */
/* Insert level 2 index */
rlwimi r11, r10, 32 - (PAGE_SHIFT - 2), 32 - PAGE_SHIFT, 29
lwz r11, 0(r11) /* Get the pte */
/* concat physical page address(r11) and page offset(r10) */
rlwimi r11, r10, 0, 32 - PAGE_SHIFT, 31
- lwz r11,0(r11)
+201: lwz r11,0(r11)
/* Check if it really is a dcbx instruction. */
/* dcbt and dcbtst does not generate DTLB Misses/Errors,
* no need to include them here */
@@ -557,6 +584,10 @@ FixupDAR:/* Entry point for dcbx workaround. */
141: mfspr r10,SPRN_SPRG_SCRATCH2
b DARFixed /* Nope, go back to normal TLB processing */
+ /* concat physical page address(r11) and page offset(r10) */
+200: rlwimi r11, r10, 0, 32 - (PAGE_SHIFT << 1), 31
+ b 201b
+
144: mfspr r10, SPRN_DSISR
rlwinm r10, r10,0,7,5 /* Clear store bit for buggy dcbst insn */
mtspr SPRN_DSISR, r10
diff --git a/arch/powerpc/mm/8xx_mmu.c b/arch/powerpc/mm/8xx_mmu.c
new file mode 100644
index 0000000..2d42745
--- /dev/null
+++ b/arch/powerpc/mm/8xx_mmu.c
@@ -0,0 +1,83 @@
+/*
+ * This file contains the routines for initializing the MMU
+ * on the 8xx series of chips.
+ * -- christophe
+ *
+ * Derived from arch/powerpc/mm/40x_mmu.c:
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ *
+ */
+
+#include <linux/memblock.h>
+
+#include "mmu_decl.h"
+
+extern int __map_without_ltlbs;
+/*
+ * MMU_init_hw does the chip-specific initialization of the MMU hardware.
+ */
+void __init MMU_init_hw(void)
+{
+ /* Nothing to do for the time being but keep it similar to other PPC */
+}
+
+#define LARGE_PAGE_SIZE_4M (1<<22)
+#define LARGE_PAGE_SIZE_8M (1<<23)
+#define LARGE_PAGE_SIZE_64M (1<<26)
+
+unsigned long __init mmu_mapin_ram(unsigned long top)
+{
+ unsigned long v, s, mapped;
+ phys_addr_t p;
+
+ v = KERNELBASE;
+ p = 0;
+ s = top;
+
+ if (__map_without_ltlbs)
+ return 0;
+
+#ifdef CONFIG_PPC_4K_PAGES
+ while (s >= LARGE_PAGE_SIZE_8M) {
+ pmd_t *pmdp;
+ unsigned long val = p | MD_PS8MEG;
+
+ pmdp = pmd_offset(pud_offset(pgd_offset_k(v), v), v);
+ *pmdp++ = __pmd(val);
+ *pmdp++ = __pmd(val + LARGE_PAGE_SIZE_4M);
+
+ v += LARGE_PAGE_SIZE_8M;
+ p += LARGE_PAGE_SIZE_8M;
+ s -= LARGE_PAGE_SIZE_8M;
+ }
+#else /* CONFIG_PPC_16K_PAGES */
+ while (s >= LARGE_PAGE_SIZE_64M) {
+ pmd_t *pmdp;
+ unsigned long val = p | MD_PS8MEG;
+
+ pmdp = pmd_offset(pud_offset(pgd_offset_k(v), v), v);
+ *pmdp++ = __pmd(val);
+
+ v += LARGE_PAGE_SIZE_64M;
+ p += LARGE_PAGE_SIZE_64M;
+ s -= LARGE_PAGE_SIZE_64M;
+ }
+#endif
+
+ mapped = top - s;
+
+ /* If the size of RAM is not an exact power of two, we may not
+ * have covered RAM in its entirety with 8 MiB
+ * pages. Consequently, restrict the top end of RAM currently
+ * allocable so that calls to the MEMBLOCK to allocate PTEs for "tail"
+ * coverage with normal-sized pages (or other reasons) do not
+ * attempt to allocate outside the allowed range.
+ */
+ memblock_set_current_limit(mapped);
+
+ return mapped;
+}
diff --git a/arch/powerpc/mm/Makefile b/arch/powerpc/mm/Makefile
index 1ffeda8..adfee3f 100644
--- a/arch/powerpc/mm/Makefile
+++ b/arch/powerpc/mm/Makefile
@@ -25,6 +25,7 @@ obj-$(CONFIG_PPC_ICSWX) += icswx.o
obj-$(CONFIG_PPC_ICSWX_PID) += icswx_pid.o
obj-$(CONFIG_40x) += 40x_mmu.o
obj-$(CONFIG_44x) += 44x_mmu.o
+obj-$(CONFIG_PPC_8xx) += 8xx_mmu.o
obj-$(CONFIG_PPC_FSL_BOOK3E) += fsl_booke_mmu.o
obj-$(CONFIG_NEED_MULTIPLE_NODES) += numa.o
obj-$(CONFIG_PPC_SPLPAR) += vphn.o
diff --git a/arch/powerpc/mm/mmu_decl.h b/arch/powerpc/mm/mmu_decl.h
index 9f58ff4..7faeb9f 100644
--- a/arch/powerpc/mm/mmu_decl.h
+++ b/arch/powerpc/mm/mmu_decl.h
@@ -132,22 +132,17 @@ extern void wii_memory_fixups(void);
/* ...and now those things that may be slightly different between processor
* architectures. -- Dan
*/
-#if defined(CONFIG_8xx)
-#define MMU_init_hw() do { } while(0)
-#define mmu_mapin_ram(top) (0UL)
-
-#elif defined(CONFIG_4xx)
+#ifdef CONFIG_PPC32
extern void MMU_init_hw(void);
extern unsigned long mmu_mapin_ram(unsigned long top);
+#endif
-#elif defined(CONFIG_PPC_FSL_BOOK3E)
+#ifdef CONFIG_PPC_FSL_BOOK3E
extern unsigned long map_mem_in_cams(unsigned long ram, int max_cam_idx,
bool dryrun);
extern unsigned long calc_cam_sz(unsigned long ram, unsigned long virt,
phys_addr_t phys);
#ifdef CONFIG_PPC32
-extern void MMU_init_hw(void);
-extern unsigned long mmu_mapin_ram(unsigned long top);
extern void adjust_total_lowmem(void);
extern int switch_to_as1(void);
extern void restore_to_as0(int esel, int offset, void *dt_ptr, int bootcpu);
@@ -162,8 +157,4 @@ struct tlbcam {
u32 MAS3;
u32 MAS7;
};
-#elif defined(CONFIG_PPC32)
-/* anything 32-bit except 4xx or 8xx */
-extern void MMU_init_hw(void);
-extern unsigned long mmu_mapin_ram(unsigned long top);
#endif
--
2.1.0
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH v5 03/23] powerpc: Update documentation for noltlbs kernel parameter
2016-02-03 22:53 [PATCH v5 00/23] powerpc/8xx: Use large pages for RAM and IMMR and other improvments Christophe Leroy
2016-02-03 22:53 ` [PATCH v5 01/23] powerpc/8xx: Save r3 all the time in DTLB miss handler Christophe Leroy
2016-02-03 22:53 ` [PATCH v5 02/23] powerpc/8xx: Map linear kernel RAM with 8M pages Christophe Leroy
@ 2016-02-03 22:53 ` Christophe Leroy
2016-02-03 22:53 ` [PATCH v5 04/23] powerpc/8xx: move setup_initial_memory_limit() into 8xx_mmu.c Christophe Leroy
` (19 subsequent siblings)
22 siblings, 0 replies; 31+ messages in thread
From: Christophe Leroy @ 2016-02-03 22:53 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
Scott Wood, Jonathan Corbet
Cc: linux-kernel, linuxppc-dev, linux-doc
Now the noltlbs kernel parameter is also applicable to PPC8xx
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
v2: no change
v3: no change
v4: no change
v5: no change
Documentation/kernel-parameters.txt | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 59e1515..c3e420b 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2592,7 +2592,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
nolapic_timer [X86-32,APIC] Do not use the local APIC timer.
noltlbs [PPC] Do not use large page/tlb entries for kernel
- lowmem mapping on PPC40x.
+ lowmem mapping on PPC40x and PPC8xx
nomca [IA-64] Disable machine check abort handling
--
2.1.0
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH v5 04/23] powerpc/8xx: move setup_initial_memory_limit() into 8xx_mmu.c
2016-02-03 22:53 [PATCH v5 00/23] powerpc/8xx: Use large pages for RAM and IMMR and other improvments Christophe Leroy
` (2 preceding siblings ...)
2016-02-03 22:53 ` [PATCH v5 03/23] powerpc: Update documentation for noltlbs kernel parameter Christophe Leroy
@ 2016-02-03 22:53 ` Christophe Leroy
2016-02-03 22:53 ` [PATCH v5 05/23] powerpc32: Fix pte_offset_kernel() to return NULL for bad pages Christophe Leroy
` (18 subsequent siblings)
22 siblings, 0 replies; 31+ messages in thread
From: Christophe Leroy @ 2016-02-03 22:53 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Scott Wood
Cc: linux-kernel, linuxppc-dev
Now we have a 8xx specific .c file for that so put it in there
as other powerpc variants do
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
v2: no change
v3: no change
v4: no change
v5: no change
arch/powerpc/mm/8xx_mmu.c | 17 +++++++++++++++++
arch/powerpc/mm/init_32.c | 19 -------------------
2 files changed, 17 insertions(+), 19 deletions(-)
diff --git a/arch/powerpc/mm/8xx_mmu.c b/arch/powerpc/mm/8xx_mmu.c
index 2d42745..a84f5eb 100644
--- a/arch/powerpc/mm/8xx_mmu.c
+++ b/arch/powerpc/mm/8xx_mmu.c
@@ -81,3 +81,20 @@ unsigned long __init mmu_mapin_ram(unsigned long top)
return mapped;
}
+
+void setup_initial_memory_limit(phys_addr_t first_memblock_base,
+ phys_addr_t first_memblock_size)
+{
+ /* We don't currently support the first MEMBLOCK not mapping 0
+ * physical on those processors
+ */
+ BUG_ON(first_memblock_base != 0);
+
+#ifdef CONFIG_PIN_TLB
+ /* 8xx can only access 24MB at the moment */
+ memblock_set_current_limit(min_t(u64, first_memblock_size, 0x01800000));
+#else
+ /* 8xx can only access 8MB at the moment */
+ memblock_set_current_limit(min_t(u64, first_memblock_size, 0x00800000));
+#endif
+}
diff --git a/arch/powerpc/mm/init_32.c b/arch/powerpc/mm/init_32.c
index a10be66..1a18e4b 100644
--- a/arch/powerpc/mm/init_32.c
+++ b/arch/powerpc/mm/init_32.c
@@ -193,22 +193,3 @@ void __init MMU_init(void)
/* Shortly after that, the entire linear mapping will be available */
memblock_set_current_limit(lowmem_end_addr);
}
-
-#ifdef CONFIG_8xx /* No 8xx specific .c file to put that in ... */
-void setup_initial_memory_limit(phys_addr_t first_memblock_base,
- phys_addr_t first_memblock_size)
-{
- /* We don't currently support the first MEMBLOCK not mapping 0
- * physical on those processors
- */
- BUG_ON(first_memblock_base != 0);
-
-#ifdef CONFIG_PIN_TLB
- /* 8xx can only access 24MB at the moment */
- memblock_set_current_limit(min_t(u64, first_memblock_size, 0x01800000));
-#else
- /* 8xx can only access 8MB at the moment */
- memblock_set_current_limit(min_t(u64, first_memblock_size, 0x00800000));
-#endif
-}
-#endif /* CONFIG_8xx */
--
2.1.0
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH v5 05/23] powerpc32: Fix pte_offset_kernel() to return NULL for bad pages
2016-02-03 22:53 [PATCH v5 00/23] powerpc/8xx: Use large pages for RAM and IMMR and other improvments Christophe Leroy
` (3 preceding siblings ...)
2016-02-03 22:53 ` [PATCH v5 04/23] powerpc/8xx: move setup_initial_memory_limit() into 8xx_mmu.c Christophe Leroy
@ 2016-02-03 22:53 ` Christophe Leroy
2016-02-03 22:54 ` [PATCH v5 06/23] powerpc32: refactor x_mapped_by_bats() and x_mapped_by_tlbcam() together Christophe Leroy
` (17 subsequent siblings)
22 siblings, 0 replies; 31+ messages in thread
From: Christophe Leroy @ 2016-02-03 22:53 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Scott Wood
Cc: linux-kernel, linuxppc-dev
The fixmap related functions try to map kernel pages that are
already mapped through Large TLBs. pte_offset_kernel() has to
return NULL for LTLBs, otherwise the caller will try to access
level 2 table which doesn't exist
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
v3: new
v4: no change
v5: no change
arch/powerpc/include/asm/nohash/32/pgtable.h | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h b/arch/powerpc/include/asm/nohash/32/pgtable.h
index c82cbf5..e201600 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -309,7 +309,8 @@ static inline void __ptep_set_access_flags(pte_t *ptep, pte_t entry)
#define pte_index(address) \
(((address) >> PAGE_SHIFT) & (PTRS_PER_PTE - 1))
#define pte_offset_kernel(dir, addr) \
- ((pte_t *) pmd_page_vaddr(*(dir)) + pte_index(addr))
+ (pmd_bad(*(dir)) ? NULL : (pte_t *)pmd_page_vaddr(*(dir)) + \
+ pte_index(addr))
#define pte_offset_map(dir, addr) \
((pte_t *) kmap_atomic(pmd_page(*(dir))) + pte_index(addr))
#define pte_unmap(pte) kunmap_atomic(pte)
--
2.1.0
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH v5 06/23] powerpc32: refactor x_mapped_by_bats() and x_mapped_by_tlbcam() together
2016-02-03 22:53 [PATCH v5 00/23] powerpc/8xx: Use large pages for RAM and IMMR and other improvments Christophe Leroy
` (4 preceding siblings ...)
2016-02-03 22:53 ` [PATCH v5 05/23] powerpc32: Fix pte_offset_kernel() to return NULL for bad pages Christophe Leroy
@ 2016-02-03 22:54 ` Christophe Leroy
2016-02-07 9:42 ` kbuild test robot
2016-02-03 22:54 ` [PATCH v5 07/23] powerpc/8xx: Fix vaddr for IMMR early remap Christophe Leroy
` (16 subsequent siblings)
22 siblings, 1 reply; 31+ messages in thread
From: Christophe Leroy @ 2016-02-03 22:54 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Scott Wood
Cc: linux-kernel, linuxppc-dev
x_mapped_by_bats() and x_mapped_by_tlbcam() serve the same kind of
purpose, and are never defined at the same time.
So rename them x_block_mapped() and define them in the relevant
places
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
v2: no change
v3: Functions are mutually exclusive so renamed iaw Scott comment instead of grouping into a single function
v4: no change
v5: no change
arch/powerpc/mm/fsl_booke_mmu.c | 4 ++--
arch/powerpc/mm/mmu_decl.h | 10 ++++++++++
arch/powerpc/mm/pgtable_32.c | 44 ++++++-----------------------------------
arch/powerpc/mm/ppc_mmu_32.c | 4 ++--
4 files changed, 20 insertions(+), 42 deletions(-)
diff --git a/arch/powerpc/mm/fsl_booke_mmu.c b/arch/powerpc/mm/fsl_booke_mmu.c
index f3afe3d..5d45341 100644
--- a/arch/powerpc/mm/fsl_booke_mmu.c
+++ b/arch/powerpc/mm/fsl_booke_mmu.c
@@ -75,7 +75,7 @@ unsigned long tlbcam_sz(int idx)
/*
* Return PA for this VA if it is mapped by a CAM, or 0
*/
-phys_addr_t v_mapped_by_tlbcam(unsigned long va)
+phys_addr_t v_block_mapped(unsigned long va)
{
int b;
for (b = 0; b < tlbcam_index; ++b)
@@ -87,7 +87,7 @@ phys_addr_t v_mapped_by_tlbcam(unsigned long va)
/*
* Return VA for a given PA or 0 if not mapped
*/
-unsigned long p_mapped_by_tlbcam(phys_addr_t pa)
+unsigned long p_block_mapped(phys_addr_t pa)
{
int b;
for (b = 0; b < tlbcam_index; ++b)
diff --git a/arch/powerpc/mm/mmu_decl.h b/arch/powerpc/mm/mmu_decl.h
index 7faeb9f..40dd5d3 100644
--- a/arch/powerpc/mm/mmu_decl.h
+++ b/arch/powerpc/mm/mmu_decl.h
@@ -158,3 +158,13 @@ struct tlbcam {
u32 MAS7;
};
#endif
+
+#if defined(CONFIG_6xx) || defined(CONFIG_FSL_BOOKE)
+/* 6xx have BATS */
+/* FSL_BOOKE have TLBCAM */
+phys_addr_t v_block_mapped(unsigned long va);
+unsigned long p_block_mapped(phys_addr_t pa);
+#else
+static inline phys_addr_t v_block_mapped(unsigned long va) { return 0; }
+static inline unsigned long p_block_mapped(phys_addr_t pa) { return 0; }
+#endif
diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c
index 7692d1b..db0d35e 100644
--- a/arch/powerpc/mm/pgtable_32.c
+++ b/arch/powerpc/mm/pgtable_32.c
@@ -41,32 +41,8 @@ unsigned long ioremap_base;
unsigned long ioremap_bot;
EXPORT_SYMBOL(ioremap_bot); /* aka VMALLOC_END */
-#ifdef CONFIG_6xx
-#define HAVE_BATS 1
-#endif
-
-#if defined(CONFIG_FSL_BOOKE)
-#define HAVE_TLBCAM 1
-#endif
-
extern char etext[], _stext[];
-#ifdef HAVE_BATS
-extern phys_addr_t v_mapped_by_bats(unsigned long va);
-extern unsigned long p_mapped_by_bats(phys_addr_t pa);
-#else /* !HAVE_BATS */
-#define v_mapped_by_bats(x) (0UL)
-#define p_mapped_by_bats(x) (0UL)
-#endif /* HAVE_BATS */
-
-#ifdef HAVE_TLBCAM
-extern phys_addr_t v_mapped_by_tlbcam(unsigned long va);
-extern unsigned long p_mapped_by_tlbcam(phys_addr_t pa);
-#else /* !HAVE_TLBCAM */
-#define v_mapped_by_tlbcam(x) (0UL)
-#define p_mapped_by_tlbcam(x) (0UL)
-#endif /* HAVE_TLBCAM */
-
#define PGDIR_ORDER (32 + PGD_T_LOG2 - PGDIR_SHIFT)
#ifndef CONFIG_PPC_4K_PAGES
@@ -228,19 +204,10 @@ __ioremap_caller(phys_addr_t addr, unsigned long size, unsigned long flags,
/*
* Is it already mapped? Perhaps overlapped by a previous
- * BAT mapping. If the whole area is mapped then we're done,
- * otherwise remap it since we want to keep the virt addrs for
- * each request contiguous.
- *
- * We make the assumption here that if the bottom and top
- * of the range we want are mapped then it's mapped to the
- * same virt address (and this is contiguous).
- * -- Cort
+ * mapping.
*/
- if ((v = p_mapped_by_bats(p)) /*&& p_mapped_by_bats(p+size-1)*/ )
- goto out;
-
- if ((v = p_mapped_by_tlbcam(p)))
+ v = p_block_mapped(p);
+ if (v)
goto out;
if (slab_is_available()) {
@@ -278,7 +245,8 @@ void iounmap(volatile void __iomem *addr)
* If mapped by BATs then there is nothing to do.
* Calling vfree() generates a benign warning.
*/
- if (v_mapped_by_bats((unsigned long)addr)) return;
+ if (v_block_mapped((unsigned long)addr))
+ return;
if (addr > high_memory && (unsigned long) addr < ioremap_bot)
vunmap((void *) (PAGE_MASK & (unsigned long)addr));
@@ -403,7 +371,7 @@ static int __change_page_attr(struct page *page, pgprot_t prot)
BUG_ON(PageHighMem(page));
address = (unsigned long)page_address(page);
- if (v_mapped_by_bats(address) || v_mapped_by_tlbcam(address))
+ if (v_block_mapped(address))
return 0;
if (!get_pteptr(&init_mm, address, &kpte, &kpmd))
return -EINVAL;
diff --git a/arch/powerpc/mm/ppc_mmu_32.c b/arch/powerpc/mm/ppc_mmu_32.c
index 6b2f3e4..2a049fb 100644
--- a/arch/powerpc/mm/ppc_mmu_32.c
+++ b/arch/powerpc/mm/ppc_mmu_32.c
@@ -49,7 +49,7 @@ struct batrange { /* stores address ranges mapped by BATs */
/*
* Return PA for this VA if it is mapped by a BAT, or 0
*/
-phys_addr_t v_mapped_by_bats(unsigned long va)
+phys_addr_t v_block_mapped(unsigned long va)
{
int b;
for (b = 0; b < 4; ++b)
@@ -61,7 +61,7 @@ phys_addr_t v_mapped_by_bats(unsigned long va)
/*
* Return VA for a given PA or 0 if not mapped
*/
-unsigned long p_mapped_by_bats(phys_addr_t pa)
+unsigned long p_block_mapped(phys_addr_t pa)
{
int b;
for (b = 0; b < 4; ++b)
--
2.1.0
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH v5 07/23] powerpc/8xx: Fix vaddr for IMMR early remap
2016-02-03 22:53 [PATCH v5 00/23] powerpc/8xx: Use large pages for RAM and IMMR and other improvments Christophe Leroy
` (5 preceding siblings ...)
2016-02-03 22:54 ` [PATCH v5 06/23] powerpc32: refactor x_mapped_by_bats() and x_mapped_by_tlbcam() together Christophe Leroy
@ 2016-02-03 22:54 ` Christophe Leroy
2016-02-03 22:54 ` [PATCH v5 08/23] powerpc/8xx: Map IMMR area with 512k page at a fixed address Christophe Leroy
` (15 subsequent siblings)
22 siblings, 0 replies; 31+ messages in thread
From: Christophe Leroy @ 2016-02-03 22:54 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Scott Wood
Cc: linux-kernel, linuxppc-dev
Memory: 124428K/131072K available (3748K kernel code, 188K rwdata,
648K rodata, 508K init, 290K bss, 6644K reserved)
Kernel virtual memory layout:
* 0xfffdf000..0xfffff000 : fixmap
* 0xfde00000..0xfe000000 : consistent mem
* 0xfddf6000..0xfde00000 : early ioremap
* 0xc9000000..0xfddf6000 : vmalloc & ioremap
SLUB: HWalign=16, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
Today, IMMR is mapped 1:1 at startup
Mapping IMMR 1:1 is just wrong because it may overlap with another
area. On most mpc8xx boards it is OK as IMMR is set to 0xff000000
but for instance on EP88xC board, IMMR is at 0xfa200000 which
overlaps with VM ioremap area
This patch fixes the virtual address for remapping IMMR with the fixmap
regardless of the value of IMMR.
The size of IMMR area is 256kbytes (CPM at offset 0, security engine
at offset 128k) so a 512k page is enough
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
v2: no change
v3: Using fixmap instead of fixed address
v4: Fix a wrong #if notified by kbuild robot
v5: no change
arch/powerpc/include/asm/fixmap.h | 7 +++++++
arch/powerpc/kernel/asm-offsets.c | 8 ++++++++
arch/powerpc/kernel/head_8xx.S | 11 ++++++-----
arch/powerpc/mm/mmu_decl.h | 7 +++++++
arch/powerpc/sysdev/cpm_common.c | 15 ++++++++++++---
5 files changed, 40 insertions(+), 8 deletions(-)
diff --git a/arch/powerpc/include/asm/fixmap.h b/arch/powerpc/include/asm/fixmap.h
index 90f604b..d7dd8fb 100644
--- a/arch/powerpc/include/asm/fixmap.h
+++ b/arch/powerpc/include/asm/fixmap.h
@@ -51,6 +51,13 @@ enum fixed_addresses {
FIX_KMAP_BEGIN, /* reserved pte's for temporary kernel mappings */
FIX_KMAP_END = FIX_KMAP_BEGIN+(KM_TYPE_NR*NR_CPUS)-1,
#endif
+#ifdef CONFIG_PPC_8xx
+ /* For IMMR we need an aligned 512K area */
+ FIX_IMMR_START,
+ FIX_IMMR_TOP = (FIX_IMMR_START - 1 + ((512 * 1024) / PAGE_SIZE)) &
+ ~(((512 * 1024) / PAGE_SIZE) - 1),
+ FIX_IMMR_BASE = FIX_IMMR_TOP - 1 + ((512 * 1024) / PAGE_SIZE),
+#endif
/* FIX_PCIE_MCFG, */
__end_of_fixed_addresses
};
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 07cebc3..9724ff8 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -68,6 +68,10 @@
#include "../mm/mmu_decl.h"
#endif
+#ifdef CONFIG_PPC_8xx
+#include <asm/fixmap.h>
+#endif
+
int main(void)
{
DEFINE(THREAD, offsetof(struct task_struct, thread));
@@ -772,5 +776,9 @@ int main(void)
DEFINE(PPC_DBELL_SERVER, PPC_DBELL_SERVER);
+#ifdef CONFIG_PPC_8xx
+ DEFINE(VIRT_IMMR_BASE, __fix_to_virt(FIX_IMMR_BASE));
+#endif
+
return 0;
}
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 87d1f5f..09173ae 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -30,6 +30,7 @@
#include <asm/ppc_asm.h>
#include <asm/asm-offsets.h>
#include <asm/ptrace.h>
+#include <asm/fixmap.h>
/* Macro to make the code more readable. */
#ifdef CONFIG_8xx_CPU6
@@ -763,7 +764,7 @@ start_here:
* virtual to physical. Also, set the cache mode since that is defined
* by TLB entries and perform any additional mapping (like of the IMMR).
* If configured to pin some TLBs, we pin the first 8 Mbytes of kernel,
- * 24 Mbytes of data, and the 8M IMMR space. Anything not covered by
+ * 24 Mbytes of data, and the 512k IMMR space. Anything not covered by
* these mappings is mapped by page tables.
*/
initial_mmu:
@@ -812,7 +813,7 @@ initial_mmu:
ori r8, r8, MD_APG_INIT@l
mtspr SPRN_MD_AP, r8
- /* Map another 8 MByte at the IMMR to get the processor
+ /* Map a 512k page for the IMMR to get the processor
* internal registers (among other things).
*/
#ifdef CONFIG_PIN_TLB
@@ -820,12 +821,12 @@ initial_mmu:
mtspr SPRN_MD_CTR, r10
#endif
mfspr r9, 638 /* Get current IMMR */
- andis. r9, r9, 0xff80 /* Get 8Mbyte boundary */
+ andis. r9, r9, 0xfff8 /* Get 512 kbytes boundary */
- mr r8, r9 /* Create vaddr for TLB */
+ lis r8, VIRT_IMMR_BASE@h /* Create vaddr for TLB */
ori r8, r8, MD_EVALID /* Mark it valid */
mtspr SPRN_MD_EPN, r8
- li r8, MD_PS8MEG /* Set 8M byte page */
+ li r8, MD_PS512K | MD_GUARDED /* Set 512k byte page */
ori r8, r8, MD_SVALID /* Make it valid */
mtspr SPRN_MD_TWC, r8
mr r8, r9 /* Create paddr for TLB */
diff --git a/arch/powerpc/mm/mmu_decl.h b/arch/powerpc/mm/mmu_decl.h
index 40dd5d3..e7228b7 100644
--- a/arch/powerpc/mm/mmu_decl.h
+++ b/arch/powerpc/mm/mmu_decl.h
@@ -107,6 +107,13 @@ struct hash_pte;
extern struct hash_pte *Hash, *Hash_end;
extern unsigned long Hash_size, Hash_mask;
+#define PHYS_IMMR_BASE (mfspr(SPRN_IMMR) & 0xfff80000)
+#ifdef CONFIG_PPC_8xx
+#define VIRT_IMMR_BASE (__fix_to_virt(FIX_IMMR_BASE))
+#else
+#define VIRT_IMMR_BASE PHYS_IMMR_BASE
+#endif
+
#endif /* CONFIG_PPC32 */
#ifdef CONFIG_PPC64
diff --git a/arch/powerpc/sysdev/cpm_common.c b/arch/powerpc/sysdev/cpm_common.c
index 9d32465..94476f4 100644
--- a/arch/powerpc/sysdev/cpm_common.c
+++ b/arch/powerpc/sysdev/cpm_common.c
@@ -28,6 +28,7 @@
#include <asm/udbg.h>
#include <asm/io.h>
#include <asm/cpm.h>
+#include <asm/fixmap.h>
#include <soc/fsl/qe/qe.h>
#include <mm/mmu_decl.h>
@@ -37,12 +38,16 @@
#endif
#ifdef CONFIG_PPC_EARLY_DEBUG_CPM
-static u32 __iomem *cpm_udbg_txdesc =
- (u32 __iomem __force *)CONFIG_PPC_EARLY_DEBUG_CPM_ADDR;
+static u32 __iomem *cpm_udbg_txdesc;
static void udbg_putc_cpm(char c)
{
- u8 __iomem *txbuf = (u8 __iomem __force *)in_be32(&cpm_udbg_txdesc[1]);
+ static u8 __iomem *txbuf;
+
+ if (unlikely(txbuf == NULL))
+ txbuf = (u8 __iomem __force *)
+ (in_be32(&cpm_udbg_txdesc[1]) - PHYS_IMMR_BASE +
+ VIRT_IMMR_BASE);
if (c == '\n')
udbg_putc_cpm('\r');
@@ -56,6 +61,10 @@ static void udbg_putc_cpm(char c)
void __init udbg_init_cpm(void)
{
+ cpm_udbg_txdesc = (u32 __iomem __force *)
+ (CONFIG_PPC_EARLY_DEBUG_CPM_ADDR - PHYS_IMMR_BASE +
+ VIRT_IMMR_BASE);
+
if (cpm_udbg_txdesc) {
#ifdef CONFIG_CPM2
setbat(1, 0xf0000000, 0xf0000000, 1024*1024, PAGE_KERNEL_NCG);
--
2.1.0
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH v5 08/23] powerpc/8xx: Map IMMR area with 512k page at a fixed address
2016-02-03 22:53 [PATCH v5 00/23] powerpc/8xx: Use large pages for RAM and IMMR and other improvments Christophe Leroy
` (6 preceding siblings ...)
2016-02-03 22:54 ` [PATCH v5 07/23] powerpc/8xx: Fix vaddr for IMMR early remap Christophe Leroy
@ 2016-02-03 22:54 ` Christophe Leroy
2016-02-04 9:58 ` kbuild test robot
2016-02-03 22:54 ` [PATCH v5 09/23] powerpc/8xx: CONFIG_PIN_TLB unneeded for CONFIG_PPC_EARLY_DEBUG_CPM Christophe Leroy
` (14 subsequent siblings)
22 siblings, 1 reply; 31+ messages in thread
From: Christophe Leroy @ 2016-02-03 22:54 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Scott Wood
Cc: linux-kernel, linuxppc-dev
Once the linear memory space has been mapped with 8Mb pages, as
seen in the related commit, we get 11 millions DTLB missed during
the reference 600s period. 77% of the misses are on user addresses
and 23% are on kernel addresses (1 fourth for linear address space
and 3 fourth for virtual address space)
Traditionaly, each driver manages one computer board which has its
own components with its own memory maps.
But on embedded chips like the MPC8xx, the SOC has all registers
located in the same IO area.
When looking at ioremaps done during startup, we see that
many drivers are re-mapping small parts of the IMMR for their own use
and all those small pieces gets their own 4k page, amplifying the
number of TLB misses: in our system we get 0xff000000 mapped 31 times
and 0xff003000 mapped 9 times.
Even if each part of IMMR was mapped only once with 4k pages, it would
still be several small mappings towards linear area.
With the patch, on the same principle as what was done for the RAM,
the IMMR gets mapped by a 512k page.
In 4k pages mode, we reserve a 4Mb area for mapping IMMR. The TLB
miss handler checks that we are within the first 512k and bail out
with page not marked valid if we are outside
In 16k pages mode, it is not realistic to reserve a 64Mb area, so
we do a standard mapping of the 512k area using 32 pages of 16k.
The CPM will be mapped via the first two pages, and the SEC engine
will be mapped via the 16th and 17th pages. As the pages are marked
guarded, there will be no speculative accesses.
With this patch applied, the number of DTLB misses during the 10 min
period is reduced to 11.8 millions for a duration of 5.8s, which
represents 2% of the non-idle time hence yet another 10% reduction.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
v2:
- using bt instead of blt/bgt
- reorganised in order to have only one taken branch for both 512k
and 8M instead of a first branch for both 8M and 512k then a second
branch for 512k
v3:
- using fixmap
- using the new x_block_mapped() functions
v4: no change
v5: no change
arch/powerpc/include/asm/fixmap.h | 9 ++++++-
arch/powerpc/kernel/head_8xx.S | 36 +++++++++++++++++++++++++-
arch/powerpc/mm/8xx_mmu.c | 53 +++++++++++++++++++++++++++++++++++++++
arch/powerpc/mm/mmu_decl.h | 3 ++-
4 files changed, 98 insertions(+), 3 deletions(-)
diff --git a/arch/powerpc/include/asm/fixmap.h b/arch/powerpc/include/asm/fixmap.h
index d7dd8fb..b954dc3 100644
--- a/arch/powerpc/include/asm/fixmap.h
+++ b/arch/powerpc/include/asm/fixmap.h
@@ -52,12 +52,19 @@ enum fixed_addresses {
FIX_KMAP_END = FIX_KMAP_BEGIN+(KM_TYPE_NR*NR_CPUS)-1,
#endif
#ifdef CONFIG_PPC_8xx
- /* For IMMR we need an aligned 512K area */
FIX_IMMR_START,
+#ifdef CONFIG_PPC_4K_PAGES
+ /* For IMMR we need an aligned 4M area (full PGD entry) */
+ FIX_IMMR_TOP = (FIX_IMMR_START - 1 + ((4 * 1024 * 1024) / PAGE_SIZE)) &
+ ~(((4 * 1024 * 1024) / PAGE_SIZE) - 1),
+ FIX_IMMR_BASE = FIX_IMMR_TOP - 1 + ((4 * 1024 * 1024) / PAGE_SIZE),
+#else
+ /* For IMMR we need an aligned 512K area */
FIX_IMMR_TOP = (FIX_IMMR_START - 1 + ((512 * 1024) / PAGE_SIZE)) &
~(((512 * 1024) / PAGE_SIZE) - 1),
FIX_IMMR_BASE = FIX_IMMR_TOP - 1 + ((512 * 1024) / PAGE_SIZE),
#endif
+#endif
/* FIX_PCIE_MCFG, */
__end_of_fixed_addresses
};
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 09173ae..ae721a1 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -254,6 +254,37 @@ DataAccess:
. = 0x400
InstructionAccess:
+/*
+ * Bottom part of DTLBMiss handler for 512k pages
+ * not enough space in the primary location
+ */
+#ifdef CONFIG_PPC_4K_PAGES
+/*
+ * 512k pages are only used for mapping IMMR area in 4K pages mode.
+ * Only map the first 512k page of the 4M area covered by the PGD entry.
+ * This should not happen, but if we are called for another page of that
+ * area, don't mark it valid
+ *
+ * In 16k pages mode, IMMR is directly mapped with 16k pages
+ */
+DTLBMiss512k:
+ rlwinm. r10, r10, 0, 0x00380000
+ bne- 1f
+ ori r11, r11, MD_SVALID
+1: mtcr r3
+ MTSPR_CPU6(SPRN_MD_TWC, r11, r3)
+ rlwinm r10, r11, 0, 0xffc00000
+ ori r10, r10, 0xf0 | MD_SPS16K | _PAGE_SHARED | _PAGE_DIRTY | \
+ _PAGE_PRESENT | _PAGE_NO_CACHE
+ MTSPR_CPU6(SPRN_MD_RPN, r10, r3) /* Update TLB entry */
+
+ li r11, RPN_PATTERN
+ mfspr r3, SPRN_SPRG_SCRATCH2
+ mtspr SPRN_DAR, r11 /* Tag DAR */
+ EXCEPTION_EPILOG_0
+ rfi
+#endif
+
/* External interrupt */
EXCEPTION(0x500, HardwareInterrupt, do_IRQ, EXC_XFER_LITE)
@@ -405,6 +436,9 @@ DataStoreTLBMiss:
lwz r11, (swapper_pg_dir-PAGE_OFFSET)@l(r11) /* Get the level 1 entry */
mtcr r11
bt- 28,DTLBMiss8M /* bit 28 = Large page (8M) */
+#ifdef CONFIG_PPC_4K_PAGES
+ bt- 29,DTLBMiss512k /* bit 29 = Large page (8M or 512K) */
+#endif
mtcr r3
/* We have a pte table, so load fetch the pte from the table.
@@ -559,7 +593,7 @@ FixupDAR:/* Entry point for dcbx workaround. */
3: rlwimi r11, r10, 32 - ((PAGE_SHIFT - 2) << 1), (PAGE_SHIFT - 2) << 1, 29
lwz r11, (swapper_pg_dir-PAGE_OFFSET)@l(r11) /* Get the level 1 entry */
mtcr r11
- bt 28,200f /* bit 28 = Large page (8M) */
+ bt 29,200f /* bit 29 = Large page (8M or 512K) */
rlwinm r11, r11,0,0,19 /* Extract page descriptor page address */
/* Insert level 2 index */
rlwimi r11, r10, 32 - (PAGE_SHIFT - 2), 32 - PAGE_SHIFT, 29
diff --git a/arch/powerpc/mm/8xx_mmu.c b/arch/powerpc/mm/8xx_mmu.c
index a84f5eb..4de93b2 100644
--- a/arch/powerpc/mm/8xx_mmu.c
+++ b/arch/powerpc/mm/8xx_mmu.c
@@ -13,10 +13,61 @@
*/
#include <linux/memblock.h>
+#include <asm/fixmap.h>
#include "mmu_decl.h"
+#define IMMR_SIZE (__fix_to_virt(FIX_IMMR_TOP) + PAGE_SIZE - VIRT_IMMR_BASE)
+
extern int __map_without_ltlbs;
+
+/*
+ * Return PA for this VA if it is in IMMR area, or 0
+ */
+phys_addr_t v_block_mapped(unsigned long va)
+{
+ unsigned long p = PHYS_IMMR_BASE;
+
+ if (__map_without_ltlbs)
+ return 0;
+ if (va >= VIRT_IMMR_BASE && va < VIRT_IMMR_BASE + IMMR_SIZE)
+ return p + va - VIRT_IMMR_BASE;
+ return 0;
+}
+
+/*
+ * Return VA for a given PA or 0 if not mapped
+ */
+unsigned long p_block_mapped(phys_addr_t pa)
+{
+ unsigned long p = PHYS_IMMR_BASE;
+
+ if (__map_without_ltlbs)
+ return 0;
+ if (pa >= p && pa < p + IMMR_SIZE)
+ return VIRT_IMMR_BASE + pa - p;
+ return 0;
+}
+
+static void mmu_mapin_immr(void)
+{
+ unsigned long p = PHYS_IMMR_BASE;
+ unsigned long v = VIRT_IMMR_BASE;
+#ifdef CONFIG_PPC_4K_PAGES
+ pmd_t *pmdp;
+ unsigned long val = p | MD_PS512K | MD_GUARDED;
+
+ pmdp = pmd_offset(pud_offset(pgd_offset_k(v), v), v);
+ pmd_val(*pmdp) = val;
+#else /* CONFIG_PPC_16K_PAGES */
+ unsigned long f = pgprot_val(PAGE_KERNEL_NCG);
+ int offset;
+
+ for (offset = 0; offset < IMMR_SIZE; offset += PAGE_SIZE)
+ map_page(v + offset, p + offset, f);
+#endif
+}
+
/*
* MMU_init_hw does the chip-specific initialization of the MMU hardware.
*/
@@ -79,6 +130,8 @@ unsigned long __init mmu_mapin_ram(unsigned long top)
*/
memblock_set_current_limit(mapped);
+ mmu_mapin_immr();
+
return mapped;
}
diff --git a/arch/powerpc/mm/mmu_decl.h b/arch/powerpc/mm/mmu_decl.h
index e7228b7..3872332 100644
--- a/arch/powerpc/mm/mmu_decl.h
+++ b/arch/powerpc/mm/mmu_decl.h
@@ -166,9 +166,10 @@ struct tlbcam {
};
#endif
-#if defined(CONFIG_6xx) || defined(CONFIG_FSL_BOOKE)
+#if defined(CONFIG_6xx) || defined(CONFIG_FSL_BOOKE) || defined(CONFIG_PPC_8xx)
/* 6xx have BATS */
/* FSL_BOOKE have TLBCAM */
+/* 8xx have LTLB */
phys_addr_t v_block_mapped(unsigned long va);
unsigned long p_block_mapped(phys_addr_t pa);
#else
--
2.1.0
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH v5 09/23] powerpc/8xx: CONFIG_PIN_TLB unneeded for CONFIG_PPC_EARLY_DEBUG_CPM
2016-02-03 22:53 [PATCH v5 00/23] powerpc/8xx: Use large pages for RAM and IMMR and other improvments Christophe Leroy
` (7 preceding siblings ...)
2016-02-03 22:54 ` [PATCH v5 08/23] powerpc/8xx: Map IMMR area with 512k page at a fixed address Christophe Leroy
@ 2016-02-03 22:54 ` Christophe Leroy
2016-02-03 22:54 ` [PATCH v5 10/23] powerpc/8xx: map more RAM at startup when needed Christophe Leroy
` (13 subsequent siblings)
22 siblings, 0 replies; 31+ messages in thread
From: Christophe Leroy @ 2016-02-03 22:54 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Scott Wood
Cc: linux-kernel, linuxppc-dev
IMMR is now mapped by page tables so it is not
anymore necessary to PIN TLBs
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
v2: no change
v3: no change
v4: no change
v5: no change
arch/powerpc/Kconfig.debug | 1 -
1 file changed, 1 deletion(-)
diff --git a/arch/powerpc/Kconfig.debug b/arch/powerpc/Kconfig.debug
index 638f9ce..136b09c 100644
--- a/arch/powerpc/Kconfig.debug
+++ b/arch/powerpc/Kconfig.debug
@@ -220,7 +220,6 @@ config PPC_EARLY_DEBUG_40x
config PPC_EARLY_DEBUG_CPM
bool "Early serial debugging for Freescale CPM-based serial ports"
depends on SERIAL_CPM
- select PIN_TLB if PPC_8xx
help
Select this to enable early debugging for Freescale chips
using a CPM-based serial port. This assumes that the bootwrapper
--
2.1.0
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH v5 10/23] powerpc/8xx: map more RAM at startup when needed
2016-02-03 22:53 [PATCH v5 00/23] powerpc/8xx: Use large pages for RAM and IMMR and other improvments Christophe Leroy
` (8 preceding siblings ...)
2016-02-03 22:54 ` [PATCH v5 09/23] powerpc/8xx: CONFIG_PIN_TLB unneeded for CONFIG_PPC_EARLY_DEBUG_CPM Christophe Leroy
@ 2016-02-03 22:54 ` Christophe Leroy
2016-02-03 22:54 ` [PATCH v5 11/23] powerpc32: Remove useless/wrong MMU:setio progress message Christophe Leroy
` (12 subsequent siblings)
22 siblings, 0 replies; 31+ messages in thread
From: Christophe Leroy @ 2016-02-03 22:54 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Scott Wood
Cc: linux-kernel, linuxppc-dev
On recent kernels, with some debug options like for instance
CONFIG_LOCKDEP, the BSS requires more than 8M memory, allthough
the kernel code fits in the first 8M.
Today, it is necessary to activate CONFIG_PIN_TLB to get more than 8M
at startup, allthough pinning TLB is not necessary for that.
This patch adds more pages (up to 24Mb) to the initial mapping if
possible/needed in order to have the necessary mappings regardless of
CONFIG_PIN_TLB.
We could have mapped 16M or 24M inconditionnally but since some
platforms only have 8M memory, we need something a bit more elaborated
Therefore, if the bootloader is compliant with ePAPR standard, we use
r7 to know how much memory was mapped by the bootloader.
Otherwise, we try to determine the required memory size by looking at
the _end symbol and the address of the device tree.
This patch does not modify the behaviour when CONFIG_PIN_TLB is
selected.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
v2: no change
v3: Automatic detection of available/needed memory instead of allocating 16M for all.
v4: no change
v5: no change
arch/powerpc/kernel/head_8xx.S | 56 +++++++++++++++++++++++++++++++++++++++---
arch/powerpc/mm/8xx_mmu.c | 10 +++-----
2 files changed, 56 insertions(+), 10 deletions(-)
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index ae721a1..a268cf4 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -72,6 +72,9 @@
#define RPN_PATTERN 0x00f0
#endif
+/* ePAPR magic value for non BOOK III-E CPUs */
+#define EPAPR_SMAGIC 0x65504150
+
__HEAD
_ENTRY(_stext);
_ENTRY(_start);
@@ -101,6 +104,38 @@ _ENTRY(_start);
*/
.globl __start
__start:
+/*
+ * Determine initial RAM size
+ *
+ * If the Bootloader is ePAPR compliant, the size is given in r7
+ * otherwise, we have to determine how much is needed. For that, we have to
+ * check whether _end of kernel and device tree are within the first 8Mb.
+ */
+ lis r30, 0x00800000@h /* 8Mb by default */
+
+ lis r8, EPAPR_SMAGIC@h
+ ori r8, r8, EPAPR_SMAGIC@l
+ cmplw cr0, r8, r6
+ bne 1f
+ lis r30, 0x01800000@h /* 24Mb max */
+ cmplw cr0, r7, r30
+ bgt 2f
+ mr r30, r7 /* save initial ram size */
+ b 2f
+1:
+ /* is kernel _end or DTB in the first 8M ? if not map 16M */
+ lis r8, (_end - PAGE_OFFSET)@h
+ ori r8, r8, (_end - PAGE_OFFSET)@l
+ addi r8, r8, -1
+ or r8, r8, r3
+ cmplw cr0, r8, r30
+ blt 2f
+ lis r30, 0x01000000@h /* 16Mb */
+ /* is kernel _end or DTB in the first 16M ? if not map 24M */
+ cmplw cr0, r8, r30
+ blt 2f
+ lis r30, 0x01800000@h /* 24Mb */
+2:
mr r31,r3 /* save device tree ptr */
/* We have to turn on the MMU right away so we get cache modes
@@ -737,6 +772,8 @@ start_here:
/*
* Decide what sort of machine this is and initialize the MMU.
*/
+ lis r3, initial_memory_size@ha
+ stw r30, initial_memory_size@l(r3)
li r3,0
mr r4,r31
bl machine_init
@@ -868,10 +905,15 @@ initial_mmu:
mtspr SPRN_MD_RPN, r8
#ifdef CONFIG_PIN_TLB
- /* Map two more 8M kernel data pages.
- */
+ /* Map one more 8M kernel data page. */
addi r10, r10, 0x0100
mtspr SPRN_MD_CTR, r10
+#else
+ /* Map one more 8M kernel data page if needed */
+ lis r10, 0x00800000@h
+ cmplw cr0, r30, r10
+ ble 1f
+#endif
lis r8, KERNELBASE@h /* Create vaddr for TLB */
addis r8, r8, 0x0080 /* Add 8M */
@@ -884,20 +926,28 @@ initial_mmu:
addis r11, r11, 0x0080 /* Add 8M */
mtspr SPRN_MD_RPN, r11
+#ifdef CONFIG_PIN_TLB
+ /* Map one more 8M kernel data page. */
addi r10, r10, 0x0100
mtspr SPRN_MD_CTR, r10
+#else
+ /* Map one more 8M kernel data page if needed */
+ lis r10, 0x01000000@h
+ cmplw cr0, r30, r10
+ ble 1f
+#endif
addis r8, r8, 0x0080 /* Add 8M */
mtspr SPRN_MD_EPN, r8
mtspr SPRN_MD_TWC, r9
addis r11, r11, 0x0080 /* Add 8M */
mtspr SPRN_MD_RPN, r11
-#endif
/* Since the cache is enabled according to the information we
* just loaded into the TLB, invalidate and enable the caches here.
* We should probably check/set other modes....later.
*/
+1:
lis r8, IDC_INVALL@h
mtspr SPRN_IC_CST, r8
mtspr SPRN_DC_CST, r8
diff --git a/arch/powerpc/mm/8xx_mmu.c b/arch/powerpc/mm/8xx_mmu.c
index 4de93b2..6965575 100644
--- a/arch/powerpc/mm/8xx_mmu.c
+++ b/arch/powerpc/mm/8xx_mmu.c
@@ -20,6 +20,7 @@
#define IMMR_SIZE (__fix_to_virt(FIX_IMMR_TOP) + PAGE_SIZE - VIRT_IMMR_BASE)
extern int __map_without_ltlbs;
+int initial_memory_size; /* size of initial memory mapped by head_8xx.S */
/*
* Return PA for this VA if it is in IMMR area, or 0
@@ -143,11 +144,6 @@ void setup_initial_memory_limit(phys_addr_t first_memblock_base,
*/
BUG_ON(first_memblock_base != 0);
-#ifdef CONFIG_PIN_TLB
- /* 8xx can only access 24MB at the moment */
- memblock_set_current_limit(min_t(u64, first_memblock_size, 0x01800000));
-#else
- /* 8xx can only access 8MB at the moment */
- memblock_set_current_limit(min_t(u64, first_memblock_size, 0x00800000));
-#endif
+ memblock_set_current_limit(min_t(u64, first_memblock_size,
+ initial_memory_size));
}
--
2.1.0
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH v5 11/23] powerpc32: Remove useless/wrong MMU:setio progress message
2016-02-03 22:53 [PATCH v5 00/23] powerpc/8xx: Use large pages for RAM and IMMR and other improvments Christophe Leroy
` (9 preceding siblings ...)
2016-02-03 22:54 ` [PATCH v5 10/23] powerpc/8xx: map more RAM at startup when needed Christophe Leroy
@ 2016-02-03 22:54 ` Christophe Leroy
2016-02-03 22:54 ` [PATCH v5 12/23] powerpc32: remove ioremap_base Christophe Leroy
` (11 subsequent siblings)
22 siblings, 0 replies; 31+ messages in thread
From: Christophe Leroy @ 2016-02-03 22:54 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Scott Wood
Cc: linux-kernel, linuxppc-dev
Commit 771168494719 ("[POWERPC] Remove unused machine call outs")
removed the call to setup_io_mappings(), so remove the associated
progress line message
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
v2: no change
v3: no change
v4: no change
v5: no change
arch/powerpc/mm/init_32.c | 4 ----
1 file changed, 4 deletions(-)
diff --git a/arch/powerpc/mm/init_32.c b/arch/powerpc/mm/init_32.c
index 1a18e4b..4eb1b8f 100644
--- a/arch/powerpc/mm/init_32.c
+++ b/arch/powerpc/mm/init_32.c
@@ -178,10 +178,6 @@ void __init MMU_init(void)
/* Initialize early top-down ioremap allocator */
ioremap_bot = IOREMAP_TOP;
- /* Map in I/O resources */
- if (ppc_md.progress)
- ppc_md.progress("MMU:setio", 0x302);
-
if (ppc_md.progress)
ppc_md.progress("MMU:exit", 0x211);
--
2.1.0
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH v5 12/23] powerpc32: remove ioremap_base
2016-02-03 22:53 [PATCH v5 00/23] powerpc/8xx: Use large pages for RAM and IMMR and other improvments Christophe Leroy
` (10 preceding siblings ...)
2016-02-03 22:54 ` [PATCH v5 11/23] powerpc32: Remove useless/wrong MMU:setio progress message Christophe Leroy
@ 2016-02-03 22:54 ` Christophe Leroy
2016-02-03 22:54 ` [PATCH v5 13/23] powerpc/8xx: Add missing SPRN defines into reg_8xx.h Christophe Leroy
` (10 subsequent siblings)
22 siblings, 0 replies; 31+ messages in thread
From: Christophe Leroy @ 2016-02-03 22:54 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Scott Wood
Cc: linux-kernel, linuxppc-dev
ioremap_base is not initialised and is nowhere used so remove it
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
v2: no change
v3: fix comment as well
v4: no change
v5: no change
arch/powerpc/include/asm/nohash/32/pgtable.h | 2 +-
arch/powerpc/mm/mmu_decl.h | 1 -
arch/powerpc/mm/pgtable_32.c | 3 +--
arch/powerpc/platforms/embedded6xx/mpc10x.h | 10 ----------
4 files changed, 2 insertions(+), 14 deletions(-)
diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h b/arch/powerpc/include/asm/nohash/32/pgtable.h
index e201600..7808475 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -86,7 +86,7 @@ extern int icache_44x_need_flush;
* We no longer map larger than phys RAM with the BATs so we don't have
* to worry about the VMALLOC_OFFSET causing problems. We do have to worry
* about clashes between our early calls to ioremap() that start growing down
- * from ioremap_base being run into the VM area allocations (growing upwards
+ * from IOREMAP_TOP being run into the VM area allocations (growing upwards
* from VMALLOC_START). For this reason we have ioremap_bot to check when
* we actually run into our mappings setup in the early boot with the VM
* system. This really does become a problem for machines with good amounts
diff --git a/arch/powerpc/mm/mmu_decl.h b/arch/powerpc/mm/mmu_decl.h
index 3872332..53564a3 100644
--- a/arch/powerpc/mm/mmu_decl.h
+++ b/arch/powerpc/mm/mmu_decl.h
@@ -100,7 +100,6 @@ extern void setbat(int index, unsigned long virt, phys_addr_t phys,
extern int __map_without_bats;
extern int __allow_ioremap_reserved;
-extern unsigned long ioremap_base;
extern unsigned int rtas_data, rtas_size;
struct hash_pte;
diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c
index db0d35e..815ccd7 100644
--- a/arch/powerpc/mm/pgtable_32.c
+++ b/arch/powerpc/mm/pgtable_32.c
@@ -37,7 +37,6 @@
#include "mmu_decl.h"
-unsigned long ioremap_base;
unsigned long ioremap_bot;
EXPORT_SYMBOL(ioremap_bot); /* aka VMALLOC_END */
@@ -173,7 +172,7 @@ __ioremap_caller(phys_addr_t addr, unsigned long size, unsigned long flags,
/*
* Choose an address to map it to.
* Once the vmalloc system is running, we use it.
- * Before then, we use space going down from ioremap_base
+ * Before then, we use space going down from IOREMAP_TOP
* (ioremap_bot records where we're up to).
*/
p = addr & PAGE_MASK;
diff --git a/arch/powerpc/platforms/embedded6xx/mpc10x.h b/arch/powerpc/platforms/embedded6xx/mpc10x.h
index b290b63..5ad1202 100644
--- a/arch/powerpc/platforms/embedded6xx/mpc10x.h
+++ b/arch/powerpc/platforms/embedded6xx/mpc10x.h
@@ -24,13 +24,11 @@
* Processor: 0x80000000 - 0x807fffff -> PCI I/O: 0x00000000 - 0x007fffff
* Processor: 0xc0000000 - 0xdfffffff -> PCI MEM: 0x00000000 - 0x1fffffff
* PCI MEM: 0x80000000 -> Processor System Memory: 0x00000000
- * EUMB mapped to: ioremap_base - 0x00100000 (ioremap_base - 1 MB)
*
* MAP B (CHRP Map)
* Processor: 0xfe000000 - 0xfebfffff -> PCI I/O: 0x00000000 - 0x00bfffff
* Processor: 0x80000000 - 0xbfffffff -> PCI MEM: 0x80000000 - 0xbfffffff
* PCI MEM: 0x00000000 -> Processor System Memory: 0x00000000
- * EUMB mapped to: ioremap_base - 0x00100000 (ioremap_base - 1 MB)
*/
/*
@@ -138,14 +136,6 @@
#define MPC10X_EUMB_WP_OFFSET 0x000ff000 /* Data path diagnostic, watchpoint reg offset */
#define MPC10X_EUMB_WP_SIZE 0x00001000 /* Data path diagnostic, watchpoint reg size */
-/*
- * Define some recommended places to put the EUMB regs.
- * For both maps, recommend putting the EUMB from 0xeff00000 to 0xefffffff.
- */
-extern unsigned long ioremap_base;
-#define MPC10X_MAPA_EUMB_BASE (ioremap_base - MPC10X_EUMB_SIZE)
-#define MPC10X_MAPB_EUMB_BASE MPC10X_MAPA_EUMB_BASE
-
enum ppc_sys_devices {
MPC10X_IIC1,
MPC10X_DMA0,
--
2.1.0
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH v5 13/23] powerpc/8xx: Add missing SPRN defines into reg_8xx.h
2016-02-03 22:53 [PATCH v5 00/23] powerpc/8xx: Use large pages for RAM and IMMR and other improvments Christophe Leroy
` (11 preceding siblings ...)
2016-02-03 22:54 ` [PATCH v5 12/23] powerpc32: remove ioremap_base Christophe Leroy
@ 2016-02-03 22:54 ` Christophe Leroy
2016-02-03 22:54 ` [PATCH v5 14/23] powerpc/8xx: Handle CPU6 ERRATA directly in mtspr() macro Christophe Leroy
` (9 subsequent siblings)
22 siblings, 0 replies; 31+ messages in thread
From: Christophe Leroy @ 2016-02-03 22:54 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Scott Wood
Cc: linux-kernel, linuxppc-dev
Add missing SPRN defines into reg_8xx.h
Some of them are defined in mmu-8xx.h, so we include mmu-8xx.h in
reg_8xx.h, for that we remove references to PAGE_SHIFT in mmu-8xx.h
to have it self sufficient, as includers of reg_8xx.h don't all
include asm/page.h
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
v2: no change
v3: We just add missing ones, don't move anymore the ones from mmu-8xx.h
v4: no change
v5: no change
arch/powerpc/include/asm/mmu-8xx.h | 4 ++--
arch/powerpc/include/asm/reg_8xx.h | 11 +++++++++++
2 files changed, 13 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/include/asm/mmu-8xx.h b/arch/powerpc/include/asm/mmu-8xx.h
index f05500a..0a566f1 100644
--- a/arch/powerpc/include/asm/mmu-8xx.h
+++ b/arch/powerpc/include/asm/mmu-8xx.h
@@ -171,9 +171,9 @@ typedef struct {
} mm_context_t;
#endif /* !__ASSEMBLY__ */
-#if (PAGE_SHIFT == 12)
+#if defined(CONFIG_PPC_4K_PAGES)
#define mmu_virtual_psize MMU_PAGE_4K
-#elif (PAGE_SHIFT == 14)
+#elif defined(CONFIG_PPC_16K_PAGES)
#define mmu_virtual_psize MMU_PAGE_16K
#else
#error "Unsupported PAGE_SIZE"
diff --git a/arch/powerpc/include/asm/reg_8xx.h b/arch/powerpc/include/asm/reg_8xx.h
index e8ea346..0f71c81 100644
--- a/arch/powerpc/include/asm/reg_8xx.h
+++ b/arch/powerpc/include/asm/reg_8xx.h
@@ -4,6 +4,8 @@
#ifndef _ASM_POWERPC_REG_8xx_H
#define _ASM_POWERPC_REG_8xx_H
+#include <asm/mmu-8xx.h>
+
/* Cache control on the MPC8xx is provided through some additional
* special purpose registers.
*/
@@ -14,6 +16,15 @@
#define SPRN_DC_ADR 569 /* Address needed for some commands */
#define SPRN_DC_DAT 570 /* Read-only data register */
+/* Misc Debug */
+#define SPRN_DPDR 630
+#define SPRN_MI_CAM 816
+#define SPRN_MI_RAM0 817
+#define SPRN_MI_RAM1 818
+#define SPRN_MD_CAM 824
+#define SPRN_MD_RAM0 825
+#define SPRN_MD_RAM1 826
+
/* Commands. Only the first few are available to the instruction cache.
*/
#define IDC_ENABLE 0x02000000 /* Cache enable */
--
2.1.0
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH v5 14/23] powerpc/8xx: Handle CPU6 ERRATA directly in mtspr() macro
2016-02-03 22:53 [PATCH v5 00/23] powerpc/8xx: Use large pages for RAM and IMMR and other improvments Christophe Leroy
` (12 preceding siblings ...)
2016-02-03 22:54 ` [PATCH v5 13/23] powerpc/8xx: Add missing SPRN defines into reg_8xx.h Christophe Leroy
@ 2016-02-03 22:54 ` Christophe Leroy
2016-02-03 22:54 ` [PATCH v5 15/23] powerpc/8xx: remove special handling of CPU6 errata in set_dec() Christophe Leroy
` (8 subsequent siblings)
22 siblings, 0 replies; 31+ messages in thread
From: Christophe Leroy @ 2016-02-03 22:54 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Scott Wood
Cc: linux-kernel, linuxppc-dev
MPC8xx has an ERRATA on the use of mtspr() for some registers
This patch includes the ERRATA handling directly into mtspr() macro
so that mtspr() users don't need to bother about that errata
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
v2: no change
v3: no change
v4: no change
v5: no change
arch/powerpc/include/asm/reg.h | 2 +
arch/powerpc/include/asm/reg_8xx.h | 82 ++++++++++++++++++++++++++++++++++++++
2 files changed, 84 insertions(+)
diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index c4cb2ff..7b5d97f 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -1211,9 +1211,11 @@ static inline void mtmsr_isync(unsigned long val)
#define mfspr(rn) ({unsigned long rval; \
asm volatile("mfspr %0," __stringify(rn) \
: "=r" (rval)); rval;})
+#ifndef mtspr
#define mtspr(rn, v) asm volatile("mtspr " __stringify(rn) ",%0" : \
: "r" ((unsigned long)(v)) \
: "memory")
+#endif
extern void msr_check_and_set(unsigned long bits);
extern bool strict_msr_control;
diff --git a/arch/powerpc/include/asm/reg_8xx.h b/arch/powerpc/include/asm/reg_8xx.h
index 0f71c81..d41412c 100644
--- a/arch/powerpc/include/asm/reg_8xx.h
+++ b/arch/powerpc/include/asm/reg_8xx.h
@@ -50,4 +50,86 @@
#define DC_DFWT 0x40000000 /* Data cache is forced write through */
#define DC_LES 0x20000000 /* Caches are little endian mode */
+#ifdef CONFIG_8xx_CPU6
+#define do_mtspr_cpu6(rn, rn_addr, v) \
+ do { \
+ int _reg_cpu6 = rn_addr, _tmp_cpu6[1]; \
+ asm volatile("stw %0, %1;" \
+ "lwz %0, %1;" \
+ "mtspr " __stringify(rn) ",%2" : \
+ : "r" (_reg_cpu6), "m"(_tmp_cpu6), \
+ "r" ((unsigned long)(v)) \
+ : "memory"); \
+ } while (0)
+
+#define do_mtspr(rn, v) asm volatile("mtspr " __stringify(rn) ",%0" : \
+ : "r" ((unsigned long)(v)) \
+ : "memory")
+#define mtspr(rn, v) \
+ do { \
+ if (rn == SPRN_IMMR) \
+ do_mtspr_cpu6(rn, 0x3d30, v); \
+ else if (rn == SPRN_IC_CST) \
+ do_mtspr_cpu6(rn, 0x2110, v); \
+ else if (rn == SPRN_IC_ADR) \
+ do_mtspr_cpu6(rn, 0x2310, v); \
+ else if (rn == SPRN_IC_DAT) \
+ do_mtspr_cpu6(rn, 0x2510, v); \
+ else if (rn == SPRN_DC_CST) \
+ do_mtspr_cpu6(rn, 0x3110, v); \
+ else if (rn == SPRN_DC_ADR) \
+ do_mtspr_cpu6(rn, 0x3310, v); \
+ else if (rn == SPRN_DC_DAT) \
+ do_mtspr_cpu6(rn, 0x3510, v); \
+ else if (rn == SPRN_MI_CTR) \
+ do_mtspr_cpu6(rn, 0x2180, v); \
+ else if (rn == SPRN_MI_AP) \
+ do_mtspr_cpu6(rn, 0x2580, v); \
+ else if (rn == SPRN_MI_EPN) \
+ do_mtspr_cpu6(rn, 0x2780, v); \
+ else if (rn == SPRN_MI_TWC) \
+ do_mtspr_cpu6(rn, 0x2b80, v); \
+ else if (rn == SPRN_MI_RPN) \
+ do_mtspr_cpu6(rn, 0x2d80, v); \
+ else if (rn == SPRN_MI_CAM) \
+ do_mtspr_cpu6(rn, 0x2190, v); \
+ else if (rn == SPRN_MI_RAM0) \
+ do_mtspr_cpu6(rn, 0x2390, v); \
+ else if (rn == SPRN_MI_RAM1) \
+ do_mtspr_cpu6(rn, 0x2590, v); \
+ else if (rn == SPRN_MD_CTR) \
+ do_mtspr_cpu6(rn, 0x3180, v); \
+ else if (rn == SPRN_M_CASID) \
+ do_mtspr_cpu6(rn, 0x3380, v); \
+ else if (rn == SPRN_MD_AP) \
+ do_mtspr_cpu6(rn, 0x3580, v); \
+ else if (rn == SPRN_MD_EPN) \
+ do_mtspr_cpu6(rn, 0x3780, v); \
+ else if (rn == SPRN_M_TWB) \
+ do_mtspr_cpu6(rn, 0x3980, v); \
+ else if (rn == SPRN_MD_TWC) \
+ do_mtspr_cpu6(rn, 0x3b80, v); \
+ else if (rn == SPRN_MD_RPN) \
+ do_mtspr_cpu6(rn, 0x3d80, v); \
+ else if (rn == SPRN_M_TW) \
+ do_mtspr_cpu6(rn, 0x3f80, v); \
+ else if (rn == SPRN_MD_CAM) \
+ do_mtspr_cpu6(rn, 0x3190, v); \
+ else if (rn == SPRN_MD_RAM0) \
+ do_mtspr_cpu6(rn, 0x3390, v); \
+ else if (rn == SPRN_MD_RAM1) \
+ do_mtspr_cpu6(rn, 0x3590, v); \
+ else if (rn == SPRN_DEC) \
+ do_mtspr_cpu6(rn, 0x2c00, v); \
+ else if (rn == SPRN_TBWL) \
+ do_mtspr_cpu6(rn, 0x3880, v); \
+ else if (rn == SPRN_TBWU) \
+ do_mtspr_cpu6(rn, 0x3a80, v); \
+ else if (rn == SPRN_DPDR) \
+ do_mtspr_cpu6(rn, 0x2d30, v); \
+ else \
+ do_mtspr(rn, v); \
+ } while (0)
+#endif
+
#endif /* _ASM_POWERPC_REG_8xx_H */
--
2.1.0
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH v5 15/23] powerpc/8xx: remove special handling of CPU6 errata in set_dec()
2016-02-03 22:53 [PATCH v5 00/23] powerpc/8xx: Use large pages for RAM and IMMR and other improvments Christophe Leroy
` (13 preceding siblings ...)
2016-02-03 22:54 ` [PATCH v5 14/23] powerpc/8xx: Handle CPU6 ERRATA directly in mtspr() macro Christophe Leroy
@ 2016-02-03 22:54 ` Christophe Leroy
2016-02-03 22:54 ` [PATCH v5 16/23] powerpc/8xx: rewrite set_context() in C Christophe Leroy
` (7 subsequent siblings)
22 siblings, 0 replies; 31+ messages in thread
From: Christophe Leroy @ 2016-02-03 22:54 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Scott Wood
Cc: linux-kernel, linuxppc-dev
CPU6 ERRATA is now handled directly in mtspr(), so we can use the
standard set_dec() fonction in all cases.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
v2: no change
v3: no change
v4: no change
v5: no change
arch/powerpc/include/asm/time.h | 6 +-----
arch/powerpc/kernel/head_8xx.S | 18 ------------------
2 files changed, 1 insertion(+), 23 deletions(-)
diff --git a/arch/powerpc/include/asm/time.h b/arch/powerpc/include/asm/time.h
index 2d7109a..1092fdd 100644
--- a/arch/powerpc/include/asm/time.h
+++ b/arch/powerpc/include/asm/time.h
@@ -31,8 +31,6 @@ extern void tick_broadcast_ipi_handler(void);
extern void generic_calibrate_decr(void);
-extern void set_dec_cpu6(unsigned int val);
-
/* Some sane defaults: 125 MHz timebase, 1GHz processor */
extern unsigned long ppc_proc_freq;
#define DEFAULT_PROC_FREQ (DEFAULT_TB_FREQ * 8)
@@ -166,14 +164,12 @@ static inline void set_dec(int val)
{
#if defined(CONFIG_40x)
mtspr(SPRN_PIT, val);
-#elif defined(CONFIG_8xx_CPU6)
- set_dec_cpu6(val - 1);
#else
#ifndef CONFIG_BOOKE
--val;
#endif
mtspr(SPRN_DEC, val);
-#endif /* not 40x or 8xx_CPU6 */
+#endif /* not 40x */
}
static inline unsigned long tb_ticks_since(unsigned long tstamp)
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index a268cf4..637f8e9 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -1011,24 +1011,6 @@ _GLOBAL(set_context)
SYNC
blr
-#ifdef CONFIG_8xx_CPU6
-/* It's here because it is unique to the 8xx.
- * It is important we get called with interrupts disabled. I used to
- * do that, but it appears that all code that calls this already had
- * interrupt disabled.
- */
- .globl set_dec_cpu6
-set_dec_cpu6:
- lis r7, cpu6_errata_word@h
- ori r7, r7, cpu6_errata_word@l
- li r4, 0x2c00
- stw r4, 8(r7)
- lwz r4, 8(r7)
- mtspr 22, r3 /* Update Decrementer */
- SYNC
- blr
-#endif
-
/*
* We put a few things here that have to be page-aligned.
* This stuff goes at the beginning of the data segment,
--
2.1.0
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH v5 16/23] powerpc/8xx: rewrite set_context() in C
2016-02-03 22:53 [PATCH v5 00/23] powerpc/8xx: Use large pages for RAM and IMMR and other improvments Christophe Leroy
` (14 preceding siblings ...)
2016-02-03 22:54 ` [PATCH v5 15/23] powerpc/8xx: remove special handling of CPU6 errata in set_dec() Christophe Leroy
@ 2016-02-03 22:54 ` Christophe Leroy
2016-02-03 22:54 ` [PATCH v5 17/23] powerpc/8xx: rewrite flush_instruction_cache() " Christophe Leroy
` (6 subsequent siblings)
22 siblings, 0 replies; 31+ messages in thread
From: Christophe Leroy @ 2016-02-03 22:54 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Scott Wood
Cc: linux-kernel, linuxppc-dev
There is no real need to have set_context() in assembly.
Now that we have mtspr() handling CPU6 ERRATA directly, we
can rewrite set_context() in C language for easier maintenance.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
v2: no change
v3: no change
v4: no change
v5: no change
arch/powerpc/kernel/head_8xx.S | 44 ------------------------------------------
arch/powerpc/mm/8xx_mmu.c | 34 ++++++++++++++++++++++++++++++++
2 files changed, 34 insertions(+), 44 deletions(-)
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 637f8e9..bb2b657 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -968,50 +968,6 @@ initial_mmu:
/*
- * Set up to use a given MMU context.
- * r3 is context number, r4 is PGD pointer.
- *
- * We place the physical address of the new task page directory loaded
- * into the MMU base register, and set the ASID compare register with
- * the new "context."
- */
-_GLOBAL(set_context)
-
-#ifdef CONFIG_BDI_SWITCH
- /* Context switch the PTE pointer for the Abatron BDI2000.
- * The PGDIR is passed as second argument.
- */
- lis r5, KERNELBASE@h
- lwz r5, 0xf0(r5)
- stw r4, 0x4(r5)
-#endif
-
- /* Register M_TW will contain base address of level 1 table minus the
- * lower part of the kernel PGDIR base address, so that all accesses to
- * level 1 table are done relative to lower part of kernel PGDIR base
- * address.
- */
- li r5, (swapper_pg_dir-PAGE_OFFSET)@l
- sub r4, r4, r5
- tophys (r4, r4)
-#ifdef CONFIG_8xx_CPU6
- lis r6, cpu6_errata_word@h
- ori r6, r6, cpu6_errata_word@l
- li r7, 0x3f80
- stw r7, 12(r6)
- lwz r7, 12(r6)
-#endif
- mtspr SPRN_M_TW, r4 /* Update pointeur to level 1 table */
-#ifdef CONFIG_8xx_CPU6
- li r7, 0x3380
- stw r7, 12(r6)
- lwz r7, 12(r6)
-#endif
- mtspr SPRN_M_CASID, r3 /* Update context */
- SYNC
- blr
-
-/*
* We put a few things here that have to be page-aligned.
* This stuff goes at the beginning of the data segment,
* which is page-aligned.
diff --git a/arch/powerpc/mm/8xx_mmu.c b/arch/powerpc/mm/8xx_mmu.c
index 6965575..8701e4d 100644
--- a/arch/powerpc/mm/8xx_mmu.c
+++ b/arch/powerpc/mm/8xx_mmu.c
@@ -147,3 +147,37 @@ void setup_initial_memory_limit(phys_addr_t first_memblock_base,
memblock_set_current_limit(min_t(u64, first_memblock_size,
initial_memory_size));
}
+
+/*
+ * Set up to use a given MMU context.
+ * id is context number, pgd is PGD pointer.
+ *
+ * We place the physical address of the new task page directory loaded
+ * into the MMU base register, and set the ASID compare register with
+ * the new "context."
+ */
+void set_context(unsigned long id, pgd_t *pgd)
+{
+ s16 offset = (s16)(__pa(swapper_pg_dir));
+
+#ifdef CONFIG_BDI_SWITCH
+ pgd_t **ptr = *(pgd_t ***)(KERNELBASE + 0xf0);
+
+ /* Context switch the PTE pointer for the Abatron BDI2000.
+ * The PGDIR is passed as second argument.
+ */
+ *(ptr + 1) = pgd;
+#endif
+
+ /* Register M_TW will contain base address of level 1 table minus the
+ * lower part of the kernel PGDIR base address, so that all accesses to
+ * level 1 table are done relative to lower part of kernel PGDIR base
+ * address.
+ */
+ mtspr(SPRN_M_TW, __pa(pgd) - offset);
+
+ /* Update context */
+ mtspr(SPRN_M_CASID, id);
+ /* sync */
+ mb();
+}
--
2.1.0
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH v5 17/23] powerpc/8xx: rewrite flush_instruction_cache() in C
2016-02-03 22:53 [PATCH v5 00/23] powerpc/8xx: Use large pages for RAM and IMMR and other improvments Christophe Leroy
` (15 preceding siblings ...)
2016-02-03 22:54 ` [PATCH v5 16/23] powerpc/8xx: rewrite set_context() in C Christophe Leroy
@ 2016-02-03 22:54 ` Christophe Leroy
2016-02-03 22:54 ` [PATCH v5 18/23] powerpc: add inline functions for cache related instructions Christophe Leroy
` (5 subsequent siblings)
22 siblings, 0 replies; 31+ messages in thread
From: Christophe Leroy @ 2016-02-03 22:54 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Scott Wood
Cc: linux-kernel, linuxppc-dev
On PPC8xx, flushing instruction cache is performed by writing
in register SPRN_IC_CST. This registers suffers CPU6 ERRATA.
The patch rewrites the fonction in C so that CPU6 ERRATA will
be handled transparently
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
v2: no change
v3: no change
v4: no change
v5: no change
arch/powerpc/kernel/misc_32.S | 10 ++++------
arch/powerpc/mm/8xx_mmu.c | 7 +++++++
2 files changed, 11 insertions(+), 6 deletions(-)
diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
index be8edd6..7d1284f 100644
--- a/arch/powerpc/kernel/misc_32.S
+++ b/arch/powerpc/kernel/misc_32.S
@@ -296,12 +296,9 @@ _GLOBAL(real_writeb)
* Flush instruction cache.
* This is a no-op on the 601.
*/
+#ifndef CONFIG_PPC_8xx
_GLOBAL(flush_instruction_cache)
-#if defined(CONFIG_8xx)
- isync
- lis r5, IDC_INVALL@h
- mtspr SPRN_IC_CST, r5
-#elif defined(CONFIG_4xx)
+#if defined(CONFIG_4xx)
#ifdef CONFIG_403GCX
li r3, 512
mtctr r3
@@ -334,9 +331,10 @@ END_FTR_SECTION_IFSET(CPU_FTR_UNIFIED_ID_CACHE)
mfspr r3,SPRN_HID0
ori r3,r3,HID0_ICFI
mtspr SPRN_HID0,r3
-#endif /* CONFIG_8xx/4xx */
+#endif /* CONFIG_4xx */
isync
blr
+#endif /* CONFIG_PPC_8xx */
/*
* Write any modified data cache blocks out to memory
diff --git a/arch/powerpc/mm/8xx_mmu.c b/arch/powerpc/mm/8xx_mmu.c
index 8701e4d..604af7a 100644
--- a/arch/powerpc/mm/8xx_mmu.c
+++ b/arch/powerpc/mm/8xx_mmu.c
@@ -181,3 +181,10 @@ void set_context(unsigned long id, pgd_t *pgd)
/* sync */
mb();
}
+
+void flush_instruction_cache(void)
+{
+ isync();
+ mtspr(SPRN_IC_CST, IDC_INVALL);
+ isync();
+}
--
2.1.0
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH v5 18/23] powerpc: add inline functions for cache related instructions
2016-02-03 22:53 [PATCH v5 00/23] powerpc/8xx: Use large pages for RAM and IMMR and other improvments Christophe Leroy
` (16 preceding siblings ...)
2016-02-03 22:54 ` [PATCH v5 17/23] powerpc/8xx: rewrite flush_instruction_cache() " Christophe Leroy
@ 2016-02-03 22:54 ` Christophe Leroy
2016-02-03 22:54 ` [PATCH v5 19/23] powerpc32: Remove clear_pages() and define clear_page() inline Christophe Leroy
` (4 subsequent siblings)
22 siblings, 0 replies; 31+ messages in thread
From: Christophe Leroy @ 2016-02-03 22:54 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Scott Wood
Cc: linux-kernel, linuxppc-dev
This patch adds inline functions to use dcbz, dcbi, dcbf, dcbst
from C functions
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
v2: new
v3: no change
v4: no change
v5: no change
arch/powerpc/include/asm/cache.h | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/arch/powerpc/include/asm/cache.h b/arch/powerpc/include/asm/cache.h
index 5f8229e..ffbafbf 100644
--- a/arch/powerpc/include/asm/cache.h
+++ b/arch/powerpc/include/asm/cache.h
@@ -69,6 +69,25 @@ extern void _set_L3CR(unsigned long);
#define _set_L3CR(val) do { } while(0)
#endif
+static inline void dcbz(void *addr)
+{
+ __asm__ __volatile__ ("dcbz 0, %0" : : "r"(addr) : "memory");
+}
+
+static inline void dcbi(void *addr)
+{
+ __asm__ __volatile__ ("dcbi 0, %0" : : "r"(addr) : "memory");
+}
+
+static inline void dcbf(void *addr)
+{
+ __asm__ __volatile__ ("dcbf 0, %0" : : "r"(addr) : "memory");
+}
+
+static inline void dcbst(void *addr)
+{
+ __asm__ __volatile__ ("dcbst 0, %0" : : "r"(addr) : "memory");
+}
#endif /* !__ASSEMBLY__ */
#endif /* __KERNEL__ */
#endif /* _ASM_POWERPC_CACHE_H */
--
2.1.0
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH v5 19/23] powerpc32: Remove clear_pages() and define clear_page() inline
2016-02-03 22:53 [PATCH v5 00/23] powerpc/8xx: Use large pages for RAM and IMMR and other improvments Christophe Leroy
` (17 preceding siblings ...)
2016-02-03 22:54 ` [PATCH v5 18/23] powerpc: add inline functions for cache related instructions Christophe Leroy
@ 2016-02-03 22:54 ` Christophe Leroy
2016-02-03 22:54 ` [PATCH v5 20/23] powerpc32: move xxxxx_dcache_range() functions inline Christophe Leroy
` (3 subsequent siblings)
22 siblings, 0 replies; 31+ messages in thread
From: Christophe Leroy @ 2016-02-03 22:54 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Scott Wood
Cc: linux-kernel, linuxppc-dev
clear_pages() is never used expect by clear_page, and PPC32 is the
only architecture (still) having this function. Neither PPC64 nor
any other architecture has it.
This patch removes clear_pages() and moves clear_page() function
inline (same as PPC64) as it only is a few isns
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
v2: no change
v3: no change
v4: no change
v5: no change
arch/powerpc/include/asm/page_32.h | 17 ++++++++++++++---
arch/powerpc/kernel/misc_32.S | 16 ----------------
arch/powerpc/kernel/ppc_ksyms_32.c | 1 -
3 files changed, 14 insertions(+), 20 deletions(-)
diff --git a/arch/powerpc/include/asm/page_32.h b/arch/powerpc/include/asm/page_32.h
index 68d73b2..6a8e179 100644
--- a/arch/powerpc/include/asm/page_32.h
+++ b/arch/powerpc/include/asm/page_32.h
@@ -1,6 +1,8 @@
#ifndef _ASM_POWERPC_PAGE_32_H
#define _ASM_POWERPC_PAGE_32_H
+#include <asm/cache.h>
+
#if defined(CONFIG_PHYSICAL_ALIGN) && (CONFIG_PHYSICAL_START != 0)
#if (CONFIG_PHYSICAL_START % CONFIG_PHYSICAL_ALIGN) != 0
#error "CONFIG_PHYSICAL_START must be a multiple of CONFIG_PHYSICAL_ALIGN"
@@ -36,9 +38,18 @@ typedef unsigned long long pte_basic_t;
typedef unsigned long pte_basic_t;
#endif
-struct page;
-extern void clear_pages(void *page, int order);
-static inline void clear_page(void *page) { clear_pages(page, 0); }
+/*
+ * Clear page using the dcbz instruction, which doesn't cause any
+ * memory traffic (except to write out any cache lines which get
+ * displaced). This only works on cacheable memory.
+ */
+static inline void clear_page(void *addr)
+{
+ unsigned int i;
+
+ for (i = 0; i < PAGE_SIZE / L1_CACHE_BYTES; i++, addr += L1_CACHE_BYTES)
+ dcbz(addr);
+}
extern void copy_page(void *to, void *from);
#include <asm-generic/getorder.h>
diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
index 7d1284f..181afc1 100644
--- a/arch/powerpc/kernel/misc_32.S
+++ b/arch/powerpc/kernel/misc_32.S
@@ -517,22 +517,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
#endif /* CONFIG_BOOKE */
/*
- * Clear pages using the dcbz instruction, which doesn't cause any
- * memory traffic (except to write out any cache lines which get
- * displaced). This only works on cacheable memory.
- *
- * void clear_pages(void *page, int order) ;
- */
-_GLOBAL(clear_pages)
- li r0,PAGE_SIZE/L1_CACHE_BYTES
- slw r0,r0,r4
- mtctr r0
-1: dcbz 0,r3
- addi r3,r3,L1_CACHE_BYTES
- bdnz 1b
- blr
-
-/*
* Copy a whole page. We use the dcbz instruction on the destination
* to reduce memory traffic (it eliminates the unnecessary reads of
* the destination into cache). This requires that the destination
diff --git a/arch/powerpc/kernel/ppc_ksyms_32.c b/arch/powerpc/kernel/ppc_ksyms_32.c
index 30ddd8a..2bfaafe 100644
--- a/arch/powerpc/kernel/ppc_ksyms_32.c
+++ b/arch/powerpc/kernel/ppc_ksyms_32.c
@@ -10,7 +10,6 @@
#include <asm/pgtable.h>
#include <asm/dcr.h>
-EXPORT_SYMBOL(clear_pages);
EXPORT_SYMBOL(ISA_DMA_THRESHOLD);
EXPORT_SYMBOL(DMA_MODE_READ);
EXPORT_SYMBOL(DMA_MODE_WRITE);
--
2.1.0
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH v5 20/23] powerpc32: move xxxxx_dcache_range() functions inline
2016-02-03 22:53 [PATCH v5 00/23] powerpc/8xx: Use large pages for RAM and IMMR and other improvments Christophe Leroy
` (18 preceding siblings ...)
2016-02-03 22:54 ` [PATCH v5 19/23] powerpc32: Remove clear_pages() and define clear_page() inline Christophe Leroy
@ 2016-02-03 22:54 ` Christophe Leroy
2016-02-03 22:54 ` [PATCH v5 21/23] powerpc: Simplify test in __dma_sync() Christophe Leroy
` (2 subsequent siblings)
22 siblings, 0 replies; 31+ messages in thread
From: Christophe Leroy @ 2016-02-03 22:54 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Scott Wood
Cc: linux-kernel, linuxppc-dev
flush/clean/invalidate _dcache_range() functions are all very
similar and are quite short. They are mainly used in __dma_sync()
perf_event locate them in the top 3 consumming functions during
heavy ethernet activity
They are good candidate for inlining, as __dma_sync() does
almost nothing but calling them
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
v2: new
v3: no change
v4: no change
v5: no change
arch/powerpc/include/asm/cacheflush.h | 52 ++++++++++++++++++++++++++--
arch/powerpc/kernel/misc_32.S | 65 -----------------------------------
arch/powerpc/kernel/ppc_ksyms.c | 2 ++
3 files changed, 51 insertions(+), 68 deletions(-)
diff --git a/arch/powerpc/include/asm/cacheflush.h b/arch/powerpc/include/asm/cacheflush.h
index 6229e6b..97c9978 100644
--- a/arch/powerpc/include/asm/cacheflush.h
+++ b/arch/powerpc/include/asm/cacheflush.h
@@ -47,12 +47,58 @@ static inline void __flush_dcache_icache_phys(unsigned long physaddr)
}
#endif
-extern void flush_dcache_range(unsigned long start, unsigned long stop);
#ifdef CONFIG_PPC32
-extern void clean_dcache_range(unsigned long start, unsigned long stop);
-extern void invalidate_dcache_range(unsigned long start, unsigned long stop);
+/*
+ * Write any modified data cache blocks out to memory and invalidate them.
+ * Does not invalidate the corresponding instruction cache blocks.
+ */
+static inline void flush_dcache_range(unsigned long start, unsigned long stop)
+{
+ void *addr = (void *)(start & ~(L1_CACHE_BYTES - 1));
+ unsigned long size = stop - (unsigned long)addr + (L1_CACHE_BYTES - 1);
+ unsigned long i;
+
+ for (i = 0; i < size >> L1_CACHE_SHIFT; i++, addr += L1_CACHE_BYTES)
+ dcbf(addr);
+ mb(); /* sync */
+}
+
+/*
+ * Write any modified data cache blocks out to memory.
+ * Does not invalidate the corresponding cache lines (especially for
+ * any corresponding instruction cache).
+ */
+static inline void clean_dcache_range(unsigned long start, unsigned long stop)
+{
+ void *addr = (void *)(start & ~(L1_CACHE_BYTES - 1));
+ unsigned long size = stop - (unsigned long)addr + (L1_CACHE_BYTES - 1);
+ unsigned long i;
+
+ for (i = 0; i < size >> L1_CACHE_SHIFT; i++, addr += L1_CACHE_BYTES)
+ dcbst(addr);
+ mb(); /* sync */
+}
+
+/*
+ * Like above, but invalidate the D-cache. This is used by the 8xx
+ * to invalidate the cache so the PPC core doesn't get stale data
+ * from the CPM (no cache snooping here :-).
+ */
+static inline void invalidate_dcache_range(unsigned long start,
+ unsigned long stop)
+{
+ void *addr = (void *)(start & ~(L1_CACHE_BYTES - 1));
+ unsigned long size = stop - (unsigned long)addr + (L1_CACHE_BYTES - 1);
+ unsigned long i;
+
+ for (i = 0; i < size >> L1_CACHE_SHIFT; i++, addr += L1_CACHE_BYTES)
+ dcbi(addr);
+ mb(); /* sync */
+}
+
#endif /* CONFIG_PPC32 */
#ifdef CONFIG_PPC64
+extern void flush_dcache_range(unsigned long start, unsigned long stop);
extern void flush_inval_dcache_range(unsigned long start, unsigned long stop);
extern void flush_dcache_phys_range(unsigned long start, unsigned long stop);
#endif
diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
index 181afc1..09e1e5d 100644
--- a/arch/powerpc/kernel/misc_32.S
+++ b/arch/powerpc/kernel/misc_32.S
@@ -375,71 +375,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
isync
blr
/*
- * Write any modified data cache blocks out to memory.
- * Does not invalidate the corresponding cache lines (especially for
- * any corresponding instruction cache).
- *
- * clean_dcache_range(unsigned long start, unsigned long stop)
- */
-_GLOBAL(clean_dcache_range)
- li r5,L1_CACHE_BYTES-1
- andc r3,r3,r5
- subf r4,r3,r4
- add r4,r4,r5
- srwi. r4,r4,L1_CACHE_SHIFT
- beqlr
- mtctr r4
-
-1: dcbst 0,r3
- addi r3,r3,L1_CACHE_BYTES
- bdnz 1b
- sync /* wait for dcbst's to get to ram */
- blr
-
-/*
- * Write any modified data cache blocks out to memory and invalidate them.
- * Does not invalidate the corresponding instruction cache blocks.
- *
- * flush_dcache_range(unsigned long start, unsigned long stop)
- */
-_GLOBAL(flush_dcache_range)
- li r5,L1_CACHE_BYTES-1
- andc r3,r3,r5
- subf r4,r3,r4
- add r4,r4,r5
- srwi. r4,r4,L1_CACHE_SHIFT
- beqlr
- mtctr r4
-
-1: dcbf 0,r3
- addi r3,r3,L1_CACHE_BYTES
- bdnz 1b
- sync /* wait for dcbst's to get to ram */
- blr
-
-/*
- * Like above, but invalidate the D-cache. This is used by the 8xx
- * to invalidate the cache so the PPC core doesn't get stale data
- * from the CPM (no cache snooping here :-).
- *
- * invalidate_dcache_range(unsigned long start, unsigned long stop)
- */
-_GLOBAL(invalidate_dcache_range)
- li r5,L1_CACHE_BYTES-1
- andc r3,r3,r5
- subf r4,r3,r4
- add r4,r4,r5
- srwi. r4,r4,L1_CACHE_SHIFT
- beqlr
- mtctr r4
-
-1: dcbi 0,r3
- addi r3,r3,L1_CACHE_BYTES
- bdnz 1b
- sync /* wait for dcbi's to get to ram */
- blr
-
-/*
* Flush a particular page from the data cache to RAM.
* Note: this is necessary because the instruction cache does *not*
* snoop from the data cache.
diff --git a/arch/powerpc/kernel/ppc_ksyms.c b/arch/powerpc/kernel/ppc_ksyms.c
index 41e1607..3236679 100644
--- a/arch/powerpc/kernel/ppc_ksyms.c
+++ b/arch/powerpc/kernel/ppc_ksyms.c
@@ -6,7 +6,9 @@
#include <asm/cacheflush.h>
#include <asm/epapr_hcalls.h>
+#ifdef CONFIG_PPC64
EXPORT_SYMBOL(flush_dcache_range);
+#endif
EXPORT_SYMBOL(flush_icache_range);
EXPORT_SYMBOL(empty_zero_page);
--
2.1.0
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH v5 21/23] powerpc: Simplify test in __dma_sync()
2016-02-03 22:53 [PATCH v5 00/23] powerpc/8xx: Use large pages for RAM and IMMR and other improvments Christophe Leroy
` (19 preceding siblings ...)
2016-02-03 22:54 ` [PATCH v5 20/23] powerpc32: move xxxxx_dcache_range() functions inline Christophe Leroy
@ 2016-02-03 22:54 ` Christophe Leroy
2016-02-04 11:37 ` Denis Kirjanov
2016-02-03 22:54 ` [PATCH v5 22/23] powerpc32: small optimisation in flush_icache_range() Christophe Leroy
2016-02-03 22:54 ` [PATCH v5 23/23] powerpc32: Remove one insn in mulhdu Christophe Leroy
22 siblings, 1 reply; 31+ messages in thread
From: Christophe Leroy @ 2016-02-03 22:54 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Scott Wood
Cc: linux-kernel, linuxppc-dev
This simplification helps the compiler. We now have only one test
instead of two, so it reduces the number of branches.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
v2: new
v3: no change
v4: no change
v5: no change
arch/powerpc/mm/dma-noncoherent.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/powerpc/mm/dma-noncoherent.c b/arch/powerpc/mm/dma-noncoherent.c
index 169aba4..2dc74e5 100644
--- a/arch/powerpc/mm/dma-noncoherent.c
+++ b/arch/powerpc/mm/dma-noncoherent.c
@@ -327,7 +327,7 @@ void __dma_sync(void *vaddr, size_t size, int direction)
* invalidate only when cache-line aligned otherwise there is
* the potential for discarding uncommitted data from the cache
*/
- if ((start & (L1_CACHE_BYTES - 1)) || (size & (L1_CACHE_BYTES - 1)))
+ if ((start | end) & (L1_CACHE_BYTES - 1))
flush_dcache_range(start, end);
else
invalidate_dcache_range(start, end);
--
2.1.0
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH v5 22/23] powerpc32: small optimisation in flush_icache_range()
2016-02-03 22:53 [PATCH v5 00/23] powerpc/8xx: Use large pages for RAM and IMMR and other improvments Christophe Leroy
` (20 preceding siblings ...)
2016-02-03 22:54 ` [PATCH v5 21/23] powerpc: Simplify test in __dma_sync() Christophe Leroy
@ 2016-02-03 22:54 ` Christophe Leroy
2016-02-03 22:54 ` [PATCH v5 23/23] powerpc32: Remove one insn in mulhdu Christophe Leroy
22 siblings, 0 replies; 31+ messages in thread
From: Christophe Leroy @ 2016-02-03 22:54 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Scott Wood
Cc: linux-kernel, linuxppc-dev
Inlining of _dcache_range() functions has shown that the compiler
does the same thing a bit better with one insn less
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
v2: new
v3: no change
v4: no change
v5: no change
arch/powerpc/kernel/misc_32.S | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
index 09e1e5d..3ec5a22 100644
--- a/arch/powerpc/kernel/misc_32.S
+++ b/arch/powerpc/kernel/misc_32.S
@@ -348,10 +348,9 @@ BEGIN_FTR_SECTION
PURGE_PREFETCHED_INS
blr /* for 601, do nothing */
END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
- li r5,L1_CACHE_BYTES-1
- andc r3,r3,r5
+ rlwinm r3,r3,0,0,31 - L1_CACHE_SHIFT
subf r4,r3,r4
- add r4,r4,r5
+ addi r4,r4,L1_CACHE_BYTES - 1
srwi. r4,r4,L1_CACHE_SHIFT
beqlr
mtctr r4
--
2.1.0
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH v5 23/23] powerpc32: Remove one insn in mulhdu
2016-02-03 22:53 [PATCH v5 00/23] powerpc/8xx: Use large pages for RAM and IMMR and other improvments Christophe Leroy
` (21 preceding siblings ...)
2016-02-03 22:54 ` [PATCH v5 22/23] powerpc32: small optimisation in flush_icache_range() Christophe Leroy
@ 2016-02-03 22:54 ` Christophe Leroy
22 siblings, 0 replies; 31+ messages in thread
From: Christophe Leroy @ 2016-02-03 22:54 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Scott Wood
Cc: linux-kernel, linuxppc-dev
Remove one instruction in mulhdu
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
v2: new
v3: no change
v4: no change
v5: no change
arch/powerpc/kernel/misc_32.S | 11 +++++------
1 file changed, 5 insertions(+), 6 deletions(-)
diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
index 3ec5a22..bf5160f 100644
--- a/arch/powerpc/kernel/misc_32.S
+++ b/arch/powerpc/kernel/misc_32.S
@@ -91,17 +91,16 @@ _GLOBAL(mulhdu)
addc r7,r0,r7
addze r4,r4
1: beqlr cr1 /* all done if high part of A is 0 */
- mr r10,r3
mullw r9,r3,r5
- mulhwu r3,r3,r5
+ mulhwu r10,r3,r5
beq 2f
- mullw r0,r10,r6
- mulhwu r8,r10,r6
+ mullw r0,r3,r6
+ mulhwu r8,r3,r6
addc r7,r0,r7
adde r4,r4,r8
- addze r3,r3
+ addze r10,r10
2: addc r4,r4,r9
- addze r3,r3
+ addze r3,r10
blr
/*
--
2.1.0
^ permalink raw reply related [flat|nested] 31+ messages in thread
* Re: [PATCH v5 08/23] powerpc/8xx: Map IMMR area with 512k page at a fixed address
2016-02-03 22:54 ` [PATCH v5 08/23] powerpc/8xx: Map IMMR area with 512k page at a fixed address Christophe Leroy
@ 2016-02-04 9:58 ` kbuild test robot
0 siblings, 0 replies; 31+ messages in thread
From: kbuild test robot @ 2016-02-04 9:58 UTC (permalink / raw)
To: Christophe Leroy
Cc: kbuild-all, Benjamin Herrenschmidt, Paul Mackerras,
Michael Ellerman, Scott Wood, linux-kernel, linuxppc-dev
[-- Attachment #1: Type: text/plain, Size: 1550 bytes --]
Hi Christophe,
[auto build test ERROR on powerpc/next]
[also build test ERROR on v4.5-rc2 next-20160204]
[if your patch is applied to the wrong git tree, please drop us a note to help improving the system]
url: https://github.com/0day-ci/linux/commits/Christophe-Leroy/powerpc-8xx-Use-large-pages-for-RAM-and-IMMR-and-other-improvments/20160204-071322
base: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-mpc885_ads_defconfig (attached as .config)
reproduce:
wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=powerpc
All errors (new ones prefixed by >>):
arch/powerpc/mm/8xx_mmu.c: In function 'mmu_mapin_immr':
>> arch/powerpc/mm/8xx_mmu.c:61:17: error: lvalue required as left operand of assignment
pmd_val(*pmdp) = val;
^
vim +61 arch/powerpc/mm/8xx_mmu.c
55 unsigned long v = VIRT_IMMR_BASE;
56 #ifdef CONFIG_PPC_4K_PAGES
57 pmd_t *pmdp;
58 unsigned long val = p | MD_PS512K | MD_GUARDED;
59
60 pmdp = pmd_offset(pud_offset(pgd_offset_k(v), v), v);
> 61 pmd_val(*pmdp) = val;
62 #else /* CONFIG_PPC_16K_PAGES */
63 unsigned long f = pgprot_val(PAGE_KERNEL_NCG);
64 int offset;
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/octet-stream, Size: 9549 bytes --]
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v5 21/23] powerpc: Simplify test in __dma_sync()
2016-02-03 22:54 ` [PATCH v5 21/23] powerpc: Simplify test in __dma_sync() Christophe Leroy
@ 2016-02-04 11:37 ` Denis Kirjanov
2016-02-04 13:42 ` Christophe Leroy
0 siblings, 1 reply; 31+ messages in thread
From: Denis Kirjanov @ 2016-02-04 11:37 UTC (permalink / raw)
To: Christophe Leroy
Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
Scott Wood, linuxppc-dev, linux-kernel
On 2/4/16, Christophe Leroy <christophe.leroy@c-s.fr> wrote:
> This simplification helps the compiler. We now have only one test
> instead of two, so it reduces the number of branches.
>
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
> ---
> v2: new
> v3: no change
> v4: no change
> v5: no change
>
> arch/powerpc/mm/dma-noncoherent.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/mm/dma-noncoherent.c
> b/arch/powerpc/mm/dma-noncoherent.c
> index 169aba4..2dc74e5 100644
> --- a/arch/powerpc/mm/dma-noncoherent.c
> +++ b/arch/powerpc/mm/dma-noncoherent.c
> @@ -327,7 +327,7 @@ void __dma_sync(void *vaddr, size_t size, int direction)
> * invalidate only when cache-line aligned otherwise there is
> * the potential for discarding uncommitted data from the cache
> */
> - if ((start & (L1_CACHE_BYTES - 1)) || (size & (L1_CACHE_BYTES - 1)))
> + if ((start | end) & (L1_CACHE_BYTES - 1))
> flush_dcache_range(start, end);
> else
> invalidate_dcache_range(start, end);
The previous version of address cache-line aligned check reads perfectly fine.
What's the benefit of this micro optimization?
> --
> 2.1.0
>
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v5 21/23] powerpc: Simplify test in __dma_sync()
2016-02-04 11:37 ` Denis Kirjanov
@ 2016-02-04 13:42 ` Christophe Leroy
2016-02-05 7:52 ` Denis Kirjanov
0 siblings, 1 reply; 31+ messages in thread
From: Christophe Leroy @ 2016-02-04 13:42 UTC (permalink / raw)
To: Denis Kirjanov
Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
Scott Wood, linuxppc-dev, linux-kernel
Le 04/02/2016 12:37, Denis Kirjanov a écrit :
> On 2/4/16, Christophe Leroy <christophe.leroy@c-s.fr> wrote:
>> This simplification helps the compiler. We now have only one test
>> instead of two, so it reduces the number of branches.
>>
>> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
>> ---
>> v2: new
>> v3: no change
>> v4: no change
>> v5: no change
>>
>> arch/powerpc/mm/dma-noncoherent.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/powerpc/mm/dma-noncoherent.c
>> b/arch/powerpc/mm/dma-noncoherent.c
>> index 169aba4..2dc74e5 100644
>> --- a/arch/powerpc/mm/dma-noncoherent.c
>> +++ b/arch/powerpc/mm/dma-noncoherent.c
>> @@ -327,7 +327,7 @@ void __dma_sync(void *vaddr, size_t size, int direction)
>> * invalidate only when cache-line aligned otherwise there is
>> * the potential for discarding uncommitted data from the cache
>> */
>> - if ((start & (L1_CACHE_BYTES - 1)) || (size & (L1_CACHE_BYTES - 1)))
>> + if ((start | end) & (L1_CACHE_BYTES - 1))
>> flush_dcache_range(start, end);
>> else
>> invalidate_dcache_range(start, end);
> The previous version of address cache-line aligned check reads perfectly fine.
> What's the benefit of this micro optimization?
With this optimisation we avoid one unneccessary test and two associated
jumps. Taking into account that __dma_sync() is one of the top ten CPU
consummers, I believe it is worth it:
Without the patch:
c000d894: 70 6a 00 0f andi. r10,r3,15
c000d898: 39 29 00 0f addi r9,r9,15
c000d89c: 54 63 00 36 rlwinm r3,r3,0,0,27
c000d8a0: 7d 23 48 50 subf r9,r3,r9
c000d8a4: 41 82 00 84 beq c000d928 <__dma_sync+0xb8>
[...]
c000d8c0: 7c 00 04 ac sync
c000d8c4: 4e 80 00 20 blr
[...]
c000d928: 70 8a 00 0f andi. r10,r4,15
c000d92c: 40 a2 ff 7c bne c000d8a8 <__dma_sync+0x38>
c000d930: 55 2a e1 3f rlwinm. r10,r9,28,4,31
c000d934: 41 a2 ff 8c beq c000d8c0 <__dma_sync+0x50>
With the patch:
c000d894: 7c 89 1b 78 or r9,r4,r3
c000d898: 71 2a 00 0f andi. r10,r9,15
c000d89c: 54 63 00 36 rlwinm r3,r3,0,0,27
c000d8a0: 38 84 00 0f addi r4,r4,15
c000d8a4: 7c 83 20 50 subf r4,r3,r4
c000d8a8: 41 82 00 84 beq c000d92c <__dma_sync+0xbc>
[...]
c000d8c4: 7c 00 04 ac sync
c000d8c8: 4e 80 00 20 blr
[...]
c000d92c: 54 89 e1 3f rlwinm. r9,r4,28,4,31
c000d930: 41 a2 ff 94 beq c000d8c4 <__dma_sync+0x54>
Christophe
>> --
>> 2.1.0
>>
>> _______________________________________________
>> Linuxppc-dev mailing list
>> Linuxppc-dev@lists.ozlabs.org
>> https://lists.ozlabs.org/listinfo/linuxppc-dev
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v5 21/23] powerpc: Simplify test in __dma_sync()
2016-02-04 13:42 ` Christophe Leroy
@ 2016-02-05 7:52 ` Denis Kirjanov
0 siblings, 0 replies; 31+ messages in thread
From: Denis Kirjanov @ 2016-02-05 7:52 UTC (permalink / raw)
To: Christophe Leroy
Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
Scott Wood, linuxppc-dev, linux-kernel
On 2/4/16, Christophe Leroy <christophe.leroy@c-s.fr> wrote:
>
>
> Le 04/02/2016 12:37, Denis Kirjanov a écrit :
>> On 2/4/16, Christophe Leroy <christophe.leroy@c-s.fr> wrote:
>>> This simplification helps the compiler. We now have only one test
>>> instead of two, so it reduces the number of branches.
>>>
>>> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
>>> ---
>>> v2: new
>>> v3: no change
>>> v4: no change
>>> v5: no change
>>>
>>> arch/powerpc/mm/dma-noncoherent.c | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/arch/powerpc/mm/dma-noncoherent.c
>>> b/arch/powerpc/mm/dma-noncoherent.c
>>> index 169aba4..2dc74e5 100644
>>> --- a/arch/powerpc/mm/dma-noncoherent.c
>>> +++ b/arch/powerpc/mm/dma-noncoherent.c
>>> @@ -327,7 +327,7 @@ void __dma_sync(void *vaddr, size_t size, int
>>> direction)
>>> * invalidate only when cache-line aligned otherwise there is
>>> * the potential for discarding uncommitted data from the cache
>>> */
>>> - if ((start & (L1_CACHE_BYTES - 1)) || (size & (L1_CACHE_BYTES - 1)))
>>> + if ((start | end) & (L1_CACHE_BYTES - 1))
>>> flush_dcache_range(start, end);
>>> else
>>> invalidate_dcache_range(start, end);
>> The previous version of address cache-line aligned check reads perfectly
>> fine.
>> What's the benefit of this micro optimization?
> With this optimisation we avoid one unneccessary test and two associated
> jumps. Taking into account that __dma_sync() is one of the top ten CPU
> consummers, I believe it is worth it:
>
> Without the patch:
>
> c000d894: 70 6a 00 0f andi. r10,r3,15
> c000d898: 39 29 00 0f addi r9,r9,15
> c000d89c: 54 63 00 36 rlwinm r3,r3,0,0,27
> c000d8a0: 7d 23 48 50 subf r9,r3,r9
> c000d8a4: 41 82 00 84 beq c000d928 <__dma_sync+0xb8>
> [...]
> c000d8c0: 7c 00 04 ac sync
> c000d8c4: 4e 80 00 20 blr
> [...]
> c000d928: 70 8a 00 0f andi. r10,r4,15
> c000d92c: 40 a2 ff 7c bne c000d8a8 <__dma_sync+0x38>
> c000d930: 55 2a e1 3f rlwinm. r10,r9,28,4,31
> c000d934: 41 a2 ff 8c beq c000d8c0 <__dma_sync+0x50>
>
> With the patch:
>
> c000d894: 7c 89 1b 78 or r9,r4,r3
> c000d898: 71 2a 00 0f andi. r10,r9,15
> c000d89c: 54 63 00 36 rlwinm r3,r3,0,0,27
> c000d8a0: 38 84 00 0f addi r4,r4,15
> c000d8a4: 7c 83 20 50 subf r4,r3,r4
> c000d8a8: 41 82 00 84 beq c000d92c <__dma_sync+0xbc>
> [...]
> c000d8c4: 7c 00 04 ac sync
> c000d8c8: 4e 80 00 20 blr
> [...]
> c000d92c: 54 89 e1 3f rlwinm. r9,r4,28,4,31
> c000d930: 41 a2 ff 94 beq c000d8c4 <__dma_sync+0x54>
Yeah, looks better. Did you compile the kernel with default compiler flags?
Thanks!
>
>
> Christophe
>>> --
>>> 2.1.0
>>>
>>> _______________________________________________
>>> Linuxppc-dev mailing list
>>> Linuxppc-dev@lists.ozlabs.org
>>> https://lists.ozlabs.org/listinfo/linuxppc-dev
>
>
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v5 21/23] powerpc: Simplify test in __dma_sync()
@ 2016-02-05 7:52 ` Denis Kirjanov
0 siblings, 0 replies; 31+ messages in thread
From: Denis Kirjanov @ 2016-02-05 7:52 UTC (permalink / raw)
To: Christophe Leroy
Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
Scott Wood, linuxppc-dev, linux-kernel
On 2/4/16, Christophe Leroy <christophe.leroy@c-s.fr> wrote:
>
>
> Le 04/02/2016 12:37, Denis Kirjanov a =C3=A9crit :
>> On 2/4/16, Christophe Leroy <christophe.leroy@c-s.fr> wrote:
>>> This simplification helps the compiler. We now have only one test
>>> instead of two, so it reduces the number of branches.
>>>
>>> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
>>> ---
>>> v2: new
>>> v3: no change
>>> v4: no change
>>> v5: no change
>>>
>>> arch/powerpc/mm/dma-noncoherent.c | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/arch/powerpc/mm/dma-noncoherent.c
>>> b/arch/powerpc/mm/dma-noncoherent.c
>>> index 169aba4..2dc74e5 100644
>>> --- a/arch/powerpc/mm/dma-noncoherent.c
>>> +++ b/arch/powerpc/mm/dma-noncoherent.c
>>> @@ -327,7 +327,7 @@ void __dma_sync(void *vaddr, size_t size, int
>>> direction)
>>> * invalidate only when cache-line aligned otherwise there is
>>> * the potential for discarding uncommitted data from the cache
>>> */
>>> - if ((start & (L1_CACHE_BYTES - 1)) || (size & (L1_CACHE_BYTES - 1)))
>>> + if ((start | end) & (L1_CACHE_BYTES - 1))
>>> flush_dcache_range(start, end);
>>> else
>>> invalidate_dcache_range(start, end);
>> The previous version of address cache-line aligned check reads perfectly
>> fine.
>> What's the benefit of this micro optimization?
> With this optimisation we avoid one unneccessary test and two associated
> jumps. Taking into account that __dma_sync() is one of the top ten CPU
> consummers, I believe it is worth it:
>
> Without the patch:
>
> c000d894: 70 6a 00 0f andi. r10,r3,15
> c000d898: 39 29 00 0f addi r9,r9,15
> c000d89c: 54 63 00 36 rlwinm r3,r3,0,0,27
> c000d8a0: 7d 23 48 50 subf r9,r3,r9
> c000d8a4: 41 82 00 84 beq c000d928 <__dma_sync+0xb8>
> [...]
> c000d8c0: 7c 00 04 ac sync
> c000d8c4: 4e 80 00 20 blr
> [...]
> c000d928: 70 8a 00 0f andi. r10,r4,15
> c000d92c: 40 a2 ff 7c bne c000d8a8 <__dma_sync+0x38>
> c000d930: 55 2a e1 3f rlwinm. r10,r9,28,4,31
> c000d934: 41 a2 ff 8c beq c000d8c0 <__dma_sync+0x50>
>
> With the patch:
>
> c000d894: 7c 89 1b 78 or r9,r4,r3
> c000d898: 71 2a 00 0f andi. r10,r9,15
> c000d89c: 54 63 00 36 rlwinm r3,r3,0,0,27
> c000d8a0: 38 84 00 0f addi r4,r4,15
> c000d8a4: 7c 83 20 50 subf r4,r3,r4
> c000d8a8: 41 82 00 84 beq c000d92c <__dma_sync+0xbc>
> [...]
> c000d8c4: 7c 00 04 ac sync
> c000d8c8: 4e 80 00 20 blr
> [...]
> c000d92c: 54 89 e1 3f rlwinm. r9,r4,28,4,31
> c000d930: 41 a2 ff 94 beq c000d8c4 <__dma_sync+0x54>
Yeah, looks better. Did you compile the kernel with default compiler flags?
Thanks!
>
>
> Christophe
>>> --
>>> 2.1.0
>>>
>>> _______________________________________________
>>> Linuxppc-dev mailing list
>>> Linuxppc-dev@lists.ozlabs.org
>>> https://lists.ozlabs.org/listinfo/linuxppc-dev
>
>
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v5 21/23] powerpc: Simplify test in __dma_sync()
2016-02-05 7:52 ` Denis Kirjanov
(?)
@ 2016-02-05 7:56 ` Christophe Leroy
-1 siblings, 0 replies; 31+ messages in thread
From: Christophe Leroy @ 2016-02-05 7:56 UTC (permalink / raw)
To: Denis Kirjanov
Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
Scott Wood, linuxppc-dev, linux-kernel
Le 05/02/2016 08:52, Denis Kirjanov a écrit :
> On 2/4/16, Christophe Leroy <christophe.leroy@c-s.fr> wrote:
>>
>> Le 04/02/2016 12:37, Denis Kirjanov a écrit :
>>> On 2/4/16, Christophe Leroy <christophe.leroy@c-s.fr> wrote:
>>>> This simplification helps the compiler. We now have only one test
>>>> instead of two, so it reduces the number of branches.
>>>>
>>>> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
>>>> ---
>>>> v2: new
>>>> v3: no change
>>>> v4: no change
>>>> v5: no change
>>>>
>>>> arch/powerpc/mm/dma-noncoherent.c | 2 +-
>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/arch/powerpc/mm/dma-noncoherent.c
>>>> b/arch/powerpc/mm/dma-noncoherent.c
>>>> index 169aba4..2dc74e5 100644
>>>> --- a/arch/powerpc/mm/dma-noncoherent.c
>>>> +++ b/arch/powerpc/mm/dma-noncoherent.c
>>>> @@ -327,7 +327,7 @@ void __dma_sync(void *vaddr, size_t size, int
>>>> direction)
>>>> * invalidate only when cache-line aligned otherwise there is
>>>> * the potential for discarding uncommitted data from the cache
>>>> */
>>>> - if ((start & (L1_CACHE_BYTES - 1)) || (size & (L1_CACHE_BYTES - 1)))
>>>> + if ((start | end) & (L1_CACHE_BYTES - 1))
>>>> flush_dcache_range(start, end);
>>>> else
>>>> invalidate_dcache_range(start, end);
>>> The previous version of address cache-line aligned check reads perfectly
>>> fine.
>>> What's the benefit of this micro optimization?
>> With this optimisation we avoid one unneccessary test and two associated
>> jumps. Taking into account that __dma_sync() is one of the top ten CPU
>> consummers, I believe it is worth it:
>>
>>
> Yeah, looks better. Did you compile the kernel with default compiler flags?
>
> Thanks!
Yes I did
Christophe
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v5 06/23] powerpc32: refactor x_mapped_by_bats() and x_mapped_by_tlbcam() together
2016-02-03 22:54 ` [PATCH v5 06/23] powerpc32: refactor x_mapped_by_bats() and x_mapped_by_tlbcam() together Christophe Leroy
@ 2016-02-07 9:42 ` kbuild test robot
0 siblings, 0 replies; 31+ messages in thread
From: kbuild test robot @ 2016-02-07 9:42 UTC (permalink / raw)
To: Christophe Leroy
Cc: kbuild-all, Benjamin Herrenschmidt, Paul Mackerras,
Michael Ellerman, Scott Wood, linux-kernel, linuxppc-dev
[-- Attachment #1: Type: text/plain, Size: 2496 bytes --]
Hi Christophe,
[auto build test ERROR on powerpc/next]
[also build test ERROR on v4.5-rc2 next-20160205]
[if your patch is applied to the wrong git tree, please drop us a note to help improving the system]
url: https://github.com/0day-ci/linux/commits/Christophe-Leroy/powerpc-8xx-Use-large-pages-for-RAM-and-IMMR-and-other-improvments/20160204-071322
base: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-ppc64e_defconfig (attached as .config)
reproduce:
wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=powerpc
All errors (new ones prefixed by >>):
>> arch/powerpc/mm/fsl_booke_mmu.c:78:13: error: redefinition of 'v_block_mapped'
phys_addr_t v_block_mapped(unsigned long va)
^
In file included from arch/powerpc/mm/fsl_booke_mmu.c:57:0:
arch/powerpc/mm/mmu_decl.h:168:27: note: previous definition of 'v_block_mapped' was here
static inline phys_addr_t v_block_mapped(unsigned long va) { return 0; }
^
>> arch/powerpc/mm/fsl_booke_mmu.c:90:15: error: redefinition of 'p_block_mapped'
unsigned long p_block_mapped(phys_addr_t pa)
^
In file included from arch/powerpc/mm/fsl_booke_mmu.c:57:0:
arch/powerpc/mm/mmu_decl.h:169:29: note: previous definition of 'p_block_mapped' was here
static inline unsigned long p_block_mapped(phys_addr_t pa) { return 0; }
^
vim +/v_block_mapped +78 arch/powerpc/mm/fsl_booke_mmu.c
72 return tlbcam_addrs[idx].limit - tlbcam_addrs[idx].start + 1;
73 }
74
75 /*
76 * Return PA for this VA if it is mapped by a CAM, or 0
77 */
> 78 phys_addr_t v_block_mapped(unsigned long va)
79 {
80 int b;
81 for (b = 0; b < tlbcam_index; ++b)
82 if (va >= tlbcam_addrs[b].start && va < tlbcam_addrs[b].limit)
83 return tlbcam_addrs[b].phys + (va - tlbcam_addrs[b].start);
84 return 0;
85 }
86
87 /*
88 * Return VA for a given PA or 0 if not mapped
89 */
> 90 unsigned long p_block_mapped(phys_addr_t pa)
91 {
92 int b;
93 for (b = 0; b < tlbcam_index; ++b)
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/octet-stream, Size: 20194 bytes --]
^ permalink raw reply [flat|nested] 31+ messages in thread
end of thread, other threads:[~2016-02-07 9:43 UTC | newest]
Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-02-03 22:53 [PATCH v5 00/23] powerpc/8xx: Use large pages for RAM and IMMR and other improvments Christophe Leroy
2016-02-03 22:53 ` [PATCH v5 01/23] powerpc/8xx: Save r3 all the time in DTLB miss handler Christophe Leroy
2016-02-03 22:53 ` [PATCH v5 02/23] powerpc/8xx: Map linear kernel RAM with 8M pages Christophe Leroy
2016-02-03 22:53 ` [PATCH v5 03/23] powerpc: Update documentation for noltlbs kernel parameter Christophe Leroy
2016-02-03 22:53 ` [PATCH v5 04/23] powerpc/8xx: move setup_initial_memory_limit() into 8xx_mmu.c Christophe Leroy
2016-02-03 22:53 ` [PATCH v5 05/23] powerpc32: Fix pte_offset_kernel() to return NULL for bad pages Christophe Leroy
2016-02-03 22:54 ` [PATCH v5 06/23] powerpc32: refactor x_mapped_by_bats() and x_mapped_by_tlbcam() together Christophe Leroy
2016-02-07 9:42 ` kbuild test robot
2016-02-03 22:54 ` [PATCH v5 07/23] powerpc/8xx: Fix vaddr for IMMR early remap Christophe Leroy
2016-02-03 22:54 ` [PATCH v5 08/23] powerpc/8xx: Map IMMR area with 512k page at a fixed address Christophe Leroy
2016-02-04 9:58 ` kbuild test robot
2016-02-03 22:54 ` [PATCH v5 09/23] powerpc/8xx: CONFIG_PIN_TLB unneeded for CONFIG_PPC_EARLY_DEBUG_CPM Christophe Leroy
2016-02-03 22:54 ` [PATCH v5 10/23] powerpc/8xx: map more RAM at startup when needed Christophe Leroy
2016-02-03 22:54 ` [PATCH v5 11/23] powerpc32: Remove useless/wrong MMU:setio progress message Christophe Leroy
2016-02-03 22:54 ` [PATCH v5 12/23] powerpc32: remove ioremap_base Christophe Leroy
2016-02-03 22:54 ` [PATCH v5 13/23] powerpc/8xx: Add missing SPRN defines into reg_8xx.h Christophe Leroy
2016-02-03 22:54 ` [PATCH v5 14/23] powerpc/8xx: Handle CPU6 ERRATA directly in mtspr() macro Christophe Leroy
2016-02-03 22:54 ` [PATCH v5 15/23] powerpc/8xx: remove special handling of CPU6 errata in set_dec() Christophe Leroy
2016-02-03 22:54 ` [PATCH v5 16/23] powerpc/8xx: rewrite set_context() in C Christophe Leroy
2016-02-03 22:54 ` [PATCH v5 17/23] powerpc/8xx: rewrite flush_instruction_cache() " Christophe Leroy
2016-02-03 22:54 ` [PATCH v5 18/23] powerpc: add inline functions for cache related instructions Christophe Leroy
2016-02-03 22:54 ` [PATCH v5 19/23] powerpc32: Remove clear_pages() and define clear_page() inline Christophe Leroy
2016-02-03 22:54 ` [PATCH v5 20/23] powerpc32: move xxxxx_dcache_range() functions inline Christophe Leroy
2016-02-03 22:54 ` [PATCH v5 21/23] powerpc: Simplify test in __dma_sync() Christophe Leroy
2016-02-04 11:37 ` Denis Kirjanov
2016-02-04 13:42 ` Christophe Leroy
2016-02-05 7:52 ` Denis Kirjanov
2016-02-05 7:52 ` Denis Kirjanov
2016-02-05 7:56 ` Christophe Leroy
2016-02-03 22:54 ` [PATCH v5 22/23] powerpc32: small optimisation in flush_icache_range() Christophe Leroy
2016-02-03 22:54 ` [PATCH v5 23/23] powerpc32: Remove one insn in mulhdu Christophe Leroy
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.