* [PATCH v3 0/6] ARM: mm: Extend the runtime patch stub for PAE systems
@ 2013-10-03 21:17 Santosh Shilimkar
2013-10-03 21:17 ` [PATCH v3 1/6] ARM: mm: use phys_addr_t appropriately in p2v and v2p conversions Santosh Shilimkar
` (5 more replies)
0 siblings, 6 replies; 28+ messages in thread
From: Santosh Shilimkar @ 2013-10-03 21:17 UTC (permalink / raw)
To: linux-arm-kernel
This is v3 of the series, which addresses the comments/suggestions
we received on v2 [1]. The v1 version is at [2].
The series extends the existing v2p runtime patching to
LPAE machines which can have physical memory beyond 4 GB. Keystone
is one such ARM machine.
The 64 bit patching support patch has been significantly revised with
inputs from Nicolas Pitre. The patch-set has been tested in various modes:
LPAE/non-LPAE and ARM/THUMB2. For the THUMB2 build, we found an issue
with the devicemaps_init() code sequence, and the last patch in the
series addresses it. That patch was missing from the last version.
Special thanks to Nicolas for his valuable feedback on the earlier
versions.
There was a point about dual patching and avoiding the two-step approach,
but there is no easy way, at least none we can think of, apart from ripping
out the current patching code and directly operating on pv_offsets, which
was already nacked a while back. In either case, this would be an
optimisation and can be carried out as a next step.
Cc: Nicolas Pitre <nico@linaro.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Santosh Shilimkar (4):
ARM: mm: use phys_addr_t appropriately in p2v and v2p conversions
ARM: mm: Introduce virt_to_idmap() with an arch hook
ARM: mm: Move the idmap print to appropriate place in the code
ARM: mm: Recreate kernel mappings in early_paging_init()
Sricharan R (2):
ARM: mm: Correct virt_to_phys patching for 64 bit physical addresses
ARM: mm: Change the order of TLB/cache maintenance operations.
arch/arm/include/asm/mach/arch.h | 1 +
arch/arm/include/asm/memory.h | 73 +++++++++++++++++++++++++++++----
arch/arm/kernel/armksyms.c | 1 +
arch/arm/kernel/head.S | 60 +++++++++++++++++----------
arch/arm/kernel/patch.c | 3 ++
arch/arm/kernel/setup.c | 3 ++
arch/arm/kernel/smp.c | 2 +-
arch/arm/mm/idmap.c | 8 ++--
arch/arm/mm/mmu.c | 84 +++++++++++++++++++++++++++++++++++++-
9 files changed, 199 insertions(+), 36 deletions(-)
Regards,
Santosh
[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2013-July/188108.html
[2] http://lwn.net/Articles/556175/
--
1.7.9.5
^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH v3 1/6] ARM: mm: use phys_addr_t appropriately in p2v and v2p conversions
2013-10-03 21:17 [PATCH v3 0/6] ARM: mm: Extend the runtime patch stub for PAE systems Santosh Shilimkar
@ 2013-10-03 21:17 ` Santosh Shilimkar
2013-10-03 21:17 ` [PATCH v3 2/6] ARM: mm: Introduce virt_to_idmap() with an arch hook Santosh Shilimkar
` (4 subsequent siblings)
5 siblings, 0 replies; 28+ messages in thread
From: Santosh Shilimkar @ 2013-10-03 21:17 UTC (permalink / raw)
To: linux-arm-kernel
Fix the remaining address types used when converting back and forth
between physical and virtual addresses.
Cc: Russell King <linux@arm.linux.org.uk>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
---
arch/arm/include/asm/memory.h | 22 ++++++++++++++++------
1 file changed, 16 insertions(+), 6 deletions(-)
diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
index e750a93..c133bd9 100644
--- a/arch/arm/include/asm/memory.h
+++ b/arch/arm/include/asm/memory.h
@@ -185,22 +185,32 @@ extern unsigned long __pv_phys_offset;
: "=r" (to) \
: "r" (from), "I" (type))
-static inline unsigned long __virt_to_phys(unsigned long x)
+static inline phys_addr_t __virt_to_phys(unsigned long x)
{
unsigned long t;
__pv_stub(x, t, "add", __PV_BITS_31_24);
return t;
}
-static inline unsigned long __phys_to_virt(unsigned long x)
+static inline unsigned long __phys_to_virt(phys_addr_t x)
{
unsigned long t;
__pv_stub(x, t, "sub", __PV_BITS_31_24);
return t;
}
+
#else
-#define __virt_to_phys(x) ((x) - PAGE_OFFSET + PHYS_OFFSET)
-#define __phys_to_virt(x) ((x) - PHYS_OFFSET + PAGE_OFFSET)
+
+static inline phys_addr_t __virt_to_phys(unsigned long x)
+{
+ return (phys_addr_t)x - PAGE_OFFSET + PHYS_OFFSET;
+}
+
+static inline unsigned long __phys_to_virt(phys_addr_t x)
+{
+ return x - PHYS_OFFSET + PAGE_OFFSET;
+}
+
#endif
#endif
#endif /* __ASSEMBLY__ */
@@ -238,14 +248,14 @@ static inline phys_addr_t virt_to_phys(const volatile void *x)
static inline void *phys_to_virt(phys_addr_t x)
{
- return (void *)(__phys_to_virt((unsigned long)(x)));
+ return (void *)__phys_to_virt(x);
}
/*
* Drivers should NOT use these either.
*/
#define __pa(x) __virt_to_phys((unsigned long)(x))
-#define __va(x) ((void *)__phys_to_virt((unsigned long)(x)))
+#define __va(x) ((void *)__phys_to_virt((phys_addr_t)(x)))
#define pfn_to_kaddr(pfn) __va((pfn) << PAGE_SHIFT)
/*
--
1.7.9.5
* [PATCH v3 2/6] ARM: mm: Introduce virt_to_idmap() with an arch hook
2013-10-03 21:17 [PATCH v3 0/6] ARM: mm: Extend the runtime patch stub for PAE systems Santosh Shilimkar
2013-10-03 21:17 ` [PATCH v3 1/6] ARM: mm: use phys_addr_t appropriately in p2v and v2p conversions Santosh Shilimkar
@ 2013-10-03 21:17 ` Santosh Shilimkar
2013-10-03 21:17 ` [PATCH v3 3/6] ARM: mm: Move the idmap print to appropriate place in the code Santosh Shilimkar
` (3 subsequent siblings)
5 siblings, 0 replies; 28+ messages in thread
From: Santosh Shilimkar @ 2013-10-03 21:17 UTC (permalink / raw)
To: linux-arm-kernel
On some PAE systems (e.g. TI Keystone), memory is above the
32-bit addressable limit, and the interconnect provides an
aliased view of parts of physical memory in the 32-bit addressable
space. This alias is strictly for boot time usage, and is not
otherwise usable because of coherency limitations. On such systems,
the idmap mechanism needs to take this aliased mapping into account.
This patch introduces virt_to_idmap() and an arch function pointer which
can be populated by platforms that need it. The necessary idmap call sites
are also converted to the now-available virt_to_idmap(). A function pointer
is used rather than an #ifdef so as to remain compatible with
multi-platform builds.
Most platforms won't touch it, in which case virt_to_idmap()
falls back to the existing virt_to_phys().
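The fallback pattern above is small enough to sketch in standalone C; the
PAGE_OFFSET/PHYS_OFFSET/IDMAP_ALIAS values below are illustrative,
Keystone-like assumptions, not taken from the patch:

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t phys_addr_t;

/* Illustrative layout (assumptions, not from the patch). */
#define PAGE_OFFSET  0xC0000000UL
#define PHYS_OFFSET  0x800000000ULL   /* RAM above the 4G boundary */
#define IDMAP_ALIAS  0x80000000ULL    /* 32-bit addressable boot alias */

static phys_addr_t __virt_to_phys(unsigned long x)
{
	return (phys_addr_t)x - PAGE_OFFSET + PHYS_OFFSET;
}

/* NULL on most platforms; populated only where an idmap alias exists. */
phys_addr_t (*arch_virt_to_idmap)(unsigned long x);

static phys_addr_t virt_to_idmap(unsigned long x)
{
	if (arch_virt_to_idmap)
		return arch_virt_to_idmap(x);
	return __virt_to_phys(x);       /* the common fallback path */
}

/* Hypothetical platform hook: translate into the aliased 32-bit view. */
static phys_addr_t alias_virt_to_idmap(unsigned long x)
{
	return __virt_to_phys(x) - PHYS_OFFSET + IDMAP_ALIAS;
}
```

Because the pointer defaults to NULL, platforms that never assign it pay
only a NULL check and keep the plain virt_to_phys() behaviour.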
Cc: Russell King <linux@arm.linux.org.uk>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
---
arch/arm/include/asm/memory.h | 16 ++++++++++++++++
arch/arm/kernel/smp.c | 2 +-
arch/arm/mm/idmap.c | 5 +++--
3 files changed, 20 insertions(+), 3 deletions(-)
diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
index c133bd9..d9b96c65 100644
--- a/arch/arm/include/asm/memory.h
+++ b/arch/arm/include/asm/memory.h
@@ -173,6 +173,7 @@
*/
#define __PV_BITS_31_24 0x81000000
+extern phys_addr_t (*arch_virt_to_idmap) (unsigned long x);
extern unsigned long __pv_phys_offset;
#define PHYS_OFFSET __pv_phys_offset
@@ -259,6 +260,21 @@ static inline void *phys_to_virt(phys_addr_t x)
#define pfn_to_kaddr(pfn) __va((pfn) << PAGE_SHIFT)
/*
+ * These are for systems that have a hardware interconnect supported alias of
+ * physical memory for idmap purposes. Most cases should leave these
+ * untouched.
+ */
+static inline phys_addr_t __virt_to_idmap(unsigned long x)
+{
+ if (arch_virt_to_idmap)
+ return arch_virt_to_idmap(x);
+ else
+ return __virt_to_phys(x);
+}
+
+#define virt_to_idmap(x) __virt_to_idmap((unsigned long)(x))
+
+/*
* Virtual <-> DMA view memory address translations
* Again, these are *only* valid on the kernel direct mapped RAM
* memory. Use of these is *deprecated* (and that doesn't mean
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 72024ea..a0eb830 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -80,7 +80,7 @@ void __init smp_set_ops(struct smp_operations *ops)
static unsigned long get_arch_pgd(pgd_t *pgd)
{
- phys_addr_t pgdir = virt_to_phys(pgd);
+ phys_addr_t pgdir = virt_to_idmap(pgd);
BUG_ON(pgdir & ARCH_PGD_MASK);
return pgdir >> ARCH_PGD_SHIFT;
}
diff --git a/arch/arm/mm/idmap.c b/arch/arm/mm/idmap.c
index 83cb3ac..c0a1e48 100644
--- a/arch/arm/mm/idmap.c
+++ b/arch/arm/mm/idmap.c
@@ -10,6 +10,7 @@
#include <asm/system_info.h>
pgd_t *idmap_pgd;
+phys_addr_t (*arch_virt_to_idmap) (unsigned long x);
#ifdef CONFIG_ARM_LPAE
static void idmap_add_pmd(pud_t *pud, unsigned long addr, unsigned long end,
@@ -67,8 +68,8 @@ static void identity_mapping_add(pgd_t *pgd, const char *text_start,
unsigned long addr, end;
unsigned long next;
- addr = virt_to_phys(text_start);
- end = virt_to_phys(text_end);
+ addr = virt_to_idmap(text_start);
+ end = virt_to_idmap(text_end);
prot |= PMD_TYPE_SECT | PMD_SECT_AP_WRITE | PMD_SECT_AF;
--
1.7.9.5
* [PATCH v3 3/6] ARM: mm: Move the idmap print to appropriate place in the code
2013-10-03 21:17 [PATCH v3 0/6] ARM: mm: Extend the runtime patch stub for PAE systems Santosh Shilimkar
2013-10-03 21:17 ` [PATCH v3 1/6] ARM: mm: use phys_addr_t appropriately in p2v and v2p conversions Santosh Shilimkar
2013-10-03 21:17 ` [PATCH v3 2/6] ARM: mm: Introduce virt_to_idmap() with an arch hook Santosh Shilimkar
@ 2013-10-03 21:17 ` Santosh Shilimkar
2013-10-03 21:17 ` [PATCH v3 4/6] ARM: mm: Correct virt_to_phys patching for 64 bit physical addresses Santosh Shilimkar
` (2 subsequent siblings)
5 siblings, 0 replies; 28+ messages in thread
From: Santosh Shilimkar @ 2013-10-03 21:17 UTC (permalink / raw)
To: linux-arm-kernel
Commit 9e9a367c29cebd2 ("ARM: Section based HYP idmap") moved
the address conversion inside identity_mapping_add() without the
corresponding print, which carries useful idmap information.
Move the print inside identity_mapping_add() as well to
fix this.
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
---
arch/arm/mm/idmap.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/arch/arm/mm/idmap.c b/arch/arm/mm/idmap.c
index c0a1e48..8e0e52e 100644
--- a/arch/arm/mm/idmap.c
+++ b/arch/arm/mm/idmap.c
@@ -70,6 +70,7 @@ static void identity_mapping_add(pgd_t *pgd, const char *text_start,
addr = virt_to_idmap(text_start);
end = virt_to_idmap(text_end);
+ pr_info("Setting up static identity map for 0x%lx - 0x%lx\n", addr, end);
prot |= PMD_TYPE_SECT | PMD_SECT_AP_WRITE | PMD_SECT_AF;
@@ -91,8 +92,6 @@ static int __init init_static_idmap(void)
if (!idmap_pgd)
return -ENOMEM;
- pr_info("Setting up static identity map for 0x%p - 0x%p\n",
- __idmap_text_start, __idmap_text_end);
identity_mapping_add(idmap_pgd, __idmap_text_start,
__idmap_text_end, 0);
--
1.7.9.5
* [PATCH v3 4/6] ARM: mm: Correct virt_to_phys patching for 64 bit physical addresses
2013-10-03 21:17 [PATCH v3 0/6] ARM: mm: Extend the runtime patch stub for PAE systems Santosh Shilimkar
` (2 preceding siblings ...)
2013-10-03 21:17 ` [PATCH v3 3/6] ARM: mm: Move the idmap print to appropriate place in the code Santosh Shilimkar
@ 2013-10-03 21:17 ` Santosh Shilimkar
2013-10-04 0:17 ` Nicolas Pitre
2013-10-03 21:17 ` [PATCH v3 5/6] ARM: mm: Recreate kernel mappings in early_paging_init() Santosh Shilimkar
2013-10-03 21:18 ` [PATCH v3 6/6] ARM: mm: Change the order of TLB/cache maintenance operations Santosh Shilimkar
5 siblings, 1 reply; 28+ messages in thread
From: Santosh Shilimkar @ 2013-10-03 21:17 UTC (permalink / raw)
To: linux-arm-kernel
From: Sricharan R <r.sricharan@ti.com>
The current phys_to_virt patching mechanism works only for 32 bit
physical addresses, and this patch extends the idea to 64 bit physical
addresses.
The 64 bit v2p patching mechanism patches the higher 8 bits of the physical
address with a constant using a 'mov' instruction, and the lower 32 bits are
patched using 'add'. While this is correct, on platforms where the lowmem
addressable physical memory spans across the 4GB boundary, a carry bit can be
produced as a result of the addition of the lower 32 bits. This has to be
taken into account and added into the upper 32 bits. The patched __pv_offset
and va are added in the lower 32 bits, where __pv_offset can be in two's
complement form when PA_START < VA_START, and that can result in a false
carry bit.
e.g.
1) PA = 0x80000000; VA = 0xC0000000
__pv_offset = PA - VA = 0xC0000000 (2's complement)
2) PA = 0x2 80000000; VA = 0xC0000000
__pv_offset = PA - VA = 0x1 C0000000
So adding __pv_offset + VA produces a carry for (1) as well, but it is not
a true overflow. In order to differentiate a true carry, __pv_offset is
extended to 64 bits and its upper 32 bits hold 0xffffffff when __pv_offset
is in two's complement form; 'mvn #0' is patched in instead of 'mov' for
the same reason. Since the mov, add and sub instructions are patched
with different constants inside the same stub, the rotation field
of the opcode is used to differentiate between them.
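The rotation-field trick relies on the ARM modified-immediate encoding: an
8-bit constant rotated right by twice a 4-bit count. A small illustrative
decoder (not the patch code) shows why the two constants are
distinguishable:

```c
#include <assert.h>
#include <stdint.h>

/* ARM data-processing immediate: the low 12 bits of the opcode hold an
 * 8-bit value (bits 7:0) rotated right by twice the 4-bit rotation
 * count (bits 11:8). */
static uint32_t decode_arm_imm(uint32_t imm12)
{
	uint32_t val = imm12 & 0xff;
	uint32_t rot = 2 * ((imm12 >> 8) & 0xf);

	return rot ? (val >> rot) | (val << (32 - rot)) : val;
}
```

__PV_BITS_31_24 (0x81000000) encodes as imm12 = 0x481 (rotation field 4),
while __PV_BITS_7_0 (0x81) encodes as 0x081 (rotation field 0), so a `tst`
against 0xf00 separates the add/sub stubs from the mov/mvn stub.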
With this, the above examples of the v2p translation become, for VA = 0xC0000000:
1) PA[63:32] = 0xffffffff
PA[31:0] = VA + 0xC0000000 --> results in a carry
PA[63:32] = PA[63:32] + carry
PA[63:0] = 0x0 80000000
2) PA[63:32] = 0x1
PA[31:0] = VA + 0xC0000000 --> results in a carry
PA[63:32] = PA[63:32] + carry
PA[63:0] = 0x2 80000000
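The two worked cases can be replayed with plain 64-bit arithmetic; this
sketch mirrors the adds/adc sequence in C (illustrative, not the generated
assembly):

```c
#include <assert.h>
#include <stdint.h>

/* Mimics the patched stub sequence: the high word comes from the
 * patched 'mov' (or 'mvn #0'), the low word from 'adds', and the
 * carry is folded in with 'adc ..., #0'. */
static uint64_t v2p(uint32_t va, uint32_t pv_offset_hi, uint32_t pv_offset_lo)
{
	uint32_t lo = va + pv_offset_lo;     /* adds %Q0, %1, %2 */
	uint32_t carry = lo < va;            /* carry out of the 32-bit add */
	uint32_t hi = pv_offset_hi + carry;  /* adc %R0, %R0, #0 */

	return ((uint64_t)hi << 32) | lo;
}
```

For case (1) the high word is 0xffffffff from 'mvn #0', so the false carry
wraps it back to 0; for case (2) the carry is genuine and bumps the high
word from 0x1 to 0x2.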
The above ideas were suggested by Nicolas Pitre <nico@linaro.org> as
part of the review of first and second versions of the subject patch.
There is no corresponding change on the phys_to_virt() side, because
computations on the upper 32-bits would be discarded anyway.
Cc: Nicolas Pitre <nico@linaro.org>
Cc: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Sricharan R <r.sricharan@ti.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
---
arch/arm/include/asm/memory.h | 35 +++++++++++++++++++++---
arch/arm/kernel/armksyms.c | 1 +
arch/arm/kernel/head.S | 60 ++++++++++++++++++++++++++---------------
arch/arm/kernel/patch.c | 3 +++
4 files changed, 75 insertions(+), 24 deletions(-)
diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
index d9b96c65..942ad84 100644
--- a/arch/arm/include/asm/memory.h
+++ b/arch/arm/include/asm/memory.h
@@ -172,9 +172,12 @@
* so that all we need to do is modify the 8-bit constant field.
*/
#define __PV_BITS_31_24 0x81000000
+#define __PV_BITS_7_0 0x81
extern phys_addr_t (*arch_virt_to_idmap) (unsigned long x);
-extern unsigned long __pv_phys_offset;
+extern u64 __pv_phys_offset;
+extern u64 __pv_offset;
+
#define PHYS_OFFSET __pv_phys_offset
#define __pv_stub(from,to,instr,type) \
@@ -186,10 +189,36 @@ extern unsigned long __pv_phys_offset;
: "=r" (to) \
: "r" (from), "I" (type))
+#define __pv_stub_mov_hi(t) \
+ __asm__ volatile("@ __pv_stub_mov\n" \
+ "1: mov %R0, %1\n" \
+ " .pushsection .pv_table,\"a\"\n" \
+ " .long 1b\n" \
+ " .popsection\n" \
+ : "=r" (t) \
+ : "I" (__PV_BITS_7_0))
+
+#define __pv_add_carry_stub(x, y) \
+ __asm__ volatile("@ __pv_add_carry_stub\n" \
+ "1: adds %Q0, %1, %2\n" \
+ " adc %R0, %R0, #0\n" \
+ " .pushsection .pv_table,\"a\"\n" \
+ " .long 1b\n" \
+ " .popsection\n" \
+ : "+r" (y) \
+ : "r" (x), "I" (__PV_BITS_31_24) \
+ : "cc")
+
static inline phys_addr_t __virt_to_phys(unsigned long x)
{
- unsigned long t;
- __pv_stub(x, t, "add", __PV_BITS_31_24);
+ phys_addr_t t;
+
+ if (sizeof(phys_addr_t) == 4) {
+ __pv_stub(x, t, "add", __PV_BITS_31_24);
+ } else {
+ __pv_stub_mov_hi(t);
+ __pv_add_carry_stub(x, t);
+ }
return t;
}
diff --git a/arch/arm/kernel/armksyms.c b/arch/arm/kernel/armksyms.c
index 60d3b73..1f031dd 100644
--- a/arch/arm/kernel/armksyms.c
+++ b/arch/arm/kernel/armksyms.c
@@ -155,4 +155,5 @@ EXPORT_SYMBOL(__gnu_mcount_nc);
#ifdef CONFIG_ARM_PATCH_PHYS_VIRT
EXPORT_SYMBOL(__pv_phys_offset);
+EXPORT_SYMBOL(__pv_offset);
#endif
diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index 2c7cc1e..90d04d7 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -536,6 +536,14 @@ ENTRY(fixup_smp)
ldmfd sp!, {r4 - r6, pc}
ENDPROC(fixup_smp)
+#ifdef __ARMEB__
+#define LOW_OFFSET 0x4
+#define HIGH_OFFSET 0x0
+#else
+#define LOW_OFFSET 0x0
+#define HIGH_OFFSET 0x4
+#endif
+
#ifdef CONFIG_ARM_PATCH_PHYS_VIRT
/* __fixup_pv_table - patch the stub instructions with the delta between
@@ -546,17 +554,20 @@ ENDPROC(fixup_smp)
__HEAD
__fixup_pv_table:
adr r0, 1f
- ldmia r0, {r3-r5, r7}
- sub r3, r0, r3 @ PHYS_OFFSET - PAGE_OFFSET
+ ldmia r0, {r3-r7}
+ mvn ip, #0
+ subs r3, r0, r3 @ PHYS_OFFSET - PAGE_OFFSET
add r4, r4, r3 @ adjust table start address
add r5, r5, r3 @ adjust table end address
- add r7, r7, r3 @ adjust __pv_phys_offset address
- str r8, [r7] @ save computed PHYS_OFFSET to __pv_phys_offset
+ add r6, r6, r3 @ adjust __pv_phys_offset address
+ add r7, r7, r3 @ adjust __pv_offset address
+ str r8, [r6, #LOW_OFFSET] @ save computed PHYS_OFFSET to __pv_phys_offset
+ strcc ip, [r7, #HIGH_OFFSET] @ save to __pv_offset high bits
mov r6, r3, lsr #24 @ constant for add/sub instructions
teq r3, r6, lsl #24 @ must be 16MiB aligned
THUMB( it ne @ cross section branch )
bne __error
- str r6, [r7, #4] @ save to __pv_offset
+ str r3, [r7, #LOW_OFFSET] @ save to __pv_offset low bits
b __fixup_a_pv_table
ENDPROC(__fixup_pv_table)
@@ -565,9 +576,18 @@ ENDPROC(__fixup_pv_table)
.long __pv_table_begin
.long __pv_table_end
2: .long __pv_phys_offset
+ .long __pv_offset
.text
__fixup_a_pv_table:
+ adr r0, 3f
+ ldr r6, [r0]
+ add r6, r6, r3
+ ldr r0, [r6, #HIGH_OFFSET] @ pv_offset high word
+ ldr r6, [r6, #LOW_OFFSET] @ pv_offset low word
+ mov r6, r6, lsr #24
+ cmn r0, #1
+ moveq r0, #0x400000 @ set bit 22, mov to mvn instruction
#ifdef CONFIG_THUMB2_KERNEL
lsls r6, #24
beq 2f
@@ -582,9 +602,15 @@ __fixup_a_pv_table:
b 2f
1: add r7, r3
ldrh ip, [r7, #2]
- and ip, 0x8f00
- orr ip, r6 @ mask in offset bits 31-24
+ tst ip, #0x4000
+ and ip, #0x8f00
+ orrne ip, r6 @ mask in offset bits 31-24
+ orreq ip, r0 @ mask in offset bits 7-0
strh ip, [r7, #2]
+ ldrheq ip, [r7]
+ biceq ip, #0x20
+ orreq ip, ip, r0, lsr #16
+ strheq ip, [r7]
2: cmp r4, r5
ldrcc r7, [r4], #4 @ use branch for delay slot
bcc 1b
@@ -593,7 +619,10 @@ __fixup_a_pv_table:
b 2f
1: ldr ip, [r7, r3]
bic ip, ip, #0x000000ff
- orr ip, ip, r6 @ mask in offset bits 31-24
+ tst ip, #0xf00 @ check the rotation field
+ orrne ip, ip, r6 @ mask in offset bits 31-24
+ biceq ip, ip, #0x400000 @ clear bit 22
+ orreq ip, ip, r0 @ mask in offset bits 7-0
str ip, [r7, r3]
2: cmp r4, r5
ldrcc r7, [r4], #4 @ use branch for delay slot
@@ -602,28 +631,17 @@ __fixup_a_pv_table:
#endif
ENDPROC(__fixup_a_pv_table)
+3: .long __pv_offset
+
ENTRY(fixup_pv_table)
stmfd sp!, {r4 - r7, lr}
- ldr r2, 2f @ get address of __pv_phys_offset
mov r3, #0 @ no offset
mov r4, r0 @ r0 = table start
add r5, r0, r1 @ r1 = table size
- ldr r6, [r2, #4] @ get __pv_offset
bl __fixup_a_pv_table
ldmfd sp!, {r4 - r7, pc}
ENDPROC(fixup_pv_table)
- .align
-2: .long __pv_phys_offset
-
- .data
- .globl __pv_phys_offset
- .type __pv_phys_offset, %object
-__pv_phys_offset:
- .long 0
- .size __pv_phys_offset, . - __pv_phys_offset
-__pv_offset:
- .long 0
#endif
#include "head-common.S"
diff --git a/arch/arm/kernel/patch.c b/arch/arm/kernel/patch.c
index 07314af..8356312 100644
--- a/arch/arm/kernel/patch.c
+++ b/arch/arm/kernel/patch.c
@@ -8,6 +8,9 @@
#include "patch.h"
+u64 __pv_phys_offset __attribute__((section(".data")));
+u64 __pv_offset __attribute__((section(".data")));
+
struct patch {
void *addr;
unsigned int insn;
--
1.7.9.5
* [PATCH v3 5/6] ARM: mm: Recreate kernel mappings in early_paging_init()
2013-10-03 21:17 [PATCH v3 0/6] ARM: mm: Extend the runtime patch stub for PAE systems Santosh Shilimkar
` (3 preceding siblings ...)
2013-10-03 21:17 ` [PATCH v3 4/6] ARM: mm: Correct virt_to_phys patching for 64 bit physical addresses Santosh Shilimkar
@ 2013-10-03 21:17 ` Santosh Shilimkar
2013-10-04 0:23 ` Nicolas Pitre
2013-10-04 15:59 ` Will Deacon
2013-10-03 21:18 ` [PATCH v3 6/6] ARM: mm: Change the order of TLB/cache maintenance operations Santosh Shilimkar
5 siblings, 2 replies; 28+ messages in thread
From: Santosh Shilimkar @ 2013-10-03 21:17 UTC (permalink / raw)
To: linux-arm-kernel
This patch adds a step in the init sequence, in order to recreate
the kernel code/data page table mappings prior to full paging
initialization. This is necessary on LPAE systems that run from
physical address space outside the 4G limit. On these systems,
this implementation provides a machine descriptor hook that allows
the PHYS_OFFSET to be overridden in a machine specific fashion.
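The hook is invoked like the other optional machine_desc callbacks; a
minimal standalone sketch of the call pattern (the structures are
simplified stand-ins and the Keystone-style addresses are assumptions):

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for the kernel's __pv_phys_offset. */
static uint64_t pv_phys_offset = 0x80000000ULL;   /* boot-time alias view */

struct machine_desc {
	const char *name;
	void (*init_meminfo)(void);   /* optional hook added by this patch */
};

/* Hypothetical Keystone hook: switch PHYS_OFFSET to the real >4G view. */
static void keystone_init_meminfo(void)
{
	pv_phys_offset = 0x800000000ULL;
}

static void early_paging_init(const struct machine_desc *mdesc)
{
	if (!mdesc->init_meminfo)
		return;               /* most machines: nothing to do */
	mdesc->init_meminfo();
	/* ...the real code then re-runs fixup_pv_table() and rebuilds
	 * the kernel mappings with the new offset... */
}
```

Machines that leave init_meminfo unset go through the no-op path, so the
hook costs nothing on existing platforms.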
Cc: Nicolas Pitre <nico@linaro.org>
Cc: Russell King <linux@arm.linux.org.uk>
Signed-off-by: R Sricharan <r.sricharan@ti.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
---
arch/arm/include/asm/mach/arch.h | 1 +
arch/arm/kernel/setup.c | 3 ++
arch/arm/mm/mmu.c | 82 ++++++++++++++++++++++++++++++++++++++
3 files changed, 86 insertions(+)
diff --git a/arch/arm/include/asm/mach/arch.h b/arch/arm/include/asm/mach/arch.h
index 402a2bc..17a3fa2 100644
--- a/arch/arm/include/asm/mach/arch.h
+++ b/arch/arm/include/asm/mach/arch.h
@@ -49,6 +49,7 @@ struct machine_desc {
bool (*smp_init)(void);
void (*fixup)(struct tag *, char **,
struct meminfo *);
+ void (*init_meminfo)(void);
void (*reserve)(void);/* reserve mem blocks */
void (*map_io)(void);/* IO mapping function */
void (*init_early)(void);
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index 0e1e2b3..b9a6dac 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -73,6 +73,7 @@ __setup("fpe=", fpe_setup);
#endif
extern void paging_init(const struct machine_desc *desc);
+extern void early_paging_init(const struct machine_desc *, struct proc_info_list *);
extern void sanity_check_meminfo(void);
extern enum reboot_mode reboot_mode;
extern void setup_dma_zone(const struct machine_desc *desc);
@@ -878,6 +879,8 @@ void __init setup_arch(char **cmdline_p)
parse_early_param();
sort(&meminfo.bank, meminfo.nr_banks, sizeof(meminfo.bank[0]), meminfo_cmp, NULL);
+
+ early_paging_init(mdesc, lookup_processor_type(read_cpuid_id()));
sanity_check_meminfo();
arm_memblock_init(&meminfo, mdesc);
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index b1d17ee..47c7497 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -28,6 +28,7 @@
#include <asm/highmem.h>
#include <asm/system_info.h>
#include <asm/traps.h>
+#include <asm/procinfo.h>
#include <asm/mach/arch.h>
#include <asm/mach/map.h>
@@ -1315,6 +1316,87 @@ static void __init map_lowmem(void)
}
}
+#ifdef CONFIG_ARM_LPAE
+extern void fixup_pv_table(const void *, unsigned long);
+extern const void *__pv_table_begin, *__pv_table_end;
+
+/*
+ * early_paging_init() recreates boot time page table setup, allowing machines
+ * to switch over to a high (>4G) address space on LPAE systems
+ */
+void __init early_paging_init(const struct machine_desc *mdesc,
+ struct proc_info_list *procinfo)
+{
+ pmdval_t pmdprot = procinfo->__cpu_mm_mmu_flags;
+ unsigned long map_start, map_end;
+ pgd_t *pgd0, *pgdk;
+ pud_t *pud0, *pudk;
+ pmd_t *pmd0, *pmdk;
+ phys_addr_t phys;
+ int i;
+
+ /* remap kernel code and data */
+ map_start = init_mm.start_code;
+ map_end = init_mm.brk;
+
+ /* get a handle on things... */
+ pgd0 = pgd_offset_k(0);
+ pud0 = pud_offset(pgd0, 0);
+ pmd0 = pmd_offset(pud0, 0);
+
+ pgdk = pgd_offset_k(map_start);
+ pudk = pud_offset(pgdk, map_start);
+ pmdk = pmd_offset(pudk, map_start);
+
+ phys = PHYS_OFFSET;
+
+ if (mdesc->init_meminfo) {
+ mdesc->init_meminfo();
+ /* Run the patch stub to update the constants */
+ fixup_pv_table(&__pv_table_begin,
+ (&__pv_table_end - &__pv_table_begin) << 2);
+
+ /*
+ * Cache cleaning operations for self-modifying code
+ * We should clean the entries by MVA but running a
+ * for loop over every pv_table entry pointer would
+ * just complicate the code.
+ */
+ flush_cache_louis();
+ dsb();
+ isb();
+ }
+
+ /* remap level 1 table */
+ for (i = 0; i < PTRS_PER_PGD; i++) {
+ *pud0++ = __pud(__pa(pmd0) | PMD_TYPE_TABLE | L_PGD_SWAPPER);
+ pmd0 += PTRS_PER_PMD;
+ }
+
+ /* remap pmds for kernel mapping */
+ phys = __pa(map_start) & PMD_MASK;
+ do {
+ *pmdk++ = __pmd(phys | pmdprot);
+ phys += PMD_SIZE;
+ } while (phys < map_end);
+
+ flush_cache_all();
+ cpu_set_ttbr(0, __pa(pgd0));
+ cpu_set_ttbr(1, __pa(pgd0) + TTBR1_OFFSET);
+ local_flush_tlb_all();
+}
+
+#else
+
+void __init early_paging_init(const struct machine_desc *mdesc,
+ struct proc_info_list *procinfo)
+{
+ if (mdesc->init_meminfo)
+ mdesc->init_meminfo();
+}
+
+#endif
+
/*
* paging_init() sets up the page tables, initialises the zone memory
* maps, and sets up the zero page, bad page and bad page tables.
--
1.7.9.5
* [PATCH v3 6/6] ARM: mm: Change the order of TLB/cache maintenance operations.
2013-10-03 21:17 [PATCH v3 0/6] ARM: mm: Extend the runtime patch stub for PAE systems Santosh Shilimkar
` (4 preceding siblings ...)
2013-10-03 21:17 ` [PATCH v3 5/6] ARM: mm: Recreate kernel mappings in early_paging_init() Santosh Shilimkar
@ 2013-10-03 21:18 ` Santosh Shilimkar
2013-10-04 0:25 ` Nicolas Pitre
` (2 more replies)
5 siblings, 3 replies; 28+ messages in thread
From: Santosh Shilimkar @ 2013-10-03 21:18 UTC (permalink / raw)
To: linux-arm-kernel
From: Sricharan R <r.sricharan@ti.com>
As per the ARM ARMv7 Architecture Reference Manual, the sequence of TLB
maintenance operations after making changes to the translation table is
to clean the dcache first, then invalidate the TLB. With the current
sequence we see cache corruption when flush_cache_all() is called after
local_flush_tlb_all().
STR rx, [Translation table entry]
; write new entry to the translation table
Clean cache line [Translation table entry]
DSB
; ensures visibility of the data cleaned from the D Cache
Invalidate TLB entry by MVA (and ASID if non-global) [page address]
Invalidate BTC
DSB
; ensure completion of the Invalidate TLB operation
ISB
; ensure table changes visible to instruction fetch
The issue is seen only with LPAE + a THUMB2-built kernel + 64 bit patching,
which is a little bit weird.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>
Signed-off-by: Sricharan R <r.sricharan@ti.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
---
arch/arm/mm/mmu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index 47c7497..49cba8a 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -1280,8 +1280,8 @@ static void __init devicemaps_init(const struct machine_desc *mdesc)
* any write-allocated cache lines in the vector page are written
* back. After this point, we can start to touch devices again.
*/
- local_flush_tlb_all();
flush_cache_all();
+ local_flush_tlb_all();
}
static void __init kmap_init(void)
--
1.7.9.5
* [PATCH v3 4/6] ARM: mm: Correct virt_to_phys patching for 64 bit physical addresses
2013-10-03 21:17 ` [PATCH v3 4/6] ARM: mm: Correct virt_to_phys patching for 64 bit physical addresses Santosh Shilimkar
@ 2013-10-04 0:17 ` Nicolas Pitre
2013-10-04 5:37 ` Sricharan R
0 siblings, 1 reply; 28+ messages in thread
From: Nicolas Pitre @ 2013-10-04 0:17 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, 3 Oct 2013, Santosh Shilimkar wrote:
> From: Sricharan R <r.sricharan@ti.com>
>
> The current phys_to_virt patching mechanism works only for 32 bit
> physical addresses, and this patch extends the idea to 64 bit physical
> addresses.
>
> The 64 bit v2p patching mechanism patches the higher 8 bits of the physical
> address with a constant using a 'mov' instruction, and the lower 32 bits are
> patched using 'add'. While this is correct, on platforms where the lowmem
> addressable physical memory spans across the 4GB boundary, a carry bit can be
> produced as a result of the addition of the lower 32 bits. This has to be
> taken into account and added into the upper 32 bits. The patched __pv_offset
> and va are added in the lower 32 bits, where __pv_offset can be in two's
> complement form when PA_START < VA_START, and that can result in a false
> carry bit.
>
> e.g
> 1) PA = 0x80000000; VA = 0xC0000000
> __pv_offset = PA - VA = 0xC0000000 (2's complement)
>
> 2) PA = 0x2 80000000; VA = 0xC0000000
> __pv_offset = PA - VA = 0x1 C0000000
>
> So adding __pv_offset + VA produces a carry for (1) as well, but it is not
> a true overflow. In order to differentiate a true carry, __pv_offset is
> extended to 64 bits and its upper 32 bits hold 0xffffffff when __pv_offset
> is in two's complement form; 'mvn #0' is patched in instead of 'mov' for
> the same reason. Since the mov, add and sub instructions are patched
> with different constants inside the same stub, the rotation field
> of the opcode is used to differentiate between them.
>
> With this, the above examples of the v2p translation become, for VA = 0xC0000000:
> 1) PA[63:32] = 0xffffffff
> PA[31:0] = VA + 0xC0000000 --> results in a carry
> PA[63:32] = PA[63:32] + carry
>
> PA[63:0] = 0x0 80000000
>
> 2) PA[63:32] = 0x1
> PA[31:0] = VA + 0xC0000000 --> results in a carry
> PA[63:32] = PA[63:32] + carry
>
> PA[63:0] = 0x2 80000000
>
> The above ideas were suggested by Nicolas Pitre <nico@linaro.org> as
> part of the review of first and second versions of the subject patch.
>
> There is no corresponding change on the phys_to_virt() side, because
> computations on the upper 32-bits would be discarded anyway.
>
> Cc: Nicolas Pitre <nico@linaro.org>
> Cc: Russell King <linux@arm.linux.org.uk>
>
> Signed-off-by: Sricharan R <r.sricharan@ti.com>
> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Almost there ...
> ---
> arch/arm/include/asm/memory.h | 35 +++++++++++++++++++++---
> arch/arm/kernel/armksyms.c | 1 +
> arch/arm/kernel/head.S | 60 ++++++++++++++++++++++++++---------------
> arch/arm/kernel/patch.c | 3 +++
> 4 files changed, 75 insertions(+), 24 deletions(-)
>
> diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
> index d9b96c65..942ad84 100644
> --- a/arch/arm/include/asm/memory.h
> +++ b/arch/arm/include/asm/memory.h
> @@ -172,9 +172,12 @@
> * so that all we need to do is modify the 8-bit constant field.
> */
> #define __PV_BITS_31_24 0x81000000
> +#define __PV_BITS_7_0 0x81
>
> extern phys_addr_t (*arch_virt_to_idmap) (unsigned long x);
> -extern unsigned long __pv_phys_offset;
> +extern u64 __pv_phys_offset;
> +extern u64 __pv_offset;
> +
> #define PHYS_OFFSET __pv_phys_offset
>
> #define __pv_stub(from,to,instr,type) \
> @@ -186,10 +189,36 @@ extern unsigned long __pv_phys_offset;
> : "=r" (to) \
> : "r" (from), "I" (type))
>
> +#define __pv_stub_mov_hi(t) \
> + __asm__ volatile("@ __pv_stub_mov\n" \
> + "1: mov %R0, %1\n" \
> + " .pushsection .pv_table,\"a\"\n" \
> + " .long 1b\n" \
> + " .popsection\n" \
> + : "=r" (t) \
> + : "I" (__PV_BITS_7_0))
> +
> +#define __pv_add_carry_stub(x, y) \
> + __asm__ volatile("@ __pv_add_carry_stub\n" \
> + "1: adds %Q0, %1, %2\n" \
> + " adc %R0, %R0, #0\n" \
> + " .pushsection .pv_table,\"a\"\n" \
> + " .long 1b\n" \
> + " .popsection\n" \
> + : "+r" (y) \
> + : "r" (x), "I" (__PV_BITS_31_24) \
The third operand i.e. __PV_BITS_31_24 is useless here.
> + : "cc")
> +
> static inline phys_addr_t __virt_to_phys(unsigned long x)
> {
> - unsigned long t;
> - __pv_stub(x, t, "add", __PV_BITS_31_24);
> + phys_addr_t t;
> +
> + if (sizeof(phys_addr_t) == 4) {
> + __pv_stub(x, t, "add", __PV_BITS_31_24);
> + } else {
> + __pv_stub_mov_hi(t);
> + __pv_add_carry_stub(x, t);
> + }
> return t;
> }
>
> diff --git a/arch/arm/kernel/armksyms.c b/arch/arm/kernel/armksyms.c
> index 60d3b73..1f031dd 100644
> --- a/arch/arm/kernel/armksyms.c
> +++ b/arch/arm/kernel/armksyms.c
> @@ -155,4 +155,5 @@ EXPORT_SYMBOL(__gnu_mcount_nc);
>
> #ifdef CONFIG_ARM_PATCH_PHYS_VIRT
> EXPORT_SYMBOL(__pv_phys_offset);
> +EXPORT_SYMBOL(__pv_offset);
> #endif
> diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
> index 2c7cc1e..90d04d7 100644
> --- a/arch/arm/kernel/head.S
> +++ b/arch/arm/kernel/head.S
> @@ -536,6 +536,14 @@ ENTRY(fixup_smp)
> ldmfd sp!, {r4 - r6, pc}
> ENDPROC(fixup_smp)
>
> +#ifdef __ARMEB__
> +#define LOW_OFFSET 0x4
> +#define HIGH_OFFSET 0x0
> +#else
> +#define LOW_OFFSET 0x0
> +#define HIGH_OFFSET 0x4
> +#endif
> +
> #ifdef CONFIG_ARM_PATCH_PHYS_VIRT
>
> /* __fixup_pv_table - patch the stub instructions with the delta between
> @@ -546,17 +554,20 @@ ENDPROC(fixup_smp)
> __HEAD
> __fixup_pv_table:
> adr r0, 1f
> - ldmia r0, {r3-r5, r7}
> - sub r3, r0, r3 @ PHYS_OFFSET - PAGE_OFFSET
> + ldmia r0, {r3-r7}
> + mvn ip, #0
> + subs r3, r0, r3 @ PHYS_OFFSET - PAGE_OFFSET
> add r4, r4, r3 @ adjust table start address
> add r5, r5, r3 @ adjust table end address
> - add r7, r7, r3 @ adjust __pv_phys_offset address
> - str r8, [r7] @ save computed PHYS_OFFSET to __pv_phys_offset
> + add r6, r6, r3 @ adjust __pv_phys_offset address
> + add r7, r7, r3 @ adjust __pv_offset address
> + str r8, [r6, #LOW_OFFSET] @ save computed PHYS_OFFSET to __pv_phys_offset
> + strcc ip, [r7, #HIGH_OFFSET] @ save to __pv_offset high bits
> mov r6, r3, lsr #24 @ constant for add/sub instructions
> teq r3, r6, lsl #24 @ must be 16MiB aligned
> THUMB( it ne @ cross section branch )
> bne __error
> - str r6, [r7, #4] @ save to __pv_offset
> + str r3, [r7, #LOW_OFFSET] @ save to __pv_offset low bits
> b __fixup_a_pv_table
> ENDPROC(__fixup_pv_table)
>
> @@ -565,9 +576,18 @@ ENDPROC(__fixup_pv_table)
> .long __pv_table_begin
> .long __pv_table_end
> 2: .long __pv_phys_offset
> + .long __pv_offset
>
> .text
> __fixup_a_pv_table:
> + adr r0, 3f
> + ldr r6, [r0]
> + add r6, r6, r3
> + ldr r0, [r6, #HIGH_OFFSET] @ pv_offset high word
> + ldr r6, [r6, #LOW_OFFSET] @ pv_offset low word
> + mov r6, r6, lsr #24
> + cmn r0, #1
> + moveq r0, #0x400000 @ set bit 22, mov to mvn instruction
> #ifdef CONFIG_THUMB2_KERNEL
> lsls r6, #24
> beq 2f
> @@ -582,9 +602,15 @@ __fixup_a_pv_table:
> b 2f
> 1: add r7, r3
> ldrh ip, [r7, #2]
> - and ip, 0x8f00
> - orr ip, r6 @ mask in offset bits 31-24
> + tst ip, #0x4000
> + and ip, #0x8f00
> + orrne ip, r6 @ mask in offset bits 31-24
> + orreq ip, r0 @ mask in offset bits 7-0
> strh ip, [r7, #2]
> + ldrheq ip, [r7]
> + biceq ip, #0x20
> + orreq ip, ip, r0, lsr #16
> + strheq ip, [r7]
> 2: cmp r4, r5
> ldrcc r7, [r4], #4 @ use branch for delay slot
> bcc 1b
> @@ -593,7 +619,10 @@ __fixup_a_pv_table:
> b 2f
> 1: ldr ip, [r7, r3]
> bic ip, ip, #0x000000ff
> - orr ip, ip, r6 @ mask in offset bits 31-24
> + tst ip, #0xf00 @ check the rotation field
> + orrne ip, ip, r6 @ mask in offset bits 31-24
> + biceq ip, ip, #0x400000 @ clear bit 22
> + orreq ip, ip, r0 @ mask in offset bits 7-0
> str ip, [r7, r3]
> 2: cmp r4, r5
> ldrcc r7, [r4], #4 @ use branch for delay slot
> @@ -602,28 +631,17 @@ __fixup_a_pv_table:
> #endif
> ENDPROC(__fixup_a_pv_table)
>
> +3: .long __pv_offset
> +
> ENTRY(fixup_pv_table)
> stmfd sp!, {r4 - r7, lr}
> - ldr r2, 2f @ get address of __pv_phys_offset
> mov r3, #0 @ no offset
> mov r4, r0 @ r0 = table start
> add r5, r0, r1 @ r1 = table size
> - ldr r6, [r2, #4] @ get __pv_offset
> bl __fixup_a_pv_table
> ldmfd sp!, {r4 - r7, pc}
> ENDPROC(fixup_pv_table)
>
> - .align
> -2: .long __pv_phys_offset
> -
> - .data
> - .globl __pv_phys_offset
> - .type __pv_phys_offset, %object
> -__pv_phys_offset:
> - .long 0
> - .size __pv_phys_offset, . - __pv_phys_offset
> -__pv_offset:
> - .long 0
> #endif
>
> #include "head-common.S"
> diff --git a/arch/arm/kernel/patch.c b/arch/arm/kernel/patch.c
> index 07314af..8356312 100644
> --- a/arch/arm/kernel/patch.c
> +++ b/arch/arm/kernel/patch.c
> @@ -8,6 +8,9 @@
>
> #include "patch.h"
>
> +u64 __pv_phys_offset __attribute__((section(".data")));
> +u64 __pv_offset __attribute__((section(".data")));
Please add a comment explaining why you force those variables out of the
.bss section. This is unlikely to be obvious to people.
In fact, is there a reason why you moved those out of head.S? You only
needed to replace the .long with .quad to match the u64 type.
I think I might have suggested moving them out if they were to be typed
with phys_addr_t, but using a fixed u64 is simpler.
Nicolas
* [PATCH v3 5/6] ARM: mm: Recreate kernel mappings in early_paging_init()
2013-10-03 21:17 ` [PATCH v3 5/6] ARM: mm: Recreate kernel mappings in early_paging_init() Santosh Shilimkar
@ 2013-10-04 0:23 ` Nicolas Pitre
2013-10-04 15:59 ` Will Deacon
1 sibling, 0 replies; 28+ messages in thread
From: Nicolas Pitre @ 2013-10-04 0:23 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, 3 Oct 2013, Santosh Shilimkar wrote:
> This patch adds a step in the init sequence, in order to recreate
> the kernel code/data page table mappings prior to full paging
> initialization. This is necessary on LPAE systems that run out of
> a physical address space outside the 4G limit. On these systems,
> this implementation provides a machine descriptor hook that allows
> the PHYS_OFFSET to be overridden in a machine specific fashion.
>
> Cc: Nicolas Pitre <nico@linaro.org>
> Cc: Russell King <linux@arm.linux.org.uk>
>
> Signed-off-by: R Sricharan <r.sricharan@ti.com>
> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
> ---
> arch/arm/include/asm/mach/arch.h | 1 +
> arch/arm/kernel/setup.c | 3 ++
> arch/arm/mm/mmu.c | 82 ++++++++++++++++++++++++++++++++++++++
> 3 files changed, 86 insertions(+)
>
> diff --git a/arch/arm/include/asm/mach/arch.h b/arch/arm/include/asm/mach/arch.h
> index 402a2bc..17a3fa2 100644
> --- a/arch/arm/include/asm/mach/arch.h
> +++ b/arch/arm/include/asm/mach/arch.h
> @@ -49,6 +49,7 @@ struct machine_desc {
> bool (*smp_init)(void);
> void (*fixup)(struct tag *, char **,
> struct meminfo *);
> + void (*init_meminfo)(void);
> void (*reserve)(void);/* reserve mem blocks */
> void (*map_io)(void);/* IO mapping function */
> void (*init_early)(void);
> diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
> index 0e1e2b3..b9a6dac 100644
> --- a/arch/arm/kernel/setup.c
> +++ b/arch/arm/kernel/setup.c
> @@ -73,6 +73,7 @@ __setup("fpe=", fpe_setup);
> #endif
>
> extern void paging_init(const struct machine_desc *desc);
> +extern void early_paging_init(const struct machine_desc *, struct proc_info_list *);
> extern void sanity_check_meminfo(void);
> extern enum reboot_mode reboot_mode;
> extern void setup_dma_zone(const struct machine_desc *desc);
> @@ -878,6 +879,8 @@ void __init setup_arch(char **cmdline_p)
> parse_early_param();
>
> sort(&meminfo.bank, meminfo.nr_banks, sizeof(meminfo.bank[0]), meminfo_cmp, NULL);
> +
> + early_paging_init(mdesc, lookup_processor_type(read_cpuid_id()));
> sanity_check_meminfo();
> arm_memblock_init(&meminfo, mdesc);
>
> diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
> index b1d17ee..47c7497 100644
> --- a/arch/arm/mm/mmu.c
> +++ b/arch/arm/mm/mmu.c
> @@ -28,6 +28,7 @@
> #include <asm/highmem.h>
> #include <asm/system_info.h>
> #include <asm/traps.h>
> +#include <asm/procinfo.h>
>
> #include <asm/mach/arch.h>
> #include <asm/mach/map.h>
> @@ -1315,6 +1316,87 @@ static void __init map_lowmem(void)
> }
> }
>
> +#ifdef CONFIG_ARM_LPAE
> +extern void fixup_pv_table(const void *, unsigned long);
> +extern const void *__pv_table_begin, *__pv_table_end;
> +
> +/*
> + * early_paging_init() recreates boot time page table setup, allowing machines
> + * to switch over to a high (>4G) address space on LPAE systems
> + */
> +void __init early_paging_init(const struct machine_desc *mdesc,
> + struct proc_info_list *procinfo)
> +{
> + pmdval_t pmdprot = procinfo->__cpu_mm_mmu_flags;
> + unsigned long map_start, map_end;
> + pgd_t *pgd0, *pgdk;
> + pud_t *pud0, *pudk;
> + pmd_t *pmd0, *pmdk;
> + phys_addr_t phys;
> + int i;
> +
> + /* remap kernel code and data */
> + map_start = init_mm.start_code;
> + map_end = init_mm.brk;
> +
> + /* get a handle on things... */
> + pgd0 = pgd_offset_k(0);
> + pud0 = pud_offset(pgd0, 0);
> + pmd0 = pmd_offset(pud0, 0);
> +
> + pgdk = pgd_offset_k(map_start);
> + pudk = pud_offset(pgdk, map_start);
> + pmdk = pmd_offset(pudk, map_start);
> +
> + phys = PHYS_OFFSET;
> +
> + if (mdesc->init_meminfo) {
> + mdesc->init_meminfo();
> + /* Run the patch stub to update the constants */
> + fixup_pv_table(&__pv_table_begin,
> + (&__pv_table_end - &__pv_table_begin) << 2);
> +
> + /*
> + * Cache cleaning operations for self-modifying code
> + * We should clean the entries by MVA but running a
> + * for loop over every pv_table entry pointer would
> + * just complicate the code.
> + */
> + flush_cache_louis();
> + dsb();
> + isb();
> + }
> +
> + /* remap level 1 table */
> + for (i = 0; i < PTRS_PER_PGD; i++) {
> + *pud0++ = __pud(__pa(pmd0) | PMD_TYPE_TABLE | L_PGD_SWAPPER);
> + pmd0 += PTRS_PER_PMD;
> + }
> +
> + /* remap pmds for kernel mapping */
> + phys = __pa(map_start) & PMD_MASK;
> + do {
> + *pmdk++ = __pmd(phys | pmdprot);
> + phys += PMD_SIZE;
> + } while (phys < map_end);
> +
> + flush_cache_all();
> + cpu_set_ttbr(0, __pa(pgd0));
> + cpu_set_ttbr(1, __pa(pgd0) + TTBR1_OFFSET);
> + local_flush_tlb_all();
> +}
> +
> +#else
> +
> +void __init early_paging_init(const struct machine_desc *mdesc,
> + struct proc_info_list *procinfo)
> +{
> + if (mdesc->init_meminfo)
> + mdesc->init_meminfo();
> +}
> +
> +#endif
> +
> /*
> * paging_init() sets up the page tables, initialises the zone memory
> * maps, and sets up the zero page, bad page and bad page tables.
> --
> 1.7.9.5
>
* [PATCH v3 6/6] ARM: mm: Change the order of TLB/cache maintenance operations.
2013-10-03 21:18 ` [PATCH v3 6/6] ARM: mm: Change the order of TLB/cache maintenance operations Santosh Shilimkar
@ 2013-10-04 0:25 ` Nicolas Pitre
2013-10-04 8:46 ` Russell King - ARM Linux
2013-10-04 15:52 ` Will Deacon
2 siblings, 0 replies; 28+ messages in thread
From: Nicolas Pitre @ 2013-10-04 0:25 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, 3 Oct 2013, Santosh Shilimkar wrote:
> From: Sricharan R <r.sricharan@ti.com>
>
> As per the ARM ARMv7 manual, the sequence of TLB maintenance
> operations after making changes to the translation table is
> to clean the dcache first, then invalidate the TLB. With
> the current sequence we see cache corruption when
> flush_cache_all() is called after tlb_flush_all().
>
> STR rx, [Translation table entry]
> ; write new entry to the translation table
> Clean cache line [Translation table entry]
> DSB
> ; ensures visibility of the data cleaned from the D Cache
> Invalidate TLB entry by MVA (and ASID if non-global) [page address]
> Invalidate BTC
> DSB
> ; ensure completion of the Invalidate TLB operation
> ISB
> ; ensure table changes visible to instruction fetch
>
> The issue is seen only with LPAE + Thumb-2 built kernel + 64-bit patching,
> which is a little bit weird.
>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
> Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>
>
> Signed-off-by: Sricharan R <r.sricharan@ti.com>
> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
> ---
> arch/arm/mm/mmu.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
> index 47c7497..49cba8a 100644
> --- a/arch/arm/mm/mmu.c
> +++ b/arch/arm/mm/mmu.c
> @@ -1280,8 +1280,8 @@ static void __init devicemaps_init(const struct machine_desc *mdesc)
> * any write-allocated cache lines in the vector page are written
> * back. After this point, we can start to touch devices again.
> */
> - local_flush_tlb_all();
> flush_cache_all();
> + local_flush_tlb_all();
> }
>
> static void __init kmap_init(void)
> --
> 1.7.9.5
>
* [PATCH v3 4/6] ARM: mm: Correct virt_to_phys patching for 64 bit physical addresses
2013-10-04 0:17 ` Nicolas Pitre
@ 2013-10-04 5:37 ` Sricharan R
2013-10-04 13:02 ` Nicolas Pitre
0 siblings, 1 reply; 28+ messages in thread
From: Sricharan R @ 2013-10-04 5:37 UTC (permalink / raw)
To: linux-arm-kernel
Hi,
On Friday 04 October 2013 05:47 AM, Nicolas Pitre wrote:
> On Thu, 3 Oct 2013, Santosh Shilimkar wrote:
>
>> From: Sricharan R <r.sricharan@ti.com>
>>
>> The current phys_to_virt patching mechanism works only for 32 bit
>> physical addresses and this patch extends the idea for 64bit physical
>> addresses.
>>
>> The 64bit v2p patching mechanism patches the higher 8 bits of physical
>> address with a constant using 'mov' instruction and lower 32bits are patched
>> using 'add'. While this is correct, on those platforms where the lowmem addressable
>> physical memory spans across the 4GB boundary, a carry bit can be produced as a
>> result of the addition of the lower 32 bits. This has to be taken into account and added
>> into the upper 32 bits. The patched __pv_offset and va are added in the lower 32 bits, where
>> __pv_offset can be in two's complement form when PA_START < VA_START and that can
>> result in a false carry bit.
>>
>> e.g
>> 1) PA = 0x80000000; VA = 0xC0000000
>> __pv_offset = PA - VA = 0xC0000000 (2's complement)
>>
>> 2) PA = 0x2 80000000; VA = 0xC0000000
>> __pv_offset = PA - VA = 0x1 C0000000
>>
>> So adding __pv_offset + VA should never result in a true overflow for (1).
>> So in order to identify a true carry, __pv_offset is extended
>> to 64 bits and the upper 32 bits will have 0xffffffff if __pv_offset is
>> in 2's complement form. So 'mvn #0' is inserted instead of 'mov' while patching
>> for the same reason. Since the mov, add, sub instructions are to be patched
>> with different constants inside the same stub, the rotation field
>> of the opcode is used to differentiate between them.
>>
>> So the above examples for v2p translation becomes for VA=0xC0000000,
>> 1) PA[63:32] = 0xffffffff
>> PA[31:0] = VA + 0xC0000000 --> results in a carry
>> PA[63:32] = PA[63:32] + carry
>>
>> PA[63:0] = 0x0 80000000
>>
>> 2) PA[63:32] = 0x1
>> PA[31:0] = VA + 0xC0000000 --> results in a carry
>> PA[63:32] = PA[63:32] + carry
>>
>> PA[63:0] = 0x2 80000000
>>
>> The above ideas were suggested by Nicolas Pitre <nico@linaro.org> as
>> part of the review of first and second versions of the subject patch.
>>
>> There is no corresponding change on the phys_to_virt() side, because
>> computations on the upper 32-bits would be discarded anyway.
>>
>> Cc: Nicolas Pitre <nico@linaro.org>
>> Cc: Russell King <linux@arm.linux.org.uk>
>>
>> Signed-off-by: Sricharan R <r.sricharan@ti.com>
>> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
> Almost there ...
>
>> ---
>> arch/arm/include/asm/memory.h | 35 +++++++++++++++++++++---
>> arch/arm/kernel/armksyms.c | 1 +
>> arch/arm/kernel/head.S | 60 ++++++++++++++++++++++++++---------------
>> arch/arm/kernel/patch.c | 3 +++
>> 4 files changed, 75 insertions(+), 24 deletions(-)
>>
>> diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
>> index d9b96c65..942ad84 100644
>> --- a/arch/arm/include/asm/memory.h
>> +++ b/arch/arm/include/asm/memory.h
>> @@ -172,9 +172,12 @@
>> * so that all we need to do is modify the 8-bit constant field.
>> */
>> #define __PV_BITS_31_24 0x81000000
>> +#define __PV_BITS_7_0 0x81
>>
>> extern phys_addr_t (*arch_virt_to_idmap) (unsigned long x);
>> -extern unsigned long __pv_phys_offset;
>> +extern u64 __pv_phys_offset;
>> +extern u64 __pv_offset;
>> +
>> #define PHYS_OFFSET __pv_phys_offset
>>
>> #define __pv_stub(from,to,instr,type) \
>> @@ -186,10 +189,36 @@ extern unsigned long __pv_phys_offset;
>> : "=r" (to) \
>> : "r" (from), "I" (type))
>>
>> +#define __pv_stub_mov_hi(t) \
>> + __asm__ volatile("@ __pv_stub_mov\n" \
>> + "1: mov %R0, %1\n" \
>> + " .pushsection .pv_table,\"a\"\n" \
>> + " .long 1b\n" \
>> + " .popsection\n" \
>> + : "=r" (t) \
>> + : "I" (__PV_BITS_7_0))
>> +
>> +#define __pv_add_carry_stub(x, y) \
>> + __asm__ volatile("@ __pv_add_carry_stub\n" \
>> + "1: adds %Q0, %1, %2\n" \
>> + " adc %R0, %R0, #0\n" \
>> + " .pushsection .pv_table,\"a\"\n" \
>> + " .long 1b\n" \
>> + " .popsection\n" \
>> + : "+r" (y) \
>> + : "r" (x), "I" (__PV_BITS_31_24) \
> The third operand i.e. __PV_BITS_31_24 is useless here.
This is used to encode the correct rotation, and we use
this in the patching code to identify the offset to
be patched.
>> + : "cc")
>> +
>> static inline phys_addr_t __virt_to_phys(unsigned long x)
>> {
>> - unsigned long t;
>> - __pv_stub(x, t, "add", __PV_BITS_31_24);
>> + phys_addr_t t;
>> +
>> + if (sizeof(phys_addr_t) == 4) {
>> + __pv_stub(x, t, "add", __PV_BITS_31_24);
>> + } else {
>> + __pv_stub_mov_hi(t);
>> + __pv_add_carry_stub(x, t);
>> + }
>> return t;
>> }
>>
>> diff --git a/arch/arm/kernel/armksyms.c b/arch/arm/kernel/armksyms.c
>> index 60d3b73..1f031dd 100644
>> --- a/arch/arm/kernel/armksyms.c
>> +++ b/arch/arm/kernel/armksyms.c
>> @@ -155,4 +155,5 @@ EXPORT_SYMBOL(__gnu_mcount_nc);
>>
>> #ifdef CONFIG_ARM_PATCH_PHYS_VIRT
>> EXPORT_SYMBOL(__pv_phys_offset);
>> +EXPORT_SYMBOL(__pv_offset);
>> #endif
>> diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
>> index 2c7cc1e..90d04d7 100644
>> --- a/arch/arm/kernel/head.S
>> +++ b/arch/arm/kernel/head.S
>> @@ -536,6 +536,14 @@ ENTRY(fixup_smp)
>> ldmfd sp!, {r4 - r6, pc}
>> ENDPROC(fixup_smp)
>>
>> +#ifdef __ARMEB__
>> +#define LOW_OFFSET 0x4
>> +#define HIGH_OFFSET 0x0
>> +#else
>> +#define LOW_OFFSET 0x0
>> +#define HIGH_OFFSET 0x4
>> +#endif
>> +
>> #ifdef CONFIG_ARM_PATCH_PHYS_VIRT
>>
>> /* __fixup_pv_table - patch the stub instructions with the delta between
>> @@ -546,17 +554,20 @@ ENDPROC(fixup_smp)
>> __HEAD
>> __fixup_pv_table:
>> adr r0, 1f
>> - ldmia r0, {r3-r5, r7}
>> - sub r3, r0, r3 @ PHYS_OFFSET - PAGE_OFFSET
>> + ldmia r0, {r3-r7}
>> + mvn ip, #0
>> + subs r3, r0, r3 @ PHYS_OFFSET - PAGE_OFFSET
>> add r4, r4, r3 @ adjust table start address
>> add r5, r5, r3 @ adjust table end address
>> - add r7, r7, r3 @ adjust __pv_phys_offset address
>> - str r8, [r7] @ save computed PHYS_OFFSET to __pv_phys_offset
>> + add r6, r6, r3 @ adjust __pv_phys_offset address
>> + add r7, r7, r3 @ adjust __pv_offset address
>> + str r8, [r6, #LOW_OFFSET] @ save computed PHYS_OFFSET to __pv_phys_offset
>> + strcc ip, [r7, #HIGH_OFFSET] @ save to __pv_offset high bits
>> mov r6, r3, lsr #24 @ constant for add/sub instructions
>> teq r3, r6, lsl #24 @ must be 16MiB aligned
>> THUMB( it ne @ cross section branch )
>> bne __error
>> - str r6, [r7, #4] @ save to __pv_offset
>> + str r3, [r7, #LOW_OFFSET] @ save to __pv_offset low bits
>> b __fixup_a_pv_table
>> ENDPROC(__fixup_pv_table)
>>
>> @@ -565,9 +576,18 @@ ENDPROC(__fixup_pv_table)
>> .long __pv_table_begin
>> .long __pv_table_end
>> 2: .long __pv_phys_offset
>> + .long __pv_offset
>>
>> .text
>> __fixup_a_pv_table:
>> + adr r0, 3f
>> + ldr r6, [r0]
>> + add r6, r6, r3
>> + ldr r0, [r6, #HIGH_OFFSET] @ pv_offset high word
>> + ldr r6, [r6, #LOW_OFFSET] @ pv_offset low word
>> + mov r6, r6, lsr #24
>> + cmn r0, #1
>> + moveq r0, #0x400000 @ set bit 22, mov to mvn instruction
>> #ifdef CONFIG_THUMB2_KERNEL
>> lsls r6, #24
>> beq 2f
>> @@ -582,9 +602,15 @@ __fixup_a_pv_table:
>> b 2f
>> 1: add r7, r3
>> ldrh ip, [r7, #2]
>> - and ip, 0x8f00
>> - orr ip, r6 @ mask in offset bits 31-24
>> + tst ip, #0x4000
>> + and ip, #0x8f00
>> + orrne ip, r6 @ mask in offset bits 31-24
>> + orreq ip, r0 @ mask in offset bits 7-0
>> strh ip, [r7, #2]
>> + ldrheq ip, [r7]
>> + biceq ip, #0x20
>> + orreq ip, ip, r0, lsr #16
>> + strheq ip, [r7]
>> 2: cmp r4, r5
>> ldrcc r7, [r4], #4 @ use branch for delay slot
>> bcc 1b
>> @@ -593,7 +619,10 @@ __fixup_a_pv_table:
>> b 2f
>> 1: ldr ip, [r7, r3]
>> bic ip, ip, #0x000000ff
>> - orr ip, ip, r6 @ mask in offset bits 31-24
>> + tst ip, #0xf00 @ check the rotation field
>> + orrne ip, ip, r6 @ mask in offset bits 31-24
>> + biceq ip, ip, #0x400000 @ clear bit 22
>> + orreq ip, ip, r0 @ mask in offset bits 7-0
>> str ip, [r7, r3]
>> 2: cmp r4, r5
>> ldrcc r7, [r4], #4 @ use branch for delay slot
>> @@ -602,28 +631,17 @@ __fixup_a_pv_table:
>> #endif
>> ENDPROC(__fixup_a_pv_table)
>>
>> +3: .long __pv_offset
>> +
>> ENTRY(fixup_pv_table)
>> stmfd sp!, {r4 - r7, lr}
>> - ldr r2, 2f @ get address of __pv_phys_offset
>> mov r3, #0 @ no offset
>> mov r4, r0 @ r0 = table start
>> add r5, r0, r1 @ r1 = table size
>> - ldr r6, [r2, #4] @ get __pv_offset
>> bl __fixup_a_pv_table
>> ldmfd sp!, {r4 - r7, pc}
>> ENDPROC(fixup_pv_table)
>>
>> - .align
>> -2: .long __pv_phys_offset
>> -
>> - .data
>> - .globl __pv_phys_offset
>> - .type __pv_phys_offset, %object
>> -__pv_phys_offset:
>> - .long 0
>> - .size __pv_phys_offset, . - __pv_phys_offset
>> -__pv_offset:
>> - .long 0
>> #endif
>>
>> #include "head-common.S"
>> diff --git a/arch/arm/kernel/patch.c b/arch/arm/kernel/patch.c
>> index 07314af..8356312 100644
>> --- a/arch/arm/kernel/patch.c
>> +++ b/arch/arm/kernel/patch.c
>> @@ -8,6 +8,9 @@
>>
>> #include "patch.h"
>>
>> +u64 __pv_phys_offset __attribute__((section(".data")));
>> +u64 __pv_offset __attribute__((section(".data")));
> Please add a comment explaining why you force those variables out of the
> .bss section. This is unlikely to be obvious to people.
>
> In fact, is there a reason why you moved those out of head.S? You only
> needed to replace the .long with .quad to match the u64 type.
>
> I think I might have suggested moving them out if they were to be typed
> with phys_addr_t, but using a fixed u64 is simpler.
Yes, I moved it here after your comments :-). Since it is always u64,
I can move it back to head.S with .quad as well.
Regards,
Sricharan
* [PATCH v3 6/6] ARM: mm: Change the order of TLB/cache maintenance operations.
2013-10-03 21:18 ` [PATCH v3 6/6] ARM: mm: Change the order of TLB/cache maintenance operations Santosh Shilimkar
2013-10-04 0:25 ` Nicolas Pitre
@ 2013-10-04 8:46 ` Russell King - ARM Linux
2013-10-04 13:14 ` Nicolas Pitre
2013-10-04 15:52 ` Will Deacon
2 siblings, 1 reply; 28+ messages in thread
From: Russell King - ARM Linux @ 2013-10-04 8:46 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Oct 03, 2013 at 05:18:00PM -0400, Santosh Shilimkar wrote:
> From: Sricharan R <r.sricharan@ti.com>
>
> As per the ARM ARMv7 manual, the sequence of TLB maintenance
> operations after making changes to the translation table is
> to clean the dcache first, then invalidate the TLB. With
> the current sequence we see cache corruption when
> flush_cache_all() is called after tlb_flush_all().
This needs testing on ARMv4 CPUs which don't have a way to flush the
cache except by reading memory - hence they need the new page table
entries to be visible to the MMU before calling flush_cache_all().
* [PATCH v3 4/6] ARM: mm: Correct virt_to_phys patching for 64 bit physical addresses
2013-10-04 5:37 ` Sricharan R
@ 2013-10-04 13:02 ` Nicolas Pitre
2013-10-07 19:25 ` Santosh Shilimkar
0 siblings, 1 reply; 28+ messages in thread
From: Nicolas Pitre @ 2013-10-04 13:02 UTC (permalink / raw)
To: linux-arm-kernel
On Fri, 4 Oct 2013, Sricharan R wrote:
> Hi,
> On Friday 04 October 2013 05:47 AM, Nicolas Pitre wrote:
> > On Thu, 3 Oct 2013, Santosh Shilimkar wrote:
> >
> >> From: Sricharan R <r.sricharan@ti.com>
> >>
> >> The current phys_to_virt patching mechanism works only for 32 bit
> >> physical addresses and this patch extends the idea for 64bit physical
> >> addresses.
> >>
> >> The 64bit v2p patching mechanism patches the higher 8 bits of physical
> >> address with a constant using 'mov' instruction and lower 32bits are patched
> >> using 'add'. While this is correct, on those platforms where the lowmem addressable
> >> physical memory spans across the 4GB boundary, a carry bit can be produced as a
> >> result of the addition of the lower 32 bits. This has to be taken into account and added
> >> into the upper 32 bits. The patched __pv_offset and va are added in the lower 32 bits, where
> >> __pv_offset can be in two's complement form when PA_START < VA_START and that can
> >> result in a false carry bit.
> >>
> >> e.g
> >> 1) PA = 0x80000000; VA = 0xC0000000
> >> __pv_offset = PA - VA = 0xC0000000 (2's complement)
> >>
> >> 2) PA = 0x2 80000000; VA = 0xC0000000
> >> __pv_offset = PA - VA = 0x1 C0000000
> >>
> >> So adding __pv_offset + VA should never result in a true overflow for (1).
> >> So in order to identify a true carry, __pv_offset is extended
> >> to 64 bits and the upper 32 bits will have 0xffffffff if __pv_offset is
> >> in 2's complement form. So 'mvn #0' is inserted instead of 'mov' while patching
> >> for the same reason. Since the mov, add, sub instructions are to be patched
> >> with different constants inside the same stub, the rotation field
> >> of the opcode is used to differentiate between them.
> >>
> >> So the above examples for v2p translation becomes for VA=0xC0000000,
> >> 1) PA[63:32] = 0xffffffff
> >> PA[31:0] = VA + 0xC0000000 --> results in a carry
> >> PA[63:32] = PA[63:32] + carry
> >>
> >> PA[63:0] = 0x0 80000000
> >>
> >> 2) PA[63:32] = 0x1
> >> PA[31:0] = VA + 0xC0000000 --> results in a carry
> >> PA[63:32] = PA[63:32] + carry
> >>
> >> PA[63:0] = 0x2 80000000
> >>
> >> The above ideas were suggested by Nicolas Pitre <nico@linaro.org> as
> >> part of the review of first and second versions of the subject patch.
> >>
> >> There is no corresponding change on the phys_to_virt() side, because
> >> computations on the upper 32-bits would be discarded anyway.
> >>
> >> Cc: Nicolas Pitre <nico@linaro.org>
> >> Cc: Russell King <linux@arm.linux.org.uk>
> >>
> >> Signed-off-by: Sricharan R <r.sricharan@ti.com>
> >> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
> > Almost there ...
> >
> >> ---
> >> arch/arm/include/asm/memory.h | 35 +++++++++++++++++++++---
> >> arch/arm/kernel/armksyms.c | 1 +
> >> arch/arm/kernel/head.S | 60 ++++++++++++++++++++++++++---------------
> >> arch/arm/kernel/patch.c | 3 +++
> >> 4 files changed, 75 insertions(+), 24 deletions(-)
> >>
> >> diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
> >> index d9b96c65..942ad84 100644
> >> --- a/arch/arm/include/asm/memory.h
> >> +++ b/arch/arm/include/asm/memory.h
> >> @@ -172,9 +172,12 @@
> >> * so that all we need to do is modify the 8-bit constant field.
> >> */
> >> #define __PV_BITS_31_24 0x81000000
> >> +#define __PV_BITS_7_0 0x81
> >>
> >> extern phys_addr_t (*arch_virt_to_idmap) (unsigned long x);
> >> -extern unsigned long __pv_phys_offset;
> >> +extern u64 __pv_phys_offset;
> >> +extern u64 __pv_offset;
> >> +
> >> #define PHYS_OFFSET __pv_phys_offset
> >>
> >> #define __pv_stub(from,to,instr,type) \
> >> @@ -186,10 +189,36 @@ extern unsigned long __pv_phys_offset;
> >> : "=r" (to) \
> >> : "r" (from), "I" (type))
> >>
> >> +#define __pv_stub_mov_hi(t) \
> >> + __asm__ volatile("@ __pv_stub_mov\n" \
> >> + "1: mov %R0, %1\n" \
> >> + " .pushsection .pv_table,\"a\"\n" \
> >> + " .long 1b\n" \
> >> + " .popsection\n" \
> >> + : "=r" (t) \
> >> + : "I" (__PV_BITS_7_0))
> >> +
> >> +#define __pv_add_carry_stub(x, y) \
> >> + __asm__ volatile("@ __pv_add_carry_stub\n" \
> >> + "1: adds %Q0, %1, %2\n" \
> >> + " adc %R0, %R0, #0\n" \
> >> + " .pushsection .pv_table,\"a\"\n" \
> >> + " .long 1b\n" \
> >> + " .popsection\n" \
> >> + : "+r" (y) \
> >> + : "r" (x), "I" (__PV_BITS_31_24) \
> > The third operand i.e. __PV_BITS_31_24 is useless here.
> This is used to encode the correct rotation, and we use
> this in the patching code to identify the offset to
> be patched.
Obviously! Please disregard this comment -- I was confused.
> >> + : "cc")
> >> +
> >> static inline phys_addr_t __virt_to_phys(unsigned long x)
> >> {
> >> - unsigned long t;
> >> - __pv_stub(x, t, "add", __PV_BITS_31_24);
> >> + phys_addr_t t;
> >> +
> >> + if (sizeof(phys_addr_t) == 4) {
> >> + __pv_stub(x, t, "add", __PV_BITS_31_24);
> >> + } else {
> >> + __pv_stub_mov_hi(t);
> >> + __pv_add_carry_stub(x, t);
> >> + }
> >> return t;
> >> }
> >>
> >> diff --git a/arch/arm/kernel/armksyms.c b/arch/arm/kernel/armksyms.c
> >> index 60d3b73..1f031dd 100644
> >> --- a/arch/arm/kernel/armksyms.c
> >> +++ b/arch/arm/kernel/armksyms.c
> >> @@ -155,4 +155,5 @@ EXPORT_SYMBOL(__gnu_mcount_nc);
> >>
> >> #ifdef CONFIG_ARM_PATCH_PHYS_VIRT
> >> EXPORT_SYMBOL(__pv_phys_offset);
> >> +EXPORT_SYMBOL(__pv_offset);
> >> #endif
> >> diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
> >> index 2c7cc1e..90d04d7 100644
> >> --- a/arch/arm/kernel/head.S
> >> +++ b/arch/arm/kernel/head.S
> >> @@ -536,6 +536,14 @@ ENTRY(fixup_smp)
> >> ldmfd sp!, {r4 - r6, pc}
> >> ENDPROC(fixup_smp)
> >>
> >> +#ifdef __ARMEB__
> >> +#define LOW_OFFSET 0x4
> >> +#define HIGH_OFFSET 0x0
> >> +#else
> >> +#define LOW_OFFSET 0x0
> >> +#define HIGH_OFFSET 0x4
> >> +#endif
> >> +
> >> #ifdef CONFIG_ARM_PATCH_PHYS_VIRT
> >>
> >> /* __fixup_pv_table - patch the stub instructions with the delta between
> >> @@ -546,17 +554,20 @@ ENDPROC(fixup_smp)
> >> __HEAD
> >> __fixup_pv_table:
> >> adr r0, 1f
> >> - ldmia r0, {r3-r5, r7}
> >> - sub r3, r0, r3 @ PHYS_OFFSET - PAGE_OFFSET
> >> + ldmia r0, {r3-r7}
> >> + mvn ip, #0
> >> + subs r3, r0, r3 @ PHYS_OFFSET - PAGE_OFFSET
> >> add r4, r4, r3 @ adjust table start address
> >> add r5, r5, r3 @ adjust table end address
> >> - add r7, r7, r3 @ adjust __pv_phys_offset address
> >> - str r8, [r7] @ save computed PHYS_OFFSET to __pv_phys_offset
> >> + add r6, r6, r3 @ adjust __pv_phys_offset address
> >> + add r7, r7, r3 @ adjust __pv_offset address
> >> + str r8, [r6, #LOW_OFFSET] @ save computed PHYS_OFFSET to __pv_phys_offset
> >> + strcc ip, [r7, #HIGH_OFFSET] @ save to __pv_offset high bits
> >> mov r6, r3, lsr #24 @ constant for add/sub instructions
> >> teq r3, r6, lsl #24 @ must be 16MiB aligned
> >> THUMB( it ne @ cross section branch )
> >> bne __error
> >> - str r6, [r7, #4] @ save to __pv_offset
> >> + str r3, [r7, #LOW_OFFSET] @ save to __pv_offset low bits
> >> b __fixup_a_pv_table
> >> ENDPROC(__fixup_pv_table)
> >>
> >> @@ -565,9 +576,18 @@ ENDPROC(__fixup_pv_table)
> >> .long __pv_table_begin
> >> .long __pv_table_end
> >> 2: .long __pv_phys_offset
> >> + .long __pv_offset
> >>
> >> .text
> >> __fixup_a_pv_table:
> >> + adr r0, 3f
> >> + ldr r6, [r0]
> >> + add r6, r6, r3
> >> + ldr r0, [r6, #HIGH_OFFSET] @ pv_offset high word
> >> + ldr r6, [r6, #LOW_OFFSET] @ pv_offset low word
> >> + mov r6, r6, lsr #24
> >> + cmn r0, #1
> >> + moveq r0, #0x400000 @ set bit 22, mov to mvn instruction
> >> #ifdef CONFIG_THUMB2_KERNEL
> >> lsls r6, #24
> >> beq 2f
> >> @@ -582,9 +602,15 @@ __fixup_a_pv_table:
> >> b 2f
> >> 1: add r7, r3
> >> ldrh ip, [r7, #2]
> >> - and ip, 0x8f00
> >> - orr ip, r6 @ mask in offset bits 31-24
> >> + tst ip, #0x4000
> >> + and ip, #0x8f00
> >> + orrne ip, r6 @ mask in offset bits 31-24
> >> + orreq ip, r0 @ mask in offset bits 7-0
> >> strh ip, [r7, #2]
> >> + ldrheq ip, [r7]
> >> + biceq ip, #0x20
> >> + orreq ip, ip, r0, lsr #16
> >> + strheq ip, [r7]
> >> 2: cmp r4, r5
> >> ldrcc r7, [r4], #4 @ use branch for delay slot
> >> bcc 1b
> >> @@ -593,7 +619,10 @@ __fixup_a_pv_table:
> >> b 2f
> >> 1: ldr ip, [r7, r3]
> >> bic ip, ip, #0x000000ff
> >> - orr ip, ip, r6 @ mask in offset bits 31-24
> >> + tst ip, #0xf00 @ check the rotation field
> >> + orrne ip, ip, r6 @ mask in offset bits 31-24
> >> + biceq ip, ip, #0x400000 @ clear bit 22
> >> + orreq ip, ip, r0 @ mask in offset bits 7-0
> >> str ip, [r7, r3]
> >> 2: cmp r4, r5
> >> ldrcc r7, [r4], #4 @ use branch for delay slot
> >> @@ -602,28 +631,17 @@ __fixup_a_pv_table:
> >> #endif
> >> ENDPROC(__fixup_a_pv_table)
> >>
> >> +3: .long __pv_offset
> >> +
> >> ENTRY(fixup_pv_table)
> >> stmfd sp!, {r4 - r7, lr}
> >> - ldr r2, 2f @ get address of __pv_phys_offset
> >> mov r3, #0 @ no offset
> >> mov r4, r0 @ r0 = table start
> >> add r5, r0, r1 @ r1 = table size
> >> - ldr r6, [r2, #4] @ get __pv_offset
> >> bl __fixup_a_pv_table
> >> ldmfd sp!, {r4 - r7, pc}
> >> ENDPROC(fixup_pv_table)
> >>
> >> - .align
> >> -2: .long __pv_phys_offset
> >> -
> >> - .data
> >> - .globl __pv_phys_offset
> >> - .type __pv_phys_offset, %object
> >> -__pv_phys_offset:
> >> - .long 0
> >> - .size __pv_phys_offset, . - __pv_phys_offset
> >> -__pv_offset:
> >> - .long 0
> >> #endif
> >>
> >> #include "head-common.S"
> >> diff --git a/arch/arm/kernel/patch.c b/arch/arm/kernel/patch.c
> >> index 07314af..8356312 100644
> >> --- a/arch/arm/kernel/patch.c
> >> +++ b/arch/arm/kernel/patch.c
> >> @@ -8,6 +8,9 @@
> >>
> >> #include "patch.h"
> >>
> >> +u64 __pv_phys_offset __attribute__((section(".data")));
> >> +u64 __pv_offset __attribute__((section(".data")));
> > Please add a comment explaining why you force those variables out of the
> > .bss section. This is unlikely to be obvious to people.
> >
> > In fact, is there a reason why you moved those out of head.S? You only
> > needed to replace the .long with .quad to match the u64 type.
> >
> > I think I might have suggested moving them out if they were to be typed
> > with phys_addr_t, but using a fixed u64 is simpler.
> Yes, I moved it here after your comments :-). Since it is always u64,
> I can move it to head.S with .quad as well.
The reason behind my suggestion was to use phys_addr_t for those
variables, which is easier in C code given that phys_addr_t can be 32 or
64 bits. But that makes the assembly more complicated. With a fixed
type it is not required to move them.
Once this is done, you can add:
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Nicolas
* [PATCH v3 6/6] ARM: mm: Change the order of TLB/cache maintenance operations.
2013-10-04 8:46 ` Russell King - ARM Linux
@ 2013-10-04 13:14 ` Nicolas Pitre
2013-10-04 13:19 ` Santosh Shilimkar
0 siblings, 1 reply; 28+ messages in thread
From: Nicolas Pitre @ 2013-10-04 13:14 UTC (permalink / raw)
To: linux-arm-kernel
On Fri, 4 Oct 2013, Russell King - ARM Linux wrote:
> On Thu, Oct 03, 2013 at 05:18:00PM -0400, Santosh Shilimkar wrote:
> > From: Sricharan R <r.sricharan@ti.com>
> >
> > As per the ARM ARMv7 manual, the sequence of TLB maintenance
> > operations after making changes to the translation table is
> > to clean the dcache first, then invalidate the TLB. With
> > the current sequence we see cache corruption when
> > flush_cache_all() is called after tlb_flush_all().
>
> This needs testing on ARMv4 CPUs which don't have a way to flush the
> cache except by reading memory - hence they need the new page table
> entries to be visible to the MMU before calling flush_cache_all().
I suspect you might be one of the few individuals still having the
ability to test new kernels on ARMv4 CPUs.
Nicolas
* [PATCH v3 6/6] ARM: mm: Change the order of TLB/cache maintenance operations.
2013-10-04 13:14 ` Nicolas Pitre
@ 2013-10-04 13:19 ` Santosh Shilimkar
0 siblings, 0 replies; 28+ messages in thread
From: Santosh Shilimkar @ 2013-10-04 13:19 UTC (permalink / raw)
To: linux-arm-kernel
On Friday 04 October 2013 09:14 AM, Nicolas Pitre wrote:
> On Fri, 4 Oct 2013, Russell King - ARM Linux wrote:
>
>> On Thu, Oct 03, 2013 at 05:18:00PM -0400, Santosh Shilimkar wrote:
>>> From: Sricharan R <r.sricharan@ti.com>
>>>
>>> As per the ARM ARMv7 manual, the sequence of TLB maintenance
>>> operations after making changes to the translation table is
>>> to clean the dcache first, then invalidate the TLB. With
>>> the current sequence we see cache corruption when
>>> flush_cache_all() is called after tlb_flush_all().
>>
>> This needs testing on ARMv4 CPUs which don't have a way to flush the
>> cache except by reading memory - hence they need the new page table
>> entries to be visible to the MMU before calling flush_cache_all().
>
> I suspect you might be one of the few individuals still having the
> ability to test new kernels on ARMv4 CPUs.
>
At least I don't have any ARMv4 based system to validate it.
Regards,
Santosh
* [PATCH v3 6/6] ARM: mm: Change the order of TLB/cache maintenance operations.
2013-10-03 21:18 ` [PATCH v3 6/6] ARM: mm: Change the order of TLB/cache maintenance operations Santosh Shilimkar
2013-10-04 0:25 ` Nicolas Pitre
2013-10-04 8:46 ` Russell King - ARM Linux
@ 2013-10-04 15:52 ` Will Deacon
2013-10-04 16:03 ` Santosh Shilimkar
2 siblings, 1 reply; 28+ messages in thread
From: Will Deacon @ 2013-10-04 15:52 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Oct 03, 2013 at 10:18:00PM +0100, Santosh Shilimkar wrote:
> From: Sricharan R <r.sricharan@ti.com>
>
> As per the ARM ARMv7 manual, the sequence of TLB maintenance
> operations after making changes to the translation table is
> to clean the dcache first, then invalidate the TLB. With
> the current sequence we see cache corruption when
> flush_cache_all() is called after tlb_flush_all().
>
> STR rx, [Translation table entry]
> ; write new entry to the translation table
> Clean cache line [Translation table entry]
> DSB
> ; ensures visibility of the data cleaned from the D Cache
> Invalidate TLB entry by MVA (and ASID if non-global) [page address]
> Invalidate BTC
> DSB
> ; ensure completion of the Invalidate TLB operation
> ISB
> ; ensure table changes visible to instruction fetch
>
> The issue is seen only with LPAE + THUMB BUILT KERNEL + 64BIT patching,
> which is a little bit weird.
NAK.
I don't buy your reasoning. All current LPAE implementations also implement
the multi-processing extensions, meaning that the cache flush isn't required
to make the PTEs visible to the table walker. The dsb from the TLB_WB flag
is sufficient, so I think you still have some debugging to do as this change
is likely masking a problem elsewhere.
On top of that, create_mapping does all the flushing you need (for the !SMP
case) when the tables are initialised, so this code doesn't need changing.
Will
* [PATCH v3 5/6] ARM: mm: Recreate kernel mappings in early_paging_init()
2013-10-03 21:17 ` [PATCH v3 5/6] ARM: mm: Recreate kernel mappings in early_paging_init() Santosh Shilimkar
2013-10-04 0:23 ` Nicolas Pitre
@ 2013-10-04 15:59 ` Will Deacon
2013-10-04 16:12 ` Santosh Shilimkar
1 sibling, 1 reply; 28+ messages in thread
From: Will Deacon @ 2013-10-04 15:59 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Oct 03, 2013 at 10:17:59PM +0100, Santosh Shilimkar wrote:
> This patch adds a step in the init sequence, in order to recreate
> the kernel code/data page table mappings prior to full paging
> initialization. This is necessary on LPAE systems that run out of
> a physical address space outside the 4G limit. On these systems,
> this implementation provides a machine descriptor hook that allows
> the PHYS_OFFSET to be overridden in a machine specific fashion.
[...]
> diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
> index b1d17ee..47c7497 100644
> --- a/arch/arm/mm/mmu.c
> +++ b/arch/arm/mm/mmu.c
> @@ -28,6 +28,7 @@
> #include <asm/highmem.h>
> #include <asm/system_info.h>
> #include <asm/traps.h>
> +#include <asm/procinfo.h>
>
> #include <asm/mach/arch.h>
> #include <asm/mach/map.h>
> @@ -1315,6 +1316,87 @@ static void __init map_lowmem(void)
> }
> }
>
> +#ifdef CONFIG_ARM_LPAE
> +extern void fixup_pv_table(const void *, unsigned long);
> +extern const void *__pv_table_begin, *__pv_table_end;
> +
> +/*
> + * early_paging_init() recreates boot time page table setup, allowing machines
> + * to switch over to a high (>4G) address space on LPAE systems
> + */
> +void __init early_paging_init(const struct machine_desc *mdesc,
> + struct proc_info_list *procinfo)
> +{
> + pmdval_t pmdprot = procinfo->__cpu_mm_mmu_flags;
> + unsigned long map_start, map_end;
> + pgd_t *pgd0, *pgdk;
> + pud_t *pud0, *pudk;
> + pmd_t *pmd0, *pmdk;
> + phys_addr_t phys;
> + int i;
> +
> + /* remap kernel code and data */
> + map_start = init_mm.start_code;
> + map_end = init_mm.brk;
> +
> + /* get a handle on things... */
> + pgd0 = pgd_offset_k(0);
> + pud0 = pud_offset(pgd0, 0);
> + pmd0 = pmd_offset(pud0, 0);
> +
> + pgdk = pgd_offset_k(map_start);
> + pudk = pud_offset(pgdk, map_start);
> + pmdk = pmd_offset(pudk, map_start);
> +
> + phys = PHYS_OFFSET;
> +
> + if (mdesc->init_meminfo) {
> + mdesc->init_meminfo();
> + /* Run the patch stub to update the constants */
> + fixup_pv_table(&__pv_table_begin,
> + (&__pv_table_end - &__pv_table_begin) << 2);
> +
> + /*
> + * Cache cleaning operations for self-modifying code
> + * We should clean the entries by MVA but running a
> + * for loop over every pv_table entry pointer would
> + * just complicate the code.
> + */
> + flush_cache_louis();
> + dsb();
> + isb();
You don't need either of these barriers.
> + }
> +
> + /* remap level 1 table */
> + for (i = 0; i < PTRS_PER_PGD; i++) {
> + *pud0++ = __pud(__pa(pmd0) | PMD_TYPE_TABLE | L_PGD_SWAPPER);
> + pmd0 += PTRS_PER_PMD;
> + }
> +
> + /* remap pmds for kernel mapping */
> + phys = __pa(map_start) & PMD_MASK;
> + do {
> + *pmdk++ = __pmd(phys | pmdprot);
> + phys += PMD_SIZE;
> + } while (phys < map_end);
> +
> + flush_cache_all();
Why are you being so heavyweight with your cacheflushing? If you're just
interested in flushing the new page tables, then use the proper accessors to
build them. The only case I think you need to flush the world is for VIVT,
which you won't have with LPAE (you could have a BUG_ON here).
> + cpu_set_ttbr(0, __pa(pgd0));
> + cpu_set_ttbr(1, __pa(pgd0) + TTBR1_OFFSET);
Can you not use cpu_switch_mm with the init_mm for this?
Will
* [PATCH v3 6/6] ARM: mm: Change the order of TLB/cache maintenance operations.
2013-10-04 15:52 ` Will Deacon
@ 2013-10-04 16:03 ` Santosh Shilimkar
2013-10-09 18:56 ` Santosh Shilimkar
0 siblings, 1 reply; 28+ messages in thread
From: Santosh Shilimkar @ 2013-10-04 16:03 UTC (permalink / raw)
To: linux-arm-kernel
On Friday 04 October 2013 11:52 AM, Will Deacon wrote:
> On Thu, Oct 03, 2013 at 10:18:00PM +0100, Santosh Shilimkar wrote:
>> From: Sricharan R <r.sricharan@ti.com>
>>
>> As per the ARM ARMv7 manual, the sequence of TLB maintenance
>> operations after making changes to the translation table is
>> to clean the dcache first, then invalidate the TLB. With
>> the current sequence we see cache corruption when
>> flush_cache_all() is called after tlb_flush_all().
>>
>> STR rx, [Translation table entry]
>> ; write new entry to the translation table
>> Clean cache line [Translation table entry]
>> DSB
>> ; ensures visibility of the data cleaned from the D Cache
>> Invalidate TLB entry by MVA (and ASID if non-global) [page address]
>> Invalidate BTC
>> DSB
>> ; ensure completion of the Invalidate TLB operation
>> ISB
>> ; ensure table changes visible to instruction fetch
>>
>> The issue is seen only with LPAE + THUMB BUILT KERNEL + 64BIT patching,
>> which is a little bit weird.
>
> NAK.
>
> I don't buy your reasoning. All current LPAE implementations also implement
> the multi-processing extensions, meaning that the cache flush isn't required
> to make the PTEs visible to the table walker. The dsb from the TLB_WB flag
> is sufficient, so I think you still have some debugging to do as this change
> is likely masking a problem elsewhere.
>
> On top of that, create_mapping does all the flushing you need (for the !SMP
> case) when the tables are initialised, so this code doesn't need changing.
>
Fair enough. We will drop this patch from this series and continue to look
at the issue further. As such the patch has no hard dependency on the rest
of the series.
Regards,
Santosh
* [PATCH v3 5/6] ARM: mm: Recreate kernel mappings in early_paging_init()
2013-10-04 15:59 ` Will Deacon
@ 2013-10-04 16:12 ` Santosh Shilimkar
2013-10-07 19:34 ` Santosh Shilimkar
0 siblings, 1 reply; 28+ messages in thread
From: Santosh Shilimkar @ 2013-10-04 16:12 UTC (permalink / raw)
To: linux-arm-kernel
On Friday 04 October 2013 11:59 AM, Will Deacon wrote:
> On Thu, Oct 03, 2013 at 10:17:59PM +0100, Santosh Shilimkar wrote:
>> This patch adds a step in the init sequence, in order to recreate
>> the kernel code/data page table mappings prior to full paging
>> initialization. This is necessary on LPAE systems that run out of
>> a physical address space outside the 4G limit. On these systems,
>> this implementation provides a machine descriptor hook that allows
>> the PHYS_OFFSET to be overridden in a machine specific fashion.
>
> [...]
>
>> diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
>> index b1d17ee..47c7497 100644
>> --- a/arch/arm/mm/mmu.c
>> +++ b/arch/arm/mm/mmu.c
>> @@ -28,6 +28,7 @@
>> #include <asm/highmem.h>
>> #include <asm/system_info.h>
>> #include <asm/traps.h>
>> +#include <asm/procinfo.h>
>>
>> #include <asm/mach/arch.h>
>> #include <asm/mach/map.h>
>> @@ -1315,6 +1316,87 @@ static void __init map_lowmem(void)
>> }
>> }
>>
>> +#ifdef CONFIG_ARM_LPAE
>> +extern void fixup_pv_table(const void *, unsigned long);
>> +extern const void *__pv_table_begin, *__pv_table_end;
>> +
>> +/*
>> + * early_paging_init() recreates boot time page table setup, allowing machines
>> + * to switch over to a high (>4G) address space on LPAE systems
>> + */
>> +void __init early_paging_init(const struct machine_desc *mdesc,
>> + struct proc_info_list *procinfo)
>> +{
>> + pmdval_t pmdprot = procinfo->__cpu_mm_mmu_flags;
>> + unsigned long map_start, map_end;
>> + pgd_t *pgd0, *pgdk;
>> + pud_t *pud0, *pudk;
>> + pmd_t *pmd0, *pmdk;
>> + phys_addr_t phys;
>> + int i;
>> +
>> + /* remap kernel code and data */
>> + map_start = init_mm.start_code;
>> + map_end = init_mm.brk;
>> +
>> + /* get a handle on things... */
>> + pgd0 = pgd_offset_k(0);
>> + pud0 = pud_offset(pgd0, 0);
>> + pmd0 = pmd_offset(pud0, 0);
>> +
>> + pgdk = pgd_offset_k(map_start);
>> + pudk = pud_offset(pgdk, map_start);
>> + pmdk = pmd_offset(pudk, map_start);
>> +
>> + phys = PHYS_OFFSET;
>> +
>> + if (mdesc->init_meminfo) {
>> + mdesc->init_meminfo();
>> + /* Run the patch stub to update the constants */
>> + fixup_pv_table(&__pv_table_begin,
>> + (&__pv_table_end - &__pv_table_begin) << 2);
>> +
>> + /*
>> + * Cache cleaning operations for self-modifying code
>> + * We should clean the entries by MVA but running a
>> + * for loop over every pv_table entry pointer would
>> + * just complicate the code.
>> + */
>> + flush_cache_louis();
>> + dsb();
>> + isb();
>
> You don't need either of these barriers.
>
Agreed. Just to be clear, it's because they are already present
in flush_cache_louis(), right?
>> + }
>> +
>> + /* remap level 1 table */
>> + for (i = 0; i < PTRS_PER_PGD; i++) {
>> + *pud0++ = __pud(__pa(pmd0) | PMD_TYPE_TABLE | L_PGD_SWAPPER);
>> + pmd0 += PTRS_PER_PMD;
>> + }
>> +
>> + /* remap pmds for kernel mapping */
>> + phys = __pa(map_start) & PMD_MASK;
>> + do {
>> + *pmdk++ = __pmd(phys | pmdprot);
>> + phys += PMD_SIZE;
>> + } while (phys < map_end);
>> +
>> + flush_cache_all();
>
> Why are you being so heavyweight with your cacheflushing? If you're just
> interested in flushing the new page tables, then use the proper accessors to
> build them. The only case I think you need to flush the world is for VIVT,
> which you won't have with LPAE (you could have a BUG_ON here).
>
It was mainly to avoid all the looping MVA-based stuff, but you have a valid point.
I shall look at it and see what can be done.
>> + cpu_set_ttbr(0, __pa(pgd0));
>> + cpu_set_ttbr(1, __pa(pgd0) + TTBR1_OFFSET);
>
> Can you not use cpu_switch_mm with the init_mm for this?
>
Probably yes. Will have a look at it.
Regards,
Santosh
* [PATCH v3 4/6] ARM: mm: Correct virt_to_phys patching for 64 bit physical addresses
2013-10-04 13:02 ` Nicolas Pitre
@ 2013-10-07 19:25 ` Santosh Shilimkar
2013-10-07 19:42 ` Nicolas Pitre
0 siblings, 1 reply; 28+ messages in thread
From: Santosh Shilimkar @ 2013-10-07 19:25 UTC (permalink / raw)
To: linux-arm-kernel
On Friday 04 October 2013 09:02 AM, Nicolas Pitre wrote:
> On Fri, 4 Oct 2013, Sricharan R wrote:
>
>> Hi,
>> On Friday 04 October 2013 05:47 AM, Nicolas Pitre wrote:
>>> On Thu, 3 Oct 2013, Santosh Shilimkar wrote:
>>>
>>>> From: Sricharan R <r.sricharan@ti.com>
[..]
>>>> diff --git a/arch/arm/kernel/patch.c b/arch/arm/kernel/patch.c
>>>> index 07314af..8356312 100644
>>>> --- a/arch/arm/kernel/patch.c
>>>> +++ b/arch/arm/kernel/patch.c
>>>> @@ -8,6 +8,9 @@
>>>>
>>>> #include "patch.h"
>>>>
>>>> +u64 __pv_phys_offset __attribute__((section(".data")));
>>>> +u64 __pv_offset __attribute__((section(".data")));
>>> Please add a comment explaining why you force those variables out of the
>>> .bss section. This is unlikely to be obvious to people.
>>>
>>> In fact, is there a reason why you moved those out of head.S? You only
>>> needed to replace the .long with .quad to match the u64 type.
>>>
>>> I think I might have suggested moving them out if they were to be typed
>>> with phys_addr_t, but using a fixed u64 is simpler.
>> Yes, I moved it here after your comments :-). Since it is always u64,
>> I can move it to head.S with .quad as well.
>
> The reason behind my suggestion was to use phys_addr_t for those
> variables, which is easier in C code given that phys_addr_t can be 32 or
> 64 bits. But that makes the assembly more complicated. With a fixed
> type it is not required to move them.
>
> Once this is done, you can add:
>
> Reviewed-by: Nicolas Pitre <nico@linaro.org>
>
Updated patch below, with your review tag, for the records.
Regards,
Santosh
-->
>From 3a018dad5ac051436a0cb4951a9325047c5c152a Mon Sep 17 00:00:00 2001
From: Sricharan R <r.sricharan@ti.com>
Date: Mon, 29 Jul 2013 20:26:22 +0530
Subject: [PATCH v3 4/5] ARM: mm: Correct virt_to_phys patching for 64 bit
physical addresses
The current phys_to_virt patching mechanism works only for 32 bit
physical addresses and this patch extends the idea for 64bit physical
addresses.
The 64bit v2p patching mechanism patches the higher 8 bits of the physical
address with a constant using a 'mov' instruction, and the lower 32 bits are
patched using 'add'. While this is correct, on those platforms where the lowmem
addressable physical memory spans across the 4GB boundary, a carry bit can be
produced as a result of the addition of the lower 32 bits. This has to be taken
into account and added into the upper 32 bits. The patched __pv_offset and va are
added in the lower 32 bits, where __pv_offset can be in two's complement form when
PA_START < VA_START, and that can result in a false carry bit.
e.g
1) PA = 0x80000000; VA = 0xC0000000
__pv_offset = PA - VA = 0xC0000000 (2's complement)
2) PA = 0x2 80000000; VA = 0xC0000000
__pv_offset = PA - VA = 0x1 C0000000
So adding __pv_offset + VA should never result in a true overflow for (1).
So in order to differentiate a true carry from a false one, __pv_offset is
extended to 64 bits and the upper 32 bits will have 0xffffffff if __pv_offset
is in 2's complement form. So 'mvn #0' is inserted instead of 'mov' while
patching for the same reason. Since the mov, add and sub instructions are to
be patched with different constants inside the same stub, the rotation field
of the opcode is used to differentiate between them.
So the above examples for v2p translation becomes for VA=0xC0000000,
1) PA[63:32] = 0xffffffff
PA[31:0] = VA + 0xC0000000 --> results in a carry
PA[63:32] = PA[63:32] + carry
PA[63:0] = 0x0 80000000
2) PA[63:32] = 0x1
PA[31:0] = VA + 0xC0000000 --> results in a carry
PA[63:32] = PA[63:32] + carry
PA[63:0] = 0x2 80000000
The above ideas were suggested by Nicolas Pitre <nico@linaro.org> as
part of the review of first and second versions of the subject patch.
There is no corresponding change on the phys_to_virt() side, because
computations on the upper 32-bits would be discarded anyway.
Cc: Russell King <linux@arm.linux.org.uk>
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Sricharan R <r.sricharan@ti.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
---
arch/arm/include/asm/memory.h | 35 +++++++++++++++++++++--
arch/arm/kernel/armksyms.c | 1 +
arch/arm/kernel/head.S | 61 ++++++++++++++++++++++++++++++-----------
3 files changed, 78 insertions(+), 19 deletions(-)
diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
index d9b96c65..942ad84 100644
--- a/arch/arm/include/asm/memory.h
+++ b/arch/arm/include/asm/memory.h
@@ -172,9 +172,12 @@
* so that all we need to do is modify the 8-bit constant field.
*/
#define __PV_BITS_31_24 0x81000000
+#define __PV_BITS_7_0 0x81
extern phys_addr_t (*arch_virt_to_idmap) (unsigned long x);
-extern unsigned long __pv_phys_offset;
+extern u64 __pv_phys_offset;
+extern u64 __pv_offset;
+
#define PHYS_OFFSET __pv_phys_offset
#define __pv_stub(from,to,instr,type) \
@@ -186,10 +189,36 @@ extern unsigned long __pv_phys_offset;
: "=r" (to) \
: "r" (from), "I" (type))
+#define __pv_stub_mov_hi(t) \
+ __asm__ volatile("@ __pv_stub_mov\n" \
+ "1: mov %R0, %1\n" \
+ " .pushsection .pv_table,\"a\"\n" \
+ " .long 1b\n" \
+ " .popsection\n" \
+ : "=r" (t) \
+ : "I" (__PV_BITS_7_0))
+
+#define __pv_add_carry_stub(x, y) \
+ __asm__ volatile("@ __pv_add_carry_stub\n" \
+ "1: adds %Q0, %1, %2\n" \
+ " adc %R0, %R0, #0\n" \
+ " .pushsection .pv_table,\"a\"\n" \
+ " .long 1b\n" \
+ " .popsection\n" \
+ : "+r" (y) \
+ : "r" (x), "I" (__PV_BITS_31_24) \
+ : "cc")
+
static inline phys_addr_t __virt_to_phys(unsigned long x)
{
- unsigned long t;
- __pv_stub(x, t, "add", __PV_BITS_31_24);
+ phys_addr_t t;
+
+ if (sizeof(phys_addr_t) == 4) {
+ __pv_stub(x, t, "add", __PV_BITS_31_24);
+ } else {
+ __pv_stub_mov_hi(t);
+ __pv_add_carry_stub(x, t);
+ }
return t;
}
diff --git a/arch/arm/kernel/armksyms.c b/arch/arm/kernel/armksyms.c
index 60d3b73..1f031dd 100644
--- a/arch/arm/kernel/armksyms.c
+++ b/arch/arm/kernel/armksyms.c
@@ -155,4 +155,5 @@ EXPORT_SYMBOL(__gnu_mcount_nc);
#ifdef CONFIG_ARM_PATCH_PHYS_VIRT
EXPORT_SYMBOL(__pv_phys_offset);
+EXPORT_SYMBOL(__pv_offset);
#endif
diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index 2c7cc1e..69eaf84 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -536,6 +536,14 @@ ENTRY(fixup_smp)
ldmfd sp!, {r4 - r6, pc}
ENDPROC(fixup_smp)
+#ifdef __ARMEB__
+#define LOW_OFFSET 0x4
+#define HIGH_OFFSET 0x0
+#else
+#define LOW_OFFSET 0x0
+#define HIGH_OFFSET 0x4
+#endif
+
#ifdef CONFIG_ARM_PATCH_PHYS_VIRT
/* __fixup_pv_table - patch the stub instructions with the delta between
@@ -546,17 +554,20 @@ ENDPROC(fixup_smp)
__HEAD
__fixup_pv_table:
adr r0, 1f
- ldmia r0, {r3-r5, r7}
- sub r3, r0, r3 @ PHYS_OFFSET - PAGE_OFFSET
+ ldmia r0, {r3-r7}
+ mvn ip, #0
+ subs r3, r0, r3 @ PHYS_OFFSET - PAGE_OFFSET
add r4, r4, r3 @ adjust table start address
add r5, r5, r3 @ adjust table end address
- add r7, r7, r3 @ adjust __pv_phys_offset address
- str r8, [r7] @ save computed PHYS_OFFSET to __pv_phys_offset
+ add r6, r6, r3 @ adjust __pv_phys_offset address
+ add r7, r7, r3 @ adjust __pv_offset address
+ str r8, [r6, #LOW_OFFSET] @ save computed PHYS_OFFSET to __pv_phys_offset
+ strcc ip, [r7, #HIGH_OFFSET] @ save to __pv_offset high bits
mov r6, r3, lsr #24 @ constant for add/sub instructions
teq r3, r6, lsl #24 @ must be 16MiB aligned
THUMB( it ne @ cross section branch )
bne __error
- str r6, [r7, #4] @ save to __pv_offset
+ str r3, [r7, #LOW_OFFSET] @ save to __pv_offset low bits
b __fixup_a_pv_table
ENDPROC(__fixup_pv_table)
@@ -565,9 +576,18 @@ ENDPROC(__fixup_pv_table)
.long __pv_table_begin
.long __pv_table_end
2: .long __pv_phys_offset
+ .long __pv_offset
.text
__fixup_a_pv_table:
+ adr r0, 3f
+ ldr r6, [r0]
+ add r6, r6, r3
+ ldr r0, [r6, #HIGH_OFFSET] @ pv_offset high word
+ ldr r6, [r6, #LOW_OFFSET] @ pv_offset low word
+ mov r6, r6, lsr #24
+ cmn r0, #1
+ moveq r0, #0x400000 @ set bit 22, mov to mvn instruction
#ifdef CONFIG_THUMB2_KERNEL
lsls r6, #24
beq 2f
@@ -582,9 +602,15 @@ __fixup_a_pv_table:
b 2f
1: add r7, r3
ldrh ip, [r7, #2]
- and ip, 0x8f00
- orr ip, r6 @ mask in offset bits 31-24
+ tst ip, #0x4000
+ and ip, #0x8f00
+ orrne ip, r6 @ mask in offset bits 31-24
+ orreq ip, r0 @ mask in offset bits 7-0
strh ip, [r7, #2]
+ ldrheq ip, [r7]
+ biceq ip, #0x20
+ orreq ip, ip, r0, lsr #16
+ strheq ip, [r7]
2: cmp r4, r5
ldrcc r7, [r4], #4 @ use branch for delay slot
bcc 1b
@@ -593,7 +619,10 @@ __fixup_a_pv_table:
b 2f
1: ldr ip, [r7, r3]
bic ip, ip, #0x000000ff
- orr ip, ip, r6 @ mask in offset bits 31-24
+ tst ip, #0xf00 @ check the rotation field
+ orrne ip, ip, r6 @ mask in offset bits 31-24
+ biceq ip, ip, #0x400000 @ clear bit 22
+ orreq ip, ip, r0 @ mask in offset bits 7-0
str ip, [r7, r3]
2: cmp r4, r5
ldrcc r7, [r4], #4 @ use branch for delay slot
@@ -602,28 +631,28 @@ __fixup_a_pv_table:
#endif
ENDPROC(__fixup_a_pv_table)
+3: .long __pv_offset
+
ENTRY(fixup_pv_table)
stmfd sp!, {r4 - r7, lr}
- ldr r2, 2f @ get address of __pv_phys_offset
mov r3, #0 @ no offset
mov r4, r0 @ r0 = table start
add r5, r0, r1 @ r1 = table size
- ldr r6, [r2, #4] @ get __pv_offset
bl __fixup_a_pv_table
ldmfd sp!, {r4 - r7, pc}
ENDPROC(fixup_pv_table)
- .align
-2: .long __pv_phys_offset
-
.data
.globl __pv_phys_offset
.type __pv_phys_offset, %object
__pv_phys_offset:
- .long 0
- .size __pv_phys_offset, . - __pv_phys_offset
+ .quad 0
+
+ .data
+ .globl __pv_offset
+ .type __pv_offset, %object
__pv_offset:
- .long 0
+ .quad 0
#endif
#include "head-common.S"
--
1.7.9.5
* [PATCH v3 5/6] ARM: mm: Recreate kernel mappings in early_paging_init()
2013-10-04 16:12 ` Santosh Shilimkar
@ 2013-10-07 19:34 ` Santosh Shilimkar
2013-10-08 10:26 ` Will Deacon
0 siblings, 1 reply; 28+ messages in thread
From: Santosh Shilimkar @ 2013-10-07 19:34 UTC (permalink / raw)
To: linux-arm-kernel
Will,
On Friday 04 October 2013 12:12 PM, Santosh Shilimkar wrote:
> On Friday 04 October 2013 11:59 AM, Will Deacon wrote:
>> On Thu, Oct 03, 2013 at 10:17:59PM +0100, Santosh Shilimkar wrote:
>>> This patch adds a step in the init sequence, in order to recreate
>>> the kernel code/data page table mappings prior to full paging
>>> initialization. This is necessary on LPAE systems that run out of
>>> a physical address space outside the 4G limit. On these systems,
>>> this implementation provides a machine descriptor hook that allows
>>> the PHYS_OFFSET to be overridden in a machine specific fashion.
>>
>> [...]
>>
>>> diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
>>> index b1d17ee..47c7497 100644
>>> --- a/arch/arm/mm/mmu.c
>>> +++ b/arch/arm/mm/mmu.c
[..]
>>> @@ -1315,6 +1316,87 @@ static void __init map_lowmem(void)
>>> }
>>> }
>>>
>>> +#ifdef CONFIG_ARM_LPAE
>>> +extern void fixup_pv_table(const void *, unsigned long);
>>> +extern const void *__pv_table_begin, *__pv_table_end;
>>> +
>>> +/*
>>> + * early_paging_init() recreates boot time page table setup, allowing machines
>>> + * to switch over to a high (>4G) address space on LPAE systems
>>> + */
>>> +void __init early_paging_init(const struct machine_desc *mdesc,
>>> + struct proc_info_list *procinfo)
>>> +{
>>> + pmdval_t pmdprot = procinfo->__cpu_mm_mmu_flags;
>>> + unsigned long map_start, map_end;
>>> + pgd_t *pgd0, *pgdk;
>>> + pud_t *pud0, *pudk;
>>> + pmd_t *pmd0, *pmdk;
>>> + phys_addr_t phys;
>>> + int i;
>>> +
>>> + /* remap kernel code and data */
>>> + map_start = init_mm.start_code;
>>> + map_end = init_mm.brk;
>>> +
>>> + /* get a handle on things... */
>>> + pgd0 = pgd_offset_k(0);
>>> + pud0 = pud_offset(pgd0, 0);
>>> + pmd0 = pmd_offset(pud0, 0);
>>> +
>>> + pgdk = pgd_offset_k(map_start);
>>> + pudk = pud_offset(pgdk, map_start);
>>> + pmdk = pmd_offset(pudk, map_start);
>>> +
>>> + phys = PHYS_OFFSET;
>>> +
>>> + if (mdesc->init_meminfo) {
>>> + mdesc->init_meminfo();
>>> + /* Run the patch stub to update the constants */
>>> + fixup_pv_table(&__pv_table_begin,
>>> + (&__pv_table_end - &__pv_table_begin) << 2);
>>> +
>>> + /*
>>> + * Cache cleaning operations for self-modifying code
>>> + * We should clean the entries by MVA but running a
>>> + * for loop over every pv_table entry pointer would
>>> + * just complicate the code.
>>> + */
>>> + flush_cache_louis();
>>> + dsb();
>>> + isb();
>>
>> You don't need either of these barriers.
>>
> Agreed. Just to be clear, it's because they are already present
> in flush_cache_louis(), right?
>
The updated patch at the end of the email addresses your comments. Regarding
the above barriers, we dropped the dsb() but I had to retain the isb()
to commit the I-cache/BTB invalidate ops which are issued as part of
flush_cache_louis(). Off-list I was discussing with Russell whether to patch
cache-v7.S to add an isb to flush_cache_louis(), but looking at other
usages we thought of leaving the isb() in my patch itself. Without the
isb(), we see corruption on the next v2p conversion.
Let me know if you have any other concerns. I plan to prepare a branch for
RMK to pull in for the upcoming merge window.
Regards,
Santosh
>From 6c4be7594a9556d1d79503c17d0ce629abec17d7 Mon Sep 17 00:00:00 2001
From: Santosh Shilimkar <santosh.shilimkar@ti.com>
Date: Wed, 31 Jul 2013 12:44:46 -0400
Subject: [PATCH v3 5/5] ARM: mm: Recreate kernel mappings in
early_paging_init()
This patch adds a step in the init sequence, in order to recreate
the kernel code/data page table mappings prior to full paging
initialization. This is necessary on LPAE systems that run out of
a physical address space outside the 4G limit. On these systems,
this implementation provides a machine descriptor hook that allows
the PHYS_OFFSET to be overridden in a machine specific fashion.
Cc: Russell King <linux@arm.linux.org.uk>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: R Sricharan <r.sricharan@ti.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
---
arch/arm/include/asm/mach/arch.h | 1 +
arch/arm/kernel/setup.c | 3 ++
arch/arm/mm/mmu.c | 89 ++++++++++++++++++++++++++++++++++++++
3 files changed, 93 insertions(+)
diff --git a/arch/arm/include/asm/mach/arch.h b/arch/arm/include/asm/mach/arch.h
index 402a2bc..17a3fa2 100644
--- a/arch/arm/include/asm/mach/arch.h
+++ b/arch/arm/include/asm/mach/arch.h
@@ -49,6 +49,7 @@ struct machine_desc {
bool (*smp_init)(void);
void (*fixup)(struct tag *, char **,
struct meminfo *);
+ void (*init_meminfo)(void);
void (*reserve)(void);/* reserve mem blocks */
void (*map_io)(void);/* IO mapping function */
void (*init_early)(void);
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index 0e1e2b3..b9a6dac 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -73,6 +73,7 @@ __setup("fpe=", fpe_setup);
#endif
extern void paging_init(const struct machine_desc *desc);
+extern void early_paging_init(const struct machine_desc *, struct proc_info_list *);
extern void sanity_check_meminfo(void);
extern enum reboot_mode reboot_mode;
extern void setup_dma_zone(const struct machine_desc *desc);
@@ -878,6 +879,8 @@ void __init setup_arch(char **cmdline_p)
parse_early_param();
sort(&meminfo.bank, meminfo.nr_banks, sizeof(meminfo.bank[0]), meminfo_cmp, NULL);
+
+ early_paging_init(mdesc, lookup_processor_type(read_cpuid_id()));
sanity_check_meminfo();
arm_memblock_init(&meminfo, mdesc);
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index b1d17ee..0751b46 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -28,6 +28,7 @@
#include <asm/highmem.h>
#include <asm/system_info.h>
#include <asm/traps.h>
+#include <asm/procinfo.h>
#include <asm/mach/arch.h>
#include <asm/mach/map.h>
@@ -1315,6 +1316,94 @@ static void __init map_lowmem(void)
}
}
+#ifdef CONFIG_ARM_LPAE
+extern void fixup_pv_table(const void *, unsigned long);
+extern const void *__pv_table_begin, *__pv_table_end;
+
+/*
+ * early_paging_init() recreates boot time page table setup, allowing machines
+ * to switch over to a high (>4G) address space on LPAE systems
+ */
+void __init early_paging_init(const struct machine_desc *mdesc,
+ struct proc_info_list *procinfo)
+{
+ pmdval_t pmdprot = procinfo->__cpu_mm_mmu_flags;
+ unsigned long map_start, map_end;
+ pgd_t *pgd0, *pgdk;
+ pud_t *pud0, *pudk, *pud_start;
+ pmd_t *pmd0, *pmdk, *pmd_start;
+ phys_addr_t phys;
+ int i;
+
+ /* remap kernel code and data */
+ map_start = init_mm.start_code;
+ map_end = init_mm.brk;
+
+ /* get a handle on things... */
+ pgd0 = pgd_offset_k(0);
+ pud_start = pud0 = pud_offset(pgd0, 0);
+ pmd0 = pmd_offset(pud0, 0);
+
+ pgdk = pgd_offset_k(map_start);
+ pudk = pud_offset(pgdk, map_start);
+ pmd_start = pmdk = pmd_offset(pudk, map_start);
+
+ phys = PHYS_OFFSET;
+
+ if (mdesc->init_meminfo) {
+ mdesc->init_meminfo();
+ /* Run the patch stub to update the constants */
+ fixup_pv_table(&__pv_table_begin,
+ (&__pv_table_end - &__pv_table_begin) << 2);
+
+ /*
+ * Cache cleaning operations for self-modifying code
+ * We should clean the entries by MVA but running a
+ * for loop over every pv_table entry pointer would
+ * just complicate the code. isb() is added to commit
+ * all the prior cp15 operations.
+ */
+ flush_cache_louis();
+ isb();
+ }
+
+ /* remap level 1 table */
+ for (i = 0; i < PTRS_PER_PGD; i++) {
+ *pud0++ = __pud(__pa(pmd0) | PMD_TYPE_TABLE | L_PGD_SWAPPER);
+ pmd0 += PTRS_PER_PMD;
+ }
+
+ __cpuc_flush_dcache_area(pud_start, sizeof(pud_start) * PTRS_PER_PGD);
+ outer_clean_range(virt_to_phys(pud_start), sizeof(pud_start) * PTRS_PER_PGD);
+
+ /* remap pmds for kernel mapping */
+ phys = __pa(map_start) & PMD_MASK;
+ i = 0;
+ do {
+ *pmdk++ = __pmd(phys | pmdprot);
+ phys += PMD_SIZE;
+ i++;
+ } while (phys < map_end);
+
+ __cpuc_flush_dcache_area(pmd_start, sizeof(pmd_start) * i);
+ outer_clean_range(virt_to_phys(pmd_start), sizeof(pmd_start) * i);
+
+ cpu_switch_mm(pgd0, &init_mm);
+ cpu_set_ttbr(1, __pa(pgd0) + TTBR1_OFFSET);
+ local_flush_tlb_all();
+}
+
+#else
+
+void __init early_paging_init(const struct machine_desc *mdesc,
+ struct proc_info_list *procinfo)
+{
+ if (mdesc->init_meminfo)
+ mdesc->init_meminfo();
+}
+
+#endif
+
/*
* paging_init() sets up the page tables, initialises the zone memory
* maps, and sets up the zero page, bad page and bad page tables.
--
1.7.9.5
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH v3 4/6] ARM: mm: Correct virt_to_phys patching for 64 bit physical addresses
2013-10-07 19:25 ` Santosh Shilimkar
@ 2013-10-07 19:42 ` Nicolas Pitre
2013-10-08 11:43 ` Sricharan R
0 siblings, 1 reply; 28+ messages in thread
From: Nicolas Pitre @ 2013-10-07 19:42 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, 7 Oct 2013, Santosh Shilimkar wrote:
> Update patch below with your review tag for records.
>
> Regards,
> Santosh
Micronit:
> .data
> .globl __pv_phys_offset
> .type __pv_phys_offset, %object
> __pv_phys_offset:
> - .long 0
> - .size __pv_phys_offset, . - __pv_phys_offset
> + .quad 0
> +
> + .data
> + .globl __pv_offset
> + .type __pv_offset, %object
> __pv_offset:
> - .long 0
> + .quad 0
Please keep the .size statement for __pv_phys_offset, and adding one for
__pv_offset wouldn't hurt either. And the second .data is redundant.
Nicolas
* [PATCH v3 5/6] ARM: mm: Recreate kernel mappings in early_paging_init()
2013-10-07 19:34 ` Santosh Shilimkar
@ 2013-10-08 10:26 ` Will Deacon
2013-10-08 17:45 ` Santosh Shilimkar
0 siblings, 1 reply; 28+ messages in thread
From: Will Deacon @ 2013-10-08 10:26 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Oct 07, 2013 at 08:34:41PM +0100, Santosh Shilimkar wrote:
> Will,
Hi Santosh,
> On Friday 04 October 2013 12:12 PM, Santosh Shilimkar wrote:
> > On Friday 04 October 2013 11:59 AM, Will Deacon wrote:
> >> On Thu, Oct 03, 2013 at 10:17:59PM +0100, Santosh Shilimkar wrote:
> >>> + if (mdesc->init_meminfo) {
> >>> + mdesc->init_meminfo();
> >>> + /* Run the patch stub to update the constants */
> >>> + fixup_pv_table(&__pv_table_begin,
> >>> + (&__pv_table_end - &__pv_table_begin) << 2);
> >>> +
> >>> + /*
> >>> + * Cache cleaning operations for self-modifying code
> >>> + * We should clean the entries by MVA but running a
> >>> + * for loop over every pv_table entry pointer would
> >>> + * just complicate the code.
> >>> + */
> >>> + flush_cache_louis();
> >>> + dsb();
> >>> + isb();
> >>
> >> You don't need either of these barriers.
> >>
> > Agree. Just want to be clear, it's because they are already present
> > in flush_cache_louis(), right?
> >
> Updated patch end of the email which addresses your comments. Regarding
> above barriers, we dropped the dsb() but I have to retain the isb()
> to commit the I-cache/BTB invalidate ops which are issued as part of
> flush_cache_louis(). Off-list I was discussing whether to patch cache-v7.S
> to add an isb to flush_cache_louis() with Russell but looking at other
> usages we thought of leaving the isb() in my patch itself. Without the
> isb(), we see corruption on next v2p conversion.
Ok, further comments below.
> +void __init early_paging_init(const struct machine_desc *mdesc,
> + struct proc_info_list *procinfo)
> +{
> + pmdval_t pmdprot = procinfo->__cpu_mm_mmu_flags;
> + unsigned long map_start, map_end;
> + pgd_t *pgd0, *pgdk;
> + pud_t *pud0, *pudk, *pud_start;
> + pmd_t *pmd0, *pmdk, *pmd_start;
> + phys_addr_t phys;
> + int i;
> +
> + /* remap kernel code and data */
> + map_start = init_mm.start_code;
> + map_end = init_mm.brk;
> +
> + /* get a handle on things... */
> + pgd0 = pgd_offset_k(0);
> + pud_start = pud0 = pud_offset(pgd0, 0);
> + pmd0 = pmd_offset(pud0, 0);
> +
> + pgdk = pgd_offset_k(map_start);
> + pudk = pud_offset(pgdk, map_start);
> + pmd_start = pmdk = pmd_offset(pudk, map_start);
> +
> + phys = PHYS_OFFSET;
> +
> + if (mdesc->init_meminfo) {
> + mdesc->init_meminfo();
> + /* Run the patch stub to update the constants */
> + fixup_pv_table(&__pv_table_begin,
> + (&__pv_table_end - &__pv_table_begin) << 2);
> +
> + /*
> + * Cache cleaning operations for self-modifying code
> + * We should clean the entries by MVA but running a
> + * for loop over every pv_table entry pointer would
> + * just complicate the code. isb() is added to commit
> + * all the prior cp15 operations.
> + */
> + flush_cache_louis();
> + isb();
I see, you need the new __pv_tables to be visible for your page table
population below, right? In which case, I'm afraid I have to go back on my
original statement; you *do* need that dsb() prior to the isb() if you want
to ensure that the icache maintenance is complete and synchronised.
However, this really looks like an issue with the v7 cache flushing
routines. Why on Earth do they only guarantee completion on the D-side?
> + }
> +
> + /* remap level 1 table */
> + for (i = 0; i < PTRS_PER_PGD; i++) {
> + *pud0++ = __pud(__pa(pmd0) | PMD_TYPE_TABLE | L_PGD_SWAPPER);
> + pmd0 += PTRS_PER_PMD;
> + }
> +
> + __cpuc_flush_dcache_area(pud_start, sizeof(pud_start) * PTRS_PER_PGD);
> + outer_clean_range(virt_to_phys(pud_start), sizeof(pud_start) * PTRS_PER_PGD);
You don't need to flush these page tables if you're SMP. If you use
clean_dcache_area instead, it will do the right thing. Then again, why can't
you use pud_populate and pmd_populate for these two loops? Is there an
interaction with coherency here? (if so, why don't you need to flush the
entire cache hierarchy anyway?)
> + /* remap pmds for kernel mapping */
> + phys = __pa(map_start) & PMD_MASK;
> + i = 0;
> + do {
> + *pmdk++ = __pmd(phys | pmdprot);
> + phys += PMD_SIZE;
> + i++;
> + } while (phys < map_end);
> +
> + __cpuc_flush_dcache_area(pmd_start, sizeof(pmd_start) * i);
> + outer_clean_range(virt_to_phys(pmd_start), sizeof(pmd_start) * i);
> +
> + cpu_switch_mm(pgd0, &init_mm);
> + cpu_set_ttbr(1, __pa(pgd0) + TTBR1_OFFSET);
I think you should have a local_flush_bp_all here.
> + local_flush_tlb_all();
Will
* [PATCH v3 4/6] ARM: mm: Correct virt_to_phys patching for 64 bit physical addresses
2013-10-07 19:42 ` Nicolas Pitre
@ 2013-10-08 11:43 ` Sricharan R
0 siblings, 0 replies; 28+ messages in thread
From: Sricharan R @ 2013-10-08 11:43 UTC (permalink / raw)
To: linux-arm-kernel
On Tuesday 08 October 2013 01:12 AM, Nicolas Pitre wrote:
> On Mon, 7 Oct 2013, Santosh Shilimkar wrote:
>
>> Update patch below with your review tag for records.
>>
>> Regards,
>> Santosh
> Micronit:
>
>> .data
>> .globl __pv_phys_offset
>> .type __pv_phys_offset, %object
>> __pv_phys_offset:
>> - .long 0
>> - .size __pv_phys_offset, . - __pv_phys_offset
>> + .quad 0
>> +
>> + .data
>> + .globl __pv_offset
>> + .type __pv_offset, %object
>> __pv_offset:
>> - .long 0
>> + .quad 0
> Please keep the .size statement for __pv_phys_offset, and adding one for
> __pv_offset wouldn't hurt either. And the second .data is redundant.
>
Ok, will update this.
Regards,
Sricharan
* [PATCH v3 5/6] ARM: mm: Recreate kernel mappings in early_paging_init()
2013-10-08 10:26 ` Will Deacon
@ 2013-10-08 17:45 ` Santosh Shilimkar
2013-10-09 10:06 ` Will Deacon
0 siblings, 1 reply; 28+ messages in thread
From: Santosh Shilimkar @ 2013-10-08 17:45 UTC (permalink / raw)
To: linux-arm-kernel
Hi Will,
On Tuesday 08 October 2013 06:26 AM, Will Deacon wrote:
> On Mon, Oct 07, 2013 at 08:34:41PM +0100, Santosh Shilimkar wrote:
>> Will,
>
> Hi Santosh,
>
[..]
>> +void __init early_paging_init(const struct machine_desc *mdesc,
>> + struct proc_info_list *procinfo)
>> +{
>> + pmdval_t pmdprot = procinfo->__cpu_mm_mmu_flags;
>> + unsigned long map_start, map_end;
>> + pgd_t *pgd0, *pgdk;
>> + pud_t *pud0, *pudk, *pud_start;
>> + pmd_t *pmd0, *pmdk, *pmd_start;
>> + phys_addr_t phys;
>> + int i;
>> +
>> + /* remap kernel code and data */
>> + map_start = init_mm.start_code;
>> + map_end = init_mm.brk;
>> +
>> + /* get a handle on things... */
>> + pgd0 = pgd_offset_k(0);
>> + pud_start = pud0 = pud_offset(pgd0, 0);
>> + pmd0 = pmd_offset(pud0, 0);
>> +
>> + pgdk = pgd_offset_k(map_start);
>> + pudk = pud_offset(pgdk, map_start);
>> + pmd_start = pmdk = pmd_offset(pudk, map_start);
>> +
>> + phys = PHYS_OFFSET;
>> +
>> + if (mdesc->init_meminfo) {
>> + mdesc->init_meminfo();
>> + /* Run the patch stub to update the constants */
>> + fixup_pv_table(&__pv_table_begin,
>> + (&__pv_table_end - &__pv_table_begin) << 2);
>> +
>> + /*
>> + * Cache cleaning operations for self-modifying code
>> + * We should clean the entries by MVA but running a
>> + * for loop over every pv_table entry pointer would
>> + * just complicate the code. isb() is added to commit
>> + * all the prior cp15 operations.
>> + */
>> + flush_cache_louis();
>> + isb();
>
> I see, you need the new __pv_tables to be visible for your page table
> population below, right? In which case, I'm afraid I have to go back on my
> original statement; you *do* need that dsb() prior to the isb() if you want
> to ensure that the icache maintenance is complete and synchronised.
>
The need for dsb and isb is what the ARM ARM says, but then I got a bit
biased after your reply.
> However, this really looks like an issue with the v7 cache flushing
> routines. Why on Earth do they only guarantee completion on the D-side?
>
Indeed.
>> + }
>> +
>> + /* remap level 1 table */
>> + for (i = 0; i < PTRS_PER_PGD; i++) {
>> + *pud0++ = __pud(__pa(pmd0) | PMD_TYPE_TABLE | L_PGD_SWAPPER);
>> + pmd0 += PTRS_PER_PMD;
>> + }
>> +
>> + __cpuc_flush_dcache_area(pud_start, sizeof(pud_start) * PTRS_PER_PGD);
>> + outer_clean_range(virt_to_phys(pud_start), sizeof(pud_start) * PTRS_PER_PGD);
>
> You don't need to flush these page tables if you're SMP. If you use
> > clean_dcache_area instead, it will do the right thing. Then again, why can't
> you use pud_populate and pmd_populate for these two loops? Is there an
> interaction with coherency here? (if so, why don't you need to flush the
> entire cache hierarchy anyway?)
>
You mean ARMv7 SMP PT walkers can read from the L1 cache and hence don't need
L1 flushing. While this could be true, for some reason we don't see the same
behavior, and without the flush we are seeing the issue.
Initially we were doing entire cache flush but moved to the mva based
routines on your suggestion.
Regarding the pud_populate(), since we needed L_PGD_SWAPPER, we couldn't
use that version but updated patch uses the set_pud() which takes the flag.
And pmd_populate() can't be used either because it creates pte based
tables which is not what we want.
So the current working patch as it stands is end of the email. Do let
us know if we are missing anything for the PTW L1 allocation behavior.
Regards,
Santosh
From 832ea2ba84ad8a012ec7d4dad4d8085cca2cd598 Mon Sep 17 00:00:00 2001
From: Santosh Shilimkar <santosh.shilimkar@ti.com>
Date: Wed, 31 Jul 2013 12:44:46 -0400
Subject: [PATCH v3 5/8] ARM: mm: Recreate kernel mappings in
early_paging_init()
This patch adds a step in the init sequence, in order to recreate
the kernel code/data page table mappings prior to full paging
initialization. This is necessary on LPAE systems that run out of
a physical address space outside the 4G limit. On these systems,
this implementation provides a machine descriptor hook that allows
the PHYS_OFFSET to be overridden in a machine specific fashion.
Based on Cyril's initial patch. The pv_table needs to be patched
again after switching to higher address space.
Cc: Nicolas Pitre <nico@linaro.org>
Cc: Russell King <linux@arm.linux.org.uk>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: R Sricharan <r.sricharan@ti.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
---
arch/arm/include/asm/mach/arch.h | 1 +
arch/arm/kernel/setup.c | 4 ++
arch/arm/mm/mmu.c | 91 ++++++++++++++++++++++++++++++++++++++
3 files changed, 96 insertions(+)
diff --git a/arch/arm/include/asm/mach/arch.h b/arch/arm/include/asm/mach/arch.h
index 402a2bc..17a3fa2 100644
--- a/arch/arm/include/asm/mach/arch.h
+++ b/arch/arm/include/asm/mach/arch.h
@@ -49,6 +49,7 @@ struct machine_desc {
bool (*smp_init)(void);
void (*fixup)(struct tag *, char **,
struct meminfo *);
+ void (*init_meminfo)(void);
void (*reserve)(void);/* reserve mem blocks */
void (*map_io)(void);/* IO mapping function */
void (*init_early)(void);
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index 0e1e2b3..af7b7db 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -73,6 +73,8 @@ __setup("fpe=", fpe_setup);
#endif
extern void paging_init(const struct machine_desc *desc);
+extern void early_paging_init(const struct machine_desc *,
+ struct proc_info_list *);
extern void sanity_check_meminfo(void);
extern enum reboot_mode reboot_mode;
extern void setup_dma_zone(const struct machine_desc *desc);
@@ -878,6 +880,8 @@ void __init setup_arch(char **cmdline_p)
parse_early_param();
sort(&meminfo.bank, meminfo.nr_banks, sizeof(meminfo.bank[0]), meminfo_cmp, NULL);
+
+ early_paging_init(mdesc, lookup_processor_type(read_cpuid_id()));
sanity_check_meminfo();
arm_memblock_init(&meminfo, mdesc);
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index b1d17ee..e9e5276 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -28,6 +28,7 @@
#include <asm/highmem.h>
#include <asm/system_info.h>
#include <asm/traps.h>
+#include <asm/procinfo.h>
#include <asm/mach/arch.h>
#include <asm/mach/map.h>
@@ -1315,6 +1316,96 @@ static void __init map_lowmem(void)
}
}
+#ifdef CONFIG_ARM_LPAE
+extern void fixup_pv_table(const void *, unsigned long);
+extern const void *__pv_table_begin, *__pv_table_end;
+
+/*
+ * early_paging_init() recreates boot time page table setup, allowing machines
+ * to switch over to a high (>4G) address space on LPAE systems
+ */
+void __init early_paging_init(const struct machine_desc *mdesc,
+ struct proc_info_list *procinfo)
+{
+ pmdval_t pmdprot = procinfo->__cpu_mm_mmu_flags;
+ unsigned long map_start, map_end;
+ pgd_t *pgd0, *pgdk;
+ pud_t *pud0, *pudk, *pud_start;
+ pmd_t *pmd0, *pmdk, *pmd_start;
+ phys_addr_t phys;
+ int i;
+
+ /* remap kernel code and data */
+ map_start = init_mm.start_code;
+ map_end = init_mm.brk;
+
+ /* get a handle on things... */
+ pgd0 = pgd_offset_k(0);
+ pud_start = pud0 = pud_offset(pgd0, 0);
+ pmd0 = pmd_offset(pud0, 0);
+
+ pgdk = pgd_offset_k(map_start);
+ pudk = pud_offset(pgdk, map_start);
+ pmd_start = pmdk = pmd_offset(pudk, map_start);
+
+ if (mdesc->init_meminfo) {
+ mdesc->init_meminfo();
+ /* Run the patch stub to update the constants */
+ fixup_pv_table(&__pv_table_begin,
+ (&__pv_table_end - &__pv_table_begin) << 2);
+
+ /*
+ * Cache cleaning operations for self-modifying code
+ * We should clean the entries by MVA but running a
+ * for loop over every pv_table entry pointer would
+ * just complicate the code.
+ */
+ flush_cache_louis();
+ dsb();
+ isb();
+ }
+
+ /* remap level 1 table */
+ for (i = 0; i < PTRS_PER_PGD; pud0++, i++) {
+ set_pud(pud0,
+ __pud(__pa(pmd0) | PMD_TYPE_TABLE | L_PGD_SWAPPER));
+ pmd0 += PTRS_PER_PMD;
+ }
+
+ __cpuc_flush_dcache_area(pud_start, sizeof(pud_start) * PTRS_PER_PGD);
+ outer_clean_range(virt_to_phys(pud_start),
+ sizeof(pud_start) * PTRS_PER_PGD);
+
+ /* remap pmds for kernel mapping */
+ phys = __pa(map_start) & PMD_MASK;
+ i = 0;
+ do {
+ *pmdk++ = __pmd(phys | pmdprot);
+ phys += PMD_SIZE;
+ i++;
+ } while (phys < map_end);
+
+ __cpuc_flush_dcache_area(pmd_start, sizeof(pmd_start) * i);
+ outer_clean_range(virt_to_phys(pmd_start),
+ sizeof(pmd_start) * i);
+
+ cpu_switch_mm(pgd0, &init_mm);
+ cpu_set_ttbr(1, __pa(pgd0) + TTBR1_OFFSET);
+ local_flush_bp_all();
+ local_flush_tlb_all();
+}
+
+#else
+
+void __init early_paging_init(const struct machine_desc *mdesc,
+ struct proc_info_list *procinfo)
+{
+ if (mdesc->init_meminfo)
+ mdesc->init_meminfo();
+}
+
+#endif
+
/*
* paging_init() sets up the page tables, initialises the zone memory
* maps, and sets up the zero page, bad page and bad page tables.
--
1.7.9.5
* [PATCH v3 5/6] ARM: mm: Recreate kernel mappings in early_paging_init()
2013-10-08 17:45 ` Santosh Shilimkar
@ 2013-10-09 10:06 ` Will Deacon
2013-10-09 18:51 ` Santosh Shilimkar
0 siblings, 1 reply; 28+ messages in thread
From: Will Deacon @ 2013-10-09 10:06 UTC (permalink / raw)
To: linux-arm-kernel
On Tue, Oct 08, 2013 at 06:45:33PM +0100, Santosh Shilimkar wrote:
> On Tuesday 08 October 2013 06:26 AM, Will Deacon wrote:
> > On Mon, Oct 07, 2013 at 08:34:41PM +0100, Santosh Shilimkar wrote:
> >> + /*
> >> + * Cache cleaning operations for self-modifying code
> >> + * We should clean the entries by MVA but running a
> >> + * for loop over every pv_table entry pointer would
> >> + * just complicate the code. isb() is added to commit
> >> + * all the prior cp15 operations.
> >> + */
> >> + flush_cache_louis();
> >> + isb();
> >
> > I see, you need the new __pv_tables to be visible for your page table
> > population below, right? In which case, I'm afraid I have to go back on my
> > original statement; you *do* need that dsb() prior to the isb() if you want
> > to ensure that the icache maintenance is complete and synchronised.
> >
> The need for dsb and isb is what the ARM ARM says, but then I got a bit
> biased after your reply.
Yeah, sorry about that. I didn't originally notice that you needed the I-cache
flushing before the __pa stuff below.
> >> + }
> >> +
> >> + /* remap level 1 table */
> >> + for (i = 0; i < PTRS_PER_PGD; i++) {
> >> + *pud0++ = __pud(__pa(pmd0) | PMD_TYPE_TABLE | L_PGD_SWAPPER);
> >> + pmd0 += PTRS_PER_PMD;
> >> + }
> >> +
> >> + __cpuc_flush_dcache_area(pud_start, sizeof(pud_start) * PTRS_PER_PGD);
> >> + outer_clean_range(virt_to_phys(pud_start), sizeof(pud_start) * PTRS_PER_PGD);
> >
> > You don't need to flush these page tables if you're SMP. If you use
> > clean_dcache_area instead, it will do the right thing. Then again, why can't
> > you use pud_populate and pmd_populate for these two loops? Is there an
> > interaction with coherency here? (if so, why don't you need to flush the
> > entire cache hierarchy anyway?)
> >
> You mean ARMv7 SMP PT walkers can read from the L1 cache and hence don't need
> L1 flushing. While this could be true, for some reason we don't see the same
> behavior, and without the flush we are seeing the issue.
I would really like to know why this isn't working for you. I have a feeling
that it's related to your interesting coherency issues on keystone. For
example, if the physical address put in the ttbr doesn't match the physical
address which is mapped to the kernel page tables, then we could get
physical aliasing in the caches.
> Initially we were doing entire cache flush but moved to the mva based
> routines on your suggestion.
If the issue is related to coherency and physical aliasing, I really think
you should just flush the entire cache hierarchy. It's difficult to identify
exactly what state needs to be carried over between the old and new
mappings, but I bet it's more than just page tables.
> Regarding the pud_populate(), since we needed L_PGD_SWAPPER, we couldn't
> use that version but updated patch uses the set_pud() which takes the flag.
> And pmd_populate() can't be used either because it creates pte based
> tables which is not what we want.
Ok. It certainly looks better than it did.
Will
* [PATCH v3 5/6] ARM: mm: Recreate kernel mappings in early_paging_init()
2013-10-09 10:06 ` Will Deacon
@ 2013-10-09 18:51 ` Santosh Shilimkar
0 siblings, 0 replies; 28+ messages in thread
From: Santosh Shilimkar @ 2013-10-09 18:51 UTC (permalink / raw)
To: linux-arm-kernel
On Wednesday 09 October 2013 06:06 AM, Will Deacon wrote:
> On Tue, Oct 08, 2013 at 06:45:33PM +0100, Santosh Shilimkar wrote:
>> On Tuesday 08 October 2013 06:26 AM, Will Deacon wrote:
>>> On Mon, Oct 07, 2013 at 08:34:41PM +0100, Santosh Shilimkar wrote:
>>>> + /*
>>>> + * Cache cleaning operations for self-modifying code
>>>> + * We should clean the entries by MVA but running a
>>>> + * for loop over every pv_table entry pointer would
>>>> + * just complicate the code. isb() is added to commit
>>>> + * all the prior cp15 operations.
>>>> + */
>>>> + flush_cache_louis();
>>>> + isb();
>>>
>>> I see, you need the new __pv_tables to be visible for your page table
>>> population below, right? In which case, I'm afraid I have to go back on my
>>> original statement; you *do* need that dsb() prior to the isb() if you want
>>> to ensure that the icache maintenance is complete and synchronised.
>>>
>> The need for dsb and isb is what the ARM ARM says, but then I got a bit
>> biased after your reply.
>
> Yeah, sorry about that. I didn't originally notice that you needed the I-cache
> flushing before the __pa stuff below.
>
No problem
>>>> + }
>>>> +
>>>> + /* remap level 1 table */
>>>> + for (i = 0; i < PTRS_PER_PGD; i++) {
>>>> + *pud0++ = __pud(__pa(pmd0) | PMD_TYPE_TABLE | L_PGD_SWAPPER);
>>>> + pmd0 += PTRS_PER_PMD;
>>>> + }
>>>> +
>>>> + __cpuc_flush_dcache_area(pud_start, sizeof(pud_start) * PTRS_PER_PGD);
>>>> + outer_clean_range(virt_to_phys(pud_start), sizeof(pud_start) * PTRS_PER_PGD);
>>>
>>> You don't need to flush these page tables if you're SMP. If you use
>>> clean_dcache_area instead, it will do the right thing. Then again, why can't
>>> you use pud_populate and pmd_populate for these two loops? Is there an
>>> interaction with coherency here? (if so, why don't you need to flush the
>>> entire cache hierarchy anyway?)
>>>
>> You mean ARMv7 SMP PT walkers can read from the L1 cache and hence don't need
>> L1 flushing. While this could be true, for some reason we don't see the same
>> behavior, and without the flush we are seeing the issue.
>
> I would really like to know why this isn't working for you. I have a feeling
> that it's related to your interesting coherency issues on keystone. For
> example, if the physical address put in the ttbr doesn't match the physical
> address which is mapped to the kernel page tables, then we could get
> physical aliasing in the caches.
>
It might be. we will keep debugging that.
>> Initially we were doing entire cache flush but moved to the mva based
>> routines on your suggestion.
>
> If the issue is related to coherency and physical aliasing, I really think
> you should just flush the entire cache hierarchy. It's difficult to identify
> exactly what state needs to be carried over between the old and new
> mappings, but I bet it's more than just page tables.
>
You are probably right. I will go back to the full flush to avoid any
corner case till we figure out the issue.
>> Regarding the pud_populate(), since we needed L_PGD_SWAPPER, we couldn't
>> use that version but updated patch uses the set_pud() which takes the flag.
>> And pmd_populate() can't be used either because it creates pte based
>> tables which is not what we want.
>
> Ok. It certainly looks better than it did.
>
Thanks a lot. I will refresh the patch with above update.
Regards,
Santosh
* [PATCH v3 6/6] ARM: mm: Change the order of TLB/cache maintenance operations.
2013-10-04 16:03 ` Santosh Shilimkar
@ 2013-10-09 18:56 ` Santosh Shilimkar
0 siblings, 0 replies; 28+ messages in thread
From: Santosh Shilimkar @ 2013-10-09 18:56 UTC (permalink / raw)
To: linux-arm-kernel
Will,
On Friday 04 October 2013 12:03 PM, Santosh Shilimkar wrote:
> On Friday 04 October 2013 11:52 AM, Will Deacon wrote:
>> On Thu, Oct 03, 2013 at 10:18:00PM +0100, Santosh Shilimkar wrote:
>>> From: Sricharan R <r.sricharan@ti.com>
>>>
>>> As per the arm ARMv7 manual, the sequence of TLB maintenance
>>> operations after making changes to the translation table is
>>> to clean the dcache first, then invalidate the TLB. With
>>> the current sequence we see cache corruption when the
>>> flush_cache_all is called after tlb_flush_all.
>>>
>>> STR rx, [Translation table entry]
>>> ; write new entry to the translation table
>>> Clean cache line [Translation table entry]
>>> DSB
>>> ; ensures visibility of the data cleaned from the D Cache
>>> Invalidate TLB entry by MVA (and ASID if non-global) [page address]
>>> Invalidate BTC
>>> DSB
>>> ; ensure completion of the Invalidate TLB operation
>>> ISB
>>> ; ensure table changes visible to instruction fetch
>>>
>>> The issue is seen only with LPAE + THUMB BUILT KERNEL + 64BIT patching,
>>> which is a little bit weird.
>>
>> NAK.
>>
>> I don't buy your reasoning. All current LPAE implementations also implement
>> the multi-processing extensions, meaning that the cache flush isn't required
>> to make the PTEs visible to the table walker. The dsb from the TLB_WB flag
>> is sufficient, so I think you still have some debugging to do as this change
>> is likely masking a problem elsewhere.
>>
>> On top of that, create_mapping does all the flushing you need (for the !SMP
>> case) when the tables are initialised, so this code doesn't need changing.
>>
> Fair enough. We will drop this patch from this series and continue to look
> at the issue further. As such the patch has no hard dependency with rest of
> the series.
>
Just to update the thread, Sricharan has now tracked down this issue and
the 64-bit patching code is fixed.
Thanks for NAK ;)
Regards,
Santosh
Thread overview: 28+ messages
2013-10-03 21:17 [PATCH v3 0/6] ARM: mm: Extend the runtime patch stub for PAE systems Santosh Shilimkar
2013-10-03 21:17 ` [PATCH v3 1/6] ARM: mm: use phys_addr_t appropriately in p2v and v2p conversions Santosh Shilimkar
2013-10-03 21:17 ` [PATCH v3 2/6] ARM: mm: Introduce virt_to_idmap() with an arch hook Santosh Shilimkar
2013-10-03 21:17 ` [PATCH v3 3/6] ARM: mm: Move the idmap print to appropriate place in the code Santosh Shilimkar
2013-10-03 21:17 ` [PATCH v3 4/6] ARM: mm: Correct virt_to_phys patching for 64 bit physical addresses Santosh Shilimkar
2013-10-04 0:17 ` Nicolas Pitre
2013-10-04 5:37 ` Sricharan R
2013-10-04 13:02 ` Nicolas Pitre
2013-10-07 19:25 ` Santosh Shilimkar
2013-10-07 19:42 ` Nicolas Pitre
2013-10-08 11:43 ` Sricharan R
2013-10-03 21:17 ` [PATCH v3 5/6] ARM: mm: Recreate kernel mappings in early_paging_init() Santosh Shilimkar
2013-10-04 0:23 ` Nicolas Pitre
2013-10-04 15:59 ` Will Deacon
2013-10-04 16:12 ` Santosh Shilimkar
2013-10-07 19:34 ` Santosh Shilimkar
2013-10-08 10:26 ` Will Deacon
2013-10-08 17:45 ` Santosh Shilimkar
2013-10-09 10:06 ` Will Deacon
2013-10-09 18:51 ` Santosh Shilimkar
2013-10-03 21:18 ` [PATCH v3 6/6] ARM: mm: Change the order of TLB/cache maintenance operations Santosh Shilimkar
2013-10-04 0:25 ` Nicolas Pitre
2013-10-04 8:46 ` Russell King - ARM Linux
2013-10-04 13:14 ` Nicolas Pitre
2013-10-04 13:19 ` Santosh Shilimkar
2013-10-04 15:52 ` Will Deacon
2013-10-04 16:03 ` Santosh Shilimkar
2013-10-09 18:56 ` Santosh Shilimkar