* [PATCH 0/8] ARM: mm: Extend the runtime patch stub for PAE systems
@ 2013-06-21 23:48 Santosh Shilimkar
  2013-06-21 23:48 ` [PATCH 1/8] ARM: mm: LPAE: use phys_addr_t appropriately in p2v and v2p conversions Santosh Shilimkar
                   ` (8 more replies)
  0 siblings, 9 replies; 22+ messages in thread
From: Santosh Shilimkar @ 2013-06-21 23:48 UTC (permalink / raw)
  To: linux-arm-kernel

Based on discussion/debate on Cyril's generic code patching framework,
we cooked up this series which basically tries to extend the existing
v2p runtime patching for LPAE machines which can have physical memory
beyond 4 GB. Keystone is one such ARM machine.

We think the 64 bit patching can still be made better than the proposed
patch in the series and hence we are seeking expert comments from RMK, Nico and
others. The last patch in the series is added to just give perspective on how
machine code will make use of the available bits from the series.

Santosh Shilimkar (6):
  ARM: mm: LPAE: use phys_addr_t appropriately in p2v and v2p
    conversions
  ARM: mm: Introduce virt_to_idmap() with an arch hook
  ARM: mm: Move the idmap print to appropriate place in the code
  ARM: mm: Pass the constant as an argument to fixup_pv_table()
  ARM: mm: Recreate kernel mappings in early_paging_init()
  ARM: keystone: Switch over to high physical address range

Sricharan R (2):
  ARM: mm: Add __pv_stub_mov to patch MOV instruction
  ARM: mm: LPAE: Correct virt_to_phys patching for 64 bit physical
    addresses

 arch/arm/include/asm/mach/arch.h  |    1 +
 arch/arm/include/asm/memory.h     |   72 ++++++++++++++++++++++++---
 arch/arm/kernel/armksyms.c        |    2 +
 arch/arm/kernel/head.S            |   39 +++++++++++++--
 arch/arm/kernel/module.c          |   11 ++++-
 arch/arm/kernel/setup.c           |    3 ++
 arch/arm/kernel/smp.c             |    2 +-
 arch/arm/kernel/vmlinux.lds.S     |    5 ++
 arch/arm/mach-keystone/keystone.c |   49 ++++++++++++++++++
 arch/arm/mach-keystone/memory.h   |   24 +++++++++
 arch/arm/mach-keystone/platsmp.c  |   16 +++++-
 arch/arm/mm/idmap.c               |    8 +--
 arch/arm/mm/mmu.c                 |   99 +++++++++++++++++++++++++++++++++++++
 13 files changed, 311 insertions(+), 20 deletions(-)
 create mode 100644 arch/arm/mach-keystone/memory.h

-- 
1.7.9.5


* [PATCH 1/8] ARM: mm: LPAE: use phys_addr_t appropriately in p2v and v2p conversions
  2013-06-21 23:48 [PATCH 0/8] ARM: mm: Extend the runtime patch stub for PAE systems Santosh Shilimkar
@ 2013-06-21 23:48 ` Santosh Shilimkar
  2013-07-22 15:03   ` Nicolas Pitre
  2013-06-21 23:48 ` [PATCH 2/8] ARM: mm: Introduce virt_to_idmap() with an arch hook Santosh Shilimkar
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 22+ messages in thread
From: Santosh Shilimkar @ 2013-06-21 23:48 UTC (permalink / raw)
  To: linux-arm-kernel

Fix the remaining types used when converting back and forth between
physical and virtual addresses.
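
To make the motivation concrete: with CONFIG_ARM_LPAE, phys_addr_t is 64 bits
wide while unsigned long stays 32 bits, so a physical address above 4GB does
not survive a round trip through the old prototypes. A minimal user-space
sketch of the truncation, assuming a Keystone-style address above the 4GB mark:

#include <stdio.h>
#include <stdint.h>

typedef uint64_t phys_addr_t;	/* what CONFIG_ARM_LPAE effectively provides */

int main(void)
{
	phys_addr_t phys = 0x800000000ULL;	/* RAM above the 32-bit limit */
	uint32_t as_ulong = (uint32_t)phys;	/* what a 32-bit unsigned long keeps */

	printf("phys_addr_t: 0x%llx, truncated: 0x%x\n",
	       (unsigned long long)phys, as_ulong);
	return 0;
}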

Cc: Nicolas Pitre <nico@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>

Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
---
 arch/arm/include/asm/memory.h |   22 ++++++++++++++++------
 1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
index 584786f..93376e4 100644
--- a/arch/arm/include/asm/memory.h
+++ b/arch/arm/include/asm/memory.h
@@ -185,22 +185,32 @@ extern unsigned long __pv_phys_offset;
 	: "=r" (to)					\
 	: "r" (from), "I" (type))
 
-static inline unsigned long __virt_to_phys(unsigned long x)
+static inline phys_addr_t __virt_to_phys(unsigned long x)
 {
 	unsigned long t;
 	__pv_stub(x, t, "add", __PV_BITS_31_24);
 	return t;
 }
 
-static inline unsigned long __phys_to_virt(unsigned long x)
+static inline unsigned long __phys_to_virt(phys_addr_t x)
 {
 	unsigned long t;
 	__pv_stub(x, t, "sub", __PV_BITS_31_24);
 	return t;
 }
+
 #else
-#define __virt_to_phys(x)	((x) - PAGE_OFFSET + PHYS_OFFSET)
-#define __phys_to_virt(x)	((x) - PHYS_OFFSET + PAGE_OFFSET)
+
+static inline phys_addr_t __virt_to_phys(unsigned long x)
+{
+	return (phys_addr_t)x - PAGE_OFFSET + PHYS_OFFSET;
+}
+
+static inline unsigned long __phys_to_virt(phys_addr_t x)
+{
+	return x - PHYS_OFFSET + PAGE_OFFSET;
+}
+
 #endif
 #endif
 #endif /* __ASSEMBLY__ */
@@ -238,14 +248,14 @@ static inline phys_addr_t virt_to_phys(const volatile void *x)
 
 static inline void *phys_to_virt(phys_addr_t x)
 {
-	return (void *)(__phys_to_virt((unsigned long)(x)));
+	return (void *)__phys_to_virt(x);
 }
 
 /*
  * Drivers should NOT use these either.
  */
 #define __pa(x)			__virt_to_phys((unsigned long)(x))
-#define __va(x)			((void *)__phys_to_virt((unsigned long)(x)))
+#define __va(x)			((void *)__phys_to_virt((phys_addr_t)(x)))
 #define pfn_to_kaddr(pfn)	__va((pfn) << PAGE_SHIFT)
 
 /*
-- 
1.7.9.5


* [PATCH 2/8] ARM: mm: Introduce virt_to_idmap() with an arch hook
  2013-06-21 23:48 [PATCH 0/8] ARM: mm: Extend the runtime patch stub for PAE systems Santosh Shilimkar
  2013-06-21 23:48 ` [PATCH 1/8] ARM: mm: LPAE: use phys_addr_t appropriately in p2v and v2p conversions Santosh Shilimkar
@ 2013-06-21 23:48 ` Santosh Shilimkar
  2013-06-21 23:48 ` [PATCH 3/8] ARM: mm: Move the idmap print to appropriate place in the code Santosh Shilimkar
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 22+ messages in thread
From: Santosh Shilimkar @ 2013-06-21 23:48 UTC (permalink / raw)
  To: linux-arm-kernel

On some PAE systems (e.g. TI Keystone), memory is above the
32-bit addressable limit, and the interconnect provides an
aliased view of parts of physical memory in the 32-bit addressable
space.  This alias is strictly for boot time usage, and is not
otherwise usable because of coherency limitations. On such systems,
the idmap mechanism needs to take this aliased mapping into account.

This patch introduces virt_to_idmap() and an arch function pointer which
can be populated by platforms which need it. The necessary idmap call
sites are also converted to the now available virt_to_idmap(). An #ifdef
approach is avoided to stay compatible with multi-platform builds.

Most platforms won't touch the hook, in which case virt_to_idmap()
falls back to the existing virt_to_phys() macro.
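
A minimal sketch of how a platform is expected to populate the hook (the real
Keystone version appears in the last patch of this series; the my_platform_*
and MY_PLATFORM_LOW_PHYS_START names here are placeholders):

/* Sketch only: translate a kernel virtual address into the 32-bit
 * boot-time alias provided by the interconnect.
 */
static phys_addr_t my_platform_virt_to_idmap(unsigned long x)
{
	return (phys_addr_t)x - PAGE_OFFSET + MY_PLATFORM_LOW_PHYS_START;
}

static void __init my_platform_init_early(void)
{
	/* platforms without an alias leave arch_virt_to_idmap NULL and
	 * get the virt_to_phys() fallback
	 */
	arch_virt_to_idmap = my_platform_virt_to_idmap;
}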

Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
---
 arch/arm/include/asm/memory.h |   16 ++++++++++++++++
 arch/arm/kernel/smp.c         |    2 +-
 arch/arm/mm/idmap.c           |    5 +++--
 3 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
index 93376e4..5944092 100644
--- a/arch/arm/include/asm/memory.h
+++ b/arch/arm/include/asm/memory.h
@@ -173,6 +173,7 @@
  */
 #define __PV_BITS_31_24	0x81000000
 
+extern phys_addr_t (*arch_virt_to_idmap) (unsigned long x);
 extern unsigned long __pv_phys_offset;
 #define PHYS_OFFSET __pv_phys_offset
 
@@ -259,6 +260,21 @@ static inline void *phys_to_virt(phys_addr_t x)
 #define pfn_to_kaddr(pfn)	__va((pfn) << PAGE_SHIFT)
 
 /*
+ * These are for systems that have a hardware interconnect supported alias of
+ * physical memory for idmap purposes.  Most cases should leave these
+ * untouched.
+ */
+static inline phys_addr_t __virt_to_idmap(unsigned long x)
+{
+	if (arch_virt_to_idmap)
+		return arch_virt_to_idmap(x);
+	else
+		return __virt_to_phys(x);
+}
+
+#define virt_to_idmap(x)	__virt_to_idmap((unsigned long)(x))
+
+/*
  * Virtual <-> DMA view memory address translations
  * Again, these are *only* valid on the kernel direct mapped RAM
  * memory.  Use of these is *deprecated* (and that doesn't mean
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 217b755..e1a7b9a 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -80,7 +80,7 @@ void __init smp_set_ops(struct smp_operations *ops)
 
 static unsigned long get_arch_pgd(pgd_t *pgd)
 {
-	phys_addr_t pgdir = virt_to_phys(pgd);
+	phys_addr_t pgdir = virt_to_idmap(pgd);
 	BUG_ON(pgdir & ARCH_PGD_MASK);
 	return pgdir >> ARCH_PGD_SHIFT;
 }
diff --git a/arch/arm/mm/idmap.c b/arch/arm/mm/idmap.c
index 83cb3ac..c0a1e48 100644
--- a/arch/arm/mm/idmap.c
+++ b/arch/arm/mm/idmap.c
@@ -10,6 +10,7 @@
 #include <asm/system_info.h>
 
 pgd_t *idmap_pgd;
+phys_addr_t (*arch_virt_to_idmap) (unsigned long x);
 
 #ifdef CONFIG_ARM_LPAE
 static void idmap_add_pmd(pud_t *pud, unsigned long addr, unsigned long end,
@@ -67,8 +68,8 @@ static void identity_mapping_add(pgd_t *pgd, const char *text_start,
 	unsigned long addr, end;
 	unsigned long next;
 
-	addr = virt_to_phys(text_start);
-	end = virt_to_phys(text_end);
+	addr = virt_to_idmap(text_start);
+	end = virt_to_idmap(text_end);
 
 	prot |= PMD_TYPE_SECT | PMD_SECT_AP_WRITE | PMD_SECT_AF;
 
-- 
1.7.9.5


* [PATCH 3/8] ARM: mm: Move the idmap print to appropriate place in the code
  2013-06-21 23:48 [PATCH 0/8] ARM: mm: Extend the runtime patch stub for PAE systems Santosh Shilimkar
  2013-06-21 23:48 ` [PATCH 1/8] ARM: mm: LPAE: use phys_addr_t appropriately in p2v and v2p conversions Santosh Shilimkar
  2013-06-21 23:48 ` [PATCH 2/8] ARM: mm: Introduce virt_to_idmap() with an arch hook Santosh Shilimkar
@ 2013-06-21 23:48 ` Santosh Shilimkar
  2013-06-21 23:48 ` [PATCH 4/8] ARM: mm: Pass the constant as an argument to fixup_pv_table() Santosh Shilimkar
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 22+ messages in thread
From: Santosh Shilimkar @ 2013-06-21 23:48 UTC (permalink / raw)
  To: linux-arm-kernel

Commit 9e9a367c29cebd2 ("ARM: Section based HYP idmap") moved
the address conversion inside identity_mapping_add() without the
respective print which carries useful idmap information.

Move the print inside identity_mapping_add() as well to fix this.

Cc: Christoffer Dall <c.dall@virtualopensystems.com>
Cc: Will Deacon <will.deacon@arm.com>

Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
---
 arch/arm/mm/idmap.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/arm/mm/idmap.c b/arch/arm/mm/idmap.c
index c0a1e48..8e0e52e 100644
--- a/arch/arm/mm/idmap.c
+++ b/arch/arm/mm/idmap.c
@@ -70,6 +70,7 @@ static void identity_mapping_add(pgd_t *pgd, const char *text_start,
 
 	addr = virt_to_idmap(text_start);
 	end = virt_to_idmap(text_end);
+	pr_info("Setting up static identity map for 0x%lx - 0x%lx\n", addr, end);
 
 	prot |= PMD_TYPE_SECT | PMD_SECT_AP_WRITE | PMD_SECT_AF;
 
@@ -91,8 +92,6 @@ static int __init init_static_idmap(void)
 	if (!idmap_pgd)
 		return -ENOMEM;
 
-	pr_info("Setting up static identity map for 0x%p - 0x%p\n",
-		__idmap_text_start, __idmap_text_end);
 	identity_mapping_add(idmap_pgd, __idmap_text_start,
 			     __idmap_text_end, 0);
 
-- 
1.7.9.5


* [PATCH 4/8] ARM: mm: Pass the constant as an argument to fixup_pv_table()
  2013-06-21 23:48 [PATCH 0/8] ARM: mm: Extend the runtime patch stub for PAE systems Santosh Shilimkar
                   ` (2 preceding siblings ...)
  2013-06-21 23:48 ` [PATCH 3/8] ARM: mm: Move the idmap print to appropriate place in the code Santosh Shilimkar
@ 2013-06-21 23:48 ` Santosh Shilimkar
  2013-06-21 23:48 ` [PATCH 5/8] ARM: mm: Add __pv_stub_mov to patch MOV instruction Santosh Shilimkar
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 22+ messages in thread
From: Santosh Shilimkar @ 2013-06-21 23:48 UTC (permalink / raw)
  To: linux-arm-kernel

The current fixup_pv_table() API hardcodes the constant to be
patched. Make it an argument to fixup_pv_table() so that
different types of instructions can be patched.

This is a preparatory patch for the subsequent 64 bit patching.

Signed-off-by: Sricharan R <r.sricharan@ti.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
---
 arch/arm/include/asm/memory.h |    2 ++
 arch/arm/kernel/head.S        |    3 +--
 arch/arm/kernel/module.c      |    4 ++--
 3 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
index 5944092..3d4f79c 100644
--- a/arch/arm/include/asm/memory.h
+++ b/arch/arm/include/asm/memory.h
@@ -175,6 +175,8 @@
 
 extern phys_addr_t (*arch_virt_to_idmap) (unsigned long x);
 extern unsigned long __pv_phys_offset;
+extern unsigned long __pv_offset;
+
 #define PHYS_OFFSET __pv_phys_offset
 
 #define __pv_stub(from,to,instr,type)			\
diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index 45e8935..764e83b 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -604,11 +604,10 @@ ENDPROC(__fixup_a_pv_table)
 
 ENTRY(fixup_pv_table)
 	stmfd	sp!, {r4 - r7, lr}
-	ldr	r2, 2f			@ get address of __pv_phys_offset
 	mov	r3, #0			@ no offset
 	mov	r4, r0			@ r0 = table start
 	add	r5, r0, r1		@ r1 = table size
-	ldr	r6, [r2, #4]		@ get __pv_offset
+	mov	r6, r2			@ get constant to be patched
 	bl	__fixup_a_pv_table
 	ldmfd	sp!, {r4 - r7, pc}
 ENDPROC(fixup_pv_table)
diff --git a/arch/arm/kernel/module.c b/arch/arm/kernel/module.c
index 1e9be5d..1ac071b 100644
--- a/arch/arm/kernel/module.c
+++ b/arch/arm/kernel/module.c
@@ -265,7 +265,7 @@ static const Elf_Shdr *find_mod_section(const Elf32_Ehdr *hdr,
 	return NULL;
 }
 
-extern void fixup_pv_table(const void *, unsigned long);
+extern void fixup_pv_table(const void *, unsigned long, unsigned long);
 extern void fixup_smp(const void *, unsigned long);
 
 int module_finalize(const Elf32_Ehdr *hdr, const Elf_Shdr *sechdrs,
@@ -319,7 +319,7 @@ int module_finalize(const Elf32_Ehdr *hdr, const Elf_Shdr *sechdrs,
 #ifdef CONFIG_ARM_PATCH_PHYS_VIRT
 	s = find_mod_section(hdr, sechdrs, ".pv_table");
 	if (s)
-		fixup_pv_table((void *)s->sh_addr, s->sh_size);
+		fixup_pv_table((void *)s->sh_addr, s->sh_size, __pv_offset);
 #endif
 	s = find_mod_section(hdr, sechdrs, ".alt.smp.init");
 	if (s && !is_smp())
-- 
1.7.9.5


* [PATCH 5/8] ARM: mm: Add __pv_stub_mov to patch MOV instruction
  2013-06-21 23:48 [PATCH 0/8] ARM: mm: Extend the runtime patch stub for PAE systems Santosh Shilimkar
                   ` (3 preceding siblings ...)
  2013-06-21 23:48 ` [PATCH 4/8] ARM: mm: Pass the constant as an argument to fixup_pv_table() Santosh Shilimkar
@ 2013-06-21 23:48 ` Santosh Shilimkar
  2013-06-21 23:48 ` [PATCH 6/8] ARM: mm: LPAE: Correct virt_to_phys patching for 64 bit physical addresses Santosh Shilimkar
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 22+ messages in thread
From: Santosh Shilimkar @ 2013-06-21 23:48 UTC (permalink / raw)
  To: linux-arm-kernel

From: Sricharan R <r.sricharan@ti.com>

This patch adds a stub for the MOV instruction. It is a preparatory
patch for the subsequent 64 bit patching, which will use the MOV stub.

For review simplicity it is kept as a separate change.
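
The intended consumer is the 64 bit patching in the next patch, where the MOV
stub supplies the upper 32 bits of the physical address and the existing
add/sub stub supplies the lower 32 bits, roughly (trimmed from the next patch,
carry handling omitted):

	phys_addr_t t;

	__pv_stub_mov(t, "mov", __PV_BITS_7_0);		/* patched with the upper 32 bits */
	__pv_stub(x, t, "adds", __PV_BITS_31_24);	/* patched with the low 32-bit offset */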

Signed-off-by: Sricharan R <r.sricharan@ti.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
---
 arch/arm/include/asm/memory.h |   10 ++++++++++
 arch/arm/kernel/head.S        |    2 ++
 arch/arm/kernel/vmlinux.lds.S |    5 +++++
 3 files changed, 17 insertions(+)

diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
index 3d4f79c..d8a3ea6 100644
--- a/arch/arm/include/asm/memory.h
+++ b/arch/arm/include/asm/memory.h
@@ -172,6 +172,7 @@
  * so that all we need to do is modify the 8-bit constant field.
  */
 #define __PV_BITS_31_24	0x81000000
+#define __PV_BITS_7_0	0x81
 
 extern phys_addr_t (*arch_virt_to_idmap) (unsigned long x);
 extern unsigned long __pv_phys_offset;
@@ -188,6 +189,15 @@ extern unsigned long __pv_offset;
 	: "=r" (to)					\
 	: "r" (from), "I" (type))
 
+#define __pv_stub_mov(to, instr, type)			\
+	__asm__ volatile("@ __pv_stub_mov\n"		\
+	"1:	" instr "	%R0, %1\n"		\
+	"	.pushsection .pv_high_table,\"a\"\n"	\
+	"	.long	1b\n"				\
+	"	.popsection\n"				\
+	: "=r" (to)					\
+	: "I" (type))
+
 static inline phys_addr_t __virt_to_phys(unsigned long x)
 {
 	unsigned long t;
diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index 764e83b..b1bdeb5 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -565,6 +565,8 @@ ENDPROC(__fixup_pv_table)
 	.long	__pv_table_begin
 	.long	__pv_table_end
 2:	.long	__pv_phys_offset
+3:	.long	__pv_high_table_begin
+	.long	__pv_high_table_end
 
 	.text
 __fixup_a_pv_table:
diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
index a871b8e..cf17c7b 100644
--- a/arch/arm/kernel/vmlinux.lds.S
+++ b/arch/arm/kernel/vmlinux.lds.S
@@ -182,6 +182,11 @@ SECTIONS
 		*(.pv_table)
 		__pv_table_end = .;
 	}
+	.init.pv_high_table : {
+		__pv_high_table_begin = .;
+		*(.pv_high_table)
+		__pv_high_table_end = .;
+	}
 	.init.data : {
 #ifndef CONFIG_XIP_KERNEL
 		INIT_DATA
-- 
1.7.9.5


* [PATCH 6/8] ARM: mm: LPAE: Correct virt_to_phys patching for 64 bit physical addresses
  2013-06-21 23:48 [PATCH 0/8] ARM: mm: Extend the runtime patch stub for PAE systems Santosh Shilimkar
                   ` (4 preceding siblings ...)
  2013-06-21 23:48 ` [PATCH 5/8] ARM: mm: Add __pv_stub_mov to patch MOV instruction Santosh Shilimkar
@ 2013-06-21 23:48 ` Santosh Shilimkar
  2013-07-24  1:10   ` Nicolas Pitre
  2013-06-21 23:48 ` [PATCH 7/8] ARM: mm: Recreate kernel mappings in early_paging_init() Santosh Shilimkar
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 22+ messages in thread
From: Santosh Shilimkar @ 2013-06-21 23:48 UTC (permalink / raw)
  To: linux-arm-kernel

From: Sricharan R <r.sricharan@ti.com>

The current virt_to_phys patching mechanism does not work
for 64 bit physical addresses. Note that the constant used in add/sub
instructions is encoded into the last 8 bits of the opcode. So shift
the __pv_offset constant by 24 to get it into the correct place.

The v2p patching mechanism patches the higher 32 bits of the physical
address with a constant. While this is correct, on those platforms
where the lowmem addressable physical memory spans the 4GB boundary,
a carry bit can be produced as a result of the addition of the lower
32 bits. This has to be taken into account and added into the upper
word. The patched __pv_offset and va are added in the lower 32 bits,
where __pv_offset can be in two's complement form when PA_START <
VA_START, and that can result in a false carry bit.

e.g. PA = 0x80000000 VA = 0xC0000000
__pv_offset = PA - VA = 0xC0000000 (2's complement)

So adding __pv_offset + VA should never result in a true overflow. In
order to differentiate a true carry from a false one, an extra flag
__pv_sign_flag is introduced.
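
As a worked example of the patched sequence for the Keystone layout used later
in the series (PHYS_OFFSET = 0x8_0000_0000, PAGE_OFFSET = 0xC000_0000),
modelled in plain user-space C with the carry flag simulated explicitly:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
	uint32_t va        = 0xC0001000;	/* some lowmem virtual address */
	uint32_t pv_low    = 0x40000000;	/* low 32 bits of PHYS_OFFSET - PAGE_OFFSET */
	uint32_t phys_hi   = 0x8;		/* patched MOV: upper 32 bits of PHYS_OFFSET */
	uint32_t sign_flag = 1;			/* low 32 bits of PHYS_OFFSET < PAGE_OFFSET */

	uint64_t sum   = (uint64_t)va + pv_low;		/* "adds" */
	uint32_t lo    = (uint32_t)sum;
	uint32_t carry = (uint32_t)(sum >> 32);		/* the carry flag */

	phys_hi = phys_hi + carry - sign_flag;		/* "adc", then the false-carry fixup */

	printf("pa = 0x%x%08x\n", phys_hi, lo);		/* 0x800001000 */
	return 0;
}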

There is no corresponding change on the phys_to_virt() side, because
computations on the upper 32-bits would be discarded anyway.

We think the patch can be further optimised and made a bit better with
expert review from RMK, Nico and others.

Cc: Nicolas Pitre <nico@linaro.org>
Cc: Russell King <linux@arm.linux.org.uk>

Signed-off-by: Sricharan R <r.sricharan@ti.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
---
 arch/arm/include/asm/memory.h |   22 ++++++++++++++++++++--
 arch/arm/kernel/armksyms.c    |    2 ++
 arch/arm/kernel/head.S        |   34 ++++++++++++++++++++++++++++++++--
 arch/arm/kernel/module.c      |    7 +++++++
 4 files changed, 61 insertions(+), 4 deletions(-)

diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
index d8a3ea6..e16468d 100644
--- a/arch/arm/include/asm/memory.h
+++ b/arch/arm/include/asm/memory.h
@@ -174,9 +174,17 @@
 #define __PV_BITS_31_24	0x81000000
 #define __PV_BITS_7_0	0x81
 
+/*
+ * PV patch constants.
+ * Lower 32bits are 16MB aligned.
+ */
+#define PV_LOW_SHIFT	24
+#define PV_HIGH_SHIFT	32
+
 extern phys_addr_t (*arch_virt_to_idmap) (unsigned long x);
-extern unsigned long __pv_phys_offset;
+extern phys_addr_t __pv_phys_offset;
 extern unsigned long __pv_offset;
+extern unsigned long __pv_sign_flag;
 
 #define PHYS_OFFSET __pv_phys_offset
 
@@ -187,7 +195,8 @@ extern unsigned long __pv_offset;
 	"	.long	1b\n"				\
 	"	.popsection\n"				\
 	: "=r" (to)					\
-	: "r" (from), "I" (type))
+	: "r" (from), "I" (type)			\
+	: "cc")
 
 #define __pv_stub_mov(to, instr, type)			\
 	__asm__ volatile("@ __pv_stub_mov\n"		\
@@ -200,8 +209,17 @@ extern unsigned long __pv_offset;
 
 static inline phys_addr_t __virt_to_phys(unsigned long x)
 {
+#ifdef CONFIG_ARM_LPAE
+	register phys_addr_t t asm("r4") = 0;
+
+	__pv_stub_mov(t, "mov", __PV_BITS_7_0);
+	__pv_stub(x, t, "adds", __PV_BITS_31_24);
+	__asm__ volatile("adc %R0, %R0, %1" : "+r" (t) : "I" (0x0));
+	__asm__ volatile("sub %R0, %R0, %1" : "+r" (t) : "r" (__pv_sign_flag));
+#else
 	unsigned long t;
 	__pv_stub(x, t, "add", __PV_BITS_31_24);
+#endif
 	return t;
 }
 
diff --git a/arch/arm/kernel/armksyms.c b/arch/arm/kernel/armksyms.c
index 60d3b73..f0c51ed 100644
--- a/arch/arm/kernel/armksyms.c
+++ b/arch/arm/kernel/armksyms.c
@@ -155,4 +155,6 @@ EXPORT_SYMBOL(__gnu_mcount_nc);
 
 #ifdef CONFIG_ARM_PATCH_PHYS_VIRT
 EXPORT_SYMBOL(__pv_phys_offset);
+EXPORT_SYMBOL(__pv_offset);
+EXPORT_SYMBOL(__pv_sign_flag);
 #endif
diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index b1bdeb5..25c9d5f 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -546,24 +546,42 @@ ENDPROC(fixup_smp)
 	__HEAD
 __fixup_pv_table:
 	adr	r0, 1f
-	ldmia	r0, {r3-r5, r7}
+	ldmia	r0, {r3-r7}
+	cmp	r0, r3
+	mov	ip, #1
 	sub	r3, r0, r3	@ PHYS_OFFSET - PAGE_OFFSET
 	add	r4, r4, r3	@ adjust table start address
 	add	r5, r5, r3	@ adjust table end address
 	add	r7, r7, r3	@ adjust __pv_phys_offset address
+	add	r6, r6, r3	@ adjust __pv_sign_flag
+	strcc	ip, [r6]	@ save __pv_sign_flag
 	str	r8, [r7]	@ save computed PHYS_OFFSET to __pv_phys_offset
 	mov	r6, r3, lsr #24	@ constant for add/sub instructions
 	teq	r3, r6, lsl #24 @ must be 16MiB aligned
 THUMB(	it	ne		@ cross section branch )
 	bne	__error
+#ifndef CONFIG_ARM_LPAE
 	str	r6, [r7, #4]	@ save to __pv_offset
 	b	__fixup_a_pv_table
+#else
+	str	r6, [r7, #8]	@ save to __pv_offset
+	mov	r0, r14		@ save lr
+	bl	__fixup_a_pv_table
+	adr	r6, 3f
+	ldmia	r6, {r4-r5}
+	add	r4, r4, r3	@ adjust __pv_high_table start address
+	add 	r5, r5, r3	@ adjust __pv_high_table end address
+	mov	r6, #0		@ higher 32 bits of PHYS_OFFSET to start with
+	bl	__fixup_a_pv_table
+	mov	pc, r0
+#endif
 ENDPROC(__fixup_pv_table)
 
 	.align
 1:	.long	.
 	.long	__pv_table_begin
 	.long	__pv_table_end
+	.long  __pv_sign_flag
 2:	.long	__pv_phys_offset
 3:	.long	__pv_high_table_begin
 	.long	__pv_high_table_end
@@ -621,10 +639,22 @@ ENDPROC(fixup_pv_table)
 	.globl	__pv_phys_offset
 	.type	__pv_phys_offset, %object
 __pv_phys_offset:
+#ifdef CONFIG_ARM_LPAE
+	.quad	0
+#else
 	.long	0
-	.size	__pv_phys_offset, . - __pv_phys_offset
+#endif
+	.data
+	.globl __pv_offset
+	.type __pv_offset, %object
 __pv_offset:
 	.long	0
+
+	.data
+	.globl __pv_sign_flag
+	.type __pv_sign_flag, %object
+__pv_sign_flag:
+	.long 0
 #endif
 
 #include "head-common.S"
diff --git a/arch/arm/kernel/module.c b/arch/arm/kernel/module.c
index 1ac071b..024c06d 100644
--- a/arch/arm/kernel/module.c
+++ b/arch/arm/kernel/module.c
@@ -320,6 +320,13 @@ int module_finalize(const Elf32_Ehdr *hdr, const Elf_Shdr *sechdrs,
 	s = find_mod_section(hdr, sechdrs, ".pv_table");
 	if (s)
 		fixup_pv_table((void *)s->sh_addr, s->sh_size, __pv_offset);
+
+#ifdef CONFIG_ARM_LPAE
+	s = find_mod_section(hdr, sechdrs, ".pv_high_table");
+	if (s)
+		fixup_pv_table((void *)s->sh_addr, s->sh_size,
+					__pv_phys_offset >> PV_HIGH_SHIFT);
+#endif
 #endif
 	s = find_mod_section(hdr, sechdrs, ".alt.smp.init");
 	if (s && !is_smp())
-- 
1.7.9.5


* [PATCH 7/8] ARM: mm: Recreate kernel mappings in early_paging_init()
  2013-06-21 23:48 [PATCH 0/8] ARM: mm: Extend the runtime patch stub for PAE systems Santosh Shilimkar
                   ` (5 preceding siblings ...)
  2013-06-21 23:48 ` [PATCH 6/8] ARM: mm: LPAE: Correct virt_to_phys patching for 64 bit physical addresses Santosh Shilimkar
@ 2013-06-21 23:48 ` Santosh Shilimkar
  2013-06-21 23:48 ` [PATCH 8/8] ARM: keystone: Switch over to high physical address range Santosh Shilimkar
  2013-06-22  1:51 ` [PATCH 0/8] ARM: mm: Extend the runtime patch stub for PAE systems Nicolas Pitre
  8 siblings, 0 replies; 22+ messages in thread
From: Santosh Shilimkar @ 2013-06-21 23:48 UTC (permalink / raw)
  To: linux-arm-kernel

This patch adds a step in the init sequence, in order to recreate
the kernel code/data page table mappings prior to full paging
initialization.  This is necessary on LPAE systems that run from
physical address space outside the 4G limit.  On these systems,
this implementation provides a machine descriptor hook that allows
the PHYS_OFFSET to be overridden in a machine specific fashion.

Based on Cyril's initial patch. The pv_table needs to be patched
again after switching to the higher address space.
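
A minimal sketch of the hook a machine descriptor provides to trigger this
(the concrete Keystone version is in the next patch; MY_BOARD_HIGH_PHYS_START
and my_board_init_meminfo are placeholders):

/* Sketch only: wired up through the new .init_meminfo field of
 * struct machine_desc.
 */
static void __init my_board_init_meminfo(void)
{
	/*
	 * Retarget the runtime patch constants at the high address range;
	 * early_paging_init() then re-runs the stub patching and rebuilds
	 * the kernel mappings before paging_init().
	 */
	__pv_phys_offset = MY_BOARD_HIGH_PHYS_START;	/* the new PHYS_OFFSET */
	__pv_offset = (MY_BOARD_HIGH_PHYS_START - PAGE_OFFSET) >> PV_LOW_SHIFT;
}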

Cc: Nicolas Pitre <nico@linaro.org>
Cc: Russell King <linux@arm.linux.org.uk>

Signed-off-by: R Sricharan <r.sricharan@ti.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
---
 arch/arm/include/asm/mach/arch.h |    1 +
 arch/arm/kernel/setup.c          |    3 ++
 arch/arm/mm/mmu.c                |   99 ++++++++++++++++++++++++++++++++++++++
 3 files changed, 103 insertions(+)

diff --git a/arch/arm/include/asm/mach/arch.h b/arch/arm/include/asm/mach/arch.h
index 308ad7d..e487b8e 100644
--- a/arch/arm/include/asm/mach/arch.h
+++ b/arch/arm/include/asm/mach/arch.h
@@ -43,6 +43,7 @@ struct machine_desc {
 	struct smp_operations	*smp;		/* SMP operations	*/
 	void			(*fixup)(struct tag *, char **,
 					 struct meminfo *);
+	void			(*init_meminfo)(void);
 	void			(*reserve)(void);/* reserve mem blocks	*/
 	void			(*map_io)(void);/* IO mapping function	*/
 	void			(*init_early)(void);
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index bdcd4dd..e2ebe6c 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -71,6 +71,7 @@ static int __init fpe_setup(char *line)
 __setup("fpe=", fpe_setup);
 #endif
 
+extern void early_paging_init(struct machine_desc *, struct proc_info_list *);
 extern void paging_init(struct machine_desc *desc);
 extern void sanity_check_meminfo(void);
 extern void reboot_setup(char *str);
@@ -789,6 +790,8 @@ void __init setup_arch(char **cmdline_p)
 	parse_early_param();
 
 	sort(&meminfo.bank, meminfo.nr_banks, sizeof(meminfo.bank[0]), meminfo_cmp, NULL);
+
+	early_paging_init(mdesc, lookup_processor_type(read_cpuid_id()));
 	sanity_check_meminfo();
 	arm_memblock_init(&meminfo, mdesc);
 
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index 280f91d..f8ef29b 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -28,6 +28,7 @@
 #include <asm/highmem.h>
 #include <asm/system_info.h>
 #include <asm/traps.h>
+#include <asm/procinfo.h>
 
 #include <asm/mach/arch.h>
 #include <asm/mach/map.h>
@@ -1264,6 +1265,104 @@ static void __init map_lowmem(void)
 	}
 }
 
+#ifdef CONFIG_ARM_LPAE
+extern void fixup_pv_table(const void *, unsigned long, unsigned long);
+extern const void *__pv_table_begin, *__pv_table_end;
+extern const void *__pv_high_table_begin, *__pv_high_table_end;
+
+/*
+ * early_paging_init() recreates boot time page table setup, allowing machines
+ * to switch over to a high (>4G) address space on LPAE systems
+ */
+void __init early_paging_init(struct machine_desc *mdesc,
+			      struct proc_info_list *procinfo)
+{
+	pmdval_t pmdprot = procinfo->__cpu_mm_mmu_flags;
+	unsigned long map_start, map_end;
+	pgd_t *pgd0, *pgdk;
+	pud_t *pud0, *pudk;
+	pmd_t *pmd0, *pmdk;
+	phys_addr_t phys;
+	int i;
+	unsigned long __pv_phys_offset_low;
+
+	/* remap kernel code and data */
+	map_start = init_mm.start_code;
+	map_end   = init_mm.brk;
+
+	/* get a handle on things... */
+	pgd0 = pgd_offset_k(0);
+	pud0 = pud_offset(pgd0, 0);
+	pmd0 = pmd_offset(pud0, 0);
+
+	pgdk = pgd_offset_k(map_start);
+	pudk = pud_offset(pgdk, map_start);
+	pmdk = pmd_offset(pudk, map_start);
+
+	phys = PHYS_OFFSET;
+
+	if (mdesc->init_meminfo) {
+		mdesc->init_meminfo();
+		/* Run the patch stub to update the constants */
+		fixup_pv_table(&__pv_table_begin,
+			(&__pv_table_end - &__pv_table_begin) << 2,
+			__pv_offset);
+
+		fixup_pv_table(&__pv_high_table_begin,
+			(&__pv_high_table_end - &__pv_high_table_begin) << 2,
+			__pv_phys_offset >> PV_HIGH_SHIFT);
+
+		/*
+		 * Cache cleaning operations for self-modifying code
+		 * We should clean the entries by MVA but running a
+		 * for loop over every pv_table entry pointer would
+		 * just complicate the code.
+		 */
+		flush_cache_louis();
+		dsb();
+		isb();
+
+		/*
+		 * Set the flag to indicate whether __pv_offset is
+		 * real or 2's complement after high address switch
+		 */
+		__pv_phys_offset_low = __pv_phys_offset;
+		if (__pv_phys_offset_low < PAGE_OFFSET)
+			__pv_sign_flag = 1;
+		else
+			 __pv_sign_flag = 0;
+	}
+
+	/* remap level 1 table */
+	for (i = 0; i < PTRS_PER_PGD; i++) {
+		*pud0++ = __pud(__pa(pmd0) | PMD_TYPE_TABLE | L_PGD_SWAPPER);
+		pmd0 += PTRS_PER_PMD;
+	}
+
+	/* remap pmds for kernel mapping */
+	phys = __pa(map_start) & PMD_MASK;
+	do {
+		*pmdk++ = __pmd(phys | pmdprot);
+		phys += PMD_SIZE;
+	} while (phys < map_end);
+
+	flush_cache_all();
+	cpu_set_ttbr(0, __pa(pgd0));
+	cpu_set_ttbr(1, __pa(pgd0) + TTBR1_OFFSET);
+	local_flush_tlb_all();
+}
+
+#else
+
+void __init early_paging_init(struct machine_desc *mdesc,
+			      struct proc_info_list *procinfo)
+{
+	if (mdesc->init_meminfo)
+		mdesc->init_meminfo();
+}
+
+#endif
+
 /*
  * paging_init() sets up the page tables, initialises the zone memory
  * maps, and sets up the zero page, bad page and bad page tables.
-- 
1.7.9.5


* [PATCH 8/8] ARM: keystone: Switch over to high physical address range
  2013-06-21 23:48 [PATCH 0/8] ARM: mm: Extend the runtime patch stub for PAE systems Santosh Shilimkar
                   ` (6 preceding siblings ...)
  2013-06-21 23:48 ` [PATCH 7/8] ARM: mm: Recreate kernel mappings in early_paging_init() Santosh Shilimkar
@ 2013-06-21 23:48 ` Santosh Shilimkar
  2013-06-22  1:51 ` [PATCH 0/8] ARM: mm: Extend the runtime patch stub for PAE systems Nicolas Pitre
  8 siblings, 0 replies; 22+ messages in thread
From: Santosh Shilimkar @ 2013-06-21 23:48 UTC (permalink / raw)
  To: linux-arm-kernel

Keystone platforms have their physical memory mapped at an address outside the
32-bit physical range.  A Keystone machine with 16G of RAM would find its
memory at 0x0800000000 - 0x0bffffffff.

For boot purposes, the interconnect supports a limited alias of some of this
memory within the 32-bit addressable space (0x80000000 - 0xffffffff).  This
aliasing is implemented in hardware, and is not intended to be used much
beyond boot.  For instance, DMA coherence does not work when running out of
this aliased address space.

Therefore, a two-phased boot approach is implemented.
1) Boot out of the low physical address range
2) Switch over to the high range once we're safely inside machine
 initialization.

This patch implements this switch-over mechanism, which involves rewiring
the TTBRs and page tables to point to the new physical address space.
It is based on an earlier patch version from Cyril.
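
To make the aliasing concrete, the same page of RAM is reachable at two
physical addresses. A quick user-space check of the translation done by
keystone_virt_to_idmap() and by the high-address switch, assuming the default
0xC0000000 PAGE_OFFSET:

#include <stdio.h>
#include <stdint.h>

#define KEYSTONE_LOW_PHYS_START		0x80000000ULL
#define KEYSTONE_HIGH_PHYS_START	0x800000000ULL
#define PAGE_OFFSET			0xC0000000UL	/* assumed 3G/1G split */

int main(void)
{
	unsigned long va = 0xC0100000UL;	/* some kernel virtual address */

	/* boot-time alias used for idmap and secondary CPU bring-up */
	uint64_t low  = (uint64_t)va - PAGE_OFFSET + KEYSTONE_LOW_PHYS_START;
	/* real address used once the switch-over has happened */
	uint64_t high = (uint64_t)va - PAGE_OFFSET + KEYSTONE_HIGH_PHYS_START;

	printf("va 0x%lx -> low alias 0x%llx, high 0x%llx\n",
	       va, (unsigned long long)low, (unsigned long long)high);
	return 0;
}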

Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
---
 arch/arm/mach-keystone/keystone.c |   49 +++++++++++++++++++++++++++++++++++++
 arch/arm/mach-keystone/memory.h   |   24 ++++++++++++++++++
 arch/arm/mach-keystone/platsmp.c  |   16 +++++++++++-
 3 files changed, 88 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm/mach-keystone/memory.h

diff --git a/arch/arm/mach-keystone/keystone.c b/arch/arm/mach-keystone/keystone.c
index fe4d9ff..90442bb 100644
--- a/arch/arm/mach-keystone/keystone.c
+++ b/arch/arm/mach-keystone/keystone.c
@@ -20,6 +20,9 @@
 #include <asm/mach/arch.h>
 #include <asm/mach/time.h>
 #include <asm/smp_plat.h>
+#include <asm/memory.h>
+
+#include "memory.h"
 
 #include "keystone.h"
 
@@ -44,6 +47,51 @@ static void __init keystone_init(void)
 	of_platform_populate(NULL, of_default_bus_match_table, NULL, NULL);
 }
 
+static phys_addr_t keystone_virt_to_idmap(unsigned long x)
+{
+	return (phys_addr_t)(x) - CONFIG_PAGE_OFFSET + KEYSTONE_LOW_PHYS_START;
+}
+
+static void __init keystone_init_meminfo(void)
+{
+	bool lpae = IS_ENABLED(CONFIG_ARM_LPAE);
+	bool pvpatch = IS_ENABLED(CONFIG_ARM_PATCH_PHYS_VIRT);
+	phys_addr_t offset = PHYS_OFFSET - KEYSTONE_LOW_PHYS_START;
+	phys_addr_t mem_start, mem_end;
+
+	BUG_ON(meminfo.nr_banks < 1);
+	mem_start = meminfo.bank[0].start;
+	mem_end = mem_start + meminfo.bank[0].size - 1;
+
+	/* nothing to do if we are running out of the <32-bit space */
+	if (mem_start >= KEYSTONE_LOW_PHYS_START &&
+	    mem_end   <= KEYSTONE_LOW_PHYS_END)
+		return;
+
+	if (!lpae || !pvpatch) {
+		pr_crit("Enable %s%s%s to run outside 32-bit space\n",
+		      !lpae ? __stringify(CONFIG_ARM_LPAE) : "",
+		      (!lpae && !pvpatch) ? " and " : "",
+		      !pvpatch ? __stringify(CONFIG_ARM_PATCH_PHYS_VIRT) : "");
+	}
+
+	if (mem_start < KEYSTONE_HIGH_PHYS_START ||
+	    mem_end   > KEYSTONE_HIGH_PHYS_END) {
+		pr_crit("Invalid address space for memory (%08llx-%08llx)\n",
+		      (u64)mem_start, (u64)mem_end);
+	}
+
+	offset += KEYSTONE_HIGH_PHYS_START;
+	__pv_phys_offset = offset;
+	__pv_offset = (offset - PAGE_OFFSET);
+	__pv_offset >>= PV_LOW_SHIFT;
+
+	/* Populate the arch idmap hook */
+	arch_virt_to_idmap = keystone_virt_to_idmap;
+
+	pr_info("Switching to high address space at 0x%llx\n", (u64)offset);
+}
+
 static const char *keystone_match[] __initconst = {
 	"ti,keystone-evm",
 	NULL,
@@ -72,4 +120,5 @@ DT_MACHINE_START(KEYSTONE, "Keystone")
 	.init_machine	= keystone_init,
 	.dt_compat	= keystone_match,
 	.restart	= keystone_restart,
+	.init_meminfo   = keystone_init_meminfo,
 MACHINE_END
diff --git a/arch/arm/mach-keystone/memory.h b/arch/arm/mach-keystone/memory.h
new file mode 100644
index 0000000..83fc5a9
--- /dev/null
+++ b/arch/arm/mach-keystone/memory.h
@@ -0,0 +1,24 @@
+/*
+ * Copyright 2013 Texas Instruments, Inc.
+ *	Santosh Shilimkar <santosh.shilimkar@ti.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ */
+#ifndef __MACH_MEMORY_H
+#define __MACH_MEMORY_H
+
+#define MAX_PHYSMEM_BITS	36
+#define SECTION_SIZE_BITS	34
+
+#define KEYSTONE_LOW_PHYS_START		0x80000000ULL
+#define KEYSTONE_LOW_PHYS_SIZE		0x80000000ULL /* 2G */
+#define KEYSTONE_LOW_PHYS_END		(KEYSTONE_LOW_PHYS_START + \
+					 KEYSTONE_LOW_PHYS_SIZE - 1)
+
+#define KEYSTONE_HIGH_PHYS_START	0x800000000ULL
+#define KEYSTONE_HIGH_PHYS_SIZE		0x400000000ULL	/* 16G */
+#define KEYSTONE_HIGH_PHYS_END		(KEYSTONE_HIGH_PHYS_START + \
+					 KEYSTONE_HIGH_PHYS_SIZE - 1)
+#endif /* _MACH_MEMORY_H */
diff --git a/arch/arm/mach-keystone/platsmp.c b/arch/arm/mach-keystone/platsmp.c
index 1d4181e..4be8ea2 100644
--- a/arch/arm/mach-keystone/platsmp.c
+++ b/arch/arm/mach-keystone/platsmp.c
@@ -18,13 +18,15 @@
 
 #include <asm/smp_plat.h>
 #include <asm/prom.h>
+#include <asm/tlbflush.h>
+#include <asm/pgtable.h>
 
 #include "keystone.h"
 
 static int __cpuinit keystone_smp_boot_secondary(unsigned int cpu,
 						struct task_struct *idle)
 {
-	unsigned long start = virt_to_phys(&secondary_startup);
+	unsigned long start = virt_to_idmap(&secondary_startup);
 	int error;
 
 	pr_debug("keystone-smp: booting cpu %d, vector %08lx\n",
@@ -37,7 +39,19 @@ static int __cpuinit keystone_smp_boot_secondary(unsigned int cpu,
 	return error;
 }
 
+static void __cpuinit keystone_smp_secondary_initmem(unsigned int cpu)
+{
+	bool lpae = IS_ENABLED(CONFIG_ARM_LPAE);
+
+	if (lpae) {
+		pgd_t *pgd0 = pgd_offset_k(0);
+		cpu_set_ttbr(1, __pa(pgd0) + TTBR1_OFFSET);
+		local_flush_tlb_all();
+	}
+}
+
 struct smp_operations keystone_smp_ops __initdata = {
 	.smp_init_cpus		= arm_dt_init_cpu_maps,
 	.smp_boot_secondary	= keystone_smp_boot_secondary,
+	.smp_secondary_init     = keystone_smp_secondary_initmem,
 };
-- 
1.7.9.5


* [PATCH 0/8] ARM: mm: Extend the runtime patch stub for PAE systems
  2013-06-21 23:48 [PATCH 0/8] ARM: mm: Extend the runtime patch stub for PAE systems Santosh Shilimkar
                   ` (7 preceding siblings ...)
  2013-06-21 23:48 ` [PATCH 8/8] ARM: keystone: Switch over to high physical address range Santosh Shilimkar
@ 2013-06-22  1:51 ` Nicolas Pitre
  2013-06-22  2:17   ` Santosh Shilimkar
  2013-07-16 18:42   ` Santosh Shilimkar
  8 siblings, 2 replies; 22+ messages in thread
From: Nicolas Pitre @ 2013-06-22  1:51 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 21 Jun 2013, Santosh Shilimkar wrote:

> Based on discussion/debate on Cyril's generic code patching framework,
> we cooked up this series which basically tries to extend the existing
> v2p runtime patching for LPAE machines which can have physical memory
> beyond 4 GB. Keystone is one such ARM machine.
> 
> We think the 64 bit patching can still be made better than the proposed
> patch in the series and hence we are seeking expert comments from RMK, Nico and
> others. The last patch in the series is added to just give perspective on how
> machine code will make use of the available bits from the series.

FYI I'm currently on vacation and only handling quick emails.

I'll review your series in a week or so.


Nicolas


* [PATCH 0/8] ARM: mm: Extend the runtime patch stub for PAE systems
  2013-06-22  1:51 ` [PATCH 0/8] ARM: mm: Extend the runtime patch stub for PAE systems Nicolas Pitre
@ 2013-06-22  2:17   ` Santosh Shilimkar
  2013-07-16 18:42   ` Santosh Shilimkar
  1 sibling, 0 replies; 22+ messages in thread
From: Santosh Shilimkar @ 2013-06-22  2:17 UTC (permalink / raw)
  To: linux-arm-kernel

On Friday 21 June 2013 09:51 PM, Nicolas Pitre wrote:
> On Fri, 21 Jun 2013, Santosh Shilimkar wrote:
> 
>> Based on discussion/debate on Cyril's generic code patching framework,
>> we cooked up this series which basically tries to extend the existing
>> v2p runtime patching for LPAE machines which can have physical memory
>> beyond 4 GB. Keystone is one such ARM machine.
>>
>> We think the 64 bit patching can still be made better than the proposed
>> patch in the series and hence we are seeking expert comments from RMK, Nico and
>> others. The last patch in the series is added to just give perspective on how
>> machine code will make use of the available bits from the series.
> 
> FYI I'm currently on vacation and only handling quick emails.
> 
> I'll review your series in a week or so.
> 
NP. Thanks for the note.

Regards,
Santosh


* [PATCH 0/8] ARM: mm: Extend the runtime patch stub for PAE systems
  2013-06-22  1:51 ` [PATCH 0/8] ARM: mm: Extend the runtime patch stub for PAE systems Nicolas Pitre
  2013-06-22  2:17   ` Santosh Shilimkar
@ 2013-07-16 18:42   ` Santosh Shilimkar
  1 sibling, 0 replies; 22+ messages in thread
From: Santosh Shilimkar @ 2013-07-16 18:42 UTC (permalink / raw)
  To: linux-arm-kernel

Russell, Nico,

Ping.

On Friday 21 June 2013 09:51 PM, Nicolas Pitre wrote:
> On Fri, 21 Jun 2013, Santosh Shilimkar wrote:
> 
>> Based on discussion/debate on Cyril's generic code patching framework,
>> we cooked up this series which basically tries to extend the existing
>> v2p runtime patching for LPAE machines which can have physical memory
>> beyond 4 GB. Keystone is one such ARM machine.
>>
>> We think the 64 bit patching can still be made better than the proposed
>> patch in the series and hence we are seeking expert comments from RMK, Nico and
>> others. The last patch in the series is added to just give perspective on how
>> machine code will make use of the available bits from the series.
> 
> FYI I'm currently on vacation and only handling quick emails.
> 
> I'll review your series in a week or so.
> 


* [PATCH 1/8] ARM: mm: LPAE: use phys_addr_t appropriately in p2v and v2p conversions
  2013-06-21 23:48 ` [PATCH 1/8] ARM: mm: LPAE: use phys_addr_t appropriately in p2v and v2p conversions Santosh Shilimkar
@ 2013-07-22 15:03   ` Nicolas Pitre
  0 siblings, 0 replies; 22+ messages in thread
From: Nicolas Pitre @ 2013-07-22 15:03 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 21 Jun 2013, Santosh Shilimkar wrote:

> Fix the remaining types used when converting back and forth between
> physical and virtual addresses.
> 
> Cc: Nicolas Pitre <nico@linaro.org>
> Cc: Will Deacon <will.deacon@arm.com>
> 
> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>

Acked-by: Nicolas Pitre <nico@linaro.org>



> ---
>  arch/arm/include/asm/memory.h |   22 ++++++++++++++++------
>  1 file changed, 16 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
> index 584786f..93376e4 100644
> --- a/arch/arm/include/asm/memory.h
> +++ b/arch/arm/include/asm/memory.h
> @@ -185,22 +185,32 @@ extern unsigned long __pv_phys_offset;
>  	: "=r" (to)					\
>  	: "r" (from), "I" (type))
>  
> -static inline unsigned long __virt_to_phys(unsigned long x)
> +static inline phys_addr_t __virt_to_phys(unsigned long x)
>  {
>  	unsigned long t;
>  	__pv_stub(x, t, "add", __PV_BITS_31_24);
>  	return t;
>  }
>  
> -static inline unsigned long __phys_to_virt(unsigned long x)
> +static inline unsigned long __phys_to_virt(phys_addr_t x)
>  {
>  	unsigned long t;
>  	__pv_stub(x, t, "sub", __PV_BITS_31_24);
>  	return t;
>  }
> +
>  #else
> -#define __virt_to_phys(x)	((x) - PAGE_OFFSET + PHYS_OFFSET)
> -#define __phys_to_virt(x)	((x) - PHYS_OFFSET + PAGE_OFFSET)
> +
> +static inline phys_addr_t __virt_to_phys(unsigned long x)
> +{
> +	return (phys_addr_t)x - PAGE_OFFSET + PHYS_OFFSET;
> +}
> +
> +static inline unsigned long __phys_to_virt(phys_addr_t x)
> +{
> +	return x - PHYS_OFFSET + PAGE_OFFSET;
> +}
> +
>  #endif
>  #endif
>  #endif /* __ASSEMBLY__ */
> @@ -238,14 +248,14 @@ static inline phys_addr_t virt_to_phys(const volatile void *x)
>  
>  static inline void *phys_to_virt(phys_addr_t x)
>  {
> -	return (void *)(__phys_to_virt((unsigned long)(x)));
> +	return (void *)__phys_to_virt(x);
>  }
>  
>  /*
>   * Drivers should NOT use these either.
>   */
>  #define __pa(x)			__virt_to_phys((unsigned long)(x))
> -#define __va(x)			((void *)__phys_to_virt((unsigned long)(x)))
> +#define __va(x)			((void *)__phys_to_virt((phys_addr_t)(x)))
>  #define pfn_to_kaddr(pfn)	__va((pfn) << PAGE_SHIFT)
>  
>  /*
> -- 
> 1.7.9.5
> 


* [PATCH 6/8] ARM: mm: LPAE: Correct virt_to_phys patching for 64 bit physical addresses
  2013-06-21 23:48 ` [PATCH 6/8] ARM: mm: LPAE: Correct virt_to_phys patching for 64 bit physical addresses Santosh Shilimkar
@ 2013-07-24  1:10   ` Nicolas Pitre
  2013-07-24  2:01     ` Santosh Shilimkar
  0 siblings, 1 reply; 22+ messages in thread
From: Nicolas Pitre @ 2013-07-24  1:10 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 21 Jun 2013, Santosh Shilimkar wrote:

> From: Sricharan R <r.sricharan@ti.com>
> 
> The current virt_to_phys patching mechanism does not work
> for 64 bit physical addresses. Note that constant used in add/sub
> instructions is encoded in to the last 8 bits of the opcode. So shift
> the _pv_offset constant by 24 to get it in to the correct place.
> 
> The v2p patching mechanism patches the higher 32bits of physical
> address with a constant. While this is correct, in those platforms
> where the lowmem addressable physical memory spans the 4GB boundary,
> a carry bit can be produced as a result of addition of lower 32bits.
> This has to be taken in to account and added in to the upper. The patched
> __pv_offset and va are added in lower 32bits, where __pv_offset can be
> in two's complement form when PA_START < VA_START and that can result
> in a false carry bit.
> 
> e.g PA = 0x80000000 VA = 0xC0000000
> __pv_offset = PA - VA = 0xC0000000 (2's complement)
> 
> So adding __pv_offset + VA should never result in a true overflow. So in
> order to differentiate between a true carry, an extra flag __pv_sign_flag
> is introduced.

I'm still wondering if this is worth bothering about.

If PA = 0x80000000 and VA = 0xC0000000 there will never be a real carry 
to propagate to the high word of the physical address as the VA space 
cannot be larger than 0x40000000.

So is there really a case where:

1) physical memory is crossing the 4GB mark, and ...

2) physical memory start address is higher than virtual memory start 
   address needing a carry due to the 32-bit add overflow?

It is easy to create (2) just by having a different user:kernel address 
space split.  However I wonder if (1) is likely.  Sure you need a memory 
alias in physical space to be able to boot, however you shouldn't need 
to address that memory alias via virtual addresses for any 
significant amount of time.  In fact, as soon as the MMU is turned on, 
there shouldn't be any issue simply using the final physical memory 
addresses right away.

What am I missing?


Nicolas


* [PATCH 6/8] ARM: mm: LPAE: Correct virt_to_phys patching for 64 bit physical addresses
  2013-07-24  1:10   ` Nicolas Pitre
@ 2013-07-24  2:01     ` Santosh Shilimkar
  2013-07-24  2:49       ` Nicolas Pitre
  0 siblings, 1 reply; 22+ messages in thread
From: Santosh Shilimkar @ 2013-07-24  2:01 UTC (permalink / raw)
  To: linux-arm-kernel

On Tuesday 23 July 2013 09:10 PM, Nicolas Pitre wrote:
> On Fri, 21 Jun 2013, Santosh Shilimkar wrote:
> 
>> From: Sricharan R <r.sricharan@ti.com>
>>
>> The current virt_to_phys patching mechanism does not work
>> for 64 bit physical addresses. Note that constant used in add/sub
>> instructions is encoded in to the last 8 bits of the opcode. So shift
>> the _pv_offset constant by 24 to get it in to the correct place.
>>
>> The v2p patching mechanism patches the higher 32bits of physical
>> address with a constant. While this is correct, in those platforms
>> where the lowmem addressable physical memory spans the 4GB boundary,
>> a carry bit can be produced as a result of addition of lower 32bits.
>> This has to be taken in to account and added in to the upper. The patched
>> __pv_offset and va are added in lower 32bits, where __pv_offset can be
>> in two's complement form when PA_START < VA_START and that can result
>> in a false carry bit.
>>
>> e.g PA = 0x80000000 VA = 0xC0000000
>> __pv_offset = PA - VA = 0xC0000000 (2's complement)
>>
>> So adding __pv_offset + VA should never result in a true overflow. So in
>> order to differentiate between a true carry, an extra flag __pv_sign_flag
>> is introduced.
>
First of all thanks for the review.
 
> I'm still wondering if this is worth bothering about.
> 
> If PA = 0x80000000 and VA = 0xC0000000 there will never be a real carry 
> to propagate to the high word of the physical address as the VA space 
> cannot be larger than 0x40000000.
> 
Agreed.

> So is there really a case where:
> 
> 1) physical memory is crossing the 4GB mark, and ...
> 
> 2) physical memory start address is higher than virtual memory start 
>    address needing a carry due to the 32-bit add overflow?
> 
Consider below two cases of memory layout apart from one mentioned
above where the carry bit is irrelevant as you rightly said.

1) PA = 0x8_0000_0000, VA= 0xC000_0000, absolute pv_offset = 0x7_4000_0000
2) PA = 0x2_8000_0000, VA= 0xC000_0000, absolute pv_offset = 0x1_C000_0000

In both of these cases there is a true carry which needs to be
considered.

> It is easy to create (2) just by having a different user:kernel address 
> space split.  However I wonder if (1) is likely.  Sure you need a memory 
> alias in physical space to be able to boot, however you shouldn't need 
> to address that memory alias via virtual addresses for any 
> significant amount of time.  In fact, as soon as the MMU is turned on, 
> there shouldn't be any issue simply using the final physical memory 
> addresses right away.
> 
> What am I missing?
> 
I thought about switching to the final address space along with
MMU enable at startup, but based on the earlier discussion (as RMK
suggested), to have such patching support in the least disruptive
manner, we could patch once at boot and then re-patch at switch
over. This also gives the flexibility to be able to patch code post
machine init. Hopefully I haven't missed your point here.

regards,
Santosh


* [PATCH 6/8] ARM: mm: LPAE: Correct virt_to_phys patching for 64 bit physical addresses
  2013-07-24  2:01     ` Santosh Shilimkar
@ 2013-07-24  2:49       ` Nicolas Pitre
  2013-07-24 11:50         ` Sricharan R
  0 siblings, 1 reply; 22+ messages in thread
From: Nicolas Pitre @ 2013-07-24  2:49 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 23 Jul 2013, Santosh Shilimkar wrote:

> On Tuesday 23 July 2013 09:10 PM, Nicolas Pitre wrote:
> > On Fri, 21 Jun 2013, Santosh Shilimkar wrote:
> > 
> >> From: Sricharan R <r.sricharan@ti.com>
> >>
> >> The current virt_to_phys patching mechanism does not work
> >> for 64 bit physical addresses. Note that constant used in add/sub
> >> instructions is encoded in to the last 8 bits of the opcode. So shift
> >> the _pv_offset constant by 24 to get it in to the correct place.
> >>
> >> The v2p patching mechanism patches the higher 32bits of physical
> >> address with a constant. While this is correct, in those platforms
> >> where the lowmem addressable physical memory spans the 4GB boundary,
> >> a carry bit can be produced as a result of addition of lower 32bits.
> >> This has to be taken in to account and added in to the upper. The patched
> >> __pv_offset and va are added in lower 32bits, where __pv_offset can be
> >> in two's complement form when PA_START < VA_START and that can result
> >> in a false carry bit.
> >>
> >> e.g PA = 0x80000000 VA = 0xC0000000
> >> __pv_offset = PA - VA = 0xC0000000 (2's complement)
> >>
> >> So adding __pv_offset + VA should never result in a true overflow. So in
> >> order to differentiate between a true carry, an extra flag __pv_sign_flag
> >> is introduced.
> >
> First of all thanks for the review.
>  
> > I'm still wondering if this is worth bothering about.
> > 
> > If PA = 0x80000000 and VA = 0xC0000000 there will never be a real carry 
> > to propagate to the high word of the physical address as the VA space 
> > cannot be larger than 0x40000000.
> > 
> Agreed.
> 
> > So is there really a case where:
> > 
> > 1) physical memory is crossing the 4GB mark, and ...
> > 
> > 2) physical memory start address is higher than virtual memory start 
> >    address needing a carry due to the 32-bit add overflow?
> > 
> Consider below two cases of memory layout apart from one mentioned
> above where the carry bit is irrelevant as you rightly said.
> 
> 1) PA = 0x8_0000_0000, VA= 0xC000_0000, absolute pv_offset = 0x7_4000_0000

This can be patched as:

	mov	phys_hi, #0x8
	add	phys_lo, virt, #0x40000000  @ carry ignored

> 2) PA = 0x2_8000_0000, VA= 0xC000_0000, absolute pv_offset = 0x1_C000_0000

	mov	phys_hi, #0x2
	add	phys_lo, virt, #0xc0000000  @ carry ignored

> In both of these cases there is a true carry which needs to be
> considered.

Well, not really.  However, if you have:

3) PA = 0x2_8000_0000, VA = 0x4000-0000, pv_offset = 0x2-4000-0000

... then you need:

	mov	phys_hi, #0x2
	adds	phys_lo, virt, #0x40000000
	adc	phys_hi, phys_hi, #0

My question is: how likely is this?

What is your actual physical memory start address?

If we really need to cope with the carry, then the __pv_sign_flag should 
instead be represented in pv_offset directly:

Taking example #2 above, that would be:

	mov	phys_hi, #0x1
	adds	phys_lo, virt, #0xc0000000
	adc	phys_hi, phys_hi, #0

If PA = 0x8000-0000 and VA = 0xc000-0000 then pv_offset is 
0xffff-ffff-c000-0000, meaning:

	mvn	phys_hi, #0
	adds	phys_lo, virt, #0xc0000000
	adc	phys_hi, phys_hi, #0

So that would require a special case in the patching code where a mvn 
with 0 is used if the high part of pv_offset is 0xffffffff.
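
A quick user-space check of that arithmetic for the PA = 0x8000-0000,
VA = 0xc000-0000 case, with the register pair and the carry simulated by
plain variables:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
	uint32_t virt      = 0xc0001000;		/* some lowmem virtual address */
	uint64_t pv_offset = 0xffffffffc0000000ULL;	/* 64-bit PA - VA */

	uint32_t phys_hi = (uint32_t)(pv_offset >> 32);	/* "mvn phys_hi, #0" -> 0xffffffff */
	uint64_t sum     = (uint64_t)virt + (uint32_t)pv_offset;	/* "adds" */
	uint32_t phys_lo = (uint32_t)sum;

	phys_hi += (uint32_t)(sum >> 32);		/* "adc phys_hi, phys_hi, #0" */

	printf("phys_hi 0x%x, phys_lo 0x%08x\n", phys_hi, phys_lo);	/* 0x0, 0x80001000 */
	return 0;
}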


Nicolas


* [PATCH 6/8] ARM: mm: LPAE: Correct virt_to_phys patching for 64 bit physical addresses
  2013-07-24  2:49       ` Nicolas Pitre
@ 2013-07-24 11:50         ` Sricharan R
  2013-07-24 12:07           ` Sricharan R
  0 siblings, 1 reply; 22+ messages in thread
From: Sricharan R @ 2013-07-24 11:50 UTC (permalink / raw)
  To: linux-arm-kernel

On Wednesday 24 July 2013 08:19 AM, Nicolas Pitre wrote:
> On Tue, 23 Jul 2013, Santosh Shilimkar wrote:
>
>> On Tuesday 23 July 2013 09:10 PM, Nicolas Pitre wrote:
>>> On Fri, 21 Jun 2013, Santosh Shilimkar wrote:
>>>
>>>> From: Sricharan R <r.sricharan@ti.com>
>>>>
>>>> The current virt_to_phys patching mechanism does not work
>>>> for 64 bit physical addresses. Note that constant used in add/sub
>>>> instructions is encoded in to the last 8 bits of the opcode. So shift
>>>> the _pv_offset constant by 24 to get it in to the correct place.
>>>>
>>>> The v2p patching mechanism patches the higher 32bits of physical
>>>> address with a constant. While this is correct, in those platforms
>>>> where the lowmem addressable physical memory spans the 4GB boundary,
>>>> a carry bit can be produced as a result of addition of lower 32bits.
>>>> This has to be taken in to account and added in to the upper. The patched
>>>> __pv_offset and va are added in lower 32bits, where __pv_offset can be
>>>> in two's complement form when PA_START < VA_START and that can result
>>>> in a false carry bit.
>>>>
>>>> e.g PA = 0x80000000 VA = 0xC0000000
>>>> __pv_offset = PA - VA = 0xC0000000 (2's complement)
>>>>
>>>> So adding __pv_offset + VA should never result in a true overflow. So in
>>>> order to differentiate between a true carry, an extra flag __pv_sign_flag
>>>> is introduced.
>> First of all thanks for the review.
>>  
>>> I'm still wondering if this is worth bothering about.
>>>
>>> If PA = 0x80000000 and VA = 0xC0000000 there will never be a real carry 
>>> to propagate to the high word of the physical address as the VA space 
>>> cannot be larger than 0x40000000.
>>>
>> Agreed.
>>
>>> So is there really a case where:
>>>
>>> 1) physical memory is crossing the 4GB mark, and ...
>>>
>>> 2) physical memory start address is higher than virtual memory start 
>>>    address needing a carry due to the 32-bit add overflow?
>>>
>> Consider below two cases of memory layout apart from one mentioned
>> above where the carry bit is irrelevant as you rightly said.
>>
>> 1) PA = 0x8_0000_0000, VA= 0xC000_0000, absolute pv_offset = 0x7_4000_0000
> This can be patched as:
>
> 	mov	phys_hi, #0x8
> 	add	phys_lo, virt, #0x40000000  @ carry ignored
>
>> 2) PA = 0x2_8000_0000, VA= 0xC000_0000, absolute pv_offset = 0x1_C000_0000
> 	mov	phys_hi, #0x2
> 	add	phys_lo, virt, #0xc0000000  @ carry ignored
>
>> In both of these cases there is a true carry which needs to be
>> considered.
> Well, not really.  However, if you have:
>
> 3) PA = 0x2_8000_0000, VA = 0x4000-0000, pv_offset = 0x2-4000-0000
>
> ... then you need:
>
> 	mov	phys_hi, #0x2
> 	adds	phys_lo, virt, #0x40000000
> 	adc	phys_hi, phys_hi, #0
>
> My question is: how likely is this?
>
> What is your actual physical memory start address?

 Agreed.  In our case we do not have the physical address crossing the
 4GB boundary, so ignoring the carry would have been OK. But we are
 also addressing the other case where it would really cross over.

> If we really need to cope with the carry, then the __pv_sign_flag should 
> instead be represented in pv_offset directly:
>
> Taking example #2 above, that would be:
>
> 	mov	phys_hi, #0x1
> 	adds	phys_lo, virt, #0xc0000000
> 	adc	phys_hi, phys_hi, #0
>
> If PA = 0x8000-0000 and VA = 0xc000-0000 then pv_offset is 
> 0xffff-ffff-c000-0000, meaning:
>
> 	mvn	phys_hi, #0
> 	add	phys_lo, virt, #0xc0000000
> 	adc	phys_hi, phys_hi, #0
>
> So that would require a special case in the patching code where a mvn 
> with 0 is used if the high part of pv_offset is 0xffffffff.
>
>
> Nicolas
Extending pv_offset to 64 bit is a really neat way. When PA > VA, pv_offset
is going to be the actual value and not 2's complement. Fine here.
When running from the higher physical address space, we will always fall here.

So the second case, where pv_offset is 0xffffffff... (PA < VA),
is a problem only when we run from the lower physical address. So we can safely
assume that the higher 32 bits of PA are '0' and stub it initially. In this way we
can avoid the special case.
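
As a quick numeric sketch of the above (nothing below is from the series
itself; the helper is only for illustration):

#include <stdint.h>

/* __pv_offset widened to 64 bit, as discussed above. */
static uint64_t pv_offset(uint64_t pa_start, uint64_t va_start)
{
	return pa_start - va_start;	/* wraps when PA_START < VA_START */
}

/*
 * pv_offset(0x280000000, 0xc0000000) == 0x00000001c0000000: high word 0x1,
 * so the stubbed mov gets the real high word.
 * pv_offset(0x080000000, 0xc0000000) == 0xffffffffc0000000: high word
 * 0xffffffff, i.e. the special case quoted above.
 */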

Regards,
 Sricharan

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 6/8] ARM: mm: LPAE: Correct virt_to_phys patching for 64 bit physical addresses
  2013-07-24 11:50         ` Sricharan R
@ 2013-07-24 12:07           ` Sricharan R
  2013-07-24 14:04             ` Santosh Shilimkar
  2013-07-24 20:21             ` Nicolas Pitre
  0 siblings, 2 replies; 22+ messages in thread
From: Sricharan R @ 2013-07-24 12:07 UTC (permalink / raw)
  To: linux-arm-kernel

On Wednesday 24 July 2013 05:20 PM, Sricharan R wrote:
> On Wednesday 24 July 2013 08:19 AM, Nicolas Pitre wrote:
>> On Tue, 23 Jul 2013, Santosh Shilimkar wrote:
>>
>>> On Tuesday 23 July 2013 09:10 PM, Nicolas Pitre wrote:
>>>> On Fri, 21 Jun 2013, Santosh Shilimkar wrote:
>>>>
>>>>> From: Sricharan R <r.sricharan@ti.com>
>>>>>
>>>>> The current phys_to_virt patching mechanism does not work
>>>>> for 64 bit physical addresses. Note that the constant used in add/sub
>>>>> instructions is encoded into the last 8 bits of the opcode. So shift
>>>>> the __pv_offset constant by 24 to get it into the correct place.
>>>>>
>>>>> The v2p patching mechanism patches the higher 32 bits of the physical
>>>>> address with a constant. While this is correct, on those platforms
>>>>> where the lowmem addressable physical memory spans across the 4GB boundary,
>>>>> a carry bit can be produced as a result of the addition of the lower 32 bits.
>>>>> This has to be taken into account and added into the upper word. The patched
>>>>> __pv_offset and va are added in the lower 32 bits, where __pv_offset can be
>>>>> in two's complement form when PA_START < VA_START, and that can result
>>>>> in a false carry bit.
>>>>>
>>>>> e.g. PA = 0x80000000 VA = 0xC0000000
>>>>> __pv_offset = PA - VA = 0xC0000000 (2's complement)
>>>>>
>>>>> So adding __pv_offset + VA should never result in a true overflow. In
>>>>> order to differentiate a true carry from a false one, an extra flag __pv_sign_flag
>>>>> is introduced.
>>> First of all thanks for the review.
>>>  
>>>> I'm still wondering if this is worth bothering about.
>>>>
>>>> If PA = 0x80000000 and VA = 0xC0000000 there will never be a real carry 
>>>> to propagate to the high word of the physical address as the VA space 
>>>> cannot be larger than 0x40000000.
>>>>
>>> Agreed.
>>>
>>>> So is there really a case where:
>>>>
>>>> 1) physical memory is crossing the 4GB mark, and ...
>>>>
>>>> 2) physical memory start address is higher than virtual memory start 
>>>>    address needing a carry due to the 32-bit add overflow?
>>>>
>>> Consider the below two cases of memory layout, apart from the one mentioned
>>> above where the carry bit is irrelevant, as you rightly said.
>>>
>>> 1) PA = 0x8_0000_0000, VA= 0xC000_0000, absolute pv_offset = 0x7_4000_0000
>> This can be patched as:
>>
>> 	mov	phys_hi, #0x8
>> 	add	phys_lo, virt, #0x40000000  @ carry ignored
>>
>>> 2) PA = 0x2_8000_0000, VA= 0xC000_0000, absolute pv_offset = 0x1_C000_0000
>> 	mov	phys_hi, #0x2
>> 	add	phys_lo, virt, #0xc0000000  @ carry ignored
>>
>>> In both of these cases there is a true carry which needs to be
>>> considered.
>> Well, not really.  However, if you have:
>>
>> 3) PA = 0x2_8000_0000, VA = 0x4000-0000, pv_offset = 0x2-4000-0000
>>
>> ... then you need:
>>
>> 	mov	phys_hi, #0x2
>> 	adds	phys_lo, virt, #0x40000000
>> 	adc	phys_hi, phys_hi, #0
>>
>> My question is: how likely is this?
>>
>> What is your actual physical memory start address?
>  Agreed.  In our case we do not have the physical address crossing the
>   4GB boundary, so ignoring the carry would have been OK. But we are
>  also addressing the other case where it really would cross over.
>
>> If we really need to cope with the carry, then the __pv_sign_flag should 
>> instead be represented in pv_offset directly:
>>
>> Taking example #2 above, that would be:
>>
>> 	mov	phys_hi, #0x1
>> 	adds	phys_lo, virt, #0xc0000000
>> 	adc	phys_hi, phys_hi, #0
>>
>> If PA = 0x8000-0000 and VA = 0xc000-0000 then pv_offset is 
>> 0xffff-ffff-c000-0000, meaning:
>>
>> 	mvn	phys_hi, #0
>> 	adds	phys_lo, virt, #0xc0000000
>> 	adc	phys_hi, phys_hi, #0
>>
>> So that would require a special case in the patching code where a mvn 
>> with 0 is used if the high part of pv_offset is 0xffffffff.
>>
>>
>> Nicolas
> Extending pv_offset to 64 bit is a really neat way. When PA > VA, pv_offset
> is going to be the actual value and not 2's complement. Fine here.
> When running from the higher physical address space, we will always fall here.
>
> So the second case, where pv_offset is 0xffffffff... (PA < VA),
> is a problem only when we run from the lower physical address. So we can safely
> assume that the higher 32 bits of PA are '0' and stub it initially. In this way we
> can avoid the special case.
   Sorry, I missed one more point here. In the second case, we should patch it with
   0x0 when (PA > VA) and with 0xffffffff when (PA < VA).
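
   In other words, with a sub-4GB PA the patched sequence would end up as one
   of these two forms (a sketch only; the 0xE000_0000 start is made up here
   just to get a positive offset, the second example is the one quoted above):

	@ PA_START > VA_START: PA = 0xE000_0000, VA = 0xC000_0000, pv_offset = 0x0_2000_0000
	mov	phys_hi, #0
	adds	phys_lo, virt, #0x20000000
	adc	phys_hi, phys_hi, #0

	@ PA_START < VA_START: PA = 0x8000_0000, VA = 0xC000_0000, pv_offset = 0xffffffff_c0000000
	mvn	phys_hi, #0			@ phys_hi = 0xffffffff
	adds	phys_lo, virt, #0xc0000000
	adc	phys_hi, phys_hi, #0		@ the carry wraps phys_hi back to 0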

Regards,
 Sricharan

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 6/8] ARM: mm: LPAE: Correct virt_to_phys patching for 64 bit physical addresses
  2013-07-24 12:07           ` Sricharan R
@ 2013-07-24 14:04             ` Santosh Shilimkar
  2013-07-24 20:21             ` Nicolas Pitre
  1 sibling, 0 replies; 22+ messages in thread
From: Santosh Shilimkar @ 2013-07-24 14:04 UTC (permalink / raw)
  To: linux-arm-kernel

On Wednesday 24 July 2013 08:07 AM, Sricharan R wrote:
> On Wednesday 24 July 2013 05:20 PM, Sricharan R wrote:
>> On Wednesday 24 July 2013 08:19 AM, Nicolas Pitre wrote:
>>> On Tue, 23 Jul 2013, Santosh Shilimkar wrote:
>>>
>>>> On Tuesday 23 July 2013 09:10 PM, Nicolas Pitre wrote:
>>>>> On Fri, 21 Jun 2013, Santosh Shilimkar wrote:
>>>>>
>>>>>> From: Sricharan R <r.sricharan@ti.com>
>>>>>>
>>>>>> The current phys_to_virt patching mechanism does not work
>>>>>> for 64 bit physical addresses. Note that the constant used in add/sub
>>>>>> instructions is encoded into the last 8 bits of the opcode. So shift
>>>>>> the __pv_offset constant by 24 to get it into the correct place.
>>>>>>
>>>>>> The v2p patching mechanism patches the higher 32 bits of the physical
>>>>>> address with a constant. While this is correct, on those platforms
>>>>>> where the lowmem addressable physical memory spans across the 4GB boundary,
>>>>>> a carry bit can be produced as a result of the addition of the lower 32 bits.
>>>>>> This has to be taken into account and added into the upper word. The patched
>>>>>> __pv_offset and va are added in the lower 32 bits, where __pv_offset can be
>>>>>> in two's complement form when PA_START < VA_START, and that can result
>>>>>> in a false carry bit.
>>>>>>
>>>>>> e.g. PA = 0x80000000 VA = 0xC0000000
>>>>>> __pv_offset = PA - VA = 0xC0000000 (2's complement)
>>>>>>
>>>>>> So adding __pv_offset + VA should never result in a true overflow. In
>>>>>> order to differentiate a true carry from a false one, an extra flag __pv_sign_flag
>>>>>> is introduced.
>>>> First of all thanks for the review.
>>>>  
>>>>> I'm still wondering if this is worth bothering about.
>>>>>
>>>>> If PA = 0x80000000 and VA = 0xC0000000 there will never be a real carry 
>>>>> to propagate to the high word of the physical address as the VA space 
>>>>> cannot be larger than 0x40000000.
>>>>>
>>>> Agreed.
>>>>
>>>>> So is there really a case where:
>>>>>
>>>>> 1) physical memory is crossing the 4GB mark, and ...
>>>>>
>>>>> 2) physical memory start address is higher than virtual memory start 
>>>>>    address needing a carry due to the 32-bit add overflow?
>>>>>
>>>> Consider the below two cases of memory layout, apart from the one mentioned
>>>> above where the carry bit is irrelevant, as you rightly said.
>>>>
>>>> 1) PA = 0x8_0000_0000, VA= 0xC000_0000, absolute pv_offset = 0x7_4000_0000
>>> This can be patched as:
>>>
>>> 	mov	phys_hi, #0x8
>>> 	add	phys_lo, virt, #0x40000000  @ carry ignored
>>>
>>>> 2) PA = 0x2_8000_0000, VA= 0xC000_0000, absolute pv_offset = 0x1_C000_0000
>>> 	mov	phys_hi, #0x2
>>> 	add	phys_lo, virt, #0xc0000000  @ carry ignored
>>>
>>>> In both of these cases there is a true carry which needs to be
>>>> considered.
>>> Well, not really.  However, if you have:
>>>
>>> 3) PA = 0x2_8000_0000, VA = 0x4000-0000, pv_offset = 0x2-4000-0000
>>>
>>> ... then you need:
>>>
>>> 	mov	phys_hi, #0x2
>>> 	adds	phys_lo, virt, #0x40000000
>>> 	adc	phys_hi, phys_hi, #0
>>>
>>> My question is: how likely is this?
>>>
>>> What is your actual physical memory start address?
>>  Agreed.  In our case we do not have the physical address crossing the
>>   4GB boundary, so ignoring the carry would have been OK. But we are
>>  also addressing the other case where it really would cross over.
>>
Yes. We don't need to worry about this case. We can get to this with a kernel:user
split, but nobody uses that split, so we can safely ignore this case.

>>> If we really need to cope with the carry, then the __pv_sign_flag should 
>>> instead be represented in pv_offset directly:
>>>
>>> Taking example #2 above, that would be:
>>>
>>> 	mov	phys_hi, #0x1
>>> 	adds	phys_lo, virt, #0xc0000000
>>> 	adc	phys_hi, phys_hi, #0
>>>
>>> If PA = 0x8000-0000 and VA = 0xc000-0000 then pv_offset is 
>>> 0xffff-ffff-c000-0000, meaning:
>>>
>>> 	mvn	phys_hi, #0
>>> 	adds	phys_lo, virt, #0xc0000000
>>> 	adc	phys_hi, phys_hi, #0
>>>
>>> So that would require a special case in the patching code where a mvn 
>>> with 0 is used if the high part of pv_offset is 0xffffffff.
>>>
>>>
>>> Nicolas
>> Extending pv_offset to 64 bit is a really neat way. When PA > VA, pv_offset
>> is going to be the actual value and not 2's complement. Fine here.
>> When running from the higher physical address space, we will always fall here.
>>
>> So the second case, where pv_offset is 0xffffffff... (PA < VA),
>> is a problem only when we run from the lower physical address. So we can safely
>> assume that the higher 32 bits of PA are '0' and stub it initially. In this way we
>> can avoid the special case.
>    Sorry, I missed one more point here. In the second case, we should patch it with
>    0x0 when (PA > VA) and with 0xffffffff when (PA < VA).
> 
As Sricharan said, we agree with your suggestion for the special case patching. It
will be either 0x0 or 0xffffffff, so it is easy to take care of. We will try out the
suggested changes.

Thanks a lot again.

Regards,
Santosh

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 6/8] ARM: mm: LPAE: Correct virt_to_phys patching for 64 bit physical addresses
  2013-07-24 12:07           ` Sricharan R
  2013-07-24 14:04             ` Santosh Shilimkar
@ 2013-07-24 20:21             ` Nicolas Pitre
  2013-07-25  3:49               ` Sricharan R
  1 sibling, 1 reply; 22+ messages in thread
From: Nicolas Pitre @ 2013-07-24 20:21 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, 24 Jul 2013, Sricharan R wrote:

> On Wednesday 24 July 2013 05:20 PM, Sricharan R wrote:
> > On Wednesday 24 July 2013 08:19 AM, Nicolas Pitre wrote:
> >> On Tue, 23 Jul 2013, Santosh Shilimkar wrote:
> >>
> >>> On Tuesday 23 July 2013 09:10 PM, Nicolas Pitre wrote:
> >>>> On Fri, 21 Jun 2013, Santosh Shilimkar wrote:
> >>>>
> >>>>> From: Sricharan R <r.sricharan@ti.com>
> >>>>>
> >>>>> The current phys_to_virt patching mechanism does not work
> >>>>> for 64 bit physical addresses. Note that the constant used in add/sub
> >>>>> instructions is encoded into the last 8 bits of the opcode. So shift
> >>>>> the __pv_offset constant by 24 to get it into the correct place.
> >>>>>
> >>>>> The v2p patching mechanism patches the higher 32 bits of the physical
> >>>>> address with a constant. While this is correct, on those platforms
> >>>>> where the lowmem addressable physical memory spans across the 4GB boundary,
> >>>>> a carry bit can be produced as a result of the addition of the lower 32 bits.
> >>>>> This has to be taken into account and added into the upper word. The patched
> >>>>> __pv_offset and va are added in the lower 32 bits, where __pv_offset can be
> >>>>> in two's complement form when PA_START < VA_START, and that can result
> >>>>> in a false carry bit.
> >>>>>
> >>>>> e.g. PA = 0x80000000 VA = 0xC0000000
> >>>>> __pv_offset = PA - VA = 0xC0000000 (2's complement)
> >>>>>
> >>>>> So adding __pv_offset + VA should never result in a true overflow. In
> >>>>> order to differentiate a true carry from a false one, an extra flag __pv_sign_flag
> >>>>> is introduced.
> >>> First of all thanks for the review.
> >>>  
> >>>> I'm still wondering if this is worth bothering about.
> >>>>
> >>>> If PA = 0x80000000 and VA = 0xC0000000 there will never be a real carry 
> >>>> to propagate to the high word of the physical address as the VA space 
> >>>> cannot be larger than 0x40000000.
> >>>>
> >>> Agreed.
> >>>
> >>>> So is there really a case where:
> >>>>
> >>>> 1) physical memory is crossing the 4GB mark, and ...
> >>>>
> >>>> 2) physical memory start address is higher than virtual memory start 
> >>>>    address needing a carry due to the 32-bit add overflow?
> >>>>
> >>> Consider the below two cases of memory layout, apart from the one mentioned
> >>> above where the carry bit is irrelevant, as you rightly said.
> >>>
> >>> 1) PA = 0x8_0000_0000, VA= 0xC000_0000, absolute pv_offset = 0x7_4000_0000
> >> This can be patched as:
> >>
> >> 	mov	phys_hi, #0x8
> >> 	add	phys_lo, virt, #0x40000000  @ carry ignored
> >>
> >>> 2) PA = 0x2_8000_0000, VA= 0xC000_0000, absolute pv_offset = 0x1_C000_0000
> >> 	mov	phys_hi, #0x2
> >> 	add	phys_lo, virt, #0xc0000000  @ carry ignored
> >>
> >>> In both of these cases there is a true carry which needs to be
> >>> considered.
> >> Well, not really.  However, if you have:
> >>
> >> 3) PA = 0x2_8000_0000, VA = 0x4000-0000, pv_offset = 0x2-4000-0000
> >>
> >> ... then you need:
> >>
> >> 	mov	phys_hi, #0x2
> >> 	adds	phys_lo, virt, #0x40000000
> >> 	adc	phys_hi, phys_hi, #0
> >>
> >> My question is: how likely is this?
> >>
> >> What is your actual physical memory start address?
> >  Agreed.  In our case we do not have the physical address crossing the
> >   4GB boundary, so ignoring the carry would have been OK. But we are
> >  also addressing the other case where it really would cross over.
> >
> >> If we really need to cope with the carry, then the __pv_sign_flag should 
> >> instead be represented in pv_offset directly:
> >>
> >> Taking example #2 above, that would be:
> >>
> >> 	mov	phys_hi, #0x1
> >> 	adds	phys_lo, virt, #0xc0000000
> >> 	adc	phys_hi, phys_hi, #0
> >>
> >> If PA = 0x8000-0000 and VA = 0xc000-0000 then pv_offset is 
> >> 0xffff-ffff-c000-0000, meaning:
> >>
> >> 	mvn	phys_hi, #0
> >> 	adds	phys_lo, virt, #0xc0000000
> >> 	adc	phys_hi, phys_hi, #0
> >>
> >> So that would require a special case in the patching code where a mvn 
> >> with 0 is used if the high part of pv_offset is 0xffffffff.
> >>
> >>
> >> Nicolas
> > Extending pv_offset to 64 bit is a really neat way. When PA > VA, pv_offset
> > is going to be the actual value and not 2's complement. Fine here.
> > When running from the higher physical address space, we will always fall here.
> >
> > So the second case, where pv_offset is 0xffffffff... (PA < VA),
> > is a problem only when we run from the lower physical address. So we can safely
> > assume that the higher 32 bits of PA are '0' and stub it initially. In this way we
> > can avoid the special case.
>    Sorry, I missed one more point here. In the second case, we should patch it with
>    0x0 when (PA > VA) and with 0xffffffff when (PA < VA).

I don't think I follow you here.

Let's assume:

phys_addr_t __pv_offset = PHYS_START - VIRT_START;

If PA = 0x0-8000-0000 and VA = 0xc000-0000 then
__pv_offset = 0xffff-ffff-c000-0000.

If PA = 0x2-8000-0000 and VA = 0xc000-0000 then
__pv_offset = 0x1-c000-0000.

So the __virt_to_phys() assembly stub could look like:

static inline phys_addr_t __virt_to_phys(unsigned long x)
{
	phys_addr_t t;

	if (sizeof(phys_addr_t) == 4) {
		__pv_stub(x, t, "add", __PV_BITS_31_24);
	} else {
		__pv_movhi_stub(t);
		__pv_add_carry_stub(x, t);
	}

	return t;
}

And...

#define __pv_movhi_stub(y) \
	__asm__("@ __pv_movhi_stub\n" \
	"1:	mov	%R0, %1\n" \
	"	.pushsection .pv_table,\"a\"\n" \
	"	.long	1b\n" \
	"	.popsection\n" \
	: "=r" (y) \
	: "I" (__PV_BITS_8_0))

#define __pv_add_carry_stub(x, y) \
	__asm__("@ __pv_add_carry_stub\n" \
	"1:	adds	%Q0, %1, %2\n" \
	"	adc	%R0, %R0, #0\n" \
	"	.pushsection .pv_table,\"a\"\n" \
	"	.long	1b\n" \
	"	.popsection\n" \
	: "+r" (y) \
	: "r" (x), "I" (__PV_BITS_31_24) \
	: "cc")

The stub bits such as __PV_BITS_8_0 can be augmented with more bits in 
the middle to determine the type of fixup needed.  The fixup code would 
determine the shift needed on the value, and whether or not the low or 
high word of __pv_offset should be used according to those bits.

Then, in the case where a mov is patched, you need to check if the high 
word of __pv_offset is 0xffffffff and if so the mov should be turned 
into a "mvn rn, #0".

And there you are with all possible cases handled.
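
To make the fixup side of this concrete, here is a minimal C sketch of the
decision (purely illustrative: it assumes ARM, not Thumb, encodings, a
16MB-aligned low word of __pv_offset, and stub immediates shaped like the
__PV_BITS_8_0 / __PV_BITS_31_24 markers above; the function name and field
tests are made up for this sketch):

#include <stdint.h>

/* Patch one pv_table entry once the 64-bit __pv_offset is known. */
static void fixup_pv_insn(uint32_t *insn, uint64_t pv_offset)
{
	uint32_t op = *insn;
	uint32_t lo = (uint32_t)pv_offset;
	uint32_t hi = (uint32_t)(pv_offset >> 32);

	if (op & 0xf00) {
		/*
		 * Rotated stub immediate (__PV_BITS_31_24 style): this is the
		 * add/adds on the low word.  Drop the top byte of the low word
		 * into imm8; the rotate field already places it at bits 31..24.
		 */
		op = (op & ~0xffu) | (lo >> 24);
	} else if (hi == 0xffffffff) {
		/*
		 * mov stub for the high word, but the offset is negative:
		 * turn "mov rd, #x" into "mvn rd, #0" by setting opcode bit 22
		 * and clearing the immediate, so rd ends up as 0xffffffff.
		 */
		op = (op | (1u << 22)) & ~0xfffu;
	} else {
		/* mov stub for the high word (assumed to fit in 8 bits). */
		op = (op & ~0xffu) | hi;
	}
	*insn = op;
}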


Nicolas

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 6/8] ARM: mm: LPAE: Correct virt_to_phys patching for 64 bit physical addresses
  2013-07-24 20:21             ` Nicolas Pitre
@ 2013-07-25  3:49               ` Sricharan R
  2013-07-25 18:53                 ` Santosh Shilimkar
  0 siblings, 1 reply; 22+ messages in thread
From: Sricharan R @ 2013-07-25  3:49 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Nicolas,

On Thursday 25 July 2013 01:51 AM, Nicolas Pitre wrote:
> On Wed, 24 Jul 2013, Sricharan R wrote:
>
>> On Wednesday 24 July 2013 05:20 PM, Sricharan R wrote:
>>> On Wednesday 24 July 2013 08:19 AM, Nicolas Pitre wrote:
>>>> On Tue, 23 Jul 2013, Santosh Shilimkar wrote:
>>>>
>>>>> On Tuesday 23 July 2013 09:10 PM, Nicolas Pitre wrote:
>>>>>> On Fri, 21 Jun 2013, Santosh Shilimkar wrote:
>>>>>>
>>>>>>> From: Sricharan R <r.sricharan@ti.com>
>>>>>>>
>>>>>>> The current phys_to_virt patching mechanism does not work
>>>>>>> for 64 bit physical addresses. Note that the constant used in add/sub
>>>>>>> instructions is encoded into the last 8 bits of the opcode. So shift
>>>>>>> the __pv_offset constant by 24 to get it into the correct place.
>>>>>>>
>>>>>>> The v2p patching mechanism patches the higher 32 bits of the physical
>>>>>>> address with a constant. While this is correct, on those platforms
>>>>>>> where the lowmem addressable physical memory spans across the 4GB boundary,
>>>>>>> a carry bit can be produced as a result of the addition of the lower 32 bits.
>>>>>>> This has to be taken into account and added into the upper word. The patched
>>>>>>> __pv_offset and va are added in the lower 32 bits, where __pv_offset can be
>>>>>>> in two's complement form when PA_START < VA_START, and that can result
>>>>>>> in a false carry bit.
>>>>>>>
>>>>>>> e.g. PA = 0x80000000 VA = 0xC0000000
>>>>>>> __pv_offset = PA - VA = 0xC0000000 (2's complement)
>>>>>>>
>>>>>>> So adding __pv_offset + VA should never result in a true overflow. In
>>>>>>> order to differentiate a true carry from a false one, an extra flag __pv_sign_flag
>>>>>>> is introduced.
>>>>> First of all thanks for the review.
>>>>>  
>>>>>> I'm still wondering if this is worth bothering about.
>>>>>>
>>>>>> If PA = 0x80000000 and VA = 0xC0000000 there will never be a real carry 
>>>>>> to propagate to the high word of the physical address as the VA space 
>>>>>> cannot be larger than 0x40000000.
>>>>>>
>>>>> Agreed.
>>>>>
>>>>>> So is there really a case where:
>>>>>>
>>>>>> 1) physical memory is crossing the 4GB mark, and ...
>>>>>>
>>>>>> 2) physical memory start address is higher than virtual memory start 
>>>>>>    address needing a carry due to the 32-bit add overflow?
>>>>>>
>>>>> Consider the below two cases of memory layout, apart from the one mentioned
>>>>> above where the carry bit is irrelevant, as you rightly said.
>>>>>
>>>>> 1) PA = 0x8_0000_0000, VA= 0xC000_0000, absolute pv_offset = 0x7_4000_0000
>>>> This can be patched as:
>>>>
>>>> 	mov	phys_hi, #0x8
>>>> 	add	phys_lo, virt, #0x40000000  @ carry ignored
>>>>
>>>>> 2) PA = 0x2_8000_0000, VA= 0xC000_0000, absolute pv_offset = 0x1_C000_0000
>>>> 	mov	phys_hi, #0x2
>>>> 	add	phys_lo, virt, #0xc0000000  @ carry ignored
>>>>
>>>>> In both of these cases there is a true carry which needs to be
>>>>> considered.
>>>> Well, not really.  However, if you have:
>>>>
>>>> 3) PA = 0x2_8000_0000, VA = 0x4000-0000, pv_offset = 0x2-4000-0000
>>>>
>>>> ... then you need:
>>>>
>>>> 	mov	phys_hi, #0x2
>>>> 	adds	phys_lo, virt, #0x40000000
>>>> 	adc	phys_hi, phys_hi, #0
>>>>
>>>> My question is: how likely is this?
>>>>
>>>> What is your actual physical memory start address?
>>>  Agreed.  In our case we do not have the physical address crossing the
>>>   4GB boundary, so ignoring the carry would have been OK. But we are
>>>  also addressing the other case where it really would cross over.
>>>
>>>> If we really need to cope with the carry, then the __pv_sign_flag should 
>>>> instead be represented in pv_offset directly:
>>>>
>>>> Taking example #2 above, that would be:
>>>>
>>>> 	mov	phys_hi, #0x1
>>>> 	adds	phys_lo, virt, #0xc0000000
>>>> 	adc	phys_hi, phys_hi, #0
>>>>
>>>> If PA = 0x8000-0000 and VA = 0xc000-0000 then pv_offset is 
>>>> 0xffff-ffff-c000-0000, meaning:
>>>>
>>>> 	mvn	phys_hi, #0
>>>> 	adds	phys_lo, virt, #0xc0000000
>>>> 	adc	phys_hi, phys_hi, #0
>>>>
>>>> So that would require a special case in the patching code where a mvn 
>>>> with 0 is used if the high part of pv_offset is 0xffffffff.
>>>>
>>>>
>>>> Nicolas
>>> Extending pv_offset to 64 bit is a really neat way. When PA > VA, pv_offset
>>> is going to be the actual value and not 2's complement. Fine here.
>>> When running from the higher physical address space, we will always fall here.
>>>
>>> So the second case, where pv_offset is 0xffffffff... (PA < VA),
>>> is a problem only when we run from the lower physical address. So we can safely
>>> assume that the higher 32 bits of PA are '0' and stub it initially. In this way we
>>> can avoid the special case.
>>    Sorry, I missed one more point here. In the second case, we should patch it with
>>    0x0 when (PA > VA) and with 0xffffffff when (PA < VA).
> I don't think I follow you here.
>
> Let's assume:
>
> phys_addr_t __pv_offset = PHYS_START - VIRT_START;
>
> If PA = 0x0-8000-0000 and VA = 0xc000-0000 then
> __pv_offset = 0xffff-ffff-c000-0000.
>
> If PA = 0x2-8000-0000 and VA = 0xc000-0000 then
> __pv_offset = 0x1-c000-0000.
>
> So the __virt_to_phys() assembly stub could look like:
>
> static inline phys_addr_t __virt_to_phys(unsigned long x)
> {
> 	phys_addr_t t;
>
> 	if (sizeof(phys_addr_t) == 4) {
> 		__pv_stub(x, t, "add", __PV_BITS_31_24);
> 	} else {
> 		__pv_movhi_stub(t);
> 		__pv_add_carry_stub(x, t);
> 	}
>
> 	return t;
> }
>
> And...
>
> #define __pv_movhi_stub(y) \
> 	__asm__("@ __pv_movhi_stub\n" \
> 	"1:	mov	%R0, %1\n" \
> 	"	.pushsection .pv_table,\"a\"\n" \
> 	"	.long	1b\n" \
> 	"	.popsection\n" \
> 	: "=r" (y) \
> 	: "I" (__PV_BITS_8_0))
>
> #define __pv_add_carry_stub(x, y) \
> 	__asm__("@ __pv_add_carry_stub\n" \
> 	"1:	adds	%Q0, %1, %2\n" \
> 	"	adc	%R0, %R0, #0\n" \
> 	"	.pushsection .pv_table,\"a\"\n" \
> 	"	.long	1b\n" \
> 	"	.popsection\n" \
> 	: "+r" (y) \
> 	: "r" (x), "I" (__PV_BITS_31_24) \
> 	: "cc")
>
> The stub bits such as __PV_BITS_8_0 can be augmented with more bits in 
> the middle to determine the type of fixup needed.  The fixup code would 
> determine the shift needed on the value, and whether or not the low or 
> high word of __pv_offset should be used according to those bits.
>
> Then, in the case where a mov is patched, you need to check if the high 
> word of __pv_offset is 0xffffffff and if so the mov should be turned 
> into a "mvn rn, #0".
>
> And there you are with all possible cases handled.
>
>
> Nicolas
  Thanks, and you have given the full details here.

  Sorry if I was not clear in my previous response.

 1)  When I said the special case can be avoided, I meant that
      we need not differentiate the 0xffffffff case inside the
      __virt_to_phys macro, but can handle it at the time of patching.
      Your above code makes that clear.

 2) I would have ended up creating separate tables for the 'mov' and 'add'
      cases. But again, thanks to your above idea of augmenting the
      __PV_BITS, we can find out the fixup type at run time. And 'mvn' would
      be needed for moving '0xffffffff'. Now I can get rid
     of the separate section that I created for 'mov' in my previous
     version.

     I will make the suggested changes and come back.
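
     For instance, the two stub immediates could be chosen so that the fixup
     can tell them apart from the instruction alone (illustrative values,
     only to show how a single pv_table can serve both stubs):

	#define __PV_BITS_31_24	0x81000000	/* add/adds stub: imm8=0x81 ror 8, rotate field != 0 */
	#define __PV_BITS_8_0	0x81		/* mov stub for the high word: rotate field == 0 */

     The fixup can then key off the rotate nibble of the stubbed immediate to
     pick the shift and the half of __pv_offset, and a 0xffffffff high word is
     handled by rewriting the mov as "mvn rd, #0".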

Regards,
 Sricharan

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 6/8] ARM: mm: LPAE: Correct virt_to_phys patching for 64 bit physical addresses
  2013-07-25  3:49               ` Sricharan R
@ 2013-07-25 18:53                 ` Santosh Shilimkar
  0 siblings, 0 replies; 22+ messages in thread
From: Santosh Shilimkar @ 2013-07-25 18:53 UTC (permalink / raw)
  To: linux-arm-kernel

On Wednesday 24 July 2013 11:49 PM, Sricharan R wrote:
> Hi Nicolas,
> 
> On Thursday 25 July 2013 01:51 AM, Nicolas Pitre wrote:

[..]

>> I don't think I follow you here.
>>
>> Let's assume:
>>
>> phys_addr_t __pv_offset = PHYS_START - VIRT_START;
>>
>> If PA = 0x0-8000-0000 and VA = 0xc000-0000 then
>> __pv_offset = 0xffff-ffff-c000-0000.
>>
>> If PA = 0x2-8000-0000 and VA = 0xc000-0000 then
>> __pv_offset = 0x1-c000-0000.
>>
>> So the __virt_to_phys() assembly stub could look like:
>>
>> static inline phys_addr_t __virt_to_phys(unsigned long x)
>> {
>> 	phys_addr_t t;
>>
>> 	if (sizeof(phys_addr_t) == 4) {
>> 		__pv_stub(x, t, "add", __PV_BITS_31_24);
>> 	} else {
>> 		__pv_movhi_stub(t);
>> 		__pv_add_carry_stub(x, t);
>> 	}
>>
>> 	return t;
>> }
>>
>> And...
>>
>> #define __pv_movhi_stub(y) \
>> 	__asm__("@ __pv_movhi_stub\n" \
>> 	"1:	mov	%R0, %1\n" \
>> 	"	.pushsection .pv_table,\"a\"\n" \
>> 	"	.long	1b\n" \
>> 	"	.popsection\n" \
>> 	: "=r" (y) \
>> 	: "I" (__PV_BITS_8_0))
>>
>> #define __pv_add_carry_stub(x, y) \
>> 	__asm__("@ __pv_add_carry_stub\n" \
>> 	"1:	adds	%Q0, %1, %2\n" \
>> 	"	adc	%R0, %R0, #0\n" \
>> 	"	.pushsection .pv_table,\"a\"\n" \
>> 	"	.long	1b\n" \
>> 	"	.popsection\n" \
>> 	: "+r" (y) \
>> 	: "r" (x), "I" (__PV_BITS_31_24) \
>> 	: "cc")
>>
>> The stub bits such as __PV_BITS_8_0 can be augmented with more bits in 
>> the middle to determine the type of fixup needed.  The fixup code would 
>> determine the shift needed on the value, and whether or not the low or 
>> high word of __pv_offset should be used according to those bits.
>>
>> Then, in the case where a mov is patched, you need to check if the high 
>> word of __pv_offset is 0xffffffff and if so the mov should be turned 
>> into a "mvn rn, #0".
>>
>> And there you are with all possible cases handled.
>>
Brilliant !!
We knew you would have some tricks and a better way. We were
not convinced with the extra stub for 'mov' but also didn't
have an idea how to avoid it.

>   Thanks, and you have given the full details here.
> 
>   Sorry if I was not clear in my previous response.
> 
>  1)  When I said the special case can be avoided, I meant that
>       we need not differentiate the 0xffffffff case inside the
>       __virt_to_phys macro, but can handle it at the time of patching.
>       Your above code makes that clear.
>  
>  2) I would have ended up creating separate tables for the 'mov' and 'add'
>       cases. But again, thanks to your above idea of augmenting the
>       __PV_BITS, we can find out the fixup type at run time. And 'mvn' would
>       be needed for moving '0xffffffff'. Now I can get rid
>      of the separate section that I created for 'mov' in my previous
>      version.
> 
We also get rid of calling separate patching for modules as well
as late patching. Overall the patch-set becomes smaller and
simpler. Thanks for the help.

Regards,
Santosh

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2013-07-25 18:53 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-06-21 23:48 [PATCH 0/8] ARM: mm: Extend the runtime patch stub for PAE systems Santosh Shilimkar
2013-06-21 23:48 ` [PATCH 1/8] ARM: mm: LPAE: use phys_addr_t appropriately in p2v and v2p conversions Santosh Shilimkar
2013-07-22 15:03   ` Nicolas Pitre
2013-06-21 23:48 ` [PATCH 2/8] ARM: mm: Introduce virt_to_idmap() with an arch hook Santosh Shilimkar
2013-06-21 23:48 ` [PATCH 3/8] ARM: mm: Move the idmap print to appropriate place in the code Santosh Shilimkar
2013-06-21 23:48 ` [PATCH 4/8] ARM: mm: Pass the constant as an argument to fixup_pv_table() Santosh Shilimkar
2013-06-21 23:48 ` [PATCH 5/8] ARM: mm: Add __pv_stub_mov to patch MOV instruction Santosh Shilimkar
2013-06-21 23:48 ` [PATCH 6/8] ARM: mm: LPAE: Correct virt_to_phys patching for 64 bit physical addresses Santosh Shilimkar
2013-07-24  1:10   ` Nicolas Pitre
2013-07-24  2:01     ` Santosh Shilimkar
2013-07-24  2:49       ` Nicolas Pitre
2013-07-24 11:50         ` Sricharan R
2013-07-24 12:07           ` Sricharan R
2013-07-24 14:04             ` Santosh Shilimkar
2013-07-24 20:21             ` Nicolas Pitre
2013-07-25  3:49               ` Sricharan R
2013-07-25 18:53                 ` Santosh Shilimkar
2013-06-21 23:48 ` [PATCH 7/8] ARM: mm: Recreate kernel mappings in early_paging_init() Santosh Shilimkar
2013-06-21 23:48 ` [PATCH 8/8] ARM: keystone: Switch over to high physical address range Santosh Shilimkar
2013-06-22  1:51 ` [PATCH 0/8] ARM: mm: Extend the runtime patch stub for PAE systems Nicolas Pitre
2013-06-22  2:17   ` Santosh Shilimkar
2013-07-16 18:42   ` Santosh Shilimkar
