* [PATCH 00/22] Introducing the TI Keystone platform
From: Cyril Chemparathy @ 2012-07-31 23:04 UTC
  To: linux-arm-kernel, linux-kernel
  Cc: arnd, catalin.marinas, nico, linux, will.deacon, Cyril Chemparathy

This series is a follow-on to the RFC series posted earlier (archived at [1]).
The major change introduced here is a modification of the kernel patching
mechanism for phys_to_virt/virt_to_phys, in order to support LPAE platforms
that require late patching.  In addition to these changes, we've updated the
series based on feedback from the earlier posting.

Most of the patches in this series are fixes and extensions to LPAE support
on ARM.  The last three patches are specific to the TI Keystone platform, and
are provided here for the sake of completeness.  These three patches depend
on the smpops patch set (see [2]), and are not yet ready to be merged.

[1] - https://lkml.org/lkml/2012/7/23/460
[2] - http://permalink.gmane.org/gmane.linux.ports.arm.kernel/171540

Cyril Chemparathy (18):
  ARM: add mechanism for late code patching
  ARM: use late patch framework for phys-virt patching
  ARM: LPAE: use phys_addr_t on virt <--> phys conversion
  ARM: LPAE: support 64-bit virt/phys patching
  ARM: LPAE: use signed arithmetic for mask definitions
  ARM: LPAE: use 64-bit pgd physical address in switch_mm()
  ARM: LPAE: use 64-bit accessors for TTBR registers
  ARM: LPAE: define ARCH_LOW_ADDRESS_LIMIT for bootmem
  ARM: LPAE: factor out T1SZ and TTBR1 computations
  ARM: LPAE: allow proc override of TTB setup
  ARM: LPAE: accommodate >32-bit addresses for page table base
  ARM: mm: use physical addresses in highmem sanity checks
  ARM: mm: cleanup checks for membank overlap with vmalloc area
  ARM: mm: clean up membank size limit checks
  ARM: recreate kernel mappings in early_paging_init()
  ARM: keystone: introducing TI Keystone platform
  ARM: keystone: enable SMP on Keystone machines
  ARM: keystone: add switch over to high physical address range

Vitaly Andrianov (4):
  ARM: LPAE: use phys_addr_t in alloc_init_pud()
  ARM: LPAE: use phys_addr_t in free_memmap()
  ARM: LPAE: use phys_addr_t for initrd location and size
  ARM: add virt_to_idmap for interconnect aliasing

 arch/arm/Kconfig                                  |   20 +++
 arch/arm/Makefile                                 |    1 +
 arch/arm/boot/dts/keystone-sim.dts                |   77 +++++++++
 arch/arm/configs/keystone_defconfig               |   23 +++
 arch/arm/include/asm/cache.h                      |    9 +
 arch/arm/include/asm/mach/arch.h                  |    1 +
 arch/arm/include/asm/memory.h                     |   68 +++++---
 arch/arm/include/asm/page.h                       |    2 +-
 arch/arm/include/asm/patch.h                      |  123 +++++++++++++
 arch/arm/include/asm/pgtable-3level-hwdef.h       |   10 ++
 arch/arm/include/asm/pgtable-3level.h             |    6 +-
 arch/arm/include/asm/proc-fns.h                   |   28 ++-
 arch/arm/kernel/head.S                            |  119 +++----------
 arch/arm/kernel/module.c                          |    7 +-
 arch/arm/kernel/setup.c                           |  192 +++++++++++++++++++++
 arch/arm/kernel/smp.c                             |   11 +-
 arch/arm/kernel/vmlinux.lds.S                     |   13 +-
 arch/arm/mach-keystone/Makefile                   |    2 +
 arch/arm/mach-keystone/Makefile.boot              |    1 +
 arch/arm/mach-keystone/include/mach/debug-macro.S |   44 +++++
 arch/arm/mach-keystone/include/mach/memory.h      |   47 +++++
 arch/arm/mach-keystone/include/mach/timex.h       |   21 +++
 arch/arm/mach-keystone/include/mach/uncompress.h  |   24 +++
 arch/arm/mach-keystone/keystone.c                 |  122 +++++++++++++
 arch/arm/mach-keystone/keystone.h                 |   23 +++
 arch/arm/mach-keystone/platsmp.c                  |   88 ++++++++++
 arch/arm/mm/context.c                             |   13 +-
 arch/arm/mm/idmap.c                               |    4 +-
 arch/arm/mm/init.c                                |   20 +--
 arch/arm/mm/mmu.c                                 |  106 ++++++++----
 arch/arm/mm/proc-arm1026.S                        |    3 +
 arch/arm/mm/proc-mohawk.S                         |    3 +
 arch/arm/mm/proc-v6.S                             |    6 +-
 arch/arm/mm/proc-v7-2level.S                      |    7 +-
 arch/arm/mm/proc-v7-3level.S                      |   29 ++--
 arch/arm/mm/proc-v7.S                             |    2 +
 arch/arm/mm/proc-xsc3.S                           |    3 +
 37 files changed, 1065 insertions(+), 213 deletions(-)
 create mode 100644 arch/arm/boot/dts/keystone-sim.dts
 create mode 100644 arch/arm/configs/keystone_defconfig
 create mode 100644 arch/arm/include/asm/patch.h
 create mode 100644 arch/arm/mach-keystone/Makefile
 create mode 100644 arch/arm/mach-keystone/Makefile.boot
 create mode 100644 arch/arm/mach-keystone/include/mach/debug-macro.S
 create mode 100644 arch/arm/mach-keystone/include/mach/memory.h
 create mode 100644 arch/arm/mach-keystone/include/mach/timex.h
 create mode 100644 arch/arm/mach-keystone/include/mach/uncompress.h
 create mode 100644 arch/arm/mach-keystone/keystone.c
 create mode 100644 arch/arm/mach-keystone/keystone.h
 create mode 100644 arch/arm/mach-keystone/platsmp.c

-- 
1.7.9.5


* [PATCH 01/22] ARM: add mechanism for late code patching
From: Cyril Chemparathy @ 2012-07-31 23:04 UTC
  To: linux-arm-kernel, linux-kernel
  Cc: arnd, catalin.marinas, nico, linux, will.deacon, Cyril Chemparathy

The original phys_to_virt/virt_to_phys patching implementation relied on early
patching prior to MMU initialization.  On PAE systems running out of >4G
address space, this would have entailed an additional round of patching after
switching over to the high address space.

The approach implemented here conceptually extends the original PHYS_OFFSET
patching implementation with the introduction of "early" patch stubs.  Early
patch code is required to be functional out of the box, even before the patch
is applied.  This is implemented by emitting functional (but inefficient)
load code into the .patch.code init section.  Having working code from the
start then allows us to defer the init-time patch application until later in
the init sequence.

In addition to fitting better with our need for physical address-space
switch-over, this implementation should be somewhat more extensible by virtue
of its more readable (and hackable) C implementation.  This should prove
useful for other similar init time specialization needs, especially in light
of our multi-platform kernel initiative.

This code has been boot tested in both ARM and Thumb-2 modes on an ARMv7
(Cortex-A8) device.

Note: the obtuse use of stringified symbols in patch_stub() and
early_patch_stub() is intentional.  Theoretically this should have been
accomplished with formal operands passed into the asm block, but this requires
the use of the 'c' modifier for instantiating the long (e.g. .long %c0).
However, the 'c' modifier has been found to ICE certain versions of GCC, and
therefore we resort to stringified symbols here.
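
For reference, the formal-operand alternative would have looked roughly
like the following sketch (the 'c' modifier emits the constant without
the leading '#'); this is the form that ICEs some GCC versions:

  /* hypothetical alternative, not used by this patch: */
  __asm__("	.long	%c0\n" : : "i" (&__pv_offset));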

Signed-off-by: Cyril Chemparathy <cyril@ti.com>
---
 arch/arm/include/asm/patch.h  |  123 +++++++++++++++++++++++++++++
 arch/arm/kernel/module.c      |    4 +
 arch/arm/kernel/setup.c       |  175 +++++++++++++++++++++++++++++++++++++++++
 arch/arm/kernel/vmlinux.lds.S |   10 +++
 4 files changed, 312 insertions(+)
 create mode 100644 arch/arm/include/asm/patch.h

diff --git a/arch/arm/include/asm/patch.h b/arch/arm/include/asm/patch.h
new file mode 100644
index 0000000..a89749f
--- /dev/null
+++ b/arch/arm/include/asm/patch.h
@@ -0,0 +1,123 @@
+/*
+ *  arch/arm/include/asm/patch.h
+ *
+ *  Copyright (C) 2012, Texas Instruments
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ *  Note: this file should not be included by non-asm/.h files
+ */
+#ifndef __ASM_ARM_PATCH_H
+#define __ASM_ARM_PATCH_H
+
+#include <linux/stringify.h>
+
+#ifndef __ASSEMBLY__
+
+extern unsigned __patch_table_begin, __patch_table_end;
+
+struct patch_info {
+	u32	 type;
+	u32	 size;
+	void	*insn_start;
+	void	*insn_end;
+	u32	 patch_data[0];
+};
+
+#define patch_next(p)		((void *)(p) + (p)->size)
+
+#define PATCH_TYPE_MASK		0xffff
+#define PATCH_IMM8		0x0001
+
+#define PATCH_EARLY		0x10000
+
+#define patch_stub(type, code, patch_data, ...)			\
+	__asm__("@ patch stub\n"				\
+		"1:\n"						\
+		code						\
+		"2:\n"						\
+		"	.pushsection .patch.table, \"a\"\n"	\
+		"3:\n"						\
+		"	.long (" __stringify(type) ")\n"	\
+		"	.long (4f-3b)\n"			\
+		"	.long 1b\n"				\
+		"	.long 2b\n"				\
+		patch_data					\
+		"4:\n"						\
+		"	.popsection\n"				\
+		__VA_ARGS__)
+
+#define early_patch_stub(type, code, patch_data, ...)		\
+	__asm__("@ patch stub\n"				\
+		"1:\n"						\
+		"	b	5f\n"				\
+		"2:\n"						\
+		"	.pushsection .patch.table, \"a\"\n"	\
+		"3:\n"						\
+		"	.long (" __stringify(type | PATCH_EARLY) ")\n" \
+		"	.long (4f-3b)\n"			\
+		"	.long 1b\n"				\
+		"	.long 2b\n"				\
+		patch_data					\
+		"4:\n"						\
+		"	.popsection\n"				\
+		"	.pushsection .patch.code, \"ax\"\n"	\
+		"5:\n"						\
+		code						\
+		"	b 2b\n"					\
+		"	.popsection\n"				\
+		__VA_ARGS__)
+
+/* constant used to force encoding */
+#define __IMM8		(0x81 << 24)
+
+/*
+ * patch_imm8() - init-time specialized binary operation (imm8 operand)
+ *		  This effectively does: to = from "insn" sym,
+ *		  where the value of sym is fixed at init-time, and is patched
+ *		  in as an immediate operand.  This value must be
+ *		  representable as an 8-bit quantity with an optional
+ *		  rotation.
+ *
+ *		  The stub code produced by this variant is non-functional
+ *		  prior to patching.  Use early_patch_imm8() if you need the
+ *		  code to be functional early on in the init sequence.
+ */
+#define patch_imm8(from, to, insn, sym)				\
+	patch_stub(PATCH_IMM8,					\
+		   /* code */					\
+		   insn " %0, %1, %2\n",			\
+		   /* patch_data */				\
+		   ".long " __stringify(sym)		  "\n"	\
+		   insn " %0, %1, %2\n",			\
+		   : "=r" (to)					\
+		   : "r" (from), "I" (__IMM8), "m" (sym)	\
+		   : "cc")
+
+/*
+ * early_patch_imm8() - early functional variant of patch_imm8() above.  The
+ *			same restrictions on the constant apply here.  This
+ *			version emits workable (albeit inefficient) code at
+ *			compile-time, and therefore functions even prior to
+ *			patch application.
+ */
+#define early_patch_imm8(from, to, insn, sym)			\
+	early_patch_stub(PATCH_IMM8,				\
+			 /* code */				\
+			 "ldr	%0, =" __stringify(sym) "\n"	\
+			 "ldr	%0, [%0]\n"			\
+			 insn " %0, %1, %0\n",			\
+			 /* patch_data */			\
+			 ".long " __stringify(sym) "\n"		\
+			 insn " %0, %1, %2\n",			\
+			 : "=&r" (to)				\
+			 : "r" (from), "I" (__IMM8), "m" (sym)	\
+			 : "cc")
+
+int patch_kernel(const void *table, unsigned size);
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* __ASM_ARM_PATCH_H */
diff --git a/arch/arm/kernel/module.c b/arch/arm/kernel/module.c
index 1e9be5d..df5e897 100644
--- a/arch/arm/kernel/module.c
+++ b/arch/arm/kernel/module.c
@@ -24,6 +24,7 @@
 #include <asm/sections.h>
 #include <asm/smp_plat.h>
 #include <asm/unwind.h>
+#include <asm/patch.h>
 
 #ifdef CONFIG_XIP_KERNEL
 /*
@@ -321,6 +322,9 @@ int module_finalize(const Elf32_Ehdr *hdr, const Elf_Shdr *sechdrs,
 	if (s)
 		fixup_pv_table((void *)s->sh_addr, s->sh_size);
 #endif
+	s = find_mod_section(hdr, sechdrs, ".patch.table");
+	if (s)
+		patch_kernel((void *)s->sh_addr, s->sh_size);
 	s = find_mod_section(hdr, sechdrs, ".alt.smp.init");
 	if (s && !is_smp())
 #ifdef CONFIG_SMP_ON_UP
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index e15d83b..15a7699 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -55,6 +55,7 @@
 #include <asm/traps.h>
 #include <asm/unwind.h>
 #include <asm/memblock.h>
+#include <asm/opcodes.h>
 
 #if defined(CONFIG_DEPRECATED_PARAM_STRUCT)
 #include "compat.h"
@@ -937,6 +938,178 @@ static int __init meminfo_cmp(const void *_a, const void *_b)
 	return cmp < 0 ? -1 : cmp > 0 ? 1 : 0;
 }
 
+static int apply_patch_imm8_arm(const struct patch_info *p)
+{
+	u32 insn, ninsn, op, *insn_ptr = p->insn_start;
+	u32 imm, rot, val;
+	int size = p->insn_end - p->insn_start;
+
+	if (size != 4) {
+		pr_err("patch: bad template size %d\n", size);
+		return -EINVAL;
+	}
+
+	insn = __mem_to_opcode_arm(p->patch_data[1]);
+
+	/* disallow special unconditional instructions
+	 * 1111 xxxx xxxx xxxx xxxx xxxx xxxx xxxx */
+	if ((insn >> 24) == 0xf) {
+		pr_err("patch: unconditional insn %08x\n", insn);
+		return -EINVAL;
+	}
+
+	/* allow only data processing (immediate)
+	 * xxxx 001x xxxx xxxx xxxx xxxx xxxx xxxx */
+	if (((insn >> 25) & 0x3) != 1) {
+		pr_err("patch: unknown insn %08x\n", insn);
+		return -EINVAL;
+	}
+
+	/* extract op code */
+	op = (insn >> 20) & 0x1f;
+
+	/* disallow unsupported 10xxx op codes */
+	if (((op >> 3) & 0x3) == 2) {
+		pr_err("patch: unsupported opcode %08x\n", insn);
+		return -EINVAL;
+	}
+
+	/* disallow Rn == PC and Rd == PC */
+	if (((insn >> 16) & 0xf) == 0xf || ((insn >> 12) & 0xf) == 0xf) {
+		pr_err("patch: unsupported register %08x\n", insn);
+		return -EINVAL;
+	}
+
+	imm = *(u32 *)p->patch_data[0];
+
+	rot = imm ? __ffs(imm) / 2 : 0;
+	val = imm >> (rot * 2);
+	rot = (-rot) & 0xf;
+
+	/* does this fit in 8-bit? */
+	if (val > 0xff) {
+		pr_err("patch: constant overflow %08x\n", imm);
+		return -EINVAL;
+	}
+
+	/* patch in new immediate and rotation */
+	ninsn = (insn & ~0xfff) | (rot << 8) | val;
+
+	*insn_ptr = __opcode_to_mem_arm(ninsn);
+
+	return 0;
+}
+
+static int apply_patch_imm8_thumb(const struct patch_info *p)
+{
+	u32 insn, ninsn, op, *insn_ptr = p->insn_start;
+	u32 imm, rot, val;
+	int size = p->insn_end - p->insn_start;
+	const u32 supported_ops = (BIT(0)  | /* and */
+				   BIT(1)  | /* bic */
+				   BIT(2)  | /* orr/mov */
+				   BIT(3)  | /* orn/mvn */
+				   BIT(4)  | /* eor */
+				   BIT(8)  | /* add */
+				   BIT(10) | /* adc */
+				   BIT(11) | /* sbc */
+				   BIT(12) | /* sub */
+				   BIT(13)); /* rsb */
+
+	if (size != 4) {
+		pr_err("patch: bad template size %d\n", size);
+		return -EINVAL;
+	}
+
+	insn = __mem_to_opcode_thumb32(p->patch_data[1]);
+	if (!__opcode_is_thumb32(insn)) {
+		pr_err("patch: invalid thumb2 insn %08x\n", insn);
+		return -EINVAL;
+	}
+
+	/* allow only data processing (immediate)
+	 * 1111 0x0x xxx0 xxxx 0xxx xxxx xxxx xxxx */
+	if ((insn & 0xfa008000) != 0xf0000000) {
+		pr_err("patch: unknown insn %08x\n", insn);
+		return -EINVAL;
+	}
+
+	/* disallow Rn == PC and Rd == PC */
+	if (((insn >> 8) & 0xf) == 0xf || ((insn >> 16) & 0xf) == 0xf) {
+		pr_err("patch: unsupported register %08x\n", insn);
+		return -EINVAL;
+	}
+
+	/* extract op code */
+	op = (insn >> 21) & 0xf;
+
+	/* disallow unsupported opcodes */
+	if ((supported_ops & BIT(op)) == 0) {
+		pr_err("patch: unsupported opcode %x\n", op);
+		return -EINVAL;
+	}
+
+	imm = *(u32 *)p->patch_data[0];
+
+	if (imm <= 0xff) {
+		rot = 0;
+		val = imm;
+	} else {
+		rot = 32 - fls(imm); /* clz */
+		if (imm & ~(0xff000000 >> rot)) {
+			pr_err("patch: constant overflow %08x\n", imm);
+			return -EINVAL;
+		}
+		val  = (imm >> (24 - rot)) & 0x7f;
+		rot += 8; /* encoded i:imm3:a */
+
+		/* pack least-sig rot bit into most-sig val bit */
+		val |= (rot & 1) << 7;
+		rot >>= 1;
+	}
+
+	ninsn  = insn & ~(BIT(26) | 0x7 << 12 | 0xff);
+	ninsn |= (rot >> 3) << 26;	/* field "i" */
+	ninsn |= (rot & 0x7) << 12;	/* field "imm3" */
+	ninsn |= val;
+
+	*insn_ptr = __opcode_to_mem_thumb32(ninsn);
+
+	return 0;
+}
+
+int patch_kernel(const void *table, unsigned size)
+{
+	const struct patch_info *p = table, *end = (table + size);
+	bool thumb2 = IS_ENABLED(CONFIG_THUMB2_KERNEL);
+
+	for (p = table; p < end; p = patch_next(p)) {
+		int type = p->type & PATCH_TYPE_MASK;
+		int ret;
+
+		if (type == PATCH_IMM8) {
+			ret = (thumb2 ? apply_patch_imm8_thumb(p) :
+					apply_patch_imm8_arm(p));
+		} else {
+			pr_err("invalid patch type %d\n", type);
+			ret = -EINVAL;
+		}
+
+		if (ret < 0)
+			return ret;
+	}
+	return 0;
+}
+
+static void __init init_patch_kernel(void)
+{
+	const void *start = &__patch_table_begin;
+	const void *end   = &__patch_table_end;
+
+	BUG_ON(patch_kernel(start, end - start));
+	flush_icache_range(init_mm.start_code, init_mm.end_code);
+}
+
 void __init setup_arch(char **cmdline_p)
 {
 	struct machine_desc *mdesc;
@@ -998,6 +1171,8 @@ void __init setup_arch(char **cmdline_p)
 
 	if (mdesc->init_early)
 		mdesc->init_early();
+
+	init_patch_kernel();
 }
 
 
diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
index 36ff15b..bacb275 100644
--- a/arch/arm/kernel/vmlinux.lds.S
+++ b/arch/arm/kernel/vmlinux.lds.S
@@ -167,6 +167,16 @@ SECTIONS
 		*(.pv_table)
 		__pv_table_end = .;
 	}
+	.init.patch_table : {
+		__patch_table_begin = .;
+		*(.patch.table)
+		__patch_table_end = .;
+	}
+	.init.patch_code : {
+		__patch_code_begin = .;
+		*(.patch.code)
+		__patch_code_end = .;
+	}
 	.init.data : {
 #ifndef CONFIG_XIP_KERNEL
 		INIT_DATA
-- 
1.7.9.5


* [PATCH 02/22] ARM: use late patch framework for phys-virt patching
From: Cyril Chemparathy @ 2012-07-31 23:04 UTC
  To: linux-arm-kernel, linux-kernel
  Cc: arnd, catalin.marinas, nico, linux, will.deacon, Cyril Chemparathy

This patch replaces the original physical offset patching implementation
with one that uses the newly added patching framework.  In the process, we now
unconditionally initialize the __pv_phys_offset and __pv_offset globals in the
head.S code.
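
As a worked illustration (values assumed here, not mandated by this
patch): with PAGE_OFFSET = 0xC0000000 and PHYS_OFFSET = 0x80000000,
head.S computes __pv_offset = 0xC0000000.  The patching framework
encodes this as imm8 = 0x03 with rotation field 1 (0x03 rotated right
by 2 bits == 0xC0000000), so the patched stubs become:

  add	rd, rn, #0xC0000000	@ __virt_to_phys()
  sub	rd, rn, #0xC0000000	@ __phys_to_virt()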

Signed-off-by: Cyril Chemparathy <cyril@ti.com>
---
 arch/arm/include/asm/memory.h |   20 ++-------
 arch/arm/kernel/head.S        |   96 +++++------------------------------------
 arch/arm/kernel/module.c      |    5 ---
 arch/arm/kernel/vmlinux.lds.S |    5 ---
 4 files changed, 15 insertions(+), 111 deletions(-)

diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
index fcb5757..01c710d 100644
--- a/arch/arm/include/asm/memory.h
+++ b/arch/arm/include/asm/memory.h
@@ -17,6 +17,7 @@
 #include <linux/const.h>
 #include <linux/types.h>
 #include <asm/sizes.h>
+#include <asm/patch.h>
 
 #ifdef CONFIG_NEED_MACH_MEMORY_H
 #include <mach/memory.h>
@@ -151,35 +152,22 @@
 #ifndef __virt_to_phys
 #ifdef CONFIG_ARM_PATCH_PHYS_VIRT
 
-/*
- * Constants used to force the right instruction encodings and shifts
- * so that all we need to do is modify the 8-bit constant field.
- */
-#define __PV_BITS_31_24	0x81000000
-
 extern unsigned long __pv_phys_offset;
 #define PHYS_OFFSET __pv_phys_offset
 
-#define __pv_stub(from,to,instr,type)			\
-	__asm__("@ __pv_stub\n"				\
-	"1:	" instr "	%0, %1, %2\n"		\
-	"	.pushsection .pv_table,\"a\"\n"		\
-	"	.long	1b\n"				\
-	"	.popsection\n"				\
-	: "=r" (to)					\
-	: "r" (from), "I" (type))
+extern unsigned long __pv_offset;
 
 static inline unsigned long __virt_to_phys(unsigned long x)
 {
 	unsigned long t;
-	__pv_stub(x, t, "add", __PV_BITS_31_24);
+	early_patch_imm8(x, t, "add", __pv_offset);
 	return t;
 }
 
 static inline unsigned long __phys_to_virt(unsigned long x)
 {
 	unsigned long t;
-	__pv_stub(x, t, "sub", __PV_BITS_31_24);
+	early_patch_imm8(x, t, "sub", __pv_offset);
 	return t;
 }
 #else
diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index 835898e..d165896 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -109,9 +109,13 @@ ENTRY(stext)
 
 #ifndef CONFIG_XIP_KERNEL
 	adr	r3, 2f
-	ldmia	r3, {r4, r8}
+	ldmia	r3, {r4, r5, r6, r8}
 	sub	r4, r3, r4			@ (PHYS_OFFSET - PAGE_OFFSET)
 	add	r8, r8, r4			@ PHYS_OFFSET
+	add	r5, r5, r4
+	str	r8, [r5]			@ set __pv_phys_offset
+	add	r6, r6, r4
+	str	r4, [r6]			@ set __pv_offset
 #else
 	ldr	r8, =PHYS_OFFSET		@ always constant in this case
 #endif
@@ -124,9 +128,6 @@ ENTRY(stext)
 #ifdef CONFIG_SMP_ON_UP
 	bl	__fixup_smp
 #endif
-#ifdef CONFIG_ARM_PATCH_PHYS_VIRT
-	bl	__fixup_pv_table
-#endif
 	bl	__create_page_tables
 
 	/*
@@ -148,6 +149,8 @@ ENDPROC(stext)
 	.ltorg
 #ifndef CONFIG_XIP_KERNEL
 2:	.long	.
+	.long	__pv_phys_offset
+	.long	__pv_offset
 	.long	PAGE_OFFSET
 #endif
 
@@ -522,94 +525,17 @@ ENTRY(fixup_smp)
 	ldmfd	sp!, {r4 - r6, pc}
 ENDPROC(fixup_smp)
 
-#ifdef CONFIG_ARM_PATCH_PHYS_VIRT
-
-/* __fixup_pv_table - patch the stub instructions with the delta between
- * PHYS_OFFSET and PAGE_OFFSET, which is assumed to be 16MiB aligned and
- * can be expressed by an immediate shifter operand. The stub instruction
- * has a form of '(add|sub) rd, rn, #imm'.
- */
-	__HEAD
-__fixup_pv_table:
-	adr	r0, 1f
-	ldmia	r0, {r3-r5, r7}
-	sub	r3, r0, r3	@ PHYS_OFFSET - PAGE_OFFSET
-	add	r4, r4, r3	@ adjust table start address
-	add	r5, r5, r3	@ adjust table end address
-	add	r7, r7, r3	@ adjust __pv_phys_offset address
-	str	r8, [r7]	@ save computed PHYS_OFFSET to __pv_phys_offset
-	mov	r6, r3, lsr #24	@ constant for add/sub instructions
-	teq	r3, r6, lsl #24 @ must be 16MiB aligned
-THUMB(	it	ne		@ cross section branch )
-	bne	__error
-	str	r6, [r7, #4]	@ save to __pv_offset
-	b	__fixup_a_pv_table
-ENDPROC(__fixup_pv_table)
-
-	.align
-1:	.long	.
-	.long	__pv_table_begin
-	.long	__pv_table_end
-2:	.long	__pv_phys_offset
-
-	.text
-__fixup_a_pv_table:
-#ifdef CONFIG_THUMB2_KERNEL
-	lsls	r6, #24
-	beq	2f
-	clz	r7, r6
-	lsr	r6, #24
-	lsl	r6, r7
-	bic	r6, #0x0080
-	lsrs	r7, #1
-	orrcs	r6, #0x0080
-	orr	r6, r6, r7, lsl #12
-	orr	r6, #0x4000
-	b	2f
-1:	add     r7, r3
-	ldrh	ip, [r7, #2]
-	and	ip, 0x8f00
-	orr	ip, r6	@ mask in offset bits 31-24
-	strh	ip, [r7, #2]
-2:	cmp	r4, r5
-	ldrcc	r7, [r4], #4	@ use branch for delay slot
-	bcc	1b
-	bx	lr
-#else
-	b	2f
-1:	ldr	ip, [r7, r3]
-	bic	ip, ip, #0x000000ff
-	orr	ip, ip, r6	@ mask in offset bits 31-24
-	str	ip, [r7, r3]
-2:	cmp	r4, r5
-	ldrcc	r7, [r4], #4	@ use branch for delay slot
-	bcc	1b
-	mov	pc, lr
-#endif
-ENDPROC(__fixup_a_pv_table)
-
-ENTRY(fixup_pv_table)
-	stmfd	sp!, {r4 - r7, lr}
-	ldr	r2, 2f			@ get address of __pv_phys_offset
-	mov	r3, #0			@ no offset
-	mov	r4, r0			@ r0 = table start
-	add	r5, r0, r1		@ r1 = table size
-	ldr	r6, [r2, #4]		@ get __pv_offset
-	bl	__fixup_a_pv_table
-	ldmfd	sp!, {r4 - r7, pc}
-ENDPROC(fixup_pv_table)
-
-	.align
-2:	.long	__pv_phys_offset
-
 	.data
 	.globl	__pv_phys_offset
 	.type	__pv_phys_offset, %object
 __pv_phys_offset:
 	.long	0
 	.size	__pv_phys_offset, . - __pv_phys_offset
+
+	.globl	__pv_offset
+	.type	__pv_offset, %object
 __pv_offset:
 	.long	0
-#endif
+	.size	__pv_offset, . - __pv_offset
 
 #include "head-common.S"
diff --git a/arch/arm/kernel/module.c b/arch/arm/kernel/module.c
index df5e897..39f8fce 100644
--- a/arch/arm/kernel/module.c
+++ b/arch/arm/kernel/module.c
@@ -317,11 +317,6 @@ int module_finalize(const Elf32_Ehdr *hdr, const Elf_Shdr *sechdrs,
 					         maps[i].txt_sec->sh_addr,
 					         maps[i].txt_sec->sh_size);
 #endif
-#ifdef CONFIG_ARM_PATCH_PHYS_VIRT
-	s = find_mod_section(hdr, sechdrs, ".pv_table");
-	if (s)
-		fixup_pv_table((void *)s->sh_addr, s->sh_size);
-#endif
 	s = find_mod_section(hdr, sechdrs, ".patch.table");
 	if (s)
 		patch_kernel((void *)s->sh_addr, s->sh_size);
diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
index bacb275..13731e3 100644
--- a/arch/arm/kernel/vmlinux.lds.S
+++ b/arch/arm/kernel/vmlinux.lds.S
@@ -162,11 +162,6 @@ SECTIONS
 		__smpalt_end = .;
 	}
 #endif
-	.init.pv_table : {
-		__pv_table_begin = .;
-		*(.pv_table)
-		__pv_table_end = .;
-	}
 	.init.patch_table : {
 		__patch_table_begin = .;
 		*(.patch.table)
-- 
1.7.9.5


* [PATCH 03/22] ARM: LPAE: use phys_addr_t on virt <--> phys conversion
From: Cyril Chemparathy @ 2012-07-31 23:04 UTC
  To: linux-arm-kernel, linux-kernel
  Cc: arnd, catalin.marinas, nico, linux, will.deacon,
	Cyril Chemparathy, Vitaly Andrianov

This patch fixes up the types used when converting back and forth between
physical and virtual addresses.
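
For illustration (a sketch with assumed addresses, not taken from this
patch), the failure mode being fixed is silent truncation of >32-bit
physical addresses:

  phys_addr_t pa = 0x880000000ULL;      /* 36-bit physical address */
  unsigned long v = (unsigned long)pa;  /* truncated to 0x80000000 */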

Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
Signed-off-by: Cyril Chemparathy <cyril@ti.com>
---
 arch/arm/include/asm/memory.h |   17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
index 01c710d..4a0108f 100644
--- a/arch/arm/include/asm/memory.h
+++ b/arch/arm/include/asm/memory.h
@@ -157,22 +157,27 @@ extern unsigned long __pv_phys_offset;
 
 extern unsigned long __pv_offset;
 
-static inline unsigned long __virt_to_phys(unsigned long x)
+static inline phys_addr_t __virt_to_phys(unsigned long x)
 {
 	unsigned long t;
 	early_patch_imm8(x, t, "add", __pv_offset);
 	return t;
 }
 
-static inline unsigned long __phys_to_virt(unsigned long x)
+static inline unsigned long __phys_to_virt(phys_addr_t x)
 {
 	unsigned long t;
 	early_patch_imm8(x, t, "sub", __pv_offset);
 	return t;
 }
 #else
-#define __virt_to_phys(x)	((x) - PAGE_OFFSET + PHYS_OFFSET)
-#define __phys_to_virt(x)	((x) - PHYS_OFFSET + PAGE_OFFSET)
+
+#define __virt_to_phys(x)		\
+	((phys_addr_t)(x) - PAGE_OFFSET + PHYS_OFFSET)
+
+#define __phys_to_virt(x)		\
+	((unsigned long)((phys_addr_t)(x) - PHYS_OFFSET + PAGE_OFFSET))
+
 #endif
 #endif
 
@@ -207,14 +212,14 @@ static inline phys_addr_t virt_to_phys(const volatile void *x)
 
 static inline void *phys_to_virt(phys_addr_t x)
 {
-	return (void *)(__phys_to_virt((unsigned long)(x)));
+	return (void *)__phys_to_virt(x);
 }
 
 /*
  * Drivers should NOT use these either.
  */
 #define __pa(x)			__virt_to_phys((unsigned long)(x))
-#define __va(x)			((void *)__phys_to_virt((unsigned long)(x)))
+#define __va(x)			((void *)__phys_to_virt((phys_addr_t)(x)))
 #define pfn_to_kaddr(pfn)	__va((pfn) << PAGE_SHIFT)
 
 /*
-- 
1.7.9.5


* [PATCH 04/22] ARM: LPAE: support 64-bit virt/phys patching
From: Cyril Chemparathy @ 2012-07-31 23:04 UTC
  To: linux-arm-kernel, linux-kernel
  Cc: arnd, catalin.marinas, nico, linux, will.deacon, Cyril Chemparathy

This patch adds support for 64-bit physical addresses in virt_to_phys
patching.  This does not do a real 64-bit add/sub; instead, it patches the
upper 32 bits of the physical offset directly into the output of virt_to_phys.

In addition to adding 64-bit support, this patch also adds a set_phys_offset()
helper that is needed on architectures that need to modify PHYS_OFFSET during
initialization.
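
For example (illustrative values, not taken from this patch), calling
set_phys_offset(0x800000000ULL) with PAGE_OFFSET = 0xC0000000 yields:

  __pv_phys_offset      = 0x00000000  /* low 32 bits of 0x8_00000000 */
  __pv_phys_offset_high = 0x8         /* patched into the high word  */
  __pv_offset           = 0x40000000  /* 0x0 - 0xC0000000, mod 2^32  */

__virt_to_phys(0xC0000000) then computes 0xC0000000 + 0x40000000 = 0x0
in the low word and 0x0 + 0x8 in the high word, i.e. 0x8_00000000.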

Signed-off-by: Cyril Chemparathy <cyril@ti.com>
---
 arch/arm/include/asm/memory.h |   22 +++++++++++++++-------
 arch/arm/kernel/head.S        |    6 ++++++
 arch/arm/kernel/setup.c       |   14 ++++++++++++++
 3 files changed, 35 insertions(+), 7 deletions(-)

diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
index 4a0108f..110495c 100644
--- a/arch/arm/include/asm/memory.h
+++ b/arch/arm/include/asm/memory.h
@@ -153,23 +153,31 @@
 #ifdef CONFIG_ARM_PATCH_PHYS_VIRT
 
 extern unsigned long __pv_phys_offset;
-#define PHYS_OFFSET __pv_phys_offset
-
+extern unsigned long __pv_phys_offset_high;
 extern unsigned long __pv_offset;
 
+extern void set_phys_offset(phys_addr_t po);
+
+#define PHYS_OFFSET	__virt_to_phys(PAGE_OFFSET)
+
 static inline phys_addr_t __virt_to_phys(unsigned long x)
 {
-	unsigned long t;
-	early_patch_imm8(x, t, "add", __pv_offset);
-	return t;
+	unsigned long tlo, thi = 0;
+
+	early_patch_imm8(x, tlo, "add", __pv_offset);
+	if (sizeof(phys_addr_t) > 4)
+		early_patch_imm8(0, thi, "add", __pv_phys_offset_high);
+
+	return (u64)tlo | (u64)thi << 32;
 }
 
 static inline unsigned long __phys_to_virt(phys_addr_t x)
 {
-	unsigned long t;
-	early_patch_imm8(x, t, "sub", __pv_offset);
+	unsigned long t, xlo = x;
+	early_patch_imm8(xlo, t, "sub", __pv_offset);
 	return t;
 }
+
 #else
 
 #define __virt_to_phys(x)		\
diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index d165896..fa820b3 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -532,6 +532,12 @@ __pv_phys_offset:
 	.long	0
 	.size	__pv_phys_offset, . - __pv_phys_offset
 
+	.globl	__pv_phys_offset_high
+	.type	__pv_phys_offset_high, %object
+__pv_phys_offset_high:
+	.long	0
+	.size	__pv_phys_offset_high, . - __pv_phys_offset_high
+
 	.globl	__pv_offset
 	.type	__pv_offset, %object
 __pv_offset:
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index 15a7699..bba3fdc 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -67,6 +67,20 @@
 #define MEM_SIZE	(16*1024*1024)
 #endif
 
+#ifdef CONFIG_ARM_PATCH_PHYS_VIRT
+/*
+ * set_phys_offset() sets PHYS_OFFSET and pv_offset.
+ * Note: this is unsafe to use beyond setup_arch().
+ */
+void __init set_phys_offset(phys_addr_t po)
+{
+	__pv_phys_offset	= po;
+	__pv_phys_offset_high	= (u64)po >> 32;
+	__pv_offset		= po - PAGE_OFFSET;
+}
+
+#endif
+
 #if defined(CONFIG_FPE_NWFPE) || defined(CONFIG_FPE_FASTFPE)
 char fpe_type[8];
 
-- 
1.7.9.5


* [PATCH 05/22] ARM: LPAE: use signed arithmetic for mask definitions
  2012-07-31 23:04 ` Cyril Chemparathy
@ 2012-07-31 23:04   ` Cyril Chemparathy
  -1 siblings, 0 replies; 127+ messages in thread
From: Cyril Chemparathy @ 2012-07-31 23:04 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: arnd, catalin.marinas, nico, linux, will.deacon,
	Cyril Chemparathy, Vitaly Andrianov

PAGE_MASK, PMD_MASK, and PGDIR_MASK were defined using unsigned long
arithmetic, which truncates these masks at 32 bits.  This clearly does bad
things when the masks are applied to 64-bit physical addresses on PAE systems.

This patch fixes the problem by defining these masks as signed quantities.
We then rely on sign extension to do the right thing.
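The following standalone sketch demonstrates the difference, assuming a
32-bit unsigned long as on ARM (uint32_t is used below so the effect
reproduces on any host):

	#include <stdio.h>
	#include <stdint.h>

	int main(void)
	{
		uint64_t phys = 0x180000000ULL;			/* 6GiB, needs bit 32 */
		uint32_t umask = ~(uint32_t)((1 << 12) - 1);	/* zero-extends */
		int32_t  smask = ~((1 << 12) - 1);		/* sign-extends */

		printf("%#llx\n", (unsigned long long)(phys & umask));
		printf("%#llx\n", (unsigned long long)(phys & smask));
		return 0;	/* prints 0x80000000, then 0x180000000 */
	}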

Signed-off-by: Cyril Chemparathy <cyril@ti.com>
Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
---
 arch/arm/include/asm/page.h           |    2 +-
 arch/arm/include/asm/pgtable-3level.h |    6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/arm/include/asm/page.h b/arch/arm/include/asm/page.h
index ecf9019..1e0fe08 100644
--- a/arch/arm/include/asm/page.h
+++ b/arch/arm/include/asm/page.h
@@ -13,7 +13,7 @@
 /* PAGE_SHIFT determines the page size */
 #define PAGE_SHIFT		12
 #define PAGE_SIZE		(_AC(1,UL) << PAGE_SHIFT)
-#define PAGE_MASK		(~(PAGE_SIZE-1))
+#define PAGE_MASK		(~((1 << PAGE_SHIFT) - 1))
 
 #ifndef __ASSEMBLY__
 
diff --git a/arch/arm/include/asm/pgtable-3level.h b/arch/arm/include/asm/pgtable-3level.h
index b249035..ae39d11 100644
--- a/arch/arm/include/asm/pgtable-3level.h
+++ b/arch/arm/include/asm/pgtable-3level.h
@@ -48,16 +48,16 @@
 #define PMD_SHIFT		21
 
 #define PMD_SIZE		(1UL << PMD_SHIFT)
-#define PMD_MASK		(~(PMD_SIZE-1))
+#define PMD_MASK		(~((1 << PMD_SHIFT) - 1))
 #define PGDIR_SIZE		(1UL << PGDIR_SHIFT)
-#define PGDIR_MASK		(~(PGDIR_SIZE-1))
+#define PGDIR_MASK		(~((1 << PGDIR_SHIFT) - 1))
 
 /*
  * section address mask and size definitions.
  */
 #define SECTION_SHIFT		21
 #define SECTION_SIZE		(1UL << SECTION_SHIFT)
-#define SECTION_MASK		(~(SECTION_SIZE-1))
+#define SECTION_MASK		(~((1 << SECTION_SHIFT) - 1))
 
 #define USER_PTRS_PER_PGD	(PAGE_OFFSET / PGDIR_SIZE)
 
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 127+ messages in thread

* [PATCH 06/22] ARM: LPAE: use phys_addr_t in alloc_init_pud()
  2012-07-31 23:04 ` Cyril Chemparathy
@ 2012-07-31 23:04   ` Cyril Chemparathy
  -1 siblings, 0 replies; 127+ messages in thread
From: Cyril Chemparathy @ 2012-07-31 23:04 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: arnd, catalin.marinas, nico, linux, will.deacon,
	Vitaly Andrianov, Cyril Chemparathy

From: Vitaly Andrianov <vitalya@ti.com>

This patch fixes the alloc_init_pud() function to use phys_addr_t instead of
unsigned long when passing in the phys argument.

This is an extension to commit 97092e0c56830457af0639f6bd904537a150ea4a, which
applied similar changes elsewhere in the ARM memory management code.
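For context, phys_addr_t only widens to 64 bits when CONFIG_ARM_LPAE selects
CONFIG_PHYS_ADDR_T_64BIT; the generic definition in include/linux/types.h
reads:

	#ifdef CONFIG_PHYS_ADDR_T_64BIT
	typedef u64 phys_addr_t;
	#else
	typedef u32 phys_addr_t;
	#endif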

Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
Signed-off-by: Cyril Chemparathy <cyril@ti.com>
---
 arch/arm/mm/mmu.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index cf4528d..226985c 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -628,7 +628,8 @@ static void __init alloc_init_section(pud_t *pud, unsigned long addr,
 }
 
 static void __init alloc_init_pud(pgd_t *pgd, unsigned long addr,
-	unsigned long end, unsigned long phys, const struct mem_type *type)
+				  unsigned long end, phys_addr_t phys,
+				  const struct mem_type *type)
 {
 	pud_t *pud = pud_offset(pgd, addr);
 	unsigned long next;
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 127+ messages in thread

* [PATCH 07/22] ARM: LPAE: use phys_addr_t in free_memmap()
  2012-07-31 23:04 ` Cyril Chemparathy
@ 2012-07-31 23:04   ` Cyril Chemparathy
  -1 siblings, 0 replies; 127+ messages in thread
From: Cyril Chemparathy @ 2012-07-31 23:04 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: arnd, catalin.marinas, nico, linux, will.deacon,
	Vitaly Andrianov, Cyril Chemparathy

From: Vitaly Andrianov <vitalya@ti.com>

The free_memmap() function was mistakenly using the unsigned long type to
represent physical addresses.  This breaks on PAE systems, where memory may
be located above the 32-bit addressable limit.

This patch fixes this function to properly use phys_addr_t instead.
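A sketch of the failure mode being fixed (the address below is hypothetical):

	phys_addr_t real  = 0x880000000ULL;	/* membank above 4GiB */
	unsigned long bad = real;		/* silently truncates to
						 * 0x80000000 on ARM */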

Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
Signed-off-by: Cyril Chemparathy <cyril@ti.com>
---
 arch/arm/mm/init.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index f54d592..8252c31 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -457,7 +457,7 @@ static inline void
 free_memmap(unsigned long start_pfn, unsigned long end_pfn)
 {
 	struct page *start_pg, *end_pg;
-	unsigned long pg, pgend;
+	phys_addr_t pg, pgend;
 
 	/*
 	 * Convert start_pfn/end_pfn to a struct page pointer.
@@ -469,8 +469,8 @@ free_memmap(unsigned long start_pfn, unsigned long end_pfn)
 	 * Convert to physical addresses, and
 	 * round start upwards and end downwards.
 	 */
-	pg = (unsigned long)PAGE_ALIGN(__pa(start_pg));
-	pgend = (unsigned long)__pa(end_pg) & PAGE_MASK;
+	pg = PAGE_ALIGN(__pa(start_pg));
+	pgend = __pa(end_pg) & PAGE_MASK;
 
 	/*
 	 * If there are free pages between these,
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 127+ messages in thread

* [PATCH 08/22] ARM: LPAE: use phys_addr_t for initrd location and size
  2012-07-31 23:04 ` Cyril Chemparathy
@ 2012-07-31 23:04   ` Cyril Chemparathy
  -1 siblings, 0 replies; 127+ messages in thread
From: Cyril Chemparathy @ 2012-07-31 23:04 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: arnd, catalin.marinas, nico, linux, will.deacon,
	Vitaly Andrianov, Cyril Chemparathy

From: Vitaly Andrianov <vitalya@ti.com>

This patch fixes the initrd setup code to use phys_addr_t instead of assuming
32-bit addressing.  Without this, we cannot boot on systems where the initrd
is located above the 4G physical address limit.
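As a side note, the (u64) casts paired with %llx below are the portable way
to print a phys_addr_t, since its width varies with configuration; a minimal
sketch (the address is hypothetical):

	phys_addr_t start = 0x880000000ULL;	/* initrd above 4GiB */
	pr_err("INITRD: at 0x%08llx\n", (u64)start);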

Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
Signed-off-by: Cyril Chemparathy <cyril@ti.com>
---
 arch/arm/mm/init.c |   14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index 8252c31..51f3e92 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -36,12 +36,12 @@
 
 #include "mm.h"
 
-static unsigned long phys_initrd_start __initdata = 0;
-static unsigned long phys_initrd_size __initdata = 0;
+static phys_addr_t phys_initrd_start __initdata = 0;
+static phys_addr_t phys_initrd_size __initdata = 0;
 
 static int __init early_initrd(char *p)
 {
-	unsigned long start, size;
+	phys_addr_t start, size;
 	char *endp;
 
 	start = memparse(p, &endp);
@@ -347,14 +347,14 @@ void __init arm_memblock_init(struct meminfo *mi, struct machine_desc *mdesc)
 #ifdef CONFIG_BLK_DEV_INITRD
 	if (phys_initrd_size &&
 	    !memblock_is_region_memory(phys_initrd_start, phys_initrd_size)) {
-		pr_err("INITRD: 0x%08lx+0x%08lx is not a memory region - disabling initrd\n",
-		       phys_initrd_start, phys_initrd_size);
+		pr_err("INITRD: 0x%08llx+0x%08llx is not a memory region - disabling initrd\n",
+		       (u64)phys_initrd_start, (u64)phys_initrd_size);
 		phys_initrd_start = phys_initrd_size = 0;
 	}
 	if (phys_initrd_size &&
 	    memblock_is_region_reserved(phys_initrd_start, phys_initrd_size)) {
-		pr_err("INITRD: 0x%08lx+0x%08lx overlaps in-use memory region - disabling initrd\n",
-		       phys_initrd_start, phys_initrd_size);
+		pr_err("INITRD: 0x%08llx+0x%08llx overlaps in-use memory region - disabling initrd\n",
+		       (u64)phys_initrd_start, (u64)phys_initrd_size);
 		phys_initrd_start = phys_initrd_size = 0;
 	}
 	if (phys_initrd_size) {
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 127+ messages in thread

* [PATCH 09/22] ARM: LPAE: use 64-bit pgd physical address in switch_mm()
  2012-07-31 23:04 ` Cyril Chemparathy
@ 2012-07-31 23:04   ` Cyril Chemparathy
  -1 siblings, 0 replies; 127+ messages in thread
From: Cyril Chemparathy @ 2012-07-31 23:04 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: arnd, catalin.marinas, nico, linux, will.deacon,
	Cyril Chemparathy, Vitaly Andrianov

This patch modifies the switch_mm() processor functions to use 64-bit
addresses.  We use u64 instead of phys_addr_t, in order to avoid having config
dependent register usage when calling into switch_mm assembly code.

The changes in this patch are primarily adjustments for registers used for
arguments to switch_mm.  The few processor definitions that did use the second
argument have been modified accordingly.

Arguments and calling conventions aside, this patch should be a no-op on v6
and non-LPAE v7 processors.  On LPAE systems, we now honor the upper 32-bits
of the physical address that is being passed in.
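Per the AAPCS, a u64 first argument occupies the register pair r0/r1, which
pushes the mm pointer out to r2.  This is why the assembly below now loads
mm->context.id through r2 and can fold r1 (the upper pgd bits) into the high
word written to TTBR0:

	/* register assignment implied by the new prototype (sketch):
	 * little-endian: r0 = pgd_phys[31:0], r1 = pgd_phys[63:32], r2 = mm
	 */
	void cpu_do_switch_mm(u64 pgd_phys, struct mm_struct *mm);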

Signed-off-by: Cyril Chemparathy <cyril@ti.com>
Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
---
 arch/arm/include/asm/proc-fns.h |    4 ++--
 arch/arm/mm/proc-v6.S           |    2 +-
 arch/arm/mm/proc-v7-2level.S    |    2 +-
 arch/arm/mm/proc-v7-3level.S    |    5 +++--
 4 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/arch/arm/include/asm/proc-fns.h b/arch/arm/include/asm/proc-fns.h
index f3628fb..fa6554e 100644
--- a/arch/arm/include/asm/proc-fns.h
+++ b/arch/arm/include/asm/proc-fns.h
@@ -60,7 +60,7 @@ extern struct processor {
 	/*
 	 * Set the page table
 	 */
-	void (*switch_mm)(unsigned long pgd_phys, struct mm_struct *mm);
+	void (*switch_mm)(u64 pgd_phys, struct mm_struct *mm);
 	/*
 	 * Set a possibly extended PTE.  Non-extended PTEs should
 	 * ignore 'ext'.
@@ -82,7 +82,7 @@ extern void cpu_proc_init(void);
 extern void cpu_proc_fin(void);
 extern int cpu_do_idle(void);
 extern void cpu_dcache_clean_area(void *, int);
-extern void cpu_do_switch_mm(unsigned long pgd_phys, struct mm_struct *mm);
+extern void cpu_do_switch_mm(u64 pgd_phys, struct mm_struct *mm);
 #ifdef CONFIG_ARM_LPAE
 extern void cpu_set_pte_ext(pte_t *ptep, pte_t pte);
 #else
diff --git a/arch/arm/mm/proc-v6.S b/arch/arm/mm/proc-v6.S
index 5900cd5..566c658 100644
--- a/arch/arm/mm/proc-v6.S
+++ b/arch/arm/mm/proc-v6.S
@@ -100,8 +100,8 @@ ENTRY(cpu_v6_dcache_clean_area)
  */
 ENTRY(cpu_v6_switch_mm)
 #ifdef CONFIG_MMU
+	ldr	r1, [r2, #MM_CONTEXT_ID]	@ get mm->context.id
 	mov	r2, #0
-	ldr	r1, [r1, #MM_CONTEXT_ID]	@ get mm->context.id
 	ALT_SMP(orr	r0, r0, #TTB_FLAGS_SMP)
 	ALT_UP(orr	r0, r0, #TTB_FLAGS_UP)
 	mcr	p15, 0, r2, c7, c5, 6		@ flush BTAC/BTB
diff --git a/arch/arm/mm/proc-v7-2level.S b/arch/arm/mm/proc-v7-2level.S
index 42ac069..3397803 100644
--- a/arch/arm/mm/proc-v7-2level.S
+++ b/arch/arm/mm/proc-v7-2level.S
@@ -39,8 +39,8 @@
  */
 ENTRY(cpu_v7_switch_mm)
 #ifdef CONFIG_MMU
+	ldr	r1, [r2, #MM_CONTEXT_ID]	@ get mm->context.id
 	mov	r2, #0
-	ldr	r1, [r1, #MM_CONTEXT_ID]	@ get mm->context.id
 	ALT_SMP(orr	r0, r0, #TTB_FLAGS_SMP)
 	ALT_UP(orr	r0, r0, #TTB_FLAGS_UP)
 #ifdef CONFIG_ARM_ERRATA_430973
diff --git a/arch/arm/mm/proc-v7-3level.S b/arch/arm/mm/proc-v7-3level.S
index 8de0f1d..0001581 100644
--- a/arch/arm/mm/proc-v7-3level.S
+++ b/arch/arm/mm/proc-v7-3level.S
@@ -47,9 +47,10 @@
  */
 ENTRY(cpu_v7_switch_mm)
 #ifdef CONFIG_MMU
-	ldr	r1, [r1, #MM_CONTEXT_ID]	@ get mm->context.id
-	and	r3, r1, #0xff
+	ldr	r2, [r2, #MM_CONTEXT_ID]	@ get mm->context.id
+	and	r3, r2, #0xff
 	mov	r3, r3, lsl #(48 - 32)		@ ASID
+	orr	r3, r3, r1			@ upper 32-bits of pgd phys
 	mcrr	p15, 0, r0, r3, c2		@ set TTB 0
 	isb
 #endif
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 127+ messages in thread

* [PATCH 10/22] ARM: LPAE: use 64-bit accessors for TTBR registers
  2012-07-31 23:04 ` Cyril Chemparathy
@ 2012-07-31 23:04   ` Cyril Chemparathy
  -1 siblings, 0 replies; 127+ messages in thread
From: Cyril Chemparathy @ 2012-07-31 23:04 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: arnd, catalin.marinas, nico, linux, will.deacon,
	Cyril Chemparathy, Vitaly Andrianov

This patch adds TTBR accessor macros, and modifies cpu_get_pgd() and
the LPAE version of cpu_set_reserved_ttbr0() to use these instead.

In the process, we also fix these functions to correctly handle cases
where the physical address lies beyond the 4G limit of 32-bit addressing.
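A hypothetical use of the new accessors on an LPAE system:

	u64 ttbr0 = cpu_get_ttbr(0);		/* 64-bit read via mrrc */
	cpu_set_ttbr(1, __pa(swapper_pg_dir));	/* 64-bit write via mcrr */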

Signed-off-by: Cyril Chemparathy <cyril@ti.com>
Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
---
 arch/arm/include/asm/proc-fns.h |   24 +++++++++++++++++++-----
 arch/arm/mm/context.c           |   13 ++-----------
 2 files changed, 21 insertions(+), 16 deletions(-)

diff --git a/arch/arm/include/asm/proc-fns.h b/arch/arm/include/asm/proc-fns.h
index fa6554e..918b4f9 100644
--- a/arch/arm/include/asm/proc-fns.h
+++ b/arch/arm/include/asm/proc-fns.h
@@ -116,13 +116,27 @@ extern void cpu_resume(void);
 #define cpu_switch_mm(pgd,mm) cpu_do_switch_mm(virt_to_phys(pgd),mm)
 
 #ifdef CONFIG_ARM_LPAE
+
+#define cpu_get_ttbr(nr)					\
+	({							\
+		u64 ttbr;					\
+		__asm__("mrrc	p15, " #nr ", %Q0, %R0, c2"	\
+			: "=r" (ttbr)				\
+			: : "cc");				\
+		ttbr;						\
+	})
+
+#define cpu_set_ttbr(nr, val)					\
+	do {							\
+		u64 ttbr = val;					\
+		__asm__("mcrr	p15, " #nr ", %Q0, %R0, c2"	\
+			: : "r" (ttbr)				\
+			: "cc");				\
+	} while (0)
+
 #define cpu_get_pgd()	\
 	({						\
-		unsigned long pg, pg2;			\
-		__asm__("mrrc	p15, 0, %0, %1, c2"	\
-			: "=r" (pg), "=r" (pg2)		\
-			:				\
-			: "cc");			\
+		u64 pg = cpu_get_ttbr(0);		\
 		pg &= ~(PTRS_PER_PGD*sizeof(pgd_t)-1);	\
 		(pgd_t *)phys_to_virt(pg);		\
 	})
diff --git a/arch/arm/mm/context.c b/arch/arm/mm/context.c
index 806cc4f..ad70bd8 100644
--- a/arch/arm/mm/context.c
+++ b/arch/arm/mm/context.c
@@ -15,6 +15,7 @@
 
 #include <asm/mmu_context.h>
 #include <asm/tlbflush.h>
+#include <asm/proc-fns.h>
 
 static DEFINE_RAW_SPINLOCK(cpu_asid_lock);
 unsigned int cpu_last_asid = ASID_FIRST_VERSION;
@@ -22,17 +23,7 @@ unsigned int cpu_last_asid = ASID_FIRST_VERSION;
 #ifdef CONFIG_ARM_LPAE
 void cpu_set_reserved_ttbr0(void)
 {
-	unsigned long ttbl = __pa(swapper_pg_dir);
-	unsigned long ttbh = 0;
-
-	/*
-	 * Set TTBR0 to swapper_pg_dir which contains only global entries. The
-	 * ASID is set to 0.
-	 */
-	asm volatile(
-	"	mcrr	p15, 0, %0, %1, c2		@ set TTBR0\n"
-	:
-	: "r" (ttbl), "r" (ttbh));
+	cpu_set_ttbr(0, __pa(swapper_pg_dir));
 	isb();
 }
 #else
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 127+ messages in thread

* [PATCH 11/22] ARM: LPAE: define ARCH_LOW_ADDRESS_LIMIT for bootmem
  2012-07-31 23:04 ` Cyril Chemparathy
@ 2012-07-31 23:04   ` Cyril Chemparathy
  -1 siblings, 0 replies; 127+ messages in thread
From: Cyril Chemparathy @ 2012-07-31 23:04 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: arnd, catalin.marinas, nico, linux, will.deacon,
	Cyril Chemparathy, Vitaly Andrianov

This patch adds an architecture defined override for ARCH_LOW_ADDRESS_LIMIT.
On PAE systems, the absence of this override causes bootmem to incorrectly
limit itself to 32-bit addressable physical memory.
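For reference, the generic default being overridden (in
include/linux/bootmem.h at the time) caps bootmem at the 32-bit boundary:

	#ifndef ARCH_LOW_ADDRESS_LIMIT
	#define ARCH_LOW_ADDRESS_LIMIT	0xffffffffUL
	#endif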

Signed-off-by: Cyril Chemparathy <cyril@ti.com>
Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
---
 arch/arm/include/asm/memory.h |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
index 110495c..0c1b396 100644
--- a/arch/arm/include/asm/memory.h
+++ b/arch/arm/include/asm/memory.h
@@ -281,6 +281,8 @@ static inline __deprecated void *bus_to_virt(unsigned long x)
 #define arch_is_coherent()		0
 #endif
 
+#define ARCH_LOW_ADDRESS_LIMIT		PHYS_MASK
+
 #endif
 
 #include <asm-generic/memory_model.h>
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 127+ messages in thread

* [PATCH 12/22] ARM: LPAE: factor out T1SZ and TTBR1 computations
  2012-07-31 23:04 ` Cyril Chemparathy
@ 2012-07-31 23:04   ` Cyril Chemparathy
  -1 siblings, 0 replies; 127+ messages in thread
From: Cyril Chemparathy @ 2012-07-31 23:04 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: arnd, catalin.marinas, nico, linux, will.deacon,
	Cyril Chemparathy, Vitaly Andrianov

This patch moves the TTBR1 offset calculation and the T1SZ calculation out
of the TTB setup assembly code.  This should not affect functionality in
any way, but improves the readability of this code as well as that of
subsequent patches in this series.
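A worked example of the new constants, taken from the values in the comments
being removed below:

	/*
	 * Sketch: with CONFIG_VMSPLIT_3G, PAGE_OFFSET == 0xc0000000, so:
	 *
	 *   TTBR1_SIZE   = ((0xc0000000 >> 30) - 1) << 16	(T1SZ == 2)
	 *   TTBR1_OFFSET = 4096 * (1 + 3) = 16384		(only L2 used:
	 *                  skip the pgd page and three pmd tables)
	 */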

Signed-off-by: Cyril Chemparathy <cyril@ti.com>
Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
---
 arch/arm/include/asm/pgtable-3level-hwdef.h |   10 ++++++++++
 arch/arm/mm/proc-v7-3level.S                |   16 ++++------------
 2 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/arch/arm/include/asm/pgtable-3level-hwdef.h b/arch/arm/include/asm/pgtable-3level-hwdef.h
index d795282..b501650 100644
--- a/arch/arm/include/asm/pgtable-3level-hwdef.h
+++ b/arch/arm/include/asm/pgtable-3level-hwdef.h
@@ -74,4 +74,14 @@
 #define PHYS_MASK_SHIFT		(40)
 #define PHYS_MASK		((1ULL << PHYS_MASK_SHIFT) - 1)
 
+#if defined CONFIG_VMSPLIT_2G
+#define TTBR1_OFFSET	(1 << 4)		/* skip two L1 entries */
+#elif defined CONFIG_VMSPLIT_3G
+#define TTBR1_OFFSET	(4096 * (1 + 3))	/* only L2, skip pgd + 3*pmd */
+#else
+#define TTBR1_OFFSET	0
+#endif
+
+#define TTBR1_SIZE	(((PAGE_OFFSET >> 30) - 1) << 16)
+
 #endif
diff --git a/arch/arm/mm/proc-v7-3level.S b/arch/arm/mm/proc-v7-3level.S
index 0001581..3b1a745 100644
--- a/arch/arm/mm/proc-v7-3level.S
+++ b/arch/arm/mm/proc-v7-3level.S
@@ -120,18 +120,10 @@ ENDPROC(cpu_v7_set_pte_ext)
 	 * booting secondary CPUs would end up using TTBR1 for the identity
 	 * mapping set up in TTBR0.
 	 */
-	bhi	9001f				@ PHYS_OFFSET > PAGE_OFFSET?
-	orr	\tmp, \tmp, #(((PAGE_OFFSET >> 30) - 1) << 16) @ TTBCR.T1SZ
-#if defined CONFIG_VMSPLIT_2G
-	/* PAGE_OFFSET == 0x80000000, T1SZ == 1 */
-	add	\ttbr1, \ttbr1, #1 << 4		@ skip two L1 entries
-#elif defined CONFIG_VMSPLIT_3G
-	/* PAGE_OFFSET == 0xc0000000, T1SZ == 2 */
-	add	\ttbr1, \ttbr1, #4096 * (1 + 3)	@ only L2 used, skip pgd+3*pmd
-#endif
-	/* CONFIG_VMSPLIT_1G does not need TTBR1 adjustment */
-9001:	mcr	p15, 0, \tmp, c2, c0, 2		@ TTB control register
-	mcrr	p15, 1, \ttbr1, \zero, c2	@ load TTBR1
+	orrls	\tmp, \tmp, #TTBR1_SIZE				@ TTBCR.T1SZ
+	mcr	p15, 0, \tmp, c2, c0, 2				@ TTBCR
+	addls	\ttbr1, \ttbr1, #TTBR1_OFFSET
+	mcrr	p15, 1, \ttbr1, \zero, c2			@ load TTBR1
 	.endm
 
 	__CPUINIT
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 127+ messages in thread

* [PATCH 13/22] ARM: LPAE: allow proc override of TTB setup
  2012-07-31 23:04 ` Cyril Chemparathy
@ 2012-07-31 23:04   ` Cyril Chemparathy
  -1 siblings, 0 replies; 127+ messages in thread
From: Cyril Chemparathy @ 2012-07-31 23:04 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: arnd, catalin.marinas, nico, linux, will.deacon,
	Cyril Chemparathy, Vitaly Andrianov

This patch allows ARM processor setup functions (*_setup in proc-*.S) to
indicate that the page table has already been programmed.  This is
done by setting r4 (page table pointer) to -1 before returning from the
processor setup handler.

This capability is particularly needed on LPAE systems, where the translation
table base needs to be programmed differently with 64-bit control
register operations.

Further, a few of the processors (arm1026, mohawk, xsc3) were programming the
TTB twice.  This patch prevents the main head.S code from programming the TTB
a second time on these machines.
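In C terms, the check added to __enable_mmu (the adds/mcrne pair below)
amounts to the following sketch, where write_ttbr0() is a hypothetical
stand-in for the mcr instruction:

	/* r4 == -1 signals that proc setup already loaded the TTB */
	if (pgdir + 1 != 0)			/* adds	r5, r4, #1 */
		write_ttbr0(pgdir);		/* mcrne p15, 0, r4, c2, c0, 0 */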

Signed-off-by: Cyril Chemparathy <cyril@ti.com>
Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
---
 arch/arm/kernel/head.S       |   10 +++++-----
 arch/arm/mm/proc-arm1026.S   |    1 +
 arch/arm/mm/proc-mohawk.S    |    1 +
 arch/arm/mm/proc-v6.S        |    2 ++
 arch/arm/mm/proc-v7-2level.S |    3 ++-
 arch/arm/mm/proc-v7-3level.S |    1 +
 arch/arm/mm/proc-v7.S        |    1 +
 arch/arm/mm/proc-xsc3.S      |    1 +
 8 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index fa820b3..7b1a3be 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -414,17 +414,17 @@ __enable_mmu:
 #ifdef CONFIG_CPU_ICACHE_DISABLE
 	bic	r0, r0, #CR_I
 #endif
-#ifdef CONFIG_ARM_LPAE
-	mov	r5, #0
-	mcrr	p15, 0, r4, r5, c2		@ load TTBR0
-#else
+#ifndef CONFIG_ARM_LPAE
 	mov	r5, #(domain_val(DOMAIN_USER, DOMAIN_MANAGER) | \
 		      domain_val(DOMAIN_KERNEL, DOMAIN_MANAGER) | \
 		      domain_val(DOMAIN_TABLE, DOMAIN_MANAGER) | \
 		      domain_val(DOMAIN_IO, DOMAIN_CLIENT))
 	mcr	p15, 0, r5, c3, c0, 0		@ load domain access register
-	mcr	p15, 0, r4, c2, c0, 0		@ load page table pointer
 #endif
+
+	@ has the processor setup already programmed the page table pointer?
+	adds	r5, r4, #1
+	mcrne	p15, 0, r4, c2, c0, 0		@ load page table pointer
 	b	__turn_mmu_on
 ENDPROC(__enable_mmu)
 
diff --git a/arch/arm/mm/proc-arm1026.S b/arch/arm/mm/proc-arm1026.S
index fbc1d5f..c28070e 100644
--- a/arch/arm/mm/proc-arm1026.S
+++ b/arch/arm/mm/proc-arm1026.S
@@ -404,6 +404,7 @@ __arm1026_setup:
 #ifdef CONFIG_MMU
 	mcr	p15, 0, r0, c8, c7		@ invalidate I,D TLBs on v4
 	mcr	p15, 0, r4, c2, c0		@ load page table pointer
+	mvn	r4, #0				@ do not set page table pointer
 #endif
 #ifdef CONFIG_CPU_DCACHE_WRITETHROUGH
 	mov	r0, #4				@ explicitly disable writeback
diff --git a/arch/arm/mm/proc-mohawk.S b/arch/arm/mm/proc-mohawk.S
index fbb2124..a26303c 100644
--- a/arch/arm/mm/proc-mohawk.S
+++ b/arch/arm/mm/proc-mohawk.S
@@ -390,6 +390,7 @@ __mohawk_setup:
 	mcr	p15, 0, r0, c8, c7		@ invalidate I,D TLBs
 	orr	r4, r4, #0x18			@ cache the page table in L2
 	mcr	p15, 0, r4, c2, c0, 0		@ load page table pointer
+	mvn	r4, #0				@ do not set page table pointer
 
 	mov	r0, #0				@ don't allow CP access
 	mcr	p15, 0, r0, c15, c1, 0		@ write CP access register
diff --git a/arch/arm/mm/proc-v6.S b/arch/arm/mm/proc-v6.S
index 566c658..872156e 100644
--- a/arch/arm/mm/proc-v6.S
+++ b/arch/arm/mm/proc-v6.S
@@ -210,7 +210,9 @@ __v6_setup:
 	ALT_UP(orr	r4, r4, #TTB_FLAGS_UP)
 	ALT_SMP(orr	r8, r8, #TTB_FLAGS_SMP)
 	ALT_UP(orr	r8, r8, #TTB_FLAGS_UP)
+	mcr	p15, 0, r4, c2, c0, 0		@ load TTB0
 	mcr	p15, 0, r8, c2, c0, 1		@ load TTB1
+	mvn	r4, #0				@ do not set page table pointer
 #endif /* CONFIG_MMU */
 	adr	r5, v6_crval
 	ldmia	r5, {r5, r6}
diff --git a/arch/arm/mm/proc-v7-2level.S b/arch/arm/mm/proc-v7-2level.S
index 3397803..cc78c0c 100644
--- a/arch/arm/mm/proc-v7-2level.S
+++ b/arch/arm/mm/proc-v7-2level.S
@@ -139,7 +139,7 @@ ENDPROC(cpu_v7_set_pte_ext)
 
 	/*
 	 * Macro for setting up the TTBRx and TTBCR registers.
-	 * - \ttb0 and \ttb1 updated with the corresponding flags.
+	 * - \ttbr0 and \ttbr1 updated with the corresponding flags.
 	 */
 	.macro	v7_ttb_setup, zero, ttbr0, ttbr1, tmp
 	mcr	p15, 0, \zero, c2, c0, 2	@ TTB control register
@@ -147,6 +147,7 @@ ENDPROC(cpu_v7_set_pte_ext)
 	ALT_UP(orr	\ttbr0, \ttbr0, #TTB_FLAGS_UP)
 	ALT_SMP(orr	\ttbr1, \ttbr1, #TTB_FLAGS_SMP)
 	ALT_UP(orr	\ttbr1, \ttbr1, #TTB_FLAGS_UP)
+	mcr	p15, 0, \ttbr0, c2, c0, 0	@ load TTB0
 	mcr	p15, 0, \ttbr1, c2, c0, 1	@ load TTB1
 	.endm
 
diff --git a/arch/arm/mm/proc-v7-3level.S b/arch/arm/mm/proc-v7-3level.S
index 3b1a745..5e3bed1 100644
--- a/arch/arm/mm/proc-v7-3level.S
+++ b/arch/arm/mm/proc-v7-3level.S
@@ -124,6 +124,7 @@ ENDPROC(cpu_v7_set_pte_ext)
 	mcr	p15, 0, \tmp, c2, c0, 2				@ TTBCR
 	addls	\ttbr1, \ttbr1, #TTBR1_OFFSET
 	mcrr	p15, 1, \ttbr1, \zero, c2			@ load TTBR1
+	mcrr	p15, 0, \ttbr0, \zero, c2			@ load TTBR0
 	.endm
 
 	__CPUINIT
diff --git a/arch/arm/mm/proc-v7.S b/arch/arm/mm/proc-v7.S
index c2e2b66..8850194 100644
--- a/arch/arm/mm/proc-v7.S
+++ b/arch/arm/mm/proc-v7.S
@@ -250,6 +250,7 @@ __v7_setup:
 #ifdef CONFIG_MMU
 	mcr	p15, 0, r10, c8, c7, 0		@ invalidate I + D TLBs
 	v7_ttb_setup r10, r4, r8, r5		@ TTBCR, TTBRx setup
+	mvn	r4, #0				@ do not set page table pointer
 	ldr	r5, =PRRR			@ PRRR
 	ldr	r6, =NMRR			@ NMRR
 	mcr	p15, 0, r5, c10, c2, 0		@ write PRRR
diff --git a/arch/arm/mm/proc-xsc3.S b/arch/arm/mm/proc-xsc3.S
index b0d5786..db3836b 100644
--- a/arch/arm/mm/proc-xsc3.S
+++ b/arch/arm/mm/proc-xsc3.S
@@ -455,6 +455,7 @@ __xsc3_setup:
 	mcr	p15, 0, ip, c8, c7, 0		@ invalidate I and D TLBs
 	orr	r4, r4, #0x18			@ cache the page table in L2
 	mcr	p15, 0, r4, c2, c0, 0		@ load page table pointer
+	mvn	r4, #0				@ do not set page table pointer
 
 	mov	r0, #1 << 6			@ cp6 access for early sched_clock
 	mcr	p15, 0, r0, c15, c1, 0		@ write CP access register
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 127+ messages in thread

* [PATCH 14/22] ARM: LPAE: accommodate >32-bit addresses for page table base
  2012-07-31 23:04 ` Cyril Chemparathy
@ 2012-07-31 23:04   ` Cyril Chemparathy
  -1 siblings, 0 replies; 127+ messages in thread
From: Cyril Chemparathy @ 2012-07-31 23:04 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: arnd, catalin.marinas, nico, linux, will.deacon,
	Cyril Chemparathy, Vitaly Andrianov

This patch redefines the early boot-time use of the r4 register to steal a
few low-order bits (ARCH_PGD_SHIFT bits), allowing for up to 38-bit physical
addresses.

This is probably not the best means to this end, and a better alternative may
be to modify the head.S register allocations to fit in full register pairs for
pgdir and swapper_pg_dir.  However, squeezing out these extra registers seemed
to be a far greater pain than squeezing out a few low-order bits from the page
table addresses.
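A sketch of the packing scheme: with ARCH_PGD_SHIFT == L1_CACHE_SHIFT
(typically 6 on ARMv7), pgd_alloc() guarantees at least 64-byte alignment, so
the low six bits of the page table address are zero and the freed register
bits can carry phys[37:32] (pgdir_phys below is a hypothetical variable):

	/* pack: phys[5:0] == 0 by alignment, so nothing is lost */
	u32 packed   = (u32)(pgdir_phys >> ARCH_PGD_SHIFT);	/* fits in r4 */
	/* unpack: recover the full (up to 38-bit) physical address */
	u64 restored = (u64)packed << ARCH_PGD_SHIFT;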

Signed-off-by: Cyril Chemparathy <cyril@ti.com>
Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
---
 arch/arm/include/asm/cache.h |    9 +++++++++
 arch/arm/kernel/head.S       |    7 +++++--
 arch/arm/kernel/smp.c        |   11 +++++++++--
 arch/arm/mm/proc-arm1026.S   |    2 ++
 arch/arm/mm/proc-mohawk.S    |    2 ++
 arch/arm/mm/proc-v6.S        |    2 ++
 arch/arm/mm/proc-v7-2level.S |    2 ++
 arch/arm/mm/proc-v7-3level.S |    9 +++++++--
 arch/arm/mm/proc-v7.S        |    1 +
 arch/arm/mm/proc-xsc3.S      |    2 ++
 10 files changed, 41 insertions(+), 6 deletions(-)

diff --git a/arch/arm/include/asm/cache.h b/arch/arm/include/asm/cache.h
index 75fe66b..986480c 100644
--- a/arch/arm/include/asm/cache.h
+++ b/arch/arm/include/asm/cache.h
@@ -17,6 +17,15 @@
 #define ARCH_DMA_MINALIGN	L1_CACHE_BYTES
 
 /*
+ * Minimum guaranteed alignment in pgd_alloc().  The page table pointers passed
+ * around in head.S and proc-*.S are shifted by this amount, in order to
+ * leave spare high bits for systems with physical address extension.  This
+ * does not fully accommodate the 40-bit addressing capability of ARM LPAE, but
+ * gives us about 38 bits or so.
+ */
+#define ARCH_PGD_SHIFT		L1_CACHE_SHIFT
+
+/*
  * With EABI on ARMv5 and above we must have 64-bit aligned slab pointers.
  */
 #if defined(CONFIG_AEABI) && (__LINUX_ARM_ARCH__ >= 5)
diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index 7b1a3be..af029ec 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -22,6 +22,7 @@
 #include <asm/memory.h>
 #include <asm/thread_info.h>
 #include <asm/pgtable.h>
+#include <asm/cache.h>
 
 #ifdef CONFIG_DEBUG_LL
 #include <mach/debug-macro.S>
@@ -163,7 +164,7 @@ ENDPROC(stext)
  *
  * Returns:
  *  r0, r3, r5-r7 corrupted
- *  r4 = physical page table address
+ *  r4 = page table (see ARCH_PGD_SHIFT in asm/cache.h)
  */
 __create_page_tables:
 	pgtbl	r4, r8				@ page table address
@@ -323,6 +324,7 @@ __create_page_tables:
 #ifdef CONFIG_ARM_LPAE
 	sub	r4, r4, #0x1000		@ point to the PGD table
 #endif
+	mov	r4, r4, lsr #ARCH_PGD_SHIFT
 	mov	pc, lr
 ENDPROC(__create_page_tables)
 	.ltorg
@@ -395,7 +397,7 @@ __secondary_data:
  *  r0  = cp#15 control register
  *  r1  = machine ID
  *  r2  = atags or dtb pointer
- *  r4  = page table pointer
+ *  r4  = page table (see ARCH_PGD_SHIFT in asm/cache.h)
  *  r9  = processor ID
  *  r13 = *virtual* address to jump to upon completion
  */
@@ -424,6 +426,7 @@ __enable_mmu:
 
 	@ has the processor setup already programmed the page table pointer?
 	adds	r5, r4, #1
+	movne	r4, r4, lsl #ARCH_PGD_SHIFT
 	mcrne	p15, 0, r4, c2, c0, 0		@ load page table pointer
 	b	__turn_mmu_on
 ENDPROC(__enable_mmu)
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 2c7217d..e41e1be 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -42,6 +42,7 @@
 #include <asm/ptrace.h>
 #include <asm/localtimer.h>
 #include <asm/smp_plat.h>
+#include <asm/cache.h>
 
 /*
  * as from 2.5, kernels no longer have an init_tasks structure
@@ -62,6 +63,7 @@ static DECLARE_COMPLETION(cpu_running);
 
 int __cpuinit __cpu_up(unsigned int cpu, struct task_struct *idle)
 {
+	phys_addr_t pgdir;
 	int ret;
 
 	/*
@@ -69,8 +71,13 @@ int __cpuinit __cpu_up(unsigned int cpu, struct task_struct *idle)
 	 * its stack and the page tables.
 	 */
 	secondary_data.stack = task_stack_page(idle) + THREAD_START_SP;
-	secondary_data.pgdir = virt_to_phys(idmap_pgd);
-	secondary_data.swapper_pg_dir = virt_to_phys(swapper_pg_dir);
+
+	pgdir = virt_to_phys(idmap_pgd);
+	secondary_data.pgdir = pgdir >> ARCH_PGD_SHIFT;
+
+	pgdir = virt_to_phys(swapper_pg_dir);
+	secondary_data.swapper_pg_dir = pgdir >> ARCH_PGD_SHIFT;
+
 	__cpuc_flush_dcache_area(&secondary_data, sizeof(secondary_data));
 	outer_clean_range(__pa(&secondary_data), __pa(&secondary_data + 1));
 
diff --git a/arch/arm/mm/proc-arm1026.S b/arch/arm/mm/proc-arm1026.S
index c28070e..4556f77 100644
--- a/arch/arm/mm/proc-arm1026.S
+++ b/arch/arm/mm/proc-arm1026.S
@@ -22,6 +22,7 @@
 #include <asm/pgtable-hwdef.h>
 #include <asm/pgtable.h>
 #include <asm/ptrace.h>
+#include <asm/cache.h>
 
 #include "proc-macros.S"
 
@@ -403,6 +404,7 @@ __arm1026_setup:
 	mcr	p15, 0, r0, c7, c10, 4		@ drain write buffer on v4
 #ifdef CONFIG_MMU
 	mcr	p15, 0, r0, c8, c7		@ invalidate I,D TLBs on v4
+	mov	r4, r4, lsl #ARCH_PGD_SHIFT
 	mcr	p15, 0, r4, c2, c0		@ load page table pointer
 	mvn	r4, #0				@ do not set page table pointer
 #endif
diff --git a/arch/arm/mm/proc-mohawk.S b/arch/arm/mm/proc-mohawk.S
index a26303c..13fcc67 100644
--- a/arch/arm/mm/proc-mohawk.S
+++ b/arch/arm/mm/proc-mohawk.S
@@ -28,6 +28,7 @@
 #include <asm/pgtable.h>
 #include <asm/page.h>
 #include <asm/ptrace.h>
+#include <asm/cache.h>
 #include "proc-macros.S"
 
 /*
@@ -388,6 +389,7 @@ __mohawk_setup:
 	mcr	p15, 0, r0, c7, c7		@ invalidate I,D caches
 	mcr	p15, 0, r0, c7, c10, 4		@ drain write buffer
 	mcr	p15, 0, r0, c8, c7		@ invalidate I,D TLBs
+	mov	r4, r4, lsl #ARCH_PGD_SHIFT
 	orr	r4, r4, #0x18			@ cache the page table in L2
 	mcr	p15, 0, r4, c2, c0, 0		@ load page table pointer
 	mvn	r4, #0				@ do not set page table pointer
diff --git a/arch/arm/mm/proc-v6.S b/arch/arm/mm/proc-v6.S
index 872156e..4751be7 100644
--- a/arch/arm/mm/proc-v6.S
+++ b/arch/arm/mm/proc-v6.S
@@ -17,6 +17,7 @@
 #include <asm/hwcap.h>
 #include <asm/pgtable-hwdef.h>
 #include <asm/pgtable.h>
+#include <asm/cache.h>
 
 #include "proc-macros.S"
 
@@ -206,6 +207,7 @@ __v6_setup:
 #ifdef CONFIG_MMU
 	mcr	p15, 0, r0, c8, c7, 0		@ invalidate I + D TLBs
 	mcr	p15, 0, r0, c2, c0, 2		@ TTB control register
+	mov	r4, r4, lsl #ARCH_PGD_SHIFT
 	ALT_SMP(orr	r4, r4, #TTB_FLAGS_SMP)
 	ALT_UP(orr	r4, r4, #TTB_FLAGS_UP)
 	ALT_SMP(orr	r8, r8, #TTB_FLAGS_SMP)
diff --git a/arch/arm/mm/proc-v7-2level.S b/arch/arm/mm/proc-v7-2level.S
index cc78c0c..f4bc63b 100644
--- a/arch/arm/mm/proc-v7-2level.S
+++ b/arch/arm/mm/proc-v7-2level.S
@@ -143,8 +143,10 @@ ENDPROC(cpu_v7_set_pte_ext)
 	 */
 	.macro	v7_ttb_setup, zero, ttbr0, ttbr1, tmp
 	mcr	p15, 0, \zero, c2, c0, 2	@ TTB control register
+	mov	\ttbr0, \ttbr0, lsl #ARCH_PGD_SHIFT
 	ALT_SMP(orr	\ttbr0, \ttbr0, #TTB_FLAGS_SMP)
 	ALT_UP(orr	\ttbr0, \ttbr0, #TTB_FLAGS_UP)
+	mov	\ttbr1, \ttbr1, lsl #ARCH_PGD_SHIFT
 	ALT_SMP(orr	\ttbr1, \ttbr1, #TTB_FLAGS_SMP)
 	ALT_UP(orr	\ttbr1, \ttbr1, #TTB_FLAGS_UP)
 	mcr	p15, 0, \ttbr0, c2, c0, 0	@ load TTB0
diff --git a/arch/arm/mm/proc-v7-3level.S b/arch/arm/mm/proc-v7-3level.S
index 5e3bed1..33f322a 100644
--- a/arch/arm/mm/proc-v7-3level.S
+++ b/arch/arm/mm/proc-v7-3level.S
@@ -103,6 +103,7 @@ ENDPROC(cpu_v7_set_pte_ext)
 	 */
 	.macro	v7_ttb_setup, zero, ttbr0, ttbr1, tmp
 	ldr	\tmp, =swapper_pg_dir		@ swapper_pg_dir virtual address
+	mov	\tmp, \tmp, lsr #ARCH_PGD_SHIFT
 	cmp	\ttbr1, \tmp			@ PHYS_OFFSET > PAGE_OFFSET? (branch below)
 	mrc	p15, 0, \tmp, c2, c0, 2		@ TTB control register
 	orr	\tmp, \tmp, #TTB_EAE
@@ -122,8 +123,12 @@ ENDPROC(cpu_v7_set_pte_ext)
 	 */
 	orrls	\tmp, \tmp, #TTBR1_SIZE				@ TTBCR.T1SZ
 	mcr	p15, 0, \tmp, c2, c0, 2				@ TTBCR
+	mov	\tmp, \ttbr1, lsr #(32 - ARCH_PGD_SHIFT)	@ upper bits
+	mov	\ttbr1, \ttbr1, lsl #ARCH_PGD_SHIFT		@ lower bits
 	addls	\ttbr1, \ttbr1, #TTBR1_OFFSET
-	mcrr	p15, 1, \ttbr1, \zero, c2			@ load TTBR1
-	mcrr	p15, 0, \ttbr0, \zero, c2			@ load TTBR0
+	mcrr	p15, 1, \ttbr1, \tmp, c2			@ load TTBR1
+	mov	\tmp, \ttbr0, lsr #(32 - ARCH_PGD_SHIFT)	@ upper bits
+	mov	\ttbr0, \ttbr0, lsl #ARCH_PGD_SHIFT		@ lower bits
+	mcrr	p15, 0, \ttbr0, \tmp, c2			@ load TTBR0
 	.endm
 
diff --git a/arch/arm/mm/proc-v7.S b/arch/arm/mm/proc-v7.S
index 8850194..443f602 100644
--- a/arch/arm/mm/proc-v7.S
+++ b/arch/arm/mm/proc-v7.S
@@ -16,6 +16,7 @@
 #include <asm/hwcap.h>
 #include <asm/pgtable-hwdef.h>
 #include <asm/pgtable.h>
+#include <asm/cache.h>
 
 #include "proc-macros.S"
 
diff --git a/arch/arm/mm/proc-xsc3.S b/arch/arm/mm/proc-xsc3.S
index db3836b..a43a07d 100644
--- a/arch/arm/mm/proc-xsc3.S
+++ b/arch/arm/mm/proc-xsc3.S
@@ -32,6 +32,7 @@
 #include <asm/pgtable-hwdef.h>
 #include <asm/page.h>
 #include <asm/ptrace.h>
+#include <asm/cache.h>
 #include "proc-macros.S"
 
 /*
@@ -453,6 +454,7 @@ __xsc3_setup:
 	mcr	p15, 0, ip, c7, c10, 4		@ data write barrier
 	mcr	p15, 0, ip, c7, c5, 4		@ prefetch flush
 	mcr	p15, 0, ip, c8, c7, 0		@ invalidate I and D TLBs
+	mov	r4, r4, lsl #ARCH_PGD_SHIFT
 	orr	r4, r4, #0x18			@ cache the page table in L2
 	mcr	p15, 0, r4, c2, c0, 0		@ load page table pointer
 	mvn	r4, #0				@ do not set page table pointer
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 127+ messages in thread

* [PATCH 15/22] ARM: mm: use physical addresses in highmem sanity checks
  2012-07-31 23:04 ` Cyril Chemparathy
@ 2012-07-31 23:04   ` Cyril Chemparathy
  -1 siblings, 0 replies; 127+ messages in thread
From: Cyril Chemparathy @ 2012-07-31 23:04 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: arnd, catalin.marinas, nico, linux, will.deacon,
	Cyril Chemparathy, Vitaly Andrianov

This patch modifies the highmem sanity checking code to use physical
addresses instead of virtual addresses.  This change eliminates the
wrap-around problems associated with the original virtual address based
checks, and simplifies the code a bit.

The one constraint imposed here is that low physical memory must be mapped
in a monotonically increasing fashion if there are multiple banks of memory,
i.e., x < y must imply pa(x) < pa(y).
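
For illustration, consider made-up numbers (not taken from any particular
board): PAGE_OFFSET = 0xc0000000, PHYS_OFFSET = 0x80000000, and a bank
starting at physical 0x8_0000_0000, i.e. above 4G:

	/*
	 * __va(0x800000000) = 0x800000000 - 0x80000000 + 0xc0000000
	 *                   = 0x840000000
	 *                   = 0x40000000 after 32-bit truncation,
	 *
	 * which appears to lie below PAGE_OFFSET.  The physical
	 * comparison (bank->start >= vmalloc_limit) is performed in
	 * phys_addr_t and has no truncation to go wrong.
	 */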

Signed-off-by: Cyril Chemparathy <cyril@ti.com>
Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
---
 arch/arm/mm/mmu.c |   22 ++++++++++------------
 1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index 226985c..adaf8c3 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -901,6 +901,7 @@ phys_addr_t arm_lowmem_limit __initdata = 0;
 void __init sanity_check_meminfo(void)
 {
 	int i, j, highmem = 0;
+	phys_addr_t vmalloc_limit = __pa(vmalloc_min - 1) + 1;
 
 	for (i = 0, j = 0; i < meminfo.nr_banks; i++) {
 		struct membank *bank = &meminfo.bank[j];
@@ -910,8 +911,7 @@ void __init sanity_check_meminfo(void)
 			highmem = 1;
 
 #ifdef CONFIG_HIGHMEM
-		if (__va(bank->start) >= vmalloc_min ||
-		    __va(bank->start) < (void *)PAGE_OFFSET)
+		if (bank->start >= vmalloc_limit)
 			highmem = 1;
 
 		bank->highmem = highmem;
@@ -920,8 +920,8 @@ void __init sanity_check_meminfo(void)
 		 * Split those memory banks which are partially overlapping
 		 * the vmalloc area greatly simplifying things later.
 		 */
-		if (!highmem && __va(bank->start) < vmalloc_min &&
-		    bank->size > vmalloc_min - __va(bank->start)) {
+		if (!highmem && bank->start < vmalloc_limit &&
+		    bank->size > vmalloc_limit - bank->start) {
 			if (meminfo.nr_banks >= NR_BANKS) {
 				printk(KERN_CRIT "NR_BANKS too low, "
 						 "ignoring high memory\n");
@@ -930,12 +930,12 @@ void __init sanity_check_meminfo(void)
 					(meminfo.nr_banks - i) * sizeof(*bank));
 				meminfo.nr_banks++;
 				i++;
-				bank[1].size -= vmalloc_min - __va(bank->start);
-				bank[1].start = __pa(vmalloc_min - 1) + 1;
+				bank[1].size -= vmalloc_limit - bank->start;
+				bank[1].start = vmalloc_limit;
 				bank[1].highmem = highmem = 1;
 				j++;
 			}
-			bank->size = vmalloc_min - __va(bank->start);
+			bank->size = vmalloc_limit - bank->start;
 		}
 #else
 		bank->highmem = highmem;
@@ -955,8 +955,7 @@ void __init sanity_check_meminfo(void)
 		 * Check whether this memory bank would entirely overlap
 		 * the vmalloc area.
 		 */
-		if (__va(bank->start) >= vmalloc_min ||
-		    __va(bank->start) < (void *)PAGE_OFFSET) {
+		if (bank->start >= vmalloc_limit) {
 			printk(KERN_NOTICE "Ignoring RAM at %.8llx-%.8llx "
 			       "(vmalloc region overlap).\n",
 			       (unsigned long long)bank->start,
@@ -968,9 +967,8 @@ void __init sanity_check_meminfo(void)
 		 * Check whether this memory bank would partially overlap
 		 * the vmalloc area.
 		 */
-		if (__va(bank->start + bank->size) > vmalloc_min ||
-		    __va(bank->start + bank->size) < __va(bank->start)) {
-			unsigned long newsize = vmalloc_min - __va(bank->start);
+		if (bank->start + bank->size > vmalloc_limit) {
+			unsigned long newsize = vmalloc_limit - bank->start;
 			printk(KERN_NOTICE "Truncating RAM at %.8llx-%.8llx "
 			       "to -%.8llx (vmalloc region overlap).\n",
 			       (unsigned long long)bank->start,
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 127+ messages in thread

* [PATCH 16/22] ARM: mm: cleanup checks for membank overlap with vmalloc area
  2012-07-31 23:04 ` Cyril Chemparathy
@ 2012-07-31 23:04   ` Cyril Chemparathy
  -1 siblings, 0 replies; 127+ messages in thread
From: Cyril Chemparathy @ 2012-07-31 23:04 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: arnd, catalin.marinas, nico, linux, will.deacon,
	Cyril Chemparathy, Vitaly Andrianov

On Keystone platforms, physical memory is entirely outside the 32-bit
addressable range.  Therefore, the (bank->start > ULONG_MAX) check below marks
the entire system memory as highmem, and this causes unpleasantness all over.

This patch eliminates the extra bank start check (against ULONG_MAX) by
checking bank->start against the physical address corresponding to vmalloc_min
instead.

In the process, this patch also cleans up parts of the highmem sanity check
code by removing what has now become a redundant check for banks that entirely
overlap with the vmalloc range.
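
As a concrete example (numbers are illustrative only):

	/*
	 * Keystone-like layout, with PHYS_OFFSET pointing at the high
	 * physical range:
	 *
	 *   bank->start    = 0x8_0000_0000
	 *   vmalloc_limit  = __pa(vmalloc_min) ~= 0x8_3000_0000
	 *
	 * old: bank->start > ULONG_MAX (0xffffffff)  ->  highmem
	 * new: bank->start < vmalloc_limit           ->  stays lowmem
	 */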

Signed-off-by: Cyril Chemparathy <cyril@ti.com>
Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
---
 arch/arm/mm/mmu.c |   19 +------------------
 1 file changed, 1 insertion(+), 18 deletions(-)

diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index adaf8c3..4840efa 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -907,15 +907,12 @@ void __init sanity_check_meminfo(void)
 		struct membank *bank = &meminfo.bank[j];
 		*bank = meminfo.bank[i];
 
-		if (bank->start > ULONG_MAX)
-			highmem = 1;
-
-#ifdef CONFIG_HIGHMEM
 		if (bank->start >= vmalloc_limit)
 			highmem = 1;
 
 		bank->highmem = highmem;
 
+#ifdef CONFIG_HIGHMEM
 		/*
 		 * Split those memory banks which are partially overlapping
 		 * the vmalloc area greatly simplifying things later.
@@ -938,8 +935,6 @@ void __init sanity_check_meminfo(void)
 			bank->size = vmalloc_limit - bank->start;
 		}
 #else
-		bank->highmem = highmem;
-
 		/*
 		 * Highmem banks not allowed with !CONFIG_HIGHMEM.
 		 */
@@ -952,18 +947,6 @@ void __init sanity_check_meminfo(void)
 		}
 
 		/*
-		 * Check whether this memory bank would entirely overlap
-		 * the vmalloc area.
-		 */
-		if (bank->start >= vmalloc_limit) {
-			printk(KERN_NOTICE "Ignoring RAM at %.8llx-%.8llx "
-			       "(vmalloc region overlap).\n",
-			       (unsigned long long)bank->start,
-			       (unsigned long long)bank->start + bank->size - 1);
-			continue;
-		}
-
-		/*
 		 * Check whether this memory bank would partially overlap
 		 * the vmalloc area.
 		 */
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 127+ messages in thread

* [PATCH 17/22] ARM: mm: clean up membank size limit checks
  2012-07-31 23:04 ` Cyril Chemparathy
@ 2012-07-31 23:04   ` Cyril Chemparathy
  -1 siblings, 0 replies; 127+ messages in thread
From: Cyril Chemparathy @ 2012-07-31 23:04 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: arnd, catalin.marinas, nico, linux, will.deacon,
	Cyril Chemparathy, Vitaly Andrianov

This patch cleans up the highmem sanity check code by simplifying the range
checks with a pre-calculated size_limit.  This patch should otherwise have
no functional impact.

This patch also removes a redundant (bank->start < vmalloc_limit) check, since
this is already covered by the !highmem condition.
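
Expressed as a helper (a sketch only; the patch open-codes this inside
sanity_check_meminfo()):

	/*
	 * Cap on how much of a bank may remain lowmem; banks starting
	 * at or above the vmalloc boundary are left uncapped, as they
	 * are entirely highmem anyway.
	 */
	static phys_addr_t bank_size_limit(const struct membank *bank,
					   phys_addr_t vmalloc_limit)
	{
		if (bank->start >= vmalloc_limit)
			return bank->size;

		return vmalloc_limit - bank->start;
	}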

Signed-off-by: Cyril Chemparathy <cyril@ti.com>
Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
---
 arch/arm/mm/mmu.c |   19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index 4840efa..6b0baf3 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -905,10 +905,15 @@ void __init sanity_check_meminfo(void)
 
 	for (i = 0, j = 0; i < meminfo.nr_banks; i++) {
 		struct membank *bank = &meminfo.bank[j];
+		phys_addr_t size_limit;
+
 		*bank = meminfo.bank[i];
+		size_limit = bank->size;
 
 		if (bank->start >= vmalloc_limit)
 			highmem = 1;
+		else
+			size_limit = vmalloc_limit - bank->start;
 
 		bank->highmem = highmem;
 
@@ -917,8 +922,7 @@ void __init sanity_check_meminfo(void)
 		 * Split those memory banks which are partially overlapping
 		 * the vmalloc area greatly simplifying things later.
 		 */
-		if (!highmem && bank->start < vmalloc_limit &&
-		    bank->size > vmalloc_limit - bank->start) {
+		if (!highmem && bank->size > size_limit) {
 			if (meminfo.nr_banks >= NR_BANKS) {
 				printk(KERN_CRIT "NR_BANKS too low, "
 						 "ignoring high memory\n");
@@ -927,12 +931,12 @@ void __init sanity_check_meminfo(void)
 					(meminfo.nr_banks - i) * sizeof(*bank));
 				meminfo.nr_banks++;
 				i++;
-				bank[1].size -= vmalloc_limit - bank->start;
+				bank[1].size -= size_limit;
 				bank[1].start = vmalloc_limit;
 				bank[1].highmem = highmem = 1;
 				j++;
 			}
-			bank->size = vmalloc_limit - bank->start;
+			bank->size = size_limit;
 		}
 #else
 		/*
@@ -950,14 +954,13 @@ void __init sanity_check_meminfo(void)
 		 * Check whether this memory bank would partially overlap
 		 * the vmalloc area.
 		 */
-		if (bank->start + bank->size > vmalloc_limit) {
-			unsigned long newsize = vmalloc_limit - bank->start;
+		if (bank->size > size_limit) {
 			printk(KERN_NOTICE "Truncating RAM at %.8llx-%.8llx "
 			       "to -%.8llx (vmalloc region overlap).\n",
 			       (unsigned long long)bank->start,
 			       (unsigned long long)bank->start + bank->size - 1,
-			       (unsigned long long)bank->start + newsize - 1);
-			bank->size = newsize;
+			       (unsigned long long)bank->start + size_limit - 1);
+			bank->size = size_limit;
 		}
 #endif
 		if (!bank->highmem && bank->start + bank->size > arm_lowmem_limit)
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 127+ messages in thread

* [PATCH 18/22] ARM: add virt_to_idmap for interconnect aliasing
  2012-07-31 23:04 ` Cyril Chemparathy
@ 2012-07-31 23:04   ` Cyril Chemparathy
  -1 siblings, 0 replies; 127+ messages in thread
From: Cyril Chemparathy @ 2012-07-31 23:04 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: arnd, catalin.marinas, nico, linux, will.deacon,
	Vitaly Andrianov, Cyril Chemparathy

From: Vitaly Andrianov <vitalya@ti.com>

On some PAE systems (e.g. TI Keystone), memory is above the 32-bit
addressable limit, and the interconnect provides an aliased view of parts of
physical memory in the 32-bit addressable space.  This alias is strictly for
boot time usage, and is not otherwise usable because of coherency
limitations.

On such systems, the idmap mechanism needs to take this aliased mapping into
account.  This patch introduces a virt_to_idmap() macro, which can be used
on such sub-architectures to represent the interconnect-supported boot time
alias.  Most other systems would leave this macro untouched, i.e., simply do
a virt_to_phys() and nothing more.
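
For example, a sub-architecture could override the macro in its
mach/memory.h along these lines (the addresses below are hypothetical, not
Keystone's actual interconnect map):

	#define HIGH_PHYS_START		0x800000000ULL	/* assumed */
	#define LOW_PHYS_ALIAS		0x080000000ULL	/* assumed */

	#define virt_to_idmap(x)	((phys_addr_t)virt_to_phys(x) - \
					 HIGH_PHYS_START + LOW_PHYS_ALIAS)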

Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
Signed-off-by: Cyril Chemparathy <cyril@ti.com>
---
 arch/arm/include/asm/memory.h |    9 +++++++++
 arch/arm/kernel/smp.c         |    4 ++--
 arch/arm/mm/idmap.c           |    4 ++--
 3 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
index 0c1b396..4bf47ed 100644
--- a/arch/arm/include/asm/memory.h
+++ b/arch/arm/include/asm/memory.h
@@ -231,6 +231,15 @@ static inline void *phys_to_virt(phys_addr_t x)
 #define pfn_to_kaddr(pfn)	__va((pfn) << PAGE_SHIFT)
 
 /*
+ * These are for systems that have a hardware interconnect supported alias of
+ * physical memory for idmap purposes.  Most cases should leave these
+ * untouched.
+ */
+#ifndef virt_to_idmap
+#define virt_to_idmap(x) virt_to_phys(x)
+#endif
+
+/*
  * Virtual <-> DMA view memory address translations
  * Again, these are *only* valid on the kernel direct mapped RAM
  * memory.  Use of these is *deprecated* (and that doesn't mean
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index e41e1be..cce630c 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -72,10 +72,10 @@ int __cpuinit __cpu_up(unsigned int cpu, struct task_struct *idle)
 	 */
 	secondary_data.stack = task_stack_page(idle) + THREAD_START_SP;
 
-	pgdir = virt_to_phys(idmap_pgd);
+	pgdir = virt_to_idmap(idmap_pgd);
 	secondary_data.pgdir = pgdir >> ARCH_PGD_SHIFT;
 
-	pgdir = virt_to_phys(swapper_pg_dir);
+	pgdir = virt_to_idmap(swapper_pg_dir);
 	secondary_data.swapper_pg_dir = pgdir >> ARCH_PGD_SHIFT;
 
 	__cpuc_flush_dcache_area(&secondary_data, sizeof(secondary_data));
diff --git a/arch/arm/mm/idmap.c b/arch/arm/mm/idmap.c
index ab88ed4..919cb6e 100644
--- a/arch/arm/mm/idmap.c
+++ b/arch/arm/mm/idmap.c
@@ -85,8 +85,8 @@ static int __init init_static_idmap(void)
 		return -ENOMEM;
 
 	/* Add an identity mapping for the physical address of the section. */
-	idmap_start = virt_to_phys((void *)__idmap_text_start);
-	idmap_end = virt_to_phys((void *)__idmap_text_end);
+	idmap_start = virt_to_idmap((void *)__idmap_text_start);
+	idmap_end = virt_to_idmap((void *)__idmap_text_end);
 
 	pr_info("Setting up static identity map for 0x%llx - 0x%llx\n",
 		(long long)idmap_start, (long long)idmap_end);
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 127+ messages in thread

* [PATCH 19/22] ARM: recreate kernel mappings in early_paging_init()
  2012-07-31 23:04 ` Cyril Chemparathy
@ 2012-07-31 23:04   ` Cyril Chemparathy
  -1 siblings, 0 replies; 127+ messages in thread
From: Cyril Chemparathy @ 2012-07-31 23:04 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: arnd, catalin.marinas, nico, linux, will.deacon,
	Cyril Chemparathy, Vitaly Andrianov

This patch adds a step in the init sequence to recreate the kernel code/data
page table mappings prior to full paging initialization.  This is necessary
on LPAE systems whose physical memory lies entirely outside the 32-bit (4G)
addressable range.  On these systems, this implementation provides a machine
descriptor hook that allows PHYS_OFFSET to be overridden in a
machine-specific fashion.
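
A machine would hook in roughly as follows (a sketch with assumed names and
offsets; the actual Keystone hook appears later in this series):

	static void __init example_init_meminfo(void)
	{
		/*
		 * example_set_phys_offset() is hypothetical: it stands
		 * for whatever retargets PHYS_OFFSET (and the
		 * phys/virt patching) at the >4G range before the
		 * kernel mapping is recreated.
		 */
		example_set_phys_offset(0x800000000ULL);
	}

	DT_MACHINE_START(EXAMPLE, "Example LPAE machine")
		.init_meminfo	= example_init_meminfo,
		/* ... remaining fields as usual ... */
	MACHINE_END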

Signed-off-by: Cyril Chemparathy <cyril@ti.com>
Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
---
 arch/arm/include/asm/mach/arch.h |    1 +
 arch/arm/kernel/setup.c          |    3 ++
 arch/arm/mm/mmu.c                |   57 ++++++++++++++++++++++++++++++++++++++
 3 files changed, 61 insertions(+)

diff --git a/arch/arm/include/asm/mach/arch.h b/arch/arm/include/asm/mach/arch.h
index 0b1c94b..2b9ecc5 100644
--- a/arch/arm/include/asm/mach/arch.h
+++ b/arch/arm/include/asm/mach/arch.h
@@ -37,6 +37,7 @@ struct machine_desc {
 	char			restart_mode;	/* default restart mode	*/
 	void			(*fixup)(struct tag *, char **,
 					 struct meminfo *);
+	void			(*init_meminfo)(void);
 	void			(*reserve)(void);/* reserve mem blocks	*/
 	void			(*map_io)(void);/* IO mapping function	*/
 	void			(*init_early)(void);
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index bba3fdc..ccf052c 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -93,6 +93,7 @@ static int __init fpe_setup(char *line)
 __setup("fpe=", fpe_setup);
 #endif
 
+extern void early_paging_init(struct machine_desc *, struct proc_info_list *);
 extern void paging_init(struct machine_desc *desc);
 extern void sanity_check_meminfo(void);
 extern void reboot_setup(char *str);
@@ -1152,6 +1153,8 @@ void __init setup_arch(char **cmdline_p)
 	parse_early_param();
 
 	sort(&meminfo.bank, meminfo.nr_banks, sizeof(meminfo.bank[0]), meminfo_cmp, NULL);
+
+	early_paging_init(mdesc, lookup_processor_type(read_cpuid_id()));
 	sanity_check_meminfo();
 	arm_memblock_init(&meminfo, mdesc);
 
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index 6b0baf3..21fb171 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -28,6 +28,7 @@
 #include <asm/highmem.h>
 #include <asm/system_info.h>
 #include <asm/traps.h>
+#include <asm/procinfo.h>
 
 #include <asm/mach/arch.h>
 #include <asm/mach/map.h>
@@ -1175,6 +1176,62 @@ static void __init map_lowmem(void)
 }
 
 /*
+ * early_paging_init() recreates boot time page table setup, allowing machines
+ * to switch over to a high (>4G) address space on LPAE systems
+ */
+void __init early_paging_init(struct machine_desc *mdesc,
+			      struct proc_info_list *procinfo)
+{
+	bool lpae = IS_ENABLED(CONFIG_ARM_LPAE);
+	pmdval_t pmdprot = procinfo->__cpu_mm_mmu_flags;
+	unsigned long map_start, map_end;
+	pgd_t *pgd0, *pgdk;
+	pud_t *pud0, *pudk;
+	pmd_t *pmd0, *pmdk;
+	phys_addr_t phys;
+	int i;
+
+	if (!lpae)
+		return;
+
+	/* remap kernel code and data */
+	map_start = init_mm.start_code;
+	map_end   = init_mm.brk;
+
+	/* get a handle on things... */
+	pgd0 = pgd_offset_k(0);
+	pud0 = pud_offset(pgd0, 0);
+	pmd0 = pmd_offset(pud0, 0);
+
+	pgdk = pgd_offset_k(map_start);
+	pudk = pud_offset(pgdk, map_start);
+	pmdk = pmd_offset(pudk, map_start);
+
+	phys = PHYS_OFFSET;
+
+	if (mdesc->init_meminfo)
+		mdesc->init_meminfo();
+
+	/* remap level 1 table */
+	for (i = 0; i < PTRS_PER_PGD; i++) {
+		*pud0++ = __pud(__pa(pmd0) | PMD_TYPE_TABLE | L_PGD_SWAPPER);
+		pmd0 += PTRS_PER_PMD;
+	}
+
+	/* remap pmds for kernel mapping */
+	phys = __pa(map_start) & PMD_MASK;
+	do {
+		*pmdk++ = __pmd(phys | pmdprot);
+		phys += PMD_SIZE;
+	} while (phys < __pa(map_end));
+
+	flush_cache_all();
+	cpu_set_ttbr(0, __pa(pgd0));
+	cpu_set_ttbr(1, __pa(pgd0) + TTBR1_OFFSET);
+	local_flush_tlb_all();
+}
+
+/*
  * paging_init() sets up the page tables, initialises the zone memory
  * maps, and sets up the zero page, bad page and bad page tables.
  */
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 127+ messages in thread

* [RFC 20/22] ARM: keystone: introducing TI Keystone platform
  2012-07-31 23:04 ` Cyril Chemparathy
@ 2012-07-31 23:06   ` Cyril Chemparathy
  -1 siblings, 0 replies; 127+ messages in thread
From: Cyril Chemparathy @ 2012-07-31 23:04 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: arnd, catalin.marinas, nico, linux, will.deacon,
	Cyril Chemparathy, Vitaly Andrianov

The Texas Instruments Keystone family of multicore devices now includes a
slew of upcoming Cortex-A15 based devices.  This patch adds basic
definitions for a new Keystone sub-architecture in ARM.

Subsequent patches in this series will extend support to include SMP and
take advantage of the large physical memory addressing capabilities via
LPAE.
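
For reference, the platform builds with the standard kbuild flow using the
defconfig added below (the cross-compiler prefix is only an example):

	$ make ARCH=arm CROSS_COMPILE=arm-none-linux-gnueabi- keystone_defconfig
	$ make ARCH=arm CROSS_COMPILE=arm-none-linux-gnueabi- zImage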

Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
Signed-off-by: Cyril Chemparathy <cyril@ti.com>
---
 arch/arm/Kconfig                                  |   18 +++++
 arch/arm/Makefile                                 |    1 +
 arch/arm/boot/dts/keystone-sim.dts                |   77 +++++++++++++++++++
 arch/arm/configs/keystone_defconfig               |   20 +++++
 arch/arm/mach-keystone/Makefile                   |    1 +
 arch/arm/mach-keystone/Makefile.boot              |    1 +
 arch/arm/mach-keystone/include/mach/debug-macro.S |   44 +++++++++++
 arch/arm/mach-keystone/include/mach/memory.h      |   22 ++++++
 arch/arm/mach-keystone/include/mach/timex.h       |   21 ++++++
 arch/arm/mach-keystone/include/mach/uncompress.h  |   24 ++++++
 arch/arm/mach-keystone/keystone.c                 |   82 +++++++++++++++++++++
 11 files changed, 311 insertions(+)
 create mode 100644 arch/arm/boot/dts/keystone-sim.dts
 create mode 100644 arch/arm/configs/keystone_defconfig
 create mode 100644 arch/arm/mach-keystone/Makefile
 create mode 100644 arch/arm/mach-keystone/Makefile.boot
 create mode 100644 arch/arm/mach-keystone/include/mach/debug-macro.S
 create mode 100644 arch/arm/mach-keystone/include/mach/memory.h
 create mode 100644 arch/arm/mach-keystone/include/mach/timex.h
 create mode 100644 arch/arm/mach-keystone/include/mach/uncompress.h
 create mode 100644 arch/arm/mach-keystone/keystone.c

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index a91009c..e0588e3 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -358,6 +358,24 @@ config ARCH_HIGHBANK
 	help
 	  Support for the Calxeda Highbank SoC based boards.
 
+config ARCH_KEYSTONE
+	bool "Texas Instruments Keystone Devices"
+	select ARCH_WANT_OPTIONAL_GPIOLIB
+	select ARM_GIC
+	select MULTI_IRQ_HANDLER
+	select CLKDEV_LOOKUP
+	select COMMON_CLK
+	select CLKSRC_MMIO
+	select CPU_V7
+	select GENERIC_CLOCKEVENTS
+	select USE_OF
+	select SPARSE_IRQ
+	select NEED_MACH_MEMORY_H
+	select HAVE_SCHED_CLOCK
+	help
+	  Support for boards based on the Texas Instruments Keystone family of
+	  SoCs.
+
 config ARCH_CLPS711X
 	bool "Cirrus Logic CLPS711x/EP721x/EP731x-based"
 	select CPU_ARM720T
diff --git a/arch/arm/Makefile b/arch/arm/Makefile
index 0298b00..13d6ef5 100644
--- a/arch/arm/Makefile
+++ b/arch/arm/Makefile
@@ -143,6 +143,7 @@ machine-$(CONFIG_ARCH_EP93XX)		:= ep93xx
 machine-$(CONFIG_ARCH_GEMINI)		:= gemini
 machine-$(CONFIG_ARCH_H720X)		:= h720x
 machine-$(CONFIG_ARCH_HIGHBANK)		:= highbank
+machine-$(CONFIG_ARCH_KEYSTONE)		:= keystone
 machine-$(CONFIG_ARCH_INTEGRATOR)	:= integrator
 machine-$(CONFIG_ARCH_IOP13XX)		:= iop13xx
 machine-$(CONFIG_ARCH_IOP32X)		:= iop32x
diff --git a/arch/arm/boot/dts/keystone-sim.dts b/arch/arm/boot/dts/keystone-sim.dts
new file mode 100644
index 0000000..118d631
--- /dev/null
+++ b/arch/arm/boot/dts/keystone-sim.dts
@@ -0,0 +1,77 @@
+/dts-v1/;
+/include/ "skeleton.dtsi"
+
+/ {
+	model = "Texas Instruments Keystone 2 SoC";
+	compatible = "ti,keystone-evm";
+	#address-cells = <1>;
+	#size-cells = <1>;
+	interrupt-parent = <&gic>;
+
+	aliases {
+		serial0	= &uart0;
+	};
+
+	chosen {
+		bootargs = "console=ttyS0,115200n8 debug earlyprintk lpj=50000 rdinit=/bin/ash rw root=/dev/ram0 initrd=0x85000000,9M";
+	};
+
+	memory {
+		reg = <0x80000000 0x8000000>;
+	};
+
+	cpus {
+		interrupt-parent = <&gic>;
+
+		cpu@0 {
+			compatible = "arm,cortex-a15";
+		};
+
+		cpu@1 {
+			compatible = "arm,cortex-a15";
+		};
+
+		cpu@2 {
+			compatible = "arm,cortex-a15";
+		};
+
+		cpu@3 {
+			compatible = "arm,cortex-a15";
+		};
+
+	};
+
+	soc {
+		#address-cells = <1>;
+		#size-cells = <1>;
+		ranges;
+		compatible = "ti,keystone","simple-bus";
+		interrupt-parent = <&gic>;
+
+		gic:	interrupt-controller@02560000 {
+			compatible = "arm,cortex-a15-gic";
+			#interrupt-cells = <3>;
+			#size-cells = <0>;
+			#address-cells = <1>;
+			interrupt-controller;
+			reg = <0x02561000 0x1000>,
+			      <0x02562000 0x2000>;
+		};
+
+		timer {
+			compatible = "arm,armv7-timer";
+			interrupts = <1 13 0xf08 1 14 0xf08>;
+			clock-frequency = <10000000>; /* Freq in Hz - optional */
+		};
+
+		uart0:	serial@02530c00 {
+			compatible	= "ns16550a";
+			current-speed	= <115200>;
+			reg-shift	= <2>;
+			reg-io-width	= <4>;
+			reg		= <0x02530c00 0x100>;
+			clock-frequency = <48000000>;
+			interrupts	= <0 277 0xf01>;
+		};
+	};
+};
diff --git a/arch/arm/configs/keystone_defconfig b/arch/arm/configs/keystone_defconfig
new file mode 100644
index 0000000..7f2a04b
--- /dev/null
+++ b/arch/arm/configs/keystone_defconfig
@@ -0,0 +1,20 @@
+CONFIG_EXPERIMENTAL=y
+CONFIG_BLK_DEV_INITRD=y
+CONFIG_ARCH_KEYSTONE=y
+CONFIG_ARM_ARCH_TIMER=y
+CONFIG_AEABI=y
+CONFIG_HIGHMEM=y
+CONFIG_VFP=y
+CONFIG_NEON=y
+# CONFIG_SUSPEND is not set
+CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug"
+CONFIG_BLK_DEV_RAM=y
+CONFIG_SERIAL_8250=y
+CONFIG_SERIAL_8250_CONSOLE=y
+CONFIG_SERIAL_OF_PLATFORM=y
+CONFIG_PRINTK_TIME=y
+CONFIG_DEBUG_KERNEL=y
+CONFIG_DEBUG_INFO=y
+CONFIG_DEBUG_USER=y
+CONFIG_DEBUG_LL=y
+CONFIG_EARLY_PRINTK=y
diff --git a/arch/arm/mach-keystone/Makefile b/arch/arm/mach-keystone/Makefile
new file mode 100644
index 0000000..d4671d5
--- /dev/null
+++ b/arch/arm/mach-keystone/Makefile
@@ -0,0 +1 @@
+obj-y					:= keystone.o
diff --git a/arch/arm/mach-keystone/Makefile.boot b/arch/arm/mach-keystone/Makefile.boot
new file mode 100644
index 0000000..dae9661
--- /dev/null
+++ b/arch/arm/mach-keystone/Makefile.boot
@@ -0,0 +1 @@
+zreladdr-y	:= 0x00008000
diff --git a/arch/arm/mach-keystone/include/mach/debug-macro.S b/arch/arm/mach-keystone/include/mach/debug-macro.S
new file mode 100644
index 0000000..1108210
--- /dev/null
+++ b/arch/arm/mach-keystone/include/mach/debug-macro.S
@@ -0,0 +1,44 @@
+/*
+ * Debugging macro include header
+ *
+ * Copyright 2010-2012 Texas Instruments, Inc.
+ * Copyright (C) 1994-1999 Russell King
+ * Moved from linux/arch/arm/kernel/debug.S by Ben Dooks
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#include <linux/serial_reg.h>
+
+#define UART_SHIFT	2
+
+	.macro	addruart,rp,rv,tmp
+	movw	\rv, #0x0c00
+	movt	\rv, #0xfed3
+	movw	\rp, #0x0c00
+	movt	\rp, #0x0253
+	.endm
+
+
+	.macro	senduart,rd,rx
+	str	\rd, [\rx, #UART_TX << UART_SHIFT]
+	.endm
+
+	.macro	busyuart,rd,rx
+1002:	ldr	\rd, [\rx, #UART_LSR << UART_SHIFT]
+	and	\rd, \rd, #UART_LSR_TEMT | UART_LSR_THRE
+	teq	\rd, #UART_LSR_TEMT | UART_LSR_THRE
+	bne	1002b
+	.endm
+
+	.macro	waituart,rd,rx
+	.endm
diff --git a/arch/arm/mach-keystone/include/mach/memory.h b/arch/arm/mach-keystone/include/mach/memory.h
new file mode 100644
index 0000000..7c78b1e
--- /dev/null
+++ b/arch/arm/mach-keystone/include/mach/memory.h
@@ -0,0 +1,22 @@
+/*
+ * Copyright 2010-2012 Texas Instruments, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_MACH_MEMORY_H
+#define __ASM_MACH_MEMORY_H
+
+#define MAX_PHYSMEM_BITS	36
+#define SECTION_SIZE_BITS	34
+
+#endif /* __ASM_MACH_MEMORY_H */
diff --git a/arch/arm/mach-keystone/include/mach/timex.h b/arch/arm/mach-keystone/include/mach/timex.h
new file mode 100644
index 0000000..f355ecb
--- /dev/null
+++ b/arch/arm/mach-keystone/include/mach/timex.h
@@ -0,0 +1,21 @@
+/*
+ * Copyright 2010-2012 Texas Instruments, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __MACH_TIMEX_H
+#define __MACH_TIMEX_H
+
+#define CLOCK_TICK_RATE		1000000
+
+#endif
diff --git a/arch/arm/mach-keystone/include/mach/uncompress.h b/arch/arm/mach-keystone/include/mach/uncompress.h
new file mode 100644
index 0000000..1071761
--- /dev/null
+++ b/arch/arm/mach-keystone/include/mach/uncompress.h
@@ -0,0 +1,24 @@
+/*
+ * Copyright 2010-2012 Texas Instruments, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __MACH_UNCOMPRESS_H
+#define __MACH_UNCOMPRESS_H
+
+#define putc(c)
+#define flush()
+#define arch_decomp_setup()
+#define arch_decomp_wdog()
+
+#endif
diff --git a/arch/arm/mach-keystone/keystone.c b/arch/arm/mach-keystone/keystone.c
new file mode 100644
index 0000000..4bc60ec
--- /dev/null
+++ b/arch/arm/mach-keystone/keystone.c
@@ -0,0 +1,82 @@
+/*
+ * Copyright 2010-2012 Texas Instruments, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#include <linux/io.h>
+#include <linux/of.h>
+#include <linux/init.h>
+#include <linux/of_irq.h>
+#include <linux/of_platform.h>
+
+#include <asm/setup.h>
+#include <asm/mach/map.h>
+#include <asm/mach/arch.h>
+#include <asm/mach/time.h>
+#include <asm/arch_timer.h>
+#include <asm/hardware/gic.h>
+
+static struct map_desc io_desc[] = {
+	{
+		.virtual        = 0xfe800000UL,
+		.pfn            = __phys_to_pfn(0x02000000UL),
+		.length         = 0x800000UL,
+		.type           = MT_DEVICE
+	},
+};
+
+static void __init keystone_map_io(void)
+{
+	iotable_init(io_desc, sizeof(io_desc)/sizeof(struct map_desc));
+}
+
+static const struct of_device_id irq_match[] = {
+	{ .compatible = "arm,cortex-a15-gic", .data = gic_of_init, },
+	{}
+};
+
+static void __init keystone_init_irq(void)
+{
+	of_irq_init(irq_match);
+}
+
+
+static void __init keystone_timer_init(void)
+{
+	arch_timer_of_register();
+	arch_timer_sched_clock_init();
+}
+
+static struct sys_timer keystone_timer = {
+	.init = keystone_timer_init,
+};
+
+
+static void __init keystone_init(void)
+{
+	of_platform_populate(NULL, of_default_bus_match_table, NULL, NULL);
+}
+
+static const char *keystone_match[] __initconst = {
+	"ti,keystone-evm",
+	NULL,
+};
+
+DT_MACHINE_START(KEYSTONE, "Keystone")
+	.map_io		= keystone_map_io,
+	.init_irq	= keystone_init_irq,
+	.timer		= &keystone_timer,
+	.handle_irq	= gic_handle_irq,
+	.init_machine	= keystone_init,
+	.dt_compat	= keystone_match,
+MACHINE_END
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 127+ messages in thread

* [RFC 21/22] ARM: keystone: enable SMP on Keystone machines
  2012-07-31 23:04 ` Cyril Chemparathy
@ 2012-07-31 23:05   ` Cyril Chemparathy
  -1 siblings, 0 replies; 127+ messages in thread
From: Cyril Chemparathy @ 2012-07-31 23:04 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: arnd, catalin.marinas, nico, linux, will.deacon,
	Cyril Chemparathy, Vitaly Andrianov

This patch adds basic SMP support for Keystone machines.  Nothing very fancy
here, just enough to get 4 CPUs booted up.  This does not include support for
hotplug, etc.

Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
Signed-off-by: Cyril Chemparathy <cyril@ti.com>
---
 arch/arm/Kconfig                    |    1 +
 arch/arm/configs/keystone_defconfig |    2 +
 arch/arm/mach-keystone/Makefile     |    1 +
 arch/arm/mach-keystone/keystone.c   |    3 ++
 arch/arm/mach-keystone/keystone.h   |   23 +++++++++++
 arch/arm/mach-keystone/platsmp.c    |   74 +++++++++++++++++++++++++++++++++++
 6 files changed, 104 insertions(+)
 create mode 100644 arch/arm/mach-keystone/keystone.h
 create mode 100644 arch/arm/mach-keystone/platsmp.c

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index e0588e3..7a76924 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -372,6 +372,7 @@ config ARCH_KEYSTONE
 	select SPARSE_IRQ
 	select NEED_MACH_MEMORY_H
 	select HAVE_SCHED_CLOCK
+	select HAVE_SMP
 	help
 	  Support for boards based on the Texas Instruments Keystone family of
 	  SoCs.
diff --git a/arch/arm/configs/keystone_defconfig b/arch/arm/configs/keystone_defconfig
index 7f2a04b..5f71e66 100644
--- a/arch/arm/configs/keystone_defconfig
+++ b/arch/arm/configs/keystone_defconfig
@@ -1,7 +1,9 @@
 CONFIG_EXPERIMENTAL=y
 CONFIG_BLK_DEV_INITRD=y
 CONFIG_ARCH_KEYSTONE=y
+CONFIG_SMP=y
 CONFIG_ARM_ARCH_TIMER=y
+CONFIG_NR_CPUS=4
 CONFIG_AEABI=y
 CONFIG_HIGHMEM=y
 CONFIG_VFP=y
diff --git a/arch/arm/mach-keystone/Makefile b/arch/arm/mach-keystone/Makefile
index d4671d5..3f6b8ab 100644
--- a/arch/arm/mach-keystone/Makefile
+++ b/arch/arm/mach-keystone/Makefile
@@ -1 +1,2 @@
 obj-y					:= keystone.o
+obj-$(CONFIG_SMP)			+= platsmp.o
diff --git a/arch/arm/mach-keystone/keystone.c b/arch/arm/mach-keystone/keystone.c
index 4bc60ec..a4eed57 100644
--- a/arch/arm/mach-keystone/keystone.c
+++ b/arch/arm/mach-keystone/keystone.c
@@ -26,6 +26,8 @@
 #include <asm/arch_timer.h>
 #include <asm/hardware/gic.h>
 
+#include "keystone.h"
+
 static struct map_desc io_desc[] = {
 	{
 		.virtual        = 0xfe800000UL,
@@ -73,6 +75,7 @@ static const char *keystone_match[] __initconst = {
 };
 
 DT_MACHINE_START(KEYSTONE, "Keystone")
+	smp_ops(keystone_smp_ops)
 	.map_io		= keystone_map_io,
 	.init_irq	= keystone_init_irq,
 	.timer		= &keystone_timer,
diff --git a/arch/arm/mach-keystone/keystone.h b/arch/arm/mach-keystone/keystone.h
new file mode 100644
index 0000000..71bd0f4
--- /dev/null
+++ b/arch/arm/mach-keystone/keystone.h
@@ -0,0 +1,23 @@
+/*
+ * Copyright 2010-2012 Texas Instruments, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef __KEYSTONE_H__
+#define __KEYSTONE_H__
+
+extern struct smp_ops keystone_smp_ops;
+extern void secondary_startup(void);
+
+#endif /* __KEYSTONE_H__ */
diff --git a/arch/arm/mach-keystone/platsmp.c b/arch/arm/mach-keystone/platsmp.c
new file mode 100644
index 0000000..dbe7601
--- /dev/null
+++ b/arch/arm/mach-keystone/platsmp.c
@@ -0,0 +1,74 @@
+/*
+ * Copyright 2012 Texas Instruments, Inc.
+ *
+ * Based on platsmp.c, Copyright 2010-2011 Calxeda, Inc.
+ * Based on platsmp.c, Copyright (C) 2002 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#include <linux/init.h>
+#include <linux/smp.h>
+#include <linux/io.h>
+
+#include <asm/smp_plat.h>
+#include <asm/smp_ops.h>
+#include <asm/hardware/gic.h>
+#include <asm/cacheflush.h>
+#include <asm/memory.h>
+
+#include "keystone.h"
+
+static void __init keystone_smp_init_cpus(void)
+{
+	unsigned int i, ncores;
+
+	ncores = 4;
+
+	/* sanity check */
+	if (ncores > NR_CPUS) {
+		pr_warn("restricted to %d cpus\n", NR_CPUS);
+		ncores = NR_CPUS;
+	}
+
+	for (i = 0; i < ncores; i++)
+		set_cpu_possible(i, true);
+
+	set_smp_cross_call(gic_raise_softirq);
+}
+
+static void __init keystone_smp_prepare_cpus(unsigned int max_cpus)
+{
+	/* nothing for now */
+}
+
+static void __cpuinit keystone_secondary_init(unsigned int cpu)
+{
+	gic_secondary_init(0);
+}
+
+static int __cpuinit
+keystone_boot_secondary(unsigned int cpu, struct task_struct *idle)
+{
+	unsigned long *ptr;
+	
+	ptr = phys_to_virt(0x800001f0);
+	ptr[cpu] = virt_to_idmap(&secondary_startup);
+	__cpuc_flush_dcache_area(ptr, sizeof(ptr) * 4);
+
+	return 0;
+}
+
+struct smp_ops keystone_smp_ops __initdata = {
+	smp_init_ops(keystone)
+	smp_secondary_ops(keystone)
+};
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 127+ messages in thread

* [RFC 22/22] ARM: keystone: add switch over to high physical address range
  2012-07-31 23:04 ` Cyril Chemparathy
@ 2012-07-31 23:05   ` Cyril Chemparathy
  -1 siblings, 0 replies; 127+ messages in thread
From: Cyril Chemparathy @ 2012-07-31 23:04 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: arnd, catalin.marinas, nico, linux, will.deacon,
	Cyril Chemparathy, Vitaly Andrianov

Keystone platforms have their physical memory mapped at an address outside the
32-bit physical range.  A Keystone machine with 16G of RAM would find its
memory at 0x0800000000 - 0x0bffffffff.

For boot purposes, the interconnect supports a limited alias of some of this
memory within the 32-bit addressable space (0x80000000 - 0xffffffff).  This
aliasing is implemented in hardware, and is not intended to be used much
beyond boot.  For instance, DMA coherence does not work when running out of
this aliased address space.

Therefore, we've taken the approach of booting out of the low physical address
range, and subsequently switching over to the high range once we're safely
inside machine-specific territory.  This patch implements this switch-over
mechanism, which involves rewiring the TTBRs and page tables to point to the
new physical address space.

Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
Signed-off-by: Cyril Chemparathy <cyril@ti.com>
---
 arch/arm/Kconfig                             |    1 +
 arch/arm/boot/dts/keystone-sim.dts           |    8 +++---
 arch/arm/configs/keystone_defconfig          |    1 +
 arch/arm/mach-keystone/include/mach/memory.h |   25 +++++++++++++++++
 arch/arm/mach-keystone/keystone.c            |   37 ++++++++++++++++++++++++++
 arch/arm/mach-keystone/platsmp.c             |   18 +++++++++++--
 6 files changed, 84 insertions(+), 6 deletions(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 7a76924..33a17c7 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -373,6 +373,7 @@ config ARCH_KEYSTONE
 	select NEED_MACH_MEMORY_H
 	select HAVE_SCHED_CLOCK
 	select HAVE_SMP
+	select ZONE_DMA if ARM_LPAE
 	help
 	  Support for boards based on the Texas Instruments Keystone family of
 	  SoCs.
diff --git a/arch/arm/boot/dts/keystone-sim.dts b/arch/arm/boot/dts/keystone-sim.dts
index 118d631..5912fa1 100644
--- a/arch/arm/boot/dts/keystone-sim.dts
+++ b/arch/arm/boot/dts/keystone-sim.dts
@@ -4,8 +4,8 @@
 / {
 	model = "Texas Instruments Keystone 2 SoC";
 	compatible = "ti,keystone-evm";
-	#address-cells = <1>;
-	#size-cells = <1>;
+	#address-cells = <2>;
+	#size-cells = <2>;
 	interrupt-parent = <&gic>;
 
 	aliases {
@@ -13,11 +13,11 @@
 	};
 
 	chosen {
-		bootargs = "console=ttyS0,115200n8 debug earlyprintk lpj=50000 rdinit=/bin/ash rw root=/dev/ram0 initrd=0x85000000,9M";
+		bootargs = "console=ttyS0,115200n8 debug earlyprintk lpj=50000 rdinit=/bin/ash rw root=/dev/ram0 initrd=0x805000000,9M";
 	};
 
 	memory {
-		reg = <0x80000000 0x8000000>;
+		reg = <0x00000008 0x00000000 0x00000000 0x8000000>;
 	};
 
 	cpus {
diff --git a/arch/arm/configs/keystone_defconfig b/arch/arm/configs/keystone_defconfig
index 5f71e66..8ea3b96 100644
--- a/arch/arm/configs/keystone_defconfig
+++ b/arch/arm/configs/keystone_defconfig
@@ -1,6 +1,7 @@
 CONFIG_EXPERIMENTAL=y
 CONFIG_BLK_DEV_INITRD=y
 CONFIG_ARCH_KEYSTONE=y
+CONFIG_ARM_LPAE=y
 CONFIG_SMP=y
 CONFIG_ARM_ARCH_TIMER=y
 CONFIG_NR_CPUS=4
diff --git a/arch/arm/mach-keystone/include/mach/memory.h b/arch/arm/mach-keystone/include/mach/memory.h
index 7c78b1e..a5f7a1a 100644
--- a/arch/arm/mach-keystone/include/mach/memory.h
+++ b/arch/arm/mach-keystone/include/mach/memory.h
@@ -19,4 +19,29 @@
 #define MAX_PHYSMEM_BITS	36
 #define SECTION_SIZE_BITS	34
 
+#define KEYSTONE_LOW_PHYS_START		0x80000000ULL
+#define KEYSTONE_LOW_PHYS_SIZE		0x80000000ULL /* 2G */
+#define KEYSTONE_LOW_PHYS_END		(KEYSTONE_LOW_PHYS_START + \
+					 KEYSTONE_LOW_PHYS_SIZE - 1)
+
+#define KEYSTONE_HIGH_PHYS_START	0x800000000ULL
+#define KEYSTONE_HIGH_PHYS_SIZE		0x400000000ULL	/* 16G */
+#define KEYSTONE_HIGH_PHYS_END		(KEYSTONE_HIGH_PHYS_START + \
+					 KEYSTONE_HIGH_PHYS_SIZE - 1)
+#ifdef CONFIG_ARM_LPAE
+
+#ifndef __ASSEMBLY__
+
+static inline phys_addr_t __virt_to_idmap(unsigned long x)
+{
+	return (phys_addr_t)(x) - CONFIG_PAGE_OFFSET +
+		KEYSTONE_LOW_PHYS_START;
+}
+
+#define virt_to_idmap(x)	__virt_to_idmap((unsigned long)(x))
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* CONFIG_ARM_LPAE */
+
 #endif /* __ASM_MACH_MEMORY_H */
diff --git a/arch/arm/mach-keystone/keystone.c b/arch/arm/mach-keystone/keystone.c
index a4eed57..e8aee85 100644
--- a/arch/arm/mach-keystone/keystone.c
+++ b/arch/arm/mach-keystone/keystone.c
@@ -74,6 +74,39 @@ static const char *keystone_match[] __initconst = {
 	NULL,
 };
 
+static void __init keystone_init_meminfo(void)
+{
+	bool lpae = IS_ENABLED(CONFIG_ARM_LPAE);
+	bool pvpatch = IS_ENABLED(CONFIG_ARM_PATCH_PHYS_VIRT);
+	phys_addr_t mem_start, mem_end;
+
+	BUG_ON(meminfo.nr_banks < 1);
+
+	mem_start = meminfo.bank[0].start;
+	mem_end   = mem_start + meminfo.bank[0].size - 1;
+
+	/* nothing to do if we are running out of the <32-bit space */
+	if (mem_start >= KEYSTONE_LOW_PHYS_START &&
+	    mem_end   <= KEYSTONE_LOW_PHYS_END)
+		return;
+
+	if (!lpae || !pvpatch) {
+		panic("Enable %s%s%s to run outside 32-bit space\n",
+		      !lpae ? __stringify(CONFIG_ARM_LPAE) : "",
+		      (!lpae && !pvpatch) ? " and " : "",
+		      !pvpatch ? __stringify(CONFIG_ARM_PATCH_PHYS_VIRT) : "");
+	}
+
+	if (mem_start < KEYSTONE_HIGH_PHYS_START ||
+	    mem_end   > KEYSTONE_HIGH_PHYS_END) {
+		panic("Invalid address space for memory (%08llx-%08llx)\n",
+		      (u64)KEYSTONE_HIGH_PHYS_START,
+		      (u64)KEYSTONE_HIGH_PHYS_END);
+	}
+
+	set_phys_offset(KEYSTONE_HIGH_PHYS_START);
+}
+
 DT_MACHINE_START(KEYSTONE, "Keystone")
 	smp_ops(keystone_smp_ops)
 	.map_io		= keystone_map_io,
@@ -82,4 +115,8 @@ DT_MACHINE_START(KEYSTONE, "Keystone")
 	.handle_irq	= gic_handle_irq,
 	.init_machine	= keystone_init,
 	.dt_compat	= keystone_match,
+	.init_meminfo	= keystone_init_meminfo,
+#ifdef CONFIG_ZONE_DMA
+	.dma_zone_size	= SZ_2G,
+#endif
 MACHINE_END
diff --git a/arch/arm/mach-keystone/platsmp.c b/arch/arm/mach-keystone/platsmp.c
index dbe7601..b7f0724 100644
--- a/arch/arm/mach-keystone/platsmp.c
+++ b/arch/arm/mach-keystone/platsmp.c
@@ -24,6 +24,7 @@
 #include <asm/smp_ops.h>
 #include <asm/hardware/gic.h>
 #include <asm/cacheflush.h>
+#include <asm/tlbflush.h>
 #include <asm/memory.h>
 
 #include "keystone.h"
@@ -51,17 +52,30 @@ static void __init keystone_smp_prepare_cpus(unsigned int max_cpus)
 	/* nothing for now */
 }
 
+static void __cpuinit keystone_secondary_initmem(void)
+{
+#ifdef CONFIG_ARM_LPAE
+	pgd_t *pgd0 = pgd_offset_k(0);
+	cpu_set_ttbr(1, __pa(pgd0) + TTBR1_OFFSET);
+	local_flush_tlb_all();
+#endif
+}
+
 static void __cpuinit keystone_secondary_init(unsigned int cpu)
 {
 	gic_secondary_init(0);
+	keystone_secondary_initmem();
 }
 
 static int __cpuinit
 keystone_boot_secondary(unsigned int cpu, struct task_struct *idle)
 {
 	unsigned long *ptr;
-	
-	ptr = phys_to_virt(0x800001f0);
+
+	ptr = IS_ENABLED(CONFIG_ARM_LPAE) ?
+		phys_to_virt(KEYSTONE_HIGH_PHYS_START + 0x1f0) :
+		phys_to_virt(KEYSTONE_LOW_PHYS_START + 0x1f0);
+
 	ptr[cpu] = virt_to_idmap(&secondary_startup);
 	__cpuc_flush_dcache_area(ptr, sizeof(ptr) * 4);
 
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 127+ messages in thread

* Re: [RFC 20/22] ARM: keystone: introducing TI Keystone platform
  2012-07-31 23:06   ` Cyril Chemparathy
@ 2012-07-31 23:16     ` Arnd Bergmann
  -1 siblings, 0 replies; 127+ messages in thread
From: Arnd Bergmann @ 2012-07-31 23:16 UTC (permalink / raw)
  To: Cyril Chemparathy
  Cc: linux-arm-kernel, linux-kernel, catalin.marinas, nico, linux,
	will.deacon, Vitaly Andrianov

On Tuesday 31 July 2012, Cyril Chemparathy wrote:
> Texas Instruments Keystone family of multicore devices now includes an
> upcoming slew of Cortex A15 based devices.  This patch adds basic definitions
> for a new Keystone sub-architecture in ARM.
> 
> Subsequent patches in this series will extend support to include SMP and take
> advantage of the large physical memory addressing capabilities via LPAE.
> 
> Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
> Signed-off-by: Cyril Chemparathy <cyril@ti.com>

Reviewed-by: Arnd Bergmann <arnd@arndb.de>

And some nitpicking:
> +
> +	chosen {
> +		bootargs = "console=ttyS0,115200n8 debug earlyprintk lpj=50000 rdinit=/bin/ash rw root=/dev/ram0 initrd=0x85000000,9M";
> +	};

This command line should not really be here. Most of what you put in it is not
generic to the platform at all.

In order to select the console, use an alias for the serial device.

> +
> +static void __init keystone_map_io(void)
> +{
> +	iotable_init(io_desc, sizeof(io_desc)/sizeof(struct map_desc));
> +}

Use the ARRAY_SIZE macro here.
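
i.e. something like:

	iotable_init(io_desc, ARRAY_SIZE(io_desc));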

	Arnd

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH 06/22] ARM: LPAE: use phys_addr_t in alloc_init_pud()
  2012-07-31 23:04   ` Cyril Chemparathy
@ 2012-08-01 12:08     ` Sergei Shtylyov
  -1 siblings, 0 replies; 127+ messages in thread
From: Sergei Shtylyov @ 2012-08-01 12:08 UTC (permalink / raw)
  To: Cyril Chemparathy
  Cc: linux-arm-kernel, linux-kernel, linux, arnd, nico,
	catalin.marinas, will.deacon, Vitaly Andrianov

Hello.

On 01-08-2012 3:04, Cyril Chemparathy wrote:

> From: Vitaly Andrianov <vitalya@ti.com>

> This patch fixes the alloc_init_pud() function to use phys_addr_t instead of
> unsigned long when passing in the phys argument.

> This is an extension to commit 97092e0c56830457af0639f6bd904537a150ea4a, which

    Please also specify that commit's summary in parens.
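
    I.e. something like (the summary line is shown as a placeholder here):

	commit 97092e0c5683 ("<that commit's summary line>")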

> applied similar changes elsewhere in the ARM memory management code.

> Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
> Signed-off-by: Cyril Chemparathy <cyril@ti.com>

WBR, Sergei


^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [RFC 20/22] ARM: keystone: introducing TI Keystone platform
  2012-07-31 23:16     ` Arnd Bergmann
@ 2012-08-01 15:41       ` Cyril Chemparathy
  -1 siblings, 0 replies; 127+ messages in thread
From: Cyril Chemparathy @ 2012-08-01 15:41 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linux-arm-kernel, linux-kernel, catalin.marinas, nico, linux,
	will.deacon, Vitaly Andrianov

On 7/31/2012 7:16 PM, Arnd Bergmann wrote:
> On Tuesday 31 July 2012, Cyril Chemparathy wrote:
>> Texas Instruments Keystone family of multicore devices now includes an
>> upcoming slew of Cortex A15 based devices.  This patch adds basic definitions
>> for a new Keystone sub-architecture in ARM.
>>
>> Subsequent patches in this series will extend support to include SMP and take
>> advantage of the large physical memory addressing capabilities via LPAE.
>>
>> Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
>> Signed-off-by: Cyril Chemparathy <cyril@ti.com>
>
> Reviewed-by: Arnd Bergmann <arnd@arndb.de>
>

Thanks for taking a look, Arnd.

Any inputs on the other patches in this series?  I'd ideally like to have
the LPAE fixes and code patching changes merged in sooner than the Keystone
machine-specific stuff.

> And some nitpicking:
>> +
>> +	chosen {
>> +		bootargs = "console=ttyS0,115200n8 debug earlyprintk lpj=50000 rdinit=/bin/ash rw root=/dev/ram0 initrd=0x85000000,9M";
>> +	};
>
> This command line should not really be here. Most of what you put in it is not
> generic to the platform at all.
>
> In order to select the console, use an alias for the serial device.
>

Agreed.  The DTS in general needs quite a bit of work.

>> +
>> +static void __init keystone_map_io(void)
>> +{
>> +	iotable_init(io_desc, sizeof(io_desc)/sizeof(struct map_desc));
>> +}
>
> Use the ARRAY_SIZE macro here.
>

Thanks.  I've fixed this in the code, and this will show up in the next rev.

> 	Arnd
>

-- 
Thanks
- Cyril

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH 06/22] ARM: LPAE: use phys_addr_t in alloc_init_pud()
  2012-08-01 12:08     ` Sergei Shtylyov
@ 2012-08-01 15:42       ` Cyril Chemparathy
  -1 siblings, 0 replies; 127+ messages in thread
From: Cyril Chemparathy @ 2012-08-01 15:42 UTC (permalink / raw)
  To: Sergei Shtylyov
  Cc: linux-arm-kernel, linux-kernel, linux, arnd, nico,
	catalin.marinas, will.deacon, Vitaly Andrianov

On 8/1/2012 8:08 AM, Sergei Shtylyov wrote:
> Hello.
>
> On 01-08-2012 3:04, Cyril Chemparathy wrote:
>
>> From: Vitaly Andrianov <vitalya@ti.com>
>
>> This patch fixes the alloc_init_pud() function to use phys_addr_t
>> instead of
>> unsigned long when passing in the phys argument.
>
>> This is an extension to commit
>> 97092e0c56830457af0639f6bd904537a150ea4a, which
>
>     Please also specify that commit's summary in parens.

Thanks Sergei.  Will do so.  I'm assuming you meant headline and not 
summary.

>
>> applied similar changes elsewhere in the ARM memory management code.
>
>> Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
>> Signed-off-by: Cyril Chemparathy <cyril@ti.com>
>
> WBR, Sergei
>

-- 
Thanks
- Cyril

^ permalink raw reply	[flat|nested] 127+ messages in thread

* [RFC 20/22] ARM: keystone: introducing TI Keystone platform
  2012-08-01 15:41       ` Cyril Chemparathy
  (?)
@ 2012-08-01 17:20       ` Arnd Bergmann
  -1 siblings, 0 replies; 127+ messages in thread
From: Arnd Bergmann @ 2012-08-01 17:20 UTC (permalink / raw)
  To: linux-arm-kernel

On Wednesday 01 August 2012 11:41:08 Cyril Chemparathy wrote:
> On 7/31/2012 7:16 PM, Arnd Bergmann wrote:
> > On Tuesday 31 July 2012, Cyril Chemparathy wrote:
> >> Texas Instruments Keystone family of multicore devices now includes an
> >> upcoming slew of Cortex A15 based devices.  This patch adds basic definitions
> >> for a new Keystone sub-architecture in ARM.
> >>
> >> Subsequent patches in this series will extend support to include SMP and take
> >> advantage of the large physical memory addressing capabilities via LPAE.
> >>
> >> Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
> >> Signed-off-by: Cyril Chemparathy <cyril@ti.com>
> >
> > Reviewed-by: Arnd Bergmann <arnd@arndb.de>
> >
> 
> Thanks for taking a look, Arnd.
> 
> Any inputs on the other patches in this series?

I briefly looked over them and they largely looked ok, but I'm not really
qualified to comment on most of them.

	Arnd

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH 01/22] ARM: add mechanism for late code patching
  2012-07-31 23:04   ` Cyril Chemparathy
@ 2012-08-04  5:38     ` Nicolas Pitre
  -1 siblings, 0 replies; 127+ messages in thread
From: Nicolas Pitre @ 2012-08-04  5:38 UTC (permalink / raw)
  To: Cyril Chemparathy
  Cc: linux-arm-kernel, linux-kernel, arnd, catalin.marinas, linux,
	will.deacon

On Tue, 31 Jul 2012, Cyril Chemparathy wrote:

> The original phys_to_virt/virt_to_phys patching implementation relied on early
> patching prior to MMU initialization.  On PAE systems running out of >4G
> address space, this would have entailed an additional round of patching after
> switching over to the high address space.
> 
> The approach implemented here conceptually extends the original PHYS_OFFSET
> patching implementation with the introduction of "early" patch stubs.  Early
> patch code is required to be functional out of the box, even before the patch
> is applied.  This is implemented by inserting functional (but inefficient)
> load code into the .patch.code init section.  Having functional code out of
> the box then allows us to defer the init time patch application until later
> in the init sequence.
> 
> In addition to fitting better with our need for physical address-space
> switch-over, this implementation should be somewhat more extensible by virtue
> of its more readable (and hackable) C implementation.  This should prove
> useful for other similar init time specialization needs, especially in light
> of our multi-platform kernel initiative.
> 
> This code has been boot tested in both ARM and Thumb-2 modes on an ARMv7
> (Cortex-A8) device.
> 
> Note: the obtuse use of stringified symbols in patch_stub() and
> early_patch_stub() is intentional.  Theoretically this should have been
> accomplished with formal operands passed into the asm block, but this requires
> the use of the 'c' modifier for instantiating the long (e.g. .long %c0).
> However, the 'c' modifier has been found to ICE certain versions of GCC, and
> therefore we resort to stringified symbols here.
> 
> Signed-off-by: Cyril Chemparathy <cyril@ti.com>

This looks very nice.  Comments below.

> ---
>  arch/arm/include/asm/patch.h  |  123 +++++++++++++++++++++++++++++

Please find a better name for this file. "patch" is way too generic and 
commonly refers to something different. "runtime-patching" or similar 
would be more descriptive.

>  arch/arm/kernel/module.c      |    4 +
>  arch/arm/kernel/setup.c       |  175 +++++++++++++++++++++++++++++++++++++++++

This is complex enough to warrant a separate source file.  Please move 
those additions out of setup.c.  Given a good name for the header file 
above, the C file could share the same name.

> new file mode 100644
> index 0000000..a89749f
> --- /dev/null
> +++ b/arch/arm/include/asm/patch.h
> @@ -0,0 +1,123 @@
> +/*
> + *  arch/arm/include/asm/patch.h
> + *
> + *  Copyright (C) 2012, Texas Instruments
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + *  Note: this file should not be included by non-asm/.h files
> + */
> +#ifndef __ASM_ARM_PATCH_H
> +#define __ASM_ARM_PATCH_H
> +
> +#include <linux/stringify.h>
> +
> +#ifndef __ASSEMBLY__
> +
> +extern unsigned __patch_table_begin, __patch_table_end;

You could use "extern void __patch_table_begin" so those symbols don't 
get any type that could be misused by mistake, while you still can take 
their addresses.
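
Something along these lines (a sketch only; it relies on GCC accepting 
void object declarations):

        extern void __patch_table_begin;
        extern void __patch_table_end;

        /* taking the address still works, but any accidental use of
         * the symbols as values no longer compiles */
        struct patch_info *p   = (struct patch_info *)&__patch_table_begin;
        struct patch_info *end = (struct patch_info *)&__patch_table_end;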

> +
> +struct patch_info {
> +	u32	 type;
> +	u32	 size;

Given the possibly large number of table entries, some effort at making 
those entries as compact as possible should be considered. For instance, 
the type and size fields could be u8's and insn_end pointer replaced 
with another size expressed as an u8.  By placing all the u8's together 
they would occupy a single word by themselves.  The assembly stub would 
only need a .align statement to reflect the c structure's padding.
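
For example (just a sketch, field names hypothetical):

        struct patch_info {
                u8      type;
                u8      size;
                u8      insn_size;      /* replaces the insn_end pointer */
                u8      __pad;          /* the four u8's pack into one word */
                /* ... remaining members ... */
        };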

[...]

Did you verify with some test program that your patching routines do 
produce the same opcodes as the assembled equivalent for all possible 
shift values?  Especially for Thumb2 code which isn't as trivial to get 
right as the ARM one.


Nicolas

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH 02/22] ARM: use late patch framework for phys-virt patching
  2012-07-31 23:04   ` Cyril Chemparathy
@ 2012-08-04  6:15     ` Nicolas Pitre
  -1 siblings, 0 replies; 127+ messages in thread
From: Nicolas Pitre @ 2012-08-04  6:15 UTC (permalink / raw)
  To: Cyril Chemparathy
  Cc: linux-arm-kernel, linux-kernel, arnd, catalin.marinas, linux,
	will.deacon

On Tue, 31 Jul 2012, Cyril Chemparathy wrote:

> This patch replaces the original physical offset patching implementation
> with one that uses the newly added patching framework.  In the process, we now
> unconditionally initialize the __pv_phys_offset and __pv_offset globals in the
> head.S code.

Why initialize those unconditionally?  There is no reason for that.

> Signed-off-by: Cyril Chemparathy <cyril@ti.com>

Comments below.

> diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
> index 835898e..d165896 100644
> --- a/arch/arm/kernel/head.S
> +++ b/arch/arm/kernel/head.S
[...]
>  	.data
>  	.globl	__pv_phys_offset
>  	.type	__pv_phys_offset, %object
>  __pv_phys_offset:
>  	.long	0
>  	.size	__pv_phys_offset, . - __pv_phys_offset
> +
> +	.globl	__pv_offset
> +	.type	__pv_offset, %object
>  __pv_offset:
>  	.long	0
> -#endif
> +	.size	__pv_offset, . - __pv_offset

Please move those to C code.  They aren't of much use in this file 
anymore.  This will allow you to use phys_addr_t for them as well in 
your subsequent patch. And more importantly get rid of that ugly 
pv_offset_high that you introduced in another patch.

> diff --git a/arch/arm/kernel/module.c b/arch/arm/kernel/module.c
> index df5e897..39f8fce 100644
> --- a/arch/arm/kernel/module.c
> +++ b/arch/arm/kernel/module.c
> @@ -317,11 +317,6 @@ int module_finalize(const Elf32_Ehdr *hdr, const Elf_Shdr *sechdrs,
>  					         maps[i].txt_sec->sh_addr,
>  					         maps[i].txt_sec->sh_size);
>  #endif
> -#ifdef CONFIG_ARM_PATCH_PHYS_VIRT
> -	s = find_mod_section(hdr, sechdrs, ".pv_table");
> -	if (s)
> -		fixup_pv_table((void *)s->sh_addr, s->sh_size);
> -#endif
>  	s = find_mod_section(hdr, sechdrs, ".patch.table");
>  	if (s)
>  		patch_kernel((void *)s->sh_addr, s->sh_size);

The patch_kernel code and its invocation should still be conditional on 
CONFIG_ARM_PATCH_PHYS_VIRT.  This ability may still be configured out 
irrespective of the implementation used.
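
In other words, something like:

        #ifdef CONFIG_ARM_PATCH_PHYS_VIRT
                s = find_mod_section(hdr, sechdrs, ".patch.table");
                if (s)
                        patch_kernel((void *)s->sh_addr, s->sh_size);
        #endif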

> diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
> index bacb275..13731e3 100644
> --- a/arch/arm/kernel/vmlinux.lds.S
> +++ b/arch/arm/kernel/vmlinux.lds.S
> @@ -162,11 +162,6 @@ SECTIONS
>  		__smpalt_end = .;
>  	}
>  #endif
> -	.init.pv_table : {
> -		__pv_table_begin = .;
> -		*(.pv_table)
> -		__pv_table_end = .;
> -	}
>  	.init.patch_table : {
>  		__patch_table_begin = .;
>  		*(.patch.table)

Since you're changing the module ABI, it is important to also modify the 
module vermagic string in asm/module.h to prevent the loading of 
incompatible kernel modules.
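
For instance, by bumping the tag that asm/module.h already appends for 
the phys-virt patching (exact string hypothetical):

        #ifdef CONFIG_ARM_PATCH_PHYS_VIRT
        #define MODULE_ARCH_VERMAGIC_P2V "p2v16 "
        #else
        #define MODULE_ARCH_VERMAGIC_P2V ""
        #endif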


Nicolas

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH 03/22] ARM: LPAE: use phys_addr_t on virt <--> phys conversion
  2012-07-31 23:04   ` Cyril Chemparathy
@ 2012-08-04  6:24     ` Nicolas Pitre
  -1 siblings, 0 replies; 127+ messages in thread
From: Nicolas Pitre @ 2012-08-04  6:24 UTC (permalink / raw)
  To: Cyril Chemparathy
  Cc: linux-arm-kernel, linux-kernel, arnd, catalin.marinas, linux,
	will.deacon, Vitaly Andrianov

On Tue, 31 Jul 2012, Cyril Chemparathy wrote:

> This patch fixes up the types used when converting back and forth between
> physical and virtual addresses.
> 
> Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
> Signed-off-by: Cyril Chemparathy <cyril@ti.com>

Did you verify that this didn't introduce any compilation warnings when 
compiling for non-LPAE?  If you did and there were none then...

Acked-by: Nicolas Pitre <nico@linaro.org>


> ---
>  arch/arm/include/asm/memory.h |   17 +++++++++++------
>  1 file changed, 11 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
> index 01c710d..4a0108f 100644
> --- a/arch/arm/include/asm/memory.h
> +++ b/arch/arm/include/asm/memory.h
> @@ -157,22 +157,27 @@ extern unsigned long __pv_phys_offset;
>  
>  extern unsigned long __pv_offset;
>  
> -static inline unsigned long __virt_to_phys(unsigned long x)
> +static inline phys_addr_t __virt_to_phys(unsigned long x)
>  {
>  	unsigned long t;
>  	early_patch_imm8(x, t, "add", __pv_offset);
>  	return t;
>  }
>  
> -static inline unsigned long __phys_to_virt(unsigned long x)
> +static inline unsigned long __phys_to_virt(phys_addr_t x)
>  {
>  	unsigned long t;
>  	early_patch_imm8(x, t, "sub", __pv_offset);
>  	return t;
>  }
>  #else
> -#define __virt_to_phys(x)	((x) - PAGE_OFFSET + PHYS_OFFSET)
> -#define __phys_to_virt(x)	((x) - PHYS_OFFSET + PAGE_OFFSET)
> +
> +#define __virt_to_phys(x)		\
> +	((phys_addr_t)(x) - PAGE_OFFSET + PHYS_OFFSET)
> +
> +#define __phys_to_virt(x)		\
> +	((unsigned long)((phys_addr_t)(x) - PHYS_OFFSET + PAGE_OFFSET))
> +
>  #endif
>  #endif
>  
> @@ -207,14 +212,14 @@ static inline phys_addr_t virt_to_phys(const volatile void *x)
>  
>  static inline void *phys_to_virt(phys_addr_t x)
>  {
> -	return (void *)(__phys_to_virt((unsigned long)(x)));
> +	return (void *)__phys_to_virt(x);
>  }
>  
>  /*
>   * Drivers should NOT use these either.
>   */
>  #define __pa(x)			__virt_to_phys((unsigned long)(x))
> -#define __va(x)			((void *)__phys_to_virt((unsigned long)(x)))
> +#define __va(x)			((void *)__phys_to_virt((phys_addr_t)(x)))
>  #define pfn_to_kaddr(pfn)	__va((pfn) << PAGE_SHIFT)
>  
>  /*
> -- 
> 1.7.9.5
> 

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH 04/22] ARM: LPAE: support 64-bit virt/phys patching
  2012-07-31 23:04   ` Cyril Chemparathy
@ 2012-08-04  6:49     ` Nicolas Pitre
  -1 siblings, 0 replies; 127+ messages in thread
From: Nicolas Pitre @ 2012-08-04  6:49 UTC (permalink / raw)
  To: Cyril Chemparathy
  Cc: linux-arm-kernel, linux-kernel, arnd, catalin.marinas, linux,
	will.deacon

On Tue, 31 Jul 2012, Cyril Chemparathy wrote:

> This patch adds support for 64-bit physical addresses in virt_to_phys
> patching.  This does not do real 64-bit add/sub, but instead patches in the
> upper 32-bits of the phys_offset directly into the output of virt_to_phys.

You should explain _why_ you do not do a real add/sub.  I did deduce it 
but that might not be obvious to everyone.  Also, this subtlety should be 
commented in the code.

> In addition to adding 64-bit support, this patch also adds a set_phys_offset()
> helper that is needed on architectures that need to modify PHYS_OFFSET during
> initialization.
> 
> Signed-off-by: Cyril Chemparathy <cyril@ti.com>
> ---
>  arch/arm/include/asm/memory.h |   22 +++++++++++++++-------
>  arch/arm/kernel/head.S        |    6 ++++++
>  arch/arm/kernel/setup.c       |   14 ++++++++++++++
>  3 files changed, 35 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
> index 4a0108f..110495c 100644
> --- a/arch/arm/include/asm/memory.h
> +++ b/arch/arm/include/asm/memory.h
> @@ -153,23 +153,31 @@
>  #ifdef CONFIG_ARM_PATCH_PHYS_VIRT
>  
>  extern unsigned long __pv_phys_offset;
> -#define PHYS_OFFSET __pv_phys_offset
> -
> +extern unsigned long __pv_phys_offset_high;

As mentioned previously, this is just too ugly.  Please make 
__pv_phys_offset into a phys_addr_t instead and mask the low/high parts 
as needed in __virt_to_phys().
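
Roughly (sketch only):

        extern phys_addr_t __pv_phys_offset;   /* one 64-bit variable */

        /* mask out the halves where needed, e.g.: */
        u32 lo = (u32)__pv_phys_offset;
        u32 hi = (u32)((u64)__pv_phys_offset >> 32);

with the patch stubs referring to that single variable instead of a 
separate __pv_phys_offset_high.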

>  extern unsigned long __pv_offset;
>  
> +extern void set_phys_offset(phys_addr_t po);
> +
> +#define PHYS_OFFSET	__virt_to_phys(PAGE_OFFSET)
> +
>  static inline phys_addr_t __virt_to_phys(unsigned long x)
>  {
> -	unsigned long t;
> -	early_patch_imm8(x, t, "add", __pv_offset);
> -	return t;
> +	unsigned long tlo, thi = 0;
> +
> +	early_patch_imm8(x, tlo, "add", __pv_offset);
> +	if (sizeof(phys_addr_t) > 4)
> +		early_patch_imm8(0, thi, "add", __pv_phys_offset_high);

Given the high part is always the same, isn't there a better way to do 
this than an add with 0?  The add needlessly forces a load of 0 into a 
register just to add a constant value to it.  Your new patching 
framework ought to be able to patch a mov (or an mvn) instruction 
directly.
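
Conceptually, e.g. (hypothetical stub name):

        if (sizeof(phys_addr_t) > 4)
                early_patch_mov(thi, __pv_phys_offset_high);

where early_patch_mov() would emit a plain "mov %0, #0" that the patch 
code later rewrites into "mov %0, #<high>" (or the mvn equivalent), 
avoiding the dummy load-and-add entirely.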


Nicolas

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH 06/22] ARM: LPAE: use phys_addr_t in alloc_init_pud()
  2012-07-31 23:04   ` Cyril Chemparathy
@ 2012-08-04  6:51     ` Nicolas Pitre
  -1 siblings, 0 replies; 127+ messages in thread
From: Nicolas Pitre @ 2012-08-04  6:51 UTC (permalink / raw)
  To: Cyril Chemparathy
  Cc: linux-arm-kernel, linux-kernel, arnd, catalin.marinas, linux,
	will.deacon, Vitaly Andrianov

On Tue, 31 Jul 2012, Cyril Chemparathy wrote:

> From: Vitaly Andrianov <vitalya@ti.com>
> 
> This patch fixes the alloc_init_pud() function to use phys_addr_t instead of
> unsigned long when passing in the phys argument.
> 
> This is an extension to commit 97092e0c56830457af0639f6bd904537a150ea4a, which
> applied similar changes elsewhere in the ARM memory management code.
> 
> Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
> Signed-off-by: Cyril Chemparathy <cyril@ti.com>

Acked-by: Nicolas Pitre <nico@linaro.org>

> ---
>  arch/arm/mm/mmu.c |    3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
> index cf4528d..226985c 100644
> --- a/arch/arm/mm/mmu.c
> +++ b/arch/arm/mm/mmu.c
> @@ -628,7 +628,8 @@ static void __init alloc_init_section(pud_t *pud, unsigned long addr,
>  }
>  
>  static void __init alloc_init_pud(pgd_t *pgd, unsigned long addr,
> -	unsigned long end, unsigned long phys, const struct mem_type *type)
> +				  unsigned long end, phys_addr_t phys,
> +				  const struct mem_type *type)
>  {
>  	pud_t *pud = pud_offset(pgd, addr);
>  	unsigned long next;
> -- 
> 1.7.9.5
> 

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH 07/22] ARM: LPAE: use phys_addr_t in free_memmap()
  2012-07-31 23:04   ` Cyril Chemparathy
@ 2012-08-04  6:54     ` Nicolas Pitre
  -1 siblings, 0 replies; 127+ messages in thread
From: Nicolas Pitre @ 2012-08-04  6:54 UTC (permalink / raw)
  To: Cyril Chemparathy
  Cc: linux-arm-kernel, linux-kernel, arnd, catalin.marinas, linux,
	will.deacon, Vitaly Andrianov

On Tue, 31 Jul 2012, Cyril Chemparathy wrote:

> From: Vitaly Andrianov <vitalya@ti.com>
> 
> The free_memmap() was mistakenly using unsigned long type to represent
> physical addresses.  This breaks on PAE systems where memory could be placed
> above the 32-bit addressable limit.
> 
> This patch fixes this function to properly use phys_addr_t instead.
> 
> Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
> Signed-off-by: Cyril Chemparathy <cyril@ti.com>

Acked-by: Nicolas Pitre <nico@linaro.org>

> ---
>  arch/arm/mm/init.c |    6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
> index f54d592..8252c31 100644
> --- a/arch/arm/mm/init.c
> +++ b/arch/arm/mm/init.c
> @@ -457,7 +457,7 @@ static inline void
>  free_memmap(unsigned long start_pfn, unsigned long end_pfn)
>  {
>  	struct page *start_pg, *end_pg;
> -	unsigned long pg, pgend;
> +	phys_addr_t pg, pgend;
>  
>  	/*
>  	 * Convert start_pfn/end_pfn to a struct page pointer.
> @@ -469,8 +469,8 @@ free_memmap(unsigned long start_pfn, unsigned long end_pfn)
>  	 * Convert to physical addresses, and
>  	 * round start upwards and end downwards.
>  	 */
> -	pg = (unsigned long)PAGE_ALIGN(__pa(start_pg));
> -	pgend = (unsigned long)__pa(end_pg) & PAGE_MASK;
> +	pg = PAGE_ALIGN(__pa(start_pg));
> +	pgend = __pa(end_pg) & PAGE_MASK;
>  
>  	/*
>  	 * If there are free pages between these,
> -- 
> 1.7.9.5
> 

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH 08/22] ARM: LPAE: use phys_addr_t for initrd location and size
  2012-07-31 23:04   ` Cyril Chemparathy
@ 2012-08-04  6:57     ` Nicolas Pitre
  -1 siblings, 0 replies; 127+ messages in thread
From: Nicolas Pitre @ 2012-08-04  6:57 UTC (permalink / raw)
  To: Cyril Chemparathy
  Cc: linux-arm-kernel, linux-kernel, arnd, catalin.marinas, linux,
	will.deacon, Vitaly Andrianov

On Tue, 31 Jul 2012, Cyril Chemparathy wrote:

> From: Vitaly Andrianov <vitalya@ti.com>
> 
> This patch fixes the initrd setup code to use phys_addr_t instead of assuming
> 32-bit addressing.  Without this we cannot boot on systems where initrd is
> located above the 4G physical address limit.
> 
> Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
> Signed-off-by: Cyril Chemparathy <cyril@ti.com>
> ---
>  arch/arm/mm/init.c |   14 +++++++-------
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
> index 8252c31..51f3e92 100644
> --- a/arch/arm/mm/init.c
> +++ b/arch/arm/mm/init.c
> @@ -36,12 +36,12 @@
>  
>  #include "mm.h"
>  
> -static unsigned long phys_initrd_start __initdata = 0;
> -static unsigned long phys_initrd_size __initdata = 0;
> +static phys_addr_t phys_initrd_start __initdata = 0;
> +static phys_addr_t phys_initrd_size __initdata = 0;

phys_addr_t for the initrd size is rather overkill, isn't it?
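
E.g. (sketch):

        static phys_addr_t phys_initrd_start __initdata = 0;
        static unsigned long phys_initrd_size __initdata = 0;

The size of an initrd comfortably fits in an unsigned long even when 
its location does not.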


Nicolas

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH 09/22] ARM: LPAE: use 64-bit pgd physical address in switch_mm()
  2012-07-31 23:04   ` Cyril Chemparathy
@ 2012-08-04  7:04     ` Nicolas Pitre
  -1 siblings, 0 replies; 127+ messages in thread
From: Nicolas Pitre @ 2012-08-04  7:04 UTC (permalink / raw)
  To: Cyril Chemparathy
  Cc: linux-arm-kernel, linux-kernel, arnd, catalin.marinas, linux,
	will.deacon, Vitaly Andrianov

On Tue, 31 Jul 2012, Cyril Chemparathy wrote:

> This patch modifies the switch_mm() processor functions to use 64-bit
> addresses.  We use u64 instead of phys_addr_t, in order to avoid having config
> dependent register usage when calling into switch_mm assembly code.
> 
> The changes in this patch are primarily adjustments for registers used for
> arguments to switch_mm.  The few processor definitions that did use the second
> argument have been modified accordingly.
> 
> Arguments and calling conventions aside, this patch should be a no-op on v6
> and non-LPAE v7 processors. 

NAK.

You just broke all big endian targets, LPAE or not.


Nicolas

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH 00/22] Introducing the TI Keystone platform
  2012-07-31 23:04 ` Cyril Chemparathy
@ 2012-08-04  8:39   ` Russell King - ARM Linux
  -1 siblings, 0 replies; 127+ messages in thread
From: Russell King - ARM Linux @ 2012-08-04  8:39 UTC (permalink / raw)
  To: Cyril Chemparathy
  Cc: linux-arm-kernel, linux-kernel, arnd, catalin.marinas, nico, will.deacon

On Tue, Jul 31, 2012 at 07:04:36PM -0400, Cyril Chemparathy wrote:
> This series is a follow on to the RFC series posted earlier (archived at [1]).
> The major change introduced here is the modification to the kernel patching
> mechanism for phys_to_virt/virt_to_phys, in order to support LPAE platforms
> that require late patching.  In addition to these changes, we've updated the
> series based on feedback from the earlier posting.
> 
> Most of the patches in this series are fixes and extensions to LPAE support on
> ARM. The last three patches in this series are specific to the TI Keystone
> platform, and are being provided here for the sake of completeness.  These
> three patches are dependent on the smpops patch set (see [2]), and are not
> ready to be merged in as yet.

Can you explain why you want the kernel loaded above the 4GB watermark?
This seems silly to me, as the kernel needs to run at points with a 1:1
physical to virtual mapping, and you can't do that if the kernel is
stored in physical memory above the 4GB watermark.


^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH 01/22] ARM: add mechanism for late code patching
  2012-08-04  5:38     ` Nicolas Pitre
@ 2012-08-05 13:56       ` Cyril Chemparathy
  -1 siblings, 0 replies; 127+ messages in thread
From: Cyril Chemparathy @ 2012-08-05 13:56 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: linux-arm-kernel, linux-kernel, arnd, catalin.marinas, linux,
	will.deacon

Hi Nicolas,

On 8/4/2012 1:38 AM, Nicolas Pitre wrote:
> On Tue, 31 Jul 2012, Cyril Chemparathy wrote:
>
>> The original phys_to_virt/virt_to_phys patching implementation relied on early
>> patching prior to MMU initialization.  On PAE systems running out of >4G
>> address space, this would have entailed an additional round of patching after
>> switching over to the high address space.
>>
>> The approach implemented here conceptually extends the original PHYS_OFFSET
>> patching implementation with the introduction of "early" patch stubs.  Early
>> patch code is required to be functional out of the box, even before the patch
>> is applied.  This is implemented by inserting functional (but inefficient)
>> load code into the .patch.code init section.  Having functional code out of
>> the box then allows us to defer the init time patch application until later
>> in the init sequence.
>>
>> In addition to fitting better with our need for physical address-space
>> switch-over, this implementation should be somewhat more extensible by virtue
>> of its more readable (and hackable) C implementation.  This should prove
>> useful for other similar init time specialization needs, especially in light
>> of our multi-platform kernel initiative.
>>
>> This code has been boot tested in both ARM and Thumb-2 modes on an ARMv7
>> (Cortex-A8) device.
>>
>> Note: the obtuse use of stringified symbols in patch_stub() and
>> early_patch_stub() is intentional.  Theoretically this should have been
>> accomplished with formal operands passed into the asm block, but this requires
>> the use of the 'c' modifier for instantiating the long (e.g. .long %c0).
>> However, the 'c' modifier has been found to ICE certain versions of GCC, and
>> therefore we resort to stringified symbols here.
>>
>> Signed-off-by: Cyril Chemparathy <cyril@ti.com>
>
> This looks very nice.  Comments below.
>
>> ---
>>   arch/arm/include/asm/patch.h  |  123 +++++++++++++++++++++++++++++
>
> Please find a better name for this file. "patch" is way too generic and
> commonly referring to something different. "runtime-patching" or similar
> would be more descriptive.
>

Sure.  Does init-patch sound about right?  We need to reflect the fact 
that this is intended for init-time patching only.

>>   arch/arm/kernel/module.c      |    4 +
>>   arch/arm/kernel/setup.c       |  175 +++++++++++++++++++++++++++++++++++++++++
>
> This is complex enough to warrant a separate source file.  Please move
> those additions out from setup.c.  Given a good name for the header file
> above, the C file could share the same name.
>

Sure.

>> new file mode 100644
>> index 0000000..a89749f
>> --- /dev/null
>> +++ b/arch/arm/include/asm/patch.h
>> @@ -0,0 +1,123 @@
>> +/*
>> + *  arch/arm/include/asm/patch.h
>> + *
>> + *  Copyright (C) 2012, Texas Instruments
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + *
>> + *  Note: this file should not be included by non-asm/.h files
>> + */
>> +#ifndef __ASM_ARM_PATCH_H
>> +#define __ASM_ARM_PATCH_H
>> +
>> +#include <linux/stringify.h>
>> +
>> +#ifndef __ASSEMBLY__
>> +
>> +extern unsigned __patch_table_begin, __patch_table_end;
>
> You could use "extern void __patch_table_begin" so those symbols don't
> get any type that could be misused by mistake, while you still can take
> their addresses.
>

Sure.

>> +
>> +struct patch_info {
>> +	u32	 type;
>> +	u32	 size;
>
> Given the possibly large number of table entries, some effort at making
> those entries as compact as possible should be considered. For instance,
> the type and size fields could be u8's and insn_end pointer replaced
> with another size expressed as an u8.  By placing all the u8's together
> they would occupy a single word by themselves.  The assembly stub would
> only need a .align statement to reflect the c structure's padding.
>

Thanks, will try and pack this struct up.

> [...]
>
> Did you verify with some test program that your patching routines do
> produce the same opcodes as the assembled equivalent for all possible
> shift values?  Especially for Thumb2 code which isn't as trivial to get
> right as the ARM one.
>

Not quite all, but I'm sure I can conjure up an off-line test harness to 
do so.


Much appreciated feedback.  Thanks for taking a look.

-- 
Thanks
- Cyril

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH 02/22] ARM: use late patch framework for phys-virt patching
  2012-08-04  6:15     ` Nicolas Pitre
@ 2012-08-05 14:03       ` Cyril Chemparathy
  -1 siblings, 0 replies; 127+ messages in thread
From: Cyril Chemparathy @ 2012-08-05 14:03 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: linux-arm-kernel, linux-kernel, arnd, catalin.marinas, linux,
	will.deacon

Hi Nicolas,

On 8/4/2012 2:15 AM, Nicolas Pitre wrote:
> On Tue, 31 Jul 2012, Cyril Chemparathy wrote:
>
>> This patch replaces the original physical offset patching implementation
>> with one that uses the newly added patching framework.  In the process, we now
>> unconditionally initialize the __pv_phys_offset and __pv_offset globals in the
>> head.S code.
>
> Why initialize those unconditionally?  There is no reason for that.
>

We could keep this conditional on LPAE, but do you see any specific need 
for keeping it conditional?

>> Signed-off-by: Cyril Chemparathy <cyril@ti.com>
>
> Comments below.
>
>> diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
>> index 835898e..d165896 100644
>> --- a/arch/arm/kernel/head.S
>> +++ b/arch/arm/kernel/head.S
> [...]
>>   	.data
>>   	.globl	__pv_phys_offset
>>   	.type	__pv_phys_offset, %object
>>   __pv_phys_offset:
>>   	.long	0
>>   	.size	__pv_phys_offset, . - __pv_phys_offset
>> +
>> +	.globl	__pv_offset
>> +	.type	__pv_offset, %object
>>   __pv_offset:
>>   	.long	0
>> -#endif
>> +	.size	__pv_offset, . - __pv_offset
>
> Please move those to C code.  They aren't of much use in this file
> anymore.  This will allow you to use phys_addr_t for them as well in
> your subsequent patch. And more importantly get rid of that ugly
> pv_offset_high that you introduced in another patch.
>

Moving these to C code caused problems because they get filled in prior 
to BSS being cleared.

We could potentially initialize them in C with a dummy value to prevent 
them from landing in BSS.  Would that be acceptable?
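
Something like this, for example (initializer value arbitrary):

        /* non-zero initializers keep these in .data rather than .bss,
         * so head.S can fill them in before the BSS is cleared */
        phys_addr_t __pv_phys_offset = 0xdeadbeef;
        unsigned long __pv_offset = 0xdeadbeef;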

>> diff --git a/arch/arm/kernel/module.c b/arch/arm/kernel/module.c
>> index df5e897..39f8fce 100644
>> --- a/arch/arm/kernel/module.c
>> +++ b/arch/arm/kernel/module.c
>> @@ -317,11 +317,6 @@ int module_finalize(const Elf32_Ehdr *hdr, const Elf_Shdr *sechdrs,
>>   					         maps[i].txt_sec->sh_addr,
>>   					         maps[i].txt_sec->sh_size);
>>   #endif
>> -#ifdef CONFIG_ARM_PATCH_PHYS_VIRT
>> -	s = find_mod_section(hdr, sechdrs, ".pv_table");
>> -	if (s)
>> -		fixup_pv_table((void *)s->sh_addr, s->sh_size);
>> -#endif
>>   	s = find_mod_section(hdr, sechdrs, ".patch.table");
>>   	if (s)
>>   		patch_kernel((void *)s->sh_addr, s->sh_size);
>
> The patch_kernel code and its invocation should still be conditional on
> CONFIG_ARM_PATCH_PHYS_VIRT.  This ability may still be configured out
> irrespective of the implementation used.
>

Maybe CONFIG_ARM_PATCH_PHYS_VIRT is not quite appropriate if this is 
used to patch up other things in addition to phys-virt stuff?

I could have this dependent on CONFIG_ARM_INIT_PATCH (or whatever 
nomenclature we chose for this) and have CONFIG_ARM_PATCH_PHYS_VIRT 
depend on it.

>> diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
>> index bacb275..13731e3 100644
>> --- a/arch/arm/kernel/vmlinux.lds.S
>> +++ b/arch/arm/kernel/vmlinux.lds.S
>> @@ -162,11 +162,6 @@ SECTIONS
>>   		__smpalt_end = .;
>>   	}
>>   #endif
>> -	.init.pv_table : {
>> -		__pv_table_begin = .;
>> -		*(.pv_table)
>> -		__pv_table_end = .;
>> -	}
>>   	.init.patch_table : {
>>   		__patch_table_begin = .;
>>   		*(.patch.table)
>
> Since you're changing the module ABI, it is important to also modify the
> module vermagic string in asm/module.h to prevent the loading of
> incompatible kernel modules.
>

Absolutely.  Thanks.

>
> Nicolas
>

-- 
Thanks
- Cyril

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH 03/22] ARM: LPAE: use phys_addr_t on virt <--> phys conversion
  2012-08-04  6:24     ` Nicolas Pitre
@ 2012-08-05 14:05       ` Cyril Chemparathy
  -1 siblings, 0 replies; 127+ messages in thread
From: Cyril Chemparathy @ 2012-08-05 14:05 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: linux-arm-kernel, linux-kernel, arnd, catalin.marinas, linux,
	will.deacon, Vitaly Andrianov

On 8/4/2012 2:24 AM, Nicolas Pitre wrote:
> On Tue, 31 Jul 2012, Cyril Chemparathy wrote:
>
>> This patch fixes up the types used when converting back and forth between
>> physical and virtual addresses.
>>
>> Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
>> Signed-off-by: Cyril Chemparathy <cyril@ti.com>
>
> Did you verify that this didn't introduce any compilation warnings when
> compiling for non-LPAE?  If you did and there were none then...
>

Yes.  This series has been tested on vanilla ARMv7 Cortex-A8 non-LPAE 
hardware as well.

> Acked-by: Nicolas Pitre <nico@linaro.org>
>
>
>> ---
>>   arch/arm/include/asm/memory.h |   17 +++++++++++------
>>   1 file changed, 11 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
>> index 01c710d..4a0108f 100644
>> --- a/arch/arm/include/asm/memory.h
>> +++ b/arch/arm/include/asm/memory.h
>> @@ -157,22 +157,27 @@ extern unsigned long __pv_phys_offset;
>>
>>   extern unsigned long __pv_offset;
>>
>> -static inline unsigned long __virt_to_phys(unsigned long x)
>> +static inline phys_addr_t __virt_to_phys(unsigned long x)
>>   {
>>   	unsigned long t;
>>   	early_patch_imm8(x, t, "add", __pv_offset);
>>   	return t;
>>   }
>>
>> -static inline unsigned long __phys_to_virt(unsigned long x)
>> +static inline unsigned long __phys_to_virt(phys_addr_t x)
>>   {
>>   	unsigned long t;
>>   	early_patch_imm8(x, t, "sub", __pv_offset);
>>   	return t;
>>   }
>>   #else
>> -#define __virt_to_phys(x)	((x) - PAGE_OFFSET + PHYS_OFFSET)
>> -#define __phys_to_virt(x)	((x) - PHYS_OFFSET + PAGE_OFFSET)
>> +
>> +#define __virt_to_phys(x)		\
>> +	((phys_addr_t)(x) - PAGE_OFFSET + PHYS_OFFSET)
>> +
>> +#define __phys_to_virt(x)		\
>> +	((unsigned long)((phys_addr_t)(x) - PHYS_OFFSET + PAGE_OFFSET))
>> +
>>   #endif
>>   #endif
>>
>> @@ -207,14 +212,14 @@ static inline phys_addr_t virt_to_phys(const volatile void *x)
>>
>>   static inline void *phys_to_virt(phys_addr_t x)
>>   {
>> -	return (void *)(__phys_to_virt((unsigned long)(x)));
>> +	return (void *)__phys_to_virt(x);
>>   }
>>
>>   /*
>>    * Drivers should NOT use these either.
>>    */
>>   #define __pa(x)			__virt_to_phys((unsigned long)(x))
>> -#define __va(x)			((void *)__phys_to_virt((unsigned long)(x)))
>> +#define __va(x)			((void *)__phys_to_virt((phys_addr_t)(x)))
>>   #define pfn_to_kaddr(pfn)	__va((pfn) << PAGE_SHIFT)
>>
>>   /*
>> --
>> 1.7.9.5
>>

-- 
Thanks
- Cyril

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH 04/22] ARM: LPAE: support 64-bit virt/phys patching
  2012-08-04  6:49     ` Nicolas Pitre
@ 2012-08-05 14:21       ` Cyril Chemparathy
  -1 siblings, 0 replies; 127+ messages in thread
From: Cyril Chemparathy @ 2012-08-05 14:21 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: linux-arm-kernel, linux-kernel, arnd, catalin.marinas, linux,
	will.deacon

Hi Nicolas,

On 8/4/2012 2:49 AM, Nicolas Pitre wrote:
> On Tue, 31 Jul 2012, Cyril Chemparathy wrote:
>
>> This patch adds support for 64-bit physical addresses in virt_to_phys
>> patching.  This does not do real 64-bit add/sub, but instead patches in the
>> upper 32-bits of the phys_offset directly into the output of virt_to_phys.
>
> You should explain _why_ you do not do a real add/sub.  I did deduce it
> but that might not be obvious to everyone.  Also this subtlety should be
> commented in the code as well.
>

We could not do an ADDS + ADC here because the carry is not guaranteed 
to be retained and passed into the ADC.  This is because the compiler is 
free to insert all kinds of stuff between the two non-volatile asm blocks.

Is there another subtlety here that I have missed out on entirely?

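To illustrate the hazard (a rough sketch of the broken pattern, not code 
from this series):

        /* sketch: two independent asm blocks; nothing guarantees the
         * ADC still sees the carry produced by the ADDS */
        static inline u64 bad_v2p64(u32 x, u32 off_lo, u32 off_hi)
        {
                u32 lo, hi;

                asm("adds %0, %1, %2"
                    : "=r" (lo) : "r" (x), "r" (off_lo) : "cc");
                /* the compiler is free to schedule other instructions
                 * here, clobbering the carry flag */
                asm("adc %0, %1, #0"
                    : "=r" (hi) : "r" (off_hi) : "cc");
                return (u64)hi << 32 | lo;
        }
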
>> In addition to adding 64-bit support, this patch also adds a set_phys_offset()
>> helper that is needed on architectures that need to modify PHYS_OFFSET during
>> initialization.
>>
>> Signed-off-by: Cyril Chemparathy <cyril@ti.com>
>> ---
>>   arch/arm/include/asm/memory.h |   22 +++++++++++++++-------
>>   arch/arm/kernel/head.S        |    6 ++++++
>>   arch/arm/kernel/setup.c       |   14 ++++++++++++++
>>   3 files changed, 35 insertions(+), 7 deletions(-)
>>
>> diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
>> index 4a0108f..110495c 100644
>> --- a/arch/arm/include/asm/memory.h
>> +++ b/arch/arm/include/asm/memory.h
>> @@ -153,23 +153,31 @@
>>   #ifdef CONFIG_ARM_PATCH_PHYS_VIRT
>>
>>   extern unsigned long __pv_phys_offset;
>> -#define PHYS_OFFSET __pv_phys_offset
>> -
>> +extern unsigned long __pv_phys_offset_high;
>
> As mentioned previously, this is just too ugly.  Please make
> __pv_phys_offset into a phys_addr_t instead and mask the low/high parts
> as needed in __virt_to_phys().
>

Maybe u64 instead of phys_addr_t to keep the sizing non-variable?

>>   extern unsigned long __pv_offset;
>>
>> +extern void set_phys_offset(phys_addr_t po);
>> +
>> +#define PHYS_OFFSET	__virt_to_phys(PAGE_OFFSET)
>> +
>>   static inline phys_addr_t __virt_to_phys(unsigned long x)
>>   {
>> -	unsigned long t;
>> -	early_patch_imm8(x, t, "add", __pv_offset);
>> -	return t;
>> +	unsigned long tlo, thi = 0;
>> +
>> +	early_patch_imm8(x, tlo, "add", __pv_offset);
>> +	if (sizeof(phys_addr_t) > 4)
>> +		early_patch_imm8(0, thi, "add", __pv_phys_offset_high);
>
> Given the high part is always the same, isn't there a better way than an
> add with 0 that could be done here?  The add will force a load of 0 in a
> register needlessly just to add a constant value to it.  Your new
> patching framework ought to be able to patch a mov (or a mvn)
> instruction directly.
>

True.  I'll try and figure out a better way of doing this.

>
> Nicolas
>

Once again, thanks for the excellent feedback.

-- 
Thanks
- Cyril

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH 08/22] ARM: LPAE: use phys_addr_t for initrd location and size
  2012-08-04  6:57     ` Nicolas Pitre
@ 2012-08-05 14:23       ` Cyril Chemparathy
  -1 siblings, 0 replies; 127+ messages in thread
From: Cyril Chemparathy @ 2012-08-05 14:23 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: linux-arm-kernel, linux-kernel, arnd, catalin.marinas, linux,
	will.deacon, Vitaly Andrianov

On 8/4/2012 2:57 AM, Nicolas Pitre wrote:
> On Tue, 31 Jul 2012, Cyril Chemparathy wrote:
>
>> From: Vitaly Andrianov <vitalya@ti.com>
>>
>> This patch fixes the initrd setup code to use phys_addr_t instead of assuming
>> 32-bit addressing.  Without this we cannot boot on systems where initrd is
>> located above the 4G physical address limit.
>>
>> Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
>> Signed-off-by: Cyril Chemparathy <cyril@ti.com>
>> ---
>>   arch/arm/mm/init.c |   14 +++++++-------
>>   1 file changed, 7 insertions(+), 7 deletions(-)
>>
>> diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
>> index 8252c31..51f3e92 100644
>> --- a/arch/arm/mm/init.c
>> +++ b/arch/arm/mm/init.c
>> @@ -36,12 +36,12 @@
>>
>>   #include "mm.h"
>>
>> -static unsigned long phys_initrd_start __initdata = 0;
>> -static unsigned long phys_initrd_size __initdata = 0;
>> +static phys_addr_t phys_initrd_start __initdata = 0;
>> +static phys_addr_t phys_initrd_size __initdata = 0;
>
> phys_addr_t for the initrd size is rather overkill, isn't it?
>

Fair enough. :-)

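Something like this, then (sketch):

        /* sketch: only the start address needs the wider type */
        static phys_addr_t phys_initrd_start __initdata;
        static unsigned long phys_initrd_size __initdata;
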
>
> Nicolas
>

-- 
Thanks
- Cyril

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH 09/22] ARM: LPAE: use 64-bit pgd physical address in switch_mm()
  2012-08-04  7:04     ` Nicolas Pitre
@ 2012-08-05 14:29       ` Cyril Chemparathy
  -1 siblings, 0 replies; 127+ messages in thread
From: Cyril Chemparathy @ 2012-08-05 14:29 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: linux-arm-kernel, linux-kernel, arnd, catalin.marinas, linux,
	will.deacon, Vitaly Andrianov

On 8/4/2012 3:04 AM, Nicolas Pitre wrote:
> On Tue, 31 Jul 2012, Cyril Chemparathy wrote:
>
>> This patch modifies the switch_mm() processor functions to use 64-bit
>> addresses.  We use u64 instead of phys_addr_t, in order to avoid having config
>> dependent register usage when calling into switch_mm assembly code.
>>
>> The changes in this patch are primarily adjustments for registers used for
>> arguments to switch_mm.  The few processor definitions that did use the second
>> argument have been modified accordingly.
>>
>> Arguments and calling conventions aside, this patch should be a no-op on v6
>> and non-LPAE v7 processors.
>
> NAK.
>
> You just broke all big endian targets, LPAE or not.
>

Indeed.  Thanks.

Would C-land word swappery on BE do?  Any other ideas on the best 
approach to this?

>
> Nicolas
>

-- 
Thanks
- Cyril

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH 00/22] Introducing the TI Keystone platform
  2012-08-04  8:39   ` Russell King - ARM Linux
@ 2012-08-05 15:10     ` Cyril Chemparathy
  -1 siblings, 0 replies; 127+ messages in thread
From: Cyril Chemparathy @ 2012-08-05 15:10 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: linux-arm-kernel, linux-kernel, arnd, catalin.marinas, nico, will.deacon

Hi Russell,

On 8/4/2012 4:39 AM, Russell King - ARM Linux wrote:
> On Tue, Jul 31, 2012 at 07:04:36PM -0400, Cyril Chemparathy wrote:
>> This series is a follow on to the RFC series posted earlier (archived at [1]).
>> The major change introduced here is the modification to the kernel patching
>> mechanism for phys_to_virt/virt_to_phys, in order to support LPAE platforms
>> that require late patching.  In addition to these changes, we've updated the
>> series based on feedback from the earlier posting.
>>
>> Most of the patches in this series are fixes and extensions to LPAE support on
>> ARM. The last three patches in this series are specific to the TI Keystone
>> platform, and are being provided here for the sake of completeness.  These
>> three patches are dependent on the smpops patch set (see [2]), and are not
>> ready to be merged in as yet.
>
> Can you explain why you want the kernel loaded above the 4GB watermark?
> This seems silly to me, as the kernel needs to run at points with a 1:1
> physical to virtual mapping, and you can't do that if the kernel is
> stored in physical memory above the 4GB watermark.
>

The Keystone family of devices is built to run with large (>8G) physical 
memory for certain use-cases.  From the CPU's perspective, this entire 
range of physical memory is mapped linearly at 08:0000:0000, i.e., 
above the 4GB watermark.

The interconnect provides an aliased view of the first 2GB of this 
memory at the 8000:0000 offset.  This alias is intended primarily for 
boot-time usage, and does not support DMA coherence.  We considered the 
option of running with the first 2G of memory located under the 4GB 
watermark, and the rest located at the native >4GB location, but this 
would necessitate sparsemem, and would also break DMA coherence out of 
lowmem.  Hence the need for the more complicated approach implemented in 
this patch series.


The posted patch series manages to get an SMP system running out of 
memory beyond the 4GB watermark.  We identified a couple of places that 
needed the 1:1 physical to virtual mapping, and for these we take 
advantage of the alias view provided by the interconnect.  The two 
places that we found the need for 1:1 mapping were:

1. initial boot code in head.S:  here we've taken the approach of 
initially running out of the alias space, and then switching over to the 
high address space once we are safely in machine-specific territory.

2. idmap for secondary CPU boot:  here we've added a virt_to_idmap() 
facility that our sub-architecture then overrides to express the 
interconnect supported alias view.

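Roughly speaking, the override amounts to this (a sketch only; the 
constants and the function name are illustrative):

        /* sketch: map a kernel virtual address to the sub-4G alias
         * provided by the interconnect, for use in the idmap */
        #define HIGH_PHYS_START 0x800000000ULL  /* linear view of DDR */
        #define LOW_PHYS_START  0x80000000UL    /* boot-time alias    */

        static unsigned long keystone_virt_to_idmap(unsigned long x)
        {
                return (unsigned long)(__virt_to_phys(x)
                                       - HIGH_PHYS_START + LOW_PHYS_START);
        }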

We are well aware of the fact that we are barely scratching the surface 
of the problem space here, and we'd be very thankful for a heads up on 
issues that we may have missed so far.  We would similarly appreciate 
any better ideas to solve this problem in light of the unique 
constraints imposed here.

-- 
Thanks
- Cyril

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH 02/22] ARM: use late patch framework for phys-virt patching
  2012-08-05 14:03       ` Cyril Chemparathy
@ 2012-08-06  2:06         ` Nicolas Pitre
  -1 siblings, 0 replies; 127+ messages in thread
From: Nicolas Pitre @ 2012-08-06  2:06 UTC (permalink / raw)
  To: Cyril Chemparathy
  Cc: linux-arm-kernel, linux-kernel, arnd, catalin.marinas, linux,
	will.deacon

On Sun, 5 Aug 2012, Cyril Chemparathy wrote:

> Hi Nicolas,
> 
> On 8/4/2012 2:15 AM, Nicolas Pitre wrote:
> > On Tue, 31 Jul 2012, Cyril Chemparathy wrote:
> > 
> > > This patch replaces the original physical offset patching implementation
> > > with one that uses the newly added patching framework.  In the process, we
> > > now
> > > unconditionally initialize the __pv_phys_offset and __pv_offset globals in
> > > the
> > > head.S code.
> > 
> > Why unconditionally initializing those?  There is no reason for that.
> > 
> 
> We could keep this conditional on LPAE, but do you see any specific need for
> keeping it conditional?

This has nothing to do with LPAE.  This is about 
CONFIG_ARM_PATCH_PHYS_VIRT only.  If not selected, those global 
variables have no need to exist.

> > Comments below.
> > 
> > > diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
> > > index 835898e..d165896 100644
> > > --- a/arch/arm/kernel/head.S
> > > +++ b/arch/arm/kernel/head.S
> > [...]
> > >   	.data
> > >   	.globl	__pv_phys_offset
> > >   	.type	__pv_phys_offset, %object
> > >   __pv_phys_offset:
> > >   	.long	0
> > >   	.size	__pv_phys_offset, . - __pv_phys_offset
> > > +
> > > +	.globl	__pv_offset
> > > +	.type	__pv_offset, %object
> > >   __pv_offset:
> > >   	.long	0
> > > -#endif
> > > +	.size	__pv_offset, . - __pv_offset
> > 
> > Please move those to C code.  They aren't of much use in this file
> > anymore.  This will allow you to use phys_addr_t for them as well in
> > your subsequent patch. And more importantly get rid of that ugly
> > pv_offset_high that you introduced in another patch.
> > 
> 
> Moving it to C-code caused problems because these get filled in prior to BSS
> being cleared.
> 
> We could potentially have this initialized in C with a mystery dummy value to
> prevent it from landing in BSS.  Would that be acceptable?

Just initialize them explicitly to zero.  They will end up in the .data 
section.
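
And should zero initialization alone still land them in BSS on some 
toolchain, an explicit section attribute would pin them down; a sketch:

        /* sketch: force placement in .data, since these are written
         * before the BSS is cleared */
        phys_addr_t __pv_phys_offset __attribute__((section(".data")));
        unsigned long __pv_offset __attribute__((section(".data")));
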
> 
> > > index df5e897..39f8fce 100644
> > > --- a/arch/arm/kernel/module.c
> > > +++ b/arch/arm/kernel/module.c
> > > @@ -317,11 +317,6 @@ int module_finalize(const Elf32_Ehdr *hdr, const
> > > Elf_Shdr *sechdrs,
> > >   					         maps[i].txt_sec->sh_addr,
> > >   					         maps[i].txt_sec->sh_size);
> > >   #endif
> > > -#ifdef CONFIG_ARM_PATCH_PHYS_VIRT
> > > -	s = find_mod_section(hdr, sechdrs, ".pv_table");
> > > -	if (s)
> > > -		fixup_pv_table((void *)s->sh_addr, s->sh_size);
> > > -#endif
> > >   	s = find_mod_section(hdr, sechdrs, ".patch.table");
> > >   	if (s)
> > >   		patch_kernel((void *)s->sh_addr, s->sh_size);
> > 
> > The patch_kernel code and its invocation should still be conditional on
> > CONFIG_ARM_PATCH_PHYS_VIRT.  This ability may still be configured out
> > irrespective of the implementation used.
> > 
> 
> Maybe CONFIG_ARM_PATCH_PHYS_VIRT is not quite appropriate if this is used to
> patch up other things in addition to phys-virt stuff?

Maybe, but at the moment this is not the case.

> I could have this dependent on CONFIG_ARM_INIT_PATCH (or whatever nomenclature
> we chose for this) and have CONFIG_ARM_PATCH_PHYS_VIRT depend on it.

Let's cross that bridge in time.

FWIW, I don't like "init patch" much.  I feel like the "runtime" 
qualifier more precisely describes this code than "init".


Nicolas

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH 04/22] ARM: LPAE: support 64-bit virt/phys patching
  2012-08-05 14:21       ` Cyril Chemparathy
@ 2012-08-06  2:19         ` Nicolas Pitre
  -1 siblings, 0 replies; 127+ messages in thread
From: Nicolas Pitre @ 2012-08-06  2:19 UTC (permalink / raw)
  To: Cyril Chemparathy
  Cc: linux-arm-kernel, linux-kernel, arnd, catalin.marinas, linux,
	will.deacon

On Sun, 5 Aug 2012, Cyril Chemparathy wrote:

> Hi Nicolas,
> 
> On 8/4/2012 2:49 AM, Nicolas Pitre wrote:
> > On Tue, 31 Jul 2012, Cyril Chemparathy wrote:
> > 
> > > This patch adds support for 64-bit physical addresses in virt_to_phys
> > > patching.  This does not do real 64-bit add/sub, but instead patches in
> > > the
> > > upper 32-bits of the phys_offset directly into the output of virt_to_phys.
> > 
> > You should explain _why_ you do not do a real add/sub.  I did deduce it
> > but that might not be obvious to everyone.  Also this subtlety should be
> > commented in the code as well.
> > 
> 
> We could not do an ADDS + ADC here because the carry is not guaranteed to be
> retained and passed into the ADC.  This is because the compiler is free to
> insert all kinds of stuff between the two non-volatile asm blocks.
> 
> Is there another subtlety here that I have missed out on entirely?

The high bits for the valid physical memory address range for which 
virt_to_phys and phys_to_virt can be used are always the same.  
Therefore no addition at all is needed, fake or real.  Only providing 
those bits in the top word for the value returned by virt_to_phys is 
needed.

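In other words, something along these lines (a sketch; the 
early_patch_mov() helper is assumed here, it does not exist in this 
series):

        static inline phys_addr_t __virt_to_phys(unsigned long x)
        {
                unsigned long tlo, thi;

                early_patch_imm8(x, tlo, "add", __pv_offset);
                if (sizeof(phys_addr_t) <= 4)
                        return tlo;
                /* the high word is constant for all valid inputs, so
                 * a patched mov suffices: no load, no add-with-zero */
                early_patch_mov(thi, __pv_phys_offset_high);
                return ((u64)thi << 32) | tlo;
        }
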
> > > In addition to adding 64-bit support, this patch also adds a
> > > set_phys_offset()
> > > helper that is needed on architectures that need to modify PHYS_OFFSET
> > > during
> > > initialization.
> > > 
> > > Signed-off-by: Cyril Chemparathy <cyril@ti.com>
> > > ---
> > >   arch/arm/include/asm/memory.h |   22 +++++++++++++++-------
> > >   arch/arm/kernel/head.S        |    6 ++++++
> > >   arch/arm/kernel/setup.c       |   14 ++++++++++++++
> > >   3 files changed, 35 insertions(+), 7 deletions(-)
> > > 
> > > diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
> > > index 4a0108f..110495c 100644
> > > --- a/arch/arm/include/asm/memory.h
> > > +++ b/arch/arm/include/asm/memory.h
> > > @@ -153,23 +153,31 @@
> > >   #ifdef CONFIG_ARM_PATCH_PHYS_VIRT
> > > 
> > >   extern unsigned long __pv_phys_offset;
> > > -#define PHYS_OFFSET __pv_phys_offset
> > > -
> > > +extern unsigned long __pv_phys_offset_high;
> > 
> > As mentioned previously, this is just too ugly.  Please make
> > __pv_phys_offset into a phys_addr_t instead and mask the low/high parts
> > as needed in __virt_to_phys().
> > 
> 
> Maybe u64 instead of phys_addr_t to keep the sizing non-variable?

No.  When not using LPAE, we don't have to pay the price of a u64 value.  
That's why the phys_addr_t type is conditionally defined.  You already 
do extra processing in virt_to_phys when sizeof(phys_addr_t) > 4, which 
is perfect for dealing with this issue.


Nicolas

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH 09/22] ARM: LPAE: use 64-bit pgd physical address in switch_mm()
  2012-08-05 14:29       ` Cyril Chemparathy
@ 2012-08-06  2:35         ` Nicolas Pitre
  -1 siblings, 0 replies; 127+ messages in thread
From: Nicolas Pitre @ 2012-08-06  2:35 UTC (permalink / raw)
  To: Cyril Chemparathy
  Cc: linux-arm-kernel, linux-kernel, Arnd Bergmann, Catalin Marinas,
	Russell King - ARM Linux, Will Deacon, Vitaly Andrianov

On Sun, 5 Aug 2012, Cyril Chemparathy wrote:

> On 8/4/2012 3:04 AM, Nicolas Pitre wrote:
> > On Tue, 31 Jul 2012, Cyril Chemparathy wrote:
> > 
> > > This patch modifies the switch_mm() processor functions to use 64-bit
> > > addresses.  We use u64 instead of phys_addr_t, in order to avoid having
> > > config
> > > dependent register usage when calling into switch_mm assembly code.
> > > 
> > > The changes in this patch are primarily adjustments for registers used for
> > > arguments to switch_mm.  The few processor definitions that did use the
> > > second
> > > argument have been modified accordingly.
> > > 
> > > Arguments and calling conventions aside, this patch should be a no-op on
> > > v6
> > > and non-LPAE v7 processors.
> > 
> > NAK.
> > 
> > You just broke all big endian targets, LPAE or not.
> > 
> 
> Indeed.  Thanks.
> 
> Would C-land word swappery on BE do?  Any other ideas on the best approach to
> this?

First, don't use a u64 unconditionally. A phys_addr_t is best for the 
same arguments as before.  Since this is equivalent to a u64 only when 
LPAE is defined, you then only have to care about endian issues in 
proc-v7-3level.S.  And in there you can deal with the issue with 
register aliases just as it is done in lib/div64.S.

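i.e. something along these lines (register names purely illustrative):

        /* sketch: endian-agnostic aliases for the 64-bit pgd argument
         * passed in r0/r1, as done in lib/div64.S */
        #ifdef __ARMEB__
        #define rpgdh   r0      /* high word arrives in r0 on BE */
        #define rpgdl   r1
        #else
        #define rpgdl   r0      /* low word arrives in r0 on LE */
        #define rpgdh   r1
        #endif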

Nicolas

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH 01/22] ARM: add mechanism for late code patching
  2012-07-31 23:04   ` Cyril Chemparathy
@ 2012-08-06 11:12     ` Russell King - ARM Linux
  -1 siblings, 0 replies; 127+ messages in thread
From: Russell King - ARM Linux @ 2012-08-06 11:12 UTC (permalink / raw)
  To: Cyril Chemparathy
  Cc: linux-arm-kernel, linux-kernel, arnd, catalin.marinas, nico, will.deacon

On Tue, Jul 31, 2012 at 07:04:37PM -0400, Cyril Chemparathy wrote:
> +static void __init init_patch_kernel(void)
> +{
> +	const void *start = &__patch_table_begin;
> +	const void *end   = &__patch_table_end;
> +
> +	BUG_ON(patch_kernel(start, end - start));
> +	flush_icache_range(init_mm.start_code, init_mm.end_code);

Err.  You are asking the kernel to flush every single cache line
manually throughout the kernel code.  That's a flush every 32 bytes
over maybe a few megabytes of address space.

This is one of the reasons we do the patching in assembly code before
the caches are enabled - so we don't have to worry about the interaction
with the CPU caches, which for this kind of application would be very
expensive.

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH 03/22] ARM: LPAE: use phys_addr_t on virt <--> phys conversion
  2012-07-31 23:04   ` Cyril Chemparathy
@ 2012-08-06 11:14     ` Russell King - ARM Linux
  -1 siblings, 0 replies; 127+ messages in thread
From: Russell King - ARM Linux @ 2012-08-06 11:14 UTC (permalink / raw)
  To: Cyril Chemparathy
  Cc: linux-arm-kernel, linux-kernel, arnd, catalin.marinas, nico,
	will.deacon, Vitaly Andrianov

On Tue, Jul 31, 2012 at 07:04:39PM -0400, Cyril Chemparathy wrote:
> This patch fixes up the types used when converting back and forth between
> physical and virtual addresses.
> 
> Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
> Signed-off-by: Cyril Chemparathy <cyril@ti.com>
> ---
>  arch/arm/include/asm/memory.h |   17 +++++++++++------
>  1 file changed, 11 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
> index 01c710d..4a0108f 100644
> --- a/arch/arm/include/asm/memory.h
> +++ b/arch/arm/include/asm/memory.h
> @@ -157,22 +157,27 @@ extern unsigned long __pv_phys_offset;
>  
>  extern unsigned long __pv_offset;
>  
> -static inline unsigned long __virt_to_phys(unsigned long x)
> +static inline phys_addr_t __virt_to_phys(unsigned long x)
>  {
>  	unsigned long t;
>  	early_patch_imm8(x, t, "add", __pv_offset);
>  	return t;
>  }
>  
> -static inline unsigned long __phys_to_virt(unsigned long x)
> +static inline unsigned long __phys_to_virt(phys_addr_t x)
>  {
>  	unsigned long t;
>  	early_patch_imm8(x, t, "sub", __pv_offset);
>  	return t;
>  }
>  #else
> -#define __virt_to_phys(x)	((x) - PAGE_OFFSET + PHYS_OFFSET)
> -#define __phys_to_virt(x)	((x) - PHYS_OFFSET + PAGE_OFFSET)
> +
> +#define __virt_to_phys(x)		\
> +	((phys_addr_t)(x) - PAGE_OFFSET + PHYS_OFFSET)
> +
> +#define __phys_to_virt(x)		\
> +	((unsigned long)((phys_addr_t)(x) - PHYS_OFFSET + PAGE_OFFSET))
> +
>  #endif
>  #endif
>  
> @@ -207,14 +212,14 @@ static inline phys_addr_t virt_to_phys(const volatile void *x)
>  
>  static inline void *phys_to_virt(phys_addr_t x)
>  {
> -	return (void *)(__phys_to_virt((unsigned long)(x)));
> +	return (void *)__phys_to_virt(x);
>  }
>  
>  /*
>   * Drivers should NOT use these either.
>   */
>  #define __pa(x)			__virt_to_phys((unsigned long)(x))
> -#define __va(x)			((void *)__phys_to_virt((unsigned long)(x)))
> +#define __va(x)			((void *)__phys_to_virt((phys_addr_t)(x)))
>  #define pfn_to_kaddr(pfn)	__va((pfn) << PAGE_SHIFT)

This as a whole does not fill me with a great amount of enthusiasm,
because it breaks some of the typechecking that we have here.

The whole point of __phys_to_virt() and __virt_to_phys() is that they work
on integer types, and warn if you dare to use them with pointers.  Adding
a cast into them breaks that.

The whole point is that the typecasting is explicitly inside phys_to_virt()
and virt_to_phys() and not their macro counterparts.

Secondly, are you sure that this patch is correct on its own?  You're
passing a u64 into assembly only expecting a 32-bit register.  Have you
checked it does the right thing with a 64-bit phys_addr_t on both LE
and BE?

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH 01/22] ARM: add mechanism for late code patching
  2012-08-06 11:12     ` Russell King - ARM Linux
@ 2012-08-06 13:19       ` Cyril Chemparathy
  -1 siblings, 0 replies; 127+ messages in thread
From: Cyril Chemparathy @ 2012-08-06 13:19 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: linux-arm-kernel, linux-kernel, arnd, catalin.marinas, nico, will.deacon

On 8/6/2012 7:12 AM, Russell King - ARM Linux wrote:
> On Tue, Jul 31, 2012 at 07:04:37PM -0400, Cyril Chemparathy wrote:
>> +static void __init init_patch_kernel(void)
>> +{
>> +	const void *start = &__patch_table_begin;
>> +	const void *end   = &__patch_table_end;
>> +
>> +	BUG_ON(patch_kernel(start, end - start));
>> +	flush_icache_range(init_mm.start_code, init_mm.end_code);
>
> Err.  You are asking the kernel to flush every single cache line
> manually throughout the kernel code.  That's a flush every 32-bytes
> over maybe a few megabytes of address space.
>

With a flush_cache_all(), we could avoid having to operate a cacheline 
at a time, but that clobbers way more than necessary.

Maybe the better answer is to flush only the patched cachelines.

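Perhaps something like this (sketch only; the helper name is made up):

        /* sketch: write one patched instruction and flush just the
         * line(s) it occupies, instead of all of kernel text */
        static void __init patch_insn(u32 *addr, u32 insn)
        {
                *addr = insn;
                flush_icache_range((unsigned long)addr,
                                   (unsigned long)(addr + 1));
        }
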
> This is one of the reasons we do the patching in assembly code before
> the caches are enabled - so we don't have to worry about the interaction
> with the CPU caches, which for this kind of application would be very
> expensive.
>

Sure, flushing caches is expensive.  But then, so is running the 
patching code with caches disabled.  I guess memory access latencies 
drive the performance trade-off here.

-- 
Thanks
- Cyril

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH 01/22] ARM: add mechanism for late code patching
  2012-08-06 13:19       ` Cyril Chemparathy
@ 2012-08-06 13:26         ` Russell King - ARM Linux
  -1 siblings, 0 replies; 127+ messages in thread
From: Russell King - ARM Linux @ 2012-08-06 13:26 UTC (permalink / raw)
  To: Cyril Chemparathy
  Cc: linux-arm-kernel, linux-kernel, arnd, catalin.marinas, nico, will.deacon

On Mon, Aug 06, 2012 at 09:19:10AM -0400, Cyril Chemparathy wrote:
> With a flush_cache_all(), we could avoid having to operate a cacheline  
> at a time, but that clobbers way more than necessary.

You can't do that, because flush_cache_all() on some CPUs requires the
proper MMU mappings to be in place, and you can't get those mappings
in place because you don't have the V:P offsets fixed up in the kernel.
Welcome to the chicken and egg problem.

> Sure, flushing caches is expensive.  But then, so is running the  
> patching code with caches disabled.  I guess memory access latencies  
> drive the performance trade off here.

There we disagree on a few orders of magnitude.  There are relatively
few places that need updating.  According to the kernel I have here:

   text    data     bss     dec     hex filename
7644346  454320  212984 8311650  7ed362 vmlinux

Idx Name          Size      VMA       LMA       File off  Algn
  1 .text         004cd170  c00081c0  c00081c0  000081c0  2**5
 16 .init.pv_table 00000300  c0753a24  c0753a24  00753a24  2**0

That's about 7MB of text, and only 192 points in that code which need
patching.  Even if we did this with caches on, that's still 192 places,
and only 192 places we'd need to flush a cache line.

Alternatively, with your approach and 7MB of text, you need to flush
238885 cache lines to cover the entire kernel.

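(For the arithmetic: .init.pv_table is 0x300 = 768 bytes at one word per 
patch site, i.e. 768 / 4 = 192 entries, whereas flushing all of the text 
line by line means 7644346 / 32 ~= 238885 operations.)
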
It would be far _cheaper_ with your approach to flush the individual
cache lines as you go.

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH 03/22] ARM: LPAE: use phys_addr_t on virt <--> phys conversion
  2012-08-06 11:14     ` Russell King - ARM Linux
@ 2012-08-06 13:30       ` Cyril Chemparathy
  -1 siblings, 0 replies; 127+ messages in thread
From: Cyril Chemparathy @ 2012-08-06 13:30 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: linux-arm-kernel, linux-kernel, arnd, catalin.marinas, nico,
	will.deacon, Vitaly Andrianov

On 8/6/2012 7:14 AM, Russell King - ARM Linux wrote:
> On Tue, Jul 31, 2012 at 07:04:39PM -0400, Cyril Chemparathy wrote:
>> This patch fixes up the types used when converting back and forth between
>> physical and virtual addresses.
>>
>> Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
>> Signed-off-by: Cyril Chemparathy <cyril@ti.com>
>> ---
>>   arch/arm/include/asm/memory.h |   17 +++++++++++------
>>   1 file changed, 11 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
>> index 01c710d..4a0108f 100644
>> --- a/arch/arm/include/asm/memory.h
>> +++ b/arch/arm/include/asm/memory.h
>> @@ -157,22 +157,27 @@ extern unsigned long __pv_phys_offset;
>>
>>   extern unsigned long __pv_offset;
>>
>> -static inline unsigned long __virt_to_phys(unsigned long x)
>> +static inline phys_addr_t __virt_to_phys(unsigned long x)
>>   {
>>   	unsigned long t;
>>   	early_patch_imm8(x, t, "add", __pv_offset);
>>   	return t;
>>   }
>>
>> -static inline unsigned long __phys_to_virt(unsigned long x)
>> +static inline unsigned long __phys_to_virt(phys_addr_t x)
>>   {
>>   	unsigned long t;
>>   	early_patch_imm8(x, t, "sub", __pv_offset);
>>   	return t;
>>   }
>>   #else
>> -#define __virt_to_phys(x)	((x) - PAGE_OFFSET + PHYS_OFFSET)
>> -#define __phys_to_virt(x)	((x) - PHYS_OFFSET + PAGE_OFFSET)
>> +
>> +#define __virt_to_phys(x)		\
>> +	((phys_addr_t)(x) - PAGE_OFFSET + PHYS_OFFSET)
>> +
>> +#define __phys_to_virt(x)		\
>> +	((unsigned long)((phys_addr_t)(x) - PHYS_OFFSET + PAGE_OFFSET))
>> +
>>   #endif
>>   #endif
>>
>> @@ -207,14 +212,14 @@ static inline phys_addr_t virt_to_phys(const volatile void *x)
>>
>>   static inline void *phys_to_virt(phys_addr_t x)
>>   {
>> -	return (void *)(__phys_to_virt((unsigned long)(x)));
>> +	return (void *)__phys_to_virt(x);
>>   }
>>
>>   /*
>>    * Drivers should NOT use these either.
>>    */
>>   #define __pa(x)			__virt_to_phys((unsigned long)(x))
>> -#define __va(x)			((void *)__phys_to_virt((unsigned long)(x)))
>> +#define __va(x)			((void *)__phys_to_virt((phys_addr_t)(x)))
>>   #define pfn_to_kaddr(pfn)	__va((pfn) << PAGE_SHIFT)
>
> This as a whole does not fill me with a great amount of enthusiasm,
> because it breaks some of the typechecking that we have here.
>
> The whole point of __phys_to_virt() and __virt_to_phys() is that they work
> on integer types, and warn if you dare to use them with pointers.  Adding
> a cast into them breaks that.
>

Understood.  Thanks.  The casts were needed to widen the values to 
64-bit before the arithmetic.  We should convert the non-patch 
__phys_to_virt and __virt_to_phys to inlines to keep the typechecking 
intact.

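Something like this, perhaps (sketch):

        /* sketch: non-patching variants as inlines, keeping the
         * integer-only typechecking while widening before arithmetic */
        static inline phys_addr_t __virt_to_phys(unsigned long x)
        {
                return (phys_addr_t)x - PAGE_OFFSET + PHYS_OFFSET;
        }

        static inline unsigned long __phys_to_virt(phys_addr_t x)
        {
                return (unsigned long)(x - PHYS_OFFSET + PAGE_OFFSET);
        }
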
> The whole point is that the typecasting is explicitly inside phys_to_virt()
> and virt_to_phys() and not their macro counterparts.
>
> Secondly, are you sure that this patch is correct on its own?  You're
> passing a u64 into assembly only expecting a 32-bit register.  Have you
> checked it does the right thing with a 64-bit phys_addr_t on both LE
> and BE?
>

We should explicitly pass in the lower-order bits here, at least until 
the next patch in the series fixes things up for 64-bit.  Thanks.

We've tested with 64-bit and 32-bit phys_addr_t, but only on LE.  Thanks 
for pointing this out, we'll figure out a way to run BE as well.

-- 
Thanks
- Cyril

* Re: [PATCH 01/22] ARM: add mechanism for late code patching
  2012-08-06 13:26         ` Russell King - ARM Linux
@ 2012-08-06 13:38           ` Cyril Chemparathy
  -1 siblings, 0 replies; 127+ messages in thread
From: Cyril Chemparathy @ 2012-08-06 13:38 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: linux-arm-kernel, linux-kernel, arnd, catalin.marinas, nico, will.deacon

On 8/6/2012 9:26 AM, Russell King - ARM Linux wrote:
> On Mon, Aug 06, 2012 at 09:19:10AM -0400, Cyril Chemparathy wrote:
>> With a flush_cache_all(), we could avoid having to operate a cacheline
>> at a time, but that clobbers way more than necessary.
>
> You can't do that, because flush_cache_all() on some CPUs requires the
> proper MMU mappings to be in place, and you can't get those mappings
> in place because you don't have the V:P offsets fixed up in the kernel.
> Welcome to the chicken and egg problem.
>
>> Sure, flushing caches is expensive.  But then, so is running the
>> patching code with caches disabled.  I guess memory access latencies
>> drive the performance trade off here.
>
> There we disagree on a few orders of magnitude.  There are relatively
> few places that need updating.  According to the kernel I have here:
>
>     text    data     bss     dec     hex filename
> 7644346  454320  212984 8311650  7ed362 vmlinux
>
> Idx Name          Size      VMA       LMA       File off  Algn
>    1 .text         004cd170  c00081c0  c00081c0  000081c0  2**5
>   16 .init.pv_table 00000300  c0753a24  c0753a24  00753a24  2**0
>
> That's about 7MB of text, and only 192 points in that code which need
> patching.  Even if we did this with caches on, that's still 192 places,
> and only 192 places we'd need to flush a cache line.
>
> Alternatively, with your approach and 7MB of text, you need to flush
> 238885 cache lines to cover the entire kernel.
>
> It would be far _cheaper_ with your approach to flush the individual
> cache lines as you go.
>

Agreed.  Thanks.
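
Something along these lines, then (a sketch: patch_one() is a 
placeholder for the fixup logic; flush_icache_range() is the existing 
interface):

extern unsigned __patch_table_begin, __patch_table_end;

static void __init apply_patches(void)
{
	unsigned *p;

	/* walk the patch table; flush only the lines actually patched */
	for (p = &__patch_table_begin; p < &__patch_table_end; p++) {
		u32 *site = (u32 *)*p;		/* patch site address */

		*site = patch_one(*site);	/* placeholder fixup */
		flush_icache_range((unsigned long)site,
				   (unsigned long)(site + 1));
	}
}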

-- 
Thanks
- Cyril

* Re: [PATCH 01/22] ARM: add mechanism for late code patching
  2012-08-06 13:26         ` Russell King - ARM Linux
@ 2012-08-06 18:02           ` Nicolas Pitre
  -1 siblings, 0 replies; 127+ messages in thread
From: Nicolas Pitre @ 2012-08-06 18:02 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Cyril Chemparathy, linux-arm-kernel, linux-kernel, arnd,
	catalin.marinas, will.deacon

On Mon, 6 Aug 2012, Russell King - ARM Linux wrote:

> On Mon, Aug 06, 2012 at 09:19:10AM -0400, Cyril Chemparathy wrote:
> > With a flush_cache_all(), we could avoid having to operate a cacheline  
> > at a time, but that clobbers way more than necessary.
> 
> You can't do that, because flush_cache_all() on some CPUs requires the
> proper MMU mappings to be in place, and you can't get those mappings
> in place because you don't have the V:P offsets fixed up in the kernel.
> Welcome to the chicken and egg problem.

This problem is fixed in this case by having the p2v and v2p code sites 
use an out-of-line, non-optimized computation until those sites are 
runtime patched with the inline optimized computation we have today.
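
In C terms the fallback amounts to something like this (a sketch; 
__phys_to_virt_ool is a made-up name, and the real sites are asm):

/* slow out-of-line computation, usable before any site is patched */
static noinline unsigned long __phys_to_virt_ool(phys_addr_t x)
{
	return (unsigned long)(x - PHYS_OFFSET + PAGE_OFFSET);
}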


Nicolas

* Re: [PATCH 01/22] ARM: add mechanism for late code patching
  2012-08-04  5:38     ` Nicolas Pitre
@ 2012-08-07 22:52       ` Cyril Chemparathy
  -1 siblings, 0 replies; 127+ messages in thread
From: Cyril Chemparathy @ 2012-08-07 22:52 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: linux-arm-kernel, linux-kernel, arnd, catalin.marinas, linux,
	will.deacon

Hi Nicolas,

On 8/4/2012 1:38 AM, Nicolas Pitre wrote:
[...]
>> extern unsigned __patch_table_begin, __patch_table_end;
>
> You could use "extern void __patch_table_begin" so those symbols don't
> get any type that could be misused by mistake, while you still can take
> their addresses.
>

Looks like we'll have to stick with a non-void type here.  The compiler 
throws a warning when we try to take the address of a void.

[...]
> Did you verify with some test program that your patching routines do
> produce the same opcodes as the assembled equivalent for all possible
> shift values?  Especially for Thumb2 code which isn't as trivial to get
> right as the ARM one.
>

We've refactored the patching code into separate functions as:

static int do_patch_imm8_arm(u32 insn, u32 imm, u32 *ninsn);
static int do_patch_imm8_thumb(u32 insn, u32 imm, u32 *ninsn);


With this, the following test code has been used to verify the generated 
instruction encoding:

u32 arm_check[] = {
	0xe2810041, 0xe2810082, 0xe2810f41, 0xe2810f82, 0xe2810e41,
	0xe2810e82, 0xe2810d41, 0xe2810d82, 0xe2810c41, 0xe2810c82,
	0xe2810b41, 0xe2810b82, 0xe2810a41, 0xe2810a82, 0xe2810941,
	0xe2810982, 0xe2810841, 0xe2810882, 0xe2810741, 0xe2810782,
	0xe2810641, 0xe2810682, 0xe2810541, 0xe2810582, 0xe2810441,
};

u32 thumb_check[] = {
	0xf1010081, 0xf5017081, 0xf5017001, 0xf5016081, 0xf5016001,
	0xf5015081, 0xf5015001, 0xf5014081, 0xf5014001, 0xf5013081,
	0xf5013001, 0xf5012081, 0xf5012001, 0xf5011081, 0xf5011001,
	0xf5010081, 0xf5010001, 0xf1017081, 0xf1017001, 0xf1016081,
	0xf1016001, 0xf1015081, 0xf1015001, 0xf1014081, 0xf1014001,
};

int do_test(void)
{
	int i, ret;
	u32 ninsn, insn;

	insn = arm_check[0];
	for (i = 0; i < ARRAY_SIZE(arm_check); i++) {
		ret = do_patch_imm8_arm(insn, 0x41 << i, &ninsn);
		if (ret < 0)
			pr_err("patch failed at shift %d\n", i);
		if (ninsn != arm_check[i])
			pr_err("mismatch at %d, expect %x, got %x\n",
			       i, arm_check[i], ninsn);
	}

	insn = thumb_check[0];
	for (i = 0; i < ARRAY_SIZE(thumb_check); i++) {
		ret = do_patch_imm8_thumb(insn, 0x81 << i, &ninsn);
		if (ret < 0)
			pr_err("patch failed at shift %d\n", i);
		if (ninsn != thumb_check[i])
			pr_err("mismatch at %d, expect %x, got %x\n",
			       i, thumb_check[i], ninsn);
	}

	return 0;
}

Any ideas on improving these tests?

-- 
Thanks
- Cyril

* Re: [PATCH 01/22] ARM: add mechanism for late code patching
  2012-08-07 22:52       ` Cyril Chemparathy
@ 2012-08-08  5:56         ` Nicolas Pitre
  -1 siblings, 0 replies; 127+ messages in thread
From: Nicolas Pitre @ 2012-08-08  5:56 UTC (permalink / raw)
  To: Cyril Chemparathy
  Cc: linux-arm-kernel, linux-kernel, Arnd Bergmann, Catalin Marinas,
	Russell King - ARM Linux, Will Deacon

On Tue, 7 Aug 2012, Cyril Chemparathy wrote:

> Hi Nicolas,
> 
> On 8/4/2012 1:38 AM, Nicolas Pitre wrote:
> [...]
> > > extern unsigned __patch_table_begin, __patch_table_end;
> > 
> > You could use "extern void __patch_table_begin" so those symbols don't
> > get any type that could be misused by mistake, while you still can take
> > their addresses.
> > 
> 
> Looks like we'll have to stick with a non-void type here.  The compiler throws
> a warning when we try to take the address of a void.

Ah, I see. Bummer.  This used not to be the case with older gcc 
versions.

> [...]
> > Did you verify with some test program that your patching routines do
> > produce the same opcodes as the assembled equivalent for all possible
> > shift values?  Especially for Thumb2 code which isn't as trivial to get
> > right as the ARM one.
> > 
> 
> We've refactored the patching code into separate functions as:
> 
> static int do_patch_imm8_arm(u32 insn, u32 imm, u32 *ninsn);
> static int do_patch_imm8_thumb(u32 insn, u32 imm, u32 *ninsn);
> 
> 
> With this, the following test code has been used to verify the generated
> instruction encoding:
> 
> u32 arm_check[] = {
> 	0xe2810041, 0xe2810082, 0xe2810f41, 0xe2810f82, 0xe2810e41,
> 	0xe2810e82, 0xe2810d41, 0xe2810d82, 0xe2810c41, 0xe2810c82,
> 	0xe2810b41, 0xe2810b82, 0xe2810a41, 0xe2810a82, 0xe2810941,
> 	0xe2810982, 0xe2810841, 0xe2810882, 0xe2810741, 0xe2810782,
> 	0xe2810641, 0xe2810682, 0xe2810541, 0xe2810582, 0xe2810441,
> };

Instead of using this array you could let the assembler do it for you 
like this:

asm (" \n\
	.arm \n\
arm_check: \n\
        .set shft, 0 \n\
        .rep 12 \n\
        add     r1, r2, #0x81 << \shft \n\
        .set shft, \shft + 2 \n\
        .endr \n\
");

> u32 thumb_check[] = {
> 	0xf1010081, 0xf5017081, 0xf5017001, 0xf5016081, 0xf5016001,
> 	0xf5015081, 0xf5015001, 0xf5014081, 0xf5014001, 0xf5013081,
> 	0xf5013001, 0xf5012081, 0xf5012001, 0xf5011081, 0xf5011001,
> 	0xf5010081, 0xf5010001, 0xf1017081, 0xf1017001, 0xf1016081,
> 	0xf1016001, 0xf1015081, 0xf1015001, 0xf1014081, 0xf1014001,

Same idea here.
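
e.g. roughly (same sketch, assuming unified syntax so the wide 
encoding is selected):

asm (" \n\
	.syntax unified \n\
	.thumb \n\
thumb_check: \n\
        .set shft, 0 \n\
        .rep 12 \n\
        add.w   r1, r2, #0x81 << shft \n\
        .set shft, shft + 2 \n\
        .endr \n\
");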


Nicolas

* Re: [PATCH 01/22] ARM: add mechanism for late code patching
  2012-08-08  5:56         ` Nicolas Pitre
@ 2012-08-08 13:18           ` Cyril Chemparathy
  -1 siblings, 0 replies; 127+ messages in thread
From: Cyril Chemparathy @ 2012-08-08 13:18 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: linux-arm-kernel, linux-kernel, Arnd Bergmann, Catalin Marinas,
	Russell King - ARM Linux, Will Deacon

On 08/08/12 01:56, Nicolas Pitre wrote:
> On Tue, 7 Aug 2012, Cyril Chemparathy wrote:
[...]
>> u32 arm_check[] = {
>> 	0xe2810041, 0xe2810082, 0xe2810f41, 0xe2810f82, 0xe2810e41,
>> 	0xe2810e82, 0xe2810d41, 0xe2810d82, 0xe2810c41, 0xe2810c82,
>> 	0xe2810b41, 0xe2810b82, 0xe2810a41, 0xe2810a82, 0xe2810941,
>> 	0xe2810982, 0xe2810841, 0xe2810882, 0xe2810741, 0xe2810782,
>> 	0xe2810641, 0xe2810682, 0xe2810541, 0xe2810582, 0xe2810441,
>> };
>
> Instead of using this array you could let the assembler do it for you
> like this:
>
> asm (" \n\
> 	.arm \n\
> arm_check: \n\
>          .set shft, 0 \n\
>          .rep 12 \n\
>          add     r1, r2, #0x81 << \shft \n\
>          .set shft, \shft + 2 \n\
>          .endr \n\
> ");
>

Neat macro magic.  Are you thinking that we should build this in as a 
self-test in the code?

Thanks
-- Cyril.

* Re: [PATCH 01/22] ARM: add mechanism for late code patching
  2012-08-08 13:18           ` Cyril Chemparathy
@ 2012-08-08 13:55             ` Nicolas Pitre
  -1 siblings, 0 replies; 127+ messages in thread
From: Nicolas Pitre @ 2012-08-08 13:55 UTC (permalink / raw)
  To: Cyril Chemparathy
  Cc: linux-arm-kernel, linux-kernel, Arnd Bergmann, Catalin Marinas,
	Russell King - ARM Linux, Will Deacon

On Wed, 8 Aug 2012, Cyril Chemparathy wrote:

> On 08/08/12 01:56, Nicolas Pitre wrote:
> > On Tue, 7 Aug 2012, Cyril Chemparathy wrote:
> [...]
> > > u32 arm_check[] = {
> > > 	0xe2810041, 0xe2810082, 0xe2810f41, 0xe2810f82, 0xe2810e41,
> > > 	0xe2810e82, 0xe2810d41, 0xe2810d82, 0xe2810c41, 0xe2810c82,
> > > 	0xe2810b41, 0xe2810b82, 0xe2810a41, 0xe2810a82, 0xe2810941,
> > > 	0xe2810982, 0xe2810841, 0xe2810882, 0xe2810741, 0xe2810782,
> > > 	0xe2810641, 0xe2810682, 0xe2810541, 0xe2810582, 0xe2810441,
> > > };
> > 
> > Instead of using this array you could let the assembler do it for you
> > like this:
> > 
> > asm (" \n\
> > 	.arm \n\
> > arm_check: \n\
> >          .set shft, 0 \n\
> >          .rep 12 \n\
> >          add     r1, r2, #0x81 << \shft \n\
> >          .set shft, \shft + 2 \n\
> >          .endr \n\
> > ");
> > 
> 
> Neat macro magic.  Are you thinking that we build this in as a self test in
> the code?

For such things, it is never a bad idea to have some test alongside 
the main code, especially if this is extended to more cases in the 
future.  It is too easy to break it in subtle ways.

See arch/arm/kernel/kprobes-test*.c for a precedent.


Nicolas

* Re: [PATCH 00/22] Introducing the TI Keystone platform
  2012-07-31 23:04 ` Cyril Chemparathy
@ 2012-08-08 13:57   ` Will Deacon
  -1 siblings, 0 replies; 127+ messages in thread
From: Will Deacon @ 2012-08-08 13:57 UTC (permalink / raw)
  To: Cyril Chemparathy
  Cc: linux-arm-kernel, linux-kernel, arnd, Catalin Marinas, nico, linux

Hi Cyril,

On Wed, Aug 01, 2012 at 12:04:36AM +0100, Cyril Chemparathy wrote:
> This series is a follow on to the RFC series posted earlier (archived at [1]).
> The major change introduced here is the modification to the kernel patching
> mechanism for phys_to_virt/virt_to_phys, in order to support LPAE platforms
> that require late patching.  In addition to these changes, we've updated the
> series based on feedback from the earlier posting.

One thing I've noticed going through this code and also looking at the rest
of the LPAE code in mainline is that it's not at all clear what is the maximum
physical address we can support for memory.

We currently have the following restrictions:

ARM architecture : 40 bits
ARCH_PGD_SHIFT   : 38 bits
swapfile         : 36 bits (I posted some patches for this. We could
                            extend to 37 bits if we complicate the code)
SPARSEMEM        : 36 bits (due to limited number of page-flags)

It would be nice if we could define a 36-bit memory limit across the kernel
for LPAE whilst allowing higher addresses to be used for peripherals. This
also matches x86 PAE, so the common code will also work correctly.

Otherwise I worry that we will see platforms with memory right at the top of
the physical map and these will be incredibly painful to support.
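
Enforcing that limit could then be as simple as this boot-time sketch 
(LPAE_MEM_LIMIT is a made-up name; memblock_remove() is the existing 
interface):

#include <linux/memblock.h>

#define LPAE_MEM_LIMIT	(1ULL << 36)	/* hypothetical 64GB cap */

static void __init cap_memory_at_36_bits(void)
{
	/* discard any RAM registered above the 36-bit boundary */
	memblock_remove(LPAE_MEM_LIMIT,
			~(phys_addr_t)0 - LPAE_MEM_LIMIT);
}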

Will

* Re: [PATCH 00/22] Introducing the TI Keystone platform
  2012-08-05 15:10     ` Cyril Chemparathy
@ 2012-08-08 15:43       ` Catalin Marinas
  -1 siblings, 0 replies; 127+ messages in thread
From: Catalin Marinas @ 2012-08-08 15:43 UTC (permalink / raw)
  To: Cyril Chemparathy
  Cc: Russell King - ARM Linux, linux-arm-kernel, linux-kernel, arnd,
	nico, Will Deacon

On Sun, Aug 05, 2012 at 04:10:34PM +0100, Cyril Chemparathy wrote:
> On 8/4/2012 4:39 AM, Russell King - ARM Linux wrote:
> > On Tue, Jul 31, 2012 at 07:04:36PM -0400, Cyril Chemparathy wrote:
> >> This series is a follow on to the RFC series posted earlier (archived at [1]).
> >> The major change introduced here is the modification to the kernel patching
> >> mechanism for phys_to_virt/virt_to_phys, in order to support LPAE platforms
> >> that require late patching.  In addition to these changes, we've updated the
> >> series based on feedback from the earlier posting.
> >>
> >> Most of the patches in this series are fixes and extensions to LPAE support on
> >> ARM. The last three patches in this series are specific to the TI Keystone
> >> platform, and are being provided here for the sake of completeness.  These
> >> three patches are dependent on the smpops patch set (see [2]), and are not
> >> ready to be merged in as yet.
> >
> > Can you explain why you want the kernel loaded above the 4GB watermark?
> > This seems silly to me, as the kernel needs to run at points with a 1:1
> > physical to virtual mapping, and you can't do that if the kernel is
> > stored in physical memory above the 4GB watermark.
[...]
> We are well aware of the fact that we are barely scratching the surface 
> of the problem space here, and we'd be very thankful for a heads up on 
> issues that we may have missed so far.

Another thing to be aware of is that, apart from a virtual alias between 
the kernel mapping and the idmap, you now introduce a physical alias as 
well, and the (physically tagged) caches get confused.

-- 
Catalin

* Re: [PATCH 01/22] ARM: add mechanism for late code patching
  2012-08-08 13:55             ` Nicolas Pitre
@ 2012-08-08 16:05               ` Russell King - ARM Linux
  -1 siblings, 0 replies; 127+ messages in thread
From: Russell King - ARM Linux @ 2012-08-08 16:05 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Cyril Chemparathy, linux-arm-kernel, linux-kernel, Arnd Bergmann,
	Catalin Marinas, Will Deacon

On Wed, Aug 08, 2012 at 09:55:12AM -0400, Nicolas Pitre wrote:
> On Wed, 8 Aug 2012, Cyril Chemparathy wrote:
> > Neat macro magic.  Are you thinking that we build this in as a self test in
> > the code?
> 
> For such things, this is never a bad idea to have some test alongside 
> with the main code, especially if this is extended to more cases in the 
> future.  It is too easy to break it in subtle ways.
> 
> See arch/arm/kernel/kprobes-test*.c for a precedent.

Done correctly, it shouldn't be a problem, but I wouldn't say that
arch/arm/kernel/kprobes-test*.c is done correctly.  It's seen quite
a number of patching attempts for various problems since it was
introduced, and I've seen quite a number of builds fail for various
reasons in this file (none of which I could be bothered to investigate).

When the test code ends up causing more problems than the code it's
testing, something is definitely wrong.

* Re: [PATCH 01/22] ARM: add mechanism for late code patching
  2012-08-08 16:05               ` Russell King - ARM Linux
@ 2012-08-08 16:56                 ` Nicolas Pitre
  -1 siblings, 0 replies; 127+ messages in thread
From: Nicolas Pitre @ 2012-08-08 16:56 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Cyril Chemparathy, linux-arm-kernel, linux-kernel, Arnd Bergmann,
	Catalin Marinas, Will Deacon

On Wed, 8 Aug 2012, Russell King - ARM Linux wrote:

> On Wed, Aug 08, 2012 at 09:55:12AM -0400, Nicolas Pitre wrote:
> > On Wed, 8 Aug 2012, Cyril Chemparathy wrote:
> > > Neat macro magic.  Are you thinking that we build this in as a self test in
> > > the code?
> > 
> > For such things, this is never a bad idea to have some test alongside 
> > with the main code, especially if this is extended to more cases in the 
> > future.  It is too easy to break it in subtle ways.
> > 
> > See arch/arm/kernel/kprobes-test*.c for a precedent.
> 
> Done correctly, it shouldn't be a problem, but I wouldn't say that
> arch/arm/kernel/kprobes-test*.c is done correctly.  It's seen quite
> a number of patching attempts since it was introduced for various
> problems, and I've seen quite a number of builds fail for various
> reasons in this file (none which I could be bothered to investigate.)
> 
> When the test code ends up causing more problems than the code it's
> testing, something is definitely wrong.

I think we shouldn't compare the complexity of test code for kprobes 
with test code for the runtime patching code.  The former, while more 
difficult to keep compiling, has found loads of issues in the kprobes 
code itself.  So it certainly paid back its maintenance cost many 
times over.

My mention of it wasn't about the actual test code implementation, but 
rather about the fact that we do have test code in the tree which can be 
enabled with a config option.

As for build failures with that test code, I'd suggest you simply drop a 
note to Tixy, who is normally very responsive.  I enable it at random 
myself and haven't run into any issues yet.


Nicolas

* Re: [PATCH 01/22] ARM: add mechanism for late code patching
  2012-08-08 16:56                 ` Nicolas Pitre
@ 2012-08-09  6:59                   ` Tixy
  -1 siblings, 0 replies; 127+ messages in thread
From: Tixy @ 2012-08-09  6:59 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Russell King - ARM Linux, Arnd Bergmann, Catalin Marinas,
	Will Deacon, linux-kernel, Cyril Chemparathy, linux-arm-kernel

On Wed, 2012-08-08 at 12:56 -0400, Nicolas Pitre wrote:
> On Wed, 8 Aug 2012, Russell King - ARM Linux wrote:
> > Done correctly, it shouldn't be a problem, but I wouldn't say that
> > arch/arm/kernel/kprobes-test*.c is done correctly.  It's seen quite
> > a number of patching attempts since it was introduced for various
> > problems, and I've seen quite a number of builds fail for various
> > reasons in this file (none which I could be bothered to investigate.)
<snip>
> >
> As for build failures with that test code, I'd suggest you simply drop a 
> note to Tixy who is normally very responsive.

Indeed. If there are build failures, I'm happy to investigate and fix.

-- 
Tixy


* Re: [PATCH 03/22] ARM: LPAE: use phys_addr_t on virt <--> phys conversion
  2012-08-06 11:14     ` Russell King - ARM Linux
@ 2012-08-09 14:10       ` Cyril Chemparathy
  -1 siblings, 0 replies; 127+ messages in thread
From: Cyril Chemparathy @ 2012-08-09 14:10 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: linux-arm-kernel, linux-kernel, arnd, catalin.marinas, nico,
	will.deacon, Vitaly Andrianov

Hi Russell,

On 8/6/2012 7:14 AM, Russell King - ARM Linux wrote:
> On Tue, Jul 31, 2012 at 07:04:39PM -0400, Cyril Chemparathy wrote:
>> This patch fixes up the types used when converting back and forth between
>> physical and virtual addresses.
>>
>> Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
>> Signed-off-by: Cyril Chemparathy <cyril@ti.com>
>> ---
>>   arch/arm/include/asm/memory.h |   17 +++++++++++------
>>   1 file changed, 11 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
>> index 01c710d..4a0108f 100644
>> --- a/arch/arm/include/asm/memory.h
>> +++ b/arch/arm/include/asm/memory.h
>> @@ -157,22 +157,27 @@ extern unsigned long __pv_phys_offset;
>>
>>   extern unsigned long __pv_offset;
>>
>> -static inline unsigned long __virt_to_phys(unsigned long x)
>> +static inline phys_addr_t __virt_to_phys(unsigned long x)
>>   {
>>   	unsigned long t;
>>   	early_patch_imm8(x, t, "add", __pv_offset);
>>   	return t;
>>   }
>>
>> -static inline unsigned long __phys_to_virt(unsigned long x)
>> +static inline unsigned long __phys_to_virt(phys_addr_t x)
>>   {
>>   	unsigned long t;
>>   	early_patch_imm8(x, t, "sub", __pv_offset);
>>   	return t;
>>   }
>>   #else
>> -#define __virt_to_phys(x)	((x) - PAGE_OFFSET + PHYS_OFFSET)
>> -#define __phys_to_virt(x)	((x) - PHYS_OFFSET + PAGE_OFFSET)
>> +
>> +#define __virt_to_phys(x)		\
>> +	((phys_addr_t)(x) - PAGE_OFFSET + PHYS_OFFSET)
>> +
>> +#define __phys_to_virt(x)		\
>> +	((unsigned long)((phys_addr_t)(x) - PHYS_OFFSET + PAGE_OFFSET))
>> +
>>   #endif
>>   #endif
>>
>> @@ -207,14 +212,14 @@ static inline phys_addr_t virt_to_phys(const volatile void *x)
>>
>>   static inline void *phys_to_virt(phys_addr_t x)
>>   {
>> -	return (void *)(__phys_to_virt((unsigned long)(x)));
>> +	return (void *)__phys_to_virt(x);
>>   }
>>
>>   /*
>>    * Drivers should NOT use these either.
>>    */
>>   #define __pa(x)			__virt_to_phys((unsigned long)(x))
>> -#define __va(x)			((void *)__phys_to_virt((unsigned long)(x)))
>> +#define __va(x)			((void *)__phys_to_virt((phys_addr_t)(x)))
>>   #define pfn_to_kaddr(pfn)	__va((pfn) << PAGE_SHIFT)
>
> This as a whole does not fill me with a great amount of enthusiasm,
> because it breaks some of the typechecking that we have here.
>
> The whole point of __phys_to_virt() and __virt_to_phys() is that they work
> on integer types, and warn if you dare to use them with pointers.  Adding
> a cast into them breaks that.
>
> The whole point is that the typecasting is explicitly inside phys_to_virt()
> and virt_to_phys() and not their macro counterparts.
>

The casts in __phys_to_virt() and __virt_to_phys() were necessary to 
widen the integer types in the LPAE case without phys/virt patching.

I assume that this specifically is the typecasting that you are 
concerned about.  Would it be better, then, to convert these to 
inlines?  That way we could get the typechecking, with proper widening 
as needed.
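
For instance, a minimal sketch for the non-patch case (same arithmetic 
as the macros in the diff above, just wrapped in inlines):

static inline phys_addr_t __virt_to_phys(unsigned long x)
{
	/* integer-only prototype: passing a pointer still warns */
	return (phys_addr_t)x - PAGE_OFFSET + PHYS_OFFSET;
}

static inline unsigned long __phys_to_virt(phys_addr_t x)
{
	return (unsigned long)(x - PHYS_OFFSET + PAGE_OFFSET);
}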

-- 
Thanks
- Cyril
