All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/5] Kdump support for PPC440x
@ 2011-10-25 11:53 Suzuki K. Poulose
  2011-10-25 11:53 ` [PATCH v2 1/5] [ppc] Process dynamic relocations for kernel Suzuki K. Poulose
                   ` (4 more replies)
  0 siblings, 5 replies; 19+ messages in thread
From: Suzuki K. Poulose @ 2011-10-25 11:53 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: Suzuki Poulose, Alan Modra, Scott Wood, Paul Mackerras, Dave Hansen

The following series implements:

 * Generic framework for relocatable kernel on PPC32, based on processing 
   the dynamic relocation entries.
 * Relocatable kernel support for 44x
 * Kdump support for 44x. Doesn't support 47x yet, as the kexec 
   support is missing.

Changes from V1:

  * Splitted patch 'Enable CONFIG_RELOCATABLE for PPC44x' to move some
    of the generic bits to a new patch.
  * Renamed RELOCATABLE_PPC32 to RELOCATABLE_PPC32_PIE and provided options to
    retained old style mapping. (Suggested by: Scott Wood)
  * Added support for avoiding the overlapping of uncompressed kernel
    with boot wrapper for PPC images.

The patches are based on -next tree for ppc.

I have tested these patches on Ebony, Sequoia and Virtex(QEMU Emulated).
I haven't tested the RELOCATABLE bits on PPC_47x yet, as I don't have access
to one. However, it should work fine there as we only depend on the runtime
address and the XLAT entry setup by the boot loader. It would be great if
somebody could test these patches on a 47x.


---

Suzuki K. Poulose (5):
      [boot] Change the load address for the wrapper to fit the kernel
      [44x] Enable CRASH_DUMP for 440x
      [44x] Enable CONFIG_RELOCATABLE for PPC44x
      [ppc] Define virtual-physical translations for PIE relocations
      [ppc] Process dynamic relocations for kernel


 arch/powerpc/Kconfig              |   18 +++
 arch/powerpc/Makefile             |    1 
 arch/powerpc/boot/wrapper         |   20 ++++
 arch/powerpc/include/asm/page.h   |   85 ++++++++++++++++
 arch/powerpc/kernel/Makefile      |    2 
 arch/powerpc/kernel/head_44x.S    |  110 ++++++++++++++++++++-
 arch/powerpc/kernel/reloc_32.S    |  194 +++++++++++++++++++++++++++++++++++++
 arch/powerpc/kernel/vmlinux.lds.S |    8 +-
 arch/powerpc/mm/init_32.c         |    7 +
 9 files changed, 434 insertions(+), 11 deletions(-)
 create mode 100644 arch/powerpc/kernel/reloc_32.S

--
Suzuki Poulose

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v2 1/5] [ppc] Process dynamic relocations for kernel
  2011-10-25 11:53 [PATCH v2 0/5] Kdump support for PPC440x Suzuki K. Poulose
@ 2011-10-25 11:53 ` Suzuki K. Poulose
  2011-11-02 23:36   ` Josh Poimboeuf
  2011-10-25 11:54 ` [PATCH v2 2/5] [ppc] Define virtual-physical translations for PIE relocations Suzuki K. Poulose
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 19+ messages in thread
From: Suzuki K. Poulose @ 2011-10-25 11:53 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: Suzuki Poulose, Alan Modra, Scott Wood, Paul Mackerras, Dave Hansen

The following patch implements the dynamic relocation processing for
PPC32 kernel. relocate() accepts the target virtual address and relocates
 the kernel image to the same.

Currently the following relocation types are handled :

	R_PPC_RELATIVE
	R_PPC_ADDR16_LO
	R_PPC_ADDR16_HI
	R_PPC_ADDR16_HA

The last 3 relocations in the above list depends on value of Symbol indexed
whose index is encoded in the Relocation entry. Hence we need the Symbol
Table for processing such relocations.

Note: The GNU ld for ppc32 produces buggy relocations for relocation types
that depend on symbols. The value of the symbols with STB_LOCAL scope
should be assumed to be zero. - Alan Modra

Signed-off-by: Suzuki K. Poulose <suzuki@in.ibm.com>
Cc:	Paul Mackerras <paulus@samba.org>
Cc:	Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc:	Alan Modra <amodra@au1.ibm.com>
Cc:	Kumar Gala <galak@kernel.crashing.org>
Cc:	Josh Boyer <jwboyer@gmail.com>
Cc:	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
---

 arch/powerpc/Kconfig              |   12 ++
 arch/powerpc/kernel/Makefile      |    2 
 arch/powerpc/kernel/reloc_32.S    |  194 +++++++++++++++++++++++++++++++++++++
 arch/powerpc/kernel/vmlinux.lds.S |    8 +-
 4 files changed, 215 insertions(+), 1 deletions(-)
 create mode 100644 arch/powerpc/kernel/reloc_32.S

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 8523bd1..016f863 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -859,6 +859,18 @@ config RELOCATABLE
 	  setting can still be useful to bootwrappers that need to know the
 	  load location of the kernel (eg. u-boot/mkimage).
 
+config RELOCATABLE_PPC32_PIE
+	bool "Compile the kernel with dynamic relocations (EXPERIMENTAL)"
+	default n
+	depends on PPC32 && RELOCATABLE
+	help
+	  This option builds the kernel with dynamic relocations(-pie). Enables
+	  the kernel to be loaded at any address for BOOKE processors, removing
+	  the page alignment restriction for the load address.
+
+	  The option is more useful for platforms where the TLB page size is
+	  big (e.g, 256M on 44x), where we cannot enforce the alignment.
+
 config PAGE_OFFSET_BOOL
 	bool "Set custom page offset address"
 	depends on ADVANCED_OPTIONS
diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index ce4f7f1..0957570 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -85,6 +85,8 @@ extra-$(CONFIG_FSL_BOOKE)	:= head_fsl_booke.o
 extra-$(CONFIG_8xx)		:= head_8xx.o
 extra-y				+= vmlinux.lds
 
+obj-$(CONFIG_RELOCATABLE_PPC32_PIE)	+= reloc_32.o
+
 obj-$(CONFIG_PPC32)		+= entry_32.o setup_32.o
 obj-$(CONFIG_PPC64)		+= dma-iommu.o iommu.o
 obj-$(CONFIG_KGDB)		+= kgdb.o
diff --git a/arch/powerpc/kernel/reloc_32.S b/arch/powerpc/kernel/reloc_32.S
new file mode 100644
index 0000000..045d61e
--- /dev/null
+++ b/arch/powerpc/kernel/reloc_32.S
@@ -0,0 +1,194 @@
+/*
+ * Code to process dynamic relocations for PPC32.
+ *
+ * Copyrights (C) IBM Corporation, 2011.
+ *	Author: Suzuki Poulose <suzuki@in.ibm.com>
+ *
+ *  - Based on ppc64 code - reloc_64.S
+ *
+ *  This program is free software; you can redistribute it and/or
+ *  modify it under the terms of the GNU General Public License
+ *  as published by the Free Software Foundation; either version
+ *  2 of the License, or (at your option) any later version.
+ */
+
+#include <asm/ppc_asm.h>
+
+/* Dynamic section table entry tags */
+DT_RELA = 7			/* Tag for Elf32_Rela section */
+DT_RELASZ = 8			/* Size of the Rela relocs */
+DT_RELAENT = 9			/* Size of one Rela reloc entry */
+
+STN_UNDEF = 0			/* Undefined symbol index */
+STB_LOCAL = 0			/* Local binding for the symbol */
+
+R_PPC_ADDR16_LO = 4		/* Lower half of (S+A) */
+R_PPC_ADDR16_HI = 5		/* Upper half of (S+A) */
+R_PPC_ADDR16_HA = 6		/* High Adjusted (S+A) */
+R_PPC_RELATIVE = 22
+
+/*
+ * r3 = desired final address
+ */
+
+_GLOBAL(relocate)
+
+	mflr	r0
+	bl	0f		/* Find our current runtime address */
+0:	mflr	r12		/* Make it accessible */
+	mtlr	r0
+
+	lwz	r11, (p_dyn - 0b)(r12)
+	add	r11, r11, r12	/* runtime address of .dynamic section */
+	lwz	r9, (p_rela - 0b)(r12)
+	add	r9, r9, r12	/* runtime address of .rela.dyn section */
+	lwz	r10, (p_st - 0b)(r12)
+	add	r10, r10, r12	/* runtime address of _stext section */
+	lwz	r13, (p_sym - 0b)(r12)
+	add	r13, r13, r12	/* runtime address of .dynsym section */
+
+	/*
+	 * Scan the dynamic section for RELA, RELASZ entries
+	 */
+	li	r6, 0
+	li	r7, 0
+	li	r8, 0
+1:	lwz	r5, 0(r11)	/* ELF_Dyn.d_tag */
+	cmpwi	r5, 0		/* End of ELF_Dyn[] */
+	beq	eodyn
+	cmpwi	r5, DT_RELA
+	bne	relasz
+	lwz	r7, 4(r11)	/* r7 = rela.link */
+	b	skip
+relasz:
+	cmpwi	r5, DT_RELASZ
+	bne	relaent
+	lwz	r8, 4(r11)	/* r8 = Total Rela relocs size */
+	b	skip
+relaent:
+	cmpwi	r5, DT_RELAENT
+	bne	skip
+	lwz	r6, 4(r11)	/* r6 = Size of one Rela reloc */
+skip:
+	addi	r11, r11, 8
+	b	1b
+eodyn:				/* End of Dyn Table scan */
+
+	/* Check if we have found all the entries */
+	cmpwi	r7, 0
+	beq	done
+	cmpwi	r8, 0
+	beq	done
+	cmpwi	r6, 0
+	beq	done
+
+
+	/*
+	 * Work out the current offset from the link time address of .rela
+	 * section.
+	 *  cur_offset[r7] = rela.run[r9] - rela.link [r7]
+	 *  _stext.link[r10] = _stext.run[r10] - cur_offset[r7]
+	 *  final_offset[r3] = _stext.final[r3] - _stext.link[r10]
+	 */
+	subf	r7, r7, r9	/* cur_offset */
+	subf	r10, r7, r10
+	subf	r3, r10, r3	/* final_offset */
+
+	subf	r8, r6, r8	/* relaz -= relaent */
+	/*
+	 * Scan through the .rela table and process each entry
+	 * r9	- points to the current .rela table entry
+	 * r13	- points to the symbol table
+	 */
+
+	/*
+	 * Check if we have a relocation based on symbol
+	 * r5 will hold the value of the symbol.
+	 */
+applyrela:
+	lwz	r4, 4(r9)
+	srwi	r5, r4, 8		/* ELF32_R_SYM(r_info) */
+	cmpwi	r5, STN_UNDEF	/* sym == STN_UNDEF ? */
+	beq	get_type	/* value = 0 */
+	/* Find the value of the symbol at index(r5) */
+	slwi	r5, r5, 4		/* r5 = r5 * sizeof(Elf32_Sym) */
+	add	r12, r13, r5	/* r12 = &__dyn_sym[Index] */
+
+	/*
+	 * GNU ld has a bug, where dynamic relocs based on
+	 * STB_LOCAL symbols, the value should be assumed
+	 * to be zero. - Alan Modra
+	 */
+	/* XXX: Do we need to check if we are using GNU ld ? */
+	lbz	r5, 12(r12)	/* r5 = dyn_sym[Index].st_info */
+	extrwi	r5, r5, 4, 24	/* r5 = ELF32_ST_BIND(r5) */
+	cmpwi	r5, STB_LOCAL	/* st_value = 0, ld bug */
+	beq	get_type	/* We have r5 = 0 */
+	lwz	r5, 4(r12)	/* r5 = __dyn_sym[Index].st_value */
+
+get_type:
+	/* r4 holds the relocation type */
+	extrwi	r4, r4, 8, 24	/* r4 = ((char*)r4)[3] */
+
+	/* R_PPC_RELATIVE */
+	cmpwi	r4, R_PPC_RELATIVE
+	bne	hi16
+	lwz	r4, 0(r9)	/* r_offset */
+	lwz	r0, 8(r9)	/* r_addend */
+	add	r0, r0, r3	/* final addend */
+	stwx	r0, r4, r7	/* memory[r4+r7]) = (u32)r0 */
+	b	nxtrela		/* continue */
+
+	/* R_PPC_ADDR16_HI */
+hi16:
+	cmpwi	r4, R_PPC_ADDR16_HI
+	bne	ha16
+	lwz	r4, 0(r9)	/* r_offset */
+	lwz	r0, 8(r9)	/* r_addend */
+	add	r0, r0, r3
+	add	r0, r0, r5	/* r0 = (S+A+Offset) */
+	extrwi	r0, r0, 16, 0	/* r0 = (r0 >> 16) */
+	b	store_half
+
+	/* R_PPC_ADDR16_HA */
+ha16:
+	cmpwi	r4, R_PPC_ADDR16_HA
+	bne	lo16
+	lwz	r4, 0(r9)	/* r_offset */
+	lwz	r0, 8(r9)	/* r_addend */
+	add	r0, r0, r3
+	add	r0, r0, r5	/* r0 = (S+A+Offset) */
+	extrwi	r5, r0, 1, 16	/* Extract bit 16 */
+	extrwi	r0, r0, 16, 0	/* r0 = (r0 >> 16) */
+	add	r0, r0, r5	/* Add it to r0 */
+	b	store_half
+
+	/* R_PPC_ADDR16_LO */
+lo16:
+	cmpwi	r4, R_PPC_ADDR16_LO
+	bne	nxtrela
+	lwz	r4, 0(r9)	/* r_offset */
+	lwz	r0, 8(r9)	/* r_addend */
+	add	r0, r0, r3
+	add	r0, r0, r5	/* r0 = (S+A+Offset) */
+	extrwi	r0, r0, 16, 16	/* r0 &= 0xffff */
+	/* Fall through to */
+
+	/* Store half word */
+store_half:
+	sthx	r0, r4, r7	/* memory[r4+r7] = (u16)r0 */
+
+nxtrela:
+	cmpwi	r8, 0		/* relasz = 0 ? */
+	ble	done
+	add	r9, r9, r6	/* move to next entry in the .rela table */
+	subf	r8, r6, r8	/* relasz -= relaent */
+	b	applyrela
+
+done:	blr
+
+
+p_dyn:		.long	__dynamic_start - 0b
+p_rela:		.long	__rela_dyn_start - 0b
+p_sym:		.long	__dynamic_symtab - 0b
+p_st:		.long	_stext - 0b
diff --git a/arch/powerpc/kernel/vmlinux.lds.S b/arch/powerpc/kernel/vmlinux.lds.S
index 920276c..949cf17 100644
--- a/arch/powerpc/kernel/vmlinux.lds.S
+++ b/arch/powerpc/kernel/vmlinux.lds.S
@@ -170,7 +170,13 @@ SECTIONS
 	}
 #ifdef CONFIG_RELOCATABLE
 	. = ALIGN(8);
-	.dynsym : AT(ADDR(.dynsym) - LOAD_OFFSET) { *(.dynsym) }
+	.dynsym : AT(ADDR(.dynsym) - LOAD_OFFSET)
+	{
+#ifdef CONFIG_RELOCATABLE_PPC32_PIE
+		__dynamic_symtab = .;
+#endif
+		*(.dynsym)
+	}
 	.dynstr : AT(ADDR(.dynstr) - LOAD_OFFSET) { *(.dynstr) }
 	.dynamic : AT(ADDR(.dynamic) - LOAD_OFFSET)
 	{

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 2/5] [ppc] Define virtual-physical translations for PIE relocations
  2011-10-25 11:53 [PATCH v2 0/5] Kdump support for PPC440x Suzuki K. Poulose
  2011-10-25 11:53 ` [PATCH v2 1/5] [ppc] Process dynamic relocations for kernel Suzuki K. Poulose
@ 2011-10-25 11:54 ` Suzuki K. Poulose
  2011-10-25 11:54 ` [PATCH v2 3/5] [44x] Enable CONFIG_RELOCATABLE for PPC44x Suzuki K. Poulose
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 19+ messages in thread
From: Suzuki K. Poulose @ 2011-10-25 11:54 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: Suzuki Poulose, Alan Modra, Scott Wood, Paul Mackerras, Dave Hansen

We find the runtime address of _stext and relocate ourselves based
on the following calculation.

	virtual_base = ALIGN(KERNELBASE,KERNEL_TLB_PIN_SIZE) +
			MODULO(_stext.run,KERNEL_TLB_PIN_SIZE)

relocate() is called with the Effective Virtual Base Address (as
shown below)

            | Phys. Addr| Virt. Addr |
Page        |------------------------|
Boundary    |           |            |
            |           |            |
            |           |            |
Kernel Load |___________|_ __ _ _ _ _|<- Effective
Addr(_stext)|           |      ^     |Virt. Base Addr
            |           |      |     |
            |           |      |     |
            |           |reloc_offset|
            |           |      |     |
            |           |      |     |
            |           |______v_____|<-(KERNELBASE)%TLB_SIZE
            |           |            |
            |           |            |
            |           |            |
Page        |-----------|------------|
Boundary    |           |            |


On BookE, we need __va() & __pa() early in the boot process to access
the device tree.

Currently this has been defined as :

#define __va(x) ((void *)(unsigned long)((phys_addr_t)(x) -
						PHYSICAL_START + KERNELBASE)
where:
 PHYSICAL_START is kernstart_addr - a variable updated at runtime.
 KERNELBASE	is the compile time Virtual base address of kernel.

This won't work for us, as kernstart_addr is dynamic and will yield different
results for __va()/__pa() for same mapping.

e.g.,

Let the kernel be loaded at 64MB and KERNELBASE be 0xc0000000 (same as
PAGE_OFFSET).

In this case, we would be mapping 0 to 0xc0000000, and kernstart_addr = 64M

Now __va(1MB) = (0x100000) - (0x4000000) + 0xc0000000
		= 0xbc100000 , which is wrong.

it should be : 0xc0000000 + 0x100000 = 0xc0100000

On platforms which support AMP, like PPC_47x (based on 44x), the kernel
could be loaded at highmem. Hence we cannot always depend on the compile
time constants for mapping.

Here are the possible solutions:

1) Update kernstart_addr(PHSYICAL_START) to match the Physical address of
compile time KERNELBASE value, instead of the actual Physical_Address(_stext).

The disadvantage is that we may break other users of PHYSICAL_START. They
could be replaced with __pa(_stext).

2) Redefine __va() & __pa() with relocation offset


#if defined(CONFIG_RELOCATABLE) && defined(CONFIG_44x)
#define __va(x) ((void *)(unsigned long)((phys_addr_t)(x) - PHYSICAL_START + (KERNELBASE + RELOC_OFFSET)))
#define __pa(x) ((unsigned long)(x) + PHYSICAL_START - (KERNELBASE + RELOC_OFFSET))
#endif

where, RELOC_OFFSET could be

  a) A variable, say relocation_offset (like kernstart_addr), updated
     at boot time. This impacts performance, as we have to load an additional
     variable from memory.

		OR

  b) #define RELOC_OFFSET ((PHYSICAL_START & PPC_PIN_SIZE_OFFSET_MASK) - \
                      (KERNELBASE & PPC_PIN_SIZE_OFFSET_MASK))

   This introduces more calculations for doing the translation.

3) Redefine __va() & __pa() with a new variable

i.e,

#define __va(x) ((void *)(unsigned long)((phys_addr_t)(x) + VIRT_PHYS_OFFSET))

where VIRT_PHYS_OFFSET :

#ifdef CONFIG_RELOCATABLE_PPC32_PIE
#define VIRT_PHYS_OFFSET virt_phys_offset
#else
#define VIRT_PHYS_OFFSET (KERNELBASE - PHYSICAL_START)
#endif /* CONFIG_RELOCATABLE_PPC32_PIE */

where virt_phy_offset is updated at runtime to :

	Effective KERNELBASE - kernstart_addr.

Taking our example, above:

virt_phys_offset = effective_kernelstart_vaddr - kernstart_addr
		 = 0xc0400000 - 0x400000
		 = 0xc0000000
	and

	__va(0x100000) = 0xc0000000 + 0x100000 = 0xc0100000
	 which is what we want.

I have implemented (3) in the following patch which has same cost of
operation as the existing one.

I have tested the patches on 440x platforms only. However this should
work fine for PPC_47x also, as we only depend on the runtime address
and the current TLB XLAT entry for the startup code, which is available
in r25. I don't have access to a 47x board yet. So, it would be great if
somebody could test this on 47x.

Signed-off-by: Suzuki K. Poulose <suzuki@in.ibm.com>
Cc:	Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc:	Kumar Gala <galak@kernel.crashing.org>
Cc:	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
---

 arch/powerpc/Makefile           |    1 
 arch/powerpc/include/asm/page.h |   85 ++++++++++++++++++++++++++++++++++++++-
 arch/powerpc/mm/init_32.c       |    7 +++
 3 files changed, 90 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
index 57af16e..77f928f 100644
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -65,6 +65,7 @@ endif
 
 LDFLAGS_vmlinux-yy := -Bstatic
 LDFLAGS_vmlinux-$(CONFIG_PPC64)$(CONFIG_RELOCATABLE) := -pie
+LDFLAGS_vmlinux-y$(CONFIG_RELOCATABLE_PPC32_PIE) := -pie
 LDFLAGS_vmlinux	:= $(LDFLAGS_vmlinux-yy)
 
 CFLAGS-$(CONFIG_PPC64)	:= -mminimal-toc -mtraceback=no -mcall-aixdesc
diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h
index dd9c4fd..dc0d9fd 100644
--- a/arch/powerpc/include/asm/page.h
+++ b/arch/powerpc/include/asm/page.h
@@ -97,12 +97,26 @@ extern unsigned int HPAGE_SHIFT;
 
 extern phys_addr_t memstart_addr;
 extern phys_addr_t kernstart_addr;
+
+#ifdef CONFIG_RELOCATABLE_PPC32_PIE
+extern long long virt_phys_offset;
 #endif
+
+#endif /* __ASSEMBLY__ */
 #define PHYSICAL_START	kernstart_addr
-#else
+
+#else	/* !CONFIG_RELOCATABLE */
 #define PHYSICAL_START	ASM_CONST(CONFIG_PHYSICAL_START)
 #endif
 
+/* See Description below for VIRT_PHYS_OFFSET */
+#ifdef CONFIG_RELOCATABLE_PPC32_PIE
+#define VIRT_PHYS_OFFSET virt_phys_offset
+#else
+#define VIRT_PHYS_OFFSET (KERNELBASE - PHYSICAL_START)
+#endif
+
+
 #ifdef CONFIG_PPC64
 #define MEMORY_START	0UL
 #elif defined(CONFIG_RELOCATABLE)
@@ -125,12 +139,77 @@ extern phys_addr_t kernstart_addr;
  * determine MEMORY_START until then.  However we can determine PHYSICAL_START
  * from information at hand (program counter, TLB lookup).
  *
+ *  Relocation on BookE with PIE (RELOCATABLE_PPC32_PIE)
+ *
+ *   With RELOCATABLE_PPC32_PIE,  we support loading the kernel at any physical 
+ *   address without any restriction on the page alignment.
+ *
+ *   We find the runtime address of _stext and relocate ourselves based on 
+ *   the following calculation:
+ *
+ *  	  virtual_base = ALIGN_DOWN(KERNELBASE,256M) +
+ *  				MODULO(_stext.run,256M)
+ *   and create the following mapping:
+ *
+ * 	  ALIGN_DOWN(_stext.run,256M) => ALIGN_DOWN(KERNELBASE,256M)
+ *
+ *   When we process relocations, we cannot depend on the
+ *   existing equation for the __va()/__pa() translations:
+ *
+ * 	   __va(x) = (x)  - PHYSICAL_START + KERNELBASE
+ *
+ *   Where:
+ *   	 PHYSICAL_START = kernstart_addr = Physical address of _stext
+ *  	 KERNELBASE = Compiled virtual address of _stext.
+ *
+ *   This formula holds true iff, kernel load address is TLB page aligned.
+ *
+ *   In our case, we need to also account for the shift in the kernel Virtual 
+ *   address.
+ *
+ *   E.g.,
+ *
+ *   Let the kernel be loaded at 64MB and KERNELBASE be 0xc0000000 (same as PAGE_OFFSET).
+ *   In this case, we would be mapping 0 to 0xc0000000, and kernstart_addr = 64M
+ *
+ *   Now __va(1MB) = (0x100000) - (0x4000000) + 0xc0000000
+ *                 = 0xbc100000 , which is wrong.
+ *
+ *   Rather, it should be : 0xc0000000 + 0x100000 = 0xc0100000
+ *      	according to our mapping.
+ *
+ *   Hence we use the following formula to get the translations right:
+ *
+ * 	  __va(x) = (x) - [ PHYSICAL_START - Effective KERNELBASE ]
+ *
+ * 	  Where :
+ * 		PHYSICAL_START = dynamic load address.(kernstart_addr variable)
+ * 		Effective KERNELBASE = virtual_base =
+ * 				     = ALIGN_DOWN(KERNELBASE,256M) +
+ * 						MODULO(PHYSICAL_START,256M)
+ *
+ * 	To make the cost of __va() / __pa() more light weight, we introduce
+ * 	a new variable virt_phys_offset, which will hold :
+ *
+ * 	virt_phys_offset = Effective KERNELBASE - PHYSICAL_START
+ * 			 = ALIGN_DOWN(KERNELBASE,256M) - 
+ * 			 	ALIGN_DOWN(PHYSICALSTART,256M)
+ *
+ * 	Hence :
+ *
+ * 	__va(x) = x - PHYSICAL_START + Effective KERNELBASE
+ * 		= x + virt_phys_offset
+ *
+ * 		and
+ * 	__pa(x) = x + PHYSICAL_START - Effective KERNELBASE
+ * 		= x - virt_phys_offset
+ * 		
  * On non-Book-E PPC64 PAGE_OFFSET and MEMORY_START are constants so use
  * the other definitions for __va & __pa.
  */
 #ifdef CONFIG_BOOKE
-#define __va(x) ((void *)(unsigned long)((phys_addr_t)(x) - PHYSICAL_START + KERNELBASE))
-#define __pa(x) ((unsigned long)(x) + PHYSICAL_START - KERNELBASE)
+#define __va(x) ((void *)(unsigned long)((phys_addr_t)(x) + VIRT_PHYS_OFFSET))
+#define __pa(x) ((unsigned long)(x) - VIRT_PHYS_OFFSET)
 #else
 #define __va(x) ((void *)(unsigned long)((phys_addr_t)(x) + PAGE_OFFSET - MEMORY_START))
 #define __pa(x) ((unsigned long)(x) - PAGE_OFFSET + MEMORY_START)
diff --git a/arch/powerpc/mm/init_32.c b/arch/powerpc/mm/init_32.c
index 161cefd..c4e7815 100644
--- a/arch/powerpc/mm/init_32.c
+++ b/arch/powerpc/mm/init_32.c
@@ -65,6 +65,13 @@ phys_addr_t memstart_addr = (phys_addr_t)~0ull;
 EXPORT_SYMBOL(memstart_addr);
 phys_addr_t kernstart_addr;
 EXPORT_SYMBOL(kernstart_addr);
+
+#ifdef CONFIG_RELOCATABLE_PPC32_PIE
+/* Used in __va()/__pa() */
+long long virt_phys_offset;
+EXPORT_SYMBOL(virt_phys_offset);
+#endif
+
 phys_addr_t lowmem_end_addr;
 
 int boot_mapsize;

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 3/5] [44x] Enable CONFIG_RELOCATABLE for PPC44x
  2011-10-25 11:53 [PATCH v2 0/5] Kdump support for PPC440x Suzuki K. Poulose
  2011-10-25 11:53 ` [PATCH v2 1/5] [ppc] Process dynamic relocations for kernel Suzuki K. Poulose
  2011-10-25 11:54 ` [PATCH v2 2/5] [ppc] Define virtual-physical translations for PIE relocations Suzuki K. Poulose
@ 2011-10-25 11:54 ` Suzuki K. Poulose
  2011-10-25 11:54 ` [PATCH v2 4/5] [44x] Enable CRASH_DUMP for 440x Suzuki K. Poulose
  2011-10-25 11:54 ` [PATCH v2 5/5] [boot] Change the load address for the wrapper to fit the kernel Suzuki K. Poulose
  4 siblings, 0 replies; 19+ messages in thread
From: Suzuki K. Poulose @ 2011-10-25 11:54 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: Suzuki Poulose, Alan Modra, Scott Wood, Paul Mackerras, Dave Hansen

The following patch adds relocatable support for PPC44x kernel.

This enables two types of relocatable kernel support for PPC44x.

1) The old style, mapping based- which restricts the load address to 256M
   aligned.

2) The new approach based on processing dynamic relocation entries -
   CONFIG_RELOCATABLE_PPC32_PIE


In case of CONFIG_RELOCATABLE_PPC32_PIE :

We find the runtime address of _stext and relocate ourselves based
on the following calculation.

	virtual_base = ALIGN(KERNELBASE,256M) +
			MODULO(_stext.run,256M)

relocate() is called with the Effective Virtual Base Address (as
shown below)

            | Phys. Addr| Virt. Addr |
Page (256M) |------------------------|
Boundary    |           |            |
            |           |            |
            |           |            |
Kernel Load |___________|_ __ _ _ _ _|<- Effective
Addr(_stext)|           |      ^     |Virt. Base Addr
            |           |      |     |
            |           |      |     |
            |           |reloc_offset|
            |           |      |     |
            |           |      |     |
            |           |______v_____|<-(KERNELBASE)%256M
            |           |            |
            |           |            |
            |           |            |
Page(256M)  |-----------|------------|
Boundary    |           |            |

The virt_phys_offset is updated accordingly, i.e,

	virt_phys_offset = effective. kernel virt base - kernstart_addr

I have tested the patches on 440x platforms only. However this should
work fine for PPC_47x also, as we only depend on the runtime address
and the current TLB XLAT entry for the startup code, which is available
in r25. I don't have access to a 47x board yet. So, it would be great if
somebody could test this on 47x.

Signed-off-by: Suzuki K. Poulose <suzuki@in.ibm.com>
Cc:	Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc:	Kumar Gala <galak@kernel.crashing.org>
Cc:	Tony Breeds <tony@bakeyournoodle.com>
Cc:	Josh Boyer <jwboyer@gmail.com>
Cc:	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
---

 arch/powerpc/Kconfig           |    4 +
 arch/powerpc/kernel/head_44x.S |  110 +++++++++++++++++++++++++++++++++++++++-
 2 files changed, 108 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 016f863..1cedcda 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -843,7 +843,7 @@ config LOWMEM_CAM_NUM
 
 config RELOCATABLE
 	bool "Build a relocatable kernel (EXPERIMENTAL)"
-	depends on EXPERIMENTAL && ADVANCED_OPTIONS && FLATMEM && (FSL_BOOKE || PPC_47x)
+	depends on EXPERIMENTAL && ADVANCED_OPTIONS && FLATMEM && (FSL_BOOKE || 44x)
 	help
 	  This builds a kernel image that is capable of running at the
 	  location the kernel is loaded at (some alignment restrictions may
@@ -862,7 +862,7 @@ config RELOCATABLE
 config RELOCATABLE_PPC32_PIE
 	bool "Compile the kernel with dynamic relocations (EXPERIMENTAL)"
 	default n
-	depends on PPC32 && RELOCATABLE
+	depends on PPC32 && RELOCATABLE && 44x
 	help
 	  This option builds the kernel with dynamic relocations(-pie). Enables
 	  the kernel to be loaded at any address for BOOKE processors, removing
diff --git a/arch/powerpc/kernel/head_44x.S b/arch/powerpc/kernel/head_44x.S
index b725dab..213759e 100644
--- a/arch/powerpc/kernel/head_44x.S
+++ b/arch/powerpc/kernel/head_44x.S
@@ -64,6 +64,35 @@ _ENTRY(_start);
 	mr	r31,r3		/* save device tree ptr */
 	li	r24,0		/* CPU number */
 
+#ifdef CONFIG_RELOCATABLE_PPC32_PIE
+/*
+ * Relocate ourselves to the current runtime address.
+ * This is called only by the Boot CPU.
+ * "relocate" is called with our current runtime virutal
+ * address.
+ * r21 will be loaded with the physical runtime address of _stext
+ */
+	bl	0f				/* Get our runtime address */
+0:	mflr	r21				/* Make it accessible */
+	addis	r21,r21,(_stext - 0b)@ha
+	addi	r21,r21,(_stext - 0b)@l 	/* Get our current runtime base */
+
+	/*
+	 * We have the runtime (virutal) address of our base.
+	 * We calculate our shift of offset from a 256M page.
+	 * We could map the 256M page we belong to at PAGE_OFFSET and
+	 * get going from there.
+	 */
+	lis	r4,KERNELBASE@h
+	ori	r4,r4,KERNELBASE@l
+	rlwinm	r6,r21,0,4,31			/* r6 = PHYS_START % 256M */
+	rlwinm	r5,r4,0,4,31			/* r5 = KERNELBASE % 256M */
+	subf	r3,r5,r6			/* r3 = r6 - r5 */
+	add	r3,r4,r3			/* Required Virutal Address */
+
+	bl	relocate
+#endif
+
 	bl	init_cpu_state
 
 	/*
@@ -88,14 +117,66 @@ _ENTRY(_start);
 
 #ifdef CONFIG_RELOCATABLE
 	/*
+	 * When we reach here :
 	 * r25 will contain RPN/ERPN for the start address of memory
-	 *
-	 * Add the difference between KERNELBASE and PAGE_OFFSET to the
-	 * start of physical memory to get kernstart_addr.
+	 * r21 contain the physical address of _stext in case of PIE
 	 */
 	lis	r3,kernstart_addr@ha
 	la	r3,kernstart_addr@l(r3)
 
+#ifdef CONFIG_RELOCATABLE_PPC32_PIE
+	/*
+	 * Compute the kernstart_addr.
+	 * kernstart_addr => (r6,r8)
+	 * kernstart_addr & ~0xfffffff => (r6,r7)
+	 */
+	rlwinm	r6,r25,0,28,31	/* ERPN. Bits 32-35 of Address */
+	rlwinm	r7,r25,0,0,3	/* RPN - assuming 256 MB page size */
+	rlwinm	r8,r21,0,4,31	/* r8 = (_stext & 0xfffffff) */
+	or	r8,r7,r8	/* Compute the lower 32bit of kernstart_addr */
+
+	/* Store kernstart_addr */
+	stw	r6,0(r3)	/* higher 32bit */
+	stw	r8,4(r3)	/* lower 32bit  */
+
+	/*
+	 * Compute the virt_phys_offset :
+	 * virt_phys_offset = stext.run - kernstart_addr
+	 *
+	 * stext.run = (KERNELBASE & ~0xfffffff) + (kernstart_addr & 0xfffffff)
+	 * When we relocate, we have :
+	 *
+	 *	(kernstart_addr & 0xfffffff) = (stext.run & 0xfffffff)
+	 *
+	 * hence:
+	 *  virt_phys_offset = (KERNELBASE & ~0xfffffff) - (kernstart_addr & ~0xfffffff)
+	 *
+	 */
+
+	/* KERNELBASE&~0xfffffff => (r4,r5) */
+	li	r4, 0		/* higer 32bit */
+	lis	r5,KERNELBASE@h
+	rlwinm	r5,r5,0,0,3	/* Align to 256M, lower 32bit */
+
+	/*
+	 * 64bit subtraction.
+	 */
+	subfc	r5,r7,r5
+	subfe	r4,r6,r4
+
+	/* Store virt_phys_offset */
+	lis	r3,virt_phys_offset@ha
+	la	r3,virt_phys_offset@l(r3)
+
+	stw	r4,0(r3)
+	stw	r5,4(r3)
+#else
+	/* Old style mapping based relocation */
+
+	/*
+	 * Add the difference between KERNELBASE and PAGE_OFFSET to the
+	 * start of physical memory to get kernstart_addr.
+	 */
 	lis	r4,KERNELBASE@h
 	ori	r4,r4,KERNELBASE@l
 	lis	r5,PAGE_OFFSET@h
@@ -108,7 +189,9 @@ _ENTRY(_start);
 
 	stw	r6,0(r3)
 	stw	r7,4(r3)
-#endif
+#endif	/* CONFIG_RELOCATABLE_PPC32_PIE */
+
+#endif /* CONFIG_RELOCATABLE */
 
 /*
  * Decide what sort of machine this is and initialize the MMU.
@@ -801,11 +884,30 @@ skpinv:	addi	r4,r4,1				/* Increment */
  * Configure and load pinned entry into TLB slot 63.
  */
 
+#ifdef CONFIG_RELOCATABLE
+	/*
+	 * Stores the XLAT entry for this code at r25.
+	 * Uses the mapping where we are loaded.
+	 */
+
+	tlbre	r25,r23,PPC44x_TLB_XLAT		/* Read our XLAT entry in r25 */
+
+	/* PAGEID fields for mapping */
+	lis	r3,KERNELBASE@h
+	rlwinm	r3,r3,0,0,3			/* Round to 256M page boundary */
+
+	/* Use the current XLAT entry */
+	mr	r4,r25
+#else
+
+
 	lis	r3,PAGE_OFFSET@h
 	ori	r3,r3,PAGE_OFFSET@l
 
 	/* Kernel is at the base of RAM */
 	li r4, 0			/* Load the kernel physical address */
+#endif
+
 
 	/* Load the kernel PID = 0 */
 	li	r0,0

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 4/5] [44x] Enable CRASH_DUMP for 440x
  2011-10-25 11:53 [PATCH v2 0/5] Kdump support for PPC440x Suzuki K. Poulose
                   ` (2 preceding siblings ...)
  2011-10-25 11:54 ` [PATCH v2 3/5] [44x] Enable CONFIG_RELOCATABLE for PPC44x Suzuki K. Poulose
@ 2011-10-25 11:54 ` Suzuki K. Poulose
  2011-10-25 11:54 ` [PATCH v2 5/5] [boot] Change the load address for the wrapper to fit the kernel Suzuki K. Poulose
  4 siblings, 0 replies; 19+ messages in thread
From: Suzuki K. Poulose @ 2011-10-25 11:54 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: Suzuki Poulose, Alan Modra, Scott Wood, Paul Mackerras, Dave Hansen

Now that we have relocatable kernel, supporting CRASH_DUMP only requires
turning the switches on for UP machines.

We don't have kexec support on 47x yet. Enabling SMP support would be done
as part of enabling the PPC_47x support.


Signed-off-by: Suzuki K. Poulose <suzuki@in.ibm.com>
Cc:	Josh Boyer <jwboyer@gmail.com>
Cc:	Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc:	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
---

 arch/powerpc/Kconfig |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 1cedcda..4de7733 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -362,8 +362,8 @@ config KEXEC
 
 config CRASH_DUMP
 	bool "Build a kdump crash kernel"
-	depends on PPC64 || 6xx || FSL_BOOKE
-	select RELOCATABLE if PPC64 || FSL_BOOKE
+	depends on PPC64 || 6xx || FSL_BOOKE || (44x && !SMP)
+	select RELOCATABLE if PPC64 || FSL_BOOKE || 44x
 	help
 	  Build a kernel suitable for use as a kdump capture kernel.
 	  The same kernel binary can be used as production kernel and dump

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 5/5] [boot] Change the load address for the wrapper to fit the kernel
  2011-10-25 11:53 [PATCH v2 0/5] Kdump support for PPC440x Suzuki K. Poulose
                   ` (3 preceding siblings ...)
  2011-10-25 11:54 ` [PATCH v2 4/5] [44x] Enable CRASH_DUMP for 440x Suzuki K. Poulose
@ 2011-10-25 11:54 ` Suzuki K. Poulose
  4 siblings, 0 replies; 19+ messages in thread
From: Suzuki K. Poulose @ 2011-10-25 11:54 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: Suzuki Poulose, Alan Modra, Scott Wood, Paul Mackerras, Dave Hansen

The wrapper code which uncompresses the kernel in case of a 'ppc' boot
is by default loaded at 0x00400000 and the kernel will be uncompressed
to fit the location 0-0x00400000. But with dynamic relocations, the size
of the kernel may exceed 0x00400000(4M). This would cause an overlap
of the uncompressed kernel and the boot wrapper, causing a failure in
boot.

The message looks like :


   zImage starting: loaded at 0x00400000 (sp: 0x0065ffb0)
   Allocating 0x5ce650 bytes for kernel ...
   Insufficient memory for kernel at address 0! (_start=00400000, uncompressed size=00591a20)

This patch shifts the load address of the boot wrapper code to the next higher MB,
according to the size of  the uncompressed vmlinux.

Signed-off-by: Suzuki K. Poulose <suzuki@in.ibm.com>
---

 arch/powerpc/boot/wrapper |   20 ++++++++++++++++++++
 1 files changed, 20 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/boot/wrapper b/arch/powerpc/boot/wrapper
index c74531a..213a9fd 100755
--- a/arch/powerpc/boot/wrapper
+++ b/arch/powerpc/boot/wrapper
@@ -257,6 +257,8 @@ vmz="$tmpdir/`basename \"$kernel\"`.$ext"
 if [ -z "$cacheit" -o ! -f "$vmz$gzip" -o "$vmz$gzip" -ot "$kernel" ]; then
     ${CROSS}objcopy $objflags "$kernel" "$vmz.$$"
 
+    strip_size=$(stat -c %s $vmz.$$)
+
     if [ -n "$gzip" ]; then
         gzip -n -f -9 "$vmz.$$"
     fi
@@ -266,6 +268,24 @@ if [ -z "$cacheit" -o ! -f "$vmz$gzip" -o "$vmz$gzip" -ot "$kernel" ]; then
     else
 	vmz="$vmz.$$"
     fi
+else
+    # Calculate the vmlinux.strip size
+    ${CROSS}objcopy $objflags "$kernel" "$vmz.$$"
+    strip_size=$(stat -c %s $vmz.$$)
+    rm -f $vmz.$$
+fi
+
+# Round the size to next higher MB limit
+round_size=$(((strip_size + 0xfffff) & 0xfff00000))
+
+round_size=0x$(printf "%x\n" $round_size)
+link_addr=$(printf "%d\n" $link_address)
+
+if [ $link_addr -lt $strip_size ]; then
+    echo "WARN: Uncompressed kernel size(0x$(printf "%x\n" $strip_size))" \
+		" exceeds the address of the wrapper($link_address)"
+    echo "WARN: Fixing the link_address to ($round_size))"
+    link_address=$round_size
 fi
 
 vmz="$vmz$gzip"

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/5] [ppc] Process dynamic relocations for kernel
  2011-10-25 11:53 ` [PATCH v2 1/5] [ppc] Process dynamic relocations for kernel Suzuki K. Poulose
@ 2011-11-02 23:36   ` Josh Poimboeuf
  2011-11-04  8:36     ` Suzuki Poulose
  0 siblings, 1 reply; 19+ messages in thread
From: Josh Poimboeuf @ 2011-11-02 23:36 UTC (permalink / raw)
  To: Suzuki K. Poulose
  Cc: Nathan Miller, Josh Poimboeuf, Dave Hansen, Alan Modra,
	Scott Wood, Paul Mackerras, linuxppc-dev

On Tue, 2011-10-25 at 17:23 +0530, Suzuki K. Poulose wrote: 
> The following patch implements the dynamic relocation processing for
> PPC32 kernel. relocate() accepts the target virtual address and relocates
>  the kernel image to the same.

Hi Suzuki,

Thanks for the patches.  I've been testing them on a 440-based card, and
encountered TLB error exceptions because the BSS section wasn't getting
properly cleared in early_init().

It turns out that some of the instructions which were modified in
relocate() weren't then getting flushed out of the d-cache into memory.
After that, early_init() executed the stale (non-modified) instructions
for the BSS area.  Those instructions just accessed offset 0 instead of
the actual BSS-related offsets.  That resulted in BSS not getting`
zeroed.

I was able to verify this on my 440 by comparing the d-cache and i-cache
entries for the BSS-accessing instructions in early_init() using a
RISCWatch.  As I suspected, the instructions in the d-cache showed the
corrected offsets, but the i-cache showed the old, non-relocated
offsets.

To fix the issue, I wrote the following patch, applied on top of your
patches.  Suggestions and comments are welcome.



>From c88ae39da0c0352f411aca8d9636990a442d47da Mon Sep 17 00:00:00 2001
From: Josh Poimboeuf <jpoimboe@linux.vnet.ibm.com>
Date: Wed, 2 Nov 2011 16:41:24 -0500
Subject: [PATCH] Flush relocated instructions from data cache

After updating instructions with relocated addresses, flush them from
the data cache and invalidate the icache line so we don't execute stale
instructions.

Signed-off-by: Josh Poimboeuf <jpoimboe@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/reloc_32.S |   11 ++++++++++-
 1 files changed, 10 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kernel/reloc_32.S
b/arch/powerpc/kernel/reloc_32.S
index 045d61e..a92857d 100644
--- a/arch/powerpc/kernel/reloc_32.S
+++ b/arch/powerpc/kernel/reloc_32.S
@@ -137,6 +137,9 @@ get_type:
 	lwz	r0, 8(r9)	/* r_addend */
 	add	r0, r0, r3	/* final addend */
 	stwx	r0, r4, r7	/* memory[r4+r7]) = (u32)r0 */
+	dcbst	r4,r7		/* flush dcache line to memory */
+	sync			/* wait for flush to complete */
+	icbi	r4,r7		/* invalidate icache line */
 	b	nxtrela		/* continue */
 
 	/* R_PPC_ADDR16_HI */
@@ -177,6 +180,9 @@ lo16:
 	/* Store half word */
 store_half:
 	sthx	r0, r4, r7	/* memory[r4+r7] = (u16)r0 */
+	dcbst	r4,r7		/* flush dcache line to memory */
+	sync			/* wait for flush to complete */
+	icbi	r4,r7		/* invalidate icache line */
 
 nxtrela:
 	cmpwi	r8, 0		/* relasz = 0 ? */
@@ -185,7 +191,10 @@ nxtrela:
 	subf	r8, r6, r8	/* relasz -= relaent */
 	b	applyrela
 
-done:	blr
+done:
+	sync			/* wait for icache invalidates to complete */
+	isync			/* discard any prefetched instructions */
+	blr
 
 
 p_dyn:		.long	__dynamic_start - 0b
-- 
1.7.4.1

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/5] [ppc] Process dynamic relocations for kernel
  2011-11-02 23:36   ` Josh Poimboeuf
@ 2011-11-04  8:36     ` Suzuki Poulose
  2011-11-07 15:13       ` Josh Poimboeuf
  0 siblings, 1 reply; 19+ messages in thread
From: Suzuki Poulose @ 2011-11-04  8:36 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Nathan Miller, Josh Poimboeuf, Dave Hansen, Alan Modra,
	Scott Wood, Paul Mackerras, linuxppc-dev

On 11/03/11 05:06, Josh Poimboeuf wrote:
> On Tue, 2011-10-25 at 17:23 +0530, Suzuki K. Poulose wrote:
>> The following patch implements the dynamic relocation processing for
>> PPC32 kernel. relocate() accepts the target virtual address and relocates
>>   the kernel image to the same.
>
> Hi Suzuki,
>
> Thanks for the patches.  I've been testing them on a 440-based card, and
> encountered TLB error exceptions because the BSS section wasn't getting
> properly cleared in early_init().

Thanks a lot for the testing.
>
> It turns out that some of the instructions which were modified in
> relocate() weren't then getting flushed out of the d-cache into memory.
> After that, early_init() executed the stale (non-modified) instructions
> for the BSS area.  Those instructions just accessed offset 0 instead of
> the actual BSS-related offsets.  That resulted in BSS not getting`
> zeroed.
>
> I was able to verify this on my 440 by comparing the d-cache and i-cache
> entries for the BSS-accessing instructions in early_init() using a
> RISCWatch.  As I suspected, the instructions in the d-cache showed the
> corrected offsets, but the i-cache showed the old, non-relocated
> offsets.
>
> To fix the issue, I wrote the following patch, applied on top of your
> patches.  Suggestions and comments are welcome.
>
>
>
>  From c88ae39da0c0352f411aca8d9636990a442d47da Mon Sep 17 00:00:00 2001
> From: Josh Poimboeuf<jpoimboe@linux.vnet.ibm.com>
> Date: Wed, 2 Nov 2011 16:41:24 -0500
> Subject: [PATCH] Flush relocated instructions from data cache
>
> After updating instructions with relocated addresses, flush them from
> the data cache and invalidate the icache line so we don't execute stale
> instructions.
>
> Signed-off-by: Josh Poimboeuf<jpoimboe@linux.vnet.ibm.com>
> ---
>   arch/powerpc/kernel/reloc_32.S |   11 ++++++++++-
>   1 files changed, 10 insertions(+), 1 deletions(-)
>
> diff --git a/arch/powerpc/kernel/reloc_32.S
> b/arch/powerpc/kernel/reloc_32.S
> index 045d61e..a92857d 100644
> --- a/arch/powerpc/kernel/reloc_32.S
> +++ b/arch/powerpc/kernel/reloc_32.S
> @@ -137,6 +137,9 @@ get_type:
>   	lwz	r0, 8(r9)	/* r_addend */
>   	add	r0, r0, r3	/* final addend */
>   	stwx	r0, r4, r7	/* memory[r4+r7]) = (u32)r0 */
> +	dcbst	r4,r7		/* flush dcache line to memory */
> +	sync			/* wait for flush to complete */
> +	icbi	r4,r7		/* invalidate icache line */

Doing it this way has two drawbacks :

1) Placing it here in relocate would do the flushing for each and every update.
2) I would like to keep this code as generic as possible for the PPC32 code.

Could we move this to the place from relocate is called and flush the d-cache and
i-cache entirely ?

Thanks

Suzuki

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/5] [ppc] Process dynamic relocations for kernel
  2011-11-04  8:36     ` Suzuki Poulose
@ 2011-11-07 15:13       ` Josh Poimboeuf
  2011-11-07 15:26         ` David Laight
  2011-11-08  7:11         ` Suzuki Poulose
  0 siblings, 2 replies; 19+ messages in thread
From: Josh Poimboeuf @ 2011-11-07 15:13 UTC (permalink / raw)
  To: Suzuki Poulose
  Cc: Nathan Miller, Josh Poimboeuf, Dave Hansen, Alan Modra,
	Scott Wood, Paul Mackerras, linuxppc-dev

On Fri, 2011-11-04 at 14:06 +0530, Suzuki Poulose wrote:
> On 11/03/11 05:06, Josh Poimboeuf wrote:
> > On Tue, 2011-10-25 at 17:23 +0530, Suzuki K. Poulose wrote:
> > @@ -137,6 +137,9 @@ get_type:
> >   	lwz	r0, 8(r9)	/* r_addend */
> >   	add	r0, r0, r3	/* final addend */
> >   	stwx	r0, r4, r7	/* memory[r4+r7]) = (u32)r0 */
> > +	dcbst	r4,r7		/* flush dcache line to memory */
> > +	sync			/* wait for flush to complete */
> > +	icbi	r4,r7		/* invalidate icache line */
> 
> Doing it this way has two drawbacks :
> 
> 1) Placing it here in relocate would do the flushing for each and every update.

I agree.  My kernel had around 80,000 relocations, which means 80,000
d-cache line flushes (for a 32k d-cache) and 80,000 i-cache line
invalidates (for a 32k i-cache).  Which is obviously a little overkill.
Although I didn't notice a performance hit during boot.


> 2) I would like to keep this code as generic as possible for the PPC32 code.
> 
> Could we move this to the place from relocate is called and flush the d-cache and
> i-cache entirely ?

Why not put the cache flushing code at the end of relocate?  Would some
of the other PPC32 platforms not require the cache flushing?

My PPC32 knowledge is 4xx-centric, so please feel free to rewrite the
patch as needed to accommodate other PPC32 cores.


Thanks,
Josh

^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: [PATCH v2 1/5] [ppc] Process dynamic relocations for kernel
  2011-11-07 15:13       ` Josh Poimboeuf
@ 2011-11-07 15:26         ` David Laight
  2011-11-08  7:11         ` Suzuki Poulose
  1 sibling, 0 replies; 19+ messages in thread
From: David Laight @ 2011-11-07 15:26 UTC (permalink / raw)
  To: Josh Poimboeuf, Suzuki Poulose
  Cc: Nathan Miller, Josh Poimboeuf, Dave Hansen, Alan Modra,
	Scott Wood, Paul Mackerras, linuxppc-dev

=20
> On Fri, 2011-11-04 at 14:06 +0530, Suzuki Poulose wrote:
> > On 11/03/11 05:06, Josh Poimboeuf wrote:
> > > On Tue, 2011-10-25 at 17:23 +0530, Suzuki K. Poulose wrote:
> > > @@ -137,6 +137,9 @@ get_type:
> > >   	lwz	r0, 8(r9)	/* r_addend */
> > >   	add	r0, r0, r3	/* final addend */
> > >   	stwx	r0, r4, r7	/* memory[r4+r7]) =3D (u32)r0 */
> > > +	dcbst	r4,r7		/* flush dcache line to memory */
> > > +	sync			/* wait for flush to complete */
> > > +	icbi	r4,r7		/* invalidate icache line */
> >=20
> > Doing it this way has two drawbacks :
> >=20
> > 1) Placing it here in relocate would do the flushing for=20
> each and every update.
>=20
> I agree.  My kernel had around 80,000 relocations, which means 80,000
> d-cache line flushes (for a 32k d-cache) and 80,000 i-cache line
> invalidates (for a 32k i-cache).  Which is obviously a little=20
> overkill.
> Although I didn't notice a performance hit during boot.

The I-cache invalidates shouldn't be needed, the un-relocated
code can't be in the I-cache (on the grounds that executing
it would crash the system).
A single sync at the end is probably enough as well.
I guess it is possible for the cpu to prefetch/preload
into the i-cache through the jump into the relocated code?
So maybe a full i-cache invalidate right at the end? (or
a jump indirect? - which is probably there anyway)

The d-cache will need some kind of flush, since the modified
lines have to be written out, the only time it generates
additional memeory cycles are if there are two (or more)
reloations in the same d-cache line. Otherwise the early
write-back might help!

	David

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/5] [ppc] Process dynamic relocations for kernel
  2011-11-07 15:13       ` Josh Poimboeuf
  2011-11-07 15:26         ` David Laight
@ 2011-11-08  7:11         ` Suzuki Poulose
  2011-11-08 16:19           ` Josh Poimboeuf
  1 sibling, 1 reply; 19+ messages in thread
From: Suzuki Poulose @ 2011-11-08  7:11 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Nathan Miller, Josh Poimboeuf, Dave Hansen, Alan Modra,
	Scott Wood, Paul Mackerras, linuxppc-dev

On 11/07/11 20:43, Josh Poimboeuf wrote:
> On Fri, 2011-11-04 at 14:06 +0530, Suzuki Poulose wrote:
>> On 11/03/11 05:06, Josh Poimboeuf wrote:
>>> On Tue, 2011-10-25 at 17:23 +0530, Suzuki K. Poulose wrote:
>>> @@ -137,6 +137,9 @@ get_type:
>>>    	lwz	r0, 8(r9)	/* r_addend */
>>>    	add	r0, r0, r3	/* final addend */
>>>    	stwx	r0, r4, r7	/* memory[r4+r7]) = (u32)r0 */
>>> +	dcbst	r4,r7		/* flush dcache line to memory */
>>> +	sync			/* wait for flush to complete */
>>> +	icbi	r4,r7		/* invalidate icache line */
>>
>> Doing it this way has two drawbacks :
>>
>> 1) Placing it here in relocate would do the flushing for each and every update.
>
> I agree.  My kernel had around 80,000 relocations, which means 80,000
> d-cache line flushes (for a 32k d-cache) and 80,000 i-cache line
> invalidates (for a 32k i-cache).  Which is obviously a little overkill.
> Although I didn't notice a performance hit during boot.
>
>
>> 2) I would like to keep this code as generic as possible for the PPC32 code.
>>
>> Could we move this to the place from relocate is called and flush the d-cache and
>> i-cache entirely ?
>
> Why not put the cache flushing code at the end of relocate?  Would some
> of the other PPC32 platforms not require the cache flushing?
What I was suggesting is, instead of flushing the cache in relocate(), lets do it
like:

for e.g, on 440x, (in head_44x.S :)

#ifdef CONFIG_RELOCATABLE
	...
	bl relocate

	#Flush the d-cache and invalidate the i-cache here
#endif


This would let the different platforms do the the cache invalidation in their
own way.

Btw, I didn't find an instruction to flush the entire d-cache in PPC440 manual.
We have instructions to flush only a block corresponding to an address.

However, we have 'iccci' which would invalidate the entire i-cache which, which
I think is better than 80,000 i-cache invalidates.

Kumar / Josh,

Do you have any suggestions here ?




>
> My PPC32 knowledge is 4xx-centric, so please feel free to rewrite the
> patch as needed to accommodate other PPC32 cores.

Same here :)

Thanks
Suzuki

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/5] [ppc] Process dynamic relocations for kernel
  2011-11-08  7:11         ` Suzuki Poulose
@ 2011-11-08 16:19           ` Josh Poimboeuf
  2011-11-09  6:33             ` Suzuki Poulose
  0 siblings, 1 reply; 19+ messages in thread
From: Josh Poimboeuf @ 2011-11-08 16:19 UTC (permalink / raw)
  To: Suzuki Poulose
  Cc: Nathan Miller, Josh Poimboeuf, Dave Hansen, Alan Modra,
	Scott Wood, Paul Mackerras, linuxppc-dev

On Tue, 2011-11-08 at 12:41 +0530, Suzuki Poulose wrote:
> What I was suggesting is, instead of flushing the cache in relocate(), lets do it
> like:
> 
> for e.g, on 440x, (in head_44x.S :)
> 
> #ifdef CONFIG_RELOCATABLE
> 	...
> 	bl relocate
> 
> 	#Flush the d-cache and invalidate the i-cache here
> #endif
> 
> 
> This would let the different platforms do the the cache invalidation in their
> own way.
> 
> Btw, I didn't find an instruction to flush the entire d-cache in PPC440 manual.
> We have instructions to flush only a block corresponding to an address.
> 
> However, we have 'iccci' which would invalidate the entire i-cache which, which
> I think is better than 80,000 i-cache invalidates.

In misc_32.S there are already some platform-independent cache
management functions.  If we use those, then relocate() could simply
call them.  Then the different platforms calling relocate() wouldn't
have to worry about flushing/invalidating caches.

For example, there's a clean_dcache_range() function.  Given any range
twice the size of the d-cache, it should flush the entire d-cache.  But
the only drawback is that it would require the caller to know the size
of the d-cache.

Instead, I think it would be preferable to create a new clean_dcache()
(or clean_dcache_all()?) function in misc_32.S, which could call
clean_dcache_range() with the appropriate args for flushing the entire
d-cache.  relocate() could then call the platform-independent
clean_dcache().

For i-cache invalidation there's already the (incorrectly named?)
flush_instruction_cache().  It uses the appropriate platform-specific
methods (e.g. iccci for 44x) to invalidate the entire i-cache.

Suzuki, if you agree with this direction, I could work up a new patch if
needed.


Josh

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/5] [ppc] Process dynamic relocations for kernel
  2011-11-08 16:19           ` Josh Poimboeuf
@ 2011-11-09  6:33             ` Suzuki Poulose
  2011-11-09  8:42               ` Suzuki Poulose
  2011-11-09 14:53               ` Josh Poimboeuf
  0 siblings, 2 replies; 19+ messages in thread
From: Suzuki Poulose @ 2011-11-09  6:33 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Nathan Miller, Josh Poimboeuf, Dave Hansen, Alan Modra,
	Scott Wood, Paul Mackerras, linuxppc-dev

On Tue, 08 Nov 2011 10:19:05 -0600
Josh Poimboeuf <jpoimboe@linux.vnet.ibm.com> wrote:

> On Tue, 2011-11-08 at 12:41 +0530, Suzuki Poulose wrote:
> > What I was suggesting is, instead of flushing the cache in
> > relocate(), lets do it like:
> > 
> > for e.g, on 440x, (in head_44x.S :)
> > 
> > #ifdef CONFIG_RELOCATABLE
> > 	...
> > 	bl relocate
> > 
> > 	#Flush the d-cache and invalidate the i-cache here
> > #endif
> > 
> > 
> > This would let the different platforms do the the cache
> > invalidation in their own way.
> > 
> > Btw, I didn't find an instruction to flush the entire d-cache in
> > PPC440 manual. We have instructions to flush only a block
> > corresponding to an address.
> > 
> > However, we have 'iccci' which would invalidate the entire i-cache
> > which, which I think is better than 80,000 i-cache invalidates.
> 
> In misc_32.S there are already some platform-independent cache
> management functions.  If we use those, then relocate() could simply
> call them.  Then the different platforms calling relocate() wouldn't
> have to worry about flushing/invalidating caches.
> 
> For example, there's a clean_dcache_range() function.  Given any range
> twice the size of the d-cache, it should flush the entire d-cache.
> But the only drawback is that it would require the caller to know the
> size of the d-cache.
> 
> Instead, I think it would be preferable to create a new clean_dcache()
> (or clean_dcache_all()?) function in misc_32.S, which could call
> clean_dcache_range() with the appropriate args for flushing the entire
> d-cache.  relocate() could then call the platform-independent
> clean_dcache().
> 


How about using clean_dcache_range() to flush the range runtime
address range [ _stext, _end ] ? That would flush the entire
instructions.


> For i-cache invalidation there's already the (incorrectly named?)
> flush_instruction_cache().  It uses the appropriate platform-specific
> methods (e.g. iccci for 44x) to invalidate the entire i-cache.

Agreed. The only thing that worries me is the use of KERNELBASE in the 
flush_instruction_cache() for CONFIG_4xx. Can we safely assume all 4xx
implementations ignore the arguments passed to iccci ? 
> 
> Suzuki, if you agree with this direction, I could work up a new patch
> if needed.
> 

I have the following (untested) patch which uses clean_dcache_range()
and flush_instruction_cache(): I will send the next version soon
with those changes and the DYNAMIC_MEMSTART config for oldstyle
relocatoin, if every one agrees to this.


diff --git a/arch/powerpc/kernel/reloc_32.S
b/arch/powerpc/kernel/reloc_32.S index 045d61e..cce0510 100644
--- a/arch/powerpc/kernel/reloc_32.S
+++ b/arch/powerpc/kernel/reloc_32.S
@@ -33,10 +33,9 @@ R_PPC_RELATIVE = 22
 
 _GLOBAL(relocate)
 
-	mflr	r0
+	mflr	r14		/* Save our LR */
 	bl	0f		/* Find our current runtime
address */ 0:	mflr	r12		/* Make it
accessible */
-	mtlr	r0
 
 	lwz	r11, (p_dyn - 0b)(r12)
 	add	r11, r11, r12	/* runtime address of .dynamic
section */ @@ -87,18 +86,19 @@ eodyn:				/*
End of Dyn Table scan */
 	 * Work out the current offset from the link time address
of .rela
 	 * section.
 	 *  cur_offset[r7] = rela.run[r9] - rela.link [r7]
-	 *  _stext.link[r10] = _stext.run[r10] - cur_offset[r7]
-	 *  final_offset[r3] = _stext.final[r3] - _stext.link[r10]
+	 *  _stext.link[r12] = _stext.run[r10] - cur_offset[r7]
+	 *  final_offset[r3] = _stext.final[r3] - _stext.link[r12]
 	 */
 	subf	r7, r7, r9	/* cur_offset */
-	subf	r10, r7, r10
-	subf	r3, r10, r3	/* final_offset */
+	subf	r12, r7, r10
+	subf	r3, r12, r3	/* final_offset */
 
 	subf	r8, r6, r8	/* relaz -= relaent */
 	/*
 	 * Scan through the .rela table and process each entry
 	 * r9	- points to the current .rela table entry
 	 * r13	- points to the symbol table
+	 * r10  - holds the runtime address of _stext
 	 */
 
 	/*
@@ -180,12 +180,23 @@ store_half:
 
 nxtrela:
 	cmpwi	r8, 0		/* relasz = 0 ? */
-	ble	done
+	ble	flush
 	add	r9, r9, r6	/* move to next entry in
the .rela table */ subf	r8, r6, r8	/* relasz -= relaent */
 	b	applyrela
 
-done:	blr
+	/* Flush the d-cache'd instructions */
+flush:
+	mr	r3, r10
+	lis	r4, (_end - _stext)@h
+	ori	r4, r4, (_end - _stext)@l
+	add	r4, r3, r4
+	bl	clean_dcache_range
+	/* Invalidate the i-cache */
+	bl	flush_instruction_cache
+done:
+	mtlr	r14
+	blr
 
 
 p_dyn:		.long	__dynamic_start - 0b


Thanks

Suzuki

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/5] [ppc] Process dynamic relocations for kernel
  2011-11-09  6:33             ` Suzuki Poulose
@ 2011-11-09  8:42               ` Suzuki Poulose
  2011-11-09 14:53               ` Josh Poimboeuf
  1 sibling, 0 replies; 19+ messages in thread
From: Suzuki Poulose @ 2011-11-09  8:42 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Nathan Miller, Josh Poimboeuf, Dave Hansen, Alan Modra,
	Scott Wood, Paul Mackerras, linuxppc-dev

On 11/09/11 12:03, Suzuki Poulose wrote:
> On Tue, 08 Nov 2011 10:19:05 -0600
> Josh Poimboeuf<jpoimboe@linux.vnet.ibm.com>  wrote:
>
>> On Tue, 2011-11-08 at 12:41 +0530, Suzuki Poulose wrote:
>>> What I was suggesting is, instead of flushing the cache in
>>> relocate(), lets do it like:
>>>
>>> for e.g, on 440x, (in head_44x.S :)
>>>
>>> #ifdef CONFIG_RELOCATABLE
>>> 	...
>>> 	bl relocate
>>>
>>> 	#Flush the d-cache and invalidate the i-cache here
>>> #endif
>>>
>>>
>>> This would let the different platforms do the the cache
>>> invalidation in their own way.
>>>
>>> Btw, I didn't find an instruction to flush the entire d-cache in
>>> PPC440 manual. We have instructions to flush only a block
>>> corresponding to an address.
>>>
>>> However, we have 'iccci' which would invalidate the entire i-cache
>>> which, which I think is better than 80,000 i-cache invalidates.
>>
>> In misc_32.S there are already some platform-independent cache
>> management functions.  If we use those, then relocate() could simply
>> call them.  Then the different platforms calling relocate() wouldn't
>> have to worry about flushing/invalidating caches.
>>
>> For example, there's a clean_dcache_range() function.  Given any range
>> twice the size of the d-cache, it should flush the entire d-cache.
>> But the only drawback is that it would require the caller to know the
>> size of the d-cache.
>>
>> Instead, I think it would be preferable to create a new clean_dcache()
>> (or clean_dcache_all()?) function in misc_32.S, which could call
>> clean_dcache_range() with the appropriate args for flushing the entire
>> d-cache.  relocate() could then call the platform-independent
>> clean_dcache().
>>
>
>
> How about using clean_dcache_range() to flush the range runtime
> address range [ _stext, _end ] ? That would flush the entire
> instructions.
>
>
>> For i-cache invalidation there's already the (incorrectly named?)
>> flush_instruction_cache().  It uses the appropriate platform-specific
>> methods (e.g. iccci for 44x) to invalidate the entire i-cache.
>
> Agreed. The only thing that worries me is the use of KERNELBASE in the
> flush_instruction_cache() for CONFIG_4xx. Can we safely assume all 4xx
> implementations ignore the arguments passed to iccci ?
>>
>> Suzuki, if you agree with this direction, I could work up a new patch
>> if needed.
>>
>
> I have the following (untested) patch which uses clean_dcache_range()
> and flush_instruction_cache(): I will send the next version soon
> with those changes and the DYNAMIC_MEMSTART config for oldstyle
> relocatoin, if every one agrees to this.
>
>
> diff --git a/arch/powerpc/kernel/reloc_32.S
> b/arch/powerpc/kernel/reloc_32.S index 045d61e..cce0510 100644
> --- a/arch/powerpc/kernel/reloc_32.S
> +++ b/arch/powerpc/kernel/reloc_32.S
> @@ -33,10 +33,9 @@ R_PPC_RELATIVE = 22
>
>   _GLOBAL(relocate)
>
> -	mflr	r0
> +	mflr	r14		/* Save our LR */
>   	bl	0f		/* Find our current runtime
> address */ 0:	mflr	r12		/* Make it
> accessible */
> -	mtlr	r0
>
>   	lwz	r11, (p_dyn - 0b)(r12)
>   	add	r11, r11, r12	/* runtime address of .dynamic
> section */ @@ -87,18 +86,19 @@ eodyn:				/*
> End of Dyn Table scan */
>   	 * Work out the current offset from the link time address
> of .rela
>   	 * section.
>   	 *  cur_offset[r7] = rela.run[r9] - rela.link [r7]
> -	 *  _stext.link[r10] = _stext.run[r10] - cur_offset[r7]
> -	 *  final_offset[r3] = _stext.final[r3] - _stext.link[r10]
> +	 *  _stext.link[r12] = _stext.run[r10] - cur_offset[r7]
> +	 *  final_offset[r3] = _stext.final[r3] - _stext.link[r12]
>   	 */
>   	subf	r7, r7, r9	/* cur_offset */
> -	subf	r10, r7, r10
> -	subf	r3, r10, r3	/* final_offset */
> +	subf	r12, r7, r10
> +	subf	r3, r12, r3	/* final_offset */
>
>   	subf	r8, r6, r8	/* relaz -= relaent */
>   	/*
>   	 * Scan through the .rela table and process each entry
>   	 * r9	- points to the current .rela table entry
>   	 * r13	- points to the symbol table
> +	 * r10  - holds the runtime address of _stext
>   	 */
>
>   	/*
> @@ -180,12 +180,23 @@ store_half:
>
>   nxtrela:
>   	cmpwi	r8, 0		/* relasz = 0 ? */
> -	ble	done
> +	ble	flush
>   	add	r9, r9, r6	/* move to next entry in
> the .rela table */ subf	r8, r6, r8	/* relasz -= relaent */
>   	b	applyrela
>
> -done:	blr
> +	/* Flush the d-cache'd instructions */
> +flush:
> +	mr	r3, r10
> +	lis	r4, (_end - _stext)@h
> +	ori	r4, r4, (_end - _stext)@l

Err ! This doesn't compile :

arch/powerpc/kernel/reloc_32.S: Assembler messages:
arch/powerpc/kernel/reloc_32.S:191: Error: can't resolve `_end' {*UND* section} - `_stext' {*UND* section}


I will fix it, but the idea remains the same.

Thanks

Suzuki

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/5] [ppc] Process dynamic relocations for kernel
  2011-11-09  6:33             ` Suzuki Poulose
  2011-11-09  8:42               ` Suzuki Poulose
@ 2011-11-09 14:53               ` Josh Poimboeuf
  2011-11-10  2:31                 ` Suzuki Poulose
  1 sibling, 1 reply; 19+ messages in thread
From: Josh Poimboeuf @ 2011-11-09 14:53 UTC (permalink / raw)
  To: Suzuki Poulose
  Cc: Nathan Miller, Josh Poimboeuf, Dave Hansen, Alan Modra,
	Scott Wood, Paul Mackerras, linuxppc-dev

On Wed, 2011-11-09 at 12:03 +0530, Suzuki Poulose wrote:
> On Tue, 08 Nov 2011 10:19:05 -0600
> Josh Poimboeuf <jpoimboe@linux.vnet.ibm.com> wrote:
> 
> > On Tue, 2011-11-08 at 12:41 +0530, Suzuki Poulose wrote:
> > > What I was suggesting is, instead of flushing the cache in
> > > relocate(), lets do it like:
> > > 
> > > for e.g, on 440x, (in head_44x.S :)
> > > 
> > > #ifdef CONFIG_RELOCATABLE
> > > 	...
> > > 	bl relocate
> > > 
> > > 	#Flush the d-cache and invalidate the i-cache here
> > > #endif
> > > 
> > > 
> > > This would let the different platforms do the the cache
> > > invalidation in their own way.
> > > 
> > > Btw, I didn't find an instruction to flush the entire d-cache in
> > > PPC440 manual. We have instructions to flush only a block
> > > corresponding to an address.
> > > 
> > > However, we have 'iccci' which would invalidate the entire i-cache
> > > which, which I think is better than 80,000 i-cache invalidates.
> > 
> > In misc_32.S there are already some platform-independent cache
> > management functions.  If we use those, then relocate() could simply
> > call them.  Then the different platforms calling relocate() wouldn't
> > have to worry about flushing/invalidating caches.
> > 
> > For example, there's a clean_dcache_range() function.  Given any range
> > twice the size of the d-cache, it should flush the entire d-cache.
> > But the only drawback is that it would require the caller to know the
> > size of the d-cache.
> > 
> > Instead, I think it would be preferable to create a new clean_dcache()
> > (or clean_dcache_all()?) function in misc_32.S, which could call
> > clean_dcache_range() with the appropriate args for flushing the entire
> > d-cache.  relocate() could then call the platform-independent
> > clean_dcache().
> > 
> 
> 
> How about using clean_dcache_range() to flush the range runtime
> address range [ _stext, _end ] ? That would flush the entire
> instructions.

Wouldn't that result in more cache flushing than the original solution? 

For example, my kernel is 3.5MB.  Assuming a 32 byte cache line size,
clean_dcache_range(_stext, _end) would result in about 115,000 dcbst's
(3.5MB / 32).


> 
> 
> > For i-cache invalidation there's already the (incorrectly named?)
> > flush_instruction_cache().  It uses the appropriate platform-specific
> > methods (e.g. iccci for 44x) to invalidate the entire i-cache.
> 
> Agreed. The only thing that worries me is the use of KERNELBASE in the 
> flush_instruction_cache() for CONFIG_4xx. Can we safely assume all 4xx
> implementations ignore the arguments passed to iccci ? 

Good question.  I don't know the answer. :-)

That also may suggest a bigger can of worms.  A grep of the powerpc code
shows many uses of KERNELBASE.  For a relocatable kernel, nobody should
be relying on KERNELBASE except for the early relocation code.  Are we
sure that all the other usages of KERNELBASE are "safe"?

For example -- the DEBUG_CRIT_EXCEPTION macro, which head_44x.S uses,
relies on KERNELBASE.  That doesn't seem right to me.


Josh

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/5] [ppc] Process dynamic relocations for kernel
  2011-11-09 14:53               ` Josh Poimboeuf
@ 2011-11-10  2:31                 ` Suzuki Poulose
  2011-11-10  9:15                   ` David Laight
                                     ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Suzuki Poulose @ 2011-11-10  2:31 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Nathan Miller, Josh Poimboeuf, Dave Hansen, Paul Mackerras,
	Scott Wood, Alan Modra, linuxppc-dev

On 11/09/11 20:23, Josh Poimboeuf wrote:
> On Wed, 2011-11-09 at 12:03 +0530, Suzuki Poulose wrote:
>> On Tue, 08 Nov 2011 10:19:05 -0600
>> Josh Poimboeuf<jpoimboe@linux.vnet.ibm.com>  wrote:
>>
>>> On Tue, 2011-11-08 at 12:41 +0530, Suzuki Poulose wrote:
>>>> What I was suggesting is, instead of flushing the cache in
>>>> relocate(), lets do it like:
>>>>
>>>> for e.g, on 440x, (in head_44x.S :)
>>>>
>>>> #ifdef CONFIG_RELOCATABLE
>>>> 	...
>>>> 	bl relocate
>>>>
>>>> 	#Flush the d-cache and invalidate the i-cache here
>>>> #endif
>>>>
>>>>
>>>> This would let the different platforms do the the cache
>>>> invalidation in their own way.
>>>>
>>>> Btw, I didn't find an instruction to flush the entire d-cache in
>>>> PPC440 manual. We have instructions to flush only a block
>>>> corresponding to an address.
>>>>
>>>> However, we have 'iccci' which would invalidate the entire i-cache
>>>> which, which I think is better than 80,000 i-cache invalidates.
>>>
>>> In misc_32.S there are already some platform-independent cache
>>> management functions.  If we use those, then relocate() could simply
>>> call them.  Then the different platforms calling relocate() wouldn't
>>> have to worry about flushing/invalidating caches.
>>>
>>> For example, there's a clean_dcache_range() function.  Given any range
>>> twice the size of the d-cache, it should flush the entire d-cache.
>>> But the only drawback is that it would require the caller to know the
>>> size of the d-cache.
>>>
>>> Instead, I think it would be preferable to create a new clean_dcache()
>>> (or clean_dcache_all()?) function in misc_32.S, which could call
>>> clean_dcache_range() with the appropriate args for flushing the entire
>>> d-cache.  relocate() could then call the platform-independent
>>> clean_dcache().
>>>
>>
>>
>> How about using clean_dcache_range() to flush the range runtime
>> address range [ _stext, _end ] ? That would flush the entire
>> instructions.
>
> Wouldn't that result in more cache flushing than the original solution?
>
> For example, my kernel is 3.5MB.  Assuming a 32 byte cache line size,
> clean_dcache_range(_stext, _end) would result in about 115,000 dcbst's
> (3.5MB / 32).

Oops ! You are right. We could go back to the clean_dcache_all() or the
initial approach that you suggested. (dcbst).

I am not sure how do we flush the entire dcache(only). Could you post a
patch which does the same ?

Another option is to, change the current mapping to 'Write Through' before
calling the relocate() and revert back to the original setting after relocate().

>
>
>>
>>
>>> For i-cache invalidation there's already the (incorrectly named?)
>>> flush_instruction_cache().  It uses the appropriate platform-specific
>>> methods (e.g. iccci for 44x) to invalidate the entire i-cache.
>>
>> Agreed. The only thing that worries me is the use of KERNELBASE in the
>> flush_instruction_cache() for CONFIG_4xx. Can we safely assume all 4xx
>> implementations ignore the arguments passed to iccci ?
>
> Good question.  I don't know the answer. :-)
>
> That also may suggest a bigger can of worms.  A grep of the powerpc code
> shows many uses of KERNELBASE.  For a relocatable kernel, nobody should
> be relying on KERNELBASE except for the early relocation code.  Are we
> sure that all the other usages of KERNELBASE are "safe"?
>
I think we could simply replace the occurrences of KERNELBASE (after the relocate())
  with '_stext' which would give the virtual start address of the kernel.

Thanks
Suzuki

^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: [PATCH v2 1/5] [ppc] Process dynamic relocations for kernel
  2011-11-10  2:31                 ` Suzuki Poulose
@ 2011-11-10  9:15                   ` David Laight
  2011-11-10 21:44                   ` Josh Poimboeuf
  2011-11-11  4:11                   ` Benjamin Herrenschmidt
  2 siblings, 0 replies; 19+ messages in thread
From: David Laight @ 2011-11-10  9:15 UTC (permalink / raw)
  To: Suzuki Poulose, Josh Poimboeuf
  Cc: Nathan Miller, Josh Poimboeuf, Dave Hansen, Alan Modra,
	Scott Wood, Paul Mackerras, linuxppc-dev

=20
> >> How about using clean_dcache_range() to flush the range runtime
> >> address range [ _stext, _end ] ? That would flush the entire
> >> instructions.
> >
> > Wouldn't that result in more cache flushing than the original
solution?
> >
> > For example, my kernel is 3.5MB.  Assuming a 32 byte cache line
size,
> > clean_dcache_range(_stext, _end) would result in about 115,000
dcbst's
> > (3.5MB / 32).
>=20
> Oops ! You are right. We could go back to the=20
> clean_dcache_all() or the
> initial approach that you suggested. (dcbst).

Maybe clean_dcache_range() itself should limit the range
to the size of the cache?
I suspect that code can easily know the cache size.

	David

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/5] [ppc] Process dynamic relocations for kernel
  2011-11-10  2:31                 ` Suzuki Poulose
  2011-11-10  9:15                   ` David Laight
@ 2011-11-10 21:44                   ` Josh Poimboeuf
  2011-11-11  4:11                   ` Benjamin Herrenschmidt
  2 siblings, 0 replies; 19+ messages in thread
From: Josh Poimboeuf @ 2011-11-10 21:44 UTC (permalink / raw)
  To: Suzuki Poulose
  Cc: Nathan Miller, Josh Poimboeuf, Dave Hansen, Paul Mackerras,
	Scott Wood, Alan Modra, linuxppc-dev

On Thu, 2011-11-10 at 08:01 +0530, Suzuki Poulose wrote:
> >> How about using clean_dcache_range() to flush the range runtime
> >> address range [ _stext, _end ] ? That would flush the entire
> >> instructions.
> >
> > Wouldn't that result in more cache flushing than the original solution?
> >
> > For example, my kernel is 3.5MB.  Assuming a 32 byte cache line size,
> > clean_dcache_range(_stext, _end) would result in about 115,000 dcbst's
> > (3.5MB / 32).
> 
> Oops ! You are right. We could go back to the clean_dcache_all() or the
> initial approach that you suggested. (dcbst).
> 
> I am not sure how do we flush the entire dcache(only). Could you post a
> patch which does the same ?

It turns out that my original idea of giving a 64k address range to
clean_dcache_range() wouldn't work, because dcbst only flushes the line
if the given address is in the cache already.


Also, I experimented with a simple clean_dcache_all() function in
misc_32.S:

#define L1_CACHE_SIZE   (32 * 1024)
#define L1_CACHE_LINES  (L1_CACHE_SIZE / L1_CACHE_BYTES) 
_GLOBAL(clean_dcache_all)
        lis     r3, _start@h
        ori     r3, r3, _start@l
        li      r4, (L1_CACHE_LINES * 2)
        mtctr   r4
1:      lwz     r5, 0(r3)
        addi    r3, r3, L1_CACHE_BYTES
        bdnz    1b
        sync
        blr

But this approach has some issues:

1. It should probably be made more platform-independent with respect to
d-cache size.  I'm not sure the best way to achieve that.

2. The _start address is the kernel virtual address, not the physical
address, but relocate() is running without TLB address translation
enabled.  Although we could easily circumvent this problem by clearing
the d-cache directly in relocate() (or in head_44x.S) using physical
addresses.

3. Chicken/egg issue:  the _start address might be stale because we
haven't yet flushed the d-cache and invalidated the i-cache.

I also discovered that calling flush_instruction_cache() from relocate()
wouldn't work for all platforms, for similar reasons.

We could overcome these issues with more code, but the added complexity
might not be worth it (premature optimization and all).  My original
patch at least has the benefit of being simple.


> 
> Another option is to, change the current mapping to 'Write Through' before
> calling the relocate() and revert back to the original setting after relocate().

True, that's another option.  Although since TLB handling is
platform-specific, I think it would have to be handled by the caller in
head_44x.S, rather than within relocate().


> > That also may suggest a bigger can of worms.  A grep of the powerpc code
> > shows many uses of KERNELBASE.  For a relocatable kernel, nobody should
> > be relying on KERNELBASE except for the early relocation code.  Are we
> > sure that all the other usages of KERNELBASE are "safe"?
> >
> I think we could simply replace the occurrences of KERNELBASE (after the relocate())
>   with '_stext' which would give the virtual start address of the kernel.

Yeah, that would work.


Josh

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/5] [ppc] Process dynamic relocations for kernel
  2011-11-10  2:31                 ` Suzuki Poulose
  2011-11-10  9:15                   ` David Laight
  2011-11-10 21:44                   ` Josh Poimboeuf
@ 2011-11-11  4:11                   ` Benjamin Herrenschmidt
  2 siblings, 0 replies; 19+ messages in thread
From: Benjamin Herrenschmidt @ 2011-11-11  4:11 UTC (permalink / raw)
  To: Suzuki Poulose
  Cc: Nathan Miller, Josh Poimboeuf, Josh Poimboeuf, Dave Hansen,
	Paul Mackerras, Scott Wood, Alan Modra, linuxppc-dev

On Thu, 2011-11-10 at 08:01 +0530, Suzuki Poulose wrote:

> Oops ! You are right. We could go back to the clean_dcache_all() or the
> initial approach that you suggested. (dcbst).
> 
> I am not sure how do we flush the entire dcache(only). Could you post a
> patch which does the same ?
> 
> Another option is to, change the current mapping to 'Write Through' before
> calling the relocate() and revert back to the original setting after relocate().

Why not just keep the dcbst's & icbi's in relocate for now ? (original
patch) Is it noticably slower to boot ? If not I'd say keep it that way,
it will work on all implementations.

Cheers,
Ben.
> >
> >>
> >>
> >>> For i-cache invalidation there's already the (incorrectly named?)
> >>> flush_instruction_cache().  It uses the appropriate platform-specific
> >>> methods (e.g. iccci for 44x) to invalidate the entire i-cache.
> >>
> >> Agreed. The only thing that worries me is the use of KERNELBASE in the
> >> flush_instruction_cache() for CONFIG_4xx. Can we safely assume all 4xx
> >> implementations ignore the arguments passed to iccci ?
> >
> > Good question.  I don't know the answer. :-)
> >
> > That also may suggest a bigger can of worms.  A grep of the powerpc code
> > shows many uses of KERNELBASE.  For a relocatable kernel, nobody should
> > be relying on KERNELBASE except for the early relocation code.  Are we
> > sure that all the other usages of KERNELBASE are "safe"?
> >
> I think we could simply replace the occurrences of KERNELBASE (after the relocate())
>   with '_stext' which would give the virtual start address of the kernel.
> 
> Thanks
> Suzuki
> 
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2011-11-11  4:12 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-10-25 11:53 [PATCH v2 0/5] Kdump support for PPC440x Suzuki K. Poulose
2011-10-25 11:53 ` [PATCH v2 1/5] [ppc] Process dynamic relocations for kernel Suzuki K. Poulose
2011-11-02 23:36   ` Josh Poimboeuf
2011-11-04  8:36     ` Suzuki Poulose
2011-11-07 15:13       ` Josh Poimboeuf
2011-11-07 15:26         ` David Laight
2011-11-08  7:11         ` Suzuki Poulose
2011-11-08 16:19           ` Josh Poimboeuf
2011-11-09  6:33             ` Suzuki Poulose
2011-11-09  8:42               ` Suzuki Poulose
2011-11-09 14:53               ` Josh Poimboeuf
2011-11-10  2:31                 ` Suzuki Poulose
2011-11-10  9:15                   ` David Laight
2011-11-10 21:44                   ` Josh Poimboeuf
2011-11-11  4:11                   ` Benjamin Herrenschmidt
2011-10-25 11:54 ` [PATCH v2 2/5] [ppc] Define virtual-physical translations for PIE relocations Suzuki K. Poulose
2011-10-25 11:54 ` [PATCH v2 3/5] [44x] Enable CONFIG_RELOCATABLE for PPC44x Suzuki K. Poulose
2011-10-25 11:54 ` [PATCH v2 4/5] [44x] Enable CRASH_DUMP for 440x Suzuki K. Poulose
2011-10-25 11:54 ` [PATCH v2 5/5] [boot] Change the load address for the wrapper to fit the kernel Suzuki K. Poulose

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.