linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v10 00/10] kexec_file_load implementation for PowerPC
@ 2016-11-10  3:27 Thiago Jung Bauermann
  2016-11-10  3:27 ` [PATCH v10 01/10] kexec_file: Allow arch-specific memory walking for kexec_add_buffer Thiago Jung Bauermann
                   ` (9 more replies)
  0 siblings, 10 replies; 20+ messages in thread
From: Thiago Jung Bauermann @ 2016-11-10  3:27 UTC (permalink / raw)
  To: kexec
  Cc: linuxppc-dev, linux-kernel, x86, Eric Biederman, Dave Young,
	Vivek Goyal, Baoquan He, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, Stewart Smith,
	Mimi Zohar, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Andrew Morton, Stephen Rothwell, Thiago Jung Bauermann

Hello,

[ Andrew, you might want to wait until the kexec maintainers say whether
  they agree with patch 4 before picking up this version. ]

v10 addresses two requests from Michael Ellerman: build the
purgatory as a Position Independent Executable binary to reduce the
number of relocation types that we have to deal with, and remove code
that is not strictly necessary from the purgatory. 

The first objective is made possible by patch 4, which is new in this
version and adds the support in generic code that arch-specific code
needs if it wants to use a PIE purgatory. Now it's not necessary
anymore to refactor code in module_64.c to reuse its implementation
of the relocations.

I also added a script which checks purgatory.ro and warns if it has
a relocation type that is not supported.

The second objective is met by removing code for printing messages
to the console. Everything else that is in the purgatory is to implement
the checksum verification of the kexec image.

The gory details are in the detailed changelog at the end of this email.

Original cover letter:

This patch series implements the kexec_file_load system call on PowerPC.

This system call moves the reading of the kernel, initrd and the device tree
from the userspace kexec tool to the kernel. This is needed if you want to
do one or both of the following:

1. only allow loading of signed kernels.
2. "measure" (i.e., record the hashes of) the kernel, initrd, kernel
   command line and other boot inputs for the Integrity Measurement
   Architecture subsystem.

The above are the functions kexec already has built into kexec_file_load.
Yesterday I posted a set of patches which allows a third feature:

3. have IMA pass-on its event log (where integrity measurements are
   registered) accross kexec to the second kernel, so that the event
   history is preserved.

Because OpenPower uses an intermediary Linux instance as a boot loader
(skiroot), feature 1 is needed to implement secure boot for the platform,
while features 2 and 3 are needed to implement trusted boot.

This patch series starts by removing an x86 assumption from kexec_file:
kexec_add_buffer uses iomem to find reserved memory ranges, but PowerPC
uses the memblock subsystem.  A hook is added so that each arch can
specify how memory ranges can be found.

Also, the memory-walking logic in kexec_add_buffer is useful in this
implementation to find a free area for the purgatory's stack, so the
next patch moves that logic to kexec_locate_mem_hole.

The kexec_file_load system call needs to apply relocations to the
purgatory but adding code for that would duplicate functionality with
the module loading mechanism, which also needs to apply relocations to
the kernel modules.  Therefore, this patch series factors out the module
relocation code so that it can be shared.

One thing that is still missing is crashkernel support, which I intend
to submit shortly. For now, arch_kexec_kernel_image_probe rejects crash
kernels.

This code is based on kexec-tools, but with many modifications to adapt
it to the kernel environment and facilities. Except the purgatory,
which only has minimal changes.

Changes for v10:
- Patch "kexec_file: Add support for purgatory built as PIE."
  - New patch.
- Patch "powerpc: Factor out relocation code in module_64.c"
  - Dropped patch from series.
- Patch "powerpc: Implement kexec_file_load."
  - Made CONFIG_KEXEC_FILE select HAVE_KEXEC_FILE_PIE_PURGATORY.
  - Removed elf_my_r2 and elf_toc_section functions, they aren't necessary
    anymore.
  - Adapted arch_kexec_apply_relocations_add to the absolute addresses found
    in PIE binaries. Apply the relocations in this function instead of
    calling elf64_apply_relocate_add_item (which doesn't exist anymore).
  - No need to change module_64.c anymore.
  - Implemented R_PPC64_RELATIVE relocation.
- Patch "powerpc: Add functions to read ELF files of any endianness."
  - Introduce <asm/elf_util.h> and elf_util.c in this patch.
- Patch "powerpc: Add support for loading ELF kernels with kexec_file_load."
  - Define PURGATORY_ELF_TYPE macro in <asm/kexec.h>.
  - Removed debug argument from the setup_purgatory function.
  - Don't call find_debug_console in elf64_load anymore.
  - Removed function find_debug_console.
  - Find the TOC address by looking for the .got section in setup_purgatory.
  - Don't set the "debug" purgatory symbol (which doesn't exist anymore) in
    setup_purgatory.
- Patch "powerpc: Add purgatory for kexec_file_load implementation."
  - Don't change boot/string.S.
  - Build purgatory as a PIE binary and suppress generation of the INTERP
    segment, which is not meaningful in this case.
  - Add check-purgatory-relocs.sh and call it after building purgatory.ro.
  - Removed console-ppc64.c, hvCall.S, ppc64_asm.h and printf.c.
  - Moved inclusion of <linux/compiler.h> from purgatory.h to
    purgatory-ppc64.c.
  - Removed "debug" global variable from purgatory-ppc64.c.
  - Removed inclusion of "../boot/string.h" from purgatory.c.
  - Removed printf calls from purgatory.c.
  - Included <linux/types.h>, added prototype for memcmp and removed
    prototypes for putchar, sprintf and printf in purgatory.h.
  - Added string.c with code copied from lib/string.c.
  - Removed inclusion of "ppc64_asm.h" and use of DOTSYM macro from v2wrap.S.
  
Changes for v9:
- Rebased on top of v4.9-rc1
- Patch "powerpc: Change places using CONFIG_KEXEC to use CONFIG_KEXEC_CORE instead."
  - Fixed conflict with patch renaming CONFIG_WORD_SIZE to BITS.
- Patch "powerpc: Factor out relocation code from module_64.c to elf_util_64.c."
  - Retitled to "powerpc: Factor out relocation code in module_64.c"
  - Fixed conflict with patch renaming CONFIG_WORD_SIZE to BITS.
  - Put relocation code in function elf64_apply_relocate_add_item in
    module_64.c itself instead of moving it to elf_util_64.c.
  - Only factored out the switch statement to apply the relocations
    instead of the whole logic to iterate through all of the relocations.
  - There's no elf_util_64.c anymore.
  - Doesn't change <asm/module.h> anymore.
  - Doesn't change arch/powerpc/kernel/Makefile anymore.
  - Moved creation of <asm/elf_util.h> to patch "Implement kexec_file_load."
- Patch "powerpc: Generalize elf64_apply_relocate_add."
  - Dropped patch from series.
  - Moved changes adding address variable to patch "powerpc: Implement
    kexec_file_load."
- Patch "powerpc: Adapt elf64_apply_relocate_add for kexec_file_load."
  - Dropped patch from series.
  - Moved new relocs needed by the purgatory to patch "powerpc: Implement
    kexec_file_load."
- Patch "powerpc: Implement kexec_file_load."
  - Fixed conflict with patch renaming CONFIG_WORD_SIZE to BITS.
  - Put code in a new file machine_kexec_file_64.c instead of adding it to
    machine_kexec_64.c.
  - arch_kexec_apply_relocations_add now has the for loop which was
    previously factored out in elf64_apply_relocate_add.
  - Moved definition of setup_purgatory to "powerpc: Add support for
    loading ELF kernels with kexec_file_load.".
  - Introduce <asm/elf_util.h> and elf_util.c in this patch.
  - Only build elf_util.c if CONFIG_KEXEC_FILE=y.
  - Renamed my_r2 to elf_my_r2 and changed it to receive a pointer to
    struct elf_shdr instead of struct elf_info.
  - Added toc_section function.
  - Removed elf_init_elf_info function.
- Patch "powerpc: Add functions to read ELF files of any endianness."
  - Fixed conflict with patch renaming CONFIG_WORD_SIZE to BITS.
  - Introduce struct elf_info in this patch.
  - Removed stubs_section and toc_section from struct elf_info.
  - Doesn't change arch/powerpc/kernel/Makefile anymore.
- Patch "powerpc: Add code to work with device trees in kexec_file_load."
  - Squashed into "powerpc: Add support for loading ELF kernels with
    kexec_file_load.". The code is unchanged.
- Patch "powerpc: Add support for loading ELF kernels with kexec_file_load."
  - Fixed conflict with patch renaming CONFIG_WORD_SIZE to BITS.
  - Fixed uninitialized variable warning when building with GCC 4.6.3.
  - Don't create <asm/kexec_elf_64.h> anymore, put declaration of
    kexec_el_64_ops in <asm/kexec.h> instead.
  - Removed unused extern declaration of kexec_purgatory_size.
  - Made elf64_load static.
  - Factor out delete_fdt_mem_rsv from setup_new_fdt. This change was in
    the patch series for preserving IMA measurements accross kexec.
- Patch "powerpc: Add purgatory for kexec_file_load implementation."
  - Fixed conflict with patch renaming CONFIG_WORD_SIZE to BITS.
  - Changed purgatory Makefile to use the kernel KBUILD_CFLAGS, filtering
    out the flags for ftrace. (Michael Ellerman)
  - Added extern declaration of the debug global variable to purgatory.h and
    included it in console-ppc64.c to fix checkpatch.pl warning.
  - Added __section attribute to global variables in purgatory-ppc64.c and
    don't initialize them to zero, to fix checkpatch.pl warning.
  - Fixed type of debug global variable to int instead of unsigned int.
  - Included <linux/compiler.h> in purgatory.h and use __printf macro to
    fix checkpatch.pl warning.
  - Removed crashdump-ppc64.h, crashdump_backup.c and purgatory-ppc64.h.

Changes for v8:
- Rebased on top of v4.8-rc5.
- Patch "powerpc: Add purgatory for kexec_file_load implementation."
  - Fix cross build from ppc64 LE to BE and vice-versa.
  - Don't build the purgatory during archprepare.

Changes for v7:
- Rebased on top of v4.8-rc4.
- Patch "powerpc: Change places using CONFIG_KEXEC to use CONFIG_KEXEC_CORE
  instead."
  - New patch. Fixes build when CONFIG_KEXEC=n and CONFIG_KEXEC_FILE=y.
- Patch "powerpc: Adapt elf64_apply_relocate_add for kexec_file_load."
  - Fixed checkpatch warning "else is not generally useful after a break
    or return".
  - Fixed checkpatch warnings about line length. (Andrew Morton)
- Patch "powerpc: Add code to work with device trees in kexec_file_load."
  - Remove space before tabs in doc comment for setup_new_fdt. (Andrew Morton)
  - Fixed checkpatch warnings about line length.
- Patch "powerpc: Add support for loading ELF kernels with kexec_file_load."
  - Removed duplicate #include <linux/kexec.h>.

Changes for v6:
- Based directly on top of v4.8-rc1.
- Patch "powerpc: Adapt elf64_apply_relocate_add for kexec_file_load."
  - Allow undefined symbols if they are relocations for the TOC in the
    big endian ABI.
  - Fixed build error in this patch by adding the ehdr member to elf_info
    here instead of in the next patch.
  - Initialize elf_info.ehdr in module_64.c:module_frob_arch_sections.
- Patch "powerpc: Add code to work with device trees in kexec_file_load."
  - Changed find_debug_console to look for /chosen instead of receiving
    its offset as an argument.
  - setup_new_fdt: no need to find /chosen again after deleting the memory
    reservation for initrd.
- Patch "powerpc: Add support for loading ELF kernels with kexec_file_load."
  - Don't pass the offset to /chosen to find_debug_console.
- Patch "powerpc: Allow userspace to set device tree properties in kexec_file_load"
  - Dropped patch.
- Patch "powerpc: Add purgatory for kexec_file_load implementation."
  - Make boot/string.S use the DOTSYM macro so that it can be
    used by the ppc64 big endian purgatory.
  - Use -mcall-aixdesc to compile the purgatory on big endian ppc64.

Thiago Jung Bauermann (10):
  kexec_file: Allow arch-specific memory walking for kexec_add_buffer
  kexec_file: Change kexec_add_buffer to take kexec_buf as argument.
  kexec_file: Factor out kexec_locate_mem_hole from kexec_add_buffer.
  kexec_file: Add support for purgatory built as PIE.
  powerpc: Change places using CONFIG_KEXEC to use CONFIG_KEXEC_CORE
    instead.
  powerpc: Implement kexec_file_load.
  powerpc: Add functions to read ELF files of any endianness.
  powerpc: Add support for loading ELF kernels with kexec_file_load.
  powerpc: Add purgatory for kexec_file_load implementation.
  powerpc: Enable CONFIG_KEXEC_FILE in powerpc server defconfigs.

 arch/Kconfig                                   |  11 +
 arch/powerpc/Kconfig                           |  16 +-
 arch/powerpc/Makefile                          |   1 +
 arch/powerpc/configs/powernv_defconfig         |   2 +
 arch/powerpc/configs/ppc64_defconfig           |   2 +
 arch/powerpc/configs/pseries_defconfig         |   2 +
 arch/powerpc/include/asm/debug.h               |   2 +-
 arch/powerpc/include/asm/elf_util.h            |  43 ++
 arch/powerpc/include/asm/kexec.h               |  18 +-
 arch/powerpc/include/asm/machdep.h             |   4 +-
 arch/powerpc/include/asm/smp.h                 |   2 +-
 arch/powerpc/include/asm/systbl.h              |   1 +
 arch/powerpc/include/asm/unistd.h              |   2 +-
 arch/powerpc/include/uapi/asm/unistd.h         |   1 +
 arch/powerpc/kernel/Makefile                   |   6 +-
 arch/powerpc/kernel/elf_util.c                 | 437 +++++++++++++++++++
 arch/powerpc/kernel/head_64.S                  |   2 +-
 arch/powerpc/kernel/kexec_elf_64.c             | 279 ++++++++++++
 arch/powerpc/kernel/machine_kexec_file_64.c    | 578 +++++++++++++++++++++++++
 arch/powerpc/kernel/misc_32.S                  |   2 +-
 arch/powerpc/kernel/misc_64.S                  |   6 +-
 arch/powerpc/kernel/prom.c                     |   2 +-
 arch/powerpc/kernel/setup_64.c                 |   4 +-
 arch/powerpc/kernel/smp.c                      |   6 +-
 arch/powerpc/kernel/traps.c                    |   2 +-
 arch/powerpc/platforms/85xx/corenet_generic.c  |   2 +-
 arch/powerpc/platforms/85xx/smp.c              |   8 +-
 arch/powerpc/platforms/cell/spu_base.c         |   2 +-
 arch/powerpc/platforms/powernv/setup.c         |   6 +-
 arch/powerpc/platforms/ps3/setup.c             |   4 +-
 arch/powerpc/platforms/pseries/Makefile        |   2 +-
 arch/powerpc/platforms/pseries/setup.c         |   4 +-
 arch/powerpc/purgatory/.gitignore              |   2 +
 arch/powerpc/purgatory/Makefile                |  40 ++
 arch/powerpc/purgatory/crtsavres.S             |   5 +
 arch/powerpc/purgatory/kexec-sha256.h          |  11 +
 arch/powerpc/purgatory/purgatory-ppc64.c       |  36 ++
 arch/powerpc/purgatory/purgatory.c             |  48 ++
 arch/powerpc/purgatory/purgatory.h             |  10 +
 arch/powerpc/purgatory/sha256.c                |   6 +
 arch/powerpc/purgatory/sha256.h                |   1 +
 arch/powerpc/purgatory/string.c                |  60 +++
 arch/powerpc/purgatory/v2wrap.S                | 132 ++++++
 arch/powerpc/scripts/check-purgatory-relocs.sh |  47 ++
 arch/x86/kernel/crash.c                        |  37 +-
 arch/x86/kernel/kexec-bzimage64.c              |  48 +-
 include/linux/kexec.h                          |  40 +-
 kernel/kexec_file.c                            | 413 +++++++++++++-----
 kernel/kexec_internal.h                        |  16 -
 49 files changed, 2197 insertions(+), 214 deletions(-)
 create mode 100644 arch/powerpc/include/asm/elf_util.h
 create mode 100644 arch/powerpc/kernel/elf_util.c
 create mode 100644 arch/powerpc/kernel/kexec_elf_64.c
 create mode 100644 arch/powerpc/kernel/machine_kexec_file_64.c
 create mode 100644 arch/powerpc/purgatory/.gitignore
 create mode 100644 arch/powerpc/purgatory/Makefile
 create mode 100644 arch/powerpc/purgatory/crtsavres.S
 create mode 100644 arch/powerpc/purgatory/kexec-sha256.h
 create mode 100644 arch/powerpc/purgatory/purgatory-ppc64.c
 create mode 100644 arch/powerpc/purgatory/purgatory.c
 create mode 100644 arch/powerpc/purgatory/purgatory.h
 create mode 100644 arch/powerpc/purgatory/sha256.c
 create mode 100644 arch/powerpc/purgatory/sha256.h
 create mode 100644 arch/powerpc/purgatory/string.c
 create mode 100644 arch/powerpc/purgatory/v2wrap.S
 create mode 100755 arch/powerpc/scripts/check-purgatory-relocs.sh

-- 
2.7.4

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v10 01/10] kexec_file: Allow arch-specific memory walking for kexec_add_buffer
  2016-11-10  3:27 [PATCH v10 00/10] kexec_file_load implementation for PowerPC Thiago Jung Bauermann
@ 2016-11-10  3:27 ` Thiago Jung Bauermann
  2016-11-10  3:27 ` [PATCH v10 02/10] kexec_file: Change kexec_add_buffer to take kexec_buf as argument Thiago Jung Bauermann
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 20+ messages in thread
From: Thiago Jung Bauermann @ 2016-11-10  3:27 UTC (permalink / raw)
  To: kexec
  Cc: linuxppc-dev, linux-kernel, x86, Eric Biederman, Dave Young,
	Vivek Goyal, Baoquan He, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, Stewart Smith,
	Mimi Zohar, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Andrew Morton, Stephen Rothwell, Thiago Jung Bauermann

Allow architectures to specify a different memory walking function for
kexec_add_buffer. x86 uses iomem to track reserved memory ranges, but
PowerPC uses the memblock subsystem.

Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Acked-by: Dave Young <dyoung@redhat.com>
Acked-by: Balbir Singh <bsingharora@gmail.com>
---
 include/linux/kexec.h   | 29 ++++++++++++++++++++++++++++-
 kernel/kexec_file.c     | 30 ++++++++++++++++++++++--------
 kernel/kexec_internal.h | 16 ----------------
 3 files changed, 50 insertions(+), 25 deletions(-)

diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 406c33dcae13..5e320ddaaa82 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -148,7 +148,34 @@ struct kexec_file_ops {
 	kexec_verify_sig_t *verify_sig;
 #endif
 };
-#endif
+
+/**
+ * struct kexec_buf - parameters for finding a place for a buffer in memory
+ * @image:	kexec image in which memory to search.
+ * @buffer:	Contents which will be copied to the allocated memory.
+ * @bufsz:	Size of @buffer.
+ * @mem:	On return will have address of the buffer in memory.
+ * @memsz:	Size for the buffer in memory.
+ * @buf_align:	Minimum alignment needed.
+ * @buf_min:	The buffer can't be placed below this address.
+ * @buf_max:	The buffer can't be placed above this address.
+ * @top_down:	Allocate from top of memory.
+ */
+struct kexec_buf {
+	struct kimage *image;
+	char *buffer;
+	unsigned long bufsz;
+	unsigned long mem;
+	unsigned long memsz;
+	unsigned long buf_align;
+	unsigned long buf_min;
+	unsigned long buf_max;
+	bool top_down;
+};
+
+int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf,
+			       int (*func)(u64, u64, void *));
+#endif /* CONFIG_KEXEC_FILE */
 
 struct kimage {
 	kimage_entry_t head;
diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index 037c321c5618..f865674bff51 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -428,6 +428,27 @@ static int locate_mem_hole_callback(u64 start, u64 end, void *arg)
 	return locate_mem_hole_bottom_up(start, end, kbuf);
 }
 
+/**
+ * arch_kexec_walk_mem - call func(data) on free memory regions
+ * @kbuf:	Context info for the search. Also passed to @func.
+ * @func:	Function to call for each memory region.
+ *
+ * Return: The memory walk will stop when func returns a non-zero value
+ * and that value will be returned. If all free regions are visited without
+ * func returning non-zero, then zero will be returned.
+ */
+int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf,
+			       int (*func)(u64, u64, void *))
+{
+	if (kbuf->image->type == KEXEC_TYPE_CRASH)
+		return walk_iomem_res_desc(crashk_res.desc,
+					   IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY,
+					   crashk_res.start, crashk_res.end,
+					   kbuf, func);
+	else
+		return walk_system_ram_res(0, ULONG_MAX, kbuf, func);
+}
+
 /*
  * Helper function for placing a buffer in a kexec segment. This assumes
  * that kexec_mutex is held.
@@ -474,14 +495,7 @@ int kexec_add_buffer(struct kimage *image, char *buffer, unsigned long bufsz,
 	kbuf->top_down = top_down;
 
 	/* Walk the RAM ranges and allocate a suitable range for the buffer */
-	if (image->type == KEXEC_TYPE_CRASH)
-		ret = walk_iomem_res_desc(crashk_res.desc,
-				IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY,
-				crashk_res.start, crashk_res.end, kbuf,
-				locate_mem_hole_callback);
-	else
-		ret = walk_system_ram_res(0, -1, kbuf,
-					  locate_mem_hole_callback);
+	ret = arch_kexec_walk_mem(kbuf, locate_mem_hole_callback);
 	if (ret != 1) {
 		/* A suitable memory range could not be found for buffer */
 		return -EADDRNOTAVAIL;
diff --git a/kernel/kexec_internal.h b/kernel/kexec_internal.h
index 0a52315d9c62..4cef7e4706b0 100644
--- a/kernel/kexec_internal.h
+++ b/kernel/kexec_internal.h
@@ -20,22 +20,6 @@ struct kexec_sha_region {
 	unsigned long len;
 };
 
-/*
- * Keeps track of buffer parameters as provided by caller for requesting
- * memory placement of buffer.
- */
-struct kexec_buf {
-	struct kimage *image;
-	char *buffer;
-	unsigned long bufsz;
-	unsigned long mem;
-	unsigned long memsz;
-	unsigned long buf_align;
-	unsigned long buf_min;
-	unsigned long buf_max;
-	bool top_down;		/* allocate from top of memory hole */
-};
-
 void kimage_file_post_load_cleanup(struct kimage *image);
 #else /* CONFIG_KEXEC_FILE */
 static inline void kimage_file_post_load_cleanup(struct kimage *image) { }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v10 02/10] kexec_file: Change kexec_add_buffer to take kexec_buf as argument.
  2016-11-10  3:27 [PATCH v10 00/10] kexec_file_load implementation for PowerPC Thiago Jung Bauermann
  2016-11-10  3:27 ` [PATCH v10 01/10] kexec_file: Allow arch-specific memory walking for kexec_add_buffer Thiago Jung Bauermann
@ 2016-11-10  3:27 ` Thiago Jung Bauermann
  2016-11-10  3:27 ` [PATCH v10 03/10] kexec_file: Factor out kexec_locate_mem_hole from kexec_add_buffer Thiago Jung Bauermann
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 20+ messages in thread
From: Thiago Jung Bauermann @ 2016-11-10  3:27 UTC (permalink / raw)
  To: kexec
  Cc: linuxppc-dev, linux-kernel, x86, Eric Biederman, Dave Young,
	Vivek Goyal, Baoquan He, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, Stewart Smith,
	Mimi Zohar, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Andrew Morton, Stephen Rothwell, Thiago Jung Bauermann

This is done to simplify the kexec_add_buffer argument list.
Adapt all callers to set up a kexec_buf to pass to kexec_add_buffer.

In addition, change the type of kexec_buf.buffer from char * to void *.
There is no particular reason for it to be a char *, and the change
allows us to get rid of 3 existing casts to char * in the code.

Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Acked-by: Dave Young <dyoung@redhat.com>
Acked-by: Balbir Singh <bsingharora@gmail.com>
---
 arch/x86/kernel/crash.c           | 37 ++++++++--------
 arch/x86/kernel/kexec-bzimage64.c | 48 +++++++++++----------
 include/linux/kexec.h             |  8 +---
 kernel/kexec_file.c               | 88 ++++++++++++++++++---------------------
 4 files changed, 87 insertions(+), 94 deletions(-)

diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index 650830e39e3a..3741461c63a0 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -631,9 +631,9 @@ static int determine_backup_region(u64 start, u64 end, void *arg)
 
 int crash_load_segments(struct kimage *image)
 {
-	unsigned long src_start, src_sz, elf_sz;
-	void *elf_addr;
 	int ret;
+	struct kexec_buf kbuf = { .image = image, .buf_min = 0,
+				  .buf_max = ULONG_MAX, .top_down = false };
 
 	/*
 	 * Determine and load a segment for backup area. First 640K RAM
@@ -647,43 +647,44 @@ int crash_load_segments(struct kimage *image)
 	if (ret < 0)
 		return ret;
 
-	src_start = image->arch.backup_src_start;
-	src_sz = image->arch.backup_src_sz;
-
 	/* Add backup segment. */
-	if (src_sz) {
+	if (image->arch.backup_src_sz) {
+		kbuf.buffer = &crash_zero_bytes;
+		kbuf.bufsz = sizeof(crash_zero_bytes);
+		kbuf.memsz = image->arch.backup_src_sz;
+		kbuf.buf_align = PAGE_SIZE;
 		/*
 		 * Ideally there is no source for backup segment. This is
 		 * copied in purgatory after crash. Just add a zero filled
 		 * segment for now to make sure checksum logic works fine.
 		 */
-		ret = kexec_add_buffer(image, (char *)&crash_zero_bytes,
-				       sizeof(crash_zero_bytes), src_sz,
-				       PAGE_SIZE, 0, -1, 0,
-				       &image->arch.backup_load_addr);
+		ret = kexec_add_buffer(&kbuf);
 		if (ret)
 			return ret;
+		image->arch.backup_load_addr = kbuf.mem;
 		pr_debug("Loaded backup region at 0x%lx backup_start=0x%lx memsz=0x%lx\n",
-			 image->arch.backup_load_addr, src_start, src_sz);
+			 image->arch.backup_load_addr,
+			 image->arch.backup_src_start, kbuf.memsz);
 	}
 
 	/* Prepare elf headers and add a segment */
-	ret = prepare_elf_headers(image, &elf_addr, &elf_sz);
+	ret = prepare_elf_headers(image, &kbuf.buffer, &kbuf.bufsz);
 	if (ret)
 		return ret;
 
-	image->arch.elf_headers = elf_addr;
-	image->arch.elf_headers_sz = elf_sz;
+	image->arch.elf_headers = kbuf.buffer;
+	image->arch.elf_headers_sz = kbuf.bufsz;
 
-	ret = kexec_add_buffer(image, (char *)elf_addr, elf_sz, elf_sz,
-			ELF_CORE_HEADER_ALIGN, 0, -1, 0,
-			&image->arch.elf_load_addr);
+	kbuf.memsz = kbuf.bufsz;
+	kbuf.buf_align = ELF_CORE_HEADER_ALIGN;
+	ret = kexec_add_buffer(&kbuf);
 	if (ret) {
 		vfree((void *)image->arch.elf_headers);
 		return ret;
 	}
+	image->arch.elf_load_addr = kbuf.mem;
 	pr_debug("Loaded ELF headers at 0x%lx bufsz=0x%lx memsz=0x%lx\n",
-		 image->arch.elf_load_addr, elf_sz, elf_sz);
+		 image->arch.elf_load_addr, kbuf.bufsz, kbuf.bufsz);
 
 	return ret;
 }
diff --git a/arch/x86/kernel/kexec-bzimage64.c b/arch/x86/kernel/kexec-bzimage64.c
index 3407b148c240..d0a814a9d96a 100644
--- a/arch/x86/kernel/kexec-bzimage64.c
+++ b/arch/x86/kernel/kexec-bzimage64.c
@@ -331,17 +331,17 @@ static void *bzImage64_load(struct kimage *image, char *kernel,
 
 	struct setup_header *header;
 	int setup_sects, kern16_size, ret = 0;
-	unsigned long setup_header_size, params_cmdline_sz, params_misc_sz;
+	unsigned long setup_header_size, params_cmdline_sz;
 	struct boot_params *params;
 	unsigned long bootparam_load_addr, kernel_load_addr, initrd_load_addr;
 	unsigned long purgatory_load_addr;
-	unsigned long kernel_bufsz, kernel_memsz, kernel_align;
-	char *kernel_buf;
 	struct bzimage64_data *ldata;
 	struct kexec_entry64_regs regs64;
 	void *stack;
 	unsigned int setup_hdr_offset = offsetof(struct boot_params, hdr);
 	unsigned int efi_map_offset, efi_map_sz, efi_setup_data_offset;
+	struct kexec_buf kbuf = { .image = image, .buf_max = ULONG_MAX,
+				  .top_down = true };
 
 	header = (struct setup_header *)(kernel + setup_hdr_offset);
 	setup_sects = header->setup_sects;
@@ -402,11 +402,11 @@ static void *bzImage64_load(struct kimage *image, char *kernel,
 	params_cmdline_sz = sizeof(struct boot_params) + cmdline_len +
 				MAX_ELFCOREHDR_STR_LEN;
 	params_cmdline_sz = ALIGN(params_cmdline_sz, 16);
-	params_misc_sz = params_cmdline_sz + efi_map_sz +
+	kbuf.bufsz = params_cmdline_sz + efi_map_sz +
 				sizeof(struct setup_data) +
 				sizeof(struct efi_setup_data);
 
-	params = kzalloc(params_misc_sz, GFP_KERNEL);
+	params = kzalloc(kbuf.bufsz, GFP_KERNEL);
 	if (!params)
 		return ERR_PTR(-ENOMEM);
 	efi_map_offset = params_cmdline_sz;
@@ -418,37 +418,41 @@ static void *bzImage64_load(struct kimage *image, char *kernel,
 	/* Is there a limit on setup header size? */
 	memcpy(&params->hdr, (kernel + setup_hdr_offset), setup_header_size);
 
-	ret = kexec_add_buffer(image, (char *)params, params_misc_sz,
-			       params_misc_sz, 16, MIN_BOOTPARAM_ADDR,
-			       ULONG_MAX, 1, &bootparam_load_addr);
+	kbuf.buffer = params;
+	kbuf.memsz = kbuf.bufsz;
+	kbuf.buf_align = 16;
+	kbuf.buf_min = MIN_BOOTPARAM_ADDR;
+	ret = kexec_add_buffer(&kbuf);
 	if (ret)
 		goto out_free_params;
+	bootparam_load_addr = kbuf.mem;
 	pr_debug("Loaded boot_param, command line and misc at 0x%lx bufsz=0x%lx memsz=0x%lx\n",
-		 bootparam_load_addr, params_misc_sz, params_misc_sz);
+		 bootparam_load_addr, kbuf.bufsz, kbuf.bufsz);
 
 	/* Load kernel */
-	kernel_buf = kernel + kern16_size;
-	kernel_bufsz =  kernel_len - kern16_size;
-	kernel_memsz = PAGE_ALIGN(header->init_size);
-	kernel_align = header->kernel_alignment;
-
-	ret = kexec_add_buffer(image, kernel_buf,
-			       kernel_bufsz, kernel_memsz, kernel_align,
-			       MIN_KERNEL_LOAD_ADDR, ULONG_MAX, 1,
-			       &kernel_load_addr);
+	kbuf.buffer = kernel + kern16_size;
+	kbuf.bufsz =  kernel_len - kern16_size;
+	kbuf.memsz = PAGE_ALIGN(header->init_size);
+	kbuf.buf_align = header->kernel_alignment;
+	kbuf.buf_min = MIN_KERNEL_LOAD_ADDR;
+	ret = kexec_add_buffer(&kbuf);
 	if (ret)
 		goto out_free_params;
+	kernel_load_addr = kbuf.mem;
 
 	pr_debug("Loaded 64bit kernel at 0x%lx bufsz=0x%lx memsz=0x%lx\n",
-		 kernel_load_addr, kernel_memsz, kernel_memsz);
+		 kernel_load_addr, kbuf.bufsz, kbuf.memsz);
 
 	/* Load initrd high */
 	if (initrd) {
-		ret = kexec_add_buffer(image, initrd, initrd_len, initrd_len,
-				       PAGE_SIZE, MIN_INITRD_LOAD_ADDR,
-				       ULONG_MAX, 1, &initrd_load_addr);
+		kbuf.buffer = initrd;
+		kbuf.bufsz = kbuf.memsz = initrd_len;
+		kbuf.buf_align = PAGE_SIZE;
+		kbuf.buf_min = MIN_INITRD_LOAD_ADDR;
+		ret = kexec_add_buffer(&kbuf);
 		if (ret)
 			goto out_free_params;
+		initrd_load_addr = kbuf.mem;
 
 		pr_debug("Loaded initrd at 0x%lx bufsz=0x%lx memsz=0x%lx\n",
 				initrd_load_addr, initrd_len, initrd_len);
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 5e320ddaaa82..437ef1b47428 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -163,7 +163,7 @@ struct kexec_file_ops {
  */
 struct kexec_buf {
 	struct kimage *image;
-	char *buffer;
+	void *buffer;
 	unsigned long bufsz;
 	unsigned long mem;
 	unsigned long memsz;
@@ -175,6 +175,7 @@ struct kexec_buf {
 
 int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf,
 			       int (*func)(u64, u64, void *));
+extern int kexec_add_buffer(struct kexec_buf *kbuf);
 #endif /* CONFIG_KEXEC_FILE */
 
 struct kimage {
@@ -239,11 +240,6 @@ extern asmlinkage long sys_kexec_load(unsigned long entry,
 					struct kexec_segment __user *segments,
 					unsigned long flags);
 extern int kernel_kexec(void);
-extern int kexec_add_buffer(struct kimage *image, char *buffer,
-			    unsigned long bufsz, unsigned long memsz,
-			    unsigned long buf_align, unsigned long buf_min,
-			    unsigned long buf_max, bool top_down,
-			    unsigned long *load_addr);
 extern struct page *kimage_alloc_control_pages(struct kimage *image,
 						unsigned int order);
 extern int kexec_load_purgatory(struct kimage *image, unsigned long min,
diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index f865674bff51..efd2c094af7e 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -449,25 +449,27 @@ int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf,
 		return walk_system_ram_res(0, ULONG_MAX, kbuf, func);
 }
 
-/*
- * Helper function for placing a buffer in a kexec segment. This assumes
- * that kexec_mutex is held.
+/**
+ * kexec_add_buffer - place a buffer in a kexec segment
+ * @kbuf:	Buffer contents and memory parameters.
+ *
+ * This function assumes that kexec_mutex is held.
+ * On successful return, @kbuf->mem will have the physical address of
+ * the buffer in memory.
+ *
+ * Return: 0 on success, negative errno on error.
  */
-int kexec_add_buffer(struct kimage *image, char *buffer, unsigned long bufsz,
-		     unsigned long memsz, unsigned long buf_align,
-		     unsigned long buf_min, unsigned long buf_max,
-		     bool top_down, unsigned long *load_addr)
+int kexec_add_buffer(struct kexec_buf *kbuf)
 {
 
 	struct kexec_segment *ksegment;
-	struct kexec_buf buf, *kbuf;
 	int ret;
 
 	/* Currently adding segment this way is allowed only in file mode */
-	if (!image->file_mode)
+	if (!kbuf->image->file_mode)
 		return -EINVAL;
 
-	if (image->nr_segments >= KEXEC_SEGMENT_MAX)
+	if (kbuf->image->nr_segments >= KEXEC_SEGMENT_MAX)
 		return -EINVAL;
 
 	/*
@@ -477,22 +479,14 @@ int kexec_add_buffer(struct kimage *image, char *buffer, unsigned long bufsz,
 	 * logic goes through list of segments to make sure there are
 	 * no destination overlaps.
 	 */
-	if (!list_empty(&image->control_pages)) {
+	if (!list_empty(&kbuf->image->control_pages)) {
 		WARN_ON(1);
 		return -EINVAL;
 	}
 
-	memset(&buf, 0, sizeof(struct kexec_buf));
-	kbuf = &buf;
-	kbuf->image = image;
-	kbuf->buffer = buffer;
-	kbuf->bufsz = bufsz;
-
-	kbuf->memsz = ALIGN(memsz, PAGE_SIZE);
-	kbuf->buf_align = max(buf_align, PAGE_SIZE);
-	kbuf->buf_min = buf_min;
-	kbuf->buf_max = buf_max;
-	kbuf->top_down = top_down;
+	/* Ensure minimum alignment needed for segments. */
+	kbuf->memsz = ALIGN(kbuf->memsz, PAGE_SIZE);
+	kbuf->buf_align = max(kbuf->buf_align, PAGE_SIZE);
 
 	/* Walk the RAM ranges and allocate a suitable range for the buffer */
 	ret = arch_kexec_walk_mem(kbuf, locate_mem_hole_callback);
@@ -502,13 +496,12 @@ int kexec_add_buffer(struct kimage *image, char *buffer, unsigned long bufsz,
 	}
 
 	/* Found a suitable memory range */
-	ksegment = &image->segment[image->nr_segments];
+	ksegment = &kbuf->image->segment[kbuf->image->nr_segments];
 	ksegment->kbuf = kbuf->buffer;
 	ksegment->bufsz = kbuf->bufsz;
 	ksegment->mem = kbuf->mem;
 	ksegment->memsz = kbuf->memsz;
-	image->nr_segments++;
-	*load_addr = ksegment->mem;
+	kbuf->image->nr_segments++;
 	return 0;
 }
 
@@ -630,13 +623,15 @@ static int __kexec_load_purgatory(struct kimage *image, unsigned long min,
 				  unsigned long max, int top_down)
 {
 	struct purgatory_info *pi = &image->purgatory_info;
-	unsigned long align, buf_align, bss_align, buf_sz, bss_sz, bss_pad;
-	unsigned long memsz, entry, load_addr, curr_load_addr, bss_addr, offset;
+	unsigned long align, bss_align, bss_sz, bss_pad;
+	unsigned long entry, load_addr, curr_load_addr, bss_addr, offset;
 	unsigned char *buf_addr, *src;
 	int i, ret = 0, entry_sidx = -1;
 	const Elf_Shdr *sechdrs_c;
 	Elf_Shdr *sechdrs = NULL;
-	void *purgatory_buf = NULL;
+	struct kexec_buf kbuf = { .image = image, .bufsz = 0, .buf_align = 1,
+				  .buf_min = min, .buf_max = max,
+				  .top_down = top_down };
 
 	/*
 	 * sechdrs_c points to section headers in purgatory and are read
@@ -702,9 +697,7 @@ static int __kexec_load_purgatory(struct kimage *image, unsigned long min,
 	}
 
 	/* Determine how much memory is needed to load relocatable object. */
-	buf_align = 1;
 	bss_align = 1;
-	buf_sz = 0;
 	bss_sz = 0;
 
 	for (i = 0; i < pi->ehdr->e_shnum; i++) {
@@ -713,10 +706,10 @@ static int __kexec_load_purgatory(struct kimage *image, unsigned long min,
 
 		align = sechdrs[i].sh_addralign;
 		if (sechdrs[i].sh_type != SHT_NOBITS) {
-			if (buf_align < align)
-				buf_align = align;
-			buf_sz = ALIGN(buf_sz, align);
-			buf_sz += sechdrs[i].sh_size;
+			if (kbuf.buf_align < align)
+				kbuf.buf_align = align;
+			kbuf.bufsz = ALIGN(kbuf.bufsz, align);
+			kbuf.bufsz += sechdrs[i].sh_size;
 		} else {
 			/* bss section */
 			if (bss_align < align)
@@ -728,32 +721,31 @@ static int __kexec_load_purgatory(struct kimage *image, unsigned long min,
 
 	/* Determine the bss padding required to align bss properly */
 	bss_pad = 0;
-	if (buf_sz & (bss_align - 1))
-		bss_pad = bss_align - (buf_sz & (bss_align - 1));
+	if (kbuf.bufsz & (bss_align - 1))
+		bss_pad = bss_align - (kbuf.bufsz & (bss_align - 1));
 
-	memsz = buf_sz + bss_pad + bss_sz;
+	kbuf.memsz = kbuf.bufsz + bss_pad + bss_sz;
 
 	/* Allocate buffer for purgatory */
-	purgatory_buf = vzalloc(buf_sz);
-	if (!purgatory_buf) {
+	kbuf.buffer = vzalloc(kbuf.bufsz);
+	if (!kbuf.buffer) {
 		ret = -ENOMEM;
 		goto out;
 	}
 
-	if (buf_align < bss_align)
-		buf_align = bss_align;
+	if (kbuf.buf_align < bss_align)
+		kbuf.buf_align = bss_align;
 
 	/* Add buffer to segment list */
-	ret = kexec_add_buffer(image, purgatory_buf, buf_sz, memsz,
-				buf_align, min, max, top_down,
-				&pi->purgatory_load_addr);
+	ret = kexec_add_buffer(&kbuf);
 	if (ret)
 		goto out;
+	pi->purgatory_load_addr = kbuf.mem;
 
 	/* Load SHF_ALLOC sections */
-	buf_addr = purgatory_buf;
+	buf_addr = kbuf.buffer;
 	load_addr = curr_load_addr = pi->purgatory_load_addr;
-	bss_addr = load_addr + buf_sz + bss_pad;
+	bss_addr = load_addr + kbuf.bufsz + bss_pad;
 
 	for (i = 0; i < pi->ehdr->e_shnum; i++) {
 		if (!(sechdrs[i].sh_flags & SHF_ALLOC))
@@ -799,11 +791,11 @@ static int __kexec_load_purgatory(struct kimage *image, unsigned long min,
 	 * Used later to identify which section is purgatory and skip it
 	 * from checksumming.
 	 */
-	pi->purgatory_buf = purgatory_buf;
+	pi->purgatory_buf = kbuf.buffer;
 	return ret;
 out:
 	vfree(sechdrs);
-	vfree(purgatory_buf);
+	vfree(kbuf.buffer);
 	return ret;
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v10 03/10] kexec_file: Factor out kexec_locate_mem_hole from kexec_add_buffer.
  2016-11-10  3:27 [PATCH v10 00/10] kexec_file_load implementation for PowerPC Thiago Jung Bauermann
  2016-11-10  3:27 ` [PATCH v10 01/10] kexec_file: Allow arch-specific memory walking for kexec_add_buffer Thiago Jung Bauermann
  2016-11-10  3:27 ` [PATCH v10 02/10] kexec_file: Change kexec_add_buffer to take kexec_buf as argument Thiago Jung Bauermann
@ 2016-11-10  3:27 ` Thiago Jung Bauermann
  2016-11-10  3:27 ` [PATCH v10 04/10] kexec_file: Add support for purgatory built as PIE Thiago Jung Bauermann
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 20+ messages in thread
From: Thiago Jung Bauermann @ 2016-11-10  3:27 UTC (permalink / raw)
  To: kexec
  Cc: linuxppc-dev, linux-kernel, x86, Eric Biederman, Dave Young,
	Vivek Goyal, Baoquan He, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, Stewart Smith,
	Mimi Zohar, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Andrew Morton, Stephen Rothwell, Thiago Jung Bauermann

kexec_locate_mem_hole will be used by the PowerPC kexec_file_load
implementation to find free memory for the purgatory stack.

Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Acked-by: Dave Young <dyoung@redhat.com>
---
 include/linux/kexec.h |  1 +
 kernel/kexec_file.c   | 25 ++++++++++++++++++++-----
 2 files changed, 21 insertions(+), 5 deletions(-)

diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 437ef1b47428..a33f63351f86 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -176,6 +176,7 @@ struct kexec_buf {
 int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf,
 			       int (*func)(u64, u64, void *));
 extern int kexec_add_buffer(struct kexec_buf *kbuf);
+int kexec_locate_mem_hole(struct kexec_buf *kbuf);
 #endif /* CONFIG_KEXEC_FILE */
 
 struct kimage {
diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index efd2c094af7e..0c2df7f73792 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -450,6 +450,23 @@ int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf,
 }
 
 /**
+ * kexec_locate_mem_hole - find free memory for the purgatory or the next kernel
+ * @kbuf:	Parameters for the memory search.
+ *
+ * On success, kbuf->mem will have the start address of the memory region found.
+ *
+ * Return: 0 on success, negative errno on error.
+ */
+int kexec_locate_mem_hole(struct kexec_buf *kbuf)
+{
+	int ret;
+
+	ret = arch_kexec_walk_mem(kbuf, locate_mem_hole_callback);
+
+	return ret == 1 ? 0 : -EADDRNOTAVAIL;
+}
+
+/**
  * kexec_add_buffer - place a buffer in a kexec segment
  * @kbuf:	Buffer contents and memory parameters.
  *
@@ -489,11 +506,9 @@ int kexec_add_buffer(struct kexec_buf *kbuf)
 	kbuf->buf_align = max(kbuf->buf_align, PAGE_SIZE);
 
 	/* Walk the RAM ranges and allocate a suitable range for the buffer */
-	ret = arch_kexec_walk_mem(kbuf, locate_mem_hole_callback);
-	if (ret != 1) {
-		/* A suitable memory range could not be found for buffer */
-		return -EADDRNOTAVAIL;
-	}
+	ret = kexec_locate_mem_hole(kbuf);
+	if (ret)
+		return ret;
 
 	/* Found a suitable memory range */
 	ksegment = &kbuf->image->segment[kbuf->image->nr_segments];
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v10 04/10] kexec_file: Add support for purgatory built as PIE.
  2016-11-10  3:27 [PATCH v10 00/10] kexec_file_load implementation for PowerPC Thiago Jung Bauermann
                   ` (2 preceding siblings ...)
  2016-11-10  3:27 ` [PATCH v10 03/10] kexec_file: Factor out kexec_locate_mem_hole from kexec_add_buffer Thiago Jung Bauermann
@ 2016-11-10  3:27 ` Thiago Jung Bauermann
  2016-11-20  2:45   ` Dave Young
  2016-11-10  3:27 ` [PATCH v10 05/10] powerpc: Change places using CONFIG_KEXEC to use CONFIG_KEXEC_CORE instead Thiago Jung Bauermann
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 20+ messages in thread
From: Thiago Jung Bauermann @ 2016-11-10  3:27 UTC (permalink / raw)
  To: kexec
  Cc: linuxppc-dev, linux-kernel, x86, Eric Biederman, Dave Young,
	Vivek Goyal, Baoquan He, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, Stewart Smith,
	Mimi Zohar, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Andrew Morton, Stephen Rothwell, Thiago Jung Bauermann

powerpc's purgatory.ro has 12 relocation types when built as
a relocatable object. To implement support for them requires
arch_kexec_apply_relocations_add to duplicate a lot of code with
module_64.c:apply_relocate_add.

When built as a Position Independent Executable there are only 4
relocation types in purgatory.ro, so it becomes practical for the powerpc
implementation of kexec_file to have its own relocation implementation.

Also, the purgatory is an executable and not an intermediary output from
the compiler so it makes sense conceptually that it is easier to build
it as a PIE than as a partially linked object.

Apart from the greatly reduced number of relocations, there are two
differences between a relocatable object and a PIE:

1. __kexec_load_purgatory needs to use the program headers rather than the
   section headers to figure out how to load the binary.
2. Symbol values are absolute addresses instead of relative to the
   start of the section.

This patch adds the support needed in generic code for the differences
above and allows powerpc to load and relocate a position independent
purgatory.

Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
---
 arch/Kconfig          |  11 ++
 include/linux/kexec.h |   4 +
 kernel/kexec_file.c   | 314 ++++++++++++++++++++++++++++++++++++++------------
 3 files changed, 253 insertions(+), 76 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index 659bdd079277..f4498530a618 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -5,6 +5,17 @@
 config KEXEC_CORE
 	bool
 
+config HAVE_KEXEC_FILE_PIE_PURGATORY
+	bool
+	help
+	  By default, the purgatory binary is built as a relocatable
+	  object, but on some architectures it might be an advantage
+	  to build it as a Position Independent Executable to reduce
+	  the types of relocation that have to be dealt with.
+
+	  If an architecture builds a PIE purgatory it should select
+	  this symbol.
+
 config OPROFILE
 	tristate "OProfile system profiling"
 	depends on PROFILING
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index a33f63351f86..5c356e387240 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -112,6 +112,10 @@ struct compat_kexec_segment {
 #endif
 
 #ifdef CONFIG_KEXEC_FILE
+#ifndef PURGATORY_ELF_TYPE
+#define PURGATORY_ELF_TYPE ET_REL
+#endif
+
 struct purgatory_info {
 	/* Pointer to elf header of read only purgatory */
 	Elf_Ehdr *ehdr;
diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index 0c2df7f73792..cf8c17111b12 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -633,68 +633,139 @@ static int kexec_calculate_store_digests(struct kimage *image)
 	return ret;
 }
 
-/* Actually load purgatory. Lot of code taken from kexec-tools */
-static int __kexec_load_purgatory(struct kimage *image, unsigned long min,
-				  unsigned long max, int top_down)
+#ifdef CONFIG_HAVE_KEXEC_FILE_PIE_PURGATORY
+/*
+ * Load position independent executable purgatory using program header
+ * information.
+ */
+static int __kexec_really_load_purgatory(const Elf_Ehdr *ehdr,
+					 Elf_Shdr *sechdrs,
+					 struct kexec_buf *kbuf,
+					 unsigned long *entry_addr)
 {
-	struct purgatory_info *pi = &image->purgatory_info;
-	unsigned long align, bss_align, bss_sz, bss_pad;
-	unsigned long entry, load_addr, curr_load_addr, bss_addr, offset;
-	unsigned char *buf_addr, *src;
-	int i, ret = 0, entry_sidx = -1;
-	const Elf_Shdr *sechdrs_c;
-	Elf_Shdr *sechdrs = NULL;
-	struct kexec_buf kbuf = { .image = image, .bufsz = 0, .buf_align = 1,
-				  .buf_min = min, .buf_max = max,
-				  .top_down = top_down };
+	int ret;
+	unsigned long entry, dst_mem;
+	void *dst;
+	const Elf_Phdr *phdr, *first_load_seg = NULL, *last_load_seg = NULL;
+	const Elf_Phdr *prev_load_seg;
+	const Elf_Phdr *phdrs = (const void *) ehdr + ehdr->e_phoff;
+
+	/* Determine how much memory is needed to load the executable. */
+	for (phdr = phdrs; phdr < phdrs + ehdr->e_phnum; phdr++) {
+		if (phdr->p_type != PT_LOAD)
+			continue;
 
-	/*
-	 * sechdrs_c points to section headers in purgatory and are read
-	 * only. No modifications allowed.
-	 */
-	sechdrs_c = (void *)pi->ehdr + pi->ehdr->e_shoff;
+		if (!first_load_seg)
+			first_load_seg = phdr;
 
-	/*
-	 * We can not modify sechdrs_c[] and its fields. It is read only.
-	 * Copy it over to a local copy where one can store some temporary
-	 * data and free it at the end. We need to modify ->sh_addr and
-	 * ->sh_offset fields to keep track of permanent and temporary
-	 * locations of sections.
-	 */
-	sechdrs = vzalloc(pi->ehdr->e_shnum * sizeof(Elf_Shdr));
-	if (!sechdrs)
-		return -ENOMEM;
+		if (kbuf->buf_align < phdr->p_align)
+			kbuf->buf_align = phdr->p_align;
 
-	memcpy(sechdrs, sechdrs_c, pi->ehdr->e_shnum * sizeof(Elf_Shdr));
+		last_load_seg = phdr;
+	}
 
-	/*
-	 * We seem to have multiple copies of sections. First copy is which
-	 * is embedded in kernel in read only section. Some of these sections
-	 * will be copied to a temporary buffer and relocated. And these
-	 * sections will finally be copied to their final destination at
-	 * segment load time.
-	 *
-	 * Use ->sh_offset to reflect section address in memory. It will
-	 * point to original read only copy if section is not allocatable.
-	 * Otherwise it will point to temporary copy which will be relocated.
-	 *
-	 * Use ->sh_addr to contain final address of the section where it
-	 * will go during execution time.
-	 */
-	for (i = 0; i < pi->ehdr->e_shnum; i++) {
-		if (sechdrs[i].sh_type == SHT_NOBITS)
+	kbuf->memsz = kbuf->bufsz = last_load_seg->p_paddr +
+					    last_load_seg->p_memsz -
+					    first_load_seg->p_paddr +
+					    first_load_seg->p_offset;
+
+	/* Allocate buffer for purgatory. */
+	kbuf->buffer = vzalloc(kbuf->bufsz);
+	if (!kbuf->buffer) {
+		ret = -ENOMEM;
+		goto out;
+	}
+
+	/* Add buffer to segment list. */
+	ret = kexec_add_buffer(kbuf);
+	if (ret)
+		goto out;
+
+	/* Load executable. */
+	prev_load_seg = NULL;
+	entry = ehdr->e_entry;
+	dst = kbuf->buffer;
+	dst_mem = kbuf->mem;
+	for (phdr = phdrs; phdr < phdrs + ehdr->e_phnum; phdr++) {
+		const void *src;
+		int i;
+		unsigned long p_offset;
+
+		if (phdr->p_type != PT_LOAD)
 			continue;
 
-		sechdrs[i].sh_offset = (unsigned long)pi->ehdr +
-						sechdrs[i].sh_offset;
+		if (!prev_load_seg)
+			p_offset = phdr->p_offset;
+		else
+			p_offset = phdr->p_paddr - prev_load_seg->p_paddr;
+
+		src = (const void *) ehdr + phdr->p_offset;
+		dst += p_offset;
+		dst_mem += p_offset;
+		memcpy(dst, src, phdr->p_filesz);
+
+		/*
+		 * If the entry point is in this segment, find its final
+		 * location in memory.
+		 */
+		if (entry >= phdr->p_paddr &&
+		    entry < phdr->p_paddr + phdr->p_memsz)
+			entry = dst_mem + entry - phdr->p_paddr;
+
+		/*
+		 * Find sections within this segment and update their
+		 * ->sh_offset to point to within the buffer.
+		 */
+		for (i = 0; i < ehdr->e_shnum; i++) {
+			Elf_Shdr *sechdr = &sechdrs[i];
+			Elf_Addr start = sechdr->sh_addr;
+			Elf_Addr size = sechdr->sh_size;
+
+			if (!(sechdr->sh_flags & SHF_ALLOC))
+				continue;
+
+			if (start >= phdr->p_paddr &&
+			    start + size <= phdr->p_paddr + phdr->p_memsz) {
+				unsigned long s_offset = start - phdr->p_paddr;
+				void *p = dst + s_offset;
+
+				sechdr->sh_addr = dst_mem + s_offset;
+				sechdr->sh_offset = (unsigned long long) p;
+			}
+		}
+
+		prev_load_seg = phdr;
 	}
 
+	*entry_addr = entry;
+
+	return 0;
+out:
+	vfree(kbuf->buffer);
+
+	return ret;
+}
+#else /* CONFIG_HAVE_KEXEC_FILE_PIE_PURGATORY */
+/*
+ * Load relocatable object purgatory using the section header information.
+ * A lot of code taken from kexec-tools.
+ */
+static int __kexec_really_load_purgatory(const Elf_Ehdr *ehdr,
+					 Elf_Shdr *sechdrs,
+					 struct kexec_buf *kbuf,
+					 unsigned long *entry_addr)
+{
+	unsigned long align, bss_align, bss_sz, bss_pad;
+	unsigned long entry, load_addr, curr_load_addr, bss_addr, offset;
+	unsigned char *buf_addr, *src;
+	int i, ret, entry_sidx = -1;
+
 	/*
 	 * Identify entry point section and make entry relative to section
 	 * start.
 	 */
-	entry = pi->ehdr->e_entry;
-	for (i = 0; i < pi->ehdr->e_shnum; i++) {
+	entry = ehdr->e_entry;
+	for (i = 0; i < ehdr->e_shnum; i++) {
 		if (!(sechdrs[i].sh_flags & SHF_ALLOC))
 			continue;
 
@@ -702,9 +773,9 @@ static int __kexec_load_purgatory(struct kimage *image, unsigned long min,
 			continue;
 
 		/* Make entry section relative */
-		if (sechdrs[i].sh_addr <= pi->ehdr->e_entry &&
+		if (sechdrs[i].sh_addr <= ehdr->e_entry &&
 		    ((sechdrs[i].sh_addr + sechdrs[i].sh_size) >
-		     pi->ehdr->e_entry)) {
+		     ehdr->e_entry)) {
 			entry_sidx = i;
 			entry -= sechdrs[i].sh_addr;
 			break;
@@ -715,16 +786,16 @@ static int __kexec_load_purgatory(struct kimage *image, unsigned long min,
 	bss_align = 1;
 	bss_sz = 0;
 
-	for (i = 0; i < pi->ehdr->e_shnum; i++) {
+	for (i = 0; i < ehdr->e_shnum; i++) {
 		if (!(sechdrs[i].sh_flags & SHF_ALLOC))
 			continue;
 
 		align = sechdrs[i].sh_addralign;
 		if (sechdrs[i].sh_type != SHT_NOBITS) {
-			if (kbuf.buf_align < align)
-				kbuf.buf_align = align;
-			kbuf.bufsz = ALIGN(kbuf.bufsz, align);
-			kbuf.bufsz += sechdrs[i].sh_size;
+			if (kbuf->buf_align < align)
+				kbuf->buf_align = align;
+			kbuf->bufsz = ALIGN(kbuf->bufsz, align);
+			kbuf->bufsz += sechdrs[i].sh_size;
 		} else {
 			/* bss section */
 			if (bss_align < align)
@@ -736,33 +807,32 @@ static int __kexec_load_purgatory(struct kimage *image, unsigned long min,
 
 	/* Determine the bss padding required to align bss properly */
 	bss_pad = 0;
-	if (kbuf.bufsz & (bss_align - 1))
-		bss_pad = bss_align - (kbuf.bufsz & (bss_align - 1));
+	if (kbuf->bufsz & (bss_align - 1))
+		bss_pad = bss_align - (kbuf->bufsz & (bss_align - 1));
 
-	kbuf.memsz = kbuf.bufsz + bss_pad + bss_sz;
+	kbuf->memsz = kbuf->bufsz + bss_pad + bss_sz;
 
 	/* Allocate buffer for purgatory */
-	kbuf.buffer = vzalloc(kbuf.bufsz);
-	if (!kbuf.buffer) {
+	kbuf->buffer = vzalloc(kbuf->bufsz);
+	if (!kbuf->buffer) {
 		ret = -ENOMEM;
 		goto out;
 	}
 
-	if (kbuf.buf_align < bss_align)
-		kbuf.buf_align = bss_align;
+	if (kbuf->buf_align < bss_align)
+		kbuf->buf_align = bss_align;
 
 	/* Add buffer to segment list */
-	ret = kexec_add_buffer(&kbuf);
+	ret = kexec_add_buffer(kbuf);
 	if (ret)
 		goto out;
-	pi->purgatory_load_addr = kbuf.mem;
 
 	/* Load SHF_ALLOC sections */
-	buf_addr = kbuf.buffer;
-	load_addr = curr_load_addr = pi->purgatory_load_addr;
-	bss_addr = load_addr + kbuf.bufsz + bss_pad;
+	buf_addr = kbuf->buffer;
+	load_addr = curr_load_addr = kbuf->mem;
+	bss_addr = load_addr + kbuf->bufsz + bss_pad;
 
-	for (i = 0; i < pi->ehdr->e_shnum; i++) {
+	for (i = 0; i < ehdr->e_shnum; i++) {
 		if (!(sechdrs[i].sh_flags & SHF_ALLOC))
 			continue;
 
@@ -796,8 +866,76 @@ static int __kexec_load_purgatory(struct kimage *image, unsigned long min,
 	if (entry_sidx >= 0)
 		entry += sechdrs[entry_sidx].sh_addr;
 
-	/* Make kernel jump to purgatory after shutdown */
-	image->start = entry;
+	*entry_addr = entry;
+
+	return 0;
+out:
+	vfree(kbuf->buffer);
+
+	return ret;
+}
+#endif /* CONFIG_HAVE_KEXEC_FILE_PIE_PURGATORY */
+
+/*
+ * Adjust the section headers to point to the temporary copy of their contents
+ * and load the purgatory binary.
+ */
+static int __kexec_load_purgatory(struct kimage *image, unsigned long min,
+				  unsigned long max, int top_down)
+{
+	struct purgatory_info *pi = &image->purgatory_info;
+	const Elf_Shdr *sechdrs_c;
+	Elf_Shdr *sechdrs;
+	int i, ret;
+	struct kexec_buf kbuf = { .image = image, .buf_align = 1,
+				  .buf_min = min, .buf_max = max,
+				  .top_down = top_down };
+
+	/*
+	 * sechdrs_c points to section headers in purgatory and are read
+	 * only. No modifications allowed.
+	 */
+	sechdrs_c = (void *)pi->ehdr + pi->ehdr->e_shoff;
+
+	/*
+	 * We can not modify sechdrs_c[] and its fields. It is read only.
+	 * Copy it over to a local copy where one can store some temporary
+	 * data and free it at the end. We need to modify ->sh_addr and
+	 * ->sh_offset fields to keep track of permanent and temporary
+	 * locations of sections.
+	 */
+	sechdrs = vzalloc(pi->ehdr->e_shnum * sizeof(Elf_Shdr));
+	if (!sechdrs)
+		return -ENOMEM;
+
+	memcpy(sechdrs, sechdrs_c, pi->ehdr->e_shnum * sizeof(Elf_Shdr));
+
+	/*
+	 * We seem to have multiple copies of sections. First copy is which
+	 * is embedded in kernel in read only section. Some of these sections
+	 * will be copied to a temporary buffer and relocated. And these
+	 * sections will finally be copied to their final destination at
+	 * segment load time.
+	 *
+	 * Use ->sh_offset to reflect section address in memory. It will
+	 * point to original read only copy if section is not allocatable.
+	 * Otherwise it will point to temporary copy which will be relocated.
+	 *
+	 * Use ->sh_addr to contain final address of the section where it
+	 * will go during execution time.
+	 */
+	for (i = 0; i < pi->ehdr->e_shnum; i++) {
+		if (sechdrs[i].sh_type == SHT_NOBITS)
+			continue;
+
+		sechdrs[i].sh_offset = (unsigned long)pi->ehdr +
+						sechdrs[i].sh_offset;
+	}
+
+	ret = __kexec_really_load_purgatory(pi->ehdr, sechdrs, &kbuf,
+					    &image->start);
+	if (ret)
+		goto out;
 
 	/* Used later to get/set symbol values */
 	pi->sechdrs = sechdrs;
@@ -807,10 +945,12 @@ static int __kexec_load_purgatory(struct kimage *image, unsigned long min,
 	 * from checksumming.
 	 */
 	pi->purgatory_buf = kbuf.buffer;
-	return ret;
+	pi->purgatory_load_addr = kbuf.mem;
+
+	return 0;
 out:
 	vfree(sechdrs);
-	vfree(kbuf.buffer);
+
 	return ret;
 }
 
@@ -886,7 +1026,7 @@ int kexec_load_purgatory(struct kimage *image, unsigned long min,
 	pi->ehdr = (Elf_Ehdr *)kexec_purgatory;
 
 	if (memcmp(pi->ehdr->e_ident, ELFMAG, SELFMAG) != 0
-	    || pi->ehdr->e_type != ET_REL
+	    || pi->ehdr->e_type != PURGATORY_ELF_TYPE
 	    || !elf_check_arch(pi->ehdr)
 	    || pi->ehdr->e_shentsize != sizeof(Elf_Shdr))
 		return -ENOEXEC;
@@ -915,6 +1055,28 @@ int kexec_load_purgatory(struct kimage *image, unsigned long min,
 	return ret;
 }
 
+/**
+ * sym_value_offset - return symbol value as a section-relative offset
+ *
+ * In position-independent executables the symbol value is an absolute address,
+ * so convert it to a section-relative offset. In relocatable objects the symbol
+ * value already is a section-relative offset.
+ */
+static Elf_Addr sym_value_offset(struct purgatory_info *pi, Elf_Sym *sym)
+{
+	Elf_Addr base_addr;
+
+	if (IS_ENABLED(CONFIG_HAVE_KEXEC_FILE_PIE_PURGATORY)) {
+		const Elf_Shdr *sechdrs_c;
+
+		sechdrs_c = (const void *) pi->ehdr + pi->ehdr->e_shoff;
+		base_addr = sechdrs_c[sym->st_shndx].sh_addr;
+	} else
+		base_addr = 0;
+
+	return sym->st_value - base_addr;
+}
+
 static Elf_Sym *kexec_purgatory_find_symbol(struct purgatory_info *pi,
 					    const char *name)
 {
@@ -979,7 +1141,7 @@ void *kexec_purgatory_get_symbol_addr(struct kimage *image, const char *name)
 	 * Returns the address where symbol will finally be loaded after
 	 * kexec_load_segment()
 	 */
-	return (void *)(sechdr->sh_addr + sym->st_value);
+	return (void *)(sechdr->sh_addr + sym_value_offset(pi, sym));
 }
 
 /*
@@ -1013,7 +1175,7 @@ int kexec_purgatory_get_set_symbol(struct kimage *image, const char *name,
 	}
 
 	sym_buf = (unsigned char *)sechdrs[sym->st_shndx].sh_offset +
-					sym->st_value;
+					sym_value_offset(pi, sym);
 
 	if (get_value)
 		memcpy((void *)buf, sym_buf, size);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v10 05/10] powerpc: Change places using CONFIG_KEXEC to use CONFIG_KEXEC_CORE instead.
  2016-11-10  3:27 [PATCH v10 00/10] kexec_file_load implementation for PowerPC Thiago Jung Bauermann
                   ` (3 preceding siblings ...)
  2016-11-10  3:27 ` [PATCH v10 04/10] kexec_file: Add support for purgatory built as PIE Thiago Jung Bauermann
@ 2016-11-10  3:27 ` Thiago Jung Bauermann
  2016-11-10  3:27 ` [PATCH v10 06/10] powerpc: Implement kexec_file_load Thiago Jung Bauermann
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 20+ messages in thread
From: Thiago Jung Bauermann @ 2016-11-10  3:27 UTC (permalink / raw)
  To: kexec
  Cc: linuxppc-dev, linux-kernel, x86, Eric Biederman, Dave Young,
	Vivek Goyal, Baoquan He, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, Stewart Smith,
	Mimi Zohar, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Andrew Morton, Stephen Rothwell, Thiago Jung Bauermann

Commit 2965faa5e03d ("kexec: split kexec_load syscall from kexec core
code") introduced CONFIG_KEXEC_CORE so that CONFIG_KEXEC means whether
the kexec_load system call should be compiled-in and CONFIG_KEXEC_FILE
means whether the kexec_file_load system call should be compiled-in.
These options can be set independently from each other.

Since until now powerpc only supported kexec_load, CONFIG_KEXEC and
CONFIG_KEXEC_CORE were synonyms. That is not the case anymore, so we
need to make a distinction. Almost all places where CONFIG_KEXEC was
being used should be using CONFIG_KEXEC_CORE instead, since
kexec_file_load also needs that code compiled in.

Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
---
 arch/powerpc/Kconfig                          | 2 +-
 arch/powerpc/include/asm/debug.h              | 2 +-
 arch/powerpc/include/asm/kexec.h              | 6 +++---
 arch/powerpc/include/asm/machdep.h            | 4 ++--
 arch/powerpc/include/asm/smp.h                | 2 +-
 arch/powerpc/kernel/Makefile                  | 4 ++--
 arch/powerpc/kernel/head_64.S                 | 2 +-
 arch/powerpc/kernel/misc_32.S                 | 2 +-
 arch/powerpc/kernel/misc_64.S                 | 6 +++---
 arch/powerpc/kernel/prom.c                    | 2 +-
 arch/powerpc/kernel/setup_64.c                | 4 ++--
 arch/powerpc/kernel/smp.c                     | 6 +++---
 arch/powerpc/kernel/traps.c                   | 2 +-
 arch/powerpc/platforms/85xx/corenet_generic.c | 2 +-
 arch/powerpc/platforms/85xx/smp.c             | 8 ++++----
 arch/powerpc/platforms/cell/spu_base.c        | 2 +-
 arch/powerpc/platforms/powernv/setup.c        | 6 +++---
 arch/powerpc/platforms/ps3/setup.c            | 4 ++--
 arch/powerpc/platforms/pseries/Makefile       | 2 +-
 arch/powerpc/platforms/pseries/setup.c        | 4 ++--
 20 files changed, 36 insertions(+), 36 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 65fba4c34cd7..6cb59c6e5ba4 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -489,7 +489,7 @@ config CRASH_DUMP
 
 config FA_DUMP
 	bool "Firmware-assisted dump"
-	depends on PPC64 && PPC_RTAS && CRASH_DUMP && KEXEC
+	depends on PPC64 && PPC_RTAS && CRASH_DUMP && KEXEC_CORE
 	help
 	  A robust mechanism to get reliable kernel crash dump with
 	  assistance from firmware. This approach does not use kexec,
diff --git a/arch/powerpc/include/asm/debug.h b/arch/powerpc/include/asm/debug.h
index a954e4975049..86308f177f2d 100644
--- a/arch/powerpc/include/asm/debug.h
+++ b/arch/powerpc/include/asm/debug.h
@@ -10,7 +10,7 @@ struct pt_regs;
 
 extern struct dentry *powerpc_debugfs_root;
 
-#if defined(CONFIG_DEBUGGER) || defined(CONFIG_KEXEC)
+#if defined(CONFIG_DEBUGGER) || defined(CONFIG_KEXEC_CORE)
 
 extern int (*__debugger)(struct pt_regs *regs);
 extern int (*__debugger_ipi)(struct pt_regs *regs);
diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index a46f5f45570c..eca2f975bf44 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -53,7 +53,7 @@
 
 typedef void (*crash_shutdown_t)(void);
 
-#ifdef CONFIG_KEXEC
+#ifdef CONFIG_KEXEC_CORE
 
 /*
  * This function is responsible for capturing register states if coming
@@ -91,7 +91,7 @@ static inline bool kdump_in_progress(void)
 	return crashing_cpu >= 0;
 }
 
-#else /* !CONFIG_KEXEC */
+#else /* !CONFIG_KEXEC_CORE */
 static inline void crash_kexec_secondary(struct pt_regs *regs) { }
 
 static inline int overlaps_crashkernel(unsigned long start, unsigned long size)
@@ -116,7 +116,7 @@ static inline bool kdump_in_progress(void)
 	return false;
 }
 
-#endif /* CONFIG_KEXEC */
+#endif /* CONFIG_KEXEC_CORE */
 #endif /* ! __ASSEMBLY__ */
 #endif /* __KERNEL__ */
 #endif /* _ASM_POWERPC_KEXEC_H */
diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
index e02cbc6a6c70..5011b69107a7 100644
--- a/arch/powerpc/include/asm/machdep.h
+++ b/arch/powerpc/include/asm/machdep.h
@@ -183,7 +183,7 @@ struct machdep_calls {
 	 */
 	void (*machine_shutdown)(void);
 
-#ifdef CONFIG_KEXEC
+#ifdef CONFIG_KEXEC_CORE
 	void (*kexec_cpu_down)(int crash_shutdown, int secondary);
 
 	/* Called to do what every setup is needed on image and the
@@ -198,7 +198,7 @@ struct machdep_calls {
 	 * no return.
 	 */
 	void (*machine_kexec)(struct kimage *image);
-#endif /* CONFIG_KEXEC */
+#endif /* CONFIG_KEXEC_CORE */
 
 #ifdef CONFIG_SUSPEND
 	/* These are called to disable and enable, respectively, IRQs when
diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h
index 0d02c11dc331..32db16d2e7ad 100644
--- a/arch/powerpc/include/asm/smp.h
+++ b/arch/powerpc/include/asm/smp.h
@@ -176,7 +176,7 @@ static inline void set_hard_smp_processor_id(int cpu, int phys)
 #endif /* !CONFIG_SMP */
 #endif /* !CONFIG_PPC64 */
 
-#if defined(CONFIG_PPC64) && (defined(CONFIG_SMP) || defined(CONFIG_KEXEC))
+#if defined(CONFIG_PPC64) && (defined(CONFIG_SMP) || defined(CONFIG_KEXEC_CORE))
 extern void smp_release_cpus(void);
 #else
 static inline void smp_release_cpus(void) { };
diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index 1925341dbb9c..22534a56c914 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -107,7 +107,7 @@ pci64-$(CONFIG_PPC64)		+= pci_dn.o pci-hotplug.o isa-bridge.o
 obj-$(CONFIG_PCI)		+= pci_$(BITS).o $(pci64-y) \
 				   pci-common.o pci_of_scan.o
 obj-$(CONFIG_PCI_MSI)		+= msi.o
-obj-$(CONFIG_KEXEC)		+= machine_kexec.o crash.o \
+obj-$(CONFIG_KEXEC_CORE)	+= machine_kexec.o crash.o \
 				   machine_kexec_$(BITS).o
 obj-$(CONFIG_AUDIT)		+= audit.o
 obj64-$(CONFIG_AUDIT)		+= compat_audit.o
@@ -128,7 +128,7 @@ obj64-$(CONFIG_PPC_TRANSACTIONAL_MEM)	+= tm.o
 obj-$(CONFIG_PPC64)		+= $(obj64-y)
 obj-$(CONFIG_PPC32)		+= $(obj32-y)
 
-ifneq ($(CONFIG_XMON)$(CONFIG_KEXEC),)
+ifneq ($(CONFIG_XMON)$(CONFIG_KEXEC_CORE),)
 obj-y				+= ppc_save_regs.o
 endif
 
diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
index 04c546e20cc0..d3d482b628a6 100644
--- a/arch/powerpc/kernel/head_64.S
+++ b/arch/powerpc/kernel/head_64.S
@@ -153,7 +153,7 @@ __secondary_hold:
 	cmpdi	0,r12,0
 	beq	100b
 
-#if defined(CONFIG_SMP) || defined(CONFIG_KEXEC)
+#if defined(CONFIG_SMP) || defined(CONFIG_KEXEC_CORE)
 #ifdef CONFIG_PPC_BOOK3E
 	tovirt(r12,r12)
 #endif
diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
index 93cf7a5846a6..1863324c6a3c 100644
--- a/arch/powerpc/kernel/misc_32.S
+++ b/arch/powerpc/kernel/misc_32.S
@@ -614,7 +614,7 @@ _GLOBAL(start_secondary_resume)
 _GLOBAL(__main)
 	blr
 
-#ifdef CONFIG_KEXEC
+#ifdef CONFIG_KEXEC_CORE
 	/*
 	 * Must be relocatable PIC code callable as a C function.
 	 */
diff --git a/arch/powerpc/kernel/misc_64.S b/arch/powerpc/kernel/misc_64.S
index 4f178671f230..32be2a844947 100644
--- a/arch/powerpc/kernel/misc_64.S
+++ b/arch/powerpc/kernel/misc_64.S
@@ -478,7 +478,7 @@ _GLOBAL(kexec_wait)
 	addi	r5,r5,kexec_flag-1b
 
 99:	HMT_LOW
-#ifdef CONFIG_KEXEC		/* use no memory without kexec */
+#ifdef CONFIG_KEXEC_CORE	/* use no memory without kexec */
 	lwz	r4,0(r5)
 	cmpwi	0,r4,0
 	beq	99b
@@ -503,7 +503,7 @@ kexec_flag:
 	.long	0
 
 
-#ifdef CONFIG_KEXEC
+#ifdef CONFIG_KEXEC_CORE
 #ifdef CONFIG_PPC_BOOK3E
 /*
  * BOOK3E has no real MMU mode, so we have to setup the initial TLB
@@ -716,4 +716,4 @@ _GLOBAL(kexec_sequence)
 	mtlr	4
 	li	r5,0
 	blr	/* image->start(physid, image->start, 0); */
-#endif /* CONFIG_KEXEC */
+#endif /* CONFIG_KEXEC_CORE */
diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index b0245bed6f54..fa7aea479fba 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -427,7 +427,7 @@ static int __init early_init_dt_scan_chosen_ppc(unsigned long node,
 		tce_alloc_end = *lprop;
 #endif
 
-#ifdef CONFIG_KEXEC
+#ifdef CONFIG_KEXEC_CORE
 	lprop = of_get_flat_dt_prop(node, "linux,crashkernel-base", NULL);
 	if (lprop)
 		crashk_res.start = *lprop;
diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index 7ac8e6eaab5b..c3e129080c31 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -346,7 +346,7 @@ void early_setup_secondary(void)
 
 #endif /* CONFIG_SMP */
 
-#if defined(CONFIG_SMP) || defined(CONFIG_KEXEC)
+#if defined(CONFIG_SMP) || defined(CONFIG_KEXEC_CORE)
 static bool use_spinloop(void)
 {
 	if (!IS_ENABLED(CONFIG_PPC_BOOK3E))
@@ -391,7 +391,7 @@ void smp_release_cpus(void)
 
 	DBG(" <- smp_release_cpus()\n");
 }
-#endif /* CONFIG_SMP || CONFIG_KEXEC */
+#endif /* CONFIG_SMP || CONFIG_KEXEC_CORE */
 
 /*
  * Initialize some remaining members of the ppc64_caches and systemcfg
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 9c6f3fd58059..893bd7f79be6 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -193,7 +193,7 @@ int smp_request_message_ipi(int virq, int msg)
 	if (msg < 0 || msg > PPC_MSG_DEBUGGER_BREAK) {
 		return -EINVAL;
 	}
-#if !defined(CONFIG_DEBUGGER) && !defined(CONFIG_KEXEC)
+#if !defined(CONFIG_DEBUGGER) && !defined(CONFIG_KEXEC_CORE)
 	if (msg == PPC_MSG_DEBUGGER_BREAK) {
 		return 1;
 	}
@@ -325,7 +325,7 @@ void tick_broadcast(const struct cpumask *mask)
 }
 #endif
 
-#if defined(CONFIG_DEBUGGER) || defined(CONFIG_KEXEC)
+#if defined(CONFIG_DEBUGGER) || defined(CONFIG_KEXEC_CORE)
 void smp_send_debugger_break(void)
 {
 	int cpu;
@@ -340,7 +340,7 @@ void smp_send_debugger_break(void)
 }
 #endif
 
-#ifdef CONFIG_KEXEC
+#ifdef CONFIG_KEXEC_CORE
 void crash_send_ipi(void (*crash_ipi_callback)(struct pt_regs *))
 {
 	crash_ipi_function_ptr = crash_ipi_callback;
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index 023a462725b5..d26605d950fe 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -65,7 +65,7 @@
 #include <asm/hmi.h>
 #include <sysdev/fsl_pci.h>
 
-#if defined(CONFIG_DEBUGGER) || defined(CONFIG_KEXEC)
+#if defined(CONFIG_DEBUGGER) || defined(CONFIG_KEXEC_CORE)
 int (*__debugger)(struct pt_regs *regs) __read_mostly;
 int (*__debugger_ipi)(struct pt_regs *regs) __read_mostly;
 int (*__debugger_bpt)(struct pt_regs *regs) __read_mostly;
diff --git a/arch/powerpc/platforms/85xx/corenet_generic.c b/arch/powerpc/platforms/85xx/corenet_generic.c
index 1179115a4b5c..3803b0addf65 100644
--- a/arch/powerpc/platforms/85xx/corenet_generic.c
+++ b/arch/powerpc/platforms/85xx/corenet_generic.c
@@ -220,7 +220,7 @@ define_machine(corenet_generic) {
  *
  * Likewise, problems have been seen with kexec when coreint is enabled.
  */
-#if defined(CONFIG_HOTPLUG_CPU) || defined(CONFIG_KEXEC)
+#if defined(CONFIG_HOTPLUG_CPU) || defined(CONFIG_KEXEC_CORE)
 	.get_irq		= mpic_get_irq,
 #else
 	.get_irq		= mpic_get_coreint_irq,
diff --git a/arch/powerpc/platforms/85xx/smp.c b/arch/powerpc/platforms/85xx/smp.c
index fe9f19e5e935..a83a6d26090d 100644
--- a/arch/powerpc/platforms/85xx/smp.c
+++ b/arch/powerpc/platforms/85xx/smp.c
@@ -349,13 +349,13 @@ struct smp_ops_t smp_85xx_ops = {
 	.cpu_disable	= generic_cpu_disable,
 	.cpu_die	= generic_cpu_die,
 #endif
-#if defined(CONFIG_KEXEC) && !defined(CONFIG_PPC64)
+#if defined(CONFIG_KEXEC_CORE) && !defined(CONFIG_PPC64)
 	.give_timebase	= smp_generic_give_timebase,
 	.take_timebase	= smp_generic_take_timebase,
 #endif
 };
 
-#ifdef CONFIG_KEXEC
+#ifdef CONFIG_KEXEC_CORE
 #ifdef CONFIG_PPC32
 atomic_t kexec_down_cpus = ATOMIC_INIT(0);
 
@@ -458,7 +458,7 @@ static void mpc85xx_smp_machine_kexec(struct kimage *image)
 
 	default_machine_kexec(image);
 }
-#endif /* CONFIG_KEXEC */
+#endif /* CONFIG_KEXEC_CORE */
 
 static void smp_85xx_basic_setup(int cpu_nr)
 {
@@ -512,7 +512,7 @@ void __init mpc85xx_smp_init(void)
 #endif
 	smp_ops = &smp_85xx_ops;
 
-#ifdef CONFIG_KEXEC
+#ifdef CONFIG_KEXEC_CORE
 	ppc_md.kexec_cpu_down = mpc85xx_smp_kexec_cpu_down;
 	ppc_md.machine_kexec = mpc85xx_smp_machine_kexec;
 #endif
diff --git a/arch/powerpc/platforms/cell/spu_base.c b/arch/powerpc/platforms/cell/spu_base.c
index e84d8fbc2e21..96c2b8a40630 100644
--- a/arch/powerpc/platforms/cell/spu_base.c
+++ b/arch/powerpc/platforms/cell/spu_base.c
@@ -676,7 +676,7 @@ static ssize_t spu_stat_show(struct device *dev,
 
 static DEVICE_ATTR(stat, 0444, spu_stat_show, NULL);
 
-#ifdef CONFIG_KEXEC
+#ifdef CONFIG_KEXEC_CORE
 
 struct crash_spu_info {
 	struct spu *spu;
diff --git a/arch/powerpc/platforms/powernv/setup.c b/arch/powerpc/platforms/powernv/setup.c
index efe8b6bb168b..d50c7d99baaf 100644
--- a/arch/powerpc/platforms/powernv/setup.c
+++ b/arch/powerpc/platforms/powernv/setup.c
@@ -174,7 +174,7 @@ static void pnv_shutdown(void)
 	opal_shutdown();
 }
 
-#ifdef CONFIG_KEXEC
+#ifdef CONFIG_KEXEC_CORE
 static void pnv_kexec_wait_secondaries_down(void)
 {
 	int my_cpu, i, notified = -1;
@@ -245,7 +245,7 @@ static void pnv_kexec_cpu_down(int crash_shutdown, int secondary)
 		opal_reinit_cpus(OPAL_REINIT_CPUS_HILE_BE);
 	}
 }
-#endif /* CONFIG_KEXEC */
+#endif /* CONFIG_KEXEC_CORE */
 
 #ifdef CONFIG_MEMORY_HOTPLUG_SPARSE
 static unsigned long pnv_memory_block_size(void)
@@ -311,7 +311,7 @@ define_machine(powernv) {
 	.machine_shutdown	= pnv_shutdown,
 	.power_save             = NULL,
 	.calibrate_decr		= generic_calibrate_decr,
-#ifdef CONFIG_KEXEC
+#ifdef CONFIG_KEXEC_CORE
 	.kexec_cpu_down		= pnv_kexec_cpu_down,
 #endif
 #ifdef CONFIG_MEMORY_HOTPLUG_SPARSE
diff --git a/arch/powerpc/platforms/ps3/setup.c b/arch/powerpc/platforms/ps3/setup.c
index 3a487e7f4a5e..6244bc849469 100644
--- a/arch/powerpc/platforms/ps3/setup.c
+++ b/arch/powerpc/platforms/ps3/setup.c
@@ -250,7 +250,7 @@ static int __init ps3_probe(void)
 	return 1;
 }
 
-#if defined(CONFIG_KEXEC)
+#if defined(CONFIG_KEXEC_CORE)
 static void ps3_kexec_cpu_down(int crash_shutdown, int secondary)
 {
 	int cpu = smp_processor_id();
@@ -276,7 +276,7 @@ define_machine(ps3) {
 	.progress			= ps3_progress,
 	.restart			= ps3_restart,
 	.halt				= ps3_halt,
-#if defined(CONFIG_KEXEC)
+#if defined(CONFIG_KEXEC_CORE)
 	.kexec_cpu_down			= ps3_kexec_cpu_down,
 #endif
 };
diff --git a/arch/powerpc/platforms/pseries/Makefile b/arch/powerpc/platforms/pseries/Makefile
index fedc2ccf029d..dd9d9c2ba71b 100644
--- a/arch/powerpc/platforms/pseries/Makefile
+++ b/arch/powerpc/platforms/pseries/Makefile
@@ -8,7 +8,7 @@ obj-y			:= lpar.o hvCall.o nvram.o reconfig.o \
 			   pci.o pci_dlpar.o eeh_pseries.o msi.o
 obj-$(CONFIG_SMP)	+= smp.o
 obj-$(CONFIG_SCANLOG)	+= scanlog.o
-obj-$(CONFIG_KEXEC)	+= kexec.o
+obj-$(CONFIG_KEXEC_CORE)	+= kexec.o
 obj-$(CONFIG_PSERIES_ENERGY)	+= pseries_energy.o
 
 obj-$(CONFIG_HOTPLUG_CPU)	+= hotplug-cpu.o
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index 97aa3f332f24..7736352f7279 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -367,7 +367,7 @@ void pseries_disable_reloc_on_exc(void)
 }
 EXPORT_SYMBOL(pseries_disable_reloc_on_exc);
 
-#ifdef CONFIG_KEXEC
+#ifdef CONFIG_KEXEC_CORE
 static void pSeries_machine_kexec(struct kimage *image)
 {
 	if (firmware_has_feature(FW_FEATURE_SET_MODE))
@@ -725,7 +725,7 @@ define_machine(pseries) {
 	.progress		= rtas_progress,
 	.system_reset_exception = pSeries_system_reset_exception,
 	.machine_check_exception = pSeries_machine_check_exception,
-#ifdef CONFIG_KEXEC
+#ifdef CONFIG_KEXEC_CORE
 	.machine_kexec          = pSeries_machine_kexec,
 	.kexec_cpu_down         = pseries_kexec_cpu_down,
 #endif
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v10 06/10] powerpc: Implement kexec_file_load.
  2016-11-10  3:27 [PATCH v10 00/10] kexec_file_load implementation for PowerPC Thiago Jung Bauermann
                   ` (4 preceding siblings ...)
  2016-11-10  3:27 ` [PATCH v10 05/10] powerpc: Change places using CONFIG_KEXEC to use CONFIG_KEXEC_CORE instead Thiago Jung Bauermann
@ 2016-11-10  3:27 ` Thiago Jung Bauermann
  2016-11-10  3:27 ` [PATCH v10 07/10] powerpc: Add functions to read ELF files of any endianness Thiago Jung Bauermann
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 20+ messages in thread
From: Thiago Jung Bauermann @ 2016-11-10  3:27 UTC (permalink / raw)
  To: kexec
  Cc: linuxppc-dev, linux-kernel, x86, Eric Biederman, Dave Young,
	Vivek Goyal, Baoquan He, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, Stewart Smith,
	Mimi Zohar, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Andrew Morton, Stephen Rothwell, Thiago Jung Bauermann,
	Josh Sklar

Add arch-specific functions needed by the generic kexec_file code.

Signed-off-by: Josh Sklar <sklar@linux.vnet.ibm.com>
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
---
 arch/powerpc/Kconfig                        |  14 ++
 arch/powerpc/include/asm/systbl.h           |   1 +
 arch/powerpc/include/asm/unistd.h           |   2 +-
 arch/powerpc/include/uapi/asm/unistd.h      |   1 +
 arch/powerpc/kernel/Makefile                |   1 +
 arch/powerpc/kernel/machine_kexec_file_64.c | 301 ++++++++++++++++++++++++++++
 6 files changed, 319 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 6cb59c6e5ba4..a5a7bcf30c05 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -455,6 +455,20 @@ config KEXEC
 	  interface is strongly in flux, so no good recommendation can be
 	  made.
 
+config KEXEC_FILE
+	bool "kexec file based system call"
+	select KEXEC_CORE
+	select HAVE_KEXEC_FILE_PIE_PURGATORY
+	select BUILD_BIN2C
+	depends on PPC64
+	depends on CRYPTO=y
+	depends on CRYPTO_SHA256=y
+	help
+	  This is a new version of the kexec system call. This call is
+	  file based and takes in file descriptors as system call arguments
+	  for kernel and initramfs as opposed to a list of segments as is the
+	  case for the older kexec call.
+
 config RELOCATABLE
 	bool "Build a relocatable kernel"
 	depends on (PPC64 && !COMPILE_TEST) || (FLATMEM && (44x || FSL_BOOKE))
diff --git a/arch/powerpc/include/asm/systbl.h b/arch/powerpc/include/asm/systbl.h
index 2fc5d4db503c..4b369d83fe9c 100644
--- a/arch/powerpc/include/asm/systbl.h
+++ b/arch/powerpc/include/asm/systbl.h
@@ -386,3 +386,4 @@ SYSCALL(mlock2)
 SYSCALL(copy_file_range)
 COMPAT_SYS_SPU(preadv2)
 COMPAT_SYS_SPU(pwritev2)
+SYSCALL(kexec_file_load)
diff --git a/arch/powerpc/include/asm/unistd.h b/arch/powerpc/include/asm/unistd.h
index cf12c580f6b2..a01e97d3f305 100644
--- a/arch/powerpc/include/asm/unistd.h
+++ b/arch/powerpc/include/asm/unistd.h
@@ -12,7 +12,7 @@
 #include <uapi/asm/unistd.h>
 
 
-#define NR_syscalls		382
+#define NR_syscalls		383
 
 #define __NR__exit __NR_exit
 
diff --git a/arch/powerpc/include/uapi/asm/unistd.h b/arch/powerpc/include/uapi/asm/unistd.h
index e9f5f41aa55a..2f26335a3c42 100644
--- a/arch/powerpc/include/uapi/asm/unistd.h
+++ b/arch/powerpc/include/uapi/asm/unistd.h
@@ -392,5 +392,6 @@
 #define __NR_copy_file_range	379
 #define __NR_preadv2		380
 #define __NR_pwritev2		381
+#define __NR_kexec_file_load	382
 
 #endif /* _UAPI_ASM_POWERPC_UNISTD_H_ */
diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index 22534a56c914..6de731d90bff 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -109,6 +109,7 @@ obj-$(CONFIG_PCI)		+= pci_$(BITS).o $(pci64-y) \
 obj-$(CONFIG_PCI_MSI)		+= msi.o
 obj-$(CONFIG_KEXEC_CORE)	+= machine_kexec.o crash.o \
 				   machine_kexec_$(BITS).o
+obj-$(CONFIG_KEXEC_FILE)	+= machine_kexec_file_$(BITS).o
 obj-$(CONFIG_AUDIT)		+= audit.o
 obj64-$(CONFIG_AUDIT)		+= compat_audit.o
 
diff --git a/arch/powerpc/kernel/machine_kexec_file_64.c b/arch/powerpc/kernel/machine_kexec_file_64.c
new file mode 100644
index 000000000000..172f6f736987
--- /dev/null
+++ b/arch/powerpc/kernel/machine_kexec_file_64.c
@@ -0,0 +1,301 @@
+/*
+ * ppc64 code to implement the kexec_file_load syscall
+ *
+ * Copyright (C) 2004  Adam Litke (agl@us.ibm.com)
+ * Copyright (C) 2004  IBM Corp.
+ * Copyright (C) 2005  R Sharada (sharada@in.ibm.com)
+ * Copyright (C) 2006  Mohan Kumar M (mohan@in.ibm.com)
+ * Copyright (C) 2016  IBM Corporation
+ *
+ * Based on kexec-tools' kexec-elf-ppc64.c.
+ * Heavily modified for the kernel by
+ * Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation (version 2 of the License).
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/slab.h>
+#include <linux/kexec.h>
+#include <linux/memblock.h>
+#include <linux/libfdt.h>
+
+#define SLAVE_CODE_SIZE		256
+
+static struct kexec_file_ops *kexec_file_loaders[] = { };
+
+int arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
+				  unsigned long buf_len)
+{
+	int i, ret = -ENOEXEC;
+	struct kexec_file_ops *fops;
+
+	/* We don't support crash kernels yet. */
+	if (image->type == KEXEC_TYPE_CRASH)
+		return -ENOTSUPP;
+
+	for (i = 0; i < ARRAY_SIZE(kexec_file_loaders); i++) {
+		fops = kexec_file_loaders[i];
+		if (!fops || !fops->probe)
+			continue;
+
+		ret = fops->probe(buf, buf_len);
+		if (!ret) {
+			image->fops = fops;
+			return ret;
+		}
+	}
+
+	return ret;
+}
+
+void *arch_kexec_kernel_image_load(struct kimage *image)
+{
+	if (!image->fops || !image->fops->load)
+		return ERR_PTR(-ENOEXEC);
+
+	return image->fops->load(image, image->kernel_buf,
+				 image->kernel_buf_len, image->initrd_buf,
+				 image->initrd_buf_len, image->cmdline_buf,
+				 image->cmdline_buf_len);
+}
+
+int arch_kimage_file_post_load_cleanup(struct kimage *image)
+{
+	if (!image->fops || !image->fops->cleanup)
+		return 0;
+
+	return image->fops->cleanup(image->image_loader_data);
+}
+
+/**
+ * arch_kexec_walk_mem - call func(data) for each unreserved memory block
+ * @kbuf:	Context info for the search. Also passed to @func.
+ * @func:	Function to call for each memory block.
+ *
+ * This function is used by kexec_add_buffer and kexec_locate_mem_hole
+ * to find unreserved memory to load kexec segments into.
+ *
+ * Return: The memory walk will stop when func returns a non-zero value
+ * and that value will be returned. If all free regions are visited without
+ * func returning non-zero, then zero will be returned.
+ */
+int arch_kexec_walk_mem(struct kexec_buf *kbuf, int (*func)(u64, u64, void *))
+{
+	int ret = 0;
+	u64 i;
+	phys_addr_t mstart, mend;
+
+	if (kbuf->top_down) {
+		for_each_free_mem_range_reverse(i, NUMA_NO_NODE, 0,
+						&mstart, &mend, NULL) {
+			/*
+			 * In memblock, end points to the first byte after the
+			 * range while in kexec, end points to the last byte
+			 * in the range.
+			 */
+			ret = func(mstart, mend - 1, kbuf);
+			if (ret)
+				break;
+		}
+	} else {
+		for_each_free_mem_range(i, NUMA_NO_NODE, 0, &mstart, &mend,
+					NULL) {
+			/*
+			 * In memblock, end points to the first byte after the
+			 * range while in kexec, end points to the last byte
+			 * in the range.
+			 */
+			ret = func(mstart, mend - 1, kbuf);
+			if (ret)
+				break;
+		}
+	}
+
+	return ret;
+}
+
+/**
+ * arch_kexec_apply_relocations_add - apply purgatory relocations
+ * @ehdr:	Pointer to ELF headers.
+ * @sechdrs:	Pointer to section headers.
+ * @relsec:	Section index of SHT_RELA section.
+ *
+ * Elf64_Shdr.sh_offset has been modified to keep the pointer to the section
+ * contents, while Elf64_Shdr.sh_addr points to the final address of the
+ * section in memory.
+ */
+int arch_kexec_apply_relocations_add(const Elf64_Ehdr *ehdr,
+				     Elf64_Shdr *sechdrs, unsigned int relsec)
+{
+	unsigned int i;
+	unsigned long reloc_type;
+	unsigned long *location;
+	unsigned long address;
+	unsigned long value;
+	const char *name;
+	Elf64_Sym *sym;
+	/* Section containing the relocation entries. */
+	Elf64_Shdr *rel_section = &sechdrs[relsec];
+	const Elf64_Rela *rela = (const Elf64_Rela *) rel_section->sh_offset;
+	/* Section to which relocations apply. */
+	Elf64_Shdr *target_section = &sechdrs[rel_section->sh_info];
+	/* Associated symbol table. */
+	Elf64_Shdr *symtabsec = &sechdrs[rel_section->sh_link];
+	void *syms_base = (void *) symtabsec->sh_offset;
+	void *loc_base = (void *) target_section->sh_offset;
+	Elf64_Addr addr_base = target_section->sh_addr;
+	Elf64_Addr orig_addr_base;
+	const Elf_Phdr *phdrs = (const void *) ehdr + ehdr->e_phoff;
+	const Elf_Phdr *phdr;
+	unsigned long sec_base;
+	unsigned long purgatory_load_addr;
+	unsigned long orig_load_addr;
+	const char *strtab;
+	const char *shstrtab;
+	const Elf_Shdr *sechdrs_c;
+
+	if (symtabsec->sh_link >= ehdr->e_shnum) {
+		/* Invalid strtab section number */
+		pr_err("Invalid string table section index %d\n",
+		       symtabsec->sh_link);
+		return -ENOEXEC;
+	}
+
+	/*
+	 * The original section header was modified by __kexec_load_purgatory
+	 * so that the ->sh_addr and ->sh_offset fields point to the permanent
+	 * and temporary locations of sections.
+	 */
+	sechdrs_c = (const void *) ehdr + ehdr->e_shoff;
+	orig_addr_base = sechdrs_c[rel_section->sh_info].sh_addr;
+
+	/* Find the address where the purgatory was built to be loaded in. */
+	for (phdr = phdrs; phdr < phdrs + ehdr->e_phnum; phdr++) {
+		if (phdr->p_type != PT_LOAD)
+			continue;
+
+		orig_load_addr = phdr->p_paddr - phdr->p_offset;
+		break;
+	}
+
+	/*
+	 * Find the address where we will load the purgatory.
+	 * This is simply the reverse of the calculation done when modifying
+	 * ->sh_addr in __kexec_really_load_purgatory.
+	 */
+	purgatory_load_addr = addr_base - orig_addr_base + orig_load_addr;
+
+	/* String table for the associated symbol table. */
+	strtab = (const char *) sechdrs[symtabsec->sh_link].sh_offset;
+
+	/* Section header string table. */
+	shstrtab = (const char *) sechdrs[ehdr->e_shstrndx].sh_offset;
+
+	for (i = 0; i < rel_section->sh_size / sizeof(Elf64_Rela); i++) {
+		Elf64_Addr r_offset = rela[i].r_offset - orig_addr_base;
+		long addend = rela[i].r_addend;
+		Elf64_Addr orig_sec_base;
+
+		/*
+		 * rels[i].r_offset contains the byte offset from the beginning
+		 * of section to the storage unit affected.
+		 *
+		 * This is the location to update in the temporary buffer where
+		 * the section is currently loaded. The section will finally
+		 * be loaded to a different address later, pointed to by
+		 * addr_base.
+		 */
+		location = loc_base + r_offset;
+
+		/* Final address of the location. */
+		address = addr_base + r_offset;
+
+		/* This is the symbol the relocation is referring to. */
+		sym = (Elf64_Sym *) syms_base + ELF64_R_SYM(rela[i].r_info);
+		orig_sec_base = sechdrs_c[sym->st_shndx].sh_addr;
+
+		if (sym->st_name)
+			name = strtab + sym->st_name;
+		else if (sym->st_value == orig_sec_base)
+			name = &shstrtab[sechdrs[sym->st_shndx].sh_name];
+		else
+			name = "<unnamed symbol>";
+
+		reloc_type = ELF64_R_TYPE(rela[i].r_info);
+
+		pr_debug("RELOC at %p: %lu-type as %s (0x%lx) + %li\n",
+		       location, reloc_type, name, (unsigned long)sym->st_value,
+		       (long)rela[i].r_addend);
+
+		if ((void *) location >= loc_base + target_section->sh_size) {
+			pr_err("Location %p is %llx bytes beyond the end of the section.\n",
+			       location, (void *) location - loc_base +
+						target_section->sh_size - 1);
+			return -ENOEXEC;
+		}
+
+		/*
+		 * Function descriptor symbols appear as undefined but
+		 * should be resolved as well, so allow them to be processed.
+		 */
+		if (sym->st_shndx == SHN_UNDEF &&
+				reloc_type != R_PPC64_RELATIVE) {
+			pr_err("Undefined symbol: %s\n", name);
+			return -ENOEXEC;
+		} else if (sym->st_shndx == SHN_COMMON) {
+			pr_err("Symbol '%s' in common section.\n",
+			       name);
+			return -ENOEXEC;
+		}
+
+		if (sym->st_shndx != SHN_ABS) {
+			if (sym->st_shndx >= ehdr->e_shnum) {
+				pr_err("Invalid section %d for symbol %s\n",
+				       sym->st_shndx, name);
+				return -ENOEXEC;
+			}
+
+			sec_base = sechdrs[sym->st_shndx].sh_addr;
+		} else
+			sec_base = orig_sec_base = 0;
+
+		/* `Everything is relative'. */
+		value = sym->st_value - orig_sec_base + sec_base + addend;
+
+		switch (reloc_type) {
+		case R_PPC64_ADDR16_LO:
+			*(uint16_t *) location = value & 0xffff;
+			break;
+
+		case R_PPC64_ADDR16_HI:
+			*(uint16_t *) location = (value >> 16) & 0xffff;
+			break;
+
+		case R_PPC64_ADDR16_HIGHER:
+			*(uint16_t *) location = (((uint64_t) value >> 32) &
+						  0xffff);
+			break;
+
+		case R_PPC64_ADDR16_HIGHEST:
+			*(uint16_t *) location = (((uint64_t) value >> 48) &
+						  0xffff);
+			break;
+		case R_PPC64_RELATIVE:
+			*location = purgatory_load_addr + addend - orig_load_addr;
+			break;
+		default:
+			pr_err("kexec purgatory: Unknown ADD relocation: %lu\n",
+			       reloc_type);
+			return -ENOEXEC;
+		}
+	}
+
+	return 0;
+}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v10 07/10] powerpc: Add functions to read ELF files of any endianness.
  2016-11-10  3:27 [PATCH v10 00/10] kexec_file_load implementation for PowerPC Thiago Jung Bauermann
                   ` (5 preceding siblings ...)
  2016-11-10  3:27 ` [PATCH v10 06/10] powerpc: Implement kexec_file_load Thiago Jung Bauermann
@ 2016-11-10  3:27 ` Thiago Jung Bauermann
  2016-11-10  3:27 ` [PATCH v10 08/10] powerpc: Add support for loading ELF kernels with kexec_file_load Thiago Jung Bauermann
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 20+ messages in thread
From: Thiago Jung Bauermann @ 2016-11-10  3:27 UTC (permalink / raw)
  To: kexec
  Cc: linuxppc-dev, linux-kernel, x86, Eric Biederman, Dave Young,
	Vivek Goyal, Baoquan He, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, Stewart Smith,
	Mimi Zohar, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Andrew Morton, Stephen Rothwell, Thiago Jung Bauermann

A little endian kernel might need to kexec a big endian kernel (the
opposite is less likely but could happen as well), so we can't just cast
the buffer with the binary to ELF structs and use them as is done
elsewhere.

This patch adds functions which do byte-swapping as necessary when
populating the ELF structs. These functions will be used in the next
patch in the series.

Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/elf_util.h |  43 ++++
 arch/powerpc/kernel/Makefile        |   2 +-
 arch/powerpc/kernel/elf_util.c      | 437 ++++++++++++++++++++++++++++++++++++
 3 files changed, 481 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/elf_util.h b/arch/powerpc/include/asm/elf_util.h
new file mode 100644
index 000000000000..944b3a2d8b73
--- /dev/null
+++ b/arch/powerpc/include/asm/elf_util.h
@@ -0,0 +1,43 @@
+/*
+ * Utility functions to work with ELF files.
+ *
+ * Copyright (C) 2016, IBM Corporation
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2, or (at your option)
+ * any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef _ASM_POWERPC_ELF_UTIL_H
+#define _ASM_POWERPC_ELF_UTIL_H
+
+#include <linux/elf.h>
+
+struct elf_info {
+	/*
+	 * Where the ELF binary contents are kept.
+	 * Memory managed by the user of the struct.
+	 */
+	const char *buffer;
+
+	const struct elfhdr *ehdr;
+	const struct elf_phdr *proghdrs;
+	struct elf_shdr *sechdrs;
+};
+
+static inline bool elf_is_elf_file(const struct elfhdr *ehdr)
+{
+	return memcmp(ehdr->e_ident, ELFMAG, SELFMAG) == 0;
+}
+
+int elf_read_from_buffer(const char *buf, size_t len, struct elfhdr *ehdr,
+			 struct elf_info *elf_info);
+void elf_free_info(struct elf_info *elf_info);
+
+#endif /* _ASM_POWERPC_ELF_UTIL_H */
diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index 6de731d90bff..de14b7eb11bb 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -109,7 +109,7 @@ obj-$(CONFIG_PCI)		+= pci_$(BITS).o $(pci64-y) \
 obj-$(CONFIG_PCI_MSI)		+= msi.o
 obj-$(CONFIG_KEXEC_CORE)	+= machine_kexec.o crash.o \
 				   machine_kexec_$(BITS).o
-obj-$(CONFIG_KEXEC_FILE)	+= machine_kexec_file_$(BITS).o
+obj-$(CONFIG_KEXEC_FILE)	+= machine_kexec_file_$(BITS).o elf_util.o
 obj-$(CONFIG_AUDIT)		+= audit.o
 obj64-$(CONFIG_AUDIT)		+= compat_audit.o
 
diff --git a/arch/powerpc/kernel/elf_util.c b/arch/powerpc/kernel/elf_util.c
new file mode 100644
index 000000000000..8572fc84a802
--- /dev/null
+++ b/arch/powerpc/kernel/elf_util.c
@@ -0,0 +1,437 @@
+/*
+ * Utility functions to work with ELF files.
+ *
+ * Copyright (C) 2016, IBM Corporation
+ *
+ * Based on kexec-tools' kexec-elf.c. Heavily modified for the
+ * kernel by Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation (version 2 of the License).
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/slab.h>
+#include <asm/elf_util.h>
+#include <asm-generic/module.h>
+
+#if ELF_CLASS == ELFCLASS32
+#define elf_addr_to_cpu	elf32_to_cpu
+
+#ifndef Elf_Rel
+#define Elf_Rel		Elf32_Rel
+#endif /* Elf_Rel */
+#else /* ELF_CLASS == ELFCLASS32 */
+#define elf_addr_to_cpu	elf64_to_cpu
+
+#ifndef Elf_Rel
+#define Elf_Rel		Elf64_Rel
+#endif /* Elf_Rel */
+
+static uint64_t elf64_to_cpu(const struct elfhdr *ehdr, uint64_t value)
+{
+	if (ehdr->e_ident[EI_DATA] == ELFDATA2LSB)
+		value = le64_to_cpu(value);
+	else if (ehdr->e_ident[EI_DATA] == ELFDATA2MSB)
+		value = be64_to_cpu(value);
+
+	return value;
+}
+#endif /* ELF_CLASS == ELFCLASS32 */
+
+static uint16_t elf16_to_cpu(const struct elfhdr *ehdr, uint16_t value)
+{
+	if (ehdr->e_ident[EI_DATA] == ELFDATA2LSB)
+		value = le16_to_cpu(value);
+	else if (ehdr->e_ident[EI_DATA] == ELFDATA2MSB)
+		value = be16_to_cpu(value);
+
+	return value;
+}
+
+static uint32_t elf32_to_cpu(const struct elfhdr *ehdr, uint32_t value)
+{
+	if (ehdr->e_ident[EI_DATA] == ELFDATA2LSB)
+		value = le32_to_cpu(value);
+	else if (ehdr->e_ident[EI_DATA] == ELFDATA2MSB)
+		value = be32_to_cpu(value);
+
+	return value;
+}
+
+/**
+ * elf_is_ehdr_sane - check that it is safe to use the ELF header
+ * @buf_len:	size of the buffer in which the ELF file is loaded.
+ */
+static bool elf_is_ehdr_sane(const struct elfhdr *ehdr, size_t buf_len)
+{
+	if (ehdr->e_phnum > 0 && ehdr->e_phentsize != sizeof(struct elf_phdr)) {
+		pr_debug("Bad program header size.\n");
+		return false;
+	} else if (ehdr->e_shnum > 0 &&
+		   ehdr->e_shentsize != sizeof(struct elf_shdr)) {
+		pr_debug("Bad section header size.\n");
+		return false;
+	} else if (ehdr->e_ident[EI_VERSION] != EV_CURRENT ||
+		   ehdr->e_version != EV_CURRENT) {
+		pr_debug("Unknown ELF version.\n");
+		return false;
+	}
+
+	if (ehdr->e_phoff > 0 && ehdr->e_phnum > 0) {
+		size_t phdr_size;
+
+		/*
+		 * e_phnum is at most 65535 so calculating the size of the
+		 * program header cannot overflow.
+		 */
+		phdr_size = sizeof(struct elf_phdr) * ehdr->e_phnum;
+
+		/* Sanity check the program header table location. */
+		if (ehdr->e_phoff + phdr_size < ehdr->e_phoff) {
+			pr_debug("Program headers at invalid location.\n");
+			return false;
+		} else if (ehdr->e_phoff + phdr_size > buf_len) {
+			pr_debug("Program headers truncated.\n");
+			return false;
+		}
+	}
+
+	if (ehdr->e_shoff > 0 && ehdr->e_shnum > 0) {
+		size_t shdr_size;
+
+		/*
+		 * e_shnum is at most 65536 so calculating
+		 * the size of the section header cannot overflow.
+		 */
+		shdr_size = sizeof(struct elf_shdr) * ehdr->e_shnum;
+
+		/* Sanity check the section header table location. */
+		if (ehdr->e_shoff + shdr_size < ehdr->e_shoff) {
+			pr_debug("Section headers at invalid location.\n");
+			return false;
+		} else if (ehdr->e_shoff + shdr_size > buf_len) {
+			pr_debug("Section headers truncated.\n");
+			return false;
+		}
+	}
+
+	return true;
+}
+
+static int elf_read_ehdr(const char *buf, size_t len, struct elfhdr *ehdr)
+{
+	struct elfhdr *buf_ehdr;
+
+	if (len < sizeof(*buf_ehdr)) {
+		pr_debug("Buffer is too small to hold ELF header.\n");
+		return -ENOEXEC;
+	}
+
+	memset(ehdr, 0, sizeof(*ehdr));
+	memcpy(ehdr->e_ident, buf, sizeof(ehdr->e_ident));
+	if (!elf_is_elf_file(ehdr)) {
+		pr_debug("No ELF header magic.\n");
+		return -ENOEXEC;
+	}
+
+	if (ehdr->e_ident[EI_CLASS] != ELF_CLASS) {
+		pr_debug("Not a supported ELF class.\n");
+		return -1;
+	} else  if (ehdr->e_ident[EI_DATA] != ELFDATA2LSB &&
+		ehdr->e_ident[EI_DATA] != ELFDATA2MSB) {
+		pr_debug("Not a supported ELF data format.\n");
+		return -ENOEXEC;
+	}
+
+	buf_ehdr = (struct elfhdr *) buf;
+	if (elf16_to_cpu(ehdr, buf_ehdr->e_ehsize) != sizeof(*buf_ehdr)) {
+		pr_debug("Bad ELF header size.\n");
+		return -ENOEXEC;
+	}
+
+	ehdr->e_type      = elf16_to_cpu(ehdr, buf_ehdr->e_type);
+	ehdr->e_machine   = elf16_to_cpu(ehdr, buf_ehdr->e_machine);
+	ehdr->e_version   = elf32_to_cpu(ehdr, buf_ehdr->e_version);
+	ehdr->e_entry     = elf_addr_to_cpu(ehdr, buf_ehdr->e_entry);
+	ehdr->e_phoff     = elf_addr_to_cpu(ehdr, buf_ehdr->e_phoff);
+	ehdr->e_shoff     = elf_addr_to_cpu(ehdr, buf_ehdr->e_shoff);
+	ehdr->e_flags     = elf32_to_cpu(ehdr, buf_ehdr->e_flags);
+	ehdr->e_phentsize = elf16_to_cpu(ehdr, buf_ehdr->e_phentsize);
+	ehdr->e_phnum     = elf16_to_cpu(ehdr, buf_ehdr->e_phnum);
+	ehdr->e_shentsize = elf16_to_cpu(ehdr, buf_ehdr->e_shentsize);
+	ehdr->e_shnum     = elf16_to_cpu(ehdr, buf_ehdr->e_shnum);
+	ehdr->e_shstrndx  = elf16_to_cpu(ehdr, buf_ehdr->e_shstrndx);
+
+	return elf_is_ehdr_sane(ehdr, len) ? 0 : -ENOEXEC;
+}
+
+/**
+ * elf_is_phdr_sane - check that it is safe to use the program header
+ * @buf_len:	size of the buffer in which the ELF file is loaded.
+ */
+static bool elf_is_phdr_sane(const struct elf_phdr *phdr, size_t buf_len)
+{
+
+	if (phdr->p_offset + phdr->p_filesz < phdr->p_offset) {
+		pr_debug("ELF segment location wraps around.\n");
+		return false;
+	} else if (phdr->p_offset + phdr->p_filesz > buf_len) {
+		pr_debug("ELF segment not in file.\n");
+		return false;
+	} else if (phdr->p_paddr + phdr->p_memsz < phdr->p_paddr) {
+		pr_debug("ELF segment address wraps around.\n");
+		return false;
+	}
+
+	return true;
+}
+
+static int elf_read_phdr(const char *buf, size_t len, struct elf_info *elf_info,
+			 int idx)
+{
+	/* Override the const in proghdrs, we are the ones doing the loading. */
+	struct elf_phdr *phdr = (struct elf_phdr *) &elf_info->proghdrs[idx];
+	const char *pbuf;
+	struct elf_phdr *buf_phdr;
+
+	pbuf = buf + elf_info->ehdr->e_phoff + (idx * sizeof(*buf_phdr));
+	buf_phdr = (struct elf_phdr *) pbuf;
+
+	phdr->p_type   = elf32_to_cpu(elf_info->ehdr, buf_phdr->p_type);
+	phdr->p_offset = elf_addr_to_cpu(elf_info->ehdr, buf_phdr->p_offset);
+	phdr->p_paddr  = elf_addr_to_cpu(elf_info->ehdr, buf_phdr->p_paddr);
+	phdr->p_vaddr  = elf_addr_to_cpu(elf_info->ehdr, buf_phdr->p_vaddr);
+	phdr->p_flags  = elf32_to_cpu(elf_info->ehdr, buf_phdr->p_flags);
+
+	/*
+	 * The following fields have a type equivalent to Elf_Addr
+	 * both in 32 bit and 64 bit ELF.
+	 */
+	phdr->p_filesz = elf_addr_to_cpu(elf_info->ehdr, buf_phdr->p_filesz);
+	phdr->p_memsz  = elf_addr_to_cpu(elf_info->ehdr, buf_phdr->p_memsz);
+	phdr->p_align  = elf_addr_to_cpu(elf_info->ehdr, buf_phdr->p_align);
+
+	return elf_is_phdr_sane(phdr, len) ? 0 : -ENOEXEC;
+}
+
+/**
+ * elf_read_phdrs - read the program headers from the buffer
+ *
+ * This function assumes that the program header table was checked for sanity.
+ * Use elf_is_ehdr_sane() if it wasn't.
+ */
+static int elf_read_phdrs(const char *buf, size_t len,
+			  struct elf_info *elf_info)
+{
+	size_t phdr_size, i;
+	const struct elfhdr *ehdr = elf_info->ehdr;
+
+	/*
+	 * e_phnum is at most 65535 so calculating the size of the
+	 * program header cannot overflow.
+	 */
+	phdr_size = sizeof(struct elf_phdr) * ehdr->e_phnum;
+
+	elf_info->proghdrs = kzalloc(phdr_size, GFP_KERNEL);
+	if (!elf_info->proghdrs)
+		return -ENOMEM;
+
+	for (i = 0; i < ehdr->e_phnum; i++) {
+		int ret;
+
+		ret = elf_read_phdr(buf, len, elf_info, i);
+		if (ret) {
+			kfree(elf_info->proghdrs);
+			elf_info->proghdrs = NULL;
+			return ret;
+		}
+	}
+
+	return 0;
+}
+
+/**
+ * elf_is_shdr_sane - check that it is safe to use the section header
+ * @buf_len:	size of the buffer in which the ELF file is loaded.
+ */
+static bool elf_is_shdr_sane(const struct elf_shdr *shdr, size_t buf_len)
+{
+	bool size_ok;
+
+	/* SHT_NULL headers have undefined values, so we can't check them. */
+	if (shdr->sh_type == SHT_NULL)
+		return true;
+
+	/* Now verify sh_entsize */
+	switch (shdr->sh_type) {
+	case SHT_SYMTAB:
+		size_ok = shdr->sh_entsize == sizeof(Elf_Sym);
+		break;
+	case SHT_RELA:
+		size_ok = shdr->sh_entsize == sizeof(Elf_Rela);
+		break;
+	case SHT_DYNAMIC:
+		size_ok = shdr->sh_entsize == sizeof(Elf_Dyn);
+		break;
+	case SHT_REL:
+		size_ok = shdr->sh_entsize == sizeof(Elf_Rel);
+		break;
+	case SHT_NOTE:
+	case SHT_PROGBITS:
+	case SHT_HASH:
+	case SHT_NOBITS:
+	default:
+		/*
+		 * This is a section whose entsize requirements
+		 * I don't care about.  If I don't know about
+		 * the section I can't care about it's entsize
+		 * requirements.
+		 */
+		size_ok = true;
+		break;
+	}
+
+	if (!size_ok) {
+		pr_debug("ELF section with wrong entry size.\n");
+		return false;
+	} else if (shdr->sh_addr + shdr->sh_size < shdr->sh_addr) {
+		pr_debug("ELF section address wraps around.\n");
+		return false;
+	}
+
+	if (shdr->sh_type != SHT_NOBITS) {
+		if (shdr->sh_offset + shdr->sh_size < shdr->sh_offset) {
+			pr_debug("ELF section location wraps around.\n");
+			return false;
+		} else if (shdr->sh_offset + shdr->sh_size > buf_len) {
+			pr_debug("ELF section not in file.\n");
+			return false;
+		}
+	}
+
+	return true;
+}
+
+static int elf_read_shdr(const char *buf, size_t len, struct elf_info *elf_info,
+			 int idx)
+{
+	struct elf_shdr *shdr = &elf_info->sechdrs[idx];
+	const struct elfhdr *ehdr = elf_info->ehdr;
+	const char *sbuf;
+	struct elf_shdr *buf_shdr;
+
+	sbuf = buf + ehdr->e_shoff + idx * sizeof(*buf_shdr);
+	buf_shdr = (struct elf_shdr *) sbuf;
+
+	shdr->sh_name      = elf32_to_cpu(ehdr, buf_shdr->sh_name);
+	shdr->sh_type      = elf32_to_cpu(ehdr, buf_shdr->sh_type);
+	shdr->sh_addr      = elf_addr_to_cpu(ehdr, buf_shdr->sh_addr);
+	shdr->sh_offset    = elf_addr_to_cpu(ehdr, buf_shdr->sh_offset);
+	shdr->sh_link      = elf32_to_cpu(ehdr, buf_shdr->sh_link);
+	shdr->sh_info      = elf32_to_cpu(ehdr, buf_shdr->sh_info);
+
+	/*
+	 * The following fields have a type equivalent to Elf_Addr
+	 * both in 32 bit and 64 bit ELF.
+	 */
+	shdr->sh_flags     = elf_addr_to_cpu(ehdr, buf_shdr->sh_flags);
+	shdr->sh_size      = elf_addr_to_cpu(ehdr, buf_shdr->sh_size);
+	shdr->sh_addralign = elf_addr_to_cpu(ehdr, buf_shdr->sh_addralign);
+	shdr->sh_entsize   = elf_addr_to_cpu(ehdr, buf_shdr->sh_entsize);
+
+	return elf_is_shdr_sane(shdr, len) ? 0 : -ENOEXEC;
+}
+
+/**
+ * elf_read_shdrs - read the section headers from the buffer
+ *
+ * This function assumes that the section header table was checked for sanity.
+ * Use elf_is_ehdr_sane() if it wasn't.
+ */
+static int elf_read_shdrs(const char *buf, size_t len,
+			  struct elf_info *elf_info)
+{
+	size_t shdr_size, i;
+
+	/*
+	 * e_shnum is at most 65536 so calculating
+	 * the size of the section header cannot overflow.
+	 */
+	shdr_size = sizeof(struct elf_shdr) * elf_info->ehdr->e_shnum;
+
+	elf_info->sechdrs = kzalloc(shdr_size, GFP_KERNEL);
+	if (!elf_info->sechdrs)
+		return -ENOMEM;
+
+	for (i = 0; i < elf_info->ehdr->e_shnum; i++) {
+		int ret;
+
+		ret = elf_read_shdr(buf, len, elf_info, i);
+		if (ret) {
+			kfree(elf_info->sechdrs);
+			elf_info->sechdrs = NULL;
+			return ret;
+		}
+	}
+
+	return 0;
+}
+
+/**
+ * elf_read_from_buffer - read ELF file and sets up ELF header and ELF info
+ * @buf:	Buffer to read ELF file from.
+ * @len:	Size of @buf.
+ * @ehdr:	Pointer to existing struct which will be populated.
+ * @elf_info:	Pointer to existing struct which will be populated.
+ *
+ * This function allows reading ELF files with different byte order than
+ * the kernel, byte-swapping the fields as needed.
+ *
+ * Return:
+ * On success returns 0, and the caller should call elf_free_info(elf_info) to
+ * free the memory allocated for the section and program headers.
+ */
+int elf_read_from_buffer(const char *buf, size_t len, struct elfhdr *ehdr,
+			 struct elf_info *elf_info)
+{
+	int ret;
+
+	ret = elf_read_ehdr(buf, len, ehdr);
+	if (ret)
+		return ret;
+
+	elf_info->buffer = buf;
+	elf_info->ehdr = ehdr;
+	if (ehdr->e_phoff > 0 && ehdr->e_phnum > 0) {
+		ret = elf_read_phdrs(buf, len, elf_info);
+		if (ret)
+			return ret;
+	}
+	if (ehdr->e_shoff > 0 && ehdr->e_shnum > 0) {
+		ret = elf_read_shdrs(buf, len, elf_info);
+		if (ret) {
+			kfree(elf_info->proghdrs);
+			return ret;
+		}
+	}
+
+	return 0;
+}
+
+/**
+ * elf_free_info - free memory allocated by elf_read_from_buffer
+ */
+void elf_free_info(struct elf_info *elf_info)
+{
+	kfree(elf_info->proghdrs);
+	kfree(elf_info->sechdrs);
+	memset(elf_info, 0, sizeof(*elf_info));
+}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v10 08/10] powerpc: Add support for loading ELF kernels with kexec_file_load.
  2016-11-10  3:27 [PATCH v10 00/10] kexec_file_load implementation for PowerPC Thiago Jung Bauermann
                   ` (6 preceding siblings ...)
  2016-11-10  3:27 ` [PATCH v10 07/10] powerpc: Add functions to read ELF files of any endianness Thiago Jung Bauermann
@ 2016-11-10  3:27 ` Thiago Jung Bauermann
  2016-11-10  3:27 ` [PATCH v10 09/10] powerpc: Add purgatory for kexec_file_load implementation Thiago Jung Bauermann
  2016-11-10  3:27 ` [PATCH v10 10/10] powerpc: Enable CONFIG_KEXEC_FILE in powerpc server defconfigs Thiago Jung Bauermann
  9 siblings, 0 replies; 20+ messages in thread
From: Thiago Jung Bauermann @ 2016-11-10  3:27 UTC (permalink / raw)
  To: kexec
  Cc: linuxppc-dev, linux-kernel, x86, Eric Biederman, Dave Young,
	Vivek Goyal, Baoquan He, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, Stewart Smith,
	Mimi Zohar, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Andrew Morton, Stephen Rothwell, Thiago Jung Bauermann

This uses all the infrastructure built up by the previous patches
in the series to load an ELF vmlinux file and an initrd. It uses the
flattened device tree at initial_boot_params as a base and adjusts memory
reservations and its /chosen node for the next kernel.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
 arch/powerpc/include/asm/kexec.h            |  12 ++
 arch/powerpc/kernel/Makefile                |   3 +-
 arch/powerpc/kernel/kexec_elf_64.c          | 279 +++++++++++++++++++++++++++
 arch/powerpc/kernel/machine_kexec_file_64.c | 281 +++++++++++++++++++++++++++-
 4 files changed, 572 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index eca2f975bf44..6b8cbcf42466 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -91,6 +91,18 @@ static inline bool kdump_in_progress(void)
 	return crashing_cpu >= 0;
 }
 
+#ifdef CONFIG_KEXEC_FILE
+#define PURGATORY_ELF_TYPE ET_EXEC
+
+extern struct kexec_file_ops kexec_elf64_ops;
+
+int setup_purgatory(struct kimage *image, const void *slave_code,
+		    const void *fdt, unsigned long kernel_load_addr,
+		    unsigned long fdt_load_addr, unsigned long stack_top);
+int setup_new_fdt(void *fdt, unsigned long initrd_load_addr,
+		  unsigned long initrd_len, const char *cmdline);
+#endif /* CONFIG_KEXEC_FILE */
+
 #else /* !CONFIG_KEXEC_CORE */
 static inline void crash_kexec_secondary(struct pt_regs *regs) { }
 
diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index de14b7eb11bb..424b13b1b2b0 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -109,7 +109,8 @@ obj-$(CONFIG_PCI)		+= pci_$(BITS).o $(pci64-y) \
 obj-$(CONFIG_PCI_MSI)		+= msi.o
 obj-$(CONFIG_KEXEC_CORE)	+= machine_kexec.o crash.o \
 				   machine_kexec_$(BITS).o
-obj-$(CONFIG_KEXEC_FILE)	+= machine_kexec_file_$(BITS).o elf_util.o
+obj-$(CONFIG_KEXEC_FILE)	+= machine_kexec_file_$(BITS).o elf_util.o \
+				   kexec_elf_$(BITS).o
 obj-$(CONFIG_AUDIT)		+= audit.o
 obj64-$(CONFIG_AUDIT)		+= compat_audit.o
 
diff --git a/arch/powerpc/kernel/kexec_elf_64.c b/arch/powerpc/kernel/kexec_elf_64.c
new file mode 100644
index 000000000000..f58b77d80d59
--- /dev/null
+++ b/arch/powerpc/kernel/kexec_elf_64.c
@@ -0,0 +1,279 @@
+/*
+ * Load ELF vmlinux file for the kexec_file_load syscall.
+ *
+ * Copyright (C) 2004  Adam Litke (agl@us.ibm.com)
+ * Copyright (C) 2004  IBM Corp.
+ * Copyright (C) 2005  R Sharada (sharada@in.ibm.com)
+ * Copyright (C) 2006  Mohan Kumar M (mohan@in.ibm.com)
+ * Copyright (C) 2016  IBM Corporation
+ *
+ * Based on kexec-tools' kexec-elf-exec.c and kexec-elf-ppc64.c.
+ * Heavily modified for the kernel by
+ * Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation (version 2 of the License).
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#define pr_fmt(fmt)	"kexec_elf: " fmt
+
+#include <linux/types.h>
+#include <linux/slab.h>
+#include <linux/elf.h>
+#include <linux/kexec.h>
+#include <linux/of_fdt.h>
+#include <linux/libfdt.h>
+#include <asm/elf_util.h>
+
+#define PURGATORY_STACK_SIZE	(16 * 1024)
+
+/**
+ * build_elf_exec_info - read ELF executable and check that we can use it
+ */
+static int build_elf_exec_info(const char *buf, size_t len, struct elfhdr *ehdr,
+			       struct elf_info *elf_info)
+{
+	int i;
+	int ret;
+
+	ret = elf_read_from_buffer(buf, len, ehdr, elf_info);
+	if (ret)
+		return ret;
+
+	/* Big endian vmlinux has type ET_DYN. */
+	if (ehdr->e_type != ET_EXEC && ehdr->e_type != ET_DYN) {
+		pr_err("Not an ELF executable.\n");
+		goto error;
+	} else if (!elf_info->proghdrs) {
+		pr_err("No ELF program header.\n");
+		goto error;
+	}
+
+	for (i = 0; i < ehdr->e_phnum; i++) {
+		/*
+		 * Kexec does not support loading interpreters.
+		 * In addition this check keeps us from attempting
+		 * to kexec ordinay executables.
+		 */
+		if (elf_info->proghdrs[i].p_type == PT_INTERP) {
+			pr_err("Requires an ELF interpreter.\n");
+			goto error;
+		}
+	}
+
+	return 0;
+error:
+	elf_free_info(elf_info);
+	return -ENOEXEC;
+}
+
+static int elf64_probe(const char *buf, unsigned long len)
+{
+	struct elfhdr ehdr;
+	struct elf_info elf_info;
+	int ret;
+
+	ret = build_elf_exec_info(buf, len, &ehdr, &elf_info);
+	if (ret)
+		return ret;
+
+	elf_free_info(&elf_info);
+
+	return elf_check_arch(&ehdr) ? 0 : -ENOEXEC;
+}
+
+/**
+ * elf_exec_load - load ELF executable image
+ * @lowest_load_addr:	On return, will be the address where the first PT_LOAD
+ *			section will be loaded in memory.
+ *
+ * Return:
+ * 0 on success, negative value on failure.
+ */
+static int elf_exec_load(struct kimage *image, struct elfhdr *ehdr,
+			 struct elf_info *elf_info,
+			 unsigned long *lowest_load_addr)
+{
+	unsigned long base = 0, lowest_addr = UINT_MAX;
+	int ret;
+	size_t i;
+	struct kexec_buf kbuf = { .image = image, .buf_max = ppc64_rma_size,
+				  .top_down = false };
+
+	/* Read in the PT_LOAD segments. */
+	for (i = 0; i < ehdr->e_phnum; i++) {
+		unsigned long load_addr;
+		size_t size;
+		const struct elf_phdr *phdr;
+
+		phdr = &elf_info->proghdrs[i];
+		if (phdr->p_type != PT_LOAD)
+			continue;
+
+		size = phdr->p_filesz;
+		if (size > phdr->p_memsz)
+			size = phdr->p_memsz;
+
+		kbuf.buffer = (void *) elf_info->buffer + phdr->p_offset;
+		kbuf.bufsz = size;
+		kbuf.memsz = phdr->p_memsz;
+		kbuf.buf_align = phdr->p_align;
+		kbuf.buf_min = phdr->p_paddr + base;
+		ret = kexec_add_buffer(&kbuf);
+		if (ret)
+			goto out;
+		load_addr = kbuf.mem;
+
+		if (load_addr < lowest_addr)
+			lowest_addr = load_addr;
+	}
+
+	/* Update entry point to reflect new load address. */
+	ehdr->e_entry += base;
+
+	*lowest_load_addr = lowest_addr;
+	ret = 0;
+ out:
+	return ret;
+}
+
+static void *elf64_load(struct kimage *image, char *kernel_buf,
+			unsigned long kernel_len, char *initrd,
+			unsigned long initrd_len, char *cmdline,
+			unsigned long cmdline_len)
+{
+	int i, ret;
+	unsigned int fdt_size;
+	unsigned long kernel_load_addr, purgatory_load_addr;
+	unsigned long initrd_load_addr = 0, fdt_load_addr, stack_top;
+	void *fdt;
+	const void *slave_code;
+	struct elfhdr ehdr;
+	struct elf_info elf_info;
+	struct fdt_reserve_entry *rsvmap;
+	struct kexec_buf kbuf = { .image = image, .buf_min = 0,
+				  .buf_max = ppc64_rma_size };
+
+	ret = build_elf_exec_info(kernel_buf, kernel_len, &ehdr, &elf_info);
+	if (ret)
+		goto out;
+
+	ret = elf_exec_load(image, &ehdr, &elf_info, &kernel_load_addr);
+	if (ret)
+		goto out;
+
+	pr_debug("Loaded the kernel at 0x%lx\n", kernel_load_addr);
+
+	ret = kexec_load_purgatory(image, 0, ppc64_rma_size, true,
+				   &purgatory_load_addr);
+	if (ret) {
+		pr_err("Loading purgatory failed.\n");
+		goto out;
+	}
+
+	pr_debug("Loaded purgatory at 0x%lx\n", purgatory_load_addr);
+
+	if (initrd != NULL) {
+		kbuf.buffer = initrd;
+		kbuf.bufsz = kbuf.memsz = initrd_len;
+		kbuf.buf_align = PAGE_SIZE;
+		kbuf.top_down = false;
+		ret = kexec_add_buffer(&kbuf);
+		if (ret)
+			goto out;
+		initrd_load_addr = kbuf.mem;
+
+		pr_debug("Loaded initrd at 0x%lx\n", initrd_load_addr);
+	}
+
+	fdt_size = fdt_totalsize(initial_boot_params) * 2;
+	fdt = kmalloc(fdt_size, GFP_KERNEL);
+	if (!fdt) {
+		pr_err("Not enough memory for the device tree.\n");
+		ret = -ENOMEM;
+		goto out;
+	}
+	ret = fdt_open_into(initial_boot_params, fdt, fdt_size);
+	if (ret < 0) {
+		pr_err("Error setting up the new device tree.\n");
+		ret = -EINVAL;
+		goto out;
+	}
+
+	ret = setup_new_fdt(fdt, initrd_load_addr, initrd_len, cmdline);
+	if (ret)
+		goto out;
+
+	/*
+	 * Documentation/devicetree/booting-without-of.txt says we need to
+	 * add a reservation entry for the device tree block, but
+	 * early_init_fdt_reserve_self reserves the memory even if there's no
+	 * such entry. We'll add a reservation entry anyway, to be safe and
+	 * compliant.
+	 *
+	 * Use dummy values, we will correct them in a moment.
+	 */
+	ret = fdt_add_mem_rsv(fdt, 1, 1);
+	if (ret) {
+		pr_err("Error reserving device tree memory: %s\n",
+		       fdt_strerror(ret));
+		ret = -EINVAL;
+		goto out;
+	}
+	fdt_pack(fdt);
+
+	kbuf.buffer = fdt;
+	kbuf.bufsz = kbuf.memsz = fdt_size;
+	kbuf.buf_align = PAGE_SIZE;
+	kbuf.top_down = true;
+	ret = kexec_add_buffer(&kbuf);
+	if (ret)
+		goto out;
+	fdt_load_addr = kbuf.mem;
+
+	/*
+	 * Fix fdt reservation, now that we now where it will be loaded
+	 * and how big it is.
+	 */
+	rsvmap = fdt + fdt_off_mem_rsvmap(fdt);
+	i = fdt_num_mem_rsv(fdt) - 1;
+	rsvmap[i].address = cpu_to_fdt64(fdt_load_addr);
+	rsvmap[i].size = cpu_to_fdt64(fdt_totalsize(fdt));
+
+	pr_debug("Loaded device tree at 0x%lx\n", fdt_load_addr);
+
+	kbuf.memsz = PURGATORY_STACK_SIZE;
+	kbuf.buf_align = PAGE_SIZE;
+	kbuf.top_down = true;
+	ret = kexec_locate_mem_hole(&kbuf);
+	if (ret) {
+		pr_err("Couldn't find free memory for the purgatory stack.\n");
+		ret = -ENOMEM;
+		goto out;
+	}
+	stack_top = kbuf.mem + PURGATORY_STACK_SIZE - 1;
+	pr_debug("Purgatory stack is at 0x%lx\n", stack_top);
+
+	slave_code = elf_info.buffer + elf_info.proghdrs[0].p_offset;
+	ret = setup_purgatory(image, slave_code, fdt, kernel_load_addr,
+			      fdt_load_addr, stack_top);
+	if (ret)
+		pr_err("Error setting up the purgatory.\n");
+
+out:
+	elf_free_info(&elf_info);
+
+	/* Make kimage_file_post_load_cleanup free the fdt buffer for us. */
+	return ret ? ERR_PTR(ret) : fdt;
+}
+
+struct kexec_file_ops kexec_elf64_ops = {
+	.probe = elf64_probe,
+	.load = elf64_load,
+};
diff --git a/arch/powerpc/kernel/machine_kexec_file_64.c b/arch/powerpc/kernel/machine_kexec_file_64.c
index 172f6f736987..c181314efa47 100644
--- a/arch/powerpc/kernel/machine_kexec_file_64.c
+++ b/arch/powerpc/kernel/machine_kexec_file_64.c
@@ -3,11 +3,12 @@
  *
  * Copyright (C) 2004  Adam Litke (agl@us.ibm.com)
  * Copyright (C) 2004  IBM Corp.
+ * Copyright (C) 2004,2005  Milton D Miller II, IBM Corporation
  * Copyright (C) 2005  R Sharada (sharada@in.ibm.com)
  * Copyright (C) 2006  Mohan Kumar M (mohan@in.ibm.com)
  * Copyright (C) 2016  IBM Corporation
  *
- * Based on kexec-tools' kexec-elf-ppc64.c.
+ * Based on kexec-tools' kexec-elf-ppc64.c, fs2dt.c.
  * Heavily modified for the kernel by
  * Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>.
  *
@@ -24,11 +25,15 @@
 #include <linux/slab.h>
 #include <linux/kexec.h>
 #include <linux/memblock.h>
+#include <linux/of_fdt.h>
 #include <linux/libfdt.h>
+#include <asm/elf_util.h>
 
 #define SLAVE_CODE_SIZE		256
 
-static struct kexec_file_ops *kexec_file_loaders[] = { };
+static struct kexec_file_ops *kexec_file_loaders[] = {
+	&kexec_elf64_ops,
+};
 
 int arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
 				  unsigned long buf_len)
@@ -299,3 +304,275 @@ int arch_kexec_apply_relocations_add(const Elf64_Ehdr *ehdr,
 
 	return 0;
 }
+
+/**
+ * setup_purgatory - initialize the purgatory's global variables
+ * @image:		kexec image.
+ * @slave_code:		Slave code for the purgatory.
+ * @fdt:		Flattened device tree for the next kernel.
+ * @kernel_load_addr:	Address where the kernel is loaded.
+ * @fdt_load_addr:	Address where the flattened device tree is loaded.
+ * @stack_top:		Address where the purgatory can place its stack.
+ *
+ * Return: 0 on success, or negative errno on error.
+ */
+int setup_purgatory(struct kimage *image, const void *slave_code,
+		    const void *fdt, unsigned long kernel_load_addr,
+		    unsigned long fdt_load_addr, unsigned long stack_top)
+{
+	int i, ret, tree_node;
+	const void *prop;
+	unsigned long opal_base, opal_entry, toc;
+	unsigned int *slave_code_buf, master_entry;
+	const struct purgatory_info *pi = &image->purgatory_info;
+	const Elf_Shdr *sechdrs = pi->sechdrs;
+	const char *shstrtab;
+
+	slave_code_buf = kmalloc(SLAVE_CODE_SIZE, GFP_KERNEL);
+	if (!slave_code_buf)
+		return -ENOMEM;
+
+	/* Get the slave code from the new kernel and put it in purgatory. */
+	ret = kexec_purgatory_get_set_symbol(image, "purgatory_start",
+					     slave_code_buf, SLAVE_CODE_SIZE,
+					     true);
+	if (ret) {
+		kfree(slave_code_buf);
+		return ret;
+	}
+
+	master_entry = slave_code_buf[0];
+	memcpy(slave_code_buf, slave_code, SLAVE_CODE_SIZE);
+	slave_code_buf[0] = master_entry;
+	ret = kexec_purgatory_get_set_symbol(image, "purgatory_start",
+					     slave_code_buf, SLAVE_CODE_SIZE,
+					     false);
+	kfree(slave_code_buf);
+
+	ret = kexec_purgatory_get_set_symbol(image, "kernel", &kernel_load_addr,
+					     sizeof(kernel_load_addr), false);
+	if (ret)
+		return ret;
+	ret = kexec_purgatory_get_set_symbol(image, "dt_offset", &fdt_load_addr,
+					     sizeof(fdt_load_addr), false);
+	if (ret)
+		return ret;
+
+	tree_node = fdt_path_offset(fdt, "/ibm,opal");
+	if (tree_node >= 0) {
+		prop = fdt_getprop(fdt, tree_node, "opal-base-address", NULL);
+		if (!prop) {
+			pr_err("OPAL address not found in the device tree.\n");
+			return -EINVAL;
+		}
+		opal_base = fdt64_to_cpu((const fdt64_t *) prop);
+
+		prop = fdt_getprop(fdt, tree_node, "opal-entry-address", NULL);
+		if (!prop) {
+			pr_err("OPAL address not found in the device tree.\n");
+			return -EINVAL;
+		}
+		opal_entry = fdt64_to_cpu((const fdt64_t *) prop);
+
+		ret = kexec_purgatory_get_set_symbol(image, "opal_base",
+						     &opal_base,
+						     sizeof(opal_base), false);
+		if (ret)
+			return ret;
+		ret = kexec_purgatory_get_set_symbol(image, "opal_entry",
+						     &opal_entry,
+						     sizeof(opal_entry), false);
+		if (ret)
+			return ret;
+	}
+
+	ret = kexec_purgatory_get_set_symbol(image, "stack", &stack_top,
+					     sizeof(stack_top), false);
+	if (ret)
+		return ret;
+
+	/* Section header string table. */
+	shstrtab = (const char *) sechdrs[pi->ehdr->e_shstrndx].sh_offset;
+
+	/* Find the TOC address by looking for the .got section. */
+	for (i = 0; i < pi->ehdr->e_shnum; i++)
+		if (!strcmp(&shstrtab[sechdrs[i].sh_name], ".got")) {
+			toc = sechdrs[i].sh_addr + 0x8000;
+
+			break;
+		}
+
+	if (i == pi->ehdr->e_shnum)
+		return -EINVAL;
+
+	ret = kexec_purgatory_get_set_symbol(image, "my_toc", &toc, sizeof(toc),
+					     false);
+	if (ret)
+		return ret;
+
+	pr_debug("Purgatory TOC is at 0x%lx\n", toc);
+
+	return 0;
+}
+
+/**
+ * delete_fdt_mem_rsv - delete memory reservation with given address and size
+ *
+ * Return: 0 on success, or negative errno on error.
+ */
+static int delete_fdt_mem_rsv(void *fdt, unsigned long start, unsigned long size)
+{
+	int i, ret, num_rsvs = fdt_num_mem_rsv(fdt);
+
+	for (i = 0; i < num_rsvs; i++) {
+		uint64_t rsv_start, rsv_size;
+
+		ret = fdt_get_mem_rsv(fdt, i, &rsv_start, &rsv_size);
+		if (ret) {
+			pr_err("Malformed device tree.\n");
+			return -EINVAL;
+		}
+
+		if (rsv_start == start && rsv_size == size) {
+			ret = fdt_del_mem_rsv(fdt, i);
+			if (ret) {
+				pr_err("Error deleting device tree reservation.\n");
+				return -EINVAL;
+			}
+
+			return 0;
+		}
+	}
+
+	return -ENOENT;
+}
+
+/*
+ * setup_new_fdt - modify /chosen and memory reservation for the next kernel
+ * @fdt:		Flattened device tree for the next kernel.
+ * @initrd_load_addr:	Address where the next initrd will be loaded.
+ * @initrd_len:		Size of the next initrd, or 0 if there will be none.
+ * @cmdline:		Command line for the next kernel, or NULL if there will
+ *			be none.
+ *
+ * Return: 0 on success, or negative errno on error.
+ */
+int setup_new_fdt(void *fdt, unsigned long initrd_load_addr,
+		  unsigned long initrd_len, const char *cmdline)
+{
+	int ret, chosen_node;
+	const void *prop;
+
+	/* Remove memory reservation for the current device tree. */
+	ret = delete_fdt_mem_rsv(fdt, __pa(initial_boot_params),
+				 fdt_totalsize(initial_boot_params));
+	if (ret == 0)
+		pr_debug("Removed old device tree reservation.\n");
+	else if (ret != -ENOENT)
+		return ret;
+
+	chosen_node = fdt_path_offset(fdt, "/chosen");
+	if (chosen_node == -FDT_ERR_NOTFOUND) {
+		chosen_node = fdt_add_subnode(fdt, fdt_path_offset(fdt, "/"),
+					      "chosen");
+		if (chosen_node < 0) {
+			pr_err("Error creating /chosen.\n");
+			return -EINVAL;
+		}
+	} else if (chosen_node < 0) {
+		pr_err("Malformed device tree: error reading /chosen.\n");
+		return -EINVAL;
+	}
+
+	/* Did we boot using an initrd? */
+	prop = fdt_getprop(fdt, chosen_node, "linux,initrd-start", NULL);
+	if (prop) {
+		uint64_t tmp_start, tmp_end, tmp_size;
+
+		tmp_start = fdt64_to_cpu(*((const fdt64_t *) prop));
+
+		prop = fdt_getprop(fdt, chosen_node, "linux,initrd-end", NULL);
+		if (!prop) {
+			pr_err("Malformed device tree.\n");
+			return -EINVAL;
+		}
+		tmp_end = fdt64_to_cpu(*((const fdt64_t *) prop));
+
+		/*
+		 * kexec reserves exact initrd size, while firmware may
+		 * reserve a multiple of PAGE_SIZE, so check for both.
+		 */
+		tmp_size = tmp_end - tmp_start;
+		ret = delete_fdt_mem_rsv(fdt, tmp_start, tmp_size);
+		if (ret == -ENOENT)
+			ret = delete_fdt_mem_rsv(fdt, tmp_start,
+						 round_up(tmp_size, PAGE_SIZE));
+		if (ret == 0)
+			pr_debug("Removed old initrd reservation.\n");
+		else if (ret != -ENOENT)
+			return ret;
+
+		/* If there's no new initrd, delete the old initrd's info. */
+		if (initrd_len == 0) {
+			ret = fdt_delprop(fdt, chosen_node,
+					  "linux,initrd-start");
+			if (ret) {
+				pr_err("Error deleting linux,initrd-start.\n");
+				return -EINVAL;
+			}
+
+			ret = fdt_delprop(fdt, chosen_node, "linux,initrd-end");
+			if (ret) {
+				pr_err("Error deleting linux,initrd-end.\n");
+				return -EINVAL;
+			}
+		}
+	}
+
+	if (initrd_len) {
+		ret = fdt_setprop_u64(fdt, chosen_node,
+				      "linux,initrd-start",
+				      initrd_load_addr);
+		if (ret < 0) {
+			pr_err("Error setting up the new device tree.\n");
+			return -EINVAL;
+		}
+
+		/* initrd-end is the first address after the initrd image. */
+		ret = fdt_setprop_u64(fdt, chosen_node, "linux,initrd-end",
+				      initrd_load_addr + initrd_len);
+		if (ret < 0) {
+			pr_err("Error setting up the new device tree.\n");
+			return -EINVAL;
+		}
+
+		ret = fdt_add_mem_rsv(fdt, initrd_load_addr, initrd_len);
+		if (ret) {
+			pr_err("Error reserving initrd memory: %s\n",
+			       fdt_strerror(ret));
+			return -EINVAL;
+		}
+	}
+
+	if (cmdline != NULL) {
+		ret = fdt_setprop_string(fdt, chosen_node, "bootargs", cmdline);
+		if (ret < 0) {
+			pr_err("Error setting up the new device tree.\n");
+			return -EINVAL;
+		}
+	} else {
+		ret = fdt_delprop(fdt, chosen_node, "bootargs");
+		if (ret && ret != -FDT_ERR_NOTFOUND) {
+			pr_err("Error deleting bootargs.\n");
+			return -EINVAL;
+		}
+	}
+
+	ret = fdt_setprop(fdt, chosen_node, "linux,booted-from-kexec", NULL, 0);
+	if (ret) {
+		pr_err("Error setting up the new device tree.\n");
+		return -EINVAL;
+	}
+
+	return 0;
+}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v10 09/10] powerpc: Add purgatory for kexec_file_load implementation.
  2016-11-10  3:27 [PATCH v10 00/10] kexec_file_load implementation for PowerPC Thiago Jung Bauermann
                   ` (7 preceding siblings ...)
  2016-11-10  3:27 ` [PATCH v10 08/10] powerpc: Add support for loading ELF kernels with kexec_file_load Thiago Jung Bauermann
@ 2016-11-10  3:27 ` Thiago Jung Bauermann
  2016-11-10  3:27 ` [PATCH v10 10/10] powerpc: Enable CONFIG_KEXEC_FILE in powerpc server defconfigs Thiago Jung Bauermann
  9 siblings, 0 replies; 20+ messages in thread
From: Thiago Jung Bauermann @ 2016-11-10  3:27 UTC (permalink / raw)
  To: kexec
  Cc: linuxppc-dev, linux-kernel, x86, Eric Biederman, Dave Young,
	Vivek Goyal, Baoquan He, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, Stewart Smith,
	Mimi Zohar, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Andrew Morton, Stephen Rothwell, Thiago Jung Bauermann

This purgatory implementation comes from kexec-tools and was trimmed
down a bit.

It uses the memset, memcpy and memcmp implementations from lib/string.c.
It's not straightforward to #include "lib/string.c" so we simply
copy those functions.

The changes made to the purgatory code relative to the version in
kexec-tools were:

The support for printing messages to the console was removed.

Also, since we don't support loading a crashdump kernel via
kexec_file_load yet, the code related to that functionality has been
removed for now.

The sha256_regions global variable was renamed to sha_regions to match
what kexec_file_load expects, and to use the sha256.c file from x86's
purgatory (this avoids adding yet another SHA-256 implementation).

The global variables in purgatory.c and purgatory-ppc64.c now use a
__section attribute to put them in the .data section instead of being
initialized to zero. It doesn't matter what their initial value is,
because they will be set by the kernel when preparing the kexec image.

Finally, some checkpatch.pl warnings were fixed.

Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
---
 arch/powerpc/Makefile                          |   1 +
 arch/powerpc/purgatory/.gitignore              |   2 +
 arch/powerpc/purgatory/Makefile                |  40 ++++++++
 arch/powerpc/purgatory/crtsavres.S             |   5 +
 arch/powerpc/purgatory/kexec-sha256.h          |  11 +++
 arch/powerpc/purgatory/purgatory-ppc64.c       |  36 +++++++
 arch/powerpc/purgatory/purgatory.c             |  48 +++++++++
 arch/powerpc/purgatory/purgatory.h             |  10 ++
 arch/powerpc/purgatory/sha256.c                |   6 ++
 arch/powerpc/purgatory/sha256.h                |   1 +
 arch/powerpc/purgatory/string.c                |  60 +++++++++++
 arch/powerpc/purgatory/v2wrap.S                | 132 +++++++++++++++++++++++++
 arch/powerpc/scripts/check-purgatory-relocs.sh |  47 +++++++++
 13 files changed, 399 insertions(+)

diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
index 617dece67924..5e7dcdaf93f5 100644
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -249,6 +249,7 @@ core-y				+= arch/powerpc/kernel/ \
 core-$(CONFIG_XMON)		+= arch/powerpc/xmon/
 core-$(CONFIG_KVM) 		+= arch/powerpc/kvm/
 core-$(CONFIG_PERF_EVENTS)	+= arch/powerpc/perf/
+core-$(CONFIG_KEXEC_FILE)	+= arch/powerpc/purgatory/
 
 drivers-$(CONFIG_OPROFILE)	+= arch/powerpc/oprofile/
 
diff --git a/arch/powerpc/purgatory/.gitignore b/arch/powerpc/purgatory/.gitignore
new file mode 100644
index 000000000000..e9e66f178a6d
--- /dev/null
+++ b/arch/powerpc/purgatory/.gitignore
@@ -0,0 +1,2 @@
+kexec-purgatory.c
+purgatory.ro
diff --git a/arch/powerpc/purgatory/Makefile b/arch/powerpc/purgatory/Makefile
new file mode 100644
index 000000000000..2dfb53ac9944
--- /dev/null
+++ b/arch/powerpc/purgatory/Makefile
@@ -0,0 +1,40 @@
+OBJECT_FILES_NON_STANDARD := y
+
+purgatory-y := purgatory.o string.o v2wrap.o purgatory-ppc64.o crtsavres.o \
+		sha256.o
+
+targets += $(purgatory-y)
+PURGATORY_OBJS = $(addprefix $(obj)/,$(purgatory-y))
+
+LDFLAGS_purgatory.ro := -pie --no-dynamic-linker -e purgatory_start \
+			--no-undefined -nostartfiles -nostdlib -nodefaultlibs
+targets += purgatory.ro
+
+KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_FTRACE), $(KBUILD_CFLAGS))
+
+KBUILD_CFLAGS += -fno-zero-initialized-in-bss -fno-builtin -ffreestanding \
+		 -fno-stack-protector -fno-exceptions -fpie
+KBUILD_AFLAGS += -fno-exceptions -msoft-float -fpie
+
+$(obj)/purgatory.ro: $(PURGATORY_OBJS) FORCE
+		$(call if_changed,ld)
+
+targets += kexec-purgatory.c
+
+quiet_cmd_relocs_check = CALL    $<
+      cmd_relocs_check = $(CONFIG_SHELL) $< "$(OBJDUMP)" "$(obj)/purgatory.ro"
+
+PHONY += relocs_check
+relocs_check: arch/powerpc/scripts/check-purgatory-relocs.sh $(obj)/purgatory.ro
+	$(call cmd,relocs_check)
+
+CMD_BIN2C = $(objtree)/scripts/basic/bin2c
+quiet_cmd_bin2c = BIN2C   $@
+      cmd_bin2c = $(CMD_BIN2C) kexec_purgatory < $< > $@
+
+$(obj)/kexec-purgatory.c: $(obj)/purgatory.ro relocs_check FORCE
+	$(call if_changed,bin2c)
+	@:
+
+
+obj-$(CONFIG_KEXEC_FILE)	+= kexec-purgatory.o
diff --git a/arch/powerpc/purgatory/crtsavres.S b/arch/powerpc/purgatory/crtsavres.S
new file mode 100644
index 000000000000..5d17e1c0d575
--- /dev/null
+++ b/arch/powerpc/purgatory/crtsavres.S
@@ -0,0 +1,5 @@
+#ifndef CONFIG_CC_OPTIMIZE_FOR_SIZE
+#define CONFIG_CC_OPTIMIZE_FOR_SIZE 1
+#endif
+
+#include "../lib/crtsavres.S"
diff --git a/arch/powerpc/purgatory/kexec-sha256.h b/arch/powerpc/purgatory/kexec-sha256.h
new file mode 100644
index 000000000000..4418ed02c052
--- /dev/null
+++ b/arch/powerpc/purgatory/kexec-sha256.h
@@ -0,0 +1,11 @@
+#ifndef KEXEC_SHA256_H
+#define KEXEC_SHA256_H
+
+struct kexec_sha_region {
+	unsigned long start;
+	unsigned long len;
+};
+
+#define SHA256_REGIONS 16
+
+#endif /* KEXEC_SHA256_H */
diff --git a/arch/powerpc/purgatory/purgatory-ppc64.c b/arch/powerpc/purgatory/purgatory-ppc64.c
new file mode 100644
index 000000000000..05f3216c3ded
--- /dev/null
+++ b/arch/powerpc/purgatory/purgatory-ppc64.c
@@ -0,0 +1,36 @@
+/*
+ * kexec: Linux boots Linux
+ *
+ * Created by: Mohan Kumar M (mohan@in.ibm.com)
+ *
+ * Copyright (C) IBM Corporation, 2005. All rights reserved
+ *
+ * Code taken from kexec-tools.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation (version 2 of the License).
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/compiler.h>
+#include "purgatory.h"
+
+unsigned long stack __section(".data");
+unsigned long dt_offset __section(".data");
+unsigned long my_toc __section(".data");
+unsigned long kernel __section(".data");
+unsigned long opal_base __section(".data");
+unsigned long opal_entry __section(".data");
+
+void setup_arch(void)
+{
+}
+
+void post_verification_setup_arch(void)
+{
+}
diff --git a/arch/powerpc/purgatory/purgatory.c b/arch/powerpc/purgatory/purgatory.c
new file mode 100644
index 000000000000..17ff279b463e
--- /dev/null
+++ b/arch/powerpc/purgatory/purgatory.c
@@ -0,0 +1,48 @@
+/*
+ * Code taken from kexec-tools.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation (version 2 of the License).
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include "purgatory.h"
+#include "sha256.h"
+#include "kexec-sha256.h"
+
+struct kexec_sha_region sha_regions[SHA256_REGIONS] __section(".data");
+u8 sha256_digest[SHA256_DIGEST_SIZE] __section(".data");
+
+int verify_sha256_digest(void)
+{
+	struct kexec_sha_region *ptr, *end;
+	u8 digest[SHA256_DIGEST_SIZE];
+	struct sha256_state sctx;
+
+	sha256_init(&sctx);
+	end = &sha_regions[sizeof(sha_regions)/sizeof(sha_regions[0])];
+	for (ptr = sha_regions; ptr < end; ptr++)
+		sha256_update(&sctx, (uint8_t *)(ptr->start), ptr->len);
+	sha256_final(&sctx, digest);
+
+	if (memcmp(digest, sha256_digest, sizeof(digest)))
+		return 1;
+
+	return 0;
+}
+
+void purgatory(void)
+{
+	setup_arch();
+	if (verify_sha256_digest()) {
+		/* loop forever */
+		for (;;)
+			;
+	}
+	post_verification_setup_arch();
+}
diff --git a/arch/powerpc/purgatory/purgatory.h b/arch/powerpc/purgatory/purgatory.h
new file mode 100644
index 000000000000..5a31628b9b99
--- /dev/null
+++ b/arch/powerpc/purgatory/purgatory.h
@@ -0,0 +1,10 @@
+#ifndef PURGATORY_H
+#define PURGATORY_H
+
+#include <linux/types.h>
+
+int memcmp(const void *cs, const void *ct, size_t count);
+void setup_arch(void);
+void post_verification_setup_arch(void);
+
+#endif /* PURGATORY_H */
diff --git a/arch/powerpc/purgatory/sha256.c b/arch/powerpc/purgatory/sha256.c
new file mode 100644
index 000000000000..6abee1877d56
--- /dev/null
+++ b/arch/powerpc/purgatory/sha256.c
@@ -0,0 +1,6 @@
+#include "../boot/string.h"
+
+/* Avoid including x86's boot/string.h in sha256.c. */
+#define BOOT_STRING_H
+
+#include "../../x86/purgatory/sha256.c"
diff --git a/arch/powerpc/purgatory/sha256.h b/arch/powerpc/purgatory/sha256.h
new file mode 100644
index 000000000000..72818f3a207e
--- /dev/null
+++ b/arch/powerpc/purgatory/sha256.h
@@ -0,0 +1 @@
+#include "../../x86/purgatory/sha256.h"
diff --git a/arch/powerpc/purgatory/string.c b/arch/powerpc/purgatory/string.c
new file mode 100644
index 000000000000..d31263637fbc
--- /dev/null
+++ b/arch/powerpc/purgatory/string.c
@@ -0,0 +1,60 @@
+/*
+ * Copied from linux/lib/string.c.
+ */
+
+#include <linux/types.h>
+
+/**
+ * memset - Fill a region of memory with the given value
+ * @s: Pointer to the start of the area.
+ * @c: The byte to fill the area with
+ * @count: The size of the area.
+ *
+ * Do not use memset() to access IO space, use memset_io() instead.
+ */
+void *memset(void *s, int c, size_t count)
+{
+	char *xs = s;
+
+	while (count--)
+		*xs++ = c;
+	return s;
+}
+
+/**
+ * memcpy - Copy one area of memory to another
+ * @dest: Where to copy to
+ * @src: Where to copy from
+ * @count: The size of the area.
+ *
+ * You should not use this function to access IO space, use memcpy_toio()
+ * or memcpy_fromio() instead.
+ */
+void *memcpy(void *dest, const void *src, size_t count)
+{
+	char *tmp = dest;
+	const char *s = src;
+
+	while (count--)
+		*tmp++ = *s++;
+	return dest;
+}
+
+/**
+ * memcmp - Compare two areas of memory
+ * @cs: One area of memory
+ * @ct: Another area of memory
+ * @count: The size of the area.
+ */
+int memcmp(const void *cs, const void *ct, size_t count)
+{
+	const unsigned char *su1, *su2;
+	int res = 0;
+
+	for (su1 = cs, su2 = ct; count > 0; ++su1, ++su2, count--) {
+		res = *su1 - *su2;
+		if (res != 0)
+			break;
+	}
+	return res;
+}
diff --git a/arch/powerpc/purgatory/v2wrap.S b/arch/powerpc/purgatory/v2wrap.S
new file mode 100644
index 000000000000..a15d6fdee4db
--- /dev/null
+++ b/arch/powerpc/purgatory/v2wrap.S
@@ -0,0 +1,132 @@
+#
+#  kexec: Linux boots Linux
+#
+#  Copyright (C) 2004 - 2005, Milton D Miller II, IBM Corporation
+#  Copyright (C) 2006, Mohan Kumar M (mohan@in.ibm.com), IBM Corporation
+#
+# Code taken from kexec-tools.
+#
+#  This program is free software; you can redistribute it and/or modify
+#  it under the terms of the GNU General Public License as published by
+#  the Free Software Foundation (version 2 of the License).
+#
+#  This program is distributed in the hope that it will be useful,
+#  but WITHOUT ANY WARRANTY; without even the implied warranty of
+#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+#  GNU General Public License for more details.
+
+# v2wrap.S
+# a wrapper to call purgatory code to backup first
+# 32kB of first kernel into the backup region
+# reserved by kexec-tools.
+# Invokes ppc64 kernel with the expected arguments
+# of kernel(device-tree, phys-offset, 0)
+
+#
+# calling convention:
+#   r3 = physical number of this cpu (all cpus)
+#   r4 = address of this chunk (master only)
+# master enters at purgatory_start (aka first byte of this chunk)
+# slaves (additional cpus), if any, enter a copy of the
+# first 0x100 bytes of this code relocated to 0x0
+#
+# in other words,
+#   a copy of the first 0x100 bytes of this code is copied to 0
+#   and the slaves are sent to address 0x60
+#   with r3 = their physical cpu number.
+
+#define LOADADDR(rn,name) \
+	lis     rn,name##@highest;      \
+	ori     rn,rn,name##@higher;    \
+	rldicr  rn,rn,32,31;            \
+	oris    rn,rn,name##@h;         \
+	ori     rn,rn,name##@l
+
+	.machine ppc64
+	.align 8
+	.globl purgatory_start
+purgatory_start:	b	master
+	.org purgatory_start + 0x5c     # ABI: possible run_at_load flag at 0x5c
+	.globl run_at_load
+run_at_load:
+	.long 0
+	.size run_at_load, . - run_at_load
+	.org purgatory_start + 0x60     # ABI: slaves start at 60 with r3=phys
+slave:	b $
+	.org purgatory_start + 0x100    # ABI: end of copied region
+	.size purgatory_start, . - purgatory_start
+
+#
+# The above 0x100 bytes at purgatory_start are replaced with the
+# code from the kernel (or next stage) by kexec/arch/ppc64/kexec-elf-ppc64.c
+#
+
+master:
+	or	1,1,1		# low priority to let other threads catchup
+	isync
+	mr      17,3            # save cpu id to r17
+	mr      15,4            # save physical address in reg15
+
+	LOADADDR(6,my_toc)
+	ld      2,0(6)          #setup toc
+
+	LOADADDR(6,stack)
+	ld      1,0(6)          #setup stack
+
+	subi    1,1,112
+	bl      purgatory
+	nop
+
+	or	3,3,3		# ok now to high priority, lets boot
+	lis	6,0x1
+	mtctr	6		# delay a bit for slaves to catch up
+83:	bdnz	83b		# before we overwrite 0-100 again
+
+	LOADADDR(16, dt_offset)
+	ld      3,0(16)         # load device-tree address
+	mr      16,3            # save dt address in reg16
+#ifdef __BIG_ENDIAN__
+	lwz     6,20(3)         # fetch version number
+#else
+	li	4,20
+	lwbrx	6,3,4		# fetch BE version number
+#endif
+	cmpwi   0,6,2           # v2 ?
+	blt     80f
+#ifdef __BIG_ENDIAN__
+	stw     17,28(3)        # save my cpu number as boot_cpu_phys
+#else
+	li	4,28
+	stwbrx	17,3,4		# Store my cpu as BE value
+#endif
+80:
+	LOADADDR(6,opal_base)	# For OPAL early debug
+	ld      8,0(6)          # load the OPAL base address in r8
+	LOADADDR(6,opal_entry)	# For OPAL early debug
+	ld      9,0(6)          # load the OPAL entry address in r9
+	LOADADDR(6,kernel)
+	ld      4,0(6)          # load the kernel address
+	LOADADDR(6,run_at_load) # the load flag
+	lwz	7,0(6)		# possibly patched by kexec-elf-ppc64
+	stw	7,0x5c(4)	# and patch it into the kernel
+	mr      3,16            # restore dt address
+
+	mfmsr	5
+	andi.	10,5,1		# test MSR_LE
+	bne	little_endian
+
+	li	5,0		# r5 will be 0 for kernel
+	mtctr	4		# prepare branch to
+	bctr			# start kernel
+
+little_endian:			# book3s-only
+	mtsrr0	4		# prepare branch to
+
+	clrrdi	5,5,1		# clear MSR_LE
+	mtsrr1	5
+
+	li	5,0		# r5 will be 0 for kernel
+
+				# skip cache flush, do we care?
+
+	rfid			# update MSR and start kernel
diff --git a/arch/powerpc/scripts/check-purgatory-relocs.sh b/arch/powerpc/scripts/check-purgatory-relocs.sh
new file mode 100755
index 000000000000..c3b86e04eb4b
--- /dev/null
+++ b/arch/powerpc/scripts/check-purgatory-relocs.sh
@@ -0,0 +1,47 @@
+#!/bin/sh
+
+# Copyright © 2016 IBM Corporation
+
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License
+# as published by the Free Software Foundation; either version
+# 2 of the License, or (at your option) any later version.
+
+# This script checks the relocations of a purgatory executable for
+# relocations that are not handled by the kernel.
+
+# Based on relocs_check.sh.
+# Copyright © 2015 IBM Corporation
+
+if [ $# -lt 2 ]; then
+	echo "$0 [path to objdump] [path to purgatory.ro]" 1>&2
+	exit 1
+fi
+
+# Have Kbuild supply the path to objdump so we handle cross compilation.
+objdump="$1"
+purgatory="$2"
+
+bad_relocs=$(
+"$objdump" -R "$purgatory" |
+	# Only look at relocation lines.
+	grep -E '\<R_' |
+	# These relocations are okay
+	# On PPC64:
+	#	R_PPC64_ADDR16_LO, R_PPC64_ADDR16_HI,
+	#	R_PPC64_ADDR16_HIGHER, R_PPC64_ADDR16_HIGHEST,
+	#	R_PPC64_RELATIVE
+	grep -F -w -v 'R_PPC64_ADDR16_LO
+R_PPC64_ADDR16_HI
+R_PPC64_ADDR16_HIGHER
+R_PPC64_ADDR16_HIGHEST
+R_PPC64_RELATIVE'
+)
+
+if [ -z "$bad_relocs" ]; then
+	exit 0
+fi
+
+num_bad=$(echo "$bad_relocs" | wc -l)
+echo "WARNING: $num_bad bad relocations in $2"
+echo "$bad_relocs"
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v10 10/10] powerpc: Enable CONFIG_KEXEC_FILE in powerpc server defconfigs.
  2016-11-10  3:27 [PATCH v10 00/10] kexec_file_load implementation for PowerPC Thiago Jung Bauermann
                   ` (8 preceding siblings ...)
  2016-11-10  3:27 ` [PATCH v10 09/10] powerpc: Add purgatory for kexec_file_load implementation Thiago Jung Bauermann
@ 2016-11-10  3:27 ` Thiago Jung Bauermann
  9 siblings, 0 replies; 20+ messages in thread
From: Thiago Jung Bauermann @ 2016-11-10  3:27 UTC (permalink / raw)
  To: kexec
  Cc: linuxppc-dev, linux-kernel, x86, Eric Biederman, Dave Young,
	Vivek Goyal, Baoquan He, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, Stewart Smith,
	Mimi Zohar, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Andrew Morton, Stephen Rothwell, Thiago Jung Bauermann

Enable CONFIG_KEXEC_FILE in powernv_defconfig, ppc64_defconfig and
pseries_defconfig.

It depends on CONFIG_CRYPTO_SHA256=y, so add that as well.

Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
---
 arch/powerpc/configs/powernv_defconfig | 2 ++
 arch/powerpc/configs/ppc64_defconfig   | 2 ++
 arch/powerpc/configs/pseries_defconfig | 2 ++
 3 files changed, 6 insertions(+)

diff --git a/arch/powerpc/configs/powernv_defconfig b/arch/powerpc/configs/powernv_defconfig
index d98b6eb3254f..5a190aa5534b 100644
--- a/arch/powerpc/configs/powernv_defconfig
+++ b/arch/powerpc/configs/powernv_defconfig
@@ -49,6 +49,7 @@ CONFIG_BINFMT_MISC=m
 CONFIG_PPC_TRANSACTIONAL_MEM=y
 CONFIG_HOTPLUG_CPU=y
 CONFIG_KEXEC=y
+CONFIG_KEXEC_FILE=y
 CONFIG_IRQ_ALL_CPUS=y
 CONFIG_NUMA=y
 CONFIG_MEMORY_HOTPLUG=y
@@ -301,6 +302,7 @@ CONFIG_CRYPTO_CCM=m
 CONFIG_CRYPTO_PCBC=m
 CONFIG_CRYPTO_HMAC=y
 CONFIG_CRYPTO_MICHAEL_MIC=m
+CONFIG_CRYPTO_SHA256=y
 CONFIG_CRYPTO_TGR192=m
 CONFIG_CRYPTO_WP512=m
 CONFIG_CRYPTO_ANUBIS=m
diff --git a/arch/powerpc/configs/ppc64_defconfig b/arch/powerpc/configs/ppc64_defconfig
index 58a98d40086f..0059d2088b9c 100644
--- a/arch/powerpc/configs/ppc64_defconfig
+++ b/arch/powerpc/configs/ppc64_defconfig
@@ -46,6 +46,7 @@ CONFIG_HZ_100=y
 CONFIG_BINFMT_MISC=m
 CONFIG_PPC_TRANSACTIONAL_MEM=y
 CONFIG_KEXEC=y
+CONFIG_KEXEC_FILE=y
 CONFIG_CRASH_DUMP=y
 CONFIG_IRQ_ALL_CPUS=y
 CONFIG_MEMORY_HOTREMOVE=y
@@ -336,6 +337,7 @@ CONFIG_CRYPTO_TEST=m
 CONFIG_CRYPTO_PCBC=m
 CONFIG_CRYPTO_HMAC=y
 CONFIG_CRYPTO_MICHAEL_MIC=m
+CONFIG_CRYPTO_SHA256=y
 CONFIG_CRYPTO_TGR192=m
 CONFIG_CRYPTO_WP512=m
 CONFIG_CRYPTO_ANUBIS=m
diff --git a/arch/powerpc/configs/pseries_defconfig b/arch/powerpc/configs/pseries_defconfig
index 8a3bc016b732..f022f657a984 100644
--- a/arch/powerpc/configs/pseries_defconfig
+++ b/arch/powerpc/configs/pseries_defconfig
@@ -52,6 +52,7 @@ CONFIG_HZ_100=y
 CONFIG_BINFMT_MISC=m
 CONFIG_PPC_TRANSACTIONAL_MEM=y
 CONFIG_KEXEC=y
+CONFIG_KEXEC_FILE=y
 CONFIG_IRQ_ALL_CPUS=y
 CONFIG_MEMORY_HOTPLUG=y
 CONFIG_MEMORY_HOTREMOVE=y
@@ -303,6 +304,7 @@ CONFIG_CRYPTO_TEST=m
 CONFIG_CRYPTO_PCBC=m
 CONFIG_CRYPTO_HMAC=y
 CONFIG_CRYPTO_MICHAEL_MIC=m
+CONFIG_CRYPTO_SHA256=y
 CONFIG_CRYPTO_TGR192=m
 CONFIG_CRYPTO_WP512=m
 CONFIG_CRYPTO_ANUBIS=m
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v10 04/10] kexec_file: Add support for purgatory built as PIE.
  2016-11-10  3:27 ` [PATCH v10 04/10] kexec_file: Add support for purgatory built as PIE Thiago Jung Bauermann
@ 2016-11-20  2:45   ` Dave Young
  2016-11-21 23:49     ` Thiago Jung Bauermann
  0 siblings, 1 reply; 20+ messages in thread
From: Dave Young @ 2016-11-20  2:45 UTC (permalink / raw)
  To: Thiago Jung Bauermann
  Cc: kexec, linuxppc-dev, linux-kernel, x86, Eric Biederman,
	Vivek Goyal, Baoquan He, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, Stewart Smith,
	Mimi Zohar, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Andrew Morton, Stephen Rothwell

On 11/10/16 at 01:27am, Thiago Jung Bauermann wrote:
> powerpc's purgatory.ro has 12 relocation types when built as
> a relocatable object. To implement support for them requires
> arch_kexec_apply_relocations_add to duplicate a lot of code with
> module_64.c:apply_relocate_add.
> 
> When built as a Position Independent Executable there are only 4
> relocation types in purgatory.ro, so it becomes practical for the powerpc
> implementation of kexec_file to have its own relocation implementation.
> 
> Also, the purgatory is an executable and not an intermediary output from
> the compiler so it makes sense conceptually that it is easier to build
> it as a PIE than as a partially linked object.
> 
> Apart from the greatly reduced number of relocations, there are two
> differences between a relocatable object and a PIE:
> 
> 1. __kexec_load_purgatory needs to use the program headers rather than the
>    section headers to figure out how to load the binary.
> 2. Symbol values are absolute addresses instead of relative to the
>    start of the section.
> 
> This patch adds the support needed in generic code for the differences
> above and allows powerpc to load and relocate a position independent
> purgatory.
> 

[snip]

The kexec-tools machine_apply_elf_rel is pretty simple for ppc64, it is
not that complex. So could you look into simplify your kexec_file
implementation?

kernel/kexec_file.c kexec_apply_relocations only do limited things
and some of the logic is in arch/x86, so move general code out of arch
code, then I guess the arch code will be simpler and then we probably
do not need this PIE stuff anymore.

BTW, __kexec_really_load_purgatory looks worse than
___kexec_load_purgatory ;)

Thanks
Dave

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v10 04/10] kexec_file: Add support for purgatory built as PIE.
  2016-11-20  2:45   ` Dave Young
@ 2016-11-21 23:49     ` Thiago Jung Bauermann
  2016-11-22  1:32       ` Dave Young
                         ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Thiago Jung Bauermann @ 2016-11-21 23:49 UTC (permalink / raw)
  To: Dave Young
  Cc: kexec, linuxppc-dev, linux-kernel, x86, Eric Biederman,
	Vivek Goyal, Baoquan He, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, Stewart Smith,
	Mimi Zohar, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Andrew Morton, Stephen Rothwell

Hello Dave,

Thanks for your review.

Am Sonntag, 20. November 2016, 10:45:46 BRST schrieb Dave Young:
> On 11/10/16 at 01:27am, Thiago Jung Bauermann wrote:
> > powerpc's purgatory.ro has 12 relocation types when built as
> > a relocatable object. To implement support for them requires
> > arch_kexec_apply_relocations_add to duplicate a lot of code with
> > module_64.c:apply_relocate_add.
> > 
> > When built as a Position Independent Executable there are only 4
> > relocation types in purgatory.ro, so it becomes practical for the powerpc
> > implementation of kexec_file to have its own relocation implementation.
> > 
> > Also, the purgatory is an executable and not an intermediary output from
> > the compiler so it makes sense conceptually that it is easier to build
> > it as a PIE than as a partially linked object.
> > 
> > Apart from the greatly reduced number of relocations, there are two
> > differences between a relocatable object and a PIE:
> > 
> > 1. __kexec_load_purgatory needs to use the program headers rather than the
> > 
> >    section headers to figure out how to load the binary.
> > 
> > 2. Symbol values are absolute addresses instead of relative to the
> > 
> >    start of the section.
> > 
> > This patch adds the support needed in generic code for the differences
> > above and allows powerpc to load and relocate a position independent
> > purgatory.
> 
> [snip]
> 
> The kexec-tools machine_apply_elf_rel is pretty simple for ppc64, it is
> not that complex. So could you look into simplify your kexec_file
> implementation?

I can try, but there is one fundamental issue here: powerpc position-dependent 
code relies more on relocations than x86 position-dependent code does, so 
there's a limit to how simple it can be made without switching to position-
independent code. And it will always be more involved than it is on x86.

BTW, building x86's purgatory as PIE results in it not having any relocation 
at all, so it's an advantage even in that architecture. Unfortunately, the 
machine locks up during reboot and I didn't have time to try to figure out 
what's going on.

> kernel/kexec_file.c kexec_apply_relocations only do limited things
> and some of the logic is in arch/x86, so move general code out of arch
> code, then I guess the arch code will be simpler

I agree that is a good idea. Is the patch below what you had in mind?

> and then we probably do not need this PIE stuff anymore.

If you are ok with the patch below I can post a new version of the series 
based on it and we can see if Michael Ellerman thinks it is enough.

> BTW, __kexec_really_load_purgatory looks worse than
> ___kexec_load_purgatory ;)

Really? I find the special handling of bss makes the section-based loader a 
bit more confusing.

-- 
Thiago Jung Bauermann
IBM Linux Technology Center


Subject: [PATCH] kexec_file: Move generic relocation code from arch/x86 to
 kernel/kexec_file.c

The check for undefined symbols stays in arch-specific code because
powerpc needs to allow TOC symbols to be processed even though they're
undefined.

There is no functional change.

Suggested-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
---
 arch/x86/kernel/machine_kexec_64.c | 160 +++++++------------------------------
 include/linux/kexec.h              |   9 ++-
 kernel/kexec_file.c                | 120 +++++++++++++++++++++++++++-
 3 files changed, 154 insertions(+), 135 deletions(-)

diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index 8c1f218926d7..f4860c408ece 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -401,143 +401,45 @@ int arch_kexec_kernel_verify_sig(struct kimage *image, void *kernel,
 }
 #endif
 
-/*
- * Apply purgatory relocations.
- *
- * ehdr: Pointer to elf headers
- * sechdrs: Pointer to section headers.
- * relsec: section index of SHT_RELA section.
- *
- * TODO: Some of the code belongs to generic code. Move that in kexec.c.
- */
-int arch_kexec_apply_relocations_add(const Elf64_Ehdr *ehdr,
-				     Elf64_Shdr *sechdrs, unsigned int relsec)
+int arch_kexec_apply_relocation_add(const Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
+				    unsigned int reltype, Elf_Sym *sym,
+				    const char *name, unsigned long *location,
+				    unsigned long address, unsigned long value)
 {
-	unsigned int i;
-	Elf64_Rela *rel;
-	Elf64_Sym *sym;
-	void *location;
-	Elf64_Shdr *section, *symtabsec;
-	unsigned long address, sec_base, value;
-	const char *strtab, *name, *shstrtab;
-
-	/*
-	 * ->sh_offset has been modified to keep the pointer to section
-	 * contents in memory
-	 */
-	rel = (void *)sechdrs[relsec].sh_offset;
-
-	/* Section to which relocations apply */
-	section = &sechdrs[sechdrs[relsec].sh_info];
-
-	pr_debug("Applying relocate section %u to %u\n", relsec,
-		 sechdrs[relsec].sh_info);
-
-	/* Associated symbol table */
-	symtabsec = &sechdrs[sechdrs[relsec].sh_link];
-
-	/* String table */
-	if (symtabsec->sh_link >= ehdr->e_shnum) {
-		/* Invalid strtab section number */
-		pr_err("Invalid string table section index %d\n",
-		       symtabsec->sh_link);
+	if (sym->st_shndx == SHN_UNDEF) {
+		pr_err("Undefined symbol: %s\n", name);
 		return -ENOEXEC;
 	}
 
-	strtab = (char *)sechdrs[symtabsec->sh_link].sh_offset;
-
-	/* section header string table */
-	shstrtab = (char *)sechdrs[ehdr->e_shstrndx].sh_offset;
-
-	for (i = 0; i < sechdrs[relsec].sh_size / sizeof(*rel); i++) {
-
-		/*
-		 * rel[i].r_offset contains byte offset from beginning
-		 * of section to the storage unit affected.
-		 *
-		 * This is location to update (->sh_offset). This is temporary
-		 * buffer where section is currently loaded. This will finally
-		 * be loaded to a different address later, pointed to by
-		 * ->sh_addr. kexec takes care of moving it
-		 *  (kexec_load_segment()).
-		 */
-		location = (void *)(section->sh_offset + rel[i].r_offset);
-
-		/* Final address of the location */
-		address = section->sh_addr + rel[i].r_offset;
-
-		/*
-		 * rel[i].r_info contains information about symbol table index
-		 * w.r.t which relocation must be made and type of relocation
-		 * to apply. ELF64_R_SYM() and ELF64_R_TYPE() macros get
-		 * these respectively.
-		 */
-		sym = (Elf64_Sym *)symtabsec->sh_offset +
-				ELF64_R_SYM(rel[i].r_info);
-
-		if (sym->st_name)
-			name = strtab + sym->st_name;
-		else
-			name = shstrtab + sechdrs[sym->st_shndx].sh_name;
-
-		pr_debug("Symbol: %s info: %02x shndx: %02x value=%llx size: %llx\n",
-			 name, sym->st_info, sym->st_shndx, sym->st_value,
-			 sym->st_size);
-
-		if (sym->st_shndx == SHN_UNDEF) {
-			pr_err("Undefined symbol: %s\n", name);
-			return -ENOEXEC;
-		}
-
-		if (sym->st_shndx == SHN_COMMON) {
-			pr_err("symbol '%s' in common section\n", name);
-			return -ENOEXEC;
-		}
-
-		if (sym->st_shndx == SHN_ABS)
-			sec_base = 0;
-		else if (sym->st_shndx >= ehdr->e_shnum) {
-			pr_err("Invalid section %d for symbol %s\n",
-			       sym->st_shndx, name);
-			return -ENOEXEC;
-		} else
-			sec_base = sechdrs[sym->st_shndx].sh_addr;
-
-		value = sym->st_value;
-		value += sec_base;
-		value += rel[i].r_addend;
-
-		switch (ELF64_R_TYPE(rel[i].r_info)) {
-		case R_X86_64_NONE:
-			break;
-		case R_X86_64_64:
-			*(u64 *)location = value;
-			break;
-		case R_X86_64_32:
-			*(u32 *)location = value;
-			if (value != *(u32 *)location)
-				goto overflow;
-			break;
-		case R_X86_64_32S:
-			*(s32 *)location = value;
-			if ((s64)value != *(s32 *)location)
-				goto overflow;
-			break;
-		case R_X86_64_PC32:
-			value -= (u64)address;
-			*(u32 *)location = value;
-			break;
-		default:
-			pr_err("Unknown rela relocation: %llu\n",
-			       ELF64_R_TYPE(rel[i].r_info));
-			return -ENOEXEC;
-		}
+	switch (reltype) {
+	case R_X86_64_NONE:
+		break;
+	case R_X86_64_64:
+		*(u64 *)location = value;
+		break;
+	case R_X86_64_32:
+		*(u32 *)location = value;
+		if (value != *(u32 *)location)
+			goto overflow;
+		break;
+	case R_X86_64_32S:
+		*(s32 *)location = value;
+		if ((s64)value != *(s32 *)location)
+			goto overflow;
+		break;
+	case R_X86_64_PC32:
+		value -= (u64)address;
+		*(u32 *)location = value;
+		break;
+	default:
+		pr_err("Unknown rela relocation: %u\n", reltype);
+		return -ENOEXEC;
 	}
+
 	return 0;
 
 overflow:
-	pr_err("Overflow in relocation type %d value 0x%lx\n",
-	       (int)ELF64_R_TYPE(rel[i].r_info), value);
+	pr_err("Overflow in relocation type %u value 0x%lx\n", reltype, value);
 	return -ENOEXEC;
 }
 #endif /* CONFIG_KEXEC_FILE */
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 406c33dcae13..e171a083540d 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -320,8 +320,13 @@ void * __weak arch_kexec_kernel_image_load(struct kimage *image);
 int __weak arch_kimage_file_post_load_cleanup(struct kimage *image);
 int __weak arch_kexec_kernel_verify_sig(struct kimage *image, void *buf,
 					unsigned long buf_len);
-int __weak arch_kexec_apply_relocations_add(const Elf_Ehdr *ehdr,
-					Elf_Shdr *sechdrs, unsigned int relsec);
+int __weak arch_kexec_apply_relocation_add(const Elf_Ehdr *ehdr,
+					   Elf_Shdr *sechdrs,
+					   unsigned int reltype, Elf_Sym *sym,
+					   const char *name,
+					   unsigned long *location,
+					   unsigned long address,
+					   unsigned long value);
 int __weak arch_kexec_apply_relocations(const Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
 					unsigned int relsec);
 void arch_kexec_protect_crashkres(void);
diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index 037c321c5618..1517f977cc25 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -61,8 +61,10 @@ int __weak arch_kexec_kernel_verify_sig(struct kimage *image, void *buf,
 
 /* Apply relocations of type RELA */
 int __weak
-arch_kexec_apply_relocations_add(const Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
-				 unsigned int relsec)
+arch_kexec_apply_relocation_add(const Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
+				unsigned int reltype, Elf_Sym *sym,
+				const char *name, unsigned long *location,
+				unsigned long address, unsigned long value)
 {
 	pr_err("RELA relocation unsupported.\n");
 	return -ENOEXEC;
@@ -793,6 +795,117 @@ static int __kexec_load_purgatory(struct kimage *image, unsigned long min,
 	return ret;
 }
 
+/**
+ * kexec_apply_relocations_add - apply purgatory relocations
+ * @ehdr:	Pointer to elf headers
+ * @sechdrs:	Pointer to section headers.
+ * @relsec:	Section index of SHT_RELA section.
+ */
+static int kexec_apply_relocations_add(const Elf64_Ehdr *ehdr,
+				       Elf64_Shdr *sechdrs, unsigned int relsec)
+{
+	int ret;
+	unsigned int i;
+	Elf64_Rela *rel;
+	Elf64_Sym *sym;
+	void *location;
+	Elf64_Shdr *section, *symtabsec;
+	unsigned long address, sec_base, value;
+	const char *strtab, *name, *shstrtab;
+
+	/*
+	 * ->sh_offset has been modified to keep the pointer to section
+	 * contents in memory
+	 */
+	rel = (void *)sechdrs[relsec].sh_offset;
+
+	/* Section to which relocations apply */
+	section = &sechdrs[sechdrs[relsec].sh_info];
+
+	pr_debug("Applying relocate section %u to %u\n", relsec,
+		 sechdrs[relsec].sh_info);
+
+	/* Associated symbol table */
+	symtabsec = &sechdrs[sechdrs[relsec].sh_link];
+
+	/* String table */
+	if (symtabsec->sh_link >= ehdr->e_shnum) {
+		/* Invalid strtab section number */
+		pr_err("Invalid string table section index %d\n",
+		       symtabsec->sh_link);
+		return -ENOEXEC;
+	}
+
+	/* String table for the associated symbol table. */
+	strtab = (char *)sechdrs[symtabsec->sh_link].sh_offset;
+
+	/* section header string table */
+	shstrtab = (char *)sechdrs[ehdr->e_shstrndx].sh_offset;
+
+	for (i = 0; i < sechdrs[relsec].sh_size / sizeof(*rel); i++) {
+
+		/*
+		 * rel[i].r_offset contains byte offset from beginning
+		 * of section to the storage unit affected.
+		 *
+		 * This is location to update (->sh_offset). This is temporary
+		 * buffer where section is currently loaded. This will finally
+		 * be loaded to a different address later, pointed to by
+		 * ->sh_addr. kexec takes care of moving it
+		 *  (kexec_load_segment()).
+		 */
+		location = (void *)(section->sh_offset + rel[i].r_offset);
+
+		/* Final address of the location */
+		address = section->sh_addr + rel[i].r_offset;
+
+		/*
+		 * rel[i].r_info contains information about symbol table index
+		 * w.r.t which relocation must be made and type of relocation
+		 * to apply. ELF64_R_SYM() and ELF64_R_TYPE() macros get
+		 * these respectively.
+		 */
+		sym = (Elf64_Sym *)symtabsec->sh_offset +
+				ELF64_R_SYM(rel[i].r_info);
+
+		if (sym->st_name)
+			name = strtab + sym->st_name;
+		else
+			name = shstrtab + sechdrs[sym->st_shndx].sh_name;
+
+		pr_debug("Symbol: %s info: %02x shndx: %02x value=%llx size: %llx\n",
+			 name, sym->st_info, sym->st_shndx, sym->st_value,
+			 sym->st_size);
+
+		if (sym->st_shndx == SHN_COMMON) {
+			pr_err("symbol '%s' in common section\n", name);
+			return -ENOEXEC;
+		}
+
+		if (sym->st_shndx == SHN_ABS)
+			sec_base = 0;
+		else if (sym->st_shndx >= ehdr->e_shnum) {
+			pr_err("Invalid section %d for symbol %s\n",
+			       sym->st_shndx, name);
+			return -ENOEXEC;
+		} else
+			sec_base = sechdrs[sym->st_shndx].sh_addr;
+
+		value = sym->st_value;
+		value += sec_base;
+		value += rel[i].r_addend;
+
+		ret = arch_kexec_apply_relocation_add(ehdr, sechdrs,
+						      ELF64_R_TYPE(rel[i].r_info),
+						      sym, name, location,
+						      address, value);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
 static int kexec_apply_relocations(struct kimage *image)
 {
 	int i, ret;
@@ -836,8 +949,7 @@ static int kexec_apply_relocations(struct kimage *image)
 		 * relocations of type SHT_RELA/SHT_REL.
 		 */
 		if (sechdrs[i].sh_type == SHT_RELA)
-			ret = arch_kexec_apply_relocations_add(pi->ehdr,
-							       sechdrs, i);
+			ret = kexec_apply_relocations_add(pi->ehdr, sechdrs, i);
 		else if (sechdrs[i].sh_type == SHT_REL)
 			ret = arch_kexec_apply_relocations(pi->ehdr,
 							   sechdrs, i);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v10 04/10] kexec_file: Add support for purgatory built as PIE.
  2016-11-21 23:49     ` Thiago Jung Bauermann
@ 2016-11-22  1:32       ` Dave Young
  2016-11-22  6:01       ` Michael Ellerman
  2016-11-23  8:45       ` Dave Young
  2 siblings, 0 replies; 20+ messages in thread
From: Dave Young @ 2016-11-22  1:32 UTC (permalink / raw)
  To: Thiago Jung Bauermann
  Cc: kexec, linuxppc-dev, linux-kernel, x86, Eric Biederman,
	Vivek Goyal, Baoquan He, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, Stewart Smith,
	Mimi Zohar, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Andrew Morton, Stephen Rothwell

On 11/21/16 at 09:49pm, Thiago Jung Bauermann wrote:
> Hello Dave,
> 
> Thanks for your review.
> 
> Am Sonntag, 20. November 2016, 10:45:46 BRST schrieb Dave Young:
> > On 11/10/16 at 01:27am, Thiago Jung Bauermann wrote:
> > > powerpc's purgatory.ro has 12 relocation types when built as
> > > a relocatable object. To implement support for them requires
> > > arch_kexec_apply_relocations_add to duplicate a lot of code with
> > > module_64.c:apply_relocate_add.
> > > 
> > > When built as a Position Independent Executable there are only 4
> > > relocation types in purgatory.ro, so it becomes practical for the powerpc
> > > implementation of kexec_file to have its own relocation implementation.
> > > 
> > > Also, the purgatory is an executable and not an intermediary output from
> > > the compiler so it makes sense conceptually that it is easier to build
> > > it as a PIE than as a partially linked object.
> > > 
> > > Apart from the greatly reduced number of relocations, there are two
> > > differences between a relocatable object and a PIE:
> > > 
> > > 1. __kexec_load_purgatory needs to use the program headers rather than the
> > > 
> > >    section headers to figure out how to load the binary.
> > > 
> > > 2. Symbol values are absolute addresses instead of relative to the
> > > 
> > >    start of the section.
> > > 
> > > This patch adds the support needed in generic code for the differences
> > > above and allows powerpc to load and relocate a position independent
> > > purgatory.
> > 
> > [snip]
> > 
> > The kexec-tools machine_apply_elf_rel is pretty simple for ppc64, it is
> > not that complex. So could you look into simplify your kexec_file
> > implementation?
> 
> I can try, but there is one fundamental issue here: powerpc position-dependent 
> code relies more on relocations than x86 position-dependent code does, so 
> there's a limit to how simple it can be made without switching to position-
> independent code. And it will always be more involved than it is on x86.
> 
> BTW, building x86's purgatory as PIE results in it not having any relocation 
> at all, so it's an advantage even in that architecture. Unfortunately, the 
> machine locks up during reboot and I didn't have time to try to figure out 
> what's going on.
> 
> > kernel/kexec_file.c kexec_apply_relocations only do limited things
> > and some of the logic is in arch/x86, so move general code out of arch
> > code, then I guess the arch code will be simpler
> 
> I agree that is a good idea. Is the patch below what you had in mind?
> 
> > and then we probably do not need this PIE stuff anymore.
> 
> If you are ok with the patch below I can post a new version of the series 
> based on it and we can see if Michael Ellerman thinks it is enough.
> 

Will review it and do a test. Thanks. I believe this will benefit for
other arches if they want a kexec_file in the future.

> > BTW, __kexec_really_load_purgatory looks worse than
> > ___kexec_load_purgatory ;)
> 
> Really? I find the special handling of bss makes the section-based loader a 
> bit more confusing.

I'm not sure I understand above about "*bss*", personally I like
___kexec_load_purgatory more. But anyway if we move arch code as general
code then it will be not necessary anymore..

> 
> -- 
> Thiago Jung Bauermann
> IBM Linux Technology Center
> 
> 
> Subject: [PATCH] kexec_file: Move generic relocation code from arch/x86 to
>  kernel/kexec_file.c
> 
> The check for undefined symbols stays in arch-specific code because
> powerpc needs to allow TOC symbols to be processed even though they're
> undefined.
> 
> There is no functional change.
> 
> Suggested-by: Dave Young <dyoung@redhat.com>
> Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
> ---
>  arch/x86/kernel/machine_kexec_64.c | 160 +++++++------------------------------
>  include/linux/kexec.h              |   9 ++-
>  kernel/kexec_file.c                | 120 +++++++++++++++++++++++++++-
>  3 files changed, 154 insertions(+), 135 deletions(-)
> 
> diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
> index 8c1f218926d7..f4860c408ece 100644
> --- a/arch/x86/kernel/machine_kexec_64.c
> +++ b/arch/x86/kernel/machine_kexec_64.c
> @@ -401,143 +401,45 @@ int arch_kexec_kernel_verify_sig(struct kimage *image, void *kernel,
>  }
>  #endif
>  
> -/*
> - * Apply purgatory relocations.
> - *
> - * ehdr: Pointer to elf headers
> - * sechdrs: Pointer to section headers.
> - * relsec: section index of SHT_RELA section.
> - *
> - * TODO: Some of the code belongs to generic code. Move that in kexec.c.
> - */
> -int arch_kexec_apply_relocations_add(const Elf64_Ehdr *ehdr,
> -				     Elf64_Shdr *sechdrs, unsigned int relsec)
> +int arch_kexec_apply_relocation_add(const Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
> +				    unsigned int reltype, Elf_Sym *sym,
> +				    const char *name, unsigned long *location,
> +				    unsigned long address, unsigned long value)
>  {
> -	unsigned int i;
> -	Elf64_Rela *rel;
> -	Elf64_Sym *sym;
> -	void *location;
> -	Elf64_Shdr *section, *symtabsec;
> -	unsigned long address, sec_base, value;
> -	const char *strtab, *name, *shstrtab;
> -
> -	/*
> -	 * ->sh_offset has been modified to keep the pointer to section
> -	 * contents in memory
> -	 */
> -	rel = (void *)sechdrs[relsec].sh_offset;
> -
> -	/* Section to which relocations apply */
> -	section = &sechdrs[sechdrs[relsec].sh_info];
> -
> -	pr_debug("Applying relocate section %u to %u\n", relsec,
> -		 sechdrs[relsec].sh_info);
> -
> -	/* Associated symbol table */
> -	symtabsec = &sechdrs[sechdrs[relsec].sh_link];
> -
> -	/* String table */
> -	if (symtabsec->sh_link >= ehdr->e_shnum) {
> -		/* Invalid strtab section number */
> -		pr_err("Invalid string table section index %d\n",
> -		       symtabsec->sh_link);
> +	if (sym->st_shndx == SHN_UNDEF) {
> +		pr_err("Undefined symbol: %s\n", name);
>  		return -ENOEXEC;
>  	}
>  
> -	strtab = (char *)sechdrs[symtabsec->sh_link].sh_offset;
> -
> -	/* section header string table */
> -	shstrtab = (char *)sechdrs[ehdr->e_shstrndx].sh_offset;
> -
> -	for (i = 0; i < sechdrs[relsec].sh_size / sizeof(*rel); i++) {
> -
> -		/*
> -		 * rel[i].r_offset contains byte offset from beginning
> -		 * of section to the storage unit affected.
> -		 *
> -		 * This is location to update (->sh_offset). This is temporary
> -		 * buffer where section is currently loaded. This will finally
> -		 * be loaded to a different address later, pointed to by
> -		 * ->sh_addr. kexec takes care of moving it
> -		 *  (kexec_load_segment()).
> -		 */
> -		location = (void *)(section->sh_offset + rel[i].r_offset);
> -
> -		/* Final address of the location */
> -		address = section->sh_addr + rel[i].r_offset;
> -
> -		/*
> -		 * rel[i].r_info contains information about symbol table index
> -		 * w.r.t which relocation must be made and type of relocation
> -		 * to apply. ELF64_R_SYM() and ELF64_R_TYPE() macros get
> -		 * these respectively.
> -		 */
> -		sym = (Elf64_Sym *)symtabsec->sh_offset +
> -				ELF64_R_SYM(rel[i].r_info);
> -
> -		if (sym->st_name)
> -			name = strtab + sym->st_name;
> -		else
> -			name = shstrtab + sechdrs[sym->st_shndx].sh_name;
> -
> -		pr_debug("Symbol: %s info: %02x shndx: %02x value=%llx size: %llx\n",
> -			 name, sym->st_info, sym->st_shndx, sym->st_value,
> -			 sym->st_size);
> -
> -		if (sym->st_shndx == SHN_UNDEF) {
> -			pr_err("Undefined symbol: %s\n", name);
> -			return -ENOEXEC;
> -		}
> -
> -		if (sym->st_shndx == SHN_COMMON) {
> -			pr_err("symbol '%s' in common section\n", name);
> -			return -ENOEXEC;
> -		}
> -
> -		if (sym->st_shndx == SHN_ABS)
> -			sec_base = 0;
> -		else if (sym->st_shndx >= ehdr->e_shnum) {
> -			pr_err("Invalid section %d for symbol %s\n",
> -			       sym->st_shndx, name);
> -			return -ENOEXEC;
> -		} else
> -			sec_base = sechdrs[sym->st_shndx].sh_addr;
> -
> -		value = sym->st_value;
> -		value += sec_base;
> -		value += rel[i].r_addend;
> -
> -		switch (ELF64_R_TYPE(rel[i].r_info)) {
> -		case R_X86_64_NONE:
> -			break;
> -		case R_X86_64_64:
> -			*(u64 *)location = value;
> -			break;
> -		case R_X86_64_32:
> -			*(u32 *)location = value;
> -			if (value != *(u32 *)location)
> -				goto overflow;
> -			break;
> -		case R_X86_64_32S:
> -			*(s32 *)location = value;
> -			if ((s64)value != *(s32 *)location)
> -				goto overflow;
> -			break;
> -		case R_X86_64_PC32:
> -			value -= (u64)address;
> -			*(u32 *)location = value;
> -			break;
> -		default:
> -			pr_err("Unknown rela relocation: %llu\n",
> -			       ELF64_R_TYPE(rel[i].r_info));
> -			return -ENOEXEC;
> -		}
> +	switch (reltype) {
> +	case R_X86_64_NONE:
> +		break;
> +	case R_X86_64_64:
> +		*(u64 *)location = value;
> +		break;
> +	case R_X86_64_32:
> +		*(u32 *)location = value;
> +		if (value != *(u32 *)location)
> +			goto overflow;
> +		break;
> +	case R_X86_64_32S:
> +		*(s32 *)location = value;
> +		if ((s64)value != *(s32 *)location)
> +			goto overflow;
> +		break;
> +	case R_X86_64_PC32:
> +		value -= (u64)address;
> +		*(u32 *)location = value;
> +		break;
> +	default:
> +		pr_err("Unknown rela relocation: %u\n", reltype);
> +		return -ENOEXEC;
>  	}
> +
>  	return 0;
>  
>  overflow:
> -	pr_err("Overflow in relocation type %d value 0x%lx\n",
> -	       (int)ELF64_R_TYPE(rel[i].r_info), value);
> +	pr_err("Overflow in relocation type %u value 0x%lx\n", reltype, value);
>  	return -ENOEXEC;
>  }
>  #endif /* CONFIG_KEXEC_FILE */
> diff --git a/include/linux/kexec.h b/include/linux/kexec.h
> index 406c33dcae13..e171a083540d 100644
> --- a/include/linux/kexec.h
> +++ b/include/linux/kexec.h
> @@ -320,8 +320,13 @@ void * __weak arch_kexec_kernel_image_load(struct kimage *image);
>  int __weak arch_kimage_file_post_load_cleanup(struct kimage *image);
>  int __weak arch_kexec_kernel_verify_sig(struct kimage *image, void *buf,
>  					unsigned long buf_len);
> -int __weak arch_kexec_apply_relocations_add(const Elf_Ehdr *ehdr,
> -					Elf_Shdr *sechdrs, unsigned int relsec);
> +int __weak arch_kexec_apply_relocation_add(const Elf_Ehdr *ehdr,
> +					   Elf_Shdr *sechdrs,
> +					   unsigned int reltype, Elf_Sym *sym,
> +					   const char *name,
> +					   unsigned long *location,
> +					   unsigned long address,
> +					   unsigned long value);
>  int __weak arch_kexec_apply_relocations(const Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
>  					unsigned int relsec);
>  void arch_kexec_protect_crashkres(void);
> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> index 037c321c5618..1517f977cc25 100644
> --- a/kernel/kexec_file.c
> +++ b/kernel/kexec_file.c
> @@ -61,8 +61,10 @@ int __weak arch_kexec_kernel_verify_sig(struct kimage *image, void *buf,
>  
>  /* Apply relocations of type RELA */
>  int __weak
> -arch_kexec_apply_relocations_add(const Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
> -				 unsigned int relsec)
> +arch_kexec_apply_relocation_add(const Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
> +				unsigned int reltype, Elf_Sym *sym,
> +				const char *name, unsigned long *location,
> +				unsigned long address, unsigned long value)
>  {
>  	pr_err("RELA relocation unsupported.\n");
>  	return -ENOEXEC;
> @@ -793,6 +795,117 @@ static int __kexec_load_purgatory(struct kimage *image, unsigned long min,
>  	return ret;
>  }
>  
> +/**
> + * kexec_apply_relocations_add - apply purgatory relocations
> + * @ehdr:	Pointer to elf headers
> + * @sechdrs:	Pointer to section headers.
> + * @relsec:	Section index of SHT_RELA section.
> + */
> +static int kexec_apply_relocations_add(const Elf64_Ehdr *ehdr,
> +				       Elf64_Shdr *sechdrs, unsigned int relsec)
> +{
> +	int ret;
> +	unsigned int i;
> +	Elf64_Rela *rel;
> +	Elf64_Sym *sym;
> +	void *location;
> +	Elf64_Shdr *section, *symtabsec;
> +	unsigned long address, sec_base, value;
> +	const char *strtab, *name, *shstrtab;
> +
> +	/*
> +	 * ->sh_offset has been modified to keep the pointer to section
> +	 * contents in memory
> +	 */
> +	rel = (void *)sechdrs[relsec].sh_offset;
> +
> +	/* Section to which relocations apply */
> +	section = &sechdrs[sechdrs[relsec].sh_info];
> +
> +	pr_debug("Applying relocate section %u to %u\n", relsec,
> +		 sechdrs[relsec].sh_info);
> +
> +	/* Associated symbol table */
> +	symtabsec = &sechdrs[sechdrs[relsec].sh_link];
> +
> +	/* String table */
> +	if (symtabsec->sh_link >= ehdr->e_shnum) {
> +		/* Invalid strtab section number */
> +		pr_err("Invalid string table section index %d\n",
> +		       symtabsec->sh_link);
> +		return -ENOEXEC;
> +	}
> +
> +	/* String table for the associated symbol table. */
> +	strtab = (char *)sechdrs[symtabsec->sh_link].sh_offset;
> +
> +	/* section header string table */
> +	shstrtab = (char *)sechdrs[ehdr->e_shstrndx].sh_offset;
> +
> +	for (i = 0; i < sechdrs[relsec].sh_size / sizeof(*rel); i++) {
> +
> +		/*
> +		 * rel[i].r_offset contains byte offset from beginning
> +		 * of section to the storage unit affected.
> +		 *
> +		 * This is location to update (->sh_offset). This is temporary
> +		 * buffer where section is currently loaded. This will finally
> +		 * be loaded to a different address later, pointed to by
> +		 * ->sh_addr. kexec takes care of moving it
> +		 *  (kexec_load_segment()).
> +		 */
> +		location = (void *)(section->sh_offset + rel[i].r_offset);
> +
> +		/* Final address of the location */
> +		address = section->sh_addr + rel[i].r_offset;
> +
> +		/*
> +		 * rel[i].r_info contains information about symbol table index
> +		 * w.r.t which relocation must be made and type of relocation
> +		 * to apply. ELF64_R_SYM() and ELF64_R_TYPE() macros get
> +		 * these respectively.
> +		 */
> +		sym = (Elf64_Sym *)symtabsec->sh_offset +
> +				ELF64_R_SYM(rel[i].r_info);
> +
> +		if (sym->st_name)
> +			name = strtab + sym->st_name;
> +		else
> +			name = shstrtab + sechdrs[sym->st_shndx].sh_name;
> +
> +		pr_debug("Symbol: %s info: %02x shndx: %02x value=%llx size: %llx\n",
> +			 name, sym->st_info, sym->st_shndx, sym->st_value,
> +			 sym->st_size);
> +
> +		if (sym->st_shndx == SHN_COMMON) {
> +			pr_err("symbol '%s' in common section\n", name);
> +			return -ENOEXEC;
> +		}
> +
> +		if (sym->st_shndx == SHN_ABS)
> +			sec_base = 0;
> +		else if (sym->st_shndx >= ehdr->e_shnum) {
> +			pr_err("Invalid section %d for symbol %s\n",
> +			       sym->st_shndx, name);
> +			return -ENOEXEC;
> +		} else
> +			sec_base = sechdrs[sym->st_shndx].sh_addr;
> +
> +		value = sym->st_value;
> +		value += sec_base;
> +		value += rel[i].r_addend;
> +
> +		ret = arch_kexec_apply_relocation_add(ehdr, sechdrs,
> +						      ELF64_R_TYPE(rel[i].r_info),
> +						      sym, name, location,
> +						      address, value);
> +		if (ret)
> +			return ret;
> +	}
> +
> +	return 0;
> +}
> +
>  static int kexec_apply_relocations(struct kimage *image)
>  {
>  	int i, ret;
> @@ -836,8 +949,7 @@ static int kexec_apply_relocations(struct kimage *image)
>  		 * relocations of type SHT_RELA/SHT_REL.
>  		 */
>  		if (sechdrs[i].sh_type == SHT_RELA)
> -			ret = arch_kexec_apply_relocations_add(pi->ehdr,
> -							       sechdrs, i);
> +			ret = kexec_apply_relocations_add(pi->ehdr, sechdrs, i);
>  		else if (sechdrs[i].sh_type == SHT_REL)
>  			ret = arch_kexec_apply_relocations(pi->ehdr,
>  							   sechdrs, i);
> -- 
> 2.7.4
> 
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v10 04/10] kexec_file: Add support for purgatory built as PIE.
  2016-11-21 23:49     ` Thiago Jung Bauermann
  2016-11-22  1:32       ` Dave Young
@ 2016-11-22  6:01       ` Michael Ellerman
  2016-11-22  6:16         ` Dave Young
  2016-11-22 13:44         ` Thiago Jung Bauermann
  2016-11-23  8:45       ` Dave Young
  2 siblings, 2 replies; 20+ messages in thread
From: Michael Ellerman @ 2016-11-22  6:01 UTC (permalink / raw)
  To: Thiago Jung Bauermann, Dave Young
  Cc: kexec, linuxppc-dev, linux-kernel, x86, Eric Biederman,
	Vivek Goyal, Baoquan He, Benjamin Herrenschmidt, Paul Mackerras,
	Stewart Smith, Mimi Zohar, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Andrew Morton, Stephen Rothwell

Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> writes:
> Am Sonntag, 20. November 2016, 10:45:46 BRST schrieb Dave Young:
>> On 11/10/16 at 01:27am, Thiago Jung Bauermann wrote:
>> > powerpc's purgatory.ro has 12 relocation types when built as
>> > a relocatable object. To implement support for them requires
>> > arch_kexec_apply_relocations_add to duplicate a lot of code with
>> > module_64.c:apply_relocate_add.
>> > 
>> > When built as a Position Independent Executable there are only 4
>> > relocation types in purgatory.ro, so it becomes practical for the powerpc
>> > implementation of kexec_file to have its own relocation implementation.
>> > 
>> > Also, the purgatory is an executable and not an intermediary output from
>> > the compiler so it makes sense conceptually that it is easier to build
>> > it as a PIE than as a partially linked object.
>> > 
>> > Apart from the greatly reduced number of relocations, there are two
>> > differences between a relocatable object and a PIE:
>> > 
>> > 1. __kexec_load_purgatory needs to use the program headers rather than the
>> > 
>> >    section headers to figure out how to load the binary.
>> > 
>> > 2. Symbol values are absolute addresses instead of relative to the
>> > 
>> >    start of the section.
>> > 
>> > This patch adds the support needed in generic code for the differences
>> > above and allows powerpc to load and relocate a position independent
>> > purgatory.
>> 
>> [snip]
>> 
>> The kexec-tools machine_apply_elf_rel is pretty simple for ppc64, it is
>> not that complex. So could you look into simplify your kexec_file
>> implementation?
>
> I can try, but there is one fundamental issue here: powerpc position-dependent 
> code relies more on relocations than x86 position-dependent code does, so 
> there's a limit to how simple it can be made without switching to position-
> independent code. And it will always be more involved than it is on x86.

I think we need to go back to the drawing board on this one.

My hope was that building purgatory as PIE would reduce the amount of
complexity, but instead it's just added more. Sorry for sending you in
that direction.


In general I dislike the level of complexity of the kexec-tools
purgatory, and in particular I'm not comfortable with things like:

diff --git a/arch/powerpc/purgatory/sha256.c b/arch/powerpc/purgatory/sha256.c
new file mode 100644
index 000000000000..6abee1877d56
--- /dev/null
+++ b/arch/powerpc/purgatory/sha256.c
@@ -0,0 +1,6 @@
+#include "../boot/string.h"
+
+/* Avoid including x86's boot/string.h in sha256.c. */
+#define BOOT_STRING_H
+
+#include "../../x86/purgatory/sha256.c"


I think the best way to get this over the line would be to take the
kexec-lite purgatory implementation and use that to begin with. I know
it doesn't have all the features of the kexec-tools version, but it
should work, and we can look at adding the extra features later.

I'll try and get that working tonight.

cheers

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v10 04/10] kexec_file: Add support for purgatory built as PIE.
  2016-11-22  6:01       ` Michael Ellerman
@ 2016-11-22  6:16         ` Dave Young
  2016-11-22 13:44         ` Thiago Jung Bauermann
  1 sibling, 0 replies; 20+ messages in thread
From: Dave Young @ 2016-11-22  6:16 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Thiago Jung Bauermann, kexec, linuxppc-dev, linux-kernel, x86,
	Eric Biederman, Vivek Goyal, Baoquan He, Benjamin Herrenschmidt,
	Paul Mackerras, Stewart Smith, Mimi Zohar, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, Andrew Morton, Stephen Rothwell

Hi Michael
On 11/22/16 at 05:01pm, Michael Ellerman wrote:
> Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> writes:
> > Am Sonntag, 20. November 2016, 10:45:46 BRST schrieb Dave Young:
> >> On 11/10/16 at 01:27am, Thiago Jung Bauermann wrote:
> >> > powerpc's purgatory.ro has 12 relocation types when built as
> >> > a relocatable object. To implement support for them requires
> >> > arch_kexec_apply_relocations_add to duplicate a lot of code with
> >> > module_64.c:apply_relocate_add.
> >> > 
> >> > When built as a Position Independent Executable there are only 4
> >> > relocation types in purgatory.ro, so it becomes practical for the powerpc
> >> > implementation of kexec_file to have its own relocation implementation.
> >> > 
> >> > Also, the purgatory is an executable and not an intermediary output from
> >> > the compiler so it makes sense conceptually that it is easier to build
> >> > it as a PIE than as a partially linked object.
> >> > 
> >> > Apart from the greatly reduced number of relocations, there are two
> >> > differences between a relocatable object and a PIE:
> >> > 
> >> > 1. __kexec_load_purgatory needs to use the program headers rather than the
> >> > 
> >> >    section headers to figure out how to load the binary.
> >> > 
> >> > 2. Symbol values are absolute addresses instead of relative to the
> >> > 
> >> >    start of the section.
> >> > 
> >> > This patch adds the support needed in generic code for the differences
> >> > above and allows powerpc to load and relocate a position independent
> >> > purgatory.
> >> 
> >> [snip]
> >> 
> >> The kexec-tools machine_apply_elf_rel is pretty simple for ppc64, it is
> >> not that complex. So could you look into simplify your kexec_file
> >> implementation?
> >
> > I can try, but there is one fundamental issue here: powerpc position-dependent 
> > code relies more on relocations than x86 position-dependent code does, so 
> > there's a limit to how simple it can be made without switching to position-
> > independent code. And it will always be more involved than it is on x86.
> 
> I think we need to go back to the drawing board on this one.
> 
> My hope was that building purgatory as PIE would reduce the amount of
> complexity, but instead it's just added more. Sorry for sending you in
> that direction.
> 
> 
> In general I dislike the level of complexity of the kexec-tools
> purgatory, and in particular I'm not comfortable with things like:
> 
> diff --git a/arch/powerpc/purgatory/sha256.c b/arch/powerpc/purgatory/sha256.c
> new file mode 100644
> index 000000000000..6abee1877d56
> --- /dev/null
> +++ b/arch/powerpc/purgatory/sha256.c
> @@ -0,0 +1,6 @@
> +#include "../boot/string.h"
> +
> +/* Avoid including x86's boot/string.h in sha256.c. */
> +#define BOOT_STRING_H
> +
> +#include "../../x86/purgatory/sha256.c"
> 

Agreed, include x86 code in powerpc looks bad

> 
> I think the best way to get this over the line would be to take the
> kexec-lite purgatory implementation and use that to begin with. I know
> it doesn't have all the features of the kexec-tools version, but it
> should work, and we can look at adding the extra features later.

Instead of adding other implementation, moving the purgatory sha256 code
out of x86 sounds better so that we can reuse them cleanly..

> 
> I'll try and get that working tonight.
> 
> cheers

Thanks
Dave

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v10 04/10] kexec_file: Add support for purgatory built as PIE.
  2016-11-22  6:01       ` Michael Ellerman
  2016-11-22  6:16         ` Dave Young
@ 2016-11-22 13:44         ` Thiago Jung Bauermann
  2016-11-23  1:32           ` Dave Young
  1 sibling, 1 reply; 20+ messages in thread
From: Thiago Jung Bauermann @ 2016-11-22 13:44 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Dave Young, kexec, linuxppc-dev, linux-kernel, x86,
	Eric Biederman, Vivek Goyal, Baoquan He, Benjamin Herrenschmidt,
	Paul Mackerras, Stewart Smith, Mimi Zohar, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, Andrew Morton, Stephen Rothwell

Am Dienstag, 22. November 2016, 17:01:10 BRST schrieb Michael Ellerman:
> Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> writes:
> > Am Sonntag, 20. November 2016, 10:45:46 BRST schrieb Dave Young:
> >> On 11/10/16 at 01:27am, Thiago Jung Bauermann wrote:
> >> > powerpc's purgatory.ro has 12 relocation types when built as
> >> > a relocatable object. To implement support for them requires
> >> > arch_kexec_apply_relocations_add to duplicate a lot of code with
> >> > module_64.c:apply_relocate_add.
> >> > 
> >> > When built as a Position Independent Executable there are only 4
> >> > relocation types in purgatory.ro, so it becomes practical for the
> >> > powerpc
> >> > implementation of kexec_file to have its own relocation implementation.
> >> > 
> >> > Also, the purgatory is an executable and not an intermediary output
> >> > from
> >> > the compiler so it makes sense conceptually that it is easier to build
> >> > it as a PIE than as a partially linked object.
> >> > 
> >> > Apart from the greatly reduced number of relocations, there are two
> >> > differences between a relocatable object and a PIE:
> >> > 
> >> > 1. __kexec_load_purgatory needs to use the program headers rather than
> >> > the
> >> > 
> >> >    section headers to figure out how to load the binary.
> >> > 
> >> > 2. Symbol values are absolute addresses instead of relative to the
> >> > 
> >> >    start of the section.
> >> > 
> >> > This patch adds the support needed in generic code for the differences
> >> > above and allows powerpc to load and relocate a position independent
> >> > purgatory.
> >> 
> >> [snip]
> >> 
> >> The kexec-tools machine_apply_elf_rel is pretty simple for ppc64, it is
> >> not that complex. So could you look into simplify your kexec_file
> >> implementation?
> > 
> > I can try, but there is one fundamental issue here: powerpc
> > position-dependent code relies more on relocations than x86
> > position-dependent code does, so there's a limit to how simple it can be
> > made without switching to position- independent code. And it will always
> > be more involved than it is on x86.
> I think we need to go back to the drawing board on this one.
> 
> My hope was that building purgatory as PIE would reduce the amount of
> complexity, but instead it's just added more. Sorry for sending you in
> that direction.

It added complexity because in my series powerpc was using a PIE purgatory but 
x86 kept using a partially-linked object (because of the problem I mentioned I 
had when trying out a PIE x86 purgatory), so generic code needed two purgatory 
loaders.

I'll see if I can make the PIE x86 purgatory to work so that generic code can 
have only one loader implementation. Then it will indeed be simpler.


Am Dienstag, 22. November 2016, 14:16:22 BRST schrieb Dave Young:
> Hi Michael
> 
> On 11/22/16 at 05:01pm, Michael Ellerman wrote:
> > In general I dislike the level of complexity of the kexec-tools
> > purgatory, and in particular I'm not comfortable with things like:
> > 
> > diff --git a/arch/powerpc/purgatory/sha256.c
> > b/arch/powerpc/purgatory/sha256.c new file mode 100644
> > index 000000000000..6abee1877d56
> > --- /dev/null
> > +++ b/arch/powerpc/purgatory/sha256.c
> > @@ -0,0 +1,6 @@
> > +#include "../boot/string.h"
> > +
> > +/* Avoid including x86's boot/string.h in sha256.c. */
> > +#define BOOT_STRING_H
> > +
> > +#include "../../x86/purgatory/sha256.c"
> 
> Agreed, include x86 code in powerpc looks bad
> 
> > I think the best way to get this over the line would be to take the
> > kexec-lite purgatory implementation and use that to begin with. I know
> > it doesn't have all the features of the kexec-tools version, but it
> > should work, and we can look at adding the extra features later.
> 
> Instead of adding other implementation, moving the purgatory sha256 code
> out of x86 sounds better so that we can reuse them cleanly..

Do you have a suggestion of where that code can live so that it can be shared 
between purgatories for different arches?

Do we need a purgatory with generic and arch-specific code like in kexec-
tools?

-- 
Thiago Jung Bauermann
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v10 04/10] kexec_file: Add support for purgatory built as PIE.
  2016-11-22 13:44         ` Thiago Jung Bauermann
@ 2016-11-23  1:32           ` Dave Young
  2016-11-23  2:54             ` Thiago Jung Bauermann
  0 siblings, 1 reply; 20+ messages in thread
From: Dave Young @ 2016-11-23  1:32 UTC (permalink / raw)
  To: Thiago Jung Bauermann
  Cc: Michael Ellerman, kexec, linuxppc-dev, linux-kernel, x86,
	Eric Biederman, Vivek Goyal, Baoquan He, Benjamin Herrenschmidt,
	Paul Mackerras, Stewart Smith, Mimi Zohar, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, Andrew Morton, Stephen Rothwell

On 11/22/16 at 11:44am, Thiago Jung Bauermann wrote:
> Am Dienstag, 22. November 2016, 17:01:10 BRST schrieb Michael Ellerman:
> > Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> writes:
> > > Am Sonntag, 20. November 2016, 10:45:46 BRST schrieb Dave Young:
> > >> On 11/10/16 at 01:27am, Thiago Jung Bauermann wrote:
> > >> > powerpc's purgatory.ro has 12 relocation types when built as
> > >> > a relocatable object. To implement support for them requires
> > >> > arch_kexec_apply_relocations_add to duplicate a lot of code with
> > >> > module_64.c:apply_relocate_add.
> > >> > 
> > >> > When built as a Position Independent Executable there are only 4
> > >> > relocation types in purgatory.ro, so it becomes practical for the
> > >> > powerpc
> > >> > implementation of kexec_file to have its own relocation implementation.
> > >> > 
> > >> > Also, the purgatory is an executable and not an intermediary output
> > >> > from
> > >> > the compiler so it makes sense conceptually that it is easier to build
> > >> > it as a PIE than as a partially linked object.
> > >> > 
> > >> > Apart from the greatly reduced number of relocations, there are two
> > >> > differences between a relocatable object and a PIE:
> > >> > 
> > >> > 1. __kexec_load_purgatory needs to use the program headers rather than
> > >> > the
> > >> > 
> > >> >    section headers to figure out how to load the binary.
> > >> > 
> > >> > 2. Symbol values are absolute addresses instead of relative to the
> > >> > 
> > >> >    start of the section.
> > >> > 
> > >> > This patch adds the support needed in generic code for the differences
> > >> > above and allows powerpc to load and relocate a position independent
> > >> > purgatory.
> > >> 
> > >> [snip]
> > >> 
> > >> The kexec-tools machine_apply_elf_rel is pretty simple for ppc64, it is
> > >> not that complex. So could you look into simplify your kexec_file
> > >> implementation?
> > > 
> > > I can try, but there is one fundamental issue here: powerpc
> > > position-dependent code relies more on relocations than x86
> > > position-dependent code does, so there's a limit to how simple it can be
> > > made without switching to position- independent code. And it will always
> > > be more involved than it is on x86.
> > I think we need to go back to the drawing board on this one.
> > 
> > My hope was that building purgatory as PIE would reduce the amount of
> > complexity, but instead it's just added more. Sorry for sending you in
> > that direction.
> 
> It added complexity because in my series powerpc was using a PIE purgatory but 
> x86 kept using a partially-linked object (because of the problem I mentioned I 
> had when trying out a PIE x86 purgatory), so generic code needed two purgatory 
> loaders.
> 
> I'll see if I can make the PIE x86 purgatory to work so that generic code can 
> have only one loader implementation. Then it will indeed be simpler.

Do we really need the PIE purgatory, after moving generic code out of
x86, there will be no much benefit, no? Anyway, the first step should be
making the purgatory code more generic so that it can be easier for
other arches to support kexec_file in the future. 

> 
> 
> Am Dienstag, 22. November 2016, 14:16:22 BRST schrieb Dave Young:
> > Hi Michael
> > 
> > On 11/22/16 at 05:01pm, Michael Ellerman wrote:
> > > In general I dislike the level of complexity of the kexec-tools
> > > purgatory, and in particular I'm not comfortable with things like:
> > > 
> > > diff --git a/arch/powerpc/purgatory/sha256.c
> > > b/arch/powerpc/purgatory/sha256.c new file mode 100644
> > > index 000000000000..6abee1877d56
> > > --- /dev/null
> > > +++ b/arch/powerpc/purgatory/sha256.c
> > > @@ -0,0 +1,6 @@
> > > +#include "../boot/string.h"
> > > +
> > > +/* Avoid including x86's boot/string.h in sha256.c. */
> > > +#define BOOT_STRING_H
> > > +
> > > +#include "../../x86/purgatory/sha256.c"
> > 
> > Agreed, include x86 code in powerpc looks bad
> > 
> > > I think the best way to get this over the line would be to take the
> > > kexec-lite purgatory implementation and use that to begin with. I know
> > > it doesn't have all the features of the kexec-tools version, but it
> > > should work, and we can look at adding the extra features later.
> > 
> > Instead of adding other implementation, moving the purgatory sha256 code
> > out of x86 sounds better so that we can reuse them cleanly..
> 
> Do you have a suggestion of where that code can live so that it can be shared 
> between purgatories for different arches?

Maybe it is better to stay in lib/purgatory/

> 
> Do we need a purgatory with generic and arch-specific code like in kexec-
> tools?

Yes, if we have more arches to add kexec_file, this should be
necessary..

> 
> -- 
> Thiago Jung Bauermann
> IBM Linux Technology Center
> 

Thanks
Dave

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v10 04/10] kexec_file: Add support for purgatory built as PIE.
  2016-11-23  1:32           ` Dave Young
@ 2016-11-23  2:54             ` Thiago Jung Bauermann
  0 siblings, 0 replies; 20+ messages in thread
From: Thiago Jung Bauermann @ 2016-11-23  2:54 UTC (permalink / raw)
  To: Dave Young
  Cc: Michael Ellerman, kexec, linuxppc-dev, linux-kernel, x86,
	Eric Biederman, Vivek Goyal, Baoquan He, Benjamin Herrenschmidt,
	Paul Mackerras, Stewart Smith, Mimi Zohar, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, Andrew Morton, Stephen Rothwell

Am Mittwoch, 23. November 2016, 09:32:58 BRST schrieb Dave Young:
> On 11/22/16 at 11:44am, Thiago Jung Bauermann wrote:
> > Am Dienstag, 22. November 2016, 17:01:10 BRST schrieb Michael Ellerman:
> > > Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> writes:
> > > > Am Sonntag, 20. November 2016, 10:45:46 BRST schrieb Dave Young:
> > > >> On 11/10/16 at 01:27am, Thiago Jung Bauermann wrote:
> > > >> > powerpc's purgatory.ro has 12 relocation types when built as
> > > >> > a relocatable object. To implement support for them requires
> > > >> > arch_kexec_apply_relocations_add to duplicate a lot of code with
> > > >> > module_64.c:apply_relocate_add.
> > > >> > 
> > > >> > When built as a Position Independent Executable there are only 4
> > > >> > relocation types in purgatory.ro, so it becomes practical for the
> > > >> > powerpc
> > > >> > implementation of kexec_file to have its own relocation
> > > >> > implementation.
> > > >> > 
> > > >> > Also, the purgatory is an executable and not an intermediary output
> > > >> > from
> > > >> > the compiler so it makes sense conceptually that it is easier to
> > > >> > build
> > > >> > it as a PIE than as a partially linked object.
> > > >> > 
> > > >> > Apart from the greatly reduced number of relocations, there are two
> > > >> > differences between a relocatable object and a PIE:
> > > >> > 
> > > >> > 1. __kexec_load_purgatory needs to use the program headers rather
> > > >> > than
> > > >> > the
> > > >> > 
> > > >> >    section headers to figure out how to load the binary.
> > > >> > 
> > > >> > 2. Symbol values are absolute addresses instead of relative to the
> > > >> > 
> > > >> >    start of the section.
> > > >> > 
> > > >> > This patch adds the support needed in generic code for the
> > > >> > differences
> > > >> > above and allows powerpc to load and relocate a position
> > > >> > independent
> > > >> > purgatory.
> > > >> 
> > > >> [snip]
> > > >> 
> > > >> The kexec-tools machine_apply_elf_rel is pretty simple for ppc64, it
> > > >> is
> > > >> not that complex. So could you look into simplify your kexec_file
> > > >> implementation?
> > > > 
> > > > I can try, but there is one fundamental issue here: powerpc
> > > > position-dependent code relies more on relocations than x86
> > > > position-dependent code does, so there's a limit to how simple it can
> > > > be
> > > > made without switching to position- independent code. And it will
> > > > always
> > > > be more involved than it is on x86.
> > > 
> > > I think we need to go back to the drawing board on this one.
> > > 
> > > My hope was that building purgatory as PIE would reduce the amount of
> > > complexity, but instead it's just added more. Sorry for sending you in
> > > that direction.
> > 
> > It added complexity because in my series powerpc was using a PIE purgatory
> > but x86 kept using a partially-linked object (because of the problem I
> > mentioned I had when trying out a PIE x86 purgatory), so generic code
> > needed two purgatory loaders.
> > 
> > I'll see if I can make the PIE x86 purgatory to work so that generic code
> > can have only one loader implementation. Then it will indeed be simpler.
> Do we really need the PIE purgatory, after moving generic code out of
> x86, there will be no much benefit, no?

It still makes a big difference on powerpc, even after moving out the generic 
code. I just got the PIE purgatory working on x86 and it also simplifies the 
code there, so it's a win for both architectures.

I'll clean up the code and post tomorrow so that you can see what you think.

> Anyway, the first step should be
> making the purgatory code more generic so that it can be easier for
> other arches to support kexec_file in the future.

I'll try putting sha256.c in lib/purgatory/ as you suggested.

-- 
Thiago Jung Bauermann
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v10 04/10] kexec_file: Add support for purgatory built as PIE.
  2016-11-21 23:49     ` Thiago Jung Bauermann
  2016-11-22  1:32       ` Dave Young
  2016-11-22  6:01       ` Michael Ellerman
@ 2016-11-23  8:45       ` Dave Young
  2 siblings, 0 replies; 20+ messages in thread
From: Dave Young @ 2016-11-23  8:45 UTC (permalink / raw)
  To: Thiago Jung Bauermann
  Cc: kexec, linuxppc-dev, linux-kernel, x86, Eric Biederman,
	Vivek Goyal, Baoquan He, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, Stewart Smith,
	Mimi Zohar, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Andrew Morton, Stephen Rothwell

On 11/21/16 at 09:49pm, Thiago Jung Bauermann wrote:
> Hello Dave,
> 
> Thanks for your review.
> 
> Am Sonntag, 20. November 2016, 10:45:46 BRST schrieb Dave Young:
> > On 11/10/16 at 01:27am, Thiago Jung Bauermann wrote:
> > > powerpc's purgatory.ro has 12 relocation types when built as
> > > a relocatable object. To implement support for them requires
> > > arch_kexec_apply_relocations_add to duplicate a lot of code with
> > > module_64.c:apply_relocate_add.
> > > 
> > > When built as a Position Independent Executable there are only 4
> > > relocation types in purgatory.ro, so it becomes practical for the powerpc
> > > implementation of kexec_file to have its own relocation implementation.
> > > 
> > > Also, the purgatory is an executable and not an intermediary output from
> > > the compiler so it makes sense conceptually that it is easier to build
> > > it as a PIE than as a partially linked object.
> > > 
> > > Apart from the greatly reduced number of relocations, there are two
> > > differences between a relocatable object and a PIE:
> > > 
> > > 1. __kexec_load_purgatory needs to use the program headers rather than the
> > > 
> > >    section headers to figure out how to load the binary.
> > > 
> > > 2. Symbol values are absolute addresses instead of relative to the
> > > 
> > >    start of the section.
> > > 
> > > This patch adds the support needed in generic code for the differences
> > > above and allows powerpc to load and relocate a position independent
> > > purgatory.
> > 
> > [snip]
> > 
> > The kexec-tools machine_apply_elf_rel is pretty simple for ppc64, it is
> > not that complex. So could you look into simplify your kexec_file
> > implementation?
> 
> I can try, but there is one fundamental issue here: powerpc position-dependent 
> code relies more on relocations than x86 position-dependent code does, so 
> there's a limit to how simple it can be made without switching to position-
> independent code. And it will always be more involved than it is on x86.
> 
> BTW, building x86's purgatory as PIE results in it not having any relocation 
> at all, so it's an advantage even in that architecture. Unfortunately, the 
> machine locks up during reboot and I didn't have time to try to figure out 
> what's going on.
> 
> > kernel/kexec_file.c kexec_apply_relocations only do limited things
> > and some of the logic is in arch/x86, so move general code out of arch
> > code, then I guess the arch code will be simpler
> 
> I agree that is a good idea. Is the patch below what you had in mind?
> 
> > and then we probably do not need this PIE stuff anymore.
> 
> If you are ok with the patch below I can post a new version of the series 
> based on it and we can see if Michael Ellerman thinks it is enough.
> 
> > BTW, __kexec_really_load_purgatory looks worse than
> > ___kexec_load_purgatory ;)
> 
> Really? I find the special handling of bss makes the section-based loader a 
> bit more confusing.
> 
> -- 
> Thiago Jung Bauermann
> IBM Linux Technology Center
> 
> 
> Subject: [PATCH] kexec_file: Move generic relocation code from arch/x86 to
>  kernel/kexec_file.c
> 
> The check for undefined symbols stays in arch-specific code because
> powerpc needs to allow TOC symbols to be processed even though they're
> undefined.
> 
> There is no functional change.
> 
> Suggested-by: Dave Young <dyoung@redhat.com>
> Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
> ---
>  arch/x86/kernel/machine_kexec_64.c | 160 +++++++------------------------------
>  include/linux/kexec.h              |   9 ++-
>  kernel/kexec_file.c                | 120 +++++++++++++++++++++++++++-
>  3 files changed, 154 insertions(+), 135 deletions(-)
> 
> diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
> index 8c1f218926d7..f4860c408ece 100644
> --- a/arch/x86/kernel/machine_kexec_64.c
> +++ b/arch/x86/kernel/machine_kexec_64.c
> @@ -401,143 +401,45 @@ int arch_kexec_kernel_verify_sig(struct kimage *image, void *kernel,
>  }
>  #endif
>  
> -/*
> - * Apply purgatory relocations.
> - *
> - * ehdr: Pointer to elf headers
> - * sechdrs: Pointer to section headers.
> - * relsec: section index of SHT_RELA section.
> - *
> - * TODO: Some of the code belongs to generic code. Move that in kexec.c.
> - */
> -int arch_kexec_apply_relocations_add(const Elf64_Ehdr *ehdr,
> -				     Elf64_Shdr *sechdrs, unsigned int relsec)
> +int arch_kexec_apply_relocation_add(const Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
> +				    unsigned int reltype, Elf_Sym *sym,
> +				    const char *name, unsigned long *location,
> +				    unsigned long address, unsigned long value)
>  {
> -	unsigned int i;
> -	Elf64_Rela *rel;
> -	Elf64_Sym *sym;
> -	void *location;
> -	Elf64_Shdr *section, *symtabsec;
> -	unsigned long address, sec_base, value;
> -	const char *strtab, *name, *shstrtab;
> -
> -	/*
> -	 * ->sh_offset has been modified to keep the pointer to section
> -	 * contents in memory
> -	 */
> -	rel = (void *)sechdrs[relsec].sh_offset;
> -
> -	/* Section to which relocations apply */
> -	section = &sechdrs[sechdrs[relsec].sh_info];
> -
> -	pr_debug("Applying relocate section %u to %u\n", relsec,
> -		 sechdrs[relsec].sh_info);
> -
> -	/* Associated symbol table */
> -	symtabsec = &sechdrs[sechdrs[relsec].sh_link];
> -
> -	/* String table */
> -	if (symtabsec->sh_link >= ehdr->e_shnum) {
> -		/* Invalid strtab section number */
> -		pr_err("Invalid string table section index %d\n",
> -		       symtabsec->sh_link);
> +	if (sym->st_shndx == SHN_UNDEF) {
> +		pr_err("Undefined symbol: %s\n", name);
>  		return -ENOEXEC;
>  	}
>  
> -	strtab = (char *)sechdrs[symtabsec->sh_link].sh_offset;
> -
> -	/* section header string table */
> -	shstrtab = (char *)sechdrs[ehdr->e_shstrndx].sh_offset;
> -
> -	for (i = 0; i < sechdrs[relsec].sh_size / sizeof(*rel); i++) {
> -
> -		/*
> -		 * rel[i].r_offset contains byte offset from beginning
> -		 * of section to the storage unit affected.
> -		 *
> -		 * This is location to update (->sh_offset). This is temporary
> -		 * buffer where section is currently loaded. This will finally
> -		 * be loaded to a different address later, pointed to by
> -		 * ->sh_addr. kexec takes care of moving it
> -		 *  (kexec_load_segment()).
> -		 */
> -		location = (void *)(section->sh_offset + rel[i].r_offset);
> -
> -		/* Final address of the location */
> -		address = section->sh_addr + rel[i].r_offset;
> -
> -		/*
> -		 * rel[i].r_info contains information about symbol table index
> -		 * w.r.t which relocation must be made and type of relocation
> -		 * to apply. ELF64_R_SYM() and ELF64_R_TYPE() macros get
> -		 * these respectively.
> -		 */
> -		sym = (Elf64_Sym *)symtabsec->sh_offset +
> -				ELF64_R_SYM(rel[i].r_info);
> -
> -		if (sym->st_name)
> -			name = strtab + sym->st_name;
> -		else
> -			name = shstrtab + sechdrs[sym->st_shndx].sh_name;
> -
> -		pr_debug("Symbol: %s info: %02x shndx: %02x value=%llx size: %llx\n",
> -			 name, sym->st_info, sym->st_shndx, sym->st_value,
> -			 sym->st_size);
> -
> -		if (sym->st_shndx == SHN_UNDEF) {
> -			pr_err("Undefined symbol: %s\n", name);
> -			return -ENOEXEC;
> -		}
> -
> -		if (sym->st_shndx == SHN_COMMON) {
> -			pr_err("symbol '%s' in common section\n", name);
> -			return -ENOEXEC;
> -		}
> -
> -		if (sym->st_shndx == SHN_ABS)
> -			sec_base = 0;
> -		else if (sym->st_shndx >= ehdr->e_shnum) {
> -			pr_err("Invalid section %d for symbol %s\n",
> -			       sym->st_shndx, name);
> -			return -ENOEXEC;
> -		} else
> -			sec_base = sechdrs[sym->st_shndx].sh_addr;
> -
> -		value = sym->st_value;
> -		value += sec_base;
> -		value += rel[i].r_addend;
> -
> -		switch (ELF64_R_TYPE(rel[i].r_info)) {
> -		case R_X86_64_NONE:
> -			break;
> -		case R_X86_64_64:
> -			*(u64 *)location = value;
> -			break;
> -		case R_X86_64_32:
> -			*(u32 *)location = value;
> -			if (value != *(u32 *)location)
> -				goto overflow;
> -			break;
> -		case R_X86_64_32S:
> -			*(s32 *)location = value;
> -			if ((s64)value != *(s32 *)location)
> -				goto overflow;
> -			break;
> -		case R_X86_64_PC32:
> -			value -= (u64)address;
> -			*(u32 *)location = value;
> -			break;
> -		default:
> -			pr_err("Unknown rela relocation: %llu\n",
> -			       ELF64_R_TYPE(rel[i].r_info));
> -			return -ENOEXEC;
> -		}
> +	switch (reltype) {
> +	case R_X86_64_NONE:
> +		break;
> +	case R_X86_64_64:
> +		*(u64 *)location = value;
> +		break;
> +	case R_X86_64_32:
> +		*(u32 *)location = value;
> +		if (value != *(u32 *)location)
> +			goto overflow;
> +		break;
> +	case R_X86_64_32S:
> +		*(s32 *)location = value;
> +		if ((s64)value != *(s32 *)location)
> +			goto overflow;
> +		break;
> +	case R_X86_64_PC32:
> +		value -= (u64)address;
> +		*(u32 *)location = value;
> +		break;
> +	default:
> +		pr_err("Unknown rela relocation: %u\n", reltype);
> +		return -ENOEXEC;
>  	}
> +
>  	return 0;
>  
>  overflow:
> -	pr_err("Overflow in relocation type %d value 0x%lx\n",
> -	       (int)ELF64_R_TYPE(rel[i].r_info), value);
> +	pr_err("Overflow in relocation type %u value 0x%lx\n", reltype, value);
>  	return -ENOEXEC;
>  }
>  #endif /* CONFIG_KEXEC_FILE */
> diff --git a/include/linux/kexec.h b/include/linux/kexec.h
> index 406c33dcae13..e171a083540d 100644
> --- a/include/linux/kexec.h
> +++ b/include/linux/kexec.h
> @@ -320,8 +320,13 @@ void * __weak arch_kexec_kernel_image_load(struct kimage *image);
>  int __weak arch_kimage_file_post_load_cleanup(struct kimage *image);
>  int __weak arch_kexec_kernel_verify_sig(struct kimage *image, void *buf,
>  					unsigned long buf_len);
> -int __weak arch_kexec_apply_relocations_add(const Elf_Ehdr *ehdr,
> -					Elf_Shdr *sechdrs, unsigned int relsec);
> +int __weak arch_kexec_apply_relocation_add(const Elf_Ehdr *ehdr,
> +					   Elf_Shdr *sechdrs,
> +					   unsigned int reltype, Elf_Sym *sym,
> +					   const char *name,
> +					   unsigned long *location,
> +					   unsigned long address,
> +					   unsigned long value);
>  int __weak arch_kexec_apply_relocations(const Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
>  					unsigned int relsec);
>  void arch_kexec_protect_crashkres(void);
> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> index 037c321c5618..1517f977cc25 100644
> --- a/kernel/kexec_file.c
> +++ b/kernel/kexec_file.c
> @@ -61,8 +61,10 @@ int __weak arch_kexec_kernel_verify_sig(struct kimage *image, void *buf,
>  
>  /* Apply relocations of type RELA */
>  int __weak
> -arch_kexec_apply_relocations_add(const Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
> -				 unsigned int relsec)
> +arch_kexec_apply_relocation_add(const Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
> +				unsigned int reltype, Elf_Sym *sym,
> +				const char *name, unsigned long *location,
> +				unsigned long address, unsigned long value)
>  {
>  	pr_err("RELA relocation unsupported.\n");
>  	return -ENOEXEC;
> @@ -793,6 +795,117 @@ static int __kexec_load_purgatory(struct kimage *image, unsigned long min,
>  	return ret;
>  }
>  
> +/**
> + * kexec_apply_relocations_add - apply purgatory relocations
> + * @ehdr:	Pointer to elf headers
> + * @sechdrs:	Pointer to section headers.
> + * @relsec:	Section index of SHT_RELA section.
> + */
> +static int kexec_apply_relocations_add(const Elf64_Ehdr *ehdr,
> +				       Elf64_Shdr *sechdrs, unsigned int relsec)
> +{
> +	int ret;
> +	unsigned int i;
> +	Elf64_Rela *rel;
> +	Elf64_Sym *sym;
> +	void *location;
> +	Elf64_Shdr *section, *symtabsec;
> +	unsigned long address, sec_base, value;
> +	const char *strtab, *name, *shstrtab;
> +
> +	/*
> +	 * ->sh_offset has been modified to keep the pointer to section
> +	 * contents in memory
> +	 */
> +	rel = (void *)sechdrs[relsec].sh_offset;
> +
> +	/* Section to which relocations apply */
> +	section = &sechdrs[sechdrs[relsec].sh_info];
> +
> +	pr_debug("Applying relocate section %u to %u\n", relsec,
> +		 sechdrs[relsec].sh_info);
> +
> +	/* Associated symbol table */
> +	symtabsec = &sechdrs[sechdrs[relsec].sh_link];
> +
> +	/* String table */
> +	if (symtabsec->sh_link >= ehdr->e_shnum) {
> +		/* Invalid strtab section number */
> +		pr_err("Invalid string table section index %d\n",
> +		       symtabsec->sh_link);
> +		return -ENOEXEC;
> +	}
> +
> +	/* String table for the associated symbol table. */
> +	strtab = (char *)sechdrs[symtabsec->sh_link].sh_offset;
> +
> +	/* section header string table */
> +	shstrtab = (char *)sechdrs[ehdr->e_shstrndx].sh_offset;
> +
> +	for (i = 0; i < sechdrs[relsec].sh_size / sizeof(*rel); i++) {
> +
> +		/*
> +		 * rel[i].r_offset contains byte offset from beginning
> +		 * of section to the storage unit affected.
> +		 *
> +		 * This is location to update (->sh_offset). This is temporary
> +		 * buffer where section is currently loaded. This will finally
> +		 * be loaded to a different address later, pointed to by
> +		 * ->sh_addr. kexec takes care of moving it
> +		 *  (kexec_load_segment()).
> +		 */
> +		location = (void *)(section->sh_offset + rel[i].r_offset);
> +
> +		/* Final address of the location */
> +		address = section->sh_addr + rel[i].r_offset;
> +
> +		/*
> +		 * rel[i].r_info contains information about symbol table index
> +		 * w.r.t which relocation must be made and type of relocation
> +		 * to apply. ELF64_R_SYM() and ELF64_R_TYPE() macros get
> +		 * these respectively.
> +		 */
> +		sym = (Elf64_Sym *)symtabsec->sh_offset +
> +				ELF64_R_SYM(rel[i].r_info);
> +
> +		if (sym->st_name)
> +			name = strtab + sym->st_name;
> +		else
> +			name = shstrtab + sechdrs[sym->st_shndx].sh_name;
> +
> +		pr_debug("Symbol: %s info: %02x shndx: %02x value=%llx size: %llx\n",
> +			 name, sym->st_info, sym->st_shndx, sym->st_value,
> +			 sym->st_size);
> +
> +		if (sym->st_shndx == SHN_COMMON) {
> +			pr_err("symbol '%s' in common section\n", name);
> +			return -ENOEXEC;
> +		}
> +
> +		if (sym->st_shndx == SHN_ABS)
> +			sec_base = 0;
> +		else if (sym->st_shndx >= ehdr->e_shnum) {
> +			pr_err("Invalid section %d for symbol %s\n",
> +			       sym->st_shndx, name);
> +			return -ENOEXEC;
> +		} else
> +			sec_base = sechdrs[sym->st_shndx].sh_addr;
> +
> +		value = sym->st_value;
> +		value += sec_base;
> +		value += rel[i].r_addend;
> +
> +		ret = arch_kexec_apply_relocation_add(ehdr, sechdrs,
> +						      ELF64_R_TYPE(rel[i].r_info),
> +						      sym, name, location,
> +						      address, value);
> +		if (ret)
> +			return ret;
> +	}
> +
> +	return 0;
> +}
> +
>  static int kexec_apply_relocations(struct kimage *image)
>  {
>  	int i, ret;
> @@ -836,8 +949,7 @@ static int kexec_apply_relocations(struct kimage *image)
>  		 * relocations of type SHT_RELA/SHT_REL.
>  		 */
>  		if (sechdrs[i].sh_type == SHT_RELA)
> -			ret = arch_kexec_apply_relocations_add(pi->ehdr,
> -							       sechdrs, i);
> +			ret = kexec_apply_relocations_add(pi->ehdr, sechdrs, i);
>  		else if (sechdrs[i].sh_type == SHT_REL)
>  			ret = arch_kexec_apply_relocations(pi->ehdr,
>  							   sechdrs, i);

Hi,

As we are here, two weak functions looks not necessary, let arch code to
determin if SHT_REL can be handled is reasonable though I'm not sure why
there is not such things in kexec-tools.

Could you just use arch_kexec_apply_relocations for these two types
instead of adding another?

Thanks
Dave

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2016-11-23  8:57 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-10  3:27 [PATCH v10 00/10] kexec_file_load implementation for PowerPC Thiago Jung Bauermann
2016-11-10  3:27 ` [PATCH v10 01/10] kexec_file: Allow arch-specific memory walking for kexec_add_buffer Thiago Jung Bauermann
2016-11-10  3:27 ` [PATCH v10 02/10] kexec_file: Change kexec_add_buffer to take kexec_buf as argument Thiago Jung Bauermann
2016-11-10  3:27 ` [PATCH v10 03/10] kexec_file: Factor out kexec_locate_mem_hole from kexec_add_buffer Thiago Jung Bauermann
2016-11-10  3:27 ` [PATCH v10 04/10] kexec_file: Add support for purgatory built as PIE Thiago Jung Bauermann
2016-11-20  2:45   ` Dave Young
2016-11-21 23:49     ` Thiago Jung Bauermann
2016-11-22  1:32       ` Dave Young
2016-11-22  6:01       ` Michael Ellerman
2016-11-22  6:16         ` Dave Young
2016-11-22 13:44         ` Thiago Jung Bauermann
2016-11-23  1:32           ` Dave Young
2016-11-23  2:54             ` Thiago Jung Bauermann
2016-11-23  8:45       ` Dave Young
2016-11-10  3:27 ` [PATCH v10 05/10] powerpc: Change places using CONFIG_KEXEC to use CONFIG_KEXEC_CORE instead Thiago Jung Bauermann
2016-11-10  3:27 ` [PATCH v10 06/10] powerpc: Implement kexec_file_load Thiago Jung Bauermann
2016-11-10  3:27 ` [PATCH v10 07/10] powerpc: Add functions to read ELF files of any endianness Thiago Jung Bauermann
2016-11-10  3:27 ` [PATCH v10 08/10] powerpc: Add support for loading ELF kernels with kexec_file_load Thiago Jung Bauermann
2016-11-10  3:27 ` [PATCH v10 09/10] powerpc: Add purgatory for kexec_file_load implementation Thiago Jung Bauermann
2016-11-10  3:27 ` [PATCH v10 10/10] powerpc: Enable CONFIG_KEXEC_FILE in powerpc server defconfigs Thiago Jung Bauermann

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).