linux-kernel.vger.kernel.org archive mirror
* [PATCH v11 00/15] subject: arm64: kexec: add kexec_file_load() support
@ 2018-07-11  7:41 AKASHI Takahiro
  2018-07-11  7:41 ` [PATCH v11 01/15] asm-generic: add kexec_file_load system call to unistd.h AKASHI Takahiro
                   ` (14 more replies)
  0 siblings, 15 replies; 38+ messages in thread
From: AKASHI Takahiro @ 2018-07-11  7:41 UTC (permalink / raw)
  To: catalin.marinas, will.deacon, dhowells, vgoyal, herbert, davem,
	dyoung, bhe, arnd
  Cc: ard.biesheuvel, james.morse, bhsharma, kexec, linux-arm-kernel,
	linux-kernel, AKASHI Takahiro


This is the eleventh round of implementing kexec_file_load() support
on arm64.[1] (See "Changes" below)
Most of the code is based on kexec-tools.


This patch series enables us to
  * load the kernel by passing its file descriptor, instead of a
    user-filled buffer, to the kexec_file_load() system call, and
  * optionally verify its signature at load time for trusted boot.
Kernel virtual address randomization has also been supported since v9.

Unlike with the kexec_load() system call, and as we discussed a long time
ago, users may not be allowed to provide a device tree to the 2nd kernel
explicitly; the dt blob of the first kernel is then re-used internally.

To make the kexec command use the kexec_file_load() system call instead of
kexec_load(), the '-s' option must be specified. See [2] for the necessary
kexec-tools patch.
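
For reference, here is a minimal sketch of how a userspace loader might
invoke the system call directly. load_kernel_by_fd() is a hypothetical
helper, not code from this series or from kexec-tools; it assumes that no
initrd is passed and that the caller has CAP_SYS_BOOT.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/kexec.h>

/*
 * Hypothetical helper: load the kernel at kernel_path through
 * kexec_file_load(). Returns 0 on success or a negative errno value.
 */
static int load_kernel_by_fd(const char *kernel_path, const char *cmdline)
{
#ifdef SYS_kexec_file_load
	int kernel_fd, ret = 0;

	kernel_fd = open(kernel_path, O_RDONLY | O_CLOEXEC);
	if (kernel_fd < 0)
		return -errno;

	/* cmdline_len must count the terminating NUL byte */
	if (syscall(SYS_kexec_file_load, kernel_fd, -1,
		    (unsigned long)strlen(cmdline) + 1, cmdline,
		    (unsigned long)KEXEC_FILE_NO_INITRAMFS) < 0)
		ret = -errno;

	close(kernel_fd);
	return ret;
#else
	/* The C library headers do not know the syscall number */
	(void)kernel_path;
	(void)cmdline;
	return -ENOSYS;
#endif
}
```

The kernel reads the image through the descriptor itself, which is what
allows it to verify the file before anything is loaded.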

To analyze a generated crash dump file, use the latest master branch of
crash utility[3]. I always try to submit patches to fix any inconsistencies
introduced in the latest kernel.

Regarding a kernel image verification, a signature must be presented
along with the binary itself. A signature is basically a hash value
calculated against the whole binary data and encrypted by a key which
will be authenticated by one of the system's trusted certificates.
Any attempt to read and load a to-be-kexec-ed kernel image through
a system call will be checked and blocked if the binary's hash value
doesn't match its associated signature.

There are two methods available now:
1. implementing arch-specific verification hook of kexec_file_load()
2. utilizing IMA(Integrity Measurement Architecture)[4] appraisal framework

Before my v7, I believed that my patch only supports (1) but am now
confident that (2) comes free if IMA is enabled and properly configured.


(1) Arch-specific verification hook
If CONFIG_KEXEC_VERIFY_SIG is enabled, kexec_file_load() invokes an arch-
defined (and hence file-format-specific) hook function to check for the
validity of kernel binary.

On x86, a signature is embedded into the PE file (Microsoft's format) header
of the binary. Since arm64's "Image" can also be seen as a PE file as long as
CONFIG_EFI is enabled, we adopt this format for kernel signing.

As in the case of UEFI applications, we can create a signed kernel image:
    $ sbsign --key ${KEY} --cert ${CERT} Image

You may want to use certs/signing_key.pem, which is intended to be used
for module signing (CONFIG_MODULE_SIG), as ${KEY} and ${CERT} for test
purposes.


(2) IMA appraisal-based
IMA was first introduced in Linux in order to meet the TCG (Trusted Computing
Group) requirement that all sensitive files be *measured* before
reading/executing them to detect any untrusted changes/modifications.
The appraisal feature, which allows us to ensure the integrity of
files and even prevent them from being read/executed, was added later.

Meanwhile, kexec_file_load() has been merged since v3.17 and evolved to
enable IMA-appraisal type verification by the commit b804defe4297 ("kexec:
replace call to copy_file_from_fd() with kernel version").

In this scheme, a signature is stored in an extended file attribute,
"security.ima", while a decryption key is held in a dedicated keyring,
".ima" or "_ima".  All the necessary verification steps are confined
to a secure API, kernel_read_file_from_fd(), called by kexec_file_load().

    Please note that powerpc is one of the two architectures now
    supporting KEXEC_FILE, and that it wishes to extend IMA so that
    a signature may be appended to the "vmlinux" file[5], like module
    signing, instead of using an extended file attribute.

While IMA is meant to be used with a TPM (Trusted Platform Module) on a
secure platform, IMA is still usable without a TPM. Here is an example
procedure for trying out the feature using a self-signed root CA for
demo/test purposes:

 1) Generate needed keys and certificates, following "Generate trusted
    keys" section in README of ima-evm-utils[6].

 2) Build the kernel with the following kernel configurations, specifying
    "ima-local-ca.pem" for CONFIG_SYSTEM_TRUSTED_KEYS:
	CONFIG_EXT4_FS_SECURITY
	CONFIG_INTEGRITY_SIGNATURE
	CONFIG_INTEGRITY_ASYMMETRIC_KEYS
	CONFIG_INTEGRITY_TRUSTED_KEYRING
	CONFIG_IMA
	CONFIG_IMA_WRITE_POLICY
	CONFIG_IMA_READ_POLICY
	CONFIG_IMA_APPRAISE
	CONFIG_IMA_APPRAISE_BOOTPARAM
	CONFIG_SYSTEM_TRUSTED_KEYS
    Please note that CONFIG_KEXEC_VERIFY_SIG is not, actually should
    not be, enabled.

 3) Sign(label) a kernel image binary to be kexec-ed on target filesystem:
    $ evmctl ima_sign --key /path/to/private_key.pem /your/Image

 4) Add a command line parameter and boot the kernel:
    ima_appraise=enforce

 On live system,
 5) Set a security policy:
    $ mount -t securityfs none /sys/kernel/security
    $ echo "appraise func=KEXEC_KERNEL_CHECK appraise_type=imasig" \
      > /sys/kernel/security/ima/policy

 6) Add a key for ima:
    $ keyctl padd asymmetric my_ima_key %:.ima < /path/to/x509_ima.der
    (or evmctl import /path/to/x509_ima.der <ima_keyring_id>)

 7) Then try kexec as normal.


Concerns(or future works):
* Support for physical address randomization
* Signature verification of big endian kernel with CONFIG_KEXEC_VERIFY_SIG
  While big-endian kernel can support kernel signing, I'm not sure that
  Image can be recognized as in PE format because x86 standard only
  defines little-endian-based format.
* Support for vmlinux loading

  [1] http://git.linaro.org/people/takahiro.akashi/linux-aarch64.git
	branch:arm64/kexec_file
  [2] http://git.linaro.org/people/takahiro.akashi/kexec-tools.git
	branch:arm64/kexec_file
  [3] http://github.com/crash-utility/crash.git
  [4] https://sourceforge.net/p/linux-ima/wiki/Home/
  [5] http://lkml.iu.edu//hypermail/linux/kernel/1707.0/03669.html
  [6] https://sourceforge.net/p/linux-ima/ima-evm-utils/ci/master/tree/


Changes in v11 (July 11, 2018)
* split v10's patch#3, a refactoring stuff, into two parts, "just move"
  and modify
* remove selecting BUILD_BIN2C from KEXEC_FILE config
* modify setup_dtb()
   * to correct a return value on failure of fdt_xyz() call,
   * to always remove existing bootargs and initrd-start/end properties,
     if any, when copying current system's dtb into new dtb
   * to use fdt_setprop_string() for bootargs (I'm now sure that
     kimage->cmdline_buf is a null-terminated string.)
* revise a warning comment in case of KEXEC_VERIFY_SIG but
  !(EFI && SIGNED_PE_FILE_VERIFICATION)

Changes in v10 (June 23, 2018)
* rebased to v4.18-rc
* change syscall number of kexec_file_load from 292 to 293
* factor out memblock-based arch_kexec_walk_mem() from powerpc and
  merge it into generic one
* move generic fdt helper functions from arm64 dir to drivers/of
  (dt_root_[addr|size]_cells are no longer __initdata.)
* modify fill_property() to use 'while' loop
* modify fdt_setprop_reg() to allocate a buffer on stack
* modify setup_dtb() to use fdt_setprop_u64()
* pass kernel_load_addr/size directly as arguments, instead of via
  kimage_arch.kern_segment, at load_other_segments()
* refuse loading an image which cannot be supported in image loader,
  adding cpu-feature(MMFR0) helper functions
* modify prepare_elf_headers() to use kmalloc() instead of vmalloc()
* always pass arch.dtb_mem as the fourth argument to cpu_soft_restart()
  in machine_kexec() while dtb_mem will be zero in kexec case

Changes in v9 (April 25, 2018)
* rebased to v4.17-rc
* remove preparatory patches on generic/x86/ppc code
  They have now been merged in v4.17-rc1.
* allocate memory based on memblock list instead of system resources
  This will prevent reserved regions, particularly UEFI/ACPI data,
  from being corrupted.
* correct dt property names, linux,initrd-*, in newly-created dtb
  "linux," was missing.
* remove alignment requirement for initrd loading
* add kaslr (kernel virtual address randomization) support
* misc code clean-up
* revise commit messages

Changes in v8 (Feb 22, 2018)
* introduce ARCH_HAS_KEXEC_PURGATORY so that arm64 will be able to skip
  purgatory
* remove "ifdef CONFIG_X86_64" stuffs from a re-factored function,
  prepare_elf64_headers(), making its interface more generic
  (The original patch was split into two for easier reviews.)
* modify cpu_soft_restart() so as to let the 2nd kernel jump into its entry
  code directly without requiring purgatory in case of kexec_file_load
* remove CONFIG_KEXEC_FILE_IMAGE_FMT and introduce
  CONFIG_KEXEC_IMAGE_VERIFY_SIG, much similar to x86 but quite redundant
  for now.
* In addition, update/modify dependencies of KEXEC_IMAGE_VERIFY_SIG

Changes in v7 (Dec 4, 2017)
* rebased to v4.15-rc2
* re-organize the patch set to separate KEXEC_FILE_VERIFY_SIG-related
  code from the others
* revamp factored-out code in kernel/kexec_file.c due to the changes
  in original x86 code
* redefine walk_sys_ram_res_rev() prototype due to change of callback
  type in the counterpart, walk_sys_ram_res()
* make KEXEC_FILE_IMAGE_FMT default on if KEXEC_FILE selected

Changes in v6 (Oct 24, 2017)
* fix a for-loop bug in _kexec_kernel_image_probe() per Julien

Changes in v5 (Oct 10, 2017)
* fix kbuild errors around patch #3
per Julien's comments,
* fix a bug in walk_system_ram_res_rev() with some cleanup
* modify fdt_setprop_range() to use vmalloc()
* modify fill_property() to use memset()

Changes in v4 (Oct 2, 2017)
* reinstate x86's arch_kexec_kernel_image_load()
* rename weak arch_kexec_kernel_xxx() to _kexec_kernel_xxx() for
  better re-use
* constify kexec_file_loaders[]

Changes in v3 (Sep 15, 2017)
* fix kbuild test error
* factor out arch_kexec_kernel_*() & arch_kimage_file_post_load_cleanup()
* remove CONFIG_CRASH_CORE guard from kexec_file.c
* add vmapped kernel region to vmcore for gdb backtracing
  (see prepare_elf64_headers())
* merge asm/kexec_file.h into asm/kexec.h
* and some cleanups

Changes in v2 (Sep 8, 2017)
* move core-header-related functions from crash_core.c to kexec_file.c
* drop hash-check code from purgatory
* modify purgatory asm to remove arch_kexec_apply_relocations_add()
* drop older kernel support
* drop vmlinux support (at least, for this series)


Patches #1 to #10 are the essential part of KEXEC_FILE support
(additionally allowing for IMA-based verification):
  Patches #1 to #6 are all preparatory patches on the generic side.
  Patches #7 to #11 are to enable kexec_file_load on arm64.

Patches #12 to #13 are for KEXEC_VERIFY_SIG (arch-specific verification)
support.

AKASHI Takahiro (15):
  asm-generic: add kexec_file_load system call to unistd.h
  kexec_file: make kexec_image_post_load_cleanup_default() global
  powerpc, kexec_file: factor out memblock-based arch_kexec_walk_mem()
  kexec_file: kexec_walk_memblock() only walks a dedicated region at
    kdump
  of/fdt: add helper functions for handling properties
  arm64: add image head flag definitions
  arm64: cpufeature: add MMFR0 helper functions
  arm64: enable KEXEC_FILE config
  arm64: kexec_file: load initrd and device-tree
  arm64: kexec_file: allow for loading Image-format kernel
  arm64: kexec_file: add crash dump support
  arm64: kexec_file: invoke the kernel without purgatory
  include: pe.h: remove message[] from mz header definition
  arm64: kexec_file: add kernel signature verification support
  arm64: kexec_file: add kaslr support

 arch/arm64/Kconfig                          |  33 ++
 arch/arm64/include/asm/boot.h               |  15 +
 arch/arm64/include/asm/cpufeature.h         |  48 +++
 arch/arm64/include/asm/kexec.h              |  48 +++
 arch/arm64/kernel/Makefile                  |   3 +-
 arch/arm64/kernel/cpu-reset.S               |   8 +-
 arch/arm64/kernel/head.S                    |   2 +-
 arch/arm64/kernel/kexec_image.c             | 128 ++++++++
 arch/arm64/kernel/machine_kexec.c           |  12 +-
 arch/arm64/kernel/machine_kexec_file.c      | 320 ++++++++++++++++++++
 arch/arm64/kernel/relocate_kernel.S         |   3 +-
 arch/powerpc/kernel/machine_kexec_file_64.c |  54 ----
 drivers/of/fdt.c                            |  62 +++-
 include/linux/kexec.h                       |   1 +
 include/linux/of_fdt.h                      |  10 +-
 include/linux/pe.h                          |   2 +-
 include/uapi/asm-generic/unistd.h           |   4 +-
 kernel/kexec_file.c                         |  59 +++-
 18 files changed, 742 insertions(+), 70 deletions(-)
 create mode 100644 arch/arm64/kernel/kexec_image.c
 create mode 100644 arch/arm64/kernel/machine_kexec_file.c

-- 
2.17.0



* [PATCH v11 01/15] asm-generic: add kexec_file_load system call to unistd.h
  2018-07-11  7:41 [PATCH v11 00/15] subject: arm64: kexec: add kexec_file_load() support AKASHI Takahiro
@ 2018-07-11  7:41 ` AKASHI Takahiro
  2018-07-11  7:41 ` [PATCH v11 02/15] kexec_file: make kexec_image_post_load_cleanup_default() global AKASHI Takahiro
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 38+ messages in thread
From: AKASHI Takahiro @ 2018-07-11  7:41 UTC (permalink / raw)
  To: catalin.marinas, will.deacon, dhowells, vgoyal, herbert, davem,
	dyoung, bhe, arnd
  Cc: ard.biesheuvel, james.morse, bhsharma, kexec, linux-arm-kernel,
	linux-kernel, AKASHI Takahiro

The initial user of this system call number is arm64.

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
---
 include/uapi/asm-generic/unistd.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
index 42990676a55e..c81f4a0df51f 100644
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -734,9 +734,11 @@ __SYSCALL(__NR_pkey_free,     sys_pkey_free)
 __SYSCALL(__NR_statx,     sys_statx)
 #define __NR_io_pgetevents 292
 __SC_COMP(__NR_io_pgetevents, sys_io_pgetevents, compat_sys_io_pgetevents)
+#define __NR_kexec_file_load 293
+__SYSCALL(__NR_kexec_file_load,     sys_kexec_file_load)
 
 #undef __NR_syscalls
-#define __NR_syscalls 293
+#define __NR_syscalls 294
 
 /*
  * 32 bit systems traditionally used different
-- 
2.17.0



* [PATCH v11 02/15] kexec_file: make kexec_image_post_load_cleanup_default() global
  2018-07-11  7:41 [PATCH v11 00/15] subject: arm64: kexec: add kexec_file_load() support AKASHI Takahiro
  2018-07-11  7:41 ` [PATCH v11 01/15] asm-generic: add kexec_file_load system call to unistd.h AKASHI Takahiro
@ 2018-07-11  7:41 ` AKASHI Takahiro
  2018-07-11  7:41 ` [PATCH v11 03/15] powerpc, kexec_file: factor out memblock-based arch_kexec_walk_mem() AKASHI Takahiro
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 38+ messages in thread
From: AKASHI Takahiro @ 2018-07-11  7:41 UTC (permalink / raw)
  To: catalin.marinas, will.deacon, dhowells, vgoyal, herbert, davem,
	dyoung, bhe, arnd
  Cc: ard.biesheuvel, james.morse, bhsharma, kexec, linux-arm-kernel,
	linux-kernel, AKASHI Takahiro

Change this function from static to global so that arm64 can implement
its own arch_kimage_file_post_load_cleanup() later using
kexec_image_post_load_cleanup_default().

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Acked-by: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
---
 include/linux/kexec.h | 1 +
 kernel/kexec_file.c   | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 9e4e638fb505..49ab758f4d91 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -143,6 +143,7 @@ extern const struct kexec_file_ops * const kexec_file_loaders[];
 
 int kexec_image_probe_default(struct kimage *image, void *buf,
 			      unsigned long buf_len);
+int kexec_image_post_load_cleanup_default(struct kimage *image);
 
 /**
  * struct kexec_buf - parameters for finding a place for a buffer in memory
diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index c6a3b6851372..63c7ce1c0c3e 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -78,7 +78,7 @@ void * __weak arch_kexec_kernel_image_load(struct kimage *image)
 	return kexec_image_load_default(image);
 }
 
-static int kexec_image_post_load_cleanup_default(struct kimage *image)
+int kexec_image_post_load_cleanup_default(struct kimage *image)
 {
 	if (!image->fops || !image->fops->cleanup)
 		return 0;
-- 
2.17.0



* [PATCH v11 03/15] powerpc, kexec_file: factor out memblock-based arch_kexec_walk_mem()
  2018-07-11  7:41 [PATCH v11 00/15] subject: arm64: kexec: add kexec_file_load() support AKASHI Takahiro
  2018-07-11  7:41 ` [PATCH v11 01/15] asm-generic: add kexec_file_load system call to unistd.h AKASHI Takahiro
  2018-07-11  7:41 ` [PATCH v11 02/15] kexec_file: make kexec_image_post_load_cleanup_default() global AKASHI Takahiro
@ 2018-07-11  7:41 ` AKASHI Takahiro
  2018-07-14  1:52   ` Dave Young
  2018-07-16 12:26   ` Dave Young
  2018-07-11  7:41 ` [PATCH v11 04/15] kexec_file: kexec_walk_memblock() only walks a dedicated region at kdump AKASHI Takahiro
                   ` (11 subsequent siblings)
  14 siblings, 2 replies; 38+ messages in thread
From: AKASHI Takahiro @ 2018-07-11  7:41 UTC (permalink / raw)
  To: catalin.marinas, will.deacon, dhowells, vgoyal, herbert, davem,
	dyoung, bhe, arnd
  Cc: ard.biesheuvel, james.morse, bhsharma, kexec, linux-arm-kernel,
	linux-kernel, AKASHI Takahiro, Eric W. Biederman

The memblock list is another source of the usable system memory layout.
So powerpc's arch_kexec_walk_mem() is moved to kexec_file.c so that
other memblock-based architectures, particularly arm64, can also utilise
it. The moved function is renamed to kexec_walk_memblock() and merged
into the existing arch_kexec_walk_mem() for general use with either the
resource list or the memblock list.

The resulting function will not work for kdump with the memblock list, but
this will be fixed in the next patch.
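
To illustrate the walk itself, here is a userspace sketch under stated
assumptions: struct mem_range, struct res_range, walk_ranges() and
find_hole() are stand-ins invented for this example, not kernel APIs.
It shows the two points that matter in the moved code: a memblock-style
exclusive end is converted into a resource-style inclusive end, and the
walk stops as soon as the callback returns non-zero.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-ins for the kernel types */
struct mem_range { uint64_t start, end; };	/* memblock style: [start, end) */
struct res_range { uint64_t start, end; };	/* resource style: [start, end] */

typedef int (*walk_fn)(struct res_range *res, void *arg);

static int walk_ranges(const struct mem_range *r, size_t n, int top_down,
		       walk_fn func, void *arg)
{
	struct res_range res;
	size_t i;
	int ret = 0;

	for (i = 0; i < n && !ret; i++) {
		const struct mem_range *cur = top_down ? &r[n - 1 - i] : &r[i];

		res.start = cur->start;
		res.end = cur->end - 1;	/* exclusive end -> inclusive end */
		ret = func(&res, arg);
	}

	return ret;
}

struct hole_req {
	uint64_t size;	/* in: bytes wanted */
	uint64_t start;	/* out: start of the range that was picked */
};

/* Example callback: stop at the first range that is large enough */
static int find_hole(struct res_range *res, void *arg)
{
	struct hole_req *req = arg;

	if (res->end - res->start + 1 < req->size)
		return 0;	/* too small, keep walking */

	req->start = res->start;
	return 1;		/* non-zero stops the walk */
}
```

With top_down set, the same list is visited from the highest range
downwards, which mirrors the kbuf->top_down branch in the patch.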

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Acked-by: James Morse <james.morse@arm.com>
---
 arch/powerpc/kernel/machine_kexec_file_64.c | 54 ---------------------
 kernel/kexec_file.c                         | 54 +++++++++++++++++++++
 2 files changed, 54 insertions(+), 54 deletions(-)

diff --git a/arch/powerpc/kernel/machine_kexec_file_64.c b/arch/powerpc/kernel/machine_kexec_file_64.c
index 0bd23dc789a4..5357b09902c5 100644
--- a/arch/powerpc/kernel/machine_kexec_file_64.c
+++ b/arch/powerpc/kernel/machine_kexec_file_64.c
@@ -24,7 +24,6 @@
 
 #include <linux/slab.h>
 #include <linux/kexec.h>
-#include <linux/memblock.h>
 #include <linux/of_fdt.h>
 #include <linux/libfdt.h>
 #include <asm/ima.h>
@@ -46,59 +45,6 @@ int arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
 	return kexec_image_probe_default(image, buf, buf_len);
 }
 
-/**
- * arch_kexec_walk_mem - call func(data) for each unreserved memory block
- * @kbuf:	Context info for the search. Also passed to @func.
- * @func:	Function to call for each memory block.
- *
- * This function is used by kexec_add_buffer and kexec_locate_mem_hole
- * to find unreserved memory to load kexec segments into.
- *
- * Return: The memory walk will stop when func returns a non-zero value
- * and that value will be returned. If all free regions are visited without
- * func returning non-zero, then zero will be returned.
- */
-int arch_kexec_walk_mem(struct kexec_buf *kbuf,
-			int (*func)(struct resource *, void *))
-{
-	int ret = 0;
-	u64 i;
-	phys_addr_t mstart, mend;
-	struct resource res = { };
-
-	if (kbuf->top_down) {
-		for_each_free_mem_range_reverse(i, NUMA_NO_NODE, 0,
-						&mstart, &mend, NULL) {
-			/*
-			 * In memblock, end points to the first byte after the
-			 * range while in kexec, end points to the last byte
-			 * in the range.
-			 */
-			res.start = mstart;
-			res.end = mend - 1;
-			ret = func(&res, kbuf);
-			if (ret)
-				break;
-		}
-	} else {
-		for_each_free_mem_range(i, NUMA_NO_NODE, 0, &mstart, &mend,
-					NULL) {
-			/*
-			 * In memblock, end points to the first byte after the
-			 * range while in kexec, end points to the last byte
-			 * in the range.
-			 */
-			res.start = mstart;
-			res.end = mend - 1;
-			ret = func(&res, kbuf);
-			if (ret)
-				break;
-		}
-	}
-
-	return ret;
-}
-
 /**
  * setup_purgatory - initialize the purgatory's global variables
  * @image:		kexec image.
diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index 63c7ce1c0c3e..b088324fb3ad 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -16,6 +16,7 @@
 #include <linux/file.h>
 #include <linux/slab.h>
 #include <linux/kexec.h>
+#include <linux/memblock.h>
 #include <linux/mutex.h>
 #include <linux/list.h>
 #include <linux/fs.h>
@@ -501,6 +502,55 @@ static int locate_mem_hole_callback(struct resource *res, void *arg)
 	return locate_mem_hole_bottom_up(start, end, kbuf);
 }
 
+#if defined(CONFIG_HAVE_MEMBLOCK) && !defined(CONFIG_ARCH_DISCARD_MEMBLOCK)
+static int kexec_walk_memblock(struct kexec_buf *kbuf,
+			       int (*func)(struct resource *, void *))
+{
+	int ret = 0;
+	u64 i;
+	phys_addr_t mstart, mend;
+	struct resource res = { };
+
+	if (kbuf->top_down) {
+		for_each_free_mem_range_reverse(i, NUMA_NO_NODE, 0,
+						&mstart, &mend, NULL) {
+			/*
+			 * In memblock, end points to the first byte after the
+			 * range while in kexec, end points to the last byte
+			 * in the range.
+			 */
+			res.start = mstart;
+			res.end = mend - 1;
+			ret = func(&res, kbuf);
+			if (ret)
+				break;
+		}
+	} else {
+		for_each_free_mem_range(i, NUMA_NO_NODE, 0, &mstart, &mend,
+					NULL) {
+			/*
+			 * In memblock, end points to the first byte after the
+			 * range while in kexec, end points to the last byte
+			 * in the range.
+			 */
+			res.start = mstart;
+			res.end = mend - 1;
+			ret = func(&res, kbuf);
+			if (ret)
+				break;
+		}
+	}
+
+	return ret;
+}
+#else
+static int kexec_walk_memblock(struct kexec_buf *kbuf,
+			       int (*func)(struct resource *, void *))
+{
+	return 0;
+}
+#endif
+
 /**
  * arch_kexec_walk_mem - call func(data) on free memory regions
  * @kbuf:	Context info for the search. Also passed to @func.
@@ -513,6 +563,10 @@ static int locate_mem_hole_callback(struct resource *res, void *arg)
 int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf,
 			       int (*func)(struct resource *, void *))
 {
+	if (IS_ENABLED(CONFIG_HAVE_MEMBLOCK) &&
+			!IS_ENABLED(CONFIG_ARCH_DISCARD_MEMBLOCK))
+		return kexec_walk_memblock(kbuf, func);
+
 	if (kbuf->image->type == KEXEC_TYPE_CRASH)
 		return walk_iomem_res_desc(crashk_res.desc,
 					   IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY,
-- 
2.17.0



* [PATCH v11 04/15] kexec_file: kexec_walk_memblock() only walks a dedicated region at kdump
  2018-07-11  7:41 [PATCH v11 00/15] subject: arm64: kexec: add kexec_file_load() support AKASHI Takahiro
                   ` (2 preceding siblings ...)
  2018-07-11  7:41 ` [PATCH v11 03/15] powerpc, kexec_file: factor out memblock-based arch_kexec_walk_mem() AKASHI Takahiro
@ 2018-07-11  7:41 ` AKASHI Takahiro
  2018-07-11  7:41 ` [PATCH v11 05/15] of/fdt: add helper functions for handling properties AKASHI Takahiro
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 38+ messages in thread
From: AKASHI Takahiro @ 2018-07-11  7:41 UTC (permalink / raw)
  To: catalin.marinas, will.deacon, dhowells, vgoyal, herbert, davem,
	dyoung, bhe, arnd
  Cc: ard.biesheuvel, james.morse, bhsharma, kexec, linux-arm-kernel,
	linux-kernel, AKASHI Takahiro

In the kdump case, there exists only one dedicated memblock region as usable
memory (crashk_res). With this patch, kexec_walk_memblock() runs a given
callback function on this region.
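
The effect can be sketched in userspace as follows. All names here
(struct region, crash_region, walk_for_image(), count_regions()) are
made up for the example, and the address range of the reserved region is
an arbitrary assumption.

```c
#include <stdint.h>

/* Stand-ins for KEXEC_TYPE_DEFAULT and KEXEC_TYPE_CRASH */
enum image_type { IMAGE_DEFAULT, IMAGE_CRASH };

struct region { uint64_t start, end; };	/* inclusive end */

/* Assumed reservation; on a real system it comes from "crashkernel=" */
static struct region crash_region = { 0x80000000ULL, 0x8fffffffULL };

typedef int (*region_fn)(struct region *res, void *arg);

static int walk_for_image(enum image_type type, struct region *free_list,
			  int n, region_fn func, void *arg)
{
	int i, ret = 0;

	/* kdump: the reserved region is the only candidate */
	if (type == IMAGE_CRASH)
		return func(&crash_region, arg);

	for (i = 0; i < n && !ret; i++)
		ret = func(&free_list[i], arg);

	return ret;
}

/* Example callback: count how many regions were offered */
static int count_regions(struct region *res, void *arg)
{
	(void)res;
	(*(int *)arg)++;
	return 0;
}
```

For a crash image the callback is invoked exactly once, regardless of
how many free ranges exist elsewhere.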

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
---
 kernel/kexec_file.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index b088324fb3ad..ebf06c3e168d 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -511,6 +511,9 @@ static int kexec_walk_memblock(struct kexec_buf *kbuf,
 	phys_addr_t mstart, mend;
 	struct resource res = { };
 
+	if (kbuf->image->type == KEXEC_TYPE_CRASH)
+		return func(&crashk_res, kbuf);
+
 	if (kbuf->top_down) {
 		for_each_free_mem_range_reverse(i, NUMA_NO_NODE, 0,
 						&mstart, &mend, NULL) {
-- 
2.17.0



* [PATCH v11 05/15] of/fdt: add helper functions for handling properties
  2018-07-11  7:41 [PATCH v11 00/15] subject: arm64: kexec: add kexec_file_load() support AKASHI Takahiro
                   ` (3 preceding siblings ...)
  2018-07-11  7:41 ` [PATCH v11 04/15] kexec_file: kexec_walk_memblock() only walks a dedicated region at kdump AKASHI Takahiro
@ 2018-07-11  7:41 ` AKASHI Takahiro
  2018-07-11  7:41 ` [PATCH v11 06/15] arm64: add image head flag definitions AKASHI Takahiro
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 38+ messages in thread
From: AKASHI Takahiro @ 2018-07-11  7:41 UTC (permalink / raw)
  To: catalin.marinas, will.deacon, dhowells, vgoyal, herbert, davem,
	dyoung, bhe, arnd
  Cc: ard.biesheuvel, james.morse, bhsharma, kexec, linux-arm-kernel,
	linux-kernel, AKASHI Takahiro, Rob Herring, Frank Rowand

These functions will be used later to handle kexec-specific properties
in arm64's kexec_file implementation.
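
As an illustration of the cell encoding that these helpers deal with,
here is a userspace sketch. pack_cells() and pack_reg() are hypothetical
names and take the #address-cells/#size-cells values as arguments, whereas
the helpers in this patch read dt_root_addr_cells and dt_root_size_cells.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Store the low (32 * cells) bits of val into buf as big-endian 32-bit
 * cells, most significant cell first. cells is expected to be 1 or 2.
 */
static void pack_cells(uint8_t *buf, uint64_t val, int cells)
{
	while (cells) {
		uint32_t cell = (uint32_t)(val >> (32 * --cells));

		*buf++ = (uint8_t)(cell >> 24);
		*buf++ = (uint8_t)(cell >> 16);
		*buf++ = (uint8_t)(cell >> 8);
		*buf++ = (uint8_t)cell;
	}
}

/*
 * Build a "reg"-style <addr size> value in buf. Returns the number of
 * bytes written, or 0 if a value does not fit into a single cell.
 */
static size_t pack_reg(uint8_t *buf, uint64_t addr, int addr_cells,
		       uint64_t size, int size_cells)
{
	if ((addr_cells == 1 && addr > UINT32_MAX) ||
	    (size_cells == 1 && size > UINT32_MAX))
		return 0;

	pack_cells(buf, addr, addr_cells);
	pack_cells(buf + 4 * addr_cells, size, size_cells);

	return 4 * (size_t)(addr_cells + size_cells);
}
```

For example, address 0x180000000 with two address cells and size
0x100000 with one size cell gives the 12 bytes
00 00 00 01 80 00 00 00 00 10 00 00.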

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Frank Rowand <frowand.list@gmail.com>
---
 drivers/of/fdt.c       | 62 ++++++++++++++++++++++++++++++++++++++++--
 include/linux/of_fdt.h | 10 +++++--
 2 files changed, 68 insertions(+), 4 deletions(-)

diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
index 6da20b9688f7..f7c9d69ce86c 100644
--- a/drivers/of/fdt.c
+++ b/drivers/of/fdt.c
@@ -25,6 +25,7 @@
 #include <linux/debugfs.h>
 #include <linux/serial_core.h>
 #include <linux/sysfs.h>
+#include <linux/types.h>
 
 #include <asm/setup.h>  /* for COMMAND_LINE_SIZE */
 #include <asm/page.h>
@@ -537,8 +538,8 @@ void *of_fdt_unflatten_tree(const unsigned long *blob,
 EXPORT_SYMBOL_GPL(of_fdt_unflatten_tree);
 
 /* Everything below here references initial_boot_params directly. */
-int __initdata dt_root_addr_cells;
-int __initdata dt_root_size_cells;
+int dt_root_addr_cells;
+int dt_root_size_cells;
 
 void *initial_boot_params;
 
@@ -1330,3 +1331,60 @@ late_initcall(of_fdt_raw_init);
 #endif
 
 #endif /* CONFIG_OF_EARLY_FLATTREE */
+
+bool of_fdt_cells_size_fitted(u64 base, u64 size)
+{
+	/* if *_cells >= 2, cells can hold 64-bit values anyway */
+	if ((dt_root_addr_cells == 1) && (base > U32_MAX))
+		return false;
+
+	if ((dt_root_size_cells == 1) && (size > U32_MAX))
+		return false;
+
+	return true;
+}
+
+size_t of_fdt_reg_cells_size(void)
+{
+	return (dt_root_addr_cells + dt_root_size_cells) * sizeof(u32);
+}
+
+#define FDT_ALIGN(x, a)	(((x) + (a) - 1) & ~((a) - 1))
+#define FDT_TAGALIGN(x)	(FDT_ALIGN((x), FDT_TAGSIZE))
+
+int fdt_prop_len(const char *prop_name, int len)
+{
+	return (strlen(prop_name) + 1) +
+		sizeof(struct fdt_property) +
+		FDT_TAGALIGN(len);
+}
+
+static void fill_property(void *buf, u64 val64, int cells)
+{
+	__be32 val32;
+
+	while (cells) {
+		val32 = cpu_to_fdt32((val64 >> (32 * (--cells))) & U32_MAX);
+		memcpy(buf, &val32, sizeof(val32));
+		buf += sizeof(val32);
+	}
+}
+
+int fdt_setprop_reg(void *fdt, int nodeoffset, const char *name,
+						u64 addr, u64 size)
+{
+	char buf[sizeof(__be32) * 2 * 2];
+		/* assume dt_root_[addr|size]_cells <= 2 */
+	void *prop;
+	size_t buf_size;
+
+	buf_size = of_fdt_reg_cells_size();
+	prop = buf;
+
+	fill_property(prop, addr, dt_root_addr_cells);
+	prop += dt_root_addr_cells * sizeof(u32);
+
+	fill_property(prop, size, dt_root_size_cells);
+
+	return fdt_setprop(fdt, nodeoffset, name, buf, buf_size);
+}
diff --git a/include/linux/of_fdt.h b/include/linux/of_fdt.h
index b9cd9ebdf9b9..9615d6142578 100644
--- a/include/linux/of_fdt.h
+++ b/include/linux/of_fdt.h
@@ -37,8 +37,8 @@ extern void *of_fdt_unflatten_tree(const unsigned long *blob,
 				   struct device_node **mynodes);
 
 /* TBD: Temporary export of fdt globals - remove when code fully merged */
-extern int __initdata dt_root_addr_cells;
-extern int __initdata dt_root_size_cells;
+extern int dt_root_addr_cells;
+extern int dt_root_size_cells;
 extern void *initial_boot_params;
 
 extern char __dtb_start[];
@@ -108,5 +108,11 @@ static inline void unflatten_device_tree(void) {}
 static inline void unflatten_and_copy_device_tree(void) {}
 #endif /* CONFIG_OF_EARLY_FLATTREE */
 
+bool of_fdt_cells_size_fitted(u64 base, u64 size);
+size_t of_fdt_reg_cells_size(void);
+int fdt_prop_len(const char *prop_name, int len);
+int fdt_setprop_reg(void *fdt, int nodeoffset, const char *name,
+						u64 addr, u64 size);
+
 #endif /* __ASSEMBLY__ */
 #endif /* _LINUX_OF_FDT_H */
-- 
2.17.0



* [PATCH v11 06/15] arm64: add image head flag definitions
  2018-07-11  7:41 [PATCH v11 00/15] subject: arm64: kexec: add kexec_file_load() support AKASHI Takahiro
                   ` (4 preceding siblings ...)
  2018-07-11  7:41 ` [PATCH v11 05/15] of/fdt: add helper functions for handling properties AKASHI Takahiro
@ 2018-07-11  7:41 ` AKASHI Takahiro
  2018-07-11  7:41 ` [PATCH v11 07/15] arm64: cpufeature: add MMFR0 helper functions AKASHI Takahiro
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 38+ messages in thread
From: AKASHI Takahiro @ 2018-07-11  7:41 UTC (permalink / raw)
  To: catalin.marinas, will.deacon, dhowells, vgoyal, herbert, davem,
	dyoung, bhe, arnd
  Cc: ard.biesheuvel, james.morse, bhsharma, kexec, linux-arm-kernel,
	linux-kernel, AKASHI Takahiro

These image header flags will be used later by the kexec_file loader.
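
As a sketch of how a loader could decode the flags field, the following
standalone snippet copies the definitions from this patch and adds two
hypothetical helpers, head_flag_page_size() and head_flag_is_big_endian(),
which are not part of the series.

```c
#include <stdint.h>

/* Definitions as added to asm/boot.h by this patch */
#define HEAD_FLAG_BE_SHIFT		0
#define HEAD_FLAG_PAGE_SIZE_SHIFT	1
#define HEAD_FLAG_BE_MASK		0x1
#define HEAD_FLAG_PAGE_SIZE_MASK	0x3

#define HEAD_FLAG_BE			1
#define HEAD_FLAG_PAGE_SIZE_4K		1
#define HEAD_FLAG_PAGE_SIZE_16K		2
#define HEAD_FLAG_PAGE_SIZE_64K		3

#define head_flag_field(flags, field) \
		(((flags) >> field##_SHIFT) & field##_MASK)

/* Hypothetical helper: page size in bytes, or 0 if unspecified */
static unsigned long head_flag_page_size(uint64_t flags)
{
	switch (head_flag_field(flags, HEAD_FLAG_PAGE_SIZE)) {
	case HEAD_FLAG_PAGE_SIZE_4K:
		return 4096;
	case HEAD_FLAG_PAGE_SIZE_16K:
		return 16384;
	case HEAD_FLAG_PAGE_SIZE_64K:
		return 65536;
	default:
		return 0;
	}
}

/* Hypothetical helper: non-zero if the image is a big-endian kernel */
static int head_flag_is_big_endian(uint64_t flags)
{
	return head_flag_field(flags, HEAD_FLAG_BE) == HEAD_FLAG_BE;
}
```

Note that the macro pastes the field name before it is expanded, so
passing HEAD_FLAG_BE selects HEAD_FLAG_BE_SHIFT and HEAD_FLAG_BE_MASK
even though HEAD_FLAG_BE itself is defined as 1.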

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Acked-by: James Morse <james.morse@arm.com>
---
 arch/arm64/include/asm/boot.h | 15 +++++++++++++++
 arch/arm64/kernel/head.S      |  2 +-
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/boot.h b/arch/arm64/include/asm/boot.h
index 355e552a9175..0bab7eed3012 100644
--- a/arch/arm64/include/asm/boot.h
+++ b/arch/arm64/include/asm/boot.h
@@ -5,6 +5,21 @@
 
 #include <asm/sizes.h>
 
+#define ARM64_MAGIC		"ARM\x64"
+
+#define HEAD_FLAG_BE_SHIFT		0
+#define HEAD_FLAG_PAGE_SIZE_SHIFT	1
+#define HEAD_FLAG_BE_MASK		0x1
+#define HEAD_FLAG_PAGE_SIZE_MASK	0x3
+
+#define HEAD_FLAG_BE			1
+#define HEAD_FLAG_PAGE_SIZE_4K		1
+#define HEAD_FLAG_PAGE_SIZE_16K		2
+#define HEAD_FLAG_PAGE_SIZE_64K		3
+
+#define head_flag_field(flags, field) \
+		(((flags) >> field##_SHIFT) & field##_MASK)
+
 /*
  * arm64 requires the DTB to be 8 byte aligned and
  * not exceed 2MB in size.
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index b0853069702f..8cbac6232ed1 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -91,7 +91,7 @@ _head:
 	.quad	0				// reserved
 	.quad	0				// reserved
 	.quad	0				// reserved
-	.ascii	"ARM\x64"			// Magic number
+	.ascii	ARM64_MAGIC			// Magic number
 #ifdef CONFIG_EFI
 	.long	pe_header - _head		// Offset to the PE header.
 
-- 
2.17.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v11 07/15] arm64: cpufeature: add MMFR0 helper functions
  2018-07-11  7:41 [PATCH v11 00/15] subject: arm64: kexec: add kexec_file_load() support AKASHI Takahiro
                   ` (5 preceding siblings ...)
  2018-07-11  7:41 ` [PATCH v11 06/15] arm64: add image head flag definitions AKASHI Takahiro
@ 2018-07-11  7:41 ` AKASHI Takahiro
  2018-07-11  7:41 ` [PATCH v11 08/15] arm64: enable KEXEC_FILE config AKASHI Takahiro
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 38+ messages in thread
From: AKASHI Takahiro @ 2018-07-11  7:41 UTC (permalink / raw)
  To: catalin.marinas, will.deacon, dhowells, vgoyal, herbert, davem,
	dyoung, bhe, arnd
  Cc: ard.biesheuvel, james.morse, bhsharma, kexec, linux-arm-kernel,
	linux-kernel, AKASHI Takahiro

These helper functions for the MMFR0 register will be used later by the
kexec_file loader.
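
For readers without the register layout at hand, here is a user-space
sketch of what the helpers compute from a raw ID_AA64MMFR0_EL1 value.
The shift and "supported" values are written from memory of asm/sysreg.h
and should be treated as assumptions; extract_field() is a stand-in for
cpuid_feature_extract_unsigned_field(), and the mmfr0_*() names are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed field positions and encodings; verify against asm/sysreg.h. */
#define ID_AA64MMFR0_TGRAN4_SHIFT	28
#define ID_AA64MMFR0_TGRAN64_SHIFT	24
#define ID_AA64MMFR0_TGRAN16_SHIFT	20
#define ID_AA64MMFR0_BIGENDEL_SHIFT	8

#define ID_AA64MMFR0_TGRAN4_SUPPORTED	0x0
#define ID_AA64MMFR0_TGRAN64_SUPPORTED	0x0
#define ID_AA64MMFR0_TGRAN16_SUPPORTED	0x1

/* Stand-in for cpuid_feature_extract_unsigned_field(): 4-bit wide field. */
static unsigned int extract_field(uint64_t reg, int shift)
{
	return (reg >> shift) & 0xf;
}

static bool mmfr0_supports_4kb(uint64_t mmfr0)
{
	return extract_field(mmfr0, ID_AA64MMFR0_TGRAN4_SHIFT) ==
	       ID_AA64MMFR0_TGRAN4_SUPPORTED;
}

static bool mmfr0_supports_64kb(uint64_t mmfr0)
{
	return extract_field(mmfr0, ID_AA64MMFR0_TGRAN64_SHIFT) ==
	       ID_AA64MMFR0_TGRAN64_SUPPORTED;
}

static bool mmfr0_supports_16kb(uint64_t mmfr0)
{
	return extract_field(mmfr0, ID_AA64MMFR0_TGRAN16_SHIFT) ==
	       ID_AA64MMFR0_TGRAN16_SUPPORTED;
}

static bool mmfr0_supports_mixed_endian(uint64_t mmfr0)
{
	return extract_field(mmfr0, ID_AA64MMFR0_BIGENDEL_SHIFT) == 0x1;
}

/* Hypothetical usage: an all-zero register means 4K and 64K only. */
static void mmfr0_example(void)
{
	assert(mmfr0_supports_4kb(0) && mmfr0_supports_64kb(0));
	assert(!mmfr0_supports_16kb(0) && !mmfr0_supports_mixed_endian(0));
}
```

The asymmetry is worth keeping in mind: under these assumed encodings,
"supported" is 0 for the 4K and 64K granules but 1 for the 16K granule.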

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Reviewed-by: James Morse <james.morse@arm.com>
---
 arch/arm64/include/asm/cpufeature.h | 48 +++++++++++++++++++++++++++++
 1 file changed, 48 insertions(+)

diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index 1717ba1db35d..cd90b5252d6d 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -486,11 +486,59 @@ static inline bool system_supports_32bit_el0(void)
 	return cpus_have_const_cap(ARM64_HAS_32BIT_EL0);
 }
 
+static inline bool system_supports_4kb_granule(void)
+{
+	u64 mmfr0;
+	u32 val;
+
+	mmfr0 =	read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
+	val = cpuid_feature_extract_unsigned_field(mmfr0,
+						ID_AA64MMFR0_TGRAN4_SHIFT);
+
+	return val == ID_AA64MMFR0_TGRAN4_SUPPORTED;
+}
+
+static inline bool system_supports_64kb_granule(void)
+{
+	u64 mmfr0;
+	u32 val;
+
+	mmfr0 =	read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
+	val = cpuid_feature_extract_unsigned_field(mmfr0,
+						ID_AA64MMFR0_TGRAN64_SHIFT);
+
+	return val == ID_AA64MMFR0_TGRAN64_SUPPORTED;
+}
+
+static inline bool system_supports_16kb_granule(void)
+{
+	u64 mmfr0;
+	u32 val;
+
+	mmfr0 =	read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
+	val = cpuid_feature_extract_unsigned_field(mmfr0,
+						ID_AA64MMFR0_TGRAN16_SHIFT);
+
+	return val == ID_AA64MMFR0_TGRAN16_SUPPORTED;
+}
+
 static inline bool system_supports_mixed_endian_el0(void)
 {
 	return id_aa64mmfr0_mixed_endian_el0(read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1));
 }
 
+static inline bool system_supports_mixed_endian(void)
+{
+	u64 mmfr0;
+	u32 val;
+
+	mmfr0 =	read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
+	val = cpuid_feature_extract_unsigned_field(mmfr0,
+						ID_AA64MMFR0_BIGENDEL_SHIFT);
+
+	return val == 0x1;
+}
+
 static inline bool system_supports_fpsimd(void)
 {
 	return !cpus_have_const_cap(ARM64_HAS_NO_FPSIMD);
-- 
2.17.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v11 08/15] arm64: enable KEXEC_FILE config
  2018-07-11  7:41 [PATCH v11 00/15] subject: arm64: kexec: add kexec_file_load() support AKASHI Takahiro
                   ` (6 preceding siblings ...)
  2018-07-11  7:41 ` [PATCH v11 07/15] arm64: cpufeature: add MMFR0 helper functions AKASHI Takahiro
@ 2018-07-11  7:41 ` AKASHI Takahiro
  2018-07-11  7:41 ` [PATCH v11 09/15] arm64: kexec_file: load initrd and device-tree AKASHI Takahiro
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 38+ messages in thread
From: AKASHI Takahiro @ 2018-07-11  7:41 UTC (permalink / raw)
  To: catalin.marinas, will.deacon, dhowells, vgoyal, herbert, davem,
	dyoung, bhe, arnd
  Cc: ard.biesheuvel, james.morse, bhsharma, kexec, linux-arm-kernel,
	linux-kernel, AKASHI Takahiro

Modify arm64/Kconfig to enable kexec_file_load support.

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Acked-by: James Morse <james.morse@arm.com>
---
 arch/arm64/Kconfig                     |  9 +++++++++
 arch/arm64/kernel/Makefile             |  3 ++-
 arch/arm64/kernel/machine_kexec_file.c | 16 ++++++++++++++++
 3 files changed, 27 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/kernel/machine_kexec_file.c

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 42c090cf0292..a9a3a5583c8b 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -835,6 +835,15 @@ config KEXEC
 	  but it is independent of the system firmware.   And like a reboot
 	  you can start any kernel with it, not just Linux.
 
+config KEXEC_FILE
+	bool "kexec file based system call"
+	select KEXEC_CORE
+	help
+	  This is a new version of the kexec system call. This system call
+	  is file based and takes file descriptors as system call arguments
+	  for the kernel and initramfs, as opposed to the list of segments
+	  accepted by the previous system call.
+
 config CRASH_DUMP
 	bool "Build kdump crash kernel"
 	help
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 0025f8691046..06281e1ad7ed 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -48,8 +48,9 @@ arm64-obj-$(CONFIG_ARM64_ACPI_PARKING_PROTOCOL)	+= acpi_parking_protocol.o
 arm64-obj-$(CONFIG_PARAVIRT)		+= paravirt.o
 arm64-obj-$(CONFIG_RANDOMIZE_BASE)	+= kaslr.o
 arm64-obj-$(CONFIG_HIBERNATION)		+= hibernate.o hibernate-asm.o
-arm64-obj-$(CONFIG_KEXEC)		+= machine_kexec.o relocate_kernel.o	\
+arm64-obj-$(CONFIG_KEXEC_CORE)		+= machine_kexec.o relocate_kernel.o	\
 					   cpu-reset.o
+arm64-obj-$(CONFIG_KEXEC_FILE)		+= machine_kexec_file.o
 arm64-obj-$(CONFIG_ARM64_RELOC_TEST)	+= arm64-reloc-test.o
 arm64-reloc-test-y := reloc_test_core.o reloc_test_syms.o
 arm64-obj-$(CONFIG_CRASH_DUMP)		+= crash_dump.o
diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
new file mode 100644
index 000000000000..c38a8048ed00
--- /dev/null
+++ b/arch/arm64/kernel/machine_kexec_file.c
@@ -0,0 +1,16 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * kexec_file for arm64
+ *
+ * Copyright (C) 2018 Linaro Limited
+ * Author: AKASHI Takahiro <takahiro.akashi@linaro.org>
+ *
+ */
+
+#define pr_fmt(fmt) "kexec_file: " fmt
+
+#include <linux/kexec.h>
+
+const struct kexec_file_ops * const kexec_file_loaders[] = {
+	NULL
+};
-- 
2.17.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v11 09/15] arm64: kexec_file: load initrd and device-tree
  2018-07-11  7:41 [PATCH v11 00/15] subject: arm64: kexec: add kexec_file_load() support AKASHI Takahiro
                   ` (7 preceding siblings ...)
  2018-07-11  7:41 ` [PATCH v11 08/15] arm64: enable KEXEC_FILE config AKASHI Takahiro
@ 2018-07-11  7:41 ` AKASHI Takahiro
  2018-07-17 16:57   ` James Morse
  2018-07-11  7:41 ` [PATCH v11 10/15] arm64: kexec_file: allow for loading Image-format kernel AKASHI Takahiro
                   ` (5 subsequent siblings)
  14 siblings, 1 reply; 38+ messages in thread
From: AKASHI Takahiro @ 2018-07-11  7:41 UTC (permalink / raw)
  To: catalin.marinas, will.deacon, dhowells, vgoyal, herbert, davem,
	dyoung, bhe, arnd
  Cc: ard.biesheuvel, james.morse, bhsharma, kexec, linux-arm-kernel,
	linux-kernel, AKASHI Takahiro

load_other_segments() is expected to allocate and place all the necessary
memory segments other than the kernel, including the initrd and the
device-tree blob (and the elf core header for crash).
While most of the code was borrowed from its kexec-tools counterpart,
users are not allowed to specify a dtb explicitly; instead, the dtb
presented by the original boot loader is reused.

arch_kimage_file_post_load_cleanup() is responsible for freeing
arm64-specific data allocated in load_other_segments().
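
The placement constraints used below for the initrd can be sketched in
user space as follows. This only mirrors the arithmetic in
load_other_segments(); the helper names and the sample addresses are
hypothetical, and the actual search for a free hole is done by
kexec_add_buffer().

```c
#include <assert.h>
#include <stdint.h>

#define SZ_1G	(UINT64_C(1) << 30)

/* Round x down to a multiple of align (align must be a power of two). */
static uint64_t round_down_pow2(uint64_t x, uint64_t align)
{
	return x & ~(align - 1);
}

/* Lower bound for extra segments: nothing is placed below the kernel's end. */
static uint64_t segments_min(uint64_t kernel_load_addr, uint64_t kernel_size)
{
	return kernel_load_addr + kernel_size;
}

/* Upper bound for the initrd: a 32GB window from a 1GB-aligned base. */
static uint64_t initrd_max(uint64_t kernel_load_addr)
{
	return round_down_pow2(kernel_load_addr, SZ_1G) + 32 * SZ_1G;
}

/* Hypothetical usage: a 16MB kernel loaded at 0x40080000. */
static void placement_example(void)
{
	assert(segments_min(0x40080000, 0x1000000) == 0x41080000);
	assert(initrd_max(0x40080000) == UINT64_C(0x840000000));
}
```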

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/kexec.h         |  16 +++
 arch/arm64/kernel/machine_kexec_file.c | 184 +++++++++++++++++++++++++
 2 files changed, 200 insertions(+)

diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
index e17f0529a882..01bbf6cebf12 100644
--- a/arch/arm64/include/asm/kexec.h
+++ b/arch/arm64/include/asm/kexec.h
@@ -93,6 +93,22 @@ static inline void crash_prepare_suspend(void) {}
 static inline void crash_post_resume(void) {}
 #endif
 
+#ifdef CONFIG_KEXEC_FILE
+#define ARCH_HAS_KIMAGE_ARCH
+
+struct kimage_arch {
+	phys_addr_t dtb_mem;
+	void *dtb_buf;
+};
+
+struct kimage;
+
+extern int load_other_segments(struct kimage *image,
+		unsigned long kernel_load_addr, unsigned long kernel_size,
+		char *initrd, unsigned long initrd_len,
+		char *cmdline, unsigned long cmdline_len);
+#endif
+
 #endif /* __ASSEMBLY__ */
 
 #endif
diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
index c38a8048ed00..ca00681c25c6 100644
--- a/arch/arm64/kernel/machine_kexec_file.c
+++ b/arch/arm64/kernel/machine_kexec_file.c
@@ -5,12 +5,196 @@
  * Copyright (C) 2018 Linaro Limited
  * Author: AKASHI Takahiro <takahiro.akashi@linaro.org>
  *
+ * Most code is derived from arm64 port of kexec-tools
  */
 
 #define pr_fmt(fmt) "kexec_file: " fmt
 
+#include <linux/ioport.h>
+#include <linux/kernel.h>
 #include <linux/kexec.h>
+#include <linux/libfdt.h>
+#include <linux/memblock.h>
+#include <linux/of_fdt.h>
+#include <linux/types.h>
+#include <asm/byteorder.h>
 
 const struct kexec_file_ops * const kexec_file_loaders[] = {
 	NULL
 };
+
+int arch_kimage_file_post_load_cleanup(struct kimage *image)
+{
+	vfree(image->arch.dtb_buf);
+	image->arch.dtb_buf = NULL;
+
+	return kexec_image_post_load_cleanup_default(image);
+}
+
+static int setup_dtb(struct kimage *image,
+		unsigned long initrd_load_addr, unsigned long initrd_len,
+		char *cmdline, unsigned long cmdline_len,
+		char **dtb_buf, size_t *dtb_buf_len)
+{
+	char *buf = NULL;
+	size_t buf_size;
+	int nodeoffset;
+	u64 value;
+	int ret;
+
+	/* duplicate dt blob */
+	buf_size = fdt_totalsize(initial_boot_params);
+
+	if (initrd_load_addr) {
+		/* can be redundant, but trimmed at the end */
+		buf_size += fdt_prop_len("linux,initrd-start", sizeof(u64));
+		buf_size += fdt_prop_len("linux,initrd-end", sizeof(u64));
+	}
+
+	if (cmdline)
+		/* can be redundant, but trimmed at the end */
+		buf_size += fdt_prop_len("bootargs", cmdline_len + 1);
+
+	buf = vmalloc(buf_size);
+	if (!buf) {
+		ret = -ENOMEM;
+		goto out_err;
+	}
+
+	ret = fdt_open_into(initial_boot_params, buf, buf_size);
+	if (ret) {
+		ret = -EINVAL;
+		goto out_err;
+	}
+
+	nodeoffset = fdt_path_offset(buf, "/chosen");
+	if (nodeoffset < 0) {
+		ret = -EINVAL;
+		goto out_err;
+	}
+
+	/* add bootargs */
+	if (cmdline) {
+		ret = fdt_setprop_string(buf, nodeoffset, "bootargs", cmdline);
+		if (ret) {
+			ret = -EINVAL;
+			goto out_err;
+		}
+	} else {
+		ret = fdt_delprop(buf, nodeoffset, "bootargs");
+		if (ret && (ret != -FDT_ERR_NOTFOUND)) {
+			ret = -EINVAL;
+			goto out_err;
+		}
+	}
+
+	/* add initrd-*; fdt_setprop_u64() converts to big-endian itself */
+	if (initrd_load_addr) {
+		value = initrd_load_addr;
+		ret = fdt_setprop_u64(buf, nodeoffset, "linux,initrd-start",
+							value);
+		if (ret) {
+			ret = -EINVAL;
+			goto out_err;
+		}
+
+		value = initrd_load_addr + initrd_len;
+		ret = fdt_setprop_u64(buf, nodeoffset, "linux,initrd-end",
+							value);
+		if (ret) {
+			ret = -EINVAL;
+			goto out_err;
+		}
+	} else {
+		ret = fdt_delprop(buf, nodeoffset, "linux,initrd-start");
+		if (ret && (ret != -FDT_ERR_NOTFOUND)) {
+			ret = -EINVAL;
+			goto out_err;
+		}
+
+		ret = fdt_delprop(buf, nodeoffset, "linux,initrd-end");
+		if (ret && (ret != -FDT_ERR_NOTFOUND)) {
+			ret = -EINVAL;
+			goto out_err;
+		}
+	}
+
+	/* trim a buffer */
+	fdt_pack(buf);
+	*dtb_buf = buf;
+	*dtb_buf_len = fdt_totalsize(buf);
+
+	return 0;
+
+out_err:
+	vfree(buf);
+	return ret;
+}
+
+int load_other_segments(struct kimage *image,
+			unsigned long kernel_load_addr,
+			unsigned long kernel_size,
+			char *initrd, unsigned long initrd_len,
+			char *cmdline, unsigned long cmdline_len)
+{
+	struct kexec_buf kbuf;
+	unsigned long initrd_load_addr = 0;
+	char *dtb = NULL;
+	unsigned long dtb_len = 0;
+	int ret = 0;
+
+	kbuf.image = image;
+	/* not allocate anything below the kernel */
+	kbuf.buf_min = kernel_load_addr + kernel_size;
+
+	/* load initrd */
+	if (initrd) {
+		kbuf.buffer = initrd;
+		kbuf.bufsz = initrd_len;
+		kbuf.memsz = initrd_len;
+		kbuf.buf_align = 0;
+		/* within 1GB-aligned window of up to 32GB in size */
+		kbuf.buf_max = round_down(kernel_load_addr, SZ_1G)
+						+ (unsigned long)SZ_1G * 32;
+		kbuf.top_down = false;
+
+		ret = kexec_add_buffer(&kbuf);
+		if (ret)
+			goto out_err;
+		initrd_load_addr = kbuf.mem;
+
+		pr_debug("Loaded initrd at 0x%lx bufsz=0x%lx memsz=0x%lx\n",
+				initrd_load_addr, initrd_len, initrd_len);
+	}
+
+	/* load dtb blob */
+	ret = setup_dtb(image, initrd_load_addr, initrd_len,
+				cmdline, cmdline_len, &dtb, &dtb_len);
+	if (ret) {
+		pr_err("Preparing for new dtb failed\n");
+		goto out_err;
+	}
+
+	kbuf.buffer = dtb;
+	kbuf.bufsz = dtb_len;
+	kbuf.memsz = dtb_len;
+	/* not across 2MB boundary */
+	kbuf.buf_align = SZ_2M;
+	kbuf.buf_max = ULONG_MAX;
+	kbuf.top_down = true;
+
+	ret = kexec_add_buffer(&kbuf);
+	if (ret)
+		goto out_err;
+	image->arch.dtb_mem = kbuf.mem;
+	image->arch.dtb_buf = dtb;
+
+	pr_debug("Loaded dtb at 0x%lx bufsz=0x%lx memsz=0x%lx\n",
+			kbuf.mem, dtb_len, dtb_len);
+
+	return 0;
+
+out_err:
+	vfree(dtb);
+	return ret;
+}
-- 
2.17.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v11 10/15] arm64: kexec_file: allow for loading Image-format kernel
  2018-07-11  7:41 [PATCH v11 00/15] subject: arm64: kexec: add kexec_file_load() support AKASHI Takahiro
                   ` (8 preceding siblings ...)
  2018-07-11  7:41 ` [PATCH v11 09/15] arm64: kexec_file: load initrd and device-tree AKASHI Takahiro
@ 2018-07-11  7:41 ` AKASHI Takahiro
  2018-07-18 16:47   ` James Morse
  2018-07-11  7:41 ` [PATCH v11 11/15] arm64: kexec_file: add crash dump support AKASHI Takahiro
                   ` (4 subsequent siblings)
  14 siblings, 1 reply; 38+ messages in thread
From: AKASHI Takahiro @ 2018-07-11  7:41 UTC (permalink / raw)
  To: catalin.marinas, will.deacon, dhowells, vgoyal, herbert, davem,
	dyoung, bhe, arnd
  Cc: ard.biesheuvel, james.morse, bhsharma, kexec, linux-arm-kernel,
	linux-kernel, AKASHI Takahiro

This patch provides kexec_file_ops for the "Image"-format kernel. In this
implementation, a binary is always loaded at a fixed offset given by the
text_offset field of its header.
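
As a hedged illustration of the header handling, the user-space sketch
below checks the magic number and derives the entry address from
text_offset. The byte offsets follow struct arm64_image_header as added
by this patch; read_le64(), image_probe_sketch(), image_entry() and the
sample values are hypothetical names and data for this example only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ARM64_MAGIC		"ARM\x64"
#define IMAGE_HEADER_SIZE	64	/* sizeof(struct arm64_image_header) */
#define OFF_TEXT_OFFSET		8	/* after code0 and code1 */
#define OFF_MAGIC		56	/* after flags and reserved[3] */

/* Read a little-endian 64-bit value regardless of host endianness. */
static uint64_t read_le64(const unsigned char *p)
{
	uint64_t v = 0;
	int i;

	for (i = 7; i >= 0; i--)
		v = (v << 8) | p[i];
	return v;
}

/* Return 0 if buf looks like an arm64 Image, -1 otherwise. */
static int image_probe_sketch(const unsigned char *buf, size_t len)
{
	if (!buf || len < IMAGE_HEADER_SIZE)
		return -1;
	/* Compare the 4 magic bytes only, not the literal's trailing NUL. */
	if (memcmp(buf + OFF_MAGIC, ARM64_MAGIC, 4))
		return -1;
	return 0;
}

/* Entry point = 2MB-aligned base picked by the loader + text_offset. */
static uint64_t image_entry(const unsigned char *buf, uint64_t base)
{
	return base + read_le64(buf + OFF_TEXT_OFFSET);
}

/* Hypothetical usage with a hand-built header, text_offset = 0x80000. */
static void header_example(void)
{
	unsigned char hdr[IMAGE_HEADER_SIZE] = { 0 };

	hdr[OFF_TEXT_OFFSET + 2] = 0x08;
	memcpy(hdr + OFF_MAGIC, ARM64_MAGIC, 4);
	assert(image_probe_sketch(hdr, sizeof(hdr)) == 0);
	assert(image_entry(hdr, 0x40000000) == 0x40080000);
}
```

Comparing exactly four bytes matters here: sizeof(ARM64_MAGIC) is 5
because of the terminating NUL, which would also pull in the first byte
of the pe_header field.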

Regarding signature verification for trusted boot, this patch doesn't
contain CONFIG_KEXEC_VERIFY_SIG support, which is to be added later
in this series, but file-attribute-based verification is still a viable
option by enabling the IMA security subsystem.

You can sign (label) a to-be-kexec'ed kernel image on the target file
system with:
    $ evmctl ima_sign --key /path/to/private_key.pem Image

On a live system, you must have IMA enforced with, at least, the following
security policy:
    "appraise func=KEXEC_KERNEL_CHECK appraise_type=imasig"

See more details about IMA here:
    https://sourceforge.net/p/linux-ima/wiki/Home/

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/kexec.h         |  28 +++++++
 arch/arm64/kernel/Makefile             |   2 +-
 arch/arm64/kernel/kexec_image.c        | 108 +++++++++++++++++++++++++
 arch/arm64/kernel/machine_kexec_file.c |   1 +
 4 files changed, 138 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/kernel/kexec_image.c

diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
index 01bbf6cebf12..69333694e3e2 100644
--- a/arch/arm64/include/asm/kexec.h
+++ b/arch/arm64/include/asm/kexec.h
@@ -101,6 +101,34 @@ struct kimage_arch {
 	void *dtb_buf;
 };
 
+/**
+ * struct arm64_image_header - arm64 kernel image header
+ * See Documentation/arm64/booting.txt for details
+ *
+ * @mz_magic: DOS header magic number ('MZ', optional)
+ * @code1: Instruction (branch to stext)
+ * @text_offset: Image load offset
+ * @image_size: Effective image size
+ * @flags: Bit-field flags
+ * @reserved: Reserved
+ * @magic: Magic number
+ * @pe_header: Offset to PE COFF header (optional)
+ **/
+
+struct arm64_image_header {
+	__le16 mz_magic; /* also code0 */
+	__le16 pad;
+	__le32 code1;
+	__le64 text_offset;
+	__le64 image_size;
+	__le64 flags;
+	__le64 reserved[3];
+	__le32 magic;
+	__le32 pe_header;
+};
+
+extern const struct kexec_file_ops kexec_image_ops;
+
 struct kimage;
 
 extern int load_other_segments(struct kimage *image,
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 06281e1ad7ed..a9cc7752f276 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -50,7 +50,7 @@ arm64-obj-$(CONFIG_RANDOMIZE_BASE)	+= kaslr.o
 arm64-obj-$(CONFIG_HIBERNATION)		+= hibernate.o hibernate-asm.o
 arm64-obj-$(CONFIG_KEXEC_CORE)		+= machine_kexec.o relocate_kernel.o	\
 					   cpu-reset.o
-arm64-obj-$(CONFIG_KEXEC_FILE)		+= machine_kexec_file.o
+arm64-obj-$(CONFIG_KEXEC_FILE)		+= machine_kexec_file.o kexec_image.o
 arm64-obj-$(CONFIG_ARM64_RELOC_TEST)	+= arm64-reloc-test.o
 arm64-reloc-test-y := reloc_test_core.o reloc_test_syms.o
 arm64-obj-$(CONFIG_CRASH_DUMP)		+= crash_dump.o
diff --git a/arch/arm64/kernel/kexec_image.c b/arch/arm64/kernel/kexec_image.c
new file mode 100644
index 000000000000..a47cf9bc699e
--- /dev/null
+++ b/arch/arm64/kernel/kexec_image.c
@@ -0,0 +1,108 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Kexec image loader
+
+ * Copyright (C) 2018 Linaro Limited
+ * Author: AKASHI Takahiro <takahiro.akashi@linaro.org>
+ */
+
+#define pr_fmt(fmt)	"kexec_file(Image): " fmt
+
+#include <linux/err.h>
+#include <linux/errno.h>
+#include <linux/kernel.h>
+#include <linux/kexec.h>
+#include <linux/string.h>
+#include <asm/boot.h>
+#include <asm/byteorder.h>
+#include <asm/cpufeature.h>
+#include <asm/memory.h>
+
+static int image_probe(const char *kernel_buf, unsigned long kernel_len)
+{
+	const struct arm64_image_header *h;
+
+	h = (const struct arm64_image_header *)(kernel_buf);
+
+	if (!h || (kernel_len < sizeof(*h)) ||
+			memcmp(&h->magic, ARM64_MAGIC, sizeof(h->magic)))
+		return -EINVAL;
+
+	return 0;
+}
+
+static void *image_load(struct kimage *image,
+				char *kernel, unsigned long kernel_len,
+				char *initrd, unsigned long initrd_len,
+				char *cmdline, unsigned long cmdline_len)
+{
+	struct arm64_image_header *h;
+	u64 flags, value;
+	struct kexec_buf kbuf;
+	unsigned long text_offset;
+	struct kexec_segment *kernel_segment;
+	int ret;
+
+	/* Don't support old kernel */
+	h = (struct arm64_image_header *)kernel;
+	if (!h->text_offset)
+		return ERR_PTR(-EINVAL);
+
+	/* Check cpu features */
+	flags = le64_to_cpu(h->flags);
+	value = head_flag_field(flags, HEAD_FLAG_BE);
+	if (((value == HEAD_FLAG_BE) && !IS_ENABLED(CONFIG_CPU_BIG_ENDIAN)) ||
+	    ((value != HEAD_FLAG_BE) && IS_ENABLED(CONFIG_CPU_BIG_ENDIAN)))
+		if (!system_supports_mixed_endian())
+			return ERR_PTR(-EINVAL);
+
+	value = head_flag_field(flags, HEAD_FLAG_PAGE_SIZE);
+	if (((value == HEAD_FLAG_PAGE_SIZE_4K) &&
+			!system_supports_4kb_granule()) ||
+	    ((value == HEAD_FLAG_PAGE_SIZE_64K) &&
+			!system_supports_64kb_granule()) ||
+	    ((value == HEAD_FLAG_PAGE_SIZE_16K) &&
+			!system_supports_16kb_granule()))
+		return ERR_PTR(-EINVAL);
+
+	/* Load the kernel */
+	kbuf.image = image;
+	kbuf.buf_min = 0;
+	kbuf.buf_max = ULONG_MAX;
+	kbuf.top_down = false;
+
+	kbuf.buffer = kernel;
+	kbuf.bufsz = kernel_len;
+	kbuf.memsz = le64_to_cpu(h->image_size);
+	text_offset = le64_to_cpu(h->text_offset);
+	kbuf.buf_align = SZ_2M;
+
+	/* Adjust kernel segment with TEXT_OFFSET */
+	kbuf.memsz += text_offset;
+
+	ret = kexec_add_buffer(&kbuf);
+	if (ret)
+		goto out;
+
+	kernel_segment = &image->segment[image->nr_segments - 1];
+	kernel_segment->mem += text_offset;
+	kernel_segment->memsz -= text_offset;
+	image->start = kernel_segment->mem;
+
+	pr_debug("Loaded kernel at 0x%lx bufsz=0x%lx memsz=0x%lx\n",
+				kernel_segment->mem, kbuf.bufsz,
+				kernel_segment->memsz);
+
+	/* Load additional data */
+	ret = load_other_segments(image,
+				kernel_segment->mem, kernel_segment->memsz,
+				initrd, initrd_len, cmdline, cmdline_len);
+
+out:
+	return ERR_PTR(ret);
+}
+
+const struct kexec_file_ops kexec_image_ops = {
+	.probe = image_probe,
+	.load = image_load,
+};
diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
index ca00681c25c6..a0b44fe18b95 100644
--- a/arch/arm64/kernel/machine_kexec_file.c
+++ b/arch/arm64/kernel/machine_kexec_file.c
@@ -20,6 +20,7 @@
 #include <asm/byteorder.h>
 
 const struct kexec_file_ops * const kexec_file_loaders[] = {
+	&kexec_image_ops,
 	NULL
 };
 
-- 
2.17.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v11 11/15] arm64: kexec_file: add crash dump support
  2018-07-11  7:41 [PATCH v11 00/15] subject: arm64: kexec: add kexec_file_load() support AKASHI Takahiro
                   ` (9 preceding siblings ...)
  2018-07-11  7:41 ` [PATCH v11 10/15] arm64: kexec_file: allow for loading Image-format kernel AKASHI Takahiro
@ 2018-07-11  7:41 ` AKASHI Takahiro
  2018-07-18 16:50   ` James Morse
  2018-07-11  7:42 ` [PATCH v11 12/15] arm64: kexec_file: invoke the kernel without purgatory AKASHI Takahiro
                   ` (3 subsequent siblings)
  14 siblings, 1 reply; 38+ messages in thread
From: AKASHI Takahiro @ 2018-07-11  7:41 UTC (permalink / raw)
  To: catalin.marinas, will.deacon, dhowells, vgoyal, herbert, davem,
	dyoung, bhe, arnd
  Cc: ard.biesheuvel, james.morse, bhsharma, kexec, linux-arm-kernel,
	linux-kernel, AKASHI Takahiro

Enabling crash dump (kdump) involves
* preparing the contents of the ELF header of a core dump file,
  /proc/vmcore, using crash_prepare_elf64_headers(), and
* adding two device tree properties, "linux,usable-memory-range" and
  "linux,elfcorehdr", which represent, respectively, the memory range
  to be used by the crash dump kernel and the header's location.
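
The reason prepare_elf_headers() reserves one extra range "for exclusion
of crashkernel region" can be seen in the simplified user-space sketch
below: carving a region out of the middle of a memory range splits it in
two. This is not the kernel's crash_exclude_mem_range(); struct range,
exclude_range() and the addresses are hypothetical, and the sketch
assumes the excluded region lies entirely within a single range.

```c
#include <assert.h>
#include <stdint.h>

struct range {
	uint64_t start;
	uint64_t end;	/* inclusive */
};

/*
 * Remove [ex_start, ex_end] from r[0..nr-1], which can hold up to max
 * entries. Returns the new number of ranges, or -1 if a split is
 * needed but no free slot is left.
 */
static int exclude_range(struct range *r, int nr, int max,
			 uint64_t ex_start, uint64_t ex_end)
{
	int i, j;

	for (i = 0; i < nr; i++) {
		if (ex_start < r[i].start || ex_end > r[i].end)
			continue;	/* not fully inside this range */

		if (ex_start == r[i].start && ex_end == r[i].end) {
			/* whole range removed: close the gap */
			for (j = i; j < nr - 1; j++)
				r[j] = r[j + 1];
			return nr - 1;
		}
		if (ex_start == r[i].start) {
			r[i].start = ex_end + 1;
			return nr;
		}
		if (ex_end == r[i].end) {
			r[i].end = ex_start - 1;
			return nr;
		}

		/* split in the middle: this needs one extra slot */
		if (nr == max)
			return -1;
		for (j = nr; j > i + 1; j--)
			r[j] = r[j - 1];
		r[i + 1].start = ex_end + 1;
		r[i + 1].end = r[i].end;
		r[i].end = ex_start - 1;
		return nr + 1;
	}
	return nr;	/* nothing to exclude */
}

/* Hypothetical usage: one 1GB bank with a 256MB crashkernel region. */
static void exclude_example(void)
{
	struct range r[2] = { { 0x80000000, 0xbfffffff } };

	assert(exclude_range(r, 1, 2, 0x90000000, 0x9fffffff) == 2);
	assert(r[0].end == 0x8fffffff && r[1].start == 0xa0000000);
}
```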

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/kexec.h         |   4 +
 arch/arm64/kernel/kexec_image.c        |   9 +-
 arch/arm64/kernel/machine_kexec_file.c | 114 ++++++++++++++++++++++++-
 3 files changed, 124 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
index 69333694e3e2..eeb5766928b0 100644
--- a/arch/arm64/include/asm/kexec.h
+++ b/arch/arm64/include/asm/kexec.h
@@ -99,6 +99,10 @@ static inline void crash_post_resume(void) {}
 struct kimage_arch {
 	phys_addr_t dtb_mem;
 	void *dtb_buf;
+	/* Core ELF header buffer */
+	void *elf_headers;
+	unsigned long elf_headers_sz;
+	unsigned long elf_load_addr;
 };
 
 /**
diff --git a/arch/arm64/kernel/kexec_image.c b/arch/arm64/kernel/kexec_image.c
index a47cf9bc699e..df1e341d3a28 100644
--- a/arch/arm64/kernel/kexec_image.c
+++ b/arch/arm64/kernel/kexec_image.c
@@ -67,8 +67,13 @@ static void *image_load(struct kimage *image,
 
 	/* Load the kernel */
 	kbuf.image = image;
-	kbuf.buf_min = 0;
-	kbuf.buf_max = ULONG_MAX;
+	if (image->type == KEXEC_TYPE_CRASH) {
+		kbuf.buf_min = crashk_res.start;
+		kbuf.buf_max = crashk_res.end + 1;
+	} else {
+		kbuf.buf_min = 0;
+		kbuf.buf_max = ULONG_MAX;
+	}
 	kbuf.top_down = false;
 
 	kbuf.buffer = kernel;
diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
index a0b44fe18b95..261564df7210 100644
--- a/arch/arm64/kernel/machine_kexec_file.c
+++ b/arch/arm64/kernel/machine_kexec_file.c
@@ -16,7 +16,9 @@
 #include <linux/libfdt.h>
 #include <linux/memblock.h>
 #include <linux/of_fdt.h>
+#include <linux/slab.h>
 #include <linux/types.h>
+#include <linux/vmalloc.h>
 #include <asm/byteorder.h>
 
 const struct kexec_file_ops * const kexec_file_loaders[] = {
@@ -29,6 +31,10 @@ int arch_kimage_file_post_load_cleanup(struct kimage *image)
 	vfree(image->arch.dtb_buf);
 	image->arch.dtb_buf = NULL;
 
+	vfree(image->arch.elf_headers);
+	image->arch.elf_headers = NULL;
+	image->arch.elf_headers_sz = 0;
+
 	return kexec_image_post_load_cleanup_default(image);
 }
 
@@ -38,13 +44,31 @@ static int setup_dtb(struct kimage *image,
 		char **dtb_buf, size_t *dtb_buf_len)
 {
 	char *buf = NULL;
-	size_t buf_size;
+	size_t buf_size, range_size;
 	int nodeoffset;
 	u64 value;
 	int ret;
 
+	/* check ranges against root's #address-cells and #size-cells */
+	if (image->type == KEXEC_TYPE_CRASH &&
+		(!of_fdt_cells_size_fitted(image->arch.elf_load_addr,
+				image->arch.elf_headers_sz) ||
+		 !of_fdt_cells_size_fitted(crashk_res.start,
+				crashk_res.end - crashk_res.start + 1))) {
+		pr_err("Crash memory region doesn't fit into DT's root cell sizes.\n");
+		ret = -EINVAL;
+		goto out_err;
+	}
+
 	/* duplicate dt blob */
 	buf_size = fdt_totalsize(initial_boot_params);
+	range_size = of_fdt_reg_cells_size();
+
+	if (image->type == KEXEC_TYPE_CRASH) {
+		buf_size += fdt_prop_len("linux,elfcorehdr", range_size);
+		buf_size += fdt_prop_len("linux,usable-memory-range",
+								range_size);
+	}
 
 	if (initrd_load_addr) {
 		/* can be redundant, but trimmed at the end */
@@ -74,6 +98,23 @@ static int setup_dtb(struct kimage *image,
 		goto out_err;
 	}
 
+	if (image->type == KEXEC_TYPE_CRASH) {
+		/* add linux,elfcorehdr */
+		ret = fdt_setprop_reg(buf, nodeoffset, "linux,elfcorehdr",
+				image->arch.elf_load_addr,
+				image->arch.elf_headers_sz);
+		if (ret)
+			goto out_err;
+
+		/* add linux,usable-memory-range */
+		ret = fdt_setprop_reg(buf, nodeoffset,
+				"linux,usable-memory-range",
+				crashk_res.start,
+				crashk_res.end - crashk_res.start + 1);
+		if (ret)
+			goto out_err;
+	}
+
 	/* add bootargs */
 	if (cmdline) {
 		ret = fdt_setprop_string(buf, nodeoffset, "bootargs", cmdline);
@@ -132,6 +173,45 @@ static int setup_dtb(struct kimage *image,
 	return ret;
 }
 
+static int prepare_elf_headers(void **addr, unsigned long *sz)
+{
+	struct crash_mem *cmem;
+	unsigned int nr_ranges;
+	int ret;
+	u64 i;
+	phys_addr_t start, end;
+
+	nr_ranges = 1; /* for exclusion of crashkernel region */
+	for_each_mem_range(i, &memblock.memory, NULL, NUMA_NO_NODE, 0,
+							&start, &end, NULL)
+		nr_ranges++;
+
+	cmem = kmalloc(sizeof(struct crash_mem) +
+			sizeof(struct crash_mem_range) * nr_ranges, GFP_KERNEL);
+	if (!cmem)
+		return -ENOMEM;
+
+	cmem->max_nr_ranges = nr_ranges;
+	cmem->nr_ranges = 0;
+	for_each_mem_range(i, &memblock.memory, NULL, NUMA_NO_NODE, 0,
+							&start, &end, NULL) {
+		cmem->ranges[cmem->nr_ranges].start = start;
+		cmem->ranges[cmem->nr_ranges].end = end - 1;
+		cmem->nr_ranges++;
+	}
+
+	/* Exclude crashkernel region */
+	ret = crash_exclude_mem_range(cmem, crashk_res.start, crashk_res.end);
+	if (ret)
+		goto out;
+
+	ret =  crash_prepare_elf64_headers(cmem, true, addr, sz);
+
+out:
+	kfree(cmem);
+	return ret;
+}
+
 int load_other_segments(struct kimage *image,
 			unsigned long kernel_load_addr,
 			unsigned long kernel_size,
@@ -139,11 +219,43 @@ int load_other_segments(struct kimage *image,
 			char *cmdline, unsigned long cmdline_len)
 {
 	struct kexec_buf kbuf;
+	void *hdrs_addr;
+	unsigned long hdrs_sz;
 	unsigned long initrd_load_addr = 0;
 	char *dtb = NULL;
 	unsigned long dtb_len = 0;
 	int ret = 0;
 
+	/* load elf core header */
+	if (image->type == KEXEC_TYPE_CRASH) {
+		ret = prepare_elf_headers(&hdrs_addr, &hdrs_sz);
+		if (ret) {
+			pr_err("Preparing elf core header failed\n");
+			goto out_err;
+		}
+
+		kbuf.image = image;
+		kbuf.buffer = hdrs_addr;
+		kbuf.bufsz = hdrs_sz;
+		kbuf.memsz = hdrs_sz;
+		kbuf.buf_align = PAGE_SIZE;
+		kbuf.buf_min = crashk_res.start;
+		kbuf.buf_max = crashk_res.end + 1;
+		kbuf.top_down = true;
+
+		ret = kexec_add_buffer(&kbuf);
+		if (ret) {
+			vfree(hdrs_addr);
+			goto out_err;
+		}
+		image->arch.elf_headers = hdrs_addr;
+		image->arch.elf_headers_sz = hdrs_sz;
+		image->arch.elf_load_addr = kbuf.mem;
+
+		pr_debug("Loaded elf core header at 0x%lx bufsz=0x%lx memsz=0x%lx\n",
+				 image->arch.elf_load_addr, hdrs_sz, hdrs_sz);
+	}
+
 	kbuf.image = image;
 	/* not allocate anything below the kernel */
 	kbuf.buf_min = kernel_load_addr + kernel_size;
-- 
2.17.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v11 12/15] arm64: kexec_file: invoke the kernel without purgatory
  2018-07-11  7:41 [PATCH v11 00/15] subject: arm64: kexec: add kexec_file_load() support AKASHI Takahiro
                   ` (10 preceding siblings ...)
  2018-07-11  7:41 ` [PATCH v11 11/15] arm64: kexec_file: add crash dump support AKASHI Takahiro
@ 2018-07-11  7:42 ` AKASHI Takahiro
  2018-07-11  7:42 ` [PATCH v11 13/15] include: pe.h: remove message[] from mz header definition AKASHI Takahiro
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 38+ messages in thread
From: AKASHI Takahiro @ 2018-07-11  7:42 UTC (permalink / raw)
  To: catalin.marinas, will.deacon, dhowells, vgoyal, herbert, davem,
	dyoung, bhe, arnd
  Cc: ard.biesheuvel, james.morse, bhsharma, kexec, linux-arm-kernel,
	linux-kernel, AKASHI Takahiro

On arm64, purgatory would do almost nothing, so just invoke the secondary
kernel directly by jumping into its entry code.

In this case, cpu_soft_restart() must be called with the dtb address in
the fifth argument. The behavior stays compatible with the kexec_load
case as long as that argument is null.

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Reviewed-by: James Morse <james.morse@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/kernel/cpu-reset.S       |  8 ++++----
 arch/arm64/kernel/machine_kexec.c   | 12 ++++++++++--
 arch/arm64/kernel/relocate_kernel.S |  3 ++-
 3 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/kernel/cpu-reset.S b/arch/arm64/kernel/cpu-reset.S
index 8021b46c9743..a2be30275a73 100644
--- a/arch/arm64/kernel/cpu-reset.S
+++ b/arch/arm64/kernel/cpu-reset.S
@@ -22,11 +22,11 @@
  * __cpu_soft_restart(el2_switch, entry, arg0, arg1, arg2) - Helper for
  * cpu_soft_restart.
  *
- * @el2_switch: Flag to indicate a swich to EL2 is needed.
+ * @el2_switch: Flag to indicate a switch to EL2 is needed.
  * @entry: Location to jump to for soft reset.
- * arg0: First argument passed to @entry.
- * arg1: Second argument passed to @entry.
- * arg2: Third argument passed to @entry.
+ * arg0: First argument passed to @entry. (relocation list)
+ * arg1: Second argument passed to @entry. (physical kernel entry)
+ * arg2: Third argument passed to @entry. (physical dtb address)
  *
  * Put the CPU into the same state as it would be if it had been reset, and
  * branch to what would be the reset vector. It must be executed with the
diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
index f76ea92dff91..830a5063e09d 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -205,10 +205,18 @@ void machine_kexec(struct kimage *kimage)
 	 * uses physical addressing to relocate the new image to its final
 	 * position and transfers control to the image entry point when the
 	 * relocation is complete.
+	 * In kexec case, kimage->start points to purgatory assuming that
+	 * kernel entry and dtb address are embedded in purgatory by
+	 * userspace (kexec-tools).
+	 * In kexec_file case, the kernel starts directly without purgatory.
 	 */
-
 	cpu_soft_restart(kimage != kexec_crash_image,
-		reboot_code_buffer_phys, kimage->head, kimage->start, 0);
+		reboot_code_buffer_phys, kimage->head, kimage->start,
+#ifdef CONFIG_KEXEC_FILE
+						kimage->arch.dtb_mem);
+#else
+						0);
+#endif
 
 	BUG(); /* Should never get here. */
 }
diff --git a/arch/arm64/kernel/relocate_kernel.S b/arch/arm64/kernel/relocate_kernel.S
index f407e422a720..95fd94209aae 100644
--- a/arch/arm64/kernel/relocate_kernel.S
+++ b/arch/arm64/kernel/relocate_kernel.S
@@ -32,6 +32,7 @@
 ENTRY(arm64_relocate_new_kernel)
 
 	/* Setup the list loop variables. */
+	mov	x18, x2				/* x18 = dtb address */
 	mov	x17, x1				/* x17 = kimage_start */
 	mov	x16, x0				/* x16 = kimage_head */
 	raw_dcache_line_size x15, x0		/* x15 = dcache line size */
@@ -107,7 +108,7 @@ ENTRY(arm64_relocate_new_kernel)
 	isb
 
 	/* Start new image. */
-	mov	x0, xzr
+	mov	x0, x18
 	mov	x1, xzr
 	mov	x2, xzr
 	mov	x3, xzr
-- 
2.17.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v11 13/15] include: pe.h: remove message[] from mz header definition
  2018-07-11  7:41 [PATCH v11 00/15] subject: arm64: kexec: add kexec_file_load() support AKASHI Takahiro
                   ` (11 preceding siblings ...)
  2018-07-11  7:42 ` [PATCH v11 12/15] arm64: kexec_file: invoke the kernel without purgatory AKASHI Takahiro
@ 2018-07-11  7:42 ` AKASHI Takahiro
  2018-07-11  7:42 ` [PATCH v11 14/15] arm64: kexec_file: add kernel signature verification support AKASHI Takahiro
  2018-07-11  7:42 ` [PATCH v11 15/15] arm64: kexec_file: add kaslr support AKASHI Takahiro
  14 siblings, 0 replies; 38+ messages in thread
From: AKASHI Takahiro @ 2018-07-11  7:42 UTC (permalink / raw)
  To: catalin.marinas, will.deacon, dhowells, vgoyal, herbert, davem,
	dyoung, bhe, arnd
  Cc: ard.biesheuvel, james.morse, bhsharma, kexec, linux-arm-kernel,
	linux-kernel, AKASHI Takahiro

The message[] field will no longer be part of the definition of the mz
header.

This change is crucial for enabling kexec_file_load on arm64 because
arm64's "Image" binary, although in PE format, carries no data for that
field and, without this change, the following check in
pefile_parse_binary() fails:

	chkaddr(cursor, mz->peaddr, sizeof(*pe));
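
The effect on the check can be illustrated with a small user-space
sketch. It is not the kernel code: the two structs collapse the fields
that precede peaddr into one array (assuming 30 16-bit words, i.e. 60
bytes, come before it), chkaddr_model() is a hypothetical stand-in that
assumes the check rejects an offset lying before the current cursor, and
the PE header offset of 64 for arm64's Image is likewise an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Old layout: message[64] counts towards sizeof(). Illustrative only. */
struct mz_hdr_fixed {
	uint16_t words[30];	/* assumed: all fields before peaddr */
	uint32_t peaddr;	/* address of pe header */
	char     message[64];	/* message to print */
};

/* New layout: a flexible array member adds nothing to sizeof(). */
struct mz_hdr_flex {
	uint16_t words[30];
	uint32_t peaddr;
	char     message[];
};

/*
 * Hypothetical model of the range check: returns 0 when the s bytes at
 * offset x start at or after base and fit inside datalen, -1 otherwise.
 */
int chkaddr_model(size_t base, size_t x, size_t s, size_t datalen)
{
	if (x < base || s >= datalen || x > datalen - s)
		return -1;
	return 0;
}

/* The cursor sits just past the mz header when the PE offset is checked. */
int pe_offset_ok_fixed(size_t peaddr, size_t pe_size, size_t datalen)
{
	return chkaddr_model(sizeof(struct mz_hdr_fixed), peaddr, pe_size,
			     datalen);
}

int pe_offset_ok_flex(size_t peaddr, size_t pe_size, size_t datalen)
{
	return chkaddr_model(sizeof(struct mz_hdr_flex), peaddr, pe_size,
			     datalen);
}
```

With a PE header at offset 64, pe_offset_ok_fixed() rejects the image
because the cursor is already at 128, while pe_offset_ok_flex() accepts
it because the cursor is at 64.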

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: David S. Miller <davem@davemloft.net>
---
 include/linux/pe.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/pe.h b/include/linux/pe.h
index 143ce75be5f0..3482b18a48b5 100644
--- a/include/linux/pe.h
+++ b/include/linux/pe.h
@@ -166,7 +166,7 @@ struct mz_hdr {
 	uint16_t oem_info;	/* oem specific */
 	uint16_t reserved1[10];	/* reserved */
 	uint32_t peaddr;	/* address of pe header */
-	char     message[64];	/* message to print */
+	char     message[];	/* message to print */
 };
 
 struct mz_reloc {
-- 
2.17.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v11 14/15] arm64: kexec_file: add kernel signature verification support
  2018-07-11  7:41 [PATCH v11 00/15] subject: arm64: kexec: add kexec_file_load() support AKASHI Takahiro
                   ` (12 preceding siblings ...)
  2018-07-11  7:42 ` [PATCH v11 13/15] include: pe.h: remove message[] from mz header definition AKASHI Takahiro
@ 2018-07-11  7:42 ` AKASHI Takahiro
  2018-07-11  7:42 ` [PATCH v11 15/15] arm64: kexec_file: add kaslr support AKASHI Takahiro
  14 siblings, 0 replies; 38+ messages in thread
From: AKASHI Takahiro @ 2018-07-11  7:42 UTC (permalink / raw)
  To: catalin.marinas, will.deacon, dhowells, vgoyal, herbert, davem,
	dyoung, bhe, arnd
  Cc: ard.biesheuvel, james.morse, bhsharma, kexec, linux-arm-kernel,
	linux-kernel, AKASHI Takahiro

With this patch, kernel verification can be done without the IMA security
subsystem enabled. Turn on CONFIG_KEXEC_VERIFY_SIG instead.

On x86, a signature is embedded into the PE file (Microsoft's format)
header of the binary. Since arm64's "Image" can also be seen as a PE file
as long as CONFIG_EFI is enabled, we adopt this format for kernel signing.

You can create a signed kernel image with:
    $ sbsign --key ${KEY} --cert ${CERT} Image

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/Kconfig              | 24 ++++++++++++++++++++++++
 arch/arm64/kernel/kexec_image.c | 15 +++++++++++++++
 2 files changed, 39 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index a9a3a5583c8b..1445eb2fc833 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -844,6 +844,30 @@ config KEXEC_FILE
 	  for kernel and initramfs as opposed to list of segments as
 	  accepted by previous system call.
 
+config KEXEC_VERIFY_SIG
+	bool "Verify kernel signature during kexec_file_load() syscall"
+	depends on KEXEC_FILE
+	help
+	  Select this option to verify the signature of a loaded kernel
+	  image. If configured, any attempt to load an image without a
+	  valid signature will fail.
+
+	  In addition to that option, you need to enable signature
+	  verification for the corresponding kernel image type being
+	  loaded in order for this to work.
+
+config KEXEC_IMAGE_VERIFY_SIG
+	bool "Enable Image signature verification support"
+	default y
+	depends on KEXEC_VERIFY_SIG
+	depends on EFI && SIGNED_PE_FILE_VERIFICATION
+	help
+	  Enable Image signature verification support.
+
+comment "Support for PE file signature verification disabled"
+	depends on KEXEC_VERIFY_SIG
+	depends on !EFI || !SIGNED_PE_FILE_VERIFICATION
+
 config CRASH_DUMP
 	bool "Build kdump crash kernel"
 	help
diff --git a/arch/arm64/kernel/kexec_image.c b/arch/arm64/kernel/kexec_image.c
index df1e341d3a28..bb0a95add197 100644
--- a/arch/arm64/kernel/kexec_image.c
+++ b/arch/arm64/kernel/kexec_image.c
@@ -13,6 +13,7 @@
 #include <linux/kernel.h>
 #include <linux/kexec.h>
 #include <linux/string.h>
+#include <linux/verification.h>
 #include <asm/boot.h>
 #include <asm/byteorder.h>
 #include <asm/cpufeature.h>
@@ -28,6 +29,9 @@ static int image_probe(const char *kernel_buf, unsigned long kernel_len)
 			!memcmp(&h->magic, ARM64_MAGIC, sizeof(ARM64_MAGIC)))
 		return -EINVAL;
 
+	pr_debug("PE format: %s\n",
+			memcmp(&h->mz_magic, "MZ", 2) ?  "no" : "yes");
+
 	return 0;
 }
 
@@ -107,7 +111,18 @@ static void *image_load(struct kimage *image,
 	return ERR_PTR(ret);
 }
 
+#ifdef CONFIG_KEXEC_IMAGE_VERIFY_SIG
+static int image_verify_sig(const char *kernel, unsigned long kernel_len)
+{
+	return verify_pefile_signature(kernel, kernel_len, NULL,
+				       VERIFYING_KEXEC_PE_SIGNATURE);
+}
+#endif
+
 const struct kexec_file_ops kexec_image_ops = {
 	.probe = image_probe,
 	.load = image_load,
+#ifdef CONFIG_KEXEC_IMAGE_VERIFY_SIG
+	.verify_sig = image_verify_sig,
+#endif
 };
-- 
2.17.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v11 15/15] arm64: kexec_file: add kaslr support
  2018-07-11  7:41 [PATCH v11 00/15] subject: arm64: kexec: add kexec_file_load() support AKASHI Takahiro
                   ` (13 preceding siblings ...)
  2018-07-11  7:42 ` [PATCH v11 14/15] arm64: kexec_file: add kernel signature verification support AKASHI Takahiro
@ 2018-07-11  7:42 ` AKASHI Takahiro
  14 siblings, 0 replies; 38+ messages in thread
From: AKASHI Takahiro @ 2018-07-11  7:42 UTC (permalink / raw)
  To: catalin.marinas, will.deacon, dhowells, vgoyal, herbert, davem,
	dyoung, bhe, arnd
  Cc: ard.biesheuvel, james.morse, bhsharma, kexec, linux-arm-kernel,
	linux-kernel, AKASHI Takahiro

Adding "kaslr-seed" to the dtb triggers kaslr, or kernel virtual
address randomization, when the secondary kernel boots. We always do this
as it does no harm on a kaslr-incapable kernel.

We don't have any "switch" to turn off this feature directly, but it can
still be suppressed by passing "nokaslr" as a kernel boot argument.

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/kernel/machine_kexec_file.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
index 261564df7210..99d771afe88e 100644
--- a/arch/arm64/kernel/machine_kexec_file.c
+++ b/arch/arm64/kernel/machine_kexec_file.c
@@ -16,6 +16,7 @@
 #include <linux/libfdt.h>
 #include <linux/memblock.h>
 #include <linux/of_fdt.h>
+#include <linux/random.h>
 #include <linux/slab.h>
 #include <linux/types.h>
 #include <linux/vmalloc.h>
@@ -161,6 +162,12 @@ static int setup_dtb(struct kimage *image,
 		}
 	}
 
+	/* add kaslr-seed */
+	get_random_bytes(&value, sizeof(value));
+	ret = fdt_setprop(buf, nodeoffset, "kaslr-seed", &value, sizeof(value));
+	if (ret)
+		goto out_err;
+
 	/* trim a buffer */
 	fdt_pack(buf);
 	*dtb_buf = buf;
-- 
2.17.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH v11 03/15] powerpc, kexec_file: factor out memblock-based arch_kexec_walk_mem()
  2018-07-11  7:41 ` [PATCH v11 03/15] powerpc, kexec_file: factor out memblock-based arch_kexec_walk_mem() AKASHI Takahiro
@ 2018-07-14  1:52   ` Dave Young
  2018-07-16 11:04     ` James Morse
  2018-07-16 12:26   ` Dave Young
  1 sibling, 1 reply; 38+ messages in thread
From: Dave Young @ 2018-07-14  1:52 UTC (permalink / raw)
  To: AKASHI Takahiro
  Cc: catalin.marinas, will.deacon, dhowells, vgoyal, herbert, davem,
	bhe, arnd, ard.biesheuvel, james.morse, bhsharma, kexec,
	linux-arm-kernel, linux-kernel, Eric W. Biederman

On 07/11/18 at 04:41pm, AKASHI Takahiro wrote:
> Memblock list is another source for usable system memory layout.
> So powerpc's arch_kexec_walk_mem() is moved to kexec_file.c so that
> other memblock-based architectures, particularly arm64, can also utilise
> it. A moved function is now renamed to kexec_walk_memblock() and merged
> into the existing arch_kexec_walk_mem() for general use, either resource
> list or memblock list.
> 
> A consequent function will not work for kdump with memblock list, but
> this will be fixed in the next patch.
> 
> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> Cc: "Eric W. Biederman" <ebiederm@xmission.com>
> Cc: Dave Young <dyoung@redhat.com>
> Cc: Vivek Goyal <vgoyal@redhat.com>
> Cc: Baoquan He <bhe@redhat.com>
> Acked-by: James Morse <james.morse@arm.com>
> ---
>  arch/powerpc/kernel/machine_kexec_file_64.c | 54 ---------------------
>  kernel/kexec_file.c                         | 54 +++++++++++++++++++++
>  2 files changed, 54 insertions(+), 54 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/machine_kexec_file_64.c b/arch/powerpc/kernel/machine_kexec_file_64.c
> index 0bd23dc789a4..5357b09902c5 100644
> --- a/arch/powerpc/kernel/machine_kexec_file_64.c
> +++ b/arch/powerpc/kernel/machine_kexec_file_64.c
> @@ -24,7 +24,6 @@
>  
>  #include <linux/slab.h>
>  #include <linux/kexec.h>
> -#include <linux/memblock.h>
>  #include <linux/of_fdt.h>
>  #include <linux/libfdt.h>
>  #include <asm/ima.h>
> @@ -46,59 +45,6 @@ int arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
>  	return kexec_image_probe_default(image, buf, buf_len);
>  }
>  
> -/**
> - * arch_kexec_walk_mem - call func(data) for each unreserved memory block
> - * @kbuf:	Context info for the search. Also passed to @func.
> - * @func:	Function to call for each memory block.
> - *
> - * This function is used by kexec_add_buffer and kexec_locate_mem_hole
> - * to find unreserved memory to load kexec segments into.
> - *
> - * Return: The memory walk will stop when func returns a non-zero value
> - * and that value will be returned. If all free regions are visited without
> - * func returning non-zero, then zero will be returned.
> - */
> -int arch_kexec_walk_mem(struct kexec_buf *kbuf,
> -			int (*func)(struct resource *, void *))
> -{
> -	int ret = 0;
> -	u64 i;
> -	phys_addr_t mstart, mend;
> -	struct resource res = { };
> -
> -	if (kbuf->top_down) {
> -		for_each_free_mem_range_reverse(i, NUMA_NO_NODE, 0,
> -						&mstart, &mend, NULL) {
> -			/*
> -			 * In memblock, end points to the first byte after the
> -			 * range while in kexec, end points to the last byte
> -			 * in the range.
> -			 */
> -			res.start = mstart;
> -			res.end = mend - 1;
> -			ret = func(&res, kbuf);
> -			if (ret)
> -				break;
> -		}
> -	} else {
> -		for_each_free_mem_range(i, NUMA_NO_NODE, 0, &mstart, &mend,
> -					NULL) {
> -			/*
> -			 * In memblock, end points to the first byte after the
> -			 * range while in kexec, end points to the last byte
> -			 * in the range.
> -			 */
> -			res.start = mstart;
> -			res.end = mend - 1;
> -			ret = func(&res, kbuf);
> -			if (ret)
> -				break;
> -		}
> -	}
> -
> -	return ret;
> -}
> -
>  /**
>   * setup_purgatory - initialize the purgatory's global variables
>   * @image:		kexec image.
> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> index 63c7ce1c0c3e..b088324fb3ad 100644
> --- a/kernel/kexec_file.c
> +++ b/kernel/kexec_file.c
> @@ -16,6 +16,7 @@
>  #include <linux/file.h>
>  #include <linux/slab.h>
>  #include <linux/kexec.h>
> +#include <linux/memblock.h>
>  #include <linux/mutex.h>
>  #include <linux/list.h>
>  #include <linux/fs.h>
> @@ -501,6 +502,55 @@ static int locate_mem_hole_callback(struct resource *res, void *arg)
>  	return locate_mem_hole_bottom_up(start, end, kbuf);
>  }
>  
> +#if defined(CONFIG_HAVE_MEMBLOCK) && !defined(CONFIG_ARCH_DISCARD_MEMBLOCK)
> +static int kexec_walk_memblock(struct kexec_buf *kbuf,
> +			       int (*func)(struct resource *, void *))
> +{
> +	int ret = 0;
> +	u64 i;
> +	phys_addr_t mstart, mend;
> +	struct resource res = { };
> +
> +	if (kbuf->top_down) {
> +		for_each_free_mem_range_reverse(i, NUMA_NO_NODE, 0,
> +						&mstart, &mend, NULL) {
> +			/*
> +			 * In memblock, end points to the first byte after the
> +			 * range while in kexec, end points to the last byte
> +			 * in the range.
> +			 */
> +			res.start = mstart;
> +			res.end = mend - 1;
> +			ret = func(&res, kbuf);
> +			if (ret)
> +				break;
> +		}
> +	} else {
> +		for_each_free_mem_range(i, NUMA_NO_NODE, 0, &mstart, &mend,
> +					NULL) {
> +			/*
> +			 * In memblock, end points to the first byte after the
> +			 * range while in kexec, end points to the last byte
> +			 * in the range.
> +			 */
> +			res.start = mstart;
> +			res.end = mend - 1;
> +			ret = func(&res, kbuf);
> +			if (ret)
> +				break;
> +		}
> +	}
> +
> +	return ret;
> +}
> +#else
> +static int kexec_walk_memblock(struct kexec_buf *kbuf,
> +			       int (*func)(struct resource *, void *))
> +{
> +	return 0;
> +}
> +#endif
> +
>  /**
>   * arch_kexec_walk_mem - call func(data) on free memory regions
>   * @kbuf:	Context info for the search. Also passed to @func.
> @@ -513,6 +563,10 @@ static int locate_mem_hole_callback(struct resource *res, void *arg)
>  int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf,
>  			       int (*func)(struct resource *, void *))
>  {
> +	if (IS_ENABLED(CONFIG_HAVE_MEMBLOCK) &&
> +			!IS_ENABLED(CONFIG_ARCH_DISCARD_MEMBLOCK))
> +		return kexec_walk_memblock(kbuf, func);

AKASHI, I'm not sure if this works on all arches. For example, I checked
the .config on my Nokia N900 kernel tree: there is HAVE_MEMBLOCK=y and
no CONFIG_ARCH_DISCARD_MEMBLOCK, and the 32bit arm code has no
arch_kexec_walk_mem().

> +
>  	if (kbuf->image->type == KEXEC_TYPE_CRASH)
>  		return walk_iomem_res_desc(crashk_res.desc,
>  					   IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY,
> -- 
> 2.17.0
> 

Thanks
Dave

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v11 03/15] powerpc, kexec_file: factor out memblock-based arch_kexec_walk_mem()
  2018-07-14  1:52   ` Dave Young
@ 2018-07-16 11:04     ` James Morse
  2018-07-16 12:24       ` Dave Young
  0 siblings, 1 reply; 38+ messages in thread
From: James Morse @ 2018-07-16 11:04 UTC (permalink / raw)
  To: Dave Young, AKASHI Takahiro
  Cc: catalin.marinas, will.deacon, dhowells, vgoyal, herbert, davem,
	bhe, arnd, ard.biesheuvel, bhsharma, kexec, linux-arm-kernel,
	linux-kernel, Eric W. Biederman

Hi Dave,

On 14/07/18 02:52, Dave Young wrote:
> On 07/11/18 at 04:41pm, AKASHI Takahiro wrote:
>> Memblock list is another source for usable system memory layout.
>> So powerpc's arch_kexec_walk_mem() is moved to kexec_file.c so that
>> other memblock-based architectures, particularly arm64, can also utilise
>> it. A moved function is now renamed to kexec_walk_memblock() and merged
>> into the existing arch_kexec_walk_mem() for general use, either resource
>> list or memblock list.
>>
>> A consequent function will not work for kdump with memblock list, but
>> this will be fixed in the next patch.

>> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c

>> @@ -513,6 +563,10 @@ static int locate_mem_hole_callback(struct resource *res, void *arg)
>>  int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf,
>>  			       int (*func)(struct resource *, void *))
>>  {
>> +	if (IS_ENABLED(CONFIG_HAVE_MEMBLOCK) &&
>> +			!IS_ENABLED(CONFIG_ARCH_DISCARD_MEMBLOCK))
>> +		return kexec_walk_memblock(kbuf, func);
> 
> AKASHI, I'm not sure if this works on all arches, for example I checked
> the .config on my Nokia N900 kernel tree, there is HAVE_MEMBLOCK=y and
> no CONFIG_ARCH_DISCARD_MEMBLOCK, in 32bit arm code no arch_kexec_walk_mem()
By "doesn't work" you mean it's a change in behaviour?
I think this is fine because 32bit arm doesn't support KEXEC_FILE, (this file is
kexec_file specific right?).

It only affects architectures with MEMBLOCK and KEXEC_FILE: powerpc, s390 and
soon arm64. s390 keeps its behaviour because it provides arch_kexec_walk_mem(),
and powerpc's is copied in here as its generic 'memblock describes my memory'
stuff. The implementation would be the same on arm64, so we're doing this to
avoid duplicating otherwise generic arch code. I think 32bit arm should be able
to use this too if it gets KEXEC_FILE support. (32bit arms' KEXEC already
depends on MEMBLOCK).


Thanks,

James

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v11 03/15] powerpc, kexec_file: factor out memblock-based arch_kexec_walk_mem()
  2018-07-16 11:04     ` James Morse
@ 2018-07-16 12:24       ` Dave Young
  2018-07-17  5:31         ` AKASHI Takahiro
  0 siblings, 1 reply; 38+ messages in thread
From: Dave Young @ 2018-07-16 12:24 UTC (permalink / raw)
  To: James Morse
  Cc: AKASHI Takahiro, catalin.marinas, will.deacon, dhowells, vgoyal,
	herbert, davem, bhe, arnd, ard.biesheuvel, bhsharma, kexec,
	linux-arm-kernel, linux-kernel, Eric W. Biederman

On 07/16/18 at 12:04pm, James Morse wrote:
> Hi Dave,
> 
> On 14/07/18 02:52, Dave Young wrote:
> > On 07/11/18 at 04:41pm, AKASHI Takahiro wrote:
> >> Memblock list is another source for usable system memory layout.
> >> So powerpc's arch_kexec_walk_mem() is moved to kexec_file.c so that
> >> other memblock-based architectures, particularly arm64, can also utilise
> >> it. A moved function is now renamed to kexec_walk_memblock() and merged
> >> into the existing arch_kexec_walk_mem() for general use, either resource
> >> list or memblock list.
> >>
> >> A consequent function will not work for kdump with memblock list, but
> >> this will be fixed in the next patch.
> 
> >> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> 
> >> @@ -513,6 +563,10 @@ static int locate_mem_hole_callback(struct resource *res, void *arg)
> >>  int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf,
> >>  			       int (*func)(struct resource *, void *))
> >>  {
> >> +	if (IS_ENABLED(CONFIG_HAVE_MEMBLOCK) &&
> >> +			!IS_ENABLED(CONFIG_ARCH_DISCARD_MEMBLOCK))
> >> +		return kexec_walk_memblock(kbuf, func);
> > 
> > AKASHI, I'm not sure if this works on all arches, for example I checked
> > the .config on my Nokia N900 kernel tree, there is HAVE_MEMBLOCK=y and
> > no CONFIG_ARCH_DISCARD_MEMBLOCK, in 32bit arm code no arch_kexec_walk_mem()
> By doesn't work you mean it's a change in behaviour?
> I think this is fine because 32bit arm doesn't support KEXEC_FILE, (this file is
> kexec_file specific right?).

Ah, replied on a train, I forgot this is only for kexec_file, sorry
about that.  Please ignore the comment.

But since we have a weak function arch_kexec_walk_mem, adding another
conditional branch within this weak function does not look good.
Something like below would be better:
 
int kexec_locate_mem_hole(struct kexec_buf *kbuf)
{
        int ret;

	+ if use memblock
	+	ret = kexec_walk_memblock()
	+ else
        	ret = arch_kexec_walk_mem(kbuf, locate_mem_hole_callback);

        return ret == 1 ? 0 : -EADDRNOTAVAIL;
}
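
For illustration only, the shape of that suggestion can be modelled in
user space as below. Nothing here is kernel code: the struct layouts,
the two stub walkers with their hard-coded ranges, and the use_memblock
flag (standing in for the "if use memblock" condition) are assumptions
made so the sketch compiles on its own.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-ins for the kernel types (assumed layouts). */
struct resource { unsigned long start, end; };
struct kexec_buf { unsigned long mem; unsigned long memsz; };

typedef int (*walk_fn)(struct resource *, void *);

/* Stub for the memblock walker: reports one hypothetical free range. */
static int kexec_walk_memblock(struct kexec_buf *kbuf, walk_fn func)
{
	struct resource res = { .start = 0x80000000UL, .end = 0x8fffffffUL };

	return func(&res, kbuf);
}

/* Stub for the weak arch hook: reports a different hypothetical range. */
static int arch_kexec_walk_mem(struct kexec_buf *kbuf, walk_fn func)
{
	struct resource res = { .start = 0x1000000UL, .end = 0x1ffffffUL };

	return func(&res, kbuf);
}

/* Returns 1 once a range large enough for the buffer has been found. */
static int locate_mem_hole_callback(struct resource *res, void *arg)
{
	struct kexec_buf *kbuf = arg;

	/* end is inclusive, so the range holds end - start + 1 bytes */
	if (res->end - res->start + 1 < kbuf->memsz)
		return 0;
	kbuf->mem = res->start;
	return 1;
}

/* The branch lives in the caller instead of inside the weak function. */
int kexec_locate_mem_hole_model(struct kexec_buf *kbuf, bool use_memblock)
{
	int ret;

	if (use_memblock)
		ret = kexec_walk_memblock(kbuf, locate_mem_hole_callback);
	else
		ret = arch_kexec_walk_mem(kbuf, locate_mem_hole_callback);

	return ret == 1 ? 0 : -EADDRNOTAVAIL;
}
```

The point of this arrangement is that the weak hook stays a plain
override point, while the choice between memblock and the resource list
is made once in the generic caller.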


> 
> It only affects architectures with MEMBLOCK and KEXEC_FILE: powerpc, s390 and
> soon arm64. s390 keeps its behaviour because it provides arch_kexec_walk_mem(),
> and powerpc's is copied in here as its generic 'memblock describes my memory'
> stuff. The implementation would be the same on arm64, so we're doing this to
> avoid duplicating otherwise generic arch code. I think 32bit arm should be able
> to use this too if it gets KEXEC_FILE support. (32bit arms' KEXEC already
> depends on MEMBLOCK).
> 
> 
> Thanks,
> 
> James

Thanks
Dave

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v11 03/15] powerpc, kexec_file: factor out memblock-based arch_kexec_walk_mem()
  2018-07-11  7:41 ` [PATCH v11 03/15] powerpc, kexec_file: factor out memblock-based arch_kexec_walk_mem() AKASHI Takahiro
  2018-07-14  1:52   ` Dave Young
@ 2018-07-16 12:26   ` Dave Young
  2018-07-18 16:52     ` James Morse
  1 sibling, 1 reply; 38+ messages in thread
From: Dave Young @ 2018-07-16 12:26 UTC (permalink / raw)
  To: AKASHI Takahiro
  Cc: catalin.marinas, will.deacon, dhowells, vgoyal, herbert, davem,
	bhe, arnd, ard.biesheuvel, james.morse, bhsharma, kexec,
	linux-arm-kernel, linux-kernel, Eric W. Biederman

Hi Akashi,

On 07/11/18 at 04:41pm, AKASHI Takahiro wrote:
> Memblock list is another source for usable system memory layout.
> So powerpc's arch_kexec_walk_mem() is moved to kexec_file.c so that
> other memblock-based architectures, particularly arm64, can also utilise
> it. A moved function is now renamed to kexec_walk_memblock() and merged
> into the existing arch_kexec_walk_mem() for general use, either resource
> list or memblock list.
> 
> A consequent function will not work for kdump with memblock list, but
> this will be fixed in the next patch.

If this breaks something, then it would be good to fold the following
patch into this patch so that bisect can still work?

Thanks
Dave

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v11 03/15] powerpc, kexec_file: factor out memblock-based arch_kexec_walk_mem()
  2018-07-16 12:24       ` Dave Young
@ 2018-07-17  5:31         ` AKASHI Takahiro
  2018-07-17  7:49           ` Dave Young
  0 siblings, 1 reply; 38+ messages in thread
From: AKASHI Takahiro @ 2018-07-17  5:31 UTC (permalink / raw)
  To: Dave Young
  Cc: James Morse, catalin.marinas, will.deacon, dhowells, vgoyal,
	herbert, davem, bhe, arnd, ard.biesheuvel, bhsharma, kexec,
	linux-arm-kernel, linux-kernel, Eric W. Biederman

Hi Dave,

On Mon, Jul 16, 2018 at 08:24:12PM +0800, Dave Young wrote:
> On 07/16/18 at 12:04pm, James Morse wrote:
> > Hi Dave,
> > 
> > On 14/07/18 02:52, Dave Young wrote:
> > > On 07/11/18 at 04:41pm, AKASHI Takahiro wrote:
> > >> Memblock list is another source for usable system memory layout.
> > >> So powerpc's arch_kexec_walk_mem() is moved to kexec_file.c so that
> > >> other memblock-based architectures, particularly arm64, can also utilise
> > >> it. A moved function is now renamed to kexec_walk_memblock() and merged
> > >> into the existing arch_kexec_walk_mem() for general use, either resource
> > >> list or memblock list.
> > >>
> > >> A consequent function will not work for kdump with memblock list, but
> > >> this will be fixed in the next patch.
> > 
> > >> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> > 
> > >> @@ -513,6 +563,10 @@ static int locate_mem_hole_callback(struct resource *res, void *arg)
> > >>  int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf,
> > >>  			       int (*func)(struct resource *, void *))
> > >>  {
> > >> +	if (IS_ENABLED(CONFIG_HAVE_MEMBLOCK) &&
> > >> +			!IS_ENABLED(CONFIG_ARCH_DISCARD_MEMBLOCK))
> > >> +		return kexec_walk_memblock(kbuf, func);
> > > 
> > > AKASHI, I'm not sure if this works on all arches, for example I checked
> > > the .config on my Nokia N900 kernel tree, there is HAVE_MEMBLOCK=y and
> > > no CONFIG_ARCH_DISCARD_MEMBLOCK, in 32bit arm code no arch_kexec_walk_mem()
> > By doesn't work you mean it's a change in behaviour?
> > I think this is fine because 32bit arm doesn't support KEXEC_FILE, (this file is
> > kexec_file specific right?).
> 
> Ah, replied on a train, I forgot this is only for kexec_file, sorry
> about that.  Please ignore the comment.
> 
> But since we have a weak function arch_kexec_walk_mem, adding another
> condition branch within this weak function looks not good.
> Something like below would be better:

I see your concern here, but


> int kexec_locate_mem_hole(struct kexec_buf *kbuf)
> {
>         int ret;
> 
> 	+ if use memblock
> 	+	ret = kexec_walk_memblock()
> 	+ else
>         	ret = arch_kexec_walk_mem(kbuf, locate_mem_hole_callback);
> 
>         return ret == 1 ? 0 : -EADDRNOTAVAIL;
> }

what if yet another architecture comes to kexec_file and wants to
take a third approach? How can it override those functions?
Depending on kernel configuration, it might re-define either
kexec_walk_memblock() or arch_kexec_walk_mem(). It sounds weird to me.

Thanks,
-Takahiro AKASHI

> 
> > 
> > It only affects architectures with MEMBLOCK and KEXEC_FILE: powerpc, s390 and
> > soon arm64. s390 keeps its behaviour because it provides arch_kexec_walk_mem(),
> > and powerpc's is copied in here as its generic 'memblock describes my memory'
> > stuff. The implementation would be the same on arm64, so we're doing this to
> > avoid duplicating otherwise generic arch code. I think 32bit arm should be able
> > to use this too if it gets KEXEC_FILE support. (32bit arms' KEXEC already
> > depends on MEMBLOCK).
> > 
> > 
> > Thanks,
> > 
> > James
> 
> Thanks
> Dave

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v11 03/15] powerpc, kexec_file: factor out memblock-based arch_kexec_walk_mem()
  2018-07-17  5:31         ` AKASHI Takahiro
@ 2018-07-17  7:49           ` Dave Young
  2018-07-18  5:38             ` AKASHI Takahiro
  0 siblings, 1 reply; 38+ messages in thread
From: Dave Young @ 2018-07-17  7:49 UTC (permalink / raw)
  To: AKASHI Takahiro, James Morse, catalin.marinas, will.deacon,
	dhowells, vgoyal, herbert, davem, bhe, arnd, ard.biesheuvel,
	bhsharma, kexec, linux-arm-kernel, linux-kernel,
	Eric W. Biederman

Hi AKASHI,
On 07/17/18 at 02:31pm, AKASHI Takahiro wrote:
> Hi Dave,
> 
> On Mon, Jul 16, 2018 at 08:24:12PM +0800, Dave Young wrote:
> > On 07/16/18 at 12:04pm, James Morse wrote:
> > > Hi Dave,
> > > 
> > > On 14/07/18 02:52, Dave Young wrote:
> > > > On 07/11/18 at 04:41pm, AKASHI Takahiro wrote:
> > > >> Memblock list is another source for usable system memory layout.
> > > >> So powerpc's arch_kexec_walk_mem() is moved to kexec_file.c so that
> > > >> other memblock-based architectures, particularly arm64, can also utilise
> > > >> it. A moved function is now renamed to kexec_walk_memblock() and merged
> > > >> into the existing arch_kexec_walk_mem() for general use, either resource
> > > >> list or memblock list.
> > > >>
> > > >> A consequent function will not work for kdump with memblock list, but
> > > >> this will be fixed in the next patch.
> > > 
> > > >> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> > > 
> > > >> @@ -513,6 +563,10 @@ static int locate_mem_hole_callback(struct resource *res, void *arg)
> > > >>  int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf,
> > > >>  			       int (*func)(struct resource *, void *))
> > > >>  {
> > > >> +	if (IS_ENABLED(CONFIG_HAVE_MEMBLOCK) &&
> > > >> +			!IS_ENABLED(CONFIG_ARCH_DISCARD_MEMBLOCK))
> > > >> +		return kexec_walk_memblock(kbuf, func);
> > > > 
> > > > AKASHI, I'm not sure if this works on all arches, for example I checked
> > > > the .config on my Nokia N900 kernel tree, there is HAVE_MEMBLOCK=y and
> > > > no CONFIG_ARCH_DISCARD_MEMBLOCK, in 32bit arm code no arch_kexec_walk_mem()
> > > By doesn't work you mean it's a change in behaviour?
> > > I think this is fine because 32bit arm doesn't support KEXEC_FILE, (this file is
> > > kexec_file specific right?).
> > 
> > Ah, replied on a train, I forgot this is only for kexec_file, sorry
> > about that.  Please ignore the comment.
> > 
> > But since we have a weak function arch_kexec_walk_mem, adding another
> > condition branch within this weak function looks not good.
> > Something like below would be better:
> 
> I see your concern here, but
> 
> 
> > int kexec_locate_mem_hole(struct kexec_buf *kbuf)
> > {
> >         int ret;
> > 
> > 	+ if use memblock
> > 	+	ret = kexec_walk_memblock()
> > 	+ else
> >         	ret = arch_kexec_walk_mem(kbuf, locate_mem_hole_callback);
> > 
> >         return ret == 1 ? 0 : -EADDRNOTAVAIL;
> > }
> 
> what if yet another architecture comes to kexec_file and wanna
> take a third approach? How can it override those functions?
> Depending on kernel configuration, it might re-define either
> kexec_walk_memblock() or arch_kexec_walk_mem(). It sounds weird to me.

I also feel this is weird, but it is slightly better because currently no
user needs another override, and I don't expect one to be needed in
the future for the memblock use.

Rethinking this issue, we can just remove the weak function and
use a general function.

Currently, with your patch applied, only s390 uses arch_kexec_walk_mem,
like below:
/*
 * The kernel is loaded to a fixed location. Turn off kexec_locate_mem_hole
 * and provide kbuf->mem by hand.
 */
int arch_kexec_walk_mem(struct kexec_buf *kbuf,
                        int (*func)(struct resource *, void *))
{
        return 1;
}

AFAIK, all other users initialize kbuf->mem as NULL, so we can check
kbuf->mem in kexec_locate_mem_hole():

if (kbuf->mem)
	return 0;

if use memblock
	kexec_walk_memblock
else
	kexec_walk_mem
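
As a rough user-space sketch of the early-return idea (hypothetical
names and values throughout, not kernel code): a caller that has already
placed the buffer by hand sets kbuf->mem, and the search is skipped.
Treating mem == 0 as "not set" is an assumption of this sketch, taken
from the remark above that other users initialize it as NULL.

```c
#include <errno.h>
#include <stddef.h>

/* Minimal stand-in for the kernel type (assumed layout). */
struct kexec_buf { unsigned long mem; unsigned long memsz; };

/* Counts how often the search actually runs; for demonstration only. */
int walk_calls;

/* Stub for "walk memblock or the resource list": always finds a hole. */
static int walk_free_memory(struct kexec_buf *kbuf)
{
	walk_calls++;
	kbuf->mem = 0x80000000UL;	/* hypothetical free address */
	return 1;
}

int locate_mem_hole_preset_model(struct kexec_buf *kbuf)
{
	int ret;

	/* Address already provided by hand (the s390 case): nothing to do. */
	if (kbuf->mem)
		return 0;

	ret = walk_free_memory(kbuf);

	return ret == 1 ? 0 : -EADDRNOTAVAIL;
}
```

With this shape, an architecture that loads at a fixed location no
longer needs to override a walker just to return 1.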

> 
> Thanks,
> -Takahiro AKASHI
> 
> > 
> > > 
> > > It only affects architectures with MEMBLOCK and KEXEC_FILE: powerpc, s390 and
> > > soon arm64. s390 keeps its behaviour because it provides arch_kexec_walk_mem(),
> > > and powerpc's is copied in here as its generic 'memblock describes my memory'
> > > stuff. The implementation would be the same on arm64, so we're doing this to
> > > avoid duplicating otherwise generic arch code. I think 32bit arm should be able
> > > to use this too if it gets KEXEC_FILE support. (32bit arms' KEXEC already
> > > depends on MEMBLOCK).
> > > 
> > > 
> > > Thanks,
> > > 
> > > James
> > 
> > Thanks
> > Dave

Thanks
Dave

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v11 09/15] arm64: kexec_file: load initrd and device-tree
  2018-07-11  7:41 ` [PATCH v11 09/15] arm64: kexec_file: load initrd and device-tree AKASHI Takahiro
@ 2018-07-17 16:57   ` James Morse
  2018-07-18  5:56     ` AKASHI Takahiro
  0 siblings, 1 reply; 38+ messages in thread
From: James Morse @ 2018-07-17 16:57 UTC (permalink / raw)
  To: AKASHI Takahiro
  Cc: catalin.marinas, will.deacon, dhowells, vgoyal, herbert, davem,
	dyoung, bhe, arnd, ard.biesheuvel, bhsharma, kexec,
	linux-arm-kernel, linux-kernel

Hi Akashi,

On 11/07/18 08:41, AKASHI Takahiro wrote:
> load_other_segments() is expected to allocate and place all the necessary
> memory segments other than kernel, including initrd and device-tree
> blob (and elf core header for crash).
> While most of the code was borrowed from kexec-tools' counterpart,
> users may not be allowed to specify dtb explicitly, instead, the dtb
> presented by the original boot loader is reused.
> 
> arch_kimage_kernel_post_load_cleanup() is responsible for freeing arm64-
> specific data allocated in load_other_segments().

> diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
> index c38a8048ed00..ca00681c25c6 100644
> --- a/arch/arm64/kernel/machine_kexec_file.c
> +++ b/arch/arm64/kernel/machine_kexec_file.c

> +int arch_kimage_file_post_load_cleanup(struct kimage *image)
> +{
> +	vfree(image->arch.dtb_buf);
> +	image->arch.dtb_buf = NULL;
> +
> +	return kexec_image_post_load_cleanup_default(image);
> +}

A nit from sparse:
| warning: symbol 'arch_kimage_file_post_load_cleanup' was not declared

Can we add a declaration for this to a header file somewhere? asm/kexec.h is
probably the best bet.


> +static int setup_dtb(struct kimage *image,
> +		unsigned long initrd_load_addr, unsigned long initrd_len,
> +		char *cmdline, unsigned long cmdline_len,
> +		char **dtb_buf, size_t *dtb_buf_len)
> +{

> +	/* add initrd-* */
> +	if (initrd_load_addr) {
> +		value = cpu_to_fdt64(initrd_load_addr);
> +		ret = fdt_setprop_u64(buf, nodeoffset, "linux,initrd-start",
> +							value);

fdt_setprop_u64() already does the endian conversion.

From scripts/dtc/libfdt/libfdt.h, it's implemented as:
| 	fdt64_t tmp = cpu_to_fdt64(val);
| 	return fdt_setprop(fdt, nodeoffset, name, &tmp, sizeof(tmp));

(I think you were using setprop directly in an older version)


This leads to:
| ------------[ cut here ]------------
| initrd not fully accessible via the linear mapping -- please check your
| bootloader ...
| WARNING: CPU: 0 PID: 0 at ../arch/arm64/mm/init.c:429
| arm64_memblock_init+0x150/0x3d8
| Modules linked in:
| CPU: 0 PID: 0 Comm: swapper Not tainted 4.18.0-rc5-00015-g95b5c843d0da #10150
| Hardware name: AMD Seattle (Rev.B0) Development Board (Overdrive) (DT)
| pstate: 60000085 (nZCv daIf -PAN -UAO)
| pc : arm64_memblock_init+0x150/0x3d8
| lr : arm64_memblock_init+0x150/0x3d8

| Call trace:
| arm64_memblock_init+0x150/0x3d8
| setup_arch+0x1c0/0x510
| start_kernel+0x80/0x418
| random: get_random_bytes called from print_oops_end_marker+0x4c/0x68 with
| crng_init=0
| ---[ end trace 0000000000000000 ]---


Which is caused by the values being miles outside ram due to the extra byte
swapping:
| morse@frikadeller:~$ sudo dtc -I dtb -O dts /sys/firmware/fdt  | grep initrd
|                 linux,initrd-end = <0x900b6c05 0x80000000>;
|                 linux,initrd-start = <0x906a04 0x80000000>;
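
To make the double conversion concrete, here is a minimal user-space sketch,
assuming a little-endian CPU. The sketch_* functions are hypothetical stand-ins
and not the libfdt API, and the address 0x80046a9000 is inferred by
byte-swapping the initrd-start value in the dtc output; it is not stated in the
thread.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for cpu_to_fdt64() on a little-endian CPU:
 * FDT cells are big-endian, so the conversion is a 64-bit byte swap.
 */
static uint64_t sketch_cpu_to_fdt64(uint64_t v)
{
	uint64_t r = 0;
	int i;

	for (i = 0; i < 8; i++) {
		r = (r << 8) | (v & 0xff);
		v >>= 8;
	}
	return r;
}

/*
 * Hypothetical model of fdt_setprop_u64() followed by a read-back:
 * the helper converts once on store, and the reader converts once on
 * load (the byte swap is its own inverse).
 */
static uint64_t sketch_setprop_u64_then_read(uint64_t val)
{
	uint64_t stored = sketch_cpu_to_fdt64(val);

	return sketch_cpu_to_fdt64(stored);
}

static void sketch_demo_endian(void)
{
	const uint64_t addr = 0x80046a9000ULL;	/* assumed initrd address */

	/* CPU-order value passed in: the reader gets the address back. */
	assert(sketch_setprop_u64_then_read(addr) == addr);

	/*
	 * Value converted by the caller as well: the reader sees the
	 * swapped value, matching <0x906a04 0x80000000> in the dump.
	 */
	assert(sketch_setprop_u64_then_read(sketch_cpu_to_fdt64(addr)) ==
	       0x00906a0480000000ULL);
}
```

Under these assumptions, dropping the caller-side conversion is enough for the
reader to see the intended address again.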


With the two extra cpu_to_fdt64() calls removed:
Reviewed-by: James Morse <james.morse@arm.com>


Thanks,

James


* Re: [PATCH v11 03/15] powerpc, kexec_file: factor out memblock-based arch_kexec_walk_mem()
  2018-07-17  7:49           ` Dave Young
@ 2018-07-18  5:38             ` AKASHI Takahiro
  2018-07-18  6:13               ` Dave Young
  0 siblings, 1 reply; 38+ messages in thread
From: AKASHI Takahiro @ 2018-07-18  5:38 UTC (permalink / raw)
  To: Dave Young
  Cc: James Morse, catalin.marinas, will.deacon, dhowells, vgoyal,
	herbert, davem, bhe, arnd, ard.biesheuvel, bhsharma, kexec,
	linux-arm-kernel, linux-kernel, Eric W. Biederman

Dave,

On Tue, Jul 17, 2018 at 03:49:23PM +0800, Dave Young wrote:
> Hi AKASHI,
> On 07/17/18 at 02:31pm, AKASHI Takahiro wrote:
> > Hi Dave,
> > 
> > On Mon, Jul 16, 2018 at 08:24:12PM +0800, Dave Young wrote:
> > > On 07/16/18 at 12:04pm, James Morse wrote:
> > > > Hi Dave,
> > > > 
> > > > On 14/07/18 02:52, Dave Young wrote:
> > > > > On 07/11/18 at 04:41pm, AKASHI Takahiro wrote:
> > > > >> Memblock list is another source for usable system memory layout.
> > > > >> So powerpc's arch_kexec_walk_mem() is moved to kexec_file.c so that
> > > > >> other memblock-based architectures, particularly arm64, can also utilise
> > > > >> it. A moved function is now renamed to kexec_walk_memblock() and merged
> > > > >> into the existing arch_kexec_walk_mem() for general use, either resource
> > > > >> list or memblock list.
> > > > >>
> > > > >> A consequent function will not work for kdump with memblock list, but
> > > > >> this will be fixed in the next patch.
> > > > 
> > > > >> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> > > > 
> > > > >> @@ -513,6 +563,10 @@ static int locate_mem_hole_callback(struct resource *res, void *arg)
> > > > >>  int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf,
> > > > >>  			       int (*func)(struct resource *, void *))
> > > > >>  {
> > > > >> +	if (IS_ENABLED(CONFIG_HAVE_MEMBLOCK) &&
> > > > >> +			!IS_ENABLED(CONFIG_ARCH_DISCARD_MEMBLOCK))
> > > > >> +		return kexec_walk_memblock(kbuf, func);
> > > > > 
> > > > > AKASHI, I'm not sure if this works on all arches, for example I checked
> > > > > the .config on my Nokia N900 kernel tree, there is HAVE_MEMBLOCK=y and
> > > > > no CONFIG_ARCH_DISCARD_MEMBLOCK, in 32bit arm code no arch_kexec_walk_mem()
> > > > By doesn't work you mean it's a change in behaviour?
> > > > I think this is fine because 32bit arm doesn't support KEXEC_FILE, (this file is
> > > > kexec_file specific right?).
> > > 
> > > Ah, replied on a train, I forgot this is only for kexec_file, sorry
> > > about that.  Please ignore the comment.
> > > 
> > > But since we have a weak function arch_kexec_walk_mem, adding another
> > > condition branch within this weak function looks not good.
> > > Something like below would be better:
> > 
> > I see your concern here, but
> > 
> > 
> > > int kexec_locate_mem_hole(struct kexec_buf *kbuf)
> > > {
> > >         int ret;
> > > 
> > > 	+ if use memblock
> > > 	+	ret = kexec_walk_memblock()
> > > 	+ else
> > >         	ret = arch_kexec_walk_mem(kbuf, locate_mem_hole_callback);
> > > 
> > >         return ret == 1 ? 0 : -EADDRNOTAVAIL;
> > > }
> > 
> > what if yet another architecture comes to kexec_file and wanna
> > take a third approach? How can it override those functions?
> > Depending on kernel configuration, it might re-define either
> > kexec_walk_memblock() or arch_kexec_walk_mem(). It sounds weird to me.
> 
> I also feel this weird, but it is slightly better because currently no
> user need another overriding requirement, and I feel it is not expected to have in
> the future for the memblock use.
> 
> Rethinking about this issue, we can just remove the weak function and
> just use general function.

Do you really want to remove "weak" attribute?

> Currently with your patch applied only s390 use arch_kexec_walk_mem like
> below:
> /*
>  * The kernel is loaded to a fixed location. Turn off kexec_locate_mem_hole
>  * and provide kbuf->mem by hand.
>  */
> int arch_kexec_walk_mem(struct kexec_buf *kbuf,
>                         int (*func)(struct resource *, void *))
> {
>         return 1;
> }
> 
> AFAIK, all other users initialize kbuf->mem as NULL, so we can check

As a matter of fact, nobody initializes kbuf->mem before calling
kexec_add_buffer (in turn, kexec_locate_mem_hole()).

> kbuf->mem in int kexec_locate_mem_hole:
> 
> if (kbuf->mem)
> 	return 0;
> 
> if use memblock
> 	kexec_walk_memblock
> else
> 	kexec_walk_mem

I think that your solution will work for existing architectures
with appropriate patches, but to take your approach, as I said above,
we will have to modify every call site on all kexec_file-capable architectures.

If this is what you expect, I will work on it, but I don't think
that it would be a better idea.

Thanks,
-Takahiro AKASHI

> > 
> > Thanks,
> > -Takahiro AKASHI
> > 
> > > 
> > > > 
> > > > It only affects architectures with MEMBLOCK and KEXEC_FILE: powerpc, s390 and
> > > > soon arm64. s390 keeps its behaviour because it provides arch_kexec_walk_mem(),
> > > > and powerpc's is copied in here as its generic 'memblock describes my memory'
> > > > stuff. The implementation would be the same on arm64, so we're doing this to
> > > > avoid duplicating otherwise generic arch code. I think 32bit arm should be able
> > > > to use this too if it gets KEXEC_FILE support. (32bit arms' KEXEC already
> > > > depends on MEMBLOCK).
> > > > 
> > > > 
> > > > Thanks,
> > > > 
> > > > James
> > > 
> > > Thanks
> > > Dave
> 
> Thanks
> Dave


* Re: [PATCH v11 09/15] arm64: kexec_file: load initrd and device-tree
  2018-07-17 16:57   ` James Morse
@ 2018-07-18  5:56     ` AKASHI Takahiro
  0 siblings, 0 replies; 38+ messages in thread
From: AKASHI Takahiro @ 2018-07-18  5:56 UTC (permalink / raw)
  To: James Morse
  Cc: catalin.marinas, will.deacon, dhowells, vgoyal, herbert, davem,
	dyoung, bhe, arnd, ard.biesheuvel, bhsharma, kexec,
	linux-arm-kernel, linux-kernel

James,

On Tue, Jul 17, 2018 at 05:57:06PM +0100, James Morse wrote:
> Hi Akashi,
> 
> On 11/07/18 08:41, AKASHI Takahiro wrote:
> > load_other_segments() is expected to allocate and place all the necessary
> > memory segments other than kernel, including initrd and device-tree
> > blob (and elf core header for crash).
> > While most of the code was borrowed from kexec-tools' counterpart,
> > users may not be allowed to specify dtb explicitly, instead, the dtb
> > presented by the original boot loader is reused.
> > 
> > arch_kimage_kernel_post_load_cleanup() is responsible for freeing arm64-
> > specific data allocated in load_other_segments().
> 
> > diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
> > index c38a8048ed00..ca00681c25c6 100644
> > --- a/arch/arm64/kernel/machine_kexec_file.c
> > +++ b/arch/arm64/kernel/machine_kexec_file.c
> 
> > +int arch_kimage_file_post_load_cleanup(struct kimage *image)
> > +{
> > +	vfree(image->arch.dtb_buf);
> > +	image->arch.dtb_buf = NULL;
> > +
> > +	return kexec_image_post_load_cleanup_default(image);
> > +}
> 
> A nit from sparse:
> | warning: symbol 'arch_kimage_file_post_load_cleanup' was not declared
> 
> Can we add a definition for this to a header file somewhere. asm/kexec.h is
> probably the best bet.

Sparse! Ok, I will fix it.

> > +static int setup_dtb(struct kimage *image,
> > +		unsigned long initrd_load_addr, unsigned long initrd_len,
> > +		char *cmdline, unsigned long cmdline_len,
> > +		char **dtb_buf, size_t *dtb_buf_len)
> > +{
> 
> > +	/* add initrd-* */
> > +	if (initrd_load_addr) {
> > +		value = cpu_to_fdt64(initrd_load_addr);
> > +		ret = fdt_setprop_u64(buf, nodeoffset, "linux,initrd-start",
> > +							value);
> 
> fdt_setprop_u64() already does the endian conversion.
> 
> From scripts/dtc/libfdt/libfdt.h, its implemented as:
> | 	fdt64_t tmp = cpu_to_fdt64(val);
> | 	return fdt_setprop(fdt, nodeoffset, name, &tmp, sizeof(tmp));
> 
> (I think you were using setprop directly in an older version)

Indeed.

> This leads to:
> | ------------[ cut here ]------------
> | initrd not fully accessible via the linear mapping -- please check your
> | bootloader ...
> | WARNING: CPU: 0 PID: 0 at ../arch/arm64/mm/init.c:429
> | arm64_memblock_init+0x150/0x3d8
> | Modules linked in:
> | CPU: 0 PID: 0 Comm: swapper Not tainted 4.18.0-rc5-00015-g95b5c843d0da #10150
> | Hardware name: AMD Seattle (Rev.B0) Development Board (Overdrive) (DT)
> | pstate: 60000085 (nZCv daIf -PAN -UAO)
> | pc : arm64_memblock_init+0x150/0x3d8
> | lr : arm64_memblock_init+0x150/0x3d8
> 
> | Call trace:
> | arm64_memblock_init+0x150/0x3d8
> | setup_arch+0x1c0/0x510
> | start_kernel+0x80/0x418
> | random: get_random_bytes called from print_oops_end_marker+0x4c/0x68 with
> | crng_init=0
> | ---[ end trace 0000000000000000 ]---
> 
> 
> Which is caused by the values being miles outside ram due to the extra byte
> swapping:

So it is in little endian.

> | morse@frikadeller:~$ sudo dtc -I dtb -O dts /sys/firmware/fdt  | grep initrd
> |                 linux,initrd-end = <0x900b6c05 0x80000000>;
> |                 linux,initrd-start = <0x906a04 0x80000000>;
> 
> 
> With the two extra cpu_to_fdt64() calls removed:
> Reviewed-by: James Morse <james.morse@arm.com>

Thank you for your review.

-Takahiro AKASHI

> 
> Thanks,
> 
> James


* Re: [PATCH v11 03/15] powerpc, kexec_file: factor out memblock-based arch_kexec_walk_mem()
  2018-07-18  5:38             ` AKASHI Takahiro
@ 2018-07-18  6:13               ` Dave Young
  2018-07-18  6:40                 ` AKASHI Takahiro
  0 siblings, 1 reply; 38+ messages in thread
From: Dave Young @ 2018-07-18  6:13 UTC (permalink / raw)
  To: AKASHI Takahiro, James Morse, catalin.marinas, will.deacon,
	dhowells, vgoyal, herbert, davem, bhe, arnd, ard.biesheuvel,
	bhsharma, kexec, linux-arm-kernel, linux-kernel,
	Eric W. Biederman

Hi AKASHI,

On 07/18/18 at 02:38pm, AKASHI Takahiro wrote:
> Dave,
> 
> On Tue, Jul 17, 2018 at 03:49:23PM +0800, Dave Young wrote:
> > Hi AKASHI,
> > On 07/17/18 at 02:31pm, AKASHI Takahiro wrote:
> > > Hi Dave,
> > > 
> > > On Mon, Jul 16, 2018 at 08:24:12PM +0800, Dave Young wrote:
> > > > On 07/16/18 at 12:04pm, James Morse wrote:
> > > > > Hi Dave,
> > > > > 
> > > > > On 14/07/18 02:52, Dave Young wrote:
> > > > > > On 07/11/18 at 04:41pm, AKASHI Takahiro wrote:
> > > > > >> Memblock list is another source for usable system memory layout.
> > > > > >> So powerpc's arch_kexec_walk_mem() is moved to kexec_file.c so that
> > > > > >> other memblock-based architectures, particularly arm64, can also utilise
> > > > > >> it. A moved function is now renamed to kexec_walk_memblock() and merged
> > > > > >> into the existing arch_kexec_walk_mem() for general use, either resource
> > > > > >> list or memblock list.
> > > > > >>
> > > > > >> A consequent function will not work for kdump with memblock list, but
> > > > > >> this will be fixed in the next patch.
> > > > > 
> > > > > >> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> > > > > 
> > > > > >> @@ -513,6 +563,10 @@ static int locate_mem_hole_callback(struct resource *res, void *arg)
> > > > > >>  int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf,
> > > > > >>  			       int (*func)(struct resource *, void *))
> > > > > >>  {
> > > > > >> +	if (IS_ENABLED(CONFIG_HAVE_MEMBLOCK) &&
> > > > > >> +			!IS_ENABLED(CONFIG_ARCH_DISCARD_MEMBLOCK))
> > > > > >> +		return kexec_walk_memblock(kbuf, func);
> > > > > > 
> > > > > > AKASHI, I'm not sure if this works on all arches, for example I checked
> > > > > > the .config on my Nokia N900 kernel tree, there is HAVE_MEMBLOCK=y and
> > > > > > no CONFIG_ARCH_DISCARD_MEMBLOCK, in 32bit arm code no arch_kexec_walk_mem()
> > > > > By doesn't work you mean it's a change in behaviour?
> > > > > I think this is fine because 32bit arm doesn't support KEXEC_FILE, (this file is
> > > > > kexec_file specific right?).
> > > > 
> > > > Ah, replied on a train, I forgot this is only for kexec_file, sorry
> > > > about that.  Please ignore the comment.
> > > > 
> > > > But since we have a weak function arch_kexec_walk_mem, adding another
> > > > condition branch within this weak function looks not good.
> > > > Something like below would be better:
> > > 
> > > I see your concern here, but
> > > 
> > > 
> > > > int kexec_locate_mem_hole(struct kexec_buf *kbuf)
> > > > {
> > > >         int ret;
> > > > 
> > > > 	+ if use memblock
> > > > 	+	ret = kexec_walk_memblock()
> > > > 	+ else
> > > >         	ret = arch_kexec_walk_mem(kbuf, locate_mem_hole_callback);
> > > > 
> > > >         return ret == 1 ? 0 : -EADDRNOTAVAIL;
> > > > }
> > > 
> > > what if yet another architecture comes to kexec_file and wanna
> > > take a third approach? How can it override those functions?
> > > Depending on kernel configuration, it might re-define either
> > > kexec_walk_memblock() or arch_kexec_walk_mem(). It sounds weird to me.
> > 
> > I also feel this weird, but it is slightly better because currently no
> > user need another overriding requirement, and I feel it is not expected to have in
> > the future for the memblock use.
> > 
> > Rethinking about this issue, we can just remove the weak function and
> > just use general function.
> 
> Do you really want to remove "weak" attribute?
> 
> > Currently with your patch applied only s390 use arch_kexec_walk_mem like
> > below:
> > /*
> >  * The kernel is loaded to a fixed location. Turn off kexec_locate_mem_hole
> >  * and provide kbuf->mem by hand.
> >  */
> > int arch_kexec_walk_mem(struct kexec_buf *kbuf,
> >                         int (*func)(struct resource *, void *))
> > {
> >         return 1;
> > }
> > 
> > AFAIK, all other users initialize kbuf->mem as NULL, so we can check
> 
> As a matter of fact, nobody initializes kbuf->mem before calling
> kexec_add_buffer (in turn, kexec_locate_mem_hole()).

Not sure we understand each other.
Let's take an example from arch/x86/kernel/kexec-bzimage64.c:
bzImage64_load():
	struct kexec_buf kbuf = { .image = image, .buf_max = ULONG_MAX,
				.top_down = true };

Apart from the three fields above, all other members, including kbuf->mem,
are zero-initialized by the compiler.
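
The zero-initialization point can be checked with a small user-space sketch.
struct sketch_kexec_buf is a reduced, hypothetical stand-in for struct
kexec_buf (only the fields named in this thread), so treat it as an
illustration of the C initializer rule rather than of the kernel structure.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Reduced, hypothetical stand-in for struct kexec_buf. */
struct sketch_kexec_buf {
	void *image;
	unsigned long mem;
	unsigned long buf_max;
	bool top_down;
};

/* Same initializer shape as the bzImage64_load() example above. */
static struct sketch_kexec_buf sketch_make_kbuf(void *image)
{
	struct sketch_kexec_buf kbuf = { .image = image, .buf_max = ULONG_MAX,
					 .top_down = true };

	return kbuf;
}

static void sketch_demo_init(void)
{
	struct sketch_kexec_buf kbuf = sketch_make_kbuf(NULL);

	/* mem is not named in the initializer, so it starts out as zero. */
	assert(kbuf.mem == 0);
	assert(kbuf.buf_max == ULONG_MAX && kbuf.top_down);
}
```

Note that this only holds when an initializer is present: a local struct that
is declared without one and then filled in field by field leaves the unnamed
members indeterminate.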

> 
> > kbuf->mem in int kexec_locate_mem_hole:
> > 
> > if (kbuf->mem)
> > 	return 0;
> > 
> > if use memblock
> > 	kexec_walk_memblock
> > else
> > 	kexec_walk_mem

kexec_walk_resource would be a better name than kexec_walk_mem.
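
As a rough illustration of the flow proposed here, a user-space sketch might
look like the following. Everything prefixed sketch_ is hypothetical: the flag
stands in for the "if use memblock" check, and the two walkers are stubs that
just pick a fixed address and return 1 for "hole found", following the
return convention of kexec_locate_mem_hole() quoted earlier in the thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Reduced, hypothetical stand-in for struct kexec_buf. */
struct sketch_kbuf {
	unsigned long mem;
};

/* Stand-in for the "if use memblock" configuration check. */
static bool sketch_use_memblock;

/* Stub walkers: each pretends to find a hole and returns 1. */
static int sketch_walk_memblock(struct sketch_kbuf *kbuf)
{
	kbuf->mem = 0x1000;
	return 1;
}

static int sketch_walk_resource(struct sketch_kbuf *kbuf)
{
	kbuf->mem = 0x2000;
	return 1;
}

static int sketch_locate_mem_hole(struct sketch_kbuf *kbuf)
{
	int ret;

	/* The caller already chose a fixed address (the s390 case). */
	if (kbuf->mem)
		return 0;

	if (sketch_use_memblock)
		ret = sketch_walk_memblock(kbuf);
	else
		ret = sketch_walk_resource(kbuf);

	return ret == 1 ? 0 : -EADDRNOTAVAIL;
}

static void sketch_demo_locate(void)
{
	struct sketch_kbuf fixed = { .mem = 0x80000 };
	struct sketch_kbuf search = { .mem = 0 };

	/* A preset address is kept and no walker runs. */
	assert(sketch_locate_mem_hole(&fixed) == 0 && fixed.mem == 0x80000);

	sketch_use_memblock = true;
	assert(sketch_locate_mem_hole(&search) == 0 && search.mem == 0x1000);
}
```

The design relies on every caller leaving kbuf->mem at zero unless it really
wants a fixed address, which is why the initialization question matters.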

> 
> I think that your solution will work for existing architectures
> with appropriate patches, but to take your approach, as I said above,
> we will have to modify every call site on all kexec_file-capable architectures.
> 
> If this is what you expect, I will work on it, but I don't think
> that it would be a better idea.
> 
> Thanks,
> -Takahiro AKASHI
> 
> > > 
> > > Thanks,
> > > -Takahiro AKASHI
> > > 
> > > > 
> > > > > 
> > > > > It only affects architectures with MEMBLOCK and KEXEC_FILE: powerpc, s390 and
> > > > > soon arm64. s390 keeps its behaviour because it provides arch_kexec_walk_mem(),
> > > > > and powerpc's is copied in here as its generic 'memblock describes my memory'
> > > > > stuff. The implementation would be the same on arm64, so we're doing this to
> > > > > avoid duplicating otherwise generic arch code. I think 32bit arm should be able
> > > > > to use this too if it gets KEXEC_FILE support. (32bit arms' KEXEC already
> > > > > depends on MEMBLOCK).
> > > > > 
> > > > > 
> > > > > Thanks,
> > > > > 
> > > > > James
> > > > 
> > > > Thanks
> > > > Dave
> > 
> > Thanks
> > Dave

Thanks
dave


* Re: [PATCH v11 03/15] powerpc, kexec_file: factor out memblock-based arch_kexec_walk_mem()
  2018-07-18  6:13               ` Dave Young
@ 2018-07-18  6:40                 ` AKASHI Takahiro
  2018-07-18  6:45                   ` Dave Young
  0 siblings, 1 reply; 38+ messages in thread
From: AKASHI Takahiro @ 2018-07-18  6:40 UTC (permalink / raw)
  To: Dave Young
  Cc: James Morse, catalin.marinas, will.deacon, dhowells, vgoyal,
	herbert, davem, bhe, arnd, ard.biesheuvel, bhsharma, kexec,
	linux-arm-kernel, linux-kernel, Eric W. Biederman

On Wed, Jul 18, 2018 at 02:13:50PM +0800, Dave Young wrote:
> Hi AKASHI,
> 
> On 07/18/18 at 02:38pm, AKASHI Takahiro wrote:
> > Dave,
> > 
> > On Tue, Jul 17, 2018 at 03:49:23PM +0800, Dave Young wrote:
> > > Hi AKASHI,
> > > On 07/17/18 at 02:31pm, AKASHI Takahiro wrote:
> > > > Hi Dave,
> > > > 
> > > > On Mon, Jul 16, 2018 at 08:24:12PM +0800, Dave Young wrote:
> > > > > On 07/16/18 at 12:04pm, James Morse wrote:
> > > > > > Hi Dave,
> > > > > > 
> > > > > > On 14/07/18 02:52, Dave Young wrote:
> > > > > > > On 07/11/18 at 04:41pm, AKASHI Takahiro wrote:
> > > > > > >> Memblock list is another source for usable system memory layout.
> > > > > > >> So powerpc's arch_kexec_walk_mem() is moved to kexec_file.c so that
> > > > > > >> other memblock-based architectures, particularly arm64, can also utilise
> > > > > > >> it. A moved function is now renamed to kexec_walk_memblock() and merged
> > > > > > >> into the existing arch_kexec_walk_mem() for general use, either resource
> > > > > > >> list or memblock list.
> > > > > > >>
> > > > > > >> A consequent function will not work for kdump with memblock list, but
> > > > > > >> this will be fixed in the next patch.
> > > > > > 
> > > > > > >> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> > > > > > 
> > > > > > >> @@ -513,6 +563,10 @@ static int locate_mem_hole_callback(struct resource *res, void *arg)
> > > > > > >>  int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf,
> > > > > > >>  			       int (*func)(struct resource *, void *))
> > > > > > >>  {
> > > > > > >> +	if (IS_ENABLED(CONFIG_HAVE_MEMBLOCK) &&
> > > > > > >> +			!IS_ENABLED(CONFIG_ARCH_DISCARD_MEMBLOCK))
> > > > > > >> +		return kexec_walk_memblock(kbuf, func);
> > > > > > > 
> > > > > > > AKASHI, I'm not sure if this works on all arches, for example I checked
> > > > > > > the .config on my Nokia N900 kernel tree, there is HAVE_MEMBLOCK=y and
> > > > > > > no CONFIG_ARCH_DISCARD_MEMBLOCK, in 32bit arm code no arch_kexec_walk_mem()
> > > > > > By doesn't work you mean it's a change in behaviour?
> > > > > > I think this is fine because 32bit arm doesn't support KEXEC_FILE, (this file is
> > > > > > kexec_file specific right?).
> > > > > 
> > > > > Ah, replied on a train, I forgot this is only for kexec_file, sorry
> > > > > about that.  Please ignore the comment.
> > > > > 
> > > > > But since we have a weak function arch_kexec_walk_mem, adding another
> > > > > condition branch within this weak function looks not good.
> > > > > Something like below would be better:
> > > > 
> > > > I see your concern here, but
> > > > 
> > > > 
> > > > > int kexec_locate_mem_hole(struct kexec_buf *kbuf)
> > > > > {
> > > > >         int ret;
> > > > > 
> > > > > 	+ if use memblock
> > > > > 	+	ret = kexec_walk_memblock()
> > > > > 	+ else
> > > > >         	ret = arch_kexec_walk_mem(kbuf, locate_mem_hole_callback);
> > > > > 
> > > > >         return ret == 1 ? 0 : -EADDRNOTAVAIL;
> > > > > }
> > > > 
> > > > what if yet another architecture comes to kexec_file and wanna
> > > > take a third approach? How can it override those functions?
> > > > Depending on kernel configuration, it might re-define either
> > > > kexec_walk_memblock() or arch_kexec_walk_mem(). It sounds weird to me.
> > > 
> > > I also feel this weird, but it is slightly better because currently no
> > > user need another overriding requirement, and I feel it is not expected to have in
> > > the future for the memblock use.
> > > 
> > > Rethinking about this issue, we can just remove the weak function and
> > > just use general function.
> > 
> > Do you really want to remove "weak" attribute?
> > 
> > > Currently with your patch applied only s390 use arch_kexec_walk_mem like
> > > below:
> > > /*
> > >  * The kernel is loaded to a fixed location. Turn off kexec_locate_mem_hole
> > >  * and provide kbuf->mem by hand.
> > >  */
> > > int arch_kexec_walk_mem(struct kexec_buf *kbuf,
> > >                         int (*func)(struct resource *, void *))
> > > {
> > >         return 1;
> > > }
> > > 
> > > AFAIK, all other users initialize kbuf->mem as NULL, so we can check
> > 
> > As a matter of fact, nobody initializes kbuf->mem before calling
> > kexec_add_buffer (in turn, kexec_locate_mem_hole()).
> 
> Not sure we understand each other..
> Let's take an example in arch/x86/kernel/kexec-bzimage64.c:
> bzImage64_load() :
> 	struct kexec_buf kbuf = { .image = image, .buf_max = ULONG_MAX,
> 				.top_down = true };
> 
> Except the three fields above other members will be initialized as zero
> when compiling including the kbuf->mem

Ah, you're right.
(My arm64 patch doesn't use a struct initializer, though.)

> > 
> > > kbuf->mem in int kexec_locate_mem_hole:
> > > 
> > > if (kbuf->mem)
> > > 	return 0;
> > > 
> > > if use memblock
> > > 	kexec_walk_memblock
> > > else
> > > 	kexec_walk_mem
> 
> kexec_walk_resource will be better than kexec_walk_mem
> 
> > 
> > I think that your solution will work for existing architectures
> > with appropriate patches, but to take your approach, as I said above,
> > we will have to modify every call site on all kexec_file-capable architectures.
> > 
> > If this is what you expect, I will work on it, but I don't think
> > that it would be a better idea.

So you would expect me to modify my own arm64 code as well as s390.

-Takahiro AKASHI

> > 
> > Thanks,
> > -Takahiro AKASHI
> > 
> > > > 
> > > > Thanks,
> > > > -Takahiro AKASHI
> > > > 
> > > > > 
> > > > > > 
> > > > > > It only affects architectures with MEMBLOCK and KEXEC_FILE: powerpc, s390 and
> > > > > > soon arm64. s390 keeps its behaviour because it provides arch_kexec_walk_mem(),
> > > > > > and powerpc's is copied in here as its generic 'memblock describes my memory'
> > > > > > stuff. The implementation would be the same on arm64, so we're doing this to
> > > > > > avoid duplicating otherwise generic arch code. I think 32bit arm should be able
> > > > > > to use this too if it gets KEXEC_FILE support. (32bit arms' KEXEC already
> > > > > > depends on MEMBLOCK).
> > > > > > 
> > > > > > 
> > > > > > Thanks,
> > > > > > 
> > > > > > James
> > > > > 
> > > > > Thanks
> > > > > Dave
> > > 
> > > Thanks
> > > Dave
> 
> Thanks
> dave


* Re: [PATCH v11 03/15] powerpc, kexec_file: factor out memblock-based arch_kexec_walk_mem()
  2018-07-18  6:40                 ` AKASHI Takahiro
@ 2018-07-18  6:45                   ` Dave Young
  2018-07-20  5:33                     ` AKASHI Takahiro
  0 siblings, 1 reply; 38+ messages in thread
From: Dave Young @ 2018-07-18  6:45 UTC (permalink / raw)
  To: AKASHI Takahiro, James Morse, catalin.marinas, will.deacon,
	dhowells, vgoyal, herbert, davem, bhe, arnd, ard.biesheuvel,
	bhsharma, kexec, linux-arm-kernel, linux-kernel,
	Eric W. Biederman

On 07/18/18 at 03:40pm, AKASHI Takahiro wrote:
> On Wed, Jul 18, 2018 at 02:13:50PM +0800, Dave Young wrote:
> > Hi AKASHI,
> > 
> > On 07/18/18 at 02:38pm, AKASHI Takahiro wrote:
> > > Dave,
> > > 
> > > On Tue, Jul 17, 2018 at 03:49:23PM +0800, Dave Young wrote:
> > > > Hi AKASHI,
> > > > On 07/17/18 at 02:31pm, AKASHI Takahiro wrote:
> > > > > Hi Dave,
> > > > > 
> > > > > On Mon, Jul 16, 2018 at 08:24:12PM +0800, Dave Young wrote:
> > > > > > On 07/16/18 at 12:04pm, James Morse wrote:
> > > > > > > Hi Dave,
> > > > > > > 
> > > > > > > On 14/07/18 02:52, Dave Young wrote:
> > > > > > > > On 07/11/18 at 04:41pm, AKASHI Takahiro wrote:
> > > > > > > >> Memblock list is another source for usable system memory layout.
> > > > > > > >> So powerpc's arch_kexec_walk_mem() is moved to kexec_file.c so that
> > > > > > > >> other memblock-based architectures, particularly arm64, can also utilise
> > > > > > > >> it. A moved function is now renamed to kexec_walk_memblock() and merged
> > > > > > > >> into the existing arch_kexec_walk_mem() for general use, either resource
> > > > > > > >> list or memblock list.
> > > > > > > >>
> > > > > > > >> A consequent function will not work for kdump with memblock list, but
> > > > > > > >> this will be fixed in the next patch.
> > > > > > > 
> > > > > > > >> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> > > > > > > 
> > > > > > > >> @@ -513,6 +563,10 @@ static int locate_mem_hole_callback(struct resource *res, void *arg)
> > > > > > > >>  int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf,
> > > > > > > >>  			       int (*func)(struct resource *, void *))
> > > > > > > >>  {
> > > > > > > >> +	if (IS_ENABLED(CONFIG_HAVE_MEMBLOCK) &&
> > > > > > > >> +			!IS_ENABLED(CONFIG_ARCH_DISCARD_MEMBLOCK))
> > > > > > > >> +		return kexec_walk_memblock(kbuf, func);
> > > > > > > > 
> > > > > > > > AKASHI, I'm not sure if this works on all arches, for example I checked
> > > > > > > > the .config on my Nokia N900 kernel tree, there is HAVE_MEMBLOCK=y and
> > > > > > > > no CONFIG_ARCH_DISCARD_MEMBLOCK, in 32bit arm code no arch_kexec_walk_mem()
> > > > > > > By doesn't work you mean it's a change in behaviour?
> > > > > > > I think this is fine because 32bit arm doesn't support KEXEC_FILE, (this file is
> > > > > > > kexec_file specific right?).
> > > > > > 
> > > > > > Ah, replied on a train, I forgot this is only for kexec_file, sorry
> > > > > > about that.  Please ignore the comment.
> > > > > > 
> > > > > > But since we have a weak function arch_kexec_walk_mem, adding another
> > > > > > condition branch within this weak function looks not good.
> > > > > > Something like below would be better:
> > > > > 
> > > > > I see your concern here, but
> > > > > 
> > > > > 
> > > > > > int kexec_locate_mem_hole(struct kexec_buf *kbuf)
> > > > > > {
> > > > > >         int ret;
> > > > > > 
> > > > > > 	+ if use memblock
> > > > > > 	+	ret = kexec_walk_memblock()
> > > > > > 	+ else
> > > > > >         	ret = arch_kexec_walk_mem(kbuf, locate_mem_hole_callback);
> > > > > > 
> > > > > >         return ret == 1 ? 0 : -EADDRNOTAVAIL;
> > > > > > }
> > > > > 
> > > > > what if yet another architecture comes to kexec_file and wanna
> > > > > take a third approach? How can it override those functions?
> > > > > Depending on kernel configuration, it might re-define either
> > > > > kexec_walk_memblock() or arch_kexec_walk_mem(). It sounds weird to me.
> > > > 
> > > > I also feel this weird, but it is slightly better because currently no
> > > > user need another overriding requirement, and I feel it is not expected to have in
> > > > the future for the memblock use.
> > > > 
> > > > Rethinking about this issue, we can just remove the weak function and
> > > > just use general function.
> > > 
> > > Do you really want to remove "weak" attribute?
> > > 
> > > > Currently with your patch applied only s390 use arch_kexec_walk_mem like
> > > > below:
> > > > /*
> > > >  * The kernel is loaded to a fixed location. Turn off kexec_locate_mem_hole
> > > >  * and provide kbuf->mem by hand.
> > > >  */
> > > > int arch_kexec_walk_mem(struct kexec_buf *kbuf,
> > > >                         int (*func)(struct resource *, void *))
> > > > {
> > > >         return 1;
> > > > }
> > > > 
> > > > AFAIK, all other users initialize kbuf->mem as NULL, so we can check
> > > 
> > > As a matter of fact, nobody initializes kbuf->mem before calling
> > > kexec_add_buffer (in turn, kexec_locate_mem_hole()).
> > 
> > Not sure we understand each other..
> > Let's take an example in arch/x86/kernel/kexec-bzimage64.c:
> > bzImage64_load() :
> > 	struct kexec_buf kbuf = { .image = image, .buf_max = ULONG_MAX,
> > 				.top_down = true };
> > 
> > Except the three fields above other members will be initialized as zero
> > when compiling including the kbuf->mem
> 
> Ah, you're right.
> (My arm64 patch doesn't use a struct initializer, though.)
> 
> > > 
> > > > kbuf->mem in int kexec_locate_mem_hole:
> > > > 
> > > > if (kbuf->mem)
> > > > 	return 0;
> > > > 
> > > > if use memblock
> > > > 	kexec_walk_memblock
> > > > else
> > > > 	kexec_walk_mem
> > 
> > kexec_walk_resource will be better than kexec_walk_mem
> > 
> > > 
> > > I think that your solution will work for existing architectures
> > > with appropriate patches, but to take your approach, as I said above,
> > > we will have to modify every call site on all kexec_file-capable architectures.
> > > 
> > > If this is what you expect, I will work on it, but I don't think
> > > that it would be a better idea.
> 
> So you would expect me to modify my own arm64 code as well as s390.

Yes :)  But I have not had time to read all your patches, so I was not
aware of the struct initialization in the arm64 code and assumed only s390
needed a change.

Thanks
Dave


* Re: [PATCH v11 10/15] arm64: kexec_file: allow for loading Image-format kernel
  2018-07-11  7:41 ` [PATCH v11 10/15] arm64: kexec_file: allow for loading Image-format kernel AKASHI Takahiro
@ 2018-07-18 16:47   ` James Morse
  2018-07-20  6:14     ` AKASHI Takahiro
  0 siblings, 1 reply; 38+ messages in thread
From: James Morse @ 2018-07-18 16:47 UTC (permalink / raw)
  To: AKASHI Takahiro
  Cc: catalin.marinas, will.deacon, dhowells, vgoyal, herbert, davem,
	dyoung, bhe, arnd, ard.biesheuvel, bhsharma, kexec,
	linux-arm-kernel, linux-kernel

Hi Akashi,

On 11/07/18 08:41, AKASHI Takahiro wrote:
> This patch provides kexec_file_ops for "Image"-format kernel. In this
> implementation, a binary is always loaded with a fixed offset identified
> in text_offset field of its header.
> 
> Regarding signature verification for trusted boot, this patch doesn't
> contain CONFIG_KEXEC_VERIFY_SIG support, which is to be added later
> in this series, but file-attribute-based verification is still a viable
> option by enabling IMA security subsystem.
> 
> You can sign(label) a to-be-kexec'ed kernel image on target file system
> with:
>     $ evmctl ima_sign --key /path/to/private_key.pem Image
> 
> On live system, you must have IMA enforced with, at least, the following
> security policy:
>     "appraise func=KEXEC_KERNEL_CHECK appraise_type=imasig"
> 
> See more details about IMA here:
>     https://sourceforge.net/p/linux-ima/wiki/Home/

This looks useful for setting up keys/signature/policy for a kernel that wasn't
built to enforce signatures at compile time, so it's a good thing to have from a
single-image perspective.

I haven't managed to get IMA working to test this, but it's all done by the kexec
core code, so I don't think we're missing anything.


> diff --git a/arch/arm64/kernel/kexec_image.c b/arch/arm64/kernel/kexec_image.c
> new file mode 100644
> index 000000000000..a47cf9bc699e
> --- /dev/null
> +++ b/arch/arm64/kernel/kexec_image.c

> +static int image_probe(const char *kernel_buf, unsigned long kernel_len)
> +{
> +	const struct arm64_image_header *h;
> +
> +	h = (const struct arm64_image_header *)(kernel_buf);
> +
> +	if (!h || (kernel_len < sizeof(*h)) ||

> +			!memcmp(&h->magic, ARM64_MAGIC, sizeof(ARM64_MAGIC)))

Doesn't memcmp() return 0 if the memory regions are the same?
This would always match the correct magic, rejecting the image.

That's not what's happening, as kexec-file works, so this never matches anything.

sizeof(ARM64_MAGIC) includes the null terminator, but this sequence is output in
head.S using '.ascii' which doesn't include the terminator (otherwise it
wouldn't fit in the 4-byte magic field). The memcmp() here is also consuming the
least significant bytes of the next field.

I think this line should be:
| 			memcmp(&h->magic, ARM64_MAGIC, sizeof(h->magic)))
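
A minimal user-space sketch of the length mismatch described above. It assumes
ARM64_MAGIC is the string literal "ARM\x64"; struct fake_image_header is a
hypothetical stand-in that models only the 4-byte magic and the field after it
(not the real struct arm64_image_header), and both function names are made up
for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ARM64_MAGIC "ARM\x64"	/* assumed definition: 4 characters plus a NUL */

/* Hypothetical stand-in: only the magic and the following field. */
struct fake_image_header {
	uint8_t  magic[4];	/* emitted with .ascii, so no terminator */
	uint32_t next_field;	/* whatever follows the magic */
};

/* Compare exactly the 4 bytes of the field. Returns 1 on a match. */
static int magic_matches(const struct fake_image_header *h)
{
	assert(h != NULL);
	return memcmp(h->magic, ARM64_MAGIC, sizeof(h->magic)) == 0;
}

/*
 * Compare sizeof(ARM64_MAGIC) == 5 bytes instead: the string's NUL is
 * compared against the first byte of next_field, so a valid magic only
 * "matches" when that byte happens to be zero.
 */
static int magic_matches_buggy_len(const struct fake_image_header *h)
{
	assert(h != NULL);
	return memcmp(h, ARM64_MAGIC, sizeof(ARM64_MAGIC)) == 0;
}
```

Using sizeof(h->magic) ties the comparison length to the header field rather
than to the string literal, which is the change suggested above.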


> +static void *image_load(struct kimage *image,
> +				char *kernel, unsigned long kernel_len,
> +				char *initrd, unsigned long initrd_len,
> +				char *cmdline, unsigned long cmdline_len)

> +	kbuf.buffer = kernel;
> +	kbuf.bufsz = kernel_len;
> +	kbuf.memsz = le64_to_cpu(h->image_size);
> +	text_offset = le64_to_cpu(h->text_offset);
> +	kbuf.buf_align = SZ_2M;

Nit: MIN_KIMG_ALIGN ?


> +	/* Adjust kernel segment with TEXT_OFFSET */
> +	kbuf.memsz += text_offset;
> +
> +	ret = kexec_add_buffer(&kbuf);
> +	if (ret)
> +		goto out;

You just return in the error cases above but here you goto ... the return
statement at the end. Seems a bit odd.


With the memcmp() thing fixed:
Reviewed-by: James Morse <james.morse@arm.com>


Thanks,

James


* Re: [PATCH v11 11/15] arm64: kexec_file: add crash dump support
  2018-07-11  7:41 ` [PATCH v11 11/15] arm64: kexec_file: add crash dump support AKASHI Takahiro
@ 2018-07-18 16:50   ` James Morse
  2018-07-23  5:39     ` AKASHI Takahiro
  0 siblings, 1 reply; 38+ messages in thread
From: James Morse @ 2018-07-18 16:50 UTC (permalink / raw)
  To: AKASHI Takahiro
  Cc: catalin.marinas, will.deacon, dhowells, vgoyal, herbert, davem,
	dyoung, bhe, arnd, ard.biesheuvel, bhsharma, kexec,
	linux-arm-kernel, linux-kernel

Hi Akashi,

On 11/07/18 08:41, AKASHI Takahiro wrote:
> Enabling crash dump (kdump) includes
> * prepare contents of ELF header of a core dump file, /proc/vmcore,
>   using crash_prepare_elf64_headers(), and
> * add two device tree properties, "linux,usable-memory-range" and
>   "linux,elfcorehdr", which represent respectively a memory range
>   to be used by crash dump kernel and the header's location

> diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
> index 69333694e3e2..eeb5766928b0 100644
> --- a/arch/arm64/include/asm/kexec.h
> +++ b/arch/arm64/include/asm/kexec.h
> @@ -99,6 +99,10 @@ static inline void crash_post_resume(void) {}
>  struct kimage_arch {
>  	phys_addr_t dtb_mem;
>  	void *dtb_buf;
> +	/* Core ELF header buffer */

> +	void *elf_headers;

Shouldn't this be a phys_addr_t if it comes from kbuf.mem?
(dtb_mem is, and the type tells us which way round the runtime/kexec-time
pointers are)


> +	unsigned long elf_headers_sz;
> +	unsigned long elf_load_addr;
>  };
>  
>  /**


> diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
> index a0b44fe18b95..261564df7210 100644
> --- a/arch/arm64/kernel/machine_kexec_file.c
> +++ b/arch/arm64/kernel/machine_kexec_file.c
> @@ -132,6 +173,45 @@ static int setup_dtb(struct kimage *image,
>  	return ret;
>  }
>  
> +static int prepare_elf_headers(void **addr, unsigned long *sz)
> +{
> +	struct crash_mem *cmem;
> +	unsigned int nr_ranges;
> +	int ret;
> +	u64 i;
> +	phys_addr_t start, end;

> +	nr_ranges = 1; /* for exclusion of crashkernel region */
> +	for_each_mem_range(i, &memblock.memory, NULL, NUMA_NO_NODE, 0,
> +							&start, &end, NULL)

Nit: flags = MEMBLOCK_NONE? Just to make it obvious this is how MEMBLOCK_NOMAP
regions are weeded out.

This is going to get interesting if we ever support hotpluggable memory... but
it works for now and implicitly removes the nomap regions.


> +		nr_ranges++;

> +
> +	cmem = kmalloc(sizeof(struct crash_mem) +
> +			sizeof(struct crash_mem_range) * nr_ranges, GFP_KERNEL);
> +	if (!cmem)
> +		return -ENOMEM;
> +
> +	cmem->max_nr_ranges = nr_ranges;
> +	cmem->nr_ranges = 0;
> +	for_each_mem_range(i, &memblock.memory, NULL, NUMA_NO_NODE, 0,
> +							&start, &end, NULL) {
> +		cmem->ranges[cmem->nr_ranges].start = start;
> +		cmem->ranges[cmem->nr_ranges].end = end - 1;
> +		cmem->nr_ranges++;
> +	}
> +
> +	/* Exclude crashkernel region */
> +	ret = crash_exclude_mem_range(cmem, crashk_res.start, crashk_res.end);


> +	if (ret)
> +		goto out;
> +
> +	ret =  crash_prepare_elf64_headers(cmem, true, addr, sz);
> +
> +out:

Nit: You could save the goto if you wrote this as:
|	if (!ret)
|		ret = crash_prepare_elf64_headers(cmem, true, addr, sz);


> +	kfree(cmem);
> +	return ret;
> +}
> +
>  int load_other_segments(struct kimage *image,
>  			unsigned long kernel_load_addr,
>  			unsigned long kernel_size,
> @@ -139,11 +219,43 @@ int load_other_segments(struct kimage *image,
>  			char *cmdline, unsigned long cmdline_len)
>  {
>  	struct kexec_buf kbuf;
> +	void *hdrs_addr;
> +	unsigned long hdrs_sz;
>  	unsigned long initrd_load_addr = 0;
>  	char *dtb = NULL;
>  	unsigned long dtb_len = 0;
>  	int ret = 0;
>  
> +	/* load elf core header */
> +	if (image->type == KEXEC_TYPE_CRASH) {
> +		ret = prepare_elf_headers(&hdrs_addr, &hdrs_sz);
> +		if (ret) {
> +			pr_err("Preparing elf core header failed\n");
> +			goto out_err;
> +		}
> +
> +		kbuf.image = image;
> +		kbuf.buffer = hdrs_addr;
> +		kbuf.bufsz = hdrs_sz;
> +		kbuf.memsz = hdrs_sz;

> +		kbuf.buf_align = PAGE_SIZE;

Whose PAGE_SIZE?

Won't this break if the kdump kernel uses 64K pages, but the first kernel uses 4K?
Should we change this to the largest supported PAGE_SIZE: SZ_64K?
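
To make the concern concrete, here is a small user-space sketch. The SZ_*
values are redefined locally and the helper names are hypothetical; the only
point is that an address aligned to 4K is not necessarily aligned to 64K, while
a 64K-aligned address satisfies every smaller power-of-two page size that arm64
supports (4K, 16K, 64K).

```c
#include <assert.h>

#define SZ_4K	0x1000UL
#define SZ_16K	0x4000UL
#define SZ_64K	0x10000UL

/* Round addr down to a power-of-two alignment. */
static unsigned long align_down(unsigned long addr, unsigned long align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return addr & ~(align - 1);
}

/* Non-zero when addr is a multiple of the power-of-two align. */
static int is_aligned(unsigned long addr, unsigned long align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr & (align - 1)) == 0;
}
```

For example, 0x4fff3000 is 4K-aligned but not 64K-aligned, whereas rounding it
down to 0x4fff0000 yields an address usable with any of the three page sizes.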


> +		kbuf.buf_min = crashk_res.start;
> +		kbuf.buf_max = crashk_res.end + 1;
> +		kbuf.top_down = true;
> +
> +		ret = kexec_add_buffer(&kbuf);
> +		if (ret) {
> +			vfree(hdrs_addr);
> +			goto out_err;
> +		}
> +		image->arch.elf_headers = hdrs_addr;
> +		image->arch.elf_headers_sz = hdrs_sz;
> +		image->arch.elf_load_addr = kbuf.mem;
> +
> +		pr_debug("Loaded elf core header at 0x%lx bufsz=0x%lx memsz=0x%lx\n",
> +				 image->arch.elf_load_addr, hdrs_sz, hdrs_sz);
> +	}
> +
>  	kbuf.image = image;
>  	/* not allocate anything below the kernel */
>  	kbuf.buf_min = kernel_load_addr + kernel_size;


I think the initramfs can escape the crash kernel range because you add to the
buf_max region:
|	/* within 1GB-aligned window of up to 32GB in size */
|	kbuf.buf_max = round_down(kernel_load_addr, SZ_1G)
|				 + (unsigned long)SZ_1G * 32;


I think we need a helper to clamp these min/max ranges to within the crash
kernel range, as it needs doing in a few places.
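
One possible shape for such a helper, as a user-space sketch only.
struct range_limits and clamp_to_crash_range() are hypothetical names, not
existing kernel interfaces, and the sketch assumes buf_max follows the
"crashk_res.end + 1" convention used in the hunk quoted above, with the
resource end being inclusive.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two struct kexec_buf fields used here. */
struct range_limits {
	unsigned long buf_min;
	unsigned long buf_max;
};

/*
 * Clamp [buf_min, buf_max) to the region [res_start, res_end], where
 * res_end is inclusive. Returns 0 on success, or -1 if no space is left
 * inside the region after clamping.
 */
static int clamp_to_crash_range(struct range_limits *lim,
				unsigned long res_start, unsigned long res_end)
{
	unsigned long limit;

	assert(lim != NULL && res_start <= res_end);
	limit = res_end + 1;	/* assumes res_end < ULONG_MAX */

	if (lim->buf_min < res_start)
		lim->buf_min = res_start;
	if (lim->buf_max > limit)
		lim->buf_max = limit;

	return lim->buf_min < lim->buf_max ? 0 : -1;
}
```

Called after the 32GB window is computed, a helper like this would keep the
initramfs limits from extending past the end of the crash kernel region.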


Thanks,

James


* Re: [PATCH v11 03/15] powerpc, kexec_file: factor out memblock-based arch_kexec_walk_mem()
  2018-07-16 12:26   ` Dave Young
@ 2018-07-18 16:52     ` James Morse
  2018-07-19  2:23       ` Dave Young
  0 siblings, 1 reply; 38+ messages in thread
From: James Morse @ 2018-07-18 16:52 UTC (permalink / raw)
  To: Dave Young, AKASHI Takahiro
  Cc: catalin.marinas, will.deacon, dhowells, vgoyal, herbert, davem,
	bhe, arnd, ard.biesheuvel, bhsharma, kexec, linux-arm-kernel,
	linux-kernel, Eric W. Biederman

Hi Dave, Akashi,

On 16/07/18 13:26, Dave Young wrote:
> On 07/11/18 at 04:41pm, AKASHI Takahiro wrote:
>> Memblock list is another source for usable system memory layout.
>> So powerpc's arch_kexec_walk_mem() is moved to kexec_file.c so that
>> other memblock-based architectures, particularly arm64, can also utilise
>> it. A moved function is now renamed to kexec_walk_memblock() and merged
>> into the existing arch_kexec_walk_mem() for general use, either resource
>> list or memblock list.
>>
>> A consequent function will not work for kdump with memblock list, but
>> this will be fixed in the next patch.
> 
> If this breaks something, then it would be good to fold the following
> patch in this patch so that bisect can still work?

This patch is just moving code from arch/powerpc that is generic.
powerpc doesn't support kdump via kexec_file, so nothing is damaged by adding
this new code in the next patch.

arm64 would need this kdump support, but it doesn't use it until patch 11.


Thanks,

James


* Re: [PATCH v11 03/15] powerpc, kexec_file: factor out memblock-based arch_kexec_walk_mem()
  2018-07-18 16:52     ` James Morse
@ 2018-07-19  2:23       ` Dave Young
  0 siblings, 0 replies; 38+ messages in thread
From: Dave Young @ 2018-07-19  2:23 UTC (permalink / raw)
  To: James Morse
  Cc: AKASHI Takahiro, catalin.marinas, will.deacon, dhowells, vgoyal,
	herbert, davem, bhe, arnd, ard.biesheuvel, bhsharma, kexec,
	linux-arm-kernel, linux-kernel, Eric W. Biederman

Hi James,
On 07/18/18 at 05:52pm, James Morse wrote:
> Hi Dave, Akashi,
> 
> On 16/07/18 13:26, Dave Young wrote:
> > On 07/11/18 at 04:41pm, AKASHI Takahiro wrote:
> >> Memblock list is another source for usable system memory layout.
> >> So powerpc's arch_kexec_walk_mem() is moved to kexec_file.c so that
> >> other memblock-based architectures, particularly arm64, can also utilise
> >> it. A moved function is now renamed to kexec_walk_memblock() and merged
> >> into the existing arch_kexec_walk_mem() for general use, either resource
> >> list or memblock list.
> >>
> >> A consequent function will not work for kdump with memblock list, but
> >> this will be fixed in the next patch.
> > 
> > If this breaks something, then it would be good to fold the following
> > patch in this patch so that bisect can still work?
> 
> This patch is just moving code from arch/powerpc that is generic.
> powerpc doesn't support kdump via kexec_file, so nothing is damaged by adding
> this new code in the next patch.
> 
> arm64 would need this kdump support, but it doesn't use it until patch 11.

Ok, then I'm fine with it, thanks for the explanation.

> 
> 
> Thanks,
> 
> James

Thanks
Dave


* Re: [PATCH v11 03/15] powerpc, kexec_file: factor out memblock-based arch_kexec_walk_mem()
  2018-07-18  6:45                   ` Dave Young
@ 2018-07-20  5:33                     ` AKASHI Takahiro
  2018-07-20  5:57                       ` Dave Young
  0 siblings, 1 reply; 38+ messages in thread
From: AKASHI Takahiro @ 2018-07-20  5:33 UTC (permalink / raw)
  To: Dave Young
  Cc: James Morse, catalin.marinas, will.deacon, dhowells, vgoyal,
	herbert, davem, bhe, arnd, ard.biesheuvel, bhsharma, kexec,
	linux-arm-kernel, linux-kernel, Eric W. Biederman

Dave,

On Wed, Jul 18, 2018 at 02:45:19PM +0800, Dave Young wrote:
> On 07/18/18 at 03:40pm, AKASHI Takahiro wrote:
> > On Wed, Jul 18, 2018 at 02:13:50PM +0800, Dave Young wrote:
> > > Hi AKASHI,
> > > 
> > > On 07/18/18 at 02:38pm, AKASHI Takahiro wrote:
> > > > Dave,
> > > > 
> > > > On Tue, Jul 17, 2018 at 03:49:23PM +0800, Dave Young wrote:
> > > > > Hi AKASHI,
> > > > > On 07/17/18 at 02:31pm, AKASHI Takahiro wrote:
> > > > > > Hi Dave,
> > > > > > 
> > > > > > On Mon, Jul 16, 2018 at 08:24:12PM +0800, Dave Young wrote:
> > > > > > > On 07/16/18 at 12:04pm, James Morse wrote:
> > > > > > > > Hi Dave,
> > > > > > > > 
> > > > > > > > On 14/07/18 02:52, Dave Young wrote:
> > > > > > > > > On 07/11/18 at 04:41pm, AKASHI Takahiro wrote:
> > > > > > > > >> Memblock list is another source for usable system memory layout.
> > > > > > > > >> So powerpc's arch_kexec_walk_mem() is moved to kexec_file.c so that
> > > > > > > > >> other memblock-based architectures, particularly arm64, can also utilise
> > > > > > > > >> it. A moved function is now renamed to kexec_walk_memblock() and merged
> > > > > > > > >> into the existing arch_kexec_walk_mem() for general use, either resource
> > > > > > > > >> list or memblock list.
> > > > > > > > >>
> > > > > > > > >> A consequent function will not work for kdump with memblock list, but
> > > > > > > > >> this will be fixed in the next patch.
> > > > > > > > 
> > > > > > > > >> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> > > > > > > > 
> > > > > > > > >> @@ -513,6 +563,10 @@ static int locate_mem_hole_callback(struct resource *res, void *arg)
> > > > > > > > >>  int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf,
> > > > > > > > >>  			       int (*func)(struct resource *, void *))
> > > > > > > > >>  {
> > > > > > > > >> +	if (IS_ENABLED(CONFIG_HAVE_MEMBLOCK) &&
> > > > > > > > >> +			!IS_ENABLED(CONFIG_ARCH_DISCARD_MEMBLOCK))
> > > > > > > > >> +		return kexec_walk_memblock(kbuf, func);
> > > > > > > > > 
> > > > > > > > > AKASHI, I'm not sure if this works on all arches, for example I checked
> > > > > > > > > the .config on my Nokia N900 kernel tree, there is HAVE_MEMBLOCK=y and
> > > > > > > > > no CONFIG_ARCH_DISCARD_MEMBLOCK, in 32bit arm code no arch_kexec_walk_mem()
> > > > > > > > By doesn't work you mean it's a change in behaviour?
> > > > > > > > I think this is fine because 32bit arm doesn't support KEXEC_FILE, (this file is
> > > > > > > > kexec_file specific right?).
> > > > > > > 
> > > > > > > Ah, replied on a train, I forgot this is only for kexec_file, sorry
> > > > > > > about that.  Please ignore the comment.
> > > > > > > 
> > > > > > > But since we have a weak function arch_kexec_walk_mem, adding another
> > > > > > > condition branch within this weak function looks not good.
> > > > > > > Something like below would be better:
> > > > > > 
> > > > > > I see your concern here, but
> > > > > > 
> > > > > > 
> > > > > > > int kexec_locate_mem_hole(struct kexec_buf *kbuf)
> > > > > > > {
> > > > > > >         int ret;
> > > > > > > 
> > > > > > > 	+ if use memblock
> > > > > > > 	+	ret = kexec_walk_memblock()
> > > > > > > 	+ else
> > > > > > >         	ret = arch_kexec_walk_mem(kbuf, locate_mem_hole_callback);
> > > > > > > 
> > > > > > >         return ret == 1 ? 0 : -EADDRNOTAVAIL;
> > > > > > > }
> > > > > > 
> > > > > > what if yet another architecture comes to kexec_file and wanna
> > > > > > take a third approach? How can it override those functions?
> > > > > > Depending on kernel configuration, it might re-define either
> > > > > > kexec_walk_memblock() or arch_kexec_walk_mem(). It sounds weird to me.
> > > > > 
> > > > > I also feel this weird, but it is slightly better because currently no
> > > > > user need another overriding requirement, and I feel it is not expected to have in
> > > > > the future for the memblock use.
> > > > > 
> > > > > Rethinking about this issue, we can just remove the weak function and
> > > > > just use general function.
> > > > 
> > > > Do you really want to remove "weak" attribute?
> > > > 
> > > > > Currently with your patch applied only s390 use arch_kexec_walk_mem like
> > > > > below:
> > > > > /*
> > > > >  * The kernel is loaded to a fixed location. Turn off kexec_locate_mem_hole
> > > > >  * and provide kbuf->mem by hand.
> > > > >  */
> > > > > int arch_kexec_walk_mem(struct kexec_buf *kbuf,
> > > > >                         int (*func)(struct resource *, void *))
> > > > > {
> > > > >         return 1;
> > > > > }
> > > > > 
> > > > > AFAIK, all other users initialize kbuf->mem as NULL, so we can check
> > > > 
> > > > As a matter of fact, nobody initializes kbuf->mem before calling
> > > > kexec_add_buffer (in turn, kexec_locate_mem_hole()).
> > > 
> > > Not sure we understand each other..
> > > Let's take an example in arch/x86/kernel/kexec-bzimage64.c:
> > > bzImage64_load() :
> > > 	struct kexec_buf kbuf = { .image = image, .buf_max = ULONG_MAX,
> > > 				.top_down = true };
> > > 
> > > Except the three fields above other members will be initialized as zero
> > > when compiling including the kbuf->mem
> > 
> > Ah, you're right.
> > (My arm64 patch doesn't use struct initializer, though.)
> > 
> > > > 
> > > > > kbuf->mem in int kexec_locate_mem_hole:
> > > > > 
> > > > > if (kbuf->mem)
> > > > > 	return 0;
> > > > > 
> > > > > if use memblock
> > > > > 	kexec_walk_memblock
> > > > > else
> > > > > 	kexec_walk_mem
> > > 
> > > kexec_walk_resource will be better than kexec_walk_mem
> > > 
> > > > 
> > > > I think that your solution will work for existing architectures
> > > > with appropriate patches, but to take your approach, as I said above,
> > > > we will have to modify every call site on all kexec_file-capable architectures.
> > > > 
> > > > If this is what you expect, I will work on it, but I don't think
> > > > that it would be a better idea.
> > 
> > So you would expect me to modify my own arm64 code as well as s390.
> 
> Yes :)  But I have not had time to read all your patches, so I was not
> aware of the struct initialization in the arm64 code and assumed only
> s390 needed a change.

Okay, but I don't want to mix cross-arch changes into a single patch,
prefer to leave the current patch as it is and add an additional patch
as you suggested here.

Is that OK for you?

Thanks,
-Takahiro AKASHI


> Thanks
> Dave


* Re: [PATCH v11 03/15] powerpc, kexec_file: factor out memblock-based arch_kexec_walk_mem()
  2018-07-20  5:33                     ` AKASHI Takahiro
@ 2018-07-20  5:57                       ` Dave Young
  2018-07-20  6:25                         ` AKASHI Takahiro
  0 siblings, 1 reply; 38+ messages in thread
From: Dave Young @ 2018-07-20  5:57 UTC (permalink / raw)
  To: AKASHI Takahiro, James Morse, catalin.marinas, will.deacon,
	dhowells, vgoyal, herbert, davem, bhe, arnd, ard.biesheuvel,
	bhsharma, kexec, linux-arm-kernel, linux-kernel,
	Eric W. Biederman

On 07/20/18 at 02:33pm, AKASHI Takahiro wrote:
> Dave,
> 
> On Wed, Jul 18, 2018 at 02:45:19PM +0800, Dave Young wrote:
> > On 07/18/18 at 03:40pm, AKASHI Takahiro wrote:
> > > On Wed, Jul 18, 2018 at 02:13:50PM +0800, Dave Young wrote:
> > > > Hi AKASHI,
> > > > 
> > > > On 07/18/18 at 02:38pm, AKASHI Takahiro wrote:
> > > > > Dave,
> > > > > 
> > > > > On Tue, Jul 17, 2018 at 03:49:23PM +0800, Dave Young wrote:
> > > > > > Hi AKASHI,
> > > > > > On 07/17/18 at 02:31pm, AKASHI Takahiro wrote:
> > > > > > > Hi Dave,
> > > > > > > 
> > > > > > > On Mon, Jul 16, 2018 at 08:24:12PM +0800, Dave Young wrote:
> > > > > > > > On 07/16/18 at 12:04pm, James Morse wrote:
> > > > > > > > > Hi Dave,
> > > > > > > > > 
> > > > > > > > > On 14/07/18 02:52, Dave Young wrote:
> > > > > > > > > > On 07/11/18 at 04:41pm, AKASHI Takahiro wrote:
> > > > > > > > > >> Memblock list is another source for usable system memory layout.
> > > > > > > > > >> So powerpc's arch_kexec_walk_mem() is moved to kexec_file.c so that
> > > > > > > > > >> other memblock-based architectures, particularly arm64, can also utilise
> > > > > > > > > >> it. A moved function is now renamed to kexec_walk_memblock() and merged
> > > > > > > > > >> into the existing arch_kexec_walk_mem() for general use, either resource
> > > > > > > > > >> list or memblock list.
> > > > > > > > > >>
> > > > > > > > > >> A consequent function will not work for kdump with memblock list, but
> > > > > > > > > >> this will be fixed in the next patch.
> > > > > > > > > 
> > > > > > > > > >> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> > > > > > > > > 
> > > > > > > > > >> @@ -513,6 +563,10 @@ static int locate_mem_hole_callback(struct resource *res, void *arg)
> > > > > > > > > >>  int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf,
> > > > > > > > > >>  			       int (*func)(struct resource *, void *))
> > > > > > > > > >>  {
> > > > > > > > > >> +	if (IS_ENABLED(CONFIG_HAVE_MEMBLOCK) &&
> > > > > > > > > >> +			!IS_ENABLED(CONFIG_ARCH_DISCARD_MEMBLOCK))
> > > > > > > > > >> +		return kexec_walk_memblock(kbuf, func);
> > > > > > > > > > 
> > > > > > > > > > AKASHI, I'm not sure if this works on all arches, for example I checked
> > > > > > > > > > the .config on my Nokia N900 kernel tree, there is HAVE_MEMBLOCK=y and
> > > > > > > > > > no CONFIG_ARCH_DISCARD_MEMBLOCK, in 32bit arm code no arch_kexec_walk_mem()
> > > > > > > > > By doesn't work you mean it's a change in behaviour?
> > > > > > > > > I think this is fine because 32bit arm doesn't support KEXEC_FILE, (this file is
> > > > > > > > > kexec_file specific right?).
> > > > > > > > 
> > > > > > > > Ah, replied on a train, I forgot this is only for kexec_file, sorry
> > > > > > > > about that.  Please ignore the comment.
> > > > > > > > 
> > > > > > > > But since we have a weak function arch_kexec_walk_mem, adding another
> > > > > > > > condition branch within this weak function looks not good.
> > > > > > > > Something like below would be better:
> > > > > > > 
> > > > > > > I see your concern here, but
> > > > > > > 
> > > > > > > 
> > > > > > > > int kexec_locate_mem_hole(struct kexec_buf *kbuf)
> > > > > > > > {
> > > > > > > >         int ret;
> > > > > > > > 
> > > > > > > > 	+ if use memblock
> > > > > > > > 	+	ret = kexec_walk_memblock()
> > > > > > > > 	+ else
> > > > > > > >         	ret = arch_kexec_walk_mem(kbuf, locate_mem_hole_callback);
> > > > > > > > 
> > > > > > > >         return ret == 1 ? 0 : -EADDRNOTAVAIL;
> > > > > > > > }
> > > > > > > 
> > > > > > > what if yet another architecture comes to kexec_file and wanna
> > > > > > > take a third approach? How can it override those functions?
> > > > > > > Depending on kernel configuration, it might re-define either
> > > > > > > kexec_walk_memblock() or arch_kexec_walk_mem(). It sounds weird to me.
> > > > > > 
> > > > > > I also feel this weird, but it is slightly better because currently no
> > > > > > user need another overriding requirement, and I feel it is not expected to have in
> > > > > > the future for the memblock use.
> > > > > > 
> > > > > > Rethinking about this issue, we can just remove the weak function and
> > > > > > just use general function.
> > > > > 
> > > > > Do you really want to remove "weak" attribute?
> > > > > 
> > > > > > Currently with your patch applied only s390 use arch_kexec_walk_mem like
> > > > > > below:
> > > > > > /*
> > > > > >  * The kernel is loaded to a fixed location. Turn off kexec_locate_mem_hole
> > > > > >  * and provide kbuf->mem by hand.
> > > > > >  */
> > > > > > int arch_kexec_walk_mem(struct kexec_buf *kbuf,
> > > > > >                         int (*func)(struct resource *, void *))
> > > > > > {
> > > > > >         return 1;
> > > > > > }
> > > > > > 
> > > > > > AFAIK, all other users initialize kbuf->mem as NULL, so we can check
> > > > > 
> > > > > As a matter of fact, nobody initializes kbuf->mem before calling
> > > > > kexec_add_buffer (in turn, kexec_locate_mem_hole()).
> > > > 
> > > > Not sure we understand each other..
> > > > Let's take an example in arch/x86/kernel/kexec-bzimage64.c:
> > > > bzImage64_load() :
> > > > 	struct kexec_buf kbuf = { .image = image, .buf_max = ULONG_MAX,
> > > > 				.top_down = true };
> > > > 
> > > > Except the three fields above other members will be initialized as zero
> > > > when compiling including the kbuf->mem
> > > 
> > > Ah, you're right.
> > > (My arm64 patch doesn't use struct initializer, though.)
> > > 
> > > > > 
> > > > > > kbuf->mem in int kexec_locate_mem_hole:
> > > > > > 
> > > > > > if (kbuf->mem)
> > > > > > 	return 0;
> > > > > > 
> > > > > > if use memblock
> > > > > > 	kexec_walk_memblock
> > > > > > else
> > > > > > 	kexec_walk_mem
> > > > 
> > > > kexec_walk_resource will be better than kexec_walk_mem
> > > > 
> > > > > 
> > > > > I think that your solution will work for existing architectures
> > > > > with appropriate patches, but to take your approach, as I said above,
> > > > > we will have to modify every call site on all kexec_file-capable architectures.
> > > > > 
> > > > > If this is what you expect, I will work on it, but I don't think
> > > > > that it would be a better idea.
> > > 
> > > So you would expect me to modify my own arm64 code as well as s390.
> > 
> > Yes :)  But I have not had time to read all your patches, so I was not
> > aware of the struct initialization in the arm64 code and assumed only
> > s390 needed a change.
> 
> Okay, but I don't want to mix cross-arch changes into a single patch,
> prefer to leave the current patch as it is and add an additional patch
> as you suggested here.
Hi AKASHI,

Maybe add another patch that drops the s390 walk function first, then follow
it with this patch, modified to restructure the common code.

Is this better? For example:
03/15 s390, drop s390 arch_kexec_walk_mem
04/15 powerpc, kexec_file: factor out memblock-based arch_kexec_walk_mem
> 
> Is that OK for you?
> 
> Thanks,
> -Takahiro AKASHI
> 
> 
> > Thanks
> > Dave

Thanks
Dave


* Re: [PATCH v11 10/15] arm64: kexec_file: allow for loading Image-format kernel
  2018-07-18 16:47   ` James Morse
@ 2018-07-20  6:14     ` AKASHI Takahiro
  0 siblings, 0 replies; 38+ messages in thread
From: AKASHI Takahiro @ 2018-07-20  6:14 UTC (permalink / raw)
  To: James Morse
  Cc: catalin.marinas, will.deacon, dhowells, vgoyal, herbert, davem,
	dyoung, bhe, arnd, ard.biesheuvel, bhsharma, kexec,
	linux-arm-kernel, linux-kernel

On Wed, Jul 18, 2018 at 05:47:50PM +0100, James Morse wrote:
> Hi Akashi,
> 
> On 11/07/18 08:41, AKASHI Takahiro wrote:
> > This patch provides kexec_file_ops for "Image"-format kernel. In this
> > implementation, a binary is always loaded with a fixed offset identified
> > in text_offset field of its header.
> > 
> > Regarding signature verification for trusted boot, this patch doesn't
> > contain CONFIG_KEXEC_VERIFY_SIG support, which is to be added later
> > in this series, but file-attribute-based verification is still a viable
> > option by enabling IMA security subsystem.
> > 
> > You can sign(label) a to-be-kexec'ed kernel image on target file system
> > with:
> >     $ evmctl ima_sign --key /path/to/private_key.pem Image
> > 
> > On live system, you must have IMA enforced with, at least, the following
> > security policy:
> >     "appraise func=KEXEC_KERNEL_CHECK appraise_type=imasig"
> > 
> > See more details about IMA here:
> >     https://sourceforge.net/p/linux-ima/wiki/Home/
> 
> This looks useful for setting up keys/signature/policy for a kernel that wasn't
> built to enforce signatures at compile time, so it's a good thing to have from a
> single-image perspective.
> 
> I haven't managed to get IMA working to test this, but it's all done by the kexec
> core code, so I don't think we're missing anything.
> 
> 
> > diff --git a/arch/arm64/kernel/kexec_image.c b/arch/arm64/kernel/kexec_image.c
> > new file mode 100644
> > index 000000000000..a47cf9bc699e
> > --- /dev/null
> > +++ b/arch/arm64/kernel/kexec_image.c
> 
> > +static int image_probe(const char *kernel_buf, unsigned long kernel_len)
> > +{
> > +	const struct arm64_image_header *h;
> > +
> > +	h = (const struct arm64_image_header *)(kernel_buf);
> > +
> > +	if (!h || (kernel_len < sizeof(*h)) ||
> 
> > +			!memcmp(&h->magic, ARM64_MAGIC, sizeof(ARM64_MAGIC)))
> 
> Doesn't memcmp() return 0 if the memory regions are the same?
> This would always match the correct magic, rejecting the image.
> 
> That's not what's happening, as kexec-file works, so this never matches anything.
> 
> sizeof(ARM64_MAGIC) includes the null terminator, but this sequence is output in
> head.S using '.ascii' which doesn't include the terminator (otherwise it
> wouldn't fit in the 4-byte magic field). The memcmp() here is also consuming the
> least significant bytes of the next field.
> 
> I think this line should be:
> | 			memcmp(&h->magic, ARM64_MAGIC, sizeof(h->magic)))

You're absolutely right!

> 
> > +static void *image_load(struct kimage *image,
> > +				char *kernel, unsigned long kernel_len,
> > +				char *initrd, unsigned long initrd_len,
> > +				char *cmdline, unsigned long cmdline_len)
> 
> > +	kbuf.buffer = kernel;
> > +	kbuf.bufsz = kernel_len;
> > +	kbuf.memsz = le64_to_cpu(h->image_size);
> > +	text_offset = le64_to_cpu(h->text_offset);
> > +	kbuf.buf_align = SZ_2M;
> 
> Nit: MIN_KIMG_ALIGN ?

OK.

> 
> > +	/* Adjust kernel segment with TEXT_OFFSET */
> > +	kbuf.memsz += text_offset;
> > +
> > +	ret = kexec_add_buffer(&kbuf);
> > +	if (ret)
> > +		goto out;
> 
> You just return in the error cases above but here you goto ... the return
> statement at the end. Seems a bit odd.

Will fix it.

> 
> With the memcmp() thing fixed:
> Reviewed-by: James Morse <james.morse@arm.com>

Always appreciate you reviewing.

-Takahiro AKASHI


> 
> Thanks,
> 
> James


* Re: [PATCH v11 03/15] powerpc, kexec_file: factor out memblock-based arch_kexec_walk_mem()
  2018-07-20  5:57                       ` Dave Young
@ 2018-07-20  6:25                         ` AKASHI Takahiro
  0 siblings, 0 replies; 38+ messages in thread
From: AKASHI Takahiro @ 2018-07-20  6:25 UTC (permalink / raw)
  To: Dave Young
  Cc: James Morse, catalin.marinas, will.deacon, dhowells, vgoyal,
	herbert, davem, bhe, arnd, ard.biesheuvel, bhsharma, kexec,
	linux-arm-kernel, linux-kernel, Eric W. Biederman

On Fri, Jul 20, 2018 at 01:57:27PM +0800, Dave Young wrote:
> On 07/20/18 at 02:33pm, AKASHI Takahiro wrote:
> > Dave,
> > 
> > On Wed, Jul 18, 2018 at 02:45:19PM +0800, Dave Young wrote:
> > > On 07/18/18 at 03:40pm, AKASHI Takahiro wrote:
> > > > On Wed, Jul 18, 2018 at 02:13:50PM +0800, Dave Young wrote:
> > > > > Hi AKASHI,
> > > > > 
> > > > > On 07/18/18 at 02:38pm, AKASHI Takahiro wrote:
> > > > > > Dave,
> > > > > > 
> > > > > > On Tue, Jul 17, 2018 at 03:49:23PM +0800, Dave Young wrote:
> > > > > > > Hi AKASHI,
> > > > > > > On 07/17/18 at 02:31pm, AKASHI Takahiro wrote:
> > > > > > > > Hi Dave,
> > > > > > > > 
> > > > > > > > On Mon, Jul 16, 2018 at 08:24:12PM +0800, Dave Young wrote:
> > > > > > > > > On 07/16/18 at 12:04pm, James Morse wrote:
> > > > > > > > > > Hi Dave,
> > > > > > > > > > 
> > > > > > > > > > On 14/07/18 02:52, Dave Young wrote:
> > > > > > > > > > > On 07/11/18 at 04:41pm, AKASHI Takahiro wrote:
> > > > > > > > > > >> Memblock list is another source for usable system memory layout.
> > > > > > > > > > >> So powerpc's arch_kexec_walk_mem() is moved to kexec_file.c so that
> > > > > > > > > > >> other memblock-based architectures, particularly arm64, can also utilise
> > > > > > > > > > >> it. A moved function is now renamed to kexec_walk_memblock() and merged
> > > > > > > > > > >> into the existing arch_kexec_walk_mem() for general use, either resource
> > > > > > > > > > >> list or memblock list.
> > > > > > > > > > >>
> > > > > > > > > > >> A consequent function will not work for kdump with memblock list, but
> > > > > > > > > > >> this will be fixed in the next patch.
> > > > > > > > > > 
> > > > > > > > > > >> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> > > > > > > > > > 
> > > > > > > > > > >> @@ -513,6 +563,10 @@ static int locate_mem_hole_callback(struct resource *res, void *arg)
> > > > > > > > > > >>  int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf,
> > > > > > > > > > >>  			       int (*func)(struct resource *, void *))
> > > > > > > > > > >>  {
> > > > > > > > > > >> +	if (IS_ENABLED(CONFIG_HAVE_MEMBLOCK) &&
> > > > > > > > > > >> +			!IS_ENABLED(CONFIG_ARCH_DISCARD_MEMBLOCK))
> > > > > > > > > > >> +		return kexec_walk_memblock(kbuf, func);
> > > > > > > > > > > 
> > > > > > > > > > > > AKASHI, I'm not sure if this works on all arches, for example I checked
> > > > > > > > > > > the .config on my Nokia N900 kernel tree, there is HAVE_MEMBLOCK=y and
> > > > > > > > > > > no CONFIG_ARCH_DISCARD_MEMBLOCK, in 32bit arm code no arch_kexec_walk_mem()
> > > > > > > > > > By doesn't work you mean it's a change in behaviour?
> > > > > > > > > > I think this is fine because 32bit arm doesn't support KEXEC_FILE, (this file is
> > > > > > > > > > kexec_file specific right?).
> > > > > > > > > 
> > > > > > > > > Ah, replied on a train, I forgot this is only for kexec_file, sorry
> > > > > > > > > about that.  Please ignore the comment.
> > > > > > > > > 
> > > > > > > > > But since we have a weak function arch_kexec_walk_mem, adding another
> > > > > > > > > condition branch within this weak function looks not good.
> > > > > > > > > Something like below would be better:
> > > > > > > > 
> > > > > > > > I see your concern here, but
> > > > > > > > 
> > > > > > > > 
> > > > > > > > > int kexec_locate_mem_hole(struct kexec_buf *kbuf)
> > > > > > > > > {
> > > > > > > > >         int ret;
> > > > > > > > > 
> > > > > > > > > 	+ if use memblock
> > > > > > > > > 	+	ret = kexec_walk_memblock()
> > > > > > > > > 	+ else
> > > > > > > > >         	ret = arch_kexec_walk_mem(kbuf, locate_mem_hole_callback);
> > > > > > > > > 
> > > > > > > > >         return ret == 1 ? 0 : -EADDRNOTAVAIL;
> > > > > > > > > }
> > > > > > > > 
> > > > > > > > what if yet another architecture comes to kexec_file and wanna
> > > > > > > > take a third approach? How can it override those functions?
> > > > > > > > Depending on kernel configuration, it might re-define either
> > > > > > > > kexec_walk_memblock() or arch_kexec_walk_mem(). It sounds weird to me.
> > > > > > > 
> > > > > > > > I also find this weird, but it is slightly better because currently no
> > > > > > > > user needs to override it, and I do not expect that to be needed in
> > > > > > > > the future for memblock use.
> > > > > > > 
> > > > > > > Rethinking about this issue, we can just remove the weak function and
> > > > > > > just use general function.
> > > > > > 
> > > > > > Do you really want to remove "weak" attribute?
> > > > > > 
> > > > > > > Currently with your patch applied only s390 use arch_kexec_walk_mem like
> > > > > > > below:
> > > > > > > /*
> > > > > > >  * The kernel is loaded to a fixed location. Turn off kexec_locate_mem_hole
> > > > > > >  * and provide kbuf->mem by hand.
> > > > > > >  */
> > > > > > > int arch_kexec_walk_mem(struct kexec_buf *kbuf,
> > > > > > >                         int (*func)(struct resource *, void *))
> > > > > > > {
> > > > > > >         return 1;
> > > > > > > }
> > > > > > > 
> > > > > > > AFAIK, all other users initialize kbuf->mem as NULL, so we can check
> > > > > > 
> > > > > > As a matter of fact, nobody initializes kbuf->mem before calling
> > > > > > kexec_add_buffer (in turn, kexec_locate_mem_hole()).
> > > > > 
> > > > > Not sure we understand each other..
> > > > > Let's take an example in arch/x86/kernel/kexec-bzimage64.c:
> > > > > bzImage64_load() :
> > > > > 	struct kexec_buf kbuf = { .image = image, .buf_max = ULONG_MAX,
> > > > > 				.top_down = true };
> > > > > 
> > > > > > Except for the three fields above, all other members, including kbuf->mem,
> > > > > > are zero-initialized by the compiler.
> > > > 
> > > > Ah, you're right.
> > > > > (My arm64 patch doesn't use a struct initializer, though.)
> > > > 
> > > > > > 
> > > > > > > kbuf->mem in int kexec_locate_mem_hole:
> > > > > > > 
> > > > > > > if (kbuf->mem)
> > > > > > > 	return 0;
> > > > > > > 
> > > > > > > if use memblock
> > > > > > > 	kexec_walk_memblock
> > > > > > > else
> > > > > > > 	kexec_walk_mem
> > > > > 
> > > > > kexec_walk_resource will be better than kexec_walk_mem
> > > > > 
> > > > > > 
> > > > > > I think that your solution will work for existing architectures
> > > > > > with appropriate patches, but to take your approach, as I said above,
> > > > > > we will have to modify every call site on all kexec_file-capable architectures.
> > > > > > 
> > > > > > If this is what you expect, I will work on it, but I don't think
> > > > > > that it would be a better idea.
> > > > 
> > > > So you would expect me to modify my own arm64 code as well as s390.
> > > 
> > > > Yes :)  But I had not had time to read all your patches, so I was not
> > > > aware of the struct initialization in the arm64 code and assumed only s390
> > > > needed a change.
> > 
> > Okay, but I don't want to mix cross-arch changes into a single patch,
> > prefer to leave the current patch as it is and add an additional patch
> > as you suggested here.
> Hi AKASHI,
> 
> Maybe add another patch to drop s390 walk function first, then follow
> with this patch with the modification about common code restructure.
> 
> Is this better? For example:
> 03/15 s390, drop s390 arch_kexec_walk_mem
> 04/15 powerpc, kexec_file: factor out memblock-based arch_kexec_walk_mem

That's fine with me, too.
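
For reference, the restructuring discussed in this sub-thread can be sketched
as a small user-space model. This is only an illustration, not kernel code:
struct kexec_buf is reduced to four fields, the two range tables and the
use_memblock parameter are hypothetical stand-ins for the memblock list, the
resource list and the CONFIG_HAVE_MEMBLOCK check, and kexec_walk_resource() is
the name Dave suggested rather than an existing function.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Reduced stand-in for the kernel's struct kexec_buf (assumption). */
struct kexec_buf {
	unsigned long mem;	/* load address; 0 means "not chosen yet" */
	unsigned long memsz;	/* size needed, must be non-zero */
	unsigned long buf_min;	/* lowest acceptable address */
	unsigned long buf_max;	/* highest acceptable address (inclusive here) */
};

struct mem_range { unsigned long start, end; };	/* end is inclusive */

/* Hypothetical memory maps standing in for memblock and iomem resources. */
static const struct mem_range memblock_ranges[] = { { 0x80000000UL, 0x8fffffffUL } };
static const struct mem_range resource_ranges[] = { { 0x1000UL, 0xffffUL } };

/* Pick the first range that can hold memsz bytes; 1 means "found". */
static int walk_ranges(struct kexec_buf *kbuf, const struct mem_range *r, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		unsigned long lo = r[i].start > kbuf->buf_min ? r[i].start : kbuf->buf_min;
		unsigned long hi = r[i].end < kbuf->buf_max ? r[i].end : kbuf->buf_max;

		if (kbuf->memsz != 0 && lo <= hi && hi - lo >= kbuf->memsz - 1) {
			kbuf->mem = lo;
			return 1;
		}
	}
	return 0;
}

static int kexec_walk_memblock(struct kexec_buf *kbuf)
{
	return walk_ranges(kbuf, memblock_ranges, 1);
}

static int kexec_walk_resource(struct kexec_buf *kbuf)
{
	return walk_ranges(kbuf, resource_ranges, 1);
}

int kexec_locate_mem_hole(struct kexec_buf *kbuf, bool use_memblock)
{
	int ret;

	/* Address already provided by the caller (the s390 case): keep it. */
	if (kbuf->mem)
		return 0;

	if (use_memblock)
		ret = kexec_walk_memblock(kbuf);
	else
		ret = kexec_walk_resource(kbuf);

	return ret == 1 ? 0 : -EADDRNOTAVAIL;
}
```

The early return only works if every caller zero-initializes kbuf.mem, which
is why callers that fill struct kexec_buf field by field would need a change.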

Thanks,
-Takahiro AKASHI

> > 
> > Is that OK for you?
> > 
> > Thanks,
> > -Takahiro AKASHI
> > 
> > 
> > > Thanks
> > > Dave
> 
> Thanks
> Dave

* Re: [PATCH v11 11/15] arm64: kexec_file: add crash dump support
  2018-07-18 16:50   ` James Morse
@ 2018-07-23  5:39     ` AKASHI Takahiro
  2018-07-23 17:04       ` James Morse
  0 siblings, 1 reply; 38+ messages in thread
From: AKASHI Takahiro @ 2018-07-23  5:39 UTC (permalink / raw)
  To: James Morse
  Cc: catalin.marinas, will.deacon, dhowells, vgoyal, herbert, davem,
	dyoung, bhe, arnd, ard.biesheuvel, bhsharma, kexec,
	linux-arm-kernel, linux-kernel

Hi James,

On Wed, Jul 18, 2018 at 05:50:22PM +0100, James Morse wrote:
> Hi Akashi,
> 
> On 11/07/18 08:41, AKASHI Takahiro wrote:
> > Enabling crash dump (kdump) includes
> > * prepare contents of ELF header of a core dump file, /proc/vmcore,
> >   using crash_prepare_elf64_headers(), and
> > * add two device tree properties, "linux,usable-memory-range" and
> >   "linux,elfcorehdr", which represent respectively a memory range
> >   to be used by crash dump kernel and the header's location
> 
> > diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
> > index 69333694e3e2..eeb5766928b0 100644
> > --- a/arch/arm64/include/asm/kexec.h
> > +++ b/arch/arm64/include/asm/kexec.h
> > @@ -99,6 +99,10 @@ static inline void crash_post_resume(void) {}
> >  struct kimage_arch {
> >  	phys_addr_t dtb_mem;
> >  	void *dtb_buf;
> > +	/* Core ELF header buffer */
> 
> > +	void *elf_headers;
> 
> Shouldn't this be a phys_addr_t if it comes from kbuf.mem?

Do you mean elf_load_addr? You're right.
But kexec_buf defines mem as unsigned long and so I'd rather change
dtb_mem to unsigned long instead of elf_load_addr, which will also be
renamed to elf_headers_mem for clarification.

> (dtb_mem is, and the type tells us which way round the runtime/kexec-time
> pointers are)
> 
> 
> > +	unsigned long elf_headers_sz;
> > +	unsigned long elf_load_addr;
> >  };
> >  
> >  /**
> 
> 
> > diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
> > index a0b44fe18b95..261564df7210 100644
> > --- a/arch/arm64/kernel/machine_kexec_file.c
> > +++ b/arch/arm64/kernel/machine_kexec_file.c
> > @@ -132,6 +173,45 @@ static int setup_dtb(struct kimage *image,
> >  	return ret;
> >  }
> >  
> > +static int prepare_elf_headers(void **addr, unsigned long *sz)
> > +{
> > +	struct crash_mem *cmem;
> > +	unsigned int nr_ranges;
> > +	int ret;
> > +	u64 i;
> > +	phys_addr_t start, end;
> 
> > +	nr_ranges = 1; /* for exclusion of crashkernel region */
> > +	for_each_mem_range(i, &memblock.memory, NULL, NUMA_NO_NODE, 0,
> > +							&start, &end, NULL)
> 
> Nit: flags = MEMBLOCK_NONE? Just to make it obvious this is how MEMBLOCK_NOMAP
> regions are weeded out.

OK.

> This is going to get interesting if we ever support hotpluggable memory... but
> it works for now and implicitly removes the nomap regions.
> 
> 
> > +		nr_ranges++;
> 
> > +
> > +	cmem = kmalloc(sizeof(struct crash_mem) +
> > +			sizeof(struct crash_mem_range) * nr_ranges, GFP_KERNEL);
> > +	if (!cmem)
> > +		return -ENOMEM;
> > +
> > +	cmem->max_nr_ranges = nr_ranges;
> > +	cmem->nr_ranges = 0;
> > +	for_each_mem_range(i, &memblock.memory, NULL, NUMA_NO_NODE, 0,
> > +							&start, &end, NULL) {
> > +		cmem->ranges[cmem->nr_ranges].start = start;
> > +		cmem->ranges[cmem->nr_ranges].end = end - 1;
> > +		cmem->nr_ranges++;
> > +	}
> > +
> > +	/* Exclude crashkernel region */
> > +	ret = crash_exclude_mem_range(cmem, crashk_res.start, crashk_res.end);
> 
> 
> > +	if (ret)
> > +		goto out;
> > +
> > +	ret =  crash_prepare_elf64_headers(cmem, true, addr, sz);
> > +
> > +out:
> 
> Nit: You could save the goto if you wrote this as:
> |	if (!ret)
> |		ret = crash_prepare_elf64_headers(cmem, true, addr, sz);

OK.
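
To illustrate why nr_ranges starts at 1 in the quoted function: cutting a hole
out of the middle of one memory range turns it into two, so the array needs
one spare slot. The following is a simplified user-space sketch of that
behaviour, not the kernel's crash_exclude_mem_range(); the struct names,
MAX_RANGES and exclude_range() are made up for the example.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MAX_RANGES 8	/* arbitrary capacity for this sketch */

struct range { unsigned long long start, end; };	/* end is inclusive */

struct range_list {
	size_t max_nr;			/* usable slots, <= MAX_RANGES */
	size_t nr;			/* slots currently in use */
	struct range r[MAX_RANGES];
};

/* Remove [xs, xe] from every range in the list that overlaps it. */
int exclude_range(struct range_list *l, unsigned long long xs, unsigned long long xe)
{
	size_t i = 0;

	while (i < l->nr) {
		struct range *cur = &l->r[i];

		if (xe < cur->start || xs > cur->end) {
			i++;				/* no overlap */
		} else if (xs <= cur->start && xe >= cur->end) {
			/* fully covered: drop the entry and re-check slot i */
			memmove(cur, cur + 1, (l->nr - i - 1) * sizeof(*cur));
			l->nr--;
		} else if (xs <= cur->start) {
			cur->start = xe + 1;		/* trim the front */
			i++;
		} else if (xe >= cur->end) {
			cur->end = xs - 1;		/* trim the tail */
			i++;
		} else {
			/* hole in the middle: one range becomes two */
			if (l->nr == l->max_nr)
				return -ENOMEM;
			memmove(cur + 2, cur + 1, (l->nr - i - 1) * sizeof(*cur));
			cur[1].start = xe + 1;
			cur[1].end = cur->end;
			cur->end = xs - 1;
			l->nr++;
			i += 2;
		}
	}
	return 0;
}
```

With two memory ranges and max_nr set to 3, excluding a region inside the
first range succeeds and leaves three ranges; with no spare slot the same call
returns -ENOMEM.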

> > +	kfree(cmem);
> > +	return ret;
> > +}
> > +
> >  int load_other_segments(struct kimage *image,
> >  			unsigned long kernel_load_addr,
> >  			unsigned long kernel_size,
> > @@ -139,11 +219,43 @@ int load_other_segments(struct kimage *image,
> >  			char *cmdline, unsigned long cmdline_len)
> >  {
> >  	struct kexec_buf kbuf;
> > +	void *hdrs_addr;
> > +	unsigned long hdrs_sz;
> >  	unsigned long initrd_load_addr = 0;
> >  	char *dtb = NULL;
> >  	unsigned long dtb_len = 0;
> >  	int ret = 0;
> >  
> > +	/* load elf core header */
> > +	if (image->type == KEXEC_TYPE_CRASH) {
> > +		ret = prepare_elf_headers(&hdrs_addr, &hdrs_sz);
> > +		if (ret) {
> > +			pr_err("Preparing elf core header failed\n");
> > +			goto out_err;
> > +		}
> > +
> > +		kbuf.image = image;
> > +		kbuf.buffer = hdrs_addr;
> > +		kbuf.bufsz = hdrs_sz;
> > +		kbuf.memsz = hdrs_sz;
> 
> > +		kbuf.buf_align = PAGE_SIZE;
> 
> Whose PAGE_SIZE?
> 
> Won't this break if the kdump kernel is 64K pages, but the first kernel uses 4K?
> Should we change this to the largest supported PAGE_SIZE: SZ_64K?

Ah, yes.
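
A small illustration of the point, assuming the usual power-of-two page sizes
of 4K, 16K and 64K: an address aligned to the largest size is aligned for all
of them, while a 4K-aligned address need not be 64K-aligned. The macros and
helpers below are defined locally for the sketch and are not taken from the
patch.

```c
#include <stdbool.h>

#define SZ_4K	0x1000UL
#define SZ_16K	0x4000UL
#define SZ_64K	0x10000UL

/* Round addr up to the next multiple of align (align must be a power of two). */
unsigned long align_up(unsigned long addr, unsigned long align)
{
	return (addr + align - 1) & ~(align - 1);
}

/* True if addr is a multiple of align (align must be a power of two). */
bool is_aligned(unsigned long addr, unsigned long align)
{
	return (addr & (align - 1)) == 0;
}
```

So using SZ_64K for buf_align costs at most 64K of padding but keeps the
header placement valid whichever page size the crash dump kernel uses.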

> > +		kbuf.buf_min = crashk_res.start;
> > +		kbuf.buf_max = crashk_res.end + 1;
> > +		kbuf.top_down = true;
> > +
> > +		ret = kexec_add_buffer(&kbuf);
> > +		if (ret) {
> > +			vfree(hdrs_addr);
> > +			goto out_err;
> > +		}
> > +		image->arch.elf_headers = hdrs_addr;
> > +		image->arch.elf_headers_sz = hdrs_sz;
> > +		image->arch.elf_load_addr = kbuf.mem;
> > +
> > +		pr_debug("Loaded elf core header at 0x%lx bufsz=0x%lx memsz=0x%lx\n",
> > +				 image->arch.elf_load_addr, hdrs_sz, hdrs_sz);
> > +	}
> > +
> >  	kbuf.image = image;
> >  	/* not allocate anything below the kernel */
> >  	kbuf.buf_min = kernel_load_addr + kernel_size;
> 
> 
> I think the initramfs can escape the crash kernel range because you add to the
> buf_max region:
> |	/* within 1GB-aligned window of up to 32GB in size */
> |	kbuf.buf_max = round_down(kernel_load_addr, SZ_1G)
> |				 + (unsigned long)SZ_1G * 32;

No worries.
kexec_add_buffer() will limit the search only within crashk_res anyway.

On the other hand, the code:
> > +	if (image->type == KEXEC_TYPE_CRASH) {
                (snip)
> > +		kbuf.buf_min = crashk_res.start;
> > +		kbuf.buf_max = crashk_res.end + 1;
can be misleading. I will fix it as follows:
|		kbuf.buf_min = kernel_load_addr + kernel_size;
|		kbuf.buf_max = ULONG_MAX;
(and likewise, will fix image_load().)

Thank you again for your valuable comments.
Are you reviewing other patches in my v11?
If not, I will post v12 tomorrow.

-Takahiro AKASHI

> 
> I think we need a helper to clamp these min/max ranges to within the crash
> kernel range, as it needs doing in a few places.
> 
> 
> Thanks,
> 
> James

* Re: [PATCH v11 11/15] arm64: kexec_file: add crash dump support
  2018-07-23  5:39     ` AKASHI Takahiro
@ 2018-07-23 17:04       ` James Morse
  0 siblings, 0 replies; 38+ messages in thread
From: James Morse @ 2018-07-23 17:04 UTC (permalink / raw)
  To: AKASHI Takahiro
  Cc: catalin.marinas, will.deacon, dhowells, vgoyal, herbert, davem,
	dyoung, bhe, arnd, ard.biesheuvel, bhsharma, kexec,
	linux-arm-kernel, linux-kernel

Hi Akashi,

On 23/07/18 06:39, AKASHI Takahiro wrote:
> On Wed, Jul 18, 2018 at 05:50:22PM +0100, James Morse wrote:
>> On 11/07/18 08:41, AKASHI Takahiro wrote:
>>> Enabling crash dump (kdump) includes
>>> * prepare contents of ELF header of a core dump file, /proc/vmcore,
>>>   using crash_prepare_elf64_headers(), and
>>> * add two device tree properties, "linux,usable-memory-range" and
>>>   "linux,elfcorehdr", which represent respectively a memory range
>>>   to be used by crash dump kernel and the header's location
>>
>>> diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
>>> index 69333694e3e2..eeb5766928b0 100644
>>> --- a/arch/arm64/include/asm/kexec.h
>>> +++ b/arch/arm64/include/asm/kexec.h
>>> @@ -99,6 +99,10 @@ static inline void crash_post_resume(void) {}
>>>  struct kimage_arch {
>>>  	phys_addr_t dtb_mem;
>>>  	void *dtb_buf;
>>> +	/* Core ELF header buffer */
>>
>>> +	void *elf_headers;
>>
>> Shouldn't this be a phys_addr_t if it comes from kbuf.mem?
> 
> Do you mean elf_load_addr? You're right.
> But kexec_buf defines mem as unsigned long and so I'd rather change
> dtb_mem to unsigned long instead of elf_load_addr, which will also be
> renamed to elf_headers_mem for clarification.

>> (dtb_mem is, and the type tells us which way round the runtime/kexec-time
>> pointers are)

My preference would be for physical addresses to always be phys_addr_t, but as
long as we can easily tell kexec-time addresses from runtime addresses,
it will save us from bugs where we use the wrong one.


>>> diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
>>> index a0b44fe18b95..261564df7210 100644
>>> --- a/arch/arm64/kernel/machine_kexec_file.c
>>> +++ b/arch/arm64/kernel/machine_kexec_file.c
>>> @@ -132,6 +173,45 @@ static int setup_dtb(struct kimage *image,

>>> +		kbuf.buf_min = crashk_res.start;
>>> +		kbuf.buf_max = crashk_res.end + 1;
>>> +		kbuf.top_down = true;
>>> +
>>> +		ret = kexec_add_buffer(&kbuf);
>>> +		if (ret) {
>>> +			vfree(hdrs_addr);
>>> +			goto out_err;
>>> +		}
>>> +		image->arch.elf_headers = hdrs_addr;
>>> +		image->arch.elf_headers_sz = hdrs_sz;
>>> +		image->arch.elf_load_addr = kbuf.mem;
>>> +
>>> +		pr_debug("Loaded elf core header at 0x%lx bufsz=0x%lx memsz=0x%lx\n",
>>> +				 image->arch.elf_load_addr, hdrs_sz, hdrs_sz);
>>> +	}
>>> +
>>>  	kbuf.image = image;
>>>  	/* not allocate anything below the kernel */
>>>  	kbuf.buf_min = kernel_load_addr + kernel_size;

>> I think the initramfs can escape the crash kernel range because you add to the
>> buf_max region:
>> |	/* within 1GB-aligned window of up to 32GB in size */
>> |	kbuf.buf_max = round_down(kernel_load_addr, SZ_1G)
>> |				 + (unsigned long)SZ_1G * 32;
> 
> No worries.
> kexec_add_buffer() will limit the search only within crashk_res anyway.

via arch_kexec_walk_mem()? Got it.

But strangely the buf_min and buf_max still matter because
locate_mem_hole_callback() uses them.


> Are you reviewing other patches in my v11?
> If not, I will post v12 tomorrow.

No, (I try to batch replies to avoid that happening).
I'm reading up on Secure-boot and trying to test the pe_verification stuff...


Thanks,

James

end of thread, other threads:[~2018-07-23 17:04 UTC | newest]

Thread overview: 38+ messages
2018-07-11  7:41 [PATCH v11 00/15] subject: arm64: kexec: add kexec_file_load() support AKASHI Takahiro
2018-07-11  7:41 ` [PATCH v11 01/15] asm-generic: add kexec_file_load system call to unistd.h AKASHI Takahiro
2018-07-11  7:41 ` [PATCH v11 02/15] kexec_file: make kexec_image_post_load_cleanup_default() global AKASHI Takahiro
2018-07-11  7:41 ` [PATCH v11 03/15] powerpc, kexec_file: factor out memblock-based arch_kexec_walk_mem() AKASHI Takahiro
2018-07-14  1:52   ` Dave Young
2018-07-16 11:04     ` James Morse
2018-07-16 12:24       ` Dave Young
2018-07-17  5:31         ` AKASHI Takahiro
2018-07-17  7:49           ` Dave Young
2018-07-18  5:38             ` AKASHI Takahiro
2018-07-18  6:13               ` Dave Young
2018-07-18  6:40                 ` AKASHI Takahiro
2018-07-18  6:45                   ` Dave Young
2018-07-20  5:33                     ` AKASHI Takahiro
2018-07-20  5:57                       ` Dave Young
2018-07-20  6:25                         ` AKASHI Takahiro
2018-07-16 12:26   ` Dave Young
2018-07-18 16:52     ` James Morse
2018-07-19  2:23       ` Dave Young
2018-07-11  7:41 ` [PATCH v11 04/15] kexec_file: kexec_walk_memblock() only walks a dedicated region at kdump AKASHI Takahiro
2018-07-11  7:41 ` [PATCH v11 05/15] of/fdt: add helper functions for handling properties AKASHI Takahiro
2018-07-11  7:41 ` [PATCH v11 06/15] arm64: add image head flag definitions AKASHI Takahiro
2018-07-11  7:41 ` [PATCH v11 07/15] arm64: cpufeature: add MMFR0 helper functions AKASHI Takahiro
2018-07-11  7:41 ` [PATCH v11 08/15] arm64: enable KEXEC_FILE config AKASHI Takahiro
2018-07-11  7:41 ` [PATCH v11 09/15] arm64: kexec_file: load initrd and device-tree AKASHI Takahiro
2018-07-17 16:57   ` James Morse
2018-07-18  5:56     ` AKASHI Takahiro
2018-07-11  7:41 ` [PATCH v11 10/15] arm64: kexec_file: allow for loading Image-format kernel AKASHI Takahiro
2018-07-18 16:47   ` James Morse
2018-07-20  6:14     ` AKASHI Takahiro
2018-07-11  7:41 ` [PATCH v11 11/15] arm64: kexec_file: add crash dump support AKASHI Takahiro
2018-07-18 16:50   ` James Morse
2018-07-23  5:39     ` AKASHI Takahiro
2018-07-23 17:04       ` James Morse
2018-07-11  7:42 ` [PATCH v11 12/15] arm64: kexec_file: invoke the kernel without purgatory AKASHI Takahiro
2018-07-11  7:42 ` [PATCH v11 13/15] include: pe.h: remove message[] from mz header definition AKASHI Takahiro
2018-07-11  7:42 ` [PATCH v11 14/15] arm64: kexec_file: add kernel signature verification support AKASHI Takahiro
2018-07-11  7:42 ` [PATCH v11 15/15] arm64: kexec_file: add kaslr support AKASHI Takahiro