* [PATCH v8 00/13] arm64: kexec: add kexec_file_load() support
From: AKASHI Takahiro @ 2018-02-22 11:17 UTC
  To: catalin.marinas, will.deacon, bauerman, dhowells, vgoyal,
	herbert, davem, akpm, mpe, dyoung, bhe, arnd, ard.biesheuvel,
	julien.thierry
  Cc: kexec, linux-arm-kernel, linux-kernel, AKASHI Takahiro

This is the eighth round of implementing kexec_file_load() support
on arm64.[1]
Most of the code is based on kexec-tools (along with some kernel code
from x86, which also came from kexec-tools).


This patch series enables us to
  * load the kernel by specifying its file descriptor, instead of a
    user-filled buffer, via the kexec_file_load() system call, and
  * optionally verify its signature at load time for trusted boot.

Unlike the kexec_load() system call, and as we discussed a long time ago,
users are not allowed to provide a device tree to the 2nd kernel
explicitly; instead, the device tree blob of the first kernel is re-used
internally.

To use the kexec_file_load() system call instead of kexec_load(), the '-s'
option must be specified on the kexec command line. See [2] for the
necessary kexec-tools patch.
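
For example, with the patched kexec-tools from [2], a typical invocation
might look like this (the kernel and initrd paths are placeholders):

    $ kexec -s -l /boot/Image --initrd=/boot/initramfs.img --reuse-cmdline
    $ kexec -e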

To analyze a generated crash dump file, use the latest master branch of
the crash utility[3] against a v4.16-rc kernel. I always try to submit
patches to fix any inconsistencies introduced by the latest kernel.

Regarding kernel image verification, a signature must be presented
along with the binary itself. A signature is basically a hash value
calculated over the whole binary data and encrypted with a key that
will be authenticated by one of the system's trusted certificates.
Any attempt to read and load a to-be-kexec-ed kernel image through
the system call will be checked and blocked if the binary's hash value
doesn't match its associated signature.

There are two methods available now:
1. implementing an arch-specific verification hook for kexec_file_load()
2. utilizing the IMA (Integrity Measurement Architecture)[4] appraisal
   framework

Before v7, I believed that my patch only supported (1), but I am now
confident that (2) comes for free if IMA is enabled and properly
configured.


(1) Arch-specific verification hook
If CONFIG_KEXEC_VERIFY_SIG is enabled, kexec_file_load() invokes an arch-
defined (and hence file-format-specific) hook function to check the
validity of the kernel binary.
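
For reference, such a hook dispatches to the image loader's verify_sig()
callback, much like the existing x86 code that patch #3 moves around; a
minimal sketch (the actual arm64 hook is added in patch #13):

    int arch_kexec_kernel_verify_sig(struct kimage *image, void *kernel,
                                     unsigned long kernel_len)
    {
            /* reject images whose loader cannot verify a signature */
            if (!image->fops || !image->fops->verify_sig)
                    return -EKEYREJECTED;

            return image->fops->verify_sig(kernel, kernel_len);
    }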

On x86, a signature is embedded into the header of the binary as a PE
(Microsoft's Portable Executable) file. Since arm64's "Image" can also be
seen as a PE file as long as CONFIG_EFI is enabled, we adopt this format
for kernel signing.

As in the case of UEFI applications, we can create a signed kernel image:
    $ sbsign --key ${KEY} --cert ${CERT} Image

You may want to use certs/signing_key.pem, which is intended to be used
for module signing (CONFIG_MODULE_SIG), as ${KEY} and ${CERT} for test
purposes.
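
The result can then be checked with sbverify from the same sbsigntools
package (a sketch, assuming sbsign's default output name, Image.signed):

    $ sbverify --cert ${CERT} Image.signed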


(2) IMA appraisal-based
IMA was first introduced to Linux in order to meet the TCG (Trusted
Computing Group) requirement that all sensitive files be *measured* before
being read/executed, to detect any untrusted changes/modifications.
The appraisal feature, which allows us to ensure the integrity of files
and even prevent them from being read/executed, was added later.

Meanwhile, kexec_file_load(), merged in v3.17, has evolved to enable
IMA-appraisal-type verification with commit b804defe4297 ("kexec:
replace call to copy_file_from_fd() with kernel version").

In this scheme, a signature is stored in an extended file attribute,
"security.ima", while the corresponding public key is held in a dedicated
keyring, ".ima" or "_ima". All the necessary verification steps are
confined to a secure API, kernel_read_file_from_fd(), called by
kexec_file_load().
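
Once a file has been signed (see step 3 below), the stored signature can
be inspected on the target filesystem, for example (the path is a
placeholder):

    $ getfattr -m security.ima -e hex -d /your/Image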

    Please note that powerpc is one of the two architectures now
    supporting KEXEC_FILE, and that it wishes to extend IMA so that
    a signature may be appended to the "vmlinux" file[5], like module
    signing, instead of being stored in an extended file attribute.

While IMA is meant to be used with a TPM (Trusted Platform Module) on a
secure platform, it is still usable without one. Here is an example
procedure for trying out the feature using a self-signed root CA for
demo/test purposes:

 1) Generate the needed keys and certificates, following the "Generate
    trusted keys" section in the README of ima-evm-utils[6].

 2) Build the kernel with the following configuration options, specifying
    "ima-local-ca.pem" for CONFIG_SYSTEM_TRUSTED_KEYS:
	CONFIG_EXT4_FS_SECURITY
	CONFIG_INTEGRITY_SIGNATURE
	CONFIG_INTEGRITY_ASYMMETRIC_KEYS
	CONFIG_INTEGRITY_TRUSTED_KEYRING
	CONFIG_IMA
	CONFIG_IMA_WRITE_POLICY
	CONFIG_IMA_READ_POLICY
	CONFIG_IMA_APPRAISE
	CONFIG_IMA_APPRAISE_BOOTPARAM
	CONFIG_SYSTEM_TRUSTED_KEYS
    Please note that CONFIG_KEXEC_VERIFY_SIG is not, and actually should
    not be, enabled.

 3) Sign (label) the kernel image binary to be kexec-ed on the target
    filesystem:
    $ evmctl ima_sign --key /path/to/private_key.pem /your/Image

 4) Add a command line parameter and boot the kernel:
    ima_appraise=enforce

 On the live system:
 5) Set a security policy:
    $ mount -t securityfs none /sys/kernel/security
    $ echo "appraise func=KEXEC_KERNEL_CHECK appraise_type=imasig" \
      > /sys/kernel/security/ima/policy

 6) Add a key for IMA:
    $ keyctl padd asymmetric my_ima_key %:.ima < /path/to/x509_ima.der
    (or evmctl import /path/to/x509_ima.der <ima_keyring_id>)

 7) Then try kexec as normal.
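
 To double-check the signature from userspace before rebooting, evmctl can
 be used as well (a sketch, assuming the public key generated in step 1):
    $ evmctl ima_verify --key /path/to/x509_ima.der /your/Image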


Concerns (or future work):
* Even if the kernel is configured with CONFIG_RANDOMIZE_BASE, the 2nd
  kernel won't be placed at a randomized address. We will have to
  add some boot code, similar to the EFI stub, to implement the
  randomization.
For approach (1):
* While a big-endian kernel can support kernel signing, I'm not sure that
  such an Image can be recognized as PE format, because the x86 standard
  only defines a little-endian-based format.
* vmlinux support

  [1] http://git.linaro.org/people/takahiro.akashi/linux-aarch64.git
	branch:arm64/kexec_file
  [2] http://git.linaro.org/people/takahiro.akashi/kexec-tools.git
	branch:arm64/kexec_file
  [3] http://github.com/crash-utility/crash.git
  [4] https://sourceforge.net/p/linux-ima/wiki/Home/
  [5] http://lkml.iu.edu//hypermail/linux/kernel/1707.0/03669.html
  [6] https://sourceforge.net/p/linux-ima/ima-evm-utils/ci/master/tree/


Changes in v8 (Feb 22, 2018)
* introduce ARCH_HAS_KEXEC_PURGATORY so that arm64 will be able to skip
  purgatory
* remove "ifdef CONFIG_X86_64" stuffs from a re-factored function,
  prepare_elf64_headers(), making its interface more generic
  (The original patch was split into two for easier reviews.)
* modify cpu_soft_restart() so as to let the 2nd kernel jump into its entry
  code directly without requiring purgatory in case of kexec_file_load
* remove CONFIG_KEXEC_FILE_IMAGE_FMT and introduce
  CONFIG_KEXEC_IMAGE_VERIFY_SIG, much similar to x86 but quite redundant
  for now.
* In addition, update/modify dependencies of KEXEC_IMAGE_VERIFY_SIG

Changes in v7 (Dec 4, 2017)
* rebased to v4.15-rc2
* re-organize the patch set to separate KEXEC_FILE_VERIFY_SIG-related
  code from the others
* revamp the factored-out code in kernel/kexec_file.c due to changes
  in the original x86 code
* redefine the walk_system_ram_res_rev() prototype due to a change of the
  callback type in its counterpart, walk_system_ram_res()
* make KEXEC_FILE_IMAGE_FMT default to on if KEXEC_FILE is selected

Changes in v6 (Oct 24, 2017)
* fix a for-loop bug in _kexec_kernel_image_probe() per Julien

Changes in v5 (Oct 10, 2017)
* fix kbuild errors around patch #3
Per Julien's comments:
* fix a bug in walk_system_ram_res_rev() with some cleanup
* modify fdt_setprop_range() to use vmalloc()
* modify fill_property() to use memset()

Changes in v4 (Oct 2, 2017)
* reinstate x86's arch_kexec_kernel_image_load()
* rename weak arch_kexec_kernel_xxx() to _kexec_kernel_xxx() for
  better re-use
* constify kexec_file_loaders[]

Changes in v3 (Sep 15, 2017)
* fix kbuild test error
* factor out arch_kexec_kernel_*() & arch_kimage_file_post_load_cleanup()
* remove CONFIG_CRASH_CORE guard from kexec_file.c
* add vmapped kernel region to vmcore for gdb backtracing
  (see prepare_elf64_headers())
* merge asm/kexec_file.h into asm/kexec.h
* and some cleanups

Changes in v2 (Sep 8, 2017)
* move core-header-related functions from crash_core.c to kexec_file.c
* drop hash-check code from purgatory
* modify purgatory asm to remove arch_kexec_apply_relocations_add()
* drop older kernel support
* drop vmlinux support (at least, for this series)


Patches #1 to #11 are the essential part of KEXEC_FILE support
(additionally allowing for IMA-based verification):
  Patches #1 to #6 are preparatory patches on the generic side.
  Patches #7 to #11 enable kexec_file_load on arm64.

Patches #12 and #13 add KEXEC_VERIFY_SIG (arch-specific verification)
support.

AKASHI Takahiro (13):
  resource: add walk_system_ram_res_rev()
  kexec_file: make use of purgatory optional
  kexec_file,x86,powerpc: factor out kexec_file_ops functions
  x86: kexec_file: factor out elf core header related functions
  kexec_file, x86: move re-factored code to generic side
  asm-generic: add kexec_file_load system call to unistd.h
  arm64: kexec_file: invoke the kernel without purgatory
  arm64: kexec_file: load initrd and device-tree
  arm64: kexec_file: add crash dump support
  arm64: kexec_file: add Image format support
  arm64: kexec_file: enable KEXEC_FILE config
  include: pe.h: remove message[] from mz header definition
  arm64: kexec_file: enable KEXEC_VERIFY_SIG for Image

 arch/arm64/Kconfig                          |  34 +++
 arch/arm64/include/asm/kexec.h              |  90 +++++++
 arch/arm64/kernel/Makefile                  |   3 +-
 arch/arm64/kernel/cpu-reset.S               |   6 +-
 arch/arm64/kernel/kexec_image.c             | 105 ++++++++
 arch/arm64/kernel/machine_kexec.c           |  11 +-
 arch/arm64/kernel/machine_kexec_file.c      | 401 ++++++++++++++++++++++++++++
 arch/arm64/kernel/relocate_kernel.S         |   3 +-
 arch/powerpc/Kconfig                        |   3 +
 arch/powerpc/include/asm/kexec.h            |   2 +-
 arch/powerpc/kernel/kexec_elf_64.c          |   2 +-
 arch/powerpc/kernel/machine_kexec_file_64.c |  39 +--
 arch/x86/Kconfig                            |   3 +
 arch/x86/include/asm/kexec-bzimage64.h      |   2 +-
 arch/x86/kernel/crash.c                     | 332 +++++------------------
 arch/x86/kernel/kexec-bzimage64.c           |   2 +-
 arch/x86/kernel/machine_kexec_64.c          |  45 +---
 include/linux/ioport.h                      |   3 +
 include/linux/kexec.h                       |  34 ++-
 include/linux/pe.h                          |   2 +-
 include/uapi/asm-generic/unistd.h           |   4 +-
 kernel/kexec_file.c                         | 238 ++++++++++++++++-
 kernel/resource.c                           |  57 ++++
 23 files changed, 1046 insertions(+), 375 deletions(-)
 create mode 100644 arch/arm64/kernel/kexec_image.c
 create mode 100644 arch/arm64/kernel/machine_kexec_file.c

-- 
2.16.2


* [PATCH v8 01/13] resource: add walk_system_ram_res_rev()
From: AKASHI Takahiro @ 2018-02-22 11:17 UTC
  To: catalin.marinas, will.deacon, bauerman, dhowells, vgoyal,
	herbert, davem, akpm, mpe, dyoung, bhe, arnd, ard.biesheuvel,
	julien.thierry
  Cc: kexec, linux-arm-kernel, linux-kernel, AKASHI Takahiro, Linus Torvalds

This function, a variant of walk_system_ram_res() introduced in
commit 8c86e70acead ("resource: provide new functions to walk through
resources"), walks through the list of all System RAM resources in
reverse order, i.e., from higher to lower addresses.

It will be used in the kexec_file implementation on arm64.
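
A minimal caller-side sketch (not part of this patch; the helper below is
made up for illustration, while the real user is the arm64 kexec_file
buffer allocator added later in this series):

    /* callback: record the end of the first (i.e. highest) region seen */
    static int find_topmost_ram(struct resource *res, void *arg)
    {
            u64 *top = arg;

            *top = res->end;
            return 1;       /* a non-zero return value stops the walk */
    }

    static u64 highest_system_ram_end(void)
    {
            u64 top = 0;

            walk_system_ram_res_rev(0, ULONG_MAX, &top, find_topmost_ram);
            return top;
    }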

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
---
 include/linux/ioport.h |  3 +++
 kernel/resource.c      | 57 ++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 60 insertions(+)

diff --git a/include/linux/ioport.h b/include/linux/ioport.h
index da0ebaec25f0..f12d95fe038b 100644
--- a/include/linux/ioport.h
+++ b/include/linux/ioport.h
@@ -277,6 +277,9 @@ extern int
 walk_system_ram_res(u64 start, u64 end, void *arg,
 		    int (*func)(struct resource *, void *));
 extern int
+walk_system_ram_res_rev(u64 start, u64 end, void *arg,
+			int (*func)(struct resource *, void *));
+extern int
 walk_iomem_res_desc(unsigned long desc, unsigned long flags, u64 start, u64 end,
 		    void *arg, int (*func)(struct resource *, void *));
 
diff --git a/kernel/resource.c b/kernel/resource.c
index e270b5048988..bdaa93407f4c 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -23,6 +23,8 @@
 #include <linux/pfn.h>
 #include <linux/mm.h>
 #include <linux/resource_ext.h>
+#include <linux/string.h>
+#include <linux/vmalloc.h>
 #include <asm/io.h>
 
 
@@ -486,6 +488,61 @@ int walk_mem_res(u64 start, u64 end, void *arg,
 				     arg, func);
 }
 
+int walk_system_ram_res_rev(u64 start, u64 end, void *arg,
+				int (*func)(struct resource *, void *))
+{
+	struct resource res, *rams;
+	int rams_size = 16, i;
+	int ret = -1;
+
+	/* create a list */
+	rams = vmalloc(sizeof(struct resource) * rams_size);
+	if (!rams)
+		return ret;
+
+	res.start = start;
+	res.end = end;
+	res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
+	i = 0;
+	while ((res.start < res.end) &&
+		(!find_next_iomem_res(&res, IORES_DESC_NONE, true))) {
+		if (i >= rams_size) {
+			/* re-alloc */
+			struct resource *rams_new;
+			int rams_new_size;
+
+			rams_new_size = rams_size + 16;
+			rams_new = vmalloc(sizeof(struct resource)
+							* rams_new_size);
+			if (!rams_new)
+				goto out;
+
+			memcpy(rams_new, rams,
+					sizeof(struct resource) * rams_size);
+			vfree(rams);
+			rams = rams_new;
+			rams_size = rams_new_size;
+		}
+
+		rams[i].start = res.start;
+		rams[i++].end = res.end;
+
+		res.start = res.end + 1;
+		res.end = end;
+	}
+
+	/* go reverse */
+	for (i--; i >= 0; i--) {
+		ret = (*func)(&rams[i], arg);
+		if (ret)
+			break;
+	}
+
+out:
+	vfree(rams);
+	return ret;
+}
+
 #if !defined(CONFIG_ARCH_HAS_WALK_MEMORY)
 
 /*
-- 
2.16.2


* [PATCH v8 02/13] kexec_file: make use of purgatory optional
From: AKASHI Takahiro @ 2018-02-22 11:17 UTC
  To: catalin.marinas, will.deacon, bauerman, dhowells, vgoyal,
	herbert, davem, akpm, mpe, dyoung, bhe, arnd, ard.biesheuvel,
	julien.thierry
  Cc: kexec, linux-arm-kernel, linux-kernel, AKASHI Takahiro

On arm64, no trampoline code between the old kernel and the new kernel
will be required in the kexec_file implementation. This patch introduces
a new configuration option, ARCH_HAS_KEXEC_PURGATORY, and allows the
related code to be compiled in only if necessary.

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
---
 arch/powerpc/Kconfig | 3 +++
 arch/x86/Kconfig     | 3 +++
 kernel/kexec_file.c  | 6 ++++++
 3 files changed, 12 insertions(+)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 73ce5dd07642..c32a181a7cbb 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -552,6 +552,9 @@ config KEXEC_FILE
 	  for kernel and initramfs as opposed to a list of segments as is the
 	  case for the older kexec call.
 
+config ARCH_HAS_KEXEC_PURGATORY
+	def_bool KEXEC_FILE
+
 config RELOCATABLE
 	bool "Build a relocatable kernel"
 	depends on PPC64 || (FLATMEM && (44x || FSL_BOOKE))
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index c1236b187824..f031c3efe47e 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2019,6 +2019,9 @@ config KEXEC_FILE
 	  for kernel and initramfs as opposed to list of segments as
 	  accepted by previous system call.
 
+config ARCH_HAS_KEXEC_PURGATORY
+	def_bool KEXEC_FILE
+
 config KEXEC_VERIFY_SIG
 	bool "Verify kernel signature during kexec_file_load() syscall"
 	depends on KEXEC_FILE
diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index e5bcd94c1efb..990adae52151 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -26,7 +26,11 @@
 #include <linux/vmalloc.h>
 #include "kexec_internal.h"
 
+#ifdef CONFIG_ARCH_HAS_KEXEC_PURGATORY
 static int kexec_calculate_store_digests(struct kimage *image);
+#else
+static int kexec_calculate_store_digests(struct kimage *image) { return 0; };
+#endif
 
 /* Architectures can provide this probe function */
 int __weak arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
@@ -520,6 +524,7 @@ int kexec_add_buffer(struct kexec_buf *kbuf)
 	return 0;
 }
 
+#ifdef CONFIG_ARCH_HAS_KEXEC_PURGATORY
 /* Calculate and store the digest of segments */
 static int kexec_calculate_store_digests(struct kimage *image)
 {
@@ -1022,3 +1027,4 @@ int kexec_purgatory_get_set_symbol(struct kimage *image, const char *name,
 
 	return 0;
 }
+#endif /* CONFIG_ARCH_HAS_KEXEC_PURGATORY */
-- 
2.16.2


* [PATCH v8 03/13] kexec_file,x86,powerpc: factor out kexec_file_ops functions
From: AKASHI Takahiro @ 2018-02-22 11:17 UTC
  To: catalin.marinas, will.deacon, bauerman, dhowells, vgoyal,
	herbert, davem, akpm, mpe, dyoung, bhe, arnd, ard.biesheuvel,
	julien.thierry
  Cc: kexec, linux-arm-kernel, linux-kernel, AKASHI Takahiro

As arch_kexec_kernel_*_{probe,load}(), arch_kimage_file_post_load_cleanup()
and arch_kexec_kernel_verify_sig() can be parameterized with a
kexec_file_ops array and are now duplicated among some architectures,
let's factor them out.
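
For illustration, the factored-out probe helper can be written roughly as
follows, replaying the per-arch loops removed below over the now
NULL-terminated kexec_file_loaders[] array (a sketch only; the actual code
lives in the kernel/kexec_file.c hunk of this patch):

    int _kexec_kernel_image_probe(struct kimage *image, void *buf,
                                  unsigned long buf_len)
    {
            const struct kexec_file_ops * const *fops;
            int ret = -ENOEXEC;

            /* try each registered loader until one accepts the image */
            for (fops = kexec_file_loaders; *fops; fops++) {
                    if (!(*fops)->probe)
                            continue;

                    ret = (*fops)->probe(buf, buf_len);
                    if (!ret) {
                            image->fops = *fops;
                            break;
                    }
            }

            return ret;
    }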

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/kexec.h            |  2 +-
 arch/powerpc/kernel/kexec_elf_64.c          |  2 +-
 arch/powerpc/kernel/machine_kexec_file_64.c | 39 ++------------------
 arch/x86/include/asm/kexec-bzimage64.h      |  2 +-
 arch/x86/kernel/kexec-bzimage64.c           |  2 +-
 arch/x86/kernel/machine_kexec_64.c          | 45 +----------------------
 include/linux/kexec.h                       | 15 ++++----
 kernel/kexec_file.c                         | 57 +++++++++++++++++++++++++++--
 8 files changed, 70 insertions(+), 94 deletions(-)

diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index d8b1e8e7e035..4a585cba1787 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -95,7 +95,7 @@ static inline bool kdump_in_progress(void)
 }
 
 #ifdef CONFIG_KEXEC_FILE
-extern struct kexec_file_ops kexec_elf64_ops;
+extern const struct kexec_file_ops kexec_elf64_ops;
 
 #ifdef CONFIG_IMA_KEXEC
 #define ARCH_HAS_KIMAGE_ARCH
diff --git a/arch/powerpc/kernel/kexec_elf_64.c b/arch/powerpc/kernel/kexec_elf_64.c
index 9a42309b091a..6c78c11c7faf 100644
--- a/arch/powerpc/kernel/kexec_elf_64.c
+++ b/arch/powerpc/kernel/kexec_elf_64.c
@@ -657,7 +657,7 @@ static void *elf64_load(struct kimage *image, char *kernel_buf,
 	return ret ? ERR_PTR(ret) : fdt;
 }
 
-struct kexec_file_ops kexec_elf64_ops = {
+const struct kexec_file_ops kexec_elf64_ops = {
 	.probe = elf64_probe,
 	.load = elf64_load,
 };
diff --git a/arch/powerpc/kernel/machine_kexec_file_64.c b/arch/powerpc/kernel/machine_kexec_file_64.c
index e4395f937d63..a27ec647350c 100644
--- a/arch/powerpc/kernel/machine_kexec_file_64.c
+++ b/arch/powerpc/kernel/machine_kexec_file_64.c
@@ -31,52 +31,19 @@
 
 #define SLAVE_CODE_SIZE		256
 
-static struct kexec_file_ops *kexec_file_loaders[] = {
+const struct kexec_file_ops * const kexec_file_loaders[] = {
 	&kexec_elf64_ops,
+	NULL
 };
 
 int arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
 				  unsigned long buf_len)
 {
-	int i, ret = -ENOEXEC;
-	struct kexec_file_ops *fops;
-
 	/* We don't support crash kernels yet. */
 	if (image->type == KEXEC_TYPE_CRASH)
 		return -ENOTSUPP;
 
-	for (i = 0; i < ARRAY_SIZE(kexec_file_loaders); i++) {
-		fops = kexec_file_loaders[i];
-		if (!fops || !fops->probe)
-			continue;
-
-		ret = fops->probe(buf, buf_len);
-		if (!ret) {
-			image->fops = fops;
-			return ret;
-		}
-	}
-
-	return ret;
-}
-
-void *arch_kexec_kernel_image_load(struct kimage *image)
-{
-	if (!image->fops || !image->fops->load)
-		return ERR_PTR(-ENOEXEC);
-
-	return image->fops->load(image, image->kernel_buf,
-				 image->kernel_buf_len, image->initrd_buf,
-				 image->initrd_buf_len, image->cmdline_buf,
-				 image->cmdline_buf_len);
-}
-
-int arch_kimage_file_post_load_cleanup(struct kimage *image)
-{
-	if (!image->fops || !image->fops->cleanup)
-		return 0;
-
-	return image->fops->cleanup(image->image_loader_data);
+	return _kexec_kernel_image_probe(image, buf, buf_len);
 }
 
 /**
diff --git a/arch/x86/include/asm/kexec-bzimage64.h b/arch/x86/include/asm/kexec-bzimage64.h
index 9f07cff43705..df89ee7d3e9e 100644
--- a/arch/x86/include/asm/kexec-bzimage64.h
+++ b/arch/x86/include/asm/kexec-bzimage64.h
@@ -2,6 +2,6 @@
 #ifndef _ASM_KEXEC_BZIMAGE64_H
 #define _ASM_KEXEC_BZIMAGE64_H
 
-extern struct kexec_file_ops kexec_bzImage64_ops;
+extern const struct kexec_file_ops kexec_bzImage64_ops;
 
 #endif  /* _ASM_KEXE_BZIMAGE64_H */
diff --git a/arch/x86/kernel/kexec-bzimage64.c b/arch/x86/kernel/kexec-bzimage64.c
index fb095ba0c02f..705654776c0c 100644
--- a/arch/x86/kernel/kexec-bzimage64.c
+++ b/arch/x86/kernel/kexec-bzimage64.c
@@ -538,7 +538,7 @@ static int bzImage64_verify_sig(const char *kernel, unsigned long kernel_len)
 }
 #endif
 
-struct kexec_file_ops kexec_bzImage64_ops = {
+const struct kexec_file_ops kexec_bzImage64_ops = {
 	.probe = bzImage64_probe,
 	.load = bzImage64_load,
 	.cleanup = bzImage64_cleanup,
diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index 1f790cf9d38f..2cdd29d64181 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -30,8 +30,9 @@
 #include <asm/set_memory.h>
 
 #ifdef CONFIG_KEXEC_FILE
-static struct kexec_file_ops *kexec_file_loaders[] = {
+const struct kexec_file_ops * const kexec_file_loaders[] = {
 		&kexec_bzImage64_ops,
+		NULL
 };
 #endif
 
@@ -363,27 +364,6 @@ void arch_crash_save_vmcoreinfo(void)
 /* arch-dependent functionality related to kexec file-based syscall */
 
 #ifdef CONFIG_KEXEC_FILE
-int arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
-				  unsigned long buf_len)
-{
-	int i, ret = -ENOEXEC;
-	struct kexec_file_ops *fops;
-
-	for (i = 0; i < ARRAY_SIZE(kexec_file_loaders); i++) {
-		fops = kexec_file_loaders[i];
-		if (!fops || !fops->probe)
-			continue;
-
-		ret = fops->probe(buf, buf_len);
-		if (!ret) {
-			image->fops = fops;
-			return ret;
-		}
-	}
-
-	return ret;
-}
-
 void *arch_kexec_kernel_image_load(struct kimage *image)
 {
 	vfree(image->arch.elf_headers);
@@ -398,27 +378,6 @@ void *arch_kexec_kernel_image_load(struct kimage *image)
 				 image->cmdline_buf_len);
 }
 
-int arch_kimage_file_post_load_cleanup(struct kimage *image)
-{
-	if (!image->fops || !image->fops->cleanup)
-		return 0;
-
-	return image->fops->cleanup(image->image_loader_data);
-}
-
-#ifdef CONFIG_KEXEC_VERIFY_SIG
-int arch_kexec_kernel_verify_sig(struct kimage *image, void *kernel,
-				 unsigned long kernel_len)
-{
-	if (!image->fops || !image->fops->verify_sig) {
-		pr_debug("kernel loader does not support signature verification.");
-		return -EKEYREJECTED;
-	}
-
-	return image->fops->verify_sig(kernel, kernel_len);
-}
-#endif
-
 /*
  * Apply purgatory relocations.
  *
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index f16f6ceb3875..325980537125 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -209,7 +209,7 @@ struct kimage {
 	unsigned long cmdline_buf_len;
 
 	/* File operations provided by image loader */
-	struct kexec_file_ops *fops;
+	const struct kexec_file_ops *fops;
 
 	/* Image loader handling the kernel can store a pointer here */
 	void *image_loader_data;
@@ -277,12 +277,13 @@ int crash_shrink_memory(unsigned long new_size);
 size_t crash_get_memory_size(void);
 void crash_free_reserved_phys_range(unsigned long begin, unsigned long end);
 
-int __weak arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
-					 unsigned long buf_len);
-void * __weak arch_kexec_kernel_image_load(struct kimage *image);
-int __weak arch_kimage_file_post_load_cleanup(struct kimage *image);
-int __weak arch_kexec_kernel_verify_sig(struct kimage *image, void *buf,
-					unsigned long buf_len);
+int _kexec_kernel_image_probe(struct kimage *image, void *buf,
+			      unsigned long buf_len);
+void *_kexec_kernel_image_load(struct kimage *image);
+int _kimage_file_post_load_cleanup(struct kimage *image);
+int _kexec_kernel_verify_sig(struct kimage *image, void *buf,
+			     unsigned long buf_len);
+
 int __weak arch_kexec_apply_relocations_add(const Elf_Ehdr *ehdr,
 					Elf_Shdr *sechdrs, unsigned int relsec);
 int __weak arch_kexec_apply_relocations(const Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index 990adae52151..a6d14a768b3e 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -26,34 +26,83 @@
 #include <linux/vmalloc.h>
 #include "kexec_internal.h"
 
+const __weak struct kexec_file_ops * const kexec_file_loaders[] = {NULL};
+
 #ifdef CONFIG_ARCH_HAS_KEXEC_PURGATORY
 static int kexec_calculate_store_digests(struct kimage *image);
 #else
 static int kexec_calculate_store_digests(struct kimage *image) { return 0; };
 #endif
 
+int _kexec_kernel_image_probe(struct kimage *image, void *buf,
+			     unsigned long buf_len)
+{
+	const struct kexec_file_ops * const *fops;
+	int ret = -ENOEXEC;
+
+	for (fops = &kexec_file_loaders[0]; *fops && (*fops)->probe; ++fops) {
+		ret = (*fops)->probe(buf, buf_len);
+		if (!ret) {
+			image->fops = *fops;
+			return ret;
+		}
+	}
+
+	return ret;
+}
+
 /* Architectures can provide this probe function */
 int __weak arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
 					 unsigned long buf_len)
 {
-	return -ENOEXEC;
+	return _kexec_kernel_image_probe(image, buf, buf_len);
+}
+
+void *_kexec_kernel_image_load(struct kimage *image)
+{
+	if (!image->fops || !image->fops->load)
+		return ERR_PTR(-ENOEXEC);
+
+	return image->fops->load(image, image->kernel_buf,
+				 image->kernel_buf_len, image->initrd_buf,
+				 image->initrd_buf_len, image->cmdline_buf,
+				 image->cmdline_buf_len);
 }
 
 void * __weak arch_kexec_kernel_image_load(struct kimage *image)
 {
-	return ERR_PTR(-ENOEXEC);
+	return _kexec_kernel_image_load(image);
+}
+
+int _kimage_file_post_load_cleanup(struct kimage *image)
+{
+	if (!image->fops || !image->fops->cleanup)
+		return 0;
+
+	return image->fops->cleanup(image->image_loader_data);
 }
 
 int __weak arch_kimage_file_post_load_cleanup(struct kimage *image)
 {
-	return -EINVAL;
+	return _kimage_file_post_load_cleanup(image);
 }
 
 #ifdef CONFIG_KEXEC_VERIFY_SIG
+int _kexec_kernel_verify_sig(struct kimage *image, void *buf,
+			    unsigned long buf_len)
+{
+	if (!image->fops || !image->fops->verify_sig) {
+		pr_debug("kernel loader does not support signature verification.\n");
+		return -EKEYREJECTED;
+	}
+
+	return image->fops->verify_sig(buf, buf_len);
+}
+
 int __weak arch_kexec_kernel_verify_sig(struct kimage *image, void *buf,
 					unsigned long buf_len)
 {
-	return -EKEYREJECTED;
+	return _kexec_kernel_verify_sig(image, buf, buf_len);
 }
 #endif
 
-- 
2.16.2

^ permalink raw reply related	[flat|nested] 102+ messages in thread

* [PATCH v8 03/13] kexec_file, x86, powerpc: factor out kexec_file_ops functions
@ 2018-02-22 11:17   ` AKASHI Takahiro
  0 siblings, 0 replies; 102+ messages in thread
From: AKASHI Takahiro @ 2018-02-22 11:17 UTC (permalink / raw)
  To: linux-arm-kernel

As arch_kexec_kernel_image_{probe,load}(), arch_kimage_file_post_load_cleanup()
and arch_kexec_kernel_verify_sig() can all be parameterized with a
kexec_file_ops array and are now duplicated among several architectures,
let's factor them out.
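
As a rough sketch (not part of this patch), an architecture then only needs
to publish a NULL-terminated loader table and let the __weak arch_* hooks
fall through to the common walkers added in kernel/kexec_file.c below;
"kexec_image_ops" is a hypothetical loader name used only for illustration:

/*
 * Illustrative sketch only: export the architecture's loaders and rely on
 * the generic _kexec_kernel_image_probe()/_load()/... helpers.
 */
#include <linux/kexec.h>

extern const struct kexec_file_ops kexec_image_ops;	/* hypothetical loader */

const struct kexec_file_ops * const kexec_file_loaders[] = {
	&kexec_image_ops,
	NULL		/* terminator expected by the generic probe loop */
};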

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/kexec.h            |  2 +-
 arch/powerpc/kernel/kexec_elf_64.c          |  2 +-
 arch/powerpc/kernel/machine_kexec_file_64.c | 39 ++------------------
 arch/x86/include/asm/kexec-bzimage64.h      |  2 +-
 arch/x86/kernel/kexec-bzimage64.c           |  2 +-
 arch/x86/kernel/machine_kexec_64.c          | 45 +----------------------
 include/linux/kexec.h                       | 15 ++++----
 kernel/kexec_file.c                         | 57 +++++++++++++++++++++++++++--
 8 files changed, 70 insertions(+), 94 deletions(-)

diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index d8b1e8e7e035..4a585cba1787 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -95,7 +95,7 @@ static inline bool kdump_in_progress(void)
 }
 
 #ifdef CONFIG_KEXEC_FILE
-extern struct kexec_file_ops kexec_elf64_ops;
+extern const struct kexec_file_ops kexec_elf64_ops;
 
 #ifdef CONFIG_IMA_KEXEC
 #define ARCH_HAS_KIMAGE_ARCH
diff --git a/arch/powerpc/kernel/kexec_elf_64.c b/arch/powerpc/kernel/kexec_elf_64.c
index 9a42309b091a..6c78c11c7faf 100644
--- a/arch/powerpc/kernel/kexec_elf_64.c
+++ b/arch/powerpc/kernel/kexec_elf_64.c
@@ -657,7 +657,7 @@ static void *elf64_load(struct kimage *image, char *kernel_buf,
 	return ret ? ERR_PTR(ret) : fdt;
 }
 
-struct kexec_file_ops kexec_elf64_ops = {
+const struct kexec_file_ops kexec_elf64_ops = {
 	.probe = elf64_probe,
 	.load = elf64_load,
 };
diff --git a/arch/powerpc/kernel/machine_kexec_file_64.c b/arch/powerpc/kernel/machine_kexec_file_64.c
index e4395f937d63..a27ec647350c 100644
--- a/arch/powerpc/kernel/machine_kexec_file_64.c
+++ b/arch/powerpc/kernel/machine_kexec_file_64.c
@@ -31,52 +31,19 @@
 
 #define SLAVE_CODE_SIZE		256
 
-static struct kexec_file_ops *kexec_file_loaders[] = {
+const struct kexec_file_ops * const kexec_file_loaders[] = {
 	&kexec_elf64_ops,
+	NULL
 };
 
 int arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
 				  unsigned long buf_len)
 {
-	int i, ret = -ENOEXEC;
-	struct kexec_file_ops *fops;
-
 	/* We don't support crash kernels yet. */
 	if (image->type == KEXEC_TYPE_CRASH)
 		return -ENOTSUPP;
 
-	for (i = 0; i < ARRAY_SIZE(kexec_file_loaders); i++) {
-		fops = kexec_file_loaders[i];
-		if (!fops || !fops->probe)
-			continue;
-
-		ret = fops->probe(buf, buf_len);
-		if (!ret) {
-			image->fops = fops;
-			return ret;
-		}
-	}
-
-	return ret;
-}
-
-void *arch_kexec_kernel_image_load(struct kimage *image)
-{
-	if (!image->fops || !image->fops->load)
-		return ERR_PTR(-ENOEXEC);
-
-	return image->fops->load(image, image->kernel_buf,
-				 image->kernel_buf_len, image->initrd_buf,
-				 image->initrd_buf_len, image->cmdline_buf,
-				 image->cmdline_buf_len);
-}
-
-int arch_kimage_file_post_load_cleanup(struct kimage *image)
-{
-	if (!image->fops || !image->fops->cleanup)
-		return 0;
-
-	return image->fops->cleanup(image->image_loader_data);
+	return _kexec_kernel_image_probe(image, buf, buf_len);
 }
 
 /**
diff --git a/arch/x86/include/asm/kexec-bzimage64.h b/arch/x86/include/asm/kexec-bzimage64.h
index 9f07cff43705..df89ee7d3e9e 100644
--- a/arch/x86/include/asm/kexec-bzimage64.h
+++ b/arch/x86/include/asm/kexec-bzimage64.h
@@ -2,6 +2,6 @@
 #ifndef _ASM_KEXEC_BZIMAGE64_H
 #define _ASM_KEXEC_BZIMAGE64_H
 
-extern struct kexec_file_ops kexec_bzImage64_ops;
+extern const struct kexec_file_ops kexec_bzImage64_ops;
 
 #endif  /* _ASM_KEXE_BZIMAGE64_H */
diff --git a/arch/x86/kernel/kexec-bzimage64.c b/arch/x86/kernel/kexec-bzimage64.c
index fb095ba0c02f..705654776c0c 100644
--- a/arch/x86/kernel/kexec-bzimage64.c
+++ b/arch/x86/kernel/kexec-bzimage64.c
@@ -538,7 +538,7 @@ static int bzImage64_verify_sig(const char *kernel, unsigned long kernel_len)
 }
 #endif
 
-struct kexec_file_ops kexec_bzImage64_ops = {
+const struct kexec_file_ops kexec_bzImage64_ops = {
 	.probe = bzImage64_probe,
 	.load = bzImage64_load,
 	.cleanup = bzImage64_cleanup,
diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index 1f790cf9d38f..2cdd29d64181 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -30,8 +30,9 @@
 #include <asm/set_memory.h>
 
 #ifdef CONFIG_KEXEC_FILE
-static struct kexec_file_ops *kexec_file_loaders[] = {
+const struct kexec_file_ops * const kexec_file_loaders[] = {
 		&kexec_bzImage64_ops,
+		NULL
 };
 #endif
 
@@ -363,27 +364,6 @@ void arch_crash_save_vmcoreinfo(void)
 /* arch-dependent functionality related to kexec file-based syscall */
 
 #ifdef CONFIG_KEXEC_FILE
-int arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
-				  unsigned long buf_len)
-{
-	int i, ret = -ENOEXEC;
-	struct kexec_file_ops *fops;
-
-	for (i = 0; i < ARRAY_SIZE(kexec_file_loaders); i++) {
-		fops = kexec_file_loaders[i];
-		if (!fops || !fops->probe)
-			continue;
-
-		ret = fops->probe(buf, buf_len);
-		if (!ret) {
-			image->fops = fops;
-			return ret;
-		}
-	}
-
-	return ret;
-}
-
 void *arch_kexec_kernel_image_load(struct kimage *image)
 {
 	vfree(image->arch.elf_headers);
@@ -398,27 +378,6 @@ void *arch_kexec_kernel_image_load(struct kimage *image)
 				 image->cmdline_buf_len);
 }
 
-int arch_kimage_file_post_load_cleanup(struct kimage *image)
-{
-	if (!image->fops || !image->fops->cleanup)
-		return 0;
-
-	return image->fops->cleanup(image->image_loader_data);
-}
-
-#ifdef CONFIG_KEXEC_VERIFY_SIG
-int arch_kexec_kernel_verify_sig(struct kimage *image, void *kernel,
-				 unsigned long kernel_len)
-{
-	if (!image->fops || !image->fops->verify_sig) {
-		pr_debug("kernel loader does not support signature verification.");
-		return -EKEYREJECTED;
-	}
-
-	return image->fops->verify_sig(kernel, kernel_len);
-}
-#endif
-
 /*
  * Apply purgatory relocations.
  *
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index f16f6ceb3875..325980537125 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -209,7 +209,7 @@ struct kimage {
 	unsigned long cmdline_buf_len;
 
 	/* File operations provided by image loader */
-	struct kexec_file_ops *fops;
+	const struct kexec_file_ops *fops;
 
 	/* Image loader handling the kernel can store a pointer here */
 	void *image_loader_data;
@@ -277,12 +277,13 @@ int crash_shrink_memory(unsigned long new_size);
 size_t crash_get_memory_size(void);
 void crash_free_reserved_phys_range(unsigned long begin, unsigned long end);
 
-int __weak arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
-					 unsigned long buf_len);
-void * __weak arch_kexec_kernel_image_load(struct kimage *image);
-int __weak arch_kimage_file_post_load_cleanup(struct kimage *image);
-int __weak arch_kexec_kernel_verify_sig(struct kimage *image, void *buf,
-					unsigned long buf_len);
+int _kexec_kernel_image_probe(struct kimage *image, void *buf,
+			      unsigned long buf_len);
+void *_kexec_kernel_image_load(struct kimage *image);
+int _kimage_file_post_load_cleanup(struct kimage *image);
+int _kexec_kernel_verify_sig(struct kimage *image, void *buf,
+			     unsigned long buf_len);
+
 int __weak arch_kexec_apply_relocations_add(const Elf_Ehdr *ehdr,
 					Elf_Shdr *sechdrs, unsigned int relsec);
 int __weak arch_kexec_apply_relocations(const Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index 990adae52151..a6d14a768b3e 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -26,34 +26,83 @@
 #include <linux/vmalloc.h>
 #include "kexec_internal.h"
 
+const __weak struct kexec_file_ops * const kexec_file_loaders[] = {NULL};
+
 #ifdef CONFIG_ARCH_HAS_KEXEC_PURGATORY
 static int kexec_calculate_store_digests(struct kimage *image);
 #else
 static int kexec_calculate_store_digests(struct kimage *image) { return 0; };
 #endif
 
+int _kexec_kernel_image_probe(struct kimage *image, void *buf,
+			     unsigned long buf_len)
+{
+	const struct kexec_file_ops * const *fops;
+	int ret = -ENOEXEC;
+
+	for (fops = &kexec_file_loaders[0]; *fops && (*fops)->probe; ++fops) {
+		ret = (*fops)->probe(buf, buf_len);
+		if (!ret) {
+			image->fops = *fops;
+			return ret;
+		}
+	}
+
+	return ret;
+}
+
 /* Architectures can provide this probe function */
 int __weak arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
 					 unsigned long buf_len)
 {
-	return -ENOEXEC;
+	return _kexec_kernel_image_probe(image, buf, buf_len);
+}
+
+void *_kexec_kernel_image_load(struct kimage *image)
+{
+	if (!image->fops || !image->fops->load)
+		return ERR_PTR(-ENOEXEC);
+
+	return image->fops->load(image, image->kernel_buf,
+				 image->kernel_buf_len, image->initrd_buf,
+				 image->initrd_buf_len, image->cmdline_buf,
+				 image->cmdline_buf_len);
 }
 
 void * __weak arch_kexec_kernel_image_load(struct kimage *image)
 {
-	return ERR_PTR(-ENOEXEC);
+	return _kexec_kernel_image_load(image);
+}
+
+int _kimage_file_post_load_cleanup(struct kimage *image)
+{
+	if (!image->fops || !image->fops->cleanup)
+		return 0;
+
+	return image->fops->cleanup(image->image_loader_data);
 }
 
 int __weak arch_kimage_file_post_load_cleanup(struct kimage *image)
 {
-	return -EINVAL;
+	return _kimage_file_post_load_cleanup(image);
 }
 
 #ifdef CONFIG_KEXEC_VERIFY_SIG
+int _kexec_kernel_verify_sig(struct kimage *image, void *buf,
+			    unsigned long buf_len)
+{
+	if (!image->fops || !image->fops->verify_sig) {
+		pr_debug("kernel loader does not support signature verification.\n");
+		return -EKEYREJECTED;
+	}
+
+	return image->fops->verify_sig(buf, buf_len);
+}
+
 int __weak arch_kexec_kernel_verify_sig(struct kimage *image, void *buf,
 					unsigned long buf_len)
 {
-	return -EKEYREJECTED;
+	return _kexec_kernel_verify_sig(image, buf, buf_len);
 }
 #endif
 
-- 
2.16.2

^ permalink raw reply related	[flat|nested] 102+ messages in thread

* [PATCH v8 03/13] kexec_file, x86, powerpc: factor out kexec_file_ops functions
@ 2018-02-22 11:17   ` AKASHI Takahiro
  0 siblings, 0 replies; 102+ messages in thread
From: AKASHI Takahiro @ 2018-02-22 11:17 UTC (permalink / raw)
  To: catalin.marinas, will.deacon, bauerman, dhowells, vgoyal,
	herbert, davem, akpm, mpe, dyoung, bhe, arnd, ard.biesheuvel,
	julien.thierry
  Cc: AKASHI Takahiro, kexec, linux-kernel, linux-arm-kernel

As arch_kexec_kernel_image_{probe,load}(), arch_kimage_file_post_load_cleanup()
and arch_kexec_kernel_verify_sig() can all be parameterized with a
kexec_file_ops array and are now duplicated among several architectures,
let's factor them out.

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/kexec.h            |  2 +-
 arch/powerpc/kernel/kexec_elf_64.c          |  2 +-
 arch/powerpc/kernel/machine_kexec_file_64.c | 39 ++------------------
 arch/x86/include/asm/kexec-bzimage64.h      |  2 +-
 arch/x86/kernel/kexec-bzimage64.c           |  2 +-
 arch/x86/kernel/machine_kexec_64.c          | 45 +----------------------
 include/linux/kexec.h                       | 15 ++++----
 kernel/kexec_file.c                         | 57 +++++++++++++++++++++++++++--
 8 files changed, 70 insertions(+), 94 deletions(-)

diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index d8b1e8e7e035..4a585cba1787 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -95,7 +95,7 @@ static inline bool kdump_in_progress(void)
 }
 
 #ifdef CONFIG_KEXEC_FILE
-extern struct kexec_file_ops kexec_elf64_ops;
+extern const struct kexec_file_ops kexec_elf64_ops;
 
 #ifdef CONFIG_IMA_KEXEC
 #define ARCH_HAS_KIMAGE_ARCH
diff --git a/arch/powerpc/kernel/kexec_elf_64.c b/arch/powerpc/kernel/kexec_elf_64.c
index 9a42309b091a..6c78c11c7faf 100644
--- a/arch/powerpc/kernel/kexec_elf_64.c
+++ b/arch/powerpc/kernel/kexec_elf_64.c
@@ -657,7 +657,7 @@ static void *elf64_load(struct kimage *image, char *kernel_buf,
 	return ret ? ERR_PTR(ret) : fdt;
 }
 
-struct kexec_file_ops kexec_elf64_ops = {
+const struct kexec_file_ops kexec_elf64_ops = {
 	.probe = elf64_probe,
 	.load = elf64_load,
 };
diff --git a/arch/powerpc/kernel/machine_kexec_file_64.c b/arch/powerpc/kernel/machine_kexec_file_64.c
index e4395f937d63..a27ec647350c 100644
--- a/arch/powerpc/kernel/machine_kexec_file_64.c
+++ b/arch/powerpc/kernel/machine_kexec_file_64.c
@@ -31,52 +31,19 @@
 
 #define SLAVE_CODE_SIZE		256
 
-static struct kexec_file_ops *kexec_file_loaders[] = {
+const struct kexec_file_ops * const kexec_file_loaders[] = {
 	&kexec_elf64_ops,
+	NULL
 };
 
 int arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
 				  unsigned long buf_len)
 {
-	int i, ret = -ENOEXEC;
-	struct kexec_file_ops *fops;
-
 	/* We don't support crash kernels yet. */
 	if (image->type == KEXEC_TYPE_CRASH)
 		return -ENOTSUPP;
 
-	for (i = 0; i < ARRAY_SIZE(kexec_file_loaders); i++) {
-		fops = kexec_file_loaders[i];
-		if (!fops || !fops->probe)
-			continue;
-
-		ret = fops->probe(buf, buf_len);
-		if (!ret) {
-			image->fops = fops;
-			return ret;
-		}
-	}
-
-	return ret;
-}
-
-void *arch_kexec_kernel_image_load(struct kimage *image)
-{
-	if (!image->fops || !image->fops->load)
-		return ERR_PTR(-ENOEXEC);
-
-	return image->fops->load(image, image->kernel_buf,
-				 image->kernel_buf_len, image->initrd_buf,
-				 image->initrd_buf_len, image->cmdline_buf,
-				 image->cmdline_buf_len);
-}
-
-int arch_kimage_file_post_load_cleanup(struct kimage *image)
-{
-	if (!image->fops || !image->fops->cleanup)
-		return 0;
-
-	return image->fops->cleanup(image->image_loader_data);
+	return _kexec_kernel_image_probe(image, buf, buf_len);
 }
 
 /**
diff --git a/arch/x86/include/asm/kexec-bzimage64.h b/arch/x86/include/asm/kexec-bzimage64.h
index 9f07cff43705..df89ee7d3e9e 100644
--- a/arch/x86/include/asm/kexec-bzimage64.h
+++ b/arch/x86/include/asm/kexec-bzimage64.h
@@ -2,6 +2,6 @@
 #ifndef _ASM_KEXEC_BZIMAGE64_H
 #define _ASM_KEXEC_BZIMAGE64_H
 
-extern struct kexec_file_ops kexec_bzImage64_ops;
+extern const struct kexec_file_ops kexec_bzImage64_ops;
 
 #endif  /* _ASM_KEXE_BZIMAGE64_H */
diff --git a/arch/x86/kernel/kexec-bzimage64.c b/arch/x86/kernel/kexec-bzimage64.c
index fb095ba0c02f..705654776c0c 100644
--- a/arch/x86/kernel/kexec-bzimage64.c
+++ b/arch/x86/kernel/kexec-bzimage64.c
@@ -538,7 +538,7 @@ static int bzImage64_verify_sig(const char *kernel, unsigned long kernel_len)
 }
 #endif
 
-struct kexec_file_ops kexec_bzImage64_ops = {
+const struct kexec_file_ops kexec_bzImage64_ops = {
 	.probe = bzImage64_probe,
 	.load = bzImage64_load,
 	.cleanup = bzImage64_cleanup,
diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index 1f790cf9d38f..2cdd29d64181 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -30,8 +30,9 @@
 #include <asm/set_memory.h>
 
 #ifdef CONFIG_KEXEC_FILE
-static struct kexec_file_ops *kexec_file_loaders[] = {
+const struct kexec_file_ops * const kexec_file_loaders[] = {
 		&kexec_bzImage64_ops,
+		NULL
 };
 #endif
 
@@ -363,27 +364,6 @@ void arch_crash_save_vmcoreinfo(void)
 /* arch-dependent functionality related to kexec file-based syscall */
 
 #ifdef CONFIG_KEXEC_FILE
-int arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
-				  unsigned long buf_len)
-{
-	int i, ret = -ENOEXEC;
-	struct kexec_file_ops *fops;
-
-	for (i = 0; i < ARRAY_SIZE(kexec_file_loaders); i++) {
-		fops = kexec_file_loaders[i];
-		if (!fops || !fops->probe)
-			continue;
-
-		ret = fops->probe(buf, buf_len);
-		if (!ret) {
-			image->fops = fops;
-			return ret;
-		}
-	}
-
-	return ret;
-}
-
 void *arch_kexec_kernel_image_load(struct kimage *image)
 {
 	vfree(image->arch.elf_headers);
@@ -398,27 +378,6 @@ void *arch_kexec_kernel_image_load(struct kimage *image)
 				 image->cmdline_buf_len);
 }
 
-int arch_kimage_file_post_load_cleanup(struct kimage *image)
-{
-	if (!image->fops || !image->fops->cleanup)
-		return 0;
-
-	return image->fops->cleanup(image->image_loader_data);
-}
-
-#ifdef CONFIG_KEXEC_VERIFY_SIG
-int arch_kexec_kernel_verify_sig(struct kimage *image, void *kernel,
-				 unsigned long kernel_len)
-{
-	if (!image->fops || !image->fops->verify_sig) {
-		pr_debug("kernel loader does not support signature verification.");
-		return -EKEYREJECTED;
-	}
-
-	return image->fops->verify_sig(kernel, kernel_len);
-}
-#endif
-
 /*
  * Apply purgatory relocations.
  *
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index f16f6ceb3875..325980537125 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -209,7 +209,7 @@ struct kimage {
 	unsigned long cmdline_buf_len;
 
 	/* File operations provided by image loader */
-	struct kexec_file_ops *fops;
+	const struct kexec_file_ops *fops;
 
 	/* Image loader handling the kernel can store a pointer here */
 	void *image_loader_data;
@@ -277,12 +277,13 @@ int crash_shrink_memory(unsigned long new_size);
 size_t crash_get_memory_size(void);
 void crash_free_reserved_phys_range(unsigned long begin, unsigned long end);
 
-int __weak arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
-					 unsigned long buf_len);
-void * __weak arch_kexec_kernel_image_load(struct kimage *image);
-int __weak arch_kimage_file_post_load_cleanup(struct kimage *image);
-int __weak arch_kexec_kernel_verify_sig(struct kimage *image, void *buf,
-					unsigned long buf_len);
+int _kexec_kernel_image_probe(struct kimage *image, void *buf,
+			      unsigned long buf_len);
+void *_kexec_kernel_image_load(struct kimage *image);
+int _kimage_file_post_load_cleanup(struct kimage *image);
+int _kexec_kernel_verify_sig(struct kimage *image, void *buf,
+			     unsigned long buf_len);
+
 int __weak arch_kexec_apply_relocations_add(const Elf_Ehdr *ehdr,
 					Elf_Shdr *sechdrs, unsigned int relsec);
 int __weak arch_kexec_apply_relocations(const Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index 990adae52151..a6d14a768b3e 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -26,34 +26,83 @@
 #include <linux/vmalloc.h>
 #include "kexec_internal.h"
 
+const __weak struct kexec_file_ops * const kexec_file_loaders[] = {NULL};
+
 #ifdef CONFIG_ARCH_HAS_KEXEC_PURGATORY
 static int kexec_calculate_store_digests(struct kimage *image);
 #else
 static int kexec_calculate_store_digests(struct kimage *image) { return 0; };
 #endif
 
+int _kexec_kernel_image_probe(struct kimage *image, void *buf,
+			     unsigned long buf_len)
+{
+	const struct kexec_file_ops * const *fops;
+	int ret = -ENOEXEC;
+
+	for (fops = &kexec_file_loaders[0]; *fops && (*fops)->probe; ++fops) {
+		ret = (*fops)->probe(buf, buf_len);
+		if (!ret) {
+			image->fops = *fops;
+			return ret;
+		}
+	}
+
+	return ret;
+}
+
 /* Architectures can provide this probe function */
 int __weak arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
 					 unsigned long buf_len)
 {
-	return -ENOEXEC;
+	return _kexec_kernel_image_probe(image, buf, buf_len);
+}
+
+void *_kexec_kernel_image_load(struct kimage *image)
+{
+	if (!image->fops || !image->fops->load)
+		return ERR_PTR(-ENOEXEC);
+
+	return image->fops->load(image, image->kernel_buf,
+				 image->kernel_buf_len, image->initrd_buf,
+				 image->initrd_buf_len, image->cmdline_buf,
+				 image->cmdline_buf_len);
 }
 
 void * __weak arch_kexec_kernel_image_load(struct kimage *image)
 {
-	return ERR_PTR(-ENOEXEC);
+	return _kexec_kernel_image_load(image);
+}
+
+int _kimage_file_post_load_cleanup(struct kimage *image)
+{
+	if (!image->fops || !image->fops->cleanup)
+		return 0;
+
+	return image->fops->cleanup(image->image_loader_data);
 }
 
 int __weak arch_kimage_file_post_load_cleanup(struct kimage *image)
 {
-	return -EINVAL;
+	return _kimage_file_post_load_cleanup(image);
 }
 
 #ifdef CONFIG_KEXEC_VERIFY_SIG
+int _kexec_kernel_verify_sig(struct kimage *image, void *buf,
+			    unsigned long buf_len)
+{
+	if (!image->fops || !image->fops->verify_sig) {
+		pr_debug("kernel loader does not support signature verification.\n");
+		return -EKEYREJECTED;
+	}
+
+	return image->fops->verify_sig(buf, buf_len);
+}
+
 int __weak arch_kexec_kernel_verify_sig(struct kimage *image, void *buf,
 					unsigned long buf_len)
 {
-	return -EKEYREJECTED;
+	return _kexec_kernel_verify_sig(image, buf, buf_len);
 }
 #endif
 
-- 
2.16.2


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 102+ messages in thread

* [PATCH v8 04/13] x86: kexec_file: factor out elf core header related functions
  2018-02-22 11:17 ` AKASHI Takahiro
@ 2018-02-22 11:17   ` AKASHI Takahiro
  -1 siblings, 0 replies; 102+ messages in thread
From: AKASHI Takahiro @ 2018-02-22 11:17 UTC (permalink / raw)
  To: catalin.marinas, will.deacon, bauerman, dhowells, vgoyal,
	herbert, davem, akpm, mpe, dyoung, bhe, arnd, ard.biesheuvel,
	julien.thierry
  Cc: kexec, linux-arm-kernel, linux-kernel, AKASHI Takahiro

exclude_mem_range() and prepare_elf64_headers() can be re-used by other
architectures, including arm64, as well. So factor them out here so that
they can be moved to the generic side in the next patch.

fill_up_crash_elf_data() could potentially be made common to most
architectures that walk the I/O resources (/proc/iomem) for a list of
"System RAM" regions, but it is left private for now.

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
---
 arch/x86/kernel/crash.c | 235 +++++++++++++++++++++---------------------------
 1 file changed, 103 insertions(+), 132 deletions(-)

diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index 10e74d4778a1..5c19cfbf3b85 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -41,32 +41,14 @@
 /* Alignment required for elf header segment */
 #define ELF_CORE_HEADER_ALIGN   4096
 
-/* This primarily represents number of split ranges due to exclusion */
-#define CRASH_MAX_RANGES	16
-
 struct crash_mem_range {
 	u64 start, end;
 };
 
 struct crash_mem {
-	unsigned int nr_ranges;
-	struct crash_mem_range ranges[CRASH_MAX_RANGES];
-};
-
-/* Misc data about ram ranges needed to prepare elf headers */
-struct crash_elf_data {
-	struct kimage *image;
-	/*
-	 * Total number of ram ranges we have after various adjustments for
-	 * crash reserved region, etc.
-	 */
 	unsigned int max_nr_ranges;
-
-	/* Pointer to elf header */
-	void *ehdr;
-	/* Pointer to next phdr */
-	void *bufp;
-	struct crash_mem mem;
+	unsigned int nr_ranges;
+	struct crash_mem_range ranges[0];
 };
 
 /* Used while preparing memory map entries for second kernel */
@@ -217,29 +199,32 @@ static int get_nr_ram_ranges_callback(struct resource *res, void *arg)
 	return 0;
 }
 
-
 /* Gather all the required information to prepare elf headers for ram regions */
-static void fill_up_crash_elf_data(struct crash_elf_data *ced,
-				   struct kimage *image)
+static struct crash_mem *fill_up_crash_elf_data(void)
 {
 	unsigned int nr_ranges = 0;
-
-	ced->image = image;
+	struct crash_mem *cmem;
 
 	walk_system_ram_res(0, -1, &nr_ranges,
 				get_nr_ram_ranges_callback);
 
-	ced->max_nr_ranges = nr_ranges;
+	/*
+	 * Exclusion of crash region and/or crashk_low_res may cause
+	 * another range split. So add extra two slots here.
+	 */
+	nr_ranges += 2;
+	cmem = vmalloc(sizeof(struct crash_mem) +
+			sizeof(struct crash_mem_range) * nr_ranges);
+	if (!cmem)
+		return NULL;
 
-	/* Exclusion of crash region could split memory ranges */
-	ced->max_nr_ranges++;
+	cmem->max_nr_ranges = nr_ranges;
+	cmem->nr_ranges = 0;
 
-	/* If crashk_low_res is not 0, another range split possible */
-	if (crashk_low_res.end)
-		ced->max_nr_ranges++;
+	return cmem;
 }
 
-static int exclude_mem_range(struct crash_mem *mem,
+static int crash_exclude_mem_range(struct crash_mem *mem,
 		unsigned long long mstart, unsigned long long mend)
 {
 	int i, j;
@@ -293,10 +278,8 @@ static int exclude_mem_range(struct crash_mem *mem,
 		return 0;
 
 	/* Split happened */
-	if (i == CRASH_MAX_RANGES - 1) {
-		pr_err("Too many crash ranges after split\n");
+	if (i == mem->max_nr_ranges - 1)
 		return -ENOMEM;
-	}
 
 	/* Location where new range should go */
 	j = i + 1;
@@ -314,27 +297,20 @@ static int exclude_mem_range(struct crash_mem *mem,
 
 /*
  * Look for any unwanted ranges between mstart, mend and remove them. This
- * might lead to split and split ranges are put in ced->mem.ranges[] array
+ * might lead to split and split ranges are put in cmem->ranges[] array
  */
-static int elf_header_exclude_ranges(struct crash_elf_data *ced,
-		unsigned long long mstart, unsigned long long mend)
+static int elf_header_exclude_ranges(struct crash_mem *cmem)
 {
-	struct crash_mem *cmem = &ced->mem;
 	int ret = 0;
 
-	memset(cmem->ranges, 0, sizeof(cmem->ranges));
-
-	cmem->ranges[0].start = mstart;
-	cmem->ranges[0].end = mend;
-	cmem->nr_ranges = 1;
-
 	/* Exclude crashkernel region */
-	ret = exclude_mem_range(cmem, crashk_res.start, crashk_res.end);
+	ret = crash_exclude_mem_range(cmem, crashk_res.start, crashk_res.end);
 	if (ret)
 		return ret;
 
 	if (crashk_low_res.end) {
-		ret = exclude_mem_range(cmem, crashk_low_res.start, crashk_low_res.end);
+		ret = crash_exclude_mem_range(cmem, crashk_low_res.start,
+							crashk_low_res.end);
 		if (ret)
 			return ret;
 	}
@@ -344,70 +320,29 @@ static int elf_header_exclude_ranges(struct crash_elf_data *ced,
 
 static int prepare_elf64_ram_headers_callback(struct resource *res, void *arg)
 {
-	struct crash_elf_data *ced = arg;
-	Elf64_Ehdr *ehdr;
-	Elf64_Phdr *phdr;
-	unsigned long mstart, mend;
-	struct kimage *image = ced->image;
-	struct crash_mem *cmem;
-	int ret, i;
+	struct crash_mem *cmem = arg;
 
-	ehdr = ced->ehdr;
-
-	/* Exclude unwanted mem ranges */
-	ret = elf_header_exclude_ranges(ced, res->start, res->end);
-	if (ret)
-		return ret;
-
-	/* Go through all the ranges in ced->mem.ranges[] and prepare phdr */
-	cmem = &ced->mem;
-
-	for (i = 0; i < cmem->nr_ranges; i++) {
-		mstart = cmem->ranges[i].start;
-		mend = cmem->ranges[i].end;
-
-		phdr = ced->bufp;
-		ced->bufp += sizeof(Elf64_Phdr);
-
-		phdr->p_type = PT_LOAD;
-		phdr->p_flags = PF_R|PF_W|PF_X;
-		phdr->p_offset  = mstart;
-
-		/*
-		 * If a range matches backup region, adjust offset to backup
-		 * segment.
-		 */
-		if (mstart == image->arch.backup_src_start &&
-		    (mend - mstart + 1) == image->arch.backup_src_sz)
-			phdr->p_offset = image->arch.backup_load_addr;
-
-		phdr->p_paddr = mstart;
-		phdr->p_vaddr = (unsigned long long) __va(mstart);
-		phdr->p_filesz = phdr->p_memsz = mend - mstart + 1;
-		phdr->p_align = 0;
-		ehdr->e_phnum++;
-		pr_debug("Crash PT_LOAD elf header. phdr=%p vaddr=0x%llx, paddr=0x%llx, sz=0x%llx e_phnum=%d p_offset=0x%llx\n",
-			phdr, phdr->p_vaddr, phdr->p_paddr, phdr->p_filesz,
-			ehdr->e_phnum, phdr->p_offset);
-	}
+	cmem->ranges[cmem->nr_ranges].start = res->start;
+	cmem->ranges[cmem->nr_ranges].end = res->end;
+	cmem->nr_ranges++;
 
-	return ret;
+	return 0;
 }
 
-static int prepare_elf64_headers(struct crash_elf_data *ced,
-		void **addr, unsigned long *sz)
+static int crash_prepare_elf64_headers(struct crash_mem *cmem, int kernel_map,
+					void **addr, unsigned long *sz)
 {
 	Elf64_Ehdr *ehdr;
 	Elf64_Phdr *phdr;
 	unsigned long nr_cpus = num_possible_cpus(), nr_phdr, elf_sz;
-	unsigned char *buf, *bufp;
-	unsigned int cpu;
+	unsigned char *buf;
+	unsigned int cpu, i;
 	unsigned long long notes_addr;
-	int ret;
+	unsigned long mstart, mend;
 
 	/* extra phdr for vmcoreinfo elf note */
 	nr_phdr = nr_cpus + 1;
-	nr_phdr += ced->max_nr_ranges;
+	nr_phdr += cmem->nr_ranges;
 
 	/*
 	 * kexec-tools creates an extra PT_LOAD phdr for kernel text mapping
@@ -425,9 +360,8 @@ static int prepare_elf64_headers(struct crash_elf_data *ced,
 	if (!buf)
 		return -ENOMEM;
 
-	bufp = buf;
-	ehdr = (Elf64_Ehdr *)bufp;
-	bufp += sizeof(Elf64_Ehdr);
+	ehdr = (Elf64_Ehdr *)buf;
+	phdr = (Elf64_Phdr *)(ehdr + 1);
 	memcpy(ehdr->e_ident, ELFMAG, SELFMAG);
 	ehdr->e_ident[EI_CLASS] = ELFCLASS64;
 	ehdr->e_ident[EI_DATA] = ELFDATA2LSB;
@@ -443,42 +377,51 @@ static int prepare_elf64_headers(struct crash_elf_data *ced,
 
 	/* Prepare one phdr of type PT_NOTE for each present cpu */
 	for_each_present_cpu(cpu) {
-		phdr = (Elf64_Phdr *)bufp;
-		bufp += sizeof(Elf64_Phdr);
 		phdr->p_type = PT_NOTE;
 		notes_addr = per_cpu_ptr_to_phys(per_cpu_ptr(crash_notes, cpu));
 		phdr->p_offset = phdr->p_paddr = notes_addr;
 		phdr->p_filesz = phdr->p_memsz = sizeof(note_buf_t);
 		(ehdr->e_phnum)++;
+		phdr++;
 	}
 
 	/* Prepare one PT_NOTE header for vmcoreinfo */
-	phdr = (Elf64_Phdr *)bufp;
-	bufp += sizeof(Elf64_Phdr);
 	phdr->p_type = PT_NOTE;
 	phdr->p_offset = phdr->p_paddr = paddr_vmcoreinfo_note();
 	phdr->p_filesz = phdr->p_memsz = VMCOREINFO_NOTE_SIZE;
 	(ehdr->e_phnum)++;
+	phdr++;
 
-#ifdef CONFIG_X86_64
 	/* Prepare PT_LOAD type program header for kernel text region */
-	phdr = (Elf64_Phdr *)bufp;
-	bufp += sizeof(Elf64_Phdr);
-	phdr->p_type = PT_LOAD;
-	phdr->p_flags = PF_R|PF_W|PF_X;
-	phdr->p_vaddr = (Elf64_Addr)_text;
-	phdr->p_filesz = phdr->p_memsz = _end - _text;
-	phdr->p_offset = phdr->p_paddr = __pa_symbol(_text);
-	(ehdr->e_phnum)++;
-#endif
+	if (kernel_map) {
+		phdr->p_type = PT_LOAD;
+		phdr->p_flags = PF_R|PF_W|PF_X;
+		phdr->p_vaddr = (Elf64_Addr)_text;
+		phdr->p_filesz = phdr->p_memsz = _end - _text;
+		phdr->p_offset = phdr->p_paddr = __pa_symbol(_text);
+		ehdr->e_phnum++;
+		phdr++;
+	}
 
-	/* Prepare PT_LOAD headers for system ram chunks. */
-	ced->ehdr = ehdr;
-	ced->bufp = bufp;
-	ret = walk_system_ram_res(0, -1, ced,
-			prepare_elf64_ram_headers_callback);
-	if (ret < 0)
-		return ret;
+	/* Go through all the ranges in cmem->ranges[] and prepare phdr */
+	for (i = 0; i < cmem->nr_ranges; i++) {
+		mstart = cmem->ranges[i].start;
+		mend = cmem->ranges[i].end;
+
+		phdr->p_type = PT_LOAD;
+		phdr->p_flags = PF_R|PF_W|PF_X;
+		phdr->p_offset  = mstart;
+
+		phdr->p_paddr = mstart;
+		phdr->p_vaddr = (unsigned long long) __va(mstart);
+		phdr->p_filesz = phdr->p_memsz = mend - mstart + 1;
+		phdr->p_align = 0;
+		ehdr->e_phnum++;
+		phdr++;
+		pr_debug("Crash PT_LOAD elf header. phdr=%p vaddr=0x%llx, paddr=0x%llx, sz=0x%llx e_phnum=%d p_offset=0x%llx\n",
+			phdr, phdr->p_vaddr, phdr->p_paddr, phdr->p_filesz,
+			ehdr->e_phnum, phdr->p_offset);
+	}
 
 	*addr = buf;
 	*sz = elf_sz;
@@ -489,18 +432,46 @@ static int prepare_elf64_headers(struct crash_elf_data *ced,
 static int prepare_elf_headers(struct kimage *image, void **addr,
 					unsigned long *sz)
 {
-	struct crash_elf_data *ced;
-	int ret;
+	struct crash_mem *cmem;
+	Elf64_Ehdr *ehdr;
+	Elf64_Phdr *phdr;
+	int ret, i;
 
-	ced = kzalloc(sizeof(*ced), GFP_KERNEL);
-	if (!ced)
+	cmem = fill_up_crash_elf_data();
+	if (!cmem)
 		return -ENOMEM;
 
-	fill_up_crash_elf_data(ced, image);
+	ret = walk_system_ram_res(0, -1, cmem,
+				prepare_elf64_ram_headers_callback);
+	if (ret)
+		goto out;
+
+	/* Exclude unwanted mem ranges */
+	ret = elf_header_exclude_ranges(cmem);
+	if (ret)
+		goto out;
 
 	/* By default prepare 64bit headers */
-	ret =  prepare_elf64_headers(ced, addr, sz);
-	kfree(ced);
+	ret =  crash_prepare_elf64_headers(cmem,
+				(int)IS_ENABLED(CONFIG_X86_64), addr, sz);
+	if (ret)
+		goto out;
+
+	/*
+	 * If a range matches backup region, adjust offset to backup
+	 * segment.
+	 */
+	ehdr = (Elf64_Ehdr *)*addr;
+	phdr = (Elf64_Phdr *)(ehdr + 1);
+	for (i = 0; i < ehdr->e_phnum; phdr++, i++)
+		if (phdr->p_type == PT_LOAD &&
+				phdr->p_paddr == image->arch.backup_src_start &&
+				phdr->p_memsz == image->arch.backup_src_sz) {
+			phdr->p_offset = image->arch.backup_load_addr;
+			break;
+		}
+out:
+	vfree(cmem);
 	return ret;
 }
 
@@ -546,14 +517,14 @@ static int memmap_exclude_ranges(struct kimage *image, struct crash_mem *cmem,
 	/* Exclude Backup region */
 	start = image->arch.backup_load_addr;
 	end = start + image->arch.backup_src_sz - 1;
-	ret = exclude_mem_range(cmem, start, end);
+	ret = crash_exclude_mem_range(cmem, start, end);
 	if (ret)
 		return ret;
 
 	/* Exclude elf header region */
 	start = image->arch.elf_load_addr;
 	end = start + image->arch.elf_headers_sz - 1;
-	return exclude_mem_range(cmem, start, end);
+	return crash_exclude_mem_range(cmem, start, end);
 }
 
 /* Prepare memory map for crash dump kernel */
-- 
2.16.2

^ permalink raw reply related	[flat|nested] 102+ messages in thread

* [PATCH v8 04/13] x86: kexec_file: factor out elf core header related functions
@ 2018-02-22 11:17   ` AKASHI Takahiro
  0 siblings, 0 replies; 102+ messages in thread
From: AKASHI Takahiro @ 2018-02-22 11:17 UTC (permalink / raw)
  To: linux-arm-kernel

exclude_mem_range() and prepare_elf64_headers() can be re-used by other
architectures, including arm64, as well. So factor them out here so that
they can be moved to the generic side in the next patch.

fill_up_crash_elf_data() could potentially be made common to most
architectures that walk the I/O resources (/proc/iomem) for a list of
"System RAM" regions, but it is left private for now.

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
---
 arch/x86/kernel/crash.c | 235 +++++++++++++++++++++---------------------------
 1 file changed, 103 insertions(+), 132 deletions(-)

diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index 10e74d4778a1..5c19cfbf3b85 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -41,32 +41,14 @@
 /* Alignment required for elf header segment */
 #define ELF_CORE_HEADER_ALIGN   4096
 
-/* This primarily represents number of split ranges due to exclusion */
-#define CRASH_MAX_RANGES	16
-
 struct crash_mem_range {
 	u64 start, end;
 };
 
 struct crash_mem {
-	unsigned int nr_ranges;
-	struct crash_mem_range ranges[CRASH_MAX_RANGES];
-};
-
-/* Misc data about ram ranges needed to prepare elf headers */
-struct crash_elf_data {
-	struct kimage *image;
-	/*
-	 * Total number of ram ranges we have after various adjustments for
-	 * crash reserved region, etc.
-	 */
 	unsigned int max_nr_ranges;
-
-	/* Pointer to elf header */
-	void *ehdr;
-	/* Pointer to next phdr */
-	void *bufp;
-	struct crash_mem mem;
+	unsigned int nr_ranges;
+	struct crash_mem_range ranges[0];
 };
 
 /* Used while preparing memory map entries for second kernel */
@@ -217,29 +199,32 @@ static int get_nr_ram_ranges_callback(struct resource *res, void *arg)
 	return 0;
 }
 
-
 /* Gather all the required information to prepare elf headers for ram regions */
-static void fill_up_crash_elf_data(struct crash_elf_data *ced,
-				   struct kimage *image)
+static struct crash_mem *fill_up_crash_elf_data(void)
 {
 	unsigned int nr_ranges = 0;
-
-	ced->image = image;
+	struct crash_mem *cmem;
 
 	walk_system_ram_res(0, -1, &nr_ranges,
 				get_nr_ram_ranges_callback);
 
-	ced->max_nr_ranges = nr_ranges;
+	/*
+	 * Exclusion of crash region and/or crashk_low_res may cause
+	 * another range split. So add extra two slots here.
+	 */
+	nr_ranges += 2;
+	cmem = vmalloc(sizeof(struct crash_mem) +
+			sizeof(struct crash_mem_range) * nr_ranges);
+	if (!cmem)
+		return NULL;
 
-	/* Exclusion of crash region could split memory ranges */
-	ced->max_nr_ranges++;
+	cmem->max_nr_ranges = nr_ranges;
+	cmem->nr_ranges = 0;
 
-	/* If crashk_low_res is not 0, another range split possible */
-	if (crashk_low_res.end)
-		ced->max_nr_ranges++;
+	return cmem;
 }
 
-static int exclude_mem_range(struct crash_mem *mem,
+static int crash_exclude_mem_range(struct crash_mem *mem,
 		unsigned long long mstart, unsigned long long mend)
 {
 	int i, j;
@@ -293,10 +278,8 @@ static int exclude_mem_range(struct crash_mem *mem,
 		return 0;
 
 	/* Split happened */
-	if (i == CRASH_MAX_RANGES - 1) {
-		pr_err("Too many crash ranges after split\n");
+	if (i == mem->max_nr_ranges - 1)
 		return -ENOMEM;
-	}
 
 	/* Location where new range should go */
 	j = i + 1;
@@ -314,27 +297,20 @@ static int exclude_mem_range(struct crash_mem *mem,
 
 /*
  * Look for any unwanted ranges between mstart, mend and remove them. This
- * might lead to split and split ranges are put in ced->mem.ranges[] array
+ * might lead to split and split ranges are put in cmem->ranges[] array
  */
-static int elf_header_exclude_ranges(struct crash_elf_data *ced,
-		unsigned long long mstart, unsigned long long mend)
+static int elf_header_exclude_ranges(struct crash_mem *cmem)
 {
-	struct crash_mem *cmem = &ced->mem;
 	int ret = 0;
 
-	memset(cmem->ranges, 0, sizeof(cmem->ranges));
-
-	cmem->ranges[0].start = mstart;
-	cmem->ranges[0].end = mend;
-	cmem->nr_ranges = 1;
-
 	/* Exclude crashkernel region */
-	ret = exclude_mem_range(cmem, crashk_res.start, crashk_res.end);
+	ret = crash_exclude_mem_range(cmem, crashk_res.start, crashk_res.end);
 	if (ret)
 		return ret;
 
 	if (crashk_low_res.end) {
-		ret = exclude_mem_range(cmem, crashk_low_res.start, crashk_low_res.end);
+		ret = crash_exclude_mem_range(cmem, crashk_low_res.start,
+							crashk_low_res.end);
 		if (ret)
 			return ret;
 	}
@@ -344,70 +320,29 @@ static int elf_header_exclude_ranges(struct crash_elf_data *ced,
 
 static int prepare_elf64_ram_headers_callback(struct resource *res, void *arg)
 {
-	struct crash_elf_data *ced = arg;
-	Elf64_Ehdr *ehdr;
-	Elf64_Phdr *phdr;
-	unsigned long mstart, mend;
-	struct kimage *image = ced->image;
-	struct crash_mem *cmem;
-	int ret, i;
+	struct crash_mem *cmem = arg;
 
-	ehdr = ced->ehdr;
-
-	/* Exclude unwanted mem ranges */
-	ret = elf_header_exclude_ranges(ced, res->start, res->end);
-	if (ret)
-		return ret;
-
-	/* Go through all the ranges in ced->mem.ranges[] and prepare phdr */
-	cmem = &ced->mem;
-
-	for (i = 0; i < cmem->nr_ranges; i++) {
-		mstart = cmem->ranges[i].start;
-		mend = cmem->ranges[i].end;
-
-		phdr = ced->bufp;
-		ced->bufp += sizeof(Elf64_Phdr);
-
-		phdr->p_type = PT_LOAD;
-		phdr->p_flags = PF_R|PF_W|PF_X;
-		phdr->p_offset  = mstart;
-
-		/*
-		 * If a range matches backup region, adjust offset to backup
-		 * segment.
-		 */
-		if (mstart == image->arch.backup_src_start &&
-		    (mend - mstart + 1) == image->arch.backup_src_sz)
-			phdr->p_offset = image->arch.backup_load_addr;
-
-		phdr->p_paddr = mstart;
-		phdr->p_vaddr = (unsigned long long) __va(mstart);
-		phdr->p_filesz = phdr->p_memsz = mend - mstart + 1;
-		phdr->p_align = 0;
-		ehdr->e_phnum++;
-		pr_debug("Crash PT_LOAD elf header. phdr=%p vaddr=0x%llx, paddr=0x%llx, sz=0x%llx e_phnum=%d p_offset=0x%llx\n",
-			phdr, phdr->p_vaddr, phdr->p_paddr, phdr->p_filesz,
-			ehdr->e_phnum, phdr->p_offset);
-	}
+	cmem->ranges[cmem->nr_ranges].start = res->start;
+	cmem->ranges[cmem->nr_ranges].end = res->end;
+	cmem->nr_ranges++;
 
-	return ret;
+	return 0;
 }
 
-static int prepare_elf64_headers(struct crash_elf_data *ced,
-		void **addr, unsigned long *sz)
+static int crash_prepare_elf64_headers(struct crash_mem *cmem, int kernel_map,
+					void **addr, unsigned long *sz)
 {
 	Elf64_Ehdr *ehdr;
 	Elf64_Phdr *phdr;
 	unsigned long nr_cpus = num_possible_cpus(), nr_phdr, elf_sz;
-	unsigned char *buf, *bufp;
-	unsigned int cpu;
+	unsigned char *buf;
+	unsigned int cpu, i;
 	unsigned long long notes_addr;
-	int ret;
+	unsigned long mstart, mend;
 
 	/* extra phdr for vmcoreinfo elf note */
 	nr_phdr = nr_cpus + 1;
-	nr_phdr += ced->max_nr_ranges;
+	nr_phdr += cmem->nr_ranges;
 
 	/*
 	 * kexec-tools creates an extra PT_LOAD phdr for kernel text mapping
@@ -425,9 +360,8 @@ static int prepare_elf64_headers(struct crash_elf_data *ced,
 	if (!buf)
 		return -ENOMEM;
 
-	bufp = buf;
-	ehdr = (Elf64_Ehdr *)bufp;
-	bufp += sizeof(Elf64_Ehdr);
+	ehdr = (Elf64_Ehdr *)buf;
+	phdr = (Elf64_Phdr *)(ehdr + 1);
 	memcpy(ehdr->e_ident, ELFMAG, SELFMAG);
 	ehdr->e_ident[EI_CLASS] = ELFCLASS64;
 	ehdr->e_ident[EI_DATA] = ELFDATA2LSB;
@@ -443,42 +377,51 @@ static int prepare_elf64_headers(struct crash_elf_data *ced,
 
 	/* Prepare one phdr of type PT_NOTE for each present cpu */
 	for_each_present_cpu(cpu) {
-		phdr = (Elf64_Phdr *)bufp;
-		bufp += sizeof(Elf64_Phdr);
 		phdr->p_type = PT_NOTE;
 		notes_addr = per_cpu_ptr_to_phys(per_cpu_ptr(crash_notes, cpu));
 		phdr->p_offset = phdr->p_paddr = notes_addr;
 		phdr->p_filesz = phdr->p_memsz = sizeof(note_buf_t);
 		(ehdr->e_phnum)++;
+		phdr++;
 	}
 
 	/* Prepare one PT_NOTE header for vmcoreinfo */
-	phdr = (Elf64_Phdr *)bufp;
-	bufp += sizeof(Elf64_Phdr);
 	phdr->p_type = PT_NOTE;
 	phdr->p_offset = phdr->p_paddr = paddr_vmcoreinfo_note();
 	phdr->p_filesz = phdr->p_memsz = VMCOREINFO_NOTE_SIZE;
 	(ehdr->e_phnum)++;
+	phdr++;
 
-#ifdef CONFIG_X86_64
 	/* Prepare PT_LOAD type program header for kernel text region */
-	phdr = (Elf64_Phdr *)bufp;
-	bufp += sizeof(Elf64_Phdr);
-	phdr->p_type = PT_LOAD;
-	phdr->p_flags = PF_R|PF_W|PF_X;
-	phdr->p_vaddr = (Elf64_Addr)_text;
-	phdr->p_filesz = phdr->p_memsz = _end - _text;
-	phdr->p_offset = phdr->p_paddr = __pa_symbol(_text);
-	(ehdr->e_phnum)++;
-#endif
+	if (kernel_map) {
+		phdr->p_type = PT_LOAD;
+		phdr->p_flags = PF_R|PF_W|PF_X;
+		phdr->p_vaddr = (Elf64_Addr)_text;
+		phdr->p_filesz = phdr->p_memsz = _end - _text;
+		phdr->p_offset = phdr->p_paddr = __pa_symbol(_text);
+		ehdr->e_phnum++;
+		phdr++;
+	}
 
-	/* Prepare PT_LOAD headers for system ram chunks. */
-	ced->ehdr = ehdr;
-	ced->bufp = bufp;
-	ret = walk_system_ram_res(0, -1, ced,
-			prepare_elf64_ram_headers_callback);
-	if (ret < 0)
-		return ret;
+	/* Go through all the ranges in cmem->ranges[] and prepare phdr */
+	for (i = 0; i < cmem->nr_ranges; i++) {
+		mstart = cmem->ranges[i].start;
+		mend = cmem->ranges[i].end;
+
+		phdr->p_type = PT_LOAD;
+		phdr->p_flags = PF_R|PF_W|PF_X;
+		phdr->p_offset  = mstart;
+
+		phdr->p_paddr = mstart;
+		phdr->p_vaddr = (unsigned long long) __va(mstart);
+		phdr->p_filesz = phdr->p_memsz = mend - mstart + 1;
+		phdr->p_align = 0;
+		ehdr->e_phnum++;
+		phdr++;
+		pr_debug("Crash PT_LOAD elf header. phdr=%p vaddr=0x%llx, paddr=0x%llx, sz=0x%llx e_phnum=%d p_offset=0x%llx\n",
+			phdr, phdr->p_vaddr, phdr->p_paddr, phdr->p_filesz,
+			ehdr->e_phnum, phdr->p_offset);
+	}
 
 	*addr = buf;
 	*sz = elf_sz;
@@ -489,18 +432,46 @@ static int prepare_elf64_headers(struct crash_elf_data *ced,
 static int prepare_elf_headers(struct kimage *image, void **addr,
 					unsigned long *sz)
 {
-	struct crash_elf_data *ced;
-	int ret;
+	struct crash_mem *cmem;
+	Elf64_Ehdr *ehdr;
+	Elf64_Phdr *phdr;
+	int ret, i;
 
-	ced = kzalloc(sizeof(*ced), GFP_KERNEL);
-	if (!ced)
+	cmem = fill_up_crash_elf_data();
+	if (!cmem)
 		return -ENOMEM;
 
-	fill_up_crash_elf_data(ced, image);
+	ret = walk_system_ram_res(0, -1, cmem,
+				prepare_elf64_ram_headers_callback);
+	if (ret)
+		goto out;
+
+	/* Exclude unwanted mem ranges */
+	ret = elf_header_exclude_ranges(cmem);
+	if (ret)
+		goto out;
 
 	/* By default prepare 64bit headers */
-	ret =  prepare_elf64_headers(ced, addr, sz);
-	kfree(ced);
+	ret =  crash_prepare_elf64_headers(cmem,
+				(int)IS_ENABLED(CONFIG_X86_64), addr, sz);
+	if (ret)
+		goto out;
+
+	/*
+	 * If a range matches backup region, adjust offset to backup
+	 * segment.
+	 */
+	ehdr = (Elf64_Ehdr *)*addr;
+	phdr = (Elf64_Phdr *)(ehdr + 1);
+	for (i = 0; i < ehdr->e_phnum; phdr++, i++)
+		if (phdr->p_type == PT_LOAD &&
+				phdr->p_paddr == image->arch.backup_src_start &&
+				phdr->p_memsz == image->arch.backup_src_sz) {
+			phdr->p_offset = image->arch.backup_load_addr;
+			break;
+		}
+out:
+	vfree(cmem);
 	return ret;
 }
 
@@ -546,14 +517,14 @@ static int memmap_exclude_ranges(struct kimage *image, struct crash_mem *cmem,
 	/* Exclude Backup region */
 	start = image->arch.backup_load_addr;
 	end = start + image->arch.backup_src_sz - 1;
-	ret = exclude_mem_range(cmem, start, end);
+	ret = crash_exclude_mem_range(cmem, start, end);
 	if (ret)
 		return ret;
 
 	/* Exclude elf header region */
 	start = image->arch.elf_load_addr;
 	end = start + image->arch.elf_headers_sz - 1;
-	return exclude_mem_range(cmem, start, end);
+	return crash_exclude_mem_range(cmem, start, end);
 }
 
 /* Prepare memory map for crash dump kernel */
-- 
2.16.2

^ permalink raw reply related	[flat|nested] 102+ messages in thread

* [PATCH v8 04/13] x86: kexec_file: factor out elf core header related functions
@ 2018-02-22 11:17   ` AKASHI Takahiro
  0 siblings, 0 replies; 102+ messages in thread
From: AKASHI Takahiro @ 2018-02-22 11:17 UTC (permalink / raw)
  To: catalin.marinas, will.deacon, bauerman, dhowells, vgoyal,
	herbert, davem, akpm, mpe, dyoung, bhe, arnd, ard.biesheuvel,
	julien.thierry
  Cc: AKASHI Takahiro, kexec, linux-kernel, linux-arm-kernel

exclude_mem_range() and prepare_elf64_headers() can be re-used by other
architectures, including arm64, as well. So factor them out here so that
they can be moved to the generic side in the next patch.

fill_up_crash_elf_data() could potentially be made common to most
architectures that walk the I/O resources (/proc/iomem) for a list of
"System RAM" regions, but it is left private for now.

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
---
 arch/x86/kernel/crash.c | 235 +++++++++++++++++++++---------------------------
 1 file changed, 103 insertions(+), 132 deletions(-)

diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index 10e74d4778a1..5c19cfbf3b85 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -41,32 +41,14 @@
 /* Alignment required for elf header segment */
 #define ELF_CORE_HEADER_ALIGN   4096
 
-/* This primarily represents number of split ranges due to exclusion */
-#define CRASH_MAX_RANGES	16
-
 struct crash_mem_range {
 	u64 start, end;
 };
 
 struct crash_mem {
-	unsigned int nr_ranges;
-	struct crash_mem_range ranges[CRASH_MAX_RANGES];
-};
-
-/* Misc data about ram ranges needed to prepare elf headers */
-struct crash_elf_data {
-	struct kimage *image;
-	/*
-	 * Total number of ram ranges we have after various adjustments for
-	 * crash reserved region, etc.
-	 */
 	unsigned int max_nr_ranges;
-
-	/* Pointer to elf header */
-	void *ehdr;
-	/* Pointer to next phdr */
-	void *bufp;
-	struct crash_mem mem;
+	unsigned int nr_ranges;
+	struct crash_mem_range ranges[0];
 };
 
 /* Used while preparing memory map entries for second kernel */
@@ -217,29 +199,32 @@ static int get_nr_ram_ranges_callback(struct resource *res, void *arg)
 	return 0;
 }
 
-
 /* Gather all the required information to prepare elf headers for ram regions */
-static void fill_up_crash_elf_data(struct crash_elf_data *ced,
-				   struct kimage *image)
+static struct crash_mem *fill_up_crash_elf_data(void)
 {
 	unsigned int nr_ranges = 0;
-
-	ced->image = image;
+	struct crash_mem *cmem;
 
 	walk_system_ram_res(0, -1, &nr_ranges,
 				get_nr_ram_ranges_callback);
 
-	ced->max_nr_ranges = nr_ranges;
+	/*
+	 * Exclusion of crash region and/or crashk_low_res may cause
+	 * another range split. So add extra two slots here.
+	 */
+	nr_ranges += 2;
+	cmem = vmalloc(sizeof(struct crash_mem) +
+			sizeof(struct crash_mem_range) * nr_ranges);
+	if (!cmem)
+		return NULL;
 
-	/* Exclusion of crash region could split memory ranges */
-	ced->max_nr_ranges++;
+	cmem->max_nr_ranges = nr_ranges;
+	cmem->nr_ranges = 0;
 
-	/* If crashk_low_res is not 0, another range split possible */
-	if (crashk_low_res.end)
-		ced->max_nr_ranges++;
+	return cmem;
 }
 
-static int exclude_mem_range(struct crash_mem *mem,
+static int crash_exclude_mem_range(struct crash_mem *mem,
 		unsigned long long mstart, unsigned long long mend)
 {
 	int i, j;
@@ -293,10 +278,8 @@ static int exclude_mem_range(struct crash_mem *mem,
 		return 0;
 
 	/* Split happened */
-	if (i == CRASH_MAX_RANGES - 1) {
-		pr_err("Too many crash ranges after split\n");
+	if (i == mem->max_nr_ranges - 1)
 		return -ENOMEM;
-	}
 
 	/* Location where new range should go */
 	j = i + 1;
@@ -314,27 +297,20 @@ static int exclude_mem_range(struct crash_mem *mem,
 
 /*
  * Look for any unwanted ranges between mstart, mend and remove them. This
- * might lead to split and split ranges are put in ced->mem.ranges[] array
+ * might lead to split and split ranges are put in cmem->ranges[] array
  */
-static int elf_header_exclude_ranges(struct crash_elf_data *ced,
-		unsigned long long mstart, unsigned long long mend)
+static int elf_header_exclude_ranges(struct crash_mem *cmem)
 {
-	struct crash_mem *cmem = &ced->mem;
 	int ret = 0;
 
-	memset(cmem->ranges, 0, sizeof(cmem->ranges));
-
-	cmem->ranges[0].start = mstart;
-	cmem->ranges[0].end = mend;
-	cmem->nr_ranges = 1;
-
 	/* Exclude crashkernel region */
-	ret = exclude_mem_range(cmem, crashk_res.start, crashk_res.end);
+	ret = crash_exclude_mem_range(cmem, crashk_res.start, crashk_res.end);
 	if (ret)
 		return ret;
 
 	if (crashk_low_res.end) {
-		ret = exclude_mem_range(cmem, crashk_low_res.start, crashk_low_res.end);
+		ret = crash_exclude_mem_range(cmem, crashk_low_res.start,
+							crashk_low_res.end);
 		if (ret)
 			return ret;
 	}
@@ -344,70 +320,29 @@ static int elf_header_exclude_ranges(struct crash_elf_data *ced,
 
 static int prepare_elf64_ram_headers_callback(struct resource *res, void *arg)
 {
-	struct crash_elf_data *ced = arg;
-	Elf64_Ehdr *ehdr;
-	Elf64_Phdr *phdr;
-	unsigned long mstart, mend;
-	struct kimage *image = ced->image;
-	struct crash_mem *cmem;
-	int ret, i;
+	struct crash_mem *cmem = arg;
 
-	ehdr = ced->ehdr;
-
-	/* Exclude unwanted mem ranges */
-	ret = elf_header_exclude_ranges(ced, res->start, res->end);
-	if (ret)
-		return ret;
-
-	/* Go through all the ranges in ced->mem.ranges[] and prepare phdr */
-	cmem = &ced->mem;
-
-	for (i = 0; i < cmem->nr_ranges; i++) {
-		mstart = cmem->ranges[i].start;
-		mend = cmem->ranges[i].end;
-
-		phdr = ced->bufp;
-		ced->bufp += sizeof(Elf64_Phdr);
-
-		phdr->p_type = PT_LOAD;
-		phdr->p_flags = PF_R|PF_W|PF_X;
-		phdr->p_offset  = mstart;
-
-		/*
-		 * If a range matches backup region, adjust offset to backup
-		 * segment.
-		 */
-		if (mstart == image->arch.backup_src_start &&
-		    (mend - mstart + 1) == image->arch.backup_src_sz)
-			phdr->p_offset = image->arch.backup_load_addr;
-
-		phdr->p_paddr = mstart;
-		phdr->p_vaddr = (unsigned long long) __va(mstart);
-		phdr->p_filesz = phdr->p_memsz = mend - mstart + 1;
-		phdr->p_align = 0;
-		ehdr->e_phnum++;
-		pr_debug("Crash PT_LOAD elf header. phdr=%p vaddr=0x%llx, paddr=0x%llx, sz=0x%llx e_phnum=%d p_offset=0x%llx\n",
-			phdr, phdr->p_vaddr, phdr->p_paddr, phdr->p_filesz,
-			ehdr->e_phnum, phdr->p_offset);
-	}
+	cmem->ranges[cmem->nr_ranges].start = res->start;
+	cmem->ranges[cmem->nr_ranges].end = res->end;
+	cmem->nr_ranges++;
 
-	return ret;
+	return 0;
 }
 
-static int prepare_elf64_headers(struct crash_elf_data *ced,
-		void **addr, unsigned long *sz)
+static int crash_prepare_elf64_headers(struct crash_mem *cmem, int kernel_map,
+					void **addr, unsigned long *sz)
 {
 	Elf64_Ehdr *ehdr;
 	Elf64_Phdr *phdr;
 	unsigned long nr_cpus = num_possible_cpus(), nr_phdr, elf_sz;
-	unsigned char *buf, *bufp;
-	unsigned int cpu;
+	unsigned char *buf;
+	unsigned int cpu, i;
 	unsigned long long notes_addr;
-	int ret;
+	unsigned long mstart, mend;
 
 	/* extra phdr for vmcoreinfo elf note */
 	nr_phdr = nr_cpus + 1;
-	nr_phdr += ced->max_nr_ranges;
+	nr_phdr += cmem->nr_ranges;
 
 	/*
 	 * kexec-tools creates an extra PT_LOAD phdr for kernel text mapping
@@ -425,9 +360,8 @@ static int prepare_elf64_headers(struct crash_elf_data *ced,
 	if (!buf)
 		return -ENOMEM;
 
-	bufp = buf;
-	ehdr = (Elf64_Ehdr *)bufp;
-	bufp += sizeof(Elf64_Ehdr);
+	ehdr = (Elf64_Ehdr *)buf;
+	phdr = (Elf64_Phdr *)(ehdr + 1);
 	memcpy(ehdr->e_ident, ELFMAG, SELFMAG);
 	ehdr->e_ident[EI_CLASS] = ELFCLASS64;
 	ehdr->e_ident[EI_DATA] = ELFDATA2LSB;
@@ -443,42 +377,51 @@ static int prepare_elf64_headers(struct crash_elf_data *ced,
 
 	/* Prepare one phdr of type PT_NOTE for each present cpu */
 	for_each_present_cpu(cpu) {
-		phdr = (Elf64_Phdr *)bufp;
-		bufp += sizeof(Elf64_Phdr);
 		phdr->p_type = PT_NOTE;
 		notes_addr = per_cpu_ptr_to_phys(per_cpu_ptr(crash_notes, cpu));
 		phdr->p_offset = phdr->p_paddr = notes_addr;
 		phdr->p_filesz = phdr->p_memsz = sizeof(note_buf_t);
 		(ehdr->e_phnum)++;
+		phdr++;
 	}
 
 	/* Prepare one PT_NOTE header for vmcoreinfo */
-	phdr = (Elf64_Phdr *)bufp;
-	bufp += sizeof(Elf64_Phdr);
 	phdr->p_type = PT_NOTE;
 	phdr->p_offset = phdr->p_paddr = paddr_vmcoreinfo_note();
 	phdr->p_filesz = phdr->p_memsz = VMCOREINFO_NOTE_SIZE;
 	(ehdr->e_phnum)++;
+	phdr++;
 
-#ifdef CONFIG_X86_64
 	/* Prepare PT_LOAD type program header for kernel text region */
-	phdr = (Elf64_Phdr *)bufp;
-	bufp += sizeof(Elf64_Phdr);
-	phdr->p_type = PT_LOAD;
-	phdr->p_flags = PF_R|PF_W|PF_X;
-	phdr->p_vaddr = (Elf64_Addr)_text;
-	phdr->p_filesz = phdr->p_memsz = _end - _text;
-	phdr->p_offset = phdr->p_paddr = __pa_symbol(_text);
-	(ehdr->e_phnum)++;
-#endif
+	if (kernel_map) {
+		phdr->p_type = PT_LOAD;
+		phdr->p_flags = PF_R|PF_W|PF_X;
+		phdr->p_vaddr = (Elf64_Addr)_text;
+		phdr->p_filesz = phdr->p_memsz = _end - _text;
+		phdr->p_offset = phdr->p_paddr = __pa_symbol(_text);
+		ehdr->e_phnum++;
+		phdr++;
+	}
 
-	/* Prepare PT_LOAD headers for system ram chunks. */
-	ced->ehdr = ehdr;
-	ced->bufp = bufp;
-	ret = walk_system_ram_res(0, -1, ced,
-			prepare_elf64_ram_headers_callback);
-	if (ret < 0)
-		return ret;
+	/* Go through all the ranges in cmem->ranges[] and prepare phdr */
+	for (i = 0; i < cmem->nr_ranges; i++) {
+		mstart = cmem->ranges[i].start;
+		mend = cmem->ranges[i].end;
+
+		phdr->p_type = PT_LOAD;
+		phdr->p_flags = PF_R|PF_W|PF_X;
+		phdr->p_offset  = mstart;
+
+		phdr->p_paddr = mstart;
+		phdr->p_vaddr = (unsigned long long) __va(mstart);
+		phdr->p_filesz = phdr->p_memsz = mend - mstart + 1;
+		phdr->p_align = 0;
+		ehdr->e_phnum++;
+		phdr++;
+		pr_debug("Crash PT_LOAD elf header. phdr=%p vaddr=0x%llx, paddr=0x%llx, sz=0x%llx e_phnum=%d p_offset=0x%llx\n",
+			phdr, phdr->p_vaddr, phdr->p_paddr, phdr->p_filesz,
+			ehdr->e_phnum, phdr->p_offset);
+	}
 
 	*addr = buf;
 	*sz = elf_sz;
@@ -489,18 +432,46 @@ static int prepare_elf64_headers(struct crash_elf_data *ced,
 static int prepare_elf_headers(struct kimage *image, void **addr,
 					unsigned long *sz)
 {
-	struct crash_elf_data *ced;
-	int ret;
+	struct crash_mem *cmem;
+	Elf64_Ehdr *ehdr;
+	Elf64_Phdr *phdr;
+	int ret, i;
 
-	ced = kzalloc(sizeof(*ced), GFP_KERNEL);
-	if (!ced)
+	cmem = fill_up_crash_elf_data();
+	if (!cmem)
 		return -ENOMEM;
 
-	fill_up_crash_elf_data(ced, image);
+	ret = walk_system_ram_res(0, -1, cmem,
+				prepare_elf64_ram_headers_callback);
+	if (ret)
+		goto out;
+
+	/* Exclude unwanted mem ranges */
+	ret = elf_header_exclude_ranges(cmem);
+	if (ret)
+		goto out;
 
 	/* By default prepare 64bit headers */
-	ret =  prepare_elf64_headers(ced, addr, sz);
-	kfree(ced);
+	ret =  crash_prepare_elf64_headers(cmem,
+				(int)IS_ENABLED(CONFIG_X86_64), addr, sz);
+	if (ret)
+		goto out;
+
+	/*
+	 * If a range matches backup region, adjust offset to backup
+	 * segment.
+	 */
+	ehdr = (Elf64_Ehdr *)*addr;
+	phdr = (Elf64_Phdr *)(ehdr + 1);
+	for (i = 0; i < ehdr->e_phnum; phdr++, i++)
+		if (phdr->p_type == PT_LOAD &&
+				phdr->p_paddr == image->arch.backup_src_start &&
+				phdr->p_memsz == image->arch.backup_src_sz) {
+			phdr->p_offset = image->arch.backup_load_addr;
+			break;
+		}
+out:
+	vfree(cmem);
 	return ret;
 }
 
@@ -546,14 +517,14 @@ static int memmap_exclude_ranges(struct kimage *image, struct crash_mem *cmem,
 	/* Exclude Backup region */
 	start = image->arch.backup_load_addr;
 	end = start + image->arch.backup_src_sz - 1;
-	ret = exclude_mem_range(cmem, start, end);
+	ret = crash_exclude_mem_range(cmem, start, end);
 	if (ret)
 		return ret;
 
 	/* Exclude elf header region */
 	start = image->arch.elf_load_addr;
 	end = start + image->arch.elf_headers_sz - 1;
-	return exclude_mem_range(cmem, start, end);
+	return crash_exclude_mem_range(cmem, start, end);
 }
 
 /* Prepare memory map for crash dump kernel */
-- 
2.16.2



^ permalink raw reply related	[flat|nested] 102+ messages in thread

* [PATCH v8 05/13] kexec_file, x86: move re-factored code to generic side
  2018-02-22 11:17 ` AKASHI Takahiro
  (?)
@ 2018-02-22 11:17   ` AKASHI Takahiro
  -1 siblings, 0 replies; 102+ messages in thread
From: AKASHI Takahiro @ 2018-02-22 11:17 UTC (permalink / raw)
  To: catalin.marinas, will.deacon, bauerman, dhowells, vgoyal,
	herbert, davem, akpm, mpe, dyoung, bhe, arnd, ard.biesheuvel,
	julien.thierry
  Cc: kexec, linux-arm-kernel, linux-kernel, AKASHI Takahiro

In the previous patch, commonly used routines were carved out of the x86
crash code. Now place them in the generic kexec code so that other
architectures can share them.
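
As a rough illustration (not part of this patch), an architecture could
use the now-generic helpers along these lines; the RAM range values, the
crash_mem sizing and the error handling below are simplified assumptions:

/*
 * Hypothetical sketch: build ELF core headers for one RAM range,
 * excluding the crashkernel reservation.  Needs <linux/kexec.h>,
 * <linux/ioport.h> and <linux/vmalloc.h>.
 */
static int example_prepare_elf_headers(void **addr, unsigned long *sz)
{
	struct crash_mem *cmem;
	int ret;

	/* one RAM range plus room for a possible split */
	cmem = vzalloc(sizeof(*cmem) + 2 * sizeof(struct crash_mem_range));
	if (!cmem)
		return -ENOMEM;

	cmem->max_nr_ranges = 2;
	cmem->ranges[0].start = 0x80000000;	/* assumed RAM base */
	cmem->ranges[0].end   = 0xbfffffff;	/* assumed RAM end  */
	cmem->nr_ranges = 1;

	/* carve the crashkernel reservation out of the dump */
	ret = crash_exclude_mem_range(cmem, crashk_res.start, crashk_res.end);
	if (!ret)
		/* no extra kernel-text PT_LOAD on this hypothetical arch */
		ret = crash_prepare_elf64_headers(cmem, 0, addr, sz);

	vfree(cmem);
	return ret;
}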

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
---
 arch/x86/kernel/crash.c | 183 ------------------------------------------------
 include/linux/kexec.h   |  19 +++++
 kernel/kexec_file.c     | 175 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 194 insertions(+), 183 deletions(-)

diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index 5c19cfbf3b85..3e4f3980688d 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -38,19 +38,6 @@
 #include <asm/virtext.h>
 #include <asm/intel_pt.h>
 
-/* Alignment required for elf header segment */
-#define ELF_CORE_HEADER_ALIGN   4096
-
-struct crash_mem_range {
-	u64 start, end;
-};
-
-struct crash_mem {
-	unsigned int max_nr_ranges;
-	unsigned int nr_ranges;
-	struct crash_mem_range ranges[0];
-};
-
 /* Used while preparing memory map entries for second kernel */
 struct crash_memmap_data {
 	struct boot_params *params;
@@ -224,77 +211,6 @@ static struct crash_mem *fill_up_crash_elf_data(void)
 	return cmem;
 }
 
-static int crash_exclude_mem_range(struct crash_mem *mem,
-		unsigned long long mstart, unsigned long long mend)
-{
-	int i, j;
-	unsigned long long start, end;
-	struct crash_mem_range temp_range = {0, 0};
-
-	for (i = 0; i < mem->nr_ranges; i++) {
-		start = mem->ranges[i].start;
-		end = mem->ranges[i].end;
-
-		if (mstart > end || mend < start)
-			continue;
-
-		/* Truncate any area outside of range */
-		if (mstart < start)
-			mstart = start;
-		if (mend > end)
-			mend = end;
-
-		/* Found completely overlapping range */
-		if (mstart == start && mend == end) {
-			mem->ranges[i].start = 0;
-			mem->ranges[i].end = 0;
-			if (i < mem->nr_ranges - 1) {
-				/* Shift rest of the ranges to left */
-				for (j = i; j < mem->nr_ranges - 1; j++) {
-					mem->ranges[j].start =
-						mem->ranges[j+1].start;
-					mem->ranges[j].end =
-							mem->ranges[j+1].end;
-				}
-			}
-			mem->nr_ranges--;
-			return 0;
-		}
-
-		if (mstart > start && mend < end) {
-			/* Split original range */
-			mem->ranges[i].end = mstart - 1;
-			temp_range.start = mend + 1;
-			temp_range.end = end;
-		} else if (mstart != start)
-			mem->ranges[i].end = mstart - 1;
-		else
-			mem->ranges[i].start = mend + 1;
-		break;
-	}
-
-	/* If a split happend, add the split to array */
-	if (!temp_range.end)
-		return 0;
-
-	/* Split happened */
-	if (i == mem->max_nr_ranges - 1)
-		return -ENOMEM;
-
-	/* Location where new range should go */
-	j = i + 1;
-	if (j < mem->nr_ranges) {
-		/* Move over all ranges one slot towards the end */
-		for (i = mem->nr_ranges - 1; i >= j; i--)
-			mem->ranges[i + 1] = mem->ranges[i];
-	}
-
-	mem->ranges[j].start = temp_range.start;
-	mem->ranges[j].end = temp_range.end;
-	mem->nr_ranges++;
-	return 0;
-}
-
 /*
  * Look for any unwanted ranges between mstart, mend and remove them. This
  * might lead to split and split ranges are put in cmem->ranges[] array
@@ -329,105 +245,6 @@ static int prepare_elf64_ram_headers_callback(struct resource *res, void *arg)
 	return 0;
 }
 
-static int crash_prepare_elf64_headers(struct crash_mem *cmem, int kernel_map,
-					void **addr, unsigned long *sz)
-{
-	Elf64_Ehdr *ehdr;
-	Elf64_Phdr *phdr;
-	unsigned long nr_cpus = num_possible_cpus(), nr_phdr, elf_sz;
-	unsigned char *buf;
-	unsigned int cpu, i;
-	unsigned long long notes_addr;
-	unsigned long mstart, mend;
-
-	/* extra phdr for vmcoreinfo elf note */
-	nr_phdr = nr_cpus + 1;
-	nr_phdr += cmem->nr_ranges;
-
-	/*
-	 * kexec-tools creates an extra PT_LOAD phdr for kernel text mapping
-	 * area on x86_64 (ffffffff80000000 - ffffffffa0000000).
-	 * I think this is required by tools like gdb. So same physical
-	 * memory will be mapped in two elf headers. One will contain kernel
-	 * text virtual addresses and other will have __va(physical) addresses.
-	 */
-
-	nr_phdr++;
-	elf_sz = sizeof(Elf64_Ehdr) + nr_phdr * sizeof(Elf64_Phdr);
-	elf_sz = ALIGN(elf_sz, ELF_CORE_HEADER_ALIGN);
-
-	buf = vzalloc(elf_sz);
-	if (!buf)
-		return -ENOMEM;
-
-	ehdr = (Elf64_Ehdr *)buf;
-	phdr = (Elf64_Phdr *)(ehdr + 1);
-	memcpy(ehdr->e_ident, ELFMAG, SELFMAG);
-	ehdr->e_ident[EI_CLASS] = ELFCLASS64;
-	ehdr->e_ident[EI_DATA] = ELFDATA2LSB;
-	ehdr->e_ident[EI_VERSION] = EV_CURRENT;
-	ehdr->e_ident[EI_OSABI] = ELF_OSABI;
-	memset(ehdr->e_ident + EI_PAD, 0, EI_NIDENT - EI_PAD);
-	ehdr->e_type = ET_CORE;
-	ehdr->e_machine = ELF_ARCH;
-	ehdr->e_version = EV_CURRENT;
-	ehdr->e_phoff = sizeof(Elf64_Ehdr);
-	ehdr->e_ehsize = sizeof(Elf64_Ehdr);
-	ehdr->e_phentsize = sizeof(Elf64_Phdr);
-
-	/* Prepare one phdr of type PT_NOTE for each present cpu */
-	for_each_present_cpu(cpu) {
-		phdr->p_type = PT_NOTE;
-		notes_addr = per_cpu_ptr_to_phys(per_cpu_ptr(crash_notes, cpu));
-		phdr->p_offset = phdr->p_paddr = notes_addr;
-		phdr->p_filesz = phdr->p_memsz = sizeof(note_buf_t);
-		(ehdr->e_phnum)++;
-		phdr++;
-	}
-
-	/* Prepare one PT_NOTE header for vmcoreinfo */
-	phdr->p_type = PT_NOTE;
-	phdr->p_offset = phdr->p_paddr = paddr_vmcoreinfo_note();
-	phdr->p_filesz = phdr->p_memsz = VMCOREINFO_NOTE_SIZE;
-	(ehdr->e_phnum)++;
-	phdr++;
-
-	/* Prepare PT_LOAD type program header for kernel text region */
-	if (kernel_map) {
-		phdr->p_type = PT_LOAD;
-		phdr->p_flags = PF_R|PF_W|PF_X;
-		phdr->p_vaddr = (Elf64_Addr)_text;
-		phdr->p_filesz = phdr->p_memsz = _end - _text;
-		phdr->p_offset = phdr->p_paddr = __pa_symbol(_text);
-		ehdr->e_phnum++;
-		phdr++;
-	}
-
-	/* Go through all the ranges in cmem->ranges[] and prepare phdr */
-	for (i = 0; i < cmem->nr_ranges; i++) {
-		mstart = cmem->ranges[i].start;
-		mend = cmem->ranges[i].end;
-
-		phdr->p_type = PT_LOAD;
-		phdr->p_flags = PF_R|PF_W|PF_X;
-		phdr->p_offset  = mstart;
-
-		phdr->p_paddr = mstart;
-		phdr->p_vaddr = (unsigned long long) __va(mstart);
-		phdr->p_filesz = phdr->p_memsz = mend - mstart + 1;
-		phdr->p_align = 0;
-		ehdr->e_phnum++;
-		phdr++;
-		pr_debug("Crash PT_LOAD elf header. phdr=%p vaddr=0x%llx, paddr=0x%llx, sz=0x%llx e_phnum=%d p_offset=0x%llx\n",
-			phdr, phdr->p_vaddr, phdr->p_paddr, phdr->p_filesz,
-			ehdr->e_phnum, phdr->p_offset);
-	}
-
-	*addr = buf;
-	*sz = elf_sz;
-	return 0;
-}
-
 /* Prepare elf headers. Return addr and size */
 static int prepare_elf_headers(struct kimage *image, void **addr,
 					unsigned long *sz)
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 325980537125..0b10c7f7cca8 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -163,6 +163,25 @@ int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf,
 			       int (*func)(struct resource *, void *));
 extern int kexec_add_buffer(struct kexec_buf *kbuf);
 int kexec_locate_mem_hole(struct kexec_buf *kbuf);
+
+/* Alignment required for elf header segment */
+#define ELF_CORE_HEADER_ALIGN   4096
+
+struct crash_mem_range {
+	u64 start, end;
+};
+
+struct crash_mem {
+	unsigned int max_nr_ranges;
+	unsigned int nr_ranges;
+	struct crash_mem_range ranges[0];
+};
+
+extern int crash_exclude_mem_range(struct crash_mem *mem,
+				   unsigned long long mstart,
+				   unsigned long long mend);
+extern int crash_prepare_elf64_headers(struct crash_mem *mem, int kernel_map,
+				       void **addr, unsigned long *sz);
 #endif /* CONFIG_KEXEC_FILE */
 
 struct kimage {
diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index a6d14a768b3e..3f506774f32e 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -22,6 +22,11 @@
 #include <linux/ima.h>
 #include <crypto/hash.h>
 #include <crypto/sha.h>
+#include <linux/elf.h>
+#include <linux/elfcore.h>
+#include <linux/kernel.h>
+#include <linux/kexec.h>
+#include <linux/slab.h>
 #include <linux/syscalls.h>
 #include <linux/vmalloc.h>
 #include "kexec_internal.h"
@@ -1077,3 +1082,173 @@ int kexec_purgatory_get_set_symbol(struct kimage *image, const char *name,
 	return 0;
 }
 #endif /* CONFIG_ARCH_HAS_KEXEC_PURGATORY */
+
+int crash_exclude_mem_range(struct crash_mem *mem,
+			    unsigned long long mstart, unsigned long long mend)
+{
+	int i, j;
+	unsigned long long start, end;
+	struct crash_mem_range temp_range = {0, 0};
+
+	for (i = 0; i < mem->nr_ranges; i++) {
+		start = mem->ranges[i].start;
+		end = mem->ranges[i].end;
+
+		if (mstart > end || mend < start)
+			continue;
+
+		/* Truncate any area outside of range */
+		if (mstart < start)
+			mstart = start;
+		if (mend > end)
+			mend = end;
+
+		/* Found completely overlapping range */
+		if (mstart == start && mend == end) {
+			mem->ranges[i].start = 0;
+			mem->ranges[i].end = 0;
+			if (i < mem->nr_ranges - 1) {
+				/* Shift rest of the ranges to left */
+				for (j = i; j < mem->nr_ranges - 1; j++) {
+					mem->ranges[j].start =
+						mem->ranges[j+1].start;
+					mem->ranges[j].end =
+							mem->ranges[j+1].end;
+				}
+			}
+			mem->nr_ranges--;
+			return 0;
+		}
+
+		if (mstart > start && mend < end) {
+			/* Split original range */
+			mem->ranges[i].end = mstart - 1;
+			temp_range.start = mend + 1;
+			temp_range.end = end;
+		} else if (mstart != start)
+			mem->ranges[i].end = mstart - 1;
+		else
+			mem->ranges[i].start = mend + 1;
+		break;
+	}
+
+	/* If a split happened, add the split to array */
+	if (!temp_range.end)
+		return 0;
+
+	/* Split happened */
+	if (i == mem->max_nr_ranges - 1)
+		return -ENOMEM;
+
+	/* Location where new range should go */
+	j = i + 1;
+	if (j < mem->nr_ranges) {
+		/* Move over all ranges one slot towards the end */
+		for (i = mem->nr_ranges - 1; i >= j; i--)
+			mem->ranges[i + 1] = mem->ranges[i];
+	}
+
+	mem->ranges[j].start = temp_range.start;
+	mem->ranges[j].end = temp_range.end;
+	mem->nr_ranges++;
+	return 0;
+}
+
+int crash_prepare_elf64_headers(struct crash_mem *mem, int kernel_map,
+			  void **addr, unsigned long *sz)
+{
+	Elf64_Ehdr *ehdr;
+	Elf64_Phdr *phdr;
+	unsigned long nr_cpus = num_possible_cpus(), nr_phdr, elf_sz;
+	unsigned char *buf;
+	unsigned int cpu, i;
+	unsigned long long notes_addr;
+	unsigned long mstart, mend;
+
+	/* extra phdr for vmcoreinfo elf note */
+	nr_phdr = nr_cpus + 1;
+	nr_phdr += mem->nr_ranges;
+
+	/*
+	 * kexec-tools creates an extra PT_LOAD phdr for kernel text mapping
+	 * area (for example, ffffffff80000000 - ffffffffa0000000 on x86_64).
+	 * I think this is required by tools like gdb. So same physical
+	 * memory will be mapped in two elf headers. One will contain kernel
+	 * text virtual addresses and other will have __va(physical) addresses.
+	 */
+
+	nr_phdr++;
+	elf_sz = sizeof(Elf64_Ehdr) + nr_phdr * sizeof(Elf64_Phdr);
+	elf_sz = ALIGN(elf_sz, ELF_CORE_HEADER_ALIGN);
+
+	buf = vzalloc(elf_sz);
+	if (!buf)
+		return -ENOMEM;
+
+	ehdr = (Elf64_Ehdr *)buf;
+	phdr = (Elf64_Phdr *)(ehdr + 1);
+	memcpy(ehdr->e_ident, ELFMAG, SELFMAG);
+	ehdr->e_ident[EI_CLASS] = ELFCLASS64;
+	ehdr->e_ident[EI_DATA] = ELFDATA2LSB;
+	ehdr->e_ident[EI_VERSION] = EV_CURRENT;
+	ehdr->e_ident[EI_OSABI] = ELF_OSABI;
+	memset(ehdr->e_ident + EI_PAD, 0, EI_NIDENT - EI_PAD);
+	ehdr->e_type = ET_CORE;
+	ehdr->e_machine = ELF_ARCH;
+	ehdr->e_version = EV_CURRENT;
+	ehdr->e_phoff = sizeof(Elf64_Ehdr);
+	ehdr->e_ehsize = sizeof(Elf64_Ehdr);
+	ehdr->e_phentsize = sizeof(Elf64_Phdr);
+
+	/* Prepare one phdr of type PT_NOTE for each present cpu */
+	for_each_present_cpu(cpu) {
+		phdr->p_type = PT_NOTE;
+		notes_addr = per_cpu_ptr_to_phys(per_cpu_ptr(crash_notes, cpu));
+		phdr->p_offset = phdr->p_paddr = notes_addr;
+		phdr->p_filesz = phdr->p_memsz = sizeof(note_buf_t);
+		(ehdr->e_phnum)++;
+		phdr++;
+	}
+
+	/* Prepare one PT_NOTE header for vmcoreinfo */
+	phdr->p_type = PT_NOTE;
+	phdr->p_offset = phdr->p_paddr = paddr_vmcoreinfo_note();
+	phdr->p_filesz = phdr->p_memsz = VMCOREINFO_NOTE_SIZE;
+	(ehdr->e_phnum)++;
+	phdr++;
+
+	/* Prepare PT_LOAD type program header for kernel text region */
+	if (kernel_map) {
+		phdr->p_type = PT_LOAD;
+		phdr->p_flags = PF_R|PF_W|PF_X;
+		phdr->p_vaddr = (Elf64_Addr)_text;
+		phdr->p_filesz = phdr->p_memsz = _end - _text;
+		phdr->p_offset = phdr->p_paddr = __pa_symbol(_text);
+		ehdr->e_phnum++;
+		phdr++;
+	}
+
+	/* Go through all the ranges in mem->ranges[] and prepare phdr */
+	for (i = 0; i < mem->nr_ranges; i++) {
+		mstart = mem->ranges[i].start;
+		mend = mem->ranges[i].end;
+
+		phdr->p_type = PT_LOAD;
+		phdr->p_flags = PF_R|PF_W|PF_X;
+		phdr->p_offset  = mstart;
+
+		phdr->p_paddr = mstart;
+		phdr->p_vaddr = (unsigned long long) __va(mstart);
+		phdr->p_filesz = phdr->p_memsz = mend - mstart + 1;
+		phdr->p_align = 0;
+		ehdr->e_phnum++;
+		phdr++;
+		pr_debug("Crash PT_LOAD elf header. phdr=%p vaddr=0x%llx, paddr=0x%llx, sz=0x%llx e_phnum=%d p_offset=0x%llx\n",
+			phdr, phdr->p_vaddr, phdr->p_paddr, phdr->p_filesz,
+			ehdr->e_phnum, phdr->p_offset);
+	}
+
+	*addr = buf;
+	*sz = elf_sz;
+	return 0;
+}
-- 
2.16.2

^ permalink raw reply related	[flat|nested] 102+ messages in thread

* [PATCH v8 06/13] asm-generic: add kexec_file_load system call to unistd.h
  2018-02-22 11:17 ` AKASHI Takahiro
  (?)
@ 2018-02-22 11:17   ` AKASHI Takahiro
  -1 siblings, 0 replies; 102+ messages in thread
From: AKASHI Takahiro @ 2018-02-22 11:17 UTC (permalink / raw)
  To: catalin.marinas, will.deacon, bauerman, dhowells, vgoyal,
	herbert, davem, akpm, mpe, dyoung, bhe, arnd, ard.biesheuvel,
	julien.thierry
  Cc: kexec, linux-arm-kernel, linux-kernel, AKASHI Takahiro

The initial user of this system call number is arm64.
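
For reference, a minimal userspace sketch of invoking the new syscall
number (assuming the toolchain headers already expose
__NR_kexec_file_load and that the kernel/initrd paths below exist;
CAP_SYS_BOOT is required):

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

int main(void)
{
	const char *cmdline = "console=ttyAMA0 root=/dev/vda1";
	int kernel_fd = open("/boot/Image", O_RDONLY);
	int initrd_fd = open("/boot/initrd.img", O_RDONLY);

	if (kernel_fd < 0 || initrd_fd < 0) {
		perror("open");
		return 1;
	}

	/* cmdline_len must include the terminating NUL */
	if (syscall(__NR_kexec_file_load, kernel_fd, initrd_fd,
		    strlen(cmdline) + 1, cmdline, 0UL) < 0) {
		perror("kexec_file_load");
		return 1;
	}

	return 0;	/* reboot into the loaded kernel with "kexec -e" */
}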

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
---
 include/uapi/asm-generic/unistd.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
index 8b87de067bc7..33761525ed2f 100644
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -732,9 +732,11 @@ __SYSCALL(__NR_pkey_alloc,    sys_pkey_alloc)
 __SYSCALL(__NR_pkey_free,     sys_pkey_free)
 #define __NR_statx 291
 __SYSCALL(__NR_statx,     sys_statx)
+#define __NR_kexec_file_load 292
+__SYSCALL(__NR_kexec_file_load,     sys_kexec_file_load)
 
 #undef __NR_syscalls
-#define __NR_syscalls 292
+#define __NR_syscalls 293
 
 /*
  * All syscalls below here should go away really,
-- 
2.16.2

^ permalink raw reply related	[flat|nested] 102+ messages in thread

* [PATCH v8 07/13] arm64: kexec_file: invoke the kernel without purgatory
  2018-02-22 11:17 ` AKASHI Takahiro
  (?)
@ 2018-02-22 11:17   ` AKASHI Takahiro
  -1 siblings, 0 replies; 102+ messages in thread
From: AKASHI Takahiro @ 2018-02-22 11:17 UTC (permalink / raw)
  To: catalin.marinas, will.deacon, bauerman, dhowells, vgoyal,
	herbert, davem, akpm, mpe, dyoung, bhe, arnd, ard.biesheuvel,
	julien.thierry
  Cc: kexec, linux-arm-kernel, linux-kernel, AKASHI Takahiro

On arm64, purgatory would do almost nothing. So just invoke the second
kernel by jumping into its entry code directly.

While cpu_soft_restart() must be called in a specific way in this case,
it remains compatible with kexec as long as the fifth argument is null.
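
To spell out the calling convention, here is an illustrative sketch only
(assuming CONFIG_KEXEC_FILE is built in; with it disabled the fifth
argument stays a plain 0, exactly as today):

/* Illustration: how the fifth argument selects the boot path. */
static void example_soft_restart(struct kimage *kimage,
				 unsigned long reboot_code_buffer_phys)
{
	/*
	 * An image loaded by kexec_load() still carries a purgatory
	 * buffer, so arg2 stays 0 and purgatory hands over the dtb.
	 * An image from kexec_file_load() has no purgatory, so the
	 * physical dtb address is passed through the relocation code
	 * to the new kernel in x0.
	 */
	unsigned long dtb_mem = kimage->purgatory_info.purgatory_buf ?
				0 : kimage->arch.dtb_mem;

	cpu_soft_restart(kimage != kexec_crash_image,
			 reboot_code_buffer_phys, kimage->head,
			 kimage->start, dtb_mem);
}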

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/kernel/cpu-reset.S       |  6 +++---
 arch/arm64/kernel/machine_kexec.c   | 11 +++++++++--
 arch/arm64/kernel/relocate_kernel.S |  3 ++-
 3 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/kernel/cpu-reset.S b/arch/arm64/kernel/cpu-reset.S
index 8021b46c9743..46fd9ea66ae8 100644
--- a/arch/arm64/kernel/cpu-reset.S
+++ b/arch/arm64/kernel/cpu-reset.S
@@ -24,9 +24,9 @@
  *
  * @el2_switch: Flag to indicate a swich to EL2 is needed.
  * @entry: Location to jump to for soft reset.
- * arg0: First argument passed to @entry.
- * arg1: Second argument passed to @entry.
- * arg2: Third argument passed to @entry.
+ * arg0: First argument passed to @entry. (relocator's address)
+ * arg1: Second argument passed to @entry. (physical kernel entry)
+ * arg2: Third argument passed to @entry. (physical dtb address)
  *
  * Put the CPU into the same state as it would be if it had been reset, and
  * branch to what would be the reset vector. It must be executed with the
diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
index f76ea92dff91..f7dbba00be10 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -205,10 +205,17 @@ void machine_kexec(struct kimage *kimage)
 	 * uses physical addressing to relocate the new image to its final
 	 * position and transfers control to the image entry point when the
 	 * relocation is complete.
+	 * In case of kexec_file_load syscall, we directly start the kernel,
+	 * skipping purgatory.
 	 */
-
 	cpu_soft_restart(kimage != kexec_crash_image,
-		reboot_code_buffer_phys, kimage->head, kimage->start, 0);
+		reboot_code_buffer_phys, kimage->head, kimage->start,
+#ifdef CONFIG_KEXEC_FILE
+				kimage->purgatory_info.purgatory_buf ?
+						0 : kimage->arch.dtb_mem);
+#else
+				0);
+#endif
 
 	BUG(); /* Should never get here. */
 }
diff --git a/arch/arm64/kernel/relocate_kernel.S b/arch/arm64/kernel/relocate_kernel.S
index f407e422a720..95fd94209aae 100644
--- a/arch/arm64/kernel/relocate_kernel.S
+++ b/arch/arm64/kernel/relocate_kernel.S
@@ -32,6 +32,7 @@
 ENTRY(arm64_relocate_new_kernel)
 
 	/* Setup the list loop variables. */
+	mov	x18, x2				/* x18 = dtb address */
 	mov	x17, x1				/* x17 = kimage_start */
 	mov	x16, x0				/* x16 = kimage_head */
 	raw_dcache_line_size x15, x0		/* x15 = dcache line size */
@@ -107,7 +108,7 @@ ENTRY(arm64_relocate_new_kernel)
 	isb
 
 	/* Start new image. */
-	mov	x0, xzr
+	mov	x0, x18
 	mov	x1, xzr
 	mov	x2, xzr
 	mov	x3, xzr
-- 
2.16.2

^ permalink raw reply related	[flat|nested] 102+ messages in thread

* [PATCH v8 07/13] arm64: kexec_file: invoke the kernel without purgatory
@ 2018-02-22 11:17   ` AKASHI Takahiro
  0 siblings, 0 replies; 102+ messages in thread
From: AKASHI Takahiro @ 2018-02-22 11:17 UTC (permalink / raw)
  To: catalin.marinas, will.deacon, bauerman, dhowells, vgoyal,
	herbert, davem, akpm, mpe, dyoung, bhe, arnd, ard.biesheuvel,
	julien.thierry
  Cc: AKASHI Takahiro, kexec, linux-kernel, linux-arm-kernel

On arm64, purugatory would do almosty nothing. So just invoke the second
kernel by jumping into the entry code directly.

While, in this case, cpu_soft_restart() must be called in a specific way,
it still stays compatible with kexec as far as the fifth argument is null.

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/kernel/cpu-reset.S       |  6 +++---
 arch/arm64/kernel/machine_kexec.c   | 11 +++++++++--
 arch/arm64/kernel/relocate_kernel.S |  3 ++-
 3 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/kernel/cpu-reset.S b/arch/arm64/kernel/cpu-reset.S
index 8021b46c9743..46fd9ea66ae8 100644
--- a/arch/arm64/kernel/cpu-reset.S
+++ b/arch/arm64/kernel/cpu-reset.S
@@ -24,9 +24,9 @@
  *
  * @el2_switch: Flag to indicate a swich to EL2 is needed.
  * @entry: Location to jump to for soft reset.
- * arg0: First argument passed to @entry.
- * arg1: Second argument passed to @entry.
- * arg2: Third argument passed to @entry.
+ * arg0: First argument passed to @entry. (rellocator's address)
+ * arg1: Second argument passed to @entry.(physcal kernel entry)
+ * arg2: Third argument passed to @entry. (physical dtb address)
  *
  * Put the CPU into the same state as it would be if it had been reset, and
  * branch to what would be the reset vector. It must be executed with the
diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
index f76ea92dff91..f7dbba00be10 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -205,10 +205,17 @@ void machine_kexec(struct kimage *kimage)
 	 * uses physical addressing to relocate the new image to its final
 	 * position and transfers control to the image entry point when the
 	 * relocation is complete.
+	 * In case of kexec_file_load syscall, we directly start the kernel,
+	 * skipping purgatory.
 	 */
-
 	cpu_soft_restart(kimage != kexec_crash_image,
-		reboot_code_buffer_phys, kimage->head, kimage->start, 0);
+		reboot_code_buffer_phys, kimage->head, kimage->start,
+#ifdef CONFIG_KEXEC_FILE
+				kimage->purgatory_info.purgatory_buf ?
+						0 : kimage->arch.dtb_mem);
+#else
+				0);
+#endif
 
 	BUG(); /* Should never get here. */
 }
diff --git a/arch/arm64/kernel/relocate_kernel.S b/arch/arm64/kernel/relocate_kernel.S
index f407e422a720..95fd94209aae 100644
--- a/arch/arm64/kernel/relocate_kernel.S
+++ b/arch/arm64/kernel/relocate_kernel.S
@@ -32,6 +32,7 @@
 ENTRY(arm64_relocate_new_kernel)
 
 	/* Setup the list loop variables. */
+	mov	x18, x2				/* x18 = dtb address */
 	mov	x17, x1				/* x17 = kimage_start */
 	mov	x16, x0				/* x16 = kimage_head */
 	raw_dcache_line_size x15, x0		/* x15 = dcache line size */
@@ -107,7 +108,7 @@ ENTRY(arm64_relocate_new_kernel)
 	isb
 
 	/* Start new image. */
-	mov	x0, xzr
+	mov	x0, x18
 	mov	x1, xzr
 	mov	x2, xzr
 	mov	x3, xzr
-- 
2.16.2


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 102+ messages in thread

* [PATCH v8 08/13] arm64: kexec_file: load initrd and device-tree
  2018-02-22 11:17 ` AKASHI Takahiro
  (?)
@ 2018-02-22 11:17   ` AKASHI Takahiro
  -1 siblings, 0 replies; 102+ messages in thread
From: AKASHI Takahiro @ 2018-02-22 11:17 UTC (permalink / raw)
  To: catalin.marinas, will.deacon, bauerman, dhowells, vgoyal,
	herbert, davem, akpm, mpe, dyoung, bhe, arnd, ard.biesheuvel,
	julien.thierry
  Cc: kexec, linux-arm-kernel, linux-kernel, AKASHI Takahiro

load_other_segments() sets up and adds all the memory segments required
besides the kernel itself, namely the initrd and the device-tree blob.
Most of the code was borrowed from kexec-tools' counterpart.

arch_kimage_file_post_load_cleanup() is meant to free the arm64-specific
data allocated in load_other_segments().
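
As a rough sketch (not literal code from this patch), an arch image loader
is expected to place the kernel segment first and then hand everything else
to this helper; the Image loader added later in this series does essentially
the following (error handling trimmed):

	/* inside an image loader's ->load() hook, once the kernel segment
	 * has been added and kernel_load_addr is known */
	ret = load_other_segments(image, kernel_load_addr,
				  initrd, initrd_len,
				  cmdline, cmdline_len);
	if (ret)
		return ERR_PTR(ret);

	/* image->arch.dtb_mem now holds the physical address of the
	 * duplicated, updated dtb handed over to the next kernel */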

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/kexec.h         |  19 ++++
 arch/arm64/kernel/Makefile             |   3 +-
 arch/arm64/kernel/machine_kexec_file.c | 189 +++++++++++++++++++++++++++++++++
 3 files changed, 210 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/kernel/machine_kexec_file.c

diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
index e17f0529a882..fc562db22d46 100644
--- a/arch/arm64/include/asm/kexec.h
+++ b/arch/arm64/include/asm/kexec.h
@@ -93,6 +93,25 @@ static inline void crash_prepare_suspend(void) {}
 static inline void crash_post_resume(void) {}
 #endif
 
+#ifdef CONFIG_KEXEC_FILE
+#define ARCH_HAS_KIMAGE_ARCH
+
+struct kimage_arch {
+	phys_addr_t dtb_mem;
+	void *dtb_buf;
+};
+
+struct kimage;
+
+#define arch_kimage_file_post_load_cleanup arch_kimage_file_post_load_cleanup
+extern int arch_kimage_file_post_load_cleanup(struct kimage *image);
+
+extern int load_other_segments(struct kimage *image,
+		unsigned long kernel_load_addr,
+		char *initrd, unsigned long initrd_len,
+		char *cmdline, unsigned long cmdline_len);
+#endif
+
 #endif /* __ASSEMBLY__ */
 
 #endif
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index b87541360f43..151dc890737c 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -47,8 +47,9 @@ arm64-obj-$(CONFIG_ARM64_ACPI_PARKING_PROTOCOL)	+= acpi_parking_protocol.o
 arm64-obj-$(CONFIG_PARAVIRT)		+= paravirt.o
 arm64-obj-$(CONFIG_RANDOMIZE_BASE)	+= kaslr.o
 arm64-obj-$(CONFIG_HIBERNATION)		+= hibernate.o hibernate-asm.o
-arm64-obj-$(CONFIG_KEXEC)		+= machine_kexec.o relocate_kernel.o	\
+arm64-obj-$(CONFIG_KEXEC_CORE)		+= machine_kexec.o relocate_kernel.o	\
 					   cpu-reset.o
+arm64-obj-$(CONFIG_KEXEC_FILE)		+= machine_kexec_file.o
 arm64-obj-$(CONFIG_ARM64_RELOC_TEST)	+= arm64-reloc-test.o
 arm64-reloc-test-y := reloc_test_core.o reloc_test_syms.o
 arm64-obj-$(CONFIG_CRASH_DUMP)		+= crash_dump.o
diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
new file mode 100644
index 000000000000..12012f247501
--- /dev/null
+++ b/arch/arm64/kernel/machine_kexec_file.c
@@ -0,0 +1,189 @@
+/*
+ * kexec_file for arm64
+ *
+ * Copyright (C) 2018 Linaro Limited
+ * Author: AKASHI Takahiro <takahiro.akashi@linaro.org>
+ *
+ * Most code is derived from arm64 port of kexec-tools
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#define pr_fmt(fmt) "kexec_file: " fmt
+
+#include <linux/errno.h>
+#include <linux/kernel.h>
+#include <linux/kexec.h>
+#include <linux/libfdt.h>
+#include <linux/memblock.h>
+#include <linux/of_fdt.h>
+
+static int __dt_root_addr_cells;
+static int __dt_root_size_cells;
+
+const struct kexec_file_ops * const kexec_file_loaders[] = {
+	NULL
+};
+
+int arch_kimage_file_post_load_cleanup(struct kimage *image)
+{
+	vfree(image->arch.dtb_buf);
+	image->arch.dtb_buf = NULL;
+
+	return _kimage_file_post_load_cleanup(image);
+}
+
+int arch_kexec_walk_mem(struct kexec_buf *kbuf,
+				int (*func)(struct resource *, void *))
+{
+	if (kbuf->image->type == KEXEC_TYPE_CRASH)
+		return walk_iomem_res_desc(crashk_res.desc,
+					IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY,
+					crashk_res.start, crashk_res.end,
+					kbuf, func);
+	else if (kbuf->top_down)
+		return walk_system_ram_res_rev(0, ULONG_MAX, kbuf, func);
+	else
+		return walk_system_ram_res(0, ULONG_MAX, kbuf, func);
+}
+
+static int setup_dtb(struct kimage *image,
+		unsigned long initrd_load_addr, unsigned long initrd_len,
+		char *cmdline, unsigned long cmdline_len,
+		char **dtb_buf, size_t *dtb_buf_len)
+{
+	char *buf = NULL;
+	size_t buf_size;
+	int nodeoffset;
+	u64 value;
+	int range_len;
+	int ret;
+
+	/* duplicate dt blob */
+	buf_size = fdt_totalsize(initial_boot_params);
+	range_len = (__dt_root_addr_cells + __dt_root_size_cells) * sizeof(u32);
+
+	if (initrd_load_addr)
+		buf_size += fdt_prop_len("initrd-start", sizeof(u64))
+				+ fdt_prop_len("initrd-end", sizeof(u64));
+
+	if (cmdline)
+		buf_size += fdt_prop_len("bootargs", cmdline_len + 1);
+
+	buf = vmalloc(buf_size);
+	if (!buf) {
+		ret = -ENOMEM;
+		goto out_err;
+	}
+
+	ret = fdt_open_into(initial_boot_params, buf, buf_size);
+	if (ret)
+		goto out_err;
+
+	nodeoffset = fdt_path_offset(buf, "/chosen");
+	if (nodeoffset < 0)
+		goto out_err;
+
+	/* add bootargs */
+	if (cmdline) {
+		ret = fdt_setprop(buf, nodeoffset, "bootargs",
+						cmdline, cmdline_len + 1);
+		if (ret)
+			goto out_err;
+	}
+
+	/* add initrd-* */
+	if (initrd_load_addr) {
+		value = cpu_to_fdt64(initrd_load_addr);
+		ret = fdt_setprop(buf, nodeoffset, "initrd-start",
+				&value, sizeof(value));
+		if (ret)
+			goto out_err;
+
+		value = cpu_to_fdt64(initrd_load_addr + initrd_len);
+		ret = fdt_setprop(buf, nodeoffset, "initrd-end",
+				&value, sizeof(value));
+		if (ret)
+			goto out_err;
+	}
+
+	/* trim a buffer */
+	fdt_pack(buf);
+	*dtb_buf = buf;
+	*dtb_buf_len = fdt_totalsize(buf);
+
+	return 0;
+
+out_err:
+	vfree(buf);
+	return ret;
+}
+
+int load_other_segments(struct kimage *image, unsigned long kernel_load_addr,
+			char *initrd, unsigned long initrd_len,
+			char *cmdline, unsigned long cmdline_len)
+{
+	struct kexec_buf kbuf;
+	unsigned long initrd_load_addr = 0;
+	char *dtb = NULL;
+	unsigned long dtb_len = 0;
+	int ret = 0;
+
+	kbuf.image = image;
+	/* do not allocate anything below the kernel */
+	kbuf.buf_min = kernel_load_addr;
+
+	/* Load initrd */
+	if (initrd) {
+		kbuf.buffer = initrd;
+		kbuf.bufsz = initrd_len;
+		kbuf.memsz = initrd_len;
+		kbuf.buf_align = PAGE_SIZE;
+		/* within 1GB-aligned window of up to 32GB in size */
+		kbuf.buf_max = round_down(kernel_load_addr, SZ_1G)
+						+ (unsigned long)SZ_1G * 31;
+		kbuf.top_down = 0;
+
+		ret = kexec_add_buffer(&kbuf);
+		if (ret)
+			goto out_err;
+		initrd_load_addr = kbuf.mem;
+
+		pr_debug("Loaded initrd at 0x%lx bufsz=0x%lx memsz=0x%lx\n",
+				initrd_load_addr, initrd_len, initrd_len);
+	}
+
+	/* Load dtb blob */
+	ret = setup_dtb(image, initrd_load_addr, initrd_len,
+				cmdline, cmdline_len, &dtb, &dtb_len);
+	if (ret) {
+		pr_err("Preparing for new dtb failed\n");
+		goto out_err;
+	}
+
+	kbuf.buffer = dtb;
+	kbuf.bufsz = dtb_len;
+	kbuf.memsz = dtb_len;
+	/* not across 2MB boundary */
+	kbuf.buf_align = SZ_2M;
+	kbuf.buf_max = ULONG_MAX;
+	kbuf.top_down = 1;
+
+	ret = kexec_add_buffer(&kbuf);
+	if (ret)
+		goto out_err;
+	image->arch.dtb_mem = kbuf.mem;
+	image->arch.dtb_buf = dtb;
+
+	pr_debug("Loaded dtb at 0x%lx bufsz=0x%lx memsz=0x%lx\n",
+			kbuf.mem, dtb_len, dtb_len);
+
+	return 0;
+
+out_err:
+	vfree(dtb);
+	image->arch.dtb_buf = NULL;
+	return ret;
+}
-- 
2.16.2

^ permalink raw reply related	[flat|nested] 102+ messages in thread

* [PATCH v8 09/13] arm64: kexec_file: add crash dump support
  2018-02-22 11:17 ` AKASHI Takahiro
  (?)
@ 2018-02-22 11:17   ` AKASHI Takahiro
  -1 siblings, 0 replies; 102+ messages in thread
From: AKASHI Takahiro @ 2018-02-22 11:17 UTC (permalink / raw)
  To: catalin.marinas, will.deacon, bauerman, dhowells, vgoyal,
	herbert, davem, akpm, mpe, dyoung, bhe, arnd, ard.biesheuvel,
	julien.thierry
  Cc: kexec, linux-arm-kernel, linux-kernel, AKASHI Takahiro

To enable crash dump (kdump), we need to
* prepare the contents of the ELF header of /proc/vmcore through
  load_crashdump_segments(), and
* set up two device tree properties, "linux,usable-memory-range" and
  "linux,elfcorehdr", which respectively describe the memory range to be
  used by the crash dump kernel and the region holding the ELF core header.
  (The logic of this code is also borrowed from kexec-tools.)
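
A rough sketch of the kdump-specific part of a loader (simplified; the
exact flow appears in the Image loader later in this series, and the
addresses in the comment below are invented for illustration):

	/* does nothing unless this is a crash-dump (kexec -p) image */
	ret = load_crashdump_segments(image);
	if (ret)
		return ERR_PTR(ret);

	/*
	 * setup_dtb() will later advertise both regions to the crash dump
	 * kernel via /chosen, roughly (two address/size cells):
	 *
	 *   linux,elfcorehdr          = <0x0 0xdfe00000 0x0 0x10000>;
	 *   linux,usable-memory-range = <0x0 0xc0000000 0x0 0x20000000>;
	 */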

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/kexec.h         |   5 +
 arch/arm64/kernel/machine_kexec_file.c | 211 +++++++++++++++++++++++++++++++++
 2 files changed, 216 insertions(+)

diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
index fc562db22d46..d7427d510e1b 100644
--- a/arch/arm64/include/asm/kexec.h
+++ b/arch/arm64/include/asm/kexec.h
@@ -99,6 +99,10 @@ static inline void crash_post_resume(void) {}
 struct kimage_arch {
 	phys_addr_t dtb_mem;
 	void *dtb_buf;
+	/* Core ELF header buffer */
+	void *elf_headers;
+	unsigned long elf_headers_sz;
+	unsigned long elf_load_addr;
 };
 
 struct kimage;
@@ -110,6 +114,7 @@ extern int load_other_segments(struct kimage *image,
 		unsigned long kernel_load_addr,
 		char *initrd, unsigned long initrd_len,
 		char *cmdline, unsigned long cmdline_len);
+extern int load_crashdump_segments(struct kimage *image);
 #endif
 
 #endif /* __ASSEMBLY__ */
diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
index 12012f247501..fc132047c8cd 100644
--- a/arch/arm64/kernel/machine_kexec_file.c
+++ b/arch/arm64/kernel/machine_kexec_file.c
@@ -19,6 +19,7 @@
 #include <linux/libfdt.h>
 #include <linux/memblock.h>
 #include <linux/of_fdt.h>
+#include <linux/vmalloc.h>
 
 static int __dt_root_addr_cells;
 static int __dt_root_size_cells;
@@ -32,6 +33,10 @@ int arch_kimage_file_post_load_cleanup(struct kimage *image)
 	vfree(image->arch.dtb_buf);
 	image->arch.dtb_buf = NULL;
 
+	vfree(image->arch.elf_headers);
+	image->arch.elf_headers = NULL;
+	image->arch.elf_headers_sz = 0;
+
 	return _kimage_file_post_load_cleanup(image);
 }
 
@@ -49,6 +54,78 @@ int arch_kexec_walk_mem(struct kexec_buf *kbuf,
 		return walk_system_ram_res(0, ULONG_MAX, kbuf, func);
 }
 
+static int __init arch_kexec_file_init(void)
+{
+	/* These values are used later, when loading the kernel */
+	__dt_root_addr_cells = dt_root_addr_cells;
+	__dt_root_size_cells = dt_root_size_cells;
+
+	return 0;
+}
+late_initcall(arch_kexec_file_init);
+
+#define FDT_ALIGN(x, a)	(((x) + (a) - 1) & ~((a) - 1))
+#define FDT_TAGALIGN(x)	(FDT_ALIGN((x), FDT_TAGSIZE))
+
+static int fdt_prop_len(const char *prop_name, int len)
+{
+	return (strlen(prop_name) + 1) +
+		sizeof(struct fdt_property) +
+		FDT_TAGALIGN(len);
+}
+
+static bool cells_size_fitted(unsigned long base, unsigned long size)
+{
+	/* if *_cells >= 2, cells can hold 64-bit values anyway */
+	if ((__dt_root_addr_cells == 1) && (base >= (1ULL << 32)))
+		return false;
+
+	if ((__dt_root_size_cells == 1) && (size >= (1ULL << 32)))
+		return false;
+
+	return true;
+}
+
+static void fill_property(void *buf, u64 val64, int cells)
+{
+	u32 val32;
+
+	if (cells == 1) {
+		val32 = cpu_to_fdt32((u32)val64);
+		memcpy(buf, &val32, sizeof(val32));
+	} else {
+		memset(buf, 0, cells * sizeof(u32) - sizeof(u64));
+		buf += cells * sizeof(u32) - sizeof(u64);
+
+		val64 = cpu_to_fdt64(val64);
+		memcpy(buf, &val64, sizeof(val64));
+	}
+}
+
+static int fdt_setprop_range(void *fdt, int nodeoffset, const char *name,
+				unsigned long addr, unsigned long size)
+{
+	void *buf, *prop;
+	size_t buf_size;
+	int result;
+
+	buf_size = (__dt_root_addr_cells + __dt_root_size_cells) * sizeof(u32);
+	prop = buf = vmalloc(buf_size);
+	if (!buf)
+		return -ENOMEM;
+
+	fill_property(prop, addr, __dt_root_addr_cells);
+	prop += __dt_root_addr_cells * sizeof(u32);
+
+	fill_property(prop, size, __dt_root_size_cells);
+
+	result = fdt_setprop(fdt, nodeoffset, name, buf, buf_size);
+
+	vfree(buf);
+
+	return result;
+}
+
 static int setup_dtb(struct kimage *image,
 		unsigned long initrd_load_addr, unsigned long initrd_len,
 		char *cmdline, unsigned long cmdline_len,
@@ -61,10 +138,26 @@ static int setup_dtb(struct kimage *image,
 	int range_len;
 	int ret;
 
+	/* check ranges against root's #address-cells and #size-cells */
+	if (image->type == KEXEC_TYPE_CRASH &&
+		(!cells_size_fitted(image->arch.elf_load_addr,
+				image->arch.elf_headers_sz) ||
+		 !cells_size_fitted(crashk_res.start,
+				crashk_res.end - crashk_res.start + 1))) {
+		pr_err("Crash memory region doesn't fit into DT's root cell sizes.\n");
+		ret = -EINVAL;
+		goto out_err;
+	}
+
 	/* duplicate dt blob */
 	buf_size = fdt_totalsize(initial_boot_params);
 	range_len = (__dt_root_addr_cells + __dt_root_size_cells) * sizeof(u32);
 
+	if (image->type == KEXEC_TYPE_CRASH)
+		buf_size += fdt_prop_len("linux,elfcorehdr", range_len)
+				+ fdt_prop_len("linux,usable-memory-range",
+								range_len);
+
 	if (initrd_load_addr)
 		buf_size += fdt_prop_len("initrd-start", sizeof(u64))
 				+ fdt_prop_len("initrd-end", sizeof(u64));
@@ -86,6 +179,23 @@ static int setup_dtb(struct kimage *image,
 	if (nodeoffset < 0)
 		goto out_err;
 
+	if (image->type == KEXEC_TYPE_CRASH) {
+		/* add linux,elfcorehdr */
+		ret = fdt_setprop_range(buf, nodeoffset, "linux,elfcorehdr",
+				image->arch.elf_load_addr,
+				image->arch.elf_headers_sz);
+		if (ret)
+			goto out_err;
+
+		/* add linux,usable-memory-range */
+		ret = fdt_setprop_range(buf, nodeoffset,
+				"linux,usable-memory-range",
+				crashk_res.start,
+				crashk_res.end - crashk_res.start + 1);
+		if (ret)
+			goto out_err;
+	}
+
 	/* add bootargs */
 	if (cmdline) {
 		ret = fdt_setprop(buf, nodeoffset, "bootargs",
@@ -187,3 +297,104 @@ int load_other_segments(struct kimage *image, unsigned long kernel_load_addr,
 	image->arch.dtb_buf = NULL;
 	return ret;
 }
+
+static int get_nr_ranges_callback(struct resource *res, void *arg)
+{
+	unsigned int *nr_ranges = arg;
+
+	(*nr_ranges)++;
+	return 0;
+}
+
+static int add_mem_range_callback(struct resource *res, void *arg)
+{
+	struct crash_mem *cmem = arg;
+
+	cmem->ranges[cmem->nr_ranges].start = res->start;
+	cmem->ranges[cmem->nr_ranges].end = res->end;
+	cmem->nr_ranges++;
+
+	return 0;
+}
+
+static struct crash_mem *get_crash_memory_ranges(void)
+{
+	unsigned int nr_ranges;
+	struct crash_mem *cmem;
+
+	nr_ranges = 1; /* for exclusion of crashkernel region */
+	walk_system_ram_res(0, -1, &nr_ranges, get_nr_ranges_callback);
+
+	cmem = vmalloc(sizeof(struct crash_mem) +
+			sizeof(struct crash_mem_range) * nr_ranges);
+	if (!cmem)
+		return NULL;
+
+	cmem->max_nr_ranges = nr_ranges;
+	cmem->nr_ranges = 0;
+	walk_system_ram_res(0, -1, cmem, add_mem_range_callback);
+
+	/* Exclude crashkernel region */
+	if (crash_exclude_mem_range(cmem, crashk_res.start, crashk_res.end)) {
+		vfree(cmem);
+		return NULL;
+	}
+
+	return cmem;
+}
+
+static int prepare_elf_core_header(void **addr, unsigned long *sz)
+{
+	struct crash_mem *cmem;
+	int ret = 0;
+
+	cmem = get_crash_memory_ranges();
+	if (!cmem)
+		return -ENOMEM;
+
+	/* 1: add segment for kernel map */
+	ret =  crash_prepare_elf64_headers(cmem, 1, addr, sz);
+
+	vfree(cmem);
+	return ret;
+}
+
+int load_crashdump_segments(struct kimage *image)
+{
+	void *elf_addr;
+	unsigned long elf_sz;
+	struct kexec_buf kbuf;
+	int ret;
+
+	if (image->type != KEXEC_TYPE_CRASH)
+		return 0;
+
+	ret = prepare_elf_core_header(&elf_addr, &elf_sz);
+	if (ret) {
+		pr_err("Preparing elf core header failed\n");
+		return ret;
+	}
+
+	kbuf.image = image;
+	kbuf.buffer = elf_addr;
+	kbuf.bufsz = elf_sz;
+	kbuf.memsz = elf_sz;
+	kbuf.buf_align = PAGE_SIZE;
+	kbuf.buf_min = crashk_res.start;
+	kbuf.buf_max = crashk_res.end + 1;
+	kbuf.top_down = 1;
+
+	ret = kexec_add_buffer(&kbuf);
+	if (ret) {
+		vfree(elf_addr);
+		return ret;
+	}
+	image->arch.elf_headers = elf_addr;
+	image->arch.elf_headers_sz = elf_sz;
+	image->arch.elf_load_addr = kbuf.mem;
+
+	pr_debug("Loaded elf core header at 0x%lx bufsz=0x%lx memsz=0x%lx\n",
+			 image->arch.elf_load_addr, elf_sz, elf_sz);
+
+	return ret;
+}
-- 
2.16.2

^ permalink raw reply related	[flat|nested] 102+ messages in thread

* [PATCH v8 10/13] arm64: kexec_file: add Image format support
  2018-02-22 11:17 ` AKASHI Takahiro
  (?)
@ 2018-02-22 11:17   ` AKASHI Takahiro
  -1 siblings, 0 replies; 102+ messages in thread
From: AKASHI Takahiro @ 2018-02-22 11:17 UTC (permalink / raw)
  To: catalin.marinas, will.deacon, bauerman, dhowells, vgoyal,
	herbert, davem, akpm, mpe, dyoung, bhe, arnd, ard.biesheuvel,
	julien.thierry
  Cc: kexec, linux-arm-kernel, linux-kernel, AKASHI Takahiro

This patch provides kexec_file_ops for an "Image"-format kernel. Please note
that the binary is always loaded at its text_offset from a 2MB-aligned base,
as recorded in the image header.
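
A minimal sketch of the resulting placement, with invented numbers
(text_offset comes from the image header; the base is whatever
kexec_add_buffer() picks within the allowed window):

	kbuf.buf_align = SZ_2M;
	kbuf.memsz = le64_to_cpu(h->image_size) + text_offset;
	ret = kexec_add_buffer(&kbuf);		/* say kbuf.mem == 0x80000000 */
	image->start = kbuf.mem + text_offset;	/* entered at 0x80080000 when
						 * text_offset == 0x80000 */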

While this patch doesn't contain CONFIG_KEXEC_VERIFY_SIG support, which is
to be added in a later patch in this series, file-attribute-based kernel
verification can already be achieved by enabling the IMA security subsystem.
See more details about IMA here:
    https://sourceforge.net/p/linux-ima/wiki/Home/

You can sign (label) a kernel image to be kexec-ed on the target filesystem
with:
    $ evmctl ima_sign --key /path/to/private_key.pem Image

On the live system, you must have IMA appraisal enforced with at least the
following security policy:
    "appraise func=KEXEC_KERNEL_CHECK appraise_type=imasig"

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/kexec.h         | 50 +++++++++++++++++++
 arch/arm64/kernel/Makefile             |  2 +-
 arch/arm64/kernel/kexec_image.c        | 90 ++++++++++++++++++++++++++++++++++
 arch/arm64/kernel/machine_kexec_file.c |  1 +
 4 files changed, 142 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/kernel/kexec_image.c

diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
index d7427d510e1b..592890085aae 100644
--- a/arch/arm64/include/asm/kexec.h
+++ b/arch/arm64/include/asm/kexec.h
@@ -105,6 +105,56 @@ struct kimage_arch {
 	unsigned long elf_load_addr;
 };
 
+/**
+ * struct arm64_image_header - arm64 kernel image header
+ *
+ * @pe_sig: Optional PE format 'MZ' signature
+ * @branch_code: Instruction to branch to stext
+ * @text_offset: Image load offset, little endian
+ * @image_size: Effective image size, little endian
+ * @flags:
+ *	Bit 0: Kernel endianness. 0=little endian, 1=big endian
+ * @reserved: Reserved
+ * @magic: Magic number, "ARM\x64"
+ * @pe_header: Optional offset to a PE format header
+ **/
+
+struct arm64_image_header {
+	u8 pe_sig[2];
+	u8 pad[2];
+	u32 branch_code;
+	u64 text_offset;
+	u64 image_size;
+	u64 flags;
+	u64 reserved[3];
+	u8 magic[4];
+	u32 pe_header;
+};
+
+static const u8 arm64_image_magic[4] = {'A', 'R', 'M', 0x64U};
+
+/**
+ * arm64_header_check_magic - Helper to check the arm64 image header.
+ *
+ * Returns non-zero if header is OK.
+ */
+
+static inline int arm64_header_check_magic(const struct arm64_image_header *h)
+{
+	if (!h)
+		return 0;
+
+	if (!h->text_offset)
+		return 0;
+
+	return (h->magic[0] == arm64_image_magic[0]
+		&& h->magic[1] == arm64_image_magic[1]
+		&& h->magic[2] == arm64_image_magic[2]
+		&& h->magic[3] == arm64_image_magic[3]);
+}
+
+extern const struct kexec_file_ops kexec_image_ops;
+
 struct kimage;
 
 #define arch_kimage_file_post_load_cleanup arch_kimage_file_post_load_cleanup
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 151dc890737c..454b2735603a 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -49,7 +49,7 @@ arm64-obj-$(CONFIG_RANDOMIZE_BASE)	+= kaslr.o
 arm64-obj-$(CONFIG_HIBERNATION)		+= hibernate.o hibernate-asm.o
 arm64-obj-$(CONFIG_KEXEC_CORE)		+= machine_kexec.o relocate_kernel.o	\
 					   cpu-reset.o
-arm64-obj-$(CONFIG_KEXEC_FILE)		+= machine_kexec_file.o
+arm64-obj-$(CONFIG_KEXEC_FILE)		+= machine_kexec_file.o kexec_image.o
 arm64-obj-$(CONFIG_ARM64_RELOC_TEST)	+= arm64-reloc-test.o
 arm64-reloc-test-y := reloc_test_core.o reloc_test_syms.o
 arm64-obj-$(CONFIG_CRASH_DUMP)		+= crash_dump.o
diff --git a/arch/arm64/kernel/kexec_image.c b/arch/arm64/kernel/kexec_image.c
new file mode 100644
index 000000000000..de62def63dd6
--- /dev/null
+++ b/arch/arm64/kernel/kexec_image.c
@@ -0,0 +1,90 @@
+/*
+ * Kexec image loader
+
+ * Copyright (C) 2018 Linaro Limited
+ * Author: AKASHI Takahiro <takahiro.akashi@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#define pr_fmt(fmt)	"kexec_file(Image): " fmt
+
+#include <linux/err.h>
+#include <linux/errno.h>
+#include <linux/kernel.h>
+#include <linux/kexec.h>
+#include <asm/byteorder.h>
+#include <asm/memory.h>
+
+static int image_probe(const char *kernel_buf, unsigned long kernel_len)
+{
+	const struct arm64_image_header *h;
+
+	h = (const struct arm64_image_header *)(kernel_buf);
+
+	if ((kernel_len < sizeof(*h)) || !arm64_header_check_magic(h))
+		return -EINVAL;
+
+	return 0;
+}
+
+static void *image_load(struct kimage *image, char *kernel,
+			    unsigned long kernel_len, char *initrd,
+			    unsigned long initrd_len, char *cmdline,
+			    unsigned long cmdline_len)
+{
+	struct kexec_buf kbuf;
+	struct arm64_image_header *h = (struct arm64_image_header *)kernel;
+	unsigned long text_offset;
+	int ret;
+
+	/* Create elf core header segment */
+	ret = load_crashdump_segments(image);
+	if (ret)
+		goto out;
+
+	/* Load the kernel */
+	kbuf.image = image;
+	if (image->type == KEXEC_TYPE_CRASH) {
+		kbuf.buf_min = crashk_res.start;
+		kbuf.buf_max = crashk_res.end + 1;
+	} else {
+		kbuf.buf_min = 0;
+		kbuf.buf_max = ULONG_MAX;
+	}
+	kbuf.top_down = 0;
+
+	kbuf.buffer = kernel;
+	kbuf.bufsz = kernel_len;
+	kbuf.memsz = le64_to_cpu(h->image_size);
+	text_offset = le64_to_cpu(h->text_offset);
+	kbuf.buf_align = SZ_2M;
+
+	/* Adjust kernel segment with TEXT_OFFSET */
+	kbuf.memsz += text_offset;
+
+	ret = kexec_add_buffer(&kbuf);
+	if (ret)
+		goto out;
+
+	image->segment[image->nr_segments - 1].mem += text_offset;
+	image->segment[image->nr_segments - 1].memsz -= text_offset;
+	image->start = kbuf.mem + text_offset;
+
+	pr_debug("Loaded kernel at 0x%lx bufsz=0x%lx memsz=0x%lx\n",
+		 image->start, kbuf.bufsz, kbuf.memsz);
+
+	/* Load additional data */
+	ret = load_other_segments(image, image->start,
+			    initrd, initrd_len, cmdline, cmdline_len);
+
+out:
+	return ERR_PTR(ret);
+}
+
+const struct kexec_file_ops kexec_image_ops = {
+	.probe = image_probe,
+	.load = image_load,
+};
diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
index fc132047c8cd..384146583f8d 100644
--- a/arch/arm64/kernel/machine_kexec_file.c
+++ b/arch/arm64/kernel/machine_kexec_file.c
@@ -25,6 +25,7 @@ static int __dt_root_addr_cells;
 static int __dt_root_size_cells;
 
 const struct kexec_file_ops * const kexec_file_loaders[] = {
+	&kexec_image_ops,
 	NULL
 };
 
-- 
2.16.2

^ permalink raw reply related	[flat|nested] 102+ messages in thread

* [PATCH v8 11/13] arm64: kexec_file: enable KEXEC_FILE config
  2018-02-22 11:17 ` AKASHI Takahiro
  (?)
@ 2018-02-22 11:17   ` AKASHI Takahiro
  -1 siblings, 0 replies; 102+ messages in thread
From: AKASHI Takahiro @ 2018-02-22 11:17 UTC (permalink / raw)
  To: catalin.marinas, will.deacon, bauerman, dhowells, vgoyal,
	herbert, davem, akpm, mpe, dyoung, bhe, arnd, ard.biesheuvel,
	julien.thierry
  Cc: kexec, linux-arm-kernel, linux-kernel, AKASHI Takahiro

Modify arm64/Kconfig to enable kexec_file_load support.
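
For illustration only (not part of this patch), once the option below is
set, the new syscall path can be exercised from user space via
kexec-tools, assuming a build of kexec-tools that understands '-s':

    # .config
    CONFIG_KEXEC_FILE=y

    # on the running system
    $ kexec -s -l /boot/Image --initrd=/boot/initrd.img --reuse-cmdline
    $ kexec -e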

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/Kconfig | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 7381eeb7ef8e..79ee27b8d2a0 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -829,6 +829,16 @@ config KEXEC
 	  but it is independent of the system firmware.   And like a reboot
 	  you can start any kernel with it, not just Linux.
 
+config KEXEC_FILE
+	bool "kexec file based system call"
+	select KEXEC_CORE
+	select BUILD_BIN2C
+	---help---
+	  This is new version of kexec system call. This system call is
+	  file based and takes file descriptors as system call argument
+	  for kernel and initramfs as opposed to list of segments as
+	  accepted by previous system call.
+
 config CRASH_DUMP
 	bool "Build kdump crash kernel"
 	help
-- 
2.16.2

^ permalink raw reply related	[flat|nested] 102+ messages in thread

* [PATCH v8 12/13] include: pe.h: remove message[] from mz header definition
  2018-02-22 11:17 ` AKASHI Takahiro
  (?)
@ 2018-02-22 11:17   ` AKASHI Takahiro
  -1 siblings, 0 replies; 102+ messages in thread
From: AKASHI Takahiro @ 2018-02-22 11:17 UTC (permalink / raw)
  To: catalin.marinas, will.deacon, bauerman, dhowells, vgoyal,
	herbert, davem, akpm, mpe, dyoung, bhe, arnd, ard.biesheuvel,
	julien.thierry
  Cc: kexec, linux-arm-kernel, linux-kernel, AKASHI Takahiro

The message[] field is no longer part of the mz header definition.

This change is crucial for enabling kexec_file_load on arm64 because arm64's
"Image" binary, when viewed as a PE file, carries no data for this field,
and so the following check in pefile_parse_binary() would otherwise fail:

	chkaddr(cursor, mz->peaddr, sizeof(*pe));
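
(Illustrative aside, not part of the patch: in an EFI-enabled arm64
"Image" the PE header starts right after the 64-byte arm64 header, so
mz->peaddr, stored at offset 0x3c, is typically 64; with a 64-byte
message[] still counted in sizeof(*mz), that offset can no longer satisfy
the bounds check above. This can be seen on a built Image:)

    $ od -A d -t x1 -N 2 Image         # 4d 5a ("MZ")
    $ od -A d -t d4 -j 60 -N 4 Image   # peaddr, typically 64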

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: David S. Miller <davem@davemloft.net>
---
 include/linux/pe.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/pe.h b/include/linux/pe.h
index 143ce75be5f0..3482b18a48b5 100644
--- a/include/linux/pe.h
+++ b/include/linux/pe.h
@@ -166,7 +166,7 @@ struct mz_hdr {
 	uint16_t oem_info;	/* oem specific */
 	uint16_t reserved1[10];	/* reserved */
 	uint32_t peaddr;	/* address of pe header */
-	char     message[64];	/* message to print */
+	char     message[];	/* message to print */
 };
 
 struct mz_reloc {
-- 
2.16.2

^ permalink raw reply related	[flat|nested] 102+ messages in thread

* [PATCH v8 13/13] arm64: kexec_file: enable KEXEC_VERIFY_SIG for Image
  2018-02-22 11:17 ` AKASHI Takahiro
  (?)
@ 2018-02-22 11:17   ` AKASHI Takahiro
  -1 siblings, 0 replies; 102+ messages in thread
From: AKASHI Takahiro @ 2018-02-22 11:17 UTC (permalink / raw)
  To: catalin.marinas, will.deacon, bauerman, dhowells, vgoyal,
	herbert, davem, akpm, mpe, dyoung, bhe, arnd, ard.biesheuvel,
	julien.thierry
  Cc: kexec, linux-arm-kernel, linux-kernel, AKASHI Takahiro

With this patch, kernel image verification can be done without the IMA
security subsystem; turn on CONFIG_KEXEC_VERIFY_SIG instead.
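
(For reference, the relevant .config fragment would then look like this;
CONFIG_EFI and SIGNED_PE_FILE_VERIFICATION are also needed, see below:)

    CONFIG_KEXEC_FILE=y
    CONFIG_KEXEC_VERIFY_SIG=y
    CONFIG_KEXEC_IMAGE_VERIFY_SIG=y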

On x86, a signature is embedded into the PE-format (Microsoft's format)
header of the binary. Since arm64's "Image" can also be seen as a PE file
as long as CONFIG_EFI is enabled, we adopt this format for kernel signing.

You can create a signed kernel image with:
    $ sbsign --key ${KEY} --cert ${CERT} Image
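
For example (illustrative only; file names are placeholders, and the
certificate must be trusted by the kernel, e.g. via the system trusted
keyring), a self-signed test key can be generated and used like this:

    $ openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
          -subj "/CN=kexec test key/" \
          -keyout kexec_key.pem -out kexec_cert.pem
    $ sbsign --key kexec_key.pem --cert kexec_cert.pem \
          --output Image.signed Image
    $ sbverify --cert kexec_cert.pem Image.signed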

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/Kconfig              | 24 ++++++++++++++++++++++++
 arch/arm64/include/asm/kexec.h  | 16 ++++++++++++++++
 arch/arm64/kernel/kexec_image.c | 15 +++++++++++++++
 3 files changed, 55 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 79ee27b8d2a0..e400edc291d4 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -839,6 +839,30 @@ config KEXEC_FILE
 	  for kernel and initramfs as opposed to list of segments as
 	  accepted by previous system call.
 
+config KEXEC_VERIFY_SIG
+	bool "Verify kernel signature during kexec_file_load() syscall"
+	depends on KEXEC_FILE
+	---help---
+	  Select this option to verify a signature with loaded kernel
+	  image. If configured, any attempt of loading a image without
+	  valid signature will fail.
+
+	  In addition to that option, you need to enable signature
+	  verification for the corresponding kernel image type being
+	  loaded in order for this to work.
+
+config KEXEC_IMAGE_VERIFY_SIG
+	bool "Enable Image signature verification support"
+	default y
+	depends on KEXEC_VERIFY_SIG
+	depends on EFI && SIGNED_PE_FILE_VERIFICATION
+	---help---
+	  Enable Image signature verification support.
+
+comment "Image signature verification is missing yet"
+	depends on KEXEC_VERIFY_SIG
+	depends on !EFI || !SIGNED_PE_FILE_VERIFICATION
+
 config CRASH_DUMP
 	bool "Build kdump crash kernel"
 	help
diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
index 592890085aae..85f6913f868f 100644
--- a/arch/arm64/include/asm/kexec.h
+++ b/arch/arm64/include/asm/kexec.h
@@ -132,6 +132,7 @@ struct arm64_image_header {
 };
 
 static const u8 arm64_image_magic[4] = {'A', 'R', 'M', 0x64U};
+static const u8 arm64_image_pe_sig[2] = {'M', 'Z'};
 
 /**
  * arm64_header_check_magic - Helper to check the arm64 image header.
@@ -153,6 +154,21 @@ static inline int arm64_header_check_magic(const struct arm64_image_header *h)
 		&& h->magic[3] == arm64_image_magic[3]);
 }
 
+/**
+ * arm64_header_check_pe_sig - Helper to check the arm64 image header.
+ *
+ * Returns non-zero if 'MZ' signature is found.
+ */
+
+static inline int arm64_header_check_pe_sig(const struct arm64_image_header *h)
+{
+	if (!h)
+		return 0;
+
+	return (h->pe_sig[0] == arm64_image_pe_sig[0]
+		&& h->pe_sig[1] == arm64_image_pe_sig[1]);
+}
+
 extern const struct kexec_file_ops kexec_image_ops;
 
 struct kimage;
diff --git a/arch/arm64/kernel/kexec_image.c b/arch/arm64/kernel/kexec_image.c
index de62def63dd6..816d5faf491d 100644
--- a/arch/arm64/kernel/kexec_image.c
+++ b/arch/arm64/kernel/kexec_image.c
@@ -15,6 +15,7 @@
 #include <linux/errno.h>
 #include <linux/kernel.h>
 #include <linux/kexec.h>
+#include <linux/verification.h>
 #include <asm/byteorder.h>
 #include <asm/memory.h>
 
@@ -27,6 +28,9 @@ static int image_probe(const char *kernel_buf, unsigned long kernel_len)
 	if ((kernel_len < sizeof(*h)) || !arm64_header_check_magic(h))
 		return -EINVAL;
 
+	pr_debug("PE format: %s\n",
+			(arm64_header_check_pe_sig(h) ? "yes" : "no"));
+
 	return 0;
 }
 
@@ -84,7 +88,18 @@ static void *image_load(struct kimage *image, char *kernel,
 	return ERR_PTR(ret);
 }
 
+#ifdef CONFIG_KEXEC_IMAGE_VERIFY_SIG
+static int image_verify_sig(const char *kernel, unsigned long kernel_len)
+{
+	return verify_pefile_signature(kernel, kernel_len, NULL,
+				       VERIFYING_KEXEC_PE_SIGNATURE);
+}
+#endif
+
 const struct kexec_file_ops kexec_image_ops = {
 	.probe = image_probe,
 	.load = image_load,
+#ifdef CONFIG_KEXEC_IMAGE_VERIFY_SIG
+	.verify_sig = image_verify_sig,
+#endif
 };
-- 
2.16.2

^ permalink raw reply related	[flat|nested] 102+ messages in thread

* Re: [PATCH v8 01/13] resource: add walk_system_ram_res_rev()
  2018-02-22 11:17   ` AKASHI Takahiro
  (?)
@ 2018-02-23  8:36     ` Dave Young
  -1 siblings, 0 replies; 102+ messages in thread
From: Dave Young @ 2018-02-23  8:36 UTC (permalink / raw)
  To: AKASHI Takahiro
  Cc: catalin.marinas, will.deacon, bauerman, dhowells, vgoyal,
	herbert, davem, akpm, mpe, bhe, arnd, ard.biesheuvel,
	julien.thierry, kexec, linux-arm-kernel, linux-kernel,
	Linus Torvalds

Hi AKASHI,

On 02/22/18 at 08:17pm, AKASHI Takahiro wrote:
> This function, being a variant of walk_system_ram_res() introduced in
> commit 8c86e70acead ("resource: provide new functions to walk through
> resources"), walks through a list of all the resources of System RAM
> in reversed order, i.e., from higher to lower.
> 
> It will be used in kexec_file implementation on arm64.

I remember there was an old discussion about this; the patch log should
explain why this is needed.
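
(For context, illustrative only: the resources being walked are the
"System RAM" entries visible in /proc/iomem, and the new helper simply
visits those same ranges highest-first:)

    $ grep "System RAM" /proc/iomem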

> 
> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> Cc: Vivek Goyal <vgoyal@redhat.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> ---
>  include/linux/ioport.h |  3 +++
>  kernel/resource.c      | 57 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 60 insertions(+)
> 
> diff --git a/include/linux/ioport.h b/include/linux/ioport.h
> index da0ebaec25f0..f12d95fe038b 100644
> --- a/include/linux/ioport.h
> +++ b/include/linux/ioport.h
> @@ -277,6 +277,9 @@ extern int
>  walk_system_ram_res(u64 start, u64 end, void *arg,
>  		    int (*func)(struct resource *, void *));
>  extern int
> +walk_system_ram_res_rev(u64 start, u64 end, void *arg,
> +			int (*func)(struct resource *, void *));
> +extern int
>  walk_iomem_res_desc(unsigned long desc, unsigned long flags, u64 start, u64 end,
>  		    void *arg, int (*func)(struct resource *, void *));
>  
> diff --git a/kernel/resource.c b/kernel/resource.c
> index e270b5048988..bdaa93407f4c 100644
> --- a/kernel/resource.c
> +++ b/kernel/resource.c
> @@ -23,6 +23,8 @@
>  #include <linux/pfn.h>
>  #include <linux/mm.h>
>  #include <linux/resource_ext.h>
> +#include <linux/string.h>
> +#include <linux/vmalloc.h>
>  #include <asm/io.h>
>  
>  
> @@ -486,6 +488,61 @@ int walk_mem_res(u64 start, u64 end, void *arg,
>  				     arg, func);
>  }
>  
> +int walk_system_ram_res_rev(u64 start, u64 end, void *arg,
> +				int (*func)(struct resource *, void *))
> +{
> +	struct resource res, *rams;
> +	int rams_size = 16, i;
> +	int ret = -1;
> +
> +	/* create a list */
> +	rams = vmalloc(sizeof(struct resource) * rams_size);
> +	if (!rams)
> +		return ret;
> +
> +	res.start = start;
> +	res.end = end;
> +	res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
> +	i = 0;
> +	while ((res.start < res.end) &&
> +		(!find_next_iomem_res(&res, IORES_DESC_NONE, true))) {
> +		if (i >= rams_size) {
> +			/* re-alloc */
> +			struct resource *rams_new;
> +			int rams_new_size;
> +
> +			rams_new_size = rams_size + 16;
> +			rams_new = vmalloc(sizeof(struct resource)
> +							* rams_new_size);
> +			if (!rams_new)
> +				goto out;
> +
> +			memcpy(rams_new, rams,
> +					sizeof(struct resource) * rams_size);
> +			vfree(rams);
> +			rams = rams_new;
> +			rams_size = rams_new_size;
> +		}
> +
> +		rams[i].start = res.start;
> +		rams[i++].end = res.end;
> +
> +		res.start = res.end + 1;
> +		res.end = end;
> +	}
> +
> +	/* go reverse */
> +	for (i--; i >= 0; i--) {
> +		ret = (*func)(&rams[i], arg);
> +		if (ret)
> +			break;
> +	}
> +
> +out:
> +	vfree(rams);
> +	return ret;
> +}
> +
>  #if !defined(CONFIG_ARCH_HAS_WALK_MEMORY)
>  
>  /*
> -- 
> 2.16.2
> 

Thanks
Dave

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v8 02/13] kexec_file: make an use of purgatory optional
  2018-02-22 11:17   ` AKASHI Takahiro
  (?)
@ 2018-02-23  8:49     ` Dave Young
  -1 siblings, 0 replies; 102+ messages in thread
From: Dave Young @ 2018-02-23  8:49 UTC (permalink / raw)
  To: AKASHI Takahiro
  Cc: catalin.marinas, will.deacon, bauerman, dhowells, vgoyal,
	herbert, davem, akpm, mpe, bhe, arnd, ard.biesheuvel,
	julien.thierry, kexec, linux-arm-kernel, linux-kernel

Hi AKASHI,

On 02/22/18 at 08:17pm, AKASHI Takahiro wrote:
> On arm64, no trampoline code between old kernel and new kernel will be
> required in kexec_file implementation. This patch introduces a new
> configuration, ARCH_HAS_KEXEC_PURGATORY, and allows related code to be
> compiled in only if necessary.

Here, too, an explanation is needed of why no purgatory is required;
purgatory would normally be expected for kexec unless there is a strong
reason not to use one.

> 
> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> Cc: Dave Young <dyoung@redhat.com>
> Cc: Vivek Goyal <vgoyal@redhat.com>
> Cc: Baoquan He <bhe@redhat.com>
> ---
>  arch/powerpc/Kconfig | 3 +++
>  arch/x86/Kconfig     | 3 +++
>  kernel/kexec_file.c  | 6 ++++++
>  3 files changed, 12 insertions(+)
> 
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 73ce5dd07642..c32a181a7cbb 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -552,6 +552,9 @@ config KEXEC_FILE
>  	  for kernel and initramfs as opposed to a list of segments as is the
>  	  case for the older kexec call.
>  
> +config ARCH_HAS_KEXEC_PURGATORY
> +	def_bool KEXEC_FILE
> +
>  config RELOCATABLE
>  	bool "Build a relocatable kernel"
>  	depends on PPC64 || (FLATMEM && (44x || FSL_BOOKE))
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index c1236b187824..f031c3efe47e 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -2019,6 +2019,9 @@ config KEXEC_FILE
>  	  for kernel and initramfs as opposed to list of segments as
>  	  accepted by previous system call.
>  
> +config ARCH_HAS_KEXEC_PURGATORY
> +	def_bool KEXEC_FILE
> +
>  config KEXEC_VERIFY_SIG
>  	bool "Verify kernel signature during kexec_file_load() syscall"
>  	depends on KEXEC_FILE
> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> index e5bcd94c1efb..990adae52151 100644
> --- a/kernel/kexec_file.c
> +++ b/kernel/kexec_file.c
> @@ -26,7 +26,11 @@
>  #include <linux/vmalloc.h>
>  #include "kexec_internal.h"
>  
> +#ifdef CONFIG_ARCH_HAS_KEXEC_PURGATORY
>  static int kexec_calculate_store_digests(struct kimage *image);
> +#else
> +static int kexec_calculate_store_digests(struct kimage *image) { return 0; };
> +#endif
>  
>  /* Architectures can provide this probe function */
>  int __weak arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
> @@ -520,6 +524,7 @@ int kexec_add_buffer(struct kexec_buf *kbuf)
>  	return 0;
>  }
>  
> +#ifdef CONFIG_ARCH_HAS_KEXEC_PURGATORY
>  /* Calculate and store the digest of segments */
>  static int kexec_calculate_store_digests(struct kimage *image)
>  {
> @@ -1022,3 +1027,4 @@ int kexec_purgatory_get_set_symbol(struct kimage *image, const char *name,
>  
>  	return 0;
>  }
> +#endif /* CONFIG_ARCH_HAS_KEXEC_PURGATORY */
> -- 
> 2.16.2
> 

Thanks
Dave

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v8 02/13] kexec_file: make an use of purgatory optional
@ 2018-02-23  8:49     ` Dave Young
  0 siblings, 0 replies; 102+ messages in thread
From: Dave Young @ 2018-02-23  8:49 UTC (permalink / raw)
  To: AKASHI Takahiro
  Cc: herbert, bhe, ard.biesheuvel, catalin.marinas, julien.thierry,
	will.deacon, linux-kernel, kexec, dhowells, arnd,
	linux-arm-kernel, mpe, bauerman, akpm, davem, vgoyal

Hi AKASHI,

On 02/22/18 at 08:17pm, AKASHI Takahiro wrote:
> On arm64, no trampline code between old kernel and new kernel will be
> required in kexec_file implementation. This patch introduces a new
> configuration, ARCH_HAS_KEXEC_PURGATORY, and allows related code to be
> compiled in only if necessary.

Here also need the explanation about why no purgatory is needed, it would be
required for kexec if no strong reason.

> 
> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> Cc: Dave Young <dyoung@redhat.com>
> Cc: Vivek Goyal <vgoyal@redhat.com>
> Cc: Baoquan He <bhe@redhat.com>
> ---
>  arch/powerpc/Kconfig | 3 +++
>  arch/x86/Kconfig     | 3 +++
>  kernel/kexec_file.c  | 6 ++++++
>  3 files changed, 12 insertions(+)
> 
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 73ce5dd07642..c32a181a7cbb 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -552,6 +552,9 @@ config KEXEC_FILE
>  	  for kernel and initramfs as opposed to a list of segments as is the
>  	  case for the older kexec call.
>  
> +config ARCH_HAS_KEXEC_PURGATORY
> +	def_bool KEXEC_FILE
> +
>  config RELOCATABLE
>  	bool "Build a relocatable kernel"
>  	depends on PPC64 || (FLATMEM && (44x || FSL_BOOKE))
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index c1236b187824..f031c3efe47e 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -2019,6 +2019,9 @@ config KEXEC_FILE
>  	  for kernel and initramfs as opposed to list of segments as
>  	  accepted by previous system call.
>  
> +config ARCH_HAS_KEXEC_PURGATORY
> +	def_bool KEXEC_FILE
> +
>  config KEXEC_VERIFY_SIG
>  	bool "Verify kernel signature during kexec_file_load() syscall"
>  	depends on KEXEC_FILE
> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> index e5bcd94c1efb..990adae52151 100644
> --- a/kernel/kexec_file.c
> +++ b/kernel/kexec_file.c
> @@ -26,7 +26,11 @@
>  #include <linux/vmalloc.h>
>  #include "kexec_internal.h"
>  
> +#ifdef CONFIG_ARCH_HAS_KEXEC_PURGATORY
>  static int kexec_calculate_store_digests(struct kimage *image);
> +#else
> +static int kexec_calculate_store_digests(struct kimage *image) { return 0; };
> +#endif
>  
>  /* Architectures can provide this probe function */
>  int __weak arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
> @@ -520,6 +524,7 @@ int kexec_add_buffer(struct kexec_buf *kbuf)
>  	return 0;
>  }
>  
> +#ifdef CONFIG_ARCH_HAS_KEXEC_PURGATORY
>  /* Calculate and store the digest of segments */
>  static int kexec_calculate_store_digests(struct kimage *image)
>  {
> @@ -1022,3 +1027,4 @@ int kexec_purgatory_get_set_symbol(struct kimage *image, const char *name,
>  
>  	return 0;
>  }
> +#endif /* CONFIG_ARCH_HAS_KEXEC_PURGATORY */
> -- 
> 2.16.2
> 

Thanks
Dave

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v8 03/13] kexec_file,x86,powerpc: factor out kexec_file_ops functions
  2018-02-22 11:17   ` AKASHI Takahiro
  (?)
@ 2018-02-23  9:24     ` Dave Young
  -1 siblings, 0 replies; 102+ messages in thread
From: Dave Young @ 2018-02-23  9:24 UTC (permalink / raw)
  To: AKASHI Takahiro
  Cc: catalin.marinas, will.deacon, bauerman, dhowells, vgoyal,
	herbert, davem, akpm, mpe, bhe, arnd, ard.biesheuvel,
	julien.thierry, kexec, linux-arm-kernel, linux-kernel

Hi AKASHI,

On 02/22/18 at 08:17pm, AKASHI Takahiro wrote:
> As arch_kexec_kernel_*_{probe,load}(), arch_kimage_file_post_load_cleanup()
> and arch_kexec_kernel_verify_sig() can be parameterized with a kexec_file_ops
> array and are now duplicated among some architectures, let's factor them out.
> 
> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> Cc: Dave Young <dyoung@redhat.com>
> Cc: Vivek Goyal <vgoyal@redhat.com>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
> ---
>  arch/powerpc/include/asm/kexec.h            |  2 +-
>  arch/powerpc/kernel/kexec_elf_64.c          |  2 +-
>  arch/powerpc/kernel/machine_kexec_file_64.c | 39 ++------------------
>  arch/x86/include/asm/kexec-bzimage64.h      |  2 +-
>  arch/x86/kernel/kexec-bzimage64.c           |  2 +-
>  arch/x86/kernel/machine_kexec_64.c          | 45 +----------------------
>  include/linux/kexec.h                       | 15 ++++----
>  kernel/kexec_file.c                         | 57 +++++++++++++++++++++++++++--
>  8 files changed, 70 insertions(+), 94 deletions(-)
> 

[snip]

> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> index 990adae52151..a6d14a768b3e 100644
> --- a/kernel/kexec_file.c
> +++ b/kernel/kexec_file.c
> @@ -26,34 +26,83 @@
>  #include <linux/vmalloc.h>
>  #include "kexec_internal.h"
>  
> +const __weak struct kexec_file_ops * const kexec_file_loaders[] = {NULL};
> +
>  #ifdef CONFIG_ARCH_HAS_KEXEC_PURGATORY
>  static int kexec_calculate_store_digests(struct kimage *image);
>  #else
>  static int kexec_calculate_store_digests(struct kimage *image) { return 0; };
>  #endif
>  
> +int _kexec_kernel_image_probe(struct kimage *image, void *buf,
> +			     unsigned long buf_len)
> +{
> +	const struct kexec_file_ops * const *fops;
> +	int ret = -ENOEXEC;
> +
> +	for (fops = &kexec_file_loaders[0]; *fops && (*fops)->probe; ++fops) {
> +		ret = (*fops)->probe(buf, buf_len);
> +		if (!ret) {
> +			image->fops = *fops;
> +			return ret;
> +		}
> +	}
> +
> +	return ret;
> +}
> +
>  /* Architectures can provide this probe function */
>  int __weak arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
>  					 unsigned long buf_len)
>  {
> -	return -ENOEXEC;
> +	return _kexec_kernel_image_probe(image, buf, buf_len);


I vaguely remember that I previously suggested splitting out
_kexec_kernel_image_probe() so that arch code could call it while the
common code also uses it, as above.  But in your new series I do not find
any caller of this function other than the common
arch_kexec_kernel_image_probe().  If nobody else uses it, it is not worth
splitting it out; it would be better to just embed it in the __weak
function.

Ditto for other similar functions.
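
For illustration, embedding that loop directly in the weak default would
look roughly like the following (an untested sketch that only reuses the
names already visible in the hunk above):

	int __weak arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
						 unsigned long buf_len)
	{
		const struct kexec_file_ops * const *fops;
		int ret = -ENOEXEC;

		/* walk the arch-provided loader table and take the first match */
		for (fops = &kexec_file_loaders[0]; *fops && (*fops)->probe; ++fops) {
			ret = (*fops)->probe(buf, buf_len);
			if (!ret) {
				image->fops = *fops;
				return ret;
			}
		}

		return ret;
	}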

[snip]

Thanks
Dave

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v8 04/13] x86: kexec_file: factor out elf core header related functions
  2018-02-22 11:17   ` AKASHI Takahiro
  (?)
@ 2018-02-24  3:15     ` Dave Young
  -1 siblings, 0 replies; 102+ messages in thread
From: Dave Young @ 2018-02-24  3:15 UTC (permalink / raw)
  To: AKASHI Takahiro
  Cc: catalin.marinas, will.deacon, bauerman, dhowells, vgoyal,
	herbert, davem, akpm, mpe, bhe, arnd, ard.biesheuvel,
	julien.thierry, kexec, linux-arm-kernel, linux-kernel

Hi AKASHI,
On 02/22/18 at 08:17pm, AKASHI Takahiro wrote:
> exclude_mem_range() and prepare_elf64_headers() can be re-used on other
> architectures, including arm64, as well. So factor them out so that they
> can be moved to the generic side in the next patch.
> 
> fill_up_crash_elf_data() can potentially be made common for most
> architectures that want to walk io resources (/proc/iomem) for a list
> of "System RAM", but leave it private for now.

Is it possible to split this patch into smaller patches?  For example, one
patch could change the fixed maximum number of ranges to a dynamically
allocated buffer.

The remaining parts could be split up as well, so that they are easier
to review.
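
For reference, the dynamically sized buffer in question is just a
flexible-array crash_mem sized from the walk_system_ram_res() count; a
minimal sketch, using the names from the hunk below:

	struct crash_mem {
		unsigned int max_nr_ranges;
		unsigned int nr_ranges;
		struct crash_mem_range ranges[0];
	};

	/* two extra slots for splits caused by crashk_res/crashk_low_res */
	nr_ranges += 2;
	cmem = vmalloc(sizeof(struct crash_mem) +
			sizeof(struct crash_mem_range) * nr_ranges);
	if (!cmem)
		return NULL;
	cmem->max_nr_ranges = nr_ranges;
	cmem->nr_ranges = 0;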

> 
> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> Cc: Dave Young <dyoung@redhat.com>
> Cc: Vivek Goyal <vgoyal@redhat.com>
> Cc: Baoquan He <bhe@redhat.com>
> ---
>  arch/x86/kernel/crash.c | 235 +++++++++++++++++++++---------------------------
>  1 file changed, 103 insertions(+), 132 deletions(-)
> 
> diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
> index 10e74d4778a1..5c19cfbf3b85 100644
> --- a/arch/x86/kernel/crash.c
> +++ b/arch/x86/kernel/crash.c
> @@ -41,32 +41,14 @@
>  /* Alignment required for elf header segment */
>  #define ELF_CORE_HEADER_ALIGN   4096
>  
> -/* This primarily represents number of split ranges due to exclusion */
> -#define CRASH_MAX_RANGES	16
> -
>  struct crash_mem_range {
>  	u64 start, end;
>  };
>  
>  struct crash_mem {
> -	unsigned int nr_ranges;
> -	struct crash_mem_range ranges[CRASH_MAX_RANGES];
> -};
> -
> -/* Misc data about ram ranges needed to prepare elf headers */
> -struct crash_elf_data {
> -	struct kimage *image;
> -	/*
> -	 * Total number of ram ranges we have after various adjustments for
> -	 * crash reserved region, etc.
> -	 */
>  	unsigned int max_nr_ranges;
> -
> -	/* Pointer to elf header */
> -	void *ehdr;
> -	/* Pointer to next phdr */
> -	void *bufp;
> -	struct crash_mem mem;
> +	unsigned int nr_ranges;
> +	struct crash_mem_range ranges[0];
>  };
>  
>  /* Used while preparing memory map entries for second kernel */
> @@ -217,29 +199,32 @@ static int get_nr_ram_ranges_callback(struct resource *res, void *arg)
>  	return 0;
>  }
>  
> -
>  /* Gather all the required information to prepare elf headers for ram regions */
> -static void fill_up_crash_elf_data(struct crash_elf_data *ced,
> -				   struct kimage *image)
> +static struct crash_mem *fill_up_crash_elf_data(void)
>  {
>  	unsigned int nr_ranges = 0;
> -
> -	ced->image = image;
> +	struct crash_mem *cmem;
>  
>  	walk_system_ram_res(0, -1, &nr_ranges,
>  				get_nr_ram_ranges_callback);
>  
> -	ced->max_nr_ranges = nr_ranges;
> +	/*
> +	 * Exclusion of crash region and/or crashk_low_res may cause
> +	 * another range split. So add extra two slots here.
> +	 */
> +	nr_ranges += 2;
> +	cmem = vmalloc(sizeof(struct crash_mem) +
> +			sizeof(struct crash_mem_range) * nr_ranges);
> +	if (!cmem)
> +		return NULL;
>  
> -	/* Exclusion of crash region could split memory ranges */
> -	ced->max_nr_ranges++;
> +	cmem->max_nr_ranges = nr_ranges;
> +	cmem->nr_ranges = 0;
>  
> -	/* If crashk_low_res is not 0, another range split possible */
> -	if (crashk_low_res.end)
> -		ced->max_nr_ranges++;
> +	return cmem;
>  }
>  
> -static int exclude_mem_range(struct crash_mem *mem,
> +static int crash_exclude_mem_range(struct crash_mem *mem,
>  		unsigned long long mstart, unsigned long long mend)
>  {
>  	int i, j;
> @@ -293,10 +278,8 @@ static int exclude_mem_range(struct crash_mem *mem,
>  		return 0;
>  
>  	/* Split happened */
> -	if (i == CRASH_MAX_RANGES - 1) {
> -		pr_err("Too many crash ranges after split\n");
> +	if (i == mem->max_nr_ranges - 1)
>  		return -ENOMEM;
> -	}
>  
>  	/* Location where new range should go */
>  	j = i + 1;
> @@ -314,27 +297,20 @@ static int exclude_mem_range(struct crash_mem *mem,
>  
>  /*
>   * Look for any unwanted ranges between mstart, mend and remove them. This
> - * might lead to split and split ranges are put in ced->mem.ranges[] array
> + * might lead to split and split ranges are put in cmem->ranges[] array
>   */
> -static int elf_header_exclude_ranges(struct crash_elf_data *ced,
> -		unsigned long long mstart, unsigned long long mend)
> +static int elf_header_exclude_ranges(struct crash_mem *cmem)
>  {
> -	struct crash_mem *cmem = &ced->mem;
>  	int ret = 0;
>  
> -	memset(cmem->ranges, 0, sizeof(cmem->ranges));
> -
> -	cmem->ranges[0].start = mstart;
> -	cmem->ranges[0].end = mend;
> -	cmem->nr_ranges = 1;
> -
>  	/* Exclude crashkernel region */
> -	ret = exclude_mem_range(cmem, crashk_res.start, crashk_res.end);
> +	ret = crash_exclude_mem_range(cmem, crashk_res.start, crashk_res.end);
>  	if (ret)
>  		return ret;
>  
>  	if (crashk_low_res.end) {
> -		ret = exclude_mem_range(cmem, crashk_low_res.start, crashk_low_res.end);
> +		ret = crash_exclude_mem_range(cmem, crashk_low_res.start,
> +							crashk_low_res.end);
>  		if (ret)
>  			return ret;
>  	}
> @@ -344,70 +320,29 @@ static int elf_header_exclude_ranges(struct crash_elf_data *ced,
>  
>  static int prepare_elf64_ram_headers_callback(struct resource *res, void *arg)
>  {
> -	struct crash_elf_data *ced = arg;
> -	Elf64_Ehdr *ehdr;
> -	Elf64_Phdr *phdr;
> -	unsigned long mstart, mend;
> -	struct kimage *image = ced->image;
> -	struct crash_mem *cmem;
> -	int ret, i;
> +	struct crash_mem *cmem = arg;
>  
> -	ehdr = ced->ehdr;
> -
> -	/* Exclude unwanted mem ranges */
> -	ret = elf_header_exclude_ranges(ced, res->start, res->end);
> -	if (ret)
> -		return ret;
> -
> -	/* Go through all the ranges in ced->mem.ranges[] and prepare phdr */
> -	cmem = &ced->mem;
> -
> -	for (i = 0; i < cmem->nr_ranges; i++) {
> -		mstart = cmem->ranges[i].start;
> -		mend = cmem->ranges[i].end;
> -
> -		phdr = ced->bufp;
> -		ced->bufp += sizeof(Elf64_Phdr);
> -
> -		phdr->p_type = PT_LOAD;
> -		phdr->p_flags = PF_R|PF_W|PF_X;
> -		phdr->p_offset  = mstart;
> -
> -		/*
> -		 * If a range matches backup region, adjust offset to backup
> -		 * segment.
> -		 */
> -		if (mstart == image->arch.backup_src_start &&
> -		    (mend - mstart + 1) == image->arch.backup_src_sz)
> -			phdr->p_offset = image->arch.backup_load_addr;
> -
> -		phdr->p_paddr = mstart;
> -		phdr->p_vaddr = (unsigned long long) __va(mstart);
> -		phdr->p_filesz = phdr->p_memsz = mend - mstart + 1;
> -		phdr->p_align = 0;
> -		ehdr->e_phnum++;
> -		pr_debug("Crash PT_LOAD elf header. phdr=%p vaddr=0x%llx, paddr=0x%llx, sz=0x%llx e_phnum=%d p_offset=0x%llx\n",
> -			phdr, phdr->p_vaddr, phdr->p_paddr, phdr->p_filesz,
> -			ehdr->e_phnum, phdr->p_offset);
> -	}
> +	cmem->ranges[cmem->nr_ranges].start = res->start;
> +	cmem->ranges[cmem->nr_ranges].end = res->end;
> +	cmem->nr_ranges++;
>  
> -	return ret;
> +	return 0;
>  }
>  
> -static int prepare_elf64_headers(struct crash_elf_data *ced,
> -		void **addr, unsigned long *sz)
> +static int crash_prepare_elf64_headers(struct crash_mem *cmem, int kernel_map,
> +					void **addr, unsigned long *sz)
>  {
>  	Elf64_Ehdr *ehdr;
>  	Elf64_Phdr *phdr;
>  	unsigned long nr_cpus = num_possible_cpus(), nr_phdr, elf_sz;
> -	unsigned char *buf, *bufp;
> -	unsigned int cpu;
> +	unsigned char *buf;
> +	unsigned int cpu, i;
>  	unsigned long long notes_addr;
> -	int ret;
> +	unsigned long mstart, mend;
>  
>  	/* extra phdr for vmcoreinfo elf note */
>  	nr_phdr = nr_cpus + 1;
> -	nr_phdr += ced->max_nr_ranges;
> +	nr_phdr += cmem->nr_ranges;
>  
>  	/*
>  	 * kexec-tools creates an extra PT_LOAD phdr for kernel text mapping
> @@ -425,9 +360,8 @@ static int prepare_elf64_headers(struct crash_elf_data *ced,
>  	if (!buf)
>  		return -ENOMEM;
>  
> -	bufp = buf;
> -	ehdr = (Elf64_Ehdr *)bufp;
> -	bufp += sizeof(Elf64_Ehdr);
> +	ehdr = (Elf64_Ehdr *)buf;
> +	phdr = (Elf64_Phdr *)(ehdr + 1);
>  	memcpy(ehdr->e_ident, ELFMAG, SELFMAG);
>  	ehdr->e_ident[EI_CLASS] = ELFCLASS64;
>  	ehdr->e_ident[EI_DATA] = ELFDATA2LSB;
> @@ -443,42 +377,51 @@ static int prepare_elf64_headers(struct crash_elf_data *ced,
>  
>  	/* Prepare one phdr of type PT_NOTE for each present cpu */
>  	for_each_present_cpu(cpu) {
> -		phdr = (Elf64_Phdr *)bufp;
> -		bufp += sizeof(Elf64_Phdr);
>  		phdr->p_type = PT_NOTE;
>  		notes_addr = per_cpu_ptr_to_phys(per_cpu_ptr(crash_notes, cpu));
>  		phdr->p_offset = phdr->p_paddr = notes_addr;
>  		phdr->p_filesz = phdr->p_memsz = sizeof(note_buf_t);
>  		(ehdr->e_phnum)++;
> +		phdr++;
>  	}
>  
>  	/* Prepare one PT_NOTE header for vmcoreinfo */
> -	phdr = (Elf64_Phdr *)bufp;
> -	bufp += sizeof(Elf64_Phdr);
>  	phdr->p_type = PT_NOTE;
>  	phdr->p_offset = phdr->p_paddr = paddr_vmcoreinfo_note();
>  	phdr->p_filesz = phdr->p_memsz = VMCOREINFO_NOTE_SIZE;
>  	(ehdr->e_phnum)++;
> +	phdr++;
>  
> -#ifdef CONFIG_X86_64
>  	/* Prepare PT_LOAD type program header for kernel text region */
> -	phdr = (Elf64_Phdr *)bufp;
> -	bufp += sizeof(Elf64_Phdr);
> -	phdr->p_type = PT_LOAD;
> -	phdr->p_flags = PF_R|PF_W|PF_X;
> -	phdr->p_vaddr = (Elf64_Addr)_text;
> -	phdr->p_filesz = phdr->p_memsz = _end - _text;
> -	phdr->p_offset = phdr->p_paddr = __pa_symbol(_text);
> -	(ehdr->e_phnum)++;
> -#endif
> +	if (kernel_map) {
> +		phdr->p_type = PT_LOAD;
> +		phdr->p_flags = PF_R|PF_W|PF_X;
> +		phdr->p_vaddr = (Elf64_Addr)_text;
> +		phdr->p_filesz = phdr->p_memsz = _end - _text;
> +		phdr->p_offset = phdr->p_paddr = __pa_symbol(_text);
> +		ehdr->e_phnum++;
> +		phdr++;
> +	}
>  
> -	/* Prepare PT_LOAD headers for system ram chunks. */
> -	ced->ehdr = ehdr;
> -	ced->bufp = bufp;
> -	ret = walk_system_ram_res(0, -1, ced,
> -			prepare_elf64_ram_headers_callback);
> -	if (ret < 0)
> -		return ret;
> +	/* Go through all the ranges in cmem->ranges[] and prepare phdr */
> +	for (i = 0; i < cmem->nr_ranges; i++) {
> +		mstart = cmem->ranges[i].start;
> +		mend = cmem->ranges[i].end;
> +
> +		phdr->p_type = PT_LOAD;
> +		phdr->p_flags = PF_R|PF_W|PF_X;
> +		phdr->p_offset  = mstart;
> +
> +		phdr->p_paddr = mstart;
> +		phdr->p_vaddr = (unsigned long long) __va(mstart);
> +		phdr->p_filesz = phdr->p_memsz = mend - mstart + 1;
> +		phdr->p_align = 0;
> +		ehdr->e_phnum++;
> +		phdr++;
> +		pr_debug("Crash PT_LOAD elf header. phdr=%p vaddr=0x%llx, paddr=0x%llx, sz=0x%llx e_phnum=%d p_offset=0x%llx\n",
> +			phdr, phdr->p_vaddr, phdr->p_paddr, phdr->p_filesz,
> +			ehdr->e_phnum, phdr->p_offset);
> +	}
>  
>  	*addr = buf;
>  	*sz = elf_sz;
> @@ -489,18 +432,46 @@ static int prepare_elf64_headers(struct crash_elf_data *ced,
>  static int prepare_elf_headers(struct kimage *image, void **addr,
>  					unsigned long *sz)
>  {
> -	struct crash_elf_data *ced;
> -	int ret;
> +	struct crash_mem *cmem;
> +	Elf64_Ehdr *ehdr;
> +	Elf64_Phdr *phdr;
> +	int ret, i;
>  
> -	ced = kzalloc(sizeof(*ced), GFP_KERNEL);
> -	if (!ced)
> +	cmem = fill_up_crash_elf_data();
> +	if (!cmem)
>  		return -ENOMEM;
>  
> -	fill_up_crash_elf_data(ced, image);
> +	ret = walk_system_ram_res(0, -1, cmem,
> +				prepare_elf64_ram_headers_callback);
> +	if (ret)
> +		goto out;
> +
> +	/* Exclude unwanted mem ranges */
> +	ret = elf_header_exclude_ranges(cmem);
> +	if (ret)
> +		goto out;
>  
>  	/* By default prepare 64bit headers */
> -	ret =  prepare_elf64_headers(ced, addr, sz);
> -	kfree(ced);
> +	ret =  crash_prepare_elf64_headers(cmem,
> +				(int)IS_ENABLED(CONFIG_X86_64), addr, sz);
> +	if (ret)
> +		goto out;
> +
> +	/*
> +	 * If a range matches backup region, adjust offset to backup
> +	 * segment.
> +	 */
> +	ehdr = (Elf64_Ehdr *)*addr;
> +	phdr = (Elf64_Phdr *)(ehdr + 1);
> +	for (i = 0; i < ehdr->e_phnum; phdr++, i++)
> +		if (phdr->p_type == PT_LOAD &&
> +				phdr->p_paddr == image->arch.backup_src_start &&
> +				phdr->p_memsz == image->arch.backup_src_sz) {
> +			phdr->p_offset = image->arch.backup_load_addr;
> +			break;
> +		}
> +out:
> +	vfree(cmem);
>  	return ret;
>  }
>  
> @@ -546,14 +517,14 @@ static int memmap_exclude_ranges(struct kimage *image, struct crash_mem *cmem,
>  	/* Exclude Backup region */
>  	start = image->arch.backup_load_addr;
>  	end = start + image->arch.backup_src_sz - 1;
> -	ret = exclude_mem_range(cmem, start, end);
> +	ret = crash_exclude_mem_range(cmem, start, end);
>  	if (ret)
>  		return ret;
>  
>  	/* Exclude elf header region */
>  	start = image->arch.elf_load_addr;
>  	end = start + image->arch.elf_headers_sz - 1;
> -	return exclude_mem_range(cmem, start, end);
> +	return crash_exclude_mem_range(cmem, start, end);
>  }
>  
>  /* Prepare memory map for crash dump kernel */
> -- 
> 2.16.2
> 

Thanks
Dave

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v8 04/13] x86: kexec_file: factor out elf core header related functions
  2018-02-24  3:15     ` Dave Young
  (?)
@ 2018-02-26  9:21       ` AKASHI Takahiro
  -1 siblings, 0 replies; 102+ messages in thread
From: AKASHI Takahiro @ 2018-02-26  9:21 UTC (permalink / raw)
  To: Dave Young
  Cc: catalin.marinas, will.deacon, bauerman, dhowells, vgoyal,
	herbert, davem, akpm, mpe, bhe, arnd, ard.biesheuvel,
	julien.thierry, kexec, linux-arm-kernel, linux-kernel

On Sat, Feb 24, 2018 at 11:15:03AM +0800, Dave Young wrote:
> Hi AKASHI,
> On 02/22/18 at 08:17pm, AKASHI Takahiro wrote:
> > exclude_mem_range() and prepare_elf64_headers() can be re-used on other
> > architectures, including arm64, as well. So factor them out so that they
> > can be moved to the generic side in the next patch.
> > 
> > fill_up_crash_elf_data() can potentially be made common for most
> > architectures that want to walk io resources (/proc/iomem) for a list
> > of "System RAM", but leave it private for now.
> 
> Is it possible to split this patch into smaller patches?  For example, one
> patch could change the fixed maximum number of ranges to a dynamically
> allocated buffer.
> 
> The remaining parts could be split up as well, so that they are easier
> to review.

Sure. I'm now going to split patch#4 into four:
   x86: kexec_file: purge system-ram walking from prepare_elf64_headers()
   x86: kexec_file: remove X86_64 dependency from prepare_elf64_headers()
   x86: kexec_file: lift CRASH_MAX_RANGES limit on crash_mem buffer
   x86: kexec_file: clean up prepare_elf64_headers()

In addition, I'm going to post those patches plus old patch#2/3/5
as a separate patch set.

Thanks,
-Takahiro AKASHI

> > 
> > Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> > Cc: Dave Young <dyoung@redhat.com>
> > Cc: Vivek Goyal <vgoyal@redhat.com>
> > Cc: Baoquan He <bhe@redhat.com>
> > ---
> >  arch/x86/kernel/crash.c | 235 +++++++++++++++++++++---------------------------
> >  1 file changed, 103 insertions(+), 132 deletions(-)
> > 
> > diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
> > index 10e74d4778a1..5c19cfbf3b85 100644
> > --- a/arch/x86/kernel/crash.c
> > +++ b/arch/x86/kernel/crash.c
> > @@ -41,32 +41,14 @@
> >  /* Alignment required for elf header segment */
> >  #define ELF_CORE_HEADER_ALIGN   4096
> >  
> > -/* This primarily represents number of split ranges due to exclusion */
> > -#define CRASH_MAX_RANGES	16
> > -
> >  struct crash_mem_range {
> >  	u64 start, end;
> >  };
> >  
> >  struct crash_mem {
> > -	unsigned int nr_ranges;
> > -	struct crash_mem_range ranges[CRASH_MAX_RANGES];
> > -};
> > -
> > -/* Misc data about ram ranges needed to prepare elf headers */
> > -struct crash_elf_data {
> > -	struct kimage *image;
> > -	/*
> > -	 * Total number of ram ranges we have after various adjustments for
> > -	 * crash reserved region, etc.
> > -	 */
> >  	unsigned int max_nr_ranges;
> > -
> > -	/* Pointer to elf header */
> > -	void *ehdr;
> > -	/* Pointer to next phdr */
> > -	void *bufp;
> > -	struct crash_mem mem;
> > +	unsigned int nr_ranges;
> > +	struct crash_mem_range ranges[0];
> >  };
> >  
> >  /* Used while preparing memory map entries for second kernel */
> > @@ -217,29 +199,32 @@ static int get_nr_ram_ranges_callback(struct resource *res, void *arg)
> >  	return 0;
> >  }
> >  
> > -
> >  /* Gather all the required information to prepare elf headers for ram regions */
> > -static void fill_up_crash_elf_data(struct crash_elf_data *ced,
> > -				   struct kimage *image)
> > +static struct crash_mem *fill_up_crash_elf_data(void)
> >  {
> >  	unsigned int nr_ranges = 0;
> > -
> > -	ced->image = image;
> > +	struct crash_mem *cmem;
> >  
> >  	walk_system_ram_res(0, -1, &nr_ranges,
> >  				get_nr_ram_ranges_callback);
> >  
> > -	ced->max_nr_ranges = nr_ranges;
> > +	/*
> > +	 * Exclusion of crash region and/or crashk_low_res may cause
> > +	 * another range split. So add extra two slots here.
> > +	 */
> > +	nr_ranges += 2;
> > +	cmem = vmalloc(sizeof(struct crash_mem) +
> > +			sizeof(struct crash_mem_range) * nr_ranges);
> > +	if (!cmem)
> > +		return NULL;
> >  
> > -	/* Exclusion of crash region could split memory ranges */
> > -	ced->max_nr_ranges++;
> > +	cmem->max_nr_ranges = nr_ranges;
> > +	cmem->nr_ranges = 0;
> >  
> > -	/* If crashk_low_res is not 0, another range split possible */
> > -	if (crashk_low_res.end)
> > -		ced->max_nr_ranges++;
> > +	return cmem;
> >  }
> >  
> > -static int exclude_mem_range(struct crash_mem *mem,
> > +static int crash_exclude_mem_range(struct crash_mem *mem,
> >  		unsigned long long mstart, unsigned long long mend)
> >  {
> >  	int i, j;
> > @@ -293,10 +278,8 @@ static int exclude_mem_range(struct crash_mem *mem,
> >  		return 0;
> >  
> >  	/* Split happened */
> > -	if (i == CRASH_MAX_RANGES - 1) {
> > -		pr_err("Too many crash ranges after split\n");
> > +	if (i == mem->max_nr_ranges - 1)
> >  		return -ENOMEM;
> > -	}
> >  
> >  	/* Location where new range should go */
> >  	j = i + 1;
> > @@ -314,27 +297,20 @@ static int exclude_mem_range(struct crash_mem *mem,
> >  
> >  /*
> >   * Look for any unwanted ranges between mstart, mend and remove them. This
> > - * might lead to split and split ranges are put in ced->mem.ranges[] array
> > + * might lead to split and split ranges are put in cmem->ranges[] array
> >   */
> > -static int elf_header_exclude_ranges(struct crash_elf_data *ced,
> > -		unsigned long long mstart, unsigned long long mend)
> > +static int elf_header_exclude_ranges(struct crash_mem *cmem)
> >  {
> > -	struct crash_mem *cmem = &ced->mem;
> >  	int ret = 0;
> >  
> > -	memset(cmem->ranges, 0, sizeof(cmem->ranges));
> > -
> > -	cmem->ranges[0].start = mstart;
> > -	cmem->ranges[0].end = mend;
> > -	cmem->nr_ranges = 1;
> > -
> >  	/* Exclude crashkernel region */
> > -	ret = exclude_mem_range(cmem, crashk_res.start, crashk_res.end);
> > +	ret = crash_exclude_mem_range(cmem, crashk_res.start, crashk_res.end);
> >  	if (ret)
> >  		return ret;
> >  
> >  	if (crashk_low_res.end) {
> > -		ret = exclude_mem_range(cmem, crashk_low_res.start, crashk_low_res.end);
> > +		ret = crash_exclude_mem_range(cmem, crashk_low_res.start,
> > +							crashk_low_res.end);
> >  		if (ret)
> >  			return ret;
> >  	}
> > @@ -344,70 +320,29 @@ static int elf_header_exclude_ranges(struct crash_elf_data *ced,
> >  
> >  static int prepare_elf64_ram_headers_callback(struct resource *res, void *arg)
> >  {
> > -	struct crash_elf_data *ced = arg;
> > -	Elf64_Ehdr *ehdr;
> > -	Elf64_Phdr *phdr;
> > -	unsigned long mstart, mend;
> > -	struct kimage *image = ced->image;
> > -	struct crash_mem *cmem;
> > -	int ret, i;
> > +	struct crash_mem *cmem = arg;
> >  
> > -	ehdr = ced->ehdr;
> > -
> > -	/* Exclude unwanted mem ranges */
> > -	ret = elf_header_exclude_ranges(ced, res->start, res->end);
> > -	if (ret)
> > -		return ret;
> > -
> > -	/* Go through all the ranges in ced->mem.ranges[] and prepare phdr */
> > -	cmem = &ced->mem;
> > -
> > -	for (i = 0; i < cmem->nr_ranges; i++) {
> > -		mstart = cmem->ranges[i].start;
> > -		mend = cmem->ranges[i].end;
> > -
> > -		phdr = ced->bufp;
> > -		ced->bufp += sizeof(Elf64_Phdr);
> > -
> > -		phdr->p_type = PT_LOAD;
> > -		phdr->p_flags = PF_R|PF_W|PF_X;
> > -		phdr->p_offset  = mstart;
> > -
> > -		/*
> > -		 * If a range matches backup region, adjust offset to backup
> > -		 * segment.
> > -		 */
> > -		if (mstart == image->arch.backup_src_start &&
> > -		    (mend - mstart + 1) == image->arch.backup_src_sz)
> > -			phdr->p_offset = image->arch.backup_load_addr;
> > -
> > -		phdr->p_paddr = mstart;
> > -		phdr->p_vaddr = (unsigned long long) __va(mstart);
> > -		phdr->p_filesz = phdr->p_memsz = mend - mstart + 1;
> > -		phdr->p_align = 0;
> > -		ehdr->e_phnum++;
> > -		pr_debug("Crash PT_LOAD elf header. phdr=%p vaddr=0x%llx, paddr=0x%llx, sz=0x%llx e_phnum=%d p_offset=0x%llx\n",
> > -			phdr, phdr->p_vaddr, phdr->p_paddr, phdr->p_filesz,
> > -			ehdr->e_phnum, phdr->p_offset);
> > -	}
> > +	cmem->ranges[cmem->nr_ranges].start = res->start;
> > +	cmem->ranges[cmem->nr_ranges].end = res->end;
> > +	cmem->nr_ranges++;
> >  
> > -	return ret;
> > +	return 0;
> >  }
> >  
> > -static int prepare_elf64_headers(struct crash_elf_data *ced,
> > -		void **addr, unsigned long *sz)
> > +static int crash_prepare_elf64_headers(struct crash_mem *cmem, int kernel_map,
> > +					void **addr, unsigned long *sz)
> >  {
> >  	Elf64_Ehdr *ehdr;
> >  	Elf64_Phdr *phdr;
> >  	unsigned long nr_cpus = num_possible_cpus(), nr_phdr, elf_sz;
> > -	unsigned char *buf, *bufp;
> > -	unsigned int cpu;
> > +	unsigned char *buf;
> > +	unsigned int cpu, i;
> >  	unsigned long long notes_addr;
> > -	int ret;
> > +	unsigned long mstart, mend;
> >  
> >  	/* extra phdr for vmcoreinfo elf note */
> >  	nr_phdr = nr_cpus + 1;
> > -	nr_phdr += ced->max_nr_ranges;
> > +	nr_phdr += cmem->nr_ranges;
> >  
> >  	/*
> >  	 * kexec-tools creates an extra PT_LOAD phdr for kernel text mapping
> > @@ -425,9 +360,8 @@ static int prepare_elf64_headers(struct crash_elf_data *ced,
> >  	if (!buf)
> >  		return -ENOMEM;
> >  
> > -	bufp = buf;
> > -	ehdr = (Elf64_Ehdr *)bufp;
> > -	bufp += sizeof(Elf64_Ehdr);
> > +	ehdr = (Elf64_Ehdr *)buf;
> > +	phdr = (Elf64_Phdr *)(ehdr + 1);
> >  	memcpy(ehdr->e_ident, ELFMAG, SELFMAG);
> >  	ehdr->e_ident[EI_CLASS] = ELFCLASS64;
> >  	ehdr->e_ident[EI_DATA] = ELFDATA2LSB;
> > @@ -443,42 +377,51 @@ static int prepare_elf64_headers(struct crash_elf_data *ced,
> >  
> >  	/* Prepare one phdr of type PT_NOTE for each present cpu */
> >  	for_each_present_cpu(cpu) {
> > -		phdr = (Elf64_Phdr *)bufp;
> > -		bufp += sizeof(Elf64_Phdr);
> >  		phdr->p_type = PT_NOTE;
> >  		notes_addr = per_cpu_ptr_to_phys(per_cpu_ptr(crash_notes, cpu));
> >  		phdr->p_offset = phdr->p_paddr = notes_addr;
> >  		phdr->p_filesz = phdr->p_memsz = sizeof(note_buf_t);
> >  		(ehdr->e_phnum)++;
> > +		phdr++;
> >  	}
> >  
> >  	/* Prepare one PT_NOTE header for vmcoreinfo */
> > -	phdr = (Elf64_Phdr *)bufp;
> > -	bufp += sizeof(Elf64_Phdr);
> >  	phdr->p_type = PT_NOTE;
> >  	phdr->p_offset = phdr->p_paddr = paddr_vmcoreinfo_note();
> >  	phdr->p_filesz = phdr->p_memsz = VMCOREINFO_NOTE_SIZE;
> >  	(ehdr->e_phnum)++;
> > +	phdr++;
> >  
> > -#ifdef CONFIG_X86_64
> >  	/* Prepare PT_LOAD type program header for kernel text region */
> > -	phdr = (Elf64_Phdr *)bufp;
> > -	bufp += sizeof(Elf64_Phdr);
> > -	phdr->p_type = PT_LOAD;
> > -	phdr->p_flags = PF_R|PF_W|PF_X;
> > -	phdr->p_vaddr = (Elf64_Addr)_text;
> > -	phdr->p_filesz = phdr->p_memsz = _end - _text;
> > -	phdr->p_offset = phdr->p_paddr = __pa_symbol(_text);
> > -	(ehdr->e_phnum)++;
> > -#endif
> > +	if (kernel_map) {
> > +		phdr->p_type = PT_LOAD;
> > +		phdr->p_flags = PF_R|PF_W|PF_X;
> > +		phdr->p_vaddr = (Elf64_Addr)_text;
> > +		phdr->p_filesz = phdr->p_memsz = _end - _text;
> > +		phdr->p_offset = phdr->p_paddr = __pa_symbol(_text);
> > +		ehdr->e_phnum++;
> > +		phdr++;
> > +	}
> >  
> > -	/* Prepare PT_LOAD headers for system ram chunks. */
> > -	ced->ehdr = ehdr;
> > -	ced->bufp = bufp;
> > -	ret = walk_system_ram_res(0, -1, ced,
> > -			prepare_elf64_ram_headers_callback);
> > -	if (ret < 0)
> > -		return ret;
> > +	/* Go through all the ranges in cmem->ranges[] and prepare phdr */
> > +	for (i = 0; i < cmem->nr_ranges; i++) {
> > +		mstart = cmem->ranges[i].start;
> > +		mend = cmem->ranges[i].end;
> > +
> > +		phdr->p_type = PT_LOAD;
> > +		phdr->p_flags = PF_R|PF_W|PF_X;
> > +		phdr->p_offset  = mstart;
> > +
> > +		phdr->p_paddr = mstart;
> > +		phdr->p_vaddr = (unsigned long long) __va(mstart);
> > +		phdr->p_filesz = phdr->p_memsz = mend - mstart + 1;
> > +		phdr->p_align = 0;
> > +		ehdr->e_phnum++;
> > +		phdr++;
> > +		pr_debug("Crash PT_LOAD elf header. phdr=%p vaddr=0x%llx, paddr=0x%llx, sz=0x%llx e_phnum=%d p_offset=0x%llx\n",
> > +			phdr, phdr->p_vaddr, phdr->p_paddr, phdr->p_filesz,
> > +			ehdr->e_phnum, phdr->p_offset);
> > +	}
> >  
> >  	*addr = buf;
> >  	*sz = elf_sz;
> > @@ -489,18 +432,46 @@ static int prepare_elf64_headers(struct crash_elf_data *ced,
> >  static int prepare_elf_headers(struct kimage *image, void **addr,
> >  					unsigned long *sz)
> >  {
> > -	struct crash_elf_data *ced;
> > -	int ret;
> > +	struct crash_mem *cmem;
> > +	Elf64_Ehdr *ehdr;
> > +	Elf64_Phdr *phdr;
> > +	int ret, i;
> >  
> > -	ced = kzalloc(sizeof(*ced), GFP_KERNEL);
> > -	if (!ced)
> > +	cmem = fill_up_crash_elf_data();
> > +	if (!cmem)
> >  		return -ENOMEM;
> >  
> > -	fill_up_crash_elf_data(ced, image);
> > +	ret = walk_system_ram_res(0, -1, cmem,
> > +				prepare_elf64_ram_headers_callback);
> > +	if (ret)
> > +		goto out;
> > +
> > +	/* Exclude unwanted mem ranges */
> > +	ret = elf_header_exclude_ranges(cmem);
> > +	if (ret)
> > +		goto out;
> >  
> >  	/* By default prepare 64bit headers */
> > -	ret =  prepare_elf64_headers(ced, addr, sz);
> > -	kfree(ced);
> > +	ret =  crash_prepare_elf64_headers(cmem,
> > +				(int)IS_ENABLED(CONFIG_X86_64), addr, sz);
> > +	if (ret)
> > +		goto out;
> > +
> > +	/*
> > +	 * If a range matches backup region, adjust offset to backup
> > +	 * segment.
> > +	 */
> > +	ehdr = (Elf64_Ehdr *)*addr;
> > +	phdr = (Elf64_Phdr *)(ehdr + 1);
> > +	for (i = 0; i < ehdr->e_phnum; phdr++, i++)
> > +		if (phdr->p_type == PT_LOAD &&
> > +				phdr->p_paddr == image->arch.backup_src_start &&
> > +				phdr->p_memsz == image->arch.backup_src_sz) {
> > +			phdr->p_offset = image->arch.backup_load_addr;
> > +			break;
> > +		}
> > +out:
> > +	vfree(cmem);
> >  	return ret;
> >  }
> >  
> > @@ -546,14 +517,14 @@ static int memmap_exclude_ranges(struct kimage *image, struct crash_mem *cmem,
> >  	/* Exclude Backup region */
> >  	start = image->arch.backup_load_addr;
> >  	end = start + image->arch.backup_src_sz - 1;
> > -	ret = exclude_mem_range(cmem, start, end);
> > +	ret = crash_exclude_mem_range(cmem, start, end);
> >  	if (ret)
> >  		return ret;
> >  
> >  	/* Exclude elf header region */
> >  	start = image->arch.elf_load_addr;
> >  	end = start + image->arch.elf_headers_sz - 1;
> > -	return exclude_mem_range(cmem, start, end);
> > +	return crash_exclude_mem_range(cmem, start, end);
> >  }
> >  
> >  /* Prepare memory map for crash dump kernel */
> > -- 
> > 2.16.2
> > 
> 
> Thanks
> Dave

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v8 03/13] kexec_file,x86,powerpc: factor out kexec_file_ops functions
  2018-02-23  9:24     ` Dave Young
@ 2018-02-26 10:01       ` AKASHI Takahiro
  -1 siblings, 0 replies; 102+ messages in thread
From: AKASHI Takahiro @ 2018-02-26 10:01 UTC (permalink / raw)
  To: Dave Young
  Cc: catalin.marinas, will.deacon, bauerman, dhowells, vgoyal,
	herbert, davem, akpm, mpe, bhe, arnd, ard.biesheuvel,
	julien.thierry, kexec, linux-arm-kernel, linux-kernel

On Fri, Feb 23, 2018 at 05:24:59PM +0800, Dave Young wrote:
> Hi AKASHI,
> 
> On 02/22/18 at 08:17pm, AKASHI Takahiro wrote:
> > As arch_kexec_kernel_*_{probe,load}(), arch_kimage_file_post_load_cleanup()
> > and arch_kexec_kernel_verify_sig() can be parameterized with a kexec_file_ops
> > array and are now duplicated among several architectures, let's factor them out.
> > 
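
The end state here is that each architecture only publishes a
NULL-terminated table of loaders and the generic code walks it; a minimal
sketch, using the x86 bzImage loader as the example (the arm64 entry comes
later in this series):

/* e.g. in arch/x86/kernel/machine_kexec_64.c */
const struct kexec_file_ops * const kexec_file_loaders[] = {
        &kexec_bzImage64_ops,
        NULL
};
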
> > Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> > Cc: Dave Young <dyoung@redhat.com>
> > Cc: Vivek Goyal <vgoyal@redhat.com>
> > Cc: Baoquan He <bhe@redhat.com>
> > Cc: Michael Ellerman <mpe@ellerman.id.au>
> > Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
> > ---
> >  arch/powerpc/include/asm/kexec.h            |  2 +-
> >  arch/powerpc/kernel/kexec_elf_64.c          |  2 +-
> >  arch/powerpc/kernel/machine_kexec_file_64.c | 39 ++------------------
> >  arch/x86/include/asm/kexec-bzimage64.h      |  2 +-
> >  arch/x86/kernel/kexec-bzimage64.c           |  2 +-
> >  arch/x86/kernel/machine_kexec_64.c          | 45 +----------------------
> >  include/linux/kexec.h                       | 15 ++++----
> >  kernel/kexec_file.c                         | 57 +++++++++++++++++++++++++++--
> >  8 files changed, 70 insertions(+), 94 deletions(-)
> > 
> 
> [snip]
> 
> > diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> > index 990adae52151..a6d14a768b3e 100644
> > --- a/kernel/kexec_file.c
> > +++ b/kernel/kexec_file.c
> > @@ -26,34 +26,83 @@
> >  #include <linux/vmalloc.h>
> >  #include "kexec_internal.h"
> >  
> > +const __weak struct kexec_file_ops * const kexec_file_loaders[] = {NULL};
> > +
> >  #ifdef CONFIG_ARCH_HAS_KEXEC_PURGATORY
> >  static int kexec_calculate_store_digests(struct kimage *image);
> >  #else
> >  static int kexec_calculate_store_digests(struct kimage *image) { return 0; };
> >  #endif
> >  
> > +int _kexec_kernel_image_probe(struct kimage *image, void *buf,
> > +			     unsigned long buf_len)
> > +{
> > +	const struct kexec_file_ops * const *fops;
> > +	int ret = -ENOEXEC;
> > +
> > +	for (fops = &kexec_file_loaders[0]; *fops && (*fops)->probe; ++fops) {
> > +		ret = (*fops)->probe(buf, buf_len);
> > +		if (!ret) {
> > +			image->fops = *fops;
> > +			return ret;
> > +		}
> > +	}
> > +
> > +	return ret;
> > +}
> > +
> >  /* Architectures can provide this probe function */
> >  int __weak arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
> >  					 unsigned long buf_len)
> >  {
> > -	return -ENOEXEC;
> > +	return _kexec_kernel_image_probe(image, buf, buf_len);
> 
> 
> I vaguely remember that I previously suggested splitting out
> _kexec_kernel_image_probe() because arch code can call it, and common code
> can also use it as above.  But in your new series I do not find anything
> else calling this function except the common arch_kexec_kernel_image_probe().
> If nobody else uses it, then it is not worth splitting it out; it would be
> better to just embed it in the __weak functions.

Powerpc's arch_kexec_kernel_image_probe() uses
_kexec_kernel_image_probe() as it needs an extra check to rule out
crash dump (kdump) images for now.
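
In other words, on powerpc the __weak hook stays overridden, but it becomes
a thin wrapper around the common helper. A rough sketch (the exact check
and error code here are illustrative, not dictated by this patch):

int arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
                                  unsigned long buf_len)
{
        /* kdump via kexec_file_load() is not wired up on powerpc yet */
        if (image->type == KEXEC_TYPE_CRASH)
                return -EOPNOTSUPP;

        return _kexec_kernel_image_probe(image, buf, buf_len);
}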

Thanks,
-Takahiro AKASHI


> Ditto for other similar functions.
> 
> [snip]
> 
> Thanks
> Dave

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v8 02/13] kexec_file: make an use of purgatory optional
  2018-02-23  8:49     ` Dave Young
@ 2018-02-26 10:24       ` AKASHI Takahiro
  -1 siblings, 0 replies; 102+ messages in thread
From: AKASHI Takahiro @ 2018-02-26 10:24 UTC (permalink / raw)
  To: Dave Young
  Cc: catalin.marinas, will.deacon, bauerman, dhowells, vgoyal,
	herbert, davem, akpm, mpe, bhe, arnd, ard.biesheuvel,
	julien.thierry, kexec, linux-arm-kernel, linux-kernel

On Fri, Feb 23, 2018 at 04:49:34PM +0800, Dave Young wrote:
> Hi AKASHI,
> 
> On 02/22/18 at 08:17pm, AKASHI Takahiro wrote:
> > On arm64, no trampoline code between the old kernel and the new kernel
> > will be required in the kexec_file implementation. This patch introduces
> > a new configuration, ARCH_HAS_KEXEC_PURGATORY, and allows the related
> > code to be compiled in only if necessary.
> 
> This also needs an explanation of why no purgatory is needed; it would
> otherwise be required for kexec unless there is a strong reason.

OK, I will add the reason:
On arm64, the crash dump kernel's usable memory is protected by
*unmapping* it from the kernel virtual space, unlike other architectures
where the region is just made read-only.
So our key developers think that it is highly unlikely that the region
gets accidentally corrupted, which justifies dropping the digest check
code from the purgatory as well.
This greatly simplifies our purgatory, with no need for the somewhat ugly
relocation handling, i.e. arch_kexec_apply_relocations_add().
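
Concretely, with no purgatory the arch loader can simply point
image->start at the loaded kernel so that control passes straight to it.
A deliberately simplified, hypothetical sketch (not the actual arm64
loader in this series, which also has to handle the dtb and initrd):

static void *image_load(struct kimage *image,
                        char *kernel, unsigned long kernel_len,
                        char *initrd, unsigned long initrd_len,
                        char *cmdline, unsigned long cmdline_len)
{
        struct kexec_buf kbuf;
        int ret;

        /* initrd/cmdline handling is omitted in this sketch */
        kbuf.image = image;
        kbuf.buffer = kernel;
        kbuf.bufsz = kernel_len;
        kbuf.memsz = kernel_len;
        kbuf.buf_align = SZ_2M;         /* illustrative alignment only */
        kbuf.buf_min = 0;
        kbuf.buf_max = ULONG_MAX;
        kbuf.top_down = false;

        ret = kexec_add_buffer(&kbuf);
        if (ret)
                return ERR_PTR(ret);

        /* no purgatory: jump straight into the new kernel */
        image->start = kbuf.mem;

        return NULL;
}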

Please see:
   http://lists.infradead.org/pipermail/linux-arm-kernel/2017-December/545428.html
to find out how simple our purgatory was. All it does is shuffle
arguments and jump into the new kernel.

Without this patch, we would have to carry a purgatory with space for
a hash value (purgatory_sha256_digest) that is never checked.

Do you think it makes sense?

Thanks,
-Takahiro AKASHI


> > 
> > Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> > Cc: Dave Young <dyoung@redhat.com>
> > Cc: Vivek Goyal <vgoyal@redhat.com>
> > Cc: Baoquan He <bhe@redhat.com>
> > ---
> >  arch/powerpc/Kconfig | 3 +++
> >  arch/x86/Kconfig     | 3 +++
> >  kernel/kexec_file.c  | 6 ++++++
> >  3 files changed, 12 insertions(+)
> > 
> > diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> > index 73ce5dd07642..c32a181a7cbb 100644
> > --- a/arch/powerpc/Kconfig
> > +++ b/arch/powerpc/Kconfig
> > @@ -552,6 +552,9 @@ config KEXEC_FILE
> >  	  for kernel and initramfs as opposed to a list of segments as is the
> >  	  case for the older kexec call.
> >  
> > +config ARCH_HAS_KEXEC_PURGATORY
> > +	def_bool KEXEC_FILE
> > +
> >  config RELOCATABLE
> >  	bool "Build a relocatable kernel"
> >  	depends on PPC64 || (FLATMEM && (44x || FSL_BOOKE))
> > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> > index c1236b187824..f031c3efe47e 100644
> > --- a/arch/x86/Kconfig
> > +++ b/arch/x86/Kconfig
> > @@ -2019,6 +2019,9 @@ config KEXEC_FILE
> >  	  for kernel and initramfs as opposed to list of segments as
> >  	  accepted by previous system call.
> >  
> > +config ARCH_HAS_KEXEC_PURGATORY
> > +	def_bool KEXEC_FILE
> > +
> >  config KEXEC_VERIFY_SIG
> >  	bool "Verify kernel signature during kexec_file_load() syscall"
> >  	depends on KEXEC_FILE
> > diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> > index e5bcd94c1efb..990adae52151 100644
> > --- a/kernel/kexec_file.c
> > +++ b/kernel/kexec_file.c
> > @@ -26,7 +26,11 @@
> >  #include <linux/vmalloc.h>
> >  #include "kexec_internal.h"
> >  
> > +#ifdef CONFIG_ARCH_HAS_KEXEC_PURGATORY
> >  static int kexec_calculate_store_digests(struct kimage *image);
> > +#else
> > +static int kexec_calculate_store_digests(struct kimage *image) { return 0; };
> > +#endif
> >  
> >  /* Architectures can provide this probe function */
> >  int __weak arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
> > @@ -520,6 +524,7 @@ int kexec_add_buffer(struct kexec_buf *kbuf)
> >  	return 0;
> >  }
> >  
> > +#ifdef CONFIG_ARCH_HAS_KEXEC_PURGATORY
> >  /* Calculate and store the digest of segments */
> >  static int kexec_calculate_store_digests(struct kimage *image)
> >  {
> > @@ -1022,3 +1027,4 @@ int kexec_purgatory_get_set_symbol(struct kimage *image, const char *name,
> >  
> >  	return 0;
> >  }
> > +#endif /* CONFIG_ARCH_HAS_KEXEC_PURGATORY */
> > -- 
> > 2.16.2
> > 
> 
> Thanks
> Dave

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v8 02/13] kexec_file: make an use of purgatory optional
@ 2018-02-26 10:24       ` AKASHI Takahiro
  0 siblings, 0 replies; 102+ messages in thread
From: AKASHI Takahiro @ 2018-02-26 10:24 UTC (permalink / raw)
  To: Dave Young
  Cc: herbert, bhe, ard.biesheuvel, catalin.marinas, julien.thierry,
	will.deacon, linux-kernel, kexec, dhowells, arnd,
	linux-arm-kernel, mpe, bauerman, akpm, davem, vgoyal

On Fri, Feb 23, 2018 at 04:49:34PM +0800, Dave Young wrote:
> Hi AKASHI,
> 
> On 02/22/18 at 08:17pm, AKASHI Takahiro wrote:
> > On arm64, no trampoline code between the old and the new kernel will be
> > required in the kexec_file implementation. This patch introduces a new
> > configuration, ARCH_HAS_KEXEC_PURGATORY, and allows related code to be
> > compiled in only if necessary.
> 
> Here we also need the explanation of why no purgatory is needed; a purgatory
> would normally be required for kexec unless there is a strong reason against it.

OK, I will add the reason:
On arm64, the crash dump kernel's usable memory is protected by
*unmapping* it from the kernel virtual space, unlike other architectures
where the region is merely made read-only.
Our key developers therefore consider it highly unlikely that the region
gets accidentally corrupted, which also justifies dropping the digest
check code from the purgatory.
This greatly simplifies our purgatory and removes the need for the
somewhat ugly relocation handling, i.e. arch_kexec_apply_relocations_add().

Please see:
   http://lists.infradead.org/pipermail/linux-arm-kernel/2017-December/545428.html
to see how simple our purgatory was. All it did was shuffle the boot
arguments and jump into the new kernel.
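
For illustration, a rough C-flavoured sketch of what that minimal purgatory
boiled down to (the symbol names below are placeholders, not the actual
identifiers from the linked patch):

/*
 * Sketch only: no digest check, just hand the dtb over in x0 (per the
 * arm64 boot protocol) and branch to the new kernel's entry point.
 */
extern unsigned long arm64_kernel_entry;	/* entry point of new kernel */
extern unsigned long arm64_dtb_addr;		/* dtb address to pass in x0 */

void purgatory(void)
{
	typedef void (*kernel_entry_t)(unsigned long dtb, unsigned long x1,
				       unsigned long x2, unsigned long x3);

	((kernel_entry_t)arm64_kernel_entry)(arm64_dtb_addr, 0, 0, 0);
	/* never returns */
}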

Without this patch, we would have to carry a purgatory that reserves space
for a hash value (purgatory_sha256_digest) which is never actually checked.

Do you think it makes sense?

Thanks,
-Takahiro AKASHI


> > 
> > Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> > Cc: Dave Young <dyoung@redhat.com>
> > Cc: Vivek Goyal <vgoyal@redhat.com>
> > Cc: Baoquan He <bhe@redhat.com>
> > ---
> >  arch/powerpc/Kconfig | 3 +++
> >  arch/x86/Kconfig     | 3 +++
> >  kernel/kexec_file.c  | 6 ++++++
> >  3 files changed, 12 insertions(+)
> > 
> > diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> > index 73ce5dd07642..c32a181a7cbb 100644
> > --- a/arch/powerpc/Kconfig
> > +++ b/arch/powerpc/Kconfig
> > @@ -552,6 +552,9 @@ config KEXEC_FILE
> >  	  for kernel and initramfs as opposed to a list of segments as is the
> >  	  case for the older kexec call.
> >  
> > +config ARCH_HAS_KEXEC_PURGATORY
> > +	def_bool KEXEC_FILE
> > +
> >  config RELOCATABLE
> >  	bool "Build a relocatable kernel"
> >  	depends on PPC64 || (FLATMEM && (44x || FSL_BOOKE))
> > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> > index c1236b187824..f031c3efe47e 100644
> > --- a/arch/x86/Kconfig
> > +++ b/arch/x86/Kconfig
> > @@ -2019,6 +2019,9 @@ config KEXEC_FILE
> >  	  for kernel and initramfs as opposed to list of segments as
> >  	  accepted by previous system call.
> >  
> > +config ARCH_HAS_KEXEC_PURGATORY
> > +	def_bool KEXEC_FILE
> > +
> >  config KEXEC_VERIFY_SIG
> >  	bool "Verify kernel signature during kexec_file_load() syscall"
> >  	depends on KEXEC_FILE
> > diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> > index e5bcd94c1efb..990adae52151 100644
> > --- a/kernel/kexec_file.c
> > +++ b/kernel/kexec_file.c
> > @@ -26,7 +26,11 @@
> >  #include <linux/vmalloc.h>
> >  #include "kexec_internal.h"
> >  
> > +#ifdef CONFIG_ARCH_HAS_KEXEC_PURGATORY
> >  static int kexec_calculate_store_digests(struct kimage *image);
> > +#else
> > +static int kexec_calculate_store_digests(struct kimage *image) { return 0; };
> > +#endif
> >  
> >  /* Architectures can provide this probe function */
> >  int __weak arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
> > @@ -520,6 +524,7 @@ int kexec_add_buffer(struct kexec_buf *kbuf)
> >  	return 0;
> >  }
> >  
> > +#ifdef CONFIG_ARCH_HAS_KEXEC_PURGATORY
> >  /* Calculate and store the digest of segments */
> >  static int kexec_calculate_store_digests(struct kimage *image)
> >  {
> > @@ -1022,3 +1027,4 @@ int kexec_purgatory_get_set_symbol(struct kimage *image, const char *name,
> >  
> >  	return 0;
> >  }
> > +#endif /* CONFIG_ARCH_HAS_KEXEC_PURGATORY */
> > -- 
> > 2.16.2
> > 
> 
> Thanks
> Dave

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v8 03/13] kexec_file, x86, powerpc: factor out kexec_file_ops functions
  2018-02-22 11:17   ` AKASHI Takahiro
  (?)
@ 2018-02-26 11:17     ` Philipp Rudo
  -1 siblings, 0 replies; 102+ messages in thread
From: Philipp Rudo @ 2018-02-26 11:17 UTC (permalink / raw)
  To: AKASHI Takahiro
  Cc: catalin.marinas, will.deacon, bauerman, dhowells, vgoyal,
	herbert, davem, akpm, mpe, dyoung, bhe, arnd, ard.biesheuvel,
	julien.thierry, kexec, linux-kernel, linux-arm-kernel

Hi AKASHI

On Thu, 22 Feb 2018 20:17:22 +0900
AKASHI Takahiro <takahiro.akashi@linaro.org> wrote:

[...]

> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> index 990adae52151..a6d14a768b3e 100644
> --- a/kernel/kexec_file.c
> +++ b/kernel/kexec_file.c
> @@ -26,34 +26,83 @@
>  #include <linux/vmalloc.h>
>  #include "kexec_internal.h"
> 
> +const __weak struct kexec_file_ops * const kexec_file_loaders[] = {NULL};
> +

Having a weak definition of kexec_file_loaders causes trouble on s390 with
gcc 4.8 (newer versions seem to work fine). To me it looks like this gcc
version doesn't honor __weak and instead uses the default (empty) definition
for optimization. This leads to _kexec_kernel_image_probe always returning
-ENOEXEC because the for-loop gets optimized out.
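
A rough standalone illustration of the hazard (not kernel code; whether the
folding really happens depends on the compiler version):

#include <stddef.h>

/* A weak, empty default table in the same translation unit as the loop
 * that walks it.  An old compiler may evaluate the loop against this
 * empty default and fold it away, even though the final link could
 * override the table with a non-empty, strong definition. */
struct ops {
	int (*probe)(void);
};

const struct ops * const table[] __attribute__((weak)) = { NULL };

int probe_all(void)
{
	const struct ops * const *p;
	int ret = -1;

	for (p = &table[0]; *p && (*p)->probe; ++p)
		ret = (*p)->probe();

	return ret;	/* may end up hard-coded to -1 */
}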

The problem can easily be worked around by declaring kexec_file_loaders in
include/linux/kexec.h and defining it in arch code. In particular doing this

diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 37e9dce518aa..fc0788540d90 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -139,6 +139,8 @@ struct kexec_file_ops {
 #endif
 };
 
+extern const struct kexec_file_ops * const kexec_file_loaders[];
+
 /**
  * struct kexec_buf - parameters for finding a place for a buffer in memory
  * @image:	kexec image in which memory to search.
diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index 17ba407d0e79..4e3d1e4bc7f6 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -31,8 +31,6 @@
 #include <linux/vmalloc.h>
 #include "kexec_internal.h"
 
-const __weak struct kexec_file_ops * const kexec_file_loaders[] = {NULL};
-
 #ifdef CONFIG_ARCH_HAS_KEXEC_PURGATORY
 static int kexec_calculate_store_digests(struct kimage *image);
 #else

A nice side effect of this solution is that a developer who forgets to define
kexec_file_loaders gets a linker error, so they directly know what's missing
instead of first having to find out where and why an error gets returned.
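
For example, roughly what x86 would then carry (sketch only, using the
bzImage loader it already has):

#include <linux/kexec.h>
#include <asm/kexec-bzimage64.h>

/* the strong, arch-provided table that the declaration in <linux/kexec.h>
 * now forces every KEXEC_FILE architecture to define */
const struct kexec_file_ops * const kexec_file_loaders[] = {
	&kexec_bzImage64_ops,
	NULL
};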

Otherwise the series is fine for me.

Thanks
Philipp

>  #ifdef CONFIG_ARCH_HAS_KEXEC_PURGATORY
>  static int kexec_calculate_store_digests(struct kimage *image);
>  #else
>  static int kexec_calculate_store_digests(struct kimage *image) { return 0; };
>  #endif
> 
> +int _kexec_kernel_image_probe(struct kimage *image, void *buf,
> +			     unsigned long buf_len)
> +{
> +	const struct kexec_file_ops * const *fops;
> +	int ret = -ENOEXEC;
> +
> +	for (fops = &kexec_file_loaders[0]; *fops && (*fops)->probe; ++fops) {
> +		ret = (*fops)->probe(buf, buf_len);
> +		if (!ret) {
> +			image->fops = *fops;
> +			return ret;
> +		}
> +	}
> +
> +	return ret;
> +}
> +
>  /* Architectures can provide this probe function */
>  int __weak arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
>  					 unsigned long buf_len)
>  {
> -	return -ENOEXEC;
> +	return _kexec_kernel_image_probe(image, buf, buf_len);
> +}
> +
> +void *_kexec_kernel_image_load(struct kimage *image)
> +{
> +	if (!image->fops || !image->fops->load)
> +		return ERR_PTR(-ENOEXEC);
> +
> +	return image->fops->load(image, image->kernel_buf,
> +				 image->kernel_buf_len, image->initrd_buf,
> +				 image->initrd_buf_len, image->cmdline_buf,
> +				 image->cmdline_buf_len);
>  }
> 
>  void * __weak arch_kexec_kernel_image_load(struct kimage *image)
>  {
> -	return ERR_PTR(-ENOEXEC);
> +	return _kexec_kernel_image_load(image);
> +}
> +
> +int _kimage_file_post_load_cleanup(struct kimage *image)
> +{
> +	if (!image->fops || !image->fops->cleanup)
> +		return 0;
> +
> +	return image->fops->cleanup(image->image_loader_data);
>  }
> 
>  int __weak arch_kimage_file_post_load_cleanup(struct kimage *image)
>  {
> -	return -EINVAL;
> +	return _kimage_file_post_load_cleanup(image);
>  }
> 
>  #ifdef CONFIG_KEXEC_VERIFY_SIG
> +int _kexec_kernel_verify_sig(struct kimage *image, void *buf,
> +			    unsigned long buf_len)
> +{
> +	if (!image->fops || !image->fops->verify_sig) {
> +		pr_debug("kernel loader does not support signature verification.\n");
> +		return -EKEYREJECTED;
> +	}
> +
> +	return image->fops->verify_sig(buf, buf_len);
> +}
> +
>  int __weak arch_kexec_kernel_verify_sig(struct kimage *image, void *buf,
>  					unsigned long buf_len)
>  {
> -	return -EKEYREJECTED;
> +	return _kexec_kernel_verify_sig(image, buf, buf_len);
>  }
>  #endif
> 

^ permalink raw reply related	[flat|nested] 102+ messages in thread

* Re: [PATCH v8 03/13] kexec_file,x86,powerpc: factor out kexec_file_ops functions
  2018-02-26 10:01       ` AKASHI Takahiro
  (?)
@ 2018-02-26 11:25         ` Philipp Rudo
  -1 siblings, 0 replies; 102+ messages in thread
From: Philipp Rudo @ 2018-02-26 11:25 UTC (permalink / raw)
  To: AKASHI Takahiro
  Cc: Dave Young, herbert, bhe, ard.biesheuvel, catalin.marinas,
	julien.thierry, will.deacon, linux-kernel, kexec, dhowells, arnd,
	linux-arm-kernel, mpe, bauerman, akpm, davem, vgoyal

On Mon, 26 Feb 2018 19:01:39 +0900
AKASHI Takahiro <takahiro.akashi@linaro.org> wrote:

> On Fri, Feb 23, 2018 at 05:24:59PM +0800, Dave Young wrote:
> > Hi AKASHI,
> > 
> > On 02/22/18 at 08:17pm, AKASHI Takahiro wrote:  
> > > As arch_kexec_kernel_*_{probe,load}(), arch_kimage_file_post_load_cleanup()
> > > and arch_kexec_kernel_verify_sg can be parameterized with a kexec_file_ops
> > > array and now duplicated among some architectures, let's factor them out.
> > > 
> > > Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> > > Cc: Dave Young <dyoung@redhat.com>
> > > Cc: Vivek Goyal <vgoyal@redhat.com>
> > > Cc: Baoquan He <bhe@redhat.com>
> > > Cc: Michael Ellerman <mpe@ellerman.id.au>
> > > Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
> > > ---
> > >  arch/powerpc/include/asm/kexec.h            |  2 +-
> > >  arch/powerpc/kernel/kexec_elf_64.c          |  2 +-
> > >  arch/powerpc/kernel/machine_kexec_file_64.c | 39 ++------------------
> > >  arch/x86/include/asm/kexec-bzimage64.h      |  2 +-
> > >  arch/x86/kernel/kexec-bzimage64.c           |  2 +-
> > >  arch/x86/kernel/machine_kexec_64.c          | 45 +----------------------
> > >  include/linux/kexec.h                       | 15 ++++----
> > >  kernel/kexec_file.c                         | 57 +++++++++++++++++++++++++++--
> > >  8 files changed, 70 insertions(+), 94 deletions(-)
> > >   
> > 
> > [snip]
> >   
> > > diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> > > index 990adae52151..a6d14a768b3e 100644
> > > --- a/kernel/kexec_file.c
> > > +++ b/kernel/kexec_file.c
> > > @@ -26,34 +26,83 @@
> > >  #include <linux/vmalloc.h>
> > >  #include "kexec_internal.h"
> > >  
> > > +const __weak struct kexec_file_ops * const kexec_file_loaders[] = {NULL};
> > > +
> > >  #ifdef CONFIG_ARCH_HAS_KEXEC_PURGATORY
> > >  static int kexec_calculate_store_digests(struct kimage *image);
> > >  #else
> > >  static int kexec_calculate_store_digests(struct kimage *image) { return 0; };
> > >  #endif
> > >  
> > > +int _kexec_kernel_image_probe(struct kimage *image, void *buf,
> > > +			     unsigned long buf_len)
> > > +{
> > > +	const struct kexec_file_ops * const *fops;
> > > +	int ret = -ENOEXEC;
> > > +
> > > +	for (fops = &kexec_file_loaders[0]; *fops && (*fops)->probe; ++fops) {
> > > +		ret = (*fops)->probe(buf, buf_len);
> > > +		if (!ret) {
> > > +			image->fops = *fops;
> > > +			return ret;
> > > +		}
> > > +	}
> > > +
> > > +	return ret;
> > > +}
> > > +
> > >  /* Architectures can provide this probe function */
> > >  int __weak arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
> > >  					 unsigned long buf_len)
> > >  {
> > > -	return -ENOEXEC;
> > > +	return _kexec_kernel_image_probe(image, buf, buf_len);  
> > 
> > 
> > I vaguely remember that I previously suggested splitting out _kexec_kernel_image_probe
> > because arch code can call it while the common code also uses it, as above.
> > But in your new series I do not find anything else calling this function
> > except the common arch_kexec_kernel_image_probe().  If nobody else uses
> > these helpers, it is not worth splitting them out; it is better to just embed
> > them in the __weak functions.
> 
> Powerpc's arch_kexec_kernel_image_probe() uses
> _kexec_kernel_image_probe() as it needs an extra check to rule out
> crash dump for now.

s390 has to use it too. We have to write to a fixed address in the buffer. So
we need to check if the buffer is large enough to write to that address.
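
A sketch of the pattern we are talking about (the crash-dump check mirrors
the powerpc case above; the exact errno and the s390 check are illustrative
only):

#include <linux/errno.h>
#include <linux/kexec.h>

/* arch-specific probe: keep the arch's own policy check, then fall
 * through to the common helper that walks kexec_file_loaders[] */
int arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
				  unsigned long buf_len)
{
	/* e.g. powerpc: no crash-dump (kdump) support via kexec_file yet */
	if (image->type == KEXEC_TYPE_CRASH)
		return -EOPNOTSUPP;

	/* s390 would instead check that buf_len is large enough for the
	 * fixed in-buffer address it has to write to */

	return _kexec_kernel_image_probe(image, buf, buf_len);
}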

Philipp
 
> Thanks,
> -Takahiro AKASHI
> 
> 
> > Ditto for other similar functions.
> > 
> > [snip]
> > 
> > Thanks
> > Dave  
> 
> _______________________________________________
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec
> 

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v8 03/13] kexec_file, x86, powerpc: factor out kexec_file_ops functions
  2018-02-26 11:17     ` Philipp Rudo
  (?)
@ 2018-02-27  2:03       ` AKASHI Takahiro
  -1 siblings, 0 replies; 102+ messages in thread
From: AKASHI Takahiro @ 2018-02-27  2:03 UTC (permalink / raw)
  To: Philipp Rudo
  Cc: catalin.marinas, will.deacon, bauerman, dhowells, vgoyal,
	herbert, davem, akpm, mpe, dyoung, bhe, arnd, ard.biesheuvel,
	julien.thierry, kexec, linux-kernel, linux-arm-kernel

On Mon, Feb 26, 2018 at 12:17:18PM +0100, Philipp Rudo wrote:
> Hi AKASHI
> 
> On Thu, 22 Feb 2018 20:17:22 +0900
> AKASHI Takahiro <takahiro.akashi@linaro.org> wrote:
> 
> [...]
> 
> > diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> > index 990adae52151..a6d14a768b3e 100644
> > --- a/kernel/kexec_file.c
> > +++ b/kernel/kexec_file.c
> > @@ -26,34 +26,83 @@
> >  #include <linux/vmalloc.h>
> >  #include "kexec_internal.h"
> > 
> > +const __weak struct kexec_file_ops * const kexec_file_loaders[] = {NULL};
> > +
> 
> Having a weak definition of kexec_file_loaders causes trouble on s390 with
> gcc 4.8 (newer versions seem to work fine). To me it looks like this gcc
> version doesn't honor __weak and instead uses the default (empty) definition
> for optimization. This leads to _kexec_kernel_image_probe always returning
> -ENOEXEC because the for-loop gets optimized out.

I gave it a try and compiled with gcc 4.9 (not 4.8) for arm64,
and didn't see any errors or warnings there, but:

> The problem can easily be worked around by declaring kexec_file_loaders in
> include/linux/kexec.h and defining it in arch code. In particular doing this
> 
> diff --git a/include/linux/kexec.h b/include/linux/kexec.h
> index 37e9dce518aa..fc0788540d90 100644
> --- a/include/linux/kexec.h
> +++ b/include/linux/kexec.h
> @@ -139,6 +139,8 @@ struct kexec_file_ops {
>  #endif
>  };
>  
> +extern const struct kexec_file_ops * const kexec_file_loaders[];
> +
>  /**
>   * struct kexec_buf - parameters for finding a place for a buffer in memory
>   * @image:	kexec image in which memory to search.
> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> index 17ba407d0e79..4e3d1e4bc7f6 100644
> --- a/kernel/kexec_file.c
> +++ b/kernel/kexec_file.c
> @@ -31,8 +31,6 @@
>  #include <linux/vmalloc.h>
>  #include "kexec_internal.h"
>  
> -const __weak struct kexec_file_ops * const kexec_file_loaders[] = {NULL};
> -
>  #ifdef CONFIG_ARCH_HAS_KEXEC_PURGATORY
>  static int kexec_calculate_store_digests(struct kimage *image);
>  #else

Your change is just fine with me, too.
I will incorporate it in my next version.

Thanks,
-Takahiro AKASHI

> A nice side effect of this solution is that a developer who forgets to define
> kexec_file_loaders gets a linker error, so they directly know what's missing
> instead of first having to find out where and why an error gets returned.
> 
> Otherwise the series is fine for me.
> 
> Thanks
> Philipp
> 
> >  #ifdef CONFIG_ARCH_HAS_KEXEC_PURGATORY
> >  static int kexec_calculate_store_digests(struct kimage *image);
> >  #else
> >  static int kexec_calculate_store_digests(struct kimage *image) { return 0; };
> >  #endif
> > 
> > +int _kexec_kernel_image_probe(struct kimage *image, void *buf,
> > +			     unsigned long buf_len)
> > +{
> > +	const struct kexec_file_ops * const *fops;
> > +	int ret = -ENOEXEC;
> > +
> > +	for (fops = &kexec_file_loaders[0]; *fops && (*fops)->probe; ++fops) {
> > +		ret = (*fops)->probe(buf, buf_len);
> > +		if (!ret) {
> > +			image->fops = *fops;
> > +			return ret;
> > +		}
> > +	}
> > +
> > +	return ret;
> > +}
> > +
> >  /* Architectures can provide this probe function */
> >  int __weak arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
> >  					 unsigned long buf_len)
> >  {
> > -	return -ENOEXEC;
> > +	return _kexec_kernel_image_probe(image, buf, buf_len);
> > +}
> > +
> > +void *_kexec_kernel_image_load(struct kimage *image)
> > +{
> > +	if (!image->fops || !image->fops->load)
> > +		return ERR_PTR(-ENOEXEC);
> > +
> > +	return image->fops->load(image, image->kernel_buf,
> > +				 image->kernel_buf_len, image->initrd_buf,
> > +				 image->initrd_buf_len, image->cmdline_buf,
> > +				 image->cmdline_buf_len);
> >  }
> > 
> >  void * __weak arch_kexec_kernel_image_load(struct kimage *image)
> >  {
> > -	return ERR_PTR(-ENOEXEC);
> > +	return _kexec_kernel_image_load(image);
> > +}
> > +
> > +int _kimage_file_post_load_cleanup(struct kimage *image)
> > +{
> > +	if (!image->fops || !image->fops->cleanup)
> > +		return 0;
> > +
> > +	return image->fops->cleanup(image->image_loader_data);
> >  }
> > 
> >  int __weak arch_kimage_file_post_load_cleanup(struct kimage *image)
> >  {
> > -	return -EINVAL;
> > +	return _kimage_file_post_load_cleanup(image);
> >  }
> > 
> >  #ifdef CONFIG_KEXEC_VERIFY_SIG
> > +int _kexec_kernel_verify_sig(struct kimage *image, void *buf,
> > +			    unsigned long buf_len)
> > +{
> > +	if (!image->fops || !image->fops->verify_sig) {
> > +		pr_debug("kernel loader does not support signature verification.\n");
> > +		return -EKEYREJECTED;
> > +	}
> > +
> > +	return image->fops->verify_sig(buf, buf_len);
> > +}
> > +
> >  int __weak arch_kexec_kernel_verify_sig(struct kimage *image, void *buf,
> >  					unsigned long buf_len)
> >  {
> > -	return -EKEYREJECTED;
> > +	return _kexec_kernel_verify_sig(image, buf, buf_len);
> >  }
> >  #endif
> > 
> 

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v8 00/13] arm64: kexec: add kexec_file_load() support
  2018-02-22 11:17 ` AKASHI Takahiro
  (?)
@ 2018-02-27  4:56   ` AKASHI Takahiro
  -1 siblings, 0 replies; 102+ messages in thread
From: AKASHI Takahiro @ 2018-02-27  4:56 UTC (permalink / raw)
  To: catalin.marinas, will.deacon, bauerman, dhowells, vgoyal,
	herbert, davem, akpm, mpe, dyoung, bhe, arnd, ard.biesheuvel,
	julien.thierry
  Cc: kexec, linux-arm-kernel, linux-kernel

Now my patches #2 to #5 have been extracted from this patch set and put
into a separate series. Please see
http://lists.infradead.org/pipermail/linux-arm-kernel/2018-February/562195.htmlk

Thanks,
-Takahiro AKASHI

On Thu, Feb 22, 2018 at 08:17:19PM +0900, AKASHI Takahiro wrote:
> This is the eighth round of implementing kexec_file_load() support
> on arm64.[1]
> Most of the code is based on kexec-tools (along with some kernel code
> from x86, which also came from kexec-tools).
> 
> 
> This patch series enables us to
>   * load the kernel by specifying its file descriptor, instead of user-
>     filled buffer, at kexec_file_load() system call, and
>   * optionally verify its signature at load time for trusted boot.
> 
> Contrary to kexec_load() system call, as we discussed a long time ago,
> users may not be allowed to provide a device tree to the 2nd kernel
> explicitly, hence enforcing a dt blob of the first kernel to be re-used
> internally.
> 
> To use kexec_file_load() system call, instead of kexec_load(), at kexec
> command, '-s' option must be specified. See [2] for a necessary patch for
> kexec-tools.
> 
> To analyze a generated crash dump file, use the latest master branch of
> crash utility[3] for v4.16-rc kernel. I always try to submit patches to
> fix any inconsistencies introduced in the latest kernel.
> 
> Regarding a kernel image verification, a signature must be presented
> along with the binary itself. A signature is basically a hash value
> calculated against the whole binary data and encrypted by a key which
> will be authenticated by one of the system's trusted certificates.
> Any attempt to read and load a to-be-kexec-ed kernel image through
> a system call will be checked and blocked if the binary's hash value
> doesn't match its associated signature.
> 
> There are two methods available now:
> 1. implementing arch-specific verification hook of kexec_file_load()
> 2. utilizing IMA(Integrity Measurement Architecture)[4] appraisal framework
> 
> Before my v7, I believed that my patch only supports (1) but am now
> confident that (2) comes free if IMA is enabled and properly configured.
> 
> 
> (1) Arch-specific verification hook
> If CONFIG_KEXEC_VERIFY_SIG is enabled, kexec_file_load() invokes an arch-
> defined (and hence file-format-specific) hook function to check for the
> validity of kernel binary.
> 
> On x86, a signature is embedded into a PE file (Microsoft's format) header
> of binary. Since arm64's "Image" can also be seen as a PE file as far as
> CONFIG_EFI is enabled, we adopt this format for kernel signing.  
> 
> As in the case of UEFI applications, we can create a signed kernel image:
>     $ sbsign --key ${KEY} --cert ${CERT} Image
> 
> You may want to use certs/signing_key.pem, which is intended to be used
> for module signing (CONFIG_MODULE_SIG), as ${KEY} and ${CERT} for test
> purpose.
> 
> 
> (2) IMA appraisal-based
> IMA was first introduced in linux in order to meet TCG (Trusted Computing
> Group) requirement that all the sensitive files be *measured* before
> reading/executing them to detect any untrusted changes/modification.
> Then appraisal feature, which allows us to ensure the integrity of
> files and even prevent them from reading/executing, was added later.
> 
> Meanwhile, kexec_file_load() has been merged since v3.17 and evolved to
> enable IMA-appraisal type verification by the commit b804defe4297 ("kexec:
> replace call to copy_file_from_fd() with kernel version").
> 
> In this scheme, a signature will be stored in an extended file attribute,
> "security.ima", while a decryption key is held in a dedicated keyring,
> ".ima" or "_ima".  All the necessary process of verification is confined
> in a secure API, kernel_read_file_from_fd(), called by kexec_file_load().
> 
>     Please note that powerpc is one of the two architectures now
>     supporting KEXEC_FILE, and that it wishes to extend IMA,
>     where a signature may be appended to "vmlinux" file[5], like module
>     signing, instead of using an extended file attribute.
> 
> While IMA meant to be used with TPM (Trusted Platform Module) on secure
> platform, IMA is still usable without TPM. Here is an example procedure
> about how we can give it a try to run the feature using a self-signed
> root ca for demo/test purposes:
> 
>  1) Generate needed keys and certificates, following "Generate trusted
>     keys" section in README of ima-evm-utils[6].
> 
>  2) Build the kernel with the following kernel configurations, specifying
>     "ima-local-ca.pem" for CONFIG_SYSTEM_TRUSTED_KEYS:
> 	CONFIG_EXT4_FS_SECURITY
> 	CONFIG_INTEGRITY_SIGNATURE
> 	CONFIG_INTEGRITY_ASYMMETRIC_KEYS
> 	CONFIG_INTEGRITY_TRUSTED_KEYRING
> 	CONFIG_IMA
> 	CONFIG_IMA_WRITE_POLICY
> 	CONFIG_IMA_READ_POLICY
> 	CONFIG_IMA_APPRAISE
> 	CONFIG_IMA_APPRAISE_BOOTPARAM
> 	CONFIG_SYSTEM_TRUSTED_KEYS
>     Please note that CONFIG_KEXEC_VERIFY_SIG is not (and actually should
>     not be) enabled.
> 
>  3) Sign(label) a kernel image binary to be kexec-ed on target filesystem:
>     $ evmctl ima_sign --key /path/to/private_key.pem /your/Image
> 
>  4) Add a command line parameter and boot the kernel:
>     ima_appraise=enforce
> 
>  On live system,
>  5) Set a security policy:
>     $ mount -t securityfs none /sys/kernel/security
>     $ echo "appraise func=KEXEC_KERNEL_CHECK appraise_type=imasig" \
>       > /sys/kernel/security/ima/policy
> 
>  6) Add a key for ima:
>     $ keyctl padd asymmetric my_ima_key %:.ima < /path/to/x509_ima.der
>     (or evmctl import /path/to/x509_ima.der <ima_keyring_id>)
> 
>  7) Then try kexec as normal.
> 
> 
> Concerns(or future works):
> * Even if the kernel is configured with CONFIG_RANDOMIZE_BASE, the 2nd
>   kernel won't be placed at a randomized address. We will have to
>   add some boot code similar to efi-stub to implement the randomization.
> for approach (1),
> * While big-endian kernel can support kernel signing, I'm not sure that
>   Image can be recognized as in PE format because x86 standard only
>   defines little-endian-based format.
> * vmlinux support
> 
>   [1] http://git.linaro.org/people/takahiro.akashi/linux-aarch64.git
> 	branch:arm64/kexec_file
>   [2] http://git.linaro.org/people/takahiro.akashi/kexec-tools.git
> 	branch:arm64/kexec_file
>   [3] http://github.com/crash-utility/crash.git
>   [4] https://sourceforge.net/p/linux-ima/wiki/Home/
>   [5] http://lkml.iu.edu//hypermail/linux/kernel/1707.0/03669.html
>   [6] https://sourceforge.net/p/linux-ima/ima-evm-utils/ci/master/tree/
> 
> 
> Changes in v8 (Feb 22, 2018)
> * introduce ARCH_HAS_KEXEC_PURGATORY so that arm64 will be able to skip
>   purgatory
> * remove "ifdef CONFIG_X86_64" stuffs from a re-factored function,
>   prepare_elf64_headers(), making its interface more generic
>   (The original patch was split into two for easier reviews.)
> * modify cpu_soft_restart() so as to let the 2nd kernel jump into its entry
>   code directly without requiring purgatory in case of kexec_file_load
> * remove CONFIG_KEXEC_FILE_IMAGE_FMT and introduce
>   CONFIG_KEXEC_IMAGE_VERIFY_SIG, much similar to x86 but quite redundant
>   for now.
> * In addition, update/modify dependencies of KEXEC_IMAGE_VERIFY_SIG
> 
> Changes in v7 (Dec 4, 2017)
> * rebased to v4.15-rc2
> * re-organize the patch set to separate KEXEC_FILE_VERIFY_SIG-related
>   code from the others
> * revamp factored-out code in kernel/kexec_file.c due to the changes
>   in original x86 code
> * redefine walk_sys_ram_res_rev() prototype due to change of callback
>   type in the counterpart, walk_sys_ram_res()
> * make KEXEC_FILE_IMAGE_FMT default on if KEXEC_FILE is selected
> 
> Changes in v6 (Oct 24, 2017)
> * fix a for-loop bug in _kexec_kernel_image_probe() per Julien
> 
> Changes in v5 (Oct 10, 2017)
> * fix kbuild errors around patch #3
> per Julien's comments,
> * fix a bug in walk_system_ram_res_rev() with some cleanup
> * modify fdt_setprop_range() to use vmalloc()
> * modify fill_property() to use memset()
> 
> Changes in v4 (Oct 2, 2017)
> * reinstate x86's arch_kexec_kernel_image_load()
> * rename weak arch_kexec_kernel_xxx() to _kexec_kernel_xxx() for
>   better re-use
> * constify kexec_file_loaders[]
> 
> Changes in v3 (Sep 15, 2017)
> * fix kbuild test error
> * factor out arch_kexec_kernel_*() & arch_kimage_file_post_load_cleanup()
> * remove CONFIG_CRASH_CORE guard from kexec_file.c
> * add vmapped kernel region to vmcore for gdb backtracing
>   (see prepare_elf64_headers())
> * merge asm/kexec_file.h into asm/kexec.h
> * and some cleanups
> 
> Changes in v2 (Sep 8, 2017)
> * move core-header-related functions from crash_core.c to kexec_file.c
> * drop hash-check code from purgatory
> * modify purgatory asm to remove arch_kexec_apply_relocations_add()
> * drop older kernel support
> * drop vmlinux support (at least, for this series)
> 
> 
> Patches #1 to #11 are the essential part for KEXEC_FILE support
> (additionally allowing for IMA-based verification):
>   Patches #1 to #6 are all preparatory patches on the generic side.
>   Patches #7 to #11 enable kexec_file_load on arm64.
> 
> Patches #12 and #13 add KEXEC_VERIFY_SIG (arch-specific verification)
> support.
> 
> AKASHI Takahiro (13):
>   resource: add walk_system_ram_res_rev()
>   kexec_file: make an use of purgatory optional
>   kexec_file,x86,powerpc: factor out kexec_file_ops functions
>   x86: kexec_file: factor out elf core header related functions
>   kexec_file, x86: move re-factored code to generic side
>   asm-generic: add kexec_file_load system call to unistd.h
>   arm64: kexec_file: invoke the kernel without purgatory
>   arm64: kexec_file: load initrd and device-tree
>   arm64: kexec_file: add crash dump support
>   arm64: kexec_file: add Image format support
>   arm64: kexec_file: enable KEXEC_FILE config
>   include: pe.h: remove message[] from mz header definition
>   arm64: kexec_file: enable KEXEC_VERIFY_SIG for Image
> 
>  arch/arm64/Kconfig                          |  34 +++
>  arch/arm64/include/asm/kexec.h              |  90 +++++++
>  arch/arm64/kernel/Makefile                  |   3 +-
>  arch/arm64/kernel/cpu-reset.S               |   6 +-
>  arch/arm64/kernel/kexec_image.c             | 105 ++++++++
>  arch/arm64/kernel/machine_kexec.c           |  11 +-
>  arch/arm64/kernel/machine_kexec_file.c      | 401 ++++++++++++++++++++++++++++
>  arch/arm64/kernel/relocate_kernel.S         |   3 +-
>  arch/powerpc/Kconfig                        |   3 +
>  arch/powerpc/include/asm/kexec.h            |   2 +-
>  arch/powerpc/kernel/kexec_elf_64.c          |   2 +-
>  arch/powerpc/kernel/machine_kexec_file_64.c |  39 +--
>  arch/x86/Kconfig                            |   3 +
>  arch/x86/include/asm/kexec-bzimage64.h      |   2 +-
>  arch/x86/kernel/crash.c                     | 332 +++++------------------
>  arch/x86/kernel/kexec-bzimage64.c           |   2 +-
>  arch/x86/kernel/machine_kexec_64.c          |  45 +---
>  include/linux/ioport.h                      |   3 +
>  include/linux/kexec.h                       |  34 ++-
>  include/linux/pe.h                          |   2 +-
>  include/uapi/asm-generic/unistd.h           |   4 +-
>  kernel/kexec_file.c                         | 238 ++++++++++++++++-
>  kernel/resource.c                           |  57 ++++
>  23 files changed, 1046 insertions(+), 375 deletions(-)
>  create mode 100644 arch/arm64/kernel/kexec_image.c
>  create mode 100644 arch/arm64/kernel/machine_kexec_file.c
> 
> -- 
> 2.16.2
> 

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH v8 00/13] arm64: kexec: add kexec_file_load() support
@ 2018-02-27  4:56   ` AKASHI Takahiro
  0 siblings, 0 replies; 102+ messages in thread
From: AKASHI Takahiro @ 2018-02-27  4:56 UTC (permalink / raw)
  To: linux-arm-kernel

Now my patches #2 to #5 have been extracted from this patch set and put
into a separate series. Please see
http://lists.infradead.org/pipermail/linux-arm-kernel/2018-February/562195.htmlk

Thanks,
-Takahiro AKASHI

On Thu, Feb 22, 2018 at 08:17:19PM +0900, AKASHI Takahiro wrote:
> This is the eighth round of implementing kexec_file_load() support
> on arm64.[1]
> Most of the code is based on kexec-tools (along with some kernel code
> from x86, which also came from kexec-tools).
> 
> 
> This patch series enables us to
>   * load the kernel by specifying its file descriptor, instead of user-
>     filled buffer, at kexec_file_load() system call, and
>   * optionally verify its signature at load time for trusted boot.
> 
> Contrary to the kexec_load() system call, as we discussed a long time ago,
> users may not be allowed to provide a device tree to the 2nd kernel
> explicitly; instead, the dt blob of the first kernel is enforced to be
> re-used internally.
> 
> To use kexec_file_load() system call, instead of kexec_load(), at kexec
> command, '-s' option must be specified. See [2] for a necessary patch for
> kexec-tools.
> 
> To analyze a generated crash dump file, use the latest master branch of
> the crash utility[3] for the v4.16-rc kernel. I always try to submit patches
> to fix any inconsistencies introduced in the latest kernel.
> 
> Regarding kernel image verification, a signature must be presented
> along with the binary itself. A signature is basically a hash value
> calculated against the whole binary data and encrypted by a key which
> will be authenticated by one of the system's trusted certificates.
> Any attempt to read and load a to-be-kexec-ed kernel image through
> a system call will be checked and blocked if the binary's hash value
> doesn't match its associated signature.
> 
> There are two methods available now:
> 1. implementing arch-specific verification hook of kexec_file_load()
> 2. utilizing IMA(Integrity Measurement Architecture)[4] appraisal framework
> 
> Before my v7, I believed that my patch only supports (1) but am now
> confident that (2) comes free if IMA is enabled and properly configured.
> 
> 
> (1) Arch-specific verification hook
> If CONFIG_KEXEC_VERIFY_SIG is enabled, kexec_file_load() invokes an arch-
> defined (and hence file-format-specific) hook function to check the
> validity of the kernel binary.
> 
> On x86, a signature is embedded into the PE file (Microsoft's format) header
> of the binary. Since arm64's "Image" can also be seen as a PE file as long as
> CONFIG_EFI is enabled, we adopt this format for kernel signing.
> 
> As in the case of UEFI applications, we can create a signed kernel image:
>     $ sbsign --key ${KEY} --cert ${CERT} Image
> 
> You may want to use certs/signing_key.pem, which is intended to be used
> for module signing (CONFIG_MODULE_SIG), as ${KEY} and ${CERT} for test
> purposes.
> 
> 
> (2) IMA appraisal-based
> IMA was first introduced in Linux in order to meet the TCG (Trusted Computing
> Group) requirement that all sensitive files be *measured* before being
> read/executed, so that any untrusted changes/modifications can be detected.
> The appraisal feature, which allows us to ensure the integrity of files
> and even prevent them from being read/executed, was added later.
> 
> Meanwhile, kexec_file_load() has been in the kernel since v3.17 and evolved
> to support IMA-appraisal-type verification with commit b804defe4297 ("kexec:
> replace call to copy_file_from_fd() with kernel version").
> 
> In this scheme, a signature will be stored in an extended file attribute,
> "security.ima", while the corresponding public key is held in a dedicated
> keyring, ".ima" or "_ima".  The whole verification process is confined to
> a secure API, kernel_read_file_from_fd(), called by kexec_file_load().
> 
>     Please note that powerpc is one of the two architectures currently
>     supporting KEXEC_FILE, and that it wishes to extend IMA so that
>     a signature may be appended to the "vmlinux" file[5], like module
>     signing, instead of being stored in an extended file attribute.
> 
> While IMA is meant to be used with a TPM (Trusted Platform Module) on a
> secure platform, it is still usable without one. Here is an example
> procedure for trying out the feature using a self-signed root CA for
> demo/test purposes:
> 
>  1) Generate needed keys and certificates, following "Generate trusted
>     keys" section in README of ima-evm-utils[6].
> 
>  2) Build the kernel with the following kernel configurations, specifying
>     "ima-local-ca.pem" for CONFIG_SYSTEM_TRUSTED_KEYS:
> 	CONFIG_EXT4_FS_SECURITY
> 	CONFIG_INTEGRITY_SIGNATURE
> 	CONFIG_INTEGRITY_ASYMMETRIC_KEYS
> 	CONFIG_INTEGRITY_TRUSTED_KEYRING
> 	CONFIG_IMA
> 	CONFIG_IMA_WRITE_POLICY
> 	CONFIG_IMA_READ_POLICY
> 	CONFIG_IMA_APPRAISE
> 	CONFIG_IMA_APPRAISE_BOOTPARAM
> 	CONFIG_SYSTEM_TRUSTED_KEYS
>     Please note that CONFIG_KEXEC_VERIFY_SIG is not, and actually should
>     not be, enabled.
> 
>  3) Sign (label) a kernel image binary to be kexec-ed on the target filesystem:
>     $ evmctl ima_sign --key /path/to/private_key.pem /your/Image
> 
>  4) Add a command line parameter and boot the kernel:
>     ima_appraise=enforce
> 
>  On the live system,
>  5) Set a security policy:
>     $ mount -t securityfs none /sys/kernel/security
>     $ echo "appraise func=KEXEC_KERNEL_CHECK appraise_type=imasig" \
>       > /sys/kernel/security/ima/policy
> 
>  6) Add a key for ima:
>     $ keyctl padd asymmetric my_ima_key %:.ima < /path/to/x509_ima.der
>     (or evmctl import /path/to/x509_ima.der <ima_keyring_id>)
> 
>  7) Then try kexec as normal.
> 
> 
> Concerns (or future work):
> * Even if the kernel is configured with CONFIG_RANDOMIZE_BASE, the 2nd
>   kernel won't be placed at a randomized address. We will have to
>   add some boot code similar to the efi-stub to implement the randomization.
> For approach (1):
> * While a big-endian kernel can support kernel signing, I'm not sure that
>   its Image can be recognized as a PE file, because the PE standard only
>   defines a little-endian layout.
> * vmlinux support
> 
>   [1] http://git.linaro.org/people/takahiro.akashi/linux-aarch64.git
> 	branch:arm64/kexec_file
>   [2] http://git.linaro.org/people/takahiro.akashi/kexec-tools.git
> 	branch:arm64/kexec_file
>   [3] http://github.com/crash-utility/crash.git
>   [4] https://sourceforge.net/p/linux-ima/wiki/Home/
>   [5] http://lkml.iu.edu//hypermail/linux/kernel/1707.0/03669.html
>   [6] https://sourceforge.net/p/linux-ima/ima-evm-utils/ci/master/tree/
> 
> 
> Changes in v8 (Feb 22, 2018)
> * introduce ARCH_HAS_KEXEC_PURGATORY so that arm64 will be able to skip
>   purgatory
> * remove "ifdef CONFIG_X86_64" stuffs from a re-factored function,
>   prepare_elf64_headers(), making its interface more generic
>   (The original patch was split into two for easier reviews.)
> * modify cpu_soft_restart() so as to let the 2nd kernel jump into its entry
>   code directly without requiring purgatory in case of kexec_file_load
> * remove CONFIG_KEXEC_FILE_IMAGE_FMT and introduce
>   CONFIG_KEXEC_IMAGE_VERIFY_SIG, much similar to x86 but quite redundant
>   for now.
> * In addition, update/modify dependencies of KEXEC_IMAGE_VERIFY_SIG
> 
> Changes in v7 (Dec 4, 2017)
> * rebased to v4.15-rc2
> * re-organize the patch set to separate KEXEC_FILE_VERIFY_SIG-related
>   code from the others
> * revamp factored-out code in kernel/kexec_file.c due to the changes
>   in original x86 code
> * redefine walk_sys_ram_res_rev() prototype due to change of callback
>   type in the counterpart, walk_sys_ram_res()
> * make KEXEC_FILE_IMAGE_FMT default on if KEXEC_FILE is selected
> 
> Changes in v6 (Oct 24, 2017)
> * fix a for-loop bug in _kexec_kernel_image_probe() per Julien
> 
> Changes in v5 (Oct 10, 2017)
> * fix kbuild errors around patch #3
> per Julien's comments,
> * fix a bug in walk_system_ram_res_rev() with some cleanup
> * modify fdt_setprop_range() to use vmalloc()
> * modify fill_property() to use memset()
> 
> Changes in v4 (Oct 2, 2017)
> * reinstate x86's arch_kexec_kernel_image_load()
> * rename weak arch_kexec_kernel_xxx() to _kexec_kernel_xxx() for
>   better re-use
> * constify kexec_file_loaders[]
> 
> Changes in v3 (Sep 15, 2017)
> * fix kbuild test error
> * factor out arch_kexec_kernel_*() & arch_kimage_file_post_load_cleanup()
> * remove CONFIG_CRASH_CORE guard from kexec_file.c
> * add vmapped kernel region to vmcore for gdb backtracing
>   (see prepare_elf64_headers())
> * merge asm/kexec_file.h into asm/kexec.h
> * and some cleanups
> 
> Changes in v2 (Sep 8, 2017)
> * move core-header-related functions from crash_core.c to kexec_file.c
> * drop hash-check code from purgatory
> * modify purgatory asm to remove arch_kexec_apply_relocations_add()
> * drop older kernel support
> * drop vmlinux support (at least, for this series)
> 
> 
> Patches #1 to #11 are the essential part for KEXEC_FILE support
> (additionally allowing for IMA-based verification):
>   Patches #1 to #6 are all preparatory patches on the generic side.
>   Patches #7 to #11 enable kexec_file_load on arm64.
> 
> Patches #12 and #13 add KEXEC_VERIFY_SIG (arch-specific verification)
> support.
> 
> AKASHI Takahiro (13):
>   resource: add walk_system_ram_res_rev()
>   kexec_file: make an use of purgatory optional
>   kexec_file,x86,powerpc: factor out kexec_file_ops functions
>   x86: kexec_file: factor out elf core header related functions
>   kexec_file, x86: move re-factored code to generic side
>   asm-generic: add kexec_file_load system call to unistd.h
>   arm64: kexec_file: invoke the kernel without purgatory
>   arm64: kexec_file: load initrd and device-tree
>   arm64: kexec_file: add crash dump support
>   arm64: kexec_file: add Image format support
>   arm64: kexec_file: enable KEXEC_FILE config
>   include: pe.h: remove message[] from mz header definition
>   arm64: kexec_file: enable KEXEC_VERIFY_SIG for Image
> 
>  arch/arm64/Kconfig                          |  34 +++
>  arch/arm64/include/asm/kexec.h              |  90 +++++++
>  arch/arm64/kernel/Makefile                  |   3 +-
>  arch/arm64/kernel/cpu-reset.S               |   6 +-
>  arch/arm64/kernel/kexec_image.c             | 105 ++++++++
>  arch/arm64/kernel/machine_kexec.c           |  11 +-
>  arch/arm64/kernel/machine_kexec_file.c      | 401 ++++++++++++++++++++++++++++
>  arch/arm64/kernel/relocate_kernel.S         |   3 +-
>  arch/powerpc/Kconfig                        |   3 +
>  arch/powerpc/include/asm/kexec.h            |   2 +-
>  arch/powerpc/kernel/kexec_elf_64.c          |   2 +-
>  arch/powerpc/kernel/machine_kexec_file_64.c |  39 +--
>  arch/x86/Kconfig                            |   3 +
>  arch/x86/include/asm/kexec-bzimage64.h      |   2 +-
>  arch/x86/kernel/crash.c                     | 332 +++++------------------
>  arch/x86/kernel/kexec-bzimage64.c           |   2 +-
>  arch/x86/kernel/machine_kexec_64.c          |  45 +---
>  include/linux/ioport.h                      |   3 +
>  include/linux/kexec.h                       |  34 ++-
>  include/linux/pe.h                          |   2 +-
>  include/uapi/asm-generic/unistd.h           |   4 +-
>  kernel/kexec_file.c                         | 238 ++++++++++++++++-
>  kernel/resource.c                           |  57 ++++
>  23 files changed, 1046 insertions(+), 375 deletions(-)
>  create mode 100644 arch/arm64/kernel/kexec_image.c
>  create mode 100644 arch/arm64/kernel/machine_kexec_file.c
> 
> -- 
> 2.16.2
> 

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v8 03/13] kexec_file, x86, powerpc: factor out kexec_file_ops functions
  2018-02-27  2:03       ` AKASHI Takahiro
  (?)
@ 2018-02-27  9:26         ` Philipp Rudo
  -1 siblings, 0 replies; 102+ messages in thread
From: Philipp Rudo @ 2018-02-27  9:26 UTC (permalink / raw)
  To: AKASHI Takahiro
  Cc: catalin.marinas, will.deacon, bauerman, dhowells, vgoyal,
	herbert, davem, akpm, mpe, dyoung, bhe, arnd, ard.biesheuvel,
	julien.thierry, kexec, linux-kernel, linux-arm-kernel

On Tue, 27 Feb 2018 11:03:07 +0900
AKASHI Takahiro <takahiro.akashi@linaro.org> wrote:

> On Mon, Feb 26, 2018 at 12:17:18PM +0100, Philipp Rudo wrote:
> > Hi AKASHI
> > 
> > On Thu, 22 Feb 2018 20:17:22 +0900
> > AKASHI Takahiro <takahiro.akashi@linaro.org> wrote:
> > 
> > [...]
> >   
> > > diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> > > index 990adae52151..a6d14a768b3e 100644
> > > --- a/kernel/kexec_file.c
> > > +++ b/kernel/kexec_file.c
> > > @@ -26,34 +26,83 @@
> > >  #include <linux/vmalloc.h>
> > >  #include "kexec_internal.h"
> > > 
> > > +const __weak struct kexec_file_ops * const kexec_file_loaders[] = {NULL};
> > > +  
> > 
> > Having a weak definition of kexec_file_loaders causes trouble on s390 with
> > gcc 4.8 (newer versions seem to work fine). To me it looks like, in this
> > version, gcc doesn't honor __weak but uses the default (empty) value for
> > optimization. This leads to _kexec_kernel_image_probe() always returning
> > -ENOEXEC because the for-loop gets optimized out.
> 
> I gave it a try to compile with gcc 4.9 (not 4.8) for arm64
> and didn't see any errors or warnings, but

I talked to our compiler guys, and it looks like it's a bug in gcc which was
introduced with gcc 4.8 and removed again in gcc 4.9. So I was just extremely
lucky, hitting the sweet spot...


> > The problem can easily be worked around by declaring kexec_file_loaders in
> > include/linux/kexec.h and defining it in arch code. In particular, doing this:
> > 
> > diff --git a/include/linux/kexec.h b/include/linux/kexec.h
> > index 37e9dce518aa..fc0788540d90 100644
> > --- a/include/linux/kexec.h
> > +++ b/include/linux/kexec.h
> > @@ -139,6 +139,8 @@ struct kexec_file_ops {
> >  #endif
> >  };
> >  
> > +extern const struct kexec_file_ops * const kexec_file_loaders[];
> > +
> >  /**
> >   * struct kexec_buf - parameters for finding a place for a buffer in memory
> >   * @image:	kexec image in which memory to search.
> > diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> > index 17ba407d0e79..4e3d1e4bc7f6 100644
> > --- a/kernel/kexec_file.c
> > +++ b/kernel/kexec_file.c
> > @@ -31,8 +31,6 @@
> >  #include <linux/vmalloc.h>
> >  #include "kexec_internal.h"
> >  
> > -const __weak struct kexec_file_ops * const kexec_file_loaders[] = {NULL};
> > -
> >  #ifdef CONFIG_ARCH_HAS_KEXEC_PURGATORY
> >  static int kexec_calculate_store_digests(struct kimage *image);
> >  #else  
> 
> Your change is just fine with me, too.
> I will incorporate it in my next version.

Thanks a lot
Philipp

> Thanks,
> -Takahiro AKASHI
> 
> > A nice side effect of this solution is that a developer who forgets to define
> > kexec_file_loaders gets a linker error. So he directly knows what's missing,
> > instead of first having to find out where/why an error gets returned.
> > 
> > Otherwise the series is fine for me.
> > 
> > Thanks
> > Philipp
> >   
> > >  #ifdef CONFIG_ARCH_HAS_KEXEC_PURGATORY
> > >  static int kexec_calculate_store_digests(struct kimage *image);
> > >  #else
> > >  static int kexec_calculate_store_digests(struct kimage *image) { return 0; };
> > >  #endif
> > > 
> > > +int _kexec_kernel_image_probe(struct kimage *image, void *buf,
> > > +			     unsigned long buf_len)
> > > +{
> > > +	const struct kexec_file_ops * const *fops;
> > > +	int ret = -ENOEXEC;
> > > +
> > > +	for (fops = &kexec_file_loaders[0]; *fops && (*fops)->probe; ++fops) {
> > > +		ret = (*fops)->probe(buf, buf_len);
> > > +		if (!ret) {
> > > +			image->fops = *fops;
> > > +			return ret;
> > > +		}
> > > +	}
> > > +
> > > +	return ret;
> > > +}
> > > +
> > >  /* Architectures can provide this probe function */
> > >  int __weak arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
> > >  					 unsigned long buf_len)
> > >  {
> > > -	return -ENOEXEC;
> > > +	return _kexec_kernel_image_probe(image, buf, buf_len);
> > > +}
> > > +
> > > +void *_kexec_kernel_image_load(struct kimage *image)
> > > +{
> > > +	if (!image->fops || !image->fops->load)
> > > +		return ERR_PTR(-ENOEXEC);
> > > +
> > > +	return image->fops->load(image, image->kernel_buf,
> > > +				 image->kernel_buf_len, image->initrd_buf,
> > > +				 image->initrd_buf_len, image->cmdline_buf,
> > > +				 image->cmdline_buf_len);
> > >  }
> > > 
> > >  void * __weak arch_kexec_kernel_image_load(struct kimage *image)
> > >  {
> > > -	return ERR_PTR(-ENOEXEC);
> > > +	return _kexec_kernel_image_load(image);
> > > +}
> > > +
> > > +int _kimage_file_post_load_cleanup(struct kimage *image)
> > > +{
> > > +	if (!image->fops || !image->fops->cleanup)
> > > +		return 0;
> > > +
> > > +	return image->fops->cleanup(image->image_loader_data);
> > >  }
> > > 
> > >  int __weak arch_kimage_file_post_load_cleanup(struct kimage *image)
> > >  {
> > > -	return -EINVAL;
> > > +	return _kimage_file_post_load_cleanup(image);
> > >  }
> > > 
> > >  #ifdef CONFIG_KEXEC_VERIFY_SIG
> > > +int _kexec_kernel_verify_sig(struct kimage *image, void *buf,
> > > +			    unsigned long buf_len)
> > > +{
> > > +	if (!image->fops || !image->fops->verify_sig) {
> > > +		pr_debug("kernel loader does not support signature verification.\n");
> > > +		return -EKEYREJECTED;
> > > +	}
> > > +
> > > +	return image->fops->verify_sig(buf, buf_len);
> > > +}
> > > +
> > >  int __weak arch_kexec_kernel_verify_sig(struct kimage *image, void *buf,
> > >  					unsigned long buf_len)
> > >  {
> > > -	return -EKEYREJECTED;
> > > +	return _kexec_kernel_verify_sig(image, buf, buf_len);
> > >  }
> > >  #endif
> > >   
> >   
> 

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v8 00/13] arm64: kexec: add kexec_file_load() support
  2018-02-27  4:56   ` AKASHI Takahiro
  (?)
@ 2018-02-28 12:25     ` Dave Young
  -1 siblings, 0 replies; 102+ messages in thread
From: Dave Young @ 2018-02-28 12:25 UTC (permalink / raw)
  To: AKASHI Takahiro, catalin.marinas, will.deacon, bauerman,
	dhowells, vgoyal, herbert, davem, akpm, mpe, bhe, arnd,
	ard.biesheuvel, julien.thierry, kexec, linux-arm-kernel,
	linux-kernel

Hi AKASHI,
On 02/27/18 at 01:56pm, AKASHI Takahiro wrote:
> Now my patch#2 to #5 were extracted from this patch set and put
> into another separate one. Please see
> http://lists.infradead.org/pipermail/linux-arm-kernel/2018-February/562195.htmlk

Thanks!  Will read them

> 
> Thanks,
> -Takahiro AKASHI
> 
> On Thu, Feb 22, 2018 at 08:17:19PM +0900, AKASHI Takahiro wrote:
> > This is the eighth round of implementing kexec_file_load() support
> > on arm64.[1]
> > Most of the code is based on kexec-tools (along with some kernel code
> > from x86, which also came from kexec-tools).
> > 
> > 
> > This patch series enables us to
> >   * load the kernel by specifying its file descriptor, instead of user-
> >     filled buffer, at kexec_file_load() system call, and
> >   * optionally verify its signature at load time for trusted boot.
> > 
> > Contrary to kexec_load() system call, as we discussed a long time ago,
> > users may not be allowed to provide a device tree to the 2nd kernel
> > explicitly, hence enforcing a dt blob of the first kernel to be re-used
> > internally.
> > 
> > To use kexec_file_load() system call, instead of kexec_load(), at kexec
> > command, '-s' option must be specified. See [2] for a necessary patch for
> > kexec-tools.
> > 
> > To anaylize a generated crash dump file, use the latest master branch of
> > crash utility[3] for v4.16-rc kernel. I always try to submit patches to
> > fix any inconsistencies introduced in the latest kernel.
> > 
> > Regarding a kernel image verification, a signature must be presented
> > along with the binary itself. A signature is basically a hash value
> > calculated against the whole binary data and encrypted by a key which
> > will be authenticated by one of the system's trusted certificates.
> > Any attempt to read and load a to-be-kexec-ed kernel image through
> > a system call will be checked and blocked if the binary's hash value
> > doesn't match its associated signature.
> > 
> > There are two methods available now:
> > 1. implementing arch-specific verification hook of kexec_file_load()
> > 2. utilizing IMA(Integrity Measurement Architecture)[4] appraisal framework
> > 
> > Before my v7, I believed that my patch only supports (1) but am now
> > confident that (2) comes free if IMA is enabled and properly configured.
> > 
> > 
> > (1) Arch-specific verification hook
> > If CONFIG_KEXEC_VERIFY_SIG is enabled, kexec_file_load() invokes an arch-
> > defined (and hence file-format-specific) hook function to check for the
> > validity of kernel binary.
> > 
> > On x86, a signature is embedded into a PE file (Microsoft's format) header
> > of binary. Since arm64's "Image" can also be seen as a PE file as far as
> > CONFIG_EFI is enabled, we adopt this format for kernel signing.  
> > 
> > As in the case of UEFI applications, we can create a signed kernel image:
> >     $ sbsign --key ${KEY} --cert ${CERT} Image
> > 
> > You may want to use certs/signing_key.pem, which is intended to be used
> > for module sigining (CONFIG_MODULE_SIG), as ${KEY} and ${CERT} for test
> > purpose.
> > 
> > 
> > (2) IMA appraisal-based
> > IMA was first introduced in linux in order to meet TCG (Trusted Computing
> > Group) requirement that all the sensitive files be *measured* before
> > reading/executing them to detect any untrusted changes/modification.
> > Then appraisal feature, which allows us to ensure the integrity of
> > files and even prevent them from reading/executing, was added later.
> > 
> > Meanwhile, kexec_file_load() has been merged since v3.17 and evolved to
> > enable IMA-appraisal type verification by the commit b804defe4297 ("kexec:
> > replace call to copy_file_from_fd() with kernel version").
> > 
> > In this scheme, a signature will be stored in a extended file attribute,
> > "security.ima" while a decryption key is hold in a dedicated keyring,
> > ".ima" or "_ima".  All the necessary process of verification is confined
> > in a secure API, kernel_read_file_from_fd(), called by kexec_file_load().
> > 
> >     Please note that powerpc is one of the two architectures now
> >     supporting KEXEC_FILE, and that it wishes to exntend IMA,
> >     where a signature may be appended to "vmlinux" file[5], like module
> >     signing, instead of using an extended file attribute.
> > 
> > While IMA meant to be used with TPM (Trusted Platform Module) on secure
> > platform, IMA is still usable without TPM. Here is an example procedure
> > about how we can give it a try to run the feature using a self-signed
> > root ca for demo/test purposes:
> > 
> >  1) Generate needed keys and certificates, following "Generate trusted
> >     keys" section in README of ima-evm-utils[6].
> > 
> >  2) Build the kernel with the following kernel configurations, specifying
> >     "ima-local-ca.pem" for CONFIG_SYSTEM_TRUSTED_KEYS:
> > 	CONFIG_EXT4_FS_SECURITY
> > 	CONFIG_INTEGRITY_SIGNATURE
> > 	CONFIG_INTEGRITY_ASYMMETRIC_KEYS
> > 	CONFIG_INTEGRITY_TRUSTED_KEYRING
> > 	CONFIG_IMA
> > 	CONFIG_IMA_WRITE_POLICY
> > 	CONFIG_IMA_READ_POLICY
> > 	CONFIG_IMA_APPRAISE
> > 	CONFIG_IMA_APPRAISE_BOOTPARAM
> > 	CONFIG_SYSTEM_TRUSTED_KEYS
> >     Please note that CONFIG_KEXEC_VERIFY_SIG is not, actually should
> >     not be, enabled.
> > 
> >  3) Sign(label) a kernel image binary to be kexec-ed on target filesystem:
> >     $ evmctl ima_sign --key /path/to/private_key.pem /your/Image
> > 
> >  4) Add a command line parameter and boot the kernel:
> >     ima_appraise=enforce
> > 
> >  On live system,
> >  5) Set a security policy:
> >     $ mount -t securityfs none /sys/kernel/security
> >     $ echo "appraise func=KEXEC_KERNEL_CHECK appraise_type=imasig" \
> >       > /sys/kernel/security/ima/policy
> > 
> >  6) Add a key for ima:
> >     $ keyctl padd asymmetric my_ima_key %:.ima < /path/to/x509_ima.der
> >     (or evmctl import /path/to/x509_ima.der <ima_keyring_id>)
> > 
> >  7) Then try kexec as normal.
> > 
> > 
> > Concerns(or future works):
> > * Even if the kernel is configured with CONFIG_RANDOMIZE_BASE, the 2nd
> >   kernel won't be placed at a randomized address. We will have to
> >   add some boot code similar to efi-stub to implement the randomization.
> > for approach (1),
> > * While big-endian kernel can support kernel signing, I'm not sure that
> >   Image can be recognized as in PE format because x86 standard only
> >   defines little-endian-based format.
> > * vmlinux support
> > 
> >   [1] http://git.linaro.org/people/takahiro.akashi/linux-aarch64.git
> > 	branch:arm64/kexec_file
> >   [2] http://git.linaro.org/people/takahiro.akashi/kexec-tools.git
> > 	branch:arm64/kexec_file
> >   [3] http://github.com/crash-utility/crash.git
> >   [4] https://sourceforge.net/p/linux-ima/wiki/Home/
> >   [5] http://lkml.iu.edu//hypermail/linux/kernel/1707.0/03669.html
> >   [6] https://sourceforge.net/p/linux-ima/ima-evm-utils/ci/master/tree/
> > 
> > 
> > Changes in v8 (Feb 22, 2018)
> > * introduce ARCH_HAS_KEXEC_PURGATORY so that arm64 will be able to skip
> >   purgatory
> > * remove "ifdef CONFIG_X86_64" stuffs from a re-factored function,
> >   prepare_elf64_headers(), making its interface more generic
> >   (The original patch was split into two for easier reviews.)
> > * modify cpu_soft_restart() so as to let the 2nd kernel jump into its entry
> >   code directly without requiring purgatory in case of kexec_file_load
> > * remove CONFIG_KEXEC_FILE_IMAGE_FMT and introduce
> >   CONFIG_KEXEC_IMAGE_VERIFY_SIG, much similar to x86 but quite redundant
> >   for now.
> > * In addition, update/modify dependencies of KEXEC_IMAGE_VERIFY_SIG
> > 
> > Changes in v7 (Dec 4, 2017)
> > * rebased to v4.15-rc2
> > * re-organize the patch set to separate KEXEC_FILE_VERIFY_SIG-related
> >   code from the others
> > * revamp factored-out code in kernel/kexec_file.c due to the changes
> >   in original x86 code
> > * redefine walk_sys_ram_res_rev() prototype due to change of callback
> >   type in the counterpart, walk_sys_ram_res()
> > * make KEXEC_FILE_IMAGE_FMT defaut on if KEXEC_FILE selected
> > 
> > Changes in v6 (Oct 24, 2017)
> > * fix a for-loop bug in _kexec_kernel_image_probe() per Julien
> > 
> > Changes in v5 (Oct 10, 2017)
> > * fix kbuild errors around patch #3
> > per Julien's comments,
> > * fix a bug in walk_system_ram_res_rev() with some cleanup
> > * modify fdt_setprop_range() to use vmalloc()
> > * modify fill_property() to use memset()
> > 
> > Changes in v4 (Oct 2, 2017)
> > * reinstate x86's arch_kexec_kernel_image_load()
> > * rename weak arch_kexec_kernel_xxx() to _kexec_kernel_xxx() for
> >   better re-use
> > * constify kexec_file_loaders[]
> > 
> > Changes in v3 (Sep 15, 2017)
> > * fix kbuild test error
> > * factor out arch_kexec_kernel_*() & arch_kimage_file_post_load_cleanup()
> > * remove CONFIG_CRASH_CORE guard from kexec_file.c
> > * add vmapped kernel region to vmcore for gdb backtracing
> >   (see prepare_elf64_headers())
> > * merge asm/kexec_file.h into asm/kexec.h
> > * and some cleanups
> > 
> > Changes in v2 (Sep 8, 2017)
> > * move core-header-related functions from crash_core.c to kexec_file.c
> > * drop hash-check code from purgatory
> > * modify purgatory asm to remove arch_kexec_apply_relocations_add()
> > * drop older kernel support
> > * drop vmlinux support (at least, for this series)
> > 
> > 
> > Patches #1 to #11 are the essential part of KEXEC_FILE support
> > (additionally allowing for IMA-based verification):
> >   Patches #1 to #6 are preparatory patches on the generic side.
> >   Patches #7 to #11 enable kexec_file_load on arm64.
> > 
> > Patches #12 and #13 add KEXEC_VERIFY_SIG (arch-specific verification)
> > support.
> > 
> > AKASHI Takahiro (13):
> >   resource: add walk_system_ram_res_rev()
> >   kexec_file: make an use of purgatory optional
> >   kexec_file,x86,powerpc: factor out kexec_file_ops functions
> >   x86: kexec_file: factor out elf core header related functions
> >   kexec_file, x86: move re-factored code to generic side
> >   asm-generic: add kexec_file_load system call to unistd.h
> >   arm64: kexec_file: invoke the kernel without purgatory
> >   arm64: kexec_file: load initrd and device-tree
> >   arm64: kexec_file: add crash dump support
> >   arm64: kexec_file: add Image format support
> >   arm64: kexec_file: enable KEXEC_FILE config
> >   include: pe.h: remove message[] from mz header definition
> >   arm64: kexec_file: enable KEXEC_VERIFY_SIG for Image
> > 
> >  arch/arm64/Kconfig                          |  34 +++
> >  arch/arm64/include/asm/kexec.h              |  90 +++++++
> >  arch/arm64/kernel/Makefile                  |   3 +-
> >  arch/arm64/kernel/cpu-reset.S               |   6 +-
> >  arch/arm64/kernel/kexec_image.c             | 105 ++++++++
> >  arch/arm64/kernel/machine_kexec.c           |  11 +-
> >  arch/arm64/kernel/machine_kexec_file.c      | 401 ++++++++++++++++++++++++++++
> >  arch/arm64/kernel/relocate_kernel.S         |   3 +-
> >  arch/powerpc/Kconfig                        |   3 +
> >  arch/powerpc/include/asm/kexec.h            |   2 +-
> >  arch/powerpc/kernel/kexec_elf_64.c          |   2 +-
> >  arch/powerpc/kernel/machine_kexec_file_64.c |  39 +--
> >  arch/x86/Kconfig                            |   3 +
> >  arch/x86/include/asm/kexec-bzimage64.h      |   2 +-
> >  arch/x86/kernel/crash.c                     | 332 +++++------------------
> >  arch/x86/kernel/kexec-bzimage64.c           |   2 +-
> >  arch/x86/kernel/machine_kexec_64.c          |  45 +---
> >  include/linux/ioport.h                      |   3 +
> >  include/linux/kexec.h                       |  34 ++-
> >  include/linux/pe.h                          |   2 +-
> >  include/uapi/asm-generic/unistd.h           |   4 +-
> >  kernel/kexec_file.c                         | 238 ++++++++++++++++-
> >  kernel/resource.c                           |  57 ++++
> >  23 files changed, 1046 insertions(+), 375 deletions(-)
> >  create mode 100644 arch/arm64/kernel/kexec_image.c
> >  create mode 100644 arch/arm64/kernel/machine_kexec_file.c
> > 
> > -- 
> > 2.16.2
> > 

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH v8 00/13] arm64: kexec: add kexec_file_load() support
@ 2018-02-28 12:25     ` Dave Young
  0 siblings, 0 replies; 102+ messages in thread
From: Dave Young @ 2018-02-28 12:25 UTC (permalink / raw)
  To: linux-arm-kernel

Hi AKASHI,
On 02/27/18 at 01:56pm, AKASHI Takahiro wrote:
> Now my patch#2 to #5 were extracted from this patch set and put
> into another separate one. Please see
> http://lists.infradead.org/pipermail/linux-arm-kernel/2018-February/562195.htmlk

Thanks!  Will read them

> 
> Thanks,
> -Takahiro AKASHI
> 
> On Thu, Feb 22, 2018 at 08:17:19PM +0900, AKASHI Takahiro wrote:
> > This is the eighth round of implementing kexec_file_load() support
> > on arm64.[1]
> > Most of the code is based on kexec-tools (along with some kernel code
> > from x86, which also came from kexec-tools).
> > 
> > 
> > This patch series enables us to
> >   * load the kernel by specifying its file descriptor, instead of a user-
> >     filled buffer, via the kexec_file_load() system call, and
> >   * optionally verify its signature at load time for trusted boot.
> > 
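> > For reference, the system call itself takes file descriptors rather than
> > user-filled segment buffers, and there is no glibc wrapper, so a caller
> > goes through syscall(2). A minimal sketch (error handling omitted; the file
> > names and command line are only examples; CAP_SYS_BOOT is required):
> > 
> >     #include <fcntl.h>
> >     #include <string.h>
> >     #include <sys/syscall.h>
> >     #include <unistd.h>
> > 
> >     int main(void)
> >     {
> >             const char *cmdline = "console=ttyAMA0 root=/dev/vda2";
> >             int kernel_fd = open("/boot/Image", O_RDONLY);
> >             int initrd_fd = open("/boot/initrd.img", O_RDONLY);
> > 
> >             /* cmdline_len includes the terminating NUL */
> >             return syscall(__NR_kexec_file_load, kernel_fd, initrd_fd,
> >                            strlen(cmdline) + 1, cmdline, 0UL);
> >     }
> > 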
> > Contrary to the kexec_load() system call, as we discussed a long time ago,
> > users are not allowed to provide a device tree to the 2nd kernel
> > explicitly; instead, the dt blob of the first kernel is re-used
> > internally.
> > 
> > To use the kexec_file_load() system call instead of kexec_load(), the '-s'
> > option must be specified on the kexec command line. See [2] for the
> > necessary kexec-tools patch.
> > 
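> > For example, with a patched kexec-tools installed, loading and then booting
> > the new kernel might look like this (file names are only illustrative):
> > 
> >     $ kexec -s -l /boot/Image --initrd=/boot/initrd.img --reuse-cmdline
> >     $ kexec -e
> > 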
> > To analyze a generated crash dump file, use the latest master branch of
> > crash utility[3] for v4.16-rc kernel. I always try to submit patches to
> > fix any inconsistencies introduced in the latest kernel.
> > 
> > Regarding kernel image verification, a signature must be presented
> > along with the binary itself. A signature is basically a hash value
> > calculated against the whole binary data and encrypted by a key which
> > will be authenticated by one of the system's trusted certificates.
> > Any attempt to read and load a to-be-kexec-ed kernel image through
> > a system call will be checked and blocked if the binary's hash value
> > doesn't match its associated signature.
> > 
> > There are two methods available now:
> > 1. implementing arch-specific verification hook of kexec_file_load()
> > 2. utilizing IMA(Integrity Measurement Architecture)[4] appraisal framework
> > 
> > Before my v7, I believed that my patch only supports (1) but am now
> > confident that (2) comes free if IMA is enabled and properly configured.
> > 
> > 
> > (1) Arch-specific verification hook
> > If CONFIG_KEXEC_VERIFY_SIG is enabled, kexec_file_load() invokes an arch-
> > defined (and hence file-format-specific) hook function to check the
> > validity of the kernel binary.
> > 
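> > On the generic side, the hook is a weak function with roughly the following
> > shape, to be overridden per architecture (and per file format):
> > 
> >     int __weak arch_kexec_kernel_verify_sig(struct kimage *image, void *buf,
> >                                             unsigned long buf_len)
> >     {
> >             return -EKEYREJECTED;
> >     }
> > 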
> > On x86, a signature is embedded into the PE file (Microsoft's format) header
> > of the binary. Since arm64's "Image" can also be seen as a PE file as long as
> > CONFIG_EFI is enabled, we adopt this format for kernel signing.
> > 
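> > (With CONFIG_EFI=y the very first bytes of Image read as the PE "MZ" magic;
> > a quick sanity check, with the path only as an example, is something like
> >     $ od -A x -t x1z Image | head -n 1
> > whose output should begin with "4d 5a".)
> > 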
> > As in the case of UEFI applications, we can create a signed kernel image:
> >     $ sbsign --key ${KEY} --cert ${CERT} Image
> > 
> > You may want to use certs/signing_key.pem, which is intended to be used
> > for module signing (CONFIG_MODULE_SIG), as ${KEY} and ${CERT} for test
> > purposes.
> > 
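> > The result can be double-checked with the matching tool from sbsigntools,
> > e.g.:
> >     $ sbverify --cert ${CERT} Image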
> > 
> > (2) IMA appraisal-based
> > IMA was first introduced in Linux to meet the TCG (Trusted Computing
> > Group) requirement that all sensitive files be *measured* before being
> > read/executed, so that any untrusted changes/modifications can be detected.
> > The appraisal feature, which allows us to ensure the integrity of
> > files and even prevent them from being read/executed, was added later.
> > 
> > Meanwhile, kexec_file_load() has been merged since v3.17 and evolved to
> > enable IMA-appraisal type verification by the commit b804defe4297 ("kexec:
> > replace call to copy_file_from_fd() with kernel version").
> > 
> > In this scheme, a signature will be stored in an extended file attribute,
> > "security.ima", while the verification key is held in a dedicated keyring,
> > ".ima" or "_ima".  All the necessary verification steps are confined
> > within a secure API, kernel_read_file_from_fd(), called by kexec_file_load().
> > 
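> > For instance, in the v4.15-era code the kernel image is pulled in roughly
> > like this (in kimage_file_prepare_segments()); the READING_KEXEC_IMAGE id
> > is what lets IMA apply its KEXEC_KERNEL_CHECK policy rule:
> > 
> >     ret = kernel_read_file_from_fd(kernel_fd, &image->kernel_buf,
> >                                    &size, INT_MAX, READING_KEXEC_IMAGE);
> > 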
> >     Please note that powerpc is one of the two architectures now
> >     supporting KEXEC_FILE, and that it wishes to extend IMA
> >     so that a signature may be appended to the "vmlinux" file[5], like module
> >     signing, instead of being stored in an extended file attribute.
> > 
> > While IMA is meant to be used with a TPM (Trusted Platform Module) on a
> > secure platform, it is still usable without one. Here is an example
> > procedure for trying out the feature using a self-signed
> > root CA for demo/test purposes:
> > 
> >  1) Generate needed keys and certificates, following "Generate trusted
> >     keys" section in README of ima-evm-utils[6].
> > 
> >  2) Build the kernel with the following kernel configurations, specifying
> >     "ima-local-ca.pem" for CONFIG_SYSTEM_TRUSTED_KEYS:
> > 	CONFIG_EXT4_FS_SECURITY
> > 	CONFIG_INTEGRITY_SIGNATURE
> > 	CONFIG_INTEGRITY_ASYMMETRIC_KEYS
> > 	CONFIG_INTEGRITY_TRUSTED_KEYRING
> > 	CONFIG_IMA
> > 	CONFIG_IMA_WRITE_POLICY
> > 	CONFIG_IMA_READ_POLICY
> > 	CONFIG_IMA_APPRAISE
> > 	CONFIG_IMA_APPRAISE_BOOTPARAM
> > 	CONFIG_SYSTEM_TRUSTED_KEYS
> >     Please note that CONFIG_KEXEC_VERIFY_SIG is not, actually should
> >     not be, enabled.
> > 
> >  3) Sign(label) a kernel image binary to be kexec-ed on target filesystem:
> >     $ evmctl ima_sign --key /path/to/private_key.pem /your/Image
> > 
> >  4) Add a command line parameter and boot the kernel:
> >     ima_appraise=enforce
> > 
> >  On live system,
> >  5) Set a security policy:
> >     $ mount -t securityfs none /sys/kernel/security
> >     $ echo "appraise func=KEXEC_KERNEL_CHECK appraise_type=imasig" \
> >       > /sys/kernel/security/ima/policy
> > 
> >  6) Add a key for ima:
> >     $ keyctl padd asymmetric my_ima_key %:.ima < /path/to/x509_ima.der
> >     (or evmctl import /path/to/x509_ima.der <ima_keyring_id>)
> > 
> >  7) Then try kexec as normal.
> > 
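> > For example, the setup can be sanity-checked on the target system before
> > the actual kexec (paths are only illustrative):
> > 
> >     $ getfattr -m - -d /your/Image          # should list a security.ima xattr
> >     $ cat /sys/kernel/security/ima/policy   # readable thanks to CONFIG_IMA_READ_POLICY
> >     $ kexec -s -l /your/Image --reuse-cmdline && kexec -e
> > 
> > If the signature does not verify, the kexec_file_load() call (and hence
> > "kexec -s -l") is expected to fail.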
> > 
> > Concerns (or future work):
> > * Even if the kernel is configured with CONFIG_RANDOMIZE_BASE, the 2nd
> >   kernel won't be placed at a randomized address. We will have to
> >   add some boot code similar to efi-stub to implement the randomization.
> > for approach (1),
> > * While a big-endian kernel could support kernel signing, I'm not sure that
> >   a big-endian Image can be recognized as being in PE format, because the
> >   PE standard only defines a little-endian format.
> > * vmlinux support
> > 
> >   [1] http://git.linaro.org/people/takahiro.akashi/linux-aarch64.git
> > 	branch:arm64/kexec_file
> >   [2] http://git.linaro.org/people/takahiro.akashi/kexec-tools.git
> > 	branch:arm64/kexec_file
> >   [3] http://github.com/crash-utility/crash.git
> >   [4] https://sourceforge.net/p/linux-ima/wiki/Home/
> >   [5] http://lkml.iu.edu//hypermail/linux/kernel/1707.0/03669.html
> >   [6] https://sourceforge.net/p/linux-ima/ima-evm-utils/ci/master/tree/
> > 
> > 
> > Changes in v8 (Feb 22, 2018)
> > * introduce ARCH_HAS_KEXEC_PURGATORY so that arm64 will be able to skip
> >   purgatory
> > * remove "ifdef CONFIG_X86_64" stuffs from a re-factored function,
> >   prepare_elf64_headers(), making its interface more generic
> >   (The original patch was split into two for easier reviews.)
> > * modify cpu_soft_restart() so as to let the 2nd kernel jump into its entry
> >   code directly without requiring purgatory in case of kexec_file_load
> > * remove CONFIG_KEXEC_FILE_IMAGE_FMT and introduce
> >   CONFIG_KEXEC_IMAGE_VERIFY_SIG, much similar to x86 but quite redundant
> >   for now.
> > * In addition, update/modify dependencies of KEXEC_IMAGE_VERIFY_SIG
> > 
> > Changes in v7 (Dec 4, 2017)
> > * rebased to v4.15-rc2
> > * re-organize the patch set to separate KEXEC_FILE_VERIFY_SIG-related
> >   code from the others
> > * revamp factored-out code in kernel/kexec_file.c due to the changes
> >   in original x86 code
> > * redefine walk_sys_ram_res_rev() prototype due to change of callback
> >   type in the counterpart, walk_sys_ram_res()
> > * make KEXEC_FILE_IMAGE_FMT default to on if KEXEC_FILE is selected
> > 
> > Changes in v6 (Oct 24, 2017)
> > * fix a for-loop bug in _kexec_kernel_image_probe() per Julien
> > 
> > Changes in v5 (Oct 10, 2017)
> > * fix kbuild errors around patch #3
> > per Julien's comments,
> > * fix a bug in walk_system_ram_res_rev() with some cleanup
> > * modify fdt_setprop_range() to use vmalloc()
> > * modify fill_property() to use memset()
> > 
> > Changes in v4 (Oct 2, 2017)
> > * reinstate x86's arch_kexec_kernel_image_load()
> > * rename weak arch_kexec_kernel_xxx() to _kexec_kernel_xxx() for
> >   better re-use
> > * constify kexec_file_loaders[]
> > 
> > Changes in v3 (Sep 15, 2017)
> > * fix kbuild test error
> > * factor out arch_kexec_kernel_*() & arch_kimage_file_post_load_cleanup()
> > * remove CONFIG_CRASH_CORE guard from kexec_file.c
> > * add vmapped kernel region to vmcore for gdb backtracing
> >   (see prepare_elf64_headers())
> > * merge asm/kexec_file.h into asm/kexec.h
> > * and some cleanups
> > 
> > Changes in v2 (Sep 8, 2017)
> > * move core-header-related functions from crash_core.c to kexec_file.c
> > * drop hash-check code from purgatory
> > * modify purgatory asm to remove arch_kexec_apply_relocations_add()
> > * drop older kernel support
> > * drop vmlinux support (at least, for this series)
> > 
> > 
> > Patches #1 to #11 are the essential part of KEXEC_FILE support
> > (additionally allowing for IMA-based verification):
> >   Patches #1 to #6 are preparatory patches on the generic side.
> >   Patches #7 to #11 enable kexec_file_load on arm64.
> > 
> > Patches #12 and #13 add KEXEC_VERIFY_SIG (arch-specific verification)
> > support.
> > 
> > AKASHI Takahiro (13):
> >   resource: add walk_system_ram_res_rev()
> >   kexec_file: make an use of purgatory optional
> >   kexec_file,x86,powerpc: factor out kexec_file_ops functions
> >   x86: kexec_file: factor out elf core header related functions
> >   kexec_file, x86: move re-factored code to generic side
> >   asm-generic: add kexec_file_load system call to unistd.h
> >   arm64: kexec_file: invoke the kernel without purgatory
> >   arm64: kexec_file: load initrd and device-tree
> >   arm64: kexec_file: add crash dump support
> >   arm64: kexec_file: add Image format support
> >   arm64: kexec_file: enable KEXEC_FILE config
> >   include: pe.h: remove message[] from mz header definition
> >   arm64: kexec_file: enable KEXEC_VERIFY_SIG for Image
> > 
> >  arch/arm64/Kconfig                          |  34 +++
> >  arch/arm64/include/asm/kexec.h              |  90 +++++++
> >  arch/arm64/kernel/Makefile                  |   3 +-
> >  arch/arm64/kernel/cpu-reset.S               |   6 +-
> >  arch/arm64/kernel/kexec_image.c             | 105 ++++++++
> >  arch/arm64/kernel/machine_kexec.c           |  11 +-
> >  arch/arm64/kernel/machine_kexec_file.c      | 401 ++++++++++++++++++++++++++++
> >  arch/arm64/kernel/relocate_kernel.S         |   3 +-
> >  arch/powerpc/Kconfig                        |   3 +
> >  arch/powerpc/include/asm/kexec.h            |   2 +-
> >  arch/powerpc/kernel/kexec_elf_64.c          |   2 +-
> >  arch/powerpc/kernel/machine_kexec_file_64.c |  39 +--
> >  arch/x86/Kconfig                            |   3 +
> >  arch/x86/include/asm/kexec-bzimage64.h      |   2 +-
> >  arch/x86/kernel/crash.c                     | 332 +++++------------------
> >  arch/x86/kernel/kexec-bzimage64.c           |   2 +-
> >  arch/x86/kernel/machine_kexec_64.c          |  45 +---
> >  include/linux/ioport.h                      |   3 +
> >  include/linux/kexec.h                       |  34 ++-
> >  include/linux/pe.h                          |   2 +-
> >  include/uapi/asm-generic/unistd.h           |   4 +-
> >  kernel/kexec_file.c                         | 238 ++++++++++++++++-
> >  kernel/resource.c                           |  57 ++++
> >  23 files changed, 1046 insertions(+), 375 deletions(-)
> >  create mode 100644 arch/arm64/kernel/kexec_image.c
> >  create mode 100644 arch/arm64/kernel/machine_kexec_file.c
> > 
> > -- 
> > 2.16.2
> > 

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v8 00/13] arm64: kexec: add kexec_file_load() support
@ 2018-02-28 12:25     ` Dave Young
  0 siblings, 0 replies; 102+ messages in thread
From: Dave Young @ 2018-02-28 12:25 UTC (permalink / raw)
  To: AKASHI Takahiro, catalin.marinas, will.deacon, bauerman,
	dhowells, vgoyal, herbert, davem, akpm, mpe, bhe, arnd,
	ard.biesheuvel, julien.thierry, kexec, linux-arm-kernel,
	linux-kernel

Hi AKASHI,
On 02/27/18 at 01:56pm, AKASHI Takahiro wrote:
> Now my patch#2 to #5 were extracted from this patch set and put
> into another separate one. Please see
> http://lists.infradead.org/pipermail/linux-arm-kernel/2018-February/562195.htmlk

Thanks!  Will read them

> 
> Thanks,
> -Takahiro AKASHI
> 
> On Thu, Feb 22, 2018 at 08:17:19PM +0900, AKASHI Takahiro wrote:
> > This is the eighth round of implementing kexec_file_load() support
> > on arm64.[1]
> > Most of the code is based on kexec-tools (along with some kernel code
> > from x86, which also came from kexec-tools).
> > 
> > 
> > This patch series enables us to
> >   * load the kernel by specifying its file descriptor, instead of a user-
> >     filled buffer, via the kexec_file_load() system call, and
> >   * optionally verify its signature at load time for trusted boot.
> > 
> > Contrary to the kexec_load() system call, as we discussed a long time ago,
> > users are not allowed to provide a device tree to the 2nd kernel
> > explicitly; instead, the dt blob of the first kernel is re-used
> > internally.
> > 
> > To use the kexec_file_load() system call instead of kexec_load(), the '-s'
> > option must be specified on the kexec command line. See [2] for the
> > necessary kexec-tools patch.
> > 
> > To analyze a generated crash dump file, use the latest master branch of
> > crash utility[3] for v4.16-rc kernel. I always try to submit patches to
> > fix any inconsistencies introduced in the latest kernel.
> > 
> > Regarding kernel image verification, a signature must be presented
> > along with the binary itself. A signature is basically a hash value
> > calculated against the whole binary data and encrypted by a key which
> > will be authenticated by one of the system's trusted certificates.
> > Any attempt to read and load a to-be-kexec-ed kernel image through
> > a system call will be checked and blocked if the binary's hash value
> > doesn't match its associated signature.
> > 
> > There are two methods available now:
> > 1. implementing arch-specific verification hook of kexec_file_load()
> > 2. utilizing IMA(Integrity Measurement Architecture)[4] appraisal framework
> > 
> > Before my v7, I believed that my patch only supports (1) but am now
> > confident that (2) comes free if IMA is enabled and properly configured.
> > 
> > 
> > (1) Arch-specific verification hook
> > If CONFIG_KEXEC_VERIFY_SIG is enabled, kexec_file_load() invokes an arch-
> > defined (and hence file-format-specific) hook function to check the
> > validity of the kernel binary.
> > 
> > On x86, a signature is embedded into the PE file (Microsoft's format) header
> > of the binary. Since arm64's "Image" can also be seen as a PE file as long as
> > CONFIG_EFI is enabled, we adopt this format for kernel signing.
> > 
> > As in the case of UEFI applications, we can create a signed kernel image:
> >     $ sbsign --key ${KEY} --cert ${CERT} Image
> > 
> > You may want to use certs/signing_key.pem, which is intended to be used
> > for module signing (CONFIG_MODULE_SIG), as ${KEY} and ${CERT} for test
> > purposes.
> > 
> > 
> > (2) IMA appraisal-based
> > IMA was first introduced in Linux to meet the TCG (Trusted Computing
> > Group) requirement that all sensitive files be *measured* before being
> > read/executed, so that any untrusted changes/modifications can be detected.
> > The appraisal feature, which allows us to ensure the integrity of
> > files and even prevent them from being read/executed, was added later.
> > 
> > Meanwhile, kexec_file_load() has been merged since v3.17 and evolved to
> > enable IMA-appraisal type verification by the commit b804defe4297 ("kexec:
> > replace call to copy_file_from_fd() with kernel version").
> > 
> > In this scheme, a signature will be stored in an extended file attribute,
> > "security.ima", while the verification key is held in a dedicated keyring,
> > ".ima" or "_ima".  All the necessary verification steps are confined
> > within a secure API, kernel_read_file_from_fd(), called by kexec_file_load().
> > 
> >     Please note that powerpc is one of the two architectures now
> >     supporting KEXEC_FILE, and that it wishes to extend IMA
> >     so that a signature may be appended to the "vmlinux" file[5], like module
> >     signing, instead of being stored in an extended file attribute.
> > 
> > While IMA is meant to be used with a TPM (Trusted Platform Module) on a
> > secure platform, it is still usable without one. Here is an example
> > procedure for trying out the feature using a self-signed
> > root CA for demo/test purposes:
> > 
> >  1) Generate needed keys and certificates, following "Generate trusted
> >     keys" section in README of ima-evm-utils[6].
> > 
> >  2) Build the kernel with the following kernel configurations, specifying
> >     "ima-local-ca.pem" for CONFIG_SYSTEM_TRUSTED_KEYS:
> > 	CONFIG_EXT4_FS_SECURITY
> > 	CONFIG_INTEGRITY_SIGNATURE
> > 	CONFIG_INTEGRITY_ASYMMETRIC_KEYS
> > 	CONFIG_INTEGRITY_TRUSTED_KEYRING
> > 	CONFIG_IMA
> > 	CONFIG_IMA_WRITE_POLICY
> > 	CONFIG_IMA_READ_POLICY
> > 	CONFIG_IMA_APPRAISE
> > 	CONFIG_IMA_APPRAISE_BOOTPARAM
> > 	CONFIG_SYSTEM_TRUSTED_KEYS
> >     Please note that CONFIG_KEXEC_VERIFY_SIG is not, actually should
> >     not be, enabled.
> > 
> >  3) Sign(label) a kernel image binary to be kexec-ed on target filesystem:
> >     $ evmctl ima_sign --key /path/to/private_key.pem /your/Image
> > 
> >  4) Add a command line parameter and boot the kernel:
> >     ima_appraise=enforce
> > 
> >  On live system,
> >  5) Set a security policy:
> >     $ mount -t securityfs none /sys/kernel/security
> >     $ echo "appraise func=KEXEC_KERNEL_CHECK appraise_type=imasig" \
> >       > /sys/kernel/security/ima/policy
> > 
> >  6) Add a key for ima:
> >     $ keyctl padd asymmetric my_ima_key %:.ima < /path/to/x509_ima.der
> >     (or evmctl import /path/to/x509_ima.der <ima_keyring_id>)
> > 
> >  7) Then try kexec as normal.
> > 
> > 
> > Concerns (or future work):
> > * Even if the kernel is configured with CONFIG_RANDOMIZE_BASE, the 2nd
> >   kernel won't be placed at a randomized address. We will have to
> >   add some boot code similar to efi-stub to implement the randomization.
> > for approach (1),
> > * While a big-endian kernel could support kernel signing, I'm not sure that
> >   a big-endian Image can be recognized as being in PE format, because the
> >   PE standard only defines a little-endian format.
> > * vmlinux support
> > 
> >   [1] http://git.linaro.org/people/takahiro.akashi/linux-aarch64.git
> > 	branch:arm64/kexec_file
> >   [2] http://git.linaro.org/people/takahiro.akashi/kexec-tools.git
> > 	branch:arm64/kexec_file
> >   [3] http://github.com/crash-utility/crash.git
> >   [4] https://sourceforge.net/p/linux-ima/wiki/Home/
> >   [5] http://lkml.iu.edu//hypermail/linux/kernel/1707.0/03669.html
> >   [6] https://sourceforge.net/p/linux-ima/ima-evm-utils/ci/master/tree/
> > 
> > 
> > Changes in v8 (Feb 22, 2018)
> > * introduce ARCH_HAS_KEXEC_PURGATORY so that arm64 will be able to skip
> >   purgatory
> > * remove "ifdef CONFIG_X86_64" stuffs from a re-factored function,
> >   prepare_elf64_headers(), making its interface more generic
> >   (The original patch was split into two for easier reviews.)
> > * modify cpu_soft_restart() so as to let the 2nd kernel jump into its entry
> >   code directly without requiring purgatory in case of kexec_file_load
> > * remove CONFIG_KEXEC_FILE_IMAGE_FMT and introduce
> >   CONFIG_KEXEC_IMAGE_VERIFY_SIG, much similar to x86 but quite redundant
> >   for now.
> > * In addition, update/modify dependencies of KEXEC_IMAGE_VERIFY_SIG
> > 
> > Changes in v7 (Dec 4, 2017)
> > * rebased to v4.15-rc2
> > * re-organize the patch set to separate KEXEC_FILE_VERIFY_SIG-related
> >   code from the others
> > * revamp factored-out code in kernel/kexec_file.c due to the changes
> >   in original x86 code
> > * redefine walk_sys_ram_res_rev() prototype due to change of callback
> >   type in the counterpart, walk_sys_ram_res()
> > * make KEXEC_FILE_IMAGE_FMT default to on if KEXEC_FILE is selected
> > 
> > Changes in v6 (Oct 24, 2017)
> > * fix a for-loop bug in _kexec_kernel_image_probe() per Julien
> > 
> > Changes in v5 (Oct 10, 2017)
> > * fix kbuild errors around patch #3
> > per Julien's comments,
> > * fix a bug in walk_system_ram_res_rev() with some cleanup
> > * modify fdt_setprop_range() to use vmalloc()
> > * modify fill_property() to use memset()
> > 
> > Changes in v4 (Oct 2, 2017)
> > * reinstate x86's arch_kexec_kernel_image_load()
> > * rename weak arch_kexec_kernel_xxx() to _kexec_kernel_xxx() for
> >   better re-use
> > * constify kexec_file_loaders[]
> > 
> > Changes in v3 (Sep 15, 2017)
> > * fix kbuild test error
> > * factor out arch_kexec_kernel_*() & arch_kimage_file_post_load_cleanup()
> > * remove CONFIG_CRASH_CORE guard from kexec_file.c
> > * add vmapped kernel region to vmcore for gdb backtracing
> >   (see prepare_elf64_headers())
> > * merge asm/kexec_file.h into asm/kexec.h
> > * and some cleanups
> > 
> > Changes in v2 (Sep 8, 2017)
> > * move core-header-related functions from crash_core.c to kexec_file.c
> > * drop hash-check code from purgatory
> > * modify purgatory asm to remove arch_kexec_apply_relocations_add()
> > * drop older kernel support
> > * drop vmlinux support (at least, for this series)
> > 
> > 
> > Patches #1 to #11 are the essential part of KEXEC_FILE support
> > (additionally allowing for IMA-based verification):
> >   Patches #1 to #6 are preparatory patches on the generic side.
> >   Patches #7 to #11 enable kexec_file_load on arm64.
> > 
> > Patches #12 and #13 add KEXEC_VERIFY_SIG (arch-specific verification)
> > support.
> > 
> > AKASHI Takahiro (13):
> >   resource: add walk_system_ram_res_rev()
> >   kexec_file: make an use of purgatory optional
> >   kexec_file,x86,powerpc: factor out kexec_file_ops functions
> >   x86: kexec_file: factor out elf core header related functions
> >   kexec_file, x86: move re-factored code to generic side
> >   asm-generic: add kexec_file_load system call to unistd.h
> >   arm64: kexec_file: invoke the kernel without purgatory
> >   arm64: kexec_file: load initrd and device-tree
> >   arm64: kexec_file: add crash dump support
> >   arm64: kexec_file: add Image format support
> >   arm64: kexec_file: enable KEXEC_FILE config
> >   include: pe.h: remove message[] from mz header definition
> >   arm64: kexec_file: enable KEXEC_VERIFY_SIG for Image
> > 
> >  arch/arm64/Kconfig                          |  34 +++
> >  arch/arm64/include/asm/kexec.h              |  90 +++++++
> >  arch/arm64/kernel/Makefile                  |   3 +-
> >  arch/arm64/kernel/cpu-reset.S               |   6 +-
> >  arch/arm64/kernel/kexec_image.c             | 105 ++++++++
> >  arch/arm64/kernel/machine_kexec.c           |  11 +-
> >  arch/arm64/kernel/machine_kexec_file.c      | 401 ++++++++++++++++++++++++++++
> >  arch/arm64/kernel/relocate_kernel.S         |   3 +-
> >  arch/powerpc/Kconfig                        |   3 +
> >  arch/powerpc/include/asm/kexec.h            |   2 +-
> >  arch/powerpc/kernel/kexec_elf_64.c          |   2 +-
> >  arch/powerpc/kernel/machine_kexec_file_64.c |  39 +--
> >  arch/x86/Kconfig                            |   3 +
> >  arch/x86/include/asm/kexec-bzimage64.h      |   2 +-
> >  arch/x86/kernel/crash.c                     | 332 +++++------------------
> >  arch/x86/kernel/kexec-bzimage64.c           |   2 +-
> >  arch/x86/kernel/machine_kexec_64.c          |  45 +---
> >  include/linux/ioport.h                      |   3 +
> >  include/linux/kexec.h                       |  34 ++-
> >  include/linux/pe.h                          |   2 +-
> >  include/uapi/asm-generic/unistd.h           |   4 +-
> >  kernel/kexec_file.c                         | 238 ++++++++++++++++-
> >  kernel/resource.c                           |  57 ++++
> >  23 files changed, 1046 insertions(+), 375 deletions(-)
> >  create mode 100644 arch/arm64/kernel/kexec_image.c
> >  create mode 100644 arch/arm64/kernel/machine_kexec_file.c
> > 
> > -- 
> > 2.16.2
> > 

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v8 02/13] kexec_file: make an use of purgatory optional
  2018-02-26 10:24       ` AKASHI Takahiro
  (?)
@ 2018-02-28 12:33         ` Dave Young
  -1 siblings, 0 replies; 102+ messages in thread
From: Dave Young @ 2018-02-28 12:33 UTC (permalink / raw)
  To: AKASHI Takahiro, catalin.marinas, will.deacon, bauerman,
	dhowells, vgoyal, herbert, davem, akpm, mpe, bhe, arnd,
	ard.biesheuvel, julien.thierry, kexec, linux-arm-kernel,
	linux-kernel

On 02/26/18 at 07:24pm, AKASHI Takahiro wrote:
> On Fri, Feb 23, 2018 at 04:49:34PM +0800, Dave Young wrote:
> > Hi AKASHI,
> > 
> > On 02/22/18 at 08:17pm, AKASHI Takahiro wrote:
> > > On arm64, no trampoline code between the old kernel and the new kernel will be
> > > required in kexec_file implementation. This patch introduces a new
> > > configuration, ARCH_HAS_KEXEC_PURGATORY, and allows related code to be
> > > compiled in only if necessary.
> > 
> > The explanation of why no purgatory is needed should also go here; purgatory
> > would normally be required for kexec unless there is a strong reason otherwise.
> 
> OK, I will add the reason:
> On arm64, crash dump kernel's usable memory is protected by
> *unmapping* it from kernel virtual space unlike other architectures
> where the region is just made read-only.
> So our key developers think that it is highly unlikely that the region
> is accidentally corrupted and this rationalizes that digest check code
> be also dropped from purgatory.
> This greatly simplifies our purgatory without any need for a bit ugly
> relocation stuff, i.e. arch_kexec_apply_relocations_add().
> 
> Please see:
>    http://lists.infradead.org/pipermail/linux-arm-kernel/2017-December/545428.html
> to find out how simple our purgatory was. All that it does is
> to shuffle arguments and jump into a new kernel.
> 
> Without this patch, we would have to have purgatory with a space for
> a hash value (purgatory_sha256_digest) which is never checked against.
> 
> Do you think it makes sense?

Hmm, it looks reasonable. I remember there could be some performance
issues with purgatory because caches are disabled on arm64. I do not
object to this.

[snip]

Thanks
Dave

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH v8 02/13] kexec_file: make an use of purgatory optional
@ 2018-02-28 12:33         ` Dave Young
  0 siblings, 0 replies; 102+ messages in thread
From: Dave Young @ 2018-02-28 12:33 UTC (permalink / raw)
  To: linux-arm-kernel

On 02/26/18 at 07:24pm, AKASHI Takahiro wrote:
> On Fri, Feb 23, 2018 at 04:49:34PM +0800, Dave Young wrote:
> > Hi AKASHI,
> > 
> > On 02/22/18 at 08:17pm, AKASHI Takahiro wrote:
> > > On arm64, no trampoline code between the old kernel and the new kernel will be
> > > required in kexec_file implementation. This patch introduces a new
> > > configuration, ARCH_HAS_KEXEC_PURGATORY, and allows related code to be
> > > compiled in only if necessary.
> > 
> > The explanation of why no purgatory is needed should also go here; purgatory
> > would normally be required for kexec unless there is a strong reason otherwise.
> 
> OK, I will add the reason:
> On arm64, crash dump kernel's usable memory is protected by
> *unmapping* it from kernel virtual space unlike other architectures
> where the region is just made read-only.
> So our key developers think that it is highly unlikely that the region
> is accidentally corrupted and this rationalizes that digest check code
> be also dropped from purgatory.
> This greatly simplifies our purgatory without any need for a bit ugly
> relocation stuff, i.e. arch_kexec_apply_relocations_add().
> 
> Please see:
>    http://lists.infradead.org/pipermail/linux-arm-kernel/2017-December/545428.html
> to find out how simple our purgatory was. All that it does is
> to shuffle arguments and jump into a new kernel.
> 
> Without this patch, we would have to have purgatory with a space for
> a hash value (purgatory_sha256_digest) which is never checked against.
> 
> Do you think it makes sense?

Hmm, it looks reasonable. I remember there could be some performance
issues with purgatory because caches are disabled on arm64. I do not
object to this.

[snip]

Thanks
Dave

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v8 02/13] kexec_file: make an use of purgatory optional
@ 2018-02-28 12:33         ` Dave Young
  0 siblings, 0 replies; 102+ messages in thread
From: Dave Young @ 2018-02-28 12:33 UTC (permalink / raw)
  To: AKASHI Takahiro, catalin.marinas, will.deacon, bauerman,
	dhowells, vgoyal, herbert, davem, akpm, mpe, bhe, arnd,
	ard.biesheuvel, julien.thierry, kexec, linux-arm-kernel,
	linux-kernel

On 02/26/18 at 07:24pm, AKASHI Takahiro wrote:
> On Fri, Feb 23, 2018 at 04:49:34PM +0800, Dave Young wrote:
> > Hi AKASHI,
> > 
> > On 02/22/18 at 08:17pm, AKASHI Takahiro wrote:
> > > On arm64, no trampoline code between the old kernel and the new kernel will be
> > > required in kexec_file implementation. This patch introduces a new
> > > configuration, ARCH_HAS_KEXEC_PURGATORY, and allows related code to be
> > > compiled in only if necessary.
> > 
> > The explanation of why no purgatory is needed should also go here; purgatory
> > would normally be required for kexec unless there is a strong reason otherwise.
> 
> OK, I will add the reason:
> On arm64, crash dump kernel's usable memory is protected by
> *unmapping* it from kernel virtual space unlike other architectures
> where the region is just made read-only.
> So our key developers think that it is highly unlikely that the region
> is accidentally corrupted and this rationalizes that digest check code
> be also dropped from purgatory.
> This greatly simplifies our purgatory without any need for a bit ugly
> relocation stuff, i.e. arch_kexec_apply_relocations_add().
> 
> Please see:
>    http://lists.infradead.org/pipermail/linux-arm-kernel/2017-December/545428.html
> to find out how simple our purgatory was. All that it does is
> to shuffle arguments and jump into a new kernel.
> 
> Without this patch, we would have to have purgatory with a space for
> a hash value (purgatory_sha256_digest) which is never checked against.
> 
> Do you think it makes sense?

Hmm, it looks reasonable. I remember there could be some performance
issues with purgatory because caches are disabled on arm64. I do not
object to this.

[snip]

Thanks
Dave

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v8 03/13] kexec_file,x86,powerpc: factor out kexec_file_ops functions
  2018-02-26 10:01       ` AKASHI Takahiro
  (?)
@ 2018-02-28 12:38         ` Dave Young
  -1 siblings, 0 replies; 102+ messages in thread
From: Dave Young @ 2018-02-28 12:38 UTC (permalink / raw)
  To: AKASHI Takahiro, catalin.marinas, will.deacon, bauerman,
	dhowells, vgoyal, herbert, davem, akpm, mpe, bhe, arnd,
	ard.biesheuvel, julien.thierry, kexec, linux-arm-kernel,
	linux-kernel

On 02/26/18 at 07:01pm, AKASHI Takahiro wrote:
> On Fri, Feb 23, 2018 at 05:24:59PM +0800, Dave Young wrote:
> > Hi AKASHI,
> > 
> > On 02/22/18 at 08:17pm, AKASHI Takahiro wrote:
> > > As arch_kexec_kernel_*_{probe,load}(), arch_kimage_file_post_load_cleanup()
> > > and arch_kexec_kernel_verify_sig can be parameterized with a kexec_file_ops
> > > array and now duplicated among some architectures, let's factor them out.
> > > 
> > > Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> > > Cc: Dave Young <dyoung@redhat.com>
> > > Cc: Vivek Goyal <vgoyal@redhat.com>
> > > Cc: Baoquan He <bhe@redhat.com>
> > > Cc: Michael Ellerman <mpe@ellerman.id.au>
> > > Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
> > > ---
> > >  arch/powerpc/include/asm/kexec.h            |  2 +-
> > >  arch/powerpc/kernel/kexec_elf_64.c          |  2 +-
> > >  arch/powerpc/kernel/machine_kexec_file_64.c | 39 ++------------------
> > >  arch/x86/include/asm/kexec-bzimage64.h      |  2 +-
> > >  arch/x86/kernel/kexec-bzimage64.c           |  2 +-
> > >  arch/x86/kernel/machine_kexec_64.c          | 45 +----------------------
> > >  include/linux/kexec.h                       | 15 ++++----
> > >  kernel/kexec_file.c                         | 57 +++++++++++++++++++++++++++--
> > >  8 files changed, 70 insertions(+), 94 deletions(-)
> > > 
> > 
> > [snip]
> > 
> > > diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> > > index 990adae52151..a6d14a768b3e 100644
> > > --- a/kernel/kexec_file.c
> > > +++ b/kernel/kexec_file.c
> > > @@ -26,34 +26,83 @@
> > >  #include <linux/vmalloc.h>
> > >  #include "kexec_internal.h"
> > >  
> > > +const __weak struct kexec_file_ops * const kexec_file_loaders[] = {NULL};
> > > +
> > >  #ifdef CONFIG_ARCH_HAS_KEXEC_PURGATORY
> > >  static int kexec_calculate_store_digests(struct kimage *image);
> > >  #else
> > >  static int kexec_calculate_store_digests(struct kimage *image) { return 0; };
> > >  #endif
> > >  
> > > +int _kexec_kernel_image_probe(struct kimage *image, void *buf,
> > > +			     unsigned long buf_len)
> > > +{
> > > +	const struct kexec_file_ops * const *fops;
> > > +	int ret = -ENOEXEC;
> > > +
> > > +	for (fops = &kexec_file_loaders[0]; *fops && (*fops)->probe; ++fops) {
> > > +		ret = (*fops)->probe(buf, buf_len);
> > > +		if (!ret) {
> > > +			image->fops = *fops;
> > > +			return ret;
> > > +		}
> > > +	}
> > > +
> > > +	return ret;
> > > +}
> > > +
> > >  /* Architectures can provide this probe function */
> > >  int __weak arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
> > >  					 unsigned long buf_len)
> > >  {
> > > -	return -ENOEXEC;
> > > +	return _kexec_kernel_image_probe(image, buf, buf_len);
> > 
> > 
> > I vaguely remember that I previously suggested splitting out
> > _kexec_kernel_image_probe() because arch code can call it while the common
> > code also uses it, as above. But in your new series I cannot find anything
> > else that calls this function except the common arch_kexec_kernel_image_probe().
> > If nobody uses them, it is not worth splitting them out; it is better to
> > just embed them in the __weak functions.
> 
> Powerpc's arch_kexec_kernel_image_probe() uses
> _kexec_kernel_image_probe() as it needs an extra check to rule out
> crash dump for now.

Oops, I missed that, but what about other similar functions? Such as:
_kexec_kernel_image_load
_kimage_file_post_load_cleanup
_kexec_kernel_verify_sig

> 
> Thanks,
> -Takahiro AKASHI
> 
> 
> > Ditto for other similar functions.
> > 
> > [snip]
> > 
> > Thanks
> > Dave

Thanks
Dave

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH v8 03/13] kexec_file,x86,powerpc: factor out kexec_file_ops functions
@ 2018-02-28 12:38         ` Dave Young
  0 siblings, 0 replies; 102+ messages in thread
From: Dave Young @ 2018-02-28 12:38 UTC (permalink / raw)
  To: linux-arm-kernel

On 02/26/18 at 07:01pm, AKASHI Takahiro wrote:
> On Fri, Feb 23, 2018 at 05:24:59PM +0800, Dave Young wrote:
> > Hi AKASHI,
> > 
> > On 02/22/18 at 08:17pm, AKASHI Takahiro wrote:
> > > As arch_kexec_kernel_*_{probe,load}(), arch_kimage_file_post_load_cleanup()
> > > and arch_kexec_kernel_verify_sig can be parameterized with a kexec_file_ops
> > > array and now duplicated among some architectures, let's factor them out.
> > > 
> > > Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> > > Cc: Dave Young <dyoung@redhat.com>
> > > Cc: Vivek Goyal <vgoyal@redhat.com>
> > > Cc: Baoquan He <bhe@redhat.com>
> > > Cc: Michael Ellerman <mpe@ellerman.id.au>
> > > Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
> > > ---
> > >  arch/powerpc/include/asm/kexec.h            |  2 +-
> > >  arch/powerpc/kernel/kexec_elf_64.c          |  2 +-
> > >  arch/powerpc/kernel/machine_kexec_file_64.c | 39 ++------------------
> > >  arch/x86/include/asm/kexec-bzimage64.h      |  2 +-
> > >  arch/x86/kernel/kexec-bzimage64.c           |  2 +-
> > >  arch/x86/kernel/machine_kexec_64.c          | 45 +----------------------
> > >  include/linux/kexec.h                       | 15 ++++----
> > >  kernel/kexec_file.c                         | 57 +++++++++++++++++++++++++++--
> > >  8 files changed, 70 insertions(+), 94 deletions(-)
> > > 
> > 
> > [snip]
> > 
> > > diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> > > index 990adae52151..a6d14a768b3e 100644
> > > --- a/kernel/kexec_file.c
> > > +++ b/kernel/kexec_file.c
> > > @@ -26,34 +26,83 @@
> > >  #include <linux/vmalloc.h>
> > >  #include "kexec_internal.h"
> > >  
> > > +const __weak struct kexec_file_ops * const kexec_file_loaders[] = {NULL};
> > > +
> > >  #ifdef CONFIG_ARCH_HAS_KEXEC_PURGATORY
> > >  static int kexec_calculate_store_digests(struct kimage *image);
> > >  #else
> > >  static int kexec_calculate_store_digests(struct kimage *image) { return 0; };
> > >  #endif
> > >  
> > > +int _kexec_kernel_image_probe(struct kimage *image, void *buf,
> > > +			     unsigned long buf_len)
> > > +{
> > > +	const struct kexec_file_ops * const *fops;
> > > +	int ret = -ENOEXEC;
> > > +
> > > +	for (fops = &kexec_file_loaders[0]; *fops && (*fops)->probe; ++fops) {
> > > +		ret = (*fops)->probe(buf, buf_len);
> > > +		if (!ret) {
> > > +			image->fops = *fops;
> > > +			return ret;
> > > +		}
> > > +	}
> > > +
> > > +	return ret;
> > > +}
> > > +
> > >  /* Architectures can provide this probe function */
> > >  int __weak arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
> > >  					 unsigned long buf_len)
> > >  {
> > > -	return -ENOEXEC;
> > > +	return _kexec_kernel_image_probe(image, buf, buf_len);
> > 
> > 
> > I vaguely remember that I previously suggested splitting out
> > _kexec_kernel_image_probe() because arch code can call it while the common
> > code also uses it, as above. But in your new series I cannot find anything
> > else that calls this function except the common arch_kexec_kernel_image_probe().
> > If nobody uses them, it is not worth splitting them out; it is better to
> > just embed them in the __weak functions.
> 
> Powerpc's arch_kexec_kernel_image_probe() uses
> _kexec_kernel_image_probe() as it needs an extra check to rule out
> crash dump for now.

Oops, I missed that, but what about other similar functions? Such as:
_kexec_kernel_image_load
_kimage_file_post_load_cleanup
_kexec_kernel_verify_sig

> 
> Thanks,
> -Takahiro AKASHI
> 
> 
> > Ditto for other similar functions.
> > 
> > [snip]
> > 
> > Thanks
> > Dave

Thanks
Dave

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v8 03/13] kexec_file,x86,powerpc: factor out kexec_file_ops functions
@ 2018-02-28 12:38         ` Dave Young
  0 siblings, 0 replies; 102+ messages in thread
From: Dave Young @ 2018-02-28 12:38 UTC (permalink / raw)
  To: AKASHI Takahiro, catalin.marinas, will.deacon, bauerman,
	dhowells, vgoyal, herbert, davem, akpm, mpe, bhe, arnd,
	ard.biesheuvel, julien.thierry, kexec, linux-arm-kernel,
	linux-kernel

On 02/26/18 at 07:01pm, AKASHI Takahiro wrote:
> On Fri, Feb 23, 2018 at 05:24:59PM +0800, Dave Young wrote:
> > Hi AKASHI,
> > 
> > On 02/22/18 at 08:17pm, AKASHI Takahiro wrote:
> > > As arch_kexec_kernel_*_{probe,load}(), arch_kimage_file_post_load_cleanup()
> > > and arch_kexec_kernel_verify_sig can be parameterized with a kexec_file_ops
> > > array and now duplicated among some architectures, let's factor them out.
> > > 
> > > Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> > > Cc: Dave Young <dyoung@redhat.com>
> > > Cc: Vivek Goyal <vgoyal@redhat.com>
> > > Cc: Baoquan He <bhe@redhat.com>
> > > Cc: Michael Ellerman <mpe@ellerman.id.au>
> > > Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
> > > ---
> > >  arch/powerpc/include/asm/kexec.h            |  2 +-
> > >  arch/powerpc/kernel/kexec_elf_64.c          |  2 +-
> > >  arch/powerpc/kernel/machine_kexec_file_64.c | 39 ++------------------
> > >  arch/x86/include/asm/kexec-bzimage64.h      |  2 +-
> > >  arch/x86/kernel/kexec-bzimage64.c           |  2 +-
> > >  arch/x86/kernel/machine_kexec_64.c          | 45 +----------------------
> > >  include/linux/kexec.h                       | 15 ++++----
> > >  kernel/kexec_file.c                         | 57 +++++++++++++++++++++++++++--
> > >  8 files changed, 70 insertions(+), 94 deletions(-)
> > > 
> > 
> > [snip]
> > 
> > > diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> > > index 990adae52151..a6d14a768b3e 100644
> > > --- a/kernel/kexec_file.c
> > > +++ b/kernel/kexec_file.c
> > > @@ -26,34 +26,83 @@
> > >  #include <linux/vmalloc.h>
> > >  #include "kexec_internal.h"
> > >  
> > > +const __weak struct kexec_file_ops * const kexec_file_loaders[] = {NULL};
> > > +
> > >  #ifdef CONFIG_ARCH_HAS_KEXEC_PURGATORY
> > >  static int kexec_calculate_store_digests(struct kimage *image);
> > >  #else
> > >  static int kexec_calculate_store_digests(struct kimage *image) { return 0; };
> > >  #endif
> > >  
> > > +int _kexec_kernel_image_probe(struct kimage *image, void *buf,
> > > +			     unsigned long buf_len)
> > > +{
> > > +	const struct kexec_file_ops * const *fops;
> > > +	int ret = -ENOEXEC;
> > > +
> > > +	for (fops = &kexec_file_loaders[0]; *fops && (*fops)->probe; ++fops) {
> > > +		ret = (*fops)->probe(buf, buf_len);
> > > +		if (!ret) {
> > > +			image->fops = *fops;
> > > +			return ret;
> > > +		}
> > > +	}
> > > +
> > > +	return ret;
> > > +}
> > > +
> > >  /* Architectures can provide this probe function */
> > >  int __weak arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
> > >  					 unsigned long buf_len)
> > >  {
> > > -	return -ENOEXEC;
> > > +	return _kexec_kernel_image_probe(image, buf, buf_len);
> > 
> > 
> > I vaguely remember that I previously suggested splitting out
> > _kexec_kernel_image_probe() because arch code can call it while the common
> > code also uses it, as above. But in your new series I cannot find anything
> > else that calls this function except the common arch_kexec_kernel_image_probe().
> > If nobody uses them, it is not worth splitting them out; it is better to
> > just embed them in the __weak functions.
> 
> Powerpc's arch_kexec_kernel_image_probe() uses
> _kexec_kernel_image_probe() as it needs an extra check to rule out
> crash dump for now.

Oops, I missed that, but what about other similar functions? Such as:
_kexec_kernel_image_load
_kimage_file_post_load_cleanup
_kexec_kernel_verify_sig

> 
> Thanks,
> -Takahiro AKASHI
> 
> 
> > Ditto for other similar functions.
> > 
> > [snip]
> > 
> > Thanks
> > Dave

Thanks
Dave

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v8 02/13] kexec_file: make an use of purgatory optional
  2018-02-28 12:33         ` Dave Young
  (?)
@ 2018-03-01  2:59           ` AKASHI Takahiro
  -1 siblings, 0 replies; 102+ messages in thread
From: AKASHI Takahiro @ 2018-03-01  2:59 UTC (permalink / raw)
  To: Dave Young
  Cc: catalin.marinas, will.deacon, bauerman, dhowells, vgoyal,
	herbert, davem, akpm, mpe, bhe, arnd, ard.biesheuvel,
	julien.thierry, kexec, linux-arm-kernel, linux-kernel

On Wed, Feb 28, 2018 at 08:33:59PM +0800, Dave Young wrote:
> On 02/26/18 at 07:24pm, AKASHI Takahiro wrote:
> > On Fri, Feb 23, 2018 at 04:49:34PM +0800, Dave Young wrote:
> > > Hi AKASHI,
> > > 
> > > On 02/22/18 at 08:17pm, AKASHI Takahiro wrote:
> > > > On arm64, no trampoline code between the old kernel and the new kernel will be
> > > > required in kexec_file implementation. This patch introduces a new
> > > > configuration, ARCH_HAS_KEXEC_PURGATORY, and allows related code to be
> > > > compiled in only if necessary.
> > > 
> > > The explanation of why no purgatory is needed should also go here; purgatory
> > > would normally be required for kexec unless there is a strong reason otherwise.
> > 
> > OK, I will add the reason:
> > On arm64, crash dump kernel's usable memory is protected by
> > *unmapping* it from kernel virtual space unlike other architectures
> > where the region is just made read-only.
> > So our key developers think that it is highly unlikely that the region
> > is accidentally corrupted and this rationalizes that digest check code
> > be also dropped from purgatory.
> > This greatly simplifies our purgatory without any need for a bit ugly
> > relocation stuff, i.e. arch_kexec_apply_relocations_add().
> > 
> > Please see:
> >    http://lists.infradead.org/pipermail/linux-arm-kernel/2017-December/545428.html
> > to find out how simple our purgatory was. All that it does is
> > to shuffle arguments and jump into a new kernel.
> > 
> > Without this patch, we would have to have purgatory with a space for
> > a hash value (purgatory_sha256_digest) which is never checked against.
> > 
> > Do you think it makes sense?
> 
> Hmm, it looks reasonable. I remember there could be some performance
> issues with purgatory because caches are disabled on arm64. I do not
> object to this.

Yeah, Pratyush (Red Hat) had expressed his concern about slow boot-up of
the 2nd kernel caused by the hash calculation.

-Takahiro AKASHI

> 
> [snip]
> 
> Thanks
> Dave

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH v8 02/13] kexec_file: make an use of purgatory optional
@ 2018-03-01  2:59           ` AKASHI Takahiro
  0 siblings, 0 replies; 102+ messages in thread
From: AKASHI Takahiro @ 2018-03-01  2:59 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Feb 28, 2018 at 08:33:59PM +0800, Dave Young wrote:
> On 02/26/18 at 07:24pm, AKASHI Takahiro wrote:
> > On Fri, Feb 23, 2018 at 04:49:34PM +0800, Dave Young wrote:
> > > Hi AKASHI,
> > > 
> > > On 02/22/18 at 08:17pm, AKASHI Takahiro wrote:
> > > > On arm64, no trampoline code between the old kernel and the new kernel will be
> > > > required in kexec_file implementation. This patch introduces a new
> > > > configuration, ARCH_HAS_KEXEC_PURGATORY, and allows related code to be
> > > > compiled in only if necessary.
> > > 
> > > The explanation of why no purgatory is needed should also go here; purgatory
> > > would normally be required for kexec unless there is a strong reason otherwise.
> > 
> > OK, I will add the reason:
> > On arm64, crash dump kernel's usable memory is protected by
> > *unmapping* it from kernel virtual space unlike other architectures
> > where the region is just made read-only.
> > So our key developers think that it is highly unlikely that the region
> > is accidentally corrupted and this rationalizes that digest check code
> > be also dropped from purgatory.
> > This greatly simplifies our purgatory without any need for a bit ugly
> > relocation stuff, i.e. arch_kexec_apply_relocations_add().
> > 
> > Please see:
> >    http://lists.infradead.org/pipermail/linux-arm-kernel/2017-December/545428.html
> > to find out how simple our purgatory was. All that it does is
> > to shuffle arguments and jump into a new kernel.
> > 
> > Without this patch, we would have to have purgatory with a space for
> > a hash value (purgatory_sha256_digest) which is never checked against.
> > 
> > Do you think it makes sense?
> 
> Hmm, it looks reasonable. I remember there could be some performance
> issues with purgatory because caches are disabled on arm64. I do not
> object to this.

Yeah, Pratyush (Red Hat) had expressed his concern about slow boot-up of
the 2nd kernel caused by the hash calculation.

-Takahiro AKASHI

> 
> [snip]
> 
> Thanks
> Dave

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v8 02/13] kexec_file: make an use of purgatory optional
@ 2018-03-01  2:59           ` AKASHI Takahiro
  0 siblings, 0 replies; 102+ messages in thread
From: AKASHI Takahiro @ 2018-03-01  2:59 UTC (permalink / raw)
  To: Dave Young
  Cc: herbert, bhe, ard.biesheuvel, catalin.marinas, julien.thierry,
	will.deacon, linux-kernel, kexec, dhowells, arnd,
	linux-arm-kernel, mpe, bauerman, akpm, davem, vgoyal

On Wed, Feb 28, 2018 at 08:33:59PM +0800, Dave Young wrote:
> On 02/26/18 at 07:24pm, AKASHI Takahiro wrote:
> > On Fri, Feb 23, 2018 at 04:49:34PM +0800, Dave Young wrote:
> > > Hi AKASHI,
> > > 
> > > On 02/22/18 at 08:17pm, AKASHI Takahiro wrote:
> > > > On arm64, no trampoline code between the old kernel and the new kernel will be
> > > > required in kexec_file implementation. This patch introduces a new
> > > > configuration, ARCH_HAS_KEXEC_PURGATORY, and allows related code to be
> > > > compiled in only if necessary.
> > > 
> > > The explanation of why no purgatory is needed should also go here; purgatory
> > > would normally be required for kexec unless there is a strong reason otherwise.
> > 
> > OK, I will add the reason:
> > On arm64, crash dump kernel's usable memory is protected by
> > *unmapping* it from kernel virtual space unlike other architectures
> > where the region is just made read-only.
> > So our key developers think that it is highly unlikely that the region
> > is accidentally corrupted and this rationalizes that digest check code
> > be also dropped from purgatory.
> > This greatly simplifies our purgatory without any need for a bit ugly
> > relocation stuff, i.e. arch_kexec_apply_relocations_add().
> > 
> > Please see:
> >    http://lists.infradead.org/pipermail/linux-arm-kernel/2017-December/545428.html
> > to find out how simple our purgatory was. All that it does is
> > to shuffle arguments and jump into a new kernel.
> > 
> > Without this patch, we would have to have purgatory with a space for
> > a hash value (purgatory_sha256_digest) which is never checked against.
> > 
> > Do you think it makes sense?
> 
> Hmm, it looks reasonable. I remember there could be some performance
> issues with purgatory because caches are disabled on arm64. I do not
> object to this.

Yeah, Pratyush (Red Hat) had expressed his concern about slow boot-up of
the 2nd kernel caused by the hash calculation.

-Takahiro AKASHI

> 
> [snip]
> 
> Thanks
> Dave

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v8 03/13] kexec_file,x86,powerpc: factor out kexec_file_ops functions
  2018-02-28 12:38         ` Dave Young
  (?)
@ 2018-03-01  3:18           ` AKASHI Takahiro
  -1 siblings, 0 replies; 102+ messages in thread
From: AKASHI Takahiro @ 2018-03-01  3:18 UTC (permalink / raw)
  To: Dave Young
  Cc: catalin.marinas, will.deacon, bauerman, dhowells, vgoyal,
	herbert, davem, akpm, mpe, bhe, arnd, ard.biesheuvel,
	julien.thierry, kexec, linux-arm-kernel, linux-kernel

On Wed, Feb 28, 2018 at 08:38:00PM +0800, Dave Young wrote:
> On 02/26/18 at 07:01pm, AKASHI Takahiro wrote:
> > On Fri, Feb 23, 2018 at 05:24:59PM +0800, Dave Young wrote:
> > > Hi AKASHI,
> > > 
> > > On 02/22/18 at 08:17pm, AKASHI Takahiro wrote:
> > > > As arch_kexec_kernel_*_{probe,load}(), arch_kimage_file_post_load_cleanup()
> > > > and arch_kexec_kernel_verify_sig can be parameterized with a kexec_file_ops
> > > > array and now duplicated among some architectures, let's factor them out.
> > > > 
> > > > Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> > > > Cc: Dave Young <dyoung@redhat.com>
> > > > Cc: Vivek Goyal <vgoyal@redhat.com>
> > > > Cc: Baoquan He <bhe@redhat.com>
> > > > Cc: Michael Ellerman <mpe@ellerman.id.au>
> > > > Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
> > > > ---
> > > >  arch/powerpc/include/asm/kexec.h            |  2 +-
> > > >  arch/powerpc/kernel/kexec_elf_64.c          |  2 +-
> > > >  arch/powerpc/kernel/machine_kexec_file_64.c | 39 ++------------------
> > > >  arch/x86/include/asm/kexec-bzimage64.h      |  2 +-
> > > >  arch/x86/kernel/kexec-bzimage64.c           |  2 +-
> > > >  arch/x86/kernel/machine_kexec_64.c          | 45 +----------------------
> > > >  include/linux/kexec.h                       | 15 ++++----
> > > >  kernel/kexec_file.c                         | 57 +++++++++++++++++++++++++++--
> > > >  8 files changed, 70 insertions(+), 94 deletions(-)
> > > > 
> > > 
> > > [snip]
> > > 
> > > > diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> > > > index 990adae52151..a6d14a768b3e 100644
> > > > --- a/kernel/kexec_file.c
> > > > +++ b/kernel/kexec_file.c
> > > > @@ -26,34 +26,83 @@
> > > >  #include <linux/vmalloc.h>
> > > >  #include "kexec_internal.h"
> > > >  
> > > > +const __weak struct kexec_file_ops * const kexec_file_loaders[] = {NULL};
> > > > +
> > > >  #ifdef CONFIG_ARCH_HAS_KEXEC_PURGATORY
> > > >  static int kexec_calculate_store_digests(struct kimage *image);
> > > >  #else
> > > >  static int kexec_calculate_store_digests(struct kimage *image) { return 0; };
> > > >  #endif
> > > >  
> > > > +int _kexec_kernel_image_probe(struct kimage *image, void *buf,
> > > > +			     unsigned long buf_len)
> > > > +{
> > > > +	const struct kexec_file_ops * const *fops;
> > > > +	int ret = -ENOEXEC;
> > > > +
> > > > +	for (fops = &kexec_file_loaders[0]; *fops && (*fops)->probe; ++fops) {
> > > > +		ret = (*fops)->probe(buf, buf_len);
> > > > +		if (!ret) {
> > > > +			image->fops = *fops;
> > > > +			return ret;
> > > > +		}
> > > > +	}
> > > > +
> > > > +	return ret;
> > > > +}
> > > > +
> > > >  /* Architectures can provide this probe function */
> > > >  int __weak arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
> > > >  					 unsigned long buf_len)
> > > >  {
> > > > -	return -ENOEXEC;
> > > > +	return _kexec_kernel_image_probe(image, buf, buf_len);
> > > 
> > > 
> > > I vaguely remember that I previously suggested splitting out _kexec_kernel_image_probe
> > > because arch code can call it, and the common code also uses it as above.
> > > But in your new series I do not find anything else that calls this function
> > > except the common arch_kexec_kernel_image_probe.  If nobody uses
> > > them, then it is not worth splitting them out; it is better to just embed
> > > them in the __weak functions.
> > 
> > Powerpc's arch_kexec_kernel_image_probe() uses
> > _kexec_kernel_image_probe() as it needs an extra check to rule out
> > crash dump for now.
> 
> Oops, I missed that, but what about other similar functions? Such as:
> _kexec_kernel_image_load
> _kimage_file_post_load_cleanup
> _kexec_kernel_verify_sig

Currently there are no users.
Given that those functions are so simple, we could go either way.
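
For context, a rough sketch of the split being discussed: the common probe
loop walks a per-arch kexec_file_loaders[] array, and an arch hook can still
wrap it with its own policy. This is simplified and illustrative, not the
exact upstream code:

/* Per-arch list that the common probe loop walks; a real arch would put
 * its loader(s) here, e.g. x86 uses &kexec_bzImage64_ops (illustrative). */
const struct kexec_file_ops * const kexec_file_loaders[] = {
	&kexec_image_ops,		/* hypothetical arch loader */
	NULL
};

/* Arch wrapper: reuse the common probe but add arch policy on top,
 * e.g. reject crash-dump images the arch cannot load yet. */
int arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
				  unsigned long buf_len)
{
	if (image->type == KEXEC_TYPE_CRASH)
		return -EOPNOTSUPP;

	return _kexec_kernel_image_probe(image, buf, buf_len);
}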

-Takahiro AKASHI


> > 
> > Thanks,
> > -Takahiro AKASHI
> > 
> > 
> > > Ditto for other similar functions.
> > > 
> > > [snip]
> > > 
> > > Thanks
> > > Dave
> 
> Thanks
> Dave

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v8 01/13] resource: add walk_system_ram_res_rev()
  2018-02-23  8:36     ` Dave Young
@ 2018-03-20  1:43       ` Baoquan He
  -1 siblings, 0 replies; 102+ messages in thread
From: Baoquan He @ 2018-03-20  1:43 UTC (permalink / raw)
  To: AKASHI Takahiro, Dave Young
  Cc: herbert, ard.biesheuvel, catalin.marinas, julien.thierry,
	will.deacon, linux-kernel, kexec, dhowells, arnd,
	linux-arm-kernel, mpe, bauerman, akpm, Linus Torvalds, davem,
	vgoyal

On 02/23/18 at 04:36pm, Dave Young wrote:
> Hi AKASHI,
> 
> On 02/22/18 at 08:17pm, AKASHI Takahiro wrote:
> > This function, being a variant of walk_system_ram_res() introduced in
> > commit 8c86e70acead ("resource: provide new functions to walk through
> > resources"), walks through a list of all the resources of System RAM
> > in reversed order, i.e., from higher to lower.
> > 
> > It will be used in kexec_file implementation on arm64.
> 
> I remember there was an old discussion about this; it should be explained
> in the patch log why this is needed.

It's used to load the kernel/initrd at the top of system RAM, which is
consistent with the user-space kexec behaviour.

On x86_64, Vivek didn't do it like this since there was no reverse iomem
resource iterating function; he just chose a matching RAM region bottom up,
then placed the kernel/initrd top down within the found region. This is
different from the kexec-tools utility. I was considering turning the
resource sibling list into a doubly-linked list, but AKASHI's way seems
easier for people to accept, so I will use this one to change the x86_64
code.
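
For illustration, a minimal sketch of the kind of callback a kexec_file
loader could hand to walk_system_ram_res_rev() to get that top-of-RAM
placement (the helper name is hypothetical; the fields are those of
struct kexec_buf):

/* Because the walk is top-down, the first System RAM region that fits
 * wins, which places the kernel as high in memory as possible. */
static int locate_mem_top_down(struct resource *res, void *arg)
{
	struct kexec_buf *kbuf = arg;
	u64 start;

	if (resource_size(res) < kbuf->memsz)
		return 0;			/* too small, keep walking */

	start = round_down(res->end + 1 - kbuf->memsz, kbuf->buf_align);
	if (start < res->start)
		return 0;

	kbuf->mem = start;
	return 1;				/* non-zero stops the walk */
}

/* and then something like:
 *	walk_system_ram_res_rev(kbuf->buf_min, kbuf->buf_max,
 *				kbuf, locate_mem_top_down);
 */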

Hi AKASHI,

About the arm64 kexec_file patches, will you post them soon? Or do you
have any other plan?

Thanks
Baoquan

> 
> > 
> > Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> > Cc: Vivek Goyal <vgoyal@redhat.com>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: Linus Torvalds <torvalds@linux-foundation.org>
> > ---
> >  include/linux/ioport.h |  3 +++
> >  kernel/resource.c      | 57 ++++++++++++++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 60 insertions(+)
> > 
> > diff --git a/include/linux/ioport.h b/include/linux/ioport.h
> > index da0ebaec25f0..f12d95fe038b 100644
> > --- a/include/linux/ioport.h
> > +++ b/include/linux/ioport.h
> > @@ -277,6 +277,9 @@ extern int
> >  walk_system_ram_res(u64 start, u64 end, void *arg,
> >  		    int (*func)(struct resource *, void *));
> >  extern int
> > +walk_system_ram_res_rev(u64 start, u64 end, void *arg,
> > +			int (*func)(struct resource *, void *));
> > +extern int
> >  walk_iomem_res_desc(unsigned long desc, unsigned long flags, u64 start, u64 end,
> >  		    void *arg, int (*func)(struct resource *, void *));
> >  
> > diff --git a/kernel/resource.c b/kernel/resource.c
> > index e270b5048988..bdaa93407f4c 100644
> > --- a/kernel/resource.c
> > +++ b/kernel/resource.c
> > @@ -23,6 +23,8 @@
> >  #include <linux/pfn.h>
> >  #include <linux/mm.h>
> >  #include <linux/resource_ext.h>
> > +#include <linux/string.h>
> > +#include <linux/vmalloc.h>
> >  #include <asm/io.h>
> >  
> >  
> > @@ -486,6 +488,61 @@ int walk_mem_res(u64 start, u64 end, void *arg,
> >  				     arg, func);
> >  }
> >  
> > +int walk_system_ram_res_rev(u64 start, u64 end, void *arg,
> > +				int (*func)(struct resource *, void *))
> > +{
> > +	struct resource res, *rams;
> > +	int rams_size = 16, i;
> > +	int ret = -1;
> > +
> > +	/* create a list */
> > +	rams = vmalloc(sizeof(struct resource) * rams_size);
> > +	if (!rams)
> > +		return ret;
> > +
> > +	res.start = start;
> > +	res.end = end;
> > +	res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
> > +	i = 0;
> > +	while ((res.start < res.end) &&
> > +		(!find_next_iomem_res(&res, IORES_DESC_NONE, true))) {
> > +		if (i >= rams_size) {
> > +			/* re-alloc */
> > +			struct resource *rams_new;
> > +			int rams_new_size;
> > +
> > +			rams_new_size = rams_size + 16;
> > +			rams_new = vmalloc(sizeof(struct resource)
> > +							* rams_new_size);
> > +			if (!rams_new)
> > +				goto out;
> > +
> > +			memcpy(rams_new, rams,
> > +					sizeof(struct resource) * rams_size);
> > +			vfree(rams);
> > +			rams = rams_new;
> > +			rams_size = rams_new_size;
> > +		}
> > +
> > +		rams[i].start = res.start;
> > +		rams[i++].end = res.end;
> > +
> > +		res.start = res.end + 1;
> > +		res.end = end;
> > +	}
> > +
> > +	/* go reverse */
> > +	for (i--; i >= 0; i--) {
> > +		ret = (*func)(&rams[i], arg);
> > +		if (ret)
> > +			break;
> > +	}
> > +
> > +out:
> > +	vfree(rams);
> > +	return ret;
> > +}
> > +
> >  #if !defined(CONFIG_ARCH_HAS_WALK_MEMORY)
> >  
> >  /*
> > -- 
> > 2.16.2
> > 
> 
> Thanks
> Dave
> 
> _______________________________________________
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v8 01/13] resource: add walk_system_ram_res_rev()
  2018-03-20  1:43       ` Baoquan He
@ 2018-03-20  3:12         ` AKASHI Takahiro
  -1 siblings, 0 replies; 102+ messages in thread
From: AKASHI Takahiro @ 2018-03-20  3:12 UTC (permalink / raw)
  To: Baoquan He
  Cc: Dave Young, herbert, ard.biesheuvel, catalin.marinas,
	julien.thierry, will.deacon, linux-kernel, kexec, dhowells, arnd,
	linux-arm-kernel, mpe, bauerman, akpm, Linus Torvalds, davem,
	vgoyal

Baoquan,

On Tue, Mar 20, 2018 at 09:43:18AM +0800, Baoquan He wrote:
> On 02/23/18 at 04:36pm, Dave Young wrote:
> > Hi AKASHI,
> > 
> > On 02/22/18 at 08:17pm, AKASHI Takahiro wrote:
> > > This function, being a variant of walk_system_ram_res() introduced in
> > > commit 8c86e70acead ("resource: provide new functions to walk through
> > > resources"), walks through a list of all the resources of System RAM
> > > in reversed order, i.e., from higher to lower.
> > > 
> > > It will be used in kexec_file implementation on arm64.
> > 
> > I remember there was an old discussion about this; it should be explained
> > in the patch log why this is needed.
> 
> It's used to load the kernel/initrd at the top of system RAM, which is
> consistent with the user-space kexec behaviour.
> 
> On x86_64, Vivek didn't do it like this since there was no reverse iomem
> resource iterating function; he just chose a matching RAM region bottom up,
> then placed the kernel/initrd top down within the found region. This is
> different from the kexec-tools utility. I was considering turning the
> resource sibling list into a doubly-linked list, but AKASHI's way seems
> easier for people to accept, so I will use this one to change the x86_64
> code.
> 
> Hi AKASHI,
> 
> About the arm64 kexec_file patches, will you post them soon? Or do you
> have any other plan?

The short answer is yes, but my new version won't include this specific
patch, so please feel free to add it to your own patch set if you want.

The reason I'm going to drop it is that we will be modifying /proc/iomem
to fix a bug, and then we will need our own "walking" routine.

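For what it's worth, one way such an arch-private routine could look is a
reverse walk over memblock instead of the resource tree; this is only a
sketch under that assumption, not the code that will actually be posted:

/* Hand free RAM ranges to the callback from high to low addresses,
 * mirroring what walk_system_ram_res_rev() does with /proc/iomem. */
static int arch_kexec_walk_mem_rev(void *arg,
				   int (*func)(struct resource *, void *))
{
	phys_addr_t start, end;
	struct resource res;
	u64 i;
	int ret = 0;

	for_each_free_mem_range_reverse(i, NUMA_NO_NODE, MEMBLOCK_NONE,
					&start, &end, NULL) {
		res.start = start;
		res.end = end - 1;
		res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
		ret = func(&res, arg);
		if (ret)
			break;
	}

	return ret;
}
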
Thanks,
-Takahiro AKASHI

> Thanks
> Baoquan
> 
> > 
> > > 
> > > Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> > > Cc: Vivek Goyal <vgoyal@redhat.com>
> > > Cc: Andrew Morton <akpm@linux-foundation.org>
> > > Cc: Linus Torvalds <torvalds@linux-foundation.org>
> > > ---
> > >  include/linux/ioport.h |  3 +++
> > >  kernel/resource.c      | 57 ++++++++++++++++++++++++++++++++++++++++++++++++++
> > >  2 files changed, 60 insertions(+)
> > > 
> > > diff --git a/include/linux/ioport.h b/include/linux/ioport.h
> > > index da0ebaec25f0..f12d95fe038b 100644
> > > --- a/include/linux/ioport.h
> > > +++ b/include/linux/ioport.h
> > > @@ -277,6 +277,9 @@ extern int
> > >  walk_system_ram_res(u64 start, u64 end, void *arg,
> > >  		    int (*func)(struct resource *, void *));
> > >  extern int
> > > +walk_system_ram_res_rev(u64 start, u64 end, void *arg,
> > > +			int (*func)(struct resource *, void *));
> > > +extern int
> > >  walk_iomem_res_desc(unsigned long desc, unsigned long flags, u64 start, u64 end,
> > >  		    void *arg, int (*func)(struct resource *, void *));
> > >  
> > > diff --git a/kernel/resource.c b/kernel/resource.c
> > > index e270b5048988..bdaa93407f4c 100644
> > > --- a/kernel/resource.c
> > > +++ b/kernel/resource.c
> > > @@ -23,6 +23,8 @@
> > >  #include <linux/pfn.h>
> > >  #include <linux/mm.h>
> > >  #include <linux/resource_ext.h>
> > > +#include <linux/string.h>
> > > +#include <linux/vmalloc.h>
> > >  #include <asm/io.h>
> > >  
> > >  
> > > @@ -486,6 +488,61 @@ int walk_mem_res(u64 start, u64 end, void *arg,
> > >  				     arg, func);
> > >  }
> > >  
> > > +int walk_system_ram_res_rev(u64 start, u64 end, void *arg,
> > > +				int (*func)(struct resource *, void *))
> > > +{
> > > +	struct resource res, *rams;
> > > +	int rams_size = 16, i;
> > > +	int ret = -1;
> > > +
> > > +	/* create a list */
> > > +	rams = vmalloc(sizeof(struct resource) * rams_size);
> > > +	if (!rams)
> > > +		return ret;
> > > +
> > > +	res.start = start;
> > > +	res.end = end;
> > > +	res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
> > > +	i = 0;
> > > +	while ((res.start < res.end) &&
> > > +		(!find_next_iomem_res(&res, IORES_DESC_NONE, true))) {
> > > +		if (i >= rams_size) {
> > > +			/* re-alloc */
> > > +			struct resource *rams_new;
> > > +			int rams_new_size;
> > > +
> > > +			rams_new_size = rams_size + 16;
> > > +			rams_new = vmalloc(sizeof(struct resource)
> > > +							* rams_new_size);
> > > +			if (!rams_new)
> > > +				goto out;
> > > +
> > > +			memcpy(rams_new, rams,
> > > +					sizeof(struct resource) * rams_size);
> > > +			vfree(rams);
> > > +			rams = rams_new;
> > > +			rams_size = rams_new_size;
> > > +		}
> > > +
> > > +		rams[i].start = res.start;
> > > +		rams[i++].end = res.end;
> > > +
> > > +		res.start = res.end + 1;
> > > +		res.end = end;
> > > +	}
> > > +
> > > +	/* go reverse */
> > > +	for (i--; i >= 0; i--) {
> > > +		ret = (*func)(&rams[i], arg);
> > > +		if (ret)
> > > +			break;
> > > +	}
> > > +
> > > +out:
> > > +	vfree(rams);
> > > +	return ret;
> > > +}
> > > +
> > >  #if !defined(CONFIG_ARCH_HAS_WALK_MEMORY)
> > >  
> > >  /*
> > > -- 
> > > 2.16.2
> > > 
> > 
> > Thanks
> > Dave
> > 
> > _______________________________________________
> > kexec mailing list
> > kexec@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v8 01/13] resource: add walk_system_ram_res_rev()
  2018-03-20  3:12         ` AKASHI Takahiro
@ 2018-03-20  3:48           ` Baoquan He
  -1 siblings, 0 replies; 102+ messages in thread
From: Baoquan He @ 2018-03-20  3:48 UTC (permalink / raw)
  To: AKASHI Takahiro, Dave Young, herbert, ard.biesheuvel,
	catalin.marinas, julien.thierry, will.deacon, linux-kernel,
	kexec, dhowells, arnd, linux-arm-kernel, mpe, bauerman, akpm,
	Linus Torvalds, davem, vgoyal

On 03/20/18 at 12:12pm, AKASHI Takahiro wrote:
> Baoquan,
> 
> On Tue, Mar 20, 2018 at 09:43:18AM +0800, Baoquan He wrote:
> > On 02/23/18 at 04:36pm, Dave Young wrote:
> > > Hi AKASHI,
> > > 
> > > On 02/22/18 at 08:17pm, AKASHI Takahiro wrote:
> > > > This function, being a variant of walk_system_ram_res() introduced in
> > > > commit 8c86e70acead ("resource: provide new functions to walk through
> > > > resources"), walks through a list of all the resources of System RAM
> > > > in reversed order, i.e., from higher to lower.
> > > > 
> > > > It will be used in kexec_file implementation on arm64.
> > > 
> > > I remember there was an old discussion about this; it should be explained
> > > in the patch log why this is needed.
> > 
> > It's used to load the kernel/initrd at the top of system RAM, which is
> > consistent with the user-space kexec behaviour.
> > 
> > On x86_64, Vivek didn't do it like this since there was no reverse iomem
> > resource iterating function; he just chose a matching RAM region bottom up,
> > then placed the kernel/initrd top down within the found region. This is
> > different from the kexec-tools utility. I was considering turning the
> > resource sibling list into a doubly-linked list, but AKASHI's way seems
> > easier for people to accept, so I will use this one to change the x86_64
> > code.
> > 
> > Hi AKASHI,
> > 
> > About the arm64 kexec_file patches, will you post them soon? Or do you
> > have any other plan?
> 
> The short answer is yes, but my new version won't include this specific
> patch, so please feel free to add it to your own patch set if you want.
> 
> The reason I'm going to drop it is that we will be modifying /proc/iomem
> to fix a bug, and then we will need our own "walking" routine.

I see. I saw your post about the /proc/iomem issue and the discussions
around it.

Then I will pick this patch up and post a patchset.

Thanks
Baoquan

> 
> 
> > Thanks
> > Baoquan
> > 
> > > 
> > > > 
> > > > Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> > > > Cc: Vivek Goyal <vgoyal@redhat.com>
> > > > Cc: Andrew Morton <akpm@linux-foundation.org>
> > > > Cc: Linus Torvalds <torvalds@linux-foundation.org>
> > > > ---
> > > >  include/linux/ioport.h |  3 +++
> > > >  kernel/resource.c      | 57 ++++++++++++++++++++++++++++++++++++++++++++++++++
> > > >  2 files changed, 60 insertions(+)
> > > > 
> > > > diff --git a/include/linux/ioport.h b/include/linux/ioport.h
> > > > index da0ebaec25f0..f12d95fe038b 100644
> > > > --- a/include/linux/ioport.h
> > > > +++ b/include/linux/ioport.h
> > > > @@ -277,6 +277,9 @@ extern int
> > > >  walk_system_ram_res(u64 start, u64 end, void *arg,
> > > >  		    int (*func)(struct resource *, void *));
> > > >  extern int
> > > > +walk_system_ram_res_rev(u64 start, u64 end, void *arg,
> > > > +			int (*func)(struct resource *, void *));
> > > > +extern int
> > > >  walk_iomem_res_desc(unsigned long desc, unsigned long flags, u64 start, u64 end,
> > > >  		    void *arg, int (*func)(struct resource *, void *));
> > > >  
> > > > diff --git a/kernel/resource.c b/kernel/resource.c
> > > > index e270b5048988..bdaa93407f4c 100644
> > > > --- a/kernel/resource.c
> > > > +++ b/kernel/resource.c
> > > > @@ -23,6 +23,8 @@
> > > >  #include <linux/pfn.h>
> > > >  #include <linux/mm.h>
> > > >  #include <linux/resource_ext.h>
> > > > +#include <linux/string.h>
> > > > +#include <linux/vmalloc.h>
> > > >  #include <asm/io.h>
> > > >  
> > > >  
> > > > @@ -486,6 +488,61 @@ int walk_mem_res(u64 start, u64 end, void *arg,
> > > >  				     arg, func);
> > > >  }
> > > >  
> > > > +int walk_system_ram_res_rev(u64 start, u64 end, void *arg,
> > > > +				int (*func)(struct resource *, void *))
> > > > +{
> > > > +	struct resource res, *rams;
> > > > +	int rams_size = 16, i;
> > > > +	int ret = -1;
> > > > +
> > > > +	/* create a list */
> > > > +	rams = vmalloc(sizeof(struct resource) * rams_size);
> > > > +	if (!rams)
> > > > +		return ret;
> > > > +
> > > > +	res.start = start;
> > > > +	res.end = end;
> > > > +	res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
> > > > +	i = 0;
> > > > +	while ((res.start < res.end) &&
> > > > +		(!find_next_iomem_res(&res, IORES_DESC_NONE, true))) {
> > > > +		if (i >= rams_size) {
> > > > +			/* re-alloc */
> > > > +			struct resource *rams_new;
> > > > +			int rams_new_size;
> > > > +
> > > > +			rams_new_size = rams_size + 16;
> > > > +			rams_new = vmalloc(sizeof(struct resource)
> > > > +							* rams_new_size);
> > > > +			if (!rams_new)
> > > > +				goto out;
> > > > +
> > > > +			memcpy(rams_new, rams,
> > > > +					sizeof(struct resource) * rams_size);
> > > > +			vfree(rams);
> > > > +			rams = rams_new;
> > > > +			rams_size = rams_new_size;
> > > > +		}
> > > > +
> > > > +		rams[i].start = res.start;
> > > > +		rams[i++].end = res.end;
> > > > +
> > > > +		res.start = res.end + 1;
> > > > +		res.end = end;
> > > > +	}
> > > > +
> > > > +	/* go reverse */
> > > > +	for (i--; i >= 0; i--) {
> > > > +		ret = (*func)(&rams[i], arg);
> > > > +		if (ret)
> > > > +			break;
> > > > +	}
> > > > +
> > > > +out:
> > > > +	vfree(rams);
> > > > +	return ret;
> > > > +}
> > > > +
> > > >  #if !defined(CONFIG_ARCH_HAS_WALK_MEMORY)
> > > >  
> > > >  /*
> > > > -- 
> > > > 2.16.2
> > > > 
> > > 
> > > Thanks
> > > Dave
> > > 
> > > _______________________________________________
> > > kexec mailing list
> > > kexec@lists.infradead.org
> > > http://lists.infradead.org/mailman/listinfo/kexec
> 
> _______________________________________________
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH v8 01/13] resource: add walk_system_ram_res_rev()
@ 2018-03-20  3:48           ` Baoquan He
  0 siblings, 0 replies; 102+ messages in thread
From: Baoquan He @ 2018-03-20  3:48 UTC (permalink / raw)
  To: linux-arm-kernel

On 03/20/18 at 12:12pm, AKASHI Takahiro wrote:
> Baoquan,
> 
> On Tue, Mar 20, 2018 at 09:43:18AM +0800, Baoquan He wrote:
> > On 02/23/18 at 04:36pm, Dave Young wrote:
> > > Hi AKASHI,
> > > 
> > > On 02/22/18 at 08:17pm, AKASHI Takahiro wrote:
> > > > This function, being a variant of walk_system_ram_res() introduced in
> > > > commit 8c86e70acead ("resource: provide new functions to walk through
> > > > resources"), walks through a list of all the resources of System RAM
> > > > in reversed order, i.e., from higher to lower.
> > > > 
> > > > It will be used in kexec_file implementation on arm64.
> > > 
> > > I remember there was an old discussion about this, it should be added
> > > in patch log why this is needed.
> > 
> > It's used to load kernel/initrd at the top of system RAM, and this is
> > consistent with user space kexec behaviour.
> > 
> > In x86 64, Vivek didn't do like this since there's no reverse iomem
> > resource iterating function, he just chose a match RAM region bottom up,
> > then put kernel/initrd top down in the found RAM region. This is
> > different than kexec_tools utility. I am considering to change resource
> > sibling as double list, seems AKASHI's way is easier to be accepted by
> > people. So I will use this one to change x86 64 code.
> > 
> > Hi AKASHI,
> > 
> > About arm64 kexec_file patches, will you post recently? Or any other
> > plan?
> 
> A short answer is yes, but my new version won't include this specific patch.
> So please feel free to add it to your own patch set if you want.
> 
> The reason that I'm going to remove it is that we will make a modification
> on /proc/iomem due to a bug fixing and then we will have to have our own 
> "walking" routine.

I see. Saw your post about the /proc/iomem issue and discussions.

Then I will add this patch in and post a patchset.

Thanks
Baoquan

> 
> 
> > Thanks
> > Baoquan
> > 
> > > 
> > > > 
> > > > Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> > > > Cc: Vivek Goyal <vgoyal@redhat.com>
> > > > Cc: Andrew Morton <akpm@linux-foundation.org>
> > > > Cc: Linus Torvalds <torvalds@linux-foundation.org>
> > > > ---
> > > >  include/linux/ioport.h |  3 +++
> > > >  kernel/resource.c      | 57 ++++++++++++++++++++++++++++++++++++++++++++++++++
> > > >  2 files changed, 60 insertions(+)
> > > > 
> > > > diff --git a/include/linux/ioport.h b/include/linux/ioport.h
> > > > index da0ebaec25f0..f12d95fe038b 100644
> > > > --- a/include/linux/ioport.h
> > > > +++ b/include/linux/ioport.h
> > > > @@ -277,6 +277,9 @@ extern int
> > > >  walk_system_ram_res(u64 start, u64 end, void *arg,
> > > >  		    int (*func)(struct resource *, void *));
> > > >  extern int
> > > > +walk_system_ram_res_rev(u64 start, u64 end, void *arg,
> > > > +			int (*func)(struct resource *, void *));
> > > > +extern int
> > > >  walk_iomem_res_desc(unsigned long desc, unsigned long flags, u64 start, u64 end,
> > > >  		    void *arg, int (*func)(struct resource *, void *));
> > > >  
> > > > diff --git a/kernel/resource.c b/kernel/resource.c
> > > > index e270b5048988..bdaa93407f4c 100644
> > > > --- a/kernel/resource.c
> > > > +++ b/kernel/resource.c
> > > > @@ -23,6 +23,8 @@
> > > >  #include <linux/pfn.h>
> > > >  #include <linux/mm.h>
> > > >  #include <linux/resource_ext.h>
> > > > +#include <linux/string.h>
> > > > +#include <linux/vmalloc.h>
> > > >  #include <asm/io.h>
> > > >  
> > > >  
> > > > @@ -486,6 +488,61 @@ int walk_mem_res(u64 start, u64 end, void *arg,
> > > >  				     arg, func);
> > > >  }
> > > >  
> > > > +int walk_system_ram_res_rev(u64 start, u64 end, void *arg,
> > > > +				int (*func)(struct resource *, void *))
> > > > +{
> > > > +	struct resource res, *rams;
> > > > +	int rams_size = 16, i;
> > > > +	int ret = -1;
> > > > +
> > > > +	/* create a list */
> > > > +	rams = vmalloc(sizeof(struct resource) * rams_size);
> > > > +	if (!rams)
> > > > +		return ret;
> > > > +
> > > > +	res.start = start;
> > > > +	res.end = end;
> > > > +	res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
> > > > +	i = 0;
> > > > +	while ((res.start < res.end) &&
> > > > +		(!find_next_iomem_res(&res, IORES_DESC_NONE, true))) {
> > > > +		if (i >= rams_size) {
> > > > +			/* re-alloc */
> > > > +			struct resource *rams_new;
> > > > +			int rams_new_size;
> > > > +
> > > > +			rams_new_size = rams_size + 16;
> > > > +			rams_new = vmalloc(sizeof(struct resource)
> > > > +							* rams_new_size);
> > > > +			if (!rams_new)
> > > > +				goto out;
> > > > +
> > > > +			memcpy(rams_new, rams,
> > > > +					sizeof(struct resource) * rams_size);
> > > > +			vfree(rams);
> > > > +			rams = rams_new;
> > > > +			rams_size = rams_new_size;
> > > > +		}
> > > > +
> > > > +		rams[i].start = res.start;
> > > > +		rams[i++].end = res.end;
> > > > +
> > > > +		res.start = res.end + 1;
> > > > +		res.end = end;
> > > > +	}
> > > > +
> > > > +	/* go reverse */
> > > > +	for (i--; i >= 0; i--) {
> > > > +		ret = (*func)(&rams[i], arg);
> > > > +		if (ret)
> > > > +			break;
> > > > +	}
> > > > +
> > > > +out:
> > > > +	vfree(rams);
> > > > +	return ret;
> > > > +}
> > > > +
> > > >  #if !defined(CONFIG_ARCH_HAS_WALK_MEMORY)
> > > >  
> > > >  /*
> > > > -- 
> > > > 2.16.2
> > > > 
> > > 
> > > Thanks
> > > Dave
> > > 
> > > _______________________________________________
> > > kexec mailing list
> > > kexec at lists.infradead.org
> > > http://lists.infradead.org/mailman/listinfo/kexec
> 
> _______________________________________________
> kexec mailing list
> kexec at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v8 01/13] resource: add walk_system_ram_res_rev()
@ 2018-03-20  3:48           ` Baoquan He
  0 siblings, 0 replies; 102+ messages in thread
From: Baoquan He @ 2018-03-20  3:48 UTC (permalink / raw)
  To: AKASHI Takahiro, Dave Young, herbert, ard.biesheuvel,
	catalin.marinas, julien.thierry, will.deacon, linux-kernel,
	kexec, dhowells, arnd, linux-arm-kernel, mpe, bauerman, akpm,
	Linus Torvalds, davem, vgoyal

On 03/20/18 at 12:12pm, AKASHI Takahiro wrote:
> Baoquan,
> 
> On Tue, Mar 20, 2018 at 09:43:18AM +0800, Baoquan He wrote:
> > On 02/23/18 at 04:36pm, Dave Young wrote:
> > > Hi AKASHI,
> > > 
> > > On 02/22/18 at 08:17pm, AKASHI Takahiro wrote:
> > > > This function, being a variant of walk_system_ram_res() introduced in
> > > > commit 8c86e70acead ("resource: provide new functions to walk through
> > > > resources"), walks through a list of all the resources of System RAM
> > > > in reversed order, i.e., from higher to lower.
> > > > 
> > > > It will be used in kexec_file implementation on arm64.
> > > 
> > > I remember there was an old discussion about this, it should be added
> > > in patch log why this is needed.
> > 
> > It's used to load kernel/initrd at the top of system RAM, and this is
> > consistent with user space kexec behaviour.
> > 
> > In x86 64, Vivek didn't do like this since there's no reverse iomem
> > resource iterating function, he just chose a match RAM region bottom up,
> > then put kernel/initrd top down in the found RAM region. This is
> > different than kexec_tools utility. I am considering to change resource
> > sibling as double list, seems AKASHI's way is easier to be accepted by
> > people. So I will use this one to change x86 64 code.
> > 
> > Hi AKASHI,
> > 
> > About arm64 kexec_file patches, will you post recently? Or any other
> > plan?
> 
> A short answer is yes, but my new version won't include this specific patch.
> So please feel free to add it to your own patch set if you want.
> 
> The reason that I'm going to remove it is that we will make a modification
> on /proc/iomem due to a bug fixing and then we will have to have our own 
> "walking" routine.

I see. Saw your post about the /proc/iomem issue and discussions.

Then I will add this patch in and post a patchset.

Thanks
Baoquan

> 
> 
> > Thanks
> > Baoquan
> > 
> > > 
> > > > 
> > > > Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> > > > Cc: Vivek Goyal <vgoyal@redhat.com>
> > > > Cc: Andrew Morton <akpm@linux-foundation.org>
> > > > Cc: Linus Torvalds <torvalds@linux-foundation.org>
> > > > ---
> > > >  include/linux/ioport.h |  3 +++
> > > >  kernel/resource.c      | 57 ++++++++++++++++++++++++++++++++++++++++++++++++++
> > > >  2 files changed, 60 insertions(+)
> > > > 
> > > > diff --git a/include/linux/ioport.h b/include/linux/ioport.h
> > > > index da0ebaec25f0..f12d95fe038b 100644
> > > > --- a/include/linux/ioport.h
> > > > +++ b/include/linux/ioport.h
> > > > @@ -277,6 +277,9 @@ extern int
> > > >  walk_system_ram_res(u64 start, u64 end, void *arg,
> > > >  		    int (*func)(struct resource *, void *));
> > > >  extern int
> > > > +walk_system_ram_res_rev(u64 start, u64 end, void *arg,
> > > > +			int (*func)(struct resource *, void *));
> > > > +extern int
> > > >  walk_iomem_res_desc(unsigned long desc, unsigned long flags, u64 start, u64 end,
> > > >  		    void *arg, int (*func)(struct resource *, void *));
> > > >  
> > > > diff --git a/kernel/resource.c b/kernel/resource.c
> > > > index e270b5048988..bdaa93407f4c 100644
> > > > --- a/kernel/resource.c
> > > > +++ b/kernel/resource.c
> > > > @@ -23,6 +23,8 @@
> > > >  #include <linux/pfn.h>
> > > >  #include <linux/mm.h>
> > > >  #include <linux/resource_ext.h>
> > > > +#include <linux/string.h>
> > > > +#include <linux/vmalloc.h>
> > > >  #include <asm/io.h>
> > > >  
> > > >  
> > > > @@ -486,6 +488,61 @@ int walk_mem_res(u64 start, u64 end, void *arg,
> > > >  				     arg, func);
> > > >  }
> > > >  
> > > > +int walk_system_ram_res_rev(u64 start, u64 end, void *arg,
> > > > +				int (*func)(struct resource *, void *))
> > > > +{
> > > > +	struct resource res, *rams;
> > > > +	int rams_size = 16, i;
> > > > +	int ret = -1;
> > > > +
> > > > +	/* create a list */
> > > > +	rams = vmalloc(sizeof(struct resource) * rams_size);
> > > > +	if (!rams)
> > > > +		return ret;
> > > > +
> > > > +	res.start = start;
> > > > +	res.end = end;
> > > > +	res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
> > > > +	i = 0;
> > > > +	while ((res.start < res.end) &&
> > > > +		(!find_next_iomem_res(&res, IORES_DESC_NONE, true))) {
> > > > +		if (i >= rams_size) {
> > > > +			/* re-alloc */
> > > > +			struct resource *rams_new;
> > > > +			int rams_new_size;
> > > > +
> > > > +			rams_new_size = rams_size + 16;
> > > > +			rams_new = vmalloc(sizeof(struct resource)
> > > > +							* rams_new_size);
> > > > +			if (!rams_new)
> > > > +				goto out;
> > > > +
> > > > +			memcpy(rams_new, rams,
> > > > +					sizeof(struct resource) * rams_size);
> > > > +			vfree(rams);
> > > > +			rams = rams_new;
> > > > +			rams_size = rams_new_size;
> > > > +		}
> > > > +
> > > > +		rams[i].start = res.start;
> > > > +		rams[i++].end = res.end;
> > > > +
> > > > +		res.start = res.end + 1;
> > > > +		res.end = end;
> > > > +	}
> > > > +
> > > > +	/* go reverse */
> > > > +	for (i--; i >= 0; i--) {
> > > > +		ret = (*func)(&rams[i], arg);
> > > > +		if (ret)
> > > > +			break;
> > > > +	}
> > > > +
> > > > +out:
> > > > +	vfree(rams);
> > > > +	return ret;
> > > > +}
> > > > +
> > > >  #if !defined(CONFIG_ARCH_HAS_WALK_MEMORY)
> > > >  
> > > >  /*
> > > > -- 
> > > > 2.16.2
> > > > 
> > > 
> > > Thanks
> > > Dave
> > > 

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


end of thread, other threads:[~2018-03-20  3:48 UTC | newest]

Thread overview: 102+ messages
2018-02-22 11:17 [PATCH v8 00/13] arm64: kexec: add kexec_file_load() support AKASHI Takahiro
2018-02-22 11:17 ` [PATCH v8 01/13] resource: add walk_system_ram_res_rev() AKASHI Takahiro
2018-02-23  8:36   ` Dave Young
2018-03-20  1:43     ` Baoquan He
2018-03-20  3:12       ` AKASHI Takahiro
2018-03-20  3:48         ` Baoquan He
2018-02-22 11:17 ` [PATCH v8 02/13] kexec_file: make an use of purgatory optional AKASHI Takahiro
2018-02-23  8:49   ` Dave Young
2018-02-26 10:24     ` AKASHI Takahiro
2018-02-28 12:33       ` Dave Young
2018-03-01  2:59         ` AKASHI Takahiro
2018-02-22 11:17 ` [PATCH v8 03/13] kexec_file,x86,powerpc: factor out kexec_file_ops functions AKASHI Takahiro
2018-02-23  9:24   ` [PATCH v8 03/13] kexec_file,x86,powerpc: " Dave Young
2018-02-26 10:01     ` AKASHI Takahiro
2018-02-26 11:25       ` Philipp Rudo
2018-02-28 12:38       ` Dave Young
2018-03-01  3:18         ` AKASHI Takahiro
2018-02-26 11:17   ` [PATCH v8 03/13] kexec_file, x86, powerpc: " Philipp Rudo
2018-02-27  2:03     ` AKASHI Takahiro
2018-02-27  9:26       ` Philipp Rudo
2018-02-22 11:17 ` [PATCH v8 04/13] x86: kexec_file: factor out elf core header related functions AKASHI Takahiro
2018-02-24  3:15   ` Dave Young
2018-02-26  9:21     ` AKASHI Takahiro
2018-02-22 11:17 ` [PATCH v8 05/13] kexec_file, x86: move re-factored code to generic side AKASHI Takahiro
2018-02-22 11:17 ` [PATCH v8 06/13] asm-generic: add kexec_file_load system call to unistd.h AKASHI Takahiro
2018-02-22 11:17 ` [PATCH v8 07/13] arm64: kexec_file: invoke the kernel without purgatory AKASHI Takahiro
2018-02-22 11:17 ` [PATCH v8 08/13] arm64: kexec_file: load initrd and device-tree AKASHI Takahiro
2018-02-22 11:17 ` [PATCH v8 09/13] arm64: kexec_file: add crash dump support AKASHI Takahiro
2018-02-22 11:17 ` [PATCH v8 10/13] arm64: kexec_file: add Image format support AKASHI Takahiro
2018-02-22 11:17 ` [PATCH v8 11/13] arm64: kexec_file: enable KEXEC_FILE config AKASHI Takahiro
2018-02-22 11:17 ` [PATCH v8 12/13] include: pe.h: remove message[] from mz header definition AKASHI Takahiro
2018-02-22 11:17 ` [PATCH v8 13/13] arm64: kexec_file: enable KEXEC_VERIFY_SIG for Image AKASHI Takahiro
2018-02-27  4:56 ` [PATCH v8 00/13] arm64: kexec: add kexec_file_load() support AKASHI Takahiro
2018-02-28 12:25   ` Dave Young
