* Edited kexec_load(2) [kexec_file_load()] man page for review @ 2014-11-09 19:17 Michael Kerrisk (man-pages) 2014-11-11 21:30 ` Vivek Goyal 0 siblings, 1 reply; 19+ messages in thread From: Michael Kerrisk (man-pages) @ 2014-11-09 19:17 UTC (permalink / raw) To: Vivek Goyal Cc: mtk.manpages, lkml, linux-man, kexec, Andy Lutomirski, Dave Young, H. Peter Anvin, Borislav Petkov, Eric W. Biederman Hello Vivek (and all), Thanks for the kexec_file_load() patch [for the kexec_load(2) man page] that you quite some time ago sent. I have merged it and done some substantial editing as well. Could you please take a look at the draft below, and check that the kexec_file_load() material is okay. Please could you especially pay attention to the pieces marked "FIXME(kexec_file_load)", since those are pieces about which i had questions or doubts. Thanks, Michael .\" Copyright (C) 2010 Intel Corporation, Author: Andi Kleen .\" and Copyright 2014, Vivek Goyal <vgoyal@redhat.com> .\" .\" %%%LICENSE_START(VERBATIM) .\" Permission is granted to make and distribute verbatim copies of this .\" manual provided the copyright notice and this permission notice are .\" preserved on all copies. .\" .\" Permission is granted to copy and distribute modified versions of this .\" manual under the conditions for verbatim copying, provided that the .\" entire resulting derived work is distributed under the terms of a .\" permission notice identical to this one. .\" .\" Since the Linux kernel and libraries are constantly changing, this .\" manual page may be incorrect or out-of-date. The author(s) assume no .\" responsibility for errors or omissions, or for damages resulting from .\" the use of the information contained herein. The author(s) may not .\" have taken the same level of care in the production of this manual, .\" which is licensed free of charge, as they might when working .\" professionally. 
.\" .\" Formatted or processed versions of this manual, if unaccompanied by .\" the source, must acknowledge the copyright and authors of this work. .\" %%%LICENSE_END .\" .TH KEXEC_LOAD 2 2014-08-19 "Linux" "Linux Programmer's Manual" .SH NAME kexec_load, kexec_file_load \- load a new kernel for later execution .SH SYNOPSIS .nf .B #include <linux/kexec.h> .BI "long kexec_load(unsigned long " entry ", unsigned long " nr_segments "," .BI " struct kexec_segment *" segments \ ", unsigned long " flags ");" .\" FIXME(kexec_file_load): .\" Why are the return types of kexec_load() and kexec_file_load() .\" different? .BI "int kexec_file_load(int " kernel_fd ", int " initrd_fd "," .br .BI " unsigned long " cmdline_len \ ", const char *" cmdline "," .BI " unsigned long " flags ");" .fi .IR Note : There are no glibc wrappers for these system calls; see NOTES. .SH DESCRIPTION The .BR kexec_load () system call loads a new kernel that can be executed later by .BR reboot (2). .PP The .I flags argument is a bit mask that controls the operation of the call. The following values can be specified in .IR flags : .TP .BR KEXEC_ON_CRASH " (since Linux 2.6.13)" Execute the new kernel automatically on a system crash. .\" FIXME Explain in more detail how KEXEC_ON_CRASH is actually used .TP .BR KEXEC_PRESERVE_CONTEXT " (since Linux 2.6.27)" Preserve the system hardware and software states before executing the new kernel. This could be used for system suspend. This flag is available only if the kernel was configured with .BR CONFIG_KEXEC_JUMP , and is effective only if .I nr_segments is greater than 0. .PP The high-order bits (corresponding to the mask 0xffff0000) of .I flags contain the architecture of the to-be-executed kernel. 
Specify (OR) the constant .B KEXEC_ARCH_DEFAULT to use the current architecture, or one of the following architecture constants .BR KEXEC_ARCH_386 , .BR KEXEC_ARCH_68K , .BR KEXEC_ARCH_X86_64 , .BR KEXEC_ARCH_PPC , .BR KEXEC_ARCH_PPC64 , .BR KEXEC_ARCH_IA_64 , .BR KEXEC_ARCH_ARM , .BR KEXEC_ARCH_S390 , .BR KEXEC_ARCH_SH , .BR KEXEC_ARCH_MIPS , and .BR KEXEC_ARCH_MIPS_LE . The architecture must be executable on the CPU of the system. The .I entry argument is the physical entry address in the kernel image. The .I nr_segments argument is the number of segments pointed to by the .I segments pointer; the kernel imposes an (arbitrary) limit of 16 on the number of segments. The .I segments argument is an array of .I kexec_segment structures which define the kernel layout: .in +4n .nf struct kexec_segment { void *buf; /* Buffer in user space */ size_t bufsz; /* Buffer length in user space */ void *mem; /* Physical address of kernel */ size_t memsz; /* Physical address length */ }; .fi .in .PP .\" FIXME Explain the details of how the kernel image defined by segments .\" is copied from the calling process into previously reserved memory. The kernel image defined by .I segments is copied from the calling process into previously reserved memory. .SS kexec_file_load() The .BR kexec_file_load () system call is similar to .BR kexec_load (), but it takes a different set of arguments. It reads the kernel to be loaded from the file referred to by the descriptor .IR kernel_fd , and the initrd (initial RAM disk) to be loaded from file referred to by the descriptor .IR initrd_fd . The .IR cmdline argument is a pointer to a string containing the command line for the new kernel; the .IR cmdline_len argument specifies the length of the string in .IR cmdline . The .IR flags argument is a bit mask which modifies the behavior of the call. The following values can be specified in .IR flags : .TP .BR KEXEC_FILE_UNLOAD Unload the currently loaded kernel. 
.TP .BR KEXEC_FILE_ON_CRASH Load the new kernel in the memory region reserved for the crash kernel. This kernel is booted if the currently running kernel crashes. .TP .BR KEXEC_FILE_NO_INITRAMFS Loading initrd/initramfs is optional. Specify this flag if no initramfs is being loaded. If this flag is set, the value passed in .IR initrd_fd is ignored. .SH RETURN VALUE On success, these system calls returns 0. On error, \-1 is returned and .I errno is set to indicate the error. .SH ERRORS .TP .B EBUSY Another crash kernel is already being loaded or a crash kernel is already in use. .TP .B EINVAL .I flags is invalid; or .IR nr_segments is too large .\" KEXEC_SEGMENT_MAX == 16 .TP .B ENOEXEC .I kernel_fd does not refer to an open file, or the kernel can't load this file. .TP .B EPERM The caller does not have the .BR CAP_SYS_BOOT capability. .SH VERSIONS The .BR kexec_load () system call first appeared in Linux 2.6.13. The .BR kexec_file_load () system call first appeared in Linux 3.17. .SH CONFORMING TO These system calls are Linux-specific. .SH NOTES Currently, there is no glibc support for these system calls. Call them using .BR syscall (2). .PP The required constants are in the Linux kernel source file .IR linux/kexec.h , which is not currently exported to glibc. Therefore, these constants must be defined manually. .\" FIXME(kexec_file_load): .\" Is the following rationale accurate? Does it need expanding? The .BR kexec_file_load () .\" See also http://lwn.net/Articles/603116/ system call was added to provide support for systems where "kexec" loading should be restricted to only kernels that are signed. The .BR kexec_load () system call is available only if the kernel was configured with .BR CONFIG_KEXEC . The .BR kexec_file_load () system call is available only if the kernel was configured with .BR CONFIG_KEXEC_FILE . .\" FIXME(kexec_file_load): .\" Does kexec_file_load() need any other CONFIG_* options to be defined? 
.SH SEE ALSO
.BR reboot (2),
.BR syscall (2),
.BR kexec (8)

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
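The NOTES section of the draft says that these calls have no glibc wrapper and that the constants must be defined by hand. A minimal sketch of what that could look like for kexec_file_load() follows. The wrapper and helper names are hypothetical, the flag values are copied by hand from the kernel's linux/kexec.h, and the sketch assumes that cmdline_len counts the terminating null byte, which the draft does not spell out.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Flag values copied by hand from the kernel's linux/kexec.h, because the
   draft notes that the header is not exported to glibc. */
#ifndef KEXEC_FILE_UNLOAD
#define KEXEC_FILE_UNLOAD       0x00000001
#define KEXEC_FILE_ON_CRASH     0x00000002
#define KEXEC_FILE_NO_INITRAMFS 0x00000004
#endif

/* Hypothetical helper: length to pass as cmdline_len.
   Assumption: the length includes the terminating null byte. */
unsigned long cmdline_len_with_nul(const char *cmdline)
{
    return strlen(cmdline) + 1;
}

/* Hypothetical wrapper (the name is ours, not a glibc function).
   Returns 0 on success, or -1 with errno set, as described in the
   RETURN VALUE section of the draft. Requires CAP_SYS_BOOT. */
long my_kexec_file_load(int kernel_fd, int initrd_fd,
                        const char *cmdline, unsigned long flags)
{
#ifdef SYS_kexec_file_load
    return syscall(SYS_kexec_file_load, kernel_fd, initrd_fd,
                   cmdline_len_with_nul(cmdline), cmdline, flags);
#else
    /* Older headers do not define the system call number. */
    (void) kernel_fd; (void) initrd_fd; (void) cmdline; (void) flags;
    errno = ENOSYS;
    return -1;
#endif
}
```

A caller would open the kernel image and the initrd, pass both descriptors plus a command line, and OR in KEXEC_FILE_NO_INITRAMFS when no initrd is supplied (in which case the draft says initrd_fd is ignored).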
* Re: Edited kexec_load(2) [kexec_file_load()] man page for review 2014-11-09 19:17 Edited kexec_load(2) [kexec_file_load()] man page for review Michael Kerrisk (man-pages) @ 2014-11-11 21:30 ` Vivek Goyal 2015-01-07 21:17 ` Michael Kerrisk (man-pages) 0 siblings, 1 reply; 19+ messages in thread From: Vivek Goyal @ 2014-11-11 21:30 UTC (permalink / raw) To: Michael Kerrisk (man-pages) Cc: lkml, linux-man, kexec, Andy Lutomirski, Dave Young, H. Peter Anvin, Borislav Petkov, Eric W. Biederman On Sun, Nov 09, 2014 at 08:17:49PM +0100, Michael Kerrisk (man-pages) wrote: > Hello Vivek (and all), > > Thanks for the kexec_file_load() patch [for the kexec_load(2) man page] > that you quite some time ago sent. I have merged it and done some > substantial editing as well. Could you please take a look at the > draft below, and check that the kexec_file_load() material is okay. > Please could you especially pay attention to the pieces marked > "FIXME(kexec_file_load)", since those are pieces about which i > had questions or doubts. > Hi Michael, Thanks for editing this man page. I have some thoughts inline. [..] > .B #include <linux/kexec.h> > > .BI "long kexec_load(unsigned long " entry ", unsigned long " nr_segments "," > .BI " struct kexec_segment *" segments \ > ", unsigned long " flags ");" > > .\" FIXME(kexec_file_load): > .\" Why are the return types of kexec_load() and kexec_file_load() > .\" different? > .BI "int kexec_file_load(int " kernel_fd ", int " initrd_fd "," I think this is ignorance on my part. It probably should be "long" as SYSCALL_DEFINE() seems to expand to. asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)); > .br > .BI " unsigned long " cmdline_len \ > ", const char *" cmdline "," > .BI " unsigned long " flags ");" > > .fi > .IR Note : > There are no glibc wrappers for these system calls; see NOTES. > .SH DESCRIPTION > The > .BR kexec_load () > system call loads a new kernel that can be executed later by > .BR reboot (2). 
> .PP > The > .I flags > argument is a bit mask that controls the operation of the call. > The following values can be specified in > .IR flags : > .TP > .BR KEXEC_ON_CRASH " (since Linux 2.6.13)" > Execute the new kernel automatically on a system crash. > .\" FIXME Explain in more detail how KEXEC_ON_CRASH is actually used Upon boot first kernel reserves a chunk of contiguous memory (if crashkernel=<> command line paramter is passed). This memory is is used to load the crash kernel (Kernel which will be booted into if first kernel crashes). Location of this reserved memory is exported to user space through /proc/iomem file. User space can parse it and prepare list of segments specifying this reserved memory as destination. Once kernel sees the flag KEXEC_ON_CRASH, it makes sure that all the segments are destined for reserved memory otherwise kernel load operation fails. [..] > struct kexec_segment { > void *buf; /* Buffer in user space */ > size_t bufsz; /* Buffer length in user space */ > void *mem; /* Physical address of kernel */ > size_t memsz; /* Physical address length */ > }; > .fi > .in > .PP > .\" FIXME Explain the details of how the kernel image defined by segments > .\" is copied from the calling process into previously reserved memory. Kernel image defined by segments is copied into kernel either in regular memory or in reserved memory (if KEXEC_ON_CRASH is set). Kernel first copies list of segments in kernel memory and then goes does various sanity checks on the segments. If everything looks line, kernel copies segment data to kernel memory. In case of normal kexec, segment data is loaded in any available memory and segment data is moved to final destination at the kexec reboot time. In case of kexec on panic (KEXEC_ON_CRASH flag set), segment data is directly loaded to reserved memory and after crash kexec simply jumps to starting point. [..] > .\" FIXME(kexec_file_load): > .\" Is the following rationale accurate? Does it need expanding? 
> The
> .BR kexec_file_load ()
> .\" See also http://lwn.net/Articles/603116/
> system call was added to provide support for systems
> where "kexec" loading should be restricted to
> only kernels that are signed.

Yes, this rationale looks good.

>
> The
> .BR kexec_load ()
> system call is available only if the kernel was configured with
> .BR CONFIG_KEXEC .
> The
> .BR kexec_file_load ()
> system call is available only if the kernel was configured with
> .BR CONFIG_KEXEC_FILE .
> .\" FIXME(kexec_file_load):
> .\" Does kexec_file_load() need any other CONFIG_* options to be defined?

Yes, it requires some other config options too.

        depends on KEXEC
        depends on X86_64
        depends on CRYPTO=y
        depends on CRYPTO_SHA256=y

        CONFIG_KEXEC_VERIFY_SIG=y
        CONFIG_KEXEC_BZIMAGE_VERIFY_SIG=y
        CONFIG_SIGNED_PE_FILE_VERIFICATION=y
        CONFIG_PKCS7_MESSAGE_PARSER=y
        CONFIG_X509_CERTIFICATE_PARSER=y
        CONFIG_ASYMMETRIC_PUBLIC_KEY_SUBTYPE=y

So the dependency list seems pretty long. I am not sure how many of these
we should specify in the man page.

Thanks
Vivek
* Re: Edited kexec_load(2) [kexec_file_load()] man page for review 2014-11-11 21:30 ` Vivek Goyal @ 2015-01-07 21:17 ` Michael Kerrisk (man-pages) 2015-01-12 22:16 ` Vivek Goyal 0 siblings, 1 reply; 19+ messages in thread From: Michael Kerrisk (man-pages) @ 2015-01-07 21:17 UTC (permalink / raw) To: Vivek Goyal Cc: mtk.manpages, lkml, linux-man, kexec, Andy Lutomirski, Dave Young, H. Peter Anvin, Borislav Petkov, Eric W. Biederman Hi Vivek, Thanks for your comments, and my apologies for a the long delayed follow-up. On 11/11/2014 10:30 PM, Vivek Goyal wrote: > On Sun, Nov 09, 2014 at 08:17:49PM +0100, Michael Kerrisk (man-pages) wrote: >> Hello Vivek (and all), >> >> Thanks for the kexec_file_load() patch [for the kexec_load(2) man page] >> that you quite some time ago sent. I have merged it and done some >> substantial editing as well. Could you please take a look at the >> draft below, and check that the kexec_file_load() material is okay. >> Please could you especially pay attention to the pieces marked >> "FIXME(kexec_file_load)", since those are pieces about which i >> had questions or doubts. >> > > Hi Michael, > > Thanks for editing this man page. I have some thoughts inline. > > [..] >> .B #include <linux/kexec.h> >> >> .BI "long kexec_load(unsigned long " entry ", unsigned long " nr_segments "," >> .BI " struct kexec_segment *" segments \ >> ", unsigned long " flags ");" >> >> .\" FIXME(kexec_file_load): >> .\" Why are the return types of kexec_load() and kexec_file_load() >> .\" different? >> .BI "int kexec_file_load(int " kernel_fd ", int " initrd_fd "," > > I think this is ignorance on my part. It probably should be "long" as > SYSCALL_DEFINE() seems to expand to. > > asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)); Okay -- I've changed to 'long' in the man page. 
>> .br >> .BI " unsigned long " cmdline_len \ >> ", const char *" cmdline "," >> .BI " unsigned long " flags ");" >> >> .fi >> .IR Note : >> There are no glibc wrappers for these system calls; see NOTES. >> .SH DESCRIPTION >> The >> .BR kexec_load () >> system call loads a new kernel that can be executed later by >> .BR reboot (2). >> .PP >> The >> .I flags >> argument is a bit mask that controls the operation of the call. >> The following values can be specified in >> .IR flags : >> .TP >> .BR KEXEC_ON_CRASH " (since Linux 2.6.13)" >> Execute the new kernel automatically on a system crash. >> .\" FIXME Explain in more detail how KEXEC_ON_CRASH is actually used I wasn't expecting that you would respond to the FIXMEs that were not labeled "kexec_file_load", but I was hoping you might ;-). Thanks! I have a few additional questions to your nice notes. > Upon boot first kernel reserves a chunk of contiguous memory (if > crashkernel=<> command line paramter is passed). This memory is > is used to load the crash kernel (Kernel which will be booted into > if first kernel crashes). Can I just confirm: is it in all cases only possible to use kexec_load() and kexec_file_load() if the kernel was booted with the 'crashkernel' parameter set? > Location of this reserved memory is exported to user space through > /proc/iomem file. Is that export via an entry labeled "Crash kernel" in the /proc/iomem file? > User space can parse it and prepare list of segments > specifying this reserved memory as destination. I'm not quite clear on "specifying this reserved memory as destination". Is that done by specifying the address in the kexec_segment.mem fields? > Once kernel sees the flag KEXEC_ON_CRASH, it makes sure that all the > segments are destined for reserved memory otherwise kernel load operation > fails. Could you point me to where this checking is done? Also, what is the error (errno) that occurs when the load operation fails? 
(I think the answers to these questions are "at the start of kimage_alloc_init()" and "EADDRNOTAVAIL", but I'd like to confirm.) > [..] >> struct kexec_segment { >> void *buf; /* Buffer in user space */ >> size_t bufsz; /* Buffer length in user space */ >> void *mem; /* Physical address of kernel */ >> size_t memsz; /* Physical address length */ >> }; >> .fi >> .in >> .PP >> .\" FIXME Explain the details of how the kernel image defined by segments >> .\" is copied from the calling process into previously reserved memory. > > Kernel image defined by segments is copied into kernel either in regular > memory Could you clarify what you mean by "regular memory"? > or in reserved memory (if KEXEC_ON_CRASH is set). Kernel first > copies list of segments in kernel memory and then goes does various > sanity checks on the segments. If everything looks line, kernel copies > segment data to kernel memory. > > In case of normal kexec, segment data is loaded in any available memory > and segment data is moved to final destination at the kexec reboot time. By "moved to final destination", do you mean "moved from user space to the final kernel-space destination"? > In case of kexec on panic (KEXEC_ON_CRASH flag set), segment data is > directly loaded to reserved memory and after crash kexec simply jumps By "directly", I assume you mean "at the time of the kexec_laod() call", right? > to starting point. > > [..] >> .\" FIXME(kexec_file_load): >> .\" Is the following rationale accurate? Does it need expanding? >> The >> .BR kexec_file_load () >> .\" See also http://lwn.net/Articles/603116/ >> system call was added to provide support for systems >> where "kexec" loading should be restricted to >> only kernels that are signed. > > Yes, this rationale looks good. Okay -- thanks. >> The >> .BR kexec_load () >> system call is available only if the kernel was configured with >> .BR CONFIG_KEXEC . 
>> The
>> .BR kexec_file_load ()
>> system call is available only if the kernel was configured with
>> .BR CONFIG_KEXEC_FILE .
>> .\" FIXME(kexec_file_load):
>> .\" Does kexec_file_load() need any other CONFIG_* options to be defined?
>
> Yes, it requires some other config options too.
>
>         depends on KEXEC
>         depends on X86_64
>         depends on CRYPTO=y
>         depends on CRYPTO_SHA256=y
>
>         CONFIG_KEXEC_VERIFY_SIG=y
>         CONFIG_KEXEC_BZIMAGE_VERIFY_SIG=y
>         CONFIG_SIGNED_PE_FILE_VERIFICATION=y
>         CONFIG_PKCS7_MESSAGE_PARSER=y
>         CONFIG_X509_CERTIFICATE_PARSER=y
>         CONFIG_ASYMMETRIC_PUBLIC_KEY_SUBTYPE=y
>
> So dependency list seems pretty long. Not sure how many of these should
> we specify in man page.

On reflection, since they're dependencies of CONFIG_KEXEC_FILE, perhaps
it's not necessary to add any of the others.

Cheers,

Michael

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
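The flags layout for kexec_load() described in the quoted draft (option bits in the low half, the architecture in the bits covered by the mask 0xffff0000, and a limit of 16 segments) can be illustrated with a small sketch. The constant values are copied by hand from the kernel's linux/kexec.h; the three helper functions are hypothetical names used only for illustration.

```c
/* Values copied by hand from the kernel's linux/kexec.h. */
#define KEXEC_ON_CRASH          0x00000001
#define KEXEC_PRESERVE_CONTEXT  0x00000002
#define KEXEC_ARCH_MASK         0xffff0000
#define KEXEC_ARCH_DEFAULT      (0 << 16)
#define KEXEC_ARCH_X86_64       (62 << 16)
#define KEXEC_SEGMENT_MAX       16

/* Hypothetical helper: the architecture part of a flags word. */
unsigned long kexec_arch_bits(unsigned long flags)
{
    return flags & KEXEC_ARCH_MASK;
}

/* Hypothetical helper: the option bits (everything outside the mask). */
unsigned long kexec_option_bits(unsigned long flags)
{
    return flags & ~KEXEC_ARCH_MASK;
}

/* Hypothetical helper: user-space pre-check of the segment-count limit
   that the draft documents under EINVAL. */
int nr_segments_ok(unsigned long nr_segments)
{
    return nr_segments <= KEXEC_SEGMENT_MAX;
}
```

For example, a crash kernel for x86-64 would use KEXEC_ON_CRASH | KEXEC_ARCH_X86_64, while KEXEC_ARCH_DEFAULT contributes no bits and leaves the choice of architecture to the running kernel.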
* Re: Edited kexec_load(2) [kexec_file_load()] man page for review 2015-01-07 21:17 ` Michael Kerrisk (man-pages) @ 2015-01-12 22:16 ` Vivek Goyal 2015-01-16 13:30 ` Michael Kerrisk (man-pages) 0 siblings, 1 reply; 19+ messages in thread From: Vivek Goyal @ 2015-01-12 22:16 UTC (permalink / raw) To: Michael Kerrisk (man-pages) Cc: lkml, linux-man, kexec, Andy Lutomirski, Dave Young, H. Peter Anvin, Borislav Petkov, Eric W. Biederman On Wed, Jan 07, 2015 at 10:17:56PM +0100, Michael Kerrisk (man-pages) wrote: [..] > >> .BR KEXEC_ON_CRASH " (since Linux 2.6.13)" > >> Execute the new kernel automatically on a system crash. > >> .\" FIXME Explain in more detail how KEXEC_ON_CRASH is actually used > > I wasn't expecting that you would respond to the FIXMEs that were > not labeled "kexec_file_load", but I was hoping you might ;-). Thanks! > I have a few additional questions to your nice notes. > > > Upon boot first kernel reserves a chunk of contiguous memory (if > > crashkernel=<> command line paramter is passed). This memory is > > is used to load the crash kernel (Kernel which will be booted into > > if first kernel crashes). > Hi Michael, > Can I just confirm: is it in all cases only possible to use kexec_load() > and kexec_file_load() if the kernel was booted with the 'crashkernel' > parameter set? As of now, only kexec_load() and kexec_file_load() system calls can make use of memory reserved by crashkernel=<> kernel parameter. And this is used only if we are trying to load a crash kernel (KEXEC_ON_CRASH flag specified). > > > Location of this reserved memory is exported to user space through > > /proc/iomem file. > > Is that export via an entry labeled "Crash kernel" in the > /proc/iomem file? Yes. > > > User space can parse it and prepare list of segments > > specifying this reserved memory as destination. > > I'm not quite clear on "specifying this reserved memory as destination". > Is that done by specifying the address in the kexec_segment.mem fields? 
You are absolutely right. User space can specify, in the kexec_segment.mem
field, the memory location where it expects a particular segment to be
loaded by the kernel.

>
> > Once kernel sees the flag KEXEC_ON_CRASH, it makes sure that all the
> > segments are destined for reserved memory otherwise kernel load operation
> > fails.
>
> Could you point me to where this checking is done? Also, what is the
> error (errno) that occurs when the load operation fails? (I think the
> answers to these questions are "at the start of kimage_alloc_init()"
> and "EADDRNOTAVAIL", but I'd like to confirm.)

This checking happens in sanity_check_segment_list(), which is called
by kimage_alloc_init().

And yes, the error code returned is -EADDRNOTAVAIL.

>
> > [..]
> >> struct kexec_segment {
> >>     void   *buf;        /* Buffer in user space */
> >>     size_t  bufsz;      /* Buffer length in user space */
> >>     void   *mem;        /* Physical address of kernel */
> >>     size_t  memsz;      /* Physical address length */
> >> };
> >> .fi
> >> .in
> >> .PP
> >> .\" FIXME Explain the details of how the kernel image defined by segments
> >> .\" is copied from the calling process into previously reserved memory.
> >
> > Kernel image defined by segments is copied into kernel either in regular
> > memory
>
> Could you clarify what you mean by "regular memory"?

I meant memory which is not reserved memory.

>
> > or in reserved memory (if KEXEC_ON_CRASH is set). Kernel first
> > copies list of segments in kernel memory and then does various
> > sanity checks on the segments. If everything looks fine, kernel copies
> > segment data to kernel memory.
> >
> > In case of normal kexec, segment data is loaded in any available memory
> > and segment data is moved to final destination at the kexec reboot time.
>
> By "moved to final destination", do you mean "moved from user space to the
> final kernel-space destination"?

No. Segment data moves from user space to kernel space once the kexec_load()
call finishes successfully. But when the user reboots (kexec -e), the
kernel moves that segment data to its final location. The kernel cannot
place the segment at its final location at kexec_load() time because that
memory is still in use by the running kernel. Once we are about to reboot
into the new kernel, we can overwrite the old kernel's memory.

>
> > In case of kexec on panic (KEXEC_ON_CRASH flag set), segment data is
> > directly loaded to reserved memory and after crash kexec simply jumps
>
> By "directly", I assume you mean "at the time of the kexec_load() call",
> right?

Yes.

Thanks
Vivek
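The user-space side of the KEXEC_ON_CRASH flow discussed above (find the "Crash kernel" entry in /proc/iomem, then make sure each kexec_segment.mem range falls inside it) can be sketched as follows. This is only an illustration under stated assumptions: the struct and function names are hypothetical, the line format is assumed to be "start-end : name" with hexadecimal addresses and an inclusive end, and the range check is a user-space pre-check in the spirit of the kernel-side check described in the thread, not the kernel code itself.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical type: the reserved crash-kernel range. */
struct crash_region {
    unsigned long long start;
    unsigned long long end;     /* inclusive, as printed in /proc/iomem */
};

/* Parse one /proc/iomem line such as "  2b000000-32ffffff : Crash kernel".
   Returns 1 and fills *r if the line is a "Crash kernel" entry, else 0. */
int parse_crash_kernel_line(const char *line, struct crash_region *r)
{
    unsigned long long start, end;
    char name[64];

    if (sscanf(line, " %llx-%llx : %63[^\n]", &start, &end, name) != 3)
        return 0;
    if (strcmp(name, "Crash kernel") != 0)
        return 0;
    r->start = start;
    r->end = end;
    return 1;
}

/* Scan /proc/iomem for the reserved region; returns 1 if found. */
int find_crash_region(struct crash_region *r)
{
    FILE *fp = fopen("/proc/iomem", "r");
    char line[256];
    int found = 0;

    if (fp == NULL)
        return 0;
    while (!found && fgets(line, sizeof(line), fp) != NULL)
        found = parse_crash_kernel_line(line, r);
    fclose(fp);
    return found;
}

/* Returns 1 if [mem, mem + memsz) lies entirely inside the region.
   Written to avoid overflow when computing the end of the segment. */
int segment_in_region(unsigned long long mem, unsigned long long memsz,
                      const struct crash_region *r)
{
    if (memsz == 0 || mem < r->start || mem > r->end)
        return 0;
    return memsz - 1 <= r->end - mem;
}
```

A loader would call find_crash_region() once and then check every segment with segment_in_region() before calling kexec_load(); a segment that fails the check is the case that, according to the thread, the kernel rejects with EADDRNOTAVAIL.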
* Re: Edited kexec_load(2) [kexec_file_load()] man page for review 2015-01-12 22:16 ` Vivek Goyal @ 2015-01-16 13:30 ` Michael Kerrisk (man-pages) 2015-01-27 8:07 ` Michael Kerrisk (man-pages) 2015-01-27 14:24 ` Vivek Goyal 0 siblings, 2 replies; 19+ messages in thread From: Michael Kerrisk (man-pages) @ 2015-01-16 13:30 UTC (permalink / raw) To: Vivek Goyal Cc: mtk.manpages, lkml, linux-man, kexec, Andy Lutomirski, Dave Young, H. Peter Anvin, Borislav Petkov, Eric W. Biederman Hello Vivek, Thanks for your comments! I've added some further text to the page based on those comments. See some follow-up questions below. On 01/12/2015 11:16 PM, Vivek Goyal wrote: > On Wed, Jan 07, 2015 at 10:17:56PM +0100, Michael Kerrisk (man-pages) wrote: > > [..] >>>> .BR KEXEC_ON_CRASH " (since Linux 2.6.13)" >>>> Execute the new kernel automatically on a system crash. >>>> .\" FIXME Explain in more detail how KEXEC_ON_CRASH is actually used >> >> I wasn't expecting that you would respond to the FIXMEs that were >> not labeled "kexec_file_load", but I was hoping you might ;-). Thanks! >> I have a few additional questions to your nice notes. >> >>> Upon boot first kernel reserves a chunk of contiguous memory (if >>> crashkernel=<> command line paramter is passed). This memory is >>> is used to load the crash kernel (Kernel which will be booted into >>> if first kernel crashes). >> > > Hi Michael, > >> Can I just confirm: is it in all cases only possible to use kexec_load() >> and kexec_file_load() if the kernel was booted with the 'crashkernel' >> parameter set? > > As of now, only kexec_load() and kexec_file_load() system calls can > make use of memory reserved by crashkernel=<> kernel parameter. And > this is used only if we are trying to load a crash kernel (KEXEC_ON_CRASH > flag specified). Okay. >>> Location of this reserved memory is exported to user space through >>> /proc/iomem file. >> >> Is that export via an entry labeled "Crash kernel" in the >> /proc/iomem file? > > Yes. 
Okay -- thanks. >>> User space can parse it and prepare list of segments >>> specifying this reserved memory as destination. >> >> I'm not quite clear on "specifying this reserved memory as destination". >> Is that done by specifying the address in the kexec_segment.mem fields? > > You are absolutely right. User space can specify in kexec_segment.mem > field the memory location where it expecting a particular segment to > be loaded by kernel. > >> >>> Once kernel sees the flag KEXEC_ON_CRASH, it makes sure that all the >>> segments are destined for reserved memory otherwise kernel load operation >>> fails. >> >> Could you point me to where this checking is done? Also, what is the >> error (errno) that occurs when the load operation fails? (I think the >> answers to these questions are "at the start of kimage_alloc_init()" >> and "EADDRNOTAVAIL", but I'd like to confirm.) > > This checking happens in sanity_check_segment_list() which is called > by kimage_alloc_init(). > > And yes, error code returned is -EADDRNOTAVAIL. Thanks. I added EADDRNOTAVAIL to the ERRORS. >>> [..] >>>> struct kexec_segment { >>>> void *buf; /* Buffer in user space */ >>>> size_t bufsz; /* Buffer length in user space */ >>>> void *mem; /* Physical address of kernel */ >>>> size_t memsz; /* Physical address length */ >>>> }; >>>> .fi >>>> .in >>>> .PP >>>> .\" FIXME Explain the details of how the kernel image defined by segments >>>> .\" is copied from the calling process into previously reserved memory. >>> >>> Kernel image defined by segments is copied into kernel either in regular >>> memory >> >> Could you clarify what you mean by "regular memory"? > > I meant memory which is not reserved memory. Okay. >>> or in reserved memory (if KEXEC_ON_CRASH is set). Kernel first >>> copies list of segments in kernel memory and then goes does various >>> sanity checks on the segments. If everything looks line, kernel copies >>> segment data to kernel memory. 
>>> >>> In case of normal kexec, segment data is loaded in any available memory >>> and segment data is moved to final destination at the kexec reboot time. >> >> By "moved to final destination", do you mean "moved from user space to the >> final kernel-space destination"? > > No. Segment data moves from user space to kernel space once kexec_load() > call finishes successfully. But when user does reboot (kexec -e), at that > time kernel moves that segment data to its final location. Kernel could > not place the segment at its final location during kexec_load() time as > that memory is already in use by running kernel. But once we are about > to reboot to new kernel, we can overwrite the old kernel's memory. Got it. >>> In case of kexec on panic (KEXEC_ON_CRASH flag set), segment data is >>> directly loaded to reserved memory and after crash kexec simply jumps >> >> By "directly", I assume you mean "at the time of the kexec_laod() call", >> right? > > Yes. Thanks. So, returning to the kexeec_segment structure: struct kexec_segment { void *buf; /* Buffer in user space */ size_t bufsz; /* Buffer length in user space */ void *mem; /* Physical address of kernel */ size_t memsz; /* Physical address length */ }; Are the following statements correct: * buf + bufsz identify a memory region in the caller's virtual address space that is the source of the copy * mem + memsz specify the target memory region of the copy * mem is physical memory address, as seen from kernel space * the number of bytes copied from userspace is min(bufsz, memsz) * if bufsz > memsz, then excess bytes in the user-space buffer are ignored. * if memsz > bufsz, then excess bytes in the target kernel buffer are filled with zeros. ? Also, it seems to me that 'mem' need not be page aligned. Is that correct? Should the man page say something about that? (E.g., is it generally desirable that 'mem' should be page aligned?) Likewise, 'memsz' doesn't need to be a page multiple, IIUC. 
Should the man page say anything about this? For example, should it
note that the initialized kernel segment will be of size:

     (mem % PAGE_SIZE + memsz) rounded up to the next multiple of PAGE_SIZE

And should it note that if 'mem' is not a multiple of the page size, then
the initial bytes (mem % PAGE_SIZE) in the first page of the kernel segment
will be zeros?

(Hopefully I have read kimage_load_normal_segment() correctly.)

And one further question. Other than the fact that they are used with
different system calls, what is the difference between KEXEC_ON_CRASH
and KEXEC_FILE_ON_CRASH?

Thanks,

Michael

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
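The arithmetic in the questions above can be written out as a short sketch. It only restates the hypotheses being asked about (copy min(bufsz, memsz) bytes, zero-fill the remainder, round the size up to a page multiple); none of them had been confirmed at this point in the thread, and whether the kernel accepts an unaligned 'mem' at all is part of what is being asked. The function names are hypothetical, and the page size is passed in explicitly rather than taken from the system.

```c
#include <stddef.h>

/* Hypothesis from the question: bytes copied from the user-space buffer. */
size_t bytes_copied(size_t bufsz, size_t memsz)
{
    return bufsz < memsz ? bufsz : memsz;
}

/* Hypothesis from the question: bytes at the end of the target region
   that would be zero-filled when memsz exceeds bufsz. */
size_t bytes_zero_filled(size_t bufsz, size_t memsz)
{
    return memsz > bufsz ? memsz - bufsz : 0;
}

/* Hypothesis from the question: (mem % page_size + memsz), rounded up to
   the next multiple of page_size. */
unsigned long initialized_size(unsigned long mem, unsigned long memsz,
                               unsigned long page_size)
{
    unsigned long span = mem % page_size + memsz;

    return (span + page_size - 1) / page_size * page_size;
}
```

With a 4096-byte page, a segment with mem = 0x100800 and memsz = 0x1000 would span two pages under this reading, with the first 0x800 bytes of the first page being the leading bytes asked about; when mem and memsz are both page multiples, the rounding changes nothing.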
* Re: Edited kexec_load(2) [kexec_file_load()] man page for review 2015-01-16 13:30 ` Michael Kerrisk (man-pages) @ 2015-01-27 8:07 ` Michael Kerrisk (man-pages) 2015-01-27 14:24 ` Vivek Goyal 1 sibling, 0 replies; 19+ messages in thread From: Michael Kerrisk (man-pages) @ 2015-01-27 8:07 UTC (permalink / raw) To: Vivek Goyal Cc: Michael Kerrisk, lkml, linux-man, Kexec Mailing List, Andy Lutomirski, Dave Young, H. Peter Anvin, Borislav Petkov, Eric W. Biederman Hello Vivek, Ping! Cheers, Michael On 16 January 2015 at 14:30, Michael Kerrisk (man-pages) <mtk.manpages@gmail.com> wrote: > Hello Vivek, > > Thanks for your comments! I've added some further text to > the page based on those comments. See some follow-up > questions below. > > On 01/12/2015 11:16 PM, Vivek Goyal wrote: >> On Wed, Jan 07, 2015 at 10:17:56PM +0100, Michael Kerrisk (man-pages) wrote: >> >> [..] >>>>> .BR KEXEC_ON_CRASH " (since Linux 2.6.13)" >>>>> Execute the new kernel automatically on a system crash. >>>>> .\" FIXME Explain in more detail how KEXEC_ON_CRASH is actually used >>> >>> I wasn't expecting that you would respond to the FIXMEs that were >>> not labeled "kexec_file_load", but I was hoping you might ;-). Thanks! >>> I have a few additional questions to your nice notes. >>> >>>> Upon boot first kernel reserves a chunk of contiguous memory (if >>>> crashkernel=<> command line paramter is passed). This memory is >>>> is used to load the crash kernel (Kernel which will be booted into >>>> if first kernel crashes). >>> >> >> Hi Michael, >> >>> Can I just confirm: is it in all cases only possible to use kexec_load() >>> and kexec_file_load() if the kernel was booted with the 'crashkernel' >>> parameter set? >> >> As of now, only kexec_load() and kexec_file_load() system calls can >> make use of memory reserved by crashkernel=<> kernel parameter. And >> this is used only if we are trying to load a crash kernel (KEXEC_ON_CRASH >> flag specified). > > Okay. 
> >>>> Location of this reserved memory is exported to user space through >>>> /proc/iomem file. >>> >>> Is that export via an entry labeled "Crash kernel" in the >>> /proc/iomem file? >> >> Yes. > > Okay -- thanks. > >>>> User space can parse it and prepare list of segments >>>> specifying this reserved memory as destination. >>> >>> I'm not quite clear on "specifying this reserved memory as destination". >>> Is that done by specifying the address in the kexec_segment.mem fields? >> >> You are absolutely right. User space can specify in kexec_segment.mem >> field the memory location where it expecting a particular segment to >> be loaded by kernel. >> >>> >>>> Once kernel sees the flag KEXEC_ON_CRASH, it makes sure that all the >>>> segments are destined for reserved memory otherwise kernel load operation >>>> fails. >>> >>> Could you point me to where this checking is done? Also, what is the >>> error (errno) that occurs when the load operation fails? (I think the >>> answers to these questions are "at the start of kimage_alloc_init()" >>> and "EADDRNOTAVAIL", but I'd like to confirm.) >> >> This checking happens in sanity_check_segment_list() which is called >> by kimage_alloc_init(). >> >> And yes, error code returned is -EADDRNOTAVAIL. > > Thanks. I added EADDRNOTAVAIL to the ERRORS. > >>>> [..] >>>>> struct kexec_segment { >>>>> void *buf; /* Buffer in user space */ >>>>> size_t bufsz; /* Buffer length in user space */ >>>>> void *mem; /* Physical address of kernel */ >>>>> size_t memsz; /* Physical address length */ >>>>> }; >>>>> .fi >>>>> .in >>>>> .PP >>>>> .\" FIXME Explain the details of how the kernel image defined by segments >>>>> .\" is copied from the calling process into previously reserved memory. >>>> >>>> Kernel image defined by segments is copied into kernel either in regular >>>> memory >>> >>> Could you clarify what you mean by "regular memory"? >> >> I meant memory which is not reserved memory. > > Okay. 
> >>>> or in reserved memory (if KEXEC_ON_CRASH is set). Kernel first >>>> copies list of segments in kernel memory and then goes does various >>>> sanity checks on the segments. If everything looks line, kernel copies >>>> segment data to kernel memory. >>>> >>>> In case of normal kexec, segment data is loaded in any available memory >>>> and segment data is moved to final destination at the kexec reboot time. >>> >>> By "moved to final destination", do you mean "moved from user space to the >>> final kernel-space destination"? >> >> No. Segment data moves from user space to kernel space once kexec_load() >> call finishes successfully. But when user does reboot (kexec -e), at that >> time kernel moves that segment data to its final location. Kernel could >> not place the segment at its final location during kexec_load() time as >> that memory is already in use by running kernel. But once we are about >> to reboot to new kernel, we can overwrite the old kernel's memory. > > Got it. > >>>> In case of kexec on panic (KEXEC_ON_CRASH flag set), segment data is >>>> directly loaded to reserved memory and after crash kexec simply jumps >>> >>> By "directly", I assume you mean "at the time of the kexec_laod() call", >>> right? >> >> Yes. > > Thanks. > > So, returning to the kexeec_segment structure: > > struct kexec_segment { > void *buf; /* Buffer in user space */ > size_t bufsz; /* Buffer length in user space */ > void *mem; /* Physical address of kernel */ > size_t memsz; /* Physical address length */ > }; > > Are the following statements correct: > * buf + bufsz identify a memory region in the caller's virtual > address space that is the source of the copy > * mem + memsz specify the target memory region of the copy > * mem is physical memory address, as seen from kernel space > * the number of bytes copied from userspace is min(bufsz, memsz) > * if bufsz > memsz, then excess bytes in the user-space buffer > are ignored. 
> * if memsz > bufsz, then excess bytes in the target kernel buffer > are filled with zeros. > ? > > Also, it seems to me that 'mem' need not be page aligned. > Is that correct? Should the man page say something about that? > (E.g., is it generally desirable that 'mem' should be page aligned?) > > Likewise, 'memsz' doesn't need to be a page multiple, IIUC. > Should the man page say anything about this? For example, should > it note that the initialized kernel segment will be of size: > > (mem % PAGE_SIZE + memsz) rounded up to the next multiple of PAGE_SIZE > > And should it note that if 'mem' is not a multiple of the page size, then > the initial bytes (mem % PAGE_SIZE)) in the first page of the kernel segment > will be zeros? > > (Hopefully I have read kimage_load_normal_segment() correctly.) > > And one further question. Other than the fact that they are used with > different system calls, what is the difference between KEXEC_ON_CRASH > and KEXEC_FILE_ON_CRASH? > > Thanks, > > Michael > > -- > Michael Kerrisk > Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/ > Linux/UNIX System Programming Training: http://man7.org/training/ -- Michael Kerrisk Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/ Linux/UNIX System Programming Training: http://man7.org/training/ ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Edited kexec_load(2) [kexec_file_load()] man page for review 2015-01-16 13:30 ` Michael Kerrisk (man-pages) 2015-01-27 8:07 ` Michael Kerrisk (man-pages) @ 2015-01-27 14:24 ` Vivek Goyal 2015-01-28 8:04 ` Michael Kerrisk (man-pages) 1 sibling, 1 reply; 19+ messages in thread From: Vivek Goyal @ 2015-01-27 14:24 UTC (permalink / raw) To: Michael Kerrisk (man-pages) Cc: lkml, linux-man, kexec, Andy Lutomirski, Dave Young, H. Peter Anvin, Borislav Petkov, Eric W. Biederman On Fri, Jan 16, 2015 at 02:30:25PM +0100, Michael Kerrisk (man-pages) wrote: [..] > Hi Michael, Please find my responses below. Sorry, I got stuck in other work and forgot about this thread. > So, returning to the kexeec_segment structure: > > struct kexec_segment { > void *buf; /* Buffer in user space */ > size_t bufsz; /* Buffer length in user space */ > void *mem; /* Physical address of kernel */ > size_t memsz; /* Physical address length */ > }; > > Are the following statements correct: > * buf + bufsz identify a memory region in the caller's virtual > address space that is the source of the copy Yes. > * mem + memsz specify the target memory region of the copy Yes. > * mem is physical memory address, as seen from kernel space Yes. > * the number of bytes copied from userspace is min(bufsz, memsz) Yes. bufsz can not be more than memsz. There is a check to validate this in kernel. result = -EINVAL; for (i = 0; i < nr_segments; i++) { if (image->segment[i].bufsz > image->segment[i].memsz) return result; } > * if bufsz > memsz, then excess bytes in the user-space buffer > are ignored. You will get -EINVAL. > * if memsz > bufsz, then excess bytes in the target kernel buffer > are filled with zeros. Yes. > Also, it seems to me that 'mem' need not be page aligned. > Is that correct? Should the man page say something about that? > (E.g., is it generally desirable that 'mem' should be page aligned?) mem and memsz need to be page aligned. There is a check for that too. 
mstart = image->segment[i].mem; mend = mstart + image->segment[i].memsz; if ((mstart & ~PAGE_MASK) || (mend & ~PAGE_MASK)) return result; > > Likewise, 'memsz' doesn't need to be a page multiple, IIUC. > Should the man page say anything about this? For example, should > it note that the initialized kernel segment will be of size: > > (mem % PAGE_SIZE + memsz) rounded up to the next multiple of PAGE_SIZE > > And should it note that if 'mem' is not a multiple of the page size, then > the initial bytes (mem % PAGE_SIZE)) in the first page of the kernel segment > will be zeros? > > (Hopefully I have read kimage_load_normal_segment() correctly.) Both mem and memsz need to be page aligned. > > And one further question. Other than the fact that they are used with > different system calls, what is the difference between KEXEC_ON_CRASH > and KEXEC_FILE_ON_CRASH? Right now I can't think of any other difference. They both tell their respective system call that this kernel needs to be loaded in the memory region reserved for the crash kernel. Thanks Vivek ^ permalink raw reply [flat|nested] 19+ messages in thread
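The two checks Vivek quotes in the message above (page alignment of mem and memsz, and bufsz not exceeding memsz) can be modelled in user space. The following is only a sketch under stated assumptions: struct seg and check_segments() are hypothetical names invented for illustration, the page size is passed in as a parameter rather than taken from the kernel's PAGE_SIZE, and the error codes are the ones confirmed in this thread (EADDRNOTAVAIL for misalignment, EINVAL for bufsz > memsz). It is not the kernel's sanity_check_segment_list(), which performs further checks such as overlap detection.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for struct kexec_segment;
 * 'mem' is held as an integer so that it can be masked. */
struct seg {
    const void *buf;   /* source buffer in the caller's address space */
    size_t bufsz;      /* number of bytes to copy from buf */
    uintptr_t mem;     /* target physical address */
    size_t memsz;      /* size of the target region */
};

/* Model of the two checks discussed in the thread.
 * Returns 0 on success or a negative errno-style value. */
int check_segments(const struct seg *segs, size_t nr, size_t page_size)
{
    assert(segs != NULL || nr == 0);
    /* page_size must be a nonzero power of two for the mask to work */
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

    /* Equivalent of the kernel's ~PAGE_MASK: the low-order offset bits */
    uintptr_t offset_bits = (uintptr_t)page_size - 1;

    /* Start and end of every target region must be page aligned. */
    for (size_t i = 0; i < nr; i++) {
        uintptr_t mstart = segs[i].mem;
        uintptr_t mend = mstart + segs[i].memsz;

        if ((mstart & offset_bits) || (mend & offset_bits))
            return -EADDRNOTAVAIL;
    }

    /* The source buffer may not be larger than the target region. */
    for (size_t i = 0; i < nr; i++) {
        if (segs[i].bufsz > segs[i].memsz)
            return -EINVAL;
    }

    return 0;
}
```

Checking alignment before the size comparison is a choice made for this sketch; a caller that wants to predict which errno it would get for a segment violating both rules should consult the kernel source rather than rely on this ordering.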
* Re: Edited kexec_load(2) [kexec_file_load()] man page for review 2015-01-27 14:24 ` Vivek Goyal @ 2015-01-28 8:04 ` Michael Kerrisk (man-pages) 2015-01-28 14:48 ` Vivek Goyal 0 siblings, 1 reply; 19+ messages in thread From: Michael Kerrisk (man-pages) @ 2015-01-28 8:04 UTC (permalink / raw) To: Vivek Goyal Cc: mtk.manpages, lkml, linux-man, kexec, Andy Lutomirski, Dave Young, H. Peter Anvin, Borislav Petkov, Eric W. Biederman Hi Vivek, On 01/27/2015 03:24 PM, Vivek Goyal wrote: > On Fri, Jan 16, 2015 at 02:30:25PM +0100, Michael Kerrisk (man-pages) wrote: > [..] >> > > Hi Michael, > > Please find my responses below. Sorry, I got stuck in other work and > forgot about this thread. > >> So, returning to the kexeec_segment structure: >> >> struct kexec_segment { >> void *buf; /* Buffer in user space */ >> size_t bufsz; /* Buffer length in user space */ >> void *mem; /* Physical address of kernel */ >> size_t memsz; /* Physical address length */ >> }; >> >> Are the following statements correct: >> * buf + bufsz identify a memory region in the caller's virtual >> address space that is the source of the copy > > Yes. Okay. >> * mem + memsz specify the target memory region of the copy > > Yes. Okay. >> * mem is physical memory address, as seen from kernel space > > Yes. Okay. >> * the number of bytes copied from userspace is min(bufsz, memsz) > > Yes. bufsz can not be more than memsz. There is a check to validate > this in kernel. > > result = -EINVAL; > for (i = 0; i < nr_segments; i++) { > if (image->segment[i].bufsz > image->segment[i].memsz) > return result; > } Okay. So it's more precise to leave discussion of min(bufz, memsz) out of the man page just to say: bufsz bytes are transferred; if bufsz < memsz, then the excess bytes in the target region are filled with zeros. Right? >> * if bufsz > memsz, then excess bytes in the user-space buffer >> are ignored. > > You will get -EINVAL. Okay. 
>> * if memsz > bufsz, then excess bytes in the target kernel buffer >> are filled with zeros. > > Yes. Okay. >> Also, it seems to me that 'mem' need not be page aligned. >> Is that correct? Should the man page say something about that? >> (E.g., is it generally desirable that 'mem' should be page aligned?) > > mem and memsz need to be page aligned. There is a check for that too. > > mstart = image->segment[i].mem; > mend = mstart + image->segment[i].memsz; > if ((mstart & ~PAGE_MASK) || (mend & ~PAGE_MASK)) > return result; > >> >> Likewise, 'memsz' doesn't need to beta page multiple, IIUC. >> Should the man page say anything about this? For example, should >> it note that the initialized kernel segment will be of size: >> >> (mem % PAGE_SIZE + memsz) rounded up to the next multiple of PAGE_SIZE >> >> And should it note that if 'mem' is not a multiple of the page size, then >> the initial bytes (mem % PAGE_SIZE)) in the first page of the kernel segment >> will be zeros? >> >> (Hopefully I have read kimage_load_normal_segment() correctly.) > > Both mem and memsz need to be page aligned. And the error if not is EADDRNOTAVAIL, right? >> And one further question. Other than the fact that they are used with >> different system calls, what is the difference between KEXEC_ON_CRASH >> and KEXEC_FILE_ON_CRASH? > > Right now I can't think of any other difference. They both tell respective > system call that this kernel needs to be loaded in reserved memory region > for crash kernel. Okay. I've made various adjustments to the page in the light of your comments above. Thanks! Cheers, Michael -- Michael Kerrisk Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/ Linux/UNIX System Programming Training: http://man7.org/training/ ^ permalink raw reply [flat|nested] 19+ messages in thread
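The copy rule agreed on in the message above (bufsz bytes are transferred, and if bufsz < memsz the remainder of the target region is zero-filled) can be illustrated with an ordinary memory buffer standing in for the target. This is a minimal sketch, not kernel code: load_segment_model() is a hypothetical name, the destination is plain user-space memory rather than physical memory, and the EINVAL return mirrors the bufsz > memsz check quoted earlier in the thread.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Model of how one segment is populated: copy bufsz bytes from src,
 * then zero the remaining (memsz - bufsz) bytes of the target.
 * Returns 0 on success or -EINVAL if the source is larger than the target. */
int load_segment_model(void *dst, size_t memsz, const void *src, size_t bufsz)
{
    assert(dst != NULL);
    assert(src != NULL || bufsz == 0);

    if (bufsz > memsz)
        return -EINVAL;          /* rejected up front, nothing is copied */

    if (bufsz > 0)
        memcpy(dst, src, bufsz); /* the bufsz bytes that are transferred */

    /* Excess bytes in the target region are filled with zeros. */
    memset((unsigned char *)dst + bufsz, 0, memsz - bufsz);
    return 0;
}
```

Because oversized sources are rejected rather than truncated, there is no case in which min(bufsz, memsz) differs from bufsz on a successful load, which is why the simpler wording was proposed for the man page.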
* Re: Edited kexec_load(2) [kexec_file_load()] man page for review 2015-01-28 8:04 ` Michael Kerrisk (man-pages) @ 2015-01-28 14:48 ` Vivek Goyal 2015-01-28 15:49 ` Michael Kerrisk (man-pages) 0 siblings, 1 reply; 19+ messages in thread From: Vivek Goyal @ 2015-01-28 14:48 UTC (permalink / raw) To: Michael Kerrisk (man-pages) Cc: lkml, linux-man, kexec, Andy Lutomirski, Dave Young, H. Peter Anvin, Borislav Petkov, Eric W. Biederman On Wed, Jan 28, 2015 at 09:04:38AM +0100, Michael Kerrisk (man-pages) wrote: Hi Michael, [..] > >> * the number of bytes copied from userspace is min(bufsz, memsz) > > > > Yes. bufsz can not be more than memsz. There is a check to validate > > this in kernel. > > > > result = -EINVAL; > > for (i = 0; i < nr_segments; i++) { > > if (image->segment[i].bufsz > image->segment[i].memsz) > > return result; > > } > > Okay. So it's more precise to leave discussion of min(bufz, memsz) > out of the man page just to say: bufsz bytes are transferred; > if bufsz < memsz, then the excess bytes in the target region are > filled with zeros. Right? Sounds good. [..] > > Both mem and memsz need to be page aligned. > > And the error if not is EADDRNOTAVAIL, right? Yes. > > >> And one further question. Other than the fact that they are used with > >> different system calls, what is the difference between KEXEC_ON_CRASH > >> and KEXEC_FILE_ON_CRASH? > > > > Right now I can't think of any other difference. They both tell respective > > system call that this kernel needs to be loaded in reserved memory region > > for crash kernel. > > Okay. > > I've made various adjustments to the page in the light of your comments > above. Thanks! Thank you for following it up and improving kexec man page. Thanks Vivek ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Edited kexec_load(2) [kexec_file_load()] man page for review 2015-01-28 14:48 ` Vivek Goyal @ 2015-01-28 15:49 ` Michael Kerrisk (man-pages) 2015-01-28 20:34 ` Vivek Goyal 0 siblings, 1 reply; 19+ messages in thread From: Michael Kerrisk (man-pages) @ 2015-01-28 15:49 UTC (permalink / raw) To: Vivek Goyal Cc: lkml, linux-man, Kexec Mailing List, Andy Lutomirski, Dave Young, H. Peter Anvin, Borislav Petkov, Eric W. Biederman, Andi Kleen [Dropping Andi into CC, which I should have done to start with, since he wrote the original page, and might also have some comments] Hello Vivek, >> I've made various adjustments to the page in the light of your comments >> above. Thanks! > > Thank you for following it up and improving kexec man page. You're welcome. So, by now, I've made quite a lot of changes (including adding a number of cases under ERRORS). I think the revised kexec_load/kexec_file_load page is pretty much ready to go, but would you be willing to give the text below a check over first? Thanks Michael ==== .\" Copyright (C) 2010 Intel Corporation, Author: Andi Kleen .\" and Copyright 2014, Vivek Goyal <vgoyal@redhat.com> .\" and Copyright (c) 2015, Michael Kerrisk <mtk.manpages@gmail.com> .\" .\" %%%LICENSE_START(VERBATIM) .\" Permission is granted to make and distribute verbatim copies of this .\" manual provided the copyright notice and this permission notice are .\" preserved on all copies. .\" .\" Permission is granted to copy and distribute modified versions of this .\" manual under the conditions for verbatim copying, provided that the .\" entire resulting derived work is distributed under the terms of a .\" permission notice identical to this one. .\" .\" Since the Linux kernel and libraries are constantly changing, this .\" manual page may be incorrect or out-of-date. The author(s) assume no .\" responsibility for errors or omissions, or for damages resulting from .\" the use of the information contained herein. 
The author(s) may not .\" have taken the same level of care in the production of this manual, .\" which is licensed free of charge, as they might when working .\" professionally. .\" .\" Formatted or processed versions of this manual, if unaccompanied by .\" the source, must acknowledge the copyright and authors of this work. .\" %%%LICENSE_END .\" .TH KEXEC_LOAD 2 2014-08-19 "Linux" "Linux Programmer's Manual" .SH NAME kexec_load, kexec_file_load \- load a new kernel for later execution .SH SYNOPSIS .nf .B #include <linux/kexec.h> .BI "long kexec_load(unsigned long " entry ", unsigned long " nr_segments "," .BI " struct kexec_segment *" segments \ ", unsigned long " flags ");" .BI "long kexec_file_load(int " kernel_fd ", int " initrd_fd "," .br .BI " unsigned long " cmdline_len \ ", const char *" cmdline "," .BI " unsigned long " flags ");" .fi .IR Note : There are no glibc wrappers for these system calls; see NOTES. .SH DESCRIPTION The .BR kexec_load () system call loads a new kernel that can be executed later by .BR reboot (2). .PP The .I flags argument is a bit mask that controls the operation of the call. The following values can be specified in .IR flags : .TP .BR KEXEC_ON_CRASH " (since Linux 2.6.13)" Execute the new kernel automatically on a system crash. This "crash kernel" is loaded into an area of reserved memory that is determined at boot time using the .I crashkernel kernel command-line parameter. The location of this reserved memory is exported to user space via the .I /proc/iomem file, in an entry labeled "Crash kernel". A user-space application can parse this file and prepare a list of segments (see below) that specify this reserved memory as the destination. If this flag is specified, the kernel checks that the target segments specified in .I segments fall within the reserved region. .TP .BR KEXEC_PRESERVE_CONTEXT " (since Linux 2.6.27)" Preserve the system hardware and software states before executing the new kernel.
This could be used for system suspend. This flag is available only if the kernel was configured with .BR CONFIG_KEXEC_JUMP , and is effective only if .I nr_segments is greater than 0. .PP The high-order bits (corresponding to the mask 0xffff0000) of .I flags contain the architecture of the to-be-executed kernel. Specify (OR) the constant .B KEXEC_ARCH_DEFAULT to use the current architecture, or one of the following architecture constants .BR KEXEC_ARCH_386 , .BR KEXEC_ARCH_68K , .BR KEXEC_ARCH_X86_64 , .BR KEXEC_ARCH_PPC , .BR KEXEC_ARCH_PPC64 , .BR KEXEC_ARCH_IA_64 , .BR KEXEC_ARCH_ARM , .BR KEXEC_ARCH_S390 , .BR KEXEC_ARCH_SH , .BR KEXEC_ARCH_MIPS , and .BR KEXEC_ARCH_MIPS_LE . The architecture must be executable on the CPU of the system. The .I entry argument is the physical entry address in the kernel image. The .I nr_segments argument is the number of segments pointed to by the .I segments pointer; the kernel imposes an (arbitrary) limit of 16 on the number of segments. The .I segments argument is an array of .I kexec_segment structures which define the kernel layout: .in +4n .nf struct kexec_segment { void *buf; /* Buffer in user space */ size_t bufsz; /* Buffer length in user space */ void *mem; /* Physical address of kernel */ size_t memsz; /* Physical address length */ }; .fi .in .PP The kernel image defined by .I segments is copied from the calling process into the kernel either in regular memory or in reserved memory (if .BR KEXEC_ON_CRASH is set). The kernel first performs various sanity checks on the information passed in .IR segments . If these checks pass, the kernel copies the segment data to kernel memory. Each segment specified in .I segments is copied as follows: .IP * 3 .I buf and .I bufsz identify a memory region in the caller's virtual address space that is the source of the copy. The value in .I bufsz may not exceed the value in the .I memsz field. .IP * .I mem and .I memsz specify a physical address range that is the target of the copy. 
The values specified in both fields must be multiples of the system page size. .IP * .I bufsz bytes are copied from the source buffer to the target kernel buffer. If .I bufsz is less than .IR memsz , then the excess bytes in the kernel buffer are zeroed out. .PP In case of a normal kexec (i.e., the .BR KEXEC_ON_CRASH flag is not set), the segment data is loaded in any available memory and is moved to the final destination at kexec reboot time (e.g., when the .BR kexec (8) command is executed with the .I \-e option). In case of kexec on panic (i.e., the .BR KEXEC_ON_CRASH flag is set), the segment data is loaded to reserved memory at the time of the call, and, after a crash, the kexec mechanism simply passes control to that kernel. The .BR kexec_load () system call is available only if the kernel was configured with .BR CONFIG_KEXEC . .SS kexec_file_load() The .BR kexec_file_load () system call is similar to .BR kexec_load (), but it takes a different set of arguments. It reads the kernel to be loaded from the file referred to by the descriptor .IR kernel_fd , and the initrd (initial RAM disk) to be loaded from file referred to by the descriptor .IR initrd_fd . The .IR cmdline argument is a pointer to a buffer containing the command line for the new kernel. The .IR cmdline_len argument specifies size of the buffer. The last byte in the buffer must be a null byte (\(aq\\0\(aq). The .IR flags argument is a bit mask which modifies the behavior of the call. The following values can be specified in .IR flags : .TP .BR KEXEC_FILE_UNLOAD Unload the currently loaded kernel. .TP .BR KEXEC_FILE_ON_CRASH Load the new kernel in the memory region reserved for the crash kernel (as for .BR KEXEC_ON_CRASH). This kernel is booted if the currently running kernel crashes. .TP .BR KEXEC_FILE_NO_INITRAMFS Loading initrd/initramfs is optional. Specify this flag if no initramfs is being loaded. If this flag is set, the value passed in .IR initrd_fd is ignored. 
.PP The .BR kexec_file_load () .\" See also http://lwn.net/Articles/603116/ system call was added to provide support for systems where "kexec" loading should be restricted to only kernels that are signed. This system call is available only if the kernel was configured with .BR CONFIG_KEXEC_FILE . .SH RETURN VALUE On success, these system calls return 0. On error, \-1 is returned and .I errno is set to indicate the error. .SH ERRORS .TP .B EADDRNOTAVAIL .\" See kernel/kexec.c::sanity_check_segment_list in the 3.19 kernel source The .B KEXEC_ON_CRASH flag was specified, but the region specified by the .I mem and .I memsz fields of one of the .I segments entries lies outside the range of memory reserved for the crash kernel. .TP .B EADDRNOTAVAIL The value in a .I mem or .I memsz field in one of the .I segments entries is not a multiple of the system page size. .TP .B EBADF .I kernel_fd or .I initrd_fd is not a valid file descriptor. .TP .B EBUSY Another crash kernel is already being loaded or a crash kernel is already in use. .TP .B EINVAL .I flags is invalid. .TP .B EINVAL The value of a .I bufsz field in one of the .I segments entries exceeds the value in the corresponding .I memsz field. .TP .B EINVAL .IR nr_segments exceeds .BR KEXEC_SEGMENT_MAX (16). .TP .B EINVAL Two or more of the kernel target buffers overlap. .TP .B EINVAL The value in .I cmdline[cmdline_len-1] is not \(aq\\0\(aq. .TP .B EINVAL The file referred to by .I kernel_fd or .I initrd_fd is empty (length zero). .TP .B ENOMEM Could not allocate memory. .TP .B ENOEXEC .I kernel_fd does not refer to an open file, or the kernel can't load this file. .TP .B EPERM The caller does not have the .BR CAP_SYS_BOOT capability. .SH VERSIONS The .BR kexec_load () system call first appeared in Linux 2.6.13. The .BR kexec_file_load () system call first appeared in Linux 3.17. .SH CONFORMING TO These system calls are Linux-specific. .SH NOTES Currently, there is no glibc support for these system calls.
Call them using .BR syscall (2). .SH SEE ALSO .BR reboot (2), .BR syscall (2), .BR kexec (8) The kernel source files .IR Documentation/kdump/kdump.txt and .IR Documentation/kernel-parameters.txt . -- Michael Kerrisk Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/ Linux/UNIX System Programming Training: http://man7.org/training/ ^ permalink raw reply [flat|nested] 19+ messages in thread
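The draft above says that the reserved region is exported through /proc/iomem in an entry labeled "Crash kernel", and that a user-space application can parse that file to choose segment destinations. A minimal sketch of such a parser follows. It rests on an assumption not stated in the thread: that each line has the form "start-end : label", with both addresses in hexadecimal and the end address inclusive. The function names parse_crash_kernel_line() and find_crash_kernel() are hypothetical, and this is not how kexec(8) is implemented.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Parse one /proc/iomem-style line of the assumed form
 *     "  2c000000-33ffffff : Crash kernel"
 * Returns 1 and stores the range if the label is "Crash kernel",
 * otherwise returns 0 and leaves *start and *end untouched. */
int parse_crash_kernel_line(const char *line,
                            unsigned long long *start,
                            unsigned long long *end)
{
    static const char label[] = "Crash kernel";
    unsigned long long s, e;
    int consumed = 0;

    assert(line != NULL && start != NULL && end != NULL);

    /* %n records where the label begins; it is not counted in the result. */
    if (sscanf(line, " %llx-%llx : %n", &s, &e, &consumed) != 2)
        return 0;
    if (consumed <= 0)
        return 0;   /* the " : " separator was not found */
    if (strncmp(line + consumed, label, sizeof label - 1) != 0)
        return 0;

    *start = s;
    *end = e;
    return 1;
}

/* Scan an already opened stream (for example /proc/iomem) for the
 * first "Crash kernel" entry. Returns 1 if found, 0 otherwise. */
int find_crash_kernel(FILE *fp,
                      unsigned long long *start,
                      unsigned long long *end)
{
    char line[256];

    assert(fp != NULL);

    while (fgets(line, sizeof line, fp) != NULL) {
        if (parse_crash_kernel_line(line, start, end))
            return 1;
    }
    return 0;
}
```

A caller would open /proc/iomem with fopen(), pass the stream to find_crash_kernel(), and then place page-aligned kexec_segment.mem values inside the returned range when loading with KEXEC_ON_CRASH.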
* Re: Edited kexec_load(2) [kexec_file_load()] man page for review 2015-01-28 15:49 ` Michael Kerrisk (man-pages) @ 2015-01-28 20:34 ` Vivek Goyal 2015-01-28 21:14 ` Scot Doyle 0 siblings, 1 reply; 19+ messages in thread From: Vivek Goyal @ 2015-01-28 20:34 UTC (permalink / raw) To: Michael Kerrisk (man-pages) Cc: lkml, linux-man, Kexec Mailing List, Andy Lutomirski, Dave Young, H. Peter Anvin, Borislav Petkov, Eric W. Biederman, Andi Kleen On Wed, Jan 28, 2015 at 04:49:34PM +0100, Michael Kerrisk (man-pages) wrote: > [Dropping Andi into CC, which I should have done to start with, since > he wrote the original page, and might also have some comments] > > Hello Vivek, > > >> I've made various adjustments to the page in the light of your comments > >> above. Thanks! > > > > Thank you for following it up and improving kexec man page. > > You're welcome. So, by now, I've made quite a lot of changes > (including adding a number of cases under ERRORS). I think the revised > kexec_load/kexec_file_load page is pretty much ready to go, but would > you be willing to give the text below a check over first? > Hi Michael, I had a quick look and it looks good to me. Thanks Vivek ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Edited kexec_load(2) [kexec_file_load()] man page for review 2015-01-28 20:34 ` Vivek Goyal @ 2015-01-28 21:14 ` Scot Doyle 2015-01-28 21:31 ` Vivek Goyal 0 siblings, 1 reply; 19+ messages in thread From: Scot Doyle @ 2015-01-28 21:14 UTC (permalink / raw) To: Michael Kerrisk (man-pages), Vivek Goyal Cc: lkml, linux-man, Kexec Mailing List, Andy Lutomirski, Dave Young, H. Peter Anvin, Borislav Petkov, Eric W. Biederman, Andi Kleen On Wed, 28 Jan 2015, Vivek Goyal wrote: > On Wed, Jan 28, 2015 at 04:49:34PM +0100, Michael Kerrisk (man-pages) wrote: > > Hello Vivek, > > > > >> I've made various adjustments to the page in the light of your comments > > >> above. Thanks! > > > > > > Thank you for following it up and improving kexec man page. > > > > You're welcome. So, by now, I've made quite a lot of changes > > (including adding a number of cases under ERRORS). I think the revised > > kexec_load/kexec_file_load page is pretty much ready to go, but would > > you be willing to give the text below a check over first? > > > > Hi Michael, > > I had a quick look and it looks good to me. > > Thanks > Vivek When I tested, kexec_file_load required CONFIG_RELOCATABLE. Is the same true for kexec_load? Would it make sense to note this in the man pages along with the need for CONFIG_KEXEC_FILE, etc? Or as an error message? Thanks, Scot ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Edited kexec_load(2) [kexec_file_load()] man page for review 2015-01-28 21:14 ` Scot Doyle @ 2015-01-28 21:31 ` Vivek Goyal 2015-01-28 22:10 ` Scot Doyle 0 siblings, 1 reply; 19+ messages in thread From: Vivek Goyal @ 2015-01-28 21:31 UTC (permalink / raw) To: Scot Doyle Cc: Michael Kerrisk (man-pages), lkml, linux-man, Kexec Mailing List, Andy Lutomirski, Dave Young, H. Peter Anvin, Borislav Petkov, Eric W. Biederman, Andi Kleen On Wed, Jan 28, 2015 at 09:14:03PM +0000, Scot Doyle wrote: > On Wed, 28 Jan 2015, Vivek Goyal wrote: > > On Wed, Jan 28, 2015 at 04:49:34PM +0100, Michael Kerrisk (man-pages) wrote: > > > Hello Vivek, > > > > > > >> I've made various adjustments to the page in the light of your comments > > > >> above. Thanks! > > > > > > > > Thank you for following it up and improving kexec man page. > > > > > > You're welcome. So, by now, I've made quite a lot of changes > > > (including adding a number of cases under ERRORS). I think the revised > > > kexec_load/kexec_file_load page is pretty much ready to go, but would > > > you be willing to give the text below a check over first? > > > > > > > Hi Michael, > > > > I had a quick look and it looks good to me. > > > > Thanks > > Vivek > > When I tested, kexec_file_load required CONFIG_RELOCATABLE. Is the same > true for kexec_load? Would it make sense to note this in the man pages > along with the need for CONFIG_KEXEC_FILE, etc? Or as an error message? Hmm, I can't see an explicit dependency between RELOCATABLE and KEXEC. Both KEXEC and KEXEC_FILE should be able to load a kernel even if it was built with RELOCATABLE=n. That kernel will simply run from the address it was built for. Thanks Vivek ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Edited kexec_load(2) [kexec_file_load()] man page for review 2015-01-28 21:31 ` Vivek Goyal @ 2015-01-28 22:10 ` Scot Doyle 2015-01-28 22:25 ` Vivek Goyal 0 siblings, 1 reply; 19+ messages in thread From: Scot Doyle @ 2015-01-28 22:10 UTC (permalink / raw) To: Vivek Goyal Cc: Michael Kerrisk (man-pages), lkml, linux-man, Kexec Mailing List, Andy Lutomirski, Dave Young, H. Peter Anvin, Borislav Petkov, Eric W. Biederman, Andi Kleen On Wed, 28 Jan 2015, Vivek Goyal wrote: > On Wed, Jan 28, 2015 at 09:14:03PM +0000, Scot Doyle wrote: > > On Wed, 28 Jan 2015, Vivek Goyal wrote: > > > On Wed, Jan 28, 2015 at 04:49:34PM +0100, Michael Kerrisk (man-pages) wrote: > > > > Hello Vivek, > > > > > > > > >> I've made various adjustments to the page in the light of your comments > > > > >> above. Thanks! > > > > > > > > > > Thank you for following it up and improving kexec man page. > > > > > > > > You're welcome. So, by now, I've made quite a lot of changes > > > > (including adding a number of cases under ERRORS). I think the revised > > > > kexec_load/kexec_file_load page is pretty much ready to go, but would > > > > you be willing to give the text below a check over first? > > > > > > > > > > Hi Michael, > > > > > > I had a quick look and it looks good to me. > > > > > > Thanks > > > Vivek > > > > When I tested, kexec_file_load required CONFIG_RELOCATABLE. Is the same > > true for kexec_load? Would it make sense to note this in the man pages > > along with the need for CONFIG_KEXEC_FILE, etc? Or as an error message? > > Hmm.., I can't see an explicity dependency between RELOCATABLE and > KEXEC. Both KEXEC and KEXEC_FILE should be able to load a kernel > even if it had RELOCATABLE=n. > > Just that kernel will run from the address it has been built for. > > Thanks > Vivek Confusing, right? kexec_file_load returns -ENOEXEC and dmesg says "kexec-bzImage64: XLF_CAN_BE_LOADED_ABOVE_4G is not set." 
which leads to arch/x86/boot/header.S line 396: #if defined(CONFIG_RELOCATABLE) && defined(CONFIG_X86_64) /* kernel/boot_param/ramdisk could be loaded above 4g */ # define XLF1 XLF_CAN_BE_LOADED_ABOVE_4G #else # define XLF1 0 #endif ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Edited kexec_load(2) [kexec_file_load()] man page for review
From: Vivek Goyal @ 2015-01-28 22:25 UTC
To: Scot Doyle
Cc: Michael Kerrisk (man-pages), lkml, linux-man, Kexec Mailing List, Andy Lutomirski, Dave Young, H. Peter Anvin, Borislav Petkov, Eric W. Biederman, Andi Kleen

On Wed, Jan 28, 2015 at 10:10:59PM +0000, Scot Doyle wrote:
> On Wed, 28 Jan 2015, Vivek Goyal wrote:
> > On Wed, Jan 28, 2015 at 09:14:03PM +0000, Scot Doyle wrote:
> > > On Wed, 28 Jan 2015, Vivek Goyal wrote:
> > > > On Wed, Jan 28, 2015 at 04:49:34PM +0100, Michael Kerrisk (man-pages) wrote:
> > > > > Hello Vivek,
> > > > >
> > > > > >> I've made various adjustments to the page in the light of your comments
> > > > > >> above. Thanks!
> > > > > >
> > > > > > Thank you for following it up and improving kexec man page.
> > > > >
> > > > > You're welcome. So, by now, I've made quite a lot of changes
> > > > > (including adding a number of cases under ERRORS). I think the revised
> > > > > kexec_load/kexec_file_load page is pretty much ready to go, but would
> > > > > you be willing to give the text below a check over first?
> > > > >
> > > >
> > > > Hi Michael,
> > > >
> > > > I had a quick look and it looks good to me.
> > > >
> > > > Thanks
> > > > Vivek
> > >
> > > When I tested, kexec_file_load required CONFIG_RELOCATABLE. Is the same
> > > true for kexec_load? Would it make sense to note this in the man pages
> > > along with the need for CONFIG_KEXEC_FILE, etc? Or as an error message?
> >
> > Hmm.., I can't see an explicit dependency between RELOCATABLE and
> > KEXEC. Both KEXEC and KEXEC_FILE should be able to load a kernel
> > even if it had RELOCATABLE=n.
> >
> > Just that kernel will run from the address it has been built for.
> >
> > Thanks
> > Vivek
>
> Confusing, right? kexec_file_load returns -ENOEXEC and dmesg says
> "kexec-bzImage64: XLF_CAN_BE_LOADED_ABOVE_4G is not set." which leads to
> arch/x86/boot/header.S line 396:
>
> #if defined(CONFIG_RELOCATABLE) && defined(CONFIG_X86_64)
>    /* kernel/boot_param/ramdisk could be loaded above 4g */
> # define XLF1 XLF_CAN_BE_LOADED_ABOVE_4G
> #else
> # define XLF1 0
> #endif

Ah, this one. Actually generic kexec file loading implementation does not
impose this restriction. It is the image specific loader part which
decides what kind of bzImage it can load.

Current implementation (kexec-bzimage64.c), is only supporting loading
bzImages which are 64bit and can be loaded above 4G. This simplifies
the implementation of loader.

But there is nothing which prevents one from implementing other image
loaders.

So instead of saying that kexec_file_load() depends on CONFIG_RELOCATABLE,
it might be better to say in man page that currently this system call
supports only loading a bzImage which is 64bit and which can be loaded
above 4G too.

Thanks
Vivek
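The restriction described above can be checked from user space before calling kexec_file_load(). The following is a minimal sketch, not the kernel's own probe code: it assumes the header offsets and flag bits given in Documentation/x86/boot.txt (magic "HdrS" at 0x202, protocol version at 0x206, xloadflags at 0x236, introduced in protocol 2.12), and the function name bzimage_loadable_above_4g() is made up for illustration. The in-kernel loader performs further checks that are not reproduced here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Offsets into the bzImage setup header, per Documentation/x86/boot.txt. */
#define HDR_MAGIC_OFFSET      0x202   /* "HdrS" signature */
#define HDR_VERSION_OFFSET    0x206   /* boot protocol version (little-endian) */
#define HDR_XLOADFLAGS_OFFSET 0x236   /* xloadflags, protocol 2.12 and later */

#define XLF_KERNEL_64              (1 << 0)
#define XLF_CAN_BE_LOADED_ABOVE_4G (1 << 1)

/* Read a 16-bit little-endian value regardless of host byte order. */
static uint16_t read_le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Hypothetical helper: returns 1 if buf looks like a bzImage whose header
 * advertises a 64-bit kernel that can be loaded above 4G, 0 otherwise.
 * A result of 0 suggests kexec_file_load() would fail with ENOEXEC.
 */
int bzimage_loadable_above_4g(const unsigned char *buf, size_t len)
{
    uint16_t xlf;

    if (len < HDR_XLOADFLAGS_OFFSET + 2)
        return 0;                               /* too short to hold the header */
    if (memcmp(buf + HDR_MAGIC_OFFSET, "HdrS", 4) != 0)
        return 0;                               /* not a Linux boot header */
    if (read_le16(buf + HDR_VERSION_OFFSET) < 0x020c)
        return 0;                               /* xloadflags needs protocol 2.12 */

    xlf = read_le16(buf + HDR_XLOADFLAGS_OFFSET);
    return (xlf & XLF_KERNEL_64) && (xlf & XLF_CAN_BE_LOADED_ABOVE_4G);
}
```

A caller would read the first 0x238 bytes of the kernel image into a buffer and pass them to this helper; a kernel built without CONFIG_RELOCATABLE would be expected to fail the last test, matching the dmesg message quoted in the thread.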
* Re: Edited kexec_load(2) [kexec_file_load()] man page for review
From: Scot Doyle @ 2015-01-29 1:27 UTC
To: Vivek Goyal, Michael Kerrisk (man-pages)
Cc: lkml, linux-man, Kexec Mailing List, Andy Lutomirski, Dave Young, H. Peter Anvin, Borislav Petkov, Eric W. Biederman, Andi Kleen

On Wed, 28 Jan 2015, Vivek Goyal wrote:
> On Wed, Jan 28, 2015 at 10:10:59PM +0000, Scot Doyle wrote:
> > On Wed, 28 Jan 2015, Vivek Goyal wrote:
> > > On Wed, Jan 28, 2015 at 09:14:03PM +0000, Scot Doyle wrote:
> > > > When I tested, kexec_file_load required CONFIG_RELOCATABLE. Is the same
> > > > true for kexec_load? Would it make sense to note this in the man pages
> > > > along with the need for CONFIG_KEXEC_FILE, etc? Or as an error message?
> > >
> > > Hmm.., I can't see an explicit dependency between RELOCATABLE and
> > > KEXEC. Both KEXEC and KEXEC_FILE should be able to load a kernel
> > > even if it had RELOCATABLE=n.
> > >
> > > Just that kernel will run from the address it has been built for.
> > >
> > > Thanks
> > > Vivek
> >
> > Confusing, right? kexec_file_load returns -ENOEXEC and dmesg says
> > "kexec-bzImage64: XLF_CAN_BE_LOADED_ABOVE_4G is not set." which leads to
> > arch/x86/boot/header.S line 396:
> >
> > #if defined(CONFIG_RELOCATABLE) && defined(CONFIG_X86_64)
> >    /* kernel/boot_param/ramdisk could be loaded above 4g */
> > # define XLF1 XLF_CAN_BE_LOADED_ABOVE_4G
> > #else
> > # define XLF1 0
> > #endif
>
> Ah, this one. Actually generic kexec file loading implementation does not
> impose this restriction. It is the image specific loader part which
> decides what kind of bzImage it can load.
>
> Current implementation (kexec-bzimage64.c), is only supporting loading
> bzImages which are 64bit and can be loaded above 4G. This simplifies
> the implementation of loader.
>
> But there is nothing which prevents one from implementing other image
> loaders.
>
> So instead of saying that kexec_file_load() depends on CONFIG_RELOCATABLE,
> it might be better to say in man page that currently this system call
> supports only loading a bzImage which is 64bit and which can be loaded
> above 4G too.
>
> Thanks
> Vivek

Thanks, I agree, and think it would make sense to list them as part of the
page's ENOEXEC error.
* Re: Edited kexec_load(2) [kexec_file_load()] man page for review
From: Michael Kerrisk (man-pages) @ 2015-01-29 5:39 UTC
To: Scot Doyle
Cc: Vivek Goyal, lkml, linux-man, Kexec Mailing List, Andy Lutomirski, Dave Young, H. Peter Anvin, Borislav Petkov, Eric W. Biederman, Andi Kleen

On 29 January 2015 at 02:27, Scot Doyle <lkml14@scotdoyle.com> wrote:
> On Wed, 28 Jan 2015, Vivek Goyal wrote:
>> On Wed, Jan 28, 2015 at 10:10:59PM +0000, Scot Doyle wrote:
>> > On Wed, 28 Jan 2015, Vivek Goyal wrote:
>> > > On Wed, Jan 28, 2015 at 09:14:03PM +0000, Scot Doyle wrote:
>> > > > When I tested, kexec_file_load required CONFIG_RELOCATABLE. Is the same
>> > > > true for kexec_load? Would it make sense to note this in the man pages
>> > > > along with the need for CONFIG_KEXEC_FILE, etc? Or as an error message?
>> > >
>> > > Hmm.., I can't see an explicit dependency between RELOCATABLE and
>> > > KEXEC. Both KEXEC and KEXEC_FILE should be able to load a kernel
>> > > even if it had RELOCATABLE=n.
>> > >
>> > > Just that kernel will run from the address it has been built for.
>> > >
>> > > Thanks
>> > > Vivek
>> >
>> > Confusing, right? kexec_file_load returns -ENOEXEC and dmesg says
>> > "kexec-bzImage64: XLF_CAN_BE_LOADED_ABOVE_4G is not set." which leads to
>> > arch/x86/boot/header.S line 396:
>> >
>> > #if defined(CONFIG_RELOCATABLE) && defined(CONFIG_X86_64)
>> >    /* kernel/boot_param/ramdisk could be loaded above 4g */
>> > # define XLF1 XLF_CAN_BE_LOADED_ABOVE_4G
>> > #else
>> > # define XLF1 0
>> > #endif
>>
>> Ah, this one. Actually generic kexec file loading implementation does not
>> impose this restriction. It is the image specific loader part which
>> decides what kind of bzImage it can load.
>>
>> Current implementation (kexec-bzimage64.c), is only supporting loading
>> bzImages which are 64bit and can be loaded above 4G. This simplifies
>> the implementation of loader.
>>
>> But there is nothing which prevents one from implementing other image
>> loaders.
>>
>> So instead of saying that kexec_file_load() depends on CONFIG_RELOCATABLE,
>> it might be better to say in man page that currently this system call
>> supports only loading a bzImage which is 64bit and which can be loaded
>> above 4G too.
>>
>> Thanks
>> Vivek
>
> Thanks, I agree, and think it would make sense to list them as part of the
> page's ENOEXEC error.

Scot, could you then phrase a couple of sentences that capture the
details, so I can add it to the ENOEXEC error?

Thanks,

Michael

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
* Re: Edited kexec_load(2) [kexec_file_load()] man page for review
From: Scot Doyle @ 2015-01-29 16:06 UTC
To: Michael Kerrisk (man-pages)
Cc: Vivek Goyal, lkml, linux-man, Kexec Mailing List, Andy Lutomirski, Dave Young, H. Peter Anvin, Borislav Petkov, Eric W. Biederman, Andi Kleen

On Thu, 29 Jan 2015, Michael Kerrisk (man-pages) wrote:
> On 29 January 2015 at 02:27, Scot Doyle <lkml14@scotdoyle.com> wrote:
> > On Wed, 28 Jan 2015, Vivek Goyal wrote:
> >> On Wed, Jan 28, 2015 at 10:10:59PM +0000, Scot Doyle wrote:
> >> > On Wed, 28 Jan 2015, Vivek Goyal wrote:
> >> > > On Wed, Jan 28, 2015 at 09:14:03PM +0000, Scot Doyle wrote:
> >> > > > When I tested, kexec_file_load required CONFIG_RELOCATABLE. Is the same
> >> > > > true for kexec_load? Would it make sense to note this in the man pages
> >> > > > along with the need for CONFIG_KEXEC_FILE, etc? Or as an error message?
> >> > >
> >> > > Hmm.., I can't see an explicit dependency between RELOCATABLE and
> >> > > KEXEC. Both KEXEC and KEXEC_FILE should be able to load a kernel
> >> > > even if it had RELOCATABLE=n.
> >> > >
> >> > > Just that kernel will run from the address it has been built for.
> >> > >
> >> > > Thanks
> >> > > Vivek
> >> >
> >> > Confusing, right? kexec_file_load returns -ENOEXEC and dmesg says
> >> > "kexec-bzImage64: XLF_CAN_BE_LOADED_ABOVE_4G is not set." which leads to
> >> > arch/x86/boot/header.S line 396:
> >> >
> >> > #if defined(CONFIG_RELOCATABLE) && defined(CONFIG_X86_64)
> >> >    /* kernel/boot_param/ramdisk could be loaded above 4g */
> >> > # define XLF1 XLF_CAN_BE_LOADED_ABOVE_4G
> >> > #else
> >> > # define XLF1 0
> >> > #endif
> >>
> >> Ah, this one. Actually generic kexec file loading implementation does not
> >> impose this restriction. It is the image specific loader part which
> >> decides what kind of bzImage it can load.
> >>
> >> Current implementation (kexec-bzimage64.c), is only supporting loading
> >> bzImages which are 64bit and can be loaded above 4G. This simplifies
> >> the implementation of loader.
> >>
> >> But there is nothing which prevents one from implementing other image
> >> loaders.
> >>
> >> So instead of saying that kexec_file_load() depends on CONFIG_RELOCATABLE,
> >> it might be better to say in man page that currently this system call
> >> supports only loading a bzImage which is 64bit and which can be loaded
> >> above 4G too.
> >>
> >> Thanks
> >> Vivek
> >
> > Thanks, I agree, and think it would make sense to list them as part of the
> > page's ENOEXEC error.
>
> Scot, could you then phrase a couple of sentences that capture the
> details, so I can add it to the ENOEXEC error?
>
> Thanks,
>
> Michael

Yes, maybe something like "kernel_fd does not refer to an open file, or
the file type is not supported. Currently, the file must be a bzImage
and contain an x86 kernel loadable above 4G in memory (see
Documentation/x86/boot.txt)."?

boot.txt explains that loading above 4G implies 64-bit and is specified
via a bit in xloadflags added in Linux 3.8.
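To show where the proposed ENOEXEC wording would surface for a caller, here is a minimal sketch. It assumes the libc headers define SYS_kexec_file_load (there is no glibc wrapper, so the call goes through syscall(2)); the names try_kexec_file_load() and explain_kexec_file_load_error() are hypothetical, and the ENOEXEC message simply paraphrases the text suggested above rather than anything the kernel prints.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: glibc provides no kexec_file_load() function,
 * so invoke the system call directly. Returns 0 on success, or -1 with
 * errno set, like other system call wrappers.
 */
long try_kexec_file_load(int kernel_fd, int initrd_fd,
                         unsigned long cmdline_len, const char *cmdline,
                         unsigned long flags)
{
#ifdef SYS_kexec_file_load
    return syscall(SYS_kexec_file_load, kernel_fd, initrd_fd,
                   cmdline_len, cmdline, flags);
#else
    /* Headers too old, or an architecture without this system call. */
    errno = ENOSYS;
    return -1;
#endif
}

/*
 * Hypothetical helper: map an errno value from the call above to a hint.
 * Only ENOEXEC gets special treatment; everything else falls back to
 * the standard strerror() text.
 */
const char *explain_kexec_file_load_error(int err)
{
    if (err == ENOEXEC)
        return "kernel_fd does not refer to an open file, or the file type "
               "is not supported (currently a bzImage containing an x86 "
               "kernel loadable above 4G is required)";
    return strerror(err);
}
```

A caller would typically check for a return value of -1 and pass errno to the helper; on a kernel built without CONFIG_KEXEC_FILE the call would be expected to fail with ENOSYS instead.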
* Re: Edited kexec_load(2) [kexec_file_load()] man page for review
From: Michael Kerrisk (man-pages) @ 2015-01-30 15:25 UTC
To: Scot Doyle
Cc: mtk.manpages, Vivek Goyal, lkml, linux-man, Kexec Mailing List, Andy Lutomirski, Dave Young, H. Peter Anvin, Borislav Petkov, Eric W. Biederman, Andi Kleen

On 01/29/2015 05:06 PM, Scot Doyle wrote:
> On Thu, 29 Jan 2015, Michael Kerrisk (man-pages) wrote:
>> On 29 January 2015 at 02:27, Scot Doyle <lkml14@scotdoyle.com> wrote:
>>> On Wed, 28 Jan 2015, Vivek Goyal wrote:
>>>> On Wed, Jan 28, 2015 at 10:10:59PM +0000, Scot Doyle wrote:
>>>>> On Wed, 28 Jan 2015, Vivek Goyal wrote:
>>>>>> On Wed, Jan 28, 2015 at 09:14:03PM +0000, Scot Doyle wrote:
>>>>>>> When I tested, kexec_file_load required CONFIG_RELOCATABLE. Is the same
>>>>>>> true for kexec_load? Would it make sense to note this in the man pages
>>>>>>> along with the need for CONFIG_KEXEC_FILE, etc? Or as an error message?
>>>>>>
>>>>>> Hmm.., I can't see an explicit dependency between RELOCATABLE and
>>>>>> KEXEC. Both KEXEC and KEXEC_FILE should be able to load a kernel
>>>>>> even if it had RELOCATABLE=n.
>>>>>>
>>>>>> Just that kernel will run from the address it has been built for.
>>>>>>
>>>>>> Thanks
>>>>>> Vivek
>>>>>
>>>>> Confusing, right? kexec_file_load returns -ENOEXEC and dmesg says
>>>>> "kexec-bzImage64: XLF_CAN_BE_LOADED_ABOVE_4G is not set." which leads to
>>>>> arch/x86/boot/header.S line 396:
>>>>>
>>>>> #if defined(CONFIG_RELOCATABLE) && defined(CONFIG_X86_64)
>>>>>    /* kernel/boot_param/ramdisk could be loaded above 4g */
>>>>> # define XLF1 XLF_CAN_BE_LOADED_ABOVE_4G
>>>>> #else
>>>>> # define XLF1 0
>>>>> #endif
>>>>
>>>> Ah, this one. Actually generic kexec file loading implementation does not
>>>> impose this restriction. It is the image specific loader part which
>>>> decides what kind of bzImage it can load.
>>>>
>>>> Current implementation (kexec-bzimage64.c), is only supporting loading
>>>> bzImages which are 64bit and can be loaded above 4G. This simplifies
>>>> the implementation of loader.
>>>>
>>>> But there is nothing which prevents one from implementing other image
>>>> loaders.
>>>>
>>>> So instead of saying that kexec_file_load() depends on CONFIG_RELOCATABLE,
>>>> it might be better to say in man page that currently this system call
>>>> supports only loading a bzImage which is 64bit and which can be loaded
>>>> above 4G too.
>>>>
>>>> Thanks
>>>> Vivek
>>>
>>> Thanks, I agree, and think it would make sense to list them as part of the
>>> page's ENOEXEC error.
>>
>> Scot, could you then phrase a couple of sentences that capture the
>> details, so I can add it to the ENOEXEC error?
>>
>> Thanks,
>>
>> Michael
>
> Yes, maybe something like "kernel_fd does not refer to an open file, or
> the file type is not supported. Currently, the file must be a bzImage
> and contain an x86 kernel loadable above 4G in memory (see
> Documentation/x86/boot.txt)."?
>
> boot.txt explains that loading above 4G implies 64-bit and is specified
> via a bit in xloadflags added in Linux 3.8.

Added and pushed. Thanks, Scot.

Cheers,

Michael

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/