linux-kernel.vger.kernel.org archive mirror
* Edited kexec_load(2) [kexec_file_load()] man page for review
@ 2014-11-09 19:17 Michael Kerrisk (man-pages)
  2014-11-11 21:30 ` Vivek Goyal
  0 siblings, 1 reply; 19+ messages in thread
From: Michael Kerrisk (man-pages) @ 2014-11-09 19:17 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: mtk.manpages, lkml, linux-man, kexec, Andy Lutomirski,
	Dave Young, H. Peter Anvin, Borislav Petkov, Eric W. Biederman

Hello Vivek (and all),

Thanks for the kexec_file_load() patch [for the kexec_load(2) man page]
that you sent quite some time ago. I have merged it and done some
substantial editing as well. Could you please take a look at the
draft below, and check that the kexec_file_load() material is okay.
Could you please pay particular attention to the pieces marked
"FIXME(kexec_file_load)", since those are pieces about which I
had questions or doubts.

Thanks,

Michael

.\" Copyright (C) 2010 Intel Corporation, Author: Andi Kleen
.\" and Copyright 2014, Vivek Goyal <vgoyal@redhat.com>
.\"
.\" %%%LICENSE_START(VERBATIM)
.\" Permission is granted to make and distribute verbatim copies of this
.\" manual provided the copyright notice and this permission notice are
.\" preserved on all copies.
.\"
.\" Permission is granted to copy and distribute modified versions of this
.\" manual under the conditions for verbatim copying, provided that the
.\" entire resulting derived work is distributed under the terms of a
.\" permission notice identical to this one.
.\"
.\" Since the Linux kernel and libraries are constantly changing, this
.\" manual page may be incorrect or out-of-date.  The author(s) assume no
.\" responsibility for errors or omissions, or for damages resulting from
.\" the use of the information contained herein.  The author(s) may not
.\" have taken the same level of care in the production of this manual,
.\" which is licensed free of charge, as they might when working
.\" professionally.
.\"
.\" Formatted or processed versions of this manual, if unaccompanied by
.\" the source, must acknowledge the copyright and authors of this work.
.\" %%%LICENSE_END
.\"
.TH KEXEC_LOAD 2 2014-08-19 "Linux" "Linux Programmer's Manual"
.SH NAME
kexec_load, kexec_file_load \- load a new kernel for later execution
.SH SYNOPSIS
.nf
.B #include <linux/kexec.h>

.BI "long kexec_load(unsigned long " entry ", unsigned long " nr_segments ","
.BI "                struct kexec_segment *" segments \
", unsigned long " flags ");"

.\" FIXME(kexec_file_load):
.\"     Why are the return types of kexec_load() and kexec_file_load()
.\"     different?
.BI "int kexec_file_load(int " kernel_fd ", int " initrd_fd ","
.br
.BI "                    unsigned long " cmdline_len  \
", const char *" cmdline ","
.BI "                    unsigned long " flags ");"

.fi
.IR Note :
There are no glibc wrappers for these system calls; see NOTES.
.SH DESCRIPTION
The
.BR kexec_load ()
system call loads a new kernel that can be executed later by
.BR reboot (2).
.PP
The
.I flags
argument is a bit mask that controls the operation of the call.
The following values can be specified in
.IR flags :
.TP
.BR KEXEC_ON_CRASH " (since Linux 2.6.13)"
Execute the new kernel automatically on a system crash.
.\" FIXME Explain in more detail how KEXEC_ON_CRASH is actually used
.TP
.BR KEXEC_PRESERVE_CONTEXT " (since Linux 2.6.27)"
Preserve the system hardware and
software states before executing the new kernel.
This could be used for system suspend.
This flag is available only if the kernel was configured with
.BR CONFIG_KEXEC_JUMP ,
and is effective only if
.I nr_segments
is greater than 0.
.PP
The high-order bits (corresponding to the mask 0xffff0000) of
.I flags
contain the architecture of the to-be-executed kernel.
Specify (OR) the constant
.B KEXEC_ARCH_DEFAULT
to use the current architecture,
or one of the following architecture constants
.BR KEXEC_ARCH_386 ,
.BR KEXEC_ARCH_68K ,
.BR KEXEC_ARCH_X86_64 ,
.BR KEXEC_ARCH_PPC ,
.BR KEXEC_ARCH_PPC64 ,
.BR KEXEC_ARCH_IA_64 ,
.BR KEXEC_ARCH_ARM ,
.BR KEXEC_ARCH_S390 ,
.BR KEXEC_ARCH_SH ,
.BR KEXEC_ARCH_MIPS ,
and
.BR KEXEC_ARCH_MIPS_LE .
The architecture must be executable on the CPU of the system.

The
.I entry
argument is the physical entry address in the kernel image.
The
.I nr_segments
argument is the number of segments pointed to by the
.I segments
pointer;
the kernel imposes an (arbitrary) limit of 16 on the number of segments.
The
.I segments
argument is an array of
.I kexec_segment
structures which define the kernel layout:
.in +4n
.nf

struct kexec_segment {
    void   *buf;        /* Buffer in user space */
    size_t  bufsz;      /* Buffer length in user space */
    void   *mem;        /* Physical address of kernel */
    size_t  memsz;      /* Physical address length */
};
.fi
.in
.PP
.\" FIXME Explain the details of how the kernel image defined by segments
.\" is copied from the calling process into previously reserved memory.
The kernel image defined by
.I segments
is copied from the calling process into previously reserved memory.
.SS kexec_file_load()
The
.BR kexec_file_load ()
system call is similar to
.BR kexec_load (),
but it takes a different set of arguments.
It reads the kernel to be loaded from the file referred to by the descriptor
.IR kernel_fd ,
and the initrd (initial RAM disk)
to be loaded from the file referred to by the descriptor
.IR initrd_fd .
The
.IR cmdline
argument is a pointer to a string containing the command line
for the new kernel; the
.IR cmdline_len
argument specifies the length of the string in
.IR cmdline .

The
.IR flags
argument is a bit mask which modifies the behavior of the call.
The following values can be specified in
.IR flags :
.TP
.BR KEXEC_FILE_UNLOAD
Unload the currently loaded kernel.
.TP
.BR KEXEC_FILE_ON_CRASH
Load the new kernel in the memory region reserved for the crash kernel.
This kernel is booted if the currently running kernel crashes.
.TP
.BR KEXEC_FILE_NO_INITRAMFS
Loading initrd/initramfs is optional.
Specify this flag if no initramfs is being loaded.
If this flag is set, the value passed in
.IR initrd_fd
is ignored.
.SH RETURN VALUE
On success, these system calls return 0.
On error, \-1 is returned and
.I errno
is set to indicate the error.
.SH ERRORS
.TP
.B EBUSY
Another crash kernel is already being loaded
or a crash kernel is already in use.
.TP
.B EINVAL
.I flags
is invalid; or
.IR nr_segments
is too large.
.\" KEXEC_SEGMENT_MAX == 16
.TP
.B ENOEXEC
.I kernel_fd
does not refer to an open file, or the kernel can't load this file.
.TP
.B EPERM
The caller does not have the
.BR CAP_SYS_BOOT
capability.
.SH VERSIONS
The
.BR kexec_load ()
system call first appeared in Linux 2.6.13.
The
.BR kexec_file_load ()
system call first appeared in Linux 3.17.
.SH CONFORMING TO
These system calls are Linux-specific.
.SH NOTES
Currently, there is no glibc support for these system calls.
Call them using
.BR syscall (2).
.PP
The required constants are in the Linux kernel source file
.IR linux/kexec.h ,
which is not currently exported to glibc.
Therefore, these constants must be defined manually.

.\" FIXME(kexec_file_load):
.\" Is the following rationale accurate? Does it need expanding?
The
.BR kexec_file_load ()
.\" See also http://lwn.net/Articles/603116/
system call was added to provide support for systems
where "kexec" loading should be restricted to
only kernels that are signed.

The
.BR kexec_load ()
system call is available only if the kernel was configured with
.BR CONFIG_KEXEC .
The
.BR kexec_file_load ()
system call is available only if the kernel was configured with
.BR CONFIG_KEXEC_FILE .
.\" FIXME(kexec_file_load):
.\"     Does kexec_file_load() need any other CONFIG_* options to be defined?
.SH SEE ALSO
.BR reboot (2),
.BR syscall (2),
.BR kexec (8)


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/


* Re: Edited kexec_load(2) [kexec_file_load()] man page for review
  2014-11-09 19:17 Edited kexec_load(2) [kexec_file_load()] man page for review Michael Kerrisk (man-pages)
@ 2014-11-11 21:30 ` Vivek Goyal
  2015-01-07 21:17   ` Michael Kerrisk (man-pages)
  0 siblings, 1 reply; 19+ messages in thread
From: Vivek Goyal @ 2014-11-11 21:30 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: lkml, linux-man, kexec, Andy Lutomirski, Dave Young,
	H. Peter Anvin, Borislav Petkov, Eric W. Biederman

On Sun, Nov 09, 2014 at 08:17:49PM +0100, Michael Kerrisk (man-pages) wrote:
> Hello Vivek (and all),
> 
> Thanks for the kexec_file_load() patch [for the kexec_load(2) man page]
> that you quite some time ago sent. I have merged it and done some
> substantial editing as well. Could you please take a look at the 
> draft below, and check that the kexec_file_load() material is okay.
> Please could you especially pay attention to the pieces marked
> "FIXME(kexec_file_load)", since those are pieces about which I
> had questions or doubts.
> 

Hi Michael,

Thanks for editing this man page. I have some thoughts inline.

[..]
> .B #include <linux/kexec.h>
> 
> .BI "long kexec_load(unsigned long " entry ", unsigned long " nr_segments ","
> .BI "                struct kexec_segment *" segments \
> ", unsigned long " flags ");"
> 
> .\" FIXME(kexec_file_load):
> .\"     Why are the return types of kexec_load() and kexec_file_load()
> .\"     different?
> .BI "int kexec_file_load(int " kernel_fd ", int " initrd_fd ","

I think this is ignorance on my part. It should probably be "long", since
that is what SYSCALL_DEFINE() seems to expand to.

asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__));


> .br
> .BI "                    unsigned long " cmdline_len  \
> ", const char *" cmdline ","
> .BI "                    unsigned long " flags ");"
> 
> .fi
> .IR Note :
> There are no glibc wrappers for these system calls; see NOTES.
> .SH DESCRIPTION
> The
> .BR kexec_load ()
> system call loads a new kernel that can be executed later by
> .BR reboot (2).
> .PP
> The
> .I flags
> argument is a bit mask that controls the operation of the call.
> The following values can be specified in
> .IR flags :
> .TP
> .BR KEXEC_ON_CRASH " (since Linux 2.6.13)"
> Execute the new kernel automatically on a system crash.
> .\" FIXME Explain in more detail how KEXEC_ON_CRASH is actually used

Upon boot, the first kernel reserves a chunk of contiguous memory (if the
crashkernel=<> command line parameter is passed). This memory is
used to load the crash kernel (the kernel that will be booted
if the first kernel crashes).

The location of this reserved memory is exported to user space through
the /proc/iomem file. User space can parse it and prepare a list of segments
specifying this reserved memory as the destination.

Once the kernel sees the KEXEC_ON_CRASH flag, it makes sure that all the
segments are destined for reserved memory; otherwise the kernel load
operation fails.

[..]
> struct kexec_segment {
>     void   *buf;        /* Buffer in user space */
>     size_t  bufsz;      /* Buffer length in user space */
>     void   *mem;        /* Physical address of kernel */
>     size_t  memsz;      /* Physical address length */
> };
> .fi
> .in
> .PP
> .\" FIXME Explain the details of how the kernel image defined by segments
> .\" is copied from the calling process into previously reserved memory.

The kernel image defined by segments is copied into the kernel, either into
regular memory or into reserved memory (if KEXEC_ON_CRASH is set). The kernel
first copies the list of segments into kernel memory and then does various
sanity checks on the segments. If everything looks fine, the kernel copies
the segment data to kernel memory.

In the case of a normal kexec, segment data is loaded into any available
memory and is moved to its final destination at kexec reboot time.

In the case of kexec on panic (KEXEC_ON_CRASH flag set), segment data is
loaded directly into reserved memory, and after a crash kexec simply jumps
to the starting point.

[..]
> .\" FIXME(kexec_file_load):
> .\" Is the following rationale accurate? Does it need expanding?
> The
> .BR kexec_file_load ()
> .\" See also http://lwn.net/Articles/603116/
> system call was added to provide support for systems
> where "kexec" loading should be restricted to
> only kernels that are signed.

Yes, this rationale looks good.

> 
> The
> .BR kexec_load ()
> system call is available only if the kernel was configured with
> .BR CONFIG_KEXEC .
> The
> .BR kexec_file_load ()
> system call is available only if the kernel was configured with
> .BR CONFIG_KEXEC_FILE .
> .\" FIXME(kexec_file_load):
> .\"     Does kexec_file_load() need any other CONFIG_* options to be defined?

Yes, it requires some other config options too.

depends on KEXEC
depends on X86_64
depends on CRYPTO=y
depends on CRYPTO_SHA256=y

CONFIG_KEXEC_VERIFY_SIG=y
CONFIG_KEXEC_BZIMAGE_VERIFY_SIG=y
CONFIG_SIGNED_PE_FILE_VERIFICATION=y
CONFIG_PKCS7_MESSAGE_PARSER=y
CONFIG_X509_CERTIFICATE_PARSER=y
CONFIG_ASYMMETRIC_PUBLIC_KEY_SUBTYPE=y

So the dependency list seems pretty long. Not sure how many of these we
should specify in the man page.

Thanks
Vivek


* Re: Edited kexec_load(2) [kexec_file_load()] man page for review
  2014-11-11 21:30 ` Vivek Goyal
@ 2015-01-07 21:17   ` Michael Kerrisk (man-pages)
  2015-01-12 22:16     ` Vivek Goyal
  0 siblings, 1 reply; 19+ messages in thread
From: Michael Kerrisk (man-pages) @ 2015-01-07 21:17 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: mtk.manpages, lkml, linux-man, kexec, Andy Lutomirski,
	Dave Young, H. Peter Anvin, Borislav Petkov, Eric W. Biederman

Hi Vivek,

Thanks for your comments, and my apologies for the long-delayed follow-up.

On 11/11/2014 10:30 PM, Vivek Goyal wrote:
> On Sun, Nov 09, 2014 at 08:17:49PM +0100, Michael Kerrisk (man-pages) wrote:
>> Hello Vivek (and all),
>>
>> Thanks for the kexec_file_load() patch [for the kexec_load(2) man page]
>> that you quite some time ago sent. I have merged it and done some
>> substantial editing as well. Could you please take a look at the 
>> draft below, and check that the kexec_file_load() material is okay.
>> Please could you especially pay attention to the pieces marked
>> "FIXME(kexec_file_load)", since those are pieces about which I
>> had questions or doubts.
>>
> 
> Hi Michael,
> 
> Thanks for editing this man page. I have some thoughts inline.
> 
> [..]
>> .B #include <linux/kexec.h>
>>
>> .BI "long kexec_load(unsigned long " entry ", unsigned long " nr_segments ","
>> .BI "                struct kexec_segment *" segments \
>> ", unsigned long " flags ");"
>>
>> .\" FIXME(kexec_file_load):
>> .\"     Why are the return types of kexec_load() and kexec_file_load()
>> .\"     different?
>> .BI "int kexec_file_load(int " kernel_fd ", int " initrd_fd ","
> 
> I think this is ignorance on my part. It probably should be "long" as
> SYSCALL_DEFINE() seems to expand to.
> 
> asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__));

Okay -- I've changed to 'long' in the man page.
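
For readers following along: since there is no glibc wrapper, a caller
would reach kexec_file_load() through syscall(2), which itself returns
long. A minimal sketch, assuming the C library headers define
SYS_kexec_file_load; the wrapper name my_kexec_file_load() is invented for
this example and is not part of any library.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: forwards its arguments to the raw system call.
 * Returns 0 on success, or -1 with errno set, like syscall(2) itself.
 */
static long my_kexec_file_load(int kernel_fd, int initrd_fd,
                               unsigned long cmdline_len,
                               const char *cmdline, unsigned long flags)
{
#ifdef SYS_kexec_file_load
    return syscall(SYS_kexec_file_load, kernel_fd, initrd_fd,
                   cmdline_len, cmdline, flags);
#else
    /* Assumption: headers too old to know the syscall number. */
    errno = ENOSYS;
    return -1;
#endif
}
```

Called with invalid descriptors, or without CAP_SYS_BOOT, this returns -1
and sets errno; which errno value depends on the system.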

>> .br
>> .BI "                    unsigned long " cmdline_len  \
>> ", const char *" cmdline ","
>> .BI "                    unsigned long " flags ");"
>>
>> .fi
>> .IR Note :
>> There are no glibc wrappers for these system calls; see NOTES.
>> .SH DESCRIPTION
>> The
>> .BR kexec_load ()
>> system call loads a new kernel that can be executed later by
>> .BR reboot (2).
>> .PP
>> The
>> .I flags
>> argument is a bit mask that controls the operation of the call.
>> The following values can be specified in
>> .IR flags :
>> .TP
>> .BR KEXEC_ON_CRASH " (since Linux 2.6.13)"
>> Execute the new kernel automatically on a system crash.
>> .\" FIXME Explain in more detail how KEXEC_ON_CRASH is actually used

I wasn't expecting that you would respond to the FIXMEs that were 
not labeled "kexec_file_load", but I was hoping you might ;-). Thanks!
I have a few additional questions about your nice notes.

> Upon boot first kernel reserves a chunk of contiguous memory (if
> crashkernel=<> command line parameter is passed). This memory is
> used to load the crash kernel (Kernel which will be booted into
> if first kernel crashes).

Can I just confirm: is it in all cases only possible to use kexec_load() 
and kexec_file_load() if the kernel was booted with the 'crashkernel'
parameter set?

> Location of this reserved memory is exported to user space through
> /proc/iomem file. 

Is that export via an entry labeled "Crash kernel" in the 
/proc/iomem file?

> User space can parse it and prepare list of segments
> specifying this reserved memory as destination.

I'm not quite clear on "specifying this reserved memory as destination".
Is that done by specifying the address in the kexec_segment.mem fields?

> Once kernel sees the flag KEXEC_ON_CRASH, it makes sure that all the
> segments are destined for reserved memory otherwise kernel load operation
> fails.

Could you point me to where this checking is done? Also, what is the
error (errno) that occurs when the load operation fails? (I think the
answers to these questions are "at the start of kimage_alloc_init()"
and "EADDRNOTAVAIL", but I'd like to confirm.)

> [..]
>> struct kexec_segment {
>>     void   *buf;        /* Buffer in user space */
>>     size_t  bufsz;      /* Buffer length in user space */
>>     void   *mem;        /* Physical address of kernel */
>>     size_t  memsz;      /* Physical address length */
>> };
>> .fi
>> .in
>> .PP
>> .\" FIXME Explain the details of how the kernel image defined by segments
>> .\" is copied from the calling process into previously reserved memory.
> 
> Kernel image defined by segments is copied into kernel either in regular
> memory 

Could you clarify what you mean by "regular memory"?

> or in reserved memory (if KEXEC_ON_CRASH is set). Kernel first
> copies list of segments in kernel memory and then does various
> sanity checks on the segments. If everything looks fine, kernel copies
> segment data to kernel memory.
> 
> In case of normal kexec, segment data is loaded in any available memory
> and segment data is moved to final destination at the kexec reboot time.

By "moved to final destination", do you mean "moved from user space to the
final kernel-space destination"?

> In case of kexec on panic (KEXEC_ON_CRASH flag set), segment data is
> directly loaded to reserved memory and after crash kexec simply jumps

By "directly", I assume you mean "at the time of the kexec_load() call",
right?

> to starting point.
> 
> [..]
>> .\" FIXME(kexec_file_load):
>> .\" Is the following rationale accurate? Does it need expanding?
>> The
>> .BR kexec_file_load ()
>> .\" See also http://lwn.net/Articles/603116/
>> system call was added to provide support for systems
>> where "kexec" loading should be restricted to
>> only kernels that are signed.
> 
> Yes, this rationale looks good.

Okay -- thanks.

>> The
>> .BR kexec_load ()
>> system call is available only if the kernel was configured with
>> .BR CONFIG_KEXEC .
>> The
>> .BR kexec_file_load ()
>> system call is available only if the kernel was configured with
>> .BR CONFIG_KEXEC_FILE .
>> .\" FIXME(kexec_file_load):
>> .\"     Does kexec_file_load() need any other CONFIG_* options to be defined?
> 
> Yes, it requires some other config options too.
> 
> depends on KEXEC
> depends on X86_64
> depends on CRYPTO=y
> depends on CRYPTO_SHA256=y
> 
> CONFIG_KEXEC_VERIFY_SIG=y
> CONFIG_KEXEC_BZIMAGE_VERIFY_SIG=y
> CONFIG_SIGNED_PE_FILE_VERIFICATION=y
> CONFIG_PKCS7_MESSAGE_PARSER=y
> CONFIG_X509_CERTIFICATE_PARSER=y
> CONFIG_ASYMMETRIC_PUBLIC_KEY_SUBTYPE=y
> 
> So dependency list seems pretty long. Not sure how many of these should
> we specify in man page.

On reflection, since they're dependencies of CONFIG_KEXEC_FILE, perhaps
it's not necessary to add any of the others.

Cheers,

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/


* Re: Edited kexec_load(2) [kexec_file_load()] man page for review
  2015-01-07 21:17   ` Michael Kerrisk (man-pages)
@ 2015-01-12 22:16     ` Vivek Goyal
  2015-01-16 13:30       ` Michael Kerrisk (man-pages)
  0 siblings, 1 reply; 19+ messages in thread
From: Vivek Goyal @ 2015-01-12 22:16 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: lkml, linux-man, kexec, Andy Lutomirski, Dave Young,
	H. Peter Anvin, Borislav Petkov, Eric W. Biederman

On Wed, Jan 07, 2015 at 10:17:56PM +0100, Michael Kerrisk (man-pages) wrote:

[..]
> >> .BR KEXEC_ON_CRASH " (since Linux 2.6.13)"
> >> Execute the new kernel automatically on a system crash.
> >> .\" FIXME Explain in more detail how KEXEC_ON_CRASH is actually used
> 
> I wasn't expecting that you would respond to the FIXMEs that were 
> not labeled "kexec_file_load", but I was hoping you might ;-). Thanks!
> I have a few additional questions about your nice notes.
> 
> > Upon boot first kernel reserves a chunk of contiguous memory (if
> > crashkernel=<> command line parameter is passed). This memory is
> > used to load the crash kernel (Kernel which will be booted into
> > if first kernel crashes).
> 

Hi Michael,

> Can I just confirm: is it in all cases only possible to use kexec_load() 
> and kexec_file_load() if the kernel was booted with the 'crashkernel'
> parameter set?

As of now, only kexec_load() and kexec_file_load() system calls can
make use of memory reserved by crashkernel=<> kernel parameter. And
this is used only if we are trying to load a crash kernel (KEXEC_ON_CRASH
flag specified).
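
As a sketch of how a caller would ask for a crash-kernel load with
kexec_load(): the KEXEC_ON_CRASH bit sits in the low part of flags and the
architecture in the high 16 bits (mask 0xffff0000), as the draft page
describes. The constants are defined by hand here, mirroring
linux/kexec.h; the helper names are invented for the example.

```c
/* Defined manually, as the NOTES section of the draft suggests. */
#ifndef KEXEC_ON_CRASH
#define KEXEC_ON_CRASH      0x00000001UL
#endif
#ifndef KEXEC_ARCH_MASK
#define KEXEC_ARCH_MASK     0xffff0000UL
#endif
#ifndef KEXEC_ARCH_DEFAULT
#define KEXEC_ARCH_DEFAULT  (0UL << 16)
#endif
#ifndef KEXEC_ARCH_X86_64
#define KEXEC_ARCH_X86_64   (62UL << 16)
#endif

/* Build a flags word that requests a crash-kernel load for 'arch'. */
static unsigned long crash_load_flags(unsigned long arch)
{
    return KEXEC_ON_CRASH | (arch & KEXEC_ARCH_MASK);
}

/* Extract the architecture part of a flags word. */
static unsigned long flags_arch(unsigned long flags)
{
    return flags & KEXEC_ARCH_MASK;
}
```

Passing KEXEC_ARCH_DEFAULT instead selects the architecture of the running
kernel.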

> 
> > Location of this reserved memory is exported to user space through
> > /proc/iomem file. 
> 
> Is that export via an entry labeled "Crash kernel" in the 
> /proc/iomem file?

Yes.
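
A rough sketch of that parsing step, assuming the usual /proc/iomem line
format of "start-end : name" with hexadecimal, inclusive addresses. The
function names are made up for the example, and error handling is minimal.

```c
#include <stdio.h>
#include <string.h>

/*
 * Parse one /proc/iomem-style line. Returns 1 and fills *start and *end
 * (inclusive) if the line is a "Crash kernel" entry, 0 otherwise.
 */
static int parse_crash_kernel_line(const char *line,
                                   unsigned long long *start,
                                   unsigned long long *end)
{
    unsigned long long s, e;
    char name[64];

    if (sscanf(line, " %llx-%llx : %63[^\n]", &s, &e, name) != 3)
        return 0;
    if (strcmp(name, "Crash kernel") != 0)
        return 0;
    *start = s;
    *end = e;
    return 1;
}

/* Scan an open stream (for example /proc/iomem) for the first match. */
static int find_crash_kernel(FILE *fp, unsigned long long *start,
                             unsigned long long *end)
{
    char line[256];

    while (fgets(line, sizeof(line), fp) != NULL)
        if (parse_crash_kernel_line(line, start, end))
            return 1;
    return 0;
}
```

The addresses found this way are what user space would then use as segment
destinations.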

> 
> > User space can parse it and prepare list of segments
> > specifying this reserved memory as destination.
> 
> I'm not quite clear on "specifying this reserved memory as destination".
> Is that done by specifying the address in the kexec_segment.mem fields?

You are absolutely right. User space can specify, in the kexec_segment.mem
field, the memory location where it expects a particular segment to
be loaded by the kernel.
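
To make that concrete, here is a small sketch of filling in one segment so
that its destination is a chosen physical address. The structure layout is
the one shown in the draft page, under a different name to make clear it is
a local copy; make_segment() and the addresses used with it are invented
for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Local copy of the layout shown in the draft man page. */
struct kexec_segment_sketch {
    void   *buf;        /* Buffer in user space */
    size_t  bufsz;      /* Buffer length in user space */
    void   *mem;        /* Physical destination address */
    size_t  memsz;      /* Length of the destination region */
};

/* Describe a user-space buffer that should end up at address 'dest'. */
static struct kexec_segment_sketch
make_segment(void *buf, size_t bufsz, unsigned long dest, size_t memsz)
{
    struct kexec_segment_sketch seg;

    seg.buf = buf;
    seg.bufsz = bufsz;
    seg.mem = (void *)(uintptr_t)dest;  /* the destination goes in .mem */
    seg.memsz = memsz;
    return seg;
}
```

For a crash kernel, 'dest' would be an address inside the reserved region
reported in /proc/iomem.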

> 
> > Once kernel sees the flag KEXEC_ON_CRASH, it makes sure that all the
> > segments are destined for reserved memory otherwise kernel load operation
> > fails.
> 
> Could you point me to where this checking is done? Also, what is the
> error (errno) that occurs when the load operation fails? (I think the
> answers to these questions are "at the start of kimage_alloc_init()"
> and "EADDRNOTAVAIL", but I'd like to confirm.)

This checking happens in sanity_check_segment_list() which is called
by kimage_alloc_init().

And yes, error code returned is -EADDRNOTAVAIL.
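
In spirit, the check amounts to something like the following. This is a
simplified user-space model written for this discussion, not the kernel's
code: the structure and function names are invented, and the real
sanity_check_segment_list() performs other checks as well.

```c
#include <errno.h>
#include <stddef.h>

struct region_sketch {
    unsigned long start;    /* first byte of the reserved region */
    unsigned long end;      /* last byte (inclusive) */
};

struct dest_sketch {
    unsigned long mem;      /* destination address of a segment */
    size_t memsz;           /* destination length in bytes */
};

/*
 * Return 0 if every segment lies entirely inside 'res',
 * or -EADDRNOTAVAIL as soon as one does not.
 */
static int check_crash_segments(const struct dest_sketch *segs, size_t n,
                                const struct region_sketch *res)
{
    for (size_t i = 0; i < n; i++) {
        unsigned long mstart = segs[i].mem;
        unsigned long mend = mstart + segs[i].memsz - 1;

        /* In this model, empty or wrapping ranges are rejected too. */
        if (mend < mstart)
            return -EADDRNOTAVAIL;
        if (mstart < res->start || mend > res->end)
            return -EADDRNOTAVAIL;
    }
    return 0;
}
```

A single segment outside the reserved region is enough to fail the whole
load.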

> 
> > [..]
> >> struct kexec_segment {
> >>     void   *buf;        /* Buffer in user space */
> >>     size_t  bufsz;      /* Buffer length in user space */
> >>     void   *mem;        /* Physical address of kernel */
> >>     size_t  memsz;      /* Physical address length */
> >> };
> >> .fi
> >> .in
> >> .PP
> >> .\" FIXME Explain the details of how the kernel image defined by segments
> >> .\" is copied from the calling process into previously reserved memory.
> > 
> > Kernel image defined by segments is copied into kernel either in regular
> > memory 
> 
> Could you clarify what you mean by "regular memory"?

I meant memory which is not reserved memory.

> 
> > or in reserved memory (if KEXEC_ON_CRASH is set). Kernel first
> > copies list of segments in kernel memory and then does various
> > sanity checks on the segments. If everything looks fine, kernel copies
> > segment data to kernel memory.
> > 
> > In case of normal kexec, segment data is loaded in any available memory
> > and segment data is moved to final destination at the kexec reboot time.
> 
> By "moved to final destination", do you mean "moved from user space to the
> final kernel-space destination"?

No. Segment data moves from user space to kernel space once the
kexec_load() call finishes successfully. But when the user reboots
(kexec -e), the kernel moves that segment data to its final location. The
kernel could not place the segment at its final location at kexec_load()
time, as that memory is already in use by the running kernel. But once we
are about to reboot into the new kernel, we can overwrite the old kernel's
memory.

> 
> > In case of kexec on panic (KEXEC_ON_CRASH flag set), segment data is
> > directly loaded to reserved memory and after crash kexec simply jumps
> 
> By "directly", I assume you mean "at the time of the kexec_load() call",
> right?

Yes.

Thanks
Vivek



* Re: Edited kexec_load(2) [kexec_file_load()] man page for review
  2015-01-12 22:16     ` Vivek Goyal
@ 2015-01-16 13:30       ` Michael Kerrisk (man-pages)
  2015-01-27  8:07         ` Michael Kerrisk (man-pages)
  2015-01-27 14:24         ` Vivek Goyal
  0 siblings, 2 replies; 19+ messages in thread
From: Michael Kerrisk (man-pages) @ 2015-01-16 13:30 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: mtk.manpages, lkml, linux-man, kexec, Andy Lutomirski,
	Dave Young, H. Peter Anvin, Borislav Petkov, Eric W. Biederman

Hello Vivek,

Thanks for your comments! I've added some further text to
the page based on those comments. See some follow-up 
questions below.

On 01/12/2015 11:16 PM, Vivek Goyal wrote:
> On Wed, Jan 07, 2015 at 10:17:56PM +0100, Michael Kerrisk (man-pages) wrote:
> 
> [..]
>>>> .BR KEXEC_ON_CRASH " (since Linux 2.6.13)"
>>>> Execute the new kernel automatically on a system crash.
>>>> .\" FIXME Explain in more detail how KEXEC_ON_CRASH is actually used
>>
>> I wasn't expecting that you would respond to the FIXMEs that were 
>> not labeled "kexec_file_load", but I was hoping you might ;-). Thanks!
>> I have a few additional questions about your nice notes.
>>
>>> Upon boot first kernel reserves a chunk of contiguous memory (if
>>> crashkernel=<> command line parameter is passed). This memory is
>>> used to load the crash kernel (Kernel which will be booted into
>>> if first kernel crashes).
>>
> 
> Hi Michael,
> 
>> Can I just confirm: is it in all cases only possible to use kexec_load() 
>> and kexec_file_load() if the kernel was booted with the 'crashkernel'
>> parameter set?
> 
> As of now, only kexec_load() and kexec_file_load() system calls can
> make use of memory reserved by crashkernel=<> kernel parameter. And
> this is used only if we are trying to load a crash kernel (KEXEC_ON_CRASH
> flag specified).

Okay.

>>> Location of this reserved memory is exported to user space through
>>> /proc/iomem file. 
>>
>> Is that export via an entry labeled "Crash kernel" in the 
>> /proc/iomem file?
> 
> Yes.

Okay -- thanks.

>>> User space can parse it and prepare list of segments
>>> specifying this reserved memory as destination.
>>
>> I'm not quite clear on "specifying this reserved memory as destination".
>> Is that done by specifying the address in the kexec_segment.mem fields?
> 
> You are absolutely right. User space can specify in kexec_segment.mem
> field the memory location where it expects a particular segment to
> be loaded by kernel.
> 
>>
>>> Once kernel sees the flag KEXEC_ON_CRASH, it makes sure that all the
>>> segments are destined for reserved memory otherwise kernel load operation
>>> fails.
>>
>> Could you point me to where this checking is done? Also, what is the
>> error (errno) that occurs when the load operation fails? (I think the
>> answers to these questions are "at the start of kimage_alloc_init()"
>> and "EADDRNOTAVAIL", but I'd like to confirm.)
> 
> This checking happens in sanity_check_segment_list() which is called
> by kimage_alloc_init().
> 
> And yes, error code returned is -EADDRNOTAVAIL.

Thanks. I added EADDRNOTAVAIL to the ERRORS.

>>> [..]
>>>> struct kexec_segment {
>>>>     void   *buf;        /* Buffer in user space */
>>>>     size_t  bufsz;      /* Buffer length in user space */
>>>>     void   *mem;        /* Physical address of kernel */
>>>>     size_t  memsz;      /* Physical address length */
>>>> };
>>>> .fi
>>>> .in
>>>> .PP
>>>> .\" FIXME Explain the details of how the kernel image defined by segments
>>>> .\" is copied from the calling process into previously reserved memory.
>>>
>>> Kernel image defined by segments is copied into kernel either in regular
>>> memory 
>>
>> Could you clarify what you mean by "regular memory"?
> 
> I meant memory which is not reserved memory.

Okay.

>>> or in reserved memory (if KEXEC_ON_CRASH is set). Kernel first
>>> copies list of segments in kernel memory and then does various
>>> sanity checks on the segments. If everything looks fine, kernel copies
>>> segment data to kernel memory.
>>>
>>> In case of normal kexec, segment data is loaded in any available memory
>>> and segment data is moved to final destination at the kexec reboot time.
>>
>> By "moved to final destination", do you mean "moved from user space to the
>> final kernel-space destination"?
> 
> No. Segment data moves from user space to kernel space once kexec_load()
> call finishes successfully. But when user does reboot (kexec -e), at that
> time kernel moves that segment data to its final location. Kernel could
> not place the segment at its final location during kexec_load() time as
> that memory is already in use by running kernel. But once we are about
> to reboot to new kernel, we can overwrite the old kernel's memory.

Got it.

>>> In case of kexec on panic (KEXEC_ON_CRASH flag set), segment data is
>>> directly loaded to reserved memory and after crash kexec simply jumps
>>
>> By "directly", I assume you mean "at the time of the kexec_load() call",
>> right?
> 
> Yes.

Thanks.

So, returning to the kexec_segment structure:

           struct kexec_segment {
               void   *buf;        /* Buffer in user space */
               size_t  bufsz;      /* Buffer length in user space */
               void   *mem;        /* Physical address of kernel */
               size_t  memsz;      /* Physical address length */
           };

Are the following statements correct:
* buf + bufsz identify a memory region in the caller's virtual 
  address space that is the source of the copy
* mem + memsz specify the target memory region of the copy
* mem is a physical memory address, as seen from kernel space
* the number of bytes copied from userspace is min(bufsz, memsz)
* if bufsz > memsz, then excess bytes in the user-space buffer 
  are ignored.
* if memsz > bufsz, then excess bytes in the target kernel buffer
  are filled with zeros.
?

Also, it seems to me that 'mem' need not be page aligned.
Is that correct? Should the man page say something about that?
(E.g., is it generally desirable that 'mem' should be page aligned?)

Likewise, 'memsz' doesn't need to be a page multiple, IIUC.
Should the man page say anything about this? For example, should 
it note that the initialized kernel segment will be of size:

     (mem % PAGE_SIZE + memsz) rounded up to the next multiple of PAGE_SIZE

And should it note that if 'mem' is not a multiple of the page size, then
the initial bytes (mem % PAGE_SIZE) in the first page of the kernel segment
will be zeros?

(Hopefully I have read kimage_load_normal_segment() correctly.)
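
The size arithmetic in the question above can be written out as follows.
This only restates the formula as asked; whether the kernel really behaves
this way is exactly what is being asked, so treat it as a worked example
of the arithmetic and nothing more. The function names are invented.

```c
/* Bytes before 'mem' in its page: mem % PAGE_SIZE. */
static unsigned long leading_pad(unsigned long mem, unsigned long page_size)
{
    return mem % page_size;
}

/* (mem % PAGE_SIZE + memsz), rounded up to a multiple of PAGE_SIZE. */
static unsigned long covered_size(unsigned long mem, unsigned long memsz,
                                  unsigned long page_size)
{
    unsigned long span = leading_pad(mem, page_size) + memsz;

    return (span + page_size - 1) / page_size * page_size;
}
```

For example, with 4096-byte pages, mem = 0x1010 and memsz = 0x1000 give a
leading pad of 16 bytes and a covered size of 8192 bytes (two pages).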

And one further question. Other than the fact that they are used with 
different system calls, what is the difference between KEXEC_ON_CRASH 
and KEXEC_FILE_ON_CRASH?

Thanks,

Michael

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/


* Re: Edited kexec_load(2) [kexec_file_load()] man page for review
  2015-01-16 13:30       ` Michael Kerrisk (man-pages)
@ 2015-01-27  8:07         ` Michael Kerrisk (man-pages)
  2015-01-27 14:24         ` Vivek Goyal
  1 sibling, 0 replies; 19+ messages in thread
From: Michael Kerrisk (man-pages) @ 2015-01-27  8:07 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: Michael Kerrisk, lkml, linux-man, Kexec Mailing List,
	Andy Lutomirski, Dave Young, H. Peter Anvin, Borislav Petkov,
	Eric W. Biederman

Hello Vivek,

Ping!

Cheers,

Michael


On 16 January 2015 at 14:30, Michael Kerrisk (man-pages)
<mtk.manpages@gmail.com> wrote:
> Hello Vivek,
>
> Thanks for your comments! I've added some further text to
> the page based on those comments. See some follow-up
> questions below.
>
> On 01/12/2015 11:16 PM, Vivek Goyal wrote:
>> On Wed, Jan 07, 2015 at 10:17:56PM +0100, Michael Kerrisk (man-pages) wrote:
>>
>> [..]
>>>>> .BR KEXEC_ON_CRASH " (since Linux 2.6.13)"
>>>>> Execute the new kernel automatically on a system crash.
>>>>> .\" FIXME Explain in more detail how KEXEC_ON_CRASH is actually used
>>>
>>> I wasn't expecting that you would respond to the FIXMEs that were
>>> not labeled "kexec_file_load", but I was hoping you might ;-). Thanks!
>>> I have a few additional questions to your nice notes.
>>>
>>>> Upon boot first kernel reserves a chunk of contiguous memory (if
>>>> crashkernel=<> command line parameter is passed). This memory is
>>>> used to load the crash kernel (Kernel which will be booted into
>>>> if first kernel crashes).
>>>
>>
>> Hi Michael,
>>
>>> Can I just confirm: is it in all cases only possible to use kexec_load()
>>> and kexec_file_load() if the kernel was booted with the 'crashkernel'
>>> parameter set?
>>
>> As of now, only kexec_load() and kexec_file_load() system calls can
>> make use of memory reserved by crashkernel=<> kernel parameter. And
>> this is used only if we are trying to load a crash kernel (KEXEC_ON_CRASH
>> flag specified).
>
> Okay.
>
>>>> Location of this reserved memory is exported to user space through
>>>> /proc/iomem file.
>>>
>>> Is that export via an entry labeled "Crash kernel" in the
>>> /proc/iomem file?
>>
>> Yes.
>
> Okay -- thanks.
>
>>>> User space can parse it and prepare list of segments
>>>> specifying this reserved memory as destination.
>>>
>>> I'm not quite clear on "specifying this reserved memory as destination".
>>> Is that done by specifying the address in the kexec_segment.mem fields?
>>
>> You are absolutely right. User space can specify in kexec_segment.mem
>> field the memory location where it expects a particular segment to
>> be loaded by kernel.
>>
>>>
>>>> Once kernel sees the flag KEXEC_ON_CRASH, it makes sure that all the
>>>> segments are destined for reserved memory otherwise kernel load operation
>>>> fails.
>>>
>>> Could you point me to where this checking is done? Also, what is the
>>> error (errno) that occurs when the load operation fails? (I think the
>>> answers to these questions are "at the start of kimage_alloc_init()"
>>> and "EADDRNOTAVAIL", but I'd like to confirm.)
>>
>> This checking happens in sanity_check_segment_list() which is called
>> by kimage_alloc_init().
>>
>> And yes, error code returned is -EADDRNOTAVAIL.
>
> Thanks. I added EADDRNOTAVAIL to the ERRORS.
>
>>>> [..]
>>>>> struct kexec_segment {
>>>>>     void   *buf;        /* Buffer in user space */
>>>>>     size_t  bufsz;      /* Buffer length in user space */
>>>>>     void   *mem;        /* Physical address of kernel */
>>>>>     size_t  memsz;      /* Physical address length */
>>>>> };
>>>>> .fi
>>>>> .in
>>>>> .PP
>>>>> .\" FIXME Explain the details of how the kernel image defined by segments
>>>>> .\" is copied from the calling process into previously reserved memory.
>>>>
>>>> Kernel image defined by segments is copied into kernel either in regular
>>>> memory
>>>
>>> Could you clarify what you mean by "regular memory"?
>>
>> I meant memory which is not reserved memory.
>
> Okay.
>
>>>> or in reserved memory (if KEXEC_ON_CRASH is set). Kernel first
>>>> copies list of segments in kernel memory and then does various
>>>> sanity checks on the segments. If everything looks fine, kernel copies
>>>> segment data to kernel memory.
>>>>
>>>> In case of normal kexec, segment data is loaded in any available memory
>>>> and segment data is moved to final destination at the kexec reboot time.
>>>
>>> By "moved to final destination", do you mean "moved from user space to the
>>> final kernel-space destination"?
>>
>> No. Segment data moves from user space to kernel space once kexec_load()
>> call finishes successfully. But when user does reboot (kexec -e), at that
>> time kernel moves that segment data to its final location. Kernel could
>> not place the segment at its final location during kexec_load() time as
>> that memory is already in use by running kernel. But once we are about
>> to reboot to new kernel, we can overwrite the old kernel's memory.
>
> Got it.
>
>>>> In case of kexec on panic (KEXEC_ON_CRASH flag set), segment data is
>>>> directly loaded to reserved memory and after crash kexec simply jumps
>>>
>>> By "directly", I assume you mean "at the time of the kexec_load() call",
>>> right?
>>
>> Yes.
>
> Thanks.
>
> So, returning to the kexec_segment structure:
>
>            struct kexec_segment {
>                void   *buf;        /* Buffer in user space */
>                size_t  bufsz;      /* Buffer length in user space */
>                void   *mem;        /* Physical address of kernel */
>                size_t  memsz;      /* Physical address length */
>            };
>
> Are the following statements correct:
> * buf + bufsz identify a memory region in the caller's virtual
>   address space that is the source of the copy
> * mem + memsz specify the target memory region of the copy
> * mem is  physical memory address, as seen from kernel space
> * the number of bytes copied from userspace is min(bufsz, memsz)
> * if bufsz > memsz, then excess bytes in the user-space buffer
>   are ignored.
> * if memsz > bufsz, then excess bytes in the target kernel buffer
>   are filled with zeros.
> ?
>
> Also, it seems to me that 'mem' need not be page aligned.
> Is that correct? Should the man page say something about that?
> (E.g., is it generally desirable that 'mem' should be page aligned?)
>
> Likewise, 'memsz' doesn't need to be a page multiple, IIUC.
> Should the man page say anything about this? For example, should
> it note that the initialized kernel segment will be of size:
>
>      (mem % PAGE_SIZE + memsz) rounded up to the next multiple of PAGE_SIZE
>
> And should it note that if 'mem' is not a multiple of the page size, then
> the initial bytes (mem % PAGE_SIZE)) in the first page of the kernel segment
> will be zeros?
>
> (Hopefully I have read kimage_load_normal_segment() correctly.)
>
> And one further question. Other than the fact that they are used with
> different system calls, what is the difference between KEXEC_ON_CRASH
> and KEXEC_FILE_ON_CRASH?
>
> Thanks,
>
> Michael
>
> --
> Michael Kerrisk
> Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
> Linux/UNIX System Programming Training: http://man7.org/training/



-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Edited kexec_load(2) [kexec_file_load()] man page for review
  2015-01-16 13:30       ` Michael Kerrisk (man-pages)
  2015-01-27  8:07         ` Michael Kerrisk (man-pages)
@ 2015-01-27 14:24         ` Vivek Goyal
  2015-01-28  8:04           ` Michael Kerrisk (man-pages)
  1 sibling, 1 reply; 19+ messages in thread
From: Vivek Goyal @ 2015-01-27 14:24 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: lkml, linux-man, kexec, Andy Lutomirski, Dave Young,
	H. Peter Anvin, Borislav Petkov, Eric W. Biederman

On Fri, Jan 16, 2015 at 02:30:25PM +0100, Michael Kerrisk (man-pages) wrote:
[..]
> 

Hi Michael,

Please find my responses below. Sorry, I got stuck in other work and
forgot about this thread.

> So, returning to the kexec_segment structure:
> 
>            struct kexec_segment {
>                void   *buf;        /* Buffer in user space */
>                size_t  bufsz;      /* Buffer length in user space */
>                void   *mem;        /* Physical address of kernel */
>                size_t  memsz;      /* Physical address length */
>            };
> 
> Are the following statements correct:
> * buf + bufsz identify a memory region in the caller's virtual 
>   address space that is the source of the copy

Yes.

> * mem + memsz specify the target memory region of the copy

Yes.

> * mem is  physical memory address, as seen from kernel space

Yes.

> * the number of bytes copied from userspace is min(bufsz, memsz)

Yes. bufsz can not be more than memsz. There is a check to validate
this in kernel.

	result = -EINVAL;
	for (i = 0; i < nr_segments; i++) {
		if (image->segment[i].bufsz > image->segment[i].memsz)
			return result;
	}

> * if bufsz > memsz, then excess bytes in the user-space buffer 
>   are ignored.

You will get -EINVAL.

> * if memsz > bufsz, then excess bytes in the target kernel buffer
>   are filled with zeros.

Yes.

> Also, it seems to me that 'mem' need not be page aligned.
> Is that correct? Should the man page say something about that?
> (E.g., is it generally desirable that 'mem' should be page aligned?)

mem and memsz need to be page aligned. There is a check for that too.

	mstart = image->segment[i].mem;
	mend   = mstart + image->segment[i].memsz;
	if ((mstart & ~PAGE_MASK) || (mend & ~PAGE_MASK))
		return result;

> 
> Likewise, 'memsz' doesn't need to be a page multiple, IIUC.
> Should the man page say anything about this? For example, should 
> it note that the initialized kernel segment will be of size:
> 
>      (mem % PAGE_SIZE + memsz) rounded up to the next multiple of PAGE_SIZE
> 
> And should it note that if 'mem' is not a multiple of the page size, then
> the initial bytes (mem % PAGE_SIZE)) in the first page of the kernel segment 
> will be zeros?
> 
> (Hopefully I have read kimage_load_normal_segment() correctly.)

Both mem and memsz need to be page aligned.
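
For illustration only, the two checks quoted above could be mirrored in
user space before making the call. This is a rough sketch, not the kernel
code: the names seg_sketch and check_segment_sketch are made up here, the
page size is passed in by the caller, and the order of the checks is an
assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct kexec_segment; "mem" is held as an
 * integer so that the alignment arithmetic is explicit. */
struct seg_sketch {
    const void *buf;    /* Buffer in user space */
    size_t      bufsz;  /* Buffer length in user space */
    uintptr_t   mem;    /* Physical target address */
    size_t      memsz;  /* Length of the physical target range */
};

/* Returns 0 if the segment passes both checks, otherwise the negative
 * errno value that the thread says the kernel reports. */
int check_segment_sketch(const struct seg_sketch *s, size_t page_size)
{
    /* page_size must be a nonzero power of two for the mask to work */
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

    uintptr_t mask   = (uintptr_t)page_size - 1;
    uintptr_t mstart = s->mem;
    uintptr_t mend   = mstart + s->memsz;

    /* Both ends of the target range must sit on a page boundary */
    if ((mstart & mask) || (mend & mask))
        return -EADDRNOTAVAIL;

    /* The source buffer may not be larger than the target range */
    if (s->bufsz > s->memsz)
        return -EINVAL;

    return 0;
}
```

A caller would typically pass sysconf(_SC_PAGESIZE) as page_size.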

> 
> And one further question. Other than the fact that they are used with 
> different system calls, what is the difference between KEXEC_ON_CRASH 
> and KEXEC_FILE_ON_CRASH?

Right now I can't think of any other difference. They both tell respective
system call that this kernel needs to be loaded in reserved memory region
for crash kernel.
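
As confirmed earlier in the thread, the reserved region shows up in
/proc/iomem under the label "Crash kernel". A minimal sketch of how user
space might pick that entry out of one line of the file, and test whether
a segment falls inside it, follows. The function names are invented for
this example, and the line layout ("start-end : name", in hex, with an
inclusive end address) is an assumption about the file format.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Parse one /proc/iomem line. Returns 1 and fills *start and *end if the
 * line is the "Crash kernel" entry, 0 otherwise. */
int parse_crash_kernel_line(const char *line,
                            unsigned long long *start,
                            unsigned long long *end)
{
    char name[64];
    unsigned long long s, e;

    assert(line != NULL && start != NULL && end != NULL);

    /* Leading blanks are skipped; the name runs to the end of the line */
    if (sscanf(line, " %llx-%llx : %63[^\n]", &s, &e, name) != 3)
        return 0;
    if (strcmp(name, "Crash kernel") != 0)
        return 0;

    *start = s;
    *end = e;
    return 1;
}

/* Returns 1 if [mem, mem + memsz - 1] lies inside [start, end]. The
 * comparison is arranged so that it cannot overflow. */
int segment_in_region(unsigned long long mem, unsigned long long memsz,
                      unsigned long long start, unsigned long long end)
{
    if (memsz == 0 || mem < start || mem > end)
        return 0;
    return memsz - 1 <= end - mem;
}
```

A real loader would read /proc/iomem line by line with fgets() and feed
each line to parse_crash_kernel_line() until it returns 1.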

Thanks
Vivek

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Edited kexec_load(2) [kexec_file_load()] man page for review
  2015-01-27 14:24         ` Vivek Goyal
@ 2015-01-28  8:04           ` Michael Kerrisk (man-pages)
  2015-01-28 14:48             ` Vivek Goyal
  0 siblings, 1 reply; 19+ messages in thread
From: Michael Kerrisk (man-pages) @ 2015-01-28  8:04 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: mtk.manpages, lkml, linux-man, kexec, Andy Lutomirski,
	Dave Young, H. Peter Anvin, Borislav Petkov, Eric W. Biederman

Hi Vivek,

On 01/27/2015 03:24 PM, Vivek Goyal wrote:
> On Fri, Jan 16, 2015 at 02:30:25PM +0100, Michael Kerrisk (man-pages) wrote:
> [..]
>>
> 
> Hi Michael,
> 
> Please find my responses below. Sorry, I got stuck in other work and
> forgot about this thread.
> 
>> So, returning to the kexec_segment structure:
>>
>>            struct kexec_segment {
>>                void   *buf;        /* Buffer in user space */
>>                size_t  bufsz;      /* Buffer length in user space */
>>                void   *mem;        /* Physical address of kernel */
>>                size_t  memsz;      /* Physical address length */
>>            };
>>
>> Are the following statements correct:
>> * buf + bufsz identify a memory region in the caller's virtual 
>>   address space that is the source of the copy
> 
> Yes.

Okay.

>> * mem + memsz specify the target memory region of the copy
> 
> Yes.

Okay.

>> * mem is  physical memory address, as seen from kernel space
> 
> Yes.

Okay.

>> * the number of bytes copied from userspace is min(bufsz, memsz)
> 
> Yes. bufsz can not be more than memsz. There is a check to validate
> this in kernel.
> 
> 	result = -EINVAL;
> 	for (i = 0; i < nr_segments; i++) {
> 		if (image->segment[i].bufsz > image->segment[i].memsz)
> 			return result;
> 	}

Okay. So it's more precise to leave discussion of min(bufsz, memsz)
out of the man page and just say: bufsz bytes are transferred;
if bufsz < memsz, then the excess bytes in the target region are 
filled with zeros. Right?

>> * if bufsz > memsz, then excess bytes in the user-space buffer 
>>   are ignored.
> 
> You will get -EINVAL.

Okay.

>> * if memsz > bufsz, then excess bytes in the target kernel buffer
>>   are filled with zeros.
> 
> Yes.

Okay.

>> Also, it seems to me that 'mem' need not be page aligned.
>> Is that correct? Should the man page say something about that?
>> (E.g., is it generally desirable that 'mem' should be page aligned?)
> 
> mem and memsz need to be page aligned. There is a check for that too.
> 
> 	mstart = image->segment[i].mem;
> 	mend   = mstart + image->segment[i].memsz;
> 	if ((mstart & ~PAGE_MASK) || (mend & ~PAGE_MASK))
> 		return result;
> 
>>
>> Likewise, 'memsz' doesn't need to be a page multiple, IIUC.
>> Should the man page say anything about this? For example, should 
>> it note that the initialized kernel segment will be of size:
>>
>>      (mem % PAGE_SIZE + memsz) rounded up to the next multiple of PAGE_SIZE
>>
>> And should it note that if 'mem' is not a multiple of the page size, then
>> the initial bytes (mem % PAGE_SIZE)) in the first page of the kernel segment 
>> will be zeros?
>>
>> (Hopefully I have read kimage_load_normal_segment() correctly.)
> 
> Both mem and memsz need to be page aligned.

And the error if not is EADDRNOTAVAIL, right?

>> And one further question. Other than the fact that they are used with 
>> different system calls, what is the difference between KEXEC_ON_CRASH 
>> and KEXEC_FILE_ON_CRASH?
> 
> Right now I can't think of any other difference. They both tell respective
> system call that this kernel needs to be loaded in reserved memory region
> for crash kernel.

Okay.

I've made various adjustments to the page in the light of your comments 
above. Thanks!

Cheers,

Michael



-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Edited kexec_load(2) [kexec_file_load()] man page for review
  2015-01-28  8:04           ` Michael Kerrisk (man-pages)
@ 2015-01-28 14:48             ` Vivek Goyal
  2015-01-28 15:49               ` Michael Kerrisk (man-pages)
  0 siblings, 1 reply; 19+ messages in thread
From: Vivek Goyal @ 2015-01-28 14:48 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: lkml, linux-man, kexec, Andy Lutomirski, Dave Young,
	H. Peter Anvin, Borislav Petkov, Eric W. Biederman

On Wed, Jan 28, 2015 at 09:04:38AM +0100, Michael Kerrisk (man-pages) wrote:

Hi Michael,

[..]
> >> * the number of bytes copied from userspace is min(bufsz, memsz)
> > 
> > Yes. bufsz can not be more than memsz. There is a check to validate
> > this in kernel.
> > 
> > 	result = -EINVAL;
> > 	for (i = 0; i < nr_segments; i++) {
> > 		if (image->segment[i].bufsz > image->segment[i].memsz)
> > 			return result;
> > 	}
> 
> Okay. So it's more precise to leave discussion of min(bufsz, memsz)
> out of the man page and just say: bufsz bytes are transferred;
> if bufsz < memsz, then the excess bytes in the target region are 
> filled with zeros. Right?

Sounds good.
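
To make the agreed wording concrete, here is a small user-space model of
the copy rule (bufsz bytes are transferred, and the rest of the target
is zeroed). It only imitates the semantics with memcpy() and memset();
the function name is hypothetical, and the kernel does the real work
page by page.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Model of loading one segment: dst/memsz stand in for the target
 * region, src/bufsz for the caller's buffer. */
int load_segment_sketch(unsigned char *dst, size_t memsz,
                        const unsigned char *src, size_t bufsz)
{
    assert(dst != NULL);

    /* A source larger than the target is rejected outright */
    if (bufsz > memsz)
        return -EINVAL;

    /* Transfer exactly bufsz bytes ... */
    if (bufsz > 0)
        memcpy(dst, src, bufsz);

    /* ... and fill the remaining memsz - bufsz bytes with zeros */
    memset(dst + bufsz, 0, memsz - bufsz);
    return 0;
}
```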

[..]
> > Both mem and memsz need to be page aligned.
> 
> And the error if not is EADDRNOTAVAIL, right?

Yes.

> 
> >> And one further question. Other than the fact that they are used with 
> >> different system calls, what is the difference between KEXEC_ON_CRASH 
> >> and KEXEC_FILE_ON_CRASH?
> > 
> > Right now I can't think of any other difference. They both tell respective
> > system call that this kernel needs to be loaded in reserved memory region
> > for crash kernel.
> 
> Okay.
> 
> I've made various adjustments to the page in the light of your comments 
> above. Thanks!

Thank you for following it up and improving kexec man page.

Thanks
Vivek

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Edited kexec_load(2) [kexec_file_load()] man page for review
  2015-01-28 14:48             ` Vivek Goyal
@ 2015-01-28 15:49               ` Michael Kerrisk (man-pages)
  2015-01-28 20:34                 ` Vivek Goyal
  0 siblings, 1 reply; 19+ messages in thread
From: Michael Kerrisk (man-pages) @ 2015-01-28 15:49 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: lkml, linux-man, Kexec Mailing List, Andy Lutomirski, Dave Young,
	H. Peter Anvin, Borislav Petkov, Eric W. Biederman, Andi Kleen

[Dropping Andi into CC, which I should have done to start with, since
he wrote the original page, and might also have some comments]

Hello Vivek,

>> I've made various adjustments to the page in the light of your comments
>> above. Thanks!
>
> Thank you for following it up and improving kexec man page.

You're welcome. So, by now, I've made quite a lot of changes
(including adding a number of cases under ERRORS). I think the revised
kexec_load/kexec_file_load page is pretty much ready to go, but would
you be willing to give the text below a check over first?

Thanks

Michael

====

.\" Copyright (C) 2010 Intel Corporation, Author: Andi Kleen
.\" and Copyright 2014, Vivek Goyal <vgoyal@redhat.com>
.\" and Copyright (c) 2015, Michael Kerrisk <mtk.manpages@gmail.com>
.\"
.\" %%%LICENSE_START(VERBATIM)
.\" Permission is granted to make and distribute verbatim copies of this
.\" manual provided the copyright notice and this permission notice are
.\" preserved on all copies.
.\"
.\" Permission is granted to copy and distribute modified versions of this
.\" manual under the conditions for verbatim copying, provided that the
.\" entire resulting derived work is distributed under the terms of a
.\" permission notice identical to this one.
.\"
.\" Since the Linux kernel and libraries are constantly changing, this
.\" manual page may be incorrect or out-of-date.  The author(s) assume no
.\" responsibility for errors or omissions, or for damages resulting from
.\" the use of the information contained herein.  The author(s) may not
.\" have taken the same level of care in the production of this manual,
.\" which is licensed free of charge, as they might when working
.\" professionally.
.\"
.\" Formatted or processed versions of this manual, if unaccompanied by
.\" the source, must acknowledge the copyright and authors of this work.
.\" %%%LICENSE_END
.\"
.TH KEXEC_LOAD 2 2014-08-19 "Linux" "Linux Programmer's Manual"
.SH NAME
kexec_load, kexec_file_load \- load a new kernel for later execution
.SH SYNOPSIS
.nf
.B #include <linux/kexec.h>

.BI "long kexec_load(unsigned long " entry ", unsigned long " nr_segments ","
.BI "                struct kexec_segment *" segments \
", unsigned long " flags ");"

.BI "long kexec_file_load(int " kernel_fd ", int " initrd_fd ","
.br
.BI "                    unsigned long " cmdline_len  \
", const char *" cmdline ","
.BI "                    unsigned long " flags ");"

.fi
.IR Note :
There are no glibc wrappers for these system calls; see NOTES.
.SH DESCRIPTION
The
.BR kexec_load ()
system call loads a new kernel that can be executed later by
.BR reboot (2).
.PP
The
.I flags
argument is a bit mask that controls the operation of the call.
The following values can be specified in
.IR flags :
.TP
.BR KEXEC_ON_CRASH " (since Linux 2.6.13)"
Execute the new kernel automatically on a system crash.
This "crash kernel" is loaded into an area of reserved memory that
is determined at boot time using the
.I crashkernel
kernel command-line parameter.
The location of this reserved memory is exported to user space via the
.I /proc/iomem
file, in an entry labeled "Crash kernel".
A user-space application can parse this file and prepare a list of
segments (see below) that specify this reserved memory as destination.
If this flag is specified, the kernel checks that the
target segments specified in
.I segments
fall within the reserved region.
.TP
.BR KEXEC_PRESERVE_CONTEXT " (since Linux 2.6.27)"
Preserve the system hardware and
software states before executing the new kernel.
This could be used for system suspend.
This flag is available only if the kernel was configured with
.BR CONFIG_KEXEC_JUMP ,
and is effective only if
.I nr_segments
is greater than 0.
.PP
The high-order bits (corresponding to the mask 0xffff0000) of
.I flags
contain the architecture of the to-be-executed kernel.
Specify (OR) the constant
.B KEXEC_ARCH_DEFAULT
to use the current architecture,
or one of the following architecture constants
.BR KEXEC_ARCH_386 ,
.BR KEXEC_ARCH_68K ,
.BR KEXEC_ARCH_X86_64 ,
.BR KEXEC_ARCH_PPC ,
.BR KEXEC_ARCH_PPC64 ,
.BR KEXEC_ARCH_IA_64 ,
.BR KEXEC_ARCH_ARM ,
.BR KEXEC_ARCH_S390 ,
.BR KEXEC_ARCH_SH ,
.BR KEXEC_ARCH_MIPS ,
and
.BR KEXEC_ARCH_MIPS_LE .
The architecture must be executable on the CPU of the system.

The
.I entry
argument is the physical entry address in the kernel image.
The
.I nr_segments
argument is the number of segments pointed to by the
.I segments
pointer;
the kernel imposes an (arbitrary) limit of 16 on the number of segments.
The
.I segments
argument is an array of
.I kexec_segment
structures which define the kernel layout:
.in +4n
.nf

struct kexec_segment {
    void   *buf;        /* Buffer in user space */
    size_t  bufsz;      /* Buffer length in user space */
    void   *mem;        /* Physical address of kernel */
    size_t  memsz;      /* Physical address length */
};
.fi
.in
.PP
The kernel image defined by
.I segments
is copied from the calling process into
the kernel either in regular
memory or in reserved memory (if
.BR KEXEC_ON_CRASH
is set).
The kernel first performs various sanity checks on the
information passed in
.IR segments .
If these checks pass, the kernel copies the segment data to kernel memory.
Each segment specified in
.I segments
is copied as follows:
.IP * 3
.I buf
and
.I bufsz
identify a memory region in the caller's virtual address space
that is the source of the copy.
The value in
.I bufsz
may not exceed the value in the
.I memsz
field.
.IP *
.I mem
and
.I memsz
specify a physical address range that is the target of the copy.
The values specified in both fields must be multiples of
the system page size.
.IP *
.I bufsz
bytes are copied from the source buffer to the target kernel buffer.
If
.I bufsz
is less than
.IR memsz ,
then the excess bytes in the kernel buffer are zeroed out.
.PP
In case of a normal kexec (i.e., the
.BR KEXEC_ON_CRASH
flag is not set), the segment data is loaded in any available memory
and is moved to the final destination at kexec reboot time (e.g., when the
.BR kexec (8)
command is executed with the
.I \-e
option).

In case of kexec on panic (i.e., the
.BR KEXEC_ON_CRASH
flag is set), the segment data is
loaded to reserved memory at the time of the call, and, after a crash,
the kexec mechanism simply passes control to that kernel.

The
.BR kexec_load ()
system call is available only if the kernel was configured with
.BR CONFIG_KEXEC .
.SS kexec_file_load()
The
.BR kexec_file_load ()
system call is similar to
.BR kexec_load (),
but it takes a different set of arguments.
It reads the kernel to be loaded from the file referred to by the descriptor
.IR kernel_fd ,
and the initrd (initial RAM disk)
to be loaded from the file referred to by the descriptor
.IR initrd_fd .
The
.IR cmdline
argument is a pointer to a buffer containing the command line
for the new kernel.
The
.IR cmdline_len
argument specifies the size of the buffer.
The last byte in the buffer must be a null byte (\(aq\\0\(aq).

The
.IR flags
argument is a bit mask which modifies the behavior of the call.
The following values can be specified in
.IR flags :
.TP
.BR KEXEC_FILE_UNLOAD
Unload the currently loaded kernel.
.TP
.BR KEXEC_FILE_ON_CRASH
Load the new kernel in the memory region reserved for the crash kernel
(as for
.BR KEXEC_ON_CRASH ).
This kernel is booted if the currently running kernel crashes.
.TP
.BR KEXEC_FILE_NO_INITRAMFS
Loading initrd/initramfs is optional.
Specify this flag if no initramfs is being loaded.
If this flag is set, the value passed in
.IR initrd_fd
is ignored.
.PP
The
.BR kexec_file_load ()
.\" See also http://lwn.net/Articles/603116/
system call was added to provide support for systems
where "kexec" loading should be restricted to
only kernels that are signed.
This system call is available only if the kernel was configured with
.BR CONFIG_KEXEC_FILE .
.SH RETURN VALUE
On success, these system calls return 0.
On error, \-1 is returned and
.I errno
is set to indicate the error.
.SH ERRORS
.TP
.B EADDRNOTAVAIL
.\" See kernel/kexec.c::sanity_check_segment_list in the 3.19 kernel source
The
.B KEXEC_ON_CRASH
flag was specified, but the region specified by the
.I mem
and
.I memsz
fields of one of the
.I segments
entries lies outside the range of memory reserved for the crash kernel.
.TP
.B EADDRNOTAVAIL
The value in a
.I mem
or
.I memsz
field in one of the
.I segments
entries is not a multiple of the system page size.
.TP
.B EBADF
.I kernel_fd
or
.I initrd_fd
is not a valid file descriptor.
.TP
.B EBUSY
Another crash kernel is already being loaded
or a crash kernel is already in use.
.TP
.B EINVAL
.I flags
is invalid.
.TP
.B EINVAL
The value of a
.I bufsz
field in one of the
.I segments
entries exceeds the value in the corresponding
.I memsz
field.
.TP
.B EINVAL
.IR nr_segments
exceeds
.BR KEXEC_SEGMENT_MAX
(16).
.TP
.B EINVAL
Two or more of the kernel target buffers overlap.
.TP
.B EINVAL
The value in
.I cmdline[cmdline_len-1]
is not \(aq\\0\(aq.
.TP
.B EINVAL
The file referred to by
.I kernel_fd
or
.I initrd_fd
is empty (length zero).
.TP
.B ENOMEM
Could not allocate memory.
.TP
.B ENOEXEC
.I kernel_fd
does not refer to an open file, or the kernel can't load this file.
.TP
.B EPERM
The caller does not have the
.BR CAP_SYS_BOOT
capability.
.SH VERSIONS
The
.BR kexec_load ()
system call first appeared in Linux 2.6.13.
The
.BR kexec_file_load ()
system call first appeared in Linux 3.17.
.SH CONFORMING TO
These system calls are Linux-specific.
.SH NOTES
Currently, there is no glibc support for these system calls.
Call them using
.BR syscall (2).
.SH SEE ALSO
.BR reboot (2),
.BR syscall (2),
.BR kexec (8)

The kernel source files
.IR Documentation/kdump/kdump.txt
and
.IR Documentation/kernel-parameters.txt .
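
[Illustration of the NOTES section above: since there is no glibc wrapper,
a caller has to go through syscall(2). The sketch below is one possible
wrapper; the name kexec_file_load_sketch is invented here, and it assumes
that <sys/syscall.h> defines SYS_kexec_file_load on the build system,
falling back to ENOSYS where it does not.]

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Thin wrapper with the argument list shown in the SYNOPSIS.
 * Returns 0 on success, or -1 with errno set on error. */
long kexec_file_load_sketch(int kernel_fd, int initrd_fd,
                            unsigned long cmdline_len, const char *cmdline,
                            unsigned long flags)
{
#ifdef SYS_kexec_file_load
    return syscall(SYS_kexec_file_load, kernel_fd, initrd_fd,
                   cmdline_len, cmdline, flags);
#else
    /* Headers too old, or the architecture lacks the system call */
    errno = ENOSYS;
    return -1;
#endif
}
```

[Note that cmdline_len must count the terminating null byte, so an empty
command line is passed as cmdline_len = 1 and cmdline = "".]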

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Edited kexec_load(2) [kexec_file_load()] man page for review
  2015-01-28 15:49               ` Michael Kerrisk (man-pages)
@ 2015-01-28 20:34                 ` Vivek Goyal
  2015-01-28 21:14                   ` Scot Doyle
  0 siblings, 1 reply; 19+ messages in thread
From: Vivek Goyal @ 2015-01-28 20:34 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: lkml, linux-man, Kexec Mailing List, Andy Lutomirski, Dave Young,
	H. Peter Anvin, Borislav Petkov, Eric W. Biederman, Andi Kleen

On Wed, Jan 28, 2015 at 04:49:34PM +0100, Michael Kerrisk (man-pages) wrote:
> [Dropping Andi into CC, which I should have done to start with, since
> he wrote the original page, and might also have some comments]
> 
> Hello Vivek,
> 
> >> I've made various adjustments to the page in the light of your comments
> >> above. Thanks!
> >
> > Thank you for following it up and improving kexec man page.
> 
> You're welcome. So, by now, I've made quite a lot of changes
> (including adding a number of cases under ERRORS). I think the revised
> kexec_load/kexec_file_load page is pretty much ready to go, but would
> you be willing to give the text below a check over first?
> 

Hi Michael,

I had a quick look and it looks good to me.

Thanks
Vivek

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Edited kexec_load(2) [kexec_file_load()] man page for review
  2015-01-28 20:34                 ` Vivek Goyal
@ 2015-01-28 21:14                   ` Scot Doyle
  2015-01-28 21:31                     ` Vivek Goyal
  0 siblings, 1 reply; 19+ messages in thread
From: Scot Doyle @ 2015-01-28 21:14 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages), Vivek Goyal
  Cc: lkml, linux-man, Kexec Mailing List, Andy Lutomirski, Dave Young,
	H. Peter Anvin, Borislav Petkov, Eric W. Biederman, Andi Kleen

On Wed, 28 Jan 2015, Vivek Goyal wrote:
> On Wed, Jan 28, 2015 at 04:49:34PM +0100, Michael Kerrisk (man-pages) wrote:
> > Hello Vivek,
> > 
> > >> I've made various adjustments to the page in the light of your comments
> > >> above. Thanks!
> > >
> > > Thank you for following it up and improving kexec man page.
> > 
> > You're welcome. So, by now, I've made quite a lot of changes
> > (including adding a number of cases under ERRORS). I think the revised
> > kexec_load/kexec_file_load page is pretty much ready to go, but would
> > you be willing to give the text below a check over first?
> > 
> 
> Hi Michael,
> 
> I had a quick look and it looks good to me.
> 
> Thanks
> Vivek

When I tested, kexec_file_load required CONFIG_RELOCATABLE. Is the same 
true for kexec_load? Would it make sense to note this in the man pages 
along with the need for CONFIG_KEXEC_FILE, etc? Or as an error message?

Thanks,
Scot


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Edited kexec_load(2) [kexec_file_load()] man page for review
  2015-01-28 21:14                   ` Scot Doyle
@ 2015-01-28 21:31                     ` Vivek Goyal
  2015-01-28 22:10                       ` Scot Doyle
  0 siblings, 1 reply; 19+ messages in thread
From: Vivek Goyal @ 2015-01-28 21:31 UTC (permalink / raw)
  To: Scot Doyle
  Cc: Michael Kerrisk (man-pages),
	lkml, linux-man, Kexec Mailing List, Andy Lutomirski, Dave Young,
	H. Peter Anvin, Borislav Petkov, Eric W. Biederman, Andi Kleen

On Wed, Jan 28, 2015 at 09:14:03PM +0000, Scot Doyle wrote:
> On Wed, 28 Jan 2015, Vivek Goyal wrote:
> > On Wed, Jan 28, 2015 at 04:49:34PM +0100, Michael Kerrisk (man-pages) wrote:
> > > Hello Vivek,
> > > 
> > > >> I've made various adjustments to the page in the light of your comments
> > > >> above. Thanks!
> > > >
> > > > Thank you for following it up and improving kexec man page.
> > > 
> > > You're welcome. So, by now, I've made quite a lot of changes
> > > (including adding a number of cases under ERRORS). I think the revised
> > > kexec_load/kexec_file_load page is pretty much ready to go, but would
> > > you be willing to give the text below a check over first?
> > > 
> > 
> > Hi Michael,
> > 
> > I had a quick look and it looks good to me.
> > 
> > Thanks
> > Vivek
> 
> When I tested, kexec_file_load required CONFIG_RELOCATABLE. Is the same 
> true for kexec_load? Would it make sense to note this in the man pages 
> along with the need for CONFIG_KEXEC_FILE, etc? Or as an error message?

Hmm.., I can't see an explicit dependency between RELOCATABLE and
KEXEC. Both KEXEC and KEXEC_FILE should be able to load a kernel
even if it had RELOCATABLE=n.

Just that kernel will run from the address it has been built for.

Thanks
Vivek

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Edited kexec_load(2) [kexec_file_load()] man page for review
  2015-01-28 21:31                     ` Vivek Goyal
@ 2015-01-28 22:10                       ` Scot Doyle
  2015-01-28 22:25                         ` Vivek Goyal
  0 siblings, 1 reply; 19+ messages in thread
From: Scot Doyle @ 2015-01-28 22:10 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: Michael Kerrisk (man-pages),
	lkml, linux-man, Kexec Mailing List, Andy Lutomirski, Dave Young,
	H. Peter Anvin, Borislav Petkov, Eric W. Biederman, Andi Kleen

On Wed, 28 Jan 2015, Vivek Goyal wrote:
> On Wed, Jan 28, 2015 at 09:14:03PM +0000, Scot Doyle wrote:
> > On Wed, 28 Jan 2015, Vivek Goyal wrote:
> > > On Wed, Jan 28, 2015 at 04:49:34PM +0100, Michael Kerrisk (man-pages) wrote:
> > > > Hello Vivek,
> > > > 
> > > > >> I've made various adjustments to the page in the light of your comments
> > > > >> above. Thanks!
> > > > >
> > > > > Thank you for following it up and improving kexec man page.
> > > > 
> > > > You're welcome. So, by now, I've made quite a lot of changes
> > > > (including adding a number of cases under ERRORS). I think the revised
> > > > kexec_load/kexec_file_load page is pretty much ready to go, but would
> > > > you be willing to give the text below a check over first?
> > > > 
> > > 
> > > Hi Michael,
> > > 
> > > I had a quick look and it looks good to me.
> > > 
> > > Thanks
> > > Vivek
> > 
> > When I tested, kexec_file_load required CONFIG_RELOCATABLE. Is the same 
> > true for kexec_load? Would it make sense to note this in the man pages 
> > along with the need for CONFIG_KEXEC_FILE, etc? Or as an error message?
> 
> Hmm... I can't see an explicit dependency between RELOCATABLE and
> KEXEC. Both KEXEC and KEXEC_FILE should be able to load a kernel
> even if it had RELOCATABLE=n.
> 
> Just that kernel will run from the address it has been built for.
> 
> Thanks
> Vivek

Confusing, right? kexec_file_load returns -ENOEXEC and dmesg says 
"kexec-bzImage64: XLF_CAN_BE_LOADED_ABOVE_4G is not set." which leads to
arch/x86/boot/header.S line 396:

#if defined(CONFIG_RELOCATABLE) && defined(CONFIG_X86_64)  
   /* kernel/boot_param/ramdisk could be loaded above 4g */
# define XLF1 XLF_CAN_BE_LOADED_ABOVE_4G
#else
# define XLF1 0
#endif



* Re: Edited kexec_load(2) [kexec_file_load()] man page for review
  2015-01-28 22:10                       ` Scot Doyle
@ 2015-01-28 22:25                         ` Vivek Goyal
  2015-01-29  1:27                           ` Scot Doyle
  0 siblings, 1 reply; 19+ messages in thread
From: Vivek Goyal @ 2015-01-28 22:25 UTC (permalink / raw)
  To: Scot Doyle
  Cc: Michael Kerrisk (man-pages),
	lkml, linux-man, Kexec Mailing List, Andy Lutomirski, Dave Young,
	H. Peter Anvin, Borislav Petkov, Eric W. Biederman, Andi Kleen

On Wed, Jan 28, 2015 at 10:10:59PM +0000, Scot Doyle wrote:
> On Wed, 28 Jan 2015, Vivek Goyal wrote:
> > On Wed, Jan 28, 2015 at 09:14:03PM +0000, Scot Doyle wrote:
> > > On Wed, 28 Jan 2015, Vivek Goyal wrote:
> > > > On Wed, Jan 28, 2015 at 04:49:34PM +0100, Michael Kerrisk (man-pages) wrote:
> > > > > Hello Vivek,
> > > > > 
> > > > > >> I've made various adjustments to the page in the light of your comments
> > > > > >> above. Thanks!
> > > > > >
> > > > > > Thank you for following it up and improving kexec man page.
> > > > > 
> > > > > You're welcome. So, by now, I've made quite a lot of changes
> > > > > (including adding a number of cases under ERRORS). I think the revised
> > > > > kexec_load/kexec_file_load page is pretty much ready to go, but would
> > > > > you be willing to give the text below a check over first?
> > > > > 
> > > > 
> > > > Hi Michael,
> > > > 
> > > > I had a quick look and it looks good to me.
> > > > 
> > > > Thanks
> > > > Vivek
> > > 
> > > When I tested, kexec_file_load required CONFIG_RELOCATABLE. Is the same 
> > > true for kexec_load? Would it make sense to note this in the man pages 
> > > along with the need for CONFIG_KEXEC_FILE, etc? Or as an error message?
> > 
> > Hmm... I can't see an explicit dependency between RELOCATABLE and
> > KEXEC. Both KEXEC and KEXEC_FILE should be able to load a kernel
> > even if it had RELOCATABLE=n.
> > 
> > Just that kernel will run from the address it has been built for.
> > 
> > Thanks
> > Vivek
> 
> Confusing, right? kexec_file_load returns -ENOEXEC and dmesg says 
> "kexec-bzImage64: XLF_CAN_BE_LOADED_ABOVE_4G is not set." which leads to
> arch/x86/boot/header.S line 396:
> 
> #if defined(CONFIG_RELOCATABLE) && defined(CONFIG_X86_64)  
>    /* kernel/boot_param/ramdisk could be loaded above 4g */
> # define XLF1 XLF_CAN_BE_LOADED_ABOVE_4G
> #else
> # define XLF1 0
> #endif

Ah, this one. Actually, the generic kexec file loading implementation does
not impose this restriction. It is the image-specific loader that
decides what kind of bzImage it can load.

The current implementation (kexec-bzimage64.c) supports loading only
bzImages that are 64-bit and can be loaded above 4G. This simplifies
the implementation of the loader.

But nothing prevents one from implementing other image
loaders.

So instead of saying that kexec_file_load() depends on CONFIG_RELOCATABLE,
it might be better to say in the man page that this system call currently
supports loading only a bzImage that is 64-bit and can be loaded
above 4G.

Thanks
Vivek
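
As a rough illustration of the check described above, the following C
sketch inspects the setup header of a bzImage held in memory and reports
whether it advertises both XLF_KERNEL_64 and XLF_CAN_BE_LOADED_ABOVE_4G.
The offsets and bit values are taken from Documentation/x86/boot.txt. The
function name bzimage_loadable_above_4g() and the BZ_* macro names are
made up for this example, and the sketch covers only a subset of what the
in-kernel loader in kexec-bzimage64.c verifies.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Setup-header offsets from the x86 boot protocol (boot.txt). */
#define BZ_HDR_MAGIC_OFF   0x202u  /* 4-byte magic "HdrS" */
#define BZ_HDR_VERSION_OFF 0x206u  /* boot protocol version, LE u16 */
#define BZ_XLOADFLAGS_OFF  0x236u  /* xloadflags, LE u16, protocol 2.12+ */

/* xloadflags bits as documented in boot.txt. */
#define XLF_KERNEL_64              (1u << 0)
#define XLF_CAN_BE_LOADED_ABOVE_4G (1u << 1)

/* Read a little-endian 16-bit value regardless of host byte order. */
static uint16_t read_le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Hypothetical helper (not a kernel or libc API).
 * Returns  1 if the image is 64-bit and loadable above 4G,
 *          0 if it is a bzImage that lacks those properties,
 *         -1 if the buffer does not look like a bzImage header.
 */
int bzimage_loadable_above_4g(const unsigned char *buf, size_t len)
{
    uint16_t flags;

    if (len < BZ_XLOADFLAGS_OFF + 2u)
        return -1;
    if (memcmp(buf + BZ_HDR_MAGIC_OFF, "HdrS", 4) != 0)
        return -1;

    /* xloadflags only exists from boot protocol 2.12 (0x020c) onward. */
    if (read_le16(buf + BZ_HDR_VERSION_OFF) < 0x020c)
        return 0;

    flags = read_le16(buf + BZ_XLOADFLAGS_OFF);
    return (flags & XLF_KERNEL_64) && (flags & XLF_CAN_BE_LOADED_ABOVE_4G);
}
```

A caller would read at least the first 0x238 bytes of the kernel image
into a buffer and pass it to the helper. A return value of 0 corresponds
to the case in which kexec_file_load() is expected to fail with ENOEXEC.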


* Re: Edited kexec_load(2) [kexec_file_load()] man page for review
  2015-01-28 22:25                         ` Vivek Goyal
@ 2015-01-29  1:27                           ` Scot Doyle
  2015-01-29  5:39                             ` Michael Kerrisk (man-pages)
  0 siblings, 1 reply; 19+ messages in thread
From: Scot Doyle @ 2015-01-29  1:27 UTC (permalink / raw)
  To: Vivek Goyal, Michael Kerrisk (man-pages)
  Cc: lkml, linux-man, Kexec Mailing List, Andy Lutomirski, Dave Young,
	H. Peter Anvin, Borislav Petkov, Eric W. Biederman, Andi Kleen

On Wed, 28 Jan 2015, Vivek Goyal wrote:
> On Wed, Jan 28, 2015 at 10:10:59PM +0000, Scot Doyle wrote:
> > On Wed, 28 Jan 2015, Vivek Goyal wrote:
> > > On Wed, Jan 28, 2015 at 09:14:03PM +0000, Scot Doyle wrote:
> > > > When I tested, kexec_file_load required CONFIG_RELOCATABLE. Is the same 
> > > > true for kexec_load? Would it make sense to note this in the man pages 
> > > > along with the need for CONFIG_KEXEC_FILE, etc? Or as an error message?
> > > 
> > > Hmm... I can't see an explicit dependency between RELOCATABLE and
> > > KEXEC. Both KEXEC and KEXEC_FILE should be able to load a kernel
> > > even if it had RELOCATABLE=n.
> > > 
> > > Just that kernel will run from the address it has been built for.
> > > 
> > > Thanks
> > > Vivek
> > 
> > Confusing, right? kexec_file_load returns -ENOEXEC and dmesg says 
> > "kexec-bzImage64: XLF_CAN_BE_LOADED_ABOVE_4G is not set." which leads to
> > arch/x86/boot/header.S line 396:
> > 
> > #if defined(CONFIG_RELOCATABLE) && defined(CONFIG_X86_64)  
> >    /* kernel/boot_param/ramdisk could be loaded above 4g */
> > # define XLF1 XLF_CAN_BE_LOADED_ABOVE_4G
> > #else
> > # define XLF1 0
> > #endif
> 
> Ah, this one. Actually generic kexec file loading implementation does not
> impose this restriction. It is the image specific loader part which
> decides what kind of bzImage it can load.
> 
> Current implementation (kexec-bzimage64.c), is only supporting loading
> bzImages which are 64bit and can be loaded above 4G. This simplifies
> the implementation of loader.
> 
> But there is nothing which prevents one from implementing other image
> loaders.
> 
> So instead of saying that kexec_file_load() depends on CONFIG_RELOCATABLE,
> it might be better to say in man page that currently this system call
> supports only loading a bzImage which is 64bit and which can be loaded
> above 4G too.
> 
> Thanks
> Vivek

Thanks, I agree, and think it would make sense to list these requirements
as part of the page's ENOEXEC error.


* Re: Edited kexec_load(2) [kexec_file_load()] man page for review
  2015-01-29  1:27                           ` Scot Doyle
@ 2015-01-29  5:39                             ` Michael Kerrisk (man-pages)
  2015-01-29 16:06                               ` Scot Doyle
  0 siblings, 1 reply; 19+ messages in thread
From: Michael Kerrisk (man-pages) @ 2015-01-29  5:39 UTC (permalink / raw)
  To: Scot Doyle
  Cc: Vivek Goyal, lkml, linux-man, Kexec Mailing List,
	Andy Lutomirski, Dave Young, H. Peter Anvin, Borislav Petkov,
	Eric W. Biederman, Andi Kleen

On 29 January 2015 at 02:27, Scot Doyle <lkml14@scotdoyle.com> wrote:
> On Wed, 28 Jan 2015, Vivek Goyal wrote:
>> On Wed, Jan 28, 2015 at 10:10:59PM +0000, Scot Doyle wrote:
>> > On Wed, 28 Jan 2015, Vivek Goyal wrote:
>> > > On Wed, Jan 28, 2015 at 09:14:03PM +0000, Scot Doyle wrote:
>> > > > When I tested, kexec_file_load required CONFIG_RELOCATABLE. Is the same
>> > > > true for kexec_load? Would it make sense to note this in the man pages
>> > > > along with the need for CONFIG_KEXEC_FILE, etc? Or as an error message?
>> > >
>> > > Hmm... I can't see an explicit dependency between RELOCATABLE and
>> > > KEXEC. Both KEXEC and KEXEC_FILE should be able to load a kernel
>> > > even if it had RELOCATABLE=n.
>> > >
>> > > Just that kernel will run from the address it has been built for.
>> > >
>> > > Thanks
>> > > Vivek
>> >
>> > Confusing, right? kexec_file_load returns -ENOEXEC and dmesg says
>> > "kexec-bzImage64: XLF_CAN_BE_LOADED_ABOVE_4G is not set." which leads to
>> > arch/x86/boot/header.S line 396:
>> >
>> > #if defined(CONFIG_RELOCATABLE) && defined(CONFIG_X86_64)
>> >    /* kernel/boot_param/ramdisk could be loaded above 4g */
>> > # define XLF1 XLF_CAN_BE_LOADED_ABOVE_4G
>> > #else
>> > # define XLF1 0
>> > #endif
>>
>> Ah, this one. Actually generic kexec file loading implementation does not
>> impose this restriction. It is the image specific loader part which
>> decides what kind of bzImage it can load.
>>
>> Current implementation (kexec-bzimage64.c), is only supporting loading
>> bzImages which are 64bit and can be loaded above 4G. This simplifies
>> the implementation of loader.
>>
>> But there is nothing which prevents one from implementing other image
>> loaders.
>>
>> So instead of saying that kexec_file_load() depends on CONFIG_RELOCATABLE,
>> it might be better to say in man page that currently this system call
>> supports only loading a bzImage which is 64bit and which can be loaded
>> above 4G too.
>>
>> Thanks
>> Vivek
>
> Thanks, I agree, and think it would make sense to list them as part of the
> page's ENOEXEC error.

Scot, could you then phrase a couple of sentences that capture the
details, so I can add them to the ENOEXEC error?

Thanks,

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/


* Re: Edited kexec_load(2) [kexec_file_load()] man page for review
  2015-01-29  5:39                             ` Michael Kerrisk (man-pages)
@ 2015-01-29 16:06                               ` Scot Doyle
  2015-01-30 15:25                                 ` Michael Kerrisk (man-pages)
  0 siblings, 1 reply; 19+ messages in thread
From: Scot Doyle @ 2015-01-29 16:06 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Vivek Goyal, lkml, linux-man, Kexec Mailing List,
	Andy Lutomirski, Dave Young, H. Peter Anvin, Borislav Petkov,
	Eric W. Biederman, Andi Kleen

On Thu, 29 Jan 2015, Michael Kerrisk (man-pages) wrote:
> On 29 January 2015 at 02:27, Scot Doyle <lkml14@scotdoyle.com> wrote:
> > On Wed, 28 Jan 2015, Vivek Goyal wrote:
> >> On Wed, Jan 28, 2015 at 10:10:59PM +0000, Scot Doyle wrote:
> >> > On Wed, 28 Jan 2015, Vivek Goyal wrote:
> >> > > On Wed, Jan 28, 2015 at 09:14:03PM +0000, Scot Doyle wrote:
> >> > > > When I tested, kexec_file_load required CONFIG_RELOCATABLE. Is the same
> >> > > > true for kexec_load? Would it make sense to note this in the man pages
> >> > > > along with the need for CONFIG_KEXEC_FILE, etc? Or as an error message?
> >> > >
> >> > > Hmm... I can't see an explicit dependency between RELOCATABLE and
> >> > > KEXEC. Both KEXEC and KEXEC_FILE should be able to load a kernel
> >> > > even if it had RELOCATABLE=n.
> >> > >
> >> > > Just that kernel will run from the address it has been built for.
> >> > >
> >> > > Thanks
> >> > > Vivek
> >> >
> >> > Confusing, right? kexec_file_load returns -ENOEXEC and dmesg says
> >> > "kexec-bzImage64: XLF_CAN_BE_LOADED_ABOVE_4G is not set." which leads to
> >> > arch/x86/boot/header.S line 396:
> >> >
> >> > #if defined(CONFIG_RELOCATABLE) && defined(CONFIG_X86_64)
> >> >    /* kernel/boot_param/ramdisk could be loaded above 4g */
> >> > # define XLF1 XLF_CAN_BE_LOADED_ABOVE_4G
> >> > #else
> >> > # define XLF1 0
> >> > #endif
> >>
> >> Ah, this one. Actually generic kexec file loading implementation does not
> >> impose this restriction. It is the image specific loader part which
> >> decides what kind of bzImage it can load.
> >>
> >> Current implementation (kexec-bzimage64.c), is only supporting loading
> >> bzImages which are 64bit and can be loaded above 4G. This simplifies
> >> the implementation of loader.
> >>
> >> But there is nothing which prevents one from implementing other image
> >> loaders.
> >>
> >> So instead of saying that kexec_file_load() depends on CONFIG_RELOCATABLE,
> >> it might be better to say in man page that currently this system call
> >> supports only loading a bzImage which is 64bit and which can be loaded
> >> above 4G too.
> >>
> >> Thanks
> >> Vivek
> >
> > Thanks, I agree, and think it would make sense to list them as part of the
> > page's ENOEXEC error.
> 
> Scot, could you then phrase a couple of sentences that capture the
> details, so I can add them to the ENOEXEC error?
> 
> Thanks,
> 
> Michael

Yes, maybe something like "kernel_fd does not refer to an open file, or 
the file type is not supported. Currently, the file must be a bzImage
and contain an x86 kernel loadable above 4G in memory (see 
Documentation/x86/boot.txt)."?

boot.txt explains that loading above 4G implies 64-bit and is specified 
via a bit in xloadflags added in Linux 3.8.
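
From user space, the failure discussed in this thread shows up as errno
set to ENOEXEC. Below is a minimal C sketch, assuming the call is made
through syscall(2) and that <sys/syscall.h> defines SYS_kexec_file_load on
the build host. The helper names explain_kexec_file_load_errno() and
try_kexec_file_load() are hypothetical, and the ENOSYS and EPERM
explanations are general expectations rather than something stated in
this thread.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: map an errno value to a human-readable hint. */
const char *explain_kexec_file_load_errno(int err)
{
    switch (err) {
    case ENOEXEC:
        return "kernel_fd does not refer to an open file, or the file "
               "type is not supported (currently an x86 bzImage "
               "loadable above 4G is required)";
    case ENOSYS:
        return "kexec_file_load() is not available "
               "(kernel built without CONFIG_KEXEC_FILE?)";
    case EPERM:
        return "the caller lacks the CAP_SYS_BOOT capability";
    default:
        return strerror(err);
    }
}

/*
 * Hypothetical wrapper: returns 0 on success, otherwise the errno value.
 * cmdline_len must include the terminating null byte.
 */
int try_kexec_file_load(int kernel_fd, int initrd_fd,
                        const char *cmdline, unsigned long flags)
{
#ifdef SYS_kexec_file_load
    long ret = syscall(SYS_kexec_file_load, kernel_fd, initrd_fd,
                       (unsigned long)strlen(cmdline) + 1, cmdline, flags);
    return ret == 0 ? 0 : errno;
#else
    /* Assumption: headers without the syscall number mean no support. */
    (void)kernel_fd; (void)initrd_fd; (void)cmdline; (void)flags;
    return ENOSYS;
#endif
}
```

A caller that gets ENOEXEC back can then check dmesg for a message such
as "XLF_CAN_BE_LOADED_ABOVE_4G is not set." to see which property of the
image the loader rejected.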


* Re: Edited kexec_load(2) [kexec_file_load()] man page for review
  2015-01-29 16:06                               ` Scot Doyle
@ 2015-01-30 15:25                                 ` Michael Kerrisk (man-pages)
  0 siblings, 0 replies; 19+ messages in thread
From: Michael Kerrisk (man-pages) @ 2015-01-30 15:25 UTC (permalink / raw)
  To: Scot Doyle
  Cc: mtk.manpages, Vivek Goyal, lkml, linux-man, Kexec Mailing List,
	Andy Lutomirski, Dave Young, H. Peter Anvin, Borislav Petkov,
	Eric W. Biederman, Andi Kleen

On 01/29/2015 05:06 PM, Scot Doyle wrote:
> On Thu, 29 Jan 2015, Michael Kerrisk (man-pages) wrote:
>> On 29 January 2015 at 02:27, Scot Doyle <lkml14@scotdoyle.com> wrote:
>>> On Wed, 28 Jan 2015, Vivek Goyal wrote:
>>>> On Wed, Jan 28, 2015 at 10:10:59PM +0000, Scot Doyle wrote:
>>>>> On Wed, 28 Jan 2015, Vivek Goyal wrote:
>>>>>> On Wed, Jan 28, 2015 at 09:14:03PM +0000, Scot Doyle wrote:
>>>>>>> When I tested, kexec_file_load required CONFIG_RELOCATABLE. Is the same
>>>>>>> true for kexec_load? Would it make sense to note this in the man pages
>>>>>>> along with the need for CONFIG_KEXEC_FILE, etc? Or as an error message?
>>>>>>
>>>>>> Hmm... I can't see an explicit dependency between RELOCATABLE and
>>>>>> KEXEC. Both KEXEC and KEXEC_FILE should be able to load a kernel
>>>>>> even if it had RELOCATABLE=n.
>>>>>>
>>>>>> Just that kernel will run from the address it has been built for.
>>>>>>
>>>>>> Thanks
>>>>>> Vivek
>>>>>
>>>>> Confusing, right? kexec_file_load returns -ENOEXEC and dmesg says
>>>>> "kexec-bzImage64: XLF_CAN_BE_LOADED_ABOVE_4G is not set." which leads to
>>>>> arch/x86/boot/header.S line 396:
>>>>>
>>>>> #if defined(CONFIG_RELOCATABLE) && defined(CONFIG_X86_64)
>>>>>    /* kernel/boot_param/ramdisk could be loaded above 4g */
>>>>> # define XLF1 XLF_CAN_BE_LOADED_ABOVE_4G
>>>>> #else
>>>>> # define XLF1 0
>>>>> #endif
>>>>
>>>> Ah, this one. Actually generic kexec file loading implementation does not
>>>> impose this restriction. It is the image specific loader part which
>>>> decides what kind of bzImage it can load.
>>>>
>>>> Current implementation (kexec-bzimage64.c), is only supporting loading
>>>> bzImages which are 64bit and can be loaded above 4G. This simplifies
>>>> the implementation of loader.
>>>>
>>>> But there is nothing which prevents one from implementing other image
>>>> loaders.
>>>>
>>>> So instead of saying that kexec_file_load() depends on CONFIG_RELOCATABLE,
>>>> it might be better to say in man page that currently this system call
>>>> supports only loading a bzImage which is 64bit and which can be loaded
>>>> above 4G too.
>>>>
>>>> Thanks
>>>> Vivek
>>>
>>> Thanks, I agree, and think it would make sense to list them as part of the
>>> page's ENOEXEC error.
>>
>> Scot, could you then phrase a couple of sentences that capture the
>> details, so I can add them to the ENOEXEC error?
>>
>> Thanks,
>>
>> Michael
> 
> Yes, maybe something like "kernel_fd does not refer to an open file, or 
> the file type is not supported. Currently, the file must be a bzImage
> and contain an x86 kernel loadable above 4G in memory (see 
> Documentation/x86/boot.txt)."?
> 
> boot.txt explains that loading above 4G implies 64-bit and is specified 
> via a bit in xloadflags added in Linux 3.8.

Added and pushed. Thanks, Scot.

Cheers,

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/


Thread overview: 19+ messages
2014-11-09 19:17 Edited kexec_load(2) [kexec_file_load()] man page for review Michael Kerrisk (man-pages)
2014-11-11 21:30 ` Vivek Goyal
2015-01-07 21:17   ` Michael Kerrisk (man-pages)
2015-01-12 22:16     ` Vivek Goyal
2015-01-16 13:30       ` Michael Kerrisk (man-pages)
2015-01-27  8:07         ` Michael Kerrisk (man-pages)
2015-01-27 14:24         ` Vivek Goyal
2015-01-28  8:04           ` Michael Kerrisk (man-pages)
2015-01-28 14:48             ` Vivek Goyal
2015-01-28 15:49               ` Michael Kerrisk (man-pages)
2015-01-28 20:34                 ` Vivek Goyal
2015-01-28 21:14                   ` Scot Doyle
2015-01-28 21:31                     ` Vivek Goyal
2015-01-28 22:10                       ` Scot Doyle
2015-01-28 22:25                         ` Vivek Goyal
2015-01-29  1:27                           ` Scot Doyle
2015-01-29  5:39                             ` Michael Kerrisk (man-pages)
2015-01-29 16:06                               ` Scot Doyle
2015-01-30 15:25                                 ` Michael Kerrisk (man-pages)
