stable.vger.kernel.org archive mirror
* [PATCH v2 01/12] kexec: Allow architecture code to opt-out at runtime
       [not found] <20210913155603.28383-1-joro@8bytes.org>
@ 2021-09-13 15:55 ` Joerg Roedel
  2021-11-01 16:10   ` Borislav Petkov
  2021-09-13 15:55 ` [PATCH v2 02/12] x86/kexec/64: Forbid kexec when running as an SEV-ES guest Joerg Roedel
  1 sibling, 1 reply; 8+ messages in thread
From: Joerg Roedel @ 2021-09-13 15:55 UTC
  To: x86
  Cc: Eric Biederman, kexec, Joerg Roedel, stable, hpa,
	Andy Lutomirski, Dave Hansen, Peter Zijlstra, Jiri Slaby,
	Dan Williams, Tom Lendacky, Juergen Gross, Kees Cook,
	David Rientjes, Cfir Cohen, Erdem Aktas, Masami Hiramatsu,
	Mike Stunes, Sean Christopherson, Martin Radev, Arvind Sankar,
	Joerg Roedel, linux-coco, linux-kernel, kvm, virtualization

From: Joerg Roedel <jroedel@suse.de>

Allow a runtime opt-out of kexec support for architecture code in case
the kernel is running in an environment where kexec is not properly
supported yet.

This will be used on x86 when the kernel is running as an SEV-ES
guest. SEV-ES guests need special handling for kexec to hand over all
CPUs to the new kernel. This requires special hypervisor support and
handling code in the guest which is not yet implemented.

Cc: stable@vger.kernel.org # v5.10+
Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 include/linux/kexec.h |  1 +
 kernel/kexec.c        | 14 ++++++++++++++
 kernel/kexec_file.c   |  9 +++++++++
 3 files changed, 24 insertions(+)

diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 0c994ae37729..85c30dcd0bdc 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -201,6 +201,7 @@ int arch_kexec_kernel_verify_sig(struct kimage *image, void *buf,
 				 unsigned long buf_len);
 #endif
 int arch_kexec_locate_mem_hole(struct kexec_buf *kbuf);
+bool arch_kexec_supported(void);
 
 extern int kexec_add_buffer(struct kexec_buf *kbuf);
 int kexec_locate_mem_hole(struct kexec_buf *kbuf);
diff --git a/kernel/kexec.c b/kernel/kexec.c
index b5e40f069768..275cda429380 100644
--- a/kernel/kexec.c
+++ b/kernel/kexec.c
@@ -190,11 +190,25 @@ static int do_kexec_load(unsigned long entry, unsigned long nr_segments,
  * that to happen you need to do that yourself.
  */
 
+bool __weak arch_kexec_supported(void)
+{
+	return true;
+}
+
 static inline int kexec_load_check(unsigned long nr_segments,
 				   unsigned long flags)
 {
 	int result;
 
+	/*
+	 * The architecture may support kexec in general, but the kernel could
+	 * run in an environment where it is not (yet) possible to execute a new
+	 * kernel. Allow the architecture code to opt-out of kexec support when
+	 * it is running in such an environment.
+	 */
+	if (!arch_kexec_supported())
+		return -ENOSYS;
+
 	/* We only trust the superuser with rebooting the system. */
 	if (!capable(CAP_SYS_BOOT) || kexec_load_disabled)
 		return -EPERM;
diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index 33400ff051a8..96d08a512e9c 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -358,6 +358,15 @@ SYSCALL_DEFINE5(kexec_file_load, int, kernel_fd, int, initrd_fd,
 	int ret = 0, i;
 	struct kimage **dest_image, *image;
 
+	/*
+	 * The architecture may support kexec in general, but the kernel could
+	 * run in an environment where it is not (yet) possible to execute a new
+	 * kernel. Allow the architecture code to opt-out of kexec support when
+	 * it is running in such an environment.
+	 */
+	if (!arch_kexec_supported())
+		return -ENOSYS;
+
 	/* We only trust the superuser with rebooting the system. */
 	if (!capable(CAP_SYS_BOOT) || kexec_load_disabled)
 		return -EPERM;
-- 
2.33.0



* [PATCH v2 02/12] x86/kexec/64: Forbid kexec when running as an SEV-ES guest
       [not found] <20210913155603.28383-1-joro@8bytes.org>
  2021-09-13 15:55 ` [PATCH v2 01/12] kexec: Allow architecture code to opt-out at runtime Joerg Roedel
@ 2021-09-13 15:55 ` Joerg Roedel
  1 sibling, 0 replies; 8+ messages in thread
From: Joerg Roedel @ 2021-09-13 15:55 UTC
  To: x86
  Cc: Eric Biederman, kexec, Joerg Roedel, stable, hpa,
	Andy Lutomirski, Dave Hansen, Peter Zijlstra, Jiri Slaby,
	Dan Williams, Tom Lendacky, Juergen Gross, Kees Cook,
	David Rientjes, Cfir Cohen, Erdem Aktas, Masami Hiramatsu,
	Mike Stunes, Sean Christopherson, Martin Radev, Arvind Sankar,
	Joerg Roedel, linux-coco, linux-kernel, kvm, virtualization

From: Joerg Roedel <jroedel@suse.de>

For now, kexec is not supported when running as an SEV-ES guest. Doing
so requires additional hypervisor support and special code to hand
over the CPUs to the new kernel in a safe way.

Until this is implemented, do not support kexec in SEV-ES guests.

Cc: stable@vger.kernel.org # v5.10+
Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 arch/x86/kernel/machine_kexec_64.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index 131f30fdcfbd..a8e16a411b40 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -591,3 +591,11 @@ void arch_kexec_pre_free_pages(void *vaddr, unsigned int pages)
 	 */
 	set_memory_encrypted((unsigned long)vaddr, pages);
 }
+
+/*
+ * Kexec is not supported in SEV-ES guests yet
+ */
+bool arch_kexec_supported(void)
+{
+	return !sev_es_active();
+}
-- 
2.33.0



* Re: [PATCH v2 01/12] kexec: Allow architecture code to opt-out at runtime
  2021-09-13 15:55 ` [PATCH v2 01/12] kexec: Allow architecture code to opt-out at runtime Joerg Roedel
@ 2021-11-01 16:10   ` Borislav Petkov
  2021-11-01 21:11     ` Eric W. Biederman
  0 siblings, 1 reply; 8+ messages in thread
From: Borislav Petkov @ 2021-11-01 16:10 UTC
  To: Eric Biederman
  Cc: Joerg Roedel, x86, kexec, Joerg Roedel, stable, hpa,
	Andy Lutomirski, Dave Hansen, Peter Zijlstra, Jiri Slaby,
	Dan Williams, Tom Lendacky, Juergen Gross, Kees Cook,
	David Rientjes, Cfir Cohen, Erdem Aktas, Masami Hiramatsu,
	Mike Stunes, Sean Christopherson, Martin Radev, Arvind Sankar,
	linux-coco, linux-kernel, kvm, virtualization

On Mon, Sep 13, 2021 at 05:55:52PM +0200, Joerg Roedel wrote:
> From: Joerg Roedel <jroedel@suse.de>
> 
> Allow a runtime opt-out of kexec support for architecture code in case
> the kernel is running in an environment where kexec is not properly
> supported yet.
> 
> This will be used on x86 when the kernel is running as an SEV-ES
> guest. SEV-ES guests need special handling for kexec to hand over all
> CPUs to the new kernel. This requires special hypervisor support and
> handling code in the guest which is not yet implemented.
> 
> Cc: stable@vger.kernel.org # v5.10+
> Signed-off-by: Joerg Roedel <jroedel@suse.de>
> ---
>  include/linux/kexec.h |  1 +
>  kernel/kexec.c        | 14 ++++++++++++++
>  kernel/kexec_file.c   |  9 +++++++++
>  3 files changed, 24 insertions(+)

I guess I can take this through the tip tree along with the next one.

Eric?

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [PATCH v2 01/12] kexec: Allow architecture code to opt-out at runtime
  2021-11-01 16:10   ` Borislav Petkov
@ 2021-11-01 21:11     ` Eric W. Biederman
  2021-11-02 16:37       ` Joerg Roedel
                         ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Eric W. Biederman @ 2021-11-01 21:11 UTC
  To: Borislav Petkov
  Cc: Joerg Roedel, x86, kexec, Joerg Roedel, stable, hpa,
	Andy Lutomirski, Dave Hansen, Peter Zijlstra, Jiri Slaby,
	Dan Williams, Tom Lendacky, Juergen Gross, Kees Cook,
	David Rientjes, Cfir Cohen, Erdem Aktas, Masami Hiramatsu,
	Mike Stunes, Sean Christopherson, Martin Radev, Arvind Sankar,
	linux-coco, linux-kernel, kvm, virtualization

Borislav Petkov <bp@alien8.de> writes:

> On Mon, Sep 13, 2021 at 05:55:52PM +0200, Joerg Roedel wrote:
>> From: Joerg Roedel <jroedel@suse.de>
>> 
>> Allow a runtime opt-out of kexec support for architecture code in case
>> the kernel is running in an environment where kexec is not properly
>> supported yet.
>> 
>> This will be used on x86 when the kernel is running as an SEV-ES
>> guest. SEV-ES guests need special handling for kexec to hand over all
>> CPUs to the new kernel. This requires special hypervisor support and
>> handling code in the guest which is not yet implemented.
>> 
>> Cc: stable@vger.kernel.org # v5.10+
>> Signed-off-by: Joerg Roedel <jroedel@suse.de>
>> ---
>>  include/linux/kexec.h |  1 +
>>  kernel/kexec.c        | 14 ++++++++++++++
>>  kernel/kexec_file.c   |  9 +++++++++
>>  3 files changed, 24 insertions(+)
>
> I guess I can take this through the tip tree along with the next one.

I seem to remember the consensus when this was reviewed that it was
unnecessary and there is already support for doing something like
this at a more fine grained level so we don't need a new kexec hook.

Eric



* Re: [PATCH v2 01/12] kexec: Allow architecture code to opt-out at runtime
  2021-11-01 21:11     ` Eric W. Biederman
@ 2021-11-02 16:37       ` Joerg Roedel
  2021-11-02 17:00       ` Joerg Roedel
  2021-11-02 17:17       ` Borislav Petkov
  2 siblings, 0 replies; 8+ messages in thread
From: Joerg Roedel @ 2021-11-02 16:37 UTC
  To: Eric W. Biederman
  Cc: Borislav Petkov, Joerg Roedel, x86, kexec, stable, hpa,
	Andy Lutomirski, Dave Hansen, Peter Zijlstra, Jiri Slaby,
	Dan Williams, Tom Lendacky, Juergen Gross, Kees Cook,
	David Rientjes, Cfir Cohen, Erdem Aktas, Masami Hiramatsu,
	Mike Stunes, Sean Christopherson, Martin Radev, Arvind Sankar,
	linux-coco, linux-kernel, kvm, virtualization

On Mon, Nov 01, 2021 at 04:11:42PM -0500, Eric W. Biederman wrote:
> I seem to remember the consensus when this was reviewed that it was
> unnecessary and there is already support for doing something like
> this at a more fine grained level so we don't need a new kexec hook.

It was a discussion, no consensus :)

I still think it is better to solve this in generic code for everybody
to re-use than with a hack in the architecture hooks.

More and more platforms which enable confidential computing features
may need this hook in the future.

Regards,

-- 
Jörg Rödel
jroedel@suse.de

SUSE Software Solutions Germany GmbH
Maxfeldstr. 5
90409 Nürnberg
Germany
 
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev


* Re: [PATCH v2 01/12] kexec: Allow architecture code to opt-out at runtime
  2021-11-01 21:11     ` Eric W. Biederman
  2021-11-02 16:37       ` Joerg Roedel
@ 2021-11-02 17:00       ` Joerg Roedel
  2021-11-02 18:17         ` Eric W. Biederman
  2021-11-02 17:17       ` Borislav Petkov
  2 siblings, 1 reply; 8+ messages in thread
From: Joerg Roedel @ 2021-11-02 17:00 UTC
  To: Eric W. Biederman
  Cc: Borislav Petkov, Joerg Roedel, x86, kexec, stable, hpa,
	Andy Lutomirski, Dave Hansen, Peter Zijlstra, Jiri Slaby,
	Dan Williams, Tom Lendacky, Juergen Gross, Kees Cook,
	David Rientjes, Cfir Cohen, Erdem Aktas, Masami Hiramatsu,
	Mike Stunes, Sean Christopherson, Martin Radev, Arvind Sankar,
	linux-coco, linux-kernel, kvm, virtualization

Hi again,

On Mon, Nov 01, 2021 at 04:11:42PM -0500, Eric W. Biederman wrote:
> I seem to remember the consensus when this was reviewed that it was
> unnecessary and there is already support for doing something like
> this at a more fine grained level so we don't need a new kexec hook.

I forgot to restate the problem which these patches solve:

Currently a Linux kernel running as an SEV-ES guest has no way to
successfully kexec into a new kernel. The normal SIPI sequence to reset
the non-boot VCPUs does not work in SEV-ES guests and special code is
needed in Linux to safely hand over the VCPUs from one kernel to the
next. What happens currently is that the kexec'ed kernel will just hang.

The code which implements the VCPU hand-over is also included in this
patch-set, but it requires a certain level of Hypervisor support which
is not available everywhere.

To make it clear to the user that kexec will not work in their
environment, it is best to disable the respective syscalls. This is what
the hook is needed for.
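
As a rough user-space model of the check patch 01/12 adds (a sketch, not
the kernel code): kexec_load_check_model() and its two boolean parameters
are hypothetical stand-ins for the capable(CAP_SYS_BOOT) and
kexec_load_disabled tests, and the weak default mirrors
arch_kexec_supported(). The attribute is the GCC/Clang spelling behind the
kernel's __weak; an architecture would override the default with a strong
definition in a separate object file.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Weak default: kexec is supported unless an architecture overrides this. */
bool __attribute__((weak)) arch_kexec_supported(void)
{
	return true;
}

/*
 * Hypothetical model of the load check. The two parameters replace the
 * kernel's capable(CAP_SYS_BOOT) and kexec_load_disabled tests.
 */
int kexec_load_check_model(bool has_cap_sys_boot, bool load_disabled)
{
	/* The opt-out is tested first, so it wins over the permission check. */
	if (!arch_kexec_supported())
		return -ENOSYS;

	if (!has_cap_sys_boot || load_disabled)
		return -EPERM;

	return 0;
}

/* Usage: with the weak default in place only the permission check can fail. */
void kexec_load_check_demo(void)
{
	assert(kexec_load_check_model(true, false) == 0);
	assert(kexec_load_check_model(false, false) == -EPERM);
}
```

With an override that returns false, every call in this model would return
-ENOSYS regardless of the caller's privileges.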

Regards,

-- 
Jörg Rödel
jroedel@suse.de

SUSE Software Solutions Germany GmbH
Maxfeldstr. 5
90409 Nürnberg
Germany
 
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev



* Re: [PATCH v2 01/12] kexec: Allow architecture code to opt-out at runtime
  2021-11-01 21:11     ` Eric W. Biederman
  2021-11-02 16:37       ` Joerg Roedel
  2021-11-02 17:00       ` Joerg Roedel
@ 2021-11-02 17:17       ` Borislav Petkov
  2 siblings, 0 replies; 8+ messages in thread
From: Borislav Petkov @ 2021-11-02 17:17 UTC
  To: Eric W. Biederman
  Cc: Joerg Roedel, x86, kexec, Joerg Roedel, stable, hpa,
	Andy Lutomirski, Dave Hansen, Peter Zijlstra, Jiri Slaby,
	Dan Williams, Tom Lendacky, Juergen Gross, Kees Cook,
	David Rientjes, Cfir Cohen, Erdem Aktas, Masami Hiramatsu,
	Mike Stunes, Sean Christopherson, Martin Radev, Arvind Sankar,
	linux-coco, linux-kernel, kvm, virtualization

On Mon, Nov 01, 2021 at 04:11:42PM -0500, Eric W. Biederman wrote:
> I seem to remember the consensus when this was reviewed that it was
> unnecessary and there is already support for doing something like
> this at a more fine grained level so we don't need a new kexec hook.

Well, the executive summary is that you have a guest whose memory *and*
registers are encrypted so the hypervisor cannot have a poke inside and
reset the vCPU like it would normally do. So you need to do that dance
differently, i.e., the patchset.

If you try to kexec such a guest now, it'll init only the BSP, as Joerg
said. So I guess a single-threaded kdump.

And yes, one of the prominent use cases is kdumping from such a guest,
as distros love doing kdump for debugging.

I hope that explains it better.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [PATCH v2 01/12] kexec: Allow architecture code to opt-out at runtime
  2021-11-02 17:00       ` Joerg Roedel
@ 2021-11-02 18:17         ` Eric W. Biederman
  0 siblings, 0 replies; 8+ messages in thread
From: Eric W. Biederman @ 2021-11-02 18:17 UTC
  To: Joerg Roedel
  Cc: Borislav Petkov, Joerg Roedel, x86, kexec, stable, hpa,
	Andy Lutomirski, Dave Hansen, Peter Zijlstra, Jiri Slaby,
	Dan Williams, Tom Lendacky, Juergen Gross, Kees Cook,
	David Rientjes, Cfir Cohen, Erdem Aktas, Masami Hiramatsu,
	Mike Stunes, Sean Christopherson, Martin Radev, Arvind Sankar,
	linux-coco, linux-kernel, kvm, virtualization

Joerg Roedel <jroedel@suse.de> writes:

> Hi again,
>
> On Mon, Nov 01, 2021 at 04:11:42PM -0500, Eric W. Biederman wrote:
>> I seem to remember the consensus when this was reviewed that it was
>> unnecessary and there is already support for doing something like
>> this at a more fine grained level so we don't need a new kexec hook.
>
> I forgot to restate the problem which these patches solve:
>
> Currently a Linux kernel running as an SEV-ES guest has no way to
> successfully kexec into a new kernel. The normal SIPI sequence to reset
> the non-boot VCPUs does not work in SEV-ES guests and special code is
> needed in Linux to safely hand over the VCPUs from one kernel to the
> next. What happens currently is that the kexec'ed kernel will just hang.
>
> The code which implements the VCPU hand-over is also included in this
> patch-set, but it requires a certain level of Hypervisor support which
> is not available everywhere.
>
> To make it clear to the user that kexec will not work in their
> environment, it is best to disable the respective syscalls. This is what
> the hook is needed for.

Note this is environmental.  This is the equivalent of a driver for a
device without some feature.

The kernel already has machine_kexec_prepare, which is perfectly capable
of detecting this is a problem and causing kexec_load to fail.  Which
is all that is required.
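
A minimal user-space sketch of that alternative, under stated assumptions:
mock_sev_es_active and mock_kexec_load() are hypothetical names, the
struct is a placeholder, and -EOPNOTSUPP is chosen only for illustration.
The only thing taken from the kernel is the shape of the existing hook,
int machine_kexec_prepare(struct kimage *image), whose failure aborts the
load.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Placeholder for the kernel's image descriptor; contents are irrelevant here. */
struct kimage {
	int type;
};

/* Hypothetical stand-in for the guest check (sev_es_active() in patch 02/12). */
bool mock_sev_es_active;

/* Models the existing per-architecture hook run while an image is loaded. */
int machine_kexec_prepare(struct kimage *image)
{
	(void)image;

	if (mock_sev_es_active)
		return -EOPNOTSUPP;	/* illustrative error code, an assumption */

	return 0;
}

/* Models the generic load path: a failing prepare hook fails the load. */
int mock_kexec_load(struct kimage *image)
{
	int ret = machine_kexec_prepare(image);

	if (ret)
		return ret;

	/* The real load would continue with segment setup here. */
	return 0;
}

/* Usage: the same load succeeds or fails depending on the environment. */
void mock_kexec_load_demo(void)
{
	struct kimage image = { 0 };

	mock_sev_es_active = false;
	assert(mock_kexec_load(&image) == 0);

	mock_sev_es_active = true;
	assert(mock_kexec_load(&image) == -EOPNOTSUPP);
}
```

In this model no generic code changes: the environment check lives entirely
inside the architecture's prepare hook.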

We don't need a new hook and a new code path to test for one
architecture.

So when we can reliably cause the system call to fail with a specific
error code, I don't think it makes sense to clutter up generic code
because of one architecture's design mistakes.


My honest preference would be to go farther and have a
firmware/hypervisor/platform independent rendezvous for the cpus so we
don't have to worry about what bugs the code underneath has implemented
for this special case.  Because frankly, when there are layers of
software, if a bug can slip through it always seems to, and it causes
problems.


But definitely there is no reason to add another generic hook when the
existing hook is quite good enough.

Eric


