linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH 00/62] Linux as SEV-ES Guest Support
@ 2020-02-11 13:51 Joerg Roedel
  2020-02-11 13:51 ` [PATCH 01/62] KVM: SVM: Add GHCB definitions Joerg Roedel
                   ` (63 more replies)
  0 siblings, 64 replies; 109+ messages in thread
From: Joerg Roedel @ 2020-02-11 13:51 UTC (permalink / raw)
  To: x86
  Cc: hpa, Andy Lutomirski, Dave Hansen, Peter Zijlstra,
	Thomas Hellstrom, Jiri Slaby, Dan Williams, Tom Lendacky,
	Juergen Gross, Kees Cook, linux-kernel, kvm, virtualization,
	Joerg Roedel, Joerg Roedel

Hi,

here is the first public post of the patch-set to enable Linux to run
under SEV-ES enabled hypervisors. The code is mostly feature-complete,
but there are still a couple of bugs to fix. Nevertheless, given the
size of the patch-set, I think it is about time to ask for initial
feedback of the changes that come with it. To better understand the code
here is a quick explanation of SEV-ES first.

This patch-set does not contain the hypervisor changes necessary to run
SEV-ES enabled KVM guests. These patches will be sent separatly when
they are ready to be sent out.

What is SEV-ES
==============

SEV-ES is an acronym for 'Secure Encrypted Virtualization - Encrypted
State' and means a hardware feature of AMD processors which hides the
register state of VCPUs to the hypervisor by encrypting it. The
hypervisor can't read or make changes to the guests register state.

Most intercepts set by the hypervisor do not cause a #VMEXIT of the
guest anymore, but turn into a VMM Communication Exception (#VC
exception, vector 29) inside the guest. The error-code of this exception
is the intercept exit-code that caused the exception. The guest handles
the #VC exception by communicating with the hypervisor through a shared
data structure, the 'Guest-Hypervisor-Communication-Block' (GHCB). The
layout of that data-structure and the protocol is specified in [1].

A description of the SEV-ES hardware interface can be found in the AMD64
Architecture Programmer's Manual Volume 2, Section 15.35 [2].

Implementation Details
======================

SEV-ES guests will always boot via UEFI firmware and use the 64-bit EFI
entry point into the kernel. This implies that only 64-bit Linux x86
guests are supported.

Pre-Decompression Boot Code and Early Exception Support
-------------------------------------------------------

Intercepts that cause exceptions in the guest include instructions like
CPUID, RDMSR/WRMSR, IOIO instructions and a couple more. Some of them
are executed very early during boot, which means that exceptions need to
work that early. That is the reason big parts of this patch-set enable
support for early exceptions, first in the pre-decompression boot-code
and later also in the early boot-code of the kernel image.

As these patches add exception support to the pre-decompression boot
code, it also implements a page-fault handler to create the
identity-mapped page-table on-demand. One reason for this change is to
make use of the exception handling code in non SEV-ES guests too, so
that it is less likely to break in the future. The other reason is that
for SEV-ES guests the code needs to setup its own page-table to map the
GHCB unencrypted. Without these patches the pre-decompression code only
uses its own page-table when KASLR is enabled and used.

SIPI and INIT Handling
----------------------

The hypervisor also can't make changes to the guest register state,
which implies that it can't emulate SIPI and INIT messages. This means
that any CPU register state reset needs to be done inside the guest.
Most of this is handled in the firmware, but the Linux kernel has to
setup an AP Jump Table to boot secondary processors. CPU online/offline
handling also needs special handling, where this patch-set implements a
shortcut. An offlined CPU will not go back to real-mode when it is woken
up again, but stays in long-mode an just jumps back to the trampoline
code.

NMI Special Handling
--------------------

The last thing that needs special handling with SEV-ES are NMIs.
Hypervisors usually start to intercept IRET instructions when an NMI got
injected to find out when the NMI window is re-opened. But handling IRET
intercepts requires the hypervisor to access guest register state and is
not possible with SEV-ES. The specification under [1] solves this
problem with an NMI_COMPLETE message sent my the guest to the
hypervisor, upon which the hypervisor re-opens the NMI window for the
guest.

This patch-set sends the NMI_COMPLETE message before the actual IRET,
while the kernel is still on a valid stack and kernel cr3. This opens
the NMI-window a few instructions early, but this is fine as under
x86-64 Linux NMI-nesting is safe. The alternative would be to
single-step over the IRET, but that requires more intrusive changes to
the entry code because it does not handle entries from kernel-mode while
on the entry stack.

Besides the special handling above the patch-set contains the handlers
for the #VC exception and all the exit-codes specified in [1].

Current State of the Patches
============================

The patch-set posted here can boot an SMP Linux guest under
SEV-ES-enabled KVM and the guest survives some load-testing
(kernel-compiles).  The guest boots to the graphical desktop and is
usable. But there are still know bugs and issues:

	* Putting some NMI-load on the guest will make it crash usually
	  within a minute
	* The handler for MMIO events needs more security checks when
	  walking the page-table
	* The MMIO handler also lacks emulation for MOVS and REP MOVS
	  instructions like used by memcpy_toio() and memcpy_fromio().

More testing will likely uncover more bugs, but I think the patch-set is
ready for initial feedback. It grew pretty big already and handling it
becomes more and more painful.

So please review the parts of the patch-set that you find interesting
and let me know your feedback.

Thanks a lot,

       Joerg

[1] https://developer.amd.com/wp-content/resources/56421.pdf
[2] https://www.amd.com/system/files/TechDocs/24593.pdf

Doug Covelli (1):
  x86/vmware: Add VMware specific handling for VMMCALL under SEV-ES

Joerg Roedel (43):
  KVM: SVM: Add GHCB Accessor functions
  x86/traps: Move some definitions to <asm/trap_defs.h>
  x86/insn-decoder: Make inat-tables.c suitable for pre-decompression
    code
  x86/boot/compressed: Fix debug_puthex() parameter type
  x86/boot/compressed/64: Disable red-zone usage
  x86/boot/compressed/64: Add IDT Infrastructure
  x86/boot/compressed/64: Rename kaslr_64.c to ident_map_64.c
  x86/boot/compressed/64: Add page-fault handler
  x86/boot/compressed/64: Always switch to own page-table
  x86/boot/compressed/64: Don't pre-map memory in KASLR code
  x86/boot/compressed/64: Change add_identity_map() to take start and
    end
  x86/boot/compressed/64: Add stage1 #VC handler
  x86/boot/compressed/64: Call set_sev_encryption_mask earlier
  x86/boot/compressed/64: Check return value of
    kernel_ident_mapping_init()
  x86/boot/compressed/64: Add function to map a page unencrypted
  x86/boot/compressed/64: Setup GHCB Based VC Exception handler
  x86/fpu: Move xgetbv()/xsetbv() into separate header
  x86/idt: Move IDT to data segment
  x86/idt: Split idt_data setup out of set_intr_gate()
  x86/head/64: Install boot GDT
  x86/head/64: Reload GDT after switch to virtual addresses
  x86/head/64: Load segment registers earlier
  x86/head/64: Switch to initial stack earlier
  x86/head/64: Load IDT earlier
  x86/head/64: Move early exception dispatch to C code
  x86/sev-es: Add SEV-ES Feature Detection
  x86/sev-es: Compile early handler code into kernel image
  x86/sev-es: Setup early #VC handler
  x86/sev-es: Setup GHCB based boot #VC handler
  x86/sev-es: Wire up existing #VC exit-code handlers
  x86/sev-es: Handle instruction fetches from user-space
  x86/sev-es: Harden runtime #VC handler for exceptions from user-space
  x86/sev-es: Filter exceptions not supported from user-space
  x86/sev-es: Handle RDTSCP Events
  x86/sev-es: Handle #AC Events
  x86/sev-es: Handle #DB Events
  x86/paravirt: Allow hypervisor specific VMMCALL handling under SEV-ES
  x86/realmode: Add SEV-ES specific trampoline entry point
  x86/head/64: Don't call verify_cpu() on starting APs
  x86/head/64: Rename start_cpu0
  x86/sev-es: Support CPU offline/online
  x86/cpufeature: Add SEV_ES_GUEST CPU Feature
  x86/sev-es: Add NMI state tracking

Tom Lendacky (18):
  KVM: SVM: Add GHCB definitions
  x86/cpufeatures: Add SEV-ES CPU feature
  x86/sev-es: Add support for handling IOIO exceptions
  x86/sev-es: Add CPUID handling to #VC handler
  x86/sev-es: Add handler for MMIO events
  x86/sev-es: Setup per-cpu GHCBs for the runtime handler
  x86/sev-es: Add Runtime #VC Exception Handler
  x86/sev-es: Handle MSR events
  x86/sev-es: Handle DR7 read/write events
  x86/sev-es: Handle WBINVD Events
  x86/sev-es: Handle RDTSC Events
  x86/sev-es: Handle RDPMC Events
  x86/sev-es: Handle INVD Events
  x86/sev-es: Handle MONITOR/MONITORX Events
  x86/sev-es: Handle MWAIT/MWAITX Events
  x86/sev-es: Handle VMMCALL Events
  x86/kvm: Add KVM specific VMMCALL handling under SEV-ES
  x86/realmode: Setup AP jump table

 arch/x86/Kconfig                           |   1 +
 arch/x86/boot/Makefile                     |   2 +-
 arch/x86/boot/compressed/Makefile          |   8 +-
 arch/x86/boot/compressed/head_64.S         |  41 ++
 arch/x86/boot/compressed/ident_map_64.c    | 320 +++++++++
 arch/x86/boot/compressed/idt_64.c          |  53 ++
 arch/x86/boot/compressed/idt_handlers_64.S |  78 +++
 arch/x86/boot/compressed/kaslr.c           |  36 +-
 arch/x86/boot/compressed/kaslr_64.c        | 156 -----
 arch/x86/boot/compressed/misc.h            |  34 +-
 arch/x86/boot/compressed/sev-es.c          | 148 ++++
 arch/x86/entry/entry_64.S                  |  52 ++
 arch/x86/include/asm/cpu.h                 |   2 +-
 arch/x86/include/asm/cpufeatures.h         |   2 +
 arch/x86/include/asm/desc.h                |   2 +
 arch/x86/include/asm/desc_defs.h           |   3 +
 arch/x86/include/asm/fpu/internal.h        |  29 +-
 arch/x86/include/asm/fpu/xcr.h             |  32 +
 arch/x86/include/asm/mem_encrypt.h         |   5 +
 arch/x86/include/asm/msr-index.h           |   3 +
 arch/x86/include/asm/processor.h           |   1 +
 arch/x86/include/asm/realmode.h            |   4 +
 arch/x86/include/asm/segment.h             |   2 +-
 arch/x86/include/asm/sev-es.h              | 119 ++++
 arch/x86/include/asm/svm.h                 | 103 +++
 arch/x86/include/asm/trap_defs.h           |  50 ++
 arch/x86/include/asm/traps.h               |  51 +-
 arch/x86/include/asm/x86_init.h            |  16 +-
 arch/x86/include/uapi/asm/svm.h            |  11 +
 arch/x86/kernel/Makefile                   |   1 +
 arch/x86/kernel/cpu/amd.c                  |  10 +-
 arch/x86/kernel/cpu/scattered.c            |   1 +
 arch/x86/kernel/cpu/vmware.c               |  48 +-
 arch/x86/kernel/head64.c                   |  49 ++
 arch/x86/kernel/head_32.S                  |   4 +-
 arch/x86/kernel/head_64.S                  | 162 +++--
 arch/x86/kernel/idt.c                      |  60 +-
 arch/x86/kernel/kvm.c                      |  35 +-
 arch/x86/kernel/nmi.c                      |   8 +
 arch/x86/kernel/sev-es-shared.c            | 721 ++++++++++++++++++++
 arch/x86/kernel/sev-es.c                   | 748 +++++++++++++++++++++
 arch/x86/kernel/smpboot.c                  |   4 +-
 arch/x86/kernel/traps.c                    |   3 +
 arch/x86/mm/extable.c                      |   1 +
 arch/x86/mm/mem_encrypt.c                  |  11 +-
 arch/x86/mm/mem_encrypt_identity.c         |   3 +
 arch/x86/realmode/init.c                   |  12 +
 arch/x86/realmode/rm/header.S              |   3 +
 arch/x86/realmode/rm/trampoline_64.S       |  20 +
 arch/x86/tools/gen-insn-attr-x86.awk       |  50 +-
 tools/arch/x86/tools/gen-insn-attr-x86.awk |  50 +-
 51 files changed, 3016 insertions(+), 352 deletions(-)
 create mode 100644 arch/x86/boot/compressed/ident_map_64.c
 create mode 100644 arch/x86/boot/compressed/idt_64.c
 create mode 100644 arch/x86/boot/compressed/idt_handlers_64.S
 delete mode 100644 arch/x86/boot/compressed/kaslr_64.c
 create mode 100644 arch/x86/boot/compressed/sev-es.c
 create mode 100644 arch/x86/include/asm/fpu/xcr.h
 create mode 100644 arch/x86/include/asm/sev-es.h
 create mode 100644 arch/x86/include/asm/trap_defs.h
 create mode 100644 arch/x86/kernel/sev-es-shared.c
 create mode 100644 arch/x86/kernel/sev-es.c

-- 
2.17.1


^ permalink raw reply	[flat|nested] 109+ messages in thread

end of thread, other threads:[~2020-03-17 21:34 UTC | newest]

Thread overview: 109+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-11 13:51 [RFC PATCH 00/62] Linux as SEV-ES Guest Support Joerg Roedel
2020-02-11 13:51 ` [PATCH 01/62] KVM: SVM: Add GHCB definitions Joerg Roedel
2020-02-11 13:51 ` [PATCH 02/62] KVM: SVM: Add GHCB Accessor functions Joerg Roedel
2020-02-11 13:51 ` [PATCH 03/62] x86/cpufeatures: Add SEV-ES CPU feature Joerg Roedel
2020-02-13  6:51   ` Borislav Petkov
2020-02-11 13:51 ` [PATCH 04/62] x86/traps: Move some definitions to <asm/trap_defs.h> Joerg Roedel
2020-02-11 13:51 ` [PATCH 05/62] x86/insn-decoder: Make inat-tables.c suitable for pre-decompression code Joerg Roedel
2020-02-11 13:52 ` [PATCH 06/62] x86/boot/compressed: Fix debug_puthex() parameter type Joerg Roedel
2020-02-11 13:52 ` [PATCH 07/62] x86/boot/compressed/64: Disable red-zone usage Joerg Roedel
2020-02-11 22:13   ` Andy Lutomirski
2020-02-11 13:52 ` [PATCH 08/62] x86/boot/compressed/64: Add IDT Infrastructure Joerg Roedel
2020-02-11 22:18   ` Andy Lutomirski
2020-02-12 11:19     ` Joerg Roedel
2020-02-14 19:40   ` Andi Kleen
2020-02-15 12:32     ` Joerg Roedel
2020-02-11 13:52 ` [PATCH 09/62] x86/boot/compressed/64: Rename kaslr_64.c to ident_map_64.c Joerg Roedel
2020-02-11 13:52 ` [PATCH 10/62] x86/boot/compressed/64: Add page-fault handler Joerg Roedel
2020-02-11 13:52 ` [PATCH 11/62] x86/boot/compressed/64: Always switch to own page-table Joerg Roedel
2020-02-11 13:52 ` [PATCH 12/62] x86/boot/compressed/64: Don't pre-map memory in KASLR code Joerg Roedel
2020-02-11 13:52 ` [PATCH 13/62] x86/boot/compressed/64: Change add_identity_map() to take start and end Joerg Roedel
2020-02-11 13:52 ` [PATCH 14/62] x86/boot/compressed/64: Add stage1 #VC handler Joerg Roedel
2020-02-11 22:23   ` Andy Lutomirski
2020-02-12 11:38     ` Joerg Roedel
2020-02-12 16:22       ` Andy Lutomirski
2020-02-11 13:52 ` [PATCH 15/62] x86/boot/compressed/64: Call set_sev_encryption_mask earlier Joerg Roedel
2020-02-11 13:52 ` [PATCH 16/62] x86/boot/compressed/64: Check return value of kernel_ident_mapping_init() Joerg Roedel
2020-02-11 13:52 ` [PATCH 17/62] x86/boot/compressed/64: Add function to map a page unencrypted Joerg Roedel
2020-02-11 13:52 ` [PATCH 18/62] x86/boot/compressed/64: Setup GHCB Based VC Exception handler Joerg Roedel
2020-02-11 22:25   ` Andy Lutomirski
2020-02-12 11:44     ` Joerg Roedel
2020-02-11 13:52 ` [PATCH 19/62] x86/sev-es: Add support for handling IOIO exceptions Joerg Roedel
2020-02-11 22:28   ` Andy Lutomirski
2020-02-12 11:49     ` Joerg Roedel
2020-02-11 13:52 ` [PATCH 20/62] x86/fpu: Move xgetbv()/xsetbv() into separate header Joerg Roedel
2020-02-11 13:52 ` [PATCH 21/62] x86/sev-es: Add CPUID handling to #VC handler Joerg Roedel
2020-02-11 13:52 ` [PATCH 22/62] x86/sev-es: Add handler for MMIO events Joerg Roedel
2020-02-11 13:52 ` [PATCH 23/62] x86/idt: Move IDT to data segment Joerg Roedel
2020-02-11 22:41   ` Andy Lutomirski
2020-02-12 11:55     ` Joerg Roedel
2020-02-12 16:23       ` Andy Lutomirski
2020-02-12 16:28         ` Jürgen Groß
2020-02-19 10:42           ` Joerg Roedel
2020-02-19 10:47             ` Jürgen Groß
2020-02-11 13:52 ` [PATCH 24/62] x86/idt: Split idt_data setup out of set_intr_gate() Joerg Roedel
2020-02-11 13:52 ` [PATCH 25/62] x86/head/64: Install boot GDT Joerg Roedel
2020-02-11 22:29   ` Andy Lutomirski
2020-02-12 12:20     ` Joerg Roedel
2020-02-11 13:52 ` [PATCH 26/62] x86/head/64: Reload GDT after switch to virtual addresses Joerg Roedel
2020-02-11 13:52 ` [PATCH 27/62] x86/head/64: Load segment registers earlier Joerg Roedel
2020-02-11 13:52 ` [PATCH 28/62] x86/head/64: Switch to initial stack earlier Joerg Roedel
2020-02-11 13:52 ` [PATCH 29/62] x86/head/64: Load IDT earlier Joerg Roedel
2020-02-11 13:52 ` [PATCH 30/62] x86/head/64: Move early exception dispatch to C code Joerg Roedel
2020-02-11 22:44   ` Andy Lutomirski
2020-02-12 12:39     ` Joerg Roedel
2020-02-11 13:52 ` [PATCH 31/62] x86/sev-es: Add SEV-ES Feature Detection Joerg Roedel
2020-02-11 13:52 ` [PATCH 32/62] x86/sev-es: Compile early handler code into kernel image Joerg Roedel
2020-02-11 13:52 ` [PATCH 33/62] x86/sev-es: Setup early #VC handler Joerg Roedel
2020-02-11 13:52 ` [PATCH 34/62] x86/sev-es: Setup GHCB based boot " Joerg Roedel
2020-02-11 13:52 ` [PATCH 35/62] x86/sev-es: Setup per-cpu GHCBs for the runtime handler Joerg Roedel
2020-02-11 22:46   ` Andy Lutomirski
2020-02-12 15:16     ` Joerg Roedel
2020-02-11 13:52 ` [PATCH 36/62] x86/sev-es: Add Runtime #VC Exception Handler Joerg Roedel
2020-02-11 13:52 ` [PATCH 37/62] x86/sev-es: Wire up existing #VC exit-code handlers Joerg Roedel
2020-02-11 13:52 ` [PATCH 38/62] x86/sev-es: Handle instruction fetches from user-space Joerg Roedel
2020-02-12 21:42   ` Andy Lutomirski
2020-03-13  9:12     ` Joerg Roedel
2020-03-17 21:34       ` Andy Lutomirski
2020-02-11 13:52 ` [PATCH 39/62] x86/sev-es: Harden runtime #VC handler for exceptions " Joerg Roedel
2020-02-11 22:47   ` Andy Lutomirski
2020-02-12 13:16     ` Joerg Roedel
2020-02-11 13:52 ` [PATCH 40/62] x86/sev-es: Filter exceptions not supported " Joerg Roedel
2020-02-11 13:52 ` [PATCH 41/62] x86/sev-es: Handle MSR events Joerg Roedel
2020-02-13 15:45   ` Dave Hansen
2020-02-14  7:23     ` Joerg Roedel
2020-02-14 16:59       ` Dave Hansen
2020-02-15 12:45         ` Joerg Roedel
2020-02-11 13:52 ` [PATCH 42/62] x86/sev-es: Handle DR7 read/write events Joerg Roedel
2020-02-11 13:52 ` [PATCH 43/62] x86/sev-es: Handle WBINVD Events Joerg Roedel
2020-02-11 13:52 ` [PATCH 44/62] x86/sev-es: Handle RDTSC Events Joerg Roedel
2020-02-11 13:52 ` [PATCH 45/62] x86/sev-es: Handle RDPMC Events Joerg Roedel
2020-02-11 13:52 ` [PATCH 46/62] x86/sev-es: Handle INVD Events Joerg Roedel
2020-02-12  0:12   ` Andy Lutomirski
2020-02-12 15:36     ` Joerg Roedel
2020-02-11 13:52 ` [PATCH 47/62] x86/sev-es: Handle RDTSCP Events Joerg Roedel
2020-02-11 13:52 ` [PATCH 48/62] x86/sev-es: Handle MONITOR/MONITORX Events Joerg Roedel
2020-02-11 13:52 ` [PATCH 49/62] x86/sev-es: Handle MWAIT/MWAITX Events Joerg Roedel
2020-02-11 13:52 ` [PATCH 50/62] x86/sev-es: Handle VMMCALL Events Joerg Roedel
2020-02-12  0:14   ` Andy Lutomirski
2020-02-12 13:22     ` Joerg Roedel
2020-02-11 13:52 ` [PATCH 51/62] x86/sev-es: Handle #AC Events Joerg Roedel
2020-02-11 13:52 ` [PATCH 52/62] x86/sev-es: Handle #DB Events Joerg Roedel
2020-02-11 13:52 ` [PATCH 53/62] x86/paravirt: Allow hypervisor specific VMMCALL handling under SEV-ES Joerg Roedel
2020-02-11 13:52 ` [PATCH 54/62] x86/kvm: Add KVM " Joerg Roedel
2020-02-11 13:52 ` [PATCH 55/62] x86/vmware: Add VMware specific handling for VMMCALL " Joerg Roedel
2020-02-11 13:52 ` [PATCH 56/62] x86/realmode: Add SEV-ES specific trampoline entry point Joerg Roedel
2020-02-11 13:52 ` [PATCH 57/62] x86/realmode: Setup AP jump table Joerg Roedel
2020-02-11 13:52 ` [PATCH 58/62] x86/head/64: Don't call verify_cpu() on starting APs Joerg Roedel
2020-02-11 13:52 ` [PATCH 59/62] x86/head/64: Rename start_cpu0 Joerg Roedel
2020-02-11 13:52 ` [PATCH 60/62] x86/sev-es: Support CPU offline/online Joerg Roedel
2020-02-11 13:52 ` [PATCH 61/62] x86/cpufeature: Add SEV_ES_GUEST CPU Feature Joerg Roedel
2020-02-11 13:52 ` [PATCH 62/62] x86/sev-es: Add NMI state tracking Joerg Roedel
2020-02-11 22:50   ` Andy Lutomirski
2020-02-12 13:56     ` Joerg Roedel
2020-02-11 14:50 ` [RFC PATCH 00/62] Linux as SEV-ES Guest Support Peter Zijlstra
2020-02-11 15:43   ` Joerg Roedel
2020-02-11 22:12     ` Andy Lutomirski
2020-02-12 13:54       ` Joerg Roedel
2020-02-12  3:48 ` Andy Lutomirski
2020-02-12 13:59   ` Joerg Roedel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).