From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from theia.8bytes.org (8bytes.org [81.169.241.247])
    (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
    (No client certificate requested)
    by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8045B168
    for ; Mon, 5 Jul 2021 08:34:08 +0000 (UTC)
Received: from cap.home.8bytes.org (p5b006775.dip0.t-ipconnect.de [91.0.103.117])
    (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
    (No client certificate requested)
    by theia.8bytes.org (Postfix) with ESMTPSA id 2C64A261;
    Mon, 5 Jul 2021 10:26:12 +0200 (CEST)
From: Joerg Roedel
To: x86@kernel.org
Cc: Joerg Roedel, Joerg Roedel, hpa@zytor.com, Andy Lutomirski,
    Dave Hansen, Peter Zijlstra, Jiri Slaby, Dan Williams, Tom Lendacky,
    Juergen Gross, Kees Cook, David Rientjes, Cfir Cohen, Erdem Aktas,
    Masami Hiramatsu, Mike Stunes, Sean Christopherson, Martin Radev,
    Arvind Sankar, linux-coco@lists.linux.dev,
    linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
    virtualization@lists.linux-foundation.org
Subject: [RFC PATCH 00/12] x86/sev: KEXEC/KDUMP support for SEV-ES guests
Date: Mon, 5 Jul 2021 10:24:31 +0200
Message-Id: <20210705082443.14721-1-joro@8bytes.org>
X-Mailer: git-send-email 2.31.1
Precedence: bulk
X-Mailing-List: linux-coco@lists.linux.dev
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

From: Joerg Roedel

Hi,

here are changes to enable kexec/kdump in SEV-ES guests. The biggest
problem for supporting kexec/kdump under SEV-ES is to find a way to
hand the non-boot CPUs (APs) from one kernel to another.

Without SEV-ES the first kernel parks the CPUs in a HLT loop until
they get reset by the kexec'ed kernel via an INIT-SIPI-SIPI sequence.
For virtual machines the CPU reset is emulated by the hypervisor,
which sets the vCPU registers back to reset state.

This does not work under SEV-ES, because the hypervisor has no access
to the vCPU registers and can't make modifications to them.
So an SEV-ES guest needs to reset the vCPU itself and park it using the
AP-reset-hold protocol. Upon wakeup the guest needs to switch to
real-mode and jump to the reset-vector configured in the AP Jump Table.

The code to do this is the main part of this patch-set. It works by
placing code on the AP Jump Table page itself, both to park the vCPU
and to jump to the reset vector upon wakeup. The code on the AP Jump
Table runs in 16-bit protected mode with the segment base set to the
beginning of the page. The AP Jump Table is usually not within the
first 1MB of memory, so the code can't run in real-mode.

The AP Jump Table is the best place to put the parking code, because
the memory is owned by the firmware but only read by it, and it is
writable by the OS. Only the first 4 bytes are used for the
reset-vector, leaving the rest of the page for code/data/stack to park
a vCPU. The code can't be in kernel memory because by the time the
vCPU wakes up the memory will be owned by the new kernel, which might
have overwritten it already.

The other patches add initial GHCB Version 2 protocol support, because
kexec/kdump need the MSR-based (without a GHCB) AP-reset-hold VMGEXIT,
which is a GHCB protocol version 2 feature.

The kexec'ed kernel is also entered via the decompressor and needs
MMIO support there, so this patch-set also adds MMIO #VC support to
the decompressor and support for handling CLFLUSH instructions.

Finally there is also code to disable kexec/kdump support at runtime
when the environment does not support it (e.g. no GHCB protocol
version 2 support or AP Jump Table over 4GB).

The diffstat looks big, but most of it is moving code for MMIO #VC
support around to make it available to the decompressor.

There is also a video showing the code in action:

    https://www.youtube.com/watch?v=j1AUJANP7Mk

Please review.
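As a rough illustration of the page layout and the runtime opt-out
described above, here is a minimal sketch. It is not code from the
patches: the struct, the helper name, the 4 KiB page size and the
reading of "over 4GB" as "the whole page must end at or below 4GB" are
all assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define AP_JT_PAGE_SIZE 4096u	/* assumed page size, not from the patches */

/* Hypothetical view of the AP Jump Table page, for illustration only. */
struct ap_jump_table_page {
	uint8_t reset_vector[4];		/* first 4 bytes: reset vector */
	uint8_t park_blob[AP_JT_PAGE_SIZE - 4];	/* rest: code/data/stack to park a vCPU */
};

static_assert(sizeof(struct ap_jump_table_page) == AP_JT_PAGE_SIZE,
	      "layout must cover exactly one page");

/*
 * Hypothetical helper: returns true when neither of the two example
 * opt-out conditions applies, i.e. GHCB protocol version 2 (or later)
 * was negotiated and the jump table page lies entirely below 4GB.
 */
static inline bool sev_es_kexec_possible(unsigned int ghcb_version,
					 uint64_t jump_table_pa)
{
	const uint64_t limit_4g = 1ULL << 32;

	if (ghcb_version < 2)
		return false;	/* no MSR-based AP-reset-hold available */

	if (jump_table_pa > limit_4g - AP_JT_PAGE_SIZE)
		return false;	/* page would reach above 4GB (assumed rule) */

	return true;
}
```

A caller would evaluate such a check once, after the GHCB protocol
version has been negotiated and the AP Jump Table address is known,
and refuse to load a kexec image when it returns false.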

Thanks,

    Joerg

Joerg Roedel (12):
  kexec: Allow architecture code to opt-out at runtime
  x86/kexec/64: Forbid kexec when running as an SEV-ES guest
  x86/sev: Save and print negotiated GHCB protocol version
  x86/sev: Do not hardcode GHCB protocol version
  x86/sev: Use GHCB protocol version 2 if supported
  x86/sev: Cache AP Jump Table Address
  x86/sev: Setup code to park APs in the AP Jump Table
  x86/sev: Park APs on AP Jump Table with GHCB protocol version 2
  x86/sev: Use AP Jump Table blob to stop CPU
  x86/sev: Add MMIO handling support to boot/compressed/ code
  x86/sev: Handle CLFLUSH MMIO events
  x86/sev: Support kexec under SEV-ES with AP Jump Table blob

 arch/x86/boot/compressed/sev.c          |  56 +-
 arch/x86/include/asm/realmode.h         |   5 +
 arch/x86/include/asm/sev-ap-jumptable.h |  25 +
 arch/x86/include/asm/sev.h              |  13 +-
 arch/x86/kernel/machine_kexec_64.c      |  12 +
 arch/x86/kernel/process.c               |   8 +
 arch/x86/kernel/sev-shared.c            | 333 +++++++++-
 arch/x86/kernel/sev.c                   | 494 ++++++---------
 arch/x86/lib/insn-eval-shared.c         | 805 ++++++++++++++++++++++++
 arch/x86/lib/insn-eval.c                | 802 +----------------------
 arch/x86/realmode/Makefile              |   9 +-
 arch/x86/realmode/rm/Makefile           |  11 +-
 arch/x86/realmode/rm/header.S           |   3 +
 arch/x86/realmode/rm/sev_ap_park.S      |  89 +++
 arch/x86/realmode/rmpiggy.S             |   6 +
 arch/x86/realmode/sev/Makefile          |  41 ++
 arch/x86/realmode/sev/ap_jump_table.S   | 130 ++++
 arch/x86/realmode/sev/ap_jump_table.lds |  24 +
 include/linux/kexec.h                   |   2 +
 kernel/kexec.c                          |  14 +
 kernel/kexec_file.c                     |   9 +
 21 files changed, 1765 insertions(+), 1126 deletions(-)
 create mode 100644 arch/x86/include/asm/sev-ap-jumptable.h
 create mode 100644 arch/x86/lib/insn-eval-shared.c
 create mode 100644 arch/x86/realmode/rm/sev_ap_park.S
 create mode 100644 arch/x86/realmode/sev/Makefile
 create mode 100644 arch/x86/realmode/sev/ap_jump_table.S
 create mode 100644 arch/x86/realmode/sev/ap_jump_table.lds

base-commit: 8d9d46bbf3b6b7ff8edcac33603ab45c29e0e07f
-- 
2.31.1