* [PATCH v12 00/13] Add AMD SEV guest live migration support
@ 2021-04-12 19:42 Ashish Kalra
  2021-04-12 19:43 ` [PATCH v12 01/13] KVM: SVM: Add KVM_SEV SEND_START command Ashish Kalra
                   ` (12 more replies)
  0 siblings, 13 replies; 30+ messages in thread
From: Ashish Kalra @ 2021-04-12 19:42 UTC (permalink / raw)
  To: pbonzini
  Cc: tglx, mingo, hpa, joro, bp, thomas.lendacky, x86, kvm,
	linux-kernel, srutherford, seanjc, venu.busireddy, brijesh.singh

From: Ashish Kalra <ashish.kalra@amd.com>

This series adds support for the AMD SEV guest live migration commands. To
protect the confidentiality of SEV-protected guest memory while in transit,
we need to use the SEV commands defined in the SEV API spec [1].

SEV guest VMs have the concept of private and shared memory. Private memory
is encrypted with the guest-specific key, while shared memory may be encrypted
with the hypervisor key. The commands provided by the SEV FW are meant to be
used for the private memory only. The patch series introduces a new hypercall
that the guest OS can use to notify the hypervisor of a page's encryption
status. If a page is encrypted with the guest-specific key, we use the SEV
commands during migration; if it is not encrypted, we fall back to the
default path.

The patches use the KVM_EXIT_HYPERCALL exit code and the hypercall-to-userspace
exit functionality as a common interface from the guest back to the VMM,
passing the guest's shared/unencrypted page information on to the userspace
VMM/Qemu. Qemu can consult this information during migration to know whether
a page is encrypted.
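
As an illustration, a userspace VMM's vCPU run loop could consume this
exit roughly as follows. This is a minimal sketch, not Qemu's actual
code: "run" is the mmap'ed struct kvm_run for the vCPU, and
track_shared_range() is a hypothetical helper that records the
unencrypted GPA ranges for the migration code:

        if (run->exit_reason == KVM_EXIT_HYPERCALL &&
            run->hypercall.nr == KVM_HC_PAGE_ENC_STATUS) {
                __u64 gpa    = run->hypercall.args[0];
                __u64 npages = run->hypercall.args[1];
                __u64 enc    = run->hypercall.args[2];

                /* Remember which GPA ranges are shared/unencrypted. */
                track_shared_range(gpa, npages, !enc);

                run->hypercall.ret = 0;  /* returned to the guest in RAX */
        }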

This section describes how the SEV live migration feature is negotiated
between the host and the guest. The host indicates support for the feature
via KVM_FEATURE_CPUID. The guest firmware (OVMF) detects the feature and
sets a UEFI environment variable indicating OVMF support for live migration.
The guest kernel also detects host support via CPUID; in the case of an EFI
boot, it verifies that OVMF supports the feature by reading the UEFI
environment variable, and if the variable is set, enables live migration on
the host by writing to a custom MSR. If not booted under EFI, it enables the
feature by writing to the custom MSR directly. The MSR is also handled by
the userspace VMM/Qemu.
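
In outline, the guest-side negotiation described above could look like
the following sketch, where ovmf_supports_sev_live_migration() stands in
for the UEFI environment variable lookup and the value written to the
MSR is the enable bit defined later in this series:

        if (kvm_para_has_feature(KVM_FEATURE_SEV_LIVE_MIGRATION) &&
            (!efi_enabled(EFI_BOOT) || ovmf_supports_sev_live_migration()))
                /* The MSR write is handled by the userspace VMM/Qemu. */
                wrmsrl(MSR_KVM_SEV_LIVE_MIGRATION, 1ULL);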

A branch containing these patches is available here:
https://github.com/AMDESE/linux/tree/sev-migration-v12

[1] https://developer.amd.com/wp-content/resources/55766.PDF

Changes since v11:
- Clean up and remove kvm_x86_ops callback for page_enc_status_hc and
  instead add a new per-VM flag to support/enable the page encryption
  status hypercall.
- Remove the KVM_EXIT_DMA_SHARE/KVM_EXIT_DMA_UNSHARE exit codes and instead
  use the KVM_EXIT_HYPERCALL exit code for the page encryption status
  hypercall-to-userspace functionality.

Changes since v10:
- Added new KVM_EXIT_DMA_SHARE/KVM_EXIT_DMA_UNSHARE hypercall-to-userspace
  exit functionality as a common interface from the guest back to KVM,
  passing the guest's shared/unencrypted region information on to the
  userspace VMM/Qemu. KVM/the host kernel no longer maintains the guest
  shared memory region information.
- Remove implicit enabling of SEV live migration feature for an SEV
  guest, now this is explicitly in control of the userspace VMM/Qemu.
- Custom MSR handling is also now moved into userspace VMM/Qemu.
- As KVM does not maintain the guest shared memory region information
  anymore, sev_dbg_crypt() cannot bypass unencrypted guest memory
  regions without support from userspace VMM/Qemu.

Changes since v9:
- Transitioning from page encryption bitmap to the shared pages list
  to keep track of guest's shared/unencrypted memory regions.
- Move back to marking the complete _bss_decrypted section as 
  decrypted in the shared pages list.
- Invoke a new function check_kvm_sev_migration() via kvm_init_platform()
  for the guest to query host-side support for SEV live migration and to
  enable the SEV live migration feature, avoiding #ifdefs in the code.
- Rename MSR_KVM_SEV_LIVE_MIG_EN to MSR_KVM_SEV_LIVE_MIGRATION.
- Invoke a new function handle_unencrypted_region() from 
  sev_dbg_crypt() to bypass unencrypted guest memory regions.

Changes since v8:
- Rebasing to kvm next branch.
- Fixed and added comments as per review feedback on v8 patches.
- Removed implicitly enabling live migration for incoming VMs in
  KVM_SET_PAGE_ENC_BITMAP; it is now done via the KVM_SET_MSR ioctl.
- Adds support for bypassing unencrypted guest memory regions for
  DBG_DECRYPT API calls, guest memory region encryption status in
  sev_dbg_decrypt() is referenced using the page encryption bitmap.

Changes since v7:
- Removed the hypervisor specific hypercall/paravirt callback for
  SEV live migration and moved back to calling kvm_sev_hypercall3 
  directly.
- Fixed build errors reported by the kbuild test robot <lkp@intel.com>;
  specifically, fixed a build error when CONFIG_HYPERVISOR_GUEST=y and
  CONFIG_AMD_MEM_ENCRYPT=n.
- Implicitly enabled live migration for incoming VM(s) to handle 
  A->B->C->... VM migrations.
- Fixed Documentation as per comments on v6 patches.
- Fixed error return path in sev_send_update_data() as per comments 
  on v6 patches. 

Changes since v6:
- Rebasing to mainline and refactoring to the new split SVM
  infrastructure.
- Move to static allocation of the unified page encryption bitmap instead
  of dynamically resizing it; the static allocation is done implicitly by
  extending the kvm_arch_commit_memory_region() callback to add SVM-specific
  x86_ops which can read the userspace-provided memory regions/memslots,
  calculate the amount of guest RAM managed by KVM, and grow the bitmap.
- Fixed KVM_SET_PAGE_ENC_BITMAP ioctl to set the whole bitmap instead
  of simply clearing specific bits.
- Removed KVM_PAGE_ENC_BITMAP_RESET ioctl, which is now performed using
  KVM_SET_PAGE_ENC_BITMAP.
- Extended guest support for enabling the Live Migration feature by adding a
  check for a UEFI environment variable indicating OVMF support for the Live
  Migration feature and additionally checking for the KVM capability for the
  same feature. If not booted under EFI, then we simply check for the KVM
  capability.
- Add hypervisor specific hypercall for SEV live migration by adding
  a new paravirt callback as part of x86_hyper_runtime.
  (x86 hypervisor specific runtime callbacks)
- Moving MSR handling for MSR_KVM_SEV_LIVE_MIG_EN into svm/sev code and
  adding a check for SEV live migration being enabled by the guest in the
  KVM_GET_PAGE_ENC_BITMAP ioctl.
- Instead of the complete __bss_decrypted section, only specific variables
  such as hv_clock_boot and wall_clock are marked as decrypted in the
  page encryption bitmap.

Changes since v5:
- Fixed build errors reported by the kbuild test robot <lkp@intel.com>.

Changes since v4:
- Host support has been added to extend the KVM capabilities/feature bits to
  include a new KVM_FEATURE_SEV_LIVE_MIGRATION, which the guest can query to
  detect host-side support for SEV live migration, and a new custom MSR
  MSR_KVM_SEV_LIVE_MIG_EN is added for the guest to enable the SEV live
  migration feature.
- Ensure that _bss_decrypted section is marked as decrypted in the
  page encryption bitmap.
- Fixed the KVM_GET_PAGE_ENC_BITMAP ioctl to return the correct bitmap
  for the number of pages requested by the user. Ensure that we only copy
  bmap->num_pages bytes to the userspace buffer; if bmap->num_pages is not
  byte-aligned, we read the trailing bits from userspace and copy those
  bits as-is. This fixes guest page corruption issues observed after
  migration completion.
- Add kexec support for SEV Live Migration to reset the host's
  page encryption bitmap related to kernel specific page encryption
  status settings before we load a new kernel by kexec. We cannot
  reset the complete page encryption bitmap here as we need to
  retain the UEFI/OVMF firmware specific settings.

Changes since v3:
- Rebasing to mainline and testing.
- Adding a new KVM_PAGE_ENC_BITMAP_RESET ioctl, which resets the 
  page encryption bitmap on a guest reboot event.
- Adding a more reliable sanity check for GPA range being passed to
  the hypercall to ensure that guest MMIO ranges are also marked
  in the page encryption bitmap.

Changes since v2:
 - reset the page encryption bitmap on vcpu reboot

Changes since v1:
 - Add support to share the page encryption between the source and target
   machine.
 - Fix review feedbacks from Tom Lendacky.
 - Add check to limit the session blob length.
 - Update the KVM_GET_PAGE_ENC_BITMAP ioctl to use the base_gfn instead of
   the memory slot when querying the bitmap.

Ashish Kalra (5):
  KVM: X86: Introduce KVM_HC_PAGE_ENC_STATUS hypercall
  KVM: x86: Introduce new KVM_FEATURE_SEV_LIVE_MIGRATION feature &
    Custom MSR.
  EFI: Introduce the new AMD Memory Encryption GUID.
  x86/kvm: Add guest support for detecting and enabling SEV Live
    Migration feature.
  x86/kvm: Add kexec support for SEV Live Migration.

Brijesh Singh (8):
  KVM: SVM: Add KVM_SEV SEND_START command
  KVM: SVM: Add KVM_SEND_UPDATE_DATA command
  KVM: SVM: Add KVM_SEV_SEND_FINISH command
  KVM: SVM: Add support for KVM_SEV_RECEIVE_START command
  KVM: SVM: Add KVM_SEV_RECEIVE_UPDATE_DATA command
  KVM: SVM: Add KVM_SEV_RECEIVE_FINISH command
  KVM: x86: Add AMD SEV specific Hypercall3
  mm: x86: Invoke hypercall when page encryption status is changed

 .../virt/kvm/amd-memory-encryption.rst        | 120 +++++
 Documentation/virt/kvm/cpuid.rst              |   5 +
 Documentation/virt/kvm/hypercalls.rst         |  15 +
 Documentation/virt/kvm/msr.rst                |  12 +
 arch/x86/include/asm/kvm_host.h               |   2 +
 arch/x86/include/asm/kvm_para.h               |  12 +
 arch/x86/include/asm/mem_encrypt.h            |   8 +
 arch/x86/include/asm/paravirt.h               |  10 +
 arch/x86/include/asm/paravirt_types.h         |   2 +
 arch/x86/include/uapi/asm/kvm_para.h          |   4 +
 arch/x86/kernel/kvm.c                         |  76 +++
 arch/x86/kernel/paravirt.c                    |   1 +
 arch/x86/kvm/cpuid.c                          |   3 +-
 arch/x86/kvm/svm/sev.c                        | 454 ++++++++++++++++++
 arch/x86/kvm/x86.c                            |  29 ++
 arch/x86/mm/mem_encrypt.c                     |  98 +++-
 arch/x86/mm/pat/set_memory.c                  |   7 +
 include/linux/efi.h                           |   1 +
 include/linux/psp-sev.h                       |   8 +-
 include/uapi/linux/kvm.h                      |  39 ++
 include/uapi/linux/kvm_para.h                 |   1 +
 21 files changed, 901 insertions(+), 6 deletions(-)

-- 
2.17.1



* [PATCH v12 01/13] KVM: SVM: Add KVM_SEV SEND_START command
  2021-04-12 19:42 [PATCH v12 00/13] Add AMD SEV guest live migration support Ashish Kalra
@ 2021-04-12 19:43 ` Ashish Kalra
  2021-04-12 19:43 ` [PATCH v12 02/13] KVM: SVM: Add KVM_SEND_UPDATE_DATA command Ashish Kalra
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 30+ messages in thread
From: Ashish Kalra @ 2021-04-12 19:43 UTC (permalink / raw)
  To: pbonzini
  Cc: tglx, mingo, hpa, joro, bp, thomas.lendacky, x86, kvm,
	linux-kernel, srutherford, seanjc, venu.busireddy, brijesh.singh

From: Brijesh Singh <brijesh.singh@amd.com>

The command is used to create an outgoing SEV guest encryption context.
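
For context, a userspace flow driving this command might look like the
sketch below. It is not part of this patch: pdh_cert and its length are
assumed to have been obtained beforehand, and the first ioctl relies on
the session_len == 0 query behaviour implemented below:

        struct kvm_sev_send_start start = {};
        struct kvm_sev_cmd cmd = {
                .id   = KVM_SEV_SEND_START,
                .data = (__u64)(uintptr_t)&start,
        };

        /* First call: session_len == 0, firmware reports the needed size. */
        ioctl(vm_fd, KVM_MEMORY_ENCRYPT_OP, &cmd);

        start.session_uaddr = (__u64)(uintptr_t)malloc(start.session_len);
        start.pdh_cert_uaddr = (__u64)(uintptr_t)pdh_cert;
        start.pdh_cert_len = pdh_cert_len;
        /* ... plat_certs and amd_certs fields filled in likewise ... */

        /* Second call: create the context and fetch the session blob. */
        ioctl(vm_fd, KVM_MEMORY_ENCRYPT_OP, &cmd);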

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: x86@kernel.org
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Steve Rutherford <srutherford@google.com>
Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
 .../virt/kvm/amd-memory-encryption.rst        |  27 ++++
 arch/x86/kvm/svm/sev.c                        | 125 ++++++++++++++++++
 include/linux/psp-sev.h                       |   8 +-
 include/uapi/linux/kvm.h                      |  12 ++
 4 files changed, 168 insertions(+), 4 deletions(-)

diff --git a/Documentation/virt/kvm/amd-memory-encryption.rst b/Documentation/virt/kvm/amd-memory-encryption.rst
index 469a6308765b..ac799dd7a618 100644
--- a/Documentation/virt/kvm/amd-memory-encryption.rst
+++ b/Documentation/virt/kvm/amd-memory-encryption.rst
@@ -284,6 +284,34 @@ Returns: 0 on success, -negative on error
                 __u32 len;
         };
 
+10. KVM_SEV_SEND_START
+----------------------
+
+The KVM_SEV_SEND_START command can be used by the hypervisor to create an
+outgoing guest encryption context.
+
+Parameters (in): struct kvm_sev_send_start
+
+Returns: 0 on success, -negative on error
+
+::
+
+        struct kvm_sev_send_start {
+                __u32 policy;                 /* guest policy */
+
+                __u64 pdh_cert_uaddr;         /* platform Diffie-Hellman certificate */
+                __u32 pdh_cert_len;
+
+                __u64 plat_certs_uaddr;        /* platform certificate chain */
+                __u32 plat_certs_len;
+
+                __u64 amd_certs_uaddr;        /* AMD certificate */
+                __u32 amd_certs_len;
+
+                __u64 session_uaddr;          /* Guest session information */
+                __u32 session_len;
+        };
+
 References
 ==========
 
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 874ea309279f..2b65900c05d6 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -1110,6 +1110,128 @@ static int sev_get_attestation_report(struct kvm *kvm, struct kvm_sev_cmd *argp)
 	return ret;
 }
 
+/* Userspace wants to query session length. */
+static int
+__sev_send_start_query_session_length(struct kvm *kvm, struct kvm_sev_cmd *argp,
+				      struct kvm_sev_send_start *params)
+{
+	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+	struct sev_data_send_start *data;
+	int ret;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL_ACCOUNT);
+	if (data == NULL)
+		return -ENOMEM;
+
+	data->handle = sev->handle;
+	ret = sev_issue_cmd(kvm, SEV_CMD_SEND_START, data, &argp->error);
+
+	params->session_len = data->session_len;
+	if (copy_to_user((void __user *)(uintptr_t)argp->data, params,
+				sizeof(struct kvm_sev_send_start)))
+		ret = -EFAULT;
+
+	kfree(data);
+	return ret;
+}
+
+static int sev_send_start(struct kvm *kvm, struct kvm_sev_cmd *argp)
+{
+	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+	struct sev_data_send_start *data;
+	struct kvm_sev_send_start params;
+	void *amd_certs, *session_data;
+	void *pdh_cert, *plat_certs;
+	int ret;
+
+	if (!sev_guest(kvm))
+		return -ENOTTY;
+
+	if (copy_from_user(&params, (void __user *)(uintptr_t)argp->data,
+				sizeof(struct kvm_sev_send_start)))
+		return -EFAULT;
+
+	/* if session_len is zero, userspace wants to query the session length */
+	if (!params.session_len)
+		return __sev_send_start_query_session_length(kvm, argp,
+				&params);
+
+	/* some sanity checks */
+	if (!params.pdh_cert_uaddr || !params.pdh_cert_len ||
+	    !params.session_uaddr || params.session_len > SEV_FW_BLOB_MAX_SIZE)
+		return -EINVAL;
+
+	/* allocate the memory to hold the session data blob */
+	session_data = kmalloc(params.session_len, GFP_KERNEL_ACCOUNT);
+	if (!session_data)
+		return -ENOMEM;
+
+	/* copy the certificate blobs from userspace */
+	pdh_cert = psp_copy_user_blob(params.pdh_cert_uaddr,
+				params.pdh_cert_len);
+	if (IS_ERR(pdh_cert)) {
+		ret = PTR_ERR(pdh_cert);
+		goto e_free_session;
+	}
+
+	plat_certs = psp_copy_user_blob(params.plat_certs_uaddr,
+				params.plat_certs_len);
+	if (IS_ERR(plat_certs)) {
+		ret = PTR_ERR(plat_certs);
+		goto e_free_pdh;
+	}
+
+	amd_certs = psp_copy_user_blob(params.amd_certs_uaddr,
+				params.amd_certs_len);
+	if (IS_ERR(amd_certs)) {
+		ret = PTR_ERR(amd_certs);
+		goto e_free_plat_cert;
+	}
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL_ACCOUNT);
+	if (data == NULL) {
+		ret = -ENOMEM;
+		goto e_free_amd_cert;
+	}
+
+	/* populate the FW SEND_START field with system physical address */
+	data->pdh_cert_address = __psp_pa(pdh_cert);
+	data->pdh_cert_len = params.pdh_cert_len;
+	data->plat_certs_address = __psp_pa(plat_certs);
+	data->plat_certs_len = params.plat_certs_len;
+	data->amd_certs_address = __psp_pa(amd_certs);
+	data->amd_certs_len = params.amd_certs_len;
+	data->session_address = __psp_pa(session_data);
+	data->session_len = params.session_len;
+	data->handle = sev->handle;
+
+	ret = sev_issue_cmd(kvm, SEV_CMD_SEND_START, data, &argp->error);
+
+	if (!ret && copy_to_user((void __user *)(uintptr_t)params.session_uaddr,
+			session_data, params.session_len)) {
+		ret = -EFAULT;
+		goto e_free;
+	}
+
+	params.policy = data->policy;
+	params.session_len = data->session_len;
+	if (copy_to_user((void __user *)(uintptr_t)argp->data, &params,
+				sizeof(struct kvm_sev_send_start)))
+		ret = -EFAULT;
+
+e_free:
+	kfree(data);
+e_free_amd_cert:
+	kfree(amd_certs);
+e_free_plat_cert:
+	kfree(plat_certs);
+e_free_pdh:
+	kfree(pdh_cert);
+e_free_session:
+	kfree(session_data);
+	return ret;
+}
+
 int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
 {
 	struct kvm_sev_cmd sev_cmd;
@@ -1163,6 +1285,9 @@ int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
 	case KVM_SEV_GET_ATTESTATION_REPORT:
 		r = sev_get_attestation_report(kvm, &sev_cmd);
 		break;
+	case KVM_SEV_SEND_START:
+		r = sev_send_start(kvm, &sev_cmd);
+		break;
 	default:
 		r = -EINVAL;
 		goto out;
diff --git a/include/linux/psp-sev.h b/include/linux/psp-sev.h
index b801ead1e2bb..73da511b9423 100644
--- a/include/linux/psp-sev.h
+++ b/include/linux/psp-sev.h
@@ -326,11 +326,11 @@ struct sev_data_send_start {
 	u64 pdh_cert_address;			/* In */
 	u32 pdh_cert_len;			/* In */
 	u32 reserved1;
-	u64 plat_cert_address;			/* In */
-	u32 plat_cert_len;			/* In */
+	u64 plat_certs_address;			/* In */
+	u32 plat_certs_len;			/* In */
 	u32 reserved2;
-	u64 amd_cert_address;			/* In */
-	u32 amd_cert_len;			/* In */
+	u64 amd_certs_address;			/* In */
+	u32 amd_certs_len;			/* In */
 	u32 reserved3;
 	u64 session_address;			/* In */
 	u32 session_len;			/* In/Out */
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index f6afee209620..ac53ad2e7271 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1729,6 +1729,18 @@ struct kvm_sev_attestation_report {
 	__u32 len;
 };
 
+struct kvm_sev_send_start {
+	__u32 policy;
+	__u64 pdh_cert_uaddr;
+	__u32 pdh_cert_len;
+	__u64 plat_certs_uaddr;
+	__u32 plat_certs_len;
+	__u64 amd_certs_uaddr;
+	__u32 amd_certs_len;
+	__u64 session_uaddr;
+	__u32 session_len;
+};
+
 #define KVM_DEV_ASSIGN_ENABLE_IOMMU	(1 << 0)
 #define KVM_DEV_ASSIGN_PCI_2_3		(1 << 1)
 #define KVM_DEV_ASSIGN_MASK_INTX	(1 << 2)
-- 
2.17.1



* [PATCH v12 02/13] KVM: SVM: Add KVM_SEND_UPDATE_DATA command
  2021-04-12 19:42 [PATCH v12 00/13] Add AMD SEV guest live migration support Ashish Kalra
  2021-04-12 19:43 ` [PATCH v12 01/13] KVM: SVM: Add KVM_SEV SEND_START command Ashish Kalra
@ 2021-04-12 19:43 ` Ashish Kalra
  2021-04-12 19:43 ` [PATCH v12 03/13] KVM: SVM: Add KVM_SEV_SEND_FINISH command Ashish Kalra
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 30+ messages in thread
From: Ashish Kalra @ 2021-04-12 19:43 UTC (permalink / raw)
  To: pbonzini
  Cc: tglx, mingo, hpa, joro, bp, thomas.lendacky, x86, kvm,
	linux-kernel, srutherford, seanjc, venu.busireddy, brijesh.singh

From: Brijesh Singh <brijesh.singh@amd.com>

The command is used for encrypting the guest memory region using the encryption
context created with KVM_SEV_SEND_START.
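
Note that the implementation below rejects a guest_uaddr/guest_len pair
crossing a page boundary, so a VMM is expected to drive this command one
page at a time. A rough sketch, where sev_send_update() is a
hypothetical wrapper issuing KVM_SEV_SEND_UPDATE_DATA via the
KVM_MEMORY_ENCRYPT_OP ioctl:

        /* Stream the guest memory region out one page per command. */
        for (off = 0; off < region_len; off += PAGE_SIZE) {
                struct kvm_sev_send_update_data update = {
                        .hdr_uaddr   = (__u64)(uintptr_t)hdr,
                        .hdr_len     = hdr_len,
                        .guest_uaddr = region_uaddr + off,
                        .guest_len   = PAGE_SIZE,
                        .trans_uaddr = (__u64)(uintptr_t)trans,
                        .trans_len   = trans_len,
                };

                if (sev_send_update(vm_fd, &update))
                        break;
                /* hdr and trans now hold one encrypted packet for the wire. */
        }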

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: x86@kernel.org
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Steve Rutherford <srutherford@google.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
 .../virt/kvm/amd-memory-encryption.rst        |  24 ++++
 arch/x86/kvm/svm/sev.c                        | 122 ++++++++++++++++++
 include/uapi/linux/kvm.h                      |   9 ++
 3 files changed, 155 insertions(+)

diff --git a/Documentation/virt/kvm/amd-memory-encryption.rst b/Documentation/virt/kvm/amd-memory-encryption.rst
index ac799dd7a618..3c5456e0268a 100644
--- a/Documentation/virt/kvm/amd-memory-encryption.rst
+++ b/Documentation/virt/kvm/amd-memory-encryption.rst
@@ -311,6 +311,30 @@ Returns: 0 on success, -negative on error
                 __u32 session_len;
         };
 
+11. KVM_SEV_SEND_UPDATE_DATA
+----------------------------
+
+The KVM_SEV_SEND_UPDATE_DATA command can be used by the hypervisor to encrypt the
+outgoing guest memory region with the encryption context created using
+KVM_SEV_SEND_START.
+
+Parameters (in): struct kvm_sev_send_update_data
+
+Returns: 0 on success, -negative on error
+
+::
+
+        struct kvm_sev_send_update_data {
+                __u64 hdr_uaddr;        /* userspace address containing the packet header */
+                __u32 hdr_len;
+
+                __u64 guest_uaddr;      /* the source memory region to be encrypted */
+                __u32 guest_len;
+
+                __u64 trans_uaddr;      /* the destination memory region */
+                __u32 trans_len;
+        };
+
 References
 ==========
 
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 2b65900c05d6..30527285a39a 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -34,6 +34,7 @@ static DECLARE_RWSEM(sev_deactivate_lock);
 static DEFINE_MUTEX(sev_bitmap_lock);
 unsigned int max_sev_asid;
 static unsigned int min_sev_asid;
+static unsigned long sev_me_mask;
 static unsigned long *sev_asid_bitmap;
 static unsigned long *sev_reclaim_asid_bitmap;
 
@@ -1232,6 +1233,123 @@ static int sev_send_start(struct kvm *kvm, struct kvm_sev_cmd *argp)
 	return ret;
 }
 
+/* Userspace wants to query either header or trans length. */
+static int
+__sev_send_update_data_query_lengths(struct kvm *kvm, struct kvm_sev_cmd *argp,
+				     struct kvm_sev_send_update_data *params)
+{
+	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+	struct sev_data_send_update_data *data;
+	int ret;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL_ACCOUNT);
+	if (!data)
+		return -ENOMEM;
+
+	data->handle = sev->handle;
+	ret = sev_issue_cmd(kvm, SEV_CMD_SEND_UPDATE_DATA, data, &argp->error);
+
+	params->hdr_len = data->hdr_len;
+	params->trans_len = data->trans_len;
+
+	if (copy_to_user((void __user *)(uintptr_t)argp->data, params,
+			 sizeof(struct kvm_sev_send_update_data)))
+		ret = -EFAULT;
+
+	kfree(data);
+	return ret;
+}
+
+static int sev_send_update_data(struct kvm *kvm, struct kvm_sev_cmd *argp)
+{
+	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+	struct sev_data_send_update_data *data;
+	struct kvm_sev_send_update_data params;
+	void *hdr, *trans_data;
+	struct page **guest_page;
+	unsigned long n;
+	int ret, offset;
+
+	if (!sev_guest(kvm))
+		return -ENOTTY;
+
+	if (copy_from_user(&params, (void __user *)(uintptr_t)argp->data,
+			sizeof(struct kvm_sev_send_update_data)))
+		return -EFAULT;
+
+	/* userspace wants to query either header or trans length */
+	if (!params.trans_len || !params.hdr_len)
+		return __sev_send_update_data_query_lengths(kvm, argp, &params);
+
+	if (!params.trans_uaddr || !params.guest_uaddr ||
+	    !params.guest_len || !params.hdr_uaddr)
+		return -EINVAL;
+
+	/* Check if we are crossing the page boundary */
+	offset = params.guest_uaddr & (PAGE_SIZE - 1);
+	if ((params.guest_len + offset > PAGE_SIZE))
+		return -EINVAL;
+
+	/* Pin guest memory */
+	guest_page = sev_pin_memory(kvm, params.guest_uaddr & PAGE_MASK,
+				    PAGE_SIZE, &n, 0);
+	if (!guest_page)
+		return -EFAULT;
+
+	/* allocate memory for header and transport buffer */
+	ret = -ENOMEM;
+	hdr = kmalloc(params.hdr_len, GFP_KERNEL_ACCOUNT);
+	if (!hdr)
+		goto e_unpin;
+
+	trans_data = kmalloc(params.trans_len, GFP_KERNEL_ACCOUNT);
+	if (!trans_data)
+		goto e_free_hdr;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data)
+		goto e_free_trans_data;
+
+	data->hdr_address = __psp_pa(hdr);
+	data->hdr_len = params.hdr_len;
+	data->trans_address = __psp_pa(trans_data);
+	data->trans_len = params.trans_len;
+
+	/* The SEND_UPDATE_DATA command requires C-bit to be always set. */
+	data->guest_address = (page_to_pfn(guest_page[0]) << PAGE_SHIFT) +
+				offset;
+	data->guest_address |= sev_me_mask;
+	data->guest_len = params.guest_len;
+	data->handle = sev->handle;
+
+	ret = sev_issue_cmd(kvm, SEV_CMD_SEND_UPDATE_DATA, data, &argp->error);
+
+	if (ret)
+		goto e_free;
+
+	/* copy transport buffer to user space */
+	if (copy_to_user((void __user *)(uintptr_t)params.trans_uaddr,
+			 trans_data, params.trans_len)) {
+		ret = -EFAULT;
+		goto e_free;
+	}
+
+	/* Copy packet header to userspace. */
+	ret = copy_to_user((void __user *)(uintptr_t)params.hdr_uaddr, hdr,
+				params.hdr_len);
+
+e_free:
+	kfree(data);
+e_free_trans_data:
+	kfree(trans_data);
+e_free_hdr:
+	kfree(hdr);
+e_unpin:
+	sev_unpin_memory(kvm, guest_page, n);
+
+	return ret;
+}
+
 int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
 {
 	struct kvm_sev_cmd sev_cmd;
@@ -1288,6 +1406,9 @@ int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
 	case KVM_SEV_SEND_START:
 		r = sev_send_start(kvm, &sev_cmd);
 		break;
+	case KVM_SEV_SEND_UPDATE_DATA:
+		r = sev_send_update_data(kvm, &sev_cmd);
+		break;
 	default:
 		r = -EINVAL;
 		goto out;
@@ -1467,6 +1588,7 @@ void __init sev_hardware_setup(void)
 
 	/* Minimum ASID value that should be used for SEV guest */
 	min_sev_asid = edx;
+	sev_me_mask = 1UL << (ebx & 0x3f);
 
 	/* Initialize SEV ASID bitmaps */
 	sev_asid_bitmap = bitmap_zalloc(max_sev_asid, GFP_KERNEL);
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index ac53ad2e7271..d45af34c31be 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1741,6 +1741,15 @@ struct kvm_sev_send_start {
 	__u32 session_len;
 };
 
+struct kvm_sev_send_update_data {
+	__u64 hdr_uaddr;
+	__u32 hdr_len;
+	__u64 guest_uaddr;
+	__u32 guest_len;
+	__u64 trans_uaddr;
+	__u32 trans_len;
+};
+
 #define KVM_DEV_ASSIGN_ENABLE_IOMMU	(1 << 0)
 #define KVM_DEV_ASSIGN_PCI_2_3		(1 << 1)
 #define KVM_DEV_ASSIGN_MASK_INTX	(1 << 2)
-- 
2.17.1



* [PATCH v12 03/13] KVM: SVM: Add KVM_SEV_SEND_FINISH command
  2021-04-12 19:42 [PATCH v12 00/13] Add AMD SEV guest live migration support Ashish Kalra
  2021-04-12 19:43 ` [PATCH v12 01/13] KVM: SVM: Add KVM_SEV SEND_START command Ashish Kalra
  2021-04-12 19:43 ` [PATCH v12 02/13] KVM: SVM: Add KVM_SEND_UPDATE_DATA command Ashish Kalra
@ 2021-04-12 19:43 ` Ashish Kalra
  2021-04-13  0:03   ` Steve Rutherford
  2021-04-12 19:44 ` [PATCH v12 04/13] KVM: SVM: Add support for KVM_SEV_RECEIVE_START command Ashish Kalra
                   ` (9 subsequent siblings)
  12 siblings, 1 reply; 30+ messages in thread
From: Ashish Kalra @ 2021-04-12 19:43 UTC (permalink / raw)
  To: pbonzini
  Cc: tglx, mingo, hpa, joro, bp, thomas.lendacky, x86, kvm,
	linux-kernel, srutherford, seanjc, venu.busireddy, brijesh.singh

From: Brijesh Singh <brijesh.singh@amd.com>

The command is used to finalize the encryption context created with the
KVM_SEV_SEND_START command.
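
Taken together with the two preceding commands, the send side of a
migration reduces to the following shape. This is a sketch only:
sev_ioctl() is a hypothetical wrapper that fills struct kvm_sev_cmd and
issues KVM_MEMORY_ENCRYPT_OP, and more_pages_to_send() stands in for the
VMM's migration loop:

        sev_ioctl(vm_fd, KVM_SEV_SEND_START, &start);   /* create context */
        while (more_pages_to_send())
                sev_ioctl(vm_fd, KVM_SEV_SEND_UPDATE_DATA, &update);
        sev_ioctl(vm_fd, KVM_SEV_SEND_FINISH, NULL);    /* delete context */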

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: x86@kernel.org
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
 .../virt/kvm/amd-memory-encryption.rst        |  8 +++++++
 arch/x86/kvm/svm/sev.c                        | 23 +++++++++++++++++++
 2 files changed, 31 insertions(+)

diff --git a/Documentation/virt/kvm/amd-memory-encryption.rst b/Documentation/virt/kvm/amd-memory-encryption.rst
index 3c5456e0268a..26c4e6c83f62 100644
--- a/Documentation/virt/kvm/amd-memory-encryption.rst
+++ b/Documentation/virt/kvm/amd-memory-encryption.rst
@@ -335,6 +335,14 @@ Returns: 0 on success, -negative on error
                 __u32 trans_len;
         };
 
+12. KVM_SEV_SEND_FINISH
+------------------------
+
+After completion of the migration flow, the KVM_SEV_SEND_FINISH command can be
+issued by the hypervisor to delete the encryption context.
+
+Returns: 0 on success, -negative on error
+
 References
 ==========
 
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 30527285a39a..92325d9527ce 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -1350,6 +1350,26 @@ static int sev_send_update_data(struct kvm *kvm, struct kvm_sev_cmd *argp)
 	return ret;
 }
 
+static int sev_send_finish(struct kvm *kvm, struct kvm_sev_cmd *argp)
+{
+	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+	struct sev_data_send_finish *data;
+	int ret;
+
+	if (!sev_guest(kvm))
+		return -ENOTTY;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
+
+	data->handle = sev->handle;
+	ret = sev_issue_cmd(kvm, SEV_CMD_SEND_FINISH, data, &argp->error);
+
+	kfree(data);
+	return ret;
+}
+
 int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
 {
 	struct kvm_sev_cmd sev_cmd;
@@ -1409,6 +1429,9 @@ int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
 	case KVM_SEV_SEND_UPDATE_DATA:
 		r = sev_send_update_data(kvm, &sev_cmd);
 		break;
+	case KVM_SEV_SEND_FINISH:
+		r = sev_send_finish(kvm, &sev_cmd);
+		break;
 	default:
 		r = -EINVAL;
 		goto out;
-- 
2.17.1



* [PATCH v12 04/13] KVM: SVM: Add support for KVM_SEV_RECEIVE_START command
  2021-04-12 19:42 [PATCH v12 00/13] Add AMD SEV guest live migration support Ashish Kalra
                   ` (2 preceding siblings ...)
  2021-04-12 19:43 ` [PATCH v12 03/13] KVM: SVM: Add KVM_SEV_SEND_FINISH command Ashish Kalra
@ 2021-04-12 19:44 ` Ashish Kalra
  2021-04-13  0:03   ` Steve Rutherford
  2021-04-12 19:44 ` [PATCH v12 05/13] KVM: SVM: Add KVM_SEV_RECEIVE_UPDATE_DATA command Ashish Kalra
                   ` (8 subsequent siblings)
  12 siblings, 1 reply; 30+ messages in thread
From: Ashish Kalra @ 2021-04-12 19:44 UTC (permalink / raw)
  To: pbonzini
  Cc: tglx, mingo, hpa, joro, bp, thomas.lendacky, x86, kvm,
	linux-kernel, srutherford, seanjc, venu.busireddy, brijesh.singh

From: Brijesh Singh <brijesh.singh@amd.com>

The command is used to create the encryption context for an incoming
SEV guest. The encryption context can be later used by the hypervisor
to import the incoming data into the SEV guest memory space.
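
On the destination, a userspace flow might look like the sketch below,
where pdh and session are the blobs produced by KVM_SEV_SEND_START on
the source and carried over the migration channel:

        struct kvm_sev_receive_start rs = {
                .handle        = 0,   /* 0: firmware allocates a new handle */
                .policy        = policy,
                .pdh_uaddr     = (__u64)(uintptr_t)pdh,
                .pdh_len       = pdh_len,
                .session_uaddr = (__u64)(uintptr_t)session,
                .session_len   = session_len,
        };
        struct kvm_sev_cmd cmd = {
                .id   = KVM_SEV_RECEIVE_START,
                .data = (__u64)(uintptr_t)&rs,
        };

        ret = ioctl(vm_fd, KVM_MEMORY_ENCRYPT_OP, &cmd);
        /* On success, rs.handle identifies the new incoming context. */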

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: x86@kernel.org
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
 .../virt/kvm/amd-memory-encryption.rst        | 29 +++++++
 arch/x86/kvm/svm/sev.c                        | 81 +++++++++++++++++++
 include/uapi/linux/kvm.h                      |  9 +++
 3 files changed, 119 insertions(+)

diff --git a/Documentation/virt/kvm/amd-memory-encryption.rst b/Documentation/virt/kvm/amd-memory-encryption.rst
index 26c4e6c83f62..c86c1ded8dd8 100644
--- a/Documentation/virt/kvm/amd-memory-encryption.rst
+++ b/Documentation/virt/kvm/amd-memory-encryption.rst
@@ -343,6 +343,35 @@ issued by the hypervisor to delete the encryption context.
 
 Returns: 0 on success, -negative on error
 
+13. KVM_SEV_RECEIVE_START
+-------------------------
+
+The KVM_SEV_RECEIVE_START command is used for creating the memory encryption
+context for an incoming SEV guest. To create the encryption context, the user must
+provide a guest policy, the platform public Diffie-Hellman (PDH) key and session
+information.
+
+Parameters: struct kvm_sev_receive_start (in/out)
+
+Returns: 0 on success, -negative on error
+
+::
+
+        struct kvm_sev_receive_start {
+                __u32 handle;           /* if zero then firmware creates a new handle */
+                __u32 policy;           /* guest's policy */
+
+                __u64 pdh_uaddr;        /* userspace address pointing to the PDH key */
+                __u32 pdh_len;
+
+                __u64 session_uaddr;    /* userspace address which points to the guest session information */
+                __u32 session_len;
+        };
+
+On success, the 'handle' field contains a new handle and on error, a negative value.
+
+For more details, see SEV spec Section 6.12.
+
 References
 ==========
 
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 92325d9527ce..e530c2b34b5e 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -1370,6 +1370,84 @@ static int sev_send_finish(struct kvm *kvm, struct kvm_sev_cmd *argp)
 	return ret;
 }
 
+static int sev_receive_start(struct kvm *kvm, struct kvm_sev_cmd *argp)
+{
+	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+	struct sev_data_receive_start *start;
+	struct kvm_sev_receive_start params;
+	int *error = &argp->error;
+	void *session_data;
+	void *pdh_data;
+	int ret;
+
+	if (!sev_guest(kvm))
+		return -ENOTTY;
+
+	/* Get parameter from the userspace */
+	if (copy_from_user(&params, (void __user *)(uintptr_t)argp->data,
+			sizeof(struct kvm_sev_receive_start)))
+		return -EFAULT;
+
+	/* some sanity checks */
+	if (!params.pdh_uaddr || !params.pdh_len ||
+	    !params.session_uaddr || !params.session_len)
+		return -EINVAL;
+
+	pdh_data = psp_copy_user_blob(params.pdh_uaddr, params.pdh_len);
+	if (IS_ERR(pdh_data))
+		return PTR_ERR(pdh_data);
+
+	session_data = psp_copy_user_blob(params.session_uaddr,
+			params.session_len);
+	if (IS_ERR(session_data)) {
+		ret = PTR_ERR(session_data);
+		goto e_free_pdh;
+	}
+
+	ret = -ENOMEM;
+	start = kzalloc(sizeof(*start), GFP_KERNEL);
+	if (!start)
+		goto e_free_session;
+
+	start->handle = params.handle;
+	start->policy = params.policy;
+	start->pdh_cert_address = __psp_pa(pdh_data);
+	start->pdh_cert_len = params.pdh_len;
+	start->session_address = __psp_pa(session_data);
+	start->session_len = params.session_len;
+
+	/* create memory encryption context */
+	ret = __sev_issue_cmd(argp->sev_fd, SEV_CMD_RECEIVE_START, start,
+				error);
+	if (ret)
+		goto e_free;
+
+	/* Bind ASID to this guest */
+	ret = sev_bind_asid(kvm, start->handle, error);
+	if (ret)
+		goto e_free;
+
+	params.handle = start->handle;
+	if (copy_to_user((void __user *)(uintptr_t)argp->data,
+			 &params, sizeof(struct kvm_sev_receive_start))) {
+		ret = -EFAULT;
+		sev_unbind_asid(kvm, start->handle);
+		goto e_free;
+	}
+
+	sev->handle = start->handle;
+	sev->fd = argp->sev_fd;
+
+e_free:
+	kfree(start);
+e_free_session:
+	kfree(session_data);
+e_free_pdh:
+	kfree(pdh_data);
+
+	return ret;
+}
+
 int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
 {
 	struct kvm_sev_cmd sev_cmd;
@@ -1432,6 +1510,9 @@ int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
 	case KVM_SEV_SEND_FINISH:
 		r = sev_send_finish(kvm, &sev_cmd);
 		break;
+	case KVM_SEV_RECEIVE_START:
+		r = sev_receive_start(kvm, &sev_cmd);
+		break;
 	default:
 		r = -EINVAL;
 		goto out;
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index d45af34c31be..29c25e641a0c 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1750,6 +1750,15 @@ struct kvm_sev_send_update_data {
 	__u32 trans_len;
 };
 
+struct kvm_sev_receive_start {
+	__u32 handle;
+	__u32 policy;
+	__u64 pdh_uaddr;
+	__u32 pdh_len;
+	__u64 session_uaddr;
+	__u32 session_len;
+};
+
 #define KVM_DEV_ASSIGN_ENABLE_IOMMU	(1 << 0)
 #define KVM_DEV_ASSIGN_PCI_2_3		(1 << 1)
 #define KVM_DEV_ASSIGN_MASK_INTX	(1 << 2)
-- 
2.17.1



* [PATCH v12 05/13] KVM: SVM: Add KVM_SEV_RECEIVE_UPDATE_DATA command
  2021-04-12 19:42 [PATCH v12 00/13] Add AMD SEV guest live migration support Ashish Kalra
                   ` (3 preceding siblings ...)
  2021-04-12 19:44 ` [PATCH v12 04/13] KVM: SVM: Add support for KVM_SEV_RECEIVE_START command Ashish Kalra
@ 2021-04-12 19:44 ` Ashish Kalra
  2021-04-13  0:04   ` Steve Rutherford
  2021-04-12 19:44 ` [PATCH v12 06/13] KVM: SVM: Add KVM_SEV_RECEIVE_FINISH command Ashish Kalra
                   ` (7 subsequent siblings)
  12 siblings, 1 reply; 30+ messages in thread
From: Ashish Kalra @ 2021-04-12 19:44 UTC (permalink / raw)
  To: pbonzini
  Cc: tglx, mingo, hpa, joro, bp, thomas.lendacky, x86, kvm,
	linux-kernel, srutherford, seanjc, venu.busireddy, brijesh.singh

From: Brijesh Singh <brijesh.singh@amd.com>

The command is used for copying the incoming buffer into the
SEV guest memory space.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: x86@kernel.org
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
 .../virt/kvm/amd-memory-encryption.rst        | 24 ++++++
 arch/x86/kvm/svm/sev.c                        | 79 +++++++++++++++++++
 include/uapi/linux/kvm.h                      |  9 +++
 3 files changed, 112 insertions(+)

diff --git a/Documentation/virt/kvm/amd-memory-encryption.rst b/Documentation/virt/kvm/amd-memory-encryption.rst
index c86c1ded8dd8..c6ed5b26d841 100644
--- a/Documentation/virt/kvm/amd-memory-encryption.rst
+++ b/Documentation/virt/kvm/amd-memory-encryption.rst
@@ -372,6 +372,30 @@ On success, the 'handle' field contains a new handle and on error, a negative va
 
 For more details, see SEV spec Section 6.12.
 
+14. KVM_SEV_RECEIVE_UPDATE_DATA
+-------------------------------
+
+The KVM_SEV_RECEIVE_UPDATE_DATA command can be used by the hypervisor to copy
+the incoming buffers into the guest memory region with the encryption context
+created during KVM_SEV_RECEIVE_START.
+
+Parameters (in): struct kvm_sev_receive_update_data
+
+Returns: 0 on success, -negative on error
+
+::
+
+        struct kvm_sev_receive_update_data {
+                __u64 hdr_uaddr;        /* userspace address containing the packet header */
+                __u32 hdr_len;
+
+                __u64 guest_uaddr;      /* the destination guest memory region */
+                __u32 guest_len;
+
+                __u64 trans_uaddr;      /* the incoming buffer memory region  */
+                __u32 trans_len;
+        };
+
 References
 ==========
 
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index e530c2b34b5e..2c95657cc9bf 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -1448,6 +1448,82 @@ static int sev_receive_start(struct kvm *kvm, struct kvm_sev_cmd *argp)
 	return ret;
 }
 
+static int sev_receive_update_data(struct kvm *kvm, struct kvm_sev_cmd *argp)
+{
+	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+	struct kvm_sev_receive_update_data params;
+	struct sev_data_receive_update_data *data;
+	void *hdr = NULL, *trans = NULL;
+	struct page **guest_page;
+	unsigned long n;
+	int ret, offset;
+
+	if (!sev_guest(kvm))
+		return -EINVAL;
+
+	if (copy_from_user(&params, (void __user *)(uintptr_t)argp->data,
+			sizeof(struct kvm_sev_receive_update_data)))
+		return -EFAULT;
+
+	if (!params.hdr_uaddr || !params.hdr_len ||
+	    !params.guest_uaddr || !params.guest_len ||
+	    !params.trans_uaddr || !params.trans_len)
+		return -EINVAL;
+
+	/* Check if we are crossing the page boundary */
+	offset = params.guest_uaddr & (PAGE_SIZE - 1);
+	if ((params.guest_len + offset > PAGE_SIZE))
+		return -EINVAL;
+
+	hdr = psp_copy_user_blob(params.hdr_uaddr, params.hdr_len);
+	if (IS_ERR(hdr))
+		return PTR_ERR(hdr);
+
+	trans = psp_copy_user_blob(params.trans_uaddr, params.trans_len);
+	if (IS_ERR(trans)) {
+		ret = PTR_ERR(trans);
+		goto e_free_hdr;
+	}
+
+	ret = -ENOMEM;
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data)
+		goto e_free_trans;
+
+	data->hdr_address = __psp_pa(hdr);
+	data->hdr_len = params.hdr_len;
+	data->trans_address = __psp_pa(trans);
+	data->trans_len = params.trans_len;
+
+	/* Pin guest memory */
+	ret = -EFAULT;
+	guest_page = sev_pin_memory(kvm, params.guest_uaddr & PAGE_MASK,
+				    PAGE_SIZE, &n, 0);
+	if (!guest_page)
+		goto e_free;
+
+	/* The RECEIVE_UPDATE_DATA command requires C-bit to be always set. */
+	data->guest_address = (page_to_pfn(guest_page[0]) << PAGE_SHIFT) +
+				offset;
+	data->guest_address |= sev_me_mask;
+	data->guest_len = params.guest_len;
+	data->handle = sev->handle;
+
+	ret = sev_issue_cmd(kvm, SEV_CMD_RECEIVE_UPDATE_DATA, data,
+				&argp->error);
+
+	sev_unpin_memory(kvm, guest_page, n);
+
+e_free:
+	kfree(data);
+e_free_trans:
+	kfree(trans);
+e_free_hdr:
+	kfree(hdr);
+
+	return ret;
+}
+
 int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
 {
 	struct kvm_sev_cmd sev_cmd;
@@ -1513,6 +1589,9 @@ int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
 	case KVM_SEV_RECEIVE_START:
 		r = sev_receive_start(kvm, &sev_cmd);
 		break;
+	case KVM_SEV_RECEIVE_UPDATE_DATA:
+		r = sev_receive_update_data(kvm, &sev_cmd);
+		break;
 	default:
 		r = -EINVAL;
 		goto out;
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 29c25e641a0c..3a656d43fc6c 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1759,6 +1759,15 @@ struct kvm_sev_receive_start {
 	__u32 session_len;
 };
 
+struct kvm_sev_receive_update_data {
+	__u64 hdr_uaddr;
+	__u32 hdr_len;
+	__u64 guest_uaddr;
+	__u32 guest_len;
+	__u64 trans_uaddr;
+	__u32 trans_len;
+};
+
 #define KVM_DEV_ASSIGN_ENABLE_IOMMU	(1 << 0)
 #define KVM_DEV_ASSIGN_PCI_2_3		(1 << 1)
 #define KVM_DEV_ASSIGN_MASK_INTX	(1 << 2)
-- 
2.17.1



* [PATCH v12 06/13] KVM: SVM: Add KVM_SEV_RECEIVE_FINISH command
  2021-04-12 19:42 [PATCH v12 00/13] Add AMD SEV guest live migration support Ashish Kalra
                   ` (4 preceding siblings ...)
  2021-04-12 19:44 ` [PATCH v12 05/13] KVM: SVM: Add KVM_SEV_RECEIVE_UPDATE_DATA command Ashish Kalra
@ 2021-04-12 19:44 ` Ashish Kalra
  2021-04-13  0:04   ` Steve Rutherford
  2021-04-12 19:45 ` [PATCH v12 07/13] KVM: x86: Add AMD SEV specific Hypercall3 Ashish Kalra
                   ` (6 subsequent siblings)
  12 siblings, 1 reply; 30+ messages in thread
From: Ashish Kalra @ 2021-04-12 19:44 UTC (permalink / raw)
  To: pbonzini
  Cc: tglx, mingo, hpa, joro, bp, thomas.lendacky, x86, kvm,
	linux-kernel, srutherford, seanjc, venu.busireddy, brijesh.singh

From: Brijesh Singh <brijesh.singh@amd.com>

The command finalizes the guest receiving process and makes the SEV guest
ready for execution.
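
With this command in place, the receive side mirrors the send side. A
sketch, where sev_ioctl() is a hypothetical wrapper around the
KVM_MEMORY_ENCRYPT_OP ioctl and more_pages_to_receive() stands in for
the VMM's migration loop:

        sev_ioctl(vm_fd, KVM_SEV_RECEIVE_START, &rs);
        while (more_pages_to_receive())
                sev_ioctl(vm_fd, KVM_SEV_RECEIVE_UPDATE_DATA, &update);
        sev_ioctl(vm_fd, KVM_SEV_RECEIVE_FINISH, NULL); /* guest runnable */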

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: x86@kernel.org
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Steve Rutherford <srutherford@google.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
 .../virt/kvm/amd-memory-encryption.rst        |  8 +++++++
 arch/x86/kvm/svm/sev.c                        | 23 +++++++++++++++++++
 2 files changed, 31 insertions(+)

diff --git a/Documentation/virt/kvm/amd-memory-encryption.rst b/Documentation/virt/kvm/amd-memory-encryption.rst
index c6ed5b26d841..0466c0febff9 100644
--- a/Documentation/virt/kvm/amd-memory-encryption.rst
+++ b/Documentation/virt/kvm/amd-memory-encryption.rst
@@ -396,6 +396,14 @@ Returns: 0 on success, -negative on error
                 __u32 trans_len;
         };
 
+15. KVM_SEV_RECEIVE_FINISH
+--------------------------
+
+After completion of the migration flow, the KVM_SEV_RECEIVE_FINISH command can be
+issued by the hypervisor to make the guest ready for execution.
+
+Returns: 0 on success, -negative on error
+
 References
 ==========
 
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 2c95657cc9bf..c9795a22e502 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -1524,6 +1524,26 @@ static int sev_receive_update_data(struct kvm *kvm, struct kvm_sev_cmd *argp)
 	return ret;
 }
 
+static int sev_receive_finish(struct kvm *kvm, struct kvm_sev_cmd *argp)
+{
+	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+	struct sev_data_receive_finish *data;
+	int ret;
+
+	if (!sev_guest(kvm))
+		return -ENOTTY;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
+
+	data->handle = sev->handle;
+	ret = sev_issue_cmd(kvm, SEV_CMD_RECEIVE_FINISH, data, &argp->error);
+
+	kfree(data);
+	return ret;
+}
+
 int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
 {
 	struct kvm_sev_cmd sev_cmd;
@@ -1592,6 +1612,9 @@ int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
 	case KVM_SEV_RECEIVE_UPDATE_DATA:
 		r = sev_receive_update_data(kvm, &sev_cmd);
 		break;
+	case KVM_SEV_RECEIVE_FINISH:
+		r = sev_receive_finish(kvm, &sev_cmd);
+		break;
 	default:
 		r = -EINVAL;
 		goto out;
-- 
2.17.1



* [PATCH v12 07/13] KVM: x86: Add AMD SEV specific Hypercall3
  2021-04-12 19:42 [PATCH v12 00/13] Add AMD SEV guest live migration support Ashish Kalra
                   ` (5 preceding siblings ...)
  2021-04-12 19:44 ` [PATCH v12 06/13] KVM: SVM: Add KVM_SEV_RECEIVE_FINISH command Ashish Kalra
@ 2021-04-12 19:45 ` Ashish Kalra
  2021-04-12 19:45 ` [PATCH v12 08/13] KVM: X86: Introduce KVM_HC_PAGE_ENC_STATUS hypercall Ashish Kalra
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 30+ messages in thread
From: Ashish Kalra @ 2021-04-12 19:45 UTC (permalink / raw)
  To: pbonzini
  Cc: tglx, mingo, hpa, joro, bp, thomas.lendacky, x86, kvm,
	linux-kernel, srutherford, seanjc, venu.busireddy, brijesh.singh

From: Brijesh Singh <brijesh.singh@amd.com>

The KVM hypercall framework relies on the alternatives framework to patch
VMCALL -> VMMCALL on AMD platforms. If a hypercall is made before
apply_alternatives() is called, it defaults to VMCALL. That approach works
fine on a non-SEV guest: a VMCALL causes a #UD, and the hypervisor can
decode the instruction and do the right thing. But when SEV is active,
guest memory is encrypted with the guest key, and the hypervisor will not
be able to decode the instruction bytes.

Add an SEV-specific hypercall3 that unconditionally uses VMMCALL. The
hypercall will be used by the SEV guest to notify the hypervisor of changes
in page encryption status.
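
Patch 9 of this series uses it when the encryption attribute of a
mapping changes; in simplified form:

        /*
         * Notify the hypervisor that npages starting at gpa changed
         * encryption status: enc = 1 (encrypted) or 0 (decrypted).
         */
        kvm_sev_hypercall3(KVM_HC_PAGE_ENC_STATUS, gpa, npages, enc);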

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: x86@kernel.org
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Steve Rutherford <srutherford@google.com>
Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
 arch/x86/include/asm/kvm_para.h | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h
index 338119852512..bc1b11d057fc 100644
--- a/arch/x86/include/asm/kvm_para.h
+++ b/arch/x86/include/asm/kvm_para.h
@@ -85,6 +85,18 @@ static inline long kvm_hypercall4(unsigned int nr, unsigned long p1,
 	return ret;
 }
 
+static inline long kvm_sev_hypercall3(unsigned int nr, unsigned long p1,
+				      unsigned long p2, unsigned long p3)
+{
+	long ret;
+
+	asm volatile("vmmcall"
+		     : "=a"(ret)
+		     : "a"(nr), "b"(p1), "c"(p2), "d"(p3)
+		     : "memory");
+	return ret;
+}
+
 #ifdef CONFIG_KVM_GUEST
 bool kvm_para_available(void);
 unsigned int kvm_arch_para_features(void);
-- 
2.17.1



* [PATCH v12 08/13] KVM: X86: Introduce KVM_HC_PAGE_ENC_STATUS hypercall
  2021-04-12 19:42 [PATCH v12 00/13] Add AMD SEV guest live migration support Ashish Kalra
                   ` (6 preceding siblings ...)
  2021-04-12 19:45 ` [PATCH v12 07/13] KVM: x86: Add AMD SEV specific Hypercall3 Ashish Kalra
@ 2021-04-12 19:45 ` Ashish Kalra
  2021-04-13  0:04   ` Steve Rutherford
  2021-04-12 19:45 ` [PATCH v12 09/13] mm: x86: Invoke hypercall when page encryption status is changed Ashish Kalra
                   ` (4 subsequent siblings)
  12 siblings, 1 reply; 30+ messages in thread
From: Ashish Kalra @ 2021-04-12 19:45 UTC (permalink / raw)
  To: pbonzini
  Cc: tglx, mingo, hpa, joro, bp, thomas.lendacky, x86, kvm,
	linux-kernel, srutherford, seanjc, venu.busireddy, brijesh.singh

From: Ashish Kalra <ashish.kalra@amd.com>

This hypercall is used by the SEV guest to notify a change in the page
encryption status to the hypervisor. The hypercall should be invoked
only when the encryption attribute is changed from encrypted -> decrypted
and vice versa. By default all guest pages are considered encrypted.

The hypercall exits to userspace to manage the guest shared regions and
integrate with the userspace VMM's migration code.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: x86@kernel.org
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Co-developed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 Documentation/virt/kvm/hypercalls.rst | 15 ++++++++++++++
 arch/x86/include/asm/kvm_host.h       |  2 ++
 arch/x86/kvm/svm/sev.c                |  1 +
 arch/x86/kvm/x86.c                    | 29 +++++++++++++++++++++++++++
 include/uapi/linux/kvm_para.h         |  1 +
 5 files changed, 48 insertions(+)

diff --git a/Documentation/virt/kvm/hypercalls.rst b/Documentation/virt/kvm/hypercalls.rst
index ed4fddd364ea..7aff0cebab7c 100644
--- a/Documentation/virt/kvm/hypercalls.rst
+++ b/Documentation/virt/kvm/hypercalls.rst
@@ -169,3 +169,18 @@ a0: destination APIC ID
 
 :Usage example: When sending a call-function IPI-many to vCPUs, yield if
 	        any of the IPI target vCPUs was preempted.
+
+
+8. KVM_HC_PAGE_ENC_STATUS
+-------------------------
+:Architecture: x86
+:Status: active
+:Purpose: Notify the hypervisor of encryption status changes in the guest page table (SEV guest)
+
+a0: the guest physical address of the start page
+a1: the number of pages
+a2: encryption attribute
+
+   Where:
+	* 1: Encryption attribute is set
+	* 0: Encryption attribute is cleared
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 3768819693e5..42eb0fe3df5d 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1050,6 +1050,8 @@ struct kvm_arch {
 
 	bool bus_lock_detection_enabled;
 
+	bool page_enc_hc_enable;
+
 	/* Deflect RDMSR and WRMSR to user space when they trigger a #GP */
 	u32 user_space_msr_mask;
 	struct kvm_x86_msr_filter __rcu *msr_filter;
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index c9795a22e502..5184a0c0131a 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -197,6 +197,7 @@ static int sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp)
 	sev->active = true;
 	sev->asid = asid;
 	INIT_LIST_HEAD(&sev->regions_list);
+	kvm->arch.page_enc_hc_enable = true;
 
 	return 0;
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index f7d12fca397b..e8986478b653 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8208,6 +8208,13 @@ static void kvm_sched_yield(struct kvm *kvm, unsigned long dest_id)
 		kvm_vcpu_yield_to(target);
 }
 
+static int complete_hypercall_exit(struct kvm_vcpu *vcpu)
+{
+	kvm_rax_write(vcpu, vcpu->run->hypercall.ret);
+	++vcpu->stat.hypercalls;
+	return kvm_skip_emulated_instruction(vcpu);
+}
+
 int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
 {
 	unsigned long nr, a0, a1, a2, a3, ret;
@@ -8273,6 +8280,28 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
 		kvm_sched_yield(vcpu->kvm, a0);
 		ret = 0;
 		break;
+	case KVM_HC_PAGE_ENC_STATUS: {
+		u64 gpa = a0, npages = a1, enc = a2;
+
+		ret = -KVM_ENOSYS;
+		if (!vcpu->kvm->arch.page_enc_hc_enable)
+			break;
+
+		if (!PAGE_ALIGNED(gpa) || !npages ||
+		    gpa_to_gfn(gpa) + npages <= gpa_to_gfn(gpa)) {
+			ret = -EINVAL;
+			break;
+		}
+
+		vcpu->run->exit_reason        = KVM_EXIT_HYPERCALL;
+		vcpu->run->hypercall.nr       = KVM_HC_PAGE_ENC_STATUS;
+		vcpu->run->hypercall.args[0]  = gpa;
+		vcpu->run->hypercall.args[1]  = npages;
+		vcpu->run->hypercall.args[2]  = enc;
+		vcpu->run->hypercall.longmode = op_64_bit;
+		vcpu->arch.complete_userspace_io = complete_hypercall_exit;
+		return 0;
+	}
 	default:
 		ret = -KVM_ENOSYS;
 		break;
diff --git a/include/uapi/linux/kvm_para.h b/include/uapi/linux/kvm_para.h
index 8b86609849b9..847b83b75dc8 100644
--- a/include/uapi/linux/kvm_para.h
+++ b/include/uapi/linux/kvm_para.h
@@ -29,6 +29,7 @@
 #define KVM_HC_CLOCK_PAIRING		9
 #define KVM_HC_SEND_IPI		10
 #define KVM_HC_SCHED_YIELD		11
+#define KVM_HC_PAGE_ENC_STATUS		12
 
 /*
  * hypercalls use architecture specific
-- 
2.17.1



* [PATCH v12 09/13] mm: x86: Invoke hypercall when page encryption status is changed
  2021-04-12 19:42 [PATCH v12 00/13] Add AMD SEV guest live migration support Ashish Kalra
                   ` (7 preceding siblings ...)
  2021-04-12 19:45 ` [PATCH v12 08/13] KVM: X86: Introduce KVM_HC_PAGE_ENC_STATUS hypercall Ashish Kalra
@ 2021-04-12 19:45 ` Ashish Kalra
  2021-04-13  0:06   ` Steve Rutherford
  2021-04-12 19:45 ` [PATCH v12 10/13] KVM: x86: Introduce new KVM_FEATURE_SEV_LIVE_MIGRATION feature & Custom MSR Ashish Kalra
                   ` (3 subsequent siblings)
  12 siblings, 1 reply; 30+ messages in thread
From: Ashish Kalra @ 2021-04-12 19:45 UTC (permalink / raw)
  To: pbonzini
  Cc: tglx, mingo, hpa, joro, bp, thomas.lendacky, x86, kvm,
	linux-kernel, srutherford, seanjc, venu.busireddy, brijesh.singh

From: Brijesh Singh <brijesh.singh@amd.com>

Invoke a hypercall when a memory region is changed from encrypted ->
decrypted and vice versa. The hypervisor needs to know the page encryption
status during guest migration.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: x86@kernel.org
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
 arch/x86/include/asm/paravirt.h       | 10 +++++
 arch/x86/include/asm/paravirt_types.h |  2 +
 arch/x86/kernel/paravirt.c            |  1 +
 arch/x86/mm/mem_encrypt.c             | 57 ++++++++++++++++++++++++++-
 arch/x86/mm/pat/set_memory.c          |  7 ++++
 5 files changed, 76 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index 4abf110e2243..efaa3e628967 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -84,6 +84,12 @@ static inline void paravirt_arch_exit_mmap(struct mm_struct *mm)
 	PVOP_VCALL1(mmu.exit_mmap, mm);
 }
 
+static inline void page_encryption_changed(unsigned long vaddr, int npages,
+						bool enc)
+{
+	PVOP_VCALL3(mmu.page_encryption_changed, vaddr, npages, enc);
+}
+
 #ifdef CONFIG_PARAVIRT_XXL
 static inline void load_sp0(unsigned long sp0)
 {
@@ -799,6 +805,10 @@ static inline void paravirt_arch_dup_mmap(struct mm_struct *oldmm,
 static inline void paravirt_arch_exit_mmap(struct mm_struct *mm)
 {
 }
+
+static inline void page_encryption_changed(unsigned long vaddr, int npages, bool enc)
+{
+}
 #endif
 #endif /* __ASSEMBLY__ */
 #endif /* _ASM_X86_PARAVIRT_H */
diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
index de87087d3bde..69ef9c207b38 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -195,6 +195,8 @@ struct pv_mmu_ops {
 
 	/* Hook for intercepting the destruction of an mm_struct. */
 	void (*exit_mmap)(struct mm_struct *mm);
+	void (*page_encryption_changed)(unsigned long vaddr, int npages,
+					bool enc);
 
 #ifdef CONFIG_PARAVIRT_XXL
 	struct paravirt_callee_save read_cr2;
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index c60222ab8ab9..9f206e192f6b 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -335,6 +335,7 @@ struct paravirt_patch_template pv_ops = {
 			(void (*)(struct mmu_gather *, void *))tlb_remove_page,
 
 	.mmu.exit_mmap		= paravirt_nop,
+	.mmu.page_encryption_changed	= paravirt_nop,
 
 #ifdef CONFIG_PARAVIRT_XXL
 	.mmu.read_cr2		= __PV_IS_CALLEE_SAVE(native_read_cr2),
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index ae78cef79980..fae9ccbd0da7 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -19,6 +19,7 @@
 #include <linux/kernel.h>
 #include <linux/bitops.h>
 #include <linux/dma-mapping.h>
+#include <linux/kvm_para.h>
 
 #include <asm/tlbflush.h>
 #include <asm/fixmap.h>
@@ -29,6 +30,7 @@
 #include <asm/processor-flags.h>
 #include <asm/msr.h>
 #include <asm/cmdline.h>
+#include <asm/kvm_para.h>
 
 #include "mm_internal.h"
 
@@ -229,6 +231,47 @@ void __init sev_setup_arch(void)
 	swiotlb_adjust_size(size);
 }
 
+static void set_memory_enc_dec_hypercall(unsigned long vaddr, int npages,
+					bool enc)
+{
+	unsigned long sz = npages << PAGE_SHIFT;
+	unsigned long vaddr_end, vaddr_next;
+
+	vaddr_end = vaddr + sz;
+
+	for (; vaddr < vaddr_end; vaddr = vaddr_next) {
+		int psize, pmask, level;
+		unsigned long pfn;
+		pte_t *kpte;
+
+		kpte = lookup_address(vaddr, &level);
+		if (!kpte || pte_none(*kpte))
+			return;
+
+		switch (level) {
+		case PG_LEVEL_4K:
+			pfn = pte_pfn(*kpte);
+			break;
+		case PG_LEVEL_2M:
+			pfn = pmd_pfn(*(pmd_t *)kpte);
+			break;
+		case PG_LEVEL_1G:
+			pfn = pud_pfn(*(pud_t *)kpte);
+			break;
+		default:
+			return;
+		}
+
+		psize = page_level_size(level);
+		pmask = page_level_mask(level);
+
+		kvm_sev_hypercall3(KVM_HC_PAGE_ENC_STATUS,
+				   pfn << PAGE_SHIFT, psize >> PAGE_SHIFT, enc);
+
+		vaddr_next = (vaddr & pmask) + psize;
+	}
+}
+
 static void __init __set_clr_pte_enc(pte_t *kpte, int level, bool enc)
 {
 	pgprot_t old_prot, new_prot;
@@ -286,12 +329,13 @@ static void __init __set_clr_pte_enc(pte_t *kpte, int level, bool enc)
 static int __init early_set_memory_enc_dec(unsigned long vaddr,
 					   unsigned long size, bool enc)
 {
-	unsigned long vaddr_end, vaddr_next;
+	unsigned long vaddr_end, vaddr_next, start;
 	unsigned long psize, pmask;
 	int split_page_size_mask;
 	int level, ret;
 	pte_t *kpte;
 
+	start = vaddr;
 	vaddr_next = vaddr;
 	vaddr_end = vaddr + size;
 
@@ -346,6 +390,8 @@ static int __init early_set_memory_enc_dec(unsigned long vaddr,
 
 	ret = 0;
 
+	set_memory_enc_dec_hypercall(start, PAGE_ALIGN(size) >> PAGE_SHIFT,
+					enc);
 out:
 	__flush_tlb_all();
 	return ret;
@@ -481,6 +527,15 @@ void __init mem_encrypt_init(void)
 	if (sev_active() && !sev_es_active())
 		static_branch_enable(&sev_enable_key);
 
+#ifdef CONFIG_PARAVIRT
+	/*
+	 * With SEV, we need to make a hypercall when page encryption state is
+	 * changed.
+	 */
+	if (sev_active())
+		pv_ops.mmu.page_encryption_changed = set_memory_enc_dec_hypercall;
+#endif
+
 	print_mem_encrypt_feature_info();
 }
 
diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index 16f878c26667..3576b583ac65 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -27,6 +27,7 @@
 #include <asm/proto.h>
 #include <asm/memtype.h>
 #include <asm/set_memory.h>
+#include <asm/paravirt.h>
 
 #include "../mm_internal.h"
 
@@ -2012,6 +2013,12 @@ static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc)
 	 */
 	cpa_flush(&cpa, 0);
 
+	/*
+	 * Notify the hypervisor that a given memory range is mapped
+	 * encrypted or decrypted. It will use this during VM migration.
+	 */
+	page_encryption_changed(addr, numpages, enc);
+
 	return ret;
 }
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v12 10/13] KVM: x86: Introduce new KVM_FEATURE_SEV_LIVE_MIGRATION feature & Custom MSR.
  2021-04-12 19:42 [PATCH v12 00/13] Add AMD SEV guest live migration support Ashish Kalra
                   ` (8 preceding siblings ...)
  2021-04-12 19:45 ` [PATCH v12 09/13] mm: x86: Invoke hypercall when page encryption status is changed Ashish Kalra
@ 2021-04-12 19:45 ` Ashish Kalra
  2021-04-13  0:04   ` Steve Rutherford
  2021-04-12 19:46 ` [PATCH v12 11/13] EFI: Introduce the new AMD Memory Encryption GUID Ashish Kalra
                   ` (2 subsequent siblings)
  12 siblings, 1 reply; 30+ messages in thread
From: Ashish Kalra @ 2021-04-12 19:45 UTC (permalink / raw)
  To: pbonzini
  Cc: tglx, mingo, hpa, joro, bp, thomas.lendacky, x86, kvm,
	linux-kernel, srutherford, seanjc, venu.busireddy, brijesh.singh

From: Ashish Kalra <ashish.kalra@amd.com>

Add a new KVM_FEATURE_SEV_LIVE_MIGRATION feature for the guest to
check for host-side support for SEV live migration. Also add a new
custom MSR_KVM_SEV_LIVE_MIGRATION for the guest to enable the SEV
live migration feature.

The MSR is handled by userspace using MSR filters.
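
Taken together, the intended guest-side usage is roughly the following
(a condensed sketch; the full EFI-aware detection is implemented in a
later patch of this series):

	if (kvm_para_has_feature(KVM_FEATURE_SEV_LIVE_MIGRATION)) {
		/*
		 * Tell the host that this guest properly maintains the
		 * shared pages list.
		 */
		wrmsrl(MSR_KVM_SEV_LIVE_MIGRATION,
		       KVM_SEV_LIVE_MIGRATION_ENABLED);
	}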

Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
 Documentation/virt/kvm/cpuid.rst     |  5 +++++
 Documentation/virt/kvm/msr.rst       | 12 ++++++++++++
 arch/x86/include/uapi/asm/kvm_para.h |  4 ++++
 arch/x86/kvm/cpuid.c                 |  3 ++-
 4 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/Documentation/virt/kvm/cpuid.rst b/Documentation/virt/kvm/cpuid.rst
index cf62162d4be2..0bdb6cdb12d3 100644
--- a/Documentation/virt/kvm/cpuid.rst
+++ b/Documentation/virt/kvm/cpuid.rst
@@ -96,6 +96,11 @@ KVM_FEATURE_MSI_EXT_DEST_ID        15          guest checks this feature bit
                                                before using extended destination
                                                ID bits in MSI address bits 11-5.
 
+KVM_FEATURE_SEV_LIVE_MIGRATION     16          guest checks this feature bit before
+                                               using the page encryption state
+                                               hypercall to notify the page state
+                                               change
+
 KVM_FEATURE_CLOCKSOURCE_STABLE_BIT 24          host will warn if no guest-side
                                                per-cpu warps are expected in
                                                kvmclock
diff --git a/Documentation/virt/kvm/msr.rst b/Documentation/virt/kvm/msr.rst
index e37a14c323d2..020245d16087 100644
--- a/Documentation/virt/kvm/msr.rst
+++ b/Documentation/virt/kvm/msr.rst
@@ -376,3 +376,15 @@ data:
 	write '1' to bit 0 of the MSR, this causes the host to re-scan its queue
 	and check if there are more notifications pending. The MSR is available
 	if KVM_FEATURE_ASYNC_PF_INT is present in CPUID.
+
+MSR_KVM_SEV_LIVE_MIGRATION:
+        0x4b564d08
+
+	Control SEV Live Migration features.
+
+data:
+        Bit 0 enables (1) or disables (0) the host-side SEV Live Migration
+        feature; in other words, it tells the host that the guest is
+        properly maintaining the shared pages list.
+
+        All other bits are reserved.
diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
index 950afebfba88..f6bfa138874f 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -33,6 +33,7 @@
 #define KVM_FEATURE_PV_SCHED_YIELD	13
 #define KVM_FEATURE_ASYNC_PF_INT	14
 #define KVM_FEATURE_MSI_EXT_DEST_ID	15
+#define KVM_FEATURE_SEV_LIVE_MIGRATION	16
 
 #define KVM_HINTS_REALTIME      0
 
@@ -54,6 +55,7 @@
 #define MSR_KVM_POLL_CONTROL	0x4b564d05
 #define MSR_KVM_ASYNC_PF_INT	0x4b564d06
 #define MSR_KVM_ASYNC_PF_ACK	0x4b564d07
+#define MSR_KVM_SEV_LIVE_MIGRATION	0x4b564d08
 
 struct kvm_steal_time {
 	__u64 steal;
@@ -136,4 +138,6 @@ struct kvm_vcpu_pv_apf_data {
 #define KVM_PV_EOI_ENABLED KVM_PV_EOI_MASK
 #define KVM_PV_EOI_DISABLED 0x0
 
+#define KVM_SEV_LIVE_MIGRATION_ENABLED BIT_ULL(0)
+
 #endif /* _UAPI_ASM_X86_KVM_PARA_H */
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 6bd2f8b830e4..4e2e69a692aa 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -812,7 +812,8 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
 			     (1 << KVM_FEATURE_PV_SEND_IPI) |
 			     (1 << KVM_FEATURE_POLL_CONTROL) |
 			     (1 << KVM_FEATURE_PV_SCHED_YIELD) |
-			     (1 << KVM_FEATURE_ASYNC_PF_INT);
+			     (1 << KVM_FEATURE_ASYNC_PF_INT) |
+			     (1 << KVM_FEATURE_SEV_LIVE_MIGRATION);
 
 		if (sched_info_on())
 			entry->eax |= (1 << KVM_FEATURE_STEAL_TIME);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v12 11/13] EFI: Introduce the new AMD Memory Encryption GUID.
  2021-04-12 19:42 [PATCH v12 00/13] Add AMD SEV guest live migration support Ashish Kalra
                   ` (9 preceding siblings ...)
  2021-04-12 19:45 ` [PATCH v12 10/13] KVM: x86: Introduce new KVM_FEATURE_SEV_LIVE_MIGRATION feature & Custom MSR Ashish Kalra
@ 2021-04-12 19:46 ` Ashish Kalra
  2021-04-13  0:05   ` Steve Rutherford
  2021-04-12 19:46 ` [PATCH v12 12/13] x86/kvm: Add guest support for detecting and enabling SEV Live Migration feature Ashish Kalra
  2021-04-12 19:47 ` [PATCH v12 13/13] x86/kvm: Add kexec support for SEV Live Migration Ashish Kalra
  12 siblings, 1 reply; 30+ messages in thread
From: Ashish Kalra @ 2021-04-12 19:46 UTC (permalink / raw)
  To: pbonzini
  Cc: tglx, mingo, hpa, joro, bp, thomas.lendacky, x86, kvm,
	linux-kernel, srutherford, seanjc, venu.busireddy, brijesh.singh

From: Ashish Kalra <ashish.kalra@amd.com>

Introduce a new AMD Memory Encryption GUID which is currently used
to define a new UEFI variable indicating UEFI/OVMF support for the
SEV live migration feature. This variable is set up when UEFI/OVMF
detects host/hypervisor support for SEV live migration. Later, the
variable is read by the kernel using EFI runtime services to verify
whether OVMF supports the live migration feature.
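
For illustration, the guest-side read of this variable looks roughly
like the following (a sketch; the variable name "SevLiveMigrationEnabled"
is the one used by the guest kernel patch later in this series):

	efi_char16_t name[] = L"SevLiveMigrationEnabled";
	efi_guid_t guid = MEM_ENCRYPT_GUID;
	unsigned long size = sizeof(bool);
	efi_status_t status;
	bool enabled;

	status = efi.get_variable(name, &guid, NULL, &size, &enabled);
	if (status == EFI_SUCCESS && enabled)
		pr_info("OVMF supports SEV live migration\n");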

Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
 include/linux/efi.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/efi.h b/include/linux/efi.h
index 6b5d36babfcc..6f364ace82cb 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -362,6 +362,7 @@ void efi_native_runtime_setup(void);
 
 /* OEM GUIDs */
 #define DELLEMC_EFI_RCI2_TABLE_GUID		EFI_GUID(0x2d9f28a2, 0xa886, 0x456a,  0x97, 0xa8, 0xf1, 0x1e, 0xf2, 0x4f, 0xf4, 0x55)
+#define MEM_ENCRYPT_GUID			EFI_GUID(0x0cf29b71, 0x9e51, 0x433a,  0xa3, 0xb7, 0x81, 0xf3, 0xab, 0x16, 0xb8, 0x75)
 
 typedef struct {
 	efi_guid_t guid;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v12 12/13] x86/kvm: Add guest support for detecting and enabling SEV Live Migration feature.
  2021-04-12 19:42 [PATCH v12 00/13] Add AMD SEV guest live migration support Ashish Kalra
                   ` (10 preceding siblings ...)
  2021-04-12 19:46 ` [PATCH v12 11/13] EFI: Introduce the new AMD Memory Encryption GUID Ashish Kalra
@ 2021-04-12 19:46 ` Ashish Kalra
  2021-04-13  0:25   ` Steve Rutherford
  2021-04-12 19:47 ` [PATCH v12 13/13] x86/kvm: Add kexec support for SEV Live Migration Ashish Kalra
  12 siblings, 1 reply; 30+ messages in thread
From: Ashish Kalra @ 2021-04-12 19:46 UTC (permalink / raw)
  To: pbonzini
  Cc: tglx, mingo, hpa, joro, bp, thomas.lendacky, x86, kvm,
	linux-kernel, srutherford, seanjc, venu.busireddy, brijesh.singh

From: Ashish Kalra <ashish.kalra@amd.com>

The guest support for detecting and enabling the SEV live migration
feature uses the following logic (a condensed sketch follows the list):

 - kvm_init_platform() invokes check_kvm_sev_migration(), which
   checks whether the kernel was booted under EFI

   - If not EFI:

     i) check for the KVM_FEATURE_SEV_LIVE_MIGRATION CPUID bit

     ii) if CPUID reports that migration is supported, issue a wrmsrl()
         to enable the SEV live migration support

   - If EFI:

     i) check for the KVM_FEATURE_SEV_LIVE_MIGRATION CPUID bit

     ii) if CPUID reports that migration is supported, read the UEFI
         variable which indicates OVMF support for live migration

     iii) if the variable indicates live migration is supported, issue
          a wrmsrl() to enable the SEV live migration support

The EFI live migration check is done using a late_initcall() callback.

Also, ensure that the _bss_decrypted section is marked as decrypted in
the shared pages list.
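
A condensed sketch of that flow (error handling omitted; the real code
below splits this between check_kvm_sev_migration() and a
late_initcall(), and ovmf_live_migration_supported() is a stand-in for
the UEFI variable lookup):

	if (!sev_active() ||
	    !kvm_para_has_feature(KVM_FEATURE_SEV_LIVE_MIGRATION))
		return;

	if (!efi_enabled(EFI_BOOT)) {
		/* Non-EFI boot: enable right away. */
		wrmsrl(MSR_KVM_SEV_LIVE_MIGRATION,
		       KVM_SEV_LIVE_MIGRATION_ENABLED);
	} else if (ovmf_live_migration_supported()) {
		/* EFI boot: enable only if OVMF has opted in, too. */
		wrmsrl(MSR_KVM_SEV_LIVE_MIGRATION,
		       KVM_SEV_LIVE_MIGRATION_ENABLED);
	}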

Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
 arch/x86/include/asm/mem_encrypt.h |  8 +++++
 arch/x86/kernel/kvm.c              | 52 ++++++++++++++++++++++++++++++
 arch/x86/mm/mem_encrypt.c          | 41 +++++++++++++++++++++++
 3 files changed, 101 insertions(+)

diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h
index 31c4df123aa0..19b77f3a62dc 100644
--- a/arch/x86/include/asm/mem_encrypt.h
+++ b/arch/x86/include/asm/mem_encrypt.h
@@ -21,6 +21,7 @@
 extern u64 sme_me_mask;
 extern u64 sev_status;
 extern bool sev_enabled;
+extern bool sev_live_migration_enabled;
 
 void sme_encrypt_execute(unsigned long encrypted_kernel_vaddr,
 			 unsigned long decrypted_kernel_vaddr,
@@ -44,8 +45,11 @@ void __init sme_enable(struct boot_params *bp);
 
 int __init early_set_memory_decrypted(unsigned long vaddr, unsigned long size);
 int __init early_set_memory_encrypted(unsigned long vaddr, unsigned long size);
+void __init early_set_mem_enc_dec_hypercall(unsigned long vaddr, int npages,
+					    bool enc);
 
 void __init mem_encrypt_free_decrypted_mem(void);
+void __init check_kvm_sev_migration(void);
 
 /* Architecture __weak replacement functions */
 void __init mem_encrypt_init(void);
@@ -60,6 +64,7 @@ bool sev_es_active(void);
 #else	/* !CONFIG_AMD_MEM_ENCRYPT */
 
 #define sme_me_mask	0ULL
+#define sev_live_migration_enabled	false
 
 static inline void __init sme_early_encrypt(resource_size_t paddr,
 					    unsigned long size) { }
@@ -84,8 +89,11 @@ static inline int __init
 early_set_memory_decrypted(unsigned long vaddr, unsigned long size) { return 0; }
 static inline int __init
 early_set_memory_encrypted(unsigned long vaddr, unsigned long size) { return 0; }
+static inline void __init
+early_set_mem_enc_dec_hypercall(unsigned long vaddr, int npages, bool enc) {}
 
 static inline void mem_encrypt_free_decrypted_mem(void) { }
+static inline void check_kvm_sev_migration(void) { }
 
 #define __bss_decrypted
 
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 78bb0fae3982..bcc82e0c9779 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -26,6 +26,7 @@
 #include <linux/kprobes.h>
 #include <linux/nmi.h>
 #include <linux/swait.h>
+#include <linux/efi.h>
 #include <asm/timer.h>
 #include <asm/cpu.h>
 #include <asm/traps.h>
@@ -429,6 +430,56 @@ static inline void __set_percpu_decrypted(void *ptr, unsigned long size)
 	early_set_memory_decrypted((unsigned long) ptr, size);
 }
 
+static int __init setup_kvm_sev_migration(void)
+{
+	efi_char16_t efi_sev_live_migration_enabled[] = L"SevLiveMigrationEnabled";
+	efi_guid_t efi_variable_guid = MEM_ENCRYPT_GUID;
+	efi_status_t status;
+	unsigned long size;
+	bool enabled;
+
+	/*
+	 * check_kvm_sev_migration(), invoked via kvm_init_platform() before
+	 * this callback runs, will have set the indicator that the live
+	 * migration feature is supported/enabled.
+	 */
+	if (!sev_live_migration_enabled)
+		return 0;
+
+	if (!efi_enabled(EFI_RUNTIME_SERVICES)) {
+		pr_info("%s: EFI runtime services are not enabled\n", __func__);
+		return 0;
+	}
+
+	size = sizeof(enabled);
+
+	/* Get variable contents into buffer */
+	status = efi.get_variable(efi_sev_live_migration_enabled,
+				  &efi_variable_guid, NULL, &size, &enabled);
+
+	if (status == EFI_NOT_FOUND) {
+		pr_info("%s: EFI live migration variable not found\n", __func__);
+		return 0;
+	}
+
+	if (status != EFI_SUCCESS) {
+		pr_info("%s: EFI variable retrieval failed\n", __func__);
+		return 0;
+	}
+
+	if (enabled == 0) {
+		pr_info("%s: live migration disabled in EFI\n", __func__);
+		return 0;
+	}
+
+	pr_info("%s: live migration enabled in EFI\n", __func__);
+	wrmsrl(MSR_KVM_SEV_LIVE_MIGRATION, KVM_SEV_LIVE_MIGRATION_ENABLED);
+
+	return 0;
+}
+
+late_initcall(setup_kvm_sev_migration);
+
 /*
  * Iterate through all possible CPUs and map the memory region pointed
  * by apf_reason, steal_time and kvm_apic_eoi as decrypted at once.
@@ -747,6 +798,7 @@ static bool __init kvm_msi_ext_dest_id(void)
 
 static void __init kvm_init_platform(void)
 {
+	check_kvm_sev_migration();
 	kvmclock_init();
 	x86_platform.apic_post_init = kvm_apic_init;
 }
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index fae9ccbd0da7..4de417333c09 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -20,6 +20,7 @@
 #include <linux/bitops.h>
 #include <linux/dma-mapping.h>
 #include <linux/kvm_para.h>
+#include <linux/efi.h>
 
 #include <asm/tlbflush.h>
 #include <asm/fixmap.h>
@@ -48,6 +49,8 @@ EXPORT_SYMBOL_GPL(sev_enable_key);
 
 bool sev_enabled __section(".data");
 
+bool sev_live_migration_enabled __section(".data");
+
 /* Buffer used for early in-place encryption by BSP, no locking needed */
 static char sme_early_buffer[PAGE_SIZE] __initdata __aligned(PAGE_SIZE);
 
@@ -237,6 +240,9 @@ static void set_memory_enc_dec_hypercall(unsigned long vaddr, int npages,
 	unsigned long sz = npages << PAGE_SHIFT;
 	unsigned long vaddr_end, vaddr_next;
 
+	if (!sev_live_migration_enabled)
+		return;
+
 	vaddr_end = vaddr + sz;
 
 	for (; vaddr < vaddr_end; vaddr = vaddr_next) {
@@ -407,6 +413,12 @@ int __init early_set_memory_encrypted(unsigned long vaddr, unsigned long size)
 	return early_set_memory_enc_dec(vaddr, size, true);
 }
 
+void __init early_set_mem_enc_dec_hypercall(unsigned long vaddr, int npages,
+					bool enc)
+{
+	set_memory_enc_dec_hypercall(vaddr, npages, enc);
+}
+
 /*
  * SME and SEV are very similar but they are not the same, so there are
  * times that the kernel will need to distinguish between SME and SEV. The
@@ -462,6 +474,35 @@ bool force_dma_unencrypted(struct device *dev)
 	return false;
 }
 
+void __init check_kvm_sev_migration(void)
+{
+	if (sev_active() &&
+	    kvm_para_has_feature(KVM_FEATURE_SEV_LIVE_MIGRATION)) {
+		unsigned long nr_pages;
+
+		pr_info("KVM enable live migration\n");
+		sev_live_migration_enabled = true;
+
+		/*
+		 * Ensure that _bss_decrypted section is marked as decrypted in the
+		 * shared pages list.
+		 */
+		nr_pages = DIV_ROUND_UP(__end_bss_decrypted - __start_bss_decrypted,
+					PAGE_SIZE);
+		early_set_mem_enc_dec_hypercall((unsigned long)__start_bss_decrypted,
+						nr_pages, 0);
+
+		/*
+		 * If not booted using EFI, enable live migration support.
+		 */
+		if (!efi_enabled(EFI_BOOT))
+			wrmsrl(MSR_KVM_SEV_LIVE_MIGRATION,
+			       KVM_SEV_LIVE_MIGRATION_ENABLED);
+	} else {
+		pr_info("KVM enable live migration feature unsupported\n");
+	}
+}
+
 void __init mem_encrypt_free_decrypted_mem(void)
 {
 	unsigned long vaddr, vaddr_end, npages;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v12 13/13] x86/kvm: Add kexec support for SEV Live Migration.
  2021-04-12 19:42 [PATCH v12 00/13] Add AMD SEV guest live migration support Ashish Kalra
                   ` (11 preceding siblings ...)
  2021-04-12 19:46 ` [PATCH v12 12/13] x86/kvm: Add guest support for detecting and enabling SEV Live Migration feature Ashish Kalra
@ 2021-04-12 19:47 ` Ashish Kalra
  2021-04-13  0:22   ` Steve Rutherford
  12 siblings, 1 reply; 30+ messages in thread
From: Ashish Kalra @ 2021-04-12 19:47 UTC (permalink / raw)
  To: pbonzini
  Cc: tglx, mingo, hpa, joro, bp, thomas.lendacky, x86, kvm,
	linux-kernel, srutherford, seanjc, venu.busireddy, brijesh.singh,
	kexec

From: Ashish Kalra <ashish.kalra@amd.com>

Reset the host's shared pages list related to kernel-specific
page encryption status settings before loading a new kernel via
kexec. We cannot reset the complete shared pages list here, as we
need to retain the UEFI/OVMF firmware-specific settings.

The host's shared pages list is maintained for the guest to keep
track of all unencrypted guest memory regions; therefore, we need
to explicitly mark all shared pages as encrypted again before
rebooting into the new guest kernel (see the sketch below).
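
Schematically, the reset replays the page encryption status hypercall
from earlier in this series over every E820 RAM region, with the
encrypted attribute set. Note that kvm_pv_guest_cpu_reboot() runs on
each CPU, hence the boot-CPU check in the code below. A sketch for a
single e820 region 'entry':

	/* Re-mark [addr, addr + size) as encrypted before kexec. */
	kvm_sev_hypercall3(KVM_HC_PAGE_ENC_STATUS, entry->addr,
			   DIV_ROUND_UP(entry->size, PAGE_SIZE),
			   1 /* encrypted */);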

Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
 arch/x86/kernel/kvm.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index bcc82e0c9779..4ad3ed547ff1 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -39,6 +39,7 @@
 #include <asm/cpuidle_haltpoll.h>
 #include <asm/ptrace.h>
 #include <asm/svm.h>
+#include <asm/e820/api.h>
 
 DEFINE_STATIC_KEY_FALSE(kvm_async_pf_enabled);
 
@@ -384,6 +385,29 @@ static void kvm_pv_guest_cpu_reboot(void *unused)
 	 */
 	if (kvm_para_has_feature(KVM_FEATURE_PV_EOI))
 		wrmsrl(MSR_KVM_PV_EOI_EN, 0);
+	/*
+	 * Reset the host's shared pages list related to kernel
+	 * specific page encryption status settings before we load a
+	 * new kernel by kexec. NOTE: We cannot reset the complete
+	 * shared pages list here as we need to retain the
+	 * UEFI/OVMF firmware specific settings.
+	 */
+	if (sev_live_migration_enabled && (smp_processor_id() == 0)) {
+		int i;
+		unsigned long nr_pages;
+
+		for (i = 0; i < e820_table->nr_entries; i++) {
+			struct e820_entry *entry = &e820_table->entries[i];
+
+			if (entry->type != E820_TYPE_RAM)
+				continue;
+
+			nr_pages = DIV_ROUND_UP(entry->size, PAGE_SIZE);
+
+			kvm_sev_hypercall3(KVM_HC_PAGE_ENC_STATUS,
+					   entry->addr, nr_pages, 1);
+		}
+	}
 	kvm_pv_disable_apf();
 	kvm_disable_steal_time();
 }
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH v12 03/13] KVM: SVM: Add KVM_SEV_SEND_FINISH command
  2021-04-12 19:43 ` [PATCH v12 03/13] KVM: SVM: Add KVM_SEV_SEND_FINISH command Ashish Kalra
@ 2021-04-13  0:03   ` Steve Rutherford
  0 siblings, 0 replies; 30+ messages in thread
From: Steve Rutherford @ 2021-04-13  0:03 UTC (permalink / raw)
  To: Ashish Kalra
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Joerg Roedel, Borislav Petkov, Tom Lendacky, X86 ML, KVM list,
	LKML, Sean Christopherson, Venu Busireddy, Brijesh Singh

On Mon, Apr 12, 2021 at 12:44 PM Ashish Kalra <Ashish.Kalra@amd.com> wrote:
>
> From: Brijesh Singh <brijesh.singh@amd.com>
>
> The command is used to finalize the encryption context created with
> the KVM_SEV_SEND_START command.
>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Joerg Roedel <joro@8bytes.org>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Tom Lendacky <thomas.lendacky@amd.com>
> Cc: x86@kernel.org
> Cc: kvm@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
> ---
>  .../virt/kvm/amd-memory-encryption.rst        |  8 +++++++
>  arch/x86/kvm/svm/sev.c                        | 23 +++++++++++++++++++
>  2 files changed, 31 insertions(+)
>
> diff --git a/Documentation/virt/kvm/amd-memory-encryption.rst b/Documentation/virt/kvm/amd-memory-encryption.rst
> index 3c5456e0268a..26c4e6c83f62 100644
> --- a/Documentation/virt/kvm/amd-memory-encryption.rst
> +++ b/Documentation/virt/kvm/amd-memory-encryption.rst
> @@ -335,6 +335,14 @@ Returns: 0 on success, -negative on error
>                  __u32 trans_len;
>          };
>
> +12. KVM_SEV_SEND_FINISH
> +------------------------
> +
> +After completion of the migration flow, the KVM_SEV_SEND_FINISH command can be
> +issued by the hypervisor to delete the encryption context.
> +
> +Returns: 0 on success, -negative on error
> +
>  References
>  ==========
>
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 30527285a39a..92325d9527ce 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -1350,6 +1350,26 @@ static int sev_send_update_data(struct kvm *kvm, struct kvm_sev_cmd *argp)
>         return ret;
>  }
>
> +static int sev_send_finish(struct kvm *kvm, struct kvm_sev_cmd *argp)
> +{
> +       struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
> +       struct sev_data_send_finish *data;
> +       int ret;
> +
> +       if (!sev_guest(kvm))
> +               return -ENOTTY;
> +
> +       data = kzalloc(sizeof(*data), GFP_KERNEL);
> +       if (!data)
> +               return -ENOMEM;
> +
> +       data->handle = sev->handle;
> +       ret = sev_issue_cmd(kvm, SEV_CMD_SEND_FINISH, data, &argp->error);
> +
> +       kfree(data);
> +       return ret;
> +}
> +
>  int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
>  {
>         struct kvm_sev_cmd sev_cmd;
> @@ -1409,6 +1429,9 @@ int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
>         case KVM_SEV_SEND_UPDATE_DATA:
>                 r = sev_send_update_data(kvm, &sev_cmd);
>                 break;
> +       case KVM_SEV_SEND_FINISH:
> +               r = sev_send_finish(kvm, &sev_cmd);
> +               break;
>         default:
>                 r = -EINVAL;
>                 goto out;
> --
> 2.17.1
>
Reviewed-by: Steve Rutherford <srutherford@google.com>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v12 04/13] KVM: SVM: Add support for KVM_SEV_RECEIVE_START command
  2021-04-12 19:44 ` [PATCH v12 04/13] KVM: SVM: Add support for KVM_SEV_RECEIVE_START command Ashish Kalra
@ 2021-04-13  0:03   ` Steve Rutherford
  0 siblings, 0 replies; 30+ messages in thread
From: Steve Rutherford @ 2021-04-13  0:03 UTC (permalink / raw)
  To: Ashish Kalra
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Joerg Roedel, Borislav Petkov, Tom Lendacky, X86 ML, KVM list,
	LKML, Sean Christopherson, Venu Busireddy, Brijesh Singh

On Mon, Apr 12, 2021 at 12:44 PM Ashish Kalra <Ashish.Kalra@amd.com> wrote:
>
> From: Brijesh Singh <brijesh.singh@amd.com>
>
> The command is used to create the encryption context for an incoming
> SEV guest. The encryption context can be later used by the hypervisor
> to import the incoming data into the SEV guest memory space.
>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Joerg Roedel <joro@8bytes.org>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Tom Lendacky <thomas.lendacky@amd.com>
> Cc: x86@kernel.org
> Cc: kvm@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
> ---
>  .../virt/kvm/amd-memory-encryption.rst        | 29 +++++++
>  arch/x86/kvm/svm/sev.c                        | 81 +++++++++++++++++++
>  include/uapi/linux/kvm.h                      |  9 +++
>  3 files changed, 119 insertions(+)
>
> diff --git a/Documentation/virt/kvm/amd-memory-encryption.rst b/Documentation/virt/kvm/amd-memory-encryption.rst
> index 26c4e6c83f62..c86c1ded8dd8 100644
> --- a/Documentation/virt/kvm/amd-memory-encryption.rst
> +++ b/Documentation/virt/kvm/amd-memory-encryption.rst
> @@ -343,6 +343,35 @@ issued by the hypervisor to delete the encryption context.
>
>  Returns: 0 on success, -negative on error
>
> +13. KVM_SEV_RECEIVE_START
> +-------------------------
> +
> +The KVM_SEV_RECEIVE_START command is used for creating the memory encryption
> +context for an incoming SEV guest. To create the encryption context, the user must
> +provide a guest policy, the platform public Diffie-Hellman (PDH) key and session
> +information.
> +
> +Parameters: struct  kvm_sev_receive_start (in/out)
> +
> +Returns: 0 on success, -negative on error
> +
> +::
> +
> +        struct kvm_sev_receive_start {
> +                __u32 handle;           /* if zero then firmware creates a new handle */
> +                __u32 policy;           /* guest's policy */
> +
> +                __u64 pdh_uaddr;        /* userspace address pointing to the PDH key */
> +                __u32 pdh_len;
> +
> +                __u64 session_uaddr;    /* userspace address which points to the guest session information */
> +                __u32 session_len;
> +        };
> +
> +On success, the 'handle' field contains a new handle and on error, a negative value.
> +
> +For more details, see SEV spec Section 6.12.
> +
>  References
>  ==========
>
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 92325d9527ce..e530c2b34b5e 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -1370,6 +1370,84 @@ static int sev_send_finish(struct kvm *kvm, struct kvm_sev_cmd *argp)
>         return ret;
>  }
>
> +static int sev_receive_start(struct kvm *kvm, struct kvm_sev_cmd *argp)
> +{
> +       struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
> +       struct sev_data_receive_start *start;
> +       struct kvm_sev_receive_start params;
> +       int *error = &argp->error;
> +       void *session_data;
> +       void *pdh_data;
> +       int ret;
> +
> +       if (!sev_guest(kvm))
> +               return -ENOTTY;
> +
> +       /* Get parameter from the userspace */
> +       if (copy_from_user(&params, (void __user *)(uintptr_t)argp->data,
> +                       sizeof(struct kvm_sev_receive_start)))
> +               return -EFAULT;
> +
> +       /* some sanity checks */
> +       if (!params.pdh_uaddr || !params.pdh_len ||
> +           !params.session_uaddr || !params.session_len)
> +               return -EINVAL;
> +
> +       pdh_data = psp_copy_user_blob(params.pdh_uaddr, params.pdh_len);
> +       if (IS_ERR(pdh_data))
> +               return PTR_ERR(pdh_data);
> +
> +       session_data = psp_copy_user_blob(params.session_uaddr,
> +                       params.session_len);
> +       if (IS_ERR(session_data)) {
> +               ret = PTR_ERR(session_data);
> +               goto e_free_pdh;
> +       }
> +
> +       ret = -ENOMEM;
> +       start = kzalloc(sizeof(*start), GFP_KERNEL);
> +       if (!start)
> +               goto e_free_session;
> +
> +       start->handle = params.handle;
> +       start->policy = params.policy;
> +       start->pdh_cert_address = __psp_pa(pdh_data);
> +       start->pdh_cert_len = params.pdh_len;
> +       start->session_address = __psp_pa(session_data);
> +       start->session_len = params.session_len;
> +
> +       /* create memory encryption context */
> +       ret = __sev_issue_cmd(argp->sev_fd, SEV_CMD_RECEIVE_START, start,
> +                               error);
> +       if (ret)
> +               goto e_free;
> +
> +       /* Bind ASID to this guest */
> +       ret = sev_bind_asid(kvm, start->handle, error);
> +       if (ret)
> +               goto e_free;
> +
> +       params.handle = start->handle;
> +       if (copy_to_user((void __user *)(uintptr_t)argp->data,
> +                        &params, sizeof(struct kvm_sev_receive_start))) {
> +               ret = -EFAULT;
> +               sev_unbind_asid(kvm, start->handle);
> +               goto e_free;
> +       }
> +
> +       sev->handle = start->handle;
> +       sev->fd = argp->sev_fd;
> +
> +e_free:
> +       kfree(start);
> +e_free_session:
> +       kfree(session_data);
> +e_free_pdh:
> +       kfree(pdh_data);
> +
> +       return ret;
> +}
> +
>  int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
>  {
>         struct kvm_sev_cmd sev_cmd;
> @@ -1432,6 +1510,9 @@ int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
>         case KVM_SEV_SEND_FINISH:
>                 r = sev_send_finish(kvm, &sev_cmd);
>                 break;
> +       case KVM_SEV_RECEIVE_START:
> +               r = sev_receive_start(kvm, &sev_cmd);
> +               break;
>         default:
>                 r = -EINVAL;
>                 goto out;
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index d45af34c31be..29c25e641a0c 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -1750,6 +1750,15 @@ struct kvm_sev_send_update_data {
>         __u32 trans_len;
>  };
>
> +struct kvm_sev_receive_start {
> +       __u32 handle;
> +       __u32 policy;
> +       __u64 pdh_uaddr;
> +       __u32 pdh_len;
> +       __u64 session_uaddr;
> +       __u32 session_len;
> +};
> +
>  #define KVM_DEV_ASSIGN_ENABLE_IOMMU    (1 << 0)
>  #define KVM_DEV_ASSIGN_PCI_2_3         (1 << 1)
>  #define KVM_DEV_ASSIGN_MASK_INTX       (1 << 2)
> --
> 2.17.1
>
Reviewed-by: Steve Rutherford <srutherford@google.com>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v12 05/13] KVM: SVM: Add KVM_SEV_RECEIVE_UPDATE_DATA command
  2021-04-12 19:44 ` [PATCH v12 05/13] KVM: SVM: Add KVM_SEV_RECEIVE_UPDATE_DATA command Ashish Kalra
@ 2021-04-13  0:04   ` Steve Rutherford
  0 siblings, 0 replies; 30+ messages in thread
From: Steve Rutherford @ 2021-04-13  0:04 UTC (permalink / raw)
  To: Ashish Kalra
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Joerg Roedel, Borislav Petkov, Tom Lendacky, X86 ML, KVM list,
	LKML, Sean Christopherson, Venu Busireddy, Brijesh Singh

On Mon, Apr 12, 2021 at 12:44 PM Ashish Kalra <Ashish.Kalra@amd.com> wrote:
>
> From: Brijesh Singh <brijesh.singh@amd.com>
>
> The command is used for copying the incoming buffer into the
> SEV guest memory space.
>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Joerg Roedel <joro@8bytes.org>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Tom Lendacky <thomas.lendacky@amd.com>
> Cc: x86@kernel.org
> Cc: kvm@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
> ---
>  .../virt/kvm/amd-memory-encryption.rst        | 24 ++++++
>  arch/x86/kvm/svm/sev.c                        | 79 +++++++++++++++++++
>  include/uapi/linux/kvm.h                      |  9 +++
>  3 files changed, 112 insertions(+)
>
> diff --git a/Documentation/virt/kvm/amd-memory-encryption.rst b/Documentation/virt/kvm/amd-memory-encryption.rst
> index c86c1ded8dd8..c6ed5b26d841 100644
> --- a/Documentation/virt/kvm/amd-memory-encryption.rst
> +++ b/Documentation/virt/kvm/amd-memory-encryption.rst
> @@ -372,6 +372,30 @@ On success, the 'handle' field contains a new handle and on error, a negative va
>
>  For more details, see SEV spec Section 6.12.
>
> +14. KVM_SEV_RECEIVE_UPDATE_DATA
> +-------------------------------
> +
> +The KVM_SEV_RECEIVE_UPDATE_DATA command can be used by the hypervisor to copy
> +the incoming buffers into the guest memory region with encryption context
> +created during the KVM_SEV_RECEIVE_START.
> +
> +Parameters (in): struct kvm_sev_receive_update_data
> +
> +Returns: 0 on success, -negative on error
> +
> +::
> +
> +        struct kvm_sev_receive_update_data {
> +                __u64 hdr_uaddr;        /* userspace address containing the packet header */
> +                __u32 hdr_len;
> +
> +                __u64 guest_uaddr;      /* the destination guest memory region */
> +                __u32 guest_len;
> +
> +                __u64 trans_uaddr;      /* the incoming buffer memory region  */
> +                __u32 trans_len;
> +        };
> +
>  References
>  ==========
>
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index e530c2b34b5e..2c95657cc9bf 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -1448,6 +1448,82 @@ static int sev_receive_start(struct kvm *kvm, struct kvm_sev_cmd *argp)
>         return ret;
>  }
>
> +static int sev_receive_update_data(struct kvm *kvm, struct kvm_sev_cmd *argp)
> +{
> +       struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
> +       struct kvm_sev_receive_update_data params;
> +       struct sev_data_receive_update_data *data;
> +       void *hdr = NULL, *trans = NULL;
> +       struct page **guest_page;
> +       unsigned long n;
> +       int ret, offset;
> +
> +       if (!sev_guest(kvm))
> +               return -EINVAL;
> +
> +       if (copy_from_user(&params, (void __user *)(uintptr_t)argp->data,
> +                       sizeof(struct kvm_sev_receive_update_data)))
> +               return -EFAULT;
> +
> +       if (!params.hdr_uaddr || !params.hdr_len ||
> +           !params.guest_uaddr || !params.guest_len ||
> +           !params.trans_uaddr || !params.trans_len)
> +               return -EINVAL;
> +
> +       /* Check if we are crossing the page boundary */
> +       offset = params.guest_uaddr & (PAGE_SIZE - 1);
> +       if ((params.guest_len + offset > PAGE_SIZE))
> +               return -EINVAL;
> +
> +       hdr = psp_copy_user_blob(params.hdr_uaddr, params.hdr_len);
> +       if (IS_ERR(hdr))
> +               return PTR_ERR(hdr);
> +
> +       trans = psp_copy_user_blob(params.trans_uaddr, params.trans_len);
> +       if (IS_ERR(trans)) {
> +               ret = PTR_ERR(trans);
> +               goto e_free_hdr;
> +       }
> +
> +       ret = -ENOMEM;
> +       data = kzalloc(sizeof(*data), GFP_KERNEL);
> +       if (!data)
> +               goto e_free_trans;
> +
> +       data->hdr_address = __psp_pa(hdr);
> +       data->hdr_len = params.hdr_len;
> +       data->trans_address = __psp_pa(trans);
> +       data->trans_len = params.trans_len;
> +
> +       /* Pin guest memory */
> +       ret = -EFAULT;
> +       guest_page = sev_pin_memory(kvm, params.guest_uaddr & PAGE_MASK,
> +                                   PAGE_SIZE, &n, 0);
> +       if (!guest_page)
> +               goto e_free;
> +
> +       /* The RECEIVE_UPDATE_DATA command requires C-bit to be always set. */
> +       data->guest_address = (page_to_pfn(guest_page[0]) << PAGE_SHIFT) +
> +                               offset;
> +       data->guest_address |= sev_me_mask;
> +       data->guest_len = params.guest_len;
> +       data->handle = sev->handle;
> +
> +       ret = sev_issue_cmd(kvm, SEV_CMD_RECEIVE_UPDATE_DATA, data,
> +                               &argp->error);
> +
> +       sev_unpin_memory(kvm, guest_page, n);
> +
> +e_free:
> +       kfree(data);
> +e_free_trans:
> +       kfree(trans);
> +e_free_hdr:
> +       kfree(hdr);
> +
> +       return ret;
> +}
> +
>  int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
>  {
>         struct kvm_sev_cmd sev_cmd;
> @@ -1513,6 +1589,9 @@ int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
>         case KVM_SEV_RECEIVE_START:
>                 r = sev_receive_start(kvm, &sev_cmd);
>                 break;
> +       case KVM_SEV_RECEIVE_UPDATE_DATA:
> +               r = sev_receive_update_data(kvm, &sev_cmd);
> +               break;
>         default:
>                 r = -EINVAL;
>                 goto out;
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 29c25e641a0c..3a656d43fc6c 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -1759,6 +1759,15 @@ struct kvm_sev_receive_start {
>         __u32 session_len;
>  };
>
> +struct kvm_sev_receive_update_data {
> +       __u64 hdr_uaddr;
> +       __u32 hdr_len;
> +       __u64 guest_uaddr;
> +       __u32 guest_len;
> +       __u64 trans_uaddr;
> +       __u32 trans_len;
> +};
> +
>  #define KVM_DEV_ASSIGN_ENABLE_IOMMU    (1 << 0)
>  #define KVM_DEV_ASSIGN_PCI_2_3         (1 << 1)
>  #define KVM_DEV_ASSIGN_MASK_INTX       (1 << 2)
> --
> 2.17.1
>
Reviewed-by: Steve Rutherford <srutherford@google.com>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v12 10/13] KVM: x86: Introduce new KVM_FEATURE_SEV_LIVE_MIGRATION feature & Custom MSR.
  2021-04-12 19:45 ` [PATCH v12 10/13] KVM: x86: Introduce new KVM_FEATURE_SEV_LIVE_MIGRATION feature & Custom MSR Ashish Kalra
@ 2021-04-13  0:04   ` Steve Rutherford
  0 siblings, 0 replies; 30+ messages in thread
From: Steve Rutherford @ 2021-04-13  0:04 UTC (permalink / raw)
  To: Ashish Kalra
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Joerg Roedel, Borislav Petkov, Tom Lendacky, X86 ML, KVM list,
	LKML, Sean Christopherson, Venu Busireddy, Brijesh Singh

On Mon, Apr 12, 2021 at 12:46 PM Ashish Kalra <Ashish.Kalra@amd.com> wrote:
>
> From: Ashish Kalra <ashish.kalra@amd.com>
>
> Add new KVM_FEATURE_SEV_LIVE_MIGRATION feature for guest to check
> for host-side support for SEV live migration. Also add a new custom
> MSR_KVM_SEV_LIVE_MIGRATION for guest to enable the SEV live migration
> feature.
>
> MSR is handled by userspace using MSR filters.
>
> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
> ---
>  Documentation/virt/kvm/cpuid.rst     |  5 +++++
>  Documentation/virt/kvm/msr.rst       | 12 ++++++++++++
>  arch/x86/include/uapi/asm/kvm_para.h |  4 ++++
>  arch/x86/kvm/cpuid.c                 |  3 ++-
>  4 files changed, 23 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/virt/kvm/cpuid.rst b/Documentation/virt/kvm/cpuid.rst
> index cf62162d4be2..0bdb6cdb12d3 100644
> --- a/Documentation/virt/kvm/cpuid.rst
> +++ b/Documentation/virt/kvm/cpuid.rst
> @@ -96,6 +96,11 @@ KVM_FEATURE_MSI_EXT_DEST_ID        15          guest checks this feature bit
>                                                 before using extended destination
>                                                 ID bits in MSI address bits 11-5.
>
> +KVM_FEATURE_SEV_LIVE_MIGRATION     16          guest checks this feature bit before
> +                                               using the page encryption state
> +                                               hypercall to notify the page state
> +                                               change
> +
>  KVM_FEATURE_CLOCKSOURCE_STABLE_BIT 24          host will warn if no guest-side
>                                                 per-cpu warps are expected in
>                                                 kvmclock
> diff --git a/Documentation/virt/kvm/msr.rst b/Documentation/virt/kvm/msr.rst
> index e37a14c323d2..020245d16087 100644
> --- a/Documentation/virt/kvm/msr.rst
> +++ b/Documentation/virt/kvm/msr.rst
> @@ -376,3 +376,15 @@ data:
>         write '1' to bit 0 of the MSR, this causes the host to re-scan its queue
>         and check if there are more notifications pending. The MSR is available
>         if KVM_FEATURE_ASYNC_PF_INT is present in CPUID.
> +
> +MSR_KVM_SEV_LIVE_MIGRATION:
> +        0x4b564d08
> +
> +       Control SEV Live Migration features.
> +
> +data:
> +        Bit 0 enables (1) or disables (0) host-side SEV Live Migration feature,
> +        in other words, this is guest->host communication that it's properly
> +        handling the shared pages list.
> +
> +        All other bits are reserved.
> diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
> index 950afebfba88..f6bfa138874f 100644
> --- a/arch/x86/include/uapi/asm/kvm_para.h
> +++ b/arch/x86/include/uapi/asm/kvm_para.h
> @@ -33,6 +33,7 @@
>  #define KVM_FEATURE_PV_SCHED_YIELD     13
>  #define KVM_FEATURE_ASYNC_PF_INT       14
>  #define KVM_FEATURE_MSI_EXT_DEST_ID    15
> +#define KVM_FEATURE_SEV_LIVE_MIGRATION 16
>
>  #define KVM_HINTS_REALTIME      0
>
> @@ -54,6 +55,7 @@
>  #define MSR_KVM_POLL_CONTROL   0x4b564d05
>  #define MSR_KVM_ASYNC_PF_INT   0x4b564d06
>  #define MSR_KVM_ASYNC_PF_ACK   0x4b564d07
> +#define MSR_KVM_SEV_LIVE_MIGRATION     0x4b564d08
>
>  struct kvm_steal_time {
>         __u64 steal;
> @@ -136,4 +138,6 @@ struct kvm_vcpu_pv_apf_data {
>  #define KVM_PV_EOI_ENABLED KVM_PV_EOI_MASK
>  #define KVM_PV_EOI_DISABLED 0x0
>
> +#define KVM_SEV_LIVE_MIGRATION_ENABLED BIT_ULL(0)
> +
>  #endif /* _UAPI_ASM_X86_KVM_PARA_H */
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 6bd2f8b830e4..4e2e69a692aa 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -812,7 +812,8 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
>                              (1 << KVM_FEATURE_PV_SEND_IPI) |
>                              (1 << KVM_FEATURE_POLL_CONTROL) |
>                              (1 << KVM_FEATURE_PV_SCHED_YIELD) |
> -                            (1 << KVM_FEATURE_ASYNC_PF_INT);
> +                            (1 << KVM_FEATURE_ASYNC_PF_INT) |
> +                            (1 << KVM_FEATURE_SEV_LIVE_MIGRATION);
>
>                 if (sched_info_on())
>                         entry->eax |= (1 << KVM_FEATURE_STEAL_TIME);
> --
> 2.17.1
>
Reviewed-by: Steve Rutherford <srutherford@google.com>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v12 06/13] KVM: SVM: Add KVM_SEV_RECEIVE_FINISH command
  2021-04-12 19:44 ` [PATCH v12 06/13] KVM: SVM: Add KVM_SEV_RECEIVE_FINISH command Ashish Kalra
@ 2021-04-13  0:04   ` Steve Rutherford
  0 siblings, 0 replies; 30+ messages in thread
From: Steve Rutherford @ 2021-04-13  0:04 UTC (permalink / raw)
  To: Ashish Kalra
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Joerg Roedel, Borislav Petkov, Tom Lendacky, X86 ML, KVM list,
	LKML, Sean Christopherson, Venu Busireddy, Brijesh Singh

On Mon, Apr 12, 2021 at 12:45 PM Ashish Kalra <Ashish.Kalra@amd.com> wrote:
>
> From: Brijesh Singh <brijesh.singh@amd.com>
>
> The command finalizes the guest receiving process and makes the SEV
> guest ready for execution.
>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Joerg Roedel <joro@8bytes.org>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Tom Lendacky <thomas.lendacky@amd.com>
> Cc: x86@kernel.org
> Cc: kvm@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Reviewed-by: Steve Rutherford <srutherford@google.com>
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
> ---
>  .../virt/kvm/amd-memory-encryption.rst        |  8 +++++++
>  arch/x86/kvm/svm/sev.c                        | 23 +++++++++++++++++++
>  2 files changed, 31 insertions(+)
>
> diff --git a/Documentation/virt/kvm/amd-memory-encryption.rst b/Documentation/virt/kvm/amd-memory-encryption.rst
> index c6ed5b26d841..0466c0febff9 100644
> --- a/Documentation/virt/kvm/amd-memory-encryption.rst
> +++ b/Documentation/virt/kvm/amd-memory-encryption.rst
> @@ -396,6 +396,14 @@ Returns: 0 on success, -negative on error
>                  __u32 trans_len;
>          };
>
> +15. KVM_SEV_RECEIVE_FINISH
> +--------------------------
> +
> +After completion of the migration flow, the KVM_SEV_RECEIVE_FINISH command can be
> +issued by the hypervisor to make the guest ready for execution.
> +
> +Returns: 0 on success, -negative on error
> +
>  References
>  ==========
>
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 2c95657cc9bf..c9795a22e502 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -1524,6 +1524,26 @@ static int sev_receive_update_data(struct kvm *kvm, struct kvm_sev_cmd *argp)
>         return ret;
>  }
>
> +static int sev_receive_finish(struct kvm *kvm, struct kvm_sev_cmd *argp)
> +{
> +       struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
> +       struct sev_data_receive_finish *data;
> +       int ret;
> +
> +       if (!sev_guest(kvm))
> +               return -ENOTTY;
> +
> +       data = kzalloc(sizeof(*data), GFP_KERNEL);
> +       if (!data)
> +               return -ENOMEM;
> +
> +       data->handle = sev->handle;
> +       ret = sev_issue_cmd(kvm, SEV_CMD_RECEIVE_FINISH, data, &argp->error);
> +
> +       kfree(data);
> +       return ret;
> +}
> +
>  int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
>  {
>         struct kvm_sev_cmd sev_cmd;
> @@ -1592,6 +1612,9 @@ int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
>         case KVM_SEV_RECEIVE_UPDATE_DATA:
>                 r = sev_receive_update_data(kvm, &sev_cmd);
>                 break;
> +       case KVM_SEV_RECEIVE_FINISH:
> +               r = sev_receive_finish(kvm, &sev_cmd);
> +               break;
>         default:
>                 r = -EINVAL;
>                 goto out;
> --
> 2.17.1
>
Reviewed-by: Steve Rutherford <srutherford@google.com>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v12 08/13] KVM: X86: Introduce KVM_HC_PAGE_ENC_STATUS hypercall
  2021-04-12 19:45 ` [PATCH v12 08/13] KVM: X86: Introduce KVM_HC_PAGE_ENC_STATUS hypercall Ashish Kalra
@ 2021-04-13  0:04   ` Steve Rutherford
  0 siblings, 0 replies; 30+ messages in thread
From: Steve Rutherford @ 2021-04-13  0:04 UTC (permalink / raw)
  To: Ashish Kalra
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Joerg Roedel, Borislav Petkov, Tom Lendacky, X86 ML, KVM list,
	LKML, Sean Christopherson, Venu Busireddy, Brijesh Singh

On Mon, Apr 12, 2021 at 12:45 PM Ashish Kalra <Ashish.Kalra@amd.com> wrote:
>
> From: Ashish Kalra <ashish.kalra@amd.com>
>
> This hypercall is used by the SEV guest to notify a change in the page
> encryption status to the hypervisor. The hypercall should be invoked
> only when the encryption attribute is changed from encrypted -> decrypted
> and vice versa. By default all guest pages are considered encrypted.
>
> The hypercall exits to userspace to manage the guest shared regions and
> integrate with the userspace VMM's migration code.
>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Joerg Roedel <joro@8bytes.org>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Tom Lendacky <thomas.lendacky@amd.com>
> Cc: x86@kernel.org
> Cc: kvm@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
> Co-developed-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  Documentation/virt/kvm/hypercalls.rst | 15 ++++++++++++++
>  arch/x86/include/asm/kvm_host.h       |  2 ++
>  arch/x86/kvm/svm/sev.c                |  1 +
>  arch/x86/kvm/x86.c                    | 29 +++++++++++++++++++++++++++
>  include/uapi/linux/kvm_para.h         |  1 +
>  5 files changed, 48 insertions(+)
>
> diff --git a/Documentation/virt/kvm/hypercalls.rst b/Documentation/virt/kvm/hypercalls.rst
> index ed4fddd364ea..7aff0cebab7c 100644
> --- a/Documentation/virt/kvm/hypercalls.rst
> +++ b/Documentation/virt/kvm/hypercalls.rst
> @@ -169,3 +169,18 @@ a0: destination APIC ID
>
>  :Usage example: When sending a call-function IPI-many to vCPUs, yield if
>                 any of the IPI target vCPUs was preempted.
> +
> +
> +8. KVM_HC_PAGE_ENC_STATUS
> +-------------------------
> +:Architecture: x86
> +:Status: active
> +:Purpose: Notify the encryption status changes in guest page table (SEV guest)
> +
> +a0: the guest physical address of the start page
> +a1: the number of pages
> +a2: encryption attribute
> +
> +   Where:
> +       * 1: Encryption attribute is set
> +       * 0: Encryption attribute is cleared
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 3768819693e5..42eb0fe3df5d 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1050,6 +1050,8 @@ struct kvm_arch {
>
>         bool bus_lock_detection_enabled;
>
> +       bool page_enc_hc_enable;
> +
>         /* Deflect RDMSR and WRMSR to user space when they trigger a #GP */
>         u32 user_space_msr_mask;
>         struct kvm_x86_msr_filter __rcu *msr_filter;
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index c9795a22e502..5184a0c0131a 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -197,6 +197,7 @@ static int sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp)
>         sev->active = true;
>         sev->asid = asid;
>         INIT_LIST_HEAD(&sev->regions_list);
> +       kvm->arch.page_enc_hc_enable = true;
>
>         return 0;
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index f7d12fca397b..e8986478b653 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -8208,6 +8208,13 @@ static void kvm_sched_yield(struct kvm *kvm, unsigned long dest_id)
>                 kvm_vcpu_yield_to(target);
>  }
>
> +static int complete_hypercall_exit(struct kvm_vcpu *vcpu)
> +{
> +       kvm_rax_write(vcpu, vcpu->run->hypercall.ret);
> +       ++vcpu->stat.hypercalls;
> +       return kvm_skip_emulated_instruction(vcpu);
> +}
> +
>  int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
>  {
>         unsigned long nr, a0, a1, a2, a3, ret;
> @@ -8273,6 +8280,28 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
>                 kvm_sched_yield(vcpu->kvm, a0);
>                 ret = 0;
>                 break;
> +       case KVM_HC_PAGE_ENC_STATUS: {
> +               u64 gpa = a0, npages = a1, enc = a2;
> +
> +               ret = -KVM_ENOSYS;
> +               if (!vcpu->kvm->arch.page_enc_hc_enable)
> +                       break;
> +
> +               if (!PAGE_ALIGNED(gpa) || !npages ||
> +                   gpa_to_gfn(gpa) + npages <= gpa_to_gfn(gpa)) {
> +                       ret = -EINVAL;
> +                       break;
> +               }
> +
> +               vcpu->run->exit_reason        = KVM_EXIT_HYPERCALL;
> +               vcpu->run->hypercall.nr       = KVM_HC_PAGE_ENC_STATUS;
> +               vcpu->run->hypercall.args[0]  = gpa;
> +               vcpu->run->hypercall.args[1]  = npages;
> +               vcpu->run->hypercall.args[2]  = enc;
> +               vcpu->run->hypercall.longmode = op_64_bit;
> +               vcpu->arch.complete_userspace_io = complete_hypercall_exit;
> +               return 0;
> +       }
>         default:
>                 ret = -KVM_ENOSYS;
>                 break;
> diff --git a/include/uapi/linux/kvm_para.h b/include/uapi/linux/kvm_para.h
> index 8b86609849b9..847b83b75dc8 100644
> --- a/include/uapi/linux/kvm_para.h
> +++ b/include/uapi/linux/kvm_para.h
> @@ -29,6 +29,7 @@
>  #define KVM_HC_CLOCK_PAIRING           9
>  #define KVM_HC_SEND_IPI                10
>  #define KVM_HC_SCHED_YIELD             11
> +#define KVM_HC_PAGE_ENC_STATUS         12
>
>  /*
>   * hypercalls use architecture specific
> --
> 2.17.1
>
Reviewed-by: Steve Rutherford <srutherford@google.com>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v12 11/13] EFI: Introduce the new AMD Memory Encryption GUID.
  2021-04-12 19:46 ` [PATCH v12 11/13] EFI: Introduce the new AMD Memory Encryption GUID Ashish Kalra
@ 2021-04-13  0:05   ` Steve Rutherford
  0 siblings, 0 replies; 30+ messages in thread
From: Steve Rutherford @ 2021-04-13  0:05 UTC (permalink / raw)
  To: Ashish Kalra
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Joerg Roedel, Borislav Petkov, Tom Lendacky, X86 ML, KVM list,
	LKML, Sean Christopherson, Venu Busireddy, Brijesh Singh

On Mon, Apr 12, 2021 at 12:46 PM Ashish Kalra <Ashish.Kalra@amd.com> wrote:
>
> From: Ashish Kalra <ashish.kalra@amd.com>
>
> Introduce a new AMD Memory Encryption GUID which is currently
> used for defining a new UEFI environment variable which indicates
> UEFI/OVMF support for the SEV live migration feature. This variable
> is setup when UEFI/OVMF detects host/hypervisor support for SEV
> live migration and later this variable is read by the kernel using
> EFI runtime services to verify if OVMF supports the live migration
> feature.
>
> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
> ---
>  include/linux/efi.h | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/include/linux/efi.h b/include/linux/efi.h
> index 6b5d36babfcc..6f364ace82cb 100644
> --- a/include/linux/efi.h
> +++ b/include/linux/efi.h
> @@ -362,6 +362,7 @@ void efi_native_runtime_setup(void);
>
>  /* OEM GUIDs */
>  #define DELLEMC_EFI_RCI2_TABLE_GUID            EFI_GUID(0x2d9f28a2, 0xa886, 0x456a,  0x97, 0xa8, 0xf1, 0x1e, 0xf2, 0x4f, 0xf4, 0x55)
> +#define MEM_ENCRYPT_GUID                       EFI_GUID(0x0cf29b71, 0x9e51, 0x433a,  0xa3, 0xb7, 0x81, 0xf3, 0xab, 0x16, 0xb8, 0x75)
>
>  typedef struct {
>         efi_guid_t guid;
> --
> 2.17.1
>

Reviewed-by: Steve Rutherford <srutherford@google.com>

* Re: [PATCH v12 09/13] mm: x86: Invoke hypercall when page encryption status is changed
  2021-04-12 19:45 ` [PATCH v12 09/13] mm: x86: Invoke hypercall when page encryption status is changed Ashish Kalra
@ 2021-04-13  0:06   ` Steve Rutherford
  0 siblings, 0 replies; 30+ messages in thread
From: Steve Rutherford @ 2021-04-13  0:06 UTC (permalink / raw)
  To: Ashish Kalra
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Joerg Roedel, Borislav Petkov, Tom Lendacky, X86 ML, KVM list,
	LKML, Sean Christopherson, Venu Busireddy, Brijesh Singh

On Mon, Apr 12, 2021 at 12:45 PM Ashish Kalra <Ashish.Kalra@amd.com> wrote:
>
> From: Brijesh Singh <brijesh.singh@amd.com>
>
> Invoke a hypercall when a memory region is changed from encrypted ->
> decrypted and vice versa. The hypervisor needs to know the page
> encryption status during guest migration.
>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Joerg Roedel <joro@8bytes.org>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Tom Lendacky <thomas.lendacky@amd.com>
> Cc: x86@kernel.org
> Cc: kvm@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
> ---
>  arch/x86/include/asm/paravirt.h       | 10 +++++
>  arch/x86/include/asm/paravirt_types.h |  2 +
>  arch/x86/kernel/paravirt.c            |  1 +
>  arch/x86/mm/mem_encrypt.c             | 57 ++++++++++++++++++++++++++-
>  arch/x86/mm/pat/set_memory.c          |  7 ++++
>  5 files changed, 76 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
> index 4abf110e2243..efaa3e628967 100644
> --- a/arch/x86/include/asm/paravirt.h
> +++ b/arch/x86/include/asm/paravirt.h
> @@ -84,6 +84,12 @@ static inline void paravirt_arch_exit_mmap(struct mm_struct *mm)
>         PVOP_VCALL1(mmu.exit_mmap, mm);
>  }
>
> +static inline void page_encryption_changed(unsigned long vaddr, int npages,
> +                                               bool enc)
> +{
> +       PVOP_VCALL3(mmu.page_encryption_changed, vaddr, npages, enc);
> +}
> +
>  #ifdef CONFIG_PARAVIRT_XXL
>  static inline void load_sp0(unsigned long sp0)
>  {
> @@ -799,6 +805,10 @@ static inline void paravirt_arch_dup_mmap(struct mm_struct *oldmm,
>  static inline void paravirt_arch_exit_mmap(struct mm_struct *mm)
>  {
>  }
> +
> +static inline void page_encryption_changed(unsigned long vaddr, int npages, bool enc)
> +{
> +}
>  #endif
>  #endif /* __ASSEMBLY__ */
>  #endif /* _ASM_X86_PARAVIRT_H */
> diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
> index de87087d3bde..69ef9c207b38 100644
> --- a/arch/x86/include/asm/paravirt_types.h
> +++ b/arch/x86/include/asm/paravirt_types.h
> @@ -195,6 +195,8 @@ struct pv_mmu_ops {
>
>         /* Hook for intercepting the destruction of an mm_struct. */
>         void (*exit_mmap)(struct mm_struct *mm);
> +       void (*page_encryption_changed)(unsigned long vaddr, int npages,
> +                                       bool enc);
>
>  #ifdef CONFIG_PARAVIRT_XXL
>         struct paravirt_callee_save read_cr2;
> diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
> index c60222ab8ab9..9f206e192f6b 100644
> --- a/arch/x86/kernel/paravirt.c
> +++ b/arch/x86/kernel/paravirt.c
> @@ -335,6 +335,7 @@ struct paravirt_patch_template pv_ops = {
>                         (void (*)(struct mmu_gather *, void *))tlb_remove_page,
>
>         .mmu.exit_mmap          = paravirt_nop,
> +       .mmu.page_encryption_changed    = paravirt_nop,
>
>  #ifdef CONFIG_PARAVIRT_XXL
>         .mmu.read_cr2           = __PV_IS_CALLEE_SAVE(native_read_cr2),
> diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
> index ae78cef79980..fae9ccbd0da7 100644
> --- a/arch/x86/mm/mem_encrypt.c
> +++ b/arch/x86/mm/mem_encrypt.c
> @@ -19,6 +19,7 @@
>  #include <linux/kernel.h>
>  #include <linux/bitops.h>
>  #include <linux/dma-mapping.h>
> +#include <linux/kvm_para.h>
>
>  #include <asm/tlbflush.h>
>  #include <asm/fixmap.h>
> @@ -29,6 +30,7 @@
>  #include <asm/processor-flags.h>
>  #include <asm/msr.h>
>  #include <asm/cmdline.h>
> +#include <asm/kvm_para.h>
>
>  #include "mm_internal.h"
>
> @@ -229,6 +231,47 @@ void __init sev_setup_arch(void)
>         swiotlb_adjust_size(size);
>  }
>
> +static void set_memory_enc_dec_hypercall(unsigned long vaddr, int npages,
> +                                       bool enc)
> +{
> +       unsigned long sz = npages << PAGE_SHIFT;
> +       unsigned long vaddr_end, vaddr_next;
> +
> +       vaddr_end = vaddr + sz;
> +
> +       for (; vaddr < vaddr_end; vaddr = vaddr_next) {
> +               int psize, pmask, level;
> +               unsigned long pfn;
> +               pte_t *kpte;
> +
> +               kpte = lookup_address(vaddr, &level);
> +               if (!kpte || pte_none(*kpte))
> +                       return;
> +
> +               switch (level) {
> +               case PG_LEVEL_4K:
> +                       pfn = pte_pfn(*kpte);
> +                       break;
> +               case PG_LEVEL_2M:
> +                       pfn = pmd_pfn(*(pmd_t *)kpte);
> +                       break;
> +               case PG_LEVEL_1G:
> +                       pfn = pud_pfn(*(pud_t *)kpte);
> +                       break;
> +               default:
> +                       return;
> +               }
> +
> +               psize = page_level_size(level);
> +               pmask = page_level_mask(level);
> +
> +               kvm_sev_hypercall3(KVM_HC_PAGE_ENC_STATUS,
> +                                  pfn << PAGE_SHIFT, psize >> PAGE_SHIFT, enc);
> +
> +               vaddr_next = (vaddr & pmask) + psize;
> +       }
> +}
> +
>  static void __init __set_clr_pte_enc(pte_t *kpte, int level, bool enc)
>  {
>         pgprot_t old_prot, new_prot;
> @@ -286,12 +329,13 @@ static void __init __set_clr_pte_enc(pte_t *kpte, int level, bool enc)
>  static int __init early_set_memory_enc_dec(unsigned long vaddr,
>                                            unsigned long size, bool enc)
>  {
> -       unsigned long vaddr_end, vaddr_next;
> +       unsigned long vaddr_end, vaddr_next, start;
>         unsigned long psize, pmask;
>         int split_page_size_mask;
>         int level, ret;
>         pte_t *kpte;
>
> +       start = vaddr;
>         vaddr_next = vaddr;
>         vaddr_end = vaddr + size;
>
> @@ -346,6 +390,8 @@ static int __init early_set_memory_enc_dec(unsigned long vaddr,
>
>         ret = 0;
>
> +       set_memory_enc_dec_hypercall(start, PAGE_ALIGN(size) >> PAGE_SHIFT,
> +                                       enc);
>  out:
>         __flush_tlb_all();
>         return ret;
> @@ -481,6 +527,15 @@ void __init mem_encrypt_init(void)
>         if (sev_active() && !sev_es_active())
>                 static_branch_enable(&sev_enable_key);
>
> +#ifdef CONFIG_PARAVIRT
> +       /*
> +        * With SEV, we need to make a hypercall when page encryption state is
> +        * changed.
> +        */
> +       if (sev_active())
> +               pv_ops.mmu.page_encryption_changed = set_memory_enc_dec_hypercall;
> +#endif
> +
>         print_mem_encrypt_feature_info();
>  }
>
> diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
> index 16f878c26667..3576b583ac65 100644
> --- a/arch/x86/mm/pat/set_memory.c
> +++ b/arch/x86/mm/pat/set_memory.c
> @@ -27,6 +27,7 @@
>  #include <asm/proto.h>
>  #include <asm/memtype.h>
>  #include <asm/set_memory.h>
> +#include <asm/paravirt.h>
>
>  #include "../mm_internal.h"
>
> @@ -2012,6 +2013,12 @@ static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc)
>          */
>         cpa_flush(&cpa, 0);
>
> +       /* Notify hypervisor that a given memory range is mapped encrypted
> +        * or decrypted. The hypervisor will use this information during the
> +        * VM migration.
> +        */
> +       page_encryption_changed(addr, numpages, enc);
> +
>         return ret;
>  }
>
> --
> 2.17.1
>
Reviewed-by: Steve Rutherford <srutherford@google.com>
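
As a usage sketch: with this hook wired up, any kernel path that flips
the C-bit through set_memory_decrypted()/set_memory_encrypted() notifies
the hypervisor implicitly. An illustrative example (not from the series):

    #include <linux/gfp.h>
    #include <linux/mm.h>
    #include <linux/set_memory.h>

    /* Share one page with the host. The page_encryption_changed()
     * paravirt hook fires from inside set_memory_decrypted(), so the
     * caller does not issue any hypercall itself.
     */
    static void *alloc_shared_page(void)
    {
            struct page *page = alloc_page(GFP_KERNEL);
            unsigned long vaddr;

            if (!page)
                    return NULL;

            vaddr = (unsigned long)page_address(page);
            if (set_memory_decrypted(vaddr, 1)) {
                    __free_page(page);
                    return NULL;
            }
            return (void *)vaddr;   /* now tracked as shared */
    }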

* Re: [PATCH v12 13/13] x86/kvm: Add kexec support for SEV Live Migration.
  2021-04-12 19:47 ` [PATCH v12 13/13] x86/kvm: Add kexec support for SEV Live Migration Ashish Kalra
@ 2021-04-13  0:22   ` Steve Rutherford
  2021-04-13  1:23     ` Steve Rutherford
  0 siblings, 1 reply; 30+ messages in thread
From: Steve Rutherford @ 2021-04-13  0:22 UTC (permalink / raw)
  To: Ashish Kalra
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Joerg Roedel, Borislav Petkov, Tom Lendacky, X86 ML, KVM list,
	LKML, Sean Christopherson, Venu Busireddy, Brijesh Singh, kexec

On Mon, Apr 12, 2021 at 12:48 PM Ashish Kalra <Ashish.Kalra@amd.com> wrote:
>
> From: Ashish Kalra <ashish.kalra@amd.com>
>
> Reset the host's shared pages list related to kernel
> specific page encryption status settings before we load a
> new kernel by kexec. We cannot reset the complete
> shared pages list here as we need to retain the
> UEFI/OVMF firmware specific settings.
>
> The host's shared pages list is maintained for the
> guest to keep track of all unencrypted guest memory regions,
> therefore we need to explicitly mark all shared pages as
> encrypted again before rebooting into the new guest kernel.
>
> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
> ---
>  arch/x86/kernel/kvm.c | 24 ++++++++++++++++++++++++
>  1 file changed, 24 insertions(+)
>
> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> index bcc82e0c9779..4ad3ed547ff1 100644
> --- a/arch/x86/kernel/kvm.c
> +++ b/arch/x86/kernel/kvm.c
> @@ -39,6 +39,7 @@
>  #include <asm/cpuidle_haltpoll.h>
>  #include <asm/ptrace.h>
>  #include <asm/svm.h>
> +#include <asm/e820/api.h>
>
>  DEFINE_STATIC_KEY_FALSE(kvm_async_pf_enabled);
>
> @@ -384,6 +385,29 @@ static void kvm_pv_guest_cpu_reboot(void *unused)
>          */
>         if (kvm_para_has_feature(KVM_FEATURE_PV_EOI))
>                 wrmsrl(MSR_KVM_PV_EOI_EN, 0);
> +       /*
> +        * Reset the host's shared pages list related to kernel
> +        * specific page encryption status settings before we load a
> +        * new kernel by kexec. NOTE: We cannot reset the complete
> +        * shared pages list here as we need to retain the
> +        * UEFI/OVMF firmware specific settings.
> +        */
> +       if (sev_live_migration_enabled && (smp_processor_id() == 0)) {
What happens if the reboot of CPU0 races with another CPU servicing a
device request (while the reboot is pending for that CPU)?
Seems like you could run into a scenario where you have hypercalls racing.

Calling this on every core isn't free, but it is an easy way to avoid this race.
You could also count cores, and have only the last core do the job, but
that seems more complicated.

Steve
> +               int i;
> +               unsigned long nr_pages;
> +
> +               for (i = 0; i < e820_table->nr_entries; i++) {
> +                       struct e820_entry *entry = &e820_table->entries[i];
> +
> +                       if (entry->type != E820_TYPE_RAM)
> +                               continue;
> +
> +                       nr_pages = DIV_ROUND_UP(entry->size, PAGE_SIZE);
> +
> +                       kvm_sev_hypercall3(KVM_HC_PAGE_ENC_STATUS,
> +                                          entry->addr, nr_pages, 1);
> +               }
> +       }
>         kvm_pv_disable_apf();
>         kvm_disable_steal_time();
>  }
> --
> 2.17.1
>

* Re: [PATCH v12 12/13] x86/kvm: Add guest support for detecting and enabling SEV Live Migration feature.
  2021-04-12 19:46 ` [PATCH v12 12/13] x86/kvm: Add guest support for detecting and enabling SEV Live Migration feature Ashish Kalra
@ 2021-04-13  0:25   ` Steve Rutherford
  2021-04-13  1:01     ` Ashish Kalra
  0 siblings, 1 reply; 30+ messages in thread
From: Steve Rutherford @ 2021-04-13  0:25 UTC (permalink / raw)
  To: Ashish Kalra
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Joerg Roedel, Borislav Petkov, Tom Lendacky, X86 ML, KVM list,
	LKML, Sean Christopherson, Venu Busireddy, Brijesh Singh

On Mon, Apr 12, 2021 at 12:46 PM Ashish Kalra <Ashish.Kalra@amd.com> wrote:
>
> From: Ashish Kalra <ashish.kalra@amd.com>
>
> The guest support for detecting and enabling the SEV live migration
> feature uses the following logic:
>
>  - kvm_init_platform() invokes check_kvm_sev_migration() which
>    checks if it booted under EFI
>
>    - If not EFI,
>
>      i) check for the KVM_FEATURE_CPUID
>
>      ii) if CPUID reports that migration is supported, issue a wrmsrl()
>          to enable the SEV live migration support
>
>    - If EFI,
>
>      i) check for the KVM_FEATURE_CPUID
>
>      ii) If CPUID reports that migration is supported, read the UEFI variable which
>          indicates OVMF support for live migration
>
>      iii) if the variable indicates live migration is supported, issue a wrmsrl() to
>           enable the SEV live migration support
>
> The EFI live migration check is done using a late_initcall() callback.
>
> Also, ensure that the _bss_decrypted section is marked as decrypted in the
> shared pages list.
>
> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
> ---
>  arch/x86/include/asm/mem_encrypt.h |  8 +++++
>  arch/x86/kernel/kvm.c              | 52 ++++++++++++++++++++++++++++++
>  arch/x86/mm/mem_encrypt.c          | 41 +++++++++++++++++++++++
>  3 files changed, 101 insertions(+)
>
> diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h
> index 31c4df123aa0..19b77f3a62dc 100644
> --- a/arch/x86/include/asm/mem_encrypt.h
> +++ b/arch/x86/include/asm/mem_encrypt.h
> @@ -21,6 +21,7 @@
>  extern u64 sme_me_mask;
>  extern u64 sev_status;
>  extern bool sev_enabled;
> +extern bool sev_live_migration_enabled;
>
>  void sme_encrypt_execute(unsigned long encrypted_kernel_vaddr,
>                          unsigned long decrypted_kernel_vaddr,
> @@ -44,8 +45,11 @@ void __init sme_enable(struct boot_params *bp);
>
>  int __init early_set_memory_decrypted(unsigned long vaddr, unsigned long size);
>  int __init early_set_memory_encrypted(unsigned long vaddr, unsigned long size);
> +void __init early_set_mem_enc_dec_hypercall(unsigned long vaddr, int npages,
> +                                           bool enc);
>
>  void __init mem_encrypt_free_decrypted_mem(void);
> +void __init check_kvm_sev_migration(void);
>
>  /* Architecture __weak replacement functions */
>  void __init mem_encrypt_init(void);
> @@ -60,6 +64,7 @@ bool sev_es_active(void);
>  #else  /* !CONFIG_AMD_MEM_ENCRYPT */
>
>  #define sme_me_mask    0ULL
> +#define sev_live_migration_enabled     false
>
>  static inline void __init sme_early_encrypt(resource_size_t paddr,
>                                             unsigned long size) { }
> @@ -84,8 +89,11 @@ static inline int __init
>  early_set_memory_decrypted(unsigned long vaddr, unsigned long size) { return 0; }
>  static inline int __init
>  early_set_memory_encrypted(unsigned long vaddr, unsigned long size) { return 0; }
> +static inline void __init
> +early_set_mem_enc_dec_hypercall(unsigned long vaddr, int npages, bool enc) {}
>
>  static inline void mem_encrypt_free_decrypted_mem(void) { }
> +static inline void check_kvm_sev_migration(void) { }
>
>  #define __bss_decrypted
>
> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> index 78bb0fae3982..bcc82e0c9779 100644
> --- a/arch/x86/kernel/kvm.c
> +++ b/arch/x86/kernel/kvm.c
> @@ -26,6 +26,7 @@
>  #include <linux/kprobes.h>
>  #include <linux/nmi.h>
>  #include <linux/swait.h>
> +#include <linux/efi.h>
>  #include <asm/timer.h>
>  #include <asm/cpu.h>
>  #include <asm/traps.h>
> @@ -429,6 +430,56 @@ static inline void __set_percpu_decrypted(void *ptr, unsigned long size)
>         early_set_memory_decrypted((unsigned long) ptr, size);
>  }
>
> +static int __init setup_kvm_sev_migration(void)
> +{
> +       efi_char16_t efi_sev_live_migration_enabled[] = L"SevLiveMigrationEnabled";
> +       efi_guid_t efi_variable_guid = MEM_ENCRYPT_GUID;
> +       efi_status_t status;
> +       unsigned long size;
> +       bool enabled;
> +
> +       /*
> +        * check_kvm_sev_migration() invoked via kvm_init_platform() before
> +        * this callback would have set up the indicator that live migration
> +        * feature is supported/enabled.
> +        */
> +       if (!sev_live_migration_enabled)
> +               return 0;
> +
> +       if (!efi_enabled(EFI_RUNTIME_SERVICES)) {
> +               pr_info("%s : EFI runtime services are not enabled\n", __func__);
> +               return 0;
> +       }
> +
> +       size = sizeof(enabled);
> +
> +       /* Get variable contents into buffer */
> +       status = efi.get_variable(efi_sev_live_migration_enabled,
> +                                 &efi_variable_guid, NULL, &size, &enabled);
> +
> +       if (status == EFI_NOT_FOUND) {
> +               pr_info("%s : EFI live migration variable not found\n", __func__);
> +               return 0;
> +       }
> +
> +       if (status != EFI_SUCCESS) {
> +               pr_info("%s : EFI variable retrieval failed\n", __func__);
> +               return 0;
> +       }
> +
> +       if (enabled == 0) {
> +               pr_info("%s: live migration disabled in EFI\n", __func__);
> +               return 0;
> +       }
> +
> +       pr_info("%s : live migration enabled in EFI\n", __func__);
> +       wrmsrl(MSR_KVM_SEV_LIVE_MIGRATION, KVM_SEV_LIVE_MIGRATION_ENABLED);
> +
> +       return 0;
> +}
> +
> +late_initcall(setup_kvm_sev_migration);
> +
>  /*
>   * Iterate through all possible CPUs and map the memory region pointed
>   * by apf_reason, steal_time and kvm_apic_eoi as decrypted at once.
> @@ -747,6 +798,7 @@ static bool __init kvm_msi_ext_dest_id(void)
>
>  static void __init kvm_init_platform(void)
>  {
> +       check_kvm_sev_migration();
>         kvmclock_init();
>         x86_platform.apic_post_init = kvm_apic_init;
>  }
> diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
> index fae9ccbd0da7..4de417333c09 100644
> --- a/arch/x86/mm/mem_encrypt.c
> +++ b/arch/x86/mm/mem_encrypt.c
> @@ -20,6 +20,7 @@
>  #include <linux/bitops.h>
>  #include <linux/dma-mapping.h>
>  #include <linux/kvm_para.h>
> +#include <linux/efi.h>
>
>  #include <asm/tlbflush.h>
>  #include <asm/fixmap.h>
> @@ -48,6 +49,8 @@ EXPORT_SYMBOL_GPL(sev_enable_key);
>
>  bool sev_enabled __section(".data");
>
> +bool sev_live_migration_enabled __section(".data");
> +
>  /* Buffer used for early in-place encryption by BSP, no locking needed */
>  static char sme_early_buffer[PAGE_SIZE] __initdata __aligned(PAGE_SIZE);
>
> @@ -237,6 +240,9 @@ static void set_memory_enc_dec_hypercall(unsigned long vaddr, int npages,
>         unsigned long sz = npages << PAGE_SHIFT;
>         unsigned long vaddr_end, vaddr_next;
>
> +       if (!sev_live_migration_enabled)
> +               return;
> +
>         vaddr_end = vaddr + sz;
>
>         for (; vaddr < vaddr_end; vaddr = vaddr_next) {
> @@ -407,6 +413,12 @@ int __init early_set_memory_encrypted(unsigned long vaddr, unsigned long size)
>         return early_set_memory_enc_dec(vaddr, size, true);
>  }
>
> +void __init early_set_mem_enc_dec_hypercall(unsigned long vaddr, int npages,
> +                                       bool enc)
> +{
> +       set_memory_enc_dec_hypercall(vaddr, npages, enc);
> +}
> +
>  /*
>   * SME and SEV are very similar but they are not the same, so there are
>   * times that the kernel will need to distinguish between SME and SEV. The
> @@ -462,6 +474,35 @@ bool force_dma_unencrypted(struct device *dev)
>         return false;
>  }
>
> +void __init check_kvm_sev_migration(void)
> +{
> +       if (sev_active() &&
> +           kvm_para_has_feature(KVM_FEATURE_SEV_LIVE_MIGRATION)) {
> +               unsigned long nr_pages;
> +
> +               pr_info("KVM enable live migration\n");
> +               sev_live_migration_enabled = true;
> +
> +               /*
> +                * Ensure that _bss_decrypted section is marked as decrypted in the
> +                * shared pages list.
> +                */
> +               nr_pages = DIV_ROUND_UP(__end_bss_decrypted - __start_bss_decrypted,
> +                                       PAGE_SIZE);
> +               early_set_mem_enc_dec_hypercall((unsigned long)__start_bss_decrypted,
> +                                               nr_pages, 0);
> +
> +               /*
> +                * If not booted using EFI, enable Live migration support.
> +                */
> +               if (!efi_enabled(EFI_BOOT))
> +                       wrmsrl(MSR_KVM_SEV_LIVE_MIGRATION,
> +                              KVM_SEV_LIVE_MIGRATION_ENABLED);
> +       } else {
> +               pr_info("KVM enable live migration feature unsupported\n");
I might be misunderstanding this, but I'm not sure this log message is
correct: isn't the intention that the late initcall will be the one to
check if this should be enabled later in this case?

I have a similar question above about the log message after
"!efi_enabled(EFI_RUNTIME_SERVICES)": shouldn't that avoid logging if
!efi_enabled(EFI_BOOT) (since the wrmsrl call has already been made
here?)
> +       }
> +}
> +
>  void __init mem_encrypt_free_decrypted_mem(void)
>  {
>         unsigned long vaddr, vaddr_end, npages;
> --
> 2.17.1
>

Other than these:
Reviewed-by: Steve Rutherford <srutherford@google.com>

* Re: [PATCH v12 12/13] x86/kvm: Add guest support for detecting and enabling SEV Live Migration feature.
  2021-04-13  0:25   ` Steve Rutherford
@ 2021-04-13  1:01     ` Ashish Kalra
  0 siblings, 0 replies; 30+ messages in thread
From: Ashish Kalra @ 2021-04-13  1:01 UTC (permalink / raw)
  To: Steve Rutherford
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Joerg Roedel, Borislav Petkov, Tom Lendacky, X86 ML, KVM list,
	LKML, Sean Christopherson, Venu Busireddy, Brijesh Singh

On Mon, Apr 12, 2021 at 05:25:15PM -0700, Steve Rutherford wrote:
> On Mon, Apr 12, 2021 at 12:46 PM Ashish Kalra <Ashish.Kalra@amd.com> wrote:
> >
> > From: Ashish Kalra <ashish.kalra@amd.com>
> >
> > The guest support for detecting and enabling the SEV live migration
> > feature uses the following logic:
> >
> >  - kvm_init_platform() invokes check_kvm_sev_migration() which
> >    checks if it booted under EFI
> >
> >    - If not EFI,
> >
> >      i) check for the KVM_FEATURE_CPUID
> >
> >      ii) if CPUID reports that migration is supported, issue a wrmsrl()
> >          to enable the SEV live migration support
> >
> >    - If EFI,
> >
> >      i) check for the KVM_FEATURE_CPUID
> >
> >      ii) If CPUID reports that migration is supported, read the UEFI variable which
> >          indicates OVMF support for live migration
> >
> >      iii) if the variable indicates live migration is supported, issue a wrmsrl() to
> >           enable the SEV live migration support
> >
> > The EFI live migration check is done using a late_initcall() callback.
> >
> > Also, ensure that the _bss_decrypted section is marked as decrypted in the
> > shared pages list.
> >
> > Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
> > ---
> >  arch/x86/include/asm/mem_encrypt.h |  8 +++++
> >  arch/x86/kernel/kvm.c              | 52 ++++++++++++++++++++++++++++++
> >  arch/x86/mm/mem_encrypt.c          | 41 +++++++++++++++++++++++
> >  3 files changed, 101 insertions(+)
> >
> > diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h
> > index 31c4df123aa0..19b77f3a62dc 100644
> > --- a/arch/x86/include/asm/mem_encrypt.h
> > +++ b/arch/x86/include/asm/mem_encrypt.h
> > @@ -21,6 +21,7 @@
> >  extern u64 sme_me_mask;
> >  extern u64 sev_status;
> >  extern bool sev_enabled;
> > +extern bool sev_live_migration_enabled;
> >
> >  void sme_encrypt_execute(unsigned long encrypted_kernel_vaddr,
> >                          unsigned long decrypted_kernel_vaddr,
> > @@ -44,8 +45,11 @@ void __init sme_enable(struct boot_params *bp);
> >
> >  int __init early_set_memory_decrypted(unsigned long vaddr, unsigned long size);
> >  int __init early_set_memory_encrypted(unsigned long vaddr, unsigned long size);
> > +void __init early_set_mem_enc_dec_hypercall(unsigned long vaddr, int npages,
> > +                                           bool enc);
> >
> >  void __init mem_encrypt_free_decrypted_mem(void);
> > +void __init check_kvm_sev_migration(void);
> >
> >  /* Architecture __weak replacement functions */
> >  void __init mem_encrypt_init(void);
> > @@ -60,6 +64,7 @@ bool sev_es_active(void);
> >  #else  /* !CONFIG_AMD_MEM_ENCRYPT */
> >
> >  #define sme_me_mask    0ULL
> > +#define sev_live_migration_enabled     false
> >
> >  static inline void __init sme_early_encrypt(resource_size_t paddr,
> >                                             unsigned long size) { }
> > @@ -84,8 +89,11 @@ static inline int __init
> >  early_set_memory_decrypted(unsigned long vaddr, unsigned long size) { return 0; }
> >  static inline int __init
> >  early_set_memory_encrypted(unsigned long vaddr, unsigned long size) { return 0; }
> > +static inline void __init
> > +early_set_mem_enc_dec_hypercall(unsigned long vaddr, int npages, bool enc) {}
> >
> >  static inline void mem_encrypt_free_decrypted_mem(void) { }
> > +static inline void check_kvm_sev_migration(void) { }
> >
> >  #define __bss_decrypted
> >
> > diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> > index 78bb0fae3982..bcc82e0c9779 100644
> > --- a/arch/x86/kernel/kvm.c
> > +++ b/arch/x86/kernel/kvm.c
> > @@ -26,6 +26,7 @@
> >  #include <linux/kprobes.h>
> >  #include <linux/nmi.h>
> >  #include <linux/swait.h>
> > +#include <linux/efi.h>
> >  #include <asm/timer.h>
> >  #include <asm/cpu.h>
> >  #include <asm/traps.h>
> > @@ -429,6 +430,56 @@ static inline void __set_percpu_decrypted(void *ptr, unsigned long size)
> >         early_set_memory_decrypted((unsigned long) ptr, size);
> >  }
> >
> > +static int __init setup_kvm_sev_migration(void)
> > +{
> > +       efi_char16_t efi_sev_live_migration_enabled[] = L"SevLiveMigrationEnabled";
> > +       efi_guid_t efi_variable_guid = MEM_ENCRYPT_GUID;
> > +       efi_status_t status;
> > +       unsigned long size;
> > +       bool enabled;
> > +
> > +       /*
> > +        * check_kvm_sev_migration() invoked via kvm_init_platform() before
> > +        * this callback would have set up the indicator that live migration
> > +        * feature is supported/enabled.
> > +        */
> > +       if (!sev_live_migration_enabled)
> > +               return 0;
> > +
> > +       if (!efi_enabled(EFI_RUNTIME_SERVICES)) {
> > +               pr_info("%s : EFI runtime services are not enabled\n", __func__);
> > +               return 0;
> > +       }
> > +
> > +       size = sizeof(enabled);
> > +
> > +       /* Get variable contents into buffer */
> > +       status = efi.get_variable(efi_sev_live_migration_enabled,
> > +                                 &efi_variable_guid, NULL, &size, &enabled);
> > +
> > +       if (status == EFI_NOT_FOUND) {
> > +               pr_info("%s : EFI live migration variable not found\n", __func__);
> > +               return 0;
> > +       }
> > +
> > +       if (status != EFI_SUCCESS) {
> > +               pr_info("%s : EFI variable retrieval failed\n", __func__);
> > +               return 0;
> > +       }
> > +
> > +       if (enabled == 0) {
> > +               pr_info("%s: live migration disabled in EFI\n", __func__);
> > +               return 0;
> > +       }
> > +
> > +       pr_info("%s : live migration enabled in EFI\n", __func__);
> > +       wrmsrl(MSR_KVM_SEV_LIVE_MIGRATION, KVM_SEV_LIVE_MIGRATION_ENABLED);
> > +
> > +       return 0;
> > +}
> > +
> > +late_initcall(setup_kvm_sev_migration);
> > +
> >  /*
> >   * Iterate through all possible CPUs and map the memory region pointed
> >   * by apf_reason, steal_time and kvm_apic_eoi as decrypted at once.
> > @@ -747,6 +798,7 @@ static bool __init kvm_msi_ext_dest_id(void)
> >
> >  static void __init kvm_init_platform(void)
> >  {
> > +       check_kvm_sev_migration();
> >         kvmclock_init();
> >         x86_platform.apic_post_init = kvm_apic_init;
> >  }
> > diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
> > index fae9ccbd0da7..4de417333c09 100644
> > --- a/arch/x86/mm/mem_encrypt.c
> > +++ b/arch/x86/mm/mem_encrypt.c
> > @@ -20,6 +20,7 @@
> >  #include <linux/bitops.h>
> >  #include <linux/dma-mapping.h>
> >  #include <linux/kvm_para.h>
> > +#include <linux/efi.h>
> >
> >  #include <asm/tlbflush.h>
> >  #include <asm/fixmap.h>
> > @@ -48,6 +49,8 @@ EXPORT_SYMBOL_GPL(sev_enable_key);
> >
> >  bool sev_enabled __section(".data");
> >
> > +bool sev_live_migration_enabled __section(".data");
> > +
> >  /* Buffer used for early in-place encryption by BSP, no locking needed */
> >  static char sme_early_buffer[PAGE_SIZE] __initdata __aligned(PAGE_SIZE);
> >
> > @@ -237,6 +240,9 @@ static void set_memory_enc_dec_hypercall(unsigned long vaddr, int npages,
> >         unsigned long sz = npages << PAGE_SHIFT;
> >         unsigned long vaddr_end, vaddr_next;
> >
> > +       if (!sev_live_migration_enabled)
> > +               return;
> > +
> >         vaddr_end = vaddr + sz;
> >
> >         for (; vaddr < vaddr_end; vaddr = vaddr_next) {
> > @@ -407,6 +413,12 @@ int __init early_set_memory_encrypted(unsigned long vaddr, unsigned long size)
> >         return early_set_memory_enc_dec(vaddr, size, true);
> >  }
> >
> > +void __init early_set_mem_enc_dec_hypercall(unsigned long vaddr, int npages,
> > +                                       bool enc)
> > +{
> > +       set_memory_enc_dec_hypercall(vaddr, npages, enc);
> > +}
> > +
> >  /*
> >   * SME and SEV are very similar but they are not the same, so there are
> >   * times that the kernel will need to distinguish between SME and SEV. The
> > @@ -462,6 +474,35 @@ bool force_dma_unencrypted(struct device *dev)
> >         return false;
> >  }
> >
> > +void __init check_kvm_sev_migration(void)
> > +{
> > +       if (sev_active() &&
> > +           kvm_para_has_feature(KVM_FEATURE_SEV_LIVE_MIGRATION)) {
> > +               unsigned long nr_pages;
> > +
> > +               pr_info("KVM enable live migration\n");
> > +               sev_live_migration_enabled = true;
> > +
> > +               /*
> > +                * Ensure that _bss_decrypted section is marked as decrypted in the
> > +                * shared pages list.
> > +                */
> > +               nr_pages = DIV_ROUND_UP(__end_bss_decrypted - __start_bss_decrypted,
> > +                                       PAGE_SIZE);
> > +               early_set_mem_enc_dec_hypercall((unsigned long)__start_bss_decrypted,
> > +                                               nr_pages, 0);
> > +
> > +               /*
> > +                * If not booted using EFI, enable Live migration support.
> > +                */
> > +               if (!efi_enabled(EFI_BOOT))
> > +                       wrmsrl(MSR_KVM_SEV_LIVE_MIGRATION,
> > +                              KVM_SEV_LIVE_MIGRATION_ENABLED);
> > +       } else {
> > +               pr_info("KVM enable live migration feature unsupported\n");
> I might be misunderstanding this, but I'm not sure this log message is
> correct: isn't the intention that the late initcall will be the one to
> check if this should be enabled later in this case?
> 

Yes, you are right, this log message should be removed.

> I have a similar question above about the log message after
> "!efi_enabled(EFI_RUNTIME_SERVICES)": shouldn't that avoid logging if
> !efi_enabled(EFI_BOOT) (since the wrmsrl call has already been made
> here?)

Yes, probably that check can be additionally added.

Thanks,
Ashish

> > +       }
> > +}
> > +
> >  void __init mem_encrypt_free_decrypted_mem(void)
> >  {
> >         unsigned long vaddr, vaddr_end, npages;
> > --
> > 2.17.1
> >
> 
> Other than these:
> Reviewed-by: Steve Rutherford <srutherford@google.com>
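
Folding both points together, the late initcall might end up looking
roughly like this (a sketch, not the posted patch):

    static int __init setup_kvm_sev_migration(void)
    {
            efi_char16_t efi_sev_live_migration_enabled[] =
                    L"SevLiveMigrationEnabled";
            efi_guid_t efi_variable_guid = MEM_ENCRYPT_GUID;
            efi_status_t status;
            unsigned long size = sizeof(bool);
            bool enabled;

            if (!sev_live_migration_enabled)
                    return 0;

            /* On non-EFI boots check_kvm_sev_migration() has already
             * written the MSR; nothing to do (or log) here.
             */
            if (!efi_enabled(EFI_BOOT))
                    return 0;

            if (!efi_enabled(EFI_RUNTIME_SERVICES)) {
                    pr_info("%s : EFI runtime services are not enabled\n",
                            __func__);
                    return 0;
            }

            status = efi.get_variable(efi_sev_live_migration_enabled,
                                      &efi_variable_guid, NULL, &size,
                                      &enabled);
            if (status != EFI_SUCCESS || !enabled)
                    return 0;

            wrmsrl(MSR_KVM_SEV_LIVE_MIGRATION,
                   KVM_SEV_LIVE_MIGRATION_ENABLED);
            return 0;
    }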

* Re: [PATCH v12 13/13] x86/kvm: Add kexec support for SEV Live Migration.
  2021-04-13  0:22   ` Steve Rutherford
@ 2021-04-13  1:23     ` Steve Rutherford
  2021-04-13  1:48       ` Ashish Kalra
  0 siblings, 1 reply; 30+ messages in thread
From: Steve Rutherford @ 2021-04-13  1:23 UTC (permalink / raw)
  To: Ashish Kalra
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Joerg Roedel, Borislav Petkov, Tom Lendacky, X86 ML, KVM list,
	LKML, Sean Christopherson, Venu Busireddy, Brijesh Singh, kexec

On Mon, Apr 12, 2021 at 5:22 PM Steve Rutherford <srutherford@google.com> wrote:
>
> On Mon, Apr 12, 2021 at 12:48 PM Ashish Kalra <Ashish.Kalra@amd.com> wrote:
> >
> > From: Ashish Kalra <ashish.kalra@amd.com>
> >
> > Reset the host's shared pages list related to kernel
> > specific page encryption status settings before we load a
> > new kernel by kexec. We cannot reset the complete
> > shared pages list here as we need to retain the
> > UEFI/OVMF firmware specific settings.
> >
> > The host's shared pages list is maintained for the
> > guest to keep track of all unencrypted guest memory regions,
> > therefore we need to explicitly mark all shared pages as
> > encrypted again before rebooting into the new guest kernel.
> >
> > Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
> > ---
> >  arch/x86/kernel/kvm.c | 24 ++++++++++++++++++++++++
> >  1 file changed, 24 insertions(+)
> >
> > diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> > index bcc82e0c9779..4ad3ed547ff1 100644
> > --- a/arch/x86/kernel/kvm.c
> > +++ b/arch/x86/kernel/kvm.c
> > @@ -39,6 +39,7 @@
> >  #include <asm/cpuidle_haltpoll.h>
> >  #include <asm/ptrace.h>
> >  #include <asm/svm.h>
> > +#include <asm/e820/api.h>
> >
> >  DEFINE_STATIC_KEY_FALSE(kvm_async_pf_enabled);
> >
> > @@ -384,6 +385,29 @@ static void kvm_pv_guest_cpu_reboot(void *unused)
> >          */
> >         if (kvm_para_has_feature(KVM_FEATURE_PV_EOI))
> >                 wrmsrl(MSR_KVM_PV_EOI_EN, 0);
> > +       /*
> > +        * Reset the host's shared pages list related to kernel
> > +        * specific page encryption status settings before we load a
> > +        * new kernel by kexec. NOTE: We cannot reset the complete
> > +        * shared pages list here as we need to retain the
> > +        * UEFI/OVMF firmware specific settings.
> > +        */
> > +       if (sev_live_migration_enabled && (smp_processor_id() == 0)) {
> What happens if the reboot of CPU0 races with another CPU servicing a
> device request (while the reboot is pending for that CPU)?
> Seems like you could run into a scenario where you have hypercalls racing.
>
> Calling this on every core isn't free, but it is an easy way to avoid this race.
> You could also count cores, and have only the last core do the job, but
> that seems more complicated.
On second thought, I think this may be insufficient as a fix, since my
read of kernel/reboot.c seems to imply that devices aren't shut down
until after these notifiers occur. As such, a single thread might be
able to race with itself. I could be wrong here though.

The heavy hammer would be to disable migration through the MSR (which
the subsequent boot will re-enable).

I'm curious if there is a less "blocking" way of handling kexecs (that
strategy would block LM while the guest booted).

One option that comes to mind would be for the guest to "mute" the
encryption status hypercall after the call to reset the encryption
status. The problem would be that the encryption status for pages
would be very temporarily inaccurate in the window between that call
and the start of the next boot. That isn't ideal, but, on the other
hand, the VM was about to reboot anyway, so a corrupted shared page
for device communication probably isn't super important. Still, I'm
not really a fan of that. This would avoid corrupting the next boot,
which is clearly an improvement.

Each time the kernel boots it could also choose something like a
generation ID, and pass that down each time it calls the hypercall.
This would then let userspace identify which requests were coming from
the subsequent boot.

Everything here (except, perhaps, disabling migration through the MSR)
seems kind of complicated. I somewhat hope my interpretation of
kernel/reboot.c is wrong and this race just is not possible in the
first place.

Steve
> Steve
> > +               int i;
> > +               unsigned long nr_pages;
> > +
> > +               for (i = 0; i < e820_table->nr_entries; i++) {
> > +                       struct e820_entry *entry = &e820_table->entries[i];
> > +
> > +                       if (entry->type != E820_TYPE_RAM)
> > +                               continue;
> > +
> > +                       nr_pages = DIV_ROUND_UP(entry->size, PAGE_SIZE);
> > +
> > +                       kvm_sev_hypercall3(KVM_HC_PAGE_ENC_STATUS,
> > +                                          entry->addr, nr_pages, 1);
> > +               }
> > +       }
> >         kvm_pv_disable_apf();
> >         kvm_disable_steal_time();
> >  }
> > --
> > 2.17.1
> >

* Re: [PATCH v12 13/13] x86/kvm: Add kexec support for SEV Live Migration.
  2021-04-13  1:23     ` Steve Rutherford
@ 2021-04-13  1:48       ` Ashish Kalra
  2021-04-13  2:25         ` Steve Rutherford
  0 siblings, 1 reply; 30+ messages in thread
From: Ashish Kalra @ 2021-04-13  1:48 UTC (permalink / raw)
  To: Steve Rutherford
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Joerg Roedel, Borislav Petkov, Tom Lendacky, X86 ML, KVM list,
	LKML, Sean Christopherson, Venu Busireddy, Brijesh Singh, kexec

On Mon, Apr 12, 2021 at 06:23:32PM -0700, Steve Rutherford wrote:
> On Mon, Apr 12, 2021 at 5:22 PM Steve Rutherford <srutherford@google.com> wrote:
> >
> > On Mon, Apr 12, 2021 at 12:48 PM Ashish Kalra <Ashish.Kalra@amd.com> wrote:
> > >
> > > From: Ashish Kalra <ashish.kalra@amd.com>
> > >
> > > Reset the host's shared pages list related to kernel
> > > specific page encryption status settings before we load a
> > > new kernel by kexec. We cannot reset the complete
> > > shared pages list here as we need to retain the
> > > UEFI/OVMF firmware specific settings.
> > >
> > > The host's shared pages list is maintained for the
> > > guest to keep track of all unencrypted guest memory regions,
> > > therefore we need to explicitly mark all shared pages as
> > > encrypted again before rebooting into the new guest kernel.
> > >
> > > Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
> > > ---
> > >  arch/x86/kernel/kvm.c | 24 ++++++++++++++++++++++++
> > >  1 file changed, 24 insertions(+)
> > >
> > > diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> > > index bcc82e0c9779..4ad3ed547ff1 100644
> > > --- a/arch/x86/kernel/kvm.c
> > > +++ b/arch/x86/kernel/kvm.c
> > > @@ -39,6 +39,7 @@
> > >  #include <asm/cpuidle_haltpoll.h>
> > >  #include <asm/ptrace.h>
> > >  #include <asm/svm.h>
> > > +#include <asm/e820/api.h>
> > >
> > >  DEFINE_STATIC_KEY_FALSE(kvm_async_pf_enabled);
> > >
> > > @@ -384,6 +385,29 @@ static void kvm_pv_guest_cpu_reboot(void *unused)
> > >          */
> > >         if (kvm_para_has_feature(KVM_FEATURE_PV_EOI))
> > >                 wrmsrl(MSR_KVM_PV_EOI_EN, 0);
> > > +       /*
> > > +        * Reset the host's shared pages list related to kernel
> > > +        * specific page encryption status settings before we load a
> > > +        * new kernel by kexec. NOTE: We cannot reset the complete
> > > +        * shared pages list here as we need to retain the
> > > +        * UEFI/OVMF firmware specific settings.
> > > +        */
> > > +       if (sev_live_migration_enabled && (smp_processor_id() == 0)) {
> > What happens if the reboot of CPU0 races with another CPU servicing a
> > device request (while the reboot is pending for that CPU)?
> > Seems like you could run into a scenario where you have hypercalls racing.
> >
> > Calling this on every core isn't free, but it is an easy way to avoid this race.
> > You could also count cores, and have only the last core do the job, but
> > that seems more complicated.
> On second thought, I think this may be insufficient as a fix, since my
> read of kernel/reboot.c seems to imply that devices aren't shut down
> until after these notifiers occur. As such, a single thread might be
> able to race with itself. I could be wrong here though.
> 
> The heavy hammer would be to disable migration through the MSR (which
> the subsequent boot will re-enable).
> 
> I'm curious if there is a less "blocking" way of handling kexecs (that
> strategy would block LM while the guest booted).
> 
> One option that comes to mind would be for the guest to "mute" the
> encryption status hypercall after the call to reset the encryption
> status. The problem would be that the encryption status for pages
> would be very temporarily inaccurate in the window between that call
> and the start of the next boot. That isn't ideal, but, on the other
> hand, the VM was about to reboot anyway, so a corrupted shared page
> for device communication probably isn't super important. Still, I'm
> not really a fan of that. This would avoid corrupting the next boot,
> which is clearly an improvement.
> 
> Each time the kernel boots it could also choose something like a
> generation ID, and pass that down each time it calls the hypercall.
> This would then let userspace identify which requests were coming from
> the subsequent boot.
> 
> Everything here (except, perhaps, disabling migration through the MSR)
> seems kind of complicated. I somewhat hope my interpretation of
> kernel/reboot.c is wrong and this race just is not possible in the
> first place.
> 

Disabling migration through the MSR after resetting the page encryption
status is a reasonable approach. A similar window during which LM is
disabled already exists for normal VM boot, between the point where OVMF
detects and advertises SEV LM support and the point where the booting
kernel checks for the same and enables LM using the MSR.

Thanks,
Ashish

> > > +               int i;
> > > +               unsigned long nr_pages;
> > > +
> > > +               for (i = 0; i < e820_table->nr_entries; i++) {
> > > +                       struct e820_entry *entry = &e820_table->entries[i];
> > > +
> > > +                       if (entry->type != E820_TYPE_RAM)
> > > +                               continue;
> > > +
> > > +                       nr_pages = DIV_ROUND_UP(entry->size, PAGE_SIZE);
> > > +
> > > +                       kvm_sev_hypercall3(KVM_HC_PAGE_ENC_STATUS,
> > > +                                          entry->addr, nr_pages, 1);
> > > +               }
> > > +       }
> > >         kvm_pv_disable_apf();
> > >         kvm_disable_steal_time();
> > >  }
> > > --
> > > 2.17.1
> > >

* Re: [PATCH v12 13/13] x86/kvm: Add kexec support for SEV Live Migration.
  2021-04-13  1:48       ` Ashish Kalra
@ 2021-04-13  2:25         ` Steve Rutherford
  2021-04-13 11:47           ` Ashish Kalra
  0 siblings, 1 reply; 30+ messages in thread
From: Steve Rutherford @ 2021-04-13  2:25 UTC (permalink / raw)
  To: Ashish Kalra
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Joerg Roedel, Borislav Petkov, Tom Lendacky, X86 ML, KVM list,
	LKML, Sean Christopherson, Venu Busireddy, Brijesh Singh, kexec

On Mon, Apr 12, 2021 at 6:48 PM Ashish Kalra <ashish.kalra@amd.com> wrote:
>
> On Mon, Apr 12, 2021 at 06:23:32PM -0700, Steve Rutherford wrote:
> > On Mon, Apr 12, 2021 at 5:22 PM Steve Rutherford <srutherford@google.com> wrote:
> > >
> > > On Mon, Apr 12, 2021 at 12:48 PM Ashish Kalra <Ashish.Kalra@amd.com> wrote:
> > > >
> > > > From: Ashish Kalra <ashish.kalra@amd.com>
> > > >
> > > > Reset the host's shared pages list related to kernel
> > > > specific page encryption status settings before we load a
> > > > new kernel by kexec. We cannot reset the complete
> > > > shared pages list here as we need to retain the
> > > > UEFI/OVMF firmware specific settings.
> > > >
> > > > The host's shared pages list is maintained for the
> > > > guest to keep track of all unencrypted guest memory regions,
> > > > therefore we need to explicitly mark all shared pages as
> > > > encrypted again before rebooting into the new guest kernel.
> > > >
> > > > Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
> > > > ---
> > > >  arch/x86/kernel/kvm.c | 24 ++++++++++++++++++++++++
> > > >  1 file changed, 24 insertions(+)
> > > >
> > > > diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> > > > index bcc82e0c9779..4ad3ed547ff1 100644
> > > > --- a/arch/x86/kernel/kvm.c
> > > > +++ b/arch/x86/kernel/kvm.c
> > > > @@ -39,6 +39,7 @@
> > > >  #include <asm/cpuidle_haltpoll.h>
> > > >  #include <asm/ptrace.h>
> > > >  #include <asm/svm.h>
> > > > +#include <asm/e820/api.h>
> > > >
> > > >  DEFINE_STATIC_KEY_FALSE(kvm_async_pf_enabled);
> > > >
> > > > @@ -384,6 +385,29 @@ static void kvm_pv_guest_cpu_reboot(void *unused)
> > > >          */
> > > >         if (kvm_para_has_feature(KVM_FEATURE_PV_EOI))
> > > >                 wrmsrl(MSR_KVM_PV_EOI_EN, 0);
> > > > +       /*
> > > > +        * Reset the host's shared pages list related to kernel
> > > > +        * specific page encryption status settings before we load a
> > > > +        * new kernel by kexec. NOTE: We cannot reset the complete
> > > > +        * shared pages list here as we need to retain the
> > > > +        * UEFI/OVMF firmware specific settings.
> > > > +        */
> > > > +       if (sev_live_migration_enabled && (smp_processor_id() == 0)) {
> > > What happens if the reboot of CPU0 races with another CPU servicing a
> > > device request (while the reboot is pending for that CPU)?
> > > Seems like you could run into a scenario where you have hypercalls racing.
> > >
> > > Calling this on every core isn't free, but it is an easy way to avoid this race.
> > > You could also count cores, and have only the last core do the job, but
> > > that seems more complicated.
> > On second thought, I think this may be insufficient as a fix, since my
> > read of kernel/reboot.c seems to imply that devices aren't shut down
> > until after these notifiers occur. As such, a single thread might be
> > able to race with itself. I could be wrong here though.
> >
> > The heavy hammer would be to disable migration through the MSR (which
> > the subsequent boot will re-enable).
> >
> > I'm curious if there is a less "blocking" way of handling kexecs (that
> > strategy would block LM while the guest booted).
> >
> > One option that comes to mind would be for the guest to "mute" the
> > encryption status hypercall after the call to reset the encryption
> > status. The problem would be that the encryption status for pages
> > would be very temporarily inaccurate in the window between that call
> > and the start of the next boot. That isn't ideal, but, on the other
> > hand, the VM was about to reboot anyway, so a corrupted shared page
> > for device communication probably isn't super important. Still, I'm
> > not really a fan of that. This would avoid corrupting the next boot,
> > which is clearly an improvement.
> >
> > Each time the kernel boots it could also choose something like a
> > generation ID, and pass that down each time it calls the hypercall.
> > This would then let userspace identify which requests were coming from
> > the subsequent boot.
> >
> > Everything here (except, perhaps, disabling migration through the MSR)
> > seems kind of complicated. I somewhat hope my interpretation of
> > kernel/reboot.c is wrong and this race just is not possible in the
> > first place.
> >
>
> Disabling migration through the MSR after resetting the page encryption
> status is a reasonable approach. A similar window during which LM is
> disabled already exists for normal VM boot, between the point where OVMF
> detects and advertises SEV LM support and the point where the booting
> kernel checks for the same and enables LM using the MSR.

I'm not totally confident that disabling LM through the MSR is
sufficient. I also think the newly booted kernel needs to reset the
state itself, since nothing stops the hypercalls after the disable
goes through. The host won't know the difference between early boot
(pre-enablement) hypercalls and racy just-before-restart hypercalls.
You might disable migration through the hypercall, get a late status
change hypercall, reboot, then re-enable migration, but still have
stale state.

I _believe_ that the kernel doesn't mark its RAM as private on boot
as an optimization (might be wrong about this), since it would have
been expensive to mark all of RAM as encrypted previously. I believe
that is no longer a limitation given the KVM_EXIT, so we can reset
this during early boot instead of just before the kexec.

Thanks,
Steve

>
> Thanks,
> Ashish
>
> > > > +               int i;
> > > > +               unsigned long nr_pages;
> > > > +
> > > > +               for (i = 0; i < e820_table->nr_entries; i++) {
> > > > +                       struct e820_entry *entry = &e820_table->entries[i];
> > > > +
> > > > +                       if (entry->type != E820_TYPE_RAM)
> > > > +                               continue;
> > > > +
> > > > +                       nr_pages = DIV_ROUND_UP(entry->size, PAGE_SIZE);
> > > > +
> > > > +                       kvm_sev_hypercall3(KVM_HC_PAGE_ENC_STATUS,
> > > > +                                          entry->addr, nr_pages, 1);
> > > > +               }
> > > > +       }
> > > >         kvm_pv_disable_apf();
> > > >         kvm_disable_steal_time();
> > > >  }
> > > > --
> > > > 2.17.1
> > > >

* Re: [PATCH v12 13/13] x86/kvm: Add kexec support for SEV Live Migration.
  2021-04-13  2:25         ` Steve Rutherford
@ 2021-04-13 11:47           ` Ashish Kalra
  2021-04-13 19:19             ` Steve Rutherford
  0 siblings, 1 reply; 30+ messages in thread
From: Ashish Kalra @ 2021-04-13 11:47 UTC (permalink / raw)
  To: Steve Rutherford
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Joerg Roedel, Borislav Petkov, Tom Lendacky, X86 ML, KVM list,
	LKML, Sean Christopherson, Venu Busireddy, Brijesh Singh, kexec

On Mon, Apr 12, 2021 at 07:25:03PM -0700, Steve Rutherford wrote:
> On Mon, Apr 12, 2021 at 6:48 PM Ashish Kalra <ashish.kalra@amd.com> wrote:
> >
> > On Mon, Apr 12, 2021 at 06:23:32PM -0700, Steve Rutherford wrote:
> > > On Mon, Apr 12, 2021 at 5:22 PM Steve Rutherford <srutherford@google.com> wrote:
> > > >
> > > > On Mon, Apr 12, 2021 at 12:48 PM Ashish Kalra <Ashish.Kalra@amd.com> wrote:
> > > > >
> > > > > From: Ashish Kalra <ashish.kalra@amd.com>
> > > > >
> > > > > Reset the host's shared pages list related to kernel
> > > > > specific page encryption status settings before we load a
> > > > > new kernel by kexec. We cannot reset the complete
> > > > > shared pages list here as we need to retain the
> > > > > UEFI/OVMF firmware specific settings.
> > > > >
> > > > > The host's shared pages list is maintained for the
> > > > > guest to keep track of all unencrypted guest memory regions,
> > > > > therefore we need to explicitly mark all shared pages as
> > > > > encrypted again before rebooting into the new guest kernel.
> > > > >
> > > > > Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
> > > > > ---
> > > > >  arch/x86/kernel/kvm.c | 24 ++++++++++++++++++++++++
> > > > >  1 file changed, 24 insertions(+)
> > > > >
> > > > > diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> > > > > index bcc82e0c9779..4ad3ed547ff1 100644
> > > > > --- a/arch/x86/kernel/kvm.c
> > > > > +++ b/arch/x86/kernel/kvm.c
> > > > > @@ -39,6 +39,7 @@
> > > > >  #include <asm/cpuidle_haltpoll.h>
> > > > >  #include <asm/ptrace.h>
> > > > >  #include <asm/svm.h>
> > > > > +#include <asm/e820/api.h>
> > > > >
> > > > >  DEFINE_STATIC_KEY_FALSE(kvm_async_pf_enabled);
> > > > >
> > > > > @@ -384,6 +385,29 @@ static void kvm_pv_guest_cpu_reboot(void *unused)
> > > > >          */
> > > > >         if (kvm_para_has_feature(KVM_FEATURE_PV_EOI))
> > > > >                 wrmsrl(MSR_KVM_PV_EOI_EN, 0);
> > > > > +       /*
> > > > > +        * Reset the host's shared pages list related to kernel
> > > > > +        * specific page encryption status settings before we load a
> > > > > +        * new kernel by kexec. NOTE: We cannot reset the complete
> > > > > +        * shared pages list here as we need to retain the
> > > > > +        * UEFI/OVMF firmware specific settings.
> > > > > +        */
> > > > > +       if (sev_live_migration_enabled && (smp_processor_id() == 0)) {
> > > > What happens if the reboot of CPU0 races with another CPU servicing a
> > > > device request (while the reboot is pending for that CPU)?
> > > > Seems like you could run into a scenario where you have hypercalls racing.
> > > >
> > > > Calling this on every core isn't free, but it is an easy way to avoid this race.
> > > > You could also count cores, and have only the last core do the job, but
> > > > that seems more complicated.
> > > On second thought, I think this may be insufficient as a fix, since my
> > > read of kernel/reboot.c seems to imply that devices aren't shut down
> > > until after these notifiers occur. As such, a single thread might be
> > > able to race with itself. I could be wrong here though.
> > >
> > > The heavy hammer would be to disable migration through the MSR (which
> > > the subsequent boot will re-enable).
> > >
> > > I'm curious if there is a less "blocking" way of handling kexecs (that
> > > strategy would block LM while the guest booted).
> > >
> > > One option that comes to mind would be for the guest to "mute" the
> > > encryption status hypercall after the call to reset the encryption
> > > status. The problem would be that the encryption status for pages
> > > would be very temporarily inaccurate in the window between that call
> > > and the start of the next boot. That isn't ideal, but, on the other
> > > hand, the VM was about to reboot anyway, so a corrupted shared page
> > > for device communication probably isn't super important. Still, I'm
> > > not really a fan of that. This would avoid corrupting the next boot,
> > > which is clearly an improvement.
> > >
> > > Each time the kernel boots it could also choose something like a
> > > generation ID, and pass that down each time it calls the hypercall.
> > > This would then let userspace identify which requests were coming from
> > > the subsequent boot.
> > >
> > > Everything here (except, perhaps, disabling migration through the MSR)
> > > seems kind of complicated. I somewhat hope my interpretation of
> > > kernel/reboot.c is wrong and this race just is not possible in the
> > > first place.
> > >
> >
> > Disabling migration through the MSR after resetting the page encryption
> > status is a reasonable approach. A similar window during which LM is
> > disabled already exists for normal VM boot, between the point where OVMF
> > detects and advertises SEV LM support and the point where the booting
> > kernel checks for the same and enables LM using the MSR.
> 
> I'm not totally confident that disabling LM through the MSR is
> sufficient. I also think the newly booted kernel needs to reset the
> state itself, since nothing stops the hypercalls after the disable
> goes through. The host won't know the difference between early boot
> (pre-enablement) hypercalls and racy just-before-restart hypercalls.
> You might disable migration through the hypercall, get a late status
> change hypercall, reboot, then re-enable migration, but still have
> stale state.
> 
> I _believe_ that the kernel doesn't mark its RAM as private on boot
> as an optimization (might be wrong about this), since it would have
> been expensive to mark all of RAM as encrypted previously. I believe
> that is no longer a limitation given the KVM_EXIT, so we can reset
> this during early boot instead of just before the kexec.
> 

I was wondering whether disabling both migration (via the MSR) and
"muting" the hypercall using the "sev_live_migration_enabled" variable
after the page encryption status has been reset would leave the guest's
page encryption status at the last known good configuration from boot
time (i.e., all RAM pages marked as private, plus the shared MMIO/device
regions set up by UEFI, etc.).

But disabling migration and muting the hypercall after the page
encryption status reset is still "racy" with hypercalls on other vCPUs,
and that can potentially mess up the page encryption status available
to the guest after kexec.
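
To make the race concrete, the "muting" approach would look roughly
like this (a hypothetical wrapper, for illustration only):

    static inline void kvm_sev_page_enc_status(unsigned long pa,
                                               unsigned long npages, int enc)
    {
            /* "Muted" once this flag is cleared at reboot time. */
            if (!sev_live_migration_enabled)
                    return;

            kvm_sev_hypercall3(KVM_HC_PAGE_ENC_STATUS, pa, npages, enc);
    }

A vCPU that has already passed the flag check when the reset happens can
still issue the hypercall afterwards, which is exactly the race above.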

So, as you mentioned above, resetting the page encryption status during
early boot (immediately after detecting host support for migration and
enabling the hypercalls), instead of just before the kexec, is probably
the better fix.
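
Something like the following, reusing the loop from this patch but
running it right after migration support is detected at boot (a rough
sketch only; the exact hook point is to be decided):

    int i;
    unsigned long nr_pages;

    /*
     * Early-boot reset (sketch): mark all e820 RAM as encrypted
     * (private) again via the page encryption status hypercall, so
     * any stale shared entries left behind by the previous kernel
     * (e.g. across a kexec) are dropped before LM is enabled.
     */
    for (i = 0; i < e820_table->nr_entries; i++) {
            struct e820_entry *entry = &e820_table->entries[i];

            if (entry->type != E820_TYPE_RAM)
                    continue;

            nr_pages = DIV_ROUND_UP(entry->size, PAGE_SIZE);

            kvm_sev_hypercall3(KVM_HC_PAGE_ENC_STATUS,
                               entry->addr, nr_pages, 1);
    }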

Thanks,
Ashish

> > > > > +               int i;
> > > > > +               unsigned long nr_pages;
> > > > > +
> > > > > +               for (i = 0; i < e820_table->nr_entries; i++) {
> > > > > +                       struct e820_entry *entry = &e820_table->entries[i];
> > > > > +
> > > > > +                       if (entry->type != E820_TYPE_RAM)
> > > > > +                               continue;
> > > > > +
> > > > > +                       nr_pages = DIV_ROUND_UP(entry->size, PAGE_SIZE);
> > > > > +
> > > > > +                       kvm_sev_hypercall3(KVM_HC_PAGE_ENC_STATUS,
> > > > > +                                          entry->addr, nr_pages, 1);
> > > > > +               }
> > > > > +       }
> > > > >         kvm_pv_disable_apf();
> > > > >         kvm_disable_steal_time();
> > > > >  }
> > > > > --
> > > > > 2.17.1
> > > > >

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v12 13/13] x86/kvm: Add kexec support for SEV Live Migration.
  2021-04-13 11:47           ` Ashish Kalra
@ 2021-04-13 19:19             ` Steve Rutherford
  0 siblings, 0 replies; 30+ messages in thread
From: Steve Rutherford @ 2021-04-13 19:19 UTC (permalink / raw)
  To: Ashish Kalra
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Joerg Roedel, Borislav Petkov, Tom Lendacky, X86 ML, KVM list,
	LKML, Sean Christopherson, Venu Busireddy, Brijesh Singh, kexec

On Tue, Apr 13, 2021 at 4:47 AM Ashish Kalra <ashish.kalra@amd.com> wrote:
>
> On Mon, Apr 12, 2021 at 07:25:03PM -0700, Steve Rutherford wrote:
> > On Mon, Apr 12, 2021 at 6:48 PM Ashish Kalra <ashish.kalra@amd.com> wrote:
> > >
> > > On Mon, Apr 12, 2021 at 06:23:32PM -0700, Steve Rutherford wrote:
> > > > On Mon, Apr 12, 2021 at 5:22 PM Steve Rutherford <srutherford@google.com> wrote:
> > > > >
> > > > > On Mon, Apr 12, 2021 at 12:48 PM Ashish Kalra <Ashish.Kalra@amd.com> wrote:
> > > > > >
> > > > > > From: Ashish Kalra <ashish.kalra@amd.com>
> > > > > >
> > > > > > Reset the host's shared pages list related to the
> > > > > > kernel-specific page encryption status settings before we
> > > > > > load a new kernel by kexec. We cannot reset the complete
> > > > > > shared pages list here, as we need to retain the UEFI/OVMF
> > > > > > firmware-specific settings.
> > > > > >
> > > > > > The host's shared pages list is maintained for the guest to
> > > > > > keep track of all unencrypted guest memory regions; therefore,
> > > > > > we need to explicitly mark all shared pages as encrypted again
> > > > > > before rebooting into the new guest kernel.
> > > > > >
> > > > > > Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
> > > > > > ---
> > > > > >  arch/x86/kernel/kvm.c | 24 ++++++++++++++++++++++++
> > > > > >  1 file changed, 24 insertions(+)
> > > > > >
> > > > > > diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> > > > > > index bcc82e0c9779..4ad3ed547ff1 100644
> > > > > > --- a/arch/x86/kernel/kvm.c
> > > > > > +++ b/arch/x86/kernel/kvm.c
> > > > > > @@ -39,6 +39,7 @@
> > > > > >  #include <asm/cpuidle_haltpoll.h>
> > > > > >  #include <asm/ptrace.h>
> > > > > >  #include <asm/svm.h>
> > > > > > +#include <asm/e820/api.h>
> > > > > >
> > > > > >  DEFINE_STATIC_KEY_FALSE(kvm_async_pf_enabled);
> > > > > >
> > > > > > @@ -384,6 +385,29 @@ static void kvm_pv_guest_cpu_reboot(void *unused)
> > > > > >          */
> > > > > >         if (kvm_para_has_feature(KVM_FEATURE_PV_EOI))
> > > > > >                 wrmsrl(MSR_KVM_PV_EOI_EN, 0);
> > > > > > +       /*
> > > > > > +        * Reset the host's shared pages list related to kernel
> > > > > > +        * specific page encryption status settings before we load a
> > > > > > +        * new kernel by kexec. NOTE: We cannot reset the complete
> > > > > > +        * shared pages list here as we need to retain the
> > > > > > +        * UEFI/OVMF firmware specific settings.
> > > > > > +        */
> > > > > > +       if (sev_live_migration_enabled && (smp_processor_id() == 0)) {
> > > > > What happens if the reboot of CPU0 races with another CPU servicing a
> > > > > device request (while the reboot is pending for that CPU)?
> > > > > Seems like you could run into a scenario where you have hypercalls racing.
> > > > >
> > > > > Calling this on every core isn't free, but it is an easy way to avoid this race.
> > > > > You could also count cores, and have only the last core do the job, but
> > > > > that seems more complicated.
> > > > On second thought, I think this may be insufficient as a fix, since my
> > > > read of kernel/reboot.c seems to imply that devices aren't shut down
> > > > until after these notifiers occur. As such, a single thread might be
> > > > able to race with itself. I could be wrong here though.
> > > >
> > > > The heavy hammer would be to disable migration through the MSR (which
> > > > the subsequent boot will re-enable).
> > > >
> > > > I'm curious if there is a less "blocking" way of handling kexecs (that
> > > > strategy would block LM while the guest booted).
> > > >
> > > > One option that comes to mind would be for the guest to "mute" the
> > > > encryption status hypercall after the call to reset the encryption
> > > > status. The problem would be that the encryption status for pages
> > > > would be briefly inaccurate in the window between that call
> > > > and the start of the next boot. That isn't ideal, but, on the other
> > > > hand, the VM was about to reboot anyway, so a corrupted shared page
> > > > for device communication probably isn't super important. Still, I'm
> > > > not really a fan of that. This would avoid corrupting the next boot,
> > > > which is clearly an improvement.
> > > >
> > > > Each time the kernel boots it could also choose something like a
> > > > generation ID, and pass that down each time it calls the hypercall.
> > > > This would then let userspace identify which requests were coming from
> > > > the subsequent boot.
> > > >
> > > > Everything here (except, perhaps, disabling migration through the MSR)
> > > > seems kind of complicated. I somewhat hope my interpretation of
> > > > kernel/reboot.c is wrong and this race just is not possible in the
> > > > first place.
> > > >
> > >
> > > Disabling migration through the MSR after resetting the page encryption
> > > status is a reasonable approach. A similar window already exists during
> > > normal VM boot, during which LM is disabled: from the point where OVMF
> > > checks for and advertises SEV LM support until the booting kernel checks
> > > for the same and enables LM using the MSR.
> >
> > I'm not totally confident that disabling LM through the MSR is
> > sufficient. I also think the newly booted kernel needs to reset the
> > state itself, since nothing stops the hypercalls after the disable
> > goes through. The host won't know the difference between early boot
> > (pre-enablement) hypercalls and racy just-before-restart hypercalls.
> > You might disable migration through the MSR, get a late status
> > change hypercall, reboot, then re-enable migration, but still have
> > stale state.
> >
> > I _believe_ that the kernel doesn't mark its RAM as private on boot
> > as an optimization (might be wrong about this), since it would have
> > been expensive to mark all of RAM as encrypted previously. I believe
> > that is no longer a limitation given the KVM_EXIT, so we can reset
> > this during early boot instead of just before the kexec.
> >
>
> I was wondering whether disabling both migration (via the MSR) and the
> hypercall (by "muting" it via the "sev_live_migration_enabled" variable)
> after the page encryption status has been reset would restore the page
> encryption status of the guest to the last known good configuration it
> had at boot time (i.e., all RAM pages marked as private, plus the shared
> MMIO/device regions set up by UEFI, etc.).
>
> But disabling migration and muting the hypercall after the page
> encryption status reset is still "racy" with hypercalls on other vCPUs,
> and that can potentially mess up the page encryption status available
> to the guest after kexec.
>
> So, as you mentioned above, resetting the page encryption status during
> early boot (immediately after detecting host support for migration and
> enabling the hypercalls), instead of just before the kexec, is probably
> the better fix.
That strategy sounds good to me.
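
For clarity, the boot-time ordering I have in mind is roughly the
following (sketch only: sev_reset_page_enc_status() is a hypothetical
helper wrapping the e820 loop from this patch, and the feature/MSR
names are the ones I recall this series introducing, so double-check
them):

    if (kvm_para_has_feature(KVM_FEATURE_SEV_LIVE_MIGRATION)) {
            /* Drop any stale shared-page state first... */
            sev_reset_page_enc_status();

            /* ...and only then advertise LM support to the host. */
            wrmsrl(MSR_KVM_SEV_LIVE_MIGRATION,
                   KVM_SEV_LIVE_MIGRATION_ENABLED);
    }

That way the host never sees a stale status update once the enable bit
is set.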

Thanks,
Steve
>
> Thanks,
> Ashish
>
> > > > > > +               int i;
> > > > > > +               unsigned long nr_pages;
> > > > > > +
> > > > > > +               for (i = 0; i < e820_table->nr_entries; i++) {
> > > > > > +                       struct e820_entry *entry = &e820_table->entries[i];
> > > > > > +
> > > > > > +                       if (entry->type != E820_TYPE_RAM)
> > > > > > +                               continue;
> > > > > > +
> > > > > > +                       nr_pages = DIV_ROUND_UP(entry->size, PAGE_SIZE);
> > > > > > +
> > > > > > +                       kvm_sev_hypercall3(KVM_HC_PAGE_ENC_STATUS,
> > > > > > +                                          entry->addr, nr_pages, 1);
> > > > > > +               }
> > > > > > +       }
> > > > > >         kvm_pv_disable_apf();
> > > > > >         kvm_disable_steal_time();
> > > > > >  }
> > > > > > --
> > > > > > 2.17.1
> > > > > >

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2021-04-13 19:20 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-12 19:42 [PATCH v12 00/13] Add AMD SEV guest live migration support Ashish Kalra
2021-04-12 19:43 ` [PATCH v12 01/13] KVM: SVM: Add KVM_SEV SEND_START command Ashish Kalra
2021-04-12 19:43 ` [PATCH v12 02/13] KVM: SVM: Add KVM_SEND_UPDATE_DATA command Ashish Kalra
2021-04-12 19:43 ` [PATCH v12 03/13] KVM: SVM: Add KVM_SEV_SEND_FINISH command Ashish Kalra
2021-04-13  0:03   ` Steve Rutherford
2021-04-12 19:44 ` [PATCH v12 04/13] KVM: SVM: Add support for KVM_SEV_RECEIVE_START command Ashish Kalra
2021-04-13  0:03   ` Steve Rutherford
2021-04-12 19:44 ` [PATCH v12 05/13] KVM: SVM: Add KVM_SEV_RECEIVE_UPDATE_DATA command Ashish Kalra
2021-04-13  0:04   ` Steve Rutherford
2021-04-12 19:44 ` [PATCH v12 06/13] KVM: SVM: Add KVM_SEV_RECEIVE_FINISH command Ashish Kalra
2021-04-13  0:04   ` Steve Rutherford
2021-04-12 19:45 ` [PATCH v12 07/13] KVM: x86: Add AMD SEV specific Hypercall3 Ashish Kalra
2021-04-12 19:45 ` [PATCH v12 08/13] KVM: X86: Introduce KVM_HC_PAGE_ENC_STATUS hypercall Ashish Kalra
2021-04-13  0:04   ` Steve Rutherford
2021-04-12 19:45 ` [PATCH v12 09/13] mm: x86: Invoke hypercall when page encryption status is changed Ashish Kalra
2021-04-13  0:06   ` Steve Rutherford
2021-04-12 19:45 ` [PATCH v12 10/13] KVM: x86: Introduce new KVM_FEATURE_SEV_LIVE_MIGRATION feature & Custom MSR Ashish Kalra
2021-04-13  0:04   ` Steve Rutherford
2021-04-12 19:46 ` [PATCH v12 11/13] EFI: Introduce the new AMD Memory Encryption GUID Ashish Kalra
2021-04-13  0:05   ` Steve Rutherford
2021-04-12 19:46 ` [PATCH v12 12/13] x86/kvm: Add guest support for detecting and enabling SEV Live Migration feature Ashish Kalra
2021-04-13  0:25   ` Steve Rutherford
2021-04-13  1:01     ` Ashish Kalra
2021-04-12 19:47 ` [PATCH v12 13/13] x86/kvm: Add kexec support for SEV Live Migration Ashish Kalra
2021-04-13  0:22   ` Steve Rutherford
2021-04-13  1:23     ` Steve Rutherford
2021-04-13  1:48       ` Ashish Kalra
2021-04-13  2:25         ` Steve Rutherford
2021-04-13 11:47           ` Ashish Kalra
2021-04-13 19:19             ` Steve Rutherford

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).