* [PATCH 00/12] SEV Live Migration Patchset.
@ 2020-02-13  1:14 Ashish Kalra
  2020-02-13  1:14 ` [PATCH 01/12] KVM: SVM: Add KVM_SEV SEND_START command Ashish Kalra
                   ` (12 more replies)
  0 siblings, 13 replies; 49+ messages in thread
From: Ashish Kalra @ 2020-02-13  1:14 UTC (permalink / raw)
  To: pbonzini
  Cc: tglx, mingo, hpa, rkrcmar, joro, bp, thomas.lendacky, rientjes,
	x86, kvm, linux-kernel

From: Ashish Kalra <ashish.kalra@amd.com>

This patchset adds support for SEV Live Migration on KVM/QEMU.
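
At a high level, userspace drives both ends of the migration through the
KVM_MEMORY_ENCRYPT_OP ioctl with the new SEND_*/RECEIVE_* sub-commands
added by this series. A minimal sketch of the wrapper a VMM might use
(sev_vm_cmd() is illustrative only, not part of this series; error
handling omitted):

        #include <stdint.h>
        #include <sys/ioctl.h>
        #include <linux/kvm.h>

        /* Issue one SEV command against a VM fd. */
        static int sev_vm_cmd(int vm_fd, int sev_fd, uint32_t id, void *data)
        {
                struct kvm_sev_cmd cmd = {
                        .id = id,
                        .data = (uintptr_t)data,
                        .sev_fd = sev_fd,
                };

                return ioctl(vm_fd, KVM_MEMORY_ENCRYPT_OP, &cmd);
        }

The source then issues SEND_START once, SEND_UPDATE_DATA for each private
guest page (as reported by KVM_GET_PAGE_ENC_BITMAP), and SEND_FINISH; the
destination mirrors this with RECEIVE_START, RECEIVE_UPDATE_DATA and
RECEIVE_FINISH.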

Ashish Kalra (1):
  KVM: x86: Introduce KVM_PAGE_ENC_BITMAP_RESET ioctl

Brijesh Singh (11):
  KVM: SVM: Add KVM_SEV SEND_START command
  KVM: SVM: Add KVM_SEND_UPDATE_DATA command
  KVM: SVM: Add KVM_SEV_SEND_FINISH command
  KVM: SVM: Add support for KVM_SEV_RECEIVE_START command
  KVM: SVM: Add KVM_SEV_RECEIVE_UPDATE_DATA command
  KVM: SVM: Add KVM_SEV_RECEIVE_FINISH command
  KVM: x86: Add AMD SEV specific Hypercall3
  KVM: X86: Introduce KVM_HC_PAGE_ENC_STATUS hypercall
  KVM: x86: Introduce KVM_GET_PAGE_ENC_BITMAP ioctl
  mm: x86: Invoke hypercall when page encryption status is changed
  KVM: x86: Introduce KVM_SET_PAGE_ENC_BITMAP ioctl

 .../virt/kvm/amd-memory-encryption.rst        | 120 ++++
 Documentation/virt/kvm/api.txt                |  59 ++
 Documentation/virt/kvm/hypercalls.txt         |  14 +
 arch/x86/include/asm/kvm_host.h               |   7 +
 arch/x86/include/asm/kvm_para.h               |  12 +
 arch/x86/include/asm/paravirt.h               |   6 +
 arch/x86/include/asm/paravirt_types.h         |   2 +
 arch/x86/kernel/paravirt.c                    |   1 +
 arch/x86/kvm/svm.c                            | 662 +++++++++++++++++-
 arch/x86/kvm/vmx/vmx.c                        |   1 +
 arch/x86/kvm/x86.c                            |  36 +
 arch/x86/mm/mem_encrypt.c                     |  57 +-
 arch/x86/mm/pat/set_memory.c                  |   7 +
 include/linux/psp-sev.h                       |   8 +-
 include/uapi/linux/kvm.h                      |  53 ++
 include/uapi/linux/kvm_para.h                 |   1 +
 16 files changed, 1037 insertions(+), 9 deletions(-)

-- 
2.17.1



* [PATCH 01/12] KVM: SVM: Add KVM_SEV SEND_START command
  2020-02-13  1:14 [PATCH 00/12] SEV Live Migration Patchset Ashish Kalra
@ 2020-02-13  1:14 ` Ashish Kalra
  2020-03-09 21:28   ` Steve Rutherford
  2020-02-13  1:15 ` [PATCH 02/12] KVM: SVM: Add KVM_SEND_UPDATE_DATA command Ashish Kalra
                   ` (11 subsequent siblings)
  12 siblings, 1 reply; 49+ messages in thread
From: Ashish Kalra @ 2020-02-13  1:14 UTC (permalink / raw)
  To: pbonzini
  Cc: tglx, mingo, hpa, rkrcmar, joro, bp, thomas.lendacky, rientjes,
	x86, kvm, linux-kernel

From: Brijesh Singh <brijesh.singh@amd.com>

The command is used to create an outgoing SEV guest encryption context.
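
The command also doubles as a length query: if userspace passes
session_len == 0, KVM only reports the required session blob length back
through the struct. A sketch of the expected calling pattern, using the
hypothetical sev_vm_cmd() wrapper from the cover letter (the certificate
buffers are assumed, error handling omitted):

        struct kvm_sev_send_start start = {};

        /* First call: session_len == 0 queries the required length. */
        sev_vm_cmd(vm_fd, sev_fd, KVM_SEV_SEND_START, &start);

        start.session_uaddr = (uintptr_t)malloc(start.session_len);
        start.pdh_cert_uaddr = (uintptr_t)pdh_cert;
        start.pdh_cert_len = pdh_cert_len;
        start.plat_certs_uaddr = (uintptr_t)plat_certs;
        start.plat_certs_len = plat_certs_len;
        start.amd_certs_uaddr = (uintptr_t)amd_certs;
        start.amd_certs_len = amd_certs_len;

        /* Second call: actually create the outgoing context. */
        sev_vm_cmd(vm_fd, sev_fd, KVM_SEV_SEND_START, &start);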

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: x86@kernel.org
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
 .../virt/kvm/amd-memory-encryption.rst        |  27 ++++
 arch/x86/kvm/svm.c                            | 125 ++++++++++++++++++
 include/linux/psp-sev.h                       |   8 +-
 include/uapi/linux/kvm.h                      |  12 ++
 4 files changed, 168 insertions(+), 4 deletions(-)

diff --git a/Documentation/virt/kvm/amd-memory-encryption.rst b/Documentation/virt/kvm/amd-memory-encryption.rst
index d18c97b4e140..826911f41f3b 100644
--- a/Documentation/virt/kvm/amd-memory-encryption.rst
+++ b/Documentation/virt/kvm/amd-memory-encryption.rst
@@ -238,6 +238,33 @@ Returns: 0 on success, -negative on error
                 __u32 trans_len;
         };
 
+10. KVM_SEV_SEND_START
+----------------------
+
+The KVM_SEV_SEND_START command can be used by the hypervisor to create an
+outgoing guest encryption context.
+
+Parameters (in): struct kvm_sev_send_start
+
+Returns: 0 on success, -negative on error
+
+::
+        struct kvm_sev_send_start {
+                __u32 policy;                 /* guest policy */
+
+                __u64 pdh_cert_uaddr;         /* platform Diffie-Hellman certificate */
+                __u32 pdh_cert_len;
+
+                __u64 plat_certs_uaddr;       /* platform certificate chain */
+                __u32 plat_certs_len;
+
+                __u64 amd_certs_uaddr;        /* AMD certificate chain */
+                __u32 amd_certs_len;
+
+                __u64 session_uaddr;          /* Guest session information */
+                __u32 session_len;
+        };
+
 References
 ==========
 
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index a3e32d61d60c..3a7e2cac51de 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -7140,6 +7140,128 @@ static int sev_launch_secret(struct kvm *kvm, struct kvm_sev_cmd *argp)
 	return ret;
 }
 
+/* Userspace wants to query session length. */
+static int
+__sev_send_start_query_session_length(struct kvm *kvm, struct kvm_sev_cmd *argp,
+				      struct kvm_sev_send_start *params)
+{
+	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+	struct sev_data_send_start *data;
+	int ret;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL_ACCOUNT);
+	if (data == NULL)
+		return -ENOMEM;
+
+	data->handle = sev->handle;
+	ret = sev_issue_cmd(kvm, SEV_CMD_SEND_START, data, &argp->error);
+
+	params->session_len = data->session_len;
+	if (copy_to_user((void __user *)(uintptr_t)argp->data, params,
+				sizeof(struct kvm_sev_send_start)))
+		ret = -EFAULT;
+
+	kfree(data);
+	return ret;
+}
+
+static int sev_send_start(struct kvm *kvm, struct kvm_sev_cmd *argp)
+{
+	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+	struct sev_data_send_start *data;
+	struct kvm_sev_send_start params;
+	void *amd_certs, *session_data;
+	void *pdh_cert, *plat_certs;
+	int ret;
+
+	if (!sev_guest(kvm))
+		return -ENOTTY;
+
+	if (copy_from_user(&params, (void __user *)(uintptr_t)argp->data,
+				sizeof(struct kvm_sev_send_start)))
+		return -EFAULT;
+
+	/* if session_len is zero, userspace wants to query the session length */
+	if (!params.session_len)
+		return __sev_send_start_query_session_length(kvm, argp,
+				&params);
+
+	/* some sanity checks */
+	if (!params.pdh_cert_uaddr || !params.pdh_cert_len ||
+	    !params.session_uaddr || params.session_len > SEV_FW_BLOB_MAX_SIZE)
+		return -EINVAL;
+
+	/* allocate the memory to hold the session data blob */
+	session_data = kmalloc(params.session_len, GFP_KERNEL_ACCOUNT);
+	if (!session_data)
+		return -ENOMEM;
+
+	/* copy the certificate blobs from userspace */
+	pdh_cert = psp_copy_user_blob(params.pdh_cert_uaddr,
+				params.pdh_cert_len);
+	if (IS_ERR(pdh_cert)) {
+		ret = PTR_ERR(pdh_cert);
+		goto e_free_session;
+	}
+
+	plat_certs = psp_copy_user_blob(params.plat_certs_uaddr,
+				params.plat_certs_len);
+	if (IS_ERR(plat_certs)) {
+		ret = PTR_ERR(plat_certs);
+		goto e_free_pdh;
+	}
+
+	amd_certs = psp_copy_user_blob(params.amd_certs_uaddr,
+				params.amd_certs_len);
+	if (IS_ERR(amd_certs)) {
+		ret = PTR_ERR(amd_certs);
+		goto e_free_plat_cert;
+	}
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL_ACCOUNT);
+	if (data == NULL) {
+		ret = -ENOMEM;
+		goto e_free_amd_cert;
+	}
+
+	/* populate the FW SEND_START fields with system physical addresses */
+	data->pdh_cert_address = __psp_pa(pdh_cert);
+	data->pdh_cert_len = params.pdh_cert_len;
+	data->plat_certs_address = __psp_pa(plat_certs);
+	data->plat_certs_len = params.plat_certs_len;
+	data->amd_certs_address = __psp_pa(amd_certs);
+	data->amd_certs_len = params.amd_certs_len;
+	data->session_address = __psp_pa(session_data);
+	data->session_len = params.session_len;
+	data->handle = sev->handle;
+
+	ret = sev_issue_cmd(kvm, SEV_CMD_SEND_START, data, &argp->error);
+
+	if (!ret && copy_to_user((void __user *)(uintptr_t)params.session_uaddr,
+			session_data, params.session_len)) {
+		ret = -EFAULT;
+		goto e_free;
+	}
+
+	params.policy = data->policy;
+	params.session_len = data->session_len;
+	if (copy_to_user((void __user *)(uintptr_t)argp->data, &params,
+				sizeof(struct kvm_sev_send_start)))
+		ret = -EFAULT;
+
+e_free:
+	kfree(data);
+e_free_amd_cert:
+	kfree(amd_certs);
+e_free_plat_cert:
+	kfree(plat_certs);
+e_free_pdh:
+	kfree(pdh_cert);
+e_free_session:
+	kfree(session_data);
+	return ret;
+}
+
 static int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
 {
 	struct kvm_sev_cmd sev_cmd;
@@ -7181,6 +7303,9 @@ static int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
 	case KVM_SEV_LAUNCH_SECRET:
 		r = sev_launch_secret(kvm, &sev_cmd);
 		break;
+	case KVM_SEV_SEND_START:
+		r = sev_send_start(kvm, &sev_cmd);
+		break;
 	default:
 		r = -EINVAL;
 		goto out;
diff --git a/include/linux/psp-sev.h b/include/linux/psp-sev.h
index 5167bf2bfc75..9f63b9d48b63 100644
--- a/include/linux/psp-sev.h
+++ b/include/linux/psp-sev.h
@@ -323,11 +323,11 @@ struct sev_data_send_start {
 	u64 pdh_cert_address;			/* In */
 	u32 pdh_cert_len;			/* In */
 	u32 reserved1;
-	u64 plat_cert_address;			/* In */
-	u32 plat_cert_len;			/* In */
+	u64 plat_certs_address;			/* In */
+	u32 plat_certs_len;			/* In */
 	u32 reserved2;
-	u64 amd_cert_address;			/* In */
-	u32 amd_cert_len;			/* In */
+	u64 amd_certs_address;			/* In */
+	u32 amd_certs_len;			/* In */
 	u32 reserved3;
 	u64 session_address;			/* In */
 	u32 session_len;			/* In/Out */
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 4b95f9a31a2f..17bef4c245e1 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1558,6 +1558,18 @@ struct kvm_sev_dbg {
 	__u32 len;
 };
 
+struct kvm_sev_send_start {
+	__u32 policy;
+	__u64 pdh_cert_uaddr;
+	__u32 pdh_cert_len;
+	__u64 plat_certs_uaddr;
+	__u32 plat_certs_len;
+	__u64 amd_certs_uaddr;
+	__u32 amd_certs_len;
+	__u64 session_uaddr;
+	__u32 session_len;
+};
+
 #define KVM_DEV_ASSIGN_ENABLE_IOMMU	(1 << 0)
 #define KVM_DEV_ASSIGN_PCI_2_3		(1 << 1)
 #define KVM_DEV_ASSIGN_MASK_INTX	(1 << 2)
-- 
2.17.1



* [PATCH 02/12]  KVM: SVM: Add KVM_SEND_UPDATE_DATA command
  2020-02-13  1:14 [PATCH 00/12] SEV Live Migration Patchset Ashish Kalra
  2020-02-13  1:14 ` [PATCH 01/12] KVM: SVM: Add KVM_SEV SEND_START command Ashish Kalra
@ 2020-02-13  1:15 ` Ashish Kalra
  2020-03-10  1:04   ` Steve Rutherford
  2020-02-13  1:16 ` [PATCH 03/12] KVM: SVM: Add KVM_SEV_SEND_FINISH command Ashish Kalra
                   ` (10 subsequent siblings)
  12 siblings, 1 reply; 49+ messages in thread
From: Ashish Kalra @ 2020-02-13  1:15 UTC (permalink / raw)
  To: pbonzini
  Cc: tglx, mingo, hpa, rkrcmar, joro, bp, thomas.lendacky, rientjes,
	x86, kvm, linux-kernel

From: Brijesh Singh <brijesh.singh@amd.com>

The command is used for encrypting the guest memory region using the encryption
context created with KVM_SEV_SEND_START.
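
As with SEND_START, passing hdr_len == 0 or trans_len == 0 makes the
command report the required header and transport buffer lengths instead of
encrypting anything, so a VMM would typically query once and then loop
over the private pages. A sketch, again using the hypothetical
sev_vm_cmd() wrapper (guest_page_hva is an assumed page-aligned HVA of the
guest page being migrated; error handling omitted):

        struct kvm_sev_send_update_data upd = {};

        /* Query hdr_len and trans_len. */
        sev_vm_cmd(vm_fd, sev_fd, KVM_SEV_SEND_UPDATE_DATA, &upd);

        upd.hdr_uaddr = (uintptr_t)malloc(upd.hdr_len);
        upd.trans_uaddr = (uintptr_t)malloc(upd.trans_len);
        upd.guest_uaddr = (uintptr_t)guest_page_hva;
        upd.guest_len = 4096;   /* may not cross a page boundary */

        /* Encrypt one guest page into the transport buffer. */
        sev_vm_cmd(vm_fd, sev_fd, KVM_SEV_SEND_UPDATE_DATA, &upd);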

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: x86@kernel.org
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
 .../virt/kvm/amd-memory-encryption.rst        |  24 ++++
 arch/x86/kvm/svm.c                            | 136 +++++++++++++++++-
 include/uapi/linux/kvm.h                      |   9 ++
 3 files changed, 165 insertions(+), 4 deletions(-)

diff --git a/Documentation/virt/kvm/amd-memory-encryption.rst b/Documentation/virt/kvm/amd-memory-encryption.rst
index 826911f41f3b..0f1c3860360f 100644
--- a/Documentation/virt/kvm/amd-memory-encryption.rst
+++ b/Documentation/virt/kvm/amd-memory-encryption.rst
@@ -265,6 +265,30 @@ Returns: 0 on success, -negative on error
                 __u32 session_len;
         };
 
+11. KVM_SEV_SEND_UPDATE_DATA
+----------------------------
+
+The KVM_SEV_SEND_UPDATE_DATA command can be used by the hypervisor to encrypt the
+outgoing guest memory region with the encryption context created using
+KVM_SEV_SEND_START.
+
+Parameters (in): struct kvm_sev_send_update_data
+
+Returns: 0 on success, -negative on error
+
+::
+
+        struct kvm_sev_send_update_data {
+                __u64 hdr_uaddr;        /* userspace address containing the packet header */
+                __u32 hdr_len;
+
+                __u64 guest_uaddr;      /* the source memory region to be encrypted */
+                __u32 guest_len;
+
+                __u64 trans_uaddr;      /* the destination memory region */
+                __u32 trans_len;
+        };
+
 References
 ==========
 
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 3a7e2cac51de..ae97f774e979 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -426,6 +426,7 @@ static DECLARE_RWSEM(sev_deactivate_lock);
 static DEFINE_MUTEX(sev_bitmap_lock);
 static unsigned int max_sev_asid;
 static unsigned int min_sev_asid;
+static unsigned long sev_me_mask;
 static unsigned long *sev_asid_bitmap;
 static unsigned long *sev_reclaim_asid_bitmap;
 #define __sme_page_pa(x) __sme_set(page_to_pfn(x) << PAGE_SHIFT)
@@ -1231,16 +1232,22 @@ static int avic_ga_log_notifier(u32 ga_tag)
 static __init int sev_hardware_setup(void)
 {
 	struct sev_user_data_status *status;
+	unsigned int eax, ebx;
 	int rc;
 
-	/* Maximum number of encrypted guests supported simultaneously */
-	max_sev_asid = cpuid_ecx(0x8000001F);
+	/*
+	 * Query the memory encryption information.
+	 *  EBX:  Bits 5:0 - pagetable bit position used to indicate encryption
+	 *  (aka the C-bit).
+	 *  ECX:  Maximum number of encrypted guests supported simultaneously.
+	 *  EDX:  Minimum ASID value that should be used for SEV guest.
+	 */
+	cpuid(0x8000001f, &eax, &ebx, &max_sev_asid, &min_sev_asid);
 
 	if (!max_sev_asid)
 		return 1;
 
-	/* Minimum ASID value that should be used for SEV guest */
-	min_sev_asid = cpuid_edx(0x8000001F);
+	sev_me_mask = 1UL << (ebx & 0x3f);
 
 	/* Initialize SEV ASID bitmaps */
 	sev_asid_bitmap = bitmap_zalloc(max_sev_asid, GFP_KERNEL);
@@ -7262,6 +7269,124 @@ static int sev_send_start(struct kvm *kvm, struct kvm_sev_cmd *argp)
 	return ret;
 }
 
+/* Userspace wants to query either header or trans length. */
+static int
+__sev_send_update_data_query_lengths(struct kvm *kvm, struct kvm_sev_cmd *argp,
+				     struct kvm_sev_send_update_data *params)
+{
+	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+	struct sev_data_send_update_data *data;
+	int ret;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL_ACCOUNT);
+	if (!data)
+		return -ENOMEM;
+
+	data->handle = sev->handle;
+	ret = sev_issue_cmd(kvm, SEV_CMD_SEND_UPDATE_DATA, data, &argp->error);
+
+	params->hdr_len = data->hdr_len;
+	params->trans_len = data->trans_len;
+
+	if (copy_to_user((void __user *)(uintptr_t)argp->data, params,
+			 sizeof(struct kvm_sev_send_update_data)))
+		ret = -EFAULT;
+
+	kfree(data);
+	return ret;
+}
+
+static int sev_send_update_data(struct kvm *kvm, struct kvm_sev_cmd *argp)
+{
+	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+	struct sev_data_send_update_data *data;
+	struct kvm_sev_send_update_data params;
+	void *hdr, *trans_data;
+	struct page **guest_page;
+	unsigned long n;
+	int ret, offset;
+
+	if (!sev_guest(kvm))
+		return -ENOTTY;
+
+	if (copy_from_user(&params, (void __user *)(uintptr_t)argp->data,
+			sizeof(struct kvm_sev_send_update_data)))
+		return -EFAULT;
+
+	/* userspace wants to query either header or trans length */
+	if (!params.trans_len || !params.hdr_len)
+		return __sev_send_update_data_query_lengths(kvm, argp, &params);
+
+	if (!params.trans_uaddr || !params.guest_uaddr ||
+	    !params.guest_len || !params.hdr_uaddr)
+		return -EINVAL;
+
+
+	/* Check if we are crossing the page boundary */
+	offset = params.guest_uaddr & (PAGE_SIZE - 1);
+	if ((params.guest_len + offset > PAGE_SIZE))
+		return -EINVAL;
+
+	/* Pin guest memory */
+	guest_page = sev_pin_memory(kvm, params.guest_uaddr & PAGE_MASK,
+				    PAGE_SIZE, &n, 0);
+	if (!guest_page)
+		return -EFAULT;
+
+	/* allocate memory for header and transport buffer */
+	ret = -ENOMEM;
+	hdr = kmalloc(params.hdr_len, GFP_KERNEL_ACCOUNT);
+	if (!hdr)
+		goto e_unpin;
+
+	trans_data = kmalloc(params.trans_len, GFP_KERNEL_ACCOUNT);
+	if (!trans_data)
+		goto e_free_hdr;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data)
+		goto e_free_trans_data;
+
+	data->hdr_address = __psp_pa(hdr);
+	data->hdr_len = params.hdr_len;
+	data->trans_address = __psp_pa(trans_data);
+	data->trans_len = params.trans_len;
+
+	/* The SEND_UPDATE_DATA command requires C-bit to be always set. */
+	data->guest_address = (page_to_pfn(guest_page[0]) << PAGE_SHIFT) +
+				offset;
+	data->guest_address |= sev_me_mask;
+	data->guest_len = params.guest_len;
+	data->handle = sev->handle;
+
+	ret = sev_issue_cmd(kvm, SEV_CMD_SEND_UPDATE_DATA, data, &argp->error);
+
+	if (ret)
+		goto e_free;
+
+	/* copy transport buffer to user space */
+	if (copy_to_user((void __user *)(uintptr_t)params.trans_uaddr,
+			 trans_data, params.trans_len)) {
+		ret = -EFAULT;
+		goto e_free;
+	}
+
+	/* Copy packet header to userspace. */
+	ret = copy_to_user((void __user *)(uintptr_t)params.hdr_uaddr, hdr,
+				params.hdr_len) ? -EFAULT : 0;
+
+e_free:
+	kfree(data);
+e_free_trans_data:
+	kfree(trans_data);
+e_free_hdr:
+	kfree(hdr);
+e_unpin:
+	sev_unpin_memory(kvm, guest_page, n);
+
+	return ret;
+}
+
 static int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
 {
 	struct kvm_sev_cmd sev_cmd;
@@ -7306,6 +7431,9 @@ static int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
 	case KVM_SEV_SEND_START:
 		r = sev_send_start(kvm, &sev_cmd);
 		break;
+	case KVM_SEV_SEND_UPDATE_DATA:
+		r = sev_send_update_data(kvm, &sev_cmd);
+		break;
 	default:
 		r = -EINVAL;
 		goto out;
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 17bef4c245e1..d9dc81bb9c55 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1570,6 +1570,15 @@ struct kvm_sev_send_start {
 	__u32 session_len;
 };
 
+struct kvm_sev_send_update_data {
+	__u64 hdr_uaddr;
+	__u32 hdr_len;
+	__u64 guest_uaddr;
+	__u32 guest_len;
+	__u64 trans_uaddr;
+	__u32 trans_len;
+};
+
 #define KVM_DEV_ASSIGN_ENABLE_IOMMU	(1 << 0)
 #define KVM_DEV_ASSIGN_PCI_2_3		(1 << 1)
 #define KVM_DEV_ASSIGN_MASK_INTX	(1 << 2)
-- 
2.17.1



* [PATCH 03/12] KVM: SVM: Add KVM_SEV_SEND_FINISH command
  2020-02-13  1:14 [PATCH 00/12] SEV Live Migration Patchset Ashish Kalra
  2020-02-13  1:14 ` [PATCH 01/12] KVM: SVM: Add KVM_SEV SEND_START command Ashish Kalra
  2020-02-13  1:15 ` [PATCH 02/12] KVM: SVM: Add KVM_SEND_UPDATE_DATA command Ashish Kalra
@ 2020-02-13  1:16 ` Ashish Kalra
  2020-03-10  1:09   ` Steve Rutherford
  2020-02-13  1:16 ` [PATCH 04/12] KVM: SVM: Add support for KVM_SEV_RECEIVE_START command Ashish Kalra
                   ` (9 subsequent siblings)
  12 siblings, 1 reply; 49+ messages in thread
From: Ashish Kalra @ 2020-02-13  1:16 UTC (permalink / raw)
  To: pbonzini
  Cc: tglx, mingo, hpa, rkrcmar, joro, bp, thomas.lendacky, rientjes,
	x86, kvm, linux-kernel

From: Brijesh Singh <brijesh.singh@amd.com>

The command is used to finalize the encryption context created with the
KVM_SEV_SEND_START command.
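
The command takes no parameters beyond the implicit guest handle, so
userspace simply issues (hypothetical sev_vm_cmd() wrapper as before):

        sev_vm_cmd(vm_fd, sev_fd, KVM_SEV_SEND_FINISH, NULL);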

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: x86@kernel.org
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
 .../virt/kvm/amd-memory-encryption.rst        |  8 +++++++
 arch/x86/kvm/svm.c                            | 23 +++++++++++++++++++
 2 files changed, 31 insertions(+)

diff --git a/Documentation/virt/kvm/amd-memory-encryption.rst b/Documentation/virt/kvm/amd-memory-encryption.rst
index 0f1c3860360f..f22f09ad72bd 100644
--- a/Documentation/virt/kvm/amd-memory-encryption.rst
+++ b/Documentation/virt/kvm/amd-memory-encryption.rst
@@ -289,6 +289,14 @@ Returns: 0 on success, -negative on error
                 __u32 trans_len;
         };
 
+12. KVM_SEV_SEND_FINISH
+------------------------
+
+After completion of the migration flow, the KVM_SEV_SEND_FINISH command can be
+issued by the hypervisor to delete the encryption context.
+
+Returns: 0 on success, -negative on error
+
 References
 ==========
 
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index ae97f774e979..c55c1865f9e0 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -7387,6 +7387,26 @@ static int sev_send_update_data(struct kvm *kvm, struct kvm_sev_cmd *argp)
 	return ret;
 }
 
+static int sev_send_finish(struct kvm *kvm, struct kvm_sev_cmd *argp)
+{
+	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+	struct sev_data_send_finish *data;
+	int ret;
+
+	if (!sev_guest(kvm))
+		return -ENOTTY;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
+
+	data->handle = sev->handle;
+	ret = sev_issue_cmd(kvm, SEV_CMD_SEND_FINISH, data, &argp->error);
+
+	kfree(data);
+	return ret;
+}
+
 static int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
 {
 	struct kvm_sev_cmd sev_cmd;
@@ -7434,6 +7454,9 @@ static int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
 	case KVM_SEV_SEND_UPDATE_DATA:
 		r = sev_send_update_data(kvm, &sev_cmd);
 		break;
+	case KVM_SEV_SEND_FINISH:
+		r = sev_send_finish(kvm, &sev_cmd);
+		break;
 	default:
 		r = -EINVAL;
 		goto out;
-- 
2.17.1



* [PATCH 04/12] KVM: SVM: Add support for KVM_SEV_RECEIVE_START command
  2020-02-13  1:14 [PATCH 00/12] SEV Live Migration Patchset Ashish Kalra
                   ` (2 preceding siblings ...)
  2020-02-13  1:16 ` [PATCH 03/12] KVM: SVM: Add KVM_SEV_SEND_FINISH command Ashish Kalra
@ 2020-02-13  1:16 ` Ashish Kalra
  2020-03-10  1:41   ` Steve Rutherford
  2020-02-13  1:16 ` [PATCH 05/12] KVM: SVM: Add KVM_SEV_RECEIVE_UPDATE_DATA command Ashish Kalra
                   ` (8 subsequent siblings)
  12 siblings, 1 reply; 49+ messages in thread
From: Ashish Kalra @ 2020-02-13  1:16 UTC (permalink / raw)
  To: pbonzini
  Cc: tglx, mingo, hpa, rkrcmar, joro, bp, thomas.lendacky, rientjes,
	x86, kvm, linux-kernel

From: Brijesh Singh <brijesh.singh@amd.com>

The command is used to create the encryption context for an incoming
SEV guest. The encryption context can be later used by the hypervisor
to import the incoming data into the SEV guest memory space.
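
On the destination, the PDH certificate and session blob produced by
SEND_START on the source are fed back in, and the firmware handle comes
back out. A sketch using the hypothetical sev_vm_cmd() wrapper (the blobs
are assumed to have arrived over the migration channel):

        struct kvm_sev_receive_start rstart = {
                .handle = 0,            /* let firmware assign one */
                .policy = policy,
                .pdh_uaddr = (uintptr_t)pdh_blob,
                .pdh_len = pdh_blob_len,
                .session_uaddr = (uintptr_t)session_blob,
                .session_len = session_blob_len,
        };

        sev_vm_cmd(vm_fd, sev_fd, KVM_SEV_RECEIVE_START, &rstart);
        /* On success, rstart.handle holds the new guest handle. */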

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: x86@kernel.org
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
 .../virt/kvm/amd-memory-encryption.rst        | 29 +++++++
 arch/x86/kvm/svm.c                            | 81 +++++++++++++++++++
 include/uapi/linux/kvm.h                      |  9 +++
 3 files changed, 119 insertions(+)

diff --git a/Documentation/virt/kvm/amd-memory-encryption.rst b/Documentation/virt/kvm/amd-memory-encryption.rst
index f22f09ad72bd..4b882fb681fa 100644
--- a/Documentation/virt/kvm/amd-memory-encryption.rst
+++ b/Documentation/virt/kvm/amd-memory-encryption.rst
@@ -297,6 +297,35 @@ issued by the hypervisor to delete the encryption context.
 
 Returns: 0 on success, -negative on error
 
+13. KVM_SEV_RECEIVE_START
+-------------------------
+
+The KVM_SEV_RECEIVE_START command is used for creating the memory encryption
+context for an incoming SEV guest. To create the encryption context, the user must
+provide a guest policy, the platform public Diffie-Hellman (PDH) key and session
+information.
+
+Parameters: struct  kvm_sev_receive_start (in/out)
+
+Returns: 0 on success, -negative on error
+
+::
+
+        struct kvm_sev_receive_start {
+                __u32 handle;           /* if zero then firmware creates a new handle */
+                __u32 policy;           /* guest's policy */
+
+                __u64 pdh_uaddr;        /* userspace address pointing to the PDH key */
+                __u32 pdh_len;
+
+                __u64 session_uaddr;    /* userspace address which points to the guest session information */
+                __u32 session_len;
+        };
+
+On success, the 'handle' field contains a new handle; on error, a negative value is returned.
+
+For more details, see SEV spec Section 6.12.
+
 References
 ==========
 
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index c55c1865f9e0..3b766f386c84 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -7407,6 +7407,84 @@ static int sev_send_finish(struct kvm *kvm, struct kvm_sev_cmd *argp)
 	return ret;
 }
 
+static int sev_receive_start(struct kvm *kvm, struct kvm_sev_cmd *argp)
+{
+	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+	struct sev_data_receive_start *start;
+	struct kvm_sev_receive_start params;
+	int *error = &argp->error;
+	void *session_data;
+	void *pdh_data;
+	int ret;
+
+	if (!sev_guest(kvm))
+		return -ENOTTY;
+
+	/* Get parameters from userspace */
+	if (copy_from_user(&params, (void __user *)(uintptr_t)argp->data,
+			sizeof(struct kvm_sev_receive_start)))
+		return -EFAULT;
+
+	/* some sanity checks */
+	if (!params.pdh_uaddr || !params.pdh_len ||
+	    !params.session_uaddr || !params.session_len)
+		return -EINVAL;
+
+	pdh_data = psp_copy_user_blob(params.pdh_uaddr, params.pdh_len);
+	if (IS_ERR(pdh_data))
+		return PTR_ERR(pdh_data);
+
+	session_data = psp_copy_user_blob(params.session_uaddr,
+			params.session_len);
+	if (IS_ERR(session_data)) {
+		ret = PTR_ERR(session_data);
+		goto e_free_pdh;
+	}
+
+	ret = -ENOMEM;
+	start = kzalloc(sizeof(*start), GFP_KERNEL);
+	if (!start)
+		goto e_free_session;
+
+	start->handle = params.handle;
+	start->policy = params.policy;
+	start->pdh_cert_address = __psp_pa(pdh_data);
+	start->pdh_cert_len = params.pdh_len;
+	start->session_address = __psp_pa(session_data);
+	start->session_len = params.session_len;
+
+	/* create memory encryption context */
+	ret = __sev_issue_cmd(argp->sev_fd, SEV_CMD_RECEIVE_START, start,
+				error);
+	if (ret)
+		goto e_free;
+
+	/* Bind ASID to this guest */
+	ret = sev_bind_asid(kvm, start->handle, error);
+	if (ret)
+		goto e_free;
+
+	params.handle = start->handle;
+	if (copy_to_user((void __user *)(uintptr_t)argp->data,
+			 &params, sizeof(struct kvm_sev_receive_start))) {
+		ret = -EFAULT;
+		sev_unbind_asid(kvm, start->handle);
+		goto e_free;
+	}
+
+	sev->handle = start->handle;
+	sev->fd = argp->sev_fd;
+
+e_free:
+	kfree(start);
+e_free_session:
+	kfree(session_data);
+e_free_pdh:
+	kfree(pdh_data);
+
+	return ret;
+}
+
 static int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
 {
 	struct kvm_sev_cmd sev_cmd;
@@ -7457,6 +7535,9 @@ static int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
 	case KVM_SEV_SEND_FINISH:
 		r = sev_send_finish(kvm, &sev_cmd);
 		break;
+	case KVM_SEV_RECEIVE_START:
+		r = sev_receive_start(kvm, &sev_cmd);
+		break;
 	default:
 		r = -EINVAL;
 		goto out;
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index d9dc81bb9c55..74764b9db5fa 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1579,6 +1579,15 @@ struct kvm_sev_send_update_data {
 	__u32 trans_len;
 };
 
+struct kvm_sev_receive_start {
+	__u32 handle;
+	__u32 policy;
+	__u64 pdh_uaddr;
+	__u32 pdh_len;
+	__u64 session_uaddr;
+	__u32 session_len;
+};
+
 #define KVM_DEV_ASSIGN_ENABLE_IOMMU	(1 << 0)
 #define KVM_DEV_ASSIGN_PCI_2_3		(1 << 1)
 #define KVM_DEV_ASSIGN_MASK_INTX	(1 << 2)
-- 
2.17.1



* [PATCH 05/12] KVM: SVM: Add KVM_SEV_RECEIVE_UPDATE_DATA command
  2020-02-13  1:14 [PATCH 00/12] SEV Live Migration Patchset Ashish Kalra
                   ` (3 preceding siblings ...)
  2020-02-13  1:16 ` [PATCH 04/12] KVM: SVM: Add support for KVM_SEV_RECEIVE_START command Ashish Kalra
@ 2020-02-13  1:16 ` Ashish Kalra
  2020-02-13  1:16 ` [PATCH 06/12] KVM: SVM: Add KVM_SEV_RECEIVE_FINISH command Ashish Kalra
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 49+ messages in thread
From: Ashish Kalra @ 2020-02-13  1:16 UTC (permalink / raw)
  To: pbonzini
  Cc: tglx, mingo, hpa, rkrcmar, joro, bp, thomas.lendacky, rientjes,
	x86, kvm, linux-kernel

From: Brijesh Singh <brijesh.singh@amd.com>

The command is used for copying the incoming buffer into the
SEV guest memory space.
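
Note that, like SEND_UPDATE_DATA, a single call may not cross a guest page
boundary (guest_len plus the in-page offset must fit within one page), so
the receiver works page by page. A sketch using the hypothetical
sev_vm_cmd() wrapper (hdr/trans are the per-page blobs received from the
source side):

        struct kvm_sev_receive_update_data rupd = {
                .hdr_uaddr = (uintptr_t)hdr,
                .hdr_len = hdr_len,
                .trans_uaddr = (uintptr_t)trans,
                .trans_len = trans_len,
                .guest_uaddr = (uintptr_t)guest_page_hva,
                .guest_len = 4096,
        };

        sev_vm_cmd(vm_fd, sev_fd, KVM_SEV_RECEIVE_UPDATE_DATA, &rupd);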

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: x86@kernel.org
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
 .../virt/kvm/amd-memory-encryption.rst        | 24 ++++++
 arch/x86/kvm/svm.c                            | 79 +++++++++++++++++++
 include/uapi/linux/kvm.h                      |  9 +++
 3 files changed, 112 insertions(+)

diff --git a/Documentation/virt/kvm/amd-memory-encryption.rst b/Documentation/virt/kvm/amd-memory-encryption.rst
index 4b882fb681fa..52fca9e258dc 100644
--- a/Documentation/virt/kvm/amd-memory-encryption.rst
+++ b/Documentation/virt/kvm/amd-memory-encryption.rst
@@ -326,6 +326,30 @@ On success, the 'handle' field contains a new handle and on error, a negative va
 
 For more details, see SEV spec Section 6.12.
 
+14. KVM_SEV_RECEIVE_UPDATE_DATA
+-------------------------------
+
+The KVM_SEV_RECEIVE_UPDATE_DATA command can be used by the hypervisor to copy
+the incoming buffers into the guest memory region with the encryption context
+created during KVM_SEV_RECEIVE_START.
+
+Parameters (in): struct kvm_sev_receive_update_data
+
+Returns: 0 on success, -negative on error
+
+::
+
+        struct kvm_sev_receive_update_data {
+                __u64 hdr_uaddr;        /* userspace address containing the packet header */
+                __u32 hdr_len;
+
+                __u64 guest_uaddr;      /* the destination guest memory region */
+                __u32 guest_len;
+
+                __u64 trans_uaddr;      /* the incoming buffer memory region */
+                __u32 trans_len;
+        };
+
 References
 ==========
 
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 3b766f386c84..907c59ca74ad 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -7485,6 +7485,82 @@ static int sev_receive_start(struct kvm *kvm, struct kvm_sev_cmd *argp)
 	return ret;
 }
 
+static int sev_receive_update_data(struct kvm *kvm, struct kvm_sev_cmd *argp)
+{
+	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+	struct kvm_sev_receive_update_data params;
+	struct sev_data_receive_update_data *data;
+	void *hdr = NULL, *trans = NULL;
+	struct page **guest_page;
+	unsigned long n;
+	int ret, offset;
+
+	if (!sev_guest(kvm))
+		return -EINVAL;
+
+	if (copy_from_user(&params, (void __user *)(uintptr_t)argp->data,
+			sizeof(struct kvm_sev_receive_update_data)))
+		return -EFAULT;
+
+	if (!params.hdr_uaddr || !params.hdr_len ||
+	    !params.guest_uaddr || !params.guest_len ||
+	    !params.trans_uaddr || !params.trans_len)
+		return -EINVAL;
+
+	/* Check if we are crossing the page boundary */
+	offset = params.guest_uaddr & (PAGE_SIZE - 1);
+	if ((params.guest_len + offset > PAGE_SIZE))
+		return -EINVAL;
+
+	hdr = psp_copy_user_blob(params.hdr_uaddr, params.hdr_len);
+	if (IS_ERR(hdr))
+		return PTR_ERR(hdr);
+
+	trans = psp_copy_user_blob(params.trans_uaddr, params.trans_len);
+	if (IS_ERR(trans)) {
+		ret = PTR_ERR(trans);
+		goto e_free_hdr;
+	}
+
+	ret = -ENOMEM;
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data)
+		goto e_free_trans;
+
+	data->hdr_address = __psp_pa(hdr);
+	data->hdr_len = params.hdr_len;
+	data->trans_address = __psp_pa(trans);
+	data->trans_len = params.trans_len;
+
+	/* Pin guest memory */
+	ret = -EFAULT;
+	guest_page = sev_pin_memory(kvm, params.guest_uaddr & PAGE_MASK,
+				    PAGE_SIZE, &n, 0);
+	if (!guest_page)
+		goto e_free;
+
+	/* The RECEIVE_UPDATE_DATA command requires C-bit to be always set. */
+	data->guest_address = (page_to_pfn(guest_page[0]) << PAGE_SHIFT) +
+				offset;
+	data->guest_address |= sev_me_mask;
+	data->guest_len = params.guest_len;
+	data->handle = sev->handle;
+
+	ret = sev_issue_cmd(kvm, SEV_CMD_RECEIVE_UPDATE_DATA, data,
+				&argp->error);
+
+	sev_unpin_memory(kvm, guest_page, n);
+
+e_free:
+	kfree(data);
+e_free_trans:
+	kfree(trans);
+e_free_hdr:
+	kfree(hdr);
+
+	return ret;
+}
+
 static int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
 {
 	struct kvm_sev_cmd sev_cmd;
@@ -7538,6 +7614,9 @@ static int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
 	case KVM_SEV_RECEIVE_START:
 		r = sev_receive_start(kvm, &sev_cmd);
 		break;
+	case KVM_SEV_RECEIVE_UPDATE_DATA:
+		r = sev_receive_update_data(kvm, &sev_cmd);
+		break;
 	default:
 		r = -EINVAL;
 		goto out;
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 74764b9db5fa..4e80c57a3182 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1588,6 +1588,15 @@ struct kvm_sev_receive_start {
 	__u32 session_len;
 };
 
+struct kvm_sev_receive_update_data {
+	__u64 hdr_uaddr;
+	__u32 hdr_len;
+	__u64 guest_uaddr;
+	__u32 guest_len;
+	__u64 trans_uaddr;
+	__u32 trans_len;
+};
+
 #define KVM_DEV_ASSIGN_ENABLE_IOMMU	(1 << 0)
 #define KVM_DEV_ASSIGN_PCI_2_3		(1 << 1)
 #define KVM_DEV_ASSIGN_MASK_INTX	(1 << 2)
-- 
2.17.1



* [PATCH 06/12] KVM: SVM: Add KVM_SEV_RECEIVE_FINISH command
  2020-02-13  1:14 [PATCH 00/12] SEV Live Migration Patchset Ashish Kalra
                   ` (4 preceding siblings ...)
  2020-02-13  1:16 ` [PATCH 05/12] KVM: SVM: Add KVM_SEV_RECEIVE_UPDATE_DATA command Ashish Kalra
@ 2020-02-13  1:16 ` Ashish Kalra
  2020-02-13  1:17 ` [PATCH 07/12] KVM: x86: Add AMD SEV specific Hypercall3 Ashish Kalra
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 49+ messages in thread
From: Ashish Kalra @ 2020-02-13  1:16 UTC (permalink / raw)
  To: pbonzini
  Cc: tglx, mingo, hpa, rkrcmar, joro, bp, thomas.lendacky, rientjes,
	x86, kvm, linux-kernel

From: Brijesh Singh <brijesh.singh@amd.com>

The command finalizes the guest receiving process and makes the SEV guest
ready for execution.
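
As with KVM_SEV_SEND_FINISH, no parameters are needed beyond the implicit
guest handle (hypothetical sev_vm_cmd() wrapper as before):

        sev_vm_cmd(vm_fd, sev_fd, KVM_SEV_RECEIVE_FINISH, NULL);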

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: x86@kernel.org
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
 .../virt/kvm/amd-memory-encryption.rst        |  8 +++++++
 arch/x86/kvm/svm.c                            | 23 +++++++++++++++++++
 2 files changed, 31 insertions(+)

diff --git a/Documentation/virt/kvm/amd-memory-encryption.rst b/Documentation/virt/kvm/amd-memory-encryption.rst
index 52fca9e258dc..4dbcb22bcd55 100644
--- a/Documentation/virt/kvm/amd-memory-encryption.rst
+++ b/Documentation/virt/kvm/amd-memory-encryption.rst
@@ -350,6 +350,14 @@ Returns: 0 on success, -negative on error
                 __u32 trans_len;
         };
 
+15. KVM_SEV_RECEIVE_FINISH
+--------------------------
+
+After completion of the migration flow, the KVM_SEV_RECEIVE_FINISH command can be
+issued by the hypervisor to make the guest ready for execution.
+
+Returns: 0 on success, -negative on error
+
 References
 ==========
 
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 907c59ca74ad..d86b02bece3a 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -7561,6 +7561,26 @@ static int sev_receive_update_data(struct kvm *kvm, struct kvm_sev_cmd *argp)
 	return ret;
 }
 
+static int sev_receive_finish(struct kvm *kvm, struct kvm_sev_cmd *argp)
+{
+	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+	struct sev_data_receive_finish *data;
+	int ret;
+
+	if (!sev_guest(kvm))
+		return -ENOTTY;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
+
+	data->handle = sev->handle;
+	ret = sev_issue_cmd(kvm, SEV_CMD_RECEIVE_FINISH, data, &argp->error);
+
+	kfree(data);
+	return ret;
+}
+
 static int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
 {
 	struct kvm_sev_cmd sev_cmd;
@@ -7617,6 +7637,9 @@ static int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
 	case KVM_SEV_RECEIVE_UPDATE_DATA:
 		r = sev_receive_update_data(kvm, &sev_cmd);
 		break;
+	case KVM_SEV_RECEIVE_FINISH:
+		r = sev_receive_finish(kvm, &sev_cmd);
+		break;
 	default:
 		r = -EINVAL;
 		goto out;
-- 
2.17.1



* [PATCH 07/12] KVM: x86: Add AMD SEV specific Hypercall3
  2020-02-13  1:14 [PATCH 00/12] SEV Live Migration Patchset Ashish Kalra
                   ` (5 preceding siblings ...)
  2020-02-13  1:16 ` [PATCH 06/12] KVM: SVM: Add KVM_SEV_RECEIVE_FINISH command Ashish Kalra
@ 2020-02-13  1:17 ` Ashish Kalra
  2020-02-13  1:17 ` [PATCH 08/12] KVM: X86: Introduce KVM_HC_PAGE_ENC_STATUS hypercall Ashish Kalra
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 49+ messages in thread
From: Ashish Kalra @ 2020-02-13  1:17 UTC (permalink / raw)
  To: pbonzini
  Cc: tglx, mingo, hpa, rkrcmar, joro, bp, thomas.lendacky, rientjes,
	x86, kvm, linux-kernel

From: Brijesh Singh <brijesh.singh@amd.com>

The KVM hypercall framework relies on the alternatives framework to patch
VMCALL -> VMMCALL on AMD platforms. If a hypercall is made before
apply_alternatives() is called, it defaults to VMCALL. That works fine for
a non-SEV guest: the VMCALL raises #UD, and the hypervisor can decode the
instruction and do the right thing. But when SEV is active, guest memory
is encrypted with the guest key, and the hypervisor is unable to decode
the instruction bytes.

Add an SEV-specific hypercall3 that unconditionally uses VMMCALL. The
hypercall will be used by the SEV guest to notify the hypervisor about
page encryption status changes.
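
For contrast, the existing generic helpers in this header emit
KVM_HYPERCALL, which is patched at runtime via the alternatives framework
(roughly ALTERNATIVE("vmcall", "vmmcall", X86_FEATURE_VMMCALL)); the new
helper hardcodes VMMCALL so that it is safe to use before
apply_alternatives() runs. For example (KVM_HC_PAGE_ENC_STATUS is
introduced later in this series):

        /* Mark @npages starting at guest physical @pa as decrypted. */
        kvm_sev_hypercall3(KVM_HC_PAGE_ENC_STATUS, pa, npages, 0);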

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: x86@kernel.org
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
 arch/x86/include/asm/kvm_para.h | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h
index 9b4df6eaa11a..6c09255633a4 100644
--- a/arch/x86/include/asm/kvm_para.h
+++ b/arch/x86/include/asm/kvm_para.h
@@ -84,6 +84,18 @@ static inline long kvm_hypercall4(unsigned int nr, unsigned long p1,
 	return ret;
 }
 
+static inline long kvm_sev_hypercall3(unsigned int nr, unsigned long p1,
+				      unsigned long p2, unsigned long p3)
+{
+	long ret;
+
+	asm volatile("vmmcall"
+		     : "=a"(ret)
+		     : "a"(nr), "b"(p1), "c"(p2), "d"(p3)
+		     : "memory");
+	return ret;
+}
+
 #ifdef CONFIG_KVM_GUEST
 bool kvm_para_available(void);
 unsigned int kvm_arch_para_features(void);
-- 
2.17.1



* [PATCH 08/12] KVM: X86: Introduce KVM_HC_PAGE_ENC_STATUS hypercall
  2020-02-13  1:14 [PATCH 00/12] SEV Live Migration Patchset Ashish Kalra
                   ` (6 preceding siblings ...)
  2020-02-13  1:17 ` [PATCH 07/12] KVM: x86: Add AMD SEV specific Hypercall3 Ashish Kalra
@ 2020-02-13  1:17 ` Ashish Kalra
  2020-02-20  2:39   ` Steve Rutherford
  2020-02-13  1:17 ` [PATCH 09/12] KVM: x86: Introduce KVM_GET_PAGE_ENC_BITMAP ioctl Ashish Kalra
                   ` (4 subsequent siblings)
  12 siblings, 1 reply; 49+ messages in thread
From: Ashish Kalra @ 2020-02-13  1:17 UTC (permalink / raw)
  To: pbonzini
  Cc: tglx, mingo, hpa, rkrcmar, joro, bp, thomas.lendacky, rientjes,
	x86, kvm, linux-kernel

From: Brijesh Singh <brijesh.singh@amd.com>

This hypercall is used by the SEV guest to notify the hypervisor of a
change in the page encryption status. The hypercall should be invoked
only when the encryption attribute is changed from encrypted -> decrypted
and vice versa. By default, all guest pages are considered encrypted.
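
From the guest, the call goes through the SEV-safe helper added earlier in
the series; for example, a guest marking a freshly shared (decrypted)
region would do roughly:

        /* gpa/npages describe the region; a2 == 0 means "decrypted". */
        kvm_sev_hypercall3(KVM_HC_PAGE_ENC_STATUS, gpa, npages, 0);

On the host side the request lands in svm_page_enc_status_hc() below,
which records the range in a per-VM bitmap with one bit per gfn.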

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: x86@kernel.org
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
 Documentation/virt/kvm/hypercalls.txt | 14 ++++
 arch/x86/include/asm/kvm_host.h       |  2 +
 arch/x86/kvm/svm.c                    | 94 +++++++++++++++++++++++++++
 arch/x86/kvm/vmx/vmx.c                |  1 +
 arch/x86/kvm/x86.c                    |  6 ++
 include/uapi/linux/kvm_para.h         |  1 +
 6 files changed, 118 insertions(+)

diff --git a/Documentation/virt/kvm/hypercalls.txt b/Documentation/virt/kvm/hypercalls.txt
index 5f6d291bd004..8ff0e4adcb13 100644
--- a/Documentation/virt/kvm/hypercalls.txt
+++ b/Documentation/virt/kvm/hypercalls.txt
@@ -152,3 +152,17 @@ a0: destination APIC ID
 
 Usage example: When sending a call-function IPI-many to vCPUs, yield if
 any of the IPI target vCPUs was preempted.
+
+8. KVM_HC_PAGE_ENC_STATUS
+-------------------------
+Architecture: x86
+Status: active
+Purpose: Notify the hypervisor of page encryption status changes in the guest page table (SEV guest)
+
+a0: the guest physical address of the start page
+a1: the number of pages
+a2: encryption attribute
+
+   Where:
+	* 1: Encryption attribute is set
+	* 0: Encryption attribute is cleared
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 4dffbc10d3f8..4ae7293033b2 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1256,6 +1256,8 @@ struct kvm_x86_ops {
 
 	bool (*apic_init_signal_blocked)(struct kvm_vcpu *vcpu);
 	int (*enable_direct_tlbflush)(struct kvm_vcpu *vcpu);
+	int (*page_enc_status_hc)(struct kvm *kvm, unsigned long gpa,
+				  unsigned long sz, unsigned long mode);
 };
 
 struct kvm_arch_async_pf {
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index d86b02bece3a..f09791109075 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -134,6 +134,8 @@ struct kvm_sev_info {
 	int fd;			/* SEV device fd */
 	unsigned long pages_locked; /* Number of pages locked */
 	struct list_head regions_list;  /* List of registered regions */
+	unsigned long *page_enc_bmap;
+	unsigned long page_enc_bmap_size;
 };
 
 struct kvm_svm {
@@ -1992,6 +1994,9 @@ static void sev_vm_destroy(struct kvm *kvm)
 
 	sev_unbind_asid(kvm, sev->handle);
 	sev_asid_free(sev->asid);
+
+	kvfree(sev->page_enc_bmap);
+	sev->page_enc_bmap = NULL;
 }
 
 static void avic_vm_destroy(struct kvm *kvm)
@@ -7581,6 +7586,93 @@ static int sev_receive_finish(struct kvm *kvm, struct kvm_sev_cmd *argp)
 	return ret;
 }
 
+static int sev_resize_page_enc_bitmap(struct kvm *kvm, unsigned long new_size)
+{
+	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+	unsigned long *map;
+	unsigned long sz;
+
+	if (sev->page_enc_bmap_size >= new_size)
+		return 0;
+
+	sz = ALIGN(new_size, BITS_PER_LONG) / 8;
+
+	map = vmalloc(sz);
+	if (!map) {
+		pr_err_once("Failed to allocate encrypted bitmap size %lx\n",
+				sz);
+		return -ENOMEM;
+	}
+
+	/* mark all pages encrypted (by default) */
+	memset(map, 0xff, sz);
+
+	bitmap_copy(map, sev->page_enc_bmap, sev->page_enc_bmap_size);
+	kvfree(sev->page_enc_bmap);
+
+	sev->page_enc_bmap = map;
+	sev->page_enc_bmap_size = new_size;
+
+	return 0;
+}
+
+static int svm_page_enc_status_hc(struct kvm *kvm, unsigned long gpa,
+				  unsigned long npages, unsigned long enc)
+{
+	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+	gfn_t gfn_start, gfn_end;
+	kvm_pfn_t pfn_start, pfn_end;
+	int ret;
+
+	if (!sev_guest(kvm))
+		return -EINVAL;
+
+	if (!npages)
+		return 0;
+
+	gfn_start = gpa_to_gfn(gpa);
+	gfn_end = gfn_start + npages;
+
+	/* overflow / out-of-bounds check */
+	if (gfn_end <= gfn_start)
+		return -EINVAL;
+
+	/* let's make sure the gpa range exists in our memslots */
+	pfn_start = gfn_to_pfn(kvm, gfn_start);
+	pfn_end = gfn_to_pfn(kvm, gfn_end);
+
+	if (is_error_noslot_pfn(pfn_start) && !is_noslot_pfn(pfn_start)) {
+		/*
+		 * A noslot pfn is guest MMIO and is still allowed into the
+		 * page encryption bitmap; fail only on a real error pfn.
+		 */
+		return -EINVAL;
+	}
+
+	if (is_error_noslot_pfn(pfn_end) && !is_noslot_pfn(pfn_end)) {
+		/*
+		 * Same as above: a noslot pfn is guest MMIO and may be
+		 * tracked in the bitmap; fail only on a real error pfn.
+		 */
+		return -EINVAL;
+	}
+
+	mutex_lock(&kvm->lock);
+	ret = sev_resize_page_enc_bitmap(kvm, gfn_end);
+	if (ret)
+		goto unlock;
+
+	if (enc)
+		__bitmap_set(sev->page_enc_bmap, gfn_start,
+				gfn_end - gfn_start);
+	else
+		__bitmap_clear(sev->page_enc_bmap, gfn_start,
+				gfn_end - gfn_start);
+
+unlock:
+	mutex_unlock(&kvm->lock);
+	return ret;
+}
+
 static int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
 {
 	struct kvm_sev_cmd sev_cmd;
@@ -7972,6 +8064,8 @@ static struct kvm_x86_ops svm_x86_ops __ro_after_init = {
 	.need_emulation_on_page_fault = svm_need_emulation_on_page_fault,
 
 	.apic_init_signal_blocked = svm_apic_init_signal_blocked,
+
+	.page_enc_status_hc = svm_page_enc_status_hc,
 };
 
 static int __init svm_init(void)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 9a6664886f2e..7963f2979fdf 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7879,6 +7879,7 @@ static struct kvm_x86_ops vmx_x86_ops __ro_after_init = {
 	.nested_get_evmcs_version = NULL,
 	.need_emulation_on_page_fault = vmx_need_emulation_on_page_fault,
 	.apic_init_signal_blocked = vmx_apic_init_signal_blocked,
+	.page_enc_status_hc = NULL,
 };
 
 static void vmx_cleanup_l1d_flush(void)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index fbabb2f06273..298627fa3d39 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7547,6 +7547,12 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
 		kvm_sched_yield(vcpu->kvm, a0);
 		ret = 0;
 		break;
+	case KVM_HC_PAGE_ENC_STATUS:
+		ret = -KVM_ENOSYS;
+		if (kvm_x86_ops->page_enc_status_hc)
+			ret = kvm_x86_ops->page_enc_status_hc(vcpu->kvm,
+					a0, a1, a2);
+		break;
 	default:
 		ret = -KVM_ENOSYS;
 		break;
diff --git a/include/uapi/linux/kvm_para.h b/include/uapi/linux/kvm_para.h
index 8b86609849b9..847b83b75dc8 100644
--- a/include/uapi/linux/kvm_para.h
+++ b/include/uapi/linux/kvm_para.h
@@ -29,6 +29,7 @@
 #define KVM_HC_CLOCK_PAIRING		9
 #define KVM_HC_SEND_IPI		10
 #define KVM_HC_SCHED_YIELD		11
+#define KVM_HC_PAGE_ENC_STATUS		12
 
 /*
  * hypercalls use architecture specific
-- 
2.17.1



* [PATCH 09/12] KVM: x86: Introduce KVM_GET_PAGE_ENC_BITMAP ioctl
  2020-02-13  1:14 [PATCH 00/12] SEV Live Migration Patchset Ashish Kalra
                   ` (7 preceding siblings ...)
  2020-02-13  1:17 ` [PATCH 08/12] KVM: X86: Introduce KVM_HC_PAGE_ENC_STATUS hypercall Ashish Kalra
@ 2020-02-13  1:17 ` Ashish Kalra
  2020-02-27 17:57   ` Venu Busireddy
  2020-02-13  1:18 ` [PATCH 10/12] mm: x86: Invoke hypercall when page encryption status is changed Ashish Kalra
                   ` (3 subsequent siblings)
  12 siblings, 1 reply; 49+ messages in thread
From: Ashish Kalra @ 2020-02-13  1:17 UTC (permalink / raw)
  To: pbonzini
  Cc: tglx, mingo, hpa, rkrcmar, joro, bp, thomas.lendacky, rientjes,
	x86, kvm, linux-kernel

From: Brijesh Singh <brijesh.singh@amd.com>

The ioctl can be used to retrieve the page encryption bitmap for a given
gfn range.
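
A sketch of the userspace side (illustrative only; a real VMM would size
the request from the memslot it is about to migrate):

        struct kvm_page_enc_bitmap b = {
                .start_gfn = 0,
                .num_pages = num_pages,
        };

        /* One bit per page, rounded up to whole 64-bit words. */
        b.enc_bitmap = calloc((num_pages + 63) / 64, 8);

        if (!ioctl(vm_fd, KVM_GET_PAGE_ENC_BITMAP, &b)) {
                /*
                 * A set bit means the page is private (encrypted) and
                 * must be sent with the SEV SEND_* commands; a clear
                 * bit means it can be migrated as ordinary memory.
                 */
        }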

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: x86@kernel.org
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
 Documentation/virt/kvm/api.txt  | 27 +++++++++++++++++++++
 arch/x86/include/asm/kvm_host.h |  2 ++
 arch/x86/kvm/svm.c              | 43 +++++++++++++++++++++++++++++++++
 arch/x86/kvm/x86.c              | 12 +++++++++
 include/uapi/linux/kvm.h        | 12 +++++++++
 5 files changed, 96 insertions(+)

diff --git a/Documentation/virt/kvm/api.txt b/Documentation/virt/kvm/api.txt
index c6e1ce5d40de..053aecfabe74 100644
--- a/Documentation/virt/kvm/api.txt
+++ b/Documentation/virt/kvm/api.txt
@@ -4213,6 +4213,33 @@ the clear cpu reset definition in the POP. However, the cpu is not put
 into ESA mode. This reset is a superset of the initial reset.
 
 
+4.120 KVM_GET_PAGE_ENC_BITMAP (vm ioctl)
+
+Capability: basic
+Architectures: x86
+Type: vm ioctl
+Parameters: struct kvm_page_enc_bitmap (in/out)
+Returns: 0 on success, -1 on error
+
+/* for KVM_GET_PAGE_ENC_BITMAP */
+struct kvm_page_enc_bitmap {
+	__u64 start_gfn;
+	__u64 num_pages;
+	union {
+		void __user *enc_bitmap; /* one bit per page */
+		__u64 padding2;
+	};
+};
+
+Encrypted VMs have the concept of private and shared pages. A private
+page is encrypted with the guest-specific key, while a shared page may
+be encrypted with the hypervisor key. The KVM_GET_PAGE_ENC_BITMAP ioctl
+can be used to get a bitmap indicating whether a guest page is private
+or shared. The bitmap can be used during guest migration: if a page
+is private, userspace needs to use the SEV migration commands to
+transmit it.
+
+
 5. The kvm_run structure
 ------------------------
 
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 4ae7293033b2..a6882c5214b4 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1258,6 +1258,8 @@ struct kvm_x86_ops {
 	int (*enable_direct_tlbflush)(struct kvm_vcpu *vcpu);
 	int (*page_enc_status_hc)(struct kvm *kvm, unsigned long gpa,
 				  unsigned long sz, unsigned long mode);
+	int (*get_page_enc_bitmap)(struct kvm *kvm,
+				struct kvm_page_enc_bitmap *bmap);
 };
 
 struct kvm_arch_async_pf {
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index f09791109075..f1c8806a97c6 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -7673,6 +7673,48 @@ static int svm_page_enc_status_hc(struct kvm *kvm, unsigned long gpa,
 	return ret;
 }
 
+static int svm_get_page_enc_bitmap(struct kvm *kvm,
+				   struct kvm_page_enc_bitmap *bmap)
+{
+	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+	unsigned long gfn_start, gfn_end;
+	unsigned long *bitmap;
+	unsigned long sz, i;
+	int ret;
+
+	if (!sev_guest(kvm))
+		return -ENOTTY;
+
+	gfn_start = bmap->start_gfn;
+	gfn_end = gfn_start + bmap->num_pages;
+
+	sz = ALIGN(bmap->num_pages, BITS_PER_LONG) / 8;
+	bitmap = kmalloc(sz, GFP_KERNEL);
+	if (!bitmap)
+		return -ENOMEM;
+
+	/* by default all pages are marked encrypted */
+	memset(bitmap, 0xff, sz);
+
+	mutex_lock(&kvm->lock);
+	if (sev->page_enc_bmap) {
+		i = gfn_start;
+		for_each_clear_bit_from(i, sev->page_enc_bmap,
+				      min(sev->page_enc_bmap_size, gfn_end))
+			clear_bit(i - gfn_start, bitmap);
+	}
+	mutex_unlock(&kvm->lock);
+
+	ret = -EFAULT;
+	if (copy_to_user(bmap->enc_bitmap, bitmap, sz))
+		goto out;
+
+	ret = 0;
+out:
+	kfree(bitmap);
+	return ret;
+}
+
 static int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
 {
 	struct kvm_sev_cmd sev_cmd;
@@ -8066,6 +8108,7 @@ static struct kvm_x86_ops svm_x86_ops __ro_after_init = {
 	.apic_init_signal_blocked = svm_apic_init_signal_blocked,
 
 	.page_enc_status_hc = svm_page_enc_status_hc,
+	.get_page_enc_bitmap = svm_get_page_enc_bitmap,
 };
 
 static int __init svm_init(void)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 298627fa3d39..e955f886ee17 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5213,6 +5213,18 @@ long kvm_arch_vm_ioctl(struct file *filp,
 	case KVM_SET_PMU_EVENT_FILTER:
 		r = kvm_vm_ioctl_set_pmu_event_filter(kvm, argp);
 		break;
+	case KVM_GET_PAGE_ENC_BITMAP: {
+		struct kvm_page_enc_bitmap bitmap;
+
+		r = -EFAULT;
+		if (copy_from_user(&bitmap, argp, sizeof(bitmap)))
+			goto out;
+
+		r = -ENOTTY;
+		if (kvm_x86_ops->get_page_enc_bitmap)
+			r = kvm_x86_ops->get_page_enc_bitmap(kvm, &bitmap);
+		break;
+	}
 	default:
 		r = -ENOTTY;
 	}
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 4e80c57a3182..9377b26c5f4e 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -500,6 +500,16 @@ struct kvm_dirty_log {
 	};
 };
 
+/* for KVM_GET_PAGE_ENC_BITMAP */
+struct kvm_page_enc_bitmap {
+	__u64 start_gfn;
+	__u64 num_pages;
+	union {
+		void __user *enc_bitmap; /* one bit per page */
+		__u64 padding2;
+	};
+};
+
 /* for KVM_CLEAR_DIRTY_LOG */
 struct kvm_clear_dirty_log {
 	__u32 slot;
@@ -1478,6 +1488,8 @@ struct kvm_enc_region {
 #define KVM_S390_NORMAL_RESET	_IO(KVMIO,   0xc3)
 #define KVM_S390_CLEAR_RESET	_IO(KVMIO,   0xc4)
 
+#define KVM_GET_PAGE_ENC_BITMAP	_IOW(KVMIO, 0xc2, struct kvm_page_enc_bitmap)
+
 /* Secure Encrypted Virtualization command */
 enum sev_cmd_id {
 	/* Guest initialization commands */
-- 
2.17.1



* [PATCH 10/12] mm: x86: Invoke hypercall when page encryption status is changed
  2020-02-13  1:14 [PATCH 00/12] SEV Live Migration Patchset Ashish Kalra
                   ` (8 preceding siblings ...)
  2020-02-13  1:17 ` [PATCH 09/12] KVM: x86: Introduce KVM_GET_PAGE_ENC_BITMAP ioctl Ashish Kalra
@ 2020-02-13  1:18 ` Ashish Kalra
  2020-02-13  5:42   ` Andy Lutomirski
  2020-02-20  1:58   ` Steve Rutherford
  2020-02-13  1:18 ` [PATCH 11/12] KVM: x86: Introduce KVM_SET_PAGE_ENC_BITMAP ioctl Ashish Kalra
                   ` (2 subsequent siblings)
  12 siblings, 2 replies; 49+ messages in thread
From: Ashish Kalra @ 2020-02-13  1:18 UTC (permalink / raw)
  To: pbonzini
  Cc: tglx, mingo, hpa, rkrcmar, joro, bp, thomas.lendacky, rientjes,
	x86, kvm, linux-kernel

From: Brijesh Singh <brijesh.singh@amd.com>

Invoke a hypercall when a memory region is changed from encrypted ->
decrypted and vice versa. The hypervisor needs to know the page encryption
status during guest migration.
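
With this in place, the runtime C-bit flips notify KVM as a side effect.
For example, a guest driver setting up a shared DMA buffer via

        /* SEV guests route shared-buffer setup through this helper. */
        set_memory_decrypted((unsigned long)vaddr, npages);

now also results in a KVM_HC_PAGE_ENC_STATUS hypercall for the affected
pfn range, through the new page_encryption_changed() paravirt hook and
set_memory_enc_dec_hypercall().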

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: x86@kernel.org
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
 arch/x86/include/asm/paravirt.h       |  6 +++
 arch/x86/include/asm/paravirt_types.h |  2 +
 arch/x86/kernel/paravirt.c            |  1 +
 arch/x86/mm/mem_encrypt.c             | 57 ++++++++++++++++++++++++++-
 arch/x86/mm/pat/set_memory.c          |  7 ++++
 5 files changed, 72 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index 86e7317eb31f..407104613b06 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -78,6 +78,12 @@ static inline void paravirt_arch_exit_mmap(struct mm_struct *mm)
 	PVOP_VCALL1(mmu.exit_mmap, mm);
 }
 
+static inline void page_encryption_changed(unsigned long vaddr, int npages,
+						bool enc)
+{
+	PVOP_VCALL3(mmu.page_encryption_changed, vaddr, npages, enc);
+}
+
 #ifdef CONFIG_PARAVIRT_XXL
 static inline void load_sp0(unsigned long sp0)
 {
diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
index 84812964d3dd..5ff03ac9a5f8 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -211,6 +211,8 @@ struct pv_mmu_ops {
 
 	/* Hook for intercepting the destruction of an mm_struct. */
 	void (*exit_mmap)(struct mm_struct *mm);
+	void (*page_encryption_changed)(unsigned long vaddr, int npages,
+					bool enc);
 
 #ifdef CONFIG_PARAVIRT_XXL
 	struct paravirt_callee_save read_cr2;
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index 789f5e4f89de..8953447f327c 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -362,6 +362,7 @@ struct paravirt_patch_template pv_ops = {
 			(void (*)(struct mmu_gather *, void *))tlb_remove_page,
 
 	.mmu.exit_mmap		= paravirt_nop,
+	.mmu.page_encryption_changed	= paravirt_nop,
 
 #ifdef CONFIG_PARAVIRT_XXL
 	.mmu.read_cr2		= __PV_IS_CALLEE_SAVE(native_read_cr2),
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index f4bd4b431ba1..c9800fa811f6 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -19,6 +19,7 @@
 #include <linux/kernel.h>
 #include <linux/bitops.h>
 #include <linux/dma-mapping.h>
+#include <linux/kvm_para.h>
 
 #include <asm/tlbflush.h>
 #include <asm/fixmap.h>
@@ -29,6 +30,7 @@
 #include <asm/processor-flags.h>
 #include <asm/msr.h>
 #include <asm/cmdline.h>
+#include <asm/kvm_para.h>
 
 #include "mm_internal.h"
 
@@ -196,6 +198,47 @@ void __init sme_early_init(void)
 		swiotlb_force = SWIOTLB_FORCE;
 }
 
+static void set_memory_enc_dec_hypercall(unsigned long vaddr, int npages,
+					bool enc)
+{
+	unsigned long sz = npages << PAGE_SHIFT;
+	unsigned long vaddr_end, vaddr_next;
+
+	vaddr_end = vaddr + sz;
+
+	for (; vaddr < vaddr_end; vaddr = vaddr_next) {
+		int psize, pmask, level;
+		unsigned long pfn;
+		pte_t *kpte;
+
+		kpte = lookup_address(vaddr, &level);
+		if (!kpte || pte_none(*kpte))
+			return;
+
+		switch (level) {
+		case PG_LEVEL_4K:
+			pfn = pte_pfn(*kpte);
+			break;
+		case PG_LEVEL_2M:
+			pfn = pmd_pfn(*(pmd_t *)kpte);
+			break;
+		case PG_LEVEL_1G:
+			pfn = pud_pfn(*(pud_t *)kpte);
+			break;
+		default:
+			return;
+		}
+
+		psize = page_level_size(level);
+		pmask = page_level_mask(level);
+
+		kvm_sev_hypercall3(KVM_HC_PAGE_ENC_STATUS,
+				   pfn << PAGE_SHIFT, psize >> PAGE_SHIFT, enc);
+
+		vaddr_next = (vaddr & pmask) + psize;
+	}
+}
+
 static void __init __set_clr_pte_enc(pte_t *kpte, int level, bool enc)
 {
 	pgprot_t old_prot, new_prot;
@@ -253,12 +296,13 @@ static void __init __set_clr_pte_enc(pte_t *kpte, int level, bool enc)
 static int __init early_set_memory_enc_dec(unsigned long vaddr,
 					   unsigned long size, bool enc)
 {
-	unsigned long vaddr_end, vaddr_next;
+	unsigned long vaddr_end, vaddr_next, start;
 	unsigned long psize, pmask;
 	int split_page_size_mask;
 	int level, ret;
 	pte_t *kpte;
 
+	start = vaddr;
 	vaddr_next = vaddr;
 	vaddr_end = vaddr + size;
 
@@ -313,6 +357,8 @@ static int __init early_set_memory_enc_dec(unsigned long vaddr,
 
 	ret = 0;
 
+	set_memory_enc_dec_hypercall(start, PAGE_ALIGN(size) >> PAGE_SHIFT,
+					enc);
 out:
 	__flush_tlb_all();
 	return ret;
@@ -451,6 +497,15 @@ void __init mem_encrypt_init(void)
 	if (sev_active())
 		static_branch_enable(&sev_enable_key);
 
+#ifdef CONFIG_PARAVIRT
+	/*
+	 * With SEV, we need to make a hypercall when page encryption state is
+	 * changed.
+	 */
+	if (sev_active())
+		pv_ops.mmu.page_encryption_changed = set_memory_enc_dec_hypercall;
+#endif
+
 	pr_info("AMD %s active\n",
 		sev_active() ? "Secure Encrypted Virtualization (SEV)"
 			     : "Secure Memory Encryption (SME)");
diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index c4aedd00c1ba..86b7804129fc 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -26,6 +26,7 @@
 #include <asm/proto.h>
 #include <asm/memtype.h>
 #include <asm/set_memory.h>
+#include <asm/paravirt.h>
 
 #include "../mm_internal.h"
 
@@ -1987,6 +1988,12 @@ static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc)
 	 */
 	cpa_flush(&cpa, 0);
 
+	/* Notify hypervisor that a given memory range is mapped encrypted
+	 * or decrypted. The hypervisor will use this information during the
+	 * VM migration.
+	 */
+	page_encryption_changed(addr, numpages, enc);
+
 	return ret;
 }
 
-- 
2.17.1
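
For readers looking at this patch in isolation: the kvm_sev_hypercall3()
helper used by set_memory_enc_dec_hypercall() above is added by patch 07
of this series. It is essentially a raw VMMCALL wrapper along the lines
of the sketch below (see patch 07 for the authoritative definition); a
raw VMMCALL is used because the generic kvm_hypercall3() relies on
alternatives patching to turn VMCALL into VMMCALL on AMD, which cannot
be relied upon for hypercalls made before apply_alternatives():

static inline long kvm_sev_hypercall3(unsigned int nr, unsigned long p1,
                                      unsigned long p2, unsigned long p3)
{
        long ret;

        asm volatile("vmmcall"
                     : "=a"(ret)
                     : "a"(nr), "b"(p1), "c"(p2), "d"(p3)
                     : "memory");
        return ret;
}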


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 11/12] KVM: x86: Introduce KVM_SET_PAGE_ENC_BITMAP ioctl
  2020-02-13  1:14 [PATCH 00/12] SEV Live Migration Patchset Ashish Kalra
                   ` (9 preceding siblings ...)
  2020-02-13  1:18 ` [PATCH 10/12] mm: x86: Invoke hypercall when page encryption status is changed Ashish Kalra
@ 2020-02-13  1:18 ` Ashish Kalra
  2020-02-13  1:18 ` [PATCH 12/12] KVM: x86: Introduce KVM_PAGE_ENC_BITMAP_RESET ioctl Ashish Kalra
  2020-02-13  5:43 ` [PATCH 00/12] SEV Live Migration Patchset Andy Lutomirski
  12 siblings, 0 replies; 49+ messages in thread
From: Ashish Kalra @ 2020-02-13  1:18 UTC (permalink / raw)
  To: pbonzini
  Cc: tglx, mingo, hpa, rkrcmar, joro, bp, thomas.lendacky, rientjes,
	x86, kvm, linux-kernel

From: Brijesh Singh <brijesh.singh@amd.com>

The ioctl can be used to set the page encryption bitmap for an
incoming guest.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: x86@kernel.org
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
 Documentation/virt/kvm/api.txt  | 21 +++++++++++++++++
 arch/x86/include/asm/kvm_host.h |  2 ++
 arch/x86/kvm/svm.c              | 42 +++++++++++++++++++++++++++++++++
 arch/x86/kvm/x86.c              | 12 ++++++++++
 include/uapi/linux/kvm.h        |  1 +
 5 files changed, 78 insertions(+)

diff --git a/Documentation/virt/kvm/api.txt b/Documentation/virt/kvm/api.txt
index 053aecfabe74..d4e29a457e80 100644
--- a/Documentation/virt/kvm/api.txt
+++ b/Documentation/virt/kvm/api.txt
@@ -4239,6 +4239,27 @@ or shared. The bitmap can be used during the guest migration, if the page
 is private then userspace need to use SEV migration commands to transmit
 the page.
 
+4.121 KVM_SET_PAGE_ENC_BITMAP (vm ioctl)
+
+Capability: basic
+Architectures: x86
+Type: vm ioctl
+Parameters: struct kvm_page_enc_bitmap (in/out)
+Returns: 0 on success, -1 on error
+
+/* for KVM_SET_PAGE_ENC_BITMAP */
+struct kvm_page_enc_bitmap {
+	__u64 start_gfn;
+	__u64 num_pages;
+	union {
+		void __user *enc_bitmap; /* one bit per page */
+		__u64 padding2;
+	};
+};
+
+During guest live migration the outgoing guest exports its page encryption
+bitmap, and KVM_SET_PAGE_ENC_BITMAP can be used to build the page encryption
+bitmap for an incoming guest.
 
 5. The kvm_run structure
 ------------------------
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index a6882c5214b4..698ea92290af 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1260,6 +1260,8 @@ struct kvm_x86_ops {
 				  unsigned long sz, unsigned long mode);
 	int (*get_page_enc_bitmap)(struct kvm *kvm,
 				struct kvm_page_enc_bitmap *bmap);
+	int (*set_page_enc_bitmap)(struct kvm *kvm,
+				struct kvm_page_enc_bitmap *bmap);
 };
 
 struct kvm_arch_async_pf {
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index f1c8806a97c6..a710a6a2d18c 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -7715,6 +7715,47 @@ static int svm_get_page_enc_bitmap(struct kvm *kvm,
 	return ret;
 }
 
+static int svm_set_page_enc_bitmap(struct kvm *kvm,
+				   struct kvm_page_enc_bitmap *bmap)
+{
+	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+	unsigned long gfn_start, gfn_end;
+	unsigned long *bitmap;
+	unsigned long sz, i;
+	int ret;
+
+	if (!sev_guest(kvm))
+		return -ENOTTY;
+
+	gfn_start = bmap->start_gfn;
+	gfn_end = gfn_start + bmap->num_pages;
+
+	sz = ALIGN(bmap->num_pages, BITS_PER_LONG) / 8;
+	bitmap = kmalloc(sz, GFP_KERNEL);
+	if (!bitmap)
+		return -ENOMEM;
+
+	ret = -EFAULT;
+	if (copy_from_user(bitmap, bmap->enc_bitmap, sz))
+		goto out;
+
+	mutex_lock(&kvm->lock);
+	ret = sev_resize_page_enc_bitmap(kvm, gfn_end);
+	if (ret)
+		goto unlock;
+
+	i = 0;	/* the user-supplied bitmap is relative to start_gfn */
+	for_each_clear_bit_from(i, bitmap, (gfn_end - gfn_start))
+		clear_bit(i + gfn_start, sev->page_enc_bmap);
+
+	ret = 0;
+unlock:
+	mutex_unlock(&kvm->lock);
+out:
+	kfree(bitmap);
+	return ret;
+}
+
 static int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
 {
 	struct kvm_sev_cmd sev_cmd;
@@ -8109,6 +8150,7 @@ static struct kvm_x86_ops svm_x86_ops __ro_after_init = {
 
 	.page_enc_status_hc = svm_page_enc_status_hc,
 	.get_page_enc_bitmap = svm_get_page_enc_bitmap,
+	.set_page_enc_bitmap = svm_set_page_enc_bitmap,
 };
 
 static int __init svm_init(void)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e955f886ee17..a1ac5b8c5cd7 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5225,6 +5225,18 @@ long kvm_arch_vm_ioctl(struct file *filp,
 			r = kvm_x86_ops->get_page_enc_bitmap(kvm, &bitmap);
 		break;
 	}
+	case KVM_SET_PAGE_ENC_BITMAP: {
+		struct kvm_page_enc_bitmap bitmap;
+
+		r = -EFAULT;
+		if (copy_from_user(&bitmap, argp, sizeof(bitmap)))
+			goto out;
+
+		r = -ENOTTY;
+		if (kvm_x86_ops->set_page_enc_bitmap)
+			r = kvm_x86_ops->set_page_enc_bitmap(kvm, &bitmap);
+		break;
+	}
 	default:
 		r = -ENOTTY;
 	}
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 9377b26c5f4e..2f36afd11e0e 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1489,6 +1489,7 @@ struct kvm_enc_region {
 #define KVM_S390_CLEAR_RESET	_IO(KVMIO,   0xc4)
 
 #define KVM_GET_PAGE_ENC_BITMAP	_IOW(KVMIO, 0xc2, struct kvm_page_enc_bitmap)
+#define KVM_SET_PAGE_ENC_BITMAP	_IOW(KVMIO, 0xc3, struct kvm_page_enc_bitmap)
 
 /* Secure Encrypted Virtualization command */
 enum sev_cmd_id {
-- 
2.17.1
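
For illustration, the receiving VMM could push the bitmap it obtained
from the source back into KVM roughly as follows (hypothetical sketch,
not part of the series):

/* Sketch: restore the page encryption bitmap on the migration target. */
static int restore_enc_bitmap(int vm_fd, __u64 start_gfn, __u64 num_pages,
                              unsigned long *bitmap)
{
        struct kvm_page_enc_bitmap bmap = {
                .start_gfn = start_gfn,
                .num_pages = num_pages,
                .enc_bitmap = bitmap,   /* as received from the source */
        };

        return ioctl(vm_fd, KVM_SET_PAGE_ENC_BITMAP, &bmap);
}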


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 12/12] KVM: x86: Introduce KVM_PAGE_ENC_BITMAP_RESET ioctl
  2020-02-13  1:14 [PATCH 00/12] SEV Live Migration Patchset Ashish Kalra
                   ` (10 preceding siblings ...)
  2020-02-13  1:18 ` [PATCH 11/12] KVM: x86: Introduce KVM_SET_PAGE_ENC_BITMAP ioctl Ashish Kalra
@ 2020-02-13  1:18 ` Ashish Kalra
  2020-02-13  5:43 ` [PATCH 00/12] SEV Live Migration Patchset Andy Lutomirski
  12 siblings, 0 replies; 49+ messages in thread
From: Ashish Kalra @ 2020-02-13  1:18 UTC (permalink / raw)
  To: pbonzini
  Cc: tglx, mingo, hpa, rkrcmar, joro, bp, thomas.lendacky, rientjes,
	x86, kvm, linux-kernel

From: Ashish Kalra <ashish.kalra@amd.com>

This ioctl can be used by the application to reset the page
encryption bitmap managed by the KVM driver. A typical usage
for this ioctl is on VM reboot; on reboot, we must reinitialize
the bitmap.

Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
 Documentation/virt/kvm/api.txt  | 11 +++++++++++
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/svm.c              | 16 ++++++++++++++++
 arch/x86/kvm/x86.c              |  6 ++++++
 include/uapi/linux/kvm.h        |  1 +
 5 files changed, 35 insertions(+)

diff --git a/Documentation/virt/kvm/api.txt b/Documentation/virt/kvm/api.txt
index d4e29a457e80..bf0fd3c2ea07 100644
--- a/Documentation/virt/kvm/api.txt
+++ b/Documentation/virt/kvm/api.txt
@@ -4261,6 +4261,17 @@ During guest live migration the outgoing guest exports its page encryption
 bitmap, and KVM_SET_PAGE_ENC_BITMAP can be used to build the page encryption
 bitmap for an incoming guest.
 
+4.122 KVM_PAGE_ENC_BITMAP_RESET (vm ioctl)
+
+Capability: basic
+Architectures: x86
+Type: vm ioctl
+Parameters: none
+Returns: 0 on success, -1 on error
+
+The KVM_PAGE_ENC_BITMAP_RESET ioctl is used to reset the guest's page encryption
+bitmap during guest reboot; this is only done on the guest's boot vCPU.
+
 5. The kvm_run structure
 ------------------------
 
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 698ea92290af..746c9c84d14a 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1262,6 +1262,7 @@ struct kvm_x86_ops {
 				struct kvm_page_enc_bitmap *bmap);
 	int (*set_page_enc_bitmap)(struct kvm *kvm,
 				struct kvm_page_enc_bitmap *bmap);
+	int (*reset_page_enc_bitmap)(struct kvm *kvm);
 };
 
 struct kvm_arch_async_pf {
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index a710a6a2d18c..1659539b1873 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -7756,6 +7756,21 @@ static int svm_set_page_enc_bitmap(struct kvm *kvm,
 	return ret;
 }
 
+static int svm_reset_page_enc_bitmap(struct kvm *kvm)
+{
+	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+
+	if (!sev_guest(kvm))
+		return -ENOTTY;
+
+	mutex_lock(&kvm->lock);
+	/* by default all pages should be marked encrypted */
+	if (sev->page_enc_bmap_size)
+		bitmap_fill(sev->page_enc_bmap, sev->page_enc_bmap_size);
+	mutex_unlock(&kvm->lock);
+	return 0;
+}
+
 static int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
 {
 	struct kvm_sev_cmd sev_cmd;
@@ -8151,6 +8166,7 @@ static struct kvm_x86_ops svm_x86_ops __ro_after_init = {
 	.page_enc_status_hc = svm_page_enc_status_hc,
 	.get_page_enc_bitmap = svm_get_page_enc_bitmap,
 	.set_page_enc_bitmap = svm_set_page_enc_bitmap,
+	.reset_page_enc_bitmap = svm_reset_page_enc_bitmap,
 };
 
 static int __init svm_init(void)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a1ac5b8c5cd7..eeb2a3dfeb02 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5237,6 +5237,12 @@ long kvm_arch_vm_ioctl(struct file *filp,
 			r = kvm_x86_ops->set_page_enc_bitmap(kvm, &bitmap);
 		break;
 	}
+	case KVM_PAGE_ENC_BITMAP_RESET: {
+		r = -ENOTTY;
+		if (kvm_x86_ops->reset_page_enc_bitmap)
+			r = kvm_x86_ops->reset_page_enc_bitmap(kvm);
+		break;
+	}
 	default:
 		r = -ENOTTY;
 	}
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 2f36afd11e0e..4001c22cb36b 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1490,6 +1490,7 @@ struct kvm_enc_region {
 
 #define KVM_GET_PAGE_ENC_BITMAP	_IOW(KVMIO, 0xc2, struct kvm_page_enc_bitmap)
 #define KVM_SET_PAGE_ENC_BITMAP	_IOW(KVMIO, 0xc3, struct kvm_page_enc_bitmap)
+#define KVM_PAGE_ENC_BITMAP_RESET	_IO(KVMIO, 0xc4)
 
 /* Secure Encrypted Virtualization command */
 enum sev_cmd_id {
-- 
2.17.1
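
For illustration, a VMM reboot path could reset the bitmap as follows
(hypothetical sketch, not part of the series):

/* Sketch: on guest reboot, mark all guest pages encrypted again. */
if (ioctl(vm_fd, KVM_PAGE_ENC_BITMAP_RESET, 0) < 0)
        perror("KVM_PAGE_ENC_BITMAP_RESET");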


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: [PATCH 10/12] mm: x86: Invoke hypercall when page encryption status is changed
  2020-02-13  1:18 ` [PATCH 10/12] mm: x86: Invoke hypercall when page encryption status is changed Ashish Kalra
@ 2020-02-13  5:42   ` Andy Lutomirski
  2020-02-13 22:28     ` Ashish Kalra
  2020-02-20  1:58   ` Steve Rutherford
  1 sibling, 1 reply; 49+ messages in thread
From: Andy Lutomirski @ 2020-02-13  5:42 UTC (permalink / raw)
  To: Ashish Kalra
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Radim Krcmar, Joerg Roedel, Borislav Petkov, Tom Lendacky,
	David Rientjes, X86 ML, kvm list, LKML

On Wed, Feb 12, 2020 at 5:18 PM Ashish Kalra <Ashish.Kalra@amd.com> wrote:
>
> From: Brijesh Singh <brijesh.singh@amd.com>
>
> Invoke a hypercall when a memory region is changed from encrypted ->
> decrypted and vice versa. The hypervisor needs to know the page
> encryption status during guest migration.

What happens if the guest memory status doesn't match what the
hypervisor thinks it is?  What happens if the guest gets migrated
between the hypercall and the associated flushes?

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 00/12] SEV Live Migration Patchset.
  2020-02-13  1:14 [PATCH 00/12] SEV Live Migration Patchset Ashish Kalra
                   ` (11 preceding siblings ...)
  2020-02-13  1:18 ` [PATCH 12/12] KVM: x86: Introduce KVM_PAGE_ENC_BITMAP_RESET ioctl Ashish Kalra
@ 2020-02-13  5:43 ` Andy Lutomirski
  2020-02-13 23:09   ` Ashish Kalra
  2020-02-14  2:10   ` Brijesh Singh
  12 siblings, 2 replies; 49+ messages in thread
From: Andy Lutomirski @ 2020-02-13  5:43 UTC (permalink / raw)
  To: Ashish Kalra
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Radim Krcmar, Joerg Roedel, Borislav Petkov, Tom Lendacky,
	David Rientjes, X86 ML, kvm list, LKML, Peter Zijlstra,
	Dave Hansen

On Wed, Feb 12, 2020 at 5:14 PM Ashish Kalra <Ashish.Kalra@amd.com> wrote:
>
> From: Ashish Kalra <ashish.kalra@amd.com>
>
> This patchset adds support for SEV Live Migration on KVM/QEMU.

I skimmed this all and I don't see any description of how this all works.

Does any of this address the mess in svm_register_enc_region()?  Right
now, when QEMU (or a QEMU alternative) wants to allocate some memory
to be used for guest encrypted pages, it mmap()s some memory and the
kernel does get_user_pages_fast() on it.  The pages are kept pinned
for the lifetime of the mapping.  This is not at all okay.  Let's see:

 - The memory is pinned and it doesn't play well with the Linux memory
management code.  You just wrote a big patch set to migrate the pages
to a whole different machines, but we apparently can't even migrate
them to a different NUMA node or even just a different address.  And
good luck swapping it out.

 - The memory is still mapped in the QEMU process, and that mapping is
incoherent with actual guest access to the memory.  It's nice that KVM
clflushes it so that, in principle, everything might actually work,
but this is gross.  We should not be exposing incoherent mappings to
userspace.

Perhaps all this fancy infrastructure you're writing for migration and
all this new API surface could also teach the kernel how to migrate
pages from a guest *to the same guest* so we don't need to pin pages
forever.  And perhaps you could put some thought into how to improve
the API so that it doesn't involve nonsensical incoherent mappings.

(To be blunt: if I had noticed how the SEV code worked before it was
merged, I would have NAKed it.  It's too late now to retroactively
remove it from the kernel, but perhaps we could try not to pile more
complexity on top of the unfortunate foundation we have.)

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 10/12] mm: x86: Invoke hypercall when page encryption status is changed
  2020-02-13  5:42   ` Andy Lutomirski
@ 2020-02-13 22:28     ` Ashish Kalra
  2020-02-14 18:56       ` Andy Lutomirski
  0 siblings, 1 reply; 49+ messages in thread
From: Ashish Kalra @ 2020-02-13 22:28 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Radim Krcmar, Joerg Roedel, Borislav Petkov, Tom Lendacky,
	David Rientjes, X86 ML, kvm list, LKML

On Wed, Feb 12, 2020 at 09:42:02PM -0800, Andy Lutomirski wrote:
>> On Wed, Feb 12, 2020 at 5:18 PM Ashish Kalra <Ashish.Kalra@amd.com> wrote:
>> >
>> > From: Brijesh Singh <brijesh.singh@amd.com>
> >
> > Invoke a hypercall when a memory region is changed from encrypted ->
> > decrypted and vice versa. The hypervisor needs to know the page
> > encryption status during guest migration.
>> 
>> What happens if the guest memory status doesn't match what the
>> hypervisor thinks it is?  What happens if the guest gets migrated
>> between the hypercall and the associated flushes?

This is basically the same as the dirty page tracking and logging done
during live migration. As with dirty page tracking and logging, we
maintain a page encryption bitmap in the kernel which keeps track of
the guest's page encrypted/decrypted state changes, and this bitmap is
sync'ed regularly from the kernel to QEMU and also during the live
migration process. Therefore any dirty pages whose encryption status
changes during migration will also have their page status updated when
the page encryption bitmap is sync'ed.

Also, I think that when the amount of dirty pages reaches a low
threshold, QEMU stops the source VM and then transfers all the
remaining dirty pages, so at that point there will also be a final sync
of the page encryption bitmap. There won't be any hypercalls after
this, as the source VM has been stopped and the remaining VM state gets
transferred.

Thanks,
Ashish

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 00/12] SEV Live Migration Patchset.
  2020-02-13  5:43 ` [PATCH 00/12] SEV Live Migration Patchset Andy Lutomirski
@ 2020-02-13 23:09   ` Ashish Kalra
  2020-02-14 18:58     ` Andy Lutomirski
  2020-02-14  2:10   ` Brijesh Singh
  1 sibling, 1 reply; 49+ messages in thread
From: Ashish Kalra @ 2020-02-13 23:09 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Radim Krcmar, Joerg Roedel, Borislav Petkov, Tom Lendacky,
	David Rientjes, X86 ML, kvm list, LKML, brijesh.singh

On Wed, Feb 12, 2020 at 09:43:41PM -0800, Andy Lutomirski wrote:
> On Wed, Feb 12, 2020 at 5:14 PM Ashish Kalra <Ashish.Kalra@amd.com> wrote:
> >
> > From: Ashish Kalra <ashish.kalra@amd.com>
> >
> > This patchset adds support for SEV Live Migration on KVM/QEMU.
> 
> I skimmed this all and I don't see any description of how this all works.
> 
> Does any of this address the mess in svm_register_enc_region()?  Right
> now, when QEMU (or a QEMU alternative) wants to allocate some memory
> to be used for guest encrypted pages, it mmap()s some memory and the
> kernel does get_user_pages_fast() on it.  The pages are kept pinned
> for the lifetime of the mapping.  This is not at all okay.  Let's see:
> 
>  - The memory is pinned and it doesn't play well with the Linux memory
> management code.  You just wrote a big patch set to migrate the pages
> to a whole different machines, but we apparently can't even migrate
> them to a different NUMA node or even just a different address.  And
> good luck swapping it out.
> 
>  - The memory is still mapped in the QEMU process, and that mapping is
> incoherent with actual guest access to the memory.  It's nice that KVM
> clflushes it so that, in principle, everything might actually work,
> but this is gross.  We should not be exposing incoherent mappings to
> userspace.
> 
> Perhaps all this fancy infrastructure you're writing for migration and
> all this new API surface could also teach the kernel how to migrate
> pages from a guest *to the same guest* so we don't need to pin pages
> forever.  And perhaps you could put some thought into how to improve
> the API so that it doesn't involve nonsensical incoherent mappings.

As a different key is used to encrypt memory in each VM, the hypervisor
can't simply copy the ciphertext from one VM to another to migrate
the VM.  Therefore, the AMD SEV Key Management API provides a new set
of functions which the hypervisor can use to package a guest page for
migration, while maintaining the confidentiality provided by AMD SEV.

There is a new page encryption bitmap created in the kernel which
keeps track of the encrypted/decrypted state of the guest's pages, and
this bitmap is updated by a new hypercall interface provided to the
guest kernel and firmware.

The KVM_GET_PAGE_ENC_BITMAP ioctl can be used to get the guest page
encryption bitmap. The bitmap can be used to check if the given guest
page is private or shared.

During the migration flow, the SEND_START is called on the source hypervisor
to create an outgoing encryption context. The SEV guest policy dictates whether
the certificate passed through the migrate-set-parameters command will be
validated. SEND_UPDATE_DATA is called to encrypt the guest private pages.
After migration is completed, SEND_FINISH is called to destroy the encryption
context and make the VM non-runnable to protect it against cloning.

On the target machine, RECEIVE_START is called first to create an
incoming encryption context. The RECEIVE_UPDATE_DATA is called to copy
the received encrypted page into guest memory. After migration has
completed, RECEIVE_FINISH is called to make the VM runnable.
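
To make the flow concrete, a rough source-side sketch (illustrative
pseudocode only; the helper names are placeholders, not the actual
QEMU integration):

        sev_send_start();                  /* SEND_START: outgoing context */
        for_each_page_to_transfer(gfn) {
                if (test_bit(gfn, enc_bitmap))
                        /* private page: firmware re-encrypts it with
                         * the transport key for the wire */
                        sev_send_update_data(gfn);
                else
                        /* shared page: migrate as plaintext, as usual */
                        send_plaintext_page(gfn);
        }
        sev_send_finish();                 /* SEND_FINISH: destroy context */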

Thanks,
Ashish

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 00/12] SEV Live Migration Patchset.
  2020-02-13  5:43 ` [PATCH 00/12] SEV Live Migration Patchset Andy Lutomirski
  2020-02-13 23:09   ` Ashish Kalra
@ 2020-02-14  2:10   ` Brijesh Singh
  1 sibling, 0 replies; 49+ messages in thread
From: Brijesh Singh @ 2020-02-14  2:10 UTC (permalink / raw)
  To: Ashish Kalra
  Cc: Andy Lutomirski, brijesh.singh, Paolo Bonzini, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, Radim Krcmar, Joerg Roedel,
	Borislav Petkov, Tom Lendacky, David Rientjes, X86 ML, kvm list,
	LKML, Peter Zijlstra, Dave Hansen

Hi Ashish,


On 2/12/20 11:43 PM, Andy Lutomirski wrote:
> On Wed, Feb 12, 2020 at 5:14 PM Ashish Kalra <Ashish.Kalra@amd.com> wrote:
>> From: Ashish Kalra <ashish.kalra@amd.com>
>>
>> This patchset adds support for SEV Live Migration on KVM/QEMU.


I think you should add some explanation in the cover letter so that we
get the full context of the patch set. You can see a reference
cover letter in previous submissions.

https://marc.info/?l=kvm&m=156278967226011&w=2

The patch series is also missing the revision number and change log;
this information helps reviewers know what has changed since the
previous version and which feedback has been addressed in this version.

Thanks

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 10/12] mm: x86: Invoke hypercall when page encryption status is changed
  2020-02-13 22:28     ` Ashish Kalra
@ 2020-02-14 18:56       ` Andy Lutomirski
  2020-02-14 20:36         ` Ashish Kalra
  0 siblings, 1 reply; 49+ messages in thread
From: Andy Lutomirski @ 2020-02-14 18:56 UTC (permalink / raw)
  To: Ashish Kalra
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Radim Krcmar, Joerg Roedel, Borislav Petkov, Tom Lendacky,
	David Rientjes, X86 ML, kvm list, LKML

On Thu, Feb 13, 2020 at 2:28 PM Ashish Kalra <ashish.kalra@amd.com> wrote:
>
> On Wed, Feb 12, 2020 at 09:42:02PM -0800, Andy Lutomirski wrote:
> >> On Wed, Feb 12, 2020 at 5:18 PM Ashish Kalra <Ashish.Kalra@amd.com> wrote:
> >> >
> >> > From: Brijesh Singh <brijesh.singh@amd.com>
> > >
> > > > Invoke a hypercall when a memory region is changed from encrypted ->
> > > > decrypted and vice versa. The hypervisor needs to know the page
> > > > encryption status during guest migration.
> >>
> >> What happens if the guest memory status doesn't match what the
> >> hypervisor thinks it is?  What happens if the guest gets migrated
> >> between the hypercall and the associated flushes?
>
> This is basically the same as the dirty page tracking and logging done
> during live migration. As with dirty page tracking and logging, we
> maintain a page encryption bitmap in the kernel which keeps track of
> the guest's page encrypted/decrypted state changes, and this bitmap is
> sync'ed regularly from the kernel to QEMU and also during the live
> migration process. Therefore any dirty pages whose encryption status
> changes during migration will also have their page status updated when
> the page encryption bitmap is sync'ed.
>
> Also, I think that when the amount of dirty pages reaches a low
> threshold, QEMU stops the source VM and then transfers all the
> remaining dirty pages, so at that point there will also be a final sync
> of the page encryption bitmap. There won't be any hypercalls after
> this, as the source VM has been stopped and the remaining VM state gets
> transferred.

And have you ensured that, in the inevitable race when a guest gets
migrated partway through an encryption state change, no data corruption
occurs?

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 00/12] SEV Live Migration Patchset.
  2020-02-13 23:09   ` Ashish Kalra
@ 2020-02-14 18:58     ` Andy Lutomirski
  2020-02-17 19:49       ` Ashish Kalra
  0 siblings, 1 reply; 49+ messages in thread
From: Andy Lutomirski @ 2020-02-14 18:58 UTC (permalink / raw)
  To: Ashish Kalra
  Cc: Andy Lutomirski, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Radim Krcmar, Joerg Roedel, Borislav Petkov,
	Tom Lendacky, David Rientjes, X86 ML, kvm list, LKML,
	Brijesh Singh

On Thu, Feb 13, 2020 at 3:09 PM Ashish Kalra <ashish.kalra@amd.com> wrote:
>
> On Wed, Feb 12, 2020 at 09:43:41PM -0800, Andy Lutomirski wrote:
> > On Wed, Feb 12, 2020 at 5:14 PM Ashish Kalra <Ashish.Kalra@amd.com> wrote:
> > >
> > > From: Ashish Kalra <ashish.kalra@amd.com>
> > >
> > > This patchset adds support for SEV Live Migration on KVM/QEMU.
> >
> > I skimmed this all and I don't see any description of how this all works.
> >
> > Does any of this address the mess in svm_register_enc_region()?  Right
> > now, when QEMU (or a QEMU alternative) wants to allocate some memory
> > to be used for guest encrypted pages, it mmap()s some memory and the
> > kernel does get_user_pages_fast() on it.  The pages are kept pinned
> > for the lifetime of the mapping.  This is not at all okay.  Let's see:
> >
> >  - The memory is pinned and it doesn't play well with the Linux memory
> > management code.  You just wrote a big patch set to migrate the pages
> > to a whole different machines, but we apparently can't even migrate
> > them to a different NUMA node or even just a different address.  And
> > good luck swapping it out.
> >
> >  - The memory is still mapped in the QEMU process, and that mapping is
> > incoherent with actual guest access to the memory.  It's nice that KVM
> > clflushes it so that, in principle, everything might actually work,
> > but this is gross.  We should not be exposing incoherent mappings to
> > userspace.
> >
> > Perhaps all this fancy infrastructure you're writing for migration and
> > all this new API surface could also teach the kernel how to migrate
> > pages from a guest *to the same guest* so we don't need to pin pages
> > forever.  And perhaps you could put some thought into how to improve
> > the API so that it doesn't involve nonsensical incoherent mappings.
>
> > As a different key is used to encrypt memory in each VM, the hypervisor
> > can't simply copy the ciphertext from one VM to another to migrate
> > the VM.  Therefore, the AMD SEV Key Management API provides a new set
> > of functions which the hypervisor can use to package a guest page for
> > migration, while maintaining the confidentiality provided by AMD SEV.
> >
> > There is a new page encryption bitmap created in the kernel which
> > keeps track of the encrypted/decrypted state of the guest's pages, and
> > this bitmap is updated by a new hypercall interface provided to the
> > guest kernel and firmware.
> >
> > The KVM_GET_PAGE_ENC_BITMAP ioctl can be used to get the guest page
> > encryption bitmap. The bitmap can be used to check if the given guest
> > page is private or shared.
>
> During the migration flow, the SEND_START is called on the source hypervisor
> to create an outgoing encryption context. The SEV guest policy dictates whether
> the certificate passed through the migrate-set-parameters command will be
> validated. SEND_UPDATE_DATA is called to encrypt the guest private pages.
> After migration is completed, SEND_FINISH is called to destroy the encryption
> context and make the VM non-runnable to protect it against cloning.
>
> On the target machine, RECEIVE_START is called first to create an
> incoming encryption context. The RECEIVE_UPDATE_DATA is called to copy
> the received encrypted page into guest memory. After migration has
> completed, RECEIVE_FINISH is called to make the VM runnable.
>

Thanks!  This belongs somewhere in the patch set.

You still haven't answered my questions about the existing coherency
issues and whether the same infrastructure can be used to migrate
guest pages within the same machine.

Also, you're making guest-side and host-side changes.  What ensures
that you don't try to migrate a guest that doesn't support the
hypercall for encryption state tracking?

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 10/12] mm: x86: Invoke hypercall when page encryption status is changed
  2020-02-14 18:56       ` Andy Lutomirski
@ 2020-02-14 20:36         ` Ashish Kalra
  0 siblings, 0 replies; 49+ messages in thread
From: Ashish Kalra @ 2020-02-14 20:36 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Radim Krcmar, Joerg Roedel, Borislav Petkov, Tom Lendacky,
	David Rientjes, X86 ML, kvm list, LKML, brijesh.singh

On Fri, Feb 14, 2020 at 10:56:53AM -0800, Andy Lutomirski wrote:
> On Thu, Feb 13, 2020 at 2:28 PM Ashish Kalra <ashish.kalra@amd.com> wrote:
> >
> > On Wed, Feb 12, 2020 at 09:42:02PM -0800, Andy Lutomirski wrote:
> > >> On Wed, Feb 12, 2020 at 5:18 PM Ashish Kalra <Ashish.Kalra@amd.com> wrote:
> > >> >
> > >> > From: Brijesh Singh <brijesh.singh@amd.com>
> > > >
> > > > Invoke a hypercall when a memory region is changed from encrypted ->
> > > > decrypted and vice versa. The hypervisor needs to know the page
> > > > encryption status during guest migration.
> > >>
> > >> What happens if the guest memory status doesn't match what the
> > >> hypervisor thinks it is?  What happens if the guest gets migrated
> > >> between the hypercall and the associated flushes?
> >
> > This is basically the same as the dirty page tracking and logging done
> > during live migration. As with dirty page tracking and logging, we
> > maintain a page encryption bitmap in the kernel which keeps track of
> > the guest's page encrypted/decrypted state changes, and this bitmap is
> > sync'ed regularly from the kernel to QEMU and also during the live
> > migration process. Therefore any dirty pages whose encryption status
> > changes during migration will also have their page status updated when
> > the page encryption bitmap is sync'ed.
> >
> > Also, I think that when the amount of dirty pages reaches a low
> > threshold, QEMU stops the source VM and then transfers all the
> > remaining dirty pages, so at that point there will also be a final sync
> > of the page encryption bitmap. There won't be any hypercalls after
> > this, as the source VM has been stopped and the remaining VM state gets
> > transferred.
> 
> And have you ensured that, in the inevitable race when a guest gets
> migrated partway through an encryption state change, no data
> corruption occurs?

We ensure that we send the page encryption state bitmap to the
destination VM at migration completion, when the remaining
RAM/dirty pages are flushed. Also, as the source VM is stopped before
this flush of the remaining blocks occurs, any encryption state change
hypercalls will have completed before that.

Thanks,
Ashish

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 00/12] SEV Live Migration Patchset.
  2020-02-14 18:58     ` Andy Lutomirski
@ 2020-02-17 19:49       ` Ashish Kalra
  2020-03-03  1:05         ` Steve Rutherford
  2020-03-19 13:05         ` Paolo Bonzini
  0 siblings, 2 replies; 49+ messages in thread
From: Ashish Kalra @ 2020-02-17 19:49 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Radim Krcmar, Joerg Roedel, Borislav Petkov, Tom Lendacky,
	David Rientjes, X86 ML, kvm list, LKML, Brijesh Singh

On Fri, Feb 14, 2020 at 10:58:46AM -0800, Andy Lutomirski wrote:
> On Thu, Feb 13, 2020 at 3:09 PM Ashish Kalra <ashish.kalra@amd.com> wrote:
> >
> > On Wed, Feb 12, 2020 at 09:43:41PM -0800, Andy Lutomirski wrote:
> > > On Wed, Feb 12, 2020 at 5:14 PM Ashish Kalra <Ashish.Kalra@amd.com> wrote:
> > > >
> > > > From: Ashish Kalra <ashish.kalra@amd.com>
> > > >
> > > > This patchset adds support for SEV Live Migration on KVM/QEMU.
> > >
> > > I skimmed this all and I don't see any description of how this all works.
> > >
> > > Does any of this address the mess in svm_register_enc_region()?  Right
> > > now, when QEMU (or a QEMU alternative) wants to allocate some memory
> > > to be used for guest encrypted pages, it mmap()s some memory and the
> > > kernel does get_user_pages_fast() on it.  The pages are kept pinned
> > > for the lifetime of the mapping.  This is not at all okay.  Let's see:
> > >
> > >  - The memory is pinned and it doesn't play well with the Linux memory
> > > management code.  You just wrote a big patch set to migrate the pages
> > > to a whole different machines, but we apparently can't even migrate
> > > them to a different NUMA node or even just a different address.  And
> > > good luck swapping it out.
> > >
> > >  - The memory is still mapped in the QEMU process, and that mapping is
> > > incoherent with actual guest access to the memory.  It's nice that KVM
> > > clflushes it so that, in principle, everything might actually work,
> > > but this is gross.  We should not be exposing incoherent mappings to
> > > userspace.
> > >
> > > Perhaps all this fancy infrastructure you're writing for migration and
> > > all this new API surface could also teach the kernel how to migrate
> > > pages from a guest *to the same guest* so we don't need to pin pages
> > > forever.  And perhaps you could put some thought into how to improve
> > > the API so that it doesn't involve nonsensical incoherent mappings.
> >
> > As a different key is used to encrypt memory in each VM, the hypervisor
> > can't simply copy the ciphertext from one VM to another to migrate
> > the VM.  Therefore, the AMD SEV Key Management API provides a new set
> > of functions which the hypervisor can use to package a guest page for
> > migration, while maintaining the confidentiality provided by AMD SEV.
> >
> > There is a new page encryption bitmap created in the kernel which
> > keeps track of the encrypted/decrypted state of the guest's pages, and
> > this bitmap is updated by a new hypercall interface provided to the
> > guest kernel and firmware.
> >
> > The KVM_GET_PAGE_ENC_BITMAP ioctl can be used to get the guest page
> > encryption bitmap. The bitmap can be used to check if the given guest
> > page is private or shared.
> >
> > During the migration flow, the SEND_START is called on the source hypervisor
> > to create an outgoing encryption context. The SEV guest policy dictates whether
> > the certificate passed through the migrate-set-parameters command will be
> > validated. SEND_UPDATE_DATA is called to encrypt the guest private pages.
> > After migration is completed, SEND_FINISH is called to destroy the encryption
> > context and make the VM non-runnable to protect it against cloning.
> >
> > On the target machine, RECEIVE_START is called first to create an
> > incoming encryption context. The RECEIVE_UPDATE_DATA is called to copy
> > the received encrypted page into guest memory. After migration has
> > completed, RECEIVE_FINISH is called to make the VM runnable.
> >
> 
> Thanks!  This belongs somewhere in the patch set.
> 
> You still haven't answered my questions about the existing coherency
> issues and whether the same infrastructure can be used to migrate
> guest pages within the same machine.

Page migration and live migration are separate features; one of my
colleagues is currently working on making page migration possible and
removing the SEV guest pinning requirements.
> 
> Also, you're making guest-side and host-side changes.  What ensures
> that you don't try to migrate a guest that doesn't support the
> hypercall for encryption state tracking?

This is a good question, and it is still open-ended. There
are two possibilities here: either the guest does not have any
unencrypted pages (e.g., when booting 32-bit) and so it does not make
any hypercalls, or the guest does not have support for the newer
hypercall.

In the first case, all the guest pages are then assumed to be
encrypted and live migration happens as such.

For the second case, we have been discussing this internally, and one
option is to extend the KVM capabilities/feature bits to check for this?
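
For instance, one hypothetical shape on the guest side (the feature name
KVM_FEATURE_SEV_LIVE_MIGRATION is made up here, not something KVM
defines today) would be to only register the paravirt hook when the host
advertises the feature:

        /* Sketch: wire up the hypercall only if the host supports it. */
        if (sev_active() &&
            kvm_para_has_feature(KVM_FEATURE_SEV_LIVE_MIGRATION))
                pv_ops.mmu.page_encryption_changed =
                                        set_memory_enc_dec_hypercall;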

Thanks,
Ashish

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 10/12] mm: x86: Invoke hypercall when page encryption status is changed
  2020-02-13  1:18 ` [PATCH 10/12] mm: x86: Invoke hypercall when page encryption status is changed Ashish Kalra
  2020-02-13  5:42   ` Andy Lutomirski
@ 2020-02-20  1:58   ` Steve Rutherford
  2020-02-20  2:12     ` Andy Lutomirski
  1 sibling, 1 reply; 49+ messages in thread
From: Steve Rutherford @ 2020-02-20  1:58 UTC (permalink / raw)
  To: Ashish Kalra
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Radim Krčmář,
	Joerg Roedel, Borislav Petkov, Tom Lendacky, David Rientjes, x86,
	KVM list, LKML

On Wed, Feb 12, 2020 at 5:18 PM Ashish Kalra <Ashish.Kalra@amd.com> wrote:
>
> From: Brijesh Singh <brijesh.singh@amd.com>
>
> Invoke a hypercall when a memory region is changed from encrypted ->
> decrypted and vice versa. The hypervisor needs to know the page
> encryption status during guest migration.

One messy aspect, which I think is fine in practice, is that this
presumes that pages are either treated as encrypted or decrypted. If
also done on SEV, the in-place re-encryption supported by SME would
break SEV migration. Linux doesn't do this now on SEV, and I don't
have an intuition for why Linux might want this, but we will need to
ensure it is never done in order to ensure that migration works down
the line. I don't believe the AMD manual promises this will work
anyway.

Something feels a bit wasteful about having all future kernels
universally announce c-bit status when SEV is enabled, even if KVM
isn't listening, since it may be too old (or just not want to know).
Might be worth eliding the hypercalls if you get ENOSYS back? There
might be a better way of passing paravirt config metadata across than
just trying and seeing if the hypercall succeeds, but I'm not super
familiar with it.
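
(For illustration, one hypothetical way to elide them: remember the
first -KVM_ENOSYS and stop hypercalling. The caching wrapper below is
made up and not part of the series:)

        static bool page_enc_hc_supported = true;

        /* Sketch: skip the hypercall once the host has said ENOSYS. */
        static void notify_page_enc_status(unsigned long pa,
                                           unsigned long npages, bool enc)
        {
                long ret;

                if (!page_enc_hc_supported)
                        return;

                ret = kvm_sev_hypercall3(KVM_HC_PAGE_ENC_STATUS,
                                         pa, npages, enc);
                if (ret == -KVM_ENOSYS)
                        page_enc_hc_supported = false;
        }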

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 10/12] mm: x86: Invoke hypercall when page encryption status is changed
  2020-02-20  1:58   ` Steve Rutherford
@ 2020-02-20  2:12     ` Andy Lutomirski
  2020-02-20  3:29       ` Steve Rutherford
  2020-02-20 15:54       ` Brijesh Singh
  0 siblings, 2 replies; 49+ messages in thread
From: Andy Lutomirski @ 2020-02-20  2:12 UTC (permalink / raw)
  To: Steve Rutherford
  Cc: Ashish Kalra, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Radim Krčmář,
	Joerg Roedel, Borislav Petkov, Tom Lendacky, David Rientjes, x86,
	KVM list, LKML



> On Feb 19, 2020, at 5:58 PM, Steve Rutherford <srutherford@google.com> wrote:
> 
> On Wed, Feb 12, 2020 at 5:18 PM Ashish Kalra <Ashish.Kalra@amd.com> wrote:
>> 
>> From: Brijesh Singh <brijesh.singh@amd.com>
>> 
>> Invoke a hypercall when a memory region is changed from encrypted ->
>> decrypted and vice versa. The hypervisor needs to know the page
>> encryption status during guest migration.
> 
> One messy aspect, which I think is fine in practice, is that this
> presumes that pages are either treated as encrypted or decrypted. If
> also done on SEV, the in-place re-encryption supported by SME would
> break SEV migration. Linux doesn't do this now on SEV, and I don't
> have an intuition for why Linux might want this, but we will need to
> ensure it is never done in order to ensure that migration works down
> the line. I don't believe the AMD manual promises this will work
> anyway.
> 
> Something feels a bit wasteful about having all future kernels
> universally announce c-bit status when SEV is enabled, even if KVM
> isn't listening, since it may be too old (or just not want to know).
> Might be worth eliding the hypercalls if you get ENOSYS back? There
> might be a better way of passing paravirt config metadata across than
> just trying and seeing if the hypercall succeeds, but I'm not super
> familiar with it.

I actually think this should be a hard requirement to merge this. The
host needs to tell the guest that it supports this particular migration
strategy and the guest needs to tell the host that it is using it.  And
the guest needs a way to tell the host that it’s *not* using it right
now due to kexec, for example.

I’m still uneasy about a guest being migrated in the window where the
hypercall tracking and the page encryption bit don’t match.  I guess
maybe corruption in this window doesn’t matter?

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 08/12] KVM: X86: Introduce KVM_HC_PAGE_ENC_STATUS hypercall
  2020-02-13  1:17 ` [PATCH 08/12] KVM: X86: Introduce KVM_HC_PAGE_ENC_STATUS hypercall Ashish Kalra
@ 2020-02-20  2:39   ` Steve Rutherford
  2020-02-20  5:28     ` Ashish Kalra
  0 siblings, 1 reply; 49+ messages in thread
From: Steve Rutherford @ 2020-02-20  2:39 UTC (permalink / raw)
  To: Ashish Kalra
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Radim Krčmář,
	Joerg Roedel, Borislav Petkov, Tom Lendacky, David Rientjes, x86,
	KVM list, LKML

On Wed, Feb 12, 2020 at 5:17 PM Ashish Kalra <Ashish.Kalra@amd.com> wrote:
> +static int sev_resize_page_enc_bitmap(struct kvm *kvm, unsigned long new_size)
> +{
> +       struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
> +       unsigned long *map;
> +       unsigned long sz;
> +
> +       if (sev->page_enc_bmap_size >= new_size)
> +               return 0;
> +
> +       sz = ALIGN(new_size, BITS_PER_LONG) / 8;
> +
> +       map = vmalloc(sz);
> +       if (!map) {
> +               pr_err_once("Failed to allocate encrypted bitmap size %lx\n",
> +                               sz);
> +               return -ENOMEM;
> +       }
> +
> +       /* mark the page encrypted (by default) */
> +       memset(map, 0xff, sz);
> +
> +       bitmap_copy(map, sev->page_enc_bmap, sev->page_enc_bmap_size);
Personally, I would do the arithmetic and only memset the tail that the
bitmap_copy() doesn't cover, i.e. `memset((u8 *)map +
sev->page_enc_bmap_size / 8, 0xff, sz - sev->page_enc_bmap_size / 8);`
(page_enc_bmap_size is in bits), but gcc might be smart enough to do
this for you.

> +       kvfree(sev->page_enc_bmap);
> +
> +       sev->page_enc_bmap = map;
> +       sev->page_enc_bmap_size = new_size;
> +
> +       return 0;
> +}
> +
> +static int svm_page_enc_status_hc(struct kvm *kvm, unsigned long gpa,
> +                                 unsigned long npages, unsigned long enc)
> +{
> +       struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
> +       gfn_t gfn_start, gfn_end;
> +       int ret;
> +
> +       if (!sev_guest(kvm))
> +               return -EINVAL;
> +
> +       if (!npages)
> +               return 0;
> +
> +       gfn_start = gpa_to_gfn(gpa);
> +       gfn_end = gfn_start + npages;
> +
> +       /* out of bound access error check */
> +       if (gfn_end <= gfn_start)
> +               return -EINVAL;
> +
> +       /* lets make sure that gpa exist in our memslot */
> +       pfn_start = gfn_to_pfn(kvm, gfn_start);
> +       pfn_end = gfn_to_pfn(kvm, gfn_end);
I believe these functions assume as_id==0, which is probably fine in
practice. If one were to want to migrate a VM with SMM support (which
I believe is the only current usage of non-zero as_ids), it feels like
SMM would need to be in control of its own c-bit tracking, but that
doesn't seem super feasible (otherwise the guest kernel could corrupt
SMM by passing invalid c-bit statuses). I'm not certain anyone wants
SMM with SEV anyway?

> +
> +       if (is_error_noslot_pfn(pfn_start) && !is_noslot_pfn(pfn_start)) {
> +               /*
> +                * Allow guest MMIO range(s) to be added
> +                * to the page encryption bitmap.
> +                */
> +               return -EINVAL;
> +       }
> +
> +       if (is_error_noslot_pfn(pfn_end) && !is_noslot_pfn(pfn_end)) {
> +               /*
> +                * Allow guest MMIO range(s) to be added
> +                * to the page encryption bitmap.
> +                */
> +               return -EINVAL;
> +       }
> +
> +       mutex_lock(&kvm->lock);
> +       ret = sev_resize_page_enc_bitmap(kvm, gfn_end);
> +       if (ret)
> +               goto unlock;
> +
> +       if (enc)
> +               __bitmap_set(sev->page_enc_bmap, gfn_start,
> +                               gfn_end - gfn_start);
> +       else
> +               __bitmap_clear(sev->page_enc_bmap, gfn_start,
> +                               gfn_end - gfn_start);
> +
> +unlock:
> +       mutex_unlock(&kvm->lock);
> +       return ret;
> +}
> +
>  static int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
>  {
>         struct kvm_sev_cmd sev_cmd;
> @@ -7972,6 +8064,8 @@ static struct kvm_x86_ops svm_x86_ops __ro_after_init = {
>         .need_emulation_on_page_fault = svm_need_emulation_on_page_fault,
>
>         .apic_init_signal_blocked = svm_apic_init_signal_blocked,
> +
> +       .page_enc_status_hc = svm_page_enc_status_hc,
>  };
>
>  static int __init svm_init(void)
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 9a6664886f2e..7963f2979fdf 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7879,6 +7879,7 @@ static struct kvm_x86_ops vmx_x86_ops __ro_after_init = {
>         .nested_get_evmcs_version = NULL,
>         .need_emulation_on_page_fault = vmx_need_emulation_on_page_fault,
>         .apic_init_signal_blocked = vmx_apic_init_signal_blocked,
> +       .page_enc_status_hc = NULL,
>  };
>
>  static void vmx_cleanup_l1d_flush(void)
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index fbabb2f06273..298627fa3d39 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -7547,6 +7547,12 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
>                 kvm_sched_yield(vcpu->kvm, a0);
>                 ret = 0;
>                 break;
> +       case KVM_HC_PAGE_ENC_STATUS:
> +               ret = -KVM_ENOSYS;
> +               if (kvm_x86_ops->page_enc_status_hc)
> +                       ret = kvm_x86_ops->page_enc_status_hc(vcpu->kvm,
> +                                       a0, a1, a2);
> +               break;
>         default:
>                 ret = -KVM_ENOSYS;
>                 break;
Add a cap to kvm_vm_ioctl_enable_cap so that the vmm can configure
whether or not this hypercall is offered. Moving to an enable cap
would also allow the vmm to pass down the expected size of the c-bit
tracking buffer, so that you don't need to handle dynamic resizing in
response to guest hypercalls; otherwise KVM will sporadically start
copying around large buffers when working with large VMs.

Stepping back a bit, I'm a little surprised by the fact that you don't
treat the c-bit buffers the same way as the dirty tracking buffers and
put them alongside the memslots. That's probably more effort, and the
strategy of using one large buffer should work fine (assuming you
don't need to support non-zero as_ids).
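
(For illustration, the suggested direction might look roughly like this
on the VMM side; KVM_CAP_SEV_PAGE_ENC_BITMAP is a hypothetical name, not
an existing capability:)

        /* Sketch: opt in and size the c-bit buffer up front. */
        struct kvm_enable_cap cap = {
                .cap = KVM_CAP_SEV_PAGE_ENC_BITMAP,
                .args[0] = guest_ram_pages,     /* bits to preallocate */
        };

        if (ioctl(vm_fd, KVM_ENABLE_CAP, &cap) < 0)
                ; /* host does not support c-bit tracking */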

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 10/12] mm: x86: Invoke hypercall when page encryption status is changed
  2020-02-20  2:12     ` Andy Lutomirski
@ 2020-02-20  3:29       ` Steve Rutherford
  2020-02-20 15:54       ` Brijesh Singh
  1 sibling, 0 replies; 49+ messages in thread
From: Steve Rutherford @ 2020-02-20  3:29 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Ashish Kalra, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Radim Krčmář,
	Joerg Roedel, Borislav Petkov, Tom Lendacky, David Rientjes, x86,
	KVM list, LKML

On Wed, Feb 19, 2020 at 6:12 PM Andy Lutomirski <luto@amacapital.net> wrote:
>
>
>
> > On Feb 19, 2020, at 5:58 PM, Steve Rutherford <srutherford@google.com> wrote:
> >
> > On Wed, Feb 12, 2020 at 5:18 PM Ashish Kalra <Ashish.Kalra@amd.com> wrote:
> >>
> >> From: Brijesh Singh <brijesh.singh@amd.com>
> >>
> >> Invoke a hypercall when a memory region is changed from encrypted ->
> >> decrypted and vice versa. The hypervisor needs to know the page
> >> encryption status during guest migration.
> >
> > One messy aspect, which I think is fine in practice, is that this
> > presumes that pages are either treated as encrypted or decrypted. If
> > also done on SEV, the in-place re-encryption supported by SME would
> > break SEV migration. Linux doesn't do this now on SEV, and I don't
> > have an intuition for why Linux might want this, but we will need to
> > ensure it is never done in order to ensure that migration works down
> > the line. I don't believe the AMD manual promises this will work
> > anyway.
> >
> > Something feels a bit wasteful about having all future kernels
> > universally announce c-bit status when SEV is enabled, even if KVM
> > isn't listening, since it may be too old (or just not want to know).
> > Might be worth eliding the hypercalls if you get ENOSYS back? There
> > might be a better way of passing paravirt config metadata across than
> > just trying and seeing if the hypercall succeeds, but I'm not super
> > familiar with it.
>
> I actually think this should be a hard requirement to merge this. The
> host needs to tell the guest that it supports this particular migration
> strategy and the guest needs to tell the host that it is using it.  And
> the guest needs a way to tell the host that it’s *not* using it right
> now due to kexec, for example.
>
> I’m still uneasy about a guest being migrated in the window where the
> hypercall tracking and the page encryption bit don’t match.  I guess
> maybe corruption in this window doesn’t matter?
It does matter, since you don't want to accidentally clear the dirty
bit when you are migrating the page from the wrong perspective.
Treating pages with dirty c-bits as dirty pages should solve this
problem. It's probably reasonable to expect userspace to handle this?
Downside is that you would then need ~3 copies of the c-bit tracking
buffer: one as the kernel version, one as the "old" usermode version,
and one as the "current" usermode version (which is smaller, since you
can fetch smaller sections than the full buffer). The kernel could
probably directly twiddle the dirty tracking bits and avoid the extra
userspace version, but this doesn't seem required.

That said, this does balloon the c-bit tracking overhead. Tracking
100GB of guest pages requires 3MB per instance of these buffers, which
isn't that bad but also isn't free (assuming my back of the envelope
math is right).
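
(Back-of-the-envelope check: 100 GiB / 4 KiB pages = 26,214,400 pages;
at one bit per page that is ~3.1 MiB per copy of the bitmap, so ~3 MB
per instance is about right.)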

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 08/12] KVM: X86: Introduce KVM_HC_PAGE_ENC_STATUS hypercall
  2020-02-20  2:39   ` Steve Rutherford
@ 2020-02-20  5:28     ` Ashish Kalra
  2020-02-20 21:21       ` Ashish Kalra
  0 siblings, 1 reply; 49+ messages in thread
From: Ashish Kalra @ 2020-02-20  5:28 UTC (permalink / raw)
  To: Steve Rutherford
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Radim Krčmář,
	Joerg Roedel, Borislav Petkov, Tom Lendacky, David Rientjes, x86,
	KVM list, LKML, brijesh.singh

On Wed, Feb 19, 2020 at 06:39:39PM -0800, Steve Rutherford wrote:
> On Wed, Feb 12, 2020 at 5:17 PM Ashish Kalra <Ashish.Kalra@amd.com> wrote:
> >
> >  static void vmx_cleanup_l1d_flush(void)
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index fbabb2f06273..298627fa3d39 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -7547,6 +7547,12 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
> >                 kvm_sched_yield(vcpu->kvm, a0);
> >                 ret = 0;
> >                 break;
> > +       case KVM_HC_PAGE_ENC_STATUS:
> > +               ret = -KVM_ENOSYS;
> > +               if (kvm_x86_ops->page_enc_status_hc)
> > +                       ret = kvm_x86_ops->page_enc_status_hc(vcpu->kvm,
> > +                                       a0, a1, a2);
> > +               break;
> >         default:
> >                 ret = -KVM_ENOSYS;
> >                 break;
> Add a cap to kvm_vm_ioctl_enable_cap so that the vmm can configure
> whether or not this hypercall is offered. Moving to an enable cap
> would also allow the vmm to pass down the expected size of the c-bit
> tracking buffer, so that you don't need to handle dynamic resizing in
> response to guest hypercall, otherwise KVM will sporadically start
> copying around large buffers when working with large VMs.
> 

Yes, that is something we have been looking at adding.

But, how will VMM know the expected size of the c-bit tracking buffer ?

The guest kernel and firmware make the hypercall to mark page encryption
status, and depending on the GPA range being marked, the kernel's page
encryption bitmap needs to be dynamically resized in response to the
guest hypercall.
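
(For illustration, the enable-cap route Steve suggests would amount to
roughly the following in kvm_vm_ioctl_enable_cap(); the capability name
and the two arch fields are hypothetical, not from the posted series:)

    case KVM_CAP_PAGE_ENC_HC:       /* hypothetical capability name */
            /*
             * args[0] could carry the VMM's upper bound on guest page
             * count, letting KVM size the bitmap once up front instead
             * of resizing on every guest hypercall.
             */
            kvm->arch.page_enc_hc_enabled = true;
            kvm->arch.page_enc_bmap_max = cap->args[0];
            r = 0;
            break;
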

> Stepping back a bit, I'm a little surprised by the fact that you don't
> treat the c-bit buffers the same way as the dirty tracking buffers and
> put them alongside the memslots. That's probably more effort, and the
> strategy of using one large buffer should work fine (assuming you
> don't need to support non-zero as_ids).

Thanks,
Ashish


* Re: [PATCH 10/12] mm: x86: Invoke hypercall when page encryption status is changed
  2020-02-20  2:12     ` Andy Lutomirski
  2020-02-20  3:29       ` Steve Rutherford
@ 2020-02-20 15:54       ` Brijesh Singh
  2020-02-20 20:43         ` Steve Rutherford
  1 sibling, 1 reply; 49+ messages in thread
From: Brijesh Singh @ 2020-02-20 15:54 UTC (permalink / raw)
  To: Andy Lutomirski, Steve Rutherford
  Cc: brijesh.singh, Ashish Kalra, Paolo Bonzini, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, Radim Krčmář,
	Joerg Roedel, Borislav Petkov, Tom Lendacky, David Rientjes, x86,
	KVM list, LKML



On 2/19/20 8:12 PM, Andy Lutomirski wrote:
> 
> 
>> On Feb 19, 2020, at 5:58 PM, Steve Rutherford <srutherford@google.com> wrote:
>>
>> On Wed, Feb 12, 2020 at 5:18 PM Ashish Kalra <Ashish.Kalra@amd.com> wrote:
>>>
>>> From: Brijesh Singh <brijesh.singh@amd.com>
>>>
>>> Invoke a hypercall when a memory region is changed from encrypted ->
>>> decrypted and vice versa. The hypervisor needs to know the page
>>> encryption status during guest migration.
>>
>> One messy aspect, which I think is fine in practice, is that this
>> presumes that pages are either treated as encrypted or decrypted. If
>> also done on SEV, the in-place re-encryption supported by SME would
>> break SEV migration. Linux doesn't do this now on SEV, and I don't
>> have an intuition for why Linux might want this, but we will need to
>> ensure it is never done in order to ensure that migration works down
>> the line. I don't believe the AMD manual promises this will work
>> anyway.
>>
>> Something feels a bit wasteful about having all future kernels
>> universally announce c-bit status when SEV is enabled, even if KVM
>> isn't listening, since it may be too old (or just not want to know).
>> Might be worth eliding the hypercalls if you get ENOSYS back? There
>> might be a better way of passing paravirt config metadata across than
>> just trying and seeing if the hypercall succeeds, but I'm not super
>> familiar with it.
> 
> I actually think this should be a hard requirement to merge this. The host needs to tell the guest that it supports this particular migration strategy and the guest needs to tell the host that it is using it.  And the guest needs a way to tell the host that it’s *not* using it right now due to kexec, for example.
> 
> I’m still uneasy about a guest being migrated in the window where the hypercall tracking and the page encryption bit don’t match.  I guess maybe corruption in this window doesn’t matter?
> 

I don't think there is a corruption issue here. Let's consider the below
case:

1) A page is transmitted as C=1 (encrypted)

2) During the migration window, the page encryption bit is changed
  to C=0 (decrypted)

3) #2 will cause a change in page table memory, so the dirty memory
  tracker will trigger retransmission of the page table memory.

4) The page itself will not be re-transmitted because there was
  no change to the content of the page.

On destination, the read from the page will get the ciphertext.

The encryption bit change in the page table is used on the next access.
The user of the page needs to ensure that data is written with the
correct encryption bit before reading.

thanks


* Re: [PATCH 10/12] mm: x86: Invoke hypercall when page encryption status is changed
  2020-02-20 15:54       ` Brijesh Singh
@ 2020-02-20 20:43         ` Steve Rutherford
  2020-02-20 22:43           ` Brijesh Singh
  0 siblings, 1 reply; 49+ messages in thread
From: Steve Rutherford @ 2020-02-20 20:43 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: Andy Lutomirski, Ashish Kalra, Paolo Bonzini, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, Radim Krčmář,
	Joerg Roedel, Borislav Petkov, Tom Lendacky, David Rientjes, x86,
	KVM list, LKML

On Thu, Feb 20, 2020 at 7:55 AM Brijesh Singh <brijesh.singh@amd.com> wrote:
>
>
>
> On 2/19/20 8:12 PM, Andy Lutomirski wrote:
> > <snip: earlier quoting trimmed>
> >
> > I actually think this should be a hard requirement to merge this. The host needs to tell the guest that it supports this particular migration strategy and the guest needs to tell the host that it is using it.  And the guest needs a way to tell the host that it’s *not* using it right now due to kexec, for example.
> >
> > I’m still uneasy about a guest being migrated in the window where the hypercall tracking and the page encryption bit don’t match.  I guess maybe corruption in this window doesn’t matter?
> >
>
> I don't think there is a corruption issue here. Let's consider the below
> case:
>
> 1) A page is transmitted as C=1 (encrypted)
>
> 2) During the migration window, the page encryption bit is changed
>   to C=0 (decrypted)
>
> 3) #2 will cause a change in page table memory, so the dirty memory
>   tracker will trigger retransmission of the page table memory.
>
> 4) The page itself will not be re-transmitted because there was
>   no change to the content of the page.
>
> On destination, the read from the page will get the ciphertext.
>
> The encryption bit change in the page table is used on the next access.
> The user of the page needs to ensure that data is written with the
> correct encryption bit before reading.
>
> thanks


I think the issue results from a slightly different perspective than
the one you are using. I think the situation Andy is interested in is
when a c-bit change and a write happen close in time. There are five
events, and the ordering matters:
1) Guest dirties the c-bit in the guest
2) Guest dirties the page
3) Host userspace observes the c-bit logs
4) Host userspace observes the page dirty logs
5) Host transmits the page

If these are reordered to:
3) Host userspace observes the c-bit logs
1) Guest dirties the c-bit in the guest
2) Guest dirties the page
4) Host userspace observes the page dirty logs
5) Host transmits the page (from the wrong c-bit perspective!)

Then the host will transmit a page with the wrong c-bit status and
clear the dirty bit for that page. If the guest page is not
retransmitted incidentally later, then this page will be corrupted.

If you treat pages with dirty c-bits as dirty pages, then you will
check the c-bit logs later and observe the dirty c-bit and retransmit.
There might be some cleverness around enforcing that you always fetch
the c-bit logs after fetching the dirty logs, but I haven't convinced
myself that this works yet. I think it might, since then the c-bits
are at least as fresh as the dirty bits.
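
(A sketch of that ordering from the VMM side, treating dirty-c-bit
pages as dirty; this is illustrative pseudo-C, not code from the
series, and the per-memslot details of KVM_GET_DIRTY_LOG are elided:)

    ioctl(vm_fd, KVM_GET_DIRTY_LOG, &dirty_log);
    /* fetch c-bit state only after the dirty log, so it is at least
     * as fresh as the dirty bits */
    ioctl(vm_fd, KVM_GET_PAGE_ENC_BITMAP, &enc);

    for (gfn = 0; gfn < num_pages; gfn++) {
            bool enc_changed = test_bit(gfn, enc_bitmap) !=
                               test_bit(gfn, prev_enc_bitmap);

            if (test_bit(gfn, dirty_bitmap) || enc_changed)
                    send_page(gfn, test_bit(gfn, enc_bitmap));
    }
    memcpy(prev_enc_bitmap, enc_bitmap, bitmap_bytes);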

The main uncertainty that comes to mind for that strategy is if, on
multi-vCPU VMs, the page dirtying event (from the new c-bit
perspective) and the c-bit status change hypercall can themselves
race. If a write from the new c-bit perspective can arrive before the
c-bit status change arrives in the c-bit logs, we will need to treat
pages with dirty c-bits as dirty pages.

Note that I do agree that if the c-bit status flips, and no one writes
to the page, it doesn't really matter if you retransmit that page. If
a guest wants to read nonsense, it can.


* Re: [PATCH 08/12] KVM: X86: Introduce KVM_HC_PAGE_ENC_STATUS hypercall
  2020-02-20  5:28     ` Ashish Kalra
@ 2020-02-20 21:21       ` Ashish Kalra
  0 siblings, 0 replies; 49+ messages in thread
From: Ashish Kalra @ 2020-02-20 21:21 UTC (permalink / raw)
  To: Steve Rutherford
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Radim Krčmář,
	Joerg Roedel, Borislav Petkov, Tom Lendacky, David Rientjes, x86,
	KVM list, LKML, brijesh.singh

On Thu, Feb 20, 2020 at 05:28:21AM +0000, Ashish Kalra wrote:
> <snip: earlier quoting trimmed>
>
> Yes, that is something we have been looking at adding.
> 
> But how will the VMM know the expected size of the c-bit tracking buffer?
> 
> The guest kernel and firmware make the hypercall to mark page encryption
> status, and depending on the GPA range being marked, the kernel's page
> encryption bitmap needs to be dynamically resized in response to the
> guest hypercall.
> 

I discussed this with Brijesh: KVM can provide a hint about the
expected (maximum) size of the c-bit tracking buffer, but there is
still an issue for hotplugged guest memory, so a dynamically sized
encryption bitmap is probably the right approach.
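
(A minimal sketch of that on-demand growth, in the style of the svm.c
code quoted elsewhere in this thread; names and locking are simplified
and illustrative, not lifted from the posted patches:)

    static int sev_resize_page_enc_bitmap(struct kvm *kvm,
                                          unsigned long gfn_end)
    {
            struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
            unsigned long *map;
            unsigned long sz;

            if (sev->page_enc_bmap_size >= gfn_end)
                    return 0;

            sz = ALIGN(gfn_end, BITS_PER_LONG) / 8;
            map = kvzalloc(sz, GFP_KERNEL);
            if (!map)
                    return -ENOMEM;

            /* carry over the old state; kvm->lock held by the caller */
            if (sev->page_enc_bmap) {
                    bitmap_copy(map, sev->page_enc_bmap,
                                sev->page_enc_bmap_size);
                    kvfree(sev->page_enc_bmap);
            }
            sev->page_enc_bmap = map;
            sev->page_enc_bmap_size = gfn_end;
            return 0;
    }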

Thanks,
Ashish


* Re: [PATCH 10/12] mm: x86: Invoke hypercall when page encryption status is changed
  2020-02-20 20:43         ` Steve Rutherford
@ 2020-02-20 22:43           ` Brijesh Singh
  2020-02-20 23:23             ` Steve Rutherford
  0 siblings, 1 reply; 49+ messages in thread
From: Brijesh Singh @ 2020-02-20 22:43 UTC (permalink / raw)
  To: Steve Rutherford
  Cc: brijesh.singh, Andy Lutomirski, Ashish Kalra, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Radim Krčmář,
	Joerg Roedel, Borislav Petkov, Tom Lendacky, David Rientjes, x86,
	KVM list, LKML



On 2/20/20 2:43 PM, Steve Rutherford wrote:
> On Thu, Feb 20, 2020 at 7:55 AM Brijesh Singh <brijesh.singh@amd.com> wrote:
>> <snip: earlier quoting trimmed>
> 
> 
> I think the issue results from a slightly different perspective than
> the one you are using. I think the situation Andy is interested in is
> when a c-bit change and a write happen close in time. There are five
> events, and the ordering matters:
> 1) Guest dirties the c-bit in the guest
> 2) Guest dirties the page
> 3) Host userspace observes the c-bit logs
> 4) Host userspace observes the page dirty logs
> 5) Host transmits the page
> 
> If these are reordered to:
> 3) Host userspace observes the c-bit logs
> 1) Guest dirties the c-bit in the guest
> 2) Guest dirties the page
> 4) Host userspace observes the page dirty logs
> 5) Host transmits the page (from the wrong c-bit perspective!)
> 
> Then the host will transmit a page with the wrong c-bit status and
> clear the dirty bit for that page. If the guest page is not
> retransmitted incidentally later, then this page will be corrupted.
> 
> If you treat pages with dirty c-bits as dirty pages, then you will
> check the c-bit logs later and observe the dirty c-bit and retransmit.
> There might be some cleverness around enforcing that you always fetch
> the c-bit logs after fetching the dirty logs, but I haven't convinced
> myself that this works yet. I think it might, since then the c-bits
> are at least as fresh as the dirty bits.
> 

Unlike the dirty log, the c-bit log maintains the complete state.
So I think it is the host userspace's responsibility to keep track
of any c-bit log changes since it last synced. During the migration,
after pausing the guest, it can fetch the latest c-bit log and compare
it against the last synced copy; if something has changed, it
retransmits the page with the new c-bit state.
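
(In pseudo-C, that final compare-and-retransmit step might look like
this; retransmit_page() and the prev/cur bookkeeping are illustrative:)

    /* guest is paused here, so the logs are stable */
    ioctl(vm_fd, KVM_GET_PAGE_ENC_BITMAP, &cur);
    for (gfn = 0; gfn < num_pages; gfn++)
            if (test_bit(gfn, cur_bitmap) != test_bit(gfn, prev_bitmap))
                    retransmit_page(gfn);   /* with the new c-bit state */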

> The main uncertainty that comes to mind for that strategy is if, on
> multi-vCPU VMs, the page dirtying event (from the new c-bit
> perspective) and the c-bit status change hypercall can themselves
> race. If a write from the new c-bit perspective can arrive before the
> c-bit status change arrives in the c-bit logs, we will need to treat
> pages with dirty c-bits as dirty pages.
> 

I believe this problem can be avoided if host userspace tracks the
changes in the c-bit log since it last synced. Do you think we should
track the last-sync changes in KVM, or let host userspace handle it?

> Note that I do agree that if the c-bit status flips, and no one writes
> to the page, it doesn't really matter if you retransmit that page. If
> a guest wants to read nonsense, it can.
> 


* Re: [PATCH 10/12] mm: x86: Invoke hypercall when page encryption status is changed
  2020-02-20 22:43           ` Brijesh Singh
@ 2020-02-20 23:23             ` Steve Rutherford
  2020-02-20 23:40               ` Andy Lutomirski
  0 siblings, 1 reply; 49+ messages in thread
From: Steve Rutherford @ 2020-02-20 23:23 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: Andy Lutomirski, Ashish Kalra, Paolo Bonzini, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, Joerg Roedel, Borislav Petkov,
	Tom Lendacky, David Rientjes, x86, KVM list, LKML

On Thu, Feb 20, 2020 at 2:43 PM Brijesh Singh <brijesh.singh@amd.com> wrote:
> <snip: earlier quoting trimmed>
>
> Unlike the dirty log, the c-bit log maintains the complete state.
> So I think it is the host userspace's responsibility to keep track
> of any c-bit log changes since it last synced. During the migration,
> after pausing the guest, it can fetch the latest c-bit log and compare
> it against the last synced copy; if something has changed, it
> retransmits the page with the new c-bit state.
>
> > The main uncertainty that comes to mind for that strategy is if, on
> > multi-vCPU VMs, the page dirtying event (from the new c-bit
> > perspective) and the c-bit status change hypercall can themselves
> > race. If a write from the new c-bit perspective can arrive before the
> > c-bit status change arrives in the c-bit logs, we will need to treat
> > pages with dirty c-bits as dirty pages.
> >
>
> I believe this problem can be avoided if host userspace tracks the
> changes in the c-bit log since it last synced. Do you think we should
> track the last-sync changes in KVM, or let host userspace handle it?
Punting this off to userspace to handle works. If storing the old
c-bit statuses in userspace becomes a memory issue (unlikely), we can
fix that down the line.

Andy, are your concerns about the raciness of c-bit tracking resolved?


* Re: [PATCH 10/12] mm: x86: Invoke hypercall when page encryption status is changed
  2020-02-20 23:23             ` Steve Rutherford
@ 2020-02-20 23:40               ` Andy Lutomirski
  0 siblings, 0 replies; 49+ messages in thread
From: Andy Lutomirski @ 2020-02-20 23:40 UTC (permalink / raw)
  To: Steve Rutherford
  Cc: Brijesh Singh, Ashish Kalra, Paolo Bonzini, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, Joerg Roedel, Borislav Petkov,
	Tom Lendacky, David Rientjes, x86, KVM list, LKML



> On Feb 20, 2020, at 3:24 PM, Steve Rutherford <srutherford@google.com> wrote:
> 
> <snip: earlier quoting trimmed>
>
> Punting this off to userspace to handle works. If storing the old
> c-bit statuses in userspace becomes a memory issue (unlikely), we can
> fix that down the line.
> 
> Andy, are your concerns about the raciness of c-bit tracking resolved?

Probably, as long as the guest doesn’t play nasty games with trying to read its own ciphertext.


* Re: [PATCH 09/12] KVM: x86: Introduce KVM_GET_PAGE_ENC_BITMAP ioctl
  2020-02-13  1:17 ` [PATCH 09/12] KVM: x86: Introduce KVM_GET_PAGE_ENC_BITMAP ioctl Ashish Kalra
@ 2020-02-27 17:57   ` Venu Busireddy
  2020-02-27 18:18     ` Venu Busireddy
  0 siblings, 1 reply; 49+ messages in thread
From: Venu Busireddy @ 2020-02-27 17:57 UTC (permalink / raw)
  To: Ashish Kalra
  Cc: pbonzini, tglx, mingo, hpa, rkrcmar, joro, bp, thomas.lendacky,
	rientjes, x86, kvm, linux-kernel

On 2020-02-13 01:17:45 +0000, Ashish Kalra wrote:
> From: Brijesh Singh <brijesh.singh@amd.com>
> 
> The ioctl can be used to retrieve page encryption bitmap for a given
> gfn range.
> 
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: "Radim Krčmář" <rkrcmar@redhat.com>
> Cc: Joerg Roedel <joro@8bytes.org>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Tom Lendacky <thomas.lendacky@amd.com>
> Cc: x86@kernel.org
> Cc: kvm@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>

This patch does not apply to upstream Linux version 5.5.6.

  <snip>
  Applying: KVM: x86: Introduce KVM_GET_PAGE_ENC_BITMAP ioctl
  error: patch failed: Documentation/virt/kvm/api.txt:4213
  error: Documentation/virt/kvm/api.txt: patch does not apply
  error: patch failed: include/uapi/linux/kvm.h:1478
  error: include/uapi/linux/kvm.h: patch does not apply
  Patch failed at 0009 KVM: x86: Introduce KVM_GET_PAGE_ENC_BITMAP ioctl
  <snip>

Which kernel version does this patch series apply to cleanly?

Thanks,

Venu

> ---
>  Documentation/virt/kvm/api.txt  | 27 +++++++++++++++++++++
>  arch/x86/include/asm/kvm_host.h |  2 ++
>  arch/x86/kvm/svm.c              | 43 +++++++++++++++++++++++++++++++++
>  arch/x86/kvm/x86.c              | 12 +++++++++
>  include/uapi/linux/kvm.h        | 12 +++++++++
>  5 files changed, 96 insertions(+)
> 
> diff --git a/Documentation/virt/kvm/api.txt b/Documentation/virt/kvm/api.txt
> index c6e1ce5d40de..053aecfabe74 100644
> --- a/Documentation/virt/kvm/api.txt
> +++ b/Documentation/virt/kvm/api.txt
> @@ -4213,6 +4213,33 @@ the clear cpu reset definition in the POP. However, the cpu is not put
>  into ESA mode. This reset is a superset of the initial reset.
>  
>  
> +4.120 KVM_GET_PAGE_ENC_BITMAP (vm ioctl)
> +
> +Capability: basic
> +Architectures: x86
> +Type: vm ioctl
> +Parameters: struct kvm_page_enc_bitmap (in/out)
> +Returns: 0 on success, -1 on error
> +
> +/* for KVM_GET_PAGE_ENC_BITMAP */
> +struct kvm_page_enc_bitmap {
> +	__u64 start_gfn;
> +	__u64 num_pages;
> +	union {
> +		void __user *enc_bitmap; /* one bit per page */
> +		__u64 padding2;
> +	};
> +};
> +
> +Encrypted VMs have the concept of private and shared pages. A private
> +page is encrypted with the guest-specific key, while a shared page may
> +be encrypted with the hypervisor key. KVM_GET_PAGE_ENC_BITMAP can
> +be used to get the bitmap indicating whether a guest page is private
> +or shared. The bitmap can be used during guest migration: if a page
> +is private, userspace needs to use the SEV migration commands to
> +transmit the page.
> +
> +
>  5. The kvm_run structure
>  ------------------------
>  
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 4ae7293033b2..a6882c5214b4 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1258,6 +1258,8 @@ struct kvm_x86_ops {
>  	int (*enable_direct_tlbflush)(struct kvm_vcpu *vcpu);
>  	int (*page_enc_status_hc)(struct kvm *kvm, unsigned long gpa,
>  				  unsigned long sz, unsigned long mode);
> +	int (*get_page_enc_bitmap)(struct kvm *kvm,
> +				struct kvm_page_enc_bitmap *bmap);
>  };
>  
>  struct kvm_arch_async_pf {
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index f09791109075..f1c8806a97c6 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -7673,6 +7673,48 @@ static int svm_page_enc_status_hc(struct kvm *kvm, unsigned long gpa,
>  	return ret;
>  }
>  
> +static int svm_get_page_enc_bitmap(struct kvm *kvm,
> +				   struct kvm_page_enc_bitmap *bmap)
> +{
> +	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
> +	unsigned long gfn_start, gfn_end;
> +	unsigned long *bitmap;
> +	unsigned long sz, i;
> +	int ret;
> +
> +	if (!sev_guest(kvm))
> +		return -ENOTTY;
> +
> +	gfn_start = bmap->start_gfn;
> +	gfn_end = gfn_start + bmap->num_pages;
> +
> +	sz = ALIGN(bmap->num_pages, BITS_PER_LONG) / 8;
> +	bitmap = kmalloc(sz, GFP_KERNEL);
> +	if (!bitmap)
> +		return -ENOMEM;
> +
> +	/* by default all pages are marked encrypted */
> +	memset(bitmap, 0xff, sz);
> +
> +	mutex_lock(&kvm->lock);
> +	if (sev->page_enc_bmap) {
> +		i = gfn_start;
> +		for_each_clear_bit_from(i, sev->page_enc_bmap,
> +				      min(sev->page_enc_bmap_size, gfn_end))
> +			clear_bit(i - gfn_start, bitmap);
> +	}
> +	mutex_unlock(&kvm->lock);
> +
> +	ret = -EFAULT;
> +	if (copy_to_user(bmap->enc_bitmap, bitmap, sz))
> +		goto out;
> +
> +	ret = 0;
> +out:
> +	kfree(bitmap);
> +	return ret;
> +}
> +
>  static int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
>  {
>  	struct kvm_sev_cmd sev_cmd;
> @@ -8066,6 +8108,7 @@ static struct kvm_x86_ops svm_x86_ops __ro_after_init = {
>  	.apic_init_signal_blocked = svm_apic_init_signal_blocked,
>  
>  	.page_enc_status_hc = svm_page_enc_status_hc,
> +	.get_page_enc_bitmap = svm_get_page_enc_bitmap,
>  };
>  
>  static int __init svm_init(void)
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 298627fa3d39..e955f886ee17 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -5213,6 +5213,18 @@ long kvm_arch_vm_ioctl(struct file *filp,
>  	case KVM_SET_PMU_EVENT_FILTER:
>  		r = kvm_vm_ioctl_set_pmu_event_filter(kvm, argp);
>  		break;
> +	case KVM_GET_PAGE_ENC_BITMAP: {
> +		struct kvm_page_enc_bitmap bitmap;
> +
> +		r = -EFAULT;
> +		if (copy_from_user(&bitmap, argp, sizeof(bitmap)))
> +			goto out;
> +
> +		r = -ENOTTY;
> +		if (kvm_x86_ops->get_page_enc_bitmap)
> +			r = kvm_x86_ops->get_page_enc_bitmap(kvm, &bitmap);
> +		break;
> +	}
>  	default:
>  		r = -ENOTTY;
>  	}
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 4e80c57a3182..9377b26c5f4e 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -500,6 +500,16 @@ struct kvm_dirty_log {
>  	};
>  };
>  
> +/* for KVM_GET_PAGE_ENC_BITMAP */
> +struct kvm_page_enc_bitmap {
> +	__u64 start_gfn;
> +	__u64 num_pages;
> +	union {
> +		void __user *enc_bitmap; /* one bit per page */
> +		__u64 padding2;
> +	};
> +};
> +
>  /* for KVM_CLEAR_DIRTY_LOG */
>  struct kvm_clear_dirty_log {
>  	__u32 slot;
> @@ -1478,6 +1488,8 @@ struct kvm_enc_region {
>  #define KVM_S390_NORMAL_RESET	_IO(KVMIO,   0xc3)
>  #define KVM_S390_CLEAR_RESET	_IO(KVMIO,   0xc4)
>  
> +#define KVM_GET_PAGE_ENC_BITMAP	_IOW(KVMIO, 0xc2, struct kvm_page_enc_bitmap)
> +
>  /* Secure Encrypted Virtualization command */
>  enum sev_cmd_id {
>  	/* Guest initialization commands */
> -- 
> 2.17.1
> 


* Re: [PATCH 09/12] KVM: x86: Introduce KVM_GET_PAGE_ENC_BITMAP ioctl
  2020-02-27 17:57   ` Venu Busireddy
@ 2020-02-27 18:18     ` Venu Busireddy
  2020-02-27 19:38       ` Ashish Kalra
  0 siblings, 1 reply; 49+ messages in thread
From: Venu Busireddy @ 2020-02-27 18:18 UTC (permalink / raw)
  To: Ashish Kalra
  Cc: pbonzini, tglx, mingo, hpa, rkrcmar, joro, bp, thomas.lendacky,
	rientjes, x86, kvm, linux-kernel

On 2020-02-27 11:57:50 -0600, Venu Busireddy wrote:
> On 2020-02-13 01:17:45 +0000, Ashish Kalra wrote:
> > From: Brijesh Singh <brijesh.singh@amd.com>
> > 
> > The ioctl can be used to retrieve page encryption bitmap for a given
> > gfn range.
> > 
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Ingo Molnar <mingo@redhat.com>
> > Cc: "H. Peter Anvin" <hpa@zytor.com>
> > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > Cc: "Radim Krčmář" <rkrcmar@redhat.com>
> > Cc: Joerg Roedel <joro@8bytes.org>
> > Cc: Borislav Petkov <bp@suse.de>
> > Cc: Tom Lendacky <thomas.lendacky@amd.com>
> > Cc: x86@kernel.org
> > Cc: kvm@vger.kernel.org
> > Cc: linux-kernel@vger.kernel.org
> > Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> > Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
> 
> This patch does not apply to upstream Linux version 5.5.6.
> 
>   <snip>
>   Applying: KVM: x86: Introduce KVM_GET_PAGE_ENC_BITMAP ioctl
>   error: patch failed: Documentation/virt/kvm/api.txt:4213
>   error: Documentation/virt/kvm/api.txt: patch does not apply
>   error: patch failed: include/uapi/linux/kvm.h:1478
>   error: include/uapi/linux/kvm.h: patch does not apply
>   Patch failed at 0009 KVM: x86: Introduce KVM_GET_PAGE_ENC_BITMAP ioctl
>   <snip>
> 
> Which kernel version does this patch series apply to cleanly?

I tried git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git.
With that, patch 08/12 fails against the top of that tree, as well as
against tag v5.6-rc3.

> Thanks,
> 
> Venu
> 
> > <snip: quoted patch trimmed>


* Re: [PATCH 09/12] KVM: x86: Introduce KVM_GET_PAGE_ENC_BITMAP ioctl
  2020-02-27 18:18     ` Venu Busireddy
@ 2020-02-27 19:38       ` Ashish Kalra
  0 siblings, 0 replies; 49+ messages in thread
From: Ashish Kalra @ 2020-02-27 19:38 UTC (permalink / raw)
  To: Venu Busireddy
  Cc: pbonzini, tglx, mingo, hpa, rkrcmar, joro, bp, thomas.lendacky,
	rientjes, x86, kvm, linux-kernel

The patch series should apply cleanly to tag v5.6-rc1.

I will be posting v5 of this patchset next week, which should
apply on top of
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git.

Thanks,
Ashish

On Thu, Feb 27, 2020 at 12:18:58PM -0600, Venu Busireddy wrote:
> On 2020-02-27 11:57:50 -0600, Venu Busireddy wrote:
> > On 2020-02-13 01:17:45 +0000, Ashish Kalra wrote:
> > > From: Brijesh Singh <brijesh.singh@amd.com>
> > > 
> > > The ioctl can be used to retrieve page encryption bitmap for a given
> > > gfn range.
> > > 
> > > Cc: Thomas Gleixner <tglx@linutronix.de>
> > > Cc: Ingo Molnar <mingo@redhat.com>
> > > Cc: "H. Peter Anvin" <hpa@zytor.com>
> > > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > > Cc: "Radim Krčmář" <rkrcmar@redhat.com>
> > > Cc: Joerg Roedel <joro@8bytes.org>
> > > Cc: Borislav Petkov <bp@suse.de>
> > > Cc: Tom Lendacky <thomas.lendacky@amd.com>
> > > Cc: x86@kernel.org
> > > Cc: kvm@vger.kernel.org
> > > Cc: linux-kernel@vger.kernel.org
> > > Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> > > Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
> > 
> > This patch does not apply to upstream Linux version 5.5.6.
> > 
> >   <snip>
> >   Applying: KVM: x86: Introduce KVM_GET_PAGE_ENC_BITMAP ioctl
> >   error: patch failed: Documentation/virt/kvm/api.txt:4213
> >   error: Documentation/virt/kvm/api.txt: patch does not apply
> >   error: patch failed: include/uapi/linux/kvm.h:1478
> >   error: include/uapi/linux/kvm.h: patch does not apply
> >   Patch failed at 0009 KVM: x86: Introduce KVM_GET_PAGE_ENC_BITMAP ioctl
> >   <snip>
> > 
> > Which kernel version does this patch series apply to cleanly?
> 
> I tried git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git.
> With that, patch 08/12 fails against the top of that tree, as well as
> against tag v5.6-rc3.
> 
> > Thanks,
> > 
> > Venu
> > 
> > > <snip: quoted patch trimmed>


* Re: [PATCH 00/12] SEV Live Migration Patchset.
  2020-02-17 19:49       ` Ashish Kalra
@ 2020-03-03  1:05         ` Steve Rutherford
  2020-03-03  4:42           ` Ashish Kalra
  2020-03-19 13:05         ` Paolo Bonzini
  1 sibling, 1 reply; 49+ messages in thread
From: Steve Rutherford @ 2020-03-03  1:05 UTC (permalink / raw)
  To: Ashish Kalra
  Cc: Andy Lutomirski, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Radim Krcmar, Joerg Roedel, Borislav Petkov,
	Tom Lendacky, David Rientjes, X86 ML, kvm list, LKML,
	Brijesh Singh

I think this patch series might be incompatible with kexec, but I'm
not sure since I'm really not familiar with that subsystem. It seems
like you need to explicitly mark all shared pages as encrypted again
before rebooting into the new kernel. If you don't do this, and the
newly booted kernel believes RAM is encrypted and has always been, you
won't flip formerly decrypted shared device pages back to encrypted.
For example, if the swiotlb moves, you need to tell KVM that the old
swiotlb pages are now encrypted.

I could imagine this being handled implicitly by the way kexec handles
freeing memory, but I would find this surprising. I only see
enlightenment for handling SME's interaction with the initial image in
the kexec implementation.
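
(To make that concrete, here is a hedged guest-side sketch of the kind
of fixup kexec would need; the hook and the tracking list are
hypothetical, and only the hypercall itself comes from this series:)

    /*
     * Hypothetical kexec hook: walk the ranges the old kernel had
     * marked shared and tell the host they are encrypted again,
     * before the new kernel boots believing everything is private.
     */
    static void sev_unshare_all_for_kexec(void)
    {
            struct shared_range *r;         /* illustrative tracking list */

            list_for_each_entry(r, &sev_shared_ranges, list)
                    kvm_sev_hypercall3(KVM_HC_PAGE_ENC_STATUS,
                                       r->base_gpa, r->npages,
                                       1 /* encrypted, per patch 08's ABI */);
    }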

On Mon, Feb 17, 2020 at 11:50 AM Ashish Kalra <ashish.kalra@amd.com> wrote:
>
> On Fri, Feb 14, 2020 at 10:58:46AM -0800, Andy Lutomirski wrote:
> > On Thu, Feb 13, 2020 at 3:09 PM Ashish Kalra <ashish.kalra@amd.com> wrote:
> > >
> > > On Wed, Feb 12, 2020 at 09:43:41PM -0800, Andy Lutomirski wrote:
> > > > On Wed, Feb 12, 2020 at 5:14 PM Ashish Kalra <Ashish.Kalra@amd.com> wrote:
> > > > >
> > > > > From: Ashish Kalra <ashish.kalra@amd.com>
> > > > >
> > > > > This patchset adds support for SEV Live Migration on KVM/QEMU.
> > > >
> > > > I skimmed this all and I don't see any description of how this all works.
> > > >
> > > > Does any of this address the mess in svm_register_enc_region()?  Right
> > > > now, when QEMU (or a QEMU alternative) wants to allocate some memory
> > > > to be used for guest encrypted pages, it mmap()s some memory and the
> > > > kernel does get_user_pages_fast() on it.  The pages are kept pinned
> > > > for the lifetime of the mapping.  This is not at all okay.  Let's see:
> > > >
> > > >  - The memory is pinned and it doesn't play well with the Linux memory
> > > > management code.  You just wrote a big patch set to migrate the pages
> > > > to a whole different machines, but we apparently can't even migrate
> > > > them to a different NUMA node or even just a different address.  And
> > > > good luck swapping it out.
> > > >
> > > >  - The memory is still mapped in the QEMU process, and that mapping is
> > > > incoherent with actual guest access to the memory.  It's nice that KVM
> > > > clflushes it so that, in principle, everything might actually work,
> > > > but this is gross.  We should not be exposing incoherent mappings to
> > > > userspace.
> > > >
> > > > Perhaps all this fancy infrastructure you're writing for migration and
> > > > all this new API surface could also teach the kernel how to migrate
> > > > pages from a guest *to the same guest* so we don't need to pin pages
> > > > forever.  And perhaps you could put some thought into how to improve
> > > > the API so that it doesn't involve nonsensical incoherent mappings.
> > >
> > > As a different key is used to encrypt memory in each VM, the hypervisor
> > > can't simply copy the ciphertext from one VM to another to migrate
> > > the VM.  Therefore, the AMD SEV Key Management API provides a new set
> > > of functions which the hypervisor can use to package a guest page for
> > > migration, while maintaining the confidentiality provided by AMD SEV.
> > >
> > > There is a new page encryption bitmap created in the kernel which
> > > keeps track of the encrypted/decrypted state of the guest's pages and this
> > > bitmap is updated by a new hypercall interface provided to the guest
> > > kernel and firmware.
> > >
> > > KVM_GET_PAGE_ENC_BITMAP ioctl can be used to get the guest page encryption
> > > bitmap. The bitmap can be used to check if the given guest page is
> > > private or shared.
> > >
> > > During the migration flow, the SEND_START is called on the source hypervisor
> > > to create an outgoing encryption context. The SEV guest policy dictates whether
> > > the certificate passed through the migrate-set-parameters command will be
> > > validated. SEND_UPDATE_DATA is called to encrypt the guest private pages.
> > > After migration is completed, SEND_FINISH is called to destroy the encryption
> > > context and make the VM non-runnable to protect it against cloning.
> > >
> > > On the target machine, RECEIVE_START is called first to create an
> > > incoming encryption context. The RECEIVE_UPDATE_DATA is called to copy
> > > the received encrypted page into guest memory. After migration has
> > > completed, RECEIVE_FINISH is called to make the VM runnable.
> > >
> >
> > Thanks!  This belongs somewhere in the patch set.
> >
> > You still haven't answered my questions about the existing coherency
> > issues and whether the same infrastructure can be used to migrate
> > guest pages within the same machine.
>
> Page Migration and Live Migration are separate features and one of my
> colleagues is currently working on making page migration possible and removing
> SEV Guest pinning requirements.
> >
> > Also, you're making guest-side and host-side changes.  What ensures
> > that you don't try to migrate a guest that doesn't support the
> > hypercall for encryption state tracking?
>
> This is a good question and it is still an open-ended one. There
> are two possibilities here: either the guest does not have any
> unencrypted pages (e.g. when booting a 32-bit kernel) and so it never
> makes any hypercalls, or the guest does not have support for the
> newer hypercall.
>
> In the first case, all the guest pages are then assumed to be
> encrypted and live migration happens as such.
>
> For the second case, we have been discussing this internally,
> and one option is to extend the KVM capabilities/feature bits to check for this?
>
> Thanks,
> Ashish

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 00/12] SEV Live Migration Patchset.
  2020-03-03  1:05         ` Steve Rutherford
@ 2020-03-03  4:42           ` Ashish Kalra
  0 siblings, 0 replies; 49+ messages in thread
From: Ashish Kalra @ 2020-03-03  4:42 UTC (permalink / raw)
  To: Steve Rutherford
  Cc: Andy Lutomirski, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Radim Krcmar, Joerg Roedel, Borislav Petkov,
	Tom Lendacky, David Rientjes, X86 ML, kvm list, LKML,
	Brijesh Singh

On Mon, Mar 02, 2020 at 05:05:29PM -0800, Steve Rutherford wrote:
> I think this patch series might be incompatible with kexec, but I'm
> not sure since I'm really not familiar with that subsystem. It seems
> like you need to explicitly mark all shared pages as encrypted again
> before rebooting into the new kernel. If you don't do this, and the
> newly booted kernel believes RAM is encrypted and has always been, you
> won't flip formerly decrypted shared device pages back to encrypted.
> For example, if the swiotlb moves, you need to tell KVM that the old
> swiotlb pages are now encrypted.
> 
> I could imagine this being handled implicitly by the way kexec handles
> freeing memory, but I would find this surprising. I only see
> enlightenment for handling SME's interaction with the initial image in
> the kexec implementation.
> 

I believe that there already is enlightenment for SEV handling in the
kexec implementation, but you are right that there is currently no
support for resetting the page encryption bitmap in kexec's SEV
handling code, similar to what we do for a guest reboot event.

This support probably needs to be added as part of the
arch_kexec_{post,pre}_{alloc,free}_pages() interface.
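
For reference, the existing x86 hooks look roughly like the below (from
arch/x86/kernel/machine_kexec_64.c, paraphrased from memory); with this
series, the set_memory_{de,en}crypted() calls would also issue the new
page encryption status hypercall, and the bitmap reset would slot in
alongside them:

int arch_kexec_post_alloc_pages(void *vaddr, unsigned int pages, gfp_t gfp)
{
	/*
	 * Kexec control pages are accessed unencrypted by the new
	 * kernel, so make them shared (decrypted) here ...
	 */
	return set_memory_decrypted((unsigned long)vaddr, pages);
}

void arch_kexec_pre_free_pages(void *vaddr, unsigned int pages)
{
	/* ... and flip them back to private before they are freed. */
	set_memory_encrypted((unsigned long)vaddr, pages);
}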

Thanks,
Ashish

> On Mon, Feb 17, 2020 at 11:50 AM Ashish Kalra <ashish.kalra@amd.com> wrote:
> >
> > On Fri, Feb 14, 2020 at 10:58:46AM -0800, Andy Lutomirski wrote:
> > > On Thu, Feb 13, 2020 at 3:09 PM Ashish Kalra <ashish.kalra@amd.com> wrote:
> > > >
> > > > On Wed, Feb 12, 2020 at 09:43:41PM -0800, Andy Lutomirski wrote:
> > > > > On Wed, Feb 12, 2020 at 5:14 PM Ashish Kalra <Ashish.Kalra@amd.com> wrote:
> > > > > >
> > > > > > From: Ashish Kalra <ashish.kalra@amd.com>
> > > > > >
> > > > > > This patchset adds support for SEV Live Migration on KVM/QEMU.
> > > > >
> > > > > I skimmed this all and I don't see any description of how this all works.
> > > > >
> > > > > Does any of this address the mess in svm_register_enc_region()?  Right
> > > > > now, when QEMU (or a QEMU alternative) wants to allocate some memory
> > > > > to be used for guest encrypted pages, it mmap()s some memory and the
> > > > > kernel does get_user_pages_fast() on it.  The pages are kept pinned
> > > > > for the lifetime of the mapping.  This is not at all okay.  Let's see:
> > > > >
> > > > >  - The memory is pinned and it doesn't play well with the Linux memory
> > > > > management code.  You just wrote a big patch set to migrate the pages
> > > > > to a whole different machines, but we apparently can't even migrate
> > > > > them to a different NUMA node or even just a different address.  And
> > > > > good luck swapping it out.
> > > > >
> > > > >  - The memory is still mapped in the QEMU process, and that mapping is
> > > > > incoherent with actual guest access to the memory.  It's nice that KVM
> > > > > clflushes it so that, in principle, everything might actually work,
> > > > > but this is gross.  We should not be exposing incoherent mappings to
> > > > > userspace.
> > > > >
> > > > > Perhaps all this fancy infrastructure you're writing for migration and
> > > > > all this new API surface could also teach the kernel how to migrate
> > > > > pages from a guest *to the same guest* so we don't need to pin pages
> > > > > forever.  And perhaps you could put some thought into how to improve
> > > > > the API so that it doesn't involve nonsensical incoherent mappings.
> > > >
> > > > As a different key is used to encrypt memory in each VM, the hypervisor
> > > > can't simply copy the ciphertext from one VM to another to migrate
> > > > the VM.  Therefore, the AMD SEV Key Management API provides a new set
> > > > of functions which the hypervisor can use to package a guest page for
> > > > migration, while maintaining the confidentiality provided by AMD SEV.
> > > >
> > > > There is a new page encryption bitmap created in the kernel which
> > > > keeps track of the encrypted/decrypted state of the guest's pages and this
> > > > bitmap is updated by a new hypercall interface provided to the guest
> > > > kernel and firmware.
> > > >
> > > > KVM_GET_PAGE_ENC_BITMAP ioctl can be used to get the guest page encryption
> > > > bitmap. The bitmap can be used to check if the given guest page is
> > > > private or shared.
> > > >
> > > > During the migration flow, the SEND_START is called on the source hypervisor
> > > > to create an outgoing encryption context. The SEV guest policy dictates whether
> > > > the certificate passed through the migrate-set-parameters command will be
> > > > validated. SEND_UPDATE_DATA is called to encrypt the guest private pages.
> > > > After migration is completed, SEND_FINISH is called to destroy the encryption
> > > > context and make the VM non-runnable to protect it against cloning.
> > > >
> > > > On the target machine, RECEIVE_START is called first to create an
> > > > incoming encryption context. The RECEIVE_UPDATE_DATA is called to copy
> > > > the received encrypted page into guest memory. After migration has
> > > > completed, RECEIVE_FINISH is called to make the VM runnable.
> > > >
> > >
> > > Thanks!  This belongs somewhere in the patch set.
> > >
> > > You still haven't answered my questions about the existing coherency
> > > issues and whether the same infrastructure can be used to migrate
> > > guest pages within the same machine.
> >
> > Page Migration and Live Migration are separate features and one of my
> > colleagues is currently working on making page migration possible and removing
> > SEV Guest pinning requirements.
> > >
> > > Also, you're making guest-side and host-side changes.  What ensures
> > > that you don't try to migrate a guest that doesn't support the
> > > hypercall for encryption state tracking?
> >
> > This is a good question and it is still an open-ended one. There
> > are two possibilities here: either the guest does not have any
> > unencrypted pages (e.g. when booting a 32-bit kernel) and so it never
> > makes any hypercalls, or the guest does not have support for the
> > newer hypercall.
> >
> > In the first case, all the guest pages are then assumed to be
> > encrypted and live migration happens as such.
> >
> > For the second case, we have been discussing this internally,
> > and one option is to extend the KVM capabilities/feature bits to check for this?
> >
> > Thanks,
> > Ashish

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 01/12] KVM: SVM: Add KVM_SEV SEND_START command
  2020-02-13  1:14 ` [PATCH 01/12] KVM: SVM: Add KVM_SEV SEND_START command Ashish Kalra
@ 2020-03-09 21:28   ` Steve Rutherford
  2020-03-10  0:19     ` Steve Rutherford
  0 siblings, 1 reply; 49+ messages in thread
From: Steve Rutherford @ 2020-03-09 21:28 UTC (permalink / raw)
  To: Ashish Kalra
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Radim Krčmář,
	Joerg Roedel, Borislav Petkov, Tom Lendacky, David Rientjes,
	X86 ML, KVM list, LKML

On Wed, Feb 12, 2020 at 5:15 PM Ashish Kalra <Ashish.Kalra@amd.com> wrote:
>
> From: Brijesh Singh <brijesh.singh@amd.com>
>
> The command is used to create an outgoing SEV guest encryption context.
>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: "Radim Krčmář" <rkrcmar@redhat.com>
> Cc: Joerg Roedel <joro@8bytes.org>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Tom Lendacky <thomas.lendacky@amd.com>
> Cc: x86@kernel.org
> Cc: kvm@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
> ---
>  .../virt/kvm/amd-memory-encryption.rst        |  27 ++++
>  arch/x86/kvm/svm.c                            | 125 ++++++++++++++++++
>  include/linux/psp-sev.h                       |   8 +-
>  include/uapi/linux/kvm.h                      |  12 ++
>  4 files changed, 168 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/virt/kvm/amd-memory-encryption.rst b/Documentation/virt/kvm/amd-memory-encryption.rst
> index d18c97b4e140..826911f41f3b 100644
> --- a/Documentation/virt/kvm/amd-memory-encryption.rst
> +++ b/Documentation/virt/kvm/amd-memory-encryption.rst
> @@ -238,6 +238,33 @@ Returns: 0 on success, -negative on error
>                  __u32 trans_len;
>          };
>
> +10. KVM_SEV_SEND_START
> +----------------------
> +
> +The KVM_SEV_SEND_START command can be used by the hypervisor to create an
> +outgoing guest encryption context.
> +
> +Parameters (in): struct kvm_sev_send_start
> +
> +Returns: 0 on success, -negative on error
> +
> +::
> +        struct kvm_sev_send_start {
> +                __u32 policy;                 /* guest policy */
> +
> +                __u64 pdh_cert_uaddr;         /* platform Diffie-Hellman certificate */
> +                __u32 pdh_cert_len;
> +
> +                __u64 plat_certs_uaddr;       /* platform certificate chain */
> +                __u32 plat_certs_len;
> +
> +                __u64 amd_certs_uaddr;        /* AMD certificate */
> +                __u32 amd_certs_len;
> +
> +                __u64 session_uaddr;          /* Guest session information */
> +                __u32 session_len;
> +        };
> +
>  References
>  ==========
>
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index a3e32d61d60c..3a7e2cac51de 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -7140,6 +7140,128 @@ static int sev_launch_secret(struct kvm *kvm, struct kvm_sev_cmd *argp)
>         return ret;
>  }
>
> +/* Userspace wants to query session length. */
> +static int
> +__sev_send_start_query_session_length(struct kvm *kvm, struct kvm_sev_cmd *argp,
> +                                     struct kvm_sev_send_start *params)
> +{
> +       struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
> +       struct sev_data_send_start *data;
> +       int ret;
> +
> +       data = kzalloc(sizeof(*data), GFP_KERNEL_ACCOUNT);
> +       if (data == NULL)
> +               return -ENOMEM;
> +
> +       data->handle = sev->handle;
> +       ret = sev_issue_cmd(kvm, SEV_CMD_SEND_START, data, &argp->error);
> +
> +       params->session_len = data->session_len;
> +       if (copy_to_user((void __user *)(uintptr_t)argp->data, params,
> +                               sizeof(struct kvm_sev_send_start)))
> +               ret = -EFAULT;
> +
> +       kfree(data);
> +       return ret;
> +}
> +
> +static int sev_send_start(struct kvm *kvm, struct kvm_sev_cmd *argp)
> +{
> +       struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
> +       struct sev_data_send_start *data;
> +       struct kvm_sev_send_start params;
> +       void *amd_certs, *session_data;
> +       void *pdh_cert, *plat_certs;
> +       int ret;
> +
> +       if (!sev_guest(kvm))
> +               return -ENOTTY;
> +
> +       if (copy_from_user(&params, (void __user *)(uintptr_t)argp->data,
> +                               sizeof(struct kvm_sev_send_start)))
> +               return -EFAULT;
> +
> +       /* if session_len is zero, userspace wants t query the session length */

/t/to/
>
> +       if (!params.session_len)
> +               return __sev_send_start_query_session_length(kvm, argp,
> +                               &params);
Document this behavior with the command.
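
For illustration, with the uapi additions from this patch the two-call
pattern userspace ends up with looks roughly like this (sketch only;
vm_fd and sev_fd come from the usual KVM/SEV setup, error handling
trimmed):

#include <stdint.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

static int do_send_start(int vm_fd, int sev_fd)
{
	struct kvm_sev_send_start params = {};	/* session_len == 0 */
	struct kvm_sev_cmd cmd = {
		.id = KVM_SEV_SEND_START,
		.data = (uint64_t)(uintptr_t)&params,
		.sev_fd = sev_fd,
	};

	/*
	 * Query call: with session_len zero, the kernel only reports the
	 * required session blob length. The ioctl may "fail" with a
	 * firmware length error while still filling in session_len.
	 */
	ioctl(vm_fd, KVM_MEMORY_ENCRYPT_OP, &cmd);
	if (!params.session_len)
		return -1;

	params.session_uaddr = (uint64_t)(uintptr_t)malloc(params.session_len);
	/* ... fill in policy and the pdh/plat/amd cert fields here ... */

	/* Real call: actually creates the outgoing context. */
	return ioctl(vm_fd, KVM_MEMORY_ENCRYPT_OP, &cmd);
}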

> +
> +       /* some sanity checks */
> +       if (!params.pdh_cert_uaddr || !params.pdh_cert_len ||
> +           !params.session_uaddr || params.session_len > SEV_FW_BLOB_MAX_SIZE)
> +               return -EINVAL;
> +
> +       /* allocate the memory to hold the session data blob */
> +       session_data = kmalloc(params.session_len, GFP_KERNEL_ACCOUNT);
> +       if (!session_data)
> +               return -ENOMEM;
> +
> +       /* copy the certificate blobs from userspace */
> +       pdh_cert = psp_copy_user_blob(params.pdh_cert_uaddr,
> +                               params.pdh_cert_len);
> +       if (IS_ERR(pdh_cert)) {
> +               ret = PTR_ERR(pdh_cert);
> +               goto e_free_session;
> +       }
> +
> +       plat_certs = psp_copy_user_blob(params.plat_certs_uaddr,
> +                               params.plat_certs_len);
> +       if (IS_ERR(plat_certs)) {
> +               ret = PTR_ERR(plat_certs);
> +               goto e_free_pdh;
> +       }
> +
> +       amd_certs = psp_copy_user_blob(params.amd_certs_uaddr,
> +                               params.amd_certs_len);
> +       if (IS_ERR(amd_certs)) {
> +               ret = PTR_ERR(amd_certs);
> +               goto e_free_plat_cert;
> +       }
> +
> +       data = kzalloc(sizeof(*data), GFP_KERNEL_ACCOUNT);
> +       if (data == NULL) {
> +               ret = -ENOMEM;
> +               goto e_free_amd_cert;
> +       }
> +
> +       /* populate the FW SEND_START field with system physical address */
> +       data->pdh_cert_address = __psp_pa(pdh_cert);
> +       data->pdh_cert_len = params.pdh_cert_len;
> +       data->plat_certs_address = __psp_pa(plat_certs);
> +       data->plat_certs_len = params.plat_certs_len;
> +       data->amd_certs_address = __psp_pa(amd_certs);
> +       data->amd_certs_len = params.amd_certs_len;
> +       data->session_address = __psp_pa(session_data);
> +       data->session_len = params.session_len;
> +       data->handle = sev->handle;
> +
> +       ret = sev_issue_cmd(kvm, SEV_CMD_SEND_START, data, &argp->error);
sev_issue_cmd can fail. I think you want to handle those errors here
(e.g. it can return -EBADF or a number of others). Right now they
could get clobbered by a later copy_to_user error.

It's also worth documenting what the error argp->error is filled in
with. I didn't see anything in the docs mentioning the status codes
(may have missed it).
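
Something along these lines (sketch only):

	ret = sev_issue_cmd(kvm, SEV_CMD_SEND_START, data, &argp->error);
	if (ret)
		goto e_free;	/* don't let copy_to_user() clobber ret */

	if (copy_to_user((void __user *)(uintptr_t) params.session_uaddr,
			session_data, params.session_len)) {
		ret = -EFAULT;
		goto e_free;
	}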

> +
> +       if (copy_to_user((void __user *)(uintptr_t) params.session_uaddr,
> +                       session_data, params.session_len)) {
> +               ret = -EFAULT;
> +               goto e_free;
> +       }
> +
> +       params.policy = data->policy;
> +       params.session_len = data->session_len;
> +       if (copy_to_user((void __user *)(uintptr_t)argp->data, &params,
> +                               sizeof(struct kvm_sev_send_start)))
> +               ret = -EFAULT;
> +
> +e_free:
> +       kfree(data);
> +e_free_amd_cert:
> +       kfree(amd_certs);
> +e_free_plat_cert:
> +       kfree(plat_certs);
> +e_free_pdh:
> +       kfree(pdh_cert);
> +e_free_session:
> +       kfree(session_data);
> +       return ret;
> +}
> +
>  static int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
>  {
>         struct kvm_sev_cmd sev_cmd;
> @@ -7181,6 +7303,9 @@ static int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
>         case KVM_SEV_LAUNCH_SECRET:
>                 r = sev_launch_secret(kvm, &sev_cmd);
>                 break;
> +       case KVM_SEV_SEND_START:
> +               r = sev_send_start(kvm, &sev_cmd);
> +               break;
>         default:
>                 r = -EINVAL;
>                 goto out;
> diff --git a/include/linux/psp-sev.h b/include/linux/psp-sev.h
> index 5167bf2bfc75..9f63b9d48b63 100644
> --- a/include/linux/psp-sev.h
> +++ b/include/linux/psp-sev.h
> @@ -323,11 +323,11 @@ struct sev_data_send_start {
>         u64 pdh_cert_address;                   /* In */
>         u32 pdh_cert_len;                       /* In */
>         u32 reserved1;
> -       u64 plat_cert_address;                  /* In */
> -       u32 plat_cert_len;                      /* In */
> +       u64 plat_certs_address;                 /* In */
> +       u32 plat_certs_len;                     /* In */
>         u32 reserved2;
> -       u64 amd_cert_address;                   /* In */
> -       u32 amd_cert_len;                       /* In */
> +       u64 amd_certs_address;                  /* In */
> +       u32 amd_certs_len;                      /* In */
>         u32 reserved3;
>         u64 session_address;                    /* In */
>         u32 session_len;                        /* In/Out */
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 4b95f9a31a2f..17bef4c245e1 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -1558,6 +1558,18 @@ struct kvm_sev_dbg {
>         __u32 len;
>  };
>
> +struct kvm_sev_send_start {
> +       __u32 policy;
> +       __u64 pdh_cert_uaddr;
> +       __u32 pdh_cert_len;
> +       __u64 plat_certs_uaddr;
> +       __u32 plat_certs_len;
> +       __u64 amd_certs_uaddr;
> +       __u32 amd_certs_len;
> +       __u64 session_uaddr;
> +       __u32 session_len;
> +};
> +
>  #define KVM_DEV_ASSIGN_ENABLE_IOMMU    (1 << 0)
>  #define KVM_DEV_ASSIGN_PCI_2_3         (1 << 1)
>  #define KVM_DEV_ASSIGN_MASK_INTX       (1 << 2)
> --
> 2.17.1
>

Looks pretty reasonable overall.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 01/12] KVM: SVM: Add KVM_SEV SEND_START command
  2020-03-09 21:28   ` Steve Rutherford
@ 2020-03-10  0:19     ` Steve Rutherford
  0 siblings, 0 replies; 49+ messages in thread
From: Steve Rutherford @ 2020-03-10  0:19 UTC (permalink / raw)
  To: Ashish Kalra
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Radim Krčmář,
	Joerg Roedel, Borislav Petkov, Tom Lendacky, David Rientjes,
	X86 ML, KVM list, LKML

On Mon, Mar 9, 2020 at 2:28 PM Steve Rutherford <srutherford@google.com> wrote:
>
> On Wed, Feb 12, 2020 at 5:15 PM Ashish Kalra <Ashish.Kalra@amd.com> wrote:
> >
> > From: Brijesh Singh <brijesh.singh@amd.com>
> >
> > The command is used to create an outgoing SEV guest encryption context.
> >
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Ingo Molnar <mingo@redhat.com>
> > Cc: "H. Peter Anvin" <hpa@zytor.com>
> > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > Cc: "Radim Krčmář" <rkrcmar@redhat.com>
> > Cc: Joerg Roedel <joro@8bytes.org>
> > Cc: Borislav Petkov <bp@suse.de>
> > Cc: Tom Lendacky <thomas.lendacky@amd.com>
> > Cc: x86@kernel.org
> > Cc: kvm@vger.kernel.org
> > Cc: linux-kernel@vger.kernel.org
> > Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> > Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
> > ---
> >  .../virt/kvm/amd-memory-encryption.rst        |  27 ++++
> >  arch/x86/kvm/svm.c                            | 125 ++++++++++++++++++
> >  include/linux/psp-sev.h                       |   8 +-
> >  include/uapi/linux/kvm.h                      |  12 ++
> >  4 files changed, 168 insertions(+), 4 deletions(-)
> >
> > diff --git a/Documentation/virt/kvm/amd-memory-encryption.rst b/Documentation/virt/kvm/amd-memory-encryption.rst
> > index d18c97b4e140..826911f41f3b 100644
> > --- a/Documentation/virt/kvm/amd-memory-encryption.rst
> > +++ b/Documentation/virt/kvm/amd-memory-encryption.rst
> > @@ -238,6 +238,33 @@ Returns: 0 on success, -negative on error
> >                  __u32 trans_len;
> >          };
> >
> > +10. KVM_SEV_SEND_START
> > +----------------------
> > +
> > +The KVM_SEV_SEND_START command can be used by the hypervisor to create an
> > +outgoing guest encryption context.
> > +
> > +Parameters (in): struct kvm_sev_send_start
> > +
> > +Returns: 0 on success, -negative on error
> > +
> > +::
> > +        struct kvm_sev_send_start {
> > +                __u32 policy;                 /* guest policy */
> > +
> > +                __u64 pdh_cert_uaddr;         /* platform Diffie-Hellman certificate */
> > +                __u32 pdh_cert_len;
> > +
> > +                __u64 plat_certs_uaddr;       /* platform certificate chain */
> > +                __u32 plat_certs_len;
> > +
> > +                __u64 amd_certs_uaddr;        /* AMD certificate */
> > +                __u32 amd_certs_len;
> > +
> > +                __u64 session_uaddr;          /* Guest session information */
> > +                __u32 session_len;
> > +        };
> > +
> >  References
> >  ==========
> >
> > diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> > index a3e32d61d60c..3a7e2cac51de 100644
> > --- a/arch/x86/kvm/svm.c
> > +++ b/arch/x86/kvm/svm.c
> > @@ -7140,6 +7140,128 @@ static int sev_launch_secret(struct kvm *kvm, struct kvm_sev_cmd *argp)
> >         return ret;
> >  }
> >
> > +/* Userspace wants to query session length. */
> > +static int
> > +__sev_send_start_query_session_length(struct kvm *kvm, struct kvm_sev_cmd *argp,
> > +                                     struct kvm_sev_send_start *params)
> > +{
> > +       struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
> > +       struct sev_data_send_start *data;
> > +       int ret;
> > +
> > +       data = kzalloc(sizeof(*data), GFP_KERNEL_ACCOUNT);
> > +       if (data == NULL)
> > +               return -ENOMEM;
> > +
> > +       data->handle = sev->handle;
> > +       ret = sev_issue_cmd(kvm, SEV_CMD_SEND_START, data, &argp->error);
> > +
> > +       params->session_len = data->session_len;
> > +       if (copy_to_user((void __user *)(uintptr_t)argp->data, params,
> > +                               sizeof(struct kvm_sev_send_start)))
> > +               ret = -EFAULT;
> > +
> > +       kfree(data);
> > +       return ret;
> > +}
> > +
> > +static int sev_send_start(struct kvm *kvm, struct kvm_sev_cmd *argp)
> > +{
> > +       struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
> > +       struct sev_data_send_start *data;
> > +       struct kvm_sev_send_start params;
> > +       void *amd_certs, *session_data;
> > +       void *pdh_cert, *plat_certs;
> > +       int ret;
> > +
> > +       if (!sev_guest(kvm))
> > +               return -ENOTTY;
> > +
> > +       if (copy_from_user(&params, (void __user *)(uintptr_t)argp->data,
> > +                               sizeof(struct kvm_sev_send_start)))
> > +               return -EFAULT;
> > +
> > +       /* if session_len is zero, userspace wants t query the session length */
>
> /t/to/
> >
> > +       if (!params.session_len)
> > +               return __sev_send_start_query_session_length(kvm, argp,
> > +                               &params);
> Document this behavior with the command.
>
> > +
> > +       /* some sanity checks */
> > +       if (!params.pdh_cert_uaddr || !params.pdh_cert_len ||
> > +           !params.session_uaddr || params.session_len > SEV_FW_BLOB_MAX_SIZE)
> > +               return -EINVAL;
> > +
> > +       /* allocate the memory to hold the session data blob */
> > +       session_data = kmalloc(params.session_len, GFP_KERNEL_ACCOUNT);
> > +       if (!session_data)
> > +               return -ENOMEM;
> > +
> > +       /* copy the certificate blobs from userspace */
> > +       pdh_cert = psp_copy_user_blob(params.pdh_cert_uaddr,
> > +                               params.pdh_cert_len);
> > +       if (IS_ERR(pdh_cert)) {
> > +               ret = PTR_ERR(pdh_cert);
> > +               goto e_free_session;
> > +       }
> > +
> > +       plat_certs = psp_copy_user_blob(params.plat_certs_uaddr,
> > +                               params.plat_certs_len);
> > +       if (IS_ERR(plat_certs)) {
> > +               ret = PTR_ERR(plat_certs);
> > +               goto e_free_pdh;
> > +       }
> > +
> > +       amd_certs = psp_copy_user_blob(params.amd_certs_uaddr,
> > +                               params.amd_certs_len);
> > +       if (IS_ERR(amd_certs)) {
> > +               ret = PTR_ERR(amd_certs);
> > +               goto e_free_plat_cert;
> > +       }
> > +
> > +       data = kzalloc(sizeof(*data), GFP_KERNEL_ACCOUNT);
> > +       if (data == NULL) {
> > +               ret = -ENOMEM;
> > +               goto e_free_amd_cert;
> > +       }
> > +
> > +       /* populate the FW SEND_START field with system physical address */
> > +       data->pdh_cert_address = __psp_pa(pdh_cert);
> > +       data->pdh_cert_len = params.pdh_cert_len;
> > +       data->plat_certs_address = __psp_pa(plat_certs);
> > +       data->plat_certs_len = params.plat_certs_len;
> > +       data->amd_certs_address = __psp_pa(amd_certs);
> > +       data->amd_certs_len = params.amd_certs_len;
> > +       data->session_address = __psp_pa(session_data);
> > +       data->session_len = params.session_len;
> > +       data->handle = sev->handle;
> > +
> > +       ret = sev_issue_cmd(kvm, SEV_CMD_SEND_START, data, &argp->error);
> sev_issue_cmd can fail. I think you want to handle those errors here
> (e.g. it can return -EBADF or a number of others). Right now they
> could get clobbered by a later copy_to_user error.
>
> It's also worth documenting what the error argp->error is filled in
> with. I didn't see anything in the docs mentioning the status codes
> (may have missed it).
>
> > +
> > +       if (copy_to_user((void __user *)(uintptr_t) params.session_uaddr,
> > +                       session_data, params.session_len)) {
> > +               ret = -EFAULT;
> > +               goto e_free;
> > +       }

One additional aspect, which also comes up for other commands, is that
it's not clear whether the command succeeded when you get back -EFAULT. Any
of the copy_to_users could have failed, on either side of the
sev_issue_cmd call.

If userspace filled in the error with a reserved value (e.g.
0xFFFF0000, which is larger than the largest possible error code), it
could observe that that value was clobbered and infer that
sev_issue_cmd succeeded/failed/etc. This is particularly error prone
since the success code is zero, which is almost certainly what people
will initialize the error field as, unless they go out of their way.

I think the cleanest answer would be to write in a reserved value to
the error at the start of sev_send_* and have sev_issue_command
clobber that value with the expected value. This way, userspace can
know which GSTATE the guest transitioned to, even if it sees -EFAULT.
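
Concretely, something like this sketch, where SEV_RET_NO_FW_CALL is a
made-up name for the reserved value:

#define SEV_RET_NO_FW_CALL	0xffff0000	/* hypothetical sentinel */

static int sev_send_start(struct kvm *kvm, struct kvm_sev_cmd *argp)
{
	/*
	 * Mark "firmware was never reached" up front; sev_issue_cmd()
	 * overwrites this with the real firmware status, so even if the
	 * ioctl later returns -EFAULT, userspace can tell from
	 * argp->error whether the guest actually changed GSTATE.
	 */
	argp->error = SEV_RET_NO_FW_CALL;

	/* ... rest of the handler as in the patch ... */
}
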
> > +
> > +       params.policy = data->policy;
> > +       params.session_len = data->session_len;
> > +       if (copy_to_user((void __user *)(uintptr_t)argp->data, &params,
> > +                               sizeof(struct kvm_sev_send_start)))
> > +               ret = -EFAULT;
> > +
> > +e_free:
> > +       kfree(data);
> > +e_free_amd_cert:
> > +       kfree(amd_certs);
> > +e_free_plat_cert:
> > +       kfree(plat_certs);
> > +e_free_pdh:
> > +       kfree(pdh_cert);
> > +e_free_session:
> > +       kfree(session_data);
> > +       return ret;
> > +}
> > +
> >  static int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
> >  {
> >         struct kvm_sev_cmd sev_cmd;
> > @@ -7181,6 +7303,9 @@ static int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
> >         case KVM_SEV_LAUNCH_SECRET:
> >                 r = sev_launch_secret(kvm, &sev_cmd);
> >                 break;
> > +       case KVM_SEV_SEND_START:
> > +               r = sev_send_start(kvm, &sev_cmd);
> > +               break;
> >         default:
> >                 r = -EINVAL;
> >                 goto out;
> > diff --git a/include/linux/psp-sev.h b/include/linux/psp-sev.h
> > index 5167bf2bfc75..9f63b9d48b63 100644
> > --- a/include/linux/psp-sev.h
> > +++ b/include/linux/psp-sev.h
> > @@ -323,11 +323,11 @@ struct sev_data_send_start {
> >         u64 pdh_cert_address;                   /* In */
> >         u32 pdh_cert_len;                       /* In */
> >         u32 reserved1;
> > -       u64 plat_cert_address;                  /* In */
> > -       u32 plat_cert_len;                      /* In */
> > +       u64 plat_certs_address;                 /* In */
> > +       u32 plat_certs_len;                     /* In */
> >         u32 reserved2;
> > -       u64 amd_cert_address;                   /* In */
> > -       u32 amd_cert_len;                       /* In */
> > +       u64 amd_certs_address;                  /* In */
> > +       u32 amd_certs_len;                      /* In */
> >         u32 reserved3;
> >         u64 session_address;                    /* In */
> >         u32 session_len;                        /* In/Out */
> > diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> > index 4b95f9a31a2f..17bef4c245e1 100644
> > --- a/include/uapi/linux/kvm.h
> > +++ b/include/uapi/linux/kvm.h
> > @@ -1558,6 +1558,18 @@ struct kvm_sev_dbg {
> >         __u32 len;
> >  };
> >
> > +struct kvm_sev_send_start {
> > +       __u32 policy;
> > +       __u64 pdh_cert_uaddr;
> > +       __u32 pdh_cert_len;
> > +       __u64 plat_certs_uaddr;
> > +       __u32 plat_certs_len;
> > +       __u64 amd_certs_uaddr;
> > +       __u32 amd_certs_len;
> > +       __u64 session_uaddr;
> > +       __u32 session_len;
> > +};
> > +
> >  #define KVM_DEV_ASSIGN_ENABLE_IOMMU    (1 << 0)
> >  #define KVM_DEV_ASSIGN_PCI_2_3         (1 << 1)
> >  #define KVM_DEV_ASSIGN_MASK_INTX       (1 << 2)
> > --
> > 2.17.1
> >
>
> Looks pretty reasonable overall.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 02/12] KVM: SVM: Add KVM_SEND_UPDATE_DATA command
  2020-02-13  1:15 ` [PATCH 02/12] KVM: SVM: Add KVM_SEND_UPDATE_DATA command Ashish Kalra
@ 2020-03-10  1:04   ` Steve Rutherford
  2020-03-12  1:49     ` Ashish Kalra
  0 siblings, 1 reply; 49+ messages in thread
From: Steve Rutherford @ 2020-03-10  1:04 UTC (permalink / raw)
  To: Ashish Kalra
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Joerg Roedel, Borislav Petkov, Tom Lendacky, David Rientjes,
	X86 ML, KVM list, LKML

On Wed, Feb 12, 2020 at 5:15 PM Ashish Kalra <Ashish.Kalra@amd.com> wrote:
>
> From: Brijesh Singh <brijesh.singh@amd.com>
>
> The command is used for encrypting the guest memory region using the encryption
> context created with KVM_SEV_SEND_START.
>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: "Radim Krčmář" <rkrcmar@redhat.com>
> Cc: Joerg Roedel <joro@8bytes.org>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Tom Lendacky <thomas.lendacky@amd.com>
> Cc: x86@kernel.org
> Cc: kvm@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
> ---
>  .../virt/kvm/amd-memory-encryption.rst        |  24 ++++
>  arch/x86/kvm/svm.c                            | 136 +++++++++++++++++-
>  include/uapi/linux/kvm.h                      |   9 ++
>  3 files changed, 165 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/virt/kvm/amd-memory-encryption.rst b/Documentation/virt/kvm/amd-memory-encryption.rst
> index 826911f41f3b..0f1c3860360f 100644
> --- a/Documentation/virt/kvm/amd-memory-encryption.rst
> +++ b/Documentation/virt/kvm/amd-memory-encryption.rst
> @@ -265,6 +265,30 @@ Returns: 0 on success, -negative on error
>                  __u32 session_len;
>          };
>
> +11. KVM_SEV_SEND_UPDATE_DATA
> +----------------------------
> +
> +The KVM_SEV_SEND_UPDATE_DATA command can be used by the hypervisor to encrypt the
> +outgoing guest memory region with the encryption context created using
> +KVM_SEV_SEND_START.
> +
> +Parameters (in): struct kvm_sev_send_update_data
> +
> +Returns: 0 on success, -negative on error
> +
> +::
> +
> +        struct kvm_sev_send_update_data {
> +                __u64 hdr_uaddr;        /* userspace address containing the packet header */
> +                __u32 hdr_len;
> +
> +                __u64 guest_uaddr;      /* the source memory region to be encrypted */
> +                __u32 guest_len;
> +
> +                __u64 trans_uaddr;      /* the destination memory region */
> +                __u32 trans_len;
> +        };
> +
>  References
>  ==========
>
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index 3a7e2cac51de..ae97f774e979 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -426,6 +426,7 @@ static DECLARE_RWSEM(sev_deactivate_lock);
>  static DEFINE_MUTEX(sev_bitmap_lock);
>  static unsigned int max_sev_asid;
>  static unsigned int min_sev_asid;
> +static unsigned long sev_me_mask;
>  static unsigned long *sev_asid_bitmap;
>  static unsigned long *sev_reclaim_asid_bitmap;
>  #define __sme_page_pa(x) __sme_set(page_to_pfn(x) << PAGE_SHIFT)
> @@ -1231,16 +1232,22 @@ static int avic_ga_log_notifier(u32 ga_tag)
>  static __init int sev_hardware_setup(void)
>  {
>         struct sev_user_data_status *status;
> +       int eax, ebx;
>         int rc;
>
> -       /* Maximum number of encrypted guests supported simultaneously */
> -       max_sev_asid = cpuid_ecx(0x8000001F);
> +       /*
> +        * Query the memory encryption information.
> +        *  EBX:  Bit 0:5 Pagetable bit position used to indicate encryption
> +        *  (aka Cbit).
> +        *  ECX:  Maximum number of encrypted guests supported simultaneously.
> +        *  EDX:  Minimum ASID value that should be used for SEV guest.
> +        */
> +       cpuid(0x8000001f, &eax, &ebx, &max_sev_asid, &min_sev_asid);
>
>         if (!max_sev_asid)
>                 return 1;
>
> -       /* Minimum ASID value that should be used for SEV guest */
> -       min_sev_asid = cpuid_edx(0x8000001F);
> +       sev_me_mask = 1UL << (ebx & 0x3f);
>
>         /* Initialize SEV ASID bitmaps */
>         sev_asid_bitmap = bitmap_zalloc(max_sev_asid, GFP_KERNEL);
> @@ -7262,6 +7269,124 @@ static int sev_send_start(struct kvm *kvm, struct kvm_sev_cmd *argp)
>         return ret;
>  }
>
> +/* Userspace wants to query either header or trans length. */
> +static int
> +__sev_send_update_data_query_lengths(struct kvm *kvm, struct kvm_sev_cmd *argp,
> +                                    struct kvm_sev_send_update_data *params)
> +{
> +       struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
> +       struct sev_data_send_update_data *data;
> +       int ret;
> +
> +       data = kzalloc(sizeof(*data), GFP_KERNEL_ACCOUNT);
> +       if (!data)
> +               return -ENOMEM;
> +
> +       data->handle = sev->handle;
> +       ret = sev_issue_cmd(kvm, SEV_CMD_SEND_UPDATE_DATA, data, &argp->error);
> +
> +       params->hdr_len = data->hdr_len;
> +       params->trans_len = data->trans_len;
> +
> +       if (copy_to_user((void __user *)(uintptr_t)argp->data, params,
> +                        sizeof(struct kvm_sev_send_update_data)))
> +               ret = -EFAULT;
> +
> +       kfree(data);
> +       return ret;
> +}
> +
> +static int sev_send_update_data(struct kvm *kvm, struct kvm_sev_cmd *argp)
> +{
> +       struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
> +       struct sev_data_send_update_data *data;
> +       struct kvm_sev_send_update_data params;
> +       void *hdr, *trans_data;
> +       struct page **guest_page;
> +       unsigned long n;
> +       int ret, offset;
> +
> +       if (!sev_guest(kvm))
> +               return -ENOTTY;
> +
> +       if (copy_from_user(&params, (void __user *)(uintptr_t)argp->data,
> +                       sizeof(struct kvm_sev_send_update_data)))
> +               return -EFAULT;
> +
> +       /* userspace wants to query either header or trans length */
> +       if (!params.trans_len || !params.hdr_len)
> +               return __sev_send_update_data_query_lengths(kvm, argp, &params);
> +
> +       if (!params.trans_uaddr || !params.guest_uaddr ||
> +           !params.guest_len || !params.hdr_uaddr)
> +               return -EINVAL;
> +
> +
> +       /* Check if we are crossing the page boundary */
> +       offset = params.guest_uaddr & (PAGE_SIZE - 1);
> +       if ((params.guest_len + offset > PAGE_SIZE))
> +               return -EINVAL;
> +
> +       /* Pin guest memory */
> +       guest_page = sev_pin_memory(kvm, params.guest_uaddr & PAGE_MASK,
> +                                   PAGE_SIZE, &n, 0);
> +       if (!guest_page)
> +               return -EFAULT;
> +
> +       /* allocate memory for header and transport buffer */
> +       ret = -ENOMEM;
> +       hdr = kmalloc(params.hdr_len, GFP_KERNEL_ACCOUNT);
> +       if (!hdr)
> +               goto e_unpin;
> +
> +       trans_data = kmalloc(params.trans_len, GFP_KERNEL_ACCOUNT);
> +       if (!trans_data)
> +               goto e_free_hdr;
> +
> +       data = kzalloc(sizeof(*data), GFP_KERNEL);
> +       if (!data)
> +               goto e_free_trans_data;
> +
> +       data->hdr_address = __psp_pa(hdr);
> +       data->hdr_len = params.hdr_len;
> +       data->trans_address = __psp_pa(trans_data);
> +       data->trans_len = params.trans_len;
> +
> +       /* The SEND_UPDATE_DATA command requires C-bit to be always set. */
> +       data->guest_address = (page_to_pfn(guest_page[0]) << PAGE_SHIFT) +
> +                               offset;
> +       data->guest_address |= sev_me_mask;
> +       data->guest_len = params.guest_len;
> +       data->handle = sev->handle;
> +
> +       ret = sev_issue_cmd(kvm, SEV_CMD_SEND_UPDATE_DATA, data, &argp->error);
> +
> +       if (ret)
> +               goto e_free;
> +
> +       /* copy transport buffer to user space */
> +       if (copy_to_user((void __user *)(uintptr_t)params.trans_uaddr,
> +                        trans_data, params.trans_len)) {
> +               ret = -EFAULT;
> +               goto e_unpin;
> +       }
> +
> +       /* Copy packet header to userspace. */
> +       ret = copy_to_user((void __user *)(uintptr_t)params.hdr_uaddr, hdr,
> +                               params.hdr_len);
> +
> +e_free:
> +       kfree(data);
> +e_free_trans_data:
> +       kfree(trans_data);
> +e_free_hdr:
> +       kfree(hdr);
> +e_unpin:
> +       sev_unpin_memory(kvm, guest_page, n);
> +
> +       return ret;
> +}
> +
>  static int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
>  {
>         struct kvm_sev_cmd sev_cmd;
> @@ -7306,6 +7431,9 @@ static int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
>         case KVM_SEV_SEND_START:
>                 r = sev_send_start(kvm, &sev_cmd);
>                 break;
> +       case KVM_SEV_SEND_UPDATE_DATA:
> +               r = sev_send_update_data(kvm, &sev_cmd);
> +               break;
>         default:
>                 r = -EINVAL;
>                 goto out;
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 17bef4c245e1..d9dc81bb9c55 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -1570,6 +1570,15 @@ struct kvm_sev_send_start {
>         __u32 session_len;
>  };
>
> +struct kvm_sev_send_update_data {
> +       __u64 hdr_uaddr;
> +       __u32 hdr_len;
> +       __u64 guest_uaddr;
> +       __u32 guest_len;
> +       __u64 trans_uaddr;
> +       __u32 trans_len;
> +};
Input from others is welcome here, but I'd put the padding in
intentionally (explicitly fill in the reserved u8s between *_len and
*_uaddr). I had to double check that this pattern was intentional and
matched the SEV spec.
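
That is, spell the compiler's implicit holes out, roughly:

struct kvm_sev_send_update_data {
	__u64 hdr_uaddr;
	__u32 hdr_len;
	__u32 pad0;		/* explicit instead of implicit padding */
	__u64 guest_uaddr;
	__u32 guest_len;
	__u32 pad1;
	__u64 trans_uaddr;
	__u32 trans_len;
	__u32 pad2;
};
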
> +
>  #define KVM_DEV_ASSIGN_ENABLE_IOMMU    (1 << 0)
>  #define KVM_DEV_ASSIGN_PCI_2_3         (1 << 1)
>  #define KVM_DEV_ASSIGN_MASK_INTX       (1 << 2)
> --
> 2.17.1
>

High level: this looks good. Same comments on documenting the magic
parameters for querying as the prior patch, and also the -EFAULT
behavior.

Thanks,
Steve

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 03/12] KVM: SVM: Add KVM_SEV_SEND_FINISH command
  2020-02-13  1:16 ` [PATCH 03/12] KVM: SVM: Add KVM_SEV_SEND_FINISH command Ashish Kalra
@ 2020-03-10  1:09   ` Steve Rutherford
  0 siblings, 0 replies; 49+ messages in thread
From: Steve Rutherford @ 2020-03-10  1:09 UTC (permalink / raw)
  To: Ashish Kalra
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Joerg Roedel, Borislav Petkov, Tom Lendacky, David Rientjes,
	X86 ML, KVM list, LKML

On Wed, Feb 12, 2020 at 5:16 PM Ashish Kalra <Ashish.Kalra@amd.com> wrote:
>
> From: Brijesh Singh <brijesh.singh@amd.com>
>
> The command is used to finalize the encryption context created with
> KVM_SEV_SEND_START command.
>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: "Radim Krčmář" <rkrcmar@redhat.com>
> Cc: Joerg Roedel <joro@8bytes.org>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Tom Lendacky <thomas.lendacky@amd.com>
> Cc: x86@kernel.org
> Cc: kvm@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
> ---
>  .../virt/kvm/amd-memory-encryption.rst        |  8 +++++++
>  arch/x86/kvm/svm.c                            | 23 +++++++++++++++++++
>  2 files changed, 31 insertions(+)
>
> diff --git a/Documentation/virt/kvm/amd-memory-encryption.rst b/Documentation/virt/kvm/amd-memory-encryption.rst
> index 0f1c3860360f..f22f09ad72bd 100644
> --- a/Documentation/virt/kvm/amd-memory-encryption.rst
> +++ b/Documentation/virt/kvm/amd-memory-encryption.rst
> @@ -289,6 +289,14 @@ Returns: 0 on success, -negative on error
>                  __u32 trans_len;
>          };
>
> +12. KVM_SEV_SEND_FINISH
> +------------------------
> +
> +After completion of the migration flow, the KVM_SEV_SEND_FINISH command can be
> +issued by the hypervisor to delete the encryption context.
> +
> +Returns: 0 on success, -negative on error
> +
>  References
>  ==========
>
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index ae97f774e979..c55c1865f9e0 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -7387,6 +7387,26 @@ static int sev_send_update_data(struct kvm *kvm, struct kvm_sev_cmd *argp)
>         return ret;
>  }
>
> +static int sev_send_finish(struct kvm *kvm, struct kvm_sev_cmd *argp)
> +{
> +       struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
> +       struct sev_data_send_finish *data;
> +       int ret;
> +
> +       if (!sev_guest(kvm))
> +               return -ENOTTY;
> +
> +       data = kzalloc(sizeof(*data), GFP_KERNEL);
> +       if (!data)
> +               return -ENOMEM;
> +
> +       data->handle = sev->handle;
> +       ret = sev_issue_cmd(kvm, SEV_CMD_SEND_FINISH, data, &argp->error);
> +
> +       kfree(data);
> +       return ret;
> +}
> +
>  static int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
>  {
>         struct kvm_sev_cmd sev_cmd;
> @@ -7434,6 +7454,9 @@ static int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
>         case KVM_SEV_SEND_UPDATE_DATA:
>                 r = sev_send_update_data(kvm, &sev_cmd);
>                 break;
> +       case KVM_SEV_SEND_FINISH:
> +               r = sev_send_finish(kvm, &sev_cmd);
> +               break;
>         default:
>                 r = -EINVAL;
>                 goto out;
> --
> 2.17.1
>
Reviewed-by: Steve Rutherford <srutherford@google.com>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 04/12] KVM: SVM: Add support for KVM_SEV_RECEIVE_START command
  2020-02-13  1:16 ` [PATCH 04/12] KVM: SVM: Add support for KVM_SEV_RECEIVE_START command Ashish Kalra
@ 2020-03-10  1:41   ` Steve Rutherford
  2020-03-12  0:38     ` Ashish Kalra
  0 siblings, 1 reply; 49+ messages in thread
From: Steve Rutherford @ 2020-03-10  1:41 UTC (permalink / raw)
  To: Ashish Kalra
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Radim Krčmář,
	Joerg Roedel, Borislav Petkov, Tom Lendacky, David Rientjes,
	X86 ML, KVM list, LKML

On Wed, Feb 12, 2020 at 5:16 PM Ashish Kalra <Ashish.Kalra@amd.com> wrote:
>
> From: Brijesh Singh <brijesh.singh@amd.com>
>
> The command is used to create the encryption context for an incoming
> SEV guest. The encryption context can be later used by the hypervisor
> to import the incoming data into the SEV guest memory space.
>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: "Radim Krčmář" <rkrcmar@redhat.com>
> Cc: Joerg Roedel <joro@8bytes.org>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Tom Lendacky <thomas.lendacky@amd.com>
> Cc: x86@kernel.org
> Cc: kvm@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
> ---
>  .../virt/kvm/amd-memory-encryption.rst        | 29 +++++++
>  arch/x86/kvm/svm.c                            | 81 +++++++++++++++++++
>  include/uapi/linux/kvm.h                      |  9 +++
>  3 files changed, 119 insertions(+)
>
> diff --git a/Documentation/virt/kvm/amd-memory-encryption.rst b/Documentation/virt/kvm/amd-memory-encryption.rst
> index f22f09ad72bd..4b882fb681fa 100644
> --- a/Documentation/virt/kvm/amd-memory-encryption.rst
> +++ b/Documentation/virt/kvm/amd-memory-encryption.rst
> @@ -297,6 +297,35 @@ issued by the hypervisor to delete the encryption context.
>
>  Returns: 0 on success, -negative on error
>
> +13. KVM_SEV_RECEIVE_START
> +-------------------------
> +
> +The KVM_SEV_RECEIVE_START command is used for creating the memory encryption
> +context for an incoming SEV guest. To create the encryption context, the user must
> +provide a guest policy, the platform public Diffie-Hellman (PDH) key and session
> +information.
> +
> +Parameters: struct kvm_sev_receive_start (in/out)
> +
> +Returns: 0 on success, -negative on error
> +
> +::
> +
> +        struct kvm_sev_receive_start {
> +                __u32 handle;           /* if zero then firmware creates a new handle */
> +                __u32 policy;           /* guest's policy */
> +
> +                __u64 pdh_uaddr;         /* userspace address pointing to the PDH key */
> +                __u32 pdh_len;
> +
> +                __u64 session_uaddr;    /* userspace address which points to the guest session information */
> +                __u32 session_len;
> +        };
> +
> +On success, the 'handle' field contains a new handle and on error, a negative value.
> +
> +For more details, see SEV spec Section 6.12.
> +
>  References
>  ==========
>
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index c55c1865f9e0..3b766f386c84 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -7407,6 +7407,84 @@ static int sev_send_finish(struct kvm *kvm, struct kvm_sev_cmd *argp)
>         return ret;
>  }
>
> +static int sev_receive_start(struct kvm *kvm, struct kvm_sev_cmd *argp)
> +{
> +       struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
> +       struct sev_data_receive_start *start;
> +       struct kvm_sev_receive_start params;
> +       int *error = &argp->error;
> +       void *session_data;
> +       void *pdh_data;
> +       int ret;
> +
> +       if (!sev_guest(kvm))
> +               return -ENOTTY;
> +
> +       /* Get parameter from the userspace */
> +       if (copy_from_user(&params, (void __user *)(uintptr_t)argp->data,
> +                       sizeof(struct kvm_sev_receive_start)))
> +               return -EFAULT;
> +
> +       /* some sanity checks */
> +       if (!params.pdh_uaddr || !params.pdh_len ||
> +           !params.session_uaddr || !params.session_len)
> +               return -EINVAL;
> +
> +       pdh_data = psp_copy_user_blob(params.pdh_uaddr, params.pdh_len);
> +       if (IS_ERR(pdh_data))
> +               return PTR_ERR(pdh_data);
> +
> +       session_data = psp_copy_user_blob(params.session_uaddr,
> +                       params.session_len);
> +       if (IS_ERR(session_data)) {
> +               ret = PTR_ERR(session_data);
> +               goto e_free_pdh;
> +       }
> +
> +       ret = -ENOMEM;
> +       start = kzalloc(sizeof(*start), GFP_KERNEL);
> +       if (!start)
> +               goto e_free_session;
> +
> +       start->handle = params.handle;
> +       start->policy = params.policy;
> +       start->pdh_cert_address = __psp_pa(pdh_data);
> +       start->pdh_cert_len = params.pdh_len;
> +       start->session_address = __psp_pa(session_data);
> +       start->session_len = params.session_len;
> +
> +       /* create memory encryption context */

Set ret to a different value here, since otherwise this will look like -ENOMEM.

> +       ret = __sev_issue_cmd(argp->sev_fd, SEV_CMD_RECEIVE_START, start,
> +                               error);
> +       if (ret)
> +               goto e_free;
> +
> +       /* Bind ASID to this guest */

Ideally, set ret to another distinct value, since the error spaces for
these commands overlap, so you won't be sure which had the problem.
You also wouldn't be sure if one succeeded and the other failed vs
both failing.

> +       ret = sev_bind_asid(kvm, start->handle, error);
> +       if (ret)
> +               goto e_free;
> +
> +       params.handle = start->handle;
> +       if (copy_to_user((void __user *)(uintptr_t)argp->data,
> +                        &params, sizeof(struct kvm_sev_receive_start))) {
> +               ret = -EFAULT;
> +               sev_unbind_asid(kvm, start->handle);
> +               goto e_free;
> +       }
> +
> +       sev->handle = start->handle;
> +       sev->fd = argp->sev_fd;
> +
> +e_free:
> +       kfree(start);
> +e_free_session:
> +       kfree(session_data);
> +e_free_pdh:
> +       kfree(pdh_data);
> +
> +       return ret;
> +}
> +
>  static int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
>  {
>         struct kvm_sev_cmd sev_cmd;
> @@ -7457,6 +7535,9 @@ static int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
>         case KVM_SEV_SEND_FINISH:
>                 r = sev_send_finish(kvm, &sev_cmd);
>                 break;
> +       case KVM_SEV_RECEIVE_START:
> +               r = sev_receive_start(kvm, &sev_cmd);
> +               break;
>         default:
>                 r = -EINVAL;
>                 goto out;
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index d9dc81bb9c55..74764b9db5fa 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -1579,6 +1579,15 @@ struct kvm_sev_send_update_data {
>         __u32 trans_len;
>  };
>
> +struct kvm_sev_receive_start {
> +       __u32 handle;
> +       __u32 policy;
> +       __u64 pdh_uaddr;
> +       __u32 pdh_len;
> +       __u64 session_uaddr;
> +       __u32 session_len;
> +};
> +
>  #define KVM_DEV_ASSIGN_ENABLE_IOMMU    (1 << 0)
>  #define KVM_DEV_ASSIGN_PCI_2_3         (1 << 1)
>  #define KVM_DEV_ASSIGN_MASK_INTX       (1 << 2)
> --
> 2.17.1
>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 04/12] KVM: SVM: Add support for KVM_SEV_RECEIVE_START command
  2020-03-10  1:41   ` Steve Rutherford
@ 2020-03-12  0:38     ` Ashish Kalra
  2020-03-12  2:55       ` Steve Rutherford
  0 siblings, 1 reply; 49+ messages in thread
From: Ashish Kalra @ 2020-03-12  0:38 UTC (permalink / raw)
  To: Steve Rutherford
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Radim Krčmář,
	Joerg Roedel, Borislav Petkov, Tom Lendacky, David Rientjes,
	X86 ML, KVM list, LKML, brijesh.singh

On Mon, Mar 09, 2020 at 06:41:02PM -0700, Steve Rutherford wrote:
> > +static int sev_receive_start(struct kvm *kvm, struct kvm_sev_cmd *argp)
> > +{
> > +       struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
> > +       struct sev_data_receive_start *start;
> > +       struct kvm_sev_receive_start params;
> > +       int *error = &argp->error;
> > +       void *session_data;
> > +       void *pdh_data;
> > +       int ret;
> > +
> > +       if (!sev_guest(kvm))
> > +               return -ENOTTY;
> > +
> > +       /* Get parameter from the userspace */
> > +       if (copy_from_user(&params, (void __user *)(uintptr_t)argp->data,
> > +                       sizeof(struct kvm_sev_receive_start)))
> > +               return -EFAULT;
> > +
> > +       /* some sanity checks */
> > +       if (!params.pdh_uaddr || !params.pdh_len ||
> > +           !params.session_uaddr || !params.session_len)
> > +               return -EINVAL;
> > +
> > +       pdh_data = psp_copy_user_blob(params.pdh_uaddr, params.pdh_len);
> > +       if (IS_ERR(pdh_data))
> > +               return PTR_ERR(pdh_data);
> > +
> > +       session_data = psp_copy_user_blob(params.session_uaddr,
> > +                       params.session_len);
> > +       if (IS_ERR(session_data)) {
> > +               ret = PTR_ERR(session_data);
> > +               goto e_free_pdh;
> > +       }
> > +
> > +       ret = -ENOMEM;
> > +       start = kzalloc(sizeof(*start), GFP_KERNEL);
> > +       if (!start)
> > +               goto e_free_session;
> > +
> > +       start->handle = params.handle;
> > +       start->policy = params.policy;
> > +       start->pdh_cert_address = __psp_pa(pdh_data);
> > +       start->pdh_cert_len = params.pdh_len;
> > +       start->session_address = __psp_pa(session_data);
> > +       start->session_len = params.session_len;
> > +
> > +       /* create memory encryption context */
> 
> Set ret to a different value here, since otherwise this will look like -ENOMEM.

But, ret will be the value returned by __sev_issue_cmd(), so why will it
look like -ENOMEM?

> 
> > +       ret = __sev_issue_cmd(argp->sev_fd, SEV_CMD_RECEIVE_START, start,
> > +                               error);
> > +       if (ret)
> > +               goto e_free;
> > +
> > +       /* Bind ASID to this guest */
> 
> Ideally, set ret to another distinct value, since the error spaces for
> these commands overlap, so you won't be sure which had the problem.
> You also wouldn't be sure if one succeeded and the other failed vs
> both failing.

Both commands "may" return the same error code as set by sev_do_cmd(), but
then we need that very specific error code, sev_do_cmd() can't return
different error codes for each command it is issuing ?

> 
> > +       ret = sev_bind_asid(kvm, start->handle, error);
> > +       if (ret)
> > +               goto e_free;
> > +

Thanks,
Ashish


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 02/12] KVM: SVM: Add KVM_SEND_UPDATE_DATA command
  2020-03-10  1:04   ` Steve Rutherford
@ 2020-03-12  1:49     ` Ashish Kalra
  0 siblings, 0 replies; 49+ messages in thread
From: Ashish Kalra @ 2020-03-12  1:49 UTC (permalink / raw)
  To: Steve Rutherford
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Joerg Roedel, Borislav Petkov, Tom Lendacky, David Rientjes,
	X86 ML, KVM list, LKML, brijesh.singh

On Mon, Mar 09, 2020 at 06:04:56PM -0700, Steve Rutherford wrote:
> > diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> > index 17bef4c245e1..d9dc81bb9c55 100644
> > --- a/include/uapi/linux/kvm.h
> > +++ b/include/uapi/linux/kvm.h
> > @@ -1570,6 +1570,15 @@ struct kvm_sev_send_start {
> >         __u32 session_len;
> >  };
> >
> > +struct kvm_sev_send_update_data {
> > +       __u64 hdr_uaddr;
> > +       __u32 hdr_len;
> > +       __u64 guest_uaddr;
> > +       __u32 guest_len;
> > +       __u64 trans_uaddr;
> > +       __u32 trans_len;
> > +};
> Input from others is welcome here, but I'd put the padding in
> intentionally (explicitly fill in the reserved u8s between *_len and
> *_uaddr). I had to double check that this pattern was intentional and
> matched the SEV spec.

struct kvm_sev_send_update_data{} is used to receive/send the information
from/to userspace, while struct sev_data_send_update_data{} spells out the
padding explicitly and is packed as required by the SEV spec.
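
For contrast, here is a minimal sketch of the firmware-facing layout
(field order and reserved-field placement are illustrative only; the
authoritative definition lives in include/linux/psp-sev.h and the SEV
API spec):

	/*
	 * Reserved words are spelled out so the layout matches the
	 * SEV spec byte-for-byte, and the struct is packed so the
	 * compiler cannot insert its own padding.
	 */
	struct sev_data_send_update_data {
		u64 hdr_address;		/* In */
		u32 hdr_len;			/* In/Out */
		u32 reserved1;			/* explicit pad */
		u64 guest_address;		/* In */
		u32 guest_len;			/* In */
		u32 reserved2;			/* explicit pad */
		u64 trans_address;		/* In */
		u32 trans_len;			/* In */
	} __packed;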

Thanks,
Ashish

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 04/12] KVM: SVM: Add support for KVM_SEV_RECEIVE_START command
  2020-03-12  0:38     ` Ashish Kalra
@ 2020-03-12  2:55       ` Steve Rutherford
  0 siblings, 0 replies; 49+ messages in thread
From: Steve Rutherford @ 2020-03-12  2:55 UTC (permalink / raw)
  To: Ashish Kalra
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Radim Krčmář,
	Joerg Roedel, Borislav Petkov, Tom Lendacky, David Rientjes,
	X86 ML, KVM list, LKML, Brijesh Singh

On Wed, Mar 11, 2020 at 5:39 PM Ashish Kalra <ashish.kalra@amd.com> wrote:
>
> But ret will be the value returned by __sev_issue_cmd(), so why would it
> look like -ENOMEM?
My bad, this is fine.
>
> >
> > > +       ret = __sev_issue_cmd(argp->sev_fd, SEV_CMD_RECEIVE_START, start,
> > > +                               error);
> > > +       if (ret)
> > > +               goto e_free;
> > > +
> > > +       /* Bind ASID to this guest */
> >
> > Ideally, set ret to another distinct value, since the error spaces for
> > these commands overlap, so you won't be sure which had the problem.
> > You also wouldn't be sure if one succeeded and the other failed vs
> > both failing.
>
> Both commands "may" return the same error code as set by sev_do_cmd(), but
> then we need that very specific error code, sev_do_cmd() can't return
> different error codes for each command it is issuing ?

I'll try to separate my comment into two levels: a high-level response
and a pragmatic response.

--- High level ---
At the end of the day, I want to be able to handle these errors in a
reasonable way. As often as possible, I'd like userspace to be able to
see a set of errors and know what to do in response. I find this
particularly important for migration, where you are mucking around
with a live VM with customer data you don't want to lose.

One red flag for me is when one pair of {errno, SEV error code}
corresponds to two distinct situations. For example, in another patch
in this series, {EFAULT, SUCCESS} could have corresponded to either
the command succeeding or the command never having run. That seems
like a pretty wide range of possibilities for a single error value.

I want to try to give the return codes scrutiny now, since we are
probably going to be stuck with maintaining them indefinitely, even if
there are mistakes.

--- Pragmatic ---
There's probably a strong argument that most situations like this
don't matter, since there's nothing you can do about an error except
kill the VM (or not continue migrating) anyway. I'm pretty open to
this argument. In particular, looking at SEV RECEIVE START, I think
you could throw away this attempt at creating a migration target, and
just make a new one (pretty much without consequence), so I think my
comment on this particular patch is moot. You can't cancel the SEND
START so you will be stuck working with this particular destination
host, but you can mint a new target VM via SEV RECEIVE START.

Looking at the earlier patches, older commands seem to have the same
ambiguity. The command SEV LAUNCH START also has identical errors that
could be sourced from either of two commands. Seems like we're already
committed to ambiguity being ok.
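
To illustrate the kind of disambiguation I meant (purely a sketch, not
something this series needs to adopt), each firmware failure could at
least be tagged with the command that produced it, so an overlapping
{errno, fw_error} pair is no longer ambiguous:

	ret = __sev_issue_cmd(argp->sev_fd, SEV_CMD_RECEIVE_START, start,
			      error);
	if (ret) {
		pr_debug("SEV: RECEIVE_START failed, fw_error %d\n",
			 *error);
		goto e_free;
	}

	/* Bind ASID to this guest */
	ret = sev_bind_asid(kvm, start->handle, error);
	if (ret) {
		pr_debug("SEV: ACTIVATE (bind ASID) failed, fw_error %d\n",
			 *error);
		goto e_free;
	}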

Given that I have no further comments on this particular patch:
Reviewed-by: Steve Rutherford <srutherford@google.com>

>
> >
> > > +       ret = sev_bind_asid(kvm, start->handle, error);
> > > +       if (ret)
> > > +               goto e_free;
> > > +
>
> Thanks,
> Ashish
>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 00/12] SEV Live Migration Patchset.
  2020-02-17 19:49       ` Ashish Kalra
  2020-03-03  1:05         ` Steve Rutherford
@ 2020-03-19 13:05         ` Paolo Bonzini
  2020-03-19 16:18           ` Ashish Kalra
  1 sibling, 1 reply; 49+ messages in thread
From: Paolo Bonzini @ 2020-03-19 13:05 UTC (permalink / raw)
  To: Ashish Kalra, Andy Lutomirski
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Radim Krcmar,
	Joerg Roedel, Borislav Petkov, Tom Lendacky, David Rientjes,
	X86 ML, kvm list, LKML, Brijesh Singh

On 17/02/20 20:49, Ashish Kalra wrote:
>> Also, you're making guest-side and host-side changes.  What ensures
>> that you don't try to migrate a guest that doesn't support the
>> hypercall for encryption state tracking?
> This is a good question and it is still open-ended. There are two
> possibilities here: either the guest does not have any unencrypted
> pages (e.g. when booting 32-bit) and so never makes the hypercall, or
> the guest does not support the newer hypercall.
> 
> In the first case, all the guest pages are then assumed to be
> encrypted and live migration happens as such.
> 
> For the second case, we have been discussing this internally, and one
> option is to extend the KVM capabilities/feature bits to check for this?

You could extend the hypercall to completely block live migration (e.g.
a0=a1=~0, a2=0 to unblock or 1 to block).  The KVM_GET_PAGE_ENC_BITMAP
ioctl can also return the blocked/unblocked state.
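
For what it's worth, a guest-side sketch of that convention, reusing
the hypercall wrapper from patch 07 (the a0=a1=~0 encoding is only the
suggestion above, nothing that exists today):

	/*
	 * Sketch: block/unblock live migration through the same
	 * page-encryption-status hypercall, using reserved sentinel
	 * values for the gpa and page-count arguments.
	 */
	static long kvm_sev_migration_control(bool block)
	{
		return kvm_sev_hypercall3(KVM_HC_PAGE_ENC_STATUS,
					  ~0UL,	/* a0: sentinel gpa */
					  ~0UL,	/* a1: sentinel npages */
					  block ? 1 : 0);
	}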

Paolo


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 00/12] SEV Live Migration Patchset.
  2020-03-19 13:05         ` Paolo Bonzini
@ 2020-03-19 16:18           ` Ashish Kalra
  2020-03-19 16:24             ` Paolo Bonzini
  0 siblings, 1 reply; 49+ messages in thread
From: Ashish Kalra @ 2020-03-19 16:18 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Radim Krcmar, Joerg Roedel, Borislav Petkov, Tom Lendacky,
	David Rientjes, X86 ML, kvm list, LKML, Brijesh Singh

Hello Paolo,

On Thu, Mar 19, 2020 at 02:05:21PM +0100, Paolo Bonzini wrote:
> On 17/02/20 20:49, Ashish Kalra wrote:
> >> Also, you're making guest-side and host-side changes.  What ensures
> >> that you don't try to migrate a guest that doesn't support the
> >> hypercall for encryption state tracking?
> > This is a good question and it is still open-ended. There are two
> > possibilities here: either the guest does not have any unencrypted
> > pages (e.g. when booting 32-bit) and so never makes the hypercall, or
> > the guest does not support the newer hypercall.
> > 
> > In the first case, all the guest pages are then assumed to be
> > encrypted and live migration happens as such.
> > 
> > For the second case, we have been discussing this internally, and one
> > option is to extend the KVM capabilities/feature bits to check for this?
> 
> You could extend the hypercall to completely block live migration (e.g.
> a0=a1=~0, a2=0 to unblock or 1 to block).  The KVM_GET_PAGE_ENC_BITMAP
> ioctl can also return the blocked/unblocked state.
> 

Currently I have added a new KVM para feature,
"KVM_FEATURE_SEV_LIVE_MIGRATION", to indicate host support for the SEV
live migration feature, and a custom KVM MSR, "MSR_KVM_SEV_LIVE_MIG_EN",
for the guest to enable SEV live migration. The MSR also carries other
flags for future SEV live migration extensions.
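
As a guest-side sketch of the enablement path (the feature bit, the
MSR name and the enable flag below are from this work in progress, so
the values shown are placeholders rather than ABI):

	#define KVM_SEV_LIVE_MIGRATION_ENABLED	BIT_ULL(0)	/* placeholder */

	static void __init kvm_sev_live_migration_init(void)
	{
		/* Enable only when the host advertises the feature. */
		if (!kvm_para_has_feature(KVM_FEATURE_SEV_LIVE_MIGRATION))
			return;

		wrmsrl(MSR_KVM_SEV_LIVE_MIG_EN,
		       KVM_SEV_LIVE_MIGRATION_ENABLED);
	}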

Thanks,
Ashish

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 00/12] SEV Live Migration Patchset.
  2020-03-19 16:18           ` Ashish Kalra
@ 2020-03-19 16:24             ` Paolo Bonzini
  0 siblings, 0 replies; 49+ messages in thread
From: Paolo Bonzini @ 2020-03-19 16:24 UTC (permalink / raw)
  To: Ashish Kalra
  Cc: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Radim Krcmar, Joerg Roedel, Borislav Petkov, Tom Lendacky,
	David Rientjes, X86 ML, kvm list, LKML, Brijesh Singh

On 19/03/20 17:18, Ashish Kalra wrote:
>>> For the second case, we have been discussing this internally, and one
>>> option is to extend the KVM capabilities/feature bits to check for this?
>> You could extend the hypercall to completely block live migration (e.g.
>> a0=a1=~0, a2=0 to unblock or 1 to block).  The KVM_GET_PAGE_ENC_BITMAP
>> ioctl can also return the blocked/unblocked state.
>>
> Currently I have added a new KVM para feature,
> "KVM_FEATURE_SEV_LIVE_MIGRATION", to indicate host support for the SEV
> live migration feature, and a custom KVM MSR, "MSR_KVM_SEV_LIVE_MIG_EN",
> for the guest to enable SEV live migration. The MSR also carries other
> flags for future SEV live migration extensions.

Ok, that would be fine too.  Thanks!

Paolo


^ permalink raw reply	[flat|nested] 49+ messages in thread

end of thread

Thread overview: 49+ messages
2020-02-13  1:14 [PATCH 00/12] SEV Live Migration Patchset Ashish Kalra
2020-02-13  1:14 ` [PATCH 01/12] KVM: SVM: Add KVM_SEV SEND_START command Ashish Kalra
2020-03-09 21:28   ` Steve Rutherford
2020-03-10  0:19     ` Steve Rutherford
2020-02-13  1:15 ` [PATCH 02/12] KVM: SVM: Add KVM_SEND_UPDATE_DATA command Ashish Kalra
2020-03-10  1:04   ` Steve Rutherford
2020-03-12  1:49     ` Ashish Kalra
2020-02-13  1:16 ` [PATCH 03/12] KVM: SVM: Add KVM_SEV_SEND_FINISH command Ashish Kalra
2020-03-10  1:09   ` Steve Rutherford
2020-02-13  1:16 ` [PATCH 04/12] KVM: SVM: Add support for KVM_SEV_RECEIVE_START command Ashish Kalra
2020-03-10  1:41   ` Steve Rutherford
2020-03-12  0:38     ` Ashish Kalra
2020-03-12  2:55       ` Steve Rutherford
2020-02-13  1:16 ` [PATCH 05/12] KVM: SVM: Add KVM_SEV_RECEIVE_UPDATE_DATA command Ashish Kalra
2020-02-13  1:16 ` [PATCH 06/12] KVM: SVM: Add KVM_SEV_RECEIVE_FINISH command Ashish Kalra
2020-02-13  1:17 ` [PATCH 07/12] KVM: x86: Add AMD SEV specific Hypercall3 Ashish Kalra
2020-02-13  1:17 ` [PATCH 08/12] KVM: X86: Introduce KVM_HC_PAGE_ENC_STATUS hypercall Ashish Kalra
2020-02-20  2:39   ` Steve Rutherford
2020-02-20  5:28     ` Ashish Kalra
2020-02-20 21:21       ` Ashish Kalra
2020-02-13  1:17 ` [PATCH 09/12] KVM: x86: Introduce KVM_GET_PAGE_ENC_BITMAP ioctl Ashish Kalra
2020-02-27 17:57   ` Venu Busireddy
2020-02-27 18:18     ` Venu Busireddy
2020-02-27 19:38       ` Ashish Kalra
2020-02-13  1:18 ` [PATCH 10/12] mm: x86: Invoke hypercall when page encryption status is changed Ashish Kalra
2020-02-13  5:42   ` Andy Lutomirski
2020-02-13 22:28     ` Ashish Kalra
2020-02-14 18:56       ` Andy Lutomirski
2020-02-14 20:36         ` Ashish Kalra
2020-02-20  1:58   ` Steve Rutherford
2020-02-20  2:12     ` Andy Lutomirski
2020-02-20  3:29       ` Steve Rutherford
2020-02-20 15:54       ` Brijesh Singh
2020-02-20 20:43         ` Steve Rutherford
2020-02-20 22:43           ` Brijesh Singh
2020-02-20 23:23             ` Steve Rutherford
2020-02-20 23:40               ` Andy Lutomirski
2020-02-13  1:18 ` [PATCH 11/12] KVM: x86: Introduce KVM_SET_PAGE_ENC_BITMAP ioctl Ashish Kalra
2020-02-13  1:18 ` [PATCH 12/12] KVM: x86: Introduce KVM_PAGE_ENC_BITMAP_RESET ioctl Ashish Kalra
2020-02-13  5:43 ` [PATCH 00/12] SEV Live Migration Patchset Andy Lutomirski
2020-02-13 23:09   ` Ashish Kalra
2020-02-14 18:58     ` Andy Lutomirski
2020-02-17 19:49       ` Ashish Kalra
2020-03-03  1:05         ` Steve Rutherford
2020-03-03  4:42           ` Ashish Kalra
2020-03-19 13:05         ` Paolo Bonzini
2020-03-19 16:18           ` Ashish Kalra
2020-03-19 16:24             ` Paolo Bonzini
2020-02-14  2:10   ` Brijesh Singh
