* [PATCH 0/2 V7] Add AMD SEV and SEV-ES intra host migration support
@ 2021-09-02 18:17 Peter Gonda
  2021-09-02 18:17 ` [PATCH 1/3 V7] KVM, SEV: Add support for SEV intra host migration Peter Gonda
                   ` (3 more replies)
  0 siblings, 4 replies; 19+ messages in thread
From: Peter Gonda @ 2021-09-02 18:17 UTC (permalink / raw)
  To: kvm; +Cc: Peter Gonda

Intra host migration provides a low-cost mechanism for userspace VMM
upgrades.  It is an alternative to traditional (i.e., remote) live
migration. Whereas remote migration handles moving a guest to a new host,
intra host migration only handles moving a guest to a new userspace VMM
within a host.  This can be used to update the VMM, roll it back, change its
flags, etc. The lower cost compared to live migration comes from the fact
that the guest's memory does not need to be copied between processes. A
handle to the guest memory is simply passed to the new VMM; this could be
done via /dev/shm with share=on or a similar mechanism.
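
As a rough sketch of that hand-off (illustrative only, not part of this
series; guest_mem_size is a placeholder and the usual headers/_GNU_SOURCE
are assumed), the guest RAM could be backed by a shared memfd that both VMM
processes map:

	/* Old VMM: back guest RAM with a shareable fd. */
	int mem_fd = memfd_create("guest-ram", 0);
	ftruncate(mem_fd, guest_mem_size);
	void *guest_mem = mmap(NULL, guest_mem_size, PROT_READ | PROT_WRITE,
			       MAP_SHARED, mem_fd, 0);

	/*
	 * mem_fd is later handed to the new VMM (e.g. via SCM_RIGHTS), which
	 * mmap()s the same fd instead of copying the guest's memory.
	 */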

The guest state can be transferred from an old VMM to a new VMM as follows:
1. Export guest state from KVM to the old user-space VMM via a getter
   user-space/kernel API.
2. Transfer guest state from the old VMM to the new VMM via IPC.
3. Import guest state into KVM from the new user-space VMM via a setter
   user-space/kernel API.
In other words, the VMMs hand off state by exporting it from KVM using
getters, sending that data to the new VMM, then setting it again in KVM.

In the common case for intra host migration, we can rely on the normal
ioctls for passing data from one VMM to the next. SEV, SEV-ES, and other
confidential compute environments make most of this information opaque, and
render KVM ioctls such as "KVM_GET_REGS" irrelevant.  As a result, we need
the ability to pass this opaque metadata from one VMM to the next. The
easiest way to do this is to leave this data in the kernel, and transfer
ownership of the metadata from one KVM VM (or vCPU) to the next. For
example, we need to move the SEV enabled ASID, VMSAs, and GHCB metadata
from one VMM to the next.  In general, we need to be able to hand off any
data that would be unsafe/impossible for the kernel to hand directly to
userspace (and cannot be reproduced using data that can be handed safely to
userspace).

To hand off the SEV metadata required for the intra host operation, the
source VM FD is sent to the target VMM. The target VMM calls the new cap
ioctl with the source VM FD, and KVM then moves all the SEV state from the
source VM to the target VM.
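
For illustration, the target-side call might look roughly like this (a
sketch based on the selftest in patch 3/3; source_vm_fd and target_vm_fd are
placeholders, error handling omitted):

	struct kvm_enable_cap cap = {
		.cap = KVM_CAP_VM_MIGRATE_ENC_CONTEXT_FROM,
		.args[0] = source_vm_fd,  /* VM fd received from the old VMM */
	};

	/* Issued on the target VM fd; KVM moves the SEV state over. */
	ioctl(target_vm_fd, KVM_ENABLE_CAP, &cap);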

V7
 * Address selftest feedback.

V6
 * Add selftest.

V5:
 * Fix up locking scheme
 * Address marcorr@ comments.

V4:
 * Move to seanjc@'s suggestion of source VM FD based single ioctl design.

v3:
 * Fix memory leak found by dan.carpenter@

v2:
 * Added marcorr@ reviewed by tag
 * Renamed function introduced in 1/3
 * Edited with seanjc@'s review comments
 ** Cleaned up WARN usage
 ** Userspace makes random token now
 * Edited with brijesh.singh@'s review comments
 ** Checks for different LAUNCH_* states in send function

v1: https://lore.kernel.org/kvm/20210621163118.1040170-1-pgonda@google.com/

base-commit: 680c7e3be6a3

Peter Gonda (3):
  KVM, SEV: Add support for SEV intra host migration
  KVM, SEV: Add support for SEV-ES intra host migration
  selftest: KVM: Add intra host migration tests

 Documentation/virt/kvm/api.rst                |  15 ++
 arch/x86/include/asm/kvm_host.h               |   1 +
 arch/x86/kvm/svm/sev.c                        | 159 ++++++++++++++++++
 arch/x86/kvm/svm/svm.c                        |   1 +
 arch/x86/kvm/svm/svm.h                        |   2 +
 arch/x86/kvm/x86.c                            |   5 +
 include/uapi/linux/kvm.h                      |   1 +
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../selftests/kvm/x86_64/sev_vm_tests.c       | 159 ++++++++++++++++++
 9 files changed, 344 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/x86_64/sev_vm_tests.c

-- 
2.33.0.153.gba50c8fa24-goog


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/3 V7] KVM, SEV: Add support for SEV intra host migration
  2021-09-02 18:17 [PATCH 0/2 V7] Add AMD SEV and SEV-ES intra host migration support Peter Gonda
@ 2021-09-02 18:17 ` Peter Gonda
  2021-09-10  0:11   ` Sean Christopherson
  2021-09-02 18:17 ` [PATCH 2/3 V7] KVM, SEV: Add support for SEV-ES " Peter Gonda
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 19+ messages in thread
From: Peter Gonda @ 2021-09-02 18:17 UTC (permalink / raw)
  To: kvm
  Cc: Peter Gonda, Sean Christopherson, Marc Orr, Paolo Bonzini,
	David Rientjes, Dr . David Alan Gilbert, Brijesh Singh,
	Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	linux-kernel

For SEV to work with intra host migration, contents of the SEV info struct
such as the ASID (used to index the encryption key in the AMD SP) and
the list of memory regions need to be transferred to the target VM.
This change adds a command for a target VMM to get a source SEV VM's SEV
info.

The target is expected to be initialized (sev_guest_init), but not in a
launched state (sev_launch_start), when performing the receive. Once the
target has received, it will be in a launched state and will not
need to perform the typical SEV launch commands.
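
In other words, the expected destination-side sequence is roughly (a
simplified sketch; the real KVM_SEV_INIT command also needs a /dev/sev fd,
see the selftest in patch 3/3; src_vm_fd and dst_vm_fd are placeholders):

	struct kvm_sev_cmd init = { .id = KVM_SEV_INIT };
	struct kvm_enable_cap cap = {
		.cap = KVM_CAP_VM_MIGRATE_ENC_CONTEXT_FROM,
		.args[0] = src_vm_fd,
	};

	ioctl(dst_vm_fd, KVM_MEMORY_ENCRYPT_OP, &init);  /* initialized */
	/* No KVM_SEV_LAUNCH_START etc. on the destination. */
	ioctl(dst_vm_fd, KVM_ENABLE_CAP, &cap);          /* now "launched" */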

Signed-off-by: Peter Gonda <pgonda@google.com>
Suggested-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Marc Orr <marcorr@google.com>
Cc: Marc Orr <marcorr@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Wanpeng Li <wanpengli@tencent.com>
Cc: Jim Mattson <jmattson@google.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
 Documentation/virt/kvm/api.rst  |  15 +++++
 arch/x86/include/asm/kvm_host.h |   1 +
 arch/x86/kvm/svm/sev.c          | 101 ++++++++++++++++++++++++++++++++
 arch/x86/kvm/svm/svm.c          |   1 +
 arch/x86/kvm/svm/svm.h          |   2 +
 arch/x86/kvm/x86.c              |   5 ++
 include/uapi/linux/kvm.h        |   1 +
 7 files changed, 126 insertions(+)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 4ea1bb28297b..e8cecc024649 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -6702,6 +6702,21 @@ MAP_SHARED mmap will result in an -EINVAL return.
 When enabled the VMM may make use of the ``KVM_ARM_MTE_COPY_TAGS`` ioctl to
 perform a bulk copy of tags to/from the guest.
 
+7.29 KVM_CAP_VM_MIGRATE_ENC_CONTEXT_FROM
+-------------------------------------
+
+Architectures: x86 SEV enabled
+Type: vm
+Parameters: args[0] is the fd of the source vm
+Returns: 0 on success
+
+This capability enables userspace to migrate the encryption context from the vm
+indicated by the fd to the vm this is called on.
+
+This is intended to support intra-host migration of VMs between userspace VMMs.
+in-guest workloads scheduled by the host. This allows for upgrading the VMM
+process without interrupting the guest.
+
 8. Other capabilities.
 ======================
 
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 09b256db394a..f06d87a85654 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1456,6 +1456,7 @@ struct kvm_x86_ops {
 	int (*mem_enc_reg_region)(struct kvm *kvm, struct kvm_enc_region *argp);
 	int (*mem_enc_unreg_region)(struct kvm *kvm, struct kvm_enc_region *argp);
 	int (*vm_copy_enc_context_from)(struct kvm *kvm, unsigned int source_fd);
+	int (*vm_migrate_enc_context_from)(struct kvm *kvm, unsigned int source_fd);
 
 	int (*get_msr_feature)(struct kvm_msr_entry *entry);
 
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 46eb1ba62d3d..8db666a362d4 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -1501,6 +1501,107 @@ static int sev_receive_finish(struct kvm *kvm, struct kvm_sev_cmd *argp)
 	return sev_issue_cmd(kvm, SEV_CMD_RECEIVE_FINISH, &data, &argp->error);
 }
 
+static int svm_sev_lock_for_migration(struct kvm *kvm)
+{
+	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+
+	/*
+	 * Bail if this VM is already involved in a migration to avoid deadlock
+	 * between two VMs trying to migrate to/from each other.
+	 */
+	if (atomic_cmpxchg_acquire(&sev->migration_in_progress, 0, 1))
+		return -EBUSY;
+
+	mutex_lock(&kvm->lock);
+
+	return 0;
+}
+
+static void svm_unlock_after_migration(struct kvm *kvm)
+{
+	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+
+	mutex_unlock(&kvm->lock);
+	atomic_set_release(&sev->migration_in_progress, 0);
+}
+
+static void migrate_info_from(struct kvm_sev_info *dst,
+			      struct kvm_sev_info *src)
+{
+	sev_asid_free(dst);
+
+	dst->asid = src->asid;
+	dst->misc_cg = src->misc_cg;
+	dst->handle = src->handle;
+	dst->pages_locked = src->pages_locked;
+
+	src->asid = 0;
+	src->active = false;
+	src->handle = 0;
+	src->pages_locked = 0;
+	src->misc_cg = NULL;
+
+	INIT_LIST_HEAD(&dst->regions_list);
+	list_replace_init(&src->regions_list, &dst->regions_list);
+}
+
+int svm_vm_migrate_from(struct kvm *kvm, unsigned int source_fd)
+{
+	struct kvm_sev_info *dst_sev = &to_kvm_svm(kvm)->sev_info;
+	struct file *source_kvm_file;
+	struct kvm *source_kvm;
+	int ret;
+
+	ret = svm_sev_lock_for_migration(kvm);
+	if (ret)
+		return ret;
+
+	if (!sev_guest(kvm) || sev_es_guest(kvm)) {
+		ret = -EINVAL;
+		pr_warn_ratelimited("VM must be SEV enabled to migrate to.\n");
+		goto out_unlock;
+	}
+
+	if (!list_empty(&dst_sev->regions_list)) {
+		ret = -EINVAL;
+		pr_warn_ratelimited(
+			"VM must not have encrypted regions to migrate to.\n");
+		goto out_unlock;
+	}
+
+	source_kvm_file = fget(source_fd);
+	if (!file_is_kvm(source_kvm_file)) {
+		ret = -EBADF;
+		pr_warn_ratelimited(
+				"Source VM must be SEV enabled to migrate from.\n");
+		goto out_fput;
+	}
+
+	source_kvm = source_kvm_file->private_data;
+	ret = svm_sev_lock_for_migration(source_kvm);
+	if (ret)
+		goto out_fput;
+
+	if (!sev_guest(source_kvm) || sev_es_guest(source_kvm)) {
+		ret = -EINVAL;
+		pr_warn_ratelimited(
+			"Source VM must be SEV enabled to migrate from.\n");
+		goto out_source;
+	}
+
+	migrate_info_from(dst_sev, &to_kvm_svm(source_kvm)->sev_info);
+	ret = 0;
+
+out_source:
+	svm_unlock_after_migration(source_kvm);
+out_fput:
+	if (source_kvm_file)
+		fput(source_kvm_file);
+out_unlock:
+	svm_unlock_after_migration(kvm);
+	return ret;
+}
+
 int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
 {
 	struct kvm_sev_cmd sev_cmd;
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 1a70e11f0487..88dd76dd966f 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4625,6 +4625,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
 	.mem_enc_unreg_region = svm_unregister_enc_region,
 
 	.vm_copy_enc_context_from = svm_vm_copy_asid_from,
+	.vm_migrate_enc_context_from = svm_vm_migrate_from,
 
 	.can_emulate_instruction = svm_can_emulate_instruction,
 
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 524d943f3efc..67bfb43301e1 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -80,6 +80,7 @@ struct kvm_sev_info {
 	u64 ap_jump_table;	/* SEV-ES AP Jump Table address */
 	struct kvm *enc_context_owner; /* Owner of copied encryption context */
 	struct misc_cg *misc_cg; /* For misc cgroup accounting */
+	atomic_t migration_in_progress;
 };
 
 struct kvm_svm {
@@ -552,6 +553,7 @@ int svm_register_enc_region(struct kvm *kvm,
 int svm_unregister_enc_region(struct kvm *kvm,
 			      struct kvm_enc_region *range);
 int svm_vm_copy_asid_from(struct kvm *kvm, unsigned int source_fd);
+int svm_vm_migrate_from(struct kvm *kvm, unsigned int source_fd);
 void pre_sev_run(struct vcpu_svm *svm, int cpu);
 void __init sev_set_cpu_caps(void);
 void __init sev_hardware_setup(void);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 86539c1686fa..c461867d37aa 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5654,6 +5654,11 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 		if (kvm_x86_ops.vm_copy_enc_context_from)
 			r = kvm_x86_ops.vm_copy_enc_context_from(kvm, cap->args[0]);
 		return r;
+	case KVM_CAP_VM_MIGRATE_ENC_CONTEXT_FROM:
+		r = -EINVAL;
+		if (kvm_x86_ops.vm_migrate_enc_context_from)
+			r = kvm_x86_ops.vm_migrate_enc_context_from(kvm, cap->args[0]);
+		return r;
 	case KVM_CAP_EXIT_HYPERCALL:
 		if (cap->args[0] & ~KVM_EXIT_HYPERCALL_VALID_MASK) {
 			r = -EINVAL;
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index a067410ebea5..49660204cdb9 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1112,6 +1112,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_BINARY_STATS_FD 203
 #define KVM_CAP_EXIT_ON_EMULATION_FAILURE 204
 #define KVM_CAP_ARM_MTE 205
+#define KVM_CAP_VM_MIGRATE_ENC_CONTEXT_FROM 206
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
-- 
2.33.0.153.gba50c8fa24-goog


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 2/3 V7] KVM, SEV: Add support for SEV-ES intra host migration
  2021-09-02 18:17 [PATCH 0/2 V7] Add AMD SEV and SEV-ES intra host migration support Peter Gonda
  2021-09-02 18:17 ` [PATCH 1/3 V7] KVM, SEV: Add support for SEV intra host migration Peter Gonda
@ 2021-09-02 18:17 ` Peter Gonda
  2021-09-10  0:50   ` Sean Christopherson
  2021-09-02 18:17 ` [PATCH 3/3 V7] selftest: KVM: Add intra host migration tests Peter Gonda
  2021-09-02 18:45 ` [PATCH 0/2 V7] Add AMD SEV and SEV-ES intra host migration support Sean Christopherson
  3 siblings, 1 reply; 19+ messages in thread
From: Peter Gonda @ 2021-09-02 18:17 UTC (permalink / raw)
  To: kvm
  Cc: Peter Gonda, Marc Orr, Paolo Bonzini, Sean Christopherson,
	David Rientjes, Dr . David Alan Gilbert, Brijesh Singh,
	Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	linux-kernel

For SEV-ES to work with intra host migration, the VMSAs, GHCB metadata,
and other SEV-ES info need to be preserved along with the guest's
memory.

Signed-off-by: Peter Gonda <pgonda@google.com>
Reviewed-by: Marc Orr <marcorr@google.com>
Cc: Marc Orr <marcorr@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Wanpeng Li <wanpengli@tencent.com>
Cc: Jim Mattson <jmattson@google.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
 arch/x86/kvm/svm/sev.c | 62 ++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 60 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 8db666a362d4..fac21a82e4de 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -1545,6 +1545,59 @@ static void migrate_info_from(struct kvm_sev_info *dst,
 	list_replace_init(&src->regions_list, &dst->regions_list);
 }
 
+static int migrate_vmsa_from(struct kvm *dst, struct kvm *src)
+{
+	int i, num_vcpus;
+	struct kvm_vcpu *dst_vcpu, *src_vcpu;
+	struct vcpu_svm *dst_svm, *src_svm;
+
+	num_vcpus = atomic_read(&dst->online_vcpus);
+	if (num_vcpus != atomic_read(&src->online_vcpus)) {
+		pr_warn_ratelimited(
+			"Source and target VMs must have same number of vCPUs.\n");
+		return -EINVAL;
+	}
+
+	for (i = 0; i < num_vcpus; ++i) {
+		src_vcpu = src->vcpus[i];
+		if (!src_vcpu->arch.guest_state_protected) {
+			pr_warn_ratelimited(
+				"Source ES VM vCPUs must have protected state.\n");
+			return -EINVAL;
+		}
+	}
+
+	for (i = 0; i < num_vcpus; ++i) {
+		src_vcpu = src->vcpus[i];
+		src_svm = to_svm(src_vcpu);
+		dst_vcpu = dst->vcpus[i];
+		dst_svm = to_svm(dst_vcpu);
+
+		/*
+		 * Copy VMSA and GHCB fields from the source to the destination.
+		 * Clear them on the source to prevent the VM running and
+		 * changing the state of the VMSA/GHCB unexpectedly.
+		 */
+		dst_vcpu->vcpu_id = src_vcpu->vcpu_id;
+		dst_svm->vmsa = src_svm->vmsa;
+		src_svm->vmsa = NULL;
+		dst_svm->ghcb = src_svm->ghcb;
+		src_svm->ghcb = NULL;
+		dst_svm->vmcb->control.ghcb_gpa =
+				src_svm->vmcb->control.ghcb_gpa;
+		src_svm->vmcb->control.ghcb_gpa = 0;
+		dst_svm->ghcb_sa = src_svm->ghcb_sa;
+		src_svm->ghcb_sa = NULL;
+		dst_svm->ghcb_sa_len = src_svm->ghcb_sa_len;
+		src_svm->ghcb_sa_len = 0;
+		dst_svm->ghcb_sa_sync = src_svm->ghcb_sa_sync;
+		src_svm->ghcb_sa_sync = false;
+		dst_svm->ghcb_sa_free = src_svm->ghcb_sa_free;
+		src_svm->ghcb_sa_free = false;
+	}
+	return 0;
+}
+
 int svm_vm_migrate_from(struct kvm *kvm, unsigned int source_fd)
 {
 	struct kvm_sev_info *dst_sev = &to_kvm_svm(kvm)->sev_info;
@@ -1556,7 +1609,7 @@ int svm_vm_migrate_from(struct kvm *kvm, unsigned int source_fd)
 	if (ret)
 		return ret;
 
-	if (!sev_guest(kvm) || sev_es_guest(kvm)) {
+	if (!sev_guest(kvm)) {
 		ret = -EINVAL;
 		pr_warn_ratelimited("VM must be SEV enabled to migrate to.\n");
 		goto out_unlock;
@@ -1582,13 +1635,18 @@ int svm_vm_migrate_from(struct kvm *kvm, unsigned int source_fd)
 	if (ret)
 		goto out_fput;
 
-	if (!sev_guest(source_kvm) || sev_es_guest(source_kvm)) {
+	if (!sev_guest(source_kvm)) {
 		ret = -EINVAL;
 		pr_warn_ratelimited(
 			"Source VM must be SEV enabled to migrate from.\n");
 		goto out_source;
 	}
 
+	if (sev_es_guest(kvm)) {
+		ret = migrate_vmsa_from(kvm, source_kvm);
+		if (ret)
+			goto out_source;
+	}
 	migrate_info_from(dst_sev, &to_kvm_svm(source_kvm)->sev_info);
 	ret = 0;
 
-- 
2.33.0.153.gba50c8fa24-goog


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 3/3 V7] selftest: KVM: Add intra host migration tests
  2021-09-02 18:17 [PATCH 0/2 V7] Add AMD SEV and SEV-ES intra host migration support Peter Gonda
  2021-09-02 18:17 ` [PATCH 1/3 V7] KVM, SEV: Add support for SEV intra host migration Peter Gonda
  2021-09-02 18:17 ` [PATCH 2/3 V7] KVM, SEV: Add support for SEV-ES " Peter Gonda
@ 2021-09-02 18:17 ` Peter Gonda
  2021-09-10 17:16   ` Sean Christopherson
  2021-09-02 18:45 ` [PATCH 0/2 V7] Add AMD SEV and SEV-ES intra host migration support Sean Christopherson
  3 siblings, 1 reply; 19+ messages in thread
From: Peter Gonda @ 2021-09-02 18:17 UTC (permalink / raw)
  To: kvm
  Cc: Peter Gonda, Sean Christopherson, Marc Orr, David Rientjes,
	Brijesh Singh, linux-kernel

Adds test cases for intra host migration for SEV and SEV-ES. Also adds a
locking test to confirm no deadlock exists.

Signed-off-by: Peter Gonda <pgonda@google.com>
Suggested-by: Sean Christopherson <seanjc@google.com>
Cc: Marc Orr <marcorr@google.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../selftests/kvm/x86_64/sev_vm_tests.c       | 159 ++++++++++++++++++
 2 files changed, 160 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/x86_64/sev_vm_tests.c

diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index c103873531e0..44fd3566fb51 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -72,6 +72,7 @@ TEST_GEN_PROGS_x86_64 += x86_64/vmx_pmu_msrs_test
 TEST_GEN_PROGS_x86_64 += x86_64/xen_shinfo_test
 TEST_GEN_PROGS_x86_64 += x86_64/xen_vmcall_test
 TEST_GEN_PROGS_x86_64 += x86_64/vmx_pi_mmio_test
+TEST_GEN_PROGS_x86_64 += x86_64/sev_vm_tests
 TEST_GEN_PROGS_x86_64 += access_tracking_perf_test
 TEST_GEN_PROGS_x86_64 += demand_paging_test
 TEST_GEN_PROGS_x86_64 += dirty_log_test
diff --git a/tools/testing/selftests/kvm/x86_64/sev_vm_tests.c b/tools/testing/selftests/kvm/x86_64/sev_vm_tests.c
new file mode 100644
index 000000000000..8ce8dd63ca85
--- /dev/null
+++ b/tools/testing/selftests/kvm/x86_64/sev_vm_tests.c
@@ -0,0 +1,159 @@
+// SPDX-License-Identifier: GPL-2.0-only
+#include <linux/kvm.h>
+#include <linux/psp-sev.h>
+#include <stdio.h>
+#include <sys/ioctl.h>
+#include <stdlib.h>
+#include <errno.h>
+#include <pthread.h>
+
+#include "test_util.h"
+#include "kvm_util.h"
+#include "processor.h"
+#include "svm_util.h"
+#include "kselftest.h"
+#include "../lib/kvm_util_internal.h"
+
+#define SEV_DEV_PATH "/dev/sev"
+
+#define MIGRATE_TEST_NUM_VCPUS 4
+#define MIGRATE_TEST_VMS 3
+#define LOCK_TESTING_THREADS 3
+#define LOCK_TESTING_ITERATIONS 10000
+
+/*
+ * Open SEV_DEV_PATH if available, otherwise exit the entire program.
+ *
+ * Input Args:
+ *   flags - The flags to pass when opening SEV_DEV_PATH.
+ *
+ * Return:
+ *   The opened file descriptor of /dev/sev.
+ */
+static int open_sev_dev_path_or_exit(int flags)
+{
+	static int fd;
+
+	if (fd != 0)
+		return fd;
+
+	fd = open(SEV_DEV_PATH, flags);
+	if (fd < 0) {
+		print_skip("%s not available, is SEV not enabled? (errno: %d)",
+			   SEV_DEV_PATH, errno);
+		exit(KSFT_SKIP);
+	}
+
+	return fd;
+}
+
+static void sev_ioctl(int vm_fd, int cmd_id, void *data)
+{
+	struct kvm_sev_cmd cmd = {
+		.id = cmd_id,
+		.data = (uint64_t)data,
+		.sev_fd = open_sev_dev_path_or_exit(0),
+	};
+	int ret;
+
+	TEST_ASSERT(cmd_id < KVM_SEV_NR_MAX && cmd_id >= 0,
+		    "Unknown SEV CMD : %d\n", cmd_id);
+
+	ret = ioctl(vm_fd, KVM_MEMORY_ENCRYPT_OP, &cmd);
+	TEST_ASSERT((ret == 0 || cmd.error == SEV_RET_SUCCESS),
+		    "%d failed: return code: %d, errno: %d, fw error: %d",
+		    cmd_id, ret, errno, cmd.error);
+}
+
+static struct kvm_vm *sev_vm_create(bool es)
+{
+	struct kvm_vm *vm;
+	struct kvm_sev_launch_start start = { 0 };
+	int i;
+
+	vm = vm_create(VM_MODE_DEFAULT, 0, O_RDWR);
+	sev_ioctl(vm->fd, es ? KVM_SEV_ES_INIT : KVM_SEV_INIT, NULL);
+	for (i = 0; i < MIGRATE_TEST_NUM_VCPUS; ++i)
+		vm_vcpu_add(vm, i);
+	start.policy |= (es) << 2;
+	sev_ioctl(vm->fd, KVM_SEV_LAUNCH_START, &start);
+	if (es)
+		sev_ioctl(vm->fd, KVM_SEV_LAUNCH_UPDATE_VMSA, NULL);
+	return vm;
+}
+
+static void test_sev_migrate_from(bool es)
+{
+	struct kvm_vm *vms[MIGRATE_TEST_VMS];
+	struct kvm_enable_cap cap = {
+		.cap = KVM_CAP_VM_MIGRATE_ENC_CONTEXT_FROM
+	};
+	int i;
+
+	for (i = 0; i < MIGRATE_TEST_VMS; ++i) {
+		vms[i] = sev_vm_create(es);
+		if (i > 0) {
+			cap.args[0] = vms[i - 1]->fd;
+			vm_enable_cap(vms[i], &cap);
+		}
+	}
+}
+
+struct locking_thread_input {
+	struct kvm_vm *vm;
+	int source_fds[LOCK_TESTING_THREADS];
+};
+
+static void *locking_test_thread(void *arg)
+{
+	/*
+	 * This test case runs a number of threads all trying to use the intra
+	 * host migration ioctls. This tries to detect if a deadlock exists.
+	 */
+	struct kvm_enable_cap cap = {
+		.cap = KVM_CAP_VM_MIGRATE_ENC_CONTEXT_FROM
+	};
+	int i, j;
+	struct locking_thread_input *input = (struct locking_thread_input *)arg;
+
+	for (i = 0; i < LOCK_TESTING_ITERATIONS; ++i) {
+		j = i % LOCK_TESTING_THREADS;
+		cap.args[0] = input->source_fds[j];
+		/*
+		 * Call IOCTL directly without checking return code or
+		 * asserting. We are simply trying to confirm there is no
+		 * deadlock from userspace, not checking the correctness of
+		 * migration here.
+		 */
+		ioctl(input->vm->fd, KVM_ENABLE_CAP, &cap);
+	}
+}
+
+static void test_sev_migrate_locking(void)
+{
+	struct locking_thread_input input[LOCK_TESTING_THREADS];
+	pthread_t pt[LOCK_TESTING_THREADS];
+	int i;
+
+	for (i = 0; i < LOCK_TESTING_THREADS; ++i) {
+		input[i].vm = sev_vm_create(/* es= */ false);
+		input[0].source_fds[i] = input[i].vm->fd;
+	}
+	for (i = 1; i < LOCK_TESTING_THREADS; ++i)
+		memcpy(input[i].source_fds, input[0].source_fds,
+		       sizeof(input[i].source_fds));
+
+	for (i = 0; i < LOCK_TESTING_THREADS; ++i)
+		pthread_create(&pt[i], NULL, locking_test_thread, &input[i]);
+
+	for (i = 0; i < LOCK_TESTING_THREADS; ++i)
+		pthread_join(pt[i], NULL);
+}
+
+int main(int argc, char *argv[])
+{
+	test_sev_migrate_from(/* es= */ false);
+	test_sev_migrate_from(/* es= */ true);
+	test_sev_migrate_locking();
+	return 0;
+}
-- 
2.33.0.153.gba50c8fa24-goog


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH 0/2 V7] Add AMD SEV and SEV-ES intra host migration support
  2021-09-02 18:17 [PATCH 0/2 V7] Add AMD SEV and SEV-ES intra host migration support Peter Gonda
                   ` (2 preceding siblings ...)
  2021-09-02 18:17 ` [PATCH 3/3 V7] selftest: KVM: Add intra host migration tests Peter Gonda
@ 2021-09-02 18:45 ` Sean Christopherson
  2021-09-02 18:53   ` Peter Gonda
  3 siblings, 1 reply; 19+ messages in thread
From: Sean Christopherson @ 2021-09-02 18:45 UTC (permalink / raw)
  To: Peter Gonda; +Cc: kvm

Please Cc the cover letter to anyone that was Cc'd on one or more patches.  That's
especially helpful if some recipients aren't subscribed to KVM.  Oh, and Cc lkml
as well, otherwise I believe lore, patchwork, etc... won't have the cover letter.

On Thu, Sep 02, 2021, Peter Gonda wrote:
> Intra host migration provides a low-cost mechanism for userspace VMM
> upgrades.  It is an alternative to traditional (i.e., remote) live
> migration. Whereas remote migration handles moving a guest to a new host,
> intra host migration only handles moving a guest to a new userspace VMM
> within a host.  This can be used to update, rollback, change flags of the
> VMM, etc. The lower cost compared to live migration comes from the fact
> that the guest's memory does not need to be copied between processes. A
> handle to the guest memory simply gets passed to the new VMM, this could
> be done via /dev/shm with share=on or similar feature.
> 
> The guest state can be transferred from an old VMM to a new VMM as follows:
> 1. Export guest state from KVM to the old user-space VMM via a getter
> user-space/kernel API 2. Transfer guest state from old VMM to new VMM via
> IPC communication 3. Import guest state into KVM from the new user-space
> VMM via a setter user-space/kernel API VMMs by exporting from KVM using
> getters, sending that data to the new VMM, then setting it again in KVM.
> 
> In the common case for intra host migration, we can rely on the normal
> ioctls for passing data from one VMM to the next. SEV, SEV-ES, and other
> confidential compute environments make most of this information opaque, and
> render KVM ioctls such as "KVM_GET_REGS" irrelevant.  As a result, we need
> the ability to pass this opaque metadata from one VMM to the next. The
> easiest way to do this is to leave this data in the kernel, and transfer
> ownership of the metadata from one KVM VM (or vCPU) to the next. For
> example, we need to move the SEV enabled ASID, VMSAs, and GHCB metadata
> from one VMM to the next.  In general, we need to be able to hand off any
> data that would be unsafe/impossible for the kernel to hand directly to
> userspace (and cannot be reproduced using data that can be handed safely to
> userspace).
> 
> For the intra host operation the SEV required metadata, the source VM FD is
> sent to the target VMM. The target VMM calls the new cap ioctl with the
> source VM FD, KVM then moves all the SEV state to the target VM from the
> source VM.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 0/2 V7] Add AMD SEV and SEV-ES intra host migration support
  2021-09-02 18:45 ` [PATCH 0/2 V7] Add AMD SEV and SEV-ES intra host migration support Sean Christopherson
@ 2021-09-02 18:53   ` Peter Gonda
  0 siblings, 0 replies; 19+ messages in thread
From: Peter Gonda @ 2021-09-02 18:53 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm list, Marc Orr, Paolo Bonzini, David Rientjes,
	Dr. David Alan Gilbert, Singh, Brijesh, Joerg Roedel,
	Thomas Gleixner, Ingo Molnar, linux-kernel

On Thu, Sep 2, 2021 at 12:45 PM Sean Christopherson <seanjc@google.com> wrote:
>
> Please Cc the cover letter to anyone that was Cc'd on one or more patches.  That's
> especially helpful if some recipients aren't subscribed to KVM.  Oh, and Cc lkml
> as well, otherwise I believe lore, patchwork, etc... won't have the cover letter.

Adding CCs here. Thanks.

>
> On Thu, Sep 02, 2021, Peter Gonda wrote:
> > Intra host migration provides a low-cost mechanism for userspace VMM
> > upgrades.  It is an alternative to traditional (i.e., remote) live
> > migration. Whereas remote migration handles moving a guest to a new host,
> > intra host migration only handles moving a guest to a new userspace VMM
> > within a host.  This can be used to update, rollback, change flags of the
> > VMM, etc. The lower cost compared to live migration comes from the fact
> > that the guest's memory does not need to be copied between processes. A
> > handle to the guest memory simply gets passed to the new VMM, this could
> > be done via /dev/shm with share=on or similar feature.
> >
> > The guest state can be transferred from an old VMM to a new VMM as follows:
> > 1. Export guest state from KVM to the old user-space VMM via a getter
> > user-space/kernel API 2. Transfer guest state from old VMM to new VMM via
> > IPC communication 3. Import guest state into KVM from the new user-space
> > VMM via a setter user-space/kernel API VMMs by exporting from KVM using
> > getters, sending that data to the new VMM, then setting it again in KVM.
> >
> > In the common case for intra host migration, we can rely on the normal
> > ioctls for passing data from one VMM to the next. SEV, SEV-ES, and other
> > confidential compute environments make most of this information opaque, and
> > render KVM ioctls such as "KVM_GET_REGS" irrelevant.  As a result, we need
> > the ability to pass this opaque metadata from one VMM to the next. The
> > easiest way to do this is to leave this data in the kernel, and transfer
> > ownership of the metadata from one KVM VM (or vCPU) to the next. For
> > example, we need to move the SEV enabled ASID, VMSAs, and GHCB metadata
> > from one VMM to the next.  In general, we need to be able to hand off any
> > data that would be unsafe/impossible for the kernel to hand directly to
> > userspace (and cannot be reproduced using data that can be handed safely to
> > userspace).
> >
> > For the intra host operation the SEV required metadata, the source VM FD is
> > sent to the target VMM. The target VMM calls the new cap ioctl with the
> > source VM FD, KVM then moves all the SEV state to the target VM from the
> > source VM.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/3 V7] KVM, SEV: Add support for SEV intra host migration
  2021-09-02 18:17 ` [PATCH 1/3 V7] KVM, SEV: Add support for SEV intra host migration Peter Gonda
@ 2021-09-10  0:11   ` Sean Christopherson
  2021-09-10  1:12     ` Sean Christopherson
                       ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Sean Christopherson @ 2021-09-10  0:11 UTC (permalink / raw)
  To: Peter Gonda
  Cc: kvm, Marc Orr, Paolo Bonzini, David Rientjes,
	Dr . David Alan Gilbert, Brijesh Singh, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson, Joerg Roedel, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, H. Peter Anvin, linux-kernel

Nit, preferred shortlog scope is "KVM: SEV:"

On Thu, Sep 02, 2021, Peter Gonda wrote:
> For SEV to work with intra host migration, contents of the SEV info struct
> such as the ASID (used to index the encryption key in the AMD SP) and
> the list of memory regions need to be transferred to the target VM.
> This change adds a commands for a target VMM to get a source SEV VM's sev
> info.
> 
> The target is expected to be initialized (sev_guest_init), but not
> launched state (sev_launch_start) when performing receive. Once the
> target has received, it will be in a launched state and will not
> need to perform the typical SEV launch commands.
> 
> Signed-off-by: Peter Gonda <pgonda@google.com>
> Suggested-by: Sean Christopherson <seanjc@google.com>
> Reviewed-by: Marc Orr <marcorr@google.com>
> Cc: Marc Orr <marcorr@google.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Sean Christopherson <seanjc@google.com>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Cc: Brijesh Singh <brijesh.singh@amd.com>
> Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
> Cc: Wanpeng Li <wanpengli@tencent.com>
> Cc: Jim Mattson <jmattson@google.com>
> Cc: Joerg Roedel <joro@8bytes.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: kvm@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> ---
>  Documentation/virt/kvm/api.rst  |  15 +++++
>  arch/x86/include/asm/kvm_host.h |   1 +
>  arch/x86/kvm/svm/sev.c          | 101 ++++++++++++++++++++++++++++++++
>  arch/x86/kvm/svm/svm.c          |   1 +
>  arch/x86/kvm/svm/svm.h          |   2 +
>  arch/x86/kvm/x86.c              |   5 ++
>  include/uapi/linux/kvm.h        |   1 +
>  7 files changed, 126 insertions(+)
> 
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index 4ea1bb28297b..e8cecc024649 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -6702,6 +6702,21 @@ MAP_SHARED mmap will result in an -EINVAL return.
>  When enabled the VMM may make use of the ``KVM_ARM_MTE_COPY_TAGS`` ioctl to
>  perform a bulk copy of tags to/from the guest.
>  
> +7.29 KVM_CAP_VM_MIGRATE_ENC_CONTEXT_FROM
> +-------------------------------------

Do we really want to bury this under KVM_CAP?  Even KVM_CAP_VM_COPY_ENC_CONTEXT_FROM
is a bit of a stretch, but at least that's a one-way "enabling", whereas this
migration routine should be able to handle multiple migrations, e.g. migrate A->B
and B->A.  Peeking at your selftest, it should be fairly easy to add in this edge
case.

This is probably a Paolo question, I've no idea if there's a desire to expand
KVM_CAP versus adding a new ioctl().

> +Architectures: x86 SEV enabled
> +Type: vm
> +Parameters: args[0] is the fd of the source vm
> +Returns: 0 on success

It'd be helpful to provide a brief description of the error cases.  Looks like
-EINVAL is the only possible error?

> +This capability enables userspace to migrate the encryption context

I would prefer to scope this beyond "encryption context".  Even for SEV, it
copies more than just the "context", which was an abstraction of SEV's ASID,
e.g. this also hands off the set of encrypted memory regions.  Looking toward
the future, if TDX wants to support this it's going to need to hand over a ton
of stuff, e.g. S-EPT tables.

Not sure on a name, maybe MIGRATE_PROTECTED_VM_FROM?

> from the vm

Capitalize VM in the description, if only to be consistent within these two
paragraphs.  If it helps, pretend all the terrible examples in this file don't
exist ;-)

> +indicated by the fd to the vm this is called on.
> +
> +This is intended to support intra-host migration of VMs between userspace VMMs.
> +in-guest workloads scheduled by the host. This allows for upgrading the VMM

This snippet (and the lowercase "vm") looks like it was left behind after a
copy-paste from KVM_CAP_VM_COPY_ENC_CONTEXT_FROM.

> +process without interrupting the guest.
> +
>  8. Other capabilities.
>  ======================
>  
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 09b256db394a..f06d87a85654 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1456,6 +1456,7 @@ struct kvm_x86_ops {
>  	int (*mem_enc_reg_region)(struct kvm *kvm, struct kvm_enc_region *argp);
>  	int (*mem_enc_unreg_region)(struct kvm *kvm, struct kvm_enc_region *argp);
>  	int (*vm_copy_enc_context_from)(struct kvm *kvm, unsigned int source_fd);
> +	int (*vm_migrate_enc_context_from)(struct kvm *kvm, unsigned int source_fd);
>  
>  	int (*get_msr_feature)(struct kvm_msr_entry *entry);
>  
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 46eb1ba62d3d..8db666a362d4 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -1501,6 +1501,107 @@ static int sev_receive_finish(struct kvm *kvm, struct kvm_sev_cmd *argp)
>  	return sev_issue_cmd(kvm, SEV_CMD_RECEIVE_FINISH, &data, &argp->error);
>  }
>  
> +static int svm_sev_lock_for_migration(struct kvm *kvm)
> +{
> +	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
> +
> +	/*
> +	 * Bail if this VM is already involved in a migration to avoid deadlock
> +	 * between two VMs trying to migrate to/from each other.
> +	 */
> +	if (atomic_cmpxchg_acquire(&sev->migration_in_progress, 0, 1))
> +		return -EBUSY;
> +
> +	mutex_lock(&kvm->lock);
> +
> +	return 0;
> +}
> +
> +static void svm_unlock_after_migration(struct kvm *kvm)
> +{
> +	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
> +
> +	mutex_unlock(&kvm->lock);
> +	atomic_set_release(&sev->migration_in_progress, 0);
> +}
> +
> +static void migrate_info_from(struct kvm_sev_info *dst,
> +			      struct kvm_sev_info *src)
> +{
> +	sev_asid_free(dst);

Ooh, this brings up a potential shortcoming of requiring @dst to be SEV-enabled.
If every SEV{-ES} ASID is allocated, then there won't be an available ASID to
(temporarily) allocate for the intra-host migration.  But that temporary ASID
isn't actually necessary, i.e. there's no reason intra-host migration should fail
if all ASIDs are in-use.

I don't see any harm in requiring the @dst to _not_ be SEV-enabled.  sev_info
is not dynamically allocated, i.e. migration_in_progress is accessible either
way.  That would also simplify some of the checks, e.g. the regions_list check
goes away because svm_register_enc_region() fails on non-SEV guests.

I believe this will also fix multiple bugs in the next patch (SEV-ES support).

Bug #1, SEV-ES support changes the checks to:

	if (!sev_guest(kvm)) {
		ret = -EINVAL;
		pr_warn_ratelimited("VM must be SEV enabled to migrate to.\n");
		goto out_unlock;
	}

	...

	if (!sev_guest(source_kvm)) {
		ret = -EINVAL;
		pr_warn_ratelimited(
			"Source VM must be SEV enabled to migrate from.\n");
		goto out_source;
	}

	if (sev_es_guest(kvm)) {
		ret = migrate_vmsa_from(kvm, source_kvm);
		if (ret)
			goto out_source;
	}

and fails to handle the scenario where dst.SEV_ES != src.SEV_ES.  If @dst is
SEV_ES-enabled and @src has created vCPUs, migrate_vmsa_from() will still fail
due to guest_state_protected being false, but the reverse won't hold true and
KVM will "successfully" migrate an SEV-ES guest to an SEV guest.  I'm guessing
fireworks will ensue, e.g. due to running with the wrong ASID.

Bug #2, migrate_vmsa_from() leaks dst->vmsa, as this

		dst_svm->vmsa = src_svm->vmsa;
		src_svm->vmsa = NULL;

overwrites dst_svm->vmsa that was allocated by svm_create_vcpu().

AFAICT, there isn't anything that will break by forcing @dst to be !SEV (except
stuff that's already broken, see below).  For SEV{-ES} specific stuff, anything
that is allocated/set vCPU creation likely needs to be migrated, e.g. VMSA and
the GHCB MSR value.  The only missing action is kvm_free_guest_fpu().

Side topic, the VMSA really should be allocated in sev_es_create_vcpu(), and
guest_fpu should never be allocated for SEV-ES guests (though that doesn't change
the need for kvm_free_guest_fpu() in this case).  I'll send patches for that.

> +	dst->asid = src->asid;
> +	dst->misc_cg = src->misc_cg;
> +	dst->handle = src->handle;
> +	dst->pages_locked = src->pages_locked;
> +
> +	src->asid = 0;
> +	src->active = false;
> +	src->handle = 0;
> +	src->pages_locked = 0;
> +	src->misc_cg = NULL;
> +
> +	INIT_LIST_HEAD(&dst->regions_list);
> +	list_replace_init(&src->regions_list, &dst->regions_list);
> +}
> +
> +int svm_vm_migrate_from(struct kvm *kvm, unsigned int source_fd)
> +{
> +	struct kvm_sev_info *dst_sev = &to_kvm_svm(kvm)->sev_info;
> +	struct file *source_kvm_file;
> +	struct kvm *source_kvm;
> +	int ret;
> +
> +	ret = svm_sev_lock_for_migration(kvm);
> +	if (ret)
> +		return ret;
> +
> +	if (!sev_guest(kvm) || sev_es_guest(kvm)) {
> +		ret = -EINVAL;
> +		pr_warn_ratelimited("VM must be SEV enabled to migrate to.\n");

Linux generally doesn't log user errors to dmesg.  They can be helpful during
development, but aren't actionable and thus are of limited use in production.

> +		goto out_unlock;
> +	}

Hmm, I was going to say that migration should be rejected if @dst has created
vCPUs, but the SEV-ES support migrates VMSA state and so must run after vCPUs
are created.  Holding kvm->lock does not prevent invoking per-vCPU ioctls(),
including KVM_RUN.  Modifying vCPU SEV{-ES} state while a vCPU is actively running
is bound to cause explosions.

One option for this patch would be to check kvm->created_vcpus and then add
different logic for SEV-ES, but that's probably not desirable for userspace as
it will mean triggering intra-host migration at different points for SEV vs. SEV-ES.

So I think the only option is to take vcpu->mutex for all vCPUs in both @src and
@dst.  Adding that after acquiring kvm->lock in svm_sev_lock_for_migration()
should Just Work.  Unless userspace is misbehaving, the lock won't be contended
since all vCPUs need to be quiesced, though it's probably worth using the
mutex_lock_killable() variant just to be safe.
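
Something like this (completely untested, names purely illustrative) is
roughly what I have in mind:

	static int sev_lock_vcpus_for_migration(struct kvm *kvm)
	{
		struct kvm_vcpu *vcpu;
		int i, j;

		kvm_for_each_vcpu(i, vcpu, kvm) {
			if (mutex_lock_killable(&vcpu->mutex))
				goto out_unlock;
		}
		return 0;

	out_unlock:
		kvm_for_each_vcpu(j, vcpu, kvm) {
			/* Unwind only the vCPUs that were already locked. */
			if (j == i)
				break;
			mutex_unlock(&vcpu->mutex);
		}
		return -EINTR;
	}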

> +	if (!list_empty(&dst_sev->regions_list)) {
> +		ret = -EINVAL;
> +		pr_warn_ratelimited(
> +			"VM must not have encrypted regions to migrate to.\n");
> +		goto out_unlock;
> +	}
> +
> +	source_kvm_file = fget(source_fd);
> +	if (!file_is_kvm(source_kvm_file)) {
> +		ret = -EBADF;
> +		pr_warn_ratelimited(
> +				"Source VM must be SEV enabled to migrate from.\n");

Case in point for not logging errors, this is arguably inaccurate as the source
"VM" isn't a VM.

> +		goto out_fput;
> +	}
> +
> +	source_kvm = source_kvm_file->private_data;
> +	ret = svm_sev_lock_for_migration(source_kvm);
> +	if (ret)
> +		goto out_fput;
> +
> +	if (!sev_guest(source_kvm) || sev_es_guest(source_kvm)) {
> +		ret = -EINVAL;
> +		pr_warn_ratelimited(
> +			"Source VM must be SEV enabled to migrate from.\n");
> +		goto out_source;
> +	}
> +
> +	migrate_info_from(dst_sev, &to_kvm_svm(source_kvm)->sev_info);
> +	ret = 0;
> +
> +out_source:
> +	svm_unlock_after_migration(source_kvm);
> +out_fput:
> +	if (source_kvm_file)
> +		fput(source_kvm_file);
> +out_unlock:
> +	svm_unlock_after_migration(kvm);
> +	return ret;
> +}

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/3 V7] KVM, SEV: Add support for SEV-ES intra host migration
  2021-09-02 18:17 ` [PATCH 2/3 V7] KVM, SEV: Add support for SEV-ES " Peter Gonda
@ 2021-09-10  0:50   ` Sean Christopherson
  2021-09-10  1:20     ` Sean Christopherson
  0 siblings, 1 reply; 19+ messages in thread
From: Sean Christopherson @ 2021-09-10  0:50 UTC (permalink / raw)
  To: Peter Gonda
  Cc: kvm, Marc Orr, Paolo Bonzini, David Rientjes,
	Dr . David Alan Gilbert, Brijesh Singh, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson, Joerg Roedel, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, H. Peter Anvin, linux-kernel

On Thu, Sep 02, 2021, Peter Gonda wrote:
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 8db666a362d4..fac21a82e4de 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -1545,6 +1545,59 @@ static void migrate_info_from(struct kvm_sev_info *dst,
>  	list_replace_init(&src->regions_list, &dst->regions_list);
>  }
>  
> +static int migrate_vmsa_from(struct kvm *dst, struct kvm *src)
> +{
> +	int i, num_vcpus;
> +	struct kvm_vcpu *dst_vcpu, *src_vcpu;
> +	struct vcpu_svm *dst_svm, *src_svm;
> +
> +	num_vcpus = atomic_read(&dst->online_vcpus);
> +	if (num_vcpus != atomic_read(&src->online_vcpus)) {
> +		pr_warn_ratelimited(
> +			"Source and target VMs must have same number of vCPUs.\n");

Same comments about not logging the why.

> +		return -EINVAL;
> +	}
> +
> +	for (i = 0; i < num_vcpus; ++i) {
> +		src_vcpu = src->vcpus[i];

This can be:

	kvm_for_each_vcpu(i, src_vcpu, src) {
		if (!src_vcpu->arch.guest_state_protected)
			return -EINVAL;

	}
> +		if (!src_vcpu->arch.guest_state_protected) {
> +			pr_warn_ratelimited(
> +				"Source ES VM vCPUs must have protected state.\n");
> +			return -EINVAL;
> +		}
> +	}
> +
> +	for (i = 0; i < num_vcpus; ++i) {

And again here,

	kvm_for_each_vcpu(i, src_vcpu, src) {
		src_svm = to_svm(src_vcpu);

> +		src_vcpu = src->vcpus[i];
> +		src_svm = to_svm(src_vcpu);
> +		dst_vcpu = dst->vcpus[i];

Probably a good idea to use kvm_get_vcpu(), even though dst->lock is held.  If
nothing else, using kvm_get_vcpu() may save some merge pain as there's a proposal
to switch vcpus to an xarray.
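
E.g. (illustrative only):

	dst_vcpu = kvm_get_vcpu(dst, i);
	dst_svm = to_svm(dst_vcpu);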

> +		dst_svm = to_svm(dst_vcpu);
> +
> +		/*
> +		 * Copy VMSA and GHCB fields from the source to the destination.
> +		 * Clear them on the source to prevent the VM running and

As brought up in the prior patch, clearing the fields might ensure future KVM_RUNs
fail, but it doesn't prevent the VM from running _now_.  And with vcpu->mutex
held, I think a more appropriate comment would be:

		/*
		 * Transfer VMSA and GHCB state to the destination.  Nullify and
		 * clear source fields as appropriate, the state now belongs to
		 * the destination.
		 */

> +		 * changing the state of the VMSA/GHCB unexpectedly.
> +		 */
> +		dst_vcpu->vcpu_id = src_vcpu->vcpu_id;
> +		dst_svm->vmsa = src_svm->vmsa;
> +		src_svm->vmsa = NULL;
> +		dst_svm->ghcb = src_svm->ghcb;
> +		src_svm->ghcb = NULL;
> +		dst_svm->vmcb->control.ghcb_gpa =
> +				src_svm->vmcb->control.ghcb_gpa;

Let this poke out, an 83 char line isn't the end of the world, and not
interrupting the statement makes the code more readable overall.

> +		src_svm->vmcb->control.ghcb_gpa = 0;

Nit, '0' isn't an invalid GPA.  The reset value would be more appropriate, though
I would just leave this alone.

> +		dst_svm->ghcb_sa = src_svm->ghcb_sa;
> +		src_svm->ghcb_sa = NULL;
> +		dst_svm->ghcb_sa_len = src_svm->ghcb_sa_len;
> +		src_svm->ghcb_sa_len = 0;
> +		dst_svm->ghcb_sa_sync = src_svm->ghcb_sa_sync;
> +		src_svm->ghcb_sa_sync = false;
> +		dst_svm->ghcb_sa_free = src_svm->ghcb_sa_free;
> +		src_svm->ghcb_sa_free = false;
> +	}
> +	return 0;
> +}

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/3 V7] KVM, SEV: Add support for SEV intra host migration
  2021-09-10  0:11   ` Sean Christopherson
@ 2021-09-10  1:12     ` Sean Christopherson
  2021-09-13 16:21       ` Peter Gonda
  2021-09-10  1:15     ` Marc Orr
  2021-09-10 21:54     ` Peter Gonda
  2 siblings, 1 reply; 19+ messages in thread
From: Sean Christopherson @ 2021-09-10  1:12 UTC (permalink / raw)
  To: Peter Gonda
  Cc: kvm, Marc Orr, Paolo Bonzini, David Rientjes,
	Dr . David Alan Gilbert, Brijesh Singh, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson, Joerg Roedel, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, H. Peter Anvin, linux-kernel

On Fri, Sep 10, 2021, Sean Christopherson wrote:
> Ooh, this brings up a potential shortcoming of requiring @dst to be SEV-enabled.
> If every SEV{-ES} ASID is allocated, then there won't be an available ASID to
> (temporarily) allocate for the intra-host migration.  But that temporary ASID
> isn't actually necessary, i.e. there's no reason intra-host migration should fail
> if all ASIDs are in-use.

...

> So I think the only option is to take vcpu->mutex for all vCPUs in both @src and
> @dst.  Adding that after acquiring kvm->lock in svm_sev_lock_for_migration()
> should Just Work.  Unless userspace is misbehaving, the lock won't be contended
> since all vCPUs need to be quiesced, though it's probably worth using the
> mutex_lock_killable() variant just to be safe.

Circling back to this after looking at the SEV-ES support, I think the vCPUs in
the source VM need to be reset via kvm_vcpu_reset(vcpu, false).  I doubt there's
a use case for actually doing anything with the vCPU, but leaving it runnable
without purging state makes me nervous.

Alternative #1 would be to mark vCPUs as dead in some way so as to prevent doing
anything useful with the vCPU.

Alternative #2 would be to "kill" the source VM by setting kvm->vm_bugged to
prevent all ioctls().

The downside to preventing future ioctls() is that this would need to be the
very last step of migration.  Not sure if that's problematic?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/3 V7] KVM, SEV: Add support for SEV intra host migration
  2021-09-10  0:11   ` Sean Christopherson
  2021-09-10  1:12     ` Sean Christopherson
@ 2021-09-10  1:15     ` Marc Orr
  2021-09-10  1:40       ` Sean Christopherson
  2021-09-10 21:54     ` Peter Gonda
  2 siblings, 1 reply; 19+ messages in thread
From: Marc Orr @ 2021-09-10  1:15 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Peter Gonda, kvm list, Paolo Bonzini, David Rientjes,
	Dr . David Alan Gilbert, Brijesh Singh, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson, Joerg Roedel, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, H. Peter Anvin, linux-kernel

> > +     dst->asid = src->asid;
> > +     dst->misc_cg = src->misc_cg;
> > +     dst->handle = src->handle;
> > +     dst->pages_locked = src->pages_locked;
> > +
> > +     src->asid = 0;
> > +     src->active = false;
> > +     src->handle = 0;
> > +     src->pages_locked = 0;
> > +     src->misc_cg = NULL;
> > +
> > +     INIT_LIST_HEAD(&dst->regions_list);
> > +     list_replace_init(&src->regions_list, &dst->regions_list);
> > +}
> > +
> > +int svm_vm_migrate_from(struct kvm *kvm, unsigned int source_fd)
> > +{
> > +     struct kvm_sev_info *dst_sev = &to_kvm_svm(kvm)->sev_info;
> > +     struct file *source_kvm_file;
> > +     struct kvm *source_kvm;
> > +     int ret;
> > +
> > +     ret = svm_sev_lock_for_migration(kvm);
> > +     if (ret)
> > +             return ret;
> > +
> > +     if (!sev_guest(kvm) || sev_es_guest(kvm)) {
> > +             ret = -EINVAL;
> > +             pr_warn_ratelimited("VM must be SEV enabled to migrate to.\n");
>
> Linux generally doesn't log user errors to dmesg.  They can be helpful during
> development, but aren't actionable and thus are of limited use in production.

Ha. I had suggested adding the logs when I reviewed these patches
(maybe before Peter posted them publicly). My rationale is that if I'm
looking at a crash in production, and all I have is a stack trace and
the error code, then I can narrow the failure down to this function,
but once the function starts returning the same error code in multiple
places now it's non-trivial for me to deduce exactly which condition
caused the crash. Having these logs makes it trivial. However, if this
is not the preferred Linux style then so be it.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/3 V7] KVM, SEV: Add support for SEV-ES intra host migration
  2021-09-10  0:50   ` Sean Christopherson
@ 2021-09-10  1:20     ` Sean Christopherson
  0 siblings, 0 replies; 19+ messages in thread
From: Sean Christopherson @ 2021-09-10  1:20 UTC (permalink / raw)
  To: Peter Gonda
  Cc: kvm, Marc Orr, Paolo Bonzini, David Rientjes,
	Dr . David Alan Gilbert, Brijesh Singh, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson, Joerg Roedel, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, H. Peter Anvin, linux-kernel

On Fri, Sep 10, 2021, Sean Christopherson wrote:
> On Thu, Sep 02, 2021, Peter Gonda wrote:
> > diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> > index 8db666a362d4..fac21a82e4de 100644
> > --- a/arch/x86/kvm/svm/sev.c
> > +++ b/arch/x86/kvm/svm/sev.c
> > @@ -1545,6 +1545,59 @@ static void migrate_info_from(struct kvm_sev_info *dst,
> >  	list_replace_init(&src->regions_list, &dst->regions_list);
> >  }
> >  
> > +static int migrate_vmsa_from(struct kvm *dst, struct kvm *src)

Better to call this sev_es_migrate_from()...

> > +{
> > +	int i, num_vcpus;
> > +	struct kvm_vcpu *dst_vcpu, *src_vcpu;
> > +	struct vcpu_svm *dst_svm, *src_svm;
> > +

...because this should also clear kvm->es_active.  KVM_SEV_INIT isn't problematic
(as currently written) because the common sev_guest_init() explicitly writes es_active,
but I think a clever userspace could get an SEV ASID into an "ES" guest via
KVM_CAP_VM_COPY_ENC_CONTEXT_FROM, which requires its dst to be !SEV and thus
doesn't touch es_active.

Huh, that's a bug, svm_vm_copy_asid_from() should explicitly disallow copying the
ASID from an SEV-ES guest.  I'll send a patch for that.

Last thought, it's probably worth renaming migrate_info_from() to sev_migrate_from()
to pair with sev_es_migrate_from().

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/3 V7] KVM, SEV: Add support for SEV intra host migration
  2021-09-10  1:15     ` Marc Orr
@ 2021-09-10  1:40       ` Sean Christopherson
  2021-09-10  3:41         ` Marc Orr
  0 siblings, 1 reply; 19+ messages in thread
From: Sean Christopherson @ 2021-09-10  1:40 UTC (permalink / raw)
  To: Marc Orr
  Cc: Peter Gonda, kvm list, Paolo Bonzini, David Rientjes,
	Dr . David Alan Gilbert, Brijesh Singh, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson, Joerg Roedel, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, H. Peter Anvin, linux-kernel

On Thu, Sep 09, 2021, Marc Orr wrote:
> > > +int svm_vm_migrate_from(struct kvm *kvm, unsigned int source_fd)
> > > +{
> > > +     struct kvm_sev_info *dst_sev = &to_kvm_svm(kvm)->sev_info;
> > > +     struct file *source_kvm_file;
> > > +     struct kvm *source_kvm;
> > > +     int ret;
> > > +
> > > +     ret = svm_sev_lock_for_migration(kvm);
> > > +     if (ret)
> > > +             return ret;
> > > +
> > > +     if (!sev_guest(kvm) || sev_es_guest(kvm)) {
> > > +             ret = -EINVAL;
> > > +             pr_warn_ratelimited("VM must be SEV enabled to migrate to.\n");
> >
> > Linux generally doesn't log user errors to dmesg.  They can be helpful during
> > development, but aren't actionable and thus are of limited use in production.
> 
> Ha. I had suggested adding the logs when I reviewed these patches
> (maybe before Peter posted them publicly). My rationale is that if I'm
> looking at a crash in production, and all I have is a stack trace and
> the error code, then I can narrow the failure down to this function,
> but once the function starts returning the same error code in multiple
> places now it's non-trivial for me to deduce exactly which condition
> caused the crash. Having these logs makes it trivial. However, if this
> is not the preferred Linux style then so be it.

I don't necessarily disagree, but none of these error conditions should so much
as sniff production.  E.g. if userspace invokes this on a !KVM fd or on a non-SEV
source, or before guest_state_protected=true, then userspace has bigger problems.
Ditto if the dest isn't actual KVM VM or doesn't meet whatever SEV-enabled/disabled
criteria we end up with.

The mismatch in online_vcpus is the only one where I could reasonably see a bug
escaping to production, e.g. due to an orchestration layer mixup.

For all of these conditions, userspace _must_ be aware of the conditions for success,
and except for guest_state_protected=true, userspace has access to what state it
sent into KVM, e.g. it shouldn't be difficult for userspace to dump the relevant bits
from the src and dst without any help from the kernel.

If userspace really needs kernel help to differentiate what's up, I'd rather use
more unique errors for online_cpus and guest_state_protected, e.g. -E2BIG isn't
too big of a stretch for the online_cpus mismatch.
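
E.g. (sketch):

	if (num_vcpus != atomic_read(&src->online_vcpus))
		return -E2BIG;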

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/3 V7] KVM, SEV: Add support for SEV intra host migration
  2021-09-10  1:40       ` Sean Christopherson
@ 2021-09-10  3:41         ` Marc Orr
  0 siblings, 0 replies; 19+ messages in thread
From: Marc Orr @ 2021-09-10  3:41 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Peter Gonda, kvm list, Paolo Bonzini, David Rientjes,
	Dr . David Alan Gilbert, Brijesh Singh, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson, Joerg Roedel, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, H. Peter Anvin, LKML

On Thu, Sep 9, 2021 at 6:40 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Thu, Sep 09, 2021, Marc Orr wrote:
> > > > +int svm_vm_migrate_from(struct kvm *kvm, unsigned int source_fd)
> > > > +{
> > > > +     struct kvm_sev_info *dst_sev = &to_kvm_svm(kvm)->sev_info;
> > > > +     struct file *source_kvm_file;
> > > > +     struct kvm *source_kvm;
> > > > +     int ret;
> > > > +
> > > > +     ret = svm_sev_lock_for_migration(kvm);
> > > > +     if (ret)
> > > > +             return ret;
> > > > +
> > > > +     if (!sev_guest(kvm) || sev_es_guest(kvm)) {
> > > > +             ret = -EINVAL;
> > > > +             pr_warn_ratelimited("VM must be SEV enabled to migrate to.\n");
> > >
> > > Linux generally doesn't log user errors to dmesg.  They can be helpful during
> > > development, but aren't actionable and thus are of limited use in production.
> >
> > Ha. I had suggested adding the logs when I reviewed these patches
> > (maybe before Peter posted them publicly). My rationale is that if I'm
> > looking at a crash in production, and all I have is a stack trace and
> > the error code, then I can narrow the failure down to this function,
> > but once the function starts returning the same error code in multiple
> > places now it's non-trivial for me to deduce exactly which condition
> > caused the crash. Having these logs makes it trivial. However, if this
> > is not the preferred Linux style then so be it.
>
> I don't necessarily disagree, but none of these error conditions should so much
> as sniff production.  E.g. if userspace invokes this on a !KVM fd or on a non-SEV
> source, or before guest_state_protected=true, then userspace has bigger problems.
> Ditto if the dest isn't an actual KVM VM or doesn't meet whatever SEV-enabled/disabled
> criteria we end up with.
>
> The mismatch in online_vcpus is the only one where I could reasonably see a bug
> escaping to production, e.g. due to an orchestration layer mixup.
>
> For all of these conditions, userspace _must_ be aware of the conditions for success,
> and except for guest_state_protected=true, userspace has access to what state it
> sent into KVM, e.g. it shouldn't be difficult for userspace to dump the relevant bits
> from the src and dst without any help from the kernel.
>
> If userspace really needs kernel help to differentiate what's up, I'd rather use
> more unique errors for online_vcpus and guest_state_protected, e.g. -E2BIG isn't
> too big of a stretch for the online_vcpus mismatch.

SGTM, thanks.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/3 V7] selftest: KVM: Add intra host migration tests
  2021-09-02 18:17 ` [PATCH 3/3 V7] selftest: KVM: Add intra host migration tests Peter Gonda
@ 2021-09-10 17:16   ` Sean Christopherson
  2021-09-10 22:14     ` Peter Gonda
  0 siblings, 1 reply; 19+ messages in thread
From: Sean Christopherson @ 2021-09-10 17:16 UTC (permalink / raw)
  To: Peter Gonda; +Cc: kvm, Marc Orr, David Rientjes, Brijesh Singh, linux-kernel

On Thu, Sep 02, 2021, Peter Gonda wrote:
> +/*
> + * Open SEV_DEV_PATH if available, otherwise exit the entire program.
> + *
> + * Input Args:
> + *   flags - The flags to pass when opening SEV_DEV_PATH.
> + *
> + * Return:
> + *   The opened file descriptor of /dev/sev.
> + */
> +static int open_sev_dev_path_or_exit(int flags)
> +{
> +	static int fd;
> +
> +	if (fd != 0)
> +		return fd;

Caching the file descriptor here is unnecessary; it's used in exactly one function.

> +	fd = open(SEV_DEV_PATH, flags);
> +	if (fd < 0) {
> +		print_skip("%s not available, is SEV not enabled? (errno: %d)",
> +			   SEV_DEV_PATH, errno);
> +		exit(KSFT_SKIP);
> +	}
> +
> +	return fd;
> +}

Rather than copy-paste _open_kvm_dev_path_or_exit(), it's probably worth factoring
out a helper in a separate patch, e.g.

diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index 10a8ed691c66..06a6c04010fb 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -31,6 +31,19 @@ static void *align(void *x, size_t size)
        return (void *) (((size_t) x + mask) & ~mask);
 }

+int open_path_or_exit(const char *path, int flags)
+{
+       int fd;
+
+       fd = open(path, flags);
+       if (fd < 0) {
+               print_skip("%s not available (errno: %d)", path, errno);
+               exit(KSFT_SKIP);
+       }
+
+       return fd;
+}
+
 /*
  * Open KVM_DEV_PATH if available, otherwise exit the entire program.
  *
@@ -42,16 +55,7 @@ static void *align(void *x, size_t size)
  */
 static int _open_kvm_dev_path_or_exit(int flags)
 {
-       int fd;
-
-       fd = open(KVM_DEV_PATH, flags);
-       if (fd < 0) {
-               print_skip("%s not available, is KVM loaded? (errno: %d)",
-                          KVM_DEV_PATH, errno);
-               exit(KSFT_SKIP);
-       }
-
-       return fd;
+       return open_path_or_exit(KVM_DEV_PATH, flags);
 }

 int open_kvm_dev_path_or_exit(void)


> +
> +static void sev_ioctl(int vm_fd, int cmd_id, void *data)
> +{
> +	struct kvm_sev_cmd cmd = {
> +		.id = cmd_id,
> +		.data = (uint64_t)data,
> +		.sev_fd = open_sev_dev_path_or_exit(0),
> +	};
> +	int ret;
> +
> +	TEST_ASSERT(cmd_id < KVM_SEV_NR_MAX && cmd_id >= 0,
> +		    "Unknown SEV CMD : %d\n", cmd_id);

LOL, I like sanity checks, but asserting that the test itself isn't horrendously 
broken is a bit much.  And if someone manages to screw up that badly, the ioctl()
below will fail.

> +	ret = ioctl(vm_fd, KVM_MEMORY_ENCRYPT_OP, &cmd);
> +	TEST_ASSERT((ret == 0 || cmd.error == SEV_RET_SUCCESS),
> +		    "%d failed: return code: %d, errno: %d, fw error: %d",
> +		    cmd_id, ret, errno, cmd.error);
> +}
> +
> +static struct kvm_vm *sev_vm_create(bool es)
> +{
> +	struct kvm_vm *vm;
> +	struct kvm_sev_launch_start start = { 0 };
> +	int i;

Rather than cache /dev/sev in a helper, you can do:

	int sev_fd = open_path_or_exit(SEV_DEV_PATH, 0);

	sev_ioctl(vm, sev_fd, ...);

> +	vm = vm_create(VM_MODE_DEFAULT, 0, O_RDWR);
> +	sev_ioctl(vm->fd, es ? KVM_SEV_ES_INIT : KVM_SEV_INIT, NULL);
> +	for (i = 0; i < MIGRATE_TEST_NUM_VCPUS; ++i)
> +		vm_vcpu_add(vm, i);
> +	start.policy |= (es) << 2;

I had to go spelunking to confirm this is the "ES" policy, please do:

	if (es)
		start.policy |= SEV_POLICY_ES;

> +	sev_ioctl(vm->fd, KVM_SEV_LAUNCH_START, &start);
> +	if (es)
> +		sev_ioctl(vm->fd, KVM_SEV_LAUNCH_UPDATE_VMSA, NULL);


And with sev_fd scoped to this function:

	close(sev_fd);

which I think is legal?
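
E.g. a rough sketch of the whole thing (assuming sev_ioctl() is reworked to
take the sev_fd as a parameter, and that closing the fd after the launch
commands really is fine):

static struct kvm_vm *sev_vm_create(bool es)
{
	struct kvm_sev_launch_start start = { 0 };
	int sev_fd = open_path_or_exit(SEV_DEV_PATH, 0);
	struct kvm_vm *vm;
	int i;

	vm = vm_create(VM_MODE_DEFAULT, 0, O_RDWR);
	sev_ioctl(vm->fd, sev_fd, es ? KVM_SEV_ES_INIT : KVM_SEV_INIT, NULL);
	for (i = 0; i < NR_MIGRATE_TEST_VCPUS; ++i)
		vm_vcpu_add(vm, i);
	if (es)
		start.policy |= SEV_POLICY_ES;
	sev_ioctl(vm->fd, sev_fd, KVM_SEV_LAUNCH_START, &start);
	if (es)
		sev_ioctl(vm->fd, sev_fd, KVM_SEV_LAUNCH_UPDATE_VMSA, NULL);
	close(sev_fd);

	return vm;
}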

> +	return vm;
> +}
> +
> +static void test_sev_migrate_from(bool es)
> +{
> +	struct kvm_vm *vms[MIGRATE_TEST_VMS];

Prefix this and LOCK_TESTING_THREADS with NR_ so that it's clear these are arbitrary
numbers of things.  And I guess s/MIGRATE_TEST_NUM_VCPUS/NR_MIGRATE_TEST_VCPUS to
be consistent.

> +	struct kvm_enable_cap cap = {
> +		.cap = KVM_CAP_VM_MIGRATE_ENC_CONTEXT_FROM
> +	};
> +	int i;
> +
> +	for (i = 0; i < MIGRATE_TEST_VMS; ++i) {
> +		vms[i] = sev_vm_create(es);

It doesn't really matter, but closing these fds tests that KVM doesn't explode
when VMs are destroyed without the process exiting.

> +		if (i > 0) {
> +			cap.args[0] = vms[i - 1]->fd;
> +			vm_enable_cap(vms[i], &cap);
> +		}
> +	}

For giggles, we can also test migrating back (with some feedback from below
mixed in):

	/* Initial migration from the src to the first dst. */
	sev_migrate_from(dst_vms[0]->fd, src_vm->fd);

	for (i = 1; i < NR_MIGRATE_TEST_VMS; i++)
		sev_migrate_from(vms[i]->fd, vms[i - 1]->fd);

	/* Migrate the guest back to the original VM. */
	sev_migrate_from(src_vm->fd, dst_vms[NR_MIGRATE_TEST_VMS - 1]->fd);

> +}
> +
> +struct locking_thread_input {
> +	struct kvm_vm *vm;
> +	int source_fds[LOCK_TESTING_THREADS];
> +};
> +
> +static void *locking_test_thread(void *arg)
> +{
> +	/*
> +	 * This test case runs a number of threads all trying to use the intra
> +	 * host migration ioctls. This tries to detect if a deadlock exists.
> +	 */
> +	struct kvm_enable_cap cap = {
> +		.cap = KVM_CAP_VM_MIGRATE_ENC_CONTEXT_FROM
> +	};
> +	int i, j;
> +	struct locking_thread_input *input = (struct locking_test_thread *)arg;
> +
> +	for (i = 0; i < LOCK_TESTING_ITERATIONS; ++i) {
> +		j = input->source_fds[i % LOCK_TESTING_THREADS];
> +		cap.args[0] = input->source_fds[j];

This looks wrong, it's indexing source_fds with a value from source_fds.  Did
you intend?

		j = i % LOCK_TESTING_THREADS;
		cap.args[0] = input->source_fds[j];

> +		/*
> +		 * Call IOCTL directly without checking return code or
> +		 * asserting. We are * simply trying to confirm there is no
> +		 * deadlock from userspace * not check correctness of
> +		 * migration here.
> +		 */
> +		ioctl(input->vm->fd, KVM_ENABLE_CAP, &cap);

For readability and future extensibility, I'd say create a single helper and use
it even in the happy case, e.g.

static int __sev_migrate_from(int dst_fd, int src_fd)
{
	struct kvm_enable_cap cap = {
		.cap = KVM_CAP_VM_MIGRATE_ENC_CONTEXT_FROM,
		.args = { src_fd } // No idea if this is correct syntax
	};

	return ioctl(dst_fd, KVM_ENABLE_CAP, &cap);
}


static void sev_migrate_from(...)
{
	ret = __sev_migrate_from(...);
	TEST_ASSERT(!ret, "Migration failed, blah blah blah");
}

> +	}
> +}
> +
> +static void test_sev_migrate_locking(void)
> +{
> +	struct locking_thread_input input[LOCK_TESTING_THREADS];
> +	pthread_t pt[LOCK_TESTING_THREADS];
> +	int i;
> +
> +	for (i = 0; i < LOCK_TESTING_THREADS; ++i) {

With a bit of refactoring, the same VMs from the happy case can be reused for
the locking test, and we can also get concurrent SEV+SEV-ES migration (see below).

> +		input[i].vm = sev_vm_create(/* es= */ false);
> +		input[0].source_fds[i] = input[i].vm->fd;
> +	}
> +	for (i = 1; i < LOCK_TESTING_THREADS; ++i)
> +		memcpy(input[i].source_fds, input[0].source_fds,
> +		       sizeof(input[i].source_fds));
> +
> +	for (i = 0; i < LOCK_TESTING_THREADS; ++i)
> +		pthread_create(&pt[i], NULL, locking_test_thread, &input[i]);
> +
> +	for (i = 0; i < LOCK_TESTING_THREADS; ++i)
> +		pthread_join(pt[i], NULL);
> +}
> +
> +int main(int argc, char *argv[])
> +{
> +	test_sev_migrate_from(/* es= */ false);
> +	test_sev_migrate_from(/* es= */ true);
> +	test_sev_migrate_locking();


With a little refactoring, this can add other tests, e.g. illegal dst.  Assuming
KVM requires the dst to be !SEV, SEV and SEV-ES can use the same set of destination
VMs.  And the locking test can take 'em all.  E.g. something like:

	struct kvm_vm *sev_vm, *sev_es_vm;

	sev_vm = sev_vm_create(false);
	sev_es_vm = sev_vm_create(true);

	for (i = 0; i < NR_MIGRATE_TEST_VMS; i++)
		dst_vms[i] = sev_dst_vm_create();

	test_sev_migrate_from(sev_vms, dst_vms);
	test_sev_migrate_from(sev_es_vms, dst_vms);

	ret = __sev_migrate_from(sev_es_vms[0], sev_vms[0]);
	TEST_ASSERT(ret == -EINVAL, ...);

	ret = __sev_migrate_from(sev_vms[0], sev_es_vms[0]);
	TEST_ASSERT(ret == -EINVAL, ...);
	
	ret = __sev_migrate_from(dst_vms[0], dst_vms[1]);
	TEST_ASSERT(ret == -EINVAL, ....);

	test_sev_migrate_locking(sev_vm, sev_es_vm, dst_vms);

> +	return 0;
> +}
> -- 
> 2.33.0.153.gba50c8fa24-goog
> 

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/3 V7] KVM, SEV: Add support for SEV intra host migration
  2021-09-10  0:11   ` Sean Christopherson
  2021-09-10  1:12     ` Sean Christopherson
  2021-09-10  1:15     ` Marc Orr
@ 2021-09-10 21:54     ` Peter Gonda
  2021-09-10 22:03       ` Sean Christopherson
  2 siblings, 1 reply; 19+ messages in thread
From: Peter Gonda @ 2021-09-10 21:54 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm list, Marc Orr, Paolo Bonzini, David Rientjes,
	Dr . David Alan Gilbert, Brijesh Singh, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson, Joerg Roedel, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, H. Peter Anvin, linux-kernel

On Thu, Sep 9, 2021 at 6:11 PM Sean Christopherson <seanjc@google.com> wrote:
>
> Nit, preferred shortlog scope is "KVM: SEV:"
>
> On Thu, Sep 02, 2021, Peter Gonda wrote:
> > For SEV to work with intra host migration, contents of the SEV info struct
> > such as the ASID (used to index the encryption key in the AMD SP) and
> > the list of memory regions need to be transferred to the target VM.
> > This change adds a command for a target VMM to get a source SEV VM's sev
> > info.
> >
> > The target is expected to be initialized (sev_guest_init), but not in a
> > launched state (sev_launch_start) when performing receive. Once the
> > target has received, it will be in a launched state and will not
> > need to perform the typical SEV launch commands.
> >
> > Signed-off-by: Peter Gonda <pgonda@google.com>
> > Suggested-by: Sean Christopherson <seanjc@google.com>
> > Reviewed-by: Marc Orr <marcorr@google.com>
> > Cc: Marc Orr <marcorr@google.com>
> > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > Cc: Sean Christopherson <seanjc@google.com>
> > Cc: David Rientjes <rientjes@google.com>
> > Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > Cc: Brijesh Singh <brijesh.singh@amd.com>
> > Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
> > Cc: Wanpeng Li <wanpengli@tencent.com>
> > Cc: Jim Mattson <jmattson@google.com>
> > Cc: Joerg Roedel <joro@8bytes.org>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Ingo Molnar <mingo@redhat.com>
> > Cc: Borislav Petkov <bp@alien8.de>
> > Cc: "H. Peter Anvin" <hpa@zytor.com>
> > Cc: kvm@vger.kernel.org
> > Cc: linux-kernel@vger.kernel.org
> > ---
> >  Documentation/virt/kvm/api.rst  |  15 +++++
> >  arch/x86/include/asm/kvm_host.h |   1 +
> >  arch/x86/kvm/svm/sev.c          | 101 ++++++++++++++++++++++++++++++++
> >  arch/x86/kvm/svm/svm.c          |   1 +
> >  arch/x86/kvm/svm/svm.h          |   2 +
> >  arch/x86/kvm/x86.c              |   5 ++
> >  include/uapi/linux/kvm.h        |   1 +
> >  7 files changed, 126 insertions(+)
> >
> > diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> > index 4ea1bb28297b..e8cecc024649 100644
> > --- a/Documentation/virt/kvm/api.rst
> > +++ b/Documentation/virt/kvm/api.rst
> > @@ -6702,6 +6702,21 @@ MAP_SHARED mmap will result in an -EINVAL return.
> >  When enabled the VMM may make use of the ``KVM_ARM_MTE_COPY_TAGS`` ioctl to
> >  perform a bulk copy of tags to/from the guest.
> >
> > +7.29 KVM_CAP_VM_MIGRATE_ENC_CONTEXT_FROM
> > +-------------------------------------
>
> Do we really want to bury this under KVM_CAP?  Even KVM_CAP_VM_COPY_ENC_CONTEXT_FROM
> is a bit of a stretch, but at least that's a one-way "enabling", whereas this
> migration routine should be able to handle multiple migrations, e.g. migrate A->B
> and B->A.  Peeking at your selftest, it should be fairly easy to add in this edge
> case.
>
> This is probably a Paolo question, I've no idea if there's a desire to expand
> KVM_CAP versus adding a new ioctl().

Thanks for the review Sean. I put this under KVM_CAP as you suggested
following the idea of svm_vm_copy_asid_from. Paolo or anyone else have
thoughts here? It doesn't really matter to me.

>
> > +Architectures: x86 SEV enabled
> > +Type: vm
> > +Parameters: args[0] is the fd of the source vm
> > +Returns: 0 on success
>
> It'd be helpful to provide a brief description of the error cases.  Looks like
> -EINVAL is the only possible error?
>
> > +This capability enables userspace to migrate the encryption context
>
> I would prefer to scope this beyond "encryption context".  Even for SEV, it
> copies more than just the "context", which was an abstraction of SEV's ASID,
> e.g. this also hands off the set of encrypted memory regions.  Looking toward
> the future, if TDX wants to support this it's going to need to hand over a ton
> of stuff, e.g. S-EPT tables.
>
> Not sure on a name, maybe MIGRATE_PROTECTED_VM_FROM?

Protected VM sounds reasonable. I was using 'context' here to mean all
metadata related to a CoCo VM as with the
KVM_CAP_VM_COPY_ENC_CONTEXT_FROM. Is it worth diverging naming here?

>
> > from the vm
>
> Capitalize VM in the description, if only to be consistent within these two
> paragraphs.  If it helps, pretend all the terrible examples in this file don't
> exist ;-)
>
> > +indicated by the fd to the vm this is called on.
> > +
> > +This is intended to support intra-host migration of VMs between userspace VMMs.
> > +in-guest workloads scheduled by the host. This allows for upgrading the VMMg
>
> This snippet (and the lowercase "vm") looks like it was left behind after a
> copy-paste from KVM_CAP_VM_COPY_ENC_CONTEXT_FROM.
>
> > +process without interrupting the guest.
> > +
> >  8. Other capabilities.
> >  ======================
> >
> > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > index 09b256db394a..f06d87a85654 100644
> > --- a/arch/x86/include/asm/kvm_host.h
> > +++ b/arch/x86/include/asm/kvm_host.h
> > @@ -1456,6 +1456,7 @@ struct kvm_x86_ops {
> >       int (*mem_enc_reg_region)(struct kvm *kvm, struct kvm_enc_region *argp);
> >       int (*mem_enc_unreg_region)(struct kvm *kvm, struct kvm_enc_region *argp);
> >       int (*vm_copy_enc_context_from)(struct kvm *kvm, unsigned int source_fd);
> > +     int (*vm_migrate_enc_context_from)(struct kvm *kvm, unsigned int source_fd);
> >
> >       int (*get_msr_feature)(struct kvm_msr_entry *entry);
> >
> > diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> > index 46eb1ba62d3d..8db666a362d4 100644
> > --- a/arch/x86/kvm/svm/sev.c
> > +++ b/arch/x86/kvm/svm/sev.c
> > @@ -1501,6 +1501,107 @@ static int sev_receive_finish(struct kvm *kvm, struct kvm_sev_cmd *argp)
> >       return sev_issue_cmd(kvm, SEV_CMD_RECEIVE_FINISH, &data, &argp->error);
> >  }
> >
> > +static int svm_sev_lock_for_migration(struct kvm *kvm)
> > +{
> > +     struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
> > +
> > +     /*
> > +      * Bail if this VM is already involved in a migration to avoid deadlock
> > +      * between two VMs trying to migrate to/from each other.
> > +      */
> > +     if (atomic_cmpxchg_acquire(&sev->migration_in_progress, 0, 1))
> > +             return -EBUSY;
> > +
> > +     mutex_lock(&kvm->lock);
> > +
> > +     return 0;
> > +}
> > +
> > +static void svm_unlock_after_migration(struct kvm *kvm)
> > +{
> > +     struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
> > +
> > +     mutex_unlock(&kvm->lock);
> > +     atomic_set_release(&sev->migration_in_progress, 0);
> > +}
> > +
> > +static void migrate_info_from(struct kvm_sev_info *dst,
> > +                           struct kvm_sev_info *src)
> > +{
> > +     sev_asid_free(dst);
>
> Ooh, this brings up a potential shortcoming of requiring @dst to be SEV-enabled.
> If every SEV{-ES} ASID is allocated, then there won't be an available ASID to
> (temporarily) allocate for the intra-host migration.  But that temporary ASID
> isn't actually necessary, i.e. there's no reason intra-host migration should fail
> if all ASIDs are in-use.
>
> I don't see any harm in requiring the @dst to _not_ be SEV-enabled.  sev_info
> is not dynamically allocated, i.e. migration_in_progress is accessible either
> way.  That would also simplify some of the checks, e.g. the regions_list check
> goes away because svm_register_enc_region() fails on non-SEV guests.
>
> I believe this will also fix multiple bugs in the next patch (SEV-ES support).
>
> Bug #1, SEV-ES support changes the checks to:
>
>         if (!sev_guest(kvm)) {
>                 ret = -EINVAL;
>                 pr_warn_ratelimited("VM must be SEV enabled to migrate to.\n");
>                 goto out_unlock;
>         }
>
>         ...
>
>         if (!sev_guest(source_kvm)) {
>                 ret = -EINVAL;
>                 pr_warn_ratelimited(
>                         "Source VM must be SEV enabled to migrate from.\n");
>                 goto out_source;
>         }
>
>         if (sev_es_guest(kvm)) {
>                 ret = migrate_vmsa_from(kvm, source_kvm);
>                 if (ret)
>                         goto out_source;
>         }
>
> and fails to handle the scenario where dst.SEV_ES != src.SEV_ES.  If @dst is
> SEV_ES-enabled and @src has created vCPUs, migrate_vmsa_from() will still fail
> due to guest_state_protected being false, but the reverse won't hold true and
> KVM will "successfully" migrate an SEV-ES guest to an SEV guest.  I'm guessing
> fireworks will ensue, e.g. due to running with the wrong ASID.
>
> Bug #2, migrate_vmsa_from() leaks dst->vmsa, as this
>
>                 dst_svm->vmsa = src_svm->vmsa;
>                 src_svm->vmsa = NULL;
>
> overwrites dst_svm->vmsa that was allocated by svm_create_vcpu().
>
> AFAICT, there isn't anything that will break by forcing @dst to be !SEV (except
> stuff that's already broken, see below).  For SEV{-ES} specific stuff, anything
> that is allocated/set at vCPU creation likely needs to be migrated, e.g. VMSA and
> the GHCB MSR value.  The only missing action is kvm_free_guest_fpu().
>
> Side topic, the VMSA really should be allocated in sev_es_create_vcpu(), and
> guest_fpu should never be allocated for SEV-ES guests (though that doesn't change
> the need for kvm_free_guest_fpu() in this case).  I'll send patches for that.

I believe there is no need to require dst to be SEV or SEV-ES enabled.
This logic was just carried over from our internal implementation
which is more similar to the 2 ioctl version in V1-3.
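
I.e. with the "dst must not be SEV-enabled" requirement, the dst-side check
collapses to something like (sketch only):

	/* Destination must not already be SEV-enabled. */
	if (sev_guest(kvm)) {
		ret = -EINVAL;
		goto out_unlock;
	}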

>
> > +     dst->asid = src->asid;
> > +     dst->misc_cg = src->misc_cg;
> > +     dst->handle = src->handle;
> > +     dst->pages_locked = src->pages_locked;
> > +
> > +     src->asid = 0;
> > +     src->active = false;
> > +     src->handle = 0;
> > +     src->pages_locked = 0;
> > +     src->misc_cg = NULL;
> > +
> > +     INIT_LIST_HEAD(&dst->regions_list);
> > +     list_replace_init(&src->regions_list, &dst->regions_list);
> > +}
> > +
> > +int svm_vm_migrate_from(struct kvm *kvm, unsigned int source_fd)
> > +{
> > +     struct kvm_sev_info *dst_sev = &to_kvm_svm(kvm)->sev_info;
> > +     struct file *source_kvm_file;
> > +     struct kvm *source_kvm;
> > +     int ret;
> > +
> > +     ret = svm_sev_lock_for_migration(kvm);
> > +     if (ret)
> > +             return ret;
> > +
> > +     if (!sev_guest(kvm) || sev_es_guest(kvm)) {
> > +             ret = -EINVAL;
> > +             pr_warn_ratelimited("VM must be SEV enabled to migrate to.\n");
>
> Linux generally doesn't log user errors to dmesg.  They can be helpful during
> development, but aren't actionable and thus are of limited use in production.
>

As noted, I added these for marcorr's feedback. I'll remove all of this.

> > +             goto out_unlock;
> > +     }
>
> Hmm, I was going to say that migration should be rejected if @dst has created
> vCPUs, but the SEV-ES support migrates VMSA state and so must run after vCPUs
> are created.  Holding kvm->lock does not prevent invoking per-vCPU ioctls(),
> including KVM_RUN.  Modifying vCPU SEV{-ES} state while a vCPU is actively running
> is bound to cause explosions.
>
> One option for this patch would be to check kvm->created_vcpus and then add
> different logic for SEV-ES, but that's probably not desirable for userspace as
> it will mean triggering intra-host migration at different points for SEV vs. SEV-ES.
>
> So I think the only option is to take vcpu->mutex for all vCPUs in both @src and
> @dst.  Adding that after acquiring kvm->lock in svm_sev_lock_for_migration()
> should Just Work.  Unless userspace is misbehaving, the lock won't be contended
> since all vCPUs need to be quiesced, though it's probably worth using the
> mutex_lock_killable() variant just to be safe.

Ack, will do.
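
Something along these lines, I assume (sketch only; helper name and error
handling are illustrative):

static int sev_lock_vcpus_for_migration(struct kvm *kvm)
{
	struct kvm_vcpu *vcpu;
	int i;

	kvm_for_each_vcpu(i, vcpu, kvm) {
		if (mutex_lock_killable(&vcpu->mutex))
			goto out_unlock;
	}
	return 0;

out_unlock:
	/* Unwind the vCPUs locked so far. */
	while (--i >= 0)
		mutex_unlock(&kvm_get_vcpu(kvm, i)->mutex);
	return -EINTR;
}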

>
> > +     if (!list_empty(&dst_sev->regions_list)) {
> > +             ret = -EINVAL;
> > +             pr_warn_ratelimited(
> > +                     "VM must not have encrypted regions to migrate to.\n");
> > +             goto out_unlock;
> > +     }
> > +
> > +     source_kvm_file = fget(source_fd);
> > +     if (!file_is_kvm(source_kvm_file)) {
> > +             ret = -EBADF;
> > +             pr_warn_ratelimited(
> > +                             "Source VM must be SEV enabled to migrate from.\n");
>
> Case in point for not logging errors, this is arguably inaccurate as the source
> "VM" isn't a VM.
>
> > +             goto out_fput;
> > +     }
> > +
> > +     source_kvm = source_kvm_file->private_data;
> > +     ret = svm_sev_lock_for_migration(source_kvm);
> > +     if (ret)
> > +             goto out_fput;
> > +
> > +     if (!sev_guest(source_kvm) || sev_es_guest(source_kvm)) {
> > +             ret = -EINVAL;
> > +             pr_warn_ratelimited(
> > +                     "Source VM must be SEV enabled to migrate from.\n");
> > +             goto out_source;
> > +     }
> > +
> > +     migrate_info_from(dst_sev, &to_kvm_svm(source_kvm)->sev_info);
> > +     ret = 0;
> > +
> > +out_source:
> > +     svm_unlock_after_migration(source_kvm);
> > +out_fput:
> > +     if (source_kvm_file)
> > +             fput(source_kvm_file);
> > +out_unlock:
> > +     svm_unlock_after_migration(kvm);
> > +     return ret;
> > +}

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/3 V7] KVM, SEV: Add support for SEV intra host migration
  2021-09-10 21:54     ` Peter Gonda
@ 2021-09-10 22:03       ` Sean Christopherson
  2021-09-10 22:07         ` Peter Gonda
  0 siblings, 1 reply; 19+ messages in thread
From: Sean Christopherson @ 2021-09-10 22:03 UTC (permalink / raw)
  To: Peter Gonda
  Cc: kvm list, Marc Orr, Paolo Bonzini, David Rientjes,
	Dr . David Alan Gilbert, Brijesh Singh, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson, Joerg Roedel, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, H. Peter Anvin, linux-kernel

On Fri, Sep 10, 2021, Peter Gonda wrote:
> > Do we really want to bury this under KVM_CAP?  Even KVM_CAP_VM_COPY_ENC_CONTEXT_FROM
> > is a bit of a stretch, but at least that's a one-way "enabling", whereas this
> > migration routine should be able to handle multiple migrations, e.g. migrate A->B
> > and B->A.  Peeking at your selftest, it should be fairly easy to add in this edge
> > case.
> >
> > This is probably a Paolo question, I've no idea if there's a desire to expand
> > KVM_CAP versus adding a new ioctl().
> 
> Thanks for the review Sean. I put this under KVM_CAP as you suggested
> following the idea of svm_vm_copy_asid_from. Paolo or anyone else have
> thoughts here? It doesn't really matter to me.

Ah, sorry :-/  I obviously don't have a strong preference either.

> > > +Architectures: x86 SEV enabled
> > > +Type: vm
> > > +Parameters: args[0] is the fd of the source vm
> > > +Returns: 0 on success
> >
> > It'd be helpful to provide a brief description of the error cases.  Looks like
> > -EINVAL is the only possible error?
> >
> > > +This capability enables userspace to migrate the encryption context
> >
> > I would prefer to scope this beyond "encryption context".  Even for SEV, it
> > copies more than just the "context", which was an abstraction of SEV's ASID,
> > e.g. this also hands off the set of encrypted memory regions.  Looking toward
> > the future, if TDX wants to support this it's going to need to hand over a ton
> > of stuff, e.g. S-EPT tables.
> >
> > Not sure on a name, maybe MIGRATE_PROTECTED_VM_FROM?
> 
> Protected VM sounds reasonable. I was using 'context' here to mean all
> metadata related to a CoCo VM as with the
> KVM_CAP_VM_COPY_ENC_CONTEXT_FROM. Is it worth diverging naming here?

Yes, as they are two similar but slightly different things, IMO we want to diverge
so that it's obvious they operate on different data.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/3 V7] KVM, SEV: Add support for SEV intra host migration
  2021-09-10 22:03       ` Sean Christopherson
@ 2021-09-10 22:07         ` Peter Gonda
  0 siblings, 0 replies; 19+ messages in thread
From: Peter Gonda @ 2021-09-10 22:07 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm list, Marc Orr, Paolo Bonzini, David Rientjes,
	Dr . David Alan Gilbert, Brijesh Singh, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson, Joerg Roedel, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, H. Peter Anvin, linux-kernel

On Fri, Sep 10, 2021 at 4:03 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Fri, Sep 10, 2021, Peter Gonda wrote:
> > > Do we really want to bury this under KVM_CAP?  Even KVM_CAP_VM_COPY_ENC_CONTEXT_FROM
> > > is a bit of a stretch, but at least that's a one-way "enabling", whereas this
> > > migration routine should be able to handle multiple migrations, e.g. migrate A->B
> > > and B->A.  Peeking at your selftest, it should be fairly easy to add in this edge
> > > case.
> > >
> > > This is probably a Paolo question, I've no idea if there's a desire to expand
> > > KVM_CAP versus adding a new ioctl().
> >
> > Thanks for the review Sean. I put this under KVM_CAP as you suggested
> > following the idea of svm_vm_copy_asid_from. Paolo or anyone else have
> > thoughts here? It doesn't really matter to me.
>
> Ah, sorry :-/  I obviously don't have a strong preference either.

I am going to suggest leaving it under KVM_CAP for this reason: I
don't see a great use case for A->B then B->A migrations. And if we
are going to require that the dst not be SEV or SEV-ES enabled, which
I think makes sense, then a VM can only ever have been migrated from
one other VM, since once it has it will be SEV/SEV-ES enabled. Does
that seem reasonable?

>
> > > > +Architectures: x86 SEV enabled
> > > > +Type: vm
> > > > +Parameters: args[0] is the fd of the source vm
> > > > +Returns: 0 on success
> > >
> > > It'd be helpful to provide a brief description of the error cases.  Looks like
> > > -EINVAL is the only possible error?
> > >
> > > > +This capability enables userspace to migrate the encryption context
> > >
> > > I would prefer to scope this beyond "encryption context".  Even for SEV, it
> > > copies more than just the "context", which was an abstraction of SEV's ASID,
> > > e.g. this also hands off the set of encrypted memory regions.  Looking toward
> > > the future, if TDX wants to support this it's going to need to hand over a ton
> > > of stuff, e.g. S-EPT tables.
> > >
> > > Not sure on a name, maybe MIGRATE_PROTECTED_VM_FROM?
> >
> > Protected VM sounds reasonable. I was using 'context' here to mean all
> > metadata related to a CoCo VM as with the
> > KVM_CAP_VM_COPY_ENC_CONTEXT_FROM. Is it worth diverging naming here?
>
> Yes, as they are two similar but slightly different things, IMO we want to diverge
> so that it's obvious they operate on different data.

Sounds good I'll rename.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/3 V7] selftest: KVM: Add intra host migration tests
  2021-09-10 17:16   ` Sean Christopherson
@ 2021-09-10 22:14     ` Peter Gonda
  0 siblings, 0 replies; 19+ messages in thread
From: Peter Gonda @ 2021-09-10 22:14 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm list, Marc Orr, David Rientjes, Brijesh Singh, linux-kernel

On Fri, Sep 10, 2021 at 11:16 AM Sean Christopherson <seanjc@google.com> wrote:
>
> On Thu, Sep 02, 2021, Peter Gonda wrote:
> > +/*
> > + * Open SEV_DEV_PATH if available, otherwise exit the entire program.
> > + *
> > + * Input Args:
> > + *   flags - The flags to pass when opening SEV_DEV_PATH.
> > + *
> > + * Return:
> > + *   The opened file descriptor of /dev/sev.
> > + */
> > +static int open_sev_dev_path_or_exit(int flags)
> > +{
> > +     static int fd;
> > +
> > +     if (fd != 0)
> > +             return fd;
>
> Caching the file descriptor here is unnecessary; it's used in exactly one function.
>
> > +     fd = open(SEV_DEV_PATH, flags);
> > +     if (fd < 0) {
> > +             print_skip("%s not available, is SEV not enabled? (errno: %d)",
> > +                        SEV_DEV_PATH, errno);
> > +             exit(KSFT_SKIP);
> > +     }
> > +
> > +     return fd;
> > +}
>
> Rather than copy-paste _open_kvm_dev_path_or_exit(), it's probably worth factoring
> out a helper in a separate patch, e.g.

So the suggestion would be to move open_sev_dev_path_or_exit into
tools/testing/selftests/kvm/include/x86_64/svm_util.h

If so, wouldn't it make sense to keep the caching of the FD?
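
I.e. keep the static fd, but build it on top of the proposed helper
(sketch only):

static int open_sev_dev_path_or_exit(int flags)
{
	static int fd;

	if (!fd)
		fd = open_path_or_exit(SEV_DEV_PATH, flags);

	return fd;
}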

>
> diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
> index 10a8ed691c66..06a6c04010fb 100644
> --- a/tools/testing/selftests/kvm/lib/kvm_util.c
> +++ b/tools/testing/selftests/kvm/lib/kvm_util.c
> @@ -31,6 +31,19 @@ static void *align(void *x, size_t size)
>         return (void *) (((size_t) x + mask) & ~mask);
>  }
>
> +int open_path_or_exit(const char *path, int flags)
> +{
> +       int fd;
> +
> +       fd = open(path, flags);
> +       if (fd < 0) {
> +               print_skip("%s not available (errno: %d)", path, errno);
> +               exit(KSFT_SKIP);
> +       }
> +
> +       return fd;
> +}
> +
>  /*
>   * Open KVM_DEV_PATH if available, otherwise exit the entire program.
>   *
> @@ -42,16 +55,7 @@ static void *align(void *x, size_t size)
>   */
>  static int _open_kvm_dev_path_or_exit(int flags)
>  {
> -       int fd;
> -
> -       fd = open(KVM_DEV_PATH, flags);
> -       if (fd < 0) {
> -               print_skip("%s not available, is KVM loaded? (errno: %d)",
> -                          KVM_DEV_PATH, errno);
> -               exit(KSFT_SKIP);
> -       }
> -
> -       return fd;
> +       return open_path_or_exit(KVM_DEV_PATH, flags);
>  }
>
>  int open_kvm_dev_path_or_exit(void)
>
>
> > +
> > +static void sev_ioctl(int vm_fd, int cmd_id, void *data)
> > +{
> > +     struct kvm_sev_cmd cmd = {
> > +             .id = cmd_id,
> > +             .data = (uint64_t)data,
> > +             .sev_fd = open_sev_dev_path_or_exit(0),
> > +     };
> > +     int ret;
> > +
> > +     TEST_ASSERT(cmd_id < KVM_SEV_NR_MAX && cmd_id >= 0,
> > +                 "Unknown SEV CMD : %d\n", cmd_id);
>
> LOL, I like sanity checks, but asserting that the test itself isn't horrendously
> broken is a bit much.  And if someone manages to screw up that badly, the ioctl()
> below will fail.

Ack. I'll remove this.

>
> > +     ret = ioctl(vm_fd, KVM_MEMORY_ENCRYPT_OP, &cmd);
> > +     TEST_ASSERT((ret == 0 || cmd.error == SEV_RET_SUCCESS),
> > +                 "%d failed: return code: %d, errno: %d, fw error: %d",
> > +                 cmd_id, ret, errno, cmd.error);
> > +}
> > +
> > +static struct kvm_vm *sev_vm_create(bool es)
> > +{
> > +     struct kvm_vm *vm;
> > +     struct kvm_sev_launch_start start = { 0 };
> > +     int i;
>
> Rather than cache /dev/sev in a helper, you can do:
>
>         int sev_fd = open_path_or_exit(SEV_DEV_PATH, 0);
>
>         sev_ioctl(vm, sev_fd, ...);
>
> > +     vm = vm_create(VM_MODE_DEFAULT, 0, O_RDWR);
> > +     sev_ioctl(vm->fd, es ? KVM_SEV_ES_INIT : KVM_SEV_INIT, NULL);
> > +     for (i = 0; i < MIGRATE_TEST_NUM_VCPUS; ++i)
> > +             vm_vcpu_add(vm, i);
> > +     start.policy |= (es) << 2;
>
> I had to go spelunking to confirm this is the "ES" policy, please do:
>
>         if (es)
>                 start.policy |= SEV_POLICY_ES;
>
> > +     sev_ioctl(vm->fd, KVM_SEV_LAUNCH_START, &start);
> > +     if (es)
> > +             sev_ioctl(vm->fd, KVM_SEV_LAUNCH_UPDATE_VMSA, NULL);
>
>
> And with sev_fd scoped to this function:
>
>         close(sev_fd);
>
> which I think is legal?
>
> > +     return vm;
> > +}
> > +
> > +static void test_sev_migrate_from(bool es)
> > +{
> > +     struct kvm_vm *vms[MIGRATE_TEST_VMS];
>
> Prefix this and LOCK_TESTING_THREADS with NR_ so that it's clear these are arbitrary
> numbers of things.  And I guess s/MIGRATE_TEST_NUM_VCPUS/NR_MIGRATE_TEST_VCPUS to
> be consistent.
>
> > +     struct kvm_enable_cap cap = {
> > +             .cap = KVM_CAP_VM_MIGRATE_ENC_CONTEXT_FROM
> > +     };
> > +     int i;
> > +
> > +     for (i = 0; i < MIGRATE_TEST_VMS; ++i) {
> > +             vms[i] = sev_vm_create(es);
>
> It doesn't really matter, but closing these fds tests that KVM doesn't explode
> when VMs are destroyed without the process exiting.
>

Can do. I spot-checked a couple of other tests and didn't see any close()
calls, so I didn't clutter the test here.

> > +             if (i > 0) {
> > +                     cap.args[0] = vms[i - 1]->fd;
> > +                     vm_enable_cap(vms[i], &cap);
> > +             }
> > +     }
>
> For giggles, we can also test migrating back (with some feedback from below
> mixed in):
>
>         /* Initial migration from the src to the first dst. */
>         sev_migrate_from(dst_vms[0]->fd, src_vm->fd);
>
>         for (i = 1; i < NR_MIGRATE_TEST_VMS; i++)
>                 sev_migrate_from(vms[i]->fd, vms[i - 1]->fd);
>
>         /* Migrate the guest back to the original VM. */
>         sev_migrate_from(src_vm->fd, dst_vms[NR_MIGRATE_TEST_VMS - 1]->fd);
>
> > +}
> > +
> > +struct locking_thread_input {
> > +     struct kvm_vm *vm;
> > +     int source_fds[LOCK_TESTING_THREADS];
> > +};
> > +
> > +static void *locking_test_thread(void *arg)
> > +{
> > +     /*
> > +      * This test case runs a number of threads all trying to use the intra
> > +      * host migration ioctls. This tries to detect if a deadlock exists.
> > +      */
> > +     struct kvm_enable_cap cap = {
> > +             .cap = KVM_CAP_VM_MIGRATE_ENC_CONTEXT_FROM
> > +     };
> > +     int i, j;
> > +     struct locking_thread_input *input = (struct locking_test_thread *)arg;
> > +
> > +     for (i = 0; i < LOCK_TESTING_ITERATIONS; ++i) {
> > +             j = input->source_fds[i % LOCK_TESTING_THREADS];
> > +             cap.args[0] = input->source_fds[j];
>
> This looks wrong, it's indexing source_fds with a value from source_fds.  Did
> you intend?
>
>                 j = i % LOCK_TESTING_THREADS;
>                 cap.args[0] = input->source_fds[j];
>

Yup, that's wrong, I'll update.

> > +             /*
> > +              * Call IOCTL directly without checking return code or
> > +              * asserting. We are * simply trying to confirm there is no
> > +              * deadlock from userspace * not check correctness of
> > +              * migration here.
> > +              */
> > +             ioctl(input->vm->fd, KVM_ENABLE_CAP, &cap);
>
> For readability and future extensibility, I'd say create a single helper and use
> it even in the happy case, e.g.
>
> static int __sev_migrate_from(int dst_fd, int src_fd)
> {
>         struct kvm_enable_cap cap = {
>                 .cap = KVM_CAP_VM_MIGRATE_ENC_CONTEXT_FROM,
>                 .args = { src_fd } // No idea if this is correct syntax
>         };
>
>         return ioctl(dst_fd, KVM_ENABLE_CAP, &cap);
> }
>
>
> static void sev_migrate_from(...)
> {
>         ret = __sev_migrate_from(...);
>         TEST_ASSERT(!ret, "Migration failed, blah blah blah");
> }
>
> > +     }
> > +}
> > +
> > +static void test_sev_migrate_locking(void)
> > +{
> > +     struct locking_thread_input input[LOCK_TESTING_THREADS];
> > +     pthread_t pt[LOCK_TESTING_THREADS];
> > +     int i;
> > +
> > +     for (i = 0; i < LOCK_TESTING_THREADS; ++i) {
>
> With a bit of refactoring, the same VMs from the happy case can be reused for
> the locking test, and we can also get concurrent SEV+SEV-ES migration (see below).
>
> > +             input[i].vm = sev_vm_create(/* es= */ false);
> > +             input[0].source_fds[i] = input[i].vm->fd;
> > +     }
> > +     for (i = 1; i < LOCK_TESTING_THREADS; ++i)
> > +             memcpy(input[i].source_fds, input[0].source_fds,
> > +                    sizeof(input[i].source_fds));
> > +
> > +     for (i = 0; i < LOCK_TESTING_THREADS; ++i)
> > +             pthread_create(&pt[i], NULL, locking_test_thread, &input[i]);
> > +
> > +     for (i = 0; i < LOCK_TESTING_THREADS; ++i)
> > +             pthread_join(pt[i], NULL);
> > +}
> > +
> > +int main(int argc, char *argv[])
> > +{
> > +     test_sev_migrate_from(/* es= */ false);
> > +     test_sev_migrate_from(/* es= */ true);
> > +     test_sev_migrate_locking();
>
>
> With a little refactoring, this can add other tests, e.g. illegal dst.  Assuming
> KVM requires the dst to be !SEV, SEV and SEV-ES can use the same set of destination
> VMs.  And the locking test can take 'em all.  E.g. something like:
>
>         struct kvm_vm *sev_vm, *sev_es_vm;
>
>         sev_vm = sev_vm_create(false);
>         sev_es_vm = sev_vm_create(true);
>
>         for (i = 0; i < NR_MIGRATE_TEST_VMS; i++)
>                 dst_vms[i] = sev_dst_vm_create();
>
>         test_sev_migrate_from(sev_vms, dst_vms);
>         test_sev_migrate_from(sev_es_vms, dst_vms);
>
>         ret = __sev_migrate_from(sev_es_vms[0], sev_vms[0]);
>         TEST_ASSERT(ret == -EINVAL, ...);
>
>         ret = __sev_migrate_from(sev_vms[0], sev_es_vms[0]);
>         TEST_ASSERT(ret == -EINVAL, ...);
>
>         ret = __sev_migrate_from(dst_vms[0], dst_vms[1]);
>         TEST_ASSERT(ret == -EINVAL, ....);
>
>         test_sev_migrate_locking(sev_vm, sev_es_vm, dst_vms);
>

Ack. I'll add these parameter validation tests.

> > +     return 0;
> > +}
> > --
> > 2.33.0.153.gba50c8fa24-goog
> >

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/3 V7] KVM, SEV: Add support for SEV intra host migration
  2021-09-10  1:12     ` Sean Christopherson
@ 2021-09-13 16:21       ` Peter Gonda
  0 siblings, 0 replies; 19+ messages in thread
From: Peter Gonda @ 2021-09-13 16:21 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm list, Marc Orr, Paolo Bonzini, David Rientjes,
	Dr . David Alan Gilbert, Brijesh Singh, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson, Joerg Roedel, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, H. Peter Anvin, linux-kernel

On Thu, Sep 9, 2021 at 7:12 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Fri, Sep 10, 2021, Sean Christopherson wrote:
> > Ooh, this brings up a potential shortcoming of requiring @dst to be SEV-enabled.
> > If every SEV{-ES} ASID is allocated, then there won't be an available ASID to
> > (temporarily) allocate for the intra-host migration.  But that temporary ASID
> > isn't actually necessary, i.e. there's no reason intra-host migration should fail
> > if all ASIDs are in-use.

Ack, forcing dst to be SEV-disabled will mitigate this problem.

>
> ...
>
> > So I think the only option is to take vcpu->mutex for all vCPUs in both @src and
> > @dst.  Adding that after acquiring kvm->lock in svm_sev_lock_for_migration()
> > should Just Work.  Unless userspace is misbehaving, the lock won't be contended
> > since all vCPUs need to be quiesced, though it's probably worth using the
> > mutex_lock_killable() variant just to be safe.
>
> Circling back to this after looking at the SEV-ES support, I think the vCPUs in
> the source VM need to be reset via kvm_vcpu_reset(vcpu, false).  I doubt there's
> a use case for actually doing anything with the vCPU, but leaving it runnable
> without purging state makes me nervous.
>
> Alternative #1 would be to mark vCPUs as dead in some way so as to prevent doing
> anything useful with the vCPU.
>
> Alternative #2 would be to "kill" the source VM by setting kvm->vm_bugged to
> prevent all ioctls().
>
> The downside to preventing future ioctls() is that this would need to be the
> very last step of migration.  Not sure if that's problematic?

I'll add calls to kvm_vcpu_reset. Alternative #2 using vm_bugged won't
work for us because we need to keep using the source VM even after the
state is transferred.
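
I.e. roughly this at the tail of svm_vm_migrate_from(), once the state has
been moved (sketch only; vcpu locking requirements elided):

	struct kvm_vcpu *vcpu;
	int i;

	kvm_for_each_vcpu(i, vcpu, source_kvm)
		kvm_vcpu_reset(vcpu, false);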

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2021-09-13 16:21 UTC | newest]

Thread overview: 19+ messages
2021-09-02 18:17 [PATCH 0/2 V7] Add AMD SEV and SEV-ES intra host migration support Peter Gonda
2021-09-02 18:17 ` [PATCH 1/3 V7] KVM, SEV: Add support for SEV intra host migration Peter Gonda
2021-09-10  0:11   ` Sean Christopherson
2021-09-10  1:12     ` Sean Christopherson
2021-09-13 16:21       ` Peter Gonda
2021-09-10  1:15     ` Marc Orr
2021-09-10  1:40       ` Sean Christopherson
2021-09-10  3:41         ` Marc Orr
2021-09-10 21:54     ` Peter Gonda
2021-09-10 22:03       ` Sean Christopherson
2021-09-10 22:07         ` Peter Gonda
2021-09-02 18:17 ` [PATCH 2/3 V7] KVM, SEV: Add support for SEV-ES " Peter Gonda
2021-09-10  0:50   ` Sean Christopherson
2021-09-10  1:20     ` Sean Christopherson
2021-09-02 18:17 ` [PATCH 3/3 V7] selftest: KVM: Add intra host migration tests Peter Gonda
2021-09-10 17:16   ` Sean Christopherson
2021-09-10 22:14     ` Peter Gonda
2021-09-02 18:45 ` [PATCH 0/2 V7] Add AMD SEV and SEV-ES intra host migration support Sean Christopherson
2021-09-02 18:53   ` Peter Gonda
