From: Shivam Kumar <shivam.kumar1@nutanix.com>
To: pbonzini@redhat.com
Cc: kvm@vger.kernel.org, Shivam Kumar <shivam.kumar1@nutanix.com>,
	Anurag Madnawat <anurag.madnawat@nutanix.com>,
	Shaju Abraham <shaju.abraham@nutanix.com>,
	Manish Mishra <manish.mishra@nutanix.com>
Subject: [PATCH 3/6] Add KVM_CAP_DIRTY_QUOTA_MIGRATION and handle vCPU page faults.
Date: Sun, 14 Nov 2021 14:57:18 +0000
Message-ID: <20211114145721.209219-4-shivam.kumar1@nutanix.com>
In-Reply-To: <20211114145721.209219-1-shivam.kumar1@nutanix.com>

When a dirty quota migration is initiated from the QEMU side, the
following things happen:

1. mmap is called on each vCPU file descriptor to map the dirty quota
context. This results in a vCPU page fault, which needs to be handled.
2. Once QEMU is ready to start the migration, it calls an ioctl to enable
dirty quota migration, which must also be handled (a usage sketch follows
below).
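
For reference, a minimal sketch of this flow from the userspace side, for
illustration only: vcpu_fd and vm_fd stand for the already-created KVM
file descriptors, error handling is elided, and the exact QEMU code is
not part of this series. KVM_DIRTY_QUOTA_PAGE_OFFSET and
vCPUDirtyQuotaContext come from patch 1/6.

  #include <unistd.h>
  #include <sys/ioctl.h>
  #include <sys/mman.h>
  #include <linux/kvm.h>

  long page_size = sysconf(_SC_PAGESIZE);

  /* 1. Map the dirty quota context of a vCPU (one mmap per vCPU). */
  struct vCPUDirtyQuotaContext *dqctx =
          mmap(NULL, page_size, PROT_READ | PROT_WRITE, MAP_SHARED,
               vcpu_fd, KVM_DIRTY_QUOTA_PAGE_OFFSET * page_size);

  /* 2. Once QEMU is ready to migrate, enable dirty quota migration. */
  struct kvm_enable_cap cap = {
          .cap = KVM_CAP_DIRTY_QUOTA_MIGRATION,
          .args = { 1 },  /* true: enable */
  };
  ioctl(vm_fd, KVM_ENABLE_CAP, &cap);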

Co-developed-by: Anurag Madnawat <anurag.madnawat@nutanix.com>
Signed-off-by: Anurag Madnawat <anurag.madnawat@nutanix.com>
Signed-off-by: Shivam Kumar <shivam.kumar1@nutanix.com>
Signed-off-by: Shaju Abraham <shaju.abraham@nutanix.com>
Signed-off-by: Manish Mishra <manish.mishra@nutanix.com>
---
 Documentation/virt/kvm/api.rst        | 39 +++++++++++++++++++++++++++
 include/linux/dirty_quota_migration.h |  8 ++++++
 include/uapi/linux/kvm.h              |  1 +
 virt/kvm/dirty_quota_migration.c      |  6 +++++
 virt/kvm/kvm_main.c                   | 37 +++++++++++++++++++++++++
 5 files changed, 91 insertions(+)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index aeeb071c7688..6679bceee649 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -277,6 +277,10 @@ the VCPU file descriptor can be mmap-ed, including:
   KVM_DIRTY_LOG_PAGE_OFFSET * PAGE_SIZE.  For more information on
   KVM_CAP_DIRTY_LOG_RING, see section 8.3.
 
+- if KVM_CAP_DIRTY_QUOTA_MIGRATION is available, a number of pages at
+  KVM_DIRTY_QUOTA_PAGE_OFFSET * PAGE_SIZE.  For more information on
+  KVM_CAP_DIRTY_QUOTA_MIGRATION, see section 8.35.
+
 
 4.6 KVM_SET_MEMORY_REGION
 -------------------------
@@ -7484,3 +7488,38 @@ The argument to KVM_ENABLE_CAP is also a bitmask, and must be a subset
 of the result of KVM_CHECK_EXTENSION.  KVM will forward to userspace
 the hypercalls whose corresponding bit is in the argument, and return
 ENOSYS for the others.
+
+8.35 KVM_CAP_DIRTY_QUOTA_MIGRATION
+----------------------------------
+
+:Architectures: x86
+:Parameters: args[0] - boolean value specifying whether to enable or
+  disable dirty quota migration (true to enable, false to disable)
+
+With dirty quota migration, memory dirtying is throttled by setting a
+limit on the number of pages a vCPU can dirty in fixed-size, very short
+time intervals. This limit depends on the network throughput calculated
+over the last few intervals, so that vCPUs are throttled according to the
+available network bandwidth. This limit is referred to as the "dirty
+quota" of a vCPU, and the fixed-size intervals as the "dirty quota
+intervals".
+
+vCPUDirtyQuotaContext keeps the dirty quota context for each vCPU. It
+tracks the number of pages the vCPU has dirtied (dirty_counter) in the
+ongoing dirty quota interval and the maximum number of pages the vCPU is
+allowed to dirty (dirty_quota) in that interval.
+
+  struct vCPUDirtyQuotaContext {
+          u64 dirty_counter;
+          u64 dirty_quota;
+  };
+
+The flag dirty_quota_migration_enabled determines whether dirty
+quota-based throttling is enabled for an ongoing migration.
+
+When the guest tries to dirty a page, it causes a vmexit because each
+page is write-protected. In the vmexit path, we increment the
+dirty_counter for the corresponding vCPU. Then, we check whether the
+vCPU has exceeded its quota. If so, we exit to userspace with the new
+exit reason KVM_EXIT_DIRTY_QUOTA_FULL. This "quota full" event is then
+handled on the userspace side.
diff --git a/include/linux/dirty_quota_migration.h b/include/linux/dirty_quota_migration.h
index 8c12fa428436..b6c6f5f896dd 100644
--- a/include/linux/dirty_quota_migration.h
+++ b/include/linux/dirty_quota_migration.h
@@ -24,9 +24,17 @@ static inline int kvm_vcpu_dirty_quota_alloc(struct vCPUDirtyQuotaContext **vCPU
 	return 0;
 }
 
+static inline struct page *kvm_dirty_quota_context_get_page(
+		struct vCPUDirtyQuotaContext *vCPUdqctx, u32 offset)
+{
+	return NULL;
+}
+
 #else /* KVM_DIRTY_QUOTA_PAGE_OFFSET == 0 */
 
 int kvm_vcpu_dirty_quota_alloc(struct vCPUDirtyQuotaContext **vCPUdqctx);
+struct page *kvm_dirty_quota_context_get_page(
+		struct vCPUDirtyQuotaContext *vCPUdqctx, u32 offset);
 
 #endif /* KVM_DIRTY_QUOTA_PAGE_OFFSET == 0 */
 
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 647f7e1a04dc..a6785644bf47 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1131,6 +1131,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_EXIT_ON_EMULATION_FAILURE 204
 #define KVM_CAP_ARM_MTE 205
 #define KVM_CAP_VM_MOVE_ENC_CONTEXT_FROM 206
+#define KVM_CAP_DIRTY_QUOTA_MIGRATION 207
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
diff --git a/virt/kvm/dirty_quota_migration.c b/virt/kvm/dirty_quota_migration.c
index 262f071aac0c..7e9ace760939 100644
--- a/virt/kvm/dirty_quota_migration.c
+++ b/virt/kvm/dirty_quota_migration.c
@@ -12,3 +12,9 @@ int kvm_vcpu_dirty_quota_alloc(struct vCPUDirtyQuotaContext **vCPUdqctx)
 	memset((*vCPUdqctx), 0, size);
 	return 0;
 }
+
+struct page *kvm_dirty_quota_context_get_page(
+		struct vCPUDirtyQuotaContext *vCPUdqctx, u32 offset)
+{
+	return vmalloc_to_page((void *)vCPUdqctx + offset * PAGE_SIZE);
+}
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 5626ae1b92ce..1564d3a3f608 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3519,6 +3519,9 @@ static vm_fault_t kvm_vcpu_fault(struct vm_fault *vmf)
 		page = kvm_dirty_ring_get_page(
 		    &vcpu->dirty_ring,
 		    vmf->pgoff - KVM_DIRTY_LOG_PAGE_OFFSET);
+	else if (vmf->pgoff == KVM_DIRTY_QUOTA_PAGE_OFFSET)
+		page = kvm_dirty_quota_context_get_page(vcpu->vCPUdqctx,
+				vmf->pgoff - KVM_DIRTY_QUOTA_PAGE_OFFSET);
 	else
 		return kvm_arch_vcpu_fault(vcpu, vmf);
 	get_page(page);
@@ -4207,6 +4210,12 @@ static long kvm_vm_ioctl_check_extension_generic(struct kvm *kvm, long arg)
 #endif
 	case KVM_CAP_BINARY_STATS_FD:
 		return 1;
+	case KVM_CAP_DIRTY_QUOTA_MIGRATION:
+#if KVM_DIRTY_QUOTA_PAGE_OFFSET > 0
+		return 1;
+#else
+		return 0;
+#endif
 	default:
 		break;
 	}
@@ -4273,6 +4282,31 @@ static int kvm_vm_ioctl_reset_dirty_pages(struct kvm *kvm)
 	return cleared;
 }
 
+static int kvm_vm_ioctl_enable_dirty_quota_migration(struct kvm *kvm,
+		bool enabled)
+{
+	if (!KVM_DIRTY_QUOTA_PAGE_OFFSET)
+		return -EINVAL;
+
+	/*
+	 * For now, dirty quota migration works with the dirty bitmap, so
+	 * don't enable it if the dirty ring interface is enabled. In the
+	 * future, dirty quota migration may work with the dirty ring as well.
+	 */
+	if (kvm->dirty_ring_size)
+		return -EINVAL;
+
+	/* Return if no change */
+	if (kvm->dirty_quota_migration_enabled == enabled)
+		return -EINVAL;
+
+	mutex_lock(&kvm->lock);
+	kvm->dirty_quota_migration_enabled = enabled;
+	mutex_unlock(&kvm->lock);
+
+	return 0;
+}
+
 int __attribute__((weak)) kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 						  struct kvm_enable_cap *cap)
 {
@@ -4305,6 +4339,9 @@ static int kvm_vm_ioctl_enable_cap_generic(struct kvm *kvm,
 	}
 	case KVM_CAP_DIRTY_LOG_RING:
 		return kvm_vm_ioctl_enable_dirty_log_ring(kvm, cap->args[0]);
+	case KVM_CAP_DIRTY_QUOTA_MIGRATION:
+		return kvm_vm_ioctl_enable_dirty_quota_migration(kvm,
+				cap->args[0]);
 	default:
 		return kvm_vm_ioctl_enable_cap(kvm, cap);
 	}
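
For completeness, here is a hedged sketch of how a userspace run loop
might consume this interface. KVM_EXIT_DIRTY_QUOTA_FULL is introduced in
patch 5/6 of this series; run is the mmapped struct kvm_run of the vCPU,
and the helpers wait_for_next_interval() and bytes_sent_last_interval()
as well as the quota formula are hypothetical placeholders, not QEMU
code:

  /* Run the vCPU; replenish its quota whenever it is exhausted. */
  for (;;) {
          ioctl(vcpu_fd, KVM_RUN, NULL);
          if (run->exit_reason == KVM_EXIT_DIRTY_QUOTA_FULL) {
                  /* Wait for the current dirty quota interval to end. */
                  wait_for_next_interval();
                  /*
                   * Derive the next quota from the network throughput
                   * observed over recent intervals (bytes -> pages).
                   */
                  dqctx->dirty_counter = 0;
                  dqctx->dirty_quota =
                          bytes_sent_last_interval() / page_size;
          }
  }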
-- 
2.22.3

