kvmarm.lists.cs.columbia.edu archive mirror
 help / color / mirror / Atom feed
From: Gavin Shan <gshan@redhat.com>
To: kvmarm@lists.linux.dev
Cc: kvm@vger.kernel.org, catalin.marinas@arm.com,
	andrew.jones@linux.dev, will@kernel.org, shan.gavin@gmail.com,
	bgardon@google.com, dmatlack@google.com, pbonzini@redhat.com,
	zhenyzha@redhat.com, shuah@kernel.org,
	kvmarm@lists.cs.columbia.edu
Subject: [PATCH v5 3/7] KVM: x86: Allow to use bitmap in ring-based dirty page tracking
Date: Wed,  5 Oct 2022 08:41:50 +0800	[thread overview]
Message-ID: <20221005004154.83502-4-gshan@redhat.com> (raw)
In-Reply-To: <20221005004154.83502-1-gshan@redhat.com>

There is no running vcpu and available per-vcpu dirty ring when
pages become dirty in some cases. One example is to save arm64's
vgic/its tables during migration. This leads to our lost tracking
on these dirty pages.

Fix the issue by reusing the bitmap to track those dirty pages.
The bitmap is visited by KVM_GET_DIRTY_LOG and KVM_CLEAR_DIRTY_LOG
ioctls. KVM_CAP_DIRTY_LOG_RING_ALLOW_BITMAP is used to advertise the
new capability.

Suggested-by: Marc Zyngier <maz@kernel.org>
Suggested-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Gavin Shan <gshan@redhat.com>
---
 Documentation/virt/kvm/api.rst | 17 ++++++++---------
 include/uapi/linux/kvm.h       |  1 +
 virt/kvm/kvm_main.c            | 24 +++++++++---------------
 3 files changed, 18 insertions(+), 24 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 32427ea160df..4f5e09042f8a 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -8019,8 +8019,8 @@ guest according to the bits in the KVM_CPUID_FEATURES CPUID leaf
 (0x40000001). Otherwise, a guest may use the paravirtual features
 regardless of what has actually been exposed through the CPUID leaf.
 
-8.29 KVM_CAP_DIRTY_LOG_RING/KVM_CAP_DIRTY_LOG_RING_ACQ_REL
-----------------------------------------------------------
+8.29 KVM_CAP_DIRTY_LOG_{RING, RING_ACQ_REL, RING_ALLOW_BITMAP}
+--------------------------------------------------------------
 
 :Architectures: x86
 :Parameters: args[0] - size of the dirty log ring
@@ -8104,13 +8104,6 @@ flushing is done by the KVM_GET_DIRTY_LOG ioctl).  To achieve that, one
 needs to kick the vcpu out of KVM_RUN using a signal.  The resulting
 vmexit ensures that all dirty GFNs are flushed to the dirty rings.
 
-NOTE: the capability KVM_CAP_DIRTY_LOG_RING and the corresponding
-ioctl KVM_RESET_DIRTY_RINGS are mutual exclusive to the existing ioctls
-KVM_GET_DIRTY_LOG and KVM_CLEAR_DIRTY_LOG.  After enabling
-KVM_CAP_DIRTY_LOG_RING with an acceptable dirty ring size, the virtual
-machine will switch to ring-buffer dirty page tracking and further
-KVM_GET_DIRTY_LOG or KVM_CLEAR_DIRTY_LOG ioctls will fail.
-
 NOTE: KVM_CAP_DIRTY_LOG_RING_ACQ_REL is the only capability that
 should be exposed by weakly ordered architecture, in order to indicate
 the additional memory ordering requirements imposed on userspace when
@@ -8119,6 +8112,12 @@ Architecture with TSO-like ordering (such as x86) are allowed to
 expose both KVM_CAP_DIRTY_LOG_RING and KVM_CAP_DIRTY_LOG_RING_ACQ_REL
 to userspace.
 
+NOTE: There is no running vcpu and available vcpu dirty ring when pages
+becomes dirty in some cases. One example is to save arm64's vgic/its
+tables during migration. The dirty bitmap is still used to track those
+dirty pages, indicated by KVM_CAP_DIRTY_LOG_RING_ALLOW_BITMAP. The ditry
+bitmap is visited by KVM_GET_DIRTY_LOG and KVM_CLEAR_DIRTY_LOG ioctls.
+
 8.30 KVM_CAP_XEN_HVM
 --------------------
 
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 0d5d4419139a..a1f10bc5fe8b 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1178,6 +1178,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_S390_ZPCI_OP 221
 #define KVM_CAP_S390_CPU_TOPOLOGY 222
 #define KVM_CAP_DIRTY_LOG_RING_ACQ_REL 223
+#define KVM_CAP_DIRTY_LOG_RING_ALLOW_BITMAP 224
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 5b064dbadaf4..505e3840cf0c 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1617,7 +1617,7 @@ static int kvm_prepare_memory_region(struct kvm *kvm,
 			new->dirty_bitmap = NULL;
 		else if (old && old->dirty_bitmap)
 			new->dirty_bitmap = old->dirty_bitmap;
-		else if (!kvm->dirty_ring_size) {
+		else {
 			r = kvm_alloc_dirty_bitmap(new);
 			if (r)
 				return r;
@@ -2060,10 +2060,6 @@ int kvm_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log,
 	unsigned long n;
 	unsigned long any = 0;
 
-	/* Dirty ring tracking is exclusive to dirty log tracking */
-	if (kvm->dirty_ring_size)
-		return -ENXIO;
-
 	*memslot = NULL;
 	*is_dirty = 0;
 
@@ -2125,10 +2121,6 @@ static int kvm_get_dirty_log_protect(struct kvm *kvm, struct kvm_dirty_log *log)
 	unsigned long *dirty_bitmap_buffer;
 	bool flush;
 
-	/* Dirty ring tracking is exclusive to dirty log tracking */
-	if (kvm->dirty_ring_size)
-		return -ENXIO;
-
 	as_id = log->slot >> 16;
 	id = (u16)log->slot;
 	if (as_id >= KVM_ADDRESS_SPACE_NUM || id >= KVM_USER_MEM_SLOTS)
@@ -2237,10 +2229,6 @@ static int kvm_clear_dirty_log_protect(struct kvm *kvm,
 	unsigned long *dirty_bitmap_buffer;
 	bool flush;
 
-	/* Dirty ring tracking is exclusive to dirty log tracking */
-	if (kvm->dirty_ring_size)
-		return -ENXIO;
-
 	as_id = log->slot >> 16;
 	id = (u16)log->slot;
 	if (as_id >= KVM_ADDRESS_SPACE_NUM || id >= KVM_USER_MEM_SLOTS)
@@ -3305,7 +3293,7 @@ void mark_page_dirty_in_slot(struct kvm *kvm,
 	struct kvm_vcpu *vcpu = kvm_get_running_vcpu();
 
 #ifdef CONFIG_HAVE_KVM_DIRTY_RING
-	if (WARN_ON_ONCE(!vcpu) || WARN_ON_ONCE(vcpu->kvm != kvm))
+	if (WARN_ON_ONCE(vcpu && vcpu->kvm != kvm))
 		return;
 #endif
 
@@ -3313,7 +3301,7 @@ void mark_page_dirty_in_slot(struct kvm *kvm,
 		unsigned long rel_gfn = gfn - memslot->base_gfn;
 		u32 slot = (memslot->as_id << 16) | memslot->id;
 
-		if (kvm->dirty_ring_size)
+		if (kvm->dirty_ring_size && vcpu)
 			kvm_dirty_ring_push(&vcpu->dirty_ring,
 					    slot, rel_gfn);
 		else
@@ -4485,6 +4473,12 @@ static long kvm_vm_ioctl_check_extension_generic(struct kvm *kvm, long arg)
 		return KVM_DIRTY_RING_MAX_ENTRIES * sizeof(struct kvm_dirty_gfn);
 #else
 		return 0;
+#endif
+	case KVM_CAP_DIRTY_LOG_RING_ALLOW_BITMAP:
+#ifdef CONFIG_HAVE_KVM_DIRTY_RING
+		return 1;
+#else
+		return 0;
 #endif
 	case KVM_CAP_BINARY_STATS_FD:
 	case KVM_CAP_SYSTEM_EVENT_DATA:
-- 
2.23.0

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

WARNING: multiple messages have this Message-ID (diff)
From: Gavin Shan <gshan@redhat.com>
To: kvmarm@lists.linux.dev
Cc: kvmarm@lists.cs.columbia.edu, kvm@vger.kernel.org,
	peterx@redhat.com, catalin.marinas@arm.com, bgardon@google.com,
	shuah@kernel.org, andrew.jones@linux.dev, will@kernel.org,
	dmatlack@google.com, pbonzini@redhat.com, zhenyzha@redhat.com,
	james.morse@arm.com, suzuki.poulose@arm.com,
	alexandru.elisei@arm.com, oliver.upton@linux.dev,
	seanjc@google.com, shan.gavin@gmail.com
Subject: [PATCH v5 3/7] KVM: x86: Allow to use bitmap in ring-based dirty page tracking
Date: Wed,  5 Oct 2022 08:41:50 +0800	[thread overview]
Message-ID: <20221005004154.83502-4-gshan@redhat.com> (raw)
Message-ID: <20221005004150.ybPrz8daMvdvz74HOGuwFx9EJQEaOxbOysGkQUcXDJs@z> (raw)
In-Reply-To: <20221005004154.83502-1-gshan@redhat.com>

There is no running vcpu and available per-vcpu dirty ring when
pages become dirty in some cases. One example is to save arm64's
vgic/its tables during migration. This leads to our lost tracking
on these dirty pages.

Fix the issue by reusing the bitmap to track those dirty pages.
The bitmap is visited by KVM_GET_DIRTY_LOG and KVM_CLEAR_DIRTY_LOG
ioctls. KVM_CAP_DIRTY_LOG_RING_ALLOW_BITMAP is used to advertise the
new capability.

Suggested-by: Marc Zyngier <maz@kernel.org>
Suggested-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Gavin Shan <gshan@redhat.com>
---
 Documentation/virt/kvm/api.rst | 17 ++++++++---------
 include/uapi/linux/kvm.h       |  1 +
 virt/kvm/kvm_main.c            | 24 +++++++++---------------
 3 files changed, 18 insertions(+), 24 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 32427ea160df..4f5e09042f8a 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -8019,8 +8019,8 @@ guest according to the bits in the KVM_CPUID_FEATURES CPUID leaf
 (0x40000001). Otherwise, a guest may use the paravirtual features
 regardless of what has actually been exposed through the CPUID leaf.
 
-8.29 KVM_CAP_DIRTY_LOG_RING/KVM_CAP_DIRTY_LOG_RING_ACQ_REL
-----------------------------------------------------------
+8.29 KVM_CAP_DIRTY_LOG_{RING, RING_ACQ_REL, RING_ALLOW_BITMAP}
+--------------------------------------------------------------
 
 :Architectures: x86
 :Parameters: args[0] - size of the dirty log ring
@@ -8104,13 +8104,6 @@ flushing is done by the KVM_GET_DIRTY_LOG ioctl).  To achieve that, one
 needs to kick the vcpu out of KVM_RUN using a signal.  The resulting
 vmexit ensures that all dirty GFNs are flushed to the dirty rings.
 
-NOTE: the capability KVM_CAP_DIRTY_LOG_RING and the corresponding
-ioctl KVM_RESET_DIRTY_RINGS are mutual exclusive to the existing ioctls
-KVM_GET_DIRTY_LOG and KVM_CLEAR_DIRTY_LOG.  After enabling
-KVM_CAP_DIRTY_LOG_RING with an acceptable dirty ring size, the virtual
-machine will switch to ring-buffer dirty page tracking and further
-KVM_GET_DIRTY_LOG or KVM_CLEAR_DIRTY_LOG ioctls will fail.
-
 NOTE: KVM_CAP_DIRTY_LOG_RING_ACQ_REL is the only capability that
 should be exposed by weakly ordered architecture, in order to indicate
 the additional memory ordering requirements imposed on userspace when
@@ -8119,6 +8112,12 @@ Architecture with TSO-like ordering (such as x86) are allowed to
 expose both KVM_CAP_DIRTY_LOG_RING and KVM_CAP_DIRTY_LOG_RING_ACQ_REL
 to userspace.
 
+NOTE: There is no running vcpu and available vcpu dirty ring when pages
+becomes dirty in some cases. One example is to save arm64's vgic/its
+tables during migration. The dirty bitmap is still used to track those
+dirty pages, indicated by KVM_CAP_DIRTY_LOG_RING_ALLOW_BITMAP. The ditry
+bitmap is visited by KVM_GET_DIRTY_LOG and KVM_CLEAR_DIRTY_LOG ioctls.
+
 8.30 KVM_CAP_XEN_HVM
 --------------------
 
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 0d5d4419139a..a1f10bc5fe8b 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1178,6 +1178,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_S390_ZPCI_OP 221
 #define KVM_CAP_S390_CPU_TOPOLOGY 222
 #define KVM_CAP_DIRTY_LOG_RING_ACQ_REL 223
+#define KVM_CAP_DIRTY_LOG_RING_ALLOW_BITMAP 224
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 5b064dbadaf4..505e3840cf0c 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1617,7 +1617,7 @@ static int kvm_prepare_memory_region(struct kvm *kvm,
 			new->dirty_bitmap = NULL;
 		else if (old && old->dirty_bitmap)
 			new->dirty_bitmap = old->dirty_bitmap;
-		else if (!kvm->dirty_ring_size) {
+		else {
 			r = kvm_alloc_dirty_bitmap(new);
 			if (r)
 				return r;
@@ -2060,10 +2060,6 @@ int kvm_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log,
 	unsigned long n;
 	unsigned long any = 0;
 
-	/* Dirty ring tracking is exclusive to dirty log tracking */
-	if (kvm->dirty_ring_size)
-		return -ENXIO;
-
 	*memslot = NULL;
 	*is_dirty = 0;
 
@@ -2125,10 +2121,6 @@ static int kvm_get_dirty_log_protect(struct kvm *kvm, struct kvm_dirty_log *log)
 	unsigned long *dirty_bitmap_buffer;
 	bool flush;
 
-	/* Dirty ring tracking is exclusive to dirty log tracking */
-	if (kvm->dirty_ring_size)
-		return -ENXIO;
-
 	as_id = log->slot >> 16;
 	id = (u16)log->slot;
 	if (as_id >= KVM_ADDRESS_SPACE_NUM || id >= KVM_USER_MEM_SLOTS)
@@ -2237,10 +2229,6 @@ static int kvm_clear_dirty_log_protect(struct kvm *kvm,
 	unsigned long *dirty_bitmap_buffer;
 	bool flush;
 
-	/* Dirty ring tracking is exclusive to dirty log tracking */
-	if (kvm->dirty_ring_size)
-		return -ENXIO;
-
 	as_id = log->slot >> 16;
 	id = (u16)log->slot;
 	if (as_id >= KVM_ADDRESS_SPACE_NUM || id >= KVM_USER_MEM_SLOTS)
@@ -3305,7 +3293,7 @@ void mark_page_dirty_in_slot(struct kvm *kvm,
 	struct kvm_vcpu *vcpu = kvm_get_running_vcpu();
 
 #ifdef CONFIG_HAVE_KVM_DIRTY_RING
-	if (WARN_ON_ONCE(!vcpu) || WARN_ON_ONCE(vcpu->kvm != kvm))
+	if (WARN_ON_ONCE(vcpu && vcpu->kvm != kvm))
 		return;
 #endif
 
@@ -3313,7 +3301,7 @@ void mark_page_dirty_in_slot(struct kvm *kvm,
 		unsigned long rel_gfn = gfn - memslot->base_gfn;
 		u32 slot = (memslot->as_id << 16) | memslot->id;
 
-		if (kvm->dirty_ring_size)
+		if (kvm->dirty_ring_size && vcpu)
 			kvm_dirty_ring_push(&vcpu->dirty_ring,
 					    slot, rel_gfn);
 		else
@@ -4485,6 +4473,12 @@ static long kvm_vm_ioctl_check_extension_generic(struct kvm *kvm, long arg)
 		return KVM_DIRTY_RING_MAX_ENTRIES * sizeof(struct kvm_dirty_gfn);
 #else
 		return 0;
+#endif
+	case KVM_CAP_DIRTY_LOG_RING_ALLOW_BITMAP:
+#ifdef CONFIG_HAVE_KVM_DIRTY_RING
+		return 1;
+#else
+		return 0;
 #endif
 	case KVM_CAP_BINARY_STATS_FD:
 	case KVM_CAP_SYSTEM_EVENT_DATA:
-- 
2.23.0


  parent reply	other threads:[~2022-10-05  0:43 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-05  0:41 [PATCH v5 0/7] KVM: arm64: Enable ring-based dirty memory tracking Gavin Shan
2022-10-05  0:41 ` Gavin Shan
2022-10-05  0:41 ` [PATCH v5 1/7] KVM: x86: Introduce KVM_REQ_RING_SOFT_FULL Gavin Shan
2022-10-05  0:41   ` Gavin Shan
2022-10-05  0:41 ` [PATCH v5 2/7] KVM: x86: Move declaration of kvm_cpu_dirty_log_size() to kvm_dirty_ring.h Gavin Shan
2022-10-05  0:41   ` Gavin Shan
2022-10-05  0:41 ` Gavin Shan [this message]
2022-10-05  0:41   ` [PATCH v5 3/7] KVM: x86: Allow to use bitmap in ring-based dirty page tracking Gavin Shan
2022-10-06 20:28   ` Peter Xu
2022-10-06 20:28     ` Peter Xu
2022-10-06 23:38     ` Gavin Shan
2022-10-06 23:38       ` Gavin Shan
2022-10-07 14:31       ` Peter Xu
2022-10-07 14:31         ` Peter Xu
2022-10-10 23:18         ` Oliver Upton
2022-10-10 23:18           ` Oliver Upton
2022-10-10 23:43           ` Oliver Upton
2022-10-10 23:43             ` Oliver Upton
2022-10-10 23:49           ` Peter Xu
2022-10-10 23:49             ` Peter Xu
2022-10-10 23:58             ` Gavin Shan
2022-10-10 23:58               ` Gavin Shan
2022-10-10 23:58             ` Oliver Upton
2022-10-10 23:58               ` Oliver Upton
2022-10-11  0:20               ` Peter Xu
2022-10-11  0:20                 ` Peter Xu
2022-10-11  1:12                 ` Oliver Upton
2022-10-11  1:12                   ` Oliver Upton
2022-10-11  3:56                   ` Gavin Shan
2022-10-11  3:56                     ` Gavin Shan
2022-10-11  6:31                     ` Gavin Shan
2022-10-11  6:31                       ` Gavin Shan
2022-10-14 16:55                   ` Peter Xu
2022-10-14 16:55                     ` Peter Xu
2022-10-18  7:38                     ` Oliver Upton
2022-10-18  7:38                       ` Oliver Upton
2022-10-18  7:40                       ` Oliver Upton
2022-10-18  7:40                         ` Oliver Upton
2022-10-18 15:50                       ` Peter Xu
2022-10-18 15:50                         ` Peter Xu
2022-10-05  0:41 ` [PATCH v5 4/7] KVM: arm64: Enable ring-based dirty memory tracking Gavin Shan
2022-10-05  0:41   ` Gavin Shan
2022-10-05  0:41 ` [PATCH v5 5/7] KVM: selftests: Use host page size to map ring buffer in dirty_log_test Gavin Shan
2022-10-05  0:41   ` Gavin Shan
2022-10-05  0:41 ` [PATCH v5 6/7] KVM: selftests: Clear dirty ring states between two modes " Gavin Shan
2022-10-05  0:41   ` Gavin Shan
2022-10-05  0:41 ` [PATCH v5 7/7] KVM: selftests: Automate choosing dirty ring size " Gavin Shan
2022-10-05  0:41   ` Gavin Shan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20221005004154.83502-4-gshan@redhat.com \
    --to=gshan@redhat.com \
    --cc=andrew.jones@linux.dev \
    --cc=bgardon@google.com \
    --cc=catalin.marinas@arm.com \
    --cc=dmatlack@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=kvmarm@lists.cs.columbia.edu \
    --cc=kvmarm@lists.linux.dev \
    --cc=pbonzini@redhat.com \
    --cc=shan.gavin@gmail.com \
    --cc=shuah@kernel.org \
    --cc=will@kernel.org \
    --cc=zhenyzha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).