From: Alexander Graf <agraf@suse.de>
To: kvm-ppc@vger.kernel.org
Cc: kvm@vger.kernel.org, pbonzini@redhat.com, mtosatti@redhat.com,
	Alexey Kardashevskiy <aik@ozlabs.ru>,
	Paul Mackerras <paulus@samba.org>
Subject: [PULL 36/41] KVM: PPC: Book3S HV: Fix dirty map for hugepages
Date: Fri, 30 May 2014 14:42:51 +0200
Message-ID: <1401453776-55285-37-git-send-email-agraf@suse.de>
In-Reply-To: <1401453776-55285-1-git-send-email-agraf@suse.de>

From: Alexey Kardashevskiy <aik@ozlabs.ru>

The dirty map that we construct for the KVM_GET_DIRTY_LOG ioctl has
one bit per system page (4K/64K).  Currently, we only set one bit in
the map for each HPT entry with the Change bit set, even if the HPTE is
for a large page (e.g., 16MB).  Userspace then considers only the
first system page dirty, though in fact the guest may have modified
anywhere in the large page.
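
To put numbers on the gap, here is a minimal sketch of the page-count
arithmetic. The helper name pages_covered() is hypothetical and is not part
of the kernel; it only mirrors the round-up expression
(n + PAGE_SIZE - 1) >> PAGE_SHIFT used in the patch, with the page shift
passed in explicitly so both system page sizes can be tried.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not a kernel function): number of system pages
 * covered by a mapping of hpte_bytes bytes, rounded up, for a system
 * page size of (1 << page_shift) bytes.
 */
static unsigned long pages_covered(unsigned long hpte_bytes,
				   unsigned int page_shift)
{
	unsigned long page_size = 1UL << page_shift;

	return (hpte_bytes + page_size - 1) >> page_shift;
}
```

With a 16MB HPTE this gives 4096 system pages for 4K pages (shift 12) and
256 for 64K pages (shift 16), so setting a single bit leaves all but one
of those pages unreported.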

To fix this, we make kvm_test_clear_dirty() return the actual number
of pages that are dirty (and rename it to kvm_test_clear_dirty_npages()
to emphasize that that's what it returns).  In kvmppc_hv_get_dirty_log()
we then set that many bits in the dirty map.
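
The bit-setting step can be sketched in isolation as follows. This is an
illustration under stated assumptions, not the kernel code: the bitmap is a
plain byte array, set_bit_le_sketch() and test_bit_le_sketch() are
hypothetical stand-ins for the kernel's little-endian bit helpers, and
mark_dirty_run() is a made-up name for the inner loop. The sketch does no
bounds checking; the caller must ensure the run fits in the map.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for __set_bit_le(): bit nr lives in byte nr / 8
 * at bit position nr % 8, regardless of host endianness.
 */
static void set_bit_le_sketch(unsigned long nr, unsigned char *map)
{
	map[nr / 8] |= (unsigned char)(1u << (nr % 8));
}

/* Hypothetical read-side helper, used only to inspect the result. */
static int test_bit_le_sketch(unsigned long nr, const unsigned char *map)
{
	return (map[nr / 8] >> (nr % 8)) & 1;
}

/*
 * Mark npages consecutive system pages dirty, starting at page index
 * first. Same loop shape as the patch: count npages down while the bit
 * index j counts up.
 */
static void mark_dirty_run(unsigned char *map, unsigned long first,
			   unsigned long npages)
{
	unsigned long j;

	for (j = first; npages; ++j, --npages)
		set_bit_le_sketch(j, map);
}
```

For example, calling mark_dirty_run(map, 256, 256) on a zeroed 64-byte map
sets bits 256 through 511 and leaves bits 0 through 255 clear, which is the
pattern for one dirty 16MB page at page index 256 with 64K system pages.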

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kvm/book3s_64_mmu_hv.c | 33 ++++++++++++++++++++++++---------
 1 file changed, 24 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index 4e22ecb..96c9044 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -1060,22 +1060,27 @@ void kvm_set_spte_hva_hv(struct kvm *kvm, unsigned long hva, pte_t pte)
 	kvm_handle_hva(kvm, hva, kvm_unmap_rmapp);
 }
 
-static int kvm_test_clear_dirty(struct kvm *kvm, unsigned long *rmapp)
+/*
+ * Returns the number of system pages that are dirty.
+ * This can be more than 1 if we find a huge-page HPTE.
+ */
+static int kvm_test_clear_dirty_npages(struct kvm *kvm, unsigned long *rmapp)
 {
 	struct revmap_entry *rev = kvm->arch.revmap;
 	unsigned long head, i, j;
+	unsigned long n;
 	unsigned long *hptep;
-	int ret = 0;
+	int npages_dirty = 0;
 
  retry:
 	lock_rmap(rmapp);
 	if (*rmapp & KVMPPC_RMAP_CHANGED) {
 		*rmapp &= ~KVMPPC_RMAP_CHANGED;
-		ret = 1;
+		npages_dirty = 1;
 	}
 	if (!(*rmapp & KVMPPC_RMAP_PRESENT)) {
 		unlock_rmap(rmapp);
-		return ret;
+		return npages_dirty;
 	}
 
 	i = head = *rmapp & KVMPPC_RMAP_INDEX;
@@ -1106,13 +1111,16 @@ static int kvm_test_clear_dirty(struct kvm *kvm, unsigned long *rmapp)
 				rev[i].guest_rpte |= HPTE_R_C;
 				note_hpte_modification(kvm, &rev[i]);
 			}
-			ret = 1;
+			n = hpte_page_size(hptep[0], hptep[1]);
+			n = (n + PAGE_SIZE - 1) >> PAGE_SHIFT;
+			if (n > npages_dirty)
+				npages_dirty = n;
 		}
 		hptep[0] &= ~HPTE_V_HVLOCK;
 	} while ((i = j) != head);
 
 	unlock_rmap(rmapp);
-	return ret;
+	return npages_dirty;
 }
 
 static void harvest_vpa_dirty(struct kvmppc_vpa *vpa,
@@ -1136,15 +1144,22 @@ static void harvest_vpa_dirty(struct kvmppc_vpa *vpa,
 long kvmppc_hv_get_dirty_log(struct kvm *kvm, struct kvm_memory_slot *memslot,
 			     unsigned long *map)
 {
-	unsigned long i;
+	unsigned long i, j;
 	unsigned long *rmapp;
 	struct kvm_vcpu *vcpu;
 
 	preempt_disable();
 	rmapp = memslot->arch.rmap;
 	for (i = 0; i < memslot->npages; ++i) {
-		if (kvm_test_clear_dirty(kvm, rmapp) && map)
-			__set_bit_le(i, map);
+		int npages = kvm_test_clear_dirty_npages(kvm, rmapp);
+		/*
+		 * Note that if npages > 0 then i must be a multiple of npages,
+		 * since we always put huge-page HPTEs in the rmap chain
+		 * corresponding to their page base address.
+		 */
+		if (npages && map)
+			for (j = i; npages; ++j, --npages)
+				__set_bit_le(j, map);
 		++rmapp;
 	}
 
-- 
1.8.1.4
