kvm.vger.kernel.org archive mirror
* [PATCH 0/3] PPC HV bug fixes + hcalls for FreeBSD
@ 2015-04-24  5:39 Paul Mackerras
  2015-04-24  5:39 ` [PATCH 1/3] KVM: PPC: Book3S HV: Fix race in reading change bit when removing HPTE Paul Mackerras
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Paul Mackerras @ 2015-04-24  5:39 UTC (permalink / raw)
  To: kvm-ppc, kvm; +Cc: Nathan Whitehorn

The main purpose of this series is to implement the H_CLEAR_REF and
H_CLEAR_MOD hypercalls defined by PAPR.  We are doing this for the
sake of FreeBSD guests as Linux guests don't use them.  Along the way
I found a couple of bugs, so the fixes for those are split out as the
first two patches.

The first two patches could go in immediately.  I'd like to get
feedback from actual users of H_CLEAR_REF/MOD before the third patch
goes in.

These patches are against Alex Graf's kvm-ppc-queue branch.

Paul.

 arch/powerpc/include/asm/kvm_host.h     |  2 ++
 arch/powerpc/kernel/asm-offsets.c       |  3 ++
 arch/powerpc/kvm/book3s_64_mmu_hv.c     |  6 +++-
 arch/powerpc/kvm/book3s_hv.c            | 51 ++++++++++++++++++++++-----------
 arch/powerpc/kvm/book3s_hv_builtin.c    | 16 +++++++++--
 arch/powerpc/kvm/book3s_hv_rm_mmu.c     | 26 +++++++++++++----
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 22 ++++++++++++--
 7 files changed, 98 insertions(+), 28 deletions(-)


* [PATCH 1/3] KVM: PPC: Book3S HV: Fix race in reading change bit when removing HPTE
  2015-04-24  5:39 [PATCH 0/3] PPC HV bug fixes + hcalls for FreeBSD Paul Mackerras
@ 2015-04-24  5:39 ` Paul Mackerras
  2015-04-28  5:06   ` Aneesh Kumar K.V
  2015-04-24  5:39 ` [PATCH 2/3] KVM: PPC: Book3S HV: Fix bug in dirty page tracking Paul Mackerras
  2015-04-24  5:39 ` [PATCH 3/3] KVM: PPC: Book3S HV: Implement H_CLEAR_REF and H_CLEAR_MOD Paul Mackerras
  2 siblings, 1 reply; 12+ messages in thread
From: Paul Mackerras @ 2015-04-24  5:39 UTC (permalink / raw)
  To: kvm-ppc, kvm; +Cc: Nathan Whitehorn

The reference (R) and change (C) bits in a HPT entry can be set by
hardware at any time up until the HPTE is invalidated and the TLB
invalidation sequence has completed.  This means that when removing
a HPTE, we need to read the HPTE after the invalidation sequence has
completed in order to obtain reliable values of R and C.  The code
in kvmppc_do_h_remove() used to do this.  However, commit 6f22bd3265fb
("KVM: PPC: Book3S HV: Make HTAB code LE host aware") removed the
read after invalidation as a side effect of other changes.  This
restores the read of the HPTE after invalidation.

The user-visible effect of this bug would be that when migrating a
guest, there is a small probability that a page modified by the guest
and then unmapped by the guest might not get re-transmitted and thus
the destination might end up with a stale copy of the page.

Fixes: 6f22bd3265fb ("KVM: PPC: Book3S HV: Make HTAB code LE host aware")
Cc: stable@vger.kernel.org # v3.17+
Signed-off-by: Paul Mackerras <paulus@samba.org>
---
 arch/powerpc/kvm/book3s_hv_rm_mmu.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
index f6bf0b1..5c1737f 100644
--- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c
+++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
@@ -413,14 +413,12 @@ long kvmppc_do_h_remove(struct kvm *kvm, unsigned long flags,
 	rev = real_vmalloc_addr(&kvm->arch.revmap[pte_index]);
 	v = pte & ~HPTE_V_HVLOCK;
 	if (v & HPTE_V_VALID) {
-		u64 pte1;
-
-		pte1 = be64_to_cpu(hpte[1]);
 		hpte[0] &= ~cpu_to_be64(HPTE_V_VALID);
-		rb = compute_tlbie_rb(v, pte1, pte_index);
+		rb = compute_tlbie_rb(v, be64_to_cpu(hpte[1]), pte_index);
 		do_tlbies(kvm, &rb, 1, global_invalidates(kvm, flags), true);
 		/* Read PTE low word after tlbie to get final R/C values */
-		remove_revmap_chain(kvm, pte_index, rev, v, pte1);
+		remove_revmap_chain(kvm, pte_index, rev, v,
+				    be64_to_cpu(hpte[1]));
 	}
 	r = rev->guest_rpte & ~HPTE_GR_RESERVED;
 	note_hpte_modification(kvm, rev);
-- 
2.1.4



* [PATCH 2/3] KVM: PPC: Book3S HV: Fix bug in dirty page tracking
  2015-04-24  5:39 [PATCH 0/3] PPC HV bug fixes + hcalls for FreeBSD Paul Mackerras
  2015-04-24  5:39 ` [PATCH 1/3] KVM: PPC: Book3S HV: Fix race in reading change bit when removing HPTE Paul Mackerras
@ 2015-04-24  5:39 ` Paul Mackerras
  2015-04-24  5:39 ` [PATCH 3/3] KVM: PPC: Book3S HV: Implement H_CLEAR_REF and H_CLEAR_MOD Paul Mackerras
  2 siblings, 0 replies; 12+ messages in thread
From: Paul Mackerras @ 2015-04-24  5:39 UTC (permalink / raw)
  To: kvm-ppc, kvm; +Cc: Nathan Whitehorn

This fixes a bug in the tracking of pages that get modified by the
guest.  If the guest creates a large-page HPTE, writes to memory
somewhere within the large page, and then removes the HPTE, we only
record the modified state for the first normal page within the large
page, when in fact the guest might have modified some other normal
page within the large page.

To fix this we use some unused bits in the rmap entry to record the
order (log base 2) of the size of the page that was modified, when
removing an HPTE.  Then in kvm_test_clear_dirty_npages() we use that
order to return the correct number of modified pages.

The same thing could in principle happen when removing an HPTE at the
host's request, i.e. when paging out a page, except that we never
page out large pages, and the guest can only create large-page HPTEs
if the guest RAM is backed by large pages.  However, we also fix
this case for the sake of future-proofing.

The reference bit is also subject to the same loss of information.  We
don't make the same fix here for the reference bit because there isn't
an interface for userspace to find out which pages the guest has
referenced, whereas there is one for userspace to find out which pages
the guest has modified.  Because of this loss of information, the
kvm_age_hva_hv() and kvm_test_age_hva_hv() functions might incorrectly
say that a page has not been referenced when it has, but that doesn't
matter greatly because we never page or swap out large pages.

Signed-off-by: Paul Mackerras <paulus@samba.org>
---
 arch/powerpc/include/asm/kvm_book3s.h |  1 +
 arch/powerpc/include/asm/kvm_host.h   |  2 ++
 arch/powerpc/kvm/book3s_64_mmu_hv.c   |  8 +++++++-
 arch/powerpc/kvm/book3s_hv_rm_mmu.c   | 17 +++++++++++++++++
 4 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
index 578e550..9b072a5 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -162,6 +162,7 @@ extern pfn_t kvmppc_gpa_to_pfn(struct kvm_vcpu *vcpu, gpa_t gpa, bool writing,
 			bool *writable);
 extern void kvmppc_add_revmap_chain(struct kvm *kvm, struct revmap_entry *rev,
 			unsigned long *rmap, long pte_index, int realmode);
+extern void kvmppc_update_rmap_change(unsigned long *rmap, unsigned long psize);
 extern void kvmppc_invalidate_hpte(struct kvm *kvm, __be64 *hptep,
 			unsigned long pte_index);
 void kvmppc_clear_ref_hpte(struct kvm *kvm, __be64 *hptep,
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index d67a838..9c2617e 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -205,8 +205,10 @@ struct revmap_entry {
  */
 #define KVMPPC_RMAP_LOCK_BIT	63
 #define KVMPPC_RMAP_RC_SHIFT	32
+#define KVMPPC_RMAP_CHG_SHIFT	48
 #define KVMPPC_RMAP_REFERENCED	(HPTE_R_R << KVMPPC_RMAP_RC_SHIFT)
 #define KVMPPC_RMAP_CHANGED	(HPTE_R_C << KVMPPC_RMAP_RC_SHIFT)
+#define KVMPPC_RMAP_CHG_ORDER	(0x3ful << KVMPPC_RMAP_CHG_SHIFT)
 #define KVMPPC_RMAP_PRESENT	0x100000000ul
 #define KVMPPC_RMAP_INDEX	0xfffffffful
 
diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index d6fe308..c9c25af 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -763,6 +763,8 @@ static int kvm_unmap_rmapp(struct kvm *kvm, unsigned long *rmapp,
 			/* Harvest R and C */
 			rcbits = be64_to_cpu(hptep[1]) & (HPTE_R_R | HPTE_R_C);
 			*rmapp |= rcbits << KVMPPC_RMAP_RC_SHIFT;
+			if (rcbits & HPTE_R_C)
+				kvmppc_update_rmap_change(rmapp, psize);
 			if (rcbits & ~rev[i].guest_rpte) {
 				rev[i].guest_rpte = ptel | rcbits;
 				note_hpte_modification(kvm, &rev[i]);
@@ -929,8 +931,12 @@ static int kvm_test_clear_dirty_npages(struct kvm *kvm, unsigned long *rmapp)
  retry:
 	lock_rmap(rmapp);
 	if (*rmapp & KVMPPC_RMAP_CHANGED) {
-		*rmapp &= ~KVMPPC_RMAP_CHANGED;
+		long change_order = (*rmapp & KVMPPC_RMAP_CHG_ORDER)
+			>> KVMPPC_RMAP_CHG_SHIFT;
+		*rmapp &= ~(KVMPPC_RMAP_CHANGED | KVMPPC_RMAP_CHG_ORDER);
 		npages_dirty = 1;
+		if (change_order > PAGE_SHIFT)
+			npages_dirty = 1ul << (change_order - PAGE_SHIFT);
 	}
 	if (!(*rmapp & KVMPPC_RMAP_PRESENT)) {
 		unlock_rmap(rmapp);
diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
index 5c1737f..24ccc79 100644
--- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c
+++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
@@ -12,6 +12,7 @@
 #include <linux/kvm_host.h>
 #include <linux/hugetlb.h>
 #include <linux/module.h>
+#include <linux/log2.h>
 
 #include <asm/tlbflush.h>
 #include <asm/kvm_ppc.h>
@@ -94,6 +95,20 @@ void kvmppc_add_revmap_chain(struct kvm *kvm, struct revmap_entry *rev,
 }
 EXPORT_SYMBOL_GPL(kvmppc_add_revmap_chain);
 
+/* Update the changed page order field of an rmap entry */
+void kvmppc_update_rmap_change(unsigned long *rmap, unsigned long psize)
+{
+	unsigned long order;
+
+	if (!psize)
+		return;
+	order = ilog2(psize);
+	order <<= KVMPPC_RMAP_CHG_SHIFT;
+	if (order > (*rmap & KVMPPC_RMAP_CHG_ORDER))
+		*rmap = (*rmap & ~KVMPPC_RMAP_CHG_ORDER) | order;
+}
+EXPORT_SYMBOL_GPL(kvmppc_update_rmap_change);
+
 /* Remove this HPTE from the chain for a real page */
 static void remove_revmap_chain(struct kvm *kvm, long pte_index,
 				struct revmap_entry *rev,
@@ -128,6 +143,8 @@ static void remove_revmap_chain(struct kvm *kvm, long pte_index,
 			*rmap = (*rmap & ~KVMPPC_RMAP_INDEX) | head;
 	}
 	*rmap |= rcbits << KVMPPC_RMAP_RC_SHIFT;
+	if (rcbits & HPTE_R_C)
+		kvmppc_update_rmap_change(rmap, hpte_page_size(hpte_v, hpte_r));
 	unlock_rmap(rmap);
 }
 
-- 
2.1.4



* [PATCH 3/3] KVM: PPC: Book3S HV: Implement H_CLEAR_REF and H_CLEAR_MOD
  2015-04-24  5:39 [PATCH 0/3] PPC HV bug fixes + hcalls for FreeBSD Paul Mackerras
  2015-04-24  5:39 ` [PATCH 1/3] KVM: PPC: Book3S HV: Fix race in reading change bit when removing HPTE Paul Mackerras
  2015-04-24  5:39 ` [PATCH 2/3] KVM: PPC: Book3S HV: Fix bug in dirty page tracking Paul Mackerras
@ 2015-04-24  5:39 ` Paul Mackerras
  2015-07-27 17:31   ` Nathan Whitehorn
  2 siblings, 1 reply; 12+ messages in thread
From: Paul Mackerras @ 2015-04-24  5:39 UTC (permalink / raw)
  To: kvm-ppc, kvm; +Cc: Nathan Whitehorn

This adds implementations for the H_CLEAR_REF (test and clear reference
bit) and H_CLEAR_MOD (test and clear changed bit) hypercalls.

When clearing the reference or change bit in the guest view of the HPTE,
we also have to clear it in the real HPTE so that we can detect future
references or changes.  When we do so, we transfer the R or C bit value
to the rmap entry for the underlying host page so that kvm_age_hva_hv(),
kvm_test_age_hva_hv() and kvmppc_hv_get_dirty_log() know that the page
has been referenced and/or changed.

These hypercalls are not used by Linux guests and these implementations
are only compile tested.

Signed-off-by: Paul Mackerras <paulus@samba.org>
---
 arch/powerpc/kvm/book3s_hv_rm_mmu.c     | 126 ++++++++++++++++++++++++++++++--
 arch/powerpc/kvm/book3s_hv_rmhandlers.S |   4 +-
 2 files changed, 121 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
index 24ccc79..479ff7e 100644
--- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c
+++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
@@ -109,25 +109,38 @@ void kvmppc_update_rmap_change(unsigned long *rmap, unsigned long psize)
 }
 EXPORT_SYMBOL_GPL(kvmppc_update_rmap_change);
 
+/* Returns a pointer to the revmap entry for the page mapped by a HPTE */
+static unsigned long *revmap_for_hpte(struct kvm *kvm, unsigned long hpte_v,
+				      unsigned long hpte_gr)
+{
+	struct kvm_memory_slot *memslot;
+	unsigned long *rmap;
+	unsigned long gfn;
+
+	gfn = hpte_rpn(hpte_gr, hpte_page_size(hpte_v, hpte_gr));
+	memslot = __gfn_to_memslot(kvm_memslots_raw(kvm), gfn);
+	if (!memslot)
+		return NULL;
+
+	rmap = real_vmalloc_addr(&memslot->arch.rmap[gfn - memslot->base_gfn]);
+	return rmap;
+}
+
 /* Remove this HPTE from the chain for a real page */
 static void remove_revmap_chain(struct kvm *kvm, long pte_index,
 				struct revmap_entry *rev,
 				unsigned long hpte_v, unsigned long hpte_r)
 {
 	struct revmap_entry *next, *prev;
-	unsigned long gfn, ptel, head;
-	struct kvm_memory_slot *memslot;
+	unsigned long ptel, head;
 	unsigned long *rmap;
 	unsigned long rcbits;
 
 	rcbits = hpte_r & (HPTE_R_R | HPTE_R_C);
 	ptel = rev->guest_rpte |= rcbits;
-	gfn = hpte_rpn(ptel, hpte_page_size(hpte_v, ptel));
-	memslot = __gfn_to_memslot(kvm_memslots_raw(kvm), gfn);
-	if (!memslot)
+	rmap = revmap_for_hpte(kvm, hpte_v, ptel);
+	if (!rmap)
 		return;
-
-	rmap = real_vmalloc_addr(&memslot->arch.rmap[gfn - memslot->base_gfn]);
 	lock_rmap(rmap);
 
 	head = *rmap & KVMPPC_RMAP_INDEX;
@@ -662,6 +675,105 @@ long kvmppc_h_read(struct kvm_vcpu *vcpu, unsigned long flags,
 	return H_SUCCESS;
 }
 
+long kvmppc_h_clear_ref(struct kvm_vcpu *vcpu, unsigned long flags,
+			unsigned long pte_index)
+{
+	struct kvm *kvm = vcpu->kvm;
+	__be64 *hpte;
+	unsigned long v, r, gr;
+	struct revmap_entry *rev;
+	unsigned long *rmap;
+	long ret = H_NOT_FOUND;
+
+	if (pte_index >= kvm->arch.hpt_npte)
+		return H_PARAMETER;
+
+	rev = real_vmalloc_addr(&kvm->arch.revmap[pte_index]);
+	hpte = (__be64 *)(kvm->arch.hpt_virt + (pte_index << 4));
+	while (!try_lock_hpte(hpte, HPTE_V_HVLOCK))
+		cpu_relax();
+	v = be64_to_cpu(hpte[0]);
+	r = be64_to_cpu(hpte[1]);
+	if (!(v & (HPTE_V_VALID | HPTE_V_ABSENT)))
+		goto out;
+
+	gr = rev->guest_rpte;
+	if (rev->guest_rpte & HPTE_R_R) {
+		rev->guest_rpte &= ~HPTE_R_R;
+		note_hpte_modification(kvm, rev);
+	}
+	if (v & HPTE_V_VALID) {
+		gr |= r & (HPTE_R_R | HPTE_R_C);
+		if (r & HPTE_R_R) {
+			kvmppc_clear_ref_hpte(kvm, hpte, pte_index);
+			rmap = revmap_for_hpte(kvm, v, gr);
+			if (rmap) {
+				lock_rmap(rmap);
+				*rmap |= KVMPPC_RMAP_REFERENCED;
+				unlock_rmap(rmap);
+			}
+		}
+	}
+	vcpu->arch.gpr[4] = gr;
+	ret = H_SUCCESS;
+ out:
+	unlock_hpte(hpte, v & ~HPTE_V_HVLOCK);
+	return ret;
+}
+
+long kvmppc_h_clear_mod(struct kvm_vcpu *vcpu, unsigned long flags,
+			unsigned long pte_index)
+{
+	struct kvm *kvm = vcpu->kvm;
+	__be64 *hpte;
+	unsigned long v, r, gr;
+	struct revmap_entry *rev;
+	unsigned long *rmap;
+	long ret = H_NOT_FOUND;
+
+	if (pte_index >= kvm->arch.hpt_npte)
+		return H_PARAMETER;
+
+	rev = real_vmalloc_addr(&kvm->arch.revmap[pte_index]);
+	hpte = (__be64 *)(kvm->arch.hpt_virt + (pte_index << 4));
+	while (!try_lock_hpte(hpte, HPTE_V_HVLOCK))
+		cpu_relax();
+	v = be64_to_cpu(hpte[0]);
+	r = be64_to_cpu(hpte[1]);
+	if (!(v & (HPTE_V_VALID | HPTE_V_ABSENT)))
+		goto out;
+
+	gr = rev->guest_rpte;
+	if (gr & HPTE_R_C) {
+		rev->guest_rpte &= ~HPTE_R_C;
+		note_hpte_modification(kvm, rev);
+	}
+	if (v & HPTE_V_VALID) {
+		/* need to make it temporarily absent so C is stable */
+		hpte[0] |= cpu_to_be64(HPTE_V_ABSENT);
+		kvmppc_invalidate_hpte(kvm, hpte, pte_index);
+		r = be64_to_cpu(hpte[1]);
+		gr |= r & (HPTE_R_R | HPTE_R_C);
+		if (r & HPTE_R_C) {
+			unsigned long psize = hpte_page_size(v, r);
+			hpte[1] = cpu_to_be64(r & ~HPTE_R_C);
+			eieio();
+			rmap = revmap_for_hpte(kvm, v, gr);
+			if (rmap) {
+				lock_rmap(rmap);
+				*rmap |= KVMPPC_RMAP_CHANGED;
+				kvmppc_update_rmap_change(rmap, psize);
+				unlock_rmap(rmap);
+			}
+		}
+	}
+	vcpu->arch.gpr[4] = gr;
+	ret = H_SUCCESS;
+ out:
+	unlock_hpte(hpte, v & ~HPTE_V_HVLOCK);
+	return ret;
+}
+
 void kvmppc_invalidate_hpte(struct kvm *kvm, __be64 *hptep,
 			unsigned long pte_index)
 {
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 4d70df2..52752a3 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -1816,8 +1816,8 @@ hcall_real_table:
 	.long	DOTSYM(kvmppc_h_remove) - hcall_real_table
 	.long	DOTSYM(kvmppc_h_enter) - hcall_real_table
 	.long	DOTSYM(kvmppc_h_read) - hcall_real_table
-	.long	0		/* 0x10 - H_CLEAR_MOD */
-	.long	0		/* 0x14 - H_CLEAR_REF */
+	.long	DOTSYM(kvmppc_h_clear_mod) - hcall_real_table
+	.long	DOTSYM(kvmppc_h_clear_ref) - hcall_real_table
 	.long	DOTSYM(kvmppc_h_protect) - hcall_real_table
 	.long	DOTSYM(kvmppc_h_get_tce) - hcall_real_table
 	.long	DOTSYM(kvmppc_h_put_tce) - hcall_real_table
-- 
2.1.4


* Re: [PATCH 1/3] KVM: PPC: Book3S HV: Fix race in reading change bit when removing HPTE
  2015-04-24  5:39 ` [PATCH 1/3] KVM: PPC: Book3S HV: Fix race in reading change bit when removing HPTE Paul Mackerras
@ 2015-04-28  5:06   ` Aneesh Kumar K.V
  2015-04-28  9:10     ` Paul Mackerras
  0 siblings, 1 reply; 12+ messages in thread
From: Aneesh Kumar K.V @ 2015-04-28  5:06 UTC (permalink / raw)
  To: Paul Mackerras, kvm-ppc, kvm; +Cc: Nathan Whitehorn

Paul Mackerras <paulus@samba.org> writes:

> The reference (R) and change (C) bits in a HPT entry can be set by
> hardware at any time up until the HPTE is invalidated and the TLB
> invalidation sequence has completed.  This means that when removing
> a HPTE, we need to read the HPTE after the invalidation sequence has
> completed in order to obtain reliable values of R and C.  The code
> in kvmppc_do_h_remove() used to do this.  However, commit 6f22bd3265fb
> ("KVM: PPC: Book3S HV: Make HTAB code LE host aware") removed the
> read after invalidation as a side effect of other changes.  This
> restores the read of the HPTE after invalidation.
>
> The user-visible effect of this bug would be that when migrating a
> guest, there is a small probability that a page modified by the guest
> and then unmapped by the guest might not get re-transmitted and thus
> the destination might end up with a stale copy of the page.
>
> Fixes: 6f22bd3265fb
> Cc: stable@vger.kernel.org # v3.17+
> Signed-off-by: Paul Mackerras <paulus@samba.org>
> ---
>  arch/powerpc/kvm/book3s_hv_rm_mmu.c | 8 +++-----
>  1 file changed, 3 insertions(+), 5 deletions(-)
>
> diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
> index f6bf0b1..5c1737f 100644
> --- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c
> +++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
> @@ -413,14 +413,12 @@ long kvmppc_do_h_remove(struct kvm *kvm, unsigned long flags,
>  	rev = real_vmalloc_addr(&kvm->arch.revmap[pte_index]);
>  	v = pte & ~HPTE_V_HVLOCK;
>  	if (v & HPTE_V_VALID) {
> -		u64 pte1;
> -
> -		pte1 = be64_to_cpu(hpte[1]);
>  		hpte[0] &= ~cpu_to_be64(HPTE_V_VALID);
> -		rb = compute_tlbie_rb(v, pte1, pte_index);
> +		rb = compute_tlbie_rb(v, be64_to_cpu(hpte[1]), pte_index);
>  		do_tlbies(kvm, &rb, 1, global_invalidates(kvm, flags), true);
>  		/* Read PTE low word after tlbie to get final R/C values */
> -		remove_revmap_chain(kvm, pte_index, rev, v, pte1);
> +		remove_revmap_chain(kvm, pte_index, rev, v,
> +				    be64_to_cpu(hpte[1]));
>  	}

Maybe add the above commit message as a code comment?

>  	r = rev->guest_rpte & ~HPTE_GR_RESERVED;
>  	note_hpte_modification(kvm, rev);


* Re: [PATCH 1/3] KVM: PPC: Book3S HV: Fix race in reading change bit when removing HPTE
  2015-04-28  5:06   ` Aneesh Kumar K.V
@ 2015-04-28  9:10     ` Paul Mackerras
  0 siblings, 0 replies; 12+ messages in thread
From: Paul Mackerras @ 2015-04-28  9:10 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: kvm-ppc, kvm, Nathan Whitehorn

On Tue, Apr 28, 2015 at 10:36:52AM +0530, Aneesh Kumar K.V wrote:
> Paul Mackerras <paulus@samba.org> writes:
> 
> > The reference (R) and change (C) bits in a HPT entry can be set by
> > hardware at any time up until the HPTE is invalidated and the TLB
> > invalidation sequence has completed.  This means that when removing
> > a HPTE, we need to read the HPTE after the invalidation sequence has
> > completed in order to obtain reliable values of R and C.  The code
> > in kvmppc_do_h_remove() used to do this.  However, commit 6f22bd3265fb
> > ("KVM: PPC: Book3S HV: Make HTAB code LE host aware") removed the
> > read after invalidation as a side effect of other changes.  This
> > restores the read of the HPTE after invalidation.
> >
> > The user-visible effect of this bug would be that when migrating a
> > guest, there is a small probability that a page modified by the guest
> > and then unmapped by the guest might not get re-transmitted and thus
> > the destination might end up with a stale copy of the page.
> >
> > Fixes: 6f22bd3265fb
> > Cc: stable@vger.kernel.org # v3.17+
> > Signed-off-by: Paul Mackerras <paulus@samba.org>
> > ---
> >  arch/powerpc/kvm/book3s_hv_rm_mmu.c | 8 +++-----
> >  1 file changed, 3 insertions(+), 5 deletions(-)
> >
> > diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
> > index f6bf0b1..5c1737f 100644
> > --- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c
> > +++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
> > @@ -413,14 +413,12 @@ long kvmppc_do_h_remove(struct kvm *kvm, unsigned long flags,
> >  	rev = real_vmalloc_addr(&kvm->arch.revmap[pte_index]);
> >  	v = pte & ~HPTE_V_HVLOCK;
> >  	if (v & HPTE_V_VALID) {
> > -		u64 pte1;
> > -
> > -		pte1 = be64_to_cpu(hpte[1]);
> >  		hpte[0] &= ~cpu_to_be64(HPTE_V_VALID);
> > -		rb = compute_tlbie_rb(v, pte1, pte_index);
> > +		rb = compute_tlbie_rb(v, be64_to_cpu(hpte[1]), pte_index);
> >  		do_tlbies(kvm, &rb, 1, global_invalidates(kvm, flags), true);
> >  		/* Read PTE low word after tlbie to get final R/C values */
> > -		remove_revmap_chain(kvm, pte_index, rev, v, pte1);
> > +		remove_revmap_chain(kvm, pte_index, rev, v,
> > +				    be64_to_cpu(hpte[1]));
> >  	}
> 
> May be add the above commit message as a code comment ?

Well, that's what "/* Read PTE low word after tlbie to get final R/C
values */" was trying to be, originally, but maybe it would be helpful
to expand on it.

Paul.


* Re: [PATCH 3/3] KVM: PPC: Book3S HV: Implement H_CLEAR_REF and H_CLEAR_MOD
  2015-04-24  5:39 ` [PATCH 3/3] KVM: PPC: Book3S HV: Implement H_CLEAR_REF and H_CLEAR_MOD Paul Mackerras
@ 2015-07-27 17:31   ` Nathan Whitehorn
  2015-09-06 19:47     ` Nathan Whitehorn
  0 siblings, 1 reply; 12+ messages in thread
From: Nathan Whitehorn @ 2015-07-27 17:31 UTC (permalink / raw)
  To: Paul Mackerras, kvm-ppc, kvm

I've been running with these patches and a FreeBSD guest for a while now 
and they work very well, providing big performance improvements in tight 
memory situations. Thanks! I did have to patch QEMU to enable the 
hypercalls, however -- it might be worth adding them to the default set 
since these calls are mandated by PAPR as part of the base hypercall 
interface.
-Nathan

On 04/23/15 22:39, Paul Mackerras wrote:
> This adds implementations for the H_CLEAR_REF (test and clear reference
> bit) and H_CLEAR_MOD (test and clear changed bit) hypercalls.
>
> When clearing the reference or change bit in the guest view of the HPTE,
> we also have to clear it in the real HPTE so that we can detect future
> references or changes.  When we do so, we transfer the R or C bit value
> to the rmap entry for the underlying host page so that kvm_age_hva_hv(),
> kvm_test_age_hva_hv() and kvmppc_hv_get_dirty_log() know that the page
> has been referenced and/or changed.
>
> These hypercalls are not used by Linux guests and these implementations
> are only compile tested.
>
> Signed-off-by: Paul Mackerras <paulus@samba.org>
> [full patch quoted; see above]


* Re: [PATCH 3/3] KVM: PPC: Book3S HV: Implement H_CLEAR_REF and H_CLEAR_MOD
  2015-07-27 17:31   ` Nathan Whitehorn
@ 2015-09-06 19:47     ` Nathan Whitehorn
  2015-09-06 23:52       ` Paul Mackerras
  0 siblings, 1 reply; 12+ messages in thread
From: Nathan Whitehorn @ 2015-09-06 19:47 UTC (permalink / raw)
  To: Paul Mackerras, kvm-ppc, kvm

Anything I can do to help move these along? It's a big performance 
improvement for FreeBSD guests.
-Nathan

On 07/27/15 10:31, Nathan Whitehorn wrote:
> I've been running with these patches and a FreeBSD guest for a while 
> now and they work very well, providing big performance improvements in 
> tight memory situations. Thanks! I did have to patch QEMU to enable 
> the hypercalls, however -- it might be worth adding them to the 
> default set since these calls are mandated by PAPR as part of the base 
> hypercall interface.
> -Nathan
>
> On 04/23/15 22:39, Paul Mackerras wrote:
>> This adds implementations for the H_CLEAR_REF (test and clear reference
>> bit) and H_CLEAR_MOD (test and clear changed bit) hypercalls.
>>
>> When clearing the reference or change bit in the guest view of the HPTE,
>> we also have to clear it in the real HPTE so that we can detect future
>> references or changes.  When we do so, we transfer the R or C bit value
>> to the rmap entry for the underlying host page so that kvm_age_hva_hv(),
>> kvm_test_age_hva_hv() and kvmppc_hv_get_dirty_log() know that the page
>> has been referenced and/or changed.
>>
>> These hypercalls are not used by Linux guests and these implementations
>> are only compile tested.
>>
>> Signed-off-by: Paul Mackerras <paulus@samba.org>
>> ---
>>   arch/powerpc/kvm/book3s_hv_rm_mmu.c     | 126 
>> ++++++++++++++++++++++++++++++--
>>   arch/powerpc/kvm/book3s_hv_rmhandlers.S |   4 +-
>>   2 files changed, 121 insertions(+), 9 deletions(-)
>>
>> diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
>> index 24ccc79..479ff7e 100644
>> --- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c
>> +++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
>> @@ -109,25 +109,38 @@ void kvmppc_update_rmap_change(unsigned long *rmap, unsigned long psize)
>>   }
>>   EXPORT_SYMBOL_GPL(kvmppc_update_rmap_change);
>>
>> +/* Returns a pointer to the revmap entry for the page mapped by a HPTE */
>> +static unsigned long *revmap_for_hpte(struct kvm *kvm, unsigned long hpte_v,
>> +                      unsigned long hpte_gr)
>> +{
>> +    struct kvm_memory_slot *memslot;
>> +    unsigned long *rmap;
>> +    unsigned long gfn;
>> +
>> +    gfn = hpte_rpn(hpte_gr, hpte_page_size(hpte_v, hpte_gr));
>> +    memslot = __gfn_to_memslot(kvm_memslots_raw(kvm), gfn);
>> +    if (!memslot)
>> +        return NULL;
>> +
>> +    rmap = real_vmalloc_addr(&memslot->arch.rmap[gfn - memslot->base_gfn]);
>> +    return rmap;
>> +}
>> +
>>   /* Remove this HPTE from the chain for a real page */
>>   static void remove_revmap_chain(struct kvm *kvm, long pte_index,
>>                   struct revmap_entry *rev,
>>                   unsigned long hpte_v, unsigned long hpte_r)
>>   {
>>       struct revmap_entry *next, *prev;
>> -    unsigned long gfn, ptel, head;
>> -    struct kvm_memory_slot *memslot;
>> +    unsigned long ptel, head;
>>       unsigned long *rmap;
>>       unsigned long rcbits;
>>
>>       rcbits = hpte_r & (HPTE_R_R | HPTE_R_C);
>>       ptel = rev->guest_rpte |= rcbits;
>> -    gfn = hpte_rpn(ptel, hpte_page_size(hpte_v, ptel));
>> -    memslot = __gfn_to_memslot(kvm_memslots_raw(kvm), gfn);
>> -    if (!memslot)
>> +    rmap = revmap_for_hpte(kvm, hpte_v, ptel);
>> +    if (!rmap)
>>           return;
>> -
>> -    rmap = real_vmalloc_addr(&memslot->arch.rmap[gfn - memslot->base_gfn]);
>>       lock_rmap(rmap);
>>
>>       head = *rmap & KVMPPC_RMAP_INDEX;
>> @@ -662,6 +675,105 @@ long kvmppc_h_read(struct kvm_vcpu *vcpu, unsigned long flags,
>>       return H_SUCCESS;
>>   }
>>
>> +long kvmppc_h_clear_ref(struct kvm_vcpu *vcpu, unsigned long flags,
>> +            unsigned long pte_index)
>> +{
>> +    struct kvm *kvm = vcpu->kvm;
>> +    __be64 *hpte;
>> +    unsigned long v, r, gr;
>> +    struct revmap_entry *rev;
>> +    unsigned long *rmap;
>> +    long ret = H_NOT_FOUND;
>> +
>> +    if (pte_index >= kvm->arch.hpt_npte)
>> +        return H_PARAMETER;
>> +
>> +    rev = real_vmalloc_addr(&kvm->arch.revmap[pte_index]);
>> +    hpte = (__be64 *)(kvm->arch.hpt_virt + (pte_index << 4));
>> +    while (!try_lock_hpte(hpte, HPTE_V_HVLOCK))
>> +        cpu_relax();
>> +    v = be64_to_cpu(hpte[0]);
>> +    r = be64_to_cpu(hpte[1]);
>> +    if (!(v & (HPTE_V_VALID | HPTE_V_ABSENT)))
>> +        goto out;
>> +
>> +    gr = rev->guest_rpte;
>> +    if (rev->guest_rpte & HPTE_R_R) {
>> +        rev->guest_rpte &= ~HPTE_R_R;
>> +        note_hpte_modification(kvm, rev);
>> +    }
>> +    if (v & HPTE_V_VALID) {
>> +        gr |= r & (HPTE_R_R | HPTE_R_C);
>> +        if (r & HPTE_R_R) {
>> +            kvmppc_clear_ref_hpte(kvm, hpte, pte_index);
>> +            rmap = revmap_for_hpte(kvm, v, gr);
>> +            if (rmap) {
>> +                lock_rmap(rmap);
>> +                *rmap |= KVMPPC_RMAP_REFERENCED;
>> +                unlock_rmap(rmap);
>> +            }
>> +        }
>> +    }
>> +    vcpu->arch.gpr[4] = gr;
>> +    ret = H_SUCCESS;
>> + out:
>> +    unlock_hpte(hpte, v & ~HPTE_V_HVLOCK);
>> +    return ret;
>> +}
>> +
>> +long kvmppc_h_clear_mod(struct kvm_vcpu *vcpu, unsigned long flags,
>> +            unsigned long pte_index)
>> +{
>> +    struct kvm *kvm = vcpu->kvm;
>> +    __be64 *hpte;
>> +    unsigned long v, r, gr;
>> +    struct revmap_entry *rev;
>> +    unsigned long *rmap;
>> +    long ret = H_NOT_FOUND;
>> +
>> +    if (pte_index >= kvm->arch.hpt_npte)
>> +        return H_PARAMETER;
>> +
>> +    rev = real_vmalloc_addr(&kvm->arch.revmap[pte_index]);
>> +    hpte = (__be64 *)(kvm->arch.hpt_virt + (pte_index << 4));
>> +    while (!try_lock_hpte(hpte, HPTE_V_HVLOCK))
>> +        cpu_relax();
>> +    v = be64_to_cpu(hpte[0]);
>> +    r = be64_to_cpu(hpte[1]);
>> +    if (!(v & (HPTE_V_VALID | HPTE_V_ABSENT)))
>> +        goto out;
>> +
>> +    gr = rev->guest_rpte;
>> +    if (gr & HPTE_R_C) {
>> +        rev->guest_rpte &= ~HPTE_R_C;
>> +        note_hpte_modification(kvm, rev);
>> +    }
>> +    if (v & HPTE_V_VALID) {
>> +        /* need to make it temporarily absent so C is stable */
>> +        hpte[0] |= cpu_to_be64(HPTE_V_ABSENT);
>> +        kvmppc_invalidate_hpte(kvm, hpte, pte_index);
>> +        r = be64_to_cpu(hpte[1]);
>> +        gr |= r & (HPTE_R_R | HPTE_R_C);
>> +        if (r & HPTE_R_C) {
>> +            unsigned long psize = hpte_page_size(v, r);
>> +            hpte[1] = cpu_to_be64(r & ~HPTE_R_C);
>> +            eieio();
>> +            rmap = revmap_for_hpte(kvm, v, gr);
>> +            if (rmap) {
>> +                lock_rmap(rmap);
>> +                *rmap |= KVMPPC_RMAP_CHANGED;
>> +                kvmppc_update_rmap_change(rmap, psize);
>> +                unlock_rmap(rmap);
>> +            }
>> +        }
>> +    }
>> +    vcpu->arch.gpr[4] = gr;
>> +    ret = H_SUCCESS;
>> + out:
>> +    unlock_hpte(hpte, v & ~HPTE_V_HVLOCK);
>> +    return ret;
>> +}
>> +
>>   void kvmppc_invalidate_hpte(struct kvm *kvm, __be64 *hptep,
>>               unsigned long pte_index)
>>   {
>> diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
>> index 4d70df2..52752a3 100644
>> --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
>> +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
>> @@ -1816,8 +1816,8 @@ hcall_real_table:
>>       .long    DOTSYM(kvmppc_h_remove) - hcall_real_table
>>       .long    DOTSYM(kvmppc_h_enter) - hcall_real_table
>>       .long    DOTSYM(kvmppc_h_read) - hcall_real_table
>> -    .long    0        /* 0x10 - H_CLEAR_MOD */
>> -    .long    0        /* 0x14 - H_CLEAR_REF */
>> +    .long    DOTSYM(kvmppc_h_clear_mod) - hcall_real_table
>> +    .long    DOTSYM(kvmppc_h_clear_ref) - hcall_real_table
>>       .long    DOTSYM(kvmppc_h_protect) - hcall_real_table
>>       .long    DOTSYM(kvmppc_h_get_tce) - hcall_real_table
>>       .long    DOTSYM(kvmppc_h_put_tce) - hcall_real_table
>



* Re: [PATCH 3/3] KVM: PPC: Book3S HV: Implement H_CLEAR_REF and H_CLEAR_MOD
  2015-09-06 19:47     ` Nathan Whitehorn
@ 2015-09-06 23:52       ` Paul Mackerras
  2015-09-07  0:46         ` Nathan Whitehorn
  2015-09-12 16:47         ` Nathan Whitehorn
  0 siblings, 2 replies; 12+ messages in thread
From: Paul Mackerras @ 2015-09-06 23:52 UTC (permalink / raw)
  To: Nathan Whitehorn; +Cc: kvm-ppc, kvm

On Sun, Sep 06, 2015 at 12:47:12PM -0700, Nathan Whitehorn wrote:
> Anything I can do to help move these along? It's a big performance
> improvement for FreeBSD guests.

These patches are in Paolo's kvm-ppc-next branch and should go into
Linus' tree in the next couple of days.

Paul.


* Re: [PATCH 3/3] KVM: PPC: Book3S HV: Implement H_CLEAR_REF and H_CLEAR_MOD
  2015-09-06 23:52       ` Paul Mackerras
@ 2015-09-07  0:46         ` Nathan Whitehorn
  2015-09-12 16:47         ` Nathan Whitehorn
  1 sibling, 0 replies; 12+ messages in thread
From: Nathan Whitehorn @ 2015-09-07  0:46 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: kvm-ppc, kvm

Fantastic, thanks!
-Nathan

On 09/06/15 16:52, Paul Mackerras wrote:
> On Sun, Sep 06, 2015 at 12:47:12PM -0700, Nathan Whitehorn wrote:
>> Anything I can do to help move these along? It's a big performance
>> improvement for FreeBSD guests.
> These patches are in Paolo's kvm-ppc-next branch and should go into
> Linus' tree in the next couple of days.
>
> Paul.
>



* Re: [PATCH 3/3] KVM: PPC: Book3S HV: Implement H_CLEAR_REF and H_CLEAR_MOD
  2015-09-06 23:52       ` Paul Mackerras
  2015-09-07  0:46         ` Nathan Whitehorn
@ 2015-09-12 16:47         ` Nathan Whitehorn
  2015-09-12 16:51           ` Alexander Graf
  1 sibling, 1 reply; 12+ messages in thread
From: Nathan Whitehorn @ 2015-09-12 16:47 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: kvm-ppc, kvm

On 09/06/15 16:52, Paul Mackerras wrote:
> On Sun, Sep 06, 2015 at 12:47:12PM -0700, Nathan Whitehorn wrote:
>> Anything I can do to help move these along? It's a big performance
>> improvement for FreeBSD guests.
> These patches are in Paolo's kvm-ppc-next branch and should go into
> Linus' tree in the next couple of days.
>
> Paul.
>

One additional question: what is your preferred way to enable these? 
Since they belong to the mandatory part of the PAPR spec, I think 
there's an argument for adding them to the default_hcall_list. Otherwise, 
they should be enabled by default in QEMU (I can take care of sending 
that patch if you prefer this route).
-Nathan



* Re: [PATCH 3/3] KVM: PPC: Book3S HV: Implement H_CLEAR_REF and H_CLEAR_MOD
  2015-09-12 16:47         ` Nathan Whitehorn
@ 2015-09-12 16:51           ` Alexander Graf
  0 siblings, 0 replies; 12+ messages in thread
From: Alexander Graf @ 2015-09-12 16:51 UTC (permalink / raw)
  To: Nathan Whitehorn; +Cc: Paul Mackerras, kvm-ppc, kvm



> Am 12.09.2015 um 18:47 schrieb Nathan Whitehorn <nwhitehorn@freebsd.org>:
> 
>> On 09/06/15 16:52, Paul Mackerras wrote:
>>> On Sun, Sep 06, 2015 at 12:47:12PM -0700, Nathan Whitehorn wrote:
>>> Anything I can do to help move these along? It's a big performance
>>> improvement for FreeBSD guests.
>> These patches are in Paolo's kvm-ppc-next branch and should go into
>> Linus' tree in the next couple of days.
>> 
>> Paul.
> 
> One additional question. What is your preferred way to enable these? Since these are part of the mandatory part of the PAPR spec, I think there's an argument to add them to the default_hcall_list? Otherwise, they should be enabled by default in QEMU (I can take care of sending that patch if you prefer this route).

The default hcall list just describes which hcalls were implicitly enabled at the point in time we made them enableable by user space. IMHO no new hcalls should get added there.

So yes, please send a patch to qemu :).


Alex



end of thread, other threads:[~2015-09-12 16:51 UTC | newest]

Thread overview: 12+ messages
2015-04-24  5:39 [PATCH 0/3] PPC HV bug fixes + hcalls for FreeBSD Paul Mackerras
2015-04-24  5:39 ` [PATCH 1/3] KVM: PPC: Book3S HV: Fix race in reading change bit when removing HPTE Paul Mackerras
2015-04-28  5:06   ` Aneesh Kumar K.V
2015-04-28  9:10     ` Paul Mackerras
2015-04-24  5:39 ` [PATCH 2/3] KVM: PPC: Book3S HV: Fix bug in dirty page tracking Paul Mackerras
2015-04-24  5:39 ` [PATCH 3/3] KVM: PPC: Book3S HV: Implement H_CLEAR_REF and H_CLEAR_MOD Paul Mackerras
2015-07-27 17:31   ` Nathan Whitehorn
2015-09-06 19:47     ` Nathan Whitehorn
2015-09-06 23:52       ` Paul Mackerras
2015-09-07  0:46         ` Nathan Whitehorn
2015-09-12 16:47         ` Nathan Whitehorn
2015-09-12 16:51           ` Alexander Graf
