* [PATCH 0/4] KVM: PPC: Book3S HV: HPT read/write functions for userspace
From: Paul Mackerras @ 2012-11-14  4:30 UTC
  To: Alexander Graf; +Cc: kvm, kvm-ppc

This series of patches provides an interface by which userspace can
read and write the hashed page table (HPT) of a Book3S HV guest.
The interface is an ioctl that returns a file descriptor, which can
then be accessed with the read() and write() system calls.  The data
read and written is the guest view of the HPT, in which the second
doubleword of each HPTE (HPT entry) contains a guest physical address,
as distinct from the real HPT that the hardware accesses, where the
second doubleword of each HPTE contains a real address.

Because the HPT is divided into groups (HPTEGs) of 8 entries each,
and each HPTEG usually contains only a few valid entries, or none,
the data format run-length encodes the invalid entries, so the
invalid entries take up no space in the stream.

The interface also provides for doing multiple passes over the HPT:
the first pass returns information on all HPTEs, and subsequent
passes return only the HPTEs that have changed since the previous pass.
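
As an illustration only (not part of this series), userspace might
drive these passes roughly as follows; htab_fd is assumed to have been
obtained with the KVM_PPC_GET_HTAB_FD ioctl added in patch 4:

#include <unistd.h>

/* Sketch: read HPT records until a pass reports no further changes.
 * The first read() pass supplies all interesting HPTEs; subsequent
 * passes return only HPTEs changed since they were last read, and a
 * return of 0 bytes means nothing has changed. */
static long drain_htab(int htab_fd, char *buf, size_t len)
{
	ssize_t nb;
	long total = 0;

	while ((nb = read(htab_fd, buf, len)) > 0) {
		/* a real consumer would parse or transmit buf here */
		total += nb;
	}
	return nb < 0 ? -1 : total;
}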

I have implemented a read/write interface rather than an mmap-based
interface because the data is not stored contiguously anywhere in
kernel memory.  Of each 16-byte HPTE, the first 8 bytes come from the
real HPT and the second 8 bytes come from the parallel vmalloc'd array
where we store the guest view of the guest physical address,
permissions, accessed/dirty bits etc.  Thus an mmap-based interface
would not be practical (not without doubling the size of the
parallel array, typically requiring an extra 8MB of kernel memory per
guest).  This is also why I have not used the memslot interface for
this.

This implements the interface for HV-style KVM but not for PR-style
KVM.  Userspace does not need any additional interface with PR-style
KVM because userspace maintains the guest HPT already in that case,
and has an image of the guest view of the HPT in its address space.

This series is against the next branch of the kvm tree.  The patches
are basically identical to the previous posting of the series, just
rediffed for the move of kvm.h from include/linux to
include/uapi/linux, and for commit 8ca40a70a7 ("KVM: Take kvm instead
of vcpu to mmu_notifier_retry"), which supersedes patch 1 of the old
series.

The overall diffstat is:

 Documentation/virtual/kvm/api.txt        |   53 +++++
 arch/powerpc/include/asm/kvm_book3s.h    |    8 +-
 arch/powerpc/include/asm/kvm_book3s_64.h |   24 ++
 arch/powerpc/include/asm/kvm_host.h      |    1 +
 arch/powerpc/include/asm/kvm_ppc.h       |    2 +
 arch/powerpc/include/uapi/asm/kvm.h      |   24 ++
 arch/powerpc/kvm/book3s_64_mmu_hv.c      |  380 +++++++++++++++++++++++++++++-
 arch/powerpc/kvm/book3s_hv.c             |   12 -
 arch/powerpc/kvm/book3s_hv_rm_mmu.c      |   71 ++++--
 arch/powerpc/kvm/powerpc.c               |   17 ++
 include/uapi/linux/kvm.h                 |    3 +
 11 files changed, 551 insertions(+), 44 deletions(-)

Please apply.

Paul.

* [PATCH 1/4] KVM: PPC: Book3S HV: Restructure HPT entry creation code
From: Paul Mackerras @ 2012-11-14  4:31 UTC
  To: Alexander Graf; +Cc: kvm, kvm-ppc

This restructures the code that creates HPT (hashed page table)
entries so that it can be called in situations where we don't have a
struct vcpu pointer, only a struct kvm pointer.  It also fixes a bug
where kvmppc_map_vrma() would corrupt the guest R4 value.

Most of the work of kvmppc_virtmode_h_enter is now done by a new
function, kvmppc_virtmode_do_h_enter, which itself calls another new
function, kvmppc_do_h_enter, which contains most of the old
kvmppc_h_enter.  The new kvmppc_do_h_enter takes explicit arguments
for the place to return the HPTE index, the Linux page tables to use,
and whether it is being called in real mode, thus removing the need
for it to have the vcpu as an argument.

Currently kvmppc_map_vrma creates the VRMA (virtual real mode area)
HPTEs by calling kvmppc_virtmode_h_enter, which is designed primarily
to handle H_ENTER hcalls from the guest that need to pin a page of
memory.  Since H_ENTER returns the index of the created HPTE in R4,
kvmppc_virtmode_h_enter updates the guest R4, corrupting it when
called from kvmppc_map_vrma on the first KVM_RUN ioctl.  With this
change, kvmppc_map_vrma instead calls kvmppc_virtmode_do_h_enter with
the address of a dummy word as the place to store the HPTE index,
thus avoiding corrupting the guest R4.

Signed-off-by: Paul Mackerras <paulus@samba.org>
---
 arch/powerpc/include/asm/kvm_book3s.h |    5 +++--
 arch/powerpc/kvm/book3s_64_mmu_hv.c   |   36 +++++++++++++++++++++++----------
 arch/powerpc/kvm/book3s_hv_rm_mmu.c   |   27 ++++++++++++++++---------
 3 files changed, 45 insertions(+), 23 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
index 36fcf41..fea768f 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -157,8 +157,9 @@ extern void *kvmppc_pin_guest_page(struct kvm *kvm, unsigned long addr,
 extern void kvmppc_unpin_guest_page(struct kvm *kvm, void *addr);
 extern long kvmppc_virtmode_h_enter(struct kvm_vcpu *vcpu, unsigned long flags,
 			long pte_index, unsigned long pteh, unsigned long ptel);
-extern long kvmppc_h_enter(struct kvm_vcpu *vcpu, unsigned long flags,
-			long pte_index, unsigned long pteh, unsigned long ptel);
+extern long kvmppc_do_h_enter(struct kvm *kvm, unsigned long flags,
+			long pte_index, unsigned long pteh, unsigned long ptel,
+			pgd_t *pgdir, bool realmode, unsigned long *idx_ret);
 extern long kvmppc_hv_get_dirty_log(struct kvm *kvm,
 			struct kvm_memory_slot *memslot, unsigned long *map);
 
diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index 2a89a36..6ee6516 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -41,6 +41,10 @@
 /* Power architecture requires HPT is at least 256kB */
 #define PPC_MIN_HPT_ORDER	18
 
+static long kvmppc_virtmode_do_h_enter(struct kvm *kvm, unsigned long flags,
+				long pte_index, unsigned long pteh,
+				unsigned long ptel, unsigned long *pte_idx_ret);
+
 long kvmppc_alloc_hpt(struct kvm *kvm, u32 *htab_orderp)
 {
 	unsigned long hpt;
@@ -185,6 +189,7 @@ void kvmppc_map_vrma(struct kvm_vcpu *vcpu, struct kvm_memory_slot *memslot,
 	unsigned long addr, hash;
 	unsigned long psize;
 	unsigned long hp0, hp1;
+	unsigned long idx_ret;
 	long ret;
 	struct kvm *kvm = vcpu->kvm;
 
@@ -216,7 +221,8 @@ void kvmppc_map_vrma(struct kvm_vcpu *vcpu, struct kvm_memory_slot *memslot,
 		hash = (hash << 3) + 7;
 		hp_v = hp0 | ((addr >> 16) & ~0x7fUL);
 		hp_r = hp1 | addr;
-		ret = kvmppc_virtmode_h_enter(vcpu, H_EXACT, hash, hp_v, hp_r);
+		ret = kvmppc_virtmode_do_h_enter(kvm, H_EXACT, hash, hp_v, hp_r,
+						 &idx_ret);
 		if (ret != H_SUCCESS) {
 			pr_err("KVM: map_vrma at %lx failed, ret=%ld\n",
 			       addr, ret);
@@ -354,15 +360,10 @@ static long kvmppc_get_guest_page(struct kvm *kvm, unsigned long gfn,
 	return err;
 }
 
-/*
- * We come here on a H_ENTER call from the guest when we are not
- * using mmu notifiers and we don't have the requested page pinned
- * already.
- */
-long kvmppc_virtmode_h_enter(struct kvm_vcpu *vcpu, unsigned long flags,
-			long pte_index, unsigned long pteh, unsigned long ptel)
+long kvmppc_virtmode_do_h_enter(struct kvm *kvm, unsigned long flags,
+				long pte_index, unsigned long pteh,
+				unsigned long ptel, unsigned long *pte_idx_ret)
 {
-	struct kvm *kvm = vcpu->kvm;
 	unsigned long psize, gpa, gfn;
 	struct kvm_memory_slot *memslot;
 	long ret;
@@ -390,8 +391,8 @@ long kvmppc_virtmode_h_enter(struct kvm_vcpu *vcpu, unsigned long flags,
  do_insert:
 	/* Protect linux PTE lookup from page table destruction */
 	rcu_read_lock_sched();	/* this disables preemption too */
-	vcpu->arch.pgdir = current->mm->pgd;
-	ret = kvmppc_h_enter(vcpu, flags, pte_index, pteh, ptel);
+	ret = kvmppc_do_h_enter(kvm, flags, pte_index, pteh, ptel,
+				current->mm->pgd, false, pte_idx_ret);
 	rcu_read_unlock_sched();
 	if (ret == H_TOO_HARD) {
 		/* this can't happen */
@@ -402,6 +403,19 @@ long kvmppc_virtmode_h_enter(struct kvm_vcpu *vcpu, unsigned long flags,
 
 }
 
+/*
+ * We come here on a H_ENTER call from the guest when we are not
+ * using mmu notifiers and we don't have the requested page pinned
+ * already.
+ */
+long kvmppc_virtmode_h_enter(struct kvm_vcpu *vcpu, unsigned long flags,
+			     long pte_index, unsigned long pteh,
+			     unsigned long ptel)
+{
+	return kvmppc_virtmode_do_h_enter(vcpu->kvm, flags, pte_index,
+					  pteh, ptel, &vcpu->arch.gpr[4]);
+}
+
 static struct kvmppc_slb *kvmppc_mmu_book3s_hv_find_slbe(struct kvm_vcpu *vcpu,
 							 gva_t eaddr)
 {
diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
index 5e06e31..362dffe 100644
--- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c
+++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
@@ -103,14 +103,14 @@ static void remove_revmap_chain(struct kvm *kvm, long pte_index,
 	unlock_rmap(rmap);
 }
 
-static pte_t lookup_linux_pte(struct kvm_vcpu *vcpu, unsigned long hva,
+static pte_t lookup_linux_pte(pgd_t *pgdir, unsigned long hva,
 			      int writing, unsigned long *pte_sizep)
 {
 	pte_t *ptep;
 	unsigned long ps = *pte_sizep;
 	unsigned int shift;
 
-	ptep = find_linux_pte_or_hugepte(vcpu->arch.pgdir, hva, &shift);
+	ptep = find_linux_pte_or_hugepte(pgdir, hva, &shift);
 	if (!ptep)
 		return __pte(0);
 	if (shift)
@@ -130,10 +130,10 @@ static inline void unlock_hpte(unsigned long *hpte, unsigned long hpte_v)
 	hpte[0] = hpte_v;
 }
 
-long kvmppc_h_enter(struct kvm_vcpu *vcpu, unsigned long flags,
-		    long pte_index, unsigned long pteh, unsigned long ptel)
+long kvmppc_do_h_enter(struct kvm *kvm, unsigned long flags,
+		       long pte_index, unsigned long pteh, unsigned long ptel,
+		       pgd_t *pgdir, bool realmode, unsigned long *pte_idx_ret)
 {
-	struct kvm *kvm = vcpu->kvm;
 	unsigned long i, pa, gpa, gfn, psize;
 	unsigned long slot_fn, hva;
 	unsigned long *hpte;
@@ -147,7 +147,6 @@ long kvmppc_h_enter(struct kvm_vcpu *vcpu, unsigned long flags,
 	unsigned int writing;
 	unsigned long mmu_seq;
 	unsigned long rcbits;
-	bool realmode = vcpu->arch.vcore->vcore_state == VCORE_RUNNING;
 
 	psize = hpte_page_size(pteh, ptel);
 	if (!psize)
@@ -201,7 +200,7 @@ long kvmppc_h_enter(struct kvm_vcpu *vcpu, unsigned long flags,
 
 		/* Look up the Linux PTE for the backing page */
 		pte_size = psize;
-		pte = lookup_linux_pte(vcpu, hva, writing, &pte_size);
+		pte = lookup_linux_pte(pgdir, hva, writing, &pte_size);
 		if (pte_present(pte)) {
 			if (writing && !pte_write(pte))
 				/* make the actual HPTE be read-only */
@@ -210,6 +209,7 @@ long kvmppc_h_enter(struct kvm_vcpu *vcpu, unsigned long flags,
 			pa = pte_pfn(pte) << PAGE_SHIFT;
 		}
 	}
+
 	if (pte_size < psize)
 		return H_PARAMETER;
 	if (pa && pte_size > psize)
@@ -297,7 +297,7 @@ long kvmppc_h_enter(struct kvm_vcpu *vcpu, unsigned long flags,
 		lock_rmap(rmap);
 		/* Check for pending invalidations under the rmap chain lock */
 		if (kvm->arch.using_mmu_notifiers &&
-		    mmu_notifier_retry(vcpu->kvm, mmu_seq)) {
+		    mmu_notifier_retry(kvm, mmu_seq)) {
 			/* inval in progress, write a non-present HPTE */
 			pteh |= HPTE_V_ABSENT;
 			pteh &= ~HPTE_V_VALID;
@@ -318,10 +318,17 @@ long kvmppc_h_enter(struct kvm_vcpu *vcpu, unsigned long flags,
 	hpte[0] = pteh;
 	asm volatile("ptesync" : : : "memory");
 
-	vcpu->arch.gpr[4] = pte_index;
+	*pte_idx_ret = pte_index;
 	return H_SUCCESS;
 }
-EXPORT_SYMBOL_GPL(kvmppc_h_enter);
+EXPORT_SYMBOL_GPL(kvmppc_do_h_enter);
+
+long kvmppc_h_enter(struct kvm_vcpu *vcpu, unsigned long flags,
+		    long pte_index, unsigned long pteh, unsigned long ptel)
+{
+	return kvmppc_do_h_enter(vcpu->kvm, flags, pte_index, pteh, ptel,
+				 vcpu->arch.pgdir, true, &vcpu->arch.gpr[4]);
+}
 
 #define LOCK_TOKEN	(*(u32 *)(&get_paca()->lock_token))
 
-- 
1.7.10.4

* [PATCH 2/4] KVM: PPC: Book3S HV: Add a mechanism for recording modified HPTEs
From: Paul Mackerras @ 2012-11-14  4:32 UTC
  To: Alexander Graf; +Cc: kvm, kvm-ppc

This uses a bit in our record of the guest view of the HPTE to record
when the HPTE gets modified.  We use a reserved bit for this, and ensure
that this bit is always cleared in HPTE values returned to the guest.
The recording of modified HPTEs is done only if other code indicates
interest by setting kvm->arch.hpte_mod_interest to a non-zero value.

Signed-off-by: Paul Mackerras <paulus@samba.org>
---
 arch/powerpc/include/asm/kvm_book3s_64.h |    6 ++++++
 arch/powerpc/include/asm/kvm_host.h      |    1 +
 arch/powerpc/kvm/book3s_hv_rm_mmu.c      |   25 ++++++++++++++++++++++---
 3 files changed, 29 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s_64.h b/arch/powerpc/include/asm/kvm_book3s_64.h
index 1472a5b..4ca4f25 100644
--- a/arch/powerpc/include/asm/kvm_book3s_64.h
+++ b/arch/powerpc/include/asm/kvm_book3s_64.h
@@ -50,6 +50,12 @@ extern int kvm_hpt_order;		/* order of preallocated HPTs */
 #define HPTE_V_HVLOCK	0x40UL
 #define HPTE_V_ABSENT	0x20UL
 
+/*
+ * We use this bit in the guest_rpte field of the revmap entry
+ * to indicate a modified HPTE.
+ */
+#define HPTE_GR_MODIFIED	(1ul << 62)
+
 static inline long try_lock_hpte(unsigned long *hpte, unsigned long bits)
 {
 	unsigned long tmp, old;
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 3093896..58c7264 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -248,6 +248,7 @@ struct kvm_arch {
 	atomic_t vcpus_running;
 	unsigned long hpt_npte;
 	unsigned long hpt_mask;
+	atomic_t hpte_mod_interest;
 	spinlock_t slot_phys_lock;
 	unsigned short last_vcpu[NR_CPUS];
 	struct kvmppc_vcore *vcores[KVM_MAX_VCORES];
diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
index 362dffe..726231a 100644
--- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c
+++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
@@ -66,6 +66,18 @@ void kvmppc_add_revmap_chain(struct kvm *kvm, struct revmap_entry *rev,
 }
 EXPORT_SYMBOL_GPL(kvmppc_add_revmap_chain);
 
+/*
+ * Note modification of an HPTE; set the HPTE modified bit
+ * if it wasn't modified before and anyone is interested.
+ */
+static inline void note_hpte_modification(struct kvm *kvm,
+					  struct revmap_entry *rev)
+{
+	if (!(rev->guest_rpte & HPTE_GR_MODIFIED) &&
+	    atomic_read(&kvm->arch.hpte_mod_interest))
+		rev->guest_rpte |= HPTE_GR_MODIFIED;
+}
+
 /* Remove this HPTE from the chain for a real page */
 static void remove_revmap_chain(struct kvm *kvm, long pte_index,
 				struct revmap_entry *rev,
@@ -287,8 +299,10 @@ long kvmppc_do_h_enter(struct kvm *kvm, unsigned long flags,
 	rev = &kvm->arch.revmap[pte_index];
 	if (realmode)
 		rev = real_vmalloc_addr(rev);
-	if (rev)
+	if (rev) {
 		rev->guest_rpte = g_ptel;
+		note_hpte_modification(kvm, rev);
+	}
 
 	/* Link HPTE into reverse-map chain */
 	if (pteh & HPTE_V_VALID) {
@@ -392,7 +406,8 @@ long kvmppc_h_remove(struct kvm_vcpu *vcpu, unsigned long flags,
 		/* Read PTE low word after tlbie to get final R/C values */
 		remove_revmap_chain(kvm, pte_index, rev, v, hpte[1]);
 	}
-	r = rev->guest_rpte;
+	r = rev->guest_rpte & ~HPTE_GR_MODIFIED;
+	note_hpte_modification(kvm, rev);
 	unlock_hpte(hpte, 0);
 
 	vcpu->arch.gpr[4] = v;
@@ -466,6 +481,7 @@ long kvmppc_h_bulk_remove(struct kvm_vcpu *vcpu)
 
 			args[j] = ((0x80 | flags) << 56) + pte_index;
 			rev = real_vmalloc_addr(&kvm->arch.revmap[pte_index]);
+			note_hpte_modification(kvm, rev);
 
 			if (!(hp[0] & HPTE_V_VALID)) {
 				/* insert R and C bits from PTE */
@@ -555,6 +571,7 @@ long kvmppc_h_protect(struct kvm_vcpu *vcpu, unsigned long flags,
 	if (rev) {
 		r = (rev->guest_rpte & ~mask) | bits;
 		rev->guest_rpte = r;
+		note_hpte_modification(kvm, rev);
 	}
 	r = (hpte[1] & ~mask) | bits;
 
@@ -606,8 +623,10 @@ long kvmppc_h_read(struct kvm_vcpu *vcpu, unsigned long flags,
 			v &= ~HPTE_V_ABSENT;
 			v |= HPTE_V_VALID;
 		}
-		if (v & HPTE_V_VALID)
+		if (v & HPTE_V_VALID) {
 			r = rev[i].guest_rpte | (r & (HPTE_R_R | HPTE_R_C));
+			r &= ~HPTE_GR_MODIFIED;
+		}
 		vcpu->arch.gpr[4 + i * 2] = v;
 		vcpu->arch.gpr[5 + i * 2] = r;
 	}
-- 
1.7.10.4

* [PATCH 3/4] KVM: PPC: Book3S HV: Make a HPTE removal function available
From: Paul Mackerras @ 2012-11-14  4:32 UTC
  To: Alexander Graf; +Cc: kvm, kvm-ppc

This makes a HPTE removal function, kvmppc_do_h_remove(), available
outside book3s_hv_rm_mmu.c.  This will be used by the HPT writing
code.

Signed-off-by: Paul Mackerras <paulus@samba.org>
---
 arch/powerpc/include/asm/kvm_book3s.h |    3 +++
 arch/powerpc/kvm/book3s_hv_rm_mmu.c   |   19 +++++++++++++------
 2 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
index fea768f..46763d10 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -160,6 +160,9 @@ extern long kvmppc_virtmode_h_enter(struct kvm_vcpu *vcpu, unsigned long flags,
 extern long kvmppc_do_h_enter(struct kvm *kvm, unsigned long flags,
 			long pte_index, unsigned long pteh, unsigned long ptel,
 			pgd_t *pgdir, bool realmode, unsigned long *idx_ret);
+extern long kvmppc_do_h_remove(struct kvm *kvm, unsigned long flags,
+			unsigned long pte_index, unsigned long avpn,
+			unsigned long *hpret);
 extern long kvmppc_hv_get_dirty_log(struct kvm *kvm,
 			struct kvm_memory_slot *memslot, unsigned long *map);
 
diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
index 726231a..e407e97 100644
--- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c
+++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
@@ -364,11 +364,10 @@ static inline int try_lock_tlbie(unsigned int *lock)
 	return old == 0;
 }
 
-long kvmppc_h_remove(struct kvm_vcpu *vcpu, unsigned long flags,
-		     unsigned long pte_index, unsigned long avpn,
-		     unsigned long va)
+long kvmppc_do_h_remove(struct kvm *kvm, unsigned long flags,
+			unsigned long pte_index, unsigned long avpn,
+			unsigned long *hpret)
 {
-	struct kvm *kvm = vcpu->kvm;
 	unsigned long *hpte;
 	unsigned long v, r, rb;
 	struct revmap_entry *rev;
@@ -410,10 +409,18 @@ long kvmppc_h_remove(struct kvm_vcpu *vcpu, unsigned long flags,
 	note_hpte_modification(kvm, rev);
 	unlock_hpte(hpte, 0);
 
-	vcpu->arch.gpr[4] = v;
-	vcpu->arch.gpr[5] = r;
+	hpret[0] = v;
+	hpret[1] = r;
 	return H_SUCCESS;
 }
+EXPORT_SYMBOL_GPL(kvmppc_do_h_remove);
+
+long kvmppc_h_remove(struct kvm_vcpu *vcpu, unsigned long flags,
+		     unsigned long pte_index, unsigned long avpn)
+{
+	return kvmppc_do_h_remove(vcpu->kvm, flags, pte_index, avpn,
+				  &vcpu->arch.gpr[4]);
+}
 
 long kvmppc_h_bulk_remove(struct kvm_vcpu *vcpu)
 {
-- 
1.7.10.4

* [PATCH 4/4] KVM: PPC: Book3S HV: Provide a method for userspace to read and write the HPT
From: Paul Mackerras @ 2012-11-14  4:33 UTC
  To: Alexander Graf; +Cc: kvm, kvm-ppc

A new ioctl, KVM_PPC_GET_HTAB_FD, returns a file descriptor.  Reads on
this fd return the contents of the HPT (hashed page table), while
writes create and/or remove entries in the HPT.  There is a new capability,
KVM_CAP_PPC_HTAB_FD, to indicate the presence of the ioctl.  The ioctl
takes an argument structure with the index of the first HPT entry to
read out and a set of flags.  The flags indicate whether the user is
intending to read or write the HPT, and whether to return all entries
or only the "bolted" entries (those with the bolted bit, 0x10, set in
the first doubleword).
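
As a hedged sketch (not part of the patch itself), userspace might
probe the capability and obtain the fd like this; kvm_fd (an open
/dev/kvm) and vm_fd (a KVM VM fd) are assumed to exist already:

#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Sketch only: return the htab fd, or -1 if the capability is absent. */
static int get_htab_fd(int kvm_fd, int vm_fd, __u64 flags, __u64 start_index)
{
	struct kvm_get_htab_fd ghf = {
		.flags = flags,			/* e.g. KVM_GET_HTAB_BOLTED_ONLY */
		.start_index = start_index,	/* ignored for writes */
	};

	/* the capability is probed on the /dev/kvm fd */
	if (ioctl(kvm_fd, KVM_CHECK_EXTENSION, KVM_CAP_PPC_HTAB_FD) <= 0)
		return -1;
	/* the new ioctl itself is a VM ioctl */
	return ioctl(vm_fd, KVM_PPC_GET_HTAB_FD, &ghf);
}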

This is intended for use in implementing qemu's savevm/loadvm and for
live migration.  Therefore, on reads, the first pass returns information
about all HPTEs (or all bolted HPTEs).  When the first pass reaches the
end of the HPT, the read() returns.  If read() is called again on the
fd, it starts again from the beginning of the HPT, but returns only
HPT entries that have changed since they were last read.  A read that
finds no changed HPTEs after the point where the last read finished
returns 0 bytes.

The format of the data provides a simple run-length compression of the
invalid entries.  Each block of data starts with a header that indicates
the index (position in the HPT, which is just an array), the number of
valid entries starting at that index (may be zero), and the number of
invalid entries following those valid entries.  The valid entries, 16
bytes each, follow the header.  The invalid entries are not explicitly
represented.
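
To make the layout concrete, here is a minimal parsing sketch
(illustration only); struct kvm_get_htab_header is the header added by
this patch, and handle_hpte() is a hypothetical consumer callback:

#include <stdint.h>
#include <string.h>
#include <linux/kvm.h>

/* Sketch: walk the records in buf[0..nb) as returned by read() on the
 * htab fd.  Each record is a kvm_get_htab_header followed by n_valid
 * HPTEs of 16 bytes each; the n_invalid entries that follow them are
 * implicit and carry no data in the stream. */
static void parse_htab_buf(const char *buf, size_t nb,
			   void (*handle_hpte)(uint32_t index,
					       uint64_t v, uint64_t r))
{
	struct kvm_get_htab_header hdr;
	uint64_t vr[2];
	uint32_t i;

	while (nb >= sizeof(hdr)) {
		memcpy(&hdr, buf, sizeof(hdr));
		buf += sizeof(hdr);
		nb -= sizeof(hdr);
		for (i = 0; i < hdr.n_valid && nb >= sizeof(vr); ++i) {
			memcpy(vr, buf, sizeof(vr));
			handle_hpte(hdr.index + i, vr[0], vr[1]);
			buf += sizeof(vr);
			nb -= sizeof(vr);
		}
		/* entries hdr.index + hdr.n_valid onwards, hdr.n_invalid
		 * of them, are invalid */
	}
}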

Signed-off-by: Paul Mackerras <paulus@samba.org>
---
 Documentation/virtual/kvm/api.txt        |   53 +++++
 arch/powerpc/include/asm/kvm_book3s_64.h |   18 ++
 arch/powerpc/include/asm/kvm_ppc.h       |    2 +
 arch/powerpc/include/uapi/asm/kvm.h      |   24 +++
 arch/powerpc/kvm/book3s_64_mmu_hv.c      |  344 ++++++++++++++++++++++++++++++
 arch/powerpc/kvm/book3s_hv.c             |   12 --
 arch/powerpc/kvm/powerpc.c               |   17 ++
 include/uapi/linux/kvm.h                 |    3 +
 8 files changed, 461 insertions(+), 12 deletions(-)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 6671fdc..33080ea 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -2071,6 +2071,59 @@ KVM_S390_INT_EXTERNAL_CALL (vcpu) - sigp external call; source cpu in parm
 
 Note that the vcpu ioctl is asynchronous to vcpu execution.
 
+4.78 KVM_PPC_GET_HTAB_FD
+
+Capability: KVM_CAP_PPC_HTAB_FD
+Architectures: powerpc
+Type: vm ioctl
+Parameters: Pointer to struct kvm_get_htab_fd (in)
+Returns: file descriptor number (>= 0) on success, -1 on error
+
+This returns a file descriptor that can be used either to read out the
+entries in the guest's hashed page table (HPT), or to write entries to
+initialize the HPT.  The returned fd can only be written to if the
+KVM_GET_HTAB_WRITE bit is set in the flags field of the argument, and
+can only be read if that bit is clear.  The argument struct looks like
+this:
+
+/* For KVM_PPC_GET_HTAB_FD */
+struct kvm_get_htab_fd {
+	__u64	flags;
+	__u64	start_index;
+};
+
+/* Values for kvm_get_htab_fd.flags */
+#define KVM_GET_HTAB_BOLTED_ONLY	((__u64)0x1)
+#define KVM_GET_HTAB_WRITE		((__u64)0x2)
+
+The `start_index' field gives the index in the HPT of the entry at
+which to start reading.  It is ignored when writing.
+
+Reads on the fd will initially supply information about all
+"interesting" HPT entries.  Interesting entries are those with the
+bolted bit set, if the KVM_GET_HTAB_BOLTED_ONLY bit is set, otherwise
+all entries.  When the end of the HPT is reached, the read() will
+return.  If read() is called again on the fd, it will start again from
+the beginning of the HPT, but will only return HPT entries that have
+changed since they were last read.
+
+Data read or written is structured as a header (8 bytes) followed by a
+series of valid HPT entries (16 bytes) each.  The header indicates how
+many valid HPT entries there are and how many invalid entries follow
+the valid entries.  The invalid entries are not represented explicitly
+in the stream.  The header format is:
+
+struct kvm_get_htab_header {
+	__u32	index;
+	__u16	n_valid;
+	__u16	n_invalid;
+};
+
+Writes to the fd create HPT entries starting at the index given in the
+header; first `n_valid' valid entries with contents from the data
+written, then `n_invalid' invalid entries, invalidating any previously
+valid entries found.
+
 
 5. The kvm_run structure
 ------------------------
diff --git a/arch/powerpc/include/asm/kvm_book3s_64.h b/arch/powerpc/include/asm/kvm_book3s_64.h
index 4ca4f25..dc0a78d 100644
--- a/arch/powerpc/include/asm/kvm_book3s_64.h
+++ b/arch/powerpc/include/asm/kvm_book3s_64.h
@@ -243,4 +243,22 @@ static inline bool slot_is_aligned(struct kvm_memory_slot *memslot,
 	return !(memslot->base_gfn & mask) && !(memslot->npages & mask);
 }
 
+static inline unsigned long slb_pgsize_encoding(unsigned long psize)
+{
+	unsigned long senc = 0;
+
+	if (psize > 0x1000) {
+		senc = SLB_VSID_L;
+		if (psize == 0x10000)
+			senc |= SLB_VSID_LP_01;
+	}
+	return senc;
+}
+
+static inline int is_vrma_hpte(unsigned long hpte_v)
+{
+	return (hpte_v & ~0xffffffUL) ==
+		(HPTE_V_1TB_SEG | (VRMA_VSID << (40 - 16)));
+}
+
 #endif /* __ASM_KVM_BOOK3S_64_H__ */
diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index 609cca3..1ca31e9 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -164,6 +164,8 @@ extern void kvmppc_bookehv_exit(void);
 
 extern int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu);
 
+extern int kvm_vm_ioctl_get_htab_fd(struct kvm *kvm, struct kvm_get_htab_fd *);
+
 /*
  * Cuts out inst bits with ordering according to spec.
  * That means the leftmost bit is zero. All given bits are included.
diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h
index b89ae4d..6518e38 100644
--- a/arch/powerpc/include/uapi/asm/kvm.h
+++ b/arch/powerpc/include/uapi/asm/kvm.h
@@ -331,6 +331,30 @@ struct kvm_book3e_206_tlb_params {
 	__u32 reserved[8];
 };
 
+/* For KVM_PPC_GET_HTAB_FD */
+struct kvm_get_htab_fd {
+	__u64	flags;
+	__u64	start_index;
+};
+
+/* Values for kvm_get_htab_fd.flags */
+#define KVM_GET_HTAB_BOLTED_ONLY	((__u64)0x1)
+#define KVM_GET_HTAB_WRITE		((__u64)0x2)
+
+/*
+ * Data read on the file descriptor is formatted as a series of
+ * records, each consisting of a header followed by a series of
+ * `n_valid' HPTEs (16 bytes each), which are all valid.  Following 
+ * those valid HPTEs there are `n_invalid' invalid HPTEs, which
+ * are not represented explicitly in the stream.  The same format
+ * is used for writing.
+ */
+struct kvm_get_htab_header {
+	__u32	index;
+	__u16	n_valid;
+	__u16	n_invalid;
+};
+
 #define KVM_REG_PPC_HIOR	(KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x1)
 #define KVM_REG_PPC_IAC1	(KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x2)
 #define KVM_REG_PPC_IAC2	(KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x3)
diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index 6ee6516..42a1088 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -25,6 +25,8 @@
 #include <linux/hugetlb.h>
 #include <linux/vmalloc.h>
 #include <linux/srcu.h>
+#include <linux/anon_inodes.h>
+#include <linux/file.h>
 
 #include <asm/tlbflush.h>
 #include <asm/kvm_ppc.h>
@@ -1145,6 +1147,348 @@ void kvmppc_unpin_guest_page(struct kvm *kvm, void *va)
 	put_page(page);
 }
 
+/*
+ * Functions for reading and writing the hash table via reads and
+ * writes on a file descriptor.
+ *
+ * Reads return the guest view of the hash table, which has to be
+ * pieced together from the real hash table and the guest_rpte
+ * values in the revmap array.
+ *
+ * On writes, each HPTE written is considered in turn, and if it
+ * is valid, it is written to the HPT as if an H_ENTER with the
+ * exact flag set was done.  When the invalid count is non-zero
+ * in the header written to the stream, the kernel will make
+ * sure that that many HPTEs are invalid, and invalidate them
+ * if not.
+ */
+
+struct kvm_htab_ctx {
+	unsigned long	index;
+	unsigned long	flags;
+	struct kvm	*kvm;
+	int		first_pass;
+};
+
+#define HPTE_SIZE	(2 * sizeof(unsigned long))
+
+static long record_hpte(unsigned long flags, unsigned long *hptp,
+			unsigned long *hpte, struct revmap_entry *revp,
+			int want_valid, int first_pass)
+{
+	unsigned long v, r;
+	int ok = 1;
+	int valid, dirty;
+
+	/* Unmodified entries are uninteresting except on the first pass */
+	dirty = !!(revp->guest_rpte & HPTE_GR_MODIFIED);
+	if (!first_pass && !dirty)
+		return 0;
+
+	valid = 0;
+	if (hptp[0] & (HPTE_V_VALID | HPTE_V_ABSENT)) {
+		valid = 1;
+		if ((flags & KVM_GET_HTAB_BOLTED_ONLY) &&
+		    !(hptp[0] & HPTE_V_BOLTED))
+			valid = 0;
+	}
+	if (valid != want_valid)
+		return 0;
+
+	v = r = 0;
+	if (valid || dirty) {
+		/* lock the HPTE so it's stable and read it */
+		preempt_disable();
+		while (!try_lock_hpte(hptp, HPTE_V_HVLOCK))
+			cpu_relax();
+		v = hptp[0];
+		if (v & HPTE_V_ABSENT) {
+			v &= ~HPTE_V_ABSENT;
+			v |= HPTE_V_VALID;
+		}
+		/* re-evaluate valid and dirty from synchronized HPTE value */
+		valid = !!(v & HPTE_V_VALID);
+		if ((flags & KVM_GET_HTAB_BOLTED_ONLY) && !(v & HPTE_V_BOLTED))
+			valid = 0;
+		r = revp->guest_rpte | (hptp[1] & (HPTE_R_R | HPTE_R_C));
+		dirty = !!(revp->guest_rpte & HPTE_GR_MODIFIED);
+		/* only clear modified if this is the right sort of entry */
+		if (valid == want_valid && dirty) {
+			r &= ~HPTE_GR_MODIFIED;
+			revp->guest_rpte = r;
+		}
+		asm volatile(PPC_RELEASE_BARRIER "" : : : "memory");
+		hptp[0] &= ~HPTE_V_HVLOCK;
+		preempt_enable();
+		if (!(valid == want_valid && (first_pass || dirty)))
+			ok = 0;
+	}
+	hpte[0] = v;
+	hpte[1] = r;
+	return ok;
+}
+
+static ssize_t kvm_htab_read(struct file *file, char __user *buf,
+			     size_t count, loff_t *ppos)
+{
+	struct kvm_htab_ctx *ctx = file->private_data;
+	struct kvm *kvm = ctx->kvm;
+	struct kvm_get_htab_header hdr;
+	unsigned long *hptp;
+	struct revmap_entry *revp;
+	unsigned long i, nb, nw;
+	unsigned long __user *lbuf;
+	struct kvm_get_htab_header __user *hptr;
+	unsigned long flags;
+	int first_pass;
+	unsigned long hpte[2];
+
+	if (!access_ok(VERIFY_WRITE, buf, count))
+		return -EFAULT;
+
+	first_pass = ctx->first_pass;
+	flags = ctx->flags;
+
+	i = ctx->index;
+	hptp = (unsigned long *)(kvm->arch.hpt_virt + (i * HPTE_SIZE));
+	revp = kvm->arch.revmap + i;
+	lbuf = (unsigned long __user *)buf;
+
+	nb = 0;
+	while (nb + sizeof(hdr) + HPTE_SIZE < count) {
+		/* Initialize header */
+		hptr = (struct kvm_get_htab_header __user *)buf;
+		hdr.index = i;
+		hdr.n_valid = 0;
+		hdr.n_invalid = 0;
+		nw = nb;
+		nb += sizeof(hdr);
+		lbuf = (unsigned long __user *)(buf + sizeof(hdr));
+
+		/* Skip uninteresting entries, i.e. clean on not-first pass */
+		if (!first_pass) {
+			while (i < kvm->arch.hpt_npte &&
+			       !(revp->guest_rpte & HPTE_GR_MODIFIED)) {
+				++i;
+				hptp += 2;
+				++revp;
+			}
+		}
+
+		/* Grab a series of valid entries */
+		while (i < kvm->arch.hpt_npte &&
+		       hdr.n_valid < 0xffff &&
+		       nb + HPTE_SIZE < count &&
+		       record_hpte(flags, hptp, hpte, revp, 1, first_pass)) {
+			/* valid entry, write it out */
+			++hdr.n_valid;
+			if (__put_user(hpte[0], lbuf) ||
+			    __put_user(hpte[1], lbuf + 1))
+				return -EFAULT;
+			nb += HPTE_SIZE;
+			lbuf += 2;
+			++i;
+			hptp += 2;
+			++revp;
+		}
+		/* Now skip invalid entries while we can */
+		while (i < kvm->arch.hpt_npte &&
+		       hdr.n_invalid < 0xffff &&
+		       record_hpte(flags, hptp, hpte, revp, 0, first_pass)) {
+			/* found an invalid entry */
+			++hdr.n_invalid;
+			++i;
+			hptp += 2;
+			++revp;
+		}
+ 
+		if (hdr.n_valid || hdr.n_invalid) {
+			/* write back the header */
+			if (__copy_to_user(hptr, &hdr, sizeof(hdr)))
+				return -EFAULT;
+			nw = nb;
+			buf = (char __user *)lbuf;
+		} else {
+			nb = nw;
+		}
+
+		/* Check if we've wrapped around the hash table */
+		if (i >= kvm->arch.hpt_npte) {
+			i = 0;
+			ctx->first_pass = 0;
+			break;
+		}
+	}
+
+	ctx->index = i;
+
+	return nb;
+}
+
+static ssize_t kvm_htab_write(struct file *file, const char __user *buf,
+			      size_t count, loff_t *ppos)
+{
+	struct kvm_htab_ctx *ctx = file->private_data;
+	struct kvm *kvm = ctx->kvm;
+	struct kvm_get_htab_header hdr;
+	unsigned long i, j;
+	unsigned long v, r;
+	unsigned long __user *lbuf;
+	unsigned long *hptp;
+	unsigned long tmp[2];
+	ssize_t nb;
+	long int err, ret;
+	int rma_setup;
+
+	if (!access_ok(VERIFY_READ, buf, count))
+		return -EFAULT;
+
+	/* lock out vcpus from running while we're doing this */
+	mutex_lock(&kvm->lock);
+	rma_setup = kvm->arch.rma_setup_done;
+	if (rma_setup) {
+		kvm->arch.rma_setup_done = 0;	/* temporarily */
+		/* order rma_setup_done vs. vcpus_running */
+		smp_mb();
+		if (atomic_read(&kvm->arch.vcpus_running)) {
+			kvm->arch.rma_setup_done = 1;
+			mutex_unlock(&kvm->lock);
+			return -EBUSY;
+		}
+	}
+
+	err = 0;
+	for (nb = 0; nb + sizeof(hdr) <= count; ) {
+		err = -EFAULT;
+		if (__copy_from_user(&hdr, buf, sizeof(hdr)))
+			break;
+
+		err = 0;
+		if (nb + hdr.n_valid * HPTE_SIZE > count)
+			break;
+
+		nb += sizeof(hdr);
+		buf += sizeof(hdr);
+
+		err = -EINVAL;
+		i = hdr.index;
+		if (i >= kvm->arch.hpt_npte ||
+		    i + hdr.n_valid + hdr.n_invalid > kvm->arch.hpt_npte)
+			break;
+
+		hptp = (unsigned long *)(kvm->arch.hpt_virt + (i * HPTE_SIZE));
+		lbuf = (unsigned long __user *)buf;
+		for (j = 0; j < hdr.n_valid; ++j) {
+			err = -EFAULT;
+			if (__get_user(v, lbuf) || __get_user(r, lbuf + 1))
+				goto out;
+			err = -EINVAL;
+			if (!(v & HPTE_V_VALID))
+				goto out;
+			lbuf += 2;
+			nb += HPTE_SIZE;
+
+			if (hptp[0] & (HPTE_V_VALID | HPTE_V_ABSENT))
+				kvmppc_do_h_remove(kvm, 0, i, 0, tmp);
+			err = -EIO;
+			ret = kvmppc_virtmode_do_h_enter(kvm, H_EXACT, i, v, r,
+							 tmp);
+			if (ret != H_SUCCESS) {
+				pr_err("kvm_htab_write ret %ld i=%ld v=%lx "
+				       "r=%lx\n", ret, i, v, r);
+				goto out;
+			}
+			if (!rma_setup && is_vrma_hpte(v)) {
+				unsigned long psize = hpte_page_size(v, r);
+				unsigned long senc = slb_pgsize_encoding(psize);
+				unsigned long lpcr;
+
+				kvm->arch.vrma_slb_v = senc | SLB_VSID_B_1T |
+					(VRMA_VSID << SLB_VSID_SHIFT_1T);
+				lpcr = kvm->arch.lpcr & ~LPCR_VRMASD;
+				lpcr |= senc << (LPCR_VRMASD_SH - 4);
+				kvm->arch.lpcr = lpcr;
+				rma_setup = 1;
+			}
+			++i;
+			hptp += 2;
+		}
+
+		for (j = 0; j < hdr.n_invalid; ++j) {
+			if (hptp[0] & (HPTE_V_VALID | HPTE_V_ABSENT))
+				kvmppc_do_h_remove(kvm, 0, i, 0, tmp);
+			++i;
+			hptp += 2;
+		}
+		err = 0;
+	}
+
+ out:
+	/* Order HPTE updates vs. rma_setup_done */
+	smp_wmb();
+	kvm->arch.rma_setup_done = rma_setup;
+	mutex_unlock(&kvm->lock);
+
+	if (err)
+		return err;
+	return nb;
+}
+
+static int kvm_htab_release(struct inode *inode, struct file *filp)
+{
+	struct kvm_htab_ctx *ctx = filp->private_data;
+
+	filp->private_data = NULL;
+	if (!(ctx->flags & KVM_GET_HTAB_WRITE))
+		atomic_dec(&ctx->kvm->arch.hpte_mod_interest);
+	kvm_put_kvm(ctx->kvm);
+	kfree(ctx);
+	return 0;
+}
+
+static struct file_operations kvm_htab_fops = {
+	.read		= kvm_htab_read,
+	.write		= kvm_htab_write,
+	.llseek		= default_llseek,
+	.release	= kvm_htab_release,
+};
+
+int kvm_vm_ioctl_get_htab_fd(struct kvm *kvm, struct kvm_get_htab_fd *ghf)
+{
+	int ret;
+	struct kvm_htab_ctx *ctx;
+	int rwflag;
+
+	/* reject flags we don't recognize */
+	if (ghf->flags & ~(KVM_GET_HTAB_BOLTED_ONLY | KVM_GET_HTAB_WRITE))
+		return -EINVAL;
+	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
+	if (!ctx)
+		return -ENOMEM;
+	kvm_get_kvm(kvm);
+	ctx->kvm = kvm;
+	ctx->index = ghf->start_index;
+	ctx->flags = ghf->flags;
+	ctx->first_pass = 1;
+
+	rwflag = (ghf->flags & KVM_GET_HTAB_WRITE) ? O_WRONLY : O_RDONLY;
+	ret = anon_inode_getfd("kvm-htab", &kvm_htab_fops, ctx, rwflag);
+	if (ret < 0) {
+		kvm_put_kvm(kvm);
+		return ret;
+	}
+
+	if (rwflag == O_RDONLY) {
+		mutex_lock(&kvm->slots_lock);
+		atomic_inc(&kvm->arch.hpte_mod_interest);
+		/* make sure kvmppc_do_h_enter etc. see the increment */
+		synchronize_srcu_expedited(&kvm->srcu);
+		mutex_unlock(&kvm->slots_lock);
+	}
+
+	return ret;
+}
+
 void kvmppc_mmu_book3s_hv_init(struct kvm_vcpu *vcpu)
 {
 	struct kvmppc_mmu *mmu = &vcpu->arch.mmu;
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 843eb75..a4f59db 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1563,18 +1563,6 @@ out:
 	return r;
 }
 
-static unsigned long slb_pgsize_encoding(unsigned long psize)
-{
-	unsigned long senc = 0;
-
-	if (psize > 0x1000) {
-		senc = SLB_VSID_L;
-		if (psize == 0x10000)
-			senc |= SLB_VSID_LP_01;
-	}
-	return senc;
-}
-
 static void unpin_slot(struct kvm_memory_slot *memslot)
 {
 	unsigned long *physp;
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index deb0d59..3aa52d9 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -353,6 +353,12 @@ int kvm_dev_ioctl_check_extension(long ext)
 		r = 1;
 #else
 		r = 0;
+		break;
+#endif
+#ifdef CONFIG_KVM_BOOK3S_64_HV
+	case KVM_CAP_PPC_HTAB_FD:
+		r = 1;
+		break;
 #endif
 		break;
 	case KVM_CAP_NR_VCPUS:
@@ -934,6 +940,17 @@ long kvm_arch_vm_ioctl(struct file *filp,
 		r = 0;
 		break;
 	}
+
+	case KVM_PPC_GET_HTAB_FD: {
+		struct kvm *kvm = filp->private_data;
+		struct kvm_get_htab_fd ghf;
+
+		r = -EFAULT;
+		if (copy_from_user(&ghf, argp, sizeof(ghf)))
+			break;
+		r = kvm_vm_ioctl_get_htab_fd(kvm, &ghf);
+		break;
+	}
 #endif /* CONFIG_KVM_BOOK3S_64_HV */
 
 #ifdef CONFIG_PPC_BOOK3S_64
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 494a84c..e6e5d4b 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -634,6 +634,7 @@ struct kvm_ppc_smmu_info {
 #endif
 #define KVM_CAP_IRQFD_RESAMPLE 82
 #define KVM_CAP_PPC_BOOKE_WATCHDOG 83
+#define KVM_CAP_PPC_HTAB_FD 84
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -859,6 +860,8 @@ struct kvm_s390_ucas_mapping {
 #define KVM_CREATE_SPAPR_TCE	  _IOW(KVMIO,  0xa8, struct kvm_create_spapr_tce)
 /* Available with KVM_CAP_RMA */
 #define KVM_ALLOCATE_RMA	  _IOR(KVMIO,  0xa9, struct kvm_allocate_rma)
+/* Available with KVM_CAP_PPC_HTAB_FD */
+#define KVM_PPC_GET_HTAB_FD	  _IOW(KVMIO,  0xaa, struct kvm_get_htab_fd)
 
 /*
  * ioctls for vcpu fds
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/4] KVM: PPC: Book3S HV: Add a mechanism for recording modified HPTEs
  2012-11-14  4:32   ` Paul Mackerras
@ 2012-11-19 12:36     ` Alexander Graf
  -1 siblings, 0 replies; 24+ messages in thread
From: Alexander Graf @ 2012-11-19 12:36 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: kvm, kvm-ppc


On 14.11.2012, at 05:32, Paul Mackerras wrote:

> This uses a bit in our record of the guest view of the HPTE to record
> when the HPTE gets modified.  We use a reserved bit for this, and ensure
> that this bit is always cleared in HPTE values returned to the guest.
> The recording of modified HPTEs is only done if other code indicates
> its interest by setting kvm->arch.hpte_mod_interest to a non-zero value.
> 
> Signed-off-by: Paul Mackerras <paulus@samba.org>
> ---
> arch/powerpc/include/asm/kvm_book3s_64.h |    6 ++++++
> arch/powerpc/include/asm/kvm_host.h      |    1 +
> arch/powerpc/kvm/book3s_hv_rm_mmu.c      |   25 ++++++++++++++++++++++---
> 3 files changed, 29 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/kvm_book3s_64.h b/arch/powerpc/include/asm/kvm_book3s_64.h
> index 1472a5b..4ca4f25 100644
> --- a/arch/powerpc/include/asm/kvm_book3s_64.h
> +++ b/arch/powerpc/include/asm/kvm_book3s_64.h
> @@ -50,6 +50,12 @@ extern int kvm_hpt_order;		/* order of preallocated HPTs */
> #define HPTE_V_HVLOCK	0x40UL
> #define HPTE_V_ABSENT	0x20UL
> 
> +/*
> + * We use this bit in the guest_rpte field of the revmap entry
> + * to indicate a modified HPTE.
> + */
> +#define HPTE_GR_MODIFIED	(1ul << 62)
> +
> static inline long try_lock_hpte(unsigned long *hpte, unsigned long bits)
> {
> 	unsigned long tmp, old;
> diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
> index 3093896..58c7264 100644
> --- a/arch/powerpc/include/asm/kvm_host.h
> +++ b/arch/powerpc/include/asm/kvm_host.h
> @@ -248,6 +248,7 @@ struct kvm_arch {
> 	atomic_t vcpus_running;
> 	unsigned long hpt_npte;
> 	unsigned long hpt_mask;
> +	atomic_t hpte_mod_interest;
> 	spinlock_t slot_phys_lock;
> 	unsigned short last_vcpu[NR_CPUS];
> 	struct kvmppc_vcore *vcores[KVM_MAX_VCORES];
> diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
> index 362dffe..726231a 100644
> --- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c
> +++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
> @@ -66,6 +66,18 @@ void kvmppc_add_revmap_chain(struct kvm *kvm, struct revmap_entry *rev,
> }
> EXPORT_SYMBOL_GPL(kvmppc_add_revmap_chain);
> 
> +/*
> + * Note modification of an HPTE; set the HPTE modified bit
> + * if it wasn't modified before and anyone is interested.
> + */
> +static inline void note_hpte_modification(struct kvm *kvm,
> +					  struct revmap_entry *rev)
> +{
> +	if (!(rev->guest_rpte & HPTE_GR_MODIFIED) &&
> +	    atomic_read(&kvm->arch.hpte_mod_interest))

Aren't atomic reads expensive? Doing one on every guest HPTE modification (when nobody is listening) sounds quite excessive. Wouldn't it be cheaper to just always update the bit?

> +		rev->guest_rpte |= HPTE_GR_MODIFIED;
> +}
> +
> /* Remove this HPTE from the chain for a real page */
> static void remove_revmap_chain(struct kvm *kvm, long pte_index,
> 				struct revmap_entry *rev,
> @@ -287,8 +299,10 @@ long kvmppc_do_h_enter(struct kvm *kvm, unsigned long flags,
> 	rev = &kvm->arch.revmap[pte_index];
> 	if (realmode)
> 		rev = real_vmalloc_addr(rev);
> -	if (rev)
> +	if (rev) {
> 		rev->guest_rpte = g_ptel;
> +		note_hpte_modification(kvm, rev);
> +	}
> 
> 	/* Link HPTE into reverse-map chain */
> 	if (pteh & HPTE_V_VALID) {
> @@ -392,7 +406,8 @@ long kvmppc_h_remove(struct kvm_vcpu *vcpu, unsigned long flags,
> 		/* Read PTE low word after tlbie to get final R/C values */
> 		remove_revmap_chain(kvm, pte_index, rev, v, hpte[1]);
> 	}
> -	r = rev->guest_rpte;
> +	r = rev->guest_rpte & ~HPTE_GR_MODIFIED;
> +	note_hpte_modification(kvm, rev);
> 	unlock_hpte(hpte, 0);
> 
> 	vcpu->arch.gpr[4] = v;
> @@ -466,6 +481,7 @@ long kvmppc_h_bulk_remove(struct kvm_vcpu *vcpu)
> 
> 			args[j] = ((0x80 | flags) << 56) + pte_index;
> 			rev = real_vmalloc_addr(&kvm->arch.revmap[pte_index]);
> +			note_hpte_modification(kvm, rev);
> 
> 			if (!(hp[0] & HPTE_V_VALID)) {
> 				/* insert R and C bits from PTE */
> @@ -555,6 +571,7 @@ long kvmppc_h_protect(struct kvm_vcpu *vcpu, unsigned long flags,
> 	if (rev) {
> 		r = (rev->guest_rpte & ~mask) | bits;
> 		rev->guest_rpte = r;
> +		note_hpte_modification(kvm, rev);
> 	}
> 	r = (hpte[1] & ~mask) | bits;
> 
> @@ -606,8 +623,10 @@ long kvmppc_h_read(struct kvm_vcpu *vcpu, unsigned long flags,
> 			v &= ~HPTE_V_ABSENT;
> 			v |= HPTE_V_VALID;
> 		}
> -		if (v & HPTE_V_VALID)
> +		if (v & HPTE_V_VALID) {
> 			r = rev[i].guest_rpte | (r & (HPTE_R_R | HPTE_R_C));
> +			r &= ~HPTE_GR_MODIFIED;

It probably makes sense to make this a RESERVED mask, so that if we ever want to use another bit for host information, we can just add that to the respective mask. Also, you should probably clear the reserved bits (or bail out when set) on h_enter.
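
Something like this maybe (only a sketch, the name is just a
suggestion):

/* Bits reserved for host use in the guest view of the HPTE */
#define HPTE_GR_RESERVED	HPTE_GR_MODIFIED

Then h_enter could simply do g_ptel &= ~HPTE_GR_RESERVED (or bail out
if any of those bits are set) before storing the value in guest_rpte.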


Alex

> +		}
> 		vcpu->arch.gpr[4 + i * 2] = v;
> 		vcpu->arch.gpr[5 + i * 2] = r;
> 	}
> -- 
> 1.7.10.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 4/4] KVM: PPC: Book3S HV: Provide a method for userspace to read and write the HPT
  2012-11-14  4:33   ` Paul Mackerras
@ 2012-11-19 12:53     ` Alexander Graf
  -1 siblings, 0 replies; 24+ messages in thread
From: Alexander Graf @ 2012-11-19 12:53 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: kvm, kvm-ppc


On 14.11.2012, at 05:33, Paul Mackerras wrote:

> A new ioctl, KVM_PPC_GET_HTAB_FD, returns a file descriptor.  Reads on
> this fd return the contents of the HPT (hashed page table), writes
> create and/or remove entries in the HPT.  There is a new capability,
> KVM_CAP_PPC_HTAB_FD, to indicate the presence of the ioctl.  The ioctl
> takes an argument structure with the index of the first HPT entry to
> read out and a set of flags.  The flags indicate whether the user is
> intending to read or write the HPT, and whether to return all entries
> or only the "bolted" entries (those with the bolted bit, 0x10, set in
> the first doubleword).
> 
> This is intended for use in implementing qemu's savevm/loadvm and for
> live migration.  Therefore, on reads, the first pass returns information
> about all HPTEs (or all bolted HPTEs).  When the first pass reaches the
> end of the HPT, it returns from the read.  Subsequent reads only return
> information about HPTEs that have changed since they were last read.
> A read that finds no changed HPTEs in the HPT following where the last
> read finished will return 0 bytes.
> 
> The format of the data provides a simple run-length compression of the
> invalid entries.  Each block of data starts with a header that indicates
> the index (position in the HPT, which is just an array), the number of
> valid entries starting at that index (may be zero), and the number of
> invalid entries following those valid entries.  The valid entries, 16
> bytes each, follow the header.  The invalid entries are not explicitly
> represented.
> 
> Signed-off-by: Paul Mackerras <paulus@samba.org>
> ---
> Documentation/virtual/kvm/api.txt        |   53 +++++
> arch/powerpc/include/asm/kvm_book3s_64.h |   18 ++
> arch/powerpc/include/asm/kvm_ppc.h       |    2 +
> arch/powerpc/include/uapi/asm/kvm.h      |   24 +++
> arch/powerpc/kvm/book3s_64_mmu_hv.c      |  344 ++++++++++++++++++++++++++++++
> arch/powerpc/kvm/book3s_hv.c             |   12 --
> arch/powerpc/kvm/powerpc.c               |   17 ++
> include/uapi/linux/kvm.h                 |    3 +
> 8 files changed, 461 insertions(+), 12 deletions(-)
> 
> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> index 6671fdc..33080ea 100644
> --- a/Documentation/virtual/kvm/api.txt
> +++ b/Documentation/virtual/kvm/api.txt
> @@ -2071,6 +2071,59 @@ KVM_S390_INT_EXTERNAL_CALL (vcpu) - sigp external call; source cpu in parm
> 
> Note that the vcpu ioctl is asynchronous to vcpu execution.
> 
> +4.78 KVM_PPC_GET_HTAB_FD
> +
> +Capability: KVM_CAP_PPC_HTAB_FD
> +Architectures: powerpc
> +Type: vm ioctl
> +Parameters: Pointer to struct kvm_get_htab_fd (in)
> +Returns: file descriptor number (>= 0) on success, -1 on error
> +
> +This returns a file descriptor that can be used either to read out the
> +entries in the guest's hashed page table (HPT), or to write entries to
> +initialize the HPT.  The returned fd can only be written to if the
> +KVM_GET_HTAB_WRITE bit is set in the flags field of the argument, and
> +can only be read if that bit is clear.  The argument struct looks like
> +this:
> +
> +/* For KVM_PPC_GET_HTAB_FD */
> +struct kvm_get_htab_fd {
> +	__u64	flags;
> +	__u64	start_index;
> +};
> +
> +/* Values for kvm_get_htab_fd.flags */
> +#define KVM_GET_HTAB_BOLTED_ONLY	((__u64)0x1)
> +#define KVM_GET_HTAB_WRITE		((__u64)0x2)
> +
> +The `start_index' field gives the index in the HPT of the entry at
> +which to start reading.  It is ignored when writing.
> +
> +Reads on the fd will initially supply information about all
> +"interesting" HPT entries.  Interesting entries are those with the
> +bolted bit set, if the KVM_GET_HTAB_BOLTED_ONLY bit is set, otherwise
> +all entries.  When the end of the HPT is reached, the read() will
> +return.  If read() is called again on the fd, it will start again from
> +the beginning of the HPT, but will only return HPT entries that have
> +changed since they were last read.
> +
> +Data read or written is structured as a header (8 bytes) followed by a
> +series of valid HPT entries (16 bytes) each.  The header indicates how
> +many valid HPT entries there are and how many invalid entries follow
> +the valid entries.  The invalid entries are not represented explicitly
> +in the stream.  The header format is:
> +
> +struct kvm_get_htab_header {
> +	__u32	index;
> +	__u16	n_valid;
> +	__u16	n_invalid;
> +};
> +
> +Writes to the fd create HPT entries starting at the index given in the
> +header; first `n_valid' valid entries with contents from the data
> +written, then `n_invalid' invalid entries, invalidating any previously
> +valid entries found.
> +
> 
> 5. The kvm_run structure
> ------------------------
> diff --git a/arch/powerpc/include/asm/kvm_book3s_64.h b/arch/powerpc/include/asm/kvm_book3s_64.h
> index 4ca4f25..dc0a78d 100644
> --- a/arch/powerpc/include/asm/kvm_book3s_64.h
> +++ b/arch/powerpc/include/asm/kvm_book3s_64.h
> @@ -243,4 +243,22 @@ static inline bool slot_is_aligned(struct kvm_memory_slot *memslot,
> 	return !(memslot->base_gfn & mask) && !(memslot->npages & mask);
> }
> 
> +static inline unsigned long slb_pgsize_encoding(unsigned long psize)
> +{
> +	unsigned long senc = 0;
> +
> +	if (psize > 0x1000) {
> +		senc = SLB_VSID_L;
> +		if (psize == 0x10000)
> +			senc |= SLB_VSID_LP_01;

Is this always accurate?

> +	}
> +	return senc;
> +}
> +
> +static inline int is_vrma_hpte(unsigned long hpte_v)
> +{
> +	return (hpte_v & ~0xffffffUL) ==
> +		(HPTE_V_1TB_SEG | (VRMA_VSID << (40 - 16)));
> +}
> +
> #endif /* __ASM_KVM_BOOK3S_64_H__ */
> diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
> index 609cca3..1ca31e9 100644
> --- a/arch/powerpc/include/asm/kvm_ppc.h
> +++ b/arch/powerpc/include/asm/kvm_ppc.h
> @@ -164,6 +164,8 @@ extern void kvmppc_bookehv_exit(void);
> 
> extern int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu);
> 
> +extern int kvm_vm_ioctl_get_htab_fd(struct kvm *kvm, struct kvm_get_htab_fd *);
> +
> /*
>  * Cuts out inst bits with ordering according to spec.
>  * That means the leftmost bit is zero. All given bits are included.
> diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h
> index b89ae4d..6518e38 100644
> --- a/arch/powerpc/include/uapi/asm/kvm.h
> +++ b/arch/powerpc/include/uapi/asm/kvm.h
> @@ -331,6 +331,30 @@ struct kvm_book3e_206_tlb_params {
> 	__u32 reserved[8];
> };
> 
> +/* For KVM_PPC_GET_HTAB_FD */
> +struct kvm_get_htab_fd {
> +	__u64	flags;
> +	__u64	start_index;

Please add some padding here, in case we need more information passed down. 16 bytes should be enough.
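
For instance (only a sketch, the field name is a placeholder):

/* For KVM_PPC_GET_HTAB_FD */
struct kvm_get_htab_fd {
	__u64	flags;
	__u64	start_index;
	__u64	reserved[2];	/* padding for future extensions */
};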

The actual protocol looks quite good :). And if it doesn't work for us anymore, we can always bump it to a new protocol using the flags in the above ioctl.


Alex

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/4] KVM: PPC: Book3S HV: Restructure HPT entry creation code
  2012-11-14  4:31   ` Paul Mackerras
@ 2012-11-19 12:54     ` Alexander Graf
  -1 siblings, 0 replies; 24+ messages in thread
From: Alexander Graf @ 2012-11-19 12:54 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: kvm, kvm-ppc


On 14.11.2012, at 05:31, Paul Mackerras wrote:

> This restructures the code that creates HPT (hashed page table)
> entries so that it can be called in situations where we don't have a
> struct vcpu pointer, only a struct kvm pointer.  It also fixes a bug
> where kvmppc_map_vrma() would corrupt the guest R4 value.
> 
> Most of the work of kvmppc_virtmode_h_enter is now done by a new
> function, kvmppc_virtmode_do_h_enter, which itself calls another new
> function, kvmppc_do_h_enter, which contains most of the old
> kvmppc_h_enter.  The new kvmppc_do_h_enter takes explicit arguments
> for the place to return the HPTE index, the Linux page tables to use,
> and whether it is being called in real mode, thus removing the need
> for it to have the vcpu as an argument.
> 
> Currently kvmppc_map_vrma creates the VRMA (virtual real mode area)
> HPTEs by calling kvmppc_virtmode_h_enter, which is designed primarily
> to handle H_ENTER hcalls from the guest that need to pin a page of
> memory.  Since H_ENTER returns the index of the created HPTE in R4,
> kvmppc_virtmode_h_enter updates the guest R4, corrupting the guest R4
> in the case when it gets called from kvmppc_map_vrma on the first
> VCPU_RUN ioctl.  With this, kvmppc_map_vrma instead calls
> kvmppc_virtmode_do_h_enter with the address of a dummy word as the
> place to store the HPTE index, thus avoiding corrupting the guest R4.
> 
> Signed-off-by: Paul Mackerras <paulus@samba.org>

Thanks, applied to kvm-ppc-next.


Alex

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/4] KVM: PPC: Book3S HV: Add a mechanism for recording modified HPTEs
  2012-11-19 12:36     ` Alexander Graf
@ 2012-11-19 17:20       ` Scott Wood
  -1 siblings, 0 replies; 24+ messages in thread
From: Scott Wood @ 2012-11-19 17:20 UTC (permalink / raw)
  To: Alexander Graf; +Cc: Paul Mackerras, kvm, kvm-ppc

On 11/19/2012 06:36:42 AM, Alexander Graf wrote:
> 
> On 14.11.2012, at 05:32, Paul Mackerras wrote:
> 
> > +/*
> > + * Note modification of an HPTE; set the HPTE modified bit
> > + * if it wasn't modified before and anyone is interested.
> > + */
> > +static inline void note_hpte_modification(struct kvm *kvm,
> > +					  struct revmap_entry *rev)
> > +{
> > +	if (!(rev->guest_rpte & HPTE_GR_MODIFIED) &&
> > +	    atomic_read(&kvm->arch.hpte_mod_interest))
> 
> Aren't atomic reads expensive?

No, it's a simple load instruction (done with inline asm so the
compiler can't break it up into smaller loads).
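
For reference, the powerpc atomic_read() is essentially just this
(from arch/powerpc/include/asm/atomic.h, slightly abridged):

static inline int atomic_read(const atomic_t *v)
{
	int t;

	__asm__ __volatile__("lwz%U1%X1 %0,%1" : "=r"(t) : "m"(v->counter));
	return t;
}

i.e. a single lwz, with no lwarx/stwcx. loop involved.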

-Scott

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/4] KVM: PPC: Book3S HV: Add a mechanism for recording modified HPTEs
  2012-11-19 17:20       ` Scott Wood
@ 2012-11-19 17:40         ` Alexander Graf
  -1 siblings, 0 replies; 24+ messages in thread
From: Alexander Graf @ 2012-11-19 17:40 UTC (permalink / raw)
  To: Scott Wood; +Cc: Paul Mackerras, kvm, kvm-ppc


On 19.11.2012, at 18:20, Scott Wood wrote:

> On 11/19/2012 06:36:42 AM, Alexander Graf wrote:
>> On 14.11.2012, at 05:32, Paul Mackerras wrote:
>> > +/*
>> > + * Note modification of an HPTE; set the HPTE modified bit
>> > + * if it wasn't modified before and anyone is interested.
>> > + */
>> > +static inline void note_hpte_modification(struct kvm *kvm,
>> > +					  struct revmap_entry *rev)
>> > +{
>> > +	if (!(rev->guest_rpte & HPTE_GR_MODIFIED) &&
>> > +	    atomic_read(&kvm->arch.hpte_mod_interest))
>> Aren't atomic reads expensive?
> 
> No, it's a simple load instruction (done with inline asm so the compiler can't break it up into smaller loads).

Oh, you're right. Fun. I would've assumed that some cache line locking needs to be done to guarantee atomicity :). But this of course makes things even easier.


Alex

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/4] KVM: PPC: Book3S HV: Add a mechanism for recording modified HPTEs
  2012-11-19 12:36     ` Alexander Graf
@ 2012-11-19 23:25       ` Paul Mackerras
  -1 siblings, 0 replies; 24+ messages in thread
From: Paul Mackerras @ 2012-11-19 23:25 UTC (permalink / raw)
  To: Alexander Graf; +Cc: kvm, kvm-ppc

On Mon, Nov 19, 2012 at 01:36:42PM +0100, Alexander Graf wrote:
> 
> Aren't atomic reads expensive? Doing one on every guest HPTE modification (when nobody is listening) sounds quite excessive. Wouldn't it be cheaper to just always update the bit?

As Scott said, it's just an ordinary load.  The reason for not setting
the bit always is to make the first pass of reading the HPT quicker.
That first pass sends all HPTEs, and clears the modified bits of any
HPTEs that have it set.  If the modified bit is set, we have to lock
the HPTE in order to clear the modified bit, even if the HPTE is
invalid, whereas we don't have to lock invalid HPTEs if they don't
have the modified bit set.
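
To illustrate, the expensive path in record_hpte() in patch 4 looks
roughly like this (simplified):

	preempt_disable();
	while (!try_lock_hpte(hptp, HPTE_V_HVLOCK))
		cpu_relax();
	/* ... read the HPTE and clear HPTE_GR_MODIFIED ... */
	hptp[0] &= ~HPTE_V_HVLOCK;
	preempt_enable();

whereas an invalid HPTE with the modified bit clear is skipped without
ever taking the lock.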

> It probably makes sense to make this a RESERVED mask, so that if we ever want to use another bit for host information, we can just add that to the respective mask. Also, you should probably clear the reserved bits (or bail out when set) on h_enter.

OK, and yes good point about clearing the reserved bits on H_ENTER.

Regards,
Paul.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 4/4] KVM: PPC: Book3S HV: Provide a method for userspace to read and write the HPT
  2012-11-19 12:53     ` Alexander Graf
@ 2012-11-19 23:59       ` Paul Mackerras
  -1 siblings, 0 replies; 24+ messages in thread
From: Paul Mackerras @ 2012-11-19 23:59 UTC (permalink / raw)
  To: Alexander Graf; +Cc: kvm, kvm-ppc

On Mon, Nov 19, 2012 at 01:53:44PM +0100, Alexander Graf wrote:

> > +static inline unsigned long slb_pgsize_encoding(unsigned long psize)
> > +{
> > +	unsigned long senc = 0;
> > +
> > +	if (psize > 0x1000) {
> > +		senc = SLB_VSID_L;
> > +		if (psize == 0x10000)
> > +			senc |= SLB_VSID_LP_01;
> 
> Is this always accurate?

It's accurate for page sizes of 4k, 64k and 16M on POWER7, and for
page sizes of 4k and 16M on PPC970, which is all we support.  This
function was just moved from book3s_hv.c anyway.  I'll add a comment
to it.
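
Something along these lines, say (a sketch of the comment I have in
mind, not necessarily the final wording):

	/*
	 * This works for page sizes of 4k, 64k and 16M on POWER7,
	 * and 4k and 16M on PPC970.
	 */
	static inline unsigned long slb_pgsize_encoding(unsigned long psize)
	{
		unsigned long senc = 0;

		if (psize > 0x1000) {
			senc = SLB_VSID_L;
			if (psize == 0x10000)
				senc |= SLB_VSID_LP_01;
		}
		return senc;
	}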

> > +/* For KVM_PPC_GET_HTAB_FD */
> > +struct kvm_get_htab_fd {
> > +	__u64	flags;
> > +	__u64	start_index;
> 
> Please add some padding here, in case we need more information passed down. 16 bytes should be enough.

OK.
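
Concretely, something like this, say (a sketch; the reserved field
name is just a placeholder):

	/* For KVM_PPC_GET_HTAB_FD */
	struct kvm_get_htab_fd {
		__u64	flags;
		__u64	start_index;
		__u64	reserved[2];	/* 16 bytes of padding for
					 * future expansion */
	};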

> The actual protocol looks quite good :). And if it doesn't work for us anymore, we can always bump it to a new protocol using the flags in the above ioctl.

Right.  That's why I made it give an error if any unknown flags are
set.
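
In other words, roughly this (a sketch, with ghf being the struct
copied in from userspace and the flag names as defined in this
series):

	/*
	 * Reject flags we don't understand, so they remain
	 * available for future protocol revisions.
	 */
	if (ghf.flags & ~(__u64)(KVM_GET_HTAB_BOLTED_ONLY |
				 KVM_GET_HTAB_WRITE))
		return -EINVAL;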

Regards,
Paul.

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2012-11-20  0:00 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-11-14  4:30 [PATCH 0/4] KVM: PPC: Book3S HV: HPT read/write functions for userspace Paul Mackerras
2012-11-14  4:31 ` [PATCH 1/4] KVM: PPC: Book3S HV: Restructure HPT entry creation code Paul Mackerras
2012-11-19 12:54   ` Alexander Graf
2012-11-14  4:32 ` [PATCH 2/4] KVM: PPC: Book3S HV: Add a mechanism for recording modified HPTEs Paul Mackerras
2012-11-19 12:36   ` Alexander Graf
2012-11-19 17:20     ` Scott Wood
2012-11-19 17:40       ` Alexander Graf
2012-11-19 23:25     ` Paul Mackerras
2012-11-14  4:32 ` [PATCH 3/4] KVM: PPC: Book3S HV: Make a HPTE removal function available Paul Mackerras
2012-11-14  4:33 ` [PATCH 4/4] KVM: PPC: Book3S HV: Provide a method for userspace to read and write the HPT Paul Mackerras
2012-11-19 12:53   ` Alexander Graf
2012-11-19 23:59     ` Paul Mackerras
