All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/4] Migrate non-migrated pages of a SVM.
@ 2020-06-18  9:19 ` Ram Pai
  0 siblings, 0 replies; 16+ messages in thread
From: Ram Pai @ 2020-06-18  9:19 UTC (permalink / raw)
  To: kvm-ppc, linuxppc-dev
  Cc: ldufour, linuxram, cclaudio, bharata, sathnaga, aneesh.kumar,
	sukadev, bauerman, david

This patch series migrates the non-migrated pages of a SVM.
This is required when the UV calls H_SVM_INIT_DONE, and
when a memory-slot is hotplugged to a Secure VM.

Testing: Passed rigorous SVM reboot test using different
	sized SVMs.

Changelog:
	. fixed a bug observed by Bharata. Pages that
	where paged-in and later paged-out must also be
	skipped from migration during H_SVM_INIT_DONE.

Laurent Dufour (1):
  KVM: PPC: Book3S HV: migrate hot plugged memory

Ram Pai (3):
  KVM: PPC: Book3S HV: Fix function definition in book3s_hv_uvmem.c
  KVM: PPC: Book3S HV: track the state GFNs associated with secure VMs
  KVM: PPC: Book3S HV: migrate remaining normal-GFNs to secure-GFNs in
    H_SVM_INIT_DONE

 Documentation/powerpc/ultravisor.rst        |   2 +
 arch/powerpc/include/asm/kvm_book3s_uvmem.h |   8 +-
 arch/powerpc/kvm/book3s_64_mmu_radix.c      |   2 +-
 arch/powerpc/kvm/book3s_hv.c                |  12 +-
 arch/powerpc/kvm/book3s_hv_uvmem.c          | 449 ++++++++++++++++++++++------
 5 files changed, 368 insertions(+), 105 deletions(-)

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v2 0/4] Migrate non-migrated pages of a SVM.
@ 2020-06-18  9:19 ` Ram Pai
  0 siblings, 0 replies; 16+ messages in thread
From: Ram Pai @ 2020-06-18  9:19 UTC (permalink / raw)
  To: kvm-ppc, linuxppc-dev
  Cc: ldufour, linuxram, cclaudio, bharata, sathnaga, aneesh.kumar,
	sukadev, bauerman, david

This patch series migrates the non-migrated pages of a SVM.
This is required when the UV calls H_SVM_INIT_DONE, and
when a memory-slot is hotplugged to a Secure VM.

Testing: Passed rigorous SVM reboot test using different
	sized SVMs.

Changelog:
	. fixed a bug observed by Bharata. Pages that
	where paged-in and later paged-out must also be
	skipped from migration during H_SVM_INIT_DONE.

Laurent Dufour (1):
  KVM: PPC: Book3S HV: migrate hot plugged memory

Ram Pai (3):
  KVM: PPC: Book3S HV: Fix function definition in book3s_hv_uvmem.c
  KVM: PPC: Book3S HV: track the state GFNs associated with secure VMs
  KVM: PPC: Book3S HV: migrate remaining normal-GFNs to secure-GFNs in
    H_SVM_INIT_DONE

 Documentation/powerpc/ultravisor.rst        |   2 +
 arch/powerpc/include/asm/kvm_book3s_uvmem.h |   8 +-
 arch/powerpc/kvm/book3s_64_mmu_radix.c      |   2 +-
 arch/powerpc/kvm/book3s_hv.c                |  12 +-
 arch/powerpc/kvm/book3s_hv_uvmem.c          | 449 ++++++++++++++++++++++------
 5 files changed, 368 insertions(+), 105 deletions(-)

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v2 1/4] KVM: PPC: Book3S HV: Fix function definition in book3s_hv_uvmem.c
  2020-06-18  9:19 ` Ram Pai
@ 2020-06-18  9:19   ` Ram Pai
  -1 siblings, 0 replies; 16+ messages in thread
From: Ram Pai @ 2020-06-18  9:19 UTC (permalink / raw)
  To: kvm-ppc, linuxppc-dev
  Cc: ldufour, linuxram, cclaudio, bharata, sathnaga, aneesh.kumar,
	sukadev, bauerman, david

Without this fix, git is confused. It generates wrong
function context for code changes in subsequent patches.
Weird, but true.

Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Bharata B Rao <bharata@linux.ibm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Laurent Dufour <ldufour@linux.ibm.com>
Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Claudio Carvalho <cclaudio@linux.ibm.com>
Cc: kvm-ppc@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/kvm/book3s_hv_uvmem.c | 21 ++++++++++-----------
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c b/arch/powerpc/kvm/book3s_hv_uvmem.c
index ad950f89..3599aaa 100644
--- a/arch/powerpc/kvm/book3s_hv_uvmem.c
+++ b/arch/powerpc/kvm/book3s_hv_uvmem.c
@@ -369,8 +369,7 @@ static struct page *kvmppc_uvmem_get_page(unsigned long gpa, struct kvm *kvm)
  * Alloc a PFN from private device memory pool and copy page from normal
  * memory to secure memory using UV_PAGE_IN uvcall.
  */
-static int
-kvmppc_svm_page_in(struct vm_area_struct *vma, unsigned long start,
+static int kvmppc_svm_page_in(struct vm_area_struct *vma, unsigned long start,
 		   unsigned long end, unsigned long gpa, struct kvm *kvm,
 		   unsigned long page_shift, bool *downgrade)
 {
@@ -437,8 +436,8 @@ static struct page *kvmppc_uvmem_get_page(unsigned long gpa, struct kvm *kvm)
  * In the former case, uses dev_pagemap_ops.migrate_to_ram handler
  * to unmap the device page from QEMU's page tables.
  */
-static unsigned long
-kvmppc_share_page(struct kvm *kvm, unsigned long gpa, unsigned long page_shift)
+static unsigned long kvmppc_share_page(struct kvm *kvm, unsigned long gpa,
+		unsigned long page_shift)
 {
 
 	int ret = H_PARAMETER;
@@ -487,9 +486,9 @@ static struct page *kvmppc_uvmem_get_page(unsigned long gpa, struct kvm *kvm)
  * H_PAGE_IN_SHARED flag makes the page shared which means that the same
  * memory in is visible from both UV and HV.
  */
-unsigned long
-kvmppc_h_svm_page_in(struct kvm *kvm, unsigned long gpa,
-		     unsigned long flags, unsigned long page_shift)
+unsigned long kvmppc_h_svm_page_in(struct kvm *kvm, unsigned long gpa,
+		unsigned long flags,
+		unsigned long page_shift)
 {
 	bool downgrade = false;
 	unsigned long start, end;
@@ -546,10 +545,10 @@ static struct page *kvmppc_uvmem_get_page(unsigned long gpa, struct kvm *kvm)
  * Provision a new page on HV side and copy over the contents
  * from secure memory using UV_PAGE_OUT uvcall.
  */
-static int
-kvmppc_svm_page_out(struct vm_area_struct *vma, unsigned long start,
-		    unsigned long end, unsigned long page_shift,
-		    struct kvm *kvm, unsigned long gpa)
+static int kvmppc_svm_page_out(struct vm_area_struct *vma,
+		unsigned long start,
+		unsigned long end, unsigned long page_shift,
+		struct kvm *kvm, unsigned long gpa)
 {
 	unsigned long src_pfn, dst_pfn = 0;
 	struct migrate_vma mig;
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v2 1/4] KVM: PPC: Book3S HV: Fix function definition in book3s_hv_uvmem.c
@ 2020-06-18  9:19   ` Ram Pai
  0 siblings, 0 replies; 16+ messages in thread
From: Ram Pai @ 2020-06-18  9:19 UTC (permalink / raw)
  To: kvm-ppc, linuxppc-dev
  Cc: ldufour, linuxram, cclaudio, bharata, sathnaga, aneesh.kumar,
	sukadev, bauerman, david

Without this fix, git is confused. It generates wrong
function context for code changes in subsequent patches.
Weird, but true.

Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Bharata B Rao <bharata@linux.ibm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Laurent Dufour <ldufour@linux.ibm.com>
Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Claudio Carvalho <cclaudio@linux.ibm.com>
Cc: kvm-ppc@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/kvm/book3s_hv_uvmem.c | 21 ++++++++++-----------
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c b/arch/powerpc/kvm/book3s_hv_uvmem.c
index ad950f89..3599aaa 100644
--- a/arch/powerpc/kvm/book3s_hv_uvmem.c
+++ b/arch/powerpc/kvm/book3s_hv_uvmem.c
@@ -369,8 +369,7 @@ static struct page *kvmppc_uvmem_get_page(unsigned long gpa, struct kvm *kvm)
  * Alloc a PFN from private device memory pool and copy page from normal
  * memory to secure memory using UV_PAGE_IN uvcall.
  */
-static int
-kvmppc_svm_page_in(struct vm_area_struct *vma, unsigned long start,
+static int kvmppc_svm_page_in(struct vm_area_struct *vma, unsigned long start,
 		   unsigned long end, unsigned long gpa, struct kvm *kvm,
 		   unsigned long page_shift, bool *downgrade)
 {
@@ -437,8 +436,8 @@ static struct page *kvmppc_uvmem_get_page(unsigned long gpa, struct kvm *kvm)
  * In the former case, uses dev_pagemap_ops.migrate_to_ram handler
  * to unmap the device page from QEMU's page tables.
  */
-static unsigned long
-kvmppc_share_page(struct kvm *kvm, unsigned long gpa, unsigned long page_shift)
+static unsigned long kvmppc_share_page(struct kvm *kvm, unsigned long gpa,
+		unsigned long page_shift)
 {
 
 	int ret = H_PARAMETER;
@@ -487,9 +486,9 @@ static struct page *kvmppc_uvmem_get_page(unsigned long gpa, struct kvm *kvm)
  * H_PAGE_IN_SHARED flag makes the page shared which means that the same
  * memory in is visible from both UV and HV.
  */
-unsigned long
-kvmppc_h_svm_page_in(struct kvm *kvm, unsigned long gpa,
-		     unsigned long flags, unsigned long page_shift)
+unsigned long kvmppc_h_svm_page_in(struct kvm *kvm, unsigned long gpa,
+		unsigned long flags,
+		unsigned long page_shift)
 {
 	bool downgrade = false;
 	unsigned long start, end;
@@ -546,10 +545,10 @@ static struct page *kvmppc_uvmem_get_page(unsigned long gpa, struct kvm *kvm)
  * Provision a new page on HV side and copy over the contents
  * from secure memory using UV_PAGE_OUT uvcall.
  */
-static int
-kvmppc_svm_page_out(struct vm_area_struct *vma, unsigned long start,
-		    unsigned long end, unsigned long page_shift,
-		    struct kvm *kvm, unsigned long gpa)
+static int kvmppc_svm_page_out(struct vm_area_struct *vma,
+		unsigned long start,
+		unsigned long end, unsigned long page_shift,
+		struct kvm *kvm, unsigned long gpa)
 {
 	unsigned long src_pfn, dst_pfn = 0;
 	struct migrate_vma mig;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v2 2/4] KVM: PPC: Book3S HV: track the state GFNs associated with secure VMs
  2020-06-18  9:19 ` Ram Pai
@ 2020-06-18  9:19   ` Ram Pai
  -1 siblings, 0 replies; 16+ messages in thread
From: Ram Pai @ 2020-06-18  9:19 UTC (permalink / raw)
  To: kvm-ppc, linuxppc-dev
  Cc: ldufour, linuxram, cclaudio, bharata, sathnaga, aneesh.kumar,
	sukadev, bauerman, david

During the life of SVM, its GFNs transition through normal, secure and
shared states. Since the kernel does not track GFNs that are shared, it
is not possible to disambiguate a shared GFN from a GFN whose PFN has
not yet been migrated to a secure-PFN. Also it is not possible to
disambiguate a secure-GFN from a GFN whose GFN has been pagedout from
the ultravisor.

The ability to identify the state of a GFN is needed to skip migration of its
PFN to secure-PFN during ESM transition.

The code is re-organized to track the states of a GFN as explained
below.

************************************************************************
 1. States of a GFN
    ---------------
 The GFN can be in one of the following states.

 (a) Secure - The GFN is secure. The GFN is associated with
 	a Secure VM, the contents of the GFN is not accessible
 	to the Hypervisor.  This GFN can be backed by a secure-PFN,
 	or can be backed by a normal-PFN with contents encrypted.
 	The former is true when the GFN is paged-in into the
 	ultravisor. The latter is true when the GFN is paged-out
 	of the ultravisor.

 (b) Shared - The GFN is shared. The GFN is associated with a
 	a secure VM. The contents of the GFN is accessible to
 	Hypervisor. This GFN is backed by a normal-PFN and its
 	content is un-encrypted.

 (c) Normal - The GFN is a normal. The GFN is associated with
 	a normal VM. The contents of the GFN is accesible to
 	the Hypervisor. Its content is never encrypted.

 2. States of a VM.
    ---------------

 (a) Normal VM:  A VM whose contents are always accessible to
 	the hypervisor.  All its GFNs are normal-GFNs.

 (b) Secure VM: A VM whose contents are not accessible to the
 	hypervisor without the VM's consent.  Its GFNs are
 	either Shared-GFN or Secure-GFNs.

 (c) Transient VM: A Normal VM that is transitioning to secure VM.
 	The transition starts on successful return of
 	H_SVM_INIT_START, and ends on successful return
 	of H_SVM_INIT_DONE. This transient VM, can have GFNs
 	in any of the three states; i.e Secure-GFN, Shared-GFN,
 	and Normal-GFN.	The VM never executes in this state
 	in supervisor-mode.

 3. Memory slot State.
    ------------------
  	The state of a memory slot mirrors the state of the
  	VM the memory slot is associated with.

 4. VM State transition.
    --------------------

  A VM always starts in Normal Mode.

  H_SVM_INIT_START moves the VM into transient state. During this
  time the Ultravisor may request some of its GFNs to be shared or
  secured. So its GFNs can be in one of the three GFN states.

  H_SVM_INIT_DONE moves the VM entirely from transient state to
  secure-state. At this point any left-over normal-GFNs are
  transitioned to Secure-GFN.

  H_SVM_INIT_ABORT moves the transient VM back to normal VM.
  All its GFNs are moved to Normal-GFNs.

  UV_TERMINATE transitions the secure-VM back to normal-VM. All
  the secure-GFN and shared-GFNs are tranistioned to normal-GFN
  Note: The contents of the normal-GFN is undefined at this point.

 5. GFN state implementation:
    -------------------------

 Secure GFN is associated with a secure-PFN; also called uvmem_pfn,
 when the GFN is paged-in. Its pfn[] has KVMPPC_GFN_UVMEM_PFN flag
 set, and contains the value of the secure-PFN.
 It is associated with a normal-PFN; also called mem_pfn, when
 the GFN is pagedout. Its pfn[] has KVMPPC_GFN_MEM_PFN flag set.
 The value of the normal-PFN is not tracked.

 Shared GFN is associated with a normal-PFN. Its pfn[] has
 KVMPPC_UVMEM_SHARED_PFN flag set. The value of the normal-PFN
 is not tracked.

 Normal GFN is associated with normal-PFN. Its pfn[] has
 no flag set. The value of the normal-PFN is not tracked.

 6. Life cycle of a GFN
    --------------------
 --------------------------------------------------------------
 |        |     Share  |  Unshare | SVM       |H_SVM_INIT_DONE|
 |        |operation   |operation | abort/    |               |
 |        |            |          | terminate |               |
 -------------------------------------------------------------
 |        |            |          |           |               |
 | Secure |     Shared | Secure   |Normal     |Secure         |
 |        |            |          |           |               |
 | Shared |     Shared | Secure   |Normal     |Shared         |
 |        |            |          |           |               |
 | Normal |     Shared | Secure   |Normal     |Secure         |
 --------------------------------------------------------------

 7. Life cycle of a VM
    --------------------
 --------------------------------------------------------------------
 |         |  start    |  H_SVM_  |H_SVM_   |H_SVM_     |UV_SVM_    |
 |         |  VM       |INIT_START|INIT_DONE|INIT_ABORT |TERMINATE  |
 |         |           |          |         |           |           |
 --------- ----------------------------------------------------------
 |         |           |          |         |           |           |
 | Normal  | Normal    | Transient|Error    |Error      |Normal     |
 |         |           |          |         |           |           |
 | Secure  |   Error   | Error    |Error    |Error      |Normal     |
 |         |           |          |         |           |           |
 |Transient|   N/A     | Error    |Secure   |Normal     |Normal     |
 --------------------------------------------------------------------

************************************************************************

Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Bharata B Rao <bharata@linux.ibm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Laurent Dufour <ldufour@linux.ibm.com>
Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Claudio Carvalho <cclaudio@linux.ibm.com>
Cc: kvm-ppc@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/kvm_book3s_uvmem.h |   6 +-
 arch/powerpc/kvm/book3s_64_mmu_radix.c      |   2 +-
 arch/powerpc/kvm/book3s_hv.c                |   2 +-
 arch/powerpc/kvm/book3s_hv_uvmem.c          | 195 +++++++++++++++++++++++++---
 4 files changed, 180 insertions(+), 25 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s_uvmem.h b/arch/powerpc/include/asm/kvm_book3s_uvmem.h
index 5a9834e..f0c5708 100644
--- a/arch/powerpc/include/asm/kvm_book3s_uvmem.h
+++ b/arch/powerpc/include/asm/kvm_book3s_uvmem.h
@@ -21,7 +21,8 @@ unsigned long kvmppc_h_svm_page_out(struct kvm *kvm,
 int kvmppc_send_page_to_uv(struct kvm *kvm, unsigned long gfn);
 unsigned long kvmppc_h_svm_init_abort(struct kvm *kvm);
 void kvmppc_uvmem_drop_pages(const struct kvm_memory_slot *free,
-			     struct kvm *kvm, bool skip_page_out);
+			     struct kvm *kvm, bool skip_page_out,
+			     bool purge_gfn);
 #else
 static inline int kvmppc_uvmem_init(void)
 {
@@ -75,6 +76,7 @@ static inline int kvmppc_send_page_to_uv(struct kvm *kvm, unsigned long gfn)
 
 static inline void
 kvmppc_uvmem_drop_pages(const struct kvm_memory_slot *free,
-			struct kvm *kvm, bool skip_page_out) { }
+			struct kvm *kvm, bool skip_page_out,
+			bool purge_gfn) { }
 #endif /* CONFIG_PPC_UV */
 #endif /* __ASM_KVM_BOOK3S_UVMEM_H__ */
diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c
index 803940d..3448459 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
@@ -1100,7 +1100,7 @@ void kvmppc_radix_flush_memslot(struct kvm *kvm,
 	unsigned int shift;
 
 	if (kvm->arch.secure_guest & KVMPPC_SECURE_INIT_START)
-		kvmppc_uvmem_drop_pages(memslot, kvm, true);
+		kvmppc_uvmem_drop_pages(memslot, kvm, true, false);
 
 	if (kvm->arch.secure_guest & KVMPPC_SECURE_INIT_DONE)
 		return;
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 6717d24..6cf80e5 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -5482,7 +5482,7 @@ static int kvmhv_svm_off(struct kvm *kvm)
 			continue;
 
 		kvm_for_each_memslot(memslot, slots) {
-			kvmppc_uvmem_drop_pages(memslot, kvm, true);
+			kvmppc_uvmem_drop_pages(memslot, kvm, true, true);
 			uv_unregister_mem_slot(kvm->arch.lpid, memslot->id);
 		}
 	}
diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c b/arch/powerpc/kvm/book3s_hv_uvmem.c
index 3599aaa..666d1bb 100644
--- a/arch/powerpc/kvm/book3s_hv_uvmem.c
+++ b/arch/powerpc/kvm/book3s_hv_uvmem.c
@@ -98,7 +98,127 @@
 static unsigned long *kvmppc_uvmem_bitmap;
 static DEFINE_SPINLOCK(kvmppc_uvmem_bitmap_lock);
 
-#define KVMPPC_UVMEM_PFN	(1UL << 63)
+/*
+ * States of a GFN
+ * ---------------
+ * The GFN can be in one of the following states.
+ *
+ * (a) Secure - The GFN is secure. The GFN is associated with
+ *	a Secure VM, the contents of the GFN is not accessible
+ *	to the Hypervisor.  This GFN can be backed by a secure-PFN,
+ *	or can be backed by a normal-PFN with contents encrypted.
+ *	The former is true when the GFN is paged-in into the
+ *	ultravisor. The latter is true when the GFN is paged-out
+ *	of the ultravisor.
+ *
+ * (b) Shared - The GFN is shared. The GFN is associated with a
+ *	a secure VM. The contents of the GFN is accessible to
+ *	Hypervisor. This GFN is backed by a normal-PFN and its
+ *	content is un-encrypted.
+ *
+ * (c) Normal - The GFN is a normal. The GFN is associated with
+ *	a normal VM. The contents of the GFN is accesible to
+ *	the Hypervisor. Its content is never encrypted.
+ *
+ * States of a VM.
+ * ---------------
+ *
+ * Normal VM:  A VM whose contents are always accessible to
+ *	the hypervisor.  All its GFNs are normal-GFNs.
+ *
+ * Secure VM: A VM whose contents are not accessible to the
+ *	hypervisor without the VM's consent.  Its GFNs are
+ *	either Shared-GFN or Secure-GFNs.
+ *
+ * Transient VM: A Normal VM that is transitioning to secure VM.
+ *	The transition starts on successful return of
+ *	H_SVM_INIT_START, and ends on successful return
+ *	of H_SVM_INIT_DONE. This transient VM, can have GFNs
+ *	in any of the three states; i.e Secure-GFN, Shared-GFN,
+ *	and Normal-GFN.	The VM never executes in this state
+ *	in supervisor-mode.
+ *
+ * Memory slot State.
+ * -----------------------------
+ *	The state of a memory slot mirrors the state of the
+ *	VM the memory slot is associated with.
+ *
+ * VM State transition.
+ * --------------------
+ *
+ *  A VM always starts in Normal Mode.
+ *
+ *  H_SVM_INIT_START moves the VM into transient state. During this
+ *  time the Ultravisor may request some of its GFNs to be shared or
+ *  secured. So its GFNs can be in one of the three GFN states.
+ *
+ *  H_SVM_INIT_DONE moves the VM entirely from transient state to
+ *  secure-state. At this point any left-over normal-GFNs are
+ *  transitioned to Secure-GFN.
+ *
+ *  H_SVM_INIT_ABORT moves the transient VM back to normal VM.
+ *  All its GFNs are moved to Normal-GFNs.
+ *
+ *  UV_TERMINATE transitions the secure-VM back to normal-VM. All
+ *  the secure-GFN and shared-GFNs are tranistioned to normal-GFN
+ *  Note: The contents of the normal-GFN is undefined at this point.
+ *
+ * GFN state implementation:
+ * -------------------------
+ *
+ * Secure GFN is associated with a secure-PFN; also called uvmem_pfn,
+ * when the GFN is paged-in. Its pfn[] has KVMPPC_GFN_UVMEM_PFN flag
+ * set, and contains the value of the secure-PFN.
+ * It is associated with a normal-PFN; also called mem_pfn, when
+ * the GFN is pagedout. Its pfn[] has KVMPPC_GFN_MEM_PFN flag set.
+ * The value of the normal-PFN is not tracked.
+ *
+ * Shared GFN is associated with a normal-PFN. Its pfn[] has
+ * KVMPPC_UVMEM_SHARED_PFN flag set. The value of the normal-PFN
+ * is not tracked.
+ *
+ * Normal GFN is associated with normal-PFN. Its pfn[] has
+ * no flag set. The value of the normal-PFN is not tracked.
+ *
+ * Life cycle of a GFN
+ * --------------------
+ *
+ * --------------------------------------------------------------
+ * |        |     Share  |  Unshare | SVM       |H_SVM_INIT_DONE|
+ * |        |operation   |operation | abort/    |               |
+ * |        |            |          | terminate |               |
+ * -------------------------------------------------------------
+ * |        |            |          |           |               |
+ * | Secure |     Shared | Secure   |Normal     |Secure         |
+ * |        |            |          |           |               |
+ * | Shared |     Shared | Secure   |Normal     |Shared         |
+ * |        |            |          |           |               |
+ * | Normal |     Shared | Secure   |Normal     |Secure         |
+ * --------------------------------------------------------------
+ *
+ * Life cycle of a VM
+ * --------------------
+ *
+ * --------------------------------------------------------------------
+ * |         |  start    |  H_SVM_  |H_SVM_   |H_SVM_     |UV_SVM_    |
+ * |         |  VM       |INIT_START|INIT_DONE|INIT_ABORT |TERMINATE  |
+ * |         |           |          |         |           |           |
+ * --------- ----------------------------------------------------------
+ * |         |           |          |         |           |           |
+ * | Normal  | Normal    | Transient|Error    |Error      |Normal     |
+ * |         |           |          |         |           |           |
+ * | Secure  |   Error   | Error    |Error    |Error      |Normal     |
+ * |         |           |          |         |           |           |
+ * |Transient|   N/A     | Error    |Secure   |Normal     |Normal     |
+ * --------------------------------------------------------------------
+ */
+
+#define KVMPPC_GFN_UVMEM_PFN	(1UL << 63)
+#define KVMPPC_GFN_MEM_PFN	(1UL << 62)
+#define KVMPPC_GFN_SHARED	(1UL << 61)
+#define KVMPPC_GFN_SECURE	(KVMPPC_GFN_UVMEM_PFN | KVMPPC_GFN_MEM_PFN)
+#define KVMPPC_GFN_FLAG_MASK	(KVMPPC_GFN_SECURE | KVMPPC_GFN_SHARED)
+#define KVMPPC_GFN_PFN_MASK	(~KVMPPC_GFN_FLAG_MASK)
 
 struct kvmppc_uvmem_slot {
 	struct list_head list;
@@ -106,11 +226,11 @@ struct kvmppc_uvmem_slot {
 	unsigned long base_pfn;
 	unsigned long *pfns;
 };
-
 struct kvmppc_uvmem_page_pvt {
 	struct kvm *kvm;
 	unsigned long gpa;
 	bool skip_page_out;
+	bool purge_gfn;
 };
 
 int kvmppc_uvmem_slot_init(struct kvm *kvm, const struct kvm_memory_slot *slot)
@@ -154,8 +274,8 @@ void kvmppc_uvmem_slot_free(struct kvm *kvm, const struct kvm_memory_slot *slot)
 	mutex_unlock(&kvm->arch.uvmem_lock);
 }
 
-static void kvmppc_uvmem_pfn_insert(unsigned long gfn, unsigned long uvmem_pfn,
-				    struct kvm *kvm)
+static void kvmppc_mark_gfn(unsigned long gfn, struct kvm *kvm,
+			unsigned long flag, unsigned long uvmem_pfn)
 {
 	struct kvmppc_uvmem_slot *p;
 
@@ -163,24 +283,41 @@ static void kvmppc_uvmem_pfn_insert(unsigned long gfn, unsigned long uvmem_pfn,
 		if (gfn >= p->base_pfn && gfn < p->base_pfn + p->nr_pfns) {
 			unsigned long index = gfn - p->base_pfn;
 
-			p->pfns[index] = uvmem_pfn | KVMPPC_UVMEM_PFN;
+			if (flag == KVMPPC_GFN_UVMEM_PFN)
+				p->pfns[index] = uvmem_pfn | flag;
+			else
+				p->pfns[index] = flag;
 			return;
 		}
 	}
 }
 
-static void kvmppc_uvmem_pfn_remove(unsigned long gfn, struct kvm *kvm)
+/* mark the GFN as secure-GFN associated with @uvmem pfn device-PFN. */
+static void kvmppc_gfn_secure_uvmem_pfn(unsigned long gfn,
+			unsigned long uvmem_pfn, struct kvm *kvm)
 {
-	struct kvmppc_uvmem_slot *p;
+	kvmppc_mark_gfn(gfn, kvm, KVMPPC_GFN_UVMEM_PFN, uvmem_pfn);
+}
 
-	list_for_each_entry(p, &kvm->arch.uvmem_pfns, list) {
-		if (gfn >= p->base_pfn && gfn < p->base_pfn + p->nr_pfns) {
-			p->pfns[gfn - p->base_pfn] = 0;
-			return;
-		}
-	}
+/* mark the GFN as secure-GFN associated with a memory-PFN. */
+static void kvmppc_gfn_secure_mem_pfn(unsigned long gfn, struct kvm *kvm)
+{
+	kvmppc_mark_gfn(gfn, kvm, KVMPPC_GFN_MEM_PFN, 0);
 }
 
+/* mark the GFN as a shared GFN. */
+static void kvmppc_gfn_shared(unsigned long gfn, struct kvm *kvm)
+{
+	kvmppc_mark_gfn(gfn, kvm, KVMPPC_GFN_SHARED, 0);
+}
+
+/* mark the GFN as a non-existent GFN. */
+static void kvmppc_gfn_remove(unsigned long gfn, struct kvm *kvm)
+{
+	kvmppc_mark_gfn(gfn, kvm, 0, 0);
+}
+
+/* return true, if the GFN is a secure-GFN backed by a secure-PFN */
 static bool kvmppc_gfn_is_uvmem_pfn(unsigned long gfn, struct kvm *kvm,
 				    unsigned long *uvmem_pfn)
 {
@@ -190,10 +327,10 @@ static bool kvmppc_gfn_is_uvmem_pfn(unsigned long gfn, struct kvm *kvm,
 		if (gfn >= p->base_pfn && gfn < p->base_pfn + p->nr_pfns) {
 			unsigned long index = gfn - p->base_pfn;
 
-			if (p->pfns[index] & KVMPPC_UVMEM_PFN) {
+			if (p->pfns[index] & KVMPPC_GFN_UVMEM_PFN) {
 				if (uvmem_pfn)
 					*uvmem_pfn = p->pfns[index] &
-						     ~KVMPPC_UVMEM_PFN;
+						     KVMPPC_GFN_PFN_MASK;
 				return true;
 			} else
 				return false;
@@ -257,9 +394,13 @@ unsigned long kvmppc_h_svm_init_done(struct kvm *kvm)
  * is HV side fault on these pages. Next we *get* these pages, forcing
  * fault on them, do fault time migration to replace the device PTEs in
  * QEMU page table with normal PTEs from newly allocated pages.
+ *
+ * if @purge_gfn is set, invalidate the GFN. GFN is not shared nor secure
+ * anymore.
  */
 void kvmppc_uvmem_drop_pages(const struct kvm_memory_slot *free,
-			     struct kvm *kvm, bool skip_page_out)
+			     struct kvm *kvm, bool skip_page_out,
+			     bool purge_gfn)
 {
 	int i;
 	struct kvmppc_uvmem_page_pvt *pvt;
@@ -270,14 +411,17 @@ void kvmppc_uvmem_drop_pages(const struct kvm_memory_slot *free,
 		struct page *uvmem_page;
 
 		mutex_lock(&kvm->arch.uvmem_lock);
+
 		if (!kvmppc_gfn_is_uvmem_pfn(gfn, kvm, &uvmem_pfn)) {
+			if (purge_gfn)
+				kvmppc_gfn_remove(gfn, kvm);
 			mutex_unlock(&kvm->arch.uvmem_lock);
 			continue;
 		}
-
 		uvmem_page = pfn_to_page(uvmem_pfn);
 		pvt = uvmem_page->zone_device_data;
 		pvt->skip_page_out = skip_page_out;
+		pvt->purge_gfn = purge_gfn;
 		mutex_unlock(&kvm->arch.uvmem_lock);
 
 		pfn = gfn_to_pfn(kvm, gfn);
@@ -305,7 +449,7 @@ unsigned long kvmppc_h_svm_init_abort(struct kvm *kvm)
 	srcu_idx = srcu_read_lock(&kvm->srcu);
 
 	kvm_for_each_memslot(memslot, kvm_memslots(kvm))
-		kvmppc_uvmem_drop_pages(memslot, kvm, false);
+		kvmppc_uvmem_drop_pages(memslot, kvm, false, true);
 
 	srcu_read_unlock(&kvm->srcu, srcu_idx);
 
@@ -347,7 +491,7 @@ static struct page *kvmppc_uvmem_get_page(unsigned long gpa, struct kvm *kvm)
 		goto out_clear;
 
 	uvmem_pfn = bit + pfn_first;
-	kvmppc_uvmem_pfn_insert(gpa >> PAGE_SHIFT, uvmem_pfn, kvm);
+	kvmppc_gfn_secure_uvmem_pfn(gpa >> PAGE_SHIFT, uvmem_pfn, kvm);
 
 	pvt->gpa = gpa;
 	pvt->kvm = kvm;
@@ -454,6 +598,7 @@ static unsigned long kvmppc_share_page(struct kvm *kvm, unsigned long gpa,
 		uvmem_page = pfn_to_page(uvmem_pfn);
 		pvt = uvmem_page->zone_device_data;
 		pvt->skip_page_out = true;
+		pvt->purge_gfn = false;
 	}
 
 retry:
@@ -467,12 +612,16 @@ static unsigned long kvmppc_share_page(struct kvm *kvm, unsigned long gpa,
 		uvmem_page = pfn_to_page(uvmem_pfn);
 		pvt = uvmem_page->zone_device_data;
 		pvt->skip_page_out = true;
+		pvt->purge_gfn = false;
 		kvm_release_pfn_clean(pfn);
 		goto retry;
 	}
 
-	if (!uv_page_in(kvm->arch.lpid, pfn << page_shift, gpa, 0, page_shift))
+	if (!uv_page_in(kvm->arch.lpid, pfn << page_shift, gpa, 0,
+				page_shift)) {
+		kvmppc_gfn_shared(gfn, kvm);
 		ret = H_SUCCESS;
+	}
 	kvm_release_pfn_clean(pfn);
 	mutex_unlock(&kvm->arch.uvmem_lock);
 out:
@@ -530,6 +679,7 @@ unsigned long kvmppc_h_svm_page_in(struct kvm *kvm, unsigned long gpa,
 	if (!kvmppc_svm_page_in(vma, start, end, gpa, kvm, page_shift,
 				&downgrade))
 		ret = H_SUCCESS;
+
 out_unlock:
 	mutex_unlock(&kvm->arch.uvmem_lock);
 out:
@@ -655,7 +805,10 @@ static void kvmppc_uvmem_page_free(struct page *page)
 
 	pvt = page->zone_device_data;
 	page->zone_device_data = NULL;
-	kvmppc_uvmem_pfn_remove(pvt->gpa >> PAGE_SHIFT, pvt->kvm);
+	if (pvt->purge_gfn)
+		kvmppc_gfn_remove(pvt->gpa >> PAGE_SHIFT, pvt->kvm);
+	else
+		kvmppc_gfn_secure_mem_pfn(pvt->gpa >> PAGE_SHIFT, pvt->kvm);
 	kfree(pvt);
 }
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v2 2/4] KVM: PPC: Book3S HV: track the state GFNs associated with secure VMs
@ 2020-06-18  9:19   ` Ram Pai
  0 siblings, 0 replies; 16+ messages in thread
From: Ram Pai @ 2020-06-18  9:19 UTC (permalink / raw)
  To: kvm-ppc, linuxppc-dev
  Cc: ldufour, linuxram, cclaudio, bharata, sathnaga, aneesh.kumar,
	sukadev, bauerman, david

During the life of SVM, its GFNs transition through normal, secure and
shared states. Since the kernel does not track GFNs that are shared, it
is not possible to disambiguate a shared GFN from a GFN whose PFN has
not yet been migrated to a secure-PFN. Also it is not possible to
disambiguate a secure-GFN from a GFN whose GFN has been pagedout from
the ultravisor.

The ability to identify the state of a GFN is needed to skip migration of its
PFN to secure-PFN during ESM transition.

The code is re-organized to track the states of a GFN as explained
below.

************************************************************************
 1. States of a GFN
    ---------------
 The GFN can be in one of the following states.

 (a) Secure - The GFN is secure. The GFN is associated with
 	a Secure VM, the contents of the GFN is not accessible
 	to the Hypervisor.  This GFN can be backed by a secure-PFN,
 	or can be backed by a normal-PFN with contents encrypted.
 	The former is true when the GFN is paged-in into the
 	ultravisor. The latter is true when the GFN is paged-out
 	of the ultravisor.

 (b) Shared - The GFN is shared. The GFN is associated with a
 	a secure VM. The contents of the GFN is accessible to
 	Hypervisor. This GFN is backed by a normal-PFN and its
 	content is un-encrypted.

 (c) Normal - The GFN is a normal. The GFN is associated with
 	a normal VM. The contents of the GFN is accesible to
 	the Hypervisor. Its content is never encrypted.

 2. States of a VM.
    ---------------

 (a) Normal VM:  A VM whose contents are always accessible to
 	the hypervisor.  All its GFNs are normal-GFNs.

 (b) Secure VM: A VM whose contents are not accessible to the
 	hypervisor without the VM's consent.  Its GFNs are
 	either Shared-GFN or Secure-GFNs.

 (c) Transient VM: A Normal VM that is transitioning to secure VM.
 	The transition starts on successful return of
 	H_SVM_INIT_START, and ends on successful return
 	of H_SVM_INIT_DONE. This transient VM, can have GFNs
 	in any of the three states; i.e Secure-GFN, Shared-GFN,
 	and Normal-GFN.	The VM never executes in this state
 	in supervisor-mode.

 3. Memory slot State.
    ------------------
  	The state of a memory slot mirrors the state of the
  	VM the memory slot is associated with.

 4. VM State transition.
    --------------------

  A VM always starts in Normal Mode.

  H_SVM_INIT_START moves the VM into transient state. During this
  time the Ultravisor may request some of its GFNs to be shared or
  secured. So its GFNs can be in one of the three GFN states.

  H_SVM_INIT_DONE moves the VM entirely from transient state to
  secure-state. At this point any left-over normal-GFNs are
  transitioned to Secure-GFN.

  H_SVM_INIT_ABORT moves the transient VM back to normal VM.
  All its GFNs are moved to Normal-GFNs.

  UV_TERMINATE transitions the secure-VM back to normal-VM. All
  the secure-GFN and shared-GFNs are tranistioned to normal-GFN
  Note: The contents of the normal-GFN is undefined at this point.

 5. GFN state implementation:
    -------------------------

 Secure GFN is associated with a secure-PFN; also called uvmem_pfn,
 when the GFN is paged-in. Its pfn[] has KVMPPC_GFN_UVMEM_PFN flag
 set, and contains the value of the secure-PFN.
 It is associated with a normal-PFN; also called mem_pfn, when
 the GFN is pagedout. Its pfn[] has KVMPPC_GFN_MEM_PFN flag set.
 The value of the normal-PFN is not tracked.

 Shared GFN is associated with a normal-PFN. Its pfn[] has
 KVMPPC_UVMEM_SHARED_PFN flag set. The value of the normal-PFN
 is not tracked.

 Normal GFN is associated with normal-PFN. Its pfn[] has
 no flag set. The value of the normal-PFN is not tracked.

 6. Life cycle of a GFN
    --------------------
 --------------------------------------------------------------
 |        |     Share  |  Unshare | SVM       |H_SVM_INIT_DONE|
 |        |operation   |operation | abort/    |               |
 |        |            |          | terminate |               |
 -------------------------------------------------------------
 |        |            |          |           |               |
 | Secure |     Shared | Secure   |Normal     |Secure         |
 |        |            |          |           |               |
 | Shared |     Shared | Secure   |Normal     |Shared         |
 |        |            |          |           |               |
 | Normal |     Shared | Secure   |Normal     |Secure         |
 --------------------------------------------------------------

 7. Life cycle of a VM
    --------------------
 --------------------------------------------------------------------
 |         |  start    |  H_SVM_  |H_SVM_   |H_SVM_     |UV_SVM_    |
 |         |  VM       |INIT_START|INIT_DONE|INIT_ABORT |TERMINATE  |
 |         |           |          |         |           |           |
 --------- ----------------------------------------------------------
 |         |           |          |         |           |           |
 | Normal  | Normal    | Transient|Error    |Error      |Normal     |
 |         |           |          |         |           |           |
 | Secure  |   Error   | Error    |Error    |Error      |Normal     |
 |         |           |          |         |           |           |
 |Transient|   N/A     | Error    |Secure   |Normal     |Normal     |
 --------------------------------------------------------------------

************************************************************************

Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Bharata B Rao <bharata@linux.ibm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Laurent Dufour <ldufour@linux.ibm.com>
Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Claudio Carvalho <cclaudio@linux.ibm.com>
Cc: kvm-ppc@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/kvm_book3s_uvmem.h |   6 +-
 arch/powerpc/kvm/book3s_64_mmu_radix.c      |   2 +-
 arch/powerpc/kvm/book3s_hv.c                |   2 +-
 arch/powerpc/kvm/book3s_hv_uvmem.c          | 195 +++++++++++++++++++++++++---
 4 files changed, 180 insertions(+), 25 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s_uvmem.h b/arch/powerpc/include/asm/kvm_book3s_uvmem.h
index 5a9834e..f0c5708 100644
--- a/arch/powerpc/include/asm/kvm_book3s_uvmem.h
+++ b/arch/powerpc/include/asm/kvm_book3s_uvmem.h
@@ -21,7 +21,8 @@ unsigned long kvmppc_h_svm_page_out(struct kvm *kvm,
 int kvmppc_send_page_to_uv(struct kvm *kvm, unsigned long gfn);
 unsigned long kvmppc_h_svm_init_abort(struct kvm *kvm);
 void kvmppc_uvmem_drop_pages(const struct kvm_memory_slot *free,
-			     struct kvm *kvm, bool skip_page_out);
+			     struct kvm *kvm, bool skip_page_out,
+			     bool purge_gfn);
 #else
 static inline int kvmppc_uvmem_init(void)
 {
@@ -75,6 +76,7 @@ static inline int kvmppc_send_page_to_uv(struct kvm *kvm, unsigned long gfn)
 
 static inline void
 kvmppc_uvmem_drop_pages(const struct kvm_memory_slot *free,
-			struct kvm *kvm, bool skip_page_out) { }
+			struct kvm *kvm, bool skip_page_out,
+			bool purge_gfn) { }
 #endif /* CONFIG_PPC_UV */
 #endif /* __ASM_KVM_BOOK3S_UVMEM_H__ */
diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c
index 803940d..3448459 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
@@ -1100,7 +1100,7 @@ void kvmppc_radix_flush_memslot(struct kvm *kvm,
 	unsigned int shift;
 
 	if (kvm->arch.secure_guest & KVMPPC_SECURE_INIT_START)
-		kvmppc_uvmem_drop_pages(memslot, kvm, true);
+		kvmppc_uvmem_drop_pages(memslot, kvm, true, false);
 
 	if (kvm->arch.secure_guest & KVMPPC_SECURE_INIT_DONE)
 		return;
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 6717d24..6cf80e5 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -5482,7 +5482,7 @@ static int kvmhv_svm_off(struct kvm *kvm)
 			continue;
 
 		kvm_for_each_memslot(memslot, slots) {
-			kvmppc_uvmem_drop_pages(memslot, kvm, true);
+			kvmppc_uvmem_drop_pages(memslot, kvm, true, true);
 			uv_unregister_mem_slot(kvm->arch.lpid, memslot->id);
 		}
 	}
diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c b/arch/powerpc/kvm/book3s_hv_uvmem.c
index 3599aaa..666d1bb 100644
--- a/arch/powerpc/kvm/book3s_hv_uvmem.c
+++ b/arch/powerpc/kvm/book3s_hv_uvmem.c
@@ -98,7 +98,127 @@
 static unsigned long *kvmppc_uvmem_bitmap;
 static DEFINE_SPINLOCK(kvmppc_uvmem_bitmap_lock);
 
-#define KVMPPC_UVMEM_PFN	(1UL << 63)
+/*
+ * States of a GFN
+ * ---------------
+ * The GFN can be in one of the following states.
+ *
+ * (a) Secure - The GFN is secure. The GFN is associated with
+ *	a Secure VM, the contents of the GFN is not accessible
+ *	to the Hypervisor.  This GFN can be backed by a secure-PFN,
+ *	or can be backed by a normal-PFN with contents encrypted.
+ *	The former is true when the GFN is paged-in into the
+ *	ultravisor. The latter is true when the GFN is paged-out
+ *	of the ultravisor.
+ *
+ * (b) Shared - The GFN is shared. The GFN is associated with a
+ *	a secure VM. The contents of the GFN is accessible to
+ *	Hypervisor. This GFN is backed by a normal-PFN and its
+ *	content is un-encrypted.
+ *
+ * (c) Normal - The GFN is a normal. The GFN is associated with
+ *	a normal VM. The contents of the GFN is accesible to
+ *	the Hypervisor. Its content is never encrypted.
+ *
+ * States of a VM.
+ * ---------------
+ *
+ * Normal VM:  A VM whose contents are always accessible to
+ *	the hypervisor.  All its GFNs are normal-GFNs.
+ *
+ * Secure VM: A VM whose contents are not accessible to the
+ *	hypervisor without the VM's consent.  Its GFNs are
+ *	either Shared-GFN or Secure-GFNs.
+ *
+ * Transient VM: A Normal VM that is transitioning to secure VM.
+ *	The transition starts on successful return of
+ *	H_SVM_INIT_START, and ends on successful return
+ *	of H_SVM_INIT_DONE. This transient VM, can have GFNs
+ *	in any of the three states; i.e Secure-GFN, Shared-GFN,
+ *	and Normal-GFN.	The VM never executes in this state
+ *	in supervisor-mode.
+ *
+ * Memory slot State.
+ * -----------------------------
+ *	The state of a memory slot mirrors the state of the
+ *	VM the memory slot is associated with.
+ *
+ * VM State transition.
+ * --------------------
+ *
+ *  A VM always starts in Normal Mode.
+ *
+ *  H_SVM_INIT_START moves the VM into transient state. During this
+ *  time the Ultravisor may request some of its GFNs to be shared or
+ *  secured. So its GFNs can be in one of the three GFN states.
+ *
+ *  H_SVM_INIT_DONE moves the VM entirely from transient state to
+ *  secure-state. At this point any left-over normal-GFNs are
+ *  transitioned to Secure-GFN.
+ *
+ *  H_SVM_INIT_ABORT moves the transient VM back to normal VM.
+ *  All its GFNs are moved to Normal-GFNs.
+ *
+ *  UV_TERMINATE transitions the secure-VM back to normal-VM. All
+ *  the secure-GFN and shared-GFNs are tranistioned to normal-GFN
+ *  Note: The contents of the normal-GFN is undefined at this point.
+ *
+ * GFN state implementation:
+ * -------------------------
+ *
+ * Secure GFN is associated with a secure-PFN; also called uvmem_pfn,
+ * when the GFN is paged-in. Its pfn[] has KVMPPC_GFN_UVMEM_PFN flag
+ * set, and contains the value of the secure-PFN.
+ * It is associated with a normal-PFN; also called mem_pfn, when
+ * the GFN is pagedout. Its pfn[] has KVMPPC_GFN_MEM_PFN flag set.
+ * The value of the normal-PFN is not tracked.
+ *
+ * Shared GFN is associated with a normal-PFN. Its pfn[] has
+ * KVMPPC_UVMEM_SHARED_PFN flag set. The value of the normal-PFN
+ * is not tracked.
+ *
+ * Normal GFN is associated with normal-PFN. Its pfn[] has
+ * no flag set. The value of the normal-PFN is not tracked.
+ *
+ * Life cycle of a GFN
+ * --------------------
+ *
+ * --------------------------------------------------------------
+ * |        |     Share  |  Unshare | SVM       |H_SVM_INIT_DONE|
+ * |        |operation   |operation | abort/    |               |
+ * |        |            |          | terminate |               |
+ * -------------------------------------------------------------
+ * |        |            |          |           |               |
+ * | Secure |     Shared | Secure   |Normal     |Secure         |
+ * |        |            |          |           |               |
+ * | Shared |     Shared | Secure   |Normal     |Shared         |
+ * |        |            |          |           |               |
+ * | Normal |     Shared | Secure   |Normal     |Secure         |
+ * --------------------------------------------------------------
+ *
+ * Life cycle of a VM
+ * --------------------
+ *
+ * --------------------------------------------------------------------
+ * |         |  start    |  H_SVM_  |H_SVM_   |H_SVM_     |UV_SVM_    |
+ * |         |  VM       |INIT_START|INIT_DONE|INIT_ABORT |TERMINATE  |
+ * |         |           |          |         |           |           |
+ * --------- ----------------------------------------------------------
+ * |         |           |          |         |           |           |
+ * | Normal  | Normal    | Transient|Error    |Error      |Normal     |
+ * |         |           |          |         |           |           |
+ * | Secure  |   Error   | Error    |Error    |Error      |Normal     |
+ * |         |           |          |         |           |           |
+ * |Transient|   N/A     | Error    |Secure   |Normal     |Normal     |
+ * --------------------------------------------------------------------
+ */
+
+#define KVMPPC_GFN_UVMEM_PFN	(1UL << 63)
+#define KVMPPC_GFN_MEM_PFN	(1UL << 62)
+#define KVMPPC_GFN_SHARED	(1UL << 61)
+#define KVMPPC_GFN_SECURE	(KVMPPC_GFN_UVMEM_PFN | KVMPPC_GFN_MEM_PFN)
+#define KVMPPC_GFN_FLAG_MASK	(KVMPPC_GFN_SECURE | KVMPPC_GFN_SHARED)
+#define KVMPPC_GFN_PFN_MASK	(~KVMPPC_GFN_FLAG_MASK)
 
 struct kvmppc_uvmem_slot {
 	struct list_head list;
@@ -106,11 +226,11 @@ struct kvmppc_uvmem_slot {
 	unsigned long base_pfn;
 	unsigned long *pfns;
 };
-
 struct kvmppc_uvmem_page_pvt {
 	struct kvm *kvm;
 	unsigned long gpa;
 	bool skip_page_out;
+	bool purge_gfn;
 };
 
 int kvmppc_uvmem_slot_init(struct kvm *kvm, const struct kvm_memory_slot *slot)
@@ -154,8 +274,8 @@ void kvmppc_uvmem_slot_free(struct kvm *kvm, const struct kvm_memory_slot *slot)
 	mutex_unlock(&kvm->arch.uvmem_lock);
 }
 
-static void kvmppc_uvmem_pfn_insert(unsigned long gfn, unsigned long uvmem_pfn,
-				    struct kvm *kvm)
+static void kvmppc_mark_gfn(unsigned long gfn, struct kvm *kvm,
+			unsigned long flag, unsigned long uvmem_pfn)
 {
 	struct kvmppc_uvmem_slot *p;
 
@@ -163,24 +283,41 @@ static void kvmppc_uvmem_pfn_insert(unsigned long gfn, unsigned long uvmem_pfn,
 		if (gfn >= p->base_pfn && gfn < p->base_pfn + p->nr_pfns) {
 			unsigned long index = gfn - p->base_pfn;
 
-			p->pfns[index] = uvmem_pfn | KVMPPC_UVMEM_PFN;
+			if (flag = KVMPPC_GFN_UVMEM_PFN)
+				p->pfns[index] = uvmem_pfn | flag;
+			else
+				p->pfns[index] = flag;
 			return;
 		}
 	}
 }
 
-static void kvmppc_uvmem_pfn_remove(unsigned long gfn, struct kvm *kvm)
+/* mark the GFN as secure-GFN associated with @uvmem pfn device-PFN. */
+static void kvmppc_gfn_secure_uvmem_pfn(unsigned long gfn,
+			unsigned long uvmem_pfn, struct kvm *kvm)
 {
-	struct kvmppc_uvmem_slot *p;
+	kvmppc_mark_gfn(gfn, kvm, KVMPPC_GFN_UVMEM_PFN, uvmem_pfn);
+}
 
-	list_for_each_entry(p, &kvm->arch.uvmem_pfns, list) {
-		if (gfn >= p->base_pfn && gfn < p->base_pfn + p->nr_pfns) {
-			p->pfns[gfn - p->base_pfn] = 0;
-			return;
-		}
-	}
+/* mark the GFN as secure-GFN associated with a memory-PFN. */
+static void kvmppc_gfn_secure_mem_pfn(unsigned long gfn, struct kvm *kvm)
+{
+	kvmppc_mark_gfn(gfn, kvm, KVMPPC_GFN_MEM_PFN, 0);
 }
 
+/* mark the GFN as a shared GFN. */
+static void kvmppc_gfn_shared(unsigned long gfn, struct kvm *kvm)
+{
+	kvmppc_mark_gfn(gfn, kvm, KVMPPC_GFN_SHARED, 0);
+}
+
+/* mark the GFN as a non-existent GFN. */
+static void kvmppc_gfn_remove(unsigned long gfn, struct kvm *kvm)
+{
+	kvmppc_mark_gfn(gfn, kvm, 0, 0);
+}
+
+/* return true, if the GFN is a secure-GFN backed by a secure-PFN */
 static bool kvmppc_gfn_is_uvmem_pfn(unsigned long gfn, struct kvm *kvm,
 				    unsigned long *uvmem_pfn)
 {
@@ -190,10 +327,10 @@ static bool kvmppc_gfn_is_uvmem_pfn(unsigned long gfn, struct kvm *kvm,
 		if (gfn >= p->base_pfn && gfn < p->base_pfn + p->nr_pfns) {
 			unsigned long index = gfn - p->base_pfn;
 
-			if (p->pfns[index] & KVMPPC_UVMEM_PFN) {
+			if (p->pfns[index] & KVMPPC_GFN_UVMEM_PFN) {
 				if (uvmem_pfn)
 					*uvmem_pfn = p->pfns[index] &
-						     ~KVMPPC_UVMEM_PFN;
+						     KVMPPC_GFN_PFN_MASK;
 				return true;
 			} else
 				return false;
@@ -257,9 +394,13 @@ unsigned long kvmppc_h_svm_init_done(struct kvm *kvm)
  * is HV side fault on these pages. Next we *get* these pages, forcing
  * fault on them, do fault time migration to replace the device PTEs in
  * QEMU page table with normal PTEs from newly allocated pages.
+ *
+ * if @purge_gfn is set, invalidate the GFN. GFN is not shared nor secure
+ * anymore.
  */
 void kvmppc_uvmem_drop_pages(const struct kvm_memory_slot *free,
-			     struct kvm *kvm, bool skip_page_out)
+			     struct kvm *kvm, bool skip_page_out,
+			     bool purge_gfn)
 {
 	int i;
 	struct kvmppc_uvmem_page_pvt *pvt;
@@ -270,14 +411,17 @@ void kvmppc_uvmem_drop_pages(const struct kvm_memory_slot *free,
 		struct page *uvmem_page;
 
 		mutex_lock(&kvm->arch.uvmem_lock);
+
 		if (!kvmppc_gfn_is_uvmem_pfn(gfn, kvm, &uvmem_pfn)) {
+			if (purge_gfn)
+				kvmppc_gfn_remove(gfn, kvm);
 			mutex_unlock(&kvm->arch.uvmem_lock);
 			continue;
 		}
-
 		uvmem_page = pfn_to_page(uvmem_pfn);
 		pvt = uvmem_page->zone_device_data;
 		pvt->skip_page_out = skip_page_out;
+		pvt->purge_gfn = purge_gfn;
 		mutex_unlock(&kvm->arch.uvmem_lock);
 
 		pfn = gfn_to_pfn(kvm, gfn);
@@ -305,7 +449,7 @@ unsigned long kvmppc_h_svm_init_abort(struct kvm *kvm)
 	srcu_idx = srcu_read_lock(&kvm->srcu);
 
 	kvm_for_each_memslot(memslot, kvm_memslots(kvm))
-		kvmppc_uvmem_drop_pages(memslot, kvm, false);
+		kvmppc_uvmem_drop_pages(memslot, kvm, false, true);
 
 	srcu_read_unlock(&kvm->srcu, srcu_idx);
 
@@ -347,7 +491,7 @@ static struct page *kvmppc_uvmem_get_page(unsigned long gpa, struct kvm *kvm)
 		goto out_clear;
 
 	uvmem_pfn = bit + pfn_first;
-	kvmppc_uvmem_pfn_insert(gpa >> PAGE_SHIFT, uvmem_pfn, kvm);
+	kvmppc_gfn_secure_uvmem_pfn(gpa >> PAGE_SHIFT, uvmem_pfn, kvm);
 
 	pvt->gpa = gpa;
 	pvt->kvm = kvm;
@@ -454,6 +598,7 @@ static unsigned long kvmppc_share_page(struct kvm *kvm, unsigned long gpa,
 		uvmem_page = pfn_to_page(uvmem_pfn);
 		pvt = uvmem_page->zone_device_data;
 		pvt->skip_page_out = true;
+		pvt->purge_gfn = false;
 	}
 
 retry:
@@ -467,12 +612,16 @@ static unsigned long kvmppc_share_page(struct kvm *kvm, unsigned long gpa,
 		uvmem_page = pfn_to_page(uvmem_pfn);
 		pvt = uvmem_page->zone_device_data;
 		pvt->skip_page_out = true;
+		pvt->purge_gfn = false;
 		kvm_release_pfn_clean(pfn);
 		goto retry;
 	}
 
-	if (!uv_page_in(kvm->arch.lpid, pfn << page_shift, gpa, 0, page_shift))
+	if (!uv_page_in(kvm->arch.lpid, pfn << page_shift, gpa, 0,
+				page_shift)) {
+		kvmppc_gfn_shared(gfn, kvm);
 		ret = H_SUCCESS;
+	}
 	kvm_release_pfn_clean(pfn);
 	mutex_unlock(&kvm->arch.uvmem_lock);
 out:
@@ -530,6 +679,7 @@ unsigned long kvmppc_h_svm_page_in(struct kvm *kvm, unsigned long gpa,
 	if (!kvmppc_svm_page_in(vma, start, end, gpa, kvm, page_shift,
 				&downgrade))
 		ret = H_SUCCESS;
+
 out_unlock:
 	mutex_unlock(&kvm->arch.uvmem_lock);
 out:
@@ -655,7 +805,10 @@ static void kvmppc_uvmem_page_free(struct page *page)
 
 	pvt = page->zone_device_data;
 	page->zone_device_data = NULL;
-	kvmppc_uvmem_pfn_remove(pvt->gpa >> PAGE_SHIFT, pvt->kvm);
+	if (pvt->purge_gfn)
+		kvmppc_gfn_remove(pvt->gpa >> PAGE_SHIFT, pvt->kvm);
+	else
+		kvmppc_gfn_secure_mem_pfn(pvt->gpa >> PAGE_SHIFT, pvt->kvm);
 	kfree(pvt);
 }
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v2 3/4] KVM: PPC: Book3S HV: migrate remaining normal-GFNs to secure-GFNs in H_SVM_INIT_DONE
  2020-06-18  9:19 ` Ram Pai
@ 2020-06-18  9:19   ` Ram Pai
  -1 siblings, 0 replies; 16+ messages in thread
From: Ram Pai @ 2020-06-18  9:19 UTC (permalink / raw)
  To: kvm-ppc, linuxppc-dev
  Cc: ldufour, linuxram, cclaudio, bharata, sathnaga, aneesh.kumar,
	sukadev, bauerman, david

H_SVM_INIT_DONE incorrectly assumes that the Ultravisor has explicitly
called H_SVM_PAGE_IN for all secure pages. These GFNs continue to be
normal GFNs associated with normal PFNs; when infact, these GFNs should
have been secure GFNs, associated with device PFNs.

Move all the PFN associated with the SVM's GFNs, to secure-PFNs, in
H_SVM_INIT_DONE. Skip the GFNs that are already Paged-in or Shared
through H_SVM_PAGE_IN, or Paged-in followed by a Paged-out through
UV_PAGE_OUT.

Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Bharata B Rao <bharata@linux.ibm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Laurent Dufour <ldufour@linux.ibm.com>
Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Claudio Carvalho <cclaudio@linux.ibm.com>
Cc: kvm-ppc@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 Documentation/powerpc/ultravisor.rst |   2 +
 arch/powerpc/kvm/book3s_hv_uvmem.c   | 235 +++++++++++++++++++++++++----------
 2 files changed, 171 insertions(+), 66 deletions(-)

diff --git a/Documentation/powerpc/ultravisor.rst b/Documentation/powerpc/ultravisor.rst
index 363736d..3bc8957 100644
--- a/Documentation/powerpc/ultravisor.rst
+++ b/Documentation/powerpc/ultravisor.rst
@@ -933,6 +933,8 @@ Return values
 	* H_UNSUPPORTED		if called from the wrong context (e.g.
 				from an SVM or before an H_SVM_INIT_START
 				hypercall).
+	* H_STATE		if the hypervisor could not successfully
+                                transition the VM to Secure VM.
 
 Description
 ~~~~~~~~~~~
diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c b/arch/powerpc/kvm/book3s_hv_uvmem.c
index 666d1bb..78f8580 100644
--- a/arch/powerpc/kvm/book3s_hv_uvmem.c
+++ b/arch/powerpc/kvm/book3s_hv_uvmem.c
@@ -339,6 +339,21 @@ static bool kvmppc_gfn_is_uvmem_pfn(unsigned long gfn, struct kvm *kvm,
 	return false;
 }
 
+/* return true, if the GFN is a shared-GFN, or a secure-GFN */
+bool kvmppc_gfn_has_transitioned(unsigned long gfn, struct kvm *kvm)
+{
+	struct kvmppc_uvmem_slot *p;
+
+	list_for_each_entry(p, &kvm->arch.uvmem_pfns, list) {
+		if (gfn >= p->base_pfn && gfn < p->base_pfn + p->nr_pfns) {
+			unsigned long index = gfn - p->base_pfn;
+
+			return (p->pfns[index] & KVMPPC_GFN_FLAG_MASK);
+		}
+	}
+	return false;
+}
+
 unsigned long kvmppc_h_svm_init_start(struct kvm *kvm)
 {
 	struct kvm_memslots *slots;
@@ -377,14 +392,152 @@ unsigned long kvmppc_h_svm_init_start(struct kvm *kvm)
 	return ret;
 }
 
+static struct page *kvmppc_uvmem_get_page(unsigned long gpa, struct kvm *kvm);
+
+/*
+ * Alloc a PFN from private device memory pool. If @pagein is true,
+ * copy page from normal memory to secure memory using UV_PAGE_IN uvcall.
+ */
+static int kvmppc_svm_migrate_page(struct vm_area_struct *vma,
+		unsigned long start,
+		unsigned long end, unsigned long gpa, struct kvm *kvm,
+		unsigned long page_shift,
+		bool pagein)
+{
+	unsigned long src_pfn, dst_pfn = 0;
+	struct migrate_vma mig;
+	struct page *dpage;
+	struct page *spage;
+	unsigned long pfn;
+	int ret = 0;
+
+	memset(&mig, 0, sizeof(mig));
+	mig.vma = vma;
+	mig.start = start;
+	mig.end = end;
+	mig.src = &src_pfn;
+	mig.dst = &dst_pfn;
+
+	ret = migrate_vma_setup(&mig);
+	if (ret)
+		return ret;
+
+	if (!(*mig.src & MIGRATE_PFN_MIGRATE)) {
+		ret = -1;
+		goto out_finalize;
+	}
+
+	dpage = kvmppc_uvmem_get_page(gpa, kvm);
+	if (!dpage) {
+		ret = -1;
+		goto out_finalize;
+	}
+
+	if (pagein) {
+		pfn = *mig.src >> MIGRATE_PFN_SHIFT;
+		spage = migrate_pfn_to_page(*mig.src);
+		if (spage) {
+			ret = uv_page_in(kvm->arch.lpid, pfn << page_shift,
+					gpa, 0, page_shift);
+			if (ret)
+				goto out_finalize;
+		}
+	}
+
+	*mig.dst = migrate_pfn(page_to_pfn(dpage)) | MIGRATE_PFN_LOCKED;
+	migrate_vma_pages(&mig);
+out_finalize:
+	migrate_vma_finalize(&mig);
+	return ret;
+}
+
+static int kvmppc_uv_migrate_mem_slot(struct kvm *kvm,
+		const struct kvm_memory_slot *memslot)
+{
+	unsigned long gfn = memslot->base_gfn;
+	unsigned long end;
+	bool downgrade = false;
+	struct vm_area_struct *vma;
+	int i, ret = 0;
+	unsigned long start = gfn_to_hva(kvm, gfn);
+
+	if (kvm_is_error_hva(start))
+		return H_STATE;
+
+	end = start + (memslot->npages << PAGE_SHIFT);
+
+	down_write(&kvm->mm->mmap_sem);
+
+	mutex_lock(&kvm->arch.uvmem_lock);
+	vma = find_vma_intersection(kvm->mm, start, end);
+	if (!vma || vma->vm_start > start || vma->vm_end < end) {
+		ret = H_STATE;
+		goto out_unlock;
+	}
+
+	ret = ksm_madvise(vma, vma->vm_start, vma->vm_end,
+			  MADV_UNMERGEABLE, &vma->vm_flags);
+	downgrade_write(&kvm->mm->mmap_sem);
+	downgrade = true;
+	if (ret) {
+		ret = H_STATE;
+		goto out_unlock;
+	}
+
+	for (i = 0; i < memslot->npages; i++, ++gfn) {
+		/*
+		 * skip GFNs that have already tranistioned.
+		 * paged-in GFNs, shared GFNs, paged-in GFNs
+		 * that were later paged-out.
+		 */
+		if (kvmppc_gfn_has_transitioned(gfn, kvm))
+			continue;
+
+		start = gfn_to_hva(kvm, gfn);
+		end = start + (1UL << PAGE_SHIFT);
+		ret = kvmppc_svm_migrate_page(vma, start, end,
+			(gfn << PAGE_SHIFT), kvm, PAGE_SHIFT, false);
+
+		if (ret)
+			goto out_unlock;
+	}
+
+out_unlock:
+	mutex_unlock(&kvm->arch.uvmem_lock);
+	if (downgrade)
+		up_read(&kvm->mm->mmap_sem);
+	else
+		up_write(&kvm->mm->mmap_sem);
+	return ret;
+}
+
 unsigned long kvmppc_h_svm_init_done(struct kvm *kvm)
 {
+	struct kvm_memslots *slots;
+	struct kvm_memory_slot *memslot;
+	int srcu_idx;
+	long ret = H_SUCCESS;
+
 	if (!(kvm->arch.secure_guest & KVMPPC_SECURE_INIT_START))
 		return H_UNSUPPORTED;
 
+	/* migrate any unmoved normal pfn to device pfns*/
+	srcu_idx = srcu_read_lock(&kvm->srcu);
+	slots = kvm_memslots(kvm);
+	kvm_for_each_memslot(memslot, slots) {
+		ret = kvmppc_uv_migrate_mem_slot(kvm, memslot);
+		if (ret) {
+			ret = H_STATE;
+			goto out;
+		}
+	}
+
 	kvm->arch.secure_guest |= KVMPPC_SECURE_INIT_DONE;
 	pr_info("LPID %d went secure\n", kvm->arch.lpid);
-	return H_SUCCESS;
+
+out:
+	srcu_read_unlock(&kvm->srcu, srcu_idx);
+	return ret;
 }
 
 /*
@@ -510,68 +663,6 @@ static struct page *kvmppc_uvmem_get_page(unsigned long gpa, struct kvm *kvm)
 }
 
 /*
- * Alloc a PFN from private device memory pool and copy page from normal
- * memory to secure memory using UV_PAGE_IN uvcall.
- */
-static int kvmppc_svm_page_in(struct vm_area_struct *vma, unsigned long start,
-		   unsigned long end, unsigned long gpa, struct kvm *kvm,
-		   unsigned long page_shift, bool *downgrade)
-{
-	unsigned long src_pfn, dst_pfn = 0;
-	struct migrate_vma mig;
-	struct page *spage;
-	unsigned long pfn;
-	struct page *dpage;
-	int ret = 0;
-
-	memset(&mig, 0, sizeof(mig));
-	mig.vma = vma;
-	mig.start = start;
-	mig.end = end;
-	mig.src = &src_pfn;
-	mig.dst = &dst_pfn;
-
-	/*
-	 * We come here with mmap_sem write lock held just for
-	 * ksm_madvise(), otherwise we only need read mmap_sem.
-	 * Hence downgrade to read lock once ksm_madvise() is done.
-	 */
-	ret = ksm_madvise(vma, vma->vm_start, vma->vm_end,
-			  MADV_UNMERGEABLE, &vma->vm_flags);
-	downgrade_write(&kvm->mm->mmap_sem);
-	*downgrade = true;
-	if (ret)
-		return ret;
-
-	ret = migrate_vma_setup(&mig);
-	if (ret)
-		return ret;
-
-	if (!(*mig.src & MIGRATE_PFN_MIGRATE)) {
-		ret = -1;
-		goto out_finalize;
-	}
-
-	dpage = kvmppc_uvmem_get_page(gpa, kvm);
-	if (!dpage) {
-		ret = -1;
-		goto out_finalize;
-	}
-
-	pfn = *mig.src >> MIGRATE_PFN_SHIFT;
-	spage = migrate_pfn_to_page(*mig.src);
-	if (spage)
-		uv_page_in(kvm->arch.lpid, pfn << page_shift, gpa, 0,
-			   page_shift);
-
-	*mig.dst = migrate_pfn(page_to_pfn(dpage)) | MIGRATE_PFN_LOCKED;
-	migrate_vma_pages(&mig);
-out_finalize:
-	migrate_vma_finalize(&mig);
-	return ret;
-}
-
-/*
  * Shares the page with HV, thus making it a normal page.
  *
  * - If the page is already secure, then provision a new page and share
@@ -676,9 +767,21 @@ unsigned long kvmppc_h_svm_page_in(struct kvm *kvm, unsigned long gpa,
 	if (!vma || vma->vm_start > start || vma->vm_end < end)
 		goto out_unlock;
 
-	if (!kvmppc_svm_page_in(vma, start, end, gpa, kvm, page_shift,
-				&downgrade))
-		ret = H_SUCCESS;
+	ret = ksm_madvise(vma, vma->vm_start, vma->vm_end,
+			  MADV_UNMERGEABLE, &vma->vm_flags);
+	downgrade_write(&kvm->mm->mmap_sem);
+	downgrade = true;
+	if (ret) {
+		ret = H_PARAMETER;
+		goto out_unlock;
+	}
+
+	ret = H_PARAMETER;
+	if (kvmppc_svm_migrate_page(vma, start, end, gpa, kvm, page_shift,
+				true))
+		goto out_unlock;
+
+	ret = H_SUCCESS;
 
 out_unlock:
 	mutex_unlock(&kvm->arch.uvmem_lock);
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v2 3/4] KVM: PPC: Book3S HV: migrate remaining normal-GFNs to secure-GFNs in H_SVM_INIT_DONE
@ 2020-06-18  9:19   ` Ram Pai
  0 siblings, 0 replies; 16+ messages in thread
From: Ram Pai @ 2020-06-18  9:19 UTC (permalink / raw)
  To: kvm-ppc, linuxppc-dev
  Cc: ldufour, linuxram, cclaudio, bharata, sathnaga, aneesh.kumar,
	sukadev, bauerman, david

H_SVM_INIT_DONE incorrectly assumes that the Ultravisor has explicitly
called H_SVM_PAGE_IN for all secure pages. These GFNs continue to be
normal GFNs associated with normal PFNs; when infact, these GFNs should
have been secure GFNs, associated with device PFNs.

Move all the PFN associated with the SVM's GFNs, to secure-PFNs, in
H_SVM_INIT_DONE. Skip the GFNs that are already Paged-in or Shared
through H_SVM_PAGE_IN, or Paged-in followed by a Paged-out through
UV_PAGE_OUT.

Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Bharata B Rao <bharata@linux.ibm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Laurent Dufour <ldufour@linux.ibm.com>
Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Claudio Carvalho <cclaudio@linux.ibm.com>
Cc: kvm-ppc@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 Documentation/powerpc/ultravisor.rst |   2 +
 arch/powerpc/kvm/book3s_hv_uvmem.c   | 235 +++++++++++++++++++++++++----------
 2 files changed, 171 insertions(+), 66 deletions(-)

diff --git a/Documentation/powerpc/ultravisor.rst b/Documentation/powerpc/ultravisor.rst
index 363736d..3bc8957 100644
--- a/Documentation/powerpc/ultravisor.rst
+++ b/Documentation/powerpc/ultravisor.rst
@@ -933,6 +933,8 @@ Return values
 	* H_UNSUPPORTED		if called from the wrong context (e.g.
 				from an SVM or before an H_SVM_INIT_START
 				hypercall).
+	* H_STATE		if the hypervisor could not successfully
+                                transition the VM to Secure VM.
 
 Description
 ~~~~~~~~~~~
diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c b/arch/powerpc/kvm/book3s_hv_uvmem.c
index 666d1bb..78f8580 100644
--- a/arch/powerpc/kvm/book3s_hv_uvmem.c
+++ b/arch/powerpc/kvm/book3s_hv_uvmem.c
@@ -339,6 +339,21 @@ static bool kvmppc_gfn_is_uvmem_pfn(unsigned long gfn, struct kvm *kvm,
 	return false;
 }
 
+/* return true, if the GFN is a shared-GFN, or a secure-GFN */
+bool kvmppc_gfn_has_transitioned(unsigned long gfn, struct kvm *kvm)
+{
+	struct kvmppc_uvmem_slot *p;
+
+	list_for_each_entry(p, &kvm->arch.uvmem_pfns, list) {
+		if (gfn >= p->base_pfn && gfn < p->base_pfn + p->nr_pfns) {
+			unsigned long index = gfn - p->base_pfn;
+
+			return (p->pfns[index] & KVMPPC_GFN_FLAG_MASK);
+		}
+	}
+	return false;
+}
+
 unsigned long kvmppc_h_svm_init_start(struct kvm *kvm)
 {
 	struct kvm_memslots *slots;
@@ -377,14 +392,152 @@ unsigned long kvmppc_h_svm_init_start(struct kvm *kvm)
 	return ret;
 }
 
+static struct page *kvmppc_uvmem_get_page(unsigned long gpa, struct kvm *kvm);
+
+/*
+ * Alloc a PFN from private device memory pool. If @pagein is true,
+ * copy page from normal memory to secure memory using UV_PAGE_IN uvcall.
+ */
+static int kvmppc_svm_migrate_page(struct vm_area_struct *vma,
+		unsigned long start,
+		unsigned long end, unsigned long gpa, struct kvm *kvm,
+		unsigned long page_shift,
+		bool pagein)
+{
+	unsigned long src_pfn, dst_pfn = 0;
+	struct migrate_vma mig;
+	struct page *dpage;
+	struct page *spage;
+	unsigned long pfn;
+	int ret = 0;
+
+	memset(&mig, 0, sizeof(mig));
+	mig.vma = vma;
+	mig.start = start;
+	mig.end = end;
+	mig.src = &src_pfn;
+	mig.dst = &dst_pfn;
+
+	ret = migrate_vma_setup(&mig);
+	if (ret)
+		return ret;
+
+	if (!(*mig.src & MIGRATE_PFN_MIGRATE)) {
+		ret = -1;
+		goto out_finalize;
+	}
+
+	dpage = kvmppc_uvmem_get_page(gpa, kvm);
+	if (!dpage) {
+		ret = -1;
+		goto out_finalize;
+	}
+
+	if (pagein) {
+		pfn = *mig.src >> MIGRATE_PFN_SHIFT;
+		spage = migrate_pfn_to_page(*mig.src);
+		if (spage) {
+			ret = uv_page_in(kvm->arch.lpid, pfn << page_shift,
+					gpa, 0, page_shift);
+			if (ret)
+				goto out_finalize;
+		}
+	}
+
+	*mig.dst = migrate_pfn(page_to_pfn(dpage)) | MIGRATE_PFN_LOCKED;
+	migrate_vma_pages(&mig);
+out_finalize:
+	migrate_vma_finalize(&mig);
+	return ret;
+}
+
+static int kvmppc_uv_migrate_mem_slot(struct kvm *kvm,
+		const struct kvm_memory_slot *memslot)
+{
+	unsigned long gfn = memslot->base_gfn;
+	unsigned long end;
+	bool downgrade = false;
+	struct vm_area_struct *vma;
+	int i, ret = 0;
+	unsigned long start = gfn_to_hva(kvm, gfn);
+
+	if (kvm_is_error_hva(start))
+		return H_STATE;
+
+	end = start + (memslot->npages << PAGE_SHIFT);
+
+	down_write(&kvm->mm->mmap_sem);
+
+	mutex_lock(&kvm->arch.uvmem_lock);
+	vma = find_vma_intersection(kvm->mm, start, end);
+	if (!vma || vma->vm_start > start || vma->vm_end < end) {
+		ret = H_STATE;
+		goto out_unlock;
+	}
+
+	ret = ksm_madvise(vma, vma->vm_start, vma->vm_end,
+			  MADV_UNMERGEABLE, &vma->vm_flags);
+	downgrade_write(&kvm->mm->mmap_sem);
+	downgrade = true;
+	if (ret) {
+		ret = H_STATE;
+		goto out_unlock;
+	}
+
+	for (i = 0; i < memslot->npages; i++, ++gfn) {
+		/*
+		 * skip GFNs that have already tranistioned.
+		 * paged-in GFNs, shared GFNs, paged-in GFNs
+		 * that were later paged-out.
+		 */
+		if (kvmppc_gfn_has_transitioned(gfn, kvm))
+			continue;
+
+		start = gfn_to_hva(kvm, gfn);
+		end = start + (1UL << PAGE_SHIFT);
+		ret = kvmppc_svm_migrate_page(vma, start, end,
+			(gfn << PAGE_SHIFT), kvm, PAGE_SHIFT, false);
+
+		if (ret)
+			goto out_unlock;
+	}
+
+out_unlock:
+	mutex_unlock(&kvm->arch.uvmem_lock);
+	if (downgrade)
+		up_read(&kvm->mm->mmap_sem);
+	else
+		up_write(&kvm->mm->mmap_sem);
+	return ret;
+}
+
 unsigned long kvmppc_h_svm_init_done(struct kvm *kvm)
 {
+	struct kvm_memslots *slots;
+	struct kvm_memory_slot *memslot;
+	int srcu_idx;
+	long ret = H_SUCCESS;
+
 	if (!(kvm->arch.secure_guest & KVMPPC_SECURE_INIT_START))
 		return H_UNSUPPORTED;
 
+	/* migrate any unmoved normal pfn to device pfns*/
+	srcu_idx = srcu_read_lock(&kvm->srcu);
+	slots = kvm_memslots(kvm);
+	kvm_for_each_memslot(memslot, slots) {
+		ret = kvmppc_uv_migrate_mem_slot(kvm, memslot);
+		if (ret) {
+			ret = H_STATE;
+			goto out;
+		}
+	}
+
 	kvm->arch.secure_guest |= KVMPPC_SECURE_INIT_DONE;
 	pr_info("LPID %d went secure\n", kvm->arch.lpid);
-	return H_SUCCESS;
+
+out:
+	srcu_read_unlock(&kvm->srcu, srcu_idx);
+	return ret;
 }
 
 /*
@@ -510,68 +663,6 @@ static struct page *kvmppc_uvmem_get_page(unsigned long gpa, struct kvm *kvm)
 }
 
 /*
- * Alloc a PFN from private device memory pool and copy page from normal
- * memory to secure memory using UV_PAGE_IN uvcall.
- */
-static int kvmppc_svm_page_in(struct vm_area_struct *vma, unsigned long start,
-		   unsigned long end, unsigned long gpa, struct kvm *kvm,
-		   unsigned long page_shift, bool *downgrade)
-{
-	unsigned long src_pfn, dst_pfn = 0;
-	struct migrate_vma mig;
-	struct page *spage;
-	unsigned long pfn;
-	struct page *dpage;
-	int ret = 0;
-
-	memset(&mig, 0, sizeof(mig));
-	mig.vma = vma;
-	mig.start = start;
-	mig.end = end;
-	mig.src = &src_pfn;
-	mig.dst = &dst_pfn;
-
-	/*
-	 * We come here with mmap_sem write lock held just for
-	 * ksm_madvise(), otherwise we only need read mmap_sem.
-	 * Hence downgrade to read lock once ksm_madvise() is done.
-	 */
-	ret = ksm_madvise(vma, vma->vm_start, vma->vm_end,
-			  MADV_UNMERGEABLE, &vma->vm_flags);
-	downgrade_write(&kvm->mm->mmap_sem);
-	*downgrade = true;
-	if (ret)
-		return ret;
-
-	ret = migrate_vma_setup(&mig);
-	if (ret)
-		return ret;
-
-	if (!(*mig.src & MIGRATE_PFN_MIGRATE)) {
-		ret = -1;
-		goto out_finalize;
-	}
-
-	dpage = kvmppc_uvmem_get_page(gpa, kvm);
-	if (!dpage) {
-		ret = -1;
-		goto out_finalize;
-	}
-
-	pfn = *mig.src >> MIGRATE_PFN_SHIFT;
-	spage = migrate_pfn_to_page(*mig.src);
-	if (spage)
-		uv_page_in(kvm->arch.lpid, pfn << page_shift, gpa, 0,
-			   page_shift);
-
-	*mig.dst = migrate_pfn(page_to_pfn(dpage)) | MIGRATE_PFN_LOCKED;
-	migrate_vma_pages(&mig);
-out_finalize:
-	migrate_vma_finalize(&mig);
-	return ret;
-}
-
-/*
  * Shares the page with HV, thus making it a normal page.
  *
  * - If the page is already secure, then provision a new page and share
@@ -676,9 +767,21 @@ unsigned long kvmppc_h_svm_page_in(struct kvm *kvm, unsigned long gpa,
 	if (!vma || vma->vm_start > start || vma->vm_end < end)
 		goto out_unlock;
 
-	if (!kvmppc_svm_page_in(vma, start, end, gpa, kvm, page_shift,
-				&downgrade))
-		ret = H_SUCCESS;
+	ret = ksm_madvise(vma, vma->vm_start, vma->vm_end,
+			  MADV_UNMERGEABLE, &vma->vm_flags);
+	downgrade_write(&kvm->mm->mmap_sem);
+	downgrade = true;
+	if (ret) {
+		ret = H_PARAMETER;
+		goto out_unlock;
+	}
+
+	ret = H_PARAMETER;
+	if (kvmppc_svm_migrate_page(vma, start, end, gpa, kvm, page_shift,
+				true))
+		goto out_unlock;
+
+	ret = H_SUCCESS;
 
 out_unlock:
 	mutex_unlock(&kvm->arch.uvmem_lock);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v2 4/4] KVM: PPC: Book3S HV: migrate hot plugged memory
  2020-06-18  9:19 ` Ram Pai
@ 2020-06-18  9:19   ` Ram Pai
  -1 siblings, 0 replies; 16+ messages in thread
From: Ram Pai @ 2020-06-18  9:19 UTC (permalink / raw)
  To: kvm-ppc, linuxppc-dev
  Cc: ldufour, linuxram, cclaudio, bharata, sathnaga, aneesh.kumar,
	sukadev, bauerman, david

From: Laurent Dufour <ldufour@linux.ibm.com>

When a memory slot is hot plugged to a SVM, PFNs associated with the
GFNs in that slot must be migrated to the secure-PFNs, aka device-PFNs.

kvmppc_uv_migrate_mem_slot() is called to accomplish this. UV_PAGE_IN
ucall is skipped, since the ultravisor does not trust the content of
those pages and hence ignores it.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
	[resolved conflicts, and modified the commit log]
Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
---
 arch/powerpc/include/asm/kvm_book3s_uvmem.h |  2 ++
 arch/powerpc/kvm/book3s_hv.c                | 10 ++++++----
 arch/powerpc/kvm/book3s_hv_uvmem.c          |  2 +-
 3 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s_uvmem.h b/arch/powerpc/include/asm/kvm_book3s_uvmem.h
index f0c5708..05ae789 100644
--- a/arch/powerpc/include/asm/kvm_book3s_uvmem.h
+++ b/arch/powerpc/include/asm/kvm_book3s_uvmem.h
@@ -23,6 +23,8 @@ unsigned long kvmppc_h_svm_page_out(struct kvm *kvm,
 void kvmppc_uvmem_drop_pages(const struct kvm_memory_slot *free,
 			     struct kvm *kvm, bool skip_page_out,
 			     bool purge_gfn);
+int kvmppc_uv_migrate_mem_slot(struct kvm *kvm,
+				const struct kvm_memory_slot *memslot);
 #else
 static inline int kvmppc_uvmem_init(void)
 {
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 6cf80e5..bf7324d 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -4531,10 +4531,12 @@ static void kvmppc_core_commit_memory_region_hv(struct kvm *kvm,
 	case KVM_MR_CREATE:
 		if (kvmppc_uvmem_slot_init(kvm, new))
 			return;
-		uv_register_mem_slot(kvm->arch.lpid,
-				     new->base_gfn << PAGE_SHIFT,
-				     new->npages * PAGE_SIZE,
-				     0, new->id);
+		if (uv_register_mem_slot(kvm->arch.lpid,
+					 new->base_gfn << PAGE_SHIFT,
+					 new->npages * PAGE_SIZE,
+					 0, new->id))
+			return;
+		kvmppc_uv_migrate_mem_slot(kvm, new);
 		break;
 	case KVM_MR_DELETE:
 		uv_unregister_mem_slot(kvm->arch.lpid, old->id);
diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c b/arch/powerpc/kvm/book3s_hv_uvmem.c
index 78f8580..4d8f5bc 100644
--- a/arch/powerpc/kvm/book3s_hv_uvmem.c
+++ b/arch/powerpc/kvm/book3s_hv_uvmem.c
@@ -451,7 +451,7 @@ static int kvmppc_svm_migrate_page(struct vm_area_struct *vma,
 	return ret;
 }
 
-static int kvmppc_uv_migrate_mem_slot(struct kvm *kvm,
+int kvmppc_uv_migrate_mem_slot(struct kvm *kvm,
 		const struct kvm_memory_slot *memslot)
 {
 	unsigned long gfn = memslot->base_gfn;
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v2 4/4] KVM: PPC: Book3S HV: migrate hot plugged memory
@ 2020-06-18  9:19   ` Ram Pai
  0 siblings, 0 replies; 16+ messages in thread
From: Ram Pai @ 2020-06-18  9:19 UTC (permalink / raw)
  To: kvm-ppc, linuxppc-dev
  Cc: ldufour, linuxram, cclaudio, bharata, sathnaga, aneesh.kumar,
	sukadev, bauerman, david

From: Laurent Dufour <ldufour@linux.ibm.com>

When a memory slot is hot plugged to a SVM, PFNs associated with the
GFNs in that slot must be migrated to the secure-PFNs, aka device-PFNs.

kvmppc_uv_migrate_mem_slot() is called to accomplish this. UV_PAGE_IN
ucall is skipped, since the ultravisor does not trust the content of
those pages and hence ignores it.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
	[resolved conflicts, and modified the commit log]
Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
---
 arch/powerpc/include/asm/kvm_book3s_uvmem.h |  2 ++
 arch/powerpc/kvm/book3s_hv.c                | 10 ++++++----
 arch/powerpc/kvm/book3s_hv_uvmem.c          |  2 +-
 3 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s_uvmem.h b/arch/powerpc/include/asm/kvm_book3s_uvmem.h
index f0c5708..05ae789 100644
--- a/arch/powerpc/include/asm/kvm_book3s_uvmem.h
+++ b/arch/powerpc/include/asm/kvm_book3s_uvmem.h
@@ -23,6 +23,8 @@ unsigned long kvmppc_h_svm_page_out(struct kvm *kvm,
 void kvmppc_uvmem_drop_pages(const struct kvm_memory_slot *free,
 			     struct kvm *kvm, bool skip_page_out,
 			     bool purge_gfn);
+int kvmppc_uv_migrate_mem_slot(struct kvm *kvm,
+				const struct kvm_memory_slot *memslot);
 #else
 static inline int kvmppc_uvmem_init(void)
 {
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 6cf80e5..bf7324d 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -4531,10 +4531,12 @@ static void kvmppc_core_commit_memory_region_hv(struct kvm *kvm,
 	case KVM_MR_CREATE:
 		if (kvmppc_uvmem_slot_init(kvm, new))
 			return;
-		uv_register_mem_slot(kvm->arch.lpid,
-				     new->base_gfn << PAGE_SHIFT,
-				     new->npages * PAGE_SIZE,
-				     0, new->id);
+		if (uv_register_mem_slot(kvm->arch.lpid,
+					 new->base_gfn << PAGE_SHIFT,
+					 new->npages * PAGE_SIZE,
+					 0, new->id))
+			return;
+		kvmppc_uv_migrate_mem_slot(kvm, new);
 		break;
 	case KVM_MR_DELETE:
 		uv_unregister_mem_slot(kvm->arch.lpid, old->id);
diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c b/arch/powerpc/kvm/book3s_hv_uvmem.c
index 78f8580..4d8f5bc 100644
--- a/arch/powerpc/kvm/book3s_hv_uvmem.c
+++ b/arch/powerpc/kvm/book3s_hv_uvmem.c
@@ -451,7 +451,7 @@ static int kvmppc_svm_migrate_page(struct vm_area_struct *vma,
 	return ret;
 }
 
-static int kvmppc_uv_migrate_mem_slot(struct kvm *kvm,
+int kvmppc_uv_migrate_mem_slot(struct kvm *kvm,
 		const struct kvm_memory_slot *memslot)
 {
 	unsigned long gfn = memslot->base_gfn;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 2/4] KVM: PPC: Book3S HV: track the state GFNs associated with secure VMs
  2020-06-18  9:19   ` Ram Pai
@ 2020-06-18 13:31     ` Laurent Dufour
  -1 siblings, 0 replies; 16+ messages in thread
From: Laurent Dufour @ 2020-06-18 13:31 UTC (permalink / raw)
  To: Ram Pai, kvm-ppc, linuxppc-dev
  Cc: cclaudio, bharata, sathnaga, aneesh.kumar, sukadev, bauerman, david

Le 18/06/2020 à 11:19, Ram Pai a écrit :
> During the life of SVM, its GFNs transition through normal, secure and
> shared states. Since the kernel does not track GFNs that are shared, it
> is not possible to disambiguate a shared GFN from a GFN whose PFN has
> not yet been migrated to a secure-PFN. Also it is not possible to
> disambiguate a secure-GFN from a GFN whose GFN has been pagedout from
> the ultravisor.
> 
> The ability to identify the state of a GFN is needed to skip migration of its
> PFN to secure-PFN during ESM transition.
> 
> The code is re-organized to track the states of a GFN as explained
> below.
> 
> ************************************************************************
>   1. States of a GFN
>      ---------------
>   The GFN can be in one of the following states.
> 
>   (a) Secure - The GFN is secure. The GFN is associated with
>   	a Secure VM, the contents of the GFN is not accessible
>   	to the Hypervisor.  This GFN can be backed by a secure-PFN,
>   	or can be backed by a normal-PFN with contents encrypted.
>   	The former is true when the GFN is paged-in into the
>   	ultravisor. The latter is true when the GFN is paged-out
>   	of the ultravisor.
> 
>   (b) Shared - The GFN is shared. The GFN is associated with a
>   	a secure VM. The contents of the GFN is accessible to
>   	Hypervisor. This GFN is backed by a normal-PFN and its
>   	content is un-encrypted.
> 
>   (c) Normal - The GFN is a normal. The GFN is associated with
>   	a normal VM. The contents of the GFN is accesible to
>   	the Hypervisor. Its content is never encrypted.
> 
>   2. States of a VM.
>      ---------------
> 
>   (a) Normal VM:  A VM whose contents are always accessible to
>   	the hypervisor.  All its GFNs are normal-GFNs.
> 
>   (b) Secure VM: A VM whose contents are not accessible to the
>   	hypervisor without the VM's consent.  Its GFNs are
>   	either Shared-GFN or Secure-GFNs.
> 
>   (c) Transient VM: A Normal VM that is transitioning to secure VM.
>   	The transition starts on successful return of
>   	H_SVM_INIT_START, and ends on successful return
>   	of H_SVM_INIT_DONE. This transient VM, can have GFNs
>   	in any of the three states; i.e Secure-GFN, Shared-GFN,
>   	and Normal-GFN.	The VM never executes in this state
>   	in supervisor-mode.
> 
>   3. Memory slot State.
>      ------------------
>    	The state of a memory slot mirrors the state of the
>    	VM the memory slot is associated with.
> 
>   4. VM State transition.
>      --------------------
> 
>    A VM always starts in Normal Mode.
> 
>    H_SVM_INIT_START moves the VM into transient state. During this
>    time the Ultravisor may request some of its GFNs to be shared or
>    secured. So its GFNs can be in one of the three GFN states.
> 
>    H_SVM_INIT_DONE moves the VM entirely from transient state to
>    secure-state. At this point any left-over normal-GFNs are
>    transitioned to Secure-GFN.
> 
>    H_SVM_INIT_ABORT moves the transient VM back to normal VM.
>    All its GFNs are moved to Normal-GFNs.
> 
>    UV_TERMINATE transitions the secure-VM back to normal-VM. All
>    the secure-GFN and shared-GFNs are tranistioned to normal-GFN
>    Note: The contents of the normal-GFN is undefined at this point.
> 
>   5. GFN state implementation:
>      -------------------------
> 
>   Secure GFN is associated with a secure-PFN; also called uvmem_pfn,
>   when the GFN is paged-in. Its pfn[] has KVMPPC_GFN_UVMEM_PFN flag
>   set, and contains the value of the secure-PFN.
>   It is associated with a normal-PFN; also called mem_pfn, when
>   the GFN is pagedout. Its pfn[] has KVMPPC_GFN_MEM_PFN flag set.
>   The value of the normal-PFN is not tracked.
> 
>   Shared GFN is associated with a normal-PFN. Its pfn[] has
>   KVMPPC_UVMEM_SHARED_PFN flag set. The value of the normal-PFN
>   is not tracked.
> 
>   Normal GFN is associated with normal-PFN. Its pfn[] has
>   no flag set. The value of the normal-PFN is not tracked.
> 
>   6. Life cycle of a GFN
>      --------------------
>   --------------------------------------------------------------
>   |        |     Share  |  Unshare | SVM       |H_SVM_INIT_DONE|
>   |        |operation   |operation | abort/    |               |
>   |        |            |          | terminate |               |
>   -------------------------------------------------------------
>   |        |            |          |           |               |
>   | Secure |     Shared | Secure   |Normal     |Secure         |
>   |        |            |          |           |               |
>   | Shared |     Shared | Secure   |Normal     |Shared         |
>   |        |            |          |           |               |
>   | Normal |     Shared | Secure   |Normal     |Secure         |
>   --------------------------------------------------------------
> 
>   7. Life cycle of a VM
>      --------------------
>   --------------------------------------------------------------------
>   |         |  start    |  H_SVM_  |H_SVM_   |H_SVM_     |UV_SVM_    |
>   |         |  VM       |INIT_START|INIT_DONE|INIT_ABORT |TERMINATE  |
>   |         |           |          |         |           |           |
>   --------- ----------------------------------------------------------
>   |         |           |          |         |           |           |
>   | Normal  | Normal    | Transient|Error    |Error      |Normal     |
>   |         |           |          |         |           |           |
>   | Secure  |   Error   | Error    |Error    |Error      |Normal     |
>   |         |           |          |         |           |           |
>   |Transient|   N/A     | Error    |Secure   |Normal     |Normal     |
>   --------------------------------------------------------------------
> 
> ************************************************************************
> 
> Cc: Paul Mackerras <paulus@ozlabs.org>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Bharata B Rao <bharata@linux.ibm.com>
> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
> Cc: Laurent Dufour <ldufour@linux.ibm.com>
> Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com>
> Cc: David Gibson <david@gibson.dropbear.id.au>
> Cc: Claudio Carvalho <cclaudio@linux.ibm.com>
> Cc: kvm-ppc@vger.kernel.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
> Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> ---
>   arch/powerpc/include/asm/kvm_book3s_uvmem.h |   6 +-
>   arch/powerpc/kvm/book3s_64_mmu_radix.c      |   2 +-
>   arch/powerpc/kvm/book3s_hv.c                |   2 +-
>   arch/powerpc/kvm/book3s_hv_uvmem.c          | 195 +++++++++++++++++++++++++---
>   4 files changed, 180 insertions(+), 25 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/kvm_book3s_uvmem.h b/arch/powerpc/include/asm/kvm_book3s_uvmem.h
> index 5a9834e..f0c5708 100644
> --- a/arch/powerpc/include/asm/kvm_book3s_uvmem.h
> +++ b/arch/powerpc/include/asm/kvm_book3s_uvmem.h
> @@ -21,7 +21,8 @@ unsigned long kvmppc_h_svm_page_out(struct kvm *kvm,
>   int kvmppc_send_page_to_uv(struct kvm *kvm, unsigned long gfn);
>   unsigned long kvmppc_h_svm_init_abort(struct kvm *kvm);
>   void kvmppc_uvmem_drop_pages(const struct kvm_memory_slot *free,
> -			     struct kvm *kvm, bool skip_page_out);
> +			     struct kvm *kvm, bool skip_page_out,
> +			     bool purge_gfn);
>   #else
>   static inline int kvmppc_uvmem_init(void)
>   {
> @@ -75,6 +76,7 @@ static inline int kvmppc_send_page_to_uv(struct kvm *kvm, unsigned long gfn)
>   
>   static inline void
>   kvmppc_uvmem_drop_pages(const struct kvm_memory_slot *free,
> -			struct kvm *kvm, bool skip_page_out) { }
> +			struct kvm *kvm, bool skip_page_out,
> +			bool purge_gfn) { }
>   #endif /* CONFIG_PPC_UV */
>   #endif /* __ASM_KVM_BOOK3S_UVMEM_H__ */
> diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c
> index 803940d..3448459 100644
> --- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
> +++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
> @@ -1100,7 +1100,7 @@ void kvmppc_radix_flush_memslot(struct kvm *kvm,
>   	unsigned int shift;
>   
>   	if (kvm->arch.secure_guest & KVMPPC_SECURE_INIT_START)
> -		kvmppc_uvmem_drop_pages(memslot, kvm, true);
> +		kvmppc_uvmem_drop_pages(memslot, kvm, true, false);

When reviewing the v1 of this series, I asked you the question about the fact 
that the call here is made with purge_gfn = false. Your answer was:

> This function does not know, under what context it is called. Since
> its job is to just flush the memslot, it cannot assume anything
> about purging the pages in the memslot.

Indeed in the case of the memory hotplug operation, this function is called to 
wipe the page from the secure device in the case the pages are secured. In that 
case the purge is required. Indeed, I checked the other call to 
kvmppc_radix_flush_memslot() in kvmppc_core_flush_memslot_hv() and I cannot see 
why in that case too purge_gfn should be false, especially when the memslot is 
reused as detailed in __kvm_set_memory_region() around the call to 
kvm_arch_flush_shadow_memslot().

I'm sorry to not have ask this earlier, but could you please elaborate on this?

>   
>   	if (kvm->arch.secure_guest & KVMPPC_SECURE_INIT_DONE)
>   		return;
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index 6717d24..6cf80e5 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -5482,7 +5482,7 @@ static int kvmhv_svm_off(struct kvm *kvm)
>   			continue;
>   
>   		kvm_for_each_memslot(memslot, slots) {
> -			kvmppc_uvmem_drop_pages(memslot, kvm, true);
> +			kvmppc_uvmem_drop_pages(memslot, kvm, true, true);
>   			uv_unregister_mem_slot(kvm->arch.lpid, memslot->id);
>   		}
>   	}
> diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c b/arch/powerpc/kvm/book3s_hv_uvmem.c
> index 3599aaa..666d1bb 100644
> --- a/arch/powerpc/kvm/book3s_hv_uvmem.c
> +++ b/arch/powerpc/kvm/book3s_hv_uvmem.c
> @@ -98,7 +98,127 @@
>   static unsigned long *kvmppc_uvmem_bitmap;
>   static DEFINE_SPINLOCK(kvmppc_uvmem_bitmap_lock);
>   
> -#define KVMPPC_UVMEM_PFN	(1UL << 63)
> +/*
> + * States of a GFN
> + * ---------------
> + * The GFN can be in one of the following states.
> + *
> + * (a) Secure - The GFN is secure. The GFN is associated with
> + *	a Secure VM, the contents of the GFN is not accessible
> + *	to the Hypervisor.  This GFN can be backed by a secure-PFN,
> + *	or can be backed by a normal-PFN with contents encrypted.
> + *	The former is true when the GFN is paged-in into the
> + *	ultravisor. The latter is true when the GFN is paged-out
> + *	of the ultravisor.
> + *
> + * (b) Shared - The GFN is shared. The GFN is associated with a
> + *	a secure VM. The contents of the GFN is accessible to
> + *	Hypervisor. This GFN is backed by a normal-PFN and its
> + *	content is un-encrypted.
> + *
> + * (c) Normal - The GFN is a normal. The GFN is associated with
> + *	a normal VM. The contents of the GFN is accesible to
> + *	the Hypervisor. Its content is never encrypted.
> + *
> + * States of a VM.
> + * ---------------
> + *
> + * Normal VM:  A VM whose contents are always accessible to
> + *	the hypervisor.  All its GFNs are normal-GFNs.
> + *
> + * Secure VM: A VM whose contents are not accessible to the
> + *	hypervisor without the VM's consent.  Its GFNs are
> + *	either Shared-GFN or Secure-GFNs.
> + *
> + * Transient VM: A Normal VM that is transitioning to secure VM.
> + *	The transition starts on successful return of
> + *	H_SVM_INIT_START, and ends on successful return
> + *	of H_SVM_INIT_DONE. This transient VM, can have GFNs
> + *	in any of the three states; i.e Secure-GFN, Shared-GFN,
> + *	and Normal-GFN.	The VM never executes in this state
> + *	in supervisor-mode.
> + *
> + * Memory slot State.
> + * -----------------------------
> + *	The state of a memory slot mirrors the state of the
> + *	VM the memory slot is associated with.
> + *
> + * VM State transition.
> + * --------------------
> + *
> + *  A VM always starts in Normal Mode.
> + *
> + *  H_SVM_INIT_START moves the VM into transient state. During this
> + *  time the Ultravisor may request some of its GFNs to be shared or
> + *  secured. So its GFNs can be in one of the three GFN states.
> + *
> + *  H_SVM_INIT_DONE moves the VM entirely from transient state to
> + *  secure-state. At this point any left-over normal-GFNs are
> + *  transitioned to Secure-GFN.
> + *
> + *  H_SVM_INIT_ABORT moves the transient VM back to normal VM.
> + *  All its GFNs are moved to Normal-GFNs.
> + *
> + *  UV_TERMINATE transitions the secure-VM back to normal-VM. All
> + *  the secure-GFN and shared-GFNs are tranistioned to normal-GFN
> + *  Note: The contents of the normal-GFN is undefined at this point.
> + *
> + * GFN state implementation:
> + * -------------------------
> + *
> + * Secure GFN is associated with a secure-PFN; also called uvmem_pfn,
> + * when the GFN is paged-in. Its pfn[] has KVMPPC_GFN_UVMEM_PFN flag
> + * set, and contains the value of the secure-PFN.
> + * It is associated with a normal-PFN; also called mem_pfn, when
> + * the GFN is pagedout. Its pfn[] has KVMPPC_GFN_MEM_PFN flag set.
> + * The value of the normal-PFN is not tracked.
> + *
> + * Shared GFN is associated with a normal-PFN. Its pfn[] has
> + * KVMPPC_UVMEM_SHARED_PFN flag set. The value of the normal-PFN
> + * is not tracked.
> + *
> + * Normal GFN is associated with normal-PFN. Its pfn[] has
> + * no flag set. The value of the normal-PFN is not tracked.
> + *
> + * Life cycle of a GFN
> + * --------------------
> + *
> + * --------------------------------------------------------------
> + * |        |     Share  |  Unshare | SVM       |H_SVM_INIT_DONE|
> + * |        |operation   |operation | abort/    |               |
> + * |        |            |          | terminate |               |
> + * -------------------------------------------------------------
> + * |        |            |          |           |               |
> + * | Secure |     Shared | Secure   |Normal     |Secure         |
> + * |        |            |          |           |               |
> + * | Shared |     Shared | Secure   |Normal     |Shared         |
> + * |        |            |          |           |               |
> + * | Normal |     Shared | Secure   |Normal     |Secure         |
> + * --------------------------------------------------------------
> + *
> + * Life cycle of a VM
> + * --------------------
> + *
> + * --------------------------------------------------------------------
> + * |         |  start    |  H_SVM_  |H_SVM_   |H_SVM_     |UV_SVM_    |
> + * |         |  VM       |INIT_START|INIT_DONE|INIT_ABORT |TERMINATE  |
> + * |         |           |          |         |           |           |
> + * --------- ----------------------------------------------------------
> + * |         |           |          |         |           |           |
> + * | Normal  | Normal    | Transient|Error    |Error      |Normal     |
> + * |         |           |          |         |           |           |
> + * | Secure  |   Error   | Error    |Error    |Error      |Normal     |
> + * |         |           |          |         |           |           |
> + * |Transient|   N/A     | Error    |Secure   |Normal     |Normal     |
> + * --------------------------------------------------------------------
> + */
> +
> +#define KVMPPC_GFN_UVMEM_PFN	(1UL << 63)
> +#define KVMPPC_GFN_MEM_PFN	(1UL << 62)
> +#define KVMPPC_GFN_SHARED	(1UL << 61)
> +#define KVMPPC_GFN_SECURE	(KVMPPC_GFN_UVMEM_PFN | KVMPPC_GFN_MEM_PFN)
> +#define KVMPPC_GFN_FLAG_MASK	(KVMPPC_GFN_SECURE | KVMPPC_GFN_SHARED)
> +#define KVMPPC_GFN_PFN_MASK	(~KVMPPC_GFN_FLAG_MASK)
>   
>   struct kvmppc_uvmem_slot {
>   	struct list_head list;
> @@ -106,11 +226,11 @@ struct kvmppc_uvmem_slot {
>   	unsigned long base_pfn;
>   	unsigned long *pfns;
>   };
> -
>   struct kvmppc_uvmem_page_pvt {
>   	struct kvm *kvm;
>   	unsigned long gpa;
>   	bool skip_page_out;
> +	bool purge_gfn;
>   };
>   
>   int kvmppc_uvmem_slot_init(struct kvm *kvm, const struct kvm_memory_slot *slot)
> @@ -154,8 +274,8 @@ void kvmppc_uvmem_slot_free(struct kvm *kvm, const struct kvm_memory_slot *slot)
>   	mutex_unlock(&kvm->arch.uvmem_lock);
>   }
>   
> -static void kvmppc_uvmem_pfn_insert(unsigned long gfn, unsigned long uvmem_pfn,
> -				    struct kvm *kvm)
> +static void kvmppc_mark_gfn(unsigned long gfn, struct kvm *kvm,
> +			unsigned long flag, unsigned long uvmem_pfn)
>   {
>   	struct kvmppc_uvmem_slot *p;
>   
> @@ -163,24 +283,41 @@ static void kvmppc_uvmem_pfn_insert(unsigned long gfn, unsigned long uvmem_pfn,
>   		if (gfn >= p->base_pfn && gfn < p->base_pfn + p->nr_pfns) {
>   			unsigned long index = gfn - p->base_pfn;
>   
> -			p->pfns[index] = uvmem_pfn | KVMPPC_UVMEM_PFN;
> +			if (flag == KVMPPC_GFN_UVMEM_PFN)
> +				p->pfns[index] = uvmem_pfn | flag;
> +			else
> +				p->pfns[index] = flag;

That's minoir, but I'm wondering if that check is really needed since all the 
calls to kvmppc_mark_gfn() with flags != KVMPPC_GFN_UVMEM_PFN are made with 
uvmem_pfn = 0.

>   			return;
>   		}
>   	}
>   }
>   
> -static void kvmppc_uvmem_pfn_remove(unsigned long gfn, struct kvm *kvm)
> +/* mark the GFN as secure-GFN associated with @uvmem pfn device-PFN. */
> +static void kvmppc_gfn_secure_uvmem_pfn(unsigned long gfn,
> +			unsigned long uvmem_pfn, struct kvm *kvm)
>   {
> -	struct kvmppc_uvmem_slot *p;
> +	kvmppc_mark_gfn(gfn, kvm, KVMPPC_GFN_UVMEM_PFN, uvmem_pfn);
> +}
>   
> -	list_for_each_entry(p, &kvm->arch.uvmem_pfns, list) {
> -		if (gfn >= p->base_pfn && gfn < p->base_pfn + p->nr_pfns) {
> -			p->pfns[gfn - p->base_pfn] = 0;
> -			return;
> -		}
> -	}
> +/* mark the GFN as secure-GFN associated with a memory-PFN. */
> +static void kvmppc_gfn_secure_mem_pfn(unsigned long gfn, struct kvm *kvm)
> +{
> +	kvmppc_mark_gfn(gfn, kvm, KVMPPC_GFN_MEM_PFN, 0);
>   }
>   
> +/* mark the GFN as a shared GFN. */
> +static void kvmppc_gfn_shared(unsigned long gfn, struct kvm *kvm)
> +{
> +	kvmppc_mark_gfn(gfn, kvm, KVMPPC_GFN_SHARED, 0);
> +}
> +
> +/* mark the GFN as a non-existent GFN. */
> +static void kvmppc_gfn_remove(unsigned long gfn, struct kvm *kvm)
> +{
> +	kvmppc_mark_gfn(gfn, kvm, 0, 0);
> +}
> +
> +/* return true, if the GFN is a secure-GFN backed by a secure-PFN */
>   static bool kvmppc_gfn_is_uvmem_pfn(unsigned long gfn, struct kvm *kvm,
>   				    unsigned long *uvmem_pfn)
>   {
> @@ -190,10 +327,10 @@ static bool kvmppc_gfn_is_uvmem_pfn(unsigned long gfn, struct kvm *kvm,
>   		if (gfn >= p->base_pfn && gfn < p->base_pfn + p->nr_pfns) {
>   			unsigned long index = gfn - p->base_pfn;
>   
> -			if (p->pfns[index] & KVMPPC_UVMEM_PFN) {
> +			if (p->pfns[index] & KVMPPC_GFN_UVMEM_PFN) {
>   				if (uvmem_pfn)
>   					*uvmem_pfn = p->pfns[index] &
> -						     ~KVMPPC_UVMEM_PFN;
> +						     KVMPPC_GFN_PFN_MASK;
>   				return true;
>   			} else
>   				return false;
> @@ -257,9 +394,13 @@ unsigned long kvmppc_h_svm_init_done(struct kvm *kvm)
>    * is HV side fault on these pages. Next we *get* these pages, forcing
>    * fault on them, do fault time migration to replace the device PTEs in
>    * QEMU page table with normal PTEs from newly allocated pages.
> + *
> + * if @purge_gfn is set, invalidate the GFN. GFN is not shared nor secure
> + * anymore.
>    */
>   void kvmppc_uvmem_drop_pages(const struct kvm_memory_slot *free,
> -			     struct kvm *kvm, bool skip_page_out)
> +			     struct kvm *kvm, bool skip_page_out,
> +			     bool purge_gfn)
>   {
>   	int i;
>   	struct kvmppc_uvmem_page_pvt *pvt;
> @@ -270,14 +411,17 @@ void kvmppc_uvmem_drop_pages(const struct kvm_memory_slot *free,
>   		struct page *uvmem_page;
>   
>   		mutex_lock(&kvm->arch.uvmem_lock);
> +
>   		if (!kvmppc_gfn_is_uvmem_pfn(gfn, kvm, &uvmem_pfn)) {
> +			if (purge_gfn)
> +				kvmppc_gfn_remove(gfn, kvm);
>   			mutex_unlock(&kvm->arch.uvmem_lock);
>   			continue;
>   		}
> -
>   		uvmem_page = pfn_to_page(uvmem_pfn);
>   		pvt = uvmem_page->zone_device_data;
>   		pvt->skip_page_out = skip_page_out;
> +		pvt->purge_gfn = purge_gfn;
>   		mutex_unlock(&kvm->arch.uvmem_lock);
>   
>   		pfn = gfn_to_pfn(kvm, gfn);
> @@ -305,7 +449,7 @@ unsigned long kvmppc_h_svm_init_abort(struct kvm *kvm)
>   	srcu_idx = srcu_read_lock(&kvm->srcu);
>   
>   	kvm_for_each_memslot(memslot, kvm_memslots(kvm))
> -		kvmppc_uvmem_drop_pages(memslot, kvm, false);
> +		kvmppc_uvmem_drop_pages(memslot, kvm, false, true);
>   
>   	srcu_read_unlock(&kvm->srcu, srcu_idx);
>   
> @@ -347,7 +491,7 @@ static struct page *kvmppc_uvmem_get_page(unsigned long gpa, struct kvm *kvm)
>   		goto out_clear;
>   
>   	uvmem_pfn = bit + pfn_first;
> -	kvmppc_uvmem_pfn_insert(gpa >> PAGE_SHIFT, uvmem_pfn, kvm);
> +	kvmppc_gfn_secure_uvmem_pfn(gpa >> PAGE_SHIFT, uvmem_pfn, kvm);
>   
>   	pvt->gpa = gpa;
>   	pvt->kvm = kvm;
> @@ -454,6 +598,7 @@ static unsigned long kvmppc_share_page(struct kvm *kvm, unsigned long gpa,
>   		uvmem_page = pfn_to_page(uvmem_pfn);
>   		pvt = uvmem_page->zone_device_data;
>   		pvt->skip_page_out = true;
> +		pvt->purge_gfn = false;
>   	}
>   
>   retry:
> @@ -467,12 +612,16 @@ static unsigned long kvmppc_share_page(struct kvm *kvm, unsigned long gpa,
>   		uvmem_page = pfn_to_page(uvmem_pfn);
>   		pvt = uvmem_page->zone_device_data;
>   		pvt->skip_page_out = true;
> +		pvt->purge_gfn = false;
>   		kvm_release_pfn_clean(pfn);
>   		goto retry;
>   	}
>   
> -	if (!uv_page_in(kvm->arch.lpid, pfn << page_shift, gpa, 0, page_shift))
> +	if (!uv_page_in(kvm->arch.lpid, pfn << page_shift, gpa, 0,
> +				page_shift)) {
> +		kvmppc_gfn_shared(gfn, kvm);
>   		ret = H_SUCCESS;
> +	}
>   	kvm_release_pfn_clean(pfn);
>   	mutex_unlock(&kvm->arch.uvmem_lock);
>   out:
> @@ -530,6 +679,7 @@ unsigned long kvmppc_h_svm_page_in(struct kvm *kvm, unsigned long gpa,
>   	if (!kvmppc_svm_page_in(vma, start, end, gpa, kvm, page_shift,
>   				&downgrade))
>   		ret = H_SUCCESS;
> +
>   out_unlock:
>   	mutex_unlock(&kvm->arch.uvmem_lock);
>   out:
> @@ -655,7 +805,10 @@ static void kvmppc_uvmem_page_free(struct page *page)
>   
>   	pvt = page->zone_device_data;
>   	page->zone_device_data = NULL;
> -	kvmppc_uvmem_pfn_remove(pvt->gpa >> PAGE_SHIFT, pvt->kvm);
> +	if (pvt->purge_gfn)
> +		kvmppc_gfn_remove(pvt->gpa >> PAGE_SHIFT, pvt->kvm);
> +	else
> +		kvmppc_gfn_secure_mem_pfn(pvt->gpa >> PAGE_SHIFT, pvt->kvm);
>   	kfree(pvt);
>   }
>   
> 


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 2/4] KVM: PPC: Book3S HV: track the state GFNs associated with secure VMs
@ 2020-06-18 13:31     ` Laurent Dufour
  0 siblings, 0 replies; 16+ messages in thread
From: Laurent Dufour @ 2020-06-18 13:31 UTC (permalink / raw)
  To: Ram Pai, kvm-ppc, linuxppc-dev
  Cc: cclaudio, bharata, sathnaga, aneesh.kumar, sukadev, bauerman, david

Le 18/06/2020 à 11:19, Ram Pai a écrit :
> During the life of SVM, its GFNs transition through normal, secure and
> shared states. Since the kernel does not track GFNs that are shared, it
> is not possible to disambiguate a shared GFN from a GFN whose PFN has
> not yet been migrated to a secure-PFN. Also it is not possible to
> disambiguate a secure-GFN from a GFN whose GFN has been pagedout from
> the ultravisor.
> 
> The ability to identify the state of a GFN is needed to skip migration of its
> PFN to secure-PFN during ESM transition.
> 
> The code is re-organized to track the states of a GFN as explained
> below.
> 
> ************************************************************************
>   1. States of a GFN
>      ---------------
>   The GFN can be in one of the following states.
> 
>   (a) Secure - The GFN is secure. The GFN is associated with
>   	a Secure VM, the contents of the GFN is not accessible
>   	to the Hypervisor.  This GFN can be backed by a secure-PFN,
>   	or can be backed by a normal-PFN with contents encrypted.
>   	The former is true when the GFN is paged-in into the
>   	ultravisor. The latter is true when the GFN is paged-out
>   	of the ultravisor.
> 
>   (b) Shared - The GFN is shared. The GFN is associated with a
>   	a secure VM. The contents of the GFN is accessible to
>   	Hypervisor. This GFN is backed by a normal-PFN and its
>   	content is un-encrypted.
> 
>   (c) Normal - The GFN is a normal. The GFN is associated with
>   	a normal VM. The contents of the GFN is accesible to
>   	the Hypervisor. Its content is never encrypted.
> 
>   2. States of a VM.
>      ---------------
> 
>   (a) Normal VM:  A VM whose contents are always accessible to
>   	the hypervisor.  All its GFNs are normal-GFNs.
> 
>   (b) Secure VM: A VM whose contents are not accessible to the
>   	hypervisor without the VM's consent.  Its GFNs are
>   	either Shared-GFN or Secure-GFNs.
> 
>   (c) Transient VM: A Normal VM that is transitioning to secure VM.
>   	The transition starts on successful return of
>   	H_SVM_INIT_START, and ends on successful return
>   	of H_SVM_INIT_DONE. This transient VM, can have GFNs
>   	in any of the three states; i.e Secure-GFN, Shared-GFN,
>   	and Normal-GFN.	The VM never executes in this state
>   	in supervisor-mode.
> 
>   3. Memory slot State.
>      ------------------
>    	The state of a memory slot mirrors the state of the
>    	VM the memory slot is associated with.
> 
>   4. VM State transition.
>      --------------------
> 
>    A VM always starts in Normal Mode.
> 
>    H_SVM_INIT_START moves the VM into transient state. During this
>    time the Ultravisor may request some of its GFNs to be shared or
>    secured. So its GFNs can be in one of the three GFN states.
> 
>    H_SVM_INIT_DONE moves the VM entirely from transient state to
>    secure-state. At this point any left-over normal-GFNs are
>    transitioned to Secure-GFN.
> 
>    H_SVM_INIT_ABORT moves the transient VM back to normal VM.
>    All its GFNs are moved to Normal-GFNs.
> 
>    UV_TERMINATE transitions the secure-VM back to normal-VM. All
>    the secure-GFN and shared-GFNs are tranistioned to normal-GFN
>    Note: The contents of the normal-GFN is undefined at this point.
> 
>   5. GFN state implementation:
>      -------------------------
> 
>   Secure GFN is associated with a secure-PFN; also called uvmem_pfn,
>   when the GFN is paged-in. Its pfn[] has KVMPPC_GFN_UVMEM_PFN flag
>   set, and contains the value of the secure-PFN.
>   It is associated with a normal-PFN; also called mem_pfn, when
>   the GFN is pagedout. Its pfn[] has KVMPPC_GFN_MEM_PFN flag set.
>   The value of the normal-PFN is not tracked.
> 
>   Shared GFN is associated with a normal-PFN. Its pfn[] has
>   KVMPPC_UVMEM_SHARED_PFN flag set. The value of the normal-PFN
>   is not tracked.
> 
>   Normal GFN is associated with normal-PFN. Its pfn[] has
>   no flag set. The value of the normal-PFN is not tracked.
> 
>   6. Life cycle of a GFN
>      --------------------
>   --------------------------------------------------------------
>   |        |     Share  |  Unshare | SVM       |H_SVM_INIT_DONE|
>   |        |operation   |operation | abort/    |               |
>   |        |            |          | terminate |               |
>   -------------------------------------------------------------
>   |        |            |          |           |               |
>   | Secure |     Shared | Secure   |Normal     |Secure         |
>   |        |            |          |           |               |
>   | Shared |     Shared | Secure   |Normal     |Shared         |
>   |        |            |          |           |               |
>   | Normal |     Shared | Secure   |Normal     |Secure         |
>   --------------------------------------------------------------
> 
>   7. Life cycle of a VM
>      --------------------
>   --------------------------------------------------------------------
>   |         |  start    |  H_SVM_  |H_SVM_   |H_SVM_     |UV_SVM_    |
>   |         |  VM       |INIT_START|INIT_DONE|INIT_ABORT |TERMINATE  |
>   |         |           |          |         |           |           |
>   --------- ----------------------------------------------------------
>   |         |           |          |         |           |           |
>   | Normal  | Normal    | Transient|Error    |Error      |Normal     |
>   |         |           |          |         |           |           |
>   | Secure  |   Error   | Error    |Error    |Error      |Normal     |
>   |         |           |          |         |           |           |
>   |Transient|   N/A     | Error    |Secure   |Normal     |Normal     |
>   --------------------------------------------------------------------
> 
> ************************************************************************
> 
> Cc: Paul Mackerras <paulus@ozlabs.org>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Bharata B Rao <bharata@linux.ibm.com>
> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
> Cc: Laurent Dufour <ldufour@linux.ibm.com>
> Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com>
> Cc: David Gibson <david@gibson.dropbear.id.au>
> Cc: Claudio Carvalho <cclaudio@linux.ibm.com>
> Cc: kvm-ppc@vger.kernel.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
> Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> ---
>   arch/powerpc/include/asm/kvm_book3s_uvmem.h |   6 +-
>   arch/powerpc/kvm/book3s_64_mmu_radix.c      |   2 +-
>   arch/powerpc/kvm/book3s_hv.c                |   2 +-
>   arch/powerpc/kvm/book3s_hv_uvmem.c          | 195 +++++++++++++++++++++++++---
>   4 files changed, 180 insertions(+), 25 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/kvm_book3s_uvmem.h b/arch/powerpc/include/asm/kvm_book3s_uvmem.h
> index 5a9834e..f0c5708 100644
> --- a/arch/powerpc/include/asm/kvm_book3s_uvmem.h
> +++ b/arch/powerpc/include/asm/kvm_book3s_uvmem.h
> @@ -21,7 +21,8 @@ unsigned long kvmppc_h_svm_page_out(struct kvm *kvm,
>   int kvmppc_send_page_to_uv(struct kvm *kvm, unsigned long gfn);
>   unsigned long kvmppc_h_svm_init_abort(struct kvm *kvm);
>   void kvmppc_uvmem_drop_pages(const struct kvm_memory_slot *free,
> -			     struct kvm *kvm, bool skip_page_out);
> +			     struct kvm *kvm, bool skip_page_out,
> +			     bool purge_gfn);
>   #else
>   static inline int kvmppc_uvmem_init(void)
>   {
> @@ -75,6 +76,7 @@ static inline int kvmppc_send_page_to_uv(struct kvm *kvm, unsigned long gfn)
>   
>   static inline void
>   kvmppc_uvmem_drop_pages(const struct kvm_memory_slot *free,
> -			struct kvm *kvm, bool skip_page_out) { }
> +			struct kvm *kvm, bool skip_page_out,
> +			bool purge_gfn) { }
>   #endif /* CONFIG_PPC_UV */
>   #endif /* __ASM_KVM_BOOK3S_UVMEM_H__ */
> diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c
> index 803940d..3448459 100644
> --- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
> +++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
> @@ -1100,7 +1100,7 @@ void kvmppc_radix_flush_memslot(struct kvm *kvm,
>   	unsigned int shift;
>   
>   	if (kvm->arch.secure_guest & KVMPPC_SECURE_INIT_START)
> -		kvmppc_uvmem_drop_pages(memslot, kvm, true);
> +		kvmppc_uvmem_drop_pages(memslot, kvm, true, false);

When reviewing the v1 of this series, I asked you the question about the fact 
that the call here is made with purge_gfn = false. Your answer was:

> This function does not know, under what context it is called. Since
> its job is to just flush the memslot, it cannot assume anything
> about purging the pages in the memslot.

Indeed in the case of the memory hotplug operation, this function is called to 
wipe the page from the secure device in the case the pages are secured. In that 
case the purge is required. Indeed, I checked the other call to 
kvmppc_radix_flush_memslot() in kvmppc_core_flush_memslot_hv() and I cannot see 
why in that case too purge_gfn should be false, especially when the memslot is 
reused as detailed in __kvm_set_memory_region() around the call to 
kvm_arch_flush_shadow_memslot().

I'm sorry to not have ask this earlier, but could you please elaborate on this?

>   
>   	if (kvm->arch.secure_guest & KVMPPC_SECURE_INIT_DONE)
>   		return;
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index 6717d24..6cf80e5 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -5482,7 +5482,7 @@ static int kvmhv_svm_off(struct kvm *kvm)
>   			continue;
>   
>   		kvm_for_each_memslot(memslot, slots) {
> -			kvmppc_uvmem_drop_pages(memslot, kvm, true);
> +			kvmppc_uvmem_drop_pages(memslot, kvm, true, true);
>   			uv_unregister_mem_slot(kvm->arch.lpid, memslot->id);
>   		}
>   	}
> diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c b/arch/powerpc/kvm/book3s_hv_uvmem.c
> index 3599aaa..666d1bb 100644
> --- a/arch/powerpc/kvm/book3s_hv_uvmem.c
> +++ b/arch/powerpc/kvm/book3s_hv_uvmem.c
> @@ -98,7 +98,127 @@
>   static unsigned long *kvmppc_uvmem_bitmap;
>   static DEFINE_SPINLOCK(kvmppc_uvmem_bitmap_lock);
>   
> -#define KVMPPC_UVMEM_PFN	(1UL << 63)
> +/*
> + * States of a GFN
> + * ---------------
> + * The GFN can be in one of the following states.
> + *
> + * (a) Secure - The GFN is secure. The GFN is associated with
> + *	a Secure VM, the contents of the GFN is not accessible
> + *	to the Hypervisor.  This GFN can be backed by a secure-PFN,
> + *	or can be backed by a normal-PFN with contents encrypted.
> + *	The former is true when the GFN is paged-in into the
> + *	ultravisor. The latter is true when the GFN is paged-out
> + *	of the ultravisor.
> + *
> + * (b) Shared - The GFN is shared. The GFN is associated with a
> + *	a secure VM. The contents of the GFN is accessible to
> + *	Hypervisor. This GFN is backed by a normal-PFN and its
> + *	content is un-encrypted.
> + *
> + * (c) Normal - The GFN is a normal. The GFN is associated with
> + *	a normal VM. The contents of the GFN is accesible to
> + *	the Hypervisor. Its content is never encrypted.
> + *
> + * States of a VM.
> + * ---------------
> + *
> + * Normal VM:  A VM whose contents are always accessible to
> + *	the hypervisor.  All its GFNs are normal-GFNs.
> + *
> + * Secure VM: A VM whose contents are not accessible to the
> + *	hypervisor without the VM's consent.  Its GFNs are
> + *	either Shared-GFN or Secure-GFNs.
> + *
> + * Transient VM: A Normal VM that is transitioning to secure VM.
> + *	The transition starts on successful return of
> + *	H_SVM_INIT_START, and ends on successful return
> + *	of H_SVM_INIT_DONE. This transient VM, can have GFNs
> + *	in any of the three states; i.e Secure-GFN, Shared-GFN,
> + *	and Normal-GFN.	The VM never executes in this state
> + *	in supervisor-mode.
> + *
> + * Memory slot State.
> + * -----------------------------
> + *	The state of a memory slot mirrors the state of the
> + *	VM the memory slot is associated with.
> + *
> + * VM State transition.
> + * --------------------
> + *
> + *  A VM always starts in Normal Mode.
> + *
> + *  H_SVM_INIT_START moves the VM into transient state. During this
> + *  time the Ultravisor may request some of its GFNs to be shared or
> + *  secured. So its GFNs can be in one of the three GFN states.
> + *
> + *  H_SVM_INIT_DONE moves the VM entirely from transient state to
> + *  secure-state. At this point any left-over normal-GFNs are
> + *  transitioned to Secure-GFN.
> + *
> + *  H_SVM_INIT_ABORT moves the transient VM back to normal VM.
> + *  All its GFNs are moved to Normal-GFNs.
> + *
> + *  UV_TERMINATE transitions the secure-VM back to normal-VM. All
> + *  the secure-GFN and shared-GFNs are tranistioned to normal-GFN
> + *  Note: The contents of the normal-GFN is undefined at this point.
> + *
> + * GFN state implementation:
> + * -------------------------
> + *
> + * Secure GFN is associated with a secure-PFN; also called uvmem_pfn,
> + * when the GFN is paged-in. Its pfn[] has KVMPPC_GFN_UVMEM_PFN flag
> + * set, and contains the value of the secure-PFN.
> + * It is associated with a normal-PFN; also called mem_pfn, when
> + * the GFN is pagedout. Its pfn[] has KVMPPC_GFN_MEM_PFN flag set.
> + * The value of the normal-PFN is not tracked.
> + *
> + * Shared GFN is associated with a normal-PFN. Its pfn[] has
> + * KVMPPC_UVMEM_SHARED_PFN flag set. The value of the normal-PFN
> + * is not tracked.
> + *
> + * Normal GFN is associated with normal-PFN. Its pfn[] has
> + * no flag set. The value of the normal-PFN is not tracked.
> + *
> + * Life cycle of a GFN
> + * --------------------
> + *
> + * --------------------------------------------------------------
> + * |        |     Share  |  Unshare | SVM       |H_SVM_INIT_DONE|
> + * |        |operation   |operation | abort/    |               |
> + * |        |            |          | terminate |               |
> + * -------------------------------------------------------------
> + * |        |            |          |           |               |
> + * | Secure |     Shared | Secure   |Normal     |Secure         |
> + * |        |            |          |           |               |
> + * | Shared |     Shared | Secure   |Normal     |Shared         |
> + * |        |            |          |           |               |
> + * | Normal |     Shared | Secure   |Normal     |Secure         |
> + * --------------------------------------------------------------
> + *
> + * Life cycle of a VM
> + * --------------------
> + *
> + * --------------------------------------------------------------------
> + * |         |  start    |  H_SVM_  |H_SVM_   |H_SVM_     |UV_SVM_    |
> + * |         |  VM       |INIT_START|INIT_DONE|INIT_ABORT |TERMINATE  |
> + * |         |           |          |         |           |           |
> + * --------- ----------------------------------------------------------
> + * |         |           |          |         |           |           |
> + * | Normal  | Normal    | Transient|Error    |Error      |Normal     |
> + * |         |           |          |         |           |           |
> + * | Secure  |   Error   | Error    |Error    |Error      |Normal     |
> + * |         |           |          |         |           |           |
> + * |Transient|   N/A     | Error    |Secure   |Normal     |Normal     |
> + * --------------------------------------------------------------------
> + */
> +
> +#define KVMPPC_GFN_UVMEM_PFN	(1UL << 63)
> +#define KVMPPC_GFN_MEM_PFN	(1UL << 62)
> +#define KVMPPC_GFN_SHARED	(1UL << 61)
> +#define KVMPPC_GFN_SECURE	(KVMPPC_GFN_UVMEM_PFN | KVMPPC_GFN_MEM_PFN)
> +#define KVMPPC_GFN_FLAG_MASK	(KVMPPC_GFN_SECURE | KVMPPC_GFN_SHARED)
> +#define KVMPPC_GFN_PFN_MASK	(~KVMPPC_GFN_FLAG_MASK)
>   
>   struct kvmppc_uvmem_slot {
>   	struct list_head list;
> @@ -106,11 +226,11 @@ struct kvmppc_uvmem_slot {
>   	unsigned long base_pfn;
>   	unsigned long *pfns;
>   };
> -
>   struct kvmppc_uvmem_page_pvt {
>   	struct kvm *kvm;
>   	unsigned long gpa;
>   	bool skip_page_out;
> +	bool purge_gfn;
>   };
>   
>   int kvmppc_uvmem_slot_init(struct kvm *kvm, const struct kvm_memory_slot *slot)
> @@ -154,8 +274,8 @@ void kvmppc_uvmem_slot_free(struct kvm *kvm, const struct kvm_memory_slot *slot)
>   	mutex_unlock(&kvm->arch.uvmem_lock);
>   }
>   
> -static void kvmppc_uvmem_pfn_insert(unsigned long gfn, unsigned long uvmem_pfn,
> -				    struct kvm *kvm)
> +static void kvmppc_mark_gfn(unsigned long gfn, struct kvm *kvm,
> +			unsigned long flag, unsigned long uvmem_pfn)
>   {
>   	struct kvmppc_uvmem_slot *p;
>   
> @@ -163,24 +283,41 @@ static void kvmppc_uvmem_pfn_insert(unsigned long gfn, unsigned long uvmem_pfn,
>   		if (gfn >= p->base_pfn && gfn < p->base_pfn + p->nr_pfns) {
>   			unsigned long index = gfn - p->base_pfn;
>   
> -			p->pfns[index] = uvmem_pfn | KVMPPC_UVMEM_PFN;
> +			if (flag = KVMPPC_GFN_UVMEM_PFN)
> +				p->pfns[index] = uvmem_pfn | flag;
> +			else
> +				p->pfns[index] = flag;

That's minoir, but I'm wondering if that check is really needed since all the 
calls to kvmppc_mark_gfn() with flags != KVMPPC_GFN_UVMEM_PFN are made with 
uvmem_pfn = 0.

>   			return;
>   		}
>   	}
>   }
>   
> -static void kvmppc_uvmem_pfn_remove(unsigned long gfn, struct kvm *kvm)
> +/* mark the GFN as secure-GFN associated with @uvmem pfn device-PFN. */
> +static void kvmppc_gfn_secure_uvmem_pfn(unsigned long gfn,
> +			unsigned long uvmem_pfn, struct kvm *kvm)
>   {
> -	struct kvmppc_uvmem_slot *p;
> +	kvmppc_mark_gfn(gfn, kvm, KVMPPC_GFN_UVMEM_PFN, uvmem_pfn);
> +}
>   
> -	list_for_each_entry(p, &kvm->arch.uvmem_pfns, list) {
> -		if (gfn >= p->base_pfn && gfn < p->base_pfn + p->nr_pfns) {
> -			p->pfns[gfn - p->base_pfn] = 0;
> -			return;
> -		}
> -	}
> +/* mark the GFN as secure-GFN associated with a memory-PFN. */
> +static void kvmppc_gfn_secure_mem_pfn(unsigned long gfn, struct kvm *kvm)
> +{
> +	kvmppc_mark_gfn(gfn, kvm, KVMPPC_GFN_MEM_PFN, 0);
>   }
>   
> +/* mark the GFN as a shared GFN. */
> +static void kvmppc_gfn_shared(unsigned long gfn, struct kvm *kvm)
> +{
> +	kvmppc_mark_gfn(gfn, kvm, KVMPPC_GFN_SHARED, 0);
> +}
> +
> +/* mark the GFN as a non-existent GFN. */
> +static void kvmppc_gfn_remove(unsigned long gfn, struct kvm *kvm)
> +{
> +	kvmppc_mark_gfn(gfn, kvm, 0, 0);
> +}
> +
> +/* return true, if the GFN is a secure-GFN backed by a secure-PFN */
>   static bool kvmppc_gfn_is_uvmem_pfn(unsigned long gfn, struct kvm *kvm,
>   				    unsigned long *uvmem_pfn)
>   {
> @@ -190,10 +327,10 @@ static bool kvmppc_gfn_is_uvmem_pfn(unsigned long gfn, struct kvm *kvm,
>   		if (gfn >= p->base_pfn && gfn < p->base_pfn + p->nr_pfns) {
>   			unsigned long index = gfn - p->base_pfn;
>   
> -			if (p->pfns[index] & KVMPPC_UVMEM_PFN) {
> +			if (p->pfns[index] & KVMPPC_GFN_UVMEM_PFN) {
>   				if (uvmem_pfn)
>   					*uvmem_pfn = p->pfns[index] &
> -						     ~KVMPPC_UVMEM_PFN;
> +						     KVMPPC_GFN_PFN_MASK;
>   				return true;
>   			} else
>   				return false;
> @@ -257,9 +394,13 @@ unsigned long kvmppc_h_svm_init_done(struct kvm *kvm)
>    * is HV side fault on these pages. Next we *get* these pages, forcing
>    * fault on them, do fault time migration to replace the device PTEs in
>    * QEMU page table with normal PTEs from newly allocated pages.
> + *
> + * if @purge_gfn is set, invalidate the GFN. GFN is not shared nor secure
> + * anymore.
>    */
>   void kvmppc_uvmem_drop_pages(const struct kvm_memory_slot *free,
> -			     struct kvm *kvm, bool skip_page_out)
> +			     struct kvm *kvm, bool skip_page_out,
> +			     bool purge_gfn)
>   {
>   	int i;
>   	struct kvmppc_uvmem_page_pvt *pvt;
> @@ -270,14 +411,17 @@ void kvmppc_uvmem_drop_pages(const struct kvm_memory_slot *free,
>   		struct page *uvmem_page;
>   
>   		mutex_lock(&kvm->arch.uvmem_lock);
> +
>   		if (!kvmppc_gfn_is_uvmem_pfn(gfn, kvm, &uvmem_pfn)) {
> +			if (purge_gfn)
> +				kvmppc_gfn_remove(gfn, kvm);
>   			mutex_unlock(&kvm->arch.uvmem_lock);
>   			continue;
>   		}
> -
>   		uvmem_page = pfn_to_page(uvmem_pfn);
>   		pvt = uvmem_page->zone_device_data;
>   		pvt->skip_page_out = skip_page_out;
> +		pvt->purge_gfn = purge_gfn;
>   		mutex_unlock(&kvm->arch.uvmem_lock);
>   
>   		pfn = gfn_to_pfn(kvm, gfn);
> @@ -305,7 +449,7 @@ unsigned long kvmppc_h_svm_init_abort(struct kvm *kvm)
>   	srcu_idx = srcu_read_lock(&kvm->srcu);
>   
>   	kvm_for_each_memslot(memslot, kvm_memslots(kvm))
> -		kvmppc_uvmem_drop_pages(memslot, kvm, false);
> +		kvmppc_uvmem_drop_pages(memslot, kvm, false, true);
>   
>   	srcu_read_unlock(&kvm->srcu, srcu_idx);
>   
> @@ -347,7 +491,7 @@ static struct page *kvmppc_uvmem_get_page(unsigned long gpa, struct kvm *kvm)
>   		goto out_clear;
>   
>   	uvmem_pfn = bit + pfn_first;
> -	kvmppc_uvmem_pfn_insert(gpa >> PAGE_SHIFT, uvmem_pfn, kvm);
> +	kvmppc_gfn_secure_uvmem_pfn(gpa >> PAGE_SHIFT, uvmem_pfn, kvm);
>   
>   	pvt->gpa = gpa;
>   	pvt->kvm = kvm;
> @@ -454,6 +598,7 @@ static unsigned long kvmppc_share_page(struct kvm *kvm, unsigned long gpa,
>   		uvmem_page = pfn_to_page(uvmem_pfn);
>   		pvt = uvmem_page->zone_device_data;
>   		pvt->skip_page_out = true;
> +		pvt->purge_gfn = false;
>   	}
>   
>   retry:
> @@ -467,12 +612,16 @@ static unsigned long kvmppc_share_page(struct kvm *kvm, unsigned long gpa,
>   		uvmem_page = pfn_to_page(uvmem_pfn);
>   		pvt = uvmem_page->zone_device_data;
>   		pvt->skip_page_out = true;
> +		pvt->purge_gfn = false;
>   		kvm_release_pfn_clean(pfn);
>   		goto retry;
>   	}
>   
> -	if (!uv_page_in(kvm->arch.lpid, pfn << page_shift, gpa, 0, page_shift))
> +	if (!uv_page_in(kvm->arch.lpid, pfn << page_shift, gpa, 0,
> +				page_shift)) {
> +		kvmppc_gfn_shared(gfn, kvm);
>   		ret = H_SUCCESS;
> +	}
>   	kvm_release_pfn_clean(pfn);
>   	mutex_unlock(&kvm->arch.uvmem_lock);
>   out:
> @@ -530,6 +679,7 @@ unsigned long kvmppc_h_svm_page_in(struct kvm *kvm, unsigned long gpa,
>   	if (!kvmppc_svm_page_in(vma, start, end, gpa, kvm, page_shift,
>   				&downgrade))
>   		ret = H_SUCCESS;
> +
>   out_unlock:
>   	mutex_unlock(&kvm->arch.uvmem_lock);
>   out:
> @@ -655,7 +805,10 @@ static void kvmppc_uvmem_page_free(struct page *page)
>   
>   	pvt = page->zone_device_data;
>   	page->zone_device_data = NULL;
> -	kvmppc_uvmem_pfn_remove(pvt->gpa >> PAGE_SHIFT, pvt->kvm);
> +	if (pvt->purge_gfn)
> +		kvmppc_gfn_remove(pvt->gpa >> PAGE_SHIFT, pvt->kvm);
> +	else
> +		kvmppc_gfn_secure_mem_pfn(pvt->gpa >> PAGE_SHIFT, pvt->kvm);
>   	kfree(pvt);
>   }
>   
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 0/4] Migrate non-migrated pages of a SVM.
  2020-06-18  9:19 ` Ram Pai
@ 2020-06-18 20:37   ` Ram Pai
  -1 siblings, 0 replies; 16+ messages in thread
From: Ram Pai @ 2020-06-18 20:37 UTC (permalink / raw)
  To: kvm-ppc, linuxppc-dev
  Cc: ldufour, cclaudio, bharata, sathnaga, aneesh.kumar, sukadev,
	bauerman, david

I should have elaborated on the problem and the need for these patches.

Explaining it here. Will add it to the series in next version.

-------------------------------------------------------------

The time taken to switch a VM to Secure-VM, increases by the size of
the VM.  A 100GB VM takes about 7minutes. This is unacceptable.  This
linear increase is caused by a suboptimal behavior by the Ultravisor and
the Hypervisor.  The Ultravisor unnecessarily migrates all the GFN of
the VM from normal-memory to secure-memory. It has to just migrate the
necessary and sufficient GFNs.

However when the optimization is incorporated in the Ultravisor, the
Hypervisor starts misbehaving. The Hypervisor has a inbuilt assumption
that the Ultravisor will explicitly request to migrate, each and every
GFN of the VM. If only necessary and sufficient GFNs are requested for
migration, the Hypervisor continues to manage the rest of the GFNs are
normal GFNs. This leads of memory corruption, manifested consistently
when the SVM reboots.

The same is true, when a memory slot is hotplugged into a SVM. The
Hypervisor expects the ultravisor to request migration of all GFNs to
secure-GFN.  But at the same time the hypervisor is unable to handle any
H_SVM_PAGE_IN requests from the Ultravisor, done in the context of
UV_REGISTER_MEM_SLOT ucall.  This problem manifests as random errors in
the SVM, when a memory-slot is hotplugged.

This patch series automatically migrates the non-migrated pages of a SVM,
and thus solves the problem.

------------------------------------------------------------------



On Thu, Jun 18, 2020 at 02:19:01AM -0700, Ram Pai wrote:
> This patch series migrates the non-migrated pages of a SVM.
> This is required when the UV calls H_SVM_INIT_DONE, and
> when a memory-slot is hotplugged to a Secure VM.
> 
> Testing: Passed rigorous SVM reboot test using different
> 	sized SVMs.
> 
> Changelog:
> 	. fixed a bug observed by Bharata. Pages that
> 	where paged-in and later paged-out must also be
> 	skipped from migration during H_SVM_INIT_DONE.
> 
> Laurent Dufour (1):
>   KVM: PPC: Book3S HV: migrate hot plugged memory
> 
> Ram Pai (3):
>   KVM: PPC: Book3S HV: Fix function definition in book3s_hv_uvmem.c
>   KVM: PPC: Book3S HV: track the state GFNs associated with secure VMs
>   KVM: PPC: Book3S HV: migrate remaining normal-GFNs to secure-GFNs in
>     H_SVM_INIT_DONE
> 
>  Documentation/powerpc/ultravisor.rst        |   2 +
>  arch/powerpc/include/asm/kvm_book3s_uvmem.h |   8 +-
>  arch/powerpc/kvm/book3s_64_mmu_radix.c      |   2 +-
>  arch/powerpc/kvm/book3s_hv.c                |  12 +-
>  arch/powerpc/kvm/book3s_hv_uvmem.c          | 449 ++++++++++++++++++++++------
>  5 files changed, 368 insertions(+), 105 deletions(-)
> 
> -- 
> 1.8.3.1

-- 
Ram Pai

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 0/4] Migrate non-migrated pages of a SVM.
@ 2020-06-18 20:37   ` Ram Pai
  0 siblings, 0 replies; 16+ messages in thread
From: Ram Pai @ 2020-06-18 20:37 UTC (permalink / raw)
  To: kvm-ppc, linuxppc-dev
  Cc: ldufour, cclaudio, bharata, sathnaga, aneesh.kumar, sukadev,
	bauerman, david

I should have elaborated on the problem and the need for these patches.

Explaining it here. Will add it to the series in next version.

-------------------------------------------------------------

The time taken to switch a VM to Secure-VM, increases by the size of
the VM.  A 100GB VM takes about 7minutes. This is unacceptable.  This
linear increase is caused by a suboptimal behavior by the Ultravisor and
the Hypervisor.  The Ultravisor unnecessarily migrates all the GFN of
the VM from normal-memory to secure-memory. It has to just migrate the
necessary and sufficient GFNs.

However when the optimization is incorporated in the Ultravisor, the
Hypervisor starts misbehaving. The Hypervisor has a inbuilt assumption
that the Ultravisor will explicitly request to migrate, each and every
GFN of the VM. If only necessary and sufficient GFNs are requested for
migration, the Hypervisor continues to manage the rest of the GFNs are
normal GFNs. This leads of memory corruption, manifested consistently
when the SVM reboots.

The same is true, when a memory slot is hotplugged into a SVM. The
Hypervisor expects the ultravisor to request migration of all GFNs to
secure-GFN.  But at the same time the hypervisor is unable to handle any
H_SVM_PAGE_IN requests from the Ultravisor, done in the context of
UV_REGISTER_MEM_SLOT ucall.  This problem manifests as random errors in
the SVM, when a memory-slot is hotplugged.

This patch series automatically migrates the non-migrated pages of a SVM,
and thus solves the problem.

------------------------------------------------------------------



On Thu, Jun 18, 2020 at 02:19:01AM -0700, Ram Pai wrote:
> This patch series migrates the non-migrated pages of a SVM.
> This is required when the UV calls H_SVM_INIT_DONE, and
> when a memory-slot is hotplugged to a Secure VM.
> 
> Testing: Passed rigorous SVM reboot test using different
> 	sized SVMs.
> 
> Changelog:
> 	. fixed a bug observed by Bharata. Pages that
> 	where paged-in and later paged-out must also be
> 	skipped from migration during H_SVM_INIT_DONE.
> 
> Laurent Dufour (1):
>   KVM: PPC: Book3S HV: migrate hot plugged memory
> 
> Ram Pai (3):
>   KVM: PPC: Book3S HV: Fix function definition in book3s_hv_uvmem.c
>   KVM: PPC: Book3S HV: track the state GFNs associated with secure VMs
>   KVM: PPC: Book3S HV: migrate remaining normal-GFNs to secure-GFNs in
>     H_SVM_INIT_DONE
> 
>  Documentation/powerpc/ultravisor.rst        |   2 +
>  arch/powerpc/include/asm/kvm_book3s_uvmem.h |   8 +-
>  arch/powerpc/kvm/book3s_64_mmu_radix.c      |   2 +-
>  arch/powerpc/kvm/book3s_hv.c                |  12 +-
>  arch/powerpc/kvm/book3s_hv_uvmem.c          | 449 ++++++++++++++++++++++------
>  5 files changed, 368 insertions(+), 105 deletions(-)
> 
> -- 
> 1.8.3.1

-- 
Ram Pai

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 2/4] KVM: PPC: Book3S HV: track the state GFNs associated with secure VMs
  2020-06-18 13:31     ` Laurent Dufour
@ 2020-06-19  1:16       ` Ram Pai
  -1 siblings, 0 replies; 16+ messages in thread
From: Ram Pai @ 2020-06-19  1:16 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: cclaudio, kvm-ppc, bharata, sathnaga, aneesh.kumar, sukadev,
	linuxppc-dev, bauerman, david

On Thu, Jun 18, 2020 at 03:31:06PM +0200, Laurent Dufour wrote:
> Le 18/06/2020 à 11:19, Ram Pai a écrit :
> >

.snip..

> >************************************************************************
> >  1. States of a GFN
> >     ---------------
> >  The GFN can be in one of the following states.
> >diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c

...snip...

> >index 803940d..3448459 100644
> >--- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
> >+++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
> >@@ -1100,7 +1100,7 @@ void kvmppc_radix_flush_memslot(struct kvm *kvm,
> >  	unsigned int shift;
> >  	if (kvm->arch.secure_guest & KVMPPC_SECURE_INIT_START)
> >-		kvmppc_uvmem_drop_pages(memslot, kvm, true);
> >+		kvmppc_uvmem_drop_pages(memslot, kvm, true, false);
> 
> When reviewing the v1 of this series, I asked you the question about
> the fact that the call here is made with purge_gfn = false. Your
> answer was:
> 
> >This function does not know, under what context it is called. Since
> >its job is to just flush the memslot, it cannot assume anything
> >about purging the pages in the memslot.
> 
> Indeed in the case of the memory hotplug operation, this function is
> called to wipe the page from the secure device in the case the pages
> are secured. In that case the purge is required. Indeed, I checked
> the other call to kvmppc_radix_flush_memslot() in
> kvmppc_core_flush_memslot_hv() and I cannot see why in that case too
> purge_gfn should be false, especially when the memslot is reused as
> detailed in __kvm_set_memory_region() around the call to
> kvm_arch_flush_shadow_memslot().
> 
> I'm sorry to not have ask this earlier, but could you please elaborate on this?

You are right. kvmppc_radix_flush_memslot() is getting called everytime with
the intention of disassociating the memslot from that VM. Which implies,
the memslot is intended to be deleted and possibly reused.

I should be calling kvmppc_uvmem_drop_pages() with purge_gfn=true, here
aswell.

I expect some form of problem showing up in memhot-plug/unplug path.

RP


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 2/4] KVM: PPC: Book3S HV: track the state GFNs associated with secure VMs
@ 2020-06-19  1:16       ` Ram Pai
  0 siblings, 0 replies; 16+ messages in thread
From: Ram Pai @ 2020-06-19  1:16 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: cclaudio, kvm-ppc, bharata, sathnaga, aneesh.kumar, sukadev,
	linuxppc-dev, bauerman, david

On Thu, Jun 18, 2020 at 03:31:06PM +0200, Laurent Dufour wrote:
> Le 18/06/2020 à 11:19, Ram Pai a écrit :
> >

.snip..

> >************************************************************************
> >  1. States of a GFN
> >     ---------------
> >  The GFN can be in one of the following states.
> >diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c

...snip...

> >index 803940d..3448459 100644
> >--- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
> >+++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
> >@@ -1100,7 +1100,7 @@ void kvmppc_radix_flush_memslot(struct kvm *kvm,
> >  	unsigned int shift;
> >  	if (kvm->arch.secure_guest & KVMPPC_SECURE_INIT_START)
> >-		kvmppc_uvmem_drop_pages(memslot, kvm, true);
> >+		kvmppc_uvmem_drop_pages(memslot, kvm, true, false);
> 
> When reviewing the v1 of this series, I asked you the question about
> the fact that the call here is made with purge_gfn = false. Your
> answer was:
> 
> >This function does not know, under what context it is called. Since
> >its job is to just flush the memslot, it cannot assume anything
> >about purging the pages in the memslot.
> 
> Indeed in the case of the memory hotplug operation, this function is
> called to wipe the page from the secure device in the case the pages
> are secured. In that case the purge is required. Indeed, I checked
> the other call to kvmppc_radix_flush_memslot() in
> kvmppc_core_flush_memslot_hv() and I cannot see why in that case too
> purge_gfn should be false, especially when the memslot is reused as
> detailed in __kvm_set_memory_region() around the call to
> kvm_arch_flush_shadow_memslot().
> 
> I'm sorry to not have ask this earlier, but could you please elaborate on this?

You are right. kvmppc_radix_flush_memslot() is getting called everytime with
the intention of disassociating the memslot from that VM. Which implies,
the memslot is intended to be deleted and possibly reused.

I should be calling kvmppc_uvmem_drop_pages() with purge_gfn=true, here
aswell.

I expect some form of problem showing up in memhot-plug/unplug path.

RP

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2020-06-19  1:18 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-18  9:19 [PATCH v2 0/4] Migrate non-migrated pages of a SVM Ram Pai
2020-06-18  9:19 ` Ram Pai
2020-06-18  9:19 ` [PATCH v2 1/4] KVM: PPC: Book3S HV: Fix function definition in book3s_hv_uvmem.c Ram Pai
2020-06-18  9:19   ` Ram Pai
2020-06-18  9:19 ` [PATCH v2 2/4] KVM: PPC: Book3S HV: track the state GFNs associated with secure VMs Ram Pai
2020-06-18  9:19   ` Ram Pai
2020-06-18 13:31   ` Laurent Dufour
2020-06-18 13:31     ` Laurent Dufour
2020-06-19  1:16     ` Ram Pai
2020-06-19  1:16       ` Ram Pai
2020-06-18  9:19 ` [PATCH v2 3/4] KVM: PPC: Book3S HV: migrate remaining normal-GFNs to secure-GFNs in H_SVM_INIT_DONE Ram Pai
2020-06-18  9:19   ` Ram Pai
2020-06-18  9:19 ` [PATCH v2 4/4] KVM: PPC: Book3S HV: migrate hot plugged memory Ram Pai
2020-06-18  9:19   ` Ram Pai
2020-06-18 20:37 ` [PATCH v2 0/4] Migrate non-migrated pages of a SVM Ram Pai
2020-06-18 20:37   ` Ram Pai

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.