* [PATCH v4 0/4] Enable async page faults on s390
@ 2013-07-10 12:59 ` Dominik Dingel
  0 siblings, 0 replies; 20+ messages in thread
From: Dominik Dingel @ 2013-07-10 12:59 UTC (permalink / raw)
  To: Gleb Natapov, Paolo Bonzini
  Cc: Christian Borntraeger, Heiko Carstens, Martin Schwidefsky,
	Cornelia Huck, Xiantao Zhang, Alexander Graf, Christoffer Dall,
	Marc Zyngier, Ralf Baechle, kvm, linux-s390, linux-mm,
	linux-kernel, Dominik Dingel

Gleb, Paolo, 

based on the work from Martin and Carsten, this implementation enables async page faults.
To the guest it will provide the pfault interface, but internally it uses the
async page fault common code. 

The initial submission and its discussion can be followed at http://www.mail-archive.com/kvm@vger.kernel.org/msg63359.html.

There is a slight modification to the common code to move from a pull-based to a
push-based approach on s390. On s390 we do not want to wait until we leave guest
state to queue the notification interrupts.

To use this feature, the controlling userspace has to enable the capability.
With that knob we can later disable the feature for live migration.

v3 -> v4
 - Change "done" interrupts from local to floating
 - Add a comment for clarification
 - Change KVM_HVA_ERR_BAD handling to move the s390 implementation to the s390 backend

v2 -> v3
 - Reworked the architecture specific parts, to only provide one additional
   implementation
 - Renamed function to kvm_async_page_present_(sync|async)
 - Fixing KVM_HVA_ERR_BAD handling

v1 -> v2:
 - Adding other architecture backends
 - Adding documentation for the ioctl
 - Improving the overall error handling
 - Reducing the needed modifications on the common code

Dominik Dingel (4):
  PF: Add FAULT_FLAG_RETRY_NOWAIT for guest fault
  PF: Make KVM_HVA_ERR_BAD usable on s390
  PF: Provide additional direct page notification
  PF: Async page fault support on s390

 Documentation/s390/kvm.txt        |  24 ++++++++
 arch/s390/include/asm/kvm_host.h  |  30 ++++++++++
 arch/s390/include/asm/pgtable.h   |   2 +
 arch/s390/include/asm/processor.h |   1 +
 arch/s390/include/uapi/asm/kvm.h  |  10 ++++
 arch/s390/kvm/Kconfig             |   2 +
 arch/s390/kvm/Makefile            |   2 +-
 arch/s390/kvm/diag.c              |  63 ++++++++++++++++++++
 arch/s390/kvm/interrupt.c         |  43 +++++++++++---
 arch/s390/kvm/kvm-s390.c          | 118 ++++++++++++++++++++++++++++++++++++++
 arch/s390/kvm/kvm-s390.h          |   4 ++
 arch/s390/kvm/sigp.c              |   6 ++
 arch/s390/mm/fault.c              |  26 +++++++--
 arch/x86/kvm/mmu.c                |   2 +-
 include/linux/kvm_host.h          |  10 +++-
 include/uapi/linux/kvm.h          |   2 +
 virt/kvm/Kconfig                  |   4 ++
 virt/kvm/async_pf.c               |  22 ++++++-
 18 files changed, 354 insertions(+), 17 deletions(-)

-- 
1.8.2.2



* [PATCH 1/4] PF: Add FAULT_FLAG_RETRY_NOWAIT for guest fault
  2013-07-10 12:59 ` Dominik Dingel
@ 2013-07-10 12:59   ` Dominik Dingel
  -1 siblings, 0 replies; 20+ messages in thread
From: Dominik Dingel @ 2013-07-10 12:59 UTC (permalink / raw)
  To: Gleb Natapov, Paolo Bonzini
  Cc: Christian Borntraeger, Heiko Carstens, Martin Schwidefsky,
	Cornelia Huck, Xiantao Zhang, Alexander Graf, Christoffer Dall,
	Marc Zyngier, Ralf Baechle, kvm, linux-s390, linux-mm,
	linux-kernel, Dominik Dingel

In case of a fault retry, exit sie64() with a gmap_fault indication set for
the running thread. This makes it possible to handle async page faults
without the need for mm notifiers.
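
The retry handling this introduces can be sketched in plain userspace C. The
flag values and the helper below are illustrative stand-ins for the kernel's
definitions, not the real implementation: with pfaults enabled, a fault that
would block is reported back to the SIE exit path instead of being retried
under mmap_sem.

```c
#include <assert.h>

/* Illustrative stand-ins for the kernel's fault flags and codes
 * (the numeric values are made up for this sketch). */
#define FAULT_FLAG_ALLOW_RETRY  0x01
#define FAULT_FLAG_RETRY_NOWAIT 0x02
#define FAULT_FLAG_TRIED        0x04

#define VM_FAULT_RETRY  0x0400
#define VM_FAULT_PFAULT 0x100000

/* Model of the retry branch in do_exception(): if NOWAIT was set for a
 * guest fault, report VM_FAULT_PFAULT and flag a pending pfault instead
 * of waiting; otherwise clear the retry flags, mark the fault as tried,
 * and let the caller loop back for one more attempt. */
static int handle_retry(int fault, unsigned int *flags, int *gmap_pfault)
{
    if (!(fault & VM_FAULT_RETRY))
        return fault;
    if (*flags & FAULT_FLAG_RETRY_NOWAIT) {
        *gmap_pfault = 1;          /* tell __vcpu_run to fault in async */
        return VM_FAULT_PFAULT;
    }
    *flags &= ~(FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_RETRY_NOWAIT);
    *flags |= FAULT_FLAG_TRIED;
    return fault;                  /* caller retries the fault */
}
```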

Based on a patch from Martin Schwidefsky.

Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 arch/s390/include/asm/pgtable.h   |  2 ++
 arch/s390/include/asm/processor.h |  1 +
 arch/s390/kvm/kvm-s390.c          | 13 +++++++++++++
 arch/s390/mm/fault.c              | 26 ++++++++++++++++++++++----
 4 files changed, 38 insertions(+), 4 deletions(-)

diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
index 0ea4e59..4a4cc64 100644
--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -740,6 +740,7 @@ static inline void pgste_set_pte(pte_t *ptep, pte_t entry)
  * @table: pointer to the page directory
  * @asce: address space control element for gmap page table
  * @crst_list: list of all crst tables used in the guest address space
+ * @pfault_enabled: defines if pfaults are applicable for the guest
  */
 struct gmap {
 	struct list_head list;
@@ -748,6 +749,7 @@ struct gmap {
 	unsigned long asce;
 	void *private;
 	struct list_head crst_list;
+	unsigned long pfault_enabled;
 };
 
 /**
diff --git a/arch/s390/include/asm/processor.h b/arch/s390/include/asm/processor.h
index 6b49987..4fa96ca 100644
--- a/arch/s390/include/asm/processor.h
+++ b/arch/s390/include/asm/processor.h
@@ -77,6 +77,7 @@ struct thread_struct {
         unsigned long ksp;              /* kernel stack pointer             */
 	mm_segment_t mm_segment;
 	unsigned long gmap_addr;	/* address of last gmap fault. */
+	unsigned int gmap_pfault;	/* signal of a pending guest pfault */
 	struct per_regs per_user;	/* User specified PER registers */
 	struct per_event per_event;	/* Cause of the last PER trap */
 	unsigned long per_flags;	/* Flags to control debug behavior */
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index ba694d2..702daca 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -682,6 +682,15 @@ static int kvm_s390_handle_requests(struct kvm_vcpu *vcpu)
 	return 0;
 }
 
+static void kvm_arch_fault_in_sync(struct kvm_vcpu *vcpu)
+{
+	hva_t fault = gmap_fault(current->thread.gmap_addr, vcpu->arch.gmap);
+	struct mm_struct *mm = current->mm;
+	down_read(&mm->mmap_sem);
+	get_user_pages(current, mm, fault, 1, 1, 0, NULL, NULL);
+	up_read(&mm->mmap_sem);
+}
+
 static int __vcpu_run(struct kvm_vcpu *vcpu)
 {
 	int rc;
@@ -715,6 +724,10 @@ static int __vcpu_run(struct kvm_vcpu *vcpu)
 	if (rc < 0) {
 		if (kvm_is_ucontrol(vcpu->kvm)) {
 			rc = SIE_INTERCEPT_UCONTROL;
+		} else if (current->thread.gmap_pfault) {
+			kvm_arch_fault_in_sync(vcpu);
+			current->thread.gmap_pfault = 0;
+			rc = 0;
 		} else {
 			VCPU_EVENT(vcpu, 3, "%s", "fault in sie instruction");
 			trace_kvm_s390_sie_fault(vcpu);
diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c
index 047c3e4..7d4c4b1 100644
--- a/arch/s390/mm/fault.c
+++ b/arch/s390/mm/fault.c
@@ -50,6 +50,7 @@
 #define VM_FAULT_BADMAP		0x020000
 #define VM_FAULT_BADACCESS	0x040000
 #define VM_FAULT_SIGNAL		0x080000
+#define VM_FAULT_PFAULT		0x100000
 
 static unsigned long store_indication __read_mostly;
 
@@ -232,6 +233,7 @@ static noinline void do_fault_error(struct pt_regs *regs, int fault)
 			return;
 		}
 	case VM_FAULT_BADCONTEXT:
+	case VM_FAULT_PFAULT:
 		do_no_context(regs);
 		break;
 	case VM_FAULT_SIGNAL:
@@ -269,6 +271,9 @@ static noinline void do_fault_error(struct pt_regs *regs, int fault)
  */
 static inline int do_exception(struct pt_regs *regs, int access)
 {
+#ifdef CONFIG_PGSTE
+	struct gmap *gmap;
+#endif
 	struct task_struct *tsk;
 	struct mm_struct *mm;
 	struct vm_area_struct *vma;
@@ -307,9 +312,10 @@ static inline int do_exception(struct pt_regs *regs, int access)
 	down_read(&mm->mmap_sem);
 
 #ifdef CONFIG_PGSTE
-	if ((current->flags & PF_VCPU) && S390_lowcore.gmap) {
-		address = __gmap_fault(address,
-				     (struct gmap *) S390_lowcore.gmap);
+	gmap = (struct gmap *)
+		((current->flags & PF_VCPU) ? S390_lowcore.gmap : 0);
+	if (gmap) {
+		address = __gmap_fault(address, gmap);
 		if (address == -EFAULT) {
 			fault = VM_FAULT_BADMAP;
 			goto out_up;
@@ -318,6 +324,8 @@ static inline int do_exception(struct pt_regs *regs, int access)
 			fault = VM_FAULT_OOM;
 			goto out_up;
 		}
+		if (test_bit(1, &gmap->pfault_enabled))
+			flags |= FAULT_FLAG_RETRY_NOWAIT;
 	}
 #endif
 
@@ -374,9 +382,19 @@ retry:
 				      regs, address);
 		}
 		if (fault & VM_FAULT_RETRY) {
+#ifdef CONFIG_PGSTE
+			if (gmap && (flags & FAULT_FLAG_RETRY_NOWAIT)) {
+				/* FAULT_FLAG_RETRY_NOWAIT has been set,
+				 * mmap_sem has not been released */
+				current->thread.gmap_pfault = 1;
+				fault = VM_FAULT_PFAULT;
+				goto out_up;
+			}
+#endif
 			/* Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk
 			 * of starvation. */
-			flags &= ~FAULT_FLAG_ALLOW_RETRY;
+			flags &= ~(FAULT_FLAG_ALLOW_RETRY |
+				   FAULT_FLAG_RETRY_NOWAIT);
 			flags |= FAULT_FLAG_TRIED;
 			down_read(&mm->mmap_sem);
 			goto retry;
-- 
1.8.2.2



* [PATCH 2/4] PF: Make KVM_HVA_ERR_BAD usable on s390
  2013-07-10 12:59 ` Dominik Dingel
@ 2013-07-10 12:59   ` Dominik Dingel
  -1 siblings, 0 replies; 20+ messages in thread
From: Dominik Dingel @ 2013-07-10 12:59 UTC (permalink / raw)
  To: Gleb Natapov, Paolo Bonzini
  Cc: Christian Borntraeger, Heiko Carstens, Martin Schwidefsky,
	Cornelia Huck, Xiantao Zhang, Alexander Graf, Christoffer Dall,
	Marc Zyngier, Ralf Baechle, kvm, linux-s390, linux-mm,
	linux-kernel, Dominik Dingel

Current common code uses PAGE_OFFSET to indicate a bad host virtual address.
As this check won't work on architectures that don't map kernel and user memory
into the same address space (e.g. s390), such architectures can now provide
their own KVM_HVA_ERR_BAD defines.
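
The difference between the two error-hva schemes can be illustrated with a
small userspace sketch. The PAGE_OFFSET value below is an arbitrary example,
not any architecture's actual layout:

```c
#include <assert.h>
#include <stdbool.h>

/* Common-code scheme: kernel and user share one address space, so any
 * hva at or above PAGE_OFFSET can serve as an error indicator. */
#define PAGE_OFFSET 0xc0000000UL
#define KVM_HVA_ERR_BAD_COMMON (PAGE_OFFSET)

static bool kvm_is_error_hva_common(unsigned long addr)
{
    return addr >= PAGE_OFFSET;
}

/* s390 scheme from this patch: user and kernel live in separate address
 * spaces, so almost any value can be a valid user address; a single
 * sentinel (-1UL) marks the error instead. */
#define KVM_HVA_ERR_BAD_S390 (-1UL)

static bool kvm_is_error_hva_s390(unsigned long addr)
{
    return addr == KVM_HVA_ERR_BAD_S390;
}
```

Under the common scheme a value like 0xc0000000 is always an error; under the
s390 scheme it may be a perfectly valid user address, which is why the patch
lets the architecture override the define and the check together.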

Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
---
 arch/s390/include/asm/kvm_host.h | 8 ++++++++
 include/linux/kvm_host.h         | 8 ++++++++
 2 files changed, 16 insertions(+)

diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index 3238d40..cd30c3d 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -274,6 +274,14 @@ struct kvm_arch{
 	int css_support;
 };
 
+#define KVM_HVA_ERR_BAD		(-1UL)
+#define KVM_HVA_ERR_RO_BAD	(-1UL)
+
+static inline bool kvm_is_error_hva(unsigned long addr)
+{
+	return addr == KVM_HVA_ERR_BAD;
+}
+
 extern int sie64a(struct kvm_s390_sie_block *, u64 *);
 extern char sie_exit;
 #endif
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index a63d83e..92e8f64 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -85,6 +85,12 @@ static inline bool is_noslot_pfn(pfn_t pfn)
 	return pfn == KVM_PFN_NOSLOT;
 }
 
+/*
+ * architectures with KVM_HVA_ERR_BAD other than PAGE_OFFSET (e.g. s390)
+ * provide own defines and kvm_is_error_hva
+ */
+#ifndef KVM_HVA_ERR_BAD
+
 #define KVM_HVA_ERR_BAD		(PAGE_OFFSET)
 #define KVM_HVA_ERR_RO_BAD	(PAGE_OFFSET + PAGE_SIZE)
 
@@ -93,6 +99,8 @@ static inline bool kvm_is_error_hva(unsigned long addr)
 	return addr >= PAGE_OFFSET;
 }
 
+#endif
+
 #define KVM_ERR_PTR_BAD_PAGE	(ERR_PTR(-ENOENT))
 
 static inline bool is_error_page(struct page *page)
-- 
1.8.2.2



* [PATCH 3/4] PF: Provide additional direct page notification
  2013-07-10 12:59 ` Dominik Dingel
@ 2013-07-10 12:59   ` Dominik Dingel
  -1 siblings, 0 replies; 20+ messages in thread
From: Dominik Dingel @ 2013-07-10 12:59 UTC (permalink / raw)
  To: Gleb Natapov, Paolo Bonzini
  Cc: Christian Borntraeger, Heiko Carstens, Martin Schwidefsky,
	Cornelia Huck, Xiantao Zhang, Alexander Graf, Christoffer Dall,
	Marc Zyngier, Ralf Baechle, kvm, linux-s390, linux-mm,
	linux-kernel, Dominik Dingel

By setting a Kconfig option, the architecture can control when guest
notifications will be presented by the apf backend. The default batch
mechanism works as before: the vcpu thread pulls in this information.
The new direct mechanism instead pushes the information to the guest
right away. This way s390 can use an already existing architecture
interface.

The vcpu thread should still call check_completion to clean up leftovers,
which leaves most of the common code untouched.
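
The compile-time split can be sketched as follows. The counters are
illustrative instrumentation to show which path fires, not part of the patch:

```c
#include <assert.h>

/* s390 selects this option; x86 and the other backends leave it unset. */
#define CONFIG_KVM_ASYNC_PF_SYNC 1

/* Illustrative counters showing which path presented the page. */
static int presented_by_worker; /* push: from the async worker */
static int presented_by_vcpu;   /* pull: from the vcpu completion check */

/* Only one of the two wrappers compiles to a real call, so the
 * notification fires exactly once, from the configured side. */
static void kvm_async_page_present_sync(void)
{
#ifdef CONFIG_KVM_ASYNC_PF_SYNC
    presented_by_worker++;      /* notify the guest immediately */
#endif
}

static void kvm_async_page_present_async(void)
{
#ifndef CONFIG_KVM_ASYNC_PF_SYNC
    presented_by_vcpu++;        /* notify when the vcpu thread checks */
#endif
}
```

With the option set, the worker path notifies and the completion check
becomes cleanup only; without it, behavior is unchanged from before.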

Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 arch/x86/kvm/mmu.c       |  2 +-
 include/linux/kvm_host.h |  2 +-
 virt/kvm/Kconfig         |  4 ++++
 virt/kvm/async_pf.c      | 22 +++++++++++++++++++---
 4 files changed, 25 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 0d094da..b8632e9 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -3343,7 +3343,7 @@ static int kvm_arch_setup_async_pf(struct kvm_vcpu *vcpu, gva_t gva, gfn_t gfn)
 	arch.direct_map = vcpu->arch.mmu.direct_map;
 	arch.cr3 = vcpu->arch.mmu.get_cr3(vcpu);
 
-	return kvm_setup_async_pf(vcpu, gva, gfn, &arch);
+	return kvm_setup_async_pf(vcpu, gva, gfn_to_hva(vcpu->kvm, gfn), &arch);
 }
 
 static bool can_do_async_pf(struct kvm_vcpu *vcpu)
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 92e8f64..fe87e46 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -191,7 +191,7 @@ struct kvm_async_pf {
 
 void kvm_clear_async_pf_completion_queue(struct kvm_vcpu *vcpu);
 void kvm_check_async_pf_completion(struct kvm_vcpu *vcpu);
-int kvm_setup_async_pf(struct kvm_vcpu *vcpu, gva_t gva, gfn_t gfn,
+int kvm_setup_async_pf(struct kvm_vcpu *vcpu, gva_t gva, unsigned long hva,
 		       struct kvm_arch_async_pf *arch);
 int kvm_async_pf_wakeup_all(struct kvm_vcpu *vcpu);
 #endif
diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
index 779262f..0774495 100644
--- a/virt/kvm/Kconfig
+++ b/virt/kvm/Kconfig
@@ -22,6 +22,10 @@ config KVM_MMIO
 config KVM_ASYNC_PF
        bool
 
+# Toggle to switch between direct notification and batch job
+config KVM_ASYNC_PF_SYNC
+       bool
+
 config HAVE_KVM_MSI
        bool
 
diff --git a/virt/kvm/async_pf.c b/virt/kvm/async_pf.c
index ea475cd..cfa9366 100644
--- a/virt/kvm/async_pf.c
+++ b/virt/kvm/async_pf.c
@@ -28,6 +28,21 @@
 #include "async_pf.h"
 #include <trace/events/kvm.h>
 
+static inline void kvm_async_page_present_sync(struct kvm_vcpu *vcpu,
+					       struct kvm_async_pf *work)
+{
+#ifdef CONFIG_KVM_ASYNC_PF_SYNC
+	kvm_arch_async_page_present(vcpu, work);
+#endif
+}
+static inline void kvm_async_page_present_async(struct kvm_vcpu *vcpu,
+						struct kvm_async_pf *work)
+{
+#ifndef CONFIG_KVM_ASYNC_PF_SYNC
+	kvm_arch_async_page_present(vcpu, work);
+#endif
+}
+
 static struct kmem_cache *async_pf_cache;
 
 int kvm_async_pf_init(void)
@@ -70,6 +85,7 @@ static void async_pf_execute(struct work_struct *work)
 	down_read(&mm->mmap_sem);
 	get_user_pages(current, mm, addr, 1, 1, 0, &page, NULL);
 	up_read(&mm->mmap_sem);
+	kvm_async_page_present_sync(vcpu, apf);
 	unuse_mm(mm);
 
 	spin_lock(&vcpu->async_pf.lock);
@@ -134,7 +150,7 @@ void kvm_check_async_pf_completion(struct kvm_vcpu *vcpu)
 
 		if (work->page)
 			kvm_arch_async_page_ready(vcpu, work);
-		kvm_arch_async_page_present(vcpu, work);
+		kvm_async_page_present_async(vcpu, work);
 
 		list_del(&work->queue);
 		vcpu->async_pf.queued--;
@@ -144,7 +160,7 @@ void kvm_check_async_pf_completion(struct kvm_vcpu *vcpu)
 	}
 }
 
-int kvm_setup_async_pf(struct kvm_vcpu *vcpu, gva_t gva, gfn_t gfn,
+int kvm_setup_async_pf(struct kvm_vcpu *vcpu, gva_t gva, unsigned long hva,
 		       struct kvm_arch_async_pf *arch)
 {
 	struct kvm_async_pf *work;
@@ -166,7 +182,7 @@ int kvm_setup_async_pf(struct kvm_vcpu *vcpu, gva_t gva, gfn_t gfn,
 	work->done = false;
 	work->vcpu = vcpu;
 	work->gva = gva;
-	work->addr = gfn_to_hva(vcpu->kvm, gfn);
+	work->addr = hva;
 	work->arch = *arch;
 	work->mm = current->mm;
 	atomic_inc(&work->mm->mm_count);
-- 
1.8.2.2



* [PATCH 4/4] PF: Async page fault support on s390
  2013-07-10 12:59 ` Dominik Dingel
@ 2013-07-10 12:59   ` Dominik Dingel
  -1 siblings, 0 replies; 20+ messages in thread
From: Dominik Dingel @ 2013-07-10 12:59 UTC (permalink / raw)
  To: Gleb Natapov, Paolo Bonzini
  Cc: Christian Borntraeger, Heiko Carstens, Martin Schwidefsky,
	Cornelia Huck, Xiantao Zhang, Alexander Graf, Christoffer Dall,
	Marc Zyngier, Ralf Baechle, kvm, linux-s390, linux-mm,
	linux-kernel, Dominik Dingel

This patch enables async page faults for s390 KVM guests.
It provides the userspace API to enable, disable or query the status of the
feature, and it adds the diagnose code handler that the guest calls to enable
async page faults.

Async page faults use an already existing guest interface for this purpose,
as described in "CP Programming Services (SC24-6084)".

Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
---
 Documentation/s390/kvm.txt       |  24 +++++++++
 arch/s390/include/asm/kvm_host.h |  22 ++++++++
 arch/s390/include/uapi/asm/kvm.h |  10 ++++
 arch/s390/kvm/Kconfig            |   2 +
 arch/s390/kvm/Makefile           |   2 +-
 arch/s390/kvm/diag.c             |  63 +++++++++++++++++++++++
 arch/s390/kvm/interrupt.c        |  43 +++++++++++++---
 arch/s390/kvm/kvm-s390.c         | 107 ++++++++++++++++++++++++++++++++++++++-
 arch/s390/kvm/kvm-s390.h         |   4 ++
 arch/s390/kvm/sigp.c             |   6 +++
 include/uapi/linux/kvm.h         |   2 +
 11 files changed, 276 insertions(+), 9 deletions(-)

diff --git a/Documentation/s390/kvm.txt b/Documentation/s390/kvm.txt
index 85f3280..707b7e9 100644
--- a/Documentation/s390/kvm.txt
+++ b/Documentation/s390/kvm.txt
@@ -70,6 +70,30 @@ floating interrupts are:
 KVM_S390_INT_VIRTIO
 KVM_S390_INT_SERVICE
 
+ioctl:      KVM_S390_APF_ENABLE:
+args:       none
+This ioctl enables the async page fault interface. Once enabled, the host
+can submit pfault tokens to the guest when a host page fault occurs.
+
+ioctl:      KVM_S390_APF_DISABLE:
+args:       none
+This ioctl disables the async page fault interface. From this point on, no
+new pfault tokens will be issued to the guest. Async page faults that are
+already in flight are not affected and will be handled normally.
+
+ioctl:      KVM_S390_APF_STATUS:
+args:       none
+This ioctl allows userspace to query the current status of the APF feature.
+Its main purpose is to ensure that no pfault tokens are lost during live
+migration or similar management operations.
+The possible return values are:
+KVM_S390_APF_DISABLED_NON_PENDING
+KVM_S390_APF_DISABLED_PENDING
+KVM_S390_APF_ENABLED_NON_PENDING
+KVM_S390_APF_ENABLED_PENDING
+Caution: if KVM_S390_APF is enabled, the PENDING status may already have
+changed by the time the ioctl returns to userspace.
+
 3. ioctl calls to the kvm-vcpu file descriptor
 KVM does support the following ioctls on s390 that are common with other
 architectures and do behave the same:
diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index cd30c3d..e8012fc 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -257,6 +257,10 @@ struct kvm_vcpu_arch {
 		u64		stidp_data;
 	};
 	struct gmap *gmap;
+#define KVM_S390_PFAULT_TOKEN_INVALID	(-1UL)
+	unsigned long pfault_token;
+	unsigned long pfault_select;
+	unsigned long pfault_compare;
 };
 
 struct kvm_vm_stat {
@@ -282,6 +286,24 @@ static inline bool kvm_is_error_hva(unsigned long addr)
 	return addr == KVM_HVA_ERR_BAD;
 }
 
+#define ASYNC_PF_PER_VCPU	64
+struct kvm_vcpu;
+struct kvm_async_pf;
+struct kvm_arch_async_pf {
+	unsigned long pfault_token;
+};
+
+bool kvm_arch_can_inject_async_page_present(struct kvm_vcpu *vcpu);
+
+void kvm_arch_async_page_ready(struct kvm_vcpu *vcpu,
+			       struct kvm_async_pf *work);
+
+void kvm_arch_async_page_not_present(struct kvm_vcpu *vcpu,
+				     struct kvm_async_pf *work);
+
+void kvm_arch_async_page_present(struct kvm_vcpu *vcpu,
+				 struct kvm_async_pf *work);
+
 extern int sie64a(struct kvm_s390_sie_block *, u64 *);
 extern char sie_exit;
 #endif
diff --git a/arch/s390/include/uapi/asm/kvm.h b/arch/s390/include/uapi/asm/kvm.h
index d25da59..b6c83e0 100644
--- a/arch/s390/include/uapi/asm/kvm.h
+++ b/arch/s390/include/uapi/asm/kvm.h
@@ -57,4 +57,14 @@ struct kvm_sync_regs {
 #define KVM_REG_S390_EPOCHDIFF	(KVM_REG_S390 | KVM_REG_SIZE_U64 | 0x2)
 #define KVM_REG_S390_CPU_TIMER  (KVM_REG_S390 | KVM_REG_SIZE_U64 | 0x3)
 #define KVM_REG_S390_CLOCK_COMP (KVM_REG_S390 | KVM_REG_SIZE_U64 | 0x4)
+
+/* ioctls used for setting/getting status of APF on s390x */
+#define KVM_S390_APF_ENABLE	1
+#define KVM_S390_APF_DISABLE	2
+#define KVM_S390_APF_STATUS	3
+#define KVM_S390_APF_DISABLED_NON_PENDING	0
+#define KVM_S390_APF_DISABLED_PENDING		1
+#define KVM_S390_APF_ENABLED_NON_PENDING	2
+#define KVM_S390_APF_ENABLED_PENDING		3
+
 #endif
diff --git a/arch/s390/kvm/Kconfig b/arch/s390/kvm/Kconfig
index 70b46ea..4993eed 100644
--- a/arch/s390/kvm/Kconfig
+++ b/arch/s390/kvm/Kconfig
@@ -23,6 +23,8 @@ config KVM
 	select ANON_INODES
 	select HAVE_KVM_CPU_RELAX_INTERCEPT
 	select HAVE_KVM_EVENTFD
+	select KVM_ASYNC_PF
+	select KVM_ASYNC_PF_DIRECT
 	---help---
 	  Support hosting paravirtualized guest machines using the SIE
 	  virtualization capability on the mainframe. This should work
diff --git a/arch/s390/kvm/Makefile b/arch/s390/kvm/Makefile
index 40b4c64..63bfc28 100644
--- a/arch/s390/kvm/Makefile
+++ b/arch/s390/kvm/Makefile
@@ -7,7 +7,7 @@
 # as published by the Free Software Foundation.
 
 KVM := ../../../virt/kvm
-common-objs = $(KVM)/kvm_main.o $(KVM)/eventfd.o
+common-objs = $(KVM)/kvm_main.o $(KVM)/eventfd.o $(KVM)/async_pf.o
 
 ccflags-y := -Ivirt/kvm -Iarch/s390/kvm
 
diff --git a/arch/s390/kvm/diag.c b/arch/s390/kvm/diag.c
index 3074475..3d210af 100644
--- a/arch/s390/kvm/diag.c
+++ b/arch/s390/kvm/diag.c
@@ -17,6 +17,7 @@
 #include "kvm-s390.h"
 #include "trace.h"
 #include "trace-s390.h"
+#include "gaccess.h"
 
 static int diag_release_pages(struct kvm_vcpu *vcpu)
 {
@@ -46,6 +47,66 @@ static int diag_release_pages(struct kvm_vcpu *vcpu)
 	return 0;
 }
 
+static int __diag_page_ref_service(struct kvm_vcpu *vcpu)
+{
+	struct prs_parm {
+		u16 code;
+		u16 subcode;
+		u16 parm_len;
+		u16 parm_version;
+		u64 token_addr;
+		u64 select_mask;
+		u64 compare_mask;
+		u64 zarch;
+	};
+	struct prs_parm parm;
+	int rc;
+	u16 rx = (vcpu->arch.sie_block->ipa & 0xf0) >> 4;
+	u16 ry = (vcpu->arch.sie_block->ipa & 0x0f);
+	if (copy_from_guest(vcpu, &parm, vcpu->run->s.regs.gprs[rx], sizeof(parm)))
+		return kvm_s390_inject_program_int(vcpu, PGM_ADDRESSING);
+
+	if (parm.parm_version != 2 || parm.parm_len < 0x5)
+		return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
+
+	switch (parm.subcode) {
+	case 0: /* TOKEN */
+		if ((parm.zarch >> 63) != 1 || parm.token_addr & 7 ||
+		    (parm.compare_mask & parm.select_mask) != parm.compare_mask)
+			return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
+
+		vcpu->arch.pfault_token = parm.token_addr;
+		vcpu->arch.pfault_select = parm.select_mask;
+		vcpu->arch.pfault_compare = parm.compare_mask;
+		vcpu->run->s.regs.gprs[ry] = 0;
+		rc = 0;
+		break;
+	case 1:
+		/*
+		 * CANCEL
+		 * The specification allows already pending tokens to survive
+		 * the cancel, so to reduce code complexity we treat all
+		 * outstanding tokens as already pending.
+		 */
+		if (vcpu->run->s.regs.gprs[rx] & 7)
+			return kvm_s390_inject_program_int(vcpu, PGM_ADDRESSING);
+
+		vcpu->run->s.regs.gprs[ry] = 0;
+
+		if (vcpu->arch.pfault_token == KVM_S390_PFAULT_TOKEN_INVALID)
+			vcpu->run->s.regs.gprs[ry] = 1;
+
+		vcpu->arch.pfault_token = KVM_S390_PFAULT_TOKEN_INVALID;
+		rc = 0;
+		break;
+	default:
+		rc = -EOPNOTSUPP;
+		break;
+	}
+
+	return rc;
+}
+
 static int __diag_time_slice_end(struct kvm_vcpu *vcpu)
 {
 	VCPU_EVENT(vcpu, 5, "%s", "diag time slice end");
@@ -143,6 +204,8 @@ int kvm_s390_handle_diag(struct kvm_vcpu *vcpu)
 		return __diag_time_slice_end(vcpu);
 	case 0x9c:
 		return __diag_time_slice_end_directed(vcpu);
+	case 0x258:
+		return __diag_page_ref_service(vcpu);
 	case 0x308:
 		return __diag_ipl_functions(vcpu);
 	case 0x500:
diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
index 7f35cb3..00e7feb 100644
--- a/arch/s390/kvm/interrupt.c
+++ b/arch/s390/kvm/interrupt.c
@@ -31,7 +31,7 @@ static int is_ioint(u64 type)
 	return ((type & 0xfffe0000u) != 0xfffe0000u);
 }
 
-static int psw_extint_disabled(struct kvm_vcpu *vcpu)
+int psw_extint_disabled(struct kvm_vcpu *vcpu)
 {
 	return !(vcpu->arch.sie_block->gpsw.mask & PSW_MASK_EXT);
 }
@@ -78,11 +78,8 @@ static int __interrupt_is_deliverable(struct kvm_vcpu *vcpu,
 			return 1;
 		return 0;
 	case KVM_S390_INT_SERVICE:
-		if (psw_extint_disabled(vcpu))
-			return 0;
-		if (vcpu->arch.sie_block->gcr[0] & 0x200ul)
-			return 1;
-		return 0;
+	case KVM_S390_INT_PFAULT_INIT:
+	case KVM_S390_INT_PFAULT_DONE:
 	case KVM_S390_INT_VIRTIO:
 		if (psw_extint_disabled(vcpu))
 			return 0;
@@ -150,6 +147,8 @@ static void __set_intercept_indicator(struct kvm_vcpu *vcpu,
 	case KVM_S390_INT_EXTERNAL_CALL:
 	case KVM_S390_INT_EMERGENCY:
 	case KVM_S390_INT_SERVICE:
+	case KVM_S390_INT_PFAULT_INIT:
+	case KVM_S390_INT_PFAULT_DONE:
 	case KVM_S390_INT_VIRTIO:
 		if (psw_extint_disabled(vcpu))
 			__set_cpuflag(vcpu, CPUSTAT_EXT_INT);
@@ -223,6 +222,26 @@ static void __do_deliver_interrupt(struct kvm_vcpu *vcpu,
 		rc |= put_guest(vcpu, inti->ext.ext_params,
 				(u32 __user *)__LC_EXT_PARAMS);
 		break;
+	case KVM_S390_INT_PFAULT_INIT:
+		rc  = put_guest(vcpu, 0x2603, (u16 __user *) __LC_EXT_INT_CODE);
+		rc |= put_guest(vcpu, 0x0600, (u16 __user *) __LC_EXT_CPU_ADDR);
+		rc |= copy_to_guest(vcpu, __LC_EXT_OLD_PSW,
+				    &vcpu->arch.sie_block->gpsw, sizeof(psw_t));
+		rc |= copy_from_guest(vcpu, &vcpu->arch.sie_block->gpsw,
+				      __LC_EXT_NEW_PSW, sizeof(psw_t));
+		rc |= put_guest(vcpu, inti->ext.ext_params2,
+				(u64 __user *) __LC_EXT_PARAMS2);
+		break;
+	case KVM_S390_INT_PFAULT_DONE:
+		rc  = put_guest(vcpu, 0x2603, (u16 __user *) __LC_EXT_INT_CODE);
+		rc |= put_guest(vcpu, 0x0680, (u16 __user *) __LC_EXT_CPU_ADDR);
+		rc |= copy_to_guest(vcpu, __LC_EXT_OLD_PSW,
+				    &vcpu->arch.sie_block->gpsw, sizeof(psw_t));
+		rc |= copy_from_guest(vcpu, &vcpu->arch.sie_block->gpsw,
+				      __LC_EXT_NEW_PSW, sizeof(psw_t));
+		rc |= put_guest(vcpu, inti->ext.ext_params2,
+				(u64 __user *) __LC_EXT_PARAMS2);
+		break;
 	case KVM_S390_INT_VIRTIO:
 		VCPU_EVENT(vcpu, 4, "interrupt: virtio parm:%x,parm64:%llx",
 			   inti->ext.ext_params, inti->ext.ext_params2);
@@ -357,7 +376,7 @@ static int __try_deliver_ckc_interrupt(struct kvm_vcpu *vcpu)
 	return 1;
 }
 
-static int kvm_cpu_has_interrupt(struct kvm_vcpu *vcpu)
+int kvm_cpu_has_interrupt(struct kvm_vcpu *vcpu)
 {
 	struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int;
 	struct kvm_s390_float_interrupt *fi = vcpu->arch.local_int.float_int;
@@ -681,6 +700,11 @@ int kvm_s390_inject_vm(struct kvm *kvm,
 		inti->type = s390int->type;
 		inti->ext.ext_params = s390int->parm;
 		break;
+	case KVM_S390_INT_PFAULT_INIT:
+	case KVM_S390_INT_PFAULT_DONE:
+		inti->type = s390int->type;
+		inti->ext.ext_params2 = s390int->parm64;
+		break;
 	case KVM_S390_PROGRAM_INT:
 	case KVM_S390_SIGP_STOP:
 	case KVM_S390_INT_EXTERNAL_CALL:
@@ -811,6 +835,11 @@ int kvm_s390_inject_vcpu(struct kvm_vcpu *vcpu,
 		inti->type = s390int->type;
 		inti->mchk.mcic = s390int->parm64;
 		break;
+	case KVM_S390_INT_PFAULT_INIT:
+	case KVM_S390_INT_PFAULT_DONE:
+		inti->type = s390int->type;
+		inti->ext.ext_params2 = s390int->parm64;
+		break;
 	case KVM_S390_INT_VIRTIO:
 	case KVM_S390_INT_SERVICE:
 	case KVM_S390_INT_IO_MIN...KVM_S390_INT_IO_MAX:
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 702daca..ef70296 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -145,6 +145,7 @@ int kvm_dev_ioctl_check_extension(long ext)
 #ifdef CONFIG_KVM_S390_UCONTROL
 	case KVM_CAP_S390_UCONTROL:
 #endif
+	case KVM_CAP_ASYNC_PF:
 	case KVM_CAP_SYNC_REGS:
 	case KVM_CAP_ONE_REG:
 	case KVM_CAP_ENABLE_CAP:
@@ -186,6 +187,33 @@ long kvm_arch_vm_ioctl(struct file *filp,
 	int r;
 
 	switch (ioctl) {
+	case KVM_S390_APF_ENABLE:
+		set_bit(1, &kvm->arch.gmap->pfault_enabled);
+		r = 0;
+		break;
+	case KVM_S390_APF_DISABLE:
+		clear_bit(1, &kvm->arch.gmap->pfault_enabled);
+		r = 0;
+		break;
+	case KVM_S390_APF_STATUS: {
+		bool pfaults_pending = false;
+		unsigned int i;
+		struct kvm_vcpu *vcpu;
+		r = 0;
+		if (test_bit(1, &kvm->arch.gmap->pfault_enabled))
+			r += 2;
+
+		kvm_for_each_vcpu(i, vcpu, kvm) {
+			spin_lock(&vcpu->async_pf.lock);
+			if (vcpu->async_pf.queued > 0)
+				pfaults_pending = true;
+			spin_unlock(&vcpu->async_pf.lock);
+		}
+
+		if (pfaults_pending)
+			r += 1;
+		break;
+	}
 	case KVM_S390_INTERRUPT: {
 		struct kvm_s390_interrupt s390int;
 
@@ -264,6 +292,7 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
 {
 	VCPU_EVENT(vcpu, 3, "%s", "free cpu");
 	trace_kvm_s390_destroy_vcpu(vcpu->vcpu_id);
+	kvm_clear_async_pf_completion_queue(vcpu);
 	if (!kvm_is_ucontrol(vcpu->kvm)) {
 		clear_bit(63 - vcpu->vcpu_id,
 			  (unsigned long *) &vcpu->kvm->arch.sca->mcn);
@@ -313,6 +342,9 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
 /* Section: vcpu related */
 int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
 {
+	vcpu->arch.pfault_token = KVM_S390_PFAULT_TOKEN_INVALID;
+	kvm_clear_async_pf_completion_queue(vcpu);
+	kvm_async_pf_wakeup_all(vcpu);
 	if (kvm_is_ucontrol(vcpu->kvm)) {
 		vcpu->arch.gmap = gmap_alloc(current->mm);
 		if (!vcpu->arch.gmap)
@@ -370,6 +402,7 @@ static void kvm_s390_vcpu_initial_reset(struct kvm_vcpu *vcpu)
 	vcpu->arch.guest_fpregs.fpc = 0;
 	asm volatile("lfpc %0" : : "Q" (vcpu->arch.guest_fpregs.fpc));
 	vcpu->arch.sie_block->gbea = 1;
+	vcpu->arch.pfault_token = KVM_S390_PFAULT_TOKEN_INVALID;
 	atomic_set_mask(CPUSTAT_STOPPED, &vcpu->arch.sie_block->cpuflags);
 }
 
@@ -691,10 +724,81 @@ static void kvm_arch_fault_in_sync(struct kvm_vcpu *vcpu)
 	up_read(&mm->mmap_sem);
 }
 
+static void __kvm_inject_pfault_token(struct kvm_vcpu *vcpu, bool start_token,
+				      unsigned long token)
+{
+	struct kvm_s390_interrupt inti;
+	inti.parm64 = token;
+
+	if (start_token) {
+		inti.type = KVM_S390_INT_PFAULT_INIT;
+		if (kvm_s390_inject_vcpu(vcpu, &inti))
+			WARN(1, "pfault interrupt injection failed");
+	} else {
+		inti.type = KVM_S390_INT_PFAULT_DONE;
+		if (kvm_s390_inject_vm(vcpu->kvm, &inti))
+			WARN(1, "pfault interrupt injection failed");
+	}
+}
+
+void kvm_arch_async_page_not_present(struct kvm_vcpu *vcpu,
+				     struct kvm_async_pf *work)
+{
+	__kvm_inject_pfault_token(vcpu, true, work->arch.pfault_token);
+}
+
+void kvm_arch_async_page_present(struct kvm_vcpu *vcpu,
+				 struct kvm_async_pf *work)
+{
+	__kvm_inject_pfault_token(vcpu, false, work->arch.pfault_token);
+}
+
+void kvm_arch_async_page_ready(struct kvm_vcpu *vcpu,
+			       struct kvm_async_pf *work)
+{
+	/* s390 will always inject the page directly */
+}
+
+bool kvm_arch_can_inject_async_page_present(struct kvm_vcpu *vcpu)
+{
+	/*
+	 * s390 will always inject the page directly,
+	 * but we still want kvm_check_async_pf_completion to clean up
+	 */
+	return true;
+}
+
+static int kvm_arch_setup_async_pf(struct kvm_vcpu *vcpu)
+{
+	hva_t hva = gmap_fault(current->thread.gmap_addr, vcpu->arch.gmap);
+	struct kvm_arch_async_pf arch;
+
+	if (vcpu->arch.pfault_token == KVM_S390_PFAULT_TOKEN_INVALID)
+		return 0;
+	if ((vcpu->arch.sie_block->gpsw.mask & vcpu->arch.pfault_select) !=
+	    vcpu->arch.pfault_compare)
+		return 0;
+	if (psw_extint_disabled(vcpu))
+		return 0;
+	if (kvm_cpu_has_interrupt(vcpu))
+		return 0;
+	if (!(vcpu->arch.sie_block->gcr[0] & 0x200ul))
+		return 0;
+
+	if (copy_from_guest(vcpu, &arch.pfault_token, vcpu->arch.pfault_token, 8)) {
+		/* already in an error case, inject the interrupt and return 0 */
+		int ign = kvm_s390_inject_program_int(vcpu, PGM_ADDRESSING);
+		return ign - ign;
+	}
+	return kvm_setup_async_pf(vcpu, current->thread.gmap_addr, hva, &arch);
+}
+
 static int __vcpu_run(struct kvm_vcpu *vcpu)
 {
 	int rc;
 
+	kvm_check_async_pf_completion(vcpu);
+
 	memcpy(&vcpu->arch.sie_block->gg14, &vcpu->run->s.regs.gprs[14], 16);
 
 	if (need_resched())
@@ -725,7 +829,8 @@ static int __vcpu_run(struct kvm_vcpu *vcpu)
 		if (kvm_is_ucontrol(vcpu->kvm)) {
 			rc = SIE_INTERCEPT_UCONTROL;
 		} else if (current->thread.gmap_pfault) {
-			kvm_arch_fault_in_sync(vcpu);
+			if (!kvm_arch_setup_async_pf(vcpu))
+				kvm_arch_fault_in_sync(vcpu);
 			current->thread.gmap_pfault = 0;
 			rc = 0;
 		} else {
diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h
index 028ca9f..d0f4d2a 100644
--- a/arch/s390/kvm/kvm-s390.h
+++ b/arch/s390/kvm/kvm-s390.h
@@ -148,4 +148,8 @@ void exit_sie_sync(struct kvm_vcpu *vcpu);
 /* implemented in diag.c */
 int kvm_s390_handle_diag(struct kvm_vcpu *vcpu);
 
+/* implemented in interrupt.c */
+int kvm_cpu_has_interrupt(struct kvm_vcpu *vcpu);
+int psw_extint_disabled(struct kvm_vcpu *vcpu);
+
 #endif
diff --git a/arch/s390/kvm/sigp.c b/arch/s390/kvm/sigp.c
index bec398c..a6a0f02 100644
--- a/arch/s390/kvm/sigp.c
+++ b/arch/s390/kvm/sigp.c
@@ -186,6 +186,12 @@ int kvm_s390_inject_sigp_stop(struct kvm_vcpu *vcpu, int action)
 static int __sigp_set_arch(struct kvm_vcpu *vcpu, u32 parameter)
 {
 	int rc;
+	unsigned int i;
+	struct kvm_vcpu *vcpu_to_set;
+
+	kvm_for_each_vcpu(i, vcpu_to_set, vcpu->kvm) {
+		vcpu_to_set->arch.pfault_token = KVM_S390_PFAULT_TOKEN_INVALID;
+	}
 
 	switch (parameter & 0xff) {
 	case 0:
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index acccd08..fae432c 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -413,6 +413,8 @@ struct kvm_s390_psw {
 #define KVM_S390_PROGRAM_INT		0xfffe0001u
 #define KVM_S390_SIGP_SET_PREFIX	0xfffe0002u
 #define KVM_S390_RESTART		0xfffe0003u
+#define KVM_S390_INT_PFAULT_INIT	0xfffe0004u
+#define KVM_S390_INT_PFAULT_DONE	0xfffe0005u
 #define KVM_S390_MCHK			0xfffe1000u
 #define KVM_S390_INT_VIRTIO		0xffff2603u
 #define KVM_S390_INT_SERVICE		0xffff2401u
-- 
1.8.2.2


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 4/4] PF: Async page fault support on s390
@ 2013-07-10 12:59   ` Dominik Dingel
  0 siblings, 0 replies; 20+ messages in thread
From: Dominik Dingel @ 2013-07-10 12:59 UTC (permalink / raw)
  To: Gleb Natapov, Paolo Bonzini
  Cc: Christian Borntraeger, Heiko Carstens, Martin Schwidefsky,
	Cornelia Huck, Xiantao Zhang, Alexander Graf, Christoffer Dall,
	Marc Zyngier, Ralf Baechle, kvm, linux-s390, linux-mm,
	linux-kernel, Dominik Dingel

This patch enables async page faults for s390 kvm guests.
It provides the userspace API to enable, disable or get the status of this
feature. Also it includes the diagnose code, called by the guest to enable
async page faults.

The async page faults will use an already existing guest interface for this
purpose, as described in "CP Programming Services (SC24-6084)".

Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
---
 Documentation/s390/kvm.txt       |  24 +++++++++
 arch/s390/include/asm/kvm_host.h |  22 ++++++++
 arch/s390/include/uapi/asm/kvm.h |  10 ++++
 arch/s390/kvm/Kconfig            |   2 +
 arch/s390/kvm/Makefile           |   2 +-
 arch/s390/kvm/diag.c             |  63 +++++++++++++++++++++++
 arch/s390/kvm/interrupt.c        |  43 +++++++++++++---
 arch/s390/kvm/kvm-s390.c         | 107 ++++++++++++++++++++++++++++++++++++++-
 arch/s390/kvm/kvm-s390.h         |   4 ++
 arch/s390/kvm/sigp.c             |   6 +++
 include/uapi/linux/kvm.h         |   2 +
 11 files changed, 276 insertions(+), 9 deletions(-)

diff --git a/Documentation/s390/kvm.txt b/Documentation/s390/kvm.txt
index 85f3280..707b7e9 100644
--- a/Documentation/s390/kvm.txt
+++ b/Documentation/s390/kvm.txt
@@ -70,6 +70,30 @@ floating interrupts are:
 KVM_S390_INT_VIRTIO
 KVM_S390_INT_SERVICE
 
+ioctl:      KVM_S390_APF_ENABLE:
+args:       none
+This ioctl is used to enable the async page fault interface. So in a
+host page fault case the host can now submit pfault tokens to the guest.
+
+ioctl:      KVM_S390_APF_DISABLE:
+args:       none
+This ioctl is used to disable the async page fault interface. From this point
+on no new pfault tokens will be issued to the guest. Already existing async
+page faults are not covered by this and will be normally handled.
+
+ioctl:      KVM_S390_APF_STATUS:
+args:       none
+This ioctl allows the userspace to get the current status of the APF feature.
+The main purpose for this, is to ensure that no pfault tokens will be lost
+during live migration or similar management operations.
+The possible return values are:
+KVM_S390_APF_DISABLED_NON_PENDING
+KVM_S390_APF_DISABLED_PENDING
+KVM_S390_APF_ENABLED_NON_PENDING
+KVM_S390_APF_ENABLED_PENDING
+Caution: if KVM_S390_APF is enabled the PENDING status could be already changed
+as soon as the ioctl returns to userspace.
+
 3. ioctl calls to the kvm-vcpu file descriptor
 KVM does support the following ioctls on s390 that are common with other
 architectures and do behave the same:
diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index cd30c3d..e8012fc 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -257,6 +257,10 @@ struct kvm_vcpu_arch {
 		u64		stidp_data;
 	};
 	struct gmap *gmap;
+#define KVM_S390_PFAULT_TOKEN_INVALID	(-1UL)
+	unsigned long pfault_token;
+	unsigned long pfault_select;
+	unsigned long pfault_compare;
 };
 
 struct kvm_vm_stat {
@@ -282,6 +286,24 @@ static inline bool kvm_is_error_hva(unsigned long addr)
 	return addr == KVM_HVA_ERR_BAD;
 }
 
+#define ASYNC_PF_PER_VCPU	64
+struct kvm_vcpu;
+struct kvm_async_pf;
+struct kvm_arch_async_pf {
+	unsigned long pfault_token;
+};
+
+bool kvm_arch_can_inject_async_page_present(struct kvm_vcpu *vcpu);
+
+void kvm_arch_async_page_ready(struct kvm_vcpu *vcpu,
+			       struct kvm_async_pf *work);
+
+void kvm_arch_async_page_not_present(struct kvm_vcpu *vcpu,
+				     struct kvm_async_pf *work);
+
+void kvm_arch_async_page_present(struct kvm_vcpu *vcpu,
+				 struct kvm_async_pf *work);
+
 extern int sie64a(struct kvm_s390_sie_block *, u64 *);
 extern char sie_exit;
 #endif
diff --git a/arch/s390/include/uapi/asm/kvm.h b/arch/s390/include/uapi/asm/kvm.h
index d25da59..b6c83e0 100644
--- a/arch/s390/include/uapi/asm/kvm.h
+++ b/arch/s390/include/uapi/asm/kvm.h
@@ -57,4 +57,14 @@ struct kvm_sync_regs {
 #define KVM_REG_S390_EPOCHDIFF	(KVM_REG_S390 | KVM_REG_SIZE_U64 | 0x2)
 #define KVM_REG_S390_CPU_TIMER  (KVM_REG_S390 | KVM_REG_SIZE_U64 | 0x3)
 #define KVM_REG_S390_CLOCK_COMP (KVM_REG_S390 | KVM_REG_SIZE_U64 | 0x4)
+
+/* ioctls used for setting/getting status of APF on s390x */
+#define KVM_S390_APF_ENABLE	1
+#define KVM_S390_APF_DISABLE	2
+#define KVM_S390_APF_STATUS	3
+#define KVM_S390_APF_DISABLED_NON_PENDING	0
+#define KVM_S390_APF_DISABLED_PENDING		1
+#define KVM_S390_APF_ENABLED_NON_PENDING	2
+#define KVM_S390_APF_ENABLED_PENDING		3
+
 #endif
diff --git a/arch/s390/kvm/Kconfig b/arch/s390/kvm/Kconfig
index 70b46ea..4993eed 100644
--- a/arch/s390/kvm/Kconfig
+++ b/arch/s390/kvm/Kconfig
@@ -23,6 +23,8 @@ config KVM
 	select ANON_INODES
 	select HAVE_KVM_CPU_RELAX_INTERCEPT
 	select HAVE_KVM_EVENTFD
+	select KVM_ASYNC_PF
+	select KVM_ASYNC_PF_DIRECT
 	---help---
 	  Support hosting paravirtualized guest machines using the SIE
 	  virtualization capability on the mainframe. This should work
diff --git a/arch/s390/kvm/Makefile b/arch/s390/kvm/Makefile
index 40b4c64..63bfc28 100644
--- a/arch/s390/kvm/Makefile
+++ b/arch/s390/kvm/Makefile
@@ -7,7 +7,7 @@
 # as published by the Free Software Foundation.
 
 KVM := ../../../virt/kvm
-common-objs = $(KVM)/kvm_main.o $(KVM)/eventfd.o
+common-objs = $(KVM)/kvm_main.o $(KVM)/eventfd.o $(KVM)/async_pf.o
 
 ccflags-y := -Ivirt/kvm -Iarch/s390/kvm
 
diff --git a/arch/s390/kvm/diag.c b/arch/s390/kvm/diag.c
index 3074475..3d210af 100644
--- a/arch/s390/kvm/diag.c
+++ b/arch/s390/kvm/diag.c
@@ -17,6 +17,7 @@
 #include "kvm-s390.h"
 #include "trace.h"
 #include "trace-s390.h"
+#include "gaccess.h"
 
 static int diag_release_pages(struct kvm_vcpu *vcpu)
 {
@@ -46,6 +47,66 @@ static int diag_release_pages(struct kvm_vcpu *vcpu)
 	return 0;
 }
 
+static int __diag_page_ref_service(struct kvm_vcpu *vcpu)
+{
+	struct prs_parm {
+		u16 code;
+		u16 subcode;
+		u16 parm_len;
+		u16 parm_version;
+		u64 token_addr;
+		u64 select_mask;
+		u64 compare_mask;
+		u64 zarch;
+	};
+	struct prs_parm parm;
+	int rc;
+	u16 rx = (vcpu->arch.sie_block->ipa & 0xf0) >> 4;
+	u16 ry = (vcpu->arch.sie_block->ipa & 0x0f);
+	if (copy_from_guest(vcpu, &parm, vcpu->run->s.regs.gprs[rx], sizeof(parm)))
+		return kvm_s390_inject_program_int(vcpu, PGM_ADDRESSING);
+
+	if (parm.parm_version != 2 || parm.parm_len < 0x5)
+		return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
+
+	switch (parm.subcode) {
+	case 0: /* TOKEN */
+		if ((parm.zarch >> 63) != 1 || parm.token_addr & 7 ||
+		    (parm.compare_mask & parm.select_mask) != parm.compare_mask)
+			return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
+
+		vcpu->arch.pfault_token = parm.token_addr;
+		vcpu->arch.pfault_select = parm.select_mask;
+		vcpu->arch.pfault_compare = parm.compare_mask;
+		vcpu->run->s.regs.gprs[ry] = 0;
+		rc = 0;
+		break;
+	case 1: 
+		/* 
+		 * CANCEL 
+		 * Specification allows to let already pending tokens survive
+		 * the cancel, therefore to reduce code complexity, we assume, all
+		 * outstanding tokens as already pending.
+		 */
+		if (vcpu->run->s.regs.gprs[rx] & 7)
+			return kvm_s390_inject_program_int(vcpu, PGM_ADDRESSING);
+
+		vcpu->run->s.regs.gprs[ry] = 0;
+
+		if (vcpu->arch.pfault_token == KVM_S390_PFAULT_TOKEN_INVALID)
+			vcpu->run->s.regs.gprs[ry] = 1;
+
+		vcpu->arch.pfault_token = KVM_S390_PFAULT_TOKEN_INVALID;
+		rc = 0;
+		break;
+	default:
+		rc = -EOPNOTSUPP;
+		break;
+	}
+
+	return rc;
+}
+
 static int __diag_time_slice_end(struct kvm_vcpu *vcpu)
 {
 	VCPU_EVENT(vcpu, 5, "%s", "diag time slice end");
@@ -143,6 +204,8 @@ int kvm_s390_handle_diag(struct kvm_vcpu *vcpu)
 		return __diag_time_slice_end(vcpu);
 	case 0x9c:
 		return __diag_time_slice_end_directed(vcpu);
+	case 0x258:
+		return __diag_page_ref_service(vcpu);
 	case 0x308:
 		return __diag_ipl_functions(vcpu);
 	case 0x500:
diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
index 7f35cb3..00e7feb 100644
--- a/arch/s390/kvm/interrupt.c
+++ b/arch/s390/kvm/interrupt.c
@@ -31,7 +31,7 @@ static int is_ioint(u64 type)
 	return ((type & 0xfffe0000u) != 0xfffe0000u);
 }
 
-static int psw_extint_disabled(struct kvm_vcpu *vcpu)
+int psw_extint_disabled(struct kvm_vcpu *vcpu)
 {
 	return !(vcpu->arch.sie_block->gpsw.mask & PSW_MASK_EXT);
 }
@@ -78,11 +78,8 @@ static int __interrupt_is_deliverable(struct kvm_vcpu *vcpu,
 			return 1;
 		return 0;
 	case KVM_S390_INT_SERVICE:
-		if (psw_extint_disabled(vcpu))
-			return 0;
-		if (vcpu->arch.sie_block->gcr[0] & 0x200ul)
-			return 1;
-		return 0;
+	case KVM_S390_INT_PFAULT_INIT:
+	case KVM_S390_INT_PFAULT_DONE:
 	case KVM_S390_INT_VIRTIO:
 		if (psw_extint_disabled(vcpu))
 			return 0;
@@ -150,6 +147,8 @@ static void __set_intercept_indicator(struct kvm_vcpu *vcpu,
 	case KVM_S390_INT_EXTERNAL_CALL:
 	case KVM_S390_INT_EMERGENCY:
 	case KVM_S390_INT_SERVICE:
+	case KVM_S390_INT_PFAULT_INIT:
+	case KVM_S390_INT_PFAULT_DONE:
 	case KVM_S390_INT_VIRTIO:
 		if (psw_extint_disabled(vcpu))
 			__set_cpuflag(vcpu, CPUSTAT_EXT_INT);
@@ -223,6 +222,26 @@ static void __do_deliver_interrupt(struct kvm_vcpu *vcpu,
 		rc |= put_guest(vcpu, inti->ext.ext_params,
 				(u32 __user *)__LC_EXT_PARAMS);
 		break;
+	case KVM_S390_INT_PFAULT_INIT:
+		rc  = put_guest(vcpu, 0x2603, (u16 __user *) __LC_EXT_INT_CODE);
+		rc |= put_guest(vcpu, 0x0600, (u16 __user *) __LC_EXT_CPU_ADDR);
+		rc |= copy_to_guest(vcpu, __LC_EXT_OLD_PSW,
+				    &vcpu->arch.sie_block->gpsw, sizeof(psw_t));
+		rc |= copy_from_guest(vcpu, &vcpu->arch.sie_block->gpsw,
+				      __LC_EXT_NEW_PSW, sizeof(psw_t));
+		rc |= put_guest(vcpu, inti->ext.ext_params2,
+				(u64 __user *) __LC_EXT_PARAMS2);
+		break;
+	case KVM_S390_INT_PFAULT_DONE:
+		rc  = put_guest(vcpu, 0x2603, (u16 __user *) __LC_EXT_INT_CODE);
+		rc |= put_guest(vcpu, 0x0680, (u16 __user *) __LC_EXT_CPU_ADDR);
+		rc |= copy_to_guest(vcpu, __LC_EXT_OLD_PSW,
+				    &vcpu->arch.sie_block->gpsw, sizeof(psw_t));
+		rc |= copy_from_guest(vcpu, &vcpu->arch.sie_block->gpsw,
+				      __LC_EXT_NEW_PSW, sizeof(psw_t));
+		rc |= put_guest(vcpu, inti->ext.ext_params2,
+				(u64 __user *) __LC_EXT_PARAMS2);
+		break;
 	case KVM_S390_INT_VIRTIO:
 		VCPU_EVENT(vcpu, 4, "interrupt: virtio parm:%x,parm64:%llx",
 			   inti->ext.ext_params, inti->ext.ext_params2);
@@ -357,7 +376,7 @@ static int __try_deliver_ckc_interrupt(struct kvm_vcpu *vcpu)
 	return 1;
 }
 
-static int kvm_cpu_has_interrupt(struct kvm_vcpu *vcpu)
+int kvm_cpu_has_interrupt(struct kvm_vcpu *vcpu)
 {
 	struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int;
 	struct kvm_s390_float_interrupt *fi = vcpu->arch.local_int.float_int;
@@ -681,6 +700,11 @@ int kvm_s390_inject_vm(struct kvm *kvm,
 		inti->type = s390int->type;
 		inti->ext.ext_params = s390int->parm;
 		break;
+	case KVM_S390_INT_PFAULT_INIT:
+	case KVM_S390_INT_PFAULT_DONE:
+		inti->type = s390int->type;
+		inti->ext.ext_params2 = s390int->parm64;
+		break;
 	case KVM_S390_PROGRAM_INT:
 	case KVM_S390_SIGP_STOP:
 	case KVM_S390_INT_EXTERNAL_CALL:
@@ -811,6 +835,11 @@ int kvm_s390_inject_vcpu(struct kvm_vcpu *vcpu,
 		inti->type = s390int->type;
 		inti->mchk.mcic = s390int->parm64;
 		break;
+	case KVM_S390_INT_PFAULT_INIT:
+	case KVM_S390_INT_PFAULT_DONE:
+		inti->type = s390int->type;
+		inti->ext.ext_params2 = s390int->parm64;
+		break;
 	case KVM_S390_INT_VIRTIO:
 	case KVM_S390_INT_SERVICE:
 	case KVM_S390_INT_IO_MIN...KVM_S390_INT_IO_MAX:
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 702daca..ef70296 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -145,6 +145,7 @@ int kvm_dev_ioctl_check_extension(long ext)
 #ifdef CONFIG_KVM_S390_UCONTROL
 	case KVM_CAP_S390_UCONTROL:
 #endif
+	case KVM_CAP_ASYNC_PF:
 	case KVM_CAP_SYNC_REGS:
 	case KVM_CAP_ONE_REG:
 	case KVM_CAP_ENABLE_CAP:
@@ -186,6 +187,33 @@ long kvm_arch_vm_ioctl(struct file *filp,
 	int r;
 
 	switch (ioctl) {
+	case KVM_S390_APF_ENABLE:
+		set_bit(1, &kvm->arch.gmap->pfault_enabled);
+		r = 0;
+		break;
+	case KVM_S390_APF_DISABLE:
+		clear_bit(1, &kvm->arch.gmap->pfault_enabled);
+		r = 0;
+		break;
+	case KVM_S390_APF_STATUS: {
+		bool pfaults_pending = false;
+		unsigned int i;
+		struct kvm_vcpu *vcpu;
+		r = 0;
+		if (test_bit(1, &kvm->arch.gmap->pfault_enabled))
+			r += 2;
+
+		kvm_for_each_vcpu(i, vcpu, kvm) {
+			spin_lock(&vcpu->async_pf.lock);
+			if (vcpu->async_pf.queued > 0)
+				pfaults_pending = true;
+			spin_unlock(&vcpu->async_pf.lock);
+		}
+
+		if (pfaults_pending)
+			r += 1;
+		break;
+	}
 	case KVM_S390_INTERRUPT: {
 		struct kvm_s390_interrupt s390int;
 
@@ -264,6 +292,7 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
 {
 	VCPU_EVENT(vcpu, 3, "%s", "free cpu");
 	trace_kvm_s390_destroy_vcpu(vcpu->vcpu_id);
+	kvm_clear_async_pf_completion_queue(vcpu);
 	if (!kvm_is_ucontrol(vcpu->kvm)) {
 		clear_bit(63 - vcpu->vcpu_id,
 			  (unsigned long *) &vcpu->kvm->arch.sca->mcn);
@@ -313,6 +342,9 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
 /* Section: vcpu related */
 int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
 {
+	vcpu->arch.pfault_token = KVM_S390_PFAULT_TOKEN_INVALID;
+	kvm_clear_async_pf_completion_queue(vcpu);
+	kvm_async_pf_wakeup_all(vcpu);
 	if (kvm_is_ucontrol(vcpu->kvm)) {
 		vcpu->arch.gmap = gmap_alloc(current->mm);
 		if (!vcpu->arch.gmap)
@@ -370,6 +402,7 @@ static void kvm_s390_vcpu_initial_reset(struct kvm_vcpu *vcpu)
 	vcpu->arch.guest_fpregs.fpc = 0;
 	asm volatile("lfpc %0" : : "Q" (vcpu->arch.guest_fpregs.fpc));
 	vcpu->arch.sie_block->gbea = 1;
+	vcpu->arch.pfault_token = KVM_S390_PFAULT_TOKEN_INVALID;
 	atomic_set_mask(CPUSTAT_STOPPED, &vcpu->arch.sie_block->cpuflags);
 }
 
@@ -691,10 +724,81 @@ static void kvm_arch_fault_in_sync(struct kvm_vcpu *vcpu)
 	up_read(&mm->mmap_sem);
 }
 
+static void __kvm_inject_pfault_token(struct kvm_vcpu *vcpu, bool start_token,
+				      unsigned long token)
+{
+	struct kvm_s390_interrupt inti;
+	inti.parm64 = token;
+
+	if (start_token) {
+		inti.type = KVM_S390_INT_PFAULT_INIT;
+		if (kvm_s390_inject_vcpu(vcpu, &inti))
+			WARN(1, "pfault interrupt injection failed");
+	} else {
+		inti.type = KVM_S390_INT_PFAULT_DONE;
+		if (kvm_s390_inject_vm(vcpu->kvm, &inti))
+			WARN(1, "pfault interrupt injection failed");
+	}
+}
+
+void kvm_arch_async_page_not_present(struct kvm_vcpu *vcpu,
+				     struct kvm_async_pf *work)
+{
+	__kvm_inject_pfault_token(vcpu, true, work->arch.pfault_token);
+}
+
+void kvm_arch_async_page_present(struct kvm_vcpu *vcpu,
+				 struct kvm_async_pf *work)
+{
+	__kvm_inject_pfault_token(vcpu, false, work->arch.pfault_token);
+}
+
+void kvm_arch_async_page_ready(struct kvm_vcpu *vcpu,
+			       struct kvm_async_pf *work)
+{
+	/* s390 will always inject the page directly */
+}
+
+bool kvm_arch_can_inject_async_page_present(struct kvm_vcpu *vcpu)
+{
+	/*
+	 * s390 always injects the page directly, but we still want
+	 * kvm_check_async_pf_completion() to clean up the completed work items
+	 */
+	return true;
+}
+
+static int kvm_arch_setup_async_pf(struct kvm_vcpu *vcpu)
+{
+	hva_t hva = gmap_fault(current->thread.gmap_addr, vcpu->arch.gmap);
+	struct kvm_arch_async_pf arch;
+
+	if (vcpu->arch.pfault_token == KVM_S390_PFAULT_TOKEN_INVALID)
+		return 0;
+	if ((vcpu->arch.sie_block->gpsw.mask & vcpu->arch.pfault_select) !=
+	    vcpu->arch.pfault_compare)
+		return 0;
+	if (psw_extint_disabled(vcpu))
+		return 0;
+	if (kvm_cpu_has_interrupt(vcpu))
+		return 0;
+	if (!(vcpu->arch.sie_block->gcr[0] & 0x200ul))
+		return 0;
+
+	if (copy_from_guest(vcpu, &arch.pfault_token, vcpu->arch.pfault_token, 8)) {
+		/* already in the error case, inject the interrupt and return 0 */
+		int ign = kvm_s390_inject_program_int(vcpu, PGM_ADDRESSING);
+		return ign - ign;	/* consumes ign and always evaluates to 0 */
+	}
+	return kvm_setup_async_pf(vcpu, current->thread.gmap_addr, hva, &arch);
+}
+
 static int __vcpu_run(struct kvm_vcpu *vcpu)
 {
 	int rc;
 
+	kvm_check_async_pf_completion(vcpu);
+
 	memcpy(&vcpu->arch.sie_block->gg14, &vcpu->run->s.regs.gprs[14], 16);
 
 	if (need_resched())
@@ -725,7 +829,8 @@ static int __vcpu_run(struct kvm_vcpu *vcpu)
 		if (kvm_is_ucontrol(vcpu->kvm)) {
 			rc = SIE_INTERCEPT_UCONTROL;
 		} else if (current->thread.gmap_pfault) {
-			kvm_arch_fault_in_sync(vcpu);
+			if (!kvm_arch_setup_async_pf(vcpu))
+				kvm_arch_fault_in_sync(vcpu);
 			current->thread.gmap_pfault = 0;
 			rc = 0;
 		} else {
diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h
index 028ca9f..d0f4d2a 100644
--- a/arch/s390/kvm/kvm-s390.h
+++ b/arch/s390/kvm/kvm-s390.h
@@ -148,4 +148,8 @@ void exit_sie_sync(struct kvm_vcpu *vcpu);
 /* implemented in diag.c */
 int kvm_s390_handle_diag(struct kvm_vcpu *vcpu);
 
+/* implemented in interrupt.c */
+int kvm_cpu_has_interrupt(struct kvm_vcpu *vcpu);
+int psw_extint_disabled(struct kvm_vcpu *vcpu);
+
 #endif
diff --git a/arch/s390/kvm/sigp.c b/arch/s390/kvm/sigp.c
index bec398c..a6a0f02 100644
--- a/arch/s390/kvm/sigp.c
+++ b/arch/s390/kvm/sigp.c
@@ -186,6 +186,12 @@ int kvm_s390_inject_sigp_stop(struct kvm_vcpu *vcpu, int action)
 static int __sigp_set_arch(struct kvm_vcpu *vcpu, u32 parameter)
 {
 	int rc;
+	unsigned int i;
+	struct kvm_vcpu *vcpu_to_set;
+
+	kvm_for_each_vcpu(i, vcpu_to_set, vcpu->kvm) {
+		vcpu_to_set->arch.pfault_token = KVM_S390_PFAULT_TOKEN_INVALID;
+	}
 
 	switch (parameter & 0xff) {
 	case 0:
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index acccd08..fae432c 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -413,6 +413,8 @@ struct kvm_s390_psw {
 #define KVM_S390_PROGRAM_INT		0xfffe0001u
 #define KVM_S390_SIGP_SET_PREFIX	0xfffe0002u
 #define KVM_S390_RESTART		0xfffe0003u
+#define KVM_S390_INT_PFAULT_INIT	0xfffe0004u
+#define KVM_S390_INT_PFAULT_DONE	0xfffe0005u
 #define KVM_S390_MCHK			0xfffe1000u
 #define KVM_S390_INT_VIRTIO		0xffff2603u
 #define KVM_S390_INT_SERVICE		0xffff2401u
-- 
1.8.2.2



* Re: [PATCH 4/4] PF: Async page fault support on s390
  2013-07-10 12:59   ` Dominik Dingel
@ 2013-07-11  9:04     ` Gleb Natapov
  -1 siblings, 0 replies; 20+ messages in thread
From: Gleb Natapov @ 2013-07-11  9:04 UTC (permalink / raw)
  To: Dominik Dingel
  Cc: Paolo Bonzini, Christian Borntraeger, Heiko Carstens,
	Martin Schwidefsky, Cornelia Huck, Xiantao Zhang, Alexander Graf,
	Christoffer Dall, Marc Zyngier, Ralf Baechle, kvm, linux-s390,
	linux-mm, linux-kernel

On Wed, Jul 10, 2013 at 02:59:55PM +0200, Dominik Dingel wrote:
> This patch enables async page faults for s390 kvm guests.
> It provides the userspace API to enable, disable or get the status of this
> feature. Also it includes the diagnose code, called by the guest to enable
> async page faults.
> 
> The async page faults will use an already existing guest interface for this
> purpose, as described in "CP Programming Services (SC24-6084)".
> 
> Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Christian, looks good now?

> ---
>  Documentation/s390/kvm.txt       |  24 +++++++++
>  arch/s390/include/asm/kvm_host.h |  22 ++++++++
>  arch/s390/include/uapi/asm/kvm.h |  10 ++++
>  arch/s390/kvm/Kconfig            |   2 +
>  arch/s390/kvm/Makefile           |   2 +-
>  arch/s390/kvm/diag.c             |  63 +++++++++++++++++++++++
>  arch/s390/kvm/interrupt.c        |  43 +++++++++++++---
>  arch/s390/kvm/kvm-s390.c         | 107 ++++++++++++++++++++++++++++++++++++++-
>  arch/s390/kvm/kvm-s390.h         |   4 ++
>  arch/s390/kvm/sigp.c             |   6 +++
>  include/uapi/linux/kvm.h         |   2 +
>  11 files changed, 276 insertions(+), 9 deletions(-)
> 
> diff --git a/Documentation/s390/kvm.txt b/Documentation/s390/kvm.txt
> index 85f3280..707b7e9 100644
> --- a/Documentation/s390/kvm.txt
> +++ b/Documentation/s390/kvm.txt
> @@ -70,6 +70,30 @@ floating interrupts are:
>  KVM_S390_INT_VIRTIO
>  KVM_S390_INT_SERVICE
>  
> +ioctl:      KVM_S390_APF_ENABLE:
> +args:       none
> +This ioctl enables the async page fault interface. Once enabled, the host
> +may submit pfault tokens to the guest when it hits a host page fault.
> +
> +ioctl:      KVM_S390_APF_DISABLE:
> +args:       none
> +This ioctl disables the async page fault interface. From this point on, no
> +new pfault tokens will be issued to the guest. Async page faults that are
> +already in flight are not affected and will be handled normally.
> +
> +ioctl:      KVM_S390_APF_STATUS:
> +args:       none
> +This ioctl allows userspace to query the current status of the APF feature.
> +Its main purpose is to ensure that no pfault tokens are lost during live
> +migration or similar management operations.
> +The possible return values are:
> +KVM_S390_APF_DISABLED_NON_PENDING
> +KVM_S390_APF_DISABLED_PENDING
> +KVM_S390_APF_ENABLED_NON_PENDING
> +KVM_S390_APF_ENABLED_PENDING
> +Caution: if KVM_S390_APF is enabled, the PENDING status may already have
> +changed by the time the ioctl returns to userspace.
> +
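The four status values above encode two independent bits — bit 1 for enabled, bit 0 for pending — which is exactly how the in-kernel handler builds its return value. A minimal sketch of how userspace might decode it (the helper names are hypothetical; the actual ioctl call is elided):

```c
#include <assert.h>
#include <stdbool.h>

/* Return values of KVM_S390_APF_STATUS, as defined in this patch. */
#define KVM_S390_APF_DISABLED_NON_PENDING	0
#define KVM_S390_APF_DISABLED_PENDING		1
#define KVM_S390_APF_ENABLED_NON_PENDING	2
#define KVM_S390_APF_ENABLED_PENDING		3

/* Bit 1: set when pfault_enabled is set for the guest address space. */
static bool apf_enabled(int status)
{
	return status & 2;
}

/* Bit 0: set while any vcpu still has queued async page faults. */
static bool apf_pending(int status)
{
	return status & 1;
}
```

Management software preparing a live migration would issue KVM_S390_APF_DISABLE and then poll the status ioctl until apf_pending() reports false, so that no token is lost in flight.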
>  3. ioctl calls to the kvm-vcpu file descriptor
>  KVM does support the following ioctls on s390 that are common with other
>  architectures and do behave the same:
> diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
> index cd30c3d..e8012fc 100644
> --- a/arch/s390/include/asm/kvm_host.h
> +++ b/arch/s390/include/asm/kvm_host.h
> @@ -257,6 +257,10 @@ struct kvm_vcpu_arch {
>  		u64		stidp_data;
>  	};
>  	struct gmap *gmap;
> +#define KVM_S390_PFAULT_TOKEN_INVALID	(-1UL)
> +	unsigned long pfault_token;
> +	unsigned long pfault_select;
> +	unsigned long pfault_compare;
>  };
>  
>  struct kvm_vm_stat {
> @@ -282,6 +286,24 @@ static inline bool kvm_is_error_hva(unsigned long addr)
>  	return addr == KVM_HVA_ERR_BAD;
>  }
>  
> +#define ASYNC_PF_PER_VCPU	64
> +struct kvm_vcpu;
> +struct kvm_async_pf;
> +struct kvm_arch_async_pf {
> +	unsigned long pfault_token;
> +};
> +
> +bool kvm_arch_can_inject_async_page_present(struct kvm_vcpu *vcpu);
> +
> +void kvm_arch_async_page_ready(struct kvm_vcpu *vcpu,
> +			       struct kvm_async_pf *work);
> +
> +void kvm_arch_async_page_not_present(struct kvm_vcpu *vcpu,
> +				     struct kvm_async_pf *work);
> +
> +void kvm_arch_async_page_present(struct kvm_vcpu *vcpu,
> +				 struct kvm_async_pf *work);
> +
>  extern int sie64a(struct kvm_s390_sie_block *, u64 *);
>  extern char sie_exit;
>  #endif
> diff --git a/arch/s390/include/uapi/asm/kvm.h b/arch/s390/include/uapi/asm/kvm.h
> index d25da59..b6c83e0 100644
> --- a/arch/s390/include/uapi/asm/kvm.h
> +++ b/arch/s390/include/uapi/asm/kvm.h
> @@ -57,4 +57,14 @@ struct kvm_sync_regs {
>  #define KVM_REG_S390_EPOCHDIFF	(KVM_REG_S390 | KVM_REG_SIZE_U64 | 0x2)
>  #define KVM_REG_S390_CPU_TIMER  (KVM_REG_S390 | KVM_REG_SIZE_U64 | 0x3)
>  #define KVM_REG_S390_CLOCK_COMP (KVM_REG_S390 | KVM_REG_SIZE_U64 | 0x4)
> +
> +/* ioctls used for setting/getting status of APF on s390x */
> +#define KVM_S390_APF_ENABLE	1
> +#define KVM_S390_APF_DISABLE	2
> +#define KVM_S390_APF_STATUS	3
> +#define KVM_S390_APF_DISABLED_NON_PENDING	0
> +#define KVM_S390_APF_DISABLED_PENDING		1
> +#define KVM_S390_APF_ENABLED_NON_PENDING	2
> +#define KVM_S390_APF_ENABLED_PENDING		3
> +
>  #endif
> diff --git a/arch/s390/kvm/Kconfig b/arch/s390/kvm/Kconfig
> index 70b46ea..4993eed 100644
> --- a/arch/s390/kvm/Kconfig
> +++ b/arch/s390/kvm/Kconfig
> @@ -23,6 +23,8 @@ config KVM
>  	select ANON_INODES
>  	select HAVE_KVM_CPU_RELAX_INTERCEPT
>  	select HAVE_KVM_EVENTFD
> +	select KVM_ASYNC_PF
> +	select KVM_ASYNC_PF_DIRECT
>  	---help---
>  	  Support hosting paravirtualized guest machines using the SIE
>  	  virtualization capability on the mainframe. This should work
> diff --git a/arch/s390/kvm/Makefile b/arch/s390/kvm/Makefile
> index 40b4c64..63bfc28 100644
> --- a/arch/s390/kvm/Makefile
> +++ b/arch/s390/kvm/Makefile
> @@ -7,7 +7,7 @@
>  # as published by the Free Software Foundation.
>  
>  KVM := ../../../virt/kvm
> -common-objs = $(KVM)/kvm_main.o $(KVM)/eventfd.o
> +common-objs = $(KVM)/kvm_main.o $(KVM)/eventfd.o $(KVM)/async_pf.o
>  
>  ccflags-y := -Ivirt/kvm -Iarch/s390/kvm
>  
> diff --git a/arch/s390/kvm/diag.c b/arch/s390/kvm/diag.c
> index 3074475..3d210af 100644
> --- a/arch/s390/kvm/diag.c
> +++ b/arch/s390/kvm/diag.c
> @@ -17,6 +17,7 @@
>  #include "kvm-s390.h"
>  #include "trace.h"
>  #include "trace-s390.h"
> +#include "gaccess.h"
>  
>  static int diag_release_pages(struct kvm_vcpu *vcpu)
>  {
> @@ -46,6 +47,66 @@ static int diag_release_pages(struct kvm_vcpu *vcpu)
>  	return 0;
>  }
>  
> +static int __diag_page_ref_service(struct kvm_vcpu *vcpu)
> +{
> +	struct prs_parm {
> +		u16 code;
> +		u16 subcode;
> +		u16 parm_len;
> +		u16 parm_version;
> +		u64 token_addr;
> +		u64 select_mask;
> +		u64 compare_mask;
> +		u64 zarch;
> +	};
> +	struct prs_parm parm;
> +	int rc;
> +	u16 rx = (vcpu->arch.sie_block->ipa & 0xf0) >> 4;
> +	u16 ry = (vcpu->arch.sie_block->ipa & 0x0f);
> +	if (copy_from_guest(vcpu, &parm, vcpu->run->s.regs.gprs[rx], sizeof(parm)))
> +		return kvm_s390_inject_program_int(vcpu, PGM_ADDRESSING);
> +
> +	if (parm.parm_version != 2 || parm.parm_len < 0x5)
> +		return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
> +
> +	switch (parm.subcode) {
> +	case 0: /* TOKEN */
> +		if ((parm.zarch >> 63) != 1 || parm.token_addr & 7 ||
> +		    (parm.compare_mask & parm.select_mask) != parm.compare_mask)
> +			return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
> +
> +		vcpu->arch.pfault_token = parm.token_addr;
> +		vcpu->arch.pfault_select = parm.select_mask;
> +		vcpu->arch.pfault_compare = parm.compare_mask;
> +		vcpu->run->s.regs.gprs[ry] = 0;
> +		rc = 0;
> +		break;
> +	case 1:
> +		/*
> +		 * CANCEL
> +		 * The specification allows already pending tokens to survive
> +		 * the cancel, so to reduce code complexity we treat all
> +		 * outstanding tokens as already pending.
> +		 */
> +		if (vcpu->run->s.regs.gprs[rx] & 7)
> +			return kvm_s390_inject_program_int(vcpu, PGM_ADDRESSING);
> +
> +		vcpu->run->s.regs.gprs[ry] = 0;
> +
> +		if (vcpu->arch.pfault_token == KVM_S390_PFAULT_TOKEN_INVALID)
> +			vcpu->run->s.regs.gprs[ry] = 1;
> +
> +		vcpu->arch.pfault_token = KVM_S390_PFAULT_TOKEN_INVALID;
> +		rc = 0;
> +		break;
> +	default:
> +		rc = -EOPNOTSUPP;
> +		break;
> +	}
> +
> +	return rc;
> +}
> +
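The TOKEN subcode checks in __diag_page_ref_service() above can be summarized: the z/Architecture bit of the last doubleword must be set, the token address must be doubleword aligned, and the compare mask may only contain bits that are also in the select mask. A hedged sketch of that validation as a standalone predicate (token_parm_valid() is an illustrative helper, not part of the patch):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mirrors the parameter-block layout read by __diag_page_ref_service(). */
struct prs_parm {
	uint16_t code;
	uint16_t subcode;
	uint16_t parm_len;
	uint16_t parm_version;
	uint64_t token_addr;
	uint64_t select_mask;
	uint64_t compare_mask;
	uint64_t zarch;
};

/* Returns true iff a TOKEN (subcode 0) parameter block would be accepted. */
static bool token_parm_valid(const struct prs_parm *p)
{
	if (p->parm_version != 2 || p->parm_len < 0x5)
		return false;
	if ((p->zarch >> 63) != 1)	/* z/Architecture bit must be on */
		return false;
	if (p->token_addr & 7)		/* token must be 8-byte aligned */
		return false;
	/* compare bits may only select bits that are also in the select mask */
	return (p->compare_mask & p->select_mask) == p->compare_mask;
}
```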
>  static int __diag_time_slice_end(struct kvm_vcpu *vcpu)
>  {
>  	VCPU_EVENT(vcpu, 5, "%s", "diag time slice end");
> @@ -143,6 +204,8 @@ int kvm_s390_handle_diag(struct kvm_vcpu *vcpu)
>  		return __diag_time_slice_end(vcpu);
>  	case 0x9c:
>  		return __diag_time_slice_end_directed(vcpu);
> +	case 0x258:
> +		return __diag_page_ref_service(vcpu);
>  	case 0x308:
>  		return __diag_ipl_functions(vcpu);
>  	case 0x500:
> diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
> index 7f35cb3..00e7feb 100644
> --- a/arch/s390/kvm/interrupt.c
> +++ b/arch/s390/kvm/interrupt.c
> @@ -31,7 +31,7 @@ static int is_ioint(u64 type)
>  	return ((type & 0xfffe0000u) != 0xfffe0000u);
>  }
>  
> -static int psw_extint_disabled(struct kvm_vcpu *vcpu)
> +int psw_extint_disabled(struct kvm_vcpu *vcpu)
>  {
>  	return !(vcpu->arch.sie_block->gpsw.mask & PSW_MASK_EXT);
>  }
> @@ -78,11 +78,8 @@ static int __interrupt_is_deliverable(struct kvm_vcpu *vcpu,
>  			return 1;
>  		return 0;
>  	case KVM_S390_INT_SERVICE:
> -		if (psw_extint_disabled(vcpu))
> -			return 0;
> -		if (vcpu->arch.sie_block->gcr[0] & 0x200ul)
> -			return 1;
> -		return 0;
> +	case KVM_S390_INT_PFAULT_INIT:
> +	case KVM_S390_INT_PFAULT_DONE:
>  	case KVM_S390_INT_VIRTIO:
>  		if (psw_extint_disabled(vcpu))
>  			return 0;
> @@ -150,6 +147,8 @@ static void __set_intercept_indicator(struct kvm_vcpu *vcpu,
>  	case KVM_S390_INT_EXTERNAL_CALL:
>  	case KVM_S390_INT_EMERGENCY:
>  	case KVM_S390_INT_SERVICE:
> +	case KVM_S390_INT_PFAULT_INIT:
> +	case KVM_S390_INT_PFAULT_DONE:
>  	case KVM_S390_INT_VIRTIO:
>  		if (psw_extint_disabled(vcpu))
>  			__set_cpuflag(vcpu, CPUSTAT_EXT_INT);
> @@ -223,6 +222,26 @@ static void __do_deliver_interrupt(struct kvm_vcpu *vcpu,
>  		rc |= put_guest(vcpu, inti->ext.ext_params,
>  				(u32 __user *)__LC_EXT_PARAMS);
>  		break;
> +	case KVM_S390_INT_PFAULT_INIT:
> +		rc  = put_guest(vcpu, 0x2603, (u16 __user *) __LC_EXT_INT_CODE);
> +		rc |= put_guest(vcpu, 0x0600, (u16 __user *) __LC_EXT_CPU_ADDR);
> +		rc |= copy_to_guest(vcpu, __LC_EXT_OLD_PSW,
> +				    &vcpu->arch.sie_block->gpsw, sizeof(psw_t));
> +		rc |= copy_from_guest(vcpu, &vcpu->arch.sie_block->gpsw,
> +				      __LC_EXT_NEW_PSW, sizeof(psw_t));
> +		rc |= put_guest(vcpu, inti->ext.ext_params2,
> +				(u64 __user *) __LC_EXT_PARAMS2);
> +		break;
> +	case KVM_S390_INT_PFAULT_DONE:
> +		rc  = put_guest(vcpu, 0x2603, (u16 __user *) __LC_EXT_INT_CODE);
> +		rc |= put_guest(vcpu, 0x0680, (u16 __user *) __LC_EXT_CPU_ADDR);
> +		rc |= copy_to_guest(vcpu, __LC_EXT_OLD_PSW,
> +				    &vcpu->arch.sie_block->gpsw, sizeof(psw_t));
> +		rc |= copy_from_guest(vcpu, &vcpu->arch.sie_block->gpsw,
> +				      __LC_EXT_NEW_PSW, sizeof(psw_t));
> +		rc |= put_guest(vcpu, inti->ext.ext_params2,
> +				(u64 __user *) __LC_EXT_PARAMS2);
> +		break;
>  	case KVM_S390_INT_VIRTIO:
>  		VCPU_EVENT(vcpu, 4, "interrupt: virtio parm:%x,parm64:%llx",
>  			   inti->ext.ext_params, inti->ext.ext_params2);
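Both pfault interrupts above reuse external-interrupt code 0x2603 and are told apart by the external CPU address: 0x0600 for an INIT (page not present) and 0x0680 for a DONE (page resolved), matching the existing pfault interface. A trivial sketch of that mapping (pfault_cpu_addr() is an illustrative helper, not kernel code):

```c
#include <assert.h>
#include <stdint.h>

#define EXT_INT_CODE_PFAULT		0x2603	/* shared by INIT and DONE */
#define EXT_CPU_ADDR_PFAULT_INIT	0x0600
#define EXT_CPU_ADDR_PFAULT_DONE	0x0680

/* Returns the CPU-address subcode written for a pfault interrupt. */
static uint16_t pfault_cpu_addr(int is_init)
{
	return is_init ? EXT_CPU_ADDR_PFAULT_INIT : EXT_CPU_ADDR_PFAULT_DONE;
}
```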
> @@ -357,7 +376,7 @@ static int __try_deliver_ckc_interrupt(struct kvm_vcpu *vcpu)
>  	return 1;
>  }
>  
> -static int kvm_cpu_has_interrupt(struct kvm_vcpu *vcpu)
> +int kvm_cpu_has_interrupt(struct kvm_vcpu *vcpu)
>  {
>  	struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int;
>  	struct kvm_s390_float_interrupt *fi = vcpu->arch.local_int.float_int;
> @@ -681,6 +700,11 @@ int kvm_s390_inject_vm(struct kvm *kvm,
>  		inti->type = s390int->type;
>  		inti->ext.ext_params = s390int->parm;
>  		break;
> +	case KVM_S390_INT_PFAULT_INIT:
> +	case KVM_S390_INT_PFAULT_DONE:
> +		inti->type = s390int->type;
> +		inti->ext.ext_params2 = s390int->parm64;
> +		break;
>  	case KVM_S390_PROGRAM_INT:
>  	case KVM_S390_SIGP_STOP:
>  	case KVM_S390_INT_EXTERNAL_CALL:
> @@ -811,6 +835,11 @@ int kvm_s390_inject_vcpu(struct kvm_vcpu *vcpu,
>  		inti->type = s390int->type;
>  		inti->mchk.mcic = s390int->parm64;
>  		break;
> +	case KVM_S390_INT_PFAULT_INIT:
> +	case KVM_S390_INT_PFAULT_DONE:
> +		inti->type = s390int->type;
> +		inti->ext.ext_params2 = s390int->parm64;
> +		break;
>  	case KVM_S390_INT_VIRTIO:
>  	case KVM_S390_INT_SERVICE:
>  	case KVM_S390_INT_IO_MIN...KVM_S390_INT_IO_MAX:
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index 702daca..ef70296 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -145,6 +145,7 @@ int kvm_dev_ioctl_check_extension(long ext)
>  #ifdef CONFIG_KVM_S390_UCONTROL
>  	case KVM_CAP_S390_UCONTROL:
>  #endif
> +	case KVM_CAP_ASYNC_PF:
>  	case KVM_CAP_SYNC_REGS:
>  	case KVM_CAP_ONE_REG:
>  	case KVM_CAP_ENABLE_CAP:
> @@ -186,6 +187,33 @@ long kvm_arch_vm_ioctl(struct file *filp,
>  	int r;
>  
>  	switch (ioctl) {
> +	case KVM_S390_APF_ENABLE:
> +		set_bit(1, &kvm->arch.gmap->pfault_enabled);
> +		r = 0;
> +		break;
> +	case KVM_S390_APF_DISABLE:
> +		clear_bit(1, &kvm->arch.gmap->pfault_enabled);
> +		r = 0;
> +		break;
> +	case KVM_S390_APF_STATUS: {
> +		bool pfaults_pending = false;
> +		unsigned int i;
> +		struct kvm_vcpu *vcpu;
> +		r = 0;
> +		if (test_bit(1, &kvm->arch.gmap->pfault_enabled))
> +			r += 2;
> +
> +		kvm_for_each_vcpu(i, vcpu, kvm) {
> +			spin_lock(&vcpu->async_pf.lock);
> +			if (vcpu->async_pf.queued > 0)
> +				pfaults_pending = true;
> +			spin_unlock(&vcpu->async_pf.lock);
> +		}
> +
> +		if (pfaults_pending)
> +			r += 1;
> +		break;
> +	}
>  	case KVM_S390_INTERRUPT: {
>  		struct kvm_s390_interrupt s390int;
>  
> @@ -264,6 +292,7 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
>  {
>  	VCPU_EVENT(vcpu, 3, "%s", "free cpu");
>  	trace_kvm_s390_destroy_vcpu(vcpu->vcpu_id);
> +	kvm_clear_async_pf_completion_queue(vcpu);
>  	if (!kvm_is_ucontrol(vcpu->kvm)) {
>  		clear_bit(63 - vcpu->vcpu_id,
>  			  (unsigned long *) &vcpu->kvm->arch.sca->mcn);
> @@ -313,6 +342,9 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
>  /* Section: vcpu related */
>  int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
>  {
> +	vcpu->arch.pfault_token = KVM_S390_PFAULT_TOKEN_INVALID;
> +	kvm_clear_async_pf_completion_queue(vcpu);
> +	kvm_async_pf_wakeup_all(vcpu);
>  	if (kvm_is_ucontrol(vcpu->kvm)) {
>  		vcpu->arch.gmap = gmap_alloc(current->mm);
>  		if (!vcpu->arch.gmap)
> @@ -370,6 +402,7 @@ static void kvm_s390_vcpu_initial_reset(struct kvm_vcpu *vcpu)
>  	vcpu->arch.guest_fpregs.fpc = 0;
>  	asm volatile("lfpc %0" : : "Q" (vcpu->arch.guest_fpregs.fpc));
>  	vcpu->arch.sie_block->gbea = 1;
> +	vcpu->arch.pfault_token = KVM_S390_PFAULT_TOKEN_INVALID;
>  	atomic_set_mask(CPUSTAT_STOPPED, &vcpu->arch.sie_block->cpuflags);
>  }
>  
> @@ -691,10 +724,81 @@ static void kvm_arch_fault_in_sync(struct kvm_vcpu *vcpu)
>  	up_read(&mm->mmap_sem);
>  }
>  
> +static void __kvm_inject_pfault_token(struct kvm_vcpu *vcpu, bool start_token,
> +				      unsigned long token)
> +{
> +	struct kvm_s390_interrupt inti;
> +	inti.parm64 = token;
> +
> +	if (start_token) {
> +		inti.type = KVM_S390_INT_PFAULT_INIT;
> +		if (kvm_s390_inject_vcpu(vcpu, &inti))
> +			WARN(1, "pfault interrupt injection failed");
> +	} else {
> +		inti.type = KVM_S390_INT_PFAULT_DONE;
> +		if (kvm_s390_inject_vm(vcpu->kvm, &inti))
> +			WARN(1, "pfault interrupt injection failed");
> +	}
> +}
> +
> +void kvm_arch_async_page_not_present(struct kvm_vcpu *vcpu,
> +				     struct kvm_async_pf *work)
> +{
> +	__kvm_inject_pfault_token(vcpu, true, work->arch.pfault_token);
> +}
> +
> +void kvm_arch_async_page_present(struct kvm_vcpu *vcpu,
> +				 struct kvm_async_pf *work)
> +{
> +	__kvm_inject_pfault_token(vcpu, false, work->arch.pfault_token);
> +}
> +
> +void kvm_arch_async_page_ready(struct kvm_vcpu *vcpu,
> +			       struct kvm_async_pf *work)
> +{
> +	/* s390 will always inject the page directly */
> +}
> +
> +bool kvm_arch_can_inject_async_page_present(struct kvm_vcpu *vcpu)
> +{
> +	/*
> +	 * s390 always injects the page directly, but we still want
> +	 * kvm_check_async_pf_completion() to clean up the completed work items
> +	 */
> +	return true;
> +}
> +
> +static int kvm_arch_setup_async_pf(struct kvm_vcpu *vcpu)
> +{
> +	hva_t hva = gmap_fault(current->thread.gmap_addr, vcpu->arch.gmap);
> +	struct kvm_arch_async_pf arch;
> +
> +	if (vcpu->arch.pfault_token == KVM_S390_PFAULT_TOKEN_INVALID)
> +		return 0;
> +	if ((vcpu->arch.sie_block->gpsw.mask & vcpu->arch.pfault_select) !=
> +	    vcpu->arch.pfault_compare)
> +		return 0;
> +	if (psw_extint_disabled(vcpu))
> +		return 0;
> +	if (kvm_cpu_has_interrupt(vcpu))
> +		return 0;
> +	if (!(vcpu->arch.sie_block->gcr[0] & 0x200ul))
> +		return 0;
> +
> +	if (copy_from_guest(vcpu, &arch.pfault_token, vcpu->arch.pfault_token, 8)) {
> +		/* already in the error case, inject the interrupt and return 0 */
> +		int ign = kvm_s390_inject_program_int(vcpu, PGM_ADDRESSING);
> +		return ign - ign;	/* consumes ign and always evaluates to 0 */
> +	}
> +	return kvm_setup_async_pf(vcpu, current->thread.gmap_addr, hva, &arch);
> +}
> +
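The PSW gating in kvm_arch_setup_async_pf() above boils down to: a pfault token is only queued while the guest-selected PSW mask bits equal the guest-supplied compare value, both set up via DIAG 0x258. A sketch with illustrative mask values (not real PSW bit layouts):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * A pfault INIT may only be armed when the bits of the current PSW mask
 * chosen by pfault_select equal pfault_compare, as requested by the guest.
 */
static bool psw_allows_pfault(uint64_t psw_mask, uint64_t select,
			      uint64_t compare)
{
	return (psw_mask & select) == compare;
}
```

If this predicate is false — or external interrupts are masked, another interrupt is pending, or the token is invalid — the fault falls back to the synchronous path in __vcpu_run().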
>  static int __vcpu_run(struct kvm_vcpu *vcpu)
>  {
>  	int rc;
>  
> +	kvm_check_async_pf_completion(vcpu);
> +
>  	memcpy(&vcpu->arch.sie_block->gg14, &vcpu->run->s.regs.gprs[14], 16);
>  
>  	if (need_resched())
> @@ -725,7 +829,8 @@ static int __vcpu_run(struct kvm_vcpu *vcpu)
>  		if (kvm_is_ucontrol(vcpu->kvm)) {
>  			rc = SIE_INTERCEPT_UCONTROL;
>  		} else if (current->thread.gmap_pfault) {
> -			kvm_arch_fault_in_sync(vcpu);
> +			if (!kvm_arch_setup_async_pf(vcpu))
> +				kvm_arch_fault_in_sync(vcpu);
>  			current->thread.gmap_pfault = 0;
>  			rc = 0;
>  		} else {
> diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h
> index 028ca9f..d0f4d2a 100644
> --- a/arch/s390/kvm/kvm-s390.h
> +++ b/arch/s390/kvm/kvm-s390.h
> @@ -148,4 +148,8 @@ void exit_sie_sync(struct kvm_vcpu *vcpu);
>  /* implemented in diag.c */
>  int kvm_s390_handle_diag(struct kvm_vcpu *vcpu);
>  
> +/* implemented in interrupt.c */
> +int kvm_cpu_has_interrupt(struct kvm_vcpu *vcpu);
> +int psw_extint_disabled(struct kvm_vcpu *vcpu);
> +
>  #endif
> diff --git a/arch/s390/kvm/sigp.c b/arch/s390/kvm/sigp.c
> index bec398c..a6a0f02 100644
> --- a/arch/s390/kvm/sigp.c
> +++ b/arch/s390/kvm/sigp.c
> @@ -186,6 +186,12 @@ int kvm_s390_inject_sigp_stop(struct kvm_vcpu *vcpu, int action)
>  static int __sigp_set_arch(struct kvm_vcpu *vcpu, u32 parameter)
>  {
>  	int rc;
> +	unsigned int i;
> +	struct kvm_vcpu *vcpu_to_set;
> +
> +	kvm_for_each_vcpu(i, vcpu_to_set, vcpu->kvm) {
> +		vcpu_to_set->arch.pfault_token = KVM_S390_PFAULT_TOKEN_INVALID;
> +	}
>  
>  	switch (parameter & 0xff) {
>  	case 0:
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index acccd08..fae432c 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -413,6 +413,8 @@ struct kvm_s390_psw {
>  #define KVM_S390_PROGRAM_INT		0xfffe0001u
>  #define KVM_S390_SIGP_SET_PREFIX	0xfffe0002u
>  #define KVM_S390_RESTART		0xfffe0003u
> +#define KVM_S390_INT_PFAULT_INIT	0xfffe0004u
> +#define KVM_S390_INT_PFAULT_DONE	0xfffe0005u
>  #define KVM_S390_MCHK			0xfffe1000u
>  #define KVM_S390_INT_VIRTIO		0xffff2603u
>  #define KVM_S390_INT_SERVICE		0xffff2401u
> -- 
> 1.8.2.2

--
			Gleb.


* Re: [PATCH 4/4] PF: Async page fault support on s390
@ 2013-07-11  9:04     ` Gleb Natapov
  0 siblings, 0 replies; 20+ messages in thread
From: Gleb Natapov @ 2013-07-11  9:04 UTC (permalink / raw)
  To: Dominik Dingel
  Cc: Paolo Bonzini, Christian Borntraeger, Heiko Carstens,
	Martin Schwidefsky, Cornelia Huck, Xiantao Zhang, Alexander Graf,
	Christoffer Dall, Marc Zyngier, Ralf Baechle, kvm, linux-s390,
	linux-mm, linux-kernel

On Wed, Jul 10, 2013 at 02:59:55PM +0200, Dominik Dingel wrote:
> This patch enables async page faults for s390 kvm guests.
> It provides the userspace API to enable, disable or get the status of this
> feature. Also it includes the diagnose code, called by the guest to enable
> async page faults.
> 
> The async page faults will use an already existing guest interface for this
> purpose, as described in "CP Programming Services (SC24-6084)".
> 
> Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Christian, looks good now?

> ---
>  Documentation/s390/kvm.txt       |  24 +++++++++
>  arch/s390/include/asm/kvm_host.h |  22 ++++++++
>  arch/s390/include/uapi/asm/kvm.h |  10 ++++
>  arch/s390/kvm/Kconfig            |   2 +
>  arch/s390/kvm/Makefile           |   2 +-
>  arch/s390/kvm/diag.c             |  63 +++++++++++++++++++++++
>  arch/s390/kvm/interrupt.c        |  43 +++++++++++++---
>  arch/s390/kvm/kvm-s390.c         | 107 ++++++++++++++++++++++++++++++++++++++-
>  arch/s390/kvm/kvm-s390.h         |   4 ++
>  arch/s390/kvm/sigp.c             |   6 +++
>  include/uapi/linux/kvm.h         |   2 +
>  11 files changed, 276 insertions(+), 9 deletions(-)
> 
> diff --git a/Documentation/s390/kvm.txt b/Documentation/s390/kvm.txt
> index 85f3280..707b7e9 100644
> --- a/Documentation/s390/kvm.txt
> +++ b/Documentation/s390/kvm.txt
> @@ -70,6 +70,30 @@ floating interrupts are:
>  KVM_S390_INT_VIRTIO
>  KVM_S390_INT_SERVICE
>  
> +ioctl:      KVM_S390_APF_ENABLE:
> +args:       none
> +This ioctl is used to enable the async page fault interface. So in a
> +host page fault case the host can now submit pfault tokens to the guest.
> +
> +ioctl:      KVM_S390_APF_DISABLE:
> +args:       none
> +This ioctl is used to disable the async page fault interface. From this point
> +on no new pfault tokens will be issued to the guest. Already existing async
> +page faults are not covered by this and will be normally handled.
> +
> +ioctl:      KVM_S390_APF_STATUS:
> +args:       none
> +This ioctl allows the userspace to get the current status of the APF feature.
> +The main purpose for this, is to ensure that no pfault tokens will be lost
> +during live migration or similar management operations.
> +The possible return values are:
> +KVM_S390_APF_DISABLED_NON_PENDING
> +KVM_S390_APF_DISABLED_PENDING
> +KVM_S390_APF_ENABLED_NON_PENDING
> +KVM_S390_APF_ENABLED_PENDING
> +Caution: if KVM_S390_APF is enabled the PENDING status could be already changed
> +as soon as the ioctl returns to userspace.
> +
>  3. ioctl calls to the kvm-vcpu file descriptor
>  KVM does support the following ioctls on s390 that are common with other
>  architectures and do behave the same:
> diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
> index cd30c3d..e8012fc 100644
> --- a/arch/s390/include/asm/kvm_host.h
> +++ b/arch/s390/include/asm/kvm_host.h
> @@ -257,6 +257,10 @@ struct kvm_vcpu_arch {
>  		u64		stidp_data;
>  	};
>  	struct gmap *gmap;
> +#define KVM_S390_PFAULT_TOKEN_INVALID	(-1UL)
> +	unsigned long pfault_token;
> +	unsigned long pfault_select;
> +	unsigned long pfault_compare;
>  };
>  
>  struct kvm_vm_stat {
> @@ -282,6 +286,24 @@ static inline bool kvm_is_error_hva(unsigned long addr)
>  	return addr == KVM_HVA_ERR_BAD;
>  }
>  
> +#define ASYNC_PF_PER_VCPU	64
> +struct kvm_vcpu;
> +struct kvm_async_pf;
> +struct kvm_arch_async_pf {
> +	unsigned long pfault_token;
> +};
> +
> +bool kvm_arch_can_inject_async_page_present(struct kvm_vcpu *vcpu);
> +
> +void kvm_arch_async_page_ready(struct kvm_vcpu *vcpu,
> +			       struct kvm_async_pf *work);
> +
> +void kvm_arch_async_page_not_present(struct kvm_vcpu *vcpu,
> +				     struct kvm_async_pf *work);
> +
> +void kvm_arch_async_page_present(struct kvm_vcpu *vcpu,
> +				 struct kvm_async_pf *work);
> +
>  extern int sie64a(struct kvm_s390_sie_block *, u64 *);
>  extern char sie_exit;
>  #endif
> diff --git a/arch/s390/include/uapi/asm/kvm.h b/arch/s390/include/uapi/asm/kvm.h
> index d25da59..b6c83e0 100644
> --- a/arch/s390/include/uapi/asm/kvm.h
> +++ b/arch/s390/include/uapi/asm/kvm.h
> @@ -57,4 +57,14 @@ struct kvm_sync_regs {
>  #define KVM_REG_S390_EPOCHDIFF	(KVM_REG_S390 | KVM_REG_SIZE_U64 | 0x2)
>  #define KVM_REG_S390_CPU_TIMER  (KVM_REG_S390 | KVM_REG_SIZE_U64 | 0x3)
>  #define KVM_REG_S390_CLOCK_COMP (KVM_REG_S390 | KVM_REG_SIZE_U64 | 0x4)
> +
> +/* ioctls used for setting/getting status of APF on s390x */
> +#define KVM_S390_APF_ENABLE	1
> +#define KVM_S390_APF_DISABLE	2
> +#define KVM_S390_APF_STATUS	3
> +#define KVM_S390_APF_DISABLED_NON_PENDING	0
> +#define KVM_S390_APF_DISABLED_PENDING		1
> +#define KVM_S390_APF_ENABLED_NON_PENDING	2
> +#define KVM_S390_APF_ENABLED_PENDING		3
> +
>  #endif
> diff --git a/arch/s390/kvm/Kconfig b/arch/s390/kvm/Kconfig
> index 70b46ea..4993eed 100644
> --- a/arch/s390/kvm/Kconfig
> +++ b/arch/s390/kvm/Kconfig
> @@ -23,6 +23,8 @@ config KVM
>  	select ANON_INODES
>  	select HAVE_KVM_CPU_RELAX_INTERCEPT
>  	select HAVE_KVM_EVENTFD
> +	select KVM_ASYNC_PF
> +	select KVM_ASYNC_PF_DIRECT
>  	---help---
>  	  Support hosting paravirtualized guest machines using the SIE
>  	  virtualization capability on the mainframe. This should work
> diff --git a/arch/s390/kvm/Makefile b/arch/s390/kvm/Makefile
> index 40b4c64..63bfc28 100644
> --- a/arch/s390/kvm/Makefile
> +++ b/arch/s390/kvm/Makefile
> @@ -7,7 +7,7 @@
>  # as published by the Free Software Foundation.
>  
>  KVM := ../../../virt/kvm
> -common-objs = $(KVM)/kvm_main.o $(KVM)/eventfd.o
> +common-objs = $(KVM)/kvm_main.o $(KVM)/eventfd.o $(KVM)/async_pf.o
>  
>  ccflags-y := -Ivirt/kvm -Iarch/s390/kvm
>  
> diff --git a/arch/s390/kvm/diag.c b/arch/s390/kvm/diag.c
> index 3074475..3d210af 100644
> --- a/arch/s390/kvm/diag.c
> +++ b/arch/s390/kvm/diag.c
> @@ -17,6 +17,7 @@
>  #include "kvm-s390.h"
>  #include "trace.h"
>  #include "trace-s390.h"
> +#include "gaccess.h"
>  
>  static int diag_release_pages(struct kvm_vcpu *vcpu)
>  {
> @@ -46,6 +47,66 @@ static int diag_release_pages(struct kvm_vcpu *vcpu)
>  	return 0;
>  }
>  
> +static int __diag_page_ref_service(struct kvm_vcpu *vcpu)
> +{
> +	struct prs_parm {
> +		u16 code;
> +		u16 subcode;
> +		u16 parm_len;
> +		u16 parm_version;
> +		u64 token_addr;
> +		u64 select_mask;
> +		u64 compare_mask;
> +		u64 zarch;
> +	};
> +	struct prs_parm parm;
> +	int rc;
> +	u16 rx = (vcpu->arch.sie_block->ipa & 0xf0) >> 4;
> +	u16 ry = (vcpu->arch.sie_block->ipa & 0x0f);
> +	if (copy_from_guest(vcpu, &parm, vcpu->run->s.regs.gprs[rx], sizeof(parm)))
> +		return kvm_s390_inject_program_int(vcpu, PGM_ADDRESSING);
> +
> +	if (parm.parm_version != 2 || parm.parm_len < 0x5)
> +		return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
> +
> +	switch (parm.subcode) {
> +	case 0: /* TOKEN */
> +		if ((parm.zarch >> 63) != 1 || parm.token_addr & 7 ||
> +		    (parm.compare_mask & parm.select_mask) != parm.compare_mask)
> +			return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
> +
> +		vcpu->arch.pfault_token = parm.token_addr;
> +		vcpu->arch.pfault_select = parm.select_mask;
> +		vcpu->arch.pfault_compare = parm.compare_mask;
> +		vcpu->run->s.regs.gprs[ry] = 0;
> +		rc = 0;
> +		break;
> +	case 1:
> +		/*
> +		 * CANCEL
> +		 * The specification allows already pending tokens to survive
> +		 * the cancel, so to reduce code complexity we treat all
> +		 * outstanding tokens as already pending.
> +		 */
> +		if (vcpu->run->s.regs.gprs[rx] & 7)
> +			return kvm_s390_inject_program_int(vcpu, PGM_ADDRESSING);
> +
> +		vcpu->run->s.regs.gprs[ry] = 0;
> +
> +		if (vcpu->arch.pfault_token == KVM_S390_PFAULT_TOKEN_INVALID)
> +			vcpu->run->s.regs.gprs[ry] = 1;
> +
> +		vcpu->arch.pfault_token = KVM_S390_PFAULT_TOKEN_INVALID;
> +		rc = 0;
> +		break;
> +	default:
> +		rc = -EOPNOTSUPP;
> +		break;
> +	}
> +
> +	return rc;
> +}
> +
>  static int __diag_time_slice_end(struct kvm_vcpu *vcpu)
>  {
>  	VCPU_EVENT(vcpu, 5, "%s", "diag time slice end");
> @@ -143,6 +204,8 @@ int kvm_s390_handle_diag(struct kvm_vcpu *vcpu)
>  		return __diag_time_slice_end(vcpu);
>  	case 0x9c:
>  		return __diag_time_slice_end_directed(vcpu);
> +	case 0x258:
> +		return __diag_page_ref_service(vcpu);
>  	case 0x308:
>  		return __diag_ipl_functions(vcpu);
>  	case 0x500:
> diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
> index 7f35cb3..00e7feb 100644
> --- a/arch/s390/kvm/interrupt.c
> +++ b/arch/s390/kvm/interrupt.c
> @@ -31,7 +31,7 @@ static int is_ioint(u64 type)
>  	return ((type & 0xfffe0000u) != 0xfffe0000u);
>  }
>  
> -static int psw_extint_disabled(struct kvm_vcpu *vcpu)
> +int psw_extint_disabled(struct kvm_vcpu *vcpu)
>  {
>  	return !(vcpu->arch.sie_block->gpsw.mask & PSW_MASK_EXT);
>  }
> @@ -78,11 +78,8 @@ static int __interrupt_is_deliverable(struct kvm_vcpu *vcpu,
>  			return 1;
>  		return 0;
>  	case KVM_S390_INT_SERVICE:
> -		if (psw_extint_disabled(vcpu))
> -			return 0;
> -		if (vcpu->arch.sie_block->gcr[0] & 0x200ul)
> -			return 1;
> -		return 0;
> +	case KVM_S390_INT_PFAULT_INIT:
> +	case KVM_S390_INT_PFAULT_DONE:
>  	case KVM_S390_INT_VIRTIO:
>  		if (psw_extint_disabled(vcpu))
>  			return 0;
> @@ -150,6 +147,8 @@ static void __set_intercept_indicator(struct kvm_vcpu *vcpu,
>  	case KVM_S390_INT_EXTERNAL_CALL:
>  	case KVM_S390_INT_EMERGENCY:
>  	case KVM_S390_INT_SERVICE:
> +	case KVM_S390_INT_PFAULT_INIT:
> +	case KVM_S390_INT_PFAULT_DONE:
>  	case KVM_S390_INT_VIRTIO:
>  		if (psw_extint_disabled(vcpu))
>  			__set_cpuflag(vcpu, CPUSTAT_EXT_INT);
> @@ -223,6 +222,26 @@ static void __do_deliver_interrupt(struct kvm_vcpu *vcpu,
>  		rc |= put_guest(vcpu, inti->ext.ext_params,
>  				(u32 __user *)__LC_EXT_PARAMS);
>  		break;
> +	case KVM_S390_INT_PFAULT_INIT:
> +		rc  = put_guest(vcpu, 0x2603, (u16 __user *) __LC_EXT_INT_CODE);
> +		rc |= put_guest(vcpu, 0x0600, (u16 __user *) __LC_EXT_CPU_ADDR);
> +		rc |= copy_to_guest(vcpu, __LC_EXT_OLD_PSW,
> +				    &vcpu->arch.sie_block->gpsw, sizeof(psw_t));
> +		rc |= copy_from_guest(vcpu, &vcpu->arch.sie_block->gpsw,
> +				      __LC_EXT_NEW_PSW, sizeof(psw_t));
> +		rc |= put_guest(vcpu, inti->ext.ext_params2,
> +				(u64 __user *) __LC_EXT_PARAMS2);
> +		break;
> +	case KVM_S390_INT_PFAULT_DONE:
> +		rc  = put_guest(vcpu, 0x2603, (u16 __user *) __LC_EXT_INT_CODE);
> +		rc |= put_guest(vcpu, 0x0680, (u16 __user *) __LC_EXT_CPU_ADDR);
> +		rc |= copy_to_guest(vcpu, __LC_EXT_OLD_PSW,
> +				    &vcpu->arch.sie_block->gpsw, sizeof(psw_t));
> +		rc |= copy_from_guest(vcpu, &vcpu->arch.sie_block->gpsw,
> +				      __LC_EXT_NEW_PSW, sizeof(psw_t));
> +		rc |= put_guest(vcpu, inti->ext.ext_params2,
> +				(u64 __user *) __LC_EXT_PARAMS2);
> +		break;
>  	case KVM_S390_INT_VIRTIO:
>  		VCPU_EVENT(vcpu, 4, "interrupt: virtio parm:%x,parm64:%llx",
>  			   inti->ext.ext_params, inti->ext.ext_params2);
> @@ -357,7 +376,7 @@ static int __try_deliver_ckc_interrupt(struct kvm_vcpu *vcpu)
>  	return 1;
>  }
>  
> -static int kvm_cpu_has_interrupt(struct kvm_vcpu *vcpu)
> +int kvm_cpu_has_interrupt(struct kvm_vcpu *vcpu)
>  {
>  	struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int;
>  	struct kvm_s390_float_interrupt *fi = vcpu->arch.local_int.float_int;
> @@ -681,6 +700,11 @@ int kvm_s390_inject_vm(struct kvm *kvm,
>  		inti->type = s390int->type;
>  		inti->ext.ext_params = s390int->parm;
>  		break;
> +	case KVM_S390_INT_PFAULT_INIT:
> +	case KVM_S390_INT_PFAULT_DONE:
> +		inti->type = s390int->type;
> +		inti->ext.ext_params2 = s390int->parm64;
> +		break;
>  	case KVM_S390_PROGRAM_INT:
>  	case KVM_S390_SIGP_STOP:
>  	case KVM_S390_INT_EXTERNAL_CALL:
> @@ -811,6 +835,11 @@ int kvm_s390_inject_vcpu(struct kvm_vcpu *vcpu,
>  		inti->type = s390int->type;
>  		inti->mchk.mcic = s390int->parm64;
>  		break;
> +	case KVM_S390_INT_PFAULT_INIT:
> +	case KVM_S390_INT_PFAULT_DONE:
> +		inti->type = s390int->type;
> +		inti->ext.ext_params2 = s390int->parm64;
> +		break;
>  	case KVM_S390_INT_VIRTIO:
>  	case KVM_S390_INT_SERVICE:
>  	case KVM_S390_INT_IO_MIN...KVM_S390_INT_IO_MAX:
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index 702daca..ef70296 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -145,6 +145,7 @@ int kvm_dev_ioctl_check_extension(long ext)
>  #ifdef CONFIG_KVM_S390_UCONTROL
>  	case KVM_CAP_S390_UCONTROL:
>  #endif
> +	case KVM_CAP_ASYNC_PF:
>  	case KVM_CAP_SYNC_REGS:
>  	case KVM_CAP_ONE_REG:
>  	case KVM_CAP_ENABLE_CAP:
> @@ -186,6 +187,33 @@ long kvm_arch_vm_ioctl(struct file *filp,
>  	int r;
>  
>  	switch (ioctl) {
> +	case KVM_S390_APF_ENABLE:
> +		set_bit(1, &kvm->arch.gmap->pfault_enabled);
> +		r = 0;
> +		break;
> +	case KVM_S390_APF_DISABLE:
> +		clear_bit(1, &kvm->arch.gmap->pfault_enabled);
> +		r = 0;
> +		break;
> +	case KVM_S390_APF_STATUS: {
> +		bool pfaults_pending = false;
> +		unsigned int i;
> +		struct kvm_vcpu *vcpu;
> +		r = 0;
> +		if (test_bit(1, &kvm->arch.gmap->pfault_enabled))
> +			r += 2;
> +
> +		kvm_for_each_vcpu(i, vcpu, kvm) {
> +			spin_lock(&vcpu->async_pf.lock);
> +			if (vcpu->async_pf.queued > 0)
> +				pfaults_pending = true;
> +			spin_unlock(&vcpu->async_pf.lock);
> +		}
> +
> +		if (pfaults_pending)
> +			r += 1;
> +		break;
> +	}
>  	case KVM_S390_INTERRUPT: {
>  		struct kvm_s390_interrupt s390int;
>  
> @@ -264,6 +292,7 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
>  {
>  	VCPU_EVENT(vcpu, 3, "%s", "free cpu");
>  	trace_kvm_s390_destroy_vcpu(vcpu->vcpu_id);
> +	kvm_clear_async_pf_completion_queue(vcpu);
>  	if (!kvm_is_ucontrol(vcpu->kvm)) {
>  		clear_bit(63 - vcpu->vcpu_id,
>  			  (unsigned long *) &vcpu->kvm->arch.sca->mcn);
> @@ -313,6 +342,9 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
>  /* Section: vcpu related */
>  int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
>  {
> +	vcpu->arch.pfault_token = KVM_S390_PFAULT_TOKEN_INVALID;
> +	kvm_clear_async_pf_completion_queue(vcpu);
> +	kvm_async_pf_wakeup_all(vcpu);
>  	if (kvm_is_ucontrol(vcpu->kvm)) {
>  		vcpu->arch.gmap = gmap_alloc(current->mm);
>  		if (!vcpu->arch.gmap)
> @@ -370,6 +402,7 @@ static void kvm_s390_vcpu_initial_reset(struct kvm_vcpu *vcpu)
>  	vcpu->arch.guest_fpregs.fpc = 0;
>  	asm volatile("lfpc %0" : : "Q" (vcpu->arch.guest_fpregs.fpc));
>  	vcpu->arch.sie_block->gbea = 1;
> +	vcpu->arch.pfault_token = KVM_S390_PFAULT_TOKEN_INVALID;
>  	atomic_set_mask(CPUSTAT_STOPPED, &vcpu->arch.sie_block->cpuflags);
>  }
>  
> @@ -691,10 +724,81 @@ static void kvm_arch_fault_in_sync(struct kvm_vcpu *vcpu)
>  	up_read(&mm->mmap_sem);
>  }
>  
> +static void __kvm_inject_pfault_token(struct kvm_vcpu *vcpu, bool start_token,
> +				      unsigned long token)
> +{
> +	struct kvm_s390_interrupt inti;
> +	inti.parm64 = token;
> +
> +	if (start_token) {
> +		inti.type = KVM_S390_INT_PFAULT_INIT;
> +		if (kvm_s390_inject_vcpu(vcpu, &inti))
> +			WARN(1, "pfault interrupt injection failed");
> +	} else {
> +		inti.type = KVM_S390_INT_PFAULT_DONE;
> +		if (kvm_s390_inject_vm(vcpu->kvm, &inti))
> +			WARN(1, "pfault interrupt injection failed");
> +	}
> +}
> +
> +void kvm_arch_async_page_not_present(struct kvm_vcpu *vcpu,
> +				     struct kvm_async_pf *work)
> +{
> +	__kvm_inject_pfault_token(vcpu, true, work->arch.pfault_token);
> +}
> +
> +void kvm_arch_async_page_present(struct kvm_vcpu *vcpu,
> +				 struct kvm_async_pf *work)
> +{
> +	__kvm_inject_pfault_token(vcpu, false, work->arch.pfault_token);
> +}
> +
> +void kvm_arch_async_page_ready(struct kvm_vcpu *vcpu,
> +			       struct kvm_async_pf *work)
> +{
> +	/* s390 will always inject the page directly */
> +}
> +
> +bool kvm_arch_can_inject_async_page_present(struct kvm_vcpu *vcpu)
> +{
> +	/*
> +	 * s390 will always inject the page directly,
> +	 * but we still want check_async_completion to clean up
> +	 */
> +	return true;
> +}
> +
> +static int kvm_arch_setup_async_pf(struct kvm_vcpu *vcpu)
> +{
> +	hva_t hva = gmap_fault(current->thread.gmap_addr, vcpu->arch.gmap);
> +	struct kvm_arch_async_pf arch;
> +
> +	if (vcpu->arch.pfault_token == KVM_S390_PFAULT_TOKEN_INVALID)
> +		return 0;
> +	if ((vcpu->arch.sie_block->gpsw.mask & vcpu->arch.pfault_select) !=
> +	    vcpu->arch.pfault_compare)
> +		return 0;
> +	if (psw_extint_disabled(vcpu))
> +		return 0;
> +	if (kvm_cpu_has_interrupt(vcpu))
> +		return 0;
> +	if (!(vcpu->arch.sie_block->gcr[0] & 0x200ul))
> +		return 0;
> +
> +	if (copy_from_guest(vcpu, &arch.pfault_token, vcpu->arch.pfault_token, 8)) {
> +		/* already in error case, insert the interrupt and return 0 */
> +		int ign = kvm_s390_inject_program_int(vcpu, PGM_ADDRESSING);
> +		return ign - ign;
> +	}
> +	return kvm_setup_async_pf(vcpu, current->thread.gmap_addr, hva, &arch);
> +}
> +
>  static int __vcpu_run(struct kvm_vcpu *vcpu)
>  {
>  	int rc;
>  
> +	kvm_check_async_pf_completion(vcpu);
> +
>  	memcpy(&vcpu->arch.sie_block->gg14, &vcpu->run->s.regs.gprs[14], 16);
>  
>  	if (need_resched())
> @@ -725,7 +829,8 @@ static int __vcpu_run(struct kvm_vcpu *vcpu)
>  		if (kvm_is_ucontrol(vcpu->kvm)) {
>  			rc = SIE_INTERCEPT_UCONTROL;
>  		} else if (current->thread.gmap_pfault) {
> -			kvm_arch_fault_in_sync(vcpu);
> +			if (!kvm_arch_setup_async_pf(vcpu))
> +				kvm_arch_fault_in_sync(vcpu);
>  			current->thread.gmap_pfault = 0;
>  			rc = 0;
>  		} else {
> diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h
> index 028ca9f..d0f4d2a 100644
> --- a/arch/s390/kvm/kvm-s390.h
> +++ b/arch/s390/kvm/kvm-s390.h
> @@ -148,4 +148,8 @@ void exit_sie_sync(struct kvm_vcpu *vcpu);
>  /* implemented in diag.c */
>  int kvm_s390_handle_diag(struct kvm_vcpu *vcpu);
>  
> +/* implemented in interrupt.c */
> +int kvm_cpu_has_interrupt(struct kvm_vcpu *vcpu);
> +int psw_extint_disabled(struct kvm_vcpu *vcpu);
> +
>  #endif
> diff --git a/arch/s390/kvm/sigp.c b/arch/s390/kvm/sigp.c
> index bec398c..a6a0f02 100644
> --- a/arch/s390/kvm/sigp.c
> +++ b/arch/s390/kvm/sigp.c
> @@ -186,6 +186,12 @@ int kvm_s390_inject_sigp_stop(struct kvm_vcpu *vcpu, int action)
>  static int __sigp_set_arch(struct kvm_vcpu *vcpu, u32 parameter)
>  {
>  	int rc;
> +	unsigned int i;
> +	struct kvm_vcpu *vcpu_to_set;
> +
> +	kvm_for_each_vcpu(i, vcpu_to_set, vcpu->kvm) {
> +		vcpu_to_set->arch.pfault_token = KVM_S390_PFAULT_TOKEN_INVALID;
> +	}
>  
>  	switch (parameter & 0xff) {
>  	case 0:
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index acccd08..fae432c 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -413,6 +413,8 @@ struct kvm_s390_psw {
>  #define KVM_S390_PROGRAM_INT		0xfffe0001u
>  #define KVM_S390_SIGP_SET_PREFIX	0xfffe0002u
>  #define KVM_S390_RESTART		0xfffe0003u
> +#define KVM_S390_INT_PFAULT_INIT	0xfffe0004u
> +#define KVM_S390_INT_PFAULT_DONE	0xfffe0005u
>  #define KVM_S390_MCHK			0xfffe1000u
>  #define KVM_S390_INT_VIRTIO		0xffff2603u
>  #define KVM_S390_INT_SERVICE		0xffff2401u
> -- 
> 1.8.2.2

--
			Gleb.
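The TOKEN-subcode checks in __diag_page_ref_service() above can be restated as a small, self-contained sketch. This is illustrative only: the struct layout follows the patch, but the function name and the 0/-1 return convention are hypothetical stand-ins for "accept" vs. "inject PGM_SPECIFICATION".

```c
#include <stdint.h>

/*
 * Hypothetical restatement of the DIAG 0x258 TOKEN parameter-block
 * validation from __diag_page_ref_service(); layout as in the patch.
 */
struct prs_parm {
	uint16_t code;
	uint16_t subcode;
	uint16_t parm_len;
	uint16_t parm_version;
	uint64_t token_addr;
	uint64_t select_mask;
	uint64_t compare_mask;
	uint64_t zarch;
};

/* 0 = accept, -1 = specification exception (PGM_SPECIFICATION) */
static int prs_token_parm_valid(const struct prs_parm *p)
{
	if (p->parm_version != 2 || p->parm_len < 0x5)
		return -1;	/* unsupported version or short block */
	if ((p->zarch >> 63) != 1)
		return -1;	/* z/Architecture mode bit must be set */
	if (p->token_addr & 7)
		return -1;	/* token address must be doubleword aligned */
	if ((p->compare_mask & p->select_mask) != p->compare_mask)
		return -1;	/* compare bits must be a subset of select bits */
	return 0;
}
```

Note how the compare mask must be a subset of the select mask - the same condition the kernel checks before accepting a new pfault token.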



* Re: [PATCH 4/4] PF: Async page fault support on s390
  2013-07-11  9:04     ` Gleb Natapov
@ 2013-07-11 10:41       ` Christian Borntraeger
  -1 siblings, 0 replies; 20+ messages in thread
From: Christian Borntraeger @ 2013-07-11 10:41 UTC (permalink / raw)
  To: Gleb Natapov
  Cc: Dominik Dingel, Paolo Bonzini, Heiko Carstens,
	Martin Schwidefsky, Cornelia Huck, Xiantao Zhang, Alexander Graf,
	Christoffer Dall, Marc Zyngier, Ralf Baechle, kvm, linux-s390,
	linux-mm, linux-kernel

On 11/07/13 11:04, Gleb Natapov wrote:
> On Wed, Jul 10, 2013 at 02:59:55PM +0200, Dominik Dingel wrote:
>> This patch enables async page faults for s390 kvm guests.
>> It provides the userspace API to enable, disable or get the status of this
>> feature. Also it includes the diagnose code, called by the guest to enable
>> async page faults.
>>
>> The async page faults will use an already existing guest interface for this
>> purpose, as described in "CP Programming Services (SC24-6084)".
>>
>> Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
> Christian, looks good now?

Looks good, but I just had a discussion with Dominik about several other cases
(guest-driven reboot, qemu-driven reboot, live migration). This patch should
allow all these cases (independent of this patch, we need an ioctl to flush the
list of pending interrupts to do so, but reboot is currently broken in that
regard anyway - a patch is currently being looked at).

We are currently discussing whether we should get rid of APF_STATUS and let
the kernel wait for outstanding page faults before returning from KVM_RUN,
or whether we go with this patch and let userspace wait for completion.

Will discuss this with Dominik, Conny and Alex. So let's defer that till next
week, ok?
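For reference, the four APF status values from the proposed uapi header encode two independent bits (+2 for "enabled", +1 for "pending"), matching the r += 2 / r += 1 logic in kvm_arch_vm_ioctl() in the patch. A hypothetical userspace helper (names are illustrative, not part of any API) might interpret them like this:

```c
/* Values from the proposed arch/s390/include/uapi/asm/kvm.h */
#define KVM_S390_APF_DISABLED_NON_PENDING	0
#define KVM_S390_APF_DISABLED_PENDING		1
#define KVM_S390_APF_ENABLED_NON_PENDING	2
#define KVM_S390_APF_ENABLED_PENDING		3

/* Bit 1 (+2) mirrors pfault_enabled, bit 0 (+1) mirrors queued faults. */
static int apf_enabled(int status) { return status & 2; }
static int apf_pending(int status) { return status & 1; }

/*
 * Hypothetical migration-side check: under the "userspace waits"
 * variant discussed above, a VMM would poll KVM_S390_APF_STATUS
 * until no pfault tokens are outstanding before stopping the guest.
 */
static int apf_safe_to_migrate(int status)
{
	return !apf_pending(status);
}
```

Under the alternative ("let the kernel wait before returning from KVM_RUN"), this polling loop would disappear from userspace entirely.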


> 
>> ---
>>  Documentation/s390/kvm.txt       |  24 +++++++++
>>  arch/s390/include/asm/kvm_host.h |  22 ++++++++
>>  arch/s390/include/uapi/asm/kvm.h |  10 ++++
>>  arch/s390/kvm/Kconfig            |   2 +
>>  arch/s390/kvm/Makefile           |   2 +-
>>  arch/s390/kvm/diag.c             |  63 +++++++++++++++++++++++
>>  arch/s390/kvm/interrupt.c        |  43 +++++++++++++---
>>  arch/s390/kvm/kvm-s390.c         | 107 ++++++++++++++++++++++++++++++++++++++-
>>  arch/s390/kvm/kvm-s390.h         |   4 ++
>>  arch/s390/kvm/sigp.c             |   6 +++
>>  include/uapi/linux/kvm.h         |   2 +
>>  11 files changed, 276 insertions(+), 9 deletions(-)
>>
>> diff --git a/Documentation/s390/kvm.txt b/Documentation/s390/kvm.txt
>> index 85f3280..707b7e9 100644
>> --- a/Documentation/s390/kvm.txt
>> +++ b/Documentation/s390/kvm.txt
>> @@ -70,6 +70,30 @@ floating interrupts are:
>>  KVM_S390_INT_VIRTIO
>>  KVM_S390_INT_SERVICE
>>  
>> +ioctl:      KVM_S390_APF_ENABLE:
>> +args:       none
>> +This ioctl is used to enable the async page fault interface. So in a
>> +host page fault case the host can now submit pfault tokens to the guest.
>> +
>> +ioctl:      KVM_S390_APF_DISABLE:
>> +args:       none
>> +This ioctl is used to disable the async page fault interface. From this point
>> +on no new pfault tokens will be issued to the guest. Already existing async
>> +page faults are not covered by this and will be normally handled.
>> +
>> +ioctl:      KVM_S390_APF_STATUS:
>> +args:       none
>> +This ioctl allows the userspace to get the current status of the APF feature.
>> +The main purpose for this, is to ensure that no pfault tokens will be lost
>> +during live migration or similar management operations.
>> +The possible return values are:
>> +KVM_S390_APF_DISABLED_NON_PENDING
>> +KVM_S390_APF_DISABLED_PENDING
>> +KVM_S390_APF_ENABLED_NON_PENDING
>> +KVM_S390_APF_ENABLED_PENDING
>> +Caution: if KVM_S390_APF is enabled the PENDING status could be already changed
>> +as soon as the ioctl returns to userspace.
>> +
>>  3. ioctl calls to the kvm-vcpu file descriptor
>>  KVM does support the following ioctls on s390 that are common with other
>>  architectures and do behave the same:
>> diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
>> index cd30c3d..e8012fc 100644
>> --- a/arch/s390/include/asm/kvm_host.h
>> +++ b/arch/s390/include/asm/kvm_host.h
>> @@ -257,6 +257,10 @@ struct kvm_vcpu_arch {
>>  		u64		stidp_data;
>>  	};
>>  	struct gmap *gmap;
>> +#define KVM_S390_PFAULT_TOKEN_INVALID	(-1UL)
>> +	unsigned long pfault_token;
>> +	unsigned long pfault_select;
>> +	unsigned long pfault_compare;
>>  };
>>  
>>  struct kvm_vm_stat {
>> @@ -282,6 +286,24 @@ static inline bool kvm_is_error_hva(unsigned long addr)
>>  	return addr == KVM_HVA_ERR_BAD;
>>  }
>>  
>> +#define ASYNC_PF_PER_VCPU	64
>> +struct kvm_vcpu;
>> +struct kvm_async_pf;
>> +struct kvm_arch_async_pf {
>> +	unsigned long pfault_token;
>> +};
>> +
>> +bool kvm_arch_can_inject_async_page_present(struct kvm_vcpu *vcpu);
>> +
>> +void kvm_arch_async_page_ready(struct kvm_vcpu *vcpu,
>> +			       struct kvm_async_pf *work);
>> +
>> +void kvm_arch_async_page_not_present(struct kvm_vcpu *vcpu,
>> +				     struct kvm_async_pf *work);
>> +
>> +void kvm_arch_async_page_present(struct kvm_vcpu *vcpu,
>> +				 struct kvm_async_pf *work);
>> +
>>  extern int sie64a(struct kvm_s390_sie_block *, u64 *);
>>  extern char sie_exit;
>>  #endif
>> diff --git a/arch/s390/include/uapi/asm/kvm.h b/arch/s390/include/uapi/asm/kvm.h
>> index d25da59..b6c83e0 100644
>> --- a/arch/s390/include/uapi/asm/kvm.h
>> +++ b/arch/s390/include/uapi/asm/kvm.h
>> @@ -57,4 +57,14 @@ struct kvm_sync_regs {
>>  #define KVM_REG_S390_EPOCHDIFF	(KVM_REG_S390 | KVM_REG_SIZE_U64 | 0x2)
>>  #define KVM_REG_S390_CPU_TIMER  (KVM_REG_S390 | KVM_REG_SIZE_U64 | 0x3)
>>  #define KVM_REG_S390_CLOCK_COMP (KVM_REG_S390 | KVM_REG_SIZE_U64 | 0x4)
>> +
>> +/* ioctls used for setting/getting status of APF on s390x */
>> +#define KVM_S390_APF_ENABLE	1
>> +#define KVM_S390_APF_DISABLE	2
>> +#define KVM_S390_APF_STATUS	3
>> +#define KVM_S390_APF_DISABLED_NON_PENDING	0
>> +#define KVM_S390_APF_DISABLED_PENDING		1
>> +#define KVM_S390_APF_ENABLED_NON_PENDING	2
>> +#define KVM_S390_APF_ENABLED_PENDING		3
>> +
>>  #endif
>> diff --git a/arch/s390/kvm/Kconfig b/arch/s390/kvm/Kconfig
>> index 70b46ea..4993eed 100644
>> --- a/arch/s390/kvm/Kconfig
>> +++ b/arch/s390/kvm/Kconfig
>> @@ -23,6 +23,8 @@ config KVM
>>  	select ANON_INODES
>>  	select HAVE_KVM_CPU_RELAX_INTERCEPT
>>  	select HAVE_KVM_EVENTFD
>> +	select KVM_ASYNC_PF
>> +	select KVM_ASYNC_PF_DIRECT
>>  	---help---
>>  	  Support hosting paravirtualized guest machines using the SIE
>>  	  virtualization capability on the mainframe. This should work
>> diff --git a/arch/s390/kvm/Makefile b/arch/s390/kvm/Makefile
>> index 40b4c64..63bfc28 100644
>> --- a/arch/s390/kvm/Makefile
>> +++ b/arch/s390/kvm/Makefile
>> @@ -7,7 +7,7 @@
>>  # as published by the Free Software Foundation.
>>  
>>  KVM := ../../../virt/kvm
>> -common-objs = $(KVM)/kvm_main.o $(KVM)/eventfd.o
>> +common-objs = $(KVM)/kvm_main.o $(KVM)/eventfd.o $(KVM)/async_pf.o
>>  
>>  ccflags-y := -Ivirt/kvm -Iarch/s390/kvm
>>  
>> diff --git a/arch/s390/kvm/diag.c b/arch/s390/kvm/diag.c
>> index 3074475..3d210af 100644
>> --- a/arch/s390/kvm/diag.c
>> +++ b/arch/s390/kvm/diag.c
>> @@ -17,6 +17,7 @@
>>  #include "kvm-s390.h"
>>  #include "trace.h"
>>  #include "trace-s390.h"
>> +#include "gaccess.h"
>>  
>>  static int diag_release_pages(struct kvm_vcpu *vcpu)
>>  {
>> @@ -46,6 +47,66 @@ static int diag_release_pages(struct kvm_vcpu *vcpu)
>>  	return 0;
>>  }
>>  
>> +static int __diag_page_ref_service(struct kvm_vcpu *vcpu)
>> +{
>> +	struct prs_parm {
>> +		u16 code;
>> +		u16 subcode;
>> +		u16 parm_len;
>> +		u16 parm_version;
>> +		u64 token_addr;
>> +		u64 select_mask;
>> +		u64 compare_mask;
>> +		u64 zarch;
>> +	};
>> +	struct prs_parm parm;
>> +	int rc;
>> +	u16 rx = (vcpu->arch.sie_block->ipa & 0xf0) >> 4;
>> +	u16 ry = (vcpu->arch.sie_block->ipa & 0x0f);
>> +	if (copy_from_guest(vcpu, &parm, vcpu->run->s.regs.gprs[rx], sizeof(parm)))
>> +		return kvm_s390_inject_program_int(vcpu, PGM_ADDRESSING);
>> +
>> +	if (parm.parm_version != 2 || parm.parm_len < 0x5)
>> +		return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
>> +
>> +	switch (parm.subcode) {
>> +	case 0: /* TOKEN */
>> +		if ((parm.zarch >> 63) != 1 || parm.token_addr & 7 ||
>> +		    (parm.compare_mask & parm.select_mask) != parm.compare_mask)
>> +			return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
>> +
>> +		vcpu->arch.pfault_token = parm.token_addr;
>> +		vcpu->arch.pfault_select = parm.select_mask;
>> +		vcpu->arch.pfault_compare = parm.compare_mask;
>> +		vcpu->run->s.regs.gprs[ry] = 0;
>> +		rc = 0;
>> +		break;
>> +	case 1: 
>> +		/* 
>> +		 * CANCEL 
>> +		 * Specification allows to let already pending tokens survive
>> +		 * the cancel, therefore to reduce code complexity, we assume, all
>> +		 * outstanding tokens as already pending.
>> +		 */
>> +		if (vcpu->run->s.regs.gprs[rx] & 7)
>> +			return kvm_s390_inject_program_int(vcpu, PGM_ADDRESSING);
>> +
>> +		vcpu->run->s.regs.gprs[ry] = 0;
>> +
>> +		if (vcpu->arch.pfault_token == KVM_S390_PFAULT_TOKEN_INVALID)
>> +			vcpu->run->s.regs.gprs[ry] = 1;
>> +
>> +		vcpu->arch.pfault_token = KVM_S390_PFAULT_TOKEN_INVALID;
>> +		rc = 0;
>> +		break;
>> +	default:
>> +		rc = -EOPNOTSUPP;
>> +		break;
>> +	}
>> +
>> +	return rc;
>> +}
>> +
>>  static int __diag_time_slice_end(struct kvm_vcpu *vcpu)
>>  {
>>  	VCPU_EVENT(vcpu, 5, "%s", "diag time slice end");
>> @@ -143,6 +204,8 @@ int kvm_s390_handle_diag(struct kvm_vcpu *vcpu)
>>  		return __diag_time_slice_end(vcpu);
>>  	case 0x9c:
>>  		return __diag_time_slice_end_directed(vcpu);
>> +	case 0x258:
>> +		return __diag_page_ref_service(vcpu);
>>  	case 0x308:
>>  		return __diag_ipl_functions(vcpu);
>>  	case 0x500:
>> diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
>> index 7f35cb3..00e7feb 100644
>> --- a/arch/s390/kvm/interrupt.c
>> +++ b/arch/s390/kvm/interrupt.c
>> @@ -31,7 +31,7 @@ static int is_ioint(u64 type)
>>  	return ((type & 0xfffe0000u) != 0xfffe0000u);
>>  }
>>  
>> -static int psw_extint_disabled(struct kvm_vcpu *vcpu)
>> +int psw_extint_disabled(struct kvm_vcpu *vcpu)
>>  {
>>  	return !(vcpu->arch.sie_block->gpsw.mask & PSW_MASK_EXT);
>>  }
>> @@ -78,11 +78,8 @@ static int __interrupt_is_deliverable(struct kvm_vcpu *vcpu,
>>  			return 1;
>>  		return 0;
>>  	case KVM_S390_INT_SERVICE:
>> -		if (psw_extint_disabled(vcpu))
>> -			return 0;
>> -		if (vcpu->arch.sie_block->gcr[0] & 0x200ul)
>> -			return 1;
>> -		return 0;
>> +	case KVM_S390_INT_PFAULT_INIT:
>> +	case KVM_S390_INT_PFAULT_DONE:
>>  	case KVM_S390_INT_VIRTIO:
>>  		if (psw_extint_disabled(vcpu))
>>  			return 0;
>> @@ -150,6 +147,8 @@ static void __set_intercept_indicator(struct kvm_vcpu *vcpu,
>>  	case KVM_S390_INT_EXTERNAL_CALL:
>>  	case KVM_S390_INT_EMERGENCY:
>>  	case KVM_S390_INT_SERVICE:
>> +	case KVM_S390_INT_PFAULT_INIT:
>> +	case KVM_S390_INT_PFAULT_DONE:
>>  	case KVM_S390_INT_VIRTIO:
>>  		if (psw_extint_disabled(vcpu))
>>  			__set_cpuflag(vcpu, CPUSTAT_EXT_INT);
>> @@ -223,6 +222,26 @@ static void __do_deliver_interrupt(struct kvm_vcpu *vcpu,
>>  		rc |= put_guest(vcpu, inti->ext.ext_params,
>>  				(u32 __user *)__LC_EXT_PARAMS);
>>  		break;
>> +	case KVM_S390_INT_PFAULT_INIT:
>> +		rc  = put_guest(vcpu, 0x2603, (u16 __user *) __LC_EXT_INT_CODE);
>> +		rc |= put_guest(vcpu, 0x0600, (u16 __user *) __LC_EXT_CPU_ADDR);
>> +		rc |= copy_to_guest(vcpu, __LC_EXT_OLD_PSW,
>> +				    &vcpu->arch.sie_block->gpsw, sizeof(psw_t));
>> +		rc |= copy_from_guest(vcpu, &vcpu->arch.sie_block->gpsw,
>> +				      __LC_EXT_NEW_PSW, sizeof(psw_t));
>> +		rc |= put_guest(vcpu, inti->ext.ext_params2,
>> +				(u64 __user *) __LC_EXT_PARAMS2);
>> +		break;
>> +	case KVM_S390_INT_PFAULT_DONE:
>> +		rc  = put_guest(vcpu, 0x2603, (u16 __user *) __LC_EXT_INT_CODE);
>> +		rc |= put_guest(vcpu, 0x0680, (u16 __user *) __LC_EXT_CPU_ADDR);
>> +		rc |= copy_to_guest(vcpu, __LC_EXT_OLD_PSW,
>> +				    &vcpu->arch.sie_block->gpsw, sizeof(psw_t));
>> +		rc |= copy_from_guest(vcpu, &vcpu->arch.sie_block->gpsw,
>> +				      __LC_EXT_NEW_PSW, sizeof(psw_t));
>> +		rc |= put_guest(vcpu, inti->ext.ext_params2,
>> +				(u64 __user *) __LC_EXT_PARAMS2);
>> +		break;
>>  	case KVM_S390_INT_VIRTIO:
>>  		VCPU_EVENT(vcpu, 4, "interrupt: virtio parm:%x,parm64:%llx",
>>  			   inti->ext.ext_params, inti->ext.ext_params2);
>> @@ -357,7 +376,7 @@ static int __try_deliver_ckc_interrupt(struct kvm_vcpu *vcpu)
>>  	return 1;
>>  }
>>  
>> -static int kvm_cpu_has_interrupt(struct kvm_vcpu *vcpu)
>> +int kvm_cpu_has_interrupt(struct kvm_vcpu *vcpu)
>>  {
>>  	struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int;
>>  	struct kvm_s390_float_interrupt *fi = vcpu->arch.local_int.float_int;
>> @@ -681,6 +700,11 @@ int kvm_s390_inject_vm(struct kvm *kvm,
>>  		inti->type = s390int->type;
>>  		inti->ext.ext_params = s390int->parm;
>>  		break;
>> +	case KVM_S390_INT_PFAULT_INIT:
>> +	case KVM_S390_INT_PFAULT_DONE:
>> +		inti->type = s390int->type;
>> +		inti->ext.ext_params2 = s390int->parm64;
>> +		break;
>>  	case KVM_S390_PROGRAM_INT:
>>  	case KVM_S390_SIGP_STOP:
>>  	case KVM_S390_INT_EXTERNAL_CALL:
>> @@ -811,6 +835,11 @@ int kvm_s390_inject_vcpu(struct kvm_vcpu *vcpu,
>>  		inti->type = s390int->type;
>>  		inti->mchk.mcic = s390int->parm64;
>>  		break;
>> +	case KVM_S390_INT_PFAULT_INIT:
>> +	case KVM_S390_INT_PFAULT_DONE:
>> +		inti->type = s390int->type;
>> +		inti->ext.ext_params2 = s390int->parm64;
>> +		break;
>>  	case KVM_S390_INT_VIRTIO:
>>  	case KVM_S390_INT_SERVICE:
>>  	case KVM_S390_INT_IO_MIN...KVM_S390_INT_IO_MAX:
>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>> index 702daca..ef70296 100644
>> --- a/arch/s390/kvm/kvm-s390.c
>> +++ b/arch/s390/kvm/kvm-s390.c
>> @@ -145,6 +145,7 @@ int kvm_dev_ioctl_check_extension(long ext)
>>  #ifdef CONFIG_KVM_S390_UCONTROL
>>  	case KVM_CAP_S390_UCONTROL:
>>  #endif
>> +	case KVM_CAP_ASYNC_PF:
>>  	case KVM_CAP_SYNC_REGS:
>>  	case KVM_CAP_ONE_REG:
>>  	case KVM_CAP_ENABLE_CAP:
>> @@ -186,6 +187,33 @@ long kvm_arch_vm_ioctl(struct file *filp,
>>  	int r;
>>  
>>  	switch (ioctl) {
>> +	case KVM_S390_APF_ENABLE:
>> +		set_bit(1, &kvm->arch.gmap->pfault_enabled);
>> +		r = 0;
>> +		break;
>> +	case KVM_S390_APF_DISABLE:
>> +		clear_bit(1, &kvm->arch.gmap->pfault_enabled);
>> +		r = 0;
>> +		break;
>> +	case KVM_S390_APF_STATUS: {
>> +		bool pfaults_pending = false;
>> +		unsigned int i;
>> +		struct kvm_vcpu *vcpu;
>> +		r = 0;
>> +		if (test_bit(1, &kvm->arch.gmap->pfault_enabled))
>> +			r += 2;
>> +
>> +		kvm_for_each_vcpu(i, vcpu, kvm) {
>> +			spin_lock(&vcpu->async_pf.lock);
>> +			if (vcpu->async_pf.queued > 0)
>> +				pfaults_pending = true;
>> +			spin_unlock(&vcpu->async_pf.lock);
>> +		}
>> +
>> +		if (pfaults_pending)
>> +			r += 1;
>> +		break;
>> +	}
>>  	case KVM_S390_INTERRUPT: {
>>  		struct kvm_s390_interrupt s390int;
>>  
>> @@ -264,6 +292,7 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
>>  {
>>  	VCPU_EVENT(vcpu, 3, "%s", "free cpu");
>>  	trace_kvm_s390_destroy_vcpu(vcpu->vcpu_id);
>> +	kvm_clear_async_pf_completion_queue(vcpu);
>>  	if (!kvm_is_ucontrol(vcpu->kvm)) {
>>  		clear_bit(63 - vcpu->vcpu_id,
>>  			  (unsigned long *) &vcpu->kvm->arch.sca->mcn);
>> @@ -313,6 +342,9 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
>>  /* Section: vcpu related */
>>  int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
>>  {
>> +	vcpu->arch.pfault_token = KVM_S390_PFAULT_TOKEN_INVALID;
>> +	kvm_clear_async_pf_completion_queue(vcpu);
>> +	kvm_async_pf_wakeup_all(vcpu);
>>  	if (kvm_is_ucontrol(vcpu->kvm)) {
>>  		vcpu->arch.gmap = gmap_alloc(current->mm);
>>  		if (!vcpu->arch.gmap)
>> @@ -370,6 +402,7 @@ static void kvm_s390_vcpu_initial_reset(struct kvm_vcpu *vcpu)
>>  	vcpu->arch.guest_fpregs.fpc = 0;
>>  	asm volatile("lfpc %0" : : "Q" (vcpu->arch.guest_fpregs.fpc));
>>  	vcpu->arch.sie_block->gbea = 1;
>> +	vcpu->arch.pfault_token = KVM_S390_PFAULT_TOKEN_INVALID;
>>  	atomic_set_mask(CPUSTAT_STOPPED, &vcpu->arch.sie_block->cpuflags);
>>  }
>>  
>> @@ -691,10 +724,81 @@ static void kvm_arch_fault_in_sync(struct kvm_vcpu *vcpu)
>>  	up_read(&mm->mmap_sem);
>>  }
>>  
>> +static void __kvm_inject_pfault_token(struct kvm_vcpu *vcpu, bool start_token,
>> +				      unsigned long token)
>> +{
>> +	struct kvm_s390_interrupt inti;
>> +	inti.parm64 = token;
>> +
>> +	if (start_token) {
>> +		inti.type = KVM_S390_INT_PFAULT_INIT;
>> +		if (kvm_s390_inject_vcpu(vcpu, &inti))
>> +			WARN(1, "pfault interrupt injection failed");
>> +	} else {
>> +		inti.type = KVM_S390_INT_PFAULT_DONE;
>> +		if (kvm_s390_inject_vm(vcpu->kvm, &inti))
>> +			WARN(1, "pfault interrupt injection failed");
>> +	}
>> +}
>> +
>> +void kvm_arch_async_page_not_present(struct kvm_vcpu *vcpu,
>> +				     struct kvm_async_pf *work)
>> +{
>> +	__kvm_inject_pfault_token(vcpu, true, work->arch.pfault_token);
>> +}
>> +
>> +void kvm_arch_async_page_present(struct kvm_vcpu *vcpu,
>> +				 struct kvm_async_pf *work)
>> +{
>> +	__kvm_inject_pfault_token(vcpu, false, work->arch.pfault_token);
>> +}
>> +
>> +void kvm_arch_async_page_ready(struct kvm_vcpu *vcpu,
>> +			       struct kvm_async_pf *work)
>> +{
>> +	/* s390 will always inject the page directly */
>> +}
>> +
>> +bool kvm_arch_can_inject_async_page_present(struct kvm_vcpu *vcpu)
>> +{
>> +	/*
>> +	 * s390 will always inject the page directly,
>> +	 * but we still want check_async_completion to cleanup
>> +	 */
>> +	return true;
>> +}
>> +
>> +static int kvm_arch_setup_async_pf(struct kvm_vcpu *vcpu)
>> +{
>> +	hva_t hva = gmap_fault(current->thread.gmap_addr, vcpu->arch.gmap);
>> +	struct kvm_arch_async_pf arch;
>> +
>> +	if (vcpu->arch.pfault_token == KVM_S390_PFAULT_TOKEN_INVALID)
>> +		return 0;
>> +	if ((vcpu->arch.sie_block->gpsw.mask & vcpu->arch.pfault_select) !=
>> +	    vcpu->arch.pfault_compare)
>> +		return 0;
>> +	if (psw_extint_disabled(vcpu))
>> +		return 0;
>> +	if (kvm_cpu_has_interrupt(vcpu))
>> +		return 0;
>> +	if (!(vcpu->arch.sie_block->gcr[0] & 0x200ul))
>> +		return 0;
>> +
>> +	if (copy_from_guest(vcpu, &arch.pfault_token, vcpu->arch.pfault_token, 8)) {
>> +		/* already in error case, insert the interrupt and return 0 */
>> +		int ign = kvm_s390_inject_program_int(vcpu, PGM_ADDRESSING);
>> +		return ign - ign;
>> +	}
>> +	return kvm_setup_async_pf(vcpu, current->thread.gmap_addr, hva, &arch);
>> +}
>> +
>>  static int __vcpu_run(struct kvm_vcpu *vcpu)
>>  {
>>  	int rc;
>>  
>> +	kvm_check_async_pf_completion(vcpu);
>> +
>>  	memcpy(&vcpu->arch.sie_block->gg14, &vcpu->run->s.regs.gprs[14], 16);
>>  
>>  	if (need_resched())
>> @@ -725,7 +829,8 @@ static int __vcpu_run(struct kvm_vcpu *vcpu)
>>  		if (kvm_is_ucontrol(vcpu->kvm)) {
>>  			rc = SIE_INTERCEPT_UCONTROL;
>>  		} else if (current->thread.gmap_pfault) {
>> -			kvm_arch_fault_in_sync(vcpu);
>> +			if (!kvm_arch_setup_async_pf(vcpu))
>> +				kvm_arch_fault_in_sync(vcpu);
>>  			current->thread.gmap_pfault = 0;
>>  			rc = 0;
>>  		} else {
>> diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h
>> index 028ca9f..d0f4d2a 100644
>> --- a/arch/s390/kvm/kvm-s390.h
>> +++ b/arch/s390/kvm/kvm-s390.h
>> @@ -148,4 +148,8 @@ void exit_sie_sync(struct kvm_vcpu *vcpu);
>>  /* implemented in diag.c */
>>  int kvm_s390_handle_diag(struct kvm_vcpu *vcpu);
>>  
>> +/* implemented in interrupt.c */
>> +int kvm_cpu_has_interrupt(struct kvm_vcpu *vcpu);
>> +int psw_extint_disabled(struct kvm_vcpu *vcpu);
>> +
>>  #endif
>> diff --git a/arch/s390/kvm/sigp.c b/arch/s390/kvm/sigp.c
>> index bec398c..a6a0f02 100644
>> --- a/arch/s390/kvm/sigp.c
>> +++ b/arch/s390/kvm/sigp.c
>> @@ -186,6 +186,12 @@ int kvm_s390_inject_sigp_stop(struct kvm_vcpu *vcpu, int action)
>>  static int __sigp_set_arch(struct kvm_vcpu *vcpu, u32 parameter)
>>  {
>>  	int rc;
>> +	unsigned int i;
>> +	struct kvm_vcpu *vcpu_to_set;
>> +
>> +	kvm_for_each_vcpu(i, vcpu_to_set, vcpu->kvm) {
>> +		vcpu_to_set->arch.pfault_token = KVM_S390_PFAULT_TOKEN_INVALID;
>> +	}
>>  
>>  	switch (parameter & 0xff) {
>>  	case 0:
>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>> index acccd08..fae432c 100644
>> --- a/include/uapi/linux/kvm.h
>> +++ b/include/uapi/linux/kvm.h
>> @@ -413,6 +413,8 @@ struct kvm_s390_psw {
>>  #define KVM_S390_PROGRAM_INT		0xfffe0001u
>>  #define KVM_S390_SIGP_SET_PREFIX	0xfffe0002u
>>  #define KVM_S390_RESTART		0xfffe0003u
>> +#define KVM_S390_INT_PFAULT_INIT	0xfffe0004u
>> +#define KVM_S390_INT_PFAULT_DONE	0xfffe0005u
>>  #define KVM_S390_MCHK			0xfffe1000u
>>  #define KVM_S390_INT_VIRTIO		0xffff2603u
>>  #define KVM_S390_INT_SERVICE		0xffff2401u
>> -- 
>> 1.8.2.2
> 
> --
> 			Gleb.
> 


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 4/4] PF: Async page fault support on s390
@ 2013-07-11 10:41       ` Christian Borntraeger
  0 siblings, 0 replies; 20+ messages in thread
From: Christian Borntraeger @ 2013-07-11 10:41 UTC (permalink / raw)
  To: Gleb Natapov
  Cc: Dominik Dingel, Paolo Bonzini, Heiko Carstens,
	Martin Schwidefsky, Cornelia Huck, Xiantao Zhang, Alexander Graf,
	Christoffer Dall, Marc Zyngier, Ralf Baechle, kvm, linux-s390,
	linux-mm, linux-kernel

On 11/07/13 11:04, Gleb Natapov wrote:
> On Wed, Jul 10, 2013 at 02:59:55PM +0200, Dominik Dingel wrote:
>> This patch enables async page faults for s390 kvm guests.
>> It provides the userspace API to enable, disable or get the status of this
>> feature. Also it includes the diagnose code, called by the guest to enable
>> async page faults.
>>
>> The async page faults will use an already existing guest interface for this
>> purpose, as described in "CP Programming Services (SC24-6084)".
>>
>> Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
> Christian, looks good now?

Looks good, but I just had a discussion with Dominik about several other cases
(guest-driven reboot, qemu-driven reboot, live migration). This patch should
allow all these cases (independently of this patch we need an ioctl to flush the
list of pending interrupts to do so, but reboot is currently broken in that
regard anyway - a patch is currently being looked at).

We are currently discussing whether we should get rid of APF_STATUS and let
the kernel wait for outstanding page faults before returning from KVM_RUN,
or whether we go with this patch and let userspace wait for completion.

Will discuss this with Dominik, Conny and Alex. So let's defer that till next
week, ok?


> 
>> ---
>>  Documentation/s390/kvm.txt       |  24 +++++++++
>>  arch/s390/include/asm/kvm_host.h |  22 ++++++++
>>  arch/s390/include/uapi/asm/kvm.h |  10 ++++
>>  arch/s390/kvm/Kconfig            |   2 +
>>  arch/s390/kvm/Makefile           |   2 +-
>>  arch/s390/kvm/diag.c             |  63 +++++++++++++++++++++++
>>  arch/s390/kvm/interrupt.c        |  43 +++++++++++++---
>>  arch/s390/kvm/kvm-s390.c         | 107 ++++++++++++++++++++++++++++++++++++++-
>>  arch/s390/kvm/kvm-s390.h         |   4 ++
>>  arch/s390/kvm/sigp.c             |   6 +++
>>  include/uapi/linux/kvm.h         |   2 +
>>  11 files changed, 276 insertions(+), 9 deletions(-)
>>
>> diff --git a/Documentation/s390/kvm.txt b/Documentation/s390/kvm.txt
>> index 85f3280..707b7e9 100644
>> --- a/Documentation/s390/kvm.txt
>> +++ b/Documentation/s390/kvm.txt
>> @@ -70,6 +70,30 @@ floating interrupts are:
>>  KVM_S390_INT_VIRTIO
>>  KVM_S390_INT_SERVICE
>>  
>> +ioctl:      KVM_S390_APF_ENABLE:
>> +args:       none
>> +This ioctl is used to enable the async page fault interface, so that on a
>> +host page fault the host can submit pfault tokens to the guest.
>> +
>> +ioctl:      KVM_S390_APF_DISABLE:
>> +args:       none
>> +This ioctl is used to disable the async page fault interface. From this point
>> +on no new pfault tokens will be issued to the guest. Already existing async
>> +page faults are not covered by this and will be handled normally.
>> +
>> +ioctl:      KVM_S390_APF_STATUS:
>> +args:       none
>> +This ioctl allows userspace to query the current status of the APF feature.
>> +Its main purpose is to ensure that no pfault tokens are lost during live
>> +migration or similar management operations.
>> +The possible return values are:
>> +KVM_S390_APF_DISABLED_NON_PENDING
>> +KVM_S390_APF_DISABLED_PENDING
>> +KVM_S390_APF_ENABLED_NON_PENDING
>> +KVM_S390_APF_ENABLED_PENDING
>> +Caution: if APF is enabled, the PENDING status may already have changed by
>> +the time the ioctl returns to userspace.
>> +
>>  3. ioctl calls to the kvm-vcpu file descriptor
>>  KVM does support the following ioctls on s390 that are common with other
>>  architectures and do behave the same:
>> diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
>> index cd30c3d..e8012fc 100644
>> --- a/arch/s390/include/asm/kvm_host.h
>> +++ b/arch/s390/include/asm/kvm_host.h
>> @@ -257,6 +257,10 @@ struct kvm_vcpu_arch {
>>  		u64		stidp_data;
>>  	};
>>  	struct gmap *gmap;
>> +#define KVM_S390_PFAULT_TOKEN_INVALID	(-1UL)
>> +	unsigned long pfault_token;
>> +	unsigned long pfault_select;
>> +	unsigned long pfault_compare;
>>  };
>>  
>>  struct kvm_vm_stat {
>> @@ -282,6 +286,24 @@ static inline bool kvm_is_error_hva(unsigned long addr)
>>  	return addr == KVM_HVA_ERR_BAD;
>>  }
>>  
>> +#define ASYNC_PF_PER_VCPU	64
>> +struct kvm_vcpu;
>> +struct kvm_async_pf;
>> +struct kvm_arch_async_pf {
>> +	unsigned long pfault_token;
>> +};
>> +
>> +bool kvm_arch_can_inject_async_page_present(struct kvm_vcpu *vcpu);
>> +
>> +void kvm_arch_async_page_ready(struct kvm_vcpu *vcpu,
>> +			       struct kvm_async_pf *work);
>> +
>> +void kvm_arch_async_page_not_present(struct kvm_vcpu *vcpu,
>> +				     struct kvm_async_pf *work);
>> +
>> +void kvm_arch_async_page_present(struct kvm_vcpu *vcpu,
>> +				 struct kvm_async_pf *work);
>> +
>>  extern int sie64a(struct kvm_s390_sie_block *, u64 *);
>>  extern char sie_exit;
>>  #endif
>> diff --git a/arch/s390/include/uapi/asm/kvm.h b/arch/s390/include/uapi/asm/kvm.h
>> index d25da59..b6c83e0 100644
>> --- a/arch/s390/include/uapi/asm/kvm.h
>> +++ b/arch/s390/include/uapi/asm/kvm.h
>> @@ -57,4 +57,14 @@ struct kvm_sync_regs {
>>  #define KVM_REG_S390_EPOCHDIFF	(KVM_REG_S390 | KVM_REG_SIZE_U64 | 0x2)
>>  #define KVM_REG_S390_CPU_TIMER  (KVM_REG_S390 | KVM_REG_SIZE_U64 | 0x3)
>>  #define KVM_REG_S390_CLOCK_COMP (KVM_REG_S390 | KVM_REG_SIZE_U64 | 0x4)
>> +
>> +/* ioctls used for setting/getting status of APF on s390x */
>> +#define KVM_S390_APF_ENABLE	1
>> +#define KVM_S390_APF_DISABLE	2
>> +#define KVM_S390_APF_STATUS	3
>> +#define KVM_S390_APF_DISABLED_NON_PENDING	0
>> +#define KVM_S390_APF_DISABLED_PENDING		1
>> +#define KVM_S390_APF_ENABLED_NON_PENDING	2
>> +#define KVM_S390_APF_ENABLED_PENDING		3
>> +
>>  #endif
>> diff --git a/arch/s390/kvm/Kconfig b/arch/s390/kvm/Kconfig
>> index 70b46ea..4993eed 100644
>> --- a/arch/s390/kvm/Kconfig
>> +++ b/arch/s390/kvm/Kconfig
>> @@ -23,6 +23,8 @@ config KVM
>>  	select ANON_INODES
>>  	select HAVE_KVM_CPU_RELAX_INTERCEPT
>>  	select HAVE_KVM_EVENTFD
>> +	select KVM_ASYNC_PF
>> +	select KVM_ASYNC_PF_DIRECT
>>  	---help---
>>  	  Support hosting paravirtualized guest machines using the SIE
>>  	  virtualization capability on the mainframe. This should work
>> diff --git a/arch/s390/kvm/Makefile b/arch/s390/kvm/Makefile
>> index 40b4c64..63bfc28 100644
>> --- a/arch/s390/kvm/Makefile
>> +++ b/arch/s390/kvm/Makefile
>> @@ -7,7 +7,7 @@
>>  # as published by the Free Software Foundation.
>>  
>>  KVM := ../../../virt/kvm
>> -common-objs = $(KVM)/kvm_main.o $(KVM)/eventfd.o
>> +common-objs = $(KVM)/kvm_main.o $(KVM)/eventfd.o $(KVM)/async_pf.o
>>  
>>  ccflags-y := -Ivirt/kvm -Iarch/s390/kvm
>>  
>> diff --git a/arch/s390/kvm/diag.c b/arch/s390/kvm/diag.c
>> index 3074475..3d210af 100644
>> --- a/arch/s390/kvm/diag.c
>> +++ b/arch/s390/kvm/diag.c
>> @@ -17,6 +17,7 @@
>>  #include "kvm-s390.h"
>>  #include "trace.h"
>>  #include "trace-s390.h"
>> +#include "gaccess.h"
>>  
>>  static int diag_release_pages(struct kvm_vcpu *vcpu)
>>  {
>> @@ -46,6 +47,66 @@ static int diag_release_pages(struct kvm_vcpu *vcpu)
>>  	return 0;
>>  }
>>  
>> +static int __diag_page_ref_service(struct kvm_vcpu *vcpu)
>> +{
>> +	struct prs_parm {
>> +		u16 code;
>> +		u16 subcode;
>> +		u16 parm_len;
>> +		u16 parm_version;
>> +		u64 token_addr;
>> +		u64 select_mask;
>> +		u64 compare_mask;
>> +		u64 zarch;
>> +	};
>> +	struct prs_parm parm;
>> +	int rc;
>> +	u16 rx = (vcpu->arch.sie_block->ipa & 0xf0) >> 4;
>> +	u16 ry = (vcpu->arch.sie_block->ipa & 0x0f);
>> +	if (copy_from_guest(vcpu, &parm, vcpu->run->s.regs.gprs[rx], sizeof(parm)))
>> +		return kvm_s390_inject_program_int(vcpu, PGM_ADDRESSING);
>> +
>> +	if (parm.parm_version != 2 || parm.parm_len < 0x5)
>> +		return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
>> +
>> +	switch (parm.subcode) {
>> +	case 0: /* TOKEN */
>> +		if ((parm.zarch >> 63) != 1 || parm.token_addr & 7 ||
>> +		    (parm.compare_mask & parm.select_mask) != parm.compare_mask)
>> +			return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
>> +
>> +		vcpu->arch.pfault_token = parm.token_addr;
>> +		vcpu->arch.pfault_select = parm.select_mask;
>> +		vcpu->arch.pfault_compare = parm.compare_mask;
>> +		vcpu->run->s.regs.gprs[ry] = 0;
>> +		rc = 0;
>> +		break;
>> +	case 1:
>> +		/*
>> +		 * CANCEL
>> +		 * The specification allows already pending tokens to survive
>> +		 * the cancel, so to reduce code complexity we treat all
>> +		 * outstanding tokens as already pending.
>> +		 */
>> +		if (vcpu->run->s.regs.gprs[rx] & 7)
>> +			return kvm_s390_inject_program_int(vcpu, PGM_ADDRESSING);
>> +
>> +		vcpu->run->s.regs.gprs[ry] = 0;
>> +
>> +		if (vcpu->arch.pfault_token == KVM_S390_PFAULT_TOKEN_INVALID)
>> +			vcpu->run->s.regs.gprs[ry] = 1;
>> +
>> +		vcpu->arch.pfault_token = KVM_S390_PFAULT_TOKEN_INVALID;
>> +		rc = 0;
>> +		break;
>> +	default:
>> +		rc = -EOPNOTSUPP;
>> +		break;
>> +	}
>> +
>> +	return rc;
>> +}
>> +
>>  static int __diag_time_slice_end(struct kvm_vcpu *vcpu)
>>  {
>>  	VCPU_EVENT(vcpu, 5, "%s", "diag time slice end");
>> @@ -143,6 +204,8 @@ int kvm_s390_handle_diag(struct kvm_vcpu *vcpu)
>>  		return __diag_time_slice_end(vcpu);
>>  	case 0x9c:
>>  		return __diag_time_slice_end_directed(vcpu);
>> +	case 0x258:
>> +		return __diag_page_ref_service(vcpu);
>>  	case 0x308:
>>  		return __diag_ipl_functions(vcpu);
>>  	case 0x500:
>> diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
>> index 7f35cb3..00e7feb 100644
>> --- a/arch/s390/kvm/interrupt.c
>> +++ b/arch/s390/kvm/interrupt.c
>> @@ -31,7 +31,7 @@ static int is_ioint(u64 type)
>>  	return ((type & 0xfffe0000u) != 0xfffe0000u);
>>  }
>>  
>> -static int psw_extint_disabled(struct kvm_vcpu *vcpu)
>> +int psw_extint_disabled(struct kvm_vcpu *vcpu)
>>  {
>>  	return !(vcpu->arch.sie_block->gpsw.mask & PSW_MASK_EXT);
>>  }
>> @@ -78,11 +78,8 @@ static int __interrupt_is_deliverable(struct kvm_vcpu *vcpu,
>>  			return 1;
>>  		return 0;
>>  	case KVM_S390_INT_SERVICE:
>> -		if (psw_extint_disabled(vcpu))
>> -			return 0;
>> -		if (vcpu->arch.sie_block->gcr[0] & 0x200ul)
>> -			return 1;
>> -		return 0;
>> +	case KVM_S390_INT_PFAULT_INIT:
>> +	case KVM_S390_INT_PFAULT_DONE:
>>  	case KVM_S390_INT_VIRTIO:
>>  		if (psw_extint_disabled(vcpu))
>>  			return 0;
>> @@ -150,6 +147,8 @@ static void __set_intercept_indicator(struct kvm_vcpu *vcpu,
>>  	case KVM_S390_INT_EXTERNAL_CALL:
>>  	case KVM_S390_INT_EMERGENCY:
>>  	case KVM_S390_INT_SERVICE:
>> +	case KVM_S390_INT_PFAULT_INIT:
>> +	case KVM_S390_INT_PFAULT_DONE:
>>  	case KVM_S390_INT_VIRTIO:
>>  		if (psw_extint_disabled(vcpu))
>>  			__set_cpuflag(vcpu, CPUSTAT_EXT_INT);
>> @@ -223,6 +222,26 @@ static void __do_deliver_interrupt(struct kvm_vcpu *vcpu,
>>  		rc |= put_guest(vcpu, inti->ext.ext_params,
>>  				(u32 __user *)__LC_EXT_PARAMS);
>>  		break;
>> +	case KVM_S390_INT_PFAULT_INIT:
>> +		rc  = put_guest(vcpu, 0x2603, (u16 __user *) __LC_EXT_INT_CODE);
>> +		rc |= put_guest(vcpu, 0x0600, (u16 __user *) __LC_EXT_CPU_ADDR);
>> +		rc |= copy_to_guest(vcpu, __LC_EXT_OLD_PSW,
>> +				    &vcpu->arch.sie_block->gpsw, sizeof(psw_t));
>> +		rc |= copy_from_guest(vcpu, &vcpu->arch.sie_block->gpsw,
>> +				      __LC_EXT_NEW_PSW, sizeof(psw_t));
>> +		rc |= put_guest(vcpu, inti->ext.ext_params2,
>> +				(u64 __user *) __LC_EXT_PARAMS2);
>> +		break;
>> +	case KVM_S390_INT_PFAULT_DONE:
>> +		rc  = put_guest(vcpu, 0x2603, (u16 __user *) __LC_EXT_INT_CODE);
>> +		rc |= put_guest(vcpu, 0x0680, (u16 __user *) __LC_EXT_CPU_ADDR);
>> +		rc |= copy_to_guest(vcpu, __LC_EXT_OLD_PSW,
>> +				    &vcpu->arch.sie_block->gpsw, sizeof(psw_t));
>> +		rc |= copy_from_guest(vcpu, &vcpu->arch.sie_block->gpsw,
>> +				      __LC_EXT_NEW_PSW, sizeof(psw_t));
>> +		rc |= put_guest(vcpu, inti->ext.ext_params2,
>> +				(u64 __user *) __LC_EXT_PARAMS2);
>> +		break;
>>  	case KVM_S390_INT_VIRTIO:
>>  		VCPU_EVENT(vcpu, 4, "interrupt: virtio parm:%x,parm64:%llx",
>>  			   inti->ext.ext_params, inti->ext.ext_params2);
>> @@ -357,7 +376,7 @@ static int __try_deliver_ckc_interrupt(struct kvm_vcpu *vcpu)
>>  	return 1;
>>  }
>>  
>> -static int kvm_cpu_has_interrupt(struct kvm_vcpu *vcpu)
>> +int kvm_cpu_has_interrupt(struct kvm_vcpu *vcpu)
>>  {
>>  	struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int;
>>  	struct kvm_s390_float_interrupt *fi = vcpu->arch.local_int.float_int;
>> @@ -681,6 +700,11 @@ int kvm_s390_inject_vm(struct kvm *kvm,
>>  		inti->type = s390int->type;
>>  		inti->ext.ext_params = s390int->parm;
>>  		break;
>> +	case KVM_S390_INT_PFAULT_INIT:
>> +	case KVM_S390_INT_PFAULT_DONE:
>> +		inti->type = s390int->type;
>> +		inti->ext.ext_params2 = s390int->parm64;
>> +		break;
>>  	case KVM_S390_PROGRAM_INT:
>>  	case KVM_S390_SIGP_STOP:
>>  	case KVM_S390_INT_EXTERNAL_CALL:
>> @@ -811,6 +835,11 @@ int kvm_s390_inject_vcpu(struct kvm_vcpu *vcpu,
>>  		inti->type = s390int->type;
>>  		inti->mchk.mcic = s390int->parm64;
>>  		break;
>> +	case KVM_S390_INT_PFAULT_INIT:
>> +	case KVM_S390_INT_PFAULT_DONE:
>> +		inti->type = s390int->type;
>> +		inti->ext.ext_params2 = s390int->parm64;
>> +		break;
>>  	case KVM_S390_INT_VIRTIO:
>>  	case KVM_S390_INT_SERVICE:
>>  	case KVM_S390_INT_IO_MIN...KVM_S390_INT_IO_MAX:
>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>> index 702daca..ef70296 100644
>> --- a/arch/s390/kvm/kvm-s390.c
>> +++ b/arch/s390/kvm/kvm-s390.c
>> @@ -145,6 +145,7 @@ int kvm_dev_ioctl_check_extension(long ext)
>>  #ifdef CONFIG_KVM_S390_UCONTROL
>>  	case KVM_CAP_S390_UCONTROL:
>>  #endif
>> +	case KVM_CAP_ASYNC_PF:
>>  	case KVM_CAP_SYNC_REGS:
>>  	case KVM_CAP_ONE_REG:
>>  	case KVM_CAP_ENABLE_CAP:
>> @@ -186,6 +187,33 @@ long kvm_arch_vm_ioctl(struct file *filp,
>>  	int r;
>>  
>>  	switch (ioctl) {
>> +	case KVM_S390_APF_ENABLE:
>> +		set_bit(1, &kvm->arch.gmap->pfault_enabled);
>> +		r = 0;
>> +		break;
>> +	case KVM_S390_APF_DISABLE:
>> +		clear_bit(1, &kvm->arch.gmap->pfault_enabled);
>> +		r = 0;
>> +		break;
>> +	case KVM_S390_APF_STATUS: {
>> +		bool pfaults_pending = false;
>> +		unsigned int i;
>> +		struct kvm_vcpu *vcpu;
>> +		r = 0;
>> +		if (test_bit(1, &kvm->arch.gmap->pfault_enabled))
>> +			r += 2;
>> +
>> +		kvm_for_each_vcpu(i, vcpu, kvm) {
>> +			spin_lock(&vcpu->async_pf.lock);
>> +			if (vcpu->async_pf.queued > 0)
>> +				pfaults_pending = true;
>> +			spin_unlock(&vcpu->async_pf.lock);
>> +		}
>> +
>> +		if (pfaults_pending)
>> +			r += 1;
>> +		break;
>> +	}
>>  	case KVM_S390_INTERRUPT: {
>>  		struct kvm_s390_interrupt s390int;
>>  
>> @@ -264,6 +292,7 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
>>  {
>>  	VCPU_EVENT(vcpu, 3, "%s", "free cpu");
>>  	trace_kvm_s390_destroy_vcpu(vcpu->vcpu_id);
>> +	kvm_clear_async_pf_completion_queue(vcpu);
>>  	if (!kvm_is_ucontrol(vcpu->kvm)) {
>>  		clear_bit(63 - vcpu->vcpu_id,
>>  			  (unsigned long *) &vcpu->kvm->arch.sca->mcn);
>> @@ -313,6 +342,9 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
>>  /* Section: vcpu related */
>>  int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
>>  {
>> +	vcpu->arch.pfault_token = KVM_S390_PFAULT_TOKEN_INVALID;
>> +	kvm_clear_async_pf_completion_queue(vcpu);
>> +	kvm_async_pf_wakeup_all(vcpu);
>>  	if (kvm_is_ucontrol(vcpu->kvm)) {
>>  		vcpu->arch.gmap = gmap_alloc(current->mm);
>>  		if (!vcpu->arch.gmap)
>> @@ -370,6 +402,7 @@ static void kvm_s390_vcpu_initial_reset(struct kvm_vcpu *vcpu)
>>  	vcpu->arch.guest_fpregs.fpc = 0;
>>  	asm volatile("lfpc %0" : : "Q" (vcpu->arch.guest_fpregs.fpc));
>>  	vcpu->arch.sie_block->gbea = 1;
>> +	vcpu->arch.pfault_token = KVM_S390_PFAULT_TOKEN_INVALID;
>>  	atomic_set_mask(CPUSTAT_STOPPED, &vcpu->arch.sie_block->cpuflags);
>>  }
>>  
>> @@ -691,10 +724,81 @@ static void kvm_arch_fault_in_sync(struct kvm_vcpu *vcpu)
>>  	up_read(&mm->mmap_sem);
>>  }
>>  
>> +static void __kvm_inject_pfault_token(struct kvm_vcpu *vcpu, bool start_token,
>> +				      unsigned long token)
>> +{
>> +	struct kvm_s390_interrupt inti;
>> +	inti.parm64 = token;
>> +
>> +	if (start_token) {
>> +		inti.type = KVM_S390_INT_PFAULT_INIT;
>> +		if (kvm_s390_inject_vcpu(vcpu, &inti))
>> +			WARN(1, "pfault interrupt injection failed");
>> +	} else {
>> +		inti.type = KVM_S390_INT_PFAULT_DONE;
>> +		if (kvm_s390_inject_vm(vcpu->kvm, &inti))
>> +			WARN(1, "pfault interrupt injection failed");
>> +	}
>> +}
>> +
>> +void kvm_arch_async_page_not_present(struct kvm_vcpu *vcpu,
>> +				     struct kvm_async_pf *work)
>> +{
>> +	__kvm_inject_pfault_token(vcpu, true, work->arch.pfault_token);
>> +}
>> +
>> +void kvm_arch_async_page_present(struct kvm_vcpu *vcpu,
>> +				 struct kvm_async_pf *work)
>> +{
>> +	__kvm_inject_pfault_token(vcpu, false, work->arch.pfault_token);
>> +}
>> +
>> +void kvm_arch_async_page_ready(struct kvm_vcpu *vcpu,
>> +			       struct kvm_async_pf *work)
>> +{
>> +	/* s390 will always inject the page directly */
>> +}
>> +
>> +bool kvm_arch_can_inject_async_page_present(struct kvm_vcpu *vcpu)
>> +{
>> +	/*
>> +	 * s390 will always inject the page directly,
>> +	 * but we still want kvm_check_async_pf_completion() to clean up
>> +	 */
>> +	return true;
>> +}
>> +
>> +static int kvm_arch_setup_async_pf(struct kvm_vcpu *vcpu)
>> +{
>> +	hva_t hva = gmap_fault(current->thread.gmap_addr, vcpu->arch.gmap);
>> +	struct kvm_arch_async_pf arch;
>> +
>> +	if (vcpu->arch.pfault_token == KVM_S390_PFAULT_TOKEN_INVALID)
>> +		return 0;
>> +	if ((vcpu->arch.sie_block->gpsw.mask & vcpu->arch.pfault_select) !=
>> +	    vcpu->arch.pfault_compare)
>> +		return 0;
>> +	if (psw_extint_disabled(vcpu))
>> +		return 0;
>> +	if (kvm_cpu_has_interrupt(vcpu))
>> +		return 0;
>> +	if (!(vcpu->arch.sie_block->gcr[0] & 0x200ul))
>> +		return 0;
>> +
>> +	if (copy_from_guest(vcpu, &arch.pfault_token, vcpu->arch.pfault_token, 8)) {
>> +		/* already in the error case; inject the interrupt and return 0 */
>> +		kvm_s390_inject_program_int(vcpu, PGM_ADDRESSING);
>> +		return 0;
>> +	}
>> +	return kvm_setup_async_pf(vcpu, current->thread.gmap_addr, hva, &arch);
>> +}
>> +
>>  static int __vcpu_run(struct kvm_vcpu *vcpu)
>>  {
>>  	int rc;
>>  
>> +	kvm_check_async_pf_completion(vcpu);
>> +
>>  	memcpy(&vcpu->arch.sie_block->gg14, &vcpu->run->s.regs.gprs[14], 16);
>>  
>>  	if (need_resched())
>> @@ -725,7 +829,8 @@ static int __vcpu_run(struct kvm_vcpu *vcpu)
>>  		if (kvm_is_ucontrol(vcpu->kvm)) {
>>  			rc = SIE_INTERCEPT_UCONTROL;
>>  		} else if (current->thread.gmap_pfault) {
>> -			kvm_arch_fault_in_sync(vcpu);
>> +			if (!kvm_arch_setup_async_pf(vcpu))
>> +				kvm_arch_fault_in_sync(vcpu);
>>  			current->thread.gmap_pfault = 0;
>>  			rc = 0;
>>  		} else {
>> diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h
>> index 028ca9f..d0f4d2a 100644
>> --- a/arch/s390/kvm/kvm-s390.h
>> +++ b/arch/s390/kvm/kvm-s390.h
>> @@ -148,4 +148,8 @@ void exit_sie_sync(struct kvm_vcpu *vcpu);
>>  /* implemented in diag.c */
>>  int kvm_s390_handle_diag(struct kvm_vcpu *vcpu);
>>  
>> +/* implemented in interrupt.c */
>> +int kvm_cpu_has_interrupt(struct kvm_vcpu *vcpu);
>> +int psw_extint_disabled(struct kvm_vcpu *vcpu);
>> +
>>  #endif
>> diff --git a/arch/s390/kvm/sigp.c b/arch/s390/kvm/sigp.c
>> index bec398c..a6a0f02 100644
>> --- a/arch/s390/kvm/sigp.c
>> +++ b/arch/s390/kvm/sigp.c
>> @@ -186,6 +186,12 @@ int kvm_s390_inject_sigp_stop(struct kvm_vcpu *vcpu, int action)
>>  static int __sigp_set_arch(struct kvm_vcpu *vcpu, u32 parameter)
>>  {
>>  	int rc;
>> +	unsigned int i;
>> +	struct kvm_vcpu *vcpu_to_set;
>> +
>> +	kvm_for_each_vcpu(i, vcpu_to_set, vcpu->kvm) {
>> +		vcpu_to_set->arch.pfault_token = KVM_S390_PFAULT_TOKEN_INVALID;
>> +	}
>>  
>>  	switch (parameter & 0xff) {
>>  	case 0:
>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>> index acccd08..fae432c 100644
>> --- a/include/uapi/linux/kvm.h
>> +++ b/include/uapi/linux/kvm.h
>> @@ -413,6 +413,8 @@ struct kvm_s390_psw {
>>  #define KVM_S390_PROGRAM_INT		0xfffe0001u
>>  #define KVM_S390_SIGP_SET_PREFIX	0xfffe0002u
>>  #define KVM_S390_RESTART		0xfffe0003u
>> +#define KVM_S390_INT_PFAULT_INIT	0xfffe0004u
>> +#define KVM_S390_INT_PFAULT_DONE	0xfffe0005u
>>  #define KVM_S390_MCHK			0xfffe1000u
>>  #define KVM_S390_INT_VIRTIO		0xffff2603u
>>  #define KVM_S390_INT_SERVICE		0xffff2401u
>> -- 
>> 1.8.2.2
> 
> --
> 			Gleb.
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 4/4] PF: Async page fault support on s390
  2013-07-11 10:41       ` Christian Borntraeger
@ 2013-07-11 10:58         ` Gleb Natapov
  -1 siblings, 0 replies; 20+ messages in thread
From: Gleb Natapov @ 2013-07-11 10:58 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: Dominik Dingel, Paolo Bonzini, Heiko Carstens,
	Martin Schwidefsky, Cornelia Huck, Xiantao Zhang, Alexander Graf,
	Christoffer Dall, Marc Zyngier, Ralf Baechle, kvm, linux-s390,
	linux-mm, linux-kernel

On Thu, Jul 11, 2013 at 12:41:37PM +0200, Christian Borntraeger wrote:
> On 11/07/13 11:04, Gleb Natapov wrote:
> > On Wed, Jul 10, 2013 at 02:59:55PM +0200, Dominik Dingel wrote:
> >> This patch enables async page faults for s390 kvm guests.
> >> It provides the userspace API to enable, disable or get the status of this
> >> feature. Also it includes the diagnose code, called by the guest to enable
> >> async page faults.
> >>
> >> The async page faults will use an already existing guest interface for this
> >> purpose, as described in "CP Programming Services (SC24-6084)".
> >>
> >> Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
> > Christian, looks good now?
> 
> Looks good, but I just had a discussion with Dominik about several other cases
> (guest-driven reboot, qemu-driven reboot, live migration). This patch should
> allow all these cases (independently of this patch we need an ioctl to flush the
> list of pending interrupts to do so, but reboot is currently broken in that
> regard anyway - a patch is currently being looked at).
> 
> We are currently discussing whether we should get rid of APF_STATUS and let
> the kernel wait for outstanding page faults before returning from KVM_RUN,
> or whether we go with this patch and let userspace wait for completion.
> 
> Will discuss this with Dominik, Conny and Alex. So let's defer that till next
> week, ok?
> 
Sure.

--
			Gleb.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 4/4] PF: Async page fault support on s390
  2013-07-11 10:41       ` Christian Borntraeger
@ 2013-07-18 13:57         ` Paolo Bonzini
  -1 siblings, 0 replies; 20+ messages in thread
From: Paolo Bonzini @ 2013-07-18 13:57 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: Gleb Natapov, Dominik Dingel, Heiko Carstens, Martin Schwidefsky,
	Cornelia Huck, Xiantao Zhang, Alexander Graf, Christoffer Dall,
	Marc Zyngier, Ralf Baechle, kvm, linux-s390, linux-mm,
	linux-kernel

Il 11/07/2013 12:41, Christian Borntraeger ha scritto:
> On 11/07/13 11:04, Gleb Natapov wrote:
>> On Wed, Jul 10, 2013 at 02:59:55PM +0200, Dominik Dingel wrote:
>>> This patch enables async page faults for s390 kvm guests.
>>> It provides the userspace API to enable, disable or get the status of this
>>> feature. Also it includes the diagnose code, called by the guest to enable
>>> async page faults.
>>>
>>> The async page faults will use an already existing guest interface for this
>>> purpose, as described in "CP Programming Services (SC24-6084)".
>>>
>>> Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
>> Christian, looks good now?
> 
> Looks good, but I just had a discussion with Dominik about several other cases
> (guest-driven reboot, qemu-driven reboot, live migration). This patch should
> allow all these cases (independently of this patch we need an ioctl to flush the
> list of pending interrupts to do so, but reboot is currently broken in that
> regard anyway - a patch is currently being looked at).
> 
> We are currently discussing whether we should get rid of APF_STATUS and let
> the kernel wait for outstanding page faults before returning from KVM_RUN,
> or whether we go with this patch and let userspace wait for completion.
> 
> Will discuss this with Dominik, Conny and Alex. So let's defer that till next
> week, ok?

Let us know if we should wait for a v5. :)

Paolo


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 4/4] PF: Async page fault support on s390
  2013-07-18 13:57         ` Paolo Bonzini
@ 2013-07-18 14:12           ` Christian Borntraeger
  -1 siblings, 0 replies; 20+ messages in thread
From: Christian Borntraeger @ 2013-07-18 14:12 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Gleb Natapov, Dominik Dingel, Heiko Carstens, Martin Schwidefsky,
	Cornelia Huck, Xiantao Zhang, Alexander Graf, Christoffer Dall,
	Marc Zyngier, Ralf Baechle, kvm, linux-s390, linux-mm,
	linux-kernel

On 18/07/13 15:57, Paolo Bonzini wrote:
> Il 11/07/2013 12:41, Christian Borntraeger ha scritto:
>> On 11/07/13 11:04, Gleb Natapov wrote:
>>> On Wed, Jul 10, 2013 at 02:59:55PM +0200, Dominik Dingel wrote:
>>>> This patch enables async page faults for s390 kvm guests.
>>>> It provides the userspace API to enable, disable or get the status of this
>>>> feature. Also it includes the diagnose code, called by the guest to enable
>>>> async page faults.
>>>>
>>>> The async page faults will use an already existing guest interface for this
>>>> purpose, as described in "CP Programming Services (SC24-6084)".
>>>>
>>>> Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
>>> Christian, looks good now?
>>
>> Looks good, but I just had a discussion with Dominik about several other cases
>> (guest-driven reboot, qemu-driven reboot, live migration). This patch should
>> allow all these cases (independently of this patch we need an ioctl to flush the
>> list of pending interrupts to do so, but reboot is currently broken in that
>> regard anyway - a patch is currently being looked at).
>>
>> We are currently discussing whether we should get rid of APF_STATUS and let
>> the kernel wait for outstanding page faults before returning from KVM_RUN,
>> or whether we go with this patch and let userspace wait for completion.
>>
>> Will discuss this with Dominik, Conny and Alex. So let's defer that till next
>> week, ok?
> 
> Let us know if we should wait for a v5. :)

Yes, there will be a v5




end of thread, other threads:[~2013-07-18 14:12 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-07-10 12:59 [PATCH v4 0/4] Enable async page faults on s390 Dominik Dingel
2013-07-10 12:59 ` [PATCH 1/4] PF: Add FAULT_FLAG_RETRY_NOWAIT for guest fault Dominik Dingel
2013-07-10 12:59 ` [PATCH 2/4] PF: Make KVM_HVA_ERR_BAD usable on s390 Dominik Dingel
2013-07-10 12:59 ` [PATCH 3/4] PF: Provide additional direct page notification Dominik Dingel
2013-07-10 12:59 ` [PATCH 4/4] PF: Async page fault support on s390 Dominik Dingel
2013-07-11  9:04   ` Gleb Natapov
2013-07-11 10:41     ` Christian Borntraeger
2013-07-11 10:58       ` Gleb Natapov
2013-07-18 13:57       ` Paolo Bonzini
2013-07-18 14:12         ` Christian Borntraeger
