* [PATCH 0/3] kvm-s390: revised version of kvm-s390 guest memory handling - rebased @ 2009-06-02 14:26 ehrhardt 2009-06-02 14:26 ` [PATCH 1/3] kvm-s390: infrastructure to kick vcpus out of guest state " ehrhardt ` (2 more replies) 0 siblings, 3 replies; 12+ messages in thread From: ehrhardt @ 2009-06-02 14:26 UTC (permalink / raw) To: kvm, avi Cc: ehrhardt, borntraeger, cotte, heiko.carstens, schwidefsky, mtosatti From: Christian Ehrhardt <ehrhardt@de.ibm.com> As requested this is a rebased patch on top of the already applied v3 of the patch series. *updates to already applied version* - ensure the wait_on_bit waiter is notified - move the reset of requests to kvm_vcpu_release to drop them early - ensure dropping all vcpu requests while freeing a vcpu - ensure kick allocations (might_sleep) are out of atomic context - update vcpu->cpu in kvm-s390 arch handler for load/put - centralize consumption of vcpu->request bits - updates on running vcpus can now be handled without need to rerun the vcpu - kvm_arch_set_memory_region waits until the bit is consumed by the vcpu - kickout only scheduled vcpus (wait might hang forever on non-scheduled vcpus) Note: further unification of make_all_cpu_request and the kick mechanism is planned, but it might be good to split it from this step towards commonality. Patches included: Subject: [PATCH 1/3] kvm-s390: infrastructure to kick vcpus out of guest state - rebased Subject: [PATCH 2/3] kvm-s390: update vcpu->cpu - rebased Subject: [PATCH 3/3] kvm-s390: streamline memslot handling - rebased Overall-Diffstat: arch/s390/include/asm/kvm_host.h | 2 +- arch/s390/kvm/intercept.c | 10 ++++++---- arch/s390/kvm/kvm-s390.c | 36 +++++++++++++++++++++++++----------- arch/s390/kvm/kvm-s390.h | 23 ++++++++++++++++++++++- arch/s390/kvm/sigp.c | 31 +++++++++++++++++++++---------- virt/kvm/kvm_main.c | 4 ++++ 6 files changed, 79 insertions(+), 27 deletions(-) ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 1/3] kvm-s390: infrastructure to kick vcpus out of guest state - rebased 2009-06-02 14:26 [PATCH 0/3] kvm-s390: revised version of kvm-s390 guest memory handling - rebased ehrhardt @ 2009-06-02 14:26 ` ehrhardt 2009-06-02 14:26 ` [PATCH 2/3] kvm-s390: update vcpu->cpu " ehrhardt 2009-06-02 14:26 ` [PATCH 3/3] kvm-s390: streamline memslot handling " ehrhardt 2 siblings, 0 replies; 12+ messages in thread From: ehrhardt @ 2009-06-02 14:26 UTC (permalink / raw) To: kvm, avi Cc: ehrhardt, borntraeger, cotte, heiko.carstens, schwidefsky, mtosatti From: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> As requested this is a rebased patch on top of the already applied v3 of the patch series. *updates to already applied version* - ensure allocations (might_sleep) are out of atomic context - centralize consumption of vcpu->request bits To ensure vcpus come out of guest context in certain cases, this patch adds an s390-specific way to kick them out of guest context. Currently it kicks them out to rerun the vcpu_run path in the s390 code, but the mechanism itself is expandable, and with a new flag we could also add e.g. kicks to userspace.
Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> --- [diffstat] include/asm/kvm_host.h | 2 +- kvm/intercept.c | 10 ++++++---- kvm/kvm-s390.c | 7 +++---- kvm/kvm-s390.h | 16 +++++++++++++++- kvm/sigp.c | 31 +++++++++++++++++++++---------- 5 files changed, 46 insertions(+), 20 deletions(-) Index: kvm/arch/s390/kvm/intercept.c =================================================================== --- kvm.orig/arch/s390/kvm/intercept.c +++ kvm/arch/s390/kvm/intercept.c @@ -141,10 +141,12 @@ static int handle_stop(struct kvm_vcpu * rc = -ENOTSUPP; } - if (vcpu->arch.local_int.action_bits & ACTION_RELOADVCPU_ON_STOP) { - vcpu->arch.local_int.action_bits &= ~ACTION_RELOADVCPU_ON_STOP; - rc = SIE_INTERCEPT_RERUNVCPU; - vcpu->run->exit_reason = KVM_EXIT_INTR; + if (vcpu->arch.local_int.action_bits & ACTION_VCPUREQUEST_ON_STOP) { + vcpu->arch.local_int.action_bits &= ~ACTION_VCPUREQUEST_ON_STOP; + if (kvm_s390_handle_vcpu_requests(vcpu, VCPUREQUESTLVL_SIGP)) { + rc = SIE_INTERCEPT_CHECKREQUESTS; + vcpu->run->exit_reason = KVM_EXIT_INTR; + } } if (vcpu->arch.local_int.action_bits & ACTION_STOP_ON_STOP) { Index: kvm/arch/s390/include/asm/kvm_host.h =================================================================== --- kvm.orig/arch/s390/include/asm/kvm_host.h +++ kvm/arch/s390/include/asm/kvm_host.h @@ -182,7 +182,7 @@ struct kvm_s390_interrupt_info { /* for local_interrupt.action_flags */ #define ACTION_STORE_ON_STOP (1<<0) #define ACTION_STOP_ON_STOP (1<<1) -#define ACTION_RELOADVCPU_ON_STOP (1<<2) +#define ACTION_VCPUREQUEST_ON_STOP (1<<2) struct kvm_s390_local_interrupt { spinlock_t lock; Index: kvm/arch/s390/kvm/kvm-s390.c =================================================================== --- kvm.orig/arch/s390/kvm/kvm-s390.c +++ kvm/arch/s390/kvm/kvm-s390.c @@ -484,8 +484,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_v rerun_vcpu: if (vcpu->requests) - if (test_and_clear_bit(KVM_REQ_MMU_RELOAD, &vcpu->requests)) - kvm_s390_vcpu_set_mem(vcpu); + 
kvm_s390_handle_vcpu_requests(vcpu, VCPUREQUESTLVL_VCPURUN); /* verify, that memory has been registered */ if (!vcpu->arch.sie_block->gmslm) { @@ -521,7 +520,7 @@ rerun_vcpu: rc = kvm_handle_sie_intercept(vcpu); } while (!signal_pending(current) && !rc); - if (rc == SIE_INTERCEPT_RERUNVCPU) + if (rc == SIE_INTERCEPT_CHECKREQUESTS) goto rerun_vcpu; if (signal_pending(current) && !rc) { @@ -710,7 +709,7 @@ int kvm_arch_set_memory_region(struct kv &kvm->vcpus[i]->requests)) continue; kvm_s390_inject_sigp_stop(kvm->vcpus[i], - ACTION_RELOADVCPU_ON_STOP); + ACTION_VCPUREQUEST_ON_STOP); } } Index: kvm/arch/s390/kvm/kvm-s390.h =================================================================== --- kvm.orig/arch/s390/kvm/kvm-s390.h +++ kvm/arch/s390/kvm/kvm-s390.h @@ -25,7 +25,7 @@ typedef int (*intercept_handler_t)(struct kvm_vcpu *vcpu); /* negativ values are error codes, positive values for internal conditions */ -#define SIE_INTERCEPT_RERUNVCPU (1<<0) +#define SIE_INTERCEPT_CHECKREQUESTS (1<<0) int kvm_handle_sie_intercept(struct kvm_vcpu *vcpu); #define VM_EVENT(d_kvm, d_loglevel, d_string, d_args...)\ @@ -81,6 +81,20 @@ static inline void kvm_s390_vcpu_set_mem up_read(&vcpu->kvm->slots_lock); } +/* interception levels from which handle vcpu requests can be called */ +#define VCPUREQUESTLVL_SIGP 1 +#define VCPUREQUESTLVL_VCPURUN 2 +static inline unsigned long kvm_s390_handle_vcpu_requests(struct kvm_vcpu *vcpu, + int level) +{ + BUG_ON(!level); + + if (!vcpu->requests) + return 0; + + return vcpu->requests; +} + /* implemented in priv.c */ int kvm_s390_handle_b2(struct kvm_vcpu *vcpu); Index: kvm/arch/s390/kvm/sigp.c =================================================================== --- kvm.orig/arch/s390/kvm/sigp.c +++ kvm/arch/s390/kvm/sigp.c @@ -108,15 +108,9 @@ unlock: return rc; } -static int __inject_sigp_stop(struct kvm_s390_local_interrupt *li, int action) +static int __inject_sigp_stop(struct kvm_s390_local_interrupt *li, int action, + struct 
kvm_s390_interrupt_info *inti) { - struct kvm_s390_interrupt_info *inti; - - inti = kzalloc(sizeof(*inti), GFP_KERNEL); - if (!inti) - return -ENOMEM; - inti->type = KVM_S390_SIGP_STOP; - spin_lock_bh(&li->lock); list_add_tail(&inti->list, &li->list); atomic_set(&li->active, 1); @@ -133,11 +127,17 @@ static int __sigp_stop(struct kvm_vcpu * { struct kvm_s390_float_interrupt *fi = &vcpu->kvm->arch.float_int; struct kvm_s390_local_interrupt *li; + struct kvm_s390_interrupt_info *inti; int rc; if (cpu_addr >= KVM_MAX_VCPUS) return 3; /* not operational */ + inti = kzalloc(sizeof(*inti), GFP_KERNEL); + if (!inti) + return -ENOMEM; + inti->type = KVM_S390_SIGP_STOP; + spin_lock(&fi->lock); li = fi->local_int[cpu_addr]; if (li == NULL) { @@ -145,7 +145,7 @@ static int __sigp_stop(struct kvm_vcpu * goto unlock; } - rc = __inject_sigp_stop(li, action); + rc = __inject_sigp_stop(li, action, inti); unlock: spin_unlock(&fi->lock); @@ -156,7 +156,18 @@ unlock: int kvm_s390_inject_sigp_stop(struct kvm_vcpu *vcpu, int action) { struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int; - return __inject_sigp_stop(li, action); + struct kvm_s390_interrupt_info *inti; + int rc; + + inti = kzalloc(sizeof(*inti), GFP_KERNEL); + if (!inti) + return -ENOMEM; + inti->type = KVM_S390_SIGP_STOP; + + rc = __inject_sigp_stop(li, action, inti); + if (rc) + kfree(inti); + return rc; } static int __sigp_set_arch(struct kvm_vcpu *vcpu, u32 parameter) ^ permalink raw reply [flat|nested] 12+ messages in thread
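The sigp.c hunks above move the kzalloc() out of the spinlocked section, because allocation may sleep and must not happen in atomic context. That allocate-before-lock pattern can be sketched in userspace as follows; the types and names below are simplified stand-ins (a pthread mutex models the bh-spinlock), not the real KVM structures.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Illustrative stand-ins for the kernel structures -- NOT the real
 * KVM types; a pthread mutex models the bh-spinlock. */
struct interrupt_info {
	int type;
	struct interrupt_info *next;
};

struct local_interrupt {
	pthread_mutex_t lock;
	struct interrupt_info *list;
};

#define SIGP_STOP 42	/* placeholder for KVM_S390_SIGP_STOP */

/* Like the reworked __inject_sigp_stop(): only queues the
 * pre-allocated entry; nothing that can sleep runs under the lock. */
static int inject_stop(struct local_interrupt *li,
		       struct interrupt_info *inti)
{
	pthread_mutex_lock(&li->lock);
	inti->next = li->list;
	li->list = inti;
	pthread_mutex_unlock(&li->lock);
	return 0;
}

/* Like the reworked callers: allocate up front, free on failure. */
static int sigp_stop(struct local_interrupt *li)
{
	struct interrupt_info *inti = calloc(1, sizeof(*inti));
	int rc;

	if (!inti)
		return -1;	/* -ENOMEM in the kernel */
	inti->type = SIGP_STOP;
	rc = inject_stop(li, inti);
	if (rc)
		free(inti);
	return rc;
}
```

With this split the caller can fail with -ENOMEM before any lock is held, which is exactly what __sigp_stop() and kvm_s390_inject_sigp_stop() do after the patch.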
* [PATCH 2/3] kvm-s390: update vcpu->cpu - rebased 2009-06-02 14:26 [PATCH 0/3] kvm-s390: revised version of kvm-s390 guest memory handling - rebased ehrhardt 2009-06-02 14:26 ` [PATCH 1/3] kvm-s390: infrastructure to kick vcpus out of guest state " ehrhardt @ 2009-06-02 14:26 ` ehrhardt 2009-06-02 14:26 ` [PATCH 3/3] kvm-s390: streamline memslot handling " ehrhardt 2 siblings, 0 replies; 12+ messages in thread From: ehrhardt @ 2009-06-02 14:26 UTC (permalink / raw) To: kvm, avi Cc: ehrhardt, borntraeger, cotte, heiko.carstens, schwidefsky, mtosatti From: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> As requested this is a rebased patch on top of the already applied v3 of the patch series. kvm on s390 formerly ignored vcpu->cpu. This patch adds set/unset vcpu->cpu in kvm_arch_vcpu_load/put to allow further architecture unification e.g. let generic code not find -1 on currently scheduled vcpus. Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> --- [diffstat] kvm-s390.c | 2 ++ 1 file changed, 2 insertions(+) [diff] Index: kvm/arch/s390/kvm/kvm-s390.c =================================================================== --- kvm.orig/arch/s390/kvm/kvm-s390.c +++ kvm/arch/s390/kvm/kvm-s390.c @@ -244,6 +244,7 @@ void kvm_arch_vcpu_uninit(struct kvm_vcp void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu) { + vcpu->cpu = cpu; save_fp_regs(&vcpu->arch.host_fpregs); save_access_regs(vcpu->arch.host_acrs); vcpu->arch.guest_fpregs.fpc &= FPC_VALID_MASK; @@ -253,6 +254,7 @@ void kvm_arch_vcpu_load(struct kvm_vcpu void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu) { + vcpu->cpu = -1; save_fp_regs(&vcpu->arch.guest_fpregs); save_access_regs(vcpu->arch.guest_acrs); restore_fp_regs(&vcpu->arch.host_fpregs); ^ permalink raw reply [flat|nested] 12+ messages in thread
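The contract patch 2 establishes (vcpu->cpu holds the physical cpu while loaded, -1 otherwise) can be modelled in a few lines. The struct below is a hypothetical minimal sketch, not the real struct kvm_vcpu; it only illustrates the -1 sentinel that lets generic code tell scheduled vcpus apart.

```c
#include <assert.h>

/* Hypothetical minimal vcpu -- not the real struct kvm_vcpu; it only
 * models the cpu field the patch starts maintaining. */
struct vcpu {
	int cpu;	/* physical cpu while loaded, -1 otherwise */
};

/* Mirrors the hunk added to kvm_arch_vcpu_load(). */
static void vcpu_load(struct vcpu *v, int cpu)
{
	v->cpu = cpu;
	/* ... save host regs, restore guest regs ... */
}

/* Mirrors the hunk added to kvm_arch_vcpu_put(). */
static void vcpu_put(struct vcpu *v)
{
	/* ... save guest regs, restore host regs ... */
	v->cpu = -1;
}

/* What generic code gains: a way to tell scheduled vcpus apart
 * instead of always finding -1 on s390. */
static int vcpu_is_scheduled(const struct vcpu *v)
{
	return v->cpu != -1;
}
```

Patch 3 relies on exactly this predicate to skip kicking vcpus that are not scheduled.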
* [PATCH 3/3] kvm-s390: streamline memslot handling - rebased 2009-06-02 14:26 [PATCH 0/3] kvm-s390: revised version of kvm-s390 guest memory handling - rebased ehrhardt 2009-06-02 14:26 ` [PATCH 1/3] kvm-s390: infrastructure to kick vcpus out of guest state " ehrhardt 2009-06-02 14:26 ` [PATCH 2/3] kvm-s390: update vcpu->cpu " ehrhardt @ 2009-06-02 14:26 ` ehrhardt 2009-06-05 20:53 ` Marcelo Tosatti 2 siblings, 1 reply; 12+ messages in thread From: ehrhardt @ 2009-06-02 14:26 UTC (permalink / raw) To: kvm, avi Cc: ehrhardt, borntraeger, cotte, heiko.carstens, schwidefsky, mtosatti From: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> As requested this is a rebased patch on top of the already applied v3 of the patch series. *updates to applied version* - ensure the wait_on_bit waiter is notified - ensure dropping all vcpu requests while freeing a vcpu - kick out only scheduled vcpus (kicking others is superfluous, and wait might hang forever on vcpus that are not running) - kvm_arch_set_memory_region waits until the bit is consumed by the vcpu This patch relocates the variables kvm-s390 uses to track guest mem addr/size. As discussed, dropping the variables at struct kvm_arch level allows using the common vcpu->requests based mechanism to reload guest memory if it is e.g. changed via set_memory_region. The kick mechanism introduced in this series is used to ensure running vcpus leave guest state to catch the update.
Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> --- [diffstat] arch/s390/kvm/kvm-s390.c | 27 ++++++++++++++++++++------- arch/s390/kvm/kvm-s390.h | 7 +++++++ virt/kvm/kvm_main.c | 4 ++++ 3 files changed, 31 insertions(+), 7 deletions(-) Index: kvm/arch/s390/kvm/kvm-s390.c =================================================================== --- kvm.orig/arch/s390/kvm/kvm-s390.c +++ kvm/arch/s390/kvm/kvm-s390.c @@ -674,6 +674,12 @@ long kvm_arch_vcpu_ioctl(struct file *fi return -EINVAL; } +static int wait_bit_schedule(void *word) +{ + schedule(); + return 0; +} + /* Section: memory related */ int kvm_arch_set_memory_region(struct kvm *kvm, struct kvm_userspace_memory_region *mem, @@ -681,6 +687,7 @@ int kvm_arch_set_memory_region(struct kv int user_alloc) { int i; + struct kvm_vcpu *vcpu; /* A few sanity checks. We can have exactly one memory slot which has to start at guest virtual zero and which has to be located at a @@ -706,13 +713,19 @@ int kvm_arch_set_memory_region(struct kv /* request update of sie control block for all available vcpus */ for (i = 0; i < KVM_MAX_VCPUS; ++i) { - if (kvm->vcpus[i]) { - if (test_and_set_bit(KVM_REQ_MMU_RELOAD, - &kvm->vcpus[i]->requests)) - continue; - kvm_s390_inject_sigp_stop(kvm->vcpus[i], - ACTION_VCPUREQUEST_ON_STOP); - } + vcpu = kvm->vcpus[i]; + if (!vcpu) + continue; + + if (!test_and_set_bit(KVM_REQ_MMU_RELOAD, &vcpu->requests)) + continue; + + if (vcpu->cpu == -1) + continue; + + kvm_s390_inject_sigp_stop(vcpu, ACTION_VCPUREQUEST_ON_STOP); + wait_on_bit(&vcpu->requests, KVM_REQ_MMU_RELOAD, + wait_bit_schedule, TASK_UNINTERRUPTIBLE); } return 0; Index: kvm/arch/s390/kvm/kvm-s390.h =================================================================== --- kvm.orig/arch/s390/kvm/kvm-s390.h +++ kvm/arch/s390/kvm/kvm-s390.h @@ -92,6 +92,13 @@ static inline unsigned long kvm_s390_han if (!vcpu->requests) return 0; + /* requests that can be handled at all levels */ + if (test_and_clear_bit(KVM_REQ_MMU_RELOAD, 
&vcpu->requests)) { + smp_mb__after_clear_bit(); + wake_up_bit(&vcpu->requests, KVM_REQ_MMU_RELOAD); + kvm_s390_vcpu_set_mem(vcpu); + } + return vcpu->requests; } Index: kvm/virt/kvm/kvm_main.c =================================================================== --- kvm.orig/virt/kvm/kvm_main.c +++ kvm/virt/kvm/kvm_main.c @@ -1682,6 +1682,10 @@ static int kvm_vcpu_release(struct inode { struct kvm_vcpu *vcpu = filp->private_data; + clear_bit(KVM_REQ_MMU_RELOAD, &vcpu->requests); + smp_mb__after_clear_bit(); + wake_up_bit(&vcpu->requests, KVM_REQ_MMU_RELOAD); + kvm_put_kvm(vcpu->kvm); return 0; } ^ permalink raw reply [flat|nested] 12+ messages in thread
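The set-request/wait handshake patch 3 builds (the requester sets a request bit and sleeps in wait_on_bit() until the vcpu consumes the bit and calls wake_up_bit()) can be sketched as a userspace analogue. All names below are illustrative stand-ins, not KVM code: a pthread condition variable replaces wait_on_bit()/wake_up_bit(), and a worker thread plays the vcpu.

```c
#include <assert.h>
#include <pthread.h>

/* Illustrative model only -- not the real KVM types or API. */
#define REQ_MMU_RELOAD 0x1u

struct vcpu_model {
	pthread_mutex_t lock;
	pthread_cond_t cv;
	unsigned int requests;
	int reloads;		/* times the request was handled */
};

/* Requester side, like kvm_arch_set_memory_region(): post the
 * request, then sleep until the vcpu has consumed the bit. */
static void set_request_and_wait(struct vcpu_model *v)
{
	pthread_mutex_lock(&v->lock);
	v->requests |= REQ_MMU_RELOAD;
	pthread_cond_broadcast(&v->cv);		/* the "kick" */
	while (v->requests & REQ_MMU_RELOAD)	/* wait_on_bit() */
		pthread_cond_wait(&v->cv, &v->lock);
	pthread_mutex_unlock(&v->lock);
}

/* Vcpu side, like kvm_s390_handle_vcpu_requests(): consume the
 * bit, do the work, then wake the waiter. */
static void *vcpu_thread(void *arg)
{
	struct vcpu_model *v = arg;

	pthread_mutex_lock(&v->lock);
	while (!(v->requests & REQ_MMU_RELOAD))
		pthread_cond_wait(&v->cv, &v->lock);
	v->requests &= ~REQ_MMU_RELOAD;	/* test_and_clear_bit() */
	v->reloads++;			/* kvm_s390_vcpu_set_mem() */
	pthread_cond_broadcast(&v->cv);	/* wake_up_bit() */
	pthread_mutex_unlock(&v->lock);
	return NULL;
}

static int run_demo(void)
{
	struct vcpu_model v = { PTHREAD_MUTEX_INITIALIZER,
				PTHREAD_COND_INITIALIZER, 0, 0 };
	pthread_t t;

	if (pthread_create(&t, NULL, vcpu_thread, &v))
		return -1;
	set_request_and_wait(&v);
	pthread_join(t, NULL);
	return (v.reloads == 1 && v.requests == 0) ? 0 : -1;
}
```

The broadcast on both transitions mirrors why the patch also wakes waiters in kvm_vcpu_release: a waiter must never be left sleeping on a bit nobody will consume.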
* Re: [PATCH 3/3] kvm-s390: streamline memslot handling - rebased 2009-06-02 14:26 ` [PATCH 3/3] kvm-s390: streamline memslot handling " ehrhardt @ 2009-06-05 20:53 ` Marcelo Tosatti 2009-06-08 10:51 ` Christian Ehrhardt 0 siblings, 1 reply; 12+ messages in thread From: Marcelo Tosatti @ 2009-06-05 20:53 UTC (permalink / raw) To: ehrhardt; +Cc: kvm, avi, borntraeger, cotte, heiko.carstens, schwidefsky On Tue, Jun 02, 2009 at 04:26:11PM +0200, ehrhardt@linux.vnet.ibm.com wrote: > From: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> > > As requested this is a rebased patch on top of the already applied v3 > of the patch series. > > *updates to applied version* > - ensure the wait_on_bit waiter is notified > - ensure dropping vcpu all requests while freeing a vcpu > - kickout only scheduled vcpus (its superfluous and wait might hang forever on > not running vcpus) > - kvm_arch_set_memory_region waits until the bit is consumed by the vcpu > > This patch relocates the variables kvm-s390 uses to track guest mem addr/size. > As discussed dropping the variables at struct kvm_arch level allows to use the > common vcpu->request based mechanism to reload guest memory if e.g. changes > via set_memory_region. > The kick mechanism introduced in this series is used to ensure running vcpus > leave guest state to catch the update. 
> > > Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> > --- > > [diffstat] > arch/s390/kvm/kvm-s390.c | 27 ++++++++++++++++++++------- > arch/s390/kvm/kvm-s390.h | 7 +++++++ > virt/kvm/kvm_main.c | 4 ++++ > 3 files changed, 31 insertions(+), 7 deletions(-) > > Index: kvm/arch/s390/kvm/kvm-s390.c > =================================================================== > --- kvm.orig/arch/s390/kvm/kvm-s390.c > +++ kvm/arch/s390/kvm/kvm-s390.c > @@ -674,6 +674,12 @@ long kvm_arch_vcpu_ioctl(struct file *fi > return -EINVAL; > } > > +static int wait_bit_schedule(void *word) > +{ > + schedule(); > + return 0; > +} > + > /* Section: memory related */ > int kvm_arch_set_memory_region(struct kvm *kvm, > struct kvm_userspace_memory_region *mem, > @@ -681,6 +687,7 @@ int kvm_arch_set_memory_region(struct kv > int user_alloc) > { > int i; > + struct kvm_vcpu *vcpu; > > /* A few sanity checks. We can have exactly one memory slot which has > to start at guest virtual zero and which has to be located at a > @@ -706,13 +713,19 @@ int kvm_arch_set_memory_region(struct kv > > /* request update of sie control block for all available vcpus */ > for (i = 0; i < KVM_MAX_VCPUS; ++i) { > - if (kvm->vcpus[i]) { > - if (test_and_set_bit(KVM_REQ_MMU_RELOAD, > - &kvm->vcpus[i]->requests)) > - continue; > - kvm_s390_inject_sigp_stop(kvm->vcpus[i], > - ACTION_VCPUREQUEST_ON_STOP); > - } > + vcpu = kvm->vcpus[i]; > + if (!vcpu) > + continue; > + > + if (!test_and_set_bit(KVM_REQ_MMU_RELOAD, &vcpu->requests)) > + continue; > + > + if (vcpu->cpu == -1) > + continue; What happens if the check for cpu == -1 races with kvm_arch_vcpu_put? This context will wait until the vcpu_put context is scheduled back in to clear the bit? Is that OK? 
> + > + kvm_s390_inject_sigp_stop(vcpu, ACTION_VCPUREQUEST_ON_STOP); > + wait_on_bit(&vcpu->requests, KVM_REQ_MMU_RELOAD, > + wait_bit_schedule, TASK_UNINTERRUPTIBLE); > } void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu) { + vcpu->cpu = -1; save_fp_regs(&vcpu->arch.guest_fpregs); save_access_regs(vcpu->arch.guest_acrs); restore_fp_regs(&vcpu->arch.host_fpregs); > > return 0; > Index: kvm/arch/s390/kvm/kvm-s390.h > =================================================================== > --- kvm.orig/arch/s390/kvm/kvm-s390.h > +++ kvm/arch/s390/kvm/kvm-s390.h > @@ -92,6 +92,13 @@ static inline unsigned long kvm_s390_han > if (!vcpu->requests) > return 0; > > + /* requests that can be handled at all levels */ > + if (test_and_clear_bit(KVM_REQ_MMU_RELOAD, &vcpu->requests)) { > + smp_mb__after_clear_bit(); Really need that smp_mb__after_clear_bit ? AFAIK test_and_clear_bit implies a barrier? > + wake_up_bit(&vcpu->requests, KVM_REQ_MMU_RELOAD); > + kvm_s390_vcpu_set_mem(vcpu); > + } > + > return vcpu->requests; > } > > Index: kvm/virt/kvm/kvm_main.c > =================================================================== > --- kvm.orig/virt/kvm/kvm_main.c > +++ kvm/virt/kvm/kvm_main.c > @@ -1682,6 +1682,10 @@ static int kvm_vcpu_release(struct inode > { > struct kvm_vcpu *vcpu = filp->private_data; > > + clear_bit(KVM_REQ_MMU_RELOAD, &vcpu->requests); > + smp_mb__after_clear_bit(); > + wake_up_bit(&vcpu->requests, KVM_REQ_MMU_RELOAD); > + And this should be generic? Say if other architectures want to make use of a similar wait infrastructure. Talk is cheap. Anyway, yeah, the set request / wait mechanism you implement here is quite similar to the idea mentioned earlier that could be used for x86. Just get rid of this explicit KVM_REQ_MMU_RELOAD knowledge in arch-independent code please (if you want to see this merged). Later it can all be lifted off to arch independent code. > kvm_put_kvm(vcpu->kvm); > return 0; > } ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 3/3] kvm-s390: streamline memslot handling - rebased 2009-06-05 20:53 ` Marcelo Tosatti @ 2009-06-08 10:51 ` Christian Ehrhardt 2009-06-08 11:10 ` Avi Kivity 2009-06-09 0:56 ` Marcelo Tosatti 0 siblings, 2 replies; 12+ messages in thread From: Christian Ehrhardt @ 2009-06-08 10:51 UTC (permalink / raw) To: Marcelo Tosatti; +Cc: kvm, avi, borntraeger, cotte, heiko.carstens, schwidefsky Marcelo Tosatti wrote: > On Tue, Jun 02, 2009 at 04:26:11PM +0200, ehrhardt@linux.vnet.ibm.com wrote: > >> From: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> >> [...] >> @@ -706,13 +713,19 @@ int kvm_arch_set_memory_region(struct kv >> >> /* request update of sie control block for all available vcpus */ >> for (i = 0; i < KVM_MAX_VCPUS; ++i) { >> - if (kvm->vcpus[i]) { >> - if (test_and_set_bit(KVM_REQ_MMU_RELOAD, >> - &kvm->vcpus[i]->requests)) >> - continue; >> - kvm_s390_inject_sigp_stop(kvm->vcpus[i], >> - ACTION_VCPUREQUEST_ON_STOP); >> - } >> + vcpu = kvm->vcpus[i]; >> + if (!vcpu) >> + continue; >> + >> + if (!test_and_set_bit(KVM_REQ_MMU_RELOAD, &vcpu->requests)) >> + continue; >> + >> + if (vcpu->cpu == -1) >> + continue; >> > > What happens if the check for cpu == -1 races with kvm_arch_vcpu_put? > This context will wait until the vcpu_put context is scheduled back in > to clear the bit? Is that OK? > It either comes back to clear the bit or it is consumed on deletion of the vcpu. Both ways are ok. The question we have to answer is whether it might stall the mem update ioctl for too long, because eventually the check for vcpu->cpu == -1 is just an optimization: if we removed it completely we would have the same problem -> could it stall the set mem operation too much. That means the "race" is not an issue, it might just be sub-optimal; but the chance for a long stall could become an issue.
Unfortunately I have no better approach to that (yet); until then I like this implementation more than what we would have without all the corner case fixes in that patch series. >> + >> + kvm_s390_inject_sigp_stop(vcpu, ACTION_VCPUREQUEST_ON_STOP); >> + wait_on_bit(&vcpu->requests, KVM_REQ_MMU_RELOAD, >> + wait_bit_schedule, TASK_UNINTERRUPTIBLE); >> } >> > > void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu) > { > + vcpu->cpu = -1; > save_fp_regs(&vcpu->arch.guest_fpregs); > [...] >> +++ kvm/arch/s390/kvm/kvm-s390.h >> @@ -92,6 +92,13 @@ static inline unsigned long kvm_s390_han >> if (!vcpu->requests) >> return 0; >> >> + /* requests that can be handled at all levels */ >> + if (test_and_clear_bit(KVM_REQ_MMU_RELOAD, &vcpu->requests)) { >> + smp_mb__after_clear_bit(); >> > > Really need that smp_mb__after_clear_bit ? AFAIK test_and_clear_bit > implies a barrier? > Well, I agree that practically test_and_clear_bit has a barrier on s390, but as far as I read Documentation/atomic_ops.txt at lines 339-360, I think the interface does not imply it, so I wanted to add it explicitly. I would be happy if someone really knows the in-depth details here and corrects me :-) >> + wake_up_bit(&vcpu->requests, KVM_REQ_MMU_RELOAD); >> + kvm_s390_vcpu_set_mem(vcpu); >> + } >> + >> return vcpu->requests; >> } >> >> Index: kvm/virt/kvm/kvm_main.c >> =================================================================== >> --- kvm.orig/virt/kvm/kvm_main.c >> +++ kvm/virt/kvm/kvm_main.c >> @@ -1682,6 +1682,10 @@ static int kvm_vcpu_release(struct inode >> { >> struct kvm_vcpu *vcpu = filp->private_data; >> >> + clear_bit(KVM_REQ_MMU_RELOAD, &vcpu->requests); >> + smp_mb__after_clear_bit(); >> + wake_up_bit(&vcpu->requests, KVM_REQ_MMU_RELOAD); >> + >> > > And this should be generic? Say if other architectures want to make use > of a similar wait infrastructure. Talk is cheap.
> Clear bit and wake up on release doesn't hurt any architecture, and it is at a good place for those using the mechanism to ensure outstanding things are cleaned up when closing a vcpu fd. I thought it's not worth adding a kvm_ARCH_vcpu_release for it, though I could do so if we want it separated. (continued below) > Anyway, yeah, the set request / wait mechanism you implement here is > quite similar to the idea mentioned earlier that could be used for x86. > > Just get rid of this explicit KVM_REQ_MMU_RELOAD knowledge in > arch-independent code please (if you want to see this merged). > I agree to lift the wait part to other archs later if needed, but as mentioned above I could move this to arch code at the cost of one more arch hook. But as also mentioned it doesn't really hurt. I agree that it does not need to be KVM_REQ_MMU_RELOAD specific; we could just walk/clear/wake all bits on that vcpu->requests variable. Would that be generic enough in your opinion? > Later it can all be lifted off to arch independent code. > True for the wait part, which can evolve in our arch code until it is ripe to get cross-arch merged. -- Grüsse / regards, Christian Ehrhardt IBM Linux Technology Center, Open Virtualization
* Re: [PATCH 3/3] kvm-s390: streamline memslot handling - rebased 2009-06-08 10:51 ` Christian Ehrhardt @ 2009-06-08 11:10 ` Avi Kivity 2009-06-08 12:05 ` Christian Ehrhardt 2009-06-09 0:56 ` Marcelo Tosatti 1 sibling, 1 reply; 12+ messages in thread From: Avi Kivity @ 2009-06-08 11:10 UTC (permalink / raw) To: Christian Ehrhardt Cc: Marcelo Tosatti, kvm, borntraeger, cotte, heiko.carstens, schwidefsky Christian Ehrhardt wrote: >> >> Really need that smp_mb__after_clear_bit ? AFAIK test_and_clear_bit >> implies a barrier? >> > > Well I agree that practically test_and_clear_bit has a barrier on > s390, but as far as I read Documentation/atomic_ops.txt at line > 339-360 I think the interface does not imply it so I wanted to add it > explicitly. I would be happy if someone really knows the in depth > details here and corrects me :-) IIUC rmw bitops are full memory barriers. The non-rmw (from the caller's perspective), clear_bit() and set_bit(), are not. -- error compiling committee.c: too many arguments to function
* Re: [PATCH 3/3] kvm-s390: streamline memslot handling - rebased 2009-06-08 11:10 ` Avi Kivity @ 2009-06-08 12:05 ` Christian Ehrhardt 2009-06-08 12:09 ` Avi Kivity 0 siblings, 1 reply; 12+ messages in thread From: Christian Ehrhardt @ 2009-06-08 12:05 UTC (permalink / raw) To: Avi Kivity Cc: Marcelo Tosatti, kvm, borntraeger, cotte, heiko.carstens, schwidefsky Avi Kivity wrote: > Christian Ehrhardt wrote: >>> >>> Really need that smp_mb__after_clear_bit ? AFAIK test_and_clear_bit >>> implies a barrier? >>> >> >> Well I agree that practically test_and_clear_bit has a barrier on >> s390, but as far as I read Documentation/atomic_ops.txt at line >> 339-360 I think the interface does not imply it so I wanted to add it >> explicitly. I would be happy if someone really knows the in depth >> details here and corrects me :-) > > IIUC rmw bitops are full memory barriers. The non-rmw (from the > caller's perspective), clear_bit() and set_bit(), are not. > > Ok: as the real implementation has one, memory-barriers.txt describes it with a barrier, and finally include/asm-generic/bitops/atomic.h describes it that way too, I think I can drop the explicit smp_mb__after_clear_bit from my patch in the next update (I will wait a bit to give the discussion about the wait/bits some more time). Hmm ... would that be worth a clarifying patch to atomic_ops.txt, which confused me in the first place? -- Grüsse / regards, Christian Ehrhardt IBM Linux Technology Center, Open Virtualization
* Re: [PATCH 3/3] kvm-s390: streamline memslot handling - rebased 2009-06-08 12:05 ` Christian Ehrhardt @ 2009-06-08 12:09 ` Avi Kivity 0 siblings, 0 replies; 12+ messages in thread From: Avi Kivity @ 2009-06-08 12:09 UTC (permalink / raw) To: Christian Ehrhardt Cc: Marcelo Tosatti, kvm, borntraeger, cotte, heiko.carstens, schwidefsky Christian Ehrhardt wrote: > Hmm ... would that be worth a clarifying patch to atomic_ops.txt that > confused me in the first place ? If it confused you, it probably confuses others. -- error compiling committee.c: too many arguments to function
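The contract settled in this sub-thread (an atomic read-modify-write such as test_and_clear_bit() acts as a full barrier, while plain clear_bit() does not and needs smp_mb__after_clear_bit()) can be mirrored in userspace with C11 atomics. This is an illustrative analogue only, not the kernel implementation.

```c
#include <assert.h>
#include <stdatomic.h>

/* Atomic read-modify-write, like the kernel's test_and_clear_bit():
 * the default seq_cst ordering makes it a full barrier on both sides,
 * matching the rmw-bitop contract Avi describes. */
static int test_and_clear_bit_full(int nr, atomic_ulong *word)
{
	unsigned long mask = 1UL << nr;
	unsigned long old = atomic_fetch_and(word, ~mask);

	return (old & mask) != 0;
}

/* Plain clear, like the kernel's clear_bit(): not a barrier by
 * itself; callers that need ordering add smp_mb__after_clear_bit(),
 * modelled here by an explicit fence. */
static void clear_bit_then_fence(int nr, atomic_ulong *word)
{
	atomic_fetch_and_explicit(word, ~(1UL << nr),
				  memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
}
```

This is why the kvm_vcpu_release hunk, which uses plain clear_bit(), keeps its smp_mb__after_clear_bit(), while the test_and_clear_bit() path can drop it.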
* Re: [PATCH 3/3] kvm-s390: streamline memslot handling - rebased 2009-06-08 10:51 ` Christian Ehrhardt 2009-06-08 11:10 ` Avi Kivity @ 2009-06-09 0:56 ` Marcelo Tosatti 2009-06-14 12:04 ` Avi Kivity 1 sibling, 1 reply; 12+ messages in thread From: Marcelo Tosatti @ 2009-06-09 0:56 UTC (permalink / raw) To: Christian Ehrhardt Cc: kvm, avi, borntraeger, cotte, heiko.carstens, schwidefsky On Mon, Jun 08, 2009 at 12:51:26PM +0200, Christian Ehrhardt wrote: >>> Index: kvm/virt/kvm/kvm_main.c >>> =================================================================== >>> --- kvm.orig/virt/kvm/kvm_main.c >>> +++ kvm/virt/kvm/kvm_main.c >>> @@ -1682,6 +1682,10 @@ static int kvm_vcpu_release(struct inode >>> { >>> struct kvm_vcpu *vcpu = filp->private_data; >>> + clear_bit(KVM_REQ_MMU_RELOAD, &vcpu->requests); >>> + smp_mb__after_clear_bit(); >>> + wake_up_bit(&vcpu->requests, KVM_REQ_MMU_RELOAD); >>> + >>> >> >> And this should be generic? Say if other architectures want to make use >> of a similar wait infrastructure. Talk is cheap. >> > Clear bit and wake up on release doesn't hurt any architecture, but it > is at a good place fine for those using the mechanism to ensure cleaning > up outstanding things when closing a vcpu fd. > I thought its not worth to add kvm_ARCH_vcpu_release for it while I > could do so if we want it separated. Yeah, was frustated for lack of more useful comments so decided to nitpick on something. > (continued below) >> Anyway, yeah, the set request / wait mechanism you implement here is >> quite similar to the idea mentioned earlier that could be used for x86. >> >> Just get rid of this explicit KVM_REQ_MMU_RELOAD knowledge in >> arch-independent code please (if you want to see this merged). >> > I agree to lift the wait part to other archs later if needed, but as > mentioned above I could move this to arch code to the cost of one arch > hook more. But as also mentioned it doesn't really hurt. 
I agree that it > does not need to be KVM_REQ_MMU_RELOAD specific, we could just > walk/clear/wake all bits on that vcpu->requests variable. > Would that be generic enough in your opinion ? Don't know. Avi? >> Later it can all be lifted off to arch independent code. >> > True for the wait part which can evolve in our arch code until it is > ripe to get cross arch merged. > > -- > > Grüsse / regards, Christian Ehrhardt > IBM Linux Technology Center, Open Virtualization ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 3/3] kvm-s390: streamline memslot handling - rebased 2009-06-09 0:56 ` Marcelo Tosatti @ 2009-06-14 12:04 ` Avi Kivity 2009-06-15 13:47 ` Christian Ehrhardt 0 siblings, 1 reply; 12+ messages in thread From: Avi Kivity @ 2009-06-14 12:04 UTC (permalink / raw) To: Marcelo Tosatti Cc: Christian Ehrhardt, kvm, borntraeger, cotte, heiko.carstens, schwidefsky Marcelo Tosatti wrote: > >> (continued below) >> >>> Anyway, yeah, the set request / wait mechanism you implement here is >>> quite similar to the idea mentioned earlier that could be used for x86. >>> >>> Just get rid of this explicit KVM_REQ_MMU_RELOAD knowledge in >>> arch-independent code please (if you want to see this merged). >>> >>> >> I agree to lift the wait part to other archs later if needed, but as >> mentioned above I could move this to arch code to the cost of one arch >> hook more. But as also mentioned it doesn't really hurt. I agree that it >> does not need to be KVM_REQ_MMU_RELOAD specific, we could just >> walk/clear/wake all bits on that vcpu->requests variable. >> Would that be generic enough in your opinion ? >> > > Don't know. > > Avi? > I think I lost the thread here, but I'll try. Isn't the wake part make_all_vcpus_request() in kvm_main.c? The wait part could be moved to a similar generic function. -- error compiling committee.c: too many arguments to function ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 3/3] kvm-s390: streamline memslot handling - rebased 2009-06-14 12:04 ` Avi Kivity @ 2009-06-15 13:47 ` Christian Ehrhardt 0 siblings, 0 replies; 12+ messages in thread From: Christian Ehrhardt @ 2009-06-15 13:47 UTC (permalink / raw) To: Avi Kivity Cc: Marcelo Tosatti, kvm, borntraeger, cotte, heiko.carstens, schwidefsky Avi Kivity wrote: > Marcelo Tosatti wrote: >> >>> (continued below) >>> >>>> Anyway, yeah, the set request / wait mechanism you implement here is >>>> quite similar to the idea mentioned earlier that could be used for >>>> x86. >>>> >>>> Just get rid of this explicit KVM_REQ_MMU_RELOAD knowledge in >>>> arch-independent code please (if you want to see this merged). >>>> >>> I agree to lift the wait part to other archs later if needed, but >>> as mentioned above I could move this to arch code to the cost of >>> one arch hook more. But as also mentioned it doesn't really hurt. I >>> agree that it does not need to be KVM_REQ_MMU_RELOAD specific, we >>> could just walk/clear/wake all bits on that vcpu->requests variable. >>> Would that be generic enough in your opinion ? >>> >> >> Don't know. >> >> Avi? >> > > I think I lost the thread here, but I'll try. Isn't the wake part > make_all_vcpus_request() in kvm_main.c? The wait part could be moved > to a similar generic function. > I'll try to summarize my current thoughts a bit: The rebased patch series brings several fixes and the wait/wakeup mechanism which is in discussion here. As explained before this keeps the new wait implementation in s390 arch code which allows us to experiment with it. Later if we are happy with it we might (or not) continue the merge and bring this mechanism to make_all_vcpus_request (as on x86 you don't have the issues I try to fix here we don't need to hurry bringing that into generic code). 
Well, now to the wait/wakeup which is in discussion here, in detail: The s390 arch code can kick a guest, but we don't know implicitly (as x86 does) that the kick succeeded; it might happen at some point sooner or later. Therefore the code uses wait_on_bit to wait until the vcpu->requests bit is consumed. To ensure cleanup of these waiting threads in some special cases, the clear & wake up is also needed at other places than the real bit consumption. One of them is the vcpu release code, where we should clear & wake up all waiters (Marcelo correctly pointed out that we should not be bit specific there, so I just wake up all in the updated code). That was the discussion here: "if it would be ok to clear & wake up all". As wake_up_bit doesn't hurt if there is no waiter, it looks like the best solution to do that in the generic part of vcpu_release. If ever someone else waits for this or another bit in vcpu->requests, the code ensures all of them are awakened on vcpu release. I will send an updated version of the rebased series in a few minutes, containing updates related to what Marcelo pointed out. P.S. in case you think we need much more discussion we might try to catch up on irc to save this thread a few cycles :-) -- Grüsse / regards, Christian Ehrhardt IBM Linux Technology Center, Open Virtualization