* [RFC PATCH v2 0/2] s390x: Improvements to SIGP handling [KVM]
@ 2021-11-02 19:46 Eric Farman
  2021-11-02 19:46 ` [RFC PATCH v2 1/2] Capability/IOCTL/Documentation Eric Farman
  2021-11-02 19:46 ` [RFC PATCH v2 2/2] KVM: s390: Extend the USER_SIGP capability Eric Farman
  0 siblings, 2 replies; 8+ messages in thread
From: Eric Farman @ 2021-11-02 19:46 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank, David Hildenbrand,
	Claudio Imbrenda, Thomas Huth
  Cc: Heiko Carstens, Vasily Gorbik, Paolo Bonzini, Jonathan Corbet,
	kvm, linux-s390, Eric Farman

Here is a new variation of the SIGP handling discussed a few
weeks ago [1]. Notable changes:

 - Patches 1 and 6 from v1 were picked for 5.16 (Thank you!) [2]
 - Patches 2 through 5 were removed, and replaced with this
   iteration that relies on a KVM capability and IOCTL

I opted to use David's suggestion [3] of having the kernel
automatically set a vcpu "busy" and having userspace reset it
when complete. I made it dependent on the existing USER_SIGP
capability, which may not be ideal for potential non-SIGP
scenarios in the future, but this at least shows how it could work.

According to the Principles of Operation, only a subset of
SIGP orders would generate a "busy" condition, and a different
subset would even notice it. But I applied this to the entirety
of the SIGP orders, even the invalid ones that would otherwise
return some status bits and CC1 instead of the CC2 (BUSY)
condition. Perhaps that's too much, perhaps not.

As I'm writing this, I'm realizing that I probably need to look
at the cpu reset paths more closely, to ensure the "busy" indicator
is actually reset to zero.

Since this is an RFC, I've left the CAP/IOCTL definitions as
a standalone patch, to make them easier to see while working
with the QEMU code. Ultimately the two patches would be squashed
together, and might need some rework after the merge window anyway.

I'll send the QEMU series shortly, which takes advantage of this.

Thoughts?

[1] https://lore.kernel.org/r/20211008203112.1979843-1-farman@linux.ibm.com/
[2] https://lore.kernel.org/r/20211031121104.14764-1-borntraeger@de.ibm.com/
[3] https://lore.kernel.org/r/3e3b38d1-b338-0211-04ab-91f913c1f557@redhat.com/

Eric Farman (2):
  Capability/IOCTL/Documentation
  KVM: s390: Extend the USER_SIGP capability

 Documentation/virt/kvm/api.rst   | 27 +++++++++++++++++++++
 arch/s390/include/asm/kvm_host.h |  2 ++
 arch/s390/kvm/kvm-s390.c         | 18 ++++++++++++++
 arch/s390/kvm/kvm-s390.h         | 10 ++++++++
 arch/s390/kvm/sigp.c             | 40 ++++++++++++++++++++++++++++++++
 include/uapi/linux/kvm.h         |  4 ++++
 6 files changed, 101 insertions(+)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [RFC PATCH v2 1/2] Capability/IOCTL/Documentation
  2021-11-02 19:46 [RFC PATCH v2 0/2] s390x: Improvements to SIGP handling [KVM] Eric Farman
@ 2021-11-02 19:46 ` Eric Farman
  2021-11-02 19:46 ` [RFC PATCH v2 2/2] KVM: s390: Extend the USER_SIGP capability Eric Farman
  1 sibling, 0 replies; 8+ messages in thread
From: Eric Farman @ 2021-11-02 19:46 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank, David Hildenbrand,
	Claudio Imbrenda, Thomas Huth
  Cc: Heiko Carstens, Vasily Gorbik, Paolo Bonzini, Jonathan Corbet,
	kvm, linux-s390, Eric Farman

(This should be squashed with the next patch; it's just broken
out for ease of future rebasing.)

Signed-off-by: Eric Farman <farman@linux.ibm.com>
---
 Documentation/virt/kvm/api.rst | 27 +++++++++++++++++++++++++++
 include/uapi/linux/kvm.h       |  4 ++++
 2 files changed, 31 insertions(+)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index a6729c8cf063..00fdc86545e5 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -5317,6 +5317,18 @@ the trailing ``'\0'``, is indicated by ``name_size`` in the header.
 The Stats Data block contains an array of 64-bit values in the same order
 as the descriptors in Descriptors block.
 
+4.134 KVM_S390_VCPU_RESET_SIGP_BUSY
+-----------------------------------
+
+:Capability: KVM_CAP_S390_USER_SIGP_BUSY
+:Architectures: s390
+:Type: vcpu ioctl
+:Parameters: none
+:Returns: 0
+
+This ioctl resets the VCPU's indicator that it is busy processing a SIGP
+order, making it available to process additional SIGP orders.
+
 5. The kvm_run structure
 ========================
 
@@ -6706,6 +6718,21 @@ MAP_SHARED mmap will result in an -EINVAL return.
 When enabled the VMM may make use of the ``KVM_ARM_MTE_COPY_TAGS`` ioctl to
 perform a bulk copy of tags to/from the guest.
 
+7.29 KVM_CAP_S390_USER_SIGP_BUSY
+--------------------------------
+
+:Architectures: s390
+:Parameters: none
+
+This capability indicates that KVM will note when a SIGP order has been
+sent to userspace for a particular vcpu, and return CC2 (BUSY) to any further
+SIGP order directed at the same vcpu, even for those orders that are handled
+within the kernel.
+
+This capability is dependent on KVM_CAP_S390_USER_SIGP. If this capability
+is not enabled, SIGP orders handled by the kernel may not indicate whether a
+vcpu is currently processing another SIGP order.
+
 8. Other capabilities.
 ======================
 
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index a067410ebea5..7e7727b4ef59 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1112,6 +1112,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_BINARY_STATS_FD 203
 #define KVM_CAP_EXIT_ON_EMULATION_FAILURE 204
 #define KVM_CAP_ARM_MTE 205
+#define KVM_CAP_S390_USER_SIGP_BUSY 206
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -2007,4 +2008,7 @@ struct kvm_stats_desc {
 
 #define KVM_GET_STATS_FD  _IO(KVMIO,  0xce)
 
+/* Available with KVM_CAP_S390_USER_SIGP_BUSY */
+#define KVM_S390_VCPU_RESET_SIGP_BUSY	_IO(KVMIO, 0xcf)
+
 #endif /* __LINUX_KVM_H */
-- 
2.25.1



* [RFC PATCH v2 2/2] KVM: s390: Extend the USER_SIGP capability
  2021-11-02 19:46 [RFC PATCH v2 0/2] s390x: Improvements to SIGP handling [KVM] Eric Farman
  2021-11-02 19:46 ` [RFC PATCH v2 1/2] Capability/IOCTL/Documentation Eric Farman
@ 2021-11-02 19:46 ` Eric Farman
  2021-11-04  9:06   ` David Hildenbrand
  1 sibling, 1 reply; 8+ messages in thread
From: Eric Farman @ 2021-11-02 19:46 UTC (permalink / raw)
  To: Christian Borntraeger, Janosch Frank, David Hildenbrand,
	Claudio Imbrenda, Thomas Huth
  Cc: Heiko Carstens, Vasily Gorbik, Paolo Bonzini, Jonathan Corbet,
	kvm, linux-s390, Eric Farman

With commit 2444b352c3ac ("KVM: s390: forward most SIGP orders to user
space") we have a capability that allows the "fast" SIGP orders (as
defined by the Programming Notes for the SIGNAL PROCESSOR instruction in
the Principles of Operation) to be handled in-kernel, while all others are
sent to userspace for processing.

This works fine, but it creates a situation where, for example, a SIGP SENSE
might return CC1 (STATUS STORED, and status bits indicating the vcpu is
stopped), when in actuality userspace is still processing a SIGP STOP AND
STORE STATUS order, and the vcpu is not yet actually stopped. Thus, the
SIGP SENSE should actually be returning CC2 (busy) instead of CC1.

To fix this, add another CPU capability, dependent on the USER_SIGP one,
that will mark a vcpu as "busy" processing a SIGP order, and a
corresponding IOCTL that userspace can call to indicate it has finished
its work and the SIGP operation is completed.

Signed-off-by: Eric Farman <farman@linux.ibm.com>
---
 arch/s390/include/asm/kvm_host.h |  2 ++
 arch/s390/kvm/kvm-s390.c         | 18 ++++++++++++++
 arch/s390/kvm/kvm-s390.h         | 10 ++++++++
 arch/s390/kvm/sigp.c             | 40 ++++++++++++++++++++++++++++++++
 4 files changed, 70 insertions(+)

diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index a604d51acfc8..bd202bb3acb5 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -746,6 +746,7 @@ struct kvm_vcpu_arch {
 	__u64 cputm_start;
 	bool gs_enabled;
 	bool skey_enabled;
+	atomic_t sigp_busy;
 	struct kvm_s390_pv_vcpu pv;
 	union diag318_info diag318_info;
 };
@@ -941,6 +942,7 @@ struct kvm_arch{
 	int user_sigp;
 	int user_stsi;
 	int user_instr0;
+	int user_sigp_busy;
 	struct s390_io_adapter *adapters[MAX_S390_IO_ADAPTERS];
 	wait_queue_head_t ipte_wq;
 	int ipte_lock_count;
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 5f52e7eec02f..ff23a46288cc 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -564,6 +564,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_S390_VCPU_RESETS:
 	case KVM_CAP_SET_GUEST_DEBUG:
 	case KVM_CAP_S390_DIAG318:
+	case KVM_CAP_S390_USER_SIGP_BUSY:
 		r = 1;
 		break;
 	case KVM_CAP_SET_GUEST_DEBUG2:
@@ -706,6 +707,15 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap)
 		kvm->arch.user_sigp = 1;
 		r = 0;
 		break;
+	case KVM_CAP_S390_USER_SIGP_BUSY:
+		r = -EINVAL;
+		if (kvm->arch.user_sigp) {
+			kvm->arch.user_sigp_busy = 1;
+			r = 0;
+		}
+		VM_EVENT(kvm, 3, "ENABLE: CAP_S390_USER_SIGP_BUSY %s",
+			 r ? "(not available)" : "(success)");
+		break;
 	case KVM_CAP_S390_VECTOR_REGISTERS:
 		mutex_lock(&kvm->lock);
 		if (kvm->created_vcpus) {
@@ -4825,6 +4835,14 @@ long kvm_arch_vcpu_async_ioctl(struct file *filp,
 			return -EINVAL;
 		return kvm_s390_inject_vcpu(vcpu, &s390irq);
 	}
+	case KVM_S390_VCPU_RESET_SIGP_BUSY: {
+		if (!vcpu->kvm->arch.user_sigp_busy)
+			return -EFAULT;
+
+		VCPU_EVENT(vcpu, 3, "SIGP: CPU %x reset busy", vcpu->vcpu_id);
+		kvm_s390_vcpu_clear_sigp_busy(vcpu);
+		return 0;
+	}
 	}
 	return -ENOIOCTLCMD;
 }
diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h
index c07a050d757d..9ce97832224b 100644
--- a/arch/s390/kvm/kvm-s390.h
+++ b/arch/s390/kvm/kvm-s390.h
@@ -82,6 +82,16 @@ static inline int is_vcpu_idle(struct kvm_vcpu *vcpu)
 	return test_bit(vcpu->vcpu_idx, vcpu->kvm->arch.idle_mask);
 }
 
+static inline bool kvm_s390_vcpu_set_sigp_busy(struct kvm_vcpu *vcpu)
+{
+	return (atomic_cmpxchg(&vcpu->arch.sigp_busy, 0, 1) == 0);
+}
+
+static inline void kvm_s390_vcpu_clear_sigp_busy(struct kvm_vcpu *vcpu)
+{
+	atomic_set(&vcpu->arch.sigp_busy, 0);
+}
+
 static inline int kvm_is_ucontrol(struct kvm *kvm)
 {
 #ifdef CONFIG_KVM_S390_UCONTROL
diff --git a/arch/s390/kvm/sigp.c b/arch/s390/kvm/sigp.c
index 5ad3fb4619f1..034ea72e098a 100644
--- a/arch/s390/kvm/sigp.c
+++ b/arch/s390/kvm/sigp.c
@@ -341,9 +341,42 @@ static int handle_sigp_dst(struct kvm_vcpu *vcpu, u8 order_code,
 			   "sigp order %u -> cpu %x: handled in user space",
 			   order_code, dst_vcpu->vcpu_id);
 
+	kvm_s390_vcpu_clear_sigp_busy(dst_vcpu);
+
 	return rc;
 }
 
+static int handle_sigp_order_busy(struct kvm_vcpu *vcpu, u8 order_code,
+				  u16 cpu_addr)
+{
+	struct kvm_vcpu *dst_vcpu = kvm_get_vcpu_by_id(vcpu->kvm, cpu_addr);
+
+	if (!vcpu->kvm->arch.user_sigp_busy)
+		return 0;
+
+	/*
+	 * Just see if the target vcpu exists; the CC3 will be set wherever
+	 * the SIGP order is processed directly.
+	 */
+	if (!dst_vcpu)
+		return 0;
+
+	/* Reset orders will be accepted, regardless if target vcpu is busy */
+	if (order_code == SIGP_INITIAL_CPU_RESET ||
+	    order_code == SIGP_CPU_RESET)
+		return 0;
+
+	/* Orders that affect multiple vcpus should not flag one vcpu busy */
+	if (order_code == SIGP_SET_ARCHITECTURE)
+		return 0;
+
+	/* If this fails, the vcpu is already busy processing another SIGP */
+	if (!kvm_s390_vcpu_set_sigp_busy(dst_vcpu))
+		return -EBUSY;
+
+	return 0;
+}
+
 static int handle_sigp_order_in_user_space(struct kvm_vcpu *vcpu, u8 order_code,
 					   u16 cpu_addr)
 {
@@ -408,6 +441,13 @@ int kvm_s390_handle_sigp(struct kvm_vcpu *vcpu)
 		return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
 
 	order_code = kvm_s390_get_base_disp_rs(vcpu, NULL);
+
+	rc = handle_sigp_order_busy(vcpu, order_code, cpu_addr);
+	if (rc) {
+		kvm_s390_set_psw_cc(vcpu, SIGP_CC_BUSY);
+		return 0;
+	}
+
 	if (handle_sigp_order_in_user_space(vcpu, order_code, cpu_addr))
 		return -EOPNOTSUPP;
 
-- 
2.25.1



* Re: [RFC PATCH v2 2/2] KVM: s390: Extend the USER_SIGP capability
  2021-11-02 19:46 ` [RFC PATCH v2 2/2] KVM: s390: Extend the USER_SIGP capability Eric Farman
@ 2021-11-04  9:06   ` David Hildenbrand
  2021-11-04 14:33     ` Eric Farman
  0 siblings, 1 reply; 8+ messages in thread
From: David Hildenbrand @ 2021-11-04  9:06 UTC (permalink / raw)
  To: Eric Farman, Christian Borntraeger, Janosch Frank,
	Claudio Imbrenda, Thomas Huth
  Cc: Heiko Carstens, Vasily Gorbik, Paolo Bonzini, Jonathan Corbet,
	kvm, linux-s390

On 02.11.21 20:46, Eric Farman wrote:
> With commit 2444b352c3ac ("KVM: s390: forward most SIGP orders to user
> space") we have a capability that allows the "fast" SIGP orders (as
> defined by the Programming Notes for the SIGNAL PROCESSOR instruction in
> the Principles of Operation) to be handled in-kernel, while all others are
> sent to userspace for processing.
> 
> This works fine but it creates a situation when, for example, a SIGP SENSE
> might return CC1 (STATUS STORED, and status bits indicating the vcpu is
> stopped), when in actuality userspace is still processing a SIGP STOP AND
> STORE STATUS order, and the vcpu is not yet actually stopped. Thus, the
> SIGP SENSE should actually be returning CC2 (busy) instead of CC1.
> 
> To fix this, add another CPU capability, dependent on the USER_SIGP one,
> that will mark a vcpu as "busy" processing a SIGP order, and a
> corresponding IOCTL that userspace can call to indicate it has finished
> its work and the SIGP operation is completed.
> 
> Signed-off-by: Eric Farman <farman@linux.ibm.com>
> ---
>  arch/s390/include/asm/kvm_host.h |  2 ++
>  arch/s390/kvm/kvm-s390.c         | 18 ++++++++++++++
>  arch/s390/kvm/kvm-s390.h         | 10 ++++++++
>  arch/s390/kvm/sigp.c             | 40 ++++++++++++++++++++++++++++++++
>  4 files changed, 70 insertions(+)
> 
> diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
> index a604d51acfc8..bd202bb3acb5 100644
> --- a/arch/s390/include/asm/kvm_host.h
> +++ b/arch/s390/include/asm/kvm_host.h
> @@ -746,6 +746,7 @@ struct kvm_vcpu_arch {
>  	__u64 cputm_start;
>  	bool gs_enabled;
>  	bool skey_enabled;
> +	atomic_t sigp_busy;
>  	struct kvm_s390_pv_vcpu pv;
>  	union diag318_info diag318_info;
>  };
> @@ -941,6 +942,7 @@ struct kvm_arch{
>  	int user_sigp;
>  	int user_stsi;
>  	int user_instr0;
> +	int user_sigp_busy;
>  	struct s390_io_adapter *adapters[MAX_S390_IO_ADAPTERS];
>  	wait_queue_head_t ipte_wq;
>  	int ipte_lock_count;
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index 5f52e7eec02f..ff23a46288cc 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -564,6 +564,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>  	case KVM_CAP_S390_VCPU_RESETS:
>  	case KVM_CAP_SET_GUEST_DEBUG:
>  	case KVM_CAP_S390_DIAG318:
> +	case KVM_CAP_S390_USER_SIGP_BUSY:
>  		r = 1;
>  		break;
>  	case KVM_CAP_SET_GUEST_DEBUG2:
> @@ -706,6 +707,15 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap)
>  		kvm->arch.user_sigp = 1;
>  		r = 0;
>  		break;
> +	case KVM_CAP_S390_USER_SIGP_BUSY:
> +		r = -EINVAL;
> +		if (kvm->arch.user_sigp) {
> +			kvm->arch.user_sigp_busy = 1;
> +			r = 0;
> +		}
> +		VM_EVENT(kvm, 3, "ENABLE: CAP_S390_USER_SIGP_BUSY %s",
> +			 r ? "(not available)" : "(success)");
> +		break;
>  	case KVM_CAP_S390_VECTOR_REGISTERS:
>  		mutex_lock(&kvm->lock);
>  		if (kvm->created_vcpus) {
> @@ -4825,6 +4835,14 @@ long kvm_arch_vcpu_async_ioctl(struct file *filp,
>  			return -EINVAL;
>  		return kvm_s390_inject_vcpu(vcpu, &s390irq);
>  	}
> +	case KVM_S390_VCPU_RESET_SIGP_BUSY: {
> +		if (!vcpu->kvm->arch.user_sigp_busy)
> +			return -EFAULT;
> +
> +		VCPU_EVENT(vcpu, 3, "SIGP: CPU %x reset busy", vcpu->vcpu_id);
> +		kvm_s390_vcpu_clear_sigp_busy(vcpu);
> +		return 0;
> +	}
>  	}
>  	return -ENOIOCTLCMD;
>  }
> diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h
> index c07a050d757d..9ce97832224b 100644
> --- a/arch/s390/kvm/kvm-s390.h
> +++ b/arch/s390/kvm/kvm-s390.h
> @@ -82,6 +82,16 @@ static inline int is_vcpu_idle(struct kvm_vcpu *vcpu)
>  	return test_bit(vcpu->vcpu_idx, vcpu->kvm->arch.idle_mask);
>  }
>  
> +static inline bool kvm_s390_vcpu_set_sigp_busy(struct kvm_vcpu *vcpu)
> +{
> +	return (atomic_cmpxchg(&vcpu->arch.sigp_busy, 0, 1) == 0);
> +}
> +
> +static inline void kvm_s390_vcpu_clear_sigp_busy(struct kvm_vcpu *vcpu)
> +{
> +	atomic_set(&vcpu->arch.sigp_busy, 0);
> +}
> +
>  static inline int kvm_is_ucontrol(struct kvm *kvm)
>  {
>  #ifdef CONFIG_KVM_S390_UCONTROL
> diff --git a/arch/s390/kvm/sigp.c b/arch/s390/kvm/sigp.c
> index 5ad3fb4619f1..034ea72e098a 100644
> --- a/arch/s390/kvm/sigp.c
> +++ b/arch/s390/kvm/sigp.c
> @@ -341,9 +341,42 @@ static int handle_sigp_dst(struct kvm_vcpu *vcpu, u8 order_code,
>  			   "sigp order %u -> cpu %x: handled in user space",
>  			   order_code, dst_vcpu->vcpu_id);
>  
> +	kvm_s390_vcpu_clear_sigp_busy(dst_vcpu);
> +
>  	return rc;
>  }
>  
> +static int handle_sigp_order_busy(struct kvm_vcpu *vcpu, u8 order_code,
> +				  u16 cpu_addr)
> +{
> +	struct kvm_vcpu *dst_vcpu = kvm_get_vcpu_by_id(vcpu->kvm, cpu_addr);
> +
> +	if (!vcpu->kvm->arch.user_sigp_busy)
> +		return 0;
> +
> +	/*
> +	 * Just see if the target vcpu exists; the CC3 will be set wherever
> +	 * the SIGP order is processed directly.
> +	 */
> +	if (!dst_vcpu)
> +		return 0;
> +
> +	/* Reset orders will be accepted, regardless if target vcpu is busy */
> +	if (order_code == SIGP_INITIAL_CPU_RESET ||
> +	    order_code == SIGP_CPU_RESET)
> +		return 0;
> +
> +	/* Orders that affect multiple vcpus should not flag one vcpu busy */
> +	if (order_code == SIGP_SET_ARCHITECTURE)
> +		return 0;
> +
> +	/* If this fails, the vcpu is already busy processing another SIGP */
> +	if (!kvm_s390_vcpu_set_sigp_busy(dst_vcpu))
> +		return -EBUSY;
> +
> +	return 0;
> +}
> +
>  static int handle_sigp_order_in_user_space(struct kvm_vcpu *vcpu, u8 order_code,
>  					   u16 cpu_addr)
>  {
> @@ -408,6 +441,13 @@ int kvm_s390_handle_sigp(struct kvm_vcpu *vcpu)
>  		return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
>  
>  	order_code = kvm_s390_get_base_disp_rs(vcpu, NULL);
> +
> +	rc = handle_sigp_order_busy(vcpu, order_code, cpu_addr);
> +	if (rc) {
> +		kvm_s390_set_psw_cc(vcpu, SIGP_CC_BUSY);
> +		return 0;
> +	}
> +
>  	if (handle_sigp_order_in_user_space(vcpu, order_code, cpu_addr))
>  		return -EOPNOTSUPP;


After looking at the QEMU side, I wonder if we should instead:

a) Let user space always set/reset SIGP busy. Don't set/reset it in the
   kernel automatically. All "heavy weight" SIGP orders are carried out
   in user space nowadays either way.
b) Reject all in-kernel SIGP orders targeting a CPU if marked BUSY by
   user space. (i.e., SIGP SENSE)
c) Don't reject SIGP orders that will be handled in QEMU from the
   kernel. Just let user space deal with it -- especially with the
   "problematic" ones like RESET and SET_ARCHITECTURE.

For example, we don't care about concurrent SIGP SENSE. We only care
about "lightweight" SIGP orders with concurrent "heavy weight" SIGP orders.

This should simplify this code and avoid having to clear the BUSY
flag in QEMU (that might be bogus) when detecting another BUSY situation
(the trylock thingy, see my QEMU reply). The downside is that we have to
issue yet another IOCTL to set the CPU busy for SIGP -- not sure if we
really care.

-- 
Thanks,

David / dhildenb



* Re: [RFC PATCH v2 2/2] KVM: s390: Extend the USER_SIGP capability
  2021-11-04  9:06   ` David Hildenbrand
@ 2021-11-04 14:33     ` Eric Farman
  2021-11-04 14:59       ` David Hildenbrand
  0 siblings, 1 reply; 8+ messages in thread
From: Eric Farman @ 2021-11-04 14:33 UTC (permalink / raw)
  To: David Hildenbrand, Christian Borntraeger, Janosch Frank,
	Claudio Imbrenda, Thomas Huth
  Cc: Heiko Carstens, Vasily Gorbik, Paolo Bonzini, Jonathan Corbet,
	kvm, linux-s390

On Thu, 2021-11-04 at 10:06 +0100, David Hildenbrand wrote:
> On 02.11.21 20:46, Eric Farman wrote:
> > With commit 2444b352c3ac ("KVM: s390: forward most SIGP orders to
> > user
> > space") we have a capability that allows the "fast" SIGP orders (as
> > defined by the Programming Notes for the SIGNAL PROCESSOR
> > instruction in
> > the Principles of Operation) to be handled in-kernel, while all
> > others are
> > sent to userspace for processing.
> > 
> > This works fine but it creates a situation when, for example, a
> > SIGP SENSE
> > might return CC1 (STATUS STORED, and status bits indicating the
> > vcpu is
> > stopped), when in actuality userspace is still processing a SIGP
> > STOP AND
> > STORE STATUS order, and the vcpu is not yet actually stopped. Thus,
> > the
> > SIGP SENSE should actually be returning CC2 (busy) instead of CC1.
> > 
> > To fix this, add another CPU capability, dependent on the USER_SIGP
> > one,
> > that will mark a vcpu as "busy" processing a SIGP order, and a
> > corresponding IOCTL that userspace can call to indicate it has
> > finished
> > its work and the SIGP operation is completed.
> > 
> > Signed-off-by: Eric Farman <farman@linux.ibm.com>
> > ---
> >  arch/s390/include/asm/kvm_host.h |  2 ++
> >  arch/s390/kvm/kvm-s390.c         | 18 ++++++++++++++
> >  arch/s390/kvm/kvm-s390.h         | 10 ++++++++
> >  arch/s390/kvm/sigp.c             | 40
> > ++++++++++++++++++++++++++++++++
> >  4 files changed, 70 insertions(+)
> > 
> > diff --git a/arch/s390/include/asm/kvm_host.h
> > b/arch/s390/include/asm/kvm_host.h
> > index a604d51acfc8..bd202bb3acb5 100644
> > --- a/arch/s390/include/asm/kvm_host.h
> > +++ b/arch/s390/include/asm/kvm_host.h
> > @@ -746,6 +746,7 @@ struct kvm_vcpu_arch {
> >  	__u64 cputm_start;
> >  	bool gs_enabled;
> >  	bool skey_enabled;
> > +	atomic_t sigp_busy;
> >  	struct kvm_s390_pv_vcpu pv;
> >  	union diag318_info diag318_info;
> >  };
> > @@ -941,6 +942,7 @@ struct kvm_arch{
> >  	int user_sigp;
> >  	int user_stsi;
> >  	int user_instr0;
> > +	int user_sigp_busy;
> >  	struct s390_io_adapter *adapters[MAX_S390_IO_ADAPTERS];
> >  	wait_queue_head_t ipte_wq;
> >  	int ipte_lock_count;
> > diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> > index 5f52e7eec02f..ff23a46288cc 100644
> > --- a/arch/s390/kvm/kvm-s390.c
> > +++ b/arch/s390/kvm/kvm-s390.c
> > @@ -564,6 +564,7 @@ int kvm_vm_ioctl_check_extension(struct kvm
> > *kvm, long ext)
> >  	case KVM_CAP_S390_VCPU_RESETS:
> >  	case KVM_CAP_SET_GUEST_DEBUG:
> >  	case KVM_CAP_S390_DIAG318:
> > +	case KVM_CAP_S390_USER_SIGP_BUSY:
> >  		r = 1;
> >  		break;
> >  	case KVM_CAP_SET_GUEST_DEBUG2:
> > @@ -706,6 +707,15 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
> > struct kvm_enable_cap *cap)
> >  		kvm->arch.user_sigp = 1;
> >  		r = 0;
> >  		break;
> > +	case KVM_CAP_S390_USER_SIGP_BUSY:
> > +		r = -EINVAL;
> > +		if (kvm->arch.user_sigp) {
> > +			kvm->arch.user_sigp_busy = 1;
> > +			r = 0;
> > +		}
> > +		VM_EVENT(kvm, 3, "ENABLE: CAP_S390_USER_SIGP_BUSY %s",
> > +			 r ? "(not available)" : "(success)");
> > +		break;
> >  	case KVM_CAP_S390_VECTOR_REGISTERS:
> >  		mutex_lock(&kvm->lock);
> >  		if (kvm->created_vcpus) {
> > @@ -4825,6 +4835,14 @@ long kvm_arch_vcpu_async_ioctl(struct file
> > *filp,
> >  			return -EINVAL;
> >  		return kvm_s390_inject_vcpu(vcpu, &s390irq);
> >  	}
> > +	case KVM_S390_VCPU_RESET_SIGP_BUSY: {
> > +		if (!vcpu->kvm->arch.user_sigp_busy)
> > +			return -EFAULT;
> > +
> > +		VCPU_EVENT(vcpu, 3, "SIGP: CPU %x reset busy", vcpu-
> > >vcpu_id);
> > +		kvm_s390_vcpu_clear_sigp_busy(vcpu);
> > +		return 0;
> > +	}
> >  	}
> >  	return -ENOIOCTLCMD;
> >  }
> > diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h
> > index c07a050d757d..9ce97832224b 100644
> > --- a/arch/s390/kvm/kvm-s390.h
> > +++ b/arch/s390/kvm/kvm-s390.h
> > @@ -82,6 +82,16 @@ static inline int is_vcpu_idle(struct kvm_vcpu
> > *vcpu)
> >  	return test_bit(vcpu->vcpu_idx, vcpu->kvm->arch.idle_mask);
> >  }
> >  
> > +static inline bool kvm_s390_vcpu_set_sigp_busy(struct kvm_vcpu
> > *vcpu)
> > +{
> > +	return (atomic_cmpxchg(&vcpu->arch.sigp_busy, 0, 1) == 0);
> > +}
> > +
> > +static inline void kvm_s390_vcpu_clear_sigp_busy(struct kvm_vcpu
> > *vcpu)
> > +{
> > +	atomic_set(&vcpu->arch.sigp_busy, 0);
> > +}
> > +
> >  static inline int kvm_is_ucontrol(struct kvm *kvm)
> >  {
> >  #ifdef CONFIG_KVM_S390_UCONTROL
> > diff --git a/arch/s390/kvm/sigp.c b/arch/s390/kvm/sigp.c
> > index 5ad3fb4619f1..034ea72e098a 100644
> > --- a/arch/s390/kvm/sigp.c
> > +++ b/arch/s390/kvm/sigp.c
> > @@ -341,9 +341,42 @@ static int handle_sigp_dst(struct kvm_vcpu
> > *vcpu, u8 order_code,
> >  			   "sigp order %u -> cpu %x: handled in user
> > space",
> >  			   order_code, dst_vcpu->vcpu_id);
> >  
> > +	kvm_s390_vcpu_clear_sigp_busy(dst_vcpu);
> > +
> >  	return rc;
> >  }
> >  
> > +static int handle_sigp_order_busy(struct kvm_vcpu *vcpu, u8
> > order_code,
> > +				  u16 cpu_addr)
> > +{
> > +	struct kvm_vcpu *dst_vcpu = kvm_get_vcpu_by_id(vcpu->kvm,
> > cpu_addr);
> > +
> > +	if (!vcpu->kvm->arch.user_sigp_busy)
> > +		return 0;
> > +
> > +	/*
> > +	 * Just see if the target vcpu exists; the CC3 will be set
> > wherever
> > +	 * the SIGP order is processed directly.
> > +	 */
> > +	if (!dst_vcpu)
> > +		return 0;
> > +
> > +	/* Reset orders will be accepted, regardless if target vcpu is
> > busy */
> > +	if (order_code == SIGP_INITIAL_CPU_RESET ||
> > +	    order_code == SIGP_CPU_RESET)
> > +		return 0;
> > +
> > +	/* Orders that affect multiple vcpus should not flag one vcpu
> > busy */
> > +	if (order_code == SIGP_SET_ARCHITECTURE)
> > +		return 0;
> > +
> > +	/* If this fails, the vcpu is already busy processing another
> > SIGP */
> > +	if (!kvm_s390_vcpu_set_sigp_busy(dst_vcpu))
> > +		return -EBUSY;
> > +
> > +	return 0;
> > +}
> > +
> >  static int handle_sigp_order_in_user_space(struct kvm_vcpu *vcpu,
> > u8 order_code,
> >  					   u16 cpu_addr)
> >  {
> > @@ -408,6 +441,13 @@ int kvm_s390_handle_sigp(struct kvm_vcpu
> > *vcpu)
> >  		return kvm_s390_inject_program_int(vcpu,
> > PGM_PRIVILEGED_OP);
> >  
> >  	order_code = kvm_s390_get_base_disp_rs(vcpu, NULL);
> > +
> > +	rc = handle_sigp_order_busy(vcpu, order_code, cpu_addr);
> > +	if (rc) {
> > +		kvm_s390_set_psw_cc(vcpu, SIGP_CC_BUSY);
> > +		return 0;
> > +	}
> > +
> >  	if (handle_sigp_order_in_user_space(vcpu, order_code,
> > cpu_addr))
> >  		return -EOPNOTSUPP;
> 
> After looking at the QEMU side, I wonder if we should instead:
> 
> a) Let user space always set/reset SIGP busy. Don't set/reset it in
> the
>    kernel automatically. All "heavy weight" SIGP orders are carried
> out
>    in user space nowadays either way.
> b) Reject all in-kernel SIGP orders targeting a CPU if marked BUSY by
>    user space. (i.e., SIGP SENSE)
> c) Don't reject SIGP orders that will be handled in QEMU from the
>    kernel. Just let user space deal with it -- especially with the
>    "problematic" ones like RESET and SET_ARCHITECTURE.
> 
> For example, we don't care about concurrent SIGP SENSE. We only care
> about "lightweight" SIGP orders with concurrent "heavy weight" SIGP
> orders.

I very much care about concurrent SIGP SENSE (a "lightweight" order
handled in-kernel) and how that interacts with the "heavy weight" SIGP
orders (handled in userspace). SIGP SENSE might return CC0 (accepted)
if a vcpu is operating normally, or CC1 (status stored) with status
bits indicating an external call is pending and/or the vcpu is stopped.
This means that the actual response will depend on whether userspace
has picked up the sigp order and processed it or not. Giving CC0 when
userspace is actively processing a SIGP STOP/STOP AND STORE STATUS
would be misleading for the SIGP SENSE. (Did the STOP order get lost?
Failed? Not yet dispatched? Blocked?)

Meanwhile, the Principles of Operation (SA22-7832-12) page 4-95
describes a list of orders that would generate a CC2 (busy) when the
order is still "active" in userspace:

"""
A previously issued start, stop, restart, stop-
and-store-status, set-prefix, store-status-at-
address order, or store-additional-status-at-
address has been accepted by the
addressed CPU, and execution of the func-
tion requested by the order has not yet been
completed.
...
If the currently specified order is sense, external
call, emergency signal, start, stop, restart, stop
and store status, set prefix, store status at
address, set architecture, set multithreading, or
store additional status at address, then the order
is rejected, and condition code 2 is set. If the cur-
rently specified order is one of the reset orders,
or an unassigned or not-implemented order, the
order code is interpreted as described in “Status
Bits” on page 4-96.
"""

(There is another entry for the reset orders; not copied here for sake
of keeping my novella manageable.)

So, you're right that I could be more precise in terms of how QEMU
handles a SIGP order while it's already busy handling one, and limit
the CC2 from the kernel to only those in-kernel orders. But I did say
I took this simplified approach in the cover letter. :)

Regardless, because of the above I really do want/need a way to give
the kernel a clue that userspace is doing something, without waiting
for userspace to say "hey, that order you kicked back to me? I'm
working on it now, I'll let you know when it's done!" Otherwise, SIGP
SENSE (and other lightweight friends) is still racing with the receipt
of a "start the sigp" ioctl.

> 
> This should simplify this code and avoid having to clear the the BUSY
> flag in QEMU (that might be bogus) when detecting another BUSY
> situation
> (the trylock thingy, see my QEMU reply).

Still digesting that one. Regarding the potential bogus indicator, at
one point I had the kernel recording the SIGP order itself that was
sent to a vcpu, similar to QEMU's cpu->env.sigp_order (which is only
used for the STOP variants?), instead of the simple toggle used here. I
found myself not really caring WHAT the order was, just that QEMU was
doing SOMETHING, which is why it's just on/off.

But it does mean that the QEMU patch is rather unpleasant, and maybe
knowing what order is being reset helps clean up that side of things?

>  The downside is that we have to
> issue yet another IOCTL to set the CPU busy for SIGP -- not sure if
> we
> really care.




* Re: [RFC PATCH v2 2/2] KVM: s390: Extend the USER_SIGP capability
  2021-11-04 14:33     ` Eric Farman
@ 2021-11-04 14:59       ` David Hildenbrand
  2021-11-04 15:54         ` Eric Farman
  0 siblings, 1 reply; 8+ messages in thread
From: David Hildenbrand @ 2021-11-04 14:59 UTC (permalink / raw)
  To: Eric Farman, Christian Borntraeger, Janosch Frank,
	Claudio Imbrenda, Thomas Huth
  Cc: Heiko Carstens, Vasily Gorbik, Paolo Bonzini, Jonathan Corbet,
	kvm, linux-s390

>> For example, we don't care about concurrent SIGP SENSE. We only care
>> about "lightweight" SIGP orders with concurrent "heavy weight" SIGP
>> orders.
> 
> I very much care about concurrent SIGP SENSE (a "lightweight" order
> handled in-kernel) and how that interacts with the "heavy weight" SIGP
> orders (handled in userspace). SIGP SENSE might return CC0 (accepted)
> if a vcpu is operating normally, or CC1 (status stored) with status
> bits indicating an external call is pending and/or the vcpu is stopped.
> This means that the actual response will depend on whether userspace
> has picked up the sigp order and processed it or not. Giving CC0 when
> userspace is actively processing a SIGP STOP/STOP AND STORE STATUS
> would be misleading for the SIGP SENSE. (Did the STOP order get lost?
> Failed? Not yet dispatched? Blocked?)

But that would only be visible when concurrently SIGP STOP'ing from one
VCPU and SIGP SENSE'ing from another VCPU. But in that case, there are
already no guarantees, because it's inherently racy:

VCPU #2: SIGP STOP #3
VCPU #1: SIGP SENSE #3

There is no guarantee who ends up first
a) In the kernel
b) On the final destination (SENSE -> kernel; STOP -> QEMU)

They could be rescheduled/delayed in various ways.


The important part is that orders from the *same* CPU are properly
handled, right?

VCPU #1: SIGP STOP #3
VCPU #1: SIGP SENSE #3

SENSE must return BUSY in case the STOP was not successful yet, correct?

And that can be achieved by setting VCPU #3 busy when landing in
user space to trigger the SIGP STOP, before returning to the kernel and
processing the SIGP SENSE.


Or am I missing something important?

> 
> Meanwhile, the Principles of Operation (SA22-7832-12) page 4-95
> describes a list of orders that would generate a CC2 (busy) when the
> order is still "active" in userspace:
> 
> """
> A previously issued start, stop, restart, stop-
> and-store-status, set-prefix, store-status-at-
> address order, or store-additional-status-at-
> address has been accepted by the
> addressed CPU, and execution of the func-
> tion requested by the order has not yet been
> completed.

Right, but my take is that the order has not been accepted by the target
CPU before we're actually in user space to, e.g., trigger SIGP STOP.

> ...
> If the currently specified order is sense, external
> call, emergency signal, start, stop, restart, stop
> and store status, set prefix, store status at
> address, set architecture, set multithreading, or
> store additional status at address, then the order
> is rejected, and condition code 2 is set. If the cur-
> rently specified order is one of the reset orders,
> or an unassigned or not-implemented order, the
> order code is interpreted as described in “Status
> Bits” on page 4-96.
> """
> 
> (There is another entry for the reset orders; not copied here for sake
> of keeping my novella manageable.)

Yes, these have to be special because we can have CPUs that never stop
(endless program interruption stream).

> 
> So, you're right that I could be more precise in terms of how QEMU handles
> a SIGP order while it's already busy handling one, and only limit the
> CC2 from the kernel to those in-kernel orders. But I did say I took
> this simplified approach in the cover letter. :)
> 
> Regardless, because of the above I really do want/need a way to give
> the kernel a clue that userspace is doing something, without waiting
> for userspace to say "hey, that order you kicked back to me? I'm
> working on it now, I'll let you know when it's done!" Otherwise, SIGP
> SENSE (and other lightweight friends) is still racing with the receipt
> of a "start the sigp" ioctl.

And my point is that it's only visible when two VCPUs are involved and
there are absolutely no guarantees regarding that. (see my first reply)


-- 
Thanks,

David / dhildenb



* Re: [RFC PATCH v2 2/2] KVM: s390: Extend the USER_SIGP capability
  2021-11-04 14:59       ` David Hildenbrand
@ 2021-11-04 15:54         ` Eric Farman
  2021-11-08  8:57           ` David Hildenbrand
  0 siblings, 1 reply; 8+ messages in thread
From: Eric Farman @ 2021-11-04 15:54 UTC (permalink / raw)
  To: David Hildenbrand, Christian Borntraeger, Janosch Frank,
	Claudio Imbrenda, Thomas Huth
  Cc: Heiko Carstens, Vasily Gorbik, Paolo Bonzini, Jonathan Corbet,
	kvm, linux-s390

On Thu, 2021-11-04 at 15:59 +0100, David Hildenbrand wrote:
> > > For example, we don't care about concurrent SIGP SENSE. We only
> > > care
> > > about "lightweight" SIGP orders with concurrent "heavy weight"
> > > SIGP
> > > orders.
> > 
> > I very much care about concurrent SIGP SENSE (a "lightweight" order
> > handled in-kernel) and how that interacts with the "heavy weight"
> > SIGP
> > orders (handled in userspace). SIGP SENSE might return CC0
> > (accepted)
> > if a vcpu is operating normally, or CC1 (status stored) with status
> > bits indicating an external call is pending and/or the vcpu is
> > stopped.
> > This means that the actual response will depend on whether
> > userspace
> > has picked up the sigp order and processed it or not. Giving CC0
> > when
> > userspace is actively processing a SIGP STOP/STOP AND STORE STATUS
> > would be misleading for the SIGP SENSE. (Did the STOP order get
> > lost?
> > Failed? Not yet dispatched? Blocked?)
> 
> But that would only be visible when concurrently SIGP STOP'ing from one
> VCPU and SIGP SENSE'ing from another VCPU. But in that case, there
> are
> already no guarantees, because it's inherently racy:
> 
> VCPU #2: SIGP STOP #3
> VCPU #1: SIGP SENSE #3
> 

Is it inherently racy? QEMU has a global "one SIGP at a time,
regardless of vcpu count" mechanism, so that it gets serialized at that
level. POPS says an order is rejected (BUSY) if the "access path to a
cpu is processing another order", and I would imagine that KVM is
acting as that access path to the vcpu. The delineation between
kernelspace and userspace should be irrelevant to whether parallel
orders are serialized (in QEMU via USER_SIGP) or not (!USER_SIGP or
"lightweight" orders).
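As a minimal sketch of such a serialization gate (purely illustrative; the names and structure are made up, not QEMU's actual code), an order arriving while another is in flight would be rejected with CC2:

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical "one SIGP at a time, regardless of vcpu count" gate. */
enum { SIGP_CC_ORDER_CODE_ACCEPTED = 0, SIGP_CC_BUSY = 2 };

static pthread_mutex_t sigp_mutex = PTHREAD_MUTEX_INITIALIZER;

static int handle_sigp(void (*do_order)(void))
{
    /* Another order is being processed: reject with BUSY. */
    if (pthread_mutex_trylock(&sigp_mutex) != 0)
        return SIGP_CC_BUSY;
    if (do_order)
        do_order();
    pthread_mutex_unlock(&sigp_mutex);
    return SIGP_CC_ORDER_CODE_ACCEPTED;
}
```

The trylock is the important part: the handler never blocks, it reports the busy condition back to the guest instead.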

> There is no guarantee who ends up first
> a) In the kernel
> b) On the final destination (SENSE -> kernel; STOP -> QEMU)
> 
> They could be rescheduled/delayed in various ways.
> 
> 
> The important part is that orders from the *same* CPU are properly
> handled, right?
> 
> VCPU #1: SIGP STOP #3
> VCPU #1: SIGP SENSE #3
> 
> SENSE must return BUSY in case the STOP was not successful yet,
> correct?

It's not a matter of whether STOP is/not successful. If the vcpu is
actively processing a STOP, then the SENSE gets a BUSY. But there's no
code today to do that for the SENSE, which is of course why I'm here.
:)

> 
> And that can be achieved by setting VCPU #3 busy when landing in
> user space to trigger the SIGP STOP, before returning to the kernel
> and
> processing the SIGP SENSE.
> 

I will try it, but I am not convinced.

> 
> Or am I missing something important?
> 
> > Meanwhile, the Principles of Operation (SA22-7832-12) page 4-95
> > describes a list of orders that would generate a CC2 (busy) when
> > the
> > order is still "active" in userspace:
> > 
> > """
> > A previously issued start, stop, restart, stop-
> > and-store-status, set-prefix, store-status-at-
> > address order, or store-additional-status-at-
> > address has been accepted by the
> > addressed CPU, and execution of the func-
> > tion requested by the order has not yet been
> > completed.
> 
> Right, but my take is that the order has not been accepted by the
> target
> CPU before we're actually in user space to, e.g., trigger SIGP STOP.

Not accepted, yes, but also not rejected either. We're still trying to
figure out who's processing the order and getting it to the addressed
cpu.

> 
> > ...
> > If the currently specified order is sense, external
> > call, emergency signal, start, stop, restart, stop
> > and store status, set prefix, store status at
> > address, set architecture, set multithreading, or
> > store additional status at address, then the order
> > is rejected, and condition code 2 is set. If the cur-
> > rently specified order is one of the reset orders,
> > or an unassigned or not-implemented order, the
> > order code is interpreted as described in “Status
> > Bits” on page 4-96.
> > """
> > 
> > (There is another entry for the reset orders; not copied here for
> > sake
> > of keeping my novella manageable.)
> 
> Yes, these have to be special because we can have CPUs that never
> stop
> (endless program interruption stream).
> 
> > So, you're right that I could be more precise in terms of how QEMU
> > handles
> > a SIGP order while it's already busy handling one, and only limit
> > the
> > CC2 from the kernel to those in-kernel orders. But I did say I took
> > this simplified approach in the cover letter. :)
> > 
> > Regardless, because of the above I really do want/need a way to
> > give
> > the kernel a clue that userspace is doing something, without
> > waiting
> > for userspace to say "hey, that order you kicked back to me? I'm
> > working on it now, I'll let you know when it's done!" Otherwise,
> > SIGP
> > SENSE (and other lightweight friends) is still racing with the
> > receipt
> > of a "start the sigp" ioctl.
> 
> And my point is that it's only visible when two VCPUs are involved
> and
> there are absolutely no guarantees regarding that. (see my first
> reply)
> 
> 



* Re: [RFC PATCH v2 2/2] KVM: s390: Extend the USER_SIGP capability
  2021-11-04 15:54         ` Eric Farman
@ 2021-11-08  8:57           ` David Hildenbrand
  0 siblings, 0 replies; 8+ messages in thread
From: David Hildenbrand @ 2021-11-08  8:57 UTC (permalink / raw)
  To: Eric Farman, Christian Borntraeger, Janosch Frank,
	Claudio Imbrenda, Thomas Huth
  Cc: Heiko Carstens, Vasily Gorbik, Paolo Bonzini, Jonathan Corbet,
	kvm, linux-s390

On 04.11.21 16:54, Eric Farman wrote:
> On Thu, 2021-11-04 at 15:59 +0100, David Hildenbrand wrote:
>>>> For example, we don't care about concurrent SIGP SENSE. We only
>>>> care
>>>> about "lightweight" SIGP orders with concurrent "heavy weight"
>>>> SIGP
>>>> orders.
>>>
>>> I very much care about concurrent SIGP SENSE (a "lightweight" order
>>> handled in-kernel) and how that interacts with the "heavy weight"
>>> SIGP
>>> orders (handled in userspace). SIGP SENSE might return CC0
>>> (accepted)
>>> if a vcpu is operating normally, or CC1 (status stored) with status
>>> bits indicating an external call is pending and/or the vcpu is
>>> stopped.
>>> This means that the actual response will depend on whether
>>> userspace
>>> has picked up the sigp order and processed it or not. Giving CC0
>>> when
>>> userspace is actively processing a SIGP STOP/STOP AND STORE STATUS
>>> would be misleading for the SIGP SENSE. (Did the STOP order get
>>> lost?
>>> Failed? Not yet dispatched? Blocked?)
>>
>> But that would only be visible when concurrently SIGP STOP'ing from one
>> VCPU and SIGP SENSE'ing from another VCPU. But in that case, there
>> are
>> already no guarantees, because it's inherently racy:
>>
>> VCPU #2: SIGP STOP #3
>> VCPU #1: SIGP SENSE #3
>>
> 
> Is it inherently racy? QEMU has a global "one SIGP at a time,
> regardless of vcpu count" mechanism, so that it gets serialized at that
> level. POPS says an order is rejected (BUSY) if the "access path to a
> cpu is processing another order", and I would imagine that KVM is
> acting as that access path to the vcpu. The delineation between
> kernelspace and userspace should be irrelevant to whether parallel
> orders are serialized (in QEMU via USER_SIGP) or not (!USER_SIGP or
> "lightweight" orders).

There is no real way for a guest to enforce the execution order of

VCPU #2: SIGP STOP #3
VCPU #1: SIGP SENSE #3

or

VCPU #1: SIGP SENSE #3
VCPU #2: SIGP STOP #3

without additional synchronization.

There could be random delays in the instruction execution at any point
in time. So the SENSE on #1 might observe "stopped", "not stopped", or
"busy" randomly, because it's inherently racy.


Of course, one could implement some synchronization on top:

VCPU #2: SIGP STOP #3
# VCPU #2 instructs #1 to SIGP SENSE #3
VCPU #1: SIGP SENSE #3
# VCPU #2 waits for the SIGP SENSE #3 result from #1
VCPU #2: SIGP SENSE #3

Then, we have to make sure that it cannot happen that #1 observes "not
busy" and #2 observes "busy". But, to implement something like that, #2
has to execute additional instructions to perform the synchronization.

So after SIGP STOP returns on #2 and #2 was able to execute new
instructions, we have to make sure that SIGP SENSE of #3 returns "busy"
on all VCPUs until #3 finished the SIGP STOP.
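That requirement could be sketched as an in-kernel SENSE path consulting a per-vcpu busy indicator: while a STOP is still in flight in userspace, SENSE from *any* vcpu reports busy. The CC values and field names below are illustrative, not the real kvm-s390 code:

```c
#include <stdatomic.h>
#include <stdbool.h>

enum { CC_ACCEPTED = 0, CC_STATUS_STORED = 1, CC_BUSY = 2 };

struct vcpu {
    atomic_bool sigp_busy;  /* set before exiting to QEMU for e.g. STOP */
    bool stopped;
};

static int sigp_sense(struct vcpu *dst)
{
    if (atomic_load(&dst->sigp_busy))
        return CC_BUSY;             /* order accepted, not yet completed */
    if (dst->stopped)
        return CC_STATUS_STORED;    /* status bits would show "stopped" */
    return CC_ACCEPTED;             /* operating normally */
}
```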

> 
>> There is no guarantee who ends up first
>> a) In the kernel
>> b) On the final destination (SENSE -> kernel; STOP -> QEMU)
>>
>> They could be rescheduled/delayed in various ways.
>>
>>
>> The important part is that orders from the *same* CPU are properly
>> handled, right?
>>
>> VCPU #1: SIGP STOP #3
>> VCPU #1: SIGP SENSE #3
>>
>> SENSE must return BUSY in case the STOP was not successful yet,
>> correct?
> 
> It's not a matter of whether STOP is/not successful. If the vcpu is

Right, I meant "accepted but not fully processed yet".

> actively processing a STOP, then the SENSE gets a BUSY. But there's no
> code today to do that for the SENSE, which is of course why I'm here.
> :)

Right, and the only problematic SIGP orders are really SIGP STOP*,
because these are the only ones that will get processed asynchronously
-- the sending VCPU can return and execute new instructions without the
SIGP STOP order being fully processed.


-- 
Thanks,

David / dhildenb




Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-02 19:46 [RFC PATCH v2 0/2] s390x: Improvements to SIGP handling [KVM] Eric Farman
2021-11-02 19:46 ` [RFC PATCH v2 1/2] Capability/IOCTL/Documentation Eric Farman
2021-11-02 19:46 ` [RFC PATCH v2 2/2] KVM: s390: Extend the USER_SIGP capability Eric Farman
2021-11-04  9:06   ` David Hildenbrand
2021-11-04 14:33     ` Eric Farman
2021-11-04 14:59       ` David Hildenbrand
2021-11-04 15:54         ` Eric Farman
2021-11-08  8:57           ` David Hildenbrand
