* [PATCH v9 0/3] s390x: KVM: CPU Topology
@ 2022-05-06  9:24 Pierre Morel
  2022-05-06  9:24 ` [PATCH v9 1/3] s390x: KVM: ipte lock for SCA access should be contained in KVM Pierre Morel
                   ` (3 more replies)
  0 siblings, 4 replies; 29+ messages in thread
From: Pierre Morel @ 2022-05-06  9:24 UTC (permalink / raw)
  To: kvm
  Cc: linux-s390, linux-kernel, borntraeger, frankja, cohuck, david,
	thuth, imbrenda, hca, gor, pmorel, wintera, seiden, nrb

Hi all,

This new spin adds a bug fix and a simplification of ipte_lock
to the series implementing interpretation of the PTF instruction
and handling of the STSI instruction.

The series provides:
1- interception of the STSI instruction forwarding the CPU topology
2- interpretation of the PTF instruction
3- a KVM capability for the userland hypervisor to ask KVM to
   set up PTF interpretation.


0- Foreword

The S390 CPU topology is reported using two instructions:
- PTF, to check whether the CPU topology has changed since the last
  PTF instruction or a subsystem reset.
- STSI, to get the topology information, consisting of the topology
  of the CPU inside the sockets, of the sockets inside the books etc.

The PTF(2) instruction reports a change if the STSI(15.1.2) instruction
would report a difference from the last STSI(15.1.2) instruction*.
With the SIE interpretation, the PTF(2) instruction will report a
change to the guest if the host sets the SCA.MTCR bit.

*The STSI(15.1.2) instruction reports:
- The addresses of the cores within a socket
- The polarization of the cores
- The CPU type of the cores
- Whether the cores are dedicated or not
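
As a rough illustration of the reporting described above, here is a
minimal user-space model, not kernel code. The bit value is the one
used in patch 2/3; struct sca_model, host_set_mtcr() and
guest_ptf_check() are hypothetical names that only exist for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit value taken from patch 2/3 of this series. */
#define SCA_UTILITY_MTCR 0x8000

/* Hypothetical stand-in for the SCA utility field; not the kernel layout. */
struct sca_model {
	uint16_t utility;
};

/* Host side: flag a pending topology change for the guest. */
static void host_set_mtcr(struct sca_model *sca)
{
	sca->utility |= SCA_UTILITY_MTCR;
}

/*
 * Guest side: model of an interpreted PTF(2).
 * Returns true if a change report was pending and clears the report,
 * so a second call without a new host update returns false.
 */
static bool guest_ptf_check(struct sca_model *sca)
{
	bool changed = (sca->utility & SCA_UTILITY_MTCR) != 0;

	sca->utility &= (uint16_t)~SCA_UTILITY_MTCR;
	return changed;
}
```

In this model the report is sticky until the guest consumes it, which
matches the "cleared during the interpretation of PTF" behaviour
described in section 2.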

We decided to implement the CPU topology for S390 in several steps:

- first, reporting of CPU hotplug
- then, modification of the CPU mask inside sockets

In future development we will provide:

- handling of shared CPUs
- reporting of the CPU Type
- reporting of the polarization


1- Interception of STSI

To provide Topology information to the guest through the STSI
instruction, we forward STSI with Function Code 15 to the
userland hypervisor which will take care to provide the right
information to the guest.
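
The function code filtering that patch 2/3 adds to handle_stsi() can be
summarized by the following stand-alone sketch. stsi_fc_accepted() is a
hypothetical helper, and has_topology stands in for
test_kvm_facility(vcpu->kvm, 11).

```c
#include <stdbool.h>

/*
 * Returns true if STSI handling continues for this function code,
 * false if the guest is answered with "no data".
 * has_topology models test_kvm_facility(vcpu->kvm, 11).
 */
static bool stsi_fc_accepted(unsigned int fc, bool has_topology)
{
	/* Function codes above 3 are rejected, except 15. */
	if (fc > 3 && fc != 15)
		return false;

	/* fc 15 is only provided together with the topology facility. */
	if (fc == 15 && !has_topology)
		return false;

	return true;
}
```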

To let the guest use both the PTF instruction, to check if a topology
change occurred, and the STSI(15.x.x) instruction, we add a new KVM
capability to enable the topology facility.
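
From the userland side, enabling the capability could look roughly like
the sketch below. It assumes vm_fd is an open KVM VM file descriptor and
that the capability number is the one proposed in patch 2/3 (it may
differ once merged). enable_cpu_topology() is a hypothetical helper.

```c
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

#ifndef KVM_CAP_S390_CPU_TOPOLOGY
#define KVM_CAP_S390_CPU_TOPOLOGY 216 /* value proposed by patch 2/3 */
#endif

/*
 * Hypothetical helper: returns 0 if the capability was enabled,
 * -1 if it is not available or could not be enabled.
 */
static int enable_cpu_topology(int vm_fd)
{
	struct kvm_enable_cap cap;

	/* KVM_CHECK_EXTENSION returns a positive value if supported. */
	if (ioctl(vm_fd, KVM_CHECK_EXTENSION, KVM_CAP_S390_CPU_TOPOLOGY) <= 0)
		return -1;

	memset(&cap, 0, sizeof(cap));
	cap.cap = KVM_CAP_S390_CPU_TOPOLOGY;

	return ioctl(vm_fd, KVM_ENABLE_CAP, &cap) ? -1 : 0;
}
```

With this series the call has to happen before the first vCPU is
created, since the enable path returns -EBUSY once kvm->created_vcpus
is non-zero.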

2- Interpretation of PTF with FC(2)

The PTF instruction will report a topology change if there is any change
with a previous STSI(15.1.2) SYSIB.
Changes inside a STSI(15.1.2) SYSIB occur if CPU bits are set or cleared
inside the CPU Topology List Entry CPU mask field, which happens with
changes in CPU polarization, dedication, CPU types and adding or
removing CPUs in a socket.

The reporting to the guest is done using the Multiprocessor
Topology-Change-Report (MTCR) bit of the utility entry of the guest's
SCA which will be cleared during the interpretation of PTF.

To check if the topology has been modified we use a new field of the
arch vCPU, prev_cpu, to save the previous real CPU ID at the end of a
schedule and verify on next schedule that the CPU used is in the same
socket. This field is initialized to -1 on vCPU creation.
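
The check can be pictured with this small user-space model. The
cpu_to_socket table is a made-up host layout, and topology_changed() is
a hypothetical analogue of kvm_s390_topology_changed() without the
facility test.

```c
#include <stdbool.h>

/* Marker for a vCPU that has never been scheduled, as in patch 2/3. */
#define TOPOLOGY_NEW_CPU (-1)

/* Hypothetical host layout: index is the host CPU ID, value its socket. */
static const int cpu_to_socket[] = { 0, 0, 0, 0, 1, 1, 1, 1 };

/*
 * prev_cpu: host CPU saved when the vCPU was last scheduled out.
 * cpu:      host CPU the vCPU is being scheduled on now.
 */
static bool topology_changed(int prev_cpu, int cpu)
{
	/* A vCPU that never ran counts as a hotplugged CPU. */
	if (prev_cpu == TOPOLOGY_NEW_CPU)
		return true;

	/* Moving within the same socket is not a change. */
	return cpu_to_socket[prev_cpu] != cpu_to_socket[cpu];
}
```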


Regards,
Pierre

Pierre Morel (3):
  s390x: KVM: ipte lock for SCA access should be contained in KVM
  s390x: KVM: guest support for topology function
  s390x: KVM: resetting the Topology-Change-Report

 Documentation/virt/kvm/api.rst   |  16 ++++
 arch/s390/include/asm/kvm_host.h |  12 ++-
 arch/s390/include/uapi/asm/kvm.h |   5 ++
 arch/s390/kvm/gaccess.c          |  96 +++++++++++------------
 arch/s390/kvm/gaccess.h          |   6 +-
 arch/s390/kvm/kvm-s390.c         | 128 ++++++++++++++++++++++++++++++-
 arch/s390/kvm/kvm-s390.h         |  25 ++++++
 arch/s390/kvm/priv.c             |  20 +++--
 arch/s390/kvm/vsie.c             |   3 +
 include/uapi/linux/kvm.h         |   1 +
 10 files changed, 250 insertions(+), 62 deletions(-)

-- 
2.27.0

Changelog:

from v8 to v9

- bug correction in kvm_s390_topology_changed
  (Heiko)

- simplification for ipte_lock/unlock to use kvm
  as arg instead of vcpu and test on sclp.has_siif
  instead of the SIE ECA_SII.
  (David)

- use of a single value for reporting if the
  topology changed instead of a structure
  (David)

from v7 to v8

- implement reset handling
  (Janosch)

- change the way to check if the topology changed
  (Nico, Heiko)

from v6 to v7

- rebase

from v5 to v6

- make the subject more accurate
  (Claudio)

- Change the kvm_s390_set_mtcr() function to have vcpu in the name
  (Janosch)

- Replace the checks on ECB_PTF with the check of facility 11
  (Janosch)

- modify kvm_arch_vcpu_load, move the check in a function in
  the header file
  (Janosch)

- No magic number: replace the "new cpu value" of -1 with a define
  (Janosch)

- Make the checks for STSI validity clearer
  (Janosch)

from v4 to v5

- modify the way KVM_CAP is tested to be OK with vsie
  (David)

from v3 to v4

- squash both patches
  (David)

- Added Documentation
  (David)

- Modified the detection for new vCPUs
  (Pierre)

from v2 to v3

- use PTF interpretation
  (Christian)

- optimize arch_update_cpu_topology using PTF
  (Pierre)

from v1 to v2:

- Add a KVM capability to let QEMU know we support PTF and STSI 15
  (David)

- check KVM facility 11 before accepting STSI fc 15
  (David)

- handle all we can in userland
  (David)

- add tracing to STSI fc 15
  (Connie)



* [PATCH v9 1/3] s390x: KVM: ipte lock for SCA access should be contained in KVM
  2022-05-06  9:24 [PATCH v9 0/3] s390x: KVM: CPU Topology Pierre Morel
@ 2022-05-06  9:24 ` Pierre Morel
  2022-05-12  9:08   ` David Hildenbrand
  2022-05-12 11:32   ` Janosch Frank
  2022-05-06  9:24 ` [PATCH v9 2/3] s390x: KVM: guest support for topology function Pierre Morel
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 29+ messages in thread
From: Pierre Morel @ 2022-05-06  9:24 UTC (permalink / raw)
  To: kvm
  Cc: linux-s390, linux-kernel, borntraeger, frankja, cohuck, david,
	thuth, imbrenda, hca, gor, pmorel, wintera, seiden, nrb

The former check to choose between SIIF or not SIIF can be done
using sclp.has_siif instead of accessing per vCPU structures.

When accessing the SCA, ipte_lock and ipte_unlock do not need
to access any vcpu structures but only the KVM structure.

Let's simplify the ipte handling.

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
---
 arch/s390/kvm/gaccess.c | 96 ++++++++++++++++++++---------------------
 arch/s390/kvm/gaccess.h |  6 +--
 arch/s390/kvm/priv.c    |  6 +--
 3 files changed, 54 insertions(+), 54 deletions(-)

diff --git a/arch/s390/kvm/gaccess.c b/arch/s390/kvm/gaccess.c
index d53a183c2005..0e1f6dd31882 100644
--- a/arch/s390/kvm/gaccess.c
+++ b/arch/s390/kvm/gaccess.c
@@ -262,77 +262,77 @@ struct aste {
 	/* .. more fields there */
 };
 
-int ipte_lock_held(struct kvm_vcpu *vcpu)
+int ipte_lock_held(struct kvm *kvm)
 {
-	if (vcpu->arch.sie_block->eca & ECA_SII) {
+	if (sclp.has_siif) {
 		int rc;
 
-		read_lock(&vcpu->kvm->arch.sca_lock);
-		rc = kvm_s390_get_ipte_control(vcpu->kvm)->kh != 0;
-		read_unlock(&vcpu->kvm->arch.sca_lock);
+		read_lock(&kvm->arch.sca_lock);
+		rc = kvm_s390_get_ipte_control(kvm)->kh != 0;
+		read_unlock(&kvm->arch.sca_lock);
 		return rc;
 	}
-	return vcpu->kvm->arch.ipte_lock_count != 0;
+	return kvm->arch.ipte_lock_count != 0;
 }
 
-static void ipte_lock_simple(struct kvm_vcpu *vcpu)
+static void ipte_lock_simple(struct kvm *kvm)
 {
 	union ipte_control old, new, *ic;
 
-	mutex_lock(&vcpu->kvm->arch.ipte_mutex);
-	vcpu->kvm->arch.ipte_lock_count++;
-	if (vcpu->kvm->arch.ipte_lock_count > 1)
+	mutex_lock(&kvm->arch.ipte_mutex);
+	kvm->arch.ipte_lock_count++;
+	if (kvm->arch.ipte_lock_count > 1)
 		goto out;
 retry:
-	read_lock(&vcpu->kvm->arch.sca_lock);
-	ic = kvm_s390_get_ipte_control(vcpu->kvm);
+	read_lock(&kvm->arch.sca_lock);
+	ic = kvm_s390_get_ipte_control(kvm);
 	do {
 		old = READ_ONCE(*ic);
 		if (old.k) {
-			read_unlock(&vcpu->kvm->arch.sca_lock);
+			read_unlock(&kvm->arch.sca_lock);
 			cond_resched();
 			goto retry;
 		}
 		new = old;
 		new.k = 1;
 	} while (cmpxchg(&ic->val, old.val, new.val) != old.val);
-	read_unlock(&vcpu->kvm->arch.sca_lock);
+	read_unlock(&kvm->arch.sca_lock);
 out:
-	mutex_unlock(&vcpu->kvm->arch.ipte_mutex);
+	mutex_unlock(&kvm->arch.ipte_mutex);
 }
 
-static void ipte_unlock_simple(struct kvm_vcpu *vcpu)
+static void ipte_unlock_simple(struct kvm *kvm)
 {
 	union ipte_control old, new, *ic;
 
-	mutex_lock(&vcpu->kvm->arch.ipte_mutex);
-	vcpu->kvm->arch.ipte_lock_count--;
-	if (vcpu->kvm->arch.ipte_lock_count)
+	mutex_lock(&kvm->arch.ipte_mutex);
+	kvm->arch.ipte_lock_count--;
+	if (kvm->arch.ipte_lock_count)
 		goto out;
-	read_lock(&vcpu->kvm->arch.sca_lock);
-	ic = kvm_s390_get_ipte_control(vcpu->kvm);
+	read_lock(&kvm->arch.sca_lock);
+	ic = kvm_s390_get_ipte_control(kvm);
 	do {
 		old = READ_ONCE(*ic);
 		new = old;
 		new.k = 0;
 	} while (cmpxchg(&ic->val, old.val, new.val) != old.val);
-	read_unlock(&vcpu->kvm->arch.sca_lock);
-	wake_up(&vcpu->kvm->arch.ipte_wq);
+	read_unlock(&kvm->arch.sca_lock);
+	wake_up(&kvm->arch.ipte_wq);
 out:
-	mutex_unlock(&vcpu->kvm->arch.ipte_mutex);
+	mutex_unlock(&kvm->arch.ipte_mutex);
 }
 
-static void ipte_lock_siif(struct kvm_vcpu *vcpu)
+static void ipte_lock_siif(struct kvm *kvm)
 {
 	union ipte_control old, new, *ic;
 
 retry:
-	read_lock(&vcpu->kvm->arch.sca_lock);
-	ic = kvm_s390_get_ipte_control(vcpu->kvm);
+	read_lock(&kvm->arch.sca_lock);
+	ic = kvm_s390_get_ipte_control(kvm);
 	do {
 		old = READ_ONCE(*ic);
 		if (old.kg) {
-			read_unlock(&vcpu->kvm->arch.sca_lock);
+			read_unlock(&kvm->arch.sca_lock);
 			cond_resched();
 			goto retry;
 		}
@@ -340,15 +340,15 @@ static void ipte_lock_siif(struct kvm_vcpu *vcpu)
 		new.k = 1;
 		new.kh++;
 	} while (cmpxchg(&ic->val, old.val, new.val) != old.val);
-	read_unlock(&vcpu->kvm->arch.sca_lock);
+	read_unlock(&kvm->arch.sca_lock);
 }
 
-static void ipte_unlock_siif(struct kvm_vcpu *vcpu)
+static void ipte_unlock_siif(struct kvm *kvm)
 {
 	union ipte_control old, new, *ic;
 
-	read_lock(&vcpu->kvm->arch.sca_lock);
-	ic = kvm_s390_get_ipte_control(vcpu->kvm);
+	read_lock(&kvm->arch.sca_lock);
+	ic = kvm_s390_get_ipte_control(kvm);
 	do {
 		old = READ_ONCE(*ic);
 		new = old;
@@ -356,25 +356,25 @@ static void ipte_unlock_siif(struct kvm_vcpu *vcpu)
 		if (!new.kh)
 			new.k = 0;
 	} while (cmpxchg(&ic->val, old.val, new.val) != old.val);
-	read_unlock(&vcpu->kvm->arch.sca_lock);
+	read_unlock(&kvm->arch.sca_lock);
 	if (!new.kh)
-		wake_up(&vcpu->kvm->arch.ipte_wq);
+		wake_up(&kvm->arch.ipte_wq);
 }
 
-void ipte_lock(struct kvm_vcpu *vcpu)
+void ipte_lock(struct kvm *kvm)
 {
-	if (vcpu->arch.sie_block->eca & ECA_SII)
-		ipte_lock_siif(vcpu);
+	if (sclp.has_siif)
+		ipte_lock_siif(kvm);
 	else
-		ipte_lock_simple(vcpu);
+		ipte_lock_simple(kvm);
 }
 
-void ipte_unlock(struct kvm_vcpu *vcpu)
+void ipte_unlock(struct kvm *kvm)
 {
-	if (vcpu->arch.sie_block->eca & ECA_SII)
-		ipte_unlock_siif(vcpu);
+	if (sclp.has_siif)
+		ipte_unlock_siif(kvm);
 	else
-		ipte_unlock_simple(vcpu);
+		ipte_unlock_simple(kvm);
 }
 
 static int ar_translation(struct kvm_vcpu *vcpu, union asce *asce, u8 ar,
@@ -1075,7 +1075,7 @@ int access_guest_with_key(struct kvm_vcpu *vcpu, unsigned long ga, u8 ar,
 	try_storage_prot_override = storage_prot_override_applicable(vcpu);
 	need_ipte_lock = psw_bits(*psw).dat && !asce.r;
 	if (need_ipte_lock)
-		ipte_lock(vcpu);
+		ipte_lock(vcpu->kvm);
 	/*
 	 * Since we do the access further down ultimately via a move instruction
 	 * that does key checking and returns an error in case of a protection
@@ -1113,7 +1113,7 @@ int access_guest_with_key(struct kvm_vcpu *vcpu, unsigned long ga, u8 ar,
 		rc = trans_exc(vcpu, rc, ga, ar, mode, prot);
 out_unlock:
 	if (need_ipte_lock)
-		ipte_unlock(vcpu);
+		ipte_unlock(vcpu->kvm);
 	if (nr_pages > ARRAY_SIZE(gpa_array))
 		vfree(gpas);
 	return rc;
@@ -1185,10 +1185,10 @@ int check_gva_range(struct kvm_vcpu *vcpu, unsigned long gva, u8 ar,
 	rc = get_vcpu_asce(vcpu, &asce, gva, ar, mode);
 	if (rc)
 		return rc;
-	ipte_lock(vcpu);
+	ipte_lock(vcpu->kvm);
 	rc = guest_range_to_gpas(vcpu, gva, ar, NULL, length, asce, mode,
 				 access_key);
-	ipte_unlock(vcpu);
+	ipte_unlock(vcpu->kvm);
 
 	return rc;
 }
@@ -1451,7 +1451,7 @@ int kvm_s390_shadow_fault(struct kvm_vcpu *vcpu, struct gmap *sg,
 	 * tables/pointers we read stay valid - unshadowing is however
 	 * always possible - only guest_table_lock protects us.
 	 */
-	ipte_lock(vcpu);
+	ipte_lock(vcpu->kvm);
 
 	rc = gmap_shadow_pgt_lookup(sg, saddr, &pgt, &dat_protection, &fake);
 	if (rc)
@@ -1485,7 +1485,7 @@ int kvm_s390_shadow_fault(struct kvm_vcpu *vcpu, struct gmap *sg,
 	pte.p |= dat_protection;
 	if (!rc)
 		rc = gmap_shadow_page(sg, saddr, __pte(pte.val));
-	ipte_unlock(vcpu);
+	ipte_unlock(vcpu->kvm);
 	mmap_read_unlock(sg->mm);
 	return rc;
 }
diff --git a/arch/s390/kvm/gaccess.h b/arch/s390/kvm/gaccess.h
index 1124ff282012..9408d6cc8e2c 100644
--- a/arch/s390/kvm/gaccess.h
+++ b/arch/s390/kvm/gaccess.h
@@ -440,9 +440,9 @@ int read_guest_real(struct kvm_vcpu *vcpu, unsigned long gra, void *data,
 	return access_guest_real(vcpu, gra, data, len, 0);
 }
 
-void ipte_lock(struct kvm_vcpu *vcpu);
-void ipte_unlock(struct kvm_vcpu *vcpu);
-int ipte_lock_held(struct kvm_vcpu *vcpu);
+void ipte_lock(struct kvm *kvm);
+void ipte_unlock(struct kvm *kvm);
+int ipte_lock_held(struct kvm *kvm);
 int kvm_s390_check_low_addr_prot_real(struct kvm_vcpu *vcpu, unsigned long gra);
 
 /* MVPG PEI indication bits */
diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
index 5beb7a4a11b3..0e8603acc105 100644
--- a/arch/s390/kvm/priv.c
+++ b/arch/s390/kvm/priv.c
@@ -443,7 +443,7 @@ static int handle_ipte_interlock(struct kvm_vcpu *vcpu)
 	vcpu->stat.instruction_ipte_interlock++;
 	if (psw_bits(vcpu->arch.sie_block->gpsw).pstate)
 		return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
-	wait_event(vcpu->kvm->arch.ipte_wq, !ipte_lock_held(vcpu));
+	wait_event(vcpu->kvm->arch.ipte_wq, !ipte_lock_held(vcpu->kvm));
 	kvm_s390_retry_instr(vcpu);
 	VCPU_EVENT(vcpu, 4, "%s", "retrying ipte interlock operation");
 	return 0;
@@ -1472,7 +1472,7 @@ static int handle_tprot(struct kvm_vcpu *vcpu)
 	access_key = (operand2 & 0xf0) >> 4;
 
 	if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_DAT)
-		ipte_lock(vcpu);
+		ipte_lock(vcpu->kvm);
 
 	ret = guest_translate_address_with_key(vcpu, address, ar, &gpa,
 					       GACC_STORE, access_key);
@@ -1509,7 +1509,7 @@ static int handle_tprot(struct kvm_vcpu *vcpu)
 	}
 
 	if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_DAT)
-		ipte_unlock(vcpu);
+		ipte_unlock(vcpu->kvm);
 	return ret;
 }
 
-- 
2.27.0



* [PATCH v9 2/3] s390x: KVM: guest support for topology function
  2022-05-06  9:24 [PATCH v9 0/3] s390x: KVM: CPU Topology Pierre Morel
  2022-05-06  9:24 ` [PATCH v9 1/3] s390x: KVM: ipte lock for SCA access should be contained in KVM Pierre Morel
@ 2022-05-06  9:24 ` Pierre Morel
  2022-05-12  9:24   ` David Hildenbrand
                     ` (2 more replies)
  2022-05-06  9:24 ` [PATCH v9 3/3] s390x: KVM: resetting the Topology-Change-Report Pierre Morel
  2022-05-18 15:26 ` [PATCH v9 0/3] s390x: KVM: CPU Topology Christian Borntraeger
  3 siblings, 3 replies; 29+ messages in thread
From: Pierre Morel @ 2022-05-06  9:24 UTC (permalink / raw)
  To: kvm
  Cc: linux-s390, linux-kernel, borntraeger, frankja, cohuck, david,
	thuth, imbrenda, hca, gor, pmorel, wintera, seiden, nrb

We let the userland hypervisor know if the machine supports the CPU
topology facility using a new KVM capability: KVM_CAP_S390_CPU_TOPOLOGY.

The PTF instruction will report a topology change if there is any change
with a previous STSI_15_1_2 SYSIB.
Changes inside a STSI_15_1_2 SYSIB occur if CPU bits are set or cleared
inside the CPU Topology List Entry CPU mask field, which happens with
changes in CPU polarization, dedication, CPU types and adding or
removing CPUs in a socket.

The reporting to the guest is done using the Multiprocessor
Topology-Change-Report (MTCR) bit of the utility entry of the guest's
SCA which will be cleared during the interpretation of PTF.

To check if the topology has been modified we use a new field of the
arch vCPU to save the previous real CPU ID at the end of a schedule
and verify on next schedule that the CPU used is in the same socket.
We do not report polarization, CPU Type or dedication change.

STSI(15.1.x) gives information on the CPU configuration topology.
Let's accept the interception of STSI with the function code 15 and
let the userland part of the hypervisor handle it when userland
supports the CPU Topology facility.

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
---
 Documentation/virt/kvm/api.rst   | 16 +++++++++++
 arch/s390/include/asm/kvm_host.h | 12 ++++++--
 arch/s390/kvm/kvm-s390.c         | 49 +++++++++++++++++++++++++++++++-
 arch/s390/kvm/kvm-s390.h         | 25 ++++++++++++++++
 arch/s390/kvm/priv.c             | 14 ++++++---
 arch/s390/kvm/vsie.c             |  3 ++
 include/uapi/linux/kvm.h         |  1 +
 7 files changed, 112 insertions(+), 8 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 4a900cdbc62e..c15f5b9dafb6 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -7779,3 +7779,19 @@ Ordering of KVM_GET_*/KVM_SET_* ioctls
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 TBD
+
+8.17 KVM_CAP_S390_CPU_TOPOLOGY
+------------------------------
+
+:Capability: KVM_CAP_S390_CPU_TOPOLOGY
+:Architectures: s390
+:Type: vm
+
+This capability indicates that kvm will provide the S390 CPU Topology facility
+which consists of the interpretation of the PTF instruction for the Function
+Code 2 along with interception and forwarding of both the PTF instruction
+with Function Codes 0 or 1 and the STSI(15,1,x) instruction to the userland
+hypervisor.
+
+The stfle facility 11, CPU Topology facility, should not be provided to the
+guest without this capability.
diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index 766028d54a3e..04653b43ccee 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -97,15 +97,19 @@ struct bsca_block {
 	union ipte_control ipte_control;
 	__u64	reserved[5];
 	__u64	mcn;
-	__u64	reserved2;
+#define SCA_UTILITY_MTCR	0x8000
+	__u16	utility;
+	__u8	reserved2[6];
 	struct bsca_entry cpu[KVM_S390_BSCA_CPU_SLOTS];
 };
 
 struct esca_block {
 	union ipte_control ipte_control;
-	__u64   reserved1[7];
+	__u64   reserved1[6];
+	__u16	utility;
+	__u8	reserved2[6];
 	__u64   mcn[4];
-	__u64   reserved2[20];
+	__u64   reserved3[20];
 	struct esca_entry cpu[KVM_S390_ESCA_CPU_SLOTS];
 };
 
@@ -249,6 +253,7 @@ struct kvm_s390_sie_block {
 #define ECB_SPECI	0x08
 #define ECB_SRSI	0x04
 #define ECB_HOSTPROTINT	0x02
+#define ECB_PTF		0x01
 	__u8	ecb;			/* 0x0061 */
 #define ECB2_CMMA	0x80
 #define ECB2_IEP	0x20
@@ -750,6 +755,7 @@ struct kvm_vcpu_arch {
 	bool skey_enabled;
 	struct kvm_s390_pv_vcpu pv;
 	union diag318_info diag318_info;
+	int prev_cpu;
 };
 
 struct kvm_vm_stat {
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index da3dabda1a12..c8bdce31464f 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -606,6 +606,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_S390_PROTECTED:
 		r = is_prot_virt_host();
 		break;
+	case KVM_CAP_S390_CPU_TOPOLOGY:
+		r = test_facility(11);
+		break;
 	default:
 		r = 0;
 	}
@@ -817,6 +820,20 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap)
 		icpt_operexc_on_all_vcpus(kvm);
 		r = 0;
 		break;
+	case KVM_CAP_S390_CPU_TOPOLOGY:
+		r = -EINVAL;
+		mutex_lock(&kvm->lock);
+		if (kvm->created_vcpus) {
+			r = -EBUSY;
+		} else if (test_facility(11)) {
+			set_kvm_facility(kvm->arch.model.fac_mask, 11);
+			set_kvm_facility(kvm->arch.model.fac_list, 11);
+			r = 0;
+		}
+		mutex_unlock(&kvm->lock);
+		VM_EVENT(kvm, 3, "ENABLE: CPU TOPOLOGY %s",
+			 r ? "(not available)" : "(success)");
+		break;
 	default:
 		r = -EINVAL;
 		break;
@@ -1695,6 +1712,25 @@ static int kvm_s390_get_cpu_model(struct kvm *kvm, struct kvm_device_attr *attr)
 	return ret;
 }
 
+/**
+ * kvm_s390_sca_set_mtcr
+ * @kvm: guest KVM description
+ *
+ * Is only relevant if the topology facility is present,
+ * the caller should check KVM facility 11
+ *
+ * Updates the Multiprocessor Topology-Change-Report to signal
+ * the guest with a topology change.
+ */
+static void kvm_s390_sca_set_mtcr(struct kvm *kvm)
+{
+	struct bsca_block *sca = kvm->arch.sca; /* SCA version doesn't matter */
+
+	ipte_lock(kvm);
+	sca->utility |= SCA_UTILITY_MTCR;
+	ipte_unlock(kvm);
+}
+
 static int kvm_s390_vm_set_attr(struct kvm *kvm, struct kvm_device_attr *attr)
 {
 	int ret;
@@ -3138,16 +3174,20 @@ __u64 kvm_s390_get_cpu_timer(struct kvm_vcpu *vcpu)
 
 void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 {
-
 	gmap_enable(vcpu->arch.enabled_gmap);
 	kvm_s390_set_cpuflags(vcpu, CPUSTAT_RUNNING);
 	if (vcpu->arch.cputm_enabled && !is_vcpu_idle(vcpu))
 		__start_cpu_timer_accounting(vcpu);
 	vcpu->cpu = cpu;
+
+	if (kvm_s390_topology_changed(vcpu))
+		kvm_s390_sca_set_mtcr(vcpu->kvm);
 }
 
 void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
 {
+	/* Remember which CPU was backing the vCPU */
+	vcpu->arch.prev_cpu = vcpu->cpu;
 	vcpu->cpu = -1;
 	if (vcpu->arch.cputm_enabled && !is_vcpu_idle(vcpu))
 		__stop_cpu_timer_accounting(vcpu);
@@ -3267,6 +3307,13 @@ static int kvm_s390_vcpu_setup(struct kvm_vcpu *vcpu)
 		vcpu->arch.sie_block->ecb |= ECB_HOSTPROTINT;
 	if (test_kvm_facility(vcpu->kvm, 9))
 		vcpu->arch.sie_block->ecb |= ECB_SRSI;
+
+	/* PTF needs guest facilities to enable interpretation */
+	if (test_kvm_facility(vcpu->kvm, 11))
+		vcpu->arch.sie_block->ecb |= ECB_PTF;
+	/* Indicate this is a new vcpu */
+	vcpu->arch.prev_cpu = S390_KVM_TOPOLOGY_NEW_CPU;
+
 	if (test_kvm_facility(vcpu->kvm, 73))
 		vcpu->arch.sie_block->ecb |= ECB_TE;
 	if (!kvm_is_ucontrol(vcpu->kvm))
diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h
index 497d52a83c78..5fd5e635a611 100644
--- a/arch/s390/kvm/kvm-s390.h
+++ b/arch/s390/kvm/kvm-s390.h
@@ -514,4 +514,29 @@ void kvm_s390_vcpu_crypto_reset_all(struct kvm *kvm);
  */
 extern unsigned int diag9c_forwarding_hz;
 
+#define S390_KVM_TOPOLOGY_NEW_CPU -1
+/**
+ * kvm_s390_topology_changed
+ * @vcpu: the virtual CPU
+ *
+ * If the topology facility is present, checks if the CPU topology
+ * viewed by the guest changed due to load balancing or CPU hotplug.
+ */
+static inline bool kvm_s390_topology_changed(struct kvm_vcpu *vcpu)
+{
+	if (!test_kvm_facility(vcpu->kvm, 11))
+		return false;
+
+	/* A new vCPU has been hotplugged */
+	if (vcpu->arch.prev_cpu == S390_KVM_TOPOLOGY_NEW_CPU)
+		return true;
+
+	/* The real CPU backing up the vCPU is still on same socket */
+	if (cpumask_test_cpu(vcpu->cpu,
+			     topology_core_cpumask(vcpu->arch.prev_cpu)))
+		return false;
+
+	return true;
+}
+
 #endif
diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
index 0e8603acc105..d9e16b09c8bf 100644
--- a/arch/s390/kvm/priv.c
+++ b/arch/s390/kvm/priv.c
@@ -874,10 +874,12 @@ static int handle_stsi(struct kvm_vcpu *vcpu)
 	if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
 		return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
 
-	if (fc > 3) {
-		kvm_s390_set_psw_cc(vcpu, 3);
-		return 0;
-	}
+	if (fc > 3 && fc != 15)
+		goto out_no_data;
+
+	/* fc 15 is provided with PTF/CPU topology support */
+	if (fc == 15 && !test_kvm_facility(vcpu->kvm, 11))
+		goto out_no_data;
 
 	if (vcpu->run->s.regs.gprs[0] & 0x0fffff00
 	    || vcpu->run->s.regs.gprs[1] & 0xffff0000)
@@ -911,6 +913,10 @@ static int handle_stsi(struct kvm_vcpu *vcpu)
 			goto out_no_data;
 		handle_stsi_3_2_2(vcpu, (void *) mem);
 		break;
+	case 15:
+		trace_kvm_s390_handle_stsi(vcpu, fc, sel1, sel2, operand2);
+		insert_stsi_usr_data(vcpu, operand2, ar, fc, sel1, sel2);
+		return -EREMOTE;
 	}
 	if (kvm_s390_pv_cpu_is_protected(vcpu)) {
 		memcpy((void *)sida_origin(vcpu->arch.sie_block), (void *)mem,
diff --git a/arch/s390/kvm/vsie.c b/arch/s390/kvm/vsie.c
index dada78b92691..4f4fee697550 100644
--- a/arch/s390/kvm/vsie.c
+++ b/arch/s390/kvm/vsie.c
@@ -503,6 +503,9 @@ static int shadow_scb(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
 	/* Host-protection-interruption introduced with ESOP */
 	if (test_kvm_cpu_feat(vcpu->kvm, KVM_S390_VM_CPU_FEAT_ESOP))
 		scb_s->ecb |= scb_o->ecb & ECB_HOSTPROTINT;
+	/* CPU Topology */
+	if (test_kvm_facility(vcpu->kvm, 11))
+		scb_s->ecb |= scb_o->ecb & ECB_PTF;
 	/* transactional execution */
 	if (test_kvm_facility(vcpu->kvm, 73) && wants_tx) {
 		/* remap the prefix is tx is toggled on */
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 6a184d260c7f..538a2f9cf42d 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1152,6 +1152,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_DISABLE_QUIRKS2 213
 /* #define KVM_CAP_VM_TSC_CONTROL 214 */
 #define KVM_CAP_SYSTEM_EVENT_DATA 215
+#define KVM_CAP_S390_CPU_TOPOLOGY 216
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
-- 
2.27.0



* [PATCH v9 3/3] s390x: KVM: resetting the Topology-Change-Report
  2022-05-06  9:24 [PATCH v9 0/3] s390x: KVM: CPU Topology Pierre Morel
  2022-05-06  9:24 ` [PATCH v9 1/3] s390x: KVM: ipte lock for SCA access should be contained in KVM Pierre Morel
  2022-05-06  9:24 ` [PATCH v9 2/3] s390x: KVM: guest support for topology function Pierre Morel
@ 2022-05-06  9:24 ` Pierre Morel
  2022-05-12  9:31   ` David Hildenbrand
  2022-05-18 15:26 ` [PATCH v9 0/3] s390x: KVM: CPU Topology Christian Borntraeger
  3 siblings, 1 reply; 29+ messages in thread
From: Pierre Morel @ 2022-05-06  9:24 UTC (permalink / raw)
  To: kvm
  Cc: linux-s390, linux-kernel, borntraeger, frankja, cohuck, david,
	thuth, imbrenda, hca, gor, pmorel, wintera, seiden, nrb

During a subsystem reset the Topology-Change-Report is cleared.
Let's give userland the possibility to clear the MTCR in the case
of a subsystem reset.

To migrate the MTCR, let's give userland the possibility to
query the MTCR state.
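
A userland sketch of how these attributes might be driven is shown
below. It assumes vm_fd is an open KVM VM file descriptor and uses the
group and attribute values proposed in this patch. mtcr_write() and
mtcr_read() are hypothetical helpers, not part of any existing API.

```c
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Values proposed by this patch; they may differ once merged. */
#ifndef KVM_S390_VM_CPU_TOPOLOGY
#define KVM_S390_VM_CPU_TOPOLOGY 5
#endif
#ifndef KVM_S390_VM_CPU_TOPO_MTR_CLEAR
#define KVM_S390_VM_CPU_TOPO_MTR_CLEAR 0
#define KVM_S390_VM_CPU_TOPO_MTR_SET 1
#endif

/* Set (set != 0) or clear (set == 0) the MTCR; returns the ioctl result. */
static int mtcr_write(int vm_fd, int set)
{
	struct kvm_device_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.group = KVM_S390_VM_CPU_TOPOLOGY;
	attr.attr = set ? KVM_S390_VM_CPU_TOPO_MTR_SET
			: KVM_S390_VM_CPU_TOPO_MTR_CLEAR;

	return ioctl(vm_fd, KVM_SET_DEVICE_ATTR, &attr);
}

/* Query the MTCR state into *mtcr (0 or 1); returns the ioctl result. */
static int mtcr_read(int vm_fd, int *mtcr)
{
	struct kvm_device_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.group = KVM_S390_VM_CPU_TOPOLOGY;
	attr.addr = (uint64_t)(uintptr_t)mtcr;

	return ioctl(vm_fd, KVM_GET_DEVICE_ATTR, &attr);
}
```

A subsystem reset would call mtcr_write(vm_fd, 0), while migration would
read the state on the source with mtcr_read() and replay it on the
destination with mtcr_write().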

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
---
 arch/s390/include/uapi/asm/kvm.h |  5 ++
 arch/s390/kvm/kvm-s390.c         | 79 ++++++++++++++++++++++++++++++++
 2 files changed, 84 insertions(+)

diff --git a/arch/s390/include/uapi/asm/kvm.h b/arch/s390/include/uapi/asm/kvm.h
index 7a6b14874d65..abdcf4069343 100644
--- a/arch/s390/include/uapi/asm/kvm.h
+++ b/arch/s390/include/uapi/asm/kvm.h
@@ -74,6 +74,7 @@ struct kvm_s390_io_adapter_req {
 #define KVM_S390_VM_CRYPTO		2
 #define KVM_S390_VM_CPU_MODEL		3
 #define KVM_S390_VM_MIGRATION		4
+#define KVM_S390_VM_CPU_TOPOLOGY	5
 
 /* kvm attributes for mem_ctrl */
 #define KVM_S390_VM_MEM_ENABLE_CMMA	0
@@ -171,6 +172,10 @@ struct kvm_s390_vm_cpu_subfunc {
 #define KVM_S390_VM_MIGRATION_START	1
 #define KVM_S390_VM_MIGRATION_STATUS	2
 
+/* kvm attributes for cpu topology */
+#define KVM_S390_VM_CPU_TOPO_MTR_CLEAR	0
+#define KVM_S390_VM_CPU_TOPO_MTR_SET	1
+
 /* for KVM_GET_REGS and KVM_SET_REGS */
 struct kvm_regs {
 	/* general purpose regs for s390 */
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index c8bdce31464f..80a1244f0ead 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -1731,6 +1731,76 @@ static void kvm_s390_sca_set_mtcr(struct kvm *kvm)
 	ipte_unlock(kvm);
 }
 
+/**
+ * kvm_s390_sca_clear_mtcr
+ * @kvm: guest KVM description
+ *
+ * Is only relevant if the topology facility is present,
+ * the caller should check KVM facility 11
+ *
+ * Clears the Multiprocessor Topology-Change-Report, for example
+ * when userland handles a subsystem reset.
+ */
+static void kvm_s390_sca_clear_mtcr(struct kvm *kvm)
+{
+	struct bsca_block *sca = kvm->arch.sca; /* SCA version doesn't matter */
+
+	ipte_lock(kvm);
+	sca->utility  &= ~SCA_UTILITY_MTCR;
+	ipte_unlock(kvm);
+}
+
+static int kvm_s390_set_topology(struct kvm *kvm, struct kvm_device_attr *attr)
+{
+	if (!test_kvm_facility(kvm, 11))
+		return -ENXIO;
+
+	switch (attr->attr) {
+	case KVM_S390_VM_CPU_TOPO_MTR_SET:
+		kvm_s390_sca_set_mtcr(kvm);
+		break;
+	case KVM_S390_VM_CPU_TOPO_MTR_CLEAR:
+		kvm_s390_sca_clear_mtcr(kvm);
+		break;
+	}
+	return 0;
+}
+
+/**
+ * kvm_s390_sca_get_mtcr
+ * @kvm: guest KVM description
+ *
+ * Is only relevant if the topology facility is present,
+ * the caller should check KVM facility 11
+ *
+ * reports to QEMU the Multiprocessor Topology-Change-Report.
+ */
+static int kvm_s390_sca_get_mtcr(struct kvm *kvm)
+{
+	struct bsca_block *sca = kvm->arch.sca; /* SCA version doesn't matter */
+	int val;
+
+	ipte_lock(kvm);
+	val = !!(sca->utility & SCA_UTILITY_MTCR);
+	ipte_unlock(kvm);
+
+	return val;
+}
+
+static int kvm_s390_get_topology(struct kvm *kvm, struct kvm_device_attr *attr)
+{
+	int mtcr;
+
+	if (!test_kvm_facility(kvm, 11))
+		return -ENXIO;
+
+	mtcr = kvm_s390_sca_get_mtcr(kvm);
+	if (copy_to_user((void __user *)attr->addr, &mtcr, sizeof(mtcr)))
+		return -EFAULT;
+
+	return 0;
+}
+
 static int kvm_s390_vm_set_attr(struct kvm *kvm, struct kvm_device_attr *attr)
 {
 	int ret;
@@ -1751,6 +1821,9 @@ static int kvm_s390_vm_set_attr(struct kvm *kvm, struct kvm_device_attr *attr)
 	case KVM_S390_VM_MIGRATION:
 		ret = kvm_s390_vm_set_migration(kvm, attr);
 		break;
+	case KVM_S390_VM_CPU_TOPOLOGY:
+		ret = kvm_s390_set_topology(kvm, attr);
+		break;
 	default:
 		ret = -ENXIO;
 		break;
@@ -1776,6 +1849,9 @@ static int kvm_s390_vm_get_attr(struct kvm *kvm, struct kvm_device_attr *attr)
 	case KVM_S390_VM_MIGRATION:
 		ret = kvm_s390_vm_get_migration(kvm, attr);
 		break;
+	case KVM_S390_VM_CPU_TOPOLOGY:
+		ret = kvm_s390_get_topology(kvm, attr);
+		break;
 	default:
 		ret = -ENXIO;
 		break;
@@ -1849,6 +1925,9 @@ static int kvm_s390_vm_has_attr(struct kvm *kvm, struct kvm_device_attr *attr)
 	case KVM_S390_VM_MIGRATION:
 		ret = 0;
 		break;
+	case KVM_S390_VM_CPU_TOPOLOGY:
+		ret = test_kvm_facility(kvm, 11) ? 0 : -ENXIO;
+		break;
 	default:
 		ret = -ENXIO;
 		break;
-- 
2.27.0



* Re: [PATCH v9 1/3] s390x: KVM: ipte lock for SCA access should be contained in KVM
  2022-05-06  9:24 ` [PATCH v9 1/3] s390x: KVM: ipte lock for SCA access should be contained in KVM Pierre Morel
@ 2022-05-12  9:08   ` David Hildenbrand
  2022-05-16 16:30     ` Pierre Morel
  2022-05-12 11:32   ` Janosch Frank
  1 sibling, 1 reply; 29+ messages in thread
From: David Hildenbrand @ 2022-05-12  9:08 UTC (permalink / raw)
  To: Pierre Morel, kvm
  Cc: linux-s390, linux-kernel, borntraeger, frankja, cohuck, thuth,
	imbrenda, hca, gor, wintera, seiden, nrb

On 06.05.22 11:24, Pierre Morel wrote:
> The former check to choose between SIIF or not SIIF can be done
> using sclp.has_siif instead of accessing per vCPU structures.
> 
> When accessing the SCA, ipte_lock and ipte_unlock do not need
> to access any vcpu structures but only the KVM structure.
> 
> Let's simplify the ipte handling.
> 
> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>

Much better

Reviewed-by: David Hildenbrand <david@redhat.com>


-- 
Thanks,

David / dhildenb



* Re: [PATCH v9 2/3] s390x: KVM: guest support for topology function
  2022-05-06  9:24 ` [PATCH v9 2/3] s390x: KVM: guest support for topology function Pierre Morel
@ 2022-05-12  9:24   ` David Hildenbrand
  2022-05-16 14:13     ` Pierre Morel
  2022-05-12 11:41   ` Janosch Frank
  2022-05-19  9:01   ` Christian Borntraeger
  2 siblings, 1 reply; 29+ messages in thread
From: David Hildenbrand @ 2022-05-12  9:24 UTC (permalink / raw)
  To: Pierre Morel, kvm
  Cc: linux-s390, linux-kernel, borntraeger, frankja, cohuck, thuth,
	imbrenda, hca, gor, wintera, seiden, nrb

On 06.05.22 11:24, Pierre Morel wrote:
> We let the userland hypervisor know if the machine supports the CPU
> topology facility using a new KVM capability: KVM_CAP_S390_CPU_TOPOLOGY.
> 
> The PTF instruction will report a topology change if there is any change
> with a previous STSI_15_1_2 SYSIB.
> Changes inside a STSI_15_1_2 SYSIB occur if CPU bits are set or cleared
> inside the CPU Topology List Entry CPU mask field, which happens with
> changes in CPU polarization, dedication, CPU types and adding or
> removing CPUs in a socket.
> 
> The reporting to the guest is done using the Multiprocessor
> Topology-Change-Report (MTCR) bit of the utility entry of the guest's
> SCA which will be cleared during the interpretation of PTF.
> 
> To check if the topology has been modified we use a new field of the
> arch vCPU to save the previous real CPU ID at the end of a schedule
> and verify on next schedule that the CPU used is in the same socket.
> We do not report polarization, CPU Type or dedication change.
> 
> STSI(15.1.x) gives information on the CPU configuration topology.
> Let's accept the interception of STSI with the function code 15 and
> let the userland part of the hypervisor handle it when userland
> supports the CPU Topology facility.
> 
> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>

[...]


> diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
> index 0e8603acc105..d9e16b09c8bf 100644
> --- a/arch/s390/kvm/priv.c
> +++ b/arch/s390/kvm/priv.c
> @@ -874,10 +874,12 @@ static int handle_stsi(struct kvm_vcpu *vcpu)
>  	if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
>  		return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
>  
> -	if (fc > 3) {
> -		kvm_s390_set_psw_cc(vcpu, 3);
> -		return 0;
> -	}
> +	if (fc > 3 && fc != 15)
> +		goto out_no_data;
> +
> +	/* fc 15 is provided with PTF/CPU topology support */
> +	if (fc == 15 && !test_kvm_facility(vcpu->kvm, 11))
> +		goto out_no_data;


Maybe shorter as

if (fc == 15 && !test_kvm_facility(vcpu->kvm, 11))
	goto out_no_data;
else if (fc > 3)
	goto out_no_data;


Apart from that, LGTM.

-- 
Thanks,

David / dhildenb



* Re: [PATCH v9 3/3] s390x: KVM: resetting the Topology-Change-Report
  2022-05-06  9:24 ` [PATCH v9 3/3] s390x: KVM: resetting the Topology-Change-Report Pierre Morel
@ 2022-05-12  9:31   ` David Hildenbrand
  2022-05-12  9:52     ` Claudio Imbrenda
                       ` (2 more replies)
  0 siblings, 3 replies; 29+ messages in thread
From: David Hildenbrand @ 2022-05-12  9:31 UTC (permalink / raw)
  To: Pierre Morel, kvm
  Cc: linux-s390, linux-kernel, borntraeger, frankja, cohuck, thuth,
	imbrenda, hca, gor, wintera, seiden, nrb

On 06.05.22 11:24, Pierre Morel wrote:
> During a subsystem reset the Topology-Change-Report is cleared.
> Let's give userland the possibility to clear the MTCR in the case
> of a subsystem reset.
> 
> To migrate the MTCR, let's give userland the possibility to
> query the MTCR state.
> 
> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
> ---
>  arch/s390/include/uapi/asm/kvm.h |  5 ++
>  arch/s390/kvm/kvm-s390.c         | 79 ++++++++++++++++++++++++++++++++
>  2 files changed, 84 insertions(+)
> 
> diff --git a/arch/s390/include/uapi/asm/kvm.h b/arch/s390/include/uapi/asm/kvm.h
> index 7a6b14874d65..abdcf4069343 100644
> --- a/arch/s390/include/uapi/asm/kvm.h
> +++ b/arch/s390/include/uapi/asm/kvm.h
> @@ -74,6 +74,7 @@ struct kvm_s390_io_adapter_req {
>  #define KVM_S390_VM_CRYPTO		2
>  #define KVM_S390_VM_CPU_MODEL		3
>  #define KVM_S390_VM_MIGRATION		4
> +#define KVM_S390_VM_CPU_TOPOLOGY	5
>  
>  /* kvm attributes for mem_ctrl */
>  #define KVM_S390_VM_MEM_ENABLE_CMMA	0
> @@ -171,6 +172,10 @@ struct kvm_s390_vm_cpu_subfunc {
>  #define KVM_S390_VM_MIGRATION_START	1
>  #define KVM_S390_VM_MIGRATION_STATUS	2
>  
> +/* kvm attributes for cpu topology */
> +#define KVM_S390_VM_CPU_TOPO_MTR_CLEAR	0
> +#define KVM_S390_VM_CPU_TOPO_MTR_SET	1
> +
>  /* for KVM_GET_REGS and KVM_SET_REGS */
>  struct kvm_regs {
>  	/* general purpose regs for s390 */
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index c8bdce31464f..80a1244f0ead 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -1731,6 +1731,76 @@ static void kvm_s390_sca_set_mtcr(struct kvm *kvm)
>  	ipte_unlock(kvm);
>  }
>  
> +/**
> + * kvm_s390_sca_clear_mtcr
> + * @kvm: guest KVM description
> + *
> + * Is only relevant if the topology facility is present,
> + * the caller should check KVM facility 11
> + *
> + * Updates the Multiprocessor Topology-Change-Report to signal
> + * the guest with a topology change.
> + */
> +static void kvm_s390_sca_clear_mtcr(struct kvm *kvm)
> +{
> +	struct bsca_block *sca = kvm->arch.sca; /* SCA version doesn't matter */
> +
> +	ipte_lock(kvm);
> +	sca->utility  &= ~SCA_UTILITY_MTCR;


One space too many.

sca->utility &= ~SCA_UTILITY_MTCR;

> +	ipte_unlock(kvm);
> +}
> +
> +static int kvm_s390_set_topology(struct kvm *kvm, struct kvm_device_attr *attr)
> +{
> +	if (!test_kvm_facility(kvm, 11))
> +		return -ENXIO;
> +
> +	switch (attr->attr) {
> +	case KVM_S390_VM_CPU_TOPO_MTR_SET:
> +		kvm_s390_sca_set_mtcr(kvm);
> +		break;
> +	case KVM_S390_VM_CPU_TOPO_MTR_CLEAR:
> +		kvm_s390_sca_clear_mtcr(kvm);
> +		break;
> +	}
> +	return 0;
> +}
> +
> +/**
> + * kvm_s390_sca_get_mtcr
> + * @kvm: guest KVM description
> + *
> + * Is only relevant if the topology facility is present,
> + * the caller should check KVM facility 11
> + *
> + * reports to QEMU the Multiprocessor Topology-Change-Report.
> + */
> +static int kvm_s390_sca_get_mtcr(struct kvm *kvm)
> +{
> +	struct bsca_block *sca = kvm->arch.sca; /* SCA version doesn't matter */
> +	int val;
> +
> +	ipte_lock(kvm);
> +	val = !!(sca->utility & SCA_UTILITY_MTCR);
> +	ipte_unlock(kvm);
> +
> +	return val;
> +}
> +
> +static int kvm_s390_get_topology(struct kvm *kvm, struct kvm_device_attr *attr)
> +{
> +	int mtcr;

I think we prefer something like u16 when copying to user space.

> +
> +	if (!test_kvm_facility(kvm, 11))
> +		return -ENXIO;
> +
> +	mtcr = kvm_s390_sca_get_mtcr(kvm);
> +	if (copy_to_user((void __user *)attr->addr, &mtcr, sizeof(mtcr)))
> +		return -EFAULT;
> +
> +	return 0;
> +}

You should probably add documentation, and document that only the last
bit (0x1) has a meaning.

Apart from that LGTM.

-- 
Thanks,

David / dhildenb



* Re: [PATCH v9 3/3] s390x: KVM: resetting the Topology-Change-Report
  2022-05-12  9:31   ` David Hildenbrand
@ 2022-05-12  9:52     ` Claudio Imbrenda
  2022-05-12 10:01       ` David Hildenbrand
  2022-05-16 10:36     ` Pierre Morel
  2022-05-18 10:51     ` Pierre Morel
  2 siblings, 1 reply; 29+ messages in thread
From: Claudio Imbrenda @ 2022-05-12  9:52 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Pierre Morel, kvm, linux-s390, linux-kernel, borntraeger,
	frankja, cohuck, thuth, hca, gor, wintera, seiden, nrb

On Thu, 12 May 2022 11:31:18 +0200
David Hildenbrand <david@redhat.com> wrote:

> On 06.05.22 11:24, Pierre Morel wrote:
> > During a subsystem reset the Topology-Change-Report is cleared.
> > Let's give userland the possibility to clear the MTCR in the case
> > of a subsystem reset.
> > 
> > To migrate the MTCR, let's give userland the possibility to
> > query the MTCR state.
> > 
> > Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
> > ---
> >  arch/s390/include/uapi/asm/kvm.h |  5 ++
> >  arch/s390/kvm/kvm-s390.c         | 79 ++++++++++++++++++++++++++++++++
> >  2 files changed, 84 insertions(+)
> > 
> > diff --git a/arch/s390/include/uapi/asm/kvm.h b/arch/s390/include/uapi/asm/kvm.h
> > index 7a6b14874d65..abdcf4069343 100644
> > --- a/arch/s390/include/uapi/asm/kvm.h
> > +++ b/arch/s390/include/uapi/asm/kvm.h
> > @@ -74,6 +74,7 @@ struct kvm_s390_io_adapter_req {
> >  #define KVM_S390_VM_CRYPTO		2
> >  #define KVM_S390_VM_CPU_MODEL		3
> >  #define KVM_S390_VM_MIGRATION		4
> > +#define KVM_S390_VM_CPU_TOPOLOGY	5
> >  
> >  /* kvm attributes for mem_ctrl */
> >  #define KVM_S390_VM_MEM_ENABLE_CMMA	0
> > @@ -171,6 +172,10 @@ struct kvm_s390_vm_cpu_subfunc {
> >  #define KVM_S390_VM_MIGRATION_START	1
> >  #define KVM_S390_VM_MIGRATION_STATUS	2
> >  
> > +/* kvm attributes for cpu topology */
> > +#define KVM_S390_VM_CPU_TOPO_MTR_CLEAR	0
> > +#define KVM_S390_VM_CPU_TOPO_MTR_SET	1
> > +
> >  /* for KVM_GET_REGS and KVM_SET_REGS */
> >  struct kvm_regs {
> >  	/* general purpose regs for s390 */
> > diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> > index c8bdce31464f..80a1244f0ead 100644
> > --- a/arch/s390/kvm/kvm-s390.c
> > +++ b/arch/s390/kvm/kvm-s390.c
> > @@ -1731,6 +1731,76 @@ static void kvm_s390_sca_set_mtcr(struct kvm *kvm)
> >  	ipte_unlock(kvm);
> >  }
> >  
> > +/**
> > + * kvm_s390_sca_clear_mtcr
> > + * @kvm: guest KVM description
> > + *
> > + * Is only relevant if the topology facility is present,
> > + * the caller should check KVM facility 11
> > + *
> > + * Updates the Multiprocessor Topology-Change-Report to signal
> > + * the guest with a topology change.
> > + */
> > +static void kvm_s390_sca_clear_mtcr(struct kvm *kvm)
> > +{
> > +	struct bsca_block *sca = kvm->arch.sca; /* SCA version doesn't matter */
> > +
> > +	ipte_lock(kvm);
> > +	sca->utility  &= ~SCA_UTILITY_MTCR;  
> 
> 
> One space too much.
> 
> sca->utility &= ~SCA_UTILITY_MTCR;
> 
> > +	ipte_unlock(kvm);
> > +}
> > +
> > +static int kvm_s390_set_topology(struct kvm *kvm, struct kvm_device_attr *attr)
> > +{
> > +	if (!test_kvm_facility(kvm, 11))
> > +		return -ENXIO;
> > +
> > +	switch (attr->attr) {
> > +	case KVM_S390_VM_CPU_TOPO_MTR_SET:
> > +		kvm_s390_sca_set_mtcr(kvm);
> > +		break;
> > +	case KVM_S390_VM_CPU_TOPO_MTR_CLEAR:
> > +		kvm_s390_sca_clear_mtcr(kvm);
> > +		break;
> > +	}
> > +	return 0;
> > +}
> > +
> > +/**
> > + * kvm_s390_sca_get_mtcr
> > + * @kvm: guest KVM description
> > + *
> > + * Is only relevant if the topology facility is present,
> > + * the caller should check KVM facility 11
> > + *
> > + * reports to QEMU the Multiprocessor Topology-Change-Report.
> > + */
> > +static int kvm_s390_sca_get_mtcr(struct kvm *kvm)
> > +{
> > +	struct bsca_block *sca = kvm->arch.sca; /* SCA version doesn't matter */
> > +	int val;
> > +
> > +	ipte_lock(kvm);
> > +	val = !!(sca->utility & SCA_UTILITY_MTCR);
> > +	ipte_unlock(kvm);
> > +
> > +	return val;
> > +}
> > +
> > +static int kvm_s390_get_topology(struct kvm *kvm, struct kvm_device_attr *attr)
> > +{
> > +	int mtcr;  
> 
> I think we prefer something like u16 when copying to user space.

but then userspace also has to expect a u16, right?

> 
> > +
> > +	if (!test_kvm_facility(kvm, 11))
> > +		return -ENXIO;
> > +
> > +	mtcr = kvm_s390_sca_get_mtcr(kvm);
> > +	if (copy_to_user((void __user *)attr->addr, &mtcr, sizeof(mtcr)))
> > +		return -EFAULT;
> > +
> > +	return 0;
> > +}  
> 
> You should probably add documentation, and document that only the last
> bit (0x1) has a meaning.
> 
> Apart from that LGTM.
> 



* Re: [PATCH v9 3/3] s390x: KVM: resetting the Topology-Change-Report
  2022-05-12  9:52     ` Claudio Imbrenda
@ 2022-05-12 10:01       ` David Hildenbrand
  2022-05-16 14:21         ` Pierre Morel
  0 siblings, 1 reply; 29+ messages in thread
From: David Hildenbrand @ 2022-05-12 10:01 UTC (permalink / raw)
  To: Claudio Imbrenda
  Cc: Pierre Morel, kvm, linux-s390, linux-kernel, borntraeger,
	frankja, cohuck, thuth, hca, gor, wintera, seiden, nrb

>>
>> I think we prefer something like u16 when copying to user space.
> 
> but then userspace also has to expect a u16, right?

Yep.

-- 
Thanks,

David / dhildenb



* Re: [PATCH v9 1/3] s390x: KVM: ipte lock for SCA access should be contained in KVM
  2022-05-06  9:24 ` [PATCH v9 1/3] s390x: KVM: ipte lock for SCA access should be contained in KVM Pierre Morel
  2022-05-12  9:08   ` David Hildenbrand
@ 2022-05-12 11:32   ` Janosch Frank
  2022-05-16 14:13     ` Pierre Morel
  1 sibling, 1 reply; 29+ messages in thread
From: Janosch Frank @ 2022-05-12 11:32 UTC (permalink / raw)
  To: Pierre Morel, kvm
  Cc: linux-s390, linux-kernel, borntraeger, cohuck, david, thuth,
	imbrenda, hca, gor, wintera, seiden, nrb

On 5/6/22 11:24, Pierre Morel wrote:
> The former check to choose between SIIF or not SIIF can be done
> using the sclp.has_siif instead of accessing per vCPU structures

Maybe replace this paragraph with:
We can check if SIIF is enabled by testing the sclp_info struct instead
of testing the sie control block eca variable. sclp.has_siif is the only
requirement to set ECA_SII anyway so we can go straight to the source
for that.

Reviewed-by: Janosch Frank <frankja@linux.ibm.com>

> 
> When accessing the SCA, ipte_lock and ipte_unlock do not need
> to access any vcpu structures but only the KVM structure.
> 
> Let's simplify the ipte handling.
> 
> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
> ---
>   arch/s390/kvm/gaccess.c | 96 ++++++++++++++++++++---------------------
>   arch/s390/kvm/gaccess.h |  6 +--
>   arch/s390/kvm/priv.c    |  6 +--
>   3 files changed, 54 insertions(+), 54 deletions(-)
> 
> diff --git a/arch/s390/kvm/gaccess.c b/arch/s390/kvm/gaccess.c
> index d53a183c2005..0e1f6dd31882 100644
> --- a/arch/s390/kvm/gaccess.c
> +++ b/arch/s390/kvm/gaccess.c
> @@ -262,77 +262,77 @@ struct aste {
>   	/* .. more fields there */
>   };
>   
> -int ipte_lock_held(struct kvm_vcpu *vcpu)
> +int ipte_lock_held(struct kvm *kvm)
>   {
> -	if (vcpu->arch.sie_block->eca & ECA_SII) {
> +	if (sclp.has_siif) {
>   		int rc;
>   
> -		read_lock(&vcpu->kvm->arch.sca_lock);
> -		rc = kvm_s390_get_ipte_control(vcpu->kvm)->kh != 0;
> -		read_unlock(&vcpu->kvm->arch.sca_lock);
> +		read_lock(&kvm->arch.sca_lock);
> +		rc = kvm_s390_get_ipte_control(kvm)->kh != 0;
> +		read_unlock(&kvm->arch.sca_lock);
>   		return rc;
>   	}
> -	return vcpu->kvm->arch.ipte_lock_count != 0;
> +	return kvm->arch.ipte_lock_count != 0;
>   }
>   
> -static void ipte_lock_simple(struct kvm_vcpu *vcpu)
> +static void ipte_lock_simple(struct kvm *kvm)
>   {
>   	union ipte_control old, new, *ic;
>   
> -	mutex_lock(&vcpu->kvm->arch.ipte_mutex);
> -	vcpu->kvm->arch.ipte_lock_count++;
> -	if (vcpu->kvm->arch.ipte_lock_count > 1)
> +	mutex_lock(&kvm->arch.ipte_mutex);
> +	kvm->arch.ipte_lock_count++;
> +	if (kvm->arch.ipte_lock_count > 1)
>   		goto out;
>   retry:
> -	read_lock(&vcpu->kvm->arch.sca_lock);
> -	ic = kvm_s390_get_ipte_control(vcpu->kvm);
> +	read_lock(&kvm->arch.sca_lock);
> +	ic = kvm_s390_get_ipte_control(kvm);
>   	do {
>   		old = READ_ONCE(*ic);
>   		if (old.k) {
> -			read_unlock(&vcpu->kvm->arch.sca_lock);
> +			read_unlock(&kvm->arch.sca_lock);
>   			cond_resched();
>   			goto retry;
>   		}
>   		new = old;
>   		new.k = 1;
>   	} while (cmpxchg(&ic->val, old.val, new.val) != old.val);
> -	read_unlock(&vcpu->kvm->arch.sca_lock);
> +	read_unlock(&kvm->arch.sca_lock);
>   out:
> -	mutex_unlock(&vcpu->kvm->arch.ipte_mutex);
> +	mutex_unlock(&kvm->arch.ipte_mutex);
>   }
>   
> -static void ipte_unlock_simple(struct kvm_vcpu *vcpu)
> +static void ipte_unlock_simple(struct kvm *kvm)
>   {
>   	union ipte_control old, new, *ic;
>   
> -	mutex_lock(&vcpu->kvm->arch.ipte_mutex);
> -	vcpu->kvm->arch.ipte_lock_count--;
> -	if (vcpu->kvm->arch.ipte_lock_count)
> +	mutex_lock(&kvm->arch.ipte_mutex);
> +	kvm->arch.ipte_lock_count--;
> +	if (kvm->arch.ipte_lock_count)
>   		goto out;
> -	read_lock(&vcpu->kvm->arch.sca_lock);
> -	ic = kvm_s390_get_ipte_control(vcpu->kvm);
> +	read_lock(&kvm->arch.sca_lock);
> +	ic = kvm_s390_get_ipte_control(kvm);
>   	do {
>   		old = READ_ONCE(*ic);
>   		new = old;
>   		new.k = 0;
>   	} while (cmpxchg(&ic->val, old.val, new.val) != old.val);
> -	read_unlock(&vcpu->kvm->arch.sca_lock);
> -	wake_up(&vcpu->kvm->arch.ipte_wq);
> +	read_unlock(&kvm->arch.sca_lock);
> +	wake_up(&kvm->arch.ipte_wq);
>   out:
> -	mutex_unlock(&vcpu->kvm->arch.ipte_mutex);
> +	mutex_unlock(&kvm->arch.ipte_mutex);
>   }
>   
> -static void ipte_lock_siif(struct kvm_vcpu *vcpu)
> +static void ipte_lock_siif(struct kvm *kvm)
>   {
>   	union ipte_control old, new, *ic;
>   
>   retry:
> -	read_lock(&vcpu->kvm->arch.sca_lock);
> -	ic = kvm_s390_get_ipte_control(vcpu->kvm);
> +	read_lock(&kvm->arch.sca_lock);
> +	ic = kvm_s390_get_ipte_control(kvm);
>   	do {
>   		old = READ_ONCE(*ic);
>   		if (old.kg) {
> -			read_unlock(&vcpu->kvm->arch.sca_lock);
> +			read_unlock(&kvm->arch.sca_lock);
>   			cond_resched();
>   			goto retry;
>   		}
> @@ -340,15 +340,15 @@ static void ipte_lock_siif(struct kvm_vcpu *vcpu)
>   		new.k = 1;
>   		new.kh++;
>   	} while (cmpxchg(&ic->val, old.val, new.val) != old.val);
> -	read_unlock(&vcpu->kvm->arch.sca_lock);
> +	read_unlock(&kvm->arch.sca_lock);
>   }
>   
> -static void ipte_unlock_siif(struct kvm_vcpu *vcpu)
> +static void ipte_unlock_siif(struct kvm *kvm)
>   {
>   	union ipte_control old, new, *ic;
>   
> -	read_lock(&vcpu->kvm->arch.sca_lock);
> -	ic = kvm_s390_get_ipte_control(vcpu->kvm);
> +	read_lock(&kvm->arch.sca_lock);
> +	ic = kvm_s390_get_ipte_control(kvm);
>   	do {
>   		old = READ_ONCE(*ic);
>   		new = old;
> @@ -356,25 +356,25 @@ static void ipte_unlock_siif(struct kvm_vcpu *vcpu)
>   		if (!new.kh)
>   			new.k = 0;
>   	} while (cmpxchg(&ic->val, old.val, new.val) != old.val);
> -	read_unlock(&vcpu->kvm->arch.sca_lock);
> +	read_unlock(&kvm->arch.sca_lock);
>   	if (!new.kh)
> -		wake_up(&vcpu->kvm->arch.ipte_wq);
> +		wake_up(&kvm->arch.ipte_wq);
>   }
>   
> -void ipte_lock(struct kvm_vcpu *vcpu)
> +void ipte_lock(struct kvm *kvm)
>   {
> -	if (vcpu->arch.sie_block->eca & ECA_SII)
> -		ipte_lock_siif(vcpu);
> +	if (sclp.has_siif)
> +		ipte_lock_siif(kvm);
>   	else
> -		ipte_lock_simple(vcpu);
> +		ipte_lock_simple(kvm);
>   }
>   
> -void ipte_unlock(struct kvm_vcpu *vcpu)
> +void ipte_unlock(struct kvm *kvm)
>   {
> -	if (vcpu->arch.sie_block->eca & ECA_SII)
> -		ipte_unlock_siif(vcpu);
> +	if (sclp.has_siif)
> +		ipte_unlock_siif(kvm);
>   	else
> -		ipte_unlock_simple(vcpu);
> +		ipte_unlock_simple(kvm);
>   }
>   
>   static int ar_translation(struct kvm_vcpu *vcpu, union asce *asce, u8 ar,
> @@ -1075,7 +1075,7 @@ int access_guest_with_key(struct kvm_vcpu *vcpu, unsigned long ga, u8 ar,
>   	try_storage_prot_override = storage_prot_override_applicable(vcpu);
>   	need_ipte_lock = psw_bits(*psw).dat && !asce.r;
>   	if (need_ipte_lock)
> -		ipte_lock(vcpu);
> +		ipte_lock(vcpu->kvm);
>   	/*
>   	 * Since we do the access further down ultimately via a move instruction
>   	 * that does key checking and returns an error in case of a protection
> @@ -1113,7 +1113,7 @@ int access_guest_with_key(struct kvm_vcpu *vcpu, unsigned long ga, u8 ar,
>   		rc = trans_exc(vcpu, rc, ga, ar, mode, prot);
>   out_unlock:
>   	if (need_ipte_lock)
> -		ipte_unlock(vcpu);
> +		ipte_unlock(vcpu->kvm);
>   	if (nr_pages > ARRAY_SIZE(gpa_array))
>   		vfree(gpas);
>   	return rc;
> @@ -1185,10 +1185,10 @@ int check_gva_range(struct kvm_vcpu *vcpu, unsigned long gva, u8 ar,
>   	rc = get_vcpu_asce(vcpu, &asce, gva, ar, mode);
>   	if (rc)
>   		return rc;
> -	ipte_lock(vcpu);
> +	ipte_lock(vcpu->kvm);
>   	rc = guest_range_to_gpas(vcpu, gva, ar, NULL, length, asce, mode,
>   				 access_key);
> -	ipte_unlock(vcpu);
> +	ipte_unlock(vcpu->kvm);
>   
>   	return rc;
>   }
> @@ -1451,7 +1451,7 @@ int kvm_s390_shadow_fault(struct kvm_vcpu *vcpu, struct gmap *sg,
>   	 * tables/pointers we read stay valid - unshadowing is however
>   	 * always possible - only guest_table_lock protects us.
>   	 */
> -	ipte_lock(vcpu);
> +	ipte_lock(vcpu->kvm);
>   
>   	rc = gmap_shadow_pgt_lookup(sg, saddr, &pgt, &dat_protection, &fake);
>   	if (rc)
> @@ -1485,7 +1485,7 @@ int kvm_s390_shadow_fault(struct kvm_vcpu *vcpu, struct gmap *sg,
>   	pte.p |= dat_protection;
>   	if (!rc)
>   		rc = gmap_shadow_page(sg, saddr, __pte(pte.val));
> -	ipte_unlock(vcpu);
> +	ipte_unlock(vcpu->kvm);
>   	mmap_read_unlock(sg->mm);
>   	return rc;
>   }
> diff --git a/arch/s390/kvm/gaccess.h b/arch/s390/kvm/gaccess.h
> index 1124ff282012..9408d6cc8e2c 100644
> --- a/arch/s390/kvm/gaccess.h
> +++ b/arch/s390/kvm/gaccess.h
> @@ -440,9 +440,9 @@ int read_guest_real(struct kvm_vcpu *vcpu, unsigned long gra, void *data,
>   	return access_guest_real(vcpu, gra, data, len, 0);
>   }
>   
> -void ipte_lock(struct kvm_vcpu *vcpu);
> -void ipte_unlock(struct kvm_vcpu *vcpu);
> -int ipte_lock_held(struct kvm_vcpu *vcpu);
> +void ipte_lock(struct kvm *kvm);
> +void ipte_unlock(struct kvm *kvm);
> +int ipte_lock_held(struct kvm *kvm);
>   int kvm_s390_check_low_addr_prot_real(struct kvm_vcpu *vcpu, unsigned long gra);
>   
>   /* MVPG PEI indication bits */
> diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
> index 5beb7a4a11b3..0e8603acc105 100644
> --- a/arch/s390/kvm/priv.c
> +++ b/arch/s390/kvm/priv.c
> @@ -443,7 +443,7 @@ static int handle_ipte_interlock(struct kvm_vcpu *vcpu)
>   	vcpu->stat.instruction_ipte_interlock++;
>   	if (psw_bits(vcpu->arch.sie_block->gpsw).pstate)
>   		return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
> -	wait_event(vcpu->kvm->arch.ipte_wq, !ipte_lock_held(vcpu));
> +	wait_event(vcpu->kvm->arch.ipte_wq, !ipte_lock_held(vcpu->kvm));
>   	kvm_s390_retry_instr(vcpu);
>   	VCPU_EVENT(vcpu, 4, "%s", "retrying ipte interlock operation");
>   	return 0;
> @@ -1472,7 +1472,7 @@ static int handle_tprot(struct kvm_vcpu *vcpu)
>   	access_key = (operand2 & 0xf0) >> 4;
>   
>   	if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_DAT)
> -		ipte_lock(vcpu);
> +		ipte_lock(vcpu->kvm);
>   
>   	ret = guest_translate_address_with_key(vcpu, address, ar, &gpa,
>   					       GACC_STORE, access_key);
> @@ -1509,7 +1509,7 @@ static int handle_tprot(struct kvm_vcpu *vcpu)
>   	}
>   
>   	if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_DAT)
> -		ipte_unlock(vcpu);
> +		ipte_unlock(vcpu->kvm);
>   	return ret;
>   }
>   



* Re: [PATCH v9 2/3] s390x: KVM: guest support for topology function
  2022-05-06  9:24 ` [PATCH v9 2/3] s390x: KVM: guest support for topology function Pierre Morel
  2022-05-12  9:24   ` David Hildenbrand
@ 2022-05-12 11:41   ` Janosch Frank
  2022-05-16 10:41     ` Pierre Morel
  2022-05-19  9:01   ` Christian Borntraeger
  2 siblings, 1 reply; 29+ messages in thread
From: Janosch Frank @ 2022-05-12 11:41 UTC (permalink / raw)
  To: Pierre Morel, kvm
  Cc: linux-s390, linux-kernel, borntraeger, cohuck, david, thuth,
	imbrenda, hca, gor, wintera, seiden, nrb

On 5/6/22 11:24, Pierre Morel wrote:
> We let the userland hypervisor know if the machine supports the CPU
> topology facility using a new KVM capability: KVM_CAP_S390_CPU_TOPOLOGY.

Nope, we indicate KVM's support which is based on the machine's support.

On the same note: Shouldn't the CAP indication be part of the last
patch? The resets are needed for full support of this feature, no?


* Re: [PATCH v9 3/3] s390x: KVM: resetting the Topology-Change-Report
  2022-05-12  9:31   ` David Hildenbrand
  2022-05-12  9:52     ` Claudio Imbrenda
@ 2022-05-16 10:36     ` Pierre Morel
  2022-05-18 10:51     ` Pierre Morel
  2 siblings, 0 replies; 29+ messages in thread
From: Pierre Morel @ 2022-05-16 10:36 UTC (permalink / raw)
  To: David Hildenbrand, kvm
  Cc: linux-s390, linux-kernel, borntraeger, frankja, cohuck, thuth,
	imbrenda, hca, gor, wintera, seiden, nrb



On 5/12/22 11:31, David Hildenbrand wrote:
> On 06.05.22 11:24, Pierre Morel wrote:
>> During a subsystem reset the Topology-Change-Report is cleared.
>> Let's give userland the possibility to clear the MTCR in the case
>> of a subsystem reset.
>>
>> To migrate the MTCR, let's give userland the possibility to
>> query the MTCR state.
>>
>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
>> ---
>>   arch/s390/include/uapi/asm/kvm.h |  5 ++
>>   arch/s390/kvm/kvm-s390.c         | 79 ++++++++++++++++++++++++++++++++
>>   2 files changed, 84 insertions(+)
>>
>> diff --git a/arch/s390/include/uapi/asm/kvm.h b/arch/s390/include/uapi/asm/kvm.h
>> index 7a6b14874d65..abdcf4069343 100644
>> --- a/arch/s390/include/uapi/asm/kvm.h
>> +++ b/arch/s390/include/uapi/asm/kvm.h
>> @@ -74,6 +74,7 @@ struct kvm_s390_io_adapter_req {
>>   #define KVM_S390_VM_CRYPTO		2
>>   #define KVM_S390_VM_CPU_MODEL		3
>>   #define KVM_S390_VM_MIGRATION		4
>> +#define KVM_S390_VM_CPU_TOPOLOGY	5
>>   
>>   /* kvm attributes for mem_ctrl */
>>   #define KVM_S390_VM_MEM_ENABLE_CMMA	0
>> @@ -171,6 +172,10 @@ struct kvm_s390_vm_cpu_subfunc {
>>   #define KVM_S390_VM_MIGRATION_START	1
>>   #define KVM_S390_VM_MIGRATION_STATUS	2
>>   
>> +/* kvm attributes for cpu topology */
>> +#define KVM_S390_VM_CPU_TOPO_MTR_CLEAR	0
>> +#define KVM_S390_VM_CPU_TOPO_MTR_SET	1
>> +
>>   /* for KVM_GET_REGS and KVM_SET_REGS */
>>   struct kvm_regs {
>>   	/* general purpose regs for s390 */
>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>> index c8bdce31464f..80a1244f0ead 100644
>> --- a/arch/s390/kvm/kvm-s390.c
>> +++ b/arch/s390/kvm/kvm-s390.c
>> @@ -1731,6 +1731,76 @@ static void kvm_s390_sca_set_mtcr(struct kvm *kvm)
>>   	ipte_unlock(kvm);
>>   }
>>   
>> +/**
>> + * kvm_s390_sca_clear_mtcr
>> + * @kvm: guest KVM description
>> + *
>> + * Is only relevant if the topology facility is present,
>> + * the caller should check KVM facility 11
>> + *
>> + * Updates the Multiprocessor Topology-Change-Report to signal
>> + * the guest with a topology change.
>> + */
>> +static void kvm_s390_sca_clear_mtcr(struct kvm *kvm)
>> +{
>> +	struct bsca_block *sca = kvm->arch.sca; /* SCA version doesn't matter */
>> +
>> +	ipte_lock(kvm);
>> +	sca->utility  &= ~SCA_UTILITY_MTCR;
> 
> 
> One space too much.
> 
> sca->utility &= ~SCA_UTILITY_MTCR;
> 
>> +	ipte_unlock(kvm);
>> +}
>> +
>> +static int kvm_s390_set_topology(struct kvm *kvm, struct kvm_device_attr *attr)
>> +{
>> +	if (!test_kvm_facility(kvm, 11))
>> +		return -ENXIO;
>> +
>> +	switch (attr->attr) {
>> +	case KVM_S390_VM_CPU_TOPO_MTR_SET:
>> +		kvm_s390_sca_set_mtcr(kvm);
>> +		break;
>> +	case KVM_S390_VM_CPU_TOPO_MTR_CLEAR:
>> +		kvm_s390_sca_clear_mtcr(kvm);
>> +		break;
>> +	}
>> +	return 0;
>> +}
>> +
>> +/**
>> + * kvm_s390_sca_get_mtcr
>> + * @kvm: guest KVM description
>> + *
>> + * Is only relevant if the topology facility is present,
>> + * the caller should check KVM facility 11
>> + *
>> + * reports to QEMU the Multiprocessor Topology-Change-Report.
>> + */
>> +static int kvm_s390_sca_get_mtcr(struct kvm *kvm)
>> +{
>> +	struct bsca_block *sca = kvm->arch.sca; /* SCA version doesn't matter */
>> +	int val;
>> +
>> +	ipte_lock(kvm);
>> +	val = !!(sca->utility & SCA_UTILITY_MTCR);
>> +	ipte_unlock(kvm);
>> +
>> +	return val;
>> +}
>> +
>> +static int kvm_s390_get_topology(struct kvm *kvm, struct kvm_device_attr *attr)
>> +{
>> +	int mtcr;
> 
> I think we prefer something like u16 when copying to user space.

If you prefer.
The original idea was to have something like a bool, but then I should
have changed get_mtcr to has_mtcr.


> 
>> +
>> +	if (!test_kvm_facility(kvm, 11))
>> +		return -ENXIO;
>> +
>> +	mtcr = kvm_s390_sca_get_mtcr(kvm);
>> +	if (copy_to_user((void __user *)attr->addr, &mtcr, sizeof(mtcr)))
>> +		return -EFAULT;
>> +
>> +	return 0;
>> +}
> 
> You should probably add documentation, and document that only the last
> bit (0x1) has a meaning.
> 
> Apart from that LGTM.
> 

-- 
Pierre Morel
IBM Lab Boeblingen


* Re: [PATCH v9 2/3] s390x: KVM: guest support for topology function
  2022-05-12 11:41   ` Janosch Frank
@ 2022-05-16 10:41     ` Pierre Morel
  0 siblings, 0 replies; 29+ messages in thread
From: Pierre Morel @ 2022-05-16 10:41 UTC (permalink / raw)
  To: Janosch Frank, kvm
  Cc: linux-s390, linux-kernel, borntraeger, cohuck, david, thuth,
	imbrenda, hca, gor, wintera, seiden, nrb



On 5/12/22 13:41, Janosch Frank wrote:
> On 5/6/22 11:24, Pierre Morel wrote:
>> We let the userland hypervisor know if the machine supports the CPU
>> topology facility using a new KVM capability: KVM_CAP_S390_CPU_TOPOLOGY.
> 
> Nope, we indicate KVM's support which is based on the machine's support.

OK, I will reword.

> 
> On the same note: Shouldn't the CAP indication be part of the last 
> patch? The resets are needed for a full support of this feature, no?

Looks right, I will move it last.

-- 
Pierre Morel
IBM Lab Boeblingen


* Re: [PATCH v9 2/3] s390x: KVM: guest support for topology function
  2022-05-12  9:24   ` David Hildenbrand
@ 2022-05-16 14:13     ` Pierre Morel
  2022-06-17 14:49       ` Pierre Morel
  0 siblings, 1 reply; 29+ messages in thread
From: Pierre Morel @ 2022-05-16 14:13 UTC (permalink / raw)
  To: David Hildenbrand, kvm
  Cc: linux-s390, linux-kernel, borntraeger, frankja, cohuck, thuth,
	imbrenda, hca, gor, wintera, seiden, nrb



On 5/12/22 11:24, David Hildenbrand wrote:
> On 06.05.22 11:24, Pierre Morel wrote:
>> We let the userland hypervisor know if the machine supports the CPU
>> topology facility using a new KVM capability: KVM_CAP_S390_CPU_TOPOLOGY.
>>
>> The PTF instruction will report a topology change if there is any change
>> with a previous STSI_15_1_2 SYSIB.
>> Changes inside a STSI_15_1_2 SYSIB occur if CPU bits are set or cleared
>> inside the CPU Topology List Entry CPU mask field, which happens with
>> changes in CPU polarization, dedication, CPU types and adding or
>> removing CPUs in a socket.
>>
>> The reporting to the guest is done using the Multiprocessor
>> Topology-Change-Report (MTCR) bit of the utility entry of the guest's
>> SCA which will be cleared during the interpretation of PTF.
>>
>> To check if the topology has been modified we use a new field of the
>> arch vCPU to save the previous real CPU ID at the end of a schedule
>> and verify on next schedule that the CPU used is in the same socket.
>> We do not report polarization, CPU Type or dedication change.
>>
>> STSI(15.1.x) gives information on the CPU configuration topology.
>> Let's accept the interception of STSI with the function code 15 and
>> let the userland part of the hypervisor handle it when userland
>> supports the CPU Topology facility.
>>
>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
> 
> [...]
> 
> 
>> diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
>> index 0e8603acc105..d9e16b09c8bf 100644
>> --- a/arch/s390/kvm/priv.c
>> +++ b/arch/s390/kvm/priv.c
>> @@ -874,10 +874,12 @@ static int handle_stsi(struct kvm_vcpu *vcpu)
>>   	if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
>>   		return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
>>   
>> -	if (fc > 3) {
>> -		kvm_s390_set_psw_cc(vcpu, 3);
>> -		return 0;
>> -	}
>> +	if (fc > 3 && fc != 15)
>> +		goto out_no_data;
>> +
>> +	/* fc 15 is provided with PTF/CPU topology support */
>> +	if (fc == 15 && !test_kvm_facility(vcpu->kvm, 11))
>> +		goto out_no_data;
> 
> 
> Maybe shorter as
> 
> if (fc == 15 && !test_kvm_facility(vcpu->kvm, 11))
> 	goto out_no_data;
> else if (fc > 3)
> 	goto out_no_data;
> 

yes.

> 
> Apart from that, LGTM.
> 

Thanks,
Pierre

-- 
Pierre Morel
IBM Lab Boeblingen


* Re: [PATCH v9 1/3] s390x: KVM: ipte lock for SCA access should be contained in KVM
  2022-05-12 11:32   ` Janosch Frank
@ 2022-05-16 14:13     ` Pierre Morel
  0 siblings, 0 replies; 29+ messages in thread
From: Pierre Morel @ 2022-05-16 14:13 UTC (permalink / raw)
  To: Janosch Frank, kvm
  Cc: linux-s390, linux-kernel, borntraeger, cohuck, david, thuth,
	imbrenda, hca, gor, wintera, seiden, nrb



On 5/12/22 13:32, Janosch Frank wrote:
> On 5/6/22 11:24, Pierre Morel wrote:
>> The former check to choose between SIIF or not SIIF can be done
>> using the sclp.has_siif instead of accessing per vCPU structures
> 
> Maybe replace this paragraph with:
> We can check if SIIF is enabled by testing the sclp_info struct instead 
> of testing the sie control block eca variable. sclp.has_siif is the only 
> requirement to set ECA_SII anyway so we can go straight to the source 
> for that.
> 
> Reviewed-by: Janosch Frank <frankja@linux.ibm.com>

OK, thanks,

Regards,
Pierre

> 
>>
>> When accessing the SCA, ipte_lock and ipte_unlock do not need
>> to access any vcpu structures but only the KVM structure.
>>
>> Let's simplify the ipte handling.
>>
>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
>> ---
>>   arch/s390/kvm/gaccess.c | 96 ++++++++++++++++++++---------------------
>>   arch/s390/kvm/gaccess.h |  6 +--
>>   arch/s390/kvm/priv.c    |  6 +--
>>   3 files changed, 54 insertions(+), 54 deletions(-)
>>
>> diff --git a/arch/s390/kvm/gaccess.c b/arch/s390/kvm/gaccess.c
>> index d53a183c2005..0e1f6dd31882 100644
>> --- a/arch/s390/kvm/gaccess.c
>> +++ b/arch/s390/kvm/gaccess.c
>> @@ -262,77 +262,77 @@ struct aste {
>>       /* .. more fields there */
>>   };
>> -int ipte_lock_held(struct kvm_vcpu *vcpu)
>> +int ipte_lock_held(struct kvm *kvm)
>>   {
>> -    if (vcpu->arch.sie_block->eca & ECA_SII) {
>> +    if (sclp.has_siif) {
>>           int rc;
>> -        read_lock(&vcpu->kvm->arch.sca_lock);
>> -        rc = kvm_s390_get_ipte_control(vcpu->kvm)->kh != 0;
>> -        read_unlock(&vcpu->kvm->arch.sca_lock);
>> +        read_lock(&kvm->arch.sca_lock);
>> +        rc = kvm_s390_get_ipte_control(kvm)->kh != 0;
>> +        read_unlock(&kvm->arch.sca_lock);
>>           return rc;
>>       }
>> -    return vcpu->kvm->arch.ipte_lock_count != 0;
>> +    return kvm->arch.ipte_lock_count != 0;
>>   }
>> -static void ipte_lock_simple(struct kvm_vcpu *vcpu)
>> +static void ipte_lock_simple(struct kvm *kvm)
>>   {
>>       union ipte_control old, new, *ic;
>> -    mutex_lock(&vcpu->kvm->arch.ipte_mutex);
>> -    vcpu->kvm->arch.ipte_lock_count++;
>> -    if (vcpu->kvm->arch.ipte_lock_count > 1)
>> +    mutex_lock(&kvm->arch.ipte_mutex);
>> +    kvm->arch.ipte_lock_count++;
>> +    if (kvm->arch.ipte_lock_count > 1)
>>           goto out;
>>   retry:
>> -    read_lock(&vcpu->kvm->arch.sca_lock);
>> -    ic = kvm_s390_get_ipte_control(vcpu->kvm);
>> +    read_lock(&kvm->arch.sca_lock);
>> +    ic = kvm_s390_get_ipte_control(kvm);
>>       do {
>>           old = READ_ONCE(*ic);
>>           if (old.k) {
>> -            read_unlock(&vcpu->kvm->arch.sca_lock);
>> +            read_unlock(&kvm->arch.sca_lock);
>>               cond_resched();
>>               goto retry;
>>           }
>>           new = old;
>>           new.k = 1;
>>       } while (cmpxchg(&ic->val, old.val, new.val) != old.val);
>> -    read_unlock(&vcpu->kvm->arch.sca_lock);
>> +    read_unlock(&kvm->arch.sca_lock);
>>   out:
>> -    mutex_unlock(&vcpu->kvm->arch.ipte_mutex);
>> +    mutex_unlock(&kvm->arch.ipte_mutex);
>>   }
>> -static void ipte_unlock_simple(struct kvm_vcpu *vcpu)
>> +static void ipte_unlock_simple(struct kvm *kvm)
>>   {
>>       union ipte_control old, new, *ic;
>> -    mutex_lock(&vcpu->kvm->arch.ipte_mutex);
>> -    vcpu->kvm->arch.ipte_lock_count--;
>> -    if (vcpu->kvm->arch.ipte_lock_count)
>> +    mutex_lock(&kvm->arch.ipte_mutex);
>> +    kvm->arch.ipte_lock_count--;
>> +    if (kvm->arch.ipte_lock_count)
>>           goto out;
>> -    read_lock(&vcpu->kvm->arch.sca_lock);
>> -    ic = kvm_s390_get_ipte_control(vcpu->kvm);
>> +    read_lock(&kvm->arch.sca_lock);
>> +    ic = kvm_s390_get_ipte_control(kvm);
>>       do {
>>           old = READ_ONCE(*ic);
>>           new = old;
>>           new.k = 0;
>>       } while (cmpxchg(&ic->val, old.val, new.val) != old.val);
>> -    read_unlock(&vcpu->kvm->arch.sca_lock);
>> -    wake_up(&vcpu->kvm->arch.ipte_wq);
>> +    read_unlock(&kvm->arch.sca_lock);
>> +    wake_up(&kvm->arch.ipte_wq);
>>   out:
>> -    mutex_unlock(&vcpu->kvm->arch.ipte_mutex);
>> +    mutex_unlock(&kvm->arch.ipte_mutex);
>>   }
>> -static void ipte_lock_siif(struct kvm_vcpu *vcpu)
>> +static void ipte_lock_siif(struct kvm *kvm)
>>   {
>>       union ipte_control old, new, *ic;
>>   retry:
>> -    read_lock(&vcpu->kvm->arch.sca_lock);
>> -    ic = kvm_s390_get_ipte_control(vcpu->kvm);
>> +    read_lock(&kvm->arch.sca_lock);
>> +    ic = kvm_s390_get_ipte_control(kvm);
>>       do {
>>           old = READ_ONCE(*ic);
>>           if (old.kg) {
>> -            read_unlock(&vcpu->kvm->arch.sca_lock);
>> +            read_unlock(&kvm->arch.sca_lock);
>>               cond_resched();
>>               goto retry;
>>           }
>> @@ -340,15 +340,15 @@ static void ipte_lock_siif(struct kvm_vcpu *vcpu)
>>           new.k = 1;
>>           new.kh++;
>>       } while (cmpxchg(&ic->val, old.val, new.val) != old.val);
>> -    read_unlock(&vcpu->kvm->arch.sca_lock);
>> +    read_unlock(&kvm->arch.sca_lock);
>>   }
>> -static void ipte_unlock_siif(struct kvm_vcpu *vcpu)
>> +static void ipte_unlock_siif(struct kvm *kvm)
>>   {
>>       union ipte_control old, new, *ic;
>> -    read_lock(&vcpu->kvm->arch.sca_lock);
>> -    ic = kvm_s390_get_ipte_control(vcpu->kvm);
>> +    read_lock(&kvm->arch.sca_lock);
>> +    ic = kvm_s390_get_ipte_control(kvm);
>>       do {
>>           old = READ_ONCE(*ic);
>>           new = old;
>> @@ -356,25 +356,25 @@ static void ipte_unlock_siif(struct kvm_vcpu *vcpu)
>>           if (!new.kh)
>>               new.k = 0;
>>       } while (cmpxchg(&ic->val, old.val, new.val) != old.val);
>> -    read_unlock(&vcpu->kvm->arch.sca_lock);
>> +    read_unlock(&kvm->arch.sca_lock);
>>       if (!new.kh)
>> -        wake_up(&vcpu->kvm->arch.ipte_wq);
>> +        wake_up(&kvm->arch.ipte_wq);
>>   }
>> -void ipte_lock(struct kvm_vcpu *vcpu)
>> +void ipte_lock(struct kvm *kvm)
>>   {
>> -    if (vcpu->arch.sie_block->eca & ECA_SII)
>> -        ipte_lock_siif(vcpu);
>> +    if (sclp.has_siif)
>> +        ipte_lock_siif(kvm);
>>       else
>> -        ipte_lock_simple(vcpu);
>> +        ipte_lock_simple(kvm);
>>   }
>> -void ipte_unlock(struct kvm_vcpu *vcpu)
>> +void ipte_unlock(struct kvm *kvm)
>>   {
>> -    if (vcpu->arch.sie_block->eca & ECA_SII)
>> -        ipte_unlock_siif(vcpu);
>> +    if (sclp.has_siif)
>> +        ipte_unlock_siif(kvm);
>>       else
>> -        ipte_unlock_simple(vcpu);
>> +        ipte_unlock_simple(kvm);
>>   }
>>   static int ar_translation(struct kvm_vcpu *vcpu, union asce *asce, u8 ar,
>> @@ -1075,7 +1075,7 @@ int access_guest_with_key(struct kvm_vcpu *vcpu, unsigned long ga, u8 ar,
>>       try_storage_prot_override = storage_prot_override_applicable(vcpu);
>>       need_ipte_lock = psw_bits(*psw).dat && !asce.r;
>>       if (need_ipte_lock)
>> -        ipte_lock(vcpu);
>> +        ipte_lock(vcpu->kvm);
>>       /*
>>        * Since we do the access further down ultimately via a move instruction
>>        * that does key checking and returns an error in case of a protection
>> @@ -1113,7 +1113,7 @@ int access_guest_with_key(struct kvm_vcpu *vcpu, unsigned long ga, u8 ar,
>>           rc = trans_exc(vcpu, rc, ga, ar, mode, prot);
>>   out_unlock:
>>       if (need_ipte_lock)
>> -        ipte_unlock(vcpu);
>> +        ipte_unlock(vcpu->kvm);
>>       if (nr_pages > ARRAY_SIZE(gpa_array))
>>           vfree(gpas);
>>       return rc;
>> @@ -1185,10 +1185,10 @@ int check_gva_range(struct kvm_vcpu *vcpu, unsigned long gva, u8 ar,
>>       rc = get_vcpu_asce(vcpu, &asce, gva, ar, mode);
>>       if (rc)
>>           return rc;
>> -    ipte_lock(vcpu);
>> +    ipte_lock(vcpu->kvm);
>>       rc = guest_range_to_gpas(vcpu, gva, ar, NULL, length, asce, mode,
>>                    access_key);
>> -    ipte_unlock(vcpu);
>> +    ipte_unlock(vcpu->kvm);
>>       return rc;
>>   }
>> @@ -1451,7 +1451,7 @@ int kvm_s390_shadow_fault(struct kvm_vcpu *vcpu, struct gmap *sg,
>>        * tables/pointers we read stay valid - unshadowing is however
>>        * always possible - only guest_table_lock protects us.
>>        */
>> -    ipte_lock(vcpu);
>> +    ipte_lock(vcpu->kvm);
>>       rc = gmap_shadow_pgt_lookup(sg, saddr, &pgt, &dat_protection, &fake);
>>       if (rc)
>> @@ -1485,7 +1485,7 @@ int kvm_s390_shadow_fault(struct kvm_vcpu *vcpu, struct gmap *sg,
>>       pte.p |= dat_protection;
>>       if (!rc)
>>           rc = gmap_shadow_page(sg, saddr, __pte(pte.val));
>> -    ipte_unlock(vcpu);
>> +    ipte_unlock(vcpu->kvm);
>>       mmap_read_unlock(sg->mm);
>>       return rc;
>>   }
>> diff --git a/arch/s390/kvm/gaccess.h b/arch/s390/kvm/gaccess.h
>> index 1124ff282012..9408d6cc8e2c 100644
>> --- a/arch/s390/kvm/gaccess.h
>> +++ b/arch/s390/kvm/gaccess.h
>> @@ -440,9 +440,9 @@ int read_guest_real(struct kvm_vcpu *vcpu, unsigned long gra, void *data,
>>       return access_guest_real(vcpu, gra, data, len, 0);
>>   }
>> -void ipte_lock(struct kvm_vcpu *vcpu);
>> -void ipte_unlock(struct kvm_vcpu *vcpu);
>> -int ipte_lock_held(struct kvm_vcpu *vcpu);
>> +void ipte_lock(struct kvm *kvm);
>> +void ipte_unlock(struct kvm *kvm);
>> +int ipte_lock_held(struct kvm *kvm);
>>   int kvm_s390_check_low_addr_prot_real(struct kvm_vcpu *vcpu, unsigned long gra);
>>   /* MVPG PEI indication bits */
>> diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
>> index 5beb7a4a11b3..0e8603acc105 100644
>> --- a/arch/s390/kvm/priv.c
>> +++ b/arch/s390/kvm/priv.c
>> @@ -443,7 +443,7 @@ static int handle_ipte_interlock(struct kvm_vcpu *vcpu)
>>       vcpu->stat.instruction_ipte_interlock++;
>>       if (psw_bits(vcpu->arch.sie_block->gpsw).pstate)
>>           return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
>> -    wait_event(vcpu->kvm->arch.ipte_wq, !ipte_lock_held(vcpu));
>> +    wait_event(vcpu->kvm->arch.ipte_wq, !ipte_lock_held(vcpu->kvm));
>>       kvm_s390_retry_instr(vcpu);
>>       VCPU_EVENT(vcpu, 4, "%s", "retrying ipte interlock operation");
>>       return 0;
>> @@ -1472,7 +1472,7 @@ static int handle_tprot(struct kvm_vcpu *vcpu)
>>       access_key = (operand2 & 0xf0) >> 4;
>>       if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_DAT)
>> -        ipte_lock(vcpu);
>> +        ipte_lock(vcpu->kvm);
>>       ret = guest_translate_address_with_key(vcpu, address, ar, &gpa,
>>                              GACC_STORE, access_key);
>> @@ -1509,7 +1509,7 @@ static int handle_tprot(struct kvm_vcpu *vcpu)
>>       }
>>       if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_DAT)
>> -        ipte_unlock(vcpu);
>> +        ipte_unlock(vcpu->kvm);
>>       return ret;
>>   }
> 

-- 
Pierre Morel
IBM Lab Boeblingen
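
To illustrate why the lock only needs per-VM state, here is a minimal userspace model of the SIIF variant from the diff above, written with C11 atomics instead of the kernel's `cmpxchg()`. Everything in it is an assumption made for illustration: `struct vm_model`, the `*_model` function names, and the bit positions chosen for k, kh and kg are hypothetical and not the kernel's `union ipte_control` layout. Unlike the kernel code, which drops the SCA lock, calls `cond_resched()` and retries, the lock function here simply reports failure when the guest-held count is non-zero.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative layout only: bit 63 = k, bits 32..62 = kh, bits 0..31 = kg. */
#define IC_K		(UINT64_C(1) << 63)
#define IC_KH_ONE	(UINT64_C(1) << 32)
#define IC_KH_MASK	(UINT64_C(0x7fffffff) << 32)
#define IC_KG_MASK	UINT64_C(0xffffffff)

/* Hypothetical per-VM state; note that no vCPU appears anywhere. */
struct vm_model {
	_Atomic uint64_t ipte_control;
};

/* Returns false if the guest holds the lock (kg != 0); the caller would retry. */
static bool ipte_trylock_model(struct vm_model *vm)
{
	uint64_t old = atomic_load(&vm->ipte_control);
	uint64_t new;

	do {
		if (old & IC_KG_MASK)
			return false;
		/* set k and increment the host-held count kh */
		new = (old | IC_K) + IC_KH_ONE;
	} while (!atomic_compare_exchange_weak(&vm->ipte_control, &old, new));
	return true;
}

/* Must only be called after a successful lock (kh >= 1). */
static void ipte_unlock_model(struct vm_model *vm)
{
	uint64_t old = atomic_load(&vm->ipte_control);
	uint64_t new;

	do {
		new = old - IC_KH_ONE;
		if (!(new & IC_KH_MASK))
			new &= ~IC_K;	/* last host holder clears k */
	} while (!atomic_compare_exchange_weak(&vm->ipte_control, &old, new));
}

static bool ipte_lock_held_model(struct vm_model *vm)
{
	return (atomic_load(&vm->ipte_control) & IC_KH_MASK) != 0;
}
```

Nested host acquisitions only bump kh, and the k bit is cleared when the last holder releases, which mirrors the `new.kh++` and `if (!new.kh) new.k = 0` steps in the diff.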

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v9 3/3] s390x: KVM: resetting the Topology-Change-Report
  2022-05-12 10:01       ` David Hildenbrand
@ 2022-05-16 14:21         ` Pierre Morel
  2022-05-18 14:33           ` David Hildenbrand
  0 siblings, 1 reply; 29+ messages in thread
From: Pierre Morel @ 2022-05-16 14:21 UTC (permalink / raw)
  To: David Hildenbrand, Claudio Imbrenda
  Cc: kvm, linux-s390, linux-kernel, borntraeger, frankja, cohuck,
	thuth, hca, gor, wintera, seiden, nrb



On 5/12/22 12:01, David Hildenbrand wrote:
>>>
>>> I think we prefer something like u16 when copying to user space.
>>
>> but then userspace also has to expect a u16, right?
> 
> Yep.
> 

Yes, but in fact, inspired by a previous discussion I had on the VFIO 
interface, that is the reason why I preferred an int.
It is much simpler than a u16 and the definition of a bit.

Although a bit in a u16 is what the s390 architecture proposes, I thought 
we could make it easier on the KVM/QEMU interface.

But if the discussion stops here, I will do as you both propose: change 
to u16 in KVM and userland and add the documentation for the interface.

Regards,
Pierre

-- 
Pierre Morel
IBM Lab Boeblingen

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v9 1/3] s390x: KVM: ipte lock for SCA access should be contained in KVM
  2022-05-12  9:08   ` David Hildenbrand
@ 2022-05-16 16:30     ` Pierre Morel
  0 siblings, 0 replies; 29+ messages in thread
From: Pierre Morel @ 2022-05-16 16:30 UTC (permalink / raw)
  To: David Hildenbrand, kvm
  Cc: linux-s390, linux-kernel, borntraeger, frankja, cohuck, thuth,
	imbrenda, hca, gor, wintera, seiden, nrb



On 5/12/22 11:08, David Hildenbrand wrote:
> On 06.05.22 11:24, Pierre Morel wrote:
>> The former check to choose between SIIF or not SIIF can be done
>> using the sclp.has_siif instead of accessing per vCPU structures
>>
>> When accessing the SCA, ipte_lock and ipte_unlock do not need
>> to access any vcpu structures but only the KVM structure.
>>
>> Let's simplify the ipte handling.
>>
>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
> 
> Much better
> 
> Reviewed-by: David Hildenbrand <david@redhat.com>
> 
> 

Thanks,
Regards,

Pierre

-- 
Pierre Morel
IBM Lab Boeblingen

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v9 3/3] s390x: KVM: resetting the Topology-Change-Report
  2022-05-12  9:31   ` David Hildenbrand
  2022-05-12  9:52     ` Claudio Imbrenda
  2022-05-16 10:36     ` Pierre Morel
@ 2022-05-18 10:51     ` Pierre Morel
  2 siblings, 0 replies; 29+ messages in thread
From: Pierre Morel @ 2022-05-18 10:51 UTC (permalink / raw)
  To: David Hildenbrand, kvm
  Cc: linux-s390, linux-kernel, borntraeger, frankja, cohuck, thuth,
	imbrenda, hca, gor, wintera, seiden, nrb



On 5/12/22 11:31, David Hildenbrand wrote:
> On 06.05.22 11:24, Pierre Morel wrote:
>> During a subsystem reset the Topology-Change-Report is cleared.
>> Let's give userland the possibility to clear the MTCR in the case
>> of a subsystem reset.
>>
>> To migrate the MTCR, let's give userland the possibility to
>> query the MTCR state.
>>
>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
>> ---
>>   arch/s390/include/uapi/asm/kvm.h |  5 ++
>>   arch/s390/kvm/kvm-s390.c         | 79 ++++++++++++++++++++++++++++++++
>>   2 files changed, 84 insertions(+)
>>
>> diff --git a/arch/s390/include/uapi/asm/kvm.h b/arch/s390/include/uapi/asm/kvm.h
>> index 7a6b14874d65..abdcf4069343 100644
>> --- a/arch/s390/include/uapi/asm/kvm.h
>> +++ b/arch/s390/include/uapi/asm/kvm.h
>> @@ -74,6 +74,7 @@ struct kvm_s390_io_adapter_req {
>>   #define KVM_S390_VM_CRYPTO		2
>>   #define KVM_S390_VM_CPU_MODEL		3
>>   #define KVM_S390_VM_MIGRATION		4
>> +#define KVM_S390_VM_CPU_TOPOLOGY	5
>>   
>>   /* kvm attributes for mem_ctrl */
>>   #define KVM_S390_VM_MEM_ENABLE_CMMA	0
>> @@ -171,6 +172,10 @@ struct kvm_s390_vm_cpu_subfunc {
>>   #define KVM_S390_VM_MIGRATION_START	1
>>   #define KVM_S390_VM_MIGRATION_STATUS	2
>>   
>> +/* kvm attributes for cpu topology */
>> +#define KVM_S390_VM_CPU_TOPO_MTR_CLEAR	0
>> +#define KVM_S390_VM_CPU_TOPO_MTR_SET	1
>> +
>>   /* for KVM_GET_REGS and KVM_SET_REGS */
>>   struct kvm_regs {
>>   	/* general purpose regs for s390 */
>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>> index c8bdce31464f..80a1244f0ead 100644
>> --- a/arch/s390/kvm/kvm-s390.c
>> +++ b/arch/s390/kvm/kvm-s390.c
>> @@ -1731,6 +1731,76 @@ static void kvm_s390_sca_set_mtcr(struct kvm *kvm)
>>   	ipte_unlock(kvm);
>>   }
>>   
>> +/**
>> + * kvm_s390_sca_clear_mtcr
>> + * @kvm: guest KVM description
>> + *
>> + * Is only relevant if the topology facility is present,
>> + * the caller should check KVM facility 11
>> + *
>> + * Updates the Multiprocessor Topology-Change-Report to signal
>> + * the guest with a topology change.
>> + */
>> +static void kvm_s390_sca_clear_mtcr(struct kvm *kvm)
>> +{
>> +	struct bsca_block *sca = kvm->arch.sca; /* SCA version doesn't matter */
>> +
>> +	ipte_lock(kvm);
>> +	sca->utility  &= ~SCA_UTILITY_MTCR;
> 
> 
> One space too much.
> 
> sca->utility &= ~SCA_UTILITY_MTCR;
> 
>> +	ipte_unlock(kvm);
>> +}
>> +
>> +static int kvm_s390_set_topology(struct kvm *kvm, struct kvm_device_attr *attr)
>> +{
>> +	if (!test_kvm_facility(kvm, 11))
>> +		return -ENXIO;
>> +
>> +	switch (attr->attr) {
>> +	case KVM_S390_VM_CPU_TOPO_MTR_SET:
>> +		kvm_s390_sca_set_mtcr(kvm);
>> +		break;
>> +	case KVM_S390_VM_CPU_TOPO_MTR_CLEAR:
>> +		kvm_s390_sca_clear_mtcr(kvm);
>> +		break;
>> +	}
>> +	return 0;
>> +}
>> +
>> +/**
>> + * kvm_s390_sca_get_mtcr
>> + * @kvm: guest KVM description
>> + *
>> + * Is only relevant if the topology facility is present,
>> + * the caller should check KVM facility 11
>> + *
>> + * reports to QEMU the Multiprocessor Topology-Change-Report.
>> + */
>> +static int kvm_s390_sca_get_mtcr(struct kvm *kvm)
>> +{
>> +	struct bsca_block *sca = kvm->arch.sca; /* SCA version doesn't matter */
>> +	int val;
>> +
>> +	ipte_lock(kvm);
>> +	val = !!(sca->utility & SCA_UTILITY_MTCR);
>> +	ipte_unlock(kvm);
>> +
>> +	return val;
>> +}
>> +
>> +static int kvm_s390_get_topology(struct kvm *kvm, struct kvm_device_attr *attr)
>> +{
>> +	int mtcr;
> 
> I think we prefer something like u16 when copying to user space.

I come back here.
I think I prefer to keep the int.

The u16 is more than the MTCR: it is the entire utility field, so what 
should I do:

rename the function to kvm_s390_get_sca_utility()?
And then should I rename KVM_S390_VM_CPU_TOPOLOGY
to KVM_S390_VM_SCA_UTILITY?

I do not like that, I do not think we should report/handle more 
information than expected/needed.

I can mask the MTCR bit and return a u16 with bit 0 (0x8000) set,
but I find this a little weird.

I admit an int may not be optimal.
Logically I should report a bool, but I do not like to report a bool 
through the UAPI.

The more I think about it, the more I think an int is OK.
Or, in case we want to save memory, I can create a flag in a 
u16, but it should theoretically be different from the firmware MTCR bit. 
It could be 0x0001.
But still, the value only lives during the copy_to_user, where the copy of 
an int may be as good as or better than the copy of a u16.

So, any more opinions on this?

Regards,
Pierre

> 
>> +
>> +	if (!test_kvm_facility(kvm, 11))
>> +		return -ENXIO;
>> +
>> +	mtcr = kvm_s390_sca_get_mtcr(kvm);
>> +	if (copy_to_user((void __user *)attr->addr, &mtcr, sizeof(mtcr)))
>> +		return -EFAULT;
>> +
>> +	return 0;
>> +}
> 
> You should probably add documentation, and document that only the last
> bit (0x1) has a meaning.
> 
> Apart from that LGTM.
> 

-- 
Pierre Morel
IBM Lab Boeblingen
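
The "flag in a u16" option mentioned above could look roughly like the following sketch. The names are hypothetical; the 0x8000 value is taken from the "bit 0 (0x8000)" remark in the message, and 0x0001 is the interface flag value floated there, not something the patch defines.

```c
#include <stdint.h>

/* Firmware-side bit: "bit 0 (0x8000)" of the 16-bit utility field. */
#define MODEL_UTILITY_MTCR	UINT16_C(0x8000)
/* Hypothetical interface flag, deliberately different from the firmware bit. */
#define MODEL_TOPO_F_MTCR	UINT16_C(0x0001)

/*
 * Translate the utility field into an interface value that exposes
 * only the MTCR state and nothing else from the field.
 */
static uint16_t topo_flags_from_utility(uint16_t utility)
{
	return (utility & MODEL_UTILITY_MTCR) ? MODEL_TOPO_F_MTCR : 0;
}
```

Decoupling the two bit values means other bits of the utility field never leak through the interface, which addresses the concern about reporting more information than needed.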

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v9 3/3] s390x: KVM: resetting the Topology-Change-Report
  2022-05-16 14:21         ` Pierre Morel
@ 2022-05-18 14:33           ` David Hildenbrand
  2022-05-18 16:55             ` Pierre Morel
  0 siblings, 1 reply; 29+ messages in thread
From: David Hildenbrand @ 2022-05-18 14:33 UTC (permalink / raw)
  To: Pierre Morel, Claudio Imbrenda
  Cc: kvm, linux-s390, linux-kernel, borntraeger, frankja, cohuck,
	thuth, hca, gor, wintera, seiden, nrb

On 16.05.22 16:21, Pierre Morel wrote:
> 
> 
> On 5/12/22 12:01, David Hildenbrand wrote:
>>>>
>>>> I think we prefer something like u16 when copying to user space.
>>>
>>> but then userspace also has to expect a u16, right?
>>
>> Yep.
>>
> 
> Yes, but in fact, inspired by a previous discussion I had on the VFIO 
> interface, that is the reason why I preferred an int.
> It is much simpler than a u16 and the definition of a bit.
> 
> Although a bit in a u16 is what the s390 architecture proposes, I thought 
> we could make it easier on the KVM/QEMU interface.
> 
> But if the discussion stops here, I will do as you both propose: change 
> to u16 in KVM and userland and add the documentation for the interface.

In general, we pass via the ABI fixed-sized values -- u8, u16, u32, u64
... instead of int. Simply because sizeof(int) is in theory variable
(e.g., 32bit vs 64bit).

Take a look at arch/s390/include/uapi/asm/kvm.h and you won't find any
usage of int or bool.

Having that said, I'll let the maintainers decide. Using e.g., u8 is
just the natural thing to do on a Linux ABI, but we don't really support
32 bit ... maybe we'll support 128bit at one point? ;)

-- 
Thanks,

David / dhildenb
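
As a rough illustration of the fixed-size argument, the sketch below pins the size of the value at compile time and copies exactly that many bytes. It is a userspace model only: `topo_attr_t` and `put_topo_attr()` are hypothetical names, and `memcpy()` merely stands in for `copy_to_user()`.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical payload type: its size is the same on every ABI. */
typedef uint16_t topo_attr_t;

_Static_assert(sizeof(topo_attr_t) == 2,
	       "payload size must not depend on the compiler or ABI");

/* Stand-in for copy_to_user(): returns the number of bytes written. */
static size_t put_topo_attr(void *user_buf, topo_attr_t val)
{
	memcpy(user_buf, &val, sizeof(val));
	return sizeof(val);
}
```

The caller on the other side then reads back a `uint16_t`, so both ends agree on the width without relying on `sizeof(int)`.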


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v9 0/3] s390x: KVM: CPU Topology
  2022-05-06  9:24 [PATCH v9 0/3] s390x: KVM: CPU Topology Pierre Morel
                   ` (2 preceding siblings ...)
  2022-05-06  9:24 ` [PATCH v9 3/3] s390x: KVM: resetting the Topology-Change-Report Pierre Morel
@ 2022-05-18 15:26 ` Christian Borntraeger
  2022-05-18 16:41   ` Pierre Morel
  2022-05-19  5:46   ` Heiko Carstens
  3 siblings, 2 replies; 29+ messages in thread
From: Christian Borntraeger @ 2022-05-18 15:26 UTC (permalink / raw)
  To: Pierre Morel, kvm
  Cc: linux-s390, linux-kernel, frankja, cohuck, david, thuth,
	imbrenda, hca, gor, wintera, seiden, nrb

Pierre,

please use "KVM: s390x:" and not "s390x: KVM:" for future series.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v9 0/3] s390x: KVM: CPU Topology
  2022-05-18 15:26 ` [PATCH v9 0/3] s390x: KVM: CPU Topology Christian Borntraeger
@ 2022-05-18 16:41   ` Pierre Morel
  2022-05-19  5:46   ` Heiko Carstens
  1 sibling, 0 replies; 29+ messages in thread
From: Pierre Morel @ 2022-05-18 16:41 UTC (permalink / raw)
  To: Christian Borntraeger, kvm
  Cc: linux-s390, linux-kernel, frankja, cohuck, david, thuth,
	imbrenda, hca, gor, wintera, seiden, nrb


On 5/18/22 17:26, Christian Borntraeger wrote:
> Pierre,
> 
> please use "KVM: s390x:" and not "s390x: KVM:" for future series.


OK, thanks

-- 
Pierre Morel
IBM Lab Boeblingen

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v9 3/3] s390x: KVM: resetting the Topology-Change-Report
  2022-05-18 14:33           ` David Hildenbrand
@ 2022-05-18 16:55             ` Pierre Morel
  0 siblings, 0 replies; 29+ messages in thread
From: Pierre Morel @ 2022-05-18 16:55 UTC (permalink / raw)
  To: David Hildenbrand, Claudio Imbrenda
  Cc: kvm, linux-s390, linux-kernel, borntraeger, frankja, cohuck,
	thuth, hca, gor, wintera, seiden, nrb



On 5/18/22 16:33, David Hildenbrand wrote:
> On 16.05.22 16:21, Pierre Morel wrote:
>>
>>
>> On 5/12/22 12:01, David Hildenbrand wrote:
>>>>>
>>>>> I think we prefer something like u16 when copying to user space.
>>>>
>>>> but then userspace also has to expect a u16, right?
>>>
>>> Yep.
>>>
>>
>> Yes, but in fact, inspired by a previous discussion I had on the VFIO
>> interface, that is the reason why I preferred an int.
>> It is much simpler than a u16 and the definition of a bit.
>>
>> Although a bit in a u16 is what the s390 architecture proposes, I thought
>> we could make it easier on the KVM/QEMU interface.
>>
>> But if the discussion stops here, I will do as you both propose: change
>> to u16 in KVM and userland and add the documentation for the interface.
> 
> In general, we pass via the ABI fixed-sized values -- u8, u16, u32, u64
> ... instead of int. Simply because sizeof(int) is in theory variable
> (e.g., 32bit vs 64bit).
> 
> Take a look at arch/s390/include/uapi/asm/kvm.h and you won't find any
> usage of int or bool.
> 
> Having that said, I'll let the maintainers decide. Using e.g., u8 is
> just the natural thing to do on a Linux ABI, but we don't really support
> 32 bit ... maybe we'll support 128bit at one point? ;)
> 

OK, then I will use a u16 with a flag, in case we get something in the 
utility field which is related to the topology in the future.

Thanks,
Pierre

-- 
Pierre Morel
IBM Lab Boeblingen

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v9 0/3] s390x: KVM: CPU Topology
  2022-05-18 15:26 ` [PATCH v9 0/3] s390x: KVM: CPU Topology Christian Borntraeger
  2022-05-18 16:41   ` Pierre Morel
@ 2022-05-19  5:46   ` Heiko Carstens
  2022-05-19  8:07     ` Christian Borntraeger
  1 sibling, 1 reply; 29+ messages in thread
From: Heiko Carstens @ 2022-05-19  5:46 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: Pierre Morel, kvm, linux-s390, linux-kernel, frankja, cohuck,
	david, thuth, imbrenda, gor, wintera, seiden, nrb

On Wed, May 18, 2022 at 05:26:59PM +0200, Christian Borntraeger wrote:
> Pierre,
> 
> please use "KVM: s390x:" and not "s390x: KVM:" for future series.

My grep arts ;) tell me that you probably want "KVM: s390:" without
"x" for the kernel.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v9 0/3] s390x: KVM: CPU Topology
  2022-05-19  5:46   ` Heiko Carstens
@ 2022-05-19  8:07     ` Christian Borntraeger
  2022-05-19  9:02       ` Pierre Morel
  0 siblings, 1 reply; 29+ messages in thread
From: Christian Borntraeger @ 2022-05-19  8:07 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: Pierre Morel, kvm, linux-s390, linux-kernel, frankja, cohuck,
	david, thuth, imbrenda, gor, wintera, seiden, nrb

Am 19.05.22 um 07:46 schrieb Heiko Carstens:
> On Wed, May 18, 2022 at 05:26:59PM +0200, Christian Borntraeger wrote:
>> Pierre,
>>
>> please use "KVM: s390x:" and not "s390x: KVM:" for future series.
> 
> My grep arts ;) tell me that you probably want "KVM: s390:" without
> "x" for the kernel.

yes :-)

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v9 2/3] s390x: KVM: guest support for topology function
  2022-05-06  9:24 ` [PATCH v9 2/3] s390x: KVM: guest support for topology function Pierre Morel
  2022-05-12  9:24   ` David Hildenbrand
  2022-05-12 11:41   ` Janosch Frank
@ 2022-05-19  9:01   ` Christian Borntraeger
  2022-05-19  9:23     ` Pierre Morel
  2 siblings, 1 reply; 29+ messages in thread
From: Christian Borntraeger @ 2022-05-19  9:01 UTC (permalink / raw)
  To: Pierre Morel, kvm
  Cc: linux-s390, linux-kernel, frankja, cohuck, david, thuth,
	imbrenda, hca, gor, wintera, seiden, nrb



Am 06.05.22 um 11:24 schrieb Pierre Morel:
> We let the userland hypervisor know if the machine support the CPU
> topology facility using a new KVM capability: KVM_CAP_S390_CPU_TOPOLOGY.
> 
> The PTF instruction will report a topology change if there is any change
> with a previous STSI_15_1_2 SYSIB.
> Changes inside a STSI_15_1_2 SYSIB occur if CPU bits are set or clear
> inside the CPU Topology List Entry CPU mask field, which happens with
> changes in CPU polarization, dedication, CPU types and adding or
> removing CPUs in a socket.
> 
> The reporting to the guest is done using the Multiprocessor
> Topology-Change-Report (MTCR) bit of the utility entry of the guest's
> SCA which will be cleared during the interpretation of PTF.
> 
> To check if the topology has been modified we use a new field of the
> arch vCPU to save the previous real CPU ID at the end of a schedule
> and verify on next schedule that the CPU used is in the same socket.
> We do not report polarization, CPU Type or dedication change.

I think we should not do this. When PTF returns with "has changed" the guest
Linux will rebuild its schedule domains. And this is a really expensive
operation as far as I can tell. And the host Linux scheduler WILL schedule
too often to other CPUs. So in essence this will result in Linux guests
rebuilding their scheduler domains all the time.
So remove the "previous CPU logic" for now and only trigger an MTCR when
userspace says so.  (eg. on config changes). The idea was to have user
defined schedule domains. Following host schedule decisions will be
nearly impossible.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v9 0/3] s390x: KVM: CPU Topology
  2022-05-19  8:07     ` Christian Borntraeger
@ 2022-05-19  9:02       ` Pierre Morel
  0 siblings, 0 replies; 29+ messages in thread
From: Pierre Morel @ 2022-05-19  9:02 UTC (permalink / raw)
  To: Christian Borntraeger, Heiko Carstens
  Cc: kvm, linux-s390, linux-kernel, frankja, cohuck, david, thuth,
	imbrenda, gor, wintera, seiden, nrb



On 5/19/22 10:07, Christian Borntraeger wrote:
> Am 19.05.22 um 07:46 schrieb Heiko Carstens:
>> On Wed, May 18, 2022 at 05:26:59PM +0200, Christian Borntraeger wrote:
>>> Pierre,
>>>
>>> please use "KVM: s390x:" and not "s390x: KVM:" for future series.
>>
>> My grep arts ;) tell me that you probably want "KVM: s390:" without
>> "x" for the kernel.
> 
> yes :-)

Thanks, both of you.
I change it accordingly.

Regards,
Pierre

-- 
Pierre Morel
IBM Lab Boeblingen

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v9 2/3] s390x: KVM: guest support for topology function
  2022-05-19  9:01   ` Christian Borntraeger
@ 2022-05-19  9:23     ` Pierre Morel
  2022-05-19  9:36       ` Christian Borntraeger
  0 siblings, 1 reply; 29+ messages in thread
From: Pierre Morel @ 2022-05-19  9:23 UTC (permalink / raw)
  To: Christian Borntraeger, kvm
  Cc: linux-s390, linux-kernel, frankja, cohuck, david, thuth,
	imbrenda, hca, gor, wintera, seiden, nrb, Viktor Mihajlovski



On 5/19/22 11:01, Christian Borntraeger wrote:
> 
> 
> Am 06.05.22 um 11:24 schrieb Pierre Morel:
>> We let the userland hypervisor know if the machine support the CPU
>> topology facility using a new KVM capability: KVM_CAP_S390_CPU_TOPOLOGY.
>>
>> The PTF instruction will report a topology change if there is any change
>> with a previous STSI_15_1_2 SYSIB.
>> Changes inside a STSI_15_1_2 SYSIB occur if CPU bits are set or clear
>> inside the CPU Topology List Entry CPU mask field, which happens with
>> changes in CPU polarization, dedication, CPU types and adding or
>> removing CPUs in a socket.
>>
>> The reporting to the guest is done using the Multiprocessor
>> Topology-Change-Report (MTCR) bit of the utility entry of the guest's
>> SCA which will be cleared during the interpretation of PTF.
>>
>> To check if the topology has been modified we use a new field of the
>> arch vCPU to save the previous real CPU ID at the end of a schedule
>> and verify on next schedule that the CPU used is in the same socket.
>> We do not report polarization, CPU Type or dedication change.
> 
> I think we should not do this. When PTF returns with "has changed" the guest
> Linux will rebuild its schedule domains. And this is a really expensive
> operation as far as I can tell. And the host Linux scheduler WILL schedule
> too often to other CPUs. So in essence this will result in Linux guests
> rebuilding their scheduler domains all the time.
> So remove the "previous CPU logic" for now and only trigger an MTCR when
> userspace says so.  (eg. on config changes). The idea was to have user
> defined schedule domains. Following host schedule decisions will be
> nearly impossible.



I guess you saw that the MTCR bit is set only if the previous and new 
CPU are on different sockets, like it is on the hardware, not on every 
scheduling to another CPU.

However this can easily be done in an enhancement, if ever, since it has 
no implication on the UAPI.
I change this for the next round.

Thanks,
Pierre

-- 
Pierre Morel
IBM Lab Boeblingen
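
The "previous CPU" logic under discussion can be modelled in a few lines. This is a sketch under assumptions, not the patch code: `struct vcpu_model`, `model_socket_of()` and the fixed four CPUs per socket are hypothetical, and a `true` return stands in for setting the MTCR bit.

```c
#include <stdbool.h>

#define MODEL_CPUS_PER_SOCKET 4	/* hypothetical topology */

static int model_socket_of(int cpu)
{
	return cpu / MODEL_CPUS_PER_SOCKET;
}

struct vcpu_model {
	int prev_cpu;	/* real CPU at the end of the last schedule, -1 if none */
};

/*
 * Called when the vCPU is scheduled in on a real CPU. Returns true only
 * when the socket differs from the previous one, i.e. when a topology
 * change would be reported to the guest.
 */
static bool model_sched_in(struct vcpu_model *v, int cpu)
{
	bool changed = v->prev_cpu >= 0 &&
		       model_socket_of(v->prev_cpu) != model_socket_of(cpu);

	v->prev_cpu = cpu;
	return changed;
}
```

Moves within a socket stay silent, but every cross-socket migration triggers a report, which is the frequency concern raised in the review.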

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v9 2/3] s390x: KVM: guest support for topology function
  2022-05-19  9:23     ` Pierre Morel
@ 2022-05-19  9:36       ` Christian Borntraeger
  0 siblings, 0 replies; 29+ messages in thread
From: Christian Borntraeger @ 2022-05-19  9:36 UTC (permalink / raw)
  To: Pierre Morel, kvm
  Cc: linux-s390, linux-kernel, frankja, cohuck, david, thuth,
	imbrenda, hca, gor, wintera, seiden, nrb, Viktor Mihajlovski



Am 19.05.22 um 11:23 schrieb Pierre Morel:
> 
> 
> On 5/19/22 11:01, Christian Borntraeger wrote:
>>
>>
>> Am 06.05.22 um 11:24 schrieb Pierre Morel:
>>> We let the userland hypervisor know if the machine support the CPU
>>> topology facility using a new KVM capability: KVM_CAP_S390_CPU_TOPOLOGY.
>>>
>>> The PTF instruction will report a topology change if there is any change
>>> with a previous STSI_15_1_2 SYSIB.
>>> Changes inside a STSI_15_1_2 SYSIB occur if CPU bits are set or clear
>>> inside the CPU Topology List Entry CPU mask field, which happens with
>>> changes in CPU polarization, dedication, CPU types and adding or
>>> removing CPUs in a socket.
>>>
>>> The reporting to the guest is done using the Multiprocessor
>>> Topology-Change-Report (MTCR) bit of the utility entry of the guest's
>>> SCA which will be cleared during the interpretation of PTF.
>>>
>>> To check if the topology has been modified we use a new field of the
>>> arch vCPU to save the previous real CPU ID at the end of a schedule
>>> and verify on next schedule that the CPU used is in the same socket.
>>> We do not report polarization, CPU Type or dedication change.
>>
>> I think we should not do this. When PTF returns with "has changed" the guest
>> Linux will rebuild its schedule domains. And this is a really expensive
>> operation as far as I can tell. And the host Linux scheduler WILL schedule
>> too often to other CPUs. So in essence this will result in Linux guests
>> rebuilding their scheduler domains all the time.
>> So remove the "previous CPU logic" for now and only trigger an MTCR when
>> userspace says so.  (eg. on config changes). The idea was to have user
>> defined schedule domains. Following host schedule decisions will be
>> nearly impossible.
> 
> 
> 
> I guess you saw that the MTCR bit is set only if the previous and new CPU are on different sockets, like it is on the hardware, not on every scheduling to another CPU.

Yes, but even that happens too often as far as I can tell.
> 
> However this can easily be done in an enhancement, if ever, since it has no implication on the UAPI.
> I change this for the next round.

Yes, let's defer that (we would need solid measurements).


* Re: [PATCH v9 2/3] s390x: KVM: guest support for topology function
  2022-05-16 14:13     ` Pierre Morel
@ 2022-06-17 14:49       ` Pierre Morel
  0 siblings, 0 replies; 29+ messages in thread
From: Pierre Morel @ 2022-06-17 14:49 UTC (permalink / raw)
  To: David Hildenbrand, kvm
  Cc: linux-s390, linux-kernel, borntraeger, frankja, cohuck, thuth,
	imbrenda, hca, gor, wintera, seiden, nrb



On 5/16/22 16:13, Pierre Morel wrote:
> 
> 
> On 5/12/22 11:24, David Hildenbrand wrote:
>> On 06.05.22 11:24, Pierre Morel wrote:
>>> We let the userland hypervisor know if the machine support the CPU
>>> topology facility using a new KVM capability: KVM_CAP_S390_CPU_TOPOLOGY.
>>>
>>> The PTF instruction will report a topology change if there is any change
>>> with a previous STSI_15_1_2 SYSIB.
>>> Changes inside a STSI_15_1_2 SYSIB occur if CPU bits are set or clear
>>> inside the CPU Topology List Entry CPU mask field, which happens with
>>> changes in CPU polarization, dedication, CPU types and adding or
>>> removing CPUs in a socket.
>>>
>>> The reporting to the guest is done using the Multiprocessor
>>> Topology-Change-Report (MTCR) bit of the utility entry of the guest's
>>> SCA which will be cleared during the interpretation of PTF.
>>>
>>> To check if the topology has been modified we use a new field of the
>>> arch vCPU to save the previous real CPU ID at the end of a schedule
>>> and verify on next schedule that the CPU used is in the same socket.
>>> We do not report polarization, CPU Type or dedication change.
>>>
>>> STSI(15.1.x) gives information on the CPU configuration topology.
>>> Let's accept the interception of STSI with the function code 15 and
>>> let the userland part of the hypervisor handle it when userland
>>> support the CPU Topology facility.
>>>
>>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
>>
>> [...]
>>
>>
>>> diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
>>> index 0e8603acc105..d9e16b09c8bf 100644
>>> --- a/arch/s390/kvm/priv.c
>>> +++ b/arch/s390/kvm/priv.c
>>> @@ -874,10 +874,12 @@ static int handle_stsi(struct kvm_vcpu *vcpu)
>>>       if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
>>>           return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
>>> -    if (fc > 3) {
>>> -        kvm_s390_set_psw_cc(vcpu, 3);
>>> -        return 0;
>>> -    }
>>> +    if (fc > 3 && fc != 15)
>>> +        goto out_no_data;
>>> +
>>> +    /* fc 15 is provided with PTF/CPU topology support */
>>> +    if (fc == 15 && !test_kvm_facility(vcpu->kvm, 11))
>>> +        goto out_no_data;
>>
>>
>> Maybe shorter as
>>
>> if (fc == 15 && !test_kvm_facility(vcpu->kvm, 11))
>>     goto out_no_data;
>> else if (fc > 3)
>>     goto out_no_data;
>>
> 
> yes.

Hum, sorry, but no.

When test_kvm_facility(11) is true, !test_kvm_facility(11) is false, so
the first test fails; the second test (fc > 3) then succeeds and jumps
to out_no_data for fc == 15.

I can keep what I proposed, with a comment to make it more readable.
What about:

         /* Bail out on forbidden function codes */
         if (fc > 3 && fc != 15)
                 goto out_no_data;
         /* fc 15 is provided with PTF/CPU topology support */
         if (fc == 15 && !test_kvm_facility(vcpu->kvm, 11))
                 goto out_no_data;
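
To make the difference concrete, here is a minimal standalone sketch, not
kernel code. The helper names rejects_patch(), rejects_suggested() and
check_variants() are made up for illustration, and the boolean
has_facility_11 stands in for test_kvm_facility(vcpu->kvm, 11). Returning
true corresponds to taking the "goto out_no_data" path.

```c
#include <assert.h>
#include <stdbool.h>

/* Variant from the patch: fc 15 passes when facility 11 is available. */
bool rejects_patch(int fc, bool has_facility_11)
{
	/* Bail out on forbidden function codes */
	if (fc > 3 && fc != 15)
		return true;
	/* fc 15 is provided with PTF/CPU topology support */
	if (fc == 15 && !has_facility_11)
		return true;
	return false;
}

/* Variant suggested in review: the else branch also catches fc 15. */
bool rejects_suggested(int fc, bool has_facility_11)
{
	if (fc == 15 && !has_facility_11)
		return true;
	else if (fc > 3)
		return true;	/* reached for fc == 15 with facility 11 */
	return false;
}

/* The two variants agree everywhere except fc == 15 with facility 11. */
void check_variants(void)
{
	for (int fc = 0; fc <= 16; fc++) {
		/* Without facility 11 both variants behave the same. */
		assert(rejects_patch(fc, false) == rejects_suggested(fc, false));
		/* With facility 11 they only differ for fc == 15. */
		if (fc != 15)
			assert(rejects_patch(fc, true) == rejects_suggested(fc, true));
	}
	assert(!rejects_patch(15, true));
	assert(rejects_suggested(15, true));
}
```

Calling check_variants() exercises fc 0 to 16 for both facility states; the
only disagreement is fc == 15 with the facility present, which is exactly
the case the shorter form would wrongly reject.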


> 
>>
>> Apart from that, LGTM.
>>
> 
> Thanks,
> Pierre
> 

-- 
Pierre Morel
IBM Lab Boeblingen

Thread overview: 29+ messages
2022-05-06  9:24 [PATCH v9 0/3] s390x: KVM: CPU Topology Pierre Morel
2022-05-06  9:24 ` [PATCH v9 1/3] s390x: KVM: ipte lock for SCA access should be contained in KVM Pierre Morel
2022-05-12  9:08   ` David Hildenbrand
2022-05-16 16:30     ` Pierre Morel
2022-05-12 11:32   ` Janosch Frank
2022-05-16 14:13     ` Pierre Morel
2022-05-06  9:24 ` [PATCH v9 2/3] s390x: KVM: guest support for topology function Pierre Morel
2022-05-12  9:24   ` David Hildenbrand
2022-05-16 14:13     ` Pierre Morel
2022-06-17 14:49       ` Pierre Morel
2022-05-12 11:41   ` Janosch Frank
2022-05-16 10:41     ` Pierre Morel
2022-05-19  9:01   ` Christian Borntraeger
2022-05-19  9:23     ` Pierre Morel
2022-05-19  9:36       ` Christian Borntraeger
2022-05-06  9:24 ` [PATCH v9 3/3] s390x: KVM: resetting the Topology-Change-Report Pierre Morel
2022-05-12  9:31   ` David Hildenbrand
2022-05-12  9:52     ` Claudio Imbrenda
2022-05-12 10:01       ` David Hildenbrand
2022-05-16 14:21         ` Pierre Morel
2022-05-18 14:33           ` David Hildenbrand
2022-05-18 16:55             ` Pierre Morel
2022-05-16 10:36     ` Pierre Morel
2022-05-18 10:51     ` Pierre Morel
2022-05-18 15:26 ` [PATCH v9 0/3] s390x: KVM: CPU Topology Christian Borntraeger
2022-05-18 16:41   ` Pierre Morel
2022-05-19  5:46   ` Heiko Carstens
2022-05-19  8:07     ` Christian Borntraeger
2022-05-19  9:02       ` Pierre Morel
