* Re: [PATCH v3 2/3] s390x: KVM: Implementation of Multiprocessor Topology-Change-Report
2021-09-06 18:37 ` David Hildenbrand
@ 2021-09-07 10:24 ` Pierre Morel
2021-09-08 7:04 ` Christian Borntraeger
2021-09-07 12:28 ` Pierre Morel
2021-09-09 9:03 ` Pierre Morel
2 siblings, 1 reply; 25+ messages in thread
From: Pierre Morel @ 2021-09-07 10:24 UTC (permalink / raw)
To: David Hildenbrand, kvm
Cc: linux-s390, linux-kernel, borntraeger, frankja, cohuck, thuth,
imbrenda, hca, gor
On 9/6/21 8:37 PM, David Hildenbrand wrote:
> On 03.08.21 10:26, Pierre Morel wrote:
>> We let the userland hypervisor know if the machine support the CPU
>> topology facility using a new KVM capability: KVM_CAP_S390_CPU_TOPOLOGY.
>>
>> The PTF instruction will report a topology change if there is any change
>> with a previous STSI_15_2 SYSIB.
>> Changes inside a STSI_15_2 SYSIB occur if CPU bits are set or clear
>> inside the CPU Topology List Entry CPU mask field, which happens with
>> changes in CPU polarization, dedication, CPU types and adding or
>> removing CPUs in a socket.
>>
>> The reporting to the guest is done using the Multiprocessor
>> Topology-Change-Report (MTCR) bit of the utility entry of the guest's
>> SCA which will be cleared during the interpretation of PTF.
>>
>> To check if the topology has been modified we use a new field of the
>> arch vCPU to save the previous real CPU ID at the end of a schedule
>> and verify on next schedule that the CPU used is in the same socket.
>>
>> We deliberatly ignore:
>> - polarization: only horizontal polarization is currently used in linux.
>> - CPU Type: only IFL Type are supported in Linux
>> - Dedication: we consider that only a complete dedicated CPU stack can
>> take benefit of the CPU Topology.
>>
>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
>
>
>> @@ -228,7 +232,7 @@ struct kvm_s390_sie_block {
>> __u8 icptcode; /* 0x0050 */
>> __u8 icptstatus; /* 0x0051 */
>> __u16 ihcpu; /* 0x0052 */
>> - __u8 reserved54; /* 0x0054 */
>> + __u8 mtcr; /* 0x0054 */
>> #define IICTL_CODE_NONE 0x00
>> #define IICTL_CODE_MCHK 0x01
>> #define IICTL_CODE_EXT 0x02
>> @@ -246,6 +250,7 @@ struct kvm_s390_sie_block {
>> #define ECB_TE 0x10
>> #define ECB_SRSI 0x04
>> #define ECB_HOSTPROTINT 0x02
>> +#define ECB_PTF 0x01
>
> From below I understand, that ECB_PTF can be used with stfl(11) in the
> hypervisor.
>
> What is to happen if the hypervisor doesn't support stfl(11) and we
> consequently cannot use ECB_PTF? Will QEMU be able to emulate PTF fully?
Yes.
>
>
>> __u8 ecb; /* 0x0061 */
>> #define ECB2_CMMA 0x80
>> #define ECB2_IEP 0x20
>> @@ -747,6 +752,7 @@ struct kvm_vcpu_arch {
>> bool skey_enabled;
>> struct kvm_s390_pv_vcpu pv;
>> union diag318_info diag318_info;
>> + int prev_cpu;
>> };
>> struct kvm_vm_stat {
>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>> index b655a7d82bf0..ff6d8a2b511c 100644
>> --- a/arch/s390/kvm/kvm-s390.c
>> +++ b/arch/s390/kvm/kvm-s390.c
>> @@ -568,6 +568,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm,
>> long ext)
>> case KVM_CAP_S390_VCPU_RESETS:
>> case KVM_CAP_SET_GUEST_DEBUG:
>> case KVM_CAP_S390_DIAG318:
>> + case KVM_CAP_S390_CPU_TOPOLOGY:
>
> I would have expected instead
>
> r = test_facility(11);
> break
The idea is that QEMU will emulate both PTF and SYSIB_15 in this case.
>
> ...
>
>> r = 1;
>> break;
>> case KVM_CAP_SET_GUEST_DEBUG2:
>> @@ -819,6 +820,23 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>> struct kvm_enable_cap *cap)
>> icpt_operexc_on_all_vcpus(kvm);
>> r = 0;
>> break;
>> + case KVM_CAP_S390_CPU_TOPOLOGY:
>> + mutex_lock(&kvm->lock);
>> + if (kvm->created_vcpus) {
>> + r = -EBUSY;
>> + } else {
>
> ...
> } else if (test_facility(11)) {
> set_kvm_facility(kvm->arch.model.fac_mask, 11);
> set_kvm_facility(kvm->arch.model.fac_list, 11);
> r = 0;
> } else {
> r = -EINVAL;
> }
>
> similar to how we handle KVM_CAP_S390_VECTOR_REGISTERS.
>
> But I assume you want to be able to support hosts without ECB_PTF, correct?
Yes, this was the idea.
>
>
>> + set_kvm_facility(kvm->arch.model.fac_mask, 11);
>> + set_kvm_facility(kvm->arch.model.fac_list, 11);
>> + r = 0;
>> + }
>> + mutex_unlock(&kvm->lock);
>> + VM_EVENT(kvm, 3, "ENABLE: CPU TOPOLOGY %s",
>> + r ? "(not available)" : "(success)");
>> + break;
>> +
>> + r = -EINVAL;
>> + break;
>
> ^ dead code
>
:) Indeed, sorry.
> [...]
>
>> }
>> void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
>> {
>> + vcpu->arch.prev_cpu = vcpu->cpu;
>> vcpu->cpu = -1;
>> if (vcpu->arch.cputm_enabled && !is_vcpu_idle(vcpu))
>> __stop_cpu_timer_accounting(vcpu);
>> @@ -3198,6 +3239,11 @@ static int kvm_s390_vcpu_setup(struct kvm_vcpu
>> *vcpu)
>> vcpu->arch.sie_block->ecb |= ECB_HOSTPROTINT;
>> if (test_kvm_facility(vcpu->kvm, 9))
>> vcpu->arch.sie_block->ecb |= ECB_SRSI;
>> +
>> + /* PTF needs both host and guest facilities to enable
>> interpretation */
>> + if (test_kvm_facility(vcpu->kvm, 11) && test_facility(11))
>> + vcpu->arch.sie_block->ecb |= ECB_PTF;
>
> Here you say we need both ...
Yes, because interpretation needs both.
But if PTF is not interpreted, we will emulate it in QEMU.
>
>> +
>> if (test_kvm_facility(vcpu->kvm, 73))
>> vcpu->arch.sie_block->ecb |= ECB_TE;
>> diff --git a/arch/s390/kvm/vsie.c b/arch/s390/kvm/vsie.c
>> index 4002a24bc43a..50d67190bf65 100644
>> --- a/arch/s390/kvm/vsie.c
>> +++ b/arch/s390/kvm/vsie.c
>> @@ -503,6 +503,9 @@ static int shadow_scb(struct kvm_vcpu *vcpu,
>> struct vsie_page *vsie_page)
>> /* Host-protection-interruption introduced with ESOP */
>> if (test_kvm_cpu_feat(vcpu->kvm, KVM_S390_VM_CPU_FEAT_ESOP))
>> scb_s->ecb |= scb_o->ecb & ECB_HOSTPROTINT;
>> + /* CPU Topology */
>> + if (test_kvm_facility(vcpu->kvm, 11))
>> + scb_s->ecb |= scb_o->ecb & ECB_PTF;
>
> but here you don't check?
Arrrg, yes, this is wrong: we must check both here too.
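A minimal userspace sketch of the check with both conditions, assuming
hypothetical stand-ins (struct topo_model, shadow_ecb_ptf) for
test_facility(11) and test_kvm_facility(vcpu->kvm, 11). It only models the
bit logic and is not the kernel code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ECB_PTF 0x01	/* value taken from the patch */

/*
 * Hypothetical model, for illustration only:
 * host_fac11 stands in for test_facility(11),
 * guest_fac11 stands in for test_kvm_facility(vcpu->kvm, 11).
 */
struct topo_model {
	bool host_fac11;
	bool guest_fac11;
};

/* Return the ECB_PTF bit that may be propagated into the shadow SCB. */
static uint8_t shadow_ecb_ptf(const struct topo_model *m, uint8_t scb_o_ecb)
{
	assert(m);
	/* Interpretation needs both the host and the guest facility. */
	if (m->guest_fac11 && m->host_fac11)
		return scb_o_ecb & ECB_PTF;
	return 0;
}
```

With this shape, a guest that has facility 11 in its model but runs on a
host without it never gets ECB_PTF in the shadow control block.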
>
>> /* transactional execution */
>> if (test_kvm_facility(vcpu->kvm, 73) && wants_tx) {
>> /* remap the prefix is tx is toggled on */
>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>> index d9e4aabcb31a..081ce0cd44b9 100644
>> --- a/include/uapi/linux/kvm.h
>> +++ b/include/uapi/linux/kvm.h
>> @@ -1112,6 +1112,7 @@ struct kvm_ppc_resize_hpt {
>> #define KVM_CAP_BINARY_STATS_FD 203
>> #define KVM_CAP_EXIT_ON_EMULATION_FAILURE 204
>> #define KVM_CAP_ARM_MTE 205
>> +#define KVM_CAP_S390_CPU_TOPOLOGY 206
>
> We'll need a Documentation/virt/kvm/api.rst description.
>
> I'm not completely confident that the way we're handling the
> capability+facility is the right approach. It all feels a bit suboptimal.
>
> Except stfl(74) -- STHYI --, we never enable a facility via
> set_kvm_facility() that's not available in the host. And STHYI is
> special such that it is never implemented in hardware.
Then we can fall back to a KVM facility plus in-kernel emulation. For
PTF that would be quite simple, but for STSI_15 it would be much bigger.
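To illustrate why the PTF part would be simple, here is a rough userspace
sketch of the flow described in the commit message: remember the previous
host CPU at sched-out, flag a pending report at sched-in if the socket
changed, and let the PTF check read and clear it. All names
(vcpu_model, vm_model, socket_of, CPUS_PER_SOCKET) are hypothetical, the
CPU-to-socket mapping is an arbitrary assumption, and function code 2 for
"check topology change" is my reading of the architecture, not something
taken from this patch:

```c
#include <assert.h>
#include <stdbool.h>

#define CPUS_PER_SOCKET 8	/* assumption, for this sketch only */
#define PTF_CHECK 2		/* assumed function code for the check */

struct vcpu_model {
	int cpu;	/* host CPU in use, -1 while scheduled out */
	int prev_cpu;	/* host CPU used before the last sched-out */
};

struct vm_model {
	bool mtcr;	/* topology-change report pending */
};

/* Hypothetical mapping from host CPU ID to socket ID. */
static int socket_of(int cpu)
{
	return cpu / CPUS_PER_SOCKET;
}

/* Models the sched-out path: remember where we ran. */
static void model_vcpu_put(struct vcpu_model *v)
{
	v->prev_cpu = v->cpu;
	v->cpu = -1;
}

/* Models the sched-in path: flag a report if the socket changed. */
static void model_vcpu_load(struct vm_model *vm, struct vcpu_model *v, int cpu)
{
	assert(cpu >= 0);
	v->cpu = cpu;
	if (v->prev_cpu >= 0 && socket_of(v->prev_cpu) != socket_of(cpu))
		vm->mtcr = true;
}

/*
 * Models only the check function: returns condition code 1 and clears
 * the report if one was pending, 0 otherwise, -1 for unhandled codes.
 */
static int model_ptf(struct vm_model *vm, int fc)
{
	if (fc != PTF_CHECK)
		return -1;
	if (!vm->mtcr)
		return 0;
	vm->mtcr = false;
	return 1;
}
```

STSI_15 is the bigger part because it has to build the whole SYSIB, which
this sketch does not attempt.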
>
> I'll think about what might be cleaner once I get some more details
> about the interaction with stfl(11) in the hypervisor.
>
And I just saw that, for an unknown reason, I forgot two patches in the
QEMU series:
s390x: kvm: make topology change report pending
s390x: kvm: enable CPU Topology Function
So I will publish a new QEMU series this afternoon, including the comments
from Thomas.
thanks,
Pierre
--
Pierre Morel
IBM Lab Boeblingen
* Re: [PATCH v3 2/3] s390x: KVM: Implementation of Multiprocessor Topology-Change-Report
2021-09-07 10:24 ` Pierre Morel
@ 2021-09-08 7:04 ` Christian Borntraeger
2021-09-08 12:00 ` Pierre Morel
0 siblings, 1 reply; 25+ messages in thread
From: Christian Borntraeger @ 2021-09-08 7:04 UTC (permalink / raw)
To: Pierre Morel, David Hildenbrand, kvm
Cc: linux-s390, linux-kernel, frankja, cohuck, thuth, imbrenda, hca, gor
On 07.09.21 12:24, Pierre Morel wrote:
>
>
> On 9/6/21 8:37 PM, David Hildenbrand wrote:
>> On 03.08.21 10:26, Pierre Morel wrote:
>>> We let the userland hypervisor know if the machine support the CPU
>>> topology facility using a new KVM capability: KVM_CAP_S390_CPU_TOPOLOGY.
>>>
>>> The PTF instruction will report a topology change if there is any change
>>> with a previous STSI_15_2 SYSIB.
>>> Changes inside a STSI_15_2 SYSIB occur if CPU bits are set or clear
>>> inside the CPU Topology List Entry CPU mask field, which happens with
>>> changes in CPU polarization, dedication, CPU types and adding or
>>> removing CPUs in a socket.
>>>
>>> The reporting to the guest is done using the Multiprocessor
>>> Topology-Change-Report (MTCR) bit of the utility entry of the guest's
>>> SCA which will be cleared during the interpretation of PTF.
>>>
>>> To check if the topology has been modified we use a new field of the
>>> arch vCPU to save the previous real CPU ID at the end of a schedule
>>> and verify on next schedule that the CPU used is in the same socket.
>>>
>>> We deliberatly ignore:
>>> - polarization: only horizontal polarization is currently used in linux.
>>> - CPU Type: only IFL Type are supported in Linux
>>> - Dedication: we consider that only a complete dedicated CPU stack can
>>> take benefit of the CPU Topology.
>>>
>>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
>>
>>
>>> @@ -228,7 +232,7 @@ struct kvm_s390_sie_block {
>>> __u8 icptcode; /* 0x0050 */
>>> __u8 icptstatus; /* 0x0051 */
>>> __u16 ihcpu; /* 0x0052 */
>>> - __u8 reserved54; /* 0x0054 */
>>> + __u8 mtcr; /* 0x0054 */
>>> #define IICTL_CODE_NONE 0x00
>>> #define IICTL_CODE_MCHK 0x01
>>> #define IICTL_CODE_EXT 0x02
>>> @@ -246,6 +250,7 @@ struct kvm_s390_sie_block {
>>> #define ECB_TE 0x10
>>> #define ECB_SRSI 0x04
>>> #define ECB_HOSTPROTINT 0x02
>>> +#define ECB_PTF 0x01
>>
>> From below I understand, that ECB_PTF can be used with stfl(11) in the hypervisor.
>>
>> What is to happen if the hypervisor doesn't support stfl(11) and we consequently cannot use ECB_PTF? Will QEMU be able to emulate PTF fully?
>
> Yes.
Do we want that? I do not think so. Other OSes (like z/OS) do use PTF in their low-level interrupt handler, so PTF must be really fast.
I think I would prefer that in that case the guest will simply not see stfle(11).
So the user can still specify the topology but the guest will have no interface to query it.
* Re: [PATCH v3 2/3] s390x: KVM: Implementation of Multiprocessor Topology-Change-Report
2021-09-08 7:04 ` Christian Borntraeger
@ 2021-09-08 12:00 ` Pierre Morel
2021-09-08 12:01 ` Christian Borntraeger
0 siblings, 1 reply; 25+ messages in thread
From: Pierre Morel @ 2021-09-08 12:00 UTC (permalink / raw)
To: Christian Borntraeger, David Hildenbrand, kvm
Cc: linux-s390, linux-kernel, frankja, cohuck, thuth, imbrenda, hca, gor
On 9/8/21 9:04 AM, Christian Borntraeger wrote:
>
>
> On 07.09.21 12:24, Pierre Morel wrote:
>>
>>
>> On 9/6/21 8:37 PM, David Hildenbrand wrote:
>>> On 03.08.21 10:26, Pierre Morel wrote:
>>>> We let the userland hypervisor know if the machine support the CPU
>>>> topology facility using a new KVM capability:
>>>> KVM_CAP_S390_CPU_TOPOLOGY.
>>>>
>>>> The PTF instruction will report a topology change if there is any
>>>> change
>>>> with a previous STSI_15_2 SYSIB.
>>>> Changes inside a STSI_15_2 SYSIB occur if CPU bits are set or clear
>>>> inside the CPU Topology List Entry CPU mask field, which happens with
>>>> changes in CPU polarization, dedication, CPU types and adding or
>>>> removing CPUs in a socket.
>>>>
>>>> The reporting to the guest is done using the Multiprocessor
>>>> Topology-Change-Report (MTCR) bit of the utility entry of the guest's
>>>> SCA which will be cleared during the interpretation of PTF.
>>>>
>>>> To check if the topology has been modified we use a new field of the
>>>> arch vCPU to save the previous real CPU ID at the end of a schedule
>>>> and verify on next schedule that the CPU used is in the same socket.
>>>>
>>>> We deliberatly ignore:
>>>> - polarization: only horizontal polarization is currently used in
>>>> linux.
>>>> - CPU Type: only IFL Type are supported in Linux
>>>> - Dedication: we consider that only a complete dedicated CPU stack can
>>>> take benefit of the CPU Topology.
>>>>
>>>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
>>>
>>>
>>>> @@ -228,7 +232,7 @@ struct kvm_s390_sie_block {
>>>> __u8 icptcode; /* 0x0050 */
>>>> __u8 icptstatus; /* 0x0051 */
>>>> __u16 ihcpu; /* 0x0052 */
>>>> - __u8 reserved54; /* 0x0054 */
>>>> + __u8 mtcr; /* 0x0054 */
>>>> #define IICTL_CODE_NONE 0x00
>>>> #define IICTL_CODE_MCHK 0x01
>>>> #define IICTL_CODE_EXT 0x02
>>>> @@ -246,6 +250,7 @@ struct kvm_s390_sie_block {
>>>> #define ECB_TE 0x10
>>>> #define ECB_SRSI 0x04
>>>> #define ECB_HOSTPROTINT 0x02
>>>> +#define ECB_PTF 0x01
>>>
>>> From below I understand, that ECB_PTF can be used with stfl(11) in
>>> the hypervisor.
>>>
>>> What is to happen if the hypervisor doesn't support stfl(11) and we
>>> consequently cannot use ECB_PTF? Will QEMU be able to emulate PTF fully?
>>
>> Yes.
>
> Do we want that? I do not think so. Other OSes (like zOS) do use PTF in
> there low level interrupt handler, so PTF must be really fast.
> I think I would prefer that in that case the guest will simply not see
> stfle(11).
> So the user can still specify the topology but the guest will have no
> interface to query it.
I do not understand.
If the host supports stfle(11), we interpret PTF.
The proposal was to emulate only when it is not supported.
What you propose is to not advertise stfl(11) if the host does not
support it, and consequently to never emulate. Is that right?
In this case, as STSI_15 is linked to stfl(11) too, the guest will not
be aware of the topology.
OK for me.
--
Pierre Morel
IBM Lab Boeblingen
* Re: [PATCH v3 2/3] s390x: KVM: Implementation of Multiprocessor Topology-Change-Report
2021-09-08 12:00 ` Pierre Morel
@ 2021-09-08 12:01 ` Christian Borntraeger
2021-09-08 12:52 ` Pierre Morel
0 siblings, 1 reply; 25+ messages in thread
From: Christian Borntraeger @ 2021-09-08 12:01 UTC (permalink / raw)
To: Pierre Morel, David Hildenbrand, kvm
Cc: linux-s390, linux-kernel, frankja, cohuck, thuth, imbrenda, hca, gor
On 08.09.21 14:00, Pierre Morel wrote:
>
>
> On 9/8/21 9:04 AM, Christian Borntraeger wrote:
>>
>>
>> On 07.09.21 12:24, Pierre Morel wrote:
>>>
>>>
>>> On 9/6/21 8:37 PM, David Hildenbrand wrote:
>>>> On 03.08.21 10:26, Pierre Morel wrote:
>>>>> We let the userland hypervisor know if the machine support the CPU
>>>>> topology facility using a new KVM capability: KVM_CAP_S390_CPU_TOPOLOGY.
>>>>>
>>>>> The PTF instruction will report a topology change if there is any change
>>>>> with a previous STSI_15_2 SYSIB.
>>>>> Changes inside a STSI_15_2 SYSIB occur if CPU bits are set or clear
>>>>> inside the CPU Topology List Entry CPU mask field, which happens with
>>>>> changes in CPU polarization, dedication, CPU types and adding or
>>>>> removing CPUs in a socket.
>>>>>
>>>>> The reporting to the guest is done using the Multiprocessor
>>>>> Topology-Change-Report (MTCR) bit of the utility entry of the guest's
>>>>> SCA which will be cleared during the interpretation of PTF.
>>>>>
>>>>> To check if the topology has been modified we use a new field of the
>>>>> arch vCPU to save the previous real CPU ID at the end of a schedule
>>>>> and verify on next schedule that the CPU used is in the same socket.
>>>>>
>>>>> We deliberatly ignore:
>>>>> - polarization: only horizontal polarization is currently used in linux.
>>>>> - CPU Type: only IFL Type are supported in Linux
>>>>> - Dedication: we consider that only a complete dedicated CPU stack can
>>>>> take benefit of the CPU Topology.
>>>>>
>>>>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
>>>>
>>>>
>>>>> @@ -228,7 +232,7 @@ struct kvm_s390_sie_block {
>>>>> __u8 icptcode; /* 0x0050 */
>>>>> __u8 icptstatus; /* 0x0051 */
>>>>> __u16 ihcpu; /* 0x0052 */
>>>>> - __u8 reserved54; /* 0x0054 */
>>>>> + __u8 mtcr; /* 0x0054 */
>>>>> #define IICTL_CODE_NONE 0x00
>>>>> #define IICTL_CODE_MCHK 0x01
>>>>> #define IICTL_CODE_EXT 0x02
>>>>> @@ -246,6 +250,7 @@ struct kvm_s390_sie_block {
>>>>> #define ECB_TE 0x10
>>>>> #define ECB_SRSI 0x04
>>>>> #define ECB_HOSTPROTINT 0x02
>>>>> +#define ECB_PTF 0x01
>>>>
>>>> From below I understand, that ECB_PTF can be used with stfl(11) in the hypervisor.
>>>>
>>>> What is to happen if the hypervisor doesn't support stfl(11) and we consequently cannot use ECB_PTF? Will QEMU be able to emulate PTF fully?
>>>
>>> Yes.
>>
>> Do we want that? I do not think so. Other OSes (like zOS) do use PTF in there low level interrupt handler, so PTF must be really fast.
>> I think I would prefer that in that case the guest will simply not see stfle(11).
>> So the user can still specify the topology but the guest will have no interface to query it.
>
> I do not understand.
> If the host support stfle(11) we interpret PTF.
>
> The proposition was to emulate only in the case it is not supported, what you propose is to not advertise stfl(11) if the host does not support it, and consequently to never emulate is it right?
Yes, exactly. My idea is to provide it to guests if we can do it fast, but not to provide it if it would cause a performance issue.
>
> In this case, as STSI_15 is linked to stfl(11) too, the guest will not be aware of the topology.
>
> OK for me.
>
* Re: [PATCH v3 2/3] s390x: KVM: Implementation of Multiprocessor Topology-Change-Report
2021-09-08 12:01 ` Christian Borntraeger
@ 2021-09-08 12:52 ` Pierre Morel
0 siblings, 0 replies; 25+ messages in thread
From: Pierre Morel @ 2021-09-08 12:52 UTC (permalink / raw)
To: Christian Borntraeger, David Hildenbrand, kvm
Cc: linux-s390, linux-kernel, frankja, cohuck, thuth, imbrenda, hca, gor
On 9/8/21 2:01 PM, Christian Borntraeger wrote:
>
>
> On 08.09.21 14:00, Pierre Morel wrote:
>>
>>
>> On 9/8/21 9:04 AM, Christian Borntraeger wrote:
>>>
>>>
>>> On 07.09.21 12:24, Pierre Morel wrote:
>>>>
>>>>
>>>> On 9/6/21 8:37 PM, David Hildenbrand wrote:
>>>>> On 03.08.21 10:26, Pierre Morel wrote:
>>>>>> We let the userland hypervisor know if the machine support the CPU
>>>>>> topology facility using a new KVM capability:
>>>>>> KVM_CAP_S390_CPU_TOPOLOGY.
>>>>>>
>>>>>> The PTF instruction will report a topology change if there is any
>>>>>> change
>>>>>> with a previous STSI_15_2 SYSIB.
>>>>>> Changes inside a STSI_15_2 SYSIB occur if CPU bits are set or clear
>>>>>> inside the CPU Topology List Entry CPU mask field, which happens with
>>>>>> changes in CPU polarization, dedication, CPU types and adding or
>>>>>> removing CPUs in a socket.
>>>>>>
>>>>>> The reporting to the guest is done using the Multiprocessor
>>>>>> Topology-Change-Report (MTCR) bit of the utility entry of the guest's
>>>>>> SCA which will be cleared during the interpretation of PTF.
>>>>>>
>>>>>> To check if the topology has been modified we use a new field of the
>>>>>> arch vCPU to save the previous real CPU ID at the end of a schedule
>>>>>> and verify on next schedule that the CPU used is in the same socket.
>>>>>>
>>>>>> We deliberatly ignore:
>>>>>> - polarization: only horizontal polarization is currently used in
>>>>>> linux.
>>>>>> - CPU Type: only IFL Type are supported in Linux
>>>>>> - Dedication: we consider that only a complete dedicated CPU stack
>>>>>> can
>>>>>> take benefit of the CPU Topology.
>>>>>>
>>>>>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
>>>>>
>>>>>
>>>>>> @@ -228,7 +232,7 @@ struct kvm_s390_sie_block {
>>>>>> __u8 icptcode; /* 0x0050 */
>>>>>> __u8 icptstatus; /* 0x0051 */
>>>>>> __u16 ihcpu; /* 0x0052 */
>>>>>> - __u8 reserved54; /* 0x0054 */
>>>>>> + __u8 mtcr; /* 0x0054 */
>>>>>> #define IICTL_CODE_NONE 0x00
>>>>>> #define IICTL_CODE_MCHK 0x01
>>>>>> #define IICTL_CODE_EXT 0x02
>>>>>> @@ -246,6 +250,7 @@ struct kvm_s390_sie_block {
>>>>>> #define ECB_TE 0x10
>>>>>> #define ECB_SRSI 0x04
>>>>>> #define ECB_HOSTPROTINT 0x02
>>>>>> +#define ECB_PTF 0x01
>>>>>
>>>>> From below I understand, that ECB_PTF can be used with stfl(11) in
>>>>> the hypervisor.
>>>>>
>>>>> What is to happen if the hypervisor doesn't support stfl(11) and we
>>>>> consequently cannot use ECB_PTF? Will QEMU be able to emulate PTF
>>>>> fully?
>>>>
>>>> Yes.
>>>
>>> Do we want that? I do not think so. Other OSes (like zOS) do use PTF
>>> in there low level interrupt handler, so PTF must be really fast.
>>> I think I would prefer that in that case the guest will simply not
>>> see stfle(11).
>>> So the user can still specify the topology but the guest will have no
>>> interface to query it.
>>
>> I do not understand.
>> If the host support stfle(11) we interpret PTF.
>>
>> The proposition was to emulate only in the case it is not supported,
>> what you propose is to not advertise stfl(11) if the host does not
>> support it, and consequently to never emulate is it right?
>
> Yes, exactly. My idea is to provide it to guests if we can do it fast,
> but do not provide it if it would add a performance issue.
OK, understood. I will update this and the QEMU part too, as we no longer
need emulation there.
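Roughly, the capability handling would then follow the shape David
suggested. A small userspace sketch of that behaviour, assuming a
hypothetical struct kvm_model whose fields stand in for test_facility(11),
kvm->created_vcpus and the facility bit in fac_mask/fac_list (not the
actual kernel code):

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the state the capability code looks at. */
struct kvm_model {
	bool host_fac11;	/* stands in for test_facility(11) */
	int created_vcpus;	/* stands in for kvm->created_vcpus */
	bool guest_fac11;	/* facility 11 set in fac_mask/fac_list */
};

/* Check-extension side: advertise only what the host can interpret. */
static int model_check_extension(const struct kvm_model *kvm)
{
	assert(kvm);
	return kvm->host_fac11 ? 1 : 0;
}

/* Enable-cap side: refuse late enabling and hosts without facility 11. */
static int model_enable_cap(struct kvm_model *kvm)
{
	assert(kvm);
	if (kvm->created_vcpus)
		return -EBUSY;
	if (!kvm->host_fac11)
		return -EINVAL;
	kvm->guest_fac11 = true;
	return 0;
}
```

On a host without facility 11 the guest then never sees stfle(11), so
neither PTF nor STSI_15 needs an emulation path.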
Thanks,
Pierre
--
Pierre Morel
IBM Lab Boeblingen
* Re: [PATCH v3 2/3] s390x: KVM: Implementation of Multiprocessor Topology-Change-Report
2021-09-06 18:37 ` David Hildenbrand
2021-09-07 10:24 ` Pierre Morel
@ 2021-09-07 12:28 ` Pierre Morel
2021-09-08 7:07 ` Christian Borntraeger
2021-09-09 9:03 ` Pierre Morel
2 siblings, 1 reply; 25+ messages in thread
From: Pierre Morel @ 2021-09-07 12:28 UTC (permalink / raw)
To: David Hildenbrand, kvm
Cc: linux-s390, linux-kernel, borntraeger, frankja, cohuck, thuth,
imbrenda, hca, gor
On 9/6/21 8:37 PM, David Hildenbrand wrote:
> On 03.08.21 10:26, Pierre Morel wrote:
>> We let the userland hypervisor know if the machine support the CPU
>> topology facility using a new KVM capability: KVM_CAP_S390_CPU_TOPOLOGY.
>>
>> The PTF instruction will report a topology change if there is any change
>> with a previous STSI_15_2 SYSIB.
>> Changes inside a STSI_15_2 SYSIB occur if CPU bits are set or clear
>> inside the CPU Topology List Entry CPU mask field, which happens with
>> changes in CPU polarization, dedication, CPU types and adding or
>> removing CPUs in a socket.
>>
>> The reporting to the guest is done using the Multiprocessor
>> Topology-Change-Report (MTCR) bit of the utility entry of the guest's
>> SCA which will be cleared during the interpretation of PTF.
>>
>> To check if the topology has been modified we use a new field of the
>> arch vCPU to save the previous real CPU ID at the end of a schedule
>> and verify on next schedule that the CPU used is in the same socket.
>>
>> We deliberatly ignore:
>> - polarization: only horizontal polarization is currently used in linux.
>> - CPU Type: only IFL Type are supported in Linux
>> - Dedication: we consider that only a complete dedicated CPU stack can
>> take benefit of the CPU Topology.
>>
>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
>
>
>> @@ -228,7 +232,7 @@ struct kvm_s390_sie_block {
>> __u8 icptcode; /* 0x0050 */
>> __u8 icptstatus; /* 0x0051 */
>> __u16 ihcpu; /* 0x0052 */
>> - __u8 reserved54; /* 0x0054 */
>> + __u8 mtcr; /* 0x0054 */
>> #define IICTL_CODE_NONE 0x00
>> #define IICTL_CODE_MCHK 0x01
>> #define IICTL_CODE_EXT 0x02
>> @@ -246,6 +250,7 @@ struct kvm_s390_sie_block {
>> #define ECB_TE 0x10
>> #define ECB_SRSI 0x04
>> #define ECB_HOSTPROTINT 0x02
>> +#define ECB_PTF 0x01
>
> From below I understand, that ECB_PTF can be used with stfl(11) in the
> hypervisor.
>
> What is to happen if the hypervisor doesn't support stfl(11) and we
> consequently cannot use ECB_PTF? Will QEMU be able to emulate PTF fully?
>
>
>> __u8 ecb; /* 0x0061 */
>> #define ECB2_CMMA 0x80
>> #define ECB2_IEP 0x20
>> @@ -747,6 +752,7 @@ struct kvm_vcpu_arch {
>> bool skey_enabled;
>> struct kvm_s390_pv_vcpu pv;
>> union diag318_info diag318_info;
>> + int prev_cpu;
>> };
>> struct kvm_vm_stat {
>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>> index b655a7d82bf0..ff6d8a2b511c 100644
>> --- a/arch/s390/kvm/kvm-s390.c
>> +++ b/arch/s390/kvm/kvm-s390.c
>> @@ -568,6 +568,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm,
>> long ext)
>> case KVM_CAP_S390_VCPU_RESETS:
>> case KVM_CAP_SET_GUEST_DEBUG:
>> case KVM_CAP_S390_DIAG318:
>> + case KVM_CAP_S390_CPU_TOPOLOGY:
>
> I would have expected instead
>
> r = test_facility(11);
> break
>
> ...
>
>> r = 1;
>> break;
>> case KVM_CAP_SET_GUEST_DEBUG2:
>> @@ -819,6 +820,23 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>> struct kvm_enable_cap *cap)
>> icpt_operexc_on_all_vcpus(kvm);
>> r = 0;
>> break;
>> + case KVM_CAP_S390_CPU_TOPOLOGY:
>> + mutex_lock(&kvm->lock);
>> + if (kvm->created_vcpus) {
>> + r = -EBUSY;
>> + } else {
>
> ...
> } else if (test_facility(11)) {
> set_kvm_facility(kvm->arch.model.fac_mask, 11);
> set_kvm_facility(kvm->arch.model.fac_list, 11);
> r = 0;
> } else {
> r = -EINVAL;
> }
>
> similar to how we handle KVM_CAP_S390_VECTOR_REGISTERS.
>
> But I assume you want to be able to support hosts without ECB_PTF, correct?
>
>
>> + set_kvm_facility(kvm->arch.model.fac_mask, 11);
>> + set_kvm_facility(kvm->arch.model.fac_list, 11);
>> + r = 0;
>> + }
>> + mutex_unlock(&kvm->lock);
>> + VM_EVENT(kvm, 3, "ENABLE: CPU TOPOLOGY %s",
>> + r ? "(not available)" : "(success)");
>> + break;
>> +
>> + r = -EINVAL;
>> + break;
>
> ^ dead code
>
> [...]
>
>> }
>> void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
>> {
>> + vcpu->arch.prev_cpu = vcpu->cpu;
>> vcpu->cpu = -1;
>> if (vcpu->arch.cputm_enabled && !is_vcpu_idle(vcpu))
>> __stop_cpu_timer_accounting(vcpu);
>> @@ -3198,6 +3239,11 @@ static int kvm_s390_vcpu_setup(struct kvm_vcpu
>> *vcpu)
>> vcpu->arch.sie_block->ecb |= ECB_HOSTPROTINT;
>> if (test_kvm_facility(vcpu->kvm, 9))
>> vcpu->arch.sie_block->ecb |= ECB_SRSI;
>> +
>> + /* PTF needs both host and guest facilities to enable
>> interpretation */
>> + if (test_kvm_facility(vcpu->kvm, 11) && test_facility(11))
>> + vcpu->arch.sie_block->ecb |= ECB_PTF;
>
> Here you say we need both ...
>
>> +
>> if (test_kvm_facility(vcpu->kvm, 73))
>> vcpu->arch.sie_block->ecb |= ECB_TE;
>> diff --git a/arch/s390/kvm/vsie.c b/arch/s390/kvm/vsie.c
>> index 4002a24bc43a..50d67190bf65 100644
>> --- a/arch/s390/kvm/vsie.c
>> +++ b/arch/s390/kvm/vsie.c
>> @@ -503,6 +503,9 @@ static int shadow_scb(struct kvm_vcpu *vcpu,
>> struct vsie_page *vsie_page)
>> /* Host-protection-interruption introduced with ESOP */
>> if (test_kvm_cpu_feat(vcpu->kvm, KVM_S390_VM_CPU_FEAT_ESOP))
>> scb_s->ecb |= scb_o->ecb & ECB_HOSTPROTINT;
>> + /* CPU Topology */
>> + if (test_kvm_facility(vcpu->kvm, 11))
>> + scb_s->ecb |= scb_o->ecb & ECB_PTF;
>
> but here you don't check?
>
>> /* transactional execution */
>> if (test_kvm_facility(vcpu->kvm, 73) && wants_tx) {
>> /* remap the prefix is tx is toggled on */
>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>> index d9e4aabcb31a..081ce0cd44b9 100644
>> --- a/include/uapi/linux/kvm.h
>> +++ b/include/uapi/linux/kvm.h
>> @@ -1112,6 +1112,7 @@ struct kvm_ppc_resize_hpt {
>> #define KVM_CAP_BINARY_STATS_FD 203
>> #define KVM_CAP_EXIT_ON_EMULATION_FAILURE 204
>> #define KVM_CAP_ARM_MTE 205
>> +#define KVM_CAP_S390_CPU_TOPOLOGY 206
>
> We'll need a Documentation/virt/kvm/api.rst description.
>
> I'm not completely confident that the way we're handling the
> capability+facility is the right approach. It all feels a bit suboptimal.
>
> Except stfl(74) -- STHYI --, we never enable a facility via
> set_kvm_facility() that's not available in the host. And STHYI is
> special such that it is never implemented in hardware.
>
> I'll think about what might be cleaner once I get some more details
> about the interaction with stfl(11) in the hypervisor.
>
OK, maybe we do not need to handle the case where stfl(11) is not present
in the host; these are pre-GA10...
--
Pierre Morel
IBM Lab Boeblingen
* Re: [PATCH v3 2/3] s390x: KVM: Implementation of Multiprocessor Topology-Change-Report
2021-09-07 12:28 ` Pierre Morel
@ 2021-09-08 7:07 ` Christian Borntraeger
2021-09-08 13:09 ` Pierre Morel
0 siblings, 1 reply; 25+ messages in thread
From: Christian Borntraeger @ 2021-09-08 7:07 UTC (permalink / raw)
To: Pierre Morel, David Hildenbrand, kvm
Cc: linux-s390, linux-kernel, frankja, cohuck, thuth, imbrenda, hca, gor
On 07.09.21 14:28, Pierre Morel wrote:
>
>
> On 9/6/21 8:37 PM, David Hildenbrand wrote:
>> On 03.08.21 10:26, Pierre Morel wrote:
>>> We let the userland hypervisor know if the machine support the CPU
>>> topology facility using a new KVM capability: KVM_CAP_S390_CPU_TOPOLOGY.
>>>
>>> The PTF instruction will report a topology change if there is any change
>>> with a previous STSI_15_2 SYSIB.
>>> Changes inside a STSI_15_2 SYSIB occur if CPU bits are set or clear
>>> inside the CPU Topology List Entry CPU mask field, which happens with
>>> changes in CPU polarization, dedication, CPU types and adding or
>>> removing CPUs in a socket.
>>>
>>> The reporting to the guest is done using the Multiprocessor
>>> Topology-Change-Report (MTCR) bit of the utility entry of the guest's
>>> SCA which will be cleared during the interpretation of PTF.
>>>
>>> To check if the topology has been modified we use a new field of the
>>> arch vCPU to save the previous real CPU ID at the end of a schedule
>>> and verify on next schedule that the CPU used is in the same socket.
>>>
>>> We deliberatly ignore:
>>> - polarization: only horizontal polarization is currently used in linux.
>>> - CPU Type: only IFL Type are supported in Linux
>>> - Dedication: we consider that only a complete dedicated CPU stack can
>>> take benefit of the CPU Topology.
>>>
>>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
>>
>>
>>> @@ -228,7 +232,7 @@ struct kvm_s390_sie_block {
>>> __u8 icptcode; /* 0x0050 */
>>> __u8 icptstatus; /* 0x0051 */
>>> __u16 ihcpu; /* 0x0052 */
>>> - __u8 reserved54; /* 0x0054 */
>>> + __u8 mtcr; /* 0x0054 */
>>> #define IICTL_CODE_NONE 0x00
>>> #define IICTL_CODE_MCHK 0x01
>>> #define IICTL_CODE_EXT 0x02
>>> @@ -246,6 +250,7 @@ struct kvm_s390_sie_block {
>>> #define ECB_TE 0x10
>>> #define ECB_SRSI 0x04
>>> #define ECB_HOSTPROTINT 0x02
>>> +#define ECB_PTF 0x01
>>
>> From below I understand, that ECB_PTF can be used with stfl(11) in the hypervisor.
>>
>> What is to happen if the hypervisor doesn't support stfl(11) and we consequently cannot use ECB_PTF? Will QEMU be able to emulate PTF fully?
>>
>>
>>> __u8 ecb; /* 0x0061 */
>>> #define ECB2_CMMA 0x80
>>> #define ECB2_IEP 0x20
>>> @@ -747,6 +752,7 @@ struct kvm_vcpu_arch {
>>> bool skey_enabled;
>>> struct kvm_s390_pv_vcpu pv;
>>> union diag318_info diag318_info;
>>> + int prev_cpu;
>>> };
>>> struct kvm_vm_stat {
>>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>>> index b655a7d82bf0..ff6d8a2b511c 100644
>>> --- a/arch/s390/kvm/kvm-s390.c
>>> +++ b/arch/s390/kvm/kvm-s390.c
>>> @@ -568,6 +568,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>>> case KVM_CAP_S390_VCPU_RESETS:
>>> case KVM_CAP_SET_GUEST_DEBUG:
>>> case KVM_CAP_S390_DIAG318:
>>> + case KVM_CAP_S390_CPU_TOPOLOGY:
>>
>> I would have expected instead
>>
>> r = test_facility(11);
>> break
>>
>> ...
>>
>>> r = 1;
>>> break;
>>> case KVM_CAP_SET_GUEST_DEBUG2:
>>> @@ -819,6 +820,23 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap)
>>> icpt_operexc_on_all_vcpus(kvm);
>>> r = 0;
>>> break;
>>> + case KVM_CAP_S390_CPU_TOPOLOGY:
>>> + mutex_lock(&kvm->lock);
>>> + if (kvm->created_vcpus) {
>>> + r = -EBUSY;
>>> + } else {
>>
>> ...
>> } else if (test_facility(11)) {
>> set_kvm_facility(kvm->arch.model.fac_mask, 11);
>> set_kvm_facility(kvm->arch.model.fac_list, 11);
>> r = 0;
>> } else {
>> r = -EINVAL;
>> }
>>
>> similar to how we handle KVM_CAP_S390_VECTOR_REGISTERS.
>>
>> But I assume you want to be able to support hosts without ECB_PTF, correct?
>>
>>
>>> + set_kvm_facility(kvm->arch.model.fac_mask, 11);
>>> + set_kvm_facility(kvm->arch.model.fac_list, 11);
>>> + r = 0;
>>> + }
>>> + mutex_unlock(&kvm->lock);
>>> + VM_EVENT(kvm, 3, "ENABLE: CPU TOPOLOGY %s",
>>> + r ? "(not available)" : "(success)");
>>> + break;
>>> +
>>> + r = -EINVAL;
>>> + break;
>>
>> ^ dead code
>>
>> [...]
>>
>>> }
>>> void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
>>> {
>>> + vcpu->arch.prev_cpu = vcpu->cpu;
>>> vcpu->cpu = -1;
>>> if (vcpu->arch.cputm_enabled && !is_vcpu_idle(vcpu))
>>> __stop_cpu_timer_accounting(vcpu);
>>> @@ -3198,6 +3239,11 @@ static int kvm_s390_vcpu_setup(struct kvm_vcpu *vcpu)
>>> vcpu->arch.sie_block->ecb |= ECB_HOSTPROTINT;
>>> if (test_kvm_facility(vcpu->kvm, 9))
>>> vcpu->arch.sie_block->ecb |= ECB_SRSI;
>>> +
>>> + /* PTF needs both host and guest facilities to enable interpretation */
>>> + if (test_kvm_facility(vcpu->kvm, 11) && test_facility(11))
>>> + vcpu->arch.sie_block->ecb |= ECB_PTF;
>>
>> Here you say we need both ...
>>
>>> +
>>> if (test_kvm_facility(vcpu->kvm, 73))
>>> vcpu->arch.sie_block->ecb |= ECB_TE;
>>> diff --git a/arch/s390/kvm/vsie.c b/arch/s390/kvm/vsie.c
>>> index 4002a24bc43a..50d67190bf65 100644
>>> --- a/arch/s390/kvm/vsie.c
>>> +++ b/arch/s390/kvm/vsie.c
>>> @@ -503,6 +503,9 @@ static int shadow_scb(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
>>> /* Host-protection-interruption introduced with ESOP */
>>> if (test_kvm_cpu_feat(vcpu->kvm, KVM_S390_VM_CPU_FEAT_ESOP))
>>> scb_s->ecb |= scb_o->ecb & ECB_HOSTPROTINT;
>>> + /* CPU Topology */
>>> + if (test_kvm_facility(vcpu->kvm, 11))
>>> + scb_s->ecb |= scb_o->ecb & ECB_PTF;
>>
>> but here you don't check?
>>
>>> /* transactional execution */
>>> if (test_kvm_facility(vcpu->kvm, 73) && wants_tx) {
>>> /* remap the prefix is tx is toggled on */
>>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>>> index d9e4aabcb31a..081ce0cd44b9 100644
>>> --- a/include/uapi/linux/kvm.h
>>> +++ b/include/uapi/linux/kvm.h
>>> @@ -1112,6 +1112,7 @@ struct kvm_ppc_resize_hpt {
>>> #define KVM_CAP_BINARY_STATS_FD 203
>>> #define KVM_CAP_EXIT_ON_EMULATION_FAILURE 204
>>> #define KVM_CAP_ARM_MTE 205
>>> +#define KVM_CAP_S390_CPU_TOPOLOGY 206
>>
>> We'll need a Documentation/virt/kvm/api.rst description.
>>
>> I'm not completely confident that the way we're handling the capability+facility is the right approach. It all feels a bit suboptimal.
>>
>> Except stfl(74) -- STHYI --, we never enable a facility via set_kvm_facility() that's not available in the host. And STHYI is special such that it is never implemented in hardware.
>>
>> I'll think about what might be cleaner once I get some more details about the interaction with stfl(11) in the hypervisor.
>>
>
> OK, may be we do not need to handle the case stfl(11) is not present in the host, these are pre GA10...
What about VSIE? For all existing KVM guests, stfl11 is off.
* Re: [PATCH v3 2/3] s390x: KVM: Implementation of Multiprocessor Topology-Change-Report
2021-09-08 7:07 ` Christian Borntraeger
@ 2021-09-08 13:09 ` Pierre Morel
2021-09-08 13:16 ` Christian Borntraeger
0 siblings, 1 reply; 25+ messages in thread
From: Pierre Morel @ 2021-09-08 13:09 UTC (permalink / raw)
To: Christian Borntraeger, David Hildenbrand, kvm
Cc: linux-s390, linux-kernel, frankja, cohuck, thuth, imbrenda, hca, gor
On 9/8/21 9:07 AM, Christian Borntraeger wrote:
>
>
> On 07.09.21 14:28, Pierre Morel wrote:
>>
>>
>> On 9/6/21 8:37 PM, David Hildenbrand wrote:
>>> On 03.08.21 10:26, Pierre Morel wrote:
>>>> We let the userland hypervisor know if the machine support the CPU
>>>> topology facility using a new KVM capability:
>>>> KVM_CAP_S390_CPU_TOPOLOGY.
>>>>
>>>> The PTF instruction will report a topology change if there is any
>>>> change
>>>> with a previous STSI_15_2 SYSIB.
>>>> Changes inside a STSI_15_2 SYSIB occur if CPU bits are set or clear
>>>> inside the CPU Topology List Entry CPU mask field, which happens with
>>>> changes in CPU polarization, dedication, CPU types and adding or
>>>> removing CPUs in a socket.
>>>>
>>>> The reporting to the guest is done using the Multiprocessor
>>>> Topology-Change-Report (MTCR) bit of the utility entry of the guest's
>>>> SCA which will be cleared during the interpretation of PTF.
>>>>
>>>> To check if the topology has been modified we use a new field of the
>>>> arch vCPU to save the previous real CPU ID at the end of a schedule
>>>> and verify on next schedule that the CPU used is in the same socket.
>>>>
>>>> We deliberatly ignore:
>>>> - polarization: only horizontal polarization is currently used in
>>>> linux.
>>>> - CPU Type: only IFL Type are supported in Linux
>>>> - Dedication: we consider that only a complete dedicated CPU stack can
>>>> take benefit of the CPU Topology.
>>>>
>>>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
>>>
>>>
>>>> @@ -228,7 +232,7 @@ struct kvm_s390_sie_block {
>>>> __u8 icptcode; /* 0x0050 */
>>>> __u8 icptstatus; /* 0x0051 */
>>>> __u16 ihcpu; /* 0x0052 */
>>>> - __u8 reserved54; /* 0x0054 */
>>>> + __u8 mtcr; /* 0x0054 */
>>>> #define IICTL_CODE_NONE 0x00
>>>> #define IICTL_CODE_MCHK 0x01
>>>> #define IICTL_CODE_EXT 0x02
>>>> @@ -246,6 +250,7 @@ struct kvm_s390_sie_block {
>>>> #define ECB_TE 0x10
>>>> #define ECB_SRSI 0x04
>>>> #define ECB_HOSTPROTINT 0x02
>>>> +#define ECB_PTF 0x01
>>>
>>> From below I understand, that ECB_PTF can be used with stfl(11) in
>>> the hypervisor.
>>>
>>> What is to happen if the hypervisor doesn't support stfl(11) and we
>>> consequently cannot use ECB_PTF? Will QEMU be able to emulate PTF fully?
>>>
>>>
>>>> __u8 ecb; /* 0x0061 */
>>>> #define ECB2_CMMA 0x80
>>>> #define ECB2_IEP 0x20
>>>> @@ -747,6 +752,7 @@ struct kvm_vcpu_arch {
>>>> bool skey_enabled;
>>>> struct kvm_s390_pv_vcpu pv;
>>>> union diag318_info diag318_info;
>>>> + int prev_cpu;
>>>> };
>>>> struct kvm_vm_stat {
>>>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>>>> index b655a7d82bf0..ff6d8a2b511c 100644
>>>> --- a/arch/s390/kvm/kvm-s390.c
>>>> +++ b/arch/s390/kvm/kvm-s390.c
>>>> @@ -568,6 +568,7 @@ int kvm_vm_ioctl_check_extension(struct kvm
>>>> *kvm, long ext)
>>>> case KVM_CAP_S390_VCPU_RESETS:
>>>> case KVM_CAP_SET_GUEST_DEBUG:
>>>> case KVM_CAP_S390_DIAG318:
>>>> + case KVM_CAP_S390_CPU_TOPOLOGY:
>>>
>>> I would have expected instead
>>>
>>> r = test_facility(11);
>>> break
>>>
>>> ...
>>>
>>>> r = 1;
>>>> break;
>>>> case KVM_CAP_SET_GUEST_DEBUG2:
>>>> @@ -819,6 +820,23 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>>>> struct kvm_enable_cap *cap)
>>>> icpt_operexc_on_all_vcpus(kvm);
>>>> r = 0;
>>>> break;
>>>> + case KVM_CAP_S390_CPU_TOPOLOGY:
>>>> + mutex_lock(&kvm->lock);
>>>> + if (kvm->created_vcpus) {
>>>> + r = -EBUSY;
>>>> + } else {
>>>
>>> ...
>>> } else if (test_facility(11)) {
>>> set_kvm_facility(kvm->arch.model.fac_mask, 11);
>>> set_kvm_facility(kvm->arch.model.fac_list, 11);
>>> r = 0;
>>> } else {
>>> r = -EINVAL;
>>> }
>>>
>>> similar to how we handle KVM_CAP_S390_VECTOR_REGISTERS.
>>>
>>> But I assume you want to be able to support hosts without ECB_PTF,
>>> correct?
>>>
>>>
>>>> + set_kvm_facility(kvm->arch.model.fac_mask, 11);
>>>> + set_kvm_facility(kvm->arch.model.fac_list, 11);
>>>> + r = 0;
>>>> + }
>>>> + mutex_unlock(&kvm->lock);
>>>> + VM_EVENT(kvm, 3, "ENABLE: CPU TOPOLOGY %s",
>>>> + r ? "(not available)" : "(success)");
>>>> + break;
>>>> +
>>>> + r = -EINVAL;
>>>> + break;
>>>
>>> ^ dead code
>>>
>>> [...]
>>>
>>>> }
>>>> void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
>>>> {
>>>> + vcpu->arch.prev_cpu = vcpu->cpu;
>>>> vcpu->cpu = -1;
>>>> if (vcpu->arch.cputm_enabled && !is_vcpu_idle(vcpu))
>>>> __stop_cpu_timer_accounting(vcpu);
>>>> @@ -3198,6 +3239,11 @@ static int kvm_s390_vcpu_setup(struct
>>>> kvm_vcpu *vcpu)
>>>> vcpu->arch.sie_block->ecb |= ECB_HOSTPROTINT;
>>>> if (test_kvm_facility(vcpu->kvm, 9))
>>>> vcpu->arch.sie_block->ecb |= ECB_SRSI;
>>>> +
>>>> + /* PTF needs both host and guest facilities to enable
>>>> interpretation */
>>>> + if (test_kvm_facility(vcpu->kvm, 11) && test_facility(11))
>>>> + vcpu->arch.sie_block->ecb |= ECB_PTF;
>>>
>>> Here you say we need both ...
>>>
>>>> +
>>>> if (test_kvm_facility(vcpu->kvm, 73))
>>>> vcpu->arch.sie_block->ecb |= ECB_TE;
>>>> diff --git a/arch/s390/kvm/vsie.c b/arch/s390/kvm/vsie.c
>>>> index 4002a24bc43a..50d67190bf65 100644
>>>> --- a/arch/s390/kvm/vsie.c
>>>> +++ b/arch/s390/kvm/vsie.c
>>>> @@ -503,6 +503,9 @@ static int shadow_scb(struct kvm_vcpu *vcpu,
>>>> struct vsie_page *vsie_page)
>>>> /* Host-protection-interruption introduced with ESOP */
>>>> if (test_kvm_cpu_feat(vcpu->kvm, KVM_S390_VM_CPU_FEAT_ESOP))
>>>> scb_s->ecb |= scb_o->ecb & ECB_HOSTPROTINT;
>>>> + /* CPU Topology */
>>>> + if (test_kvm_facility(vcpu->kvm, 11))
>>>> + scb_s->ecb |= scb_o->ecb & ECB_PTF;
>>>
>>> but here you don't check?
>>>
>>>> /* transactional execution */
>>>> if (test_kvm_facility(vcpu->kvm, 73) && wants_tx) {
>>>> /* remap the prefix is tx is toggled on */
>>>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>>>> index d9e4aabcb31a..081ce0cd44b9 100644
>>>> --- a/include/uapi/linux/kvm.h
>>>> +++ b/include/uapi/linux/kvm.h
>>>> @@ -1112,6 +1112,7 @@ struct kvm_ppc_resize_hpt {
>>>> #define KVM_CAP_BINARY_STATS_FD 203
>>>> #define KVM_CAP_EXIT_ON_EMULATION_FAILURE 204
>>>> #define KVM_CAP_ARM_MTE 205
>>>> +#define KVM_CAP_S390_CPU_TOPOLOGY 206
>>>
>>> We'll need a Documentation/virt/kvm/api.rst description.
>>>
>>> I'm not completely confident that the way we're handling the
>>> capability+facility is the right approach. It all feels a bit
>>> suboptimal.
>>>
>>> Except stfl(74) -- STHYI --, we never enable a facility via
>>> set_kvm_facility() that's not available in the host. And STHYI is
>>> special such that it is never implemented in hardware.
>>>
>>> I'll think about what might be cleaner once I get some more details
>>> about the interaction with stfl(11) in the hypervisor.
>>>
>>
>> OK, may be we do not need to handle the case stfl(11) is not present
>> in the host, these are pre GA10...
>
> What about VSIE? For all existing KVM guests, stfl11 is off.
In VSIE the patch activates stfl(11) only if the host has stfl(11).

I do not see any problem with activating interpretation in VSIE with
ECB_PTF (ECB.7) when the host has stfl(11) and QEMU asks to enable it
for the guest using the capability, as is done in this patch.

If any intermediate hypervisor decides not to advertise stfl(11) to the
guest, like an old QEMU without the capability or a QEMU with
ctop=off, KVM will not set ECB_PTF and the PTF instruction will trigger
a program check as before.

Is that OK, or did I miss something?
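
To make the nesting argument concrete, here is a minimal userspace sketch of
how stfl(11) visibility could propagate through a stack of hypervisors. It is
only a model under stated assumptions: struct hyp_level and
guest_sees_facility_11() are hypothetical names, not kernel or QEMU code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one hypervisor level in a nested stack. */
struct hyp_level {
	bool has_capability;  /* this level knows KVM_CAP_S390_CPU_TOPOLOGY */
	bool ctop_enabled;    /* userspace asked for it (i.e. not ctop=off) */
};

/*
 * Walk the stack from the machine down to the innermost guest and return
 * whether that guest sees stfl(11). A level can only pass the facility on
 * if it sees the facility itself.
 */
bool guest_sees_facility_11(bool machine_has_fac11,
			    const struct hyp_level *levels, size_t n)
{
	bool visible = machine_has_fac11;
	size_t i;

	assert(levels != NULL || n == 0);
	for (i = 0; i < n; i++)
		visible = visible && levels[i].has_capability &&
			  levels[i].ctop_enabled;
	return visible;
}
```

Whenever this returns false for a guest, ECB_PTF stays clear for it and PTF
ends in a program check, as described above.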
--
Pierre Morel
IBM Lab Boeblingen
* Re: [PATCH v3 2/3] s390x: KVM: Implementation of Multiprocessor Topology-Change-Report
2021-09-08 13:09 ` Pierre Morel
@ 2021-09-08 13:16 ` Christian Borntraeger
2021-09-08 14:17 ` Pierre Morel
0 siblings, 1 reply; 25+ messages in thread
From: Christian Borntraeger @ 2021-09-08 13:16 UTC (permalink / raw)
To: Pierre Morel, David Hildenbrand, kvm
Cc: linux-s390, linux-kernel, frankja, cohuck, thuth, imbrenda, hca, gor
On 08.09.21 15:09, Pierre Morel wrote:
>
>
> On 9/8/21 9:07 AM, Christian Borntraeger wrote:
>>
>>
>> On 07.09.21 14:28, Pierre Morel wrote:
>>>
>>>
>>> On 9/6/21 8:37 PM, David Hildenbrand wrote:
>>>> On 03.08.21 10:26, Pierre Morel wrote:
>>>>> We let the userland hypervisor know if the machine support the CPU
>>>>> topology facility using a new KVM capability: KVM_CAP_S390_CPU_TOPOLOGY.
>>>>>
>>>>> The PTF instruction will report a topology change if there is any change
>>>>> with a previous STSI_15_2 SYSIB.
>>>>> Changes inside a STSI_15_2 SYSIB occur if CPU bits are set or clear
>>>>> inside the CPU Topology List Entry CPU mask field, which happens with
>>>>> changes in CPU polarization, dedication, CPU types and adding or
>>>>> removing CPUs in a socket.
>>>>>
>>>>> The reporting to the guest is done using the Multiprocessor
>>>>> Topology-Change-Report (MTCR) bit of the utility entry of the guest's
>>>>> SCA which will be cleared during the interpretation of PTF.
>>>>>
>>>>> To check if the topology has been modified we use a new field of the
>>>>> arch vCPU to save the previous real CPU ID at the end of a schedule
>>>>> and verify on next schedule that the CPU used is in the same socket.
>>>>>
>>>>> We deliberatly ignore:
>>>>> - polarization: only horizontal polarization is currently used in linux.
>>>>> - CPU Type: only IFL Type are supported in Linux
>>>>> - Dedication: we consider that only a complete dedicated CPU stack can
>>>>> take benefit of the CPU Topology.
>>>>>
>>>>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
>>>>
>>>>
>>>>> @@ -228,7 +232,7 @@ struct kvm_s390_sie_block {
>>>>> __u8 icptcode; /* 0x0050 */
>>>>> __u8 icptstatus; /* 0x0051 */
>>>>> __u16 ihcpu; /* 0x0052 */
>>>>> - __u8 reserved54; /* 0x0054 */
>>>>> + __u8 mtcr; /* 0x0054 */
>>>>> #define IICTL_CODE_NONE 0x00
>>>>> #define IICTL_CODE_MCHK 0x01
>>>>> #define IICTL_CODE_EXT 0x02
>>>>> @@ -246,6 +250,7 @@ struct kvm_s390_sie_block {
>>>>> #define ECB_TE 0x10
>>>>> #define ECB_SRSI 0x04
>>>>> #define ECB_HOSTPROTINT 0x02
>>>>> +#define ECB_PTF 0x01
>>>>
>>>> From below I understand, that ECB_PTF can be used with stfl(11) in the hypervisor.
>>>>
>>>> What is to happen if the hypervisor doesn't support stfl(11) and we consequently cannot use ECB_PTF? Will QEMU be able to emulate PTF fully?
>>>>
>>>>
>>>>> __u8 ecb; /* 0x0061 */
>>>>> #define ECB2_CMMA 0x80
>>>>> #define ECB2_IEP 0x20
>>>>> @@ -747,6 +752,7 @@ struct kvm_vcpu_arch {
>>>>> bool skey_enabled;
>>>>> struct kvm_s390_pv_vcpu pv;
>>>>> union diag318_info diag318_info;
>>>>> + int prev_cpu;
>>>>> };
>>>>> struct kvm_vm_stat {
>>>>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>>>>> index b655a7d82bf0..ff6d8a2b511c 100644
>>>>> --- a/arch/s390/kvm/kvm-s390.c
>>>>> +++ b/arch/s390/kvm/kvm-s390.c
>>>>> @@ -568,6 +568,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>>>>> case KVM_CAP_S390_VCPU_RESETS:
>>>>> case KVM_CAP_SET_GUEST_DEBUG:
>>>>> case KVM_CAP_S390_DIAG318:
>>>>> + case KVM_CAP_S390_CPU_TOPOLOGY:
>>>>
>>>> I would have expected instead
>>>>
>>>> r = test_facility(11);
>>>> break
>>>>
>>>> ...
>>>>
>>>>> r = 1;
>>>>> break;
>>>>> case KVM_CAP_SET_GUEST_DEBUG2:
>>>>> @@ -819,6 +820,23 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap)
>>>>> icpt_operexc_on_all_vcpus(kvm);
>>>>> r = 0;
>>>>> break;
>>>>> + case KVM_CAP_S390_CPU_TOPOLOGY:
>>>>> + mutex_lock(&kvm->lock);
>>>>> + if (kvm->created_vcpus) {
>>>>> + r = -EBUSY;
>>>>> + } else {
>>>>
>>>> ...
>>>> } else if (test_facility(11)) {
>>>> set_kvm_facility(kvm->arch.model.fac_mask, 11);
>>>> set_kvm_facility(kvm->arch.model.fac_list, 11);
>>>> r = 0;
>>>> } else {
>>>> r = -EINVAL;
>>>> }
>>>>
>>>> similar to how we handle KVM_CAP_S390_VECTOR_REGISTERS.
>>>>
>>>> But I assume you want to be able to support hosts without ECB_PTF, correct?
>>>>
>>>>
>>>>> + set_kvm_facility(kvm->arch.model.fac_mask, 11);
>>>>> + set_kvm_facility(kvm->arch.model.fac_list, 11);
>>>>> + r = 0;
>>>>> + }
>>>>> + mutex_unlock(&kvm->lock);
>>>>> + VM_EVENT(kvm, 3, "ENABLE: CPU TOPOLOGY %s",
>>>>> + r ? "(not available)" : "(success)");
>>>>> + break;
>>>>> +
>>>>> + r = -EINVAL;
>>>>> + break;
>>>>
>>>> ^ dead code
>>>>
>>>> [...]
>>>>
>>>>> }
>>>>> void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
>>>>> {
>>>>> + vcpu->arch.prev_cpu = vcpu->cpu;
>>>>> vcpu->cpu = -1;
>>>>> if (vcpu->arch.cputm_enabled && !is_vcpu_idle(vcpu))
>>>>> __stop_cpu_timer_accounting(vcpu);
>>>>> @@ -3198,6 +3239,11 @@ static int kvm_s390_vcpu_setup(struct kvm_vcpu *vcpu)
>>>>> vcpu->arch.sie_block->ecb |= ECB_HOSTPROTINT;
>>>>> if (test_kvm_facility(vcpu->kvm, 9))
>>>>> vcpu->arch.sie_block->ecb |= ECB_SRSI;
>>>>> +
>>>>> + /* PTF needs both host and guest facilities to enable interpretation */
>>>>> + if (test_kvm_facility(vcpu->kvm, 11) && test_facility(11))
>>>>> + vcpu->arch.sie_block->ecb |= ECB_PTF;
>>>>
>>>> Here you say we need both ...
>>>>
>>>>> +
>>>>> if (test_kvm_facility(vcpu->kvm, 73))
>>>>> vcpu->arch.sie_block->ecb |= ECB_TE;
>>>>> diff --git a/arch/s390/kvm/vsie.c b/arch/s390/kvm/vsie.c
>>>>> index 4002a24bc43a..50d67190bf65 100644
>>>>> --- a/arch/s390/kvm/vsie.c
>>>>> +++ b/arch/s390/kvm/vsie.c
>>>>> @@ -503,6 +503,9 @@ static int shadow_scb(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
>>>>> /* Host-protection-interruption introduced with ESOP */
>>>>> if (test_kvm_cpu_feat(vcpu->kvm, KVM_S390_VM_CPU_FEAT_ESOP))
>>>>> scb_s->ecb |= scb_o->ecb & ECB_HOSTPROTINT;
>>>>> + /* CPU Topology */
>>>>> + if (test_kvm_facility(vcpu->kvm, 11))
>>>>> + scb_s->ecb |= scb_o->ecb & ECB_PTF;
>>>>
>>>> but here you don't check?
>>>>
>>>>> /* transactional execution */
>>>>> if (test_kvm_facility(vcpu->kvm, 73) && wants_tx) {
>>>>> /* remap the prefix is tx is toggled on */
>>>>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>>>>> index d9e4aabcb31a..081ce0cd44b9 100644
>>>>> --- a/include/uapi/linux/kvm.h
>>>>> +++ b/include/uapi/linux/kvm.h
>>>>> @@ -1112,6 +1112,7 @@ struct kvm_ppc_resize_hpt {
>>>>> #define KVM_CAP_BINARY_STATS_FD 203
>>>>> #define KVM_CAP_EXIT_ON_EMULATION_FAILURE 204
>>>>> #define KVM_CAP_ARM_MTE 205
>>>>> +#define KVM_CAP_S390_CPU_TOPOLOGY 206
>>>>
>>>> We'll need a Documentation/virt/kvm/api.rst description.
>>>>
>>>> I'm not completely confident that the way we're handling the capability+facility is the right approach. It all feels a bit suboptimal.
>>>>
>>>> Except stfl(74) -- STHYI --, we never enable a facility via set_kvm_facility() that's not available in the host. And STHYI is special such that it is never implemented in hardware.
>>>>
>>>> I'll think about what might be cleaner once I get some more details about the interaction with stfl(11) in the hypervisor.
>>>>
>>>
>>> OK, may be we do not need to handle the case stfl(11) is not present in the host, these are pre GA10...
>>
>> What about VSIE? For all existing KVM guests, stfl11 is off.
>
> In VSIE the patch activates stfl(11) only if the host has stfl(11).
>
> I do not see any problem to activate the interpretation in VSIE with ECB_PTF (ECB.7) when the host has stfl(11) and QEMU asks to enable it for the guest using the CAPABILITY as it is done in this patch.
>
> if any intermediary hypervizor decide to not advertize stfl(11) for the guest like an old QEMU not having the CAPABILITY, or a QEMU with ctop=off, KVM will not set ECB_PTF and the PTF instruction will trigger a program check as before.
>
> Is it OK or did I missed something?
Yes, sure.
My point was regarding the pre z10 statement. We will see hosts without stfl(e)11 when running nested on z14, z15 and co.
* Re: [PATCH v3 2/3] s390x: KVM: Implementation of Multiprocessor Topology-Change-Report
2021-09-08 13:16 ` Christian Borntraeger
@ 2021-09-08 14:17 ` Pierre Morel
0 siblings, 0 replies; 25+ messages in thread
From: Pierre Morel @ 2021-09-08 14:17 UTC (permalink / raw)
To: Christian Borntraeger, David Hildenbrand, kvm
Cc: linux-s390, linux-kernel, frankja, cohuck, thuth, imbrenda, hca, gor
On 9/8/21 3:16 PM, Christian Borntraeger wrote:
>
>
> On 08.09.21 15:09, Pierre Morel wrote:
>>
>>
>> On 9/8/21 9:07 AM, Christian Borntraeger wrote:
>>>
>>>
>>> On 07.09.21 14:28, Pierre Morel wrote:
>>>>
>>>>
>>>> On 9/6/21 8:37 PM, David Hildenbrand wrote:
>>>>> On 03.08.21 10:26, Pierre Morel wrote:
>>>>>> We let the userland hypervisor know if the machine support the CPU
>>>>>> topology facility using a new KVM capability:
>>>>>> KVM_CAP_S390_CPU_TOPOLOGY.
>>>>>>
>>>>>> The PTF instruction will report a topology change if there is any
>>>>>> change
>>>>>> with a previous STSI_15_2 SYSIB.
>>>>>> Changes inside a STSI_15_2 SYSIB occur if CPU bits are set or clear
>>>>>> inside the CPU Topology List Entry CPU mask field, which happens with
>>>>>> changes in CPU polarization, dedication, CPU types and adding or
>>>>>> removing CPUs in a socket.
>>>>>>
>>>>>> The reporting to the guest is done using the Multiprocessor
>>>>>> Topology-Change-Report (MTCR) bit of the utility entry of the guest's
>>>>>> SCA which will be cleared during the interpretation of PTF.
>>>>>>
>>>>>> To check if the topology has been modified we use a new field of the
>>>>>> arch vCPU to save the previous real CPU ID at the end of a schedule
>>>>>> and verify on next schedule that the CPU used is in the same socket.
>>>>>>
>>>>>> We deliberatly ignore:
>>>>>> - polarization: only horizontal polarization is currently used in
>>>>>> linux.
>>>>>> - CPU Type: only IFL Type are supported in Linux
>>>>>> - Dedication: we consider that only a complete dedicated CPU stack
>>>>>> can
>>>>>> take benefit of the CPU Topology.
>>>>>>
>>>>>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
>>>>>
>>>>>
>>>>>> @@ -228,7 +232,7 @@ struct kvm_s390_sie_block {
>>>>>> __u8 icptcode; /* 0x0050 */
>>>>>> __u8 icptstatus; /* 0x0051 */
>>>>>> __u16 ihcpu; /* 0x0052 */
>>>>>> - __u8 reserved54; /* 0x0054 */
>>>>>> + __u8 mtcr; /* 0x0054 */
>>>>>> #define IICTL_CODE_NONE 0x00
>>>>>> #define IICTL_CODE_MCHK 0x01
>>>>>> #define IICTL_CODE_EXT 0x02
>>>>>> @@ -246,6 +250,7 @@ struct kvm_s390_sie_block {
>>>>>> #define ECB_TE 0x10
>>>>>> #define ECB_SRSI 0x04
>>>>>> #define ECB_HOSTPROTINT 0x02
>>>>>> +#define ECB_PTF 0x01
>>>>>
>>>>> From below I understand, that ECB_PTF can be used with stfl(11) in
>>>>> the hypervisor.
>>>>>
>>>>> What is to happen if the hypervisor doesn't support stfl(11) and we
>>>>> consequently cannot use ECB_PTF? Will QEMU be able to emulate PTF
>>>>> fully?
>>>>>
>>>>>
>>>>>> __u8 ecb; /* 0x0061 */
>>>>>> #define ECB2_CMMA 0x80
>>>>>> #define ECB2_IEP 0x20
>>>>>> @@ -747,6 +752,7 @@ struct kvm_vcpu_arch {
>>>>>> bool skey_enabled;
>>>>>> struct kvm_s390_pv_vcpu pv;
>>>>>> union diag318_info diag318_info;
>>>>>> + int prev_cpu;
>>>>>> };
>>>>>> struct kvm_vm_stat {
>>>>>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>>>>>> index b655a7d82bf0..ff6d8a2b511c 100644
>>>>>> --- a/arch/s390/kvm/kvm-s390.c
>>>>>> +++ b/arch/s390/kvm/kvm-s390.c
>>>>>> @@ -568,6 +568,7 @@ int kvm_vm_ioctl_check_extension(struct kvm
>>>>>> *kvm, long ext)
>>>>>> case KVM_CAP_S390_VCPU_RESETS:
>>>>>> case KVM_CAP_SET_GUEST_DEBUG:
>>>>>> case KVM_CAP_S390_DIAG318:
>>>>>> + case KVM_CAP_S390_CPU_TOPOLOGY:
>>>>>
>>>>> I would have expected instead
>>>>>
>>>>> r = test_facility(11);
>>>>> break
>>>>>
>>>>> ...
>>>>>
>>>>>> r = 1;
>>>>>> break;
>>>>>> case KVM_CAP_SET_GUEST_DEBUG2:
>>>>>> @@ -819,6 +820,23 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>>>>>> struct kvm_enable_cap *cap)
>>>>>> icpt_operexc_on_all_vcpus(kvm);
>>>>>> r = 0;
>>>>>> break;
>>>>>> + case KVM_CAP_S390_CPU_TOPOLOGY:
>>>>>> + mutex_lock(&kvm->lock);
>>>>>> + if (kvm->created_vcpus) {
>>>>>> + r = -EBUSY;
>>>>>> + } else {
>>>>>
>>>>> ...
>>>>> } else if (test_facility(11)) {
>>>>> set_kvm_facility(kvm->arch.model.fac_mask, 11);
>>>>> set_kvm_facility(kvm->arch.model.fac_list, 11);
>>>>> r = 0;
>>>>> } else {
>>>>> r = -EINVAL;
>>>>> }
>>>>>
>>>>> similar to how we handle KVM_CAP_S390_VECTOR_REGISTERS.
>>>>>
>>>>> But I assume you want to be able to support hosts without ECB_PTF,
>>>>> correct?
>>>>>
>>>>>
>>>>>> + set_kvm_facility(kvm->arch.model.fac_mask, 11);
>>>>>> + set_kvm_facility(kvm->arch.model.fac_list, 11);
>>>>>> + r = 0;
>>>>>> + }
>>>>>> + mutex_unlock(&kvm->lock);
>>>>>> + VM_EVENT(kvm, 3, "ENABLE: CPU TOPOLOGY %s",
>>>>>> + r ? "(not available)" : "(success)");
>>>>>> + break;
>>>>>> +
>>>>>> + r = -EINVAL;
>>>>>> + break;
>>>>>
>>>>> ^ dead code
>>>>>
>>>>> [...]
>>>>>
>>>>>> }
>>>>>> void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
>>>>>> {
>>>>>> + vcpu->arch.prev_cpu = vcpu->cpu;
>>>>>> vcpu->cpu = -1;
>>>>>> if (vcpu->arch.cputm_enabled && !is_vcpu_idle(vcpu))
>>>>>> __stop_cpu_timer_accounting(vcpu);
>>>>>> @@ -3198,6 +3239,11 @@ static int kvm_s390_vcpu_setup(struct
>>>>>> kvm_vcpu *vcpu)
>>>>>> vcpu->arch.sie_block->ecb |= ECB_HOSTPROTINT;
>>>>>> if (test_kvm_facility(vcpu->kvm, 9))
>>>>>> vcpu->arch.sie_block->ecb |= ECB_SRSI;
>>>>>> +
>>>>>> + /* PTF needs both host and guest facilities to enable
>>>>>> interpretation */
>>>>>> + if (test_kvm_facility(vcpu->kvm, 11) && test_facility(11))
>>>>>> + vcpu->arch.sie_block->ecb |= ECB_PTF;
>>>>>
>>>>> Here you say we need both ...
>>>>>
>>>>>> +
>>>>>> if (test_kvm_facility(vcpu->kvm, 73))
>>>>>> vcpu->arch.sie_block->ecb |= ECB_TE;
>>>>>> diff --git a/arch/s390/kvm/vsie.c b/arch/s390/kvm/vsie.c
>>>>>> index 4002a24bc43a..50d67190bf65 100644
>>>>>> --- a/arch/s390/kvm/vsie.c
>>>>>> +++ b/arch/s390/kvm/vsie.c
>>>>>> @@ -503,6 +503,9 @@ static int shadow_scb(struct kvm_vcpu *vcpu,
>>>>>> struct vsie_page *vsie_page)
>>>>>> /* Host-protection-interruption introduced with ESOP */
>>>>>> if (test_kvm_cpu_feat(vcpu->kvm, KVM_S390_VM_CPU_FEAT_ESOP))
>>>>>> scb_s->ecb |= scb_o->ecb & ECB_HOSTPROTINT;
>>>>>> + /* CPU Topology */
>>>>>> + if (test_kvm_facility(vcpu->kvm, 11))
>>>>>> + scb_s->ecb |= scb_o->ecb & ECB_PTF;
>>>>>
>>>>> but here you don't check?
>>>>>
>>>>>> /* transactional execution */
>>>>>> if (test_kvm_facility(vcpu->kvm, 73) && wants_tx) {
>>>>>> /* remap the prefix is tx is toggled on */
>>>>>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>>>>>> index d9e4aabcb31a..081ce0cd44b9 100644
>>>>>> --- a/include/uapi/linux/kvm.h
>>>>>> +++ b/include/uapi/linux/kvm.h
>>>>>> @@ -1112,6 +1112,7 @@ struct kvm_ppc_resize_hpt {
>>>>>> #define KVM_CAP_BINARY_STATS_FD 203
>>>>>> #define KVM_CAP_EXIT_ON_EMULATION_FAILURE 204
>>>>>> #define KVM_CAP_ARM_MTE 205
>>>>>> +#define KVM_CAP_S390_CPU_TOPOLOGY 206
>>>>>
>>>>> We'll need a Documentation/virt/kvm/api.rst description.
>>>>>
>>>>> I'm not completely confident that the way we're handling the
>>>>> capability+facility is the right approach. It all feels a bit
>>>>> suboptimal.
>>>>>
>>>>> Except stfl(74) -- STHYI --, we never enable a facility via
>>>>> set_kvm_facility() that's not available in the host. And STHYI is
>>>>> special such that it is never implemented in hardware.
>>>>>
>>>>> I'll think about what might be cleaner once I get some more details
>>>>> about the interaction with stfl(11) in the hypervisor.
>>>>>
>>>>
>>>> OK, may be we do not need to handle the case stfl(11) is not present
>>>> in the host, these are pre GA10...
>>>
>>> What about VSIE? For all existing KVM guests, stfl11 is off.
>>
>> In VSIE the patch activates stfl(11) only if the host has stfl(11).
>>
>> I do not see any problem to activate the interpretation in VSIE with
>> ECB_PTF (ECB.7) when the host has stfl(11) and QEMU asks to enable it
>> for the guest using the CAPABILITY as it is done in this patch.
>>
>> if any intermediary hypervizor decide to not advertize stfl(11) for
>> the guest like an old QEMU not having the CAPABILITY, or a QEMU with
>> ctop=off, KVM will not set ECB_PTF and the PTF instruction will
>> trigger a program check as before.
>>
>> Is it OK or did I missed something?
>
> Yes, sure.
> My point was regarding the pre z10 statement. We will see hosts without
> stfl(e)11 when running nested on z14, z15 and co.
Ah OK, yes, understood.
Thanks,
Pierre
--
Pierre Morel
IBM Lab Boeblingen
* Re: [PATCH v3 2/3] s390x: KVM: Implementation of Multiprocessor Topology-Change-Report
2021-09-06 18:37 ` David Hildenbrand
2021-09-07 10:24 ` Pierre Morel
2021-09-07 12:28 ` Pierre Morel
@ 2021-09-09 9:03 ` Pierre Morel
2 siblings, 0 replies; 25+ messages in thread
From: Pierre Morel @ 2021-09-09 9:03 UTC (permalink / raw)
To: David Hildenbrand, kvm
Cc: linux-s390, linux-kernel, borntraeger, frankja, cohuck, thuth,
imbrenda, hca, gor
On 9/6/21 8:37 PM, David Hildenbrand wrote:
> On 03.08.21 10:26, Pierre Morel wrote:
>> We let the userland hypervisor know if the machine support the CPU
>> topology facility using a new KVM capability: KVM_CAP_S390_CPU_TOPOLOGY.
>>
>> The PTF instruction will report a topology change if there is any change
>> with a previous STSI_15_2 SYSIB.
>> Changes inside a STSI_15_2 SYSIB occur if CPU bits are set or clear
>> inside the CPU Topology List Entry CPU mask field, which happens with
>> changes in CPU polarization, dedication, CPU types and adding or
>> removing CPUs in a socket.
>>
>> The reporting to the guest is done using the Multiprocessor
>> Topology-Change-Report (MTCR) bit of the utility entry of the guest's
>> SCA which will be cleared during the interpretation of PTF.
>>
>> To check if the topology has been modified we use a new field of the
>> arch vCPU to save the previous real CPU ID at the end of a schedule
>> and verify on next schedule that the CPU used is in the same socket.
>>
>> We deliberatly ignore:
>> - polarization: only horizontal polarization is currently used in linux.
>> - CPU Type: only IFL Type are supported in Linux
>> - Dedication: we consider that only a complete dedicated CPU stack can
>> take benefit of the CPU Topology.
>>
>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
>
>
>> @@ -228,7 +232,7 @@ struct kvm_s390_sie_block {
>> __u8 icptcode; /* 0x0050 */
>> __u8 icptstatus; /* 0x0051 */
>> __u16 ihcpu; /* 0x0052 */
>> - __u8 reserved54; /* 0x0054 */
>> + __u8 mtcr; /* 0x0054 */
>> #define IICTL_CODE_NONE 0x00
>> #define IICTL_CODE_MCHK 0x01
>> #define IICTL_CODE_EXT 0x02
>> @@ -246,6 +250,7 @@ struct kvm_s390_sie_block {
>> #define ECB_TE 0x10
>> #define ECB_SRSI 0x04
>> #define ECB_HOSTPROTINT 0x02
>> +#define ECB_PTF 0x01
>
> From below I understand, that ECB_PTF can be used with stfl(11) in the
> hypervisor.
>
> What is to happen if the hypervisor doesn't support stfl(11) and we
> consequently cannot use ECB_PTF? Will QEMU be able to emulate PTF fully?
>
>
>> __u8 ecb; /* 0x0061 */
>> #define ECB2_CMMA 0x80
>> #define ECB2_IEP 0x20
>> @@ -747,6 +752,7 @@ struct kvm_vcpu_arch {
>> bool skey_enabled;
>> struct kvm_s390_pv_vcpu pv;
>> union diag318_info diag318_info;
>> + int prev_cpu;
>> };
>> struct kvm_vm_stat {
>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>> index b655a7d82bf0..ff6d8a2b511c 100644
>> --- a/arch/s390/kvm/kvm-s390.c
>> +++ b/arch/s390/kvm/kvm-s390.c
>> @@ -568,6 +568,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm,
>> long ext)
>> case KVM_CAP_S390_VCPU_RESETS:
>> case KVM_CAP_SET_GUEST_DEBUG:
>> case KVM_CAP_S390_DIAG318:
>> + case KVM_CAP_S390_CPU_TOPOLOGY:
>
> I would have expected instead
>
> r = test_facility(11);
> break
I will change to this, as we decided not to support emulation if the host
does not support facility 11.
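
A minimal userspace sketch of what the changed capability check would look
like, assuming the capability simply mirrors host facility 11. The
host_facilities array and this test_facility() are hypothetical stand-ins for
the kernel helpers, so the snippet can be compiled on its own:

```c
#include <assert.h>
#include <stdbool.h>

/* Value taken from the patch; normally from include/uapi/linux/kvm.h. */
#define KVM_CAP_S390_CPU_TOPOLOGY 206
#define MAX_FACILITY 256

/* Hypothetical model of the host facility list. */
bool host_facilities[MAX_FACILITY];

/* Stand-in for the kernel's test_facility(): is facility nr installed? */
int test_facility(unsigned int nr)
{
	assert(nr < MAX_FACILITY);
	return host_facilities[nr];
}

/* Model of the KVM_CHECK_EXTENSION answer for the topology capability. */
int check_extension(long ext)
{
	int r;

	switch (ext) {
	case KVM_CAP_S390_CPU_TOPOLOGY:
		/* Report the capability only when the host has facility 11. */
		r = test_facility(11);
		break;
	default:
		r = 0;
	}
	return r;
}
```

With this shape userspace never sees the capability on a host without
facility 11, so no emulation path is needed.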
>
> ...
>
>> r = 1;
>> break;
>> case KVM_CAP_SET_GUEST_DEBUG2:
>> @@ -819,6 +820,23 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>> struct kvm_enable_cap *cap)
>> icpt_operexc_on_all_vcpus(kvm);
>> r = 0;
>> break;
>> + case KVM_CAP_S390_CPU_TOPOLOGY:
>> + mutex_lock(&kvm->lock);
>> + if (kvm->created_vcpus) {
>> + r = -EBUSY;
>> + } else {
>
> ...
> } else if (test_facility(11)) {
> set_kvm_facility(kvm->arch.model.fac_mask, 11);
> set_kvm_facility(kvm->arch.model.fac_list, 11);
> r = 0;
> } else {
> r = -EINVAL;
> }
>
> similar to how we handle KVM_CAP_S390_VECTOR_REGISTERS.
>
> But I assume you want to be able to support hosts without ECB_PTF, correct?
Not anymore: after Christian's comments we do not want to support emulation
at all.
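
A sketch of the resulting enable path, modelled in userspace. struct vm_model
and enable_cpu_topology() are hypothetical simplifications of struct kvm and
the KVM_ENABLE_CAP handler; locking and the fac_mask/fac_list split are left
out on purpose:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_FACILITY 256

/* Hypothetical, simplified VM state; the real struct kvm is far larger. */
struct vm_model {
	bool host_fac[MAX_FACILITY];   /* facilities the host provides */
	bool guest_fac[MAX_FACILITY];  /* facilities offered to the guest */
	unsigned int created_vcpus;    /* vCPUs created so far */
};

/* Model of enabling the topology capability; returns 0 or a negative errno. */
int enable_cpu_topology(struct vm_model *vm)
{
	assert(vm != NULL);

	if (vm->created_vcpus)
		return -EBUSY;   /* too late: vCPUs already exist */
	if (!vm->host_fac[11])
		return -EINVAL;  /* no host facility 11, no emulation fallback */

	vm->guest_fac[11] = true;
	return 0;
}
```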
>
>
...snip...
>> +
>> + /* PTF needs both host and guest facilities to enable
>> interpretation */
>> + if (test_kvm_facility(vcpu->kvm, 11) && test_facility(11))
>> + vcpu->arch.sie_block->ecb |= ECB_PTF;
>
> Here you say we need both ...
>
>> +
>> if (test_kvm_facility(vcpu->kvm, 73))
>> vcpu->arch.sie_block->ecb |= ECB_TE;
>> diff --git a/arch/s390/kvm/vsie.c b/arch/s390/kvm/vsie.c
>> index 4002a24bc43a..50d67190bf65 100644
>> --- a/arch/s390/kvm/vsie.c
>> +++ b/arch/s390/kvm/vsie.c
>> @@ -503,6 +503,9 @@ static int shadow_scb(struct kvm_vcpu *vcpu,
>> struct vsie_page *vsie_page)
>> /* Host-protection-interruption introduced with ESOP */
>> if (test_kvm_cpu_feat(vcpu->kvm, KVM_S390_VM_CPU_FEAT_ESOP))
>> scb_s->ecb |= scb_o->ecb & ECB_HOSTPROTINT;
>> + /* CPU Topology */
>> + if (test_kvm_facility(vcpu->kvm, 11))
>> + scb_s->ecb |= scb_o->ecb & ECB_PTF;
>
> but here you don't check?
Do we really need to check at all, even with test_kvm_facility()?
Facilities do not change during a guest session, and we already checked
when setting it the first time.
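
For reference, a small userspace model of what the check in shadow_scb()
does today. shadow_ecb_ptf() is a hypothetical helper; only the ECB_PTF value
comes from the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ECB_PTF 0x01  /* value taken from the patch */

/*
 * Model of the ECB_PTF part of shadow_scb(): the shadow control block
 * (scb_s) inherits the bit from the guest-provided one (scb_o) only when
 * facility 11 is part of the guest's CPU model.
 */
uint8_t shadow_ecb_ptf(uint8_t scb_o_ecb, bool guest_has_fac11)
{
	uint8_t scb_s_ecb = 0;

	if (guest_has_fac11)
		scb_s_ecb |= scb_o_ecb & ECB_PTF;

	/* Only the PTF bit can ever be set by this step. */
	assert((scb_s_ecb & ~ECB_PTF) == 0);
	return scb_s_ecb;
}
```

In this model, dropping the check would let a guest hypervisor that sets
ECB_PTF in its control block get interpretation even though facility 11 was
never offered to it. Whether that can matter in practice is the open question.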
Regards,
Pierre
--
Pierre Morel
IBM Lab Boeblingen