* [kvm-unit-tests PATCH] s390x: Workaround smp stop and store status race
@ 2021-02-09 14:15 Janosch Frank
2021-02-09 14:21 ` Thomas Huth
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Janosch Frank @ 2021-02-09 14:15 UTC (permalink / raw)
To: kvm; +Cc: linux-s390, david, thuth, imbrenda
KVM and QEMU handle a SIGP stop and store status in two steps:
1) Stop the CPU by injecting a stop request
2) Store when the CPU has left SIE because of the stop request
The problem is that the SIGP order is already considered completed by
KVM/QEMU when step 1 has been performed and not once both have
completed. In addition we currently don't implement the busy CC so a
kernel has no way of knowing that the store has finished other than
checking the location for the store.
This workaround is based on the fact that for a new SIE entry (via the
added smp restart) a stop with the store status has to be finished
first.
Correct handling of this in KVM/QEMU will need some thought and time.
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
---
s390x/smp.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/s390x/smp.c b/s390x/smp.c
index b0ece491..32f284a2 100644
--- a/s390x/smp.c
+++ b/s390x/smp.c
@@ -102,12 +102,15 @@ static void test_stop_store_status(void)
lc->grs_sa[15] = 0;
smp_cpu_stop_store_status(1);
mb();
+ report(smp_cpu_stopped(1), "cpu stopped");
+ /* For the cpu to be started it should have finished storing */
+ smp_cpu_restart(1);
report(lc->prefix_sa == (uint32_t)(uintptr_t)cpu->lowcore, "prefix");
report(lc->grs_sa[15], "stack");
- report(smp_cpu_stopped(1), "cpu stopped");
report_prefix_pop();
report_prefix_push("stopped");
+ smp_cpu_stop(1);
lc->prefix_sa = 0;
lc->grs_sa[15] = 0;
smp_cpu_stop_store_status(1);
--
2.25.1
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [kvm-unit-tests PATCH] s390x: Workaround smp stop and store status race
2021-02-09 14:15 [kvm-unit-tests PATCH] s390x: Workaround smp stop and store status race Janosch Frank
@ 2021-02-09 14:21 ` Thomas Huth
2021-02-09 16:08 ` Claudio Imbrenda
2021-02-09 16:55 ` David Hildenbrand
2 siblings, 0 replies; 7+ messages in thread
From: Thomas Huth @ 2021-02-09 14:21 UTC (permalink / raw)
To: Janosch Frank, kvm; +Cc: linux-s390, david, imbrenda
On 09/02/2021 15.15, Janosch Frank wrote:
> KVM and QEMU handle a SIGP stop and store status in two steps:
> 1) Stop the CPU by injecting a stop request
> 2) Store when the CPU has left SIE because of the stop request
>
> The problem is that the SIGP order is already considered completed by
> KVM/QEMU when step 1 has been performed and not once both have
> completed. In addition we currently don't implement the busy CC so a
> kernel has no way of knowing that the store has finished other than
> checking the location for the store.
>
> This workaround is based on the fact that for a new SIE entry (via the
> added smp restart) a stop with the store status has to be finished
> first.
>
> Correct handling of this in KVM/QEMU will need some thought and time.
>
> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
> ---
> s390x/smp.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/s390x/smp.c b/s390x/smp.c
> index b0ece491..32f284a2 100644
> --- a/s390x/smp.c
> +++ b/s390x/smp.c
> @@ -102,12 +102,15 @@ static void test_stop_store_status(void)
> lc->grs_sa[15] = 0;
> smp_cpu_stop_store_status(1);
> mb();
> + report(smp_cpu_stopped(1), "cpu stopped");
> + /* For the cpu to be started it should have finished storing */
> + smp_cpu_restart(1);
> report(lc->prefix_sa == (uint32_t)(uintptr_t)cpu->lowcore, "prefix");
> report(lc->grs_sa[15], "stack");
> - report(smp_cpu_stopped(1), "cpu stopped");
> report_prefix_pop();
>
> report_prefix_push("stopped");
> + smp_cpu_stop(1);
> lc->prefix_sa = 0;
> lc->grs_sa[15] = 0;
> smp_cpu_stop_store_status(1);
Thanks, this fixes the flaky test for me!
Tested-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [kvm-unit-tests PATCH] s390x: Workaround smp stop and store status race
2021-02-09 14:15 [kvm-unit-tests PATCH] s390x: Workaround smp stop and store status race Janosch Frank
2021-02-09 14:21 ` Thomas Huth
@ 2021-02-09 16:08 ` Claudio Imbrenda
2021-02-09 16:14 ` Janosch Frank
2021-02-09 16:55 ` David Hildenbrand
2 siblings, 1 reply; 7+ messages in thread
From: Claudio Imbrenda @ 2021-02-09 16:08 UTC (permalink / raw)
To: Janosch Frank; +Cc: kvm, linux-s390, david, thuth
On Tue, 9 Feb 2021 09:15:54 -0500
Janosch Frank <frankja@linux.ibm.com> wrote:
> KVM and QEMU handle a SIGP stop and store status in two steps:
> 1) Stop the CPU by injecting a stop request
> 2) Store when the CPU has left SIE because of the stop request
>
> The problem is that the SIGP order is already considered completed by
> KVM/QEMU when step 1 has been performed and not once both have
> completed. In addition we currently don't implement the busy CC so a
> kernel has no way of knowing that the store has finished other than
> checking the location for the store.
>
> This workaround is based on the fact that for a new SIE entry (via the
> added smp restart) a stop with the store status has to be finished
> first.
>
> Correct handling of this in KVM/QEMU will need some thought and time.
do I understand correctly that you are here "fixing" the test by not
triggering the KVM bug? Shouldn't we try to trigger as many bugs as
possible instead?
> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
> ---
> s390x/smp.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/s390x/smp.c b/s390x/smp.c
> index b0ece491..32f284a2 100644
> --- a/s390x/smp.c
> +++ b/s390x/smp.c
> @@ -102,12 +102,15 @@ static void test_stop_store_status(void)
> lc->grs_sa[15] = 0;
> smp_cpu_stop_store_status(1);
> mb();
> + report(smp_cpu_stopped(1), "cpu stopped");
> + /* For the cpu to be started it should have finished storing
> */
> + smp_cpu_restart(1);
> report(lc->prefix_sa == (uint32_t)(uintptr_t)cpu->lowcore,
> "prefix"); report(lc->grs_sa[15], "stack");
> - report(smp_cpu_stopped(1), "cpu stopped");
> report_prefix_pop();
>
> report_prefix_push("stopped");
> + smp_cpu_stop(1);
> lc->prefix_sa = 0;
> lc->grs_sa[15] = 0;
> smp_cpu_stop_store_status(1);
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [kvm-unit-tests PATCH] s390x: Workaround smp stop and store status race
2021-02-09 16:08 ` Claudio Imbrenda
@ 2021-02-09 16:14 ` Janosch Frank
2021-02-09 16:19 ` Thomas Huth
0 siblings, 1 reply; 7+ messages in thread
From: Janosch Frank @ 2021-02-09 16:14 UTC (permalink / raw)
To: Claudio Imbrenda; +Cc: kvm, linux-s390, david, thuth
On 2/9/21 5:08 PM, Claudio Imbrenda wrote:
> On Tue, 9 Feb 2021 09:15:54 -0500
> Janosch Frank <frankja@linux.ibm.com> wrote:
>
>> KVM and QEMU handle a SIGP stop and store status in two steps:
>> 1) Stop the CPU by injecting a stop request
>> 2) Store when the CPU has left SIE because of the stop request
>>
>> The problem is that the SIGP order is already considered completed by
>> KVM/QEMU when step 1 has been performed and not once both have
>> completed. In addition we currently don't implement the busy CC so a
>> kernel has no way of knowing that the store has finished other than
>> checking the location for the store.
>>
>> This workaround is based on the fact that for a new SIE entry (via the
>> added smp restart) a stop with the store status has to be finished
>> first.
>>
>> Correct handling of this in KVM/QEMU will need some thought and time.
>
> do I understand correctly that you are here "fixing" the test by not
> triggering the KVM bug? Shouldn't we try to trigger as many bugs as
> possible instead?
This is not a bug, it's missing code :-)
We trigger a higher number of bugs by running tests and this workaround
does exactly that by letting Thomas use the smp test in the CI again.
>
>> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
>> ---
>> s390x/smp.c | 5 ++++-
>> 1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/s390x/smp.c b/s390x/smp.c
>> index b0ece491..32f284a2 100644
>> --- a/s390x/smp.c
>> +++ b/s390x/smp.c
>> @@ -102,12 +102,15 @@ static void test_stop_store_status(void)
>> lc->grs_sa[15] = 0;
>> smp_cpu_stop_store_status(1);
>> mb();
>> + report(smp_cpu_stopped(1), "cpu stopped");
>> + /* For the cpu to be started it should have finished storing
>> */
>> + smp_cpu_restart(1);
>> report(lc->prefix_sa == (uint32_t)(uintptr_t)cpu->lowcore,
>> "prefix"); report(lc->grs_sa[15], "stack");
>> - report(smp_cpu_stopped(1), "cpu stopped");
>> report_prefix_pop();
>>
>> report_prefix_push("stopped");
>> + smp_cpu_stop(1);
>> lc->prefix_sa = 0;
>> lc->grs_sa[15] = 0;
>> smp_cpu_stop_store_status(1);
>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [kvm-unit-tests PATCH] s390x: Workaround smp stop and store status race
2021-02-09 16:14 ` Janosch Frank
@ 2021-02-09 16:19 ` Thomas Huth
2021-02-09 16:42 ` Janosch Frank
0 siblings, 1 reply; 7+ messages in thread
From: Thomas Huth @ 2021-02-09 16:19 UTC (permalink / raw)
To: Janosch Frank, Claudio Imbrenda; +Cc: kvm, linux-s390, david
On 09/02/2021 17.14, Janosch Frank wrote:
> On 2/9/21 5:08 PM, Claudio Imbrenda wrote:
>> On Tue, 9 Feb 2021 09:15:54 -0500
>> Janosch Frank <frankja@linux.ibm.com> wrote:
>>
>>> KVM and QEMU handle a SIGP stop and store status in two steps:
>>> 1) Stop the CPU by injecting a stop request
>>> 2) Store when the CPU has left SIE because of the stop request
>>>
>>> The problem is that the SIGP order is already considered completed by
>>> KVM/QEMU when step 1 has been performed and not once both have
>>> completed. In addition we currently don't implement the busy CC so a
>>> kernel has no way of knowing that the store has finished other than
>>> checking the location for the store.
>>>
>>> This workaround is based on the fact that for a new SIE entry (via the
>>> added smp restart) a stop with the store status has to be finished
>>> first.
>>>
>>> Correct handling of this in KVM/QEMU will need some thought and time.
>>
>> do I understand correctly that you are here "fixing" the test by not
>> triggering the KVM bug? Shouldn't we try to trigger as many bugs as
>> possible instead?
>
> This is not a bug, it's missing code :-)
>
> We trigger a higher number of bugs by running tests and this workaround
> does exactly that by letting Thomas use the smp test in the CI again.
Alternatively, we could use report_xfail here to make the test pass, but
still have the problem reported so that we do not forget to fix it later.
Thomas
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [kvm-unit-tests PATCH] s390x: Workaround smp stop and store status race
2021-02-09 16:19 ` Thomas Huth
@ 2021-02-09 16:42 ` Janosch Frank
0 siblings, 0 replies; 7+ messages in thread
From: Janosch Frank @ 2021-02-09 16:42 UTC (permalink / raw)
To: Thomas Huth, Claudio Imbrenda; +Cc: kvm, linux-s390, david
On 2/9/21 5:19 PM, Thomas Huth wrote:
> On 09/02/2021 17.14, Janosch Frank wrote:
>> On 2/9/21 5:08 PM, Claudio Imbrenda wrote:
>>> On Tue, 9 Feb 2021 09:15:54 -0500
>>> Janosch Frank <frankja@linux.ibm.com> wrote:
>>>
>>>> KVM and QEMU handle a SIGP stop and store status in two steps:
>>>> 1) Stop the CPU by injecting a stop request
>>>> 2) Store when the CPU has left SIE because of the stop request
>>>>
>>>> The problem is that the SIGP order is already considered completed by
>>>> KVM/QEMU when step 1 has been performed and not once both have
>>>> completed. In addition we currently don't implement the busy CC so a
>>>> kernel has no way of knowing that the store has finished other than
>>>> checking the location for the store.
>>>>
>>>> This workaround is based on the fact that for a new SIE entry (via the
>>>> added smp restart) a stop with the store status has to be finished
>>>> first.
>>>>
>>>> Correct handling of this in KVM/QEMU will need some thought and time.
>>>
>>> do I understand correctly that you are here "fixing" the test by not
>>> triggering the KVM bug? Shouldn't we try to trigger as many bugs as
>>> possible instead?
>>
>> This is not a bug, it's missing code :-)
>>
>> We trigger a higher number of bugs by running tests and this workaround
>> does exactly that by letting Thomas use the smp test in the CI again.
>
> Alternatively, we could use report_xfail here to make the test pass, but
> still have the problem reported so that we do not forget to fix it later.
>
> Thomas
>
I have no strong opinion on that although I'll need to have a look if
our CI flags XPASS/XFAIL as fails.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [kvm-unit-tests PATCH] s390x: Workaround smp stop and store status race
2021-02-09 14:15 [kvm-unit-tests PATCH] s390x: Workaround smp stop and store status race Janosch Frank
2021-02-09 14:21 ` Thomas Huth
2021-02-09 16:08 ` Claudio Imbrenda
@ 2021-02-09 16:55 ` David Hildenbrand
2 siblings, 0 replies; 7+ messages in thread
From: David Hildenbrand @ 2021-02-09 16:55 UTC (permalink / raw)
To: Janosch Frank, kvm; +Cc: linux-s390, thuth, imbrenda
On 09.02.21 15:15, Janosch Frank wrote:
> KVM and QEMU handle a SIGP stop and store status in two steps:
> 1) Stop the CPU by injecting a stop request
> 2) Store when the CPU has left SIE because of the stop request
>
> The problem is that the SIGP order is already considered completed by
> KVM/QEMU when step 1 has been performed and not once both have
> completed. In addition we currently don't implement the busy CC so a
QEMU remembers that stop-and-store-status is active in
"cpu->env.sigp_order" and rejects other orders until completed with
SIGP_CC_BUSY. See handle_sigp_single_dst().
The "issue" is that the kernel does not know about that. The kernel will
continue handling
- SIGP_SENSE
- SIGP_EXTERNAL_CALL
- SIGP_EMERGENCY_SIGNAL
- SIGP_COND_EMERGENCY_SIGNAL
- SIGP_SENSE_RUNNING
itself and not go to user space where that information is present. And
for some of these orders we really don't want to go to user space.
I remember that at least SIGP_SENSE_RUNNING doesn't give any guarantees,
so that one might be correct.
I remember that it was a little unclear if all of these orders actually
have to wait for other orders to finish (IOW: return SIGP_CC_BUSY).
Especially with SIGPIF even the hardware has no idea if we are emulating
a SIGP STOP .* right now, how should it know? We would have to disable
the facility, and that's most probably not what the HW/facility was
designed for.
Yeah, it's complicated. We would have to let QEMU "set" and "clear" a
SIGP busy status in the kernel. Then we could at least let SIGP SENSE be
correct and fast (I think that's the only critical one to get right).
--
Thanks,
David / dhildenb
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2021-02-09 16:56 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-09 14:15 [kvm-unit-tests PATCH] s390x: Workaround smp stop and store status race Janosch Frank
2021-02-09 14:21 ` Thomas Huth
2021-02-09 16:08 ` Claudio Imbrenda
2021-02-09 16:14 ` Janosch Frank
2021-02-09 16:19 ` Thomas Huth
2021-02-09 16:42 ` Janosch Frank
2021-02-09 16:55 ` David Hildenbrand
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.