All of lore.kernel.org
 help / color / mirror / Atom feed
* [kvm-unit-tests PATCH] s390x: Workaround smp stop and store status race
@ 2021-02-09 14:15 Janosch Frank
  2021-02-09 14:21 ` Thomas Huth
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Janosch Frank @ 2021-02-09 14:15 UTC (permalink / raw)
  To: kvm; +Cc: linux-s390, david, thuth, imbrenda

KVM and QEMU handle a SIGP stop and store status in two steps:
1) Stop the CPU by injecting a stop request
2) Store when the CPU has left SIE because of the stop request

The problem is that the SIGP order is already considered completed by
KVM/QEMU when step 1 has been performed and not once both have
completed. In addition we currently don't implement the busy CC so a
kernel has no way of knowing that the store has finished other than
checking the location for the store.

This workaround is based on the fact that for a new SIE entry (via the
added smp restart) a stop with the store status has to be finished
first.

Correct handling of this in KVM/QEMU will need some thought and time.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
---
 s390x/smp.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/s390x/smp.c b/s390x/smp.c
index b0ece491..32f284a2 100644
--- a/s390x/smp.c
+++ b/s390x/smp.c
@@ -102,12 +102,15 @@ static void test_stop_store_status(void)
 	lc->grs_sa[15] = 0;
 	smp_cpu_stop_store_status(1);
 	mb();
+	report(smp_cpu_stopped(1), "cpu stopped");
+	/* For the cpu to be started it should have finished storing */
+	smp_cpu_restart(1);
 	report(lc->prefix_sa == (uint32_t)(uintptr_t)cpu->lowcore, "prefix");
 	report(lc->grs_sa[15], "stack");
-	report(smp_cpu_stopped(1), "cpu stopped");
 	report_prefix_pop();
 
 	report_prefix_push("stopped");
+	smp_cpu_stop(1);
 	lc->prefix_sa = 0;
 	lc->grs_sa[15] = 0;
 	smp_cpu_stop_store_status(1);
-- 
2.25.1

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [kvm-unit-tests PATCH] s390x: Workaround smp stop and store status race
  2021-02-09 14:15 [kvm-unit-tests PATCH] s390x: Workaround smp stop and store status race Janosch Frank
@ 2021-02-09 14:21 ` Thomas Huth
  2021-02-09 16:08 ` Claudio Imbrenda
  2021-02-09 16:55 ` David Hildenbrand
  2 siblings, 0 replies; 7+ messages in thread
From: Thomas Huth @ 2021-02-09 14:21 UTC (permalink / raw)
  To: Janosch Frank, kvm; +Cc: linux-s390, david, imbrenda

On 09/02/2021 15.15, Janosch Frank wrote:
> KVM and QEMU handle a SIGP stop and store status in two steps:
> 1) Stop the CPU by injecting a stop request
> 2) Store when the CPU has left SIE because of the stop request
> 
> The problem is that the SIGP order is already considered completed by
> KVM/QEMU when step 1 has been performed and not once both have
> completed. In addition we currently don't implement the busy CC so a
> kernel has no way of knowing that the store has finished other than
> checking the location for the store.
> 
> This workaround is based on the fact that for a new SIE entry (via the
> added smp restart) a stop with the store status has to be finished
> first.
> 
> Correct handling of this in KVM/QEMU will need some thought and time.
> 
> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
> ---
>   s390x/smp.c | 5 ++++-
>   1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/s390x/smp.c b/s390x/smp.c
> index b0ece491..32f284a2 100644
> --- a/s390x/smp.c
> +++ b/s390x/smp.c
> @@ -102,12 +102,15 @@ static void test_stop_store_status(void)
>   	lc->grs_sa[15] = 0;
>   	smp_cpu_stop_store_status(1);
>   	mb();
> +	report(smp_cpu_stopped(1), "cpu stopped");
> +	/* For the cpu to be started it should have finished storing */
> +	smp_cpu_restart(1);
>   	report(lc->prefix_sa == (uint32_t)(uintptr_t)cpu->lowcore, "prefix");
>   	report(lc->grs_sa[15], "stack");
> -	report(smp_cpu_stopped(1), "cpu stopped");
>   	report_prefix_pop();
>   
>   	report_prefix_push("stopped");
> +	smp_cpu_stop(1);
>   	lc->prefix_sa = 0;
>   	lc->grs_sa[15] = 0;
>   	smp_cpu_stop_store_status(1);

Thanks, this fixes the flaky test for me!

Tested-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [kvm-unit-tests PATCH] s390x: Workaround smp stop and store status race
  2021-02-09 14:15 [kvm-unit-tests PATCH] s390x: Workaround smp stop and store status race Janosch Frank
  2021-02-09 14:21 ` Thomas Huth
@ 2021-02-09 16:08 ` Claudio Imbrenda
  2021-02-09 16:14   ` Janosch Frank
  2021-02-09 16:55 ` David Hildenbrand
  2 siblings, 1 reply; 7+ messages in thread
From: Claudio Imbrenda @ 2021-02-09 16:08 UTC (permalink / raw)
  To: Janosch Frank; +Cc: kvm, linux-s390, david, thuth

On Tue,  9 Feb 2021 09:15:54 -0500
Janosch Frank <frankja@linux.ibm.com> wrote:

> KVM and QEMU handle a SIGP stop and store status in two steps:
> 1) Stop the CPU by injecting a stop request
> 2) Store when the CPU has left SIE because of the stop request
> 
> The problem is that the SIGP order is already considered completed by
> KVM/QEMU when step 1 has been performed and not once both have
> completed. In addition we currently don't implement the busy CC so a
> kernel has no way of knowing that the store has finished other than
> checking the location for the store.
> 
> This workaround is based on the fact that for a new SIE entry (via the
> added smp restart) a stop with the store status has to be finished
> first.
> 
> Correct handling of this in KVM/QEMU will need some thought and time.

do I understand correctly that you are here "fixing" the test by not
triggering the KVM bug? Shouldn't we try to trigger as many bugs as
possible instead?

> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
> ---
>  s390x/smp.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/s390x/smp.c b/s390x/smp.c
> index b0ece491..32f284a2 100644
> --- a/s390x/smp.c
> +++ b/s390x/smp.c
> @@ -102,12 +102,15 @@ static void test_stop_store_status(void)
>  	lc->grs_sa[15] = 0;
>  	smp_cpu_stop_store_status(1);
>  	mb();
> +	report(smp_cpu_stopped(1), "cpu stopped");
> +	/* For the cpu to be started it should have finished storing
> */
> +	smp_cpu_restart(1);
>  	report(lc->prefix_sa == (uint32_t)(uintptr_t)cpu->lowcore,
> "prefix"); report(lc->grs_sa[15], "stack");
> -	report(smp_cpu_stopped(1), "cpu stopped");
>  	report_prefix_pop();
>  
>  	report_prefix_push("stopped");
> +	smp_cpu_stop(1);
>  	lc->prefix_sa = 0;
>  	lc->grs_sa[15] = 0;
>  	smp_cpu_stop_store_status(1);


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [kvm-unit-tests PATCH] s390x: Workaround smp stop and store status race
  2021-02-09 16:08 ` Claudio Imbrenda
@ 2021-02-09 16:14   ` Janosch Frank
  2021-02-09 16:19     ` Thomas Huth
  0 siblings, 1 reply; 7+ messages in thread
From: Janosch Frank @ 2021-02-09 16:14 UTC (permalink / raw)
  To: Claudio Imbrenda; +Cc: kvm, linux-s390, david, thuth

On 2/9/21 5:08 PM, Claudio Imbrenda wrote:
> On Tue,  9 Feb 2021 09:15:54 -0500
> Janosch Frank <frankja@linux.ibm.com> wrote:
> 
>> KVM and QEMU handle a SIGP stop and store status in two steps:
>> 1) Stop the CPU by injecting a stop request
>> 2) Store when the CPU has left SIE because of the stop request
>>
>> The problem is that the SIGP order is already considered completed by
>> KVM/QEMU when step 1 has been performed and not once both have
>> completed. In addition we currently don't implement the busy CC so a
>> kernel has no way of knowing that the store has finished other than
>> checking the location for the store.
>>
>> This workaround is based on the fact that for a new SIE entry (via the
>> added smp restart) a stop with the store status has to be finished
>> first.
>>
>> Correct handling of this in KVM/QEMU will need some thought and time.
> 
> do I understand correctly that you are here "fixing" the test by not
> triggering the KVM bug? Shouldn't we try to trigger as many bugs as
> possible instead?

This is not a bug, it's missing code :-)

We trigger a higher number of bugs by running tests and this workaround
does exactly that by letting Thomas use the smp test in the CI again.

> 
>> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
>> ---
>>  s390x/smp.c | 5 ++++-
>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/s390x/smp.c b/s390x/smp.c
>> index b0ece491..32f284a2 100644
>> --- a/s390x/smp.c
>> +++ b/s390x/smp.c
>> @@ -102,12 +102,15 @@ static void test_stop_store_status(void)
>>  	lc->grs_sa[15] = 0;
>>  	smp_cpu_stop_store_status(1);
>>  	mb();
>> +	report(smp_cpu_stopped(1), "cpu stopped");
>> +	/* For the cpu to be started it should have finished storing
>> */
>> +	smp_cpu_restart(1);
>>  	report(lc->prefix_sa == (uint32_t)(uintptr_t)cpu->lowcore,
>> "prefix"); report(lc->grs_sa[15], "stack");
>> -	report(smp_cpu_stopped(1), "cpu stopped");
>>  	report_prefix_pop();
>>  
>>  	report_prefix_push("stopped");
>> +	smp_cpu_stop(1);
>>  	lc->prefix_sa = 0;
>>  	lc->grs_sa[15] = 0;
>>  	smp_cpu_stop_store_status(1);
> 


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [kvm-unit-tests PATCH] s390x: Workaround smp stop and store status race
  2021-02-09 16:14   ` Janosch Frank
@ 2021-02-09 16:19     ` Thomas Huth
  2021-02-09 16:42       ` Janosch Frank
  0 siblings, 1 reply; 7+ messages in thread
From: Thomas Huth @ 2021-02-09 16:19 UTC (permalink / raw)
  To: Janosch Frank, Claudio Imbrenda; +Cc: kvm, linux-s390, david

On 09/02/2021 17.14, Janosch Frank wrote:
> On 2/9/21 5:08 PM, Claudio Imbrenda wrote:
>> On Tue,  9 Feb 2021 09:15:54 -0500
>> Janosch Frank <frankja@linux.ibm.com> wrote:
>>
>>> KVM and QEMU handle a SIGP stop and store status in two steps:
>>> 1) Stop the CPU by injecting a stop request
>>> 2) Store when the CPU has left SIE because of the stop request
>>>
>>> The problem is that the SIGP order is already considered completed by
>>> KVM/QEMU when step 1 has been performed and not once both have
>>> completed. In addition we currently don't implement the busy CC so a
>>> kernel has no way of knowing that the store has finished other than
>>> checking the location for the store.
>>>
>>> This workaround is based on the fact that for a new SIE entry (via the
>>> added smp restart) a stop with the store status has to be finished
>>> first.
>>>
>>> Correct handling of this in KVM/QEMU will need some thought and time.
>>
>> do I understand correctly that you are here "fixing" the test by not
>> triggering the KVM bug? Shouldn't we try to trigger as many bugs as
>> possible instead?
> 
> This is not a bug, it's missing code :-)
> 
> We trigger a higher number of bugs by running tests and this workaround
> does exactly that by letting Thomas use the smp test in the CI again.

Alternatively, we could use report_xfail here to make the test pass, but 
still have the problem reported so that we do not forget to fix it later.

  Thomas


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [kvm-unit-tests PATCH] s390x: Workaround smp stop and store status race
  2021-02-09 16:19     ` Thomas Huth
@ 2021-02-09 16:42       ` Janosch Frank
  0 siblings, 0 replies; 7+ messages in thread
From: Janosch Frank @ 2021-02-09 16:42 UTC (permalink / raw)
  To: Thomas Huth, Claudio Imbrenda; +Cc: kvm, linux-s390, david

On 2/9/21 5:19 PM, Thomas Huth wrote:
> On 09/02/2021 17.14, Janosch Frank wrote:
>> On 2/9/21 5:08 PM, Claudio Imbrenda wrote:
>>> On Tue,  9 Feb 2021 09:15:54 -0500
>>> Janosch Frank <frankja@linux.ibm.com> wrote:
>>>
>>>> KVM and QEMU handle a SIGP stop and store status in two steps:
>>>> 1) Stop the CPU by injecting a stop request
>>>> 2) Store when the CPU has left SIE because of the stop request
>>>>
>>>> The problem is that the SIGP order is already considered completed by
>>>> KVM/QEMU when step 1 has been performed and not once both have
>>>> completed. In addition we currently don't implement the busy CC so a
>>>> kernel has no way of knowing that the store has finished other than
>>>> checking the location for the store.
>>>>
>>>> This workaround is based on the fact that for a new SIE entry (via the
>>>> added smp restart) a stop with the store status has to be finished
>>>> first.
>>>>
>>>> Correct handling of this in KVM/QEMU will need some thought and time.
>>>
>>> do I understand correctly that you are here "fixing" the test by not
>>> triggering the KVM bug? Shouldn't we try to trigger as many bugs as
>>> possible instead?
>>
>> This is not a bug, it's missing code :-)
>>
>> We trigger a higher number of bugs by running tests and this workaround
>> does exactly that by letting Thomas use the smp test in the CI again.
> 
> Alternatively, we could use report_xfail here to make the test pass, but 
> still have the problem reported so that we do not forget to fix it later.
> 
>   Thomas
> 

I have no strong opinion on that although I'll need to have a look if
our CI flags XPASS/XFAIL as fails.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [kvm-unit-tests PATCH] s390x: Workaround smp stop and store status race
  2021-02-09 14:15 [kvm-unit-tests PATCH] s390x: Workaround smp stop and store status race Janosch Frank
  2021-02-09 14:21 ` Thomas Huth
  2021-02-09 16:08 ` Claudio Imbrenda
@ 2021-02-09 16:55 ` David Hildenbrand
  2 siblings, 0 replies; 7+ messages in thread
From: David Hildenbrand @ 2021-02-09 16:55 UTC (permalink / raw)
  To: Janosch Frank, kvm; +Cc: linux-s390, thuth, imbrenda

On 09.02.21 15:15, Janosch Frank wrote:
> KVM and QEMU handle a SIGP stop and store status in two steps:
> 1) Stop the CPU by injecting a stop request
> 2) Store when the CPU has left SIE because of the stop request
> 
> The problem is that the SIGP order is already considered completed by
> KVM/QEMU when step 1 has been performed and not once both have
> completed. In addition we currently don't implement the busy CC so a

QEMU remembers that stop-and-store-status is active in 
"cpu->env.sigp_order" and rejects other orders until completed with 
SIGP_CC_BUSY. See handle_sigp_single_dst().

The "issue" is that the kernel does not know about that. The kernel will 
continue handling
- SIGP_SENSE
- SIGP_EXTERNAL_CALL
- SIGP_EMERGENCY_SIGNAL
- SIGP_COND_EMERGENCY_SIGNAL
- SIGP_SENSE_RUNNING
itself and not go to user space where that information is present. And 
for some of these orders we really don't want to go to user space.

I remember that at least SIGP_SENSE_RUNNING doesn't give any guarantees, 
so that one might be correct.

I remember that it was a little unclear if all of these orders actually 
have to wait for other orders to finish (IOW: return SIGP_CC_BUSY). 
Especially with SIGPIF even the hardware has no idea if we are emulating 
a SIGP STOP .* right now, how should it know? We would have to disable 
the facility, and that's most probably not what the HW/facility was 
designed for.

Yeah, it's complicated. We would have to let QEMU "set" and "clear" a 
SIGP busy status in the kernel. Then we could at least let SIGP SENSE be 
correct and fast (I think that's the only critical one to get right).

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2021-02-09 16:56 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-09 14:15 [kvm-unit-tests PATCH] s390x: Workaround smp stop and store status race Janosch Frank
2021-02-09 14:21 ` Thomas Huth
2021-02-09 16:08 ` Claudio Imbrenda
2021-02-09 16:14   ` Janosch Frank
2021-02-09 16:19     ` Thomas Huth
2021-02-09 16:42       ` Janosch Frank
2021-02-09 16:55 ` David Hildenbrand

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.