linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] powerpc/rtas: Fix hang in race against concurrent cpu offline
@ 2019-06-24 23:48 Juliet Kim
  2019-06-25 17:29 ` Nathan Lynch
  0 siblings, 1 reply; 5+ messages in thread
From: Juliet Kim @ 2019-06-24 23:48 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: mmc, mwb, nathanl

From 1bd2bf7146876e099eafb292668f9a12dc22f526 Mon Sep 17 00:00:00 2001
From: Juliet Kim <julietk@linux.vnet.ibm.com>
Date: Mon, 24 Jun 2019 17:17:46 -0400
Subject: [PATCH 1/1] powerpc/rtas: Fix hang in race against concurrent cpu
 offline
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The commit
(“powerpc/rtas: Fix a potential race between CPU-Offline & Migration)
attempted to fix a hang in Live Partition Mobility(LPM) by abandoning
the LPM attempt if a race between LPM and concurrent CPU offline was
detected.

However, that fix failed to notify Hypervisor that the LPM attempted
had been abandoned which results in a system hang.

Fix this by sending a signal PHYP to cancel the migration, so that PHYP
can stop waiting, and clean up the migration.

Fixes: dfd718a2ed1f ("powerpc/rtas: Fix a potential race between CPU-Offline & Migration")
Signed-off-by: Juliet Kim <julietk@linux.vnet.ibm.com>
---
This is an alternate solution to the one Nathan proposed in:
https://lists.ozlabs.org/pipermail/linuxppc-dev/2019-June/192414.html
---
 arch/powerpc/include/asm/hvcall.h | 7 +++++++
 arch/powerpc/kernel/rtas.c        | 8 ++++++++
 2 files changed, 15 insertions(+)

diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
index 463c63a..29ca285 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -261,6 +261,7 @@
 #define H_ADD_CONN		0x284
 #define H_DEL_CONN		0x288
 #define H_JOIN			0x298
+#define H_VASI_SIGNAL           0x2A0
 #define H_VASI_STATE            0x2A4
 #define H_VIOCTL		0x2A8
 #define H_ENABLE_CRQ		0x2B0
@@ -348,6 +349,12 @@
 #define H_SIGNAL_SYS_RESET_ALL_OTHERS		-2
 /* >= 0 values are CPU number */
 
+/* Values for argument to H_VASI_SIGNAL */
+#define H_SIGNAL_CANCEL_MIG     0x01
+
+/* Values for 2nd argument to H_VASI_SIGNAL */
+#define H_CPU_OFFLINE_DETECTED  0x0000000006000004
+
 /* H_GET_CPU_CHARACTERISTICS return values */
 #define H_CPU_CHAR_SPEC_BAR_ORI31	(1ull << 63) // IBM bit 0
 #define H_CPU_CHAR_BCCTRL_SERIALISED	(1ull << 62) // IBM bit 1
diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index b824f4c..f9002b7 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -981,6 +981,14 @@ int rtas_ibm_suspend_me(u64 handle)
 
 	/* Check if we raced with a CPU-Offline Operation */
 	if (unlikely(!cpumask_equal(cpu_present_mask, cpu_online_mask))) {
+
+		/* We uses CANCEL, not ABORT to gracefully cancel migration */
+		rc = plpar_hcall_norets(H_VASI_SIGNAL, handle,
+			H_SIGNAL_CANCEL_MIG, H_CPU_OFFLINE_DETECTED);
+
+		if (rc != H_SUCCESS)
+			pr_err("%s: vasi_signal failed %ld\n", __func__, rc);
+
 		pr_err("%s: Raced against a concurrent CPU-Offline\n",
 		       __func__);
 		atomic_set(&data.error, -EBUSY);
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] powerpc/rtas: Fix hang in race against concurrent cpu offline
  2019-06-24 23:48 [PATCH] powerpc/rtas: Fix hang in race against concurrent cpu offline Juliet Kim
@ 2019-06-25 17:29 ` Nathan Lynch
  2019-06-26 21:49   ` Juliet Kim
  0 siblings, 1 reply; 5+ messages in thread
From: Nathan Lynch @ 2019-06-25 17:29 UTC (permalink / raw)
  To: Juliet Kim; +Cc: linuxppc-dev, mmc, mwb

Juliet Kim <julietk@linux.vnet.ibm.com> writes:
> The commit
> (“powerpc/rtas: Fix a potential race between CPU-Offline & Migration)
> attempted to fix a hang in Live Partition Mobility(LPM) by abandoning
> the LPM attempt if a race between LPM and concurrent CPU offline was
> detected.
>
> However, that fix failed to notify Hypervisor that the LPM attempted
> had been abandoned which results in a system hang.

It is surprising to me that leaving a migration unterminated would cause
Linux to hang. Can you explain more about how that happens?


> Fix this by sending a signal PHYP to cancel the migration, so that PHYP
> can stop waiting, and clean up the migration.

This is well-spotted and rtas_ibm_suspend_me() needs to signal
cancellation in several error paths. But I don't agree that this is one
of them: this race is going to be a temporary condition in any
production setting, and retrying would allow the migration to succeed.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] powerpc/rtas: Fix hang in race against concurrent cpu offline
  2019-06-25 17:29 ` Nathan Lynch
@ 2019-06-26 21:49   ` Juliet Kim
  2019-06-26 23:51     ` Nathan Lynch
  0 siblings, 1 reply; 5+ messages in thread
From: Juliet Kim @ 2019-06-26 21:49 UTC (permalink / raw)
  To: Nathan Lynch; +Cc: linuxppc-dev, mmc, mwb

On 6/25/19 12:29 PM, Nathan Lynch wrote:
> Juliet Kim <julietk@linux.vnet.ibm.com> writes:
>> The commit
>> (“powerpc/rtas: Fix a potential race between CPU-Offline & Migration)
>> attempted to fix a hang in Live Partition Mobility(LPM) by abandoning
>> the LPM attempt if a race between LPM and concurrent CPU offline was
>> detected.
>>
>> However, that fix failed to notify Hypervisor that the LPM attempted
>> had been abandoned which results in a system hang.
> It is surprising to me that leaving a migration unterminated would cause
> Linux to hang. Can you explain more about how that happens?
>
PHYP will block further requests(next partition migration, dlpar etc) while
it's in suspending state. That would have a follow-on effect on the HMC and
potentially this and other partitions.
>> Fix this by sending a signal PHYP to cancel the migration, so that PHYP
>> can stop waiting, and clean up the migration.
> This is well-spotted and rtas_ibm_suspend_me() needs to signal
> cancellation in several error paths. But I don't agree that this is one
> of them: this race is going to be a temporary condition in any
> production setting, and retrying would allow the migration to succeed.
If LPM and CPU offine requests conflict with one another, it might be better
to let them fail and let the customer decide which he prefers. IBM i cancels
migration if the other OS components/operations veto migration. It’s consistent
with other OS behavior for LPM. I think all the IBM products should have a
consistent customer experience. Even if the race can be temporary, it still
could happen and can cause livelock.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] powerpc/rtas: Fix hang in race against concurrent cpu offline
  2019-06-26 21:49   ` Juliet Kim
@ 2019-06-26 23:51     ` Nathan Lynch
  2019-06-28 20:03       ` Juliet Kim
  0 siblings, 1 reply; 5+ messages in thread
From: Nathan Lynch @ 2019-06-26 23:51 UTC (permalink / raw)
  To: Juliet Kim; +Cc: linuxppc-dev, mmc, mwb

Hi Juliet,

Juliet Kim <julietk@linux.vnet.ibm.com> writes:
> On 6/25/19 12:29 PM, Nathan Lynch wrote:
>> Juliet Kim <julietk@linux.vnet.ibm.com> writes:
>>>
>>> However, that fix failed to notify Hypervisor that the LPM attempted
>>> had been abandoned which results in a system hang.
>
>> It is surprising to me that leaving a migration unterminated would cause
>> Linux to hang. Can you explain more about how that happens?
>>
> PHYP will block further requests(next partition migration, dlpar etc) while
> it's in suspending state. That would have a follow-on effect on the HMC and
> potentially this and other partitions.

I can believe that operations on _this LPAR_ would be blocked by the
platform and/or management console while the migration remains
unterminated, but the OS should not be able to perpetrate a denial of
service on other partitions or the management console simply by botching
the LPM protocol. If it can, that's not Linux's bug to fix.


>>> Fix this by sending a signal PHYP to cancel the migration, so that PHYP
>>> can stop waiting, and clean up the migration.
>>
>> This is well-spotted and rtas_ibm_suspend_me() needs to signal
>> cancellation in several error paths. But I don't agree that this is one
>> of them: this race is going to be a temporary condition in any
>> production setting, and retrying would allow the migration to
>> succeed.
>
> If LPM and CPU offine requests conflict with one another, it might be better
> to let them fail and let the customer decide which he prefers.

Hmm I don't think so. When (if ever) this happens in production it would
be the result of an unlucky race with a power management daemon or
similar, not a conscious decision of the administrator in the moment.


> IBM i cancels migration if the other OS components/operations veto
> migration. It’s consistent with other OS behavior for LPM.

But this situation isn't really like that. If we were to have a real
veto mechanism, it would only make sense to have it run as early as
possible, before the platform has done a bunch of work. This benign,
recoverable race is occurring right before we complete the migration,
which at this point has been copying state to the destination for
minutes or hours. It doesn't make sense to error out like this.

As I mentioned earlier though, it does make sense to signal a
cancellation for these less-recoverable error conditions in
rtas_ibm_suspend_me():

- rtas_online_cpus_mask() failure
- alloc_cpumask_var() failure
- the atomic_read(&data.error) != 0 case after returning from the IPI


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] powerpc/rtas: Fix hang in race against concurrent cpu offline
  2019-06-26 23:51     ` Nathan Lynch
@ 2019-06-28 20:03       ` Juliet Kim
  0 siblings, 0 replies; 5+ messages in thread
From: Juliet Kim @ 2019-06-28 20:03 UTC (permalink / raw)
  To: Nathan Lynch; +Cc: linuxppc-dev, mmc, mwb


On 6/26/19 6:51 PM, Nathan Lynch wrote:
> Hi Juliet,
>
> Juliet Kim <julietk@linux.vnet.ibm.com> writes:
>> On 6/25/19 12:29 PM, Nathan Lynch wrote:
>>> Juliet Kim <julietk@linux.vnet.ibm.com> writes:
>>>> However, that fix failed to notify Hypervisor that the LPM attempted
>>>> had been abandoned which results in a system hang.
>>> It is surprising to me that leaving a migration unterminated would cause
>>> Linux to hang. Can you explain more about how that happens?
>>>
>> PHYP will block further requests(next partition migration, dlpar etc) while
>> it's in suspending state. That would have a follow-on effect on the HMC and
>> potentially this and other partitions.
> I can believe that operations on _this LPAR_ would be blocked by the
> platform and/or management console while the migration remains
> unterminated, but the OS should not be able to perpetrate a denial of
> service on other partitions or the management console simply by botching
> the LPM protocol. If it can, that's not Linux's bug to fix.
>
>
>>>> Fix this by sending a signal PHYP to cancel the migration, so that PHYP
>>>> can stop waiting, and clean up the migration.
>>> This is well-spotted and rtas_ibm_suspend_me() needs to signal
>>> cancellation in several error paths. But I don't agree that this is one
>>> of them: this race is going to be a temporary condition in any
>>> production setting, and retrying would allow the migration to
>>> succeed.
>> If LPM and CPU offine requests conflict with one another, it might be better
>> to let them fail and let the customer decide which he prefers.
> Hmm I don't think so. When (if ever) this happens in production it would
> be the result of an unlucky race with a power management daemon or
> similar, not a conscious decision of the administrator in the moment.
>
Guessing that a production race would only be against power mgmt is maybe
reasonable.  But we have an actual failure case where the race was against
an explicit offline request, and that's a legitimate/supported thing for
a customer to do.

>> IBM i cancels migration if the other OS components/operations veto
>> migration. It’s consistent with other OS behavior for LPM.
> But this situation isn't really like that. If we were to have a real
> veto mechanism, it would only make sense to have it run as early as
> possible, before the platform has done a bunch of work. This benign,
> recoverable race is occurring right before we complete the migration,
> which at this point has been copying state to the destination for
> minutes or hours. It doesn't make sense to error out like this.

Let me clarify that the cancellation is occurring in the phase preparing
for migration.It would be even better if it runs before LPM is allowed to make
a start. But that's what a long-term solution might look like.

> As I mentioned earlier though, it does make sense to signal a
> cancellation for these less-recoverable error conditions in
> rtas_ibm_suspend_me():
>
> - rtas_online_cpus_mask() failure
> - alloc_cpumask_var() failure
> - the atomic_read(&data.error) != 0 case after returning from the IPI

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2019-06-28 20:08 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-24 23:48 [PATCH] powerpc/rtas: Fix hang in race against concurrent cpu offline Juliet Kim
2019-06-25 17:29 ` Nathan Lynch
2019-06-26 21:49   ` Juliet Kim
2019-06-26 23:51     ` Nathan Lynch
2019-06-28 20:03       ` Juliet Kim

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).