linuxppc-dev.lists.ozlabs.org archive mirror
* [PATCH] powerpc/pseries: Move vas_migration_handler early during migration
@ 2022-09-22  8:27 Haren Myneni
  2022-09-22 12:14 ` Nathan Lynch
  2022-10-04 13:25 ` Michael Ellerman
  0 siblings, 2 replies; 5+ messages in thread
From: Haren Myneni @ 2022-09-22  8:27 UTC (permalink / raw)
  To: mpe, npiggin, nathanl, linuxppc-dev


When the migration is initiated, the hypervisor changes VAS
mappings as part of pre-migration event. Then the OS gets the
migration event which closes all VAS windows before the migration
starts. NX generates continuous faults until windows are closed
and the user space can not differentiate these NX faults coming
from the actual migration. So to reduce this time window, close
VAS windows first in pseries_migrate_partition().

Signed-off-by: Haren Myneni <haren@linux.ibm.com>
---
 arch/powerpc/platforms/pseries/mobility.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/mobility.c b/arch/powerpc/platforms/pseries/mobility.c
index 3d36a8955eaf..884595b7c51f 100644
--- a/arch/powerpc/platforms/pseries/mobility.c
+++ b/arch/powerpc/platforms/pseries/mobility.c
@@ -740,11 +740,19 @@ static int pseries_migrate_partition(u64 handle)
 #ifdef CONFIG_PPC_WATCHDOG
 	factor = nmi_wd_lpm_factor;
 #endif
+	/*
+	 * When the migration is initiated, the hypervisor changes VAS
+	 * mappings to prepare before OS gets the notification and
+	 * closes all VAS windows. NX generates continuous faults during
+	 * this time and the user space can not differentiate these
+	 * faults from the migration event. So reduce this time window
+	 * by closing VAS windows at the beginning of this function.
+	 */
+	vas_migration_handler(VAS_SUSPEND);
+
 	ret = wait_for_vasi_session_suspending(handle);
 	if (ret)
-		return ret;
-
-	vas_migration_handler(VAS_SUSPEND);
+		goto out;
 
 	if (factor)
 		watchdog_nmi_set_timeout_pct(factor);
@@ -765,6 +773,7 @@ static int pseries_migrate_partition(u64 handle)
 	if (factor)
 		watchdog_nmi_set_timeout_pct(0);
 
+out:
 	vas_migration_handler(VAS_RESUME);
 
 	return ret;
-- 
2.26.3




* Re: [PATCH] powerpc/pseries: Move vas_migration_handler early during migration
  2022-09-22  8:27 [PATCH] powerpc/pseries: Move vas_migration_handler early during migration Haren Myneni
@ 2022-09-22 12:14 ` Nathan Lynch
  2022-09-23  8:37   ` Haren Myneni
  2022-10-04 13:25 ` Michael Ellerman
  1 sibling, 1 reply; 5+ messages in thread
From: Nathan Lynch @ 2022-09-22 12:14 UTC (permalink / raw)
  To: Haren Myneni, mpe, npiggin, linuxppc-dev

Haren Myneni <haren@linux.ibm.com> writes:
> When the migration is initiated, the hypervisor changes VAS
> mappings as part of pre-migration event. Then the OS gets the
> migration event which closes all VAS windows before the migration
> starts. NX generates continuous faults until windows are closed
> and the user space can not differentiate these NX faults coming
> from the actual migration. So to reduce this time window, close
> VAS windows first in pseries_migrate_partition().

I'm concerned that this is only narrowing a window of time where
undesirable faults occur, and that it may not be sufficient for all
configurations. Migrations can be in progress for minutes or hours,
while the time that we wait for the VASI state transition is usually
seconds or minutes. So I worry that this works around a problem in
limited cases but doesn't cover them all.

Maybe I don't understand the problem well enough. How does user space
respond to the NX faults?


* Re: [PATCH] powerpc/pseries: Move vas_migration_handler early during migration
  2022-09-22 12:14 ` Nathan Lynch
@ 2022-09-23  8:37   ` Haren Myneni
  2022-09-24  0:11     ` Nathan Lynch
  0 siblings, 1 reply; 5+ messages in thread
From: Haren Myneni @ 2022-09-23  8:37 UTC (permalink / raw)
  To: Nathan Lynch, mpe, npiggin, linuxppc-dev

On Thu, 2022-09-22 at 07:14 -0500, Nathan Lynch wrote:
> Haren Myneni <haren@linux.ibm.com> writes:
> > When the migration is initiated, the hypervisor changes VAS
> > mappings as part of pre-migration event. Then the OS gets the
> > migration event which closes all VAS windows before the migration
> > starts. NX generates continuous faults until windows are closed
> > and the user space can not differentiate these NX faults coming
> > from the actual migration. So to reduce this time window, close
> > VAS windows first in pseries_migrate_partition().
> 
> I'm concerned that this is only narrowing a window of time where
> undesirable faults occur, and that it may not be sufficient for all
> configurations. Migrations can be in progress for minutes or hours,
> while the time that we wait for the VASI state transition is usually
> seconds or minutes. So I worry that this works around a problem in
> limited cases but doesn't cover them all.
> 
> Maybe I don't understand the problem well enough. How does user space
> respond to the NX faults?

User space resends the request to NX whenever a request is
returned with an NX fault, so the handling is the same even for
faults caused by the pre-migration event.

By contrast, once the window is closed (the paste address is
unmapped), the paste itself fails, and user space can treat that as
NX busy. It is then up to user space whether to resend the request
after some delay, or to fall back to SW compression and send the
request again later.
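
The user-space handling described above can be sketched as a small
decision function. This is only an illustration: the enum names, the
function nx_next_action() and the max_faults cap are hypothetical and
are not taken from any real library. It also picks the SW fallback when
the paste fails; retrying after a delay would be equally valid.

```c
/* Hypothetical outcome of one request submission; not a real library type. */
enum nx_status {
	NX_OK,           /* request completed on the accelerator */
	NX_FAULT,        /* request came back with an NX fault */
	NX_PASTE_FAILED, /* paste failed (window closed): treat as NX busy */
};

/* What the caller should do next. */
enum nx_action {
	NX_ACT_DONE,        /* nothing more to do */
	NX_ACT_RESEND,      /* resend the same request to NX */
	NX_ACT_FALLBACK_SW, /* compress in software, try NX again later */
};

/*
 * Decide the next step after one submission attempt.
 * faults_so_far counts the NX faults seen for this request, including
 * the current one. max_faults is an assumed cap, not something the
 * thread specifies.
 */
enum nx_action nx_next_action(enum nx_status st, unsigned int faults_so_far,
			      unsigned int max_faults)
{
	switch (st) {
	case NX_OK:
		return NX_ACT_DONE;
	case NX_FAULT:
		/* Resend on a fault, but give up once the assumed cap is hit. */
		return faults_so_far < max_faults ? NX_ACT_RESEND
						  : NX_ACT_FALLBACK_SW;
	case NX_PASTE_FAILED:
		/* Window closed: choose the SW fallback option here. */
		return NX_ACT_FALLBACK_SW;
	}
	return NX_ACT_FALLBACK_SW; /* defensive default for unknown status */
}
```

With the windows closed early, a client following this kind of policy
hits the paste-failure path right away instead of looping on faults.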

For the migration, the hypervisor is notified of the pre-migration
event first, and then the OS receives the migration event (SUSPEND).
This patch therefore closes the windows early, before waiting on the
VASI state, which removes the NX fault handling during the time taken
for the VASI state transition.

Thanks
Haren



* Re: [PATCH] powerpc/pseries: Move vas_migration_handler early during migration
  2022-09-23  8:37   ` Haren Myneni
@ 2022-09-24  0:11     ` Nathan Lynch
  0 siblings, 0 replies; 5+ messages in thread
From: Nathan Lynch @ 2022-09-24  0:11 UTC (permalink / raw)
  To: Haren Myneni; +Cc: linuxppc-dev, npiggin

Haren Myneni <haren@linux.ibm.com> writes:
> On Thu, 2022-09-22 at 07:14 -0500, Nathan Lynch wrote:
>> Haren Myneni <haren@linux.ibm.com> writes:
>> > When the migration is initiated, the hypervisor changes VAS
>> > mappings as part of pre-migration event. Then the OS gets the
>> > migration event which closes all VAS windows before the migration
>> > starts. NX generates continuous faults until windows are closed
>> > and the user space can not differentiate these NX faults coming
>> > from the actual migration. So to reduce this time window, close
>> > VAS windows first in pseries_migrate_partition().
>> 
>> I'm concerned that this is only narrowing a window of time where
>> undesirable faults occur, and that it may not be sufficient for all
>> configurations. Migrations can be in progress for minutes or hours,
>> while the time that we wait for the VASI state transition is usually
>> seconds or minutes. So I worry that this works around a problem in
>> limited cases but doesn't cover them all.
>> 
>> Maybe I don't understand the problem well enough. How does user space
>> respond to the NX faults?
>
> User space resends the request to NX whenever a request is
> returned with an NX fault, so the handling is the same even for
> faults caused by the pre-migration event.
>
> By contrast, once the window is closed (the paste address is
> unmapped), the paste itself fails, and user space can treat that as
> NX busy. It is then up to user space whether to resend the request
> after some delay, or to fall back to SW compression and send the
> request again later.
>
> For the migration, the hypervisor is notified of the pre-migration
> event first, and then the OS receives the migration event (SUSPEND).
> This patch therefore closes the windows early, before waiting on the
> VASI state, which removes the NX fault handling during the time taken
> for the VASI state transition.

OK, so we can consider this a quality of implementation improvement that
allows better behavior and fewer wasted retries for NX clients in a
migration scenario, but there's not a correctness issue, really. With
that clarified, I've confirmed that the slightly altered control flow
and error handling in pseries_migrate_partition() look correct after
your change.
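
For readers following along, the altered control flow can be modelled
outside the kernel. The sketch below is a simplified stand-in, not the
kernel code: struct lpm_trace and model_migrate() are hypothetical, and
the counters merely mark where vas_migration_handler(VAS_SUSPEND),
wait_for_vasi_session_suspending() and vas_migration_handler(VAS_RESUME)
would run.

```c
/* Hypothetical trace of which steps ran; for illustration only. */
struct lpm_trace {
	int suspended; /* stands in for vas_migration_handler(VAS_SUSPEND) */
	int migrated;  /* stands in for the work done after the VASI wait */
	int resumed;   /* stands in for vas_migration_handler(VAS_RESUME) */
};

/*
 * Model of the patched flow: windows are closed first, and every exit
 * path, including a failed VASI wait, goes through the resume step.
 * wait_ret plays the role of wait_for_vasi_session_suspending()'s result.
 */
int model_migrate(int wait_ret, struct lpm_trace *t)
{
	int ret;

	t->suspended++;  /* close VAS windows before waiting */

	ret = wait_ret;
	if (ret)
		goto out; /* early failure must still reopen windows */

	t->migrated++;   /* remaining migration work */

out:
	t->resumed++;    /* always paired with the suspend above */
	return ret;
}
```

The point of the out label is that a suspend issued before the wait is
always matched by a resume, even when the wait fails.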

Reviewed-by: Nathan Lynch <nathanl@linux.ibm.com>


* Re: [PATCH] powerpc/pseries: Move vas_migration_handler early during migration
  2022-09-22  8:27 [PATCH] powerpc/pseries: Move vas_migration_handler early during migration Haren Myneni
  2022-09-22 12:14 ` Nathan Lynch
@ 2022-10-04 13:25 ` Michael Ellerman
  1 sibling, 0 replies; 5+ messages in thread
From: Michael Ellerman @ 2022-10-04 13:25 UTC (permalink / raw)
  To: nathanl, Haren Myneni, linuxppc-dev, npiggin, mpe

On Thu, 22 Sep 2022 01:27:07 -0700, Haren Myneni wrote:
> When the migration is initiated, the hypervisor changes VAS
> mappings as part of pre-migration event. Then the OS gets the
> migration event which closes all VAS windows before the migration
> starts. NX generates continuous faults until windows are closed
> and the user space can not differentiate these NX faults coming
> from the actual migration. So to reduce this time window, close
> VAS windows first in pseries_migrate_partition().
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/pseries: Move vas_migration_handler early during migration
      https://git.kernel.org/powerpc/c/465dda9d320d1cb9424f1015b0520ec4c4f0d279

cheers

