From: Nathan Lynch <nathanl@linux.ibm.com>
To: Laurent Dufour <ldufour@linux.ibm.com>,
	mpe@ellerman.id.au, benh@kernel.crashing.org, paulus@samba.org,
	haren@linux.vnet.ibm.com, npiggin@gmail.com
Cc: linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 1/4] powerpc/mobility: Wait for memory transfer to complete
Date: Tue, 21 Jun 2022 11:52:14 -0500
Message-ID: <87y1xqm01t.fsf@linux.ibm.com>
In-Reply-To: <20220614135414.37746-2-ldufour@linux.ibm.com>

Laurent Dufour <ldufour@linux.ibm.com> writes:

> In pseries_migration_partition(), loop until the memory transfer is
> complete. This way the calling drmgr process will not exit earlier,
> allowing callbacks to be run only once the migration is fully completed.
>
> If reading the VASI state is done after the hypervisor has completed the
> migration, the HCALL is returning H_PARAMETER. We can safely assume that
> the memory transfer is achieved if this happens.
>
> This will also allow to manage the NMI watchdog state in the next commits.
>
> Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
> ---
>  arch/powerpc/platforms/pseries/mobility.c | 42 +++++++++++++++++++++--
>  1 file changed, 40 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/platforms/pseries/mobility.c b/arch/powerpc/platforms/pseries/mobility.c
> index 78f3f74c7056..179bbd4ae881 100644
> --- a/arch/powerpc/platforms/pseries/mobility.c
> +++ b/arch/powerpc/platforms/pseries/mobility.c
> @@ -427,6 +427,43 @@ static int wait_for_vasi_session_suspending(u64 handle)
>  	return ret;
>  }
>  
> +static void wait_for_vasi_session_completed(u64 handle)
> +{
> +	unsigned long state = 0;
> +	int ret;
> +
> +	pr_info("waiting for memory transfert to complete...\n");
> +	/*
> +	 * Wait for transition from H_VASI_RESUMED to
> +	 * H_VASI_COMPLETED. Treat anything else as an error.

"Treat anything else as an error" does not match the code since there is
a special case for when the stream handle has expired. So that should be
dropped from this comment.

> +	 */
> +	while (true) {
> +		ret = poll_vasi_state(handle, &state);
> +
> +		/*
> +		 * If the memory transfer is already complete and the migration
> +		 * has been cleaned up by the hypervisor, H_PARAMETER is return,
> +		 * which is translate in EINVAL by poll_vasi_state().
> +		 */
> +		if (ret == -EINVAL || (!ret && state == H_VASI_COMPLETED)) {
> +			pr_info("memory transfert completed.\n");
> +			break;
> +		}
> +
> +		if (ret) {
> +			pr_err("H_VASI_STATE return error (%d)\n", ret);
> +			break;
> +		}
> +
> +		if (state != H_VASI_RESUMED) {
> +			pr_err("unexpected H_VASI_STATE result %lu\n", state);
> +			break;
> +		}
> +
> +		msleep(500);
> +	}
> +}
> +
>  static void prod_single(unsigned int target_cpu)
>  {
>  	long hvrc;
> @@ -673,9 +710,10 @@ static int pseries_migrate_partition(u64 handle)
>  	vas_migration_handler(VAS_SUSPEND);
>  
>  	ret = pseries_suspend(handle);
> -	if (ret == 0)
> +	if (ret == 0) {
>  		post_mobility_fixup();
> -	else
> +		wait_for_vasi_session_completed(handle);
> +	} else
>  		pseries_cancel_migration(handle, ret);
>  
>  	vas_migration_handler(VAS_RESUME);

While this may noticeably lengthen the time it takes for drmgr to return
from the system call, it seems like the right thing to do. The migration
should not be considered complete until the VASI stream poll yields a
"Complete" status. And we still need to add code to send gratuitous ARPs
through ibmveth interfaces while waiting for the transition, which would
likely build on this.

I believe the HMC and associated software can cope with the drmgr
command taking a longer time to return in cases where the partition
memory needs a while to completely sync to the destination.

Apart from the small critique on the comment in
wait_for_vasi_session_completed(), this looks fine to me.
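
To make the comment critique concrete, here is a minimal userspace sketch
of the decision the loop body makes on each poll, with a comment that
matches the code. It is not the kernel implementation: classify_poll(),
enum poll_verdict and the VASI_* values are hypothetical stand-ins for
illustration, and only the branch order is taken from the patch.

```c
#include <errno.h>

/* Illustrative stand-ins; the real H_VASI_* constants live in the kernel. */
enum { VASI_RESUMED = 5, VASI_COMPLETED = 6 };

/* Hypothetical verdicts for one poll of the VASI stream state. */
enum poll_verdict { POLL_DONE, POLL_AGAIN, POLL_ERROR };

/*
 * Wait for transition from RESUMED to COMPLETED. A poll failing with
 * -EINVAL also counts as completion: the hypervisor has already cleaned
 * up the migration, so the stream handle is no longer valid.
 */
static enum poll_verdict classify_poll(int ret, unsigned long state)
{
	/* Completed, or handle expired after completion. */
	if (ret == -EINVAL || (!ret && state == VASI_COMPLETED))
		return POLL_DONE;

	/* Any other polling failure ends the wait. */
	if (ret)
		return POLL_ERROR;

	/* Only RESUMED is expected while the memory transfer is in flight. */
	if (state != VASI_RESUMED)
		return POLL_ERROR;

	/* Still transferring: the caller sleeps and polls again. */
	return POLL_AGAIN;
}
```

A caller would loop while the verdict is POLL_AGAIN, sleeping between
polls as the patch does with msleep(500).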

Reviewed-by: Nathan Lynch <nathanl@linux.ibm.com>

