From: Jason Gunthorpe <jgg@nvidia.com>
To: Avihai Horon <avihaih@nvidia.com>, Juan Quintela <quintela@redhat.com>
Cc: qemu-devel@nongnu.org, Cornelia Huck <cohuck@redhat.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	"Dr . David Alan Gilbert" <dgilbert@redhat.com>,
	Joao Martins <joao.m.martins@oracle.com>,
	Yishai Hadas <yishaih@nvidia.com>, Mark Bloch <mbloch@nvidia.com>,
	Maor Gottlieb <maorg@nvidia.com>,
	Kirti Wankhede <kwankhede@nvidia.com>,
	Tarun Gupta <targupta@nvidia.com>
Subject: Re: [PATCH v2 07/11] vfio/migration: Implement VFIO migration protocol v2
Date: Mon, 18 Jul 2022 12:12:19 -0300
Message-ID: <20220718151219.GA60193@nvidia.com>
In-Reply-To: <20220530170739.19072-8-avihaih@nvidia.com>

On Mon, May 30, 2022 at 08:07:35PM +0300, Avihai Horon wrote:

> +/* Returns 1 if end-of-stream is reached, 0 if more data and -1 if error */
> +static int vfio_save_block(QEMUFile *f, VFIOMigration *migration)
> +{
> +    ssize_t data_size;
> +
> +    data_size = read(migration->data_fd, migration->data_buffer,
> +                     migration->data_buffer_size);
> +    if (data_size < 0) {
> +        return -1;
> +    }
> +    if (data_size == 0) {
> +        return 1;
> +    }
> +
> +    qemu_put_be64(f, VFIO_MIG_FLAG_DEV_DATA_STATE);
> +    qemu_put_be64(f, data_size);
> +    qemu_put_buffer_async(f, migration->data_buffer, data_size, false);
> +    qemu_fflush(f);
> +    bytes_transferred += data_size;
> +
> +    trace_vfio_save_block(migration->vbasedev->name, data_size);
> +
> +    return qemu_file_get_error(f);
> +}

We looked at this with an eye to "how much data is transferred" per
callback.

The above function is the basic data mover, and
'migration->data_buffer_size' is set to 1MB at the moment.

So, we produce VFIO_MIG_FLAG_DEV_DATA_STATE sections of up to 1MB each.
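
For reference, the receiver just sees flag/size/payload triples on the
stream. A rough load-side sketch, purely illustrative and not the code
in this series, with 'dest_fd' standing in for the destination device's
migration data_fd and error handling mostly elided:

static int vfio_load_dev_data_section(QEMUFile *f, int dest_fd,
                                      uint8_t *buf, size_t buf_size)
{
    /* Called after VFIO_MIG_FLAG_DEV_DATA_STATE has been read */
    uint64_t data_size = qemu_get_be64(f);

    while (data_size) {
        size_t len = MIN(data_size, buf_size);

        /* Copy the payload from the stream into the device's data_fd */
        qemu_get_buffer(f, buf, len);
        if (write(dest_fd, buf, len) != (ssize_t)len) {
            return -1;
        }
        data_size -= len;
    }
    return 0;
}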

This series does not include pre-copy support, but when it is added it
will include a pre-copy 'save_live_iterate' function along these lines:

static int vfio_save_iterate(QEMUFile *f, void *opaque)
{
    VFIODevice *vbasedev = opaque;
    VFIOMigration *migration = vbasedev->migration;
    int ret;

    ret = vfio_save_block(f, migration);
    if (ret < 0) {
        return ret;
    }
    if (ret == 1) {
        return 1;
    }
    qemu_put_be64(f, VFIO_MIG_FLAG_END_OF_STATE);
    return 0;
}

Thus, during precopy this will never do more than 1MB per callback.

> +static int vfio_save_complete_precopy(QEMUFile *f, void *opaque)
> +{
> +    VFIODevice *vbasedev = opaque;
> +    enum vfio_device_mig_state recover_state;
> +    int ret;
> +
> +    /* We reach here with device state STOP or STOP_COPY only */
> +    recover_state = VFIO_DEVICE_STATE_STOP;
> +    ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_STOP_COPY,
> +                                   recover_state);
> +    if (ret) {
> +        return ret;
> +    }
> +
> +    do {
> +        ret = vfio_save_block(f, vbasedev->migration);
> +        if (ret < 0) {
> +            return ret;
> +        }
> +    } while (!ret);

This seems to be the main problem: we chain together 1MB blocks until
all of the device's STOP_COPY data has been transferred. The above is
hooked to 'save_live_complete_precopy'.

So, if we want to break the above up into some 'save_iterate'-like
function, do you have any advice on how to do it? The above do/while
must happen after the transition to VFIO_DEVICE_STATE_STOP_COPY.
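
To make the shape of what I mean concrete, here is a purely
illustrative sketch (not something this series does) of capping the
per-invocation work, reusing vfio_save_block() from above; the open
question is still where the migration core would call it from, since
it may only run after the STOP_COPY transition:

static int vfio_save_stop_copy_chunk(QEMUFile *f, VFIOMigration *migration,
                                     size_t limit)
{
    size_t sent = 0;
    int ret;

    while (sent < limit) {
        /* 0 = more data, 1 = end-of-stream, <0 = error */
        ret = vfio_save_block(f, migration);
        if (ret != 0) {
            return ret;
        }
        /* Each block is bounded by data_buffer_size (1MB today) */
        sent += migration->data_buffer_size;
    }
    return 0; /* hit the limit, more data remains */
}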

For mlx5 the above loop will often move ~10MB for small VMs and ~100MB
for big VMs (big meaning VMs making extensive use of RDMA
functionality), and this will not change whether pre-copy is supported
or not.

Is it still a problem?

For other devices, like a GPU, I would imagine pre-copy support will be
implemented and the remaining post-precopy residual will be smaller.

Jason

