From: "Michael S. Tsirkin" <mst@redhat.com>
To: "Dr. David Alan Gilbert (git)" <dgilbert@redhat.com>
Cc: qemu-devel@nongnu.org, maxime.coquelin@redhat.com,
	marcandre.lureau@redhat.com, peterx@redhat.com,
	imammedo@redhat.com, quintela@redhat.com, aarcange@redhat.com
Subject: Re: [Qemu-devel] [PATCH v3 15/29] vhost+postcopy: Send address back to qemu
Date: Tue, 27 Feb 2018 16:25:20 +0200
Message-ID: <20180227162211-mutt-send-email-mst@kernel.org>
In-Reply-To: <20180216131625.9639-16-dgilbert@redhat.com>

On Fri, Feb 16, 2018 at 01:16:11PM +0000, Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> 
> We need a better way, but at the moment we need the address of the
> mappings sent back to qemu so it can interpret the messages on the
> userfaultfd it reads.
> 
> This is done as a 3 stage set:
>    QEMU -> client
>       set_mem_table
> 
>    mmap stuff, get addresses
> 
>    client -> qemu
>        here are the addresses
> 
>    qemu -> client
>        OK - now you can use them
> 
> That ensures that qemu has registered the new addresses in its
> userfault code before the client starts accessing them.
> 
> Note: We don't ask for the default 'ack' reply since we've got our own.
> 
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
>  contrib/libvhost-user/libvhost-user.c | 24 ++++++++++++-
>  docs/interop/vhost-user.txt           |  9 +++++
>  hw/virtio/trace-events                |  1 +
>  hw/virtio/vhost-user.c                | 67 +++++++++++++++++++++++++++++++++--
>  4 files changed, 98 insertions(+), 3 deletions(-)
> 
> diff --git a/contrib/libvhost-user/libvhost-user.c b/contrib/libvhost-user/libvhost-user.c
> index a18bc74a7c..e02e5d6f46 100644
> --- a/contrib/libvhost-user/libvhost-user.c
> +++ b/contrib/libvhost-user/libvhost-user.c
> @@ -491,10 +491,32 @@ vu_set_mem_table_exec_postcopy(VuDev *dev, VhostUserMsg *vmsg)
>                     dev_region->mmap_addr);
>          }
>  
> +        /* Return the address to QEMU so that it can translate the ufd
> +         * fault addresses back.
> +         */
> +        msg_region->userspace_addr = (uintptr_t)(mmap_addr +
> +                                                 dev_region->mmap_offset);
>          close(vmsg->fds[i]);
>      }
>  
> -    /* TODO: Get address back to QEMU */
> +    /* Send the message back to qemu with the addresses filled in */
> +    vmsg->fd_num = 0;
> +    if (!vu_message_write(dev, dev->sock, vmsg)) {
> +        vu_panic(dev, "failed to respond to set-mem-table for postcopy");
> +        return false;
> +    }
> +
> +    /* Wait for QEMU to confirm that it's registered the handler for the
> +     * faults.
> +     */
> +    if (!vu_message_read(dev, dev->sock, vmsg) ||
> +        vmsg->size != sizeof(vmsg->payload.u64) ||
> +        vmsg->payload.u64 != 0) {
> +        vu_panic(dev, "failed to receive valid ack for postcopy set-mem-table");
> +        return false;
> +    }
> +
> +    /* OK, now we can go and register the memory and generate faults */
>      for (i = 0; i < dev->nregions; i++) {
>          VuDevRegion *dev_region = &dev->regions[i];
>  #ifdef UFFDIO_REGISTER
> diff --git a/docs/interop/vhost-user.txt b/docs/interop/vhost-user.txt
> index bdec9ec0e8..5bbcab2cc4 100644
> --- a/docs/interop/vhost-user.txt
> +++ b/docs/interop/vhost-user.txt
> @@ -454,12 +454,21 @@ Master message types
>        Id: 5
>        Equivalent ioctl: VHOST_SET_MEM_TABLE
>        Master payload: memory regions description
> +      Slave payload: (postcopy only) memory regions description
>  
>        Sets the memory map regions on the slave so it can translate the vring
>        addresses. In the ancillary data there is an array of file descriptors
>        for each memory mapped region. The size and ordering of the fds matches
>        the number and ordering of memory regions.
>  
> +      When postcopy-listening has been received,

Which message is this?

> SET_MEM_TABLE replies with
> +      the bases of the memory mapped regions to the master.  It must have mmap'd
> +      the regions but not yet accessed them and should not yet generate a userfault
> +      event. Note NEED_REPLY_MASK is not set in this case.
> +      QEMU will then reply back to the list of mappings with an empty
> +      VHOST_USER_SET_MEM_TABLE as an acknowledgment; only upon reception of this
> +      message may the guest start accessing the memory and generating faults.
> +
>   * VHOST_USER_SET_LOG_BASE
>  
>        Id: 6

As you say yourself, this is probably the best we can do for now,
but it's not ideal. So I think it's a good idea to isolate this
behind a separate protocol feature bit. For now it will be required
for postcopy; once it's fixed in the kernel we can drop it
cleanly.
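
To make the suggestion concrete, here is a minimal sketch of what
gating the address-reply step on such a bit could look like. The bit
name and number (EXAMPLE_PROTOCOL_F_MEM_TABLE_ADDR_REPLY, 63) and both
helpers are hypothetical, picked only for illustration; they are not
part of the vhost-user spec or of this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bit, for illustration only; not an allocated value. */
#define EXAMPLE_PROTOCOL_F_MEM_TABLE_ADDR_REPLY 63

/* Test a single bit in a 64-bit protocol feature mask. */
static bool has_protocol_feature(uint64_t features, unsigned int bit)
{
    assert(bit < 64); /* shifting a 64-bit value by >= 64 is undefined */
    return (features & (UINT64_C(1) << bit)) != 0;
}

/*
 * Use the "send addresses back, then ack" variant of SET_MEM_TABLE only
 * when postcopy is listening AND the slave negotiated the dedicated bit.
 */
static bool use_mem_table_addr_reply(uint64_t protocol_features,
                                     bool postcopy_listen)
{
    return postcopy_listen &&
           has_protocol_feature(protocol_features,
                                EXAMPLE_PROTOCOL_F_MEM_TABLE_ADDR_REPLY);
}
```

With a check like this the workaround stays opt-in, so a slave that
never negotiates the bit keeps the existing SET_MEM_TABLE behaviour.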


> diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
> index 06ec03d6e7..05d18ada77 100644
> --- a/hw/virtio/trace-events
> +++ b/hw/virtio/trace-events
> @@ -8,6 +8,7 @@ vhost_section(const char *name, int r) "%s:%d"
>  
>  # hw/virtio/vhost-user.c
>  vhost_user_postcopy_listen(void) ""
> +vhost_user_set_mem_table_postcopy(uint64_t client_addr, uint64_t qhva, int reply_i, int region_i) "client:0x%"PRIx64" for hva: 0x%"PRIx64" reply %d region %d"
>  
>  # hw/virtio/virtio.c
>  virtqueue_alloc_element(void *elem, size_t sz, unsigned in_num, unsigned out_num) "elem %p size %zd in_num %u out_num %u"
> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> index 64f4b3b3f9..a060442cb9 100644
> --- a/hw/virtio/vhost-user.c
> +++ b/hw/virtio/vhost-user.c
> @@ -159,6 +159,7 @@ struct vhost_user {
>      int slave_fd;
>      NotifierWithReturn postcopy_notifier;
>      struct PostCopyFD  postcopy_fd;
> +    uint64_t           postcopy_client_bases[VHOST_MEMORY_MAX_NREGIONS];
>      /* True once we've entered postcopy_listen */
>      bool               postcopy_listen;
>  };
> @@ -328,12 +329,15 @@ static int vhost_user_set_log_base(struct vhost_dev *dev, uint64_t base,
>  static int vhost_user_set_mem_table_postcopy(struct vhost_dev *dev,
>                                               struct vhost_memory *mem)
>  {
> +    struct vhost_user *u = dev->opaque;
>      int fds[VHOST_MEMORY_MAX_NREGIONS];
>      int i, fd;
>      size_t fd_num = 0;
>      bool reply_supported = virtio_has_feature(dev->protocol_features,
>                                                VHOST_USER_PROTOCOL_F_REPLY_ACK);
> -    /* TODO: Add actual postcopy differences */
> +    VhostUserMsg msg_reply;
> +    int region_i, msg_i;
> +
>      VhostUserMsg msg = {
>          .hdr.request = VHOST_USER_SET_MEM_TABLE,
>          .hdr.flags = VHOST_USER_VERSION,
> @@ -380,6 +384,64 @@ static int vhost_user_set_mem_table_postcopy(struct vhost_dev *dev,
>          return -1;
>      }
>  
> +    if (vhost_user_read(dev, &msg_reply) < 0) {
> +        return -1;
> +    }
> +
> +    if (msg_reply.hdr.request != VHOST_USER_SET_MEM_TABLE) {
> +        error_report("%s: Received unexpected msg type. "
> +                     "Expected %d received %d", __func__,
> +                     VHOST_USER_SET_MEM_TABLE, msg_reply.hdr.request);
> +        return -1;
> +    }
> +    /* We're using the same structure, just reusing one of the
> +     * fields, so it should be the same size.
> +     */
> +    if (msg_reply.hdr.size != msg.hdr.size) {
> +        error_report("%s: Unexpected size for postcopy reply "
> +                     "%d vs %d", __func__, msg_reply.hdr.size, msg.hdr.size);
> +        return -1;
> +    }
> +
> +    memset(u->postcopy_client_bases, 0,
> +           sizeof(uint64_t) * VHOST_MEMORY_MAX_NREGIONS);
> +
> +    /* They're in the same order as the regions that were sent
> +     * but some of the regions were skipped (above) if they
> +     * didn't have fd's
> +    */
> +    for (msg_i = 0, region_i = 0;
> +         region_i < dev->mem->nregions;
> +        region_i++) {
> +        if (msg_i < fd_num &&
> +            msg_reply.payload.memory.regions[msg_i].guest_phys_addr ==
> +            dev->mem->regions[region_i].guest_phys_addr) {
> +            u->postcopy_client_bases[region_i] =
> +                msg_reply.payload.memory.regions[msg_i].userspace_addr;
> +            trace_vhost_user_set_mem_table_postcopy(
> +                msg_reply.payload.memory.regions[msg_i].userspace_addr,
> +                msg.payload.memory.regions[msg_i].userspace_addr,
> +                msg_i, region_i);
> +            msg_i++;
> +        }
> +    }
> +    if (msg_i != fd_num) {
> +        error_report("%s: postcopy reply not fully consumed "
> +                     "%d vs %zd",
> +                     __func__, msg_i, fd_num);
> +        return -1;
> +    }
> +    /* Now we've registered this with the postcopy code, we ack to the client,
> +     * because now we're in the position to be able to deal with any faults
> +     * it generates.
> +     */
> +    /* TODO: Use this for failure cases as well with a bad value */
> +    msg.hdr.size = sizeof(msg.payload.u64);
> +    msg.payload.u64 = 0; /* OK */
> +    if (vhost_user_write(dev, &msg, NULL, 0) < 0) {
> +        return -1;
> +    }
> +
>      if (reply_supported) {
>          return process_message_reply(dev, &msg);
>      }
> @@ -396,7 +458,8 @@ static int vhost_user_set_mem_table(struct vhost_dev *dev,
>      size_t fd_num = 0;
>      bool do_postcopy = u->postcopy_listen && u->postcopy_fd.handler;
>      bool reply_supported = virtio_has_feature(dev->protocol_features,
> -                                              VHOST_USER_PROTOCOL_F_REPLY_ACK);
> +                                          VHOST_USER_PROTOCOL_F_REPLY_ACK) &&
> +                                          !do_postcopy;
>  
>      if (do_postcopy) {
>          /* Postcopy has enough differences that it's best done in it's own
> -- 
> 2.14.3
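
For reference, the three-stage ordering from the commit message can be
modelled as a small state machine. This is only a sketch of the
ordering constraint, not code from the patch: the state and event
names and both functions are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical states modelling the handshake order in the commit message. */
enum handshake_state {
    HS_IDLE,           /* nothing sent yet */
    HS_TABLE_SENT,     /* qemu -> client: set_mem_table */
    HS_ADDRS_RECEIVED, /* client -> qemu: here are the addresses */
    HS_ACKED           /* qemu -> client: OK, now you can use them */
};

enum handshake_event {
    EV_SEND_TABLE,
    EV_RECV_ADDRS,
    EV_SEND_ACK
};

/*
 * Advance the handshake by one event. Returns false and leaves the
 * state unchanged when the event arrives out of order.
 */
static bool handshake_step(enum handshake_state *state,
                           enum handshake_event ev)
{
    assert(state != NULL);

    switch (*state) {
    case HS_IDLE:
        if (ev == EV_SEND_TABLE) {
            *state = HS_TABLE_SENT;
            return true;
        }
        break;
    case HS_TABLE_SENT:
        if (ev == EV_RECV_ADDRS) {
            *state = HS_ADDRS_RECEIVED;
            return true;
        }
        break;
    case HS_ADDRS_RECEIVED:
        if (ev == EV_SEND_ACK) {
            *state = HS_ACKED;
            return true;
        }
        break;
    case HS_ACKED:
        break;
    }
    return false;
}

/* The client may touch the mapped regions only after the final ack. */
static bool client_may_access_memory(enum handshake_state state)
{
    return state == HS_ACKED;
}
```

The point of the ordering is that the ack is refused until the
addresses have been received, which is what lets qemu register them
with its userfault code before the client can fault.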
