From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:49570)
 by lists.gnu.org with esmtp (Exim 4.71) id 1f9CQn-0000E3-JM
 for qemu-devel@nongnu.org; Thu, 19 Apr 2018 12:29:11 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 id 1f9CQj-0006z2-Bp
 for qemu-devel@nongnu.org; Thu, 19 Apr 2018 12:29:09 -0400
Received: from mga11.intel.com ([192.55.52.93]:9325)
 by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71)
 id 1f9CQi-0006yR-U6
 for qemu-devel@nongnu.org; Thu, 19 Apr 2018 12:29:05 -0400
From: "Liang, Cunming"
Date: Thu, 19 Apr 2018 16:27:44 +0000
References: <20180412151232.17506-1-tiwei.bie@intel.com>
 <20180412151232.17506-7-tiwei.bie@intel.com>
 <20180418192154-mutt-send-email-mst@kernel.org>
 <20180419111439.i6gfhnept6wy7uzp@debian>
 <20180419182035-mutt-send-email-mst@kernel.org>
In-Reply-To: <20180419182035-mutt-send-email-mst@kernel.org>
Subject: Re: [Qemu-devel] [PATCH v3 6/6] vhost-user: support registering external host notifiers
To: "Michael S. Tsirkin"
Cc: "Bie, Tiwei", "jasowang@redhat.com", "alex.williamson@redhat.com",
 "pbonzini@redhat.com", "stefanha@redhat.com", "qemu-devel@nongnu.org",
 "virtio-dev@lists.oasis-open.org", "Daly, Dan", "Tan, Jianfeng",
 "Wang, Zhihong", "Wang, Xiao W"

> -----Original Message-----
> From: Michael S. Tsirkin [mailto:mst@redhat.com]
> Sent: Thursday, April 19, 2018 11:43 PM
> To: Liang, Cunming
> Cc: Bie, Tiwei; jasowang@redhat.com; alex.williamson@redhat.com;
> pbonzini@redhat.com; stefanha@redhat.com; qemu-devel@nongnu.org;
> virtio-dev@lists.oasis-open.org; Daly, Dan; Tan, Jianfeng; Wang, Zhihong;
> Wang, Xiao W
> Subject: Re: [PATCH v3 6/6] vhost-user: support registering external host
> notifiers
>
> On Thu, Apr 19, 2018 at 12:43:42PM +0000, Liang, Cunming wrote:
> >
> >
> > > -----Original Message-----
> > > From: Bie, Tiwei
> > > Sent: Thursday, April 19, 2018 7:15 PM
> > > To: Michael S. Tsirkin
> > > Cc: jasowang@redhat.com; alex.williamson@redhat.com;
> > > pbonzini@redhat.com; stefanha@redhat.com; qemu-devel@nongnu.org;
> > > virtio-dev@lists.oasis-open.org; Liang, Cunming; Daly, Dan;
> > > Tan, Jianfeng; Wang, Zhihong; Wang, Xiao W
> > > Subject: Re: [PATCH v3 6/6] vhost-user: support registering external
> > > host notifiers
> > >
> > > On Wed, Apr 18, 2018 at 07:34:06PM +0300, Michael S. Tsirkin wrote:
> > > > On Thu, Apr 12, 2018 at 11:12:32PM +0800, Tiwei Bie wrote:
> > > > > This patch introduces VHOST_USER_PROTOCOL_F_HOST_NOTIFIER.
> > > > > With this feature negotiated, vhost-user backend can register
> > > > > memory region based host notifiers. And it will allow the guest
> > > > > driver in the VM to notify the hardware accelerator at the
> > > > > vhost-user backend directly.
> > > > >
> > > > > Signed-off-by: Tiwei Bie
> > > >
> > > > Overall I think we can merge this approach, but I have two main
> > > > concerns about this:
> > > >
> > > > 1. Testing. Most people do not have the virtio hardware
> > > >    so how to make sure this does not bit rot?
> > > >
> > > >    I have an idea: add an option like this to libvhost-user.
> > > >    Naturally libvhost-user can not get notified about a write
> > > >    to an mmapped area, but it can write a special value there
> > > >    (e.g. all-ones?) and then poll it to detect VQ # writes.
> > > >
> > > >    Then include a vhost user bridge test with an option like this.
> > > >
> > > >    I'd like to see a patch doing this.
> > >
> > > Sure, I'll do it. Thanks for the suggestion!
> > >
> > > >
> > > > 2. Memory barriers. Right now after updating the avail idx,
> > > >    virtio does smp_wmb() and then the MMIO write.
> > > >    Normal hardware drivers do wmb() which is an sfence.
> > > >    Can a PCI device read bypass index write and see a stale
> > > >    index value?
> > A compiler barrier is enough on strongly-ordered memory platform. As it
> > doesn't re-order store, PCI device won't see a stale index value. But a
> > weakly-ordered memory needs sfence.
>
>
> Oh you are right.
>
> So it's only needed for non-intel platforms or when packets are in WC memory
> then. And I don't know whether dpdk ever puts packets in WC memory.

No, we haven't used WC memory.

>
> I guess we'll cross this bridge when we get to it.
>
>
> > >
> > > It depends on arch's memory model. Cunming will provide more details
> > > about this later.
> > >
> > > >    To make virtio pci do wmb() we would need a new feature bit.
> > > >    Alternatively I guess we could maybe look at subsystem vendor/device id.
> > > >
> > > >    I'd like to see a patch doing one of these things.
> > >
> > > We prefer to add a new feature bit as it's a more robust way to do
> > > this. I'll send out some patches soon.
> > >
> > > Thank you very much! :)
> > >
> > > Best regards,
> > > Tiwei Bie
> > >
> > > >
> > > > Thanks!
> > > >
> > > > > ---
> > > > >  docs/interop/vhost-user.txt    |  33 +++++++++++
> > > > >  hw/virtio/vhost-user.c         | 123 +++++++++++++++++++++++++++++++++++++++++
> > > > >  include/hw/virtio/vhost-user.h |   8 +++
> > > > >  3 files changed, 164 insertions(+)
> > > > >
> > > > > diff --git a/docs/interop/vhost-user.txt b/docs/interop/vhost-user.txt
> > > > > index 534caab18a..9e57b36b20 100644
> > > > > --- a/docs/interop/vhost-user.txt
> > > > > +++ b/docs/interop/vhost-user.txt
> > > > > @@ -132,6 +132,16 @@ Depending on the request type, payload can be:
> > > > >     Payload: Size bytes array holding the contents of the virtio
> > > > >         device's configuration space
> > > > >
> > > > > + * Vring area description
> > > > > +   -----------------------
> > > > > +   | u64 | size | offset |
> > > > > +   -----------------------
> > > > > +
> > > > > +   u64: a 64-bit integer contains vring index and flags
> > > > > +   Size: a 64-bit size of this area
> > > > > +   Offset: a 64-bit offset of this area from the start of the
> > > > > +       supplied file descriptor
> > > > > +
> > > > >  In QEMU the vhost-user message is implemented with the following struct:
> > > > >
> > > > >  typedef struct VhostUserMsg {
> > > > > @@ -146,6 +156,7 @@ typedef struct VhostUserMsg {
> > > > >          VhostUserLog log;
> > > > >          struct vhost_iotlb_msg iotlb;
> > > > >          VhostUserConfig config;
> > > > > +        VhostUserVringArea area;
> > > > >      };
> > > > >  } QEMU_PACKED VhostUserMsg;
> > > > >
> > > > > @@ -380,6 +391,7 @@ Protocol features
> > > > >  #define VHOST_USER_PROTOCOL_F_CRYPTO_SESSION 7
> > > > >  #define VHOST_USER_PROTOCOL_F_PAGEFAULT      8
> > > > >  #define VHOST_USER_PROTOCOL_F_CONFIG         9
> > > > > +#define VHOST_USER_PROTOCOL_F_HOST_NOTIFIER  10
> > > > >
> > > > >  Master message types
> > > > >  --------------------
> > > > > @@ -777,6 +789,27 @@ Slave message types
> > > > >        the VHOST_USER_NEED_REPLY flag, master must respond with zero when
> > > > >        operation is successfully completed, or non-zero otherwise.
> > > > >
> > > > > + * VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG
> > > > > +
> > > > > +      Id: 3
> > > > > +      Equivalent ioctl: N/A
> > > > > +      Slave payload: vring area description
> > > > > +      Master payload: N/A
> > > > > +
> > > > > +      Sets host notifier for a specified queue. The queue index is contained
> > > > > +      in the u64 field of the vring area description. The host notifier is
> > > > > +      described by the file descriptor (typically it's a VFIO device fd) which
> > > > > +      is passed as ancillary data and the size (which is mmap size and should
> > > > > +      be the same as host page size) and offset (which is mmap offset) carried
> > > > > +      in the vring area description. QEMU can mmap the file descriptor based
> > > > > +      on the size and offset to get a memory range. Registering a host notifier
> > > > > +      means mapping this memory range to the VM as the specified queue's notify
> > > > > +      MMIO region. Slave sends this request to tell QEMU to de-register the
> > > > > +      existing notifier if any and register the new notifier if the request is
> > > > > +      sent with a file descriptor.
> > > > > +      This request should be sent only when VHOST_USER_PROTOCOL_F_HOST_NOTIFIER
> > > > > +      protocol feature has been successfully negotiated.
> > > > > +
> > > > >  VHOST_USER_PROTOCOL_F_REPLY_ACK:
> > > > >  -------------------------------
> > > > >  The original vhost-user specification only demands replies for certain
> > > > > diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> > > > > index 791e0a4763..1cd9c7276b 100644
> > > > > --- a/hw/virtio/vhost-user.c
> > > > > +++ b/hw/virtio/vhost-user.c
> > > > > @@ -13,6 +13,7 @@
> > > > >  #include "hw/virtio/vhost.h"
> > > > >  #include "hw/virtio/vhost-user.h"
> > > > >  #include "hw/virtio/vhost-backend.h"
> > > > > +#include "hw/virtio/virtio.h"
> > > > >  #include "hw/virtio/virtio-net.h"
> > > > >  #include "chardev/char-fe.h"
> > > > >  #include "sysemu/kvm.h"
> > > > > @@ -48,6 +49,7 @@ enum VhostUserProtocolFeature {
> > > > >      VHOST_USER_PROTOCOL_F_CRYPTO_SESSION = 7,
> > > > >      VHOST_USER_PROTOCOL_F_PAGEFAULT = 8,
> > > > >      VHOST_USER_PROTOCOL_F_CONFIG = 9,
> > > > > +    VHOST_USER_PROTOCOL_F_HOST_NOTIFIER = 10,
> > > > >      VHOST_USER_PROTOCOL_F_MAX
> > > > >  };
> > > > >
> > > > > @@ -92,6 +94,7 @@ typedef enum VhostUserSlaveRequest {
> > > > >      VHOST_USER_SLAVE_NONE = 0,
> > > > >      VHOST_USER_SLAVE_IOTLB_MSG = 1,
> > > > >      VHOST_USER_SLAVE_CONFIG_CHANGE_MSG = 2,
> > > > > +    VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG = 3,
> > > > >      VHOST_USER_SLAVE_MAX
> > > > >  }  VhostUserSlaveRequest;
> > > > >
> > > > > @@ -136,6 +139,12 @@ static VhostUserConfig c __attribute__ ((unused));
> > > > >                                     + sizeof(c.size) \
> > > > >                                     + sizeof(c.flags))
> > > > >
> > > > > +typedef struct VhostUserVringArea {
> > > > > +    uint64_t u64;
> > > > > +    uint64_t size;
> > > > > +    uint64_t offset;
> > > > > +} VhostUserVringArea;
> > > > > +
> > > > >  typedef struct {
> > > > >      VhostUserRequest request;
> > > > >
> > > > > @@ -157,6 +166,7 @@ typedef union {
> > > > >          struct vhost_iotlb_msg iotlb;
> > > > >          VhostUserConfig config;
> > > > >          VhostUserCryptoSession session;
> > > > > +        VhostUserVringArea area;
> > > > >  } VhostUserPayload;
> > > > >
> > > > >  typedef struct VhostUserMsg {
> > > > > @@ -638,9 +648,37 @@ static int vhost_user_set_vring_num(struct vhost_dev *dev,
> > > > >      return vhost_set_vring(dev, VHOST_USER_SET_VRING_NUM, ring);
> > > > >  }
> > > > >
> > > > > +static void vhost_user_host_notifier_restore(struct vhost_dev *dev,
> > > > > +                                             int queue_idx)
> > > > > +{
> > > > > +    struct vhost_user *u = dev->opaque;
> > > > > +    VhostUserHostNotifier *n = &u->user->notifier[queue_idx];
> > > > > +    VirtIODevice *vdev = dev->vdev;
> > > > > +
> > > > > +    if (n->addr && !n->set) {
> > > > > +        virtio_queue_set_host_notifier_mr(vdev, queue_idx, &n->mr, true);
> > > > > +        n->set = true;
> > > > > +    }
> > > > > +}
> > > > > +
> > > > > +static void vhost_user_host_notifier_remove(struct vhost_dev *dev,
> > > > > +                                            int queue_idx)
> > > > > +{
> > > > > +    struct vhost_user *u = dev->opaque;
> > > > > +    VhostUserHostNotifier *n = &u->user->notifier[queue_idx];
> > > > > +    VirtIODevice *vdev = dev->vdev;
> > > > > +
> > > > > +    if (n->addr && n->set) {
> > > > > +        virtio_queue_set_host_notifier_mr(vdev, queue_idx, &n->mr, false);
> > > > > +        n->set = false;
> > > > > +    }
> > > > > +}
> > > > > +
> > > > >  static int vhost_user_set_vring_base(struct vhost_dev *dev,
> > > > >                                       struct vhost_vring_state *ring)
> > > > >  {
> > > > > +    vhost_user_host_notifier_restore(dev, ring->index);
> > > > > +
> > > > >      return vhost_set_vring(dev, VHOST_USER_SET_VRING_BASE, ring);
> > > > >  }
> > > > >
> > > > > @@ -674,6 +712,8 @@ static int vhost_user_get_vring_base(struct vhost_dev *dev,
> > > > >          .hdr.size = sizeof(msg.payload.state),
> > > > >      };
> > > > >
> > > > > +    vhost_user_host_notifier_remove(dev, ring->index);
> > > > > +
> > > > >      if (vhost_user_write(dev, &msg, NULL, 0) < 0) {
> > > > >          return -1;
> > > > >      }
> > > > > @@ -847,6 +887,76 @@ static int vhost_user_slave_handle_config_change(struct vhost_dev *dev)
> > > > >      return ret;
> > > > >  }
> > > > >
> > > > > +static int vhost_user_slave_handle_vring_host_notifier(struct vhost_dev *dev,
> > > > > +                                                       VhostUserVringArea *area,
> > > > > +                                                       int fd)
> > > > > +{
> > > > > +    int queue_idx = area->u64 & VHOST_USER_VRING_IDX_MASK;
> > > > > +    size_t page_size = qemu_real_host_page_size;
> > > > > +    struct vhost_user *u = dev->opaque;
> > > > > +    VhostUserState *user = u->user;
> > > > > +    VirtIODevice *vdev = dev->vdev;
> > > > > +    VhostUserHostNotifier *n;
> > > > > +    int ret = 0;
> > > > > +    void *addr;
> > > > > +    char *name;
> > > > > +
> > > > > +    if (!virtio_has_feature(dev->protocol_features,
> > > > > +                            VHOST_USER_PROTOCOL_F_HOST_NOTIFIER) ||
> > > > > +        vdev == NULL || queue_idx >= virtio_get_num_queues(vdev)) {
> > > > > +        ret = -1;
> > > > > +        goto out;
> > > > > +    }
> > > > > +
> > > > > +    n = &user->notifier[queue_idx];
> > > > > +
> > > > > +    if (n->addr) {
> > > > > +        virtio_queue_set_host_notifier_mr(vdev, queue_idx, &n->mr, false);
> > > > > +        object_unparent(OBJECT(&n->mr));
> > > > > +        munmap(n->addr, page_size);
> > > > > +        n->addr = NULL;
> > > > > +    }
> > > > > +
> > > > > +    if (area->u64 & VHOST_USER_VRING_NOFD_MASK) {
> > > > > +        goto out;
> > > > > +    }
> > > > > +
> > > > > +    /* Sanity check. */
> > > > > +    if (area->size != page_size) {
> > > > > +        ret = -1;
> > > > > +        goto out;
> > > > > +    }
> > > > > +
> > > > > +    addr = mmap(NULL, page_size, PROT_READ | PROT_WRITE, MAP_SHARED,
> > > > > +                fd, area->offset);
> > > > > +    if (addr == MAP_FAILED) {
> > > > > +        ret = -1;
> > > > > +        goto out;
> > > > > +    }
> > > > > +
> > > > > +    name = g_strdup_printf("vhost-user/host-notifier@%p mmaps[%d]",
> > > > > +                           user, queue_idx);
> > > > > +    memory_region_init_ram_device_ptr(&n->mr, OBJECT(vdev), name,
> > > > > +                                      page_size, addr);
> > > > > +    g_free(name);
> > > > > +
> > > > > +    if (virtio_queue_set_host_notifier_mr(vdev, queue_idx, &n->mr, true)) {
> > > > > +        munmap(addr, page_size);
> > > > > +        ret = -1;
> > > > > +        goto out;
> > > > > +    }
> > > > > +
> > > > > +    n->addr = addr;
> > > > > +    n->set = true;
> > > > > +
> > > > > +out:
> > > > > +    /* Always close the fd. */
> > > > > +    if (fd != -1) {
> > > > > +        close(fd);
> > > > > +    }
> > > > > +    return ret;
> > > > > +}
> > > > > +
> > > > >  static void slave_read(void *opaque)
> > > > >  {
> > > > >      struct vhost_dev *dev = opaque;
> > > > > @@ -913,6 +1023,10 @@ static void slave_read(void *opaque)
> > > > >      case VHOST_USER_SLAVE_CONFIG_CHANGE_MSG :
> > > > >          ret = vhost_user_slave_handle_config_change(dev);
> > > > >          break;
> > > > > +    case VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG:
> > > > > +        ret = vhost_user_slave_handle_vring_host_notifier(dev, &payload.area,
> > > > > +                                                          fd);
> > > > > +        break;
> > > > >      default:
> > > > >          error_report("Received unexpected msg type.");
> > > > >          if (fd != -1) {
> > > > > @@ -1641,6 +1755,15 @@ VhostUserState *vhost_user_init(void)
> > > > >
> > > > >  void vhost_user_cleanup(VhostUserState *user)
> > > > >  {
> > > > > +    int i;
> > > > > +
> > > > > +    for (i = 0; i < VIRTIO_QUEUE_MAX; i++) {
> > > > > +        if (user->notifier[i].addr) {
> > > > > +            object_unparent(OBJECT(&user->notifier[i].mr));
> > > > > +            munmap(user->notifier[i].addr, qemu_real_host_page_size);
> > > > > +            user->notifier[i].addr = NULL;
> > > > > +        }
> > > > > +    }
> > > > >  }
> > > > >
> > > > >  const VhostOps user_ops = {
> > > > > diff --git a/include/hw/virtio/vhost-user.h b/include/hw/virtio/vhost-user.h
> > > > > index eb8bc0d90d..fd660393a0 100644
> > > > > --- a/include/hw/virtio/vhost-user.h
> > > > > +++ b/include/hw/virtio/vhost-user.h
> > > > > @@ -9,9 +9,17 @@
> > > > >  #define HW_VIRTIO_VHOST_USER_H
> > > > >
> > > > >  #include "chardev/char-fe.h"
> > > > > +#include "hw/virtio/virtio.h"
> > > > > +
> > > > > +typedef struct VhostUserHostNotifier {
> > > > > +    MemoryRegion mr;
> > > > > +    void *addr;
> > > > > +    bool set;
> > > > > +} VhostUserHostNotifier;
> > > > >
> > > > >  typedef struct VhostUserState {
> > > > >      CharBackend *chr;
> > > > > +    VhostUserHostNotifier notifier[VIRTIO_QUEUE_MAX];
> > > > >  } VhostUserState;
> > > > >
> > > > >  VhostUserState *vhost_user_init(void);
> > > > > --
> > > > > 2.11.0