On Mon, Aug 30, 2021 at 03:04:08AM +0000, John Johnson wrote:
> 
> 
> > On Aug 24, 2021, at 8:14 AM, Stefan Hajnoczi wrote:
> > 
> > On Mon, Aug 16, 2021 at 09:42:38AM -0700, Elena Ufimtseva wrote:
> >> @@ -62,5 +65,10 @@ typedef struct VFIOProxy {
> >> 
> >>  VFIOProxy *vfio_user_connect_dev(SocketAddress *addr, Error **errp);
> >>  void vfio_user_disconnect(VFIOProxy *proxy);
> >> +void vfio_user_set_reqhandler(VFIODevice *vbasdev,
> > 
> > "vbasedev" for consistency?
> > 
> 
> 	OK
> 
> >> +                              int (*handler)(void *opaque, char *buf,
> >> +                                             VFIOUserFDs *fds),
> >> +                              void *reqarg);
> > 
> > The handler callback is undocumented. What context does it run in, what
> > do the arguments mean, and what should the function return? Please
> > document it so it's easy for others to modify this code in the future
> > without reverse-engineering the assumptions behind it.
> > 
> 
> 	OK
> 
> >> +void vfio_user_recv(void *opaque)
> >> +{
> >> +    VFIODevice *vbasedev = opaque;
> >> +    VFIOProxy *proxy = vbasedev->proxy;
> >> +    VFIOUserReply *reply = NULL;
> >> +    g_autofree int *fdp = NULL;
> >> +    VFIOUserFDs reqfds = { 0, 0, fdp };
> >> +    VFIOUserHdr msg;
> >> +    struct iovec iov = {
> >> +        .iov_base = &msg,
> >> +        .iov_len = sizeof(msg),
> >> +    };
> >> +    bool isreply;
> >> +    int i, ret;
> >> +    size_t msgleft, numfds = 0;
> >> +    char *data = NULL;
> >> +    g_autofree char *buf = NULL;
> >> +    Error *local_err = NULL;
> >> +
> >> +    qemu_mutex_lock(&proxy->lock);
> >> +    if (proxy->state == VFIO_PROXY_CLOSING) {
> >> +        qemu_mutex_unlock(&proxy->lock);
> >> +        return;
> >> +    }
> >> +
> >> +    ret = qio_channel_readv_full(proxy->ioc, &iov, 1, &fdp, &numfds,
> >> +                                 &local_err);
> > 
> > This is a blocking call. My understanding is that the IOThread is shared
> > by all vfio-user devices, so other devices will have to wait if one of
> > them is acting up (e.g. the device emulation process sent less than
> > sizeof(msg) bytes).
> > 
> > While we're blocked in this function the proxy device cannot be
> > hot-removed since proxy->lock is held.
> > 
> > It would be more robust to use the event loop to avoid blocking. There
> > could be a per-connection receiver coroutine that calls
> > qio_channel_readv_full_all_eof() (it yields the coroutine if reading
> > would block).
> > 
> 
> 	I thought the main loop uses BQL, which I don't need for most
> message processing. The blocking behavior can be fixed with FIONREAD
> beforehand to detect a message with fewer bytes than in a header.

It's I/O-bound work, exactly what the main loop was intended for.

I'm not sure the BQL can be avoided anyway:

- The vfio-user client runs under the BQL (a vCPU thread).

- The vfio-user server needs to hold the BQL since most QEMU device
  models assume they are running under the BQL.

The network communication code doesn't need to know about the BQL,
though. Event-driven code (code that runs in an AioContext) can rely on
the fact that its callbacks only execute in the AioContext, i.e. in one
thread at any given time. The code probably doesn't need explicit BQL
lock/unlock and can run safely in another IOThread if the user wishes. I
would leave that up to the user, e.g. -device
vfio-user-pci,iothread=iothread0, instead of creating a dedicated
IOThread that is shared by all vfio-user communication.

See nbd/server.c for an example of doing event-driven network I/O with
coroutines.

Stefan