On Mon, Aug 16, 2021 at 09:42:46AM -0700, Elena Ufimtseva wrote: > +void vfio_user_drain_reqs(VFIOProxy *proxy) > +{ > + VFIOUserReply *reply; > + bool iolock = 0; > + > + /* > + * Any DMA map/unmap requests sent in the middle > + * of a memory region transaction were sent async. > + * Wait for them here. > + */ > + QEMU_LOCK_GUARD(&proxy->lock); > + if (proxy->last_nowait != NULL) { > + iolock = qemu_mutex_iothread_locked(); > + if (iolock) { > + qemu_mutex_unlock_iothread(); > + } > + > + reply = proxy->last_nowait; > + reply->nowait = 0; > + while (reply->complete == 0) { > + if (!qemu_cond_timedwait(&reply->cv, &proxy->lock, wait_time)) { > + error_printf("vfio_drain_reqs - timed out\n"); > + break; > + } > + } > + > + if (reply->msg->flags & VFIO_USER_ERROR) { > + error_printf("vfio_user_rcv error reply on async request "); > + error_printf("command %x error %s\n", reply->msg->command, > + strerror(reply->msg->error_reply)); > + } > + proxy->last_nowait = NULL; > + g_free(reply->msg); > + QTAILQ_INSERT_HEAD(&proxy->free, reply, next); > + } > + > + if (iolock) { > + qemu_mutex_lock_iothread(); > + } Not sure this lock ordering is correct. Above we acquire proxy->lock while holding the BQL and here we acquire the BQL while holding proxy->lock. If another thread (e.g. a vCPU thread) does something similar this is the ABBA lock ordering problem. The more obviously correct way to write this is: WITH_QEMU_LOCK_GUARD(&proxy->lock) { ... } if (iolock) { qemu_mutex_lock_iothread(); } > +} > + > static void vfio_user_request_msg(VFIOUserHdr *hdr, uint16_t cmd, > uint32_t size, uint32_t flags) > { > @@ -715,6 +756,89 @@ int vfio_user_validate_version(VFIODevice *vbasedev, Error **errp) > return 0; > } > > +int vfio_user_dma_map(VFIOProxy *proxy, struct vfio_iommu_type1_dma_map *map, > + VFIOUserFDs *fds, bool will_commit) > +{ > + VFIOUserDMAMap *msgp = g_malloc(sizeof(*msgp)); Is this zero-initialized anywhere to guarantee that QEMU memory isn't exposed to the VFIO device emulation program? > + int ret, flags; > + > + /* commit will wait, so send async without dropping BQL */ > + flags = will_commit ? (NOIOLOCK | NOWAIT) : 0; Why is this distinction between will_commit and !will_commit necessary? I get a sense that the network communications code drops the BQL and that's undesirable here for some reason. I wonder why the code doesn't take the NOWAIT code path all the time?