On Fri, Sep 10, 2021 at 05:25:13AM +0000, John Johnson wrote: > > > > On Sep 8, 2021, at 11:29 PM, Stefan Hajnoczi wrote: > > > > On Thu, Sep 09, 2021 at 05:11:49AM +0000, John Johnson wrote: > >> > >> > >> I did look at coroutines, but they seemed to work when the sender > >> is triggering the coroutine on send, not when request packets are arriving > >> asynchronously to the sends. > > > > This can be done with a receiver coroutine. Its job is to be the only > > thing that reads vfio-user messages from the socket. A receiver > > coroutine reads messages from the socket and wakes up the waiting > > coroutine that yielded from vfio_user_send_recv() or > > vfio_user_pci_process_req(). > > > > (Although vfio_user_pci_process_req() could be called directly from the > > receiver coroutine, it seems safer to have a separate coroutine that > > processes requests so that the receiver isn't blocked in case > > vfio_user_pci_process_req() yields while processing a request.) > > > > Going back to what you mentioned above, the receiver coroutine does > > something like this: > > > > if it's a reply > > reply = find_reply(...) > > qemu_coroutine_enter(reply->co) // instead of signalling reply->cv > > else > > QSIMPLEQ_INSERT_TAIL(&pending_reqs, request, next); > > if (pending_reqs_was_empty) { > > qemu_coroutine_enter(process_request_co); > > } > > > > The pending_reqs queue holds incoming requests that the > > process_request_co coroutine processes. > > > > > How do coroutines work across threads? There can be multiple vCPU > threads waiting for replies, and I think the receiver coroutine will be > running in the main loop thread. Where would a vCPU block waiting for > a reply? I think coroutine_yield() returns to its coroutine_enter() caller. A vCPU thread holding the BQL can iterate the event loop if it has reached a synchronous point that needs to wait for a reply before returning. I think we have this situation when a MemoryRegion is accessed on the proxy device. For example, block/block-backend.c:blk_prw() kicks off a coroutine and then runs the event loop until the coroutine finishes: Coroutine *co = qemu_coroutine_create(co_entry, &rwco); bdrv_coroutine_enter(blk_bs(blk), co); BDRV_POLL_WHILE(blk_bs(blk), rwco.ret == NOT_DONE); BDRV_POLL_WHILE() boils down to a loop like this: while ((cond)) { aio_poll(ctx, true); } I also want to check that I understand the scenarios in which the vfio-user communication code is used: 1. vhost-user-server The vfio-user communication code should run in a given AioContext (it will be the main loop by default but maybe the user will be able to configure a specific IOThread in the future). 2. vCPU thread vfio-user clients The vfio-user communication code is called from the vCPU thread where the proxy device executes. The MemoryRegion->read()/write() callbacks are synchronous, so the thread needs to wait for a vfio-user reply before it can return. Is this what you had in mind? Stefan