On Tue, Sep 10, 2019 at 05:14:36PM +0200, Johannes Berg wrote:
> On Tue, 2019-09-10 at 17:03 +0200, Stefan Hajnoczi wrote:
> >
> > > Now, this means that the CPU (that's part of the simulation) has to
> > > *wait* for the device to add an entry to the simulation calendar in
> > > response to the kick... That means that it really has to look like
> > >
> > > CPU               device                   calendar
> > >      ---[kick]-->
> > >                          ---[add entry]-->
> > >                          <---[return]-----
> >
> > What are the semantics of returning from the calendar?  Does it mean
> > "it's now your turn to run?", "your entry has been added and you'll be
> > notified later when it's time to run?", or something else?
>
> The latter - the entry was added, and you'll be notified when it's time
> to run; but we need to have that state on the calendar so the CPU won't
> converse with the calendar before that state is committed.

Is the device only adding a calendar entry and not doing any actual
device emulation at this stage?

If yes, then this suggests the system could be structured more cleanly.
The vhost-user device process should focus on device emulation.  It
should not be aware of the calendar.  The vhost-user protocol also
shouldn't require modifications.

Instead, Linux arch/um code would add the entry to the calendar when the
CPU wants to kick a vhost-user device.  I assume the CPU is suspended
until arch/um code completes adding the entry to the calendar.

When the calendar decides to run the device entry it signals the
vhost-user kick eventfd.  The vhost-user device processes the virtqueue
as if it had been directly signalled by the CPU, totally unaware that
it's running within a simulation system.

The irq path is similar: the device signals the callfd and the calendar
adds an entry to notify UML that the request has completed.
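To make the calendar side of this concrete, here is a rough sketch of how
a calendar might hold a deferred kick and signal the kick eventfd when the
entry's turn comes.  Everything in it is an assumption for illustration:
struct sim_calendar, struct cal_entry, sim_calendar_add() and
sim_calendar_run_next() are made-up names, the simulated "time" field is
invented, and the entry stores the kickfd directly instead of a device and
virtqueue index.  The real simulation calendar will certainly look
different.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/eventfd.h>

/* Hypothetical types: not part of vhost-user or arch/um. */
enum cal_entry_type {
    CAL_ENTRY_TYPE_VHOST_USER_KICK,
};

struct cal_entry {
    uint64_t time;              /* simulated time at which the entry runs */
    enum cal_entry_type type;
    int kickfd;                 /* eventfd the device process is polling */
};

#define CAL_MAX_ENTRIES 16

struct sim_calendar {
    struct cal_entry entries[CAL_MAX_ENTRIES]; /* kept in insertion order */
    size_t count;
    uint64_t now;               /* current simulated time */
};

/* Queue an entry.  Returns 0 on success, -1 if the calendar is full. */
int sim_calendar_add(struct sim_calendar *cal, struct cal_entry entry)
{
    assert(cal != NULL);
    if (cal->count == CAL_MAX_ENTRIES)
        return -1;
    cal->entries[cal->count++] = entry;
    return 0;
}

/*
 * Run the entry with the smallest time; ties go to the entry that was
 * added first, so the order is deterministic.
 * Returns 1 if an entry ran, 0 if the calendar is empty, -1 on I/O error.
 */
int sim_calendar_run_next(struct sim_calendar *cal)
{
    struct cal_entry entry;
    uint64_t val = 1;
    size_t best = 0;

    assert(cal != NULL);
    if (cal->count == 0)
        return 0;

    for (size_t i = 1; i < cal->count; i++)
        if (cal->entries[i].time < cal->entries[best].time)
            best = i;

    /* Remove the chosen entry while preserving insertion order. */
    entry = cal->entries[best];
    for (size_t i = best; i + 1 < cal->count; i++)
        cal->entries[i] = cal->entries[i + 1];
    cal->count--;
    cal->now = entry.time;

    /* Same write the CPU would have done outside the simulation. */
    if (write(entry.kickfd, &val, sizeof(val)) != (ssize_t)sizeof(val))
        return -1;
    return 1;
}
```

The device process only ever sees the eventfd becoming readable, so it
cannot tell whether the CPU or the calendar wrote to it.  The irq path
would presumably mirror this with the callfd.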
Some pseudo-code:

arch/um/drivers/.../vhost-user.c:

  void um_vu_kick(struct um_vu_vq *vq)
  {
      uint64_t val = 1;

      if (simulation_mode) {
          /* Defer the kick: the calendar signals the kickfd later */
          calendar_add_entry({
              .type = CAL_ENTRY_TYPE_VHOST_USER_KICK,
              .device = vq->dev,
              .vq_idx = vq->idx,
          });
          return;
      }

      /* The normal code path: signal the kickfd */
      write(vq->kickfd, &val, sizeof(val));
  }

I'm suggesting this because it seems like a cleaner approach than
exposing the calendar concept to the vhost-user devices.  I'm not 100%
sure it offers the semantics you need to make everything deterministic
though.

A different topic: vhost-user does not have a 1:1 vq buffer:kick
relationship.  It's possible to add multiple buffers and kick only once.
It is also possible for the device to complete multiple buffers and only
call once.

This could pose a problem for simulation because it allows a degree of
non-determinism.  But as long as both the CPU and the device's I/O
completion happen on a strict schedule, this isn't a problem.

Stefan