On Wed, Feb 19, 2020 at 08:05:12PM +0100, Paolo Bonzini wrote: > Il mer 19 feb 2020, 18:58 Stefan Hajnoczi ha scritto: > > > On Wed, Feb 19, 2020 at 12:09:48PM +0100, Paolo Bonzini wrote: > > > Really a great idea, though I have some remarks on the implementation > > below. > > > > > > On 19/02/20 11:00, Stefan Hajnoczi wrote: > > > > + * Each aio_bh_poll() call carves off a slice of the BH list. This > > way newly > > > > + * scheduled BHs are not processed until the next aio_bh_poll() > > call. This > > > > + * concept extends to nested aio_bh_poll() calls because slices are > > chained > > > > + * together. > > > > > > This is the tricky part so I would expand a bit on why it's needed: > > > > > > /* > > > * Each aio_bh_poll() call carves off a slice of the BH list, so that > > > * newly scheduled BHs are not processed until the next aio_bh_poll() > > > * call. All active aio_bh_poll() calls chained their slices together > > > * in a list, so that nested aio_bh_poll() calls process all scheduled > > > * bottom halves. > > > */ > > > > Thanks, will fix in v2. > > > > > > +struct BHListSlice { > > > > + QEMUBH *first_bh; > > > > + BHListSlice *next; > > > > +}; > > > > + > > > > > > Using QLIST and QSLIST removes the need to create your own lists, since > > > you can use QSLIST_MOVE_ATOMIC and QSLIST_INSERT_HEAD_ATOMIC. For > > example: > > > > > > struct BHListSlice { > > > QSLIST_HEAD(, QEMUBH) first_bh; > > > QLIST_ENTRY(BHListSlice) next; > > > }; > > > > > > ... > > > > > > QSLIST_HEAD(, QEMUBH) active_bh; > > > QLIST_HEAD(, BHListSlice) bh_list; > > > > I thought about this but chose the explicit tail pointer approach > > because it lets aio_compute_timeout() and aio_ctx_check() iterate over > > both the active BH list and slices in a single for loop :). But > > thinking about it more, maybe it can still be done by replacing > > active_bh with a permanently present first BHListSlice element. > > > > Probably not so easy since the idea was to empty the slices list entirely > via the FIFO order. > > But since you are rewriting everything anyway, can you add a flag for > whether there are any non-idle bottom halves anywhere in the list? It need > not be computed perfectly, because any non-idle bh will cause the idle > bottom halves to be triggered as well; you can just set in qemu_bh_schedule > and clear it at the end of aio_bh_poll. > > Then if there is any active bh or slice you consult the flag and use it to > return the timeout, which will be either 0 or 10ms depending on the flag. > That's truly O(1). (More precisely, this patch goes from O(#created-bh) to > O(#scheduled-bh), which of course is optimal for aio_bh_poll but not for > aio_compute_timeout; making timeout computation O(1) can lower latency a > bit by decreasing the constant factor). Yes, good idea. I'll send a follow-up patch. Stefan