On Wed, Mar 08, 2023 at 06:25:43PM +0100, Kevin Wolf wrote:
> Am 08.03.2023 um 15:26 hat Stefan Hajnoczi geschrieben:
> > On Wed, Mar 08, 2023 at 09:48:17AM +0100, Kevin Wolf wrote:
> > > Am 07.03.2023 um 20:20 hat Stefan Hajnoczi geschrieben:
> > > > On Tue, Mar 07, 2023 at 06:17:22PM +0100, Kevin Wolf wrote:
> > > > > Am 01.03.2023 um 21:57 hat Stefan Hajnoczi geschrieben:
> > > > > > There is no need for the AioContext lock in bdrv_drain_all() because
> > > > > > nothing in AIO_WAIT_WHILE() needs the lock and the condition is atomic.
> > > > > >
> > > > > > Note that the NULL AioContext argument to AIO_WAIT_WHILE() is odd. In
> > > > > > the future it can be removed.
> > > > >
> > > > > It can be removed for all callers that run in the main loop context. For
> > > > > code running in an iothread, it's still important to pass a non-NULL
> > > > > context. This makes me doubt that the ctx parameter can really be
> > > > > removed without changing more.
> > > > >
> > > > > Is your plan to remove the if from AIO_WAIT_WHILE_INTERNAL(), too, and
> > > > > to poll qemu_get_current_aio_context() instead of ctx_ or the main
> > > > > context?
> > > >
> > > > This is what I'd like once everything has been converted to
> > > > AIO_WAIT_WHILE_UNLOCKED() - and at this point we might as well call it
> > > > AIO_WAIT_WHILE() again:
> > > >
> > > > #define AIO_WAIT_WHILE(cond) ({                                    \
> > > >     bool waited_ = false;                                          \
> > > >     AioWait *wait_ = &global_aio_wait;                             \
> > > >     /* Increment wait_->num_waiters before evaluating cond. */     \
> > > >     qatomic_inc(&wait_->num_waiters);                              \
> > > >     /* Paired with smp_mb in aio_wait_kick(). */                   \
> > > >     smp_mb();                                                      \
> > > >     while ((cond)) {                                               \
> > > >         aio_poll(qemu_get_current_aio_context(), true);            \
> > > >         waited_ = true;                                            \
> > > >     }                                                              \
> > > >     qatomic_dec(&wait_->num_waiters);                              \
> > > >     waited_; })
> > >
> > > Ok, yes, this is what I tried to describe above.
> > >
> > > > However, I just realized this only works in the main loop thread because
> > > > that's where aio_wait_kick() notifications are received. An IOThread
> > > > running AIO_WAIT_WHILE() won't be woken when another thread (including
> > > > the main loop thread) calls aio_wait_kick().
> > >
> > > Which is of course a limitation we already have today. You can wait for
> > > things in your own iothread, or for all threads from the main loop.
> > >
> > > However, in the future multiqueue world, the first case probably becomes
> > > pretty much useless because even for the same node, you could get
> > > activity in any thread.
> > >
> > > So essentially AIO_WAIT_WHILE() becomes GLOBAL_STATE_CODE(). Which is
> > > probably a good idea anyway, but I'm not entirely sure how many places
> > > we currently have where it's called from an iothread. I know the drain
> > > in mirror_run(), but Emanuele already had a patch in his queue where
> > > bdrv_co_yield_to_drain() schedules drain in the main context, so if that
> > > works, mirror_run() would be solved.
> > >
> > > https://gitlab.com/eesposit/qemu/-/commit/63562351aca4fb05d5711eb410feb96e64b5d4ad
> > >
> > > > I would propose introducing a QemuCond for each condition that we wait
> > > > on, but QemuCond lacks event loop integration. The current thread would
> > > > be unable to run aio_poll() while also waiting on a QemuCond.
> > > >
> > > > Life outside coroutines is hard, man! I need to think about this more.
> > > > Luckily this problem doesn't block this patch series.
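A minimal sketch of the QemuCond limitation described above, assuming
hypothetical 'lock', 'cond', and 'done' variables; qemu_cond_wait() and the
QemuMutex helpers are the real primitives from include/qemu/thread.h:

    /* Illustrative waiter only, not code from this series. */
    qemu_mutex_lock(&lock);
    while (!done) {
        /* Sleeps until another thread signals 'cond'... */
        qemu_cond_wait(&cond, &lock);
    }
    qemu_mutex_unlock(&lock);

    /*
     * ...but a thread sleeping in qemu_cond_wait() cannot simultaneously
     * run aio_poll(qemu_get_current_aio_context(), true), so the event
     * handlers that would eventually set 'done' are never dispatched by
     * this thread.
     */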
> > >
> > > I hope that we don't really need all of this if we can limit running
> > > synchronous code to the main loop.
> >
> > Great idea, I think you're right.
> >
> > I'll audit the code to find the IOThread AIO_WAIT_WHILE() callers and
> > maybe a future patch series can work on that.
> >
> > > > > > There is an assertion in
> > > > > > AIO_WAIT_WHILE() that checks that we're in the main loop AioContext and
> > > > > > we would lose that check by dropping the argument. However, that was a
> > > > > > precursor to the GLOBAL_STATE_CODE()/IO_CODE() macros and is now a
> > > > > > duplicate check. So I think we won't lose much by dropping it, but let's
> > > > > > do a few more AIO_WAIT_WHILE_UNLOCKED() conversions of this sort to
> > > > > > confirm this is the case.
> > > > > >
> > > > > > Signed-off-by: Stefan Hajnoczi
> > > > >
> > > > > Yes, it seems that we don't lose much, except maybe some consistency in
> > > > > the intermediate state. The commit message could state a bit more
> > > > > directly what we gain, though. Since you mention removing the parameter
> > > > > as a future possibility, I assume that's the goal with it, but I
> > > > > wouldn't be sure just from reading the commit message.
> > > >
> > > > AIO_WAIT_WHILE() callers need to be weaned off the AioContext lock.
> > > > That's the main motivation and this patch series converts the easy
> > > > cases where we already don't need the lock. Dropping the function
> > > > argument eventually is a side benefit.
> > >
> > > Yes, but the conversion to AIO_WAIT_WHILE_UNLOCKED() could be done with
> > > ctx instead of NULL. So moving to NULL is a separate change that needs a
> > > separate explanation. You could even argue that it should be a separate
> > > patch if it's an independent change.
> > >
> > > Or am I missing something and keeping ctx would actually break things?
> >
> > Yes, the ctx argument does not need to be modified when converting from
> > AIO_WAIT_WHILE() to AIO_WAIT_WHILE_UNLOCKED(). Passing it bothers me
> > because we don't really use it when unlock=false.
> >
> > Would you like me to keep ctx non-NULL for now?
>
> I don't really mind doing both changes in one commit because they are so
> small, but at least I'd like the commit message to be more explicit
> about the eventual goal we have with switching to NULL instead of just
> stating that it's odd, but harmless.

Got it, I'll send another revision.

Stefan
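For context, the conversion discussed in this thread amounts to roughly the
following one-line change in the drain code. This is a sketch based on the
commit message quoted above; it assumes the call site is
bdrv_drain_all_begin() in block/io.c and that the polled condition is
bdrv_drain_all_poll():

    /* Sketch of the conversion; note the NULL AioContext argument stays. */
    -    AIO_WAIT_WHILE(NULL, bdrv_drain_all_poll());
    +    AIO_WAIT_WHILE_UNLOCKED(NULL, bdrv_drain_all_poll());

Once all callers run in the main loop, the eventual goal discussed in the
thread is to drop the AioContext parameter entirely and poll
qemu_get_current_aio_context(), as in the unified AIO_WAIT_WHILE() macro
shown earlier.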