On Tue, May 24, 2022 at 11:17:06AM +0200, Paolo Bonzini wrote: > On 5/24/22 10:08, Stefan Hajnoczi wrote: > > On Tue, May 24, 2022 at 09:55:39AM +0200, Paolo Bonzini wrote: > > > On 5/22/22 17:06, Stefan Hajnoczi wrote: > > > > However, I hit on a problem that I think Emanuele and Paolo have already > > > > pointed out: draining is GS & IO. This might have worked under the 1 IOThread > > > > model but it does not make sense for multi-queue. It is possible to submit I/O > > > > requests in drained sections. How can multiple threads be in drained sections > > > > simultaneously and possibly submit further I/O requests in their drained > > > > sections? Those sections wouldn't be "drained" in any useful sense of the word. > > > > > > Yeah, that works only if the drained sections are well-behaved. > > > > > > "External" sources of I/O are fine; they are disabled using is_external, and > > > don't drain themselves I think. > > > > I/O requests for a given BDS may be executing in multiple AioContexts, > > so how do you call aio_disable_external() on all relevant AioContexts? > > With multiqueue yeah, we have to replace aio_disable_external() with > drained_begin/end() callbacks; but I'm not talking about that yet. > > > > In parallel to the block layer discussions, it's possible to work on > > > introducing a request queue lock in virtio-blk and virtio-scsi. That's the > > > only thing that relies on the AioContext lock outside the block layer. > > > > I'm not sure what the request queue lock protects in virtio-blk? In > > virtio-scsi I guess a lock is needed to protect SCSI target emulation > > state? > > Yes, but even in virtio-blk there is this code that runs in the main thread > and is currently protected by aio_context_acquire/release: > > blk_drain(s->blk); > > /* We drop queued requests after blk_drain() because blk_drain() > * itself can produce them. */ > while (s->rq) { > req = s->rq; > s->rq = req->next; > virtqueue_detach_element(req->vq, &req->elem, 0); > virtio_blk_free_request(req); > } > > Maybe it's safe to run it without a lock because it runs after > virtio_set_status(vdev, 0) but I'd rather play it safe and protect s->rq > with a lock. What does the lock protect? A lock can prevent s->rq or req->vq corruption but it cannot prevent request leaks. This loop's job is to free all requests so there is no leak. If a lock is necessary then this code is already broken in a more fundamental way because it can leak. Stefan