On 16.07.2019 at 18:24, Max Reitz wrote:
> On 16.07.19 16:40, Kevin Wolf wrote:
> > On 19.06.2019 at 17:25, Max Reitz wrote:
> >> Hi,
> >>
> >> This is v2 to “block: Keep track of parent quiescing”.
> >>
> >> Please read this cover letter, because I’m very unsure about the design
> >> in this series and I’d appreciate some comments.
> >>
> >> As Kevin wrote in his reply to that series, the actual problem is that
> >> bdrv_drain_invoke() polls on every node whenever ending a drain. This
> >> may cause graph changes, which is actually forbidden.
> >>
> >> To solve that problem, this series makes the drain code construct a list
> >> of undrain operations that have been initiated, and then polls all of
> >> them on the root level once graph changes are acceptable.
> >>
> >> Note that I don’t like this list concept very much, so I’m open to
> >> alternatives.
> >
> > So drain_end is different from drain_begin in that it wants to wait only
> > for all bdrv_drain_invoke() calls to complete, but not for other
> > requests that are in flight. Makes sense.
> >
> > Though instead of managing a whole list, wouldn't a counter suffice?
> >
> >> Furthermore, all BdrvChildRoles with BDS parents have a broken
> >> .drained_end() implementation. The documentation clearly states that
> >> this function is not allowed to poll, but it does. So this series
> >> changes it to a variant (using the new code) that does not poll.
> >>
> >> There is a catch, which may actually be a problem, I don’t know: The new
> >> variant of that .drained_end() does not poll at all, never. As
> >> described above, now every bdrv_drain_invoke() returns an object that
> >> describes when it will be done and which can thus be polled for. These
> >> objects are just discarded when using BdrvChildRole.drained_end(). That
> >> does not feel quite right. It would probably be more correct to let
> >> BdrvChildRole.drained_end() return these objects so the top level
> >> bdrv_drained_end() can poll for their completion.
> >>
> >> I decided not to do this, for two reasons:
> >> (1) Doing so would spill the “list of objects to poll for” design to
> >>     places outside of block/io.c. I don’t like the design very much as
> >>     it is, but I can live with it as long as it’s constrained to the
> >>     core drain code in block/io.c.
> >>     This is made worse by the fact that currently, those objects are of
> >>     type BdrvCoDrainData. But it shouldn’t be a problem to add a new
> >>     type that is externally visible (we only need the AioContext and
> >>     whether bdrv_drain_invoke_entry() is done).
> >>
> >> (2) It seems to work as it is.
> >>
> >> The alternative would be to add the same GSList ** parameter to
> >> BdrvChildRole.drained_end() that I added in the core drain code in patch
> >> 2, and then let the .drained_end() implementation fill that with objects
> >> to poll for. (Which would be accomplished by adding a frontend to
> >> bdrv_do_drained_end() that lets bdrv_child_cb_drained_poll() pass the
> >> parameter through.)
> >>
> >> Opinions?
> >
> > I think I would add an int* to BdrvChildRole.drained_end() so that we
> > can just increase the counter wherever we need to.
>
> So you mean just polling the @bs for which a caller gave poll=true until
> the counter reaches 0? I’ll try, sounds good (if I can get it to work).

Yes, that's what I have in mind. We expect graph changes to happen during
the polling, but I think the caller is responsible for making sure that
the top-level @bs stays around, so we don't need to be extra careful
here.

Also, bdrv_drain_invoke() is always called in the same AioContext as the
top-level drain operation, so the whole aio_context_acquire/release
stuff from this series should become unnecessary, and we don't need
atomics to access the counter either.
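
To make the counter idea a bit more concrete, here is a rough,
self-contained sketch. It is not QEMU code: Node, EventLoop,
schedule_undrain(), event_loop_poll() and the drained_end_counter
parameter name are all hypothetical stand-ins, and the "event loop" is
just a small array of deferred callbacks. The only point is the shape:
the recursive part increments a plain int for every undrain operation it
starts and never polls, and only the top level polls until the counter
is back at 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only, not QEMU's real types. */
#define MAX_CHILDREN 4
#define MAX_PENDING  16

typedef struct Node Node;
struct Node {
    Node *children[MAX_CHILDREN];
    int nb_children;
    bool drained;
};

/* One deferred undrain operation, standing in for whatever
 * bdrv_drain_invoke() would schedule. */
typedef struct {
    Node *node;
    int *drained_end_counter;
} PendingUndrain;

/* Toy single-threaded event loop: a FIFO of deferred operations. */
typedef struct {
    PendingUndrain queue[MAX_PENDING];
    size_t head, tail;
} EventLoop;

/* Start an undrain operation: bump the counter, defer the actual work. */
static void schedule_undrain(EventLoop *loop, Node *node, int *counter)
{
    assert(loop->tail < MAX_PENDING);
    (*counter)++;
    loop->queue[loop->tail++] = (PendingUndrain){ node, counter };
}

/* Run one deferred operation; returns false if nothing was pending. */
static bool event_loop_poll(EventLoop *loop)
{
    if (loop->head == loop->tail) {
        return false;
    }
    PendingUndrain *p = &loop->queue[loop->head++];
    p->node->drained = false;
    /* Plain decrement: everything runs in the same (toy) context. */
    (*p->drained_end_counter)--;
    return true;
}

/* Recursive part: only schedules and counts, never polls. */
static void do_drained_end(EventLoop *loop, Node *node, int *counter)
{
    schedule_undrain(loop, node, counter);
    for (int i = 0; i < node->nb_children; i++) {
        do_drained_end(loop, node->children[i], counter);
    }
}

/* Top level: the only place that polls, until the counter reaches 0. */
static void drained_end(EventLoop *loop, Node *root)
{
    int drained_end_counter = 0;

    do_drained_end(loop, root, &drained_end_counter);
    while (drained_end_counter > 0) {
        if (!event_loop_poll(loop)) {
            break; /* nothing left to run; should not happen here */
        }
    }
}
```

Calling drained_end() on a root with one child should leave both nodes
with drained == false and the toy queue empty. The sketch leaves out
graph changes during polling entirely; it only shows why a counter is
enough when the caller keeps the top-level node alive.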

So I think this should really simplify the series a lot.

Kevin