On 29.08.19 12:52, Vladimir Sementsov-Ogievskiy wrote:
> Thanks for reviewing!
>
> 28.08.2019 18:59, Max Reitz wrote:
>> On 26.08.19 18:13, Vladimir Sementsov-Ogievskiy wrote:
>>> Split copying code part from backup to "block-copy", including separate
>>> state structure and function renaming. This is needed to share it with
>>> backup-top filter driver in further commits.
>>>
>>> Notes:
>>>
>>> 1. As BlockCopyState keeps own BlockBackend objects, remaining
>>
>> I suppose these should be BdrvChild objects at some point, but doing it
>> now would just mean effectively duplicating code from block-backend.c.
>> (“now” = before we have a backup-top filter to attach the children to.)
>
> How bad is it to not do it, but leave them as block-backends in block-copy
> state? They'll be connected anyway through the job, as they all are in job.nodes.
>
> We have block-backends in jobs currently; is that bad?

Yes. First of all, it’s simply not how it should be. It’s ugly.

Second, it does produce tangible problems. One thing that comes to mind
is that we only need BB.disable_request_queuing because these
BlockBackends do not have a clear connection to the block job, which
means that the job may want to perform requests on drained BBs.

If source and target were children of the filter node, draining them
would first drain the job, and only then would they be marked as
quiesced, thus solving the problem (as far as I remember).

>>> job->common.blk users only use it to get bs by blk_bs() call, so clear
>>> job->common.blk permissions set in block_job_create.
>>>
>>> 2. Rename s/initializing_bitmap/skip_unallocated/ to sound a bit better
>>> as interface to BlockCopyState
>>>
>>> 3. Split is not very clean: there left some duplicated fields, backup
>>
>> Are there any but cluster_size and len (and source, in a sense)?
>
> Seems no more

Good, I was just asking because duplicated fields may be difficult to
keep in sync and so on.

[...]

>>> @@ -99,9 +118,83 @@ static void cow_request_end(CowRequest *req)
>>>       qemu_co_queue_restart_all(&req->wait_queue);
>>>   }
>>>
>>> +static void block_copy_state_free(BlockCopyState *s)
>>> +{
>>> +    if (!s) {
>>> +        return;
>>> +    }
>>> +
>>> +    bdrv_release_dirty_bitmap(blk_bs(s->source), s->copy_bitmap);
>>> +    blk_unref(s->source);
>>> +    s->source = NULL;
>>> +    blk_unref(s->target);
>>> +    s->target = NULL;
>>
>> I’m not quite sure why you NULL these pointers when you free the whole
>> object next anyway.
>
> It is for backup_drain: I'm afraid of some yield during blk_unref (and it seems unsafe
> anyway, as I zero the reference after calling blk_unref). Anyway,
> backup_drain will be dropped in "[PATCH v3] job: drop job_drain", so I'll drop
> the "= NULL" here now and work around backup_drain in backup_clean with a corresponding
> comment.

OK.

[...]

>>> +
>>> +    blk_set_disable_request_queuing(s->source, true);
>>> +    blk_set_allow_aio_context_change(s->source, true);
>>> +    blk_set_disable_request_queuing(s->target, true);
>>> +    blk_set_allow_aio_context_change(s->target, true);
>>
>> Hm. Doesn’t creating new BBs here mean that we have to deal with the
>> fallout of changing the AioContext on either of them somewhere?
>
> In the backup context, the backup job is responsible for keeping the source and target bs
> in the same context, so I think allowing the blk to change aio context and asserting in
> block_copy() that the contexts are the same should be enough for now.

Hm, OK, and the backup job takes care of that through
child_job_set_aio_ctx() in blockjob.c.
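
(Just to spell out what that would look like: the assertion Vladimir
mentions could be as small as the following near the start of
block_copy(), assuming, as in the patch, that it gets the
BlockCopyState as s; blk_get_aio_context() is the existing accessor,
and the exact placement is of course only a sketch:)

    /* Sketch only: check that the job has indeed kept source and
     * target in the same AioContext before copying anything. */
    assert(blk_get_aio_context(s->source) ==
           blk_get_aio_context(s->target));
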
But it should still be noted that on master, if you try to move e.g.
the source to a new context (by attaching a device to it), this
happens:

qemu-system-x86_64: Cannot change iothread of active block backend

This comes from blk_root_can_set_aio_ctx(); but at the same time, it
will happily allow the context change if blk->allow_aio_context_change
is set – which is precisely what you do here.

So with this patch, the change is allowed; but the target is moved to
the new context, too.

So it should be noted that this is a change in behavior. (Well, at
least I wanted to note it here.)

> I'll add a comment on this here.

By the way, this is another problem that we wouldn’t have if source and
target were BdrvChildren of backup-top. (The problem being that the
BBs’ contexts are kept in sync indirectly through the node list
attached to the job.)

Max
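
PS: For what it’s worth, the comment promised above could be as simple
as something along these lines next to the quoted
blk_set_allow_aio_context_change() calls (only a suggestion based on
the discussion in this thread; the wording is of course up to Vladimir):

    /*
     * The backup job keeps source and target in the same AioContext
     * via child_job_set_aio_ctx().  Allowing the context change here
     * means that, unlike on master, attaching a device to the source
     * no longer fails with "Cannot change iothread of active block
     * backend"; the target is simply moved along with it.
     */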