qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
To: Max Reitz <mreitz@redhat.com>, qemu-block@nongnu.org
Cc: qemu-devel@nongnu.org, kwolf@redhat.com, fam@euphon.net,
	stefanha@redhat.com, armbru@redhat.com, jsnow@redhat.com,
	libvir-list@redhat.com, eblake@redhat.com, den@openvz.org,
	vsementsov@virtuozzo.com
Subject: Re: [PATCH v11 13/13] block: apply COR-filter to block-stream jobs
Date: Thu, 15 Oct 2020 20:16:40 +0300	[thread overview]
Message-ID: <fd2d9c80-c238-e635-d12c-f2d41b5e3dbc@virtuozzo.com> (raw)
In-Reply-To: <6074273c-088f-069b-2a6c-aee812c3bad6@redhat.com>

On 14.10.2020 19:24, Max Reitz wrote:
> On 12.10.20 19:43, Andrey Shinkevich wrote:

[...]

>> ---
>>   block/stream.c             | 93 +++++++++++++++++++++++++++++-----------------
>>   tests/qemu-iotests/030     | 51 +++----------------------
>>   tests/qemu-iotests/030.out |  4 +-
>>   tests/qemu-iotests/141.out |  2 +-
>>   tests/qemu-iotests/245     | 19 +++++++---
>>   5 files changed, 81 insertions(+), 88 deletions(-)
> 
> Looks like stream_run() could be a bit streamlined now (the allocation
> checking should be unnecessary, unconditionally calling
> stream_populate() should be sufficient), but not necessary now.
> 

That is what I had kept in my mind when I tackled this patch. But there 
is an underwater reef to streamline. Namely, how the block-stream job 
gets known about a long unallocated tail to exit the loop earlier in the 
stream_run(). Shall we return the '-EOF' or another error code from the 
cor_co_preadv_part() to be handled by the stream_run()? Any other 
suggestions, if any, will be appreciated.

>> diff --git a/block/stream.c b/block/stream.c
>> index d3e1812..93564db 100644
>> --- a/block/stream.c
>> +++ b/block/stream.c
> 
> [...]

>> +
>> +    cor_filter_bs = bdrv_cor_filter_append(bs, opts, BDRV_O_RDWR, errp);
>> +    if (cor_filter_bs == NULL) {
>> +        goto fail;
>> +    }
>> +
>> +    if (bdrv_freeze_backing_chain(cor_filter_bs, bs, errp) < 0) {
> 
> Is there a reason why we can’t combine this with the
> bdrv_free_backing_chain() from bs down to above_base?  I mean, the
> effect should be the same, just asking.
> 

The bdrv_freeze_backing_chain(bs, above_base, errp) is called before the 
bdrv_reopen_set_read_only() to keep the backing chain safe during the 
context switch. Then we will want to freeze the 'COR -> TOP BS' link as 
well. Freezing/unfreezing parts is simlier to manage than doing that 
with the whole chain.
If we decide to invoke the bdrv_reopen_set_read_only() after freezing 
the backing chain together with the COR-filter, we will not be able to 
get the 'write' permission on the read-only node.


>> +        bdrv_cor_filter_drop(cor_filter_bs);
>> +        cor_filter_bs = NULL;
>> +        goto fail;
>> +    }
>> +
>> +    s = block_job_create(job_id, &stream_job_driver, NULL, cor_filter_bs,
>> +                         BLK_PERM_CONSISTENT_READ,
>> +                         basic_flags | BLK_PERM_WRITE | BLK_PERM_GRAPH_MOD,
> 
> Not that I’m an expert on the GRAPH_MOD permission, but why is this
> shared here but not below?  Shouldn’t it be the same in both cases?
> (Same for taking it as a permission.)
> 

When we invoke the block_job_add_bdrv(&s->common, "active node", bs,..) 
below (particularly, we need it to block the operations on the top node, 
bdrv_op_block_all()), we ask for the GRAPH_MOD permission for the top 
node. To allow that, the parent filter node should share that permission 
for the underlying node. Otherwise, we get assertion failed in the 
bdrv_check_update_perm() called from bdrv_replace_node() when we remove 
the filter.

>>                            speed, creation_flags, NULL, NULL, errp);
>>       if (!s) {
>>           goto fail;
>>       }
>>   
>> +    /*
>> +     * Prevent concurrent jobs trying to modify the graph structure here, we
>> +     * already have our own plans. Also don't allow resize as the image size is
>> +     * queried only at the job start and then cached.
>> +     */
>> +    if (block_job_add_bdrv(&s->common, "active node", bs,
>> +                           basic_flags | BLK_PERM_GRAPH_MOD,
>> +                           basic_flags | BLK_PERM_WRITE, &error_abort)) {
>> +        goto fail;
>> +    }
>> +
>>       /* Block all intermediate nodes between bs and base, because they will
>>        * disappear from the chain after this operation. The streaming job reads
>>        * every block only once, assuming that it doesn't change, so forbid writes
> 
> [...]
> 
>> diff --git a/tests/qemu-iotests/245 b/tests/qemu-iotests/245
>> index e60c832..940e85a 100755
>> --- a/tests/qemu-iotests/245
>> +++ b/tests/qemu-iotests/245
>> @@ -899,17 +899,26 @@ class TestBlockdevReopen(iotests.QMPTestCase):
>>           # make hd1 read-only and block-stream requires it to be read-write
>>           # (Which error message appears depends on whether the stream job is
>>           # already done with copying at this point.)
> 
> Hm.  Let’s look at the set of messages below... [1]
> 
>> -        self.reopen(opts, {},
>> +        # As the COR-filter node is inserted into the backing chain with the
>> +        # 'block-stream' operation, we move the options to their proper nodes.
>> +        opts = hd_opts(1)
> 
> Oh, so this patch changes it so that only the subtree below hd1 is
> reopened, and we don’t have to deal with the filter options.  Got it.
> (I think.)
> 

Yes, that's right.

>> +        opts['backing'] = hd_opts(2)
>> +        opts['backing']['backing'] = None
>> +        self.reopen(opts, {'read-only': True},
>>               ["Can't set node 'hd1' to r/o with copy-on-read enabled",
> 
> [1]
> 
> This isn’t done anymore as of this patch.  So I don’t think this error
> message can still appear.  Will some other message appear in its stead,
> or is it always going to be the second one?
> 

The only second message appears in the test case when I run it on my 
node. So, I will remove the first one as the bdrv_enable_copy_on_read() 
is not called for the top BS on the frozen backing chain anymore.
Also, I will delet the part of the comment:
"(Which error message appears depends on whether the stream job is 
already done with copying at this point.)"

>>                "Cannot make block node read-only, there is a writer on it"])
>>   
>>           # We can't remove hd2 while the stream job is ongoing
>> -        opts['backing']['backing'] = None
>> -        self.reopen(opts, {'backing.read-only': False}, "Cannot change 'backing' link from 'hd1' to 'hd2'")
>> +        opts['backing'] = None
>> +        self.reopen(opts, {'read-only': False},
>> +                    "Cannot change 'backing' link from 'hd1' to 'hd2'")
>>   
>> -        # We can detach hd1 from hd0 because it doesn't affect the stream job
>> +        # We can't detach hd1 from hd0 because there is the COR-filter implicit
>> +        # node in between.
>> +        opts = hd_opts(0)
>>           opts['backing'] = None
>> -        self.reopen(opts)
>> +        self.reopen(opts, {},
>> +                    "Cannot change backing link if 'hd0' has an implicit backing file")
> 
> Does “has an implicit backing file” mean that hd0 has an implicit node
> (the COR filter) as its backing file?  And then reopening isn’t allowed
> because the user supposedly doesn’t know about that implicit node?  If
> so, makes sense.

Yes, it is.

Andrey

> 
> Max
> 
>>   
>>           self.vm.run_job('stream0', auto_finalize = False, auto_dismiss = True)
>>   
>>
> 
> 


  reply	other threads:[~2020-10-15 17:19 UTC|newest]

Thread overview: 56+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-12 17:43 [PATCH v11 00/13] Apply COR-filter to the block-stream permanently Andrey Shinkevich via
2020-10-12 17:43 ` [PATCH v11 01/13] copy-on-read: Support preadv/pwritev_part functions Andrey Shinkevich via
2020-10-12 17:43 ` [PATCH v11 02/13] copy-on-read: add filter append/drop functions Andrey Shinkevich via
2020-10-14 10:44   ` Max Reitz
2020-10-14 14:28     ` Andrey Shinkevich
2020-10-14 16:26       ` Max Reitz
2020-10-12 17:43 ` [PATCH v11 03/13] qapi: add filter-node-name to block-stream Andrey Shinkevich via
2020-10-12 17:43 ` [PATCH v11 04/13] copy-on-read: pass overlay base node name to COR driver Andrey Shinkevich via
2020-10-14 11:09   ` Max Reitz
2020-10-14 11:57     ` Max Reitz
2020-10-14 14:56       ` Vladimir Sementsov-Ogievskiy
2020-10-14 16:27         ` Max Reitz
2020-10-14 16:08     ` Andrey Shinkevich
2020-10-14 16:18       ` Vladimir Sementsov-Ogievskiy
2020-10-14 16:36       ` Max Reitz
2020-10-12 17:43 ` [PATCH v11 05/13] copy-on-read: limit COR operations to base in " Andrey Shinkevich via
2020-10-14 11:59   ` Max Reitz
2020-10-14 17:43     ` Andrey Shinkevich
2020-10-14 12:01   ` Max Reitz
2020-10-14 18:57     ` Andrey Shinkevich
2020-10-15 15:56       ` Max Reitz
2020-10-15 17:37         ` Andrey Shinkevich
2020-10-16 14:28           ` Vladimir Sementsov-Ogievskiy
2020-10-12 17:43 ` [PATCH v11 06/13] block: modify the comment for BDRV_REQ_PREFETCH flag Andrey Shinkevich via
2020-10-14 12:22   ` Max Reitz
2020-10-14 15:04     ` Vladimir Sementsov-Ogievskiy
2020-10-14 19:57     ` Andrey Shinkevich
2020-10-12 17:43 ` [PATCH v11 07/13] block: include supported_read_flags into BDS structure Andrey Shinkevich via
2020-10-14 12:31   ` Max Reitz
2020-10-12 17:43 ` [PATCH v11 08/13] copy-on-read: add support for BDRV_REQ_PREFETCH to COR-filter Andrey Shinkevich via
2020-10-14 12:40   ` Max Reitz
2020-10-12 17:43 ` [PATCH v11 09/13] copy-on-read: skip non-guest reads if no copy needed Andrey Shinkevich via
2020-10-14 12:51   ` Max Reitz
2020-10-14 15:22     ` Vladimir Sementsov-Ogievskiy
2020-10-14 16:30       ` Max Reitz
2020-10-14 16:39         ` Vladimir Sementsov-Ogievskiy
2020-10-15 15:49           ` Max Reitz
2020-10-21 20:43       ` Andrey Shinkevich
2020-10-22  7:50         ` Andrey Shinkevich
2020-10-22  8:56           ` Vladimir Sementsov-Ogievskiy
2020-10-14 20:49     ` Andrey Shinkevich
2020-10-12 17:43 ` [PATCH v11 10/13] stream: skip filters when writing backing file name to QCOW2 header Andrey Shinkevich via
2020-10-14 15:02   ` Max Reitz
2020-10-14 15:40     ` Vladimir Sementsov-Ogievskiy
2020-10-12 17:43 ` [PATCH v11 11/13] stream: mark backing-file argument as deprecated Andrey Shinkevich via
2020-10-14 15:03   ` Max Reitz
2020-10-14 15:43     ` Vladimir Sementsov-Ogievskiy
2020-10-15  9:01       ` Andrey Shinkevich
2020-10-12 17:43 ` [PATCH v11 12/13] stream: remove unused backing-file name parameter Andrey Shinkevich via
2020-10-14 15:05   ` Max Reitz
2020-10-12 17:43 ` [PATCH v11 13/13] block: apply COR-filter to block-stream jobs Andrey Shinkevich via
2020-10-14 16:24   ` Max Reitz
2020-10-15 17:16     ` Andrey Shinkevich [this message]
2020-10-16 15:06       ` Andrey Shinkevich
2020-10-16 15:45       ` Vladimir Sementsov-Ogievskiy
2020-10-20 19:43         ` Andrey Shinkevich

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=fd2d9c80-c238-e635-d12c-f2d41b5e3dbc@virtuozzo.com \
    --to=andrey.shinkevich@virtuozzo.com \
    --cc=armbru@redhat.com \
    --cc=den@openvz.org \
    --cc=eblake@redhat.com \
    --cc=fam@euphon.net \
    --cc=jsnow@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=libvir-list@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    --cc=vsementsov@virtuozzo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).