From: Hanna Reitz <hreitz@redhat.com>
To: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
qemu-block@nongnu.org
Cc: kwolf@redhat.com, eblake@redhat.com, qemu-devel@nongnu.org,
nsoffer@redhat.com, nikita.lapshin@virtuozzo.com, den@openvz.org,
jsnow@redhat.com
Subject: Re: [PATCH v2 1/4] qemu-img: implement compare --stat
Date: Wed, 27 Oct 2021 10:35:44 +0200 [thread overview]
Message-ID: <6b016523-87d5-f17e-8f1c-2e6ccf0dce0b@redhat.com> (raw)
In-Reply-To: <f63b4410-2c3d-d078-26e5-e891bfbb61e5@virtuozzo.com>
On 26.10.21 11:18, Vladimir Sementsov-Ogievskiy wrote:
> 26.10.2021 11:47, Hanna Reitz wrote:
>> On 26.10.21 09:53, Vladimir Sementsov-Ogievskiy wrote:
>>> 25.10.2021 19:40, Hanna Reitz wrote:
>>>> On 21.10.21 12:12, Vladimir Sementsov-Ogievskiy wrote:
>>>>> With new option qemu-img compare will not stop at first mismatch, but
>>>>> instead calculate statistics: how many clusters with different data,
>>>>> how many clusters with equal data, how many clusters were unallocated
>>>>> but become data and so on.
>>>>>
>>>>> We compare images chunk by chunk. Chunk size depends on what
>>>>> block_status returns for both images. It may return less than cluster
>>>>> (remember about qcow2 subclusters), it may return more than
>>>>> cluster (if
>>>>> several consecutive clusters share same status). Finally images may
>>>>> have different cluster sizes. This all leads to ambiguity in how to
>>>>> finally compare the data.
>>>>>
>>>>> What we can say for sure is that, when we compare two qcow2 images
>>>>> with
>>>>> same cluster size, we should compare clusters with data separately.
>>>>> Otherwise, if we for example compare 10 consecutive clusters of data
>>>>> where only one byte differs we'll report 10 different clusters.
>>>>> Expected result in this case is 1 different cluster and 9 equal ones.
>>>>>
>>>>> So, to serve this case and just to have some defined rule let's do
>>>>> the
>>>>> following:
>>>>>
>>>>> 1. Select some block-size for compare procedure. In this commit it
>>>>> must
>>>>> be specified by user, next commit will add some automatic
>>>>> logic and
>>>>> make --block-size optional.
>>>>>
>>>>> 2. Go chunk-by-chunk using block_status as we do now with only one
>>>>> differency:
>>>>> If block_status() returns DATA region that intersects block-size
>>>>> aligned boundary, crop this region at this boundary.
>>>>>
>>>>> This way it's still possible to compare less than cluster and report
>>>>> subcluster-level accuracy, but we newer compare more than one cluster
>>>>> of data.
>>>>>
>>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy
>>>>> <vsementsov@virtuozzo.com>
>>>>> ---
>>>>> docs/tools/qemu-img.rst | 18 +++-
>>>>> qemu-img.c | 206
>>>>> +++++++++++++++++++++++++++++++++++++---
>>>>> qemu-img-cmds.hx | 4 +-
>>>>> 3 files changed, 212 insertions(+), 16 deletions(-)
>>>>
>>>> Looks good to me overall! Just some technical comments below.
>>>>
>>>>> diff --git a/docs/tools/qemu-img.rst b/docs/tools/qemu-img.rst
>>>>> index d58980aef8..21164253d4 100644
>>>>> --- a/docs/tools/qemu-img.rst
>>>>> +++ b/docs/tools/qemu-img.rst
>>>>> @@ -159,6 +159,18 @@ Parameters to compare subcommand:
>>>>> Strict mode - fail on different image size or sector allocation
>>>>> +.. option:: --stat
>>>>> +
>>>>> + Instead of exit on first mismatch compare the whole images and
>>>>> print
>>>>> + statistics on amount of different pairs of clusters, based on
>>>>> their
>>>>> + block-status and are they equal or not.
>>>>
>>>> I’d phrase this as:
>>>>
>>>> Instead of exiting on the first mismatch, compare the whole images
>>>> and print statistics on how much they differ in terms of block
>>>> status (i.e. are blocks allocated or not, do they contain data, are
>>>> they marked as containing only zeroes) and block content (a block
>>>> of data that contains only zero still has the same content as a
>>>> marked-zero block).
>>>
>>> For me the rest starting from "and block content" sounds unclear,
>>> seems doesn't add any information to previous (i.e. are blocks
>>> allocated ...)
>>
>> By “block content” I meant what you said by “equal or not”, i.e. what
>> is returned when reading from the block.
>>
>> Now that I think about it again, I believe we should go with your
>> original “equal or not”, though, because that reflects what qemu-img
>> --stat prints, like so perhaps:
>>
>> Instead of exiting on the first mismatch, compare the whole images
>> and print statistics on the amount of different pairs of blocks,
>> based on their block status and whether they are equal or not.
>
> OK
>
>>
>> I’d still like to add something like what I had in parentheses,
>> though, because as a user, I’d find the “block status” and “equal or
>> not” terms to be a bit handwave-y. I don’t think “block status” is a
>> common term in our documentation, so I wanted to add some examples;
>> and I wanted to show by example that “equal” blocks don’t need to
>> have the same block status.
>
> Actually, I think, that the resulting tool is anyway
> developer-oriented, to use it you should know what this block status
> really means.. May be just s/block status/block type/, will it be more
> human friendly?
Oh, OK. I think I’d prefer “block status” still, because that’s what we
use in the code.
Hanna
next prev parent reply other threads:[~2021-10-27 8:37 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-10-21 10:12 [PATCH v2 0/4] qemu-img compare --stat Vladimir Sementsov-Ogievskiy
2021-10-21 10:12 ` [PATCH v2 1/4] qemu-img: implement " Vladimir Sementsov-Ogievskiy
2021-10-25 16:40 ` Hanna Reitz
2021-10-26 7:53 ` Vladimir Sementsov-Ogievskiy
2021-10-26 8:47 ` Hanna Reitz
2021-10-26 9:18 ` Vladimir Sementsov-Ogievskiy
2021-10-27 8:35 ` Hanna Reitz [this message]
2021-10-21 10:12 ` [PATCH v2 2/4] qemu-img: make --block-size optional for " Vladimir Sementsov-Ogievskiy
2021-10-26 8:07 ` Hanna Reitz
2021-10-28 10:04 ` Vladimir Sementsov-Ogievskiy
2021-10-21 10:12 ` [PATCH v2 3/4] qemu-img: add --shallow option for qemu-img compare Vladimir Sementsov-Ogievskiy
2021-10-26 8:11 ` Hanna Reitz
2021-10-21 10:12 ` [PATCH v2 4/4] iotests: add qemu-img-compare-stat test Vladimir Sementsov-Ogievskiy
2021-10-26 8:26 ` Hanna Reitz
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=6b016523-87d5-f17e-8f1c-2e6ccf0dce0b@redhat.com \
--to=hreitz@redhat.com \
--cc=den@openvz.org \
--cc=eblake@redhat.com \
--cc=jsnow@redhat.com \
--cc=kwolf@redhat.com \
--cc=nikita.lapshin@virtuozzo.com \
--cc=nsoffer@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=vsementsov@virtuozzo.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).