qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Hanna Reitz <hreitz@redhat.com>
To: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
	qemu-block@nongnu.org
Cc: kwolf@redhat.com, eblake@redhat.com, qemu-devel@nongnu.org,
	nsoffer@redhat.com, nikita.lapshin@virtuozzo.com, den@openvz.org,
	jsnow@redhat.com
Subject: Re: [PATCH v2 1/4] qemu-img: implement compare --stat
Date: Wed, 27 Oct 2021 10:35:44 +0200	[thread overview]
Message-ID: <6b016523-87d5-f17e-8f1c-2e6ccf0dce0b@redhat.com> (raw)
In-Reply-To: <f63b4410-2c3d-d078-26e5-e891bfbb61e5@virtuozzo.com>

On 26.10.21 11:18, Vladimir Sementsov-Ogievskiy wrote:
> 26.10.2021 11:47, Hanna Reitz wrote:
>> On 26.10.21 09:53, Vladimir Sementsov-Ogievskiy wrote:
>>> 25.10.2021 19:40, Hanna Reitz wrote:
>>>> On 21.10.21 12:12, Vladimir Sementsov-Ogievskiy wrote:
>>>>> With new option qemu-img compare will not stop at first mismatch, but
>>>>> instead calculate statistics: how many clusters with different data,
>>>>> how many clusters with equal data, how many clusters were unallocated
>>>>> but become data and so on.
>>>>>
>>>>> We compare images chunk by chunk. Chunk size depends on what
>>>>> block_status returns for both images. It may return less than cluster
>>>>> (remember about qcow2 subclusters), it may return more than 
>>>>> cluster (if
>>>>> several consecutive clusters share same status). Finally images may
>>>>> have different cluster sizes. This all leads to ambiguity in how to
>>>>> finally compare the data.
>>>>>
>>>>> What we can say for sure is that, when we compare two qcow2 images 
>>>>> with
>>>>> same cluster size, we should compare clusters with data separately.
>>>>> Otherwise, if we for example compare 10 consecutive clusters of data
>>>>> where only one byte differs we'll report 10 different clusters.
>>>>> Expected result in this case is 1 different cluster and 9 equal ones.
>>>>>
>>>>> So, to serve this case and just to have some defined rule let's do 
>>>>> the
>>>>> following:
>>>>>
>>>>> 1. Select some block-size for compare procedure. In this commit it 
>>>>> must
>>>>>     be specified by user, next commit will add some automatic 
>>>>> logic and
>>>>>     make --block-size optional.
>>>>>
>>>>> 2. Go chunk-by-chunk using block_status as we do now with only one
>>>>>     differency:
>>>>>     If block_status() returns DATA region that intersects block-size
>>>>>     aligned boundary, crop this region at this boundary.
>>>>>
>>>>> This way it's still possible to compare less than cluster and report
>>>>> subcluster-level accuracy, but we newer compare more than one cluster
>>>>> of data.
>>>>>
>>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy 
>>>>> <vsementsov@virtuozzo.com>
>>>>> ---
>>>>>   docs/tools/qemu-img.rst |  18 +++-
>>>>>   qemu-img.c              | 206 
>>>>> +++++++++++++++++++++++++++++++++++++---
>>>>>   qemu-img-cmds.hx        |   4 +-
>>>>>   3 files changed, 212 insertions(+), 16 deletions(-)
>>>>
>>>> Looks good to me overall!  Just some technical comments below.
>>>>
>>>>> diff --git a/docs/tools/qemu-img.rst b/docs/tools/qemu-img.rst
>>>>> index d58980aef8..21164253d4 100644
>>>>> --- a/docs/tools/qemu-img.rst
>>>>> +++ b/docs/tools/qemu-img.rst
>>>>> @@ -159,6 +159,18 @@ Parameters to compare subcommand:
>>>>>     Strict mode - fail on different image size or sector allocation
>>>>> +.. option:: --stat
>>>>> +
>>>>> +  Instead of exit on first mismatch compare the whole images and 
>>>>> print
>>>>> +  statistics on amount of different pairs of clusters, based on 
>>>>> their
>>>>> +  block-status and are they equal or not.
>>>>
>>>> I’d phrase this as:
>>>>
>>>> Instead of exiting on the first mismatch, compare the whole images 
>>>> and print statistics on how much they differ in terms of block 
>>>> status (i.e. are blocks allocated or not, do they contain data, are 
>>>> they marked as containing only zeroes) and block content (a block 
>>>> of data that contains only zero still has the same content as a 
>>>> marked-zero block).
>>>
>>> For me the rest starting from "and block content" sounds unclear, 
>>> seems doesn't add any information to previous (i.e. are blocks 
>>> allocated ...)
>>
>> By “block content” I meant what you said by “equal or not”, i.e. what 
>> is returned when reading from the block.
>>
>> Now that I think about it again, I believe we should go with your 
>> original “equal or not”, though, because that reflects what qemu-img 
>> --stat prints, like so perhaps:
>>
>> Instead of exiting on the first mismatch, compare the whole images 
>> and print statistics on the amount of different pairs of blocks, 
>> based on their block status and whether they are equal or not.
>
> OK
>
>>
>> I’d still like to add something like what I had in parentheses, 
>> though, because as a user, I’d find the “block status” and “equal or 
>> not” terms to be a bit handwave-y.  I don’t think “block status” is a 
>> common term in our documentation, so I wanted to add some examples; 
>> and I wanted to show by example that “equal” blocks don’t need to 
>> have the same block status.
>
> Actually, I think, that the resulting tool is anyway 
> developer-oriented, to use it you should know what this block status 
> really means.. May be just s/block status/block type/, will it be more 
> human friendly?

Oh, OK.  I think I’d prefer “block status” still, because that’s what we 
use in the code.

Hanna



  reply	other threads:[~2021-10-27  8:37 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-21 10:12 [PATCH v2 0/4] qemu-img compare --stat Vladimir Sementsov-Ogievskiy
2021-10-21 10:12 ` [PATCH v2 1/4] qemu-img: implement " Vladimir Sementsov-Ogievskiy
2021-10-25 16:40   ` Hanna Reitz
2021-10-26  7:53     ` Vladimir Sementsov-Ogievskiy
2021-10-26  8:47       ` Hanna Reitz
2021-10-26  9:18         ` Vladimir Sementsov-Ogievskiy
2021-10-27  8:35           ` Hanna Reitz [this message]
2021-10-21 10:12 ` [PATCH v2 2/4] qemu-img: make --block-size optional for " Vladimir Sementsov-Ogievskiy
2021-10-26  8:07   ` Hanna Reitz
2021-10-28 10:04     ` Vladimir Sementsov-Ogievskiy
2021-10-21 10:12 ` [PATCH v2 3/4] qemu-img: add --shallow option for qemu-img compare Vladimir Sementsov-Ogievskiy
2021-10-26  8:11   ` Hanna Reitz
2021-10-21 10:12 ` [PATCH v2 4/4] iotests: add qemu-img-compare-stat test Vladimir Sementsov-Ogievskiy
2021-10-26  8:26   ` Hanna Reitz

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6b016523-87d5-f17e-8f1c-2e6ccf0dce0b@redhat.com \
    --to=hreitz@redhat.com \
    --cc=den@openvz.org \
    --cc=eblake@redhat.com \
    --cc=jsnow@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=nikita.lapshin@virtuozzo.com \
    --cc=nsoffer@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=vsementsov@virtuozzo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).