All of lore.kernel.org
 help / color / mirror / Atom feed
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
To: Eric Blake <eblake@redhat.com>, Max Reitz <mreitz@redhat.com>,
	qemu-devel@nongnu.org
Cc: Kevin Wolf <kwolf@redhat.com>, Fam Zheng <fam@euphon.net>,
	"open list:Sheepdog" <sheepdog@lists.wpkg.org>,
	qemu-block@nongnu.org, Jeff Cody <codyprime@gmail.com>,
	Stefan Weil <sw@weilnetz.de>, Peter Lieven <pl@kamp.de>,
	"Richard W.M. Jones" <rjones@redhat.com>,
	Markus Armbruster <armbru@redhat.com>,
	david.edmondson@oracle.com, Stefan Hajnoczi <stefanha@redhat.com>,
	"Denis V. Lunev" <den@openvz.org>,
	Liu Yuan <namei.unix@gmail.com>,
	Jason Dillaman <dillaman@redhat.com>
Subject: Re: [PATCH 09/17] block: Refactor bdrv_has_zero_init{,_truncate}
Date: Wed, 5 Feb 2020 17:25:37 +0300	[thread overview]
Message-ID: <c63179c1-671c-20fe-0dbc-4cc570e026a7@virtuozzo.com> (raw)
In-Reply-To: <945a62b8-6b50-70bf-bb7e-daacc00d5e43@redhat.com>

05.02.2020 17:07, Eric Blake wrote:
> On 2/5/20 1:51 AM, Vladimir Sementsov-Ogievskiy wrote:
> 
>>>>> +typedef enum {
>>>>> +    /*
>>>>> +     * bdrv_known_zeroes() should include this bit if the contents of
>>>>> +     * a freshly-created image with no backing file reads as all
>>>>> +     * zeroes without any additional effort.  If .bdrv_co_truncate is
>>>>> +     * set, then this must be clear if BDRV_ZERO_TRUNCATE is clear.
>>>>
>>>> I understand that this is preexisting logic, but could I ask: why?
>>>> What's wrong
>>>> if driver can guarantee that created file is all-zero, but is not sure
>>>> about
>>>> file resizing? I agree that it's normal for these flags to have the same
>>>> value,
>>>> but what is the reason for this restriction?..
>>>
>>> If areas added by truncation (or growth, rather) are always zero, then
>>> the file can always be created with size 0 and grown from there.  Thus,
>>> images where truncation adds zeroed areas will generally always be zero
>>> after creation.
>>
>> This means, that if truncation bit is set, than create bit should be set.. But
>> here we say that if truncation is clear, than create bit must be clear.
> 
> Max, did we get the logic backwards?
> 
>>
>>>
>>>> So, the only possible combination of flags, when they differs, is
>>>> create=0 and
>>>> truncate=1.. How is it possible?
>>>
>>> For preallocated qcow2 images, it depends on the storage whether they
>>> are actually 0 after creation.  Hence qcow2_has_zero_init() then defers
>>> to bdrv_has_zero_init() of s->data_file->bs.
>>>
>>> But when you truncate them (with PREALLOC_MODE_OFF, as
>>> BlockDriver.bdrv_has_zero_init_truncate()’s comment explains), the new
>>> area is always going to be 0, regardless of initial preallocation.
>>
>> ah yes, due to qcow2 zero clusters.
> 
> Hmm. Do we actually set the zero flag on unallocated clusters when resizing a qcow2 image?  That would be an O(n) operation (we have to visit the L2 entry for each added cluster, even if only to set the zero cluster bit).  Or do we instead just rely on the fact that qcow2 is inherently sparse, and that when you resize the guest-visible size without writing any new clusters, then it is only subsequent guest access to those addresses that finally allocate clusters, making resize O(1) (update the qcow2 metadata cluster, but not any L2 tables) while still reading 0 from the new data.  To some extent, that's what the allocation mode is supposed to control.

We must mark as ZERO new cluster at least if there is a _larger_ backing file, to prevent data from backing file become available for the guest. But we don't do it. It's a bug and there is fixing series from Kevin, I've just pinged it:
"[PATCH for-4.2? v3 0/8] block: Fix resize (extending) of short overlays"

> 
> What about with external data images, where a resize in guest-visible length requires a resize of the underlying data image?  There, we DO have to worry about whether the data image resizes with zeroes (as in the filesystem) or with random data (as in a block device).
> 


-- 
Best regards,
Vladimir


  reply	other threads:[~2020-02-05 14:26 UTC|newest]

Thread overview: 73+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-31 17:44 [PATCH 00/17] Improve qcow2 all-zero detection Eric Blake
2020-01-31 17:44 ` [PATCH 01/17] qcow2: Comment typo fixes Eric Blake
2020-02-04 14:12   ` Vladimir Sementsov-Ogievskiy
2020-02-09 19:34   ` Alberto Garcia
2020-01-31 17:44 ` [PATCH 02/17] qcow2: List autoclear bit names in header Eric Blake
2020-02-04 14:26   ` Vladimir Sementsov-Ogievskiy
2020-01-31 17:44 ` [PATCH 03/17] qcow2: Avoid feature name extension on small cluster size Eric Blake
2020-02-04 14:39   ` Vladimir Sementsov-Ogievskiy
2020-02-09 19:28   ` Alberto Garcia
2020-01-31 17:44 ` [PATCH 04/17] block: Improve documentation of .bdrv_has_zero_init Eric Blake
2020-02-04 15:03   ` Vladimir Sementsov-Ogievskiy
2020-02-04 15:16     ` Eric Blake
2020-01-31 17:44 ` [PATCH 05/17] block: Don't advertise zero_init_truncate with encryption Eric Blake
2020-02-10 18:12   ` Alberto Garcia
2020-01-31 17:44 ` [PATCH 06/17] block: Improve bdrv_has_zero_init_truncate with backing file Eric Blake
2020-02-10 18:13   ` Alberto Garcia
2020-01-31 17:44 ` [PATCH 07/17] gluster: Drop useless has_zero_init callback Eric Blake
2020-02-04 15:06   ` Vladimir Sementsov-Ogievskiy
2020-02-10 18:21   ` Alberto Garcia
2020-02-17  8:06   ` [GEDI] " Niels de Vos
2020-02-17 12:03     ` Eric Blake
2020-02-17 12:22       ` Eric Blake
2020-02-17 14:01       ` Niels de Vos
2020-01-31 17:44 ` [PATCH 08/17] sheepdog: Consistently set bdrv_has_zero_init_truncate Eric Blake
2020-02-04 15:09   ` Vladimir Sementsov-Ogievskiy
2020-01-31 17:44 ` [PATCH 09/17] block: Refactor bdrv_has_zero_init{,_truncate} Eric Blake
2020-02-04 15:35   ` Vladimir Sementsov-Ogievskiy
2020-02-04 15:49     ` Eric Blake
2020-02-04 16:07       ` Vladimir Sementsov-Ogievskiy
2020-02-04 17:42     ` Max Reitz
2020-02-04 17:51       ` Eric Blake
2020-02-05 16:43         ` Max Reitz
2020-02-05  7:51       ` Vladimir Sementsov-Ogievskiy
2020-02-05 14:07         ` Eric Blake
2020-02-05 14:25           ` Vladimir Sementsov-Ogievskiy [this message]
2020-02-05 14:36             ` Eric Blake
2020-02-05 17:55           ` Max Reitz
2020-02-04 17:53   ` Max Reitz
2020-02-04 19:03     ` Eric Blake
2020-02-05 17:22       ` Max Reitz
2020-02-05 18:39         ` Eric Blake
2020-02-06  9:18           ` Max Reitz
2020-01-31 17:44 ` [PATCH 10/17] block: Add new BDRV_ZERO_OPEN flag Eric Blake
2020-01-31 18:03   ` Eric Blake
2020-02-04 17:34   ` Max Reitz
2020-02-04 17:50     ` Eric Blake
2020-02-05  8:39       ` Vladimir Sementsov-Ogievskiy
2020-02-05 17:26       ` Max Reitz
2020-01-31 17:44 ` [PATCH 11/17] file-posix: Support BDRV_ZERO_OPEN Eric Blake
2020-01-31 17:44 ` [PATCH 12/17] gluster: " Eric Blake
2020-02-17  8:16   ` [GEDI] " Niels de Vos
2020-01-31 17:44 ` [PATCH 13/17] qcow2: Add new autoclear feature for all zero image Eric Blake
2020-02-03 17:45   ` Vladimir Sementsov-Ogievskiy
2020-02-04 13:12     ` Eric Blake
2020-02-04 13:29       ` Vladimir Sementsov-Ogievskiy
2020-01-31 17:44 ` [PATCH 14/17] qcow2: Expose all zero bit through .bdrv_known_zeroes Eric Blake
2020-01-31 17:44 ` [PATCH 15/17] qcow2: Implement all-zero autoclear bit Eric Blake
2020-01-31 17:44 ` [PATCH 16/17] iotests: Add new test for qcow2 all-zero bit Eric Blake
2020-01-31 17:44 ` [PATCH 17/17] qcow2: Let qemu-img check cover " Eric Blake
2020-02-04 17:32 ` [PATCH 00/17] Improve qcow2 all-zero detection Max Reitz
2020-02-04 18:53   ` Eric Blake
2020-02-05 17:04     ` Max Reitz
2020-02-05 19:21       ` Eric Blake
2020-02-06  9:12         ` Max Reitz
2020-02-05  9:04 ` Vladimir Sementsov-Ogievskiy
2020-02-05  9:25   ` Vladimir Sementsov-Ogievskiy
2020-02-05 14:26     ` Eric Blake
2020-02-05 14:47       ` Vladimir Sementsov-Ogievskiy
2020-02-05 15:14         ` Vladimir Sementsov-Ogievskiy
2020-02-05 17:58           ` Max Reitz
2020-02-05 14:22   ` Eric Blake
2020-02-05 14:43     ` Vladimir Sementsov-Ogievskiy
2020-02-05 14:58       ` Vladimir Sementsov-Ogievskiy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c63179c1-671c-20fe-0dbc-4cc570e026a7@virtuozzo.com \
    --to=vsementsov@virtuozzo.com \
    --cc=armbru@redhat.com \
    --cc=codyprime@gmail.com \
    --cc=david.edmondson@oracle.com \
    --cc=den@openvz.org \
    --cc=dillaman@redhat.com \
    --cc=eblake@redhat.com \
    --cc=fam@euphon.net \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=namei.unix@gmail.com \
    --cc=pl@kamp.de \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=rjones@redhat.com \
    --cc=sheepdog@lists.wpkg.org \
    --cc=stefanha@redhat.com \
    --cc=sw@weilnetz.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.