On 05.02.20 15:07, Eric Blake wrote: > On 2/5/20 1:51 AM, Vladimir Sementsov-Ogievskiy wrote: > >>>>> +typedef enum { >>>>> +    /* >>>>> +     * bdrv_known_zeroes() should include this bit if the contents of >>>>> +     * a freshly-created image with no backing file reads as all >>>>> +     * zeroes without any additional effort.  If .bdrv_co_truncate is >>>>> +     * set, then this must be clear if BDRV_ZERO_TRUNCATE is clear. >>>> >>>> I understand that this is preexisting logic, but could I ask: why? >>>> What's wrong >>>> if driver can guarantee that created file is all-zero, but is not sure >>>> about >>>> file resizing? I agree that it's normal for these flags to have the >>>> same >>>> value, >>>> but what is the reason for this restriction?.. >>> >>> If areas added by truncation (or growth, rather) are always zero, then >>> the file can always be created with size 0 and grown from there.  Thus, >>> images where truncation adds zeroed areas will generally always be zero >>> after creation. >> >> This means, that if truncation bit is set, than create bit should be >> set.. But >> here we say that if truncation is clear, than create bit must be clear. > > Max, did we get the logic backwards? Or maybe my explanation was just wrong. Because nobody actually forces a driver to use truncate to ensure that an newly created file will be 0. Hm. And more importantly, you can’t use truncate with PREALLOC_MODE_OFF when you want to create an image with preallocation. Let’s see. The offending commit message says: > No .bdrv_has_zero_init() implementation returns 1 if growing the file > would add non-zero areas (at least with PREALLOC_MODE_OFF), so using it > in lieu of this new function was always safe. > > But on the other hand, it is possible that growing an image that is not > zero-initialized would still add a zero-initialized area, like when > using nonpreallocating truncation on a preallocated image. For callers > that care only about truncation, not about creation with potential > preallocation, this new function is useful. So I suppose the explanation is just the preallocation mode alone; has_zero_init() is for the image’s actual preallocation mode, whereas has_zero_init_truncate() is forced to PREALLOC_MODE_OFF. As such, the latter is less strict than the former. So the former cannot be true when the latter is false. >>>> So, the only possible combination of flags, when they differs, is >>>> create=0 and >>>> truncate=1.. How is it possible? >>> >>> For preallocated qcow2 images, it depends on the storage whether they >>> are actually 0 after creation.  Hence qcow2_has_zero_init() then defers >>> to bdrv_has_zero_init() of s->data_file->bs. >>> >>> But when you truncate them (with PREALLOC_MODE_OFF, as >>> BlockDriver.bdrv_has_zero_init_truncate()’s comment explains), the new >>> area is always going to be 0, regardless of initial preallocation. >> >> ah yes, due to qcow2 zero clusters. > > Hmm. Do we actually set the zero flag on unallocated clusters when > resizing a qcow2 image? No. They are just unallocated, i.e. zero. (Nodes with backing files never return true for bdrv_has_zero_init_truncate anyway). > That would be an O(n) operation (we have to > visit the L2 entry for each added cluster, even if only to set the zero > cluster bit).  Or do we instead just rely on the fact that qcow2 is > inherently sparse, and that when you resize the guest-visible size > without writing any new clusters, then it is only subsequent guest > access to those addresses that finally allocate clusters, making resize > O(1) (update the qcow2 metadata cluster, but not any L2 tables) while > still reading 0 from the new data.  To some extent, that's what the > allocation mode is supposed to control. > > What about with external data images, where a resize in guest-visible > length requires a resize of the underlying data image?  There, we DO > have to worry about whether the data image resizes with zeroes (as in > the filesystem) or with random data (as in a block device). Well, partially: Namely, only with data_file_raw. Because otherwise the clusters are still unallocated and thus read as zero. So yes, then we do have to worry about that. With data_file_raw, we have an obligation to make the data file return the same data as the qcow2 file, so, um. I wonder whether we actually take any care of this yet. If you have some external data file without zero_init(_truncate), do get zeroes when reading from the qcow2 node, but non-zeroes when reading from the raw data file? That would be OK without data_file_raw, but not with it. I suppose I’ll have to test it. Max