On 04.02.20 18:50, Eric Blake wrote: > On 2/4/20 11:34 AM, Max Reitz wrote: > >>> +++ b/include/block/block.h >>> @@ -105,6 +105,16 @@ typedef enum { >>>        * for drivers that set .bdrv_co_truncate. >>>        */ >>>       BDRV_ZERO_TRUNCATE      = 0x2, >>> + >>> +    /* >>> +     * bdrv_known_zeroes() should include this bit if an image is >>> +     * known to read as all zeroes when first opened; this bit should >>> +     * not be relied on after any writes to the image. >> >> Is there a good reason for this?  Because to me this screams like we are >> going to check this flag without ensuring that the image has actually >> not been written to yet.  So if it’s generally easy for drivers to stop >> reporting this flag after a write, then maybe we should do so. > > In patch 15 (implementing things in qcow2), I actually wrote the driver > to return live results, rather than just open-time results, in part > because writing the bit to persistent storage in qcow2 means that the > bit must be accurate, without relying on the block layer's help. > > But my pending NBD patch (not posted yet, but will be soon), the > proposal I'm making for the NBD protocol itself is just open-time, not > live, and so it would be more work than necessary to make the NBD driver > report live results. > > But it seems like it should be easy enough to also patch the block layer > itself to guarantee that callers of bdrv_known_zeroes() cannot see this > bit set if the block layer has been used in any non-zero transaction, by > repeating the same logic as used in qcow2 to kill the bit (any > write/write_compressed/bdrv_copy clear the bit, any trim clears the bit > if the driver does not guarantee trim reads as zero, any truncate clears > the bit if the driver does not guarantee truncate reads as zero, etc). > Basically, the block layer would cache the results of .bdrv_known_zeroes > during .bdrv_co_open, bdrv_co_pwrite() and friends would update that > cache, and and bdrv_known_zeroes() would report the cached value rather > than a fresh call to .bdrv_known_zeroes. Sounds reasonable to me in generaly, but I’d prefer it to be fetched on-demand rather than unconditionally in bdrv_open(). (I realize that this means a tri-state of “known false”, “known true”, and “not yet queried”.) > Are we worried enough about clients of this interface to make the block > layer more robust?  (From the maintenance standpoint, the more the block > layer guarantees, the easier it is to write code that uses the block > layer; but there is the counter-argument that making the block layer > track whether an image has been modified means a [slight] penalty to > every write request to update the boolean.) Just like Vladimir, I’m worried about repeating the same mistakes we have before: That is, most places that called bdrv_has_zero_init() just did so out of wishful thinking, hoping that it would do what they need it to. It didn’t. Max