On 04/18/2018 09:25 AM, Vladimir Sementsov-Ogievskiy wrote:

>>> Do your code with
>>>
>>>      /* Found an extent, and we're inside it.  */
>>>      *next = f.fe.fe_logical + f.fe.fe_length;
>>>      if (f.fe.fe_flags & FIEMAP_EXTENT_UNWRITTEN) {
>>>          return BDRV_BLOCK_DATA|BDRV_BLOCK_ZERO;
>>>      } else {
>>>          return BDRV_BLOCK_DATA;
>>>      }
>>>
>>> provide safe block_status based on FIEMAP without FLAG_SYNC?
>> No, in fact we found data corruption with FIEMAP.
>
> How to reproduce it? I've tried your code, looks like it shows all
> "data" regions even if I didn't call "sync".
>

There's no easy way to reproduce unsafe data races reliably, but FIEMAP
without sync is such an unsafe data race. Most of the time you will get
the answer you expect, but under the right conditions FIEMAP may report
an area as unallocated even though you have already called write(). If
you then treat that unallocated region as BDRV_BLOCK_ZERO, rather than
read()ing it, you have corrupted data.

That's because FIEMAP only reports what the disk has allocated, but file
systems can have delayed allocations, where contents in the kernel cache
are NOT yet flushed to disk unless you use sync; and using sync kills
performance.

If you want examples of FIEMAP corrupting data, look at the coreutils
archive from several years ago, where FIEMAP without sync caused
corruptions during cp. A quick search found at least this example:
https://lists.gnu.org/archive/html/bug-coreutils/2011-04/msg00023.html

For more details, see qemu commits c4875e5b and 38c4d0a, and discussion
at https://lists.gnu.org/archive/html/qemu-devel/2014-09/msg04921.html

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
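
To make the trade-off above concrete, here is a minimal sketch of the
safe-but-slow variant: query FIEMAP with FIEMAP_FLAG_SYNC, so that
delayed allocations are flushed before the extent map is read, and treat
any failure as "data, go read() it". This is not the QEMU code. The
helper name offset_needs_read() is hypothetical, and the sketch assumes
a Linux system with <linux/fs.h> and <linux/fiemap.h> available.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>      /* FS_IOC_FIEMAP */
#include <linux/fiemap.h>  /* struct fiemap, FIEMAP_FLAG_SYNC */

/*
 * Hypothetical helper (illustration only, not from QEMU).
 *
 * Returns 1 if the byte at 'offset' must be read() to learn its
 * contents, or 0 if FIEMAP (after a sync) reports no extent there.
 * Any ioctl failure is answered conservatively with 1.
 */
static int offset_needs_read(int fd, uint64_t offset)
{
    /* Room for the header plus exactly one extent record. */
    union {
        struct fiemap fm;
        char buf[sizeof(struct fiemap) + sizeof(struct fiemap_extent)];
    } u;
    struct fiemap_extent *fe;

    assert(fd >= 0);

    memset(&u, 0, sizeof(u));
    u.fm.fm_start = offset;
    u.fm.fm_length = UINT64_MAX - offset;
    /* Without this flag, dirty page-cache data may not show up yet. */
    u.fm.fm_flags = FIEMAP_FLAG_SYNC;
    u.fm.fm_extent_count = 1;

    if (ioctl(fd, FS_IOC_FIEMAP, &u.fm) == -1) {
        /* Unsupported or failed: never guess "zero", force a read. */
        return 1;
    }
    if (u.fm.fm_mapped_extents == 0) {
        /* No extent at or after 'offset'. */
        return 0;
    }

    fe = &u.fm.fm_extents[0];
    if (fe->fe_logical > offset) {
        /* The next extent starts later, so 'offset' is in a hole. */
        return 0;
    }

    /*
     * Inside an extent. Even an unwritten extent is reported as
     * "needs read" here, which is slower but can never be wrong.
     */
    return 1;
}
```

A caller would treat a return of 0 as the only case where it may skip
the read() and assume zeroes. Dropping FIEMAP_FLAG_SYNC from this sketch
brings back exactly the race described above.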