On 12.08.19 22:26, John Snow wrote:
> 
> 
> On 7/25/19 11:57 AM, Max Reitz wrote:
>> Compressed writes generally have to write full clusters, not just in
>> theory but also in practice when it comes to vmdk's streamOptimized
>> subformat.  It currently is just silently broken for writes with
>> non-zero in-cluster offsets:
>> 
>> $ qemu-img create -f vmdk -o subformat=streamOptimized foo.vmdk 1M
>> $ qemu-io -c 'write 4k 4k' -c 'read 4k 4k' foo.vmdk
>> wrote 4096/4096 bytes at offset 4096
>> 4 KiB, 1 ops; 00.01 sec (443.724 KiB/sec and 110.9309 ops/sec)
>> read failed: Invalid argument
>> 
>> (The technical reason is that vmdk_write_extent() just writes the
>> incomplete compressed data actually to offset 4k.  When reading the
>> data, vmdk_read_extent() looks at offset 0 and finds the compressed
>> data size to be 0, because that is what it reads from there.  This
>> yields an error.)
>> 
>> For incomplete writes with zero in-cluster offsets, the error path
>> when reading the rest of the cluster is a bit different, but the
>> result is the same:
>> 
>> $ qemu-img create -f vmdk -o subformat=streamOptimized foo.vmdk 1M
>> $ qemu-io -c 'write 0k 4k' -c 'read 4k 4k' foo.vmdk
>> wrote 4096/4096 bytes at offset 0
>> 4 KiB, 1 ops; 00.01 sec (362.641 KiB/sec and 90.6603 ops/sec)
>> read failed: Invalid argument
>> 
>> (Here, vmdk_read_extent() finds the data and then sees that the
>> uncompressed data is short.)
>> 
>> It is better to reject invalid writes than to make the user believe
>> they might have succeeded and then fail when trying to read it back.
>> 
>> Signed-off-by: Max Reitz
>> ---
>>  block/vmdk.c | 10 ++++++++++
>>  1 file changed, 10 insertions(+)
>> 
>> diff --git a/block/vmdk.c b/block/vmdk.c
>> index db6acfc31e..641acacfe0 100644
>> --- a/block/vmdk.c
>> +++ b/block/vmdk.c
>> @@ -1731,6 +1731,16 @@ static int vmdk_write_extent(VmdkExtent *extent, int64_t cluster_offset,
>>      if (extent->compressed) {
>>          void *compressed_data;
>>  
>> +        /* Only whole clusters */
>> +        if (offset_in_cluster ||
>> +            n_bytes > (extent->cluster_sectors * SECTOR_SIZE) ||
>> +            (n_bytes < (extent->cluster_sectors * SECTOR_SIZE) &&
>> +             offset + n_bytes != extent->end_sector * SECTOR_SIZE))
>> +        {
>> +            ret = -EINVAL;
>> +            goto out;
>> +        }
>> +
>>          if (!extent->has_marker) {
>>              ret = -EINVAL;
>>              goto out;
>> 
> 
> What does this look like from a guest's perspective? Is there something
> that enforces the alignment in the graph for us?
> 
> Or is it the case that indeed guests (or users via qemu-io) can request
> invalid writes and we will halt the VM in those cases (in preference to
> corrupting the disk)?

Have you ever tried using a streamOptimized VMDK disk with a guest?  I
haven't, but I know that it won't work. O:-)

If you try to write to an already allocated cluster, you'll get an EIO
and an error message via error_report() ("Could not write to allocated
cluster for streamOptimized").

So really, the only use of streamOptimized is as a qemu-img convert
source/target, or as a backup/mirror target.  (Just like compressed
clusters in qcow2 images.)

I suppose if I introduced streamOptimized support today, I wouldn't just
forward vmdk_co_pwritev_compressed() to vmdk_co_pwritev(), but instead
make vmdk_co_pwritev_compressed() only work on streamOptimized images,
and vmdk_co_pwritev() only on everything else.  Then it would be
clearer.

Hm.  In fact, that's a bug, isn't it?  vmdk will accept compressed
writes for any subformat, even if it doesn't support compression.
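The guard I have in mind would be roughly this (just an untested sketch;
vmdk_all_extents_compressed() is a helper I'm making up here purely for
illustration, and the existing special cases in the compressed path
would of course stay):

static bool vmdk_all_extents_compressed(BDRVVmdkState *s)
{
    int i;

    /* "streamOptimized" in practice means every extent is compressed */
    for (i = 0; i < s->num_extents; i++) {
        if (!s->extents[i].compressed) {
            return false;
        }
    }
    return s->num_extents > 0;
}

static int coroutine_fn
vmdk_co_pwritev_compressed(BlockDriverState *bs, uint64_t offset,
                           uint64_t bytes, QEMUIOVector *qiov)
{
    BDRVVmdkState *s = bs->opaque;

    if (!vmdk_all_extents_compressed(s)) {
        /* Right now such a write just ends up uncompressed on disk;
         * better to reject it outright. */
        return -ENOTSUP;
    }

    /* ... existing special cases (e.g. the zero-length EOF write) ... */

    return vmdk_co_pwritev(bs, offset, bytes, qiov, 0);
}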
Right now, if you use -c and convert to vmdk, it will succeed, but the
result just won't be compressed.

It's also a bit weird to accept normal writes for streamOptimized, but
I'm not sure whether that's really a bug.  In any case, changing this
behavior would not be backwards-compatible...

Should we deprecate normal writes to streamOptimized?
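If we do, I suppose the soft way to start would be a warning in the
normal write path, reusing the made-up helper from above (again just a
sketch; warn_report_once() or whatever reporting function fits best):

    /* at the top of vmdk_co_pwritev(), where s = bs->opaque */
    if (vmdk_all_extents_compressed(s)) {
        /* Normal writes only work until a cluster is rewritten anyway
         * ("Could not write to allocated cluster for streamOptimized"),
         * so warn before we ever drop them. */
        warn_report_once("vmdk: normal (uncompressed) writes to "
                         "streamOptimized images are deprecated");
    }

Max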