On 30.07.19 19:56, Eric Blake wrote: > On 7/30/19 12:24 PM, Max Reitz wrote: >> The qcow2 specification says to ignore unknown extra data fields in >> snapshot table entries. Currently, we discard it whenever we update the >> image, which is a bit different from "ignore". >> >> This patch makes the qcow2 driver keep all unknown extra data fields >> when updating an image's snapshot table. > > The cover letter questioned whether we want this, but I think we do. > >> >> Signed-off-by: Max Reitz >> --- >> block/qcow2.h | 5 ++++ >> block/qcow2-snapshot.c | 59 +++++++++++++++++++++++++++++++++++------- >> 2 files changed, 55 insertions(+), 9 deletions(-) >> >> diff --git a/block/qcow2.h b/block/qcow2.h >> index 175708cee0..290a48b77e 100644 >> --- a/block/qcow2.h >> +++ b/block/qcow2.h >> @@ -61,6 +61,9 @@ >> * space for snapshot names and IDs */ >> #define QCOW_MAX_SNAPSHOTS_SIZE (1024 * QCOW_MAX_SNAPSHOTS) >> >> +/* Maximum amount of extra data per snapshot table entry to accept */ >> +#define QCOW_MAX_SNAPSHOT_EXTRA_DATA 1024 >> + >> /* Bitmap header extension constraints */ >> #define QCOW2_MAX_BITMAPS 65535 >> #define QCOW2_MAX_BITMAP_DIRECTORY_SIZE (1024 * QCOW2_MAX_BITMAPS) >> @@ -178,6 +181,8 @@ typedef struct QCowSnapshot { >> uint32_t date_sec; >> uint32_t date_nsec; >> uint64_t vm_clock_nsec; >> + uint32_t extra_data_size; >> + void *unknown_extra_data; /* Extra data past QCowSnapshotExtraData */ > > Is char* going to be any easier to use than void*? > >> +++ b/block/qcow2-snapshot.c > >> @@ -80,30 +80,52 @@ int qcow2_read_snapshots(BlockDriverState *bs, Error **errp) > >> + >> + /* Read known extra data */ >> ret = bdrv_pread(bs->file, offset, &extra, >> - MIN(sizeof(extra), extra_data_size)); >> + MIN(sizeof(extra), sn->extra_data_size)); >> if (ret < 0) { >> error_setg_errno(errp, -ret, "Failed to read snapshot table"); >> goto fail; >> } >> - offset += extra_data_size; >> + offset += MIN(sizeof(extra), sn->extra_data_size); >> >> - if (extra_data_size >= 8) { >> + if (sn->extra_data_size >= 8) { > > While touching this, is it worth spelling it: > if (sn->extra_data_size >= sizeof(extra.vm_state_size_large)) { > >> sn->vm_state_size = be64_to_cpu(extra.vm_state_size_large); >> } >> >> - if (extra_data_size >= 16) { >> + if (sn->extra_data_size >= 16) { > > and a similar use of sizeof() instead of hard-coded 16 here? Well, the most verbose spelling would be offsetof() + sizeof() both times. Hm. I’ll see how it looks. I think it is obvious enough what it means as it is, and the full offsetof() + sizeof() might actually be less obvious just because it’s longer and thus takes longer to read. >> sn->disk_size = be64_to_cpu(extra.disk_size); >> } else { >> sn->disk_size = bs->total_sectors * BDRV_SECTOR_SIZE; >> } >> >> + if (sn->extra_data_size > sizeof(extra)) { >> + /* Store unknown extra data */ >> + size_t unknown_extra_data_size = >> + sn->extra_data_size - sizeof(extra); >> + >> + sn->unknown_extra_data = g_malloc(unknown_extra_data_size); >> + ret = bdrv_pread(bs->file, offset, sn->unknown_extra_data, >> + unknown_extra_data_size); > > We're doing two separate bdrv_pread()s. Would it be better to do a > single bdrv_preadv into a vector composed of &extra and > &unknown_extra_data, for less I/O? (Then again, this micro-optimization > is probably in the noise in the long run) Interesting idea, we could even add the ID and name string into that vector. But I’m not sure whether it’s really useful. (I’ll take a look anyway, because it sounds interesting.) >> + if (ret < 0) { >> + error_setg_errno(errp, -ret, "Failed to read snapshot table"); >> + goto fail; >> + } >> + offset += unknown_extra_data_size; >> + } >> + >> /* Read snapshot ID */ >> sn->id_str = g_malloc(id_str_size + 1); >> ret = bdrv_pread(bs->file, offset, sn->id_str, id_str_size); >> @@ -161,7 +183,7 @@ static int qcow2_write_snapshots(BlockDriverState *bs) >> sn = s->snapshots + i; >> offset = ROUND_UP(offset, 8); >> offset += sizeof(h); >> - offset += sizeof(extra); >> + offset += MAX(sizeof(extra), sn->extra_data_size); >> offset += strlen(sn->id_str); >> offset += strlen(sn->name); >> >> @@ -208,7 +230,8 @@ static int qcow2_write_snapshots(BlockDriverState *bs) >> h.date_sec = cpu_to_be32(sn->date_sec); >> h.date_nsec = cpu_to_be32(sn->date_nsec); >> h.vm_clock_nsec = cpu_to_be64(sn->vm_clock_nsec); >> - h.extra_data_size = cpu_to_be32(sizeof(extra)); >> + h.extra_data_size = cpu_to_be32(MAX(sizeof(extra), >> + sn->extra_data_size)); >> >> memset(&extra, 0, sizeof(extra)); >> extra.vm_state_size_large = cpu_to_be64(sn->vm_state_size); >> @@ -233,6 +256,22 @@ static int qcow2_write_snapshots(BlockDriverState *bs) >> } >> offset += sizeof(extra); >> >> + if (sn->extra_data_size > sizeof(extra)) { >> + size_t unknown_extra_data_size = >> + sn->extra_data_size - sizeof(extra); >> + >> + /* qcow2_read_snapshots() ensures no unbounded allocation */ >> + assert(unknown_extra_data_size <= INT_MAX); > > Should we make this tighter: > assert(unknown_extra_data_size < QCOW_MAX_SNAPSHOT_EXTRA_DATA); > or even > assert(unknown_extra_data_size <= > QCOW_MAX_SNAPSHOT_EXTRA_DATA - sizeof(extra)); Hm, I wanted to signal that it can fit into the int taken by bdrv_pwrite(). But then I should use BDRV_REQUEST_MAX_BYTES, actually. I’m not sure. It doesn’t really matter that it’s less than QCOW_MAX_SNAPSHOT_EXTRA_DATA here. What does matter is that it doesn’t exceed BDRV_REQUEST_MAX_BYTES. >> + assert(sn->unknown_extra_data); >> + >> + ret = bdrv_pwrite(bs->file, offset, sn->unknown_extra_data, >> + unknown_extra_data_size); >> + if (ret < 0) { >> + goto fail; >> + } >> + offset += unknown_extra_data_size; >> + } >> + >> ret = bdrv_pwrite(bs->file, offset, sn->id_str, id_str_size); >> if (ret < 0) { >> goto fail; >> @@ -375,6 +414,7 @@ int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info) >> sn->date_sec = sn_info->date_sec; >> sn->date_nsec = sn_info->date_nsec; >> sn->vm_clock_nsec = sn_info->vm_clock_nsec; >> + sn->extra_data_size = sizeof(QCowSnapshotExtraData); >> >> /* Allocate the L1 table of the snapshot and copy the current one there. */ >> l1_table_offset = qcow2_alloc_clusters(bs, s->l1_size * sizeof(uint64_t)); >> @@ -646,6 +686,7 @@ int qcow2_snapshot_delete(BlockDriverState *bs, >> * The snapshot is now unused, clean up. If we fail after this point, we >> * won't recover but just leak clusters. >> */ >> + g_free(sn.unknown_extra_data); >> g_free(sn.id_str); >> g_free(sn.name); >> >> > > Tweaks to assertions are minor enough that I'm okay with: > Reviewed-by: Eric Blake Thanks for reviewing! Max