From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
To: Max Reitz <mreitz@redhat.com>, qemu-block@nongnu.org
Cc: Kevin Wolf <kwolf@redhat.com>, qemu-devel@nongnu.org
Subject: Re: [PATCH 1/4] qcow2: Improve refcount structure rebuilding
Date: Fri, 26 Mar 2021 14:48:41 +0300 [thread overview]
Message-ID: <969cd321-0cc7-ddc3-8a0d-75819be3a6bf@virtuozzo.com> (raw)
In-Reply-To: <20210310155906.147478-2-mreitz@redhat.com>
10.03.2021 18:59, Max Reitz wrote:
> When rebuilding the refcount structures (when qemu-img check -r found
> errors with refcount = 0, but reference count > 0), the new refcount
> table defaults to being put at the image file end[1]. There is no good
> reason for that except that it means we will not have to rewrite any
> refblocks we already wrote to disk.
>
> Changing the code to rewrite those refblocks is not too difficult,
> though, so let us do that. That is beneficial for images on block
> devices, where we cannot really write beyond the end of the image file.
>
> [1] Unless there is something allocated in the area pointed to by the
> last refblock, so we have to write that refblock. In that case, we
> try to put the reftable in there.
>
> Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1519071
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
> block/qcow2-refcount.c | 126 ++++++++++++++++++++++-------------------
> 1 file changed, 67 insertions(+), 59 deletions(-)
>
> diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
> index 8e649b008e..162caeeb8e 100644
> --- a/block/qcow2-refcount.c
> +++ b/block/qcow2-refcount.c
> @@ -2352,8 +2352,9 @@ static int rebuild_refcount_structure(BlockDriverState *bs,
> int64_t *nb_clusters)
> {
> BDRVQcow2State *s = bs->opaque;
> - int64_t first_free_cluster = 0, reftable_offset = -1, cluster = 0;
> + int64_t first_free_cluster = 0, reftable_offset = -1, cluster;
> int64_t refblock_offset, refblock_start, refblock_index;
> + int64_t first_cluster, end_cluster;
> uint32_t reftable_size = 0;
> uint64_t *on_disk_reftable = NULL;
> void *on_disk_refblock;
> @@ -2365,8 +2366,11 @@ static int rebuild_refcount_structure(BlockDriverState *bs,
>
> qcow2_cache_empty(bs, s->refcount_block_cache);
>
> + first_cluster = 0;
> + end_cluster = *nb_clusters;
> +
> write_refblocks:
> - for (; cluster < *nb_clusters; cluster++) {
> + for (cluster = first_cluster; cluster < end_cluster; cluster++) {
> if (!s->get_refcount(*refcount_table, cluster)) {
> continue;
> }
> @@ -2374,65 +2378,68 @@ write_refblocks:
> refblock_index = cluster >> s->refcount_block_bits;
> refblock_start = refblock_index << s->refcount_block_bits;
>
> - /* Don't allocate a cluster in a refblock already written to disk */
> - if (first_free_cluster < refblock_start) {
> - first_free_cluster = refblock_start;
> - }
> - refblock_offset = alloc_clusters_imrt(bs, 1, refcount_table,
> - nb_clusters, &first_free_cluster);
> - if (refblock_offset < 0) {
> - fprintf(stderr, "ERROR allocating refblock: %s\n",
> - strerror(-refblock_offset));
> - res->check_errors++;
> - ret = refblock_offset;
> - goto fail;
> - }
> -
> - if (reftable_size <= refblock_index) {
> - uint32_t old_reftable_size = reftable_size;
> - uint64_t *new_on_disk_reftable;
> + if (reftable_size > refblock_index &&
> + on_disk_reftable[refblock_index])
> + {
> + refblock_offset = on_disk_reftable[refblock_index];
In this branch, we assign it to ..
> + } else {
> + int64_t refblock_cluster_index;
>
> - reftable_size = ROUND_UP((refblock_index + 1) * REFTABLE_ENTRY_SIZE,
> - s->cluster_size) / REFTABLE_ENTRY_SIZE;
> - new_on_disk_reftable = g_try_realloc(on_disk_reftable,
> - reftable_size *
> - REFTABLE_ENTRY_SIZE);
> - if (!new_on_disk_reftable) {
> + /* Don't allocate a cluster in a refblock already written to disk */
> + if (first_free_cluster < refblock_start) {
> + first_free_cluster = refblock_start;
> + }
> + refblock_offset = alloc_clusters_imrt(bs, 1, refcount_table,
> + nb_clusters,
> + &first_free_cluster);
> + if (refblock_offset < 0) {
> + fprintf(stderr, "ERROR allocating refblock: %s\n",
> + strerror(-refblock_offset));
> res->check_errors++;
> - ret = -ENOMEM;
> + ret = refblock_offset;
> goto fail;
> }
> - on_disk_reftable = new_on_disk_reftable;
>
> - memset(on_disk_reftable + old_reftable_size, 0,
> - (reftable_size - old_reftable_size) * REFTABLE_ENTRY_SIZE);
> + refblock_cluster_index = refblock_offset / s->cluster_size;
> + if (refblock_cluster_index >= end_cluster) {
> + /*
> + * We must write the refblock that holds this refblock's
> + * refcount
> + */
> + end_cluster = refblock_cluster_index + 1;
> + }
>
> - /* The offset we have for the reftable is now no longer valid;
> - * this will leak that range, but we can easily fix that by running
> - * a leak-fixing check after this rebuild operation */
> - reftable_offset = -1;
> - } else {
> - assert(on_disk_reftable);
> - }
> - on_disk_reftable[refblock_index] = refblock_offset;
> + if (reftable_size <= refblock_index) {
> + uint32_t old_reftable_size = reftable_size;
> + uint64_t *new_on_disk_reftable;
> +
> + reftable_size =
> + ROUND_UP((refblock_index + 1) * REFTABLE_ENTRY_SIZE,
> + s->cluster_size) / REFTABLE_ENTRY_SIZE;
> + new_on_disk_reftable =
> + g_try_realloc(on_disk_reftable,
> + reftable_size * REFTABLE_ENTRY_SIZE);
> + if (!new_on_disk_reftable) {
> + res->check_errors++;
> + ret = -ENOMEM;
> + goto fail;
> + }
> + on_disk_reftable = new_on_disk_reftable;
>
> - /* If this is apparently the last refblock (for now), try to squeeze the
> - * reftable in */
> - if (refblock_index == (*nb_clusters - 1) >> s->refcount_block_bits &&
> - reftable_offset < 0)
> - {
> - uint64_t reftable_clusters = size_to_clusters(s, reftable_size *
> - REFTABLE_ENTRY_SIZE);
> - reftable_offset = alloc_clusters_imrt(bs, reftable_clusters,
> - refcount_table, nb_clusters,
> - &first_free_cluster);
> - if (reftable_offset < 0) {
> - fprintf(stderr, "ERROR allocating reftable: %s\n",
> - strerror(-reftable_offset));
> - res->check_errors++;
> - ret = reftable_offset;
> - goto fail;
> + memset(on_disk_reftable + old_reftable_size, 0,
> + (reftable_size - old_reftable_size) *
> + REFTABLE_ENTRY_SIZE);
> +
> + /*
> + * The offset we have for the reftable is now no longer valid;
> + * this will leak that range, but we can easily fix that by
> + * running a leak-fixing check after this rebuild operation
> + */
> + reftable_offset = -1;
> + } else {
> + assert(on_disk_reftable);
> }
> + on_disk_reftable[refblock_index] = refblock_offset;
only to write back again ?
> }
>
> ret = qcow2_pre_write_overlap_check(bs, 0, refblock_offset,
> @@ -2459,15 +2466,12 @@ write_refblocks:
> }
>
> if (reftable_offset < 0) {
at this point reftable_offset is always -1 now..
Ah not. and now I am a bit close to understanding all of this logic. this thing with "goto write_refblocks" is not obvious
> - uint64_t post_refblock_start, reftable_clusters;
> + uint64_t reftable_clusters;
>
> - post_refblock_start = ROUND_UP(*nb_clusters, s->refcount_block_size);
> reftable_clusters =
> size_to_clusters(s, reftable_size * REFTABLE_ENTRY_SIZE);
> - /* Not pretty but simple */
> - if (first_free_cluster < post_refblock_start) {
> - first_free_cluster = post_refblock_start;
> - }
> +
> + first_free_cluster = 0;
> reftable_offset = alloc_clusters_imrt(bs, reftable_clusters,
> refcount_table, nb_clusters,
> &first_free_cluster);
> @@ -2479,6 +2483,10 @@ write_refblocks:
> goto fail;
> }
>
> + assert(offset_into_cluster(s, reftable_offset) == 0);
> + first_cluster = reftable_offset / s->cluster_size;
> + end_cluster = first_cluster + reftable_clusters;
> +
> goto write_refblocks;
these three lines now looks like a function call in assembler :)
> }
>
>
You didn't ping the series (more than two week old) so, I'm not sure that you are not preparing v2 now.. But I kept it "marked unred" all this time, and several times tried to look at it, and postponed, because I don't familiar with this place of qcow2 driver. And the function looks too difficult. Now finally I think I understand most of the logic that you change. Honestly, I think a bit of refactoring the rebuild_refcount_structure() prior to logic change would make it a lot easier to review..
--
Best regards,
Vladimir
next prev parent reply other threads:[~2021-03-26 11:49 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-03-10 15:59 [PATCH 0/4] qcow2: Improve refcount structure rebuilding Max Reitz
2021-03-10 15:59 ` [PATCH 1/4] " Max Reitz
2021-03-26 11:48 ` Vladimir Sementsov-Ogievskiy [this message]
2021-03-26 13:47 ` Max Reitz
2021-03-26 14:38 ` Vladimir Sementsov-Ogievskiy
2021-03-10 15:59 ` [PATCH 2/4] iotests/common.qemu: Add _cleanup_single_qemu Max Reitz
2021-03-10 15:59 ` [PATCH 3/4] iotests/common.qemu: Allow using the QSD Max Reitz
2021-03-10 15:59 ` [PATCH 4/4] iotests/108: Test new refcount rebuild algorithm Max Reitz
2021-03-10 16:07 ` Eric Blake
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=969cd321-0cc7-ddc3-8a0d-75819be3a6bf@virtuozzo.com \
--to=vsementsov@virtuozzo.com \
--cc=kwolf@redhat.com \
--cc=mreitz@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).