All of lore.kernel.org
 help / color / mirror / Atom feed
From: Max Reitz <mreitz@redhat.com>
To: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
	qemu-block@nongnu.org
Cc: Kevin Wolf <kwolf@redhat.com>, qemu-devel@nongnu.org
Subject: Re: [PATCH 1/4] qcow2: Improve refcount structure rebuilding
Date: Fri, 26 Mar 2021 14:47:14 +0100	[thread overview]
Message-ID: <3eb1748e-012d-9c74-d9dc-c1a6f2644ba6@redhat.com> (raw)
In-Reply-To: <969cd321-0cc7-ddc3-8a0d-75819be3a6bf@virtuozzo.com>

On 26.03.21 12:48, Vladimir Sementsov-Ogievskiy wrote:
> 10.03.2021 18:59, Max Reitz wrote:
>> When rebuilding the refcount structures (when qemu-img check -r found
>> errors with refcount = 0, but reference count > 0), the new refcount
>> table defaults to being put at the image file end[1].  There is no good
>> reason for that except that it means we will not have to rewrite any
>> refblocks we already wrote to disk.
>>
>> Changing the code to rewrite those refblocks is not too difficult,
>> though, so let us do that.  That is beneficial for images on block
>> devices, where we cannot really write beyond the end of the image file.
>>
>> [1] Unless there is something allocated in the area pointed to by the
>>      last refblock, so we have to write that refblock.  In that case, we
>>      try to put the reftable in there.
>>
>> Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1519071
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>> ---
>>   block/qcow2-refcount.c | 126 ++++++++++++++++++++++-------------------
>>   1 file changed, 67 insertions(+), 59 deletions(-)
>>
>> diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
>> index 8e649b008e..162caeeb8e 100644
>> --- a/block/qcow2-refcount.c
>> +++ b/block/qcow2-refcount.c
>> @@ -2352,8 +2352,9 @@ static int 
>> rebuild_refcount_structure(BlockDriverState *bs,
>>                                         int64_t *nb_clusters)
>>   {
>>       BDRVQcow2State *s = bs->opaque;
>> -    int64_t first_free_cluster = 0, reftable_offset = -1, cluster = 0;
>> +    int64_t first_free_cluster = 0, reftable_offset = -1, cluster;
>>       int64_t refblock_offset, refblock_start, refblock_index;
>> +    int64_t first_cluster, end_cluster;
>>       uint32_t reftable_size = 0;
>>       uint64_t *on_disk_reftable = NULL;
>>       void *on_disk_refblock;
>> @@ -2365,8 +2366,11 @@ static int 
>> rebuild_refcount_structure(BlockDriverState *bs,
>>       qcow2_cache_empty(bs, s->refcount_block_cache);
>> +    first_cluster = 0;
>> +    end_cluster = *nb_clusters;
>> +
>>   write_refblocks:
>> -    for (; cluster < *nb_clusters; cluster++) {
>> +    for (cluster = first_cluster; cluster < end_cluster; cluster++) {
>>           if (!s->get_refcount(*refcount_table, cluster)) {
>>               continue;
>>           }
>> @@ -2374,65 +2378,68 @@ write_refblocks:
>>           refblock_index = cluster >> s->refcount_block_bits;
>>           refblock_start = refblock_index << s->refcount_block_bits;
>> -        /* Don't allocate a cluster in a refblock already written to 
>> disk */
>> -        if (first_free_cluster < refblock_start) {
>> -            first_free_cluster = refblock_start;
>> -        }
>> -        refblock_offset = alloc_clusters_imrt(bs, 1, refcount_table,
>> -                                              nb_clusters, 
>> &first_free_cluster);
>> -        if (refblock_offset < 0) {
>> -            fprintf(stderr, "ERROR allocating refblock: %s\n",
>> -                    strerror(-refblock_offset));
>> -            res->check_errors++;
>> -            ret = refblock_offset;
>> -            goto fail;
>> -        }
>> -
>> -        if (reftable_size <= refblock_index) {
>> -            uint32_t old_reftable_size = reftable_size;
>> -            uint64_t *new_on_disk_reftable;
>> +        if (reftable_size > refblock_index &&
>> +            on_disk_reftable[refblock_index])
>> +        {
>> +            refblock_offset = on_disk_reftable[refblock_index];
> 
> In this branch, we assign it to ..
> 
>> +        } else {
>> +            int64_t refblock_cluster_index;
>> -            reftable_size = ROUND_UP((refblock_index + 1) * 
>> REFTABLE_ENTRY_SIZE,
>> -                                     s->cluster_size) / 
>> REFTABLE_ENTRY_SIZE;
>> -            new_on_disk_reftable = g_try_realloc(on_disk_reftable,
>> -                                                 reftable_size *
>> -                                                 REFTABLE_ENTRY_SIZE);
>> -            if (!new_on_disk_reftable) {
>> +            /* Don't allocate a cluster in a refblock already written 
>> to disk */
>> +            if (first_free_cluster < refblock_start) {
>> +                first_free_cluster = refblock_start;
>> +            }
>> +            refblock_offset = alloc_clusters_imrt(bs, 1, refcount_table,
>> +                                                  nb_clusters,
>> +                                                  &first_free_cluster);
>> +            if (refblock_offset < 0) {
>> +                fprintf(stderr, "ERROR allocating refblock: %s\n",
>> +                        strerror(-refblock_offset));
>>                   res->check_errors++;
>> -                ret = -ENOMEM;
>> +                ret = refblock_offset;
>>                   goto fail;
>>               }
>> -            on_disk_reftable = new_on_disk_reftable;
>> -            memset(on_disk_reftable + old_reftable_size, 0,
>> -                   (reftable_size - old_reftable_size) * 
>> REFTABLE_ENTRY_SIZE);
>> +            refblock_cluster_index = refblock_offset / s->cluster_size;
>> +            if (refblock_cluster_index >= end_cluster) {
>> +                /*
>> +                 * We must write the refblock that holds this refblock's
>> +                 * refcount
>> +                 */
>> +                end_cluster = refblock_cluster_index + 1;
>> +            }
>> -            /* The offset we have for the reftable is now no longer 
>> valid;
>> -             * this will leak that range, but we can easily fix that 
>> by running
>> -             * a leak-fixing check after this rebuild operation */
>> -            reftable_offset = -1;
>> -        } else {
>> -            assert(on_disk_reftable);
>> -        }
>> -        on_disk_reftable[refblock_index] = refblock_offset;
>> +            if (reftable_size <= refblock_index) {
>> +                uint32_t old_reftable_size = reftable_size;
>> +                uint64_t *new_on_disk_reftable;
>> +
>> +                reftable_size =
>> +                    ROUND_UP((refblock_index + 1) * REFTABLE_ENTRY_SIZE,
>> +                             s->cluster_size) / REFTABLE_ENTRY_SIZE;
>> +                new_on_disk_reftable =
>> +                    g_try_realloc(on_disk_reftable,
>> +                                  reftable_size * REFTABLE_ENTRY_SIZE);
>> +                if (!new_on_disk_reftable) {
>> +                    res->check_errors++;
>> +                    ret = -ENOMEM;
>> +                    goto fail;
>> +                }
>> +                on_disk_reftable = new_on_disk_reftable;
>> -        /* If this is apparently the last refblock (for now), try to 
>> squeeze the
>> -         * reftable in */
>> -        if (refblock_index == (*nb_clusters - 1) >> 
>> s->refcount_block_bits &&
>> -            reftable_offset < 0)
>> -        {
>> -            uint64_t reftable_clusters = size_to_clusters(s, 
>> reftable_size *
>> -                                                          
>> REFTABLE_ENTRY_SIZE);
>> -            reftable_offset = alloc_clusters_imrt(bs, reftable_clusters,
>> -                                                  refcount_table, 
>> nb_clusters,
>> -                                                  &first_free_cluster);
>> -            if (reftable_offset < 0) {
>> -                fprintf(stderr, "ERROR allocating reftable: %s\n",
>> -                        strerror(-reftable_offset));
>> -                res->check_errors++;
>> -                ret = reftable_offset;
>> -                goto fail;
>> +                memset(on_disk_reftable + old_reftable_size, 0,
>> +                       (reftable_size - old_reftable_size) *
>> +                       REFTABLE_ENTRY_SIZE);
>> +
>> +                /*
>> +                 * The offset we have for the reftable is now no 
>> longer valid;
>> +                 * this will leak that range, but we can easily fix 
>> that by
>> +                 * running a leak-fixing check after this rebuild 
>> operation
>> +                 */
>> +                reftable_offset = -1;
>> +            } else {
>> +                assert(on_disk_reftable);
>>               }
>> +            on_disk_reftable[refblock_index] = refblock_offset;
> 
> only to write back again ?

This assignment is on a deeper level, though, isn it?  I.e. it’s inside 
the else branch from above.

>>           }
>>           ret = qcow2_pre_write_overlap_check(bs, 0, refblock_offset,
>> @@ -2459,15 +2466,12 @@ write_refblocks:
>>       }
>>       if (reftable_offset < 0) {
> 
> at this point reftable_offset is always -1 now..
> 
> Ah not. and now I am a bit close to understanding all of this logic. 
> this thing with "goto write_refblocks" is not obvious
> 
>> -        uint64_t post_refblock_start, reftable_clusters;
>> +        uint64_t reftable_clusters;
>> -        post_refblock_start = ROUND_UP(*nb_clusters, 
>> s->refcount_block_size);
>>           reftable_clusters =
>>               size_to_clusters(s, reftable_size * REFTABLE_ENTRY_SIZE);
>> -        /* Not pretty but simple */
>> -        if (first_free_cluster < post_refblock_start) {
>> -            first_free_cluster = post_refblock_start;
>> -        }
>> +
>> +        first_free_cluster = 0;
>>           reftable_offset = alloc_clusters_imrt(bs, reftable_clusters,
>>                                                 refcount_table, 
>> nb_clusters,
>>                                                 &first_free_cluster);
>> @@ -2479,6 +2483,10 @@ write_refblocks:
>>               goto fail;
>>           }
>> +        assert(offset_into_cluster(s, reftable_offset) == 0);
>> +        first_cluster = reftable_offset / s->cluster_size;
>> +        end_cluster = first_cluster + reftable_clusters;
>> +
>>           goto write_refblocks;
> 
> these three lines now looks like a function call in assembler :)
> 
>>       }
>>
> 
> You didn't ping the series (more than two week old) so, I'm not sure 
> that you are not preparing v2 now.. But I kept it "marked unred" all 
> this time, and several times tried to look at it, and postponed, because 
> I don't familiar with this place of qcow2 driver. And the function looks 
> too difficult. Now finally I think I understand most of the logic that 
> you change. Honestly, I think a bit of refactoring the 
> rebuild_refcount_structure() prior to logic change would make it a lot 
> easier to review..

Hm, yes, putting the for () loop into its own function would probably be 
a good starting put.  I’ll look into it.

Max



  reply	other threads:[~2021-03-26 13:49 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-10 15:59 [PATCH 0/4] qcow2: Improve refcount structure rebuilding Max Reitz
2021-03-10 15:59 ` [PATCH 1/4] " Max Reitz
2021-03-26 11:48   ` Vladimir Sementsov-Ogievskiy
2021-03-26 13:47     ` Max Reitz [this message]
2021-03-26 14:38       ` Vladimir Sementsov-Ogievskiy
2021-03-10 15:59 ` [PATCH 2/4] iotests/common.qemu: Add _cleanup_single_qemu Max Reitz
2021-03-10 15:59 ` [PATCH 3/4] iotests/common.qemu: Allow using the QSD Max Reitz
2021-03-10 15:59 ` [PATCH 4/4] iotests/108: Test new refcount rebuild algorithm Max Reitz
2021-03-10 16:07   ` Eric Blake

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3eb1748e-012d-9c74-d9dc-c1a6f2644ba6@redhat.com \
    --to=mreitz@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=vsementsov@virtuozzo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.