qemu-devel.nongnu.org archive mirror
From: Alberto Garcia <berto@igalia.com>
To: Eric Blake <eblake@redhat.com>, qemu-devel@nongnu.org
Cc: Kevin Wolf <kwolf@redhat.com>,
	Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
	qemu-block@nongnu.org, Max Reitz <mreitz@redhat.com>
Subject: Re: [PATCH v5 19/31] qcow2: Add subcluster support to calculate_l2_meta()
Date: Wed, 06 May 2020 19:14:25 +0200
Message-ID: <w517dxp9ea6.fsf@maestria.local.igalia.com>
In-Reply-To: <12569151-2f16-f136-6928-2a915b25120b@redhat.com>

On Tue 05 May 2020 11:59:18 PM CEST, Eric Blake wrote:
>> +        for (i = first_sc; i <= last_sc; i++) {
>> +            unsigned c = i / s->subclusters_per_cluster;
>> +            unsigned sc = i % s->subclusters_per_cluster;
>> +            l2_entry = get_l2_entry(s, l2_slice, l2_index + c);
>> +            l2_bitmap = get_l2_bitmap(s, l2_slice, l2_index + c);
>> +            type = qcow2_get_subcluster_type(bs, l2_entry, l2_bitmap, sc);
>> +            if (type == QCOW2_SUBCLUSTER_INVALID) {
>> +                l2_index += c; /* Point to the invalid entry */
>> +                goto fail;
>> +            }
>> +            if (type != QCOW2_SUBCLUSTER_NORMAL) {
>>                   break;
>>               }
>>           }
>
> This loop is now 32 times slower (for extended L2 entries).  Do you
> really need to check for an invalid subcluster here, or can we just
> blindly check that all 32 subclusters are NORMAL, and leave handling
> of invalid clusters for the rest of the function after we failed the
> exit-early test?  For that matter, for all but the first and last
> cluster, checking if 32 clusters are NORMAL is a simple 64-bit
> comparison rather than 32 iterations of a loop; and even for the first
> and last cluster, the _RANGE macros in 14/31 work well to mask out
> which bits must be set/cleared.  My guess is that optimizing this loop
> is worthwhile, since overwriting existing data is probably more common
> than allocating new data.

I think you're right, and now we have the _RANGE macros so it should be
doable. I'll look into it.
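
As a rough illustration of the mask-based check suggested above, here is
a minimal sketch. The helper names are hypothetical (they are not the
_RANGE macros from patch 14/31), and the sketch assumes a 64-bit bitmap
whose low 32 bits are per-subcluster allocation bits and whose high 32
bits are the matching "reads as zeros" bits. It only looks at the
bitmap; the L2 entry itself (offset, compressed flag) is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: allocation bits for subclusters [from, to) */
static uint64_t sc_alloc_range(unsigned from, unsigned to)
{
    assert(from <= to && to <= 32);
    return (1ULL << to) - (1ULL << from);
}

/* Hypothetical helper: "reads as zeros" bits for subclusters [from, to) */
static uint64_t sc_zero_range(unsigned from, unsigned to)
{
    return sc_alloc_range(from, to) << 32;
}

/*
 * True if every subcluster in [from, to) has its allocation bit set
 * and its zero bit clear. Anything else (unallocated, zero, or an
 * invalid combination) makes the check fail, so the caller can fall
 * back to the slower per-subcluster path.
 */
static bool sc_range_all_normal(uint64_t l2_bitmap, unsigned from, unsigned to)
{
    uint64_t alloc = sc_alloc_range(from, to);
    uint64_t zero = sc_zero_range(from, to);

    return (l2_bitmap & (alloc | zero)) == alloc;
}
```

For an intermediate cluster the call would be sc_range_all_normal(bitmap,
0, 32), which boils down to a single 64-bit comparison; only the first
and last clusters need a partial range.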

>> -        if (i == nb_clusters) {
>> -            return;
>> +        if (i == last_sc + 1) {
>> +            return 0;
>>           }
>>       }
>
> If we get here, then i is now the address of the first subcluster that 
> was not NORMAL, even if it is much smaller than the final subcluster 
> learned by nb_clusters for the overall request.  [1]

I reply to this part further down, at [1].

>>       /* Get the L2 entry of the first cluster */
>>       l2_entry = get_l2_entry(s, l2_slice, l2_index);
>> -    type = qcow2_get_cluster_type(bs, l2_entry);
>> +    l2_bitmap = get_l2_bitmap(s, l2_slice, l2_index);
>> +    sc_index = offset_to_sc_index(s, guest_offset);
>> +    type = qcow2_get_subcluster_type(bs, l2_entry, l2_bitmap, sc_index);
>>   
>> -    if (type == QCOW2_CLUSTER_NORMAL && keep_old) {
>> -        cow_start_from = cow_start_to;
>> +    if (type == QCOW2_SUBCLUSTER_INVALID) {
>> +        goto fail;
>> +    }
>> +
>> +    if (!keep_old) {
>> +        switch (type) {
>> +        case QCOW2_SUBCLUSTER_COMPRESSED:
>> +            cow_start_from = 0;
>> +            break;
>> +        case QCOW2_SUBCLUSTER_NORMAL:
>> +        case QCOW2_SUBCLUSTER_ZERO_ALLOC:
>> +        case QCOW2_SUBCLUSTER_UNALLOCATED_ALLOC: {
>> +            int i;
>> +            /* Skip all leading zero and unallocated subclusters */
>> +            for (i = 0; i < sc_index; i++) {
>> +                QCow2SubclusterType t;
>> +                t = qcow2_get_subcluster_type(bs, l2_entry, l2_bitmap, i);
>> +                if (t == QCOW2_SUBCLUSTER_INVALID) {
>> +                    goto fail;
>> +                } else if (t == QCOW2_SUBCLUSTER_NORMAL) {
>> +                    break;
>> +                }
>> +            }
>> +            cow_start_from = i << s->subcluster_bits;
>> +            break;
>
> Note that you are only skipping until the first normal subcluster, even 
> if other zero/unallocated clusters occur between the first normal 
> cluster and the start of the action.

That's correct.

> Or visually, suppose we have:
>
> --0-NN-0_NNNNNNNN_NNNNNNNN_NNNNNNNN
>
> as our 32 subclusters, with sc_index of 8.  You will end up skipping
> subclusters 0-3, but NOT 6 and 7.

That's correct. This function works with the assumption that the initial
COW region is located immediately before the data region, which is in
turn contiguous to the tail COW region.
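
To make the skipping rule concrete, here is a small sketch that works on
the diagram notation used above ('-' unallocated, '0' zero, 'N' normal)
instead of real L2 entries. The function names are made up for the
example, the diagram string must not contain the '_' separators, and the
subcluster_bits value in the usage note is an assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: index of the first subcluster that the initial
 * COW region has to cover. Leading zero/unallocated subclusters are
 * skipped, but only up to the first 'N'; everything from there up to
 * sc_index is part of the COW region, whatever its type.
 */
static unsigned cow_start_subcluster(const char *map, unsigned sc_index)
{
    unsigned i;

    assert(sc_index <= strlen(map));
    for (i = 0; i < sc_index; i++) {
        if (map[i] == 'N') {
            break;
        }
    }
    return i;
}

/* Same result expressed as a byte offset into the cluster */
static uint64_t cow_start_offset(const char *map, unsigned sc_index,
                                 unsigned subcluster_bits)
{
    return (uint64_t)cow_start_subcluster(map, sc_index) << subcluster_bits;
}
```

With the bitmap from the example and sc_index 8 this returns 4, so
subclusters 0-3 are skipped while 6 and 7 remain inside the COW region.
Assuming 2 KiB subclusters (subcluster_bits = 11), the byte offset is
8192.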

I'm actually not sure that it necessarily has to be that way, but at
least it seems that functions like handle_alloc_space() rely on
that. Certainly before subclusters I don't see how there would be any
space between the COW regions and the actual data region.

While checking the documentation of QCowL2Meta I also realized that it
may need to be updated. It describes "The COW Region between the start
of the first allocated cluster and the area the guest actually writes
to", but the region no longer necessarily begins at the start of the
cluster, although the word "between" leaves some room for interpretation.

Anyway, even if we could do COW of subclusters 4-5 only, there's no
general way to do that without touching QCowL2Meta or using more than
one structure (imagine we have -N-N-N-N_NNNN ...). I'm also not sure
that it's worth it.

> Still, even though we spend time copying the allocated contents of
> those two subclusters, we also copy the subcluster status, and the
> guest still ends up reading the same data as before.

No, the subcluster status is updated and those subclusters are now
marked as allocated. That's actually why we can use the _RANGE masks
that you proposed here:

   https://lists.gnu.org/archive/html/qemu-block/2020-04/msg01155.html

In other words, we have this bitmap:

   --0-NN-0_NNNNNNNN_NNNNNNNN_NNNNNNNN

If we write to subcluster 8 (which was already allocated), the resulting
bitmap is this one:

   --0-NNNN_NNNNNNNN_NNNNNNNN_NNNNNNNN

The last block in iotest 271 deals with exactly this kind of scenario.
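
The bitmap transition above can be reproduced with a short sketch. This
is only an illustration: the helpers are hypothetical, the diagram
strings omit the '_' separators, and the code assumes the low 32 bits of
the bitmap are allocation bits and the high 32 bits are the "reads as
zeros" bits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Parse a 32-character diagram ('-', '0', 'N') into a 64-bit bitmap */
static uint64_t bitmap_from_diagram(const char *map)
{
    uint64_t bitmap = 0;

    assert(strlen(map) == 32);
    for (unsigned i = 0; i < 32; i++) {
        if (map[i] == 'N') {
            bitmap |= 1ULL << i;          /* allocation bit */
        } else if (map[i] == '0') {
            bitmap |= 1ULL << (i + 32);   /* zero bit */
        }
    }
    return bitmap;
}

/* Render a bitmap back into the diagram notation (out holds 33 bytes) */
static void bitmap_to_diagram(uint64_t bitmap, char out[33])
{
    for (unsigned i = 0; i < 32; i++) {
        if (bitmap & (1ULL << (i + 32))) {
            out[i] = '0';
        } else if (bitmap & (1ULL << i)) {
            out[i] = 'N';
        } else {
            out[i] = '-';
        }
    }
    out[32] = '\0';
}

/* Mark subclusters [from, to) as allocated and clear their zero bits */
static uint64_t mark_range_allocated(uint64_t bitmap, unsigned from,
                                     unsigned to)
{
    uint64_t alloc;

    assert(from <= to && to <= 32);
    alloc = (1ULL << to) - (1ULL << from);
    return (bitmap | alloc) & ~(alloc << 32);
}
```

Writing to subcluster 8 with the COW region starting at subcluster 4
corresponds to mark_range_allocated(bitmap, 4, 9): subclusters 6 and 7
become 'N', while 0-3 keep their previous values.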

>>       /* Get the L2 entry of the last cluster */
>> -    l2_entry = get_l2_entry(s, l2_slice, l2_index + nb_clusters - 1);
>> -    type = qcow2_get_cluster_type(bs, l2_entry);
>> +    l2_index += nb_clusters - 1;
>> +    l2_entry = get_l2_entry(s, l2_slice, l2_index);
>> +    l2_bitmap = get_l2_bitmap(s, l2_slice, l2_index);
>> +    sc_index = offset_to_sc_index(s, guest_offset + bytes - 1);
>> +    type = qcow2_get_subcluster_type(bs, l2_entry, l2_bitmap, sc_index);
>
> [1] but here, we are skipping any intermediate clusters, and worrying
> only about the state of the final cluster.  Is that always going to do
> the correct thing, or will there be cases where we need to update the
> L2 entries of intermediate clusters?

       Cluster 1             Cluster 2             Cluster 3
|---------------------|---------------------|---------------------|
   <---cow_start--><-------write request--------><--cow_end--->


All L2 entries from the beginning of cow_start until the end of cow_end
are always updated. That's again what the loop that I optimized using
the _RANGE masks (and that I linked above) was doing.

The code in calculate_l2_meta() is only concerned with determining the
actual start and end points. Everything between them will be written to
and marked as allocated. It's only the subclusters outside that range
that keep the previous values (unallocated, or zero).
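
A sketch of how the span from the start of cow_start to the end of
cow_end maps onto each cluster may help here. The struct and function
below are hypothetical and only illustrate the arithmetic: subclusters
are numbered globally starting at the first cluster of the request, and
the function returns the half-open range that falls inside a given
cluster.

```c
#include <assert.h>

/* Half-open range [from, to) of subcluster indexes within one cluster */
typedef struct {
    unsigned from;
    unsigned to;
} ScRange;

/*
 * Hypothetical helper: part of the global subcluster span
 * [first_sc, last_sc] that lies in cluster number 'cluster'
 * (0 is the first cluster of the request). The cluster must
 * intersect the span.
 */
static ScRange sc_range_in_cluster(unsigned first_sc, unsigned last_sc,
                                   unsigned cluster, unsigned sc_per_cluster)
{
    unsigned base = cluster * sc_per_cluster;
    ScRange r;

    assert(first_sc <= last_sc);
    assert(last_sc >= base && first_sc < base + sc_per_cluster);

    r.from = first_sc > base ? first_sc - base : 0;
    r.to = last_sc + 1 < base + sc_per_cluster ? last_sc + 1 - base
                                               : sc_per_cluster;
    return r;
}
```

For a span that starts at subcluster 5 of cluster 1 in the diagram and
ends at subcluster 20 of cluster 3 (global indexes 5 to 84 with 32
subclusters per cluster), the ranges are [5, 32), [0, 32) and [0, 21):
the intermediate cluster is covered in full, and only the subclusters
outside those ranges keep their previous values.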

Berto

