qemu-devel.nongnu.org archive mirror
From: Eric Blake <eblake@redhat.com>
To: Alberto Garcia <berto@igalia.com>, qemu-devel@nongnu.org
Cc: Kevin Wolf <kwolf@redhat.com>, Derek Su <dereksu@qnap.com>,
	Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
	qemu-block@nongnu.org, Max Reitz <mreitz@redhat.com>
Subject: Re: [PATCH v7 28/32] qcow2: Add subcluster support to qcow2_co_pwrite_zeroes()
Date: Thu, 28 May 2020 14:11:07 -0500
Message-ID: <0adafac6-15e8-96eb-6c3f-bb9c182fb2d1@redhat.com>
In-Reply-To: <w51sgfkt81f.fsf@maestria.local.igalia.com>

On 5/28/20 10:04 AM, Alberto Garcia wrote:
> On Wed 27 May 2020 07:58:10 PM CEST, Eric Blake wrote:
>>> There is just one thing to take into account for a possible future
>>> improvement: compressed clusters cannot be partially zeroized so
>>> zero_l2_subclusters() on the head or the tail can return -ENOTSUP.
>>> This makes the caller repeat the *complete* request and write actual
>>> zeroes to disk. This is sub-optimal because
>>>
>>>      1) if the head area was compressed we would still be able to use
>>>         the fast path for the body and possibly the tail.
>>>
>>>      2) if the tail area was compressed we are writing zeroes to the
>>>         head and the body areas, which are already zeroized.
>>
>> Is this true?  The block layer tries hard to break zero requests up so
>> that any non-cluster-aligned requests do not cross cluster boundaries.
>> In practice, that means that when you have an unaligned request, the
>> head and tail cluster will be the same cluster, and there is no body in
>> play, so that returning -ENOTSUP is correct because there really is no
>> other work to do and repeating the entire request (which is less than a
>> cluster in length) is the right approach.
> 
> Let's use an example.
> 
> cluster size is 64KB, subcluster size is 2KB, and we get this request:
> 
>     write -z 31k 130k
> 
> Since pwrite_zeroes_alignment equals the cluster size (64KB), this
> would result in 3 calls to qcow2_co_pwrite_zeroes():
> 
>     offset=31k  size=33k    [-ENOTSUP, writes actual zeros]
>     offset=64k  size=64k    [zeroized using the relevant metadata bits]
>     offset=128k size=33k    [-ENOTSUP, writes actual zeros]
> 
> However this patch changes the alignment:
> 
>      bs->bl.pwrite_zeroes_alignment = s->subcluster_size;

Ah, I missed that trick.  But it is nice, and indeed...

> 
> so we get these instead:
> 
>     offset=31k  size=1k     [-ENOTSUP, writes actual zeros]
>     offset=32k  size=128k   [zeroized using the relevant metadata bits]
>     offset=160k size=1k     [-ENOTSUP, writes actual zeros]
> 
> So far, so good. Reducing the alignment requirements allows us to
> maximize the number of subclusters to zeroize.

...we can now hit a request that is not cluster-aligned.
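
To convince myself of the arithmetic, here is a minimal standalone sketch of the head/body/tail split. It is only a model, not the block layer's actual code: split_zero_request(), ZeroChunk and KiB are names invented for this illustration, and it looks at nothing but the alignment (no maximum request size, no fallback handling).

```c
#include <assert.h>
#include <stdint.h>

#define KiB 1024ULL

/* One piece of a split write-zeroes request (hypothetical type). */
typedef struct {
    uint64_t offset;
    uint64_t bytes;
} ZeroChunk;

/*
 * Split [offset, offset + bytes) into an unaligned head, an aligned body
 * and an unaligned tail.  'align' must be a power of two.  Returns the
 * number of chunks stored in out[] (at most 3).
 */
static int split_zero_request(uint64_t offset, uint64_t bytes,
                              uint64_t align, ZeroChunk out[3])
{
    uint64_t end = offset + bytes;
    int n = 0;

    assert(align && (align & (align - 1)) == 0);

    while (offset < end) {
        uint64_t next;

        if (offset % align) {
            /* Head: stop at the next alignment boundary (or at the end). */
            next = (offset | (align - 1)) + 1;
            if (next > end) {
                next = end;
            }
        } else {
            /* Body: run up to the last boundary that is still in range. */
            next = end & ~(align - 1);
            if (next <= offset) {
                /* Less than one aligned unit left: this is the tail. */
                next = end;
            }
        }

        out[n].offset = offset;
        out[n].bytes = next - offset;
        n++;
        offset = next;
    }
    return n;
}
```

Calling it for the 31k/130k request with align = 64 * KiB gives 33k + 64k + 33k, and with align = 2 * KiB gives 1k + 128k + 1k, matching your two listings. The 32k/128k request with the 2k alignment comes back as a single chunk.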

> 
> Now let's suppose we have this request:
> 
>     write -z 32k 128k
> 
> This one is aligned so it goes directly to qcow2_co_pwrite_zeroes().
> However if the third cluster is compressed then the function will
> return -ENOTSUP after having zeroized the first 96KB of the request,
> forcing the caller to repeat it completely using the slow path.
> 
> I think the problem also exists in the current code (without my
> patches). If you zeroize 10 clusters and the last one is compressed
> you have to repeat the request after having zeroized 9 clusters.

Hmm. In the pre-patch code, qcow2_co_pwrite_zeroes() calls 
qcow2_cluster_zeroize() which can fail with -ENOTSUP up front, but not 
after the fact.  Once it starts the while loop over clusters, its use of 
zero_in_l2_slice() handles compressed clusters just fine; as far as I 
can tell, only your new subcluster handling lets it now fail with 
-ENOTSUP after earlier clusters have been visited.

But isn't this something we could solve recursively?  Instead of 
returning -ENOTSUP, we could have zero_in_l2_slice() call 
bdrv_pwrite_zeroes() on the (sub-)clusters associated with a compressed 
cluster.
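
Roughly what I have in mind, as a toy model rather than real qcow2 code. Everything here (ToyImage, toy_zeroize(), the counters) is made up for the illustration; it tracks only which clusters are compressed and counts subclusters instead of touching any L2 table, and the "fallback" branch merely stands in for writing explicit zeroes to that sub-range.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_SC_PER_CLUSTER 32   /* 64 KiB cluster / 2 KiB subcluster */
#define TOY_NB_CLUSTERS    4

/* Hypothetical stand-in for the image state. */
typedef struct {
    bool compressed[TOY_NB_CLUSTERS]; /* is this cluster stored compressed? */
    size_t meta_zeroed;  /* subclusters zeroized through metadata only */
    size_t data_zeroed;  /* subclusters overwritten with explicit zeroes */
} ToyImage;

/*
 * Zeroize nb_sc subclusters starting at first_sc.  A compressed cluster
 * that is only partially covered cannot be handled through metadata:
 *  - fallback == false: give up with -ENOTSUP, even though earlier
 *    clusters of the same request have already been zeroized;
 *  - fallback == true: write explicit zeroes for just that sub-range
 *    and carry on with the rest of the request.
 */
static int toy_zeroize(ToyImage *img, size_t first_sc, size_t nb_sc,
                       bool fallback)
{
    size_t sc = first_sc;
    size_t end = first_sc + nb_sc;

    assert(end <= TOY_NB_CLUSTERS * TOY_SC_PER_CLUSTER);

    while (sc < end) {
        size_t cluster = sc / TOY_SC_PER_CLUSTER;
        size_t cluster_end = (cluster + 1) * TOY_SC_PER_CLUSTER;
        size_t chunk_end = end < cluster_end ? end : cluster_end;
        size_t n = chunk_end - sc;
        bool partial = n < TOY_SC_PER_CLUSTER;

        if (partial && img->compressed[cluster]) {
            if (!fallback) {
                return -ENOTSUP;
            }
            img->data_zeroed += n;   /* slow path, but only for this range */
        } else {
            img->meta_zeroed += n;   /* fast path */
        }
        sc = chunk_end;
    }
    return 0;
}
```

With cluster 2 compressed and a request for subclusters 16..79 (your 32k/128k example), the fallback == false variant returns -ENOTSUP after 48 subclusters (96k) have already been zeroized, whereas the fallback == true variant returns 0 with only the 16 tail subclusters written as data.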

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



