All of lore.kernel.org
 help / color / mirror / Atom feed
From: Max Reitz <mreitz@redhat.com>
To: Anton Nefedov <anton.nefedov@virtuozzo.com>, qemu-devel@nongnu.org
Cc: qemu-block@nongnu.org, kwolf@redhat.com, eblake@redhat.com,
	den@virtuozzo.com, berto@igalia.com
Subject: Re: [Qemu-devel] [PATCH v7 8/9] qcow2: skip writing zero buffers to empty COW areas
Date: Wed, 31 Jan 2018 18:40:43 +0100	[thread overview]
Message-ID: <4b7846c3-bec4-b1a0-50e3-58aa337028ac@redhat.com> (raw)
In-Reply-To: <2e58f0a9-7db4-b562-e1d0-b8c712a67b21@virtuozzo.com>

[-- Attachment #1: Type: text/plain, Size: 4946 bytes --]

On 2018-01-30 15:23, Anton Nefedov wrote:
> 
> 
> On 29/1/2018 11:28 PM, Max Reitz wrote:
>> On 2018-01-18 18:49, Anton Nefedov wrote:
>>> If COW areas of the newly allocated clusters are zeroes on the
>>> backing image,
>>> efficient bdrv_write_zeroes(flags=BDRV_REQ_ALLOCATE) can be used on
>>> the whole
>>> cluster instead of writing explicit zero buffers later in perform_cow().
>>>
>>> iotest 060:
>>> write to the discarded cluster does not trigger COW anymore.
>>> Use a backing image instead.
>>>
>>> iotest 066:
>>> cluster-alignment areas that were not really COWed are now detected
>>> as zeroes, hence the initial write has to be exactly the same size for
>>> the maps to match
>>>
>>> Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com>
>>> ---
>>>   qapi/block-core.json       |  4 ++-
>>>   block/qcow2.h              |  6 +++++
>>>   block/qcow2-cluster.c      |  2 +-
>>>   block/qcow2.c              | 66
>>> ++++++++++++++++++++++++++++++++++++++++++++--
>>>   block/trace-events         |  1 +
>>>   tests/qemu-iotests/060     | 26 +++++++++++-------
>>>   tests/qemu-iotests/060.out |  5 +++-
>>>   tests/qemu-iotests/066     |  2 +-
>>>   tests/qemu-iotests/066.out |  4 +--
>>>   9 files changed, 98 insertions(+), 18 deletions(-)
>>
>> [...]
>>
>>> @@ -1875,6 +1880,52 @@ static bool is_zero(BlockDriverState *bs,
>>> int64_t offset, int64_t bytes)
>>>       return res >= 0 && (res & BDRV_BLOCK_ZERO) && nr == bytes;
>>>   }
>>>   +static bool is_zero_cow(BlockDriverState *bs, QCowL2Meta *m)
>>> +{
>>> +    return is_zero(bs, m->offset + m->cow_start.offset,
>>> +                   m->cow_start.nb_bytes) &&
>>> +           is_zero(bs, m->offset + m->cow_end.offset,
>>> m->cow_end.nb_bytes);
>>> +}
>>> +
>>> +static int handle_alloc_space(BlockDriverState *bs, QCowL2Meta *l2meta)
>>> +{
>>> +    BDRVQcow2State *s = bs->opaque;
>>> +    QCowL2Meta *m;
>>> +
>>> +    for (m = l2meta; m != NULL; m = m->next) {
>>> +        int ret;
>>> +
>>> +        if (!m->cow_start.nb_bytes && !m->cow_end.nb_bytes) {
>>> +            continue;
>>> +        }
>>> +
>>> +        if (bs->encrypted) {
>>> +            continue;
>>> +        }
>>
>> Not sure if the compiler optimizes this anyway, but I'd pull this out of
>> the loop.
>>
> 
> An imprint of the following patches (which were dropped from this
> series) - preallocation ahead of image EOF, which takes action
> regardless of image encryption.
> 
> But I'll leave the check outside the loop until it's needed
> there.
> 
> 
>> Maybe you could put all of the conditions under which this function can
>> actually do something at its beginning: That is, we can't do anything if
>> the BDS is encrypted or if bs->file does not support BDRV_REQ_ALLOCATE
>> (and then you just call this function unconditionally in
>> qcow2_co_pwritev()).
>>
> 
> Done.
> 
>>> +        if (!is_zero_cow(bs, m)) {
>>> +            continue;
>>> +        }
>>
>> Is this really efficient?  I remember someone complaining about
>> bdrv_co_block_status() being kind of slow on some filesystems, so
>> there'd be a tradeoff depending on how it compares to just reading up to
>> two clusters from the backing file -- especially considering that the OS
>> can query the same information just as quickly, and thus the only
>> overhead the read should have is a memset() (assuming the OS is clever).
>>
>> So basically my question is whether it would be better to just skip this
>> if we have any backing file at all and only do this optimization if
>> there is none.
>>
> 
> So what we trade between is
> (read+write) vs (lseek+fallocate or lseek+read+write).
> 
> Indeed if it comes to lseek the profit is smaller, and we're probably
> unlikely to find a hole anyway.
> 
> Maybe it's good enough to cover these cases:
>  1. no backing
>  2. beyond bdrv_getlength() in backing
>  3. unallocated in backing qcow2 (covers 'beyond bdrv_getlength()
>                                           in backing->file')
> 
> 1 & 2 are easy to check;

Not sure how useful 2 is, though.  (I don't know.  I always hear about
people wanting to optimize for such a case where a backing file is
shorter than the overlay, but I can't imagine a real use case for that.)

> 3: if that's not too hacky maybe we can do the bdrv_is_allocated() check
> for qcow2 exclusively and if there is raw (or any other format) backing
> image - do the COW

Yes, well, it seems useful in principle...  But it would require general
block layer work indeed.  Maybe a new flag for bdrv_block_status*() that
says only to look into the format layer?

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 512 bytes --]

  reply	other threads:[~2018-01-31 17:40 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-01-18 17:48 [Qemu-devel] [PATCH v7 0/9] qcow2: cluster space preallocation Anton Nefedov
2018-01-18 17:48 ` [Qemu-devel] [PATCH v7 1/9] mirror: inherit supported write/zero flags Anton Nefedov
2018-01-29 19:21   ` Max Reitz
2018-01-29 19:26     ` Eric Blake
2018-01-30 12:15       ` Anton Nefedov
2018-01-30 16:00         ` Eric Blake
2018-01-31 17:20         ` Max Reitz
2018-01-18 17:49 ` [Qemu-devel] [PATCH v7 2/9] blkverify: set " Anton Nefedov
2018-01-29 19:23   ` Max Reitz
2018-01-18 17:49 ` [Qemu-devel] [PATCH v7 3/9] block: introduce BDRV_REQ_ALLOCATE flag Anton Nefedov
2018-01-29 19:37   ` Max Reitz
2018-01-30 12:34     ` Anton Nefedov
2018-01-31 17:31       ` Max Reitz
2018-02-01 13:34         ` Anton Nefedov
2018-02-01 18:06           ` Max Reitz
2018-01-18 17:49 ` [Qemu-devel] [PATCH v7 4/9] block: treat BDRV_REQ_ALLOCATE as serialising Anton Nefedov
2018-01-29 19:48   ` Max Reitz
2018-01-30 12:36     ` Anton Nefedov
2018-01-31 17:35       ` Max Reitz
2018-01-31 15:11   ` Alberto Garcia
2018-01-31 17:11     ` Anton Nefedov
2018-02-01 14:01       ` Alberto Garcia
2018-01-18 17:49 ` [Qemu-devel] [PATCH v7 5/9] file-posix: support BDRV_REQ_ALLOCATE Anton Nefedov
2018-01-29 19:56   ` Max Reitz
2018-01-18 17:49 ` [Qemu-devel] [PATCH v7 6/9] block: support BDRV_REQ_ALLOCATE in passthrough drivers Anton Nefedov
2018-01-29 19:58   ` Max Reitz
2018-01-18 17:49 ` [Qemu-devel] [PATCH v7 7/9] qcow2: move is_zero() up Anton Nefedov
2018-01-29 19:59   ` Max Reitz
2018-01-18 17:49 ` [Qemu-devel] [PATCH v7 8/9] qcow2: skip writing zero buffers to empty COW areas Anton Nefedov
2018-01-29 20:28   ` Max Reitz
2018-01-30 14:23     ` Anton Nefedov
2018-01-31 17:40       ` Max Reitz [this message]
2018-01-31 18:32         ` Eric Blake
2018-01-31 18:35           ` Max Reitz
2018-01-31 18:43             ` Eric Blake
2018-01-18 17:49 ` [Qemu-devel] [PATCH v7 9/9] iotest 134: test cluster-misaligned encrypted write Anton Nefedov
2018-01-29 20:30   ` Max Reitz
2018-01-30 13:03   ` Alberto Garcia

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4b7846c3-bec4-b1a0-50e3-58aa337028ac@redhat.com \
    --to=mreitz@redhat.com \
    --cc=anton.nefedov@virtuozzo.com \
    --cc=berto@igalia.com \
    --cc=den@virtuozzo.com \
    --cc=eblake@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.