From: Anton Nefedov <anton.nefedov@virtuozzo.com>
To: Liu Qing <liuqing@huayun.com>
Cc: Eric Blake <eblake@redhat.com>,
	qemu-devel@nongnu.org, qemu-block@nongnu.org
Subject: Re: [Qemu-devel] reduce write bandwidth of qcow2 driver while allocating new cluster
Date: Mon, 4 Sep 2017 16:17:42 +0300
Message-ID: <d101b33b-e411-5a23-c92c-e6e411d79173@virtuozzo.com>
In-Reply-To: <20170831065515.GA22443@host-172-16-90-85.openstacklocal>



On 31/8/2017 9:55 AM, Liu Qing wrote:
> On Wed, Aug 30, 2017 at 01:15:33PM +0300, Anton Nefedov wrote:
>>
>> On 29/08/2017 05:56, Liu Qing wrote:
>>> On Mon, Aug 28, 2017 at 10:46:34AM -0500, Eric Blake wrote:
>>>> [adding qemu-block]
>>>>
>>>> On 08/28/2017 12:56 AM, Liu Qing wrote:
>>>>> Dear list,
>>>>>     Recently I used fio to test the qcow2 driver in the guest OS, and found
>>>>> that when a new cluster is allocated, a 4K IO occupies 64K (the default
>>>>> cluster size) of bandwidth.
>>>>>     From the code, the qcow2 driver fills the unused part of a newly allocated
>>>>> cluster with 0 in perform_cow. These 0s are set in qcow2_co_readv when the read
>>>>> destination is not allocated and it has no backing file. Could I forbid any
>>>>> further write in copy_sectors if the copy source is not allocated and it has
>>>>> no backing file? Then only the requested data is written to the cluster. Function
>>>>> copy_sectors is only used by perform_cow in the master branch.
>>>>
>>>> There have already been discussions on optimizing COW writes in a manner
>>>> similar to what you are describing; for example,
>>>>
>>>> https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg00109.html
>>> Thanks Eric, this is what I am looking for.
>>> The only concern I have is in patch '[Qemu-devel] [PATCH v4 12/15] qcow2: skip
>>> writing zero buffers to empty' it says:
>>>
>>> It can be detected that
>>>   1. COW alignment of a write request is zeroes
>>>   2. Respective areas on the underlying BDS already read as zeroes
>>>      after being preallocated previously
>>>   If both of these are true, COW may be skipped
>>>
>>> Will writing zero be skipped if the disk is not preallocated? @Anton
>>>
>>
>> Hi,
>>
>> In short, no, it will not (with my patches), but there might be some way
>> if that's what you really need.
>>
>>
>> First of all, this might be undesirable as you lose the cluster-size
>> data locality: now the whole cluster is written at once and is expected
>> to reside in the contiguous area on the physical drive.
>>
>> Secondly, I think there is no guarantee that the underlying bs->file
>> image reads back as zeroes if the cluster is unallocated on qcow2 level.
> Why do we need this guarantee? If the cluster is unallocated, it means no
> one used these clusters previously. So why should these unallocated
> clusters be read back as zeroes?

Hi, sorry I missed your mail.

I'm actually not sure whether it is fixed in some spec or something
similar that we must read 0 from never-written-to areas.

I can guess why it looks quite desirable: suppose guest offset X was
mapped to image offset Y, then the cluster got discarded, and later a new
guest offset Z got mapped to the same image offset Y. Without zeroing,
the guest could then read the old data of X through offset Z. But of
course, sensitive data at X should be explicitly overwritten by the
guest rather than just discarded.
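
To make the X/Y/Z scenario concrete, here is a minimal toy model in Python.
It is only a sketch under stated assumptions, not QEMU code: ToyImage,
reuse_cluster and the zero_fill flag are hypothetical names, the "image" is
a flat byte array, discard is assumed not to be passed down to the
underlying file, and the freed image cluster is assumed to be reused first.

```python
CLUSTER = 64 * 1024  # default qcow2 cluster size mentioned in this thread
REQUEST = 4 * 1024   # guest write size from the fio test


class ToyImage:
    """Hypothetical toy model of a guest-to-image cluster map (not QEMU code)."""

    def __init__(self, clusters):
        self.data = bytearray(clusters * CLUSTER)  # stands in for the image file
        self.l2 = {}                               # guest cluster index -> image offset
        self.free = [i * CLUSTER for i in range(clusters)]
        self.bytes_written = 0

    def write(self, guest_off, buf, zero_fill):
        idx, in_cluster = divmod(guest_off, CLUSTER)
        assert in_cluster + len(buf) <= CLUSTER, "toy model: one cluster per write"
        if idx not in self.l2:
            # Allocate a new image cluster for this guest cluster.
            host = self.free.pop(0)
            self.l2[idx] = host
            if zero_fill:
                # Pad the rest of the cluster with zeroes, so the whole
                # cluster is written instead of just the request.
                self.data[host:host + CLUSTER] = bytes(CLUSTER)
                self.bytes_written += CLUSTER - len(buf)
        start = self.l2[idx] + in_cluster
        self.data[start:start + len(buf)] = buf
        self.bytes_written += len(buf)

    def discard(self, guest_off):
        # Assumption: only the mapping is dropped; the old bytes stay in place.
        host = self.l2.pop(guest_off // CLUSTER)
        self.free.insert(0, host)  # assumption: freed cluster is reused first

    def read(self, guest_off, length):
        idx, in_cluster = divmod(guest_off, CLUSTER)
        if idx not in self.l2:
            return bytes(length)  # unallocated on the toy "qcow2" level
        start = self.l2[idx] + in_cluster
        return bytes(self.data[start:start + length])


def reuse_cluster(zero_fill):
    """Write at X, discard it, write 4K at Z, then read Z outside the 4K."""
    img = ToyImage(clusters=2)
    x, z = 0, CLUSTER
    img.write(x + REQUEST, b"secret", zero_fill=True)
    img.discard(x)
    before = img.bytes_written
    img.write(z, b"\xab" * REQUEST, zero_fill=zero_fill)
    written = img.bytes_written - before
    return img.read(z + REQUEST, 6), written


if __name__ == "__main__":
    print(reuse_cluster(zero_fill=True))
    print(reuse_cluster(zero_fill=False))
```

In this model, with zero_fill=True the 4K write at Z costs a full 64K and
the read outside the written range returns zeroes; with zero_fill=False
only 4K is written, but the same read returns the bytes left behind at X.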

/Anton

>>
>> For example, the unallocated cluster could have been used earlier but
>> then discarded. Discard passthrough is configurable so discard may not
>> be passed down to the underlying image. And I guess that in general,
>> even if it is passed, there is no strong requirement on reading back as
>> zeroes - look at qcow2 discard handling - discard head and tail which do
>> not cover full clusters are ignored.
>>
>> _perhaps_, one may expect that there will be zeroes if the cluster is
>> allocated at the end of file
>> (see 'clusters_are_trailing' detection here
>> https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg00122.html)
>>
>> but I haven't thought about all corner cases here.
>>
>>
>> /Anton
>>
>>> BTW: why is the code in the patch a little different from the latest
>>> master branch? For example, I don't have the is_zero function, only
>>> is_zero_sectors. Is there something wrong with my settings?
>>>
>>> My repo:
>>> # git remote -v
>>> origin  git://git.qemu-project.org/qemu.git (fetch)
>>> origin  git://git.qemu-project.org/qemu.git (push)
>>>
>>> Thanks.
>>>>
>>>> --
>>>> Eric Blake, Principal Software Engineer
>>>> Red Hat, Inc.           +1-919-301-3266
>>>> Virtualization:  qemu.org | libvirt.org
>>>>
>>>
>>>
