From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([208.118.235.92]:59163) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1dorJh-0006XG-Qh for qemu-devel@nongnu.org; Mon, 04 Sep 2017 09:22:23 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1dorIJ-0000rx-9t for qemu-devel@nongnu.org; Mon, 04 Sep 2017 09:21:27 -0400 References: <20170828055631.GB3258@host-172-16-90-85.openstacklocal> <7327a956-aef3-a8da-13ff-53073039802f@redhat.com> <20170829025645.GA30307@host-172-16-90-85.openstacklocal> <17126413-cb74-a633-0138-5bd4e5dab315@virtuozzo.com> <20170831065515.GA22443@host-172-16-90-85.openstacklocal> From: Anton Nefedov Message-ID: Date: Mon, 4 Sep 2017 16:17:42 +0300 MIME-Version: 1.0 In-Reply-To: <20170831065515.GA22443@host-172-16-90-85.openstacklocal> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] reduce write bandwidth of qcow2 driver while allocating new cluster List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Liu Qing Cc: Eric Blake , qemu-devel@nongnu.org, qemu-block@nongnu.org On 31/8/2017 9:55 AM, Liu Qing wrote: > On Wed, Aug 30, 2017 at 01:15:33PM +0300, Anton Nefedov wrote: >> >> On 29/08/2017 05:56, Liu Qing wrote: >>> On Mon, Aug 28, 2017 at 10:46:34AM -0500, Eric Blake wrote: >>>> [adding qemu-block] >>>> >>>> On 08/28/2017 12:56 AM, Liu Qing wrote: >>>>> Dear list, >>>>> Recently I used fio to test qcow2 driver in the guest os, and found out >>>>> that when a new cluster is allocated the 4K IO will occupy 64K(default cluster >>>>> size) bandwith. >>>>> From the code qcow2 driver will fill the unused part of new allocated >>>>> cluster with 0 in perform_cow. These 0s are set in qcow2_co_readv when the read >>>>> destination is not allocated and it has no backing file. Could I forbidden any >>>>> further write in copy_sectors if the copy source is not allocated and it has >>>>> no backing file? So only the requested data is written to the cluster. Function >>>>> copy_sectors is only used by perform_cow in the master branch. >>>> >>>> There have already been discussions on optimizing COW writes in a manner >>>> similar to what you are describing; for example, >>>> >>>> https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg00109.html >>> Thanks Eric, this is what I am looking for. >>> The only concern I have is in patch '[Qemu-devel] [PATCH v4 12/15] qcow2: skip >>> writing zero buffers to empty' it says: >>> >>> It can be detected that >>> 1. COW alignment of a write request is zeroes >>> 2. Respective areas on the underlying BDS already read as zeroes >>> after being preallocated previously >>> If both of these true, COW may be skipped >>> >>> Will writing zero be skipped if the disk is not preallocated? @Anton >>> >> >> Hi, >> >> In short, no, it will not (with my patches), but there might be some way >> if that's what you really need. >> >> >> First of all, this might be undesirable as you lose the cluster-size >> data locality: now the whole cluster is written at once and is expected >> to reside in the contiguous area on the physical drive. >> >> Secondly, I think there is no guarantee that the underlying bs->file >> image reads back as zeroes if the cluster is unallocated on qcow2 level. > Why we need this guarantee? If the cluster is unallocated, it means no > one used these clusters previously. So why should these unallocated > clusters be read back as zeroes? Hi, sorry I missed your mail; I'm actually not sure if this is fixed in some spec or smth, that we must read 0 from the never-written-to areas. I can guess why it looks quite desirable - suppose we had a guest offset X which mapped to the image offset Y, then the cluster got discarded and the new guest offset Z mapped to the image offset Y - then the guest can read old data from the other offset. But of course the sensitive data at X should be explicitly overwritten by guest means, rather than just discarded. /Anton >> >> For example, the unallocated cluster could have been used earlier but >> then discarded. Discard passthrough is configurable so discard may not >> be passed down to the underlying image. And I guess that in general, >> even if it is passed, there is no strong requirement on reading back as >> zeroes - look at qcow2 discard handling - discard head and tail which do >> not cover full clusters are ignored. >> >> _perhaps_, one may expect that there will be zeroes if the cluster is >> allocated at the end of file >> (see 'clusters_are_trailing' detection here >> https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg00122.html) >> >> but I haven't thought about all corner cases here. >> >> >> /Anton >> >>> BTW: why the code in the patch is a little different than the latest >>> master branch? For example I don't have the is_zero function but only >>> get is_zero_sectors. Is there something wrong with my settings? >>> >>> My repo: >>> # git remote -v >>> origin git://git.qemu-project.org/qemu.git (fetch) >>> origin git://git.qemu-project.org/qemu.git (push) >>> >>> Thanks. >>>> >>>> -- >>>> Eric Blake, Principal Software Engineer >>>> Red Hat, Inc. +1-919-301-3266 >>>> Virtualization: qemu.org | libvirt.org >>>> >>> >>>