From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([208.118.235.92]:42966)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <kwolf@redhat.com>) id 1S7mqJ-0007cW-8y
	for qemu-devel@nongnu.org; Wed, 14 Mar 2012 07:58:17 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <kwolf@redhat.com>) id 1S7mqD-0000Ts-1f
	for qemu-devel@nongnu.org; Wed, 14 Mar 2012 07:58:10 -0400
Received: from mx1.redhat.com ([209.132.183.28]:19230)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <kwolf@redhat.com>) id 1S7mqC-0000Tb-Pj
	for qemu-devel@nongnu.org; Wed, 14 Mar 2012 07:58:04 -0400
Message-ID: <4F60889F.6030401@redhat.com>
Date: Wed, 14 Mar 2012 13:01:35 +0100
From: Kevin Wolf <kwolf@redhat.com>
MIME-Version: 1.0
References: <1331226917-6658-1-git-send-email-pbonzini@redhat.com>
	<1331226917-6658-7-git-send-email-pbonzini@redhat.com>
	<4F5A31B2.3050701@redhat.com> <4F5A46A1.4000508@redhat.com>
	<1331402560.8577.46.camel@watermelon.coderich.net>
	<4F5DEBCE.3040409@redhat.com>
	<1331665990.24052.42.camel@watermelon.coderich.net>
	<4F604B98.9090606@redhat.com>
In-Reply-To: <4F604B98.9090606@redhat.com>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [RFC PATCH 06/17] block: use bdrv_{co,
 aio}_discard for write_zeroes operations
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Laager <rlaager@wiktel.com>, qemu-devel@nongnu.org

On 14.03.2012 08:41, Paolo Bonzini wrote:
> On 13/03/2012 20:13, Richard Laager wrote:
>>> If you have a new kernel that supports SEEK_HOLE/SEEK_DATA, it can also
>>> be done by skipping the zero write on known holes.
>>>
>>> This could even be done at the block layer level using bdrv_is_allocated.
>>
>> Would we want to make all write_zeros operations check for and skip
>> holes, or is write_zeros different from a discard in that it SHOULD/MUST
>> allocate space?
> 
> I think that's pretty much the question to answer for this patch to graduate
> from the RFC state (the rest is just technicalities, so to speak).  So far,
> write_zeros was intended to be an efficient operation (it avoids allocating
> a cluster in qed and will do the same in qcow3, which is why I decided to
> merge it with discard).

Yes, for qcow3 and to some degree also for QED, setting the zero flag is
the natural implementation for both discard and write_zeros. The big
question is what happens with other formats.

Paolo mentioned a use case: a fast way for guests to write zeros. But is
it really faster than a normal write when we have to emulate it with a
bdrv_write and a temporary buffer of zeros? On the other hand, there are
cases where discard really means "I don't care about the data any more",
and emulating it by writing zeros is just a waste of resources.
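
To make the cost of the emulation concrete, here is a rough sketch of
what such a fallback could look like. It is not the actual block layer
code: write_fn, MemDisk, mem_write and MAX_ZERO_SECTORS are names made up
for this example, and write_fn only stands in for a driver's write path
(it is not the real bdrv_write signature). The point is that every zeroed
sector still goes through the full data path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SECTOR_SIZE 512
#define MAX_ZERO_SECTORS 128   /* arbitrary cap on the bounce buffer */

/* Hypothetical stand-in for a driver's write path. */
typedef int (*write_fn)(void *opaque, int64_t sector_num,
                        const uint8_t *buf, int nb_sectors);

/* Emulate a "write zeroes" request with ordinary writes of a zero buffer. */
static int emulate_write_zeroes(write_fn do_write, void *opaque,
                                int64_t sector_num, int nb_sectors)
{
    int chunk = nb_sectors < MAX_ZERO_SECTORS ? nb_sectors : MAX_ZERO_SECTORS;
    uint8_t *zeros;
    int ret = 0;

    assert(do_write != NULL);
    if (nb_sectors <= 0) {
        return 0;
    }

    /* Temporary buffer of zeros, reused for every chunk. */
    zeros = calloc((size_t)chunk, SECTOR_SIZE);
    if (!zeros) {
        return -1;
    }

    while (nb_sectors > 0) {
        int n = nb_sectors < chunk ? nb_sectors : chunk;

        ret = do_write(opaque, sector_num, zeros, n);
        if (ret < 0) {
            break;
        }
        sector_num += n;
        nb_sectors -= n;
    }

    free(zeros);
    return ret < 0 ? ret : 0;
}

/* Toy in-memory "disk", only here to exercise the sketch. */
typedef struct {
    uint8_t *data;
    int64_t nb_sectors;
    int writes;            /* number of write calls issued */
} MemDisk;

static int mem_write(void *opaque, int64_t sector_num,
                     const uint8_t *buf, int nb_sectors)
{
    MemDisk *d = opaque;

    if (sector_num < 0 || sector_num + nb_sectors > d->nb_sectors) {
        return -1;
    }
    memcpy(d->data + sector_num * SECTOR_SIZE, buf,
           (size_t)nb_sectors * SECTOR_SIZE);
    d->writes++;
    return 0;
}
```

With this shape, zeroing N sectors costs the same I/O as writing N
sectors of data, plus the buffer allocation, so it is hard to see where
a speedup for the guest would come from.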

So I think we only want to advertise that discard zeroes data if we can
do it efficiently. This means that the format supports it, and that the
device is able to communicate the discard granularity (= cluster size)
to the guest OS.
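
As a sketch of that rule (only an illustration; FormatCaps, DeviceCaps,
DiscardAdvert and discard_advertise are hypothetical names, not existing
QEMU structures or functions), the decision could be expressed like
this:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of what the image format can do. */
typedef struct {
    bool has_zero_flag;      /* format can mark a cluster as zero cheaply */
    uint32_t cluster_size;   /* in bytes; 0 if the format has no clusters */
} FormatCaps;

/* Hypothetical summary of what the emulated device can tell the guest. */
typedef struct {
    bool can_report_granularity;
} DeviceCaps;

/* What we would expose to the guest OS. */
typedef struct {
    bool discard_zeroes_data;
    uint32_t discard_granularity;   /* in bytes; 0 if not advertised */
} DiscardAdvert;

/*
 * Advertise "discard zeroes data" only when both conditions hold:
 * the format supports efficient zeroing, and the device can communicate
 * the discard granularity (= cluster size) to the guest.
 */
static DiscardAdvert discard_advertise(const FormatCaps *fmt,
                                       const DeviceCaps *dev)
{
    DiscardAdvert adv = { false, 0 };

    assert(fmt != NULL && dev != NULL);

    if (fmt->has_zero_flag && fmt->cluster_size > 0 &&
        dev->can_report_granularity) {
        adv.discard_zeroes_data = true;
        adv.discard_granularity = fmt->cluster_size;
    }
    return adv;
}
```

In every other case the guest would not be promised zeroes, and discard
stays a pure hint that we are free to ignore.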

Kevin