From: Paolo Bonzini
To: Kevin Wolf
Cc: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [RFC PATCH 06/17] block: use bdrv_{co, aio}_discard for write_zeroes operations
Date: Fri, 09 Mar 2012 19:06:25 +0100
Message-ID: <4F5A46A1.4000508@redhat.com>
In-Reply-To: <4F5A31B2.3050701@redhat.com>
References: <1331226917-6658-1-git-send-email-pbonzini@redhat.com> <1331226917-6658-7-git-send-email-pbonzini@redhat.com> <4F5A31B2.3050701@redhat.com>

On 09/03/2012 17:37, Kevin Wolf wrote:
>> > Remove the bdrv_co_write_zeroes callback. Instead use the discard
>> > information from bdrv_get_info to choose between bdrv_co_discard
>> > and a normal write.
>> >
>> > Signed-off-by: Paolo Bonzini
> I'm not sure if this is a good idea.
>
> The goal of discard is to remove data from the image (or not add it if
> it isn't there yet) and ideally deallocate the used clusters.
> The goal of write_zeroes is to mark space as zero and explicitly
> allocate it for this purpose.
>
> From a guest point of view these are pretty similar, but from a host
> perspective I'd say there's a difference.

True.  However, we need to present a uniform view to the guests,
including the granularity, or discard can never be enabled.  The
granularity must be 512 bytes on IDE (though it can be higher on SCSI),
so there are problems mapping block layer discard straight down to the
guest.  There are basically three ways to do this:

1) we could cheat and present a discard_granularity that is smaller than
what the underlying format/protocol supports.  This is fine but forces
discard_zeroes_data to be false.  That's a pity because Linux 3.4 will
start using efficient zero write operations (WRITE SAME on SCSI, which
could be extended to UNMAP/TRIM if discard_zeroes_data is true).

2) we can make an emulated discard that always supports 512-byte
granularity and always zeroes data.  This patch series takes this route.

3) we can let the user choose between (1) and (2).  I didn't choose this
mostly out of laziness (co_write_zeroes support is not really complete;
for example, there's no aio version to use in device models), and also
because I doubt anyone would really use the option.

Paolo
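
A minimal sketch of how option (2) could split a request, written in
Python for brevity rather than in QEMU's C.  It is not the code from the
patch series.  The function name plan_emulated_discard and the
parameters backend_granularity and backend_zeroes are hypothetical; they
stand in for the discard information that bdrv_get_info would report.
The idea is to discard only the part of the range that covers whole
backend units, and to fall back to a normal write of zeroes everywhere
else, so the guest always sees 512-byte granularity and zeroed data.

```python
SECTOR = 512  # granularity presented to the guest, in bytes


def plan_emulated_discard(offset, length, backend_granularity, backend_zeroes):
    """Return a list of (operation, offset, length) steps, all in bytes.

    operation is "discard" or "write_zeroes".  backend_granularity and
    backend_zeroes are hypothetical inputs describing what the underlying
    format/protocol supports.
    """
    if offset % SECTOR or length % SECTOR:
        raise ValueError("request must be aligned to 512 bytes")
    if length == 0:
        return []

    end = offset + length

    # If the backend cannot discard, or its discard does not guarantee
    # zeroes, the whole range is handled with a normal write.
    if backend_granularity <= 0 or not backend_zeroes:
        return [("write_zeroes", offset, length)]

    g = backend_granularity
    aligned_start = -(-offset // g) * g  # round offset up to a unit boundary
    aligned_end = (end // g) * g         # round end down to a unit boundary

    # The range does not cover even one whole backend unit.
    if aligned_start >= aligned_end:
        return [("write_zeroes", offset, length)]

    steps = []
    if offset < aligned_start:
        # Unaligned head: too small for the backend to discard.
        steps.append(("write_zeroes", offset, aligned_start - offset))
    steps.append(("discard", aligned_start, aligned_end - aligned_start))
    if aligned_end < end:
        # Unaligned tail.
        steps.append(("write_zeroes", aligned_end, end - aligned_end))
    return steps
```

For example, with a 64 KiB backend granularity, a 128 KiB request that
starts at byte 512 becomes a write of zeroes up to the first 64 KiB
boundary, one discard of a whole 64 KiB unit, and a final 512-byte write
of zeroes.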