From: Stefan Hajnoczi
Date: Thu, 22 Dec 2011 07:47:17 +0000
Subject: Re: [Qemu-devel] [PATCH v3 1/6] cutils: extract buffer_is_zero() from qemu-img.c
To: Eric Blake
Cc: Kevin Wolf, Marcelo Tosatti, qemu-devel@nongnu.org
Message-ID: <20111222074717.GA8758@stefanha-thinkpad.localdomain>
In-Reply-To: <4EF20CCB.4060306@redhat.com>
References: <1324483240-31726-1-git-send-email-stefanha@linux.vnet.ibm.com> <1324483240-31726-2-git-send-email-stefanha@linux.vnet.ibm.com> <4EF20CCB.4060306@redhat.com>

On Wed, Dec 21, 2011 at 09:43:55AM -0700, Eric Blake wrote:
> On 12/21/2011 09:00 AM, Stefan Hajnoczi wrote:
> > The qemu-img.c:is_not_zero() function checks if a buffer contains all
> > zeroes. This function will come in handy for zero-detection in the
> > block layer, so clean it up and move it to cutils.c.
> >
> > Note that the function now returns true if the buffer is all zeroes.
> > This avoids the double-negatives (i.e. !is_not_zero()) that the old
> > function can cause in callers.
>
> Are there plans to improve the efficiency of buffer_is_zero to take
> advantage of metadata about sparseness?
>
> That is, there are cases where we can use metadata to prove a region of
> a file is sparse, without having to read every byte within that region.
> Now that this series is giving QED special metadata that marks a zero
> cluster, it is faster to query if that metadata exists denoting a zero
> cluster than it is to read the entire cluster and check for non-zero.
> Likewise, with regular files, the kernel provides lseek(SEEK_HOLE) (or
> the older, lower-level, ioctl(FS_IOC_FIEMAP)); which at least GNU
> coreutils is using for efficient sparse detection in source files.

Yes, there are ways to optimize this for specific storage backends. But
we need a code path that supports all storage systems first. For
example, raw files over NFS or an image file over HTTP (curl).

In the case of qcow2 or QED backing files we already don't read zeroes
today. Instead we memset the read buffer to zero and then waste CPU
cycles doing buffer_is_zero() detection. At least this means that file
I/O (and network I/O, if using NFS) is already optimal if your backing
file is qcow2 or QED - it's just the CPU cycles that we can optimize
away.

Stefan
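
For readers unfamiliar with the generic code path discussed above, here
is a minimal sketch of all-zero detection over an in-memory buffer. It
is an illustration only, not the code from qemu-img.c or cutils.c; the
name buffer_is_zero_sketch() and the word-at-a-time loop are assumptions
made for this example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not the QEMU implementation: returns true if the
 * first len bytes of buf are all zero. An empty buffer counts as zero.
 */
static bool buffer_is_zero_sketch(const void *buf, size_t len)
{
    const unsigned char *p = buf;
    size_t i = 0;

    /* Compare a word at a time; memcpy avoids alignment assumptions. */
    for (; i + sizeof(uint64_t) <= len; i += sizeof(uint64_t)) {
        uint64_t word;
        memcpy(&word, p + i, sizeof(word));
        if (word != 0) {
            return false;
        }
    }

    /* Check the remaining tail bytes one by one. */
    for (; i < len; i++) {
        if (p[i] != 0) {
            return false;
        }
    }
    return true;
}
```

A check like this works for any storage backend because it only looks at
the bytes already in the buffer, at a CPU cost proportional to the
buffer length. That cost is what metadata-based shortcuts could avoid.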