Date: Sat, 24 Mar 2012 16:40:34 +0100
From: Christoph Hellwig
To: Richard Laager
Cc: Paolo Bonzini, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [RFC PATCH 14/17] block: support FALLOC_FL_PUNCH_HOLE trimming
Message-ID: <20120324154034.GF13014@lst.de>
References: <1331226917-6658-1-git-send-email-pbonzini@redhat.com> <1331226917-6658-15-git-send-email-pbonzini@redhat.com> <1331325410.3715.77.camel@watermelon.coderich.net>
In-Reply-To: <1331325410.3715.77.camel@watermelon.coderich.net>

On Fri, Mar 09, 2012 at 02:36:50PM -0600, Richard Laager wrote:
> I'm not sure if fallocate() and/or BLKDISCARD always guarantee that the
> discard has made it to stable storage. If they don't, does O_DIRECT or
> O_DSYNC on open() cause them to make any such guarantee? If not, should
> you be calling fdatasync() or fsync() when the user has specified
> cache=direct or cache=writethrough?
>
> Note that the Illumos implementation (see below) has a flag to ask for
> either behavior.

For fallocate the current Linux behaviour is that you need an fdatasync
or the O_DSYNC flag to guarantee that it makes its way to disk.
For XFS the historic implementation of both XFS_IOC_FREESP and hole
punching was that it always made it to disk, and for all other
filesystems the historic behaviour was that O_DSYNC was ignored, and
depending on the fs an fdatasync might or might not be able to give you
a guarantee either.

For BLKDISCARD you'd strictly speaking need to issue a cache flush via
fdatasync, too, but I'm not sure if any actual devices require that.

> On Fri, 2012-03-09 at 00:35 -0800, Chris Wedgwood wrote:
> > Simplest still compare the blocks allocated by the file
> > to it's length (ie. stat.st_blocks != stat.st_size>>9).
>
> I thought of this as well. It covers "99%" of the cases, but there's one
> case where it breaks down. Imagine I have a sparse file backing my
> virtual disk. In the guest, I fill the virtual disk 100%. Then, I
> restart QEMU. Now it thinks that sparse file is non-sparse and stops
> issuing hole punches.

Note that due to speculative preallocation in the filesystem
stat.st_blocks can easily be bigger than stat.st_size>>9, so the above
comparison should at least be a < instead of !=.
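
To make the difference between the two comparisons concrete, here is a
rough sketch of the check in Python rather than the C that QEMU would
use. The helper names looks_sparse and file_looks_sparse are made up for
illustration, and the code assumes st_blocks is counted in 512-byte
units, as it is on Linux.

```python
import os

# Assumption: st_blocks is reported in 512-byte units (true on Linux),
# which is why the thread compares it against st_size >> 9.
SECTOR_SHIFT = 9


def looks_sparse(st_blocks: int, st_size: int) -> bool:
    """Return True if fewer blocks are allocated than the file length implies.

    Uses '<' rather than '!=' so that a file with extra blocks from
    speculative preallocation is not mistaken for a sparse one.
    """
    return st_blocks < (st_size >> SECTOR_SHIFT)


def file_looks_sparse(path: str) -> bool:
    """Apply the heuristic to a file on disk (hypothetical helper)."""
    st = os.stat(path)
    return looks_sparse(st.st_blocks, st.st_size)
```

This is still only a heuristic: in the case Richard describes, a sparse
image that the guest has filled completely would report as non-sparse.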