From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from [140.186.70.92] (port=53185 helo=eggs.gnu.org)
  by lists.gnu.org with esmtp (Exim 4.43) id 1OoHhs-0004zc-V6
  for qemu-devel@nongnu.org; Wed, 25 Aug 2010 11:16:05 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.69)
  (envelope-from ) id 1OoHhr-0001bi-ON
  for qemu-devel@nongnu.org; Wed, 25 Aug 2010 11:16:04 -0400
Received: from mx1.redhat.com ([209.132.183.28]:19491)
  by eggs.gnu.org with esmtp (Exim 4.69)
  (envelope-from ) id 1OoHhr-0001bZ-H0
  for qemu-devel@nongnu.org; Wed, 25 Aug 2010 11:16:03 -0400
Message-ID: <4C7533A7.7090404@redhat.com>
Date: Wed, 25 Aug 2010 18:15:51 +0300
From: Avi Kivity
MIME-Version: 1.0
References: <1282646430-5777-1-git-send-email-kwolf@redhat.com>
  <4C73C2BF.8050300@codemonkey.ws>
  <4C73C622.7080808@redhat.com>
  <4C73C926.3010901@codemonkey.ws>
  <4C73C9CF.7090800@redhat.com>
  <4C73CAA9.2060104@codemonkey.ws>
  <4C73CB85.9010306@redhat.com>
  <4C73CBD6.7000900@codemonkey.ws>
  <4C73CCCB.6050704@redhat.com>
  <4C73CF8D.5060405@codemonkey.ws>
  <4C74C2F3.9050506@redhat.com>
  <4C7510C1.8080305@codemonkey.ws>
  <4C75195A.8050508@redhat.com>
  <4C751DBB.8060101@codemonkey.ws>
  <4C752211.5010600@redhat.com>
  <4C75252F.6040002@codemonkey.ws>
  <4C752A56.6060609@redhat.com>
  <4C753171.2050405@codemonkey.ws>
In-Reply-To: <4C753171.2050405@codemonkey.ws>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: [Qemu-devel] Re: [RFC][STABLE 0.13] Revert "qcow2: Use bdrv_(p)write_sync for metadata writes"
List-Id: qemu-devel.nongnu.org
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Anthony Liguori
Cc: Kevin Wolf , stefanha@gmail.com, mjt@tls.msk.ru, qemu-devel@nongnu.org, hch@lst.de

On 08/25/2010 06:06 PM, Anthony Liguori wrote:
> On 08/25/2010 09:36 AM, Avi Kivity wrote:
>>>
>>> If you tried to maintain a free list, then you would need to sync on
>>> TRIM/DISCARD which is potentially a fast path.
>>> While a background task may be less efficient in the short term, it's
>>> just as efficient in the long term and it has the advantage of keeping
>>> any fast path fast.
>>>
>>
>> You only need to sync when the free list size grows beyond the amount
>> of space you're prepared to lose on power fail.  And you may be able
>> to defer the background task indefinitely by satisfying new
>> allocations from the free list.
>
> Free does not mean free.  If you immediately punch a hole in the l2
> without doing a sync, then you're never sure whether the hole is there
> on disk or not.  So if you then allocate that block and put it
> somewhere else in another l2 table, you need to sync the previous l2
> change before you update the new l2.
>
> Otherwise you can have two l2 entries pointing to the same block after
> a power failure.  That's not a leak, that's a data corruption.

L2 certainly needs to be updated before the block is reused.  But that's
not different from a file format without a free list.

The batching I was referring to was only for free list management, same
as the allocation issue which started this thread.

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.
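
For readers following the thread, the ordering rule discussed above can be sketched as a toy Python model. This is not QEMU or qcow2 code: the class `SketchImage`, its fields, the `sync_before_reuse` switch and the worst-case crash model (pending allocations reach the disk, pending holes do not) are all assumptions made up for illustration. It shows two things: discards only force a sync once the unsynced free list exceeds a chosen bound, and a freed cluster is handed out again only after the L2 update that unmapped it is durable.

```python
class SketchImage:
    """Toy image: one flat 'L2' map from guest cluster to host cluster."""

    def __init__(self, max_unsynced_free, sync_before_reuse=True):
        self.l2 = {}             # current (cached) mapping
        self.durable_l2 = {}     # mapping as of the last sync
        self.pending = []        # unsynced L2 updates: (guest, host or None)
        self.free_list = []      # freed clusters whose hole is durable
        self.pending_free = []   # freed clusters whose hole is not yet durable
        self.max_unsynced_free = max_unsynced_free
        self.sync_before_reuse = sync_before_reuse
        self.next_host = 0       # next never-used cluster at end of file
        self.syncs = 0

    def sync(self):
        # Everything cached becomes durable; pending frees become reusable.
        self.durable_l2 = dict(self.l2)
        self.pending.clear()
        self.free_list.extend(self.pending_free)
        self.pending_free.clear()
        self.syncs += 1

    def discard(self, guest):
        # Punch the hole in the cached L2 only; no sync on the fast path.
        host = self.l2.pop(guest)
        self.pending.append((guest, None))
        self.pending_free.append(host)
        # Sync only when more space is at risk than we accept losing.
        if len(self.pending_free) > self.max_unsynced_free:
            self.sync()

    def allocate(self, guest):
        if not self.free_list and self.pending_free:
            if self.sync_before_reuse:
                self.sync()  # make the hole durable before reusing the cluster
            else:
                # Unsafe variant: reuse while the old mapping may still be on disk.
                self.free_list.append(self.pending_free.pop())
        if self.free_list:
            host = self.free_list.pop()
        else:
            host = self.next_host
            self.next_host += 1
        self.l2[guest] = host
        self.pending.append((guest, host))
        return host

    def worst_case_after_power_fail(self):
        # Assumed worst case: new mappings hit the disk, holes did not.
        state = dict(self.durable_l2)
        for guest, host in self.pending:
            if host is not None:
                state[guest] = host
        return state


def has_double_mapping(mapping):
    """True if two guest clusters point at the same host cluster."""
    hosts = list(mapping.values())
    return len(hosts) != len(set(hosts))


def run_scenario(sync_before_reuse):
    img = SketchImage(max_unsynced_free=4, sync_before_reuse=sync_before_reuse)
    img.allocate(0)   # guest 0 -> host 0
    img.sync()
    img.discard(0)    # hole punched, not yet durable
    img.allocate(1)   # wants to reuse host 0 for guest 1
    return img
```

With `sync_before_reuse=True` the worst-case state after a power failure has no doubly mapped cluster; with it set to `False` the same sequence leaves guest clusters 0 and 1 both pointing at host cluster 0, which is the corruption described in the quoted text. The sketch chooses to sync when only unsynced frees are available; allocating a fresh cluster at the end of the file instead would avoid that sync at the cost of growing the image.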