From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from [140.186.70.92] (port=54880 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1OntfF-0004fk-Tp for qemu-devel@nongnu.org; Tue, 24 Aug 2010 09:35:50 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.69) (envelope-from ) id 1OntfB-0001t9-JX for qemu-devel@nongnu.org; Tue, 24 Aug 2010 09:35:45 -0400
Received: from mail-iw0-f173.google.com ([209.85.214.173]:38936) by eggs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1OntfB-0001t4-Gd for qemu-devel@nongnu.org; Tue, 24 Aug 2010 09:35:41 -0400
Received: by iwn38 with SMTP id 38so2915758iwn.4 for ; Tue, 24 Aug 2010 06:35:40 -0700 (PDT)
Message-ID: <4C73CAA9.2060104@codemonkey.ws>
Date: Tue, 24 Aug 2010 08:35:37 -0500
From: Anthony Liguori
MIME-Version: 1.0
References: <1282646430-5777-1-git-send-email-kwolf@redhat.com> <4C73C2BF.8050300@codemonkey.ws> <4C73C622.7080808@redhat.com> <4C73C926.3010901@codemonkey.ws> <4C73C9CF.7090800@redhat.com>
In-Reply-To: <4C73C9CF.7090800@redhat.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: [Qemu-devel] Re: [RFC][STABLE 0.13] Revert "qcow2: Use bdrv_(p)write_sync for metadata writes"
List-Id: qemu-devel.nongnu.org
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Avi Kivity
Cc: Kevin Wolf , stefanha@gmail.com, mjt@tls.msk.ru, qemu-devel@nongnu.org, hch@lst.de

On 08/24/2010 08:31 AM, Avi Kivity wrote:
> On 08/24/2010 04:29 PM, Anthony Liguori wrote:
>>
>> I'm not sure this patch is needed in the first place.
>>
>> If you have a sequence of operations like:
>>
>> 0) receive guest write request Z
>> 1) submit write A
>> 2) write A completes
>> 3) submit write B
>> 4) write B completes
>> 5) report guest write Z complete
>>
>> You're adding a:
>>
>> 4.5) sync write B
>>
>> Which is ultimately unnecessary if what you care about is avoiding
>> reordering of step (2) and (4).
>> When a write() request completes,
>> you're guaranteed that a subsequent read() request will return the
>> written data. That's always true. If I could do a write(A) followed
>> by a write(B) and then read()=A, no software would actually function
>> correctly.
>>
>> It's important to make sure that you don't get image corruption if
>> (2) happens but not (4). But I think that's okay in qcow2 today.
>
> It's about metadata writes. If an operation changes metadata, we must
> sync it to disk before writing any data or other metadata which
> depends on it, regardless of any promises to the guest.

Why? If the metadata isn't synced, we lose the write.

But that can happen anyway because we're not sync'ing the data.

We need to sync the metadata in the event of a guest-initiated flush, but we
shouldn't need to for a normal write.

Regards,

Anthony Liguori
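
For illustration only, here is a rough sketch of the sequence discussed above, written against plain POSIX file calls from Python rather than the QEMU block layer. The function names (`handle_guest_write`, `handle_guest_flush`, `demo`) and the offsets are hypothetical and do not come from qcow2. It shows just the read-after-write point: once the two writes have returned, a later read sees both without any sync in between, and the sync is issued only on a flush request. It says nothing about what ends up on disk after a crash, which is the part still being argued in this thread.

```python
import os
import tempfile


def handle_guest_write(fd, meta_off, meta, data_off, data):
    """Steps 1-5 above: write A, then write B, with no sync in between.

    Hypothetical helper; "A" stands in for a metadata update and "B" for
    the write that follows it.
    """
    os.pwrite(fd, meta, meta_off)   # steps 1-2: write A is complete on return
    os.pwrite(fd, data, data_off)   # steps 3-4: write B is complete on return
    # step 5: report completion to the guest; no fsync here


def handle_guest_flush(fd):
    """Guest-initiated flush: the one place this sketch syncs to disk."""
    os.fsync(fd)


def demo():
    # Scratch file standing in for an image file (assumption for the demo).
    fd, path = tempfile.mkstemp()
    try:
        handle_guest_write(fd, 0, b"META", 4096, b"DATA")
        # Reads issued after the writes returned see the written bytes,
        # even though nothing has been synced yet.
        seen = (os.pread(fd, 4, 0), os.pread(fd, 4, 4096))
        handle_guest_flush(fd)
        return seen
    finally:
        os.close(fd)
        os.unlink(path)
```

`os.pwrite` and `os.pread` are only available on Unix-like systems, so the sketch assumes one of those.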