From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from [140.186.70.92] (port=54040 helo=eggs.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1OuB3R-0003MT-VI
	for qemu-devel@nongnu.org; Fri, 10 Sep 2010 17:22:42 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.69)
	(envelope-from <jamie@shareable.org>) id 1OuB3Q-0004pz-9O
	for qemu-devel@nongnu.org; Fri, 10 Sep 2010 17:22:41 -0400
Received: from mail2.shareable.org ([80.68.89.115]:56268)
	by eggs.gnu.org with esmtp (Exim 4.69)
	(envelope-from <jamie@shareable.org>) id 1OuB3Q-0004p0-5R
	for qemu-devel@nongnu.org; Fri, 10 Sep 2010 17:22:40 -0400
Date: Fri, 10 Sep 2010 22:22:31 +0100
From: Jamie Lokier <jamie@shareable.org>
Subject: Re: [Qemu-devel] [RFC] qed: Add QEMU Enhanced Disk format
Message-ID: <20100910212231.GH4062@shareable.org>
References: <4C84E738.3020802@codemonkey.ws> <4C865187.6090508@redhat.com>
	<4C865CFE.7010508@codemonkey.ws> <4C8663C4.1090508@redhat.com>
	<4C866773.2030103@codemonkey.ws>
	<4C86BC6B.5010809@codemonkey.ws> <4C874812.9090807@redhat.com>
	<395D4377-00F9-4765-94C4-470BDFA1F96E@suse.de>
	<4C874F22.6060802@redhat.com>
	<AANLkTik+NHXjVmW5ozSGOOLf_FeQE8DHhoQPN6LOpFW0@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <AANLkTik+NHXjVmW5ozSGOOLf_FeQE8DHhoQPN6LOpFW0@mail.gmail.com>
List-Id: qemu-devel.nongnu.org
List-Unsubscribe: <http://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <http://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Stefan Hajnoczi <stefanha@gmail.com>
Cc: Kevin Wolf <kwolf@redhat.com>, Alexander Graf <agraf@suse.de>, Avi Kivity <avi@redhat.com>, Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>, qemu-devel@nongnu.org

Stefan Hajnoczi wrote:
> Since there is no ordering imposed between the data write and metadata
> update, the following scenarios may occur on crash:
> 1. Neither data write nor metadata update reach the disk.  This is
> fine, qed metadata has not been corrupted.
> 2. Data reaches disk but metadata update does not.  We have leaked a
> cluster but not corrupted metadata.  Leaked clusters can be detected
> with qemu-img check.
> 3. Metadata update reaches disk but data does not.  The interesting
> case!  The L2 table now points to a cluster which is beyond the last
> cluster in the image file.  Remember that file size is rounded down by
> cluster size, so partial data writes are discarded and this case
> applies.

Better add:

4. File size is extended fully, but the data didn't all reach the disk.
5. Metadata is partially updated.
6. (Nasty) A partial metadata write has clobbered neighbouring
   metadata which wasn't meant to be changed.  (On normal hard disks
   this may affect up to a sector - data is hard to come by.  On
   flash and RAIDs it sometimes affects a much larger file range -
   I call it the "radius of destruction").
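
To make the radius concrete, here is a minimal sketch (not QED code) of
which table entries share a damaged unit with the entry being updated.
The 8-byte entry size, the table offset and the helper name
clobbered_entries are assumptions for illustration only; unit_size
stands for whatever the device may tear (a sector, a flash erase block,
a RAID stripe).

```python
def clobbered_entries(index, n_entries, table_offset, unit_size, entry_size=8):
    """Return the indexes of all table entries that overlap the
    unit_size-aligned file region containing entry `index`.

    All parameters are illustrative assumptions, not values from QED.
    """
    # Byte position in the file of the entry being updated.
    byte_pos = table_offset + index * entry_size
    # Aligned region the device might damage as a whole.
    unit_start = byte_pos - byte_pos % unit_size
    unit_end = unit_start + unit_size
    # Clamp to the table: the region may start before it or end after it.
    first = max(0, (unit_start - table_offset) // entry_size)
    last = min(n_entries - 1, (unit_end - 1 - table_offset) // entry_size)
    return range(first, last + 1)


# Updating entry 100 of a 32768-entry table that starts at 64 KiB:
sector = clobbered_entries(100, 32768, 65536, 512)
erase_block = clobbered_entries(100, 32768, 65536, 128 * 1024)
print(len(sector), len(erase_block))
```

With a 512-byte unit the update puts 64 entries at risk; with a 128 KiB
unit it is 8192 entries, even though only one was meant to change.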

Case 6 can also happen when doing the L1 update mentioned earlier, in
which case you might lose a much larger part of the guest image.
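
A rough sketch of why the L1 case is so much worse, assuming a
two-level table where each L2 entry maps one cluster and each L1 entry
maps one whole L2 table.  The 64 KiB cluster size, 256 KiB table size
and 8-byte entries are assumed numbers for illustration, and the
function names are hypothetical.

```python
def l2_entry_span(cluster_size):
    """Guest bytes mapped by one L2 entry (one cluster)."""
    return cluster_size


def l1_entry_span(cluster_size, table_size, entry_size=8):
    """Guest bytes mapped by one L1 entry (one full L2 table)."""
    entries_per_table = table_size // entry_size
    return entries_per_table * cluster_size


CLUSTER = 64 * 1024   # assumed cluster size
TABLE = 256 * 1024    # assumed size of one L1/L2 table
SECTOR_ENTRIES = 512 // 8  # entries sharing one 512-byte sector

# Guest data at risk if one sector of a table is clobbered.
print(SECTOR_ENTRIES * l2_entry_span(CLUSTER))         # L2 sector
print(SECTOR_ENTRIES * l1_entry_span(CLUSTER, TABLE))  # L1 sector
```

Under these assumptions a clobbered L2 sector puts 4 MiB of guest data
at risk, while a clobbered L1 sector puts 128 GiB at risk.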

-- Jamie