Re: [Qemu-devel] Re: qcow2 corruption observed, fixed by reverting old change

From: Jamie Lokier <jamie@shareable.org>
To: qemu-devel@nongnu.org
Cc: Marc Bevand <m.bevand@gmail.com>, Gleb Natapov <gleb@redhat.com>,
	kvm@vger.kernel.org
Subject: Re: [Qemu-devel] Re: qcow2 corruption observed, fixed by reverting old change
Date: Fri, 13 Feb 2009 16:23:36 +0000
Message-ID: <20090213162336.GI18471@shareable.org>
In-Reply-To: <49955681.9070301@suse.de>

Marc Bevand wrote:
> I tested kvm-81 and kvm-83 as well (can't test kvm-80 or older
> because of the qcow2 performance regression caused by the default
> writethrough caching policy) but it randomly triggers an even worse
> bug: the moment I shut down a guest by typing "quit" in the monitor,
> it sometimes overwrites the first 4kB of the disk image with mostly
> NUL bytes (!) which completely destroys it. I am familiar with the
> qcow2 format and apparently this 4kB block seems to be an L2 table
> with most entries set to zero. I have had to restore at least 6 or 7
> disk images from backup after occurrences of that bug.
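
As a quick way to tell whether an image has been hit by the symptom
described above, one can check that the file still begins with the
qcow2 magic bytes ('Q', 'F', 'I', 0xfb). The following is only a
minimal sketch: header_looks_intact() is a hypothetical helper, not
part of qemu, and it only inspects a buffer that the caller has
already read from the start of the image.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The qcow2 magic: the bytes 'Q', 'F', 'I', 0xfb at offset 0. */
static const uint8_t QCOW2_MAGIC[4] = { 'Q', 'F', 'I', 0xfb };

/*
 * Hypothetical helper (not a qemu function): returns 1 if buf starts
 * with the qcow2 magic, 0 otherwise. A header overwritten with NUL
 * bytes fails this check.
 */
static int header_looks_intact(const uint8_t *buf, size_t len)
{
    assert(buf != NULL);
    if (len < sizeof(QCOW2_MAGIC))
        return 0;
    return memcmp(buf, QCOW2_MAGIC, sizeof(QCOW2_MAGIC)) == 0;
}
```

A passing check says only that the first four bytes survived; it says
nothing about the rest of the header or the L1/L2 tables.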

Ow!  That's a really serious bug.  How many of us have regular hourly
backups of our disk images?  And how many of us are running databases
or mail servers on our VMs, where even restoring from a recent backup
is a harmful event?

I've not noticed this bug reported by Marc, probably because I nearly
always finish a KVM session by killing it, either because I'm testing
or because KVM locks up occasionally and needs kill -9 :-(

And because I've not used any KVM since kvm-72 in production until
recently, only for testing my personal VMs.

I must say, _thank goodness_ that the bug I reported occurs at boot
time, and caused me to revert the qcow2 code.  I'm now running a
critical VM on kvm-83 with reverted qcow2.  Sure it's risky as there's
no reason to believe kvm-83 is "stable", but there's no reason to
believe any other version of KVM is especially stable either - there's
no stabilising, bug-fix-only branch that I'm aware of.

If I hadn't had the boot time bug which I reported, I could have had
unrecoverable corruption from Marc's bug instead.

For the time being, I'm going to _strongly_ advise my professional
clients who use VMs to never, *ever* use qcow2 except for snapshot
testing.

Unfortunately the other delta/growable formats seem to be even less
reliable, because they're not used much, so they should be avoided too.

This corruption plus the data integrity/durability issues on host
failure are a big deal.  Even with kvm-72, I'm nervous about qcow2 now.
Just because a bug hasn't caused obvious guest failures doesn't mean
it's not happening.

Is there a way to restructure the code and/or how it works so it's
more clearly correct?

> My intuition tells me this may be the qcow2 code trying to allocate
> a cluster to write a new L2 table, but not noticing the allocation
> failed (represented by a 0 offset), and writing the L2 table at that
> 0 offset, overwriting the qcow2 header.
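
To make Marc's hypothesis concrete, here is a toy sketch of the
pattern he describes. It is not the qemu code: struct toy_image,
toy_alloc_cluster() and toy_write_new_l2() are hypothetical names,
and the "image" is just a small in-memory buffer. The assumption
being illustrated is that the allocator signals failure by returning
offset 0, which is also where the header lives, so a caller that
skips the check would zero the header.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CLUSTER_SIZE 4096u

/* Hypothetical stand-in for an image file, held in memory. */
struct toy_image {
    uint8_t  data[4 * CLUSTER_SIZE]; /* cluster 0 plays the header */
    uint64_t next_free;              /* offset of next unused cluster */
};

/*
 * Hypothetical allocator: returns the offset of a fresh cluster, or 0
 * on failure. Offset 0 is the header, so it is never a valid result.
 */
static uint64_t toy_alloc_cluster(struct toy_image *img)
{
    assert(img != NULL);
    if (img->next_free + CLUSTER_SIZE > sizeof(img->data))
        return 0;
    uint64_t off = img->next_free;
    img->next_free += CLUSTER_SIZE;
    return off;
}

/*
 * Writes an empty (all-zero) L2 table into a newly allocated cluster.
 * Returns 0 on success, -1 if the allocation failed.
 */
static int toy_write_new_l2(struct toy_image *img)
{
    uint64_t off = toy_alloc_cluster(img);
    if (off == 0)
        return -1; /* without this check the memset hits the header */
    memset(img->data + off, 0, CLUSTER_SIZE);
    return 0;
}
```

If the "off == 0" check were missing, a failed allocation would
produce exactly the reported symptom: the first 4kB replaced by a
mostly-zero L2 table.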

My intuition says it's important to identify the cause of this, as it
might not be qcow2 but the AIO code going awry with a random offset
when closing down, e.g. if there's a use-after-free bug.

Marc..  this is quite a serious bug you've reported.  Is there a
reason you didn't report it earlier?

-- Jamie