From: Jamie Lokier <jamie@shareable.org>
To: qemu-devel@nongnu.org
Cc: Marc Bevand <m.bevand@gmail.com>, Gleb Natapov <gleb@redhat.com>,
	kvm@vger.kernel.org
Subject: Re: [Qemu-devel] Re: qcow2 corruption observed, fixed by reverting old change
Date: Fri, 13 Feb 2009 16:23:36 +0000
Message-ID: <20090213162336.GI18471@shareable.org>
In-Reply-To: <49955681.9070301@suse.de>

Marc Bevand wrote:
> I tested kvm-81 and kvm-83 as well (can't test kvm-80 or older
> because of the qcow2 performance regression caused by the default
> writethrough caching policy) but it randomly triggers an even worse
> bug: the moment I shut down a guest by typing "quit" in the monitor,
> it sometimes overwrites the first 4kB of the disk image with mostly
> NUL bytes (!) which completely destroys it. I am familiar with the
> qcow2 format and apparently this 4kB block seems to be an L2 table
> with most entries set to zero. I have had to restore at least 6 or 7
> disk images from backup after occurrences of that bug.

Ow!  That's a really serious bug.  How many of us have regular hourly
backups of our disk images?  And how many of us are running databases
or mail servers on our VMs, where even restoring from a recent backup
is a harmful event?
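
For anyone who wants to check an image for the symptom Marc describes,
here is a minimal sketch. It assumes only two things: that a healthy
qcow2 image starts with the magic bytes "QFI\xfb", and that an L2 table
is an array of 8-byte big-endian entries. The function names and the
"more than half the entries are zero" threshold are my own illustrative
choices, not anything taken from the QEMU sources.

```python
import struct

QCOW2_MAGIC = b"QFI\xfb"  # first four bytes of an intact qcow2 image


def looks_overwritten(first_block: bytes) -> bool:
    """Heuristic: magic is missing and most 8-byte entries are zero.

    This mirrors the reported symptom (header replaced by a mostly
    empty L2 table). It is a guess at a useful check, not a validator.
    """
    if first_block[:4] == QCOW2_MAGIC:
        return False
    count = len(first_block) // 8
    entries = struct.unpack(f">{count}Q", first_block[: count * 8])
    zero_entries = sum(1 for entry in entries if entry == 0)
    return zero_entries > count // 2


def check_image(path: str) -> bool:
    """Read the first 4kB of the file at `path` and apply the heuristic."""
    with open(path, "rb") as image:
        return looks_overwritten(image.read(4096))
```

Running check_image() on a copy of each image after a guest shutdown
would flag this particular failure early. It says nothing about other
kinds of corruption.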

I've not noticed this bug reported by Marc, probably because I nearly
always finish a KVM session by killing it, either because I'm testing
or because KVM locks up occasionally and needs kill -9 :-(

And because, until recently, I've not used any KVM newer than kvm-72
in production, only for testing my personal VMs.

I must say, _thank goodness_ that the bug I reported occurs at boot
time, and caused me to revert the qcow2 code.  I'm now running a
critical VM on kvm-83 with reverted qcow2.  Sure it's risky as there's
no reason to believe kvm-83 is "stable", but there's no reason to
believe any other version of KVM is especially stable either - there's
no stabilising bug-fix-only branch that I'm aware of.

If I hadn't had the boot time bug which I reported, I could have had
unrecoverable corruption from Marc's bug instead.

For the time being, I'm going to _strongly_ advise my professional
clients who use VMs to never, *ever* use qcow2 except for snapshot
testing.

Unfortunately the other delta/growable formats seem to be even less
reliable, because they're not used much, so they should be avoided too.

This corruption plus the data integrity/durability issues on host
failure are a big deal.  Even with kvm-72, I'm nervous about qcow2 now.
Just because a bug hasn't caused obvious guest failures, doesn't mean
it's not happening.

Is there a way to restructure the code and/or how it works so it's
more clearly correct?

> My intuition tells me this may be the qcow2 code trying to allocate
> a cluster to write a new L2 table, but not noticing the allocation
> failed (represented by a 0 offset), and writing the L2 table at that
> 0 offset, overwriting the qcow2 header.

My intuition says it's important to identify the cause of this, as it
might not be qcow2 but the AIO code going awry with a random offset
when closing down, e.g. if there's a use-after-free bug.
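
To make the quoted hypothesis concrete, here is a toy model of the
failure mode Marc suggests. It is not QEMU code: ToyImage, alloc_cluster
and both write functions are hypothetical names. The only assumption
carried over from the quote is that a failed allocation is reported as
offset 0, which is also where the header lives.

```python
CLUSTER_SIZE = 4096
HEADER = b"QFI\xfb"  # stands in for the qcow2 header at offset 0


class ToyImage:
    """In-memory stand-in for an image file; cluster 0 holds the header."""

    def __init__(self, clusters: int):
        self.clusters = clusters
        self.data = bytearray(clusters * CLUSTER_SIZE)
        self.data[: len(HEADER)] = HEADER
        self.next_free = 1

    def alloc_cluster(self) -> int:
        """Return the byte offset of a free cluster, or 0 on failure."""
        if self.next_free >= self.clusters:
            return 0
        offset = self.next_free * CLUSTER_SIZE
        self.next_free += 1
        return offset


def write_l2_unchecked(image: ToyImage) -> int:
    """Hypothesised bug: the 0 'failure' offset is used as a real offset."""
    offset = image.alloc_cluster()
    image.data[offset : offset + CLUSTER_SIZE] = bytes(CLUSTER_SIZE)
    return offset


def write_l2_checked(image: ToyImage) -> int:
    """Same write, but a failed allocation is treated as an error."""
    offset = image.alloc_cluster()
    if offset == 0:
        raise OSError("cluster allocation failed")
    image.data[offset : offset + CLUSTER_SIZE] = bytes(CLUSTER_SIZE)
    return offset
```

With a full image (ToyImage(1)), write_l2_unchecked() zeroes the header,
while write_l2_checked() raises and leaves it intact. A check like this
would not help if the stray write comes from the AIO path instead, which
is why identifying the real cause matters.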

Marc..  this is quite a serious bug you've reported.  Is there a
reason you didn't report it earlier?

-- Jamie 
