From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 12 Nov 2013 09:17:34 -0000
From: Blue
Message-Id: <20131112091735.10165.76015.malonedeb@soybean.canonical.com>
Subject: [Qemu-devel] [Bug 1250360] [NEW] qcow2 image logical corruption after host crash
Reply-To: Bug 1250360 <1250360@bugs.launchpad.net>
To: qemu-devel@nongnu.org

Public bug reported:

Description of problem:
In case of power failure, disk images that were active and created in qcow2 format can become logically corrupt so that they appear unused (full of zeroes).

Data seems to be there, but at this moment I cannot find any reliable method to recover it. Were it a raw image, a recovery path would be available, but a qcow2 image only presents zeroes once it gets corrupted.
My understanding is that the blockmap of the image gets reset and the image is then assumed to be unused.

My detailed setup:
Kernel 2.6.32-358.18.1.el6.x86_64
qemu-kvm-0.12.1.2-2.355.0.1.el6.centos.7.x86_64
Used via libvirt libvirt-0.10.2-18.el6_4.14.x86_64

The image was used from an NFS share (the NFS server did NOT crash and remained permanently active).

qemu-img check finds no corruption;
qemu-img convert fully converts the image to raw, but the result is a raw image full of zeroes. However, there is data in the file, and the storage backend was not restarted or inactivated during the incident.

I encountered this issue on two different machines; in both cases I was not able to recover the data.

The image was qcow2, thin provisioned, created like this:
qemu-img create -f qcow2 -o cluster_size=2M imagename.img

While addressing the root cause so that this issue does not repeat would be the ideal scenario, a temporary workaround to run on the affected qcow2 image to "patch" it and recover the data (possibly after a full fsck/recovery inside the guest) would also be good. Otherwise we are basically losing data on a large scale when using qcow2.

Version-Release number of selected component (if applicable):
Kernel 2.6.32-358.18.1.el6.x86_64
qemu-kvm-0.12.1.2-2.355.0.1.el6.centos.7.x86_64
Used via libvirt libvirt-0.10.2-18.el6_4.14.x86_64

How reproducible:
I am not able to reproduce it manually (and at the moment do not have enough resources to try), but the probability of the issue seems quite high, as this is the second case of such corruption in weeks.

Additional info:
I can privately provide an image displaying the corruption.

The reported problem actually has two aspects: the first is the cause that eventually produces this issue. The second is that once the logical corruption has occurred, qemu-img check finds nothing wrong with the image, which is obviously wrong.
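One way to test the "blockmap gets reset" hypothesis without relying on qemu-img check is to read the image's top-level mapping directly. The following is a minimal sketch, assuming that the "blockmap" corresponds to the qcow2 L1/L2 cluster tables and that the image uses the standard big-endian qcow2 header layout; the function name summarize_l1 is made up for illustration and is not part of QEMU. It only counts L1 entries that point to an L2 table and does not inspect the L2 tables themselves.

```python
import struct

QCOW2_MAGIC = b"QFI\xfb"
# Fixed part of the qcow2 header shared by versions 2 and 3:
# 72 bytes, all fields big-endian.
HEADER_FMT = ">4sIQIIQIIQQIIQ"
# Bits 9-55 of an L1 entry hold the offset of an L2 table;
# a zero offset means no L2 table is allocated for that range.
L1_OFFSET_MASK = 0x00FFFFFFFFFFFE00


def summarize_l1(f):
    """Return (cluster_size, virtual_size, l1_entries, allocated_l1_entries).

    `f` is a binary file object opened read-only on the image.
    """
    header_len = struct.calcsize(HEADER_FMT)
    f.seek(0)
    raw = f.read(header_len)
    if len(raw) < header_len:
        raise ValueError("file too short for a qcow2 header")

    (magic, _version, _backing_off, _backing_len, cluster_bits, size,
     _crypt, l1_size, l1_offset, _rc_off, _rc_clusters,
     _nb_snapshots, _snapshots_off) = struct.unpack(HEADER_FMT, raw)
    if magic != QCOW2_MAGIC:
        raise ValueError("not a qcow2 image")

    # Read the whole L1 table: l1_size entries of 8 bytes each.
    f.seek(l1_offset)
    table = f.read(8 * l1_size)
    if len(table) < 8 * l1_size:
        raise ValueError("L1 table is truncated")
    entries = struct.unpack(">%dQ" % l1_size, table)

    # Count entries whose offset field is non-zero.
    allocated = sum(1 for entry in entries if entry & L1_OFFSET_MASK)
    return 1 << cluster_bits, size, l1_size, allocated
```

Run on a copy of the affected file, for example with open("imagename.img", "rb") as f: print(summarize_l1(f)). If the file is large on disk but the last value is 0, the top-level mapping no longer references any data, which would be consistent with the behaviour described above. A non-zero count does not rule out the problem, since the L2 tables could be the part that was lost.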
** Affects: qemu
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1250360

Title:
  qcow2 image logical corruption after host crash

Status in QEMU:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1250360/+subscriptions