* Re: parent transid verify failures on 2.6.39
@ 2011-06-22 22:42 Andrej Podzimek
  2011-06-23  1:45 ` Chris Mason
  0 siblings, 1 reply; 16+ messages in thread
From: Andrej Podzimek @ 2011-06-22 22:42 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs

>> After doing an upgrade to 2.6.39 from 2.6.39-rc7, I am unable to mount
>> my 3 disk btrfs volume.  It was a clean reboot, which makes it all the
>> more puzzling.  This is what I'm getting:
>>
>>
>> [68808.339109] device fsid a941511a96bcbfb8-8c123cb07aa6aaa1 devid 2
>> transid 339584 /dev/sdc1
>> [68808.340354] device fsid a941511a96bcbfb8-8c123cb07aa6aaa1 devid 1
>> transid 339584 /dev/sda1
>> [68808.340774] device fsid a941511a96bcbfb8-8c123cb07aa6aaa1 devid 3
>> transid 339584 /dev/sdb1
>>
>> [70106.913668] btrfs: disk space caching is enabled
>> [70106.968648] parent transid verify failed on 6038227976192 wanted
>> 337418 found 337853
>> [70106.969031] parent transid verify failed on 6038227976192 wanted
>> 337418 found 337853
>> [70106.969403] parent transid verify failed on 6038227976192 wanted
>> 337418 found 337853
>> [70106.969671] parent transid verify failed on 6038227976192 wanted
>> 337418 found 337853
>> [70106.969691] parent transid verify failed on 6038227976192 wanted
>> 337418 found 337853
>> [70106.969704] Failed to read block groups: -5
>> [70107.050658] btrfs: open_ctree failed
>>
>> I went to run a btrfsck, but found out that I needed to compile with
>> the tmp branch or I would get an unsupported features message (lzo and
>> space_cache).  After compiling that, when I run btrfsck, I get this:
>>
>> parent transid verify failed on 6038227976192 wanted 337418 found 337853
>> parent transid verify failed on 6038227976192 wanted 337418 found 337853
>> parent transid verify failed on 6038227976192 wanted 337418 found 337853
>>
>> And then it stops.  This happens with btrfs-debug-tree, or
>> btrfs-select-super.  I've tried it on sda1, sdb1, and sdc1 and also
>> with -s 0, -s 1, and -s 2.  Dmesg shows a segfault:
>>
>> [71775.589462] btrfsck[14453]: segfault at c4 ip 000000000040e477 sp
>> 00007fffa9eb4d30 error 4 in btrfsck[400000+21000]
>>
>> For fun, I ran it through gdb and I got this:
>>
>> Program received signal SIGSEGV, Segmentation fault.
>> find_first_block_group (root=0x61d1b0, path=0x61ef10, key=0x7fffffffe240)
>>     at extent-tree.c:3028
>> 3028                    if (slot >= btrfs_header_nritems(leaf)) {
>>
>>
>>
>> Is there any hope of recovery here?  Not the end of the world if the
>> volume is lost, but it would be a bit of a pain and I'm at a loss as
>> to why it happened.  I tried mounting with the new integration-test
>> branch just for fun, but there's no difference on the mounting.  Any
>> help that could be provided would be immensely appreciated.  Thanks!
>>
>
> So I have a patch I can give you that will possibly help you recover
> your data if you don't have backups, or you can wait a couple of days
> (hopefully) for the new btrfsck tool that will be much better than the
> hack I can give you.  Thanks,
>
> Josef

Could I try your hack, pretty please? If there's any chance it could either resolve this problem

	http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg10683.html ,

or at least restore the data from the filesystem, then I'd like to give it a go. Waiting for the new btrfsck is currently not an option for me :-)

Andrej



* Re: parent transid verify failures on 2.6.39
  2011-06-22 22:42 parent transid verify failures on 2.6.39 Andrej Podzimek
@ 2011-06-23  1:45 ` Chris Mason
  2011-06-23 18:26   ` Daniel Witzel
  2011-06-23 19:54   ` Josef Bacik
  0 siblings, 2 replies; 16+ messages in thread
From: Chris Mason @ 2011-06-23  1:45 UTC (permalink / raw)
  To: Andrej Podzimek; +Cc: Josef Bacik, linux-btrfs

Excerpts from Andrej Podzimek's message of 2011-06-22 18:42:28 -0400:
> 
> Could I try your hack, pretty please? If there's any chance it could either resolve this problem
> 
>     http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg10683.html ,
> 
> or at least restore the data from the filesystem, then I'd like to give it a go. Waiting for the new btrfsck is currently not an option for me :-)

It looks like your box is failing to read the extent allocation tree.
We don't allow the mount to proceed without that tree, but you don't
actually need it for a readonly mount (to copy things off).

Josef, is your hack just a mount option to make -o readonly skip the
extent allocation tree?

I can put this into my -o recovery patch and we can give it a try.

-chris


* Re: parent transid verify failures on 2.6.39
  2011-06-23  1:45 ` Chris Mason
@ 2011-06-23 18:26   ` Daniel Witzel
  2011-06-23 19:54   ` Josef Bacik
  1 sibling, 0 replies; 16+ messages in thread
From: Daniel Witzel @ 2011-06-23 18:26 UTC (permalink / raw)
  To: linux-btrfs

I'm game for trying it. I've been waiting a good while to recover, and I have spare
drives to save the data to. Just post the git link for it if it's different than
Chris Mason's git tree.





* Re: parent transid verify failures on 2.6.39
  2011-06-23  1:45 ` Chris Mason
  2011-06-23 18:26   ` Daniel Witzel
@ 2011-06-23 19:54   ` Josef Bacik
  2011-06-23 21:11     ` Daniel Witzel
  2011-06-24  1:30     ` Andrej Podzimek
  1 sibling, 2 replies; 16+ messages in thread
From: Josef Bacik @ 2011-06-23 19:54 UTC (permalink / raw)
  To: Chris Mason; +Cc: Andrej Podzimek, Josef Bacik, linux-btrfs

On Wed, Jun 22, 2011 at 09:45:20PM -0400, Chris Mason wrote:
> Excerpts from Andrej Podzimek's message of 2011-06-22 18:42:28 -0400:
> > 
> > Could I try your hack, pretty please? If there's any chance it could either resolve this problem
> > 
> >     http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg10683.html ,
> > 
> > or at least restore the data from the filesystem, then I'd like to give it a go. Waiting for the new btrfsck is currently not an option for me :-)
> 
> It looks like your box is failing to read the extent allocation tree.
> We don't allow the mount to proceed without that tree, but you don't
> actually need it for a readonly mount (to copy things off).
> 
> Josef, is your hack just a mount option to make -o readonly skip the
> extent allocation tree?
> 
> I can put this into my -o recovery patch and we can give it a try.
> 

Here's the patch; you _have_ to mount read-only (-o ro).  Basically what it does is
search all the mirrors, find the one with the newest generation number, and
just use that one, assuming that it will be the closest one to what we want.
This has worked relatively well for the people who have used it, so hopefully it
will work for you.  Thanks,

Josef

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index c650a1d..53e330e 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -281,7 +281,7 @@ static int csum_tree_block(struct btrfs_root *root, struct extent_buffer *buf,
  * in the wrong place.
  */
 static int verify_parent_transid(struct extent_io_tree *io_tree,
-				 struct extent_buffer *eb, u64 parent_transid)
+				 struct extent_buffer *eb, u64 parent_transid, int uptodate)
 {
 	struct extent_state *cached_state = NULL;
 	int ret;
@@ -296,6 +296,11 @@ static int verify_parent_transid(struct extent_io_tree *io_tree,
 		ret = 0;
 		goto out;
 	}
+	if (!uptodate) {
+		ret = 0;
+		goto out;
+	}
+
 	if (printk_ratelimit()) {
 		printk("parent transid verify failed on %llu wanted %llu "
 		       "found %llu\n",
@@ -323,6 +328,9 @@ static int btree_read_extent_buffer_pages(struct btrfs_root *root,
 	int ret;
 	int num_copies = 0;
 	int mirror_num = 0;
+	int uptodate = 1;
+	int good_mirror = 0;
+	u64 generation = 0;
 
 	clear_bit(EXTENT_BUFFER_CORRUPT, &eb->bflags);
 	io_tree = &BTRFS_I(root->fs_info->btree_inode)->io_tree;
@@ -330,9 +338,14 @@ static int btree_read_extent_buffer_pages(struct btrfs_root *root,
 		ret = read_extent_buffer_pages(io_tree, eb, start, 1,
 					       btree_get_extent, mirror_num);
 		if (!ret &&
-		    !verify_parent_transid(io_tree, eb, parent_transid))
+		    !verify_parent_transid(io_tree, eb, parent_transid, uptodate))
 			return ret;
 
+		if (btrfs_header_generation(eb) > generation) {
+			good_mirror = mirror_num;
+			generation = btrfs_header_generation(eb);
+		}
+
 		/*
 		 * This buffer's crc is fine, but its contents are corrupted, so
 		 * there is no reason to read the other copies, they won't be
@@ -347,8 +360,11 @@ static int btree_read_extent_buffer_pages(struct btrfs_root *root,
 			return ret;
 
 		mirror_num++;
-		if (mirror_num > num_copies)
-			return ret;
+		if (mirror_num > num_copies) {
+			mirror_num = good_mirror;
+			uptodate = 0;
+			continue;
+		}
 	}
 	return -EIO;
 }
@@ -1996,11 +2012,13 @@ struct btrfs_root *open_ctree(struct super_block *sb,
 		goto fail_block_groups;
 	}
 
+	/*
 	ret = btrfs_read_block_groups(extent_root);
 	if (ret) {
 		printk(KERN_ERR "Failed to read block groups: %d\n", ret);
 		goto fail_block_groups;
 	}
+	*/
 
 	fs_info->cleaner_kthread = kthread_run(cleaner_kthread, tree_root,
 					       "btrfs-cleaner");
@@ -2613,12 +2631,7 @@ int btrfs_buffer_uptodate(struct extent_buffer *buf, u64 parent_transid)
 
 	ret = extent_buffer_uptodate(&BTRFS_I(btree_inode)->io_tree, buf,
 				     NULL);
-	if (!ret)
-		return ret;
-
-	ret = verify_parent_transid(&BTRFS_I(btree_inode)->io_tree, buf,
-				    parent_transid);
-	return !ret;
+	return ret;
 }
 
 int btrfs_set_buffer_uptodate(struct extent_buffer *buf)
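
Editorial sketch of the fallback described above, as a standalone userspace C function rather than kernel code. It assumes a simplified model: each mirror is a hypothetical `struct mirror_copy` holding a read status and the generation (transid) from the block header, and `pick_mirror` is an illustrative name, not a btrfs function. The first pass accepts a copy whose generation matches what the parent expects; if none matches, it falls back to the readable copy with the newest generation, which is the idea the patch implements inside btree_read_extent_buffer_pages().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one on-disk copy (mirror) of a tree block. */
struct mirror_copy {
	int      read_ok;     /* nonzero if the read and checksum succeeded */
	uint64_t generation;  /* transid recorded in the block header */
};

/*
 * Return the index of the copy to use, or -1 if no copy is readable.
 *
 * A copy whose generation equals parent_transid is accepted immediately
 * and *exact is set to 1. Otherwise the readable copy with the highest
 * generation is returned and *exact is set to 0, meaning the caller got
 * a best-effort block that did not pass the parent transid check.
 */
static int pick_mirror(const struct mirror_copy *copies, size_t n,
		       uint64_t parent_transid, int *exact)
{
	int best = -1;
	uint64_t best_gen = 0;

	assert(copies != NULL && exact != NULL);

	for (size_t i = 0; i < n; i++) {
		if (!copies[i].read_ok)
			continue;	/* unreadable copy, try the next one */

		if (copies[i].generation == parent_transid) {
			*exact = 1;	/* normal case: transid verifies */
			return (int)i;
		}

		/* Remember the newest readable copy seen so far. */
		if (best < 0 || copies[i].generation > best_gen) {
			best = (int)i;
			best_gen = copies[i].generation;
		}
	}

	*exact = 0;
	return best;
}
```

With the numbers from the original report (wanted 337418, found 337853 on every copy), no copy matches, so the function returns the newest readable copy with `*exact` cleared. A block chosen this way is only a best guess, which is why the patch is meant for read-only mounts used to copy data off.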


* Re: parent transid verify failures on 2.6.39
  2011-06-23 19:54   ` Josef Bacik
@ 2011-06-23 21:11     ` Daniel Witzel
  2011-06-23 21:35       ` Josef Bacik
  2011-06-24  1:30     ` Andrej Podzimek
  1 sibling, 1 reply; 16+ messages in thread
From: Daniel Witzel @ 2011-06-23 21:11 UTC (permalink / raw)
  To: linux-btrfs

Well, still no cigar; it didn't even change the error output. Thank you though for
at least trying to help. Here is the error info with your patch applied to
/fs/btrfs/disk-io.c:

mount -o ro /dev/sdf1 (same for c1,d1,etc) /btrfs (dmesg output)
[ 1647.330104] btrfs: open_ctree failed
[ 1683.328038] device label 1TB0 devid 1 transid 2135 /dev/sdf1
[ 1683.344059] parent transid verify failed on 2206281838592 wanted 2135 found 1545
[ 1683.349109] btrfs: open_ctree failed

btrfsck -s 0 (and 1) /dev/sdf1(c1,d1,etc):
localhost btrfs-progs-unstable # ./btrfsck -s 1 /dev/sdc1
using SB copy 1, bytenr 67108864
failed to read /dev/sr0
failed to read /dev/sr0
parent transid verify failed on 2206281838592 wanted 2135 found 1545
btrfsck: disk-io.c:739: open_ctree_fd: Assertion `!(!tree_root->node)' failed.
Aborted

(dmesg output)
[ 1825.231434] device label 1TB0 devid 1 transid 2135 /dev/sdf1
[ 1825.235292] device label 1TB0 devid 2 transid 2134 /dev/sde1
[ 1825.238560] device label 1TB0 devid 3 transid 2135 /dev/sdd1
[ 1825.241176] device label 1TB0 devid 4 transid 2135 /dev/sdc1
[ 1825.244681] device label 1TB0 devid 5 transid 2135 /dev/sdb1


gentoo baselayout 2, kernel 2.6.39-r1, btrfs-progs-unstable:cmason git master
branch, 5 disk usb raid-0 array





* Re: parent transid verify failures on 2.6.39
  2011-06-23 21:11     ` Daniel Witzel
@ 2011-06-23 21:35       ` Josef Bacik
  2011-06-24 15:46         ` Daniel Witzel
  0 siblings, 1 reply; 16+ messages in thread
From: Josef Bacik @ 2011-06-23 21:35 UTC (permalink / raw)
  To: Daniel Witzel; +Cc: linux-btrfs

On 06/23/2011 05:11 PM, Daniel Witzel wrote:
> Well, still no cigar; it didn't even change the error output. Thank you though for
> at least trying to help. Here is the error info with your patch applied to
> /fs/btrfs/disk-io.c:
>
> mount -o ro /dev/sdf1 (same for c1,d1,etc) /btrfs (dmesg output)
> [ 1647.330104] btrfs: open_ctree failed
> [ 1683.328038] device label 1TB0 devid 1 transid 2135 /dev/sdf1
> [ 1683.344059] parent transid verify failed on 2206281838592 wanted 2135 found
> 1545
> [ 1683.349109] btrfs: open_ctree failed

You didn't apply it right then, because you shouldn't see these errors 
anymore.  Thanks,

Josef


* Re: parent transid verify failures on 2.6.39
  2011-06-23 19:54   ` Josef Bacik
  2011-06-23 21:11     ` Daniel Witzel
@ 2011-06-24  1:30     ` Andrej Podzimek
  2011-06-28 15:46       ` Daniel Witzel
  1 sibling, 1 reply; 16+ messages in thread
From: Andrej Podzimek @ 2011-06-24  1:30 UTC (permalink / raw)
  To: Josef Bacik; +Cc: Chris Mason, linux-btrfs

>>> Could I try your hack, pretty please? If there's any chance it could either resolve this problem
>>>
>>>      http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg10683.html ,
>>>
>>> or at least restore the data from the filesystem, then I'd like to give it a go. Waiting for the new btrfsck is currently not an option for me :-)
>>
>> It looks like your box is failing to read the extent allocation tree.
>> We don't allow the mount to proceed without that tree, but you don't
>> actually need it for a readonly mount (to copy things off).
>>
>> Josef, is your hack just a mount option to make -o readonly skip the
>> extent allocation tree?
>>
>> I can put this into my -o recovery patch and we can give it a try.
>>
>
> Here's the patch, you _have_ to mount -o readonly.  Basically what it does is
> search all the mirrors and finds the one with the newest generation number and
> just uses that one, assuming that it will be the closest one to what we want.
> This has worked relatively well for the people who have used it, so hopefully it
> will work for you.  Thanks,

Great, it works for me. I could mount the RAID0 root partition from an ArchLinux live CD (after patching, compiling and replacing the btrfs module first). Thank you very much!

My RAID1 /boot partition looks odd, and mount froze (with an oops) when I tried to access it this way. Fortunately, /boot doesn't matter that much; it will be easy to recover.

Andrej



* Re: parent transid verify failures on 2.6.39
  2011-06-23 21:35       ` Josef Bacik
@ 2011-06-24 15:46         ` Daniel Witzel
  0 siblings, 0 replies; 16+ messages in thread
From: Daniel Witzel @ 2011-06-24 15:46 UTC (permalink / raw)
  To: linux-btrfs

Well, here is what I'm doing:

patch -p1 < disk-io.patch  
output: "patching file fs/btrfs/disk-io.c"
rmmod btrfs 
rmmod lzo_compress
make -j3
make -j3 modules
make -j3 modules_install
cp arch/x86_64/boot/bzImage /boot/linux-next
depmod -a

(reboot)
modprobe btrfs
btrfs device scan
btrfs filesystem show (all drives show)
mount -o ro /dev/sdb1 /btrfs

and the output is:

[ 4364.813453] parent transid verify failed on 2206281838592 wanted 2135 found 1545
[ 4364.817093] btrfs: open_ctree failed
 

I checked the resulting disk-io.c file and the changes were merged. As you can
see, I rebuilt my kernel and modules, rebooted, and still got this error. Is there
a step I'm missing?

thanks






* Re: parent transid verify failures on 2.6.39
  2011-06-24  1:30     ` Andrej Podzimek
@ 2011-06-28 15:46       ` Daniel Witzel
  2011-06-28 16:44         ` Mitch Harder
  0 siblings, 1 reply; 16+ messages in thread
From: Daniel Witzel @ 2011-06-28 15:46 UTC (permalink / raw)
  To: linux-btrfs

Earlier I tried the read-only patch with no result. Josef said I must be
applying it wrong because the error I get is not possible with the patch applied.
I tried again with no luck and posted my steps for review. Well, here I am a few
days later with the following questions:

1) If my steps are correct, what else could be the problem?
2) If my steps are wrong, what do I need to do to get it right?

Any help would be awesome

Thanks
Dan Witzel





* Re: parent transid verify failures on 2.6.39
  2011-06-28 15:46       ` Daniel Witzel
@ 2011-06-28 16:44         ` Mitch Harder
  2011-06-28 17:04           ` Daniel Witzel
  2011-06-28 17:31           ` Daniel Witzel
  0 siblings, 2 replies; 16+ messages in thread
From: Mitch Harder @ 2011-06-28 16:44 UTC (permalink / raw)
  To: Daniel Witzel; +Cc: linux-btrfs

On Tue, Jun 28, 2011 at 10:46 AM, Daniel Witzel <dannyboy48888@gmail.com> wrote:
> Earlier I tried the read only patch with no result. Josef said I must be
> applying it wrong because the error I get is not possible with the patch applied.
> I tried again with no luck and posted my steps for review. Well here I am a few
> days later with the following questions:
>
> 1) If my steps are correct what else could be the problem
> 2) if my steps are wrong what do i need to do to get it right
>
> Any help would be awesome
>
> Thanks
> Dan Witzel
>

I just used this patch yesterday to help with a slightly different corruption.

I know the patch didn't apply cleanly for me, and I had to massage it.

You may want to manually audit disk-io.c to make sure the entire patch
is applied.

I know if I try to apply this patch to my 2.6.39.1 kernel, it fails.

# patch -p1 --dry-run <
/mnt/local/local/dontpanic/parent-transid-verify-failures-on-2.6.39.patch
patching file fs/btrfs/disk-io.c
Hunk #2 FAILED at 296.
Hunk #3 succeeded at 321 (offset -2 lines).
Hunk #4 succeeded at 331 (offset -2 lines).
Hunk #5 succeeded at 353 (offset -2 lines).
Hunk #6 succeeded at 1993 (offset -14 lines).
Hunk #7 succeeded at 2629 (offset 3 lines).
1 out of 7 hunks FAILED -- saving rejects to file fs/btrfs/disk-io.c.rej


* Re: parent transid verify failures on 2.6.39
  2011-06-28 16:44         ` Mitch Harder
@ 2011-06-28 17:04           ` Daniel Witzel
  2011-06-28 17:31           ` Daniel Witzel
  1 sibling, 0 replies; 16+ messages in thread
From: Daniel Witzel @ 2011-06-28 17:04 UTC (permalink / raw)
  To: linux-btrfs

Thanks for the reply. I copied the patch from the line "diff...." onward,
applied it to a fresh kernel tree, and got the following (same on 2.6.39-r1 and r2):

localhost linux# patch -p1 --dry-run --verbose < disk-io.patch
Hmm...  Looks like a unified diff to me...
The text leading up to this was:
--------------------------
|diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
|index c650a1d..53e330e 100644
|--- a/fs/btrfs/disk-io.c
|+++ b/fs/btrfs/disk-io.c
--------------------------
Patching file fs/btrfs/disk-io.c using Plan A...
Hunk #1 succeeded at 281.
Hunk #2 succeeded at 296.
Hunk #3 succeeded at 328.
Hunk #4 succeeded at 338.
Hunk #5 succeeded at 360.
Hunk #6 succeeded at 2012.
Hunk #7 succeeded at 2631.
done


A perfect patch job, if I say so :)

Any other ideas are welcome.

Dan Witzel





* Re: parent transid verify failures on 2.6.39
  2011-06-28 16:44         ` Mitch Harder
  2011-06-28 17:04           ` Daniel Witzel
@ 2011-06-28 17:31           ` Daniel Witzel
  1 sibling, 0 replies; 16+ messages in thread
From: Daniel Witzel @ 2011-06-28 17:31 UTC (permalink / raw)
  To: linux-btrfs

Thanks for the reply. I copied the patch from the "diff" line onwards and patched
against fresh kernel 2.6.39-r1 and r2 trees, with the same result:

localhost linux # patch --dry-run --verbose -p1 < disk-io.patch 
Hmm...  Looks like a unified diff to me...
The text leading up to this was:
--------------------------
|diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
|index c650a1d..53e330e 100644
|--- a/fs/btrfs/disk-io.c
|+++ b/fs/btrfs/disk-io.c
--------------------------
Patching file fs/btrfs/disk-io.c using Plan A...
Hunk #1 succeeded at 281.
Hunk #2 succeeded at 296.
Hunk #3 succeeded at 328.
Hunk #4 succeeded at 338.
Hunk #5 succeeded at 360.
Hunk #6 succeeded at 2012.
Hunk #7 succeeded at 2631.
done


Same problem. Any other ideas would be great.

Dan Witzel





* Re: parent transid verify failures on 2.6.39
  2011-05-25 19:32   ` Craig Johnson
@ 2011-07-03  7:09     ` Skylar
  0 siblings, 0 replies; 16+ messages in thread
From: Skylar @ 2011-07-03  7:09 UTC (permalink / raw)
  To: linux-btrfs

Craig Johnson <crajohns <at> gmail.com> writes:

> 
> I can wait a couple of days for the new tool - glad to know that there
> is still hope.  If the new btrfsck isn't available within a week or so
> I might hit you up for that patch.  Thanks!
> 
> - Craig
> 
> On Wed, May 25, 2011 at 2:28 PM, Josef Bacik <josef <at> redhat.com> wrote:
> > On 05/25/2011 03:06 PM, Craig Johnson wrote:
> >> After doing an upgrade to 2.6.39 from 2.6.39-rc7, I am unable to mount
> >> my 3 disk btrfs volume.  It was a clean reboot, which makes it all the
> >> more puzzling.  This is what I'm getting:
> >>
> >>
> >> [68808.339109] device fsid a941511a96bcbfb8-8c123cb07aa6aaa1 devid 2
> >> transid 339584 /dev/sdc1
> >> [68808.340354] device fsid a941511a96bcbfb8-8c123cb07aa6aaa1 devid 1
> >> transid 339584 /dev/sda1
> >> [68808.340774] device fsid a941511a96bcbfb8-8c123cb07aa6aaa1 devid 3
> >> transid 339584 /dev/sdb1
> >>
> >> [70106.913668] btrfs: disk space caching is enabled
> >> [70106.968648] parent transid verify failed on 6038227976192 wanted
> >> 337418 found 337853
> >> [70106.969031] parent transid verify failed on 6038227976192 wanted
> >> 337418 found 337853
> >> [70106.969403] parent transid verify failed on 6038227976192 wanted
> >> 337418 found 337853
> >> [70106.969671] parent transid verify failed on 6038227976192 wanted
> >> 337418 found 337853
> >> [70106.969691] parent transid verify failed on 6038227976192 wanted
> >> 337418 found 337853
> >> [70106.969704] Failed to read block groups: -5
> >> [70107.050658] btrfs: open_ctree failed
> >>
> >> I went to run a btrfsck, but found out that I needed to compile with
> >> the tmp branch or I would get an unsupported features message (lzo and
> >> space_cache).  After compiling that, when I run btrfsck, I get this:
> >>
> >> parent transid verify failed on 6038227976192 wanted 337418 found 337853
> >> parent transid verify failed on 6038227976192 wanted 337418 found 337853
> >> parent transid verify failed on 6038227976192 wanted 337418 found 337853
> >>
> >> And then it stops.  This happens with btrfs-debug-tree, or
> >> btrfs-select-super.  I've tried it on sda1, sdb1, and sdc1 and also
> >> with -s 0, -s 1, and -s 2.  Dmesg shows a segfault:
> >>
> >> [71775.589462] btrfsck[14453]: segfault at c4 ip 000000000040e477 sp
> >> 00007fffa9eb4d30 error 4 in btrfsck[400000+21000]
> >>
> >> For fun, I ran it through gdb and I got this:
> >>
> >> Program received signal SIGSEGV, Segmentation fault.
> >> find_first_block_group (root=0x61d1b0, path=0x61ef10, key=0x7fffffffe240)
> >>     at extent-tree.c:3028
> >> 3028                    if (slot >= btrfs_header_nritems(leaf)) {
> >>
> >>
> >>
> >> Is there any hope of recovery here?  Not the end of the world if the
> >> volume is lost, but it would be a bit of a pain and I'm at a loss as
> >> to why it happened.  I tried mounting with the new integration-test
> >> branch just for fun, but there's no difference on the mounting.  Any
> >> help that could be provided would be immensely appreciated.  Thanks!
> >>
> >
> > So I have a patch I can give you that will possibly help you recover
> > your data if you don't have backups, or you can wait a couple of days
> > (hopefully) for the new btrfsck tool that will be much better than the
> > hack I can give you.  Thanks,
> >
> > Josef
> >
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo <at> vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 


Hate to be a pest, but is there any word on the release of this tool? I've been
silently and eagerly waiting, and figured I'd prod, in case I was looking in all
the wrong places. I'm faced with a 15 TB (yes, terabytes; 23 drives of varying
sizes) array of random data that was lost to this problem. The rack UPS system
had failed batteries - SEVERELY neglected, sadly, not that it's of any relevance.

dmesg:
[236109.982618] device label SolaceNetArray devid 2 transid 267118 /dev/sds3
[236110.486189] btrfs: use zlib compression
[236110.624320] parent transid verify failed on 12377604001792 wanted 267118 found 47109
[236110.624480] parent transid verify failed on 12377604001792 wanted 267118 found 47109
[236110.624635] parent transid verify failed on 12377604001792 wanted 267118 found 47109
[236110.624640] parent transid verify failed on 12377604001792 wanted 267118 found 47109
[236110.670432] btrfs: open_ctree failed

The data is not mission-critical, but it does represent a very, very long
effort of collection and organization. Losing it would be a significant setback,
and I've been tasked with the process of recovery. Obviously at 15 TB, an
off-site backup was not a cost-effective option. If there's anything I can do to
aid your efforts - though I honestly doubt it - please let me know.

I'm willing to try any "patch" or "hack" you're willing to provide, if it is
unlikely (obviously can't be guaranteed) to cause complete loss of the data.
Please let me know.



* Re: parent transid verify failures on 2.6.39
  2011-05-25 19:28 ` Josef Bacik
@ 2011-05-25 19:32   ` Craig Johnson
  2011-07-03  7:09     ` Skylar
  0 siblings, 1 reply; 16+ messages in thread
From: Craig Johnson @ 2011-05-25 19:32 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs

I can wait a couple of days for the new tool - glad to know that there
is still hope.  If the new btrfsck isn't available within a week or so
I might hit you up for that patch.  Thanks!

- Craig

On Wed, May 25, 2011 at 2:28 PM, Josef Bacik <josef@redhat.com> wrote:
> On 05/25/2011 03:06 PM, Craig Johnson wrote:
>> After doing an upgrade to 2.6.39 from 2.6.39-rc7, I am unable to mount
>> my 3 disk btrfs volume.  It was a clean reboot, which makes it all the
>> more puzzling.  This is what I'm getting:
>>
>>
>> [68808.339109] device fsid a941511a96bcbfb8-8c123cb07aa6aaa1 devid 2
>> transid 339584 /dev/sdc1
>> [68808.340354] device fsid a941511a96bcbfb8-8c123cb07aa6aaa1 devid 1
>> transid 339584 /dev/sda1
>> [68808.340774] device fsid a941511a96bcbfb8-8c123cb07aa6aaa1 devid 3
>> transid 339584 /dev/sdb1
>>
>> [70106.913668] btrfs: disk space caching is enabled
>> [70106.968648] parent transid verify failed on 6038227976192 wanted
>> 337418 found 337853
>> [70106.969031] parent transid verify failed on 6038227976192 wanted
>> 337418 found 337853
>> [70106.969403] parent transid verify failed on 6038227976192 wanted
>> 337418 found 337853
>> [70106.969671] parent transid verify failed on 6038227976192 wanted
>> 337418 found 337853
>> [70106.969691] parent transid verify failed on 6038227976192 wanted
>> 337418 found 337853
>> [70106.969704] Failed to read block groups: -5
>> [70107.050658] btrfs: open_ctree failed
>>
>> I went to run a btrfsck, but found out that I needed to compile with
>> the tmp branch or I would get an unsupported features message (lzo and
>> space_cache).  After compiling that, when I run btrfsck, I get this:
>>
>> parent transid verify failed on 6038227976192 wanted 337418 found 337853
>> parent transid verify failed on 6038227976192 wanted 337418 found 337853
>> parent transid verify failed on 6038227976192 wanted 337418 found 337853
>>
>> And then it stops.  This happens with btrfs-debug-tree, or
>> btrfs-select-super.  I've tried it on sda1, sdb1, and sdc1 and also
>> with -s 0, -s 1, and -s 2.  Dmesg shows a segfault:
>>
>> [71775.589462] btrfsck[14453]: segfault at c4 ip 000000000040e477 sp
>> 00007fffa9eb4d30 error 4 in btrfsck[400000+21000]
>>
>> For fun, I ran it through gdb and I got this:
>>
>> Program received signal SIGSEGV, Segmentation fault.
>> find_first_block_group (root=0x61d1b0, path=0x61ef10, key=0x7fffffffe240)
>>     at extent-tree.c:3028
>> 3028                    if (slot >= btrfs_header_nritems(leaf)) {
>>
>>
>>
>> Is there any hope of recovery here?  Not the end of the world if the
>> volume is lost, but it would be a bit of a pain and I'm at a loss as
>> to why it happened.  I tried mounting with the new integration-test
>> branch just for fun, but there's no difference on the mounting.  Any
>> help that could be provided would be immensely appreciated.  Thanks!
>>
>
> So I have a patch I can give you that will possibly help you recover
> your data if you don't have backups, or you can wait a couple of days
> (hopefully) for the new btrfsck tool that will be much better than the
> hack I can give you.  Thanks,
>
> Josef
>
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: parent transid verify failures on 2.6.39
  2011-05-25 19:06 Craig Johnson
@ 2011-05-25 19:28 ` Josef Bacik
  2011-05-25 19:32   ` Craig Johnson
  0 siblings, 1 reply; 16+ messages in thread
From: Josef Bacik @ 2011-05-25 19:28 UTC (permalink / raw)
  To: Craig Johnson; +Cc: linux-btrfs

On 05/25/2011 03:06 PM, Craig Johnson wrote:
> After doing an upgrade to 2.6.39 from 2.6.39-rc7, I am unable to mount
> my 3 disk btrfs volume.  It was a clean reboot, which makes it all the
> more puzzling.  This is what I'm getting:
> 
> 
> [68808.339109] device fsid a941511a96bcbfb8-8c123cb07aa6aaa1 devid 2
> transid 339584 /dev/sdc1
> [68808.340354] device fsid a941511a96bcbfb8-8c123cb07aa6aaa1 devid 1
> transid 339584 /dev/sda1
> [68808.340774] device fsid a941511a96bcbfb8-8c123cb07aa6aaa1 devid 3
> transid 339584 /dev/sdb1
> 
> [70106.913668] btrfs: disk space caching is enabled
> [70106.968648] parent transid verify failed on 6038227976192 wanted
> 337418 found 337853
> [70106.969031] parent transid verify failed on 6038227976192 wanted
> 337418 found 337853
> [70106.969403] parent transid verify failed on 6038227976192 wanted
> 337418 found 337853
> [70106.969671] parent transid verify failed on 6038227976192 wanted
> 337418 found 337853
> [70106.969691] parent transid verify failed on 6038227976192 wanted
> 337418 found 337853
> [70106.969704] Failed to read block groups: -5
> [70107.050658] btrfs: open_ctree failed
> 
> I went to run a btrfsck, but found out that I needed to compile with
> the tmp branch or I would get an unsupported features message (lzo and
> space_cache).  After compiling that, when I run btrfsck, I get this:
> 
> parent transid verify failed on 6038227976192 wanted 337418 found 337853
> parent transid verify failed on 6038227976192 wanted 337418 found 337853
> parent transid verify failed on 6038227976192 wanted 337418 found 337853
> 
> And then it stops.  This happens with btrfs-debug-tree, or
> btrfs-select-super.  I've tried it on sda1, sdb1, and sdc1 and also
> with -s 0, -s 1, and -s 2.  Dmesg shows a segfault:
> 
> [71775.589462] btrfsck[14453]: segfault at c4 ip 000000000040e477 sp
> 00007fffa9eb4d30 error 4 in btrfsck[400000+21000]
> 
> For fun, I ran it through gdb and I got this:
> 
> Program received signal SIGSEGV, Segmentation fault.
> find_first_block_group (root=0x61d1b0, path=0x61ef10, key=0x7fffffffe240)
>     at extent-tree.c:3028
> 3028                    if (slot >= btrfs_header_nritems(leaf)) {
> 
> 
> 
> Is there any hope of recovery here?  Not the end of the world if the
> volume is lost, but it would be a bit of a pain and I'm at a loss as
> to why it happened.  I tried mounting with the new integration-test
> branch just for fun, but the mount fails in the same way.  Any
> help that could be provided would be immensely appreciated.  Thanks!
> 

So I have a patch I can give you that will possibly help you recover
your data if you don't have backups, or you can wait a couple of days
(hopefully) for the new btrfsck tool that will be much better than the
hack I can give you.  Thanks,

Josef

* parent transid verify failures on 2.6.39
@ 2011-05-25 19:06 Craig Johnson
  2011-05-25 19:28 ` Josef Bacik
  0 siblings, 1 reply; 16+ messages in thread
From: Craig Johnson @ 2011-05-25 19:06 UTC (permalink / raw)
  To: linux-btrfs

After doing an upgrade to 2.6.39 from 2.6.39-rc7, I am unable to mount
my 3 disk btrfs volume.  It was a clean reboot, which makes it all the
more puzzling.  This is what I'm getting:


[68808.339109] device fsid a941511a96bcbfb8-8c123cb07aa6aaa1 devid 2 transid 339584 /dev/sdc1
[68808.340354] device fsid a941511a96bcbfb8-8c123cb07aa6aaa1 devid 1 transid 339584 /dev/sda1
[68808.340774] device fsid a941511a96bcbfb8-8c123cb07aa6aaa1 devid 3 transid 339584 /dev/sdb1

[70106.913668] btrfs: disk space caching is enabled
[70106.968648] parent transid verify failed on 6038227976192 wanted 337418 found 337853
[70106.969031] parent transid verify failed on 6038227976192 wanted 337418 found 337853
[70106.969403] parent transid verify failed on 6038227976192 wanted 337418 found 337853
[70106.969671] parent transid verify failed on 6038227976192 wanted 337418 found 337853
[70106.969691] parent transid verify failed on 6038227976192 wanted 337418 found 337853
[70106.969704] Failed to read block groups: -5
[70107.050658] btrfs: open_ctree failed

I went to run a btrfsck, but found out that I needed to compile with
the tmp branch or I would get an unsupported features message (lzo and
space_cache).  After compiling that, when I run btrfsck, I get this:

parent transid verify failed on 6038227976192 wanted 337418 found 337853
parent transid verify failed on 6038227976192 wanted 337418 found 337853
parent transid verify failed on 6038227976192 wanted 337418 found 337853

And then it stops.  This happens with btrfs-debug-tree, or
btrfs-select-super.  I've tried it on sda1, sdb1, and sdc1 and also
with -s 0, -s 1, and -s 2.  Dmesg shows a segfault:

[71775.589462] btrfsck[14453]: segfault at c4 ip 000000000040e477 sp 00007fffa9eb4d30 error 4 in btrfsck[400000+21000]

For fun, I ran it through gdb and I got this:

Program received signal SIGSEGV, Segmentation fault.
find_first_block_group (root=0x61d1b0, path=0x61ef10, key=0x7fffffffe240)
    at extent-tree.c:3028
3028                    if (slot >= btrfs_header_nritems(leaf)) {



Is there any hope of recovery here?  Not the end of the world if the
volume is lost, but it would be a bit of a pain and I'm at a loss as
to why it happened.  I tried mounting with the new integration-test
branch just for fun, but the mount fails in the same way.  Any
help that could be provided would be immensely appreciated.  Thanks!

- Craig

Thread overview: 16+ messages
2011-06-22 22:42 parent transid verify failures on 2.6.39 Andrej Podzimek
2011-06-23  1:45 ` Chris Mason
2011-06-23 18:26   ` Daniel Witzel
2011-06-23 19:54   ` Josef Bacik
2011-06-23 21:11     ` Daniel Witzel
2011-06-23 21:35       ` Josef Bacik
2011-06-24 15:46         ` Daniel Witzel
2011-06-24  1:30     ` Andrej Podzimek
2011-06-28 15:46       ` Daniel Witzel
2011-06-28 16:44         ` Mitch Harder
2011-06-28 17:04           ` Daniel Witzel
2011-06-28 17:31           ` Daniel Witzel
  -- strict thread matches above, loose matches on Subject: below --
2011-05-25 19:06 Craig Johnson
2011-05-25 19:28 ` Josef Bacik
2011-05-25 19:32   ` Craig Johnson
2011-07-03  7:09     ` Skylar