linux-btrfs.vger.kernel.org archive mirror
* panic during rebalance, and now upon mount
@ 2010-01-30  6:05 Troy Ablan
  2010-01-30 12:09 ` Yan, Zheng 
  0 siblings, 1 reply; 8+ messages in thread
From: Troy Ablan @ 2010-01-30  6:05 UTC (permalink / raw)
  To: linux-btrfs

Hi folks,

During a very lengthy btrfs-vol -b (3.5 days in), btrfs BUGged out. 
Upon rebooting and trying to mount that fs, the exact same bug (with the
exact same call trace) happens.  I moved up to 2.6.33-rc6 from
gentoo-maintained 2.6.32-r2 to see what would happen, and it appears to
panic at the equivalent line of the same source file as before.

Let me know if I can do anything to assist.  I won't do anything to the
disks for the next few days in case some forensics will be useful.

[  154.899692] device label bk0 devid 14 transid 111134 /dev/mapper/btrn
[  154.958264] btrfs: use compression
[  202.394048] ------------[ cut here ]------------
[  202.394136] kernel BUG at fs/btrfs/extent-tree.c:5377!
[  202.394220] invalid opcode: 0000 [#1] SMP
[  202.394372] last sysfs file: /sys/devices/virtual/block/md1/md/metadata_version
[  202.394500] CPU 5
[  202.394655] Pid: 5838, comm: btrfs-relocate- Tainted: G        W  2.6.33-rc6 #1 P55M-GD45 (MS-7588) /MS-7588
[  202.394787] RIP: 0010:[<ffffffff8129e5ad>]  [<ffffffff8129e5ad>] walk_up_proc+0x37d/0x3c0
[  202.394955] RSP: 0018:ffff880139729ca0  EFLAGS: 00010282
[  202.395039] RAX: 0000000000000218 RBX: ffff88013c460300 RCX: ffff880139728000
[  202.395127] RDX: ffff880000000000 RSI: fffffffffffffff8 RDI: ffff880138ac08e0
[  202.395214] RBP: ffff880139729d00 R08: 0000000000000008 R09: 0000000000000001
[  202.395301] R10: 0000000000000001 R11: 0000000000000001 R12: ffff880138ab8880
[  202.395389] R13: 0000000000000000 R14: ffff88013f72f880 R15: ffff88013b646800
[  202.395476] FS:  0000000000000000(0000) GS:ffff880028340000(0000) knlGS:0000000000000000
[  202.395606] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  202.395691] CR2: 0000000000425f40 CR3: 00000000018d3000 CR4: 00000000000006e0
[  202.395778] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  202.395865] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  202.395953] Process btrfs-relocate- (pid: 5838, threadinfo ffff880139728000, task ffff88013f0e28f0)
[  202.396083] Stack:
[  202.396162]  ffff880139729cf0 0000000000000002 ffff88013f72f880 0000000000000206
[  202.397142] <0> ffff880139729d30 ffff880138ac08e0 0000000000000000 0000000000000000
[  202.397444] <0> ffff88013c460300 ffff88013f72f880 0000000000000000 ffff880139728000
[  202.397856] Call Trace:
[  202.397937]  [<ffffffff8129e72f>] walk_up_tree+0x13f/0x1c0
[  202.398023]  [<ffffffff8129f99c>] btrfs_drop_snapshot+0x21c/0x600
[  202.398110]  [<ffffffff812a9dd0>] ? __btrfs_end_transaction+0x100/0x170
[  202.398198]  [<ffffffff812e7d7d>] merge_func+0x7d/0xc0
[  202.398284]  [<ffffffff812d25aa>] worker_loop+0x17a/0x540
[  202.398379]  [<ffffffff812d2430>] ? worker_loop+0x0/0x540
[  202.398487]  [<ffffffff812d2430>] ? worker_loop+0x0/0x540
[  202.398611]  [<ffffffff81095936>] kthread+0x96/0xa0
[  202.398697]  [<ffffffff81034bd4>] kernel_thread_helper+0x4/0x10
[  202.398784]  [<ffffffff816ac869>] ? restore_args+0x0/0x30
[  202.398869]  [<ffffffff810958a0>] ? kthread+0x0/0xa0
[  202.398953]  [<ffffffff81034bd0>] ? kernel_thread_helper+0x0/0x10
[  202.399039] Code: 6d db b6 6d 48 c1 f8 03 48 0f af c2 48 ba 00 00 00 00 00 88 ff ff 48 c1 e0 0c 48 8b 44 10 58 ff 49 1c 48 39 c6 0f 84 ab fd ff ff <0f> 0b eb fe 0f 1f 80 00 00 00 00 47 8b 4c ae 60 45 85 c9 0f 85
[  202.401551] RIP  [<ffffffff8129e5ad>] walk_up_proc+0x37d/0x3c0
[  202.401671]  RSP <ffff880139729ca0>
[  202.401796] ---[ end trace 4c085bcc2bd215f6 ]---

-- 
Troy



* Re: panic during rebalance, and now upon mount
  2010-01-30  6:05 panic during rebalance, and now upon mount Troy Ablan
@ 2010-01-30 12:09 ` Yan, Zheng 
  2010-01-30 17:31   ` Troy Ablan
  0 siblings, 1 reply; 8+ messages in thread
From: Yan, Zheng  @ 2010-01-30 12:09 UTC (permalink / raw)
  To: Troy Ablan; +Cc: linux-btrfs

Thank you for reporting this. Would you please run btrfsck and mount
that fs again with the debug patch attached below?

Regards
Yan, Zheng

---
diff -urp 1/fs/btrfs/extent-tree.c 2/fs/btrfs/extent-tree.c
--- 1/fs/btrfs/extent-tree.c	2010-01-22 12:16:34.203525744 +0800
+++ 2/fs/btrfs/extent-tree.c	2010-01-30 20:03:23.609292953 +0800
@@ -5373,8 +5373,18 @@ static noinline int walk_up_proc(struct
 		if (wc->flags[level] & BTRFS_BLOCK_FLAG_FULL_BACKREF)
 			parent = eb->start;
 		else
-			BUG_ON(root->root_key.objectid !=
-			       btrfs_header_owner(eb));
+			if (root->root_key.objectid !=
+			    btrfs_header_owner(eb)) {
+				printk("root %llu %llu\n",
+				       root->root_key.objectid,
+				       root->root_key.offset);
+				printk("node %llu refs %llu flags %llu owner %llu reloc %d\n",
+				       eb->start, wc->refs[level], wc->flags[level],
+				       btrfs_header_owner(eb),
+				       btrfs_header_flag(eb, BTRFS_HEADER_FLAG_RELOC));
+
+				BUG();
+			}
 	} else {
 		if (wc->flags[level + 1] & BTRFS_BLOCK_FLAG_FULL_BACKREF)
 			parent = path->nodes[level + 1]->start;
@@ -5496,6 +5506,8 @@ int btrfs_drop_snapshot(struct btrfs_roo
 		       sizeof(wc->update_progress));
 	} else {
 		btrfs_disk_key_to_cpu(&key, &root_item->drop_progress);
+		printk("drop progress %llu %d %llu\n", key.objectid,
+			key.type, key.offset);
 		memcpy(&wc->update_progress, &key,
 		       sizeof(wc->update_progress));
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: panic during rebalance, and now upon mount
  2010-01-30 12:09 ` Yan, Zheng 
@ 2010-01-30 17:31   ` Troy Ablan
  2010-01-31  2:00     ` Yan, Zheng 
  0 siblings, 1 reply; 8+ messages in thread
From: Troy Ablan @ 2010-01-30 17:31 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Yan, Zheng 

Thanks for the quick reply. 

btrfsck bails out

-[~:#]- btrfsck /dev/mapper/btra
btrfsck: btrfsck.c:584: splice_shared_node: Assertion `!(src == &src_node->root_cache)' failed.
Aborted


A mount produces this.   I did not use -o compress this time, but I
suspect it doesn't matter.

[ 3192.249204] device label bk0 devid 1 transid 111135 /dev/mapper/btra
[ 3240.180895] root 18446744073709551608 536
[ 3240.180898] node 9197760471040 refs 0 flags 0 owner 536 reloc 1
[ 3240.180904] ------------[ cut here ]------------
[ 3240.180957] kernel BUG at fs/btrfs/extent-tree.c:5386!
[ 3240.181009] invalid opcode: 0000 [#1] SMP
[ 3240.181064] last sysfs file: /sys/devices/virtual/block/md1/md/metadata_version
[ 3240.181159] CPU 3
[ 3240.181210] Pid: 6143, comm: btrfs-relocate- Tainted: G        W  2.6.33-rc6 #2 P55M-GD45 (MS-7588) /MS-7588
[ 3240.181309] RIP: 0010:[<ffffffff8129e65f>]  [<ffffffff8129e65f>] walk_up_proc+0x42f/0x490
[ 3240.181413] RSP: 0018:ffff880113183c90  EFLAGS: 00010286
[ 3240.181465] RAX: 0000000000000046 RBX: ffff88013e70eb40 RCX: 000000000003ffff
[ 3240.181520] RDX: ffff8800282c0000 RSI: 0000000000000086 RDI: 0000000000000000
[ 3240.181575] RBP: ffff880113183cf0 R08: 0000000000000000 R09: ffffffff816ac57f
[ 3240.181630] R10: 0000000000000000 R11: 0000000000000004 R12: 0000000000000000
[ 3240.181685] R13: ffff880118ce1d90 R14: ffff880113182000 R15: ffff88013f00a000
[ 3240.181740] FS:  0000000000000000(0000) GS:ffff8800282c0000(0000) knlGS:0000000000000000
[ 3240.181837] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 3240.181890] CR2: 00007fc6e1ceb3dc CR3: 00000000018d3000 CR4: 00000000000006e0
[ 3240.182733] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 3240.182788] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 3240.182844] Process btrfs-relocate- (pid: 6143, threadinfo ffff880113182000, task ffff880138c78380)
[ 3240.182942] Stack:
[ 3240.182988]  ffff880113183d20 0000000000000002 0000000000000008 0000000000000206
[ 3240.183050] <0> ffff880113183d20 ffff880113e948e0 0000000000000000 0000000000000000
[ 3240.183158] <0> ffff88013e70eb40 ffff88011b8d6f40 0000000000000000 ffff880113182000
[ 3240.183309] Call Trace:
[ 3240.183358]  [<ffffffff8129e7ff>] walk_up_tree+0x13f/0x1c0
[ 3240.183412]  [<ffffffff8129fa68>] btrfs_drop_snapshot+0x218/0x5e0
[ 3240.183466]  [<ffffffff812a9e80>] ? __btrfs_end_transaction+0x100/0x170
[ 3240.183680]  [<ffffffff812e7e2d>] merge_func+0x7d/0xc0
[ 3240.183735]  [<ffffffff812d265a>] worker_loop+0x17a/0x540
[ 3240.183789]  [<ffffffff812d24e0>] ? worker_loop+0x0/0x540
[ 3240.183842]  [<ffffffff812d24e0>] ? worker_loop+0x0/0x540
[ 3240.183895]  [<ffffffff81095936>] kthread+0x96/0xa0
[ 3240.183949]  [<ffffffff81034bd4>] kernel_thread_helper+0x4/0x10
[ 3240.184004]  [<ffffffff816ac929>] ? restore_args+0x0/0x30
[ 3240.184057]  [<ffffffff810958a0>] ? kthread+0x0/0xa0
[ 3240.184109]  [<ffffffff81034bd0>] ? kernel_thread_helper+0x0/0x10
[ 3240.184162] Code: 4e 1c 48 c7 c7 98 6a 81 81 83 e2 02 48 8b 45 b0 41 0f 95 c1 48 8b 0c c3 4a 8b 14 e3 41 83 e1 01 49 8b 75 00 31 c0 e8 2d af 40 00 <0f> 0b eb fe 0f 1f 44 00 00 4c 89 ef e8 40 70 03 00 4c 89 ef e8
[ 3240.184466] RIP  [<ffffffff8129e65f>] walk_up_proc+0x42f/0x490
[ 3240.184522]  RSP <ffff880113183c90>
[ 3240.184878] ---[ end trace 3c8b3f22cb58d773 ]---
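
A note on reading the two debug lines printed before the BUG: the first number on the "root" line, 18446744073709551608, is what %llu prints for a 64-bit value of -8. The kernel's btrfs headers define BTRFS_TREE_RELOC_OBJECTID as -8ULL, so the root being dropped appears to be a relocation tree, and its key offset (536) matches the owner recorded in the node header. The following is only a userspace sketch of that arithmetic. The helper names and the locally redefined constant are assumptions made for illustration, not kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Local copy for illustration; the kernel defines this value as -8ULL. */
#define TREE_RELOC_OBJECTID ((uint64_t)-8)

/*
 * Hypothetical helper: read a u64 objectid as a two's-complement signed
 * value without relying on implementation-defined conversions.
 */
static int64_t objectid_as_signed(uint64_t objectid)
{
	if (objectid & (UINT64_C(1) << 63))
		return -(int64_t)(~objectid) - 1;
	return (int64_t)objectid;
}

/* Hypothetical helper: 1 if the objectid is the tree-reloc id, else 0. */
static int is_tree_reloc(uint64_t objectid)
{
	return objectid == TREE_RELOC_OBJECTID;
}

/* Usage: the value from the "root" debug line decodes to -8. */
static void objectid_example(void)
{
	assert(objectid_as_signed(UINT64_C(18446744073709551608)) == -8);
	assert(is_tree_reloc(UINT64_C(18446744073709551608)));
}
```

Read this way, the check that fires compares the root's objectid (-8) with the node's owner (536), which can never be equal for a relocation tree root unless the full-backref flag is set on the block.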


-- 
Troy


* Re: panic during rebalance, and now upon mount
  2010-01-30 17:31   ` Troy Ablan
@ 2010-01-31  2:00     ` Yan, Zheng 
  2010-01-31 10:09       ` Troy Ablan
  0 siblings, 1 reply; 8+ messages in thread
From: Yan, Zheng  @ 2010-01-31  2:00 UTC (permalink / raw)
  To: Troy Ablan; +Cc: linux-btrfs

Please run btrfsck and mount the fs with the new patches attached
below. Thank you.

Yan, Zheng,

For btrfs kernel module:
---
diff -urp 1/fs/btrfs/extent-tree.c 2/fs/btrfs/extent-tree.c
--- 1/fs/btrfs/extent-tree.c	2010-01-22 12:16:34.203525744 +0800
+++ 2/fs/btrfs/extent-tree.c	2010-01-31 09:29:01.131484542 +0800
@@ -5372,9 +5372,19 @@ static noinline int walk_up_proc(struct
 	if (eb == root->node) {
 		if (wc->flags[level] & BTRFS_BLOCK_FLAG_FULL_BACKREF)
 			parent = eb->start;
-		else
-			BUG_ON(root->root_key.objectid !=
-			       btrfs_header_owner(eb));
+		else {
+			if (root->root_key.objectid != btrfs_header_owner(eb)) {
+				printk("root %llu %llu\n",
+					root->root_key.objectid, root->root_key.offset);
+				printk("node %llu refs %llu flags %llu owner %llu "
+					"reloc %d level %d nritems %d\n",
+					eb->start, wc->refs[level], wc->flags[level],
+					btrfs_header_owner(eb),
+					btrfs_header_flag(eb, BTRFS_HEADER_FLAG_RELOC),
+					btrfs_header_level(eb), btrfs_header_nritems(eb));
+				BUG();
+			}
+		}
 	} else {
 		if (wc->flags[level + 1] & BTRFS_BLOCK_FLAG_FULL_BACKREF)
 			parent = path->nodes[level + 1]->start;
@@ -5496,6 +5506,8 @@ int btrfs_drop_snapshot(struct btrfs_roo
 		       sizeof(wc->update_progress));
 	} else {
 		btrfs_disk_key_to_cpu(&key, &root_item->drop_progress);
+		printk("drop progress %llu %d %llu\n", key.objectid,
+			key.type, key.offset);
 		memcpy(&wc->update_progress, &key,
 		       sizeof(wc->update_progress));

---

For btrfsck:
---
diff -urp btrfs-progs-unstable/btrfsck.c btrfs-progs-2/btrfsck.c
--- btrfs-progs-unstable/btrfsck.c	2009-09-28 15:54:55.980479398 +0800
+++ btrfs-progs-2/btrfsck.c	2010-01-31 09:46:24.645485459 +0800
@@ -581,7 +581,6 @@ again:
 		}
 		ret = insert_existing_cache_extent(dst, &ins->cache);
 		if (ret == -EEXIST) {
-			WARN_ON(src == &src_node->root_cache);
 			conflict = get_inode_rec(dst, rec->ino, 1);
 			merge_inode_recs(rec, conflict, dst);
 			if (rec->checked) {
---


* Re: panic during rebalance, and now upon mount
  2010-01-31  2:00     ` Yan, Zheng 
@ 2010-01-31 10:09       ` Troy Ablan
  2010-01-31 12:34         ` Yan, Zheng 
  0 siblings, 1 reply; 8+ messages in thread
From: Troy Ablan @ 2010-01-31 10:09 UTC (permalink / raw)
  To: Yan, Zheng ; +Cc: linux-btrfs

Yan, Zheng wrote:
> Please run btrfsck and mount the fs with the new patches attached
> below. Thank you
>
> Yan, Zheng,
>
>
>   
During both of my btrfsck runs, the machine froze before the check
completed: the VTs were still accessible but would not accept
keystrokes.  At first I suspected it had run out of memory and swap, so I
gave it more swap through temporary swapfiles the second time around.
For most of the fsck run the process held onto 2.7 GB of RAM.  Toward the
end it climbed to 3.8 GB, and then the entire machine froze in the same
odd way.

This was the `vmstat 1` that was running as it froze.

procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 0  2   8296  30128   4240   3552    0    0  2800     0 1833 3366  0  1 75 24
 0  2   8296  29880   5204   3204    0    0  4028     0 2447 4554  0  1 74 24
 0  2   8296  30508   4400   3288    0    0  1680    20 1855 2422  0  1 75 24
 0  2   8296  30136   3720   3108    0    0  2464     0 2009 2672  0  1 73 25
 0  2   8296  29392   4396   3012    0    0  5996     0 2346 4089  0  2 71 27
 0  2   8296  29144   4076   2592    0    0  3952     0 4430 2910  0  1 77 21
 0  2   8296  29764   2416   2868    0    0  2168    12 1750 2488  0  1 70 28
 0  3  28900  25796   1804   1860    0 20604  3428 20660 1906 2766  0  2 75 23
 1  3 109760  25796   1396   1992    0 80852   288 80852 7665  442  0  2 60 37
 0  4 113096  25672   1396   1992    0 3332     0  3332  105   43  0  0 74 26

And the `top`

top - 02:50:28 up  3:48,  8 users,  load average: 2.52, 2.01, 1.72
Tasks:   1 total,   0 running,   1 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.1%us,  0.6%sy,  0.0%ni, 59.5%id, 39.7%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   4050160k total,  4024488k used,    25672k free,     1268k buffers
Swap: 20978072k total,   119136k used, 20858936k free,     2072k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 5882 root      20   0 3840m 3.6g  220 D    2 94.2   8:26.44 btrfsck

This is as much from fsck as was available during both runs 

-[~:#]- ./btrfsck /dev/mapper/btra
root 256 inode 356018 errors 2000
root 256 inode 356022 errors 2000
root 256 inode 356023 errors 2000
root 256 inode 356025 errors 2000
root 256 inode 356027 errors 2000
root 256 inode 356033 errors 2000
root 256 inode 356034 errors 2000
root 256 inode 356035 errors 2000
root 256 inode 356036 errors 2000
root 256 inode 356037 errors 2000
root 256 inode 356038 errors 2000
root 256 inode 356039 errors 2000
root 256 inode 356040 errors 2000
root 256 inode 356044 errors 2000
root 256 inode 356045 errors 2000
root 256 inode 356046 errors 2000
root 256 inode 356047 errors 2000
root 256 inode 356049 errors 2000
root 256 inode 356050 errors 2000
root 256 inode 356051 errors 2000
root 256 inode 356052 errors 2000
root 256 inode 356053 errors 2000
root 256 inode 356054 errors 2000
root 256 inode 356055 errors 2000
root 256 inode 356056 errors 2000
root 256 inode 356058 errors 2000
root 256 inode 356063 errors 2000
root 256 inode 640297 errors 2000
root 256 inode 640301 errors 2000
root 256 inode 640302 errors 2000
root 256 inode 640304 errors 2000
root 256 inode 640306 errors 2000
root 256 inode 640312 errors 2000
root 256 inode 640313 errors 2000
root 256 inode 640314 errors 2000
root 256 inode 640315 errors 2000
root 256 inode 640316 errors 2000
root 256 inode 640317 errors 2000
root 256 inode 640318 errors 2000
root 256 inode 640319 errors 2000
root 256 inode 640323 errors 2000
root 256 inode 640324 errors 2000
root 256 inode 640325 errors 2000
root 256 inode 640326 errors 2000
root 256 inode 640328 errors 2000
root 256 inode 640329 errors 2000
root 256 inode 640330 errors 2000
root 256 inode 640331 errors 2000
root 256 inode 640332 errors 2000
root 256 inode 640333 errors 2000
root 256 inode 640334 errors 2000
root 256 inode 640335 errors 2000
root 256 inode 640337 errors 2000
root 256 inode 640342 errors 2000
root 256 inode 831519 errors 2000
root 257 inode 272056 errors 2000
root 257 inode 272060 errors 2000
root 257 inode 272061 errors 2000
root 257 inode 272063 errors 2000
root 257 inode 272065 errors 2000
root 257 inode 272071 errors 2000
root 257 inode 272072 errors 2000
root 257 inode 272073 errors 2000
root 257 inode 272074 errors 2000
root 257 inode 272075 errors 2000
root 257 inode 272076 errors 2000
root 257 inode 272077 errors 2000
root 257 inode 272078 errors 2000
root 257 inode 272082 errors 2000
root 257 inode 272083 errors 2000
root 257 inode 272084 errors 2000
root 257 inode 272085 errors 2000
root 257 inode 272087 errors 2000
root 257 inode 272088 errors 2000
root 257 inode 272089 errors 2000
root 257 inode 272090 errors 2000
root 257 inode 272091 errors 2000
root 257 inode 272092 errors 2000
root 257 inode 272093 errors 2000
root 257 inode 272094 errors 2000
root 257 inode 272096 errors 2000
root 257 inode 272101 errors 2000
root 257 inode 799036 errors 400
root 259 inode 47249 errors 2000
root 259 inode 47253 errors 2000
root 259 inode 47254 errors 2000
root 259 inode 47256 errors 2000
root 259 inode 47258 errors 2000
root 259 inode 47264 errors 2000
root 259 inode 47265 errors 2000
root 259 inode 47266 errors 2000
root 259 inode 47267 errors 2000
root 259 inode 47268 errors 2000
root 259 inode 47269 errors 2000
root 259 inode 47270 errors 2000
root 259 inode 47271 errors 2000
root 259 inode 47275 errors 2000
root 259 inode 47276 errors 2000
root 259 inode 47277 errors 2000
root 259 inode 47278 errors 2000
root 259 inode 47280 errors 2000
root 259 inode 47281 errors 2000
root 259 inode 47282 errors 2000
root 259 inode 47283 errors 2000
root 259 inode 47284 errors 2000
root 259 inode 47285 errors 2000
root 259 inode 47286 errors 2000
root 259 inode 47287 errors 2000
root 259 inode 47289 errors 2000
root 259 inode 47294 errors 2000
root 264 inode 241037 errors 2000
root 264 inode 242664 errors 2000
root 264 inode 242665 errors 2000
root 264 inode 243283 errors 2000
root 264 inode 250001 errors 2000
root 264 inode 250440 errors 2000
root 264 inode 250891 errors 2000
root 268 inode 78830 errors 2000
root 268 inode 78834 errors 2000
root 268 inode 78835 errors 2000
root 268 inode 78837 errors 2000
root 268 inode 78839 errors 2000
root 268 inode 78845 errors 2000
root 268 inode 78846 errors 2000
root 268 inode 78847 errors 2000
root 268 inode 78848 errors 2000
root 268 inode 78849 errors 2000
root 268 inode 78850 errors 2000
root 268 inode 78851 errors 2000
root 268 inode 78852 errors 2000
root 268 inode 78856 errors 2000
root 268 inode 78857 errors 2000
root 268 inode 78858 errors 2000
root 268 inode 78859 errors 2000
root 268 inode 78861 errors 2000
root 268 inode 78862 errors 2000
root 268 inode 78863 errors 2000
root 268 inode 78864 errors 2000
root 268 inode 78865 errors 2000
root 268 inode 78866 errors 2000
root 268 inode 78867 errors 2000
root 268 inode 78868 errors 2000
root 268 inode 78870 errors 2000
root 268 inode 78875 errors 2000
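
A note on the "errors" column above: the values look like a bit mask printed in hexadecimal. That is an assumption based on the output itself, since 2000 and 400 are each a single power of two when read as hex. Under that assumption, the sketch below lists which bit positions are set in a given value. The helper name error_mask_bits is hypothetical, and which btrfsck check each bit corresponds to is defined in btrfsck.c and is not reproduced here.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a hex error mask such as "2000" and store the
 * indexes of the set bits in bits[]. Returns how many indexes were stored,
 * never more than max_bits.
 */
static int error_mask_bits(const char *hex, int *bits, int max_bits)
{
	unsigned long mask = strtoul(hex, NULL, 16);
	int count = 0;

	assert(bits != NULL);
	for (int i = 0; mask != 0 && count < max_bits; i++, mask >>= 1) {
		if (mask & 1UL)
			bits[count++] = i;
	}
	return count;
}
```

With this reading, "errors 2000" has only bit 13 set and "errors 400" (root 257 inode 799036) has only bit 10 set, so nearly every inode listed is flagged for the same single condition.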


And for the mount, done after rebooting

[  198.062442] device label bk0 devid 1 transid 111136 /dev/mapper/btra
[  198.157820] btrfs: use compression
[  246.549684] root 18446744073709551608 536
[  246.549688] node 9197760471040 refs 0 flags 0 owner 536 reloc 1 level 0 nritems 0
[  246.549695] ------------[ cut here ]------------
[  246.549780] kernel BUG at fs/btrfs/extent-tree.c:5385!
[  246.549864] invalid opcode: 0000 [#1] SMP
[  246.550016] last sysfs file: /sys/devices/virtual/block/md1/md/metadata_version
[  246.550145] CPU 5
[  246.550260] Pid: 5993, comm: btrfs-relocate- Tainted: G        W  2.6.33-rc6 #1 P55M-GD45 (MS-7588) /MS-7588
[  246.550392] RIP: 0010:[<ffffffff8129e686>]  [<ffffffff8129e686>] walk_up_proc+0x456/0x4c0
[  246.550560] RSP: 0018:ffff880138a93c70  EFLAGS: 00010282
[  246.550644] RAX: 0000000000000058 RBX: ffff88013c12f9c0 RCX: 000000000003ffff
[  246.550732] RDX: ffff880028340000 RSI: 0000000000000082 RDI: 0000000000000000
[  246.550819] RBP: ffff880138a93cf0 R08: 0000000000000000 R09: ffffffff816ac59f
[  246.551695] R10: 0000000000000000 R11: 0000000000000004 R12: 0000000000000000
[  246.551783] R13: ffff880138a92000 R14: ffff8801396769a0 R15: ffff88013e69c000
[  246.551870] FS:  0000000000000000(0000) GS:ffff880028340000(0000) knlGS:0000000000000000
[  246.552000] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  246.552085] CR2: 00007f3dde7b9ea0 CR3: 00000000018d3000 CR4: 00000000000006e0
[  246.552172] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  246.552259] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  246.552347] Process btrfs-relocate- (pid: 5993, threadinfo ffff880138a92000, task ffff88013c5d40c0)
[  246.552478] Stack:
[  246.552556]  ffff880100000000 ffff880100000000 ffff880138a93d30 ffff880000000000
[  246.552748] <0> 6db6db6db6db6db7 0000160000000000 0000000000000008 0000000000000206
[  246.553050] <0> ffff880138a93d20 ffff88013967c8e0 0000000000000000 0000000000000000
[  246.553531] Call Trace:
[  246.553613]  [<ffffffff8129e82f>] walk_up_tree+0x13f/0x1c0
[  246.553699]  [<ffffffff8129fa98>] btrfs_drop_snapshot+0x218/0x5e0
[  246.553786]  [<ffffffff812a9eb0>] ? __btrfs_end_transaction+0x100/0x170
[  246.553875]  [<ffffffff812e7e5d>] merge_func+0x7d/0xc0
[  246.553961]  [<ffffffff812d268a>] worker_loop+0x17a/0x540
[  246.554046]  [<ffffffff812d2510>] ? worker_loop+0x0/0x540
[  246.554131]  [<ffffffff812d2510>] ? worker_loop+0x0/0x540
[  246.554217]  [<ffffffff81095936>] kthread+0x96/0xa0
[  246.554303]  [<ffffffff81034bd4>] kernel_thread_helper+0x4/0x10
[  246.554408]  [<ffffffff816ac969>] ? restore_args+0x0/0x30
[  246.554493]  [<ffffffff810958a0>] ? kthread+0x0/0xa0
[  246.554578]  [<ffffffff81034bd0>] ? kernel_thread_helper+0x0/0x10
[  246.554663] Code: 0f 95 c1 89 7c 24 08 48 8b 0c c3 4a 8b 14 e3 40 0f
b6 f6 41 83 e1 01 89 34 24 48 c7 c7 98 6a 81 81 49 8b 36 31 c0 e8 26 af
40 00 <0f> 0b eb fe 66 0f 1f 44 00 00 4c 89 f7 e8 48 70 03 00 4c 89 f7
[  246.557353] RIP  [<ffffffff8129e686>] walk_up_proc+0x456/0x4c0
[  246.557474]  RSP <ffff880138a93c70>
[  246.557601] ---[ end trace 18f62a4fb26ae09e ]---
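
Editorial aside: the `root 18446744073709551608` value printed just before the BUG reads more naturally as a signed number. It is 2^64 - 8, that is, -8 stored in an unsigned 64-bit field. To my knowledge this corresponds to `BTRFS_TREE_RELOC_OBJECTID` in the kernel's ctree.h, which would fit the `reloc 1` flag and the btrfs-relocate thread name, but treat that mapping as an assumption. A minimal sketch of the conversion, with a hypothetical helper name:

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: reinterpret an unsigned 64-bit objectid as signed,
 * so that special "negative" tree ids become readable. The conversion is
 * implementation-defined in C, but yields the two's-complement value on
 * mainstream compilers.
 */
int64_t objectid_as_signed(uint64_t objectid)
{
	return (int64_t)objectid;
}

/* Prints both views of the same objectid on one line. */
void print_objectid(uint64_t objectid)
{
	printf("%" PRIu64 " = %" PRId64 "\n",
	       objectid, objectid_as_signed(objectid));
}
```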

Thanks again for taking a look at this

-- 
Troy


* Re: panic during rebalance, and now upon mount
  2010-01-31 10:09       ` Troy Ablan
@ 2010-01-31 12:34         ` Yan, Zheng 
  2010-01-31 19:33           ` Troy Ablan
  0 siblings, 1 reply; 8+ messages in thread
From: Yan, Zheng  @ 2010-01-31 12:34 UTC (permalink / raw)
  To: Troy Ablan; +Cc: linux-btrfs

On Sun, Jan 31, 2010 at 6:09 PM, Troy Ablan <tablan@gmail.com> wrote:
> Yan, Zheng wrote:
>> Please run btrfsck and mount the fs with the new patches attached
>> below. Thank you
>>
>> Yan, Zheng,
>>
>>
>>
> During my two runs of btrfsck, the machine froze in an odd way before it
> completed where the VTs were still accessible but wouldn't accept
> keystrokes.  I at first suspected it ran out of memory+swap.  I gave it
> more swap through temp swapfiles the second time around.  During most of
> the fsck run, the process held onto 2.7 GB of RAM.  Toward the end, it
> climbed all the way to 3.8 GB, and then the entire machine froze in this
> odd way again.
>
> This was the `vmstat 1` that was running as it froze.
>
> procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
>  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
>  0  2   8296  30128   4240   3552    0    0  2800     0 1833 3366  0  1 75 24
>  0  2   8296  29880   5204   3204    0    0  4028     0 2447 4554  0  1 74 24
>  0  2   8296  30508   4400   3288    0    0  1680    20 1855 2422  0  1 75 24
>  0  2   8296  30136   3720   3108    0    0  2464     0 2009 2672  0  1 73 25
>  0  2   8296  29392   4396   3012    0    0  5996     0 2346 4089  0  2 71 27
>  0  2   8296  29144   4076   2592    0    0  3952     0 4430 2910  0  1 77 21
>  0  2   8296  29764   2416   2868    0    0  2168    12 1750 2488  0  1 70 28
>  0  3  28900  25796   1804   1860    0 20604  3428 20660 1906 2766  0  2 75 23
>  1  3 109760  25796   1396   1992    0 80852   288 80852 7665  442  0  2 60 37
>  0  4 113096  25672   1396   1992    0 3332     0  3332  105   43  0  0 74 26
>
> And the `top`
>
> top - 02:50:28 up  3:48,  8 users,  load average: 2.52, 2.01, 1.72
> Tasks:   1 total,   0 running,   1 sleeping,   0 stopped,   0 zombie
> Cpu(s):  0.1%us,  0.6%sy,  0.0%ni, 59.5%id, 39.7%wa,  0.0%hi,  0.0%si,  0.0%st
> Mem:   4050160k total,  4024488k used,    25672k free,     1268k buffers
> Swap: 20978072k total,   119136k used, 20858936k free,     2072k cached
>
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>  5882 root      20   0 3840m 3.6g  220 D    2 94.2   8:26.44 btrfsck
>
>
> This is as much from fsck as was available during both runs
>
> -[~:#]- ./btrfsck /dev/mapper/btra
> root 256 inode 356018 errors 2000
> root 256 inode 356022 errors 2000
> root 256 inode 356023 errors 2000
> root 256 inode 356025 errors 2000
> root 256 inode 356027 errors 2000
> root 256 inode 356033 errors 2000
> root 256 inode 356034 errors 2000
> root 256 inode 356035 errors 2000
> root 256 inode 356036 errors 2000
> root 256 inode 356037 errors 2000
> root 256 inode 356038 errors 2000
> root 256 inode 356039 errors 2000
> root 256 inode 356040 errors 2000
> root 256 inode 356044 errors 2000
> root 256 inode 356045 errors 2000
> root 256 inode 356046 errors 2000
> root 256 inode 356047 errors 2000
> root 256 inode 356049 errors 2000
> root 256 inode 356050 errors 2000
> root 256 inode 356051 errors 2000
> root 256 inode 356052 errors 2000
> root 256 inode 356053 errors 2000
> root 256 inode 356054 errors 2000
> root 256 inode 356055 errors 2000
> root 256 inode 356056 errors 2000
> root 256 inode 356058 errors 2000
> root 256 inode 356063 errors 2000
> root 256 inode 640297 errors 2000
> root 256 inode 640301 errors 2000
> root 256 inode 640302 errors 2000
> root 256 inode 640304 errors 2000
> root 256 inode 640306 errors 2000
> root 256 inode 640312 errors 2000
> root 256 inode 640313 errors 2000
> root 256 inode 640314 errors 2000
> root 256 inode 640315 errors 2000
> root 256 inode 640316 errors 2000
> root 256 inode 640317 errors 2000
> root 256 inode 640318 errors 2000
> root 256 inode 640319 errors 2000
> root 256 inode 640323 errors 2000
> root 256 inode 640324 errors 2000
> root 256 inode 640325 errors 2000
> root 256 inode 640326 errors 2000
> root 256 inode 640328 errors 2000
> root 256 inode 640329 errors 2000
> root 256 inode 640330 errors 2000
> root 256 inode 640331 errors 2000
> root 256 inode 640332 errors 2000
> root 256 inode 640333 errors 2000
> root 256 inode 640334 errors 2000
> root 256 inode 640335 errors 2000
> root 256 inode 640337 errors 2000
> root 256 inode 640342 errors 2000
> root 256 inode 831519 errors 2000
> root 257 inode 272056 errors 2000
> root 257 inode 272060 errors 2000
> root 257 inode 272061 errors 2000
> root 257 inode 272063 errors 2000
> root 257 inode 272065 errors 2000
> root 257 inode 272071 errors 2000
> root 257 inode 272072 errors 2000
> root 257 inode 272073 errors 2000
> root 257 inode 272074 errors 2000
> root 257 inode 272075 errors 2000
> root 257 inode 272076 errors 2000
> root 257 inode 272077 errors 2000
> root 257 inode 272078 errors 2000
> root 257 inode 272082 errors 2000
> root 257 inode 272083 errors 2000
> root 257 inode 272084 errors 2000
> root 257 inode 272085 errors 2000
> root 257 inode 272087 errors 2000
> root 257 inode 272088 errors 2000
> root 257 inode 272089 errors 2000
> root 257 inode 272090 errors 2000
> root 257 inode 272091 errors 2000
> root 257 inode 272092 errors 2000
> root 257 inode 272093 errors 2000
> root 257 inode 272094 errors 2000
> root 257 inode 272096 errors 2000
> root 257 inode 272101 errors 2000
> root 257 inode 799036 errors 400
> root 259 inode 47249 errors 2000
> root 259 inode 47253 errors 2000
> root 259 inode 47254 errors 2000
> root 259 inode 47256 errors 2000
> root 259 inode 47258 errors 2000
> root 259 inode 47264 errors 2000
> root 259 inode 47265 errors 2000
> root 259 inode 47266 errors 2000
> root 259 inode 47267 errors 2000
> root 259 inode 47268 errors 2000
> root 259 inode 47269 errors 2000
> root 259 inode 47270 errors 2000
> root 259 inode 47271 errors 2000
> root 259 inode 47275 errors 2000
> root 259 inode 47276 errors 2000
> root 259 inode 47277 errors 2000
> root 259 inode 47278 errors 2000
> root 259 inode 47280 errors 2000
> root 259 inode 47281 errors 2000
> root 259 inode 47282 errors 2000
> root 259 inode 47283 errors 2000
> root 259 inode 47284 errors 2000
> root 259 inode 47285 errors 2000
> root 259 inode 47286 errors 2000
> root 259 inode 47287 errors 2000
> root 259 inode 47289 errors 2000
> root 259 inode 47294 errors 2000
> root 264 inode 241037 errors 2000
> root 264 inode 242664 errors 2000
> root 264 inode 242665 errors 2000
> root 264 inode 243283 errors 2000
> root 264 inode 250001 errors 2000
> root 264 inode 250440 errors 2000
> root 264 inode 250891 errors 2000
> root 268 inode 78830 errors 2000
> root 268 inode 78834 errors 2000
> root 268 inode 78835 errors 2000
> root 268 inode 78837 errors 2000
> root 268 inode 78839 errors 2000
> root 268 inode 78845 errors 2000
> root 268 inode 78846 errors 2000
> root 268 inode 78847 errors 2000
> root 268 inode 78848 errors 2000
> root 268 inode 78849 errors 2000
> root 268 inode 78850 errors 2000
> root 268 inode 78851 errors 2000
> root 268 inode 78852 errors 2000
> root 268 inode 78856 errors 2000
> root 268 inode 78857 errors 2000
> root 268 inode 78858 errors 2000
> root 268 inode 78859 errors 2000
> root 268 inode 78861 errors 2000
> root 268 inode 78862 errors 2000
> root 268 inode 78863 errors 2000
> root 268 inode 78864 errors 2000
> root 268 inode 78865 errors 2000
> root 268 inode 78866 errors 2000
> root 268 inode 78867 errors 2000
> root 268 inode 78868 errors 2000
> root 268 inode 78870 errors 2000
> root 268 inode 78875 errors 2000
>
>
> And for the mount, done after rebooting
>
> [  198.062442] device label bk0 devid 1 transid 111136 /dev/mapper/btra
> [  198.157820] btrfs: use compression
> [  246.549684] root 18446744073709551608 536
> [  246.549688] node 9197760471040 refs 0 flags 0 owner 536 reloc 1 level
> 0 nritems 0
> [  246.549695] ------------[ cut here ]------------
> [  246.549780] kernel BUG at fs/btrfs/extent-tree.c:5385!
> [  246.549864] invalid opcode: 0000 [#1] SMP
> [  246.550016] last sysfs file:
> /sys/devices/virtual/block/md1/md/metadata_version
> [  246.550145] CPU 5
> [  246.550260] Pid: 5993, comm: btrfs-relocate- Tainted: G        W
> 2.6.33-rc6 #1 P55M-GD45 (MS-7588) /MS-7588
> [  246.550392] RIP: 0010:[<ffffffff8129e686>]  [<ffffffff8129e686>]
> walk_up_proc+0x456/0x4c0
> [  246.550560] RSP: 0018:ffff880138a93c70  EFLAGS: 00010282
> [  246.550644] RAX: 0000000000000058 RBX: ffff88013c12f9c0 RCX:
> 000000000003ffff
> [  246.550732] RDX: ffff880028340000 RSI: 0000000000000082 RDI:
> 0000000000000000
> [  246.550819] RBP: ffff880138a93cf0 R08: 0000000000000000 R09:
> ffffffff816ac59f
> [  246.551695] R10: 0000000000000000 R11: 0000000000000004 R12:
> 0000000000000000
> [  246.551783] R13: ffff880138a92000 R14: ffff8801396769a0 R15:
> ffff88013e69c000
> [  246.551870] FS:  0000000000000000(0000) GS:ffff880028340000(0000)
> knlGS:0000000000000000
> [  246.552000] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [  246.552085] CR2: 00007f3dde7b9ea0 CR3: 00000000018d3000 CR4:
> 00000000000006e0
> [  246.552172] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [  246.552259] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
> 0000000000000400
> [  246.552347] Process btrfs-relocate- (pid: 5993, threadinfo
> ffff880138a92000, task ffff88013c5d40c0)
> [  246.552478] Stack:
> [  246.552556]  ffff880100000000 ffff880100000000 ffff880138a93d30
> ffff880000000000
> [  246.552748] <0> 6db6db6db6db6db7 0000160000000000 0000000000000008
> 0000000000000206
> [  246.553050] <0> ffff880138a93d20 ffff88013967c8e0 0000000000000000
> 0000000000000000
> [  246.553531] Call Trace:
> [  246.553613]  [<ffffffff8129e82f>] walk_up_tree+0x13f/0x1c0
> [  246.553699]  [<ffffffff8129fa98>] btrfs_drop_snapshot+0x218/0x5e0
> [  246.553786]  [<ffffffff812a9eb0>] ? __btrfs_end_transaction+0x100/0x170
> [  246.553875]  [<ffffffff812e7e5d>] merge_func+0x7d/0xc0
> [  246.553961]  [<ffffffff812d268a>] worker_loop+0x17a/0x540
> [  246.554046]  [<ffffffff812d2510>] ? worker_loop+0x0/0x540
> [  246.554131]  [<ffffffff812d2510>] ? worker_loop+0x0/0x540
> [  246.554217]  [<ffffffff81095936>] kthread+0x96/0xa0
> [  246.554303]  [<ffffffff81034bd4>] kernel_thread_helper+0x4/0x10
> [  246.554408]  [<ffffffff816ac969>] ? restore_args+0x0/0x30
> [  246.554493]  [<ffffffff810958a0>] ? kthread+0x0/0xa0
> [  246.554578]  [<ffffffff81034bd0>] ? kernel_thread_helper+0x0/0x10
> [  246.554663] Code: 0f 95 c1 89 7c 24 08 48 8b 0c c3 4a 8b 14 e3 40 0f
> b6 f6 41 83 e1 01 89 34 24 48 c7 c7 98 6a 81 81 49 8b 36 31 c0 e8 26 af
> 40 00 <0f> 0b eb fe 66 0f 1f 44 00 00 4c 89 f7 e8 48 70 03 00 4c 89 f7
> [  246.557353] RIP  [<ffffffff8129e686>] walk_up_proc+0x456/0x4c0
> [  246.557474]  RSP <ffff880138a93c70>
> [  246.557601] ---[ end trace 18f62a4fb26ae09e ]---
>

Please try the patch attached below. It should solve the bug during mounting
that fs. But I don't know why there are so many link count errors in that fs.
How old is that fs? what was that fs used for?

Thank you very much.
Yan, Zheng

---
diff -urp 1/fs/btrfs/extent-tree.c 2/fs/btrfs/extent-tree.c
--- 1/fs/btrfs/extent-tree.c	2010-01-22 12:16:34.203525744 +0800
+++ 2/fs/btrfs/extent-tree.c	2010-01-31 20:09:08.509200892 +0800
@@ -5402,14 +5402,14 @@ static noinline int walk_down_tree(struc
 	int ret;

 	while (level >= 0) {
-		if (path->slots[level] >=
-		    btrfs_header_nritems(path->nodes[level]))
-			break;
-
 		ret = walk_down_proc(trans, root, path, wc, lookup_info);
 		if (ret > 0)
 			break;

+		if (path->slots[level] >=
+		    btrfs_header_nritems(path->nodes[level]))
+			break;
+
 		if (level == 0)
 			break;

---
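
Editorial aside: one reading of the patch above, offered as an interpretation rather than the author's stated rationale, is that the old ordering skipped the per-node step entirely for a node with no items, such as the `nritems 0` node in the log. The toy model below uses made-up stand-in types and names (`toy_path`, `toy_down_proc`), not the real btrfs structures, and only demonstrates the effect of swapping the two checks.

```c
#include <stdbool.h>

#define MAX_LEVEL 8

/* Toy stand-in for the walk state; not the real btrfs types. */
struct toy_path {
	int nritems[MAX_LEVEL];		/* items in the node at each level */
	int slots[MAX_LEVEL];		/* current slot at each level */
	bool visited[MAX_LEVEL];	/* set once the per-node step has run */
};

/* Stand-in for walk_down_proc(): records that the node was processed. */
static int toy_down_proc(struct toy_path *p, int level)
{
	p->visited[level] = true;
	return 0;
}

/* Ordering before the patch: bounds check first, then the per-node step. */
void toy_walk_down_old(struct toy_path *p, int level)
{
	while (level >= 0) {
		if (p->slots[level] >= p->nritems[level])
			break;
		if (toy_down_proc(p, level) > 0)
			break;
		if (level == 0)
			break;
		level--;
	}
}

/* Ordering after the patch: per-node step first, then the bounds check. */
void toy_walk_down_new(struct toy_path *p, int level)
{
	while (level >= 0) {
		if (toy_down_proc(p, level) > 0)
			break;
		if (p->slots[level] >= p->nritems[level])
			break;
		if (level == 0)
			break;
		level--;
	}
}
```

With an empty leaf (`nritems[0] == 0`), the old ordering leaves `visited[0]` false while the new ordering sets it, so any later step that relies on the node having been processed sees initialized state.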
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: panic during rebalance, and now upon mount
  2010-01-31 12:34         ` Yan, Zheng 
@ 2010-01-31 19:33           ` Troy Ablan
  2010-02-01  4:22             ` Yan, Zheng 
  0 siblings, 1 reply; 8+ messages in thread
From: Troy Ablan @ 2010-01-31 19:33 UTC (permalink / raw)
  To: Yan, Zheng ; +Cc: linux-btrfs

Yan, Zheng wrote:
> Please try the patch attached below. It should solve the bug during
> mounting
> that fs. But I don't know why there are so many link count errors in that fs.
> How old is that fs? what was that fs used for?
>
> Thank you very much.
> Yan, Zheng
>
>   
Good, so far.  Thanks!

The filesystem is less than 2 weeks old, created and managed exclusively
with the unstable tools Btrfs v0.19-4-gab8fb4c-dirty

I created the filesystem -d raid1 -m raid1.

There are 14 dm-crypt mappings corresponding to 14 partitions on 14
drives.  There's one filesystem made up from these devices with about 14
TB of space (a mixture of devices ranging from 500GB to 2TB)

The filesystem is used for incremental backup from remote computers
using rsync.

The filesystem tree is as follows

/
/machine1 <- normal directory
/machine1/machine1 <- a subvolume
/machine1/machine1-20100120-1220 <- a snapshot of the subvolume above
...
/machine1/machine1-20100131-1220 <- more snapshots of the subvolume above
/machine2 <- normal directory
/machine2/machine1 <- a subvolume
/machine2/machine2-20100120-1020 <- a snapshot of the subvolume above
...
/machine2/machine2-20100131-1020 <- more snapshots of the subvolume above
...

The files are backed up with `rsync -aH --inplace` onto the subvolume
for each machine.

The only oddness I can think of is that during initial testing of this
filesystem, I yanked a drive physically from the machine while it was
writing.  btrfs seemed to continue to try to write to the inaccessible
device, and indeed, btrfs-show showed the used space on the missing
drive increasing over time.  Also, I was unable to remove the drive from
the volume (ioctl returned -1), so it was in this state until I rebooted
a couple hours later.   I then did a btrfs-vol -r missing on the drive,
and then added it back in as a new device.  I did btrfs-vol -b which
succeeded once.   After adding more drives, I did btrfs-vol -b again,
and that left me in the state where this thread began.

As I understand it, a btrfs-vol -b is currently one of the only ways to
reduplicate unmirrored chunks after a drive failure. (aside from
rewriting the data or removing and readding devices).  Is my
understanding correct?

Thanks

-- 
Troy


* Re: panic during rebalance, and now upon mount
  2010-01-31 19:33           ` Troy Ablan
@ 2010-02-01  4:22             ` Yan, Zheng 
  0 siblings, 0 replies; 8+ messages in thread
From: Yan, Zheng  @ 2010-02-01  4:22 UTC (permalink / raw)
  To: Troy Ablan; +Cc: linux-btrfs

On Mon, Feb 1, 2010 at 3:33 AM, Troy Ablan <tablan@gmail.com> wrote:
> Yan, Zheng wrote:
>> Please try the patch attached below. It should solve the bug during
>> mounting
>> that fs. But I don't know why there are so many link count errors in that fs.
>> How old is that fs? what was that fs used for?
>>
>> Thank you very much.
>> Yan, Zheng
>>
>>
> Good, so far.  Thanks!
>
> The filesystem is less than 2 weeks old, created and managed exclusively
> with the unstable tools Btrfs v0.19-4-gab8fb4c-dirty
>
> I created the filesystem -d raid1 -m raid1.
>
> There are 14 dm-crypt mappings corresponding to 14 partitions on 14
> drives.  There's one filesystem made up from these devices with about 14
> TB of space (a mixture of devices ranging from 500GB to 2TB)
>
> The filesystem is used for incremental backup from remote computers
> using rsync.
>
> The filesystem tree is as follows
>
> /
> /machine1 <- normal directory
> /machine1/machine1 <- a subvolume
> /machine1/machine1-20100120-1220 <- a snapshot of the subvolume above
> ...
> /machine1/machine1-20100131-1220 <- more snapshots of the subvolume above
> /machine2 <- normal directory
> /machine2/machine1 <- a subvolume
> /machine2/machine2-20100120-1020 <- a snapshot of the subvolume above
> ...
> /machine2/machine2-20100131-1020 <- more snapshots of the subvolume above
> ...
>
> The files are backed up with `rsync -aH --inplace` onto the subvolume
> for each machine.
>
> The only oddness I can think of is that during initial testing of this
> filesystem, I yanked a drive physically from the machine while it was
> writing.  btrfs seemed to continue to try to write to the inaccessible
> device, and indeed, btrfs-show showed the used space on the missing
> drive increasing over time.  Also, I was unable to remove the drive from
> the volume (ioctl returned -1), so it was in this state until I rebooted
> a couple hours later.   I then did a btrfs-vol -r missing on the drive,
> and then added it back in as a new device.  I did btrfs-vol -b which
> succeeded once.   After adding more drives, I did btrfs-vol -b again,
> and that left me in the state where this thread began.
>
> As I understand it, a btrfs-vol -b is currently one of the only ways to
> reduplicate unmirrored chunks after a drive failure. (aside from
> rewriting the data or removing and readding devices).  Is my
> understanding correct?
>
Yes,

Thanks again for helping debug.

Yan, Zheng

Thread overview: 8+ messages
2010-01-30  6:05 panic during rebalance, and now upon mount Troy Ablan
2010-01-30 12:09 ` Yan, Zheng 
2010-01-30 17:31   ` Troy Ablan
2010-01-31  2:00     ` Yan, Zheng 
2010-01-31 10:09       ` Troy Ablan
2010-01-31 12:34         ` Yan, Zheng 
2010-01-31 19:33           ` Troy Ablan
2010-02-01  4:22             ` Yan, Zheng 
