* btrfs BUG during cosd open() syscall
@ 2011-01-26 16:00 Jim Schutt
  2011-01-26 17:17 ` Gregory Farnum
  2011-01-26 17:59 ` btrfs BUG during Ceph " Jim Schutt
  0 siblings, 2 replies; 12+ messages in thread
From: Jim Schutt @ 2011-01-26 16:00 UTC (permalink / raw)
  To: ceph-devel

Hi,

I got this kernel BUG during a heavy write load, using the
current ceph unstable kernel
(a3f5274e535 in git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client.git).

Please let me know what other information you need to make this useful.

-- Jim

 BUG: unable to handle kernel NULL pointer dereference at 0000000000000100
[97221.834832] IP: [<ffffffffa075b3ab>] btrfs_drop_inode+0x10/0x36 [btrfs]
[97221.834832] PGD 198d6b067 PUD 13d79f067 PMD 0 
[97221.834832] Oops: 0000 [#1] SMP 
[97221.834832] last sysfs file: /sys/devices/pci0000:00/0000:00:1e.0/0000:10:0d.0/local_cpus
[97221.834832] CPU 3 
[97221.834832] Modules linked in: loop btrfs zlib_deflate ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge ]
[97221.834832] 
[97221.834832] Pid: 30295, comm: cosd Not tainted 2.6.37-00017-ga3f5274 #4 0DT097/PowerEdge 1950
[97221.834832] RIP: 0010:[<ffffffffa075b3ab>]  [<ffffffffa075b3ab>] btrfs_drop_inode+0x10/0x36 [btrfs]
[97221.834832] RSP: 0018:ffff8801cf205c08  EFLAGS: 00010282
[97221.834832] RAX: ffffffffa075b39b RBX: ffff88018490a3a0 RCX: 0000000000000001
[97221.834832] RDX: 0000000000000000 RSI: ffffffff819e7ea0 RDI: ffff88018490a3a0
[97221.834832] RBP: ffff8801cf205c08 R08: ffffe8ffffccefa8 R09: 0000000000000000
[97221.834832] R10: ffff8801488e9658 R11: 0000000000000000 R12: ffff88021b5c6400
[97221.834832] R13: ffff8801fad145a0 R14: ffff8801faf8c440 R15: ffff88017bab9848
[97221.834832] FS:  00007f0b011f9940(0000) GS:ffff8800cfcc0000(0000) knlGS:0000000000000000
[97221.834832] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[97221.834832] CR2: 0000000000000100 CR3: 00000001b8c89000 CR4: 00000000000006e0
[97221.834832] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[97221.834832] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[97221.834832] Process cosd (pid: 30295, threadinfo ffff8801cf204000, task ffff8801488e9610)
[97221.834832] Stack:
[97221.834832]  ffff8801cf205c28 ffffffff810fd714 00000000fffffffb fffffffffffffffb
[97221.834832]  ffff8801cf205cd8 ffffffffa07587e8 ffff8801cf205c48 0000000000000102
[97221.834832]  0000000fcf205c58 ffff88022f5c46a0 ffff8801d1ef8800 ffffffff8136a638
[97221.834832] Call Trace:
[97221.834832]  [<ffffffff810fd714>] iput+0x5c/0x1e0
[97221.834832]  [<ffffffffa07587e8>] btrfs_new_inode+0x2d3/0x2e5 [btrfs]
[97221.834832]  [<ffffffff8136a638>] ? _cond_resched+0xe/0x22
[97221.834832]  [<ffffffff8136ae20>] ? mutex_lock+0x16/0x3a
[97221.834832]  [<ffffffffa0756da1>] ? start_transaction+0x176/0x1bc [btrfs]
[97221.834832]  [<ffffffffa075d1fc>] btrfs_create+0xbb/0x1fa [btrfs]
[97221.834832]  [<ffffffff810f49e2>] vfs_create+0x76/0x96
[97221.834832]  [<ffffffff810f56af>] do_last+0x24d/0x4d3
[97221.834832]  [<ffffffff810f5b16>] do_filp_open+0x1e1/0x4c5
[97221.834832]  [<ffffffff81031061>] ? should_resched+0xe/0x2f
[97221.834832]  [<ffffffff8136a638>] ? _cond_resched+0xe/0x22
[97221.834832]  [<ffffffff811aa669>] ? might_fault+0xe/0x10
[97221.834832]  [<ffffffff811aa753>] ? __strncpy_from_user+0x20/0x4a
[97221.834832]  [<ffffffff810e9023>] do_sys_open+0x62/0xeb
[97221.834832]  [<ffffffff810e90df>] sys_open+0x20/0x22
[97221.834832]  [<ffffffff81002c2b>] system_call_fastpath+0x16/0x1b
[97221.834832] Code: 53 fc 94 e0 4c 89 e7 e8 f6 8a 95 e0 48 83 c4 18 5b 41 5c 41 5d 41 5e 41 5f c9 c3 55 48 89 e5 0f 1f 44 00 00 48 8b 97 68 fe ff ff <83> ba 00 01 00 00 00 75 12 48 8b 82 28 01 00 00 b9 01 00  
[97221.834832] RIP  [<ffffffffa075b3ab>] btrfs_drop_inode+0x10/0x36 [btrfs]
[97221.834832]  RSP <ffff8801cf205c08>
[97221.834832] CR2: 0000000000000100
[97222.207152] ---[ end trace 32eb8bbbb4782eb8 ]---
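
A minimal sketch of how the faulting address in the trace above can be cross-checked against the Code: bytes and the register dump. It assumes standard x86-64 ModRM decoding; the byte string and register values are copied from the oops, while the variable names are illustrative and not from the original report.

```python
import struct

# Bytes from the "Code:" line of the oops, starting at the faulting byte <83>.
faulting = bytes.fromhex("83 ba 00 01 00 00 00")

RDX = 0x0000000000000000  # from the register dump
CR2 = 0x0000000000000100  # faulting address reported by the oops

opcode, modrm = faulting[0], faulting[1]
mod = modrm >> 6          # 0b10: register base plus 32-bit displacement
reg = (modrm >> 3) & 0x7  # 7: with opcode 0x83 this selects CMP r/m32, imm8
rm = modrm & 0x7          # 2: base register is RDX (no REX prefix present)

# Little-endian signed 32-bit displacement follows the ModRM byte.
disp32 = struct.unpack("<i", faulting[2:6])[0]
effective_address = (RDX + disp32) & 0xFFFFFFFFFFFFFFFF

print(f"opcode={opcode:#x} mod={mod:#b} reg={reg} rm={rm}")
print(f"disp32={disp32:#x} effective_address={effective_address:#x}")
print("matches CR2:", effective_address == CR2)
```

The faulting instruction decodes as a 32-bit compare against memory at RDX + 0x100; with RDX = 0 that address equals CR2, so the pointer loaded just before the fault was NULL. Which structure it was expected to point at cannot be determined from the bytes alone.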







* Re: btrfs BUG during cosd open() syscall
  2011-01-26 16:00 btrfs BUG during cosd open() syscall Jim Schutt
@ 2011-01-26 17:17 ` Gregory Farnum
  2011-01-26 17:55   ` Jim Schutt
  2011-01-26 17:59 ` btrfs BUG during Ceph " Jim Schutt
  1 sibling, 1 reply; 12+ messages in thread
From: Gregory Farnum @ 2011-01-26 17:17 UTC (permalink / raw)
  To: Jim Schutt; +Cc: ceph-devel

Jim:
This looks to be a btrfs bug that Ceph isn't involved with at all.
Please send it on to the btrfs list/bug tracker.
(In the future, if you want to check with us on btrfs bugs that you find
while using Ceph, that's fine; occasionally it's a bug in code that Sage
contributed or a cooperative issue with ioctl usage.)
Thanks!
-Greg

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: btrfs BUG during cosd open() syscall
  2011-01-26 17:17 ` Gregory Farnum
@ 2011-01-26 17:55   ` Jim Schutt
  2011-01-26 17:58     ` Sage Weil
  0 siblings, 1 reply; 12+ messages in thread
From: Jim Schutt @ 2011-01-26 17:55 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: ceph-devel


Hi,

On Wed, 2011-01-26 at 10:17 -0700, Gregory Farnum wrote:
> Jim:
> This looks to be a btrfs bug that Ceph isn't involved with at all.
> Please send it on to the btrfs list/bug tracker.

OK, thanks for taking a look.

> (In future if you want to check btrfs bugs that you find while using
> Ceph with us that's fine; occasionally it's a bug in code that Sage
> contributed or a cooperative issue with ioctl usage.)

Is it OK to Cc: ceph-devel on these types of things in the
interest of keeping other ceph users informed?

-- Jim



* Re: btrfs BUG during cosd open() syscall
  2011-01-26 17:55   ` Jim Schutt
@ 2011-01-26 17:58     ` Sage Weil
  0 siblings, 0 replies; 12+ messages in thread
From: Sage Weil @ 2011-01-26 17:58 UTC (permalink / raw)
  To: Jim Schutt; +Cc: Gregory Farnum, ceph-devel

On Wed, 26 Jan 2011, Jim Schutt wrote:
> > (In future if you want to check btrfs bugs that you find while using
> > Ceph with us that's fine; occasionally it's a bug in code that Sage
> > contributed or a cooperative issue with ioctl usage.)
> 
> Is it OK to Cc: ceph-devel on these types of things in the
> interest of keeping other ceph users informed?

Yes, please!  We're definitely interested in what/which errors you're 
seeing, as some of it may be related to the ceph workload.

Thanks!
sage



* btrfs BUG during Ceph cosd open() syscall
  2011-01-26 16:00 btrfs BUG during cosd open() syscall Jim Schutt
  2011-01-26 17:17 ` Gregory Farnum
@ 2011-01-26 17:59 ` Jim Schutt
  2011-01-26 18:48   ` Jim Schutt
  1 sibling, 1 reply; 12+ messages in thread
From: Jim Schutt @ 2011-01-26 17:59 UTC (permalink / raw)
  To: linux-btrfs; +Cc: ceph-devel

Hi,

I got this kernel BUG on a server running multiple Ceph
cosd instances, during a heavy write load generated by
multiple Ceph clients.

The server was running the current ceph unstable kernel 
(a3f5274e535 in git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client.git).

Please let me know what other information you need to 
make this report useful.

-- Jim

 BUG: unable to handle kernel NULL pointer dereference at 0000000000000100
[97221.834832] IP: [<ffffffffa075b3ab>] btrfs_drop_inode+0x10/0x36 [btrfs]
[97221.834832] PGD 198d6b067 PUD 13d79f067 PMD 0 
[97221.834832] Oops: 0000 [#1] SMP 
[97221.834832] last sysfs file: /sys/devices/pci0000:00/0000:00:1e.0/0000:10:0d.0/local_cpus
[97221.834832] CPU 3 
[97221.834832] Modules linked in: loop btrfs zlib_deflate ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge ]
[97221.834832] 
[97221.834832] Pid: 30295, comm: cosd Not tainted 2.6.37-00017-ga3f5274 #4 0DT097/PowerEdge 1950
[97221.834832] RIP: 0010:[<ffffffffa075b3ab>]  [<ffffffffa075b3ab>] btrfs_drop_inode+0x10/0x36 [btrfs]
[97221.834832] RSP: 0018:ffff8801cf205c08  EFLAGS: 00010282
[97221.834832] RAX: ffffffffa075b39b RBX: ffff88018490a3a0 RCX: 0000000000000001
[97221.834832] RDX: 0000000000000000 RSI: ffffffff819e7ea0 RDI: ffff88018490a3a0
[97221.834832] RBP: ffff8801cf205c08 R08: ffffe8ffffccefa8 R09: 0000000000000000
[97221.834832] R10: ffff8801488e9658 R11: 0000000000000000 R12: ffff88021b5c6400
[97221.834832] R13: ffff8801fad145a0 R14: ffff8801faf8c440 R15: ffff88017bab9848
[97221.834832] FS:  00007f0b011f9940(0000) GS:ffff8800cfcc0000(0000) knlGS:0000000000000000
[97221.834832] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[97221.834832] CR2: 0000000000000100 CR3: 00000001b8c89000 CR4: 00000000000006e0
[97221.834832] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[97221.834832] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[97221.834832] Process cosd (pid: 30295, threadinfo ffff8801cf204000, task ffff8801488e9610)
[97221.834832] Stack:
[97221.834832]  ffff8801cf205c28 ffffffff810fd714 00000000fffffffb fffffffffffffffb
[97221.834832]  ffff8801cf205cd8 ffffffffa07587e8 ffff8801cf205c48 0000000000000102
[97221.834832]  0000000fcf205c58 ffff88022f5c46a0 ffff8801d1ef8800 ffffffff8136a638
[97221.834832] Call Trace:
[97221.834832]  [<ffffffff810fd714>] iput+0x5c/0x1e0
[97221.834832]  [<ffffffffa07587e8>] btrfs_new_inode+0x2d3/0x2e5 [btrfs]
[97221.834832]  [<ffffffff8136a638>] ? _cond_resched+0xe/0x22
[97221.834832]  [<ffffffff8136ae20>] ? mutex_lock+0x16/0x3a
[97221.834832]  [<ffffffffa0756da1>] ? start_transaction+0x176/0x1bc [btrfs]
[97221.834832]  [<ffffffffa075d1fc>] btrfs_create+0xbb/0x1fa [btrfs]
[97221.834832]  [<ffffffff810f49e2>] vfs_create+0x76/0x96
[97221.834832]  [<ffffffff810f56af>] do_last+0x24d/0x4d3
[97221.834832]  [<ffffffff810f5b16>] do_filp_open+0x1e1/0x4c5
[97221.834832]  [<ffffffff81031061>] ? should_resched+0xe/0x2f
[97221.834832]  [<ffffffff8136a638>] ? _cond_resched+0xe/0x22
[97221.834832]  [<ffffffff811aa669>] ? might_fault+0xe/0x10
[97221.834832]  [<ffffffff811aa753>] ? __strncpy_from_user+0x20/0x4a
[97221.834832]  [<ffffffff810e9023>] do_sys_open+0x62/0xeb
[97221.834832]  [<ffffffff810e90df>] sys_open+0x20/0x22
[97221.834832]  [<ffffffff81002c2b>] system_call_fastpath+0x16/0x1b
[97221.834832] Code: 53 fc 94 e0 4c 89 e7 e8 f6 8a 95 e0 48 83 c4 18 5b 41 5c 41 5d 41 5e 41 5f c9 c3 55 48 89 e5 0f 1f 44 00 00 48 8b 97 68 fe ff ff <83> ba 00 01 00 00 00 75 12 48 8b 82 28 01 00 00 b9 01 00  
[97221.834832] RIP  [<ffffffffa075b3ab>] btrfs_drop_inode+0x10/0x36 [btrfs]
[97221.834832]  RSP <ffff8801cf205c08>
[97221.834832] CR2: 0000000000000100
[97222.207152] ---[ end trace 32eb8bbbb4782eb8 ]---





* Re: btrfs BUG during Ceph cosd open() syscall
  2011-01-26 17:59 ` btrfs BUG during Ceph " Jim Schutt
@ 2011-01-26 18:48   ` Jim Schutt
  2011-01-26 19:20     ` Matt Weil
  2011-01-27 16:05     ` btrfs BUG during Ceph cosd truncate() syscall Jim Schutt
  0 siblings, 2 replies; 12+ messages in thread
From: Jim Schutt @ 2011-01-26 18:48 UTC (permalink / raw)
  To: linux-btrfs; +Cc: ceph-devel

Hi,

On Wed, 2011-01-26 at 10:59 -0700, Jim Schutt wrote:
> Hi,
> 
> I got this kernel BUG on a server running multiple Ceph
> cosd instances, during a heavy write load generated by
> multiple Ceph clients.
> 
> The server was running the current ceph unstable kernel 
> (a3f5274e535 in git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client.git).
> 
> Please let me know what other information you need to 
> make this report useful.
> 
> -- Jim
> 

Here's another example.

Again, please let me know what other information you need to
make this report useful.

-- Jim

[11199.532483] ------------[ cut here ]------------
[11199.536292] kernel BUG at fs/btrfs/extent-tree.c:2198!
[11199.536292] invalid opcode: 0000 [#1] SMP 
[11199.536292] last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
[11199.536292] CPU 3 
[11199.536292] Modules linked in: loop btrfs zlib_deflate ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge ]
[11199.536292] 
[11199.536292] Pid: 1664, comm: cosd Not tainted 2.6.37-00017-ga3f5274 #4 0DT097/PowerEdge 1950
[11199.536292] RIP: 0010:[<ffffffffa0774081>]  [<ffffffffa0774081>] run_clustered_refs+0x71e/0x76b [btrfs]
[11199.536292] RSP: 0018:ffff8801c90abb58  EFLAGS: 00010282
[11199.536292] RAX: 00000000fffffffb RBX: 0000000000000000 RCX: ffff8802262c5000
[11199.536292] RDX: ffff88017921e2d0 RSI: ffffea000527f690 RDI: 0000000000000001
[11199.536292] RBP: ffff8801c90abc28 R08: ffffe8ffffccefe8 R09: 0000000000000000
[11199.536292] R10: 0000000000000003 R11: ffff880227549e98 R12: ffff880140bb8f00
[11199.536292] R13: 0000000000000000 R14: ffff880181eff378 R15: ffff8802262c5000
[11199.536292] FS:  00007f5e680fc940(0000) GS:ffff8800cfcc0000(0000) knlGS:0000000000000000
[11199.536292] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[11199.536292] CR2: 00007f0e1a476260 CR3: 0000000173aa0000 CR4: 00000000000006e0
[11199.536292] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[11199.536292] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[11199.536292] Process cosd (pid: 1664, threadinfo ffff8801c90aa000, task ffff8801df12d840)
[11199.536292] Stack:
[11199.536292]  0000000000000000 0000000000000000 0000000000000001 0000000000000000
[11199.536292]  ffff8801c90abc48 ffff8802262c5000 ffff8801e0a9c600 ffff880181eff378
[11199.536292]  0000000000000000 0000002600000206 ffff880181eff380 000000007921e750
[11199.536292] Call Trace:
[11199.536292]  [<ffffffffa0785be0>] ? btrfs_update_inode+0xc3/0xd3 [btrfs]
[11199.536292]  [<ffffffffa07741bc>] btrfs_run_delayed_refs+0xee/0x15e [btrfs]
[11199.536292]  [<ffffffff810fa54d>] ? __fsnotify_update_dcache_flags+0x22/0x56
[11199.536292]  [<ffffffffa07801d0>] __btrfs_end_transaction+0x6d/0x1e3 [btrfs]
[11199.536292]  [<ffffffffa0780372>] btrfs_end_transaction_throttle+0x18/0x1a [btrfs]
[11199.536292]  [<ffffffffa07872e1>] btrfs_create+0x1a0/0x1fa [btrfs]
[11199.536292]  [<ffffffff810f49e2>] vfs_create+0x76/0x96
[11199.536292]  [<ffffffff810f56af>] do_last+0x24d/0x4d3
[11199.536292]  [<ffffffff810f5b16>] do_filp_open+0x1e1/0x4c5
[11199.536292]  [<ffffffff81031061>] ? should_resched+0xe/0x2f
[11199.536292]  [<ffffffff8136a638>] ? _cond_resched+0xe/0x22
[11199.536292]  [<ffffffff811aa669>] ? might_fault+0xe/0x10
[11199.536292]  [<ffffffff811aa753>] ? __strncpy_from_user+0x20/0x4a
[11199.536292]  [<ffffffff810e9023>] do_sys_open+0x62/0xeb
[11199.536292]  [<ffffffff810e90df>] sys_open+0x20/0x22
[11199.536292]  [<ffffffff81002c2b>] system_call_fastpath+0x16/0x1b
[11199.536292] Code: 24 08 48 8b 46 40 48 89 04 24 48 8b b5 58 ff ff ff 48 8b bd 60 ff ff ff e8 61 e7 ff ff eb 08 0f 0b eb fe 0f 0b eb fe 85 c0 74 04 <0f> 0b eb fe 4c 89 e7 e8 65 ae ff ff 48 8b bd 70 ff ff ff  
[11199.536292] RIP  [<ffffffffa0774081>] run_clustered_refs+0x71e/0x76b [btrfs]
[11199.536292]  RSP <ffff8801c90abb58>
[11199.905250] ---[ end trace b0dead1e7c3dbf7b ]---
Jan 26 11:40:32 an1 [11199.532483] ------------[ cut here ]------------
Jan 26 11:40:33 an1 [11199.536292] invalid opcode: 0000 [#1] SMP 
Jan 26 11:40:33 an1 [11199.536292] last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
Jan 26 11:40:38 an1 [11199.536292] Stack:
Jan 26 11:40:38 an1 [11199.536292] Call Trace:
Jan 26 11:40:40 an1 [11199.536292] Code: 24 08 48 8b 46 40 48 89 04 24 48 8b b5 58 ff ff ff 48 8b bd 60 ff ff ff e8 61 e7 ff ff eb 08 0f 0b eb fe 0f 0b eb fe 85 c0 74 04 <0f> 0b eb fe 4c 89 e7 e8 65 ae ff ff 4 
[11212.699541] btrfs: sdm2 checksum verify failed on 31928320 wanted 237BEA0B found F7B13C5E level 0
[11212.709895] btrfs: sdm2 checksum verify failed on 31928320 wanted 237BEA0B found F7B13C5E level 0
[11212.719737] btrfs: sdm2 checksum verify failed on 31928320 wanted 237BEA0B found F7B13C5E level 0
[11212.729433] ------------[ cut here ]------------
[11212.730394] kernel BUG at fs/btrfs/extent-tree.c:5789!
[11212.734157] invalid opcode: 0000 [#2] SMP 
[11212.734157] last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
[11212.734157] CPU 3 
[11212.734157] Modules linked in: loop btrfs zlib_deflate ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge ]
[11212.734157] 
[11212.734157] Pid: 27662, comm: btrfs-cleaner Tainted: G      D     2.6.37-00017-ga3f5274 #4 0DT097/PowerEdge 1950
[11212.734157] RIP: 0010:[<ffffffffa0773452>]  [<ffffffffa0773452>] reada_walk_down+0x18c/0x249 [btrfs]
[11212.734157] RSP: 0018:ffff880227539be0  EFLAGS: 00010282
[11212.734157] RAX: 00000000fffffffb RBX: ffff8801cd50d750 RCX: ffff88020b993000
[11212.734157] RDX: ffff88017921e3f0 RSI: ffffea000527f690 RDI: 0000000100000090
[11212.734157] RBP: ffff880227539c80 R08: ffffe8ffffccefe8 R09: 0000000000000000
[11212.734157] R10: 0000000100a68468 R11: ffff880227549e98 R12: ffff8801d83c3000
[11212.734157] R13: 0000000000000040 R14: ffff88020b993000 R15: 00000000000000e0
[11212.734157] FS:  0000000000000000(0000) GS:ffff8800cfcc0000(0000) knlGS:0000000000000000
[11212.734157] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[11212.734157] CR2: 0000000000b92de8 CR3: 000000020e5b3000 CR4: 00000000000006e0
[11212.734157] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[11212.734157] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[11212.734157] Process btrfs-cleaner (pid: 27662, threadinfo ffff880227538000, task ffff88020ebc0000)
[11212.734157] Stack:
[11212.734157]  ffff880227539bf0 0000000400000000 ffff8801cd50d750 ffff8801e0a9ca00
[11212.734157]  00000000024cd000 000010000000006b ffff88021527f880 0000000100000001
[11212.734157]  ffff880227539c50 ffffffffa079c6bc ffff880225c96198 ffff8801b0cf9aa8
[11212.734157] Call Trace:
[11212.734157]  [<ffffffffa079c6bc>] ? extent_buffer_uptodate+0x6c/0x8a [btrfs]
[11212.734157]  [<ffffffffa0775d62>] do_walk_down+0x25b/0x395 [btrfs]
[11212.734157]  [<ffffffffa076db1f>] ? btrfs_header_generation+0x1f/0x25 [btrfs]
[11212.734157]  [<ffffffffa0771268>] ? walk_down_proc+0x10a/0x1d0 [btrfs]
[11212.734157]  [<ffffffffa0775f1d>] walk_down_tree+0x81/0xac [btrfs]
[11212.734157]  [<ffffffffa077636f>] btrfs_drop_snapshot+0x2aa/0x467 [btrfs]
[11212.734157]  [<ffffffff81031049>] ? need_resched+0x23/0x2d
[11212.734157]  [<ffffffff81031061>] ? should_resched+0xe/0x2f
[11212.734157]  [<ffffffffa077d080>] ? cleaner_kthread+0x0/0x16b [btrfs]
[11212.734157]  [<ffffffffa077f24d>] btrfs_clean_old_snapshots+0xee/0x10c [btrfs]
[11212.734157]  [<ffffffffa077d177>] cleaner_kthread+0xf7/0x16b [btrfs]
[11212.734157]  [<ffffffff8105b11e>] kthread+0x72/0x7a
[11212.734157]  [<ffffffff810039d4>] kernel_thread_helper+0x4/0x10
[11212.734157]  [<ffffffff8105b0ac>] ? kthread+0x0/0x7a
[11212.734157]  [<ffffffff810039d0>] ? kernel_thread_helper+0x0/0x10
[11212.734157] Code: 01 00 00 0f 86 bb 00 00 00 8b 4d 8c 48 8b 55 80 4c 8d 4d c0 48 8b bd 78 ff ff ff 4c 8d 45 c8 4c 89 f6 e8 ec da ff ff 85 c0 74 04 <0f> 0b eb fe 48 8b 45 c8 48 85 c0 75 04 0f 0b eb fe 41 83  
[11212.734157] RIP  [<ffffffffa0773452>] reada_walk_down+0x18c/0x249 [btrfs]
[11212.734157]  RSP <ffff880227539be0>
[11213.101484] ---[ end trace b0dead1e7c3dbf7c ]---
Jan 26 11:40:45 an1 [11212.729433] ------------[ cut here ]------------
Jan 26 11:40:45 an1 [11212.734157] invalid opcode: 0000 [#2] SMP 
Jan 26 11:40:45 an1 [11212.734157] last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
Jan 26 11:40:46 an1 [11212.734157] Stack:
Jan 26 11:40:46 an1 [11212.734157] Call Trace:
Jan 26 11:40:46 an1 [11212.734157] Code: 01 00 00 0f 86 bb 00 00 00 8b 4d 8c 48 8b 55 80 4c 8d 4d c0 48 8b bd 78 ff ff ff 4c 8d 45 c8 4c 89 f6 e8 ec da ff ff 85 c0 74 04 <0f> 0b eb fe 48 8b 45 c8 48 85 c0 75 0 






^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: btrfs BUG during Ceph cosd open() syscall
  2011-01-26 18:48   ` Jim Schutt
@ 2011-01-26 19:20     ` Matt Weil
  2011-01-27 15:58         ` Christian Brunner
  2011-01-27 16:05     ` btrfs BUG during Ceph cosd truncate() syscall Jim Schutt
  1 sibling, 1 reply; 12+ messages in thread
From: Matt Weil @ 2011-01-26 19:20 UTC (permalink / raw)
  To: Jim Schutt; +Cc: linux-btrfs, ceph-devel

heavy writes as well

Jan  5 16:56:46 linuscs101 kernel: [ 3666.496742] ------------[ cut here ]------------
>  Jan  5 16:56:46 linuscs101 kernel: [ 3666.496754] WARNING: at fs/btrfs/inode.c:2143 btrfs_orphan_commit_root+0xb0/0xc0()
>  Jan  5 16:56:46 linuscs101 kernel: [ 3666.496756] Hardware name: ProLiant DL380 G5
>  Jan  5 16:56:46 linuscs101 kernel: [ 3666.496758] Modules linked in: nfsd exportfs nfs lockd nfs_acl auth_rpcgss bonding sunrpc radeon ttm drm_kms_helper drm bnx2 psmouse i5000_edac usbhid lp shpchp ipmi_si i2c_algo_bit hid edac_core parport ipmi_msghandler serio_raw i5k_amb hpilo cciss fbcon tileblit font bitblit softcursor
>  Jan  5 16:56:46 linuscs101 kernel: [ 3666.496788] Pid: 2764, comm: cosd Not tainted 2.6.37-ceph-client #1
>  Jan  5 16:56:46 linuscs101 kernel: [ 3666.496790] Call Trace:
>  Jan  5 16:56:46 linuscs101 kernel: [ 3666.496797]  [<ffffffff81060dbf>] warn_slowpath_common+0x7f/0xc0
>  Jan  5 16:56:46 linuscs101 kernel: [ 3666.496800]  [<ffffffff81060e1a>] warn_slowpath_null+0x1a/0x20
>  Jan  5 16:56:46 linuscs101 kernel: [ 3666.496804]  [<ffffffff81273b70>] btrfs_orphan_commit_root+0xb0/0xc0
>  Jan  5 16:56:46 linuscs101 kernel: [ 3666.496807]  [<ffffffff8126f1c1>] commit_fs_roots+0xa1/0x140
>  Jan  5 16:56:46 linuscs101 kernel: [ 3666.496810]  [<ffffffff81270640>] btrfs_commit_transaction+0x350/0x730
>  Jan  5 16:56:46 linuscs101 kernel: [ 3666.496816]  [<ffffffff81082aa0>] ? autoremove_wake_function+0x0/0x40
>  Jan  5 16:56:46 linuscs101 kernel: [ 3666.496820]  [<ffffffff8129ec33>] btrfs_mksubvol+0x363/0x380
>  Jan  5 16:56:46 linuscs101 kernel: [ 3666.496823]  [<ffffffff8129ed3d>] btrfs_ioctl_snap_create_transid+0xed/0x140
>  Jan  5 16:56:46 linuscs101 kernel: [ 3666.496826]  [<ffffffff8129ee87>] btrfs_ioctl_snap_create+0xf7/0x140
>  Jan  5 16:56:46 linuscs101 kernel: [ 3666.496830]  [<ffffffff812a0dcf>] btrfs_ioctl+0x61f/0xa20
>  Jan  5 16:56:46 linuscs101 kernel: [ 3666.496834]  [<ffffffff811836da>] ? fsnotify+0x1ea/0x320
>  Jan  5 16:56:46 linuscs101 kernel: [ 3666.496839]  [<ffffffff8115ce19>] do_vfs_ioctl+0xa9/0x5a0
>  Jan  5 16:56:46 linuscs101 kernel: [ 3666.496842]  [<ffffffff8115d391>] sys_ioctl+0x81/0xa0
>  Jan  5 16:56:46 linuscs101 kernel: [ 3666.496847]  [<ffffffff8100c042>] system_call_fastpath+0x16/0x1b
>  Jan  5 16:56:46 linuscs101 kernel: [ 3666.496850] ---[ end trace 2a6c3f752cfb5f1b ]---
>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.723170] CPU 1
>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.723210] Modules linked in: nfsd exportfs nfs lockd nfs_acl auth_rpcgss bonding sunrpc radeon ttm drm_kms_helper drm bnx2 psmouse i5000_edac usbhid lp shpchp ipmi_si i2c_algo_bit hid edac_core parport ipmi_msghandler serio_raw i5k_amb hpilo cciss fbcon tileblit font bitblit softcursor
>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.724006]
>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.724041] Pid: 2766, comm: cosd Tainted: G        W   2.6.37-ceph-client #1 /ProLiant DL380 G5
>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.724169] RIP: 0010:[<ffffffff81278190>]  [<ffffffff81278190>] btrfs_truncate+0x510/0x530
>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.724318] RSP: 0018:ffff8803d7e1bd48  EFLAGS: 00010286
>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.724397] RAX: 00000000ffffffe4 RBX: ffff8803dfaf1800 RCX: ffff880406ce7090
>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.724493] RDX: 0000000000000000 RSI: ffffea000e17d288 RDI: 0000000000000206
>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.724592] RBP: ffff8803d7e1bdd8 R08: 0000000000000783 R09: ffff8803d7e1bb28
>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.724691] R10: 00000000ffffffe4 R11: 0000000000000001 R12: ffff8803dee49f00
>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.724793] R13: ffff8803d5369c10 R14: ffff8803d5369a78 R15: ffff8803d5369d38
>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.724899] FS:  00007f77acfb6710(0000) GS:ffff8800cfc40000(0000) knlGS:0000000000000000
>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.725019] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.725096] CR2: 00007f81cd5b8000 CR3: 00000003dfad3000 CR4: 00000000000006e0
>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.725195] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.725293] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.725392] Process cosd (pid: 2766, threadinfo ffff8803d7e1a000, task ffff8803dfaf8000)
>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.725549]  0000000000000000 ffffffffffffffff ffff8803d5369d78 00000000000001da
>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.725695]  0000000000000fff 00000000d5369d38 0000000000001000 0000000000000000
>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.725841]  ffff8803d5369aa8 ffff8803d5369c10 ffff8803d7e1bdc8 0000000000000000
>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.726039]  [<ffffffff81104c46>] vmtruncate+0x56/0x70
>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.726113]  [<ffffffff8127cece>] btrfs_setattr+0x13e/0x2a0
>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.726202]  [<ffffffff811652c0>] notify_change+0x170/0x2e0
>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.726292]  [<ffffffff8114b9b4>] do_truncate+0x64/0xa0
>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.726370]  [<ffffffff81156d73>] ? generic_permission+0x23/0xc0
>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.726460]  [<ffffffff81156bd5>] ? get_write_access+0x45/0x70
>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.726543]  [<ffffffff8114bb39>] sys_truncate+0x149/0x150
>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.726631]  [<ffffffff8100c042>] system_call_fastpath+0x16/0x1b
>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.727618]  RSP<ffff8803d7e1bd48>
>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.748986] ---[ end trace 2a6c3f752cfb5f1c ]---





^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: btrfs BUG during Ceph cosd open() syscall
  2011-01-26 19:20     ` Matt Weil
@ 2011-01-27 15:58         ` Christian Brunner
  0 siblings, 0 replies; 12+ messages in thread
From: Christian Brunner @ 2011-01-27 15:58 UTC (permalink / raw)
  To: Matt Weil; +Cc: Jim Schutt, linux-btrfs, ceph-devel

The btrfs_orphan_commit_root warning is also reproducible in our Ceph
environment.

Regards
Christian

2011/1/26 Matt Weil <mweil@genome.wustl.edu>:
> heavy writes as well
>
> Jan =A05 16:56:46 linuscs101 kernel: [ 3666.496742] ------------[ cut=
 here
> ]------------
>>
>> =A0Jan =A05 16:56:46 linuscs101 kernel: [ 3666.496754] WARNING: at
>> fs/btrfs/inode.c:2143 btrfs_orphan_commit_root+0xb0/0xc0()
>> =A0Jan =A05 16:56:46 linuscs101 kernel: [ 3666.496756] Hardware name=
: ProLiant
>> DL380 G5
>> =A0Jan =A05 16:56:46 linuscs101 kernel: [ 3666.496758] Modules linke=
d in: nfsd
>> exportfs nfs lockd nfs_acl auth_rpcgss bonding sunrpc radeon ttm
>> drm_kms_helper drm bnx2 psmouse i5000_edac usbhid lp shpchp ipmi_si
>> i2c_algo_bit hid edac_core parport ipmi_msghandler serio_raw i5k_amb=
 hpilo
>> cciss fbcon tileblit font bitblit softcursor
>> =A0Jan =A05 16:56:46 linuscs101 kernel: [ 3666.496788] Pid: 2764, co=
mm: cosd
>> Not tainted 2.6.37-ceph-client #1
>> =A0Jan =A05 16:56:46 linuscs101 kernel: [ 3666.496790] Call Trace:
>> =A0Jan =A05 16:56:46 linuscs101 kernel: [ 3666.496797] =A0[<ffffffff=
81060dbf>]
>> warn_slowpath_common+0x7f/0xc0
>> =A0Jan =A05 16:56:46 linuscs101 kernel: [ 3666.496800] =A0[<ffffffff=
81060e1a>]
>> warn_slowpath_null+0x1a/0x20
>> =A0Jan =A05 16:56:46 linuscs101 kernel: [ 3666.496804] =A0[<ffffffff=
81273b70>]
>> btrfs_orphan_commit_root+0xb0/0xc0
>> =A0Jan =A05 16:56:46 linuscs101 kernel: [ 3666.496807] =A0[<ffffffff=
8126f1c1>]
>> commit_fs_roots+0xa1/0x140
>> =A0Jan =A05 16:56:46 linuscs101 kernel: [ 3666.496810] =A0[<ffffffff=
81270640>]
>> btrfs_commit_transaction+0x350/0x730
>> =A0Jan =A05 16:56:46 linuscs101 kernel: [ 3666.496816] =A0[<ffffffff=
81082aa0>] ?
>> autoremove_wake_function+0x0/0x40
>> =A0Jan =A05 16:56:46 linuscs101 kernel: [ 3666.496820] =A0[<ffffffff=
8129ec33>]
>> btrfs_mksubvol+0x363/0x380
>> =A0Jan =A05 16:56:46 linuscs101 kernel: [ 3666.496823] =A0[<ffffffff=
8129ed3d>]
>> btrfs_ioctl_snap_create_transid+0xed/0x140
>> =A0Jan =A05 16:56:46 linuscs101 kernel: [ 3666.496826] =A0[<ffffffff=
8129ee87>]
>> btrfs_ioctl_snap_create+0xf7/0x140
>> =A0Jan =A05 16:56:46 linuscs101 kernel: [ 3666.496830] =A0[<ffffffff=
812a0dcf>]
>> btrfs_ioctl+0x61f/0xa20
>> =A0Jan =A05 16:56:46 linuscs101 kernel: [ 3666.496834] =A0[<ffffffff=
811836da>] ?
>> fsnotify+0x1ea/0x320
>> =A0Jan =A05 16:56:46 linuscs101 kernel: [ 3666.496839] =A0[<ffffffff=
8115ce19>]
>> do_vfs_ioctl+0xa9/0x5a0
>> =A0Jan =A05 16:56:46 linuscs101 kernel: [ 3666.496842] =A0[<ffffffff=
8115d391>]
>> sys_ioctl+0x81/0xa0
>> =A0Jan =A05 16:56:46 linuscs101 kernel: [ 3666.496847] =A0[<ffffffff=
8100c042>]
>> system_call_fastpath+0x16/0x1b
>> =A0Jan =A05 16:56:46 linuscs101 kernel: [ 3666.496850] ---[ end trac=
e
>> 2a6c3f752cfb5f1b ]---
>> =A0Jan =A05 17:07:45 linuscs101 kernel: [ 4325.723170] CPU 1
>> =A0Jan =A05 17:07:45 linuscs101 kernel: [ 4325.723210] Modules linke=
d in: nfsd
>> exportfs nfs lockd nfs_acl auth_rpcgss bonding sunrpc radeon ttm
>> drm_kms_helper drm bnx2 psmouse i5000_edac usbhid lp shpchp ipmi_si
>> i2c_algo_bit hid edac_core parport ipmi_msghandler serio_raw i5k_amb=
 hpilo
>> cciss fbcon tileblit font bitblit softcursor
>> =A0Jan =A05 17:07:45 linuscs101 kernel: [ 4325.724006]
>> =A0Jan =A05 17:07:45 linuscs101 kernel: [ 4325.724041] Pid: 2766, co=
mm: cosd
>> Tainted: G =A0 =A0 =A0 =A0W =A0 2.6.37-ceph-client #1 /ProLiant DL38=
0 G5
>> =A0Jan =A05 17:07:45 linuscs101 kernel: [ 4325.724169] RIP:
>> 0010:[<ffffffff81278190>] =A0[<ffffffff81278190>] btrfs_truncate+0x5=
10/0x530
>> =A0Jan =A05 17:07:45 linuscs101 kernel: [ 4325.724318] RSP:
>> 0018:ffff8803d7e1bd48 =A0EFLAGS: 00010286
>> =A0Jan =A05 17:07:45 linuscs101 kernel: [ 4325.724397] RAX: 00000000=
ffffffe4
>> RBX: ffff8803dfaf1800 RCX: ffff880406ce7090
>> =A0Jan =A05 17:07:45 linuscs101 kernel: [ 4325.724493] RDX: 00000000=
00000000
>> RSI: ffffea000e17d288 RDI: 0000000000000206
>> =A0Jan =A05 17:07:45 linuscs101 kernel: [ 4325.724592] RBP: ffff8803=
d7e1bdd8
>> R08: 0000000000000783 R09: ffff8803d7e1bb28
>> =A0Jan =A05 17:07:45 linuscs101 kernel: [ 4325.724691] R10: 00000000=
ffffffe4
>> R11: 0000000000000001 R12: ffff8803dee49f00
>> =A0Jan =A05 17:07:45 linuscs101 kernel: [ 4325.724793] R13: ffff8803=
d5369c10
>> R14: ffff8803d5369a78 R15: ffff8803d5369d38
>> =A0Jan =A05 17:07:45 linuscs101 kernel: [ 4325.724899] FS:
>> =A000007f77acfb6710(0000) GS:ffff8800cfc40000(0000) knlGS:0000000000=
000000
>> =A0Jan =A05 17:07:45 linuscs101 kernel: [ 4325.725019] CS: =A00010 D=
S: 0000 ES:
>> 0000 CR0: 0000000080050033
>> =A0Jan =A05 17:07:45 linuscs101 kernel: [ 4325.725096] CR2: 00007f81=
cd5b8000
>> CR3: 00000003dfad3000 CR4: 00000000000006e0
>> =A0Jan =A05 17:07:45 linuscs101 kernel: [ 4325.725195] DR0: 00000000=
00000000
>> DR1: 0000000000000000 DR2: 0000000000000000
>> =A0Jan =A05 17:07:45 linuscs101 kernel: [ 4325.725293] DR3: 00000000=
00000000
>> DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> =A0Jan =A05 17:07:45 linuscs101 kernel: [ 4325.725392] Process cosd =
(pid:
>> 2766, threadinfo ffff8803d7e1a000, task ffff8803dfaf8000)
>> =A0Jan =A05 17:07:45 linuscs101 kernel: [ 4325.725549] =A00000000000=
000000
>> ffffffffffffffff ffff8803d5369d78 00000000000001da
>> =A0Jan =A05 17:07:45 linuscs101 kernel: [ 4325.725695] =A00000000000=
000fff
>> 00000000d5369d38 0000000000001000 0000000000000000
>> =A0Jan =A05 17:07:45 linuscs101 kernel: [ 4325.725841] =A0ffff8803d5=
369aa8
>> ffff8803d5369c10 ffff8803d7e1bdc8 0000000000000000
>> =A0Jan =A05 17:07:45 linuscs101 kernel: [ 4325.726039] =A0[<ffffffff=
81104c46>]
>> vmtruncate+0x56/0x70
>> =A0Jan =A05 17:07:45 linuscs101 kernel: [ 4325.726113] =A0[<ffffffff=
8127cece>]
>> btrfs_setattr+0x13e/0x2a0
>> =A0Jan =A05 17:07:45 linuscs101 kernel: [ 4325.726202] =A0[<ffffffff=
811652c0>]
>> notify_change+0x170/0x2e0
>> =A0Jan =A05 17:07:45 linuscs101 kernel: [ 4325.726292] =A0[<ffffffff=
8114b9b4>]
>> do_truncate+0x64/0xa0
>> =A0Jan =A05 17:07:45 linuscs101 kernel: [ 4325.726370] =A0[<ffffffff=
81156d73>] ?
>> generic_permission+0x23/0xc0
>> =A0Jan =A05 17:07:45 linuscs101 kernel: [ 4325.726460] =A0[<ffffffff=
81156bd5>] ?
>> get_write_access+0x45/0x70
>> =A0Jan =A05 17:07:45 linuscs101 kernel: [ 4325.726543] =A0[<ffffffff=
8114bb39>]
>> sys_truncate+0x149/0x150
>> =A0Jan =A05 17:07:45 linuscs101 kernel: [ 4325.726631] =A0[<ffffffff=
8100c042>]
>> system_call_fastpath+0x16/0x1b
>> =A0Jan =A05 17:07:45 linuscs101 kernel: [ 4325.727618] =A0RSP<ffff88=
03d7e1bd48>
>> =A0Jan =A05 17:07:45 linuscs101 kernel: [ 4325.748986] ---[ end trac=
e
>> 2a6c3f752cfb5f1c ]---
>
>
>
> On 1/26/11 12:48 PM, Jim Schutt wrote:
>>
>> Hi,
>>
>> On Wed, 2011-01-26 at 10:59 -0700, Jim Schutt wrote:
>>>
>>> Hi,
>>>
>>> I got this kernel BUG on a server running multiple Ceph
>>> cosd instances, during a heavy write load generated by
>>> multiple Ceph clients.
>>>
>>> The server was running the current ceph unstable kernel
>>> (a3f5274e535 in
>>> git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client.git)=
=2E
>>>
>>> Please let me know what other information you need to
>>> make this report useful.
>>>
>>> -- Jim
>>>
>> Here's another example.
>>
>> Again, please let me know what other information you need to
>> make this report useful.
>>
>> -- Jim
>>
>> [11199.532483] ------------[ cut here ]------------
>> [11199.536292] kernel BUG at fs/btrfs/extent-tree.c:2198!
>> [11199.536292] invalid opcode: 0000 [#1] SMP
>> [11199.536292] last sysfs file:
>> /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
>> [11199.536292] CPU 3
>> [11199.536292] Modules linked in: loop btrfs zlib_deflate ipt_MASQUE=
RADE
>> iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conn=
track
>> ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge ]
>> [11199.536292]
>> [11199.536292] Pid: 1664, comm: cosd Not tainted 2.6.37-00017-ga3f52=
74 #4
>> 0DT097/PowerEdge 1950
>> [11199.536292] RIP: 0010:[<ffffffffa0774081>] =A0[<ffffffffa0774081>=
]
>> run_clustered_refs+0x71e/0x76b [btrfs]
>> [11199.536292] RSP: 0018:ffff8801c90abb58 =A0EFLAGS: 00010282
>> [11199.536292] RAX: 00000000fffffffb RBX: 0000000000000000 RCX:
>> ffff8802262c5000
>> [11199.536292] RDX: ffff88017921e2d0 RSI: ffffea000527f690 RDI:
>> 0000000000000001
>> [11199.536292] RBP: ffff8801c90abc28 R08: ffffe8ffffccefe8 R09:
>> 0000000000000000
>> [11199.536292] R10: 0000000000000003 R11: ffff880227549e98 R12:
>> ffff880140bb8f00
>> [11199.536292] R13: 0000000000000000 R14: ffff880181eff378 R15:
>> ffff8802262c5000
>> [11199.536292] FS: =A000007f5e680fc940(0000) GS:ffff8800cfcc0000(000=
0)
>> knlGS:0000000000000000
>> [11199.536292] CS: =A00010 DS: 0000 ES: 0000 CR0: 000000008005003b
>> [11199.536292] CR2: 00007f0e1a476260 CR3: 0000000173aa0000 CR4:
>> 00000000000006e0
>> [11199.536292] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
>> 0000000000000000
>> [11199.536292] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
>> 0000000000000400
>> [11199.536292] Process cosd (pid: 1664, threadinfo ffff8801c90aa000,=
 task
>> ffff8801df12d840)
>> [11199.536292] Stack:
>> [11199.536292] =A00000000000000000 0000000000000000 0000000000000001
>> 0000000000000000
>> [11199.536292] =A0ffff8801c90abc48 ffff8802262c5000 ffff8801e0a9c600
>> ffff880181eff378
>> [11199.536292] =A00000000000000000 0000002600000206 ffff880181eff380
>> 000000007921e750
>> [11199.536292] Call Trace:
>> [11199.536292] =A0[<ffffffffa0785be0>] ? btrfs_update_inode+0xc3/0xd=
3
>> [btrfs]
>> [11199.536292] =A0[<ffffffffa07741bc>] btrfs_run_delayed_refs+0xee/0=
x15e
>> [btrfs]
>> [11199.536292] =A0[<ffffffff810fa54d>] ?
>> __fsnotify_update_dcache_flags+0x22/0x56
>> [11199.536292] =A0[<ffffffffa07801d0>] __btrfs_end_transaction+0x6d/=
0x1e3
>> [btrfs]
>> [11199.536292] =A0[<ffffffffa0780372>]
>> btrfs_end_transaction_throttle+0x18/0x1a [btrfs]
>> [11199.536292] =A0[<ffffffffa07872e1>] btrfs_create+0x1a0/0x1fa [btr=
fs]
>> [11199.536292] =A0[<ffffffff810f49e2>] vfs_create+0x76/0x96
>> [11199.536292] =A0[<ffffffff810f56af>] do_last+0x24d/0x4d3
>> [11199.536292] =A0[<ffffffff810f5b16>] do_filp_open+0x1e1/0x4c5
>> [11199.536292] =A0[<ffffffff81031061>] ? should_resched+0xe/0x2f
>> [11199.536292] =A0[<ffffffff8136a638>] ? _cond_resched+0xe/0x22
>> [11199.536292] =A0[<ffffffff811aa669>] ? might_fault+0xe/0x10
>> [11199.536292] =A0[<ffffffff811aa753>] ? __strncpy_from_user+0x20/0x=
4a
>> [11199.536292] =A0[<ffffffff810e9023>] do_sys_open+0x62/0xeb
>> [11199.536292] =A0[<ffffffff810e90df>] sys_open+0x20/0x22
>> [11199.536292] =A0[<ffffffff81002c2b>] system_call_fastpath+0x16/0x1=
b
>> [11199.536292] Code: 24 08 48 8b 46 40 48 89 04 24 48 8b b5 58 ff ff=
 ff 48
>> 8b bd 60 ff ff ff e8 61 e7 ff ff eb 08 0f 0b eb fe 0f 0b eb fe 85 c0=
 74
>> 04<0f> =A00b eb fe 4c 89 e7 e8 65 ae ff ff 48 8b bd 70 ff ff ff
>> [11199.536292] RIP =A0[<ffffffffa0774081>] run_clustered_refs+0x71e/=
0x76b
>> [btrfs]
>> [11199.536292] =A0RSP<ffff8801c90abb58>
>> [11199.905250] ---[ end trace b0dead1e7c3dbf7b ]---
>> Jan 26 11:40:32 an1 [11199.532483] ------------[ cut here ]---------=
---
>> Jan 26 11:40:33 an1 [11199.536292] invalid opcode: 0000 [#1] SMP
>> Jan 26 11:40:33 an1 [11199.536292] last sysfs file:
>> /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
>> Jan 26 11:40:38 an1 [11199.536292] Stack:
>> Jan 26 11:40:38 an1 [11199.536292] Call Trace:
>> Jan 26 11:40:40 an1 [11199.536292] Code: 24 08 48 8b 46 40 48 89 04 =
24 48
>> 8b b5 58 ff ff ff 48 8b bd 60 ff ff ff e8 61 e7 ff ff eb 08 0f 0b eb=
 fe 0f
>> 0b eb fe 85 c0 74 04<0f> =A00b eb fe 4c 89 e7 e8 65 ae ff ff 4
>> [11212.699541] btrfs: sdm2 checksum verify failed on 31928320 wanted
>> 237BEA0B found F7B13C5E level 0
>> [11212.709895] btrfs: sdm2 checksum verify failed on 31928320 wanted
>> 237BEA0B found F7B13C5E level 0
>> [11212.719737] btrfs: sdm2 checksum verify failed on 31928320 wanted
>> 237BEA0B found F7B13C5E level 0
>> [11212.729433] ------------[ cut here ]------------
>> [11212.730394] kernel BUG at fs/btrfs/extent-tree.c:5789!
>> [11212.734157] invalid opcode: 0000 [#2] SMP
>> [11212.734157] last sysfs file:
>> /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
>> [11212.734157] CPU 3
>> [11212.734157] Modules linked in: loop btrfs zlib_deflate ipt_MASQUE=
RADE
>> iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conn=
track
>> ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge ]
>> [11212.734157]
>> [11212.734157] Pid: 27662, comm: btrfs-cleaner Tainted: G =A0 =A0 =A0=
D
>> 2.6.37-00017-ga3f5274 #4 0DT097/PowerEdge 1950
>> [11212.734157] RIP: 0010:[<ffffffffa0773452>] =A0[<ffffffffa0773452>=
]
>> reada_walk_down+0x18c/0x249 [btrfs]
>> [11212.734157] RSP: 0018:ffff880227539be0 =A0EFLAGS: 00010282
>> [11212.734157] RAX: 00000000fffffffb RBX: ffff8801cd50d750 RCX:
>> ffff88020b993000
>> [11212.734157] RDX: ffff88017921e3f0 RSI: ffffea000527f690 RDI:
>> 0000000100000090
>> [11212.734157] RBP: ffff880227539c80 R08: ffffe8ffffccefe8 R09:
>> 0000000000000000
>> [11212.734157] R10: 0000000100a68468 R11: ffff880227549e98 R12:
>> ffff8801d83c3000
>> [11212.734157] R13: 0000000000000040 R14: ffff88020b993000 R15:
>> 00000000000000e0
>> [11212.734157] FS: =A00000000000000000(0000) GS:ffff8800cfcc0000(000=
0)
>> knlGS:0000000000000000
>> [11212.734157] CS: =A00010 DS: 0000 ES: 0000 CR0: 000000008005003b
>> [11212.734157] CR2: 0000000000b92de8 CR3: 000000020e5b3000 CR4:
>> 00000000000006e0
>> [11212.734157] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
>> 0000000000000000
>> [11212.734157] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
>> 0000000000000400
>> [11212.734157] Process btrfs-cleaner (pid: 27662, threadinfo
>> ffff880227538000, task ffff88020ebc0000)
>> [11212.734157] Stack:
>> [11212.734157]  ffff880227539bf0 0000000400000000 ffff8801cd50d750
>> ffff8801e0a9ca00
>> [11212.734157]  00000000024cd000 000010000000006b ffff88021527f880
>> 0000000100000001
>> [11212.734157]  ffff880227539c50 ffffffffa079c6bc ffff880225c96198
>> ffff8801b0cf9aa8
>> [11212.734157] Call Trace:
>> [11212.734157]  [<ffffffffa079c6bc>] ? extent_buffer_uptodate+0x6c/0x8a
>> [btrfs]
>> [11212.734157]  [<ffffffffa0775d62>] do_walk_down+0x25b/0x395 [btrfs]
>> [11212.734157]  [<ffffffffa076db1f>] ? btrfs_header_generation+0x1f/0x25
>> [btrfs]
>> [11212.734157]  [<ffffffffa0771268>] ? walk_down_proc+0x10a/0x1d0 [btrfs]
>> [11212.734157]  [<ffffffffa0775f1d>] walk_down_tree+0x81/0xac [btrfs]
>> [11212.734157]  [<ffffffffa077636f>] btrfs_drop_snapshot+0x2aa/0x467
>> [btrfs]
>> [11212.734157]  [<ffffffff81031049>] ? need_resched+0x23/0x2d
>> [11212.734157]  [<ffffffff81031061>] ? should_resched+0xe/0x2f
>> [11212.734157]  [<ffffffffa077d080>] ? cleaner_kthread+0x0/0x16b [btrfs]
>> [11212.734157]  [<ffffffffa077f24d>] btrfs_clean_old_snapshots+0xee/0x10c
>> [btrfs]
>> [11212.734157]  [<ffffffffa077d177>] cleaner_kthread+0xf7/0x16b [btrfs]
>> [11212.734157]  [<ffffffff8105b11e>] kthread+0x72/0x7a
>> [11212.734157]  [<ffffffff810039d4>] kernel_thread_helper+0x4/0x10
>> [11212.734157]  [<ffffffff8105b0ac>] ? kthread+0x0/0x7a
>> [11212.734157]  [<ffffffff810039d0>] ? kernel_thread_helper+0x0/0x10
>> [11212.734157] Code: 01 00 00 0f 86 bb 00 00 00 8b 4d 8c 48 8b 55 80 4c 8d
>> 4d c0 48 8b bd 78 ff ff ff 4c 8d 45 c8 4c 89 f6 e8 ec da ff ff 85 c0 74
>> 04<0f>  0b eb fe 48 8b 45 c8 48 85 c0 75 04 0f 0b eb fe 41 83
>> [11212.734157] RIP  [<ffffffffa0773452>] reada_walk_down+0x18c/0x249
>> [btrfs]
>> [11212.734157]  RSP<ffff880227539be0>
>> [11213.101484] ---[ end trace b0dead1e7c3dbf7c ]---
>> Jan 26 11:40:45 an1 [11212.729433] ------------[ cut here ]------------
>> Jan 26 11:40:45 an1 [11212.734157] invalid opcode: 0000 [#2] SMP
>> Jan 26 11:40:45 an1 [11212.734157] last sysfs file:
>> /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
>> Jan 26 11:40:46 an1 [11212.734157] Stack:
>> Jan 26 11:40:46 an1 [11212.734157] Call Trace:
>> Jan 26 11:40:46 an1 [11212.734157] Code: 01 00 00 0f 86 bb 00 00 00 8b 4d
>> 8c 48 8b 55 80 4c 8d 4d c0 48 8b bd 78 ff ff ff 4c 8d 45 c8 4c 89 f6 e8 ec
>> da ff ff 85 c0 74 04<0f>  0b eb fe 48 8b 45 c8 48 85 c0 75 0
>>
>>
>>
>>
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: btrfs BUG during Ceph cosd open() syscall
@ 2011-01-27 15:58         ` Christian Brunner
  0 siblings, 0 replies; 12+ messages in thread
From: Christian Brunner @ 2011-01-27 15:58 UTC (permalink / raw)
  To: Matt Weil; +Cc: Jim Schutt, linux-btrfs, ceph-devel

The btrfs_orphan_commit_root warning is also reproducible in our Ceph
environment.

Regards
Christian

2011/1/26 Matt Weil <mweil@genome.wustl.edu>:
> heavy writes as well
>
> Jan  5 16:56:46 linuscs101 kernel: [ 3666.496742] ------------[ cut here
> ]------------
>>
>>  Jan  5 16:56:46 linuscs101 kernel: [ 3666.496754] WARNING: at
>> fs/btrfs/inode.c:2143 btrfs_orphan_commit_root+0xb0/0xc0()
>>  Jan  5 16:56:46 linuscs101 kernel: [ 3666.496756] Hardware name: ProLiant
>> DL380 G5
>>  Jan  5 16:56:46 linuscs101 kernel: [ 3666.496758] Modules linked in: nfsd
>> exportfs nfs lockd nfs_acl auth_rpcgss bonding sunrpc radeon ttm
>> drm_kms_helper drm bnx2 psmouse i5000_edac usbhid lp shpchp ipmi_si
>> i2c_algo_bit hid edac_core parport ipmi_msghandler serio_raw i5k_amb hpilo
>> cciss fbcon tileblit font bitblit softcursor
>>  Jan  5 16:56:46 linuscs101 kernel: [ 3666.496788] Pid: 2764, comm: cosd
>> Not tainted 2.6.37-ceph-client #1
>>  Jan  5 16:56:46 linuscs101 kernel: [ 3666.496790] Call Trace:
>>  Jan  5 16:56:46 linuscs101 kernel: [ 3666.496797]  [<ffffffff81060dbf>]
>> warn_slowpath_common+0x7f/0xc0
>>  Jan  5 16:56:46 linuscs101 kernel: [ 3666.496800]  [<ffffffff81060e1a>]
>> warn_slowpath_null+0x1a/0x20
>>  Jan  5 16:56:46 linuscs101 kernel: [ 3666.496804]  [<ffffffff81273b70>]
>> btrfs_orphan_commit_root+0xb0/0xc0
>>  Jan  5 16:56:46 linuscs101 kernel: [ 3666.496807]  [<ffffffff8126f1c1>]
>> commit_fs_roots+0xa1/0x140
>>  Jan  5 16:56:46 linuscs101 kernel: [ 3666.496810]  [<ffffffff81270640>]
>> btrfs_commit_transaction+0x350/0x730
>>  Jan  5 16:56:46 linuscs101 kernel: [ 3666.496816]  [<ffffffff81082aa0>] ?
>> autoremove_wake_function+0x0/0x40
>>  Jan  5 16:56:46 linuscs101 kernel: [ 3666.496820]  [<ffffffff8129ec33>]
>> btrfs_mksubvol+0x363/0x380
>>  Jan  5 16:56:46 linuscs101 kernel: [ 3666.496823]  [<ffffffff8129ed3d>]
>> btrfs_ioctl_snap_create_transid+0xed/0x140
>>  Jan  5 16:56:46 linuscs101 kernel: [ 3666.496826]  [<ffffffff8129ee87>]
>> btrfs_ioctl_snap_create+0xf7/0x140
>>  Jan  5 16:56:46 linuscs101 kernel: [ 3666.496830]  [<ffffffff812a0dcf>]
>> btrfs_ioctl+0x61f/0xa20
>>  Jan  5 16:56:46 linuscs101 kernel: [ 3666.496834]  [<ffffffff811836da>] ?
>> fsnotify+0x1ea/0x320
>>  Jan  5 16:56:46 linuscs101 kernel: [ 3666.496839]  [<ffffffff8115ce19>]
>> do_vfs_ioctl+0xa9/0x5a0
>>  Jan  5 16:56:46 linuscs101 kernel: [ 3666.496842]  [<ffffffff8115d391>]
>> sys_ioctl+0x81/0xa0
>>  Jan  5 16:56:46 linuscs101 kernel: [ 3666.496847]  [<ffffffff8100c042>]
>> system_call_fastpath+0x16/0x1b
>>  Jan  5 16:56:46 linuscs101 kernel: [ 3666.496850] ---[ end trace
>> 2a6c3f752cfb5f1b ]---
>>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.723170] CPU 1
>>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.723210] Modules linked in: nfsd
>> exportfs nfs lockd nfs_acl auth_rpcgss bonding sunrpc radeon ttm
>> drm_kms_helper drm bnx2 psmouse i5000_edac usbhid lp shpchp ipmi_si
>> i2c_algo_bit hid edac_core parport ipmi_msghandler serio_raw i5k_amb hpilo
>> cciss fbcon tileblit font bitblit softcursor
>>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.724006]
>>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.724041] Pid: 2766, comm: cosd
>> Tainted: G        W   2.6.37-ceph-client #1 /ProLiant DL380 G5
>>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.724169] RIP:
>> 0010:[<ffffffff81278190>]  [<ffffffff81278190>] btrfs_truncate+0x510/0x530
>>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.724318] RSP:
>> 0018:ffff8803d7e1bd48  EFLAGS: 00010286
>>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.724397] RAX: 00000000ffffffe4
>> RBX: ffff8803dfaf1800 RCX: ffff880406ce7090
>>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.724493] RDX: 0000000000000000
>> RSI: ffffea000e17d288 RDI: 0000000000000206
>>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.724592] RBP: ffff8803d7e1bdd8
>> R08: 0000000000000783 R09: ffff8803d7e1bb28
>>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.724691] R10: 00000000ffffffe4
>> R11: 0000000000000001 R12: ffff8803dee49f00
>>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.724793] R13: ffff8803d5369c10
>> R14: ffff8803d5369a78 R15: ffff8803d5369d38
>>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.724899] FS:
>>  00007f77acfb6710(0000) GS:ffff8800cfc40000(0000) knlGS:0000000000000000
>>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.725019] CS:  0010 DS: 0000 ES:
>> 0000 CR0: 0000000080050033
>>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.725096] CR2: 00007f81cd5b8000
>> CR3: 00000003dfad3000 CR4: 00000000000006e0
>>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.725195] DR0: 0000000000000000
>> DR1: 0000000000000000 DR2: 0000000000000000
>>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.725293] DR3: 0000000000000000
>> DR6: 00000000ffff0ff0 DR7: 0000000000000400
>>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.725392] Process cosd (pid:
>> 2766, threadinfo ffff8803d7e1a000, task ffff8803dfaf8000)
>>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.725549]  0000000000000000
>> ffffffffffffffff ffff8803d5369d78 00000000000001da
>>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.725695]  0000000000000fff
>> 00000000d5369d38 0000000000001000 0000000000000000
>>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.725841]  ffff8803d5369aa8
>> ffff8803d5369c10 ffff8803d7e1bdc8 0000000000000000
>>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.726039]  [<ffffffff81104c46>]
>> vmtruncate+0x56/0x70
>>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.726113]  [<ffffffff8127cece>]
>> btrfs_setattr+0x13e/0x2a0
>>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.726202]  [<ffffffff811652c0>]
>> notify_change+0x170/0x2e0
>>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.726292]  [<ffffffff8114b9b4>]
>> do_truncate+0x64/0xa0
>>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.726370]  [<ffffffff81156d73>] ?
>> generic_permission+0x23/0xc0
>>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.726460]  [<ffffffff81156bd5>] ?
>> get_write_access+0x45/0x70
>>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.726543]  [<ffffffff8114bb39>]
>> sys_truncate+0x149/0x150
>>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.726631]  [<ffffffff8100c042>]
>> system_call_fastpath+0x16/0x1b
>>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.727618]  RSP<ffff8803d7e1bd48>
>>  Jan  5 17:07:45 linuscs101 kernel: [ 4325.748986] ---[ end trace
>> 2a6c3f752cfb5f1c ]---
>
>
>
> On 1/26/11 12:48 PM, Jim Schutt wrote:
>>
>> Hi,
>>
>> On Wed, 2011-01-26 at 10:59 -0700, Jim Schutt wrote:
>>>
>>> Hi,
>>>
>>> I got this kernel BUG on a server running multiple Ceph
>>> cosd instances, during a heavy write load generated by
>>> multiple Ceph clients.
>>>
>>> The server was running the current ceph unstable kernel
>>> (a3f5274e535 in
>>> git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client.git).
>>>
>>> Please let me know what other information you need to
>>> make this report useful.
>>>
>>> -- Jim
>>>
>> Here's another example.
>>
>> Again, please let me know what other information you need to
>> make this report useful.
>>
>> -- Jim
>>
>> [11199.532483] ------------[ cut here ]------------
>> [11199.536292] kernel BUG at fs/btrfs/extent-tree.c:2198!
>> [11199.536292] invalid opcode: 0000 [#1] SMP
>> [11199.536292] last sysfs file:
>> /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
>> [11199.536292] CPU 3
>> [11199.536292] Modules linked in: loop btrfs zlib_deflate ipt_MASQUERADE
>> iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack
>> ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge ]
>> [11199.536292]
>> [11199.536292] Pid: 1664, comm: cosd Not tainted 2.6.37-00017-ga3f5274 #4
>> 0DT097/PowerEdge 1950
>> [11199.536292] RIP: 0010:[<ffffffffa0774081>]  [<ffffffffa0774081>]
>> run_clustered_refs+0x71e/0x76b [btrfs]
>> [11199.536292] RSP: 0018:ffff8801c90abb58  EFLAGS: 00010282
>> [11199.536292] RAX: 00000000fffffffb RBX: 0000000000000000 RCX:
>> ffff8802262c5000
>> [11199.536292] RDX: ffff88017921e2d0 RSI: ffffea000527f690 RDI:
>> 0000000000000001
>> [11199.536292] RBP: ffff8801c90abc28 R08: ffffe8ffffccefe8 R09:
>> 0000000000000000
>> [11199.536292] R10: 0000000000000003 R11: ffff880227549e98 R12:
>> ffff880140bb8f00
>> [11199.536292] R13: 0000000000000000 R14: ffff880181eff378 R15:
>> ffff8802262c5000
>> [11199.536292] FS:  00007f5e680fc940(0000) GS:ffff8800cfcc0000(0000)
>> knlGS:0000000000000000
>> [11199.536292] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>> [11199.536292] CR2: 00007f0e1a476260 CR3: 0000000173aa0000 CR4:
>> 00000000000006e0
>> [11199.536292] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
>> 0000000000000000
>> [11199.536292] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
>> 0000000000000400
>> [11199.536292] Process cosd (pid: 1664, threadinfo ffff8801c90aa000, task
>> ffff8801df12d840)
>> [11199.536292] Stack:
>> [11199.536292]  0000000000000000 0000000000000000 0000000000000001
>> 0000000000000000
>> [11199.536292]  ffff8801c90abc48 ffff8802262c5000 ffff8801e0a9c600
>> ffff880181eff378
>> [11199.536292]  0000000000000000 0000002600000206 ffff880181eff380
>> 000000007921e750
>> [11199.536292] Call Trace:
>> [11199.536292]  [<ffffffffa0785be0>] ? btrfs_update_inode+0xc3/0xd3
>> [btrfs]
>> [11199.536292]  [<ffffffffa07741bc>] btrfs_run_delayed_refs+0xee/0x15e
>> [btrfs]
>> [11199.536292]  [<ffffffff810fa54d>] ?
>> __fsnotify_update_dcache_flags+0x22/0x56
>> [11199.536292]  [<ffffffffa07801d0>] __btrfs_end_transaction+0x6d/0x1e3
>> [btrfs]
>> [11199.536292]  [<ffffffffa0780372>]
>> btrfs_end_transaction_throttle+0x18/0x1a [btrfs]
>> [11199.536292]  [<ffffffffa07872e1>] btrfs_create+0x1a0/0x1fa [btrfs]
>> [11199.536292]  [<ffffffff810f49e2>] vfs_create+0x76/0x96
>> [11199.536292]  [<ffffffff810f56af>] do_last+0x24d/0x4d3
>> [11199.536292]  [<ffffffff810f5b16>] do_filp_open+0x1e1/0x4c5
>> [11199.536292]  [<ffffffff81031061>] ? should_resched+0xe/0x2f
>> [11199.536292]  [<ffffffff8136a638>] ? _cond_resched+0xe/0x22
>> [11199.536292]  [<ffffffff811aa669>] ? might_fault+0xe/0x10
>> [11199.536292]  [<ffffffff811aa753>] ? __strncpy_from_user+0x20/0x4a
>> [11199.536292]  [<ffffffff810e9023>] do_sys_open+0x62/0xeb
>> [11199.536292]  [<ffffffff810e90df>] sys_open+0x20/0x22
>> [11199.536292]  [<ffffffff81002c2b>] system_call_fastpath+0x16/0x1b
>> [11199.536292] Code: 24 08 48 8b 46 40 48 89 04 24 48 8b b5 58 ff ff ff 48
>> 8b bd 60 ff ff ff e8 61 e7 ff ff eb 08 0f 0b eb fe 0f 0b eb fe 85 c0 74
>> 04<0f>  0b eb fe 4c 89 e7 e8 65 ae ff ff 48 8b bd 70 ff ff ff
>> [11199.536292] RIP  [<ffffffffa0774081>] run_clustered_refs+0x71e/0x76b
>> [btrfs]
>> [11199.536292]  RSP<ffff8801c90abb58>
>> [11199.905250] ---[ end trace b0dead1e7c3dbf7b ]---
>> Jan 26 11:40:32 an1 [11199.532483] ------------[ cut here ]------------
>> Jan 26 11:40:33 an1 [11199.536292] invalid opcode: 0000 [#1] SMP
>> Jan 26 11:40:33 an1 [11199.536292] last sysfs file:
>> /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
>> Jan 26 11:40:38 an1 [11199.536292] Stack:
>> Jan 26 11:40:38 an1 [11199.536292] Call Trace:
>> Jan 26 11:40:40 an1 [11199.536292] Code: 24 08 48 8b 46 40 48 89 04 24 48
>> 8b b5 58 ff ff ff 48 8b bd 60 ff ff ff e8 61 e7 ff ff eb 08 0f 0b eb fe 0f
>> 0b eb fe 85 c0 74 04<0f>  0b eb fe 4c 89 e7 e8 65 ae ff ff 4
>> [11212.699541] btrfs: sdm2 checksum verify failed on 31928320 wanted
>> 237BEA0B found F7B13C5E level 0
>> [11212.709895] btrfs: sdm2 checksum verify failed on 31928320 wanted
>> 237BEA0B found F7B13C5E level 0
>> [11212.719737] btrfs: sdm2 checksum verify failed on 31928320 wanted
>> 237BEA0B found F7B13C5E level 0
>> [11212.729433] ------------[ cut here ]------------
>> [11212.730394] kernel BUG at fs/btrfs/extent-tree.c:5789!
>> [11212.734157] invalid opcode: 0000 [#2] SMP
>> [11212.734157] last sysfs file:
>> /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
>> [11212.734157] CPU 3
>> [11212.734157] Modules linked in: loop btrfs zlib_deflate ipt_MASQUERADE
>> iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack
>> ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge ]
>> [11212.734157]
>> [11212.734157] Pid: 27662, comm: btrfs-cleaner Tainted: G      D
>> 2.6.37-00017-ga3f5274 #4 0DT097/PowerEdge 1950
>> [11212.734157] RIP: 0010:[<ffffffffa0773452>]  [<ffffffffa0773452>]
>> reada_walk_down+0x18c/0x249 [btrfs]
>> [11212.734157] RSP: 0018:ffff880227539be0  EFLAGS: 00010282
>> [11212.734157] RAX: 00000000fffffffb RBX: ffff8801cd50d750 RCX:
>> ffff88020b993000
>> [11212.734157] RDX: ffff88017921e3f0 RSI: ffffea000527f690 RDI:
>> 0000000100000090
>> [11212.734157] RBP: ffff880227539c80 R08: ffffe8ffffccefe8 R09:
>> 0000000000000000
>> [11212.734157] R10: 0000000100a68468 R11: ffff880227549e98 R12:
>> ffff8801d83c3000
>> [11212.734157] R13: 0000000000000040 R14: ffff88020b993000 R15:
>> 00000000000000e0
>> [11212.734157] FS:  0000000000000000(0000) GS:ffff8800cfcc0000(0000)
>> knlGS:0000000000000000
>> [11212.734157] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>> [11212.734157] CR2: 0000000000b92de8 CR3: 000000020e5b3000 CR4:
>> 00000000000006e0
>> [11212.734157] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
>> 0000000000000000
>> [11212.734157] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
>> 0000000000000400
>> [11212.734157] Process btrfs-cleaner (pid: 27662, threadinfo
>> ffff880227538000, task ffff88020ebc0000)
>> [11212.734157] Stack:
>> [11212.734157]  ffff880227539bf0 0000000400000000 ffff8801cd50d750
>> ffff8801e0a9ca00
>> [11212.734157]  00000000024cd000 000010000000006b ffff88021527f880
>> 0000000100000001
>> [11212.734157]  ffff880227539c50 ffffffffa079c6bc ffff880225c96198
>> ffff8801b0cf9aa8
>> [11212.734157] Call Trace:
>> [11212.734157]  [<ffffffffa079c6bc>] ? extent_buffer_uptodate+0x6c/0x8a
>> [btrfs]
>> [11212.734157]  [<ffffffffa0775d62>] do_walk_down+0x25b/0x395 [btrfs]
>> [11212.734157]  [<ffffffffa076db1f>] ? btrfs_header_generation+0x1f/0x25
>> [btrfs]
>> [11212.734157]  [<ffffffffa0771268>] ? walk_down_proc+0x10a/0x1d0 [btrfs]
>> [11212.734157]  [<ffffffffa0775f1d>] walk_down_tree+0x81/0xac [btrfs]
>> [11212.734157]  [<ffffffffa077636f>] btrfs_drop_snapshot+0x2aa/0x467
>> [btrfs]
>> [11212.734157]  [<ffffffff81031049>] ? need_resched+0x23/0x2d
>> [11212.734157]  [<ffffffff81031061>] ? should_resched+0xe/0x2f
>> [11212.734157]  [<ffffffffa077d080>] ? cleaner_kthread+0x0/0x16b [btrfs]
>> [11212.734157]  [<ffffffffa077f24d>] btrfs_clean_old_snapshots+0xee/0x10c
>> [btrfs]
>> [11212.734157]  [<ffffffffa077d177>] cleaner_kthread+0xf7/0x16b [btrfs]
>> [11212.734157]  [<ffffffff8105b11e>] kthread+0x72/0x7a
>> [11212.734157]  [<ffffffff810039d4>] kernel_thread_helper+0x4/0x10
>> [11212.734157]  [<ffffffff8105b0ac>] ? kthread+0x0/0x7a
>> [11212.734157]  [<ffffffff810039d0>] ? kernel_thread_helper+0x0/0x10
>> [11212.734157] Code: 01 00 00 0f 86 bb 00 00 00 8b 4d 8c 48 8b 55 80 4c 8d
>> 4d c0 48 8b bd 78 ff ff ff 4c 8d 45 c8 4c 89 f6 e8 ec da ff ff 85 c0 74
>> 04<0f>  0b eb fe 48 8b 45 c8 48 85 c0 75 04 0f 0b eb fe 41 83
>> [11212.734157] RIP  [<ffffffffa0773452>] reada_walk_down+0x18c/0x249
>> [btrfs]
>> [11212.734157]  RSP<ffff880227539be0>
>> [11213.101484] ---[ end trace b0dead1e7c3dbf7c ]---
>> Jan 26 11:40:45 an1 [11212.729433] ------------[ cut here ]------------
>> Jan 26 11:40:45 an1 [11212.734157] invalid opcode: 0000 [#2] SMP
>> Jan 26 11:40:45 an1 [11212.734157] last sysfs file:
>> /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
>> Jan 26 11:40:46 an1 [11212.734157] Stack:
>> Jan 26 11:40:46 an1 [11212.734157] Call Trace:
>> Jan 26 11:40:46 an1 [11212.734157] Code: 01 00 00 0f 86 bb 00 00 00 8b 4d
>> 8c 48 8b 55 80 4c 8d 4d c0 48 8b bd 78 ff ff ff 4c 8d 45 c8 4c 89 f6 e8 ec
>> da ff ff 85 c0 74 04<0f>  0b eb fe 48 8b 45 c8 48 85 c0 75 0
>>
>>
>>
>>
>>


* btrfs BUG during Ceph cosd truncate() syscall
  2011-01-26 18:48   ` Jim Schutt
  2011-01-26 19:20     ` Matt Weil
@ 2011-01-27 16:05     ` Jim Schutt
  2011-01-27 16:36       ` Wido den Hollander
  1 sibling, 1 reply; 12+ messages in thread
From: Jim Schutt @ 2011-01-27 16:05 UTC (permalink / raw)
  To: linux-btrfs; +Cc: ceph-devel

Hi,

I got this kernel BUG on a server running multiple Ceph
cosd instances.  I'm not sure what was going on at the
time, as I just noticed this on my serial console for
this node.

It looks like another example of the truncate issue in
Matt Weil's report.

Please let me know what other information is needed to 
make this report useful.

Thanks -- Jim

an4 login: [62397.925080] ------------[ cut here ]------------
[62397.926012] kernel BUG at fs/btrfs/inode.c:6403!
[62397.926012] invalid opcode: 0000 [#1] SMP 
[62397.926012] last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
[62397.926012] CPU 1 
[62397.926012] Modules linked in: loop btrfs zlib_deflate ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge ]
[62397.994828] 
[62397.994828] Pid: 10514, comm: cosd Not tainted 2.6.37-00017-ga3f5274 #4 0DT097/PowerEdge 1950
[62397.994828] RIP: 0010:[<ffffffffa07834ff>]  [<ffffffffa07834ff>] btrfs_truncate+0x444/0x47a [btrfs]
[62397.994828] RSP: 0018:ffff8801a2e61d48  EFLAGS: 00010286
[62397.994828] RAX: 00000000ffffffe4 RBX: ffff88018c9c3a50 RCX: ffff8802136e9240
[62397.994828] RDX: ffff8802136e97e0 RSI: ffffea00074402f8 RDI: 0000000000000090
[62397.994828] RBP: ffff8801a2e61dd8 R08: ffffe8ffffc4ebe8 R09: 00000001e2a6a8c0
[62397.994828] R10: 0000000000000008 R11: 0000000000000016 R12: ffff8801e2a6a8c0
[62397.994828] R13: 0000000000000000 R14: ffff88018c9c3a50 R15: ffff880223b56800
[62397.994828] FS:  00007f6122b2e940(0000) GS:ffff8800cfc40000(0000) knlGS:0000000000000000
[62397.994828] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[62397.994828] CR2: 00007f7c1c7580a0 CR3: 00000001fc864000 CR4: 00000000000006e0
[62397.994828] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[62397.994828] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[62397.994828] Process cosd (pid: 10514, threadinfo ffff8801a2e60000, task ffff8801da311610)
[62397.994828] Stack:
[62397.994828]  0000000000000000 0000000000000000 0000000000000000 ffffffff00001000
[62397.994828]  ffff88018c9c38b8 ffff88018c9c3a50 ffff88018c9c3b78 0000000000000000
[62397.994828]  ffff88018c9c38e8 ffff88018c9c3b78 0000000000000000 ffffffff810b4960
[62397.994828] Call Trace:
[62397.994828]  [<ffffffff810b4960>] ? truncate_pagecache+0x52/0x5a
[62397.994828]  [<ffffffff810b49ca>] vmtruncate+0x44/0x50
[62397.994828]  [<ffffffffa078482c>] btrfs_setattr+0x205/0x24e [btrfs]
[62397.994828]  [<ffffffff810fe7fc>] notify_change+0x194/0x285
[62397.994828]  [<ffffffff810e9c0a>] do_truncate+0x71/0x90
[62397.994828]  [<ffffffff810f34f1>] ? generic_permission+0x1c/0x91
[62397.994828]  [<ffffffff810f3317>] ? get_write_access+0x1d/0x47
[62397.994828]  [<ffffffff810e9df7>] sys_truncate+0x112/0x124
[62397.994828]  [<ffffffff81002c2b>] system_call_fastpath+0x16/0x1b
[62397.994828] Code: 83 7e 5c 00 74 13 4c 89 f6 4c 89 e7 e8 00 cd ff ff 85 c0 74 04 0f 0b eb fe 4c 89 f2 4c 89 fe 4c 89 e7 e8 22 f6 ff ff 85 c0 74 04 <0f> 0b eb fe 4c 89 fe 4c 89 e7 49 8b 5c 24 20 e8 47 9e ff  
[62397.994828] RIP  [<ffffffffa07834ff>] btrfs_truncate+0x444/0x47a [btrfs]
[62397.994828]  RSP <ffff8801a2e61d48>
[62398.251586] ---[ end trace c4d86802177b259b ]---
Jan 27 08:47:39 an4 [62397.925080] ------------[ cut here ]------------
Jan 27 08:47:39 an4 [62397.926012] invalid opcode: 0000 [#1] SMP 
Jan 27 08:47:39 an4 [62397.926012] last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
Jan 27 08:47:39 an4 [62397.994828] Stack:
Jan 27 08:47:39 an4 [62397.994828] Call Trace:
Jan 27 08:47:39 an4 [62397.994828] Code: 83 7e 5c 00 74 13 4c 89 f6 4c 89 e7 e8 00 cd ff ff 85 c0 74 04 0f 0b eb fe 4c 89 f2 4c 89 fe 4c 89 e7 e8 22 f6 ff ff 85 c0 74 04 <0f> 0b eb fe 4c 89 fe 4c 89 e7 49 8b 5 
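
A note on reading the register dump above, offered as a hedged sketch rather than a diagnosis. The bytes just before the trap in the Code: line (85 c0 74 04 <0f> 0b) decode to "test %eax,%eax; je; ud2", which is consistent with a BUG_ON-style check on a nonzero return value. Assuming RAX still holds that return value, its low 32 bits can be read as a signed integer and matched against the errno table. The helper name decode_errno is hypothetical, and the trace does not say which call returned the value.

```python
import errno


def decode_errno(reg_hex: str) -> str:
    """Read the low 32 bits of a register dump as a signed int.

    Assumption: the register holds a kernel-style negative errno
    return value. Nothing in the oops itself guarantees that.
    """
    value = int(reg_hex, 16) & 0xFFFFFFFF
    if value >= 0x80000000:
        # Convert from unsigned 32-bit to two's-complement signed.
        value -= 1 << 32
    if value < 0 and -value in errno.errorcode:
        return f"{value} (-{errno.errorcode[-value]})"
    return str(value)


# RAX from the btrfs_truncate oopses in this thread
print(decode_errno("00000000ffffffe4"))  # -28 (-ENOSPC)
# RAX from the run_clustered_refs and reada_walk_down oopses
print(decode_errno("00000000fffffffb"))  # -5 (-EIO)
```

Under that assumption, both btrfs_truncate reports would be tripping on -ENOSPC, and the two extent-tree.c BUGs on -EIO, which would fit the "checksum verify failed" messages logged just before the second one.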





* Re: btrfs BUG during Ceph cosd truncate() syscall
  2011-01-27 16:05     ` btrfs BUG during Ceph cosd truncate() syscall Jim Schutt
@ 2011-01-27 16:36       ` Wido den Hollander
  2011-01-27 16:51         ` Jim Schutt
  0 siblings, 1 reply; 12+ messages in thread
From: Wido den Hollander @ 2011-01-27 16:36 UTC (permalink / raw)
  To: Jim Schutt; +Cc: ceph-devel

Hi Jim,

On Thu, 2011-01-27 at 09:05 -0700, Jim Schutt wrote:
> Hi,
> 
> I got this kernel BUG on a server running multiple Ceph
> cosd instances.  I'm not sure what was going on at the
> time, as I just noticed this on my serial console for
> this node.
> 
> It looks like another example of the truncate issue in
> Matt Weil's report.

Yes, I saw the same. I've been mailing with Josef Bacik and he asked me
to try his latest btrfs with some patches: 

git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-work.git

Could you try that (master branch)? It compiles against 2.6.38, by the way.

I've been running it for a few hours now and it seems to be stable, whereas
yesterday it would fail within 30 minutes.

Wido



* Re: btrfs BUG during Ceph cosd truncate() syscall
  2011-01-27 16:36       ` Wido den Hollander
@ 2011-01-27 16:51         ` Jim Schutt
  0 siblings, 0 replies; 12+ messages in thread
From: Jim Schutt @ 2011-01-27 16:51 UTC (permalink / raw)
  To: Wido den Hollander; +Cc: ceph-devel

Hi Wido,

On Thu, 2011-01-27 at 09:36 -0700, Wido den Hollander wrote:
> Hi Jim,
> 
> On Thu, 2011-01-27 at 09:05 -0700, Jim Schutt wrote:
> > Hi,
> > 
> > I got this kernel BUG on a server running multiple Ceph
> > cosd instances.  I'm not sure what was going on at the
> > time, as I just noticed this on my serial console for
> > this node.
> > 
> > It looks like another example of the truncate issue in
> > Matt Weil's report.
> 
> Yes, I saw the same. I've been mailing with Josef Bacik and he asked me
> to try his latest btrfs with some patches: 
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-work.git

Sure, I'll give that a try.

> 
> Could you try that (master branch)? It compiles against 2.6.38, by the way.
> 
> I've been running it for a few hours now and it seems to be stable, whereas
> yesterday it would fail within 30 minutes.

Great!

-- Jim

> 
> Wido
> 
> 




Thread overview: 12+ messages
2011-01-26 16:00 btrfs BUG during cosd open() syscall Jim Schutt
2011-01-26 17:17 ` Gregory Farnum
2011-01-26 17:55   ` Jim Schutt
2011-01-26 17:58     ` Sage Weil
2011-01-26 17:59 ` btrfs BUG during Ceph " Jim Schutt
2011-01-26 18:48   ` Jim Schutt
2011-01-26 19:20     ` Matt Weil
2011-01-27 15:58       ` Christian Brunner
2011-01-27 15:58         ` Christian Brunner
2011-01-27 16:05     ` btrfs BUG during Ceph cosd truncate() syscall Jim Schutt
2011-01-27 16:36       ` Wido den Hollander
2011-01-27 16:51         ` Jim Schutt
