* Crash on 2.6.21.7 Vanilla + DRBD 0.7
From: vindex+lists-xfs @ 2007-10-04 13:33 UTC
To: xfs
Hi,
I compiled a fresh 2.6.21.7 kernel from kernel.org (no distro patches,
...) together with the latest svn (3062) DRBD 0.7.x.
After just 2 days of uptime, I experienced another crash.
I wonder whether it is an XFS bug, a DRBD bug, or something specific to
XFS on top of DRBD.
The bug seems to occur under intensive I/O.
What do you think?
Thanks
Oct 3 18:55:23 kernel: Oops: 0002 [#1]
Oct 3 18:55:23 kernel: SMP
Oct 3 18:55:23 kernel: CPU: 7
Oct 3 18:55:23 kernel: EIP: 0060:[<c016540c>] Not tainted VLI
Oct 3 18:55:23 kernel: EFLAGS: 00010046 (2.6.21-dl380-g5-20071001 #1)
Oct 3 18:55:23 kernel: EIP is at cache_alloc_refill+0x11c/0x4f0
Oct 3 18:55:23 kernel: eax: f79c2940 ebx: 00000015 ecx: 00000005 edx: 65b567b0
Oct 3 18:55:23 kernel: esi: 0000000a edi: d5d26000 ebp: f79d03c0 esp: d2531c98
Oct 3 18:55:23 kernel: ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068
Oct 3 18:55:23 kernel: Process rsync (pid: 22409, ti=d2530000 task=da1e8070 task.ti=d2530000)
Oct 3 18:55:23 kernel: Stack: 00000010 000002d0 ce9ca0b8 000002d0 f79cfe00 f79d1c00 f79c2940 00000000
Oct 3 18:55:23 kernel: 00000001 d2531cd4 ce9ca088 c022aade d5d2601c 00000282 f79cfe00 000002d0
Oct 3 18:55:23 kernel: f79cfe00 c01652e6 00000000 00000001 c0265a4e 00000011 d2531d60 d7acfb40
Oct 3 18:55:23 kernel: Call Trace:
Oct 3 18:55:23 kernel: [<c022aade>] xfs_da_brelse+0x6e/0xb0
Oct 3 18:55:23 kernel: [<c01652e6>] kmem_cache_alloc+0x46/0x50
Oct 3 18:55:23 kernel: [<c0265a4e>] kmem_zone_alloc+0x4e/0xc0
Oct 3 18:55:23 kernel: [<c027015f>] xfs_fs_alloc_inode+0xf/0x20
Oct 3 18:55:23 kernel: [<c017bbd6>] alloc_inode+0x16/0x170
Oct 3 18:55:23 kernel: [<c017bd89>] iget_locked+0x59/0x130
Oct 3 18:55:23 kernel: [<c023fa38>] xfs_iget+0x78/0x160
Oct 3 18:55:23 kernel: [<c020a49c>] xfs_acl_vget+0x6c/0x160
Oct 3 18:55:23 kernel: [<c025b143>] xfs_dir_lookup_int+0x93/0xf0
Oct 3 18:55:23 kernel: [<c025ea55>] xfs_lookup+0x75/0xa0
Oct 3 18:55:23 kernel: [<c026d0c2>] xfs_vn_lookup+0x52/0x90
Oct 3 18:55:23 kernel: [<c016fd08>] do_lookup+0x148/0x190
Oct 3 18:55:23 kernel: [<c0171cb4>] __link_path_walk+0x814/0xe40
Oct 3 18:55:23 kernel: [<c0172325>] link_path_walk+0x45/0xc0
Oct 3 18:55:23 kernel: [<c0172581>] do_path_lookup+0x81/0x1c0
Oct 3 18:55:23 kernel: [<c01712c3>] getname+0xb3/0xe0
Oct 3 18:55:23 kernel: [<c0172f8b>] __user_walk_fd+0x3b/0x60
Oct 3 18:55:23 kernel: [<c016bcdf>] vfs_lstat_fd+0x1f/0x50
Oct 3 18:55:23 kernel: [<c016bd5f>] sys_lstat64+0xf/0x30
Oct 3 18:55:23 kernel: [<c01040b0>] sysenter_past_esp+0x5d/0x81
Oct 3 18:55:23 kernel: =======================
Oct 3 18:55:23 kernel: Code: 10 8b 77 14 01 c2 8b 44 24 30 8b 34 b0 89 77 14 89 54 8d 14 8d 51 01 89 55 00 8b 44 24 10 8b 77 10 3b 70 5c 72 c0 8b 17 8b 47 04 <89> 42 04 89 10 83 7f 14 ff c7 07 00 01 10 00 c7 47 04 00 02 20
Oct 3 18:55:23 kernel: EIP: [<c016540c>] cache_alloc_refill+0x11c/0x4f0 SS:ESP 0068:d2531c98
Oct 3 18:55:26 kernel: Oops: 0002 [#2]
Oct 3 18:55:26 kernel: SMP
Oct 3 18:55:26 kernel: CPU: 7
Oct 3 18:55:26 kernel: EIP: 0060:[<c017bbe0>] Not tainted VLI
Oct 3 18:55:26 kernel: EFLAGS: 00210282 (2.6.21-dl380-g5-20071001 #1)
Oct 3 18:55:26 kernel: EIP is at alloc_inode+0x20/0x170
Oct 3 18:55:26 kernel: eax: b4fd89ba ebx: b4fd89ba ecx: b4fd89ba edx: b4fd89ba
Oct 3 18:55:26 kernel: esi: f29bb000 edi: f29bb000 ebp: ca743575 esp: d6747c64
Oct 3 18:55:26 kernel: ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068
Oct 3 18:55:26 kernel: Process imapd (pid: 20054, ti=d6746000 task=e04a20b0 task.ti=d6746000)
Oct 3 18:55:26 kernel: Stack: 00000000 c76fe0dc f29bb000 c017bd89 ffffffff ffffffff c04abda0 ca743575
Oct 3 18:55:26 kernel: ca743575 f53b5800 c023fa38 cb2b4524 1b2595f3 00000020 f0dd7400 ded8b7a8
Oct 3 18:55:26 kernel: 00000000 f53b5800 c04abda0 cb2b4524 cb2b4524 ca743575 00000000 00000004
Oct 3 18:55:26 kernel: Call Trace:
Oct 3 18:55:26 kernel: [<c017bd89>] iget_locked+0x59/0x130
Oct 3 18:55:26 kernel: [<c023fa38>] xfs_iget+0x78/0x160
Oct 3 18:55:26 kernel: [<c025a697>] xfs_trans_iget+0x117/0x190
Oct 3 18:55:26 kernel: [<c0243d87>] xfs_ialloc+0xc7/0x570
Oct 3 18:55:26 kernel: [<c024aabc>] xlog_grant_push_ail+0x3c/0x150
Oct 3 18:55:26 kernel: [<c025b261>] xfs_dir_ialloc+0x81/0x2d0
Oct 3 18:55:26 kernel: [<c025855b>] xfs_trans_reserve+0xab/0x230
Oct 3 18:55:26 kernel: [<c0261aa5>] xfs_create+0x395/0x6a0
Oct 3 18:55:26 kernel: [<c023eac5>] xfs_iunlock+0x85/0xa0
Oct 3 18:55:26 kernel: [<c026d6b5>] xfs_vn_mknod+0x235/0x360
Oct 3 18:55:26 kernel: [<c01705cd>] vfs_create+0xdd/0x140
Oct 3 18:55:26 kernel: [<c01738ae>] open_namei+0x58e/0x5f0
Oct 3 18:55:26 kernel: [<c016716e>] do_filp_open+0x2e/0x60
Oct 3 18:55:26 kernel: [<c0166e4f>] get_unused_fd+0x4f/0xb0
Oct 3 18:55:26 kernel: [<c01671ea>] do_sys_open+0x4a/0xe0
Oct 3 18:55:26 kernel: [<c01672bc>] sys_open+0x1c/0x20
Oct 3 18:55:26 kernel: [<c01040b0>] sysenter_past_esp+0x5d/0x81
Oct 3 18:55:26 kernel: =======================
Oct 3 18:55:26 kernel: Code: 90 90 90 90 90 90 90 90 90 90 90 57 56 89 c6 53 8b 40 20 8b 10 85 d2 0f 84 1e 01 00 00 89 f0 ff d2 89 c3 85 db 0f 84 ee 00 00 00 <89> b3 98 00 00 00 b9 02 00 00 00 0f b6 46 10 8d bb f8 00 00 00
Oct 3 18:55:26 kernel: EIP: [<c017bbe0>] alloc_inode+0x20/0x170 SS:ESP 0068:d6747c64
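The header lines of the two oopses above can be read mechanically. The following is a minimal sketch, not part of the original report. It assumes the oopses come from the i386 page-fault handler (so the number after "Oops:" is the page-fault error code, where bit 0 means the page was present, bit 1 means the access was a write, and bit 2 means user mode) and that the kernel uses the default 3G/1G split, so kernel pointers start at 0xC0000000. The function names are hypothetical helpers chosen for illustration.

```python
import re

def decode_oops_error(code_hex):
    """Decode the low three bits of an i386 page-fault error code."""
    code = int(code_hex, 16)
    return {
        "page_present": bool(code & 0x1),  # 0 means the page was not mapped
        "write": bool(code & 0x2),         # 1 means the faulting access was a write
        "user_mode": bool(code & 0x4),     # 0 means the fault happened in kernel mode
    }

def parse_eip_line(line):
    """Extract (function, offset, size) from an 'EIP is at f+0xoff/0xsize' line."""
    m = re.search(r"EIP is at (\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)", line)
    if m is None:
        return None
    return m.group(1), int(m.group(2), 16), int(m.group(3), 16)

def looks_like_kernel_pointer(value, page_offset=0xC0000000):
    """Assumption: default 3G/1G split, so kernel addresses are >= page_offset."""
    return value >= page_offset

if __name__ == "__main__":
    print(decode_oops_error("0002"))
    print(parse_eip_line("EIP is at cache_alloc_refill+0x11c/0x4f0"))
    # edx from oops #1 and ebx from oops #2, taken from the register dumps above
    for reg in (0x65B567B0, 0xB4FD89BA):
        print(hex(reg), looks_like_kernel_pointer(reg))
```

Under those assumptions, "Oops: 0002" reads as a kernel-mode write to an unmapped page, and the edx value in the first oops and the eax/ebx/ecx/edx value in the second do not look like kernel addresses.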
* Re: Crash on 2.6.21.7 Vanilla + DRBD 0.7
From: Hannes Dorbath @ 2007-10-04 14:27 UTC
To: xfs
On 04.10.2007 15:33, vindex+lists-xfs@apartia.org wrote:
> What do you think about it ?
Is that by any chance a kernel with 4k stack size?
--
Regards,
Hannes Dorbath
* Re: Crash on 2.6.21.7 Vanilla + DRBD 0.7
From: Hannes Dorbath @ 2007-10-04 14:35 UTC
To: xfs
On 04.10.2007 15:33, vindex+lists-xfs@apartia.org wrote:
> What do you think about it ?
Another thing, is there a special reason why you use DRBD 0.7.x branch?
AFAIK it will still deadlock with kernel 2.6.22. You are not running
.22, but if you upgrade you might have serious problems. You should
really go with DRBD 8.0.6 if you can.
--
Regards,
Hannes Dorbath
* Re: Crash on 2.6.21.7 Vanilla + DRBD 0.7
From: Laurent CARON @ 2007-10-04 16:33 UTC
To: xfs; +Cc: Hannes Dorbath
Hannes Dorbath wrote:
> On 04.10.2007 15:33, vindex+lists-xfs@apartia.org wrote:
>> What do you think about it ?
>
> Another thing, is there a special reason why you use DRBD 0.7.x branch?
> AFAIK it will still deadlock with kernel 2.6.22. You are not running
> .22, but if you upgrade you might have serious problems. You should
> really go with DRBD 8.0.6 if you can.
>
>
Hi,
We use 0.7.x because we had a major problem with 8.0.x:
the initial sync never completed.
I tried to solve this problem with Lars Ellenberg, to no avail, and
decided to go back to 0.7, which is a well-tested version.
* Re: Crash on 2.6.21.7 Vanilla + DRBD 0.7
From: Laurent CARON @ 2007-10-04 16:42 UTC
To: Hannes Dorbath; +Cc: xfs
Hannes Dorbath wrote:
> On 04.10.2007 15:33, vindex+lists-xfs@apartia.org wrote:
>> What do you think about it ?
>
> Is that by any chance a kernel with 4k stack size?
>
>
We're using an 8 KB stack size (the default value).
Laurent
* Re: Crash on 2.6.21.7 Vanilla + DRBD 0.7
From: David Chinner @ 2007-10-04 23:10 UTC
To: xfs
On Thu, Oct 04, 2007 at 03:33:02PM +0200, vindex+lists-xfs@apartia.org wrote:
>
> Hi,
>
> I did compile a fresh 2.6.21.7 kernel from kernel.org (no distro patch,
> ....), and latest svn (3062) 0.7.X drbd.
>
> After just 2 days of uptime, I did experience another crash.
>
> I wonder if it is an XFS related bug, a DRBD one, or related to XFS on
> top of DRBD.
>
> This bug seems to occur with intensive IO operations.
>
> What do you think about it ?
>
> Thanks
>
>
> Oct 3 18:55:23 kernel: Oops: 0002 [#1]
> Oct 3 18:55:23 kernel: SMP
> Oct 3 18:55:23 kernel: CPU: 7
> Oct 3 18:55:23 kernel: EIP: 0060:[<c016540c>] Not tainted VLI
> Oct 3 18:55:23 kernel: EFLAGS: 00010046 (2.6.21-dl380-g5-20071001 #1)
> Oct 3 18:55:23 kernel: EIP is at cache_alloc_refill+0x11c/0x4f0
Use-after-free somewhere, I'd say. Turn on slab/slub poisoning and
other memory debugging options and see where it panics next
time.
Cheers,
Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group
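Slab poisoning helps here because freed objects are overwritten with a known byte pattern, so a stale pointer loaded from freed memory shows up as a recognizable value in the register dump. The following is a minimal sketch of that recognition step. It assumes the constants from the kernel's include/linux/poison.h (POISON_INUSE 0x5a, POISON_FREE 0x6b, POISON_END 0xa5, LIST_POISON1 0x00100100, LIST_POISON2 0x00200200) and a little-endian 32-bit machine. The function name and the returned labels are hypothetical.

```python
# Values as defined in the kernel's include/linux/poison.h
POISON_INUSE = 0x5A   # fill byte for allocated, not yet initialised objects
POISON_FREE = 0x6B    # fill byte for freed objects
POISON_END = 0xA5     # last byte of a poisoned object
LIST_POISON1 = 0x00100100  # written to ->next by list_del()
LIST_POISON2 = 0x00200200  # written to ->prev by list_del()

def classify_word(value):
    """Return a label if a 32-bit register value matches a poison pattern."""
    value &= 0xFFFFFFFF
    if value in (LIST_POISON1, LIST_POISON2):
        return "list poison"
    raw = value.to_bytes(4, "little")  # byte order as it sits in memory
    if all(b == POISON_FREE for b in raw):
        return "freed slab object"
    if all(b == POISON_FREE for b in raw[:3]) and raw[3] == POISON_END:
        return "tail of freed slab object"
    if all(b == POISON_INUSE for b in raw):
        return "uninitialised slab object"
    return None  # no known pattern, nothing can be concluded

if __name__ == "__main__":
    for reg in (0x6B6B6B6B, 0x65B567B0, 0xB4FD89BA):
        print(hex(reg), classify_word(reg))
```

The garbage register values from the first report (65b567b0 and b4fd89ba) match none of these patterns, so a trace from a kernel with poisoning enabled should be more informative.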
* Re: Crash on 2.6.21.7 Vanilla + DRBD 0.7
From: Louis-David Mitterrand @ 2007-10-10 15:15 UTC
To: xfs
On Thu, Oct 04, 2007 at 04:35:05PM +0200, Hannes Dorbath wrote:
> On 04.10.2007 15:33, vindex+lists-xfs@apartia.org wrote:
>> What do you think about it ?
>
> Another thing, is there a special reason why you use DRBD 0.7.x branch?
> AFAIK it will still deadlock with kernel 2.6.22. You are not running .22,
> but if you upgrade you might have serious problems. You should really go
> with DRBD 8.0.6 if you can.
>
After upgrading to 8.0.6 we had another xfs-related crash 4 days later.
In desperation we are about to abandon xfs and convert this huge
partition to ext3. Is there anything else we could try before taking that
step?
Thanks,
Oct 9 12:20:05 sargon/sargon kernel: SMP
Oct 9 12:20:05 sargon/sargon kernel: CPU: 1
Oct 9 12:20:05 sargon/sargon kernel: EIP: 0060:[<c015edc2>] Not tainted VLI
Oct 9 12:20:05 sargon/sargon kernel: EFLAGS: 00010082 (2.6.22-dl380-g5-20070917 #1)
Oct 9 12:20:05 sargon/sargon kernel: EIP is at free_block+0x67/0xfe
Oct 9 12:20:05 sargon/sargon kernel: eax: a9b1fb46 ebx: 00000000 ecx: f65f4200 edx: d9741040
Oct 9 12:20:05 sargon/sargon kernel: esi: f65f4000 edi: f79e8f40 ebp: f79da680 esp: f797de5c
Oct 9 12:20:05 sargon/sargon kernel: ds: 007b es: 007b fs: 00d8 gs: 0000 ss: 0068
Oct 9 12:20:05 sargon/sargon kernel: Process kswapd0 (pid: 248, ti=f797c000 task=f7c22a90 task.ti=f797c000)
Oct 9 12:20:05 sargon/sargon kernel: Stack: 00000000 00000000 0000001b 00000010 f79e9f14 0000001b 000000d8 f79da680
Oct 9 12:20:05 sargon/sargon kernel: f79e8f40 c015eb6e 00000000 f79e9ec0 f79e9ec0 00000246 cb353240 00000001
Oct 9 12:20:05 sargon/sargon kernel: c015eca7 cb353240 f52f2cf0 dbe833c0 c0210cd0 00000001 dbe833dc dbe833c0
Oct 9 12:20:05 sargon/sargon kernel: Call Trace:
Oct 9 12:20:05 sargon/sargon kernel: [<c015eb6e>] cache_flusharray+0x70/0x96
Oct 9 12:20:05 sargon/sargon kernel: [<c015eca7>] kmem_cache_free+0x7d/0x96
Oct 9 12:20:05 sargon/sargon kernel: [<c0210cd0>] xfs_finish_reclaim+0x121/0x129
Oct 9 12:20:05 sargon/sargon kernel: [<c021e892>] xfs_fs_clear_inode+0x8f/0xb1
Oct 9 12:20:05 sargon/sargon kernel: [<c0172379>] clear_inode+0xa2/0xf0
Oct 9 12:20:05 sargon/sargon kernel: [<c0172639>] dispose_list+0x46/0xc2
Oct 9 12:20:05 sargon/sargon kernel: [<c0172841>] shrink_icache_memory+0x18c/0x1b4
Oct 9 12:20:05 sargon/sargon kernel: [<c014ca77>] shrink_slab+0xd9/0x138
Oct 9 12:20:05 sargon/sargon kernel: [<c014ce04>] kswapd+0x297/0x3e8
Oct 9 12:20:05 sargon/sargon kernel: [<c012d2f1>] autoremove_wake_function+0x0/0x35
Oct 9 12:20:05 sargon/sargon kernel: [<c014cb6d>] kswapd+0x0/0x3e8
Oct 9 12:20:05 sargon/sargon kernel: [<c012d22b>] kthread+0x38/0x5d
Oct 9 12:20:05 sargon/sargon kernel: [<c012d1f3>] kthread+0x0/0x5d
Oct 9 12:20:05 sargon/sargon kernel: [<c0104963>] kernel_thread_helper+0x7/0x10
Oct 9 12:20:05 sargon/sargon kernel: =======================
Oct 9 12:20:05 sargon/sargon kernel: Code: 00 3d 00 40 02 00 75 03 8b 52 0c 8b 02 84 c0 78 04 0f 0b eb fe 8b 72 1c 8b 54 24 28 8b 46 04 8b bc 95 88 00 00 00 8b 16 89 42 04 <89> 10 2b 4e 0c c7 06 00 01 10 00 c7 46 04 00 02 20 00 89 c8 f7
Oct 9 12:20:05 sargon/sargon kernel: EIP: [<c015edc2>] free_block+0x67/0xfe SS:ESP 0068:f797de5c
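All three oopses in this thread report an EIP in allocator or inode-allocation code (cache_alloc_refill, alloc_inode, free_block) rather than in one particular XFS function, so it can help to compare the call traces frame by frame. The following is a minimal sketch that extracts the frames of the first "Call Trace:" section from a syslog excerpt like the ones above. The regular expression and the function names are hypothetical and only assume the "[<address>] symbol+0xoffset/0xsize" layout seen in these logs.

```python
import re

# One frame looks like: [<c015eb6e>] cache_flusharray+0x70/0x96
FRAME = re.compile(r"\[<([0-9a-f]{8})>\]\s+(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)")

def parse_call_trace(log_text):
    """Return (symbol, offset, size) tuples for the first call trace in log_text."""
    frames = []
    in_trace = False
    for line in log_text.splitlines():
        if "Call Trace:" in line:
            in_trace = True
            continue
        if not in_trace:
            continue
        m = FRAME.search(line)
        if m is None:
            break  # the '=====' separator (or any non-frame line) ends the trace
        frames.append((m.group(2), int(m.group(3), 16), int(m.group(4), 16)))
    return frames

def xfs_frames(frames):
    """Keep only frames whose symbol starts with 'xfs_'."""
    return [f for f in frames if f[0].startswith("xfs_")]

if __name__ == "__main__":
    import sys
    trace = parse_call_trace(sys.stdin.read())
    for symbol, offset, size in trace:
        print(f"{symbol}+{offset:#x}/{size:#x}")
```

Piping one of the excerpts above into the script prints one line per frame, which makes it easy to diff the traces of separate crashes.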
* Re: Crash on 2.6.21.7 Vanilla + DRBD 0.7
From: David Chinner @ 2007-10-10 22:46 UTC
To: xfs
On Wed, Oct 10, 2007 at 05:15:37PM +0200, Louis-David Mitterrand wrote:
> On Thu, Oct 04, 2007 at 04:35:05PM +0200, Hannes Dorbath wrote:
> > On 04.10.2007 15:33, vindex+lists-xfs@apartia.org wrote:
> >> What do you think about it ?
> >
> > Another thing, is there a special reason why you use DRBD 0.7.x branch?
> > AFAIK it will still deadlock with kernel 2.6.22. You are not running .22,
> > but if you upgrade you might have serious problems. You should really go
> > with DRBD 8.0.6 if you can.
> >
>
> After upgrading to 8.0.6 we had another xfs-related crash 4 days later.
> In desperation we are about to abandon xfs and convert this huge
> partition to ext3. Is there anything else we could try before taking that
> step?
Yes, please turn on slab debugging so we can try to find the cause
of this memory corruption. I expect the problem to be in DRBD as
nobody else running XFS is reporting this problem. However, without
running with the right debug options enabled we'll never get to
the bottom of the problem.
Cheers,
Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group
* Re: Crash on 2.6.21.7 Vanilla + DRBD 0.7
From: Laurent CARON @ 2007-10-11 7:36 UTC
To: xfs; +Cc: David Chinner
David Chinner wrote:
> Yes, please turn on slab debugging so we can try to find the cause
> of this memory corruption. I expect the problem to be in DRBD as
> nobody else running XFS is reporting this problem. However, without
> running with the right debug options enabled we'll never get to
> the bottom of the problem.
Hi,
Before installing a new kernel, I've got a (little?) clue.
The setup is as follows.
The DRBD partition is mounted on a generic mountpoint:
/dev/drbd1 on /data/web type xfs (rw)
Subdirectories of /data/web are then bind-mounted (mount --bind) onto
other directories:
/data/web/var/www on /var/www type xfs (rw,bind)
/data/web/var/lib/postgresql on /var/lib/postgresql type xfs (rw,bind)
/data/web/var/lib/mysql on /var/lib/mysql type xfs (rw,bind)
It seems I made a mistake here. I used
mount -t xfs --bind /data/web/var/www /var/www
instead of
mount --bind /data/web/var/www /var/www
Could this be the root of the problem (if the system then sees
/var/www as a 'real' XFS filesystem rather than a bind-mounted directory)?
Thanks
Laurent
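For reference, mount(2) documents that the filesystem type argument is ignored when MS_BIND is set, so the extra -t xfs would not by itself be expected to create a second XFS instance. The question is not answered in the thread. The following is a minimal sketch for listing which entries of a pasted mount listing are bind mounts. It assumes mtab-style output in the "source on target type fstype (options)" form shown above, where the bind option is recorded. The function names are hypothetical.

```python
import re

# Matches lines such as: /dev/drbd1 on /data/web type xfs (rw)
MOUNT_LINE = re.compile(r"^(\S+) on (\S+) type (\S+) \(([^)]*)\)$")

def parse_mount_output(text):
    """Parse mtab-style `mount` output into a list of dictionaries."""
    entries = []
    for line in text.splitlines():
        m = MOUNT_LINE.match(line.strip())
        if m is None:
            continue  # skip anything that is not a mount entry
        source, target, fstype, options = m.groups()
        entries.append({
            "source": source,
            "target": target,
            "fstype": fstype,
            "options": options.split(","),
        })
    return entries

def bind_mounts(entries):
    """Return the entries that carry the 'bind' option."""
    return [e for e in entries if "bind" in e["options"]]

if __name__ == "__main__":
    listing = """\
/dev/drbd1 on /data/web type xfs (rw)
/data/web/var/www on /var/www type xfs (rw,bind)
/data/web/var/lib/postgresql on /var/lib/postgresql type xfs (rw,bind)
/data/web/var/lib/mysql on /var/lib/mysql type xfs (rw,bind)
"""
    for entry in bind_mounts(parse_mount_output(listing)):
        print(entry["source"], "->", entry["target"])
```

Note that the type column of a bind entry only repeats the type of the underlying filesystem, so "type xfs (rw,bind)" alone does not show whether -t xfs was passed.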
* Re: Crash on 2.6.21.7 Vanilla + DRBD 0.7
From: Laurent CARON @ 2007-10-08 12:47 UTC
To: David Chinner; +Cc: drbd-user, linux-kernel
David Chinner wrote:
> Can you turn on slab debug and poisoning and see where
> the kernel fails with that? e.g. set:
>
> CONFIG_DEBUG_SLAB=y
> CONFIG_DEBUG_SLAB_LEAK=y
I was a little worried about leaving those servers in such a bad state,
so I went the "easy" way:
I upgraded from DRBD 0.7.x to the latest svn 8.0.x.
Laurent
PS: Should this bug reappear, I'll change the kernel's config and let
you know the result.
* Re: Crash on 2.6.21.7 Vanilla + DRBD 0.7
From: David Chinner @ 2007-10-07 23:36 UTC
To: Laurent Caron; +Cc: drbd-user, linux-kernel
On Thu, Oct 04, 2007 at 09:29:40AM +0200, Laurent Caron wrote:
>
> Hi,
>
> I did compile a fresh 2.6.21.7 kernel from kernel.org (no distro patch, ....), and latest svn (3062) 0.7.X drbd.
>
> After just 2 days of uptime, I did experience another crash.
>
> I wonder if it is an XFS related bug, a DRBD one, or related to XFS on top of DRBD.
>
> This bug seems to occur with intensive IO operations.
>
> What do you think about it ?
This still looks like memory corruption of some sort. I'd
suspect DRBD at this point because nobody is reporting this against
other block devices in 2.6.21.
> Oct 3 18:55:23 kernel: Oops: 0002 [#1]
> Oct 3 18:55:23 kernel: SMP
> Oct 3 18:55:23 kernel: CPU: 7
> Oct 3 18:55:23 kernel: EIP: 0060:[<c016540c>] Not tainted VLI
> Oct 3 18:55:23 kernel: EFLAGS: 00010046 (2.6.21-dl380-g5-20071001 #1)
> Oct 3 18:55:23 kernel: EIP is at cache_alloc_refill+0x11c/0x4f0
Can you turn on slab debug and poisoning and see where
the kernel fails with that? e.g. set:
CONFIG_DEBUG_SLAB=y
CONFIG_DEBUG_SLAB_LEAK=y
Cheers,
Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group
* Crash on 2.6.21.7 Vanilla + DRBD 0.7
From: Laurent Caron @ 2007-10-04 7:29 UTC
To: drbd-user; +Cc: linux-kernel
Hi,
I compiled a fresh 2.6.21.7 kernel from kernel.org (no distro patches, ...) together with the latest svn (3062) DRBD 0.7.x.
After just 2 days of uptime, I experienced another crash.
I wonder whether it is an XFS bug, a DRBD bug, or something specific to XFS on top of DRBD.
The bug seems to occur under intensive I/O.
What do you think?
Thanks
Laurent
Oct 3 18:55:23 kernel: Oops: 0002 [#1]
Oct 3 18:55:23 kernel: SMP
Oct 3 18:55:23 kernel: CPU: 7
Oct 3 18:55:23 kernel: EIP: 0060:[<c016540c>] Not tainted VLI
Oct 3 18:55:23 kernel: EFLAGS: 00010046 (2.6.21-dl380-g5-20071001 #1)
Oct 3 18:55:23 kernel: EIP is at cache_alloc_refill+0x11c/0x4f0
Oct 3 18:55:23 kernel: eax: f79c2940 ebx: 00000015 ecx: 00000005 edx: 65b567b0
Oct 3 18:55:23 kernel: esi: 0000000a edi: d5d26000 ebp: f79d03c0 esp: d2531c98
Oct 3 18:55:23 kernel: ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068
Oct 3 18:55:23 kernel: Process rsync (pid: 22409, ti=d2530000 task=da1e8070 task.ti=d2530000)
Oct 3 18:55:23 kernel: Stack: 00000010 000002d0 ce9ca0b8 000002d0 f79cfe00 f79d1c00 f79c2940 00000000
Oct 3 18:55:23 kernel: 00000001 d2531cd4 ce9ca088 c022aade d5d2601c 00000282 f79cfe00 000002d0
Oct 3 18:55:23 kernel: f79cfe00 c01652e6 00000000 00000001 c0265a4e 00000011 d2531d60 d7acfb40
Oct 3 18:55:23 kernel: Call Trace:
Oct 3 18:55:23 kernel: [<c022aade>] xfs_da_brelse+0x6e/0xb0
Oct 3 18:55:23 kernel: [<c01652e6>] kmem_cache_alloc+0x46/0x50
Oct 3 18:55:23 kernel: [<c0265a4e>] kmem_zone_alloc+0x4e/0xc0
Oct 3 18:55:23 kernel: [<c027015f>] xfs_fs_alloc_inode+0xf/0x20
Oct 3 18:55:23 kernel: [<c017bbd6>] alloc_inode+0x16/0x170
Oct 3 18:55:23 kernel: [<c017bd89>] iget_locked+0x59/0x130
Oct 3 18:55:23 kernel: [<c023fa38>] xfs_iget+0x78/0x160
Oct 3 18:55:23 kernel: [<c020a49c>] xfs_acl_vget+0x6c/0x160
Oct 3 18:55:23 kernel: [<c025b143>] xfs_dir_lookup_int+0x93/0xf0
Oct 3 18:55:23 kernel: [<c025ea55>] xfs_lookup+0x75/0xa0
Oct 3 18:55:23 kernel: [<c026d0c2>] xfs_vn_lookup+0x52/0x90
Oct 3 18:55:23 kernel: [<c016fd08>] do_lookup+0x148/0x190
Oct 3 18:55:23 kernel: [<c0171cb4>] __link_path_walk+0x814/0xe40
Oct 3 18:55:23 kernel: [<c0172325>] link_path_walk+0x45/0xc0
Oct 3 18:55:23 kernel: [<c0172581>] do_path_lookup+0x81/0x1c0
Oct 3 18:55:23 kernel: [<c01712c3>] getname+0xb3/0xe0
Oct 3 18:55:23 kernel: [<c0172f8b>] __user_walk_fd+0x3b/0x60
Oct 3 18:55:23 kernel: [<c016bcdf>] vfs_lstat_fd+0x1f/0x50
Oct 3 18:55:23 kernel: [<c016bd5f>] sys_lstat64+0xf/0x30
Oct 3 18:55:23 kernel: [<c01040b0>] sysenter_past_esp+0x5d/0x81
Oct 3 18:55:23 kernel: =======================
Oct 3 18:55:23 kernel: Code: 10 8b 77 14 01 c2 8b 44 24 30 8b 34 b0 89 77 14 89 54 8d 14 8d 51 01 89 55 00 8b 44 24 10 8b 77 10 3b 70 5c 72 c0 8b 17 8b 47 04 <89> 42 04 89 10 83 7f 14 ff c7 07 00 01 10 00 c7 47 04 00 02 20
Oct 3 18:55:23 kernel: EIP: [<c016540c>] cache_alloc_refill+0x11c/0x4f0 SS:ESP 0068:d2531c98
Oct 3 18:55:26 kernel: Oops: 0002 [#2]
Oct 3 18:55:26 kernel: SMP
Oct 3 18:55:26 kernel: CPU: 7
Oct 3 18:55:26 kernel: EIP: 0060:[<c017bbe0>] Not tainted VLI
Oct 3 18:55:26 kernel: EFLAGS: 00210282 (2.6.21-dl380-g5-20071001 #1)
Oct 3 18:55:26 kernel: EIP is at alloc_inode+0x20/0x170
Oct 3 18:55:26 kernel: eax: b4fd89ba ebx: b4fd89ba ecx: b4fd89ba edx: b4fd89ba
Oct 3 18:55:26 kernel: esi: f29bb000 edi: f29bb000 ebp: ca743575 esp: d6747c64
Oct 3 18:55:26 kernel: ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068
Oct 3 18:55:26 kernel: Process imapd (pid: 20054, ti=d6746000 task=e04a20b0 task.ti=d6746000)
Oct 3 18:55:26 kernel: Stack: 00000000 c76fe0dc f29bb000 c017bd89 ffffffff ffffffff c04abda0 ca743575
Oct 3 18:55:26 kernel: ca743575 f53b5800 c023fa38 cb2b4524 1b2595f3 00000020 f0dd7400 ded8b7a8
Oct 3 18:55:26 kernel: 00000000 f53b5800 c04abda0 cb2b4524 cb2b4524 ca743575 00000000 00000004
Oct 3 18:55:26 kernel: Call Trace:
Oct 3 18:55:26 kernel: [<c017bd89>] iget_locked+0x59/0x130
Oct 3 18:55:26 kernel: [<c023fa38>] xfs_iget+0x78/0x160
Oct 3 18:55:26 kernel: [<c025a697>] xfs_trans_iget+0x117/0x190
Oct 3 18:55:26 kernel: [<c0243d87>] xfs_ialloc+0xc7/0x570
Oct 3 18:55:26 kernel: [<c024aabc>] xlog_grant_push_ail+0x3c/0x150
Oct 3 18:55:26 kernel: [<c025b261>] xfs_dir_ialloc+0x81/0x2d0
Oct 3 18:55:26 kernel: [<c025855b>] xfs_trans_reserve+0xab/0x230
Oct 3 18:55:26 kernel: [<c0261aa5>] xfs_create+0x395/0x6a0
Oct 3 18:55:26 kernel: [<c023eac5>] xfs_iunlock+0x85/0xa0
Oct 3 18:55:26 kernel: [<c026d6b5>] xfs_vn_mknod+0x235/0x360
Oct 3 18:55:26 kernel: [<c01705cd>] vfs_create+0xdd/0x140
Oct 3 18:55:26 kernel: [<c01738ae>] open_namei+0x58e/0x5f0
Oct 3 18:55:26 kernel: [<c016716e>] do_filp_open+0x2e/0x60
Oct 3 18:55:26 kernel: [<c0166e4f>] get_unused_fd+0x4f/0xb0
Oct 3 18:55:26 kernel: [<c01671ea>] do_sys_open+0x4a/0xe0
Oct 3 18:55:26 kernel: [<c01672bc>] sys_open+0x1c/0x20
Oct 3 18:55:26 kernel: [<c01040b0>] sysenter_past_esp+0x5d/0x81
Oct 3 18:55:26 kernel: =======================
Oct 3 18:55:26 kernel: Code: 90 90 90 90 90 90 90 90 90 90 90 57 56 89 c6 53 8b 40 20 8b 10 85 d2 0f 84 1e 01 00 00 89 f0 ff d2 89 c3 85 db 0f 84 ee 00 00 00 <89> b3 98 00 00 00 b9 02 00 00 00 0f b6 46 10 8d bb f8 00 00 00
Oct 3 18:55:26 kernel: EIP: [<c017bbe0>] alloc_inode+0x20/0x170 SS:ESP 0068:d6747c64