linux-kernel.vger.kernel.org archive mirror
* [BUG] kernel 2.6.32.x hangs during boot process
@ 2010-01-16  9:58 François Figarola
  2010-01-23  0:07 ` Andrew Morton
  0 siblings, 1 reply; 9+ messages in thread
From: François Figarola @ 2010-01-16  9:58 UTC (permalink / raw)
  To: linux-kernel

Dear all,

First, I apologize for my poor English...

Ever since I tried to boot a 2.6.32.x kernel, my system has hung during
the boot process, and I think it could be related to the problem reported
earlier by Megastorage (http://lkml.org/lkml/2010/1/10/92).

The hardware is a Dell PowerEdge 2950 which runs fine with the
2.6.31.x kernel series (it is currently running the latest, 2.6.31.11),
and the system is Debian etch.

Here is the trace of the bug, captured with netconsole on a
2.6.32.3 kernel:

BUG: Dentry ffff880667690000{i=41a46,n=sleep} still in use (8)
[unmount of ext3 dm-4]
------------[ cut here ]------------
kernel BUG at fs/dcache.c:670!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/block/dm-2/removable
CPU 0
Modules linked in: i5k_amb hwmon button processor thermal fan [last
unloaded: scsi_wait_scan]
Pid: 3311, comm: kpartx Not tainted 2.6.32.3 #2 PowerEdge 2950
RIP: 0010:[<ffffffff810f95f0>]  [<ffffffff810f95f0>]
shrink_dcache_for_umount_subtree+0x280/0x290
RSP: 0018:ffff88066670dcf8  EFLAGS: 00010296
RAX: 000000000000005c RBX: ffff8806677696c0 RCX: 0000000000000096
RDX: 0000000000006767 RSI: 0000000000000046 RDI: 0000000000000246
RBP: ffff880667690000 R08: 0000000000000000 R09: ffff8806670d1628
R10: 0000000000000000 R11: 0000000000000000 R12: ffff880667690060
R13: 0000000000000007 R14: ffff8806654d1a88 R15: 0000000000dec0b0
FS:  00007f176e96b770(0000) GS:ffff880028200000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007fff0a2e0080 CR3: 0000000666607000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kpartx (pid: 3311, threadinfo ffff88066670c000, task ffff8806652997d0)
Stack:
ffff880665b8b178 ffff880665b8af18 ffffffff81619600 0000000000000001
<0> ffff880667408e00 ffffffff810f9629 ffff880665b8af18 ffffffff810e8049
<0> ffff8806651333f8 ffff880667408e00 ffffffff8185fc00 ffffffff810e8159
Call Trace:
[<ffffffff810f9629>] ? shrink_dcache_for_umount+0x29/0x50
[<ffffffff810e8049>] ? generic_shutdown_super+0x19/0x100
[<ffffffff810e8159>] ? kill_block_super+0x29/0x50
[<ffffffff810e8238>] ? deactivate_locked_super+0x58/0x80
[<ffffffff81112842>] ? thaw_bdev+0xd2/0x110
[<ffffffff814b0c67>] ? dm_resume+0xf7/0x160
[<ffffffff814b5f00>] ? dev_suspend+0x0/0x220
[<ffffffff814b60b1>] ? dev_suspend+0x1b1/0x220
[<ffffffff814b6c7b>] ? ctl_ioctl+0x1eb/0x260
[<ffffffff810c0b1b>] ? handle_mm_fault+0x63b/0x990
[<ffffffff814b6cfe>] ? dm_ctl_ioctl+0xe/0x20
[<ffffffff8104991a>] ? finish_task_switch+0x3a/0xc0
[<ffffffff810f4e9f>] ? vfs_ioctl+0x2f/0xb0
[<ffffffff810f53bb>] ? do_vfs_ioctl+0x3fb/0x580
[<ffffffff815fb101>] ? thread_return+0x3e/0x64d
[<ffffffff810f55e1>] ? sys_ioctl+0xa1/0xb0
[<ffffffff8100bf02>] ? system_call_fastpath+0x16/0x1b
Code: 4d 38 48 8b 45 10 48 85 c0 74 04 48 8b 50 40 48 8d 86 60 02 00
00 48 c7 c7 a8 66 76 81 48 89 04 24 48 89 ee 31 c0 e8 a9 11 50 00 <0f>
0b eb fe 0f 0b eb fe 0f 1f 84 00 00 00 00 00 53 48 89 fb 48
RIP  [<ffffffff810f95f0>] shrink_dcache_for_umount_subtree+0x280/0x290
RSP <ffff88066670dcf8>
---[ end trace 3cc1cb65fcc6a8ca ]---

Here is another trace showing the same behavior, from a newly compiled
kernel with more debug options enabled; I can't see any difference:

BUG: Dentry ffff880667556738{i=41a46,n=sleep} still in use (8)
[unmount of ext3 dm-4]
------------[ cut here ]------------
kernel BUG at fs/dcache.c:670!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/block/dm-3/removable
CPU 1
Modules linked in: i5k_amb(+) button hwmon processor thermal fan [last
unloaded: scsi_wait_scan]
Pid: 3315, comm: kpartx Not tainted 2.6.32.3 #3 PowerEdge 2950
RIP: 0010:[<ffffffff810f95f0>]  [<ffffffff810f95f0>]
shrink_dcache_for_umount_subtree+0x280/0x290
RSP: 0018:ffff880667089cf8  EFLAGS: 00010296
RAX: 000000000000005c RBX: ffff880667790a60 RCX: 0000000000000096
RDX: 0000000000006767 RSI: 0000000000000046 RDI: 0000000000000246
RBP: ffff880667556738 R08: 0000000000000000 R09: ffff88066604b420
R10: 0000000000000000 R11: 0000000000000000 R12: ffff880667556798
R13: 0000000000000007 R14: ffff880665842360 R15: 0000000000b3c0b0
FS:  00007f7b1006c770(0000) GS:ffff880028240000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007f6e67f1c350 CR3: 0000000664ff1000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kpartx (pid: 3315, threadinfo ffff880667088000, task ffff880664f55f40)
Stack:
ffff880667058af0 ffff880667058890 ffffffff81619600 0000000000000001
<0> ffff880667408e00 ffffffff810f9629 ffff880667058890 ffffffff810e8049
<0> ffff88067f83e758 ffff880667408e00 ffffffff8185fc00 ffffffff810e8159
Call Trace:
[<ffffffff810f9629>] ? shrink_dcache_for_umount+0x29/0x50
[<ffffffff810e8049>] ? generic_shutdown_super+0x19/0x100
[<ffffffff810e8159>] ? kill_block_super+0x29/0x50
[<ffffffff810e8238>] ? deactivate_locked_super+0x58/0x80
[<ffffffff81112842>] ? thaw_bdev+0xd2/0x110
[<ffffffff814b0c67>] ? dm_resume+0xf7/0x160
[<ffffffff814b5f00>] ? dev_suspend+0x0/0x220
[<ffffffff814b60b1>] ? dev_suspend+0x1b1/0x220
[<ffffffff814b6c7b>] ? ctl_ioctl+0x1eb/0x260
[<ffffffff810c0b1b>] ? handle_mm_fault+0x63b/0x990
[<ffffffff814b6cfe>] ? dm_ctl_ioctl+0xe/0x20
[<ffffffff8104991a>] ? finish_task_switch+0x3a/0xc0
[<ffffffff810f4e9f>] ? vfs_ioctl+0x2f/0xb0
[<ffffffff810f53bb>] ? do_vfs_ioctl+0x3fb/0x580
[<ffffffff815fb101>] ? thread_return+0x3e/0x64d
[<ffffffff810f55e1>] ? sys_ioctl+0xa1/0xb0
[<ffffffff8100bf02>] ? system_call_fastpath+0x16/0x1b
Code: 4d 38 48 8b 45 10 48 85 c0 74 04 48 8b 50 40 48 8d 86 60 02 00
00 48 c7 c7 a8 66 76 81 48 89 04 24 48 89 ee 31 c0 e8 a9 11 50 00 <0f>
0b eb fe 0f 0b eb fe 0f 1f 84 00 00 00 00 00 53 48 89 fb 48
RIP  [<ffffffff810f95f0>] shrink_dcache_for_umount_subtree+0x280/0x290
RSP <ffff880667089cf8>
---[ end trace a9fb3c2286e56cbd ]---


I think the problem is related to LVM or device mapper, because
I could boot a 2.6.32.2 kernel without any trouble on another PowerEdge 2950
that has no LVM or dm configured...
but I'm really no expert in kernel debugging.

Here is the fstab of the affected system:

# /etc/fstab: static file system information.
#
# <file system> <mount point>   <type>  <options>       <dump>  <pass>
proc            /proc           proc    defaults        0       0
/dev/dm-4       /               ext3    errors=remount-ro 0       1
/dev/dm-1       /boot           ext3    defaults        0       2
/dev/dm-7       /home           ext3    defaults        0       2
/dev/dm-5       /usr            ext3    defaults        0       2
/dev/dm-6       /var            ext3    defaults        0       2
/dev/dm-2       none            swap    sw              0       0
/dev/hda        /media/cdrom0   udf,iso9660 user,noauto     0       0
debugfs /sys/kernel/debug debugfs noauto 0 0

I hope this helps; I will try to provide more information if necessary.

François.


* Re: [BUG] kernel 2.6.32.x hangs during boot process
  2010-01-16  9:58 [BUG] kernel 2.6.32.x hangs during boot process François Figarola
@ 2010-01-23  0:07 ` Andrew Morton
  2010-01-28  2:42   ` Neil Brown
  0 siblings, 1 reply; 9+ messages in thread
From: Andrew Morton @ 2010-01-23  0:07 UTC (permalink / raw)
  To: François Figarola; +Cc: linux-kernel, Neil Brown, linux-raid, Al Viro

(cc's added)

On Sat, 16 Jan 2010 10:58:30 +0100
François Figarola <francois.figarola@i-consult.fr> wrote:

> BUG: Dentry ffff880667690000{i=41a46,n=sleep} still in use (8)
> [unmount of ext3 dm-4]
> ------------[ cut here ]------------
> kernel BUG at fs/dcache.c:670!

That's

			if (atomic_read(&dentry->d_count) != 0) {
				printk(KERN_ERR
				       "BUG: Dentry %p{i=%lx,n=%s}"
				       " still in use (%d)"
				       " [unmount of %s %s]\n",
				       dentry,
				       dentry->d_inode ?
				       dentry->d_inode->i_ino : 0UL,
				       dentry->d_name.name,
				       atomic_read(&dentry->d_count),
				       dentry->d_sb->s_type->name,
				       dentry->d_sb->s_id);
				BUG();
			}
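
As a reading aid, the format string above can be turned around to decode
the reported line. The following is a minimal userspace sketch, not kernel
code; the struct and function names (busy_dentry_report, parse_busy_dentry)
are made up for illustration, and it assumes %p printed the address as bare
hex digits, as it appears in the report.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical container for the fields of the "still in use" message. */
struct busy_dentry_report {
    unsigned long long dentry;  /* %p:  address of the dentry */
    unsigned long ino;          /* %lx: inode number, printed in hex */
    char name[32];              /* %s:  dentry name */
    int count;                  /* %d:  d_count at unmount time */
    char fstype[16];            /* %s:  filesystem type name */
    char dev[16];               /* %s:  superblock id (device name) */
};

/* Returns 1 if the line matched the message format, 0 otherwise. */
int parse_busy_dentry(const char *line, struct busy_dentry_report *out)
{
    assert(line != NULL && out != NULL);
    /* Whitespace in the format also matches the line break that mail
     * wrapping inserted before "[unmount of ...]". */
    int n = sscanf(line,
                   "BUG: Dentry %llx{i=%lx,n=%31[^}]} still in use (%d)"
                   " [unmount of %15s %15[^]]]",
                   &out->dentry, &out->ino, out->name, &out->count,
                   out->fstype, out->dev);
    return n == 6;
}
```

Fed the first reported line, this yields the name "sleep", inode 0x41a46,
a reference count of 8, and filesystem ext3 on dm-4, which the fstab in
the report lists as /.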

I'm a bit surprised that the system is doing a dm suspend/resume during
the boot process.

I assume it's a DM bug, dunno.



* Re: [BUG] kernel 2.6.32.x hangs during boot process
  2010-01-23  0:07 ` Andrew Morton
@ 2010-01-28  2:42   ` Neil Brown
  2010-01-28  6:32     ` [dm-devel] " Jun'ichi Nomura
  0 siblings, 1 reply; 9+ messages in thread
From: Neil Brown @ 2010-01-28  2:42 UTC (permalink / raw)
  To: Andrew Morton
  Cc: François Figarola, linux-kernel, linux-raid, Al Viro, dm-devel

On Fri, 22 Jan 2010 16:07:40 -0800
Andrew Morton <akpm@linux-foundation.org> wrote:

> (cc's added)
(another cc added, one that might actually be useful.....)

> I'm a bit surprised that the system is doing a dm suspend/resume during
> the boot process.

It could be that a dm_resume is how you activate a dm device once it is
built, but I'm not sure....
Maybe the guys on dm-devel can help.

NeilBrown



* Re: [dm-devel] [BUG] kernel 2.6.32.x hangs during boot process
  2010-01-28  2:42   ` Neil Brown
@ 2010-01-28  6:32     ` Jun'ichi Nomura
  2010-01-28 18:16       ` Thomas Backlund
                         ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Jun'ichi Nomura @ 2010-01-28  6:32 UTC (permalink / raw)
  To: François Figarola, hch
  Cc: device-mapper development, linux-kernel, Neil Brown,
	Andrew Morton, linux-raid, Al Viro

>> On Sat, 16 Jan 2010 10:58:30 +0100
>> François Figarola <francois.figarola@i-consult.fr> wrote:
>>> BUG: Dentry ffff880667690000{i=41a46,n=sleep} still in use (8)
>>> [unmount of ext3 dm-4]
>>> ------------[ cut here ]------------
>>> kernel BUG at fs/dcache.c:670!

I can reproduce this by suspending and resuming a read-only mounted dm device.

When MS_RDONLY is set, both freeze_bdev and thaw_bdev call deactivate_locked_super,
which seems wrong. The change was introduced by the commit below:

  commit 4504230a71566785a05d3e6b53fa1ee071b864eb
  Author: Christoph Hellwig <hch@lst.de>
  Date:   Mon Aug 3 23:28:35 2009 +0200

  freeze_bdev: grab active reference to frozen superblocks

With the attached patch, both remount-ro and remount-rw are
rejected with EBUSY on a frozen device, as expected.

Christoph, do you think this is the right fix?
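
To make the imbalance easier to see, here is a minimal userspace model of
the reference counting only. It is a sketch under assumptions, not the
kernel code: the names (model_sb, model_freeze_buggy, model_freeze_fixed,
model_thaw, model_remount) are hypothetical, and it assumes that freeze
takes one active reference up front, as the commit title suggests, and
that the mount itself holds one.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the parts of a superblock that matter here. */
struct model_sb {
    int  s_active;    /* active references; the mount itself holds one */
    bool read_only;   /* models MS_RDONLY */
    bool frozen;      /* models s_frozen != SB_UNFROZEN */
    bool shut_down;   /* models the filesystem having been shut down */
};

/* Models deactivate_locked_super(): the last reference shuts the fs down. */
static void model_deactivate(struct model_sb *sb)
{
    assert(sb->s_active > 0);
    if (--sb->s_active == 0)
        sb->shut_down = true;
}

/* Freeze as described above: the read-only path gives its reference back. */
void model_freeze_buggy(struct model_sb *sb)
{
    sb->s_active++;              /* reference taken on freeze */
    if (sb->read_only) {
        model_deactivate(sb);    /* ...and dropped again before returning */
        return;
    }
    sb->frozen = true;
}

/* Freeze with the patch: keep the reference and record the frozen state. */
void model_freeze_fixed(struct model_sb *sb)
{
    sb->s_active++;
    sb->frozen = true;
}

/* Thaw drops one active reference in either variant. */
void model_thaw(struct model_sb *sb)
{
    sb->frozen = false;
    model_deactivate(sb);
}

/* Remount is refused while the superblock is marked frozen. */
int model_remount(const struct model_sb *sb)
{
    return sb->frozen ? -EBUSY : 0;
}
```

Starting from a mounted read-only superblock (s_active == 1), the buggy
freeze followed by a thaw ends at s_active == 0 and shuts the filesystem
down while it is still mounted, which matches thaw_bdev leading into
generic_shutdown_super in the trace. With the fixed freeze the count goes
2 and back to 1, and a remount attempted in between returns -EBUSY.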

-- 
Jun'ichi Nomura, NEC Corporation


If MS_RDONLY, freeze_bdev should just up_write(s_umount) instead of
deactivate_locked_super().
Also, keep sb->s_frozen consistent so that remount can check the frozen state.

Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 73d6a73..600261f 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -246,7 +246,9 @@ struct super_block *freeze_bdev(struct block_device *bdev)
 	if (!sb)
 		goto out;
 	if (sb->s_flags & MS_RDONLY) {
-		deactivate_locked_super(sb);
+		sb->s_frozen = SB_FREEZE_TRANS;
+		smp_wmb();
+		up_write(&sb->s_umount);
 		mutex_unlock(&bdev->bd_fsfreeze_mutex);
 		return sb;
 	}
@@ -307,7 +309,7 @@ int thaw_bdev(struct block_device *bdev, struct super_block *sb)
 	BUG_ON(sb->s_bdev != bdev);
 	down_write(&sb->s_umount);
 	if (sb->s_flags & MS_RDONLY)
-		goto out_deactivate;
+		goto out_unfrozen;
 
 	if (sb->s_op->unfreeze_fs) {
 		error = sb->s_op->unfreeze_fs(sb);
@@ -321,11 +323,11 @@ int thaw_bdev(struct block_device *bdev, struct super_block *sb)
 		}
 	}
 
+out_unfrozen:
 	sb->s_frozen = SB_UNFROZEN;
 	smp_wmb();
 	wake_up(&sb->s_wait_unfrozen);
 
-out_deactivate:
 	if (sb)
 		deactivate_locked_super(sb);
 out_unlock:


* Re: [dm-devel] [BUG] kernel 2.6.32.x hangs during boot process
  2010-01-28  6:32     ` [dm-devel] " Jun'ichi Nomura
@ 2010-01-28 18:16       ` Thomas Backlund
  2010-01-28 18:25       ` Christoph Hellwig
  2010-01-29  7:06       ` [dm-devel] [BUG] kernel 2.6.32.x hangs during boot process François Figarola
  2 siblings, 0 replies; 9+ messages in thread
From: Thomas Backlund @ 2010-01-28 18:16 UTC (permalink / raw)
  To: Jun'ichi Nomura
  Cc: François Figarola, hch, device-mapper development,
	linux-kernel, Neil Brown, Andrew Morton, linux-raid, Al Viro

28.01.2010 08:32, Jun'ichi Nomura wrote:
>>> On Sat, 16 Jan 2010 10:58:30 +0100
>>> François Figarola <francois.figarola@i-consult.fr> wrote:
>>>> BUG: Dentry ffff880667690000{i=41a46,n=sleep} still in use (8)
>>>> [unmount of ext3 dm-4]
>>>> ------------[ cut here ]------------
>>>> kernel BUG at fs/dcache.c:670!
>
> I can reproduce this by suspending and resuming a read-only mounted dm device.
>
> When MS_RDONLY is set, both freeze_bdev and thaw_bdev call deactivate_locked_super,
> which seems wrong. The change was introduced by the commit below:
>
>    commit 4504230a71566785a05d3e6b53fa1ee071b864eb
>    Author: Christoph Hellwig <hch@lst.de>
>    Date:   Mon Aug 3 23:28:35 2009 +0200
>
>    freeze_bdev: grab active reference to frozen superblocks
>
> With the attached patch, both remount-ro and remount-rw are
> rejected with EBUSY on a frozen device, as expected.
>
> Christoph, do you think this is the right fix?
>

I can confirm that either reverting the above commit or applying the fix
below resolves the issue on both 2.6.32 and 2.6.33-rc5.

So if it's considered the correct fix, it needs to be cc'ed to stable@ for 2.6.32.

(I reported this same issue this morning here:
  http://marc.info/?l=linux-kernel&m=126467195500908&w=2,
  but then I found this thread/fix)

The system I tested on is a 4-disk dmraid10 connected to an Intel
ICH10R on an Asus P7P55D Deluxe, running x86_64.

> Jun'ichi Nomura, NEC Corporation
>
>
> If MS_RDONLY, freeze_bdev should just up_write(s_umount) instead of
> deactivate_locked_super().
> Also, keep sb->s_frozen consistent so that remount can check the frozen state.
>
> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
   Tested-by: Thomas Backlund <tmb@mandriva.org>
>
> diff --git a/fs/block_dev.c b/fs/block_dev.c
> index 73d6a73..600261f 100644
> --- a/fs/block_dev.c
> +++ b/fs/block_dev.c
> @@ -246,7 +246,9 @@ struct super_block *freeze_bdev(struct block_device *bdev)
>  	if (!sb)
>  		goto out;
>  	if (sb->s_flags & MS_RDONLY) {
> -		deactivate_locked_super(sb);
> +		sb->s_frozen = SB_FREEZE_TRANS;
> +		smp_wmb();
> +		up_write(&sb->s_umount);
>  		mutex_unlock(&bdev->bd_fsfreeze_mutex);
>  		return sb;
>  	}
> @@ -307,7 +309,7 @@ int thaw_bdev(struct block_device *bdev, struct super_block *sb)
>  	BUG_ON(sb->s_bdev != bdev);
>  	down_write(&sb->s_umount);
>  	if (sb->s_flags & MS_RDONLY)
> -		goto out_deactivate;
> +		goto out_unfrozen;
>
>  	if (sb->s_op->unfreeze_fs) {
>  		error = sb->s_op->unfreeze_fs(sb);
> @@ -321,11 +323,11 @@ int thaw_bdev(struct block_device *bdev, struct super_block *sb)
>  		}
>  	}
>
> +out_unfrozen:
>  	sb->s_frozen = SB_UNFROZEN;
>  	smp_wmb();
>  	wake_up(&sb->s_wait_unfrozen);
>
> -out_deactivate:
>  	if (sb)
>  		deactivate_locked_super(sb);
>  out_unlock:

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [dm-devel] [BUG] kernel 2.6.32.x hangs during boot process
  2010-01-28  6:32     ` [dm-devel] " Jun'ichi Nomura
  2010-01-28 18:16       ` Thomas Backlund
@ 2010-01-28 18:25       ` Christoph Hellwig
  2010-01-29  0:56         ` [BUGFIX] [PATCH] freeze_bdev: don't deactivate successfully frozen MS_RDONLY sb Jun'ichi Nomura
  2010-01-29  7:06       ` [dm-devel] [BUG] kernel 2.6.32.x hangs during boot process François Figarola
  2 siblings, 1 reply; 9+ messages in thread
From: Christoph Hellwig @ 2010-01-28 18:25 UTC (permalink / raw)
  To: Jun'ichi Nomura
  Cc: François Figarola, hch, device-mapper development, linux-kernel,
	Neil Brown, Andrew Morton, linux-raid, Al Viro

On Thu, Jan 28, 2010 at 03:32:41PM +0900, Jun'ichi Nomura wrote:
> When MS_RDONLY, both freeze_bdev and thaw_bdev call deactivate_locked_super,
> which seems wrong. The change was introduced with the commit below:
> 
>   commit 4504230a71566785a05d3e6b53fa1ee071b864eb
>   Author: Christoph Hellwig <hch@lst.de>
>   Date:   Mon Aug 3 23:28:35 2009 +0200
> 
>   freeze_bdev: grab active reference to frozen superblocks
> 
> With the attached patch, both remount-ro and remount-rw are
> rejected as EBUSY on freezed device as expected.
> 
> Christoph, do you think this is the right fix?

Indeed, this looks wrong in my original code, and the patch looks like
the correct fix.  Thanks a lot!


Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [BUGFIX] [PATCH] freeze_bdev: don't deactivate successfully frozen MS_RDONLY sb
  2010-01-28 18:25       ` Christoph Hellwig
@ 2010-01-29  0:56         ` Jun'ichi Nomura
  2010-01-30 18:44           ` Thomas Backlund
  0 siblings, 1 reply; 9+ messages in thread
From: Jun'ichi Nomura @ 2010-01-29  0:56 UTC (permalink / raw)
  To: Christoph Hellwig, linux-kernel, tmb
  Cc: François Figarola, device-mapper development, Neil Brown,
	Andrew Morton, linux-raid, Al Viro, stable

Thanks Thomas and Christoph for testing and review.
I removed 'smp_wmb()' before up_write from the previous patch,
since up_write() should have necessary ordering constraints.
(I.e. the change of s_frozen is visible to others after up_write)
I'm quite sure the change is harmless but if you are uncomfortable
with Tested-by/Reviewed-by on the modified patch, please remove them.


If MS_RDONLY, freeze_bdev should just up_write(s_umount) instead of
deactivate_locked_super().
Also, keep sb->s_frozen consistent so that remount can check the frozen state.

Otherwise a crash reported here can happen:
http://lkml.org/lkml/2010/1/16/37
http://lkml.org/lkml/2010/1/28/53


This patch should be applied for 2.6.32 stable series, too.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Thomas Backlund <tmb@mandriva.org> 
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> 
Cc: stable@kernel.org

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 73d6a73..d11d028 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -246,7 +246,8 @@ struct super_block *freeze_bdev(struct block_device *bdev)
 	if (!sb)
 		goto out;
 	if (sb->s_flags & MS_RDONLY) {
-		deactivate_locked_super(sb);
+		sb->s_frozen = SB_FREEZE_TRANS;
+		up_write(&sb->s_umount);
 		mutex_unlock(&bdev->bd_fsfreeze_mutex);
 		return sb;
 	}
@@ -307,7 +308,7 @@ int thaw_bdev(struct block_device *bdev, struct super_block *sb)
 	BUG_ON(sb->s_bdev != bdev);
 	down_write(&sb->s_umount);
 	if (sb->s_flags & MS_RDONLY)
-		goto out_deactivate;
+		goto out_unfrozen;
 
 	if (sb->s_op->unfreeze_fs) {
 		error = sb->s_op->unfreeze_fs(sb);
@@ -321,11 +322,11 @@ int thaw_bdev(struct block_device *bdev, struct super_block *sb)
 		}
 	}
 
+out_unfrozen:
 	sb->s_frozen = SB_UNFROZEN;
 	smp_wmb();
 	wake_up(&sb->s_wait_unfrozen);
 
-out_deactivate:
 	if (sb)
 		deactivate_locked_super(sb);
 out_unlock:
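
The reference imbalance the patch removes can be illustrated with a small
user-space model. This is only a sketch, not kernel code: it assumes
freeze_bdev() takes exactly one active reference on the superblock (as the
title of commit 4504230a suggests) and that the mount itself holds one. All
names (model_sb, freeze_old, thaw_old, freeze_new, thaw_new,
active_after_cycle) are hypothetical stand-ins, and locking is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for SB_UNFROZEN / SB_FREEZE_TRANS. */
enum model_frozen { MODEL_UNFROZEN, MODEL_FREEZE_TRANS };

/* Hypothetical, heavily reduced stand-in for struct super_block. */
struct model_sb {
	int s_active;               /* active references (mount holds one) */
	bool read_only;             /* stands in for MS_RDONLY */
	enum model_frozen s_frozen; /* frozen state checked by remount */
};

/* Models taking an active reference (assumed to happen in freeze_bdev). */
static void model_get_active(struct model_sb *sb)
{
	sb->s_active++;
}

/* Models deactivate_locked_super(): drops one active reference. */
static void model_deactivate(struct model_sb *sb)
{
	assert(sb->s_active > 0);
	sb->s_active--;
}

/* Before the patch: the read-only path drops the reference at freeze time. */
static void freeze_old(struct model_sb *sb)
{
	model_get_active(sb);
	if (sb->read_only) {
		model_deactivate(sb);
		return;
	}
	sb->s_frozen = MODEL_FREEZE_TRANS;
}

/* Before the patch: thaw drops a reference again, even when read-only. */
static void thaw_old(struct model_sb *sb)
{
	if (!sb->read_only)
		sb->s_frozen = MODEL_UNFROZEN;
	model_deactivate(sb);
}

/* After the patch: freeze keeps the reference and records the frozen state. */
static void freeze_new(struct model_sb *sb)
{
	model_get_active(sb);
	sb->s_frozen = MODEL_FREEZE_TRANS;
}

/* After the patch: thaw always resets the state, then drops the reference. */
static void thaw_new(struct model_sb *sb)
{
	sb->s_frozen = MODEL_UNFROZEN;
	model_deactivate(sb);
}

/* Runs one freeze/thaw cycle on a mounted sb and returns the remaining count. */
int active_after_cycle(bool use_fix, bool read_only)
{
	struct model_sb sb = {
		.s_active = 1,
		.read_only = read_only,
		.s_frozen = MODEL_UNFROZEN,
	};

	if (use_fix) {
		freeze_new(&sb);
		thaw_new(&sb);
	} else {
		freeze_old(&sb);
		thaw_old(&sb);
	}
	return sb.s_active;
}
```

In this model, one freeze/thaw cycle with the old read-only path leaves the
count one lower than it started, so the mount's own reference is gone while
the filesystem is still in use. With the patched paths the count returns to
its starting value for both read-only and read-write superblocks.
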

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [dm-devel] [BUG] kernel 2.6.32.x hangs during boot process
  2010-01-28  6:32     ` [dm-devel] " Jun'ichi Nomura
  2010-01-28 18:16       ` Thomas Backlund
  2010-01-28 18:25       ` Christoph Hellwig
@ 2010-01-29  7:06       ` François Figarola
  2 siblings, 0 replies; 9+ messages in thread
From: François Figarola @ 2010-01-29  7:06 UTC (permalink / raw)
  To: Jun'ichi Nomura
  Cc: hch, device-mapper development, linux-kernel, Neil Brown,
	Andrew Morton, linux-raid, Al Viro

Jun'ichi Nomura wrote:
>>> On Sat, 16 Jan 2010 10:58:30 +0100
>>> François Figarola <francois.figarola@i-consult.fr> wrote:
>>>       
>>>> Since I've tried to boot 2.6.32.x kernel, my system hangs during the
>>>> boot process, and I think it could be related to the problem reported
>>>> earlier by Megastorage (http://lkml.org/lkml/2010/1/10/92).
>>>>
>>>> The hardware is a Dell PowerEdge 2950 which runs fine with the
>>>> 2.6.31.x kernel series (actually running with the latest 2.6.31.11),
>>>> and the system is debian etch.
>>>>
>>>> Here is the trace of the bug I've got (using netconsole) with a
>>>> 2.6.32.3 kernel :
>>>>
>>>> BUG: Dentry ffff880667690000{i=41a46,n=sleep} still in use (8)
>>>> [unmount of ext3 dm-4]
>>>> ------------[ cut here ]------------
>>>> kernel BUG at fs/dcache.c:670!
>>>>         
>
> I can reproduce this when suspend/resume read-only mounted dm device.
>
> When MS_RDONLY, both freeze_bdev and thaw_bdev call deactivate_locked_super,
> which seems wrong. The change was introduced with the commit below:
>
>   commit 4504230a71566785a05d3e6b53fa1ee071b864eb
>   Author: Christoph Hellwig <hch@lst.de>
>   Date:   Mon Aug 3 23:28:35 2009 +0200
>
>   freeze_bdev: grab active reference to frozen superblocks
>
> With the attached patch, both remount-ro and remount-rw are
> rejected as EBUSY on freezed device as expected.
>
> Christoph, do you think this is the right fix?
>
>   
With the fix from Jun'ichi Nomura, a 2.6.32.5 kernel
now boots correctly.

Thanks.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [BUGFIX] [PATCH] freeze_bdev: don't deactivate successfully frozen MS_RDONLY sb
  2010-01-29  0:56         ` [BUGFIX] [PATCH] freeze_bdev: don't deactivate successfully frozen MS_RDONLY sb Jun'ichi Nomura
@ 2010-01-30 18:44           ` Thomas Backlund
  0 siblings, 0 replies; 9+ messages in thread
From: Thomas Backlund @ 2010-01-30 18:44 UTC (permalink / raw)
  To: Jun'ichi Nomura
  Cc: Christoph Hellwig, linux-kernel, tmb, François Figarola,
	device-mapper development, Neil Brown, Andrew Morton, linux-raid,
	Al Viro, stable

29.01.2010 02:56, Jun'ichi Nomura wrote:
> Thanks Thomas and Christoph for testing and review.
> I removed 'smp_wmb()' before up_write from the previous patch,
> since up_write() should have necessary ordering constraints.
> (I.e. the change of s_frozen is visible to others after up_write)
> I'm quite sure the change is harmless but if you are uncomfortable
> with Tested-by/Reviewed-by on the modified patch, please remove them.
>

I've just verified that this patch works as intended on both 2.6.32 and 
2.6.33-rc6, so for me it's still OK.
>
> If MS_RDONLY, freeze_bdev should just up_write(s_umount) instead of
> deactivate_locked_super().
> Also, keep sb->s_frozen consistent so that remount can check the frozen state.
>
> Otherwise a crash reported here can happen:
> http://lkml.org/lkml/2010/1/16/37
> http://lkml.org/lkml/2010/1/28/53
>
>
> This patch should be applied for 2.6.32 stable series, too.
>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Tested-by: Thomas Backlund <tmb@mandriva.org>
> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
> Cc: stable@kernel.org
>
> diff --git a/fs/block_dev.c b/fs/block_dev.c
> index 73d6a73..d11d028 100644
> --- a/fs/block_dev.c
> +++ b/fs/block_dev.c
> @@ -246,7 +246,8 @@ struct super_block *freeze_bdev(struct block_device *bdev)
>   	if (!sb)
>   		goto out;
>   	if (sb->s_flags & MS_RDONLY) {
> -		deactivate_locked_super(sb);
> +		sb->s_frozen = SB_FREEZE_TRANS;
> +		up_write(&sb->s_umount);
>   		mutex_unlock(&bdev->bd_fsfreeze_mutex);
>   		return sb;
>   	}
> @@ -307,7 +308,7 @@ int thaw_bdev(struct block_device *bdev, struct super_block *sb)
>   	BUG_ON(sb->s_bdev != bdev);
>   	down_write(&sb->s_umount);
>   	if (sb->s_flags & MS_RDONLY)
> -		goto out_deactivate;
> +		goto out_unfrozen;
>
>   	if (sb->s_op->unfreeze_fs) {
>   		error = sb->s_op->unfreeze_fs(sb);
> @@ -321,11 +322,11 @@ int thaw_bdev(struct block_device *bdev, struct super_block *sb)
>   		}
>   	}
>
> +out_unfrozen:
>   	sb->s_frozen = SB_UNFROZEN;
>   	smp_wmb();
>   	wake_up(&sb->s_wait_unfrozen);
>
> -out_deactivate:
>   	if (sb)
>   		deactivate_locked_super(sb);
>   out_unlock:
> .
>


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2010-01-30 18:45 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-01-16  9:58 [BUG] kernel 2.6.32.x hangs during boot process François Figarola
2010-01-23  0:07 ` Andrew Morton
2010-01-28  2:42   ` Neil Brown
2010-01-28  6:32     ` [dm-devel] " Jun'ichi Nomura
2010-01-28 18:16       ` Thomas Backlund
2010-01-28 18:25       ` Christoph Hellwig
2010-01-29  0:56         ` [BUGFIX] [PATCH] freeze_bdev: don't deactivate successfully frozen MS_RDONLY sb Jun'ichi Nomura
2010-01-30 18:44           ` Thomas Backlund
2010-01-29  7:06       ` [dm-devel] [BUG] kernel 2.6.32.x hangs during boot process François Figarola
