All of lore.kernel.org
 help / color / mirror / Atom feed
* Ext4 deadlock (+lockdep splat) during rsync
@ 2017-01-08 22:41 George Spelvin
  2017-01-10  3:40 ` Theodore Ts'o
  2017-01-19 17:37 ` Jan Kara
  0 siblings, 2 replies; 13+ messages in thread
From: George Spelvin @ 2017-01-08 22:41 UTC (permalink / raw)
  To: linux-ext4; +Cc: linux

After replacing a drive in a RAID array, I tried to bring some things
up to date with rsync and ran into an annoyingly repeatable deadlock.

So I found a chance to boot with a lockdep kernel and immediately turned up the following:

[  755.740865] =============================================
[  755.741072] [ INFO: possible recursive locking detected ]
[  755.741279] 4.9.1-00126-gfbb9fcc9-dirty #576 Not tainted
[  755.741489] ---------------------------------------------
[  755.741699] rsync/14818 is trying to acquire lock:
[  755.741907]  (&ei->xattr_sem){++++..}, at: [<ffffffff81234603>] ext4_expand_extra_isize_ea+0x63/0x850
[  755.742145] but task is already holding lock:
[  755.742742]  (&ei->xattr_sem){++++..}, at: [<ffffffff81236f95>] ext4_try_add_inline_entry+0x55/0x1a0
[  755.743102] other info that might help us debug this:
[  755.743802]  Possible unsafe locking scenario:
[  755.743802]        CPU0
[  755.743802]        ----
[  755.743802]   lock(&ei->xattr_sem);
[  755.743802]   lock(&ei->xattr_sem);
[  755.743802] *** DEADLOCK ***
[  755.743802]  May be due to missing lock nesting notation
[  755.743802] 4 locks held by rsync/14818:
[  755.743802]  #0:  (sb_writers#3){.+.+.+}, at: [<ffffffff811a0fef>] mnt_want_write+0x1f/0x50
[  755.743802]  #1:  (&type->i_mutex_dir_key){++++++}, at: [<ffffffff8118b058>] path_openat+0x2f8/0x9f0
[  755.743802]  #2:  (jbd2_handle){++++..}, at: [<ffffffff81239aa6>] start_this_handle+0x196/0x540
[  755.743802]  #3:  (&ei->xattr_sem){++++..}, at: [<ffffffff81236f95>] ext4_try_add_inline_entry+0x55/0x1a0
[  755.743802] stack backtrace:
[  755.743802] CPU: 0 PID: 14818 Comm: rsync Not tainted 4.9.1-00126-gfbb9fcc9-dirty #576
[  755.743802] Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./X79-UP4, BIOS F7 03/19/2014
[  755.743802]  ffffc9000c273820 ffffffff812a6d05 ffffffff8253a080 ffffffff8253a080
[  755.743802]  ffffc9000c2738d8 ffffffff810c7eab ffffc9000c272000 ffffc90000000004
[  755.743802]  0000000000000000 ffffffff81e0b100 1a883a7e30ec461a ffffffff8253a080
[  755.743802] Call Trace:
[  755.743802]  [<ffffffff812a6d05>] dump_stack+0x68/0x93
[  755.743802]  [<ffffffff810c7eab>] __lock_acquire+0x7ab/0x1270
[  755.743802]  [<ffffffff810c8d10>] lock_acquire+0x60/0x80
[  755.743802]  [<ffffffff81234603>] ? ext4_expand_extra_isize_ea+0x63/0x850
[  755.743802]  [<ffffffff81793a44>] down_write+0x44/0x80
[  755.743802]  [<ffffffff81234603>] ? ext4_expand_extra_isize_ea+0x63/0x850
[  755.743802]  [<ffffffff81234603>] ext4_expand_extra_isize_ea+0x63/0x850
[  755.743802]  [<ffffffff81795ec2>] ? _raw_read_unlock+0x22/0x30
[  755.743802]  [<ffffffff8123a542>] ? jbd2_journal_extend+0x132/0x1b0
[  755.743802]  [<ffffffff812002c9>] ext4_mark_inode_dirty+0x129/0x180
[  755.743802]  [<ffffffff81235d64>] ext4_add_dirent_to_inline.isra.16+0xe4/0x100
[  755.743802]  [<ffffffff81236fd9>] ext4_try_add_inline_entry+0x99/0x1a0
[  755.743802]  [<ffffffff8120b102>] ext4_add_entry+0x1d2/0x370
[  755.743802]  [<ffffffff8120b2b9>] ext4_add_nondir+0x19/0x70
[  755.743802]  [<ffffffff8120b523>] ext4_create+0xc3/0x150
[  755.743802]  [<ffffffff8118aaf8>] lookup_open+0x3d8/0x640
[  755.743802]  [<ffffffff8118b072>] path_openat+0x312/0x9f0
[  755.743802]  [<ffffffff8118d309>] do_filp_open+0x79/0xd0
[  755.743802]  [<ffffffff81795bc2>] ? _raw_spin_unlock+0x22/0x30
[  755.743802]  [<ffffffff8119d9b3>] ? __alloc_fd+0xf3/0x200
[  755.743802]  [<ffffffff8117b4fe>] do_sys_open+0x11e/0x1f0
[  755.743802]  [<ffffffff811d2046>] compat_SyS_open+0x16/0x20
[  755.743802]  [<ffffffff810027e4>] do_fast_syscall_32+0x94/0x210
[  755.743802]  [<ffffffff81797da1>] entry_SYSENTER_compat+0x51/0x60

Followed by repetitions of 
[ 1105.549736] INFO: task rsync:14818 blocked for more than 120 seconds.
[ 1105.550089]       Not tainted 4.9.1-00126-gfbb9fcc9-dirty #576
[ 1105.550436] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1105.550801] rsync           D    0 14818      1 0x20020004
[ 1105.551173]  ffff88040f7f1900 ffff88042d660000 0000000000000000 ffff8804082a2b00
[ 1105.551552]  ffff88042f216298 ffffc9000c273820 ffffffff8178fc08 0000000000000046
[ 1105.551933]  ffff8804082a3060 ffff88042f216298 ffff88042f216280 ffff8804082a2b00
[ 1105.552311] Call Trace:
[ 1105.552678]  [<ffffffff8178fc08>] ? __schedule+0x2b8/0x840
[ 1105.553051]  [<ffffffff817901c8>] schedule+0x38/0x90
[ 1105.553438]  [<ffffffff817945b5>] ? rwsem_down_write_failed+0x2a5/0x450
[ 1105.553810]  [<ffffffff817945ba>] rwsem_down_write_failed+0x2aa/0x450
[ 1105.554186]  [<ffffffff81794360>] ? rwsem_down_write_failed+0x50/0x450
[ 1105.554571]  [<ffffffff812b41d7>] call_rwsem_down_write_failed+0x17/0x30
[ 1105.554953]  [<ffffffff812a6d21>] ? dump_stack+0x84/0x93
[ 1105.555335]  [<ffffffff81793a5f>] down_write+0x5f/0x80
[ 1105.555720]  [<ffffffff81234603>] ? ext4_expand_extra_isize_ea+0x63/0x850
[ 1105.556111]  [<ffffffff81234603>] ext4_expand_extra_isize_ea+0x63/0x850
[ 1105.556502]  [<ffffffff81795ec2>] ? _raw_read_unlock+0x22/0x30
[ 1105.556900]  [<ffffffff8123a542>] ? jbd2_journal_extend+0x132/0x1b0
[ 1105.557282]  [<ffffffff812002c9>] ext4_mark_inode_dirty+0x129/0x180
[ 1105.557675]  [<ffffffff81235d64>] ext4_add_dirent_to_inline.isra.16+0xe4/0x100
[ 1105.558065]  [<ffffffff81236fd9>] ext4_try_add_inline_entry+0x99/0x1a0
[ 1105.558448]  [<ffffffff8120b102>] ext4_add_entry+0x1d2/0x370
[ 1105.558832]  [<ffffffff8120b2b9>] ext4_add_nondir+0x19/0x70
[ 1105.559208]  [<ffffffff8120b523>] ext4_create+0xc3/0x150
[ 1105.559575]  [<ffffffff8118aaf8>] lookup_open+0x3d8/0x640
[ 1105.559943]  [<ffffffff8118b072>] path_openat+0x312/0x9f0
[ 1105.560287]  [<ffffffff8118d309>] do_filp_open+0x79/0xd0
[ 1105.560637]  [<ffffffff81795bc2>] ? _raw_spin_unlock+0x22/0x30
[ 1105.560978]  [<ffffffff8119d9b3>] ? __alloc_fd+0xf3/0x200
[ 1105.561316]  [<ffffffff8117b4fe>] do_sys_open+0x11e/0x1f0
[ 1105.561652]  [<ffffffff811d2046>] compat_SyS_open+0x16/0x20
[ 1105.561984]  [<ffffffff810027e4>] do_fast_syscall_32+0x94/0x210
[ 1105.562313]  [<ffffffff81797da1>] entry_SYSENTER_compat+0x51/0x60
[ 1105.562639] INFO: lockdep is turned off.

Problem first observed with a 4.8 kernel, but the lockdep one is 4.9.
I have metadata checksums turned on, 

Filesystem features:      has_journal resize_inode dir_index sparse_super2 filetype needs_recovery extent flex_bg inline_data sparse_super large_file extra_isize metadata_csum
Journal features:         journal_incompat_revoke journal_checksum_v3
Inode size:               256
Required extra isize:     28
Desired extra isize:      28

It's awakward to reboot in the middle of sending e-mail, but when I do
I can strace the rsync and find the path it's trying to create, if that
is helpful to debugging.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Ext4 deadlock (+lockdep splat) during rsync
  2017-01-08 22:41 Ext4 deadlock (+lockdep splat) during rsync George Spelvin
@ 2017-01-10  3:40 ` Theodore Ts'o
  2017-01-10  7:25   ` George Spelvin
  2017-01-19 17:37 ` Jan Kara
  1 sibling, 1 reply; 13+ messages in thread
From: Theodore Ts'o @ 2017-01-10  3:40 UTC (permalink / raw)
  To: George Spelvin; +Cc: linux-ext4

On Sun, Jan 08, 2017 at 05:41:14PM -0500, George Spelvin wrote:
> After replacing a drive in a RAID array, I tried to bring some things
> up to date with rsync and ran into an annoyingly repeatable deadlock.

Thanks for the bug report!  I'm able to reproduce this, so it's a
matter of my tracking down the bug.

In the meantime, do you need a workaround?  If you're not using the
new project quota feature, I can send you a quick patch to disable
attempts to move the extended attributes to resereve space for the
project ID --- this is what is causing the deadlock.

	       	       	       	       - Ted

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Ext4 deadlock (+lockdep splat) during rsync
  2017-01-10  3:40 ` Theodore Ts'o
@ 2017-01-10  7:25   ` George Spelvin
  0 siblings, 0 replies; 13+ messages in thread
From: George Spelvin @ 2017-01-10  7:25 UTC (permalink / raw)
  To: linux, tytso; +Cc: linux-ext4

> Thanks for the bug report!  I'm able to reproduce this, so it's a
> matter of my tracking down the bug.

Hooray!  And thank you; I don't know why I hit so many ext4 bugs.

> In the meantime, do you need a workaround?  If you're not using the
> new project quota feature, I can send you a quick patch to disable
> attempts to move the extended attributes to resereve space for the
> project ID --- this is what is causing the deadlock.

If it's very little bother, I'd appreciate it, but if you expect to
have something more complete by the end of the week, I can wait
that long.

I didn't think I was using extended attributes at all, so the
limitation is fine.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Ext4 deadlock (+lockdep splat) during rsync
  2017-01-08 22:41 Ext4 deadlock (+lockdep splat) during rsync George Spelvin
  2017-01-10  3:40 ` Theodore Ts'o
@ 2017-01-19 17:37 ` Jan Kara
  2017-01-19 17:45   ` Jan Kara
  2017-01-19 23:19   ` Theodore Ts'o
  1 sibling, 2 replies; 13+ messages in thread
From: Jan Kara @ 2017-01-19 17:37 UTC (permalink / raw)
  To: George Spelvin; +Cc: linux-ext4

On Sun 08-01-17 17:41:14, George Spelvin wrote:
> After replacing a drive in a RAID array, I tried to bring some things
> up to date with rsync and ran into an annoyingly repeatable deadlock.
> 
> So I found a chance to boot with a lockdep kernel and immediately turned up the following:
> 
> [  755.740865] =============================================
> [  755.741072] [ INFO: possible recursive locking detected ]
> [  755.741279] 4.9.1-00126-gfbb9fcc9-dirty #576 Not tainted
> [  755.741489] ---------------------------------------------
> [  755.741699] rsync/14818 is trying to acquire lock:
> [  755.741907]  (&ei->xattr_sem){++++..}, at: [<ffffffff81234603>] ext4_expand_extra_isize_ea+0x63/0x850
> [  755.742145] but task is already holding lock:
> [  755.742742]  (&ei->xattr_sem){++++..}, at: [<ffffffff81236f95>] ext4_try_add_inline_entry+0x55/0x1a0
> [  755.743102] other info that might help us debug this:
> [  755.743802]  Possible unsafe locking scenario:
> [  755.743802]        CPU0
> [  755.743802]        ----
> [  755.743802]   lock(&ei->xattr_sem);
> [  755.743802]   lock(&ei->xattr_sem);
> [  755.743802] *** DEADLOCK ***
> [  755.743802]  May be due to missing lock nesting notation
> [  755.743802] 4 locks held by rsync/14818:
> [  755.743802]  #0:  (sb_writers#3){.+.+.+}, at: [<ffffffff811a0fef>] mnt_want_write+0x1f/0x50
> [  755.743802]  #1:  (&type->i_mutex_dir_key){++++++}, at: [<ffffffff8118b058>] path_openat+0x2f8/0x9f0
> [  755.743802]  #2:  (jbd2_handle){++++..}, at: [<ffffffff81239aa6>] start_this_handle+0x196/0x540
> [  755.743802]  #3:  (&ei->xattr_sem){++++..}, at: [<ffffffff81236f95>] ext4_try_add_inline_entry+0x55/0x1a0
> [  755.743802] stack backtrace:
> [  755.743802] CPU: 0 PID: 14818 Comm: rsync Not tainted 4.9.1-00126-gfbb9fcc9-dirty #576
> [  755.743802] Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./X79-UP4, BIOS F7 03/19/2014
> [  755.743802]  ffffc9000c273820 ffffffff812a6d05 ffffffff8253a080 ffffffff8253a080
> [  755.743802]  ffffc9000c2738d8 ffffffff810c7eab ffffc9000c272000 ffffc90000000004
> [  755.743802]  0000000000000000 ffffffff81e0b100 1a883a7e30ec461a ffffffff8253a080
> [  755.743802] Call Trace:
> [  755.743802]  [<ffffffff812a6d05>] dump_stack+0x68/0x93
> [  755.743802]  [<ffffffff810c7eab>] __lock_acquire+0x7ab/0x1270
> [  755.743802]  [<ffffffff810c8d10>] lock_acquire+0x60/0x80
> [  755.743802]  [<ffffffff81234603>] ? ext4_expand_extra_isize_ea+0x63/0x850
> [  755.743802]  [<ffffffff81793a44>] down_write+0x44/0x80
> [  755.743802]  [<ffffffff81234603>] ? ext4_expand_extra_isize_ea+0x63/0x850
> [  755.743802]  [<ffffffff81234603>] ext4_expand_extra_isize_ea+0x63/0x850
> [  755.743802]  [<ffffffff81795ec2>] ? _raw_read_unlock+0x22/0x30
> [  755.743802]  [<ffffffff8123a542>] ? jbd2_journal_extend+0x132/0x1b0
> [  755.743802]  [<ffffffff812002c9>] ext4_mark_inode_dirty+0x129/0x180
> [  755.743802]  [<ffffffff81235d64>] ext4_add_dirent_to_inline.isra.16+0xe4/0x100
> [  755.743802]  [<ffffffff81236fd9>] ext4_try_add_inline_entry+0x99/0x1a0
> [  755.743802]  [<ffffffff8120b102>] ext4_add_entry+0x1d2/0x370
> [  755.743802]  [<ffffffff8120b2b9>] ext4_add_nondir+0x19/0x70
> [  755.743802]  [<ffffffff8120b523>] ext4_create+0xc3/0x150
> [  755.743802]  [<ffffffff8118aaf8>] lookup_open+0x3d8/0x640
> [  755.743802]  [<ffffffff8118b072>] path_openat+0x312/0x9f0
> [  755.743802]  [<ffffffff8118d309>] do_filp_open+0x79/0xd0
> [  755.743802]  [<ffffffff81795bc2>] ? _raw_spin_unlock+0x22/0x30
> [  755.743802]  [<ffffffff8119d9b3>] ? __alloc_fd+0xf3/0x200
> [  755.743802]  [<ffffffff8117b4fe>] do_sys_open+0x11e/0x1f0
> [  755.743802]  [<ffffffff811d2046>] compat_SyS_open+0x16/0x20
> [  755.743802]  [<ffffffff810027e4>] do_fast_syscall_32+0x94/0x210
> [  755.743802]  [<ffffffff81797da1>] entry_SYSENTER_compat+0x51/0x60

OK, the problem is that we call ext4_mark_inode_dirty() while holding
xattr_sem and that recurses into ext4_expand_extra_isize_ea() which tries
to grab it again. This may happen in several place in inline.c, generally
when handling inline directories. I'll try to craft a fix tomorrow...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Ext4 deadlock (+lockdep splat) during rsync
  2017-01-19 17:37 ` Jan Kara
@ 2017-01-19 17:45   ` Jan Kara
  2017-01-19 17:59     ` George Spelvin
  2017-01-19 23:19   ` Theodore Ts'o
  1 sibling, 1 reply; 13+ messages in thread
From: Jan Kara @ 2017-01-19 17:45 UTC (permalink / raw)
  To: George Spelvin; +Cc: linux-ext4

On Thu 19-01-17 18:37:07, Jan Kara wrote:
> On Sun 08-01-17 17:41:14, George Spelvin wrote:
> > After replacing a drive in a RAID array, I tried to bring some things
> > up to date with rsync and ran into an annoyingly repeatable deadlock.
> > 
> > So I found a chance to boot with a lockdep kernel and immediately turned up the following:
> > 
> > [  755.740865] =============================================
> > [  755.741072] [ INFO: possible recursive locking detected ]
> > [  755.741279] 4.9.1-00126-gfbb9fcc9-dirty #576 Not tainted
> > [  755.741489] ---------------------------------------------
> > [  755.741699] rsync/14818 is trying to acquire lock:
> > [  755.741907]  (&ei->xattr_sem){++++..}, at: [<ffffffff81234603>] ext4_expand_extra_isize_ea+0x63/0x850
> > [  755.742145] but task is already holding lock:
> > [  755.742742]  (&ei->xattr_sem){++++..}, at: [<ffffffff81236f95>] ext4_try_add_inline_entry+0x55/0x1a0
> > [  755.743102] other info that might help us debug this:
> > [  755.743802]  Possible unsafe locking scenario:
> > [  755.743802]        CPU0
> > [  755.743802]        ----
> > [  755.743802]   lock(&ei->xattr_sem);
> > [  755.743802]   lock(&ei->xattr_sem);
> > [  755.743802] *** DEADLOCK ***
> > [  755.743802]  May be due to missing lock nesting notation
> > [  755.743802] 4 locks held by rsync/14818:
> > [  755.743802]  #0:  (sb_writers#3){.+.+.+}, at: [<ffffffff811a0fef>] mnt_want_write+0x1f/0x50
> > [  755.743802]  #1:  (&type->i_mutex_dir_key){++++++}, at: [<ffffffff8118b058>] path_openat+0x2f8/0x9f0
> > [  755.743802]  #2:  (jbd2_handle){++++..}, at: [<ffffffff81239aa6>] start_this_handle+0x196/0x540
> > [  755.743802]  #3:  (&ei->xattr_sem){++++..}, at: [<ffffffff81236f95>] ext4_try_add_inline_entry+0x55/0x1a0
> > [  755.743802] stack backtrace:
> > [  755.743802] CPU: 0 PID: 14818 Comm: rsync Not tainted 4.9.1-00126-gfbb9fcc9-dirty #576
> > [  755.743802] Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./X79-UP4, BIOS F7 03/19/2014
> > [  755.743802]  ffffc9000c273820 ffffffff812a6d05 ffffffff8253a080 ffffffff8253a080
> > [  755.743802]  ffffc9000c2738d8 ffffffff810c7eab ffffc9000c272000 ffffc90000000004
> > [  755.743802]  0000000000000000 ffffffff81e0b100 1a883a7e30ec461a ffffffff8253a080
> > [  755.743802] Call Trace:
> > [  755.743802]  [<ffffffff812a6d05>] dump_stack+0x68/0x93
> > [  755.743802]  [<ffffffff810c7eab>] __lock_acquire+0x7ab/0x1270
> > [  755.743802]  [<ffffffff810c8d10>] lock_acquire+0x60/0x80
> > [  755.743802]  [<ffffffff81234603>] ? ext4_expand_extra_isize_ea+0x63/0x850
> > [  755.743802]  [<ffffffff81793a44>] down_write+0x44/0x80
> > [  755.743802]  [<ffffffff81234603>] ? ext4_expand_extra_isize_ea+0x63/0x850
> > [  755.743802]  [<ffffffff81234603>] ext4_expand_extra_isize_ea+0x63/0x850
> > [  755.743802]  [<ffffffff81795ec2>] ? _raw_read_unlock+0x22/0x30
> > [  755.743802]  [<ffffffff8123a542>] ? jbd2_journal_extend+0x132/0x1b0
> > [  755.743802]  [<ffffffff812002c9>] ext4_mark_inode_dirty+0x129/0x180
> > [  755.743802]  [<ffffffff81235d64>] ext4_add_dirent_to_inline.isra.16+0xe4/0x100
> > [  755.743802]  [<ffffffff81236fd9>] ext4_try_add_inline_entry+0x99/0x1a0
> > [  755.743802]  [<ffffffff8120b102>] ext4_add_entry+0x1d2/0x370
> > [  755.743802]  [<ffffffff8120b2b9>] ext4_add_nondir+0x19/0x70
> > [  755.743802]  [<ffffffff8120b523>] ext4_create+0xc3/0x150
> > [  755.743802]  [<ffffffff8118aaf8>] lookup_open+0x3d8/0x640
> > [  755.743802]  [<ffffffff8118b072>] path_openat+0x312/0x9f0
> > [  755.743802]  [<ffffffff8118d309>] do_filp_open+0x79/0xd0
> > [  755.743802]  [<ffffffff81795bc2>] ? _raw_spin_unlock+0x22/0x30
> > [  755.743802]  [<ffffffff8119d9b3>] ? __alloc_fd+0xf3/0x200
> > [  755.743802]  [<ffffffff8117b4fe>] do_sys_open+0x11e/0x1f0
> > [  755.743802]  [<ffffffff811d2046>] compat_SyS_open+0x16/0x20
> > [  755.743802]  [<ffffffff810027e4>] do_fast_syscall_32+0x94/0x210
> > [  755.743802]  [<ffffffff81797da1>] entry_SYSENTER_compat+0x51/0x60
> 
> OK, the problem is that we call ext4_mark_inode_dirty() while holding
> xattr_sem and that recurses into ext4_expand_extra_isize_ea() which tries
> to grab it again. This may happen in several place in inline.c, generally
> when handling inline directories. I'll try to craft a fix tomorrow...

Ah, I've noticed Ted had already beaten me to it...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Ext4 deadlock (+lockdep splat) during rsync
  2017-01-19 17:45   ` Jan Kara
@ 2017-01-19 17:59     ` George Spelvin
  0 siblings, 0 replies; 13+ messages in thread
From: George Spelvin @ 2017-01-19 17:59 UTC (permalink / raw)
  To: jack, linux; +Cc: linux-ext4

> Ah, I've noticed Ted had already beaten me to it...

Yes, but I've posted two more bugs you're welcome to take a look at.

Thank you very much!

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Ext4 deadlock (+lockdep splat) during rsync
  2017-01-19 17:37 ` Jan Kara
  2017-01-19 17:45   ` Jan Kara
@ 2017-01-19 23:19   ` Theodore Ts'o
  2017-01-19 23:38     ` George Spelvin
  1 sibling, 1 reply; 13+ messages in thread
From: Theodore Ts'o @ 2017-01-19 23:19 UTC (permalink / raw)
  To: Jan Kara; +Cc: George Spelvin, linux-ext4

On Thu, Jan 19, 2017 at 06:37:07PM +0100, Jan Kara wrote:
> OK, the problem is that we call ext4_mark_inode_dirty() while holding
> xattr_sem and that recurses into ext4_expand_extra_isize_ea() which tries
> to grab it again. This may happen in several place in inline.c, generally
> when handling inline directories. I'll try to craft a fix tomorrow...

I think the inline_data patches I posted should have taken care of
George's initial set of bug reports?

(And George, the reason why you're seeing lots of problems is because
inline_data isn't enabled by default yet, and as the old joke goes,
the Early Christians get the best Lions.  :-)

						- Ted

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Ext4 deadlock (+lockdep splat) during rsync
  2017-01-19 23:19   ` Theodore Ts'o
@ 2017-01-19 23:38     ` George Spelvin
  2017-01-20 14:40       ` Theodore Ts'o
  0 siblings, 1 reply; 13+ messages in thread
From: George Spelvin @ 2017-01-19 23:38 UTC (permalink / raw)
  To: tytso; +Cc: jack, linux-ext4, linux

> I think the inline_data patches I posted should have taken care of
> George's initial set of bug reports?

Er, the two I posted most recently were on a kernel with your 4-patch
deadlock series applied:

98a9e36a ext4: propagate error values from ext4_inline_data_truncate()
50c39f0d ext4: avoid calling ext4_mark_inode_dirty() under unneeded semaphores
f321034b ext4: fix deadlock between inline_data and ext4_expand_extra_isize_ea()
5283ac14 ext4: add debug_want_extra_isize mount option

Are you referring to some other patches I should be looking for?

> (And George, the reason why you're seeing lots of problems is because
> inline_data isn't enabled by default yet, and as the old joke goes,
> the Early Christians get the best Lions.  :-)

Yes, I know I'm tempting fate by keeping data I actaully care about (I
have backups of most of it, but reassembling it from those backups would
be most unpleasant) on a file system with experimental features enabled.

But *somebody* has to debug new features, and I've noticed that the fickle
goddess Glitch is most likely to make an appearance when She sees such
an offering.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Ext4 deadlock (+lockdep splat) during rsync
  2017-01-19 23:38     ` George Spelvin
@ 2017-01-20 14:40       ` Theodore Ts'o
  2017-01-20 17:57         ` George Spelvin
  0 siblings, 1 reply; 13+ messages in thread
From: Theodore Ts'o @ 2017-01-20 14:40 UTC (permalink / raw)
  To: George Spelvin; +Cc: jack, linux-ext4

On Thu, Jan 19, 2017 at 06:38:24PM -0500, George Spelvin wrote:
> > I think the inline_data patches I posted should have taken care of
> > George's initial set of bug reports?
> 
> Er, the two I posted most recently were on a kernel with your 4-patch
> deadlock series applied:
> 
> 98a9e36a ext4: propagate error values from ext4_inline_data_truncate()
> 50c39f0d ext4: avoid calling ext4_mark_inode_dirty() under unneeded semaphores
> f321034b ext4: fix deadlock between inline_data and ext4_expand_extra_isize_ea()
> 5283ac14 ext4: add debug_want_extra_isize mount option

Yes, I know that's why I said, it takes care of the _initial_ set of
bug reports (not your recently reported ones).  Jan's comments
indicated he was lookig at the initial set, and I to avoid him redoing
work that i had already done.

Those bugs look like they are very separate in that they have nothing
to do with locking, but rather in e2fsck and the kernel not properly
dealing with certain types of inconsistencies on disk.  On my list to
deal with.  As a workaround, you can just clri the offending corrupted
inode from lost+found and then run e2fsck.

> > (And George, the reason why you're seeing lots of problems is because
> > inline_data isn't enabled by default yet, and as the old joke goes,
> > the Early Christians get the best Lions.  :-)
> 
> Yes, I know I'm tempting fate by keeping data I actaully care about (I
> have backups of most of it, but reassembling it from those backups would
> be most unpleasant) on a file system with experimental features enabled.
> 
> But *somebody* has to debug new features, and I've noticed that the fickle
> goddess Glitch is most likely to make an appearance when She sees such
> an offering.

Indeed, the reason why we didn't see this earlier is because we don't
have a test which tests the case of expanding i_extra_size with inline
data, and the introduction of project quota was the first time we've
expanded the extra inode fields in a while.  Since you were using
inline_data, you very graciously exposed these bugs and a shortcoming
in our testing.  :-)

On my todo list is to add some i_extra_isize testing to xfstsests.

      	   	      	       		     - Ted

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Ext4 deadlock (+lockdep splat) during rsync
  2017-01-20 14:40       ` Theodore Ts'o
@ 2017-01-20 17:57         ` George Spelvin
  2017-01-21 13:30           ` Theodore Ts'o
  0 siblings, 1 reply; 13+ messages in thread
From: George Spelvin @ 2017-01-20 17:57 UTC (permalink / raw)
  To: linux, tytso; +Cc: jack, linux-ext4

> Those bugs look like they are very separate in that they have nothing
> to do with locking, but rather in e2fsck and the kernel not properly
> dealing with certain types of inconsistencies on disk.  On my list to
> deal with.  As a workaround, you can just clri the offending corrupted
> inode from lost+found and then run e2fsck.

Yes, that's fairly obvious, but I've avoided that for a few reasons:
1) I was holding on to the broken inodes in case you needed more information
   that I didn't include in the initial report,
2) I also thought they might be wanted to test an eventual fix, and
3) I was probably going to get a bit fancier to preserve the file names.

My latest report is a bit confusing because debugfs() reads the
directories without complaint, so I'm not sure it's a valid on-disk
situation that the kernel is improperly rejecting, or an invaid situation
that libext2fs is improperly accepting.

Since I don't currently understand the on-disk format of inline data,
(I need to RTFS), I can't figure that out or see if there's a way
to do a repair in situ.

Oh!  Found it at
https://ext4.wiki.kernel.org/index.php/Ext4_Disk_Layout#Inline_Data

So I'm guessing the problem is that the required empty system.data
attribute is missing (due to the preceding extra_isize changing mess),
and if I just created it, everything would magically come back to life.

Something like ea_set <inode> system.data ""

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Ext4 deadlock (+lockdep splat) during rsync
  2017-01-20 17:57         ` George Spelvin
@ 2017-01-21 13:30           ` Theodore Ts'o
  2017-01-21 16:45             ` ext4_iget:4740: inode #%ld: block 48: comm find: invalid block George Spelvin
  0 siblings, 1 reply; 13+ messages in thread
From: Theodore Ts'o @ 2017-01-21 13:30 UTC (permalink / raw)
  To: George Spelvin; +Cc: jack, linux-ext4

On Fri, Jan 20, 2017 at 12:57:23PM -0500, George Spelvin wrote:
> So I'm guessing the problem is that the required empty system.data
> attribute is missing (due to the preceding extra_isize changing mess),
> and if I just created it, everything would magically come back to life
> 
> Something like ea_set <inode> system.data ""

Well, it would still a corrupted, innconsistent inode, in that there
would still be a block attached to the inode, and i_size would be
different from the size of the data xattr used by inline_data.  So you
might as well just clri the inode and rerun fsck, or clri the inode
and then unlink the directory entry in lost+found.

You might get it to a state where the kernel isn't explicitly
complaining any more, but you might end up unmasking other bugs where
the kernel is further failing to handle an inconsistent inode relating
to inline_data.  Which is fine if you want to potentially exposing
more problems, but if this is file system with Data You Care about, my
suggestsion to run debugfs's clri and then e2fsck -f really is the
most conservative advice to give.

					- Ted

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: ext4_iget:4740: inode #%ld: block 48: comm find: invalid block
  2017-01-21 13:30           ` Theodore Ts'o
@ 2017-01-21 16:45             ` George Spelvin
  2017-01-23  0:08               ` Theodore Ts'o
  0 siblings, 1 reply; 13+ messages in thread
From: George Spelvin @ 2017-01-21 16:45 UTC (permalink / raw)
  To: linux, tytso; +Cc: jack, linux-ext4

Ted Ts'o wrote:
On Fri, Jan 20, 2017 at 12:57:23PM -0500, George Spelvin wrote:
>> So I'm guessing the problem is that the required empty system.data
>> attribute is missing (due to the preceding extra_isize changing mess),
>> and if I just created it, everything would magically come back to life
>> 
>> Something like ea_set <inode> system.data ""

> Well, it would still a corrupted, innconsistent inode, in that there
> would still be a block attached to the inode, and i_size would be
> different from the size of the data xattr used by inline_data.  So you
> might as well just clri the inode and rerun fsck, or clri the inode
> and then unlink the directory entry in lost+found.

Er, huh?  I was referring to the following error, which is one of a
dozen inodes I have with this problem (sorry all the Subject: lines have
gotten tangled):

>> Subject: ext4_iget:4740: inode #%ld: block 48: comm find: invalid block
>>
>> debugfs:  stat <1171779>
>> Inode: 1171779   Type: directory    Mode:  0775   Flags: 0x10000000
>> Generation: 2016668288    Version: 0x00000000:00000007
>> User:  1000   Group:   161   Project:     0   Size: 60
>> File ACL: 0    Directory ACL: 0
>> Links: 2   Blockcount: 0
>> Fragment:  Address: 0    Number: 0    Size: 0
>>  ctime: 0x587eff35:6e19953c -- Wed Jan 18 00:37:57 2017
>>  atime: 0x56d5943f:bb6e1344 -- Tue Mar  1 08:08:15 2016
>>  mtime: 0x587eff35:6e19953c -- Wed Jan 18 00:37:57 2017
>> crtime: 0x56d388b6:533e9e7c -- Sun Feb 28 18:54:30 2016
>> Size of extra inode fields: 32
>> Inode checksum: 0x82a01dca
>> Size of inline data: 60
>> 
>> debugfs:  ls <1171779>
>>  1171779  (12) .    1171778  (12) ..    1171954  (12) 0    1171955  (12) 1   
>>  1171956  (12) 2    1171957  (20) 3   

Zero blocks, data apparently safe and sound in the i_blocks field, just
missing (due to getting trashed by buggy i_extra_size changing code)
the system.data ea, and thus i_inline_off == 0 which upsets the kernel.

> You might get it to a state where the kernel isn't explicitly
> complaining any more, but you might end up unmasking other bugs where
> the kernel is further failing to handle an inconsistent inode relating
> to inline_data.

I just want to get it to a state where I can mv the contents into a
replacement directly and then rmdir this one, rather than having to
make a note of the name & inode number of each of the contained files
and then recreate it from the contents of lost+found (which is already
a bit of a swamp I'm wading through).

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: ext4_iget:4740: inode #%ld: block 48: comm find: invalid block
  2017-01-21 16:45             ` ext4_iget:4740: inode #%ld: block 48: comm find: invalid block George Spelvin
@ 2017-01-23  0:08               ` Theodore Ts'o
  0 siblings, 0 replies; 13+ messages in thread
From: Theodore Ts'o @ 2017-01-23  0:08 UTC (permalink / raw)
  To: George Spelvin; +Cc: jack, linux-ext4

On Sat, Jan 21, 2017 at 11:45:12AM -0500, George Spelvin wrote:
> Er, huh?  I was referring to the following error, which is one of a
> dozen inodes I have with this problem (sorry all the Subject: lines have
> gotten tangled):

Yes, there are a bunch of different inodes, and I was referring to a
different inode dump you've sent out.

> I just want to get it to a state where I can mv the contents into a
> replacement directly and then rmdir this one, rather than having to
> make a note of the name & inode number of each of the contained files
> and then recreate it from the contents of lost+found (which is already
> a bit of a swamp I'm wading through).

My one recommendation would be try it out on a copy of the file
system, or try recreating the corruption on a small file system and
then experiment before you go ahead try to recover things.  I tend to
be *very* caution when live data is concerned, especially when I'm not
in front of the system myself and I'm asked to give advice to a user.

Or if it's too hard to make a copy of the file system, you might make
one change to one inode, then run fsck, and then mount the file
system, and make sure the kernel doesn't do something untoward, etc.,
before you try making that same class of change on two or three
inodes, then fsck and mount the file system, and try accessing those
files, before you make all of the changes for that class of
corruption.

Good luck,

						- Ted

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2017-01-23  0:08 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-01-08 22:41 Ext4 deadlock (+lockdep splat) during rsync George Spelvin
2017-01-10  3:40 ` Theodore Ts'o
2017-01-10  7:25   ` George Spelvin
2017-01-19 17:37 ` Jan Kara
2017-01-19 17:45   ` Jan Kara
2017-01-19 17:59     ` George Spelvin
2017-01-19 23:19   ` Theodore Ts'o
2017-01-19 23:38     ` George Spelvin
2017-01-20 14:40       ` Theodore Ts'o
2017-01-20 17:57         ` George Spelvin
2017-01-21 13:30           ` Theodore Ts'o
2017-01-21 16:45             ` ext4_iget:4740: inode #%ld: block 48: comm find: invalid block George Spelvin
2017-01-23  0:08               ` Theodore Ts'o

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.