From: Dave Chinner <david@fromorbit.com>
To: viro@zeniv.linux.org.uk
Cc: linux-fsdevel@vger.kernel.org
Subject: [regression, v6.0-rc0] filesystem freeze hangs on sb_wait_write()
Date: Mon, 10 Oct 2022 16:03:19 +1100	[thread overview]
Message-ID: <20221010050319.GC2703033@dread.disaster.area> (raw)

Hi Al,

Just tried to run fstests on XFS on a current Linus kernel at commit
493ffd6605b2 ("Merge tag 'ucount-rlimits-cleanups-for-v5.19' of
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace")
and generic/068 hangs trying to freeze the filesystem like so:

[  163.957724] task:xfs_io          state:D stack:14680 pid: 4831 ppid:  4825 flags:0x00004000
[  163.961425] Call Trace:
[  163.962553]  <TASK>
[  163.963497]  __schedule+0x2f9/0xa30
[  163.965125]  ? percpu_down_write+0x60/0x190
[  163.966888]  schedule+0x5a/0xc0
[  163.968206]  percpu_down_write+0xe8/0x190
[  163.969865]  freeze_super+0x78/0x170
[  163.971247]  __x64_sys_ioctl+0x61/0xb0
[  163.973947]  do_syscall_64+0x35/0x80
[  163.975119]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[  163.976781] RIP: 0033:0x7ff78910bb07
[  163.978028] RSP: 002b:00007ffefe7279b8 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
[  163.980734] RAX: ffffffffffffffda RBX: 000055a88c183ad0 RCX: 00007ff78910bb07
[  163.983346] RDX: 00007ffefe7279cc RSI: ffffffffc0045877 RDI: 0000000000000003
[  163.985864] RBP: 0000000000000008 R08: 00007ff7891d5ba0 R09: 00007ff7891d5c00
[  163.988371] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000001
[  163.990704] R13: 000055a88c184270 R14: 000055a88c184fc0 R15: 000055a88c184fe0
[  163.992964]  </TASK>
[  163.993612] task:fsstress        state:D stack:12464 pid: 4832 ppid:  4822 flags:0x00000000
[  163.996390] Call Trace:
[  163.997310]  <TASK>
[  163.998076]  __schedule+0x2f9/0xa30
[  163.999323]  ? __smp_call_single_queue+0x23/0x40
[  164.000685]  schedule+0x5a/0xc0
[  164.001719]  ? percpu_rwsem_wait+0x123/0x150
[  164.003139]  percpu_rwsem_wait+0x123/0x150
[  164.004535]  ? __percpu_rwsem_trylock.part.0+0x50/0x50
[  164.006202]  __percpu_down_read+0x5b/0x110
[  164.007560]  mnt_want_write+0xa0/0xd0
[  164.008816]  do_renameat2+0x17b/0x530
[  164.010006]  ? xfs_can_free_eofblocks+0x39/0x1e0
[  164.011558]  ? __might_fault+0x1e/0x20
[  164.012870]  ? strncpy_from_user+0x1e/0x160
[  164.014236]  __x64_sys_renameat2+0x4b/0x60
[  164.015451]  do_syscall_64+0x35/0x80
[  164.016576]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[  164.018117] RIP: 0033:0x7f3c7fe97c0f
[  164.019124] RSP: 002b:00007fff67cf26b8 EFLAGS: 00000202 ORIG_RAX: 000000000000013c
[  164.021202] RAX: ffffffffffffffda RBX: 00007fff67cf2980 RCX: 00007f3c7fe97c0f
[  164.023309] RDX: 00000000ffffff9c RSI: 000055fd446825a0 RDI: 00000000ffffff9c
[  164.025479] RBP: 00007fff67cf2990 R08: 0000000000000002 R09: 00007fff67cf2960
[  164.027655] R10: 000055fd44682f10 R11: 0000000000000202 R12: 0000000000000079
[  164.029721] R13: 0000000000000002 R14: 000055fd446703a0 R15: 00000000ffffffff
[  164.031937]  </TASK>
[  164.032675] task:fsstress        state:D stack:14224 pid: 4833 ppid:  4822 flags:0x00000000
[  164.035086] Call Trace:
[  164.035803]  <TASK>
[  164.036512]  __schedule+0x2f9/0xa30
[  164.037661]  ? __smp_call_single_queue+0x23/0x40
[  164.039048]  schedule+0x5a/0xc0
[  164.040076]  ? percpu_rwsem_wait+0x123/0x150
[  164.041315]  percpu_rwsem_wait+0x123/0x150
[  164.042458]  ? __percpu_rwsem_trylock.part.0+0x50/0x50
[  164.043888]  __percpu_down_read+0x5b/0x110
[  164.045097]  mnt_want_write+0xa0/0xd0
[  164.046117]  do_fchownat+0x78/0xe0
[  164.047060]  __x64_sys_lchown+0x1f/0x30
[  164.048210]  do_syscall_64+0x35/0x80
[  164.049295]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[  164.050775] RIP: 0033:0x7f3c7ff2df97
[  164.051727] RSP: 002b:00007fff67cf2838 EFLAGS: 00000202 ORIG_RAX: 000000000000005e
[  164.053803] RAX: ffffffffffffffda RBX: 00007fff67cf29a0 RCX: 00007f3c7ff2df97
[  164.055773] RDX: 0000000000028dd3 RSI: 0000000000022124 RDI: 000055fd44672450
[  164.057745] RBP: 0000000000022124 R08: 0000000000000064 R09: 00007fff67cf299c
[  164.059668] R10: fffffffffffffb8b R11: 0000000000000202 R12: 0000000000000008
[  164.061609] R13: 0000000000028dd3 R14: 0000000000022124 R15: 000055fd44660b50
[  164.063578]  </TASK>

Eventually the hung task timer kicks in and reports all these
blocked threads over and over again. The read locks look like
sb_start_write() calls, and the freeze_super() call is waiting
here:


	sb->s_writers.frozen = SB_FREEZE_WRITE;
	/* Release s_umount to preserve sb_start_write -> s_umount ordering */
	up_write(&sb->s_umount);
>>>>>>	sb_wait_write(sb, SB_FREEZE_WRITE);
	down_write(&sb->s_umount);


Every test VM I ran this on failed in an identical manner. They all
failed trying to freeze on the second iteration of the {freeze;thaw}
loop, so whatever is going on should be easily reproducible:

# cat /tmp/9061.out
# QA output created by 068
*** init FS

*** iteration: 0
*** freezing $SCRATCH_MNT
*** thawing  $SCRATCH_MNT

*** iteration: 1
*** freezing $SCRATCH_MNT
#

I just tested ext4 and it hangs in an identical fashion, so whatever
is going on is not specific to XFS - it smells like a
mnt_want_write() or sb_start_write() imbalance somewhere in the
generic code. I haven't had time to bisect it, so this heads-up is
all I'll have until I can get back to this tomorrow....

-Dave.
-- 
Dave Chinner
david@fromorbit.com

Thread overview: 9+ messages
2022-10-10  5:03 Dave Chinner [this message]
2022-10-11  0:40 ` [regression, v6.0-rc0, io-uring?] filesystem freeze hangs on sb_wait_write() Dave Chinner
2022-10-11  1:10   ` Pavel Begunkov
2022-10-11  2:01     ` Jens Axboe
2022-10-11  2:10       ` Pavel Begunkov
2022-10-11  2:54         ` Jens Axboe
2022-10-11 14:18           ` Jens Axboe
2022-10-11 14:39             ` Pavel Begunkov
2022-10-11 14:47               ` Jens Axboe
