[5.15-rc1 regression] io_uring: fsstress hangs in do_coredump() on exit

* [5.15-rc1 regression] io_uring: fsstress hangs in do_coredump() on exit
@ 2021-09-21  6:40 Dave Chinner
  2021-09-21 13:25 ` Jens Axboe
  0 siblings, 1 reply; 6+ messages in thread
From: Dave Chinner @ 2021-09-21  6:40 UTC (permalink / raw)
  To: Jens Axboe; +Cc: io-uring, linux-fsdevel

Hi Jens,

I updated all my trees from 5.14 to 5.15-rc2 this morning and
immediately had problems running the recoveryloop fstest group on
them. These tests have a typical pattern of "run load in the
background, shutdown the filesystem, kill load, unmount and test
recovery".

Whent eh load includes fsstress, and it gets killed after shutdown,
it hangs on exit like so:

# echo w > /proc/sysrq-trigger 
[  370.669482] sysrq: Show Blocked State
[  370.671732] task:fsstress        state:D stack:11088 pid: 9619 ppid:  9615 flags:0x00000000
[  370.675870] Call Trace:
[  370.677067]  __schedule+0x310/0x9f0
[  370.678564]  schedule+0x67/0xe0
[  370.679545]  schedule_timeout+0x114/0x160
[  370.682002]  __wait_for_common+0xc0/0x160
[  370.684274]  wait_for_completion+0x24/0x30
[  370.685471]  do_coredump+0x202/0x1150
[  370.690270]  get_signal+0x4c2/0x900
[  370.691305]  arch_do_signal_or_restart+0x106/0x7a0
[  370.693888]  exit_to_user_mode_prepare+0xfb/0x1d0
[  370.695241]  syscall_exit_to_user_mode+0x17/0x40
[  370.696572]  do_syscall_64+0x42/0x80
[  370.697620]  entry_SYSCALL_64_after_hwframe+0x44/0xae

It's 100% reproducable on one of my test machines, but only one of
them. That one machine is running fstests on pmem, so it has
synchronous storage. Every other test machine using normal async
storage (nvme, iscsi, etc) and none of them are hanging.

A quick troll of the commit history between 5.14 and 5.15-rc2
indicates a couple of potential candidates. The 5th kernel build
(instead of ~16 for a bisect) told me that commit 15e20db2e0ce
("io-wq: only exit on fatal signals") is the cause of the
regression. I've confirmed that this is the first commit where the
problem shows up.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 6+ messages in thread