io-uring.vger.kernel.org archive mirror
From: Dave Chinner <david@fromorbit.com>
To: Jens Axboe <axboe@kernel.dk>
Cc: io-uring <io-uring@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: [5.15-rc1 regression] io_uring: fsstress hangs in do_coredump() on exit
Date: Tue, 21 Sep 2021 16:40:32 +1000
Message-ID: <20210921064032.GW2361455@dread.disaster.area>

Hi Jens,

I updated all my trees from 5.14 to 5.15-rc2 this morning and
immediately had problems running the recoveryloop fstest group on
them. These tests have a typical pattern of "run load in the
background, shutdown the filesystem, kill load, unmount and test
recovery".

When the load includes fsstress and it gets killed after the shutdown,
it hangs on exit like so:

# echo w > /proc/sysrq-trigger 
[  370.669482] sysrq: Show Blocked State
[  370.671732] task:fsstress        state:D stack:11088 pid: 9619 ppid:  9615 flags:0x00000000
[  370.675870] Call Trace:
[  370.677067]  __schedule+0x310/0x9f0
[  370.678564]  schedule+0x67/0xe0
[  370.679545]  schedule_timeout+0x114/0x160
[  370.682002]  __wait_for_common+0xc0/0x160
[  370.684274]  wait_for_completion+0x24/0x30
[  370.685471]  do_coredump+0x202/0x1150
[  370.690270]  get_signal+0x4c2/0x900
[  370.691305]  arch_do_signal_or_restart+0x106/0x7a0
[  370.693888]  exit_to_user_mode_prepare+0xfb/0x1d0
[  370.695241]  syscall_exit_to_user_mode+0x17/0x40
[  370.696572]  do_syscall_64+0x42/0x80
[  370.697620]  entry_SYSCALL_64_after_hwframe+0x44/0xae
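
The sysrq-w dump above lists tasks stuck in uninterruptible sleep
(state D). Where sysrq is not available, roughly the same list can be
pulled from /proc. What follows is a minimal sketch and not part of the
original report: the helper names parse_stat() and blocked_tasks() are
made up for illustration, and it assumes the usual
"pid (comm) state ..." layout of /proc/<pid>/task/<tid>/stat. Unlike
sysrq-w it reports only the task id and name, not the stack.

```python
import os


def parse_stat(line):
    """Return (pid, comm, state) from one /proc/<pid>/stat style line."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    lpar = line.index("(")
    rpar = line.rindex(")")
    pid = int(line[:lpar])
    comm = line[lpar + 1:rpar]
    state = line[rpar + 1:].split()[0]
    return pid, comm, state


def blocked_tasks(proc="/proc"):
    """List (tid, comm) for every thread currently in state D."""
    found = []
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        task_dir = os.path.join(proc, entry, "task")
        try:
            tids = os.listdir(task_dir)
        except OSError:
            continue  # process exited while we were scanning
        for tid in tids:
            try:
                with open(os.path.join(task_dir, tid, "stat")) as f:
                    pid, comm, state = parse_stat(f.read())
            except (OSError, ValueError):
                continue  # thread went away or the line was malformed
            if state == "D":
                found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in blocked_tasks():
        print(pid, comm)
```

A task that shows up here on every run, as fsstress does in the trace
above, is a candidate for a closer look with sysrq-w or, as root,
/proc/<pid>/stack.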

It's 100% reproducible on one of my test machines, but only one of
them. That one machine is running fstests on pmem, so it has
synchronous storage. Every other test machine uses normal async
storage (nvme, iscsi, etc) and none of them are hanging.

A quick trawl of the commit history between 5.14 and 5.15-rc2
turned up a couple of potential candidates. The 5th kernel build
(instead of ~16 for a bisect) told me that commit 15e20db2e0ce
("io-wq: only exit on fatal signals") is the cause of the
regression. I've confirmed that this is the first commit where the
problem shows up.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

Thread overview: 6+ messages
2021-09-21  6:40 Dave Chinner [this message]
2021-09-21 13:25 ` [5.15-rc1 regression] io_uring: fsstress hangs in do_coredump() on exit Jens Axboe
2021-09-21 14:19   ` Jens Axboe
2021-09-21 21:35     ` Dave Chinner
2021-09-21 21:41       ` Jens Axboe
2021-09-23 14:05         ` Olivier Langlois
