From: Jens Axboe <axboe@kernel.dk>
To: Dave Chinner <david@fromorbit.com>
Cc: io-uring <io-uring@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [5.15-rc1 regression] io_uring: fsstress hangs in do_coredump() on exit
Date: Tue, 21 Sep 2021 15:41:20 -0600
Message-ID: <6d46951b-a7b3-0feb-3af0-aaa8ec87b87a@kernel.dk>
In-Reply-To: <20210921213552.GZ2361455@dread.disaster.area>

On 9/21/21 3:35 PM, Dave Chinner wrote:
> On Tue, Sep 21, 2021 at 08:19:53AM -0600, Jens Axboe wrote:
>> On 9/21/21 7:25 AM, Jens Axboe wrote:
>>> On 9/21/21 12:40 AM, Dave Chinner wrote:
>>>> Hi Jens,
>>>>
>>>> I updated all my trees from 5.14 to 5.15-rc2 this morning and
>>>> immediately had problems running the recoveryloop fstest group on
>>>> them. These tests have a typical pattern of "run load in the
>>>> background, shutdown the filesystem, kill load, unmount and test
>>>> recovery".
>>>>
>>>> When the load includes fsstress and it gets killed after shutdown,
>>>> it hangs on exit like so:
>>>>
>>>> # echo w > /proc/sysrq-trigger 
>>>> [  370.669482] sysrq: Show Blocked State
>>>> [  370.671732] task:fsstress        state:D stack:11088 pid: 9619 ppid:  9615 flags:0x00000000
>>>> [  370.675870] Call Trace:
>>>> [  370.677067]  __schedule+0x310/0x9f0
>>>> [  370.678564]  schedule+0x67/0xe0
>>>> [  370.679545]  schedule_timeout+0x114/0x160
>>>> [  370.682002]  __wait_for_common+0xc0/0x160
>>>> [  370.684274]  wait_for_completion+0x24/0x30
>>>> [  370.685471]  do_coredump+0x202/0x1150
>>>> [  370.690270]  get_signal+0x4c2/0x900
>>>> [  370.691305]  arch_do_signal_or_restart+0x106/0x7a0
>>>> [  370.693888]  exit_to_user_mode_prepare+0xfb/0x1d0
>>>> [  370.695241]  syscall_exit_to_user_mode+0x17/0x40
>>>> [  370.696572]  do_syscall_64+0x42/0x80
>>>> [  370.697620]  entry_SYSCALL_64_after_hwframe+0x44/0xae
>>>>
>>>> It's 100% reproducible on one of my test machines, but only that
>>>> one. That machine runs fstests on pmem, so it has synchronous
>>>> storage. Every other test machine uses normal async storage (nvme,
>>>> iscsi, etc.), and none of them hang.
>>>>
>>>> A quick trawl of the commit history between 5.14 and 5.15-rc2
>>>> indicates a couple of potential candidates. The 5th kernel build
>>>> (instead of ~16 for a bisect) told me that commit 15e20db2e0ce
>>>> ("io-wq: only exit on fatal signals") is the cause of the
>>>> regression. I've confirmed that this is the first commit where the
>>>> problem shows up.
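
For reference, the worker signal handling that commit touched looks
roughly like this (my paraphrase of the 5.15-rc code, not an exact
quote):

	/* fs/io-wq.c, io_wqe_worker(): before that commit, any signal
	 * that get_signal() reported made the worker exit; after it,
	 * only a fatal signal does. But get_signal() can consume the
	 * pending SIGKILL that a core dump sends, so
	 * fatal_signal_pending() is false here even though the thread
	 * group is exiting - the worker keeps running and the dumping
	 * task waits on it forever, as in the trace above. */
	if (signal_pending(current)) {
		struct ksignal ksig;

		if (!get_signal(&ksig))
			continue;
		if (fatal_signal_pending(current))	/* the new check */
			break;
		continue;
	}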
>>>
>>> Thanks for the report, Dave - I'll take a look. Can you elaborate on
>>> exactly what is being run? And when killed, it's a non-fatal signal?
> 
> It's whatever kill/killall sends by default (SIGTERM). Typical
> behaviour that causes a hang is something like:
> 
> $FSSTRESS_PROG -n10000000 -p $PROCS -d $load_dir >> $seqres.full 2>&1 &
> ....
> sleep 5
> _scratch_shutdown
> $KILLALL_PROG -q $FSSTRESS_PROG
> wait
> 
> _scratch_shutdown is typically just an 'xfs_io -rx -c "shutdown"
> /mnt/scratch' command that shuts down the filesystem. Other tests in
> the recoveryloop group use DM targets to fail IO and trigger a
> shutdown, others inject errors that trigger shutdowns, etc. But the
> result is that they all hang waiting for fsstress processes that have
> been using io_uring to exit.
> 
> Just run fstests with "./check -g recoveryloop" - there's only a
> handful of tests and it only takes about 5 minutes to run them all
> on a fake DRAM-based pmem device.

I made a trivial reproducer just to verify.
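
Something along these lines - a sketch of the idea rather than the
exact test I ran (file name and sizes arbitrary): punt a request to
io-wq with IOSQE_ASYNC, then trigger a core dump while the worker is
still around:

	#include <fcntl.h>
	#include <stdlib.h>
	#include <liburing.h>

	int main(void)
	{
		struct io_uring ring;
		struct io_uring_sqe *sqe;
		static char buf[4096];
		int fd;

		if (io_uring_queue_init(8, &ring, 0) < 0)
			exit(1);
		fd = open("/etc/hosts", O_RDONLY);
		if (fd < 0)
			exit(1);

		/* IOSQE_ASYNC forces the read to an io-wq worker thread */
		sqe = io_uring_get_sqe(&ring);
		io_uring_prep_read(sqe, fd, buf, sizeof(buf), 0);
		sqe->flags |= IOSQE_ASYNC;
		io_uring_submit(&ring);

		/* dump core while io-wq workers exist; on the broken
		 * kernel, exit hangs in do_coredump() as in the trace */
		abort();
	}

Build with -luring and run it on the affected kernel, and the exit
should wedge in do_coredump().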

>> Can you try with this patch?
>>
>> diff --git a/fs/io-wq.c b/fs/io-wq.c
>> index b5fd015268d7..1e55a0a2a217 100644
>> --- a/fs/io-wq.c
>> +++ b/fs/io-wq.c
>> @@ -586,7 +586,8 @@ static int io_wqe_worker(void *data)
>>  
>>  			if (!get_signal(&ksig))
>>  				continue;
>> -			if (fatal_signal_pending(current))
>> +			if (fatal_signal_pending(current) ||
>> +			    signal_group_exit(current->signal))
>>  				break;
>>  			continue;
>>  		}
> 
> Cleaned up so it compiles and the tests run properly again. But
> playing whack-a-mole with signals seems kinda fragile. I was pointed
> at this patchset overnight by another dev on #xfs who saw the same
> hangs, and it fixed the hang for him as well:

Seems sane to me - exit if there's a fatal signal, or if we're doing a
core dump. I don't think there should be other conditions.
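
For reference, signal_group_exit() is what makes an in-progress core
dump visible here - the 5.15-era helper is roughly:

	static inline bool signal_group_exit(const struct signal_struct *sig)
	{
		/* true once the group is exiting, including while a core
		 * dump is in progress (zap_threads() sets group_exit_task
		 * while the dumper waits for the other threads) */
		return (sig->flags & SIGNAL_GROUP_EXIT) ||
		       (sig->group_exit_task != NULL);
	}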

> https://lore.kernel.org/lkml/cover.1629655338.git.olivier@trillion01.com/
> 
> It was posted about a month ago and I don't see any response to it
> on the lists...

That patchset spawned a long discussion, but it's really a different
topic. Yes, it's about signals, but it's not this particular issue. It
happens to work around it, as it cancels everything once core dumping
starts.

-- 
Jens Axboe

