From: Thilo Fromm <t-lo@linux.microsoft.com>
To: Jan Kara <jack@suse.cz>
Cc: jack@suse.com, tytso@mit.edu, Ye Bin <yebin10@huawei.com>,
	linux-ext4@vger.kernel.org
Subject: Re: [syzbot] possible deadlock in jbd2_journal_lock_updates
Date: Thu, 29 Sep 2022 15:18:21 +0200
Message-ID: <d8b18ba8-ea12-b617-6b5e-455a1d7b5e21@linux.microsoft.com>
In-Reply-To: <20220929082716.5urzcfk4hnapd3cr@quack3>

Hello Honza,

Thank you very much for your thorough feedback. We were unaware of the
backtrace issue and will look into it right away.

>>> So this seems like a real issue. Essentially, the problem is that
>>> ext4_bmap() acquires inode->i_rwsem while its caller
>>> jbd2_journal_flush() is holding journal->j_checkpoint_mutex. This
>>> looks like a real deadlock possibility.
>>
>> Flatcar Container Linux users have reported a kernel issue which might be
>> caused by commit 51ae846cff5. The issue is triggered under I/O load in
>> certain conditions and leads to a complete system hang. I've pasted a
>> typical kernel log below; please refer to
>> https://github.com/flatcar/Flatcar/issues/847 for more details.
>>
>> The issue can be triggered on Flatcar release 3227.2.2 / kernel version
>> 5.15.63 (we ship LTS kernels) but not on release 3227.2.1 / kernel 5.15.58.
>> 51ae846cff5 was introduced to 5.15 in 5.15.61.
> 
> Well, so far your stacktraces do not really show anything pointing to that
> particular commit. So we need to understand that hang some more.

This makes sense and I agree. Sorry for the garbled stack traces.

In other news, one of our users - who can reliably trigger the issue in 
their set-up - ran tests with kernel 5.15.63 with and without commit 
51ae846cff5. Without the commit, the kernel hang did not occur (see 
https://github.com/flatcar/Flatcar/issues/847#issuecomment-1261967920).

We'll now focus on un-garbling our traces to get to the bottom of this.

>> ( Kernel log of a crash follows; more info here:
>> https://github.com/flatcar/Flatcar/issues/847 )
>>
[...]
>> [1282119.190346]  ret_from_fork+0x22/0x30
> 
> Hrm, so your backtraces seem to be strange. For example in this stacktrace
> we should have kjournald2() somewhere instead of
> jbd2_journal_check_available_features() which can hardly be there. So
> somehow stack unwinding or symbol resolution is strangely confused with
> this kernel. Compiling with any unusual config or compiler?

We're on GCC 10.3.0 and will review our build process to get to the 
bottom of this. Will get back to this thread as soon as we have news. 
Thanks again for pointing this out!

> So far it seems that most tasks are waiting for transaction to commit, jbd2
> thread committing the transaction waits for someone to drop its transaction
> reference which never happens. It is unclear who holds the transaction
> reference. But with stacktraces corrupted like this it is difficult to be
> certain.
> 
> So probably first try find out why stacktraces are not working right on
> your kernel and fix them. And then, if the hang happens, please trigger
> sysrq-w (or do echo w >/proc/sysrq-trigger if you can still get to the
> machine) and send here the output. It will dump all blocked tasks and from
> that we should be able to better understand what is happening.

Working on it!

Best regards,
Thilo
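
For reference, the sysrq-w step quoted above can be wrapped in a small
helper. This is only a sketch: the function name dump_blocked_tasks and
its optional path argument are illustrative assumptions and not part of
any existing tool. Writing to /proc/sysrq-trigger normally requires root.

```shell
#!/bin/sh
# Hypothetical helper (name and argument are assumptions for illustration).
# Writes 'w' to the sysrq trigger so the kernel dumps all blocked tasks
# to the kernel log. The path can be overridden, e.g. for a dry run.
dump_blocked_tasks() {
    trigger="${1:-/proc/sysrq-trigger}"
    if [ ! -w "$trigger" ]; then
        echo "cannot write to $trigger (are you root?)" >&2
        return 1
    fi
    echo w > "$trigger"
}
```

After a successful call on the hung machine, the blocked-task dump ends
up in the kernel log and can be saved with, for example,
`dmesg > blocked-tasks.txt`.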
