linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [Syzkaller & bisect] There is "possible deadlock in ext4_bmap" issue in v6.1 kernel
@ 2022-12-14  4:19 Pengfei Xu
  2022-12-14 15:19 ` Theodore Ts'o
  0 siblings, 1 reply; 3+ messages in thread
From: Pengfei Xu @ 2022-12-14  4:19 UTC (permalink / raw)
  To: yebin10; +Cc: jack, linux-kernel, linux-ext4, tytso, heng.su, pengfei.xu

Hi Ye Bin and ext4 expert,

Greeting!

There is "possible deadlock in ext4_bmap" issue in v6.1 kernel:
"
[   28.421104] 
[   28.421671] ======================================================
[   28.422016] WARNING: possible circular locking dependency detected
[   28.422358] 6.1.0-830b3c68c1fb #1 Not tainted
[   28.422607] ------------------------------------------------------
[   28.422956] repro/588 is trying to acquire lock:
[   28.423217] ff110000060e5980 (&sb->s_type->i_mutex_key#8){++++}-{3:3}, at: ext4_bmap+0x39/0x1c0
[   28.423729] 
[   28.423729] but task is already holding lock:
[   28.424055] ff110000093323f8 (&journal->j_checkpoint_mutex){+.+.}-{3:3}, at: jbd2_journal_flush+0x267/0x5f0
[   28.424610] 
[   28.424610] which lock already depends on the new lock.
[   28.424610] 
[   28.425057] 
[   28.425057] the existing dependency chain (in reverse order) is:
[   28.425467] 
[   28.425467] -> #3 (&journal->j_checkpoint_mutex){+.+.}-{3:3}:
[   28.425877]        mutex_lock_io_nested+0xad/0xd60
[   28.426156]        jbd2_journal_flush+0xcf/0x5f0
[   28.426429]        __ext4_ioctl+0x225/0x25c0
[   28.426680]        ext4_ioctl+0x2e/0x40
[   28.426908]        __x64_sys_ioctl+0x10e/0x160
[   28.427171]        do_syscall_64+0x3b/0x90
[   28.427417]        entry_SYSCALL_64_after_hwframe+0x63/0xcd
[   28.427739] 
[   28.427739] -> #2 (&journal->j_barrier){+.+.}-{3:3}:
[   28.428105]        __mutex_lock+0x9c/0xf30
[   28.428342]        mutex_lock_nested+0x1f/0x30
[   28.428600]        jbd2_journal_lock_updates+0xbe/0x1a0
[   28.428909]        ext4_change_inode_journal_flag+0xd4/0x2c0
[   28.429234]        ext4_fileattr_set+0x86d/0xe60
[   28.429501]        vfs_fileattr_set+0x365/0x4e0
[   28.429765]        do_vfs_ioctl+0x378/0xcc0
[   28.430010]        __x64_sys_ioctl+0xca/0x160
[   28.430265]        do_syscall_64+0x3b/0x90
[   28.430508]        entry_SYSCALL_64_after_hwframe+0x63/0xcd
[   28.430832] 
[   28.430832] -> #1 (&sbi->s_writepages_rwsem){++++}-{0:0}:
[   28.431224]        percpu_down_write+0x4d/0x2d0
[   28.431487]        ext4_ind_migrate+0xcc/0x390
[   28.431748]        ext4_fileattr_set+0xbba/0xe60
[   28.432018]        vfs_fileattr_set+0x365/0x4e0
[   28.432280]        do_vfs_ioctl+0x378/0xcc0
[   28.432525]        __x64_sys_ioctl+0xca/0x160
[   28.432780]        do_syscall_64+0x3b/0x90
[   28.433027]        entry_SYSCALL_64_after_hwframe+0x63/0xcd
[   28.433349] 
[   28.433349] -> #0 (&sb->s_type->i_mutex_key#8){++++}-{3:3}:
[   28.433751]        __lock_acquire+0x1023/0x1d20
[   28.434022]        lock_acquire+0xd6/0x2f0
[   28.434259]        down_read+0x45/0x160
[   28.434483]        ext4_bmap+0x39/0x1c0
[   28.434718]        bmap+0x40/0x70
[   28.434925]        jbd2_journal_bmap+0x61/0xe0
[   28.435186]        jbd2_journal_flush+0x461/0x5f0
[   28.435459]        __ext4_ioctl+0x225/0x25c0
[   28.435708]        ext4_ioctl+0x2e/0x40
[   28.435935]        __x64_sys_ioctl+0x10e/0x160
[   28.436193]        do_syscall_64+0x3b/0x90
[   28.436436]        entry_SYSCALL_64_after_hwframe+0x63/0xcd
[   28.436757] 
[   28.436757] other info that might help us debug this:
[   28.436757] 
[   28.437194] Chain exists of:
[   28.437194]   &sb->s_type->i_mutex_key#8 --> &journal->j_barrier --> &journal->j_checkpoint_mutex
[   28.437194] 
[   28.437924]  Possible unsafe locking scenario:
[   28.437924] 
[   28.438251]        CPU0                    CPU1
[   28.438505]        ----                    ----
[   28.438759]   lock(&journal->j_checkpoint_mutex);
[   28.439030]                                lock(&journal->j_barrier);
[   28.439388]                                lock(&journal->j_checkpoint_mutex);
[   28.439788]   lock(&sb->s_type->i_mutex_key#8);
[   28.440052] 
[   28.440052]  *** DEADLOCK ***
[   28.440052] 
[   28.440378] 2 locks held by repro/588:
[   28.440595]  #0: ff11000009332170 (&journal->j_barrier){+.+.}-{3:3}, at: jbd2_journal_lock_updates+0xbe/0x1a0
[   28.441168]  #1: ff110000093323f8 (&journal->j_checkpoint_mutex){+.+.}-{3:3}, at: jbd2_journal_flush+0x267/0x5f0
[   28.441746] 
[   28.441746] stack backtrace:
[   28.441994] CPU: 1 PID: 588 Comm: repro Not tainted 6.1.0-830b3c68c1fb #1
[   28.442375] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[   28.443004] Call Trace:
[   28.443148]  <TASK>
[   28.443275]  dump_stack_lvl+0xa7/0xdb
[   28.443497]  dump_stack+0x19/0x1f
[   28.443699]  print_circular_bug.isra.46.cold.67+0x13e/0x143
[   28.444031]  check_noncircular+0x102/0x120
[   28.444277]  ? write_comp_data+0x2f/0x90
[   28.444508]  ? __sanitizer_cov_trace_pc+0x25/0x60
[   28.444783]  __lock_acquire+0x1023/0x1d20
[   28.445025]  lock_acquire+0xd6/0x2f0
[   28.445236]  ? ext4_bmap+0x39/0x1c0
[   28.445453]  down_read+0x45/0x160
[   28.445652]  ? ext4_bmap+0x39/0x1c0
[   28.445865]  ? ext4_invalidate_folio+0x1e0/0x1e0
[   28.446143]  ext4_bmap+0x39/0x1c0
[   28.446347]  ? ext4_invalidate_folio+0x1e0/0x1e0
[   28.446622]  bmap+0x40/0x70
[   28.446800]  jbd2_journal_bmap+0x61/0xe0
[   28.447038]  jbd2_journal_flush+0x461/0x5f0
[   28.447287]  __ext4_ioctl+0x225/0x25c0
[   28.447510]  ? __sanitizer_cov_trace_pc+0x25/0x60
[   28.447782]  ? do_vfs_ioctl+0xb8/0xcc0
[   28.448006]  ? __sanitizer_cov_trace_pc+0x25/0x60
[   28.448281]  ext4_ioctl+0x2e/0x40
[   28.448480]  __x64_sys_ioctl+0x10e/0x160
[   28.448712]  ? ext4_fileattr_set+0xe60/0xe60
[   28.448964]  do_syscall_64+0x3b/0x90
[   28.449183]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[   28.449479] RIP: 0033:0x7f839e4b959d
[   28.449689] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 8
[   28.450702] RSP: 002b:00007ffd48343808 EFLAGS: 00000217 ORIG_RAX: 0000000000000010
[   28.451127] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f839e4b959d
[   28.451523] RDX: 0000000020000080 RSI: 000000004004662b RDI: 0000000000000004
[   28.451922] RBP: 00007ffd48343820 R08: 00007ffd48343900 R09: 00007ffd48343900
[   28.452317] R10: 00007ffd48343900 R11: 0000000000000217 R12: 0000000000401060
[   28.452712] R13: 00007ffd48343900 R14: 0000000000000000 R15: 0000000000000000
[   28.453116]  </TASK>
[   28.675120] warning: checkpointing journal with EXT4_IOC_CHECKPOINT_FLAG_ZEROOUT can be slow
[   28.898857] warning: checkpointing journal with EXT4_IOC_CHECKPOINT_FLAG_ZEROOUT can be slow
[   29.121957] warning: checkpointing journal with EXT4_IOC_CHECKPOINT_FLAG_ZEROOUT can be slow
[   29.343926] warning: checkpointing journal with EXT4_IOC_CHECKPOINT_FLAG_ZEROOUT can be slow
[   29.563709] warning: checkpointing journal with EXT4_IOC_CHECKPOINT_FLAG_ZEROOUT can be slow
[   29.783732] warning: checkpointing journal with EXT4_IOC_CHECKPOINT_FLAG_ZEROOUT can be slow
[   30.002988] warning: checkpointing journal with EXT4_IOC_CHECKPOINT_FLAG_ZEROOUT can be slow
[   30.222407] warning: checkpointing journal with EXT4_IOC_CHECKPOINT_FLAG_ZEROOUT can be slow
[   30.442175] warning: checkpointing journal with EXT4_IOC_CHECKPOINT_FLAG_ZEROOUT can be slow
[   33.495841] ext4_ioctl_checkpoint: 13 callbacks suppressed
[   33.495853] warning: checkpointing journal with EXT4_IOC_CHECKPOINT_FLAG_ZEROOUT can be slow
[   33.714325] warning: checkpointing journal with EXT4_IOC_CHECKPOINT_FLAG_ZEROOUT can be slow
[   33.932831] warning: checkpointing journal with EXT4_IOC_CHECKPOINT_FLAG_ZEROOUT can be slow
[   34.151299] warning: checkpointing journal with EXT4_IOC_CHECKPOINT_FLAG_ZEROOUT can be slow
[   34.370576] warning: checkpointing journal with EXT4_IOC_CHECKPOINT_FLAG_ZEROOUT can be slow
[   34.589091] warning: checkpointing journal with EXT4_IOC_CHECKPOINT_FLAG_ZEROOUT can be slow
[   34.808403] warning: checkpointing journal with EXT4_IOC_CHECKPOINT_FLAG_ZEROOUT can be slow
[   35.028663] warning: checkpointing journal with EXT4_IOC_CHECKPOINT_FLAG_ZEROOUT can be slow
[   35.247587] warning: checkpointing journal with EXT4_IOC_CHECKPOINT_FLAG_ZEROOUT can be slow
"
Bisect and found that, the first bad commit is:
51ae846cff568c8c29921b1b28eb2dfbcd4ac12d
ext4: fix warning in ext4_iomap_begin as race between bmap and write

After reverted the bad commit on top of v6.1, this issue was gone.

All above bisect info, kconfig, reproduced code from syzkaller is in link:
https://github.com/xupengfe/syzkaller_logs/tree/main/221212_234319_ext4_bmap

============================================================================


There was one more similar "possible deadlock in jbd2_journal_lock_updates"
issue found by syzkaller, and bisected and found that the first bad commit is
also 51ae846cff568c8c29921b1b28eb2dfbcd4ac12d

Reverted above commit on top of v6.1, the jbd2_journal_lock_updates issue was
gone also.

Related issue bisect info, reproduced code is in link:
https://github.com/xupengfe/syzkaller_logs/tree/main/221213_160256_jbd2_journal_lock_updates


Thanks!
BR.

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [Syzkaller & bisect] There is "possible deadlock in ext4_bmap" issue in v6.1 kernel
  2022-12-14  4:19 [Syzkaller & bisect] There is "possible deadlock in ext4_bmap" issue in v6.1 kernel Pengfei Xu
@ 2022-12-14 15:19 ` Theodore Ts'o
  2022-12-15  1:48   ` Pengfei Xu
  0 siblings, 1 reply; 3+ messages in thread
From: Theodore Ts'o @ 2022-12-14 15:19 UTC (permalink / raw)
  To: Pengfei Xu; +Cc: yebin10, jack, linux-kernel, linux-ext4, heng.su

On Wed, Dec 14, 2022 at 12:19:59PM +0800, Pengfei Xu wrote:
> Hi Ye Bin and ext4 expert,

Please include the URL to the syzkaller dashboard where these things
are reported.  The dashboard includes a lot of useful information
which is not included in your report.

It's also better if you reply to the syzkaller report that was
originally sent to LKML and/or the linux-ext4 list.

						- Ted

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [Syzkaller & bisect] There is "possible deadlock in ext4_bmap" issue in v6.1 kernel
  2022-12-14 15:19 ` Theodore Ts'o
@ 2022-12-15  1:48   ` Pengfei Xu
  0 siblings, 0 replies; 3+ messages in thread
From: Pengfei Xu @ 2022-12-15  1:48 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: yebin10, jack, linux-kernel, linux-ext4, heng.su

Hi Ted,

On 2022-12-14 at 10:19:00 -0500, Theodore Ts'o wrote:
> On Wed, Dec 14, 2022 at 12:19:59PM +0800, Pengfei Xu wrote:
> > Hi Ye Bin and ext4 expert,
> 
> Please include the URL to the syzkaller dashboard where these things
> are reported.  The dashboard includes a lot of useful information
> which is not included in your report.
Ah, thanks for suggestion!
I will add the URL from syzkaller dashboard, the link for this issue
is as below?
https://syzkaller.appspot.com/bug?id=e4aaa78795e490421c79f76ec3679006c8ff4cf0

> 
> It's also better if you reply to the syzkaller report that was
> originally sent to LKML and/or the linux-ext4 list.
I will search the LKML and related sub module email list for this issue at
first next.

Thanks!
BR.


> 
> 						- Ted

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2022-12-15  1:48 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-12-14  4:19 [Syzkaller & bisect] There is "possible deadlock in ext4_bmap" issue in v6.1 kernel Pengfei Xu
2022-12-14 15:19 ` Theodore Ts'o
2022-12-15  1:48   ` Pengfei Xu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).