linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* BUG: sleeping function called from invalid context in ext4_superblock_csum_set
@ 2020-11-01 23:24 syzbot
       [not found] ` <20201102033326.3343-1-hdanton@sina.com>
  0 siblings, 1 reply; 3+ messages in thread
From: syzbot @ 2020-11-01 23:24 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel, syzkaller-bugs, viro

Hello,

syzbot found the following issue on:

HEAD commit:    ed8780e3 Merge tag 'x86-urgent-2020-10-27' of git://git.ke..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=174e7b74500000
kernel config:  https://syzkaller.appspot.com/x/.config?x=1be40dc0ffa0bea0
dashboard link: https://syzkaller.appspot.com/bug?extid=7a4ba6a239b91a126c28
compiler:       gcc (GCC) 10.1.0-syz 20200507

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+7a4ba6a239b91a126c28@syzkaller.appspotmail.com

EXT4-fs error (device sda1): mb_free_blocks:1506: group 7, inode 16554: block 229408:freeing already freed block (bit 32); block bitmap corrupt.
BUG: sleeping function called from invalid context at include/linux/buffer_head.h:364
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 9864, name: syz-executor.0
5 locks held by syz-executor.0/9864:
 #0: ffff888015554460 (sb_writers#6){.+.+}-{0:0}, at: sb_start_write include/linux/fs.h:1648 [inline]
 #0: ffff888015554460 (sb_writers#6){.+.+}-{0:0}, at: mnt_want_write+0x3a/0xb0 fs/namespace.c:354
 #1: ffff88801f81d4c8 (&sb->s_type->i_mutex_key#9){+.+.}-{3:3}, at: inode_lock include/linux/fs.h:774 [inline]
 #1: ffff88801f81d4c8 (&sb->s_type->i_mutex_key#9){+.+.}-{3:3}, at: do_truncate+0x125/0x1f0 fs/open.c:62
 #2: ffff88801f81d350 (&ei->i_mmap_sem){++++}-{3:3}, at: ext4_setattr+0xdde/0x1ff0 fs/ext4/inode.c:5415
 #3: ffff88801f81d2b8 (&ei->i_data_sem){++++}-{3:3}, at: ext4_truncate+0x787/0x1420 fs/ext4/inode.c:4246
 #4: ffff88801d7ae1d8 (&bgl->locks[i].lock){+.+.}-{2:2}, at: spin_trylock include/linux/spinlock.h:364 [inline]
 #4: ffff88801d7ae1d8 (&bgl->locks[i].lock){+.+.}-{2:2}, at: ext4_lock_group+0x71/0x240 fs/ext4/ext4.h:3326
Preemption disabled at:
[<0000000000000000>] 0x0
CPU: 2 PID: 9864 Comm: syz-executor.0 Not tainted 5.10.0-rc1-syzkaller #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x107/0x163 lib/dump_stack.c:118
 ___might_sleep.cold+0x1e8/0x22e kernel/sched/core.c:7298
 lock_buffer include/linux/buffer_head.h:364 [inline]
 ext4_superblock_csum_set+0x164/0x3c0 fs/ext4/super.c:301
 ext4_commit_super+0x611/0xc50 fs/ext4/super.c:5536
 __ext4_grp_locked_error+0x4c9/0x570 fs/ext4/super.c:1017
 mb_free_blocks+0xb59/0x15f0 fs/ext4/mballoc.c:1506
 ext4_mb_release_inode_pa.isra.0+0x310/0xca0 fs/ext4/mballoc.c:4177
 ext4_discard_preallocations+0x6c5/0xe90 fs/ext4/mballoc.c:4441
 ext4_truncate+0x791/0x1420 fs/ext4/inode.c:4248
 ext4_setattr+0x133c/0x1ff0 fs/ext4/inode.c:5490
 notify_change+0xb60/0x10a0 fs/attr.c:336
 do_truncate+0x134/0x1f0 fs/open.c:64
 handle_truncate fs/namei.c:2910 [inline]
 do_open fs/namei.c:3256 [inline]
 path_openat+0x2054/0x2730 fs/namei.c:3369
 do_filp_open+0x17e/0x3c0 fs/namei.c:3396
 do_sys_openat2+0x16d/0x420 fs/open.c:1168
 do_sys_open fs/open.c:1184 [inline]
 __do_sys_creat fs/open.c:1258 [inline]
 __se_sys_creat fs/open.c:1252 [inline]
 __x64_sys_creat+0xc9/0x120 fs/open.c:1252
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x45da59
Code: bd b1 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 8b b1 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007fb835092c88 EFLAGS: 00000246 ORIG_RAX: 0000000000000055
RAX: ffffffffffffffda RBX: 00000000006f4da0 RCX: 000000000045da59
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000020000300
RBP: 00000000004aab8b R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000075bf00
R13: 00007fff472d64af R14: 00007fb835073000 R15: 0000000000000003
EXT4-fs error (device sda1): ext4_mb_generate_buddy:802: group 7, block bitmap and bg descriptor inconsistent: 32734 vs 32735 free clusters
EXT4-fs (sda1): pa 00000000af22a596: logic 0, phys. 229408, len 32
EXT4-fs error (device sda1): ext4_mb_release_inode_pa:4186: group 7, free 16, pa_free 15


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: BUG: sleeping function called from invalid context in ext4_superblock_csum_set
       [not found]   ` <CAABuPhbFbQ+_nwDKXjUngtuS5twU6OqKtNu5xYW-d82JJ3cFuQ@mail.gmail.com>
@ 2020-11-04 13:12     ` Jan Kara
  2020-11-11 14:31       ` Jan Kara
  0 siblings, 1 reply; 3+ messages in thread
From: Jan Kara @ 2020-11-04 13:12 UTC (permalink / raw)
  To: Costa Sapuntzakis
  Cc: Hillf Danton, Jan Kara, syzbot, linux-fsdevel, linux-kernel,
	Theodore Ts'o, syzkaller-bugs, viro

On Tue 03-11-20 09:16:19, Costa Sapuntzakis wrote:
> Jan, does this fixup from Hillf look ok to you?  You originally argued for
> lock_buffer/unlock_buffer.
> 
> I think the problem here is that the ext4 code assumes that
> ext4_commit_super will not sleep if sync == 0 (or at least
> __ext4_grp_locked_error deos). Perhaps there should be a comment on
> ext4_commit_super documenting this constraint.

Hum, right. I forgot about that. The spinlock Hillf suggests kind of works
but it still doesn't quite handle the case where superblock is modified in
parallel from another place (that can still lead to sb checksum mismatch on
next load). When we are going for a more complex solution I'd rather solve
this as well... I'm looking into possible solutions now.

								Honza

> 
> A spinlock won't sleep so fixes the problem.
> 
> An alternative is to move all calls to ext4_commit_super( , 0) outside of
> spinlocks.
> 
> -Costa
> 
> On Sun, Nov 1, 2020 at 7:34 PM Hillf Danton <hdanton@sina.com> wrote:
> 
> > On Sun, 01 Nov 2020 15:24:21 -0800
> > > Hello,
> > >
> > > syzbot found the following issue on:
> > >
> > > HEAD commit:    ed8780e3 Merge tag 'x86-urgent-2020-10-27' of git://
> > git.ke..
> > > git tree:       upstream
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=174e7b74500000
> > > kernel config:
> > https://syzkaller.appspot.com/x/.config?x=1be40dc0ffa0bea0
> > > dashboard link:
> > https://syzkaller.appspot.com/bug?extid=7a4ba6a239b91a126c28
> > > compiler:       gcc (GCC) 10.1.0-syz 20200507
> > >
> > > Unfortunately, I don't have any reproducer for this issue yet.
> > >
> > > IMPORTANT: if you fix the issue, please add the following tag to the
> > commit:
> > > Reported-by: syzbot+7a4ba6a239b91a126c28@syzkaller.appspotmail.com
> > >
> > > EXT4-fs error (device sda1): mb_free_blocks:1506: group 7, inode 16554:
> > block 229408:freeing already freed block (bit 32); block bitmap corrupt.
> > > BUG: sleeping function called from invalid context at
> > include/linux/buffer_head.h:364
> > > in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 9864, name:
> > syz-executor.0
> > > 5 locks held by syz-executor.0/9864:
> > >  #0: ffff888015554460 (sb_writers#6){.+.+}-{0:0}, at: sb_start_write
> > include/linux/fs.h:1648 [inline]
> > >  #0: ffff888015554460 (sb_writers#6){.+.+}-{0:0}, at:
> > mnt_want_write+0x3a/0xb0 fs/namespace.c:354
> > >  #1: ffff88801f81d4c8 (&sb->s_type->i_mutex_key#9){+.+.}-{3:3}, at:
> > inode_lock include/linux/fs.h:774 [inline]
> > >  #1: ffff88801f81d4c8 (&sb->s_type->i_mutex_key#9){+.+.}-{3:3}, at:
> > do_truncate+0x125/0x1f0 fs/open.c:62
> > >  #2: ffff88801f81d350 (&ei->i_mmap_sem){++++}-{3:3}, at:
> > ext4_setattr+0xdde/0x1ff0 fs/ext4/inode.c:5415
> > >  #3: ffff88801f81d2b8 (&ei->i_data_sem){++++}-{3:3}, at:
> > ext4_truncate+0x787/0x1420 fs/ext4/inode.c:4246
> > >  #4: ffff88801d7ae1d8 (&bgl->locks[i].lock){+.+.}-{2:2}, at:
> > spin_trylock include/linux/spinlock.h:364 [inline]
> > >  #4: ffff88801d7ae1d8 (&bgl->locks[i].lock){+.+.}-{2:2}, at:
> > ext4_lock_group+0x71/0x240 fs/ext4/ext4.h:3326
> > > Preemption disabled at:
> > > [<0000000000000000>] 0x0
> > > CPU: 2 PID: 9864 Comm: syz-executor.0 Not tainted 5.10.0-rc1-syzkaller #0
> > > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
> > rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
> > > Call Trace:
> > >  __dump_stack lib/dump_stack.c:77 [inline]
> > >  dump_stack+0x107/0x163 lib/dump_stack.c:118
> > >  ___might_sleep.cold+0x1e8/0x22e kernel/sched/core.c:7298
> > >  lock_buffer include/linux/buffer_head.h:364 [inline]
> > >  ext4_superblock_csum_set+0x164/0x3c0 fs/ext4/super.c:301
> > >  ext4_commit_super+0x611/0xc50 fs/ext4/super.c:5536
> > >  __ext4_grp_locked_error+0x4c9/0x570 fs/ext4/super.c:1017
> > >  mb_free_blocks+0xb59/0x15f0 fs/ext4/mballoc.c:1506
> > >  ext4_mb_release_inode_pa.isra.0+0x310/0xca0 fs/ext4/mballoc.c:4177
> > >  ext4_discard_preallocations+0x6c5/0xe90 fs/ext4/mballoc.c:4441
> > >  ext4_truncate+0x791/0x1420 fs/ext4/inode.c:4248
> > >  ext4_setattr+0x133c/0x1ff0 fs/ext4/inode.c:5490
> > >  notify_change+0xb60/0x10a0 fs/attr.c:336
> > >  do_truncate+0x134/0x1f0 fs/open.c:64
> > >  handle_truncate fs/namei.c:2910 [inline]
> > >  do_open fs/namei.c:3256 [inline]
> > >  path_openat+0x2054/0x2730 fs/namei.c:3369
> > >  do_filp_open+0x17e/0x3c0 fs/namei.c:3396
> > >  do_sys_openat2+0x16d/0x420 fs/open.c:1168
> > >  do_sys_open fs/open.c:1184 [inline]
> > >  __do_sys_creat fs/open.c:1258 [inline]
> > >  __se_sys_creat fs/open.c:1252 [inline]
> > >  __x64_sys_creat+0xc9/0x120 fs/open.c:1252
> > >  do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
> > >  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > > RIP: 0033:0x45da59
> > > Code: bd b1 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89
> > f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0
> > ff ff 0f 83 8b b1 fb ff c3 66 2e 0f 1f 84 00 00 00 00
> > > RSP: 002b:00007fb835092c88 EFLAGS: 00000246 ORIG_RAX: 0000000000000055
> > > RAX: ffffffffffffffda RBX: 00000000006f4da0 RCX: 000000000045da59
> > > RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000020000300
> > > RBP: 00000000004aab8b R08: 0000000000000000 R09: 0000000000000000
> > > R10: 0000000000000000 R11: 0000000000000246 R12: 000000000075bf00
> > > R13: 00007fff472d64af R14: 00007fb835073000 R15: 0000000000000003
> > > EXT4-fs error (device sda1): ext4_mb_generate_buddy:802: group 7, block
> > bitmap and bg descriptor inconsistent: 32734 vs 32735 free clusters
> > > EXT4-fs (sda1): pa 00000000af22a596: logic 0, phys. 229408, len 32
> > > EXT4-fs error (device sda1): ext4_mb_release_inode_pa:4186: group 7,
> > free 16, pa_free 15
> > >
> > >
> > > ---
> > > This report is generated by a bot. It may contain errors.
> > > See https://goo.gl/tpsmEJ for more information about syzbot.
> > > syzbot engineers can be reached at syzkaller@googlegroups.com.
> > >
> > > syzbot will keep track of this issue. See:
> > > https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> >
> > Fix acaa532687cd ("ext4: fix superblock checksum calculation race")
> > by adding a spin lock in ext4 sb and replacing lock_buffer() with it.
> >
> > --- a/fs/ext4/super.c
> > +++ b/fs/ext4/super.c
> > @@ -285,6 +285,8 @@ static int ext4_superblock_csum_verify(s
> >  void ext4_superblock_csum_set(struct super_block *sb)
> >  {
> >         struct ext4_super_block *es = EXT4_SB(sb)->s_es;
> > +       struct ext4_sb_info *sbi = EXT4_SB(sb);
> > +       unsigned long flags;
> >
> >         if (!ext4_has_metadata_csum(sb))
> >                 return;
> > @@ -298,9 +300,9 @@ void ext4_superblock_csum_set(struct sup
> >          *  3) the first thread resumes and finishes its checksum
> > calculation
> >          *     and updates s_checksum with a potentially stale or torn
> > value.
> >          */
> > -       lock_buffer(EXT4_SB(sb)->s_sbh);
> > +       spin_lock_irqsave(&sbi->s_cs_lock, flags);
> >         es->s_checksum = ext4_superblock_csum(sb, es);
> > -       unlock_buffer(EXT4_SB(sb)->s_sbh);
> > +       spin_unlock_irqrestore(&sbi->s_cs_lock, flags);
> >  }
> >
> >  ext4_fsblk_t ext4_block_bitmap(struct super_block *sb,
> > --- a/fs/ext4/mballoc.c
> > +++ b/fs/ext4/mballoc.c
> > @@ -2868,6 +2868,7 @@ int ext4_mb_init(struct super_block *sb)
> >                 i++;
> >         } while (i <= sb->s_blocksize_bits + 1);
> >
> > +       spin_lock_init(&sbi->s_cs_lock);
> >         spin_lock_init(&sbi->s_md_lock);
> >         spin_lock_init(&sbi->s_bal_lock);
> >         sbi->s_mb_free_pending = 0;
> > --- a/fs/ext4/ext4.h
> > +++ b/fs/ext4/ext4.h
> > @@ -1439,6 +1439,7 @@ struct ext4_sb_info {
> >         loff_t s_bitmap_maxbytes;       /* max bytes for bitmap files */
> >         struct buffer_head * s_sbh;     /* Buffer containing the super
> > block */
> >         struct ext4_super_block *s_es;  /* Pointer to the super block in
> > the buffer */
> > +       spinlock_t s_cs_lock;           /* SB checksum lock */
> >         struct buffer_head * __rcu *s_group_desc;
> >         unsigned int s_mount_opt;
> >         unsigned int s_mount_opt2;
> >
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: BUG: sleeping function called from invalid context in ext4_superblock_csum_set
  2020-11-04 13:12     ` Jan Kara
@ 2020-11-11 14:31       ` Jan Kara
  0 siblings, 0 replies; 3+ messages in thread
From: Jan Kara @ 2020-11-11 14:31 UTC (permalink / raw)
  To: Costa Sapuntzakis
  Cc: Hillf Danton, Jan Kara, syzbot, linux-fsdevel, linux-kernel,
	Theodore Ts'o, syzkaller-bugs, viro

On Wed 04-11-20 14:12:35, Jan Kara wrote:
> On Tue 03-11-20 09:16:19, Costa Sapuntzakis wrote:
> > Jan, does this fixup from Hillf look ok to you?  You originally argued for
> > lock_buffer/unlock_buffer.
> > 
> > I think the problem here is that the ext4 code assumes that
> > ext4_commit_super will not sleep if sync == 0 (or at least
> > __ext4_grp_locked_error deos). Perhaps there should be a comment on
> > ext4_commit_super documenting this constraint.
> 
> Hum, right. I forgot about that. The spinlock Hillf suggests kind of works
> but it still doesn't quite handle the case where superblock is modified in
> parallel from another place (that can still lead to sb checksum mismatch on
> next load). When we are going for a more complex solution I'd rather solve
> this as well... I'm looking into possible solutions now.

Just an update: I'm still working on this and it's like peeling an onion.
The mixing of journalled superblock updates with unjournalled one is really
evil and commit acaa532687cd fixes just a small part of the problems. So
for now I suggest to just revert acaa532687cd (to avoid sleep in atomic
context) and I'll submit larger set of fixes once they are ready.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2020-11-11 14:31 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-01 23:24 BUG: sleeping function called from invalid context in ext4_superblock_csum_set syzbot
     [not found] ` <20201102033326.3343-1-hdanton@sina.com>
     [not found]   ` <CAABuPhbFbQ+_nwDKXjUngtuS5twU6OqKtNu5xYW-d82JJ3cFuQ@mail.gmail.com>
2020-11-04 13:12     ` Jan Kara
2020-11-11 14:31       ` Jan Kara

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).