linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* WARNING: bad unlock balance in ovl_llseek
@ 2020-03-01 19:13 syzbot
  2020-03-02 11:10 ` Amir Goldstein
  0 siblings, 1 reply; 11+ messages in thread
From: syzbot @ 2020-03-01 19:13 UTC (permalink / raw)
  To: amir73il, linux-kernel, linux-unionfs, miklos, mszeredi, syzkaller-bugs

Hello,

syzbot found the following crash on:

HEAD commit:    f8788d86 Linux 5.6-rc3
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=13c5f8f9e00000
kernel config:  https://syzkaller.appspot.com/x/.config?x=5d2e033af114153f
dashboard link: https://syzkaller.appspot.com/bug?extid=66a9752fa927f745385e
compiler:       clang version 10.0.0 (https://github.com/llvm/llvm-project/ c2443155a0fb245c8f17f2c1c72b6ea391e86e81)
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=131d9a81e00000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=14117a81e00000

The bug was bisected to:

commit b1f9d3858f724ed45b279b689fb5b400d91352e3
Author: Amir Goldstein <amir73il@gmail.com>
Date:   Sat Dec 21 09:42:29 2019 +0000

    ovl: use ovl_inode_lock in ovl_llseek()

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=16ff3bede00000
final crash:    https://syzkaller.appspot.com/x/report.txt?x=15ff3bede00000
console output: https://syzkaller.appspot.com/x/log.txt?x=11ff3bede00000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+66a9752fa927f745385e@syzkaller.appspotmail.com
Fixes: b1f9d3858f72 ("ovl: use ovl_inode_lock in ovl_llseek()")

=====================================
WARNING: bad unlock balance detected!
5.6.0-rc3-syzkaller #0 Not tainted
-------------------------------------
syz-executor194/8947 is trying to release lock (&ovl_i_lock_key[depth]) at:
[<ffffffff828b7835>] ovl_inode_unlock fs/overlayfs/overlayfs.h:328 [inline]
[<ffffffff828b7835>] ovl_llseek+0x215/0x2c0 fs/overlayfs/file.c:193
but there are no more locks to release!

other info that might help us debug this:
1 lock held by syz-executor194/8947:
 #0: ffff88809742ade0 (&f->f_pos_lock){+.+.}, at: __fdget_pos+0x279/0x310 fs/file.c:821

stack backtrace:
CPU: 0 PID: 8947 Comm: syz-executor194 Not tainted 5.6.0-rc3-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1fb/0x318 lib/dump_stack.c:118
 print_unlock_imbalance_bug+0x20b/0x240 kernel/locking/lockdep.c:4007
 __lock_release kernel/locking/lockdep.c:4241 [inline]
 lock_release+0x469/0x710 kernel/locking/lockdep.c:4502
 __mutex_unlock_slowpath+0x80/0x5b0 kernel/locking/mutex.c:1228
 mutex_unlock+0xd/0x10 kernel/locking/mutex.c:740
 ovl_inode_unlock fs/overlayfs/overlayfs.h:328 [inline]
 ovl_llseek+0x215/0x2c0 fs/overlayfs/file.c:193
 vfs_llseek fs/read_write.c:300 [inline]
 ksys_lseek fs/read_write.c:313 [inline]
 __do_sys_lseek fs/read_write.c:324 [inline]
 __se_sys_lseek fs/read_write.c:322 [inline]
 __x64_sys_lseek+0x15f/0x1f0 fs/read_write.c:322
 do_syscall_64+0xf7/0x1c0 arch/x86/entry/common.c:294
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x445df9
Code: e8 ac b8 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 2b 12 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007ff735aecce8 EFLAGS: 00000246 ORIG_RAX: 0000000000000008
RAX: ffffffffffffffda RBX: 00000000006dac38 RCX: 0000000000445df


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: WARNING: bad unlock balance in ovl_llseek
  2020-03-01 19:13 WARNING: bad unlock balance in ovl_llseek syzbot
@ 2020-03-02 11:10 ` Amir Goldstein
  2020-03-02 11:17   ` Miklos Szeredi
  2020-03-02 11:29   ` Dmitry Vyukov
  0 siblings, 2 replies; 11+ messages in thread
From: Amir Goldstein @ 2020-03-02 11:10 UTC (permalink / raw)
  To: syzbot, Dmitry Vyukov
  Cc: linux-kernel, overlayfs, Miklos Szeredi, Miklos Szeredi, syzkaller-bugs

On Sun, Mar 1, 2020 at 9:13 PM syzbot
<syzbot+66a9752fa927f745385e@syzkaller.appspotmail.com> wrote:
>
> Hello,
>
> syzbot found the following crash on:
>
> HEAD commit:    f8788d86 Linux 5.6-rc3
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=13c5f8f9e00000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=5d2e033af114153f
> dashboard link: https://syzkaller.appspot.com/bug?extid=66a9752fa927f745385e
> compiler:       clang version 10.0.0 (https://github.com/llvm/llvm-project/ c2443155a0fb245c8f17f2c1c72b6ea391e86e81)
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=131d9a81e00000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=14117a81e00000
>

Dmitry,

There is something strange about the C repro.
It passes an invalid address for the first arg of mount syscall:

    syscall(__NR_mount, 0x400000ul, 0x20000000ul, 0x20000080ul, 0ul,
            0x20000100ul);

With this address mount syscall returns -EFAULT on my system.
I fixed this manually, but repro did not trigger the reported bug on my system.

> The bug was bisected to:
>
> commit b1f9d3858f724ed45b279b689fb5b400d91352e3
> Author: Amir Goldstein <amir73il@gmail.com>
> Date:   Sat Dec 21 09:42:29 2019 +0000
>
>     ovl: use ovl_inode_lock in ovl_llseek()
>
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=16ff3bede00000
> final crash:    https://syzkaller.appspot.com/x/report.txt?x=15ff3bede00000
> console output: https://syzkaller.appspot.com/x/log.txt?x=11ff3bede00000
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+66a9752fa927f745385e@syzkaller.appspotmail.com
> Fixes: b1f9d3858f72 ("ovl: use ovl_inode_lock in ovl_llseek()")
>
> =====================================
> WARNING: bad unlock balance detected!
> 5.6.0-rc3-syzkaller #0 Not tainted
> -------------------------------------
> syz-executor194/8947 is trying to release lock (&ovl_i_lock_key[depth]) at:
> [<ffffffff828b7835>] ovl_inode_unlock fs/overlayfs/overlayfs.h:328 [inline]
> [<ffffffff828b7835>] ovl_llseek+0x215/0x2c0 fs/overlayfs/file.c:193
> but there are no more locks to release!
>

This is strange. I don't see how that can happen nor how my change would
have caused this regression. If anything, the lock chance may have brought
a bug in stack file ops to light, but don't see the bug.

The repro is multi-threaded but when I ran the repro, a single thread did:
- open lower file (pre copy up)
- lchown file (copy up)
- llseek the open file (so llseek is on a temporary ovl_open_realfile())

Perhaps when bug was triggered ops above were executed by different
threads?

Dmitry, I may have asked this before - how hard would it be to attach an
strace of the repro to a bug report?

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: WARNING: bad unlock balance in ovl_llseek
  2020-03-02 11:10 ` Amir Goldstein
@ 2020-03-02 11:17   ` Miklos Szeredi
  2020-03-02 12:19     ` Amir Goldstein
  2020-03-02 11:29   ` Dmitry Vyukov
  1 sibling, 1 reply; 11+ messages in thread
From: Miklos Szeredi @ 2020-03-02 11:17 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: syzbot, Dmitry Vyukov, linux-kernel, overlayfs, Miklos Szeredi,
	syzkaller-bugs

On Mon, Mar 2, 2020 at 12:10 PM Amir Goldstein <amir73il@gmail.com> wrote:

> > =====================================
> > WARNING: bad unlock balance detected!
> > 5.6.0-rc3-syzkaller #0 Not tainted
> > -------------------------------------
> > syz-executor194/8947 is trying to release lock (&ovl_i_lock_key[depth]) at:
> > [<ffffffff828b7835>] ovl_inode_unlock fs/overlayfs/overlayfs.h:328 [inline]
> > [<ffffffff828b7835>] ovl_llseek+0x215/0x2c0 fs/overlayfs/file.c:193
> > but there are no more locks to release!
> >
>
> This is strange. I don't see how that can happen nor how my change would
> have caused this regression. If anything, the lock chance may have brought
> a bug in stack file ops to light, but don't see the bug.

The bug is that ovl_inode_lock() is interruptible and that the caller
doesn't check for error.

I think the fix is to make this lock uninterruptible (probably rename
the current helper to _interruptible and use the current name as the
uninterruptible version).

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: WARNING: bad unlock balance in ovl_llseek
  2020-03-02 11:10 ` Amir Goldstein
  2020-03-02 11:17   ` Miklos Szeredi
@ 2020-03-02 11:29   ` Dmitry Vyukov
  2020-03-02 12:10     ` Amir Goldstein
  1 sibling, 1 reply; 11+ messages in thread
From: Dmitry Vyukov @ 2020-03-02 11:29 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: syzbot, linux-kernel, overlayfs, Miklos Szeredi, Miklos Szeredi,
	syzkaller-bugs

On Mon, Mar 2, 2020 at 12:10 PM Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Sun, Mar 1, 2020 at 9:13 PM syzbot
> <syzbot+66a9752fa927f745385e@syzkaller.appspotmail.com> wrote:
> >
> > Hello,
> >
> > syzbot found the following crash on:
> >
> > HEAD commit:    f8788d86 Linux 5.6-rc3
> > git tree:       upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=13c5f8f9e00000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=5d2e033af114153f
> > dashboard link: https://syzkaller.appspot.com/bug?extid=66a9752fa927f745385e
> > compiler:       clang version 10.0.0 (https://github.com/llvm/llvm-project/ c2443155a0fb245c8f17f2c1c72b6ea391e86e81)
> > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=131d9a81e00000
> > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=14117a81e00000
> >
>
> Dmitry,
>
> There is something strange about the C repro.
> It passes an invalid address for the first arg of mount syscall:
>
>     syscall(__NR_mount, 0x400000ul, 0x20000000ul, 0x20000080ul, 0ul,
>             0x20000100ul);
>
> With this address mount syscall returns -EFAULT on my system.
> I fixed this manually, but repro did not trigger the reported bug on my system.

Hi Amir,

This is not strange in the context of fuzzer, it's goal is to pass
random data. Generally if it says 0x400000ul, that's what it is, don't
fix it, or you are running a different program that may not reproduce
the bug. If syzbot attaches a reproducer, the bug was triggered by
precisely this program.

The reason why it passes non-pointers here is we think the src
argument of overlay mount is unused:
https://github.com/google/syzkaller/blob/4a4e0509de520c7139ca2b5606712cbadc550db2/sys/linux/filesystem.txt#L12
If it's not true, it needs to be fixed (or almost all overlay mounts
fail with EFAULT during fuzzing).


> > The bug was bisected to:
> >
> > commit b1f9d3858f724ed45b279b689fb5b400d91352e3
> > Author: Amir Goldstein <amir73il@gmail.com>
> > Date:   Sat Dec 21 09:42:29 2019 +0000
> >
> >     ovl: use ovl_inode_lock in ovl_llseek()
> >
> > bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=16ff3bede00000
> > final crash:    https://syzkaller.appspot.com/x/report.txt?x=15ff3bede00000
> > console output: https://syzkaller.appspot.com/x/log.txt?x=11ff3bede00000
> >
> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > Reported-by: syzbot+66a9752fa927f745385e@syzkaller.appspotmail.com
> > Fixes: b1f9d3858f72 ("ovl: use ovl_inode_lock in ovl_llseek()")
> >
> > =====================================
> > WARNING: bad unlock balance detected!
> > 5.6.0-rc3-syzkaller #0 Not tainted
> > -------------------------------------
> > syz-executor194/8947 is trying to release lock (&ovl_i_lock_key[depth]) at:
> > [<ffffffff828b7835>] ovl_inode_unlock fs/overlayfs/overlayfs.h:328 [inline]
> > [<ffffffff828b7835>] ovl_llseek+0x215/0x2c0 fs/overlayfs/file.c:193
> > but there are no more locks to release!
> >
>
> This is strange. I don't see how that can happen nor how my change would
> have caused this regression. If anything, the lock chance may have brought
> a bug in stack file ops to light, but don't see the bug.
>
> The repro is multi-threaded but when I ran the repro, a single thread did:
> - open lower file (pre copy up)
> - lchown file (copy up)
> - llseek the open file (so llseek is on a temporary ovl_open_realfile())
>
> Perhaps when bug was triggered ops above were executed by different
> threads?

Perfectly possible.

> Dmitry, I may have asked this before - how hard would it be to attach an
> strace of the repro to a bug report?

This is tracked in https://github.com/google/syzkaller/issues/197 but
no progress so far.
What exactly were the main pain points in this case? But note that
strace is not atomic with actual execution, so it may lead you down
even worse rabbit hole...

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: WARNING: bad unlock balance in ovl_llseek
  2020-03-02 11:29   ` Dmitry Vyukov
@ 2020-03-02 12:10     ` Amir Goldstein
  2020-03-02 12:34       ` Dmitry Vyukov
  0 siblings, 1 reply; 11+ messages in thread
From: Amir Goldstein @ 2020-03-02 12:10 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: syzbot, linux-kernel, overlayfs, Miklos Szeredi, Miklos Szeredi,
	syzkaller-bugs

On Mon, Mar 2, 2020 at 1:29 PM Dmitry Vyukov <dvyukov@google.com> wrote:
>
> On Mon, Mar 2, 2020 at 12:10 PM Amir Goldstein <amir73il@gmail.com> wrote:
> >
> > On Sun, Mar 1, 2020 at 9:13 PM syzbot
> > <syzbot+66a9752fa927f745385e@syzkaller.appspotmail.com> wrote:
> > >
> > > Hello,
> > >
> > > syzbot found the following crash on:
> > >
> > > HEAD commit:    f8788d86 Linux 5.6-rc3
> > > git tree:       upstream
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=13c5f8f9e00000
> > > kernel config:  https://syzkaller.appspot.com/x/.config?x=5d2e033af114153f
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=66a9752fa927f745385e
> > > compiler:       clang version 10.0.0 (https://github.com/llvm/llvm-project/ c2443155a0fb245c8f17f2c1c72b6ea391e86e81)
> > > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=131d9a81e00000
> > > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=14117a81e00000
> > >
> >
> > Dmitry,
> >
> > There is something strange about the C repro.
> > It passes an invalid address for the first arg of mount syscall:
> >
> >     syscall(__NR_mount, 0x400000ul, 0x20000000ul, 0x20000080ul, 0ul,
> >             0x20000100ul);
> >
> > With this address mount syscall returns -EFAULT on my system.
> > I fixed this manually, but repro did not trigger the reported bug on my system.
>
> Hi Amir,
>
> This is not strange in the context of fuzzer, it's goal is to pass
> random data. Generally if it says 0x400000ul, that's what it is, don't
> fix it, or you are running a different program that may not reproduce
> the bug. If syzbot attaches a reproducer, the bug was triggered by
> precisely this program.
>

What's strange it that a bug in overlay code cannot be triggered if overlay
isn't mounted and as it is the repro couldn't mount overlayfs at all, at
lease with my kernel config.
The bounds check that causes mount failure is in vfs code, not in
overlayfs code,
so not sure what exactly went on there.


> The reason why it passes non-pointers here is we think the src
> argument of overlay mount is unused:
> https://github.com/google/syzkaller/blob/4a4e0509de520c7139ca2b5606712cbadc550db2/sys/linux/filesystem.txt#L12
> If it's not true, it needs to be fixed (or almost all overlay mounts
> fail with EFAULT during fuzzing).
>
>
> > > The bug was bisected to:
> > >
> > > commit b1f9d3858f724ed45b279b689fb5b400d91352e3
> > > Author: Amir Goldstein <amir73il@gmail.com>
> > > Date:   Sat Dec 21 09:42:29 2019 +0000
> > >
> > >     ovl: use ovl_inode_lock in ovl_llseek()
> > >
> > > bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=16ff3bede00000
> > > final crash:    https://syzkaller.appspot.com/x/report.txt?x=15ff3bede00000
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=11ff3bede00000
> > >
> > > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > > Reported-by: syzbot+66a9752fa927f745385e@syzkaller.appspotmail.com
> > > Fixes: b1f9d3858f72 ("ovl: use ovl_inode_lock in ovl_llseek()")
> > >
> > > =====================================
> > > WARNING: bad unlock balance detected!
> > > 5.6.0-rc3-syzkaller #0 Not tainted
> > > -------------------------------------
> > > syz-executor194/8947 is trying to release lock (&ovl_i_lock_key[depth]) at:
> > > [<ffffffff828b7835>] ovl_inode_unlock fs/overlayfs/overlayfs.h:328 [inline]
> > > [<ffffffff828b7835>] ovl_llseek+0x215/0x2c0 fs/overlayfs/file.c:193
> > > but there are no more locks to release!
> > >
> >
> > This is strange. I don't see how that can happen nor how my change would
> > have caused this regression. If anything, the lock chance may have brought
> > a bug in stack file ops to light, but don't see the bug.
> >
> > The repro is multi-threaded but when I ran the repro, a single thread did:
> > - open lower file (pre copy up)
> > - lchown file (copy up)
> > - llseek the open file (so llseek is on a temporary ovl_open_realfile())
> >
> > Perhaps when bug was triggered ops above were executed by different
> > threads?
>
> Perfectly possible.
>
> > Dmitry, I may have asked this before - how hard would it be to attach an
> > strace of the repro to a bug report?
>
> This is tracked in https://github.com/google/syzkaller/issues/197 but
> no progress so far.
> What exactly were the main pain points in this case? But note that
> strace is not atomic with actual execution, so it may lead you down
> even worse rabbit hole...

Sure, but it can add more insight for analysis.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: WARNING: bad unlock balance in ovl_llseek
  2020-03-02 11:17   ` Miklos Szeredi
@ 2020-03-02 12:19     ` Amir Goldstein
  2020-03-03  7:34       ` Amir Goldstein
  0 siblings, 1 reply; 11+ messages in thread
From: Amir Goldstein @ 2020-03-02 12:19 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: syzbot, Dmitry Vyukov, linux-kernel, overlayfs, Miklos Szeredi,
	syzkaller-bugs

On Mon, Mar 2, 2020 at 1:17 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
>
> On Mon, Mar 2, 2020 at 12:10 PM Amir Goldstein <amir73il@gmail.com> wrote:
>
> > > =====================================
> > > WARNING: bad unlock balance detected!
> > > 5.6.0-rc3-syzkaller #0 Not tainted
> > > -------------------------------------
> > > syz-executor194/8947 is trying to release lock (&ovl_i_lock_key[depth]) at:
> > > [<ffffffff828b7835>] ovl_inode_unlock fs/overlayfs/overlayfs.h:328 [inline]
> > > [<ffffffff828b7835>] ovl_llseek+0x215/0x2c0 fs/overlayfs/file.c:193
> > > but there are no more locks to release!
> > >
> >
> > This is strange. I don't see how that can happen nor how my change would
> > have caused this regression. If anything, the lock chance may have brought
> > a bug in stack file ops to light, but don't see the bug.
>
> The bug is that ovl_inode_lock() is interruptible and that the caller
> doesn't check for error.
>
> I think the fix is to make this lock uninterruptible (probably rename
> the current helper to _interruptible and use the current name as the
> uninterruptible version).
>

Ok.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: WARNING: bad unlock balance in ovl_llseek
  2020-03-02 12:10     ` Amir Goldstein
@ 2020-03-02 12:34       ` Dmitry Vyukov
  2020-03-02 13:24         ` Amir Goldstein
  0 siblings, 1 reply; 11+ messages in thread
From: Dmitry Vyukov @ 2020-03-02 12:34 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: syzbot, linux-kernel, overlayfs, Miklos Szeredi, Miklos Szeredi,
	syzkaller-bugs

On Mon, Mar 2, 2020 at 1:10 PM Amir Goldstein <amir73il@gmail.com> wrote:
> > > On Sun, Mar 1, 2020 at 9:13 PM syzbot
> > > <syzbot+66a9752fa927f745385e@syzkaller.appspotmail.com> wrote:
> > > >
> > > > Hello,
> > > >
> > > > syzbot found the following crash on:
> > > >
> > > > HEAD commit:    f8788d86 Linux 5.6-rc3
> > > > git tree:       upstream
> > > > console output: https://syzkaller.appspot.com/x/log.txt?x=13c5f8f9e00000
> > > > kernel config:  https://syzkaller.appspot.com/x/.config?x=5d2e033af114153f
> > > > dashboard link: https://syzkaller.appspot.com/bug?extid=66a9752fa927f745385e
> > > > compiler:       clang version 10.0.0 (https://github.com/llvm/llvm-project/ c2443155a0fb245c8f17f2c1c72b6ea391e86e81)
> > > > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=131d9a81e00000
> > > > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=14117a81e00000
> > > >
> > >
> > > Dmitry,
> > >
> > > There is something strange about the C repro.
> > > It passes an invalid address for the first arg of mount syscall:
> > >
> > >     syscall(__NR_mount, 0x400000ul, 0x20000000ul, 0x20000080ul, 0ul,
> > >             0x20000100ul);
> > >
> > > With this address mount syscall returns -EFAULT on my system.
> > > I fixed this manually, but repro did not trigger the reported bug on my system.
> >
> > Hi Amir,
> >
> > This is not strange in the context of fuzzer, it's goal is to pass
> > random data. Generally if it says 0x400000ul, that's what it is, don't
> > fix it, or you are running a different program that may not reproduce
> > the bug. If syzbot attaches a reproducer, the bug was triggered by
> > precisely this program.
> >
>
> What's strange it that a bug in overlay code cannot be triggered if overlay
> isn't mounted and as it is the repro couldn't mount overlayfs at all, at
> lease with my kernel config.

Can it depend on kernel config? The bug was triggered by the program
provided somehow.

Separate question: why is it failing? Isn't src unused for overlayfs?
Where/how does vfs code look at src?

> The bounds check that causes mount failure is in vfs code, not in
> overlayfs code,
> so not sure what exactly went on there.
>
>
> > The reason why it passes non-pointers here is we think the src
> > argument of overlay mount is unused:
> > https://github.com/google/syzkaller/blob/4a4e0509de520c7139ca2b5606712cbadc550db2/sys/linux/filesystem.txt#L12
> > If it's not true, it needs to be fixed (or almost all overlay mounts
> > fail with EFAULT during fuzzing).
> >
> >
> > > > The bug was bisected to:
> > > >
> > > > commit b1f9d3858f724ed45b279b689fb5b400d91352e3
> > > > Author: Amir Goldstein <amir73il@gmail.com>
> > > > Date:   Sat Dec 21 09:42:29 2019 +0000
> > > >
> > > >     ovl: use ovl_inode_lock in ovl_llseek()
> > > >
> > > > bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=16ff3bede00000
> > > > final crash:    https://syzkaller.appspot.com/x/report.txt?x=15ff3bede00000
> > > > console output: https://syzkaller.appspot.com/x/log.txt?x=11ff3bede00000
> > > >
> > > > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > > > Reported-by: syzbot+66a9752fa927f745385e@syzkaller.appspotmail.com
> > > > Fixes: b1f9d3858f72 ("ovl: use ovl_inode_lock in ovl_llseek()")
> > > >
> > > > =====================================
> > > > WARNING: bad unlock balance detected!
> > > > 5.6.0-rc3-syzkaller #0 Not tainted
> > > > -------------------------------------
> > > > syz-executor194/8947 is trying to release lock (&ovl_i_lock_key[depth]) at:
> > > > [<ffffffff828b7835>] ovl_inode_unlock fs/overlayfs/overlayfs.h:328 [inline]
> > > > [<ffffffff828b7835>] ovl_llseek+0x215/0x2c0 fs/overlayfs/file.c:193
> > > > but there are no more locks to release!
> > > >
> > >
> > > This is strange. I don't see how that can happen nor how my change would
> > > have caused this regression. If anything, the lock chance may have brought
> > > a bug in stack file ops to light, but don't see the bug.
> > >
> > > The repro is multi-threaded but when I ran the repro, a single thread did:
> > > - open lower file (pre copy up)
> > > - lchown file (copy up)
> > > - llseek the open file (so llseek is on a temporary ovl_open_realfile())
> > >
> > > Perhaps when bug was triggered ops above were executed by different
> > > threads?
> >
> > Perfectly possible.
> >
> > > Dmitry, I may have asked this before - how hard would it be to attach an
> > > strace of the repro to a bug report?
> >
> > This is tracked in https://github.com/google/syzkaller/issues/197 but
> > no progress so far.
> > What exactly were the main pain points in this case? But note that
> > strace is not atomic with actual execution, so it may lead you down
> > even worse rabbit hole...
>
> Sure, but it can add more insight for analysis.
>
> Thanks,
> Amir.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: WARNING: bad unlock balance in ovl_llseek
  2020-03-02 12:34       ` Dmitry Vyukov
@ 2020-03-02 13:24         ` Amir Goldstein
  2020-03-02 14:57           ` Dmitry Vyukov
  0 siblings, 1 reply; 11+ messages in thread
From: Amir Goldstein @ 2020-03-02 13:24 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: syzbot, linux-kernel, overlayfs, Miklos Szeredi, Miklos Szeredi,
	syzkaller-bugs

On Mon, Mar 2, 2020 at 2:35 PM Dmitry Vyukov <dvyukov@google.com> wrote:
>
> On Mon, Mar 2, 2020 at 1:10 PM Amir Goldstein <amir73il@gmail.com> wrote:
> > > > On Sun, Mar 1, 2020 at 9:13 PM syzbot
> > > > <syzbot+66a9752fa927f745385e@syzkaller.appspotmail.com> wrote:
> > > > >
> > > > > Hello,
> > > > >
> > > > > syzbot found the following crash on:
> > > > >
> > > > > HEAD commit:    f8788d86 Linux 5.6-rc3
> > > > > git tree:       upstream
> > > > > console output: https://syzkaller.appspot.com/x/log.txt?x=13c5f8f9e00000
> > > > > kernel config:  https://syzkaller.appspot.com/x/.config?x=5d2e033af114153f
> > > > > dashboard link: https://syzkaller.appspot.com/bug?extid=66a9752fa927f745385e
> > > > > compiler:       clang version 10.0.0 (https://github.com/llvm/llvm-project/ c2443155a0fb245c8f17f2c1c72b6ea391e86e81)
> > > > > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=131d9a81e00000
> > > > > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=14117a81e00000
> > > > >
> > > >
> > > > Dmitry,
> > > >
> > > > There is something strange about the C repro.
> > > > It passes an invalid address for the first arg of mount syscall:
> > > >
> > > >     syscall(__NR_mount, 0x400000ul, 0x20000000ul, 0x20000080ul, 0ul,
> > > >             0x20000100ul);
> > > >
> > > > With this address mount syscall returns -EFAULT on my system.
> > > > I fixed this manually, but repro did not trigger the reported bug on my system.
> > >
> > > Hi Amir,
> > >
> > > This is not strange in the context of fuzzer, it's goal is to pass
> > > random data. Generally if it says 0x400000ul, that's what it is, don't
> > > fix it, or you are running a different program that may not reproduce
> > > the bug. If syzbot attaches a reproducer, the bug was triggered by
> > > precisely this program.
> > >
> >
> > What's strange it that a bug in overlay code cannot be triggered if overlay
> > isn't mounted and as it is the repro couldn't mount overlayfs at all, at
> > lease with my kernel config.
>
> Can it depend on kernel config? The bug was triggered by the program
> provided somehow.

I am not sure. I do not have CONFIG_HARDENED_USERCOPY set.

>
> Separate question: why is it failing? Isn't src unused for overlayfs?
> Where/how does vfs code look at src?
>

SYSCALL_DEFINE5(mount, ...
copy_mount_string(dev_name)
strndup_user()
memdup_user()
copy_from_user()

Not in overlayfs code.
Actually, the source (dev) is not used by overlayfs but is visible at
/proc/mounts.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: WARNING: bad unlock balance in ovl_llseek
  2020-03-02 13:24         ` Amir Goldstein
@ 2020-03-02 14:57           ` Dmitry Vyukov
  0 siblings, 0 replies; 11+ messages in thread
From: Dmitry Vyukov @ 2020-03-02 14:57 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: syzbot, linux-kernel, overlayfs, Miklos Szeredi, Miklos Szeredi,
	syzkaller-bugs, syzkaller

On Mon, Mar 2, 2020 at 2:24 PM Amir Goldstein <amir73il@gmail.com> wrote:
> > > > > On Sun, Mar 1, 2020 at 9:13 PM syzbot
> > > > > <syzbot+66a9752fa927f745385e@syzkaller.appspotmail.com> wrote:
> > > > > >
> > > > > > Hello,
> > > > > >
> > > > > > syzbot found the following crash on:
> > > > > >
> > > > > > HEAD commit:    f8788d86 Linux 5.6-rc3
> > > > > > git tree:       upstream
> > > > > > console output: https://syzkaller.appspot.com/x/log.txt?x=13c5f8f9e00000
> > > > > > kernel config:  https://syzkaller.appspot.com/x/.config?x=5d2e033af114153f
> > > > > > dashboard link: https://syzkaller.appspot.com/bug?extid=66a9752fa927f745385e
> > > > > > compiler:       clang version 10.0.0 (https://github.com/llvm/llvm-project/ c2443155a0fb245c8f17f2c1c72b6ea391e86e81)
> > > > > > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=131d9a81e00000
> > > > > > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=14117a81e00000
> > > > > >
> > > > >
> > > > > Dmitry,
> > > > >
> > > > > There is something strange about the C repro.
> > > > > It passes an invalid address for the first arg of mount syscall:
> > > > >
> > > > >     syscall(__NR_mount, 0x400000ul, 0x20000000ul, 0x20000080ul, 0ul,
> > > > >             0x20000100ul);
> > > > >
> > > > > With this address mount syscall returns -EFAULT on my system.
> > > > > I fixed this manually, but repro did not trigger the reported bug on my system.
> > > >
> > > > Hi Amir,
> > > >
> > > > This is not strange in the context of fuzzer, it's goal is to pass
> > > > random data. Generally if it says 0x400000ul, that's what it is, don't
> > > > fix it, or you are running a different program that may not reproduce
> > > > the bug. If syzbot attaches a reproducer, the bug was triggered by
> > > > precisely this program.
> > > >
> > >
> > > What's strange it that a bug in overlay code cannot be triggered if overlay
> > > isn't mounted and as it is the repro couldn't mount overlayfs at all, at
> > > lease with my kernel config.
> >
> > Can it depend on kernel config? The bug was triggered by the program
> > provided somehow.
>
> I am not sure. I do not have CONFIG_HARDENED_USERCOPY set.
>
> >
> > Separate question: why is it failing? Isn't src unused for overlayfs?
> > Where/how does vfs code look at src?
> >
>
> SYSCALL_DEFINE5(mount, ...
> copy_mount_string(dev_name)
> strndup_user()
> memdup_user()
> copy_from_user()
>
> Not in overlayfs code.
> Actually, the source (dev) is not used by overlayfs but is visible at
> /proc/mounts.

Oh, I see, this is another instance of "fuzzer fun".

In the descriptions we define src argument as const 0. And const 0 is
fine and is accepted by copy_mount_string (it has a check for NULL).
Generally fuzzer does not try to change values specified as const, but
sometimes it does. So I guess it happened so that address 0x400000ul
is mapped onto the executable and contained something that resembles a
null-terminated string so that copy_mount_string did not fail (but
otherwise that string does not matter much for overlayfs). But in your
binary 0x400000ul did not contain an addressable null-terminated
string and mount failed.
Additionally we don't attempt changing const value back to the default
value during crash mimization/simplification process:
https://github.com/google/syzkaller/blob/4a4e0509de520c7139ca2b5606712cbadc550db2/prog/minimization.go#L202-L206
because it was deemed too expensive (for each attempt we need a
freshly booted and clean machine) and not important enough (just a
single arg value and does not increase "systematic complexity" of the
repro).

All of this has combined into the effect we see here... I am not sure
what's the action item here...

FWIW fuzzer-found will always be more expensive to debug and deal with
for a very long tail of various reasons. Unit tests don't have this
problem. If only we had a comprehensive test coverage for kernel, we
would not need to deal with so many fuzzer-found bugs... ;)

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: WARNING: bad unlock balance in ovl_llseek
  2020-03-02 12:19     ` Amir Goldstein
@ 2020-03-03  7:34       ` Amir Goldstein
  2020-03-06 22:35         ` syzbot
  0 siblings, 1 reply; 11+ messages in thread
From: Amir Goldstein @ 2020-03-03  7:34 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: syzbot, Dmitry Vyukov, linux-kernel, overlayfs, Miklos Szeredi,
	syzkaller-bugs

On Mon, Mar 2, 2020 at 2:19 PM Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Mon, Mar 2, 2020 at 1:17 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
> >
> > On Mon, Mar 2, 2020 at 12:10 PM Amir Goldstein <amir73il@gmail.com> wrote:
> >
> > > > =====================================
> > > > WARNING: bad unlock balance detected!
> > > > 5.6.0-rc3-syzkaller #0 Not tainted
> > > > -------------------------------------
> > > > syz-executor194/8947 is trying to release lock (&ovl_i_lock_key[depth]) at:
> > > > [<ffffffff828b7835>] ovl_inode_unlock fs/overlayfs/overlayfs.h:328 [inline]
> > > > [<ffffffff828b7835>] ovl_llseek+0x215/0x2c0 fs/overlayfs/file.c:193
> > > > but there are no more locks to release!
> > > >
> > >
> > > This is strange. I don't see how that can happen nor how my change would
> > > have caused this regression. If anything, the lock chance may have brought
> > > a bug in stack file ops to light, but don't see the bug.
> >
> > The bug is that ovl_inode_lock() is interruptible and that the caller
> > doesn't check for error.
> >
> > I think the fix is to make this lock uninterruptible (probably rename
> > the current helper to _interruptible and use the current name as the
> > uninterruptible version).
> >
>

Miklos,

Added the patch to ovl-fixes along with another ovl fix for fast track 5.6.

#syz test: https://github.com/amir73il/linux.git ovl-fixes

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: WARNING: bad unlock balance in ovl_llseek
  2020-03-03  7:34       ` Amir Goldstein
@ 2020-03-06 22:35         ` syzbot
  0 siblings, 0 replies; 11+ messages in thread
From: syzbot @ 2020-03-06 22:35 UTC (permalink / raw)
  To: amir73il, dvyukov, linux-kernel, linux-unionfs, miklos, mszeredi,
	syzkaller-bugs

Hello,

syzbot has tested the proposed patch and the reproducer did not trigger crash:

Reported-and-tested-by: syzbot+66a9752fa927f745385e@syzkaller.appspotmail.com

Tested on:

commit:         13944050 ovl: fix lock in ovl_llseek()
git tree:       https://github.com/amir73il/linux.git ovl-fixes
kernel config:  https://syzkaller.appspot.com/x/.config?x=31018567b8f0fc70
dashboard link: https://syzkaller.appspot.com/bug?extid=66a9752fa927f745385e
compiler:       gcc (GCC) 9.0.0 20181231 (experimental)

Note: testing is done by a robot and is best-effort only.

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2020-03-06 22:35 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-01 19:13 WARNING: bad unlock balance in ovl_llseek syzbot
2020-03-02 11:10 ` Amir Goldstein
2020-03-02 11:17   ` Miklos Szeredi
2020-03-02 12:19     ` Amir Goldstein
2020-03-03  7:34       ` Amir Goldstein
2020-03-06 22:35         ` syzbot
2020-03-02 11:29   ` Dmitry Vyukov
2020-03-02 12:10     ` Amir Goldstein
2020-03-02 12:34       ` Dmitry Vyukov
2020-03-02 13:24         ` Amir Goldstein
2020-03-02 14:57           ` Dmitry Vyukov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).