linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* WARNING in percpu_ref_kill_and_confirm
@ 2019-04-22 16:06 syzbot
  2019-04-22 16:23 ` Jens Axboe
  2019-04-22 16:23 ` Linus Torvalds
  0 siblings, 2 replies; 10+ messages in thread
From: syzbot @ 2019-04-22 16:06 UTC (permalink / raw)
  To: arnd, axboe, bp, darrick.wong, gregkh, hpa, linux-api,
	linux-arch, linux-block, linux-fsdevel, linux-kernel, luto,
	mathieu.desnoyers, mingo, mpe, syzkaller-bugs, tglx, torvalds,
	viro, x86

Hello,

syzbot found the following crash on:

HEAD commit:    9e5de623 Merge tag 'nfs-for-5.1-5' of git://git.linux-nfs...
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=15624257200000
kernel config:  https://syzkaller.appspot.com/x/.config?x=856fc6d0fbbeede9
dashboard link: https://syzkaller.appspot.com/bug?extid=10d25e23199614b7721f
compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
userspace arch: i386
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=17ff39f3200000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=15758647200000

The bug was bisected to:

commit 38e7571c07be01f9f19b355a9306a4e3d5cb0f5b
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Fri Mar 8 22:48:40 2019 +0000

     Merge tag 'io_uring-2019-03-06' of git://git.kernel.dk/linux-block

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=1736bc57200000
final crash:    https://syzkaller.appspot.com/x/report.txt?x=14b6bc57200000
console output: https://syzkaller.appspot.com/x/log.txt?x=10b6bc57200000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+10d25e23199614b7721f@syzkaller.appspotmail.com
Fixes: 38e7571c07be ("Merge tag 'io_uring-2019-03-06' of  
git://git.kernel.dk/linux-block")

------------[ cut here ]------------
percpu_ref_kill_and_confirm called more than once on io_ring_ctx_ref_free!
WARNING: CPU: 1 PID: 7815 at lib/percpu-refcount.c:335  
percpu_ref_kill_and_confirm+0x341/0x3b0 lib/percpu-refcount.c:335
Kernel panic - not syncing: panic_on_warn set ...
CPU: 1 PID: 7815 Comm: syz-executor269 Not tainted 5.1.0-rc5+ #77
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x172/0x1f0 lib/dump_stack.c:113
  panic+0x2cb/0x65c kernel/panic.c:214
  __warn.cold+0x20/0x45 kernel/panic.c:571
  report_bug+0x263/0x2b0 lib/bug.c:186
  fixup_bug arch/x86/kernel/traps.c:179 [inline]
  fixup_bug arch/x86/kernel/traps.c:174 [inline]
  do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:272
  do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:291
  invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:973
RIP: 0010:percpu_ref_kill_and_confirm+0x341/0x3b0 lib/percpu-refcount.c:335
Code: 42 e0 2a 06 01 48 89 fa 48 c1 ea 03 80 3c 02 00 75 76 49 8b 54 24 10  
48 c7 c6 a0 71 a1 87 48 c7 c7 40 71 a1 87 e8 ad 92 13 fe <0f> 0b 48 b8 00  
00 00 00 00 fc ff df 4c 89 ea 48 c1 ea 03 80 3c 02
RSP: 0018:ffff8880a96cfce0 EFLAGS: 00010086
RAX: 0000000000000000 RBX: 0000607f5142e35b RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff815afcf6 RDI: ffffed10152d9f8e
RBP: ffff8880a96cfd10 R08: ffff8880a85c40c0 R09: fffffbfff1133639
R10: fffffbfff1133638 R11: ffffffff8899b1c3 R12: ffff88809ee571c0
R13: ffff88809ee571c8 R14: 0000000000000286 R15: 0000000000000000
  percpu_ref_kill include/linux/percpu-refcount.h:128 [inline]
  __io_uring_register+0xa7/0x1fe0 fs/io_uring.c:2937
  __do_sys_io_uring_register fs/io_uring.c:2998 [inline]
  __se_sys_io_uring_register fs/io_uring.c:2980 [inline]
  __ia32_sys_io_uring_register+0x193/0x1f0 fs/io_uring.c:2980
  do_syscall_32_irqs_on arch/x86/entry/common.c:326 [inline]
  do_fast_syscall_32+0x281/0xc98 arch/x86/entry/common.c:397
  entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139
RIP: 0023:0xf7f16869
Code: 85 d2 74 02 89 0a 5b 5d c3 8b 04 24 c3 8b 14 24 c3 8b 3c 24 c3 90 90  
90 90 90 90 90 90 90 90 90 90 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90  
90 90 90 eb 0d 90 90 90 90 90 90 90 90 90 90 90 90
RSP: 002b:00000000f7ef11ec EFLAGS: 00000296 ORIG_RAX: 00000000000001ab
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Kernel Offset: disabled
Rebooting in 86400 seconds..


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: WARNING in percpu_ref_kill_and_confirm
  2019-04-22 16:06 WARNING in percpu_ref_kill_and_confirm syzbot
@ 2019-04-22 16:23 ` Jens Axboe
  2019-04-22 16:27   ` Linus Torvalds
  2019-04-22 16:23 ` Linus Torvalds
  1 sibling, 1 reply; 10+ messages in thread
From: Jens Axboe @ 2019-04-22 16:23 UTC (permalink / raw)
  To: syzbot, arnd, bp, darrick.wong, gregkh, hpa, linux-api,
	linux-arch, linux-block, linux-fsdevel, linux-kernel, luto,
	mathieu.desnoyers, mingo, mpe, syzkaller-bugs, tglx, torvalds,
	viro, x86

On 4/22/19 10:06 AM, syzbot wrote:
> Hello,
> 
> syzbot found the following crash on:
> 
> HEAD commit:    9e5de623 Merge tag 'nfs-for-5.1-5' of git://git.linux-nfs...
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=15624257200000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=856fc6d0fbbeede9
> dashboard link: https://syzkaller.appspot.com/bug?extid=10d25e23199614b7721f
> compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> userspace arch: i386
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=17ff39f3200000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=15758647200000
> 
> The bug was bisected to:
> 
> commit 38e7571c07be01f9f19b355a9306a4e3d5cb0f5b
> Author: Linus Torvalds <torvalds@linux-foundation.org>
> Date:   Fri Mar 8 22:48:40 2019 +0000
> 
>      Merge tag 'io_uring-2019-03-06' of git://git.kernel.dk/linux-block
> 
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=1736bc57200000
> final crash:    https://syzkaller.appspot.com/x/report.txt?x=14b6bc57200000
> console output: https://syzkaller.appspot.com/x/log.txt?x=10b6bc57200000
> 
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+10d25e23199614b7721f@syzkaller.appspotmail.com
> Fixes: 38e7571c07be ("Merge tag 'io_uring-2019-03-06' of  
> git://git.kernel.dk/linux-block")
> 
> ------------[ cut here ]------------
> percpu_ref_kill_and_confirm called more than once on io_ring_ctx_ref_free!
> WARNING: CPU: 1 PID: 7815 at lib/percpu-refcount.c:335  
> percpu_ref_kill_and_confirm+0x341/0x3b0 lib/percpu-refcount.c:335
> Kernel panic - not syncing: panic_on_warn set ...
> CPU: 1 PID: 7815 Comm: syz-executor269 Not tainted 5.1.0-rc5+ #77
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
> Google 01/01/2011
> Call Trace:
>   __dump_stack lib/dump_stack.c:77 [inline]
>   dump_stack+0x172/0x1f0 lib/dump_stack.c:113
>   panic+0x2cb/0x65c kernel/panic.c:214
>   __warn.cold+0x20/0x45 kernel/panic.c:571
>   report_bug+0x263/0x2b0 lib/bug.c:186
>   fixup_bug arch/x86/kernel/traps.c:179 [inline]
>   fixup_bug arch/x86/kernel/traps.c:174 [inline]
>   do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:272
>   do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:291
>   invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:973
> RIP: 0010:percpu_ref_kill_and_confirm+0x341/0x3b0 lib/percpu-refcount.c:335
> Code: 42 e0 2a 06 01 48 89 fa 48 c1 ea 03 80 3c 02 00 75 76 49 8b 54 24 10  
> 48 c7 c6 a0 71 a1 87 48 c7 c7 40 71 a1 87 e8 ad 92 13 fe <0f> 0b 48 b8 00  
> 00 00 00 00 fc ff df 4c 89 ea 48 c1 ea 03 80 3c 02
> RSP: 0018:ffff8880a96cfce0 EFLAGS: 00010086
> RAX: 0000000000000000 RBX: 0000607f5142e35b RCX: 0000000000000000
> RDX: 0000000000000000 RSI: ffffffff815afcf6 RDI: ffffed10152d9f8e
> RBP: ffff8880a96cfd10 R08: ffff8880a85c40c0 R09: fffffbfff1133639
> R10: fffffbfff1133638 R11: ffffffff8899b1c3 R12: ffff88809ee571c0
> R13: ffff88809ee571c8 R14: 0000000000000286 R15: 0000000000000000
>   percpu_ref_kill include/linux/percpu-refcount.h:128 [inline]
>   __io_uring_register+0xa7/0x1fe0 fs/io_uring.c:2937
>   __do_sys_io_uring_register fs/io_uring.c:2998 [inline]
>   __se_sys_io_uring_register fs/io_uring.c:2980 [inline]
>   __ia32_sys_io_uring_register+0x193/0x1f0 fs/io_uring.c:2980
>   do_syscall_32_irqs_on arch/x86/entry/common.c:326 [inline]
>   do_fast_syscall_32+0x281/0xc98 arch/x86/entry/common.c:397
>   entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139
> RIP: 0023:0xf7f16869
> Code: 85 d2 74 02 89 0a 5b 5d c3 8b 04 24 c3 8b 14 24 c3 8b 3c 24 c3 90 90  
> 90 90 90 90 90 90 90 90 90 90 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90  
> 90 90 90 eb 0d 90 90 90 90 90 90 90 90 90 90 90 90
> RSP: 002b:00000000f7ef11ec EFLAGS: 00000296 ORIG_RAX: 00000000000001ab
> RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> Kernel Offset: disabled
> Rebooting in 86400 seconds..

I think the below should fix this. Very early versions of io_uring didn't
have this issue, since we did the percpu ref tryget for io_uring_register().
But I think we'll be just fine just checking if the ref is already dying
once inside the mutex. If it is, it's either going away, or someone else
is already doing io_uring_register() on it.


diff --git a/fs/io_uring.c b/fs/io_uring.c
index f65f85d89217..a2f39faed6a7 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2934,6 +2934,14 @@ static int __io_uring_register(struct io_ring_ctx *ctx, unsigned opcode,
 {
 	int ret;
 
+	/*
+	 * We're inside the ring mutex, if the ref is already dying, then
+	 * someone else killed the ctx or is already going through
+	 * io_uring_register().
+	 */
+	if (percpu_ref_is_dying(&ctx->refs))
+		return -ENXIO;
+
 	percpu_ref_kill(&ctx->refs);
 
 	/*

-- 
Jens Axboe


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: WARNING in percpu_ref_kill_and_confirm
  2019-04-22 16:06 WARNING in percpu_ref_kill_and_confirm syzbot
  2019-04-22 16:23 ` Jens Axboe
@ 2019-04-22 16:23 ` Linus Torvalds
  2019-04-22 16:28   ` Jens Axboe
  1 sibling, 1 reply; 10+ messages in thread
From: Linus Torvalds @ 2019-04-22 16:23 UTC (permalink / raw)
  To: syzbot
  Cc: Arnd Bergmann, Jens Axboe, Borislav Petkov, Darrick J. Wong,
	Greg Kroah-Hartman, Peter Anvin, Linux API, linux-arch,
	linux-block, linux-fsdevel, Linux List Kernel Mailing,
	Andrew Lutomirski, Mathieu Desnoyers, Ingo Molnar,
	Michael Ellerman, syzkaller-bugs, Thomas Gleixner, Al Viro,
	the arch/x86 maintainers

On Mon, Apr 22, 2019 at 9:06 AM syzbot
<syzbot+10d25e23199614b7721f@syzkaller.appspotmail.com> wrote:
>
>
> The bug was bisected to:
>
> commit 38e7571c07be01f9f19b355a9306a4e3d5cb0f5b
> Author: Linus Torvalds <torvalds@linux-foundation.org>
> Date:   Fri Mar 8 22:48:40 2019 +0000
>
>      Merge tag 'io_uring-2019-03-06' of git://git.kernel.dk/linux-block
>
> percpu_ref_kill_and_confirm called more than once on io_ring_ctx_ref_free!

So I don't see how that happens in the original code (because
__io_uring_register() is called with the uring_lock held), but let's
see.

HOWEVER.

I do see how it happens now as of the latest kernel as of commit
b19062a56726 ("io_uring: fix possible deadlock between
io_uring_{enter,register}") where the code explicitly drops the mutex
in order to wait for other uring users to finish.

So Jens, I think that commit was buggy. I suspect that
io_uring_register() should perhaps do something like

--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2934,7 +2934,10 @@ static int __io_uring_register(struct
io_ring_ctx *ctx, unsigned opcode,
 {
        int ret;

+       if (!percpu_ref_tryget(&ctx->refs))
+               return -EBUSY;
        percpu_ref_kill(&ctx->refs);
+       percpu_ref_put(&ctx->refs);

        /*
         * Drop uring mutex before waiting for references to exit. If another

to guarantee that it's the *only* case of io_uring_register() doing that kill.

Hmm?

                Linus

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: WARNING in percpu_ref_kill_and_confirm
  2019-04-22 16:23 ` Jens Axboe
@ 2019-04-22 16:27   ` Linus Torvalds
  2019-04-22 16:32     ` Jens Axboe
  0 siblings, 1 reply; 10+ messages in thread
From: Linus Torvalds @ 2019-04-22 16:27 UTC (permalink / raw)
  To: Jens Axboe
  Cc: syzbot, Arnd Bergmann, Borislav Petkov, Darrick J. Wong,
	Greg Kroah-Hartman, Peter Anvin, Linux API, linux-arch,
	linux-block, linux-fsdevel, Linux List Kernel Mailing,
	Andrew Lutomirski, Mathieu Desnoyers, Ingo Molnar,
	Michael Ellerman, syzkaller-bugs, Thomas Gleixner, Al Viro,
	the arch/x86 maintainers

[ Crossed emails ]

On Mon, Apr 22, 2019 at 9:23 AM Jens Axboe <axboe@kernel.dk> wrote:
>
> I think the below should fix this. Very early versions of io_uring didn't
> have this issue, since we did the percpu ref tryget for io_uring_register().

Ok, so I like your patch better than mine, but note how syzbot
bisected this to the initial merge of the io_uring code.

I agree that code shouldn't have had this particular issue, but it
looks like it does.

Is there some way to race with io_ring_ctx_wait_and_kill(), which
_also_ does that ref_kill() thing? I'm not seeing how that could
happen, but maybe if the file ref counts get screwed up you have
->release() called early..

                  Linus

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: WARNING in percpu_ref_kill_and_confirm
  2019-04-22 16:23 ` Linus Torvalds
@ 2019-04-22 16:28   ` Jens Axboe
  0 siblings, 0 replies; 10+ messages in thread
From: Jens Axboe @ 2019-04-22 16:28 UTC (permalink / raw)
  To: Linus Torvalds, syzbot
  Cc: Arnd Bergmann, Borislav Petkov, Darrick J. Wong,
	Greg Kroah-Hartman, Peter Anvin, Linux API, linux-arch,
	linux-block, linux-fsdevel, Linux List Kernel Mailing,
	Andrew Lutomirski, Mathieu Desnoyers, Ingo Molnar,
	Michael Ellerman, syzkaller-bugs, Thomas Gleixner, Al Viro,
	the arch/x86 maintainers

On 4/22/19 10:23 AM, Linus Torvalds wrote:
> On Mon, Apr 22, 2019 at 9:06 AM syzbot
> <syzbot+10d25e23199614b7721f@syzkaller.appspotmail.com> wrote:
>>
>>
>> The bug was bisected to:
>>
>> commit 38e7571c07be01f9f19b355a9306a4e3d5cb0f5b
>> Author: Linus Torvalds <torvalds@linux-foundation.org>
>> Date:   Fri Mar 8 22:48:40 2019 +0000
>>
>>      Merge tag 'io_uring-2019-03-06' of git://git.kernel.dk/linux-block
>>
>> percpu_ref_kill_and_confirm called more than once on io_ring_ctx_ref_free!
> 
> So I don't see how that happens in the original code (because
> __io_uring_register() is called with the uring_lock held), but let's
> see.
> 
> HOWEVER.
> 
> I do see how it happens now as of the latest kernel as of commit
> b19062a56726 ("io_uring: fix possible deadlock between
> io_uring_{enter,register}") where the code explicitly drops the mutex
> in order to wait for other uring users to finish.
> 
> So Jens, I think that commit was buggy. I suspect that
> io_uring_register() should perhaps do something like
> 
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -2934,7 +2934,10 @@ static int __io_uring_register(struct
> io_ring_ctx *ctx, unsigned opcode,
>  {
>         int ret;
> 
> +       if (!percpu_ref_tryget(&ctx->refs))
> +               return -EBUSY;
>         percpu_ref_kill(&ctx->refs);
> +       percpu_ref_put(&ctx->refs);
> 
>         /*
>          * Drop uring mutex before waiting for references to exit. If another
> 
> to guarantee that it's the *only* case of io_uring_register() doing that kill.
> 
> Hmm?

Just sent out something as well. I think we can get by with just
checking if it's dying, or we can go the route of what you did which is
actually very similar to what the earlier versions did. Both versions
should fix the issue.

I'll test just to be totally sure.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: WARNING in percpu_ref_kill_and_confirm
  2019-04-22 16:27   ` Linus Torvalds
@ 2019-04-22 16:32     ` Jens Axboe
  2019-04-22 16:38       ` Jens Axboe
  0 siblings, 1 reply; 10+ messages in thread
From: Jens Axboe @ 2019-04-22 16:32 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: syzbot, Arnd Bergmann, Borislav Petkov, Darrick J. Wong,
	Greg Kroah-Hartman, Peter Anvin, Linux API, linux-arch,
	linux-block, linux-fsdevel, Linux List Kernel Mailing,
	Andrew Lutomirski, Mathieu Desnoyers, Ingo Molnar,
	Michael Ellerman, syzkaller-bugs, Thomas Gleixner, Al Viro,
	the arch/x86 maintainers

On 4/22/19 10:27 AM, Linus Torvalds wrote:
> [ Crossed emails ]
> 
> On Mon, Apr 22, 2019 at 9:23 AM Jens Axboe <axboe@kernel.dk> wrote:
>>
>> I think the below should fix this. Very early versions of io_uring didn't
>> have this issue, since we did the percpu ref tryget for io_uring_register().
> 
> Ok, so I like your patch better than mine, but note how syzbot
> bisected this to the initial merge of the io_uring code.

Yes, I did think about that too...

> I agree that code shouldn't have had this particular issue, but it
> looks like it does.
> 
> Is there some way to race with io_ring_ctx_wait_and_kill(), which
> _also_ does that ref_kill() thing? I'm not seeing how that could
> happen, but maybe if the file ref counts get screwed up you have
> ->release() called early..

I just tried on the current code and it triggers easily, but that's
with that mutex patch in there. I agree it should not trigger before
that, unless something is wonky. I'll try and play around with it a bit
and see what is going on (or if I can trigger it at all with the mutex
change reverted).

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: WARNING in percpu_ref_kill_and_confirm
  2019-04-22 16:32     ` Jens Axboe
@ 2019-04-22 16:38       ` Jens Axboe
  2019-04-22 16:48         ` Linus Torvalds
  0 siblings, 1 reply; 10+ messages in thread
From: Jens Axboe @ 2019-04-22 16:38 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: syzbot, Arnd Bergmann, Borislav Petkov, Darrick J. Wong,
	Greg Kroah-Hartman, Peter Anvin, Linux API, linux-arch,
	linux-block, linux-fsdevel, Linux List Kernel Mailing,
	Andrew Lutomirski, Mathieu Desnoyers, Ingo Molnar,
	Michael Ellerman, syzkaller-bugs, Thomas Gleixner, Al Viro,
	the arch/x86 maintainers

On 4/22/19 10:32 AM, Jens Axboe wrote:
> On 4/22/19 10:27 AM, Linus Torvalds wrote:
>> [ Crossed emails ]
>>
>> On Mon, Apr 22, 2019 at 9:23 AM Jens Axboe <axboe@kernel.dk> wrote:
>>>
>>> I think the below should fix this. Very early versions of io_uring didn't
>>> have this issue, since we did the percpu ref tryget for io_uring_register().
>>
>> Ok, so I like your patch better than mine, but note how syzbot
>> bisected this to the initial merge of the io_uring code.
> 
> Yes, I did think about that too...
> 
>> I agree that code shouldn't have had this particular issue, but it
>> looks like it does.
>>
>> Is there some way to race with io_ring_ctx_wait_and_kill(), which
>> _also_ does that ref_kill() thing? I'm not seeing how that could
>> happen, but maybe if the file ref counts get screwed up you have
>> ->release() called early..
> 
> I just tried on the current code and it triggers easily, but that's
> with that mutex patch in there. I agree it should not trigger before
> that, unless something is wonky. I'll try and play around with it a bit
> and see what is going on (or if I can trigger it at all with the mutex
> change reverted).

With the mutex change in, I can trigger it in a second or so. Just ran
the reproducer with that change reverted, and I'm not seeing any badness.
So I do wonder if the bisect results are accurate?

I think the dying check should cover it, and then marked with fixing
that mutex commit.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: WARNING in percpu_ref_kill_and_confirm
  2019-04-22 16:38       ` Jens Axboe
@ 2019-04-22 16:48         ` Linus Torvalds
  2019-04-22 16:50           ` Jens Axboe
  2019-04-23 14:41           ` Dmitry Vyukov
  0 siblings, 2 replies; 10+ messages in thread
From: Linus Torvalds @ 2019-04-22 16:48 UTC (permalink / raw)
  To: Jens Axboe
  Cc: syzbot, Arnd Bergmann, Borislav Petkov, Darrick J. Wong,
	Greg Kroah-Hartman, Peter Anvin, Linux API, linux-arch,
	linux-block, linux-fsdevel, Linux List Kernel Mailing,
	Andrew Lutomirski, Mathieu Desnoyers, Ingo Molnar,
	Michael Ellerman, syzkaller-bugs, Thomas Gleixner, Al Viro,
	the arch/x86 maintainers

On Mon, Apr 22, 2019 at 9:38 AM Jens Axboe <axboe@kernel.dk> wrote:
>
> With the mutex change in, I can trigger it in a second or so. Just ran
> the reproducer with that change reverted, and I'm not seeing any badness.
> So I do wonder if the bisect results are accurate?

Looking at the syzbot report, it's syzbot being confused.

The actual WARNING in percpu_ref_kill_and_confirm() only happens with
recent kernels.

But then syzbot mixes it up with a completely different bug:

   crash: BUG: MAX_STACK_TRACE_ENTRIES too low!
   BUG: MAX_STACK_TRACE_ENTRIES too low!

and for some reason decides that *that* bug is the same thing entirely.

So yeah, I think the simple percpu_ref_is_dying() check is sufficient,
and that the syzbot bisection is completely bogus.

                Linus

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: WARNING in percpu_ref_kill_and_confirm
  2019-04-22 16:48         ` Linus Torvalds
@ 2019-04-22 16:50           ` Jens Axboe
  2019-04-23 14:41           ` Dmitry Vyukov
  1 sibling, 0 replies; 10+ messages in thread
From: Jens Axboe @ 2019-04-22 16:50 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: syzbot, Arnd Bergmann, Borislav Petkov, Darrick J. Wong,
	Greg Kroah-Hartman, Peter Anvin, Linux API, linux-arch,
	linux-block, linux-fsdevel, Linux List Kernel Mailing,
	Andrew Lutomirski, Mathieu Desnoyers, Ingo Molnar,
	Michael Ellerman, syzkaller-bugs, Thomas Gleixner, Al Viro,
	the arch/x86 maintainers

On 4/22/19 10:48 AM, Linus Torvalds wrote:
> On Mon, Apr 22, 2019 at 9:38 AM Jens Axboe <axboe@kernel.dk> wrote:
>>
>> With the mutex change in, I can trigger it in a second or so. Just ran
>> the reproducer with that change reverted, and I'm not seeing any badness.
>> So I do wonder if the bisect results are accurate?
> 
> Looking at the syzbot report, it's syzbot being confused.
> 
> The actual WARNING in percpu_ref_kill_and_confirm() only happens with
> recent kernels.
> 
> But then syzbot mixes it up with a completely different bug:
> 
>    crash: BUG: MAX_STACK_TRACE_ENTRIES too low!
>    BUG: MAX_STACK_TRACE_ENTRIES too low!
> 
> and for some reason decides that *that* bug is the same thing entirely.
> 
> So yeah, I think the simple percpu_ref_is_dying() check is sufficient,
> and that the syzbot bisection is completely bogus.

Ah good, that makes me feel better. I'll queue the fix up, thanks.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: WARNING in percpu_ref_kill_and_confirm
  2019-04-22 16:48         ` Linus Torvalds
  2019-04-22 16:50           ` Jens Axboe
@ 2019-04-23 14:41           ` Dmitry Vyukov
  1 sibling, 0 replies; 10+ messages in thread
From: Dmitry Vyukov @ 2019-04-23 14:41 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jens Axboe, syzbot, Arnd Bergmann, Borislav Petkov,
	Darrick J. Wong, Greg Kroah-Hartman, Peter Anvin, Linux API,
	linux-arch, linux-block, linux-fsdevel,
	Linux List Kernel Mailing, Andrew Lutomirski, Mathieu Desnoyers,
	Ingo Molnar, Michael Ellerman, syzkaller-bugs, Thomas Gleixner,
	Al Viro, the arch/x86 maintainers, syzkaller

On Mon, Apr 22, 2019 at 7:48 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Mon, Apr 22, 2019 at 9:38 AM Jens Axboe <axboe@kernel.dk> wrote:
> >
> > With the mutex change in, I can trigger it in a second or so. Just ran
> > the reproducer with that change reverted, and I'm not seeing any badness.
> > So I do wonder if the bisect results are accurate?
>
> Looking at the syzbot report, it's syzbot being confused.
>
> The actual WARNING in percpu_ref_kill_and_confirm() only happens with
> recent kernels.
>
> But then syzbot mixes it up with a completely different bug:
>
>    crash: BUG: MAX_STACK_TRACE_ENTRIES too low!
>    BUG: MAX_STACK_TRACE_ENTRIES too low!
>
> and for some reason decides that *that* bug is the same thing entirely.
>
> So yeah, I think the simple percpu_ref_is_dying() check is sufficient,
> and that the syzbot bisection is completely bogus.

Using crashed/not-crashed predicate gives better results overall. More
than half kernel bugs have different manifestations due to different
reasons. And even if we can say for sure that we see a different bug,
we still don't know if the original bug is also there or not. See the
following threads for details:
https://groups.google.com/d/msg/syzkaller-bugs/nFeC8-UG1gg/y6gUEsvAAgAJ
https://groups.google.com/d/msg/syzkaller/sR8aAXaWEF4/tTWYRgvmAwAJ

Unrelated crashes is the most common cause of incorrect bisection
results (66%). To enable better bisection we would need to integrate
some meaningful precommit testing into kernel development process
(would be tremendously useful for other reasons too). E.g. this "BUG:
MAX_STACK_TRACE_ENTRIES too low!" is this:
https://syzkaller.appspot.com/bug?id=dbd70f0407487a061d2d46fdc6bccc94b95ce3c0
and the reproducer is simply opening /dev/infiniband/rdma_cm or
/dev/vhci or something equally simple with LOCKDEP enabled. None of
this was done in a testing environment for several weeks. And then it
took another month to propagate the fix through all distributed kernel
trees. For all that time simple programs crash and bisection can't be
done and we are spending time here...

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2019-04-23 14:41 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-04-22 16:06 WARNING in percpu_ref_kill_and_confirm syzbot
2019-04-22 16:23 ` Jens Axboe
2019-04-22 16:27   ` Linus Torvalds
2019-04-22 16:32     ` Jens Axboe
2019-04-22 16:38       ` Jens Axboe
2019-04-22 16:48         ` Linus Torvalds
2019-04-22 16:50           ` Jens Axboe
2019-04-23 14:41           ` Dmitry Vyukov
2019-04-22 16:23 ` Linus Torvalds
2019-04-22 16:28   ` Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).