From: Pavel Begunkov <asml.silence@gmail.com>
To: Hillf Danton <hdanton@sina.com>, Ming Lei <ming.lei@redhat.com>
Cc: syzbot <syzbot+d6218cb2fae0b2411e9d@syzkaller.appspotmail.com>,
	axboe@kernel.dk, io-uring@vger.kernel.org,
	linux-kernel@vger.kernel.org, syzkaller-bugs@googlegroups.com
Subject: Re: [syzbot] WARNING in __percpu_ref_exit (2)
Date: Thu, 18 Mar 2021 14:28:11 +0000
Message-ID: <79ce5bcd-b314-dded-0b36-0a8fb66f5a7a@gmail.com>
In-Reply-To: <20210318083340.1900-1-hdanton@sina.com>

On 18/03/2021 08:33, Hillf Danton wrote:
> On Mon, 15 Mar 2021 12:18:20 +0000 Pavel Begunkov wrote:
>> On 15/03/2021 11:58, syzbot wrote:
>>> Hello,
>>>
>>> syzbot found the following issue on:
>>>
>>> HEAD commit:    75013c6c Merge tag 'perf_urgent_for_v5.12-rc3' of git://gi..
>>> git tree:       upstream
>>> console output: https://syzkaller.appspot.com/x/log.txt?x=174df32ad00000
>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=844457676c06b88c
>>> dashboard link: https://syzkaller.appspot.com/bug?extid=d6218cb2fae0b2411e9d
>>> userspace arch: i386
>>>
>>> Unfortunately, I don't have any reproducer for this issue yet.
>>>
>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>>> Reported-by: syzbot+d6218cb2fae0b2411e9d@syzkaller.appspotmail.com
>>> ------------[ cut here ]------------
>>> WARNING: CPU: 1 PID: 53 at lib/percpu-refcount.c:113 __percpu_ref_exit+0x98/0x100 lib/percpu-refcount.c:113
>>
>> if (percpu_count) {
>> 	/* non-NULL confirm_switch indicates switching in progress */
>> 	WARN_ON_ONCE(ref->data && ref->data->confirm_switch);
>> 	...
>> }
>>
>> The report points to this warning. I'm not sure, but the not yet included
>> "io_uring: halt SQO submission on ctx exit" may fix it, or at least is
>> related.
> 
> It seems that patch neither fixes it nor is related, see below.
>>
>>> Modules linked in:
>>> CPU: 1 PID: 53 Comm: kworker/u4:2 Not tainted 5.12.0-rc2-syzkaller #0
>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
>>> Workqueue: events_unbound io_ring_exit_work
>>> RIP: 0010:__percpu_ref_exit+0x98/0x100 lib/percpu-refcount.c:113
>>> Code: fd 49 8d 7c 24 10 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 75 61 49 83 7c 24 10 00 74 07 e8 28 42 ac fd <0f> 0b e8 21 42 ac fd 48 89 ef e8 e9 fa da fd 48 89 da 48 b8 00 00
>>> RSP: 0018:ffffc90000f1fb78 EFLAGS: 00010293
>>> RAX: 0000000000000000 RBX: ffff88805c976000 RCX: 0000000000000000
>>> RDX: ffff888011839bc0 RSI: ffffffff83c76be8 RDI: ffff88802b2a9010
>>> RBP: 0000607f46077778 R08: 0000000000000000 R09: ffffffff8fab0967
>>> R10: ffffffff83c76b88 R11: 0000000000000009 R12: ffff88802b2a9000
>>> R13: 0000000000000001 R14: ffff88802b2a9000 R15: dffffc0000000000
>>> FS:  0000000000000000(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
>>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> CR2: 00000000085a0004 CR3: 000000001896a000 CR4: 00000000001506f0
>>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>>> Call Trace:
>>>  percpu_ref_exit+0x3b/0x140 lib/percpu-refcount.c:134
>>>  io_ring_ctx_free fs/io_uring.c:8419 [inline]
>>>  io_ring_exit_work+0x599/0xcf0 fs/io_uring.c:8565
>>>  process_one_work+0x98d/0x1600 kernel/workqueue.c:2275
>>>  worker_thread+0x64c/0x1120 kernel/workqueue.c:2421
>>>  kthread+0x3b1/0x4a0 kernel/kthread.c:292
>>>  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
>>>
>>>
>>> ---
>>> This report is generated by a bot. It may contain errors.
>>> See https://goo.gl/tpsmEJ for more information about syzbot.
>>> syzbot engineers can be reached at syzkaller@googlegroups.com.
>>>
>>> syzbot will keep track of this issue. See:
>>> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
>>>
>>
>> -- 
>> Pavel Begunkov
> 
> Thoughts on synchronizing with RCU are appreciated if the chance of a
> race between RCU and the workqueue is not zero when killing the io ctx.

Would you elaborate? Because your case below doesn't make much
sense.

1) io_ring_ctx_wait_and_kill() indeed kills ctx->refs
2) io_ring_exit_work() waits for a completion signaled by
ctx->refs hitting 0
3) and only then calls io_ring_ctx_free().

And 2) won't complete until the switch to atomic mode is over.
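
To make the ordering in 1)-3) concrete, here is a minimal single-threaded
userspace sketch. It is not kernel code: struct model_ref and all model_*
names are hypothetical, a plain long stands in for the atomic counter, and a
bool stands in for a non-NULL confirm_switch. It assumes, as I read
lib/percpu-refcount.c, that the switch to atomic mode holds its own
reference until the RCU callback has cleared confirm_switch, so the
count cannot reach zero while the switch is still pending.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model, not the kernel's struct percpu_ref. */
struct model_ref {
	long count;      /* stands in for the atomic reference count */
	bool switching;  /* stands in for a non-NULL confirm_switch */
	bool done;       /* stands in for the completion the exit work waits on */
};

static void model_ref_init(struct model_ref *ref)
{
	ref->count = 1;  /* initial reference owned by the ctx */
	ref->switching = false;
	ref->done = false;
}

static void model_ref_put(struct model_ref *ref)
{
	assert(ref->count > 0);
	if (--ref->count == 0)
		ref->done = true;  /* release callback signals the completion */
}

/* Step 1: kill marks the switch as pending and pins the ref for it. */
static void model_ref_kill(struct model_ref *ref)
{
	ref->switching = true;  /* confirm_switch becomes non-NULL */
	ref->count++;           /* held until the switch is confirmed */
	model_ref_put(ref);     /* drop the initial reference */
}

/* Deferred RCU callback: confirm the switch, then drop the pin. */
static void model_rcu_callback(struct model_ref *ref)
{
	ref->switching = false; /* confirm_switch = NULL */
	model_ref_put(ref);
}

/* Step 2: the exit work may go on to free only after the completion. */
static bool model_exit_work_may_free(const struct model_ref *ref)
{
	return ref->done;
}

/* Step 3: the condition __percpu_ref_exit() warns on. */
static bool model_exit_would_warn(const struct model_ref *ref)
{
	return ref->switching;
}

/* Walks the sequence; returns 0 if the free never sees a pending switch. */
int model_run(void)
{
	struct model_ref ref;

	model_ref_init(&ref);
	model_ref_kill(&ref);
	if (model_exit_work_may_free(&ref))
		return 1;  /* would free while the switch is pending */
	model_rcu_callback(&ref);
	if (!model_exit_work_may_free(&ref))
		return 2;  /* completion never fired */
	return model_exit_would_warn(&ref) ? 3 : 0;
}
```

In this model model_run() returns 0: the completion only fires from the put
in the RCU callback, which runs after the switching flag is cleared. The
model deliberately ignores other references taken on the ctx and any real
concurrency, so it only illustrates the argument and does not prove the
kernel is race-free.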

> 
> CPU0
> ----
> io_ring_ctx_wait_and_kill
> percpu_ref_kill(&ctx->refs);
> percpu_ref_kill_and_confirm(ref, NULL);
> spin_lock_irqsave(&percpu_ref_switch_lock, flags);
> __percpu_ref_switch_mode(ref, confirm_switch);
>   __percpu_ref_switch_to_atomic
>     ref->data->confirm_switch = confirm_switch ?:
> 		percpu_ref_noop_confirm_switch;
>     call_rcu(&ref->data->rcu, percpu_ref_switch_to_atomic_rcu);
> spin_unlock_irqrestore(&percpu_ref_switch_lock, flags);
> 
> INIT_WORK(&ctx->exit_work, io_ring_exit_work);
> queue_work(system_unbound_wq, &ctx->exit_work);
> 
> 						CPU1
> 						----
> 						io_ring_exit_work
> 						io_ring_ctx_free(ctx);
> 						percpu_ref_exit(&ctx->refs);
> 						__percpu_ref_exit(ref);
> 						WARN_ON_ONCE(ref->data &&
> 							ref->data->confirm_switch);
> 
> 
> percpu_ref_switch_to_atomic_rcu
>   percpu_ref_call_confirm_rcu(rcu);
>     data->confirm_switch(ref);
>     data->confirm_switch = NULL;
>     wake_up_all(&percpu_ref_switch_waitq);
> 

-- 
Pavel Begunkov

