linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@kernel.org>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: syzbot <syzbot+3ae5eaae0809ee311e75@syzkaller.appspotmail.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	bp@alien8.de, hpa@zytor.com, linux-kernel@vger.kernel.org,
	luto@kernel.org, mingo@kernel.org,
	syzkaller-bugs@googlegroups.com, x86@kernel.org
Subject: Re: WARNING: suspicious RCU usage in idtentry_exit
Date: Thu, 28 May 2020 09:11:43 -0700	[thread overview]
Message-ID: <20200528161143.GF2869@paulmck-ThinkPad-P72> (raw)
In-Reply-To: <87wo4wnpzb.fsf@nanos.tec.linutronix.de>

On Thu, May 28, 2020 at 03:33:44PM +0200, Thomas Gleixner wrote:
> syzbot <syzbot+3ae5eaae0809ee311e75@syzkaller.appspotmail.com> writes:
> 
> + Paolo, Paul
> 
> > syzbot found the following crash on:
> >
> > HEAD commit:    7b4cb0a4 Add linux-next specific files for 20200525
> > git tree:       linux-next
> > console output: https://syzkaller.appspot.com/x/log.txt?x=13356016100000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=47b0740d89299c10
> > dashboard link: https://syzkaller.appspot.com/bug?extid=3ae5eaae0809ee311e75
> > compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> >
> > Unfortunately, I don't have any reproducer for this crash yet.
> >
> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > Reported-by: syzbot+3ae5eaae0809ee311e75@syzkaller.appspotmail.com
> >
> > =============================
> > WARNING: suspicious RCU usage
> > 5.7.0-rc7-next-20200525-syzkaller #0 Not tainted
> > -----------------------------
> > kernel/rcu/tree.c:715 RCU dynticks_nesting counter underflow/zero!!

So the nesting counter overflowed or got clobbered to either zero
or some negative number.  The usual cause of this is a misnesting of
rcu_nmi_enter() and rcu_nmi_exit().

If this were reproducible, I would suggest tracking this down by enabling
the rcu_dyntick trace event.  :-/

> > other info that might help us debug this:
> >
> >
> > RCU used illegally from idle CPU!

This might indicate that the aforementioned mismatch was having invoked
rcu_nmi_exit() in an exception that never invoked rcu_nmi_enter().
In this case, the lack of the rcu_nmi_enter() would leave the CPU
looking idle to RCU, and then the call to rcu_nmi_exit() would result in
a negative counter.  But I would have expected a pair of earlier splats
from rcu_nmi_exit() in that case:

	WARN_ON_ONCE(rdp->dynticks_nesting <= 0);
	WARN_ON_ONCE(rcu_dynticks_curr_cpu_in_eqs());

So another hypothesis is that neither rcu_nmi_enter() nor rcu_nmi_exit()
were invoked, leaving the ->dynticks_nesting counter at the value zero,
in turn causing rcu_irq_exit_preempt() to complain.

> > rcu_scheduler_active = 2, debug_locks = 1
> > RCU used illegally from extended quiescent state!

Huh.  This is a bit repetitive, isn't it?  I just queued a patch to say this
only once.  </distraction>

> > no locks held by syz-executor.5/24641.
> >
> > stack backtrace:
> > CPU: 1 PID: 24641 Comm: syz-executor.5 Not tainted 5.7.0-rc7-next-20200525-syzkaller #0
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> > Call Trace:
> >  __dump_stack lib/dump_stack.c:77 [inline]
> >  dump_stack+0x18f/0x20d lib/dump_stack.c:118
> >  rcu_irq_exit_preempt+0x1fa/0x250 kernel/rcu/tree.c:715
> >  idtentry_exit+0x9e/0xc0 arch/x86/entry/common.c:583
> >  exc_general_protection+0x23d/0x520 arch/x86/kernel/traps.c:506
> >  asm_exc_general_protection+0x1e/0x30 arch/x86/include/asm/idtentry.h:353
> > RIP: 0010:kvm_fastop_exception+0xb68/0xfe8
> > Code: f2 ff ff ff 48 31 db e9 fb c9 2a f9 b8 f2 ff ff ff 48 31 f6 e9 ff c9 2a f9 31 c0 e9 ec 2c 2b f9 b8 fb ff ff ff e9 13 a9 31 f9 <b9> fb ff ff ff 31 c0 31 d2 e9 33 a9 31 f9 31 db e9 2a 0b 42 f9 31
> > RSP: 0018:ffffc90004a87a30 EFLAGS: 00010212
> > RAX: 0000000000040000 RBX: ffff88809cca4080 RCX: 0000000000000122
> > RDX: 00000000000063ff RSI: ffffc90004a87a98 RDI: 0000000000000122
> > RBP: 0000000000000122 R08: ffff888058486480 R09: fffffbfff131f481
> > R10: ffffffff898fa403 R11: fffffbfff131f480 R12: 0000000000000122
> > R13: 0000000000000078 R14: 0000000000000006 R15: ffffffff88244b5c
> >  paravirt_read_msr_safe arch/x86/include/asm/paravirt.h:178 [inline]
> >  vmx_create_vcpu+0x184/0x2b40 arch/x86/kvm/vmx/vmx.c:6827
> >  kvm_arch_vcpu_create+0x6a8/0xb30 arch/x86/kvm/x86.c:9427
> >  kvm_vm_ioctl_create_vcpu arch/x86/kvm/../../../virt/kvm/kvm_main.c:3043 [inline]
> >  kvm_vm_ioctl+0x15b7/0x2460 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3603
> >  vfs_ioctl fs/ioctl.c:48 [inline]
> >  ksys_ioctl+0x11a/0x180 fs/ioctl.c:753
> >  __do_sys_ioctl fs/ioctl.c:762 [inline]
> >  __se_sys_ioctl fs/ioctl.c:760 [inline]
> >  __x64_sys_ioctl+0x6f/0xb0 fs/ioctl.c:760
> >  do_syscall_64+0x60/0xe0 arch/x86/entry/common.c:353
> >  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > RIP: 0033:0x45ca29
> > Code: 0d b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 db b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00
> > RSP: 002b:00007f2c93b11c78 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> > RAX: ffffffffffffffda RBX: 00000000004e73c0 RCX: 000000000045ca29
> > RDX: 0000000000000000 RSI: 000000000000ae41 RDI: 0000000000000004
> > RBP: 000000000078bf00 R08: 0000000000000000 R09: 0000000000000000
> > R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
> > R13: 0000000000000396 R14: 00000000004c62c6 R15: 00007f2c93b126d4
> 
> Weird. I have no idea how that thing is an EQS here.

No argument on the "Weird" part!  ;-)

Is this a NO_HZ_FULL=y kernel?  If so, one possibility is that the call
to rcu_user_exit() went missing somehow.  If not, then RCU should have
been watching userspace execution.

Again, the only thing I can think of (should this prove to be
reproducible) is the rcu_dyntick trace event.

							Thanx, Paul

  reply	other threads:[~2020-05-28 16:11 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-26  3:54 WARNING: suspicious RCU usage in idtentry_exit syzbot
2020-05-28 13:33 ` Thomas Gleixner
2020-05-28 16:11   ` Paul E. McKenney [this message]
2020-05-28 20:19     ` Thomas Gleixner
2020-05-28 20:48       ` Paul E. McKenney
2020-05-29  6:20         ` Dmitry Vyukov
2020-05-29  8:51           ` Thomas Gleixner
2020-05-29 14:05           ` Paul E. McKenney
2020-05-29 14:32             ` Dmitry Vyukov
2020-05-29 16:07               ` Paul E. McKenney
2020-07-31  9:23   ` [tip: core/rcu] lockdep: Complain only once about RCU in extended quiescent state tip-bot2 for Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200528161143.GF2869@paulmck-ThinkPad-P72 \
    --to=paulmck@kernel.org \
    --cc=bp@alien8.de \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mingo@kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=syzbot+3ae5eaae0809ee311e75@syzkaller.appspotmail.com \
    --cc=syzkaller-bugs@googlegroups.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).