All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@kernel.org>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: syzbot <syzbot+3ae5eaae0809ee311e75@syzkaller.appspotmail.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	bp@alien8.de, hpa@zytor.com, linux-kernel@vger.kernel.org,
	luto@kernel.org, mingo@kernel.org,
	syzkaller-bugs@googlegroups.com, x86@kernel.org
Subject: Re: WARNING: suspicious RCU usage in idtentry_exit
Date: Thu, 28 May 2020 09:11:43 -0700	[thread overview]
Message-ID: <20200528161143.GF2869@paulmck-ThinkPad-P72> (raw)
In-Reply-To: <87wo4wnpzb.fsf@nanos.tec.linutronix.de>

On Thu, May 28, 2020 at 03:33:44PM +0200, Thomas Gleixner wrote:
> syzbot <syzbot+3ae5eaae0809ee311e75@syzkaller.appspotmail.com> writes:
> 
> + Paolo, Paul
> 
> > syzbot found the following crash on:
> >
> > HEAD commit:    7b4cb0a4 Add linux-next specific files for 20200525
> > git tree:       linux-next
> > console output: https://syzkaller.appspot.com/x/log.txt?x=13356016100000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=47b0740d89299c10
> > dashboard link: https://syzkaller.appspot.com/bug?extid=3ae5eaae0809ee311e75
> > compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> >
> > Unfortunately, I don't have any reproducer for this crash yet.
> >
> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > Reported-by: syzbot+3ae5eaae0809ee311e75@syzkaller.appspotmail.com
> >
> > =============================
> > WARNING: suspicious RCU usage
> > 5.7.0-rc7-next-20200525-syzkaller #0 Not tainted
> > -----------------------------
> > kernel/rcu/tree.c:715 RCU dynticks_nesting counter underflow/zero!!

So the nesting counter overflowed or got clobbered to either zero
or some negative number.  The usual cause of this is a misnesting of
rcu_nmi_enter() and rcu_nmi_exit().

If this were reproducible, I would suggest tracking this down by enabling
the rcu_dyntick trace event.  :-/

> > other info that might help us debug this:
> >
> >
> > RCU used illegally from idle CPU!

This might indicate that the aforementioned mismatch was having invoked
rcu_nmi_exit() in an exception that never invoked rcu_nmi_enter().
In this case, the lack of the rcu_nmi_enter() would leave the CPU
looking idle to RCU, and then the call to rcu_nmi_exit() would result in
a negative counter.  But I would have expected a pair of earlier splats
from rcu_nmi_exit() in that case:

	WARN_ON_ONCE(rdp->dynticks_nesting <= 0);
	WARN_ON_ONCE(rcu_dynticks_curr_cpu_in_eqs());

So another hypothesis is that neither rcu_nmi_enter() nor rcu_nmi_exit()
were invoked, leaving the ->dynticks_nesting counter at the value zero,
in turn causing rcu_irq_exit_preempt() to complain.

> > rcu_scheduler_active = 2, debug_locks = 1
> > RCU used illegally from extended quiescent state!

Huh.  This is a bit repetitive, isn't it?  I just queued a patch to say this
only once.  </distraction>

> > no locks held by syz-executor.5/24641.
> >
> > stack backtrace:
> > CPU: 1 PID: 24641 Comm: syz-executor.5 Not tainted 5.7.0-rc7-next-20200525-syzkaller #0
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> > Call Trace:
> >  __dump_stack lib/dump_stack.c:77 [inline]
> >  dump_stack+0x18f/0x20d lib/dump_stack.c:118
> >  rcu_irq_exit_preempt+0x1fa/0x250 kernel/rcu/tree.c:715
> >  idtentry_exit+0x9e/0xc0 arch/x86/entry/common.c:583
> >  exc_general_protection+0x23d/0x520 arch/x86/kernel/traps.c:506
> >  asm_exc_general_protection+0x1e/0x30 arch/x86/include/asm/idtentry.h:353
> > RIP: 0010:kvm_fastop_exception+0xb68/0xfe8
> > Code: f2 ff ff ff 48 31 db e9 fb c9 2a f9 b8 f2 ff ff ff 48 31 f6 e9 ff c9 2a f9 31 c0 e9 ec 2c 2b f9 b8 fb ff ff ff e9 13 a9 31 f9 <b9> fb ff ff ff 31 c0 31 d2 e9 33 a9 31 f9 31 db e9 2a 0b 42 f9 31
> > RSP: 0018:ffffc90004a87a30 EFLAGS: 00010212
> > RAX: 0000000000040000 RBX: ffff88809cca4080 RCX: 0000000000000122
> > RDX: 00000000000063ff RSI: ffffc90004a87a98 RDI: 0000000000000122
> > RBP: 0000000000000122 R08: ffff888058486480 R09: fffffbfff131f481
> > R10: ffffffff898fa403 R11: fffffbfff131f480 R12: 0000000000000122
> > R13: 0000000000000078 R14: 0000000000000006 R15: ffffffff88244b5c
> >  paravirt_read_msr_safe arch/x86/include/asm/paravirt.h:178 [inline]
> >  vmx_create_vcpu+0x184/0x2b40 arch/x86/kvm/vmx/vmx.c:6827
> >  kvm_arch_vcpu_create+0x6a8/0xb30 arch/x86/kvm/x86.c:9427
> >  kvm_vm_ioctl_create_vcpu arch/x86/kvm/../../../virt/kvm/kvm_main.c:3043 [inline]
> >  kvm_vm_ioctl+0x15b7/0x2460 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3603
> >  vfs_ioctl fs/ioctl.c:48 [inline]
> >  ksys_ioctl+0x11a/0x180 fs/ioctl.c:753
> >  __do_sys_ioctl fs/ioctl.c:762 [inline]
> >  __se_sys_ioctl fs/ioctl.c:760 [inline]
> >  __x64_sys_ioctl+0x6f/0xb0 fs/ioctl.c:760
> >  do_syscall_64+0x60/0xe0 arch/x86/entry/common.c:353
> >  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > RIP: 0033:0x45ca29
> > Code: 0d b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 db b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00
> > RSP: 002b:00007f2c93b11c78 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> > RAX: ffffffffffffffda RBX: 00000000004e73c0 RCX: 000000000045ca29
> > RDX: 0000000000000000 RSI: 000000000000ae41 RDI: 0000000000000004
> > RBP: 000000000078bf00 R08: 0000000000000000 R09: 0000000000000000
> > R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
> > R13: 0000000000000396 R14: 00000000004c62c6 R15: 00007f2c93b126d4
> 
> Weird. I have no idea how that thing is an EQS here.

No argument on the "Weird" part!  ;-)

Is this a NO_HZ_FULL=y kernel?  If so, one possibility is that the call
to rcu_user_exit() went missing somehow.  If not, then RCU should have
been watching userspace execution.

Again, the only thing I can think of (should this prove to be
reproducible) is the rcu_dyntick trace event.

							Thanx, Paul

  reply	other threads:[~2020-05-28 16:11 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-26  3:54 WARNING: suspicious RCU usage in idtentry_exit syzbot
2020-05-28 13:33 ` Thomas Gleixner
2020-05-28 16:11   ` Paul E. McKenney [this message]
2020-05-28 20:19     ` Thomas Gleixner
2020-05-28 20:48       ` Paul E. McKenney
2020-05-29  6:20         ` Dmitry Vyukov
2020-05-29  8:51           ` Thomas Gleixner
2020-05-29 14:05           ` Paul E. McKenney
2020-05-29 14:32             ` Dmitry Vyukov
2020-05-29 16:07               ` Paul E. McKenney
2020-07-31  9:23   ` [tip: core/rcu] lockdep: Complain only once about RCU in extended quiescent state tip-bot2 for Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200528161143.GF2869@paulmck-ThinkPad-P72 \
    --to=paulmck@kernel.org \
    --cc=bp@alien8.de \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mingo@kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=syzbot+3ae5eaae0809ee311e75@syzkaller.appspotmail.com \
    --cc=syzkaller-bugs@googlegroups.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.