linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dmitry Vyukov <dvyukov@google.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: syzbot <syzbot+1a33233ccd8201ec2322@syzkaller.appspotmail.com>,
	Ingo Molnar <mingo@redhat.com>, Will Deacon <will@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Jens Axboe <axboe@kernel.dk>,
	Christian Brauner <christian@brauner.io>,
	LKML <linux-kernel@vger.kernel.org>,
	Shakeel Butt <shakeelb@google.com>,
	syzkaller-bugs <syzkaller-bugs@googlegroups.com>
Subject: Re: [syzbot] WARNING: suspicious RCU usage in copy_page_range
Date: Wed, 31 Mar 2021 11:57:23 +0200	[thread overview]
Message-ID: <CACT4Y+b1NsdnC1hqk54Y8zEs7r3y7+EnAqbG1eBmuhji_bfFqw@mail.gmail.com> (raw)
In-Reply-To: <YGQlHvte0BeKx0uV@hirez.programming.kicks-ass.net>

On Wed, Mar 31, 2021 at 9:31 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Wed, Mar 31, 2021 at 08:11:38AM +0200, Dmitry Vyukov wrote:
> > On Wed, Mar 31, 2021 at 12:26 AM syzbot
> > <syzbot+1a33233ccd8201ec2322@syzkaller.appspotmail.com> wrote:
> > >
> > > Hello,
> > >
> > > syzbot found the following issue on:
> > >
> > > HEAD commit:    db24726b Merge tag 'integrity-v5.12-fix' of git://git.kern..
> > > git tree:       upstream
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=16c16b7cd00000
> > > kernel config:  https://syzkaller.appspot.com/x/.config?x=daeff30c2474a60f
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=1a33233ccd8201ec2322
> > >
> > > Unfortunately, I don't have any reproducer for this issue yet.
> > >
> > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > Reported-by: syzbot+1a33233ccd8201ec2322@syzkaller.appspotmail.com
> >
> > I think this is a LOCKDEP issue. +LOCKDEP maintainers.
> >
> > Another bug happened on another thread ("WARNING: possible circular
> > locking dependency detected"). Lockdep disabled lock tracking
> > ("debug_locks = 0" in the report), which probably made it miss
> > rcu_unlock somewhere, but it did not turn off reporting yet and
> > produced the false positive first.
> >
> > I think if LOCKDEP disables lock tracking, it must also disable
> > reporting of issues that require lock tracking. That would avoid false
> > positives.
>
> Still early and brain hasn't really booted yet, but features that
> require lock tracking are supposed to check debug_locks.
>
> And afaict debug_lockdep_rcu_enabled(), which is called by
> RCU_LOCKDEP_WARN(), which is called by rcu_sleep_check() does just that.

Right... yet it somehow happens.
Looking at a dozen of reports, all with 2 concurrent lockdep splats
and "debug_locks = 0" in the report, I am pretty sure there is some
kind of race in lockdep.
I see there are at least 2 places where lockdep can falsely assume rcu
lock is held:
https://elixir.bootlin.com/linux/v5.12-rc5/source/kernel/locking/lockdep.c#L5543
https://elixir.bootlin.com/linux/v5.12-rc5/source/kernel/rcu/update.c#L105
both to "avoid false positives", but for "Illegal context switch in
RCU-bh read-side critical section" it can actually lead to false
positives, right?

Is there something else that turns off tracking before setting
debug_locks=0? Perhaps we get into that window where tracking is
disabled, but debug_locks is not reset yet?

lockdep_enabled() returns false if lockdep_recursion var is set:
https://elixir.bootlin.com/linux/v5.12-rc5/source/kernel/locking/lockdep.c#L87

but lockdep_lock() sets it _before_ taking the lock:
https://elixir.bootlin.com/linux/v5.12-rc5/source/kernel/locking/lockdep.c#L111

Is it possible that lockdep_recursion is set, then the task is
rescheduled and another task sees wrong value for lockdep_recursion?
Shouldn't lockdep_recursion be set _after_ arch_spin_unlock(&__lock)?
Though, I assume lockdep_lock() is called frequently and not only on
reports, so if my reasoning would be true, it would produce false
positives all the time, not necessary on concurrent reports... this
does not agree with the observed failure mode...

  reply	other threads:[~2021-03-31  9:58 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-30 22:26 [syzbot] WARNING: suspicious RCU usage in copy_page_range syzbot
2021-03-31  6:11 ` Dmitry Vyukov
2021-03-31  7:30   ` Peter Zijlstra
2021-03-31  9:57     ` Dmitry Vyukov [this message]
2021-04-06  7:41       ` Peter Zijlstra
2021-04-13 18:12         ` Dmitry Vyukov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CACT4Y+b1NsdnC1hqk54Y8zEs7r3y7+EnAqbG1eBmuhji_bfFqw@mail.gmail.com \
    --to=dvyukov@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=axboe@kernel.dk \
    --cc=christian@brauner.io \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=shakeelb@google.com \
    --cc=syzbot+1a33233ccd8201ec2322@syzkaller.appspotmail.com \
    --cc=syzkaller-bugs@googlegroups.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).