linux-kernel.vger.kernel.org archive mirror
From: Marco Elver <elver@google.com>
To: Eric Dumazet <edumazet@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	LKML <linux-kernel@vger.kernel.org>,
	David Miller <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>, netdev <netdev@vger.kernel.org>,
	Steven Rostedt <rostedt@goodmis.org>
Subject: Re: [PATCH] random32: Use rcuidle variant for tracepoint
Date: Fri, 21 Aug 2020 17:35:32 +0200
Message-ID: <20200821153532.GA3205540@elver.google.com>
In-Reply-To: <CANn89i+1MQRCSRVg-af758en5e9nwQBes3aBSjQ6BY1pV5+HdQ@mail.gmail.com>

On Fri, Aug 21, 2020 at 08:06AM -0700, Eric Dumazet wrote:
> On Fri, Aug 21, 2020 at 1:59 AM <peterz@infradead.org> wrote:
> >
> > On Fri, Aug 21, 2020 at 08:30:43AM +0200, Marco Elver wrote:
> > > With KCSAN enabled, prandom_u32() may be called from any context,
> > > including idle CPUs.
> > >
> > > Therefore, switch to using trace_prandom_u32_rcuidle(), to avoid various
> > > issues due to recursion and lockdep warnings when KCSAN and tracing is
> > > enabled.
> >
> > At some point we're going to have to introduce noinstr to idle as well.
> > But until that time this should indeed cure things.
> 
> I do not understand what the issue is.  This _rcuidle() is kind of opaque ;)
>
> Would this alternative patch work, or is it something more fundamental ?

There are 2 problems:

1. Recursion due to ending up in lockdep from the tracepoint. I need to
solve this either way. One way is to use the _rcuidle() variant, which
doesn't call into lockdep.

2. Somehow running into trouble because we use tracing from an idle CPU.
At least that's what I gathered from the documentation -- but you'd have
to wait for Peter or Steven to get a better explanation.
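
To make problem 1 more concrete, here is a toy userspace model of the
recursion. It is only a sketch and not kernel code: every name in it
(model_prandom(), model_tracepoint(), model_lockdep_check(),
instrumented_access(), run_model(), DEPTH_LIMIT) is hypothetical, and it
assumes that the check reached from the regular tracepoint itself
performs an instrumented access, which is what closes the loop. The
"rcuidle" flag simply skips that check.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names are kernel APIs. */
#define DEPTH_LIMIT 8		/* cut the model off instead of recursing forever */

static int depth;		/* current nesting of instrumented accesses */
static int max_depth;		/* deepest nesting seen in one run */

static void instrumented_access(bool use_rcuidle);

/* Stands in for the lockdep check; assumed to be instrumented itself. */
static void model_lockdep_check(bool use_rcuidle)
{
	instrumented_access(use_rcuidle);
}

/* Regular variant runs the check; the "rcuidle" variant skips it. */
static void model_tracepoint(bool use_rcuidle)
{
	if (!use_rcuidle)
		model_lockdep_check(use_rcuidle);
}

/* Stands in for prandom_u32(): compute a value, then fire the tracepoint. */
static unsigned int model_prandom(bool use_rcuidle)
{
	static unsigned int state = 1;

	state = state * 1103515245u + 12345u;
	model_tracepoint(use_rcuidle);
	return state;
}

/* Stands in for an instrumented access that asks for a random skip count. */
static void instrumented_access(bool use_rcuidle)
{
	if (depth >= DEPTH_LIMIT)
		return;

	depth++;
	assert(depth <= DEPTH_LIMIT);
	if (depth > max_depth)
		max_depth = depth;
	(void)model_prandom(use_rcuidle);
	depth--;
}

/* Run one access through the model and report the deepest nesting reached. */
static int run_model(bool use_rcuidle)
{
	depth = 0;
	max_depth = 0;
	instrumented_access(use_rcuidle);
	return max_depth;
}
```

In this model run_model(false) nests until it hits DEPTH_LIMIT, while
run_model(true) stays at a single level. The real interaction between
KCSAN, the tracepoint and lockdep is more involved than this.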

> Thanks !
> 
> diff --git a/lib/random32.c b/lib/random32.c
> index 932345323af092a93fc2690b0ebbf4f7485ae4f3..17af2d1631e5ab6e02ad1e9288af7e007bed6d5f 100644
> --- a/lib/random32.c
> +++ b/lib/random32.c
> @@ -83,9 +83,10 @@ u32 prandom_u32(void)
>         u32 res;
> 
>         res = prandom_u32_state(state);
> -       trace_prandom_u32(res);
>         put_cpu_var(net_rand_state);
> 
> +       trace_prandom_u32(res);
> +
>         return res;
>  }
>  EXPORT_SYMBOL(prandom_u32);

That unfortunately still gets me the same warning:

| ------------[ cut here ]------------
| DEBUG_LOCKS_WARN_ON(lockdep_hardirqs_enabled())
| WARNING: CPU: 4 PID: 1861 at kernel/locking/lockdep.c:4875 check_flags.part.0+0x157/0x160 kernel/locking/lockdep.c:4875
| Modules linked in:
| CPU: 4 PID: 1861 Comm: kworker/u16:4 Not tainted 5.9.0-rc1+ #24
| Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
| RIP: 0010:check_flags.part.0+0x157/0x160 kernel/locking/lockdep.c:4875
| Code: c0 0f 84 70 5d 00 00 44 8b 0d fd 11 5f 06 45 85 c9 0f 85 60 5d 00 00 48 c7 c6 3e d0 f4 86 48 c7 c7 b2 49 f3 86 e8 8d 49 f6 ff <0f> 0b e9 46 5d 00 00 66 90 41 57 41 56 49 89 fe 41 55 41 89 d5 41
| RSP: 0000:ffffc900034bfcb0 EFLAGS: 00010082
| RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff8136161c
| RDX: ffff88881a9dcb00 RSI: ffffffff81363835 RDI: 0000000000000006
| RBP: ffffc900034bfd00 R08: 0000000000000000 R09: 0000ffffffffffff
| R10: 0000000000000104 R11: 0000ffff874efd6b R12: ffffffff874f26c0
| R13: 0000000000000244 R14: 0000000000000000 R15: 0000000000000046
| FS:  0000000000000000(0000) GS:ffff88881fc00000(0000) knlGS:0000000000000000
| CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
| CR2: 0000000000000000 CR3: 0000000007489001 CR4: 0000000000770ee0
| DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
| DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
| PKRU: 55555554
| Call Trace:
|  check_flags kernel/locking/lockdep.c:4871 [inline]
|  lock_is_held_type+0x42/0x100 kernel/locking/lockdep.c:5042
|  lock_is_held include/linux/lockdep.h:267 [inline]
|  rcu_read_lock_sched_held+0x41/0x80 kernel/rcu/update.c:136
|  trace_prandom_u32 include/trace/events/random.h:310 [inline]
|  prandom_u32+0x1bb/0x200 lib/random32.c:86
|  prandom_u32_max include/linux/prandom.h:46 [inline]
|  reset_kcsan_skip kernel/kcsan/core.c:277 [inline]
|  kcsan_setup_watchpoint+0x9b/0x600 kernel/kcsan/core.c:424
|  perf_lock_task_context+0x5e3/0x6e0 kernel/events/core.c:1491
|  perf_pin_task_context kernel/events/core.c:1506 [inline]
|  perf_event_exit_task_context kernel/events/core.c:12284 [inline]
|  perf_event_exit_task+0x1e2/0x910 kernel/events/core.c:12364
|  do_exit+0x70e/0x18b0 kernel/exit.c:815
|  call_usermodehelper_exec_async+0x2e2/0x2f0 kernel/umh.c:114
|  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
| irq event stamp: 107
| hardirqs last  enabled at (107): [<ffffffff815532ab>] perf_lock_task_context+0x5db/0x6e0 kernel/events/core.c:1491
| hardirqs last disabled at (106): [<ffffffff81552f12>] perf_lock_task_context+0x242/0x6e0 kernel/events/core.c:1459
| softirqs last  enabled at (0): [<ffffffff8129b95e>] copy_process+0xe9e/0x3970 kernel/fork.c:2004
| softirqs last disabled at (0): [<0000000000000000>] 0x0
| ---[ end trace a3058d9b157af5c4 ]---
| possible reason: unannotated irqs-off.
| irq event stamp: 107
| hardirqs last  enabled at (107): [<ffffffff815532ab>] perf_lock_task_context+0x5db/0x6e0 kernel/events/core.c:1491
| hardirqs last disabled at (106): [<ffffffff81552f12>] perf_lock_task_context+0x242/0x6e0 kernel/events/core.c:1459
| softirqs last  enabled at (0): [<ffffffff8129b95e>] copy_process+0xe9e/0x3970 kernel/fork.c:2004
| softirqs last disabled at (0): [<0000000000000000>] 0x0

I also have a patch which avoids the problem entirely by not using
prandom_u32(): https://lkml.kernel.org/r/20200821123126.3121494-1-elver@google.com
But that patch will likely only make it into the next merge window
(because of other conflicts).

So, if the _rcuidle() variant here doesn't break your use case, there
should be no harm in using it. This also restores the set of contexts
in which prandom_u32() is usable to what it was before, which should be
any context.

Steven, Peter: What's the downside of _rcuidle()?

Thanks,
-- Marco


Thread overview: 10+ messages
2020-08-21  6:30 [PATCH] random32: Use rcuidle variant for tracepoint Marco Elver
2020-08-21  8:58 ` Marco Elver
2020-08-21  8:59 ` peterz
2020-08-21 15:06   ` Eric Dumazet
2020-08-21 15:35     ` Marco Elver [this message]
2020-09-18  1:21       ` Steven Rostedt
2020-08-21 15:38     ` Steven Rostedt
2020-08-21 15:41       ` Steven Rostedt
2020-08-21 18:38         ` Peter Zijlstra
2020-10-09  7:58 ` [tip: locking/core] kcsan: Use tracing-safe version of prandom tip-bot2 for Marco Elver
