linux-kselftest.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Leo Yan <leo.yan@linaro.org>
To: Masami Hiramatsu <masami.hiramatsu@linaro.org>
Cc: Naresh Kamboju <naresh.kamboju@linaro.org>,
	linux- stable <stable@vger.kernel.org>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	"open list:KERNEL SELFTEST FRAMEWORK" 
	<linux-kselftest@vger.kernel.org>, Ingo Molnar <mingo@redhat.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	Shuah Khan <shuah@kernel.org>, Sasha Levin <sashal@kernel.org>,
	rcu@vger.kernel.org, lkft-triage@lists.linaro.org,
	Anders Roxell <anders.roxell@linaro.org>,
	zanussi@kernel.org, svens@linux.ibm.com,
	Peter Zijlstra <peterz@infradead.org>
Subject: Re: WARNING: events/ipi.h:36 suspicious rcu_dereference_check() usage!
Date: Fri, 17 Apr 2020 15:15:43 +0800	[thread overview]
Message-ID: <20200417071543.GA7316@leoy-ThinkPad-X240s> (raw)
In-Reply-To: <CAA93ih2To3YN=L7VSa_RzVRV5OH9DTffd0zdKWB2M4CfE0Gp1Q@mail.gmail.com>

On Fri, Apr 17, 2020 at 03:53:27PM +0900, Masami Hiramatsu wrote:
> Hello,
> 
> It seems that *arm/arm64 specific* IPI trace event has caused this issue.
> From the stacktrace, __update_max_tr() is invoked from do_idle() context.
> 
> ------
> __update_max_tr(struct trace_array *tr, struct task_struct *tsk, int cpu)
> {
> [SNIP]
>         /* record this tasks comm */
>         tracing_record_cmdline(tsk);
>         latency_fsnotify(tr);  <<------here
> }
> ------
> And via latency_fsnotify(), it calls irq_work_queue().
> 
> ------
> void latency_fsnotify(struct trace_array *tr)
> {
>         if (!fsnotify_wq)
>                 return;
>         /*
>          * We cannot call queue_work(&tr->fsnotify_work) from here because it's
>          * possible that we are called from __schedule() or do_idle(), which
>          * could cause a deadlock.
>          */
>         irq_work_queue(&tr->fsnotify_irqwork); <<------- here
> }
> ------
> Please NOTE(*) that the above comment said that irq_work_queue()
> should be SAFE from calling in do_idle(), this means it doesn't touch
> any RCU.
> 
> And the irq_work_queue() finally kicks IPI via arch_irq_work_raise()
> which causes an IPI.
> 
> --------<arm64: arch/arm64/kernel/smp.c>--------
> #ifdef CONFIG_IRQ_WORK
> void arch_irq_work_raise(void)
> {
>         if (__smp_cross_call)
>                 smp_cross_call(cpumask_of(smp_processor_id()), IPI_IRQ_WORK);
> }
> #endif
> 
> static void smp_cross_call(const struct cpumask *target, unsigned int ipinr)
> {
>         trace_ipi_raise(target, ipi_types[ipinr]);  <<----- This
> causes a warning!
>         __smp_cross_call(target, ipinr);
> }
> -------
> And trace_* macro touch the RCU. See include/linux/tracepoint.h:231
> ------
> #define __DECLARE_TRACE(name, proto, args, cond, data_proto, data_args) \
>         extern struct tracepoint __tracepoint_##name;                   \
>         static inline void trace_##name(proto)                          \
>         {                                                               \
>                 if (static_key_false(&__tracepoint_##name.key))         \
>                         __DO_TRACE(&__tracepoint_##name,                \
>                                 TP_PROTO(data_proto),                   \
>                                 TP_ARGS(data_args),                     \
>                                 TP_CONDITION(cond), 0);                 \
>                 if (IS_ENABLED(CONFIG_LOCKDEP) && (cond)) {             \
>                         rcu_read_lock_sched_notrace();                  \
>                         rcu_dereference_sched(__tracepoint_##name.funcs);\
>                         rcu_read_unlock_sched_notrace();                \
>                 }                                                       \
>         }
> ------
> 
> BTW, this ipi_raise trace event is only used in arm/arm64.
> 
> $ git grep trace_ipi_raise
> arch/arm/kernel/smp.c:  trace_ipi_raise_rcuidle(target, ipi_types[ipinr]);
> arch/arm64/kernel/smp.c:        trace_ipi_raise(target, ipi_types[ipinr]);
> include/trace/events/ipi.h: * __tracepoint_string, ideally the same as
> used with trace_ipi_raise
> include/trace/events/ipi.h: * __tracepoint_string, ideally the same as
> used with trace_ipi_raise for
> 
> Thus, this only occurs on arm64 platform, but not on x86.
> 
> To fix this, maybe we need to remove this event or mark this is under
> rcu watched (but is it really watched?)

Good analysis, Masami :)

Seems to me, Arm64 should study Arm platform to change using
trace_ipi_raise_rcuidle() to work around the restriction so can allow the
event to work in idle context?

Thanks,
Leo

> 2020年4月8日(水) 18:31 Naresh Kamboju <naresh.kamboju@linaro.org>:
> 
> >
> > On arm64 qemu_arm64, juno-r2 and dragonboard-410c while running kselftest ftrace
> > on stable rc 5.5.1-rc1 to till today 5.5.16-rc2 and 5.6  found this
> > kernel warning.
> >
> > [  386.349099] kselftest: Running tests in ftrace
> > [  393.984018]
> > [  393.984290] =============================
> > [  393.984781] WARNING: suspicious RCU usage
> > [  393.988690] 5.6.3-rc2 #1 Not tainted
> > [  393.992679] -----------------------------
> > [  393.996327] /usr/src/kernel/include/trace/events/ipi.h:36
> > suspicious rcu_dereference_check() usage!
> > [  394.000241]
> > [  394.000241] other info that might help us debug this:
> > [  394.000241]
> > [  394.009094]
> > [  394.009094] RCU used illegally from idle CPU!
> > [  394.009094] rcu_scheduler_active = 2, debug_locks = 1
> > [  394.017084] RCU used illegally from extended quiescent state!
> > [  394.028187] 1 lock held by swapper/3/0:
> > [  394.033826]  #0: ffff80001237b6a8 (max_trace_lock){....}, at:
> > check_critical_timing+0x7c/0x1a8
> > [  394.037480]
> > [  394.037480] stack backtrace:
> > [  394.046158] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 5.6.3-rc2 #1
> > [  394.050584] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
> > [  394.056663] Call trace:
> > [  394.063515]  dump_backtrace+0x0/0x1e0
> > [  394.065686]  show_stack+0x24/0x30
> > [  394.069505]  dump_stack+0xe8/0x150
> > [  394.072805]  lockdep_rcu_suspicious+0xcc/0x110
> > [  394.076106]  arch_irq_work_raise+0x208/0x210
> > [  394.080533]  __irq_work_queue_local+0x5c/0x80
> > [  394.084959]  irq_work_queue+0x38/0x78
> > [  394.089212]  __update_max_tr+0x150/0x218
> > [  394.092858]  update_max_tr_single.part.82+0x98/0x100
> > [  394.096851]  update_max_tr_single+0x1c/0x28
> > [  394.101798]  check_critical_timing+0x198/0x1a8
> > [  394.105705]  stop_critical_timings+0x128/0x148
> > [  394.110221]  cpuidle_enter_state+0x74/0x4f8
> > [  394.114645]  cpuidle_enter+0x3c/0x50
> > [  394.118726]  call_cpuidle+0x44/0x80
> > [  394.122542]  do_idle+0x22c/0x2d0
> > [  394.125755]  cpu_startup_entry+0x28/0x48
> > [  394.129229]  secondary_start_kernel+0x1b4/0x210
> >
> >
> > metadata:
> >   git branch: linux-5.5.y and linux-5.6.y
> >   git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
> >   kernel-config:
> > http://snapshots.linaro.org/openembedded/lkft/lkft/sumo/juno/lkft/linux-stable-rc-5.5/65/config
> >
> > Full test log,
> > 5.6 test logs,
> > https://lkft.validation.linaro.org/scheduler/job/1350627#L12612
> > https://lkft.validation.linaro.org/scheduler/job/1350731#L9509
> >
> > 5.5 test logs,
> > https://lkft.validation.linaro.org/scheduler/job/1322704#L9777
> > https://lkft.validation.linaro.org/scheduler/job/1153369#L9745
> > https://lkft.validation.linaro.org/scheduler/job/1351155#L8982
> > https://lkft.validation.linaro.org/scheduler/job/1351065#L12349
> >
> > --
> > Linaro LKFT
> > https://lkft.linaro.org
> 
> 
> 
> --
> Masami Hiramatsu

  reply	other threads:[~2020-04-17  7:15 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-08  9:31 WARNING: events/ipi.h:36 suspicious rcu_dereference_check() usage! Naresh Kamboju
2020-04-17  6:53 ` Masami Hiramatsu
2020-04-17  7:15   ` Leo Yan [this message]
2020-04-17  7:28     ` Masami Hiramatsu
2020-04-17 13:50     ` Steven Rostedt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200417071543.GA7316@leoy-ThinkPad-X240s \
    --to=leo.yan@linaro.org \
    --cc=anders.roxell@linaro.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=lkft-triage@lists.linaro.org \
    --cc=masami.hiramatsu@linaro.org \
    --cc=mingo@redhat.com \
    --cc=naresh.kamboju@linaro.org \
    --cc=paulmck@kernel.org \
    --cc=peterz@infradead.org \
    --cc=rcu@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=sashal@kernel.org \
    --cc=shuah@kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=svens@linux.ibm.com \
    --cc=zanussi@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).