All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] printk: Make console tracepoint safe in NMI() context
@ 2022-07-15 12:01 Petr Mladek
  2022-07-15 12:39 ` Marco Elver
  2022-07-15 15:02 ` Steven Rostedt
  0 siblings, 2 replies; 7+ messages in thread
From: Petr Mladek @ 2022-07-15 12:01 UTC (permalink / raw)
  To: Steven Rostedt, Paul E . McKenney, John Ogness
  Cc: Sergey Senozhatsky, Marco Elver, kasan-dev, Thomas Gleixner,
	Johannes Berg, Alexander Potapenko, Dmitry Vyukov,
	Naresh Kamboju, Peter Zijlstra, Linux Kernel Functional Testing,
	linux-kernel, Petr Mladek

The commit 701850dc0c31bfadf75a0 ("printk, tracing: fix console
tracepoint") moved the tracepoint from console_unlock() to
vprintk_store(). As a result, it might be called in any
context and triggered the following warning:

  WARNING: CPU: 1 PID: 16462 at include/trace/events/printk.h:10 printk_sprint+0x81/0xda
  Modules linked in: ppdev parport_pc parport
  CPU: 1 PID: 16462 Comm: event_benchmark Not tainted 5.19.0-rc5-test+ #5
  Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
  EIP: printk_sprint+0x81/0xda
  Code: 89 d8 e8 88 fc 33 00 e9 02 00 00 00 eb 6b 64 a1 a4 b8 91 c1 e8 fd d6 ff ff 84 c0 74 5c 64 a1 14 08 92 c1 a9 00 00 f0 00 74 02 <0f> 0b 64 ff 05 14 08 92 c1 b8 e0 c4 6b c1 e8 a5 dc 00 00 89 c7 e8
  EAX: 80110001 EBX: c20a52f8 ECX: 0000000c EDX: 6d203036
  ESI: 3df6004c EDI: 00000000 EBP: c61fbd7c ESP: c61fbd70
  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010006
  CR0: 80050033 CR2: b7efc000 CR3: 05b80000 CR4: 001506f0
  Call Trace:
   vprintk_store+0x24b/0x2ff
   vprintk+0x37/0x4d
   _printk+0x14/0x16
   nmi_handle+0x1ef/0x24e
   ? find_next_bit.part.0+0x13/0x13
   ? find_next_bit.part.0+0x13/0x13
   ? function_trace_call+0xd8/0xd9
   default_do_nmi+0x57/0x1af
   ? trace_hardirqs_off_finish+0x2a/0xd9
   ? to_kthread+0xf/0xf
   exc_nmi+0x9b/0xf4
   asm_exc_nmi+0xae/0x29c

It comes from:

  #define __DO_TRACE(name, args, cond, rcuidle) \
  [...]
		/* srcu can't be used from NMI */	\
		WARN_ON_ONCE(rcuidle && in_nmi());	\

It might be possible to make srcu working in NMI. But it
would be slower on some architectures. It is not worth
doing it just because of this tracepoint.

It would be possible to disable this tracepoint in NMI
or in rcuidle context. Where the rcuidle context looks
more rare and thus more acceptable to be ignored.

Alternative solution would be to move the tracepoint
back to console code. But the location is less reliable
by definition. Also the synchronization against other
tracing messages is much worse.

Let's ignore the tracepoint in rcuidle context as the least
evil solution.

Link: https://lore.kernel.org/r/20220712151655.GU1790663@paulmck-ThinkPad-P17-Gen-1

Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Petr Mladek <pmladek@suse.com>
---
 include/trace/events/printk.h | 11 ++++++++++-
 kernel/printk/printk.c        |  2 +-
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/include/trace/events/printk.h b/include/trace/events/printk.h
index 13d405b2fd8b..a3ee720f41b5 100644
--- a/include/trace/events/printk.h
+++ b/include/trace/events/printk.h
@@ -7,11 +7,20 @@
 
 #include <linux/tracepoint.h>
 
-TRACE_EVENT(console,
+TRACE_EVENT_CONDITION(console,
 	TP_PROTO(const char *text, size_t len),
 
 	TP_ARGS(text, len),
 
+	/*
+	 * trace_console_rcuidle() is not working in NMI. printk()
+	 * is used more often in NMI than in rcuidle context.
+	 * Choose the less evil solution here.
+	 *
+	 * raw_smp_processor_id() is reliable in rcuidle context.
+	 */
+	TP_CONDITION(!rcu_is_idle_cpu(raw_smp_processor_id())),
+
 	TP_STRUCT__entry(
 		__dynamic_array(char, msg, len + 1)
 	),
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index b49c6ff6dca0..bd76a45ecc7f 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2108,7 +2108,7 @@ static u16 printk_sprint(char *text, u16 size, int facility,
 		}
 	}
 
-	trace_console_rcuidle(text, text_len);
+	trace_console(text, text_len);
 
 	return text_len;
 }
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] printk: Make console tracepoint safe in NMI() context
  2022-07-15 12:01 [PATCH] printk: Make console tracepoint safe in NMI() context Petr Mladek
@ 2022-07-15 12:39 ` Marco Elver
  2022-07-15 13:51   ` Steven Rostedt
  2022-07-15 15:02 ` Steven Rostedt
  1 sibling, 1 reply; 7+ messages in thread
From: Marco Elver @ 2022-07-15 12:39 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Steven Rostedt, Paul E . McKenney, John Ogness,
	Sergey Senozhatsky, kasan-dev, Thomas Gleixner, Johannes Berg,
	Alexander Potapenko, Dmitry Vyukov, Naresh Kamboju,
	Peter Zijlstra, Linux Kernel Functional Testing, linux-kernel

On Fri, 15 Jul 2022 at 14:02, Petr Mladek <pmladek@suse.com> wrote:
>
> The commit 701850dc0c31bfadf75a0 ("printk, tracing: fix console
> tracepoint") moved the tracepoint from console_unlock() to
> vprintk_store(). As a result, it might be called in any
> context and triggered the following warning:
>
>   WARNING: CPU: 1 PID: 16462 at include/trace/events/printk.h:10 printk_sprint+0x81/0xda
>   Modules linked in: ppdev parport_pc parport
>   CPU: 1 PID: 16462 Comm: event_benchmark Not tainted 5.19.0-rc5-test+ #5
>   Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
>   EIP: printk_sprint+0x81/0xda
>   Code: 89 d8 e8 88 fc 33 00 e9 02 00 00 00 eb 6b 64 a1 a4 b8 91 c1 e8 fd d6 ff ff 84 c0 74 5c 64 a1 14 08 92 c1 a9 00 00 f0 00 74 02 <0f> 0b 64 ff 05 14 08 92 c1 b8 e0 c4 6b c1 e8 a5 dc 00 00 89 c7 e8
>   EAX: 80110001 EBX: c20a52f8 ECX: 0000000c EDX: 6d203036
>   ESI: 3df6004c EDI: 00000000 EBP: c61fbd7c ESP: c61fbd70
>   DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010006
>   CR0: 80050033 CR2: b7efc000 CR3: 05b80000 CR4: 001506f0
>   Call Trace:
>    vprintk_store+0x24b/0x2ff
>    vprintk+0x37/0x4d
>    _printk+0x14/0x16
>    nmi_handle+0x1ef/0x24e
>    ? find_next_bit.part.0+0x13/0x13
>    ? find_next_bit.part.0+0x13/0x13
>    ? function_trace_call+0xd8/0xd9
>    default_do_nmi+0x57/0x1af
>    ? trace_hardirqs_off_finish+0x2a/0xd9
>    ? to_kthread+0xf/0xf
>    exc_nmi+0x9b/0xf4
>    asm_exc_nmi+0xae/0x29c
>
> It comes from:
>
>   #define __DO_TRACE(name, args, cond, rcuidle) \
>   [...]
>                 /* srcu can't be used from NMI */       \
>                 WARN_ON_ONCE(rcuidle && in_nmi());      \
>
> It might be possible to make srcu working in NMI. But it
> would be slower on some architectures. It is not worth
> doing it just because of this tracepoint.
>
> It would be possible to disable this tracepoint in NMI
> or in rcuidle context. Where the rcuidle context looks
> more rare and thus more acceptable to be ignored.
>
> Alternative solution would be to move the tracepoint
> back to console code. But the location is less reliable
> by definition. Also the synchronization against other
> tracing messages is much worse.
>
> Let's ignore the tracepoint in rcuidle context as the least
> evil solution.
>
> Link: https://lore.kernel.org/r/20220712151655.GU1790663@paulmck-ThinkPad-P17-Gen-1
>
> Suggested-by: Steven Rostedt <rostedt@goodmis.org>
> Signed-off-by: Petr Mladek <pmladek@suse.com>
> ---
>  include/trace/events/printk.h | 11 ++++++++++-
>  kernel/printk/printk.c        |  2 +-
>  2 files changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/include/trace/events/printk.h b/include/trace/events/printk.h
> index 13d405b2fd8b..a3ee720f41b5 100644
> --- a/include/trace/events/printk.h
> +++ b/include/trace/events/printk.h
> @@ -7,11 +7,20 @@
>
>  #include <linux/tracepoint.h>
>
> -TRACE_EVENT(console,
> +TRACE_EVENT_CONDITION(console,
>         TP_PROTO(const char *text, size_t len),
>
>         TP_ARGS(text, len),
>
> +       /*
> +        * trace_console_rcuidle() is not working in NMI. printk()
> +        * is used more often in NMI than in rcuidle context.
> +        * Choose the less evil solution here.
> +        *
> +        * raw_smp_processor_id() is reliable in rcuidle context.
> +        */
> +       TP_CONDITION(!rcu_is_idle_cpu(raw_smp_processor_id())),
> +

Couldn't this just use rcu_is_watching()?

  | * rcu_is_watching - see if RCU thinks that the current CPU is not idle

Thanks,
-- Marco

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] printk: Make console tracepoint safe in NMI() context
  2022-07-15 12:39 ` Marco Elver
@ 2022-07-15 13:51   ` Steven Rostedt
  2022-07-15 15:10     ` Paul E. McKenney
  0 siblings, 1 reply; 7+ messages in thread
From: Steven Rostedt @ 2022-07-15 13:51 UTC (permalink / raw)
  To: Marco Elver
  Cc: Petr Mladek, Paul E . McKenney, John Ogness, Sergey Senozhatsky,
	kasan-dev, Thomas Gleixner, Johannes Berg, Alexander Potapenko,
	Dmitry Vyukov, Naresh Kamboju, Peter Zijlstra,
	Linux Kernel Functional Testing, linux-kernel

On Fri, 15 Jul 2022 14:39:52 +0200
Marco Elver <elver@google.com> wrote:

> Couldn't this just use rcu_is_watching()?
> 
>   | * rcu_is_watching - see if RCU thinks that the current CPU is not idle

Maybe, but I was thinking that Petr had a way to hit the issue that we
worry about. But since the non _rcuide() call requires rcu watching,
prehaps that is better to use.

-- Steve

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] printk: Make console tracepoint safe in NMI() context
  2022-07-15 12:01 [PATCH] printk: Make console tracepoint safe in NMI() context Petr Mladek
  2022-07-15 12:39 ` Marco Elver
@ 2022-07-15 15:02 ` Steven Rostedt
  1 sibling, 0 replies; 7+ messages in thread
From: Steven Rostedt @ 2022-07-15 15:02 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Paul E . McKenney, John Ogness, Sergey Senozhatsky, Marco Elver,
	kasan-dev, Thomas Gleixner, Johannes Berg, Alexander Potapenko,
	Dmitry Vyukov, Naresh Kamboju, Peter Zijlstra,
	Linux Kernel Functional Testing, linux-kernel

On Fri, 15 Jul 2022 14:01:52 +0200
Petr Mladek <pmladek@suse.com> wrote:

> +	/*
> +	 * trace_console_rcuidle() is not working in NMI. printk()
> +	 * is used more often in NMI than in rcuidle context.
> +	 * Choose the less evil solution here.
> +	 *
> +	 * raw_smp_processor_id() is reliable in rcuidle context.
> +	 */
> +	TP_CONDITION(!rcu_is_idle_cpu(raw_smp_processor_id())),
> +

As Marco mentioned in the other thread, would a check for
'rcu_is_watching()' be better?

-- Steve


>  	TP_STRUCT__entry(
>  		__dynamic_array(char, msg, len + 1)
>  	),

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] printk: Make console tracepoint safe in NMI() context
  2022-07-15 13:51   ` Steven Rostedt
@ 2022-07-15 15:10     ` Paul E. McKenney
  2022-07-15 15:25       ` Steven Rostedt
  2022-07-15 15:34       ` Petr Mladek
  0 siblings, 2 replies; 7+ messages in thread
From: Paul E. McKenney @ 2022-07-15 15:10 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Marco Elver, Petr Mladek, John Ogness, Sergey Senozhatsky,
	kasan-dev, Thomas Gleixner, Johannes Berg, Alexander Potapenko,
	Dmitry Vyukov, Naresh Kamboju, Peter Zijlstra,
	Linux Kernel Functional Testing, linux-kernel

On Fri, Jul 15, 2022 at 09:51:56AM -0400, Steven Rostedt wrote:
> On Fri, 15 Jul 2022 14:39:52 +0200
> Marco Elver <elver@google.com> wrote:
> 
> > Couldn't this just use rcu_is_watching()?
> > 
> >   | * rcu_is_watching - see if RCU thinks that the current CPU is not idle
> 
> Maybe, but I was thinking that Petr had a way to hit the issue that we
> worry about. But since the non _rcuide() call requires rcu watching,
> prehaps that is better to use.

In case this helps...  ;-)

The rcu_is_watching() function is designed to be used from the current
CPU, so it dispenses with memory ordering.  However, it explicitly
disables preemption in order to avoid weird preemption patterns.

The formulation that Marco used is designed to be used from a remote
CPU, and so it includes explicit memory ordering that is not needed
in this case.  But it does not disable preemption.

So if preemption is enabled at that point in tracing, you really want
to be using rcu_is_watching().

							Thanx, Paul

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] printk: Make console tracepoint safe in NMI() context
  2022-07-15 15:10     ` Paul E. McKenney
@ 2022-07-15 15:25       ` Steven Rostedt
  2022-07-15 15:34       ` Petr Mladek
  1 sibling, 0 replies; 7+ messages in thread
From: Steven Rostedt @ 2022-07-15 15:25 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Marco Elver, Petr Mladek, John Ogness, Sergey Senozhatsky,
	kasan-dev, Thomas Gleixner, Johannes Berg, Alexander Potapenko,
	Dmitry Vyukov, Naresh Kamboju, Peter Zijlstra,
	Linux Kernel Functional Testing, linux-kernel

On Fri, 15 Jul 2022 08:10:00 -0700
"Paul E. McKenney" <paulmck@kernel.org> wrote:

> So if preemption is enabled at that point in tracing, you really want
> to be using rcu_is_watching().

And yes, at that point in tracing, preemption is still enabled if the
tracepoint was called with preemption enabled.

Thus, we really need to convert that to rcu_is_watching().

-- Steve

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] printk: Make console tracepoint safe in NMI() context
  2022-07-15 15:10     ` Paul E. McKenney
  2022-07-15 15:25       ` Steven Rostedt
@ 2022-07-15 15:34       ` Petr Mladek
  1 sibling, 0 replies; 7+ messages in thread
From: Petr Mladek @ 2022-07-15 15:34 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Steven Rostedt, Marco Elver, John Ogness, Sergey Senozhatsky,
	kasan-dev, Thomas Gleixner, Johannes Berg, Alexander Potapenko,
	Dmitry Vyukov, Naresh Kamboju, Peter Zijlstra,
	Linux Kernel Functional Testing, linux-kernel

On Fri 2022-07-15 08:10:00, Paul E. McKenney wrote:
> On Fri, Jul 15, 2022 at 09:51:56AM -0400, Steven Rostedt wrote:
> > On Fri, 15 Jul 2022 14:39:52 +0200
> > Marco Elver <elver@google.com> wrote:
> > 
> > > Couldn't this just use rcu_is_watching()?
> > > 
> > >   | * rcu_is_watching - see if RCU thinks that the current CPU is not idle
> > 
> > Maybe, but I was thinking that Petr had a way to hit the issue that we
> > worry about. But since the non _rcuide() call requires rcu watching,
> > prehaps that is better to use.

I actually saw the warning even with simple sysrq+l. I wonder why
I have missed it during testing. It was probably well hidden within
the other backtraces.

I was not aware that rcu_is_watching() and rcu_is_idle_cpu() did
basically the same. I used rcu_is_idle_cpu() because of the "idle"
in the name and the function description ;-)

> In case this helps...  ;-)
> 
> The rcu_is_watching() function is designed to be used from the current
> CPU, so it dispenses with memory ordering.  However, it explicitly
> disables preemption in order to avoid weird preemption patterns.
> 
> The formulation that Marco used is designed to be used from a remote
> CPU, and so it includes explicit memory ordering that is not needed
> in this case.  But it does not disable preemption.
> 
> So if preemption is enabled at that point in tracing, you really want
> to be using rcu_is_watching().

rcu_is_watching() is the right variant then. I am going to send v2.

Thanks a lot for the detailed explanation.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2022-07-15 15:34 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-15 12:01 [PATCH] printk: Make console tracepoint safe in NMI() context Petr Mladek
2022-07-15 12:39 ` Marco Elver
2022-07-15 13:51   ` Steven Rostedt
2022-07-15 15:10     ` Paul E. McKenney
2022-07-15 15:25       ` Steven Rostedt
2022-07-15 15:34       ` Petr Mladek
2022-07-15 15:02 ` Steven Rostedt

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.