All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH -printk] printk, tracing: fix console tracepoint
@ 2022-05-03  7:38 Marco Elver
  2022-05-03 16:15 ` Steven Rostedt
                   ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: Marco Elver @ 2022-05-03  7:38 UTC (permalink / raw)
  To: elver, John Ogness, Petr Mladek
  Cc: Sergey Senozhatsky, Steven Rostedt, linux-kernel, kasan-dev,
	Thomas Gleixner, Johannes Berg, Alexander Potapenko,
	Dmitry Vyukov, Naresh Kamboju, Linux Kernel Functional Testing

The original intent of the 'console' tracepoint per 95100358491a
("printk/tracing: Add console output tracing") had been to "[...] record
any printk messages into the trace, regardless of the current console
loglevel. This can help correlate (existing) printk debugging with other
tracing."

Petr points out [1] that calling trace_console_rcuidle() in
call_console_driver() had been the wrong thing for a while, because
"printk() always used console_trylock() and the message was flushed to
the console only when the trylock succeeded. And it was always deferred
in NMI or when printed via printk_deferred()."

With 09c5ba0aa2fc ("printk: add kthread console printers"), things only
got worse, and calls to call_console_driver() no longer happen with
typical printk() calls but always appear deferred [2].

As such, the tracepoint can no longer serve its purpose to clearly
correlate printk() calls and other tracing, as well as breaks usecases
that expect every printk() call to result in a callback of the console
tracepoint. Notably, the KFENCE and KCSAN test suites, which want to
capture console output and assume a printk() immediately gives us a
callback to the console tracepoint.

Fix the console tracepoint by moving it into printk_sprint() [3].

One notable difference is that by moving tracing into printk_sprint(),
the 'text' will no longer include the "header" (loglevel and timestamp),
but only the raw message. Arguably this is less of a problem now that
the console tracepoint happens on the printk() call and isn't delayed.

Link: https://lore.kernel.org/all/Ym+WqKStCg%2FEHfh3@alley/ [1]
Link: https://lore.kernel.org/all/CA+G9fYu2kS0wR4WqMRsj2rePKV9XLgOU1PiXnMvpT+Z=c2ucHA@mail.gmail.com/ [2]
Link: https://lore.kernel.org/all/87fslup9dx.fsf@jogness.linutronix.de/ [3]
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Signed-off-by: Marco Elver <elver@google.com>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Petr Mladek <pmladek@suse.com>
---
 kernel/printk/printk.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index f66d6e72a642..a3e1035929b0 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2064,8 +2064,6 @@ static void call_console_driver(struct console *con, const char *text, size_t le
 {
 	size_t dropped_len;
 
-	trace_console_rcuidle(text, len);
-
 	if (con->dropped && dropped_text) {
 		dropped_len = snprintf(dropped_text, DROPPED_TEXT_MAX,
 				       "** %lu printk messages dropped **\n",
@@ -2240,6 +2238,8 @@ static u16 printk_sprint(char *text, u16 size, int facility,
 		}
 	}
 
+	trace_console_rcuidle(text, text_len);
+
 	return text_len;
 }
 
-- 
2.36.0.464.gb9c8b46e94-goog


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH -printk] printk, tracing: fix console tracepoint
  2022-05-03  7:38 [PATCH -printk] printk, tracing: fix console tracepoint Marco Elver
@ 2022-05-03 16:15 ` Steven Rostedt
  2022-05-03 19:14 ` John Ogness
  2022-07-11 22:29 ` Steven Rostedt
  2 siblings, 0 replies; 21+ messages in thread
From: Steven Rostedt @ 2022-05-03 16:15 UTC (permalink / raw)
  To: Marco Elver
  Cc: John Ogness, Petr Mladek, Sergey Senozhatsky, linux-kernel,
	kasan-dev, Thomas Gleixner, Johannes Berg, Alexander Potapenko,
	Dmitry Vyukov, Naresh Kamboju, Linux Kernel Functional Testing

On Tue,  3 May 2022 09:38:44 +0200
Marco Elver <elver@google.com> wrote:

> The original intent of the 'console' tracepoint per 95100358491a
> ("printk/tracing: Add console output tracing") had been to "[...] record
> any printk messages into the trace, regardless of the current console
> loglevel. This can help correlate (existing) printk debugging with other
> tracing."
> 
> Petr points out [1] that calling trace_console_rcuidle() in
> call_console_driver() had been the wrong thing for a while, because
> "printk() always used console_trylock() and the message was flushed to
> the console only when the trylock succeeded. And it was always deferred
> in NMI or when printed via printk_deferred()."
> 
> With 09c5ba0aa2fc ("printk: add kthread console printers"), things only
> got worse, and calls to call_console_driver() no longer happen with
> typical printk() calls but always appear deferred [2].
> 
> As such, the tracepoint can no longer serve its purpose to clearly
> correlate printk() calls and other tracing, as well as breaks usecases
> that expect every printk() call to result in a callback of the console
> tracepoint. Notably, the KFENCE and KCSAN test suites, which want to
> capture console output and assume a printk() immediately gives us a
> callback to the console tracepoint.
> 
> Fix the console tracepoint by moving it into printk_sprint() [3].
> 
> One notable difference is that by moving tracing into printk_sprint(),
> the 'text' will no longer include the "header" (loglevel and timestamp),
> but only the raw message. Arguably this is less of a problem now that
> the console tracepoint happens on the printk() call and isn't delayed.
> 

I'm OK with this change, but I don't know everyone that uses the trace
printk feature. I am worried that this could cause regressions in
people's workloads.

I'd like to hear more feedback from others, but for me:

Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>

-- Steve


> Link: https://lore.kernel.org/all/Ym+WqKStCg%2FEHfh3@alley/ [1]
> Link: https://lore.kernel.org/all/CA+G9fYu2kS0wR4WqMRsj2rePKV9XLgOU1PiXnMvpT+Z=c2ucHA@mail.gmail.com/ [2]
> Link: https://lore.kernel.org/all/87fslup9dx.fsf@jogness.linutronix.de/ [3]
> Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
> Signed-off-by: Marco Elver <elver@google.com>
> Cc: John Ogness <john.ogness@linutronix.de>
> Cc: Petr Mladek <pmladek@suse.com>
> ---
>  kernel/printk/printk.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index f66d6e72a642..a3e1035929b0 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -2064,8 +2064,6 @@ static void call_console_driver(struct console *con, const char *text, size_t le
>  {
>  	size_t dropped_len;
>  
> -	trace_console_rcuidle(text, len);
> -
>  	if (con->dropped && dropped_text) {
>  		dropped_len = snprintf(dropped_text, DROPPED_TEXT_MAX,
>  				       "** %lu printk messages dropped **\n",
> @@ -2240,6 +2238,8 @@ static u16 printk_sprint(char *text, u16 size, int facility,
>  		}
>  	}
>  
> +	trace_console_rcuidle(text, text_len);
> +
>  	return text_len;
>  }
>  


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH -printk] printk, tracing: fix console tracepoint
  2022-05-03  7:38 [PATCH -printk] printk, tracing: fix console tracepoint Marco Elver
  2022-05-03 16:15 ` Steven Rostedt
@ 2022-05-03 19:14 ` John Ogness
  2022-05-04  9:46   ` Petr Mladek
  2022-07-11 22:29 ` Steven Rostedt
  2 siblings, 1 reply; 21+ messages in thread
From: John Ogness @ 2022-05-03 19:14 UTC (permalink / raw)
  To: Marco Elver, elver, Petr Mladek
  Cc: Sergey Senozhatsky, Steven Rostedt, linux-kernel, kasan-dev,
	Thomas Gleixner, Johannes Berg, Alexander Potapenko,
	Dmitry Vyukov, Naresh Kamboju, Linux Kernel Functional Testing

On 2022-05-03, Marco Elver <elver@google.com> wrote:
> One notable difference is that by moving tracing into printk_sprint(),
> the 'text' will no longer include the "header" (loglevel and timestamp),
> but only the raw message. Arguably this is less of a problem now that
> the console tracepoint happens on the printk() call and isn't delayed.

Another slight difference is that messages composed of LOG_CONT pieces
will trigger the tracepoint for each individual piece and _never_ as a
complete line.

It was never guaranteed that all LOG_CONT pieces make it into the final
printed line anyway, but with this change it will be guaranteed that
they are always handled separately.

I am OK with this change, but like Steven, I agree the the users of that
tracepoint need to chime in.

Acked-by: John Ogness <john.ogness@linutronix.de>

The ongoing printbuf/seq_buf work [0] will hopefully someday do away
with LOG_CONT altogether.

John

[0] https://lwn.net/Articles/892611

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH -printk] printk, tracing: fix console tracepoint
  2022-05-03 19:14 ` John Ogness
@ 2022-05-04  9:46   ` Petr Mladek
  2022-05-06 14:51     ` Petr Mladek
  0 siblings, 1 reply; 21+ messages in thread
From: Petr Mladek @ 2022-05-04  9:46 UTC (permalink / raw)
  To: John Ogness
  Cc: Marco Elver, Sergey Senozhatsky, Steven Rostedt, linux-kernel,
	kasan-dev, Thomas Gleixner, Johannes Berg, Alexander Potapenko,
	Dmitry Vyukov, Naresh Kamboju, Linux Kernel Functional Testing

On Tue 2022-05-03 21:20:44, John Ogness wrote:
> On 2022-05-03, Marco Elver <elver@google.com> wrote:
> > One notable difference is that by moving tracing into printk_sprint(),
> > the 'text' will no longer include the "header" (loglevel and timestamp),
> > but only the raw message. Arguably this is less of a problem now that
> > the console tracepoint happens on the printk() call and isn't delayed.
> 
> Another slight difference is that messages composed of LOG_CONT pieces
> will trigger the tracepoint for each individual piece and _never_ as a
> complete line.
> 
> It was never guaranteed that all LOG_CONT pieces make it into the final
> printed line anyway, but with this change it will be guaranteed that
> they are always handled separately.
> 
> I am OK with this change, but like Steven, I agree the the users of that
> tracepoint need to chime in.

My feeling is that the feature is not used much. Otherwise people
would complain that it was asynchronous and hard to use.

I mean that the printk() messages appeared in the trace log
asynchronously. So it required some post processing to correctly
sort them against other tracing messages. The same result can be
achieved by processing printk log buffer, dmesg.log, journalctl.

I guess that we will only find the answer when we push the change
into linux-next and mainline. I am going to do so.

Reviewed-by: Petr Mladek <pmladek@suse.com>

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH -printk] printk, tracing: fix console tracepoint
  2022-05-04  9:46   ` Petr Mladek
@ 2022-05-06 14:51     ` Petr Mladek
  2022-05-06 14:55       ` Marco Elver
  0 siblings, 1 reply; 21+ messages in thread
From: Petr Mladek @ 2022-05-06 14:51 UTC (permalink / raw)
  To: John Ogness
  Cc: Marco Elver, Sergey Senozhatsky, Steven Rostedt, linux-kernel,
	kasan-dev, Thomas Gleixner, Johannes Berg, Alexander Potapenko,
	Dmitry Vyukov, Naresh Kamboju, Linux Kernel Functional Testing

On Wed 2022-05-04 11:46:36, Petr Mladek wrote:
> On Tue 2022-05-03 21:20:44, John Ogness wrote:
> > On 2022-05-03, Marco Elver <elver@google.com> wrote:
> > > One notable difference is that by moving tracing into printk_sprint(),
> > > the 'text' will no longer include the "header" (loglevel and timestamp),
> > > but only the raw message. Arguably this is less of a problem now that
> > > the console tracepoint happens on the printk() call and isn't delayed.
> > 
> > Another slight difference is that messages composed of LOG_CONT pieces
> > will trigger the tracepoint for each individual piece and _never_ as a
> > complete line.
> > 
> > It was never guaranteed that all LOG_CONT pieces make it into the final
> > printed line anyway, but with this change it will be guaranteed that
> > they are always handled separately.
> > 
> > I am OK with this change, but like Steven, I agree the the users of that
> > tracepoint need to chime in.
> 
> My feeling is that the feature is not used much. Otherwise people
> would complain that it was asynchronous and hard to use.
> 
> I mean that the printk() messages appeared in the trace log
> asynchronously. So it required some post processing to correctly
> sort them against other tracing messages. The same result can be
> achieved by processing printk log buffer, dmesg.log, journalctl.
> 
> I guess that we will only find the answer when we push the change
> into linux-next and mainline. I am going to do so.

JFYI, the patch has been committed into printk/linux.git,
branch rework/kthreads.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH -printk] printk, tracing: fix console tracepoint
  2022-05-06 14:51     ` Petr Mladek
@ 2022-05-06 14:55       ` Marco Elver
  0 siblings, 0 replies; 21+ messages in thread
From: Marco Elver @ 2022-05-06 14:55 UTC (permalink / raw)
  To: Petr Mladek
  Cc: John Ogness, Sergey Senozhatsky, Steven Rostedt, linux-kernel,
	kasan-dev, Thomas Gleixner, Johannes Berg, Alexander Potapenko,
	Dmitry Vyukov, Naresh Kamboju, Linux Kernel Functional Testing

On Fri, 6 May 2022 at 16:51, Petr Mladek <pmladek@suse.com> wrote:
>
> On Wed 2022-05-04 11:46:36, Petr Mladek wrote:
> > On Tue 2022-05-03 21:20:44, John Ogness wrote:
> > > On 2022-05-03, Marco Elver <elver@google.com> wrote:
> > > > One notable difference is that by moving tracing into printk_sprint(),
> > > > the 'text' will no longer include the "header" (loglevel and timestamp),
> > > > but only the raw message. Arguably this is less of a problem now that
> > > > the console tracepoint happens on the printk() call and isn't delayed.
> > >
> > > Another slight difference is that messages composed of LOG_CONT pieces
> > > will trigger the tracepoint for each individual piece and _never_ as a
> > > complete line.
> > >
> > > It was never guaranteed that all LOG_CONT pieces make it into the final
> > > printed line anyway, but with this change it will be guaranteed that
> > > they are always handled separately.
> > >
> > > I am OK with this change, but like Steven, I agree the the users of that
> > > tracepoint need to chime in.
> >
> > My feeling is that the feature is not used much. Otherwise people
> > would complain that it was asynchronous and hard to use.
> >
> > I mean that the printk() messages appeared in the trace log
> > asynchronously. So it required some post processing to correctly
> > sort them against other tracing messages. The same result can be
> > achieved by processing printk log buffer, dmesg.log, journalctl.
> >
> > I guess that we will only find the answer when we push the change
> > into linux-next and mainline. I am going to do so.
>
> JFYI, the patch has been committed into printk/linux.git,
> branch rework/kthreads.

Thank you, sounds good.

Thanks,
-- Marco

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH -printk] printk, tracing: fix console tracepoint
  2022-05-03  7:38 [PATCH -printk] printk, tracing: fix console tracepoint Marco Elver
  2022-05-03 16:15 ` Steven Rostedt
  2022-05-03 19:14 ` John Ogness
@ 2022-07-11 22:29 ` Steven Rostedt
  2022-07-12  0:21   ` Paul E. McKenney
  2 siblings, 1 reply; 21+ messages in thread
From: Steven Rostedt @ 2022-07-11 22:29 UTC (permalink / raw)
  To: Marco Elver, Paul E. McKenney
  Cc: John Ogness, Petr Mladek, Sergey Senozhatsky, linux-kernel,
	kasan-dev, Thomas Gleixner, Johannes Berg, Alexander Potapenko,
	Dmitry Vyukov, Naresh Kamboju, Linux Kernel Functional Testing


I know I acked this, but I finally got a tree where it is included in my
testing, and I hit this:

INFO: NMI handler (perf_event_nmi_handler) took too long to run: 9.860 msecs
------------[ cut here ]------------
WARNING: CPU: 1 PID: 16462 at include/trace/events/printk.h:10 printk_sprint+0x81/0xda
Modules linked in: ppdev parport_pc parport
CPU: 1 PID: 16462 Comm: event_benchmark Not tainted 5.19.0-rc5-test+ #5
Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
EIP: printk_sprint+0x81/0xda
Code: 89 d8 e8 88 fc 33 00 e9 02 00 00 00 eb 6b 64 a1 a4 b8 91 c1 e8 fd d6 ff ff 84 c0 74 5c 64 a1 14 08 92 c1 a9 00 00 f0 00 74 02 <0f> 0b 64 ff 05 14 08 92 c1 b8 e0 c4 6b c1 e8 a5 dc 00 00 89 c7 e8
EAX: 80110001 EBX: c20a52f8 ECX: 0000000c EDX: 6d203036
ESI: 3df6004c EDI: 00000000 EBP: c61fbd7c ESP: c61fbd70
DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010006
CR0: 80050033 CR2: b7efc000 CR3: 05b80000 CR4: 001506f0
Call Trace:
 vprintk_store+0x24b/0x2ff
perf: interrupt took too long (7980 > 7977), lowering kernel.perf_event_max_sample_rate to 25000
 vprintk+0x37/0x4d
 _printk+0x14/0x16
 nmi_handle+0x1ef/0x24e
 ? find_next_bit.part.0+0x13/0x13
 ? find_next_bit.part.0+0x13/0x13
 ? function_trace_call+0xd8/0xd9
 default_do_nmi+0x57/0x1af
 ? trace_hardirqs_off_finish+0x2a/0xd9
 ? to_kthread+0xf/0xf
 exc_nmi+0x9b/0xf4
 asm_exc_nmi+0xae/0x29c


On Tue,  3 May 2022 09:38:44 +0200
Marco Elver <elver@google.com> wrote:

> Petr points out [1] that calling trace_console_rcuidle() in
> call_console_driver() had been the wrong thing for a while, because
> "printk() always used console_trylock() and the message was flushed to
> the console only when the trylock succeeded. And it was always deferred
> in NMI or when printed via printk_deferred()."

The issue is that we use "trace_console_rcuidle()" where the "_rcuidle()"
version uses srcu, which the last I knew is not safe in NMI context.

Paul, has that changed?

Thus, we need to make sure that printk() is always called when "rcu is
watching" and remove the _rcuidle() part, or we do not call it from nmi
context. Or make srcu nmi safe.

For now, I'm reverting this in my local tree.

-- Steve


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH -printk] printk, tracing: fix console tracepoint
  2022-07-11 22:29 ` Steven Rostedt
@ 2022-07-12  0:21   ` Paul E. McKenney
  2022-07-12  0:53     ` Steven Rostedt
  0 siblings, 1 reply; 21+ messages in thread
From: Paul E. McKenney @ 2022-07-12  0:21 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Marco Elver, John Ogness, Petr Mladek, Sergey Senozhatsky,
	linux-kernel, kasan-dev, Thomas Gleixner, Johannes Berg,
	Alexander Potapenko, Dmitry Vyukov, Naresh Kamboju,
	Linux Kernel Functional Testing

On Mon, Jul 11, 2022 at 06:29:18PM -0400, Steven Rostedt wrote:
> 
> I know I acked this, but I finally got a tree where it is included in my
> testing, and I hit this:
> 
> INFO: NMI handler (perf_event_nmi_handler) took too long to run: 9.860 msecs
> ------------[ cut here ]------------
> WARNING: CPU: 1 PID: 16462 at include/trace/events/printk.h:10 printk_sprint+0x81/0xda
> Modules linked in: ppdev parport_pc parport
> CPU: 1 PID: 16462 Comm: event_benchmark Not tainted 5.19.0-rc5-test+ #5
> Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
> EIP: printk_sprint+0x81/0xda
> Code: 89 d8 e8 88 fc 33 00 e9 02 00 00 00 eb 6b 64 a1 a4 b8 91 c1 e8 fd d6 ff ff 84 c0 74 5c 64 a1 14 08 92 c1 a9 00 00 f0 00 74 02 <0f> 0b 64 ff 05 14 08 92 c1 b8 e0 c4 6b c1 e8 a5 dc 00 00 89 c7 e8
> EAX: 80110001 EBX: c20a52f8 ECX: 0000000c EDX: 6d203036
> ESI: 3df6004c EDI: 00000000 EBP: c61fbd7c ESP: c61fbd70
> DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010006
> CR0: 80050033 CR2: b7efc000 CR3: 05b80000 CR4: 001506f0
> Call Trace:
>  vprintk_store+0x24b/0x2ff
> perf: interrupt took too long (7980 > 7977), lowering kernel.perf_event_max_sample_rate to 25000
>  vprintk+0x37/0x4d
>  _printk+0x14/0x16
>  nmi_handle+0x1ef/0x24e
>  ? find_next_bit.part.0+0x13/0x13
>  ? find_next_bit.part.0+0x13/0x13
>  ? function_trace_call+0xd8/0xd9
>  default_do_nmi+0x57/0x1af
>  ? trace_hardirqs_off_finish+0x2a/0xd9
>  ? to_kthread+0xf/0xf
>  exc_nmi+0x9b/0xf4
>  asm_exc_nmi+0xae/0x29c
> 
> 
> On Tue,  3 May 2022 09:38:44 +0200
> Marco Elver <elver@google.com> wrote:
> 
> > Petr points out [1] that calling trace_console_rcuidle() in
> > call_console_driver() had been the wrong thing for a while, because
> > "printk() always used console_trylock() and the message was flushed to
> > the console only when the trylock succeeded. And it was always deferred
> > in NMI or when printed via printk_deferred()."
> 
> The issue is that we use "trace_console_rcuidle()" where the "_rcuidle()"
> version uses srcu, which the last I knew is not safe in NMI context.
> 
> Paul, has that changed?

On x86, both srcu_read_lock() and srcu_read_unlock() should be OK from
NMI context, give or take their use of lockdep.  Which is why we have
srcu_read_lock_notrace() and srcu_read_unlock_notrace(), which do not
use lockdep.  Which __DO_TRACE() does in fact invoke.  Ah, but you have
this: "WARN_ON_ONCE(rcuidle && in_nmi())".

Because all the world is not an x86.

For srcu_read_lock() and srcu_read_unlock() to be NMI-safe more generally,
this_cpu_inc() would need to be NMI-safe on all platforms.  This requires
it be implemented with read-modify-write atomic instructions, which
appears to be the case on arm64, s390 (for at least some configs,
maybe all depending on what the "laa" instruction does), and again x86.
The generic code relies on disabling interrrupts, which does nothing
against NMIs.

So srcu_read_lock_notrace() and srcu_read_unlock_notrace() are NMI-safe
only on arm64, s390, and x86.  So that WARN_ON_ONCE() is necessary.

> Thus, we need to make sure that printk() is always called when "rcu is
> watching" and remove the _rcuidle() part, or we do not call it from nmi
> context. Or make srcu nmi safe.

Which means making this_cpu_inc() NMI-safe.  Which could be done
by making the generic this_cpu_inc() implementation use something
like arch_atomic_fetch_add().  Or a cmpxchg() loop.  Or making
__srcu_read_lock() and srcu_read_unlock() check the architecture and Do
The Right Thing.  Maybe a CONFIG_THIS_CPU_NMI_SAFE that is set by
x86, s390, and arm64?

Is either approach something that those other architectures would be
willing to countenance?  For me, it might be worth it just to not have
to dig through many layers of this_cpu macro definitions from time
to time...  :-/

							Thanx, Paul

> For now, I'm reverting this in my local tree.
> 
> -- Steve
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH -printk] printk, tracing: fix console tracepoint
  2022-07-12  0:21   ` Paul E. McKenney
@ 2022-07-12  0:53     ` Steven Rostedt
  2022-07-12  2:57       ` Paul E. McKenney
  0 siblings, 1 reply; 21+ messages in thread
From: Steven Rostedt @ 2022-07-12  0:53 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Marco Elver, John Ogness, Petr Mladek, Sergey Senozhatsky,
	linux-kernel, kasan-dev, Thomas Gleixner, Johannes Berg,
	Alexander Potapenko, Dmitry Vyukov, Naresh Kamboju,
	Linux Kernel Functional Testing

On Mon, 11 Jul 2022 17:21:28 -0700
"Paul E. McKenney" <paulmck@kernel.org> wrote:

> On x86, both srcu_read_lock() and srcu_read_unlock() should be OK from
> NMI context, give or take their use of lockdep.  Which is why we have
> srcu_read_lock_notrace() and srcu_read_unlock_notrace(), which do not
> use lockdep.  Which __DO_TRACE() does in fact invoke.  Ah, but you have
> this: "WARN_ON_ONCE(rcuidle && in_nmi())".
> 
> Because all the world is not an x86.

But since NMIs are architecture specific, we could change that to:

	WARN_ON_ONCE(!srcu_nmi_safe && rcuidle && in_nmi());

and add a srcu_nmi_safe constant or macro that is 1 on architectures that
srcu is safe in NMI and 0 otherwise.

Or do we care if a tracepoint happens in those architectures where it is
not safe. We could then just do:

	if (!srcu_nmi_safe && rcuidle && in_nmi())
		return;

and just skip tracepoints that are marked rcu_idle and happen within NMI.

-- Steve

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH -printk] printk, tracing: fix console tracepoint
  2022-07-12  0:53     ` Steven Rostedt
@ 2022-07-12  2:57       ` Paul E. McKenney
  2022-07-12 11:49         ` Paul E. McKenney
  0 siblings, 1 reply; 21+ messages in thread
From: Paul E. McKenney @ 2022-07-12  2:57 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Marco Elver, John Ogness, Petr Mladek, Sergey Senozhatsky,
	linux-kernel, kasan-dev, Thomas Gleixner, Johannes Berg,
	Alexander Potapenko, Dmitry Vyukov, Naresh Kamboju,
	Linux Kernel Functional Testing

On Mon, Jul 11, 2022 at 08:53:19PM -0400, Steven Rostedt wrote:
> On Mon, 11 Jul 2022 17:21:28 -0700
> "Paul E. McKenney" <paulmck@kernel.org> wrote:
> 
> > On x86, both srcu_read_lock() and srcu_read_unlock() should be OK from
> > NMI context, give or take their use of lockdep.  Which is why we have
> > srcu_read_lock_notrace() and srcu_read_unlock_notrace(), which do not
> > use lockdep.  Which __DO_TRACE() does in fact invoke.  Ah, but you have
> > this: "WARN_ON_ONCE(rcuidle && in_nmi())".
> > 
> > Because all the world is not an x86.
> 
> But since NMIs are architecture specific, we could change that to:
> 
> 	WARN_ON_ONCE(!srcu_nmi_safe && rcuidle && in_nmi());
> 
> and add a srcu_nmi_safe constant or macro that is 1 on architectures that
> srcu is safe in NMI and 0 otherwise.
> 
> Or do we care if a tracepoint happens in those architectures where it is
> not safe. We could then just do:
> 
> 	if (!srcu_nmi_safe && rcuidle && in_nmi())
> 		return;
> 
> and just skip tracepoints that are marked rcu_idle and happen within NMI.

More generally, this is this_cpu_nmi_safe rather than just SRCU.  Or could
be, depending on what the architecture guys would like to guarantee.
SRCU is just passing through the this_cpu*() NMI-safety property.

And in addition to in_nmi(), there is the HAVE_NMI Kconfig option:

	$ git grep -w HAVE_NMI
	arch/Kconfig:config HAVE_NMI
	arch/Kconfig:	depends on HAVE_NMI
	arch/arm/Kconfig:	select HAVE_NMI
	arch/arm64/Kconfig:	select HAVE_NMI
	arch/loongarch/Kconfig:	select HAVE_NMI
	arch/mips/Kconfig:	select HAVE_NMI
	arch/powerpc/Kconfig:	select HAVE_NMI				if PERF_EVENTS || (PPC64 && PPC_BOOK3S)
	arch/s390/Kconfig:	select HAVE_NMI
	arch/sh/Kconfig:	select HAVE_NMI
	arch/sparc/Kconfig:	select HAVE_NMI
	arch/x86/Kconfig:	select HAVE_NMI

So if I understand correctly, arm, loongarch, mips, powerpc, sh, and
sparc are the ones that have NMIs but don't have NMI-safe this_cpu*()
operations.  These are the ones that would need !srcu_nmi_safe.

Or, longer term, I could make HAVE_NMI && !srcu_safe_nmi cause SRCU to use
explicit atomics for srcu_read_lock_trace() and srcu_read_unlock_trace().
I am assuming that any NMI handler executing srcu_read_lock_trace()
also executes the matching srcu_read_unlock_trace().  (Silly me, I know!)
This assumption means that srcu_read_lock() and srcu_read_unlock() can
continue with their non-atomic this_cpu_inc() ways.

But a quick fix that stopped the bleeding and allowed printk() to
progress would be useful in the short term, regardless of whether or
not in the longer term it makes sense to make srcu_read_lock_trace()
and srcu_read_unlock_trace() NMI-safe.

Thoughts?

							Thanx, Paul

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH -printk] printk, tracing: fix console tracepoint
  2022-07-12  2:57       ` Paul E. McKenney
@ 2022-07-12 11:49         ` Paul E. McKenney
  2022-07-12 13:39           ` Steven Rostedt
  0 siblings, 1 reply; 21+ messages in thread
From: Paul E. McKenney @ 2022-07-12 11:49 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Marco Elver, John Ogness, Petr Mladek, Sergey Senozhatsky,
	linux-kernel, kasan-dev, Thomas Gleixner, Johannes Berg,
	Alexander Potapenko, Dmitry Vyukov, Naresh Kamboju,
	Linux Kernel Functional Testing

On Mon, Jul 11, 2022 at 07:57:01PM -0700, Paul E. McKenney wrote:
> On Mon, Jul 11, 2022 at 08:53:19PM -0400, Steven Rostedt wrote:
> > On Mon, 11 Jul 2022 17:21:28 -0700
> > "Paul E. McKenney" <paulmck@kernel.org> wrote:
> > 
> > > On x86, both srcu_read_lock() and srcu_read_unlock() should be OK from
> > > NMI context, give or take their use of lockdep.  Which is why we have
> > > srcu_read_lock_notrace() and srcu_read_unlock_notrace(), which do not
> > > use lockdep.  Which __DO_TRACE() does in fact invoke.  Ah, but you have
> > > this: "WARN_ON_ONCE(rcuidle && in_nmi())".
> > > 
> > > Because all the world is not an x86.
> > 
> > But since NMIs are architecture specific, we could change that to:
> > 
> > 	WARN_ON_ONCE(!srcu_nmi_safe && rcuidle && in_nmi());
> > 
> > and add a srcu_nmi_safe constant or macro that is 1 on architectures that
> > srcu is safe in NMI and 0 otherwise.
> > 
> > Or do we care if a tracepoint happens in those architectures where it is
> > not safe. We could then just do:
> > 
> > 	if (!srcu_nmi_safe && rcuidle && in_nmi())
> > 		return;
> > 
> > and just skip tracepoints that are marked rcu_idle and happen within NMI.
> 
> More generally, this is this_cpu_nmi_safe rather than just SRCU.  Or could
> be, depending on what the architecture guys would like to guarantee.
> SRCU is just passing through the this_cpu*() NMI-safety property.
> 
> And in addition to in_nmi(), there is the HAVE_NMI Kconfig option:
> 
> 	$ git grep -w HAVE_NMI
> 	arch/Kconfig:config HAVE_NMI
> 	arch/Kconfig:	depends on HAVE_NMI
> 	arch/arm/Kconfig:	select HAVE_NMI
> 	arch/arm64/Kconfig:	select HAVE_NMI
> 	arch/loongarch/Kconfig:	select HAVE_NMI
> 	arch/mips/Kconfig:	select HAVE_NMI
> 	arch/powerpc/Kconfig:	select HAVE_NMI				if PERF_EVENTS || (PPC64 && PPC_BOOK3S)
> 	arch/s390/Kconfig:	select HAVE_NMI
> 	arch/sh/Kconfig:	select HAVE_NMI
> 	arch/sparc/Kconfig:	select HAVE_NMI
> 	arch/x86/Kconfig:	select HAVE_NMI
> 
> So if I understand correctly, arm, loongarch, mips, powerpc, sh, and
> sparc are the ones that have NMIs but don't have NMI-safe this_cpu*()
> operations.  These are the ones that would need !srcu_nmi_safe.
> 
> Or, longer term, I could make HAVE_NMI && !srcu_safe_nmi cause SRCU to use
> explicit atomics for srcu_read_lock_trace() and srcu_read_unlock_trace().
> I am assuming that any NMI handler executing srcu_read_lock_trace()
> also executes the matching srcu_read_unlock_trace().  (Silly me, I know!)
> This assumption means that srcu_read_lock() and srcu_read_unlock() can
> continue with their non-atomic this_cpu_inc() ways.
> 
> But a quick fix that stopped the bleeding and allowed printk() to
> progress would be useful in the short term, regardless of whether or
> not in the longer term it makes sense to make srcu_read_lock_trace()
> and srcu_read_unlock_trace() NMI-safe.

Except that doesn't rcuidle && in_nmi() imply a misplaced trace event?

Isn't it still the case that you are not supposed to have trace events
in NMI handlers before RCU is watching or after it is no longer watching,
just as for entry/exit code in general?  Once in the body of the handler,
rcuidle should be false and all should be well.

Or am I missing something here?

							Thanx, Paul

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH -printk] printk, tracing: fix console tracepoint
  2022-07-12 11:49         ` Paul E. McKenney
@ 2022-07-12 13:39           ` Steven Rostedt
  2022-07-12 13:49             ` Paul E. McKenney
  0 siblings, 1 reply; 21+ messages in thread
From: Steven Rostedt @ 2022-07-12 13:39 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Marco Elver, John Ogness, Petr Mladek, Sergey Senozhatsky,
	linux-kernel, kasan-dev, Thomas Gleixner, Johannes Berg,
	Alexander Potapenko, Dmitry Vyukov, Naresh Kamboju,
	Linux Kernel Functional Testing

On Tue, 12 Jul 2022 04:49:54 -0700
"Paul E. McKenney" <paulmck@kernel.org> wrote:

> > But a quick fix that stopped the bleeding and allowed printk() to
> > progress would be useful in the short term, regardless of whether or
> > not in the longer term it makes sense to make srcu_read_lock_trace()
> > and srcu_read_unlock_trace() NMI-safe.  
> 
> Except that doesn't rcuidle && in_nmi() imply a misplaced trace event?
> 
> Isn't it still the case that you are not supposed to have trace events
> in NMI handlers before RCU is watching or after it is no longer watching,
> just as for entry/exit code in general?  Once in the body of the handler,
> rcuidle should be false and all should be well.
> 
> Or am I missing something here?

I guess the question is, can we have printk() in such a place? Because this
tracepoint is attached to printk and where ever printk is done so is this
tracepoint.

-- Steve

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH -printk] printk, tracing: fix console tracepoint
  2022-07-12 13:39           ` Steven Rostedt
@ 2022-07-12 13:49             ` Paul E. McKenney
  2022-07-12 14:53               ` Steven Rostedt
  0 siblings, 1 reply; 21+ messages in thread
From: Paul E. McKenney @ 2022-07-12 13:49 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Marco Elver, John Ogness, Petr Mladek, Sergey Senozhatsky,
	linux-kernel, kasan-dev, Thomas Gleixner, Johannes Berg,
	Alexander Potapenko, Dmitry Vyukov, Naresh Kamboju,
	Linux Kernel Functional Testing

On Tue, Jul 12, 2022 at 09:39:40AM -0400, Steven Rostedt wrote:
> On Tue, 12 Jul 2022 04:49:54 -0700
> "Paul E. McKenney" <paulmck@kernel.org> wrote:
> 
> > > But a quick fix that stopped the bleeding and allowed printk() to
> > > progress would be useful in the short term, regardless of whether or
> > > not in the longer term it makes sense to make srcu_read_lock_trace()
> > > and srcu_read_unlock_trace() NMI-safe.  
> > 
> > Except that doesn't rcuidle && in_nmi() imply a misplaced trace event?
> > 
> > Isn't it still the case that you are not supposed to have trace events
> > in NMI handlers before RCU is watching or after it is no longer watching,
> > just as for entry/exit code in general?  Once in the body of the handler,
> > rcuidle should be false and all should be well.
> > 
> > Or am I missing something here?
> 
> I guess the question is, can we have printk() in such a place? Because this
> tracepoint is attached to printk and where ever printk is done so is this
> tracepoint.

As I understand it, code in such a place should be labeled noinstr.
Then the call to printk() would be complained about as an illegal
noinstr-to-non-noinstr call.

But where exactly is that printk()?

							Thanx, Paul

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH -printk] printk, tracing: fix console tracepoint
  2022-07-12 13:49             ` Paul E. McKenney
@ 2022-07-12 14:53               ` Steven Rostedt
  2022-07-12 15:16                 ` Paul E. McKenney
  2022-07-12 15:25                 ` Petr Mladek
  0 siblings, 2 replies; 21+ messages in thread
From: Steven Rostedt @ 2022-07-12 14:53 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Marco Elver, John Ogness, Petr Mladek, Sergey Senozhatsky,
	linux-kernel, kasan-dev, Thomas Gleixner, Johannes Berg,
	Alexander Potapenko, Dmitry Vyukov, Naresh Kamboju,
	Linux Kernel Functional Testing

On Tue, 12 Jul 2022 06:49:16 -0700
"Paul E. McKenney" <paulmck@kernel.org> wrote:

> > I guess the question is, can we have printk() in such a place? Because this
> > tracepoint is attached to printk and where ever printk is done so is this
> > tracepoint.  
> 
> As I understand it, code in such a place should be labeled noinstr.
> Then the call to printk() would be complained about as an illegal
> noinstr-to-non-noinstr call.
> 
> But where exactly is that printk()?

Perhaps the fix is to remove the _rcuidle() from trace_console_rcuidle().
If printk() can never be called from noinstr (aka RCU not watching).

-- Steve

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH -printk] printk, tracing: fix console tracepoint
  2022-07-12 14:53               ` Steven Rostedt
@ 2022-07-12 15:16                 ` Paul E. McKenney
  2022-07-13 11:25                   ` Petr Mladek
  2022-07-12 15:25                 ` Petr Mladek
  1 sibling, 1 reply; 21+ messages in thread
From: Paul E. McKenney @ 2022-07-12 15:16 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Marco Elver, John Ogness, Petr Mladek, Sergey Senozhatsky,
	linux-kernel, kasan-dev, Thomas Gleixner, Johannes Berg,
	Alexander Potapenko, Dmitry Vyukov, Naresh Kamboju,
	Linux Kernel Functional Testing

On Tue, Jul 12, 2022 at 10:53:53AM -0400, Steven Rostedt wrote:
> On Tue, 12 Jul 2022 06:49:16 -0700
> "Paul E. McKenney" <paulmck@kernel.org> wrote:
> 
> > > I guess the question is, can we have printk() in such a place? Because this
> > > tracepoint is attached to printk and where ever printk is done so is this
> > > tracepoint.  
> > 
> > As I understand it, code in such a place should be labeled noinstr.
> > Then the call to printk() would be complained about as an illegal
> > noinstr-to-non-noinstr call.
> > 
> > But where exactly is that printk()?
> 
> Perhaps the fix is to remove the _rcuidle() from trace_console_rcuidle().
> If printk() can never be called from noinstr (aka RCU not watching).

Maybe printk() is supposed to be invoked from noinstr.  It might be a
special case in the tooling.  I have no idea.  ;-)

However, the current SRCU read-side algorithm will tolerate being invoked
from noinstr as long as it is not also an NMI handler.  Much though
debugging tools might (or might not) complain.

Don't get me wrong, I can make SRCU tolerate being called while RCU is
not watching.  It is not even all that complicated.  The cost is that
architectures that have NMIs but do not have NMI-safe this_cpu*()
operations have an SRCU reader switch from explicit smp_mb() and
interrupt disabling to a cmpxchg() loop relying on the implicit barriers
in cmpxchg().

For arm64, this was reportedly a win.

If it turns out that we need SRCU readers to be invoked from NMI handlers
in locations where RCU is not watching, are there people who would be
willing to play with a modified SRCU on the systems in question?

-ENOHARDWARE at this end.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH -printk] printk, tracing: fix console tracepoint
  2022-07-12 14:53               ` Steven Rostedt
  2022-07-12 15:16                 ` Paul E. McKenney
@ 2022-07-12 15:25                 ` Petr Mladek
  1 sibling, 0 replies; 21+ messages in thread
From: Petr Mladek @ 2022-07-12 15:25 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Paul E. McKenney, Marco Elver, John Ogness, Sergey Senozhatsky,
	linux-kernel, kasan-dev, Thomas Gleixner, Johannes Berg,
	Alexander Potapenko, Dmitry Vyukov, Naresh Kamboju,
	Peter Zijlstra, Linux Kernel Functional Testing

On Tue 2022-07-12 10:53:53, Steven Rostedt wrote:
> On Tue, 12 Jul 2022 06:49:16 -0700
> "Paul E. McKenney" <paulmck@kernel.org> wrote:
> 
> > > I guess the question is, can we have printk() in such a place? Because this
> > > tracepoint is attached to printk and where ever printk is done so is this
> > > tracepoint.  
> > 
> > As I understand it, code in such a place should be labeled noinstr.
> > Then the call to printk() would be complained about as an illegal
> > noinstr-to-non-noinstr call.
> > 
> > But where exactly is that printk()?
> 
> Perhaps the fix is to remove the _rcuidle() from trace_console_rcuidle().
> If printk() can never be called from noinstr (aka RCU not watching).

Good point!

My understanding is that printk() should not get called when rcu
is not watching. But it might need to reduce the scope of the code
when rcu is not watching.

PeterZ actually removed _rcuidle() in this trace a patchset,
see https://lore.kernel.org/r/20220608144517.444659212@infradead.org.

Adding Peter into Cc.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH -printk] printk, tracing: fix console tracepoint
  2022-07-12 15:16                 ` Paul E. McKenney
@ 2022-07-13 11:25                   ` Petr Mladek
  2022-07-13 14:05                     ` Paul E. McKenney
  0 siblings, 1 reply; 21+ messages in thread
From: Petr Mladek @ 2022-07-13 11:25 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Steven Rostedt, Marco Elver, John Ogness, Sergey Senozhatsky,
	linux-kernel, kasan-dev, Thomas Gleixner, Johannes Berg,
	Alexander Potapenko, Dmitry Vyukov, Naresh Kamboju,
	Linux Kernel Functional Testing

On Tue 2022-07-12 08:16:55, Paul E. McKenney wrote:
> On Tue, Jul 12, 2022 at 10:53:53AM -0400, Steven Rostedt wrote:
> > On Tue, 12 Jul 2022 06:49:16 -0700
> > "Paul E. McKenney" <paulmck@kernel.org> wrote:
> > 
> > > > I guess the question is, can we have printk() in such a place? Because this
> > > > tracepoint is attached to printk and where ever printk is done so is this
> > > > tracepoint.  
> > > 
> > > As I understand it, code in such a place should be labeled noinstr.
> > > Then the call to printk() would be complained about as an illegal
> > > noinstr-to-non-noinstr call.
> > > 
> > > But where exactly is that printk()?
> > 
> > Perhaps the fix is to remove the _rcuidle() from trace_console_rcuidle().
> > If printk() can never be called from noinstr (aka RCU not watching).
> 
> Maybe printk() is supposed to be invoked from noinstr.  It might be a
> special case in the tooling.  I have no idea.  ;-)

I think that it is ok to do _not_ support printk() in noinstr parts.

> However, the current SRCU read-side algorithm will tolerate being invoked
> from noinstr as long as it is not also an NMI handler.  Much though
> debugging tools might (or might not) complain.
> 
> Don't get me wrong, I can make SRCU tolerate being called while RCU is
> not watching.  It is not even all that complicated.  The cost is that
> architectures that have NMIs but do not have NMI-safe this_cpu*()
> operations have an SRCU reader switch from explicit smp_mb() and
> interrupt disabling to a cmpxchg() loop relying on the implicit barriers
> in cmpxchg().
> 
> For arm64, this was reportedly a win.

IMHO, the tracepoint in printk() is not worth slowing down other
important fast paths.

The tracepoint was moved into vprintk_store() in 5.19-rc1. It used
to be in console_unlock() before. The previous location was not
reliable by definition. Old messages might be overridden by new
ones before they reach console. Also messages in NMI context
used to be stored in per-CPU buffers. There was even bigger
risk that they would not reach the console.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH -printk] printk, tracing: fix console tracepoint
  2022-07-13 11:25                   ` Petr Mladek
@ 2022-07-13 14:05                     ` Paul E. McKenney
  2022-07-14 14:53                       ` Petr Mladek
  0 siblings, 1 reply; 21+ messages in thread
From: Paul E. McKenney @ 2022-07-13 14:05 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Steven Rostedt, Marco Elver, John Ogness, Sergey Senozhatsky,
	linux-kernel, kasan-dev, Thomas Gleixner, Johannes Berg,
	Alexander Potapenko, Dmitry Vyukov, Naresh Kamboju,
	Linux Kernel Functional Testing

On Wed, Jul 13, 2022 at 01:25:41PM +0200, Petr Mladek wrote:
> On Tue 2022-07-12 08:16:55, Paul E. McKenney wrote:
> > On Tue, Jul 12, 2022 at 10:53:53AM -0400, Steven Rostedt wrote:
> > > On Tue, 12 Jul 2022 06:49:16 -0700
> > > "Paul E. McKenney" <paulmck@kernel.org> wrote:
> > > 
> > > > > I guess the question is, can we have printk() in such a place? Because this
> > > > > tracepoint is attached to printk and where ever printk is done so is this
> > > > > tracepoint.  
> > > > 
> > > > As I understand it, code in such a place should be labeled noinstr.
> > > > Then the call to printk() would be complained about as an illegal
> > > > noinstr-to-non-noinstr call.
> > > > 
> > > > But where exactly is that printk()?
> > > 
> > > Perhaps the fix is to remove the _rcuidle() from trace_console_rcuidle().
> > > If printk() can never be called from noinstr (aka RCU not watching).
> > 
> > Maybe printk() is supposed to be invoked from noinstr.  It might be a
> > special case in the tooling.  I have no idea.  ;-)
> 
> I think that it is ok to do _not_ support printk() in noinstr parts.
> 
> > However, the current SRCU read-side algorithm will tolerate being invoked
> > from noinstr as long as it is not also an NMI handler.  Much though
> > debugging tools might (or might not) complain.
> > 
> > Don't get me wrong, I can make SRCU tolerate being called while RCU is
> > not watching.  It is not even all that complicated.  The cost is that
> > architectures that have NMIs but do not have NMI-safe this_cpu*()
> > operations have an SRCU reader switch from explicit smp_mb() and
> > interrupt disabling to a cmpxchg() loop relying on the implicit barriers
> > in cmpxchg().
> > 
> > For arm64, this was reportedly a win.
> 
> IMHO, the tracepoint in printk() is not worth slowing down other
> important fast paths.
> 
> The tracepoint was moved into vprintk_store() in 5.19-rc1. It used
> to be in console_unlock() before. The previous location was not
> reliable by definition. Old messages might be overridden by new
> ones before they reach console. Also messages in NMI context
> used to be stored in per-CPU buffers. There was even bigger
> risk that they would not reach the console.

Fair enough, works for me!

							Thanx, Paul

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH -printk] printk, tracing: fix console tracepoint
  2022-07-13 14:05                     ` Paul E. McKenney
@ 2022-07-14 14:53                       ` Petr Mladek
  2022-07-14 15:17                         ` Steven Rostedt
  0 siblings, 1 reply; 21+ messages in thread
From: Petr Mladek @ 2022-07-14 14:53 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Steven Rostedt, Marco Elver, John Ogness, Sergey Senozhatsky,
	linux-kernel, kasan-dev, Thomas Gleixner, Johannes Berg,
	Alexander Potapenko, Dmitry Vyukov, Naresh Kamboju,
	Linux Kernel Functional Testing

On Wed 2022-07-13 07:05:50, Paul E. McKenney wrote:
> On Wed, Jul 13, 2022 at 01:25:41PM +0200, Petr Mladek wrote:
> > On Tue 2022-07-12 08:16:55, Paul E. McKenney wrote:
> > > Maybe printk() is supposed to be invoked from noinstr.  It might be a
> > > special case in the tooling.  I have no idea.  ;-)
> > 
> > I think that it is ok to do _not_ support printk() in noinstr parts.
> > 
> > > However, the current SRCU read-side algorithm will tolerate being invoked
> > > from noinstr as long as it is not also an NMI handler.  Much though
> > > debugging tools might (or might not) complain.
> > > 
> > > Don't get me wrong, I can make SRCU tolerate being called while RCU is
> > > not watching.  It is not even all that complicated.  The cost is that
> > > architectures that have NMIs but do not have NMI-safe this_cpu*()
> > > operations have an SRCU reader switch from explicit smp_mb() and
> > > interrupt disabling to a cmpxchg() loop relying on the implicit barriers
> > > in cmpxchg().
> > > 
> > > For arm64, this was reportedly a win.
> > 
> > IMHO, the tracepoint in printk() is not worth slowing down other
> > important fast paths.
> > 
> > The tracepoint was moved into vprintk_store() in 5.19-rc1. It used
> > to be in console_unlock() before. The previous location was not
> > reliable by definition. Old messages might be overridden by new
> > ones before they reach console. Also messages in NMI context
> > used to be stored in per-CPU buffers. There was even bigger
> > risk that they would not reach the console.
> 
> Fair enough, works for me!

The remaining question is how to make the code safe and calm
down the warning.

My understanding is that Peter Zijlstra wants to reduce the scope
of the rcuidle code even more in the future. So, we could
do something like:

From 24c3517dedf2a30efabe72871c188fbfffffd397 Mon Sep 17 00:00:00 2001
From: Petr Mladek <pmladek@suse.com>
Date: Thu, 14 Jul 2022 14:54:12 +0200
Subject: [PATCH] printk: Make console tracepoint safe in NMI() context

The commit 701850dc0c31bfadf75a0 ("printk, tracing: fix console
tracepoint") moved the tracepoint from console_unlock() to
vprintk_store(). As a result, it might be called in any
context and triggered the following warning:

  WARNING: CPU: 1 PID: 16462 at include/trace/events/printk.h:10 printk_sprint+0x81/0xda
  Modules linked in: ppdev parport_pc parport
  CPU: 1 PID: 16462 Comm: event_benchmark Not tainted 5.19.0-rc5-test+ #5
  Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
  EIP: printk_sprint+0x81/0xda
  Code: 89 d8 e8 88 fc 33 00 e9 02 00 00 00 eb 6b 64 a1 a4 b8 91 c1 e8 fd d6 ff ff 84 c0 74 5c 64 a1 14 08 92 c1 a9 00 00 f0 00 74 02 <0f> 0b 64 ff 05 14 08 92 c1 b8 e0 c4 6b c1 e8 a5 dc 00 00 89 c7 e8
  EAX: 80110001 EBX: c20a52f8 ECX: 0000000c EDX: 6d203036
  ESI: 3df6004c EDI: 00000000 EBP: c61fbd7c ESP: c61fbd70
  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010006
  CR0: 80050033 CR2: b7efc000 CR3: 05b80000 CR4: 001506f0
  Call Trace:
   vprintk_store+0x24b/0x2ff
   vprintk+0x37/0x4d
   _printk+0x14/0x16
   nmi_handle+0x1ef/0x24e
   ? find_next_bit.part.0+0x13/0x13
   ? find_next_bit.part.0+0x13/0x13
   ? function_trace_call+0xd8/0xd9
   default_do_nmi+0x57/0x1af
   ? trace_hardirqs_off_finish+0x2a/0xd9
   ? to_kthread+0xf/0xf
   exc_nmi+0x9b/0xf4
   asm_exc_nmi+0xae/0x29c

It comes from:

  #define __DO_TRACE(name, args, cond, rcuidle) \
  [...]
		/* srcu can't be used from NMI */	\
		WARN_ON_ONCE(rcuidle && in_nmi());	\

It might be possible to make srcu working in NMI. But it
would be slower on some architectures. It is not worth
doing it just because of this tracepoint.

It would be possible to disable this tracepoint in NMI
or in rcuidle context. Where the rcuidle context looks
more rare and thus more acceptable to be ignored.

Alternative solution would be to move the tracepoint
back to console code. But the location is less reliable
by definition. Also the synchronization against other
tracing messages is much worse.

Let's ignore the tracepoint in rcuidle context as the least
evil solution.

There seems to be three possibilities.
Link: https://lore.kernel.org/r/20220712151655.GU1790663@paulmck-ThinkPad-P17-Gen-1

Signed-off-by: Petr Mladek <pmladek@suse.com>
---
 kernel/printk/printk.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index b49c6ff6dca0..a13cf3310204 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2108,7 +2108,15 @@ static u16 printk_sprint(char *text, u16 size, int facility,
 		}
 	}
 
-	trace_console_rcuidle(text, text_len);
+	/*
+	 * trace_console_idle() is not working in NMI. printk()
+	 * is used more often in NMI than in rcuidle context.
+	 * Choose the less evil solution here.
+	 *
+	 * smp_processor_id() is reliable in rcuidle context.
+	 */
+	if (!rcu_is_idle_cpu(smp_processor_id()))
+		trace_console(text, text_len);
 
 	return text_len;
 }
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH -printk] printk, tracing: fix console tracepoint
  2022-07-14 14:53                       ` Petr Mladek
@ 2022-07-14 15:17                         ` Steven Rostedt
  2022-07-15  9:40                           ` Petr Mladek
  0 siblings, 1 reply; 21+ messages in thread
From: Steven Rostedt @ 2022-07-14 15:17 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Paul E. McKenney, Marco Elver, John Ogness, Sergey Senozhatsky,
	linux-kernel, kasan-dev, Thomas Gleixner, Johannes Berg,
	Alexander Potapenko, Dmitry Vyukov, Naresh Kamboju,
	Linux Kernel Functional Testing

On Thu, 14 Jul 2022 16:53:24 +0200
Petr Mladek <pmladek@suse.com> wrote:

> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -2108,7 +2108,15 @@ static u16 printk_sprint(char *text, u16 size, int facility,
>  		}
>  	}
>  
> -	trace_console_rcuidle(text, text_len);
> +	/*
> +	 * trace_console_idle() is not working in NMI. printk()
> +	 * is used more often in NMI than in rcuidle context.
> +	 * Choose the less evil solution here.
> +	 *
> +	 * smp_processor_id() is reliable in rcuidle context.
> +	 */
> +	if (!rcu_is_idle_cpu(smp_processor_id()))
> +		trace_console(text, text_len);
>  
>  	return text_len;
>  }
> -- 

Although printk is not really a fast path, you could do this and avoid the
check when the trace event is not active:

(Not even compiled tested)

Tweaked the comment, and used raw_smp_processor_id() as I'm not sure we are
in a preempt disabled context, and we don't care if we are not.

-- Steve

diff --git a/include/trace/events/printk.h b/include/trace/events/printk.h
index 13d405b2fd8b..d0a5f63920bb 100644
--- a/include/trace/events/printk.h
+++ b/include/trace/events/printk.h
@@ -7,11 +7,20 @@
 
 #include <linux/tracepoint.h>
 
-TRACE_EVENT(console,
+TRACE_EVENT_CONDITION(console,
 	TP_PROTO(const char *text, size_t len),
 
 	TP_ARGS(text, len),
 
+	/*
+	 * trace_console_rcuidle() is not working in NMI. printk()
+	 * is used more often in NMI than in rcuidle context.
+	 * Choose the less evil solution here.
+	 *
+	 * raw_smp_processor_id() is reliable in rcuidle context.
+	 */
+	TP_CONDITION(!rcu_is_idle_cpu(raw_smp_processor_id())),
+
 	TP_STRUCT__entry(
 		__dynamic_array(char, msg, len + 1)
 	),

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH -printk] printk, tracing: fix console tracepoint
  2022-07-14 15:17                         ` Steven Rostedt
@ 2022-07-15  9:40                           ` Petr Mladek
  0 siblings, 0 replies; 21+ messages in thread
From: Petr Mladek @ 2022-07-15  9:40 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Paul E. McKenney, Marco Elver, John Ogness, Sergey Senozhatsky,
	linux-kernel, kasan-dev, Thomas Gleixner, Johannes Berg,
	Alexander Potapenko, Dmitry Vyukov, Naresh Kamboju,
	Linux Kernel Functional Testing

On Thu 2022-07-14 11:17:49, Steven Rostedt wrote:
> Although printk is not really a fast path, you could do this and avoid the
> check when the trace event is not active:
> 
> (Not even compiled tested)
> 
> Tweaked the comment, and used raw_smp_processor_id() as I'm not sure we are
> in a preempt disabled context, and we don't care if we are not.

Makes sense.

> diff --git a/include/trace/events/printk.h b/include/trace/events/printk.h
> index 13d405b2fd8b..d0a5f63920bb 100644
> --- a/include/trace/events/printk.h
> +++ b/include/trace/events/printk.h
> @@ -7,11 +7,20 @@
>  
>  #include <linux/tracepoint.h>
>  
> -TRACE_EVENT(console,
> +TRACE_EVENT_CONDITION(console,
>  	TP_PROTO(const char *text, size_t len),
>  
>  	TP_ARGS(text, len),
>  
> +	/*
> +	 * trace_console_rcuidle() is not working in NMI. printk()
> +	 * is used more often in NMI than in rcuidle context.
> +	 * Choose the less evil solution here.
> +	 *
> +	 * raw_smp_processor_id() is reliable in rcuidle context.
> +	 */
> +	TP_CONDITION(!rcu_is_idle_cpu(raw_smp_processor_id())),
> +
>  	TP_STRUCT__entry(
>  		__dynamic_array(char, msg, len + 1)
>  	),

It looks better. I am going to test it and send as a proper patch
for review.

Thanks for help.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2022-07-15  9:40 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-05-03  7:38 [PATCH -printk] printk, tracing: fix console tracepoint Marco Elver
2022-05-03 16:15 ` Steven Rostedt
2022-05-03 19:14 ` John Ogness
2022-05-04  9:46   ` Petr Mladek
2022-05-06 14:51     ` Petr Mladek
2022-05-06 14:55       ` Marco Elver
2022-07-11 22:29 ` Steven Rostedt
2022-07-12  0:21   ` Paul E. McKenney
2022-07-12  0:53     ` Steven Rostedt
2022-07-12  2:57       ` Paul E. McKenney
2022-07-12 11:49         ` Paul E. McKenney
2022-07-12 13:39           ` Steven Rostedt
2022-07-12 13:49             ` Paul E. McKenney
2022-07-12 14:53               ` Steven Rostedt
2022-07-12 15:16                 ` Paul E. McKenney
2022-07-13 11:25                   ` Petr Mladek
2022-07-13 14:05                     ` Paul E. McKenney
2022-07-14 14:53                       ` Petr Mladek
2022-07-14 15:17                         ` Steven Rostedt
2022-07-15  9:40                           ` Petr Mladek
2022-07-12 15:25                 ` Petr Mladek

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.