All of lore.kernel.org
 help / color / mirror / Atom feed
* [BUG] Threaded printk breaks early debugging
@ 2022-06-10 12:48 ` Peter Geis
  0 siblings, 0 replies; 40+ messages in thread
From: Peter Geis @ 2022-06-10 12:48 UTC (permalink / raw)
  To: John Ogness
  Cc: Petr Mladek, Sergey Senozhatsky, Linux Kernel Mailing List,
	open list:ARM/Rockchip SoC...

Good Morning,

First, I love this patch from a functional standpoint! However I've
run into an issue debugging early boot issues. Anything that causes
the kernel threading system to die (for example here, a NPE) causes
the boot console to halt before it outputs the error.

Would it be possible to have both a kconfig option and a runtime
option to toggle printk_fallback_preferred_direct()? Currently I've
had to force it at all times while debugging.

Thanks again for this amazing work!

Very Respectfully,
Peter Geis

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [BUG] Threaded printk breaks early debugging
@ 2022-06-10 12:48 ` Peter Geis
  0 siblings, 0 replies; 40+ messages in thread
From: Peter Geis @ 2022-06-10 12:48 UTC (permalink / raw)
  To: John Ogness
  Cc: Petr Mladek, Sergey Senozhatsky, Linux Kernel Mailing List,
	open list:ARM/Rockchip SoC...

Good Morning,

First, I love this patch from a functional standpoint! However I've
run into an issue debugging early boot issues. Anything that causes
the kernel threading system to die (for example here, a NPE) causes
the boot console to halt before it outputs the error.

Would it be possible to have both a kconfig option and a runtime
option to toggle printk_fallback_preferred_direct()? Currently I've
had to force it at all times while debugging.

Thanks again for this amazing work!

Very Respectfully,
Peter Geis

_______________________________________________
Linux-rockchip mailing list
Linux-rockchip@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-rockchip

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [BUG] Threaded printk breaks early debugging
  2022-06-10 12:48 ` Peter Geis
@ 2022-06-10 15:05   ` John Ogness
  -1 siblings, 0 replies; 40+ messages in thread
From: John Ogness @ 2022-06-10 15:05 UTC (permalink / raw)
  To: Peter Geis
  Cc: Petr Mladek, Sergey Senozhatsky, Linux Kernel Mailing List,
	open list:ARM/Rockchip SoC...

Hi Peter,

On 2022-06-10, Peter Geis <pgwipeout@gmail.com> wrote:
> However I've run into an issue debugging early boot issues. Anything
> that causes the kernel threading system to die (for example here, a
> NPE) causes the boot console to halt before it outputs the error.

A null pointer exception should trigger a panic, which will cause printk
to enter direct mode for the oops. It should be just as effective as
before introducing the printing kthreads.

Can you please provide a test case or patch that will trigger the
problem you refer to?

> Would it be possible to have both a kconfig option and a runtime
> option to toggle printk_fallback_preferred_direct()? Currently I've
> had to force it at all times while debugging.

It would be trivial to provide a boot parameter for this behavior, but I
think we first need to see a real problem and also see if we can fix
that problem.

John Ogness

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [BUG] Threaded printk breaks early debugging
@ 2022-06-10 15:05   ` John Ogness
  0 siblings, 0 replies; 40+ messages in thread
From: John Ogness @ 2022-06-10 15:05 UTC (permalink / raw)
  To: Peter Geis
  Cc: Petr Mladek, Sergey Senozhatsky, Linux Kernel Mailing List,
	open list:ARM/Rockchip SoC...

Hi Peter,

On 2022-06-10, Peter Geis <pgwipeout@gmail.com> wrote:
> However I've run into an issue debugging early boot issues. Anything
> that causes the kernel threading system to die (for example here, a
> NPE) causes the boot console to halt before it outputs the error.

A null pointer exception should trigger a panic, which will cause printk
to enter direct mode for the oops. It should be just as effective as
before introducing the printing kthreads.

Can you please provide a test case or patch that will trigger the
problem you refer to?

> Would it be possible to have both a kconfig option and a runtime
> option to toggle printk_fallback_preferred_direct()? Currently I've
> had to force it at all times while debugging.

It would be trivial to provide a boot parameter for this behavior, but I
think we first need to see a real problem and also see if we can fix
that problem.

John Ogness

_______________________________________________
Linux-rockchip mailing list
Linux-rockchip@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-rockchip

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [BUG] Threaded printk breaks early debugging
  2022-06-10 15:05   ` John Ogness
@ 2022-06-10 15:34     ` Peter Geis
  -1 siblings, 0 replies; 40+ messages in thread
From: Peter Geis @ 2022-06-10 15:34 UTC (permalink / raw)
  To: John Ogness
  Cc: Petr Mladek, Sergey Senozhatsky, Linux Kernel Mailing List,
	open list:ARM/Rockchip SoC...

On Fri, Jun 10, 2022 at 11:05 AM John Ogness <john.ogness@linutronix.de> wrote:
>
> Hi Peter,
>
> On 2022-06-10, Peter Geis <pgwipeout@gmail.com> wrote:
> > However I've run into an issue debugging early boot issues. Anything
> > that causes the kernel threading system to die (for example here, a
> > NPE) causes the boot console to halt before it outputs the error.
>
> A null pointer exception should trigger a panic, which will cause printk
> to enter direct mode for the oops. It should be just as effective as
> before introducing the printing kthreads.

This might be a side effect of the fact that this is on a low powered
arm64 board. I noticed with threading enabled during large bursts the
console drops an excessive amount of messages. It's especially
apparent during the handover from earlycon to the normal console.

>
> Can you please provide a test case or patch that will trigger the
> problem you refer to?

Unfortunately no, unless you have a Rockchip rk356x device with both
DSI and HDMI. We're working on adding DSI support to the new video
output driver, and when both are enabled it tosses a NPE.

>
> > Would it be possible to have both a kconfig option and a runtime
> > option to toggle printk_fallback_preferred_direct()? Currently I've
> > had to force it at all times while debugging.
>
> It would be trivial to provide a boot parameter for this behavior, but I
> think we first need to see a real problem and also see if we can fix
> that problem.
>
> John Ogness

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [BUG] Threaded printk breaks early debugging
@ 2022-06-10 15:34     ` Peter Geis
  0 siblings, 0 replies; 40+ messages in thread
From: Peter Geis @ 2022-06-10 15:34 UTC (permalink / raw)
  To: John Ogness
  Cc: Petr Mladek, Sergey Senozhatsky, Linux Kernel Mailing List,
	open list:ARM/Rockchip SoC...

On Fri, Jun 10, 2022 at 11:05 AM John Ogness <john.ogness@linutronix.de> wrote:
>
> Hi Peter,
>
> On 2022-06-10, Peter Geis <pgwipeout@gmail.com> wrote:
> > However I've run into an issue debugging early boot issues. Anything
> > that causes the kernel threading system to die (for example here, a
> > NPE) causes the boot console to halt before it outputs the error.
>
> A null pointer exception should trigger a panic, which will cause printk
> to enter direct mode for the oops. It should be just as effective as
> before introducing the printing kthreads.

This might be a side effect of the fact that this is on a low powered
arm64 board. I noticed with threading enabled during large bursts the
console drops an excessive amount of messages. It's especially
apparent during the handover from earlycon to the normal console.

>
> Can you please provide a test case or patch that will trigger the
> problem you refer to?

Unfortunately no, unless you have a Rockchip rk356x device with both
DSI and HDMI. We're working on adding DSI support to the new video
output driver, and when both are enabled it tosses a NPE.

>
> > Would it be possible to have both a kconfig option and a runtime
> > option to toggle printk_fallback_preferred_direct()? Currently I've
> > had to force it at all times while debugging.
>
> It would be trivial to provide a boot parameter for this behavior, but I
> think we first need to see a real problem and also see if we can fix
> that problem.
>
> John Ogness

_______________________________________________
Linux-rockchip mailing list
Linux-rockchip@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-rockchip

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [BUG] Threaded printk breaks early debugging
  2022-06-10 15:34     ` Peter Geis
@ 2022-06-12  2:57       ` Sergey Senozhatsky
  -1 siblings, 0 replies; 40+ messages in thread
From: Sergey Senozhatsky @ 2022-06-12  2:57 UTC (permalink / raw)
  To: Peter Geis
  Cc: John Ogness, Petr Mladek, Sergey Senozhatsky,
	Linux Kernel Mailing List, open list:ARM/Rockchip SoC...

On (22/06/10 11:34), Peter Geis wrote:
> > On 2022-06-10, Peter Geis <pgwipeout@gmail.com> wrote:
> > > However I've run into an issue debugging early boot issues. Anything
> > > that causes the kernel threading system to die (for example here, a
> > > NPE) causes the boot console to halt before it outputs the error.
> >
> > A null pointer exception should trigger a panic, which will cause printk
> > to enter direct mode for the oops. It should be just as effective as
> > before introducing the printing kthreads.
> 
> This might be a side effect of the fact that this is on a low powered
> arm64 board. I noticed with threading enabled during large bursts the
> console drops an excessive amount of messages. It's especially
> apparent during the handover from earlycon to the normal console.

How many CPUs does it have?

_______________________________________________
Linux-rockchip mailing list
Linux-rockchip@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-rockchip

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [BUG] Threaded printk breaks early debugging
@ 2022-06-12  2:57       ` Sergey Senozhatsky
  0 siblings, 0 replies; 40+ messages in thread
From: Sergey Senozhatsky @ 2022-06-12  2:57 UTC (permalink / raw)
  To: Peter Geis
  Cc: John Ogness, Petr Mladek, Sergey Senozhatsky,
	Linux Kernel Mailing List, open list:ARM/Rockchip SoC...

On (22/06/10 11:34), Peter Geis wrote:
> > On 2022-06-10, Peter Geis <pgwipeout@gmail.com> wrote:
> > > However I've run into an issue debugging early boot issues. Anything
> > > that causes the kernel threading system to die (for example here, a
> > > NPE) causes the boot console to halt before it outputs the error.
> >
> > A null pointer exception should trigger a panic, which will cause printk
> > to enter direct mode for the oops. It should be just as effective as
> > before introducing the printing kthreads.
> 
> This might be a side effect of the fact that this is on a low powered
> arm64 board. I noticed with threading enabled during large bursts the
> console drops an excessive amount of messages. It's especially
> apparent during the handover from earlycon to the normal console.

How many CPUs does it have?

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [BUG] Threaded printk breaks early debugging
  2022-06-10 15:05   ` John Ogness
@ 2022-06-12  3:13     ` Sergey Senozhatsky
  -1 siblings, 0 replies; 40+ messages in thread
From: Sergey Senozhatsky @ 2022-06-12  3:13 UTC (permalink / raw)
  To: John Ogness, Petr Mladek
  Cc: Peter Geis, Sergey Senozhatsky, Linux Kernel Mailing List,
	open list:ARM/Rockchip SoC...

On (22/06/10 17:11), John Ogness wrote:
> It would be trivial to provide a boot parameter for this behavior, but I
> think we first need to see a real problem and also see if we can fix
> that problem.

John, Petr, a quick question

Should a situation when we have only one online CPU be enough of a reason
to do direct printing? Otherwise we might not have CPUs to wakeup khtread on,
e.g. when CPU that printk is in atomic section for too long.

---

@@ -480,6 +480,9 @@ static inline bool allow_direct_printing(void)
 	if (!printk_kthreads_available)
 		return true;
 
+	if (num_online_cpus() == 1)
+		return true;
+
 	/*
 	 * Prefer direct printing when the system is in a problematic state.
 	 * The context that sets this state will always see the updated value.

_______________________________________________
Linux-rockchip mailing list
Linux-rockchip@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-rockchip

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [BUG] Threaded printk breaks early debugging
@ 2022-06-12  3:13     ` Sergey Senozhatsky
  0 siblings, 0 replies; 40+ messages in thread
From: Sergey Senozhatsky @ 2022-06-12  3:13 UTC (permalink / raw)
  To: John Ogness, Petr Mladek
  Cc: Peter Geis, Sergey Senozhatsky, Linux Kernel Mailing List,
	open list:ARM/Rockchip SoC...

On (22/06/10 17:11), John Ogness wrote:
> It would be trivial to provide a boot parameter for this behavior, but I
> think we first need to see a real problem and also see if we can fix
> that problem.

John, Petr, a quick question

Should a situation when we have only one online CPU be enough of a reason
to do direct printing? Otherwise we might not have CPUs to wakeup khtread on,
e.g. when CPU that printk is in atomic section for too long.

---

@@ -480,6 +480,9 @@ static inline bool allow_direct_printing(void)
 	if (!printk_kthreads_available)
 		return true;
 
+	if (num_online_cpus() == 1)
+		return true;
+
 	/*
 	 * Prefer direct printing when the system is in a problematic state.
 	 * The context that sets this state will always see the updated value.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [BUG] Threaded printk breaks early debugging
  2022-06-12  2:57       ` Sergey Senozhatsky
@ 2022-06-12 13:30         ` Peter Geis
  -1 siblings, 0 replies; 40+ messages in thread
From: Peter Geis @ 2022-06-12 13:30 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: John Ogness, Petr Mladek, Linux Kernel Mailing List,
	open list:ARM/Rockchip SoC...

On Sat, Jun 11, 2022 at 10:57 PM Sergey Senozhatsky
<senozhatsky@chromium.org> wrote:
>
> On (22/06/10 11:34), Peter Geis wrote:
> > > On 2022-06-10, Peter Geis <pgwipeout@gmail.com> wrote:
> > > > However I've run into an issue debugging early boot issues. Anything
> > > > that causes the kernel threading system to die (for example here, a
> > > > NPE) causes the boot console to halt before it outputs the error.
> > >
> > > A null pointer exception should trigger a panic, which will cause printk
> > > to enter direct mode for the oops. It should be just as effective as
> > > before introducing the printing kthreads.
> >
> > This might be a side effect of the fact that this is on a low powered
> > arm64 board. I noticed with threading enabled during large bursts the
> > console drops an excessive amount of messages. It's especially
> > apparent during the handover from earlycon to the normal console.
>
> How many CPUs does it have?

Four and all are online when the error occurs.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [BUG] Threaded printk breaks early debugging
@ 2022-06-12 13:30         ` Peter Geis
  0 siblings, 0 replies; 40+ messages in thread
From: Peter Geis @ 2022-06-12 13:30 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: John Ogness, Petr Mladek, Linux Kernel Mailing List,
	open list:ARM/Rockchip SoC...

On Sat, Jun 11, 2022 at 10:57 PM Sergey Senozhatsky
<senozhatsky@chromium.org> wrote:
>
> On (22/06/10 11:34), Peter Geis wrote:
> > > On 2022-06-10, Peter Geis <pgwipeout@gmail.com> wrote:
> > > > However I've run into an issue debugging early boot issues. Anything
> > > > that causes the kernel threading system to die (for example here, a
> > > > NPE) causes the boot console to halt before it outputs the error.
> > >
> > > A null pointer exception should trigger a panic, which will cause printk
> > > to enter direct mode for the oops. It should be just as effective as
> > > before introducing the printing kthreads.
> >
> > This might be a side effect of the fact that this is on a low powered
> > arm64 board. I noticed with threading enabled during large bursts the
> > console drops an excessive amount of messages. It's especially
> > apparent during the handover from earlycon to the normal console.
>
> How many CPUs does it have?

Four and all are online when the error occurs.

_______________________________________________
Linux-rockchip mailing list
Linux-rockchip@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-rockchip

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [BUG] Threaded printk breaks early debugging
  2022-06-12  3:13     ` Sergey Senozhatsky
@ 2022-06-12 23:02       ` John Ogness
  -1 siblings, 0 replies; 40+ messages in thread
From: John Ogness @ 2022-06-12 23:02 UTC (permalink / raw)
  To: Sergey Senozhatsky, Petr Mladek
  Cc: Peter Geis, Sergey Senozhatsky, Linux Kernel Mailing List,
	open list:ARM/Rockchip SoC...

On 2022-06-12, Sergey Senozhatsky <senozhatsky@chromium.org> wrote:
> Should a situation when we have only one online CPU be enough of a
> reason to do direct printing? Otherwise we might not have CPUs to
> wakeup khtread on, e.g. when CPU that printk is in atomic section for
> too long.

IMHO, no. Especially in that situation, we do not want printk causing
that atomic section to become even longer. If the machine has entered
normal operation, we want printk out of the way.

John Ogness

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [BUG] Threaded printk breaks early debugging
@ 2022-06-12 23:02       ` John Ogness
  0 siblings, 0 replies; 40+ messages in thread
From: John Ogness @ 2022-06-12 23:02 UTC (permalink / raw)
  To: Sergey Senozhatsky, Petr Mladek
  Cc: Peter Geis, Sergey Senozhatsky, Linux Kernel Mailing List,
	open list:ARM/Rockchip SoC...

On 2022-06-12, Sergey Senozhatsky <senozhatsky@chromium.org> wrote:
> Should a situation when we have only one online CPU be enough of a
> reason to do direct printing? Otherwise we might not have CPUs to
> wakeup khtread on, e.g. when CPU that printk is in atomic section for
> too long.

IMHO, no. Especially in that situation, we do not want printk causing
that atomic section to become even longer. If the machine has entered
normal operation, we want printk out of the way.

John Ogness

_______________________________________________
Linux-rockchip mailing list
Linux-rockchip@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-rockchip

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [BUG] Threaded printk breaks early debugging
  2022-06-10 15:34     ` Peter Geis
@ 2022-06-12 23:08       ` John Ogness
  -1 siblings, 0 replies; 40+ messages in thread
From: John Ogness @ 2022-06-12 23:08 UTC (permalink / raw)
  To: Peter Geis
  Cc: Petr Mladek, Sergey Senozhatsky, Linux Kernel Mailing List,
	open list:ARM/Rockchip SoC...

On 2022-06-10, Peter Geis <pgwipeout@gmail.com> wrote:
> This might be a side effect of the fact that this is on a low powered
> arm64 board. I noticed with threading enabled during large bursts the
> console drops an excessive amount of messages. It's especially
> apparent during the handover from earlycon to the normal console.

I guess you have a very small kernel buffer and are generating unusually
high amounts of messages? Is there a reason you cannot use a larger
buffer?

Or maybe you are generating a constant amount of messages that a serial
port could never keep up with unless it was interfering with the system
in such a way as to slow everything down? This is exactly what printk
should _not_ be used for.

John Ogness

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [BUG] Threaded printk breaks early debugging
@ 2022-06-12 23:08       ` John Ogness
  0 siblings, 0 replies; 40+ messages in thread
From: John Ogness @ 2022-06-12 23:08 UTC (permalink / raw)
  To: Peter Geis
  Cc: Petr Mladek, Sergey Senozhatsky, Linux Kernel Mailing List,
	open list:ARM/Rockchip SoC...

On 2022-06-10, Peter Geis <pgwipeout@gmail.com> wrote:
> This might be a side effect of the fact that this is on a low powered
> arm64 board. I noticed with threading enabled during large bursts the
> console drops an excessive amount of messages. It's especially
> apparent during the handover from earlycon to the normal console.

I guess you have a very small kernel buffer and are generating unusually
high amounts of messages? Is there a reason you cannot use a larger
buffer?

Or maybe you are generating a constant amount of messages that a serial
port could never keep up with unless it was interfering with the system
in such a way as to slow everything down? This is exactly what printk
should _not_ be used for.

John Ogness

_______________________________________________
Linux-rockchip mailing list
Linux-rockchip@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-rockchip

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [BUG] Threaded printk breaks early debugging
  2022-06-12 23:08       ` John Ogness
@ 2022-06-12 23:30         ` Peter Geis
  -1 siblings, 0 replies; 40+ messages in thread
From: Peter Geis @ 2022-06-12 23:30 UTC (permalink / raw)
  To: John Ogness
  Cc: Petr Mladek, Sergey Senozhatsky, Linux Kernel Mailing List,
	open list:ARM/Rockchip SoC...

On Sun, Jun 12, 2022 at 7:08 PM John Ogness <john.ogness@linutronix.de> wrote:
>
> On 2022-06-10, Peter Geis <pgwipeout@gmail.com> wrote:
> > This might be a side effect of the fact that this is on a low powered
> > arm64 board. I noticed with threading enabled during large bursts the
> > console drops an excessive amount of messages. It's especially
> > apparent during the handover from earlycon to the normal console.
>
> I guess you have a very small kernel buffer and are generating unusually
> high amounts of messages? Is there a reason you cannot use a larger
> buffer?

The buffer isn't an issue here, everything is available in dmesg when
userspace becomes available. Instead some messages bound for the
serial console are never output.

>
> Or maybe you are generating a constant amount of messages that a serial
> port could never keep up with unless it was interfering with the system
> in such a way as to slow everything down? This is exactly what printk
> should _not_ be used for.

We run a serial console at 1.5m baud which is significantly higher
than most SoCs which default to 115200. I have noticed some timing
differences since the introduction of the threaded console. A
significant amount of information is dumped very early in the boot
process (between 0 and 4 seconds into boot), as most drivers are
probing during this time. It also happens to be when the earlycon
console hands over to the normal console. There is no abnormal
debugging enabled, the output is a standard (non-quiet) boot log. The
question is why is direct mode not triggering during a panic?

>
> John Ogness

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [BUG] Threaded printk breaks early debugging
@ 2022-06-12 23:30         ` Peter Geis
  0 siblings, 0 replies; 40+ messages in thread
From: Peter Geis @ 2022-06-12 23:30 UTC (permalink / raw)
  To: John Ogness
  Cc: Petr Mladek, Sergey Senozhatsky, Linux Kernel Mailing List,
	open list:ARM/Rockchip SoC...

On Sun, Jun 12, 2022 at 7:08 PM John Ogness <john.ogness@linutronix.de> wrote:
>
> On 2022-06-10, Peter Geis <pgwipeout@gmail.com> wrote:
> > This might be a side effect of the fact that this is on a low powered
> > arm64 board. I noticed with threading enabled during large bursts the
> > console drops an excessive amount of messages. It's especially
> > apparent during the handover from earlycon to the normal console.
>
> I guess you have a very small kernel buffer and are generating unusually
> high amounts of messages? Is there a reason you cannot use a larger
> buffer?

The buffer isn't an issue here, everything is available in dmesg when
userspace becomes available. Instead some messages bound for the
serial console are never output.

>
> Or maybe you are generating a constant amount of messages that a serial
> port could never keep up with unless it was interfering with the system
> in such a way as to slow everything down? This is exactly what printk
> should _not_ be used for.

We run a serial console at 1.5m baud which is significantly higher
than most SoCs which default to 115200. I have noticed some timing
differences since the introduction of the threaded console. A
significant amount of information is dumped very early in the boot
process (between 0 and 4 seconds into boot), as most drivers are
probing during this time. It also happens to be when the earlycon
console hands over to the normal console. There is no abnormal
debugging enabled, the output is a standard (non-quiet) boot log. The
question is why is direct mode not triggering during a panic?

>
> John Ogness

_______________________________________________
Linux-rockchip mailing list
Linux-rockchip@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-rockchip

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [BUG] Threaded printk breaks early debugging
  2022-06-12 23:30         ` Peter Geis
@ 2022-06-13  2:23           ` John Ogness
  -1 siblings, 0 replies; 40+ messages in thread
From: John Ogness @ 2022-06-13  2:23 UTC (permalink / raw)
  To: Peter Geis
  Cc: Petr Mladek, Sergey Senozhatsky, Linux Kernel Mailing List,
	open list:ARM/Rockchip SoC...

On 2022-06-12, Peter Geis <pgwipeout@gmail.com> wrote:
> The buffer isn't an issue here, everything is available in dmesg when
> userspace becomes available. Instead some messages bound for the
> serial console are never output.

OK. Good to know.

> We run a serial console at 1.5m baud which is significantly higher
> than most SoCs which default to 115200. I have noticed some timing
> differences since the introduction of the threaded console. A
> significant amount of information is dumped very early in the boot
> process (between 0 and 4 seconds into boot), as most drivers are
> probing during this time. It also happens to be when the earlycon
> console hands over to the normal console. There is no abnormal
> debugging enabled, the output is a standard (non-quiet) boot log. The
> question is why is direct mode not triggering during a panic?

Just to be clear, you are not losing any intermediate messages. Only the
tail end of the kernel log was never printed. Is this correct?

This may be the same issue being discussed here [0].

John Ogness

[0] https://lore.kernel.org/all/87v8t5l39z.fsf@jogness.linutronix.de

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [BUG] Threaded printk breaks early debugging
@ 2022-06-13  2:23           ` John Ogness
  0 siblings, 0 replies; 40+ messages in thread
From: John Ogness @ 2022-06-13  2:23 UTC (permalink / raw)
  To: Peter Geis
  Cc: Petr Mladek, Sergey Senozhatsky, Linux Kernel Mailing List,
	open list:ARM/Rockchip SoC...

On 2022-06-12, Peter Geis <pgwipeout@gmail.com> wrote:
> The buffer isn't an issue here, everything is available in dmesg when
> userspace becomes available. Instead some messages bound for the
> serial console are never output.

OK. Good to know.

> We run a serial console at 1.5m baud which is significantly higher
> than most SoCs which default to 115200. I have noticed some timing
> differences since the introduction of the threaded console. A
> significant amount of information is dumped very early in the boot
> process (between 0 and 4 seconds into boot), as most drivers are
> probing during this time. It also happens to be when the earlycon
> console hands over to the normal console. There is no abnormal
> debugging enabled, the output is a standard (non-quiet) boot log. The
> question is why is direct mode not triggering during a panic?

Just to be clear, you are not losing any intermediate messages. Only the
tail end of the kernel log was never printed. Is this correct?

This may be the same issue being discussed here [0].

John Ogness

[0] https://lore.kernel.org/all/87v8t5l39z.fsf@jogness.linutronix.de

_______________________________________________
Linux-rockchip mailing list
Linux-rockchip@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-rockchip

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [BUG] Threaded printk breaks early debugging
  2022-06-12 23:02       ` John Ogness
@ 2022-06-13  3:49         ` Sergey Senozhatsky
  -1 siblings, 0 replies; 40+ messages in thread
From: Sergey Senozhatsky @ 2022-06-13  3:49 UTC (permalink / raw)
  To: John Ogness
  Cc: Sergey Senozhatsky, Petr Mladek, Peter Geis,
	Linux Kernel Mailing List, open list:ARM/Rockchip SoC...

On (22/06/13 01:08), John Ogness wrote:
> On 2022-06-12, Sergey Senozhatsky <senozhatsky@chromium.org> wrote:
> > Should a situation when we have only one online CPU be enough of a
> > reason to do direct printing? Otherwise we might not have CPUs to
> > wakeup khtread on, e.g. when CPU that printk is in atomic section for
> > too long.
> 
> IMHO, no. Especially in that situation, we do not want printk causing
> that atomic section to become even longer. If the machine has entered
> normal operation, we want printk out of the way.

At the same time printk throttles itself in such cases: new messages are
not added at much higher pace that they are printed at. So we lower the
chances of missing messages.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [BUG] Threaded printk breaks early debugging
@ 2022-06-13  3:49         ` Sergey Senozhatsky
  0 siblings, 0 replies; 40+ messages in thread
From: Sergey Senozhatsky @ 2022-06-13  3:49 UTC (permalink / raw)
  To: John Ogness
  Cc: Sergey Senozhatsky, Petr Mladek, Peter Geis,
	Linux Kernel Mailing List, open list:ARM/Rockchip SoC...

On (22/06/13 01:08), John Ogness wrote:
> On 2022-06-12, Sergey Senozhatsky <senozhatsky@chromium.org> wrote:
> > Should a situation when we have only one online CPU be enough of a
> > reason to do direct printing? Otherwise we might not have CPUs to
> > wakeup khtread on, e.g. when CPU that printk is in atomic section for
> > too long.
> 
> IMHO, no. Especially in that situation, we do not want printk causing
> that atomic section to become even longer. If the machine has entered
> normal operation, we want printk out of the way.

At the same time printk throttles itself in such cases: new messages are
not added at much higher pace that they are printed at. So we lower the
chances of missing messages.

_______________________________________________
Linux-rockchip mailing list
Linux-rockchip@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-rockchip

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [BUG] Threaded printk breaks early debugging
  2022-06-13  3:49         ` Sergey Senozhatsky
@ 2022-06-13  8:30           ` John Ogness
  -1 siblings, 0 replies; 40+ messages in thread
From: John Ogness @ 2022-06-13  8:30 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Sergey Senozhatsky, Petr Mladek, Peter Geis,
	Linux Kernel Mailing List, open list:ARM/Rockchip SoC...

On 2022-06-13, Sergey Senozhatsky <senozhatsky@chromium.org> wrote:
>> > Should a situation when we have only one online CPU be enough of a
>> > reason to do direct printing? Otherwise we might not have CPUs to
>> > wakeup khtread on, e.g. when CPU that printk is in atomic section for
>> > too long.
>> 
>> IMHO, no. Especially in that situation, we do not want printk causing
>> that atomic section to become even longer. If the machine has entered
>> normal operation, we want printk out of the way.
>
> At the same time printk throttles itself in such cases: new messages are
> not added at much higher pace that they are printed at. So we lower the
> chances of missing messages.

That is true if there is only 1 printk caller. For SMP systems with
printing handovers, it might not help at all. I firmly believe that
sprinkling randomness into printk (i.e. system) latencies is not the
answer. We need to keep printk lockless and out of the system's way
unless there is a real emergency happening.

This particular thread is not about missed messages due to printk not
"throttling the system", but rather the kernel buffers not getting
flushed in an emergency. This, of course, needs to be properly handled.

John Ogness

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [BUG] Threaded printk breaks early debugging
@ 2022-06-13  8:30           ` John Ogness
  0 siblings, 0 replies; 40+ messages in thread
From: John Ogness @ 2022-06-13  8:30 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Sergey Senozhatsky, Petr Mladek, Peter Geis,
	Linux Kernel Mailing List, open list:ARM/Rockchip SoC...

On 2022-06-13, Sergey Senozhatsky <senozhatsky@chromium.org> wrote:
>> > Should a situation when we have only one online CPU be enough of a
>> > reason to do direct printing? Otherwise we might not have CPUs to
>> > wakeup khtread on, e.g. when CPU that printk is in atomic section for
>> > too long.
>> 
>> IMHO, no. Especially in that situation, we do not want printk causing
>> that atomic section to become even longer. If the machine has entered
>> normal operation, we want printk out of the way.
>
> At the same time printk throttles itself in such cases: new messages are
> not added at much higher pace that they are printed at. So we lower the
> chances of missing messages.

That is true if there is only 1 printk caller. For SMP systems with
printing handovers, it might not help at all. I firmly believe that
sprinkling randomness into printk (i.e. system) latencies is not the
answer. We need to keep printk lockless and out of the system's way
unless there is a real emergency happening.

This particular thread is not about missed messages due to printk not
"throttling the system", but rather the kernel buffers not getting
flushed in an emergency. This, of course, needs to be properly handled.

John Ogness

_______________________________________________
Linux-rockchip mailing list
Linux-rockchip@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-rockchip

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [BUG] Threaded printk breaks early debugging
  2022-06-13  8:30           ` John Ogness
@ 2022-06-13  9:05             ` Sergey Senozhatsky
  -1 siblings, 0 replies; 40+ messages in thread
From: Sergey Senozhatsky @ 2022-06-13  9:05 UTC (permalink / raw)
  To: John Ogness
  Cc: Sergey Senozhatsky, Petr Mladek, Peter Geis,
	Linux Kernel Mailing List, open list:ARM/Rockchip SoC...

On (22/06/13 10:36), John Ogness wrote:
> >> IMHO, no. Especially in that situation, we do not want printk causing
> >> that atomic section to become even longer. If the machine has entered
> >> normal operation, we want printk out of the way.
> >
> > At the same time printk throttles itself in such cases: new messages are
> > not added at much higher pace that they are printed at. So we lower the
> > chances of missing messages.
> 
> That is true if there is only 1 printk caller.

Well, which is the case when num_online_cpus() == 1?

> For SMP systems with printing handovers, it might not help at all.
> I firmly believe that sprinkling randomness into printk (i.e. system)
> latencies is not the answer. We need to keep printk lockless and out
> of the system's way unless there is a real emergency happening.

Yeah sure.

> This particular thread is not about missed messages due to printk not
> "throttling the system", but rather the kernel buffers not getting
> flushed in an emergency. This, of course, needs to be properly handled.

True, but Peter mentioned

  "I noticed with threading enabled during large bursts the console
   drops an excessive amount of messages. It's especially apparent
   during the handover from earlycon to the normal console."

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [BUG] Threaded printk breaks early debugging
@ 2022-06-13  9:05             ` Sergey Senozhatsky
  0 siblings, 0 replies; 40+ messages in thread
From: Sergey Senozhatsky @ 2022-06-13  9:05 UTC (permalink / raw)
  To: John Ogness
  Cc: Sergey Senozhatsky, Petr Mladek, Peter Geis,
	Linux Kernel Mailing List, open list:ARM/Rockchip SoC...

On (22/06/13 10:36), John Ogness wrote:
> >> IMHO, no. Especially in that situation, we do not want printk causing
> >> that atomic section to become even longer. If the machine has entered
> >> normal operation, we want printk out of the way.
> >
> > At the same time printk throttles itself in such cases: new messages are
> > not added at much higher pace that they are printed at. So we lower the
> > chances of missing messages.
> 
> That is true if there is only 1 printk caller.

Well, which is the case when num_online_cpus() == 1?

> For SMP systems with printing handovers, it might not help at all.
> I firmly believe that sprinkling randomness into printk (i.e. system)
> latencies is not the answer. We need to keep printk lockless and out
> of the system's way unless there is a real emergency happening.

Yeah sure.

> This particular thread is not about missed messages due to printk not
> "throttling the system", but rather the kernel buffers not getting
> flushed in an emergency. This, of course, needs to be properly handled.

True, but Peter mentioned

  "I noticed with threading enabled during large bursts the console
   drops an excessive amount of messages. It's especially apparent
   during the handover from earlycon to the normal console."

_______________________________________________
Linux-rockchip mailing list
Linux-rockchip@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-rockchip

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [BUG] Threaded printk breaks early debugging
  2022-06-13  9:05             ` Sergey Senozhatsky
@ 2022-06-13 10:14               ` Petr Mladek
  -1 siblings, 0 replies; 40+ messages in thread
From: Petr Mladek @ 2022-06-13 10:14 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: John Ogness, Peter Geis, Linux Kernel Mailing List,
	open list:ARM/Rockchip SoC...

On Mon 2022-06-13 18:05:08, Sergey Senozhatsky wrote:
> On (22/06/13 10:36), John Ogness wrote:
> > >> IMHO, no. Especially in that situation, we do not want printk causing
> > >> that atomic section to become even longer. If the machine has entered
> > >> normal operation, we want printk out of the way.
> > >
> > > At the same time printk throttles itself in such cases: new messages are
> > > not added at much higher pace that they are printed at. So we lower the
> > > chances of missing messages.
> > 
> > That is true if there is only 1 printk caller.
> 
> Well, which is the case when num_online_cpus() == 1?

Good question.

Well, it would be nice to have the same behavior on single CPU
and SMP systems. Blocking atomic context with slow console is
bad even on single processor system. If there are problems with
lost messages then we will need a solution for SMP anyway.

> > For SMP systems with printing handovers, it might not help at all.
> > I firmly believe that sprinkling randomness into printk (i.e. system)
> > latencies is not the answer. We need to keep printk lockless and out
> > of the system's way unless there is a real emergency happening.
> 
> Yeah sure.
>
> > This particular thread is not about missed messages due to printk not
> > "throttling the system", but rather the kernel buffers not getting
> > flushed in an emergency. This, of course, needs to be properly handled.
> 
> True, but Peter mentioned
> 
>   "I noticed with threading enabled during large bursts the console
>    drops an excessive amount of messages. It's especially apparent
>    during the handover from earlycon to the normal console."

But this is also the situation when softlockups happen. It should
ideally be solved with a big enough buffer.

Another interesting alternative is the Peter Zijlstra's mode
where all messages are printed to the console "immediately".
They are serialized only by the CPU-reentrant lock.

This mode is not good for production system. But it might
be good for debugging. The good thing is that the behavior
is well defined.

I hope that we will get this mode with the atomic consoles.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [BUG] Threaded printk breaks early debugging
@ 2022-06-13 10:14               ` Petr Mladek
  0 siblings, 0 replies; 40+ messages in thread
From: Petr Mladek @ 2022-06-13 10:14 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: John Ogness, Peter Geis, Linux Kernel Mailing List,
	open list:ARM/Rockchip SoC...

On Mon 2022-06-13 18:05:08, Sergey Senozhatsky wrote:
> On (22/06/13 10:36), John Ogness wrote:
> > >> IMHO, no. Especially in that situation, we do not want printk causing
> > >> that atomic section to become even longer. If the machine has entered
> > >> normal operation, we want printk out of the way.
> > >
> > > At the same time printk throttles itself in such cases: new messages are
> > > not added at much higher pace that they are printed at. So we lower the
> > > chances of missing messages.
> > 
> > That is true if there is only 1 printk caller.
> 
> Well, which is the case when num_online_cpus() == 1?

Good question.

Well, it would be nice to have the same behavior on single CPU
and SMP systems. Blocking atomic context with slow console is
bad even on single processor system. If there are problems with
lost messages then we will need a solution for SMP anyway.

> > For SMP systems with printing handovers, it might not help at all.
> > I firmly believe that sprinkling randomness into printk (i.e. system)
> > latencies is not the answer. We need to keep printk lockless and out
> > of the system's way unless there is a real emergency happening.
> 
> Yeah sure.
>
> > This particular thread is not about missed messages due to printk not
> > "throttling the system", but rather the kernel buffers not getting
> > flushed in an emergency. This, of course, needs to be properly handled.
> 
> True, but Peter mentioned
> 
>   "I noticed with threading enabled during large bursts the console
>    drops an excessive amount of messages. It's especially apparent
>    during the handover from earlycon to the normal console."

But this is also the situation when softlockups happen. It should
ideally be solved with a big enough buffer.

Another interesting alternative is the Peter Zijlstra's mode
where all messages are printed to the console "immediately".
They are serialized only by the CPU-reentrant lock.

This mode is not good for production system. But it might
be good for debugging. The good thing is that the behavior
is well defined.

I hope that we will get this mode with the atomic consoles.

Best Regards,
Petr

_______________________________________________
Linux-rockchip mailing list
Linux-rockchip@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-rockchip

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [BUG] Threaded printk breaks early debugging
  2022-06-12 23:30         ` Peter Geis
@ 2022-06-13 11:24           ` Petr Mladek
  -1 siblings, 0 replies; 40+ messages in thread
From: Petr Mladek @ 2022-06-13 11:24 UTC (permalink / raw)
  To: Peter Geis
  Cc: John Ogness, Sergey Senozhatsky, Linux Kernel Mailing List,
	open list:ARM/Rockchip SoC...

On Sun 2022-06-12 19:30:27, Peter Geis wrote:
> On Sun, Jun 12, 2022 at 7:08 PM John Ogness <john.ogness@linutronix.de> wrote:
> >
> > On 2022-06-10, Peter Geis <pgwipeout@gmail.com> wrote:
> > > This might be a side effect of the fact that this is on a low powered
> > > arm64 board. I noticed with threading enabled during large bursts the
> > > console drops an excessive amount of messages.

What do you mean that console drops an excessive amount of messages,
please?

Do you see "dropped XXX messages" in the console output?
Or something else?

> > > It's especially
> > > apparent during the handover from earlycon to the normal console.
> >
> > I guess you have a very small kernel buffer and are generating unusually
> > high amounts of messages? Is there a reason you cannot use a larger
> > buffer?
> 
> The buffer isn't an issue here, everything is available in dmesg when
> userspace becomes available. Instead some messages bound for the
> serial console are never output.

This is strage. The message "dropped XXX messages" is printed only
when the buffer was full and the oldest messages were overwritten
before they were pushed to the console. If "dmesg" shows all messages
it means that no messages were overwritten.

Best Regards,
Petr

_______________________________________________
Linux-rockchip mailing list
Linux-rockchip@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-rockchip

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [BUG] Threaded printk breaks early debugging
@ 2022-06-13 11:24           ` Petr Mladek
  0 siblings, 0 replies; 40+ messages in thread
From: Petr Mladek @ 2022-06-13 11:24 UTC (permalink / raw)
  To: Peter Geis
  Cc: John Ogness, Sergey Senozhatsky, Linux Kernel Mailing List,
	open list:ARM/Rockchip SoC...

On Sun 2022-06-12 19:30:27, Peter Geis wrote:
> On Sun, Jun 12, 2022 at 7:08 PM John Ogness <john.ogness@linutronix.de> wrote:
> >
> > On 2022-06-10, Peter Geis <pgwipeout@gmail.com> wrote:
> > > This might be a side effect of the fact that this is on a low powered
> > > arm64 board. I noticed with threading enabled during large bursts the
> > > console drops an excessive amount of messages.

What do you mean that console drops an excessive amount of messages,
please?

Do you see "dropped XXX messages" in the console output?
Or something else?

> > > It's especially
> > > apparent during the handover from earlycon to the normal console.
> >
> > I guess you have a very small kernel buffer and are generating unusually
> > high amounts of messages? Is there a reason you cannot use a larger
> > buffer?
> 
> The buffer isn't an issue here, everything is available in dmesg when
> userspace becomes available. Instead some messages bound for the
> serial console are never output.

This is strage. The message "dropped XXX messages" is printed only
when the buffer was full and the oldest messages were overwritten
before they were pushed to the console. If "dmesg" shows all messages
it means that no messages were overwritten.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [BUG] Threaded printk breaks early debugging
  2022-06-13  2:23           ` John Ogness
@ 2022-06-13 15:11             ` Petr Mladek
  -1 siblings, 0 replies; 40+ messages in thread
From: Petr Mladek @ 2022-06-13 15:11 UTC (permalink / raw)
  To: John Ogness
  Cc: Peter Geis, Sergey Senozhatsky, Linux Kernel Mailing List,
	open list:ARM/Rockchip SoC...

On Mon 2022-06-13 04:29:50, John Ogness wrote:
> On 2022-06-12, Peter Geis <pgwipeout@gmail.com> wrote:
> > The buffer isn't an issue here, everything is available in dmesg when
> > userspace becomes available. Instead some messages bound for the
> > serial console are never output.
> 
> OK. Good to know.
> 
> > We run a serial console at 1.5m baud which is significantly higher
> > than most SoCs which default to 115200. I have noticed some timing
> > differences since the introduction of the threaded console. A
> > significant amount of information is dumped very early in the boot
> > process (between 0 and 4 seconds into boot), as most drivers are
> > probing during this time. It also happens to be when the earlycon
> > console hands over to the normal console. There is no abnormal
> > debugging enabled, the output is a standard (non-quiet) boot log. The
> > question is why is direct mode not triggering during a panic?
> 
> Just to be clear, you are not losing any intermediate messages. Only the
> tail end of the kernel log was never printed. Is this correct?
> 
> This may be the same issue being discussed here [0].
> 
> John Ogness
> 
> [0] https://lore.kernel.org/all/87v8t5l39z.fsf@jogness.linutronix.de

If the last messages are missing then it is most likely the same
issue.

Peter, could you please try if the patch at
https://lore.kernel.org/r/YqdSw/fJvnkRbjvc@alley
would make any difference?

Best Regards,
Petr

_______________________________________________
Linux-rockchip mailing list
Linux-rockchip@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-rockchip

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [BUG] Threaded printk breaks early debugging
@ 2022-06-13 15:11             ` Petr Mladek
  0 siblings, 0 replies; 40+ messages in thread
From: Petr Mladek @ 2022-06-13 15:11 UTC (permalink / raw)
  To: John Ogness
  Cc: Peter Geis, Sergey Senozhatsky, Linux Kernel Mailing List,
	open list:ARM/Rockchip SoC...

On Mon 2022-06-13 04:29:50, John Ogness wrote:
> On 2022-06-12, Peter Geis <pgwipeout@gmail.com> wrote:
> > The buffer isn't an issue here, everything is available in dmesg when
> > userspace becomes available. Instead some messages bound for the
> > serial console are never output.
> 
> OK. Good to know.
> 
> > We run a serial console at 1.5m baud which is significantly higher
> > than most SoCs which default to 115200. I have noticed some timing
> > differences since the introduction of the threaded console. A
> > significant amount of information is dumped very early in the boot
> > process (between 0 and 4 seconds into boot), as most drivers are
> > probing during this time. It also happens to be when the earlycon
> > console hands over to the normal console. There is no abnormal
> > debugging enabled, the output is a standard (non-quiet) boot log. The
> > question is why is direct mode not triggering during a panic?
> 
> Just to be clear, you are not losing any intermediate messages. Only the
> tail end of the kernel log was never printed. Is this correct?
> 
> This may be the same issue being discussed here [0].
> 
> John Ogness
> 
> [0] https://lore.kernel.org/all/87v8t5l39z.fsf@jogness.linutronix.de

If the last messages are missing then it is most likely the same
issue.

Peter, could you please try if the patch at
https://lore.kernel.org/r/YqdSw/fJvnkRbjvc@alley
would make any difference?

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 40+ messages in thread

* RE: [BUG] Threaded printk breaks early debugging
  2022-06-13 10:14               ` Petr Mladek
@ 2022-06-13 16:11                 ` David Laight
  -1 siblings, 0 replies; 40+ messages in thread
From: David Laight @ 2022-06-13 16:11 UTC (permalink / raw)
  To: 'Petr Mladek', Sergey Senozhatsky
  Cc: John Ogness, Peter Geis, Linux Kernel Mailing List,
	open list:ARM/Rockchip SoC...

From: Petr Mladek
> Sent: 13 June 2022 11:14
...
> Another interesting alternative is the Peter Zijlstra's mode
> where all messages are printed to the console "immediately".
> They are serialized only by the CPU-reentrant lock.
> 
> This mode is not good for production system. But it might
> be good for debugging. The good thing is that the behavior
> is well defined.

ISTM that all messages should be output (even if this means
that cpu spin waiting for a serial console) until userspace
has a chance to set an option to change the behaviour.

Oh, and can someone stop the distros hiding the console output.
I want to see the 'oops' traceback when I break the kernel!

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


_______________________________________________
Linux-rockchip mailing list
Linux-rockchip@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-rockchip

^ permalink raw reply	[flat|nested] 40+ messages in thread

* RE: [BUG] Threaded printk breaks early debugging
@ 2022-06-13 16:11                 ` David Laight
  0 siblings, 0 replies; 40+ messages in thread
From: David Laight @ 2022-06-13 16:11 UTC (permalink / raw)
  To: 'Petr Mladek', Sergey Senozhatsky
  Cc: John Ogness, Peter Geis, Linux Kernel Mailing List,
	open list:ARM/Rockchip SoC...

From: Petr Mladek
> Sent: 13 June 2022 11:14
...
> Another interesting alternative is the Peter Zijlstra's mode
> where all messages are printed to the console "immediately".
> They are serialized only by the CPU-reentrant lock.
> 
> This mode is not good for production system. But it might
> be good for debugging. The good thing is that the behavior
> is well defined.

ISTM that all messages should be output (even if this means
that cpu spin waiting for a serial console) until userspace
has a chance to set an option to change the behaviour.

Oh, and can someone stop the distros hiding the console output.
I want to see the 'oops' traceback when I break the kernel!

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [BUG] Threaded printk breaks early debugging
  2022-06-13 15:11             ` Petr Mladek
@ 2022-06-13 22:20               ` Peter Geis
  -1 siblings, 0 replies; 40+ messages in thread
From: Peter Geis @ 2022-06-13 22:20 UTC (permalink / raw)
  To: Petr Mladek
  Cc: John Ogness, Sergey Senozhatsky, Linux Kernel Mailing List,
	open list:ARM/Rockchip SoC...

On Mon, Jun 13, 2022 at 11:11 AM Petr Mladek <pmladek@suse.com> wrote:
>
> On Mon 2022-06-13 04:29:50, John Ogness wrote:
> > On 2022-06-12, Peter Geis <pgwipeout@gmail.com> wrote:
> > > The buffer isn't an issue here, everything is available in dmesg when
> > > userspace becomes available. Instead some messages bound for the
> > > serial console are never output.
> >
> > OK. Good to know.
> >
> > > We run a serial console at 1.5m baud which is significantly higher
> > > than most SoCs which default to 115200. I have noticed some timing
> > > differences since the introduction of the threaded console. A
> > > significant amount of information is dumped very early in the boot
> > > process (between 0 and 4 seconds into boot), as most drivers are
> > > probing during this time. It also happens to be when the earlycon
> > > console hands over to the normal console. There is no abnormal
> > > debugging enabled, the output is a standard (non-quiet) boot log. The
> > > question is why is direct mode not triggering during a panic?
> >
> > Just to be clear, you are not losing any intermediate messages. Only the
> > tail end of the kernel log was never printed. Is this correct?
> >
> > This may be the same issue being discussed here [0].
> >
> > John Ogness
> >
> > [0] https://lore.kernel.org/all/87v8t5l39z.fsf@jogness.linutronix.de
>
> If the last messages are missing then it is most likely the same
> issue.
>
> Peter, could you please try if the patch at
> https://lore.kernel.org/r/YqdSw/fJvnkRbjvc@alley
> would make any difference?

This patch permits the panic to come through to the console port. Thank you!
Tested-by: Peter Geis <pgwipeout@gmail.com>

>
> Best Regards,
> Petr

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [BUG] Threaded printk breaks early debugging
@ 2022-06-13 22:20               ` Peter Geis
  0 siblings, 0 replies; 40+ messages in thread
From: Peter Geis @ 2022-06-13 22:20 UTC (permalink / raw)
  To: Petr Mladek
  Cc: John Ogness, Sergey Senozhatsky, Linux Kernel Mailing List,
	open list:ARM/Rockchip SoC...

On Mon, Jun 13, 2022 at 11:11 AM Petr Mladek <pmladek@suse.com> wrote:
>
> On Mon 2022-06-13 04:29:50, John Ogness wrote:
> > On 2022-06-12, Peter Geis <pgwipeout@gmail.com> wrote:
> > > The buffer isn't an issue here, everything is available in dmesg when
> > > userspace becomes available. Instead some messages bound for the
> > > serial console are never output.
> >
> > OK. Good to know.
> >
> > > We run a serial console at 1.5m baud which is significantly higher
> > > than most SoCs which default to 115200. I have noticed some timing
> > > differences since the introduction of the threaded console. A
> > > significant amount of information is dumped very early in the boot
> > > process (between 0 and 4 seconds into boot), as most drivers are
> > > probing during this time. It also happens to be when the earlycon
> > > console hands over to the normal console. There is no abnormal
> > > debugging enabled, the output is a standard (non-quiet) boot log. The
> > > question is why is direct mode not triggering during a panic?
> >
> > Just to be clear, you are not losing any intermediate messages. Only the
> > tail end of the kernel log was never printed. Is this correct?
> >
> > This may be the same issue being discussed here [0].
> >
> > John Ogness
> >
> > [0] https://lore.kernel.org/all/87v8t5l39z.fsf@jogness.linutronix.de
>
> If the last messages are missing then it is most likely the same
> issue.
>
> Peter, could you please try if the patch at
> https://lore.kernel.org/r/YqdSw/fJvnkRbjvc@alley
> would make any difference?

This patch permits the panic to come through to the console port. Thank you!
Tested-by: Peter Geis <pgwipeout@gmail.com>

>
> Best Regards,
> Petr

_______________________________________________
Linux-rockchip mailing list
Linux-rockchip@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-rockchip

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [BUG] Threaded printk breaks early debugging
  2022-06-13 16:11                 ` David Laight
@ 2022-06-14  8:37                   ` Petr Mladek
  -1 siblings, 0 replies; 40+ messages in thread
From: Petr Mladek @ 2022-06-14  8:37 UTC (permalink / raw)
  To: David Laight
  Cc: Sergey Senozhatsky, John Ogness, Peter Geis,
	Linux Kernel Mailing List, open list:ARM/Rockchip SoC...

On Mon 2022-06-13 16:11:19, David Laight wrote:
> From: Petr Mladek
> > Sent: 13 June 2022 11:14
> ...
> > Another interesting alternative is the Peter Zijlstra's mode
> > where all messages are printed to the console "immediately".
> > They are serialized only by the CPU-reentrant lock.
> > 
> > This mode is not good for production system. But it might
> > be good for debugging. The good thing is that the behavior
> > is well defined.
> 
> ISTM that all messages should be output (even if this means
> that cpu spin waiting for a serial console) until userspace
> has a chance to set an option to change the behaviour.

There are systems that do not boot because of softlockups
caused by slow consoles.

I am pretty sure that we will add the option to disable
printk kthreads sooner or later. But we prefer to improve
the code so that it works out of box. If we add the option
already know then people will disable kthreads and
do not report problematic scenarios.


> Oh, and can someone stop the distros hiding the console output.
> I want to see the 'oops' traceback when I break the kernel!

Different users have different needs. AFAIK, distros hide
the messages because they scare users and overload support.
It is probably more effective to enable them only for debugging
when users notice some malfunction.

I do not know the details. I guess that there were years of
evolution. The default in SUSE was probably set by some
project managers or people taking care of boot splash,
installation, ...

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [BUG] Threaded printk breaks early debugging
@ 2022-06-14  8:37                   ` Petr Mladek
  0 siblings, 0 replies; 40+ messages in thread
From: Petr Mladek @ 2022-06-14  8:37 UTC (permalink / raw)
  To: David Laight
  Cc: Sergey Senozhatsky, John Ogness, Peter Geis,
	Linux Kernel Mailing List, open list:ARM/Rockchip SoC...

On Mon 2022-06-13 16:11:19, David Laight wrote:
> From: Petr Mladek
> > Sent: 13 June 2022 11:14
> ...
> > Another interesting alternative is the Peter Zijlstra's mode
> > where all messages are printed to the console "immediately".
> > They are serialized only by the CPU-reentrant lock.
> > 
> > This mode is not good for production system. But it might
> > be good for debugging. The good thing is that the behavior
> > is well defined.
> 
> ISTM that all messages should be output (even if this means
> that cpu spin waiting for a serial console) until userspace
> has a chance to set an option to change the behaviour.

There are systems that do not boot because of softlockups
caused by slow consoles.

I am pretty sure that we will add the option to disable
printk kthreads sooner or later. But we prefer to improve
the code so that it works out of box. If we add the option
already know then people will disable kthreads and
do not report problematic scenarios.


> Oh, and can someone stop the distros hiding the console output.
> I want to see the 'oops' traceback when I break the kernel!

Different users have different needs. AFAIK, distros hide
the messages because they scare users and overload support.
It is probably more effective to enable them only for debugging
when users notice some malfunction.

I do not know the details. I guess that there were years of
evolution. The default in SUSE was probably set by some
project managers or people taking care of boot splash,
installation, ...

Best Regards,
Petr

_______________________________________________
Linux-rockchip mailing list
Linux-rockchip@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-rockchip

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [BUG] Threaded printk breaks early debugging
  2022-06-13 22:20               ` Peter Geis
@ 2022-06-14  8:38                 ` Petr Mladek
  -1 siblings, 0 replies; 40+ messages in thread
From: Petr Mladek @ 2022-06-14  8:38 UTC (permalink / raw)
  To: Peter Geis
  Cc: John Ogness, Sergey Senozhatsky, Linux Kernel Mailing List,
	open list:ARM/Rockchip SoC...

On Mon 2022-06-13 18:20:03, Peter Geis wrote:
> On Mon, Jun 13, 2022 at 11:11 AM Petr Mladek <pmladek@suse.com> wrote:
> >
> > On Mon 2022-06-13 04:29:50, John Ogness wrote:
> > > On 2022-06-12, Peter Geis <pgwipeout@gmail.com> wrote:
> > > > The buffer isn't an issue here, everything is available in dmesg when
> > > > userspace becomes available. Instead some messages bound for the
> > > > serial console are never output.
> > >
> > > OK. Good to know.
> > >
> > > > We run a serial console at 1.5m baud which is significantly higher
> > > > than most SoCs which default to 115200. I have noticed some timing
> > > > differences since the introduction of the threaded console. A
> > > > significant amount of information is dumped very early in the boot
> > > > process (between 0 and 4 seconds into boot), as most drivers are
> > > > probing during this time. It also happens to be when the earlycon
> > > > console hands over to the normal console. There is no abnormal
> > > > debugging enabled, the output is a standard (non-quiet) boot log. The
> > > > question is why is direct mode not triggering during a panic?
> > >
> > > Just to be clear, you are not losing any intermediate messages. Only the
> > > tail end of the kernel log was never printed. Is this correct?
> > >
> > > This may be the same issue being discussed here [0].
> > >
> > > John Ogness
> > >
> > > [0] https://lore.kernel.org/all/87v8t5l39z.fsf@jogness.linutronix.de
> >
> > If the last messages are missing then it is most likely the same
> > issue.
> >
> > Peter, could you please try if the patch at
> > https://lore.kernel.org/r/YqdSw/fJvnkRbjvc@alley
> > would make any difference?
> 
> This patch permits the panic to come through to the console port. Thank you!
> Tested-by: Peter Geis <pgwipeout@gmail.com>

Great news. Thanks a lot for feedback.

I guess that the patch is not final. But it seems to be a good
direction.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [BUG] Threaded printk breaks early debugging
@ 2022-06-14  8:38                 ` Petr Mladek
  0 siblings, 0 replies; 40+ messages in thread
From: Petr Mladek @ 2022-06-14  8:38 UTC (permalink / raw)
  To: Peter Geis
  Cc: John Ogness, Sergey Senozhatsky, Linux Kernel Mailing List,
	open list:ARM/Rockchip SoC...

On Mon 2022-06-13 18:20:03, Peter Geis wrote:
> On Mon, Jun 13, 2022 at 11:11 AM Petr Mladek <pmladek@suse.com> wrote:
> >
> > On Mon 2022-06-13 04:29:50, John Ogness wrote:
> > > On 2022-06-12, Peter Geis <pgwipeout@gmail.com> wrote:
> > > > The buffer isn't an issue here, everything is available in dmesg when
> > > > userspace becomes available. Instead some messages bound for the
> > > > serial console are never output.
> > >
> > > OK. Good to know.
> > >
> > > > We run a serial console at 1.5m baud which is significantly higher
> > > > than most SoCs which default to 115200. I have noticed some timing
> > > > differences since the introduction of the threaded console. A
> > > > significant amount of information is dumped very early in the boot
> > > > process (between 0 and 4 seconds into boot), as most drivers are
> > > > probing during this time. It also happens to be when the earlycon
> > > > console hands over to the normal console. There is no abnormal
> > > > debugging enabled, the output is a standard (non-quiet) boot log. The
> > > > question is why is direct mode not triggering during a panic?
> > >
> > > Just to be clear, you are not losing any intermediate messages. Only the
> > > tail end of the kernel log was never printed. Is this correct?
> > >
> > > This may be the same issue being discussed here [0].
> > >
> > > John Ogness
> > >
> > > [0] https://lore.kernel.org/all/87v8t5l39z.fsf@jogness.linutronix.de
> >
> > If the last messages are missing then it is most likely the same
> > issue.
> >
> > Peter, could you please try if the patch at
> > https://lore.kernel.org/r/YqdSw/fJvnkRbjvc@alley
> > would make any difference?
> 
> This patch permits the panic to come through to the console port. Thank you!
> Tested-by: Peter Geis <pgwipeout@gmail.com>

Great news. Thanks a lot for feedback.

I guess that the patch is not final. But it seems to be a good
direction.

Best Regards,
Petr

_______________________________________________
Linux-rockchip mailing list
Linux-rockchip@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-rockchip

^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread, other threads:[~2022-06-14  8:53 UTC | newest]

Thread overview: 40+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-10 12:48 [BUG] Threaded printk breaks early debugging Peter Geis
2022-06-10 12:48 ` Peter Geis
2022-06-10 15:05 ` John Ogness
2022-06-10 15:05   ` John Ogness
2022-06-10 15:34   ` Peter Geis
2022-06-10 15:34     ` Peter Geis
2022-06-12  2:57     ` Sergey Senozhatsky
2022-06-12  2:57       ` Sergey Senozhatsky
2022-06-12 13:30       ` Peter Geis
2022-06-12 13:30         ` Peter Geis
2022-06-12 23:08     ` John Ogness
2022-06-12 23:08       ` John Ogness
2022-06-12 23:30       ` Peter Geis
2022-06-12 23:30         ` Peter Geis
2022-06-13  2:23         ` John Ogness
2022-06-13  2:23           ` John Ogness
2022-06-13 15:11           ` Petr Mladek
2022-06-13 15:11             ` Petr Mladek
2022-06-13 22:20             ` Peter Geis
2022-06-13 22:20               ` Peter Geis
2022-06-14  8:38               ` Petr Mladek
2022-06-14  8:38                 ` Petr Mladek
2022-06-13 11:24         ` Petr Mladek
2022-06-13 11:24           ` Petr Mladek
2022-06-12  3:13   ` Sergey Senozhatsky
2022-06-12  3:13     ` Sergey Senozhatsky
2022-06-12 23:02     ` John Ogness
2022-06-12 23:02       ` John Ogness
2022-06-13  3:49       ` Sergey Senozhatsky
2022-06-13  3:49         ` Sergey Senozhatsky
2022-06-13  8:30         ` John Ogness
2022-06-13  8:30           ` John Ogness
2022-06-13  9:05           ` Sergey Senozhatsky
2022-06-13  9:05             ` Sergey Senozhatsky
2022-06-13 10:14             ` Petr Mladek
2022-06-13 10:14               ` Petr Mladek
2022-06-13 16:11               ` David Laight
2022-06-13 16:11                 ` David Laight
2022-06-14  8:37                 ` Petr Mladek
2022-06-14  8:37                   ` Petr Mladek

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.