* [PATCH 0/2] printk: Prevent printk kthreads from blocking direct console handling
@ 2022-06-15 16:28 ` Petr Mladek
  0 siblings, 0 replies; 14+ messages in thread
From: Petr Mladek @ 2022-06-15 16:28 UTC (permalink / raw)
  To: John Ogness
  Cc: Sergey Senozhatsky, Steven Rostedt, Paul E . McKenney, frederic,
	Peter Geis, zhouzhouyi, dave, josh, Linus Torvalds, rcu,
	linux-rockchip, linux-kernel, Petr Mladek

There are reports that console kthreads prevented printing
messages during panic() or shutdown(), see
BugLink: https://lore.kernel.org/r/20220610205038.GA3050413@paulmck-ThinkPad-P17-Gen-1
BugLink: https://lore.kernel.org/r/CAMdYzYpF4FNTBPZsEFeWRuEwSies36QM_As8osPWZSr2q-viEA@mail.gmail.com

In both situations, printk() correctly tries to flush the consoles
directly but it fails to get the global console_lock(). Both
problems went away with these patches:

The 1st patch blocks console kthreads so that they do not start
handling new messages when direct printing is preferred
by the system state. It is probably enough on its own. It
solves the problem where the kthreads actively did
the wrong thing.

The 2nd patch makes it possible to wait, in any context, for the
console kthreads to release the lock. It should make direct printing
more reliable. It would have been useful even for the legacy code.

More possible improvements:

  + the waiting might also be necessary in the suspend code paths

  + convert con->mutex to con->spinlock to avoid blocking
    the global console_lock() when sleeping with con->lock

  + at least disable preemption around console_emit_next_record()
    in console kthread to avoid sleeping in the console driver
    code

  + somehow change the priority of the kthread so that it gets
    scheduled immediately when the scheduler works

Petr Mladek (2):
  printk: Block console kthreads when direct printing will be required
  printk: Wait for the global console lock when the system is going down

 include/linux/printk.h      |  5 +++++
 kernel/panic.c              |  2 ++
 kernel/printk/internal.h    |  2 ++
 kernel/printk/printk.c      |  8 +++++++-
 kernel/printk/printk_safe.c | 32 ++++++++++++++++++++++++++++++++
 kernel/reboot.c             |  2 ++
 6 files changed, 50 insertions(+), 1 deletion(-)

-- 
2.35.3



* [PATCH 1/2] printk: Block console kthreads when direct printing will be required
  2022-06-15 16:28 ` Petr Mladek
@ 2022-06-15 16:28   ` Petr Mladek
  -1 siblings, 0 replies; 14+ messages in thread
From: Petr Mladek @ 2022-06-15 16:28 UTC (permalink / raw)
  To: John Ogness
  Cc: Sergey Senozhatsky, Steven Rostedt, Paul E . McKenney, frederic,
	Peter Geis, zhouzhouyi, dave, josh, Linus Torvalds, rcu,
	linux-rockchip, linux-kernel, Petr Mladek

There are known situations when the console kthreads are not
reliable or do not work in principle, for example, early boot,
panic, shutdown.

For these situations there is the direct (legacy) mode, in which printk()
tries to get console_lock() and flush the messages directly. It works very
well during early boot, when the console kthreads are not available at all.
It gets more complicated in the other situations, when console kthreads
might be actively printing and block console_trylock() in printk().

The same problem is in the legacy code as well. Any console_lock()
owner could block console_trylock() in printk(). It is solved by
a trick that the current console_lock() owner is responsible for
printing all pending messages. It is actually the reason why there
is the risk of softlockups and why the console kthreads were
introduced.

The console kthreads use the same approach. They are responsible
for printing the messages by definition, so they handle
the messages whenever they are awake and see new ones.
The global console_lock is available when there is nothing
to do.

It should work well when the problematic context is correctly
detected and printk() switches to the direct mode. But it seems
that it is not enough in practice. There are reports that
the messages are not printed during panic() or shutdown()
even though printk() tries to use the direct mode here.

The problem seems to be that console kthreads become active in these
situations as well. They steal the job before other CPUs are stopped.
Then they are stopped in the middle of the job and block the global
console_lock.

The first part of the solution is to block console kthreads when
the system is in a problematic state and requires the direct
printk() mode.

BugLink: https://lore.kernel.org/r/20220610205038.GA3050413@paulmck-ThinkPad-P17-Gen-1
BugLink: https://lore.kernel.org/r/CAMdYzYpF4FNTBPZsEFeWRuEwSies36QM_As8osPWZSr2q-viEA@mail.gmail.com
Suggested-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Petr Mladek <pmladek@suse.com>
---
 kernel/printk/printk.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index ea3dd55709e7..45c6c2b0b104 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -3729,7 +3729,9 @@ static bool printer_should_wake(struct console *con, u64 seq)
 		return true;
 
 	if (con->blocked ||
-	    console_kthreads_atomically_blocked()) {
+	    console_kthreads_atomically_blocked() ||
+	    system_state > SYSTEM_RUNNING ||
+	    oops_in_progress) {
 		return false;
 	}
 
-- 
2.35.3
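
The wake-up condition changed by this patch can be modelled in userspace
as a rough sketch. This is not kernel code: the struct, the enum and every
model_* name below are hypothetical stand-ins, the enum is an abbreviated
copy of the kernel's system_states ordering, and the earlier "return true"
path of printer_should_wake() is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Abbreviated, hypothetical model of the kernel's system_state ordering. */
enum model_system_state {
	MODEL_BOOTING,
	MODEL_RUNNING,
	MODEL_HALT,
	MODEL_RESTART,
};

/* Inputs that the real printer_should_wake() reads from kernel state. */
struct model_wake_inputs {
	bool con_blocked;		/* models con->blocked */
	bool atomically_blocked;	/* models console_kthreads_atomically_blocked() */
	enum model_system_state state;	/* models system_state */
	bool oops;			/* models oops_in_progress */
	bool has_new_record;		/* assumption: a new record is pending */
};

/*
 * Return true when the modelled kthread may start printing. The last two
 * conditions in the if statement correspond to the ones added by the patch.
 */
static bool model_printer_should_wake(const struct model_wake_inputs *in)
{
	assert(in != NULL);

	if (in->con_blocked ||
	    in->atomically_blocked ||
	    in->state > MODEL_RUNNING ||
	    in->oops)
		return false;

	return in->has_new_record;
}
```

In this model a kthread with a pending record wakes up only while the
state is MODEL_RUNNING or earlier and no oops is in progress, which is
the behavior the commit message describes.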



* [PATCH 2/2] printk: Wait for the global console lock when the system is going down
  2022-06-15 16:28 ` Petr Mladek
@ 2022-06-15 16:28   ` Petr Mladek
  -1 siblings, 0 replies; 14+ messages in thread
From: Petr Mladek @ 2022-06-15 16:28 UTC (permalink / raw)
  To: John Ogness
  Cc: Sergey Senozhatsky, Steven Rostedt, Paul E . McKenney, frederic,
	Peter Geis, zhouzhouyi, dave, josh, Linus Torvalds, rcu,
	linux-rockchip, linux-kernel, Petr Mladek

There are reports that the console kthreads block the global console
lock when the system is going down, for example, reboot, panic.

The first part of the solution was to block kthreads in these problematic
system states so that they stop handling newly added messages.

The second part of the solution is to wait for the kthreads when
they are actively printing. It solves the problem when a message
was printed before the system entered the problematic state and
the kthreads managed to step in.

Busy waiting has to be used because panic() can be called in any
context and in an unknown state of the scheduler.

There must be a timeout because the kthread might get stuck or be sleeping
and never release the lock. The 10s timeout is an arbitrary value
inspired by the softlockup timeout.

BugLink: https://lore.kernel.org/r/20220610205038.GA3050413@paulmck-ThinkPad-P17-Gen-1
BugLink: https://lore.kernel.org/r/CAMdYzYpF4FNTBPZsEFeWRuEwSies36QM_As8osPWZSr2q-viEA@mail.gmail.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
---
 include/linux/printk.h      |  5 +++++
 kernel/panic.c              |  2 ++
 kernel/printk/internal.h    |  2 ++
 kernel/printk/printk.c      |  4 ++++
 kernel/printk/printk_safe.c | 32 ++++++++++++++++++++++++++++++++
 kernel/reboot.c             |  2 ++
 6 files changed, 47 insertions(+)

diff --git a/include/linux/printk.h b/include/linux/printk.h
index 10ec29bc0135..f88ec15f83dc 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -173,6 +173,7 @@ extern void printk_prefer_direct_enter(void);
 extern void printk_prefer_direct_exit(void);
 
 extern bool pr_flush(int timeout_ms, bool reset_on_progress);
+extern void try_block_console_kthreads(int timeout_ms);
 
 /*
  * Please don't use printk_ratelimit(), because it shares ratelimiting state
@@ -237,6 +238,10 @@ static inline bool pr_flush(int timeout_ms, bool reset_on_progress)
 	return true;
 }
 
+static inline void try_block_console_kthreads(int timeout_ms)
+{
+}
+
 static inline int printk_ratelimit(void)
 {
 	return 0;
diff --git a/kernel/panic.c b/kernel/panic.c
index a3c758dba15a..4cf13c37bd08 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -297,6 +297,7 @@ void panic(const char *fmt, ...)
 		 * unfortunately means it may not be hardened to work in a
 		 * panic situation.
 		 */
+		try_block_console_kthreads(10000);
 		smp_send_stop();
 	} else {
 		/*
@@ -304,6 +305,7 @@ void panic(const char *fmt, ...)
 		 * kmsg_dump, we will need architecture dependent extra
 		 * works in addition to stopping other CPUs.
 		 */
+		try_block_console_kthreads(10000);
 		crash_smp_send_stop();
 	}
 
diff --git a/kernel/printk/internal.h b/kernel/printk/internal.h
index d947ca6c84f9..e7d8578860ad 100644
--- a/kernel/printk/internal.h
+++ b/kernel/printk/internal.h
@@ -20,6 +20,8 @@ enum printk_info_flags {
 	LOG_CONT	= 8,	/* text is a fragment of a continuation line */
 };
 
+extern bool block_console_kthreads;
+
 __printf(4, 0)
 int vprintk_store(int facility, int level,
 		  const struct dev_printk_info *dev_info,
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 45c6c2b0b104..b095fb5f5f61 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -250,6 +250,9 @@ static atomic_t console_kthreads_active = ATOMIC_INIT(0);
 #define console_kthread_printing_exit() \
 	atomic_dec(&console_kthreads_active)
 
+/* Block console kthreads to avoid processing new messages. */
+bool block_console_kthreads;
+
 /*
  * Helper macros to handle lockdep when locking/unlocking console_sem. We use
  * macros instead of functions so that _RET_IP_ contains useful information.
@@ -3730,6 +3733,7 @@ static bool printer_should_wake(struct console *con, u64 seq)
 
 	if (con->blocked ||
 	    console_kthreads_atomically_blocked() ||
+	    block_console_kthreads ||
 	    system_state > SYSTEM_RUNNING ||
 	    oops_in_progress) {
 		return false;
diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c
index ef0f9a2044da..caac4de1ea59 100644
--- a/kernel/printk/printk_safe.c
+++ b/kernel/printk/printk_safe.c
@@ -8,7 +8,9 @@
 #include <linux/smp.h>
 #include <linux/cpumask.h>
 #include <linux/printk.h>
+#include <linux/console.h>
 #include <linux/kprobes.h>
+#include <linux/delay.h>
 
 #include "internal.h"
 
@@ -50,3 +52,33 @@ asmlinkage int vprintk(const char *fmt, va_list args)
 	return vprintk_default(fmt, args);
 }
 EXPORT_SYMBOL(vprintk);
+
+/**
+ * try_block_console_kthreads() - Try to block console kthreads and
+ *	make the global console_lock() available
+ *
+ * @timeout_ms:        The maximum time (in ms) to wait.
+ *
+ * Prevent console kthreads from starting processing new messages. Wait
+ * until the global console_lock() becomes available.
+ *
+ * Context: Can be called in any context.
+ */
+void try_block_console_kthreads(int timeout_ms)
+{
+	block_console_kthreads = true;
+
+	/* Do not wait when the console lock could not be safely taken. */
+	if (this_cpu_read(printk_context) || in_nmi())
+		return;
+
+	while (timeout_ms > 0) {
+		if (console_trylock()) {
+			console_unlock();
+			return;
+		}
+
+		udelay(1000);
+		timeout_ms -= 1;
+	}
+}
diff --git a/kernel/reboot.c b/kernel/reboot.c
index b5a71d1ff603..80564ffafabf 100644
--- a/kernel/reboot.c
+++ b/kernel/reboot.c
@@ -82,6 +82,7 @@ void kernel_restart_prepare(char *cmd)
 {
 	blocking_notifier_call_chain(&reboot_notifier_list, SYS_RESTART, cmd);
 	system_state = SYSTEM_RESTART;
+	try_block_console_kthreads(10000);
 	usermodehelper_disable();
 	device_shutdown();
 }
@@ -270,6 +271,7 @@ static void kernel_shutdown_prepare(enum system_states state)
 	blocking_notifier_call_chain(&reboot_notifier_list,
 		(state == SYSTEM_HALT) ? SYS_HALT : SYS_POWER_OFF, NULL);
 	system_state = state;
+	try_block_console_kthreads(10000);
 	usermodehelper_disable();
 	device_shutdown();
 }
-- 
2.35.3
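
The polling loop with a timeout can be tried out in userspace with a
small model. This is only a sketch under assumptions: all model_* names
are hypothetical, a C11 atomic_flag stands in for the global console
lock, nanosleep() stands in for udelay(1000), and the printk_context and
in_nmi() check is omitted. Unlike the void kernel function, the model
returns a bool so that the outcome can be observed.

```c
#define _POSIX_C_SOURCE 199309L	/* for nanosleep() */

#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for the global console lock. */
static atomic_flag model_console_lock = ATOMIC_FLAG_INIT;

/* Models the block_console_kthreads flag added by the patch. */
static atomic_bool model_block_kthreads;

static bool model_console_trylock(void)
{
	/* test_and_set returns the old value: false means we took the lock. */
	return !atomic_flag_test_and_set(&model_console_lock);
}

static void model_console_unlock(void)
{
	atomic_flag_clear(&model_console_lock);
}

/* Stand-in for udelay(1000): wait for roughly one millisecond. */
static void model_delay_1ms(void)
{
	struct timespec ts = { .tv_sec = 0, .tv_nsec = 1000000 };

	nanosleep(&ts, NULL);
}

/*
 * Set the blocking flag, then poll the lock once per millisecond until
 * it can be taken or the timeout expires. Returns true when the lock
 * was seen free, false on timeout.
 */
static bool model_try_block_console_kthreads(int timeout_ms)
{
	assert(timeout_ms >= 0);

	atomic_store(&model_block_kthreads, true);

	while (timeout_ms > 0) {
		if (model_console_trylock()) {
			model_console_unlock();
			return true;
		}

		model_delay_1ms();
		timeout_ms -= 1;
	}

	return false;
}
```

When nobody holds the modelled lock the call returns right away. When
the lock stays held, the call gives up after about timeout_ms
milliseconds, which mirrors the reasoning for the timeout in the commit
message.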



* Re: [PATCH 0/2] printk: Prevent printk kthreads from blocking direct console handling
  2022-06-15 16:28 ` Petr Mladek
@ 2022-06-15 17:10   ` Paul E. McKenney
  -1 siblings, 0 replies; 14+ messages in thread
From: Paul E. McKenney @ 2022-06-15 17:10 UTC (permalink / raw)
  To: Petr Mladek
  Cc: John Ogness, Sergey Senozhatsky, Steven Rostedt, frederic,
	Peter Geis, zhouzhouyi, dave, josh, Linus Torvalds, rcu,
	linux-rockchip, linux-kernel

On Wed, Jun 15, 2022 at 06:28:03PM +0200, Petr Mladek wrote:
> There are reports that console kthreads prevented printing
> messages during panic() or shutdown(), see
> BugLink: https://lore.kernel.org/r/20220610205038.GA3050413@paulmck-ThinkPad-P17-Gen-1
> BugLink: https://lore.kernel.org/r/CAMdYzYpF4FNTBPZsEFeWRuEwSies36QM_As8osPWZSr2q-viEA@mail.gmail.com
> 
> In both situations, printk() correctly tries to flush the consoles
> directly but it fails to get the global console_lock(). Both
> problems went away with these patches:
> 
> 1st patch blocks console kthreads so that they do not start
> handling new messages when the direct printing is preferred
> by the system state. It is probably enough on its own. It
> solves the problem when the kthreads actively did
> the wrong thing.
> 
> 2nd patch allows to wait for the console kthreads to release
> the lock in any context. It should make it more reliable.
> It would have been useful even for the legacy code.

Thank you!

For the series:

Tested-by: Paul E. McKenney <paulmck@kernel.org>

> More possible improvements:
> 
>   +  the waiting might be necessary also in the suspend code paths
> 
>   + convert con->mutex to con->spinlock to avoid blocking
>     the global console_lock() when sleeping with con->lock
> 
>   + at least disable preemption around console_emit_next_record()
>     in console kthread to avoid sleeping in the console driver
>     code
> 
>   + somehow change the priority of the kthread so that it gets
>     scheduled immediately when the scheduler works
> 
> Petr Mladek (2):
>   printk: Block console kthreads when direct printing will be required
>   printk: Wait for the global console lock when the system is going down
> 
>  include/linux/printk.h      |  5 +++++
>  kernel/panic.c              |  2 ++
>  kernel/printk/internal.h    |  2 ++
>  kernel/printk/printk.c      |  8 +++++++-
>  kernel/printk/printk_safe.c | 32 ++++++++++++++++++++++++++++++++
>  kernel/reboot.c             |  2 ++
>  6 files changed, 50 insertions(+), 1 deletion(-)
> 
> -- 
> 2.35.3
> 


* Re: [PATCH 1/2] printk: Block console kthreads when direct printing will be required
  2022-06-15 16:28   ` Petr Mladek
@ 2022-06-15 17:47     ` Linus Torvalds
  -1 siblings, 0 replies; 14+ messages in thread
From: Linus Torvalds @ 2022-06-15 17:47 UTC (permalink / raw)
  To: Petr Mladek
  Cc: John Ogness, Sergey Senozhatsky, Steven Rostedt,
	Paul E . McKenney, Frederic Weisbecker, Peter Geis, zhouzhouyi,
	Davidlohr Bueso, Josh Triplett, rcu, linux-rockchip,
	Linux Kernel Mailing List

On Wed, Jun 15, 2022 at 9:28 AM Petr Mladek <pmladek@suse.com> wrote:
>
> BugLink: https://lore.kernel.org/r/20220610205038.GA3050413@paulmck-ThinkPad-P17-Gen-1
> BugLink: https://lore.kernel.org/r/CAMdYzYpF4FNTBPZsEFeWRuEwSies36QM_As8osPWZSr2q-viEA@mail.gmail.com

Other thread discussion about this exact thing:

   https://lore.kernel.org/all/CAHk-=wgzRUT1fBpuz3xcN+YdsX0SxqOzHWRtj0ReHpUBb5TKbA@mail.gmail.com/

please stop making up random tags that make no sense.

Just use "Link:"

Look at that first one (I didn't even bother following the second
one). The "bug" part is not even the most important part.

The reason to follow that link is all the discussion, the test-patch,
and the confirmation from Paul that "yup, that patch solves the
problem for me".

It's extra context to the commit, in case somebody wants to know the
history. The "bug" part is (and always should be) already explained in
the commit message, there's absolutely no point in adding some extra
noise to the "Link:" tag.

And if the only reason for "BugLink:" to exist is to show "look, this
tag actually contains relevant and interesting information", then the
solution to THAT problem is to not have the links that are useless and
pointless in the first place.

Put another way: if you want to distinguish useless links from useful
ones, just do it by not including the useless ones.

Ok?

                   Linus


* Re: [PATCH 1/2] printk: Block console kthreads when direct printing will be required
  2022-06-15 17:47     ` Linus Torvalds
@ 2022-06-15 19:20       ` Petr Mladek
  -1 siblings, 0 replies; 14+ messages in thread
From: Petr Mladek @ 2022-06-15 19:20 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: John Ogness, Sergey Senozhatsky, Steven Rostedt,
	Paul E . McKenney, Frederic Weisbecker, Peter Geis, zhouzhouyi,
	Davidlohr Bueso, Josh Triplett, rcu, linux-rockchip,
	Linux Kernel Mailing List

On Wed 2022-06-15 10:47:14, Linus Torvalds wrote:
> On Wed, Jun 15, 2022 at 9:28 AM Petr Mladek <pmladek@suse.com> wrote:
> >
> > BugLink: https://lore.kernel.org/r/20220610205038.GA3050413@paulmck-ThinkPad-P17-Gen-1
> > BugLink: https://lore.kernel.org/r/CAMdYzYpF4FNTBPZsEFeWRuEwSies36QM_As8osPWZSr2q-viEA@mail.gmail.com
> 
> Other thread discussion about this exact thing:
> 
>    https://lore.kernel.org/all/CAHk-=wgzRUT1fBpuz3xcN+YdsX0SxqOzHWRtj0ReHpUBb5TKbA@mail.gmail.com/
> 
> please stop making up random tags that make no sense.
> 
> Just use "Link:"
> 
> Look at that first one (I didn't even bother following the second
> one). The "bug" part is not even the most important part.
> 
> The reason to follow that link is all the discussion, the test-patch,
> and the confirmation from Paul that "yup, that patch solves the
> problem for me".
> 
> It's extra context to the commit, in case somebody wants to know the
> history. The "bug" part is (and always should be) already explained in
> the commit message, there's absolutely no point in adding some extra
> noise to the "Link:" tag.
> 
> And if the only reason for "BugLink:" to exist is to show "look, this
> tag actually contains relevant and interesting information", then the
> solution to THAT problem is to not have the links that are useless and
> pointless in the first place.
> 
> Put another way: if you want to distinguish useless links from useful
> ones, just do it by not including the useless ones.
> 
> Ok?

Got it! I am going to use "Link:" instead.

I have just seen how the discussion evolved at
https://lore.kernel.org/all/CAHk-=wgzRUT1fBpuz3xcN+YdsX0SxqOzHWRtj0ReHpUBb5TKbA@mail.gmail.com/

It was actually that exact discussion that confused me. I got the
impression that BugLink was a commonly used tag. I see now that
I jumped to that conclusion too quickly.

Thanks for stopping me.

Best Regards,
Petr

* Re: [PATCH 0/2] printk: Prevent printk kthreads from blocking direct console handling
  2022-06-15 17:10   ` Paul E. McKenney
@ 2022-06-15 20:09     ` Petr Mladek
  -1 siblings, 0 replies; 14+ messages in thread
From: Petr Mladek @ 2022-06-15 20:09 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: John Ogness, Sergey Senozhatsky, Steven Rostedt, frederic,
	Peter Geis, zhouzhouyi, dave, josh, Linus Torvalds, rcu,
	linux-rockchip, linux-kernel

On Wed 2022-06-15 10:10:42, Paul E. McKenney wrote:
> On Wed, Jun 15, 2022 at 06:28:03PM +0200, Petr Mladek wrote:
> > There are reports that console kthreads prevented printing
> > messages during panic() or shutdown(), see
> > BugLink: https://lore.kernel.org/r/20220610205038.GA3050413@paulmck-ThinkPad-P17-Gen-1
> > BugLink: https://lore.kernel.org/r/CAMdYzYpF4FNTBPZsEFeWRuEwSies36QM_As8osPWZSr2q-viEA@mail.gmail.com
> > 
> > In both situations, printk() correctly tries to flush the consoles
> > directly but it fails to get the global console_lock(). Both
> > problems went away with these patches:
> > 
> > 1st patch blocks console kthreads so that they do not start
> > handling new messages when the direct printing is preferred
> > by the system state. It is probably enough on its own. It
> > solves the problem when the kthreads actively did
> > the wrong thing.
> > 
> > 2nd patch allows waiting for the console kthreads to release
> > the lock in any context. It should make it more reliable.
> > It would have been useful even for the legacy code.
> 
> Thank you!
> 
> For the series:
> 
> Tested-by: Paul E. McKenney <paulmck@kernel.org>

Thanks a lot for testing.

I have pushed it into printk/linux.git, branch rework/kthreads
to give it a spin in linux-next.

Best Regards,
Petr
