* [PATCH printk v2 00/12] implement threaded console printing
@ 2022-04-05 13:25 John Ogness
  2022-04-05 13:25 ` [PATCH printk v2 01/12] printk: rename cpulock functions John Ogness
                   ` (12 more replies)
  0 siblings, 13 replies; 37+ messages in thread
From: John Ogness @ 2022-04-05 13:25 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
	linux-kernel, Andrew Morton, Marco Elver, Randy Dunlap,
	Alexander Potapenko, Stephen Boyd, Nicholas Piggin,
	Greg Kroah-Hartman, Jiri Slaby, Paul E. McKenney,
	Frederic Weisbecker, Neeraj Upadhyay, Josh Triplett,
	Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes, Kees Cook,
	Luis Chamberlain, Wei Liu, Xiaoming Ni, Peter Zijlstra,
	Andy Shevchenko, Helge Deller, Sebastian Andrzej Siewior,
	Daniel Lezcano, Shawn Guo, Dmitry Torokhov, Matti Vaittinen,
	Eric W. Biederman, Phil Auld, Juri Lelli, Wang Qing, rcu

This is v2 of a series to implement a kthread for each registered
console. v1 is here [0]. The kthreads locklessly retrieve the
records from the printk ringbuffer and also do not cause any lock
contention between each other. This allows consoles to run at full
speed. For example, a netconsole is able to dump records much
faster than a serial or vt console. Also, during normal operation,
it completely decouples printk() callers from console printing.

There are situations where kthread printing is not sufficient, for
example during panic, when the kthreads may not get a chance to
schedule. In such cases, the current method of attempting to print
directly within the printk() caller context is used. New
functions printk_prefer_direct_enter() and
printk_prefer_direct_exit() are made available to mark areas of the
kernel where direct printing is preferred. (These should only be
areas that do not occur during normal operation.)

This series also introduces pr_flush(): a might_sleep() function
that will block until all active printing threads have caught up
to the latest record at the time of the pr_flush() call. This
function is useful, for example, to wait until pending records
are flushed to consoles before suspending.

Note that this series does *not* increase the reliability of console
printing. Rather it focuses on the non-interference aspect of
printk() by decoupling printk() callers from printing (during normal
operation). Nonetheless, the reliability aspect should not worsen
due to this series.

Changes since v1:

- Rename nearly all new variables, functions, macros, and flags:

  variables:
  @console_lock_count        --> @console_kthreads_active
  @consoles_paused           --> @console_kthreads_blocked
  @kthreads_started          --> @printk_kthreads_available
  @printk_direct             --> @printk_prefer_direct

  functions/macros:
  console_excl_trylock()     --> console_kthreads_atomic_tryblock()
  console_excl_unlock()      --> console_kthreads_atomic_unblock()
  console_printer_tryenter() --> console_kthread_printing_tryenter()
  console_printer_exit()     --> console_kthread_printing_exit()
  pause_all_consoles()       --> console_kthreads_block()
  unpause_all_consoles()     --> console_kthreads_unblock()
  printk_direct_enter()      --> printk_prefer_direct_enter()
  printk_direct_exit()       --> printk_prefer_direct_exit()
  start_printk_kthread()     --> printk_start_kthread()

  console flags:
  CON_PAUSED                 --> CON_THD_BLOCKED

- Drop the patch to avoid VT printing if in an atomic context. (This
  can only be done after the Linux BSoD feature is implemented.)

- Keep printk_delay() in the printk() context. (We may want to move
  this into printer contexts later.)

- Keep console_trylock() usage to reacquire the console lock in
  console_unlock().

- Keep cpu_online(raw_smp_processor_id()) in the check if a console
  is usable and remove the printk_percpu_data_ready() check (percpu
  data is always ready for consoles).

- Start console printer threads as an early_initcall. This means
  threads are active before multiple CPUs come online.

- Remove "system_state < SYSTEM_RUNNING" as a reason to allow direct
  printing. @kthreads_started is used for the early boot decision.

- Add console_kthreads_atomically_blocked() macro.

- Add printk_console_msg() macro to print messages with console
  details. The macro supports print indexing.

- Add __pr_flush() to allow waiting for only a single console.

- Add a full memory barrier in wake_up_klogd() to ensure new records
  are visible before checking if the waitqueue is empty.

- Wake kthreads when exiting preferred direct mode.

- Wake kthreads when a CPU comes online.

- In unregister_console(), stop the printer thread before acquiring
  the console lock.

- Print the "thread started" message only after all allocations have
  succeeded.

- Permanently enable preferred direct printing if there are any
  errors setting up any of the threaded console printers.

- Keep threaded printers asleep if there is no con->write()
  callback.

- Keep threaded printers asleep if preferred direct printing is
  active.

- Always allow handovers if the console is locked.

- Only stop irqsoff tracing if handovers are allowed.

- console_emit_next_record() is for when the console is not locked,
  console_emit_next_record_transferable() is for when the console is
  locked.

- Wait for console printers to catch up after unblanking the screen.

- In console_stop(), only wait for the stopping console.

- In console_start(), only wait for the starting console.

- Print only 1 record per wait cycle to increase chances of direct
  printing taking over.

- Repurpose the irqwork flag PRINTK_PENDING_OUTPUT for direct
  printing.

- Add caution notice and tips to description of
  printk_cpu_sync_get_irqsave() on how to use it correctly.

- Improve the comments explaining why preemption is disabled during
  direct printing.

- Improve commit message to explain the motivation for expanding the
  console lock and how it works.

John Ogness

[0] https://lore.kernel.org/all/20220207194323.273637-1-john.ogness@linutronix.de

John Ogness (12):
  printk: rename cpulock functions
  printk: cpu sync always disable interrupts
  printk: get caller_id/timestamp after migration disable
  printk: call boot_delay_msec() in printk_delay()
  printk: add macro for console detail messages
  printk: refactor and rework printing logic
  printk: move buffer definitions into console_emit_next_record() caller
  printk: add pr_flush()
  printk: add functions to prefer direct printing
  printk: add kthread console printers
  printk: extend console_lock for proper kthread support
  printk: remove @console_locked

 drivers/tty/sysrq.c     |    2 +
 include/linux/console.h |   19 +
 include/linux/printk.h  |   82 ++-
 kernel/hung_task.c      |   11 +-
 kernel/panic.c          |    4 +
 kernel/printk/printk.c  | 1070 +++++++++++++++++++++++++++++----------
 kernel/rcu/tree_stall.h |    2 +
 kernel/reboot.c         |   14 +-
 kernel/watchdog.c       |    4 +
 kernel/watchdog_hld.c   |    4 +
 lib/dump_stack.c        |    4 +-
 lib/nmi_backtrace.c     |    4 +-
 12 files changed, 908 insertions(+), 312 deletions(-)


base-commit: 3ef4ea3d84ca568dcd57816b9521e82e3bd94f08
-- 
2.30.2


^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH printk v2 01/12] printk: rename cpulock functions
  2022-04-05 13:25 [PATCH printk v2 00/12] implement threaded console printing John Ogness
@ 2022-04-05 13:25 ` John Ogness
  2022-04-06  9:07   ` Petr Mladek
  2022-04-05 13:25 ` [PATCH printk v2 02/12] printk: cpu sync always disable interrupts John Ogness
                   ` (11 subsequent siblings)
  12 siblings, 1 reply; 37+ messages in thread
From: John Ogness @ 2022-04-05 13:25 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
	linux-kernel, Andrew Morton, Marco Elver, Randy Dunlap,
	Alexander Potapenko, Stephen Boyd, Nicholas Piggin

Since the printk cpulock is CPU-reentrant and since it is used
in all contexts, its usage must be carefully considered and
most likely will require programming locklessly. To avoid
mistaking the printk cpulock as a typical lock, rename it to
cpu_sync. The main functions then become:

    printk_cpu_sync_get_irqsave(flags);
    printk_cpu_sync_put_irqrestore(flags);

Add extra notes of caution in the function description to help
developers understand the requirements for correct usage.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
---
 include/linux/printk.h | 54 +++++++++++++++++++-------------
 kernel/printk/printk.c | 71 +++++++++++++++++++++---------------------
 lib/dump_stack.c       |  4 +--
 lib/nmi_backtrace.c    |  4 +--
 4 files changed, 73 insertions(+), 60 deletions(-)

diff --git a/include/linux/printk.h b/include/linux/printk.h
index 1522df223c0f..859323a52985 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -277,43 +277,55 @@ static inline void printk_trigger_flush(void)
 #endif
 
 #ifdef CONFIG_SMP
-extern int __printk_cpu_trylock(void);
-extern void __printk_wait_on_cpu_lock(void);
-extern void __printk_cpu_unlock(void);
+extern int __printk_cpu_sync_try_get(void);
+extern void __printk_cpu_sync_wait(void);
+extern void __printk_cpu_sync_put(void);
 
 /**
- * printk_cpu_lock_irqsave() - Acquire the printk cpu-reentrant spinning
- *                             lock and disable interrupts.
+ * printk_cpu_sync_get_irqsave() - Acquire the printk cpu-reentrant spinning
+ *                                 lock and disable interrupts.
  * @flags: Stack-allocated storage for saving local interrupt state,
- *         to be passed to printk_cpu_unlock_irqrestore().
+ *         to be passed to printk_cpu_sync_put_irqrestore().
  *
  * If the lock is owned by another CPU, spin until it becomes available.
  * Interrupts are restored while spinning.
+ *
+ * CAUTION: This function must be used carefully. It does not behave like a
+ * typical lock. Here are important things to watch out for...
+ *
+ *     * This function is reentrant on the same CPU. Therefore the calling
+ *       code must not assume exclusive access to data if code accessing the
+ *       data can run reentrant or within NMI context on the same CPU.
+ *
+ *     * If there exists usage of this function from NMI context, it becomes
+ *       unsafe to perform any type of locking or spinning to wait for other
+ *       CPUs after calling this function from any context. This includes
+ *       using spinlocks or any other busy-waiting synchronization methods.
  */
-#define printk_cpu_lock_irqsave(flags)		\
-	for (;;) {				\
-		local_irq_save(flags);		\
-		if (__printk_cpu_trylock())	\
-			break;			\
-		local_irq_restore(flags);	\
-		__printk_wait_on_cpu_lock();	\
+#define printk_cpu_sync_get_irqsave(flags)		\
+	for (;;) {					\
+		local_irq_save(flags);			\
+		if (__printk_cpu_sync_try_get())	\
+			break;				\
+		local_irq_restore(flags);		\
+		__printk_cpu_sync_wait();		\
 	}
 
 /**
- * printk_cpu_unlock_irqrestore() - Release the printk cpu-reentrant spinning
- *                                  lock and restore interrupts.
- * @flags: Caller's saved interrupt state, from printk_cpu_lock_irqsave().
+ * printk_cpu_sync_put_irqrestore() - Release the printk cpu-reentrant spinning
+ *                                    lock and restore interrupts.
+ * @flags: Caller's saved interrupt state, from printk_cpu_sync_get_irqsave().
  */
-#define printk_cpu_unlock_irqrestore(flags)	\
+#define printk_cpu_sync_put_irqrestore(flags)	\
 	do {					\
-		__printk_cpu_unlock();		\
+		__printk_cpu_sync_put();	\
 		local_irq_restore(flags);	\
-	} while (0)				\
+	} while (0)
 
 #else
 
-#define printk_cpu_lock_irqsave(flags) ((void)flags)
-#define printk_cpu_unlock_irqrestore(flags) ((void)flags)
+#define printk_cpu_sync_get_irqsave(flags) ((void)flags)
+#define printk_cpu_sync_put_irqrestore(flags) ((void)flags)
 
 #endif /* CONFIG_SMP */
 
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index da03c15ecc89..13a1eebe72af 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -3667,26 +3667,26 @@ EXPORT_SYMBOL_GPL(kmsg_dump_rewind);
 #endif
 
 #ifdef CONFIG_SMP
-static atomic_t printk_cpulock_owner = ATOMIC_INIT(-1);
-static atomic_t printk_cpulock_nested = ATOMIC_INIT(0);
+static atomic_t printk_cpu_sync_owner = ATOMIC_INIT(-1);
+static atomic_t printk_cpu_sync_nested = ATOMIC_INIT(0);
 
 /**
- * __printk_wait_on_cpu_lock() - Busy wait until the printk cpu-reentrant
- *                               spinning lock is not owned by any CPU.
+ * __printk_cpu_sync_wait() - Busy wait until the printk cpu-reentrant
+ *                            spinning lock is not owned by any CPU.
  *
  * Context: Any context.
  */
-void __printk_wait_on_cpu_lock(void)
+void __printk_cpu_sync_wait(void)
 {
 	do {
 		cpu_relax();
-	} while (atomic_read(&printk_cpulock_owner) != -1);
+	} while (atomic_read(&printk_cpu_sync_owner) != -1);
 }
-EXPORT_SYMBOL(__printk_wait_on_cpu_lock);
+EXPORT_SYMBOL(__printk_cpu_sync_wait);
 
 /**
- * __printk_cpu_trylock() - Try to acquire the printk cpu-reentrant
- *                          spinning lock.
+ * __printk_cpu_sync_try_get() - Try to acquire the printk cpu-reentrant
+ *                               spinning lock.
  *
  * If no processor has the lock, the calling processor takes the lock and
  * becomes the owner. If the calling processor is already the owner of the
@@ -3695,7 +3695,7 @@ EXPORT_SYMBOL(__printk_wait_on_cpu_lock);
  * Context: Any context. Expects interrupts to be disabled.
  * Return: 1 on success, otherwise 0.
  */
-int __printk_cpu_trylock(void)
+int __printk_cpu_sync_try_get(void)
 {
 	int cpu;
 	int old;
@@ -3705,79 +3705,80 @@ int __printk_cpu_trylock(void)
 	/*
 	 * Guarantee loads and stores from this CPU when it is the lock owner
 	 * are _not_ visible to the previous lock owner. This pairs with
-	 * __printk_cpu_unlock:B.
+	 * __printk_cpu_sync_put:B.
 	 *
 	 * Memory barrier involvement:
 	 *
-	 * If __printk_cpu_trylock:A reads from __printk_cpu_unlock:B, then
-	 * __printk_cpu_unlock:A can never read from __printk_cpu_trylock:B.
+	 * If __printk_cpu_sync_try_get:A reads from __printk_cpu_sync_put:B,
+	 * then __printk_cpu_sync_put:A can never read from
+	 * __printk_cpu_sync_try_get:B.
 	 *
 	 * Relies on:
 	 *
-	 * RELEASE from __printk_cpu_unlock:A to __printk_cpu_unlock:B
+	 * RELEASE from __printk_cpu_sync_put:A to __printk_cpu_sync_put:B
 	 * of the previous CPU
 	 *    matching
-	 * ACQUIRE from __printk_cpu_trylock:A to __printk_cpu_trylock:B
-	 * of this CPU
+	 * ACQUIRE from __printk_cpu_sync_try_get:A to
+	 * __printk_cpu_sync_try_get:B of this CPU
 	 */
-	old = atomic_cmpxchg_acquire(&printk_cpulock_owner, -1,
-				     cpu); /* LMM(__printk_cpu_trylock:A) */
+	old = atomic_cmpxchg_acquire(&printk_cpu_sync_owner, -1,
+				     cpu); /* LMM(__printk_cpu_sync_try_get:A) */
 	if (old == -1) {
 		/*
 		 * This CPU is now the owner and begins loading/storing
-		 * data: LMM(__printk_cpu_trylock:B)
+		 * data: LMM(__printk_cpu_sync_try_get:B)
 		 */
 		return 1;
 
 	} else if (old == cpu) {
 		/* This CPU is already the owner. */
-		atomic_inc(&printk_cpulock_nested);
+		atomic_inc(&printk_cpu_sync_nested);
 		return 1;
 	}
 
 	return 0;
 }
-EXPORT_SYMBOL(__printk_cpu_trylock);
+EXPORT_SYMBOL(__printk_cpu_sync_try_get);
 
 /**
- * __printk_cpu_unlock() - Release the printk cpu-reentrant spinning lock.
+ * __printk_cpu_sync_put() - Release the printk cpu-reentrant spinning lock.
  *
  * The calling processor must be the owner of the lock.
  *
  * Context: Any context. Expects interrupts to be disabled.
  */
-void __printk_cpu_unlock(void)
+void __printk_cpu_sync_put(void)
 {
-	if (atomic_read(&printk_cpulock_nested)) {
-		atomic_dec(&printk_cpulock_nested);
+	if (atomic_read(&printk_cpu_sync_nested)) {
+		atomic_dec(&printk_cpu_sync_nested);
 		return;
 	}
 
 	/*
 	 * This CPU is finished loading/storing data:
-	 * LMM(__printk_cpu_unlock:A)
+	 * LMM(__printk_cpu_sync_put:A)
 	 */
 
 	/*
 	 * Guarantee loads and stores from this CPU when it was the
 	 * lock owner are visible to the next lock owner. This pairs
-	 * with __printk_cpu_trylock:A.
+	 * with __printk_cpu_sync_try_get:A.
 	 *
 	 * Memory barrier involvement:
 	 *
-	 * If __printk_cpu_trylock:A reads from __printk_cpu_unlock:B,
-	 * then __printk_cpu_trylock:B reads from __printk_cpu_unlock:A.
+	 * If __printk_cpu_sync_try_get:A reads from __printk_cpu_sync_put:B,
+	 * then __printk_cpu_sync_try_get:B reads from __printk_cpu_sync_put:A.
 	 *
 	 * Relies on:
 	 *
-	 * RELEASE from __printk_cpu_unlock:A to __printk_cpu_unlock:B
+	 * RELEASE from __printk_cpu_sync_put:A to __printk_cpu_sync_put:B
 	 * of this CPU
 	 *    matching
-	 * ACQUIRE from __printk_cpu_trylock:A to __printk_cpu_trylock:B
-	 * of the next CPU
+	 * ACQUIRE from __printk_cpu_sync_try_get:A to
+	 * __printk_cpu_sync_try_get:B of the next CPU
 	 */
-	atomic_set_release(&printk_cpulock_owner,
-			   -1); /* LMM(__printk_cpu_unlock:B) */
+	atomic_set_release(&printk_cpu_sync_owner,
+			   -1); /* LMM(__printk_cpu_sync_put:B) */
 }
-EXPORT_SYMBOL(__printk_cpu_unlock);
+EXPORT_SYMBOL(__printk_cpu_sync_put);
 #endif /* CONFIG_SMP */
diff --git a/lib/dump_stack.c b/lib/dump_stack.c
index 6b7f1bf6715d..83471e81501a 100644
--- a/lib/dump_stack.c
+++ b/lib/dump_stack.c
@@ -102,9 +102,9 @@ asmlinkage __visible void dump_stack_lvl(const char *log_lvl)
 	 * Permit this cpu to perform nested stack dumps while serialising
 	 * against other CPUs
 	 */
-	printk_cpu_lock_irqsave(flags);
+	printk_cpu_sync_get_irqsave(flags);
 	__dump_stack(log_lvl);
-	printk_cpu_unlock_irqrestore(flags);
+	printk_cpu_sync_put_irqrestore(flags);
 }
 EXPORT_SYMBOL(dump_stack_lvl);
 
diff --git a/lib/nmi_backtrace.c b/lib/nmi_backtrace.c
index 199ab201d501..d01aec6ae15c 100644
--- a/lib/nmi_backtrace.c
+++ b/lib/nmi_backtrace.c
@@ -99,7 +99,7 @@ bool nmi_cpu_backtrace(struct pt_regs *regs)
 		 * Allow nested NMI backtraces while serializing
 		 * against other CPUs.
 		 */
-		printk_cpu_lock_irqsave(flags);
+		printk_cpu_sync_get_irqsave(flags);
 		if (!READ_ONCE(backtrace_idle) && regs && cpu_in_idle(instruction_pointer(regs))) {
 			pr_warn("NMI backtrace for cpu %d skipped: idling at %pS\n",
 				cpu, (void *)instruction_pointer(regs));
@@ -110,7 +110,7 @@ bool nmi_cpu_backtrace(struct pt_regs *regs)
 			else
 				dump_stack();
 		}
-		printk_cpu_unlock_irqrestore(flags);
+		printk_cpu_sync_put_irqrestore(flags);
 		cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask));
 		return true;
 	}
-- 
2.30.2



* [PATCH printk v2 02/12] printk: cpu sync always disable interrupts
  2022-04-05 13:25 [PATCH printk v2 00/12] implement threaded console printing John Ogness
  2022-04-05 13:25 ` [PATCH printk v2 01/12] printk: rename cpulock functions John Ogness
@ 2022-04-05 13:25 ` John Ogness
  2022-04-05 13:25 ` [PATCH printk v2 03/12] printk: get caller_id/timestamp after migration disable John Ogness
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 37+ messages in thread
From: John Ogness @ 2022-04-05 13:25 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner, linux-kernel

The CPU sync functions are a NOP for !CONFIG_SMP. But even for
!CONFIG_SMP they still need to disable interrupts in order to
preserve context within the CPU sync sections.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
---
 include/linux/printk.h | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/include/linux/printk.h b/include/linux/printk.h
index 859323a52985..b70a42f94031 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -281,9 +281,16 @@ extern int __printk_cpu_sync_try_get(void);
 extern void __printk_cpu_sync_wait(void);
 extern void __printk_cpu_sync_put(void);
 
+#else
+
+#define __printk_cpu_sync_try_get() true
+#define __printk_cpu_sync_wait()
+#define __printk_cpu_sync_put()
+#endif /* CONFIG_SMP */
+
 /**
- * printk_cpu_sync_get_irqsave() - Acquire the printk cpu-reentrant spinning
- *                                 lock and disable interrupts.
+ * printk_cpu_sync_get_irqsave() - Disable interrupts and acquire the printk
+ *                                 cpu-reentrant spinning lock.
  * @flags: Stack-allocated storage for saving local interrupt state,
  *         to be passed to printk_cpu_sync_put_irqrestore().
  *
@@ -322,13 +329,6 @@ extern void __printk_cpu_sync_put(void);
 		local_irq_restore(flags);	\
 	} while (0)
 
-#else
-
-#define printk_cpu_sync_get_irqsave(flags) ((void)flags)
-#define printk_cpu_sync_put_irqrestore(flags) ((void)flags)
-
-#endif /* CONFIG_SMP */
-
 extern int kptr_restrict;
 
 /**
-- 
2.30.2



* [PATCH printk v2 03/12] printk: get caller_id/timestamp after migration disable
  2022-04-05 13:25 [PATCH printk v2 00/12] implement threaded console printing John Ogness
  2022-04-05 13:25 ` [PATCH printk v2 01/12] printk: rename cpulock functions John Ogness
  2022-04-05 13:25 ` [PATCH printk v2 02/12] printk: cpu sync always disable interrupts John Ogness
@ 2022-04-05 13:25 ` John Ogness
  2022-04-05 13:25 ` [PATCH printk v2 04/12] printk: call boot_delay_msec() in printk_delay() John Ogness
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 37+ messages in thread
From: John Ogness @ 2022-04-05 13:25 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner, linux-kernel

Currently the local CPU timestamp and caller_id for the record are
collected while migration is enabled. Since this information is
CPU-specific, it should be collected with migration disabled.

Migration is disabled immediately after collecting this information
anyway, so just move the information collection to after the
migration disabling.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
---
 kernel/printk/printk.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 13a1eebe72af..9774a3f6bafb 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2041,7 +2041,7 @@ static inline void printk_delay(void)
 static inline u32 printk_caller_id(void)
 {
 	return in_task() ? task_pid_nr(current) :
-		0x80000000 + raw_smp_processor_id();
+		0x80000000 + smp_processor_id();
 }
 
 /**
@@ -2123,7 +2123,6 @@ int vprintk_store(int facility, int level,
 		  const struct dev_printk_info *dev_info,
 		  const char *fmt, va_list args)
 {
-	const u32 caller_id = printk_caller_id();
 	struct prb_reserved_entry e;
 	enum printk_info_flags flags = 0;
 	struct printk_record r;
@@ -2133,10 +2132,14 @@ int vprintk_store(int facility, int level,
 	u8 *recursion_ptr;
 	u16 reserve_size;
 	va_list args2;
+	u32 caller_id;
 	u16 text_len;
 	int ret = 0;
 	u64 ts_nsec;
 
+	if (!printk_enter_irqsave(recursion_ptr, irqflags))
+		return 0;
+
 	/*
 	 * Since the duration of printk() can vary depending on the message
 	 * and state of the ringbuffer, grab the timestamp now so that it is
@@ -2145,8 +2148,7 @@ int vprintk_store(int facility, int level,
 	 */
 	ts_nsec = local_clock();
 
-	if (!printk_enter_irqsave(recursion_ptr, irqflags))
-		return 0;
+	caller_id = printk_caller_id();
 
 	/*
 	 * The sprintf needs to come first since the syslog prefix might be
-- 
2.30.2



* [PATCH printk v2 04/12] printk: call boot_delay_msec() in printk_delay()
  2022-04-05 13:25 [PATCH printk v2 00/12] implement threaded console printing John Ogness
                   ` (2 preceding siblings ...)
  2022-04-05 13:25 ` [PATCH printk v2 03/12] printk: get caller_id/timestamp after migration disable John Ogness
@ 2022-04-05 13:25 ` John Ogness
  2022-04-05 13:25 ` [PATCH printk v2 05/12] printk: add macro for console detail messages John Ogness
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 37+ messages in thread
From: John Ogness @ 2022-04-05 13:25 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner, linux-kernel

boot_delay_msec() is always called immediately before printk_delay(),
so just call it from within printk_delay().

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
---
 kernel/printk/printk.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 9774a3f6bafb..fb9a6eaf54fd 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2026,8 +2026,10 @@ static u8 *__printk_recursion_counter(void)
 
 int printk_delay_msec __read_mostly;
 
-static inline void printk_delay(void)
+static inline void printk_delay(int level)
 {
+	boot_delay_msec(level);
+
 	if (unlikely(printk_delay_msec)) {
 		int m = printk_delay_msec;
 
@@ -2252,8 +2254,7 @@ asmlinkage int vprintk_emit(int facility, int level,
 		in_sched = true;
 	}
 
-	boot_delay_msec(level);
-	printk_delay();
+	printk_delay(level);
 
 	printed_len = vprintk_store(facility, level, dev_info, fmt, args);
 
-- 
2.30.2



* [PATCH printk v2 05/12] printk: add macro for console detail messages
  2022-04-05 13:25 [PATCH printk v2 00/12] implement threaded console printing John Ogness
                   ` (3 preceding siblings ...)
  2022-04-05 13:25 ` [PATCH printk v2 04/12] printk: call boot_delay_msec() in printk_delay() John Ogness
@ 2022-04-05 13:25 ` John Ogness
  2022-04-06 10:31   ` Petr Mladek
  2022-04-05 13:25 ` [PATCH printk v2 06/12] printk: refactor and rework printing logic John Ogness
                   ` (7 subsequent siblings)
  12 siblings, 1 reply; 37+ messages in thread
From: John Ogness @ 2022-04-05 13:25 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner, linux-kernel

It is useful to generate log messages that include details about
the related console. Rather than duplicate the logic to assemble
the details, put that logic into a macro printk_console_msg().

Once console printers become threaded, this macro will find many
users.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
---
 kernel/printk/printk.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index fb9a6eaf54fd..39ca9d758e52 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2993,6 +2993,10 @@ static void try_enable_default_console(struct console *newcon)
 		newcon->flags |= CON_CONSDEV;
 }
 
+#define printk_console_msg(c, lvl, msg) \
+	printk(lvl pr_fmt("%sconsole [%s%d] " msg "\n"), \
+	       (c->flags & CON_BOOT) ? "boot" : "", c->name, c->index)
+
 /*
  * The console driver calls this routine during kernel initialization
  * to register the console printing procedure with printk() and to
@@ -3131,9 +3135,7 @@ void register_console(struct console *newcon)
 	 * users know there might be something in the kernel's log buffer that
 	 * went to the bootconsole (that they do not see on the real console)
 	 */
-	pr_info("%sconsole [%s%d] enabled\n",
-		(newcon->flags & CON_BOOT) ? "boot" : "" ,
-		newcon->name, newcon->index);
+	printk_console_msg(newcon, KERN_INFO, "enabled");
 	if (bootcon_enabled &&
 	    ((newcon->flags & (CON_CONSDEV | CON_BOOT)) == CON_CONSDEV) &&
 	    !keep_bootcon) {
@@ -3152,9 +3154,7 @@ int unregister_console(struct console *console)
 	struct console *con;
 	int res;
 
-	pr_info("%sconsole [%s%d] disabled\n",
-		(console->flags & CON_BOOT) ? "boot" : "" ,
-		console->name, console->index);
+	printk_console_msg(console, KERN_INFO, "disabled");
 
 	res = _braille_unregister_console(console);
 	if (res < 0)
-- 
2.30.2



* [PATCH printk v2 06/12] printk: refactor and rework printing logic
  2022-04-05 13:25 [PATCH printk v2 00/12] implement threaded console printing John Ogness
                   ` (4 preceding siblings ...)
  2022-04-05 13:25 ` [PATCH printk v2 05/12] printk: add macro for console detail messages John Ogness
@ 2022-04-05 13:25 ` John Ogness
  2022-04-06 14:02   ` Petr Mladek
  2022-04-05 13:25 ` [PATCH printk v2 07/12] printk: move buffer definitions into console_emit_next_record() caller John Ogness
                   ` (6 subsequent siblings)
  12 siblings, 1 reply; 37+ messages in thread
From: John Ogness @ 2022-04-05 13:25 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
	linux-kernel, Greg Kroah-Hartman

Refactor/rework printing logic in order to prepare for moving to
threaded console printing.

- Move @console_seq into struct console so that the current
  "position" of each console can be tracked individually.

- Move @console_dropped into struct console so that the current drop
  count of each console can be tracked individually.

- Modify printing logic so that each console independently loads,
  prepares, and prints its next record.

- Remove exclusive_console logic. Since console positions are
  handled independently, replaying past records occurs naturally.

- Update the comments explaining why preemption is disabled while
  printing from printk() context.

With these changes, there is a change in behavior: the console
replaying the log (formerly exclusive console) will no longer block
other consoles. New messages appear on the other consoles while the
newly added console is still replaying.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
---
 include/linux/console.h |   2 +
 kernel/printk/printk.c  | 429 ++++++++++++++++++++--------------------
 2 files changed, 220 insertions(+), 211 deletions(-)

diff --git a/include/linux/console.h b/include/linux/console.h
index 7cd758a4f44e..8c1686e2c233 100644
--- a/include/linux/console.h
+++ b/include/linux/console.h
@@ -151,6 +151,8 @@ struct console {
 	int	cflag;
 	uint	ispeed;
 	uint	ospeed;
+	u64	seq;
+	unsigned long dropped;
 	void	*data;
 	struct	 console *next;
 };
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 39ca9d758e52..db9278dba775 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -280,11 +280,6 @@ static bool panic_in_progress(void)
  */
 static int console_locked, console_suspended;
 
-/*
- * If exclusive_console is non-NULL then only this console is to be printed to.
- */
-static struct console *exclusive_console;
-
 /*
  *	Array of consoles built from command line options (console=)
  */
@@ -374,12 +369,6 @@ static u64 syslog_seq;
 static size_t syslog_partial;
 static bool syslog_time;
 
-/* All 3 protected by @console_sem. */
-/* the next printk record to write to the console */
-static u64 console_seq;
-static u64 exclusive_console_stop_seq;
-static unsigned long console_dropped;
-
 struct latched_seq {
 	seqcount_latch_t	latch;
 	u64			val[2];
@@ -1911,47 +1900,26 @@ static int console_trylock_spinning(void)
 }
 
 /*
- * Call the console drivers, asking them to write out
- * log_buf[start] to log_buf[end - 1].
- * The console_lock must be held.
+ * Call the specified console driver, asking it to write out the specified
+ * text and length. For non-extended consoles, if any records have been
+ * dropped, a dropped message will be written out first.
  */
-static void call_console_drivers(const char *ext_text, size_t ext_len,
-				 const char *text, size_t len)
+static void call_console_driver(struct console *con, const char *text, size_t len)
 {
 	static char dropped_text[64];
-	size_t dropped_len = 0;
-	struct console *con;
+	size_t dropped_len;
 
 	trace_console_rcuidle(text, len);
 
-	if (!console_drivers)
-		return;
-
-	if (console_dropped) {
+	if (con->dropped && !(con->flags & CON_EXTENDED)) {
 		dropped_len = snprintf(dropped_text, sizeof(dropped_text),
 				       "** %lu printk messages dropped **\n",
-				       console_dropped);
-		console_dropped = 0;
+				       con->dropped);
+		con->dropped = 0;
+		con->write(con, dropped_text, dropped_len);
 	}
 
-	for_each_console(con) {
-		if (exclusive_console && con != exclusive_console)
-			continue;
-		if (!(con->flags & CON_ENABLED))
-			continue;
-		if (!con->write)
-			continue;
-		if (!cpu_online(smp_processor_id()) &&
-		    !(con->flags & CON_ANYTIME))
-			continue;
-		if (con->flags & CON_EXTENDED)
-			con->write(con, ext_text, ext_len);
-		else {
-			if (dropped_len)
-				con->write(con, dropped_text, dropped_len);
-			con->write(con, text, len);
-		}
-	}
+	con->write(con, text, len);
 }
 
 /*
@@ -2261,15 +2229,18 @@ asmlinkage int vprintk_emit(int facility, int level,
 	/* If called from the scheduler, we can not call up(). */
 	if (!in_sched) {
 		/*
-		 * Disable preemption to avoid being preempted while holding
-		 * console_sem which would prevent anyone from printing to
-		 * console
+		 * The caller may be holding system-critical or
+		 * timing-sensitive locks. Disable preemption during
+		 * printing of all remaining records to all consoles so that
+		 * this context can return as soon as possible. Hopefully
+		 * another printk() caller will take over the printing.
 		 */
 		preempt_disable();
 		/*
 		 * Try to acquire and then immediately release the console
-		 * semaphore.  The release will print out buffers and wake up
-		 * /dev/kmsg and syslog() users.
+		 * semaphore. The release will print out buffers. With the
+		 * spinning variant, this context tries to take over the
+		 * printing from another printing context.
 		 */
 		if (console_trylock_spinning())
 			console_unlock();
@@ -2307,11 +2278,9 @@ EXPORT_SYMBOL(_printk);
 
 #define prb_read_valid(rb, seq, r)	false
 #define prb_first_valid_seq(rb)		0
+#define prb_next_seq(rb)		0
 
 static u64 syslog_seq;
-static u64 console_seq;
-static u64 exclusive_console_stop_seq;
-static unsigned long console_dropped;
 
 static size_t record_print_text(const struct printk_record *r,
 				bool syslog, bool time)
@@ -2328,8 +2297,7 @@ static ssize_t msg_print_ext_body(char *buf, size_t size,
 				  struct dev_printk_info *dev_info) { return 0; }
 static void console_lock_spinning_enable(void) { }
 static int console_lock_spinning_disable_and_check(void) { return 0; }
-static void call_console_drivers(const char *ext_text, size_t ext_len,
-				 const char *text, size_t len) {}
+static void call_console_driver(struct console *con, const char *text, size_t len) { }
 static bool suppress_message_printing(int level) { return false; }
 
 #endif /* CONFIG_PRINTK */
@@ -2599,22 +2567,6 @@ int is_console_locked(void)
 }
 EXPORT_SYMBOL(is_console_locked);
 
-/*
- * Check if we have any console that is capable of printing while cpu is
- * booting or shutting down. Requires console_sem.
- */
-static int have_callable_console(void)
-{
-	struct console *con;
-
-	for_each_console(con)
-		if ((con->flags & CON_ENABLED) &&
-				(con->flags & CON_ANYTIME))
-			return 1;
-
-	return 0;
-}
-
 /*
  * Return true when this CPU should unlock console_sem without pushing all
  * messages to the console. This reduces the chance that the console is
@@ -2635,15 +2587,173 @@ static bool abandon_console_lock_in_panic(void)
 }
 
 /*
- * Can we actually use the console at this time on this cpu?
+ * Check if the given console is currently capable and allowed to print
+ * records.
+ *
+ * Requires the console_lock.
+ */
+static inline bool console_is_usable(struct console *con)
+{
+	if (!(con->flags & CON_ENABLED))
+		return false;
+
+	if (!con->write)
+		return false;
+
+	/*
+	 * Console drivers may assume that per-cpu resources have been
+	 * allocated. So unless they're explicitly marked as being able to
+	 * cope (CON_ANYTIME) don't call them until this CPU is officially up.
+	 */
+	if (!cpu_online(raw_smp_processor_id()) &&
+	    !(con->flags & CON_ANYTIME))
+		return false;
+
+	return true;
+}
+
+static void __console_unlock(void)
+{
+	console_locked = 0;
+	up_console_sem();
+}
+
+/*
+ * Print one record for the given console. The record printed is whatever
+ * record is the next available record for the given console.
+ *
+ * Requires the console_lock.
+ *
+ * Returns false if the given console has no next record to print, otherwise
+ * true.
  *
- * Console drivers may assume that per-cpu resources have been allocated. So
- * unless they're explicitly marked as being able to cope (CON_ANYTIME) don't
- * call them until this CPU is officially up.
+ * @handover will be set to true if a printk waiter has taken over the
+ * console_lock, in which case the caller is no longer holding the
+ * console_lock. Otherwise it is set to false.
  */
-static inline int can_use_console(void)
+static bool console_emit_next_record(struct console *con, bool *handover)
 {
-	return cpu_online(raw_smp_processor_id()) || have_callable_console();
+	static char ext_text[CONSOLE_EXT_LOG_MAX];
+	static char text[CONSOLE_LOG_MAX];
+	static int panic_console_dropped;
+	struct printk_info info;
+	struct printk_record r;
+	unsigned long flags;
+	char *write_text;
+	size_t len;
+
+	prb_rec_init_rd(&r, &info, text, sizeof(text));
+
+	*handover = false;
+
+	if (!prb_read_valid(prb, con->seq, &r))
+		return false;
+
+	if (con->seq != r.info->seq) {
+		con->dropped += r.info->seq - con->seq;
+		con->seq = r.info->seq;
+		if (panic_in_progress() && panic_console_dropped++ > 10) {
+			suppress_panic_printk = 1;
+			pr_warn_once("Too many dropped messages. Suppress messages on non-panic CPUs to prevent livelock.\n");
+		}
+	}
+
+	/* Skip record that has level above the console loglevel. */
+	if (suppress_message_printing(r.info->level)) {
+		con->seq++;
+		goto skip;
+	}
+
+	if (con->flags & CON_EXTENDED) {
+		write_text = &ext_text[0];
+		len = info_print_ext_header(ext_text, sizeof(ext_text), r.info);
+		len += msg_print_ext_body(ext_text + len, sizeof(ext_text) - len,
+					  &r.text_buf[0], r.info->text_len, &r.info->dev_info);
+	} else {
+		write_text = &text[0];
+		len = record_print_text(&r, console_msg_format & MSG_FORMAT_SYSLOG, printk_time);
+	}
+
+	/*
+	 * While actively printing out messages, if another printk()
+	 * were to occur on another CPU, it may wait for this one to
+	 * finish. This task can not be preempted if there is a
+	 * waiter waiting to take over.
+	 *
+	 * Interrupts are disabled because the hand over to a waiter
+	 * must not be interrupted until the hand over is completed
+	 * (@console_waiter is cleared).
+	 */
+	printk_safe_enter_irqsave(flags);
+	console_lock_spinning_enable();
+
+	stop_critical_timings();	/* don't trace print latency */
+	call_console_driver(con, write_text, len);
+	start_critical_timings();
+
+	con->seq++;
+
+	*handover = console_lock_spinning_disable_and_check();
+	printk_safe_exit_irqrestore(flags);
+skip:
+	return true;
+}
+
+/*
+ * Print out all remaining records to all consoles.
+ *
+ * Requires the console_lock.
+ *
+ * Returns true if a console was available for flushing, otherwise false.
+ *
+ * @next_seq is set to the highest sequence number of all of the consoles that
+ * were flushed.
+ *
+ * @handover will be set to true if a printk waiter has taken over the
+ * console_lock, in which case the caller is no longer holding the
+ * console_lock. Otherwise it is set to false.
+ */
+static bool console_flush_all(bool do_cond_resched, u64 *next_seq, bool *handover)
+{
+	bool any_usable = false;
+	struct console *con;
+	bool any_progress;
+
+	*next_seq = 0;
+	*handover = false;
+
+	do {
+		any_progress = false;
+
+		for_each_console(con) {
+			bool progress;
+
+			if (!console_is_usable(con))
+				continue;
+			any_usable = true;
+
+			progress = console_emit_next_record(con, handover);
+			if (*handover)
+				return true;
+
+			/* Track the highest seq flushed. */
+			if (con->seq > *next_seq)
+				*next_seq = con->seq;
+
+			if (!progress)
+				continue;
+			any_progress = true;
+
+			/* Allow panic_cpu to take over the consoles safely */
+			if (abandon_console_lock_in_panic())
+				return false;
+
+			if (do_cond_resched)
+				cond_resched();
+		}
+	} while (any_progress);
+
+	return any_usable;
 }
 
 /**
@@ -2662,22 +2772,16 @@ static inline int can_use_console(void)
  */
 void console_unlock(void)
 {
-	static char ext_text[CONSOLE_EXT_LOG_MAX];
-	static char text[CONSOLE_LOG_MAX];
-	static int panic_console_dropped;
-	unsigned long flags;
-	bool do_cond_resched, retry;
-	struct printk_info info;
-	struct printk_record r;
-	u64 __maybe_unused next_seq;
+	bool do_cond_resched;
+	bool handover;
+	bool flushed;
+	u64 next_seq;
 
 	if (console_suspended) {
 		up_console_sem();
 		return;
 	}
 
-	prb_rec_init_rd(&r, &info, text, sizeof(text));
-
 	/*
 	 * Console drivers are called with interrupts disabled, so
 	 * @console_may_schedule should be cleared before; however, we may
@@ -2686,125 +2790,33 @@ void console_unlock(void)
 	 * between lines if allowable.  Not doing so can cause a very long
 	 * scheduling stall on a slow console leading to RCU stall and
 	 * softlockup warnings which exacerbate the issue with more
-	 * messages practically incapacitating the system.
-	 *
-	 * console_trylock() is not able to detect the preemptive
-	 * context reliably. Therefore the value must be stored before
-	 * and cleared after the "again" goto label.
+	 * messages practically incapacitating the system. Therefore, create
+	 * a local to use for the printing loop.
 	 */
 	do_cond_resched = console_may_schedule;
-again:
-	console_may_schedule = 0;
-
-	/*
-	 * We released the console_sem lock, so we need to recheck if
-	 * cpu is online and (if not) is there at least one CON_ANYTIME
-	 * console.
-	 */
-	if (!can_use_console()) {
-		console_locked = 0;
-		up_console_sem();
-		return;
-	}
 
-	for (;;) {
-		size_t ext_len = 0;
-		int handover;
-		size_t len;
+	do {
+		console_may_schedule = 0;
 
-skip:
-		if (!prb_read_valid(prb, console_seq, &r))
+		flushed = console_flush_all(do_cond_resched, &next_seq, &handover);
+		if (handover)
 			break;
 
-		if (console_seq != r.info->seq) {
-			console_dropped += r.info->seq - console_seq;
-			console_seq = r.info->seq;
-			if (panic_in_progress() && panic_console_dropped++ > 10) {
-				suppress_panic_printk = 1;
-				pr_warn_once("Too many dropped messages. Suppress messages on non-panic CPUs to prevent livelock.\n");
-			}
-		}
-
-		if (suppress_message_printing(r.info->level)) {
-			/*
-			 * Skip record we have buffered and already printed
-			 * directly to the console when we received it, and
-			 * record that has level above the console loglevel.
-			 */
-			console_seq++;
-			goto skip;
-		}
-
-		/* Output to all consoles once old messages replayed. */
-		if (unlikely(exclusive_console &&
-			     console_seq >= exclusive_console_stop_seq)) {
-			exclusive_console = NULL;
-		}
+		__console_unlock();
 
-		/*
-		 * Handle extended console text first because later
-		 * record_print_text() will modify the record buffer in-place.
-		 */
-		if (nr_ext_console_drivers) {
-			ext_len = info_print_ext_header(ext_text,
-						sizeof(ext_text),
-						r.info);
-			ext_len += msg_print_ext_body(ext_text + ext_len,
-						sizeof(ext_text) - ext_len,
-						&r.text_buf[0],
-						r.info->text_len,
-						&r.info->dev_info);
-		}
-		len = record_print_text(&r,
-				console_msg_format & MSG_FORMAT_SYSLOG,
-				printk_time);
-		console_seq++;
+		/* Were there any consoles available for flushing? */
+		if (!flushed)
+			break;
 
 		/*
-		 * While actively printing out messages, if another printk()
-		 * were to occur on another CPU, it may wait for this one to
-		 * finish. This task can not be preempted if there is a
-		 * waiter waiting to take over.
-		 *
-		 * Interrupts are disabled because the hand over to a waiter
-		 * must not be interrupted until the hand over is completed
-		 * (@console_waiter is cleared).
+		 * Some context may have added new records after
+		 * console_flush_all() but before unlocking the console.
+		 * Re-check if there is a new record to flush. If the trylock
+		 * fails, another context is already handling the printing.
 		 */
-		printk_safe_enter_irqsave(flags);
-		console_lock_spinning_enable();
-
-		stop_critical_timings();	/* don't trace print latency */
-		call_console_drivers(ext_text, ext_len, text, len);
-		start_critical_timings();
-
-		handover = console_lock_spinning_disable_and_check();
-		printk_safe_exit_irqrestore(flags);
-		if (handover)
-			return;
-
-		/* Allow panic_cpu to take over the consoles safely */
-		if (abandon_console_lock_in_panic())
-			break;
-
-		if (do_cond_resched)
-			cond_resched();
-	}
-
-	/* Get consistent value of the next-to-be-used sequence number. */
-	next_seq = console_seq;
-
-	console_locked = 0;
-	up_console_sem();
-
-	/*
-	 * Someone could have filled up the buffer again, so re-check if there's
-	 * something to flush. In case we cannot trylock the console_sem again,
-	 * there's a new owner and the console_unlock() from them will do the
-	 * flush, no worries.
-	 */
-	retry = prb_read_valid(prb, next_seq, NULL);
-	if (retry && !abandon_console_lock_in_panic() && console_trylock())
-		goto again;
+	} while (prb_read_valid(prb, next_seq, NULL) &&
+		 !abandon_console_lock_in_panic() &&
+		 console_trylock());
 }
 EXPORT_SYMBOL(console_unlock);
 
@@ -2864,8 +2876,14 @@ void console_flush_on_panic(enum con_flush_mode mode)
 	console_trylock();
 	console_may_schedule = 0;
 
-	if (mode == CONSOLE_REPLAY_ALL)
-		console_seq = prb_first_valid_seq(prb);
+	if (mode == CONSOLE_REPLAY_ALL) {
+		struct console *c;
+		u64 seq;
+
+		seq = prb_first_valid_seq(prb);
+		for_each_console(c)
+			c->seq = seq;
+	}
 	console_unlock();
 }
 
@@ -3104,26 +3122,15 @@ void register_console(struct console *newcon)
 	if (newcon->flags & CON_EXTENDED)
 		nr_ext_console_drivers++;
 
+	newcon->dropped = 0;
 	if (newcon->flags & CON_PRINTBUFFER) {
-		/*
-		 * console_unlock(); will print out the buffered messages
-		 * for us.
-		 *
-		 * We're about to replay the log buffer.  Only do this to the
-		 * just-registered console to avoid excessive message spam to
-		 * the already-registered consoles.
-		 *
-		 * Set exclusive_console with disabled interrupts to reduce
-		 * race window with eventual console_flush_on_panic() that
-		 * ignores console_lock.
-		 */
-		exclusive_console = newcon;
-		exclusive_console_stop_seq = console_seq;
-
 		/* Get a consistent copy of @syslog_seq. */
 		mutex_lock(&syslog_lock);
-		console_seq = syslog_seq;
+		newcon->seq = syslog_seq;
 		mutex_unlock(&syslog_lock);
+	} else {
+		/* Begin with next message. */
+		newcon->seq = prb_next_seq(prb);
 	}
 	console_unlock();
 	console_sysfs_notify();
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH printk v2 07/12] printk: move buffer definitions into console_emit_next_record() caller
  2022-04-05 13:25 [PATCH printk v2 00/12] implement threaded console printing John Ogness
                   ` (5 preceding siblings ...)
  2022-04-05 13:25 ` [PATCH printk v2 06/12] printk: refactor and rework printing logic John Ogness
@ 2022-04-05 13:25 ` John Ogness
  2022-04-06 14:40   ` Petr Mladek
  2022-04-05 13:25 ` [PATCH printk v2 08/12] printk: add pr_flush() John Ogness
                   ` (5 subsequent siblings)
  12 siblings, 1 reply; 37+ messages in thread
From: John Ogness @ 2022-04-05 13:25 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner, linux-kernel

Extended consoles print extended messages and do not print messages about
dropped records.

Non-extended consoles print "normal" messages as well as extra messages
about dropped records.

Currently the buffers for these various message types are defined within
the functions that might use them, and their usage is based upon the
CON_EXTENDED flag. This will be a problem when moving to kthread printers
because each printer must be able to provide its own buffers.

Move all the message buffer definitions outside of
console_emit_next_record(). The caller knows if extended or dropped
messages should be printed and can specify the appropriate buffers to
use. The console_emit_next_record() and call_console_driver() functions
determine what to print based on whether the specified buffers are non-NULL.

With this change, buffer definition/allocation/specification is separated
from the code that does the various types of string printing.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
---
 kernel/printk/printk.c | 60 ++++++++++++++++++++++++++++++------------
 1 file changed, 43 insertions(+), 17 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index db9278dba775..06f0b29d909d 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -394,6 +394,9 @@ static struct latched_seq clear_seq = {
 /* the maximum size of a formatted record (i.e. with prefix added per line) */
 #define CONSOLE_LOG_MAX		1024
 
+/* the maximum size for a dropped text message */
+#define DROPPED_TEXT_MAX	64
+
 /* the maximum size allowed to be reserved for a record */
 #define LOG_LINE_MAX		(CONSOLE_LOG_MAX - PREFIX_MAX)
 
@@ -1901,18 +1904,18 @@ static int console_trylock_spinning(void)
 
 /*
  * Call the specified console driver, asking it to write out the specified
- * text and length. For non-extended consoles, if any records have been
+ * text and length. If @dropped_text is non-NULL and any records have been
  * dropped, a dropped message will be written out first.
  */
-static void call_console_driver(struct console *con, const char *text, size_t len)
+static void call_console_driver(struct console *con, const char *text, size_t len,
+				char *dropped_text)
 {
-	static char dropped_text[64];
 	size_t dropped_len;
 
 	trace_console_rcuidle(text, len);
 
-	if (con->dropped && !(con->flags & CON_EXTENDED)) {
-		dropped_len = snprintf(dropped_text, sizeof(dropped_text),
+	if (con->dropped && dropped_text) {
+		dropped_len = snprintf(dropped_text, DROPPED_TEXT_MAX,
 				       "** %lu printk messages dropped **\n",
 				       con->dropped);
 		con->dropped = 0;
@@ -2274,6 +2277,7 @@ EXPORT_SYMBOL(_printk);
 #else /* CONFIG_PRINTK */
 
 #define CONSOLE_LOG_MAX		0
+#define DROPPED_TEXT_MAX	0
 #define printk_time		false
 
 #define prb_read_valid(rb, seq, r)	false
@@ -2297,7 +2301,10 @@ static ssize_t msg_print_ext_body(char *buf, size_t size,
 				  struct dev_printk_info *dev_info) { return 0; }
 static void console_lock_spinning_enable(void) { }
 static int console_lock_spinning_disable_and_check(void) { return 0; }
-static void call_console_driver(struct console *con, const char *text, size_t len) { }
+static void call_console_driver(struct console *con, const char *text, size_t len,
+				char *dropped_text)
+{
+}
 static bool suppress_message_printing(int level) { return false; }
 
 #endif /* CONFIG_PRINTK */
@@ -2622,6 +2629,14 @@ static void __console_unlock(void)
  * Print one record for the given console. The record printed is whatever
  * record is the next available record for the given console.
  *
+ * @text is a buffer of size CONSOLE_LOG_MAX.
+ *
+ * If extended messages should be printed, @ext_text is a buffer of size
+ * CONSOLE_EXT_LOG_MAX. Otherwise @ext_text must be NULL.
+ *
+ * If dropped messages should be printed, @dropped_text is a buffer of size
+ * DROPPED_TEXT_MAX. Otherwise @dropped_text must be NULL.
+ *
  * Requires the console_lock.
  *
  * Returns false if the given console has no next record to print, otherwise
@@ -2631,10 +2646,9 @@ static void __console_unlock(void)
  * console_lock, in which case the caller is no longer holding the
  * console_lock. Otherwise it is set to false.
  */
-static bool console_emit_next_record(struct console *con, bool *handover)
+static bool console_emit_next_record(struct console *con, char *text, char *ext_text,
+				     char *dropped_text, bool *handover)
 {
-	static char ext_text[CONSOLE_EXT_LOG_MAX];
-	static char text[CONSOLE_LOG_MAX];
 	static int panic_console_dropped;
 	struct printk_info info;
 	struct printk_record r;
@@ -2642,7 +2656,7 @@ static bool console_emit_next_record(struct console *con, bool *handover)
 	char *write_text;
 	size_t len;
 
-	prb_rec_init_rd(&r, &info, text, sizeof(text));
+	prb_rec_init_rd(&r, &info, text, CONSOLE_LOG_MAX);
 
 	*handover = false;
 
@@ -2664,13 +2678,13 @@ static bool console_emit_next_record(struct console *con, bool *handover)
 		goto skip;
 	}
 
-	if (con->flags & CON_EXTENDED) {
-		write_text = &ext_text[0];
-		len = info_print_ext_header(ext_text, sizeof(ext_text), r.info);
-		len += msg_print_ext_body(ext_text + len, sizeof(ext_text) - len,
+	if (ext_text) {
+		write_text = ext_text;
+		len = info_print_ext_header(ext_text, CONSOLE_EXT_LOG_MAX, r.info);
+		len += msg_print_ext_body(ext_text + len, CONSOLE_EXT_LOG_MAX - len,
 					  &r.text_buf[0], r.info->text_len, &r.info->dev_info);
 	} else {
-		write_text = &text[0];
+		write_text = text;
 		len = record_print_text(&r, console_msg_format & MSG_FORMAT_SYSLOG, printk_time);
 	}
 
@@ -2688,7 +2702,7 @@ static bool console_emit_next_record(struct console *con, bool *handover)
 	console_lock_spinning_enable();
 
 	stop_critical_timings();	/* don't trace print latency */
-	call_console_driver(con, write_text, len);
+	call_console_driver(con, write_text, len, dropped_text);
 	start_critical_timings();
 
 	con->seq++;
@@ -2715,6 +2729,9 @@ static bool console_emit_next_record(struct console *con, bool *handover)
  */
 static bool console_flush_all(bool do_cond_resched, u64 *next_seq, bool *handover)
 {
+	static char dropped_text[DROPPED_TEXT_MAX];
+	static char ext_text[CONSOLE_EXT_LOG_MAX];
+	static char text[CONSOLE_LOG_MAX];
 	bool any_usable = false;
 	struct console *con;
 	bool any_progress;
@@ -2732,7 +2749,16 @@ static bool console_flush_all(bool do_cond_resched, u64 *next_seq, bool *handove
 				continue;
 			any_usable = true;
 
-			progress = console_emit_next_record(con, handover);
+			if (con->flags & CON_EXTENDED) {
+				/* Extended consoles do not print "dropped messages". */
+				progress = console_emit_next_record(con, &text[0],
+								    &ext_text[0], NULL,
+								    handover);
+			} else {
+				progress = console_emit_next_record(con, &text[0],
+								    NULL, &dropped_text[0],
+								    handover);
+			}
 			if (*handover)
 				return true;
 
-- 
2.30.2



* [PATCH printk v2 08/12] printk: add pr_flush()
  2022-04-05 13:25 [PATCH printk v2 00/12] implement threaded console printing John Ogness
                   ` (6 preceding siblings ...)
  2022-04-05 13:25 ` [PATCH printk v2 07/12] printk: move buffer definitions into console_emit_next_record() caller John Ogness
@ 2022-04-05 13:25 ` John Ogness
  2022-04-06 15:17   ` Petr Mladek
  2022-04-05 13:25 ` [PATCH printk v2 09/12] printk: add functions to prefer direct printing John Ogness
                   ` (4 subsequent siblings)
  12 siblings, 1 reply; 37+ messages in thread
From: John Ogness @ 2022-04-05 13:25 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner, linux-kernel

Provide a might_sleep() function to allow waiting for console printers
to catch up to the latest logged message.

Use pr_flush() whenever it is desirable to get buffered messages
printed before continuing: suspend_console(), resume_console(),
console_stop(), console_start(), console_unblank().

Signed-off-by: John Ogness <john.ogness@linutronix.de>
---
 include/linux/printk.h |  7 ++++
 kernel/printk/printk.c | 83 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 90 insertions(+)

diff --git a/include/linux/printk.h b/include/linux/printk.h
index b70a42f94031..091fba7283e1 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -170,6 +170,8 @@ extern void __printk_safe_exit(void);
 #define printk_deferred_enter __printk_safe_enter
 #define printk_deferred_exit __printk_safe_exit
 
+extern bool pr_flush(int timeout_ms, bool reset_on_progress);
+
 /*
  * Please don't use printk_ratelimit(), because it shares ratelimiting state
  * with all other unrelated printk_ratelimit() callsites.  Instead use
@@ -220,6 +222,11 @@ static inline void printk_deferred_exit(void)
 {
 }
 
+static inline bool pr_flush(int timeout_ms, bool reset_on_progress)
+{
+	return true;
+}
+
 static inline int printk_ratelimit(void)
 {
 	return 0;
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 06f0b29d909d..d0369afaf365 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2274,6 +2274,8 @@ asmlinkage __visible int _printk(const char *fmt, ...)
 }
 EXPORT_SYMBOL(_printk);
 
+static bool __pr_flush(struct console *con, int timeout_ms, bool reset_on_progress);
+
 #else /* CONFIG_PRINTK */
 
 #define CONSOLE_LOG_MAX		0
@@ -2306,6 +2308,7 @@ static void call_console_driver(struct console *con, const char *text, size_t le
 {
 }
 static bool suppress_message_printing(int level) { return false; }
+static bool __pr_flush(struct console *con, int timeout_ms, bool reset_on_progress) { return true; }
 
 #endif /* CONFIG_PRINTK */
 
@@ -2493,6 +2496,7 @@ void suspend_console(void)
 	if (!console_suspend_enabled)
 		return;
 	pr_info("Suspending console(s) (use no_console_suspend to debug)\n");
+	pr_flush(1000, true);
 	console_lock();
 	console_suspended = 1;
 	up_console_sem();
@@ -2505,6 +2509,7 @@ void resume_console(void)
 	down_console_sem();
 	console_suspended = 0;
 	console_unlock();
+	pr_flush(1000, true);
 }
 
 /**
@@ -2882,6 +2887,9 @@ void console_unblank(void)
 		if ((c->flags & CON_ENABLED) && c->unblank)
 			c->unblank();
 	console_unlock();
+
+	if (!oops_in_progress)
+		pr_flush(1000, true);
 }
 
 /**
@@ -2940,6 +2948,7 @@ struct tty_driver *console_device(int *index)
  */
 void console_stop(struct console *console)
 {
+	__pr_flush(console, 1000, true);
 	console_lock();
 	console->flags &= ~CON_ENABLED;
 	console_unlock();
@@ -2951,6 +2960,7 @@ void console_start(struct console *console)
 	console_lock();
 	console->flags |= CON_ENABLED;
 	console_unlock();
+	__pr_flush(console, 1000, true);
 }
 EXPORT_SYMBOL(console_start);
 
@@ -3321,6 +3331,79 @@ static int __init printk_late_init(void)
 late_initcall(printk_late_init);
 
 #if defined CONFIG_PRINTK
+/* If @con is specified, only wait for that console. Otherwise wait for all. */
+static bool __pr_flush(struct console *con, int timeout_ms, bool reset_on_progress)
+{
+	int remaining = timeout_ms;
+	struct console *c;
+	u64 last_diff = 0;
+	u64 printk_seq;
+	u64 diff;
+	u64 seq;
+
+	might_sleep();
+
+	seq = prb_next_seq(prb);
+
+	for (;;) {
+		diff = 0;
+
+		console_lock();
+		for_each_console(c) {
+			if (con && con != c)
+				continue;
+			if (!console_is_usable(c))
+				continue;
+			printk_seq = c->seq;
+			if (printk_seq < seq)
+				diff += seq - printk_seq;
+		}
+		console_unlock();
+
+		if (diff != last_diff && reset_on_progress)
+			remaining = timeout_ms;
+
+		if (diff == 0 || remaining == 0)
+			break;
+
+		if (remaining < 0) {
+			/* no timeout limit */
+			msleep(100);
+		} else if (remaining < 100) {
+			msleep(remaining);
+			remaining = 0;
+		} else {
+			msleep(100);
+			remaining -= 100;
+		}
+
+		last_diff = diff;
+	}
+
+	return (diff == 0);
+}
+
+/**
+ * pr_flush() - Wait for printing threads to catch up.
+ *
+ * @timeout_ms:        The maximum time (in ms) to wait.
+ * @reset_on_progress: Reset the timeout if forward progress is seen.
+ *
+ * A value of 0 for @timeout_ms means no waiting will occur. A value of -1
+ * represents infinite waiting.
+ *
+ * If @reset_on_progress is true, the timeout will be reset whenever any
+ * printer has been seen to make some forward progress.
+ *
+ * Context: Process context. May sleep while acquiring console lock.
+ * Return: true if all enabled printers are caught up.
+ */
+bool pr_flush(int timeout_ms, bool reset_on_progress)
+{
+	return __pr_flush(NULL, timeout_ms, reset_on_progress);
+}
+EXPORT_SYMBOL(pr_flush);
+
 /*
  * Delayed printk version, for scheduler-internal messages:
  */
-- 
2.30.2



* [PATCH printk v2 09/12] printk: add functions to prefer direct printing
  2022-04-05 13:25 [PATCH printk v2 00/12] implement threaded console printing John Ogness
                   ` (7 preceding siblings ...)
  2022-04-05 13:25 ` [PATCH printk v2 08/12] printk: add pr_flush() John Ogness
@ 2022-04-05 13:25 ` John Ogness
  2022-04-07  9:56   ` Petr Mladek
  2022-04-09 15:57   ` Paul E. McKenney
  2022-04-05 13:25 ` [PATCH printk v2 10/12] printk: add kthread console printers John Ogness
                   ` (3 subsequent siblings)
  12 siblings, 2 replies; 37+ messages in thread
From: John Ogness @ 2022-04-05 13:25 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
	linux-kernel, Greg Kroah-Hartman, Jiri Slaby, Paul E. McKenney,
	Frederic Weisbecker, Neeraj Upadhyay, Josh Triplett,
	Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes, Kees Cook,
	Andrew Morton, Luis Chamberlain, Wei Liu, Xiaoming Ni,
	Peter Zijlstra, Andy Shevchenko, Helge Deller, Marco Elver,
	Sebastian Andrzej Siewior, Daniel Lezcano, Shawn Guo,
	Dmitry Torokhov, Matti Vaittinen, Eric W. Biederman, Phil Auld,
	Juri Lelli, Wang Qing, rcu

Once kthread printing is available, console printing will no longer
occur in the context of the printk caller. However, there are some
special contexts where it is desirable for the printk caller to
directly print out kernel messages. Using pr_flush() to wait for
threaded printers is only possible if the caller is in a sleepable
context and the kthreads are active. That is not always the case.

Introduce printk_prefer_direct_enter() and printk_prefer_direct_exit()
functions to explicitly (and globally) activate/deactivate preferred
direct console printing. The term "direct console printing" refers to
printing to all enabled consoles from the context of the printk
caller. The term "prefer" is used because this type of printing is
only best effort. If the console is currently locked or other
printers are already actively printing, the printk caller will need
to rely on the other contexts to handle the printing.

This preferred direct printing is how all printing is currently
handled (unless explicitly deferred).

When kthread printing is introduced, there may be some unanticipated
problems due to kthreads being unable to flush important messages.
In order to minimize such risks, preferred direct printing is
activated for the primary important messages when the system
experiences general types of major errors. These are:

 - emergency reboot/shutdown
 - cpu and rcu stalls
 - hard and soft lockups
 - hung tasks
 - warn
 - sysrq

Note that since kthread printing does not yet exist, no behavior
changes result from this commit. This commit only implements the
counter and marks the various places where preferred direct
printing is active.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
---
 drivers/tty/sysrq.c     |  2 ++
 include/linux/printk.h  | 11 +++++++++++
 kernel/hung_task.c      | 11 ++++++++++-
 kernel/panic.c          |  4 ++++
 kernel/printk/printk.c  | 28 ++++++++++++++++++++++++++++
 kernel/rcu/tree_stall.h |  2 ++
 kernel/reboot.c         | 14 +++++++++++++-
 kernel/watchdog.c       |  4 ++++
 kernel/watchdog_hld.c   |  4 ++++
 9 files changed, 78 insertions(+), 2 deletions(-)

diff --git a/drivers/tty/sysrq.c b/drivers/tty/sysrq.c
index bbfd004449b5..2884cd638d64 100644
--- a/drivers/tty/sysrq.c
+++ b/drivers/tty/sysrq.c
@@ -578,6 +578,7 @@ void __handle_sysrq(int key, bool check_mask)
 
 	rcu_sysrq_start();
 	rcu_read_lock();
+	printk_prefer_direct_enter();
 	/*
 	 * Raise the apparent loglevel to maximum so that the sysrq header
 	 * is shown to provide the user with positive feedback.  We do not
@@ -619,6 +620,7 @@ void __handle_sysrq(int key, bool check_mask)
 		pr_cont("\n");
 		console_loglevel = orig_log_level;
 	}
+	printk_prefer_direct_exit();
 	rcu_read_unlock();
 	rcu_sysrq_end();
 
diff --git a/include/linux/printk.h b/include/linux/printk.h
index 091fba7283e1..cd26aab0ab2a 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -170,6 +170,9 @@ extern void __printk_safe_exit(void);
 #define printk_deferred_enter __printk_safe_enter
 #define printk_deferred_exit __printk_safe_exit
 
+extern void printk_prefer_direct_enter(void);
+extern void printk_prefer_direct_exit(void);
+
 extern bool pr_flush(int timeout_ms, bool reset_on_progress);
 
 /*
@@ -222,6 +225,14 @@ static inline void printk_deferred_exit(void)
 {
 }
 
+static inline void printk_prefer_direct_enter(void)
+{
+}
+
+static inline void printk_prefer_direct_exit(void)
+{
+}
+
 static inline bool pr_flush(int timeout_ms, bool reset_on_progress)
 {
 	return true;
diff --git a/kernel/hung_task.c b/kernel/hung_task.c
index 52501e5f7655..02a65d554340 100644
--- a/kernel/hung_task.c
+++ b/kernel/hung_task.c
@@ -127,6 +127,8 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout)
 	 * complain:
 	 */
 	if (sysctl_hung_task_warnings) {
+		printk_prefer_direct_enter();
+
 		if (sysctl_hung_task_warnings > 0)
 			sysctl_hung_task_warnings--;
 		pr_err("INFO: task %s:%d blocked for more than %ld seconds.\n",
@@ -142,6 +144,8 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout)
 
 		if (sysctl_hung_task_all_cpu_backtrace)
 			hung_task_show_all_bt = true;
+
+		printk_prefer_direct_exit();
 	}
 
 	touch_nmi_watchdog();
@@ -204,12 +208,17 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
 	}
  unlock:
 	rcu_read_unlock();
-	if (hung_task_show_lock)
+	if (hung_task_show_lock) {
+		printk_prefer_direct_enter();
 		debug_show_all_locks();
+		printk_prefer_direct_exit();
+	}
 
 	if (hung_task_show_all_bt) {
 		hung_task_show_all_bt = false;
+		printk_prefer_direct_enter();
 		trigger_all_cpu_backtrace();
+		printk_prefer_direct_exit();
 	}
 
 	if (hung_task_call_panic)
diff --git a/kernel/panic.c b/kernel/panic.c
index 55b50e052ec3..7d422597403f 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -560,6 +560,8 @@ void __warn(const char *file, int line, void *caller, unsigned taint,
 {
 	disable_trace_on_warning();
 
+	printk_prefer_direct_enter();
+
 	if (file)
 		pr_warn("WARNING: CPU: %d PID: %d at %s:%d %pS\n",
 			raw_smp_processor_id(), current->pid, file, line,
@@ -597,6 +599,8 @@ void __warn(const char *file, int line, void *caller, unsigned taint,
 
 	/* Just a warning, don't kill lockdep. */
 	add_taint(taint, LOCKDEP_STILL_OK);
+
+	printk_prefer_direct_exit();
 }
 
 #ifndef __WARN_FLAGS
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index d0369afaf365..258d02cff140 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -362,6 +362,34 @@ static int console_msg_format = MSG_FORMAT_DEFAULT;
 static DEFINE_MUTEX(syslog_lock);
 
 #ifdef CONFIG_PRINTK
+static atomic_t printk_prefer_direct = ATOMIC_INIT(0);
+
+/**
+ * printk_prefer_direct_enter - cause printk() calls to attempt direct
+ *                              printing to all enabled consoles
+ *
+ * Since the console printing code cannot be called from every context,
+ * there is no guarantee that direct printing will occur.
+ *
+ * This globally affects all printk() callers.
+ *
+ * Context: Any context.
+ */
+void printk_prefer_direct_enter(void)
+{
+	atomic_inc(&printk_prefer_direct);
+}
+
+/**
+ * printk_prefer_direct_exit - restore printk() behavior
+ *
+ * Context: Any context.
+ */
+void printk_prefer_direct_exit(void)
+{
+	WARN_ON(atomic_dec_if_positive(&printk_prefer_direct) < 0);
+}
+
 DECLARE_WAIT_QUEUE_HEAD(log_wait);
 /* All 3 protected by @syslog_lock. */
 /* the next printk record to read by syslog(READ) or /proc/kmsg */
diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
index 0c5d8516516a..d612707c2ed0 100644
--- a/kernel/rcu/tree_stall.h
+++ b/kernel/rcu/tree_stall.h
@@ -619,6 +619,7 @@ static void print_cpu_stall(unsigned long gps)
 	 * See Documentation/RCU/stallwarn.rst for info on how to debug
 	 * RCU CPU stall warnings.
 	 */
+	printk_prefer_direct_enter();
 	trace_rcu_stall_warning(rcu_state.name, TPS("SelfDetected"));
 	pr_err("INFO: %s self-detected stall on CPU\n", rcu_state.name);
 	raw_spin_lock_irqsave_rcu_node(rdp->mynode, flags);
@@ -656,6 +657,7 @@ static void print_cpu_stall(unsigned long gps)
 	 */
 	set_tsk_need_resched(current);
 	set_preempt_need_resched();
+	printk_prefer_direct_exit();
 }
 
 static void check_cpu_stall(struct rcu_data *rdp)
diff --git a/kernel/reboot.c b/kernel/reboot.c
index 6bcc5d6a6572..4177645e74d6 100644
--- a/kernel/reboot.c
+++ b/kernel/reboot.c
@@ -447,9 +447,11 @@ static int __orderly_reboot(void)
 	ret = run_cmd(reboot_cmd);
 
 	if (ret) {
+		printk_prefer_direct_enter();
 		pr_warn("Failed to start orderly reboot: forcing the issue\n");
 		emergency_sync();
 		kernel_restart(NULL);
+		printk_prefer_direct_exit();
 	}
 
 	return ret;
@@ -462,6 +464,7 @@ static int __orderly_poweroff(bool force)
 	ret = run_cmd(poweroff_cmd);
 
 	if (ret && force) {
+		printk_prefer_direct_enter();
 		pr_warn("Failed to start orderly shutdown: forcing the issue\n");
 
 		/*
@@ -471,6 +474,7 @@ static int __orderly_poweroff(bool force)
 		 */
 		emergency_sync();
 		kernel_power_off();
+		printk_prefer_direct_exit();
 	}
 
 	return ret;
@@ -528,6 +532,8 @@ EXPORT_SYMBOL_GPL(orderly_reboot);
  */
 static void hw_failure_emergency_poweroff_func(struct work_struct *work)
 {
+	printk_prefer_direct_enter();
+
 	/*
 	 * We have reached here after the emergency shutdown waiting period has
 	 * expired. This means orderly_poweroff has not been able to shut off
@@ -544,6 +550,8 @@ static void hw_failure_emergency_poweroff_func(struct work_struct *work)
 	 */
 	pr_emerg("Hardware protection shutdown failed. Trying emergency restart\n");
 	emergency_restart();
+
+	printk_prefer_direct_exit();
 }
 
 static DECLARE_DELAYED_WORK(hw_failure_emergency_poweroff_work,
@@ -582,11 +590,13 @@ void hw_protection_shutdown(const char *reason, int ms_until_forced)
 {
 	static atomic_t allow_proceed = ATOMIC_INIT(1);
 
+	printk_prefer_direct_enter();
+
 	pr_emerg("HARDWARE PROTECTION shutdown (%s)\n", reason);
 
 	/* Shutdown should be initiated only once. */
 	if (!atomic_dec_and_test(&allow_proceed))
-		return;
+		goto out;
 
 	/*
 	 * Queue a backup emergency shutdown in the event of
@@ -594,6 +604,8 @@ void hw_protection_shutdown(const char *reason, int ms_until_forced)
 	 */
 	hw_failure_emergency_poweroff(ms_until_forced);
 	orderly_poweroff(true);
+out:
+	printk_prefer_direct_exit();
 }
 EXPORT_SYMBOL_GPL(hw_protection_shutdown);
 
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 9166220457bc..40024e03d422 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -424,6 +424,8 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
 		/* Start period for the next softlockup warning. */
 		update_report_ts();
 
+		printk_prefer_direct_enter();
+
 		pr_emerg("BUG: soft lockup - CPU#%d stuck for %us! [%s:%d]\n",
 			smp_processor_id(), duration,
 			current->comm, task_pid_nr(current));
@@ -442,6 +444,8 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
 		add_taint(TAINT_SOFTLOCKUP, LOCKDEP_STILL_OK);
 		if (softlockup_panic)
 			panic("softlockup: hung tasks");
+
+		printk_prefer_direct_exit();
 	}
 
 	return HRTIMER_RESTART;
diff --git a/kernel/watchdog_hld.c b/kernel/watchdog_hld.c
index 247bf0b1582c..701f35f0e2d4 100644
--- a/kernel/watchdog_hld.c
+++ b/kernel/watchdog_hld.c
@@ -135,6 +135,8 @@ static void watchdog_overflow_callback(struct perf_event *event,
 		if (__this_cpu_read(hard_watchdog_warn) == true)
 			return;
 
+		printk_prefer_direct_enter();
+
 		pr_emerg("Watchdog detected hard LOCKUP on cpu %d\n",
 			 this_cpu);
 		print_modules();
@@ -155,6 +157,8 @@ static void watchdog_overflow_callback(struct perf_event *event,
 		if (hardlockup_panic)
 			nmi_panic(regs, "Hard LOCKUP");
 
+		printk_prefer_direct_exit();
+
 		__this_cpu_write(hard_watchdog_warn, true);
 		return;
 	}
-- 
2.30.2



* [PATCH printk v2 10/12] printk: add kthread console printers
  2022-04-05 13:25 [PATCH printk v2 00/12] implement threaded console printing John Ogness
                   ` (8 preceding siblings ...)
  2022-04-05 13:25 ` [PATCH printk v2 09/12] printk: add functions to prefer direct printing John Ogness
@ 2022-04-05 13:25 ` John Ogness
  2022-04-07 16:43   ` start/stop: was: " Petr Mladek
                     ` (3 more replies)
  2022-04-05 13:25 ` [PATCH printk v2 11/12] printk: extend console_lock for proper kthread support John Ogness
                   ` (2 subsequent siblings)
  12 siblings, 4 replies; 37+ messages in thread
From: John Ogness @ 2022-04-05 13:25 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
	linux-kernel, Greg Kroah-Hartman

Create a kthread for each console to perform console printing. During
normal operation (@system_state == SYSTEM_RUNNING), the kthread
printers are responsible for all printing on their respective
consoles.

During non-normal operation, console printing is done as it has been:
within the context of the printk caller or within irq work triggered
by the printk caller.

Console printers synchronize against each other and against console
lockers by taking the console lock for each message that is printed.
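
The wake-up condition such a printer thread evaluates can be sketched
as a pure predicate (a user-space C sketch; the struct fields are
simplified stand-ins for the kernel state that the real
printer_should_wake() reads):

```c
#include <assert.h>
#include <stdbool.h>

#define CON_ENABLED 0x1

/* Stand-in for the per-console state consulted on wake-up. */
struct con_state {
	bool should_stop;        /* kthread_should_stop() */
	bool kthreads_available; /* printk_kthreads_available */
	bool suspended;          /* console_suspended */
	bool has_write;          /* con->write != NULL */
	int flags;               /* con->flags */
	bool prefer_direct;      /* printk_prefer_direct > 0 */
	bool record_pending;     /* prb_read_valid(prb, seq, NULL) */
};

bool printer_should_wake(const struct con_state *c)
{
	/* Always wake so the thread can exit cleanly. */
	if (c->should_stop || !c->kthreads_available)
		return true;

	/* Only wake when this console may actually print... */
	if (c->suspended || !c->has_write || !(c->flags & CON_ENABLED))
		return false;

	/* ...direct printing is not currently preferred... */
	if (c->prefer_direct)
		return false;

	/* ...and there is a new record to print. */
	return c->record_pending;
}
```

The thread sleeps on @log_wait with this predicate, so a wake-up is
only useful when the console can make progress.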

Signed-off-by: John Ogness <john.ogness@linutronix.de>
---
 include/linux/console.h |   2 +
 kernel/printk/printk.c  | 262 ++++++++++++++++++++++++++++++++++++++--
 2 files changed, 252 insertions(+), 12 deletions(-)

diff --git a/include/linux/console.h b/include/linux/console.h
index 8c1686e2c233..9a251e70c090 100644
--- a/include/linux/console.h
+++ b/include/linux/console.h
@@ -153,6 +153,8 @@ struct console {
 	uint	ospeed;
 	u64	seq;
 	unsigned long dropped;
+	struct task_struct *thread;
+
 	void	*data;
 	struct	 console *next;
 };
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 258d02cff140..a3b8b984c9aa 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -361,6 +361,13 @@ static int console_msg_format = MSG_FORMAT_DEFAULT;
 /* syslog_lock protects syslog_* variables and write access to clear_seq. */
 static DEFINE_MUTEX(syslog_lock);
 
+/*
+ * A flag to signify if printk_activate_kthreads() has already started the
+ * kthread printers. If true, any later registered consoles must start their
+ * own kthread directly. The flag is write protected by the console_lock.
+ */
+static bool printk_kthreads_available;
+
 #ifdef CONFIG_PRINTK
 static atomic_t printk_prefer_direct = ATOMIC_INIT(0);
 
@@ -388,6 +395,21 @@ void printk_prefer_direct_enter(void)
 void printk_prefer_direct_exit(void)
 {
 	WARN_ON(atomic_dec_if_positive(&printk_prefer_direct) < 0);
+
+	/*
+	 * Threaded printers remain asleep while preferred direct printing is
+	 * active. Explicitly wake them in case any new printk records became
+	 * visible after preferred direct printing was deactivated.
+	 */
+	wake_up_klogd();
+}
+
+static inline bool allow_direct_printing(void)
+{
+	return (!printk_kthreads_available ||
+		system_state > SYSTEM_RUNNING ||
+		oops_in_progress ||
+		atomic_read(&printk_prefer_direct));
 }
 
 DECLARE_WAIT_QUEUE_HEAD(log_wait);
@@ -2258,10 +2280,10 @@ asmlinkage int vprintk_emit(int facility, int level,
 	printed_len = vprintk_store(facility, level, dev_info, fmt, args);
 
 	/* If called from the scheduler, we can not call up(). */
-	if (!in_sched) {
+	if (!in_sched && allow_direct_printing()) {
 		/*
 		 * The caller may be holding system-critical or
-		 * timing-sensitive locks. Disable preemption during
+		 * timing-sensitive locks. Disable preemption during direct
 		 * printing of all remaining records to all consoles so that
 		 * this context can return as soon as possible. Hopefully
 		 * another printk() caller will take over the printing.
@@ -2304,6 +2326,8 @@ EXPORT_SYMBOL(_printk);
 
 static bool __pr_flush(struct console *con, int timeout_ms, bool reset_on_progress);
 
+static void printk_start_kthread(struct console *con);
+
 #else /* CONFIG_PRINTK */
 
 #define CONSOLE_LOG_MAX		0
@@ -2337,6 +2361,8 @@ static void call_console_driver(struct console *con, const char *text, size_t le
 }
 static bool suppress_message_printing(int level) { return false; }
 static bool __pr_flush(struct console *con, int timeout_ms, bool reset_on_progress) { return true; }
+static void printk_start_kthread(struct console *con) { }
+static bool allow_direct_printing(void) { return true; }
 
 #endif /* CONFIG_PRINTK */
 
@@ -2537,6 +2563,13 @@ void resume_console(void)
 	down_console_sem();
 	console_suspended = 0;
 	console_unlock();
+
+	/*
+	 * While suspended, new records may have been added to the
+	 * ringbuffer. Wake up the kthread printers to print them.
+	 */
+	wake_up_klogd();
+
 	pr_flush(1000, true);
 }
 
@@ -2555,6 +2588,9 @@ static int console_cpu_notify(unsigned int cpu)
 		/* If trylock fails, someone else is doing the printing */
 		if (console_trylock())
 			console_unlock();
+
+		/* Wake kthread printers. Some may have become usable. */
+		wake_up_klogd();
 	}
 	return 0;
 }
@@ -2773,6 +2809,10 @@ static bool console_flush_all(bool do_cond_resched, u64 *next_seq, bool *handove
 	*handover = false;
 
 	do {
+		/* Let the kthread printers do the work if they can. */
+		if (!allow_direct_printing())
+			break;
+
 		any_progress = false;
 
 		for_each_console(con) {
@@ -2988,6 +3028,10 @@ void console_start(struct console *console)
 	console_lock();
 	console->flags |= CON_ENABLED;
 	console_unlock();
+
+	/* Wake the newly enabled kthread printer. */
+	wake_up_klogd();
+
 	__pr_flush(console, 1000, true);
 }
 EXPORT_SYMBOL(console_start);
@@ -3196,6 +3240,8 @@ void register_console(struct console *newcon)
 		/* Begin with next message. */
 		newcon->seq = prb_next_seq(prb);
 	}
+	if (printk_kthreads_available)
+		printk_start_kthread(newcon);
 	console_unlock();
 	console_sysfs_notify();
 
@@ -3222,6 +3268,7 @@ EXPORT_SYMBOL(register_console);
 
 int unregister_console(struct console *console)
 {
+	struct task_struct *thd = console->thread;
 	struct console *con;
 	int res;
 
@@ -3233,6 +3280,11 @@ int unregister_console(struct console *console)
 	if (res > 0)
 		return 0;
 
+	if (thd) {
+		console->thread = NULL;
+		kthread_stop(thd);
+	}
+
 	res = -ENODEV;
 	console_lock();
 	if (console_drivers == console) {
@@ -3358,6 +3410,20 @@ static int __init printk_late_init(void)
 }
 late_initcall(printk_late_init);
 
+static int __init printk_activate_kthreads(void)
+{
+	struct console *con;
+
+	console_lock();
+	printk_kthreads_available = true;
+	for_each_console(con)
+		printk_start_kthread(con);
+	console_unlock();
+
+	return 0;
+}
+early_initcall(printk_activate_kthreads);
+
 #if defined CONFIG_PRINTK
 /* If @con is specified, only wait for that console. Otherwise wait for all. */
 static bool __pr_flush(struct console *con, int timeout_ms, bool reset_on_progress)
@@ -3432,6 +3498,154 @@ bool pr_flush(int timeout_ms, bool reset_on_progress)
 }
 EXPORT_SYMBOL(pr_flush);
 
+/*
+ * Enter preferred direct printing, but never exit. Mark console threads as
+ * unavailable. The system is then forever in preferred direct printing and
+ * any printing threads will exit.
+ */
+static void printk_fallback_preferred_direct(void)
+{
+	printk_prefer_direct_enter();
+	pr_err("falling back to preferred direct printing\n");
+	printk_kthreads_available = false;
+}
+
+static bool printer_should_wake(struct console *con, u64 seq)
+{
+	short flags;
+
+	if (kthread_should_stop() || !printk_kthreads_available)
+		return true;
+
+	if (console_suspended)
+		return false;
+
+	if (!con->write)
+		return false;
+
+	/*
+	 * This is an unsafe read of con->flags, but a false positive is not
+	 * a problem. Worst case it would allow the printer to wake up even
+	 * when it is disabled. But the printer will notice that itself when
+	 * attempting to print and instead go back to sleep.
+	 */
+	flags = data_race(READ_ONCE(con->flags));
+	if (!(flags & CON_ENABLED))
+		return false;
+
+	if (atomic_read(&printk_prefer_direct))
+		return false;
+
+	return prb_read_valid(prb, seq, NULL);
+}
+
+static int printk_kthread_func(void *data)
+{
+	struct console *con = data;
+	char *dropped_text = NULL;
+	char *ext_text = NULL;
+	bool handover;
+	u64 seq = 0;
+	char *text;
+	int error;
+
+	text = kmalloc(CONSOLE_LOG_MAX, GFP_KERNEL);
+	if (!text) {
+		printk_console_msg(con, KERN_ERR, "failed to allocate text buffer");
+		printk_fallback_preferred_direct();
+		goto out;
+	}
+
+	if (con->flags & CON_EXTENDED) {
+		ext_text = kmalloc(CONSOLE_EXT_LOG_MAX, GFP_KERNEL);
+		if (!ext_text) {
+			printk_console_msg(con, KERN_ERR, "failed to allocate ext_text buffer");
+			printk_fallback_preferred_direct();
+			goto out;
+		}
+	} else {
+		dropped_text = kmalloc(DROPPED_TEXT_MAX, GFP_KERNEL);
+		if (!dropped_text) {
+			printk_console_msg(con, KERN_ERR,
+					   "failed to allocate dropped_text buffer");
+			printk_fallback_preferred_direct();
+			goto out;
+		}
+	}
+
+	printk_console_msg(con, KERN_INFO, "printing thread started");
+
+	for (;;) {
+		/*
+		 * Guarantee this task is visible on the waitqueue before
+		 * checking the wake condition.
+		 *
+		 * The full memory barrier within set_current_state() of
+		 * prepare_to_wait_event() pairs with the full memory barrier
+		 * within wq_has_sleeper().
+		 *
+		 * See __wake_up_klogd:A for the pairing memory barrier.
+		 */
+		error = wait_event_interruptible(log_wait,
+				printer_should_wake(con, seq)); /* LMM(printk_kthread_func:A) */
+
+		if (kthread_should_stop() || !printk_kthreads_available)
+			break;
+
+		if (error)
+			continue;
+
+		console_lock();
+
+		if (console_suspended) {
+			__console_unlock();
+			continue;
+		}
+
+		if (!console_is_usable(con)) {
+			__console_unlock();
+			continue;
+		}
+
+		/*
+		 * Even though the printk kthread is always preemptible, it is
+		 * still not allowed to call cond_resched() from within
+		 * console drivers. The task may become non-preemptible in the
+		 * console driver call chain. For example, vt_console_print()
+		 * takes a spinlock and then can call into fbcon_redraw(),
+		 * which can conditionally invoke cond_resched().
+		 */
+		console_may_schedule = 0;
+		console_emit_next_record(con, text, ext_text, dropped_text, &handover);
+		if (handover)
+			continue;
+
+		seq = con->seq;
+
+		__console_unlock();
+	}
+
+	printk_console_msg(con, KERN_INFO, "printing thread stopped");
+out:
+	kfree(dropped_text);
+	kfree(ext_text);
+	kfree(text);
+	return 0;
+}
+
+/* Must be called within console_lock(). */
+static void printk_start_kthread(struct console *con)
+{
+	con->thread = kthread_run(printk_kthread_func, con,
+				  "pr/%s%d", con->name, con->index);
+	if (IS_ERR(con->thread)) {
+		con->thread = NULL;
+		printk_console_msg(con, KERN_ERR, "unable to start printing thread");
+		printk_fallback_preferred_direct();
+		return;
+	}
+}
+
 /*
  * Delayed printk version, for scheduler-internal messages:
  */
@@ -3445,40 +3659,64 @@ static void wake_up_klogd_work_func(struct irq_work *irq_work)
 	int pending = this_cpu_xchg(printk_pending, 0);
 
 	if (pending & PRINTK_PENDING_OUTPUT) {
+		printk_prefer_direct_enter();
+
 		/* If trylock fails, someone else is doing the printing */
 		if (console_trylock())
 			console_unlock();
+
+		printk_prefer_direct_exit();
 	}
 
 	if (pending & PRINTK_PENDING_WAKEUP)
-		wake_up_interruptible(&log_wait);
+		wake_up_interruptible_all(&log_wait);
 }
 
 static DEFINE_PER_CPU(struct irq_work, wake_up_klogd_work) =
 	IRQ_WORK_INIT_LAZY(wake_up_klogd_work_func);
 
-void wake_up_klogd(void)
+static void __wake_up_klogd(int val)
 {
 	if (!printk_percpu_data_ready())
 		return;
 
 	preempt_disable();
-	if (waitqueue_active(&log_wait)) {
-		this_cpu_or(printk_pending, PRINTK_PENDING_WAKEUP);
+	/*
+	 * Guarantee any new records can be seen by printing threads before
+	 * checking if the wait queue is empty.
+	 *
+	 * The full memory barrier within wq_has_sleeper() pairs with the full
+	 * memory barrier within set_current_state() of
+	 * prepare_to_wait_event(), which is called after ___wait_event() adds
+	 * the waiter but before it has checked the wait condition.
+	 *
+	 * See printk_kthread_func:A for the pairing memory barrier.
+	 */
+	if (wq_has_sleeper(&log_wait) || /* LMM(__wake_up_klogd:A) */
+	    (val & PRINTK_PENDING_OUTPUT)) {
+		this_cpu_or(printk_pending, val);
 		irq_work_queue(this_cpu_ptr(&wake_up_klogd_work));
 	}
 	preempt_enable();
 }
 
+void wake_up_klogd(void)
+{
+	__wake_up_klogd(PRINTK_PENDING_WAKEUP);
+}
+
 void defer_console_output(void)
 {
-	if (!printk_percpu_data_ready())
-		return;
+	int val = PRINTK_PENDING_WAKEUP;
 
-	preempt_disable();
-	this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
-	irq_work_queue(this_cpu_ptr(&wake_up_klogd_work));
-	preempt_enable();
+	/*
+	 * If console output was deferred while preferred direct printing
+	 * is active, let the irq_work perform the direct printing.
+	 */
+	if (atomic_read(&printk_prefer_direct))
+		val |= PRINTK_PENDING_OUTPUT;
+
+	__wake_up_klogd(val);
 }
 
 void printk_trigger_flush(void)
-- 
2.30.2



* [PATCH printk v2 11/12] printk: extend console_lock for proper kthread support
  2022-04-05 13:25 [PATCH printk v2 00/12] implement threaded console printing John Ogness
                   ` (9 preceding siblings ...)
  2022-04-05 13:25 ` [PATCH printk v2 10/12] printk: add kthread console printers John Ogness
@ 2022-04-05 13:25 ` John Ogness
  2022-04-08 13:45   ` guarantee forward progress: was: " Petr Mladek
  2022-04-08 13:52   ` register_console: " Petr Mladek
  2022-04-05 13:25 ` [PATCH printk v2 12/12] printk: remove @console_locked John Ogness
  2022-04-05 15:03 ` [PATCH printk v2 00/12] implement threaded console printing Andy Shevchenko
  12 siblings, 2 replies; 37+ messages in thread
From: John Ogness @ 2022-04-05 13:25 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
	linux-kernel, Greg Kroah-Hartman

Currently threaded console printers synchronize against each
other using console_lock(). However, different console drivers
are unrelated and do not require any synchronization between
each other. Removing the synchronization between the threaded
console printers will allow each console to print at its own
speed.

But the threaded console printers still need to synchronize
against console_lock() callers. Introduce a per-console mutex
and a new console flag CON_THD_BLOCKED to provide this
synchronization.

console_lock() is modified so that it must acquire the mutex
of each console in order to set the CON_THD_BLOCKED flag.
Console printing threads will acquire their mutex while
printing a record. If CON_THD_BLOCKED was set, the thread will
go back to sleep instead of printing.

The reason for the CON_THD_BLOCKED flag is so that
console_lock() callers do not need to acquire multiple console
mutexes simultaneously, which would introduce unnecessary
complexity due to nested mutex locking.

However, threaded console printers also need to synchronize
against console_trylock() callers. Since console_trylock() may
be called from any context, the per-console mutex cannot be
used for this synchronization. (mutex_trylock() cannot be
called from atomic contexts.) Introduce a global atomic counter
to identify if any threaded printers are active. The threaded
printers will also check the atomic counter to identify if the
console has been locked by another task via console_trylock().

Note that @console_sem is still used to provide synchronization
between console_lock() and console_trylock() callers.

A locking overview for console_lock(), console_trylock(), and the
threaded printers is as follows (pseudo code):

console_lock()
{
        down(&console_sem);
        for_each_console(con) {
                mutex_lock(&con->lock);
                con->flags |= CON_THD_BLOCKED;
                mutex_unlock(&con->lock);
        }
        /* console lock acquired */
}

console_trylock()
{
        if (down_trylock(&console_sem) == 0) {
                if (atomic_cmpxchg(&console_kthreads_active, 0, -1) == 0) {
                        /* console lock acquired */
                }
        }
}

threaded_printer()
{
        mutex_lock(&con->lock);
        if (!(con->flags & CON_THD_BLOCKED)) {
		/* console_lock() callers blocked */

                if (atomic_inc_unless_negative(&console_kthreads_active)) {
                        /* console_trylock() callers blocked */

                        con->write();

                        atomic_dec(&console_kthreads_active);
                }
        }
        mutex_unlock(&con->lock);
}

The console owner and waiter logic now only applies between contexts
that have taken the console lock. Since threaded printers never take
the console lock, they do not have a console lock to hand over to
other contexts.

@panic_console_dropped must change to atomic_t since it is no longer
protected exclusively by the console lock.

Since threaded printers remain asleep if they see that the console
is locked, they now must be explicitly woken in __console_unlock().
This means wake_up_klogd() calls following a console_unlock() are
no longer necessary and are removed.
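
The tri-state counter semantics described above can be sketched in
user-space C11 atomics (illustrative only; the kernel versions are
macros over atomic_t, and the kthread additionally holds its
per-console mutex around this):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* -1 = kthreads blocked via console_trylock()
 *  0 = idle, console not locked via trylock
 * >0 = that many kthread printers actively printing */
static atomic_int console_kthreads_active;

/* console_trylock(): atomically block all kthread printers, but only
 * if none is currently inside a con->write() call. */
bool console_kthreads_atomic_tryblock(void)
{
	int expected = 0;
	return atomic_compare_exchange_strong(&console_kthreads_active,
					      &expected, -1);
}

/* __console_unlock() for the trylock case: lift the block. */
void console_kthreads_atomic_unblock(void)
{
	int expected = -1;
	atomic_compare_exchange_strong(&console_kthreads_active,
				       &expected, 0);
}

/* Printer kthread: enter printing only if no trylock holder (-1). */
bool console_kthread_printing_tryenter(void)
{
	int v = atomic_load(&console_kthreads_active);

	while (v >= 0) {
		/* On failure, v is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&console_kthreads_active,
						 &v, v + 1))
			return true;
	}
	return false;
}

void console_kthread_printing_exit(void)
{
	atomic_fetch_sub(&console_kthreads_active, 1);
}
```

This gives mutual exclusion between trylock holders and printers
without any printer ever sleeping on a lock from atomic context.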

Signed-off-by: John Ogness <john.ogness@linutronix.de>
---
 include/linux/console.h |  15 +++
 kernel/printk/printk.c  | 243 ++++++++++++++++++++++++++++++----------
 2 files changed, 199 insertions(+), 59 deletions(-)

diff --git a/include/linux/console.h b/include/linux/console.h
index 9a251e70c090..c1fd4f41c547 100644
--- a/include/linux/console.h
+++ b/include/linux/console.h
@@ -16,6 +16,7 @@
 
 #include <linux/atomic.h>
 #include <linux/types.h>
+#include <linux/mutex.h>
 
 struct vc_data;
 struct console_font_op;
@@ -136,6 +137,7 @@ static inline int con_debug_leave(void)
 #define CON_ANYTIME	(16) /* Safe to call when cpu is offline */
 #define CON_BRL		(32) /* Used for a braille device */
 #define CON_EXTENDED	(64) /* Use the extended output format a la /dev/kmsg */
+#define CON_THD_BLOCKED	(128) /* Thread blocked because console is locked */
 
 struct console {
 	char	name[16];
@@ -155,6 +157,19 @@ struct console {
 	unsigned long dropped;
 	struct task_struct *thread;
 
+	/*
+	 * The per-console lock is used by printing kthreads to synchronize
+	 * this console with callers of console_lock(). This is necessary in
+	 * order to allow printing kthreads to run in parallel to each other,
+	 * while each safely accessing their own @flags and synchronizing
+	 * against direct printing via console_lock/console_unlock.
+	 *
+	 * Note: For synchronizing against direct printing via
+	 *       console_trylock/console_unlock, see the static global
+	 *       variable @console_kthreads_active.
+	 */
+	struct mutex lock;
+
 	void	*data;
 	struct	 console *next;
 };
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index a3b8b984c9aa..a5525a7b7bc5 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -223,6 +223,33 @@ int devkmsg_sysctl_set_loglvl(struct ctl_table *table, int write,
 /* Number of registered extended console drivers. */
 static int nr_ext_console_drivers;
 
+/*
+ * Used to synchronize printing kthreads against direct printing via
+ * console_trylock/console_unlock.
+ *
+ * Values:
+ * -1 = console kthreads atomically blocked (via global trylock)
+ *  0 = no kthread printing, console not locked (via trylock)
+ * >0 = kthread(s) actively printing
+ *
+ * Note: For synchronizing against direct printing via
+ *       console_lock/console_unlock, see the @lock variable in
+ *       struct console.
+ */
+static atomic_t console_kthreads_active = ATOMIC_INIT(0);
+
+#define console_kthreads_atomic_tryblock() \
+	(atomic_cmpxchg(&console_kthreads_active, 0, -1) == 0)
+#define console_kthreads_atomic_unblock() \
+	atomic_cmpxchg(&console_kthreads_active, -1, 0)
+#define console_kthreads_atomically_blocked() \
+	(atomic_read(&console_kthreads_active) == -1)
+
+#define console_kthread_printing_tryenter() \
+	atomic_inc_unless_negative(&console_kthreads_active)
+#define console_kthread_printing_exit() \
+	atomic_dec(&console_kthreads_active)
+
 /*
  * Helper macros to handle lockdep when locking/unlocking console_sem. We use
  * macros instead of functions so that _RET_IP_ contains useful information.
@@ -270,6 +297,49 @@ static bool panic_in_progress(void)
 	return unlikely(atomic_read(&panic_cpu) != PANIC_CPU_INVALID);
 }
 
+/*
+ * Tracks whether kthread printers are all blocked. A value of true implies
+ * that the console is locked via console_lock() or the console is suspended.
+ * Reading and writing to this variable requires holding @console_sem.
+ */
+static bool console_kthreads_blocked;
+
+/*
+ * Block all kthread printers from a schedulable context.
+ *
+ * Requires the console_lock.
+ */
+static void console_kthreads_block(void)
+{
+	struct console *con;
+
+	for_each_console(con) {
+		mutex_lock(&con->lock);
+		con->flags |= CON_THD_BLOCKED;
+		mutex_unlock(&con->lock);
+	}
+
+	console_kthreads_blocked = true;
+}
+
+/*
+ * Unblock all kthread printers from a schedulable context.
+ *
+ * Requires the console_lock.
+ */
+static void console_kthreads_unblock(void)
+{
+	struct console *con;
+
+	for_each_console(con) {
+		mutex_lock(&con->lock);
+		con->flags &= ~CON_THD_BLOCKED;
+		mutex_unlock(&con->lock);
+	}
+
+	console_kthreads_blocked = false;
+}
+
 /*
  * This is used for debugging the mess that is the VT code by
  * keeping track if we have the console semaphore held. It's
@@ -2563,13 +2633,6 @@ void resume_console(void)
 	down_console_sem();
 	console_suspended = 0;
 	console_unlock();
-
-	/*
-	 * While suspended, new records may have been added to the
-	 * ringbuffer. Wake up the kthread printers to print them.
-	 */
-	wake_up_klogd();
-
 	pr_flush(1000, true);
 }
 
@@ -2588,9 +2651,10 @@ static int console_cpu_notify(unsigned int cpu)
 		/* If trylock fails, someone else is doing the printing */
 		if (console_trylock())
 			console_unlock();
-
-		/* Wake kthread printers. Some may have become usable. */
-		wake_up_klogd();
+		else {
+			/* Some kthread printers may have become usable. */
+			wake_up_klogd();
+		}
 	}
 	return 0;
 }
@@ -2610,6 +2674,7 @@ void console_lock(void)
 	down_console_sem();
 	if (console_suspended)
 		return;
+	console_kthreads_block();
 	console_locked = 1;
 	console_may_schedule = 1;
 }
@@ -2631,6 +2696,10 @@ int console_trylock(void)
 		up_console_sem();
 		return 0;
 	}
+	if (!console_kthreads_atomic_tryblock()) {
+		up_console_sem();
+		return 0;
+	}
 	console_locked = 1;
 	console_may_schedule = 0;
 	return 1;
@@ -2639,7 +2708,7 @@ EXPORT_SYMBOL(console_trylock);
 
 int is_console_locked(void)
 {
-	return console_locked;
+	return (console_locked || atomic_read(&console_kthreads_active));
 }
 EXPORT_SYMBOL(is_console_locked);
 
@@ -2691,6 +2760,22 @@ static inline bool console_is_usable(struct console *con)
 static void __console_unlock(void)
 {
 	console_locked = 0;
+
+	/*
+	 * Depending on whether console_lock() or console_trylock() was used,
+	 * appropriately allow the kthread printers to continue.
+	 */
+	if (console_kthreads_blocked)
+		console_kthreads_unblock();
+	else
+		console_kthreads_atomic_unblock();
+
+	/*
+	 * New records may have arrived while the console was locked.
+	 * Wake the kthread printers to print them.
+	 */
+	wake_up_klogd();
+
 	up_console_sem();
 }
 
@@ -2706,19 +2791,18 @@ static void __console_unlock(void)
  * If dropped messages should be printed, @dropped_text is a buffer of size
  * DROPPED_TEXT_MAX. Otherwise @dropped_text must be NULL.
  *
- * Requires the console_lock.
- *
  * Returns false if the given console has no next record to print, otherwise
  * true.
  *
  * @handover will be set to true if a printk waiter has taken over the
  * console_lock, in which case the caller is no longer holding the
- * console_lock. Otherwise it is set to false.
+ * console_lock. Otherwise it is set to false. A NULL pointer may be provided
+ * to disable allowing the console_lock to be taken over by a printk waiter.
  */
-static bool console_emit_next_record(struct console *con, char *text, char *ext_text,
-				     char *dropped_text, bool *handover)
+static bool __console_emit_next_record(struct console *con, char *text, char *ext_text,
+				       char *dropped_text, bool *handover)
 {
-	static int panic_console_dropped;
+	static atomic_t panic_console_dropped = ATOMIC_INIT(0);
 	struct printk_info info;
 	struct printk_record r;
 	unsigned long flags;
@@ -2727,7 +2811,8 @@ static bool console_emit_next_record(struct console *con, char *text, char *ext_
 
 	prb_rec_init_rd(&r, &info, text, CONSOLE_LOG_MAX);
 
-	*handover = false;
+	if (handover)
+		*handover = false;
 
 	if (!prb_read_valid(prb, con->seq, &r))
 		return false;
@@ -2735,7 +2820,8 @@ static bool console_emit_next_record(struct console *con, char *text, char *ext_
 	if (con->seq != r.info->seq) {
 		con->dropped += r.info->seq - con->seq;
 		con->seq = r.info->seq;
-		if (panic_in_progress() && panic_console_dropped++ > 10) {
+		if (panic_in_progress() &&
+		    atomic_fetch_inc_relaxed(&panic_console_dropped) > 10) {
 			suppress_panic_printk = 1;
 			pr_warn_once("Too many dropped messages. Suppress messages on non-panic CPUs to prevent livelock.\n");
 		}
@@ -2757,31 +2843,52 @@ static bool console_emit_next_record(struct console *con, char *text, char *ext_
 		len = record_print_text(&r, console_msg_format & MSG_FORMAT_SYSLOG, printk_time);
 	}
 
-	/*
-	 * While actively printing out messages, if another printk()
-	 * were to occur on another CPU, it may wait for this one to
-	 * finish. This task can not be preempted if there is a
-	 * waiter waiting to take over.
-	 *
-	 * Interrupts are disabled because the hand over to a waiter
-	 * must not be interrupted until the hand over is completed
-	 * (@console_waiter is cleared).
-	 */
-	printk_safe_enter_irqsave(flags);
-	console_lock_spinning_enable();
+	if (handover) {
+		/*
+		 * While actively printing out messages, if another printk()
+		 * were to occur on another CPU, it may wait for this one to
+		 * finish. This task can not be preempted if there is a
+		 * waiter waiting to take over.
+		 *
+		 * Interrupts are disabled because the hand over to a waiter
+		 * must not be interrupted until the hand over is completed
+		 * (@console_waiter is cleared).
+		 */
+		printk_safe_enter_irqsave(flags);
+		console_lock_spinning_enable();
+
+		/* don't trace irqsoff print latency */
+		stop_critical_timings();
+	}
 
-	stop_critical_timings();	/* don't trace print latency */
 	call_console_driver(con, write_text, len, dropped_text);
-	start_critical_timings();
 
 	con->seq++;
 
-	*handover = console_lock_spinning_disable_and_check();
-	printk_safe_exit_irqrestore(flags);
+	if (handover) {
+		start_critical_timings();
+		*handover = console_lock_spinning_disable_and_check();
+		printk_safe_exit_irqrestore(flags);
+	}
 skip:
 	return true;
 }
 
+/*
+ * Print a record for a given console, but allow another printk() caller to
+ * take over the console_lock and continue printing.
+ *
+ * Requires the console_lock, but depending on @handover after the call, the
+ * caller may no longer have the console_lock.
+ *
+ * See __console_emit_next_record() for argument details.
+ */
+static bool console_emit_next_record_transferable(struct console *con, char *text, char *ext_text,
+						  char *dropped_text, bool *handover)
+{
+	return __console_emit_next_record(con, text, ext_text, dropped_text, handover);
+}
+
 /*
  * Print out all remaining records to all consoles.
  *
@@ -2824,13 +2931,11 @@ static bool console_flush_all(bool do_cond_resched, u64 *next_seq, bool *handove
 
 			if (con->flags & CON_EXTENDED) {
 				/* Extended consoles do not print "dropped messages". */
-				progress = console_emit_next_record(con, &text[0],
-								    &ext_text[0], NULL,
-								    handover);
+				progress = console_emit_next_record_transferable(con, &text[0],
+								&ext_text[0], NULL, handover);
 			} else {
-				progress = console_emit_next_record(con, &text[0],
-								    NULL, &dropped_text[0],
-								    handover);
+				progress = console_emit_next_record_transferable(con, &text[0],
+								NULL, &dropped_text[0], handover);
 			}
 			if (*handover)
 				return true;
@@ -2946,6 +3051,10 @@ void console_unblank(void)
 	if (oops_in_progress) {
 		if (down_trylock_console_sem() != 0)
 			return;
+		if (!console_kthreads_atomic_tryblock()) {
+			up_console_sem();
+			return;
+		}
 	} else
 		console_lock();
 
@@ -3028,10 +3137,6 @@ void console_start(struct console *console)
 	console_lock();
 	console->flags |= CON_ENABLED;
 	console_unlock();
-
-	/* Wake the newly enabled kthread printer. */
-	wake_up_klogd();
-
 	__pr_flush(console, 1000, true);
 }
 EXPORT_SYMBOL(console_start);
@@ -3230,7 +3335,11 @@ void register_console(struct console *newcon)
 	if (newcon->flags & CON_EXTENDED)
 		nr_ext_console_drivers++;
 
+	if (console_kthreads_blocked)
+		newcon->flags |= CON_THD_BLOCKED;
+
 	newcon->dropped = 0;
+	mutex_init(&newcon->lock);
 	if (newcon->flags & CON_PRINTBUFFER) {
 		/* Get a consistent copy of @syslog_seq. */
 		mutex_lock(&syslog_lock);
@@ -3510,6 +3619,19 @@ static void printk_fallback_preferred_direct(void)
 	printk_kthreads_available = false;
 }
 
+/*
+ * Print a record for a given console, not allowing another printk() caller
+ * to take over. This is appropriate for contexts that do not have the
+ * console_lock.
+ *
+ * See __console_emit_next_record() for argument details.
+ */
+static bool console_emit_next_record(struct console *con, char *text, char *ext_text,
+				     char *dropped_text)
+{
+	return __console_emit_next_record(con, text, ext_text, dropped_text, NULL);
+}
+
 static bool printer_should_wake(struct console *con, u64 seq)
 {
 	short flags;
@@ -3517,9 +3639,6 @@ static bool printer_should_wake(struct console *con, u64 seq)
 	if (kthread_should_stop() || !printk_kthreads_available)
 		return true;
 
-	if (console_suspended)
-		return false;
-
 	if (!con->write)
 		return false;
 
@@ -3530,8 +3649,12 @@ static bool printer_should_wake(struct console *con, u64 seq)
 	 * attempting to print and instead go back to sleep.
 	 */
 	flags = data_race(READ_ONCE(con->flags));
-	if (!(flags & CON_ENABLED))
+
+	if (!(flags & CON_ENABLED) ||
+	    (flags & CON_THD_BLOCKED) ||
+	    console_kthreads_atomically_blocked()) {
 		return false;
+	}
 
 	if (atomic_read(&printk_prefer_direct))
 		return false;
@@ -3544,7 +3667,6 @@ static int printk_kthread_func(void *data)
 	struct console *con = data;
 	char *dropped_text = NULL;
 	char *ext_text = NULL;
-	bool handover;
 	u64 seq = 0;
 	char *text;
 	int error;
@@ -3595,15 +3717,18 @@ static int printk_kthread_func(void *data)
 		if (error)
 			continue;
 
-		console_lock();
+		error = mutex_lock_interruptible(&con->lock);
+		if (error)
+			continue;
 
-		if (console_suspended) {
-			__console_unlock();
+		if (!console_is_usable(con)) {
+			mutex_unlock(&con->lock);
 			continue;
 		}
 
-		if (!console_is_usable(con)) {
-			__console_unlock();
+		if ((con->flags & CON_THD_BLOCKED) ||
+		    !console_kthread_printing_tryenter()) {
+			mutex_unlock(&con->lock);
 			continue;
 		}
 
@@ -3616,13 +3741,13 @@ static int printk_kthread_func(void *data)
 		 * which can conditionally invoke cond_resched().
 		 */
 		console_may_schedule = 0;
-		console_emit_next_record(con, text, ext_text, dropped_text, &handover);
-		if (handover)
-			continue;
+		console_emit_next_record(con, text, ext_text, dropped_text);
 
 		seq = con->seq;
 
-		__console_unlock();
+		console_kthread_printing_exit();
+
+		mutex_unlock(&con->lock);
 	}
 
 	printk_console_msg(con, KERN_INFO, "printing thread stopped");
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH printk v2 12/12] printk: remove @console_locked
  2022-04-05 13:25 [PATCH printk v2 00/12] implement threaded console printing John Ogness
                   ` (10 preceding siblings ...)
  2022-04-05 13:25 ` [PATCH printk v2 11/12] printk: extend console_lock for proper kthread support John Ogness
@ 2022-04-05 13:25 ` John Ogness
  2022-04-05 15:03 ` [PATCH printk v2 00/12] implement threaded console printing Andy Shevchenko
  12 siblings, 0 replies; 37+ messages in thread
From: John Ogness @ 2022-04-05 13:25 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner, linux-kernel

The static global variable @console_locked is used to help debug
VT code to make sure that certain code paths are running with
the console_lock held. However, this information is also available
with the static global variable @console_kthreads_blocked (for
locking via console_lock()), and the static global variable
@console_kthreads_active (for locking via console_trylock()).

Remove @console_locked and update is_console_locked() to use the
alternative variables.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
---
 kernel/printk/printk.c | 29 ++++++++++++++---------------
 1 file changed, 14 insertions(+), 15 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index a5525a7b7bc5..2da66adac87e 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -340,15 +340,7 @@ static void console_kthreads_unblock(void)
 	console_kthreads_blocked = false;
 }
 
-/*
- * This is used for debugging the mess that is the VT code by
- * keeping track if we have the console semaphore held. It's
- * definitely not the perfect debug tool (we don't know if _WE_
- * hold it and are racing, but it helps tracking those weird code
- * paths in the console code where we end up in places I want
- * locked without the console semaphore held).
- */
-static int console_locked, console_suspended;
+static int console_suspended;
 
 /*
  *	Array of consoles built from command line options (console=)
@@ -2675,7 +2667,6 @@ void console_lock(void)
 	if (console_suspended)
 		return;
 	console_kthreads_block();
-	console_locked = 1;
 	console_may_schedule = 1;
 }
 EXPORT_SYMBOL(console_lock);
@@ -2700,15 +2691,26 @@ int console_trylock(void)
 		up_console_sem();
 		return 0;
 	}
-	console_locked = 1;
 	console_may_schedule = 0;
 	return 1;
 }
 EXPORT_SYMBOL(console_trylock);
 
+/*
+ * This is used to help to make sure that certain paths within the VT code are
+ * running with the console lock held. It is definitely not the perfect debug
+ * tool (it is not known if the VT code is the task holding the console lock),
+ * but it helps tracking those weird code paths in the console code such as
+ * when the console is suspended: where the console is not locked but no
+ * console printing may occur.
+ *
+ * Note: This returns true when the console is suspended but is not locked.
+ *       This is intentional because the VT code must consider that situation
+ *       the same as if the console was locked.
+ */
 int is_console_locked(void)
 {
-	return (console_locked || atomic_read(&console_kthreads_active));
+	return (console_kthreads_blocked || atomic_read(&console_kthreads_active));
 }
 EXPORT_SYMBOL(is_console_locked);
 
@@ -2759,8 +2761,6 @@ static inline bool console_is_usable(struct console *con)
 
 static void __console_unlock(void)
 {
-	console_locked = 0;
-
 	/*
 	 * Depending on whether console_lock() or console_trylock() was used,
 	 * appropriately allow the kthread printers to continue.
@@ -3058,7 +3058,6 @@ void console_unblank(void)
 	} else
 		console_lock();
 
-	console_locked = 1;
 	console_may_schedule = 0;
 	for_each_console(c)
 		if ((c->flags & CON_ENABLED) && c->unblank)
-- 
2.30.2



* Re: [PATCH printk v2 00/12] implement threaded console printing
  2022-04-05 13:25 [PATCH printk v2 00/12] implement threaded console printing John Ogness
                   ` (11 preceding siblings ...)
  2022-04-05 13:25 ` [PATCH printk v2 12/12] printk: remove @console_locked John Ogness
@ 2022-04-05 15:03 ` Andy Shevchenko
  2022-04-05 21:24   ` John Ogness
  12 siblings, 1 reply; 37+ messages in thread
From: Andy Shevchenko @ 2022-04-05 15:03 UTC (permalink / raw)
  To: John Ogness
  Cc: Petr Mladek, Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
	linux-kernel, Andrew Morton, Marco Elver, Randy Dunlap,
	Alexander Potapenko, Stephen Boyd, Nicholas Piggin,
	Greg Kroah-Hartman, Jiri Slaby, Paul E. McKenney,
	Frederic Weisbecker, Neeraj Upadhyay, Josh Triplett,
	Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes, Kees Cook,
	Luis Chamberlain, Wei Liu, Xiaoming Ni, Peter Zijlstra,
	Helge Deller, Sebastian Andrzej Siewior, Daniel Lezcano,
	Shawn Guo, Dmitry Torokhov, Matti Vaittinen, Eric W. Biederman,
	Phil Auld, Juri Lelli, Wang Qing, rcu

On Tue, Apr 05, 2022 at 03:31:23PM +0206, John Ogness wrote:
> This is v2 of a series to implement a kthread for each registered
> console. v1 is here [0]. The kthreads locklessly retrieve the
> records from the printk ringbuffer and also do not cause any lock
> contention between each other. This allows consoles to run at full
> speed. For example, a netconsole is able to dump records much
> faster than a serial or vt console. Also, during normal operation,
> it completely decouples printk() callers from console printing.
> 
> There are situations where kthread printing is not sufficient. For
> example, during panic situations, where the kthreads may not get a
> chance to schedule. In such cases, the current method of attempting
> to print directly within the printk() caller context is used. New
> functions printk_prefer_direct_enter() and
> printk_prefer_direct_exit() are made available to mark areas of the
> kernel where direct printing is preferred. (These should only be
> areas that do not occur during normal operation.)
> 
> This series also introduces pr_flush(): a might_sleep() function
> that will block until all active printing threads have caught up
> to the latest record at the time of the pr_flush() call. This
> function is useful, for example, to wait until pending records
> are flushed to consoles before suspending.
> 
> Note that this series does *not* increase the reliability of console
> printing. Rather it focuses on the non-interference aspect of
> printk() by decoupling printk() callers from printing (during normal
> operation). Nonetheless, the reliability aspect should not worsen
> due to this series.

From the list of patches, and the only one I am Cc'ed on, I can't find the
answer to my question, i.e. does it take care of console_unregister()
cases at run time? (We have some drivers that call ->exit() for the console
in that case; perhaps it should do something before calling it.)

It would be nice to see some pointers to where I can find the answer, and
maybe even have it described in the cover letter/commit message/documentation.

-- 
With Best Regards,
Andy Shevchenko




* Re: [PATCH printk v2 00/12] implement threaded console printing
  2022-04-05 15:03 ` [PATCH printk v2 00/12] implement threaded console printing Andy Shevchenko
@ 2022-04-05 21:24   ` John Ogness
  0 siblings, 0 replies; 37+ messages in thread
From: John Ogness @ 2022-04-05 21:24 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Petr Mladek, Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
	linux-kernel, Andrew Morton, Marco Elver, Randy Dunlap,
	Alexander Potapenko, Stephen Boyd, Nicholas Piggin,
	Greg Kroah-Hartman, Jiri Slaby, Paul E. McKenney,
	Frederic Weisbecker, Neeraj Upadhyay, Josh Triplett,
	Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes, Kees Cook,
	Luis Chamberlain, Wei Liu, Xiaoming Ni, Peter Zijlstra,
	Helge Deller, Sebastian Andrzej Siewior, Daniel Lezcano,
	Shawn Guo, Dmitry Torokhov, Matti Vaittinen, Eric W. Biederman,
	Phil Auld, Juri Lelli, Wang Qing, rcu

On 2022-04-05, Andy Shevchenko <andriy.shevchenko@linux.intel.com> wrote:
> From the list of patches and the only one I am Cc'ed to I can't find
> the answer to my question,

It seems lore is having trouble picking up the series. In the meantime, I've
pushed the series to github.com so you can see the code.

> i.e. does it take care about console_unregister() cases at run time?
> (We have some drivers to call ->exit() for console in that case,
> perhaps it should do something before calling it.)

Yes. In console_unregister(), the printing thread is shut down before the
actual console unregistration takes place. The thread is shut down here:

https://github.com/Linutronix/linux/blob/b27eb6c3d168d608b3a9d04578e030a3d79d889a/kernel/printk/printk.c#L3393

The console's exit() callback is below that:

https://github.com/Linutronix/linux/blob/b27eb6c3d168d608b3a9d04578e030a3d79d889a/kernel/printk/printk.c#L3429

> Would be nice to see some pointers where I can find the answer and
> maybe even describing in the cover-letter/commit
> message/documentation.

I suppose for the console thread patch [0] I could add a paragraph to
the commit message mentioning kernel thread shutdown for console
unregistration.

[0] https://github.com/Linutronix/linux/commit/e26ae677da2e339dd268c1f871b81e61e782393f

John


* Re: [PATCH printk v2 01/12] printk: rename cpulock functions
  2022-04-05 13:25 ` [PATCH printk v2 01/12] printk: rename cpulock functions John Ogness
@ 2022-04-06  9:07   ` Petr Mladek
  0 siblings, 0 replies; 37+ messages in thread
From: Petr Mladek @ 2022-04-06  9:07 UTC (permalink / raw)
  To: John Ogness
  Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
	linux-kernel, Andrew Morton, Marco Elver, Randy Dunlap,
	Alexander Potapenko, Stephen Boyd, Nicholas Piggin

On Tue 2022-04-05 15:31:24, John Ogness wrote:
> Since the printk cpulock is CPU-reentrant and since it is used
> in all contexts, its usage must be carefully considered and
> most likely will require programming locklessly. To avoid
> mistaking the printk cpulock as a typical lock, rename it to
> cpu_sync. The main functions then become:
> 
>     printk_cpu_sync_get_irqsave(flags);
>     printk_cpu_sync_put_irqrestore(flags);
> 
> Add extra notes of caution in the function description to help
> developers understand the requirements for correct usage.
>
> Signed-off-by: John Ogness <john.ogness@linutronix.de>

Looks good to me.

Reviewed-by: Petr Mladek <pmladek@suse.com>

Best Regards,
Petr


* Re: [PATCH printk v2 05/12] printk: add macro for console detail messages
  2022-04-05 13:25 ` [PATCH printk v2 05/12] printk: add macro for console detail messages John Ogness
@ 2022-04-06 10:31   ` Petr Mladek
  0 siblings, 0 replies; 37+ messages in thread
From: Petr Mladek @ 2022-04-06 10:31 UTC (permalink / raw)
  To: John Ogness
  Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner, linux-kernel

On Tue 2022-04-05 15:31:28, John Ogness wrote:
> It is useful to generate log messages that include details about
> the related console. Rather than duplicate the logic to assemble
> the details, put that logic into a macro printk_console_msg().
> 
> Once console printers become threaded, this macro will find many
> users.

I really like this idea.

> Signed-off-by: John Ogness <john.ogness@linutronix.de>
> ---
>  kernel/printk/printk.c | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index fb9a6eaf54fd..39ca9d758e52 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -2993,6 +2993,10 @@ static void try_enable_default_console(struct console *newcon)
>  		newcon->flags |= CON_CONSDEV;
>  }
>  
> +#define printk_console_msg(c, lvl, msg) \
> +	printk(lvl pr_fmt("%sconsole [%s%d] " msg "\n"), \
> +	       (c->flags & CON_BOOT) ? "boot" : "", c->name, c->index)

This might be dangerous because "msg" is handled as a printk format
string. Any stray %<modifier> in it might cause unexpected memory access.

Honestly, I am not good at writing macros. I played with it and
the following worked for me:

#define printk_console_msg(c, lvl, fmt, ...)			\
	printk(lvl pr_fmt("%sconsole [%s%d] " fmt),		\
	       (c->flags & CON_BOOT) ? "boot" : "",		\
	       c->name, c->index,				\
		##__VA_ARGS__)

Note that it requires adding "\n" back to the callers.
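
The hazard Petr describes is the classic format-string problem, and it can be
sketched in plain userspace C. The struct and macro below are illustrative
stand-ins, not code from the patch; they only show why keeping the caller's
text in the format position with explicit varargs is the safe shape:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Illustrative stand-in for a console; not from the patch. */
struct fake_console {
	const char *name;
	int index;
};

/*
 * Safe shape, mirroring Petr's corrected macro: the caller's text is a
 * real format string followed by explicit varargs, so every '%' it
 * contains is matched by an argument the caller passed on purpose.
 *
 * The dangerous variant would paste the text into the format string at
 * build time ("console [...] " msg): a literal "%s" inside msg would
 * then make snprintf() read an argument that was never passed.
 */
#define console_printk(buf, sz, c, fmt, ...) \
	snprintf(buf, sz, "console [%s%d] " fmt, \
		 (c)->name, (c)->index, ##__VA_ARGS__)
```

Callers that need extra arguments then pass them explicitly, e.g.
console_printk(buf, sizeof(buf), &con, "dropped %d messages", 10).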

Also, please call it "console_printk()" or similar. It needs extra
arguments, and this name will help distinguish it from the plain
printk() wrappers. From my POV, it is analogous to, for example,
dev_printk(), snd_printk(), or xfs_printk_ratelimited().

Best Regards,
Petr


* Re: [PATCH printk v2 06/12] printk: refactor and rework printing logic
  2022-04-05 13:25 ` [PATCH printk v2 06/12] printk: refactor and rework printing logic John Ogness
@ 2022-04-06 14:02   ` Petr Mladek
  0 siblings, 0 replies; 37+ messages in thread
From: Petr Mladek @ 2022-04-06 14:02 UTC (permalink / raw)
  To: John Ogness
  Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
	linux-kernel, Greg Kroah-Hartman

On Tue 2022-04-05 15:31:29, John Ogness wrote:
> Refactor/rework printing logic in order to prepare for moving to
> threaded console printing.
> 
> - Move @console_seq into struct console so that the current
>   "position" of each console can be tracked individually.
> 
> - Move @console_dropped into struct console so that the current drop
>   count of each console can be tracked individually.
> 
> - Modify printing logic so that each console independently loads,
>   prepares, and prints its next record.
> 
> - Remove exclusive_console logic. Since console positions are
>   handled independently, replaying past records occurs naturally.
> 
> - Update the comments explaining why preemption is disabled while
>   printing from printk() context.
> 
> With these changes, there is a change in behavior: the console
> replaying the log (formerly exclusive console) will no longer block
> other consoles. New messages appear on the other consoles while the
> newly added console is still replaying.
> 
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> +/*
> + * Print out all remaining records to all consoles.
> + *
> + * Requires the console_lock.

It would be nice to document @do_cond_resched, something like:

 * @do_cond_resched is set by the caller. It can be true only in a
   schedulable context.

> + *
> + * Returns true if a console was available for flushing, otherwise false.

It is not completely true after adding abandon_console_lock_in_panic().
But it is actually a chance to slightly change the meaning. I suggest
to define it as:

 * Returns true when there was at least one usable console and all
   messages were flushed to all usable consoles.

It will help to better define the meaning of @next_seq. And it is exactly
the meaning that we need in console_unlock(); see below.

> + *
> + * @next_seq is set to the highest sequence number of all of the consoles that
> + * were flushed.

The value is meaningful only when all consoles were flushed. It would
be nice to make this clearer. If we changed the meaning of the return
value, then we could say:

 * @next_seq is the sequence number after the last available record.
 * The value is valid only when this function returns true. It means
 * that usable consoles are completely flushed.


> + *
> + * @handover will be set to true if a printk waiter has taken over the
> + * console_lock, in which case the caller is no longer holding the
> + * console_lock. Otherwise it is set to false.
> + */
> +static bool console_flush_all(bool do_cond_resched, u64 *next_seq, bool *handover)
> +{
> +	bool any_usable = false;
> +	struct console *con;
> +	bool any_progress;
> +
> +	*next_seq = 0;
> +	*handover = false;
> +
> +	do {
> +		any_progress = false;
> +
> +		for_each_console(con) {
> +			bool progress;
> +
> +			if (!console_is_usable(con))
> +				continue;
> +			any_usable = true;
> +
> +			progress = console_emit_next_record(con, handover);
> +			if (*handover)
> +				return true;

The return value is not used in this case. I would return "false"
here and update the meaning as suggested above.

> +			/* Track the highest seq flushed. */
> +			if (con->seq > *next_seq)
> +				*next_seq = con->seq;
> +
> +			if (!progress)
> +				continue;
> +			any_progress = true;
> +
> +			/* Allow panic_cpu to take over the consoles safely */
> +			if (abandon_console_lock_in_panic())
> +				return false;
> +
> +			if (do_cond_resched)
> +				cond_resched();
> +		}
> +	} while (any_progress);
> +
> +	return any_usable;
>  }
>  
>  /**
> @@ -2686,125 +2790,33 @@ void console_unlock(void)
>  	 * between lines if allowable.  Not doing so can cause a very long
>  	 * scheduling stall on a slow console leading to RCU stall and
>  	 * softlockup warnings which exacerbate the issue with more
> -	 * messages practically incapacitating the system.
> -	 *
> -	 * console_trylock() is not able to detect the preemptive
> -	 * context reliably. Therefore the value must be stored before
> -	 * and cleared after the "again" goto label.
> +	 * messages practically incapacitating the system. Therefore, create
> +	 * a local to use for the printing loop.
>  	 */
>  	do_cond_resched = console_may_schedule;
> -again:
> -	console_may_schedule = 0;
> -
> -	/*
> -	 * We released the console_sem lock, so we need to recheck if
> -	 * cpu is online and (if not) is there at least one CON_ANYTIME
> -	 * console.
> -	 */
> -	if (!can_use_console()) {
> -		console_locked = 0;
> -		up_console_sem();
> -		return;
> -	}
>  
> -	for (;;) {
> -		size_t ext_len = 0;
> -		int handover;
> -		size_t len;
> +	do {
> +		console_may_schedule = 0;
>  
> -skip:
> -		if (!prb_read_valid(prb, console_seq, &r))
> +		flushed = console_flush_all(do_cond_resched, &next_seq, &handover);
> +		if (handover)
>  			break;
>  
> -		if (console_seq != r.info->seq) {
> -			console_dropped += r.info->seq - console_seq;
> -			console_seq = r.info->seq;
> -			if (panic_in_progress() && panic_console_dropped++ > 10) {
> -				suppress_panic_printk = 1;
> -				pr_warn_once("Too many dropped messages. Suppress messages on non-panic CPUs to prevent livelock.\n");
> -			}
> -		}
> -
> -		if (suppress_message_printing(r.info->level)) {
> -			/*
> -			 * Skip record we have buffered and already printed
> -			 * directly to the console when we received it, and
> -			 * record that has level above the console loglevel.
> -			 */
> -			console_seq++;
> -			goto skip;
> -		}
> -
> -		/* Output to all consoles once old messages replayed. */
> -		if (unlikely(exclusive_console &&
> -			     console_seq >= exclusive_console_stop_seq)) {
> -			exclusive_console = NULL;
> -		}
> +		__console_unlock();
>  
> -		/*
> -		 * Handle extended console text first because later
> -		 * record_print_text() will modify the record buffer in-place.
> -		 */
> -		if (nr_ext_console_drivers) {
> -			ext_len = info_print_ext_header(ext_text,
> -						sizeof(ext_text),
> -						r.info);
> -			ext_len += msg_print_ext_body(ext_text + ext_len,
> -						sizeof(ext_text) - ext_len,
> -						&r.text_buf[0],
> -						r.info->text_len,
> -						&r.info->dev_info);
> -		}
> -		len = record_print_text(&r,
> -				console_msg_format & MSG_FORMAT_SYSLOG,
> -				printk_time);
> -		console_seq++;
> +		/* Were there any consoles available for flushing? */

The value is also false when abandon_console_lock_in_panic() returned
true. We basically want to break here when we could not do the job.
I would write something like:

		/*
		 * Do not try again when we couldn't do the job. It avoids
		 * an infinite loop.
		 */

> +		if (!flushed)
> +			break;
>  
>  		/*
> -		 * While actively printing out messages, if another printk()
> -		 * were to occur on another CPU, it may wait for this one to
> -		 * finish. This task can not be preempted if there is a
> -		 * waiter waiting to take over.
> -		 *
> -		 * Interrupts are disabled because the hand over to a waiter
> -		 * must not be interrupted until the hand over is completed
> -		 * (@console_waiter is cleared).
> +		 * Some context may have added new records after
> +		 * console_flush_all() but before unlocking the console.
> +		 * Re-check if there is a new record to flush. If the trylock
> +		 * fails, another context is already handling the printing.
>  		 */
> -		printk_safe_enter_irqsave(flags);
> -		console_lock_spinning_enable();
> -
> -		stop_critical_timings();	/* don't trace print latency */
> -		call_console_drivers(ext_text, ext_len, text, len);
> -		start_critical_timings();
> -
> -		handover = console_lock_spinning_disable_and_check();
> -		printk_safe_exit_irqrestore(flags);
> -		if (handover)
> -			return;
> -
> -		/* Allow panic_cpu to take over the consoles safely */
> -		if (abandon_console_lock_in_panic())
> -			break;
> -
> -		if (do_cond_resched)
> -			cond_resched();
> -	}
> -
> -	/* Get consistent value of the next-to-be-used sequence number. */
> -	next_seq = console_seq;
> -
> -	console_locked = 0;
> -	up_console_sem();
> -
> -	/*
> -	 * Someone could have filled up the buffer again, so re-check if there's
> -	 * something to flush. In case we cannot trylock the console_sem again,
> -	 * there's a new owner and the console_unlock() from them will do the
> -	 * flush, no worries.
> -	 */
> -	retry = prb_read_valid(prb, next_seq, NULL);
> -	if (retry && !abandon_console_lock_in_panic() && console_trylock())
> -		goto again;
> +	} while (prb_read_valid(prb, next_seq, NULL) &&
> +		 !abandon_console_lock_in_panic() &&

This is no longer needed. Either all messages were processed, or we
never get here because console_flush_all() returned false.

Another motivation to remove it is that it is a bit racy here. The state
should only be checked under the console lock.

It is not a big deal. But I would prefer to remove it because it makes
false assumptions about the logic.

> +		 console_trylock());
>  }
>  EXPORT_SYMBOL(console_unlock);


Summary: All the proposed changes are cosmetic. And I like the new code.

Best Regards,
Petr


* Re: [PATCH printk v2 07/12] printk: move buffer definitions into console_emit_next_record() caller
  2022-04-05 13:25 ` [PATCH printk v2 07/12] printk: move buffer definitions into console_emit_next_record() caller John Ogness
@ 2022-04-06 14:40   ` Petr Mladek
  0 siblings, 0 replies; 37+ messages in thread
From: Petr Mladek @ 2022-04-06 14:40 UTC (permalink / raw)
  To: John Ogness
  Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner, linux-kernel

On Tue 2022-04-05 15:31:30, John Ogness wrote:
> Extended consoles print extended messages and do not print messages about
> dropped records.
> 
> Non-extended consoles print "normal" messages as well as extra messages
> about dropped records.
> 
> Currently the buffers for these various message types are defined within
> the functions that might use them and their usage is based upon the
> CON_EXTENDED flag. This will be a problem when moving to kthread printers
> because each printer must be able to provide its own buffers.
> 
> Move all the message buffer definitions outside of
> console_emit_next_record(). The caller knows if extended or dropped
> messages should be printed and can specify the appropriate buffers to
> use. The console_emit_next_record() and call_console_driver() functions
> can know what to print based on whether specified buffers are non-NULL.
> 
> With this change, buffer definition/allocation/specification is separated
> from the code that does the various types of string printing.
> 
> Signed-off-by: John Ogness <john.ogness@linutronix.de>

I feel slightly sad that we replace the safe sizeof(buffer)
with a hardcoded size. But the change makes sense. And I think
that it is not worth passing the sizes as parameters.

Reviewed-by: Petr Mladek <pmladek@suse.com>

Best Regards,
Petr


* Re: [PATCH printk v2 08/12] printk: add pr_flush()
  2022-04-05 13:25 ` [PATCH printk v2 08/12] printk: add pr_flush() John Ogness
@ 2022-04-06 15:17   ` Petr Mladek
  2022-04-08 18:57     ` John Ogness
  0 siblings, 1 reply; 37+ messages in thread
From: Petr Mladek @ 2022-04-06 15:17 UTC (permalink / raw)
  To: John Ogness
  Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner, linux-kernel

On Tue 2022-04-05 15:31:31, John Ogness wrote:
> Provide a might-sleep function to allow waiting for console printers
> to catch up to the latest logged message.
> 
> Use pr_flush() whenever it is desirable to get buffered messages
> printed before continuing: suspend_console(), resume_console(),
> console_stop(), console_start(), console_unblank().
> 
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 06f0b29d909d..d0369afaf365 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -3321,6 +3331,79 @@ static int __init printk_late_init(void)
>  late_initcall(printk_late_init);
>  
>  #if defined CONFIG_PRINTK
> +/* If @con is specified, only wait for that console. Otherwise wait for all. */
> +static bool __pr_flush(struct console *con, int timeout_ms, bool reset_on_progress)
> +{
> +	int remaining = timeout_ms;
> +	struct console *c;
> +	u64 last_diff = 0;
> +	u64 printk_seq;
> +	u64 diff;
> +	u64 seq;
> +
> +	might_sleep();
> +
> +	seq = prb_next_seq(prb);

I suggest adding here:

	/*
	 * Try to flush the messages when kthreads are not available
	 * and there is no other console lock owner.
	 */
	if (console_trylock())
		console_unlock();

> +
> +	for (;;) {
> +		diff = 0;
> +
> +		console_lock();
> +		for_each_console(c) {
> +			if (con && con != c)
> +				continue;
> +			if (!console_is_usable(c))
> +				continue;
> +			printk_seq = c->seq;
> +			if (printk_seq < seq)
> +				diff += seq - printk_seq;
> +		}
> +		console_unlock();

This is a bit sub-optimal when the kthreads are not available or
are disabled. In this case, the messages are flushed [*] by
the console_unlock() and the diff is outdated.

As a result, the code would unnecessarily sleep 100ms before
it realizes that the job is done.

One might argue that the problem occurs only when there are
unhandled messages, in situations where the delay probably
is not critical.

Well, it is ugly to keep it this way. I suggest adding the extra
console_trylock()/console_unlock() at the beginning.

> +
> +		if (diff != last_diff && reset_on_progress)
> +			remaining = timeout_ms;
> +
> +		if (diff == 0 || remaining == 0)
> +			break;
> +
> +		if (remaining < 0) {
> +			/* no timeout limit */
> +			msleep(100);
> +		} else if (remaining < 100) {
> +			msleep(remaining);
> +			remaining = 0;
> +		} else {
> +			msleep(100);
> +			remaining -= 100;
> +		}
> +
> +		last_diff = diff;
> +	}
> +
> +	return (diff == 0);
> +}

Otherwise, it looks good to me. It is an interesting function.
I wonder if it gets popular also outside printk code. ;-)

Best Regards,
Petr


* Re: [PATCH printk v2 09/12] printk: add functions to prefer direct printing
  2022-04-05 13:25 ` [PATCH printk v2 09/12] printk: add functions to prefer direct printing John Ogness
@ 2022-04-07  9:56   ` Petr Mladek
  2022-04-07 13:35     ` Helge Deller
  2022-04-09 15:57   ` Paul E. McKenney
  1 sibling, 1 reply; 37+ messages in thread
From: Petr Mladek @ 2022-04-07  9:56 UTC (permalink / raw)
  To: John Ogness
  Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
	linux-kernel, Greg Kroah-Hartman, Jiri Slaby, Paul E. McKenney,
	Frederic Weisbecker, Neeraj Upadhyay, Josh Triplett,
	Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes, Kees Cook,
	Andrew Morton, Luis Chamberlain, Wei Liu, Xiaoming Ni,
	Peter Zijlstra, Andy Shevchenko, Helge Deller, Marco Elver,
	Sebastian Andrzej Siewior, Daniel Lezcano, Shawn Guo,
	Dmitry Torokhov, Matti Vaittinen, Eric W. Biederman, Phil Auld,
	Juri Lelli, Wang Qing, rcu

On Tue 2022-04-05 15:31:32, John Ogness wrote:
> Once kthread printing is available, console printing will no longer
> occur in the context of the printk caller. However, there are some
> special contexts where it is desirable for the printk caller to
> directly print out kernel messages. Using pr_flush() to wait for
> threaded printers is only possible if the caller is in a sleepable
> context and the kthreads are active. That is not always the case.
> 
> Introduce printk_prefer_direct_enter() and printk_prefer_direct_exit()
> functions to explicitly (and globally) activate/deactivate preferred
> direct console printing. The term "direct console printing" refers to
> printing to all enabled consoles from the context of the printk
> caller. The term "prefer" is used because this type of printing is
> only best effort. If the console is currently locked or other
> printers are already actively printing, the printk caller will need
> to rely on the other contexts to handle the printing.
> 
> This preferred direct printing is how all printing is currently
> handled (unless explicitly deferred).
> 
> When kthread printing is introduced, there may be some unanticipated
> problems due to kthreads being unable to flush important messages.
> In order to minimize such risks, preferred direct printing is
> activated for the primary important messages when the system
> experiences general types of major errors. These are:
> 
>  - emergency reboot/shutdown
>  - cpu and rcu stalls
>  - hard and soft lockups
>  - hung tasks
>  - warn
>  - sysrq
> 
> Note that since kthread printing does not yet exist, no behavior
> changes result from this commit. This is only implementing the
> counter and marking the various places where preferred direct
> printing is active.
> 
> Signed-off-by: John Ogness <john.ogness@linutronix.de>

Looks good to me. Let's see how it works in practice.
It is possible that we will need to add it in more
locations. But it is also possible that we will be
able to remove it somewhere.

Reviewed-by: Petr Mladek <pmladek@suse.com>

Best Regards,
Petr


* Re: [PATCH printk v2 09/12] printk: add functions to prefer direct printing
  2022-04-07  9:56   ` Petr Mladek
@ 2022-04-07 13:35     ` Helge Deller
  2022-04-07 14:35       ` John Ogness
  0 siblings, 1 reply; 37+ messages in thread
From: Helge Deller @ 2022-04-07 13:35 UTC (permalink / raw)
  To: Petr Mladek, John Ogness
  Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
	linux-kernel, Greg Kroah-Hartman, Jiri Slaby, Paul E. McKenney,
	Frederic Weisbecker, Neeraj Upadhyay, Josh Triplett,
	Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes, Kees Cook,
	Andrew Morton, Luis Chamberlain, Wei Liu, Xiaoming Ni,
	Peter Zijlstra, Andy Shevchenko, Marco Elver,
	Sebastian Andrzej Siewior, Daniel Lezcano, Shawn Guo,
	Dmitry Torokhov, Matti Vaittinen, Eric W. Biederman, Phil Auld,
	Juri Lelli, Wang Qing, rcu

On 4/7/22 11:56, Petr Mladek wrote:
> On Tue 2022-04-05 15:31:32, John Ogness wrote:
>> Once kthread printing is available, console printing will no longer
>> occur in the context of the printk caller. However, there are some
>> special contexts where it is desirable for the printk caller to
>> directly print out kernel messages. Using pr_flush() to wait for
>> threaded printers is only possible if the caller is in a sleepable
>> context and the kthreads are active. That is not always the case.
>>
>> Introduce printk_prefer_direct_enter() and printk_prefer_direct_exit()
>> functions to explicitly (and globally) activate/deactivate preferred
>> direct console printing. The term "direct console printing" refers to
>> printing to all enabled consoles from the context of the printk
>> caller. The term "prefer" is used because this type of printing is
>> only best effort. If the console is currently locked or other
>> printers are already actively printing, the printk caller will need
>> to rely on the other contexts to handle the printing.
>>
>> This preferred direct printing is how all printing is currently
>> handled (unless explicitly deferred).
>>
>> When kthread printing is introduced, there may be some unanticipated
>> problems due to kthreads being unable to flush important messages.
>> In order to minimize such risks, preferred direct printing is
>> activated for the primary important messages when the system
>> experiences general types of major errors. These are:
>>
>>  - emergency reboot/shutdown
>>  - cpu and rcu stalls
>>  - hard and soft lockups
>>  - hung tasks
>>  - warn
>>  - sysrq

I agree that those need direct printing.
Did you maybe check how much (e.g. in percent) of a normal log is then
still left to be printed threaded?  In case it's not much,
is there still a benefit to introducing threaded printing?

Or another example, e.g. when running on a slow console (the best
testcase I know of is a 9600 baud serial port), where some printks
are threaded. When a warn printk() has to be shown, doesn't that mean
all threaded printks have to be flushed (waited for) before
the warning can be shown? Will there then still be a measurable time benefit?

Just wondering...

Helge


* Re: [PATCH printk v2 09/12] printk: add functions to prefer direct printing
  2022-04-07 13:35     ` Helge Deller
@ 2022-04-07 14:35       ` John Ogness
  2022-04-07 19:36         ` Helge Deller
  0 siblings, 1 reply; 37+ messages in thread
From: John Ogness @ 2022-04-07 14:35 UTC (permalink / raw)
  To: Helge Deller, Petr Mladek
  Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
	linux-kernel, Greg Kroah-Hartman, Jiri Slaby, Paul E. McKenney,
	Frederic Weisbecker, Neeraj Upadhyay, Josh Triplett,
	Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes, Kees Cook,
	Andrew Morton, Luis Chamberlain, Wei Liu, Xiaoming Ni,
	Peter Zijlstra, Andy Shevchenko, Marco Elver,
	Sebastian Andrzej Siewior, Daniel Lezcano, Shawn Guo,
	Dmitry Torokhov, Matti Vaittinen, Eric W. Biederman, Phil Auld,
	Juri Lelli, Wang Qing, rcu

On 2022-04-07, Helge Deller <deller@gmx.de> wrote:
>>>  - emergency reboot/shutdown
>>>  - cpu and rcu stalls
>>>  - hard and soft lockups
>>>  - hung tasks
>>>  - warn
>>>  - sysrq
>
> I agree that those need direct printing.
> Did you maybe check how much (e.g. in percent) of a normal log is
> then still left to be printed threaded?  In case it's not much,
> is there still a benefit to introducing threaded printing?

A "normal" log can still contain many many messages. There have been
reports [0] where the system died because _printk_ killed it. This was
not any of the scenarios that we want direct printing for. It was just a
lot of messages that some driver wanted to output.

Until now the response was always to reduce printk usage or avoid it
altogether. printk has a bad reputation because it is so unpredictable
and expensive. So we are fixing printk. With threaded printers, we make
a dramatic change. printk becomes 100% lockless and consistently quite
fast. You can use it to debug or log messages within any code in the
kernel and not need to worry about horrible latencies being introduced.

For drivers that start dumping tons of messages, there is no concern
that the system will die because of the messages.

> Or another example, e.g. when running on a slow console (the best
> testcase I know of is a 9600 baud serial port), where some printks are
> threaded. When a warn printk() has to be shown, doesn't that mean all
> threaded printks have to be flushed (waited for) before the warning
> can be shown? Will there then still be a measurable time benefit?

Now you are talking about something else, unrelated to the threaded
printing. A warn is considered a serious non-normal event. In this case,
printk temporarily falls back to the direct, slow, unpredictable,
printk-caller-context behavior we know today.

If we see that direct printing is being used during normal operation
(i.e. developers are performing warns just for fun), then we should not
mark warns for direct printing. We will need to get some real-world
experience here, but the goal is that systems do not use direct printing
unless a real serious emergency is going on. Otherwise, the system risks
dying because of the messages rather than the problem being reported.

John

[0] https://lore.kernel.org/all/20190306171943.12345598@oasis.local.home


* start/stop: was: Re: [PATCH printk v2 10/12] printk: add kthread console printers
  2022-04-05 13:25 ` [PATCH printk v2 10/12] printk: add kthread console printers John Ogness
@ 2022-04-07 16:43   ` Petr Mladek
  2022-04-07 16:49   ` console_is_usable() check: " Petr Mladek
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 37+ messages in thread
From: Petr Mladek @ 2022-04-07 16:43 UTC (permalink / raw)
  To: John Ogness
  Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
	linux-kernel, Greg Kroah-Hartman

On Tue 2022-04-05 15:31:33, John Ogness wrote:
> Create a kthread for each console to perform console printing. During
> normal operation (@system_state == SYSTEM_RUNNING), the kthread
> printers are responsible for all printing on their respective
> consoles.
> 
> During non-normal operation, console printing is done as it has been:
> within the context of the printk caller or within irq work triggered
> by the printk caller.
> 
> Console printers synchronize against each other and against console
> lockers by taking the console lock for each message that is printed.
> 
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 258d02cff140..a3b8b984c9aa 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -2988,6 +3028,10 @@ void console_start(struct console *console)
>  	console_lock();
>  	console->flags |= CON_ENABLED;
>  	console_unlock();
> +
> +	/* Wake the newly enabled kthread printer. */
> +	wake_up_klogd();

wake_up_klogd() will wake the kthread only when PRINTK_PENDING_OUTPUT
is set. But this particular console might be far behind the other
kthreads.

I think that we always want to wake up the kthread. In the worst case,
it will just realize that there is nothing to do and will go back to sleep.

> +
>  	__pr_flush(console, 1000, true);
>  }
>  EXPORT_SYMBOL(console_start);
 @@ -3196,6 +3240,8 @@ void register_console(struct console *newcon)
>  		/* Begin with next message. */
>  		newcon->seq = prb_next_seq(prb);
>  	}
> +	if (printk_kthreads_available)
> +		printk_start_kthread(newcon);
>  	console_unlock();
>  	console_sysfs_notify();
>  
> @@ -3222,6 +3268,7 @@ EXPORT_SYMBOL(register_console);
>  
>  int unregister_console(struct console *console)
>  {
> +	struct task_struct *thd = console->thread;
>  	struct console *con;
>  	int res;
>  
> @@ -3233,6 +3280,11 @@ int unregister_console(struct console *console)
>  	if (res > 0)
>  		return 0;
>  
> +	if (thd) {
> +		console->thread = NULL;
> +		kthread_stop(thd);

This feels racy against parallel register_console() and
unregister_console() calls. I guess that it is not very
realistic in real life, but ...

Anyway, printk_start_kthread(newcon) is called under console_lock().
It would be great to stop it under the lock as well so that we
do not need to worry about races.

Hmm, it might cause a deadlock at this stage because the kthread
takes console_lock() in the main loop.

It should work after introducing the per-console mutexes. They might
be taken when console_lock() is taken. But the kthread will
not do the printing when the console_lock() is taken by anyone else.

A solution might be to add a comment that there is an unrealistic
race that will be removed by introducing per-console mutexes,
and fix it later.

Nit: Also I would prefer to create printk_stop_kthread()
     so that it is easy to find the counterparts.


> +	}
> +
>  	res = -ENODEV;
>  	console_lock();
>  	if (console_drivers == console) {

[...]

> +static int printk_kthread_func(void *data)
> +{
> +	struct console *con = data;
> +	char *dropped_text = NULL;
> +	char *ext_text = NULL;
> +	bool handover;
> +	u64 seq = 0;
> +	char *text;
> +	int error;
> +
> +	text = kmalloc(CONSOLE_LOG_MAX, GFP_KERNEL);
> +	if (!text) {
> +		printk_console_msg(con, KERN_ERR, "failed to allocate text buffer");
> +		printk_fallback_preferred_direct();
> +		goto out;
> +	}
> +
> +	if (con->flags & CON_EXTENDED) {
> +		ext_text = kmalloc(CONSOLE_EXT_LOG_MAX, GFP_KERNEL);
> +		if (!ext_text) {
> +			printk_console_msg(con, KERN_ERR, "failed to allocate ext_text buffer");
> +			printk_fallback_preferred_direct();
> +			goto out;
> +		}
> +	} else {
> +		dropped_text = kmalloc(DROPPED_TEXT_MAX, GFP_KERNEL);
> +		if (!dropped_text) {
> +			printk_console_msg(con, KERN_ERR,
> +					   "failed to allocate dropped_text buffer");
> +			printk_fallback_preferred_direct();
> +			goto out;
> +		}
> +	}
> +
> +	printk_console_msg(con, KERN_INFO, "printing thread started");
> +
> +	for (;;) {
> +		/*
> +		 * Guarantee this task is visible on the waitqueue before
> +		 * checking the wake condition.
> +		 *
> +		 * The full memory barrier within set_current_state() of
> +		 * prepare_to_wait_event() pairs with the full memory barrier
> +		 * within wq_has_sleeper().
> +		 *
> +		 * See __wake_up_klogd:A for the pairing memory barrier.
> +		 */
> +		error = wait_event_interruptible(log_wait,
> +				printer_should_wake(con, seq)); /* LMM(printk_kthread_func:A) */
> +
> +		if (kthread_should_stop() || !printk_kthreads_available)
> +			break;

printk_fallback_preferred_direct() is an interesting trick. But I am
not sure if it is a good idea. This approach leaves con->thread
pointing to exited kthreads.

The following looks cleaner and should work:

void printk_stop_kthread(struct console *con)
{
	struct task_struct *thread = con->thread;

	if (!thread)
		return;

	con->thread = NULL;
	kthread_stop(thread);
}

void printk_deactivate_kthreads(void)
{
	struct console *con;

	console_lock();
	printk_kthreads_available = false;
	console_unlock();

	for_each_console(con)
		printk_stop_kthread(con);
}

And we could move printk_stop_kthread() under console lock
after introducing the per-console mutex.


Best Regards,
Petr


* console_is_usable() check: was: Re: [PATCH printk v2 10/12] printk: add kthread console printers
  2022-04-05 13:25 ` [PATCH printk v2 10/12] printk: add kthread console printers John Ogness
  2022-04-07 16:43   ` start/stop: was: " Petr Mladek
@ 2022-04-07 16:49   ` Petr Mladek
  2022-04-07 16:53   ` wake_up_klogd(): " Petr Mladek
  2022-04-07 16:56   ` console_flush_all(): was: : was " Petr Mladek
  3 siblings, 0 replies; 37+ messages in thread
From: Petr Mladek @ 2022-04-07 16:49 UTC (permalink / raw)
  To: John Ogness
  Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
	linux-kernel, Greg Kroah-Hartman

On Tue 2022-04-05 15:31:33, John Ogness wrote:
> Create a kthread for each console to perform console printing. During
> normal operation (@system_state == SYSTEM_RUNNING), the kthread
> printers are responsible for all printing on their respective
> consoles.
> 
> During non-normal operation, console printing is done as it has been:
> within the context of the printk caller or within irq work triggered
> by the printk caller.
> 
> Console printers synchronize against each other and against console
> lockers by taking the console lock for each message that is printed.
> 
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> +static bool printer_should_wake(struct console *con, u64 seq)
> +{
> +	short flags;
> +
> +	if (kthread_should_stop() || !printk_kthreads_available)
> +		return true;
> +
> +	if (console_suspended)
> +		return false;
> +
> +	if (!con->write)
> +		return false;

Hmm, the kthread for such consoles will never wake up. It probably
does not make sense to create it at all.

On the other hand, it is not a big deal. And we have a "bigger"
problem: how to keep these checks in sync with console_is_usable(),
see below.

> +	/*
> +	 * This is an unsafe read to con->flags, but a false positive is not
> +	 * a problem. Worst case it would allow the printer to wake up even
> +	 * when it is disabled. But the printer will notice that itself when
> +	 * attempting to print and instead go back to sleep.
> +	 */
> +	flags = data_race(READ_ONCE(con->flags));
> +	if (!(flags & CON_ENABLED))
> +		return false;
> +
> +	if (atomic_read(&printk_prefer_direct))
> +		return false;
> +
> +	return prb_read_valid(prb, seq, NULL);
> +}
> +
> +static int printk_kthread_func(void *data)
> +{
> +	struct console *con = data;
> +	char *dropped_text = NULL;
> +	char *ext_text = NULL;
> +	bool handover;
> +	u64 seq = 0;
> +	char *text;
> +	int error;
> +
> +	text = kmalloc(CONSOLE_LOG_MAX, GFP_KERNEL);
> +	if (!text) {
> +		printk_console_msg(con, KERN_ERR, "failed to allocate text buffer");
> +		printk_fallback_preferred_direct();
> +		goto out;
> +	}
> +
> +	if (con->flags & CON_EXTENDED) {
> +		ext_text = kmalloc(CONSOLE_EXT_LOG_MAX, GFP_KERNEL);
> +		if (!ext_text) {
> +			printk_console_msg(con, KERN_ERR, "failed to allocate ext_text buffer");
> +			printk_fallback_preferred_direct();
> +			goto out;
> +		}
> +	} else {
> +		dropped_text = kmalloc(DROPPED_TEXT_MAX, GFP_KERNEL);
> +		if (!dropped_text) {
> +			printk_console_msg(con, KERN_ERR,
> +					   "failed to allocate dropped_text buffer");
> +			printk_fallback_preferred_direct();
> +			goto out;
> +		}
> +	}
> +
> +	printk_console_msg(con, KERN_INFO, "printing thread started");
> +
> +	for (;;) {
> +		/*
> +		 * Guarantee this task is visible on the waitqueue before
> +		 * checking the wake condition.
> +		 *
> +		 * The full memory barrier within set_current_state() of
> +		 * prepare_to_wait_event() pairs with the full memory barrier
> +		 * within wq_has_sleeper().
> +		 *
> +		 * See __wake_up_klogd:A for the pairing memory barrier.
> +		 */
> +		error = wait_event_interruptible(log_wait,
> +				printer_should_wake(con, seq)); /* LMM(printk_kthread_func:A) */
> +
> +		if (kthread_should_stop() || !printk_kthreads_available)
> +			break;
> +
> +		if (error)
> +			continue;
> +
> +		console_lock();
> +
> +		if (console_suspended) {
> +			__console_unlock();
> +			continue;
> +		}
> +
> +		if (!console_is_usable(con)) {
> +			__console_unlock();
> +			continue;
> +		}

This smells like a busy loop. We should make sure that the same
condition makes printer_should_wake() return false. The current
approach is hard to maintain.

Hmm, it is not easy because console_is_usable(con) is supposed
to be called under console_lock().

I do not have a good solution for this. But the current approach looks
error prone. What about the following?

static inline bool __console_is_usable(struct console *con)
{
	short flags;

	if (!con->write)
		return false;

	/* Make flags checks consistent when called without console_lock. */
	flags = READ_ONCE(con->flags);

	if (!(flags & CON_ENABLED))
		return false;

	/*
	 * Console drivers may assume that per-cpu resources have been
	 * allocated. So unless they're explicitly marked as being able to
	 * cope (CON_ANYTIME) don't call them until this CPU is officially up.
	 */
	if (!cpu_online(raw_smp_processor_id()) &&
	    !(flags & CON_ANYTIME))
		return false;

	return true;
}

static inline bool console_is_usable(struct console *con)
{
	WARN_ON_ONCE(!lockdep_assert_held(&console_sem));

	return __console_is_usable(con);
}

Note that we could not use lockdep_assert_held() because we will
later need to check both console_sem and con->mutex. Holding either
of them will be enough.

> +
> +		/*
> +		 * Even though the printk kthread is always preemptible, it is
> +		 * still not allowed to call cond_resched() from within
> +		 * console drivers. The task may become non-preemptible in the
> +		 * console driver call chain. For example, vt_console_print()
> +		 * takes a spinlock and then can call into fbcon_redraw(),
> +		 * which can conditionally invoke cond_resched().
> +		 */
> +		console_may_schedule = 0;
> +		console_emit_next_record(con, text, ext_text, dropped_text, &handover);
> +		if (handover)
> +			continue;
> +
> +		seq = con->seq;
> +
> +		__console_unlock();
> +	}
> +
> +	printk_console_msg(con, KERN_INFO, "printing thread stopped");
> +out:
> +	kfree(dropped_text);
> +	kfree(ext_text);
> +	kfree(text);
> +	return 0;
> +}

Best Regards,
Petr


* wake_up_klogd(): was: Re: [PATCH printk v2 10/12] printk: add kthread console printers
  2022-04-05 13:25 ` [PATCH printk v2 10/12] printk: add kthread console printers John Ogness
  2022-04-07 16:43   ` start/stop: was: " Petr Mladek
  2022-04-07 16:49   ` console_is_usable() check: " Petr Mladek
@ 2022-04-07 16:53   ` Petr Mladek
  2022-04-07 16:56   ` console_flush_all(): was: : was " Petr Mladek
  3 siblings, 0 replies; 37+ messages in thread
From: Petr Mladek @ 2022-04-07 16:53 UTC (permalink / raw)
  To: John Ogness
  Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
	linux-kernel, Greg Kroah-Hartman

On Tue 2022-04-05 15:31:33, John Ogness wrote:
> Create a kthread for each console to perform console printing. During
> normal operation (@system_state == SYSTEM_RUNNING), the kthread
> printers are responsible for all printing on their respective
> consoles.
> 
> During non-normal operation, console printing is done as it has been:
> within the context of the printk caller or within irq work triggered
> by the printk caller.
> 
> Console printers synchronize against each other and against console
> lockers by taking the console lock for each message that is printed.
> 
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -3445,40 +3659,64 @@ static void wake_up_klogd_work_func(struct irq_work *irq_work)
>  	int pending = this_cpu_xchg(printk_pending, 0);
>  
>  	if (pending & PRINTK_PENDING_OUTPUT) {
> +		printk_prefer_direct_enter();

Please rename PRINTK_PENDING_OUTPUT to PRINTK_PENDING_DIRECT_OUTPUT.

The change confused me a lot. I wrote many lines about why it is
(not) needed. Then I continued the review and found that it started making
sense after the change in defer_console_output(). ;-)

This patch changes the meaning of the flag. It would deserve renaming.

> +
>  		/* If trylock fails, someone else is doing the printing */
>  		if (console_trylock())
>  			console_unlock();
> +
> +		printk_prefer_direct_exit();
>  	}
>  
>  	if (pending & PRINTK_PENDING_WAKEUP)
> -		wake_up_interruptible(&log_wait);
> +		wake_up_interruptible_all(&log_wait);

This would deserve some explanation in the commit message.
I think that this actually was needed even before. IMHO, multiple
pending waiters might have been waiting for new messages. It was
not guaranteed that the woken waiter woke another one.

It would be nice to put it into a separate patch and do it first.

>  }
>  
>  static DEFINE_PER_CPU(struct irq_work, wake_up_klogd_work) =
>  	IRQ_WORK_INIT_LAZY(wake_up_klogd_work_func);
>  
> -void wake_up_klogd(void)
> +static void __wake_up_klogd(int val)
>  {
>  	if (!printk_percpu_data_ready())
>  		return;
>  
>  	preempt_disable();
> -	if (waitqueue_active(&log_wait)) {
> -		this_cpu_or(printk_pending, PRINTK_PENDING_WAKEUP);
> +	/*
> +	 * Guarantee any new records can be seen by printing threads before
> +	 * checking if the wait queue is empty.
> +	 *
> +	 * The full memory barrier within wq_has_sleeper() pairs with the full
> +	 * memory barrier within set_current_state() of
> +	 * prepare_to_wait_event(), which is called after ___wait_event() adds
> +	 * the waiter but before it has checked the wait condition.
> +	 *
> +	 * See printk_kthread_func:A for the pairing memory barrier.
> +	 */

I guess that this problem was there even before but was less
visible. Do I get it correctly?


> +	if (wq_has_sleeper(&log_wait) || /* LMM(__wake_up_klogd:A) */
> +	    (val & PRINTK_PENDING_OUTPUT)) {
> +		this_cpu_or(printk_pending, val);
>  		irq_work_queue(this_cpu_ptr(&wake_up_klogd_work));
>  	}
>  	preempt_enable();
>  }
>  
> +void wake_up_klogd(void)
> +{
> +	__wake_up_klogd(PRINTK_PENDING_WAKEUP);
> +}
> +
>  void defer_console_output(void)
>  {
> -	if (!printk_percpu_data_ready())
> -		return;

	/* Always wake up waiters because there are not only printk kthreads. */
> +	int val = PRINTK_PENDING_WAKEUP;

IMHO, this was actually needed even before. Otherwise, nobody
woke log waiters for deferred messages.

IMHO, it is a regression caused by moving wake_up_klogd() from
console_unlock() to vprintk_emit(). It was the commit
43a17111c2553925f6 ("printk: wake up klogd in vprintk_emit").

If I am right then we should fix it separately.


> -	preempt_disable();
> -	this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
> -	irq_work_queue(this_cpu_ptr(&wake_up_klogd_work));
> -	preempt_enable();
> +	/*
> +	 * If console deferring was called with preferred direct printing,
> +	 * make the irqwork perform the direct printing.
> +	 */
> +	if (atomic_read(&printk_prefer_direct))
> +		val |= PRINTK_PENDING_OUTPUT;
> +
> +	__wake_up_klogd(val);
>  }

And we should call defer_console_output() also in
printk_prefer_direct_enter() to make sure that potential
pending messages are printed.

My understanding is that the kthreads stop processing the messages
when direct output is preferred. I think that this was not
the case in v1, and I asked you to do it this way so that kthreads
do not block entering the direct mode.

Best Regards,
Petr


* console_flush_all(): was: : was Re: [PATCH printk v2 10/12] printk: add kthread console printers
  2022-04-05 13:25 ` [PATCH printk v2 10/12] printk: add kthread console printers John Ogness
                     ` (2 preceding siblings ...)
  2022-04-07 16:53   ` wake_up_klogd(): " Petr Mladek
@ 2022-04-07 16:56   ` Petr Mladek
  3 siblings, 0 replies; 37+ messages in thread
From: Petr Mladek @ 2022-04-07 16:56 UTC (permalink / raw)
  To: John Ogness
  Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
	linux-kernel, Greg Kroah-Hartman

On Tue 2022-04-05 15:31:33, John Ogness wrote:
> Create a kthread for each console to perform console printing. During
> normal operation (@system_state == SYSTEM_RUNNING), the kthread
> printers are responsible for all printing on their respective
> consoles.
> 
> During non-normal operation, console printing is done as it has been:
> within the context of the printk caller or within irq work triggered
> by the printk caller.
> 
> Console printers synchronize against each other and against console
> lockers by taking the console lock for each message that is printed.
> 
> Signed-off-by: John Ogness <john.ogness@linutronix.de>
> ---
>  include/linux/console.h |   2 +
>  kernel/printk/printk.c  | 262 ++++++++++++++++++++++++++++++++++++++--
>  2 files changed, 252 insertions(+), 12 deletions(-)
> 
> diff --git a/include/linux/console.h b/include/linux/console.h
> index 8c1686e2c233..9a251e70c090 100644
> --- a/include/linux/console.h
> +++ b/include/linux/console.h
> @@ -2773,6 +2809,10 @@ static bool console_flush_all(bool do_cond_resched, u64 *next_seq, bool *handove
>  	*handover = false;
>  
>  	do {
> +		/* Let the kthread printers do the work if they can. */
> +		if (!allow_direct_printing())
> +			break;

We should return either "true" or "false" here to have a predictable
result.

I suggest returning "false". It would mean that the function did not
finish the job, as I suggested for the 6th patch.

> +
>  		any_progress = false;
>  
>  		for_each_console(con) {

Best Regards,
Petr


* Re: [PATCH printk v2 09/12] printk: add functions to prefer direct printing
  2022-04-07 14:35       ` John Ogness
@ 2022-04-07 19:36         ` Helge Deller
  2022-04-07 20:04           ` John Ogness
  0 siblings, 1 reply; 37+ messages in thread
From: Helge Deller @ 2022-04-07 19:36 UTC (permalink / raw)
  To: John Ogness, Petr Mladek
  Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
	linux-kernel, Greg Kroah-Hartman, Jiri Slaby, Paul E. McKenney,
	Frederic Weisbecker, Neeraj Upadhyay, Josh Triplett,
	Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes, Kees Cook,
	Andrew Morton, Luis Chamberlain, Wei Liu, Xiaoming Ni,
	Peter Zijlstra, Andy Shevchenko, Marco Elver,
	Sebastian Andrzej Siewior, Daniel Lezcano, Shawn Guo,
	Dmitry Torokhov, Matti Vaittinen, Eric W. Biederman, Phil Auld,
	Juri Lelli, Wang Qing, rcu

Hi John,

On 4/7/22 16:35, John Ogness wrote:
> On 2022-04-07, Helge Deller <deller@gmx.de> wrote:
>>>>  - emergency reboot/shutdown
>>>>  - cpu and rcu stalls
>>>>  - hard and soft lockups
>>>>  - hung tasks
>>>>  - warn
>>>>  - sysrq
>>
>> I agree that those need direct printing.
>> Did you maybe check how much (e.g. in percent) of a normal log is
>> then still left to be printed threaded?  Just in case it's not much,
>> is there still a benefit to introducing threaded printing?
>
> A "normal" log can still contain many many messages. There have been
> reports [0] where the system died because _printk_ killed it. This was
> not any of the scenarios that we want direct printing for. It was just a
> lot of messages that some driver wanted to output.
>
> Until now the response was always to reduce printk usage or avoid it
> altogether. printk has a bad reputation because it is so unpredictable
> and expensive. So we are fixing printk. With threaded printers, we make
> a dramatic change. printk becomes 100% lockless and consistently quite
> fast. You can use it to debug or log messages within any code in the
> kernel and not need to worry about horrible latencies being introduced.
>
> For drivers that start dumping tons of messages, there is no concern
> that the system will die because of the messages.

Thank you for this good explanation!

In my case - while I debug low-level kernel code - I then just need to use
pr_warn() or pr_emerg() and get it printed non-threaded. That's sufficient for me.

>> Or another example, e.g. when running on a slow console (the best
>> testcase I know of is a 9600 baud serial port), where some printks are
>> threaded. Then, when a warn printk() has to be shown, doesn't it mean all
>> threaded printks have to be flushed (waited for) before the warning
>> can be shown? Will there then still be a measurable time benefit?
>
> Now you are talking about something else, unrelated to the threaded
> printing. A warn is considered a serious non-normal event. In this case,
> printk temporarily falls back to the direct, slow, unpredictable,
> printk-caller-context behavior we know today.

Good.

> If we see that direct printing is being used during normal operation
> (i.e. developers are performing warns just for fun), then we should not
> mark warns for direct printing. We will need to get some real-world
> experience here, but the goal is that systems do not use direct printing
> unless a real serious emergency is going on. Otherwise, the system risks
> dying because of the messages rather than the problem being reported.

I agree, it's good if this gets fixed at some point.

Thanks!
Helge

> John
>
> [0] https://lore.kernel.org/all/20190306171943.12345598@oasis.local.home


* Re: [PATCH printk v2 09/12] printk: add functions to prefer direct printing
  2022-04-07 19:36         ` Helge Deller
@ 2022-04-07 20:04           ` John Ogness
  2022-04-07 20:20             ` Helge Deller
  0 siblings, 1 reply; 37+ messages in thread
From: John Ogness @ 2022-04-07 20:04 UTC (permalink / raw)
  To: Helge Deller, Petr Mladek
  Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
	linux-kernel, Greg Kroah-Hartman, Jiri Slaby, Paul E. McKenney,
	Frederic Weisbecker, Neeraj Upadhyay, Josh Triplett,
	Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes, Kees Cook,
	Andrew Morton, Luis Chamberlain, Wei Liu, Xiaoming Ni,
	Peter Zijlstra, Andy Shevchenko, Marco Elver,
	Sebastian Andrzej Siewior, Daniel Lezcano, Shawn Guo,
	Dmitry Torokhov, Matti Vaittinen, Eric W. Biederman, Phil Auld,
	Juri Lelli, Wang Qing, rcu

On 2022-04-07, Helge Deller <deller@gmx.de> wrote:
> In my case - while I debug low-level kernel code - I then just need to
> use pr_warn() or pr_emerg() and get it printed non-threadened. That's
> sufficient for me.

Actually, no. The loglevel does not determine if a message is direct
printed or not. By "warn" I was referring to WARN_ON(condition).

If you are debugging low-level kernel code, you usually will _want_
threaded printing. The timestamps match the printk() call, so you will
get accurate logs. And the runtime performance of your low-level kernel
code will not be significantly affected by the printk() call.

If for some reason you really want non-threaded printing, the patch we
are discussing creates new functions to manually activate it:

printk_prefer_direct_enter();
pr_info("debugging\n");
printk_prefer_direct_exit();

But then your pr_info() will cause significant latencies. The timestamp
would be the same, with or without direct printing.

John


* Re: [PATCH printk v2 09/12] printk: add functions to prefer direct printing
  2022-04-07 20:04           ` John Ogness
@ 2022-04-07 20:20             ` Helge Deller
  2022-04-11 12:50               ` Petr Mladek
  0 siblings, 1 reply; 37+ messages in thread
From: Helge Deller @ 2022-04-07 20:20 UTC (permalink / raw)
  To: John Ogness, Petr Mladek
  Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
	linux-kernel, Greg Kroah-Hartman, Jiri Slaby, Paul E. McKenney,
	Frederic Weisbecker, Neeraj Upadhyay, Josh Triplett,
	Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes, Kees Cook,
	Andrew Morton, Luis Chamberlain, Wei Liu, Xiaoming Ni,
	Peter Zijlstra, Andy Shevchenko, Marco Elver,
	Sebastian Andrzej Siewior, Daniel Lezcano, Shawn Guo,
	Dmitry Torokhov, Matti Vaittinen, Eric W. Biederman, Phil Auld,
	Juri Lelli, Wang Qing, rcu

On 4/7/22 22:04, John Ogness wrote:
> On 2022-04-07, Helge Deller <deller@gmx.de> wrote:
>> In my case - while I debug low-level kernel code - I then just need to
>> use pr_warn() or pr_emerg() and get it printed non-threaded. That's
>> sufficient for me.
>
> Actually, no. The loglevel does not determine if a message is direct
> printed or not. By "warn" I was referring to WARN_ON(condition).

Oh, then there was a misunderstanding on my side.

> If you are debugging low-level kernel code, you usually will _want_
> threaded printing. The timestamps match the printk() call, so you will
> get accurate logs. And the runtime performance of your low-level kernel
> code will not be significantly affected by the printk() call.

That very much depends on what you debug.
Currently I'm debugging some CPU hotplug stuff, and I really want my
debug info printed immediately, otherwise it's too late to analyze what's
going wrong. I can imagine other use cases, like trying to find memory
leaks, being similar.

> If for some reason you really want non-threaded printing, the patch we
> are discussing creates new functions to manually activate it:
>
> printk_prefer_direct_enter();
> pr_info("debugging\n");
> printk_prefer_direct_exit();

That's quite unhandy. At least I would have thought that pr_emerg() would
do that call sequence.
Wouldn't it make sense to make pr_emerg() work unthreaded, as it's
used quite seldom and only in special situations...

> But then your pr_info() will cause significant latencies. The timestamp
> would be the same, with or without direct printing.

Yes, but that would be Ok in my case.

Helge


* guarantee forward progress: was: Re: [PATCH printk v2 11/12] printk: extend console_lock for proper kthread support
  2022-04-05 13:25 ` [PATCH printk v2 11/12] printk: extend console_lock for proper kthread support John Ogness
@ 2022-04-08 13:45   ` Petr Mladek
  2022-04-08 20:17     ` John Ogness
  2022-04-08 13:52   ` register_console: " Petr Mladek
  1 sibling, 1 reply; 37+ messages in thread
From: Petr Mladek @ 2022-04-08 13:45 UTC (permalink / raw)
  To: John Ogness
  Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
	linux-kernel, Greg Kroah-Hartman

On Tue 2022-04-05 15:31:34, John Ogness wrote:
> Currently threaded console printers synchronize against each
> other using console_lock(). However, different console drivers
> are unrelated and do not require any synchronization between
> each other. Removing the synchronization between the threaded
> console printers will allow each console to print at its own
> speed.
> 
> But the threaded consoles printers do still need to synchronize
> against console_lock() callers. Introduce a per-console mutex
> and a new console flag CON_THD_BLOCKED to provide this
> synchronization.
> 
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -2691,6 +2760,22 @@ static inline bool console_is_usable(struct console *con)
>  static void __console_unlock(void)
>  {
>  	console_locked = 0;
> +
> +	/*
> +	 * Depending on whether console_lock() or console_trylock() was used,
> +	 * appropriately allow the kthread printers to continue.
> +	 */
> +	if (console_kthreads_blocked)
> +		console_kthreads_unblock();
> +	else
> +		console_kthreads_atomic_unblock();
> +
> +	/*
> +	 * New records may have arrived while the console was locked.
> +	 * Wake the kthread printers to print them.
> +	 */
> +	wake_up_klogd();

Thinking loudly:

Strictly speaking, this is needed only when kthreads are allowed. It
might make sense to do:

	if (!allow_direct_printing())
		wake_up_klogd();

On the other hand, the check is racy. So it might be better to always
call wake_up_klogd() and do not worry about it. The optimization
probably is not worth it.

But this brings the question. How do we actually ensure that either
kthreads or direct printk will handle pending messages.

vprintk_emit() or defer_console_output() make sure that either
direct printing is used or kthreads are woken. It makes the decision
via the current state of allow_direct_printing(). But that might
change in the meantime.

allow_direct_printing() might change in both directions:

1. printk triggers direct printing -> kthreads get enabled while in direct mode.

   This switch might be handled by the above wake_up_klogd().

   If it is unconditional, it might also handle racy
   multiple direct -> kthread -> direct mode switches.


2. printk wakes kthreads -> direct mode allowed when there
   are still pending messages.

   I am afraid that we do not have this path well covered.

   It probably worked "better" in v1 where the kthreads continued
   handling the messages until all were flushed. But I guess that
   there were still races when new messages appeared after some
   kthreads finished the job.

   Hmm, it actually is not enough to call defer_console_output()
   in printk_prefer_direct_enter(). It is because
   wake_up_klogd_work_func() does console_trylock(). It might
   still be blocked by an active kthread. But the other kthreads
   might already be sleeping.

   I think that we actually need to call defer_console_output()
   after a kthread unblocks console_trylock() and
   the direct mode is preferred. I mean something like:

static int printk_kthread_func(void *data)
{
[...]
	for (;;) {
[...]
		console_emit_next_record(con, text, ext_text, dropped_text);
		seq = con->seq;

		console_kthread_printing_exit();

+		/*
+		 * This thread no longer blocks console_trylock(). Make
+		 * sure that someone will try to flush pending
+		 * messages in the direct mode when it has become preferred
+		 * in the meantime.
+		 */
+		if (allow_direct_printing())
+			defer_console_output();
+
		mutex_unlock(&con->lock);
	}
[...]
}

Hmm, it is still not enough. We should also do it in printer_should_wake().
It will be needed when the original printk() woke the kthreads and
the direct mode got preferred before a kthread started printing.
In this case, the kthreads will do nothing and nobody will handle
the pending messages.

I played a lot with it and it is really hard because:

   + new messages can appear anytime
   + direct mode might get requested anytime
   + only the direct mode knows whether all messages were flushed
     on all consoles

IMHO, ideal logic would be:

   + new messages will always trigger direct printing or wake
     kthreads, never both

   + direct mode will wake kthreads when kthreads are allowed
     and there are pending messages.

   + (the last) kthread will wake_up direct mode when it is requested
     and there are pending messages.

We could use the trick that the direct mode never gives up when there
are pending messages.

It requires moving printk kthreads into a separate waitqueue:

DECLARE_WAIT_QUEUE_HEAD(printk_kthread_wait);

I suggest renaming the flags in the wake_up_klogd() API:

/*
 * Delayed printk version, for scheduler-internal messages:
 */

#define PRINTK_WAKE_UP_LOG		0x01
#define PRINTK_WAKE_UP_KTHREADS		0x02
#define PRINTK_PENDING_DIRECT_OUTPUT	0x04

static DEFINE_PER_CPU(int, printk_pending);

static void wake_up_klogd_work_func(struct irq_work *irq_work)
{
	int pending = this_cpu_xchg(printk_pending, 0);

	if (pending & PRINTK_PENDING_DIRECT_OUTPUT) {
		printk_prefer_direct_enter();

		/* If trylock fails, someone else is doing the printing */
		if (console_trylock())
			console_unlock();

		printk_prefer_direct_exit();
	}

	if (pending & PRINTK_WAKE_UP_LOG)
		wake_up_interruptible_all(&log_wait);

	if (pending & PRINTK_WAKE_UP_KTHREADS)
		wake_up_interruptible_all(&printk_kthread_wait);
}

static DEFINE_PER_CPU(struct irq_work, printk_wake_up_work) =
	IRQ_WORK_INIT_LAZY(wake_up_klogd_work_func);

static void __printk_wake_up(int val)
{
	if (!printk_percpu_data_ready())
		return;

	preempt_disable();
	/*
	 * Guarantee any new records can be seen by printing threads before
	 * checking if the wait queue is empty.
	 *
	 * The full memory barrier within wq_has_sleeper() pairs with the full
	 * memory barrier within set_current_state() of
	 * prepare_to_wait_event(), which is called after ___wait_event() adds
	 * the waiter but before it has checked the wait condition.
	 *
	 * See printk_kthread_func:A for the pairing memory barrier.
	 */
	if (wq_has_sleeper(&log_wait) || /* LMM(__printk_wake_up:A) */
	     wq_has_sleeper(&printk_kthread_wait) || /* LMM(__printk_wake_up:B) */
	     (val & PRINTK_PENDING_DIRECT_OUTPUT)) {
		this_cpu_or(printk_pending, val);
		irq_work_queue(this_cpu_ptr(&printk_wake_up_work));
	}
	preempt_enable();
}

void printk_wake_up_klogd(void)
{
	__printk_wake_up(PRINTK_WAKE_UP_LOG);
}

void printk_wake_up_all(void)
{
	__printk_wake_up(PRINTK_WAKE_UP_LOG | PRINTK_WAKE_UP_KTHREADS);
}

void defer_console_output(void)
{
	int val = PRINTK_WAKE_UP_LOG;

	/*
	 * If console deferring was called with preferred direct printing,
	 * make the irqwork perform the direct printing.
	 */
	if (atomic_read(&printk_prefer_direct))
		val |= PRINTK_PENDING_DIRECT_OUTPUT;
	else
		val |= PRINTK_WAKE_UP_KTHREADS;

	__printk_wake_up(val);
}


And do

asmlinkage int vprintk_emit(int facility, int level,
			    const struct dev_printk_info *dev_info,
			    const char *fmt, va_list args)
{
[...]
	/* If called from the scheduler, we can not call up(). */
	if (!in_sched && allow_direct_printing()) {
		preempt_disable();
		if (console_trylock_spinning())
			console_unlock();
		preempt_enable();
		printk_wake_up_klogd();
	} else {
		printk_wake_up_all();
	}

	return printed_len;
}

static bool printer_should_wake(struct console *con, u64 seq)
{
[...]
	if (__allow_direct_printing()) {
		/* Make sure that someone does the direct printing. */
		if (prb_read_valid(prb, next_seq, NULL))
			defer_console_output();

		return false;
	}
[...]
}

void console_unlock(void)
{
	bool do_cond_resched;
	bool handover;
	bool flushed;
	u64 next_seq;

	if (console_suspended) {
		up_console_sem();
		return;
	}

	/*
	 * Console drivers are called with interrupts disabled, so
	 * @console_may_schedule should be cleared before; however, we may
	 * end up dumping a lot of lines, for example, if called from
	 * console registration path, and should invoke cond_resched()
	 * between lines if allowable.  Not doing so can cause a very long
	 * scheduling stall on a slow console leading to RCU stall and
	 * softlockup warnings which exacerbate the issue with more
	 * messages practically incapacitating the system. Therefore, create
	 * a local to use for the printing loop.
	 */
	do_cond_resched = console_may_schedule;

	do {
		console_may_schedule = 0;

		if (!any_usable_console()) {
			__console_unlock();
			break;
		}

		flushed = console_flush_all(do_cond_resched, &next_seq, &handover);
		if (handover)
			return;

		__console_unlock();

		/* Any pending messages? */
		if (!prb_read_valid(prb, next_seq, NULL))
			break;

		if (!allow_direct_printing()) {
			/*
			 * Make sure that kthreads take over handling
			 * the pending messages.
			 */
			printk_wake_up_kthreads();
			break;
		}

		/* Only panic CPU should handle consoles. */
		if (abandon_console_lock_in_panic())
			break;

		WARN_ONCE(!flushed,
			  "The consoles were not flushed in the direct mode for unknown reasons\n");

		/*
		 * Try to continue with direct printing. If the trylock fails,
		 * another context is already handling the printing. It is
		 * responsible for flushing everything.
		 */
	} while (console_trylock());
}

Note that I have added back any_usable_console(). console_flush_all()
might fail for different reasons. Each reason needs its own handling.
And we need to know that we have handled them all, hence the WARN_ONCE().


Uff, I have spent a few hours thinking about it and reworked it several
times. The above is the best solution that I was able to come up with.

I am sending what I have now because I feel tired and do not want to sit
on it. But it is possible that I have just got lost and will see it
on Monday.

>  	up_console_sem();
>  }
>  

An alternative solution would be to go back to v1 and allow kthreads
to handle the messages even when the direct mode is preferred. The direct
mode would be used only when console_trylock() succeeds or when
some code calls console_lock().

Best Regards,
Petr


* register_console: was: Re: [PATCH printk v2 11/12] printk: extend console_lock for proper kthread support
  2022-04-05 13:25 ` [PATCH printk v2 11/12] printk: extend console_lock for proper kthread support John Ogness
  2022-04-08 13:45   ` guarantee forward progress: was: " Petr Mladek
@ 2022-04-08 13:52   ` Petr Mladek
  1 sibling, 0 replies; 37+ messages in thread
From: Petr Mladek @ 2022-04-08 13:52 UTC (permalink / raw)
  To: John Ogness
  Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
	linux-kernel, Greg Kroah-Hartman

On Tue 2022-04-05 15:31:34, John Ogness wrote:
> Currently threaded console printers synchronize against each
> other using console_lock(). However, different console drivers
> are unrelated and do not require any synchronization between
> each other. Removing the synchronization between the threaded
> console printers will allow each console to print at its own
> speed.
> 
> But the threaded consoles printers do still need to synchronize
> against console_lock() callers. Introduce a per-console mutex
> and a new console flag CON_THD_BLOCKED to provide this
> synchronization.
> 
> console_lock() is modified so that it must acquire the mutex
> of each console in order to set the CON_THD_BLOCKED flag.
> Console printing threads will acquire their mutex while
> printing a record. If CON_THD_BLOCKED was set, the thread will
> go back to sleep instead of printing.
> 
> The reason for the CON_THD_BLOCKED flag is so that
> console_lock() callers do not need to acquire multiple console
> mutexes simultaneously, which would introduce unnecessary
> complexity due to nested mutex locking.
> 
> However, threaded console printers also need to synchronize
> against console_trylock() callers. Since console_trylock() may
> be called from any context, the per-console mutex cannot be
> used for this synchronization. (mutex_trylock() cannot be
> called from atomic contexts.) Introduce a global atomic counter
> to identify if any threaded printers are active. The threaded
> printers will also check the atomic counter to identify if the
> console has been locked by another task via console_trylock().
> 
> Note that @console_sem is still used to provide synchronization
> between console_lock() and console_trylock() callers.
> 
> A locking overview for console_lock(), console_trylock(), and the
> threaded printers is as follows (pseudo code):
> 
> console_lock()
> {
>         down(&console_sem);
>         for_each_console(con) {
>                 mutex_lock(&con->lock);
>                 con->flags |= CON_THD_BLOCKED;
>                 mutex_unlock(&con->lock);
>         }
>         /* console lock acquired */
> }
> 
> console_trylock()
> {
>         if (down_trylock(&console_sem) == 0) {
>                 if (atomic_cmpxchg(&console_kthreads_active, 0, -1) == 0) {
>                         /* console lock acquired */
>                 }
>         }
> }
> 
> threaded_printer()
> {
>         mutex_lock(&con->lock);
>         if (!(con->flags & CON_THD_BLOCKED)) {
> 		/* console_lock() callers blocked */
> 
>                 if (atomic_inc_unless_negative(&console_kthreads_active)) {
>                         /* console_trylock() callers blocked */
> 
>                         con->write();
> 
>                         atomic_dec(&console_lock_count);
>                 }
>         }
>         mutex_unlock(&con->lock);
> }
> 
> The console owner and waiter logic now only applies between contexts
> that have taken the console lock. Since threaded printers never take
> the console lock, they do not have a console lock to handover to
> other contexts.
> 
> @panic_console_dropped must change to atomic_t since it is no longer
> protected exclusively by the console lock.
> 
> Since threaded printers remain asleep if they see that the console
> is locked, they now must be explicitly woken in __console_unlock().
> This means wake_up_klogd() calls following a console_unlock() are
> no longer necessary and are removed.

BTW: I really like the commit message.

> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -3230,7 +3335,11 @@ void register_console(struct console *newcon)
>  	if (newcon->flags & CON_EXTENDED)
>  		nr_ext_console_drivers++;
>  
> +	if (console_kthreads_blocked)

I think that this is actually always true because it is called under
console_lock().

> +		newcon->flags |= CON_THD_BLOCKED;

> +
>  	newcon->dropped = 0;
> +	mutex_init(&newcon->lock);
>  	if (newcon->flags & CON_PRINTBUFFER) {
>  		/* Get a consistent copy of @syslog_seq. */
>  		mutex_lock(&syslog_lock);

Best Regards,
Petr


* Re: [PATCH printk v2 08/12] printk: add pr_flush()
  2022-04-06 15:17   ` Petr Mladek
@ 2022-04-08 18:57     ` John Ogness
  0 siblings, 0 replies; 37+ messages in thread
From: John Ogness @ 2022-04-08 18:57 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner, linux-kernel

On 2022-04-06, Petr Mladek <pmladek@suse.com> wrote:
>> @@ -3321,6 +3331,79 @@ static int __init printk_late_init(void)
>>  late_initcall(printk_late_init);
>>  
>>  #if defined CONFIG_PRINTK
>> +/* If @con is specified, only wait for that console. Otherwise wait for all. */
>> +static bool __pr_flush(struct console *con, int timeout_ms, bool reset_on_progress)
>> +{
>> +	int remaining = timeout_ms;
>> +	struct console *c;
>> +	u64 last_diff = 0;
>> +	u64 printk_seq;
>> +	u64 diff;
>> +	u64 seq;
>> +
>> +	might_sleep();
>> +
>> +	seq = prb_next_seq(prb);
>
> I suggest to add here:
>
> 	/*
> 	 * Try to flush the messages when kthreads are not available
> 	 * and there is not other console lock owner.
> 	 */
> 	if (console_trylock())
> 		console_unlock();

At this point in the series, we do not have kthreads yet. ;-)

And besides, you are implying that there are messages in the ringbuffer,
but _nobody_ is printing them. That would be a bug in our
implementation.

>> +
>> +	for (;;) {
>> +		diff = 0;
>> +
>> +		console_lock();
>> +		for_each_console(c) {
>> +			if (con && con != c)
>> +				continue;
>> +			if (!console_is_usable(c))
>> +				continue;
>> +			printk_seq = c->seq;
>> +			if (printk_seq < seq)
>> +				diff += seq - printk_seq;
>> +		}
>> +		console_unlock();
>
> This is a bit sub-optimal when the kthreads are not available or
> are disabled. In this case, the messages are flushed [*] by
> the console_unlock() and the diff is outdated.

The messages must be flushed by another context. It is wrong to rely on
this context to do the flushing. In direct mode, those messages should
have been flushed before the above console_lock() succeeded. Otherwise
it is a bug. (For threaded mode, console_unlock() does not flush
anyway.)

> Well, it is ugly to keep it this way.

If you don't mind, I will not change this function for v3. I do not want
to start sprinkling console_trylock()/console_unlock() calls all over
the code to cover up possible bugs and races in the implementation.

John


* Re: guarantee forward progress: was: Re: [PATCH printk v2 11/12] printk: extend console_lock for proper kthread support
  2022-04-08 13:45   ` guarantee forward progress: was: " Petr Mladek
@ 2022-04-08 20:17     ` John Ogness
  2022-04-11 10:45       ` Petr Mladek
  0 siblings, 1 reply; 37+ messages in thread
From: John Ogness @ 2022-04-08 20:17 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
	linux-kernel, Greg Kroah-Hartman

On 2022-04-08, Petr Mladek <pmladek@suse.com> wrote:
> I played a lot with it and it is really hard because:
>
>    + new messages can appear anytime
>    + direct mode might get requested anytime
>    + only the direct mode knows whether all messages were flushed
>      on all consoles

Yes, and this is why v1 dramatically simplified the picture by making
kthreads not care about direct mode. In v1 the kthread logic is very
simple: If there are messages to print, try to print them no matter
what. We didn't need to worry if someone was printing, because we knew
that at least the kthread was always printing.

This meant that there would be times when direct mode is active but the
kthreads are doing the printing. But in my experimenting, that tends to
be the case anyway, even with this more complex v2 approach. The reason
is that if some code does:

printk_prefer_direct_enter();
(100 lines of printk calls)
printk_prefer_direct_exit();

And directly before that printk_prefer_direct_enter() _any_ kthread was
already inside call_console_driver(), then _all_ the console_trylock()
calls of the above 100 printk's will fail. Inserting messages into the
ringbuffer is fast and any active printer will not have finished
printing its message before the above code snippet is done.

In fact, the above snippet will only do direct printing if there were
previously no unflushed messages. That is true for v1 (by design) and v2
(by misfortune, because ringbuffer insertion is much faster than a
single call_console_driver() call).

This new idea (v2) of trying to stop kthreads in order to "step aside"
for direct printing is really just adding a lot of complexity, a lot of
irqwork calls, and a lot of races. And with my experimenting I am not
seeing any gain, except for new risks of nobody printing.

I understand that when we say printk_prefer_direct_enter() that we
_really_ want to do direct printing. But we cannot force it if any
printer is already inside call_console_driver(). In that case, direct
printing simply will not and cannot happen.

For v3 I recommend going back to the v1 model, where kthreads do not
care if direct mode is preferred. I claim that v2 does not yield any more
actual direct printing than v1 did.

However, I would keep the v2 change that kthreads go into their
wait_event check after every message. That at least provides earlier
responses for kthreads to stop themselves if they are disabled.

Once we have atomic consoles, things will look different. Then we
perform true synchronous direct printing. But without them, the "prefer"
in printk_prefer_direct_enter() is only a preference that can only be
satisfied under ideal situations (i.e. no kthread is inside
call_console_driver()).

John


* Re: [PATCH printk v2 09/12] printk: add functions to prefer direct printing
  2022-04-05 13:25 ` [PATCH printk v2 09/12] printk: add functions to prefer direct printing John Ogness
  2022-04-07  9:56   ` Petr Mladek
@ 2022-04-09 15:57   ` Paul E. McKenney
  1 sibling, 0 replies; 37+ messages in thread
From: Paul E. McKenney @ 2022-04-09 15:57 UTC (permalink / raw)
  To: John Ogness
  Cc: Petr Mladek, Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
	linux-kernel, Greg Kroah-Hartman, Jiri Slaby,
	Frederic Weisbecker, Neeraj Upadhyay, Josh Triplett,
	Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes, Kees Cook,
	Andrew Morton, Luis Chamberlain, Wei Liu, Xiaoming Ni,
	Peter Zijlstra, Andy Shevchenko, Helge Deller, Marco Elver,
	Sebastian Andrzej Siewior, Daniel Lezcano, Shawn Guo,
	Dmitry Torokhov, Matti Vaittinen, Eric W. Biederman, Phil Auld,
	Juri Lelli, Wang Qing, rcu

On Tue, Apr 05, 2022 at 03:31:32PM +0206, John Ogness wrote:
> Once kthread printing is available, console printing will no longer
> occur in the context of the printk caller. However, there are some
> special contexts where it is desirable for the printk caller to
> directly print out kernel messages. Using pr_flush() to wait for
> threaded printers is only possible if the caller is in a sleepable
> context and the kthreads are active. That is not always the case.
> 
> Introduce printk_prefer_direct_enter() and printk_prefer_direct_exit()
> functions to explicitly (and globally) activate/deactivate preferred
> direct console printing. The term "direct console printing" refers to
> printing to all enabled consoles from the context of the printk
> caller. The term "prefer" is used because this type of printing is
> only best effort. If the console is currently locked or other
> printers are already actively printing, the printk caller will need
> to rely on the other contexts to handle the printing.
> 
> This preferred direct printing is how all printing is currently
> handled (unless explicitly deferred).
> 
> When kthread printing is introduced, there may be some unanticipated
> problems due to kthreads being unable to flush important messages.
> In order to minimize such risks, preferred direct printing is
> activated for the primary important messages when the system
> experiences general types of major errors. These are:
> 
>  - emergency reboot/shutdown
>  - cpu and rcu stalls
>  - hard and soft lockups
>  - hung tasks
>  - warn
>  - sysrq
> 
> Note that since kthread printing does not yet exist, no behavior
> changes result from this commit. This is only implementing the
> counter and marking the various places where preferred direct
> printing is active.
> 
> Signed-off-by: John Ogness <john.ogness@linutronix.de>

From an RCU perspective:

Acked-by: Paul E. McKenney <paulmck@kernel.org>

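The enter/exit pair added by this patch is just a global reference count. A minimal userspace sketch of its semantics (hypothetical: C11 atomics stand in for the kernel's atomic_t, and the CAS loop mimics atomic_dec_if_positive()) might look like:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Model of the patch's global counter; > 0 means direct printing is preferred. */
static atomic_int printk_prefer_direct;

static void printk_prefer_direct_enter(void)
{
	atomic_fetch_add(&printk_prefer_direct, 1);
}

/* Mirrors atomic_dec_if_positive(): only decrement while the count is > 0. */
static void printk_prefer_direct_exit(void)
{
	int old = atomic_load(&printk_prefer_direct);

	while (old > 0 &&
	       !atomic_compare_exchange_weak(&printk_prefer_direct, &old, old - 1))
		;	/* a failed CAS reloads 'old'; retry */
}

static bool prefer_direct_active(void)
{
	return atomic_load(&printk_prefer_direct) > 0;
}
```

Because it is a counter rather than a flag, nested enter/exit sections (e.g. a warn inside a sysrq handler) compose correctly, and an unbalanced exit cannot drive the count negative.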
> ---
>  drivers/tty/sysrq.c     |  2 ++
>  include/linux/printk.h  | 11 +++++++++++
>  kernel/hung_task.c      | 11 ++++++++++-
>  kernel/panic.c          |  4 ++++
>  kernel/printk/printk.c  | 28 ++++++++++++++++++++++++++++
>  kernel/rcu/tree_stall.h |  2 ++
>  kernel/reboot.c         | 14 +++++++++++++-
>  kernel/watchdog.c       |  4 ++++
>  kernel/watchdog_hld.c   |  4 ++++
>  9 files changed, 78 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/tty/sysrq.c b/drivers/tty/sysrq.c
> index bbfd004449b5..2884cd638d64 100644
> --- a/drivers/tty/sysrq.c
> +++ b/drivers/tty/sysrq.c
> @@ -578,6 +578,7 @@ void __handle_sysrq(int key, bool check_mask)
>  
>  	rcu_sysrq_start();
>  	rcu_read_lock();
> +	printk_prefer_direct_enter();
>  	/*
>  	 * Raise the apparent loglevel to maximum so that the sysrq header
>  	 * is shown to provide the user with positive feedback.  We do not
> @@ -619,6 +620,7 @@ void __handle_sysrq(int key, bool check_mask)
>  		pr_cont("\n");
>  		console_loglevel = orig_log_level;
>  	}
> +	printk_prefer_direct_exit();
>  	rcu_read_unlock();
>  	rcu_sysrq_end();
>  
> diff --git a/include/linux/printk.h b/include/linux/printk.h
> index 091fba7283e1..cd26aab0ab2a 100644
> --- a/include/linux/printk.h
> +++ b/include/linux/printk.h
> @@ -170,6 +170,9 @@ extern void __printk_safe_exit(void);
>  #define printk_deferred_enter __printk_safe_enter
>  #define printk_deferred_exit __printk_safe_exit
>  
> +extern void printk_prefer_direct_enter(void);
> +extern void printk_prefer_direct_exit(void);
> +
>  extern bool pr_flush(int timeout_ms, bool reset_on_progress);
>  
>  /*
> @@ -222,6 +225,14 @@ static inline void printk_deferred_exit(void)
>  {
>  }
>  
> +static inline void printk_prefer_direct_enter(void)
> +{
> +}
> +
> +static inline void printk_prefer_direct_exit(void)
> +{
> +}
> +
>  static inline bool pr_flush(int timeout_ms, bool reset_on_progress)
>  {
>  	return true;
> diff --git a/kernel/hung_task.c b/kernel/hung_task.c
> index 52501e5f7655..02a65d554340 100644
> --- a/kernel/hung_task.c
> +++ b/kernel/hung_task.c
> @@ -127,6 +127,8 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout)
>  	 * complain:
>  	 */
>  	if (sysctl_hung_task_warnings) {
> +		printk_prefer_direct_enter();
> +
>  		if (sysctl_hung_task_warnings > 0)
>  			sysctl_hung_task_warnings--;
>  		pr_err("INFO: task %s:%d blocked for more than %ld seconds.\n",
> @@ -142,6 +144,8 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout)
>  
>  		if (sysctl_hung_task_all_cpu_backtrace)
>  			hung_task_show_all_bt = true;
> +
> +		printk_prefer_direct_exit();
>  	}
>  
>  	touch_nmi_watchdog();
> @@ -204,12 +208,17 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
>  	}
>   unlock:
>  	rcu_read_unlock();
> -	if (hung_task_show_lock)
> +	if (hung_task_show_lock) {
> +		printk_prefer_direct_enter();
>  		debug_show_all_locks();
> +		printk_prefer_direct_exit();
> +	}
>  
>  	if (hung_task_show_all_bt) {
>  		hung_task_show_all_bt = false;
> +		printk_prefer_direct_enter();
>  		trigger_all_cpu_backtrace();
> +		printk_prefer_direct_exit();
>  	}
>  
>  	if (hung_task_call_panic)
> diff --git a/kernel/panic.c b/kernel/panic.c
> index 55b50e052ec3..7d422597403f 100644
> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -560,6 +560,8 @@ void __warn(const char *file, int line, void *caller, unsigned taint,
>  {
>  	disable_trace_on_warning();
>  
> +	printk_prefer_direct_enter();
> +
>  	if (file)
>  		pr_warn("WARNING: CPU: %d PID: %d at %s:%d %pS\n",
>  			raw_smp_processor_id(), current->pid, file, line,
> @@ -597,6 +599,8 @@ void __warn(const char *file, int line, void *caller, unsigned taint,
>  
>  	/* Just a warning, don't kill lockdep. */
>  	add_taint(taint, LOCKDEP_STILL_OK);
> +
> +	printk_prefer_direct_exit();
>  }
>  
>  #ifndef __WARN_FLAGS
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index d0369afaf365..258d02cff140 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -362,6 +362,34 @@ static int console_msg_format = MSG_FORMAT_DEFAULT;
>  static DEFINE_MUTEX(syslog_lock);
>  
>  #ifdef CONFIG_PRINTK
> +static atomic_t printk_prefer_direct = ATOMIC_INIT(0);
> +
> +/**
> + * printk_prefer_direct_enter - cause printk() calls to attempt direct
> + *                              printing to all enabled consoles
> + *
> + * Since it is not possible to call into the console printing code from any
> + * context, there is no guarantee that direct printing will occur.
> + *
> + * This globally affects all printk() callers.
> + *
> + * Context: Any context.
> + */
> +void printk_prefer_direct_enter(void)
> +{
> +	atomic_inc(&printk_prefer_direct);
> +}
> +
> +/**
> + * printk_prefer_direct_exit - restore printk() behavior
> + *
> + * Context: Any context.
> + */
> +void printk_prefer_direct_exit(void)
> +{
> +	WARN_ON(atomic_dec_if_positive(&printk_prefer_direct) < 0);
> +}
> +
>  DECLARE_WAIT_QUEUE_HEAD(log_wait);
>  /* All 3 protected by @syslog_lock. */
>  /* the next printk record to read by syslog(READ) or /proc/kmsg */
> diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
> index 0c5d8516516a..d612707c2ed0 100644
> --- a/kernel/rcu/tree_stall.h
> +++ b/kernel/rcu/tree_stall.h
> @@ -619,6 +619,7 @@ static void print_cpu_stall(unsigned long gps)
>  	 * See Documentation/RCU/stallwarn.rst for info on how to debug
>  	 * RCU CPU stall warnings.
>  	 */
> +	printk_prefer_direct_enter();
>  	trace_rcu_stall_warning(rcu_state.name, TPS("SelfDetected"));
>  	pr_err("INFO: %s self-detected stall on CPU\n", rcu_state.name);
>  	raw_spin_lock_irqsave_rcu_node(rdp->mynode, flags);
> @@ -656,6 +657,7 @@ static void print_cpu_stall(unsigned long gps)
>  	 */
>  	set_tsk_need_resched(current);
>  	set_preempt_need_resched();
> +	printk_prefer_direct_exit();
>  }
>  
>  static void check_cpu_stall(struct rcu_data *rdp)
> diff --git a/kernel/reboot.c b/kernel/reboot.c
> index 6bcc5d6a6572..4177645e74d6 100644
> --- a/kernel/reboot.c
> +++ b/kernel/reboot.c
> @@ -447,9 +447,11 @@ static int __orderly_reboot(void)
>  	ret = run_cmd(reboot_cmd);
>  
>  	if (ret) {
> +		printk_prefer_direct_enter();
>  		pr_warn("Failed to start orderly reboot: forcing the issue\n");
>  		emergency_sync();
>  		kernel_restart(NULL);
> +		printk_prefer_direct_exit();
>  	}
>  
>  	return ret;
> @@ -462,6 +464,7 @@ static int __orderly_poweroff(bool force)
>  	ret = run_cmd(poweroff_cmd);
>  
>  	if (ret && force) {
> +		printk_prefer_direct_enter();
>  		pr_warn("Failed to start orderly shutdown: forcing the issue\n");
>  
>  		/*
> @@ -471,6 +474,7 @@ static int __orderly_poweroff(bool force)
>  		 */
>  		emergency_sync();
>  		kernel_power_off();
> +		printk_prefer_direct_exit();
>  	}
>  
>  	return ret;
> @@ -528,6 +532,8 @@ EXPORT_SYMBOL_GPL(orderly_reboot);
>   */
>  static void hw_failure_emergency_poweroff_func(struct work_struct *work)
>  {
> +	printk_prefer_direct_enter();
> +
>  	/*
>  	 * We have reached here after the emergency shutdown waiting period has
>  	 * expired. This means orderly_poweroff has not been able to shut off
> @@ -544,6 +550,8 @@ static void hw_failure_emergency_poweroff_func(struct work_struct *work)
>  	 */
>  	pr_emerg("Hardware protection shutdown failed. Trying emergency restart\n");
>  	emergency_restart();
> +
> +	printk_prefer_direct_exit();
>  }
>  
>  static DECLARE_DELAYED_WORK(hw_failure_emergency_poweroff_work,
> @@ -582,11 +590,13 @@ void hw_protection_shutdown(const char *reason, int ms_until_forced)
>  {
>  	static atomic_t allow_proceed = ATOMIC_INIT(1);
>  
> +	printk_prefer_direct_enter();
> +
>  	pr_emerg("HARDWARE PROTECTION shutdown (%s)\n", reason);
>  
>  	/* Shutdown should be initiated only once. */
>  	if (!atomic_dec_and_test(&allow_proceed))
> -		return;
> +		goto out;
>  
>  	/*
>  	 * Queue a backup emergency shutdown in the event of
> @@ -594,6 +604,8 @@ void hw_protection_shutdown(const char *reason, int ms_until_forced)
>  	 */
>  	hw_failure_emergency_poweroff(ms_until_forced);
>  	orderly_poweroff(true);
> +out:
> +	printk_prefer_direct_exit();
>  }
>  EXPORT_SYMBOL_GPL(hw_protection_shutdown);
>  
> diff --git a/kernel/watchdog.c b/kernel/watchdog.c
> index 9166220457bc..40024e03d422 100644
> --- a/kernel/watchdog.c
> +++ b/kernel/watchdog.c
> @@ -424,6 +424,8 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
>  		/* Start period for the next softlockup warning. */
>  		update_report_ts();
>  
> +		printk_prefer_direct_enter();
> +
>  		pr_emerg("BUG: soft lockup - CPU#%d stuck for %us! [%s:%d]\n",
>  			smp_processor_id(), duration,
>  			current->comm, task_pid_nr(current));
> @@ -442,6 +444,8 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
>  		add_taint(TAINT_SOFTLOCKUP, LOCKDEP_STILL_OK);
>  		if (softlockup_panic)
>  			panic("softlockup: hung tasks");
> +
> +		printk_prefer_direct_exit();
>  	}
>  
>  	return HRTIMER_RESTART;
> diff --git a/kernel/watchdog_hld.c b/kernel/watchdog_hld.c
> index 247bf0b1582c..701f35f0e2d4 100644
> --- a/kernel/watchdog_hld.c
> +++ b/kernel/watchdog_hld.c
> @@ -135,6 +135,8 @@ static void watchdog_overflow_callback(struct perf_event *event,
>  		if (__this_cpu_read(hard_watchdog_warn) == true)
>  			return;
>  
> +		printk_prefer_direct_enter();
> +
>  		pr_emerg("Watchdog detected hard LOCKUP on cpu %d\n",
>  			 this_cpu);
>  		print_modules();
> @@ -155,6 +157,8 @@ static void watchdog_overflow_callback(struct perf_event *event,
>  		if (hardlockup_panic)
>  			nmi_panic(regs, "Hard LOCKUP");
>  
> +		printk_prefer_direct_exit();
> +
>  		__this_cpu_write(hard_watchdog_warn, true);
>  		return;
>  	}
> -- 
> 2.30.2
> 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: guarantee forward progress: was: Re: [PATCH printk v2 11/12] printk: extend console_lock for proper kthread support
  2022-04-08 20:17     ` John Ogness
@ 2022-04-11 10:45       ` Petr Mladek
  0 siblings, 0 replies; 37+ messages in thread
From: Petr Mladek @ 2022-04-11 10:45 UTC (permalink / raw)
  To: John Ogness
  Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
	linux-kernel, Greg Kroah-Hartman

On Fri 2022-04-08 22:23:15, John Ogness wrote:
> On 2022-04-08, Petr Mladek <pmladek@suse.com> wrote:
> > I played a lot with it and it is really hard because:
> >
> >    + new messages can appear anytime
> >    + direct mode might get requested anytime
> >    + only the direct mode knows whether all messages were flushed
> >      on all consoles
> 
> Yes, and this is why v1 dramatically simplified the picture by making
> kthreads not care about direct mode. In v1 the kthread logic is very
> simple: If there are messages to print, try to print them no matter
> what. We didn't need to worry if someone was printing, because we knew
> that at least the kthread was always printing.
> 
> This meant that there would be times when direct mode is active but the
> kthreads are doing the printing. But in my experimenting, that tends to
> be the case anyway, even with this more complex v2 approach. The reason
> is that if some code does:
> 
> printk_prefer_direct_enter();
> (100 lines of printk calls)
> printk_prefer_direct_exit();
> 
> And directly before that printk_prefer_direct_enter() _any_ kthread was
> already inside call_console_driver(), then _all_ the console_trylock()
> calls of the above 100 printk's will fail. Inserting messages into the
> ringbuffer is fast and any active printer will not have finished
> printing its message before the above code snippet is done.

Good to know.
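The "whoever holds the lock keeps printing" behavior described above can be modeled with a plain trylock (a sketch only; a pthread mutex stands in for console_trylock(), which the kernel uses because printk() may run in atomic context):

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the console lock. */
static pthread_mutex_t console_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Best-effort direct printing: if another context (e.g. a printer
 * kthread) already holds the lock, the caller gives up immediately
 * and relies on the current owner to flush the new message.
 */
static bool try_direct_print(void)
{
	if (pthread_mutex_trylock(&console_lock) != 0)
		return false;	/* someone else is printing */

	/* ... flush pending records to the consoles here ... */
	pthread_mutex_unlock(&console_lock);
	return true;
}
```

This is why the 100 printk calls in the example all fall back to the active printer: ringbuffer insertion finishes long before the lock holder gets through even one call_console_driver() invocation.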

> In fact, the above snippet will only do direct printing if there were
> previously no unflushed messages. That is true for v1 (by design) and v2
> (by misfortune, because ringbuffer insertion is much faster than a
> single call_console_driver() call).

Yup.

> This new idea (v2) of trying to stop kthreads in order to "step aside"
> for direct printing is really just adding a lot of complexity, a lot of
> irqwork calls, and a lot of races. And with my experimenting I am not
> seeing any gain, except for new risks of nobody printing.
> 
> I understand that when we say printk_prefer_direct_enter() that we
> _really_ want to do direct printing. But we cannot force it if any
> printer is already inside call_console_driver(). In that case, direct
> printing simply will not and cannot happen.

I think that we should split this into situations where we need
direct printing and situations where we merely prefer it.

1. The direct printing is needed in situations when the kthreads can't
   work by design, e.g. early boot, panic, suspend, reboot.

   This was the reason why I wanted to increase the chance that
   kthreads would not block it. But as you write above, it does
   not help much. The consoles are so slow that the direct mode
   is not used.

   It is clear that my proposal does not work. And it is not
   reliable anyway. This patchset actually takes care of some
   situations in a much better way, by calling pr_flush() in
   console_stop() or suspend_console().


2. The direct printing is only preferred in some situations where
   the system is in trouble, for example, stall reports.

   The direct mode is preferred because we think that it will be
   more reliable. It is conservative thinking. But in fact,
   the kthreads might provide better results in many situations.

   These stall reports often print a lot of debugging information.
   The direct mode might cause soft-lockups and stalls on its own.
   The kthreads allow showing the messages faster on fast consoles.

   I could imagine that we actually remove the direct mode for
   these stall reports in the future.

   Anyway, it is perfectly fine to print them using kthreads
   when the kthreads are working.


> For v3 I recommend going back to the v1 model, where kthreads do not
> care if direct mode is preferred. I claim that v2 does not yield any
> more actual direct printing than v1 did.

I agree.

> However, I would keep the v2 change that kthreads go into their
> wait_event check after every message. That at least provides earlier
> responses for kthreads to stop themselves if they are disabled.

I am not sure what you mean. They seem to be completely disabled
only when a newly registered console is not able to create its
kthread or when the kthread is not able to allocate buffers.

I think that there is no hurry to stop the other kthreads in this
case.

Or do you mean panic?

> Once we have atomic consoles, things will look different. Then we
> perform true synchronous direct printing. But without them, the "prefer"
> in printk_prefer_direct_enter() is only a preference that can only be
> satisfied under ideal situations (i.e. no kthread is inside
> call_console_driver()).

OK.

Best Regards,
Petr


* Re: [PATCH printk v2 09/12] printk: add functions to prefer direct printing
  2022-04-07 20:20             ` Helge Deller
@ 2022-04-11 12:50               ` Petr Mladek
  0 siblings, 0 replies; 37+ messages in thread
From: Petr Mladek @ 2022-04-11 12:50 UTC (permalink / raw)
  To: Helge Deller
  Cc: John Ogness, Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
	linux-kernel, Greg Kroah-Hartman, Jiri Slaby, Paul E. McKenney,
	Frederic Weisbecker, Neeraj Upadhyay, Josh Triplett,
	Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes, Kees Cook,
	Andrew Morton, Luis Chamberlain, Wei Liu, Xiaoming Ni,
	Peter Zijlstra, Andy Shevchenko, Marco Elver,
	Sebastian Andrzej Siewior, Daniel Lezcano, Shawn Guo,
	Dmitry Torokhov, Matti Vaittinen, Eric W. Biederman, Phil Auld,
	Juri Lelli, Wang Qing, rcu

On Thu 2022-04-07 22:20:30, Helge Deller wrote:
> On 4/7/22 22:04, John Ogness wrote:
> > On 2022-04-07, Helge Deller <deller@gmx.de> wrote:
> >> In my case - while I debug low-level kernel code - I then just need to
> >> use pr_warn() or pr_emerg() and get it printed non-threaded. That's
> >> sufficient for me.
> >
> > Actually, no. The loglevel does not determine whether a message is
> > directly printed or not. By "warn" I was referring to WARN_ON(condition).
> 
> Oh, then there was a misunderstanding on my side.
> 
> > If you are debugging low-level kernel code, you usually will _want_
> > threaded printing. The timestamps match the printk() call, so you will
> > get accurate logs. And the runtime performance of your low-level kernel
> > code will not be significantly affected by the printk() call.
> 
> That very much depends on what you are debugging.
> Currently I'm debugging some CPU hotplug stuff, and I really want my
> debug info printed immediately; otherwise it's too late to analyze what's
> going wrong. I can imagine other use cases, like trying to find memory
> leaks, are similar.

So, it is not about the severity of the messages but about the context,
in this case suspend. We try to address this. The 10th patch of
this patchset adds:

static inline bool allow_direct_printing(void)
{
	return (!printk_kthreads_available ||
		system_state > SYSTEM_RUNNING ||
		oops_in_progress ||
		atomic_read(&printk_prefer_direct));
}

It means that printk() inside the SYSTEM_SUSPEND state will
automatically try to handle the console directly.
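The quoted predicate can be exercised with a small userspace model (hypothetical: the enum collapses the kernel's system_state values to the three that matter here, preserving the SYSTEM_SUSPEND > SYSTEM_RUNNING ordering the check relies on):

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified model of the kernel-side inputs to allow_direct_printing(). */
enum system_state { SYSTEM_BOOTING, SYSTEM_RUNNING, SYSTEM_SUSPEND };

static enum system_state system_state = SYSTEM_RUNNING;
static bool printk_kthreads_available = true;
static int oops_in_progress;
static int printk_prefer_direct;

static bool allow_direct_printing(void)
{
	return !printk_kthreads_available ||
	       system_state > SYSTEM_RUNNING ||
	       oops_in_progress ||
	       printk_prefer_direct > 0;
}
```

Any one of the four conditions is enough: no kthreads yet, a state past SYSTEM_RUNNING (suspend, reboot, power off), an oops, or an explicit prefer-direct section.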

I intentionally say "try" because printk() uses console_trylock().
It fails when anyone else is already printing, including a kthread.
The trylock() is needed because printk() must also be usable in
atomic context. It has worked this way for more than two decades;
it came with SMP support.


> > If for some reason you really want non-threaded printing, the patch we
> > are discussing creates new functions to manually activate it:
> >
> > printk_prefer_direct_enter();
> > pr_info("debugging\n");
> > printk_prefer_direct_exit();
> 
> That's quite unhandy. At least I would have thought that pr_emerg() would
> do that call sequence.
> Wouldn't it make sense to make pr_emerg() work unthreaded, as it's
> used quite seldom and only in special situations...

It is true that pr_emerg() is relatively special.

But I would really like to avoid a situation where developers
use an inappropriate printk level just to get the message printed
directly.

Instead, we should make it easy to request direct printing in
such situations. We could also add a command-line parameter
to disable the kthreads completely.

Best Regards,
Petr


end of thread, other threads:[~2022-04-11 12:50 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-05 13:25 [PATCH printk v2 00/12] implement threaded console printing John Ogness
2022-04-05 13:25 ` [PATCH printk v2 01/12] printk: rename cpulock functions John Ogness
2022-04-06  9:07   ` Petr Mladek
2022-04-05 13:25 ` [PATCH printk v2 02/12] printk: cpu sync always disable interrupts John Ogness
2022-04-05 13:25 ` [PATCH printk v2 03/12] printk: get caller_id/timestamp after migration disable John Ogness
2022-04-05 13:25 ` [PATCH printk v2 04/12] printk: call boot_delay_msec() in printk_delay() John Ogness
2022-04-05 13:25 ` [PATCH printk v2 05/12] printk: add macro for console detail messages John Ogness
2022-04-06 10:31   ` Petr Mladek
2022-04-05 13:25 ` [PATCH printk v2 06/12] printk: refactor and rework printing logic John Ogness
2022-04-06 14:02   ` Petr Mladek
2022-04-05 13:25 ` [PATCH printk v2 07/12] printk: move buffer definitions into console_emit_next_record() caller John Ogness
2022-04-06 14:40   ` Petr Mladek
2022-04-05 13:25 ` [PATCH printk v2 08/12] printk: add pr_flush() John Ogness
2022-04-06 15:17   ` Petr Mladek
2022-04-08 18:57     ` John Ogness
2022-04-05 13:25 ` [PATCH printk v2 09/12] printk: add functions to prefer direct printing John Ogness
2022-04-07  9:56   ` Petr Mladek
2022-04-07 13:35     ` Helge Deller
2022-04-07 14:35       ` John Ogness
2022-04-07 19:36         ` Helge Deller
2022-04-07 20:04           ` John Ogness
2022-04-07 20:20             ` Helge Deller
2022-04-11 12:50               ` Petr Mladek
2022-04-09 15:57   ` Paul E. McKenney
2022-04-05 13:25 ` [PATCH printk v2 10/12] printk: add kthread console printers John Ogness
2022-04-07 16:43   ` start/stop: was: " Petr Mladek
2022-04-07 16:49   ` console_is_usable() check: " Petr Mladek
2022-04-07 16:53   ` wake_up_klogd(): " Petr Mladek
2022-04-07 16:56   ` console_flush_all(): was: : was " Petr Mladek
2022-04-05 13:25 ` [PATCH printk v2 11/12] printk: extend console_lock for proper kthread support John Ogness
2022-04-08 13:45   ` guarantee forward progress: was: " Petr Mladek
2022-04-08 20:17     ` John Ogness
2022-04-11 10:45       ` Petr Mladek
2022-04-08 13:52   ` register_console: " Petr Mladek
2022-04-05 13:25 ` [PATCH printk v2 12/12] printk: remove @console_locked John Ogness
2022-04-05 15:03 ` [PATCH printk v2 00/12] implement threaded console printing Andy Shevchenko
2022-04-05 21:24   ` John Ogness
