All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC][PATCH v5 0/2] printk: Make printk() completely async
@ 2016-03-20 14:13 Sergey Senozhatsky
  2016-03-20 14:13 ` [RFC][PATCH v5 1/2] " Sergey Senozhatsky
  2016-03-20 14:13 ` [RFC][PATCH v5 2/2] printk: Make wake_up_klogd_work_func() async Sergey Senozhatsky
  0 siblings, 2 replies; 20+ messages in thread
From: Sergey Senozhatsky @ 2016-03-20 14:13 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Jan Kara, Petr Mladek, Tejun Heo, Tetsuo Handa, linux-kernel,
	Byungchul Park, Sergey Senozhatsky, Sergey Senozhatsky

Hello,

 RFC

 The patch set is based on slightly updated Jan Kara's patches.

This patch set makes printk() completely asynchronous: new messages
are getting upended to the kernel printk buffer, but instead of 'direct'
printing the actual print job is performed by a dedicated kthread.
This has the advantage that printing always happens from a schedulable
context and thus we don't lockup any particular CPU or even interrupts.

The patch set is against next-20160318 with reverted NMI patches.

the series in total has 3 patches:
- printk: Make printk() completely async
- printk: Make wake_up_klogd_work_func() async
- printk: make console_unlock() async

per discussion, "printk: make console_unlock() async" will be posted
later on.

v5 (Sergey):
-- make printk.synchronous RO (Petr)
-- make printing_func() correct and do not use wait_queue (Petr)
-- do not panic() when can't allocate printing thread (Petr)
-- do not wake_up_process() only in IRQ, prefer vprintk_emit() (Jan)
-- move wake_up_klogd_work_func() to a separate patch (Petr)
-- move wake_up_process() under logbuf lock so printk recursion logic can
   help us out
-- switch to sync_print mode if printk recursion occured
-- drop "printk: Skip messages on oops" patch

v4 (Sergey):
-- do not directly wake_up() the printing kthread from vprintk_emit(), need
   to go via IRQ->wake_up() to avoid sched deadlocks (Jan)

v3 (Sergey):
-- use a dedicated kthread for printing instead of using wq (Jan, Tetsuo, Tejun)

v2 (Sergey):
- use dedicated printk workqueue with WQ_MEM_RECLAIM bit
- fallback to system-wide workqueue only if allocation of printk_wq has
  failed
- do not use system_wq as a fallback wq. both console_lock() and onsole_unlock()
  can spend a significant amount of time; so we need to use system_long_wq.
- rework sync/!sync detection logic
  a) we can have deferred (in_sched) messages before we allocate printk_wq,
     so the only way to handle those messages is via IRQ context
  b) even in printk.synchronous mode, deferred messages must not be printed
     directly, and should go via IRQ context
  c) even if we allocated printk_wq and have !sync_printk mode, we must route
     deferred messages via IRQ context
- so this adds additional bool flags to vprint_emit() and introduces a new
  pending bit to `printk_pending'
- fix build on !PRINTK configs


Jan Kara (2):
  printk: Make printk() completely async
  printk: Make wake_up_klogd_work_func() async

 Documentation/kernel-parameters.txt |  10 ++
 kernel/printk/printk.c              | 222 ++++++++++++++++++++++++++----------
 2 files changed, 174 insertions(+), 58 deletions(-)

-- 
2.8.0.rc0

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [RFC][PATCH v5 1/2] printk: Make printk() completely async
  2016-03-20 14:13 [RFC][PATCH v5 0/2] printk: Make printk() completely async Sergey Senozhatsky
@ 2016-03-20 14:13 ` Sergey Senozhatsky
  2016-03-21  0:06   ` Byungchul Park
  2016-03-20 14:13 ` [RFC][PATCH v5 2/2] printk: Make wake_up_klogd_work_func() async Sergey Senozhatsky
  1 sibling, 1 reply; 20+ messages in thread
From: Sergey Senozhatsky @ 2016-03-20 14:13 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Jan Kara, Petr Mladek, Tejun Heo, Tetsuo Handa, linux-kernel,
	Byungchul Park, Sergey Senozhatsky, Sergey Senozhatsky, Jan Kara

From: Jan Kara <jack@suse.cz>

Currently, printk() sometimes waits for message to be printed to console
and sometimes it does not (when console_sem is held by some other
process). In case printk() grabs console_sem and starts printing to
console, it prints messages from kernel printk buffer until the buffer
is empty. When serial console is attached, printing is slow and thus
other CPUs in the system have plenty of time to append new messages to
the buffer while one CPU is printing. Thus the CPU can spend unbounded
amount of time doing printing in console_unlock().  This is especially
serious problem if the printk() calling console_unlock() was called with
interrupts disabled.

In practice users have observed a CPU can spend tens of seconds printing
in console_unlock() (usually during boot when hundreds of SCSI devices
are discovered) resulting in RCU stalls (CPU doing printing doesn't
reach quiescent state for a long time), softlockup reports (IPIs for the
printing CPU don't get served and thus other CPUs are spinning waiting
for the printing CPU to process IPIs), and eventually a machine death
(as messages from stalls and lockups append to printk buffer faster than
we are able to print). So these machines are unable to boot with serial
console attached. Another observed issue is that due to slow printk,
hardware discovery is slow and udev times out before kernel manages to
discover all the attached HW. Also during artificial stress testing SATA
disk disappears from the system because its interrupts aren't served for
too long.

This patch makes printk() completely asynchronous (similar to what
printk_deferred() did until now). It appends message to the kernel
printk buffer and wake_up()s a special dedicated kthread to do the
printing to console. This has the advantage that printing always happens
from a schedulable contex and thus we don't lockup any particular CPU or
even interrupts. Also it has the advantage that printk() is fast and
thus kernel booting is not slowed down by slow serial console.
Disadvantage of this method is that in case of crash there is higher
chance that important messages won't appear in console output (we may
need working scheduling to print message to console). We somewhat
mitigate this risk by switching printk to the original method of
immediate printing to console if oops is in progress.  Also for
debugging purposes we provide printk.synchronous kernel parameter which
resorts to the original printk behavior.

printk() is expected to work under different conditions and in different
scenarios, including corner cases of OOM when all of the workers are busy
(e.g. allocating memory), thus printk() uses its own dedicated printing
kthread, rather than relying on workqueue (even with WQ_MEM_RECLAIM bit
set we potentially can receive delays in printing until workqueue
declares a ->mayday, as noted by Tetsuo Handa). Another thing to mention,
is that deferred printk() messages may appear before printing kthread
created, so in the very beginning we have to print deferred messages
the old way -- via IRQs.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
 Documentation/kernel-parameters.txt |  10 ++
 kernel/printk/printk.c              | 215 ++++++++++++++++++++++++++----------
 2 files changed, 167 insertions(+), 58 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index ecc74fa..4745e94 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -3114,6 +3114,16 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 	printk.time=	Show timing data prefixed to each printk message line
 			Format: <bool>  (1/Y/y=enable, 0/N/n=disable)
 
+	printk.synchronous=
+			By default kernel messages are printed to console
+			asynchronously (except during early boot or when oops
+			is happening). That avoids kernel stalling behind slow
+			serial console and thus avoids softlockups, interrupt
+			timeouts, or userspace timing out during heavy printing.
+			However for debugging problems, printing messages to
+			console immediately may be desirable. This option
+			enables such behavior.
+
 	processor.max_cstate=	[HW,ACPI]
 			Limit processor to maximum C-state
 			max_cstate=9 overrides any DMI blacklist limit.
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index bfbf284..cf7906e 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -46,6 +46,7 @@
 #include <linux/utsname.h>
 #include <linux/ctype.h>
 #include <linux/uio.h>
+#include <linux/kthread.h>
 
 #include <asm/uaccess.h>
 #include <asm-generic/sections.h>
@@ -284,6 +285,105 @@ static char __log_buf[__LOG_BUF_LEN] __aligned(LOG_ALIGN);
 static char *log_buf = __log_buf;
 static u32 log_buf_len = __LOG_BUF_LEN;
 
+/*
+ * When true, printing to console will happen synchronously unless someone else
+ * is already printing messages.
+ *
+ * The default value on UP systems is 'true'.
+ */
+static bool __read_mostly printk_sync = !IS_ENABLED(CONFIG_SMP);
+module_param_named(synchronous, printk_sync, bool, S_IRUGO);
+MODULE_PARM_DESC(synchronous, "make printing to console synchronous");
+
+/* Printing kthread for async vprintk_emit() */
+static struct task_struct *printk_kthread;
+/* When `true' - printing thread has messages to print */
+static bool need_flush_console;
+
+static int printing_func(void *data)
+{
+	while (1) {
+		set_current_state(TASK_INTERRUPTIBLE);
+		if (!need_flush_console)
+			schedule();
+
+		__set_current_state(TASK_RUNNING);
+		need_flush_console = false;
+
+		console_lock();
+		console_unlock();
+	}
+
+	return 0;
+}
+
+static int __init init_printk_kthread(void)
+{
+	struct task_struct *thread;
+
+	if (printk_sync)
+		return 0;
+
+	thread = kthread_run(printing_func, NULL, "printk");
+	if (IS_ERR(thread)) {
+		pr_err("printk: unable to create printing thread\n");
+		printk_sync = true;
+	} else {
+		printk_kthread = thread;
+	}
+	return 0;
+}
+late_initcall(init_printk_kthread);
+
+/*
+ * Delayed printk version, for scheduler-internal messages:
+ */
+#define PRINTK_PENDING_WAKEUP	(1<<0)
+#define PRINTK_PENDING_OUTPUT	(1<<1)
+
+static DEFINE_PER_CPU(int, printk_pending);
+
+static void wake_up_klogd_work_func(struct irq_work *irq_work)
+{
+	int pending = __this_cpu_xchg(printk_pending, 0);
+
+	if (pending & PRINTK_PENDING_OUTPUT) {
+		/* If trylock fails, someone else is doing the printing */
+		if (console_trylock())
+			console_unlock();
+	}
+
+	if (pending & PRINTK_PENDING_WAKEUP)
+		wake_up_interruptible(&log_wait);
+}
+
+static DEFINE_PER_CPU(struct irq_work, wake_up_klogd_work) = {
+	.func = wake_up_klogd_work_func,
+	.flags = IRQ_WORK_LAZY,
+};
+
+void wake_up_klogd(void)
+{
+	preempt_disable();
+	if (waitqueue_active(&log_wait)) {
+		this_cpu_or(printk_pending, PRINTK_PENDING_WAKEUP);
+		irq_work_queue(this_cpu_ptr(&wake_up_klogd_work));
+	}
+	preempt_enable();
+}
+
+int printk_deferred(const char *fmt, ...)
+{
+	va_list args;
+	int r;
+
+	va_start(args, fmt);
+	r = vprintk_emit(0, LOGLEVEL_SCHED, NULL, 0, fmt, args);
+	va_end(args);
+
+	return r;
+}
+
 /* Return log buffer address */
 char *log_buf_addr_get(void)
 {
@@ -1608,6 +1708,8 @@ asmlinkage int vprintk_emit(int facility, int level,
 			    const char *dict, size_t dictlen,
 			    const char *fmt, va_list args)
 {
+	/* cpu currently holding logbuf_lock in this function */
+	static unsigned int logbuf_cpu = UINT_MAX;
 	static bool recursion_bug;
 	static char textbuf[LOG_LINE_MAX];
 	char *text = textbuf;
@@ -1617,12 +1719,21 @@ asmlinkage int vprintk_emit(int facility, int level,
 	int this_cpu;
 	int printed_len = 0;
 	bool in_sched = false;
-	/* cpu currently holding logbuf_lock in this function */
-	static unsigned int logbuf_cpu = UINT_MAX;
+	bool in_panic = console_loglevel == CONSOLE_LOGLEVEL_MOTORMOUTH;
+	bool sync_print = printk_sync;
+
+	/* disable async printk */
+	if (in_panic)
+		printk_sync = true;
 
 	if (level == LOGLEVEL_SCHED) {
 		level = LOGLEVEL_DEFAULT;
+		/*
+		 * Deferred sched messages must not be printed
+		 * synchronously regardless the @printk_sync or @in_panic.
+		 */
 		in_sched = true;
+		sync_print = false;
 	}
 
 	boot_delay_msec(level);
@@ -1655,10 +1766,23 @@ asmlinkage int vprintk_emit(int facility, int level,
 	raw_spin_lock(&logbuf_lock);
 	logbuf_cpu = this_cpu;
 
+	/*
+	 * Set printing_func() sleep condition early, under the @logbuf_lock.
+	 * So printing kthread (if RUNNING) will go to console_lock() and spin
+	 * on @logbuf_lock.
+	 */
+	if (!printk_sync)
+		need_flush_console = true;
+
 	if (unlikely(recursion_bug)) {
 		static const char recursion_msg[] =
 			"BUG: recent printk recursion!";
 
+		/* disable async printk */
+		printk_sync = true;
+		if (!in_sched)
+			sync_print = true;
+
 		recursion_bug = false;
 		/* emit KERN_CRIT message */
 		printed_len += log_store(0, 2, LOG_PREFIX|LOG_NEWLINE, 0,
@@ -1748,13 +1872,42 @@ asmlinkage int vprintk_emit(int facility, int level,
 						 dict, dictlen, text, text_len);
 	}
 
+	/*
+	 * By default we print message to console asynchronously so that kernel
+	 * doesn't get stalled due to slow serial console. That can lead to
+	 * softlockups, lost interrupts, or userspace timing out under heavy
+	 * printing load.
+	 *
+	 * However we resort to synchronous printing of messages during early
+	 * boot, when synchronous printing was explicitly requested by
+	 * kernel parameter, or when console_verbose() was called to print
+	 * everything during panic / oops.
+	 */
+	if (!sync_print) {
+		if (in_sched) {
+			/*
+			 * @in_sched messages may come too early, when we don't
+			 * yet have @printk_kthread. We can't print deferred
+			 * messages directly, because this may deadlock, route
+			 * them via IRQ context.
+			 */
+			__this_cpu_or(printk_pending,
+					PRINTK_PENDING_OUTPUT);
+			irq_work_queue(this_cpu_ptr(&wake_up_klogd_work));
+		} else if (printk_kthread && !in_panic) {
+			/* Offload printing to a schedulable context. */
+			wake_up_process(printk_kthread);
+		} else {
+			sync_print = true;
+		}
+	}
+
 	logbuf_cpu = UINT_MAX;
 	raw_spin_unlock(&logbuf_lock);
 	lockdep_on();
 	local_irq_restore(flags);
 
-	/* If called from the scheduler, we can not call up(). */
-	if (!in_sched) {
+	if (sync_print) {
 		lockdep_off();
 		/*
 		 * Try to acquire and then immediately release the console
@@ -2723,60 +2876,6 @@ late_initcall(printk_late_init);
 
 #if defined CONFIG_PRINTK
 /*
- * Delayed printk version, for scheduler-internal messages:
- */
-#define PRINTK_PENDING_WAKEUP	0x01
-#define PRINTK_PENDING_OUTPUT	0x02
-
-static DEFINE_PER_CPU(int, printk_pending);
-
-static void wake_up_klogd_work_func(struct irq_work *irq_work)
-{
-	int pending = __this_cpu_xchg(printk_pending, 0);
-
-	if (pending & PRINTK_PENDING_OUTPUT) {
-		/* If trylock fails, someone else is doing the printing */
-		if (console_trylock())
-			console_unlock();
-	}
-
-	if (pending & PRINTK_PENDING_WAKEUP)
-		wake_up_interruptible(&log_wait);
-}
-
-static DEFINE_PER_CPU(struct irq_work, wake_up_klogd_work) = {
-	.func = wake_up_klogd_work_func,
-	.flags = IRQ_WORK_LAZY,
-};
-
-void wake_up_klogd(void)
-{
-	preempt_disable();
-	if (waitqueue_active(&log_wait)) {
-		this_cpu_or(printk_pending, PRINTK_PENDING_WAKEUP);
-		irq_work_queue(this_cpu_ptr(&wake_up_klogd_work));
-	}
-	preempt_enable();
-}
-
-int printk_deferred(const char *fmt, ...)
-{
-	va_list args;
-	int r;
-
-	preempt_disable();
-	va_start(args, fmt);
-	r = vprintk_emit(0, LOGLEVEL_SCHED, NULL, 0, fmt, args);
-	va_end(args);
-
-	__this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
-	irq_work_queue(this_cpu_ptr(&wake_up_klogd_work));
-	preempt_enable();
-
-	return r;
-}
-
-/*
  * printk rate limiting, lifted from the networking subsystem.
  *
  * This enforces a rate limit: not more than 10 kernel messages
-- 
2.8.0.rc0

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [RFC][PATCH v5 2/2] printk: Make wake_up_klogd_work_func() async
  2016-03-20 14:13 [RFC][PATCH v5 0/2] printk: Make printk() completely async Sergey Senozhatsky
  2016-03-20 14:13 ` [RFC][PATCH v5 1/2] " Sergey Senozhatsky
@ 2016-03-20 14:13 ` Sergey Senozhatsky
  1 sibling, 0 replies; 20+ messages in thread
From: Sergey Senozhatsky @ 2016-03-20 14:13 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Jan Kara, Petr Mladek, Tejun Heo, Tetsuo Handa, linux-kernel,
	Byungchul Park, Sergey Senozhatsky, Sergey Senozhatsky, Jan Kara

From: Jan Kara <jack@suse.cz>

Offload printing of scheduler deferred messages from IRQ context
to a schedulable printing kthread, when possible (the same way we
do it in vprintk_emit()). Otherwise, the CPU can spend unbounded
amount of time doing printing in console_unlock() from IRQ.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
 kernel/printk/printk.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index cf7906e..e8dfe88 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -348,9 +348,16 @@ static void wake_up_klogd_work_func(struct irq_work *irq_work)
 	int pending = __this_cpu_xchg(printk_pending, 0);
 
 	if (pending & PRINTK_PENDING_OUTPUT) {
-		/* If trylock fails, someone else is doing the printing */
-		if (console_trylock())
-			console_unlock();
+		if (!printk_sync && printk_kthread) {
+			wake_up_process(printk_kthread);
+		} else {
+			/*
+			 * If trylock fails, someone else is doing
+			 * the printing
+			 */
+			if (console_trylock())
+				console_unlock();
+		}
 	}
 
 	if (pending & PRINTK_PENDING_WAKEUP)
-- 
2.8.0.rc0

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [RFC][PATCH v5 1/2] printk: Make printk() completely async
  2016-03-20 14:13 ` [RFC][PATCH v5 1/2] " Sergey Senozhatsky
@ 2016-03-21  0:06   ` Byungchul Park
  2016-03-21  0:43     ` Sergey Senozhatsky
  0 siblings, 1 reply; 20+ messages in thread
From: Byungchul Park @ 2016-03-21  0:06 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Andrew Morton, Jan Kara, Petr Mladek, Tejun Heo, Tetsuo Handa,
	linux-kernel, Sergey Senozhatsky, Jan Kara

On Sun, Mar 20, 2016 at 11:13:10PM +0900, Sergey Senozhatsky wrote:
> @@ -1748,13 +1872,42 @@ asmlinkage int vprintk_emit(int facility, int level,
>  						 dict, dictlen, text, text_len);
>  	}
>  
> +	/*
> +	 * By default we print message to console asynchronously so that kernel
> +	 * doesn't get stalled due to slow serial console. That can lead to
> +	 * softlockups, lost interrupts, or userspace timing out under heavy
> +	 * printing load.
> +	 *
> +	 * However we resort to synchronous printing of messages during early
> +	 * boot, when synchronous printing was explicitly requested by
> +	 * kernel parameter, or when console_verbose() was called to print
> +	 * everything during panic / oops.
> +	 */
> +	if (!sync_print) {
> +		if (in_sched) {
> +			/*
> +			 * @in_sched messages may come too early, when we don't
> +			 * yet have @printk_kthread. We can't print deferred
> +			 * messages directly, because this may deadlock, route
> +			 * them via IRQ context.
> +			 */
> +			__this_cpu_or(printk_pending,
> +					PRINTK_PENDING_OUTPUT);
> +			irq_work_queue(this_cpu_ptr(&wake_up_klogd_work));
> +		} else if (printk_kthread && !in_panic) {
> +			/* Offload printing to a schedulable context. */
> +			wake_up_process(printk_kthread);

It will not print the "lockup suspected" message at all, for e.g. rq->lock,
p->pi_lock and any locks which are used within wake_up_process().
Furtheremore, any printk() within wake_up_process() cannot work at all, as
well. It's too bad to use any functions potentially including printk()
inside of this critical section.

> +		} else {
> +			sync_print = true;
> +		}
> +	}
> +
>  	logbuf_cpu = UINT_MAX;
>  	raw_spin_unlock(&logbuf_lock);
>  	lockdep_on();
>  	local_irq_restore(flags);

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC][PATCH v5 1/2] printk: Make printk() completely async
  2016-03-21  0:06   ` Byungchul Park
@ 2016-03-21  0:43     ` Sergey Senozhatsky
  2016-03-21  0:56       ` Byungchul Park
  0 siblings, 1 reply; 20+ messages in thread
From: Sergey Senozhatsky @ 2016-03-21  0:43 UTC (permalink / raw)
  To: Byungchul Park
  Cc: Sergey Senozhatsky, Andrew Morton, Jan Kara, Petr Mladek,
	Tejun Heo, Tetsuo Handa, linux-kernel, Sergey Senozhatsky,
	Jan Kara

On (03/21/16 09:06), Byungchul Park wrote:
> On Sun, Mar 20, 2016 at 11:13:10PM +0900, Sergey Senozhatsky wrote:
[..]
> > +	if (!sync_print) {
> > +		if (in_sched) {
> > +			/*
> > +			 * @in_sched messages may come too early, when we don't
> > +			 * yet have @printk_kthread. We can't print deferred
> > +			 * messages directly, because this may deadlock, route
> > +			 * them via IRQ context.
> > +			 */
> > +			__this_cpu_or(printk_pending,
> > +					PRINTK_PENDING_OUTPUT);
> > +			irq_work_queue(this_cpu_ptr(&wake_up_klogd_work));
> > +		} else if (printk_kthread && !in_panic) {
> > +			/* Offload printing to a schedulable context. */
> > +			wake_up_process(printk_kthread);
> 
> It will not print the "lockup suspected" message at all, for e.g. rq->lock,
> p->pi_lock and any locks which are used within wake_up_process().

this will switch to old SYNC printk() mode should such a lockup ever
happen, which is a giant advantage over any other implementation; doing
wake_up_process() within the 'we can detect recursive printk() here'
gives us better control.

why
  printk()->IRQ->wake_up_process()->spin_dump()->printk()->IRQ->wake_up_process()->spin_dump()->printk()->IRQ...
is better?


> Furtheremore, any printk() within wake_up_process() cannot work at all, as
> well.

there is printk_deferred() which has LOGLEVEL_SCHED and which must be used
in sched functions.

	-ss

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC][PATCH v5 1/2] printk: Make printk() completely async
  2016-03-21  0:43     ` Sergey Senozhatsky
@ 2016-03-21  0:56       ` Byungchul Park
  2016-03-21  7:35         ` Sergey Senozhatsky
  0 siblings, 1 reply; 20+ messages in thread
From: Byungchul Park @ 2016-03-21  0:56 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Sergey Senozhatsky, Andrew Morton, Jan Kara, Petr Mladek,
	Tejun Heo, Tetsuo Handa, linux-kernel, Jan Kara

On Mon, Mar 21, 2016 at 09:43:47AM +0900, Sergey Senozhatsky wrote:
> On (03/21/16 09:06), Byungchul Park wrote:
> > On Sun, Mar 20, 2016 at 11:13:10PM +0900, Sergey Senozhatsky wrote:
> [..]
> > > +	if (!sync_print) {
> > > +		if (in_sched) {
> > > +			/*
> > > +			 * @in_sched messages may come too early, when we don't
> > > +			 * yet have @printk_kthread. We can't print deferred
> > > +			 * messages directly, because this may deadlock, route
> > > +			 * them via IRQ context.
> > > +			 */
> > > +			__this_cpu_or(printk_pending,
> > > +					PRINTK_PENDING_OUTPUT);
> > > +			irq_work_queue(this_cpu_ptr(&wake_up_klogd_work));
> > > +		} else if (printk_kthread && !in_panic) {
> > > +			/* Offload printing to a schedulable context. */
> > > +			wake_up_process(printk_kthread);
> > 
> > It will not print the "lockup suspected" message at all, for e.g. rq->lock,
> > p->pi_lock and any locks which are used within wake_up_process().
> 
> this will switch to old SYNC printk() mode should such a lockup ever
> happen, which is a giant advantage over any other implementation; doing
> wake_up_process() within the 'we can detect recursive printk() here'
> gives us better control.
> 
> why
>   printk()->IRQ->wake_up_process()->spin_dump()->printk()->IRQ->wake_up_process()->spin_dump()->printk()->IRQ...
> is better?

What is IRQ? And I didn't say the recursion is good. I just said it can be
avoid without using the last resort.

> 
> 
> > Furtheremore, any printk() within wake_up_process() cannot work at all, as
> > well.
> 
> there is printk_deferred() which has LOGLEVEL_SCHED and which must be used
> in sched functions.

It would be good for all scheduler code to use the printk_deferred() as you
said, but it's not true yet.

> 
> 	-ss

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC][PATCH v5 1/2] printk: Make printk() completely async
  2016-03-21  0:56       ` Byungchul Park
@ 2016-03-21  7:35         ` Sergey Senozhatsky
  2016-03-21  8:07           ` Byungchul Park
  0 siblings, 1 reply; 20+ messages in thread
From: Sergey Senozhatsky @ 2016-03-21  7:35 UTC (permalink / raw)
  To: Byungchul Park
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Andrew Morton, Jan Kara,
	Petr Mladek, Tejun Heo, Tetsuo Handa, linux-kernel, Jan Kara

On (03/21/16 09:56), Byungchul Park wrote:
> > > > +	if (!sync_print) {
> > > > +		if (in_sched) {
> > > > +			/*
> > > > +			 * @in_sched messages may come too early, when we don't
> > > > +			 * yet have @printk_kthread. We can't print deferred
> > > > +			 * messages directly, because this may deadlock, route
> > > > +			 * them via IRQ context.
> > > > +			 */
> > > > +			__this_cpu_or(printk_pending,
> > > > +					PRINTK_PENDING_OUTPUT);
> > > > +			irq_work_queue(this_cpu_ptr(&wake_up_klogd_work));
> > > > +		} else if (printk_kthread && !in_panic) {
> > > > +			/* Offload printing to a schedulable context. */
> > > > +			wake_up_process(printk_kthread);
> > > 
> > > It will not print the "lockup suspected" message at all, for e.g. rq->lock,
> > > p->pi_lock and any locks which are used within wake_up_process().
> > 
> > this will switch to old SYNC printk() mode should such a lockup ever
> > happen, which is a giant advantage over any other implementation; doing
> > wake_up_process() within the 'we can detect recursive printk() here'
> > gives us better control.
> > 
> > why
> >   printk()->IRQ->wake_up_process()->spin_dump()->printk()->IRQ->wake_up_process()->spin_dump()->printk()->IRQ...
> > is better?
> 
> What is IRQ?

this is how printk() can print the messages in async mode apart from
direct and wake_up_process() in vprintk_emit().

	-ss

> > > Furtheremore, any printk() within wake_up_process() cannot work at all, as
> > > well.
> > 
> > there is printk_deferred() which has LOGLEVEL_SCHED and which must be used
> > in sched functions.
> 
> It would be good for all scheduler code to use the printk_deferred() as you
> said, but it's not true yet.
> 
> > 
> > 	-ss
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC][PATCH v5 1/2] printk: Make printk() completely async
  2016-03-21  7:35         ` Sergey Senozhatsky
@ 2016-03-21  8:07           ` Byungchul Park
  2016-03-21  8:47             ` Byungchul Park
  2016-03-21  8:51             ` Sergey Senozhatsky
  0 siblings, 2 replies; 20+ messages in thread
From: Byungchul Park @ 2016-03-21  8:07 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Sergey Senozhatsky, Andrew Morton, Jan Kara, Petr Mladek,
	Tejun Heo, Tetsuo Handa, linux-kernel, Jan Kara

On Mon, Mar 21, 2016 at 04:35:07PM +0900, Sergey Senozhatsky wrote:
> On (03/21/16 09:56), Byungchul Park wrote:
> > > > > +	if (!sync_print) {
> > > > > +		if (in_sched) {
> > > > > +			/*
> > > > > +			 * @in_sched messages may come too early, when we don't
> > > > > +			 * yet have @printk_kthread. We can't print deferred
> > > > > +			 * messages directly, because this may deadlock, route
> > > > > +			 * them via IRQ context.
> > > > > +			 */
> > > > > +			__this_cpu_or(printk_pending,
> > > > > +					PRINTK_PENDING_OUTPUT);
> > > > > +			irq_work_queue(this_cpu_ptr(&wake_up_klogd_work));
> > > > > +		} else if (printk_kthread && !in_panic) {
> > > > > +			/* Offload printing to a schedulable context. */
> > > > > +			wake_up_process(printk_kthread);
> > > > 
> > > > It will not print the "lockup suspected" message at all, for e.g. rq->lock,
> > > > p->pi_lock and any locks which are used within wake_up_process().
> > > 
> > > this will switch to old SYNC printk() mode should such a lockup ever
> > > happen, which is a giant advantage over any other implementation; doing
> > > wake_up_process() within the 'we can detect recursive printk() here'
> > > gives us better control.
> > > 
> > > why
> > >   printk()->IRQ->wake_up_process()->spin_dump()->printk()->IRQ->wake_up_process()->spin_dump()->printk()->IRQ...
> > > is better?
> > 
> > What is IRQ?
> 
> this is how printk() can print the messages in async mode apart from
> direct and wake_up_process() in vprintk_emit().

Do you mean IRQ work?

Is there any reason why you don't put the wake_up_process() out of the
critical section with my suggestion, even though it can solve the infinite
recuresion you worried about?

> 
> 	-ss
> 
> > > > Furtheremore, any printk() within wake_up_process() cannot work at all, as
> > > > well.
> > > 
> > > there is printk_deferred() which has LOGLEVEL_SCHED and which must be used
> > > in sched functions.
> > 
> > It would be good for all scheduler code to use the printk_deferred() as you
> > said, but it's not true yet.
> > 
> > > 
> > > 	-ss
> > 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC][PATCH v5 1/2] printk: Make printk() completely async
  2016-03-21  8:07           ` Byungchul Park
@ 2016-03-21  8:47             ` Byungchul Park
  2016-03-21  9:28               ` Sergey Senozhatsky
  2016-03-21  8:51             ` Sergey Senozhatsky
  1 sibling, 1 reply; 20+ messages in thread
From: Byungchul Park @ 2016-03-21  8:47 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Sergey Senozhatsky, Andrew Morton, Jan Kara, Petr Mladek,
	Tejun Heo, Tetsuo Handa, linux-kernel, Jan Kara

On Mon, Mar 21, 2016 at 05:07:51PM +0900, Byungchul Park wrote:
> On Mon, Mar 21, 2016 at 04:35:07PM +0900, Sergey Senozhatsky wrote:
> > On (03/21/16 09:56), Byungchul Park wrote:
> > > > > > +	if (!sync_print) {
> > > > > > +		if (in_sched) {
> > > > > > +			/*
> > > > > > +			 * @in_sched messages may come too early, when we don't
> > > > > > +			 * yet have @printk_kthread. We can't print deferred
> > > > > > +			 * messages directly, because this may deadlock, route
> > > > > > +			 * them via IRQ context.
> > > > > > +			 */
> > > > > > +			__this_cpu_or(printk_pending,
> > > > > > +					PRINTK_PENDING_OUTPUT);
> > > > > > +			irq_work_queue(this_cpu_ptr(&wake_up_klogd_work));
> > > > > > +		} else if (printk_kthread && !in_panic) {
> > > > > > +			/* Offload printing to a schedulable context. */
> > > > > > +			wake_up_process(printk_kthread);
> > > > > 
> > > > > It will not print the "lockup suspected" message at all, for e.g. rq->lock,
> > > > > p->pi_lock and any locks which are used within wake_up_process().
> > > > 
> > > > this will switch to old SYNC printk() mode should such a lockup ever
> > > > happen, which is a giant advantage over any other implementation; doing
> > > > wake_up_process() within the 'we can detect recursive printk() here'
> > > > gives us better control.
> > > > 
> > > > why
> > > >   printk()->IRQ->wake_up_process()->spin_dump()->printk()->IRQ->wake_up_process()->spin_dump()->printk()->IRQ...
> > > > is better?
> > > 
> > > What is IRQ?
> > 
> > this is how printk() can print the messages in async mode apart from
> > direct and wake_up_process() in vprintk_emit().
> 
> Do you mean IRQ work?
> 
> Is there any reason why you don't put the wake_up_process() out of the
> critical section with my suggestion, even though it can solve the infinite
> recuresion you worried about?

Just to be sure, whether you take my suggestion or not is not important.
I just suggested it in this thread since it can solve what you worried
about. That's all. I can post it in another thread though. Why don't you
consider it so that yours don't miss any printk message? Do you think there
are any problems in my suggestion?

> 
> > 
> > 	-ss
> > 
> > > > > Furtheremore, any printk() within wake_up_process() cannot work at all, as
> > > > > well.
> > > > 
> > > > there is printk_deferred() which has LOGLEVEL_SCHED and which must be used
> > > > in sched functions.
> > > 
> > > It would be good for all scheduler code to use the printk_deferred() as you
> > > said, but it's not true yet.
> > > 
> > > > 
> > > > 	-ss
> > > 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC][PATCH v5 1/2] printk: Make printk() completely async
  2016-03-21  8:07           ` Byungchul Park
  2016-03-21  8:47             ` Byungchul Park
@ 2016-03-21  8:51             ` Sergey Senozhatsky
  1 sibling, 0 replies; 20+ messages in thread
From: Sergey Senozhatsky @ 2016-03-21  8:51 UTC (permalink / raw)
  To: Byungchul Park
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Andrew Morton, Jan Kara,
	Petr Mladek, Tejun Heo, Tetsuo Handa, linux-kernel, Jan Kara

On (03/21/16 17:07), Byungchul Park wrote:
[..]
> > > > > It will not print the "lockup suspected" message at all, for e.g. rq->lock,
> > > > > p->pi_lock and any locks which are used within wake_up_process().
> > > > 
> > > > this will switch to old SYNC printk() mode should such a lockup ever
> > > > happen, which is a giant advantage over any other implementation; doing
> > > > wake_up_process() within the 'we can detect recursive printk() here'
> > > > gives us better control.
> > > > 
> > > > why
> > > >   printk()->IRQ->wake_up_process()->spin_dump()->printk()->IRQ->wake_up_process()->spin_dump()->printk()->IRQ...
> > > > is better?
> > > 
> > > What is IRQ?
> > 
> > this is how printk() can print the messages in async mode apart from
> > direct and wake_up_process() in vprintk_emit().
> 
> Do you mean IRQ work?

yes.

> Is there any reason why you don't put the wake_up_process() out of the
> critical section

I provided reasons already.

> with my suggestion, even though it can solve the infinite recuresion you worried about?

which is 'a static lock pointer in spin_dump()'? I don't think
this will fix recursive printks.



/* seems that for some reason my previous reply was not delivered. re-sending. */

	-ss

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC][PATCH v5 1/2] printk: Make printk() completely async
  2016-03-21  8:47             ` Byungchul Park
@ 2016-03-21  9:28               ` Sergey Senozhatsky
  2016-03-21 14:32                 ` Jan Kara
  2016-03-22  2:13                 ` Byungchul Park
  0 siblings, 2 replies; 20+ messages in thread
From: Sergey Senozhatsky @ 2016-03-21  9:28 UTC (permalink / raw)
  To: Byungchul Park
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Andrew Morton, Jan Kara,
	Petr Mladek, Tejun Heo, Tetsuo Handa, linux-kernel, Jan Kara

On (03/21/16 17:47), Byungchul Park wrote:
[..]
> > Is there any reason why you don't put the wake_up_process() out of the
> > critical section with my suggestion, even though it can solve the infinite
> > recuresion you worried about?
> 
> Just to be sure, whether you take my suggestion or not is not important.
> I just suggested it in this thread since it can solve what you worried
> about. That's all. I can post it in another thread though. Why don't you
> consider it so that yours don't miss any printk message? Do you think there
> are any problems in my suggestion?

we have 2 spin locks in vprintk_emit() -- logbuf_lock and sem->lock. and N
CPUs can concurrently lockup on those two locks, which already makes a
single static pointer in spiun_dump() questionable.

logbug_lock *theoretically* can detect and handle recursive printk()s,
there is no way to catch sem->lock spin_dump() at the moment (but that's
not the point).

there are 2 new spin locks in vprintk_emit() -- p->pi_lock and rq->lock.
what I want is to put those locks inside the "we can detect recursion 100%"
region. so these two locks will not add any new possibilities of recursive
printks, they are covered by the existing recursion detection code thanks
to logbuf lock and static logbuf_cpu. so we still can say that we have 5
places where printk recursion can happen

-- lock + unlock logbuf_lock
   printk() recursion detection code can't help here

-- inside of logbuf_lock critical section
   printk() recursion detection code works here

-- lock + unlock sem->lock
   printk() recursion detection code can't help here


note how "inside of logbuf_lock critical section" takes care of nested
'lock + unlock p->pi_lock' and 'lock + unlock rq->lock'.

moreover, printk() will switch to synchronous mode in recursion handler and
two misbehaving spin locks (4 places where recursion can happen) will not be
executed anymore.


what you want to have -- 4 independent spin locks and 9 places where
recursion can happen, only 1 of which is covered by printk recursion code.

-- lock + unlock logbuf_lock
   printk() recursion detection code can't help here

-- inside of logbuf_lock critical section
   printk() recursion detection code works here

-- lock + unlock p->pi_lock
   printk() recursion detection code can't help here

-- lock + unlock rq->lock
   printk() recursion detection code can't help here

-- lock + unlock sem->lock
   printk() recursion detection code can't help here

and there is a static pointer to fix everything up? what if 2
CPUs will simultaneously printk-recurse in 2 different places?
why this is better?

	-ss

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC][PATCH v5 1/2] printk: Make printk() completely async
  2016-03-21  9:28               ` Sergey Senozhatsky
@ 2016-03-21 14:32                 ` Jan Kara
  2016-03-21 14:58                   ` Sergey Senozhatsky
  2016-03-22  2:13                 ` Byungchul Park
  1 sibling, 1 reply; 20+ messages in thread
From: Jan Kara @ 2016-03-21 14:32 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Byungchul Park, Sergey Senozhatsky, Andrew Morton, Jan Kara,
	Petr Mladek, Tejun Heo, Tetsuo Handa, linux-kernel, Jan Kara

On Mon 21-03-16 18:28:48, Sergey Senozhatsky wrote:
> On (03/21/16 17:47), Byungchul Park wrote:
> [..]
> > > Is there any reason why you don't put the wake_up_process() out of the
> > > critical section with my suggestion, even though it can solve the infinite
> > > recuresion you worried about?
> > 
> > Just to be sure, whether you take my suggestion or not is not important.
> > I just suggested it in this thread since it can solve what you worried
> > about. That's all. I can post it in another thread though. Why don't you
> > consider it so that yours don't miss any printk message? Do you think there
> > are any problems in my suggestion?
> 
> we have 2 spin locks in vprintk_emit() -- logbuf_lock and sem->lock. and N
> CPUs can concurrently lockup on those two locks, which already makes a
> single static pointer in spiun_dump() questionable.
> 
> logbug_lock *theoretically* can detect and handle recursive printk()s,
> there is no way to catch sem->lock spin_dump() at the moment (but that's
> not the point).
> 
> there are 2 new spin locks in vprintk_emit() -- p->pi_lock and rq->lock.

Actually, this is not true. These locks are already in vprintk_emit() via
the up(&console_sem) call from console_unlock() since up() can call
wake_up() which needs the same locks as wake_up_process().

And by calling wake_up_process() under logbuf_lock, you actually introduce
recursion issues for printk_deferred() messages which are supposed to be
working from under rq->lock and similar. So I think you have to keep this
section outside of logbuf_lock.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC][PATCH v5 1/2] printk: Make printk() completely async
  2016-03-21 14:32                 ` Jan Kara
@ 2016-03-21 14:58                   ` Sergey Senozhatsky
  2016-03-21 15:33                     ` Jan Kara
  0 siblings, 1 reply; 20+ messages in thread
From: Sergey Senozhatsky @ 2016-03-21 14:58 UTC (permalink / raw)
  To: Jan Kara
  Cc: Sergey Senozhatsky, Byungchul Park, Sergey Senozhatsky,
	Andrew Morton, Jan Kara, Petr Mladek, Tejun Heo, Tetsuo Handa,
	linux-kernel

Hello Jan,

On (03/21/16 15:32), Jan Kara wrote:
[..]
> > we have 2 spin locks in vprintk_emit() -- logbuf_lock and sem->lock. and N
> > CPUs can concurrently lockup on those two locks, which already makes a
> > single static pointer in spiun_dump() questionable.
> > 
> > logbug_lock *theoretically* can detect and handle recursive printk()s,
> > there is no way to catch sem->lock spin_dump() at the moment (but that's
> > not the point).
> > 
> > there are 2 new spin locks in vprintk_emit() -- p->pi_lock and rq->lock.
> 
> Actually, this is not true. These locks are already in vprintk_emit() via
> the up(&console_sem) call from console_unlock() since up() can call
> wake_up() which needs the same locks as wake_up_process().

true. I meant new locks (which come with printk_kthread). the already
existing locks and problems were not addressed.

> And by calling wake_up_process() under logbuf_lock, you actually introduce
> recursion issues for printk_deferred() messages which are supposed to be
> working from under rq->lock and similar. So I think you have to keep this
> section outside of logbuf_lock.

hm, in_sched (printk_deferred()) messages are printed by
irq work->wake_up_klogd_work_func(), not by wake_up_process()
from vprintk_emit(). or am I missing something?

I'll take a look and re-spin.

	-ss

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC][PATCH v5 1/2] printk: Make printk() completely async
  2016-03-21 14:58                   ` Sergey Senozhatsky
@ 2016-03-21 15:33                     ` Jan Kara
  2016-03-21 17:11                       ` Sergey Senozhatsky
  0 siblings, 1 reply; 20+ messages in thread
From: Jan Kara @ 2016-03-21 15:33 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Jan Kara, Sergey Senozhatsky, Byungchul Park, Andrew Morton,
	Jan Kara, Petr Mladek, Tejun Heo, Tetsuo Handa, linux-kernel

On Mon 21-03-16 23:58:32, Sergey Senozhatsky wrote:
> Hello Jan,
> 
> On (03/21/16 15:32), Jan Kara wrote:
> [..]
> > > we have 2 spin locks in vprintk_emit() -- logbuf_lock and sem->lock. and N
> > > CPUs can concurrently lockup on those two locks, which already makes a
> > > single static pointer in spiun_dump() questionable.
> > > 
> > > logbug_lock *theoretically* can detect and handle recursive printk()s,
> > > there is no way to catch sem->lock spin_dump() at the moment (but that's
> > > not the point).
> > > 
> > > there are 2 new spin locks in vprintk_emit() -- p->pi_lock and rq->lock.
> > 
> > Actually, this is not true. These locks are already in vprintk_emit() via
> > the up(&console_sem) call from console_unlock() since up() can call
> > wake_up() which needs the same locks as wake_up_process().
> 
> true. I meant new locks (which come with printk_kthread). the already
> existing locks and problems were not addressed.
> 
> > And by calling wake_up_process() under logbuf_lock, you actually introduce
> > recursion issues for printk_deferred() messages which are supposed to be
> > working from under rq->lock and similar. So I think you have to keep this
> > section outside of logbuf_lock.
> 
> hm, in_sched (printk_deferred()) messages are printed by
> irq work->wake_up_klogd_work_func(), not by wake_up_process()
> from vprintk_emit(). or am I missing something?

Think of following:

some function
  printk()
    vprintk_emit()
      spin_lock(&logbuf_lock);
      ...
      wake_up_process()
        printk_deferred()
          vprintk_emit() -> recursion on logbuf_lock

Previously scheduler code was allowed to call printk_deferred() wherever it
wanted...

So we are not supposed to call into the scheduler from under logbuf_lock...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC][PATCH v5 1/2] printk: Make printk() completely async
  2016-03-21 15:33                     ` Jan Kara
@ 2016-03-21 17:11                       ` Sergey Senozhatsky
  2016-03-22  2:18                         ` Byungchul Park
  0 siblings, 1 reply; 20+ messages in thread
From: Sergey Senozhatsky @ 2016-03-21 17:11 UTC (permalink / raw)
  To: Jan Kara
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Byungchul Park,
	Andrew Morton, Jan Kara, Petr Mladek, Tejun Heo, Tetsuo Handa,
	linux-kernel

On (03/21/16 16:33), Jan Kara wrote:
[..]
> > > And by calling wake_up_process() under logbuf_lock, you actually introduce
> > > recursion issues for printk_deferred() messages which are supposed to be
> > > working from under rq->lock and similar. So I think you have to keep this
> > > section outside of logbuf_lock.
> > 
> > hm, in_sched (printk_deferred()) messages are printed by
> > irq work->wake_up_klogd_work_func(), not by wake_up_process()
> > from vprintk_emit(). or am I missing something?
> 
> Think of following:
> 
> some function
>   printk()
>     vprintk_emit()
>       spin_lock(&logbuf_lock);
>       ...
>       wake_up_process()
>         printk_deferred()
>           vprintk_emit() -> recursion on logbuf_lock

uh, indeed. I was more concerned about printk() calls that are
troublemakers and are already in wake_up_process() - spin_dump()s.
but yes, braking printk_deferred() in this case is a regression.
thanks for pointing that out. and also thanks to Byungchul.

	-ss

> Previously scheduler code was allowed to call printk_deferred() wherever it
> wanted...
> 
> So we are not supposed to call into the scheduler from under logbuf_lock...
> 
> 								Honza
> -- 
> Jan Kara <jack@suse.com>
> SUSE Labs, CR
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC][PATCH v5 1/2] printk: Make printk() completely async
  2016-03-21  9:28               ` Sergey Senozhatsky
  2016-03-21 14:32                 ` Jan Kara
@ 2016-03-22  2:13                 ` Byungchul Park
  2016-03-22  5:52                   ` Sergey Senozhatsky
  1 sibling, 1 reply; 20+ messages in thread
From: Byungchul Park @ 2016-03-22  2:13 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Sergey Senozhatsky, Andrew Morton, Jan Kara, Petr Mladek,
	Tejun Heo, Tetsuo Handa, linux-kernel, Jan Kara

On Mon, Mar 21, 2016 at 06:28:48PM +0900, Sergey Senozhatsky wrote:
> > Just to be sure, whether you take my suggestion or not is not important.
> > I just suggested it in this thread since it can solve what you worried
> > about. That's all. I can post it in another thread though. Why don't you
> > consider it so that yours don't miss any printk message? Do you think there
> > are any problems in my suggestion?
> 
> we have 2 spin locks in vprintk_emit() -- logbuf_lock and sem->lock. and N
> CPUs can concurrently lockup on those two locks, which already makes a
> single static pointer in spiun_dump() questionable.

As I already said, the static pointer is for an infinite recursion within a
printk() in the case "lockup suspected". It can prevent at least
meaningless *infinite* recursion at the first place the recursion occured.
Once the single static variable works, that is, it solves the recursion,
the next recusion can be solved if it exists, that is, more than two locks
are locked-up. It will be helpful for understanding what I am saying if you
think about how the infinite recursion can happen. So a single static
variable is enough to prevent a *infinite* recursion.

CPU0                    CPU1

spin_lock(A)            printk()
(not release A)            spin_lock(A)
                              spin_dump() // lockup suspected
			         printk()
				    spint_lock(A)
				       ...

where A can be logbuf_lock, sem->lock, rq->lock, p->pi_lock or any lock
which is used within printk(). For your information, logbuf_lock can also
cause an infinite recursion at the first entrance on a CPU.

> logbug_lock *theoretically* can detect and handle recursive printk()s,
> there is no way to catch sem->lock spin_dump() at the moment (but that's
> not the point).

It's the reason why I suggested the way to avoid an inifinit recursion in
the spin debug facitity in the case "lockup suspected".

> there are 2 new spin locks in vprintk_emit() -- p->pi_lock and rq->lock.
> what I want is to put those locks inside the "we can detect recursion 100%"
> region. so these two locks will not add any new possibilities of recursive
> printks, they are covered by the existing recursion detection code thanks
> to logbuf lock and static logbuf_cpu. so we still can say that we have 5

I think, even though we can detect the recursion by the logbuf_cpu, we
should not use it for the purpose of detecting since it's the last resort
preventing an infinite recursion. It seems to be same, it's different
purpose. Now you decided to put it out of the critical section, this
arguing is not much important though.. :-(

> places where printk recursion can happen
> 
> -- lock + unlock logbuf_lock
>    printk() recursion detection code can't help here
> 
> -- inside of logbuf_lock critical section
>    printk() recursion detection code works here
> 
> -- lock + unlock sem->lock
>    printk() recursion detection code can't help here
> 
> 
> note how "inside of logbuf_lock critical section" takes care of nested
> 'lock + unlock p->pi_lock' and 'lock + unlock rq->lock'.

I got what you intended to do. You don't need to explain in more detail.

> moreover, printk() will switch to synchronous mode in recursion handler and
> two misbehaving spin locks (4 places where recursion can happen) will not be
> executed anymore.

This, too.

> what you want to have -- 4 independent spin locks and 9 places where
> recursion can happen, only 1 of which is covered by printk recursion code.
> 
> -- lock + unlock logbuf_lock
>    printk() recursion detection code can't help here
> 
> -- inside of logbuf_lock critical section
>    printk() recursion detection code works here
> 
> -- lock + unlock p->pi_lock
>    printk() recursion detection code can't help here
> 
> -- lock + unlock rq->lock
>    printk() recursion detection code can't help here
> 
> -- lock + unlock sem->lock
>    printk() recursion detection code can't help here
> 
> and there is a static pointer to fix everything up? what if 2

We can use a single static pointer to fix it as I said.

> CPUs will simultaneously printk-recurse in 2 different places?

As I said, 2 different places mean 2 different timings, in the case of
infinite recursion. I think I explained fully. Do I need to explain more?

Thanks,
Byungchul

> why this is better?
> 
> 	-ss

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC][PATCH v5 1/2] printk: Make printk() completely async
  2016-03-21 17:11                       ` Sergey Senozhatsky
@ 2016-03-22  2:18                         ` Byungchul Park
  0 siblings, 0 replies; 20+ messages in thread
From: Byungchul Park @ 2016-03-22  2:18 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Jan Kara, Sergey Senozhatsky, Andrew Morton, Jan Kara,
	Petr Mladek, Tejun Heo, Tetsuo Handa, linux-kernel

On Tue, Mar 22, 2016 at 02:11:05AM +0900, Sergey Senozhatsky wrote:
> On (03/21/16 16:33), Jan Kara wrote:
> [..]
> > > > And by calling wake_up_process() under logbuf_lock, you actually introduce
> > > > recursion issues for printk_deferred() messages which are supposed to be
> > > > working from under rq->lock and similar. So I think you have to keep this
> > > > section outside of logbuf_lock.
> > > 
> > > hm, in_sched (printk_deferred()) messages are printed by
> > > irq work->wake_up_klogd_work_func(), not by wake_up_process()
> > > from vprintk_emit(). or am I missing something?
> > 
> > Think of following:
> > 
> > some function
> >   printk()
> >     vprintk_emit()
> >       spin_lock(&logbuf_lock);
> >       ...
> >       wake_up_process()
> >         printk_deferred()
> >           vprintk_emit() -> recursion on logbuf_lock
> 
> uh, indeed. I was more concerned about printk() calls that are
> troublemakers and are already in wake_up_process() - spin_dump()s.
> but yes, braking printk_deferred() in this case is a regression.

Already said any kind of printk() cannot work within logbuf_lock. :-(

> thanks for pointing that out. and also thanks to Byungchul.

My pleasure.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC][PATCH v5 1/2] printk: Make printk() completely async
  2016-03-22  2:13                 ` Byungchul Park
@ 2016-03-22  5:52                   ` Sergey Senozhatsky
  2016-03-22  6:57                     ` Byungchul Park
  0 siblings, 1 reply; 20+ messages in thread
From: Sergey Senozhatsky @ 2016-03-22  5:52 UTC (permalink / raw)
  To: Byungchul Park
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Andrew Morton, Jan Kara,
	Petr Mladek, Tejun Heo, Tetsuo Handa, linux-kernel, Jan Kara

On (03/22/16 11:13), Byungchul Park wrote:
[..]

what about a "normal" case, when things are not going to explode printk(),
but we have several lockups on the same lock (which is probably more
likely than printk recursion)?

suppose:
- there are 8 CPUs on the system
- 1 cpus owns the spin_lock for too long
- 4 cpus are trying to lock the spin_lock w/o any success
- so all 4 trigger spin_dump.

what we have at the moment, is that all 4 CPUs will report a lockup,
but with this static pointer only X (between 1 and 4, depending on the
timing; on how fast spin_dump() will return (logbuf lock can
be busy for a while); etc.) CPUs will do so.

	-ss

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC][PATCH v5 1/2] printk: Make printk() completely async
  2016-03-22  5:52                   ` Sergey Senozhatsky
@ 2016-03-22  6:57                     ` Byungchul Park
  2016-03-22  7:43                       ` Sergey Senozhatsky
  0 siblings, 1 reply; 20+ messages in thread
From: Byungchul Park @ 2016-03-22  6:57 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Sergey Senozhatsky, Andrew Morton, Jan Kara, Petr Mladek,
	Tejun Heo, Tetsuo Handa, linux-kernel, Jan Kara

On Tue, Mar 22, 2016 at 02:52:43PM +0900, Sergey Senozhatsky wrote:
> On (03/22/16 11:13), Byungchul Park wrote:
> [..]
> 
> what about a "normal" case, when things are not going to explode printk(),
> but we have several lockups on the same lock (which is probably more
> likely than printk recursion)?
> 
> suppose:
> - there are 8 CPUs on the system
> - 1 cpus owns the spin_lock for too long
> - 4 cpus are trying to lock the spin_lock w/o any success
> - so all 4 trigger spin_dump.
> 
> what we have at the moment, is that all 4 CPUs will report a lockup,
> but with this static pointer only X (between 1 and 4, depending on the
> timing; on how fast spin_dump() will return (logbuf lock can
> be busy for a while); etc.) CPUs will do so.

Yes, I agree with you. If it's important to warn "lockup suspected" not
only per each lock, but also per each pair (cpu, lock), it should be
modified so that it can distinguish between cpus.

Anyway I think it's meaningful to make printing "lockup suspected" per a
meaningful unit, e.g. (cpu, lock), once.

If you agree with it, I will post it in your thread after fixing it. Or I
will do it in another thread. The reason why I ask you is because it can
solve a infinite recursion caused by newly introduced locks in yours.

Thank you for informing it,
Byungchul

> 
> 	-ss

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC][PATCH v5 1/2] printk: Make printk() completely async
  2016-03-22  6:57                     ` Byungchul Park
@ 2016-03-22  7:43                       ` Sergey Senozhatsky
  0 siblings, 0 replies; 20+ messages in thread
From: Sergey Senozhatsky @ 2016-03-22  7:43 UTC (permalink / raw)
  To: Byungchul Park
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Andrew Morton, Jan Kara,
	Petr Mladek, Tejun Heo, Tetsuo Handa, linux-kernel, Jan Kara

On (03/22/16 15:57), Byungchul Park wrote:
> On Tue, Mar 22, 2016 at 02:52:43PM +0900, Sergey Senozhatsky wrote:
> > On (03/22/16 11:13), Byungchul Park wrote:
> > [..]
> > 
> > what about a "normal" case, when things are not going to explode printk(),
> > but we have several lockups on the same lock (which is probably more
> > likely than printk recursion)?
> > 
> > suppose:
> > - there are 8 CPUs on the system
> > - 1 cpus owns the spin_lock for too long
> > - 4 cpus are trying to lock the spin_lock w/o any success
> > - so all 4 trigger spin_dump.
> > 
> > what we have at the moment, is that all 4 CPUs will report a lockup,
> > but with this static pointer only X (between 1 and 4, depending on the
> > timing; on how fast spin_dump() will return (logbuf lock can
> > be busy for a while); etc.) CPUs will do so.
> 
> Yes, I agree with you. If it's important to warn "lockup suspected" not
> only per each lock, but also per each pair (cpu, lock), it should be
> modified so that it can distinguish between cpus.

well, just my opinion, I'd really prefer to have lockup reports from all of the
CPUs that have actually locked up, not on a per-lock basis. this gives better
understanding what was going on. trigger_all_cpu_backtrace() is not guaranteed
to be NMI-based, so if CPUA has detected a lockup it better report straight
ahead.

> Anyway I think it's meaningful to make printing "lockup suspected" per a
> meaningful unit, e.g. (cpu, lock), once.
> 
> If you agree with it, I will post it in your thread after fixing it. Or I
> will do it in another thread. The reason why I ask you is because it can
> solve a infinite recursion caused by newly introduced locks in yours.

a separate thread I think; this patch set is around different things and
does not touch spinlock_debug code, recursive printk is a problem even w/o
this patch set.

	-ss

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2016-03-22  7:42 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-03-20 14:13 [RFC][PATCH v5 0/2] printk: Make printk() completely async Sergey Senozhatsky
2016-03-20 14:13 ` [RFC][PATCH v5 1/2] " Sergey Senozhatsky
2016-03-21  0:06   ` Byungchul Park
2016-03-21  0:43     ` Sergey Senozhatsky
2016-03-21  0:56       ` Byungchul Park
2016-03-21  7:35         ` Sergey Senozhatsky
2016-03-21  8:07           ` Byungchul Park
2016-03-21  8:47             ` Byungchul Park
2016-03-21  9:28               ` Sergey Senozhatsky
2016-03-21 14:32                 ` Jan Kara
2016-03-21 14:58                   ` Sergey Senozhatsky
2016-03-21 15:33                     ` Jan Kara
2016-03-21 17:11                       ` Sergey Senozhatsky
2016-03-22  2:18                         ` Byungchul Park
2016-03-22  2:13                 ` Byungchul Park
2016-03-22  5:52                   ` Sergey Senozhatsky
2016-03-22  6:57                     ` Byungchul Park
2016-03-22  7:43                       ` Sergey Senozhatsky
2016-03-21  8:51             ` Sergey Senozhatsky
2016-03-20 14:13 ` [RFC][PATCH v5 2/2] printk: Make wake_up_klogd_work_func() async Sergey Senozhatsky

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.