[RFC][PATCHv3 0/6] printk: use printk_safe to handle printk() recursive calls

All of lore.kernel.org
 help / color / mirror / Atom feed

* [RFC][PATCHv3 0/6] printk: use printk_safe to handle printk() recursive calls
@ 2016-10-18 15:40 Sergey Senozhatsky
  2016-10-18 15:40 ` [RFC][PATCHv3 1/6] printk: use vprintk_func in vprintk() Sergey Senozhatsky
                   ` (8 more replies)
  0 siblings, 9 replies; 23+ messages in thread
From: Sergey Senozhatsky @ 2016-10-18 15:40 UTC (permalink / raw)
  To: Petr Mladek, Andrew Morton
  Cc: Jan Kara, Tejun Heo, Calvin Owens, Thomas Gleixner, Mel Gorman,
	Steven Rostedt, Ingo Molnar, Peter Zijlstra, linux-kernel,
	Sergey Senozhatsky, Sergey Senozhatsky

Hello,

        RFC

        This patch set extends a lock-less NMI per-cpu buffers idea to
handle recursive printk() calls. The basic mechanism is pretty much the
same -- at the beginning of a deadlock-prone section we switch to lock-less
printk callback, and return back to a default printk implementation at the
end; the messages are getting flushed to a logbuf buffer from a safer
context.

Deadlock scenarios that printk_safe can handle:

a) printk recursion from logbuf_lock spin_lock section in printk()
  printk()
    raw_spin_lock(&logbuf_lock);
    WARN_ON(1);
    raw_spin_unlock(&logbuf_lock);

b) printk from sem->lock spin_lock section
  printk()
    console_trylock()
      down_trylock()
        raw_spin_lock_irqsave(&sem->lock, flags);
        WARN_ON(1);
        raw_spin_unlock_irqrestore(&sem->lock, flags);

c) printk from logbuf_lock spin_lock section in console_unlock()
  printk()
    console_unlock()
      raw_spin_lock(&logbuf_lock);
      WARN_ON(1);
      raw_spin_unlock(&logbuf_lock);

d) printk from ->pi_lock from semaphore up
  printk()
    console_unlock()
      up()
        try_to_wake_up()
          raw_spin_lock_irqsave(&p->pi_lock, flags);
          WARN_ON(1);
          raw_spin_unlock_irqrestore(&p->pi_lock, flags);

[[against next-20161018]]

v3: (review by Petr)
-- renamed to printk_safe
-- !NMI config build fix
-- report lost messages for both printk_sae and printk_nmi
-- dropped recursion reporting patch
-- etc.

v2:
-- fix build error on !NMI configs, reported by Fengguang
-- reworked the series based on Petr's feedback
-- added a new patch to drop zap_locks()

Sergey Senozhatsky (6):
  printk: use vprintk_func in vprintk()
  printk: rename nmi.c and exported api
  printk: introduce per-cpu safe_print seq buffer
  printk: report lost messages in printk safe/nmi contexts
  printk: use printk_safe buffers
  printk: remove zap_locks() function

 arch/arm/kernel/smp.c                  |   4 +-
 include/linux/hardirq.h                |   4 +-
 include/linux/printk.h                 |  29 ++--
 init/Kconfig                           |  16 ++-
 init/main.c                            |   2 +-
 kernel/kexec_core.c                    |   2 +-
 kernel/panic.c                         |   4 +-
 kernel/printk/Makefile                 |   3 +-
 kernel/printk/internal.h               |  65 +++++----
 kernel/printk/printk.c                 | 116 ++++------------
 kernel/printk/{nmi.c => printk_safe.c} | 239 +++++++++++++++++++++++++--------
 lib/nmi_backtrace.c                    |   2 +-
 12 files changed, 277 insertions(+), 209 deletions(-)
 rename kernel/printk/{nmi.c => printk_safe.c} (54%)

-- 
2.10.1.382.ga23ca1b

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [RFC][PATCHv3 1/6] printk: use vprintk_func in vprintk()
  2016-10-18 15:40 [RFC][PATCHv3 0/6] printk: use printk_safe to handle printk() recursive calls Sergey Senozhatsky
@ 2016-10-18 15:40 ` Sergey Senozhatsky
  2016-10-18 16:49   ` Steven Rostedt
  2016-10-18 15:40 ` [RFC][PATCHv3 2/6] printk: rename nmi.c and exported api Sergey Senozhatsky
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 23+ messages in thread
From: Sergey Senozhatsky @ 2016-10-18 15:40 UTC (permalink / raw)
  To: Petr Mladek, Andrew Morton
  Cc: Jan Kara, Tejun Heo, Calvin Owens, Thomas Gleixner, Mel Gorman,
	Steven Rostedt, Ingo Molnar, Peter Zijlstra, linux-kernel,
	Sergey Senozhatsky, Sergey Senozhatsky

vprintk(), just like printk(), better be using per-cpu printk_func
instead of direct vprintk_emit() call. Just in case if vprintk()
will ever be called from NMI, or from any other context that can
deadlock in printk().

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
 kernel/printk/printk.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index d5e3973..be7f60b 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1917,7 +1917,7 @@ EXPORT_SYMBOL(vprintk_emit);
 
 asmlinkage int vprintk(const char *fmt, va_list args)
 {
-	return vprintk_emit(0, LOGLEVEL_DEFAULT, NULL, 0, fmt, args);
+	return vprintk_func(fmt, args);
 }
 EXPORT_SYMBOL(vprintk);
 
-- 
2.10.1.382.ga23ca1b

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [RFC][PATCHv3 2/6] printk: rename nmi.c and exported api
  2016-10-18 15:40 [RFC][PATCHv3 0/6] printk: use printk_safe to handle printk() recursive calls Sergey Senozhatsky
  2016-10-18 15:40 ` [RFC][PATCHv3 1/6] printk: use vprintk_func in vprintk() Sergey Senozhatsky
@ 2016-10-18 15:40 ` Sergey Senozhatsky
  2016-10-18 16:49   ` Steven Rostedt
  2016-10-18 15:40 ` [RFC][PATCHv3 3/6] printk: introduce per-cpu safe_print seq buffer Sergey Senozhatsky
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 23+ messages in thread
From: Sergey Senozhatsky @ 2016-10-18 15:40 UTC (permalink / raw)
  To: Petr Mladek, Andrew Morton
  Cc: Jan Kara, Tejun Heo, Calvin Owens, Thomas Gleixner, Mel Gorman,
	Steven Rostedt, Ingo Molnar, Peter Zijlstra, linux-kernel,
	Sergey Senozhatsky, Sergey Senozhatsky

A preparation patch for printk_safe work. No functional change.
- rename nmi.c to print_safe.c
- rename exported functions to have `safe' prefix.

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
 arch/arm/kernel/smp.c                  |  4 +-
 include/linux/hardirq.h                |  4 +-
 include/linux/printk.h                 | 20 +++++-----
 init/Kconfig                           | 16 ++++----
 init/main.c                            |  2 +-
 kernel/kexec_core.c                    |  2 +-
 kernel/panic.c                         |  4 +-
 kernel/printk/Makefile                 |  2 +-
 kernel/printk/{nmi.c => printk_safe.c} | 69 +++++++++++++++++-----------------
 lib/nmi_backtrace.c                    |  2 +-
 10 files changed, 64 insertions(+), 61 deletions(-)
 rename kernel/printk/{nmi.c => printk_safe.c} (77%)

diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 7dd14e8..8b056e1 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -644,11 +644,11 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
 		break;
 
 	case IPI_CPU_BACKTRACE:
-		printk_nmi_enter();
+		printk_safe_nmi_enter();
 		irq_enter();
 		nmi_cpu_backtrace(regs);
 		irq_exit();
-		printk_nmi_exit();
+		printk_safe_nmi_exit();
 		break;
 
 	default:
diff --git a/include/linux/hardirq.h b/include/linux/hardirq.h
index c683996..14840aa 100644
--- a/include/linux/hardirq.h
+++ b/include/linux/hardirq.h
@@ -61,7 +61,7 @@ extern void irq_exit(void);
 
 #define nmi_enter()						\
 	do {							\
-		printk_nmi_enter();				\
+		printk_safe_nmi_enter();			\
 		lockdep_off();					\
 		ftrace_nmi_enter();				\
 		BUG_ON(in_nmi());				\
@@ -78,7 +78,7 @@ extern void irq_exit(void);
 		preempt_count_sub(NMI_OFFSET + HARDIRQ_OFFSET);	\
 		ftrace_nmi_exit();				\
 		lockdep_on();					\
-		printk_nmi_exit();				\
+		printk_safe_nmi_exit();				\
 	} while (0)
 
 #endif /* LINUX_HARDIRQ_H */
diff --git a/include/linux/printk.h b/include/linux/printk.h
index eac1af8..c9a3080 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -132,17 +132,17 @@ void early_printk(const char *s, ...) { }
 #endif
 
 #ifdef CONFIG_PRINTK_NMI
-extern void printk_nmi_init(void);
-extern void printk_nmi_enter(void);
-extern void printk_nmi_exit(void);
-extern void printk_nmi_flush(void);
-extern void printk_nmi_flush_on_panic(void);
+extern void printk_safe_init(void);
+extern void printk_safe_nmi_enter(void);
+extern void printk_safe_nmi_exit(void);
+extern void printk_safe_flush(void);
+extern void printk_safe_flush_on_panic(void);
 #else
-static inline void printk_nmi_init(void) { }
-static inline void printk_nmi_enter(void) { }
-static inline void printk_nmi_exit(void) { }
-static inline void printk_nmi_flush(void) { }
-static inline void printk_nmi_flush_on_panic(void) { }
+static inline void printk_safe_init(void) { }
+static inline void printk_safe_nmi_enter(void) { }
+static inline void printk_safe_nmi_exit(void) { }
+static inline void printk_safe_flush(void) { }
+static inline void printk_safe_flush_on_panic(void) { }
 #endif /* PRINTK_NMI */
 
 #ifdef CONFIG_PRINTK
diff --git a/init/Kconfig b/init/Kconfig
index 34407f1..77fe553 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -875,17 +875,19 @@ config LOG_CPU_MAX_BUF_SHIFT
 		     13 =>   8 KB for each CPU
 		     12 =>   4 KB for each CPU
 
-config NMI_LOG_BUF_SHIFT
-	int "Temporary per-CPU NMI log buffer size (12 => 4KB, 13 => 8KB)"
+config PRINTK_SAFE_LOG_BUF_SHIFT
+	int "Temporary per-CPU printk log buffer size (12 => 4KB, 13 => 8KB)"
 	range 10 21
 	default 13
-	depends on PRINTK_NMI
+	depends on PRINTK
 	help
-	  Select the size of a per-CPU buffer where NMI messages are temporary
-	  stored. They are copied to the main log buffer in a safe context
-	  to avoid a deadlock. The value defines the size as a power of 2.
+	  Select the size of an alternate printk per-CPU buffer where messages
+	  printed from usafe contexts are temporary stored. One example would
+	  be NMI messages, another one - printk recursion. The messages are
+	  copied to the main log buffer in a safe context to avoid a deadlock.
+	  The value defines the size as a power of 2.
 
-	  NMI messages are rare and limited. The largest one is when
+	  Those messages are rare and limited. The largest one is when
 	  a backtrace is printed. It usually fits into 4KB. Select
 	  8KB if you want to be on the safe side.
 
diff --git a/init/main.c b/init/main.c
index 9af9274..17b0a49 100644
--- a/init/main.c
+++ b/init/main.c
@@ -577,7 +577,7 @@ asmlinkage __visible void __init start_kernel(void)
 	timekeeping_init();
 	time_init();
 	sched_clock_postinit();
-	printk_nmi_init();
+	printk_safe_init();
 	perf_event_init();
 	profile_init();
 	call_function_init();
diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index 5616755..297508a1 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -914,7 +914,7 @@ void crash_kexec(struct pt_regs *regs)
 	old_cpu = atomic_cmpxchg(&panic_cpu, PANIC_CPU_INVALID, this_cpu);
 	if (old_cpu == PANIC_CPU_INVALID) {
 		/* This is the 1st CPU which comes here, so go ahead. */
-		printk_nmi_flush_on_panic();
+		printk_safe_flush_on_panic();
 		__crash_kexec(regs);
 
 		/*
diff --git a/kernel/panic.c b/kernel/panic.c
index e6480e2..67c11fd 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -188,7 +188,7 @@ void panic(const char *fmt, ...)
 	 * Bypass the panic_cpu check and call __crash_kexec directly.
 	 */
 	if (!_crash_kexec_post_notifiers) {
-		printk_nmi_flush_on_panic();
+		printk_safe_flush_on_panic();
 		__crash_kexec(NULL);
 
 		/*
@@ -213,7 +213,7 @@ void panic(const char *fmt, ...)
 	atomic_notifier_call_chain(&panic_notifier_list, 0, buf);
 
 	/* Call flush even twice. It tries harder with a single online CPU */
-	printk_nmi_flush_on_panic();
+	printk_safe_flush_on_panic();
 	kmsg_dump(KMSG_DUMP_PANIC);
 
 	/*
diff --git a/kernel/printk/Makefile b/kernel/printk/Makefile
index abb0042..6079281 100644
--- a/kernel/printk/Makefile
+++ b/kernel/printk/Makefile
@@ -1,3 +1,3 @@
 obj-y	= printk.o
-obj-$(CONFIG_PRINTK_NMI)		+= nmi.o
+obj-$(CONFIG_PRINTK_NMI)		+= printk_safe.o
 obj-$(CONFIG_A11Y_BRAILLE_CONSOLE)	+= braille.o
diff --git a/kernel/printk/nmi.c b/kernel/printk/printk_safe.c
similarity index 77%
rename from kernel/printk/nmi.c
rename to kernel/printk/printk_safe.c
index 16bab47..1f66163 100644
--- a/kernel/printk/nmi.c
+++ b/kernel/printk/printk_safe.c
@@ -1,5 +1,5 @@
 /*
- * nmi.c - Safe printk in NMI context
+ * printk_safe.c - Safe printk in NMI context
  *
  * This program is free software; you can redistribute it and/or
  * modify it under the terms of the GNU General Public License
@@ -39,18 +39,18 @@
  * were handled or when IRQs are blocked.
  */
 DEFINE_PER_CPU(printk_func_t, printk_func) = vprintk_default;
-static int printk_nmi_irq_ready;
+static int printk_safe_irq_ready;
 atomic_t nmi_message_lost;
 
-#define NMI_LOG_BUF_LEN ((1 << CONFIG_NMI_LOG_BUF_SHIFT) -		\
+#define SAFE_LOG_BUF_LEN ((1 << CONFIG_PRINTK_SAFE_LOG_BUF_SHIFT) -	\
 			 sizeof(atomic_t) - sizeof(struct irq_work))
 
-struct nmi_seq_buf {
+struct printk_safe_seq_buf {
 	atomic_t		len;	/* length of written data */
 	struct irq_work		work;	/* IRQ work that flushes the buffer */
-	unsigned char		buffer[NMI_LOG_BUF_LEN];
+	unsigned char		buffer[SAFE_LOG_BUF_LEN];
 };
-static DEFINE_PER_CPU(struct nmi_seq_buf, nmi_print_seq);
+static DEFINE_PER_CPU(struct printk_safe_seq_buf, nmi_print_seq);
 
 /*
  * Safe printk() for NMI context. It uses a per-CPU buffer to
@@ -58,9 +58,9 @@ static DEFINE_PER_CPU(struct nmi_seq_buf, nmi_print_seq);
  * one writer running. But the buffer might get flushed from another
  * CPU, so we need to be careful.
  */
-static int vprintk_nmi(const char *fmt, va_list args)
+static int vprintk_safe_nmi(const char *fmt, va_list args)
 {
-	struct nmi_seq_buf *s = this_cpu_ptr(&nmi_print_seq);
+	struct printk_safe_seq_buf *s = this_cpu_ptr(&nmi_print_seq);
 	int add = 0;
 	size_t len;
 
@@ -90,7 +90,7 @@ static int vprintk_nmi(const char *fmt, va_list args)
 		goto again;
 
 	/* Get flushed in a more safe context. */
-	if (add && printk_nmi_irq_ready) {
+	if (add && printk_safe_irq_ready) {
 		/* Make sure that IRQ work is really initialized. */
 		smp_rmb();
 		irq_work_queue(&s->work);
@@ -99,7 +99,7 @@ static int vprintk_nmi(const char *fmt, va_list args)
 	return add;
 }
 
-static void printk_nmi_flush_line(const char *text, int len)
+static void printk_safe_flush_line(const char *text, int len)
 {
 	/*
 	 * The buffers are flushed in NMI only on panic.  The messages must
@@ -117,23 +117,24 @@ static void printk_nmi_flush_line(const char *text, int len)
  * printk one line from the temporary buffer from @start index until
  * and including the @end index.
  */
-static void printk_nmi_flush_seq_line(struct nmi_seq_buf *s,
+static void printk_safe_flush_seq_line(struct printk_safe_seq_buf *s,
 					int start, int end)
 {
 	const char *buf = s->buffer + start;
 
-	printk_nmi_flush_line(buf, (end - start) + 1);
+	printk_safe_flush_line(buf, (end - start) + 1);
 }
 
 /*
  * Flush data from the associated per_CPU buffer. The function
  * can be called either via IRQ work or independently.
  */
-static void __printk_nmi_flush(struct irq_work *work)
+static void __printk_safe_flush(struct irq_work *work)
 {
 	static raw_spinlock_t read_lock =
 		__RAW_SPIN_LOCK_INITIALIZER(read_lock);
-	struct nmi_seq_buf *s = container_of(work, struct nmi_seq_buf, work);
+	struct printk_safe_seq_buf *s = container_of(work,
+			struct printk_safe_seq_buf, work);
 	unsigned long flags;
 	size_t len, size;
 	int i, last_i;
@@ -157,9 +158,9 @@ static void __printk_nmi_flush(struct irq_work *work)
 	 * @len must only increase.
 	 */
 	if (i && i >= len) {
-		const char *msg = "printk_nmi_flush: internal error\n";
+		const char *msg = "printk_safe_flush: internal error\n";
 
-		printk_nmi_flush_line(msg, strlen(msg));
+		printk_safe_flush_line(msg, strlen(msg));
 	}
 
 	if (!len)
@@ -174,14 +175,14 @@ static void __printk_nmi_flush(struct irq_work *work)
 	/* Print line by line. */
 	for (; i < size; i++) {
 		if (s->buffer[i] == '\n') {
-			printk_nmi_flush_seq_line(s, last_i, i);
+			printk_safe_flush_seq_line(s, last_i, i);
 			last_i = i + 1;
 		}
 	}
 	/* Check if there was a partial line. */
 	if (last_i < size) {
-		printk_nmi_flush_seq_line(s, last_i, size - 1);
-		printk_nmi_flush_line("\n", strlen("\n"));
+		printk_safe_flush_seq_line(s, last_i, size - 1);
+		printk_safe_flush_line("\n", strlen("\n"));
 	}
 
 	/*
@@ -198,31 +199,31 @@ static void __printk_nmi_flush(struct irq_work *work)
 }
 
 /**
- * printk_nmi_flush - flush all per-cpu nmi buffers.
+ * printk_safe_flush - flush all per-cpu nmi buffers.
  *
  * The buffers are flushed automatically via IRQ work. This function
  * is useful only when someone wants to be sure that all buffers have
  * been flushed at some point.
  */
-void printk_nmi_flush(void)
+void printk_safe_flush(void)
 {
 	int cpu;
 
 	for_each_possible_cpu(cpu)
-		__printk_nmi_flush(&per_cpu(nmi_print_seq, cpu).work);
+		__printk_safe_flush(&per_cpu(nmi_print_seq, cpu).work);
 }
 
 /**
- * printk_nmi_flush_on_panic - flush all per-cpu nmi buffers when the system
+ * printk_safe_flush_on_panic - flush all per-cpu nmi buffers when the system
  *	goes down.
  *
- * Similar to printk_nmi_flush() but it can be called even in NMI context when
+ * Similar to printk_safe_flush() but it can be called even in NMI context when
  * the system goes down. It does the best effort to get NMI messages into
  * the main ring buffer.
  *
  * Note that it could try harder when there is only one CPU online.
  */
-void printk_nmi_flush_on_panic(void)
+void printk_safe_flush_on_panic(void)
 {
 	/*
 	 * Make sure that we could access the main ring buffer.
@@ -236,33 +237,33 @@ void printk_nmi_flush_on_panic(void)
 		raw_spin_lock_init(&logbuf_lock);
 	}
 
-	printk_nmi_flush();
+	printk_safe_flush();
 }
 
-void __init printk_nmi_init(void)
+void __init printk_safe_init(void)
 {
 	int cpu;
 
 	for_each_possible_cpu(cpu) {
-		struct nmi_seq_buf *s = &per_cpu(nmi_print_seq, cpu);
+		struct printk_safe_seq_buf *s = &per_cpu(nmi_print_seq, cpu);
 
-		init_irq_work(&s->work, __printk_nmi_flush);
+		init_irq_work(&s->work, __printk_safe_flush);
 	}
 
 	/* Make sure that IRQ works are initialized before enabling. */
 	smp_wmb();
-	printk_nmi_irq_ready = 1;
+	printk_safe_irq_ready = 1;
 
 	/* Flush pending messages that did not have scheduled IRQ works. */
-	printk_nmi_flush();
+	printk_safe_flush();
 }
 
-void printk_nmi_enter(void)
+void printk_safe_nmi_enter(void)
 {
-	this_cpu_write(printk_func, vprintk_nmi);
+	this_cpu_write(printk_func, vprintk_safe_nmi);
 }
 
-void printk_nmi_exit(void)
+void printk_safe_nmi_exit(void)
 {
 	this_cpu_write(printk_func, vprintk_default);
 }
diff --git a/lib/nmi_backtrace.c b/lib/nmi_backtrace.c
index 7555475..5f7999e 100644
--- a/lib/nmi_backtrace.c
+++ b/lib/nmi_backtrace.c
@@ -77,7 +77,7 @@ void nmi_trigger_cpumask_backtrace(const cpumask_t *mask,
 	 * Force flush any remote buffers that might be stuck in IRQ context
 	 * and therefore could not run their irq_work.
 	 */
-	printk_nmi_flush();
+	printk_safe_flush();
 
 	clear_bit_unlock(0, &backtrace_flag);
 	put_cpu();
-- 
2.10.1.382.ga23ca1b

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [RFC][PATCHv3 3/6] printk: introduce per-cpu safe_print seq buffer
  2016-10-18 15:40 [RFC][PATCHv3 0/6] printk: use printk_safe to handle printk() recursive calls Sergey Senozhatsky
  2016-10-18 15:40 ` [RFC][PATCHv3 1/6] printk: use vprintk_func in vprintk() Sergey Senozhatsky
  2016-10-18 15:40 ` [RFC][PATCHv3 2/6] printk: rename nmi.c and exported api Sergey Senozhatsky
@ 2016-10-18 15:40 ` Sergey Senozhatsky
  2016-10-18 16:46   ` Steven Rostedt
  2016-10-19 13:42   ` Sergey Senozhatsky
  2016-10-18 15:40 ` [RFC][PATCHv3 4/6] printk: report lost messages in printk safe/nmi contexts Sergey Senozhatsky
                   ` (5 subsequent siblings)
  8 siblings, 2 replies; 23+ messages in thread
From: Sergey Senozhatsky @ 2016-10-18 15:40 UTC (permalink / raw)
  To: Petr Mladek, Andrew Morton
  Cc: Jan Kara, Tejun Heo, Calvin Owens, Thomas Gleixner, Mel Gorman,
	Steven Rostedt, Ingo Molnar, Peter Zijlstra, linux-kernel,
	Sergey Senozhatsky, Sergey Senozhatsky

This patch extends the idea of NMI per-cpu buffers to regions
that may cause recursive printk() calls and possible deadlocks.
Namely, printk() can't handle printk calls from schedule code
or printk() calls from lock debugging code (spin_dump() for instance);
because those may be called with `sem->lock' already taken or any
other `critical' locks (p->pi_lock, etc.). An example of deadlock
can be

 vprintk_emit()
  console_unlock()
   up()                        << raw_spin_lock_irqsave(&sem->lock, flags);
    wake_up_process()
     try_to_wake_up()
      ttwu_queue()
       ttwu_activate()
        activate_task()
         enqueue_task()
          enqueue_task_fair()
           cfs_rq_of()
            task_of()
             WARN_ON_ONCE(!entity_is_task(se))
              vprintk_emit()
               console_trylock()
                down_trylock()
                 raw_spin_lock_irqsave(&sem->lock, flags)
                 ^^^^ deadlock

and some other cases.

Just like in NMI implementation, the solution uses a per-cpu
`printk_func' pointer to 'redirect' printk() calls to a 'safe'
callback, that store messages in a per-cpu buffer and flushes
them back to logbuf buffer later.

Usage example:

 printk()
  printk_safe_enter(flags)
  //
  //  any printk() call from here will endup in vprintk_safe(),
  //  that stores messages in a special per-CPU buffer.
  //
  printk_safe_exit(flags)

The 'redirection' mechanism, though, has been reworked, as suggested
by Petr Mladek. Instead of using a per-cpu @print_func callback we now
keep a per-cpu printk-context variable and call either default or nmi
vprintk function depending on its value. printk_nmi_entrer/exit and
printk_safe_enter/exit, thus, just set/celar corresponding bits in
printk-context functions.

The patch only adds printk_safe support, we don't use it yet.

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
 include/linux/printk.h      |  21 +++++--
 kernel/printk/Makefile      |   3 +-
 kernel/printk/internal.h    |  58 ++++++++++--------
 kernel/printk/printk.c      |   3 -
 kernel/printk/printk_safe.c | 140 ++++++++++++++++++++++++++++++++++----------
 5 files changed, 160 insertions(+), 65 deletions(-)

diff --git a/include/linux/printk.h b/include/linux/printk.h
index c9a3080..e8ede8f 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -132,17 +132,11 @@ void early_printk(const char *s, ...) { }
 #endif
 
 #ifdef CONFIG_PRINTK_NMI
-extern void printk_safe_init(void);
 extern void printk_safe_nmi_enter(void);
 extern void printk_safe_nmi_exit(void);
-extern void printk_safe_flush(void);
-extern void printk_safe_flush_on_panic(void);
 #else
-static inline void printk_safe_init(void) { }
 static inline void printk_safe_nmi_enter(void) { }
 static inline void printk_safe_nmi_exit(void) { }
-static inline void printk_safe_flush(void) { }
-static inline void printk_safe_flush_on_panic(void) { }
 #endif /* PRINTK_NMI */
 
 #ifdef CONFIG_PRINTK
@@ -194,6 +188,9 @@ void __init setup_log_buf(int early);
 __printf(1, 2) void dump_stack_set_arch_desc(const char *fmt, ...);
 void dump_stack_print_info(const char *log_lvl);
 void show_regs_print_info(const char *log_lvl);
+extern void printk_safe_init(void);
+extern void printk_safe_flush(void);
+extern void printk_safe_flush_on_panic(void);
 #else
 static inline __printf(1, 0)
 int vprintk(const char *s, va_list args)
@@ -253,6 +250,18 @@ static inline void dump_stack_print_info(const char *log_lvl)
 static inline void show_regs_print_info(const char *log_lvl)
 {
 }
+
+static inline void printk_safe_init(void)
+{
+}
+
+static inline void printk_safe_flush(void)
+{
+}
+
+static inline void printk_safe_flush_on_panic(void)
+{
+}
 #endif
 
 extern asmlinkage void dump_stack(void) __cold;
diff --git a/kernel/printk/Makefile b/kernel/printk/Makefile
index 6079281..87ad14d 100644
--- a/kernel/printk/Makefile
+++ b/kernel/printk/Makefile
@@ -1,3 +1,2 @@
-obj-y	= printk.o
-obj-$(CONFIG_PRINTK_NMI)		+= printk_safe.o
+obj-y	= printk_safe.o printk.o
 obj-$(CONFIG_A11Y_BRAILLE_CONSOLE)	+= braille.o
diff --git a/kernel/printk/internal.h b/kernel/printk/internal.h
index 7fd2838..015f68f 100644
--- a/kernel/printk/internal.h
+++ b/kernel/printk/internal.h
@@ -16,26 +16,8 @@
  */
 #include <linux/percpu.h>
 
-typedef __printf(1, 0) int (*printk_func_t)(const char *fmt, va_list args);
-
-int __printf(1, 0) vprintk_default(const char *fmt, va_list args);
-
 #ifdef CONFIG_PRINTK_NMI
 
-extern raw_spinlock_t logbuf_lock;
-
-/*
- * printk() could not take logbuf_lock in NMI context. Instead,
- * it temporary stores the strings into a per-CPU buffer.
- * The alternative implementation is chosen transparently
- * via per-CPU variable.
- */
-DECLARE_PER_CPU(printk_func_t, printk_func);
-static inline __printf(1, 0) int vprintk_func(const char *fmt, va_list args)
-{
-	return this_cpu_read(printk_func)(fmt, args);
-}
-
 extern atomic_t nmi_message_lost;
 static inline int get_nmi_message_lost(void)
 {
@@ -44,14 +26,44 @@ static inline int get_nmi_message_lost(void)
 
 #else /* CONFIG_PRINTK_NMI */
 
-static inline __printf(1, 0) int vprintk_func(const char *fmt, va_list args)
-{
-	return vprintk_default(fmt, args);
-}
-
 static inline int get_nmi_message_lost(void)
 {
 	return 0;
 }
 
 #endif /* CONFIG_PRINTK_NMI */
+
+#ifdef CONFIG_PRINTK
+
+#define PRINTK_SAFE_CONTEXT_MASK	0x7fffffff
+#define PRINTK_SAFE_NMI_CONTEXT_MASK	0x80000000
+
+extern raw_spinlock_t logbuf_lock;
+
+__printf(1, 0) int vprintk_default(const char *fmt, va_list args);
+__printf(1, 0) int vprintk_func(const char *fmt, va_list args);
+void __printk_safe_enter(void);
+void __printk_safe_exit(void);
+
+#define printk_safe_enter(flags)		\
+	do {					\
+		local_irq_save(flags);		\
+		__printk_safe_enter();		\
+	} while (0)
+
+#define printk_safe_exit(flags)		\
+	do {					\
+		__printk_safe_exit();		\
+		local_irq_restore(flags);	\
+	} while (0)
+
+#else
+
+__printf(1, 0) int vprintk_func(const char *fmt, va_list args) { return 0; }
+void __printk_safe_enter(void) { }
+void __printk_safe_exit(void) { }
+
+#define printk_safe_enter(flags)
+#define printk_safe_exit(flags)
+
+#endif /* CONFIG_PRINTK */
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index be7f60b..58cfc5c 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2024,9 +2024,6 @@ static size_t msg_print_text(const struct printk_log *msg, enum log_flags prev,
 static size_t cont_print_text(char *text, size_t size) { return 0; }
 static bool suppress_message_printing(int level) { return false; }
 
-/* Still needs to be defined for users */
-DEFINE_PER_CPU(printk_func_t, printk_func);
-
 #endif /* CONFIG_PRINTK */
 
 #ifdef CONFIG_EARLY_PRINTK
diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c
index 1f66163..a2b0b99 100644
--- a/kernel/printk/printk_safe.c
+++ b/kernel/printk/printk_safe.c
@@ -1,5 +1,5 @@
 /*
- * printk_safe.c - Safe printk in NMI context
+ * printk_safe.c - Safe printk for printk-deadlock-prone contexts
  *
  * This program is free software; you can redistribute it and/or
  * modify it under the terms of the GNU General Public License
@@ -32,15 +32,14 @@
  * is later flushed into the main ring buffer via IRQ work.
  *
  * The alternative implementation is chosen transparently
- * via @printk_func per-CPU variable.
+ * by examinig current printk() context mask stored in @printk_safe_context
+ * per-CPU variable.
  *
  * The implementation allows to flush the strings also from another CPU.
  * There are situations when we want to make sure that all buffers
  * were handled or when IRQs are blocked.
  */
-DEFINE_PER_CPU(printk_func_t, printk_func) = vprintk_default;
 static int printk_safe_irq_ready;
-atomic_t nmi_message_lost;
 
 #define SAFE_LOG_BUF_LEN ((1 << CONFIG_PRINTK_SAFE_LOG_BUF_SHIFT) -	\
 			 sizeof(atomic_t) - sizeof(struct irq_work))
@@ -50,27 +49,26 @@ struct printk_safe_seq_buf {
 	struct irq_work		work;	/* IRQ work that flushes the buffer */
 	unsigned char		buffer[SAFE_LOG_BUF_LEN];
 };
+
+static DEFINE_PER_CPU(struct printk_safe_seq_buf, safe_print_seq);
+static DEFINE_PER_CPU(int, printk_safe_context);
+
+#ifdef CONFIG_PRINTK_NMI
 static DEFINE_PER_CPU(struct printk_safe_seq_buf, nmi_print_seq);
+atomic_t nmi_message_lost;
+#endif
 
-/*
- * Safe printk() for NMI context. It uses a per-CPU buffer to
- * store the message. NMIs are not nested, so there is always only
- * one writer running. But the buffer might get flushed from another
- * CPU, so we need to be careful.
- */
-static int vprintk_safe_nmi(const char *fmt, va_list args)
+static int printk_safe_log_store(struct printk_safe_seq_buf *s,
+		const char *fmt, va_list args)
 {
-	struct printk_safe_seq_buf *s = this_cpu_ptr(&nmi_print_seq);
-	int add = 0;
+	int add;
 	size_t len;
 
 again:
 	len = atomic_read(&s->len);
 
-	if (len >= sizeof(s->buffer)) {
-		atomic_inc(&nmi_message_lost);
-		return 0;
-	}
+	if (len >= sizeof(s->buffer))
+		return -E2BIG;
 
 	/*
 	 * Make sure that all old data have been read before the buffer was
@@ -110,7 +108,6 @@ static void printk_safe_flush_line(const char *text, int len)
 		printk_deferred("%.*s", len, text);
 	else
 		printk("%.*s", len, text);
-
 }
 
 /*
@@ -126,7 +123,7 @@ static void printk_safe_flush_seq_line(struct printk_safe_seq_buf *s,
 }
 
 /*
- * Flush data from the associated per_CPU buffer. The function
+ * Flush data from the associated per-CPU buffer. The function
  * can be called either via IRQ work or independently.
  */
 static void __printk_safe_flush(struct irq_work *work)
@@ -209,8 +206,12 @@ void printk_safe_flush(void)
 {
 	int cpu;
 
-	for_each_possible_cpu(cpu)
+	for_each_possible_cpu(cpu) {
+#ifdef CONFIG_PRINTK_NMI
 		__printk_safe_flush(&per_cpu(nmi_print_seq, cpu).work);
+#endif
+		__printk_safe_flush(&per_cpu(safe_print_seq, cpu).work);
+	}
 }
 
 /**
@@ -240,14 +241,101 @@ void printk_safe_flush_on_panic(void)
 	printk_safe_flush();
 }
 
+#ifdef CONFIG_PRINTK_NMI
+/*
+ * Safe printk() for NMI context. It uses a per-CPU buffer to
+ * store the message. NMIs are not nested, so there is always only
+ * one writer running. But the buffer might get flushed from another
+ * CPU, so we need to be careful.
+ */
+static int vprintk_safe_nmi(const char *fmt, va_list args)
+{
+	struct printk_safe_seq_buf *s = this_cpu_ptr(&nmi_print_seq);
+	int add;
+
+	add = printk_safe_log_store(s, fmt, args);
+	if (add == -E2BIG) {
+		atomic_inc(&nmi_message_lost);
+		add = 0;
+	}
+
+	return add;
+}
+
+void printk_safe_nmi_enter(void)
+{
+	this_cpu_or(printk_safe_context, PRINTK_SAFE_NMI_CONTEXT_MASK);
+}
+
+void printk_safe_nmi_exit(void)
+{
+	this_cpu_and(printk_safe_context, ~PRINTK_SAFE_NMI_CONTEXT_MASK);
+}
+
+#else
+
+static int vprintk_safe_nmi(const char *fmt, va_list args)
+{
+	return 0;
+}
+
+#endif /* CONFIG_PRINTK_NMI */
+
+/*
+ * Lockless printk(), to avoid deadlocks should the printk() recurse
+ * into itself. It uses a per-CPU buffer to store the message, just like
+ * NMI.
+ */
+static int vprintk_safe(const char *fmt, va_list args)
+{
+	struct printk_safe_seq_buf *s = this_cpu_ptr(&safe_print_seq);
+
+	return printk_safe_log_store(s, fmt, args);
+}
+
+/*
+ * Returns with local IRQs disabled.
+ * Can be preempted by NMI.
+ */
+void __printk_safe_enter(void)
+{
+	this_cpu_inc(printk_safe_context);
+}
+
+/*
+ * Restores local IRQs state saved in printk_safe_enter().
+ * Can be preempted by NMI.
+ */
+void __printk_safe_exit(void)
+{
+	this_cpu_dec(printk_safe_context);
+}
+
+__printf(1, 0) int vprintk_func(const char *fmt, va_list args)
+{
+	if (this_cpu_read(printk_safe_context) & PRINTK_SAFE_NMI_CONTEXT_MASK)
+		return vprintk_safe_nmi(fmt, args);
+
+	if (this_cpu_read(printk_safe_context) & PRINTK_SAFE_CONTEXT_MASK)
+		return vprintk_safe(fmt, args);
+
+	return vprintk_default(fmt, args);
+}
+
 void __init printk_safe_init(void)
 {
 	int cpu;
 
 	for_each_possible_cpu(cpu) {
-		struct printk_safe_seq_buf *s = &per_cpu(nmi_print_seq, cpu);
+		struct printk_safe_seq_buf *s;
 
+		s = &per_cpu(safe_print_seq, cpu);
 		init_irq_work(&s->work, __printk_safe_flush);
+
+#ifdef CONFIG_PRINTK_NMI
+		s = &per_cpu(nmi_print_seq, cpu);
+		init_irq_work(&s->work, __printk_safe_flush);
+#endif
 	}
 
 	/* Make sure that IRQ works are initialized before enabling. */
@@ -257,13 +345,3 @@ void __init printk_safe_init(void)
 	/* Flush pending messages that did not have scheduled IRQ works. */
 	printk_safe_flush();
 }
-
-void printk_safe_nmi_enter(void)
-{
-	this_cpu_write(printk_func, vprintk_safe_nmi);
-}
-
-void printk_safe_nmi_exit(void)
-{
-	this_cpu_write(printk_func, vprintk_default);
-}
-- 
2.10.1.382.ga23ca1b

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [RFC][PATCHv3 4/6] printk: report lost messages in printk safe/nmi contexts
  2016-10-18 15:40 [RFC][PATCHv3 0/6] printk: use printk_safe to handle printk() recursive calls Sergey Senozhatsky
                   ` (2 preceding siblings ...)
  2016-10-18 15:40 ` [RFC][PATCHv3 3/6] printk: introduce per-cpu safe_print seq buffer Sergey Senozhatsky
@ 2016-10-18 15:40 ` Sergey Senozhatsky
  2016-10-18 15:40 ` [RFC][PATCHv3 5/6] printk: use printk_safe buffers Sergey Senozhatsky
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 23+ messages in thread
From: Sergey Senozhatsky @ 2016-10-18 15:40 UTC (permalink / raw)
  To: Petr Mladek, Andrew Morton
  Cc: Jan Kara, Tejun Heo, Calvin Owens, Thomas Gleixner, Mel Gorman,
	Steven Rostedt, Ingo Molnar, Peter Zijlstra, linux-kernel,
	Sergey Senozhatsky, Sergey Senozhatsky

Account lost messages in pritk-safe and printk-safe-nmi
contexts and report those numbers during printk_safe_flush().

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
 kernel/printk/internal.h    | 17 ---------------
 kernel/printk/printk.c      | 10 ---------
 kernel/printk/printk_safe.c | 50 +++++++++++++++++++++++++++++++++++++++++++--
 3 files changed, 48 insertions(+), 29 deletions(-)

diff --git a/kernel/printk/internal.h b/kernel/printk/internal.h
index 015f68f..375fe6d 100644
--- a/kernel/printk/internal.h
+++ b/kernel/printk/internal.h
@@ -16,23 +16,6 @@
  */
 #include <linux/percpu.h>
 
-#ifdef CONFIG_PRINTK_NMI
-
-extern atomic_t nmi_message_lost;
-static inline int get_nmi_message_lost(void)
-{
-	return atomic_xchg(&nmi_message_lost, 0);
-}
-
-#else /* CONFIG_PRINTK_NMI */
-
-static inline int get_nmi_message_lost(void)
-{
-	return 0;
-}
-
-#endif /* CONFIG_PRINTK_NMI */
-
 #ifdef CONFIG_PRINTK
 
 #define PRINTK_SAFE_CONTEXT_MASK	0x7fffffff
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 58cfc5c..a99414b 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1791,7 +1791,6 @@ asmlinkage int vprintk_emit(int facility, int level,
 	unsigned long flags;
 	int this_cpu;
 	int printed_len = 0;
-	int nmi_message_lost;
 	bool in_sched = false;
 	/* cpu currently holding logbuf_lock in this function */
 	static unsigned int logbuf_cpu = UINT_MAX;
@@ -1842,15 +1841,6 @@ asmlinkage int vprintk_emit(int facility, int level,
 					 strlen(recursion_msg));
 	}
 
-	nmi_message_lost = get_nmi_message_lost();
-	if (unlikely(nmi_message_lost)) {
-		text_len = scnprintf(textbuf, sizeof(textbuf),
-				     "BAD LUCK: lost %d message(s) from NMI context!",
-				     nmi_message_lost);
-		printed_len += log_store(0, 2, LOG_PREFIX|LOG_NEWLINE, 0,
-					 NULL, 0, textbuf, text_len);
-	}
-
 	/*
 	 * The printf needs to come first; we need the syslog
 	 * prefix which might be passed-in as a parameter.
diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c
index a2b0b99..df66d99 100644
--- a/kernel/printk/printk_safe.c
+++ b/kernel/printk/printk_safe.c
@@ -52,10 +52,11 @@ struct printk_safe_seq_buf {
 
 static DEFINE_PER_CPU(struct printk_safe_seq_buf, safe_print_seq);
 static DEFINE_PER_CPU(int, printk_safe_context);
+static atomic_t safe_message_lost;
 
 #ifdef CONFIG_PRINTK_NMI
 static DEFINE_PER_CPU(struct printk_safe_seq_buf, nmi_print_seq);
-atomic_t nmi_message_lost;
+static atomic_t nmi_message_lost;
 #endif
 
 static int printk_safe_log_store(struct printk_safe_seq_buf *s,
@@ -122,6 +123,41 @@ static void printk_safe_flush_seq_line(struct printk_safe_seq_buf *s,
 	printk_safe_flush_line(buf, (end - start) + 1);
 }
 
+static void report_message_lost(atomic_t *num_lost, char *fmt)
+{
+	int lost = atomic_xchg(num_lost, 0);
+
+	if (lost) {
+		char msg[56];
+
+		scnprintf(msg, sizeof(msg), fmt, lost);
+
+		printk_safe_flush_line(msg, strlen(msg));
+	}
+}
+
+#ifdef CONFIG_PRINTK_NMI
+
+static void report_nmi_message_lost(void)
+{
+	report_message_lost(&nmi_message_lost,
+			"Lost %d message(s) from NMI context!");
+}
+
+#else
+
+static void report_nmi_message_lost(void)
+{
+}
+
+#endif /* CONFIG_PRINTK_NMI */
+
+static void report_safe_message_lost(void)
+{
+	report_message_lost(&safe_message_lost,
+			"Lost %d message(s) from printk-safe context!");
+}
+
 /*
  * Flush data from the associated per-CPU buffer. The function
  * can be called either via IRQ work or independently.
@@ -147,6 +183,9 @@ static void __printk_safe_flush(struct irq_work *work)
 
 	i = 0;
 more:
+	report_nmi_message_lost();
+	report_safe_message_lost();
+
 	len = atomic_read(&s->len);
 
 	/*
@@ -289,8 +328,15 @@ static int vprintk_safe_nmi(const char *fmt, va_list args)
 static int vprintk_safe(const char *fmt, va_list args)
 {
 	struct printk_safe_seq_buf *s = this_cpu_ptr(&safe_print_seq);
+	int add;
 
-	return printk_safe_log_store(s, fmt, args);
+	add = printk_safe_log_store(s, fmt, args);
+	if (add == -E2BIG) {
+		atomic_inc(&safe_message_lost);
+		add = 0;
+	}
+
+	return add;
 }
 
 /*
-- 
2.10.1.382.ga23ca1b

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [RFC][PATCHv3 5/6] printk: use printk_safe buffers
  2016-10-18 15:40 [RFC][PATCHv3 0/6] printk: use printk_safe to handle printk() recursive calls Sergey Senozhatsky
                   ` (3 preceding siblings ...)
  2016-10-18 15:40 ` [RFC][PATCHv3 4/6] printk: report lost messages in printk safe/nmi contexts Sergey Senozhatsky
@ 2016-10-18 15:40 ` Sergey Senozhatsky
  2016-10-18 15:40 ` [RFC][PATCHv3 6/6] printk: remove zap_locks() function Sergey Senozhatsky
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 23+ messages in thread
From: Sergey Senozhatsky @ 2016-10-18 15:40 UTC (permalink / raw)
  To: Petr Mladek, Andrew Morton
  Cc: Jan Kara, Tejun Heo, Calvin Owens, Thomas Gleixner, Mel Gorman,
	Steven Rostedt, Ingo Molnar, Peter Zijlstra, linux-kernel,
	Sergey Senozhatsky, Sergey Senozhatsky

Use printk_safe per-CPU buffers in in printk recursion-prone blocks:
-- around logbuf_lock protected sections in vprintk_emit() and
   console_unlock()
-- around down_trylock_console_sem() and up_console_sem()

Note that this solution addresses deadlocks caused by printk()
recursive calls only.

Examples:

1) printk() from logbuf_lock spin_lock section

Assume the following code:
  printk()
    raw_spin_lock(&logbuf_lock);
    WARN_ON(1);
    raw_spin_unlock(&logbuf_lock);

which now produces:

 ------------[ cut here ]------------
 WARNING: CPU: 0 PID: 366 at kernel/printk/printk.c:1811 vprintk_emit+0x1cd/0x438
 CPU: 0 PID: 366 Comm: bash
 Call Trace:
  [<ffffffff811e24d9>] dump_stack+0x68/0x92
  [<ffffffff810455cd>] __warn+0xc2/0xdd
  [<ffffffff81045658>] warn_slowpath_null+0x1d/0x1f
  [<ffffffff81091527>] vprintk_emit+0x1cd/0x438
  [<ffffffff810918fe>] vprintk_default+0x1d/0x1f
  [<ffffffff810fdf6a>] printk+0x48/0x50
  [..]
  [<ffffffff813caaaa>] entry_SYSCALL_64_fastpath+0x18/0xad
 ---[ end trace ]---

2) printk() from semaphore sem->lock spin_lock section

Assume the following code

  printk()
    console_trylock()
      down_trylock()
        raw_spin_lock_irqsave(&sem->lock, flags);
        WARN_ON(1);
        raw_spin_unlock_irqrestore(&sem->lock, flags);

which now produces:

 ------------[ cut here ]------------
 WARNING: CPU: 1 PID: 363 at kernel/locking/semaphore.c:141 down_trylock+0x3d/0x62
 CPU: 1 PID: 363 Comm: bash
 Call Trace:
  [<ffffffff811e24e0>] dump_stack+0x68/0x92
  [<ffffffff810455cd>] __warn+0xc2/0xdd
  [<ffffffff81045658>] warn_slowpath_null+0x1d/0x1f
  [<ffffffff810838df>] down_trylock+0x3d/0x62
  [<ffffffff8109177e>] ? vprintk_emit+0x3f9/0x414
  [<ffffffff810905cb>] console_trylock+0x31/0xeb
  [<ffffffff8109177e>] vprintk_emit+0x3f9/0x414
  [<ffffffff81091905>] vprintk_default+0x1d/0x1f
  [<ffffffff810fdf71>] printk+0x48/0x50
  [..]
  [<ffffffff813caaaa>] entry_SYSCALL_64_fastpath+0x18/0xad
 ---[ end trace ]---

3) printk() from console_unlock()

Assume the following code:

  printk()
    console_unlock()
      raw_spin_lock(&logbuf_lock);
      WARN_ON(1);
      raw_spin_unlock(&logbuf_lock);

which now produces:

 ------------[ cut here ]------------
 WARNING: CPU: 1 PID: 329 at kernel/printk/printk.c:2384 console_unlock+0x12d/0x559
 CPU: 1 PID: 329 Comm: bash
 Call Trace:
  [<ffffffff811f2380>] dump_stack+0x68/0x92
  [<ffffffff8103d649>] __warn+0xb8/0xd3
  [<ffffffff8103d6ca>] warn_slowpath_null+0x18/0x1a
  [<ffffffff8107f917>] console_unlock+0x12d/0x559
  [<ffffffff810772c4>] ? trace_hardirqs_on_caller+0x16d/0x189
  [<ffffffff810772ed>] ? trace_hardirqs_on+0xd/0xf
  [<ffffffff810800a6>] vprintk_emit+0x363/0x374
  [<ffffffff81080219>] vprintk_default+0x18/0x1a
  [<ffffffff810c7f77>] printk+0x43/0x4b
  [..]
  [<ffffffff814c2d6a>] entry_SYSCALL_64_fastpath+0x18/0xad
 ---[ end trace ]---

4) printk() from try_to_wake_up()

Assume the following code:

  printk()
    console_unlock()
      up()
        try_to_wake_up()
          raw_spin_lock_irqsave(&p->pi_lock, flags);
          WARN_ON(1);
          raw_spin_unlock_irqrestore(&p->pi_lock, flags);

which now produces:

 ------------[ cut here ]------------
 WARNING: CPU: 3 PID: 363 at kernel/sched/core.c:2028 try_to_wake_up+0x7f/0x4f7
 CPU: 3 PID: 363 Comm: bash
 Call Trace:
  [<ffffffff811e2557>] dump_stack+0x68/0x92
  [<ffffffff810455cd>] __warn+0xc2/0xdd
  [<ffffffff81045658>] warn_slowpath_null+0x1d/0x1f
  [<ffffffff8106c8d0>] try_to_wake_up+0x7f/0x4f7
  [<ffffffff8106cd5d>] wake_up_process+0x15/0x17
  [<ffffffff813c82c6>] __up.isra.0+0x56/0x63
  [<ffffffff810839a3>] up+0x32/0x42
  [<ffffffff8108f624>] __up_console_sem+0x37/0x55
  [<ffffffff810910ff>] console_unlock+0x21e/0x4c2
  [<ffffffff810917bf>] vprintk_emit+0x41c/0x462
  [<ffffffff81091971>] vprintk_default+0x1d/0x1f
  [<ffffffff810fdfdd>] printk+0x48/0x50
  [..]
  [<ffffffff813cab2a>] entry_SYSCALL_64_fastpath+0x18/0xad
 ---[ end trace ]---

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
 kernel/printk/printk.c | 42 +++++++++++++++++++++++++++---------------
 1 file changed, 27 insertions(+), 15 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index a99414b..688c1a4 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -213,17 +213,31 @@ static int nr_ext_console_drivers;
 
 static int __down_trylock_console_sem(unsigned long ip)
 {
-	if (down_trylock(&console_sem))
+	int lock_failed;
+	unsigned long flags;
+
+	printk_safe_enter(flags);
+	lock_failed = down_trylock(&console_sem);
+	printk_safe_exit(flags);
+
+	if (lock_failed)
 		return 1;
 	mutex_acquire(&console_lock_dep_map, 0, 1, ip);
 	return 0;
 }
 #define down_trylock_console_sem() __down_trylock_console_sem(_RET_IP_)
 
-#define up_console_sem() do { \
-	mutex_release(&console_lock_dep_map, 1, _RET_IP_);\
-	up(&console_sem);\
-} while (0)
+static void __up_console_sem(unsigned long ip)
+{
+	unsigned long flags;
+
+	mutex_release(&console_lock_dep_map, 1, ip);
+
+	printk_safe_enter(flags);
+	up(&console_sem);
+	printk_safe_exit(flags);
+}
+#define up_console_sem() __up_console_sem(_RET_IP_)
 
 /*
  * This is used for debugging the mess that is the VT code by
@@ -1803,7 +1817,7 @@ asmlinkage int vprintk_emit(int facility, int level,
 	boot_delay_msec(level);
 	printk_delay();
 
-	local_irq_save(flags);
+	printk_safe_enter(flags);
 	this_cpu = smp_processor_id();
 
 	/*
@@ -1819,13 +1833,12 @@ asmlinkage int vprintk_emit(int facility, int level,
 		 */
 		if (!oops_in_progress && !lockdep_recursing(current)) {
 			recursion_bug = true;
-			local_irq_restore(flags);
+			printk_safe_exit(flags);
 			return 0;
 		}
 		zap_locks();
 	}
 
-	lockdep_off();
 	/* This stops the holder of console_sem just where we want him */
 	raw_spin_lock(&logbuf_lock);
 	logbuf_cpu = this_cpu;
@@ -1885,12 +1898,10 @@ asmlinkage int vprintk_emit(int facility, int level,
 
 	logbuf_cpu = UINT_MAX;
 	raw_spin_unlock(&logbuf_lock);
-	lockdep_on();
-	local_irq_restore(flags);
+	printk_safe_exit(flags);
 
 	/* If called from the scheduler, we can not call up(). */
 	if (!in_sched) {
-		lockdep_off();
 		/*
 		 * Try to acquire and then immediately release the console
 		 * semaphore.  The release will print out buffers and wake up
@@ -1898,7 +1909,6 @@ asmlinkage int vprintk_emit(int facility, int level,
 		 */
 		if (console_trylock())
 			console_unlock();
-		lockdep_on();
 	}
 
 	return printed_len;
@@ -2377,7 +2387,8 @@ void console_unlock(void)
 		size_t len;
 		int level;
 
-		raw_spin_lock_irqsave(&logbuf_lock, flags);
+		printk_safe_enter(flags);
+		raw_spin_lock(&logbuf_lock);
 		if (seen_seq != log_next_seq) {
 			wake_klogd = true;
 			seen_seq = log_next_seq;
@@ -2438,7 +2449,7 @@ void console_unlock(void)
 		stop_critical_timings();	/* don't trace print latency */
 		call_console_drivers(level, ext_text, ext_len, text, len);
 		start_critical_timings();
-		local_irq_restore(flags);
+		printk_safe_exit(flags);
 
 		if (do_cond_resched)
 			cond_resched();
@@ -2461,7 +2472,8 @@ void console_unlock(void)
 	 */
 	raw_spin_lock(&logbuf_lock);
 	retry = console_seq != log_next_seq;
-	raw_spin_unlock_irqrestore(&logbuf_lock, flags);
+	raw_spin_unlock(&logbuf_lock);
+	printk_safe_exit(flags);
 
 	if (retry && console_trylock())
 		goto again;
-- 
2.10.1.382.ga23ca1b

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [RFC][PATCHv3 6/6] printk: remove zap_locks() function
  2016-10-18 15:40 [RFC][PATCHv3 0/6] printk: use printk_safe to handle printk() recursive calls Sergey Senozhatsky
                   ` (4 preceding siblings ...)
  2016-10-18 15:40 ` [RFC][PATCHv3 5/6] printk: use printk_safe buffers Sergey Senozhatsky
@ 2016-10-18 15:40 ` Sergey Senozhatsky
  2016-10-18 16:45 ` [RFC][PATCHv3 0/6] printk: use printk_safe to handle printk() recursive calls Joe Perches
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 23+ messages in thread
From: Sergey Senozhatsky @ 2016-10-18 15:40 UTC (permalink / raw)
  To: Petr Mladek, Andrew Morton
  Cc: Jan Kara, Tejun Heo, Calvin Owens, Thomas Gleixner, Mel Gorman,
	Steven Rostedt, Ingo Molnar, Peter Zijlstra, linux-kernel,
	Sergey Senozhatsky, Sergey Senozhatsky

We use printk-safe now which makes printk-recursion detection code
in vprintk_emit() is unreachable. The tricky thing here is that,
apart from detecting and reporting printk recursions, that code also
used to zap_lockc() in case of panic. However, zap_locks() does not
look to be needed anymore:

1) Since commit 08d78658f393 ("panic: release stale console lock to
   always get the logbuf printed out") panic flushing of `logbuf' to
   console ignores the state of `console_sem' by doing
   	panic()
		console_trylock();
		console_unlock();

2) Since commit cf9b1106c81c ("printk/nmi: flush NMI messages on the
   system panic") panic attempts to zap the `logbuf_lock' spin_lock to
   successfully flush nmi messages to `logbuf'.

Basically, it seems that we either already do what zap_locks() used to
do but in other places or we ignore the state of the lock. May be we
still would want to do sema_init() in printk_safe_flush_on_panic(),
just in case.

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
 kernel/printk/printk.c | 61 --------------------------------------------------
 1 file changed, 61 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 688c1a4..79df499 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1619,28 +1619,6 @@ static void call_console_drivers(int level,
 	}
 }
 
-/*
- * Zap console related locks when oopsing.
- * To leave time for slow consoles to print a full oops,
- * only zap at most once every 30 seconds.
- */
-static void zap_locks(void)
-{
-	static unsigned long oops_timestamp;
-
-	if (time_after_eq(jiffies, oops_timestamp) &&
-	    !time_after(jiffies, oops_timestamp + 30 * HZ))
-		return;
-
-	oops_timestamp = jiffies;
-
-	debug_locks_off();
-	/* If a crash is occurring, make sure we can't deadlock */
-	raw_spin_lock_init(&logbuf_lock);
-	/* And make sure that we print immediately */
-	sema_init(&console_sem, 1);
-}
-
 int printk_delay_msec __read_mostly;
 
 static inline void printk_delay(void)
@@ -1797,17 +1775,13 @@ asmlinkage int vprintk_emit(int facility, int level,
 			    const char *dict, size_t dictlen,
 			    const char *fmt, va_list args)
 {
-	static bool recursion_bug;
 	static char textbuf[LOG_LINE_MAX];
 	char *text = textbuf;
 	size_t text_len = 0;
 	enum log_flags lflags = 0;
 	unsigned long flags;
-	int this_cpu;
 	int printed_len = 0;
 	bool in_sched = false;
-	/* cpu currently holding logbuf_lock in this function */
-	static unsigned int logbuf_cpu = UINT_MAX;
 
 	if (level == LOGLEVEL_SCHED) {
 		level = LOGLEVEL_DEFAULT;
@@ -1818,42 +1792,8 @@ asmlinkage int vprintk_emit(int facility, int level,
 	printk_delay();
 
 	printk_safe_enter(flags);
-	this_cpu = smp_processor_id();
-
-	/*
-	 * Ouch, printk recursed into itself!
-	 */
-	if (unlikely(logbuf_cpu == this_cpu)) {
-		/*
-		 * If a crash is occurring during printk() on this CPU,
-		 * then try to get the crash message out but make sure
-		 * we can't deadlock. Otherwise just return to avoid the
-		 * recursion and return - but flag the recursion so that
-		 * it can be printed at the next appropriate moment:
-		 */
-		if (!oops_in_progress && !lockdep_recursing(current)) {
-			recursion_bug = true;
-			printk_safe_exit(flags);
-			return 0;
-		}
-		zap_locks();
-	}
-
 	/* This stops the holder of console_sem just where we want him */
 	raw_spin_lock(&logbuf_lock);
-	logbuf_cpu = this_cpu;
-
-	if (unlikely(recursion_bug)) {
-		static const char recursion_msg[] =
-			"BUG: recent printk recursion!";
-
-		recursion_bug = false;
-		/* emit KERN_CRIT message */
-		printed_len += log_store(0, 2, LOG_PREFIX|LOG_NEWLINE, 0,
-					 NULL, 0, recursion_msg,
-					 strlen(recursion_msg));
-	}
-
 	/*
 	 * The printf needs to come first; we need the syslog
 	 * prefix which might be passed-in as a parameter.
@@ -1896,7 +1836,6 @@ asmlinkage int vprintk_emit(int facility, int level,
 
 	printed_len += log_output(facility, level, lflags, dict, dictlen, text, text_len);
 
-	logbuf_cpu = UINT_MAX;
 	raw_spin_unlock(&logbuf_lock);
 	printk_safe_exit(flags);
 
-- 
2.10.1.382.ga23ca1b

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [RFC][PATCHv3 0/6] printk: use printk_safe to handle printk() recursive calls
  2016-10-18 15:40 [RFC][PATCHv3 0/6] printk: use printk_safe to handle printk() recursive calls Sergey Senozhatsky
                   ` (5 preceding siblings ...)
  2016-10-18 15:40 ` [RFC][PATCHv3 6/6] printk: remove zap_locks() function Sergey Senozhatsky
@ 2016-10-18 16:45 ` Joe Perches
  2016-10-19  1:14   ` Sergey Senozhatsky
  2016-10-18 17:07 ` Peter Zijlstra
  2016-10-19  4:33 ` Sergey Senozhatsky
  8 siblings, 1 reply; 23+ messages in thread
From: Joe Perches @ 2016-10-18 16:45 UTC (permalink / raw)
  To: Sergey Senozhatsky, Petr Mladek, Andrew Morton
  Cc: Jan Kara, Tejun Heo, Calvin Owens, Thomas Gleixner, Mel Gorman,
	Steven Rostedt, Ingo Molnar, Peter Zijlstra, linux-kernel,
	Sergey Senozhatsky

On Wed, 2016-10-19 at 00:40 +0900, Sergey Senozhatsky wrote:
> This patch set extends a lock-less NMI per-cpu buffers idea to
> handle recursive printk() calls.

trivia:

recursive or reentrant?

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC][PATCHv3 3/6] printk: introduce per-cpu safe_print seq buffer
  2016-10-18 15:40 ` [RFC][PATCHv3 3/6] printk: introduce per-cpu safe_print seq buffer Sergey Senozhatsky
@ 2016-10-18 16:46   ` Steven Rostedt
  2016-10-19  1:10     ` Sergey Senozhatsky
  2016-10-19 13:42   ` Sergey Senozhatsky
  1 sibling, 1 reply; 23+ messages in thread
From: Steven Rostedt @ 2016-10-18 16:46 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Petr Mladek, Andrew Morton, Jan Kara, Tejun Heo, Calvin Owens,
	Thomas Gleixner, Mel Gorman, Ingo Molnar, Peter Zijlstra,
	linux-kernel, Sergey Senozhatsky

On Wed, 19 Oct 2016 00:40:42 +0900
Sergey Senozhatsky <sergey.senozhatsky@gmail.com> wrote:


> +
> +/*
> + * Returns with local IRQs disabled.
> + * Can be preempted by NMI.

It does? How does it return with IRQs disabled?

> + */
> +void __printk_safe_enter(void)
> +{
> +	this_cpu_inc(printk_safe_context);
> +}
> +
> +/*
> + * Restores local IRQs state saved in printk_safe_enter().
> + * Can be preempted by NMI.
> + */

How does this return with irqs enabled?

-- Steve

> +void __printk_safe_exit(void)
> +{
> +	this_cpu_dec(printk_safe_context);
> +}
> +

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC][PATCHv3 1/6] printk: use vprintk_func in vprintk()
  2016-10-18 15:40 ` [RFC][PATCHv3 1/6] printk: use vprintk_func in vprintk() Sergey Senozhatsky
@ 2016-10-18 16:49   ` Steven Rostedt
  2016-10-19  1:54     ` Sergey Senozhatsky
  0 siblings, 1 reply; 23+ messages in thread
From: Steven Rostedt @ 2016-10-18 16:49 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Petr Mladek, Andrew Morton, Jan Kara, Tejun Heo, Calvin Owens,
	Thomas Gleixner, Mel Gorman, Ingo Molnar, Peter Zijlstra,
	linux-kernel, Sergey Senozhatsky

On Wed, 19 Oct 2016 00:40:40 +0900
Sergey Senozhatsky <sergey.senozhatsky@gmail.com> wrote:

> vprintk(), just like printk(), better be using per-cpu printk_func
> instead of direct vprintk_emit() call. Just in case if vprintk()
> will ever be called from NMI, or from any other context that can
> deadlock in printk().
> 
> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>

Reviewed-by: Steven Rostedt <rostedt@goodmis.org>

-- Steve

> ---
>  kernel/printk/printk.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index d5e3973..be7f60b 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -1917,7 +1917,7 @@ EXPORT_SYMBOL(vprintk_emit);
>  
>  asmlinkage int vprintk(const char *fmt, va_list args)
>  {
> -	return vprintk_emit(0, LOGLEVEL_DEFAULT, NULL, 0, fmt, args);
> +	return vprintk_func(fmt, args);
>  }
>  EXPORT_SYMBOL(vprintk);
>  

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC][PATCHv3 2/6] printk: rename nmi.c and exported api
  2016-10-18 15:40 ` [RFC][PATCHv3 2/6] printk: rename nmi.c and exported api Sergey Senozhatsky
@ 2016-10-18 16:49   ` Steven Rostedt
  2016-10-19  1:54     ` Sergey Senozhatsky
  0 siblings, 1 reply; 23+ messages in thread
From: Steven Rostedt @ 2016-10-18 16:49 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Petr Mladek, Andrew Morton, Jan Kara, Tejun Heo, Calvin Owens,
	Thomas Gleixner, Mel Gorman, Ingo Molnar, Peter Zijlstra,
	linux-kernel, Sergey Senozhatsky

On Wed, 19 Oct 2016 00:40:41 +0900
Sergey Senozhatsky <sergey.senozhatsky@gmail.com> wrote:

> A preparation patch for printk_safe work. No functional change.
> - rename nmi.c to print_safe.c
> - rename exported functions to have `safe' prefix.
> 
> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>


Acked-by: Steven Rostedt <rostedt@goodmis.org>

-- Steve

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC][PATCHv3 0/6] printk: use printk_safe to handle printk() recursive calls
  2016-10-18 15:40 [RFC][PATCHv3 0/6] printk: use printk_safe to handle printk() recursive calls Sergey Senozhatsky
                   ` (6 preceding siblings ...)
  2016-10-18 16:45 ` [RFC][PATCHv3 0/6] printk: use printk_safe to handle printk() recursive calls Joe Perches
@ 2016-10-18 17:07 ` Peter Zijlstra
  2016-10-19  1:53   ` Sergey Senozhatsky
  2016-10-19 13:18   ` Petr Mladek
  2016-10-19  4:33 ` Sergey Senozhatsky
  8 siblings, 2 replies; 23+ messages in thread
From: Peter Zijlstra @ 2016-10-18 17:07 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Petr Mladek, Andrew Morton, Jan Kara, Tejun Heo, Calvin Owens,
	Thomas Gleixner, Mel Gorman, Steven Rostedt, Ingo Molnar,
	linux-kernel, Sergey Senozhatsky

On Wed, Oct 19, 2016 at 12:40:39AM +0900, Sergey Senozhatsky wrote:
> Hello,
> 
>         RFC
> 
>         This patch set extends a lock-less NMI per-cpu buffers idea to
> handle recursive printk() calls. The basic mechanism is pretty much the
> same -- at the beginning of a deadlock-prone section we switch to lock-less
> printk callback, and return back to a default printk implementation at the
> end; the messages are getting flushed to a logbuf buffer from a safer
> context.

So I think you're not taking this far enough. You've also missed an
entire class of deadlocks.

The first is that you still keep the logbuf. Having this global
serialized thing is a source of fail. It would be much better to only
keep per cpu stuff. _OR_ at the very least make the logbuf itself
lockfree. So generate the printk entry local (so we know its size) then
atomically reserve the logbuf entry and copy it over.

The entire class of deadlocks you've missed is that console->write() is
a piece of crap too ;-) Even the bog standard 8250 serial console driver
can do wakeups.

See for example:

  https://marc.info/?l=linux-kernel&m=132446649823347&w=2

I've entirely given up on printk(), I'll post the 3 patches I carry to
make things work for me.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC][PATCHv3 3/6] printk: introduce per-cpu safe_print seq buffer
  2016-10-18 16:46   ` Steven Rostedt
@ 2016-10-19  1:10     ` Sergey Senozhatsky
  0 siblings, 0 replies; 23+ messages in thread
From: Sergey Senozhatsky @ 2016-10-19  1:10 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Sergey Senozhatsky, Petr Mladek, Andrew Morton, Jan Kara,
	Tejun Heo, Calvin Owens, Thomas Gleixner, Mel Gorman,
	Ingo Molnar, Peter Zijlstra, linux-kernel, Sergey Senozhatsky

On (10/18/16 12:46), Steven Rostedt wrote:
> > +
> > +/*
> > + * Returns with local IRQs disabled.
> > + * Can be preempted by NMI.
> 
> It does? How does it return with IRQs disabled?
> 
> > + */
> > +void __printk_safe_enter(void)
> > +{
> > +	this_cpu_inc(printk_safe_context);
> > +}
> > +
> > +/*
> > + * Restores local IRQs state saved in printk_safe_enter().
> > + * Can be preempted by NMI.
> > + */
> 
> How does this return with irqs enabled?

ah... leftovers. thanks, will correct.

	-ss

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC][PATCHv3 0/6] printk: use printk_safe to handle printk() recursive calls
  2016-10-18 16:45 ` [RFC][PATCHv3 0/6] printk: use printk_safe to handle printk() recursive calls Joe Perches
@ 2016-10-19  1:14   ` Sergey Senozhatsky
  0 siblings, 0 replies; 23+ messages in thread
From: Sergey Senozhatsky @ 2016-10-19  1:14 UTC (permalink / raw)
  To: Joe Perches
  Cc: Sergey Senozhatsky, Petr Mladek, Andrew Morton, Jan Kara,
	Tejun Heo, Calvin Owens, Thomas Gleixner, Mel Gorman,
	Steven Rostedt, Ingo Molnar, Peter Zijlstra, linux-kernel,
	Sergey Senozhatsky

On (10/18/16 09:45), Joe Perches wrote:
> On Wed, 2016-10-19 at 00:40 +0900, Sergey Senozhatsky wrote:
> > This patch set extends a lock-less NMI per-cpu buffers idea to
> > handle recursive printk() calls.
> 
> trivia:
> 
> recursive or reentrant?

a recursive one.

printk -> {foo} -> printk



reentrant printk() case was addressed by Petr Mladek's nmi patchset:

printk()
	foo
	-------> NMI
		printk()

	-ss

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC][PATCHv3 0/6] printk: use printk_safe to handle printk() recursive calls
  2016-10-18 17:07 ` Peter Zijlstra
@ 2016-10-19  1:53   ` Sergey Senozhatsky
  2016-10-19  9:30     ` Peter Zijlstra
  2016-10-19 13:18   ` Petr Mladek
  1 sibling, 1 reply; 23+ messages in thread
From: Sergey Senozhatsky @ 2016-10-19  1:53 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Sergey Senozhatsky, Petr Mladek, Andrew Morton, Jan Kara,
	Tejun Heo, Calvin Owens, Thomas Gleixner, Mel Gorman,
	Steven Rostedt, Ingo Molnar, linux-kernel, Sergey Senozhatsky

On (10/18/16 19:07), Peter Zijlstra wrote:
> >         This patch set extends a lock-less NMI per-cpu buffers idea to
> > handle recursive printk() calls. The basic mechanism is pretty much the
> > same -- at the beginning of a deadlock-prone section we switch to lock-less
> > printk callback, and return back to a default printk implementation at the
> > end; the messages are getting flushed to a logbuf buffer from a safer
> > context.
> 
> So I think you're not taking this far enough. You've also missed an
> entire class of deadlocks.

yes. this patch set addresses only printk() recursion. printk deadlocks
in general are much more complicated and basically we don't even know
how many locks we are about to take every time we call printk() (there
may be serial console locks, etc we never know in advance).

> The first is that you still keep the logbuf. Having this global
> serialized thing is a source of fail. It would be much better to only
> keep per cpu stuff. _OR_ at the very least make the logbuf itself
> lockfree. So generate the printk entry local (so we know its size) then
> atomically reserve the logbuf entry and copy it over.

logbuf spin_lock is not the only problematic lock. we _at least_
have console sem spin_lock and rq spin_lock:

	https://marc.info/?l=linux-kernel&m=145406977822244

	https://marc.info/?l=linux-kernel&m=146853438804534&w=2

> The entire class of deadlocks you've missed is that console->write() is
> a piece of crap too ;-) Even the bog standard 8250 serial console driver
> can do wakeups.

yes. well, I didn't exactly "miss" it. I just don't have any working
solution for those locks yet. so things like

 SyS_ioctl
  do_vfs_ioctl
   tty_ioctl
    n_tty_ioctl
     tty_mode_ioctl
      set_termios
       tty_set_termios
        uart_set_termios
         uart_change_speed
          FOO_serial_set_termios
           spin_lock_irqsave(&port->lock)     // lock the output port
           ....
           !! WARN() or pr_err() or printk()
               vprintk_emit()
                /* console_trylock() */
                console_unlock()
                 call_console_drivers()
                  FOO_write()
                   spin_lock_irqsave(&port->lock)     // already locked


are out of this patch set's scope. another class of deadlocks is when we
call printk under one of the locks that printk() can take later in order
to print the message.



so in this particular patch set I make "printk() -> $FOO -> printk()" less
damaging. which is just a first step.

we are also looking at things like might_printk()
https://marc.info/?l=linux-kernel&m=147607261116163

and even extending the lockdep
https://marc.info/?l=linux-kernel&m=147617830511381

not to miss out a DEFERRED_WARN patch set...
//hm, I can't find it online

  Subject: [RFC 0/5] printk: Implement WARN_*DEFERRED()
  Message-Id: <1474992135-14777-1-git-send-email-pmladek@suse.com>



> See for example:
> 
>   https://marc.info/?l=linux-kernel&m=132446649823347&w=2

interesting idea.

> I've entirely given up on printk(), I'll post the 3 patches I carry to
> make things work for me.

yes, printk is surprisingly fragile and, some would _probably_
even say, broken.

a) we have numerous scenarios when it can deadlock
b) we have numerous scenarios when it can soft-lockup/hard-lockup the system
c) we don't have a bullet proof panic printk. zap_locks() resets
   logbuf and console_sem, but there may be serial console locks/etc.

	-ss

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC][PATCHv3 2/6] printk: rename nmi.c and exported api
  2016-10-18 16:49   ` Steven Rostedt
@ 2016-10-19  1:54     ` Sergey Senozhatsky
  0 siblings, 0 replies; 23+ messages in thread
From: Sergey Senozhatsky @ 2016-10-19  1:54 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Sergey Senozhatsky, Petr Mladek, Andrew Morton, Jan Kara,
	Tejun Heo, Calvin Owens, Thomas Gleixner, Mel Gorman,
	Ingo Molnar, Peter Zijlstra, linux-kernel, Sergey Senozhatsky

On (10/18/16 12:49), Steven Rostedt wrote:
> On Wed, 19 Oct 2016 00:40:41 +0900
> Sergey Senozhatsky <sergey.senozhatsky@gmail.com> wrote:
> 
> > A preparation patch for printk_safe work. No functional change.
> > - rename nmi.c to print_safe.c
> > - rename exported functions to have `safe' prefix.
> > 
> > Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> 
> 
> Acked-by: Steven Rostedt <rostedt@goodmis.org>

Thanks.

	-ss

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC][PATCHv3 1/6] printk: use vprintk_func in vprintk()
  2016-10-18 16:49   ` Steven Rostedt
@ 2016-10-19  1:54     ` Sergey Senozhatsky
  0 siblings, 0 replies; 23+ messages in thread
From: Sergey Senozhatsky @ 2016-10-19  1:54 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Sergey Senozhatsky, Petr Mladek, Andrew Morton, Jan Kara,
	Tejun Heo, Calvin Owens, Thomas Gleixner, Mel Gorman,
	Ingo Molnar, Peter Zijlstra, linux-kernel, Sergey Senozhatsky

On (10/18/16 12:49), Steven Rostedt wrote:
> On Wed, 19 Oct 2016 00:40:40 +0900
> Sergey Senozhatsky <sergey.senozhatsky@gmail.com> wrote:
> 
> > vprintk(), just like printk(), better be using per-cpu printk_func
> > instead of direct vprintk_emit() call. Just in case if vprintk()
> > will ever be called from NMI, or from any other context that can
> > deadlock in printk().
> > 
> > Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> 
> Reviewed-by: Steven Rostedt <rostedt@goodmis.org>

Thanks.

	-ss

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC][PATCHv3 0/6] printk: use printk_safe to handle printk() recursive calls
  2016-10-18 15:40 [RFC][PATCHv3 0/6] printk: use printk_safe to handle printk() recursive calls Sergey Senozhatsky
                   ` (7 preceding siblings ...)
  2016-10-18 17:07 ` Peter Zijlstra
@ 2016-10-19  4:33 ` Sergey Senozhatsky
  8 siblings, 0 replies; 23+ messages in thread
From: Sergey Senozhatsky @ 2016-10-19  4:33 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Petr Mladek, Andrew Morton, Jan Kara, Tejun Heo, Calvin Owens,
	Thomas Gleixner, Mel Gorman, Steven Rostedt, Ingo Molnar,
	Peter Zijlstra, linux-kernel, Sergey Senozhatsky

On (10/19/16 00:40), Sergey Senozhatsky wrote:
[..]
> Deadlock scenarios that printk_safe can handle:
> 
> a) printk recursion from logbuf_lock spin_lock section in printk()
>   printk()
>     raw_spin_lock(&logbuf_lock);
>     WARN_ON(1);
>     raw_spin_unlock(&logbuf_lock);
> 
> b) printk from sem->lock spin_lock section
>   printk()
>     console_trylock()
>       down_trylock()
>         raw_spin_lock_irqsave(&sem->lock, flags);
>         WARN_ON(1);
>         raw_spin_unlock_irqrestore(&sem->lock, flags);
> 
> c) printk from logbuf_lock spin_lock section in console_unlock()
>   printk()
>     console_unlock()
>       raw_spin_lock(&logbuf_lock);
>       WARN_ON(1);
>       raw_spin_unlock(&logbuf_lock);
> 
> d) printk from ->pi_lock from semaphore up
>   printk()
>     console_unlock()
>       up()
>         try_to_wake_up()
>           raw_spin_lock_irqsave(&p->pi_lock, flags);
>           WARN_ON(1);
>           raw_spin_unlock_irqrestore(&p->pi_lock, flags);

and

e) anything from call_console_drivers() should be fine as well.
   console_unlock() big printing loop is covered by printk_safe,
   need to protect console_cont_flush()->call_console_drivers().

   so printk "direct path" seems to be safe now:

   printk -> .. -> console_unlock() -> call_console_drivers() -> serial_console_write()

   anything that is "indirect" is not covered in this patch set.


	-ss

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC][PATCHv3 0/6] printk: use printk_safe to handle printk() recursive calls
  2016-10-19  1:53   ` Sergey Senozhatsky
@ 2016-10-19  9:30     ` Peter Zijlstra
  0 siblings, 0 replies; 23+ messages in thread
From: Peter Zijlstra @ 2016-10-19  9:30 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Sergey Senozhatsky, Petr Mladek, Andrew Morton, Jan Kara,
	Tejun Heo, Calvin Owens, Thomas Gleixner, Mel Gorman,
	Steven Rostedt, Ingo Molnar, linux-kernel

On Wed, Oct 19, 2016 at 10:53:57AM +0900, Sergey Senozhatsky wrote:
> not to miss out a DEFERRED_WARN patch set...
> //hm, I can't find it online
> 
>   Subject: [RFC 0/5] printk: Implement WARN_*DEFERRED()
>   Message-Id: <1474992135-14777-1-git-send-email-pmladek@suse.com>

That never hit lkml, that msgid does not resolve on marc and I can only
find a few replies in my local lkml archive.

Can't say I like the idea though. Its propagating the printk() failure
up into more APIs instead of actually fixing it.

Esp. for WARN and the like you do not want DEFERRED, because DEFERRED
means you'll likely never actually get to see the problem because we're
dead by the time 'later' happens.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC][PATCHv3 0/6] printk: use printk_safe to handle printk() recursive calls
  2016-10-18 17:07 ` Peter Zijlstra
  2016-10-19  1:53   ` Sergey Senozhatsky
@ 2016-10-19 13:18   ` Petr Mladek
  2016-10-19 13:34     ` Peter Zijlstra
  1 sibling, 1 reply; 23+ messages in thread
From: Petr Mladek @ 2016-10-19 13:18 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Sergey Senozhatsky, Andrew Morton, Jan Kara, Tejun Heo,
	Calvin Owens, Thomas Gleixner, Mel Gorman, Steven Rostedt,
	Ingo Molnar, linux-kernel, Sergey Senozhatsky

On Tue 2016-10-18 19:07:54, Peter Zijlstra wrote:
> On Wed, Oct 19, 2016 at 12:40:39AM +0900, Sergey Senozhatsky wrote:
> > Hello,
> > 
> >         RFC
> > 
> >         This patch set extends a lock-less NMI per-cpu buffers idea to
> > handle recursive printk() calls. The basic mechanism is pretty much the
> > same -- at the beginning of a deadlock-prone section we switch to lock-less
> > printk callback, and return back to a default printk implementation at the
> > end; the messages are getting flushed to a logbuf buffer from a safer
> > context.
> 
> So I think you're not taking this far enough. You've also missed an
> entire class of deadlocks.
> 
> The first is that you still keep the logbuf. Having this global
> serialized thing is a source of fail. It would be much better to only
> keep per cpu stuff. _OR_ at the very least make the logbuf itself
> lockfree. So generate the printk entry local (so we know its size) then
> atomically reserve the logbuf entry and copy it over.

This is close to what the lockless ring_buffer does. I mean
kernel/trace/ring_buffer.c that is used by trace_printk. It has
per-CPU buffers, does reservations, ...

Sadly the ring_buffer code is very tricky. Only a single reader is
allowed at a time. And locating the information in the crash dump
is a task for prisoners.

Note that it still does _not_ solve problems with the console output.
I am not sure if going this way would be a real win.

Sigh, a genius idea would help.

> The entire class of deadlocks you've missed is that console->write() is
> a piece of crap too ;-) Even the bog standard 8250 serial console driver
> can do wakeups.

I wonder if all the hard problems are actually related to the console
handling.

There are problems with the single logbuffer but these should get
eliminated by the NMI/safe temporary per-CPU buffers.

All might be easier if we always offload the console handling
into a kthread or so and trigger it via the minimalist
irq_work. It would kill huge bunch of possible deadlocks.
It will even allow to get rid of printk_deferred() and
the uncertainty where it is needed.

The penalty would be "slightly" delayed console output. But is
it a real problem? It should not be a big deal when everything works.
We could always try hard when panicking. And there always
might be a fallback with that direct early_console().

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC][PATCHv3 0/6] printk: use printk_safe to handle printk() recursive calls
  2016-10-19 13:18   ` Petr Mladek
@ 2016-10-19 13:34     ` Peter Zijlstra
  2016-10-20 13:18       ` Sergey Senozhatsky
  0 siblings, 1 reply; 23+ messages in thread
From: Peter Zijlstra @ 2016-10-19 13:34 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Andrew Morton, Jan Kara, Tejun Heo,
	Calvin Owens, Thomas Gleixner, Mel Gorman, Steven Rostedt,
	Ingo Molnar, linux-kernel, Sergey Senozhatsky

On Wed, Oct 19, 2016 at 03:18:36PM +0200, Petr Mladek wrote:
> On Tue 2016-10-18 19:07:54, Peter Zijlstra wrote:

> > The entire class of deadlocks you've missed is that console->write() is
> > a piece of crap too ;-) Even the bog standard 8250 serial console driver
> > can do wakeups.
> 
> I wonder if all the hard problems are actually related to the console
> handling.
> 
> There are problems with the single logbuffer but these should get
> eliminated by the NMI/safe temporary per-CPU buffers.
> 
> All might be easier if we always offload the console handling
> into a kthread or so and trigger it via the minimalist
> irq_work. It would kill huge bunch of possible deadlocks.
> It will even allow to get rid of printk_deferred() and
> the uncertainty where it is needed.
> 
> The penalty would be "slightly" delayed console output. But is
> it a real problem? It should not be a big deal when everything works.
> We could always try hard when panicking. And there always
> might be a fallback with that direct early_console().

Right, so I've had to debug quite a few situations where that 'later'
simply never happens...

I agree that such a scenario is basically all the regular console
drivers are capable of though.

It might make sense to go there, but allow early_console to print on the
go, keeping synchronous output available. We would still need the logbuf
to become NMI safe wrt adding entries though.

It would however place hard requirements of early_console
implementations; already violated by for instance the USB3-debug-port
implementation I just looked at:

  lkml.kernel.org/r/20161019130943.GA3175@twins.programming.kicks-ass.net

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC][PATCHv3 3/6] printk: introduce per-cpu safe_print seq buffer
  2016-10-18 15:40 ` [RFC][PATCHv3 3/6] printk: introduce per-cpu safe_print seq buffer Sergey Senozhatsky
  2016-10-18 16:46   ` Steven Rostedt
@ 2016-10-19 13:42   ` Sergey Senozhatsky
  1 sibling, 0 replies; 23+ messages in thread
From: Sergey Senozhatsky @ 2016-10-19 13:42 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Petr Mladek, Andrew Morton, Jan Kara, Tejun Heo, Calvin Owens,
	Thomas Gleixner, Mel Gorman, Steven Rostedt, Ingo Molnar,
	Peter Zijlstra, linux-kernel, Sergey Senozhatsky

On (10/19/16 00:40), Sergey Senozhatsky wrote:
> -obj-y	= printk.o
> -obj-$(CONFIG_PRINTK_NMI)		+= printk_safe.o
> +obj-y	= printk_safe.o printk.o
[..]

d'oh... this breaks !PRINTK configs. will fix in v4.

	-ss

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC][PATCHv3 0/6] printk: use printk_safe to handle printk() recursive calls
  2016-10-19 13:34     ` Peter Zijlstra
@ 2016-10-20 13:18       ` Sergey Senozhatsky
  0 siblings, 0 replies; 23+ messages in thread
From: Sergey Senozhatsky @ 2016-10-20 13:18 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Petr Mladek, Sergey Senozhatsky, Andrew Morton, Jan Kara,
	Tejun Heo, Calvin Owens, Thomas Gleixner, Mel Gorman,
	Steven Rostedt, Ingo Molnar, linux-kernel, Sergey Senozhatsky

On (10/19/16 15:34), Peter Zijlstra wrote:
> On Wed, Oct 19, 2016 at 03:18:36PM +0200, Petr Mladek wrote:
> > On Tue 2016-10-18 19:07:54, Peter Zijlstra wrote:
[..]
> It might make sense to go there, but allow early_console to print on the
> go, keeping synchronous output available. We would still need the logbuf
> to become NMI safe wrt adding entries though.
> 

as a side (spam) note,

I sent a [TECH TOPIC] suggestion to ksummit-discuss a while ago
  https://lists.linuxfoundation.org/pipermail/ksummit-discuss/2016-July/002740.html

and a later update
  https://lists.linuxfoundation.org/pipermail/ksummit-discuss/2016-September/004006.html

which covers some of the printk problems. so I'm sure I can tell whether
the suggestion has been approved or rejected, but hopefully we can find
some time to sit down and talk about the issues.

	-ss

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2016-10-20 13:19 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-18 15:40 [RFC][PATCHv3 0/6] printk: use printk_safe to handle printk() recursive calls Sergey Senozhatsky
2016-10-18 15:40 ` [RFC][PATCHv3 1/6] printk: use vprintk_func in vprintk() Sergey Senozhatsky
2016-10-18 16:49   ` Steven Rostedt
2016-10-19  1:54     ` Sergey Senozhatsky
2016-10-18 15:40 ` [RFC][PATCHv3 2/6] printk: rename nmi.c and exported api Sergey Senozhatsky
2016-10-18 16:49   ` Steven Rostedt
2016-10-19  1:54     ` Sergey Senozhatsky
2016-10-18 15:40 ` [RFC][PATCHv3 3/6] printk: introduce per-cpu safe_print seq buffer Sergey Senozhatsky
2016-10-18 16:46   ` Steven Rostedt
2016-10-19  1:10     ` Sergey Senozhatsky
2016-10-19 13:42   ` Sergey Senozhatsky
2016-10-18 15:40 ` [RFC][PATCHv3 4/6] printk: report lost messages in printk safe/nmi contexts Sergey Senozhatsky
2016-10-18 15:40 ` [RFC][PATCHv3 5/6] printk: use printk_safe buffers Sergey Senozhatsky
2016-10-18 15:40 ` [RFC][PATCHv3 6/6] printk: remove zap_locks() function Sergey Senozhatsky
2016-10-18 16:45 ` [RFC][PATCHv3 0/6] printk: use printk_safe to handle printk() recursive calls Joe Perches
2016-10-19  1:14   ` Sergey Senozhatsky
2016-10-18 17:07 ` Peter Zijlstra
2016-10-19  1:53   ` Sergey Senozhatsky
2016-10-19  9:30     ` Peter Zijlstra
2016-10-19 13:18   ` Petr Mladek
2016-10-19 13:34     ` Peter Zijlstra
2016-10-20 13:18       ` Sergey Senozhatsky
2016-10-19  4:33 ` Sergey Senozhatsky

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.