[RFC][PATCHv4 0/6] printk: use printk_safe to handle printk() recursive calls

linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [RFC][PATCHv4 0/6] printk: use printk_safe to handle printk() recursive calls
@ 2016-10-27 15:49 Sergey Senozhatsky
  2016-10-27 15:49 ` [RFC][PATCHv4 1/6] printk: use vprintk_func in vprintk() Sergey Senozhatsky
                   ` (6 more replies)
  0 siblings, 7 replies; 30+ messages in thread
From: Sergey Senozhatsky @ 2016-10-27 15:49 UTC (permalink / raw)
  To: Petr Mladek, Andrew Morton
  Cc: Jan Kara, Tejun Heo, Calvin Owens, Thomas Gleixner, Mel Gorman,
	Steven Rostedt, Ingo Molnar, Peter Zijlstra, Laura Abbott,
	Andy Lutomirski, Linus Torvalds, Kees Cook, linux-kernel,
	Sergey Senozhatsky, Sergey Senozhatsky

Hello,

        RFC

        This patch set extends a lock-less NMI per-cpu buffers idea to
handle recursive printk() calls. The basic mechanism is pretty much the
same -- at the beginning of a deadlock-prone section we switch to lock-less
printk callback, and return back to a default printk implementation at the
end; the messages are getting flushed to a logbuf buffer from a safer
context.

Deadlock scenarios that printk_safe can handle:

a) printk recursion from logbuf_lock spin_lock section in printk()
  printk()
    raw_spin_lock(&logbuf_lock);
    WARN_ON(1);
    raw_spin_unlock(&logbuf_lock);

b) printk from sem->lock spin_lock section
  printk()
    console_trylock()
      down_trylock()
        raw_spin_lock_irqsave(&sem->lock, flags);
        WARN_ON(1);
        raw_spin_unlock_irqrestore(&sem->lock, flags);

c) printk from logbuf_lock spin_lock section in console_unlock()
  printk()
    console_unlock()
      raw_spin_lock(&logbuf_lock);
      WARN_ON(1);
      raw_spin_unlock(&logbuf_lock);

d) printk from ->pi_lock from semaphore up
  printk()
    console_unlock()
      up()
        try_to_wake_up()
          raw_spin_lock_irqsave(&p->pi_lock, flags);
          WARN_ON(1);
          raw_spin_unlock_irqrestore(&p->pi_lock, flags);

e) printk from console_cont_flush() /*and down the call chain */
  printk()
    console_unlock()
      call_console_drivers()
       ...
          WARN_ON(1);

[[against next-20161027]]

v4:
-- addressed Steven's review points
-- use printk_safe in console_cont_flush()

v3: (review by Petr)
-- renamed to printk_safe
-- !NMI config build fix
-- report lost messages for both printk_sae and printk_nmi
-- dropped recursion reporting patch
-- etc.

v2:
-- fix build error on !NMI configs, reported by Fengguang
-- reworked the series based on Petr's feedback
-- added a new patch to drop zap_locks()

Sergey Senozhatsky (6):
  printk: use vprintk_func in vprintk()
  printk: rename nmi.c and exported api
  printk: introduce per-cpu safe_print seq buffer
  printk: report lost messages in printk safe/nmi contexts
  printk: use printk_safe buffers
  printk: remove zap_locks() function

 arch/arm/kernel/smp.c                  |   4 +-
 include/linux/hardirq.h                |   4 +-
 include/linux/printk.h                 |  29 ++--
 init/Kconfig                           |  16 ++-
 init/main.c                            |   2 +-
 kernel/kexec_core.c                    |   2 +-
 kernel/panic.c                         |   4 +-
 kernel/printk/Makefile                 |   2 +-
 kernel/printk/internal.h               |  63 ++++-----
 kernel/printk/printk.c                 | 124 +++++-------------
 kernel/printk/{nmi.c => printk_safe.c} | 233 +++++++++++++++++++++++++--------
 lib/nmi_backtrace.c                    |   2 +-
 12 files changed, 274 insertions(+), 211 deletions(-)
 rename kernel/printk/{nmi.c => printk_safe.c} (54%)

-- 
2.10.1.502.g6598894

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [RFC][PATCHv4 1/6] printk: use vprintk_func in vprintk()
  2016-10-27 15:49 [RFC][PATCHv4 0/6] printk: use printk_safe to handle printk() recursive calls Sergey Senozhatsky
@ 2016-10-27 15:49 ` Sergey Senozhatsky
  2016-11-24 16:28   ` Petr Mladek
  2016-10-27 15:49 ` [RFC][PATCHv4 2/6] printk: rename nmi.c and exported api Sergey Senozhatsky
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 30+ messages in thread
From: Sergey Senozhatsky @ 2016-10-27 15:49 UTC (permalink / raw)
  To: Petr Mladek, Andrew Morton
  Cc: Jan Kara, Tejun Heo, Calvin Owens, Thomas Gleixner, Mel Gorman,
	Steven Rostedt, Ingo Molnar, Peter Zijlstra, Laura Abbott,
	Andy Lutomirski, Linus Torvalds, Kees Cook, linux-kernel,
	Sergey Senozhatsky, Sergey Senozhatsky

vprintk(), just like printk(), better be using per-cpu printk_func
instead of direct vprintk_emit() call. Just in case if vprintk()
will ever be called from NMI, or from any other context that can
deadlock in printk().

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/printk/printk.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index de08fc9..539f31b 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1921,7 +1921,7 @@ EXPORT_SYMBOL(vprintk_emit);
 
 asmlinkage int vprintk(const char *fmt, va_list args)
 {
-	return vprintk_emit(0, LOGLEVEL_DEFAULT, NULL, 0, fmt, args);
+	return vprintk_func(fmt, args);
 }
 EXPORT_SYMBOL(vprintk);
 
-- 
2.10.1.502.g6598894

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC][PATCHv4 2/6] printk: rename nmi.c and exported api
  2016-10-27 15:49 [RFC][PATCHv4 0/6] printk: use printk_safe to handle printk() recursive calls Sergey Senozhatsky
  2016-10-27 15:49 ` [RFC][PATCHv4 1/6] printk: use vprintk_func in vprintk() Sergey Senozhatsky
@ 2016-10-27 15:49 ` Sergey Senozhatsky
  2016-11-24 16:35   ` Petr Mladek
  2016-10-27 15:49 ` [RFC][PATCHv4 3/6] printk: introduce per-cpu safe_print seq buffer Sergey Senozhatsky
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 30+ messages in thread
From: Sergey Senozhatsky @ 2016-10-27 15:49 UTC (permalink / raw)
  To: Petr Mladek, Andrew Morton
  Cc: Jan Kara, Tejun Heo, Calvin Owens, Thomas Gleixner, Mel Gorman,
	Steven Rostedt, Ingo Molnar, Peter Zijlstra, Laura Abbott,
	Andy Lutomirski, Linus Torvalds, Kees Cook, linux-kernel,
	Sergey Senozhatsky, Sergey Senozhatsky

A preparation patch for printk_safe work. No functional change.
- rename nmi.c to print_safe.c
- rename exported functions to have `safe' prefix.

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
---
 arch/arm/kernel/smp.c                  |  4 +-
 include/linux/hardirq.h                |  4 +-
 include/linux/printk.h                 | 20 +++++-----
 init/Kconfig                           | 16 ++++----
 init/main.c                            |  2 +-
 kernel/kexec_core.c                    |  2 +-
 kernel/panic.c                         |  4 +-
 kernel/printk/Makefile                 |  2 +-
 kernel/printk/{nmi.c => printk_safe.c} | 69 +++++++++++++++++-----------------
 lib/nmi_backtrace.c                    |  2 +-
 10 files changed, 64 insertions(+), 61 deletions(-)
 rename kernel/printk/{nmi.c => printk_safe.c} (77%)

diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 7dd14e8..8b056e1 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -644,11 +644,11 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
 		break;
 
 	case IPI_CPU_BACKTRACE:
-		printk_nmi_enter();
+		printk_safe_nmi_enter();
 		irq_enter();
 		nmi_cpu_backtrace(regs);
 		irq_exit();
-		printk_nmi_exit();
+		printk_safe_nmi_exit();
 		break;
 
 	default:
diff --git a/include/linux/hardirq.h b/include/linux/hardirq.h
index c683996..14840aa 100644
--- a/include/linux/hardirq.h
+++ b/include/linux/hardirq.h
@@ -61,7 +61,7 @@ extern void irq_exit(void);
 
 #define nmi_enter()						\
 	do {							\
-		printk_nmi_enter();				\
+		printk_safe_nmi_enter();			\
 		lockdep_off();					\
 		ftrace_nmi_enter();				\
 		BUG_ON(in_nmi());				\
@@ -78,7 +78,7 @@ extern void irq_exit(void);
 		preempt_count_sub(NMI_OFFSET + HARDIRQ_OFFSET);	\
 		ftrace_nmi_exit();				\
 		lockdep_on();					\
-		printk_nmi_exit();				\
+		printk_safe_nmi_exit();				\
 	} while (0)
 
 #endif /* LINUX_HARDIRQ_H */
diff --git a/include/linux/printk.h b/include/linux/printk.h
index eac1af8..c9a3080 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -132,17 +132,17 @@ void early_printk(const char *s, ...) { }
 #endif
 
 #ifdef CONFIG_PRINTK_NMI
-extern void printk_nmi_init(void);
-extern void printk_nmi_enter(void);
-extern void printk_nmi_exit(void);
-extern void printk_nmi_flush(void);
-extern void printk_nmi_flush_on_panic(void);
+extern void printk_safe_init(void);
+extern void printk_safe_nmi_enter(void);
+extern void printk_safe_nmi_exit(void);
+extern void printk_safe_flush(void);
+extern void printk_safe_flush_on_panic(void);
 #else
-static inline void printk_nmi_init(void) { }
-static inline void printk_nmi_enter(void) { }
-static inline void printk_nmi_exit(void) { }
-static inline void printk_nmi_flush(void) { }
-static inline void printk_nmi_flush_on_panic(void) { }
+static inline void printk_safe_init(void) { }
+static inline void printk_safe_nmi_enter(void) { }
+static inline void printk_safe_nmi_exit(void) { }
+static inline void printk_safe_flush(void) { }
+static inline void printk_safe_flush_on_panic(void) { }
 #endif /* PRINTK_NMI */
 
 #ifdef CONFIG_PRINTK
diff --git a/init/Kconfig b/init/Kconfig
index 172f80e..1f43175 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -875,17 +875,19 @@ config LOG_CPU_MAX_BUF_SHIFT
 		     13 =>   8 KB for each CPU
 		     12 =>   4 KB for each CPU
 
-config NMI_LOG_BUF_SHIFT
-	int "Temporary per-CPU NMI log buffer size (12 => 4KB, 13 => 8KB)"
+config PRINTK_SAFE_LOG_BUF_SHIFT
+	int "Temporary per-CPU printk log buffer size (12 => 4KB, 13 => 8KB)"
 	range 10 21
 	default 13
-	depends on PRINTK_NMI
+	depends on PRINTK
 	help
-	  Select the size of a per-CPU buffer where NMI messages are temporary
-	  stored. They are copied to the main log buffer in a safe context
-	  to avoid a deadlock. The value defines the size as a power of 2.
+	  Select the size of an alternate printk per-CPU buffer where messages
+	  printed from usafe contexts are temporary stored. One example would
+	  be NMI messages, another one - printk recursion. The messages are
+	  copied to the main log buffer in a safe context to avoid a deadlock.
+	  The value defines the size as a power of 2.
 
-	  NMI messages are rare and limited. The largest one is when
+	  Those messages are rare and limited. The largest one is when
 	  a backtrace is printed. It usually fits into 4KB. Select
 	  8KB if you want to be on the safe side.
 
diff --git a/init/main.c b/init/main.c
index cbed073..365cee2 100644
--- a/init/main.c
+++ b/init/main.c
@@ -577,7 +577,7 @@ asmlinkage __visible void __init start_kernel(void)
 	timekeeping_init();
 	time_init();
 	sched_clock_postinit();
-	printk_nmi_init();
+	printk_safe_init();
 	perf_event_init();
 	profile_init();
 	call_function_init();
diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index 5616755..297508a1 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -914,7 +914,7 @@ void crash_kexec(struct pt_regs *regs)
 	old_cpu = atomic_cmpxchg(&panic_cpu, PANIC_CPU_INVALID, this_cpu);
 	if (old_cpu == PANIC_CPU_INVALID) {
 		/* This is the 1st CPU which comes here, so go ahead. */
-		printk_nmi_flush_on_panic();
+		printk_safe_flush_on_panic();
 		__crash_kexec(regs);
 
 		/*
diff --git a/kernel/panic.c b/kernel/panic.c
index e6480e2..67c11fd 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -188,7 +188,7 @@ void panic(const char *fmt, ...)
 	 * Bypass the panic_cpu check and call __crash_kexec directly.
 	 */
 	if (!_crash_kexec_post_notifiers) {
-		printk_nmi_flush_on_panic();
+		printk_safe_flush_on_panic();
 		__crash_kexec(NULL);
 
 		/*
@@ -213,7 +213,7 @@ void panic(const char *fmt, ...)
 	atomic_notifier_call_chain(&panic_notifier_list, 0, buf);
 
 	/* Call flush even twice. It tries harder with a single online CPU */
-	printk_nmi_flush_on_panic();
+	printk_safe_flush_on_panic();
 	kmsg_dump(KMSG_DUMP_PANIC);
 
 	/*
diff --git a/kernel/printk/Makefile b/kernel/printk/Makefile
index abb0042..6079281 100644
--- a/kernel/printk/Makefile
+++ b/kernel/printk/Makefile
@@ -1,3 +1,3 @@
 obj-y	= printk.o
-obj-$(CONFIG_PRINTK_NMI)		+= nmi.o
+obj-$(CONFIG_PRINTK_NMI)		+= printk_safe.o
 obj-$(CONFIG_A11Y_BRAILLE_CONSOLE)	+= braille.o
diff --git a/kernel/printk/nmi.c b/kernel/printk/printk_safe.c
similarity index 77%
rename from kernel/printk/nmi.c
rename to kernel/printk/printk_safe.c
index 16bab47..1f66163 100644
--- a/kernel/printk/nmi.c
+++ b/kernel/printk/printk_safe.c
@@ -1,5 +1,5 @@
 /*
- * nmi.c - Safe printk in NMI context
+ * printk_safe.c - Safe printk in NMI context
  *
  * This program is free software; you can redistribute it and/or
  * modify it under the terms of the GNU General Public License
@@ -39,18 +39,18 @@
  * were handled or when IRQs are blocked.
  */
 DEFINE_PER_CPU(printk_func_t, printk_func) = vprintk_default;
-static int printk_nmi_irq_ready;
+static int printk_safe_irq_ready;
 atomic_t nmi_message_lost;
 
-#define NMI_LOG_BUF_LEN ((1 << CONFIG_NMI_LOG_BUF_SHIFT) -		\
+#define SAFE_LOG_BUF_LEN ((1 << CONFIG_PRINTK_SAFE_LOG_BUF_SHIFT) -	\
 			 sizeof(atomic_t) - sizeof(struct irq_work))
 
-struct nmi_seq_buf {
+struct printk_safe_seq_buf {
 	atomic_t		len;	/* length of written data */
 	struct irq_work		work;	/* IRQ work that flushes the buffer */
-	unsigned char		buffer[NMI_LOG_BUF_LEN];
+	unsigned char		buffer[SAFE_LOG_BUF_LEN];
 };
-static DEFINE_PER_CPU(struct nmi_seq_buf, nmi_print_seq);
+static DEFINE_PER_CPU(struct printk_safe_seq_buf, nmi_print_seq);
 
 /*
  * Safe printk() for NMI context. It uses a per-CPU buffer to
@@ -58,9 +58,9 @@ static DEFINE_PER_CPU(struct nmi_seq_buf, nmi_print_seq);
  * one writer running. But the buffer might get flushed from another
  * CPU, so we need to be careful.
  */
-static int vprintk_nmi(const char *fmt, va_list args)
+static int vprintk_safe_nmi(const char *fmt, va_list args)
 {
-	struct nmi_seq_buf *s = this_cpu_ptr(&nmi_print_seq);
+	struct printk_safe_seq_buf *s = this_cpu_ptr(&nmi_print_seq);
 	int add = 0;
 	size_t len;
 
@@ -90,7 +90,7 @@ static int vprintk_nmi(const char *fmt, va_list args)
 		goto again;
 
 	/* Get flushed in a more safe context. */
-	if (add && printk_nmi_irq_ready) {
+	if (add && printk_safe_irq_ready) {
 		/* Make sure that IRQ work is really initialized. */
 		smp_rmb();
 		irq_work_queue(&s->work);
@@ -99,7 +99,7 @@ static int vprintk_nmi(const char *fmt, va_list args)
 	return add;
 }
 
-static void printk_nmi_flush_line(const char *text, int len)
+static void printk_safe_flush_line(const char *text, int len)
 {
 	/*
 	 * The buffers are flushed in NMI only on panic.  The messages must
@@ -117,23 +117,24 @@ static void printk_nmi_flush_line(const char *text, int len)
  * printk one line from the temporary buffer from @start index until
  * and including the @end index.
  */
-static void printk_nmi_flush_seq_line(struct nmi_seq_buf *s,
+static void printk_safe_flush_seq_line(struct printk_safe_seq_buf *s,
 					int start, int end)
 {
 	const char *buf = s->buffer + start;
 
-	printk_nmi_flush_line(buf, (end - start) + 1);
+	printk_safe_flush_line(buf, (end - start) + 1);
 }
 
 /*
  * Flush data from the associated per_CPU buffer. The function
  * can be called either via IRQ work or independently.
  */
-static void __printk_nmi_flush(struct irq_work *work)
+static void __printk_safe_flush(struct irq_work *work)
 {
 	static raw_spinlock_t read_lock =
 		__RAW_SPIN_LOCK_INITIALIZER(read_lock);
-	struct nmi_seq_buf *s = container_of(work, struct nmi_seq_buf, work);
+	struct printk_safe_seq_buf *s = container_of(work,
+			struct printk_safe_seq_buf, work);
 	unsigned long flags;
 	size_t len, size;
 	int i, last_i;
@@ -157,9 +158,9 @@ static void __printk_nmi_flush(struct irq_work *work)
 	 * @len must only increase.
 	 */
 	if (i && i >= len) {
-		const char *msg = "printk_nmi_flush: internal error\n";
+		const char *msg = "printk_safe_flush: internal error\n";
 
-		printk_nmi_flush_line(msg, strlen(msg));
+		printk_safe_flush_line(msg, strlen(msg));
 	}
 
 	if (!len)
@@ -174,14 +175,14 @@ static void __printk_nmi_flush(struct irq_work *work)
 	/* Print line by line. */
 	for (; i < size; i++) {
 		if (s->buffer[i] == '\n') {
-			printk_nmi_flush_seq_line(s, last_i, i);
+			printk_safe_flush_seq_line(s, last_i, i);
 			last_i = i + 1;
 		}
 	}
 	/* Check if there was a partial line. */
 	if (last_i < size) {
-		printk_nmi_flush_seq_line(s, last_i, size - 1);
-		printk_nmi_flush_line("\n", strlen("\n"));
+		printk_safe_flush_seq_line(s, last_i, size - 1);
+		printk_safe_flush_line("\n", strlen("\n"));
 	}
 
 	/*
@@ -198,31 +199,31 @@ static void __printk_nmi_flush(struct irq_work *work)
 }
 
 /**
- * printk_nmi_flush - flush all per-cpu nmi buffers.
+ * printk_safe_flush - flush all per-cpu nmi buffers.
  *
  * The buffers are flushed automatically via IRQ work. This function
  * is useful only when someone wants to be sure that all buffers have
  * been flushed at some point.
  */
-void printk_nmi_flush(void)
+void printk_safe_flush(void)
 {
 	int cpu;
 
 	for_each_possible_cpu(cpu)
-		__printk_nmi_flush(&per_cpu(nmi_print_seq, cpu).work);
+		__printk_safe_flush(&per_cpu(nmi_print_seq, cpu).work);
 }
 
 /**
- * printk_nmi_flush_on_panic - flush all per-cpu nmi buffers when the system
+ * printk_safe_flush_on_panic - flush all per-cpu nmi buffers when the system
  *	goes down.
  *
- * Similar to printk_nmi_flush() but it can be called even in NMI context when
+ * Similar to printk_safe_flush() but it can be called even in NMI context when
  * the system goes down. It does the best effort to get NMI messages into
  * the main ring buffer.
  *
  * Note that it could try harder when there is only one CPU online.
  */
-void printk_nmi_flush_on_panic(void)
+void printk_safe_flush_on_panic(void)
 {
 	/*
 	 * Make sure that we could access the main ring buffer.
@@ -236,33 +237,33 @@ void printk_nmi_flush_on_panic(void)
 		raw_spin_lock_init(&logbuf_lock);
 	}
 
-	printk_nmi_flush();
+	printk_safe_flush();
 }
 
-void __init printk_nmi_init(void)
+void __init printk_safe_init(void)
 {
 	int cpu;
 
 	for_each_possible_cpu(cpu) {
-		struct nmi_seq_buf *s = &per_cpu(nmi_print_seq, cpu);
+		struct printk_safe_seq_buf *s = &per_cpu(nmi_print_seq, cpu);
 
-		init_irq_work(&s->work, __printk_nmi_flush);
+		init_irq_work(&s->work, __printk_safe_flush);
 	}
 
 	/* Make sure that IRQ works are initialized before enabling. */
 	smp_wmb();
-	printk_nmi_irq_ready = 1;
+	printk_safe_irq_ready = 1;
 
 	/* Flush pending messages that did not have scheduled IRQ works. */
-	printk_nmi_flush();
+	printk_safe_flush();
 }
 
-void printk_nmi_enter(void)
+void printk_safe_nmi_enter(void)
 {
-	this_cpu_write(printk_func, vprintk_nmi);
+	this_cpu_write(printk_func, vprintk_safe_nmi);
 }
 
-void printk_nmi_exit(void)
+void printk_safe_nmi_exit(void)
 {
 	this_cpu_write(printk_func, vprintk_default);
 }
diff --git a/lib/nmi_backtrace.c b/lib/nmi_backtrace.c
index 7555475..5f7999e 100644
--- a/lib/nmi_backtrace.c
+++ b/lib/nmi_backtrace.c
@@ -77,7 +77,7 @@ void nmi_trigger_cpumask_backtrace(const cpumask_t *mask,
 	 * Force flush any remote buffers that might be stuck in IRQ context
 	 * and therefore could not run their irq_work.
 	 */
-	printk_nmi_flush();
+	printk_safe_flush();
 
 	clear_bit_unlock(0, &backtrace_flag);
 	put_cpu();
-- 
2.10.1.502.g6598894

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC][PATCHv4 3/6] printk: introduce per-cpu safe_print seq buffer
  2016-10-27 15:49 [RFC][PATCHv4 0/6] printk: use printk_safe to handle printk() recursive calls Sergey Senozhatsky
  2016-10-27 15:49 ` [RFC][PATCHv4 1/6] printk: use vprintk_func in vprintk() Sergey Senozhatsky
  2016-10-27 15:49 ` [RFC][PATCHv4 2/6] printk: rename nmi.c and exported api Sergey Senozhatsky
@ 2016-10-27 15:49 ` Sergey Senozhatsky
  2016-11-24 16:58   ` Petr Mladek
  2016-10-27 15:49 ` [RFC][PATCHv4 4/6] printk: report lost messages in printk safe/nmi contexts Sergey Senozhatsky
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 30+ messages in thread
From: Sergey Senozhatsky @ 2016-10-27 15:49 UTC (permalink / raw)
  To: Petr Mladek, Andrew Morton
  Cc: Jan Kara, Tejun Heo, Calvin Owens, Thomas Gleixner, Mel Gorman,
	Steven Rostedt, Ingo Molnar, Peter Zijlstra, Laura Abbott,
	Andy Lutomirski, Linus Torvalds, Kees Cook, linux-kernel,
	Sergey Senozhatsky, Sergey Senozhatsky

This patch extends the idea of NMI per-cpu buffers to regions
that may cause recursive printk() calls and possible deadlocks.
Namely, printk() can't handle printk calls from schedule code
or printk() calls from lock debugging code (spin_dump() for instance);
because those may be called with `sem->lock' already taken or any
other `critical' locks (p->pi_lock, etc.). An example of deadlock
can be

 vprintk_emit()
  console_unlock()
   up()                        << raw_spin_lock_irqsave(&sem->lock, flags);
    wake_up_process()
     try_to_wake_up()
      ttwu_queue()
       ttwu_activate()
        activate_task()
         enqueue_task()
          enqueue_task_fair()
           cfs_rq_of()
            task_of()
             WARN_ON_ONCE(!entity_is_task(se))
              vprintk_emit()
               console_trylock()
                down_trylock()
                 raw_spin_lock_irqsave(&sem->lock, flags)
                 ^^^^ deadlock

and some other cases.

Just like in NMI implementation, the solution uses a per-cpu
`printk_func' pointer to 'redirect' printk() calls to a 'safe'
callback, that store messages in a per-cpu buffer and flushes
them back to logbuf buffer later.

Usage example:

 printk()
  printk_safe_enter(flags)
  //
  //  any printk() call from here will endup in vprintk_safe(),
  //  that stores messages in a special per-CPU buffer.
  //
  printk_safe_exit(flags)

The 'redirection' mechanism, though, has been reworked, as suggested
by Petr Mladek. Instead of using a per-cpu @print_func callback we now
keep a per-cpu printk-context variable and call either default or nmi
vprintk function depending on its value. printk_nmi_entrer/exit and
printk_safe_enter/exit, thus, just set/celar corresponding bits in
printk-context functions.

The patch only adds printk_safe support, we don't use it yet.

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
 include/linux/printk.h      |  21 +++++--
 kernel/printk/Makefile      |   2 +-
 kernel/printk/internal.h    |  56 ++++++++++--------
 kernel/printk/printk.c      |   3 -
 kernel/printk/printk_safe.c | 134 ++++++++++++++++++++++++++++++++++----------
 5 files changed, 152 insertions(+), 64 deletions(-)

diff --git a/include/linux/printk.h b/include/linux/printk.h
index c9a3080..e8ede8f 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -132,17 +132,11 @@ void early_printk(const char *s, ...) { }
 #endif
 
 #ifdef CONFIG_PRINTK_NMI
-extern void printk_safe_init(void);
 extern void printk_safe_nmi_enter(void);
 extern void printk_safe_nmi_exit(void);
-extern void printk_safe_flush(void);
-extern void printk_safe_flush_on_panic(void);
 #else
-static inline void printk_safe_init(void) { }
 static inline void printk_safe_nmi_enter(void) { }
 static inline void printk_safe_nmi_exit(void) { }
-static inline void printk_safe_flush(void) { }
-static inline void printk_safe_flush_on_panic(void) { }
 #endif /* PRINTK_NMI */
 
 #ifdef CONFIG_PRINTK
@@ -194,6 +188,9 @@ void __init setup_log_buf(int early);
 __printf(1, 2) void dump_stack_set_arch_desc(const char *fmt, ...);
 void dump_stack_print_info(const char *log_lvl);
 void show_regs_print_info(const char *log_lvl);
+extern void printk_safe_init(void);
+extern void printk_safe_flush(void);
+extern void printk_safe_flush_on_panic(void);
 #else
 static inline __printf(1, 0)
 int vprintk(const char *s, va_list args)
@@ -253,6 +250,18 @@ static inline void dump_stack_print_info(const char *log_lvl)
 static inline void show_regs_print_info(const char *log_lvl)
 {
 }
+
+static inline void printk_safe_init(void)
+{
+}
+
+static inline void printk_safe_flush(void)
+{
+}
+
+static inline void printk_safe_flush_on_panic(void)
+{
+}
 #endif
 
 extern asmlinkage void dump_stack(void) __cold;
diff --git a/kernel/printk/Makefile b/kernel/printk/Makefile
index 6079281..4a2ffc3 100644
--- a/kernel/printk/Makefile
+++ b/kernel/printk/Makefile
@@ -1,3 +1,3 @@
 obj-y	= printk.o
-obj-$(CONFIG_PRINTK_NMI)		+= printk_safe.o
+obj-$(CONFIG_PRINTK)	+= printk_safe.o
 obj-$(CONFIG_A11Y_BRAILLE_CONSOLE)	+= braille.o
diff --git a/kernel/printk/internal.h b/kernel/printk/internal.h
index 7fd2838..87c784b 100644
--- a/kernel/printk/internal.h
+++ b/kernel/printk/internal.h
@@ -16,26 +16,8 @@
  */
 #include <linux/percpu.h>
 
-typedef __printf(1, 0) int (*printk_func_t)(const char *fmt, va_list args);
-
-int __printf(1, 0) vprintk_default(const char *fmt, va_list args);
-
 #ifdef CONFIG_PRINTK_NMI
 
-extern raw_spinlock_t logbuf_lock;
-
-/*
- * printk() could not take logbuf_lock in NMI context. Instead,
- * it temporary stores the strings into a per-CPU buffer.
- * The alternative implementation is chosen transparently
- * via per-CPU variable.
- */
-DECLARE_PER_CPU(printk_func_t, printk_func);
-static inline __printf(1, 0) int vprintk_func(const char *fmt, va_list args)
-{
-	return this_cpu_read(printk_func)(fmt, args);
-}
-
 extern atomic_t nmi_message_lost;
 static inline int get_nmi_message_lost(void)
 {
@@ -44,14 +26,42 @@ static inline int get_nmi_message_lost(void)
 
 #else /* CONFIG_PRINTK_NMI */
 
-static inline __printf(1, 0) int vprintk_func(const char *fmt, va_list args)
-{
-	return vprintk_default(fmt, args);
-}
-
 static inline int get_nmi_message_lost(void)
 {
 	return 0;
 }
 
 #endif /* CONFIG_PRINTK_NMI */
+
+#ifdef CONFIG_PRINTK
+
+#define PRINTK_SAFE_CONTEXT_MASK	0x7fffffff
+#define PRINTK_SAFE_NMI_CONTEXT_MASK	0x80000000
+
+extern raw_spinlock_t logbuf_lock;
+
+__printf(1, 0) int vprintk_default(const char *fmt, va_list args);
+__printf(1, 0) int vprintk_func(const char *fmt, va_list args);
+void __printk_safe_enter(void);
+void __printk_safe_exit(void);
+
+#define printk_safe_enter(flags)		\
+	do {					\
+		local_irq_save(flags);		\
+		__printk_safe_enter();		\
+	} while (0)
+
+#define printk_safe_exit(flags)		\
+	do {					\
+		__printk_safe_exit();		\
+		local_irq_restore(flags);	\
+	} while (0)
+
+#else
+
+__printf(1, 0) int vprintk_func(const char *fmt, va_list args) { return 0; }
+
+#define printk_safe_enter(f)	((void)(f))
+#define printk_safe_exit(f)	((void)(f))
+
+#endif /* CONFIG_PRINTK */
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 539f31b..1878b2d 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2028,9 +2028,6 @@ static size_t msg_print_text(const struct printk_log *msg, enum log_flags prev,
 static size_t cont_print_text(char *text, size_t size) { return 0; }
 static bool suppress_message_printing(int level) { return false; }
 
-/* Still needs to be defined for users */
-DEFINE_PER_CPU(printk_func_t, printk_func);
-
 #endif /* CONFIG_PRINTK */
 
 #ifdef CONFIG_EARLY_PRINTK
diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c
index 1f66163..af74d9c 100644
--- a/kernel/printk/printk_safe.c
+++ b/kernel/printk/printk_safe.c
@@ -1,5 +1,5 @@
 /*
- * printk_safe.c - Safe printk in NMI context
+ * printk_safe.c - Safe printk for printk-deadlock-prone contexts
  *
  * This program is free software; you can redistribute it and/or
  * modify it under the terms of the GNU General Public License
@@ -32,15 +32,14 @@
  * is later flushed into the main ring buffer via IRQ work.
  *
  * The alternative implementation is chosen transparently
- * via @printk_func per-CPU variable.
+ * by examinig current printk() context mask stored in @printk_safe_context
+ * per-CPU variable.
  *
  * The implementation allows to flush the strings also from another CPU.
  * There are situations when we want to make sure that all buffers
  * were handled or when IRQs are blocked.
  */
-DEFINE_PER_CPU(printk_func_t, printk_func) = vprintk_default;
 static int printk_safe_irq_ready;
-atomic_t nmi_message_lost;
 
 #define SAFE_LOG_BUF_LEN ((1 << CONFIG_PRINTK_SAFE_LOG_BUF_SHIFT) -	\
 			 sizeof(atomic_t) - sizeof(struct irq_work))
@@ -50,27 +49,26 @@ struct printk_safe_seq_buf {
 	struct irq_work		work;	/* IRQ work that flushes the buffer */
 	unsigned char		buffer[SAFE_LOG_BUF_LEN];
 };
+
+static DEFINE_PER_CPU(struct printk_safe_seq_buf, safe_print_seq);
+static DEFINE_PER_CPU(int, printk_safe_context);
+
+#ifdef CONFIG_PRINTK_NMI
 static DEFINE_PER_CPU(struct printk_safe_seq_buf, nmi_print_seq);
+atomic_t nmi_message_lost;
+#endif
 
-/*
- * Safe printk() for NMI context. It uses a per-CPU buffer to
- * store the message. NMIs are not nested, so there is always only
- * one writer running. But the buffer might get flushed from another
- * CPU, so we need to be careful.
- */
-static int vprintk_safe_nmi(const char *fmt, va_list args)
+static int printk_safe_log_store(struct printk_safe_seq_buf *s,
+		const char *fmt, va_list args)
 {
-	struct printk_safe_seq_buf *s = this_cpu_ptr(&nmi_print_seq);
-	int add = 0;
+	int add;
 	size_t len;
 
 again:
 	len = atomic_read(&s->len);
 
-	if (len >= sizeof(s->buffer)) {
-		atomic_inc(&nmi_message_lost);
-		return 0;
-	}
+	if (len >= sizeof(s->buffer))
+		return -E2BIG;
 
 	/*
 	 * Make sure that all old data have been read before the buffer was
@@ -110,7 +108,6 @@ static void printk_safe_flush_line(const char *text, int len)
 		printk_deferred("%.*s", len, text);
 	else
 		printk("%.*s", len, text);
-
 }
 
 /*
@@ -126,7 +123,7 @@ static void printk_safe_flush_seq_line(struct printk_safe_seq_buf *s,
 }
 
 /*
- * Flush data from the associated per_CPU buffer. The function
+ * Flush data from the associated per-CPU buffer. The function
  * can be called either via IRQ work or independently.
  */
 static void __printk_safe_flush(struct irq_work *work)
@@ -209,8 +206,12 @@ void printk_safe_flush(void)
 {
 	int cpu;
 
-	for_each_possible_cpu(cpu)
+	for_each_possible_cpu(cpu) {
+#ifdef CONFIG_PRINTK_NMI
 		__printk_safe_flush(&per_cpu(nmi_print_seq, cpu).work);
+#endif
+		__printk_safe_flush(&per_cpu(safe_print_seq, cpu).work);
+	}
 }
 
 /**
@@ -240,14 +241,95 @@ void printk_safe_flush_on_panic(void)
 	printk_safe_flush();
 }
 
+#ifdef CONFIG_PRINTK_NMI
+/*
+ * Safe printk() for NMI context. It uses a per-CPU buffer to
+ * store the message. NMIs are not nested, so there is always only
+ * one writer running. But the buffer might get flushed from another
+ * CPU, so we need to be careful.
+ */
+static int vprintk_safe_nmi(const char *fmt, va_list args)
+{
+	struct printk_safe_seq_buf *s = this_cpu_ptr(&nmi_print_seq);
+	int add;
+
+	add = printk_safe_log_store(s, fmt, args);
+	if (add == -E2BIG) {
+		atomic_inc(&nmi_message_lost);
+		add = 0;
+	}
+
+	return add;
+}
+
+void printk_safe_nmi_enter(void)
+{
+	this_cpu_or(printk_safe_context, PRINTK_SAFE_NMI_CONTEXT_MASK);
+}
+
+void printk_safe_nmi_exit(void)
+{
+	this_cpu_and(printk_safe_context, ~PRINTK_SAFE_NMI_CONTEXT_MASK);
+}
+
+#else
+
+static int vprintk_safe_nmi(const char *fmt, va_list args)
+{
+	return 0;
+}
+
+#endif /* CONFIG_PRINTK_NMI */
+
+/*
+ * Lockless printk(), to avoid deadlocks should the printk() recurse
+ * into itself. It uses a per-CPU buffer to store the message, just like
+ * NMI.
+ */
+static int vprintk_safe(const char *fmt, va_list args)
+{
+	struct printk_safe_seq_buf *s = this_cpu_ptr(&safe_print_seq);
+
+	return printk_safe_log_store(s, fmt, args);
+}
+
+/* Can be preempted by NMI. */
+void __printk_safe_enter(void)
+{
+	this_cpu_inc(printk_safe_context);
+}
+
+/* Can be preempted by NMI. */
+void __printk_safe_exit(void)
+{
+	this_cpu_dec(printk_safe_context);
+}
+
+__printf(1, 0) int vprintk_func(const char *fmt, va_list args)
+{
+	if (this_cpu_read(printk_safe_context) & PRINTK_SAFE_NMI_CONTEXT_MASK)
+		return vprintk_safe_nmi(fmt, args);
+
+	if (this_cpu_read(printk_safe_context) & PRINTK_SAFE_CONTEXT_MASK)
+		return vprintk_safe(fmt, args);
+
+	return vprintk_default(fmt, args);
+}
+
 void __init printk_safe_init(void)
 {
 	int cpu;
 
 	for_each_possible_cpu(cpu) {
-		struct printk_safe_seq_buf *s = &per_cpu(nmi_print_seq, cpu);
+		struct printk_safe_seq_buf *s;
+
+		s = &per_cpu(safe_print_seq, cpu);
+		init_irq_work(&s->work, __printk_safe_flush);
 
+#ifdef CONFIG_PRINTK_NMI
+		s = &per_cpu(nmi_print_seq, cpu);
 		init_irq_work(&s->work, __printk_safe_flush);
+#endif
 	}
 
 	/* Make sure that IRQ works are initialized before enabling. */
@@ -257,13 +339,3 @@ void __init printk_safe_init(void)
 	/* Flush pending messages that did not have scheduled IRQ works. */
 	printk_safe_flush();
 }
-
-void printk_safe_nmi_enter(void)
-{
-	this_cpu_write(printk_func, vprintk_safe_nmi);
-}
-
-void printk_safe_nmi_exit(void)
-{
-	this_cpu_write(printk_func, vprintk_default);
-}
-- 
2.10.1.502.g6598894

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC][PATCHv4 4/6] printk: report lost messages in printk safe/nmi contexts
  2016-10-27 15:49 [RFC][PATCHv4 0/6] printk: use printk_safe to handle printk() recursive calls Sergey Senozhatsky
                   ` (2 preceding siblings ...)
  2016-10-27 15:49 ` [RFC][PATCHv4 3/6] printk: introduce per-cpu safe_print seq buffer Sergey Senozhatsky
@ 2016-10-27 15:49 ` Sergey Senozhatsky
  2016-11-25 11:07   ` Petr Mladek
  2016-10-27 15:49 ` [RFC][PATCHv4 5/6] printk: use printk_safe buffers Sergey Senozhatsky
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 30+ messages in thread
From: Sergey Senozhatsky @ 2016-10-27 15:49 UTC (permalink / raw)
  To: Petr Mladek, Andrew Morton
  Cc: Jan Kara, Tejun Heo, Calvin Owens, Thomas Gleixner, Mel Gorman,
	Steven Rostedt, Ingo Molnar, Peter Zijlstra, Laura Abbott,
	Andy Lutomirski, Linus Torvalds, Kees Cook, linux-kernel,
	Sergey Senozhatsky, Sergey Senozhatsky

Account lost messages in pritk-safe and printk-safe-nmi
contexts and report those numbers during printk_safe_flush().

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
 kernel/printk/internal.h    | 17 ---------------
 kernel/printk/printk.c      | 10 ---------
 kernel/printk/printk_safe.c | 50 +++++++++++++++++++++++++++++++++++++++++++--
 3 files changed, 48 insertions(+), 29 deletions(-)

diff --git a/kernel/printk/internal.h b/kernel/printk/internal.h
index 87c784b..384542d 100644
--- a/kernel/printk/internal.h
+++ b/kernel/printk/internal.h
@@ -16,23 +16,6 @@
  */
 #include <linux/percpu.h>
 
-#ifdef CONFIG_PRINTK_NMI
-
-extern atomic_t nmi_message_lost;
-static inline int get_nmi_message_lost(void)
-{
-	return atomic_xchg(&nmi_message_lost, 0);
-}
-
-#else /* CONFIG_PRINTK_NMI */
-
-static inline int get_nmi_message_lost(void)
-{
-	return 0;
-}
-
-#endif /* CONFIG_PRINTK_NMI */
-
 #ifdef CONFIG_PRINTK
 
 #define PRINTK_SAFE_CONTEXT_MASK	0x7fffffff
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 1878b2d..4675b8d 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1795,7 +1795,6 @@ asmlinkage int vprintk_emit(int facility, int level,
 	unsigned long flags;
 	int this_cpu;
 	int printed_len = 0;
-	int nmi_message_lost;
 	bool in_sched = false;
 	/* cpu currently holding logbuf_lock in this function */
 	static unsigned int logbuf_cpu = UINT_MAX;
@@ -1846,15 +1845,6 @@ asmlinkage int vprintk_emit(int facility, int level,
 					 strlen(recursion_msg));
 	}
 
-	nmi_message_lost = get_nmi_message_lost();
-	if (unlikely(nmi_message_lost)) {
-		text_len = scnprintf(textbuf, sizeof(textbuf),
-				     "BAD LUCK: lost %d message(s) from NMI context!",
-				     nmi_message_lost);
-		printed_len += log_store(0, 2, LOG_PREFIX|LOG_NEWLINE, 0,
-					 NULL, 0, textbuf, text_len);
-	}
-
 	/*
 	 * The printf needs to come first; we need the syslog
 	 * prefix which might be passed-in as a parameter.
diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c
index af74d9c..721b091 100644
--- a/kernel/printk/printk_safe.c
+++ b/kernel/printk/printk_safe.c
@@ -52,10 +52,11 @@ struct printk_safe_seq_buf {
 
 static DEFINE_PER_CPU(struct printk_safe_seq_buf, safe_print_seq);
 static DEFINE_PER_CPU(int, printk_safe_context);
+static atomic_t safe_message_lost;
 
 #ifdef CONFIG_PRINTK_NMI
 static DEFINE_PER_CPU(struct printk_safe_seq_buf, nmi_print_seq);
-atomic_t nmi_message_lost;
+static atomic_t nmi_message_lost;
 #endif
 
 static int printk_safe_log_store(struct printk_safe_seq_buf *s,
@@ -122,6 +123,41 @@ static void printk_safe_flush_seq_line(struct printk_safe_seq_buf *s,
 	printk_safe_flush_line(buf, (end - start) + 1);
 }
 
+static void report_message_lost(atomic_t *num_lost, char *fmt)
+{
+	int lost = atomic_xchg(num_lost, 0);
+
+	if (lost) {
+		char msg[56];
+
+		scnprintf(msg, sizeof(msg), fmt, lost);
+
+		printk_safe_flush_line(msg, strlen(msg));
+	}
+}
+
+#ifdef CONFIG_PRINTK_NMI
+
+static void report_nmi_message_lost(void)
+{
+	report_message_lost(&nmi_message_lost,
+			"Lost %d message(s) from NMI context!");
+}
+
+#else
+
+static void report_nmi_message_lost(void)
+{
+}
+
+#endif /* CONFIG_PRINTK_NMI */
+
+static void report_safe_message_lost(void)
+{
+	report_message_lost(&safe_message_lost,
+			"Lost %d message(s) from printk-safe context!");
+}
+
 /*
  * Flush data from the associated per-CPU buffer. The function
  * can be called either via IRQ work or independently.
@@ -147,6 +183,9 @@ static void __printk_safe_flush(struct irq_work *work)
 
 	i = 0;
 more:
+	report_nmi_message_lost();
+	report_safe_message_lost();
+
 	len = atomic_read(&s->len);
 
 	/*
@@ -289,8 +328,15 @@ static int vprintk_safe_nmi(const char *fmt, va_list args)
 static int vprintk_safe(const char *fmt, va_list args)
 {
 	struct printk_safe_seq_buf *s = this_cpu_ptr(&safe_print_seq);
+	int add;
 
-	return printk_safe_log_store(s, fmt, args);
+	add = printk_safe_log_store(s, fmt, args);
+	if (add == -E2BIG) {
+		atomic_inc(&safe_message_lost);
+		add = 0;
+	}
+
+	return add;
 }
 
 /* Can be preempted by NMI. */
-- 
2.10.1.502.g6598894

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC][PATCHv4 5/6] printk: use printk_safe buffers
  2016-10-27 15:49 [RFC][PATCHv4 0/6] printk: use printk_safe to handle printk() recursive calls Sergey Senozhatsky
                   ` (3 preceding siblings ...)
  2016-10-27 15:49 ` [RFC][PATCHv4 4/6] printk: report lost messages in printk safe/nmi contexts Sergey Senozhatsky
@ 2016-10-27 15:49 ` Sergey Senozhatsky
  2016-11-25 14:28   ` Petr Mladek
  2016-10-27 15:49 ` [RFC][PATCHv4 6/6] printk: remove zap_locks() function Sergey Senozhatsky
  2016-10-28  3:30 ` [RFC][PATCHv4 0/6] printk: use printk_safe to handle printk() recursive calls Linus Torvalds
  6 siblings, 1 reply; 30+ messages in thread
From: Sergey Senozhatsky @ 2016-10-27 15:49 UTC (permalink / raw)
  To: Petr Mladek, Andrew Morton
  Cc: Jan Kara, Tejun Heo, Calvin Owens, Thomas Gleixner, Mel Gorman,
	Steven Rostedt, Ingo Molnar, Peter Zijlstra, Laura Abbott,
	Andy Lutomirski, Linus Torvalds, Kees Cook, linux-kernel,
	Sergey Senozhatsky, Sergey Senozhatsky

Use printk_safe per-CPU buffers in in printk recursion-prone blocks:
-- around logbuf_lock protected sections in vprintk_emit() and
   console_unlock()
-- around down_trylock_console_sem() and up_console_sem()

Note that this solution addresses deadlocks caused by printk()
recursive calls only.

Examples:

1) printk() from logbuf_lock spin_lock section

Assume the following code:
  printk()
    raw_spin_lock(&logbuf_lock);
    WARN_ON(1);
    raw_spin_unlock(&logbuf_lock);

which now produces:

 ------------[ cut here ]------------
 WARNING: CPU: 0 PID: 366 at kernel/printk/printk.c:1811 vprintk_emit+0x1cd/0x438
 CPU: 0 PID: 366 Comm: bash
 Call Trace:
  [<ffffffff81045658>] warn_slowpath_null+0x1d/0x1f
  [<ffffffff81091527>] vprintk_emit+0x1cd/0x438
  [<ffffffff810918fe>] vprintk_default+0x1d/0x1f
  [<ffffffff810fdf6a>] printk+0x48/0x50
  [..]
  [<ffffffff813caaaa>] entry_SYSCALL_64_fastpath+0x18/0xad
 ---[ end trace ]---

2) printk() from semaphore sem->lock spin_lock section

Assume the following code

  printk()
    console_trylock()
      down_trylock()
        raw_spin_lock_irqsave(&sem->lock, flags);
        WARN_ON(1);
        raw_spin_unlock_irqrestore(&sem->lock, flags);

which now produces:

 ------------[ cut here ]------------
 WARNING: CPU: 1 PID: 363 at kernel/locking/semaphore.c:141 down_trylock+0x3d/0x62
 CPU: 1 PID: 363 Comm: bash
 Call Trace:
  [<ffffffff81045658>] warn_slowpath_null+0x1d/0x1f
  [<ffffffff810838df>] down_trylock+0x3d/0x62
  [<ffffffff8109177e>] ? vprintk_emit+0x3f9/0x414
  [<ffffffff810905cb>] console_trylock+0x31/0xeb
  [<ffffffff8109177e>] vprintk_emit+0x3f9/0x414
  [<ffffffff81091905>] vprintk_default+0x1d/0x1f
  [<ffffffff810fdf71>] printk+0x48/0x50
  [..]
  [<ffffffff813caaaa>] entry_SYSCALL_64_fastpath+0x18/0xad
 ---[ end trace ]---

3) printk() from console_unlock()

Assume the following code:

  printk()
    console_unlock()
      raw_spin_lock(&logbuf_lock);
      WARN_ON(1);
      raw_spin_unlock(&logbuf_lock);

which now produces:

 ------------[ cut here ]------------
 WARNING: CPU: 1 PID: 329 at kernel/printk/printk.c:2384 console_unlock+0x12d/0x559
 CPU: 1 PID: 329 Comm: bash
 Call Trace:
  [<ffffffff8103d6ca>] warn_slowpath_null+0x18/0x1a
  [<ffffffff8107f917>] console_unlock+0x12d/0x559
  [<ffffffff810772c4>] ? trace_hardirqs_on_caller+0x16d/0x189
  [<ffffffff810772ed>] ? trace_hardirqs_on+0xd/0xf
  [<ffffffff810800a6>] vprintk_emit+0x363/0x374
  [<ffffffff81080219>] vprintk_default+0x18/0x1a
  [<ffffffff810c7f77>] printk+0x43/0x4b
  [..]
  [<ffffffff814c2d6a>] entry_SYSCALL_64_fastpath+0x18/0xad
 ---[ end trace ]---

4) printk() from try_to_wake_up()

Assume the following code:

  printk()
    console_unlock()
      up()
        try_to_wake_up()
          raw_spin_lock_irqsave(&p->pi_lock, flags);
          WARN_ON(1);
          raw_spin_unlock_irqrestore(&p->pi_lock, flags);

which now produces:

 ------------[ cut here ]------------
 WARNING: CPU: 3 PID: 363 at kernel/sched/core.c:2028 try_to_wake_up+0x7f/0x4f7
 CPU: 3 PID: 363 Comm: bash
 Call Trace:
  [<ffffffff81045658>] warn_slowpath_null+0x1d/0x1f
  [<ffffffff8106c8d0>] try_to_wake_up+0x7f/0x4f7
  [<ffffffff8106cd5d>] wake_up_process+0x15/0x17
  [<ffffffff813c82c6>] __up.isra.0+0x56/0x63
  [<ffffffff810839a3>] up+0x32/0x42
  [<ffffffff8108f624>] __up_console_sem+0x37/0x55
  [<ffffffff810910ff>] console_unlock+0x21e/0x4c2
  [<ffffffff810917bf>] vprintk_emit+0x41c/0x462
  [<ffffffff81091971>] vprintk_default+0x1d/0x1f
  [<ffffffff810fdfdd>] printk+0x48/0x50
  [..]
  [<ffffffff813cab2a>] entry_SYSCALL_64_fastpath+0x18/0xad
 ---[ end trace ]---

5) printk() from call_console_drivers()

Assume the following code:
  printk()
    console_unlock()
      call_console_drivers()
      ...
          WARN_ON(1);

which now produces:

 ------------[ cut here ]------------
 WARNING: CPU: 2 PID: 305 at kernel/printk/printk.c:1604 call_console_drivers+0x3a/0xb0
 CPU: 2 PID: 305 Comm: bash
 Call Trace:
  [<ffffffff810399c0>] warn_slowpath_null+0x18/0x1a
  [<ffffffff8106b218>] call_console_drivers.isra.6.constprop.16+0x3a/0xb0
  [<ffffffff8106cc46>] console_unlock+0x471/0x48e
  [<ffffffff8106ce57>] vprintk_emit+0x1f4/0x206
  [<ffffffff8106cfcb>] vprintk_default+0x18/0x1a
  [<ffffffff8106af66>] vprintk_func+0x6e/0x70
  [<ffffffff810abbe3>] printk+0x3e/0x46
[..]
  [<ffffffff8145a860>] entry_SYSCALL_64_fastpath+0x13/0x94
 ---[ end trace ]---

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
 kernel/printk/printk.c | 50 ++++++++++++++++++++++++++++++++------------------
 1 file changed, 32 insertions(+), 18 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 4675b8d..5907e92 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -213,17 +213,31 @@ static int nr_ext_console_drivers;
 
 static int __down_trylock_console_sem(unsigned long ip)
 {
-	if (down_trylock(&console_sem))
+	int lock_failed;
+	unsigned long flags;
+
+	printk_safe_enter(flags);
+	lock_failed = down_trylock(&console_sem);
+	printk_safe_exit(flags);
+
+	if (lock_failed)
 		return 1;
 	mutex_acquire(&console_lock_dep_map, 0, 1, ip);
 	return 0;
 }
 #define down_trylock_console_sem() __down_trylock_console_sem(_RET_IP_)
 
-#define up_console_sem() do { \
-	mutex_release(&console_lock_dep_map, 1, _RET_IP_);\
-	up(&console_sem);\
-} while (0)
+static void __up_console_sem(unsigned long ip)
+{
+	unsigned long flags;
+
+	mutex_release(&console_lock_dep_map, 1, ip);
+
+	printk_safe_enter(flags);
+	up(&console_sem);
+	printk_safe_exit(flags);
+}
+#define up_console_sem() __up_console_sem(_RET_IP_)
 
 /*
  * This is used for debugging the mess that is the VT code by
@@ -1807,7 +1821,7 @@ asmlinkage int vprintk_emit(int facility, int level,
 	boot_delay_msec(level);
 	printk_delay();
 
-	local_irq_save(flags);
+	printk_safe_enter(flags);
 	this_cpu = smp_processor_id();
 
 	/*
@@ -1823,13 +1837,12 @@ asmlinkage int vprintk_emit(int facility, int level,
 		 */
 		if (!oops_in_progress && !lockdep_recursing(current)) {
 			recursion_bug = true;
-			local_irq_restore(flags);
+			printk_safe_exit(flags);
 			return 0;
 		}
 		zap_locks();
 	}
 
-	lockdep_off();
 	/* This stops the holder of console_sem just where we want him */
 	raw_spin_lock(&logbuf_lock);
 	logbuf_cpu = this_cpu;
@@ -1889,12 +1902,10 @@ asmlinkage int vprintk_emit(int facility, int level,
 
 	logbuf_cpu = UINT_MAX;
 	raw_spin_unlock(&logbuf_lock);
-	lockdep_on();
-	local_irq_restore(flags);
+	printk_safe_exit(flags);
 
 	/* If called from the scheduler, we can not call up(). */
 	if (!in_sched) {
-		lockdep_off();
 		/*
 		 * Try to acquire and then immediately release the console
 		 * semaphore.  The release will print out buffers and wake up
@@ -1902,7 +1913,6 @@ asmlinkage int vprintk_emit(int facility, int level,
 		 */
 		if (console_trylock())
 			console_unlock();
-		lockdep_on();
 	}
 
 	return printed_len;
@@ -2288,7 +2298,8 @@ static void console_cont_flush(char *text, size_t size)
 	unsigned long flags;
 	size_t len;
 
-	raw_spin_lock_irqsave(&logbuf_lock, flags);
+	printk_safe_enter(flags);
+	raw_spin_lock(&logbuf_lock);
 
 	if (!cont.len)
 		goto out;
@@ -2313,10 +2324,11 @@ static void console_cont_flush(char *text, size_t size)
 	stop_critical_timings();
 	call_console_drivers(cont.level, NULL, 0, text, len);
 	start_critical_timings();
-	local_irq_restore(flags);
+	printk_safe_exit(flags);
 	return;
 out:
-	raw_spin_unlock_irqrestore(&logbuf_lock, flags);
+	raw_spin_unlock(&logbuf_lock);
+	printk_safe_exit(flags);
 }
 
 /**
@@ -2381,7 +2393,8 @@ void console_unlock(void)
 		size_t len;
 		int level;
 
-		raw_spin_lock_irqsave(&logbuf_lock, flags);
+		printk_safe_enter(flags);
+		raw_spin_lock(&logbuf_lock);
 		if (seen_seq != log_next_seq) {
 			wake_klogd = true;
 			seen_seq = log_next_seq;
@@ -2442,7 +2455,7 @@ void console_unlock(void)
 		stop_critical_timings();	/* don't trace print latency */
 		call_console_drivers(level, ext_text, ext_len, text, len);
 		start_critical_timings();
-		local_irq_restore(flags);
+		printk_safe_exit(flags);
 
 		if (do_cond_resched)
 			cond_resched();
@@ -2465,7 +2478,8 @@ void console_unlock(void)
 	 */
 	raw_spin_lock(&logbuf_lock);
 	retry = console_seq != log_next_seq;
-	raw_spin_unlock_irqrestore(&logbuf_lock, flags);
+	raw_spin_unlock(&logbuf_lock);
+	printk_safe_exit(flags);
 
 	if (retry && console_trylock())
 		goto again;
-- 
2.10.1.502.g6598894

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC][PATCHv4 6/6] printk: remove zap_locks() function
  2016-10-27 15:49 [RFC][PATCHv4 0/6] printk: use printk_safe to handle printk() recursive calls Sergey Senozhatsky
                   ` (4 preceding siblings ...)
  2016-10-27 15:49 ` [RFC][PATCHv4 5/6] printk: use printk_safe buffers Sergey Senozhatsky
@ 2016-10-27 15:49 ` Sergey Senozhatsky
  2016-11-25 15:01   ` Petr Mladek
  2016-10-28  3:30 ` [RFC][PATCHv4 0/6] printk: use printk_safe to handle printk() recursive calls Linus Torvalds
  6 siblings, 1 reply; 30+ messages in thread
From: Sergey Senozhatsky @ 2016-10-27 15:49 UTC (permalink / raw)
  To: Petr Mladek, Andrew Morton
  Cc: Jan Kara, Tejun Heo, Calvin Owens, Thomas Gleixner, Mel Gorman,
	Steven Rostedt, Ingo Molnar, Peter Zijlstra, Laura Abbott,
	Andy Lutomirski, Linus Torvalds, Kees Cook, linux-kernel,
	Sergey Senozhatsky, Sergey Senozhatsky

We use printk-safe now which makes printk-recursion detection code
in vprintk_emit() is unreachable. The tricky thing here is that,
apart from detecting and reporting printk recursions, that code also
used to zap_lockc() in case of panic. However, zap_locks() does not
look to be needed anymore:

1) Since commit 08d78658f393 ("panic: release stale console lock to
   always get the logbuf printed out") panic flushing of `logbuf' to
   console ignores the state of `console_sem' by doing
   	panic()
		console_trylock();
		console_unlock();

2) Since commit cf9b1106c81c ("printk/nmi: flush NMI messages on the
   system panic") panic attempts to zap the `logbuf_lock' spin_lock to
   successfully flush nmi messages to `logbuf'.

Basically, it seems that we either already do what zap_locks() used to
do but in other places or we ignore the state of the lock. May be we
still would want to do sema_init() in printk_safe_flush_on_panic(),
just in case.

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
 kernel/printk/printk.c | 61 --------------------------------------------------
 1 file changed, 61 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 5907e92..40088e7 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1619,28 +1619,6 @@ static void call_console_drivers(int level,
 	}
 }
 
-/*
- * Zap console related locks when oopsing.
- * To leave time for slow consoles to print a full oops,
- * only zap at most once every 30 seconds.
- */
-static void zap_locks(void)
-{
-	static unsigned long oops_timestamp;
-
-	if (time_after_eq(jiffies, oops_timestamp) &&
-	    !time_after(jiffies, oops_timestamp + 30 * HZ))
-		return;
-
-	oops_timestamp = jiffies;
-
-	debug_locks_off();
-	/* If a crash is occurring, make sure we can't deadlock */
-	raw_spin_lock_init(&logbuf_lock);
-	/* And make sure that we print immediately */
-	sema_init(&console_sem, 1);
-}
-
 int printk_delay_msec __read_mostly;
 
 static inline void printk_delay(void)
@@ -1801,17 +1779,13 @@ asmlinkage int vprintk_emit(int facility, int level,
 			    const char *dict, size_t dictlen,
 			    const char *fmt, va_list args)
 {
-	static bool recursion_bug;
 	static char textbuf[LOG_LINE_MAX];
 	char *text = textbuf;
 	size_t text_len = 0;
 	enum log_flags lflags = 0;
 	unsigned long flags;
-	int this_cpu;
 	int printed_len = 0;
 	bool in_sched = false;
-	/* cpu currently holding logbuf_lock in this function */
-	static unsigned int logbuf_cpu = UINT_MAX;
 
 	if (level == LOGLEVEL_SCHED) {
 		level = LOGLEVEL_DEFAULT;
@@ -1822,42 +1796,8 @@ asmlinkage int vprintk_emit(int facility, int level,
 	printk_delay();
 
 	printk_safe_enter(flags);
-	this_cpu = smp_processor_id();
-
-	/*
-	 * Ouch, printk recursed into itself!
-	 */
-	if (unlikely(logbuf_cpu == this_cpu)) {
-		/*
-		 * If a crash is occurring during printk() on this CPU,
-		 * then try to get the crash message out but make sure
-		 * we can't deadlock. Otherwise just return to avoid the
-		 * recursion and return - but flag the recursion so that
-		 * it can be printed at the next appropriate moment:
-		 */
-		if (!oops_in_progress && !lockdep_recursing(current)) {
-			recursion_bug = true;
-			printk_safe_exit(flags);
-			return 0;
-		}
-		zap_locks();
-	}
-
 	/* This stops the holder of console_sem just where we want him */
 	raw_spin_lock(&logbuf_lock);
-	logbuf_cpu = this_cpu;
-
-	if (unlikely(recursion_bug)) {
-		static const char recursion_msg[] =
-			"BUG: recent printk recursion!";
-
-		recursion_bug = false;
-		/* emit KERN_CRIT message */
-		printed_len += log_store(0, 2, LOG_PREFIX|LOG_NEWLINE, 0,
-					 NULL, 0, recursion_msg,
-					 strlen(recursion_msg));
-	}
-
 	/*
 	 * The printf needs to come first; we need the syslog
 	 * prefix which might be passed-in as a parameter.
@@ -1900,7 +1840,6 @@ asmlinkage int vprintk_emit(int facility, int level,
 
 	printed_len += log_output(facility, level, lflags, dict, dictlen, text, text_len);
 
-	logbuf_cpu = UINT_MAX;
 	raw_spin_unlock(&logbuf_lock);
 	printk_safe_exit(flags);
 
-- 
2.10.1.502.g6598894

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [RFC][PATCHv4 0/6] printk: use printk_safe to handle printk() recursive calls
  2016-10-27 15:49 [RFC][PATCHv4 0/6] printk: use printk_safe to handle printk() recursive calls Sergey Senozhatsky
                   ` (5 preceding siblings ...)
  2016-10-27 15:49 ` [RFC][PATCHv4 6/6] printk: remove zap_locks() function Sergey Senozhatsky
@ 2016-10-28  3:30 ` Linus Torvalds
  2016-10-28  4:05   ` Sergey Senozhatsky
  6 siblings, 1 reply; 30+ messages in thread
From: Linus Torvalds @ 2016-10-28  3:30 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Petr Mladek, Andrew Morton, Jan Kara, Tejun Heo, Calvin Owens,
	Thomas Gleixner, Mel Gorman, Steven Rostedt, Ingo Molnar,
	Peter Zijlstra, Laura Abbott, Andy Lutomirski, Kees Cook,
	Linux Kernel Mailing List, Sergey Senozhatsky

On Thu, Oct 27, 2016 at 8:49 AM, Sergey Senozhatsky
<sergey.senozhatsky@gmail.com> wrote:
>
>         RFC
>
>         This patch set extends a lock-less NMI per-cpu buffers idea to
> handle recursive printk() calls. The basic mechanism is pretty much the
> same -- at the beginning of a deadlock-prone section we switch to lock-less
> printk callback, and return back to a default printk implementation at the
> end; the messages are getting flushed to a logbuf buffer from a safer
> context.

This looks very reasonable to me.

Does this also obviate the need for "printk_deferred()" that the
scheduler and the clock code uses?  Because that would be a lovely
thing to look at if it doesn't..

             LInus

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC][PATCHv4 0/6] printk: use printk_safe to handle printk() recursive calls
  2016-10-28  3:30 ` [RFC][PATCHv4 0/6] printk: use printk_safe to handle printk() recursive calls Linus Torvalds
@ 2016-10-28  4:05   ` Sergey Senozhatsky
  0 siblings, 0 replies; 30+ messages in thread
From: Sergey Senozhatsky @ 2016-10-28  4:05 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Sergey Senozhatsky, Petr Mladek, Andrew Morton, Jan Kara,
	Tejun Heo, Calvin Owens, Thomas Gleixner, Mel Gorman,
	Steven Rostedt, Ingo Molnar, Peter Zijlstra, Laura Abbott,
	Andy Lutomirski, Kees Cook, Linux Kernel Mailing List,
	Sergey Senozhatsky

Hello,

On (10/27/16 20:30), Linus Torvalds wrote:
> On Thu, Oct 27, 2016 at 8:49 AM, Sergey Senozhatsky
> <sergey.senozhatsky@gmail.com> wrote:
> >
> >         RFC
> >
> >         This patch set extends a lock-less NMI per-cpu buffers idea to
> > handle recursive printk() calls. The basic mechanism is pretty much the
> > same -- at the beginning of a deadlock-prone section we switch to lock-less
> > printk callback, and return back to a default printk implementation at the
> > end; the messages are getting flushed to a logbuf buffer from a safer
> > context.
> 
> This looks very reasonable to me.
> 
> Does this also obviate the need for "printk_deferred()" that the
> scheduler and the clock code uses?  Because that would be a lovely
> thing to look at if it doesn't..

I wish I could say that we can retire printk_deferred(), but no, we still
need it. it's rather simple to fix printk recursion (that's what the patch
set is doing), but printk deadlocks are much harder to handle. anything that
starts somewhere else but somehow is related printk will deadlock (in the
worst case). I use this backtrace as an example:

 SyS_ioctl
  do_vfs_ioctl
   tty_ioctl
    n_tty_ioctl
     tty_mode_ioctl
      set_termios
       tty_set_termios
        uart_set_termios
         uart_change_speed
          FOO_serial_set_termios
           spin_lock_irqsave(&port->lock)     // lock the output port
           ....
           !! WARN() or pr_err() or printk()
               vprintk_emit()
                /* console_trylock() */
                console_unlock()
                 call_console_drivers()
                  FOO_write()
                   spin_lock_irqsave(&port->lock)     // already locked

with the current printk we can't tell for sure how many locks will
be acquired -- printk() can succeed in locking the console_sem and
start invoking console drivers (if any) from console_unlock(), or
it can fail thus we will acquire only logbuf spin_lock and console_sem
spin_lock.

the things can change *a bit* once we switch to async_printk. because
instead of doing console_unlock()->call_console_drivers(), printk()
will just wake_up() the printk_kthread. but still, it won't be enough
to remove printk_deferred()   :(

       vprintk_emit()
        wake_up()
         spin_lock rq lock
          printk

will be safe. but

      wake_up()
       spin_lock rq lock
        printk
         vprintk_emit()
          wake_up()
           spin_lock rq lock

will deadlock.

we can't even tell for sure what locks are "important" to printk().
a small and reasonable code refactoring somewhere in clock code/etc.
can accidentally change the whole picture by introducing "unsafe"
WARN_ON() or adding yet another lock to the printing path.

need to think more.

p.s.
we are plannig to discuss printk related issues next week in Santa Fe.

	-ss

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC][PATCHv4 1/6] printk: use vprintk_func in vprintk()
  2016-10-27 15:49 ` [RFC][PATCHv4 1/6] printk: use vprintk_func in vprintk() Sergey Senozhatsky
@ 2016-11-24 16:28   ` Petr Mladek
  0 siblings, 0 replies; 30+ messages in thread
From: Petr Mladek @ 2016-11-24 16:28 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Andrew Morton, Jan Kara, Tejun Heo, Calvin Owens,
	Thomas Gleixner, Mel Gorman, Steven Rostedt, Ingo Molnar,
	Peter Zijlstra, Laura Abbott, Andy Lutomirski, Linus Torvalds,
	Kees Cook, linux-kernel, Sergey Senozhatsky

On Fri 2016-10-28 00:49:28, Sergey Senozhatsky wrote:
> vprintk(), just like printk(), better be using per-cpu printk_func
> instead of direct vprintk_emit() call. Just in case if vprintk()
> will ever be called from NMI, or from any other context that can
> deadlock in printk().
> 
> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> Reviewed-by: Steven Rostedt <rostedt@goodmis.org>

Makes sense.

Reviewed-by: Petr Mladek <pmladek@suse.com>

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC][PATCHv4 2/6] printk: rename nmi.c and exported api
  2016-10-27 15:49 ` [RFC][PATCHv4 2/6] printk: rename nmi.c and exported api Sergey Senozhatsky
@ 2016-11-24 16:35   ` Petr Mladek
  2016-12-01  1:07     ` Sergey Senozhatsky
  0 siblings, 1 reply; 30+ messages in thread
From: Petr Mladek @ 2016-11-24 16:35 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Andrew Morton, Jan Kara, Tejun Heo, Calvin Owens,
	Thomas Gleixner, Mel Gorman, Steven Rostedt, Ingo Molnar,
	Peter Zijlstra, Laura Abbott, Andy Lutomirski, Linus Torvalds,
	Kees Cook, linux-kernel, Sergey Senozhatsky

On Fri 2016-10-28 00:49:29, Sergey Senozhatsky wrote:
> A preparation patch for printk_safe work. No functional change.
> - rename nmi.c to print_safe.c
> - rename exported functions to have `safe' prefix.
> 
> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> Acked-by: Steven Rostedt <rostedt@goodmis.org>
> ---
>  arch/arm/kernel/smp.c                  |  4 +-
>  include/linux/hardirq.h                |  4 +-
>  include/linux/printk.h                 | 20 +++++-----
>  init/Kconfig                           | 16 ++++----
>  init/main.c                            |  2 +-
>  kernel/kexec_core.c                    |  2 +-
>  kernel/panic.c                         |  4 +-
>  kernel/printk/Makefile                 |  2 +-
>  kernel/printk/{nmi.c => printk_safe.c} | 69 +++++++++++++++++-----------------
>  lib/nmi_backtrace.c                    |  2 +-
>  10 files changed, 64 insertions(+), 61 deletions(-)
>  rename kernel/printk/{nmi.c => printk_safe.c} (77%)
> 
> diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
> index 7dd14e8..8b056e1 100644
> --- a/arch/arm/kernel/smp.c
> +++ b/arch/arm/kernel/smp.c
> @@ -644,11 +644,11 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
>  		break;
>  
>  	case IPI_CPU_BACKTRACE:
> -		printk_nmi_enter();
> +		printk_safe_nmi_enter();
>  		irq_enter();
>  		nmi_cpu_backtrace(regs);
>  		irq_exit();
> -		printk_nmi_exit();
> +		printk_safe_nmi_exit();
>  		break;
>  
>  	default:
> diff --git a/include/linux/hardirq.h b/include/linux/hardirq.h
> index c683996..14840aa 100644
> --- a/include/linux/hardirq.h
> +++ b/include/linux/hardirq.h
> @@ -61,7 +61,7 @@ extern void irq_exit(void);
>  
>  #define nmi_enter()						\
>  	do {							\
> -		printk_nmi_enter();				\
> +		printk_safe_nmi_enter();			\
>  		lockdep_off();					\
>  		ftrace_nmi_enter();				\
>  		BUG_ON(in_nmi());				\
> @@ -78,7 +78,7 @@ extern void irq_exit(void);
>  		preempt_count_sub(NMI_OFFSET + HARDIRQ_OFFSET);	\
>  		ftrace_nmi_exit();				\
>  		lockdep_on();					\
> -		printk_nmi_exit();				\
> +		printk_safe_nmi_exit();				\
>  	} while (0)
>  
>  #endif /* LINUX_HARDIRQ_H */
> diff --git a/include/linux/printk.h b/include/linux/printk.h
> index eac1af8..c9a3080 100644
> --- a/include/linux/printk.h
> +++ b/include/linux/printk.h
> @@ -132,17 +132,17 @@ void early_printk(const char *s, ...) { }
>  #endif
>  
>  #ifdef CONFIG_PRINTK_NMI
> -extern void printk_nmi_init(void);
> -extern void printk_nmi_enter(void);
> -extern void printk_nmi_exit(void);
> -extern void printk_nmi_flush(void);
> -extern void printk_nmi_flush_on_panic(void);
> +extern void printk_safe_init(void);
> +extern void printk_safe_nmi_enter(void);
> +extern void printk_safe_nmi_exit(void);

I would personally keep the short names pritnk_nmi_enter() and
printk_nmi_exit(). These are the only functions that will stay
in this CONFIG_PRINT_NMI section. The others are generic and
will be moved to the generic section in the next patch.

Well, it is just a cosmetic problem and a personal opinion.
The patch makes sense and looks fine:

Reviewed-by: Petr Mladek <pmladek@pmladek@suse.com>

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC][PATCHv4 3/6] printk: introduce per-cpu safe_print seq buffer
  2016-10-27 15:49 ` [RFC][PATCHv4 3/6] printk: introduce per-cpu safe_print seq buffer Sergey Senozhatsky
@ 2016-11-24 16:58   ` Petr Mladek
  2016-12-01  1:08     ` Sergey Senozhatsky
  2016-12-01  5:32     ` Sergey Senozhatsky
  0 siblings, 2 replies; 30+ messages in thread
From: Petr Mladek @ 2016-11-24 16:58 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Andrew Morton, Jan Kara, Tejun Heo, Calvin Owens,
	Thomas Gleixner, Mel Gorman, Steven Rostedt, Ingo Molnar,
	Peter Zijlstra, Laura Abbott, Andy Lutomirski, Linus Torvalds,
	Kees Cook, linux-kernel, Sergey Senozhatsky

On Fri 2016-10-28 00:49:30, Sergey Senozhatsky wrote:
> This patch extends the idea of NMI per-cpu buffers to regions
> that may cause recursive printk() calls and possible deadlocks.
> Namely, printk() can't handle printk calls from schedule code
> or printk() calls from lock debugging code (spin_dump() for instance);
> because those may be called with `sem->lock' already taken or any
> other `critical' locks (p->pi_lock, etc.). An example of deadlock
> can be
> 
>  vprintk_emit()
>   console_unlock()
>    up()                        << raw_spin_lock_irqsave(&sem->lock, flags);
>     wake_up_process()
>      try_to_wake_up()
>       ttwu_queue()
>        ttwu_activate()
>         activate_task()
>          enqueue_task()
>           enqueue_task_fair()
>            cfs_rq_of()
>             task_of()
>              WARN_ON_ONCE(!entity_is_task(se))
>               vprintk_emit()
>                console_trylock()
>                 down_trylock()
>                  raw_spin_lock_irqsave(&sem->lock, flags)
>                  ^^^^ deadlock
> 
> and some other cases.
> 
> Just like in NMI implementation, the solution uses a per-cpu
> `printk_func' pointer to 'redirect' printk() calls to a 'safe'
> callback, that store messages in a per-cpu buffer and flushes
> them back to logbuf buffer later.
> 
> Usage example:
> 
>  printk()
>   printk_safe_enter(flags)
>   //
>   //  any printk() call from here will endup in vprintk_safe(),
>   //  that stores messages in a special per-CPU buffer.
>   //
>   printk_safe_exit(flags)
> 
> The 'redirection' mechanism, though, has been reworked, as suggested
> by Petr Mladek. Instead of using a per-cpu @print_func callback we now
> keep a per-cpu printk-context variable and call either default or nmi
> vprintk function depending on its value. printk_nmi_entrer/exit and
> printk_safe_enter/exit, thus, just set/celar corresponding bits in
> printk-context functions.
> 
> The patch only adds printk_safe support, we don't use it yet.
> 
> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>

> diff --git a/kernel/printk/internal.h b/kernel/printk/internal.h
> index 7fd2838..87c784b 100644
> --- a/kernel/printk/internal.h
> +++ b/kernel/printk/internal.h
>  #endif /* CONFIG_PRINTK_NMI */
> +
> +#ifdef CONFIG_PRINTK
> +
> +#define PRINTK_SAFE_CONTEXT_MASK	0x7fffffff
> +#define PRINTK_SAFE_NMI_CONTEXT_MASK	0x80000000

What about shorter name PRINTK_NMI_CONTEXT_MASK?

> diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c
> index 1f66163..af74d9c 100644
> --- a/kernel/printk/printk_safe.c
> +++ b/kernel/printk/printk_safe.c
> @@ -50,27 +49,26 @@ struct printk_safe_seq_buf {
>  	struct irq_work		work;	/* IRQ work that flushes the buffer */
>  	unsigned char		buffer[SAFE_LOG_BUF_LEN];
>  };
> +
> +static DEFINE_PER_CPU(struct printk_safe_seq_buf, safe_print_seq);
> +static DEFINE_PER_CPU(int, printk_safe_context);

I would personally use the short name "printk_context". It is a generic
value. Zero value means that it is a normal context. Also there is
an idea to add KDB context that would use its own vprintk_kdb()
implementation and will not use the printk_safe buffer.


> +#ifdef CONFIG_PRINTK_NMI
>  static DEFINE_PER_CPU(struct printk_safe_seq_buf, nmi_print_seq);
> +atomic_t nmi_message_lost;
> +#endif
>  
> -/*
> - * Safe printk() for NMI context. It uses a per-CPU buffer to
> - * store the message. NMIs are not nested, so there is always only
> - * one writer running. But the buffer might get flushed from another
> - * CPU, so we need to be careful.
> - */
> -static int vprintk_safe_nmi(const char *fmt, va_list args)
> +static int printk_safe_log_store(struct printk_safe_seq_buf *s,
> +		const char *fmt, va_list args)
>  {
> -	struct printk_safe_seq_buf *s = this_cpu_ptr(&nmi_print_seq);
> -	int add = 0;
> +	int add;
>  	size_t len;
>  
>  again:
>  	len = atomic_read(&s->len);
>  
> -	if (len >= sizeof(s->buffer)) {
> -		atomic_inc(&nmi_message_lost);
> -		return 0;
> -	}
> +	if (len >= sizeof(s->buffer))
> +		return -E2BIG;

E2BIG means "argument list too long" and does not fit much here.
I would suggest to use -ENOSPC. It is not ideal either but it fits
slightly better.

> +/*
> + * Lockless printk(), to avoid deadlocks should the printk() recurse
> + * into itself. It uses a per-CPU buffer to store the message, just like
> + * NMI.
> + */
> +static int vprintk_safe(const char *fmt, va_list args)
> +{
> +	struct printk_safe_seq_buf *s = this_cpu_ptr(&safe_print_seq);
> +
> +	return printk_safe_log_store(s, fmt, args);

We should return zero if printk_safe_log_store() returns an error.
I know that it will get fixed in the next patch. But we should do
some minimum sanity check here because of bisection.


Best Regards,
Petr

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC][PATCHv4 4/6] printk: report lost messages in printk safe/nmi contexts
  2016-10-27 15:49 ` [RFC][PATCHv4 4/6] printk: report lost messages in printk safe/nmi contexts Sergey Senozhatsky
@ 2016-11-25 11:07   ` Petr Mladek
  2016-12-01  2:10     ` Sergey Senozhatsky
  0 siblings, 1 reply; 30+ messages in thread
From: Petr Mladek @ 2016-11-25 11:07 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Andrew Morton, Jan Kara, Tejun Heo, Calvin Owens,
	Thomas Gleixner, Mel Gorman, Steven Rostedt, Ingo Molnar,
	Peter Zijlstra, Laura Abbott, Andy Lutomirski, Linus Torvalds,
	Kees Cook, linux-kernel, Sergey Senozhatsky

On Fri 2016-10-28 00:49:31, Sergey Senozhatsky wrote:
> Account lost messages in pritk-safe and printk-safe-nmi
> contexts and report those numbers during printk_safe_flush().
> 
> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> ---
>  kernel/printk/internal.h    | 17 ---------------
>  kernel/printk/printk.c      | 10 ---------
>  kernel/printk/printk_safe.c | 50 +++++++++++++++++++++++++++++++++++++++++++--
>  3 files changed, 48 insertions(+), 29 deletions(-)
> 
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 1878b2d..4675b8d 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -1795,7 +1795,6 @@ asmlinkage int vprintk_emit(int facility, int level,
>  	unsigned long flags;
>  	int this_cpu;
>  	int printed_len = 0;
> -	int nmi_message_lost;
>  	bool in_sched = false;
>  	/* cpu currently holding logbuf_lock in this function */
>  	static unsigned int logbuf_cpu = UINT_MAX;
> @@ -1846,15 +1845,6 @@ asmlinkage int vprintk_emit(int facility, int level,
>  					 strlen(recursion_msg));
>  	}
>  
> -	nmi_message_lost = get_nmi_message_lost();
> -	if (unlikely(nmi_message_lost)) {
> -		text_len = scnprintf(textbuf, sizeof(textbuf),
> -				     "BAD LUCK: lost %d message(s) from NMI context!",
> -				     nmi_message_lost);
> -		printed_len += log_store(0, 2, LOG_PREFIX|LOG_NEWLINE, 0,
> -					 NULL, 0, textbuf, text_len);
> -	}

I really like that we are moving this out of vprintk_emit() but...

> -
>  	/*
>  	 * The printf needs to come first; we need the syslog
>  	 * prefix which might be passed-in as a parameter.
> diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c
> index af74d9c..721b091 100644
> --- a/kernel/printk/printk_safe.c
> +++ b/kernel/printk/printk_safe.c
> @@ -52,10 +52,11 @@ struct printk_safe_seq_buf {
>  
>  static DEFINE_PER_CPU(struct printk_safe_seq_buf, safe_print_seq);
>  static DEFINE_PER_CPU(int, printk_safe_context);
> +static atomic_t safe_message_lost;
>  
>  #ifdef CONFIG_PRINTK_NMI
>  static DEFINE_PER_CPU(struct printk_safe_seq_buf, nmi_print_seq);
> -atomic_t nmi_message_lost;
> +static atomic_t nmi_message_lost;
>  #endif
>  
>  static int printk_safe_log_store(struct printk_safe_seq_buf *s,
> @@ -122,6 +123,41 @@ static void printk_safe_flush_seq_line(struct printk_safe_seq_buf *s,
>  	printk_safe_flush_line(buf, (end - start) + 1);
>  }
>  
> +static void report_message_lost(atomic_t *num_lost, char *fmt)
> +{
> +	int lost = atomic_xchg(num_lost, 0);
> +
> +	if (lost) {
> +		char msg[56];

I would really like to avoid a hard coded buffer size. Such things
are likely to bite us in the future.

I thought about reshuffling a lot of logic, adding more wrappers,
... But the solution might be easy in the end, see below.

> +
> +		scnprintf(msg, sizeof(msg), fmt, lost);
> +
> +		printk_safe_flush_line(msg, strlen(msg));

This made my brain spin a lot. I wondered if it did what we wanted
and it was safe.

On one hand, it is supposed to work because use exactly this
function in __printk_safe_flush() where you call this from.

One question is if it does what we want in different contexts.
Let's look at it:

1. It calls printk_deferred() in NMI context. There is a risk
   of a deadlock. But it is called only from
   printk_safe_flush_on_panic() which is the last resort. Therefore
   it does exactly what we want.

2. It calls printk()->printk_func()->vprintk_emit() in normal context.
   It is what we want in normal context.

3. It calls printk()->printk_func()->v printk_safe() in printk_safe
   context. This does not look correct. IMHO, this might happen
   only printk_safe_flush_on_panic() and we want to use
   printk_deferred() here as well.

   Also note that printk_safe buffer might still be full at this
   point. It means that the warning will not be stored. There also
   might be a risk of an infinite loop. Both problems are
   solved by printk_deferred().

The easiest solution would be to simply call printk_deferred()
here. Everything will be deferred after the async printk() patchset
anyway.

I would even use printk_deferred() in printk_safe_flush_line()
for each context. It is not optimal but it works very well
and it makes the code much more straightforward.

> +	}
> +}
> +
> +#ifdef CONFIG_PRINTK_NMI
> +
> +static void report_nmi_message_lost(void)
> +{
> +	report_message_lost(&nmi_message_lost,
> +			"Lost %d message(s) from NMI context!");
> +}
> +
> +#else
> +
> +static void report_nmi_message_lost(void)
> +{
> +}
> +
> +#endif /* CONFIG_PRINTK_NMI */
> +
> +static void report_safe_message_lost(void)
> +{
> +	report_message_lost(&safe_message_lost,
> +			"Lost %d message(s) from printk-safe context!");
> +}
> +
>  /*
>   * Flush data from the associated per-CPU buffer. The function
>   * can be called either via IRQ work or independently.
> @@ -147,6 +183,9 @@ static void __printk_safe_flush(struct irq_work *work)
>  
>  	i = 0;
>  more:
> +	report_nmi_message_lost();
> +	report_safe_message_lost();

Please, move this at the end of this function. There are few reasons
for this:

1. This causes an infinite loop if called in printk_safe() context
   and vpritnk_safe() is used to print the warning.

2. This prints the warning about lost messages in the wrong order.
   We should print the warning after we print the messages
   that fit into the buffer.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC][PATCHv4 5/6] printk: use printk_safe buffers
  2016-10-27 15:49 ` [RFC][PATCHv4 5/6] printk: use printk_safe buffers Sergey Senozhatsky
@ 2016-11-25 14:28   ` Petr Mladek
  2016-12-01  2:14     ` Sergey Senozhatsky
  0 siblings, 1 reply; 30+ messages in thread
From: Petr Mladek @ 2016-11-25 14:28 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Andrew Morton, Jan Kara, Tejun Heo, Calvin Owens,
	Thomas Gleixner, Mel Gorman, Steven Rostedt, Ingo Molnar,
	Peter Zijlstra, Laura Abbott, Andy Lutomirski, Linus Torvalds,
	Kees Cook, linux-kernel, Sergey Senozhatsky

On Fri 2016-10-28 00:49:32, Sergey Senozhatsky wrote:
> Use printk_safe per-CPU buffers in in printk recursion-prone blocks:
> -- around logbuf_lock protected sections in vprintk_emit() and
>    console_unlock()
> -- around down_trylock_console_sem() and up_console_sem()
> 
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 4675b8d..5907e92 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -1823,13 +1837,12 @@ asmlinkage int vprintk_emit(int facility, int level,
>  		 */
>  		if (!oops_in_progress && !lockdep_recursing(current))
> {
>  			recursion_bug = true;
> -			local_irq_restore(flags);
> +			printk_safe_exit(flags);
>  			return 0;
>  		}
>  		zap_locks();
>  	}
>  
> -	lockdep_off();

I really like this patch. The only small problem is that it enables
lockdep and it does not explain why it is safe. The change itself
looks fine but it took me some time to prove why. IMHO, it is
worth a comment.

One thing is printk() recursion caused by lockdep warning
triggered from inside vprintk_emit(). It is safe because
the critical sections are guarded by printk_safe_enter()/exit()
now.

Another thing is lockdep recursion caused by catching another lockdep
issue when printing warning about the first one. This is safe
because lockdep protects itself. First, it sets and checks
current->lockdep_recursion around the critical sections.
Second, further checks are disabled entirely once first
lockdep issue is found.

If you add some comments about lockdep, feel free to use:

Reviewed-by: Petr Mladek <pmladek@suse.com>

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC][PATCHv4 6/6] printk: remove zap_locks() function
  2016-10-27 15:49 ` [RFC][PATCHv4 6/6] printk: remove zap_locks() function Sergey Senozhatsky
@ 2016-11-25 15:01   ` Petr Mladek
  2016-11-25 15:17     ` Peter Zijlstra
                       ` (2 more replies)
  0 siblings, 3 replies; 30+ messages in thread
From: Petr Mladek @ 2016-11-25 15:01 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Andrew Morton, Jan Kara, Tejun Heo, Calvin Owens,
	Thomas Gleixner, Mel Gorman, Steven Rostedt, Ingo Molnar,
	Peter Zijlstra, Laura Abbott, Andy Lutomirski, Linus Torvalds,
	Kees Cook, linux-kernel, Sergey Senozhatsky

On Fri 2016-10-28 00:49:33, Sergey Senozhatsky wrote:
> We use printk-safe now which makes printk-recursion detection code
> in vprintk_emit() is unreachable. The tricky thing here is that,
     		    ^^ superfluous "is"

> apart from detecting and reporting printk recursions, that code also
> used to zap_lockc() in case of panic. However, zap_locks() does not
       	          ^

s/zap_lockc/zap_locks/

> look to be needed anymore:
> 
> 1) Since commit 08d78658f393 ("panic: release stale console lock to
>    always get the logbuf printed out") panic flushing of `logbuf' to
>    console ignores the state of `console_sem' by doing
>    	panic()
> 		console_trylock();
> 		console_unlock();
> 
> 2) Since commit cf9b1106c81c ("printk/nmi: flush NMI messages on the
>    system panic") panic attempts to zap the `logbuf_lock' spin_lock to
>    successfully flush nmi messages to `logbuf'.

Note that the same code is newly used to flush also the printk_safe
per-CPU buffers. It means that logbuf_lock is zapped also when
flushing these new buffers.


> Basically, it seems that we either already do what zap_locks() used to
> do but in other places or we ignore the state of the lock. May be we
> still would want to do sema_init() in printk_safe_flush_on_panic(),
> just in case.

Very good question! I would actually suggest to use printk_deferred()
in printk_safe_flush_on_panic() in any context. It will solve the
problems discussed for the 4th patch of this patchset. And it will
solve also this problem. In case of panic, we should first try to
get all messages into the logbuffer so that they are visible in
the crash dump. We try to push them to the console by
console_flush_on_panic() later because it is more risky.

> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>

If we avoid calling console in printk_safe_flush_on_panic(),
feel free to use:

Reviewed-by: Petr Mladek <pmladek@suse.com>

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC][PATCHv4 6/6] printk: remove zap_locks() function
  2016-11-25 15:01   ` Petr Mladek
@ 2016-11-25 15:17     ` Peter Zijlstra
  2016-12-01  2:34       ` Sergey Senozhatsky
  2016-12-01  2:18     ` Sergey Senozhatsky
  2016-12-01 12:50     ` Sergey Senozhatsky
  2 siblings, 1 reply; 30+ messages in thread
From: Peter Zijlstra @ 2016-11-25 15:17 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Andrew Morton, Jan Kara, Tejun Heo,
	Calvin Owens, Thomas Gleixner, Mel Gorman, Steven Rostedt,
	Ingo Molnar, Laura Abbott, Andy Lutomirski, Linus Torvalds,
	Kees Cook, linux-kernel, Sergey Senozhatsky

On Fri, Nov 25, 2016 at 04:01:13PM +0100, Petr Mladek wrote:
> On Fri 2016-10-28 00:49:33, Sergey Senozhatsky wrote:
> > 2) Since commit cf9b1106c81c ("printk/nmi: flush NMI messages on the
> >    system panic") panic attempts to zap the `logbuf_lock' spin_lock to
> >    successfully flush nmi messages to `logbuf'.
> 
> Note that the same code is newly used to flush also the printk_safe
> per-CPU buffers. It means that logbuf_lock is zapped also when
> flushing these new buffers.
> 

Note that (raw_)spin_lock_init() as done here and in
printk_nmi_flush_on_panic() can wreck the lock state and doesn't ensure
a subsequent spin_lock() of said lock will actually work.

The very best solution is to simply ignore the lock in panic situations
rather than trying to wreck it.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC][PATCHv4 2/6] printk: rename nmi.c and exported api
  2016-11-24 16:35   ` Petr Mladek
@ 2016-12-01  1:07     ` Sergey Senozhatsky
  2016-12-01 12:12       ` Petr Mladek
  0 siblings, 1 reply; 30+ messages in thread
From: Sergey Senozhatsky @ 2016-12-01  1:07 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Andrew Morton, Jan Kara, Tejun Heo,
	Calvin Owens, Thomas Gleixner, Mel Gorman, Steven Rostedt,
	Ingo Molnar, Peter Zijlstra, Laura Abbott, Andy Lutomirski,
	Linus Torvalds, Kees Cook, linux-kernel, Sergey Senozhatsky

On (11/24/16 17:35), Petr Mladek wrote:
[..]
> >  #ifdef CONFIG_PRINTK_NMI
> > -extern void printk_nmi_init(void);
> > -extern void printk_nmi_enter(void);
> > -extern void printk_nmi_exit(void);
> > -extern void printk_nmi_flush(void);
> > -extern void printk_nmi_flush_on_panic(void);
> > +extern void printk_safe_init(void);
> > +extern void printk_safe_nmi_enter(void);
> > +extern void printk_safe_nmi_exit(void);
> 
> I would personally keep the short names pritnk_nmi_enter() and
> printk_nmi_exit(). These are the only functions that will stay
> in this CONFIG_PRINT_NMI section. The others are generic and
> will be moved to the generic section in the next patch.

I'd probably prefer all printk_safe functions to be printk_safe
prefixed, just to show the fact that those are printk_safe
functions (IOW, indirect output to a per-CPU buffer and flush
it to logbuf later).

> Well, it is just a cosmetic problem and a personal opinion.
> The patch makes sense and looks fine:
> 
> Reviewed-by: Petr Mladek <pmladek@pmladek@suse.com>

thanks.

	-ss

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC][PATCHv4 3/6] printk: introduce per-cpu safe_print seq buffer
  2016-11-24 16:58   ` Petr Mladek
@ 2016-12-01  1:08     ` Sergey Senozhatsky
  2016-12-01  5:32     ` Sergey Senozhatsky
  1 sibling, 0 replies; 30+ messages in thread
From: Sergey Senozhatsky @ 2016-12-01  1:08 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Andrew Morton, Jan Kara, Tejun Heo,
	Calvin Owens, Thomas Gleixner, Mel Gorman, Steven Rostedt,
	Ingo Molnar, Peter Zijlstra, Laura Abbott, Andy Lutomirski,
	Linus Torvalds, Kees Cook, linux-kernel, Sergey Senozhatsky

On (11/24/16 17:58), Petr Mladek wrote:
[..]
> > +#ifdef CONFIG_PRINTK
> > +
> > +#define PRINTK_SAFE_CONTEXT_MASK	0x7fffffff
> > +#define PRINTK_SAFE_NMI_CONTEXT_MASK	0x80000000
> 
> What about shorter name PRINTK_NMI_CONTEXT_MASK?

ok.

> > diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c
> > index 1f66163..af74d9c 100644
> > --- a/kernel/printk/printk_safe.c
> > +++ b/kernel/printk/printk_safe.c
> > @@ -50,27 +49,26 @@ struct printk_safe_seq_buf {
> >  	struct irq_work		work;	/* IRQ work that flushes the buffer */
> >  	unsigned char		buffer[SAFE_LOG_BUF_LEN];
> >  };
> > +
> > +static DEFINE_PER_CPU(struct printk_safe_seq_buf, safe_print_seq);
> > +static DEFINE_PER_CPU(int, printk_safe_context);
> 
> I would personally use the short name "printk_context". It is a generic
> value. Zero value means that it is a normal context. Also there is
> an idea to add KDB context that would use its own vprintk_kdb()
> implementation and will not use the printk_safe buffer.

ok.

> > -	if (len >= sizeof(s->buffer)) {
> > -		atomic_inc(&nmi_message_lost);
> > -		return 0;
> > -	}
> > +	if (len >= sizeof(s->buffer))
> > +		return -E2BIG;
> 
> E2BIG means "argument list too long" and does not fit much here.
> I would suggest to use -ENOSPC. It is not ideal either but it fits
> slightly better.

ok.

> > +/*
> > + * Lockless printk(), to avoid deadlocks should the printk() recurse
> > + * into itself. It uses a per-CPU buffer to store the message, just like
> > + * NMI.
> > + */
> > +static int vprintk_safe(const char *fmt, va_list args)
> > +{
> > +	struct printk_safe_seq_buf *s = this_cpu_ptr(&safe_print_seq);
> > +
> > +	return printk_safe_log_store(s, fmt, args);
> 
> We should return zero if printk_safe_log_store() returns an error.
> I know that it will get fixed in the next patch. But we should do
> some minimum sanity check here because of bisection.

ok.

	-ss

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC][PATCHv4 4/6] printk: report lost messages in printk safe/nmi contexts
  2016-11-25 11:07   ` Petr Mladek
@ 2016-12-01  2:10     ` Sergey Senozhatsky
  2016-12-01 12:50       ` Petr Mladek
  0 siblings, 1 reply; 30+ messages in thread
From: Sergey Senozhatsky @ 2016-12-01  2:10 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Andrew Morton, Jan Kara, Tejun Heo,
	Calvin Owens, Thomas Gleixner, Mel Gorman, Steven Rostedt,
	Ingo Molnar, Peter Zijlstra, Laura Abbott, Andy Lutomirski,
	Linus Torvalds, Kees Cook, linux-kernel, Sergey Senozhatsky

On (11/25/16 12:07), Petr Mladek wrote:
[..]
> > +static void report_message_lost(atomic_t *num_lost, char *fmt)
> > +{
> > +	int lost = atomic_xchg(num_lost, 0);
> > +
> > +	if (lost) {
> > +		char msg[56];
> 
> I would really like to avoid a hard coded buffer size. Such things
> are likely to bite us in the future.

why would scnprintf() overflow.

> I thought about reshuffling a lot of logic, adding more wrappers,
> ... But the solution might be easy in the end, see below.
> 
> > +
> > +		scnprintf(msg, sizeof(msg), fmt, lost);
> > +
> > +		printk_safe_flush_line(msg, strlen(msg));
> 
> This made my brain spin a lot. I wondered if it did what we wanted
> and it was safe.
> 
> On one hand, it is supposed to work because use exactly this
> function in __printk_safe_flush() where you call this from.
> 
> One question is if it does what we want in different contexts.
> Let's look at it:
> 
> 1. It calls printk_deferred() in NMI context. There is a risk
>    of a deadlock. But it is called only from
>    printk_safe_flush_on_panic() which is the last resort. Therefore
>    it does exactly what we want.
> 
> 2. It calls printk()->printk_func()->vprintk_emit() in normal context.
>    It is what we want in normal context.
> 
> 3. It calls printk()->printk_func()->v printk_safe() in printk_safe
>    context. This does not look correct. IMHO, this might happen
>    only printk_safe_flush_on_panic() and we want to use
>    printk_deferred() here as well.
[..]
> The easiest solution would be to simply call printk_deferred()
> here. Everything will be deferred after the async printk() patchset
> anyway.
> 
> I would even use printk_deferred() in printk_safe_flush_line()
> for each context. It is not optimal but it works very well
> and it makes the code much more straightforward.

yes, good point.
we can call deferred printk for anything there; or replace that in_nmi()
check with the `printk_safe_context != 0' one, and then route the message
via printk or printk_deferred.


[..]
> >   * Flush data from the associated per-CPU buffer. The function
> >   * can be called either via IRQ work or independently.
> > @@ -147,6 +183,9 @@ static void __printk_safe_flush(struct irq_work *work)
> >  
> >  	i = 0;
> >  more:
> > +	report_nmi_message_lost();
> > +	report_safe_message_lost();
> 
> Please, move this at the end of this function.

ok.

	-ss

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC][PATCHv4 5/6] printk: use printk_safe buffers
  2016-11-25 14:28   ` Petr Mladek
@ 2016-12-01  2:14     ` Sergey Senozhatsky
  0 siblings, 0 replies; 30+ messages in thread
From: Sergey Senozhatsky @ 2016-12-01  2:14 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Andrew Morton, Jan Kara, Tejun Heo,
	Calvin Owens, Thomas Gleixner, Mel Gorman, Steven Rostedt,
	Ingo Molnar, Peter Zijlstra, Laura Abbott, Andy Lutomirski,
	Linus Torvalds, Kees Cook, linux-kernel, Sergey Senozhatsky

On (11/25/16 15:28), Petr Mladek wrote:
[..]
> I really like this patch. The only small problem is that it enables
> lockdep and it does not explain why it is safe. The change itself
> looks fine but it took me some time to prove why. IMHO, it is
> worth a comment.
> 
> One thing is printk() recursion caused by lockdep warning
> triggered from inside vprintk_emit(). It is safe because
> the critical sections are guarded by printk_safe_enter()/exit()
> now.
> 
> Another thing is lockdep recursion caused by catching another lockdep
> issue when printing warning about the first one. This is safe
> because lockdep protects itself. First, it sets and checks
> current->lockdep_recursion around the critical sections.
> Second, further checks are disabled entirely once first
> lockdep issue is found.
> 
> If you add some comments about lockdep, feel free to use:

ok, I'll try to.

	-ss

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC][PATCHv4 6/6] printk: remove zap_locks() function
  2016-11-25 15:01   ` Petr Mladek
  2016-11-25 15:17     ` Peter Zijlstra
@ 2016-12-01  2:18     ` Sergey Senozhatsky
  2016-12-01 12:50     ` Sergey Senozhatsky
  2 siblings, 0 replies; 30+ messages in thread
From: Sergey Senozhatsky @ 2016-12-01  2:18 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Andrew Morton, Jan Kara, Tejun Heo,
	Calvin Owens, Thomas Gleixner, Mel Gorman, Steven Rostedt,
	Ingo Molnar, Peter Zijlstra, Laura Abbott, Andy Lutomirski,
	Linus Torvalds, Kees Cook, linux-kernel, Sergey Senozhatsky

On (11/25/16 16:01), Petr Mladek wrote:
> On Fri 2016-10-28 00:49:33, Sergey Senozhatsky wrote:
> > We use printk-safe now which makes printk-recursion detection code
> > in vprintk_emit() is unreachable. The tricky thing here is that,
>      		    ^^ superfluous "is"
> 
> > apart from detecting and reporting printk recursions, that code also
> > used to zap_lockc() in case of panic. However, zap_locks() does not
>        	          ^
> 
> s/zap_lockc/zap_locks/
> 
> > look to be needed anymore:
> > 
> > 1) Since commit 08d78658f393 ("panic: release stale console lock to
> >    always get the logbuf printed out") panic flushing of `logbuf' to
> >    console ignores the state of `console_sem' by doing
> >    	panic()
> > 		console_trylock();
> > 		console_unlock();
> > 
> > 2) Since commit cf9b1106c81c ("printk/nmi: flush NMI messages on the
> >    system panic") panic attempts to zap the `logbuf_lock' spin_lock to
> >    successfully flush nmi messages to `logbuf'.
> 
> Note that the same code is newly used to flush also the printk_safe
> per-CPU buffers. It means that logbuf_lock is zapped also when
> flushing these new buffers.

yes, and I'm a bit skeptical about the whole re-init logbuf_lock,
because this lock is just one of possibly many looks that can be
'locked'. the messages are already in per-CPU buffers, so they
will present in a core dump file.

but it can be done separately, not in this series.

> > Basically, it seems that we either already do what zap_locks() used to
> > do but in other places or we ignore the state of the lock. May be we
> > still would want to do sema_init() in printk_safe_flush_on_panic(),
> > just in case.
> 
> Very good question! I would actually suggest to use printk_deferred()
> in printk_safe_flush_on_panic() in any context. It will solve the
> problems discussed for the 4th patch of this patchset. And it will
> solve also this problem. In case of panic, we should first try to
> get all messages into the logbuffer so that they are visible in
> the crash dump. We try to push them to the console by
> console_flush_on_panic() later because it is more risky.
>
> > Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> 
> If we avoid calling console in printk_safe_flush_on_panic(),
> feel free to use:
> 
> Reviewed-by: Petr Mladek <pmladek@suse.com>

ok. thanks.

	-ss

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC][PATCHv4 6/6] printk: remove zap_locks() function
  2016-11-25 15:17     ` Peter Zijlstra
@ 2016-12-01  2:34       ` Sergey Senozhatsky
  2016-12-01  5:42         ` Peter Zijlstra
  0 siblings, 1 reply; 30+ messages in thread
From: Sergey Senozhatsky @ 2016-12-01  2:34 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Petr Mladek, Sergey Senozhatsky, Andrew Morton, Jan Kara,
	Tejun Heo, Calvin Owens, Thomas Gleixner, Mel Gorman,
	Steven Rostedt, Ingo Molnar, Laura Abbott, Andy Lutomirski,
	Linus Torvalds, Kees Cook, linux-kernel, Sergey Senozhatsky

On (11/25/16 16:17), Peter Zijlstra wrote:
> On Fri, Nov 25, 2016 at 04:01:13PM +0100, Petr Mladek wrote:
> > On Fri 2016-10-28 00:49:33, Sergey Senozhatsky wrote:
> > > 2) Since commit cf9b1106c81c ("printk/nmi: flush NMI messages on the
> > >    system panic") panic attempts to zap the `logbuf_lock' spin_lock to
> > >    successfully flush nmi messages to `logbuf'.
> > 
> > Note that the same code is newly used to flush also the printk_safe
> > per-CPU buffers. It means that logbuf_lock is zapped also when
> > flushing these new buffers.
> > 
> 
> Note that (raw_)spin_lock_init() as done here and in
> printk_nmi_flush_on_panic() can wreck the lock state and doesn't ensure
> a subsequent spin_lock() of said lock will actually work.
> 
> The very best solution is to simply ignore the lock in panic situations
> rather than trying to wreck it.

do you mean that we can enterily drop the spin_lock_init()? or is there
something else? spin_lock_init() either does not improve anything or let
us to, at least, move the messages from per-CPU buffers to the logbuf. so
it's not like it does some damage, and it can help sometimes. though I
agree that a) we have the messages in the memory already and b) logbuf_lock
is not the one&only troubling lock.

	-ss

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC][PATCHv4 3/6] printk: introduce per-cpu safe_print seq buffer
  2016-11-24 16:58   ` Petr Mladek
  2016-12-01  1:08     ` Sergey Senozhatsky
@ 2016-12-01  5:32     ` Sergey Senozhatsky
  1 sibling, 0 replies; 30+ messages in thread
From: Sergey Senozhatsky @ 2016-12-01  5:32 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Andrew Morton, Jan Kara, Tejun Heo,
	Calvin Owens, Thomas Gleixner, Mel Gorman, Steven Rostedt,
	Ingo Molnar, Peter Zijlstra, Laura Abbott, Andy Lutomirski,
	Linus Torvalds, Kees Cook, linux-kernel, Sergey Senozhatsky

On (11/24/16 17:58), Petr Mladek wrote:
[..]
> > +/*
> > + * Lockless printk(), to avoid deadlocks should the printk() recurse
> > + * into itself. It uses a per-CPU buffer to store the message, just like
> > + * NMI.
> > + */
> > +static int vprintk_safe(const char *fmt, va_list args)
> > +{
> > +	struct printk_safe_seq_buf *s = this_cpu_ptr(&safe_print_seq);
> > +
> > +	return printk_safe_log_store(s, fmt, args);
> 
> We should return zero if printk_safe_log_store() returns an error.
> I know that it will get fixed in the next patch. But we should do
> some minimum sanity check here because of bisection.

by the way. vprintk_safe() and the entire printk_safe mechanism are
not yet used in this patch. the patch that enables it comes in later.

	-ss

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC][PATCHv4 6/6] printk: remove zap_locks() function
  2016-12-01  2:34       ` Sergey Senozhatsky
@ 2016-12-01  5:42         ` Peter Zijlstra
  2016-12-01 13:36           ` Petr Mladek
  0 siblings, 1 reply; 30+ messages in thread
From: Peter Zijlstra @ 2016-12-01  5:42 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Petr Mladek, Sergey Senozhatsky, Andrew Morton, Jan Kara,
	Tejun Heo, Calvin Owens, Thomas Gleixner, Mel Gorman,
	Steven Rostedt, Ingo Molnar, Laura Abbott, Andy Lutomirski,
	Linus Torvalds, Kees Cook, linux-kernel

On Thu, Dec 01, 2016 at 11:34:42AM +0900, Sergey Senozhatsky wrote:
> On (11/25/16 16:17), Peter Zijlstra wrote:
> > On Fri, Nov 25, 2016 at 04:01:13PM +0100, Petr Mladek wrote:
> > > On Fri 2016-10-28 00:49:33, Sergey Senozhatsky wrote:
> > > > 2) Since commit cf9b1106c81c ("printk/nmi: flush NMI messages on the
> > > >    system panic") panic attempts to zap the `logbuf_lock' spin_lock to
> > > >    successfully flush nmi messages to `logbuf'.
> > > 
> > > Note that the same code is newly used to flush also the printk_safe
> > > per-CPU buffers. It means that logbuf_lock is zapped also when
> > > flushing these new buffers.
> > > 
> > 
> > Note that (raw_)spin_lock_init() as done here and in
> > printk_nmi_flush_on_panic() can wreck the lock state and doesn't ensure
> > a subsequent spin_lock() of said lock will actually work.
> > 
> > The very best solution is to simply ignore the lock in panic situations
> > rather than trying to wreck it.
> 
> do you mean that we can enterily drop the spin_lock_init()? or is there
> something else?

You should not touch the lock in any way shape or form in the panic
path. Just ignore all locking and do the console writes (which gets you
into whole different pile of crap).

Put another way, don't do silly things like spin_lock() when you're in a
hurry to get your panics out.

> spin_lock_init() either does not improve anything or let
> us to, at least, move the messages from per-CPU buffers to the logbuf.

So spin_lock_init() will completely wreck the lock. And this being the
recursion path, not a panic path, we could have continued running the
kernel no problem.

So yes, its actively harmful. You turn a benign printk() into something
that will wreck the kernel.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC][PATCHv4 2/6] printk: rename nmi.c and exported api
  2016-12-01  1:07     ` Sergey Senozhatsky
@ 2016-12-01 12:12       ` Petr Mladek
  0 siblings, 0 replies; 30+ messages in thread
From: Petr Mladek @ 2016-12-01 12:12 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Sergey Senozhatsky, Andrew Morton, Jan Kara, Tejun Heo,
	Calvin Owens, Thomas Gleixner, Mel Gorman, Steven Rostedt,
	Ingo Molnar, Peter Zijlstra, Laura Abbott, Andy Lutomirski,
	Linus Torvalds, Kees Cook, linux-kernel

On Thu 2016-12-01 10:07:01, Sergey Senozhatsky wrote:
> On (11/24/16 17:35), Petr Mladek wrote:
> [..]
> > >  #ifdef CONFIG_PRINTK_NMI
> > > -extern void printk_nmi_init(void);
> > > -extern void printk_nmi_enter(void);
> > > -extern void printk_nmi_exit(void);
> > > -extern void printk_nmi_flush(void);
> > > -extern void printk_nmi_flush_on_panic(void);
> > > +extern void printk_safe_init(void);
> > > +extern void printk_safe_nmi_enter(void);
> > > +extern void printk_safe_nmi_exit(void);
> > 
> > I would personally keep the short names pritnk_nmi_enter() and
> > printk_nmi_exit(). These are the only functions that will stay
> > in this CONFIG_PRINT_NMI section. The others are generic and
> > will be moved to the generic section in the next patch.
> 
> I'd probably prefer all printk_safe functions to be printk_safe
> prefixed, just to show the fact that those are printk_safe
> functions (IOW, indirect output to a per-CPU buffer and flush
> it to logbuf later).

Fair enough.

Best regards,
Petr

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC][PATCHv4 4/6] printk: report lost messages in printk safe/nmi contexts
  2016-12-01  2:10     ` Sergey Senozhatsky
@ 2016-12-01 12:50       ` Petr Mladek
  0 siblings, 0 replies; 30+ messages in thread
From: Petr Mladek @ 2016-12-01 12:50 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Sergey Senozhatsky, Andrew Morton, Jan Kara, Tejun Heo,
	Calvin Owens, Thomas Gleixner, Mel Gorman, Steven Rostedt,
	Ingo Molnar, Peter Zijlstra, Laura Abbott, Andy Lutomirski,
	Linus Torvalds, Kees Cook, linux-kernel

On Thu 2016-12-01 11:10:42, Sergey Senozhatsky wrote:
> On (11/25/16 12:07), Petr Mladek wrote:
> [..]
> > > +static void report_message_lost(atomic_t *num_lost, char *fmt)
> > > +{
> > > +	int lost = atomic_xchg(num_lost, 0);
> > > +
> > > +	if (lost) {
> > > +		char msg[56];
>
> > I would really like to avoid a hard coded buffer size. Such things
> > are likely to bite us in the future.
> 
> why would scnprintf() overflow.

Sure, it will not overflow. But still, such a small hard coded buffer
size looks too hacky to me.


> > I thought about reshuffling a lot of logic, adding more wrappers,
> > ... But the solution might be easy in the end, see below.
> > 
> > > +
> > > +		scnprintf(msg, sizeof(msg), fmt, lost);
> > > +
> > > +		printk_safe_flush_line(msg, strlen(msg));
> > 
> > This made my brain spin a lot. I wondered if it did what we wanted
> > and it was safe.
> > 
> > On one hand, it is supposed to work because use exactly this
> > function in __printk_safe_flush() where you call this from.
> > 
> > One question is if it does what we want in different contexts.
> > Let's look at it:
> > 
> > 1. It calls printk_deferred() in NMI context. There is a risk
> >    of a deadlock. But it is called only from
> >    printk_safe_flush_on_panic() which is the last resort. Therefore
> >    it does exactly what we want.
> > 
> > 2. It calls printk()->printk_func()->vprintk_emit() in normal context.
> >    It is what we want in normal context.
> > 
> > 3. It calls printk()->printk_func()->v printk_safe() in printk_safe
> >    context. This does not look correct. IMHO, this might happen
> >    only printk_safe_flush_on_panic() and we want to use
> >    printk_deferred() here as well.
> [..]
> > The easiest solution would be to simply call printk_deferred()
> > here. Everything will be deferred after the async printk() patchset
> > anyway.
> > 
> > I would even use printk_deferred() in printk_safe_flush_line()
> > for each context. It is not optimal but it works very well
> > and it makes the code much more straightforward.
> 
> yes, good point.
> we can call deferred printk for anything there; or replace that in_nmi()
> check with the `printk_safe_context != 0' one, and then route the message
> via printk or printk_deferred.

Yup, it might be an option and sounds good.

Anyway, I would use printk_deferred() to print the warnings about lost
messages. It is perfectly fine and you will not need the hard coded
temporary buffer.

Best regards,
Petr

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC][PATCHv4 6/6] printk: remove zap_locks() function
  2016-11-25 15:01   ` Petr Mladek
  2016-11-25 15:17     ` Peter Zijlstra
  2016-12-01  2:18     ` Sergey Senozhatsky
@ 2016-12-01 12:50     ` Sergey Senozhatsky
  2016-12-01 13:15       ` Petr Mladek
  2 siblings, 1 reply; 30+ messages in thread
From: Sergey Senozhatsky @ 2016-12-01 12:50 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Andrew Morton, Jan Kara, Tejun Heo,
	Calvin Owens, Thomas Gleixner, Mel Gorman, Steven Rostedt,
	Ingo Molnar, Peter Zijlstra, Laura Abbott, Andy Lutomirski,
	Linus Torvalds, Kees Cook, linux-kernel, Sergey Senozhatsky

On (11/25/16 16:01), Petr Mladek wrote:
[..]
> > apart from detecting and reporting printk recursions, that code also
> > used to zap_lockc() in case of panic. However, zap_locks() does not
>        	          ^
> 
> s/zap_lockc/zap_locks/
> 
> > look to be needed anymore:
> > 
> > 1) Since commit 08d78658f393 ("panic: release stale console lock to
> >    always get the logbuf printed out") panic flushing of `logbuf' to
> >    console ignores the state of `console_sem' by doing
> >    	panic()
> > 		console_trylock();
> > 		console_unlock();
> > 
> > 2) Since commit cf9b1106c81c ("printk/nmi: flush NMI messages on the
> >    system panic") panic attempts to zap the `logbuf_lock' spin_lock to
> >    successfully flush nmi messages to `logbuf'.
> 
> Note that the same code is newly used to flush also the printk_safe
> per-CPU buffers. It means that logbuf_lock is zapped also when
> flushing these new buffers.


not quite. there are several checks that would prevent multiple logbuf_lock
re-inits:


        if (in_nmi() && raw_spin_is_locked(&logbuf_lock)) {
                if (num_online_cpus() > 1)
                        return;

                debug_locks_off();
                raw_spin_lock_init(&logbuf_lock);
        }


	-ss

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC][PATCHv4 6/6] printk: remove zap_locks() function
  2016-12-01 12:50     ` Sergey Senozhatsky
@ 2016-12-01 13:15       ` Petr Mladek
  0 siblings, 0 replies; 30+ messages in thread
From: Petr Mladek @ 2016-12-01 13:15 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Andrew Morton, Jan Kara, Tejun Heo, Calvin Owens,
	Thomas Gleixner, Mel Gorman, Steven Rostedt, Ingo Molnar,
	Peter Zijlstra, Laura Abbott, Andy Lutomirski, Linus Torvalds,
	Kees Cook, linux-kernel, Sergey Senozhatsky

On Thu 2016-12-01 21:50:45, Sergey Senozhatsky wrote:
> On (11/25/16 16:01), Petr Mladek wrote:
> [..]
> > > apart from detecting and reporting printk recursions, that code also
> > > used to zap_lockc() in case of panic. However, zap_locks() does not
> >        	          ^
> > 
> > s/zap_lockc/zap_locks/
> > 
> > > look to be needed anymore:
> > > 
> > > 1) Since commit 08d78658f393 ("panic: release stale console lock to
> > >    always get the logbuf printed out") panic flushing of `logbuf' to
> > >    console ignores the state of `console_sem' by doing
> > >    	panic()
> > > 		console_trylock();
> > > 		console_unlock();
> > > 
> > > 2) Since commit cf9b1106c81c ("printk/nmi: flush NMI messages on the
> > >    system panic") panic attempts to zap the `logbuf_lock' spin_lock to
> > >    successfully flush nmi messages to `logbuf'.
> > 
> > Note that the same code is newly used to flush also the printk_safe
> > per-CPU buffers. It means that logbuf_lock is zapped also when
> > flushing these new buffers.
> 
> 
> not quite. there are several checks that would prevent multiple logbuf_lock
> re-inits:

Ah, I did not meant that the lock was zapped twice. I wanted to say
that we added a new path and that it was handled as well :-)

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC][PATCHv4 6/6] printk: remove zap_locks() function
  2016-12-01  5:42         ` Peter Zijlstra
@ 2016-12-01 13:36           ` Petr Mladek
  2016-12-02  1:11             ` Sergey Senozhatsky
  0 siblings, 1 reply; 30+ messages in thread
From: Petr Mladek @ 2016-12-01 13:36 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Andrew Morton, Jan Kara,
	Tejun Heo, Calvin Owens, Thomas Gleixner, Mel Gorman,
	Steven Rostedt, Ingo Molnar, Laura Abbott, Andy Lutomirski,
	Linus Torvalds, Kees Cook, linux-kernel

On Thu 2016-12-01 06:42:29, Peter Zijlstra wrote:
> On Thu, Dec 01, 2016 at 11:34:42AM +0900, Sergey Senozhatsky wrote:
> > On (11/25/16 16:17), Peter Zijlstra wrote:
> > > On Fri, Nov 25, 2016 at 04:01:13PM +0100, Petr Mladek wrote:
> > > > On Fri 2016-10-28 00:49:33, Sergey Senozhatsky wrote:
> > > > > 2) Since commit cf9b1106c81c ("printk/nmi: flush NMI messages on the
> > > > >    system panic") panic attempts to zap the `logbuf_lock' spin_lock to
> > > > >    successfully flush nmi messages to `logbuf'.
> > > > 
> > > > Note that the same code is newly used to flush also the printk_safe
> > > > per-CPU buffers. It means that logbuf_lock is zapped also when
> > > > flushing these new buffers.
> > > > 
> > > 
> > > Note that (raw_)spin_lock_init() as done here and in
> > > printk_nmi_flush_on_panic() can wreck the lock state and doesn't ensure
> > > a subsequent spin_lock() of said lock will actually work.
> > > 
> > > The very best solution is to simply ignore the lock in panic situations
> > > rather than trying to wreck it.
> > 
> > do you mean that we can enterily drop the spin_lock_init()? or is there
> > something else?
> 
> You should not touch the lock in any way shape or form in the panic
> path. Just ignore all locking and do the console writes (which gets you
> into whole different pile of crap).

And this is my fear. I am not sure if the other crap is better than
the current one.

One crazy idea. A compromise might be to switch into a timelimed locking
in the panic mode when there are still more CPUs active. If a spin
lock is not available within X thousands of cycles, there is probably
a deadlock and we should just enter the critical section. It would
preserve some reasonable synchronization but it will allow to move
forward.

Another solution would be to use the temporary buffers if the lock
is not available and push it into the main buffer and consoles later
when there is only one CPU running. In this stage, we do not need
to synchronize and could just skip locking as you suggest.


> Put another way, don't do silly things like spin_lock() when you're in a
> hurry to get your panics out.
> 
> > spin_lock_init() either does not improve anything or let
> > us to, at least, move the messages from per-CPU buffers to the logbuf.
> 
> So spin_lock_init() will completely wreck the lock. And this being the
> recursion path, not a panic path, we could have continued running the
> kernel no problem.

printk_nmi_flush_on_panic() is called from panic(). It means that we
will do this only when the system is really going down. Which is a nice
improvement. The current code zaps the locks during any Oops.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC][PATCHv4 6/6] printk: remove zap_locks() function
  2016-12-01 13:36           ` Petr Mladek
@ 2016-12-02  1:11             ` Sergey Senozhatsky
  0 siblings, 0 replies; 30+ messages in thread
From: Sergey Senozhatsky @ 2016-12-02  1:11 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Peter Zijlstra, Sergey Senozhatsky, Sergey Senozhatsky,
	Andrew Morton, Jan Kara, Tejun Heo, Calvin Owens,
	Thomas Gleixner, Mel Gorman, Steven Rostedt, Ingo Molnar,
	Laura Abbott, Andy Lutomirski, Linus Torvalds, Kees Cook,
	linux-kernel

On (12/01/16 14:36), Petr Mladek wrote:
[..]
> > > > > Note that the same code is newly used to flush also the printk_safe
> > > > > per-CPU buffers. It means that logbuf_lock is zapped also when
> > > > > flushing these new buffers.
> > > > > 
> > > > 
> > > > Note that (raw_)spin_lock_init() as done here and in
> > > > printk_nmi_flush_on_panic() can wreck the lock state and doesn't ensure
> > > > a subsequent spin_lock() of said lock will actually work.
> > > > 
> > > > The very best solution is to simply ignore the lock in panic situations
> > > > rather than trying to wreck it.
> > > 
> > > do you mean that we can enterily drop the spin_lock_init()? or is there
> > > something else?
> > 
> > You should not touch the lock in any way shape or form in the panic
> > path. Just ignore all locking and do the console writes (which gets you
> > into whole different pile of crap).
> 
> And this is my fear. I am not sure if the other crap is better than
> the current one.

yeah, that's a good point.

> One crazy idea. A compromise might be to switch into a timelimed locking
> in the panic mode when there are still more CPUs active. If a spin
> lock is not available within X thousands of cycles, there is probably
> a deadlock and we should just enter the critical section. It would
> preserve some reasonable synchronization but it will allow to move
> forward.

logbuf spin_lock is just one of the locks. we also have scheduler spinlocks,
console drivers spinlocks, semaphore spinlock, etc. the messages, on the other
hand, are already in the memory (per-CPU buffers), so they will make it into
the core file (if there will be one).

> Another solution would be to use the temporary buffers if the lock
> is not available and push it into the main buffer and consoles later
> when there is only one CPU running. In this stage, we do not need
> to synchronize and could just skip locking as you suggest.

that's interesting. the problem here is that smp_send_stop() does not
guarantee that all the remaining CPUs will stop by the time it returns

arch/arm/kernel/smp.c

	void smp_send_stop(void)
	{
		unsigned long timeout;
		struct cpumask mask;

		cpumask_copy(&mask, cpu_online_mask);
		cpumask_clear_cpu(smp_processor_id(), &mask);
		if (!cpumask_empty(&mask))
			smp_cross_call(&mask, IPI_CPU_STOP);

		/* Wait up to one second for other CPUs to stop */
		timeout = USEC_PER_SEC;
		while (num_online_cpus() > 1 && timeout--)
			udelay(1);

		if (num_online_cpus() > 1)
			pr_warn("SMP: failed to stop secondary CPUs\n");
	}


> > Put another way, don't do silly things like spin_lock() when you're in a
> > hurry to get your panics out.
> > 
> > > spin_lock_init() either does not improve anything or let
> > > us to, at least, move the messages from per-CPU buffers to the logbuf.
> > 
> > So spin_lock_init() will completely wreck the lock. And this being the
> > recursion path, not a panic path, we could have continued running the
> > kernel no problem.
> 
> printk_nmi_flush_on_panic() is called from panic(). It means that we
> will do this only when the system is really going down. Which is a nice
> improvement. The current code zaps the locks during any Oops.

correct. well, not any oops, but 'oops && printk recursion' combo

	if (unlikely(logbuf_cpu == this_cpu)) {
		/*
		 * If a crash is occurring during printk() on this CPU,
		 * then try to get the crash message out but make sure
		 * we can't deadlock. Otherwise just return to avoid the
		 * recursion and return - but flag the recursion so that
		 * it can be printed at the next appropriate moment:
		 */
		if (!oops_in_progress && !lockdep_recursing(current)) {
			recursion_bug = true;
			local_irq_restore(flags);
			return 0;
		}
		zap_locks();
	}

other than that - yes, now we do (...we are going to do) it only
from the panic() path.

	-ss

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2016-12-02  1:11 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-27 15:49 [RFC][PATCHv4 0/6] printk: use printk_safe to handle printk() recursive calls Sergey Senozhatsky
2016-10-27 15:49 ` [RFC][PATCHv4 1/6] printk: use vprintk_func in vprintk() Sergey Senozhatsky
2016-11-24 16:28   ` Petr Mladek
2016-10-27 15:49 ` [RFC][PATCHv4 2/6] printk: rename nmi.c and exported api Sergey Senozhatsky
2016-11-24 16:35   ` Petr Mladek
2016-12-01  1:07     ` Sergey Senozhatsky
2016-12-01 12:12       ` Petr Mladek
2016-10-27 15:49 ` [RFC][PATCHv4 3/6] printk: introduce per-cpu safe_print seq buffer Sergey Senozhatsky
2016-11-24 16:58   ` Petr Mladek
2016-12-01  1:08     ` Sergey Senozhatsky
2016-12-01  5:32     ` Sergey Senozhatsky
2016-10-27 15:49 ` [RFC][PATCHv4 4/6] printk: report lost messages in printk safe/nmi contexts Sergey Senozhatsky
2016-11-25 11:07   ` Petr Mladek
2016-12-01  2:10     ` Sergey Senozhatsky
2016-12-01 12:50       ` Petr Mladek
2016-10-27 15:49 ` [RFC][PATCHv4 5/6] printk: use printk_safe buffers Sergey Senozhatsky
2016-11-25 14:28   ` Petr Mladek
2016-12-01  2:14     ` Sergey Senozhatsky
2016-10-27 15:49 ` [RFC][PATCHv4 6/6] printk: remove zap_locks() function Sergey Senozhatsky
2016-11-25 15:01   ` Petr Mladek
2016-11-25 15:17     ` Peter Zijlstra
2016-12-01  2:34       ` Sergey Senozhatsky
2016-12-01  5:42         ` Peter Zijlstra
2016-12-01 13:36           ` Petr Mladek
2016-12-02  1:11             ` Sergey Senozhatsky
2016-12-01  2:18     ` Sergey Senozhatsky
2016-12-01 12:50     ` Sergey Senozhatsky
2016-12-01 13:15       ` Petr Mladek
2016-10-28  3:30 ` [RFC][PATCHv4 0/6] printk: use printk_safe to handle printk() recursive calls Linus Torvalds
2016-10-28  4:05   ` Sergey Senozhatsky

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).