linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v6 1/3] printk: Add line-buffered printk() API.
@ 2018-11-02 13:31 Tetsuo Handa
  2018-11-02 13:31 ` [PATCH 2/3] mm: Use line-buffered printk() for show_free_areas() Tetsuo Handa
                   ` (6 more replies)
  0 siblings, 7 replies; 54+ messages in thread
From: Tetsuo Handa @ 2018-11-02 13:31 UTC (permalink / raw)
  To: Petr Mladek, Sergey Senozhatsky, Sergey Senozhatsky,
	Dmitriy Vyukov, Steven Rostedt
  Cc: Alexander Potapenko, Fengguang Wu, Josh Poimboeuf, LKML,
	Linus Torvalds, Andrew Morton, linux-mm, Ingo Molnar,
	Peter Zijlstra, Will Deacon, Tetsuo Handa

Sometimes we want to print a whole line without being disturbed by
concurrent printk() from interrupts and/or other threads, for printk()
which does not end with '\n' can be disturbed.

Since mixed printk() output makes it hard to interpret, this patch
introduces API for line-buffered printk() output (so that we can make
sure that printk() ends with '\n').

Since functions introduced by this patch are merely wrapping
printk()/vprintk() calls in order to minimize possibility of using
"struct cont", it is safe to replace printk()/vprintk() with this API.
Since we want to remove "struct cont" eventually, we will try to remove
both "implicit printk() users who are expecting KERN_CONT behavior" and
"explicit pr_cont()/printk(KERN_CONT) users". Therefore, converting to
this API is recommended.

Details:

  A structure named "struct printk_buffer" is introduced for buffering
  up to LOG_LINE_MAX bytes of printk() output which did not end with '\n'.

  get_printk_buffer() tries to assign a "struct printk_buffer" from
  statically preallocated array. get_printk_buffer() returns NULL if
  all "struct printk_buffer" are in use, but the caller does not need to
  check for NULL.

  put_printk_buffer() flushes and releases the "struct printk_buffer".
  put_printk_buffer() must match corresponding get_printk_buffer() as with
  rcu_read_unlock() must match corresponding rcu_read_lock().

  Three functions vprintk_buffered(), printk_buffered() and
  flush_printk_buffer() are provided for using "struct printk_buffer".
  These are like vfprintf(), fprintf(), fflush() except that these receive
  "struct printk_buffer *" for the first argument.

  vprintk_buffered() and printk_buffered() fall back to vprintk() and
  printk() respectively if "struct printk_buffer *" argument is NULL.
  flush_printk_buffer() and put_printk_buffer() become no-op if
  "struct printk_buffer *" argument is NULL. Therefore, the caller of
  get_printk_buffer() does not need to check for NULL.

How to use this API:

  (1) Call get_printk_buffer() and acquire "struct printk_buffer *".

  (2) Rewrite printk() calls in the following way. The "ptr" is
      "struct printk_buffer *" obtained in step (1).

      printk(fmt, ...)     => printk_buffered(ptr, fmt, ...)
      vprintk(fmt, args)   => vprintk_buffered(ptr, fmt, args)
      pr_emerg(fmt, ...)   => bpr_emerg(ptr, fmt, ...)
      pr_alert(fmt, ...)   => bpr_alert(ptr, fmt, ...)
      pr_crit(fmt, ...)    => bpr_crit(ptr, fmt, ...)
      pr_err(fmt, ...)     => bpr_err(ptr, fmt, ...)
      pr_warning(fmt, ...) => bpr_warning(ptr, fmt, ...)
      pr_warn(fmt, ...)    => bpr_warn(ptr, fmt, ...)
      pr_notice(fmt, ...)  => bpr_notice(ptr, fmt, ...)
      pr_info(fmt, ...)    => bpr_info(ptr, fmt, ...)
      pr_cont(fmt, ...)    => bpr_cont(ptr, fmt, ...)

  (3) Release "struct printk_buffer" by calling put_printk_buffer().

Note that since "struct printk_buffer" buffers only up to one line, there
is no need to rewrite if it is known that the "struct printk_buffer" is
empty and printk() ends with '\n'.

  Good example:

    printk("Hello ");    =>  buf = get_printk_buffer();
    pr_cont("world.\n");     printk_buffered(buf, "Hello ");
                             printk_buffered(buf, "world.\n");
                             put_printk_buffer(buf);

  Pointless example:

    printk("Hello\n");   =>  buf = get_printk_buffer();
    printk("World.\n");      printk_buffered(buf, "Hello\n");
                             printk_buffered(buf, "World.\n");
                             put_printk_buffer(buf);

Note that bpr_devel() and bpr_debug() are not defined. This is
because pr_devel()/pr_debug() should not be followed by pr_cont()
because pr_devel()/pr_debug() are conditionally enabled; output from
pr_devel()/pr_debug() should always end with '\n'.

Previous version was proposed at
https://lkml.kernel.org/r/1540375870-6235-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp .

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
 include/linux/printk.h          |  43 ++++++++++
 kernel/printk/Makefile          |   2 +-
 kernel/printk/internal.h        |   3 +
 kernel/printk/printk.c          |   3 -
 kernel/printk/printk_buffered.c | 179 ++++++++++++++++++++++++++++++++++++++++
 5 files changed, 226 insertions(+), 4 deletions(-)
 create mode 100644 kernel/printk/printk_buffered.c

diff --git a/include/linux/printk.h b/include/linux/printk.h
index cf3eccf..92af345 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -157,6 +157,7 @@ static inline void printk_nmi_direct_enter(void) { }
 static inline void printk_nmi_direct_exit(void) { }
 #endif /* PRINTK_NMI */
 
+struct printk_buffer;
 #ifdef CONFIG_PRINTK
 asmlinkage __printf(5, 0)
 int vprintk_emit(int facility, int level,
@@ -173,6 +174,20 @@ int printk_emit(int facility, int level,
 
 asmlinkage __printf(1, 2) __cold
 int printk(const char *fmt, ...);
+struct printk_buffer *get_printk_buffer(void);
+bool flush_printk_buffer(struct printk_buffer *ptr);
+__printf(2, 3)
+int printk_buffered(struct printk_buffer *ptr, const char *fmt, ...);
+__printf(2, 0)
+int vprintk_buffered(struct printk_buffer *ptr, const char *fmt, va_list args);
+/*
+ * In order to avoid accidentally reusing "ptr" after put_printk_buffer("ptr"),
+ * put_printk_buffer() is defined as a macro which explicitly resets "ptr" to
+ * NULL.
+ */
+void __put_printk_buffer(struct printk_buffer *ptr);
+#define put_printk_buffer(ptr)					\
+	do { __put_printk_buffer(ptr); ptr = NULL; } while (0)
 
 /*
  * Special printk facility for scheduler/timekeeping use only, _DO_NOT_USE_ !
@@ -220,6 +235,17 @@ int printk(const char *s, ...)
 {
 	return 0;
 }
+static inline struct printk_buffer *get_printk_buffer(void)
+{
+	return NULL;
+}
+static inline bool flush_printk_buffer(struct printk_buffer *ptr)
+{
+	return false;
+}
+#define printk_buffered(ptr, fmt, ...) printk(fmt, ##__VA_ARGS__)
+#define vprintk_buffered(ptr, fmt, args) vprintk(fmt, args)
+#define put_printk_buffer(ptr) do { ptr = NULL; } while (0)
 static inline __printf(1, 2) __cold
 int printk_deferred(const char *s, ...)
 {
@@ -330,6 +356,23 @@ static inline void printk_safe_flush_on_panic(void)
 	no_printk(KERN_DEBUG pr_fmt(fmt), ##__VA_ARGS__)
 #endif
 
+#define bpr_emerg(ptr, fmt, ...) \
+	printk_buffered(ptr, KERN_EMERG pr_fmt(fmt), ##__VA_ARGS__)
+#define bpr_alert(ptr, fmt, ...) \
+	printk_buffered(ptr, KERN_ALERT pr_fmt(fmt), ##__VA_ARGS__)
+#define bpr_crit(ptr, fmt, ...) \
+	printk_buffered(ptr, KERN_CRIT pr_fmt(fmt), ##__VA_ARGS__)
+#define bpr_err(ptr, fmt, ...) \
+	printk_buffered(ptr, KERN_ERR pr_fmt(fmt), ##__VA_ARGS__)
+#define bpr_warning(ptr, fmt, ...) \
+	printk_buffered(ptr, KERN_WARNING pr_fmt(fmt), ##__VA_ARGS__)
+#define bpr_warn bpr_warning
+#define bpr_notice(ptr, fmt, ...) \
+	printk_buffered(ptr, KERN_NOTICE pr_fmt(fmt), ##__VA_ARGS__)
+#define bpr_info(ptr, fmt, ...) \
+	printk_buffered(ptr, KERN_INFO pr_fmt(fmt), ##__VA_ARGS__)
+#define bpr_cont(ptr, fmt, ...) \
+	printk_buffered(ptr, KERN_CONT fmt, ##__VA_ARGS__)
 
 /* If you are writing a driver, please use dev_dbg instead */
 #if defined(CONFIG_DYNAMIC_DEBUG)
diff --git a/kernel/printk/Makefile b/kernel/printk/Makefile
index 4a2ffc3..23b1547 100644
--- a/kernel/printk/Makefile
+++ b/kernel/printk/Makefile
@@ -1,3 +1,3 @@
 obj-y	= printk.o
-obj-$(CONFIG_PRINTK)	+= printk_safe.o
+obj-$(CONFIG_PRINTK)	+= printk_safe.o printk_buffered.o
 obj-$(CONFIG_A11Y_BRAILLE_CONSOLE)	+= braille.o
diff --git a/kernel/printk/internal.h b/kernel/printk/internal.h
index 0f18988..5e8c048 100644
--- a/kernel/printk/internal.h
+++ b/kernel/printk/internal.h
@@ -22,6 +22,9 @@
 #define PRINTK_NMI_DIRECT_CONTEXT_MASK	 0x40000000
 #define PRINTK_NMI_CONTEXT_MASK		 0x80000000
 
+#define PREFIX_MAX		32
+#define LOG_LINE_MAX		(1024 - PREFIX_MAX)
+
 extern raw_spinlock_t logbuf_lock;
 
 __printf(5, 0)
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 1b2a029..0b06211 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -421,9 +421,6 @@ __packed __aligned(4)
 static u64 clear_seq;
 static u32 clear_idx;
 
-#define PREFIX_MAX		32
-#define LOG_LINE_MAX		(1024 - PREFIX_MAX)
-
 #define LOG_LEVEL(v)		((v) & 0x07)
 #define LOG_FACILITY(v)		((v) >> 3 & 0xff)
 
diff --git a/kernel/printk/printk_buffered.c b/kernel/printk/printk_buffered.c
new file mode 100644
index 0000000..d181d31
--- /dev/null
+++ b/kernel/printk/printk_buffered.c
@@ -0,0 +1,179 @@
+/* SPDX-License-Identifier: GPL-2.0+ */
+
+#include <linux/types.h> /* DECLARE_BITMAP() */
+#include <linux/printk.h>
+#include "internal.h"
+
+/* A structure for line-buffered printk() output. */
+struct printk_buffer {
+	unsigned short int len; /* Valid bytes in buf[]. */
+	char buf[LOG_LINE_MAX];
+} __aligned(1024);
+
+/*
+ * Number of statically preallocated buffers.
+ *
+ * We can introduce a kernel config option if someone wants to tune this value.
+ * But since "struct printk_buffer" makes difference only when there are
+ * multiple threads concurrently calling printk() which does not end with '\n',
+ * and this API will fallback to normal printk() when all buffers are in use,
+ * it is possible that nobody needs to tune this value.
+ */
+#define NUM_LINE_BUFFERS 16
+
+static struct printk_buffer printk_buffers[NUM_LINE_BUFFERS];
+static DECLARE_BITMAP(printk_buffers_in_use, NUM_LINE_BUFFERS);
+
+/**
+ * get_printk_buffer - Try to get printk_buffer.
+ *
+ * Returns pointer to "struct printk_buffer" on success, NULL otherwise.
+ *
+ * If this function returned "struct printk_buffer", the caller is responsible
+ * for passing it to put_printk_buffer() so that "struct printk_buffer" can be
+ * reused in the future.
+ *
+ * Even if this function returned NULL, the caller does not need to check for
+ * NULL, for passing NULL to printk_buffered() simply acts like normal printk()
+ * and passing NULL to flush_printk_buffer()/put_printk_buffer() is a no-op.
+ */
+struct printk_buffer *get_printk_buffer(void)
+{
+	long i;
+
+	for (i = 0; i < NUM_LINE_BUFFERS; i++) {
+		if (test_and_set_bit_lock(i, printk_buffers_in_use))
+			continue;
+		printk_buffers[i].len = 0;
+		return &printk_buffers[i];
+	}
+	return NULL;
+}
+EXPORT_SYMBOL(get_printk_buffer);
+
+/**
+ * vprintk_buffered - Try to vprintk() in line buffered mode.
+ *
+ * @ptr:  Pointer to "struct printk_buffer". It can be NULL.
+ * @fmt:  printk() format string.
+ * @args: va_list structure.
+ *
+ * Returns the return value of vprintk().
+ *
+ * Try to store to @ptr first. If it fails, flush @ptr and then try to store to
+ * @ptr again. If it still fails, use unbuffered printing.
+ */
+int vprintk_buffered(struct printk_buffer *ptr, const char *fmt, va_list args)
+{
+	va_list tmp_args;
+	int r;
+	int pos;
+
+	if (!ptr)
+		return vprintk(fmt, args);
+	/*
+	 * Skip KERN_CONT here based on an assumption that KERN_CONT will be
+	 * given via "fmt" argument when KERN_CONT is given.
+	 */
+	pos = (printk_get_level(fmt) == 'c') ? 2 : 0;
+	while (true) {
+		va_copy(tmp_args, args);
+		r = vsnprintf(ptr->buf + ptr->len, sizeof(ptr->buf) - ptr->len,
+			      fmt + pos, tmp_args);
+		va_end(tmp_args);
+		if (likely(r + ptr->len < sizeof(ptr->buf)))
+			break;
+		if (!flush_printk_buffer(ptr))
+			return vprintk(fmt, args);
+	}
+	ptr->len += r;
+	/* Flush already completed lines if any. */
+	for (pos = ptr->len - 1; pos >= 0; pos--) {
+		if (ptr->buf[pos] != '\n')
+			continue;
+		ptr->buf[pos++] = '\0';
+		printk("%s\n", ptr->buf);
+		ptr->len -= pos;
+		memmove(ptr->buf, ptr->buf + pos, ptr->len);
+		/* This '\0' will be overwritten by next vsnprintf() above. */
+		ptr->buf[ptr->len] = '\0';
+		break;
+	}
+	return r;
+}
+
+/**
+ * printk_buffered - Try to printk() in line buffered mode.
+ *
+ * @ptr: Pointer to "struct printk_buffer". It can be NULL.
+ * @fmt: printk() format string, followed by arguments.
+ *
+ * Returns the return value of printk().
+ *
+ * Try to store to @ptr first. If it fails, flush @ptr and then try to store to
+ * @ptr again. If it still fails, use unbuffered printing.
+ */
+int printk_buffered(struct printk_buffer *ptr, const char *fmt, ...)
+{
+	va_list args;
+	int r;
+
+	va_start(args, fmt);
+	r = vprintk_buffered(ptr, fmt, args);
+	va_end(args);
+	return r;
+}
+EXPORT_SYMBOL(printk_buffered);
+
+/**
+ * flush_printk_buffer - Flush incomplete line in printk_buffer.
+ *
+ * @ptr: Pointer to "struct printk_buffer". It can be NULL.
+ *
+ * Returns true if flushed something, false otherwise.
+ *
+ * Flush if @ptr contains partial data. But usually there is no need to call
+ * this function because @ptr is flushed by put_printk_buffer().
+ */
+bool flush_printk_buffer(struct printk_buffer *ptr)
+{
+	if (!ptr || !ptr->len)
+		return false;
+	/*
+	 * vprintk_buffered() keeps 0 <= ptr->len < sizeof(ptr->buf) true.
+	 * But ptr->buf[ptr->len] != '\0' if this function is called due to
+	 * vsnprintf() + ptr->len >= sizeof(ptr->buf).
+	 */
+	ptr->buf[ptr->len] = '\0';
+	printk("%s", ptr->buf);
+	ptr->len = 0;
+	return true;
+}
+EXPORT_SYMBOL(flush_printk_buffer);
+
+/**
+ * __put_printk_buffer - Release printk_buffer.
+ *
+ * @ptr: Pointer to "struct printk_buffer". It can be NULL.
+ *
+ * Returns nothing.
+ *
+ * Flush and release @ptr.
+ * Please use put_printk_buffer() in order to catch use-after-free bugs.
+ */
+void __put_printk_buffer(struct printk_buffer *ptr)
+{
+	long i = (unsigned long) ptr - (unsigned long) printk_buffers;
+
+	if (!ptr)
+		return;
+	if (WARN_ON_ONCE(i % sizeof(struct printk_buffer)))
+		return;
+	i /= sizeof(struct printk_buffer);
+	if (WARN_ON_ONCE(i < 0 || i >= NUM_LINE_BUFFERS))
+		return;
+	if (ptr->len)
+		flush_printk_buffer(ptr);
+	clear_bit_unlock(i, printk_buffers_in_use);
+}
+EXPORT_SYMBOL(__put_printk_buffer);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH 2/3] mm: Use line-buffered printk() for show_free_areas().
  2018-11-02 13:31 [PATCH v6 1/3] printk: Add line-buffered printk() API Tetsuo Handa
@ 2018-11-02 13:31 ` Tetsuo Handa
  2018-11-07 14:07   ` Petr Mladek
  2018-11-02 13:31 ` [PATCH 3/3] lockdep: Use line-buffered printk() for lockdep messages Tetsuo Handa
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 54+ messages in thread
From: Tetsuo Handa @ 2018-11-02 13:31 UTC (permalink / raw)
  To: Petr Mladek, Sergey Senozhatsky, Sergey Senozhatsky,
	Dmitriy Vyukov, Steven Rostedt
  Cc: Alexander Potapenko, Fengguang Wu, Josh Poimboeuf, LKML,
	Linus Torvalds, Andrew Morton, linux-mm, Ingo Molnar,
	Peter Zijlstra, Will Deacon, Tetsuo Handa

syzbot is sometimes getting mixed output like below due to concurrent
printk(). Mitigate such output by using line-buffered printk() API.

  Node 0 DMA: 1*4kB (U) 0*8kB 0*16kB 1*32kB 
  syz-executor0: page allocation failure: order:0, mode:0x484020(GFP_ATOMIC|__GFP_COMP), nodemask=(null)
  (U) 
  syz-executor0 cpuset=
  2*64kB 
  syz0
  (U) 
   mems_allowed=0
  1*128kB 
  CPU: 0 PID: 7592 Comm: syz-executor0 Not tainted 4.19.0-rc6+ #118
  (U) 
  Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
  1*256kB (U) 
  Call Trace:
  0*512kB 
   <IRQ>
  1*1024kB 
   __dump_stack lib/dump_stack.c:77 [inline]
   dump_stack+0x1c4/0x2b4 lib/dump_stack.c:113
  (U) 
  1*2048kB 
   warn_alloc.cold.119+0xb7/0x1bd mm/page_alloc.c:3426
  (M) 

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
 mm/page_alloc.c | 32 +++++++++++++++++---------------
 1 file changed, 17 insertions(+), 15 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index a919ba5..4411d5a 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4694,10 +4694,10 @@ unsigned long nr_free_pagecache_pages(void)
 	return nr_free_zone_pages(gfp_zone(GFP_HIGHUSER_MOVABLE));
 }
 
-static inline void show_node(struct zone *zone)
+static inline void show_node(struct zone *zone, struct printk_buffer *buf)
 {
 	if (IS_ENABLED(CONFIG_NUMA))
-		printk("Node %d ", zone_to_nid(zone));
+		printk_buffered(buf, "Node %d ", zone_to_nid(zone));
 }
 
 long si_mem_available(void)
@@ -4814,7 +4814,7 @@ static bool show_mem_node_skip(unsigned int flags, int nid, nodemask_t *nodemask
 
 #define K(x) ((x) << (PAGE_SHIFT-10))
 
-static void show_migration_types(unsigned char type)
+static void show_migration_types(unsigned char type, struct printk_buffer *buf)
 {
 	static const char types[MIGRATE_TYPES] = {
 		[MIGRATE_UNMOVABLE]	= 'U',
@@ -4838,7 +4838,7 @@ static void show_migration_types(unsigned char type)
 	}
 
 	*p = '\0';
-	printk(KERN_CONT "(%s) ", tmp);
+	printk_buffered(buf, "(%s) ", tmp);
 }
 
 /*
@@ -4852,6 +4852,7 @@ static void show_migration_types(unsigned char type)
  */
 void show_free_areas(unsigned int filter, nodemask_t *nodemask)
 {
+	struct printk_buffer *buf = get_printk_buffer();
 	unsigned long free_pcp = 0;
 	int cpu;
 	struct zone *zone;
@@ -4950,8 +4951,8 @@ void show_free_areas(unsigned int filter, nodemask_t *nodemask)
 		for_each_online_cpu(cpu)
 			free_pcp += per_cpu_ptr(zone->pageset, cpu)->pcp.count;
 
-		show_node(zone);
-		printk(KERN_CONT
+		show_node(zone, buf);
+		printk_buffered(buf,
 			"%s"
 			" free:%lukB"
 			" min:%lukB"
@@ -4993,10 +4994,10 @@ void show_free_areas(unsigned int filter, nodemask_t *nodemask)
 			K(free_pcp),
 			K(this_cpu_read(zone->pageset->pcp.count)),
 			K(zone_page_state(zone, NR_FREE_CMA_PAGES)));
-		printk("lowmem_reserve[]:");
+		printk_buffered(buf, "lowmem_reserve[]:");
 		for (i = 0; i < MAX_NR_ZONES; i++)
-			printk(KERN_CONT " %ld", zone->lowmem_reserve[i]);
-		printk(KERN_CONT "\n");
+			printk_buffered(buf, " %ld", zone->lowmem_reserve[i]);
+		printk_buffered(buf, "\n");
 	}
 
 	for_each_populated_zone(zone) {
@@ -5006,8 +5007,8 @@ void show_free_areas(unsigned int filter, nodemask_t *nodemask)
 
 		if (show_mem_node_skip(filter, zone_to_nid(zone), nodemask))
 			continue;
-		show_node(zone);
-		printk(KERN_CONT "%s: ", zone->name);
+		show_node(zone, buf);
+		printk_buffered(buf, "%s: ", zone->name);
 
 		spin_lock_irqsave(&zone->lock, flags);
 		for (order = 0; order < MAX_ORDER; order++) {
@@ -5025,12 +5026,12 @@ void show_free_areas(unsigned int filter, nodemask_t *nodemask)
 		}
 		spin_unlock_irqrestore(&zone->lock, flags);
 		for (order = 0; order < MAX_ORDER; order++) {
-			printk(KERN_CONT "%lu*%lukB ",
-			       nr[order], K(1UL) << order);
+			printk_buffered(buf, "%lu*%lukB ",
+					nr[order], K(1UL) << order);
 			if (nr[order])
-				show_migration_types(types[order]);
+				show_migration_types(types[order], buf);
 		}
-		printk(KERN_CONT "= %lukB\n", K(total));
+		printk_buffered(buf, "= %lukB\n", K(total));
 	}
 
 	hugetlb_show_meminfo();
@@ -5038,6 +5039,7 @@ void show_free_areas(unsigned int filter, nodemask_t *nodemask)
 	printk("%ld total pagecache pages\n", global_node_page_state(NR_FILE_PAGES));
 
 	show_swap_cache_info();
+	put_printk_buffer(buf);
 }
 
 static void zoneref_set_zone(struct zone *zone, struct zoneref *zoneref)
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH 3/3] lockdep: Use line-buffered printk() for lockdep messages.
  2018-11-02 13:31 [PATCH v6 1/3] printk: Add line-buffered printk() API Tetsuo Handa
  2018-11-02 13:31 ` [PATCH 2/3] mm: Use line-buffered printk() for show_free_areas() Tetsuo Handa
@ 2018-11-02 13:31 ` Tetsuo Handa
  2018-11-02 13:36   ` Peter Zijlstra
  2018-11-07 15:19   ` Petr Mladek
  2018-11-02 14:40 ` [PATCH v6 1/3] printk: Add line-buffered printk() API Matthew Wilcox
                   ` (4 subsequent siblings)
  6 siblings, 2 replies; 54+ messages in thread
From: Tetsuo Handa @ 2018-11-02 13:31 UTC (permalink / raw)
  To: Petr Mladek, Sergey Senozhatsky, Sergey Senozhatsky,
	Dmitriy Vyukov, Steven Rostedt
  Cc: Alexander Potapenko, Fengguang Wu, Josh Poimboeuf, LKML,
	Linus Torvalds, Andrew Morton, linux-mm, Ingo Molnar,
	Peter Zijlstra, Will Deacon, Tetsuo Handa

syzbot is sometimes getting mixed output like below due to concurrent
printk(). Mitigate such output by using line-buffered printk() API.

  RCU used illegally from idle CPU!
  rcu_scheduler_active = 2, debug_locks = 1
  RSP: 0018:ffffffff88007bb8 EFLAGS: 00000286
  RCU used illegally from extended quiescent state!
   ORIG_RAX: ffffffffffffff13
  1 lock held by swapper/1/0:
  RAX: dffffc0000000000 RBX: 1ffffffff1000f7b RCX: 0000000000000000
   #0: 
  RDX: 1ffffffff10237b8 RSI: 0000000000000001 RDI: ffffffff8811bdc0
  000000004b34587c
  RBP: ffffffff88007bb8 R08: ffffffff88075e00 R09: 0000000000000000
   (
  R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
  rcu_read_lock
  R13: ffffffff88007c78 R14: 0000000000000000 R15: 0000000000000000
  ){....}
   arch_safe_halt arch/x86/include/asm/paravirt.h:94 [inline]
   default_idle+0xc2/0x410 arch/x86/kernel/process.c:498
  , at: trace_call_bpf+0xf8/0x640 kernel/trace/bpf_trace.c:46

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 kernel/locking/lockdep.c | 268 +++++++++++++++++++++++++++--------------------
 1 file changed, 155 insertions(+), 113 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 1efada2..fbc127d 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -493,7 +493,8 @@ void get_usage_chars(struct lock_class *class, char usage[LOCK_USAGE_CHARS])
 	usage[i] = '\0';
 }
 
-static void __print_lock_name(struct lock_class *class)
+static void __print_lock_name(struct lock_class *class,
+			      struct printk_buffer *buf)
 {
 	char str[KSYM_NAME_LEN];
 	const char *name;
@@ -501,25 +502,25 @@ static void __print_lock_name(struct lock_class *class)
 	name = class->name;
 	if (!name) {
 		name = __get_key_name(class->key, str);
-		printk(KERN_CONT "%s", name);
+		printk_buffered(buf, "%s", name);
 	} else {
-		printk(KERN_CONT "%s", name);
+		printk_buffered(buf, "%s", name);
 		if (class->name_version > 1)
-			printk(KERN_CONT "#%d", class->name_version);
+			printk_buffered(buf, "#%d", class->name_version);
 		if (class->subclass)
-			printk(KERN_CONT "/%d", class->subclass);
+			printk_buffered(buf, "/%d", class->subclass);
 	}
 }
 
-static void print_lock_name(struct lock_class *class)
+static void print_lock_name(struct lock_class *class, struct printk_buffer *buf)
 {
 	char usage[LOCK_USAGE_CHARS];
 
 	get_usage_chars(class, usage);
 
-	printk(KERN_CONT " (");
-	__print_lock_name(class);
-	printk(KERN_CONT "){%s}", usage);
+	printk_buffered(buf, " (");
+	__print_lock_name(class, buf);
+	printk_buffered(buf, "){%s}", usage);
 }
 
 static void print_lockdep_cache(struct lockdep_map *lock)
@@ -534,8 +535,9 @@ static void print_lockdep_cache(struct lockdep_map *lock)
 	printk(KERN_CONT "%s", name);
 }
 
-static void print_lock(struct held_lock *hlock)
+static void print_lock(struct held_lock *hlock, struct printk_buffer *buf)
 {
+	bool free = false;
 	/*
 	 * We can be called locklessly through debug_show_all_locks() so be
 	 * extra careful, the hlock might have been released and cleared.
@@ -546,17 +548,24 @@ static void print_lock(struct held_lock *hlock)
 	barrier();
 
 	if (!class_idx || (class_idx - 1) >= MAX_LOCKDEP_KEYS) {
-		printk(KERN_CONT "<RELEASED>\n");
+		printk_buffered(buf, "<RELEASED>\n");
 		return;
 	}
 
-	printk(KERN_CONT "%p", hlock->instance);
-	print_lock_name(lock_classes + class_idx - 1);
-	printk(KERN_CONT ", at: %pS\n", (void *)hlock->acquire_ip);
+	if (!buf) {
+		free = true;
+		buf = get_printk_buffer();
+	}
+	printk_buffered(buf, "%p", hlock->instance);
+	print_lock_name(lock_classes + class_idx - 1, buf);
+	printk_buffered(buf, ", at: %pS\n", (void *)hlock->acquire_ip);
+	if (free)
+		put_printk_buffer(buf);
 }
 
 static void lockdep_print_held_locks(struct task_struct *p)
 {
+	struct printk_buffer *buf;
 	int i, depth = READ_ONCE(p->lockdep_depth);
 
 	if (!depth)
@@ -570,10 +579,12 @@ static void lockdep_print_held_locks(struct task_struct *p)
 	 */
 	if (p->state == TASK_RUNNING && p != current)
 		return;
+	buf = get_printk_buffer();
 	for (i = 0; i < depth; i++) {
-		printk(" #%d: ", i);
-		print_lock(p->held_locks + i);
+		printk_buffered(buf, " #%d: ", i);
+		print_lock(p->held_locks + i, buf);
 	}
+	put_printk_buffer(buf);
 }
 
 static void print_kernel_ident(void)
@@ -1083,12 +1094,16 @@ static inline int __bfs_backwards(struct lock_list *src_entry,
 static noinline int
 print_circular_bug_entry(struct lock_list *target, int depth)
 {
+	struct printk_buffer *buf;
+
 	if (debug_locks_silent)
 		return 0;
-	printk("\n-> #%u", depth);
-	print_lock_name(target->class);
-	printk(KERN_CONT ":\n");
+	buf = get_printk_buffer();
+	printk_buffered(buf, "\n-> #%u", depth);
+	print_lock_name(target->class, buf);
+	printk_buffered(buf, ":\n");
 	print_stack_trace(&target->trace, 6);
+	put_printk_buffer(buf);
 
 	return 0;
 }
@@ -1098,6 +1113,7 @@ static inline int __bfs_backwards(struct lock_list *src_entry,
 			     struct held_lock *tgt,
 			     struct lock_list *prt)
 {
+	struct printk_buffer *buf = get_printk_buffer();
 	struct lock_class *source = hlock_class(src);
 	struct lock_class *target = hlock_class(tgt);
 	struct lock_class *parent = prt->class;
@@ -1116,31 +1132,32 @@ static inline int __bfs_backwards(struct lock_list *src_entry,
 	 * from the safe_class lock to the unsafe_class lock.
 	 */
 	if (parent != source) {
-		printk("Chain exists of:\n  ");
-		__print_lock_name(source);
-		printk(KERN_CONT " --> ");
-		__print_lock_name(parent);
-		printk(KERN_CONT " --> ");
-		__print_lock_name(target);
-		printk(KERN_CONT "\n\n");
+		printk_buffered(buf, "Chain exists of:\n  ");
+		__print_lock_name(source, buf);
+		printk_buffered(buf, " --> ");
+		__print_lock_name(parent, buf);
+		printk_buffered(buf, " --> ");
+		__print_lock_name(target, buf);
+		printk_buffered(buf, "\n\n");
 	}
 
 	printk(" Possible unsafe locking scenario:\n\n");
 	printk("       CPU0                    CPU1\n");
 	printk("       ----                    ----\n");
-	printk("  lock(");
-	__print_lock_name(target);
-	printk(KERN_CONT ");\n");
+	printk_buffered(buf, "  lock(");
+	__print_lock_name(target, buf);
+	printk_buffered(buf, ");\n");
 	printk("                               lock(");
-	__print_lock_name(parent);
-	printk(KERN_CONT ");\n");
-	printk("                               lock(");
-	__print_lock_name(target);
-	printk(KERN_CONT ");\n");
-	printk("  lock(");
-	__print_lock_name(source);
-	printk(KERN_CONT ");\n");
+	__print_lock_name(parent, buf);
+	printk_buffered(buf, ");\n");
+	printk_buffered(buf, "                               lock(");
+	__print_lock_name(target, buf);
+	printk_buffered(buf, ");\n");
+	printk_buffered(buf, "  lock(");
+	__print_lock_name(source, buf);
+	printk_buffered(buf, ");\n");
 	printk("\n *** DEADLOCK ***\n\n");
+	put_printk_buffer(buf);
 }
 
 /*
@@ -1164,11 +1181,11 @@ static inline int __bfs_backwards(struct lock_list *src_entry,
 	pr_warn("------------------------------------------------------\n");
 	pr_warn("%s/%d is trying to acquire lock:\n",
 		curr->comm, task_pid_nr(curr));
-	print_lock(check_src);
+	print_lock(check_src, NULL);
 
 	pr_warn("\nbut task is already holding lock:\n");
 
-	print_lock(check_tgt);
+	print_lock(check_tgt, NULL);
 	pr_warn("\nwhich lock already depends on the new lock.\n\n");
 	pr_warn("\nthe existing dependency chain (in reverse order) is:\n");
 
@@ -1387,21 +1404,23 @@ static inline int usage_match(struct lock_list *entry, void *bit)
 
 static void print_lock_class_header(struct lock_class *class, int depth)
 {
+	struct printk_buffer *buf = get_printk_buffer();
 	int bit;
 
-	printk("%*s->", depth, "");
-	print_lock_name(class);
+	printk_buffered(buf, "%*s->", depth, "");
+	print_lock_name(class, buf);
 #ifdef CONFIG_DEBUG_LOCKDEP
-	printk(KERN_CONT " ops: %lu", debug_class_ops_read(class));
+	printk_buffered(buf, " ops: %lu", debug_class_ops_read(class));
 #endif
-	printk(KERN_CONT " {\n");
+	printk_buffered(buf, " {\n");
 
 	for (bit = 0; bit < LOCK_USAGE_STATES; bit++) {
 		if (class->usage_mask & (1 << bit)) {
 			int len = depth;
 
-			len += printk("%*s   %s", depth, "", usage_str[bit]);
-			len += printk(KERN_CONT " at:\n");
+			len += printk_buffered(buf, "%*s   %s", depth, "",
+					       usage_str[bit]);
+			len += printk_buffered(buf, " at:\n");
 			print_stack_trace(class->usage_traces + bit, len);
 		}
 	}
@@ -1409,6 +1428,7 @@ static void print_lock_class_header(struct lock_class *class, int depth)
 
 	printk("%*s ... key      at: [<%px>] %pS\n",
 		depth, "", class->key, class->key);
+	put_printk_buffer(buf);
 }
 
 /*
@@ -1451,6 +1471,7 @@ static void print_lock_class_header(struct lock_class *class, int depth)
 	struct lock_class *safe_class = safe_entry->class;
 	struct lock_class *unsafe_class = unsafe_entry->class;
 	struct lock_class *middle_class = prev_class;
+	struct printk_buffer *buf = get_printk_buffer();
 
 	if (middle_class == safe_class)
 		middle_class = next_class;
@@ -1469,33 +1490,34 @@ static void print_lock_class_header(struct lock_class *class, int depth)
 	 * from the safe_class lock to the unsafe_class lock.
 	 */
 	if (middle_class != unsafe_class) {
-		printk("Chain exists of:\n  ");
-		__print_lock_name(safe_class);
-		printk(KERN_CONT " --> ");
-		__print_lock_name(middle_class);
-		printk(KERN_CONT " --> ");
-		__print_lock_name(unsafe_class);
-		printk(KERN_CONT "\n\n");
+		printk_buffered(buf, "Chain exists of:\n  ");
+		__print_lock_name(safe_class, buf);
+		printk_buffered(buf, " --> ");
+		__print_lock_name(middle_class, buf);
+		printk_buffered(buf, " --> ");
+		__print_lock_name(unsafe_class, buf);
+		printk_buffered(buf, "\n\n");
 	}
 
 	printk(" Possible interrupt unsafe locking scenario:\n\n");
 	printk("       CPU0                    CPU1\n");
 	printk("       ----                    ----\n");
-	printk("  lock(");
-	__print_lock_name(unsafe_class);
-	printk(KERN_CONT ");\n");
+	printk_buffered(buf, "  lock(");
+	__print_lock_name(unsafe_class, buf);
+	printk_buffered(buf, ");\n");
 	printk("                               local_irq_disable();\n");
-	printk("                               lock(");
-	__print_lock_name(safe_class);
-	printk(KERN_CONT ");\n");
-	printk("                               lock(");
-	__print_lock_name(middle_class);
-	printk(KERN_CONT ");\n");
+	printk_buffered(buf, "                               lock(");
+	__print_lock_name(safe_class, buf);
+	printk_buffered(buf, ");\n");
+	printk_buffered(buf, "                               lock(");
+	__print_lock_name(middle_class, buf);
+	printk_buffered(buf, ");\n");
 	printk("  <Interrupt>\n");
-	printk("    lock(");
-	__print_lock_name(safe_class);
-	printk(KERN_CONT ");\n");
+	printk_buffered(buf, "    lock(");
+	__print_lock_name(safe_class, buf);
+	printk_buffered(buf, ");\n");
 	printk("\n *** DEADLOCK ***\n\n");
+	put_printk_buffer(buf);
 }
 
 static int
@@ -1510,9 +1532,12 @@ static void print_lock_class_header(struct lock_class *class, int depth)
 			 enum lock_usage_bit bit2,
 			 const char *irqclass)
 {
+	struct printk_buffer *buf;
+
 	if (!debug_locks_off_graph_unlock() || debug_locks_silent)
 		return 0;
 
+	buf = get_printk_buffer();
 	pr_warn("\n");
 	pr_warn("=====================================================\n");
 	pr_warn("WARNING: %s-safe -> %s-unsafe lock order detected\n",
@@ -1525,26 +1550,26 @@ static void print_lock_class_header(struct lock_class *class, int depth)
 		curr->softirq_context, softirq_count() >> SOFTIRQ_SHIFT,
 		curr->hardirqs_enabled,
 		curr->softirqs_enabled);
-	print_lock(next);
+	print_lock(next, buf);
 
 	pr_warn("\nand this task is already holding:\n");
-	print_lock(prev);
+	print_lock(prev, buf);
 	pr_warn("which would create a new lock dependency:\n");
-	print_lock_name(hlock_class(prev));
-	pr_cont(" ->");
-	print_lock_name(hlock_class(next));
-	pr_cont("\n");
+	print_lock_name(hlock_class(prev), buf);
+	bpr_cont(buf, " ->");
+	print_lock_name(hlock_class(next), buf);
+	bpr_cont(buf, "\n");
 
 	pr_warn("\nbut this new dependency connects a %s-irq-safe lock:\n",
 		irqclass);
-	print_lock_name(backwards_entry->class);
-	pr_warn("\n... which became %s-irq-safe at:\n", irqclass);
+	print_lock_name(backwards_entry->class, buf);
+	bpr_warn(buf, "\n... which became %s-irq-safe at:\n", irqclass);
 
 	print_stack_trace(backwards_entry->class->usage_traces + bit1, 1);
 
 	pr_warn("\nto a %s-irq-unsafe lock:\n", irqclass);
-	print_lock_name(forwards_entry->class);
-	pr_warn("\n... which became %s-irq-unsafe at:\n", irqclass);
+	print_lock_name(forwards_entry->class, buf);
+	bpr_warn(buf, "\n... which became %s-irq-unsafe at:\n", irqclass);
 	pr_warn("...");
 
 	print_stack_trace(forwards_entry->class->usage_traces + bit2, 1);
@@ -1557,18 +1582,20 @@ static void print_lock_class_header(struct lock_class *class, int depth)
 
 	pr_warn("\nthe dependencies between %s-irq-safe lock and the holding lock:\n", irqclass);
 	if (!save_trace(&prev_root->trace))
-		return 0;
+		goto done;
 	print_shortest_lock_dependencies(backwards_entry, prev_root);
 
 	pr_warn("\nthe dependencies between the lock to be acquired");
 	pr_warn(" and %s-irq-unsafe lock:\n", irqclass);
 	if (!save_trace(&next_root->trace))
-		return 0;
+		goto done;
 	print_shortest_lock_dependencies(forwards_entry, next_root);
 
 	pr_warn("\nstack backtrace:\n");
 	dump_stack();
 
+ done:
+	put_printk_buffer(buf);
 	return 0;
 }
 
@@ -1721,18 +1748,20 @@ static inline void inc_chains(void)
 {
 	struct lock_class *next = hlock_class(nxt);
 	struct lock_class *prev = hlock_class(prv);
+	struct printk_buffer *buf = get_printk_buffer();
 
 	printk(" Possible unsafe locking scenario:\n\n");
 	printk("       CPU0\n");
 	printk("       ----\n");
-	printk("  lock(");
-	__print_lock_name(prev);
-	printk(KERN_CONT ");\n");
-	printk("  lock(");
-	__print_lock_name(next);
-	printk(KERN_CONT ");\n");
+	printk_buffered(buf, "  lock(");
+	__print_lock_name(prev, buf);
+	printk_buffered(buf, ");\n");
+	printk_buffered(buf, "  lock(");
+	__print_lock_name(next, buf);
+	printk_buffered(buf, ");\n");
 	printk("\n *** DEADLOCK ***\n\n");
 	printk(" May be due to missing lock nesting notation\n\n");
+	put_printk_buffer(buf);
 }
 
 static int
@@ -1749,9 +1778,9 @@ static inline void inc_chains(void)
 	pr_warn("--------------------------------------------\n");
 	pr_warn("%s/%d is trying to acquire lock:\n",
 		curr->comm, task_pid_nr(curr));
-	print_lock(next);
+	print_lock(next, NULL);
 	pr_warn("\nbut task is already holding lock:\n");
-	print_lock(prev);
+	print_lock(prev, NULL);
 
 	pr_warn("\nother info that might help us debug this:\n");
 	print_deadlock_scenario(next, prev);
@@ -2048,18 +2077,20 @@ static inline int get_first_held_lock(struct task_struct *curr,
 /*
  * Returns the next chain_key iteration
  */
-static u64 print_chain_key_iteration(int class_idx, u64 chain_key)
+static u64 print_chain_key_iteration(int class_idx, u64 chain_key,
+				     struct printk_buffer *buf)
 {
 	u64 new_chain_key = iterate_chain_key(chain_key, class_idx);
 
-	printk(" class_idx:%d -> chain_key:%016Lx",
+	printk_buffered(buf, " class_idx:%d -> chain_key:%016Lx",
 		class_idx,
 		(unsigned long long)new_chain_key);
 	return new_chain_key;
 }
 
-static void
-print_chain_keys_held_locks(struct task_struct *curr, struct held_lock *hlock_next)
+static void print_chain_keys_held_locks(struct task_struct *curr,
+					struct held_lock *hlock_next,
+					struct printk_buffer *buf)
 {
 	struct held_lock *hlock;
 	u64 chain_key = 0;
@@ -2069,16 +2100,18 @@ static u64 print_chain_key_iteration(int class_idx, u64 chain_key)
 	printk("depth: %u\n", depth + 1);
 	for (i = get_first_held_lock(curr, hlock_next); i < depth; i++) {
 		hlock = curr->held_locks + i;
-		chain_key = print_chain_key_iteration(hlock->class_idx, chain_key);
+		chain_key = print_chain_key_iteration(hlock->class_idx,
+						      chain_key, buf);
 
-		print_lock(hlock);
+		print_lock(hlock, buf);
 	}
 
-	print_chain_key_iteration(hlock_next->class_idx, chain_key);
-	print_lock(hlock_next);
+	print_chain_key_iteration(hlock_next->class_idx, chain_key, buf);
+	print_lock(hlock_next, buf);
 }
 
-static void print_chain_keys_chain(struct lock_chain *chain)
+static void print_chain_keys_chain(struct lock_chain *chain,
+				   struct printk_buffer *buf)
 {
 	int i;
 	u64 chain_key = 0;
@@ -2087,10 +2120,11 @@ static void print_chain_keys_chain(struct lock_chain *chain)
 	printk("depth: %u\n", chain->depth);
 	for (i = 0; i < chain->depth; i++) {
 		class_id = chain_hlocks[chain->base + i];
-		chain_key = print_chain_key_iteration(class_id + 1, chain_key);
+		chain_key = print_chain_key_iteration(class_id + 1, chain_key,
+						      buf);
 
-		print_lock_name(lock_classes + class_id);
-		printk("\n");
+		print_lock_name(lock_classes + class_id, buf);
+		printk_buffered(buf, "\n");
 	}
 }
 
@@ -2098,6 +2132,8 @@ static void print_collision(struct task_struct *curr,
 			struct held_lock *hlock_next,
 			struct lock_chain *chain)
 {
+	struct printk_buffer *buf = get_printk_buffer();
+
 	pr_warn("\n");
 	pr_warn("============================\n");
 	pr_warn("WARNING: chain_key collision\n");
@@ -2106,14 +2142,15 @@ static void print_collision(struct task_struct *curr,
 	pr_warn("%s/%d: ", current->comm, task_pid_nr(current));
 	pr_warn("Hash chain already cached but the contents don't match!\n");
 
-	pr_warn("Held locks:");
-	print_chain_keys_held_locks(curr, hlock_next);
+	bpr_warn(buf, "Held locks:");
+	print_chain_keys_held_locks(curr, hlock_next, buf);
 
-	pr_warn("Locks in cached chain:");
-	print_chain_keys_chain(chain);
+	bpr_warn(buf, "Locks in cached chain:");
+	print_chain_keys_chain(chain, buf);
 
 	pr_warn("\nstack backtrace:\n");
 	dump_stack();
+	put_printk_buffer(buf);
 }
 #endif
 
@@ -2421,18 +2458,20 @@ static void check_chain_key(struct task_struct *curr)
 print_usage_bug_scenario(struct held_lock *lock)
 {
 	struct lock_class *class = hlock_class(lock);
+	struct printk_buffer *buf = get_printk_buffer();
 
 	printk(" Possible unsafe locking scenario:\n\n");
 	printk("       CPU0\n");
 	printk("       ----\n");
-	printk("  lock(");
-	__print_lock_name(class);
-	printk(KERN_CONT ");\n");
+	printk_buffered(buf, "  lock(");
+	__print_lock_name(class, buf);
+	printk_buffered(buf, ");\n");
 	printk("  <Interrupt>\n");
-	printk("    lock(");
-	__print_lock_name(class);
-	printk(KERN_CONT ");\n");
+	printk_buffered(buf, "    lock(");
+	__print_lock_name(class, buf);
+	printk_buffered(buf, ");\n");
 	printk("\n *** DEADLOCK ***\n\n");
+	put_printk_buffer(buf);
 }
 
 static int
@@ -2457,7 +2496,7 @@ static void check_chain_key(struct task_struct *curr)
 		trace_softirq_context(curr), softirq_count() >> SOFTIRQ_SHIFT,
 		trace_hardirqs_enabled(curr),
 		trace_softirqs_enabled(curr));
-	print_lock(this);
+	print_lock(this, NULL);
 
 	pr_warn("{%s} state was registered at:\n", usage_str[prev_bit]);
 	print_stack_trace(hlock_class(this)->usage_traces + prev_bit, 1);
@@ -2500,6 +2539,7 @@ static int mark_lock(struct task_struct *curr, struct held_lock *this,
 			struct held_lock *this, int forwards,
 			const char *irqclass)
 {
+	struct printk_buffer *buf;
 	struct lock_list *entry = other;
 	struct lock_list *middle = NULL;
 	int depth;
@@ -2514,13 +2554,15 @@ static int mark_lock(struct task_struct *curr, struct held_lock *this,
 	pr_warn("--------------------------------------------------------\n");
 	pr_warn("%s/%d just changed the state of lock:\n",
 		curr->comm, task_pid_nr(curr));
-	print_lock(this);
+	print_lock(this, NULL);
 	if (forwards)
 		pr_warn("but this lock took another, %s-unsafe lock in the past:\n", irqclass);
 	else
 		pr_warn("but this lock was taken by another, %s-safe lock in the past:\n", irqclass);
-	print_lock_name(other->class);
-	pr_warn("\n\nand interrupts could create inverse lock ordering between them.\n\n");
+	buf = get_printk_buffer();
+	print_lock_name(other->class, buf);
+	bpr_warn(buf, "\n\nand interrupts could create inverse lock ordering between them.\n\n");
+	put_printk_buffer(buf);
 
 	pr_warn("\nother info that might help us debug this:\n");
 
@@ -3077,7 +3119,7 @@ static int mark_lock(struct task_struct *curr, struct held_lock *this,
 	 */
 	if (ret == 2) {
 		printk("\nmarked lock as {%s}:\n", usage_str[new_bit]);
-		print_lock(this);
+		print_lock(this, NULL);
 		print_irqtrace_events(curr);
 		dump_stack();
 	}
@@ -3172,7 +3214,7 @@ void lockdep_init_map(struct lockdep_map *lock, const char *name,
 	pr_warn("----------------------------------\n");
 
 	pr_warn("%s/%d is trying to lock:\n", curr->comm, task_pid_nr(curr));
-	print_lock(hlock);
+	print_lock(hlock, NULL);
 
 	pr_warn("\nbut this task is not holding:\n");
 	pr_warn("%s\n", hlock->nest_lock->name);
@@ -4300,7 +4342,7 @@ void __init lockdep_init(void)
 	pr_warn("-------------------------\n");
 	pr_warn("%s/%d is freeing memory %px-%px, with a lock still held there!\n",
 		curr->comm, task_pid_nr(curr), mem_from, mem_to-1);
-	print_lock(hlock);
+	print_lock(hlock, NULL);
 	lockdep_print_held_locks(curr);
 
 	pr_warn("\nstack backtrace:\n");
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* Re: [PATCH 3/3] lockdep: Use line-buffered printk() for lockdep messages.
  2018-11-02 13:31 ` [PATCH 3/3] lockdep: Use line-buffered printk() for lockdep messages Tetsuo Handa
@ 2018-11-02 13:36   ` Peter Zijlstra
  2018-11-03  2:00     ` Tetsuo Handa
  2018-11-06  8:38     ` Petr Mladek
  2018-11-07 15:19   ` Petr Mladek
  1 sibling, 2 replies; 54+ messages in thread
From: Peter Zijlstra @ 2018-11-02 13:36 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Petr Mladek, Sergey Senozhatsky, Sergey Senozhatsky,
	Dmitriy Vyukov, Steven Rostedt, Alexander Potapenko,
	Fengguang Wu, Josh Poimboeuf, LKML, Linus Torvalds,
	Andrew Morton, linux-mm, Ingo Molnar, Will Deacon

On Fri, Nov 02, 2018 at 10:31:57PM +0900, Tetsuo Handa wrote:
> syzbot is sometimes getting mixed output like below due to concurrent
> printk(). Mitigate such output by using line-buffered printk() API.
> 
>   RCU used illegally from idle CPU!
>   rcu_scheduler_active = 2, debug_locks = 1
>   RSP: 0018:ffffffff88007bb8 EFLAGS: 00000286
>   RCU used illegally from extended quiescent state!
>    ORIG_RAX: ffffffffffffff13
>   1 lock held by swapper/1/0:
>   RAX: dffffc0000000000 RBX: 1ffffffff1000f7b RCX: 0000000000000000
>    #0: 
>   RDX: 1ffffffff10237b8 RSI: 0000000000000001 RDI: ffffffff8811bdc0
>   000000004b34587c
>   RBP: ffffffff88007bb8 R08: ffffffff88075e00 R09: 0000000000000000
>    (
>   R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
>   rcu_read_lock
>   R13: ffffffff88007c78 R14: 0000000000000000 R15: 0000000000000000
>   ){....}
>    arch_safe_halt arch/x86/include/asm/paravirt.h:94 [inline]
>    default_idle+0xc2/0x410 arch/x86/kernel/process.c:498
>   , at: trace_call_bpf+0xf8/0x640 kernel/trace/bpf_trace.c:46

WTH is that buffered aPI, and no, that breaks my earlyprintk stuff.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v6 1/3] printk: Add line-buffered printk() API.
  2018-11-02 13:31 [PATCH v6 1/3] printk: Add line-buffered printk() API Tetsuo Handa
  2018-11-02 13:31 ` [PATCH 2/3] mm: Use line-buffered printk() for show_free_areas() Tetsuo Handa
  2018-11-02 13:31 ` [PATCH 3/3] lockdep: Use line-buffered printk() for lockdep messages Tetsuo Handa
@ 2018-11-02 14:40 ` Matthew Wilcox
  2018-11-03  1:55   ` Tetsuo Handa
  2018-11-02 18:12 ` kbuild test robot
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 54+ messages in thread
From: Matthew Wilcox @ 2018-11-02 14:40 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Petr Mladek, Sergey Senozhatsky, Sergey Senozhatsky,
	Dmitriy Vyukov, Steven Rostedt, Alexander Potapenko,
	Fengguang Wu, Josh Poimboeuf, LKML, Linus Torvalds,
	Andrew Morton, linux-mm, Ingo Molnar, Peter Zijlstra,
	Will Deacon

On Fri, Nov 02, 2018 at 10:31:55PM +0900, Tetsuo Handa wrote:
>   get_printk_buffer() tries to assign a "struct printk_buffer" from
>   statically preallocated array. get_printk_buffer() returns NULL if
>   all "struct printk_buffer" are in use, but the caller does not need to
>   check for NULL.

This seems like a great way of wasting 16kB of memory.  Since you've
already made printk_buffered() work with a NULL initial argument, what's
the advantage over just doing kmalloc(1024, GFP_ATOMIC)?

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v6 1/3] printk: Add line-buffered printk() API.
  2018-11-02 13:31 [PATCH v6 1/3] printk: Add line-buffered printk() API Tetsuo Handa
                   ` (2 preceding siblings ...)
  2018-11-02 14:40 ` [PATCH v6 1/3] printk: Add line-buffered printk() API Matthew Wilcox
@ 2018-11-02 18:12 ` kbuild test robot
  2018-11-06 14:35 ` Sergey Senozhatsky
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 54+ messages in thread
From: kbuild test robot @ 2018-11-02 18:12 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: kbuild-all, Petr Mladek, Sergey Senozhatsky, Sergey Senozhatsky,
	Dmitriy Vyukov, Steven Rostedt, Alexander Potapenko,
	Fengguang Wu, Josh Poimboeuf, LKML, Linus Torvalds,
	Andrew Morton, linux-mm, Ingo Molnar, Peter Zijlstra,
	Will Deacon

[-- Attachment #1: Type: text/plain, Size: 1431 bytes --]

Hi Tetsuo,

I love your patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v4.19 next-20181102]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Tetsuo-Handa/printk-Add-line-buffered-printk-API/20181103-010234
config: i386-randconfig-x002-201843 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

All errors (new ones prefixed by >>):

   In file included from kernel/printk/printk_buffered.c:5:0:
>> kernel/printk/internal.h:28:8: error: unknown type name 'raw_spinlock_t'
    extern raw_spinlock_t logbuf_lock;
           ^~~~~~~~~~~~~~

vim +/raw_spinlock_t +28 kernel/printk/internal.h

f956d1a3 Tetsuo Handa       2018-11-02  27  
099f1c84 Sergey Senozhatsky 2016-12-27 @28  extern raw_spinlock_t logbuf_lock;
099f1c84 Sergey Senozhatsky 2016-12-27  29  

:::::: The code at line 28 was first introduced by commit
:::::: 099f1c84c0052ec1b27f1c3942eed5830d86bdbb printk: introduce per-cpu safe_print seq buffer

:::::: TO: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
:::::: CC: Petr Mladek <pmladek@suse.com>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 31251 bytes --]

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v6 1/3] printk: Add line-buffered printk() API.
  2018-11-02 14:40 ` [PATCH v6 1/3] printk: Add line-buffered printk() API Matthew Wilcox
@ 2018-11-03  1:55   ` Tetsuo Handa
  0 siblings, 0 replies; 54+ messages in thread
From: Tetsuo Handa @ 2018-11-03  1:55 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Petr Mladek, Sergey Senozhatsky, Sergey Senozhatsky,
	Dmitriy Vyukov, Steven Rostedt, Alexander Potapenko,
	Fengguang Wu, Josh Poimboeuf, LKML, Linus Torvalds,
	Andrew Morton, linux-mm, Ingo Molnar, Peter Zijlstra,
	Will Deacon

On 2018/11/02 23:40, Matthew Wilcox wrote:
> On Fri, Nov 02, 2018 at 10:31:55PM +0900, Tetsuo Handa wrote:
>>   get_printk_buffer() tries to assign a "struct printk_buffer" from
>>   statically preallocated array. get_printk_buffer() returns NULL if
>>   all "struct printk_buffer" are in use, but the caller does not need to
>>   check for NULL.
> 
> This seems like a great way of wasting 16kB of memory.  Since you've
> already made printk_buffered() work with a NULL initial argument, what's
> the advantage over just doing kmalloc(1024, GFP_ATOMIC)?

Like "[PATCH 2/3] mm: Use line-buffered printk() for show_free_areas()."
demonstrates, kzalloc(sizeof(struct printk_buffer), GFP_ATOMIC) can fail.

And using statically preallocated buffers helps avoiding

  (1) out of buffers when memory cannot be allocated

  (2) kernel stack overflow when kernel stack is already tight (e.g.
      a memory allocation attempt from an interrupt handler which was
      invoked from deep inside call chain of a process context)

. Whether

  (A) tuning the number of statically preallocated buffers

  (B) allocating buffers on caller side (e.g. kzalloc() or in .bss section)

are useful is a future decision, for too much concurrent printk() will lockup
the system even if there are enough buffers. I think that starting with
statically preallocated buffers is (at least for now) a good choice for
minimizing risk of (1) (2) while offering practically acceptable result.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 3/3] lockdep: Use line-buffered printk() for lockdep messages.
  2018-11-02 13:36   ` Peter Zijlstra
@ 2018-11-03  2:00     ` Tetsuo Handa
  2018-11-06  8:38     ` Petr Mladek
  1 sibling, 0 replies; 54+ messages in thread
From: Tetsuo Handa @ 2018-11-03  2:00 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Petr Mladek, Sergey Senozhatsky, Sergey Senozhatsky,
	Dmitriy Vyukov, Steven Rostedt, Alexander Potapenko,
	Fengguang Wu, Josh Poimboeuf, LKML, Linus Torvalds,
	Andrew Morton, linux-mm, Ingo Molnar, Will Deacon

On 2018/11/02 22:36, Peter Zijlstra wrote:
> On Fri, Nov 02, 2018 at 10:31:57PM +0900, Tetsuo Handa wrote:
>> syzbot is sometimes getting mixed output like below due to concurrent
>> printk(). Mitigate such output by using line-buffered printk() API.
>>
>>   RCU used illegally from idle CPU!
>>   rcu_scheduler_active = 2, debug_locks = 1
>>   RSP: 0018:ffffffff88007bb8 EFLAGS: 00000286
>>   RCU used illegally from extended quiescent state!
>>    ORIG_RAX: ffffffffffffff13
>>   1 lock held by swapper/1/0:
>>   RAX: dffffc0000000000 RBX: 1ffffffff1000f7b RCX: 0000000000000000
>>    #0: 
>>   RDX: 1ffffffff10237b8 RSI: 0000000000000001 RDI: ffffffff8811bdc0
>>   000000004b34587c
>>   RBP: ffffffff88007bb8 R08: ffffffff88075e00 R09: 0000000000000000
>>    (
>>   R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
>>   rcu_read_lock
>>   R13: ffffffff88007c78 R14: 0000000000000000 R15: 0000000000000000
>>   ){....}
>>    arch_safe_halt arch/x86/include/asm/paravirt.h:94 [inline]
>>    default_idle+0xc2/0x410 arch/x86/kernel/process.c:498
>>   , at: trace_call_bpf+0xf8/0x640 kernel/trace/bpf_trace.c:46
> 
> WTH is that buffered aPI, and no, that breaks my earlyprintk stuff.
> 

This API is nothing but a wrapper for reducing frequency of directly
calling printk() by using snprintf() if possible. Thus, whatever your
earlyprintk stuff is, this API should not affect it.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 3/3] lockdep: Use line-buffered printk() for lockdep messages.
  2018-11-02 13:36   ` Peter Zijlstra
  2018-11-03  2:00     ` Tetsuo Handa
@ 2018-11-06  8:38     ` Petr Mladek
  2018-11-06  9:05       ` Sergey Senozhatsky
  2018-11-06  9:56       ` Tetsuo Handa
  1 sibling, 2 replies; 54+ messages in thread
From: Petr Mladek @ 2018-11-06  8:38 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Tetsuo Handa, Sergey Senozhatsky, Sergey Senozhatsky,
	Dmitriy Vyukov, Steven Rostedt, Alexander Potapenko,
	Fengguang Wu, Josh Poimboeuf, LKML, Linus Torvalds,
	Andrew Morton, linux-mm, Ingo Molnar, Will Deacon

On Fri 2018-11-02 14:36:29, Peter Zijlstra wrote:
> On Fri, Nov 02, 2018 at 10:31:57PM +0900, Tetsuo Handa wrote:
> > syzbot is sometimes getting mixed output like below due to concurrent
> > printk(). Mitigate such output by using line-buffered printk() API.
> > 
> >   RCU used illegally from idle CPU!
> >   rcu_scheduler_active = 2, debug_locks = 1
> >   RSP: 0018:ffffffff88007bb8 EFLAGS: 00000286
> >   RCU used illegally from extended quiescent state!
> >    ORIG_RAX: ffffffffffffff13
> >   1 lock held by swapper/1/0:
> >   RAX: dffffc0000000000 RBX: 1ffffffff1000f7b RCX: 0000000000000000
> >    #0: 
> >   RDX: 1ffffffff10237b8 RSI: 0000000000000001 RDI: ffffffff8811bdc0
> >   000000004b34587c
> >   RBP: ffffffff88007bb8 R08: ffffffff88075e00 R09: 0000000000000000
> >    (
> >   R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> >   rcu_read_lock
> >   R13: ffffffff88007c78 R14: 0000000000000000 R15: 0000000000000000
> >   ){....}
> >    arch_safe_halt arch/x86/include/asm/paravirt.h:94 [inline]
> >    default_idle+0xc2/0x410 arch/x86/kernel/process.c:498
> >   , at: trace_call_bpf+0xf8/0x640 kernel/trace/bpf_trace.c:46
> 
> WTH is that buffered aPI, and no, that breaks my earlyprintk stuff.

The API is supposed to replace the existing unreliable and tricky code,
see struct cont and KERN_CONT flag in kernel/printk/printk.c.

The solution has been discussed many times. The preferred solution
is to use explicit buffers instead of having many
per-cpu/per-context ones and auto magically switching between them.

The original idea was to use buffers on stack. But the stack is
limited and people would need to guess the length.

This current approach uses small pool of buffers and simple logic
to get/put one. If any buffer is not available, it falls back
to direct printk(), see
https://lkml.kernel.org/r/1541165517-3557-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp

If you would want to avoid buffering, you could set the number
of buffers to zero. Then it would always fallback to
the direct printk().

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 3/3] lockdep: Use line-buffered printk() for lockdep messages.
  2018-11-06  8:38     ` Petr Mladek
@ 2018-11-06  9:05       ` Sergey Senozhatsky
  2018-11-06 12:57         ` Petr Mladek
  2018-11-06  9:56       ` Tetsuo Handa
  1 sibling, 1 reply; 54+ messages in thread
From: Sergey Senozhatsky @ 2018-11-06  9:05 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Peter Zijlstra, Tetsuo Handa, Sergey Senozhatsky,
	Sergey Senozhatsky, Dmitriy Vyukov, Steven Rostedt,
	Alexander Potapenko, Fengguang Wu, Josh Poimboeuf, LKML,
	Linus Torvalds, Andrew Morton, linux-mm, Ingo Molnar,
	Will Deacon

On (11/06/18 09:38), Petr Mladek wrote:
> 
> If you would want to avoid buffering, you could set the number
> of buffers to zero. Then it would always fallback to
> the direct printk().

This printk-fallback makes me wonder if 'cont' really can ever go away.
We would totally break cont printk-s that trapped into printk-fallback;
as opposed to current sometimes-cont-works-just-fine.

	-ss

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 3/3] lockdep: Use line-buffered printk() for lockdep messages.
  2018-11-06  8:38     ` Petr Mladek
  2018-11-06  9:05       ` Sergey Senozhatsky
@ 2018-11-06  9:56       ` Tetsuo Handa
  1 sibling, 0 replies; 54+ messages in thread
From: Tetsuo Handa @ 2018-11-06  9:56 UTC (permalink / raw)
  To: Petr Mladek, Peter Zijlstra
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Dmitriy Vyukov,
	Steven Rostedt, Alexander Potapenko, Fengguang Wu,
	Josh Poimboeuf, LKML, Linus Torvalds, Andrew Morton, linux-mm,
	Ingo Molnar, Will Deacon

On 2018/11/06 17:38, Petr Mladek wrote:
> If you would want to avoid buffering, you could set the number
> of buffers to zero. Then it would always fallback to
> the direct printk().

  1 lock held by swapper/1/0:
   #0: 
   (
  rcu_read_lock
  ){....}
  , at: trace_call_bpf+0xf8/0x640 kernel/trace/bpf_trace.c:46

is not welcomed and

  1 lock held by swapper/1/0:
   #0:  (rcu_read_lock){....}, at: trace_call_bpf+0xf8/0x640 kernel/trace/bpf_trace.c:46

is welcomed.

If you want to avoid fallback to direct printk(), please allocate on-stack
buffer with appropriate size. Since lockdep splat may happen when kernel
stack is already tight, blindly allocating large buffer on the stack is
not good.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 3/3] lockdep: Use line-buffered printk() for lockdep messages.
  2018-11-06  9:05       ` Sergey Senozhatsky
@ 2018-11-06 12:57         ` Petr Mladek
  0 siblings, 0 replies; 54+ messages in thread
From: Petr Mladek @ 2018-11-06 12:57 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Peter Zijlstra, Tetsuo Handa, Sergey Senozhatsky, Dmitriy Vyukov,
	Steven Rostedt, Alexander Potapenko, Fengguang Wu,
	Josh Poimboeuf, LKML, Linus Torvalds, Andrew Morton, linux-mm,
	Ingo Molnar, Will Deacon

On Tue 2018-11-06 18:05:44, Sergey Senozhatsky wrote:
> On (11/06/18 09:38), Petr Mladek wrote:
> > 
> > If you would want to avoid buffering, you could set the number
> > of buffers to zero. Then it would always fallback to
> > the direct printk().

This comment was a hint for Peter and his workarounds. He ignores most
of printk() code and writes messages directly to the serial console.


> This printk-fallback makes me wonder if 'cont' really can ever go away.
> We would totally break cont printk-s that trapped into printk-fallback;
> as opposed to current sometimes-cont-works-just-fine.

It could break things totally only when the new approach completely
fails. I you have any doubts or suggestions then please comment on
the patch adding the API.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v6 1/3] printk: Add line-buffered printk() API.
  2018-11-02 13:31 [PATCH v6 1/3] printk: Add line-buffered printk() API Tetsuo Handa
                   ` (3 preceding siblings ...)
  2018-11-02 18:12 ` kbuild test robot
@ 2018-11-06 14:35 ` Sergey Senozhatsky
  2018-11-07 10:21   ` Petr Mladek
  2018-11-07 10:52   ` Tetsuo Handa
  2018-11-07 13:41 ` Petr Mladek
  2018-11-07 14:06 ` Petr Mladek
  6 siblings, 2 replies; 54+ messages in thread
From: Sergey Senozhatsky @ 2018-11-06 14:35 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Petr Mladek, Sergey Senozhatsky, Sergey Senozhatsky,
	Dmitriy Vyukov, Steven Rostedt, Alexander Potapenko,
	Fengguang Wu, Josh Poimboeuf, LKML, Linus Torvalds,
	Andrew Morton, linux-mm, Ingo Molnar, Peter Zijlstra,
	Will Deacon

On (11/02/18 22:31), Tetsuo Handa wrote:
>   (1) Call get_printk_buffer() and acquire "struct printk_buffer *".
> 
>   (2) Rewrite printk() calls in the following way. The "ptr" is
>       "struct printk_buffer *" obtained in step (1).
> 
>       printk(fmt, ...)     => printk_buffered(ptr, fmt, ...)
>       vprintk(fmt, args)   => vprintk_buffered(ptr, fmt, args)
>       pr_emerg(fmt, ...)   => bpr_emerg(ptr, fmt, ...)
>       pr_alert(fmt, ...)   => bpr_alert(ptr, fmt, ...)
>       pr_crit(fmt, ...)    => bpr_crit(ptr, fmt, ...)
>       pr_err(fmt, ...)     => bpr_err(ptr, fmt, ...)
>       pr_warning(fmt, ...) => bpr_warning(ptr, fmt, ...)
>       pr_warn(fmt, ...)    => bpr_warn(ptr, fmt, ...)
>       pr_notice(fmt, ...)  => bpr_notice(ptr, fmt, ...)
>       pr_info(fmt, ...)    => bpr_info(ptr, fmt, ...)
>       pr_cont(fmt, ...)    => bpr_cont(ptr, fmt, ...)
> 
>   (3) Release "struct printk_buffer" by calling put_printk_buffer().

[..]

> Since we want to remove "struct cont" eventually, we will try to remove
> both "implicit printk() users who are expecting KERN_CONT behavior" and
> "explicit pr_cont()/printk(KERN_CONT) users". Therefore, converting to
> this API is recommended.

- The printk-fallback sounds like a hint that the existing 'cont' handling
  better stay in the kernel. I don't see how the existing 'cont' is
  significantly worse than
		bpr_warn(NULL, ...)->printk() // no 'cont' support
  I don't see why would we want to do it, sorry. I don't see "it takes 16
  printk-buffers to make a thing go right" as a sure thing.

A question.

How bad would it actually be to:

- Allocate seq_buf 512-bytes buffer (GFP_ATOMIC) just-in-time, when we
  need it.
    // How often systems cannot allocate a 512-byte buffer? //

- OK, assuming that systems around the world are so badly OOM like all the
  time and even kmalloc(512) is absolutely impossible, then have a fallback
  to the existing 'cont' handling; it just looks to me better than a plain
  printk()-fallback with removed 'cont' support.

- Do not allocate seq_buf if we are in printk-safe or in printk-nmi mode.
  To avoid "buffering for the sake of buffering". IOW, when in printk-safe
  use printk-safe.

	-ss

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v6 1/3] printk: Add line-buffered printk() API.
  2018-11-06 14:35 ` Sergey Senozhatsky
@ 2018-11-07 10:21   ` Petr Mladek
  2018-11-07 12:54     ` Tetsuo Handa
  2018-11-08  2:21     ` Sergey Senozhatsky
  2018-11-07 10:52   ` Tetsuo Handa
  1 sibling, 2 replies; 54+ messages in thread
From: Petr Mladek @ 2018-11-07 10:21 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Tetsuo Handa, Sergey Senozhatsky, Dmitriy Vyukov, Steven Rostedt,
	Alexander Potapenko, Fengguang Wu, Josh Poimboeuf, LKML,
	Linus Torvalds, Andrew Morton, linux-mm, Ingo Molnar,
	Peter Zijlstra, Will Deacon

On Tue 2018-11-06 23:35:02, Sergey Senozhatsky wrote:
> On (11/02/18 22:31), Tetsuo Handa wrote:
> >   (1) Call get_printk_buffer() and acquire "struct printk_buffer *".
> > 
> >   (2) Rewrite printk() calls in the following way. The "ptr" is
> >       "struct printk_buffer *" obtained in step (1).
> > 
> >       printk(fmt, ...)     => printk_buffered(ptr, fmt, ...)
> >       vprintk(fmt, args)   => vprintk_buffered(ptr, fmt, args)
> >       pr_emerg(fmt, ...)   => bpr_emerg(ptr, fmt, ...)
> >       pr_alert(fmt, ...)   => bpr_alert(ptr, fmt, ...)
> >       pr_crit(fmt, ...)    => bpr_crit(ptr, fmt, ...)
> >       pr_err(fmt, ...)     => bpr_err(ptr, fmt, ...)
> >       pr_warning(fmt, ...) => bpr_warning(ptr, fmt, ...)
> >       pr_warn(fmt, ...)    => bpr_warn(ptr, fmt, ...)
> >       pr_notice(fmt, ...)  => bpr_notice(ptr, fmt, ...)
> >       pr_info(fmt, ...)    => bpr_info(ptr, fmt, ...)
> >       pr_cont(fmt, ...)    => bpr_cont(ptr, fmt, ...)
> > 
> >   (3) Release "struct printk_buffer" by calling put_printk_buffer().
> 
> [..]
> 
> > Since we want to remove "struct cont" eventually, we will try to remove
> > both "implicit printk() users who are expecting KERN_CONT behavior" and
> > "explicit pr_cont()/printk(KERN_CONT) users". Therefore, converting to
> > this API is recommended.
> 
> - The printk-fallback sounds like a hint that the existing 'cont' handling
>   better stay in the kernel. I don't see how the existing 'cont' is
>   significantly worse than
> 		bpr_warn(NULL, ...)->printk() // no 'cont' support
>   I don't see why would we want to do it, sorry. I don't see "it takes 16
>   printk-buffers to make a thing go right" as a sure thing.

I see it the following way:

   + mixed cont lines are very rare but they happen

   + 16 buffers are more than 1 so it could only be better [*]

   + the printk_buffer() code is self-contained and does not
     complicate the logic of the classic printk() code [**]


[*] A missing put_printk_buffer() might cause that we would get
    out of buffers. But the same problem is with locks,
    disabled preemption, disabled interrupts, seq_buffer,
    alloc/free. Such problems happen but they are rare.

    Also I do not expect that the same buffer would be shared
    between many functions. Therefore it should be easy
    to use it correctly.


[**] I admit that cont buffer implementation is much easier
     after removing the early flush to consoles but still...


Anyway, I do not think that both implementations are worth it.
We could keep both for some transition period but we should
remove the old one later.


> A question.
> 
> How bad would it actually be to:
> 
> - Allocate seq_buf 512-bytes buffer (GFP_ATOMIC) just-in-time, when we
>   need it.
>     // How often systems cannot allocate a 512-byte buffer? //
> 
> - OK, assuming that systems around the world are so badly OOM like all the
>   time and even kmalloc(512) is absolutely impossible, then have a fallback
>   to the existing 'cont' handling; it just looks to me better than a plain
>   printk()-fallback with removed 'cont' support.

This would prevent removing the fallback to struct cont. OOM is
one important scenario where continuous lines are used.


> - Do not allocate seq_buf if we are in printk-safe or in printk-nmi mode.
>   To avoid "buffering for the sake of buffering". IOW, when in printk-safe
>   use printk-safe.

Sure, my plan is to add a helper function is_buffered_printk_context() or so
that would check printk_context. Then we could do the following in
vprintk_buffered()

	if (is_buffered_printk_context())
		vprintk_func(....);

It might be added on top of the current patchset. I opened this
problem once and it got lost. So I did not want to complicate
it at this moment.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v6 1/3] printk: Add line-buffered printk() API.
  2018-11-06 14:35 ` Sergey Senozhatsky
  2018-11-07 10:21   ` Petr Mladek
@ 2018-11-07 10:52   ` Tetsuo Handa
  2018-11-07 11:01     ` David Laight
                       ` (2 more replies)
  1 sibling, 3 replies; 54+ messages in thread
From: Tetsuo Handa @ 2018-11-07 10:52 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Petr Mladek, Sergey Senozhatsky, Dmitriy Vyukov, Steven Rostedt,
	Alexander Potapenko, Fengguang Wu, Josh Poimboeuf, LKML,
	Linus Torvalds, Andrew Morton, linux-mm, Ingo Molnar,
	Peter Zijlstra, Will Deacon

On 2018/11/06 23:35, Sergey Senozhatsky wrote:
>> Since we want to remove "struct cont" eventually, we will try to remove
>> both "implicit printk() users who are expecting KERN_CONT behavior" and
>> "explicit pr_cont()/printk(KERN_CONT) users". Therefore, converting to
>> this API is recommended.
> 
> - The printk-fallback sounds like a hint that the existing 'cont' handling
>   better stay in the kernel. I don't see how the existing 'cont' is
>   significantly worse than
> 		bpr_warn(NULL, ...)->printk() // no 'cont' support
>   I don't see why would we want to do it, sorry. I don't see "it takes 16
>   printk-buffers to make a thing go right" as a sure thing.

Existing 'cont' handling will stay for a while. After majority of
pr_cont()/KERN_CONT users are converted, 'cont' support will be removed
(e.g. KERN_CONT becomes "").

> 
> A question.
> 
> How bad would it actually be to:
> 
> - Allocate seq_buf 512-bytes buffer (GFP_ATOMIC) just-in-time, when we
>   need it.
>     // How often systems cannot allocate a 512-byte buffer? //

It is a very bad thing to do GFP_ATOMIC without __GFP_NOWARN. See
"[PATCH 2/3] mm: Use line-buffered printk() for show_free_areas()."
which helps exactly when GFP_ATOMIC without __GFP_NOWARN failed.
Without __GFP_NOWARN, GFP_ATOMIC for printk() can trigger infinite
recursion and kernel stack overflow.

Even without recursion, doing kmalloc(GFP_ATOMIC | __GFP_NOWARN) temporarily
consumes some kernel stack. I don't know the exact amount needed for
kmalloc(GFP_ATOMIC | __GFP_NOWARN), but it might still emit memory allocation
fault injection messages. What GFP_ATOMIC can guarantee is nothing but
"it does not sleep". Not suitable for printk() which might be called from
critically dangerous situations.

> 
> - OK, assuming that systems around the world are so badly OOM like all the
>   time and even kmalloc(512) is absolutely impossible, then have a fallback
>   to the existing 'cont' handling; it just looks to me better than a plain
>   printk()-fallback with removed 'cont' support.

Since I want to eventually remove 'cont' support inside printk(),
I dropped KERN_CONT in patch [2/3] and [3/3].

> 
> - Do not allocate seq_buf if we are in printk-safe or in printk-nmi mode.
>   To avoid "buffering for the sake of buffering". IOW, when in printk-safe
>   use printk-safe.

Why? Since printk_safe_flush_buffer() forcibly flushes the partial line,
calling printk_safe_log_store() after line buffering can reduce possibility of
flushing partial lines, can't it?

^ permalink raw reply	[flat|nested] 54+ messages in thread

* RE: [PATCH v6 1/3] printk: Add line-buffered printk() API.
  2018-11-07 10:52   ` Tetsuo Handa
@ 2018-11-07 11:01     ` David Laight
  2018-11-07 12:00       ` Petr Mladek
  2018-11-07 11:45     ` Petr Mladek
  2018-11-08  2:30     ` Sergey Senozhatsky
  2 siblings, 1 reply; 54+ messages in thread
From: David Laight @ 2018-11-07 11:01 UTC (permalink / raw)
  To: 'Tetsuo Handa', Sergey Senozhatsky
  Cc: Petr Mladek, Sergey Senozhatsky, Dmitriy Vyukov, Steven Rostedt,
	Alexander Potapenko, Fengguang Wu, Josh Poimboeuf, LKML,
	Linus Torvalds, Andrew Morton, linux-mm, Ingo Molnar,
	Peter Zijlstra, Will Deacon

From: Tetsuo Handa
> Sent: 07 November 2018 10:53
> 
> On 2018/11/06 23:35, Sergey Senozhatsky wrote:
> >> Since we want to remove "struct cont" eventually, we will try to remove
> >> both "implicit printk() users who are expecting KERN_CONT behavior" and
> >> "explicit pr_cont()/printk(KERN_CONT) users". Therefore, converting to
> >> this API is recommended.
> >
> > - The printk-fallback sounds like a hint that the existing 'cont' handling
> >   better stay in the kernel. I don't see how the existing 'cont' is
> >   significantly worse than
> > 		bpr_warn(NULL, ...)->printk() // no 'cont' support
> >   I don't see why would we want to do it, sorry. I don't see "it takes 16
> >   printk-buffers to make a thing go right" as a sure thing.
> 
> Existing 'cont' handling will stay for a while. After majority of
> pr_cont()/KERN_CONT users are converted, 'cont' support will be removed
> (e.g. KERN_CONT becomes "").

A though:

Why not make the printf lock slightly 'sticky'?
- If the output line is incomplete save the cpuid.
- If there is a saved cpuid that doesn't match the current cpu then spin for a bit.

Any callers of printk() have to assume they will spin on the buffer for the
longest printk formatting (and symbol lookup might take a while) so a short
additional delay won't matter.

Then two calls to printk() for the same line won't (usually) get split and
none of the callers need any changes.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v6 1/3] printk: Add line-buffered printk() API.
  2018-11-07 10:52   ` Tetsuo Handa
  2018-11-07 11:01     ` David Laight
@ 2018-11-07 11:45     ` Petr Mladek
  2018-11-08  2:30     ` Sergey Senozhatsky
  2 siblings, 0 replies; 54+ messages in thread
From: Petr Mladek @ 2018-11-07 11:45 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Dmitriy Vyukov,
	Steven Rostedt, Alexander Potapenko, Fengguang Wu,
	Josh Poimboeuf, LKML, Linus Torvalds, Andrew Morton, linux-mm,
	Ingo Molnar, Peter Zijlstra, Will Deacon

On Wed 2018-11-07 19:52:53, Tetsuo Handa wrote:
> On 2018/11/06 23:35, Sergey Senozhatsky wrote:
> > - Do not allocate seq_buf if we are in printk-safe or in printk-nmi mode.
> >   To avoid "buffering for the sake of buffering". IOW, when in printk-safe
> >   use printk-safe.
> 
> Why? Since printk_safe_flush_buffer() forcibly flushes the partial line,
> calling printk_safe_log_store() after line buffering can reduce possibility of
> flushing partial lines, can't it?

Good point.

Well, printk_safe buffers are flushed via irqwork scheduled on the
same CPU. It might get flushed prematurely from other CPU but
I am not sure if this risk is worth the double buffering.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v6 1/3] printk: Add line-buffered printk() API.
  2018-11-07 11:01     ` David Laight
@ 2018-11-07 12:00       ` Petr Mladek
  0 siblings, 0 replies; 54+ messages in thread
From: Petr Mladek @ 2018-11-07 12:00 UTC (permalink / raw)
  To: David Laight
  Cc: 'Tetsuo Handa',
	Sergey Senozhatsky, Sergey Senozhatsky, Dmitriy Vyukov,
	Steven Rostedt, Alexander Potapenko, Fengguang Wu,
	Josh Poimboeuf, LKML, Linus Torvalds, Andrew Morton, linux-mm,
	Ingo Molnar, Peter Zijlstra, Will Deacon

On Wed 2018-11-07 11:01:05, David Laight wrote:
> From: Tetsuo Handa
> > Sent: 07 November 2018 10:53
> A though:
> 
> Why not make the printf lock slightly 'sticky'?
> - If the output line is incomplete save the cpuid.
> - If there is a saved cpuid that doesn't match the current cpu then spin for a bit.
> 
> Any callers of printk() have to assume they will spin on the buffer for the
> longest printk formatting (and symbol lookup might take a while) so a short
> additional delay won't matter.

Disabling preemption for a single printk() is questionable. We have
spent many years trying to find an acceptable solution to avoid
softlockups and we did not fully succeeded.

And this patchset is about continuous lines. Therefore we would need
to disable preemption for all code intermixed with the related
printks. This does not have much chances to get accepted.


> Then two calls to printk() for the same line won't (usually) get split and
> none of the callers need any changes.

It would require changes to disable the preemption around the related
calls.

We could not do this without a better API and inspecting all users.
Note that we could not detect this by "\n" because it is too
error-prone.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v6 1/3] printk: Add line-buffered printk() API.
  2018-11-07 10:21   ` Petr Mladek
@ 2018-11-07 12:54     ` Tetsuo Handa
  2018-11-08  2:21     ` Sergey Senozhatsky
  1 sibling, 0 replies; 54+ messages in thread
From: Tetsuo Handa @ 2018-11-07 12:54 UTC (permalink / raw)
  To: Petr Mladek, Sergey Senozhatsky
  Cc: Sergey Senozhatsky, Dmitriy Vyukov, Steven Rostedt,
	Alexander Potapenko, Fengguang Wu, Josh Poimboeuf, LKML,
	Linus Torvalds, Andrew Morton, linux-mm, Ingo Molnar,
	Peter Zijlstra, Will Deacon

On 2018/11/07 19:21, Petr Mladek wrote:
> On Tue 2018-11-06 23:35:02, Sergey Senozhatsky wrote:
>>> Since we want to remove "struct cont" eventually, we will try to remove
>>> both "implicit printk() users who are expecting KERN_CONT behavior" and
>>> "explicit pr_cont()/printk(KERN_CONT) users". Therefore, converting to
>>> this API is recommended.
>>
>> - The printk-fallback sounds like a hint that the existing 'cont' handling
>>   better stay in the kernel. I don't see how the existing 'cont' is
>>   significantly worse than
>> 		bpr_warn(NULL, ...)->printk() // no 'cont' support
>>   I don't see why would we want to do it, sorry. I don't see "it takes 16
>>   printk-buffers to make a thing go right" as a sure thing.
> 
> I see it the following way:
> 
>    + mixed cont lines are very rare but they happen
> 
>    + 16 buffers are more than 1 so it could only be better [*]
> 
>    + the printk_buffer() code is self-contained and does not
>      complicate the logic of the classic printk() code [**]
> 
> 
> [*] A missing put_printk_buffer() might cause that we would get
>     out of buffers. But the same problem is with locks,
>     disabled preemption, disabled interrupts, seq_buffer,
>     alloc/free. Such problems happen but they are rare.
> 
>     Also I do not expect that the same buffer would be shared
>     between many functions. Therefore it should be easy
>     to use it correctly.

Since we can allocate printk() buffer upon dup_task_struct()
and free it upon free_task_struct(), we can have enough printk()
buffers for task context. Also, since total number of exceptional
contexts (e.g. interrupts, oops) is finite, we can have enough
printk() buffers for exceptional contexts.

Is it possible to identify all locations which should use their own
printk() buffers (e.g. interrupt handlers, oops handlers) ? If yes,
despite Linus's objection, automatically switching printk() buffers
(like memalloc_nofs_save()/memalloc_nofs_restore() instead of
https://lkml.kernel.org/r/201709021512.DJG00503.JHOSOFFMFtVLOQ@I-love.SAKURA.ne.jp )
will be easiest and least error prone.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v6 1/3] printk: Add line-buffered printk() API.
  2018-11-02 13:31 [PATCH v6 1/3] printk: Add line-buffered printk() API Tetsuo Handa
                   ` (4 preceding siblings ...)
  2018-11-06 14:35 ` Sergey Senozhatsky
@ 2018-11-07 13:41 ` Petr Mladek
  2018-11-07 14:06 ` Petr Mladek
  6 siblings, 0 replies; 54+ messages in thread
From: Petr Mladek @ 2018-11-07 13:41 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Dmitriy Vyukov,
	Steven Rostedt, Alexander Potapenko, Fengguang Wu,
	Josh Poimboeuf, LKML, Linus Torvalds, Andrew Morton, linux-mm,
	Ingo Molnar, Peter Zijlstra, Will Deacon

On Fri 2018-11-02 22:31:55, Tetsuo Handa wrote:
> Sometimes we want to print a whole line without being disturbed by
> concurrent printk() from interrupts and/or other threads, for printk()
> which does not end with '\n' can be disturbed.
> 
> Since mixed printk() output makes it hard to interpret, this patch
> introduces API for line-buffered printk() output (so that we can make
> sure that printk() ends with '\n').
> 
> Since functions introduced by this patch are merely wrapping
> printk()/vprintk() calls in order to minimize possibility of using
> "struct cont", it is safe to replace printk()/vprintk() with this API.
> Since we want to remove "struct cont" eventually, we will try to remove
> both "implicit printk() users who are expecting KERN_CONT behavior" and
> "explicit pr_cont()/printk(KERN_CONT) users". Therefore, converting to
> this API is recommended.

[...]

> diff --git a/include/linux/printk.h b/include/linux/printk.h
> index cf3eccf..92af345 100644
> --- a/include/linux/printk.h
> +++ b/include/linux/printk.h
> @@ -173,6 +174,20 @@ int printk_emit(int facility, int level,
>  
>  asmlinkage __printf(1, 2) __cold
>  int printk(const char *fmt, ...);
> +struct printk_buffer *get_printk_buffer(void);
> +bool flush_printk_buffer(struct printk_buffer *ptr);
> +__printf(2, 3)
> +int printk_buffered(struct printk_buffer *ptr, const char *fmt, ...);
> +__printf(2, 0)
> +int vprintk_buffered(struct printk_buffer *ptr, const char *fmt, va_list args);
> +/*
> + * In order to avoid accidentally reusing "ptr" after put_printk_buffer("ptr"),
> + * put_printk_buffer() is defined as a macro which explicitly resets "ptr" to
> + * NULL.
> + */
> +void __put_printk_buffer(struct printk_buffer *ptr);
> +#define put_printk_buffer(ptr)					\
> +	do { __put_printk_buffer(ptr); ptr = NULL; } while (0)

I have mixed feelings about this. It is clever but it complicates the
code. I wonder why we need to be more paranoid than kfree() and
friends. Especially that the reuse of printk buffer is much less
dangerous than reusing freed pointers.

Please, remove it unless it gets more supporters.


> --- /dev/null
> +++ b/kernel/printk/printk_buffered.c
> +int vprintk_buffered(struct printk_buffer *ptr, const char *fmt, va_list args)
> +{
> +	va_list tmp_args;
> +	int r;
> +	int pos;
> +
> +	if (!ptr)
> +		return vprintk(fmt, args);
> +	/*
> +	 * Skip KERN_CONT here based on an assumption that KERN_CONT will be
> +	 * given via "fmt" argument when KERN_CONT is given.
> +	 */
> +	pos = (printk_get_level(fmt) == 'c') ? 2 : 0;
> +	while (true) {
> +		va_copy(tmp_args, args);
> +		r = vsnprintf(ptr->buf + ptr->len, sizeof(ptr->buf) - ptr->len,
> +			      fmt + pos, tmp_args);
> +		va_end(tmp_args);
> +		if (likely(r + ptr->len < sizeof(ptr->buf)))
> +			break;
> +		if (!flush_printk_buffer(ptr))
> +			return vprintk(fmt, args);
> +	}
> +	ptr->len += r;
> +	/* Flush already completed lines if any. */
> +	for (pos = ptr->len - 1; pos >= 0; pos--) {
> +		if (ptr->buf[pos] != '\n')
> +			continue;
> +		ptr->buf[pos++] = '\0';
> +		printk("%s\n", ptr->buf);
> +		ptr->len -= pos;
> +		memmove(ptr->buf, ptr->buf + pos, ptr->len);
> +		/* This '\0' will be overwritten by next vsnprintf() above. */
> +		ptr->buf[ptr->len] = '\0';

I am not against explicitly adding '\0' this way. Just note that there
is always the trailing '\0' in the original string, see below.

I mean that we could get the same result with memmove(,, ptr->len+1);


> +		break;
> +	}
> +	return r;
> +}
> +
> +/**
> + * flush_printk_buffer - Flush incomplete line in printk_buffer.
> + *
> + * @ptr: Pointer to "struct printk_buffer". It can be NULL.
> + *
> + * Returns true if flushed something, false otherwise.
> + *
> + * Flush if @ptr contains partial data. But usually there is no need to call
> + * this function because @ptr is flushed by put_printk_buffer().
> + */
> +bool flush_printk_buffer(struct printk_buffer *ptr)
> +{
> +	if (!ptr || !ptr->len)
> +		return false;
> +	/*
> +	 * vprintk_buffered() keeps 0 <= ptr->len < sizeof(ptr->buf) true.
> +	 * But ptr->buf[ptr->len] != '\0' if this function is called due to
> +	 * vsnprintf() + ptr->len >= sizeof(ptr->buf).
> +	 */
> +	ptr->buf[ptr->len] = '\0';

This is not necessary. Note that vsnprintf() always adds the trailing
'\0' even when the string is shrunken.


> +	printk("%s", ptr->buf);
> +	ptr->len = 0;
> +	return true;
> +}
> +EXPORT_SYMBOL(flush_printk_buffer);
> +
> +/**
> + * __put_printk_buffer - Release printk_buffer.
> + *
> + * @ptr: Pointer to "struct printk_buffer". It can be NULL.
> + *
> + * Returns nothing.
> + *
> + * Flush and release @ptr.
> + * Please use put_printk_buffer() in order to catch use-after-free bugs.
> + */
> +void __put_printk_buffer(struct printk_buffer *ptr)
> +{
> +	long i = (unsigned long) ptr - (unsigned long) printk_buffers;
> +
> +	if (!ptr)
> +		return;
> +	if (WARN_ON_ONCE(i % sizeof(struct printk_buffer)))
> +		return;
> +	i /= sizeof(struct printk_buffer);
> +	if (WARN_ON_ONCE(i < 0 || i >= NUM_LINE_BUFFERS))
> +		return;
> +	if (ptr->len)

Nit: This check is superfluous.

> +		flush_printk_buffer(ptr);
> +	clear_bit_unlock(i, printk_buffers_in_use);
> +}
> +EXPORT_SYMBOL(__put_printk_buffer);

Otherwise, it looks good to me.


Best Regards,
Petr

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v6 1/3] printk: Add line-buffered printk() API.
  2018-11-02 13:31 [PATCH v6 1/3] printk: Add line-buffered printk() API Tetsuo Handa
                   ` (5 preceding siblings ...)
  2018-11-07 13:41 ` Petr Mladek
@ 2018-11-07 14:06 ` Petr Mladek
  6 siblings, 0 replies; 54+ messages in thread
From: Petr Mladek @ 2018-11-07 14:06 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Dmitriy Vyukov,
	Steven Rostedt, Alexander Potapenko, Fengguang Wu,
	Josh Poimboeuf, LKML, Linus Torvalds, Andrew Morton, linux-mm,
	Ingo Molnar, Peter Zijlstra, Will Deacon

On Fri 2018-11-02 22:31:55, Tetsuo Handa wrote:
> How to use this API:
> 
>   (1) Call get_printk_buffer() and acquire "struct printk_buffer *".
> 
>   (2) Rewrite printk() calls in the following way. The "ptr" is
>       "struct printk_buffer *" obtained in step (1).
> 
>       printk(fmt, ...)     => printk_buffered(ptr, fmt, ...)
>       vprintk(fmt, args)   => vprintk_buffered(ptr, fmt, args)
>       pr_emerg(fmt, ...)   => bpr_emerg(ptr, fmt, ...)
>       pr_alert(fmt, ...)   => bpr_alert(ptr, fmt, ...)
>       pr_crit(fmt, ...)    => bpr_crit(ptr, fmt, ...)
>       pr_err(fmt, ...)     => bpr_err(ptr, fmt, ...)
>       pr_warning(fmt, ...) => bpr_warning(ptr, fmt, ...)
>       pr_warn(fmt, ...)    => bpr_warn(ptr, fmt, ...)
>       pr_notice(fmt, ...)  => bpr_notice(ptr, fmt, ...)
>       pr_info(fmt, ...)    => bpr_info(ptr, fmt, ...)
>       pr_cont(fmt, ...)    => bpr_cont(ptr, fmt, ...)

I am looking at the sample conversions. We actually won't need
bpr_cont(). We will use buffer_printk() instead.

Well, I think about renaming buffer_printk() to bprintk() or
define it as a wrapper at least.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 2/3] mm: Use line-buffered printk() for show_free_areas().
  2018-11-02 13:31 ` [PATCH 2/3] mm: Use line-buffered printk() for show_free_areas() Tetsuo Handa
@ 2018-11-07 14:07   ` Petr Mladek
  0 siblings, 0 replies; 54+ messages in thread
From: Petr Mladek @ 2018-11-07 14:07 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Dmitriy Vyukov,
	Steven Rostedt, Alexander Potapenko, Fengguang Wu,
	Josh Poimboeuf, LKML, Linus Torvalds, Andrew Morton, linux-mm,
	Ingo Molnar, Peter Zijlstra, Will Deacon

On Fri 2018-11-02 22:31:56, Tetsuo Handa wrote:
> syzbot is sometimes getting mixed output like below due to concurrent
> printk(). Mitigate such output by using line-buffered printk() API.
> 
>   Node 0 DMA: 1*4kB (U) 0*8kB 0*16kB 1*32kB 
>   syz-executor0: page allocation failure: order:0, mode:0x484020(GFP_ATOMIC|__GFP_COMP), nodemask=(null)
>   (U) 
>   syz-executor0 cpuset=
>   2*64kB 
>   syz0
>   (U) 
>    mems_allowed=0
>   1*128kB 
>   CPU: 0 PID: 7592 Comm: syz-executor0 Not tainted 4.19.0-rc6+ #118
>   (U) 
>   Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
>   1*256kB (U) 
>   Call Trace:
>   0*512kB 
>    <IRQ>
>   1*1024kB 
>    __dump_stack lib/dump_stack.c:77 [inline]
>    dump_stack+0x1c4/0x2b4 lib/dump_stack.c:113
>   (U) 
>   1*2048kB 
>    warn_alloc.cold.119+0xb7/0x1bd mm/page_alloc.c:3426
>   (M) 
> 
> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> ---
>  mm/page_alloc.c | 32 +++++++++++++++++---------------
>  1 file changed, 17 insertions(+), 15 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index a919ba5..4411d5a 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -4694,10 +4694,10 @@ unsigned long nr_free_pagecache_pages(void)
>  	return nr_free_zone_pages(gfp_zone(GFP_HIGHUSER_MOVABLE));
>  }
>  
> -static inline void show_node(struct zone *zone)
> +static inline void show_node(struct zone *zone, struct printk_buffer *buf)
>  {
>  	if (IS_ENABLED(CONFIG_NUMA))
> -		printk("Node %d ", zone_to_nid(zone));
> +		printk_buffered(buf, "Node %d ", zone_to_nid(zone));

The conversion looks fine to me. I just think about renaming
printk_buffered to bprintk().

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 3/3] lockdep: Use line-buffered printk() for lockdep messages.
  2018-11-02 13:31 ` [PATCH 3/3] lockdep: Use line-buffered printk() for lockdep messages Tetsuo Handa
  2018-11-02 13:36   ` Peter Zijlstra
@ 2018-11-07 15:19   ` Petr Mladek
  2018-11-08  4:45     ` Sergey Senozhatsky
  2018-11-09  9:54     ` Tetsuo Handa
  1 sibling, 2 replies; 54+ messages in thread
From: Petr Mladek @ 2018-11-07 15:19 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Dmitriy Vyukov,
	Steven Rostedt, Alexander Potapenko, Fengguang Wu,
	Josh Poimboeuf, LKML, Linus Torvalds, Andrew Morton, linux-mm,
	Ingo Molnar, Peter Zijlstra, Will Deacon

On Fri 2018-11-02 22:31:57, Tetsuo Handa wrote:
> syzbot is sometimes getting mixed output like below due to concurrent
> printk(). Mitigate such output by using line-buffered printk() API.
> 
> @@ -2421,18 +2458,20 @@ static void check_chain_key(struct task_struct *curr)
>  print_usage_bug_scenario(struct held_lock *lock)
>  {
>  	struct lock_class *class = hlock_class(lock);
> +	struct printk_buffer *buf = get_printk_buffer();
>  
>  	printk(" Possible unsafe locking scenario:\n\n");
>  	printk("       CPU0\n");
>  	printk("       ----\n");
> -	printk("  lock(");
> -	__print_lock_name(class);
> -	printk(KERN_CONT ");\n");
> +	printk_buffered(buf, "  lock(");
> +	__print_lock_name(class, buf);
> +	printk_buffered(buf, ");\n");
>  	printk("  <Interrupt>\n");
> -	printk("    lock(");
> -	__print_lock_name(class);
> -	printk(KERN_CONT ");\n");
> +	printk_buffered(buf, "    lock(");
> +	__print_lock_name(class, buf);
> +	printk_buffered(buf, ");\n");
>  	printk("\n *** DEADLOCK ***\n\n");
> +	put_printk_buffer(buf);
>  }
>  
>  static int

I really hope that the maze of pr_cont() calls in lockdep.c is the most
complicated one that we would meet.

Anyway, the following comes to my mind:

1. The mixing of normal and buffered printk calls is a bit confusing
   and error prone. It would make sense to use the buffered printk
   everywhere in the given section of code even when it is not
   strictly needed.

2. I would replace "buf" with "pbuf" or "prbuf" to distinguish it a
   bit from other eventual buffers.


Best Regards,
Petr

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v6 1/3] printk: Add line-buffered printk() API.
  2018-11-07 10:21   ` Petr Mladek
  2018-11-07 12:54     ` Tetsuo Handa
@ 2018-11-08  2:21     ` Sergey Senozhatsky
  2018-11-08 11:24       ` Petr Mladek
  1 sibling, 1 reply; 54+ messages in thread
From: Sergey Senozhatsky @ 2018-11-08  2:21 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Tetsuo Handa, Sergey Senozhatsky,
	Dmitriy Vyukov, Steven Rostedt, Alexander Potapenko,
	Fengguang Wu, Josh Poimboeuf, LKML, Linus Torvalds,
	Andrew Morton, linux-mm, Ingo Molnar, Peter Zijlstra,
	Will Deacon

On (11/07/18 11:21), Petr Mladek wrote:
> > [..]
> > - The printk-fallback sounds like a hint that the existing 'cont' handling
> >   better stay in the kernel. I don't see how the existing 'cont' is
> >   significantly worse than
> > 		bpr_warn(NULL, ...)->printk() // no 'cont' support
> >   I don't see why would we want to do it, sorry. I don't see "it takes 16
> >   printk-buffers to make a thing go right" as a sure thing.
> 
> I see it the following way:
> 
>    + mixed cont lines are very rare but they happen
> 
>    + 16 buffers are more than 1 so it could only be better [*]

Right, so this is where things go a bit shady. I'll try to explain.

What is the problem:
- we have tons of CPUs, with tons of tasks running on them, with preemption,
  and interrupts, and potentially printk-s coming from various
  contexts/CPUs/tasks etc. so one 'cont' buffer is not enough.

What is the proposed solution:
- if 1 is not enough then 16 will do. And if 16 is not enough then this
  is not our problem anymore, it's a kernel misconfiguration and users'
  fault.

So, maybe it is a "common" kernel development pattern, I really don't know;
but it looks like we are just throwing the issue over the fence; and atop
of that we are killing off the existing 'cont'.

> Anyway, I do not think that both implementations are worth it.
> We could keep both for some transition period but we should
> remove the old one later.
>
[..]
>
> This would prevent removing the fallback to struct cont. OOM is
> one important scenario where continuous lines are used.

Let's have one more look at what we will fix and what we will break.

'cont' has premature flushes.

- it's annoying in some cases
- it's good otherwise

  Why is it good.
  It preserves the correct order of events.

  pr_cont("calling foo->init()....");
  foo->init()
   printk("Can't allocate buffer\n");    // premature flush
  pr_cont("...blah\h");

 Will end up in the logbuf as:
 [12345.123] calling foo->init()....
 [12345.124] Can't allocate buffer
 [12345.125] ...blah

 Where buffered printk will endup as:
 [12345.123] Can't allocate buffer
 [12345.124] calling foo->init().......blah

It's very different.

Not to mention that buffered printk does not flush on panic.
So, frankly, as of now, I don't see buffered printk as a 'cont'
replacement.

If our problem is OOM and lockdep print outs, then let's address only
those two; let's not "fix" the rest of the kernel, especially the early
boot, - we can break more things than we can mend.

[..]
> I opened this problem once and it got lost. So I did not want to
> complicate it at this moment.

Right, I saw it. We have similar points; I raised those in private talks:

- we don't need buffered printk in printk_safe/printk_nmi

- we don't need br_cont()

- unlike 'cont' buffer there is no way for us to flush buffered printk
  buffers on panic

- I don't exactly like the completely of the vprintk_buffered. If
  buffered printk is for single line, then it must be for single
  line only. And I'm not buying the "we will need this for printk
  origin info injection" argument.
  Linus was very clear about this whole idea:

  Linus Torvalds wrote:
  :  Sergey Senozhatsky wrote:
  :  >
  :  > so... I think we don't have to update 'struct printk_log'. we can store
  :  > that "extended data" at the beginning of every message, right after the
  :  > prefix.
  :
  :   No, we really can't. That just means that all the tools would have to
  :   be changed to get the normal messages without the extra crud. And
  :   since it will have lost the difference, that's not even easy to do.
  :
  :   So this is exactly the wrong way around.
  :
  :   If people want to see the extra data, it really should be extra data
  :   that you can get with a new interface from the kernel logs. Not a
  :   "let's just a add it to all lines and make every line uglier and
  :   harder to read.

  So the "we need all that complexity to inject printk info in later
  patches" push doesn't look right.

- It seems that buffered printk attempts to solve too many problems.
  I'd prefer it to address just one.

	-ss

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v6 1/3] printk: Add line-buffered printk() API.
  2018-11-07 10:52   ` Tetsuo Handa
  2018-11-07 11:01     ` David Laight
  2018-11-07 11:45     ` Petr Mladek
@ 2018-11-08  2:30     ` Sergey Senozhatsky
  2 siblings, 0 replies; 54+ messages in thread
From: Sergey Senozhatsky @ 2018-11-08  2:30 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Sergey Senozhatsky, Petr Mladek, Sergey Senozhatsky,
	Dmitriy Vyukov, Steven Rostedt, Alexander Potapenko,
	Fengguang Wu, Josh Poimboeuf, LKML, Linus Torvalds,
	Andrew Morton, linux-mm, Ingo Molnar, Peter Zijlstra,
	Will Deacon

On (11/07/18 19:52), Tetsuo Handa wrote:
> > A question.
> > 
> > How bad would it actually be to:
> > 
> > - Allocate seq_buf 512-bytes buffer (GFP_ATOMIC) just-in-time, when we
> >   need it.
> >     // How often systems cannot allocate a 512-byte buffer? //
> 
> It is a very bad thing to do GFP_ATOMIC without __GFP_NOWARN.

Absolutely, __GFP_NOWARN.

> "it does not sleep". Not suitable for printk() which might be called from
> critically dangerous situations.

So I'm really not convinced that we can use buffered printk in critically
dangerous situations. Premature 'cont' flushes and 'cont' flushes on panic
are nice and right in critically dangerous situations.

[..]
> > - Do not allocate seq_buf if we are in printk-safe or in printk-nmi mode.
> >   To avoid "buffering for the sake of buffering". IOW, when in printk-safe
> >   use printk-safe.
> 
> Why? Since printk_safe_flush_buffer() forcibly flushes the partial line

We need to leave printk_safe and enable local IRQs for that partial
flush to occur. I'm not sure that those "partial flushes" from printk_safe
actually happen.

	-ss

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 3/3] lockdep: Use line-buffered printk() for lockdep messages.
  2018-11-07 15:19   ` Petr Mladek
@ 2018-11-08  4:45     ` Sergey Senozhatsky
  2018-11-08 11:37       ` Tetsuo Handa
  2018-11-08 11:53       ` Petr Mladek
  2018-11-09  9:54     ` Tetsuo Handa
  1 sibling, 2 replies; 54+ messages in thread
From: Sergey Senozhatsky @ 2018-11-08  4:45 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Tetsuo Handa, Sergey Senozhatsky, Sergey Senozhatsky,
	Dmitriy Vyukov, Steven Rostedt, Alexander Potapenko,
	Fengguang Wu, Josh Poimboeuf, LKML, Linus Torvalds,
	Andrew Morton, linux-mm, Ingo Molnar, Peter Zijlstra,
	Will Deacon

On (11/07/18 16:19), Petr Mladek wrote:
> > syzbot is sometimes getting mixed output like below due to concurrent
> > printk(). Mitigate such output by using line-buffered printk() API.
> > 
> > @@ -2421,18 +2458,20 @@ static void check_chain_key(struct task_struct *curr)
> >  print_usage_bug_scenario(struct held_lock *lock)
> >  {
> >  	struct lock_class *class = hlock_class(lock);
> > +	struct printk_buffer *buf = get_printk_buffer();
> >  
> >  	printk(" Possible unsafe locking scenario:\n\n");
> >  	printk("       CPU0\n");
> >  	printk("       ----\n");
> > -	printk("  lock(");
> > -	__print_lock_name(class);
> > -	printk(KERN_CONT ");\n");
> > +	printk_buffered(buf, "  lock(");
> > +	__print_lock_name(class, buf);
> > +	printk_buffered(buf, ");\n");
> >  	printk("  <Interrupt>\n");
> > -	printk("    lock(");
> > -	__print_lock_name(class);
> > -	printk(KERN_CONT ");\n");
> > +	printk_buffered(buf, "    lock(");
> > +	__print_lock_name(class, buf);
> > +	printk_buffered(buf, ");\n");
> >  	printk("\n *** DEADLOCK ***\n\n");
> > +	put_printk_buffer(buf);
> >  }
> >  
> >  static int
> 
> I really hope that the maze of pr_cont() calls in lockdep.c is the most
> complicated one that we would meet.

Hmm... Yes, buffered/seq_buf printk looks like a hard-to-use API,
when it comes to real world cases like this.

So... here is a random and wild idea.

We actually already have an easy-to-use buffered printk. And it's per-CPU.
And it makes all printk-s on this CPU to behave like as if they were called
on UP system. And it cures pr_cont(). And it doesn't require anyone to learn
any new printk API names. And it doesn't require any additional maintenance
work. And it doesn't require any printk->buffered_printk conversions. And
it's already in the kernel. And we gave it a name. And it's printk_safe.

a) lockdep reporting path should be atomic. And it's not a hot path,
   so local_irq_save/local_irq_restore will not cause a lot of trouble
   there probably.

b) We already have some lockdep reports coming via printk_safe.
   All those
	printk->console_driver->scheduler->lockdep
	printk->console_driver->timekeeping->lockdep
	etc.

   came via printk_safe path. So it's not a complete novelty.

c) printk_safe sections can nest.

d) No premature flushes. Everything looks the way it was supposed to
   look.

e) There are no out-of-line printk-s. We keep the actual order of events.

f) We flush it on panic.

g) Low maintenance costs.

So, can we just do the following? /* a sketch */

lockdep.c
	printk_safe_enter_irqsave(flags);
	lockdep_report();
	printk_safe_exit_irqrestore(flags);

	-ss

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v6 1/3] printk: Add line-buffered printk() API.
  2018-11-08  2:21     ` Sergey Senozhatsky
@ 2018-11-08 11:24       ` Petr Mladek
  2018-11-08 11:46         ` Tetsuo Handa
  2018-11-08 12:30         ` Sergey Senozhatsky
  0 siblings, 2 replies; 54+ messages in thread
From: Petr Mladek @ 2018-11-08 11:24 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Sergey Senozhatsky, Tetsuo Handa, Dmitriy Vyukov, Steven Rostedt,
	Alexander Potapenko, Fengguang Wu, Josh Poimboeuf, LKML,
	Linus Torvalds, Andrew Morton, linux-mm, Ingo Molnar,
	Peter Zijlstra, Will Deacon

On Thu 2018-11-08 11:21:38, Sergey Senozhatsky wrote:
> On (11/07/18 11:21), Petr Mladek wrote:
> What is the problem:
> - we have tons of CPUs, with tons of tasks running on them, with preemption,
>   and interrupts, and potentially printk-s coming from various
>   contexts/CPUs/tasks etc. so one 'cont' buffer is not enough.
> 
> What is the proposed solution:
> - if 1 is not enough then 16 will do. And if 16 is not enough then this
>   is not our problem anymore, it's a kernel misconfiguration and users'
>   fault.

I believe that I mentioned this more times. 16 buffers is the first
attempt. We could improve it later in several ways:

  + add more buffers
  + combine buffers on stack, dynamically allocated and static ones.

BTW: I do not remember seeing mixed lines from anything even close to 16
CPUs. Maybe I was just lucky or my memory is leaky.


> Let's have one more look at what we will fix and what we will break.
> 
> 'cont' has premature flushes.
> 
>   Why is it good.
>   It preserves the correct order of events.
> 
>   pr_cont("calling foo->init()....");
>   foo->init()
>    printk("Can't allocate buffer\n");    // premature flush
>   pr_cont("...blah\h");
> 
>  Will end up in the logbuf as:
>  [12345.123] calling foo->init()....
>  [12345.124] Can't allocate buffer
>  [12345.125] ...blah
> 
>  Where buffered printk will endup as:
>  [12345.123] Can't allocate buffer
>  [12345.124] calling foo->init().......blah

We will always have this problem with API using explicit buffers.
What do you suggest instead, please?

I am afraid that we are running in cycles. The other serious
alternative was having per-process and per-context buffers
but it was rejected several times.


> Not to mention that buffered printk does not flush on panic.
> So, frankly, as of now, I don't see buffered printk as a 'cont'
> replacement.

The static array of buffers can be flushed on panic.


> If our problem is OOM and lockdep print outs, then let's address only
> those two; let's not "fix" the rest of the kernel, especially the early
> boot, - we can break more things than we can mend.

Do you have any alternative proposal how to handle OOM and lockdep, please?


> [..]
> > I opened this problem once and it got lost. So I did not want to
> > complicate it at this moment.
>
> - I don't exactly like the completely of the vprintk_buffered. If
>   buffered printk is for single line, then it must be for single
>   line only.

My undestanding is that the new API is similar to the current cont
buffer from this point of view:

    + buffer size is LOG_LINE_MAX
    + it is flushed when full

The only difference is that it is flushed also when there is a
complete line. Is this a problem, please?


> And I'm not buying the "we will need this for printk origin
> info injection" argument.

I was against this idea several times. The current API does
not do anything like this.


> - It seems that buffered printk attempts to solve too many problems.
>   I'd prefer it to address just one.

This API tries to handle continuous lines more reliably.
Do I miss anything, please?

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 3/3] lockdep: Use line-buffered printk() for lockdep messages.
  2018-11-08  4:45     ` Sergey Senozhatsky
@ 2018-11-08 11:37       ` Tetsuo Handa
  2018-11-09  6:12         ` Sergey Senozhatsky
  2018-11-08 11:53       ` Petr Mladek
  1 sibling, 1 reply; 54+ messages in thread
From: Tetsuo Handa @ 2018-11-08 11:37 UTC (permalink / raw)
  To: Sergey Senozhatsky, Petr Mladek
  Cc: Sergey Senozhatsky, Dmitriy Vyukov, Steven Rostedt,
	Alexander Potapenko, Fengguang Wu, Josh Poimboeuf, LKML,
	Linus Torvalds, Andrew Morton, linux-mm, Ingo Molnar,
	Peter Zijlstra, Will Deacon

On 2018/11/08 13:45, Sergey Senozhatsky wrote:
> So, can we just do the following? /* a sketch */
> 
> lockdep.c
> 	printk_safe_enter_irqsave(flags);
> 	lockdep_report();
> 	printk_safe_exit_irqrestore(flags);

If buffer size were large enough to hold messages from out_of_memory(),
I would like to use it for out_of_memory() because delaying SIGKILL
due to waiting for printk() to complete is not good. Surely we can't
hold all messages because amount from dump_tasks() is unpredictable.
Maybe we can hold all messages from dump_header() except dump_tasks().

But isn't it essentially same with
http://lkml.kernel.org/r/1493560477-3016-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
which Linus does not want?

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v6 1/3] printk: Add line-buffered printk() API.
  2018-11-08 11:24       ` Petr Mladek
@ 2018-11-08 11:46         ` Tetsuo Handa
  2018-11-08 12:30         ` Sergey Senozhatsky
  1 sibling, 0 replies; 54+ messages in thread
From: Tetsuo Handa @ 2018-11-08 11:46 UTC (permalink / raw)
  To: Petr Mladek, Sergey Senozhatsky
  Cc: Sergey Senozhatsky, Dmitriy Vyukov, Steven Rostedt,
	Alexander Potapenko, Fengguang Wu, Josh Poimboeuf, LKML,
	Linus Torvalds, Andrew Morton, linux-mm, Ingo Molnar,
	Peter Zijlstra, Will Deacon

On 2018/11/08 20:24, Petr Mladek wrote:
>> Let's have one more look at what we will fix and what we will break.
>>
>> 'cont' has premature flushes.
>>
>>   Why is it good.
>>   It preserves the correct order of events.
>>
>>   pr_cont("calling foo->init()....");
>>   foo->init()
>>    printk("Can't allocate buffer\n");    // premature flush
>>   pr_cont("...blah\h");
>>
>>  Will end up in the logbuf as:
>>  [12345.123] calling foo->init()....
>>  [12345.124] Can't allocate buffer
>>  [12345.125] ...blah
>>
>>  Where buffered printk will endup as:
>>  [12345.123] Can't allocate buffer
>>  [12345.124] calling foo->init().......blah
> 
> We will always have this problem with API using explicit buffers.
> What do you suggest instead, please?
> 
> I am afraid that we are running in cycles. The other serious
> alternative was having per-process and per-context buffers
> but it was rejected several times.

Is it possible to identify all locations which should use their own
printk() buffers (e.g. interrupt handlers, oops handlers) ? If yes,
automatically switching printk() buffers (like memalloc_nofs_save()/
memalloc_nofs_restore()) will be easiest and least error prone.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 3/3] lockdep: Use line-buffered printk() for lockdep messages.
  2018-11-08  4:45     ` Sergey Senozhatsky
  2018-11-08 11:37       ` Tetsuo Handa
@ 2018-11-08 11:53       ` Petr Mladek
  2018-11-08 12:44         ` Sergey Senozhatsky
  1 sibling, 1 reply; 54+ messages in thread
From: Petr Mladek @ 2018-11-08 11:53 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Tetsuo Handa, Sergey Senozhatsky, Dmitriy Vyukov, Steven Rostedt,
	Alexander Potapenko, Fengguang Wu, Josh Poimboeuf, LKML,
	Linus Torvalds, Andrew Morton, linux-mm, Ingo Molnar,
	Peter Zijlstra, Will Deacon

On Thu 2018-11-08 13:45:10, Sergey Senozhatsky wrote:
> On (11/07/18 16:19), Petr Mladek wrote:
> > I really hope that the maze of pr_cont() calls in lockdep.c is the most
> > complicated one that we would meet.
> 
> Hmm... Yes, buffered/seq_buf printk looks like a hard-to-use API,
> when it comes to real world cases like this.
>
> So... here is a random and wild idea.
> 
> We actually already have an easy-to-use buffered printk. And it's per-CPU.
> And it makes all printk-s on this CPU to behave like as if they were called
> on UP system. And it cures pr_cont(). And it doesn't require anyone to learn
> any new printk API names. And it doesn't require any additional maintenance
> work. And it doesn't require any printk->buffered_printk conversions. And
> it's already in the kernel. And we gave it a name. And it's printk_safe.
> 
> a) lockdep reporting path should be atomic. And it's not a hot path,
>    so local_irq_save/local_irq_restore will not cause a lot of trouble
>    there probably.
> 
> b) We already have some lockdep reports coming via printk_safe.
>    All those
> 	printk->console_driver->scheduler->lockdep
> 	printk->console_driver->timekeeping->lockdep
> 	etc.
> 
>    came via printk_safe path. So it's not a complete novelty.
> 
> c) printk_safe sections can nest.
> 
> d) No premature flushes. Everything looks the way it was supposed to
>    look.
> 
> e) There are no out-of-line printk-s. We keep the actual order of events.
> 
> f) We flush it on panic.
> 
> g) Low maintenance costs.
> 
> So, can we just do the following? /* a sketch */
> 
> lockdep.c
> 	printk_safe_enter_irqsave(flags);
> 	lockdep_report();
> 	printk_safe_exit_irqrestore(flags);

All this looks nice. Let's look it also from the other side.
The following comes to my mind:

a) lockdep is not the only place when continuous lines get mixed.
   This patch mentions also RCU stalls. The other patch mentions
   OOM. I am sure that there will be more.

b) It is not obvious where printk_safe() would be necessary.
   While buffered printk is clearly connected with continuous
   lines.

c) I am not sure that disabling preemption would always be
   acceptable.

d) We might need to increase the size of the per-CPU buffers if
   they are used more widely.

e) People would need to learn a new (printk_safe) API when it is
   use outside printk sources.

f) Losing the entire log is more painful than loosing one line
   when the buffer never gets flushed.


Sigh, no solution is perfect. If only we could agree that one
way was better than the other.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v6 1/3] printk: Add line-buffered printk() API.
  2018-11-08 11:24       ` Petr Mladek
  2018-11-08 11:46         ` Tetsuo Handa
@ 2018-11-08 12:30         ` Sergey Senozhatsky
  2018-11-09 14:10           ` Petr Mladek
  1 sibling, 1 reply; 54+ messages in thread
From: Sergey Senozhatsky @ 2018-11-08 12:30 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Tetsuo Handa,
	Dmitriy Vyukov, Steven Rostedt, Alexander Potapenko,
	Fengguang Wu, Josh Poimboeuf, LKML, Linus Torvalds,
	Andrew Morton, linux-mm, Ingo Molnar, Peter Zijlstra,
	Will Deacon

On (11/08/18 12:24), Petr Mladek wrote:
> I believe that I mentioned this more times. 16 buffers is the first
> attempt. We could improve it later in several ways

Sure. Like I said - maybe it is a normal development pattern; I really
don't know.

> > Let's have one more look at what we will fix and what we will break.
> > 
> > 'cont' has premature flushes.
> > 
> >   Why is it good.
> >   It preserves the correct order of events.
> > 
> >   pr_cont("calling foo->init()....");
> >   foo->init()
> >    printk("Can't allocate buffer\n");    // premature flush
> >   pr_cont("...blah\h");
> > 
> >  Will end up in the logbuf as:
> >  [12345.123] calling foo->init()....
> >  [12345.124] Can't allocate buffer
> >  [12345.125] ...blah
> > 
> >  Where buffered printk will endup as:
> >  [12345.123] Can't allocate buffer
> >  [12345.124] calling foo->init().......blah
> 
> We will always have this problem with API using explicit buffers.

Exactly.

> What do you suggest instead, please?

So this is why "let's not remove 'cont'" thing. Buffered printk
is not 100% identical to 'cont'. And 'cont' does a good job sometimes,
Because 'cont' looks like a buffered printk, but it behaves like a
normal printk when things go bad. Buffered printk is always buffered;
and not even aware of the fact that things go bad around.

> > If our problem is OOM and lockdep print outs, then let's address only
> > those two; let's not "fix" the rest of the kernel, especially the early
> > boot, - we can break more things than we can mend.
> 
> Do you have any alternative proposal how to handle OOM and lockdep, please?

You misunderstood me. I'm not against the buffered printk in lockdep and
OOM. Albeit I must admit that lockdep patch looks quite big. I don't like
the idea of "we will remove 'cont' and replace it with buffered printk
through out the kernel".

[..]
> > - It seems that buffered printk attempts to solve too many problems.
> >   I'd prefer it to address just one.
> 
> This API tries to handle continuous lines more reliably.
> Do I miss anything, please?

This part:

+       /* Flush already completed lines if any. */
+       for (pos = ptr->len - 1; pos >= 0; pos--) {
+               if (ptr->buf[pos] != '\n')
+                       continue;
+               ptr->buf[pos++] = '\0';
+               printk("%s\n", ptr->buf);
+               ptr->len -= pos;
+               memmove(ptr->buf, ptr->buf + pos, ptr->len);
+               /* This '\0' will be overwritten by next vsnprintf() above. */
+               ptr->buf[ptr->len] = '\0';
+               break;
+       }
+       return r;

If I'm not mistaken, this is for the futute "printk injection" work.

Right now printk("foo\nbar\n") will end up to be a single logbuf
entry, with \n in the middle and at the end. So it will look like
two lines on the serial console:

	[123.123] foo
	          bar

Tetsuo does this \n lookup (on every vprintk_buffered) and split lines
(via memove) for "printk injection", so the output will be

	[123.123] foo
	[123.124] bar

Which makes it simpler to "inject" printk origin into every printed
line.

Without it we can just do

	len = vsprintf();
	if (len && text[len - 1] == '\n' || overflow)
		flush();

Can't we?

	-ss

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 3/3] lockdep: Use line-buffered printk() for lockdep messages.
  2018-11-08 11:53       ` Petr Mladek
@ 2018-11-08 12:44         ` Sergey Senozhatsky
  2018-11-08 14:21           ` Sergey Senozhatsky
  0 siblings, 1 reply; 54+ messages in thread
From: Sergey Senozhatsky @ 2018-11-08 12:44 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Tetsuo Handa, Sergey Senozhatsky,
	Dmitriy Vyukov, Steven Rostedt, Alexander Potapenko,
	Fengguang Wu, Josh Poimboeuf, LKML, Linus Torvalds,
	Andrew Morton, linux-mm, Ingo Molnar, Peter Zijlstra,
	Will Deacon

On (11/08/18 12:53), Petr Mladek wrote:
> > lockdep.c
> > 	printk_safe_enter_irqsave(flags);
> > 	lockdep_report();
> > 	printk_safe_exit_irqrestore(flags);
> 
> All this looks nice. Let's look it also from the other side.
> The following comes to my mind:
> 
> a) lockdep is not the only place when continuous lines get mixed.
>    This patch mentions also RCU stalls. The other patch mentions
>    OOM. I am sure that there will be more.
> 
> b) It is not obvious where printk_safe() would be necessary.
>    While buffered printk is clearly connected with continuous
>    lines.
> 
> c) I am not sure that disabling preemption would always be
>    acceptable.
> 
> d) We might need to increase the size of the per-CPU buffers if
>    they are used more widely.
> 
> e) People would need to learn a new (printk_safe) API when it is
>    use outside printk sources.
> 
> f) Losing the entire log is more painful than loosing one line
>    when the buffer never gets flushed.
> 
> Sigh, no solution is perfect. If only we could agree that one
> way was better than the other.

I agree with what you are saying. All of the above (in my email)
was for lockdep only, that's why I did mention lockdep several times.
Like I said, a random and wild idea.
I'm not proposing printk_safe as a "better" buffered printk for
everyone. The buffered_printk patch is pretty big, and comes with a
price tag.

If lockdep and OOM people will ACK buffered printk transition in
its current form, then we can go ahead.

It's debatable if we need a fixed size list of buffers; or we can
do kmalloc()+cont fallback. But if we will have ACKs, then we can
move forward.

	-ss

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 3/3] lockdep: Use line-buffered printk() for lockdep messages.
  2018-11-08 12:44         ` Sergey Senozhatsky
@ 2018-11-08 14:21           ` Sergey Senozhatsky
  0 siblings, 0 replies; 54+ messages in thread
From: Sergey Senozhatsky @ 2018-11-08 14:21 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Tetsuo Handa, Sergey Senozhatsky, Dmitriy Vyukov, Steven Rostedt,
	Alexander Potapenko, Fengguang Wu, Josh Poimboeuf, LKML,
	Linus Torvalds, Andrew Morton, linux-mm, Ingo Molnar,
	Peter Zijlstra, Will Deacon, Sergey Senozhatsky

On (11/08/18 21:44), Sergey Senozhatsky wrote:
> 
> If lockdep and OOM people will ACK buffered printk transition in
> its current form, then we can go ahead.

That printk_safe approach in lockdep, BTW, does not change (convert)
any printk-s within lockdep, thus Peter's earlycon should not be
affected. So Peter will have earlycon working, syzbot folks will have
buffered lockdep print outs.

	-ss

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 3/3] lockdep: Use line-buffered printk() for lockdep messages.
  2018-11-08 11:37       ` Tetsuo Handa
@ 2018-11-09  6:12         ` Sergey Senozhatsky
  2018-11-09  9:55           ` Tetsuo Handa
  2018-11-09 14:08           ` Linus Torvalds
  0 siblings, 2 replies; 54+ messages in thread
From: Sergey Senozhatsky @ 2018-11-09  6:12 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Sergey Senozhatsky, Petr Mladek, Sergey Senozhatsky,
	Dmitriy Vyukov, Steven Rostedt, Alexander Potapenko,
	Fengguang Wu, Josh Poimboeuf, LKML, Linus Torvalds,
	Andrew Morton, linux-mm, Ingo Molnar, Peter Zijlstra,
	Will Deacon

On (11/08/18 20:37), Tetsuo Handa wrote:
> On 2018/11/08 13:45, Sergey Senozhatsky wrote:
> > So, can we just do the following? /* a sketch */
> > 
> > lockdep.c
> > 	printk_safe_enter_irqsave(flags);
> > 	lockdep_report();
> > 	printk_safe_exit_irqrestore(flags);
> 
> If buffer size were large enough to hold messages from out_of_memory(),
> I would like to use it for out_of_memory() because delaying SIGKILL
> due to waiting for printk() to complete is not good. Surely we can't
> hold all messages because amount from dump_tasks() is unpredictable.
> Maybe we can hold all messages from dump_header() except dump_tasks().
> 
> But isn't it essentially same with
> http://lkml.kernel.org/r/1493560477-3016-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
> which Linus does not want?

Dunno. I guess we still haven't heard from Linus because he did quite a good
job setting up his 'email filters' ;)

Converting the existing users to buffered printk is not so simple.
Apparently there are different paths; some can afford buffered printk, some
cannot. Some of 'cont' users tend to get advantage of transparent 'cont'
context: start 'cont' output in function A: A()->pr_cont(), continue it in
B: A()->B()->pr_cont(), and then in C: A()->B()->C()->pr_cont(), and
finally flush in A: A()->pr_cont(\n). And then some paths have the
early_printk requirement. We can break the 'transparent cont' by passing
buffer pointers around [it can get a bit hairy; looking at lockdep patch],
but early_printk requirement is a different beast.

So in my email I was not advertising printk_safe as a "buffered printk for
everyone", I was just talking about lockdep. It's a bit doubtful that Peter
will ACK lockdep transition to buffered printk.

	-ss

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 3/3] lockdep: Use line-buffered printk() for lockdep messages.
  2018-11-07 15:19   ` Petr Mladek
  2018-11-08  4:45     ` Sergey Senozhatsky
@ 2018-11-09  9:54     ` Tetsuo Handa
  1 sibling, 0 replies; 54+ messages in thread
From: Tetsuo Handa @ 2018-11-09  9:54 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Dmitriy Vyukov,
	Steven Rostedt, Alexander Potapenko, Fengguang Wu,
	Josh Poimboeuf, LKML, Linus Torvalds, Andrew Morton, linux-mm,
	Ingo Molnar, Peter Zijlstra, Will Deacon

On 2018/11/08 0:19, Petr Mladek wrote:
> I really hope that the maze of pr_cont() calls in lockdep.c is the most
> complicated one that we would meet.
> 
> Anyway, the following comes to my mind:
> 
> 1. The mixing of normal and buffered printk calls is a bit confusing
>    and error prone. It would make sense to use the buffered printk
>    everywhere in the given section of code even when it is not
>    strictly needed.

Here is a draft version for how the code would look like...

 include/linux/printk.h   |  16 ++
 kernel/locking/lockdep.c | 711 ++++++++++++++++++++++++++---------------------
 2 files changed, 404 insertions(+), 323 deletions(-)

diff --git a/include/linux/printk.h b/include/linux/printk.h
index cf3eccf..ff4f66c 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -530,4 +530,20 @@ static inline void print_hex_dump_debug(const char *prefix_str, int prefix_type,
 }
 #endif
 
+struct printk_buffer;
+struct printk_buffer *get_printk_buffer(void);
+void put_printk_buffer(struct printk_buffer *buf);
+__printf(2, 3)
+int bprintk(struct printk_buffer *buf, const char *fmt, ...);
+
+#define bpr_info(buf, fmt, ...)				\
+	bprintk(buf, KERN_INFO pr_fmt(fmt), ##__VA_ARGS__)
+#define bpr_warning(buf, fmt, ...)				\
+	bprintk(buf, KERN_WARNING pr_fmt(fmt), ##__VA_ARGS__)
+#define bpr_warn bpr_warning
+#define bpr_err(buf, fmt, ...)				\
+	bprintk(buf, KERN_ERR pr_fmt(fmt), ##__VA_ARGS__)
+#define bpr_cont(buf, fmt, ...)			\
+	bprintk(buf, KERN_CONT fmt, ##__VA_ARGS__)
+
 #endif
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 1efada2..22b85aa 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -493,7 +493,7 @@ void get_usage_chars(struct lock_class *class, char usage[LOCK_USAGE_CHARS])
 	usage[i] = '\0';
 }
 
-static void __print_lock_name(struct lock_class *class)
+static void __print_lock_name(struct printk_buffer *buf, struct lock_class *class)
 {
 	char str[KSYM_NAME_LEN];
 	const char *name;
@@ -501,28 +501,28 @@ static void __print_lock_name(struct lock_class *class)
 	name = class->name;
 	if (!name) {
 		name = __get_key_name(class->key, str);
-		printk(KERN_CONT "%s", name);
+		bprintk(buf, KERN_CONT "%s", name);
 	} else {
-		printk(KERN_CONT "%s", name);
+		bprintk(buf, KERN_CONT "%s", name);
 		if (class->name_version > 1)
-			printk(KERN_CONT "#%d", class->name_version);
+			bprintk(buf, KERN_CONT "#%d", class->name_version);
 		if (class->subclass)
-			printk(KERN_CONT "/%d", class->subclass);
+			bprintk(buf, KERN_CONT "/%d", class->subclass);
 	}
 }
 
-static void print_lock_name(struct lock_class *class)
+static void print_lock_name(struct printk_buffer *buf, struct lock_class *class)
 {
 	char usage[LOCK_USAGE_CHARS];
 
 	get_usage_chars(class, usage);
 
-	printk(KERN_CONT " (");
-	__print_lock_name(class);
-	printk(KERN_CONT "){%s}", usage);
+	bprintk(buf, KERN_CONT " (");
+	__print_lock_name(buf, class);
+	bprintk(buf, KERN_CONT "){%s}", usage);
 }
 
-static void print_lockdep_cache(struct lockdep_map *lock)
+static void print_lockdep_cache(struct printk_buffer *buf, struct lockdep_map *lock)
 {
 	const char *name;
 	char str[KSYM_NAME_LEN];
@@ -531,10 +531,10 @@ static void print_lockdep_cache(struct lockdep_map *lock)
 	if (!name)
 		name = __get_key_name(lock->key->subkeys, str);
 
-	printk(KERN_CONT "%s", name);
+	bprintk(buf, KERN_CONT "%s", name);
 }
 
-static void print_lock(struct held_lock *hlock)
+static void print_lock(struct printk_buffer *buf, struct held_lock *hlock)
 {
 	/*
 	 * We can be called locklessly through debug_show_all_locks() so be
@@ -546,23 +546,23 @@ static void print_lock(struct held_lock *hlock)
 	barrier();
 
 	if (!class_idx || (class_idx - 1) >= MAX_LOCKDEP_KEYS) {
-		printk(KERN_CONT "<RELEASED>\n");
+		bprintk(buf, KERN_CONT "<RELEASED>\n");
 		return;
 	}
 
-	printk(KERN_CONT "%p", hlock->instance);
-	print_lock_name(lock_classes + class_idx - 1);
-	printk(KERN_CONT ", at: %pS\n", (void *)hlock->acquire_ip);
+	bprintk(buf, KERN_CONT "%p", hlock->instance);
+	print_lock_name(buf, lock_classes + class_idx - 1);
+	bprintk(buf, KERN_CONT ", at: %pS\n", (void *)hlock->acquire_ip);
 }
 
-static void lockdep_print_held_locks(struct task_struct *p)
+static void lockdep_print_held_locks(struct printk_buffer *buf, struct task_struct *p)
 {
 	int i, depth = READ_ONCE(p->lockdep_depth);
 
 	if (!depth)
-		printk("no locks held by %s/%d.\n", p->comm, task_pid_nr(p));
+		bprintk(buf, "no locks held by %s/%d.\n", p->comm, task_pid_nr(p));
 	else
-		printk("%d lock%s held by %s/%d:\n", depth,
+		bprintk(buf, "%d lock%s held by %s/%d:\n", depth,
 		       depth > 1 ? "s" : "", p->comm, task_pid_nr(p));
 	/*
 	 * It's not reliable to print a task's held locks if it's not sleeping
@@ -571,14 +571,14 @@ static void lockdep_print_held_locks(struct task_struct *p)
 	if (p->state == TASK_RUNNING && p != current)
 		return;
 	for (i = 0; i < depth; i++) {
-		printk(" #%d: ", i);
-		print_lock(p->held_locks + i);
+		bprintk(buf, " #%d: ", i);
+		print_lock(buf, p->held_locks + i);
 	}
 }
 
-static void print_kernel_ident(void)
+static void print_kernel_ident(struct printk_buffer *buf)
 {
-	printk("%s %.*s %s\n", init_utsname()->release,
+	bprintk(buf, "%s %.*s %s\n", init_utsname()->release,
 		(int)strcspn(init_utsname()->version, " "),
 		init_utsname()->version,
 		print_tainted());
@@ -804,12 +804,14 @@ static bool assign_lock_key(struct lockdep_map *lock)
 	list_add_tail_rcu(&class->lock_entry, &all_lock_classes);
 
 	if (verbose(class)) {
+		struct printk_buffer *buf = get_printk_buffer();
 		graph_unlock();
 
-		printk("\nnew class %px: %s", class->key, class->name);
+		bprintk(buf, "\nnew class %px: %s", class->key, class->name);
 		if (class->name_version > 1)
-			printk(KERN_CONT "#%d", class->name_version);
-		printk(KERN_CONT "\n");
+			bprintk(buf, KERN_CONT "#%d", class->name_version);
+		bprintk(buf, KERN_CONT "\n");
+		put_printk_buffer(buf);
 		dump_stack();
 
 		if (!graph_lock()) {
@@ -1081,20 +1083,20 @@ static inline int __bfs_backwards(struct lock_list *src_entry,
  * has been detected):
  */
 static noinline int
-print_circular_bug_entry(struct lock_list *target, int depth)
+print_circular_bug_entry(struct printk_buffer *buf, struct lock_list *target, int depth)
 {
 	if (debug_locks_silent)
 		return 0;
-	printk("\n-> #%u", depth);
-	print_lock_name(target->class);
-	printk(KERN_CONT ":\n");
+	bprintk(buf, "\n-> #%u", depth);
+	print_lock_name(buf, target->class);
+	bprintk(buf, KERN_CONT ":\n");
 	print_stack_trace(&target->trace, 6);
 
 	return 0;
 }
 
 static void
-print_circular_lock_scenario(struct held_lock *src,
+print_circular_lock_scenario(struct printk_buffer *buf, struct held_lock *src,
 			     struct held_lock *tgt,
 			     struct lock_list *prt)
 {
@@ -1116,31 +1118,31 @@ static inline int __bfs_backwards(struct lock_list *src_entry,
 	 * from the safe_class lock to the unsafe_class lock.
 	 */
 	if (parent != source) {
-		printk("Chain exists of:\n  ");
-		__print_lock_name(source);
-		printk(KERN_CONT " --> ");
-		__print_lock_name(parent);
-		printk(KERN_CONT " --> ");
-		__print_lock_name(target);
-		printk(KERN_CONT "\n\n");
+		bprintk(buf, "Chain exists of:\n  ");
+		__print_lock_name(buf, source);
+		bprintk(buf, KERN_CONT " --> ");
+		__print_lock_name(buf, parent);
+		bprintk(buf, KERN_CONT " --> ");
+		__print_lock_name(buf, target);
+		bprintk(buf, KERN_CONT "\n\n");
 	}
 
-	printk(" Possible unsafe locking scenario:\n\n");
-	printk("       CPU0                    CPU1\n");
-	printk("       ----                    ----\n");
-	printk("  lock(");
-	__print_lock_name(target);
-	printk(KERN_CONT ");\n");
-	printk("                               lock(");
-	__print_lock_name(parent);
-	printk(KERN_CONT ");\n");
-	printk("                               lock(");
-	__print_lock_name(target);
-	printk(KERN_CONT ");\n");
-	printk("  lock(");
-	__print_lock_name(source);
-	printk(KERN_CONT ");\n");
-	printk("\n *** DEADLOCK ***\n\n");
+	bprintk(buf, " Possible unsafe locking scenario:\n\n");
+	bprintk(buf, "       CPU0                    CPU1\n");
+	bprintk(buf, "       ----                    ----\n");
+	bprintk(buf, "  lock(");
+	__print_lock_name(buf, target);
+	bprintk(buf, KERN_CONT ");\n");
+	bprintk(buf, "                               lock(");
+	__print_lock_name(buf, parent);
+	bprintk(buf, KERN_CONT ");\n");
+	bprintk(buf, "                               lock(");
+	__print_lock_name(buf, target);
+	bprintk(buf, KERN_CONT ");\n");
+	bprintk(buf, "  lock(");
+	__print_lock_name(buf, source);
+	bprintk(buf, KERN_CONT ");\n");
+	bprintk(buf, "\n *** DEADLOCK ***\n\n");
 }
 
 /*
@@ -1148,7 +1150,7 @@ static inline int __bfs_backwards(struct lock_list *src_entry,
  * header first:
  */
 static noinline int
-print_circular_bug_header(struct lock_list *entry, unsigned int depth,
+print_circular_bug_header(struct printk_buffer *buf, struct lock_list *entry, unsigned int depth,
 			struct held_lock *check_src,
 			struct held_lock *check_tgt)
 {
@@ -1157,22 +1159,22 @@ static inline int __bfs_backwards(struct lock_list *src_entry,
 	if (debug_locks_silent)
 		return 0;
 
-	pr_warn("\n");
-	pr_warn("======================================================\n");
-	pr_warn("WARNING: possible circular locking dependency detected\n");
-	print_kernel_ident();
-	pr_warn("------------------------------------------------------\n");
-	pr_warn("%s/%d is trying to acquire lock:\n",
+	bpr_warn(buf, "\n");
+	bpr_warn(buf, "======================================================\n");
+	bpr_warn(buf, "WARNING: possible circular locking dependency detected\n");
+	print_kernel_ident(buf);
+	bpr_warn(buf, "------------------------------------------------------\n");
+	bpr_warn(buf, "%s/%d is trying to acquire lock:\n",
 		curr->comm, task_pid_nr(curr));
-	print_lock(check_src);
+	print_lock(buf, check_src);
 
-	pr_warn("\nbut task is already holding lock:\n");
+	bpr_warn(buf, "\nbut task is already holding lock:\n");
 
-	print_lock(check_tgt);
-	pr_warn("\nwhich lock already depends on the new lock.\n\n");
-	pr_warn("\nthe existing dependency chain (in reverse order) is:\n");
+	print_lock(buf, check_tgt);
+	bpr_warn(buf, "\nwhich lock already depends on the new lock.\n\n");
+	bpr_warn(buf, "\nthe existing dependency chain (in reverse order) is:\n");
 
-	print_circular_bug_entry(entry, depth);
+	print_circular_bug_entry(buf, entry, depth);
 
 	return 0;
 }
@@ -1188,6 +1190,7 @@ static noinline int print_circular_bug(struct lock_list *this,
 				struct held_lock *check_tgt,
 				struct stack_trace *trace)
 {
+	struct printk_buffer *buf;
 	struct task_struct *curr = current;
 	struct lock_list *parent;
 	struct lock_list *first_parent;
@@ -1199,25 +1202,27 @@ static noinline int print_circular_bug(struct lock_list *this,
 	if (!save_trace(&this->trace))
 		return 0;
 
+	buf = get_printk_buffer();
 	depth = get_lock_depth(target);
 
-	print_circular_bug_header(target, depth, check_src, check_tgt);
+	print_circular_bug_header(buf, target, depth, check_src, check_tgt);
 
 	parent = get_lock_parent(target);
 	first_parent = parent;
 
 	while (parent) {
-		print_circular_bug_entry(parent, --depth);
+		print_circular_bug_entry(buf, parent, --depth);
 		parent = get_lock_parent(parent);
 	}
 
-	printk("\nother info that might help us debug this:\n\n");
-	print_circular_lock_scenario(check_src, check_tgt,
+	bprintk(buf, "\nother info that might help us debug this:\n\n");
+	print_circular_lock_scenario(buf, check_src, check_tgt,
 				     first_parent);
 
-	lockdep_print_held_locks(curr);
+	lockdep_print_held_locks(buf, curr);
 
-	printk("\nstack backtrace:\n");
+	bprintk(buf, "\nstack backtrace:\n");
+	put_printk_buffer(buf);
 	dump_stack();
 
 	return 0;
@@ -1385,29 +1390,29 @@ static inline int usage_match(struct lock_list *entry, void *bit)
 	return result;
 }
 
-static void print_lock_class_header(struct lock_class *class, int depth)
+static void print_lock_class_header(struct printk_buffer *buf, struct lock_class *class, int depth)
 {
 	int bit;
 
-	printk("%*s->", depth, "");
-	print_lock_name(class);
+	bprintk(buf, "%*s->", depth, "");
+	print_lock_name(buf, class);
 #ifdef CONFIG_DEBUG_LOCKDEP
-	printk(KERN_CONT " ops: %lu", debug_class_ops_read(class));
+	bprintk(buf, KERN_CONT " ops: %lu", debug_class_ops_read(class));
 #endif
-	printk(KERN_CONT " {\n");
+	bprintk(buf, KERN_CONT " {\n");
 
 	for (bit = 0; bit < LOCK_USAGE_STATES; bit++) {
 		if (class->usage_mask & (1 << bit)) {
 			int len = depth;
 
-			len += printk("%*s   %s", depth, "", usage_str[bit]);
-			len += printk(KERN_CONT " at:\n");
+			len += bprintk(buf, "%*s   %s", depth, "", usage_str[bit]);
+			len += bprintk(buf, KERN_CONT " at:\n");
 			print_stack_trace(class->usage_traces + bit, len);
 		}
 	}
-	printk("%*s }\n", depth, "");
+	bprintk(buf, "%*s }\n", depth, "");
 
-	printk("%*s ... key      at: [<%px>] %pS\n",
+	bprintk(buf, "%*s ... key      at: [<%px>] %pS\n",
 		depth, "", class->key, class->key);
 }
 
@@ -1415,7 +1420,7 @@ static void print_lock_class_header(struct lock_class *class, int depth)
  * printk the shortest lock dependencies from @start to @end in reverse order:
  */
 static void __used
-print_shortest_lock_dependencies(struct lock_list *leaf,
+print_shortest_lock_dependencies(struct printk_buffer *buf, struct lock_list *leaf,
 				struct lock_list *root)
 {
 	struct lock_list *entry = leaf;
@@ -1425,13 +1430,13 @@ static void print_lock_class_header(struct lock_class *class, int depth)
 	depth = get_lock_depth(leaf);
 
 	do {
-		print_lock_class_header(entry->class, depth);
-		printk("%*s ... acquired at:\n", depth, "");
+		print_lock_class_header(buf, entry->class, depth);
+		bprintk(buf, "%*s ... acquired at:\n", depth, "");
 		print_stack_trace(&entry->trace, 2);
-		printk("\n");
+		bprintk(buf, "\n");
 
 		if (depth == 0 && (entry != root)) {
-			printk("lockdep:%s bad path found in chain graph\n", __func__);
+			bprintk(buf, "lockdep:%s bad path found in chain graph\n", __func__);
 			break;
 		}
 
@@ -1443,7 +1448,7 @@ static void print_lock_class_header(struct lock_class *class, int depth)
 }
 
 static void
-print_irq_lock_scenario(struct lock_list *safe_entry,
+print_irq_lock_scenario(struct printk_buffer *buf, struct lock_list *safe_entry,
 			struct lock_list *unsafe_entry,
 			struct lock_class *prev_class,
 			struct lock_class *next_class)
@@ -1469,33 +1474,33 @@ static void print_lock_class_header(struct lock_class *class, int depth)
 	 * from the safe_class lock to the unsafe_class lock.
 	 */
 	if (middle_class != unsafe_class) {
-		printk("Chain exists of:\n  ");
-		__print_lock_name(safe_class);
-		printk(KERN_CONT " --> ");
-		__print_lock_name(middle_class);
-		printk(KERN_CONT " --> ");
-		__print_lock_name(unsafe_class);
-		printk(KERN_CONT "\n\n");
+		bprintk(buf, "Chain exists of:\n  ");
+		__print_lock_name(buf, safe_class);
+		bprintk(buf, KERN_CONT " --> ");
+		__print_lock_name(buf, middle_class);
+		bprintk(buf, KERN_CONT " --> ");
+		__print_lock_name(buf, unsafe_class);
+		bprintk(buf, KERN_CONT "\n\n");
 	}
 
-	printk(" Possible interrupt unsafe locking scenario:\n\n");
-	printk("       CPU0                    CPU1\n");
-	printk("       ----                    ----\n");
-	printk("  lock(");
-	__print_lock_name(unsafe_class);
-	printk(KERN_CONT ");\n");
-	printk("                               local_irq_disable();\n");
-	printk("                               lock(");
-	__print_lock_name(safe_class);
-	printk(KERN_CONT ");\n");
-	printk("                               lock(");
-	__print_lock_name(middle_class);
-	printk(KERN_CONT ");\n");
-	printk("  <Interrupt>\n");
-	printk("    lock(");
-	__print_lock_name(safe_class);
-	printk(KERN_CONT ");\n");
-	printk("\n *** DEADLOCK ***\n\n");
+	bprintk(buf, " Possible interrupt unsafe locking scenario:\n\n");
+	bprintk(buf, "       CPU0                    CPU1\n");
+	bprintk(buf, "       ----                    ----\n");
+	bprintk(buf, "  lock(");
+	__print_lock_name(buf, unsafe_class);
+	bprintk(buf, KERN_CONT ");\n");
+	bprintk(buf, "                               local_irq_disable();\n");
+	bprintk(buf, "                               lock(");
+	__print_lock_name(buf, safe_class);
+	bprintk(buf, KERN_CONT ");\n");
+	bprintk(buf, "                               lock(");
+	__print_lock_name(buf, middle_class);
+	bprintk(buf, KERN_CONT ");\n");
+	bprintk(buf, "  <Interrupt>\n");
+	bprintk(buf, "    lock(");
+	__print_lock_name(buf, safe_class);
+	bprintk(buf, KERN_CONT ");\n");
+	bprintk(buf, "\n *** DEADLOCK ***\n\n");
 }
 
 static int
@@ -1510,65 +1515,69 @@ static void print_lock_class_header(struct lock_class *class, int depth)
 			 enum lock_usage_bit bit2,
 			 const char *irqclass)
 {
+	struct printk_buffer *buf;
+
 	if (!debug_locks_off_graph_unlock() || debug_locks_silent)
 		return 0;
 
-	pr_warn("\n");
-	pr_warn("=====================================================\n");
-	pr_warn("WARNING: %s-safe -> %s-unsafe lock order detected\n",
+	buf = get_printk_buffer();
+	bpr_warn(buf, "\n");
+	bpr_warn(buf, "=====================================================\n");
+	bpr_warn(buf, "WARNING: %s-safe -> %s-unsafe lock order detected\n",
 		irqclass, irqclass);
-	print_kernel_ident();
-	pr_warn("-----------------------------------------------------\n");
-	pr_warn("%s/%d [HC%u[%lu]:SC%u[%lu]:HE%u:SE%u] is trying to acquire:\n",
+	print_kernel_ident(buf);
+	bpr_warn(buf, "-----------------------------------------------------\n");
+	bpr_warn(buf, "%s/%d [HC%u[%lu]:SC%u[%lu]:HE%u:SE%u] is trying to acquire:\n",
 		curr->comm, task_pid_nr(curr),
 		curr->hardirq_context, hardirq_count() >> HARDIRQ_SHIFT,
 		curr->softirq_context, softirq_count() >> SOFTIRQ_SHIFT,
 		curr->hardirqs_enabled,
 		curr->softirqs_enabled);
-	print_lock(next);
+	print_lock(buf, next);
 
-	pr_warn("\nand this task is already holding:\n");
-	print_lock(prev);
-	pr_warn("which would create a new lock dependency:\n");
-	print_lock_name(hlock_class(prev));
-	pr_cont(" ->");
-	print_lock_name(hlock_class(next));
-	pr_cont("\n");
+	bpr_warn(buf, "\nand this task is already holding:\n");
+	print_lock(buf, prev);
+	bpr_warn(buf, "which would create a new lock dependency:\n");
+	print_lock_name(buf, hlock_class(prev));
+	bpr_cont(buf, " ->");
+	print_lock_name(buf, hlock_class(next));
+	bpr_cont(buf, "\n");
 
-	pr_warn("\nbut this new dependency connects a %s-irq-safe lock:\n",
+	bpr_warn(buf, "\nbut this new dependency connects a %s-irq-safe lock:\n",
 		irqclass);
-	print_lock_name(backwards_entry->class);
-	pr_warn("\n... which became %s-irq-safe at:\n", irqclass);
+	print_lock_name(buf, backwards_entry->class);
+	bpr_warn(buf, "\n... which became %s-irq-safe at:\n", irqclass);
 
 	print_stack_trace(backwards_entry->class->usage_traces + bit1, 1);
 
-	pr_warn("\nto a %s-irq-unsafe lock:\n", irqclass);
-	print_lock_name(forwards_entry->class);
-	pr_warn("\n... which became %s-irq-unsafe at:\n", irqclass);
-	pr_warn("...");
+	bpr_warn(buf, "\nto a %s-irq-unsafe lock:\n", irqclass);
+	print_lock_name(buf, forwards_entry->class);
+	bpr_warn(buf, "\n... which became %s-irq-unsafe at:\n", irqclass);
+	bpr_warn(buf, "...");
 
 	print_stack_trace(forwards_entry->class->usage_traces + bit2, 1);
 
-	pr_warn("\nother info that might help us debug this:\n\n");
-	print_irq_lock_scenario(backwards_entry, forwards_entry,
+	bpr_warn(buf, "\nother info that might help us debug this:\n\n");
+	print_irq_lock_scenario(buf, backwards_entry, forwards_entry,
 				hlock_class(prev), hlock_class(next));
 
-	lockdep_print_held_locks(curr);
+	lockdep_print_held_locks(buf, curr);
 
-	pr_warn("\nthe dependencies between %s-irq-safe lock and the holding lock:\n", irqclass);
+	bpr_warn(buf, "\nthe dependencies between %s-irq-safe lock and the holding lock:\n", irqclass);
 	if (!save_trace(&prev_root->trace))
-		return 0;
-	print_shortest_lock_dependencies(backwards_entry, prev_root);
+		goto done;
+	print_shortest_lock_dependencies(buf, backwards_entry, prev_root);
 
-	pr_warn("\nthe dependencies between the lock to be acquired");
-	pr_warn(" and %s-irq-unsafe lock:\n", irqclass);
+	bpr_warn(buf, "\nthe dependencies between the lock to be acquired");
+	bpr_warn(buf, " and %s-irq-unsafe lock:\n", irqclass);
 	if (!save_trace(&next_root->trace))
-		return 0;
-	print_shortest_lock_dependencies(forwards_entry, next_root);
+		goto done;
+	print_shortest_lock_dependencies(buf, forwards_entry, next_root);
 
-	pr_warn("\nstack backtrace:\n");
+	bpr_warn(buf, "\nstack backtrace:\n");
 	dump_stack();
-
+ done:
+	put_printk_buffer(buf);
 	return 0;
 }
 
@@ -1716,48 +1725,52 @@ static inline void inc_chains(void)
 #endif
 
 static void
-print_deadlock_scenario(struct held_lock *nxt,
+print_deadlock_scenario(struct printk_buffer *buf, struct held_lock *nxt,
 			     struct held_lock *prv)
 {
 	struct lock_class *next = hlock_class(nxt);
 	struct lock_class *prev = hlock_class(prv);
 
-	printk(" Possible unsafe locking scenario:\n\n");
-	printk("       CPU0\n");
-	printk("       ----\n");
-	printk("  lock(");
-	__print_lock_name(prev);
-	printk(KERN_CONT ");\n");
-	printk("  lock(");
-	__print_lock_name(next);
-	printk(KERN_CONT ");\n");
-	printk("\n *** DEADLOCK ***\n\n");
-	printk(" May be due to missing lock nesting notation\n\n");
+	bprintk(buf, " Possible unsafe locking scenario:\n\n");
+	bprintk(buf, "       CPU0\n");
+	bprintk(buf, "       ----\n");
+	bprintk(buf, "  lock(");
+	__print_lock_name(buf, prev);
+	bprintk(buf, KERN_CONT ");\n");
+	bprintk(buf, "  lock(");
+	__print_lock_name(buf, next);
+	bprintk(buf, KERN_CONT ");\n");
+	bprintk(buf, "\n *** DEADLOCK ***\n\n");
+	bprintk(buf, " May be due to missing lock nesting notation\n\n");
 }
 
 static int
 print_deadlock_bug(struct task_struct *curr, struct held_lock *prev,
 		   struct held_lock *next)
 {
+	struct printk_buffer *buf;
+
 	if (!debug_locks_off_graph_unlock() || debug_locks_silent)
 		return 0;
 
-	pr_warn("\n");
-	pr_warn("============================================\n");
-	pr_warn("WARNING: possible recursive locking detected\n");
-	print_kernel_ident();
-	pr_warn("--------------------------------------------\n");
-	pr_warn("%s/%d is trying to acquire lock:\n",
+	buf = get_printk_buffer();
+	bpr_warn(buf, "\n");
+	bpr_warn(buf, "============================================\n");
+	bpr_warn(buf, "WARNING: possible recursive locking detected\n");
+	print_kernel_ident(buf);
+	bpr_warn(buf, "--------------------------------------------\n");
+	bpr_warn(buf, "%s/%d is trying to acquire lock:\n",
 		curr->comm, task_pid_nr(curr));
-	print_lock(next);
-	pr_warn("\nbut task is already holding lock:\n");
-	print_lock(prev);
+	print_lock(buf, next);
+	bpr_warn(buf, "\nbut task is already holding lock:\n");
+	print_lock(buf, prev);
 
-	pr_warn("\nother info that might help us debug this:\n");
-	print_deadlock_scenario(next, prev);
-	lockdep_print_held_locks(curr);
+	bpr_warn(buf, "\nother info that might help us debug this:\n");
+	print_deadlock_scenario(buf, next, prev);
+	lockdep_print_held_locks(buf, curr);
 
-	pr_warn("\nstack backtrace:\n");
+	bpr_warn(buf, "\nstack backtrace:\n");
+	put_printk_buffer(buf);
 	dump_stack();
 
 	return 0;
@@ -2048,49 +2061,49 @@ static inline int get_first_held_lock(struct task_struct *curr,
 /*
  * Returns the next chain_key iteration
  */
-static u64 print_chain_key_iteration(int class_idx, u64 chain_key)
+static u64 print_chain_key_iteration(struct printk_buffer *buf, int class_idx, u64 chain_key)
 {
 	u64 new_chain_key = iterate_chain_key(chain_key, class_idx);
 
-	printk(" class_idx:%d -> chain_key:%016Lx",
+	bprintk(buf, " class_idx:%d -> chain_key:%016Lx",
 		class_idx,
 		(unsigned long long)new_chain_key);
 	return new_chain_key;
 }
 
 static void
-print_chain_keys_held_locks(struct task_struct *curr, struct held_lock *hlock_next)
+print_chain_keys_held_locks(struct printk_buffer *buf, struct task_struct *curr, struct held_lock *hlock_next)
 {
 	struct held_lock *hlock;
 	u64 chain_key = 0;
 	int depth = curr->lockdep_depth;
 	int i;
 
-	printk("depth: %u\n", depth + 1);
+	bprintk(buf, "depth: %u\n", depth + 1);
 	for (i = get_first_held_lock(curr, hlock_next); i < depth; i++) {
 		hlock = curr->held_locks + i;
-		chain_key = print_chain_key_iteration(hlock->class_idx, chain_key);
+		chain_key = print_chain_key_iteration(buf, hlock->class_idx, chain_key);
 
-		print_lock(hlock);
+		print_lock(buf, hlock);
 	}
 
-	print_chain_key_iteration(hlock_next->class_idx, chain_key);
-	print_lock(hlock_next);
+	print_chain_key_iteration(buf, hlock_next->class_idx, chain_key);
+	print_lock(buf, hlock_next);
 }
 
-static void print_chain_keys_chain(struct lock_chain *chain)
+static void print_chain_keys_chain(struct printk_buffer *buf, struct lock_chain *chain)
 {
 	int i;
 	u64 chain_key = 0;
 	int class_id;
 
-	printk("depth: %u\n", chain->depth);
+	bprintk(buf, "depth: %u\n", chain->depth);
 	for (i = 0; i < chain->depth; i++) {
 		class_id = chain_hlocks[chain->base + i];
-		chain_key = print_chain_key_iteration(class_id + 1, chain_key);
+		chain_key = print_chain_key_iteration(buf, class_id + 1, chain_key);
 
-		print_lock_name(lock_classes + class_id);
-		printk("\n");
+		print_lock_name(buf, lock_classes + class_id);
+		bprintk(buf, "\n");
 	}
 }
 
@@ -2098,21 +2111,24 @@ static void print_collision(struct task_struct *curr,
 			struct held_lock *hlock_next,
 			struct lock_chain *chain)
 {
-	pr_warn("\n");
-	pr_warn("============================\n");
-	pr_warn("WARNING: chain_key collision\n");
-	print_kernel_ident();
-	pr_warn("----------------------------\n");
-	pr_warn("%s/%d: ", current->comm, task_pid_nr(current));
-	pr_warn("Hash chain already cached but the contents don't match!\n");
+	struct printk_buffer *buf = get_printk_buffer();
 
-	pr_warn("Held locks:");
-	print_chain_keys_held_locks(curr, hlock_next);
+	bpr_warn(buf, "\n");
+	bpr_warn(buf, "============================\n");
+	bpr_warn(buf, "WARNING: chain_key collision\n");
+	print_kernel_ident(buf);
+	bpr_warn(buf, "----------------------------\n");
+	bpr_warn(buf, "%s/%d: ", current->comm, task_pid_nr(current));
+	bpr_warn(buf, "Hash chain already cached but the contents don't match!\n");
 
-	pr_warn("Locks in cached chain:");
-	print_chain_keys_chain(chain);
+	bpr_warn(buf, "Held locks:");
+	print_chain_keys_held_locks(buf, curr, hlock_next);
 
-	pr_warn("\nstack backtrace:\n");
+	bpr_warn(buf, "Locks in cached chain:");
+	print_chain_keys_chain(buf, chain);
+
+	bpr_warn(buf, "\nstack backtrace:\n");
+	put_printk_buffer(buf);
 	dump_stack();
 }
 #endif
@@ -2418,57 +2434,61 @@ static void check_chain_key(struct task_struct *curr)
 }
 
 static void
-print_usage_bug_scenario(struct held_lock *lock)
+print_usage_bug_scenario(struct printk_buffer *buf, struct held_lock *lock)
 {
 	struct lock_class *class = hlock_class(lock);
 
-	printk(" Possible unsafe locking scenario:\n\n");
-	printk("       CPU0\n");
-	printk("       ----\n");
-	printk("  lock(");
-	__print_lock_name(class);
-	printk(KERN_CONT ");\n");
-	printk("  <Interrupt>\n");
-	printk("    lock(");
-	__print_lock_name(class);
-	printk(KERN_CONT ");\n");
-	printk("\n *** DEADLOCK ***\n\n");
+	bprintk(buf, " Possible unsafe locking scenario:\n\n");
+	bprintk(buf, "       CPU0\n");
+	bprintk(buf, "       ----\n");
+	bprintk(buf, "  lock(");
+	__print_lock_name(buf, class);
+	bprintk(buf, KERN_CONT ");\n");
+	bprintk(buf, "  <Interrupt>\n");
+	bprintk(buf, "    lock(");
+	__print_lock_name(buf, class);
+	bprintk(buf, KERN_CONT ");\n");
+	bprintk(buf, "\n *** DEADLOCK ***\n\n");
 }
 
 static int
 print_usage_bug(struct task_struct *curr, struct held_lock *this,
 		enum lock_usage_bit prev_bit, enum lock_usage_bit new_bit)
 {
+	struct printk_buffer *buf;
+
 	if (!debug_locks_off_graph_unlock() || debug_locks_silent)
 		return 0;
 
-	pr_warn("\n");
-	pr_warn("================================\n");
-	pr_warn("WARNING: inconsistent lock state\n");
-	print_kernel_ident();
-	pr_warn("--------------------------------\n");
+	buf = get_printk_buffer();
+	bpr_warn(buf, "\n");
+	bpr_warn(buf, "================================\n");
+	bpr_warn(buf, "WARNING: inconsistent lock state\n");
+	print_kernel_ident(buf);
+	bpr_warn(buf, "--------------------------------\n");
 
-	pr_warn("inconsistent {%s} -> {%s} usage.\n",
+	bpr_warn(buf, "inconsistent {%s} -> {%s} usage.\n",
 		usage_str[prev_bit], usage_str[new_bit]);
 
-	pr_warn("%s/%d [HC%u[%lu]:SC%u[%lu]:HE%u:SE%u] takes:\n",
+	bpr_warn(buf, "%s/%d [HC%u[%lu]:SC%u[%lu]:HE%u:SE%u] takes:\n",
 		curr->comm, task_pid_nr(curr),
 		trace_hardirq_context(curr), hardirq_count() >> HARDIRQ_SHIFT,
 		trace_softirq_context(curr), softirq_count() >> SOFTIRQ_SHIFT,
 		trace_hardirqs_enabled(curr),
 		trace_softirqs_enabled(curr));
-	print_lock(this);
+	print_lock(buf, this);
 
-	pr_warn("{%s} state was registered at:\n", usage_str[prev_bit]);
+	bpr_warn(buf, "{%s} state was registered at:\n", usage_str[prev_bit]);
 	print_stack_trace(hlock_class(this)->usage_traces + prev_bit, 1);
 
 	print_irqtrace_events(curr);
-	pr_warn("\nother info that might help us debug this:\n");
-	print_usage_bug_scenario(this);
+	bpr_warn(buf, "\nother info that might help us debug this:\n");
+	print_usage_bug_scenario(buf, this);
 
-	lockdep_print_held_locks(curr);
+	lockdep_print_held_locks(buf, curr);
 
-	pr_warn("\nstack backtrace:\n");
+	bpr_warn(buf, "\nstack backtrace:\n");
+	put_printk_buffer(buf);
 	dump_stack();
 
 	return 0;
@@ -2500,6 +2520,7 @@ static int mark_lock(struct task_struct *curr, struct held_lock *this,
 			struct held_lock *this, int forwards,
 			const char *irqclass)
 {
+	struct printk_buffer *buf;
 	struct lock_list *entry = other;
 	struct lock_list *middle = NULL;
 	int depth;
@@ -2507,28 +2528,29 @@ static int mark_lock(struct task_struct *curr, struct held_lock *this,
 	if (!debug_locks_off_graph_unlock() || debug_locks_silent)
 		return 0;
 
-	pr_warn("\n");
-	pr_warn("========================================================\n");
-	pr_warn("WARNING: possible irq lock inversion dependency detected\n");
-	print_kernel_ident();
-	pr_warn("--------------------------------------------------------\n");
-	pr_warn("%s/%d just changed the state of lock:\n",
+	buf = get_printk_buffer();
+	bpr_warn(buf, "\n");
+	bpr_warn(buf, "========================================================\n");
+	bpr_warn(buf, "WARNING: possible irq lock inversion dependency detected\n");
+	print_kernel_ident(buf);
+	bpr_warn(buf, "--------------------------------------------------------\n");
+	bpr_warn(buf, "%s/%d just changed the state of lock:\n",
 		curr->comm, task_pid_nr(curr));
-	print_lock(this);
+	print_lock(buf, this);
 	if (forwards)
-		pr_warn("but this lock took another, %s-unsafe lock in the past:\n", irqclass);
+		bpr_warn(buf, "but this lock took another, %s-unsafe lock in the past:\n", irqclass);
 	else
-		pr_warn("but this lock was taken by another, %s-safe lock in the past:\n", irqclass);
-	print_lock_name(other->class);
-	pr_warn("\n\nand interrupts could create inverse lock ordering between them.\n\n");
+		bpr_warn(buf, "but this lock was taken by another, %s-safe lock in the past:\n", irqclass);
+	print_lock_name(buf, other->class);
+	bpr_warn(buf, "\n\nand interrupts could create inverse lock ordering between them.\n\n");
 
-	pr_warn("\nother info that might help us debug this:\n");
+	bpr_warn(buf, "\nother info that might help us debug this:\n");
 
 	/* Find a middle lock (if one exists) */
 	depth = get_lock_depth(other);
 	do {
 		if (depth == 0 && (entry != root)) {
-			pr_warn("lockdep:%s bad path found in chain graph\n", __func__);
+			bpr_warn(buf, "lockdep:%s bad path found in chain graph\n", __func__);
 			break;
 		}
 		middle = entry;
@@ -2536,20 +2558,21 @@ static int mark_lock(struct task_struct *curr, struct held_lock *this,
 		depth--;
 	} while (entry && entry != root && (depth >= 0));
 	if (forwards)
-		print_irq_lock_scenario(root, other,
+		print_irq_lock_scenario(buf, root, other,
 			middle ? middle->class : root->class, other->class);
 	else
-		print_irq_lock_scenario(other, root,
+		print_irq_lock_scenario(buf, other, root,
 			middle ? middle->class : other->class, root->class);
 
-	lockdep_print_held_locks(curr);
+	lockdep_print_held_locks(buf, curr);
 
-	pr_warn("\nthe shortest dependencies between 2nd lock and 1st lock:\n");
+	bpr_warn(buf, "\nthe shortest dependencies between 2nd lock and 1st lock:\n");
 	if (!save_trace(&root->trace))
 		return 0;
-	print_shortest_lock_dependencies(other, root);
+	print_shortest_lock_dependencies(buf, other, root);
 
-	pr_warn("\nstack backtrace:\n");
+	bpr_warn(buf, "\nstack backtrace:\n");
+	put_printk_buffer(buf);
 	dump_stack();
 
 	return 0;
@@ -3076,8 +3099,11 @@ static int mark_lock(struct task_struct *curr, struct held_lock *this,
 	 * We must printk outside of the graph_lock:
 	 */
 	if (ret == 2) {
-		printk("\nmarked lock as {%s}:\n", usage_str[new_bit]);
-		print_lock(this);
+		struct printk_buffer *buf = get_printk_buffer();
+
+		bprintk(buf, "\nmarked lock as {%s}:\n", usage_str[new_bit]);
+		print_lock(buf, this);
+		put_printk_buffer(buf);
 		print_irqtrace_events(curr);
 		dump_stack();
 	}
@@ -3160,30 +3186,34 @@ void lockdep_init_map(struct lockdep_map *lock, const char *name,
 				struct held_lock *hlock,
 				unsigned long ip)
 {
+	struct printk_buffer *buf;
+
 	if (!debug_locks_off())
 		return 0;
 	if (debug_locks_silent)
 		return 0;
 
-	pr_warn("\n");
-	pr_warn("==================================\n");
-	pr_warn("WARNING: Nested lock was not taken\n");
-	print_kernel_ident();
-	pr_warn("----------------------------------\n");
+	buf = get_printk_buffer();
+	bpr_warn(buf, "\n");
+	bpr_warn(buf, "==================================\n");
+	bpr_warn(buf, "WARNING: Nested lock was not taken\n");
+	print_kernel_ident(buf);
+	bpr_warn(buf, "----------------------------------\n");
 
-	pr_warn("%s/%d is trying to lock:\n", curr->comm, task_pid_nr(curr));
-	print_lock(hlock);
+	bpr_warn(buf, "%s/%d is trying to lock:\n", curr->comm, task_pid_nr(curr));
+	print_lock(buf, hlock);
 
-	pr_warn("\nbut this task is not holding:\n");
-	pr_warn("%s\n", hlock->nest_lock->name);
+	bpr_warn(buf, "\nbut this task is not holding:\n");
+	bpr_warn(buf, "%s\n", hlock->nest_lock->name);
 
-	pr_warn("\nstack backtrace:\n");
+	bpr_warn(buf, "\nstack backtrace:\n");
 	dump_stack();
 
-	pr_warn("\nother info that might help us debug this:\n");
-	lockdep_print_held_locks(curr);
+	bpr_warn(buf, "\nother info that might help us debug this:\n");
+	lockdep_print_held_locks(buf, curr);
 
-	pr_warn("\nstack backtrace:\n");
+	bpr_warn(buf, "\nstack backtrace:\n");
+	put_printk_buffer(buf);
 	dump_stack();
 
 	return 0;
@@ -3232,10 +3262,13 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
 	debug_class_ops_inc(class);
 
 	if (very_verbose(class)) {
-		printk("\nacquire class [%px] %s", class->key, class->name);
+		struct printk_buffer *buf = get_printk_buffer();
+
+		bprintk(buf, "\nacquire class [%px] %s", class->key, class->name);
 		if (class->name_version > 1)
-			printk(KERN_CONT "#%d", class->name_version);
-		printk(KERN_CONT "\n");
+			bprintk(buf, KERN_CONT "#%d", class->name_version);
+		bprintk(buf, KERN_CONT "\n");
+		put_printk_buffer(buf);
 		dump_stack();
 	}
 
@@ -3349,12 +3382,15 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
 		return 0;
 #endif
 	if (unlikely(curr->lockdep_depth >= MAX_LOCK_DEPTH)) {
+		struct printk_buffer *buf = get_printk_buffer();
+
 		debug_locks_off();
 		print_lockdep_off("BUG: MAX_LOCK_DEPTH too low!");
-		printk(KERN_DEBUG "depth: %i  max: %lu!\n",
+		bprintk(buf, KERN_DEBUG "depth: %i  max: %lu!\n",
 		       curr->lockdep_depth, MAX_LOCK_DEPTH);
 
-		lockdep_print_held_locks(current);
+		lockdep_print_held_locks(buf, current);
+		put_printk_buffer(buf);
 		debug_show_all_locks();
 		dump_stack();
 
@@ -3371,26 +3407,30 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
 print_unlock_imbalance_bug(struct task_struct *curr, struct lockdep_map *lock,
 			   unsigned long ip)
 {
+	struct printk_buffer *buf;
+
 	if (!debug_locks_off())
 		return 0;
 	if (debug_locks_silent)
 		return 0;
 
-	pr_warn("\n");
-	pr_warn("=====================================\n");
-	pr_warn("WARNING: bad unlock balance detected!\n");
-	print_kernel_ident();
-	pr_warn("-------------------------------------\n");
-	pr_warn("%s/%d is trying to release lock (",
+	buf = get_printk_buffer();
+	bpr_warn(buf, "\n");
+	bpr_warn(buf, "=====================================\n");
+	bpr_warn(buf, "WARNING: bad unlock balance detected!\n");
+	print_kernel_ident(buf);
+	bpr_warn(buf, "-------------------------------------\n");
+	bpr_warn(buf, "%s/%d is trying to release lock (",
 		curr->comm, task_pid_nr(curr));
-	print_lockdep_cache(lock);
-	pr_cont(") at:\n");
+	print_lockdep_cache(buf, lock);
+	bpr_cont(buf, ") at:\n");
 	print_ip_sym(ip);
-	pr_warn("but there are no more locks to release!\n");
-	pr_warn("\nother info that might help us debug this:\n");
-	lockdep_print_held_locks(curr);
+	bpr_warn(buf, "but there are no more locks to release!\n");
+	bpr_warn(buf, "\nother info that might help us debug this:\n");
+	lockdep_print_held_locks(buf, curr);
 
-	pr_warn("\nstack backtrace:\n");
+	bpr_warn(buf, "\nstack backtrace:\n");
+	put_printk_buffer(buf);
 	dump_stack();
 
 	return 0;
@@ -3946,26 +3986,30 @@ void lock_unpin_lock(struct lockdep_map *lock, struct pin_cookie cookie)
 print_lock_contention_bug(struct task_struct *curr, struct lockdep_map *lock,
 			   unsigned long ip)
 {
+	struct printk_buffer *buf;
+
 	if (!debug_locks_off())
 		return 0;
 	if (debug_locks_silent)
 		return 0;
 
-	pr_warn("\n");
-	pr_warn("=================================\n");
-	pr_warn("WARNING: bad contention detected!\n");
-	print_kernel_ident();
-	pr_warn("---------------------------------\n");
-	pr_warn("%s/%d is trying to contend lock (",
+	buf = get_printk_buffer();
+	bpr_warn(buf, "\n");
+	bpr_warn(buf, "=================================\n");
+	bpr_warn(buf, "WARNING: bad contention detected!\n");
+	print_kernel_ident(buf);
+	bpr_warn(buf, "---------------------------------\n");
+	bpr_warn(buf, "%s/%d is trying to contend lock (",
 		curr->comm, task_pid_nr(curr));
-	print_lockdep_cache(lock);
-	pr_cont(") at:\n");
+	print_lockdep_cache(buf, lock);
+	bpr_cont(buf, ") at:\n");
 	print_ip_sym(ip);
-	pr_warn("but there are no locks held!\n");
-	pr_warn("\nother info that might help us debug this:\n");
-	lockdep_print_held_locks(curr);
+	bpr_warn(buf, "but there are no locks held!\n");
+	bpr_warn(buf, "\nother info that might help us debug this:\n");
+	lockdep_print_held_locks(buf, curr);
 
-	pr_warn("\nstack backtrace:\n");
+	bpr_warn(buf, "\nstack backtrace:\n");
+	put_printk_buffer(buf);
 	dump_stack();
 
 	return 0;
@@ -4288,22 +4332,26 @@ void __init lockdep_init(void)
 print_freed_lock_bug(struct task_struct *curr, const void *mem_from,
 		     const void *mem_to, struct held_lock *hlock)
 {
+	struct printk_buffer *buf;
+
 	if (!debug_locks_off())
 		return;
 	if (debug_locks_silent)
 		return;
 
-	pr_warn("\n");
-	pr_warn("=========================\n");
-	pr_warn("WARNING: held lock freed!\n");
-	print_kernel_ident();
-	pr_warn("-------------------------\n");
-	pr_warn("%s/%d is freeing memory %px-%px, with a lock still held there!\n",
+	buf = get_printk_buffer();
+	bpr_warn(buf, "\n");
+	bpr_warn(buf, "=========================\n");
+	bpr_warn(buf, "WARNING: held lock freed!\n");
+	print_kernel_ident(buf);
+	bpr_warn(buf, "-------------------------\n");
+	bpr_warn(buf, "%s/%d is freeing memory %px-%px, with a lock still held there!\n",
 		curr->comm, task_pid_nr(curr), mem_from, mem_to-1);
-	print_lock(hlock);
-	lockdep_print_held_locks(curr);
+	print_lock(buf, hlock);
+	lockdep_print_held_locks(buf, curr);
 
-	pr_warn("\nstack backtrace:\n");
+	bpr_warn(buf, "\nstack backtrace:\n");
+	put_printk_buffer(buf);
 	dump_stack();
 }
 
@@ -4346,19 +4394,23 @@ void debug_check_no_locks_freed(const void *mem_from, unsigned long mem_len)
 
 static void print_held_locks_bug(void)
 {
+	struct printk_buffer *buf;
+
 	if (!debug_locks_off())
 		return;
 	if (debug_locks_silent)
 		return;
 
-	pr_warn("\n");
-	pr_warn("====================================\n");
-	pr_warn("WARNING: %s/%d still has locks held!\n",
+	buf = get_printk_buffer();
+	bpr_warn(buf, "\n");
+	bpr_warn(buf, "====================================\n");
+	bpr_warn(buf, "WARNING: %s/%d still has locks held!\n",
 	       current->comm, task_pid_nr(current));
-	print_kernel_ident();
-	pr_warn("------------------------------------\n");
-	lockdep_print_held_locks(current);
-	pr_warn("\nstack backtrace:\n");
+	print_kernel_ident(buf);
+	bpr_warn(buf, "------------------------------------\n");
+	lockdep_print_held_locks(buf, current);
+	bpr_warn(buf, "\nstack backtrace:\n");
+	put_printk_buffer(buf);
 	dump_stack();
 }
 
@@ -4372,26 +4424,29 @@ void debug_check_no_locks_held(void)
 #ifdef __KERNEL__
 void debug_show_all_locks(void)
 {
+	struct printk_buffer *buf;
 	struct task_struct *g, *p;
 
 	if (unlikely(!debug_locks)) {
 		pr_warn("INFO: lockdep is turned off.\n");
 		return;
 	}
-	pr_warn("\nShowing all locks held in the system:\n");
+	buf = get_printk_buffer();
+	bpr_warn(buf, "\nShowing all locks held in the system:\n");
 
 	rcu_read_lock();
 	for_each_process_thread(g, p) {
 		if (!p->lockdep_depth)
 			continue;
-		lockdep_print_held_locks(p);
+		lockdep_print_held_locks(buf, p);
 		touch_nmi_watchdog();
 		touch_all_softlockup_watchdogs();
 	}
 	rcu_read_unlock();
 
-	pr_warn("\n");
-	pr_warn("=============================================\n\n");
+	bpr_warn(buf, "\n");
+	bpr_warn(buf, "=============================================\n\n");
+	put_printk_buffer(buf);
 }
 EXPORT_SYMBOL_GPL(debug_show_all_locks);
 #endif
@@ -4402,11 +4457,15 @@ void debug_show_all_locks(void)
  */
 void debug_show_held_locks(struct task_struct *task)
 {
+	struct printk_buffer *buf;
+
 	if (unlikely(!debug_locks)) {
 		printk("INFO: lockdep is turned off.\n");
 		return;
 	}
-	lockdep_print_held_locks(task);
+	buf = get_printk_buffer();
+	lockdep_print_held_locks(buf, task);
+	put_printk_buffer(buf);
 }
 EXPORT_SYMBOL_GPL(debug_show_held_locks);
 
@@ -4415,16 +4474,20 @@ asmlinkage __visible void lockdep_sys_exit(void)
 	struct task_struct *curr = current;
 
 	if (unlikely(curr->lockdep_depth)) {
+		struct printk_buffer *buf;
+
 		if (!debug_locks_off())
 			return;
-		pr_warn("\n");
-		pr_warn("================================================\n");
-		pr_warn("WARNING: lock held when returning to user space!\n");
-		print_kernel_ident();
-		pr_warn("------------------------------------------------\n");
-		pr_warn("%s/%d is leaving the kernel with locks still held!\n",
+		buf = get_printk_buffer();
+		bpr_warn(buf, "\n");
+		bpr_warn(buf, "================================================\n");
+		bpr_warn(buf, "WARNING: lock held when returning to user space!\n");
+		print_kernel_ident(buf);
+		bpr_warn(buf, "------------------------------------------------\n");
+		bpr_warn(buf, "%s/%d is leaving the kernel with locks still held!\n",
 				curr->comm, curr->pid);
-		lockdep_print_held_locks(curr);
+		lockdep_print_held_locks(buf, curr);
+		put_printk_buffer(buf);
 	}
 
 	/*
@@ -4436,17 +4499,18 @@ asmlinkage __visible void lockdep_sys_exit(void)
 
 void lockdep_rcu_suspicious(const char *file, const int line, const char *s)
 {
+	struct printk_buffer *buf = get_printk_buffer();
 	struct task_struct *curr = current;
 
 	/* Note: the following can be executed concurrently, so be careful. */
-	pr_warn("\n");
-	pr_warn("=============================\n");
-	pr_warn("WARNING: suspicious RCU usage\n");
-	print_kernel_ident();
-	pr_warn("-----------------------------\n");
-	pr_warn("%s:%d %s!\n", file, line, s);
-	pr_warn("\nother info that might help us debug this:\n\n");
-	pr_warn("\n%srcu_scheduler_active = %d, debug_locks = %d\n",
+	bpr_warn(buf, "\n");
+	bpr_warn(buf, "=============================\n");
+	bpr_warn(buf, "WARNING: suspicious RCU usage\n");
+	print_kernel_ident(buf);
+	bpr_warn(buf, "-----------------------------\n");
+	bpr_warn(buf, "%s:%d %s!\n", file, line, s);
+	bpr_warn(buf, "\nother info that might help us debug this:\n\n");
+	bpr_warn(buf, "\n%srcu_scheduler_active = %d, debug_locks = %d\n",
 	       !rcu_lockdep_current_cpu_online()
 			? "RCU used illegally from offline CPU!\n"
 			: !rcu_is_watching()
@@ -4473,10 +4537,11 @@ void lockdep_rcu_suspicious(const char *file, const int line, const char *s)
 	 * rcu_read_lock_bh() and so on from extended quiescent states.
 	 */
 	if (!rcu_is_watching())
-		pr_warn("RCU used illegally from extended quiescent state!\n");
+		bpr_warn(buf, "RCU used illegally from extended quiescent state!\n");
 
-	lockdep_print_held_locks(curr);
-	pr_warn("\nstack backtrace:\n");
+	lockdep_print_held_locks(buf, curr);
+	bpr_warn(buf, "\nstack backtrace:\n");
+	put_printk_buffer(buf);
 	dump_stack();
 }
 EXPORT_SYMBOL_GPL(lockdep_rcu_suspicious);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* Re: [PATCH 3/3] lockdep: Use line-buffered printk() for lockdep messages.
  2018-11-09  6:12         ` Sergey Senozhatsky
@ 2018-11-09  9:55           ` Tetsuo Handa
  2018-11-09 15:43             ` Petr Mladek
  2018-11-09 14:08           ` Linus Torvalds
  1 sibling, 1 reply; 54+ messages in thread
From: Tetsuo Handa @ 2018-11-09  9:55 UTC (permalink / raw)
  To: Sergey Senozhatsky, Petr Mladek, Linus Torvalds
  Cc: Sergey Senozhatsky, Dmitriy Vyukov, Steven Rostedt,
	Alexander Potapenko, Fengguang Wu, Josh Poimboeuf, LKML,
	Andrew Morton, linux-mm, Ingo Molnar, Peter Zijlstra,
	Will Deacon

On 2018/11/09 15:12, Sergey Senozhatsky wrote:
> On (11/08/18 20:37), Tetsuo Handa wrote:
> > On 2018/11/08 13:45, Sergey Senozhatsky wrote:
> > > So, can we just do the following? /* a sketch */
> > > 
> > > lockdep.c
> > > 	printk_safe_enter_irqsave(flags);
> > > 	lockdep_report();
> > > 	printk_safe_exit_irqrestore(flags);
> > 
> > If buffer size were large enough to hold messages from out_of_memory(),
> > I would like to use it for out_of_memory() because delaying SIGKILL
> > due to waiting for printk() to complete is not good. Surely we can't
> > hold all messages because amount from dump_tasks() is unpredictable.
> > Maybe we can hold all messages from dump_header() except dump_tasks().
> > 
> > But isn't it essentially same with
> > http://lkml.kernel.org/r/1493560477-3016-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
> > which Linus does not want?
> 
> Dunno. I guess we still haven't heard from Linus because he did quite a good
> job setting up his 'email filters' ;)
> 
> Converting the existing users to buffered printk is not so simple.

Yes. We will need to ask for huge modifications for rare events.

> Apparently there are different paths; some can afford buffered printk, some
> cannot. Some of 'cont' users tend to get advantage of transparent 'cont'
> context: start 'cont' output in function A: A()->pr_cont(), continue it in
> B: A()->B()->pr_cont(), and then in C: A()->B()->C()->pr_cont(), and
> finally flush in A: A()->pr_cont(\n).

Yes, both OOM reporting and lockdep reporting have such dependency.

Linus said

  It might actually be a good idea to help those things, by making
  helper functions available that do the marshalling.

  So not calling "printk()" directly, but having a set of simple
  "buffer_print()" functions where each user has its own buffer, and
  then the "buffer_print()" functions will help people do nicely output
  data.

in https://lore.kernel.org/lkml/CA+55aFyzR4LKhJKLFgvvd9OTsos2_g4-9fova782BX4kyA3bLA@mail.gmail.com/ .

But since scattering "struct printk_buffer *" around as a function argument
is not realistic, scattering "a data structure which Linus suggested (i.e.
no printk() from helper functions)" around as a function argument will not be
realistic as well. We will need to calculate appropriate buffer size which
traverses many functions, in addition to passing that data structure as
a function argument.

>                                       And then some paths have the
> early_printk requirement. We can break the 'transparent cont' by passing
> buffer pointers around [it can get a bit hairy; looking at lockdep patch],
> but early_printk requirement is a different beast.

How early_printk requirement affects line buffered printk() API?

I don't think it is impossible to convert from

     printk("Testing feature XYZ..");
     this_may_blow_up_because_of_hw_bugs();
     printk(KERN_CONT " ... ok\n");

to

     printk("Testing feature XYZ:\n");
     this_may_blow_up_because_of_hw_bugs();
     printk("Testing feature XYZ.. ... ok\n");

in https://lore.kernel.org/lkml/CA+55aFwmwdY_mMqdEyFPpRhCKRyeqj=+aCqe5nN108v8ELFvPw@mail.gmail.com/ .

> 
> So in my email I was not advertising printk_safe as a "buffered printk for
> everyone", I was just talking about lockdep. It's a bit doubtful that Peter
> will ACK lockdep transition to buffered printk.

What Linus is suggesting is "helper functions with no printk()", which is
more difficult than my "helper functions with printk()" because we need to
calculate enough buffer size (which may traverse many functions) because
we can't printk() from helper functions when buffer size is too small.

I'm wondering if Linus still insists scattering "a data structure which
Linus suggested" around. Rather, I'd like to propose below one. This is not
perfect but modification is much smaller.

 kernel/printk/printk.c | 166 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 166 insertions(+)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 1b2a029..c339594 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1950,6 +1950,168 @@ asmlinkage int printk_emit(int facility, int level,
 }
 EXPORT_SYMBOL(printk_emit);
 
+static void buffered_printk_emit(const char *fmt, ...)
+{
+	va_list args;
+
+	va_start(args, fmt);
+	vprintk_emit(0, LOGLEVEL_DEFAULT, NULL, 0, fmt, args);
+	va_end(args);
+}
+
+static int get_completed_lines(char *buf, int pos)
+{
+	while (pos > 0) {
+		if (buf[--pos] != '\n')
+			continue;
+		buf[pos++] = '\0';
+		return pos;
+	}
+	return 0;
+}
+
+/* A structure for line-buffered printk() output. */
+struct printk_buffer {
+	/*
+	 * State of this line buffer.
+	 *
+	 * 0: Nobody has reserved, and is not using now.
+	 *
+	 * 1 to NR_CPUS: An interrupt context reserved, but is not using now.
+	 * NR_CPUS+1 to ULONG_MAX-1: Some task context reserved, but is not
+	 *                           using now.
+	 *
+	 * ULONG_MAX: Somebody has reserved, and is using now.
+	 *            This value is used as a lock.
+	 */
+	unsigned long context;
+	/* Valid bytes in buf[]. */
+	unsigned short int len;
+	char buf[LOG_LINE_MAX];
+	/* Last accessed jiffies. */
+	unsigned long stamp;
+} __aligned(1024);
+
+/* Number of line buffers. */
+#define NUM_LINE_BUFFERS 16
+
+/*
+ * Line buffered printk() tries to assign a buffer when printk() from a new
+ * context identifier comes in. And it automatically releases that buffer when
+ * one of three conditions listed below became true.
+ *
+ *   (1) printk() from that context identifier emitted '\n' as the last
+ *       character of output.
+ *   (2) printk() from that context identifier tried to print a too long line
+ *       which cannot be stored into a buffer.
+ *   (3) printk() from a new context identifier noticed that some context
+ *       identifier is reserving a buffer for more than 10 seconds without
+ *       emitting '\n'.
+ *
+ * Since (3) is based on a heuristic that somebody forgot to emit '\n' as the
+ * last character of output(), pr_cont()/KERN_CONT users are expected to emit
+ * '\n' within 10 seconds even if they reserved a buffer.
+ *
+ * There is a limitation that a buffer cannot be released if an oops occurred
+ * while that buffer is locked (context == ULONG_MAX).
+ *
+ * The granularity of context is a bit sparse. If printk() was called from task
+ * context, the address of current thread is used as context identifier. If
+ * printk() was called from non-task context, current processor ID is used as
+ * context identifier.
+ */
+static int try_buffered_printk(const char *fmt, va_list args)
+{
+	static struct printk_buffer printk_buffers[NUM_LINE_BUFFERS];
+	va_list tmp_args;
+	int pos;
+	int r;
+	const unsigned long context = in_task() ?
+		(unsigned long) current : raw_smp_processor_id() + 1;
+	struct printk_buffer *ptr;
+	const unsigned long now = jiffies;
+
+	/* Check if this context can lock a reserved buffer. */
+	for (pos = 0; pos < NUM_LINE_BUFFERS; pos++) {
+		ptr = &printk_buffers[pos];
+		if (context == ptr->context && context ==
+		    cmpxchg_acquire(&ptr->context, context, ULONG_MAX))
+			goto found;
+	}
+	/* Check if somebody is reserving a buffer for too long. */
+	for (pos = 0; pos < NUM_LINE_BUFFERS; pos++) {
+		unsigned long tmp;
+
+		ptr = &printk_buffers[pos];
+		if (likely(!ptr->len ||
+			   !time_after(now, ptr->stamp + 10 * HZ)))
+			continue;
+		tmp = ptr->context;
+		if (tmp == 0 || tmp == ULONG_MAX ||
+		    cmpxchg_acquire(&ptr->context, tmp, ULONG_MAX) != tmp)
+			continue;
+		if (ptr->len && time_after(now, ptr->stamp + 10 * HZ)) {
+			ptr->buf[ptr->len] = '\0';
+			buffered_printk_emit("%s\n", ptr->buf);
+			ptr->len = 0;
+		}
+		xchg_release(&ptr->context, 0);
+	}
+	/* Check if this context can reserve and lock a buffer. */
+	for (pos = 0; pos < NUM_LINE_BUFFERS; pos++) {
+		ptr = &printk_buffers[pos];
+		if (ptr->context == 0 &&
+		    cmpxchg_acquire(&ptr->context, 0, ULONG_MAX) == 0)
+			goto found;
+	}
+	/* All buffers are reserved. Fallback to normal printk(). */
+	return -1;
+ found:
+	ptr->stamp = now;
+	while (true) {
+		pos = ptr->len && (printk_get_level(fmt) == 'c') ? 2 : 0;
+		va_copy(tmp_args, args);
+		r = vsnprintf(ptr->buf + ptr->len, sizeof(ptr->buf) - ptr->len,
+			      fmt + pos, tmp_args);
+		va_end(tmp_args);
+		if (likely(r + ptr->len < sizeof(ptr->buf)))
+			break;
+		/* Oops. Line was too long to store into this buffer. */
+		if (!ptr->len) {
+			/*
+			 * We can do nothing. Fallback to normal printk(). But
+			 * be prepared for this printk() being the last call
+			 * from this context.
+			 */
+			r = -1;
+			goto out;
+		}
+		/* Retry after flushing incomplete line in this buffer. */
+		ptr->buf[ptr->len] = '\0';
+		buffered_printk_emit("%s", ptr->buf);
+		ptr->len = 0;
+	}
+	ptr->len += r;
+	/* Flush already completed lines. */
+	pos = get_completed_lines(ptr->buf, ptr->len);
+	if (pos) {
+		buffered_printk_emit("%s\n", ptr->buf);
+		ptr->len -= pos;
+		memmove(ptr->buf, ptr->buf + pos, ptr->len);
+	}
+ out:
+	/*
+	 * Release this buffer if this buffer became empty because this
+	 * printk() might be the last call from this context. Otherwise, keep
+	 * this buffer for future printk() calls from this context.
+	 */
+	if (!ptr->len)
+		xchg_release(&ptr->context, 0);
+	else
+		xchg_release(&ptr->context, context);
+	return r;
+}
+
 int vprintk_default(const char *fmt, va_list args)
 {
 	int r;
@@ -1961,6 +2123,10 @@ int vprintk_default(const char *fmt, va_list args)
 		return r;
 	}
 #endif
+	r = try_buffered_printk(fmt, args);
+	if (r >= 0)
+		return r;
+
 	r = vprintk_emit(0, LOGLEVEL_DEFAULT, NULL, 0, fmt, args);
 
 	return r;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* Re: [PATCH 3/3] lockdep: Use line-buffered printk() for lockdep messages.
  2018-11-09  6:12         ` Sergey Senozhatsky
  2018-11-09  9:55           ` Tetsuo Handa
@ 2018-11-09 14:08           ` Linus Torvalds
  2018-11-09 14:42             ` Steven Rostedt
  1 sibling, 1 reply; 54+ messages in thread
From: Linus Torvalds @ 2018-11-09 14:08 UTC (permalink / raw)
  To: sergey.senozhatsky.work
  Cc: penguin-kernel, Petr Mladek, Sergey Senozhatsky, dvyukov,
	Steven Rostedt, glider, fengguang.wu, jpoimboe,
	Linux List Kernel Mailing, Andrew Morton, linux-mm, Ingo Molnar,
	Peter Zijlstra, Will Deacon

On Fri, Nov 9, 2018 at 12:12 AM Sergey Senozhatsky
<sergey.senozhatsky.work@gmail.com> wrote:
>
> Dunno. I guess we still haven't heard from Linus because he did quite a good
> job setting up his 'email filters' ;)

Not filters, just long threads that I lurk on.

I don't actually care too much about this - the part I care about is
that when panics etc happen, things go out with a true best effort.

And "best effort" actually means "reality", not "theory". I don't care
one whit for some broken odd serial console that has a lock and
deadlocks if you get a panic just in the right place. I care about the
main printk/tty code doing the right thing, and avoiding the locks
with the scheduler and timers etc. So the timestamping and wakeup code
needing locks - or thinking you can delay things and print them out
later (when no later happens because you're panicing in an NMI) -
*that* is what I care deeply about.

Something like having a line buffering interface for random debugging
messages etc, I just don't get excited about. It just needs to be
simple enough and robust enough. You guys seem to be talking it out
ok.

             Linus

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v6 1/3] printk: Add line-buffered printk() API.
  2018-11-08 12:30         ` Sergey Senozhatsky
@ 2018-11-09 14:10           ` Petr Mladek
  2018-11-12  7:59             ` Sergey Senozhatsky
  0 siblings, 1 reply; 54+ messages in thread
From: Petr Mladek @ 2018-11-09 14:10 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Sergey Senozhatsky, Tetsuo Handa, Dmitriy Vyukov, Steven Rostedt,
	Alexander Potapenko, Fengguang Wu, Josh Poimboeuf, LKML,
	Linus Torvalds, Andrew Morton, linux-mm, Ingo Molnar,
	Peter Zijlstra, Will Deacon

On Thu 2018-11-08 21:30:49, Sergey Senozhatsky wrote:
> On (11/08/18 12:24), Petr Mladek wrote:
> > > - It seems that buffered printk attempts to solve too many problems.
> > >   I'd prefer it to address just one.
> > 
> > This API tries to handle continuous lines more reliably.
> > Do I miss anything, please?
> 
> This part:
> 
> +       /* Flush already completed lines if any. */
> +       for (pos = ptr->len - 1; pos >= 0; pos--) {
> +               if (ptr->buf[pos] != '\n')
> +                       continue;
> +               ptr->buf[pos++] = '\0';
> +               printk("%s\n", ptr->buf);
> +               ptr->len -= pos;
> +               memmove(ptr->buf, ptr->buf + pos, ptr->len);
> +               /* This '\0' will be overwritten by next vsnprintf() above. */
> +               ptr->buf[ptr->len] = '\0';
> +               break;
> +       }
> +       return r;
> 
> If I'm not mistaken, this is for the futute "printk injection" work.

No, it does not make sense to distinguish the context at this level.
The buffer is passed as an argument. It should not get passed to another
task or context.

The above code only tries to push complete lines to the main log buffer
and consoles ASAP. It sounds like a Good Idea(tm).


> Right now printk("foo\nbar\n") will end up to be a single logbuf
> entry, with \n in the middle and at the end. So it will look like
> two lines on the serial console:
> 
> 	[123.123] foo
> 	          bar
> 
> Tetsuo does this \n lookup (on every vprintk_buffered) and split lines
> (via memove) for "printk injection", so the output will be
> 
> 	[123.123] foo
> 	[123.124] bar

No, please note that the for cycle searches for '\n' from the end
of the string.

> Which makes it simpler to "inject" printk origin into every printed
> line.
> 
> Without it we can just do
> 
> 	len = vsprintf();
> 	if (len && text[len - 1] == '\n' || overflow)
> 		flush();

I had the same idea. Tetsuo ignored it. I looked for more arguments
and found that '\n' is used in the middle of several pr_cont()
calls, see

     git grep 'pr_cont.*\\n.*[^\\n]"'

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 3/3] lockdep: Use line-buffered printk() for lockdep messages.
  2018-11-09 14:08           ` Linus Torvalds
@ 2018-11-09 14:42             ` Steven Rostedt
  0 siblings, 0 replies; 54+ messages in thread
From: Steven Rostedt @ 2018-11-09 14:42 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: sergey.senozhatsky.work, penguin-kernel, Petr Mladek,
	Sergey Senozhatsky, dvyukov, glider, fengguang.wu, jpoimboe,
	Linux List Kernel Mailing, Andrew Morton, linux-mm, Ingo Molnar,
	Peter Zijlstra, Will Deacon

On Fri, 9 Nov 2018 08:08:13 -0600
Linus Torvalds <torvalds@linux-foundation.org> wrote:

>  You guys seem to be talking it out
> ok.

Do your new filters not only remove words, but also add text?

  ;-)

-- Steve

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 3/3] lockdep: Use line-buffered printk() for lockdep messages.
  2018-11-09  9:55           ` Tetsuo Handa
@ 2018-11-09 15:43             ` Petr Mladek
  2018-11-10  2:42               ` Tetsuo Handa
  2018-11-10  8:52               ` Tetsuo Handa
  0 siblings, 2 replies; 54+ messages in thread
From: Petr Mladek @ 2018-11-09 15:43 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Sergey Senozhatsky, Linus Torvalds, Sergey Senozhatsky,
	Dmitriy Vyukov, Steven Rostedt, Alexander Potapenko,
	Fengguang Wu, Josh Poimboeuf, LKML, Andrew Morton, linux-mm,
	Ingo Molnar, Peter Zijlstra, Will Deacon

On Fri 2018-11-09 18:55:26, Tetsuo Handa wrote:
> On 2018/11/09 15:12, Sergey Senozhatsky wrote:
> > On (11/08/18 20:37), Tetsuo Handa wrote:
> > > On 2018/11/08 13:45, Sergey Senozhatsky wrote:
> How early_printk requirement affects line buffered printk() API?
> 
> I don't think it is impossible to convert from
> 
>      printk("Testing feature XYZ..");
>      this_may_blow_up_because_of_hw_bugs();
>      printk(KERN_CONT " ... ok\n");
> 
> to
> 
>      printk("Testing feature XYZ:\n");
>      this_may_blow_up_because_of_hw_bugs();
>      printk("Testing feature XYZ.. ... ok\n");
> 
> in https://lore.kernel.org/lkml/CA+55aFwmwdY_mMqdEyFPpRhCKRyeqj=+aCqe5nN108v8ELFvPw@mail.gmail.com/ .

I just wonder how this pattern is common. I have tried but I failed
to find any instance.

This problem looks like a big argument against explicit buffers.
But I wonder if it is real.

> > 
> > So in my email I was not advertising printk_safe as a "buffered printk for
> > everyone", I was just talking about lockdep. It's a bit doubtful that Peter
> > will ACK lockdep transition to buffered printk.
> 
> What Linus is suggesting is "helper functions with no printk()", which is
> more difficult than my "helper functions with printk()" because we need to
> calculate enough buffer size (which may traverse many functions) because
> we can't printk() from helper functions when buffer size is too small.
> 
> I'm wondering if Linus still insists scattering "a data structure which
> Linus suggested" around. Rather, I'd like to propose below one. This is not
> perfect but modification is much smaller.
> 
> + * Line buffered printk() tries to assign a buffer when printk() from a new
> + * context identifier comes in. And it automatically releases that buffer when
> + * one of three conditions listed below became true.
> + *
> + *   (1) printk() from that context identifier emitted '\n' as the last
> + *       character of output.
> + *   (2) printk() from that context identifier tried to print a too long line
> + *       which cannot be stored into a buffer.
> + *   (3) printk() from a new context identifier noticed that some context
> + *       identifier is reserving a buffer for more than 10 seconds without
> + *       emitting '\n'.
> + *
> + * Since (3) is based on a heuristic that somebody forgot to emit '\n' as the
> + * last character of output(), pr_cont()/KERN_CONT users are expected to emit
> + * '\n' within 10 seconds even if they reserved a buffer.

This is my main concern about this approach. It is so easy to omit
the final '\n'.

They are currently delayed until another printk(). Even this is bad.
Unfortunately we could not setup timer from printk() because it
would add more locks into the game.

This patch will make it worse. They might get delayed by 10s or even
more. Many other lines might appear in between. Also the code is
more tricky[*].


Sign, I am really unhappy how the buffered_printk() conversion
looks in lockdep.c. But I still think that the approach is more
reliable. I am going to investigate much more pr_cont() users.
I wonder how many would be that complicated. I do not want
to give up just because one use-case that was complicated
even before.


[*] The buffer can get written and flushed by different processes.
    It is not trivial to do it correctly a lockless way.

    The proposed locking looks right on the first glance. But
    the code is a bit scary ;-)

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 3/3] lockdep: Use line-buffered printk() for lockdep messages.
  2018-11-09 15:43             ` Petr Mladek
@ 2018-11-10  2:42               ` Tetsuo Handa
  2018-11-23 12:46                 ` Petr Mladek
  2018-11-10  8:52               ` Tetsuo Handa
  1 sibling, 1 reply; 54+ messages in thread
From: Tetsuo Handa @ 2018-11-10  2:42 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Linus Torvalds, Sergey Senozhatsky,
	Dmitriy Vyukov, Steven Rostedt, Alexander Potapenko,
	Fengguang Wu, Josh Poimboeuf, LKML, Andrew Morton, linux-mm,
	Ingo Molnar, Peter Zijlstra, Will Deacon

On 2018/11/10 0:43, Petr Mladek wrote:
>> + * Line buffered printk() tries to assign a buffer when printk() from a new
>> + * context identifier comes in. And it automatically releases that buffer when
>> + * one of three conditions listed below became true.
>> + *
>> + *   (1) printk() from that context identifier emitted '\n' as the last
>> + *       character of output.
>> + *   (2) printk() from that context identifier tried to print a too long line
>> + *       which cannot be stored into a buffer.
>> + *   (3) printk() from a new context identifier noticed that some context
>> + *       identifier is reserving a buffer for more than 10 seconds without
>> + *       emitting '\n'.
>> + *
>> + * Since (3) is based on a heuristic that somebody forgot to emit '\n' as the
>> + * last character of output(), pr_cont()/KERN_CONT users are expected to emit
>> + * '\n' within 10 seconds even if they reserved a buffer.
> 
> This is my main concern about this approach. It is so easy to omit
> the final '\n'.

If it is so easy to forget the final '\n', there will be a lot of implicit
pr_cont() users (because pr_cont() assumes that previous printk() omitted the
final '\n'), and "I am going to investigate much more pr_cont() users." will
be insufficient for getting meaningful conclusion.

Checking "lack of the the final '\n'" means that we need to check
"all printk() users who are not emitting the final '\n'" and evaluate
"whether there is a possibility that subsequent printk() will not be
 called from that context due to e.g. conditional branches". That is an
impossible task for anybody, for there might be out-of-tree code doing it.

> 
> They are currently delayed until another printk(). Even this is bad.
> Unfortunately we could not setup timer from printk() because it
> would add more locks into the game.

We could use interval timer for flushing incomplete line.
But updating printk() users to always end with '\n' will be preferable.

> 
> This patch will make it worse. They might get delayed by 10s or even
> more. Many other lines might appear in between. Also the code is
> more tricky[*].

If there are a lot of implicit pr_cont() users (who are omitting
the final '\n' in previous printk()), we can try to find them (and
fix them) by "reporting the location of printk() which omitted the
final '\n'" instead of "flushing the partial line from different
processes" when "try_buffered_printk() (or an interval timer) found
that some buffer is holding a partial line for more than 10 seconds".

> 
> 
> Sign, I am really unhappy how the buffered_printk() conversion
> looks in lockdep.c. But I still think that the approach is more
> reliable. I am going to investigate much more pr_cont() users.

If there is a possibility that subsequent printk() will not be called from
that context due to e.g. conditional branches, we will need to flush after
previous printk() in order to eliminate possibility of failing to flush.

That is asking printk() users to fix previous printk() so that partial
line is always flushed (with '\n' or without '\n').

> I wonder how many would be that complicated. I do not want
> to give up just because one use-case that was complicated
> even before.
> 
> 
> [*] The buffer can get written and flushed by different processes.
>     It is not trivial to do it correctly a lockless way.
> 
>     The proposed locking looks right on the first glance. But
>     the code is a bit scary ;-)

try_buffered_printk() will be a good hook for finding implicit
pr_cont() users who are omitting the final '\n', for we can find them
without making changes to printk() users.

try_buffered_printk() hook can offload atomic-but-complicated task
(converting e.g. %pSOMETHING to %s) to outside of logbuf_lock, and
reduce the period of holding logbuf_lock.

Since try_buffered_printk() serves as a hook, if we ignore the offloading
atomic-but-complicated task, we can eliminate try_buffered_printk() after
we updated printk() users to always end with '\n'.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 3/3] lockdep: Use line-buffered printk() for lockdep messages.
  2018-11-09 15:43             ` Petr Mladek
  2018-11-10  2:42               ` Tetsuo Handa
@ 2018-11-10  8:52               ` Tetsuo Handa
  2018-11-23 12:52                 ` Petr Mladek
  1 sibling, 1 reply; 54+ messages in thread
From: Tetsuo Handa @ 2018-11-10  8:52 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Linus Torvalds, Sergey Senozhatsky,
	Dmitriy Vyukov, Steven Rostedt, Alexander Potapenko,
	Fengguang Wu, Josh Poimboeuf, LKML, Andrew Morton, linux-mm,
	Ingo Molnar, Peter Zijlstra, Will Deacon

On 2018/11/10 0:43, Petr Mladek wrote:
> On Fri 2018-11-09 18:55:26, Tetsuo Handa wrote:
>> How early_printk requirement affects line buffered printk() API?
>>
>> I don't think it is impossible to convert from
>>
>>      printk("Testing feature XYZ..");
>>      this_may_blow_up_because_of_hw_bugs();
>>      printk(KERN_CONT " ... ok\n");
>>
>> to
>>
>>      printk("Testing feature XYZ:\n");
>>      this_may_blow_up_because_of_hw_bugs();
>>      printk("Testing feature XYZ.. ... ok\n");
>>
>> in https://lore.kernel.org/lkml/CA+55aFwmwdY_mMqdEyFPpRhCKRyeqj=+aCqe5nN108v8ELFvPw@mail.gmail.com/ .
> 
> I just wonder how this pattern is common. I have tried but I failed
> to find any instance.
> 
> This problem looks like a big argument against explicit buffers.
> But I wonder if it is real.

An example of boot up messages where buffering makes difference.

Vanilla:

[    0.260459] smp: Bringing up secondary CPUs ...
[    0.269595] x86: Booting SMP configuration:
[    0.270461] .... node  #0, CPUs:      #1
[    0.066578] Disabled fast string operations
[    0.066578] mce: CPU supports 0 MCE banks
[    0.066578] smpboot: CPU 1 Converting physical 2 to logical package 1
[    0.342569]  #2
[    0.066578] Disabled fast string operations
[    0.066578] mce: CPU supports 0 MCE banks
[    0.066578] smpboot: CPU 2 Converting physical 4 to logical package 2
[    0.413442]  #3
[    0.066578] Disabled fast string operations
[    0.066578] mce: CPU supports 0 MCE banks
[    0.066578] smpboot: CPU 3 Converting physical 6 to logical package 3
[    0.476562] smp: Brought up 1 node, 4 CPUs
[    0.477477] smpboot: Max logical packages: 8
[    0.477514] smpboot: Total of 4 processors activated (22691.70 BogoMIPS)

With try_buffered_printk() patch:

[    0.279768] smp: Bringing up secondary CPUs ...
[    0.288825] x86: Booting SMP configuration:
[    0.066748] Disabled fast string operations
[    0.066748] mce: CPU supports 0 MCE banks
[    0.066748] smpboot: CPU 1 Converting physical 2 to logical package 1
[    0.066748] Disabled fast string operations
[    0.066748] mce: CPU supports 0 MCE banks
[    0.066748] smpboot: CPU 2 Converting physical 4 to logical package 2
[    0.066748] Disabled fast string operations
[    0.066748] mce: CPU supports 0 MCE banks
[    0.066748] smpboot: CPU 3 Converting physical 6 to logical package 3
[    0.495862] .... node  #0, CPUs:      #1 #2 #3\x016smp: Brought up 1 node, 4 CPUs
[    0.496833] smpboot: Max logical packages: 8
[    0.497609] smpboot: Total of 4 processors activated (22665.22 BogoMIPS)



Hmm, arch/x86/kernel/smpboot.c is not emitting '\n' after #num

        if (system_state < SYSTEM_RUNNING) {
                if (node != current_node) {
                        if (current_node > (-1))
                                pr_cont("\n");
                        current_node = node;

                        printk(KERN_INFO ".... node %*s#%d, CPUs:  ",
                               node_width - num_digits(node), " ", node);
                }

                /* Add padding for the BSP */
                if (cpu == 1)
                        pr_cont("%*s", width + 1, " ");

                pr_cont("%*s#%d", width - num_digits(cpu), " ", cpu);

        } else
                pr_info("Booting Node %d Processor %d APIC 0x%x\n",
                        node, cpu, apicid);

and causing

        pr_info("Brought up %d node%s, %d CPU%s\n",
                num_nodes, (num_nodes > 1 ? "s" : ""),
                num_cpus,  (num_cpus  > 1 ? "s" : ""));

line to be concatenated to previous line.
Maybe disable try_buffered_printk() if system_state != SYSTEM_RUNNING ?

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v6 1/3] printk: Add line-buffered printk() API.
  2018-11-09 14:10           ` Petr Mladek
@ 2018-11-12  7:59             ` Sergey Senozhatsky
  2018-11-12 10:42               ` Tetsuo Handa
  2018-11-17 10:14               ` Tetsuo Handa
  0 siblings, 2 replies; 54+ messages in thread
From: Sergey Senozhatsky @ 2018-11-12  7:59 UTC (permalink / raw)
  To: Petr Mladek, Tetsuo Handa
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Dmitriy Vyukov,
	Steven Rostedt, Alexander Potapenko, Fengguang Wu,
	Josh Poimboeuf, LKML, Linus Torvalds, Andrew Morton, linux-mm,
	Ingo Molnar, Peter Zijlstra, Will Deacon

On (11/09/18 15:10), Petr Mladek wrote:
> > 
> > If I'm not mistaken, this is for the futute "printk injection" work.
> 
> The above code only tries to push complete lines to the main log buffer
> and consoles ASAP. It sounds like a Good Idea(tm).

Probably it is. So *quite likely* I'm wrong here.

Hmm... Thinking out loud.
At the same time, splitting a single logbuf entry gives a chance to other
events to mix in. Example:
	pr_cont("Foo:");
	pr_cont("\nbar");
	pr_cont("\n");

Previously it could been stored in one logbuf entry (one logstore,
one console_trylock_spinning), which means that it could have been
printed in one go:

	call_console_drivers()
		spin_trylock_irqsave(&port->lock, flags);
		uart_console_write(.... "Foo:\nbar\n");
		spin_unlock_irqrestore(&port->lock, flags);

While with buffered printk, it will be store in two logbuf entries,
and printed in two goes:

	call_console_drivers()
		spin_trylock_irqsave(&port->lock, flags);
		uart_console_write(.... "Foo:\nbar\n");
		spin_unlock_irqrestore(&port->lock, flags);

	<< ... console driver IRQ, TX port->state->xmit chars ... >>>

	call_console_drivers()
		spin_trylock_irqsave(&port->lock, flags);
		uart_console_write(.... "Foo:\nbar\n");
		spin_unlock_irqrestore(&port->lock, flags);

So, *technically*, we now let more events to happens between printk-s.

But, let's look at some of pr_cont() usage examples.
E.g. dump() from arch/h8300/kernel/traps.c. The code in question looks
as follows:

	pr_info("\nKERNEL STACK:");
	tp = ((unsigned char *) fp) - 0x40;
	for (sp = (unsigned long *) tp, i = 0; (i < 0x40);  i += 4) {
		if ((i % 0x10) == 0)
			pr_info("\n%08x: ", (int) (tp + i));
		pr_info("%08x ", (int) *sp++);
	}
	pr_info("\n");

dmesg

[   15.260099] 0000:    00000000  00000010  00000040  00000090
               0010:    00000100  00000190  00000240  00000310
               0020:    00000400  00000510  00000640  00000790
               0030:    00000900  00000a90  00000c40  00000e10

So we have the entire KERNEL STACK stored as a single line, in
a single logbuf entry.

cat /dev/kmsg

4,687,15260099,c;\x0a0000:    00000000  00000010  00000040  00000090  \x0a0010:    00000100  00000190  00000240  00000310  \x0a0020:    00000400  00000510  00000640  00000790  \x0a0030:    00000900  00000a90  00000c40  00000e10

Shall we consider this as a "reference" representation: data that
pr_cont(), ideally, would have stored as a single logbuf entry? And
then compare the "reference" representation and what buffered printk
does.

Buffered printk always stores this as several lines, IOW several
logbuf entries.

cat /dev/kmsg

4,690,15260152,-;0000:    00000000  00000010  00000040  00000090  
4,691,15260154,-;0010:    00000100  00000190  00000240  00000310  
4,692,15260157,-;0020:    00000400  00000510  00000640  00000790  
4,694,15260161,-;0030:    00000900  00000a90  00000c40  00000e10  

So if we will have concurrent printk()-s happening on other CPUs,
then the KERNEL STACK data block still can be a bit hard to read:

[   15.260152] 0000:    00000000  00000010  00000040  00000090  
<printk CPU1> ... foo bar foo bar
<printk CPU2> ... foo bar foo bar
...
[   15.260154] 0010:    00000100  00000190  00000240  00000310  
<printk CPU3> ... foo bar foo bar
<printk CPU2> ... foo bar foo bar
...
              ... and so on; you got the idea.

> No, please note that the for cycle searches for '\n' from the end
> of the string.

You are right, I didn't notice that. Indeed.


Tetsuo, lockdep report with buffered printks looks a bit different:

 kernel:  Possible unsafe locking scenario:
 kernel:        CPU0                    CPU1
 kernel:        ----                    ----
 kernel:   lock(bar_lock);
 kernel:                                lock(
 kernel: foo_lock);
 kernel:                                lock(bar_lock);
 kernel:   lock(foo_lock);
 kernel: 


> > 	len = vsprintf();
> > 	if (len && text[len - 1] == '\n' || overflow)
> > 		flush();
> 
> I had the same idea. Tetsuo ignored it. I looked for more arguments
> and found that '\n' is used in the middle of several pr_cont()
> calls

OK, so we probably can have that new semantics. But we might split
something that possibly was meant to be a single line with a bunch
of \n in the middle.

	-ss

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v6 1/3] printk: Add line-buffered printk() API.
  2018-11-12  7:59             ` Sergey Senozhatsky
@ 2018-11-12 10:42               ` Tetsuo Handa
  2018-11-17 10:14               ` Tetsuo Handa
  1 sibling, 0 replies; 54+ messages in thread
From: Tetsuo Handa @ 2018-11-12 10:42 UTC (permalink / raw)
  To: Sergey Senozhatsky, Petr Mladek
  Cc: Sergey Senozhatsky, Dmitriy Vyukov, Steven Rostedt,
	Alexander Potapenko, Fengguang Wu, Josh Poimboeuf, LKML,
	Linus Torvalds, Andrew Morton, linux-mm, Ingo Molnar,
	Peter Zijlstra, Will Deacon

On 2018/11/12 16:59, Sergey Senozhatsky wrote:
> Tetsuo, lockdep report with buffered printks looks a bit different:
> 
>  kernel:  Possible unsafe locking scenario:
>  kernel:        CPU0                    CPU1
>  kernel:        ----                    ----
>  kernel:   lock(bar_lock);
>  kernel:                                lock(
>  kernel: foo_lock);
>  kernel:                                lock(bar_lock);
>  kernel:   lock(foo_lock);
>  kernel: 
> 

Yes. That is because vprintk_buffered() eliminated KERN_CONT.
Since there was pending partial printk() output, vprintk_buffered()
should not eliminate KERN_CONT. Petr Mladek commented

  1. The mixing of normal and buffered printk calls is a bit confusing
     and error prone. It would make sense to use the buffered printk
     everywhere in the given section of code even when it is not
     strictly needed.

and I made a draft version for how the code would look like if we try to
avoid the mixing of normal and buffered printk calls. The required changes
seem to be too large to apply tree wide. And I suggested try_buffered_printk().

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v6 1/3] printk: Add line-buffered printk() API.
  2018-11-12  7:59             ` Sergey Senozhatsky
  2018-11-12 10:42               ` Tetsuo Handa
@ 2018-11-17 10:14               ` Tetsuo Handa
  1 sibling, 0 replies; 54+ messages in thread
From: Tetsuo Handa @ 2018-11-17 10:14 UTC (permalink / raw)
  To: Sergey Senozhatsky, Petr Mladek
  Cc: Sergey Senozhatsky, Dmitriy Vyukov, Steven Rostedt,
	Alexander Potapenko, Fengguang Wu, Josh Poimboeuf, LKML,
	Linus Torvalds, Andrew Morton, linux-mm, Ingo Molnar,
	Peter Zijlstra, Will Deacon

On 2018/11/12 16:59, Sergey Senozhatsky wrote:
> On (11/09/18 15:10), Petr Mladek wrote:
>>>
>>> If I'm not mistaken, this is for the futute "printk injection" work.
>>
>> The above code only tries to push complete lines to the main log buffer
>> and consoles ASAP. It sounds like a Good Idea(tm).
> 
> Probably it is. So *quite likely* I'm wrong here.

In the future, I want to inject caller information to standard console output,
for syzbot is using serial console

  [    0.000000] Linux version 4.20.0-rc2-next-20181116+ (syzkaller@ci) (gcc version 8.0.1 20180413 (experimental) (GCC)) #120 SMP PREEMPT Fri Nov 16 20:21:42 UTC 2018
  [    0.000000] Command line: BOOT_IMAGE=/vmlinuz root=/dev/sda1 console=ttyS0 earlyprintk=serial vsyscall=native rodata=n oops=panic panic_on_warn=1 nmi_watchdog=panic panic=86400 security=apparmor ima_policy=tcb workqueue.watchdog_thresh=140 kvm-intel.nested=1 nf-conntrack-ftp.ports=20000 nf-conntrack-tftp.ports=20000 nf-conntrack-sip.ports=20000 nf-conntrack-irc.ports=20000 nf-conntrack-sane.ports=20000 vivid.n_devs=16 vivid.multiplanar=1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2 nopcid

and is experiencing confusion due to concurrent printk() like

  [  552.907865] syz-executor0: vmalloc: allocation failure: 256 bytes, mode:0x6000c0(GFP_KERNEL), nodemask=(null)
  [  552.921666] ------------[ cut here ]------------
  [  552.926434] DEBUG_LOCKS_WARN_ON(depth <= 0)
  [  552.926563] WARNING: CPU: 0 PID: 24211 at kernel/locking/lockdep.c:3595 lock_release+0x740/0xa10
  [  552.927489] syz-executor0 cpuset=syz0 mems_allowed=0
  [  552.930893] Kernel panic - not syncing: panic_on_warn set ...
  [  552.943750] CPU: 1 PID: 24208 Comm: syz-executor0 Not tainted 4.20.0-rc1-next-20181109+ #110
  [  552.959345] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
  [  552.968694] Call Trace:
  [  552.971296]  dump_stack+0x244/0x39d
  [  552.974934]  ? dump_stack_print_info.cold.1+0x20/0x20
  [  552.980138]  ? rcu_lockdep_current_cpu_online+0x1a4/0x210
  [  552.985685]  warn_alloc.cold.116+0xb7/0x1bd
  [  552.990011]  ? zone_watermark_ok_safe+0x3f0/0x3f0
  [  552.994884]  ? __get_vm_area_node+0x130/0x3a0
  [  552.999387]  ? rcu_read_lock_sched_held+0x14f/0x180
  [  553.004414]  ? __might_fault+0x12b/0x1e0
  [  553.008487]  ? __get_vm_area_node+0x2e5/0x3a0
  [  553.013000]  __vmalloc_node_range+0x472/0x750
  [  553.017514]  ? do_replace+0x23b/0x4c0
  [  553.021327]  vmalloc+0x6f/0x80
  (...snipped...)
  [  553.296333]  __warn.cold.8+0x20/0x45
  [  553.296353]  ? lock_release+0x740/0xa10
  [  553.329854] Node 0 DMA free:15908kB min:164kB low:204kB high:244kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15992kB managed:15908kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
  [  553.332731]  report_bug+0x254/0x2d0
  [  553.332796]  do_error_trap+0x11b/0x200
  [  553.332822]  do_invalid_op+0x36/0x40
  [  553.336825] lowmem_reserve[]: 0 2818 6321 6321
  [  553.362888]  ? lock_release+0x740/0xa10
  [  553.362906]  invalid_op+0x14/0x20
  [  553.362920] RIP: 0010:lock_release+0x740/0xa10
  [  553.362935] Code: 03 38 d0 7c 08 84 d2 0f 85 da 02 00 00 8b 35 a7 95 b3 08 85 f6 75 15 48 c7 c6 20 66 2b 88 48 c7 c7 c0 33 2b 88 e8 10 36 e7 ff <0f> 0b 48 8b 95 e8 fe ff ff 4c 89 f7 48 8b b5 f0 fe ff ff e8 e8 58
  [  553.362941] RSP: 0018:ffff8801839ef868 EFLAGS: 00010086
  [  553.362950] RAX: 0000000000000000 RBX: 1ffff1003073df12 RCX: ffffc9000be41000
  [  553.362957] RDX: 00000000000087b4 RSI: ffffffff8165ba15 RDI: 0000000000000006
  [  553.362966] RBP: ffff8801839ef998 R08: ffff88017c010600 R09: fffffbfff12b2254
  [  553.362974] R10: fffffbfff12b2254 R11: ffffffff895912a3 R12: ffffffff8b0e27a0
  [  553.362989] R13: ffff8801839ef970 R14: ffff88017c010600 R15: ffff8801839ef8b0
  [  553.373138] Node 0 DMA32 free:2887356kB min:30052kB low:37564kB high:45076kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:3129332kB managed:2888772kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:1416kB local_pcp:0kB free_cma:0kB
  [  553.374209]  ? vprintk_func+0x85/0x181
  [  553.387199] lowmem_reserve[]: 0 0 3503 3503
  [  553.390755]  ? lock_release+0x740/0xa10
  [  553.390840]  ? loop_control_ioctl+0xf5/0x4e0

in https://syzkaller.appspot.com/bug?id=e7838e1e659de1ba566bb17410438b9d1dc59eb7 .

Thus, I appreciate if we could mitigate KERN_CONT problem using try_buffered_printk()
and move on to "caller information injection" topic. I'm fine with starting with
CONFIG_DEBUG_AID_FOR_SYZBOT=y.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 3/3] lockdep: Use line-buffered printk() for lockdep messages.
  2018-11-10  2:42               ` Tetsuo Handa
@ 2018-11-23 12:46                 ` Petr Mladek
  2018-11-23 13:12                   ` Tetsuo Handa
  2018-11-23 15:56                   ` Steven Rostedt
  0 siblings, 2 replies; 54+ messages in thread
From: Petr Mladek @ 2018-11-23 12:46 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Sergey Senozhatsky, Linus Torvalds, Sergey Senozhatsky,
	Dmitriy Vyukov, Steven Rostedt, Alexander Potapenko,
	Fengguang Wu, Josh Poimboeuf, LKML, Andrew Morton, linux-mm,
	Ingo Molnar, Peter Zijlstra, Will Deacon

On Sat 2018-11-10 11:42:03, Tetsuo Handa wrote:
> On 2018/11/10 0:43, Petr Mladek wrote:
> >> + * Line buffered printk() tries to assign a buffer when printk() from a new
> >> + * context identifier comes in. And it automatically releases that buffer when
> >> + * one of three conditions listed below became true.
> >> + *
> >> + *   (1) printk() from that context identifier emitted '\n' as the last
> >> + *       character of output.
> >> + *   (2) printk() from that context identifier tried to print a too long line
> >> + *       which cannot be stored into a buffer.
> >> + *   (3) printk() from a new context identifier noticed that some context
> >> + *       identifier is reserving a buffer for more than 10 seconds without
> >> + *       emitting '\n'.
> >> + *
> >> + * Since (3) is based on a heuristic that somebody forgot to emit '\n' as the
> >> + * last character of output(), pr_cont()/KERN_CONT users are expected to emit
> >> + * '\n' within 10 seconds even if they reserved a buffer.
> > 
> > This is my main concern about this approach. It is so easy to omit
> > the final '\n'.
> 
> If it is so easy to forget the final '\n', there will be a lot of implicit
> pr_cont() users (because pr_cont() assumes that previous printk() omitted the
> final '\n'), and "I am going to investigate much more pr_cont() users." will
> be insufficient for getting meaningful conclusion.
> 
> Checking "lack of the the final '\n'" means that we need to check
> "all printk() users who are not emitting the final '\n'" and evaluate
> "whether there is a possibility that subsequent printk() will not be
>  called from that context due to e.g. conditional branches". That is an
> impossible task for anybody, for there might be out-of-tree code doing it.
> > 
> > They are currently delayed until another printk(). Even this is bad.
> > Unfortunately we could not setup timer from printk() because it
> > would add more locks into the game.
> 
> We could use interval timer for flushing incomplete line.

I am more and more wondering if the buffered printk is worth
the effort. The more buffers we use the more we risk that nobody
would see some important message. Even a part of the line
might be crucial in some situations.

Steven told me on Plumbers conference that even few initial
characters saved him a day few times.


> But updating printk() users to always end with '\n' will be preferable.

This sounds like a whack a mole game. If I get it correctly, you write
that it is "an impossible task for anybody" just few lines above.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 3/3] lockdep: Use line-buffered printk() for lockdep messages.
  2018-11-10  8:52               ` Tetsuo Handa
@ 2018-11-23 12:52                 ` Petr Mladek
  0 siblings, 0 replies; 54+ messages in thread
From: Petr Mladek @ 2018-11-23 12:52 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Sergey Senozhatsky, Linus Torvalds, Sergey Senozhatsky,
	Dmitriy Vyukov, Steven Rostedt, Alexander Potapenko,
	Fengguang Wu, Josh Poimboeuf, LKML, Andrew Morton, linux-mm,
	Ingo Molnar, Peter Zijlstra, Will Deacon

On Sat 2018-11-10 17:52:17, Tetsuo Handa wrote:
> On 2018/11/10 0:43, Petr Mladek wrote:
> > On Fri 2018-11-09 18:55:26, Tetsuo Handa wrote:
> >> How early_printk requirement affects line buffered printk() API?
> >>
> >> I don't think it is impossible to convert from
> >>
> >>      printk("Testing feature XYZ..");
> >>      this_may_blow_up_because_of_hw_bugs();
> >>      printk(KERN_CONT " ... ok\n");
> >>
> >> to
> >>
> >>      printk("Testing feature XYZ:\n");
> >>      this_may_blow_up_because_of_hw_bugs();
> >>      printk("Testing feature XYZ.. ... ok\n");
> >>
> >> in https://lore.kernel.org/lkml/CA+55aFwmwdY_mMqdEyFPpRhCKRyeqj=+aCqe5nN108v8ELFvPw@mail.gmail.com/ .
> > 
> > I just wonder how this pattern is common. I have tried but I failed
> > to find any instance.
> > 
> > This problem looks like a big argument against explicit buffers.
> > But I wonder if it is real.
> 
> An example of boot up messages where buffering makes difference.
> 
> Vanilla:
> 
> [    0.260459] smp: Bringing up secondary CPUs ...
> [    0.269595] x86: Booting SMP configuration:
> [    0.270461] .... node  #0, CPUs:      #1
> [    0.066578] Disabled fast string operations
> [    0.066578] mce: CPU supports 0 MCE banks
> [    0.066578] smpboot: CPU 1 Converting physical 2 to logical package 1
> [    0.342569]  #2
> [    0.066578] Disabled fast string operations
> [    0.066578] mce: CPU supports 0 MCE banks
> [    0.066578] smpboot: CPU 2 Converting physical 4 to logical package 2
> [    0.413442]  #3
> [    0.066578] Disabled fast string operations
> [    0.066578] mce: CPU supports 0 MCE banks
> [    0.066578] smpboot: CPU 3 Converting physical 6 to logical package 3
> [    0.476562] smp: Brought up 1 node, 4 CPUs
> [    0.477477] smpboot: Max logical packages: 8
> [    0.477514] smpboot: Total of 4 processors activated (22691.70 BogoMIPS)
> 
> With try_buffered_printk() patch:
> 
> [    0.279768] smp: Bringing up secondary CPUs ...
> [    0.288825] x86: Booting SMP configuration:
> [    0.066748] Disabled fast string operations
> [    0.066748] mce: CPU supports 0 MCE banks
> [    0.066748] smpboot: CPU 1 Converting physical 2 to logical package 1
> [    0.066748] Disabled fast string operations
> [    0.066748] mce: CPU supports 0 MCE banks
> [    0.066748] smpboot: CPU 2 Converting physical 4 to logical package 2
> [    0.066748] Disabled fast string operations
> [    0.066748] mce: CPU supports 0 MCE banks
> [    0.066748] smpboot: CPU 3 Converting physical 6 to logical package 3
> [    0.495862] .... node  #0, CPUs:      #1 #2 #3\x016smp: Brought up 1 node, 4 CPUs
> [    0.496833] smpboot: Max logical packages: 8
> [    0.497609] smpboot: Total of 4 processors activated (22665.22 BogoMIPS)
> 
> 
> 
> Hmm, arch/x86/kernel/smpboot.c is not emitting '\n' after #num
> 
>         if (system_state < SYSTEM_RUNNING) {
>                 if (node != current_node) {
>                         if (current_node > (-1))
>                                 pr_cont("\n");
>                         current_node = node;
> 
>                         printk(KERN_INFO ".... node %*s#%d, CPUs:  ",
>                                node_width - num_digits(node), " ", node);
>                 }
> 
>                 /* Add padding for the BSP */
>                 if (cpu == 1)
>                         pr_cont("%*s", width + 1, " ");
> 
>                 pr_cont("%*s#%d", width - num_digits(cpu), " ", cpu);
> 
>         } else
>                 pr_info("Booting Node %d Processor %d APIC 0x%x\n",
>                         node, cpu, apicid);
> 
> and causing
> 
>         pr_info("Brought up %d node%s, %d CPU%s\n",
>                 num_nodes, (num_nodes > 1 ? "s" : ""),
>                 num_cpus,  (num_cpus  > 1 ? "s" : ""));
> 
> line to be concatenated to previous line.
> Maybe disable try_buffered_printk() if system_state !=
> SYSTEM_RUNNING ?

We need to solve continuous lines also during boot. Also similar
problems might be in the code that is called in SYSTEM_RUNNING state.

This is yet another clue that try_buffered_printk() approach is not
that good idea.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 3/3] lockdep: Use line-buffered printk() for lockdep messages.
  2018-11-23 12:46                 ` Petr Mladek
@ 2018-11-23 13:12                   ` Tetsuo Handa
  2018-11-23 15:56                   ` Steven Rostedt
  1 sibling, 0 replies; 54+ messages in thread
From: Tetsuo Handa @ 2018-11-23 13:12 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Linus Torvalds, Sergey Senozhatsky,
	Dmitriy Vyukov, Steven Rostedt, Alexander Potapenko,
	Fengguang Wu, Josh Poimboeuf, LKML, Andrew Morton, linux-mm,
	Ingo Molnar, Peter Zijlstra, Will Deacon

On 2018/11/23 21:46, Petr Mladek wrote:
> I am more and more wondering if the buffered printk is worth
> the effort. The more buffers we use the more we risk that nobody
> would see some important message. Even a part of the line
> might be crucial in some situations.
> 
> Steven told me on Plumbers conference that even few initial
> characters saved him a day few times.

Yes, few initial characters of one line might sometimes help.
But emitting one line at a time also helps sometimes; especially
when we need to interpret multiple lines and group these lines from
concurrent printk() callers.

> 
> 
>> But updating printk() users to always end with '\n' will be preferable.
> 
> This sounds like a whack a mole game. If I get it correctly, you write
> that it is "an impossible task for anybody" just few lines above.

Yes, updating printk() users is almost impossible. I think that something
like diff shown below is what we can afford at best. If you believe that
line buffering is wrong, I can tolerate with switching via kernel config
option like CONFIG_DEBUG_AID_FOR_SYZBOT=y.

 kernel/printk/printk.c | 263 +++++++++++++++++++++++++++++++++----------------
 1 file changed, 176 insertions(+), 87 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 315718b..2276c99 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -584,6 +584,15 @@ static int log_store(int facility, int level,
 	struct printk_log *msg;
 	u32 size, pad_len;
 	u16 trunc_msg_len = 0;
+	static char id[32];
+	u16 id_len;
+
+	if (in_task())
+		id_len = scnprintf(id, sizeof(id), "[T%u] ", current->pid);
+	else
+		id_len = scnprintf(id, sizeof(id), "[C%u] ",
+				   raw_smp_processor_id());
+	text_len += id_len;
 
 	/* number of '\0' padding bytes to next message */
 	size = msg_used_size(text_len, dict_len, &pad_len);
@@ -607,12 +616,16 @@ static int log_store(int facility, int level,
 		log_next_idx = 0;
 	}
 
+	text_len -= id_len;
+
 	/* fill message */
 	msg = (struct printk_log *)(log_buf + log_next_idx);
-	memcpy(log_text(msg), text, text_len);
-	msg->text_len = text_len;
+	memcpy(log_text(msg), id, id_len);
+	memcpy(log_text(msg) + id_len, text, text_len);
+	msg->text_len = text_len + id_len;
 	if (trunc_msg_len) {
-		memcpy(log_text(msg) + text_len, trunc_msg, trunc_msg_len);
+		memcpy(log_text(msg) + id_len + text_len, trunc_msg,
+		       trunc_msg_len);
 		msg->text_len += trunc_msg_len;
 	}
 	memcpy(log_dict(msg), dict, dict_len);
@@ -620,10 +633,7 @@ static int log_store(int facility, int level,
 	msg->facility = facility;
 	msg->level = level & 7;
 	msg->flags = flags & 0x1f;
-	if (ts_nsec > 0)
-		msg->ts_nsec = ts_nsec;
-	else
-		msg->ts_nsec = local_clock();
+	msg->ts_nsec = ts_nsec;
 	memset(log_dict(msg) + dict_len, 0, pad_len);
 	msg->len = size;
 
@@ -1758,87 +1768,154 @@ static inline void printk_delay(void)
 }
 
 /*
- * Continuation lines are buffered, and not committed to the record buffer
- * until the line is complete, or a race forces it. The line fragments
- * though, are printed immediately to the consoles to ensure everything has
- * reached the console in case of a kernel crash.
+ * Continuation lines are buffered based on type of context, and not committed
+ * to the record buffer until the line is complete, or a race forces it.
+ *
+ * While this context calculation is not perfect, asking printk() callers to
+ * explicitly pass "struct printk_buffer" (except "context" field) as their
+ * function arguments will require too large tree-wide changes. Therefore,
+ * let's tolerate failing to separate "up to one line of messages", and avoid
+ * bloating kernel code size for addressing infrequently happening races.
  */
-static struct cont {
-	char buf[LOG_LINE_MAX];
-	size_t len;			/* length == 0 means unused buffer */
-	struct task_struct *owner;	/* task of first print*/
-	u64 ts_nsec;			/* time of first print */
-	u8 level;			/* log level of first message */
-	u8 facility;			/* log facility of first message */
-	enum log_flags flags;		/* prefix, newline flags */
-} cont;
-
-static void cont_flush(void)
+static inline unsigned long printk_context(void)
 {
-	if (cont.len == 0)
-		return;
-
-	log_store(cont.facility, cont.level, cont.flags, cont.ts_nsec,
-		  NULL, 0, cont.buf, cont.len);
-	cont.len = 0;
-}
+	unsigned long base;
 
-static bool cont_add(int facility, int level, enum log_flags flags, const char *text, size_t len)
-{
-	/* If the line gets too long, split it up in separate records. */
-	if (cont.len + len > sizeof(cont.buf)) {
-		cont_flush();
-		return false;
-	}
-
-	if (!cont.len) {
-		cont.facility = facility;
-		cont.level = level;
-		cont.owner = current;
-		cont.ts_nsec = local_clock();
-		cont.flags = flags;
-	}
-
-	memcpy(cont.buf + cont.len, text, len);
-	cont.len += len;
-
-	// The original flags come from the first line,
-	// but later continuations can add a newline.
-	if (flags & LOG_NEWLINE) {
-		cont.flags |= LOG_NEWLINE;
-		cont_flush();
-	}
-
-	return true;
+	if (in_task())
+		return (unsigned long) current;
+	if (in_nmi())
+		base = NR_CPUS * 2;
+	else if (in_irq())
+		base = NR_CPUS;
+	else
+		base = 0;
+	return base + raw_smp_processor_id();
 }
 
-static size_t log_output(int facility, int level, enum log_flags lflags, const char *dict, size_t dictlen, char *text, size_t text_len)
-{
+struct printk_buffer {
+	char buf[LOG_LINE_MAX];
+	/* valid bytes in buf[] */
+	size_t len;
 	/*
-	 * If an earlier line was buffered, and we're a continuation
-	 * write from the same process, try to add it to the buffer.
+	 * owner context of this buffer if len != 0
+	 *
+	 * 0 to NR_CPUS - 1: A soft IRQ context reserved.
+	 * NR_CPUS to NR_CPUS * 2 - 1: A hard IRQ context reserved.
+	 * NR_CPUS * 2 to NR_CPUS * 3 - 1: An NMI context reserved.
+	 * NR_CPUS * 3 to ULONG_MAX: Some task context reserved.
 	 */
-	if (cont.len) {
-		if (cont.owner == current && (lflags & LOG_CONT)) {
-			if (cont_add(facility, level, lflags, text, text_len))
-				return text_len;
+	unsigned long context;
+	/* last accessed jiffies */
+	unsigned long last_used;
+	/* time of first print */
+	u64 ts_nsec;
+	/* log flags of first message */
+	enum log_flags lflags;
+	/* log level of first message */
+	u8 level;
+} __aligned(1024);
+
+/* Number of line buffers. */
+#define NUM_LINE_BUFFERS 16
+
+/*
+ * Line buffered printk() tries to assign a buffer when printk() from a new
+ * context identifier comes in. And it automatically releases that buffer when
+ * one of three conditions listed below became true.
+ *
+ *   (1) printk() from that context identifier emitted '\n' as the last
+ *       character of output.
+ *   (2) printk() from that context identifier tried to print a too long line
+ *       which cannot be stored into a buffer.
+ *   (3) printk() from a new context identifier noticed that some context
+ *       identifier is reserving a buffer for more than 10 seconds without
+ *       emitting '\n'.
+ *
+ * Since (3) is based on a heuristic that somebody forgot to emit '\n' as the
+ * last character of output(), pr_cont()/KERN_CONT users are expected to emit
+ * '\n' within 10 seconds even if they reserved a buffer.
+ */
+static void log_output(const int facility, const int level,
+		       const enum log_flags lflags,
+		       const char *text, const size_t text_len)
+{
+	static struct printk_buffer printk_buffers[NUM_LINE_BUFFERS];
+	static struct printk_buffer static_buffer;
+	struct printk_buffer *ptr;
+	int i;
+	const unsigned long context = printk_context();
+	const unsigned long now = jiffies;
+
+	/* Only kernel-generated messages are subjected to buffering. */
+	if (facility) {
+		ptr = &static_buffer;
+		goto found;
+	}
+	/* Check if this context already reserved a buffer. */
+	for (i = 0; i < NUM_LINE_BUFFERS; i++) {
+		ptr = &printk_buffers[i];
+		if (ptr->len && context == ptr->context)
+			goto found;
+	}
+	/* Check if somebody is reserving a buffer for too long. */
+	for (i = 0; i < NUM_LINE_BUFFERS; i++) {
+		ptr = &printk_buffers[i];
+		if (!ptr->len || !time_after(now, ptr->last_used + 10 * HZ))
+			continue;
+		/* Forced flush due to timeout. */
+		log_store(facility, ptr->level, ptr->lflags, ptr->ts_nsec,
+			  NULL, 0, ptr->buf, ptr->len);
+		ptr->len = 0;
+	}
+	/* Check if this context can reserve a buffer. */
+	for (i = 0; i < NUM_LINE_BUFFERS; i++) {
+		ptr = &printk_buffers[i];
+		if (ptr->len == 0) {
+			ptr->context = context;
+			goto found;
 		}
-		/* Otherwise, make sure it's flushed */
-		cont_flush();
 	}
+	/* Forced assign due to out of buffers. */
+	ptr = &static_buffer;
+ found:
+	/* Forced flush due to log prefix or out of space. */
+	if (((ptr->len && (lflags & LOG_PREFIX)) ||
+	     text_len + ptr->len > sizeof(ptr->buf))) {
+		log_store(facility, ptr->level, ptr->lflags, ptr->ts_nsec,
+			  NULL, 0, ptr->buf, ptr->len);
+		ptr->len = 0;
+	}
+	if (ptr->len == 0) {
+		ptr->lflags = lflags;
+		ptr->level = level;
+		ptr->ts_nsec = local_clock();
+	}
+	/* 0 <= text_len <= LOG_LINE_MAX due to vscnprintf(). */
+	memmove(ptr->buf + ptr->len, text, text_len);
+	ptr->len += text_len;
+	/*
+	 * Flush already completed lines. By splitting at '\n', we can inject
+	 * caller id information to each line of text information.
+	 */
+	while (true) {
+		size_t len;
+		char *cp = memchr(ptr->buf, '\n', ptr->len);
 
-	/* Skip empty continuation lines that couldn't be added - they just flush */
-	if (!text_len && (lflags & LOG_CONT))
-		return 0;
-
-	/* If it doesn't end in a newline, try to buffer the current line */
-	if (!(lflags & LOG_NEWLINE)) {
-		if (cont_add(facility, level, lflags, text, text_len))
-			return text_len;
+		if (!cp)
+			break;
+		len = ++cp - ptr->buf;
+		log_store(facility, ptr->level, ptr->lflags, ptr->ts_nsec,
+			  NULL, 0, ptr->buf, len - 1);
+		ptr->len -= len;
+		memmove(ptr->buf, cp, ptr->len);
 	}
-
-	/* Store it in the record log */
-	return log_store(facility, level, lflags, 0, dict, dictlen, text, text_len);
+	/* Forced flush due to out of buffers. */
+	if (ptr == &static_buffer && ptr->len) {
+		log_store(facility, ptr->level, ptr->lflags, ptr->ts_nsec,
+			  NULL, 0, ptr->buf, ptr->len);
+		ptr->len = 0;
+	}
+	ptr->last_used = now;
 }
 
 /* Must be called under logbuf_lock. */
@@ -1857,12 +1934,6 @@ int vprintk_store(int facility, int level,
 	 */
 	text_len = vscnprintf(text, sizeof(textbuf), fmt, args);
 
-	/* mark and strip a trailing newline */
-	if (text_len && text[text_len-1] == '\n') {
-		text_len--;
-		lflags |= LOG_NEWLINE;
-	}
-
 	/* strip kernel syslog prefix and extract log level or control flags */
 	if (facility == 0) {
 		int kern_level;
@@ -1888,11 +1959,29 @@ int vprintk_store(int facility, int level,
 	if (level == LOGLEVEL_DEFAULT)
 		level = default_message_loglevel;
 
-	if (dict)
-		lflags |= LOG_PREFIX|LOG_NEWLINE;
+	if (dict) {
+		char *cp;
 
-	return log_output(facility, level, lflags,
-			  dict, dictlen, text, text_len);
+		/* Remove the trailing newline. */
+		if (text_len && text[text_len-1] == '\n')
+			text_len--;
+		/*
+		 * Remove any newline because we don't want to duplicate dict
+		 * information while we want to prefix caller id information to
+		 * each line of text information.
+		 */
+		while ((cp = memchr(text, '\n', text_len)) != NULL)
+			*cp = ' ';
+		log_store(facility, level, lflags | LOG_PREFIX | LOG_NEWLINE,
+			  local_clock(), dict, dictlen, text, text_len);
+		return text_len;
+	}
+
+	/* Add '\n' via line buffering for kernel-generated messages. */
+	if (!facility)
+		lflags |= LOG_NEWLINE;
+	log_output(facility, level, lflags, text, text_len);
+	return text_len;
 }
 
 asmlinkage int vprintk_emit(int facility, int level,
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* Re: [PATCH 3/3] lockdep: Use line-buffered printk() for lockdep messages.
  2018-11-23 12:46                 ` Petr Mladek
  2018-11-23 13:12                   ` Tetsuo Handa
@ 2018-11-23 15:56                   ` Steven Rostedt
  2018-11-24  0:24                     ` Tetsuo Handa
  2018-11-28 13:29                     ` David Laight
  1 sibling, 2 replies; 54+ messages in thread
From: Steven Rostedt @ 2018-11-23 15:56 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Tetsuo Handa, Sergey Senozhatsky, Linus Torvalds,
	Sergey Senozhatsky, Dmitriy Vyukov, Alexander Potapenko,
	Fengguang Wu, Josh Poimboeuf, LKML, Andrew Morton, linux-mm,
	Ingo Molnar, Peter Zijlstra, Will Deacon

On Fri, 23 Nov 2018 13:46:47 +0100
Petr Mladek <pmladek@suse.com> wrote:

> Steven told me on Plumbers conference that even few initial
> characters saved him a day few times.

Yes, and that has happened more than once. I would reboot and retest
code that is crashing, and due to a triple fault, the machine would
reboot because of some race, and the little output I get from the
console would help tremendously.

Remember, debugging the kernel is a lot like forensics, especially when
it's from a customer's site. You look at all the evidence that you can
get, and sometimes it's just 10 characters in the output that gives you
an idea of where things went wrong. I'm really not liking the buffering
idea because of this.

-- Steve

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 3/3] lockdep: Use line-buffered printk() for lockdep messages.
  2018-11-23 15:56                   ` Steven Rostedt
@ 2018-11-24  0:24                     ` Tetsuo Handa
  2018-11-26  4:34                       ` Sergey Senozhatsky
  2018-11-28 13:29                     ` David Laight
  1 sibling, 1 reply; 54+ messages in thread
From: Tetsuo Handa @ 2018-11-24  0:24 UTC (permalink / raw)
  To: Steven Rostedt, Petr Mladek
  Cc: Sergey Senozhatsky, Linus Torvalds, Sergey Senozhatsky,
	Dmitriy Vyukov, Alexander Potapenko, Fengguang Wu,
	Josh Poimboeuf, LKML, Andrew Morton, linux-mm, Ingo Molnar,
	Peter Zijlstra, Will Deacon

On 2018/11/24 0:56, Steven Rostedt wrote:
> On Fri, 23 Nov 2018 13:46:47 +0100
> Petr Mladek <pmladek@suse.com> wrote:
> 
>> Steven told me on Plumbers conference that even few initial
>> characters saved him a day few times.
> 
> Yes, and that has happened more than once. I would reboot and retest
> code that is crashing, and due to a triple fault, the machine would
> reboot because of some race, and the little output I get from the
> console would help tremendously.
> 
> Remember, debugging the kernel is a lot like forensics, especially when
> it's from a customer's site. You look at all the evidence that you can
> get, and sometimes it's just 10 characters in the output that gives you
> an idea of where things went wrong. I'm really not liking the buffering
> idea because of this.

Then, we should not enforce buffering in a way that requires modification of
printk() callers. That is, we should not ask printk() callers to use their
private buffer. What we can do is to enable/disable line buffering inside
printk() depending on the problem the user wants to debug.

Also, we should allow disabling "struct cont" depending on the problem (in
order to allow flushing the 10 characters in the "cont" buffer).



By the way, is the comment

  /*
   * Continuation lines are buffered, and not committed to the record buffer
   * until the line is complete, or a race forces it. The line fragments
   * though, are printed immediately to the consoles to ensure everything has
   * reached the console in case of a kernel crash.
   */

appropriate despite we don't call cont_flush() upon a kernel crash?

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 3/3] lockdep: Use line-buffered printk() for lockdep messages.
  2018-11-24  0:24                     ` Tetsuo Handa
@ 2018-11-26  4:34                       ` Sergey Senozhatsky
  0 siblings, 0 replies; 54+ messages in thread
From: Sergey Senozhatsky @ 2018-11-26  4:34 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Steven Rostedt, Petr Mladek, Sergey Senozhatsky, Linus Torvalds,
	Sergey Senozhatsky, Dmitriy Vyukov, Alexander Potapenko,
	Fengguang Wu, Josh Poimboeuf, LKML, Andrew Morton, linux-mm,
	Ingo Molnar, Peter Zijlstra, Will Deacon

On (11/24/18 09:24), Tetsuo Handa wrote:
> >> Steven told me on Plumbers conference that even few initial
> >> characters saved him a day few times.
> > 
> > Yes, and that has happened more than once. I would reboot and retest
> > code that is crashing, and due to a triple fault, the machine would
> > reboot because of some race, and the little output I get from the
> > console would help tremendously.
> > 
> > Remember, debugging the kernel is a lot like forensics, especially when
> > it's from a customer's site. You look at all the evidence that you can
> > get, and sometimes it's just 10 characters in the output that gives you
> > an idea of where things went wrong. I'm really not liking the buffering
> > idea because of this.
> 
> Then, we should not enforce buffering in a way that requires modification of
> printk() callers. That is, we should not ask printk() callers to use their
> private buffer. What we can do is to enable/disable line buffering inside
> printk() depending on the problem the user wants to debug.

Right; overall I tend to agree with what you guys are saying and
I like Petr's "I am more and more wondering if the buffered printk
is worth the effort" comment; and I like Steven's comment on flushes;
and admire Tetsuo's efforts.

I think that printk_seq_buf/printk_buffer was never going to replace
pr_cont() and I never liked the idea. The printk_safe proposal for
lockdep had one OK thing about it - it would pass our normal marshaling
before it would reach the buffering stage. Which means - no buffering
for people who "detest" printk buffering.

This looks better:
	printk->vprint_func->{early_printk/printk_safe/vprintk_emit}->buffering

Than this:
	pr_buffer->buffering->vprintk_func->{early_printk/printk_safe/vprintk_emit}

Another thing is - printk seq_buf/printk_buffer doesn't really solve any
problem. People, who can use seq_buf/char buf[256]/etc. buffering, already
can do so; people who cannot - won't switch to a new buffering printk
anyway.

The bad thing about printk_safe proposal is that it's per-CPU; which
is OK for some paths (like lockdep), but not OK in general (e.g. OOM).

IMO, try_buffered_printk() attempts to solve the problem at the right
place - printk. And it does not break our normal marshaling, so we don't
"fix" printk users and we keep people, who does vprintk_func->early_printk
thing, happy. So I don't dislike try_buffered_printk() approach. And unlike
before, now we are talking about a single line buffering.

If we'd walk this way, I would prefer to NOT introduce any structs and
any new code, or any new "split and log_store() in the middle" rules. Just
a bunch of "struct cont" buffers:

	static struct cont conts[N];

and cont_add()/cont_flush() to handle pr_cont, with all the flushes it
does; but on a per-context basis.
conts[0] should serve as a fallback cont buffer, in case if there are
no available cont buffers left. flush_on_panic() is still miserable,
for sure; probably we can do something about it.

Or... Instead.
We can just leave pr_cont() alone for now. And make it possible to
reconstruct messages - IOW, inject some info to printk messages. We
do this at Samsung (inject CPU number at the beginning of every
message. `cat serial.0 | grep "\[1\]"` to grep for all messages from
CPU1). Probably this would be the simplest thing.

> Also, we should allow disabling "struct cont" depending on the problem (in
> order to allow flushing the 10 characters in the "cont" buffer).
>
> By the way, is the comment
>
>   /*
>    * Continuation lines are buffered, and not committed to the record buffer
>    * until the line is complete, or a race forces it. The line fragments
>    * though, are printed immediately to the consoles to ensure everything has
>    * reached the console in case of a kernel crash.
>    */

printk does not do this anymore; you are right.

> appropriate despite we don't call cont_flush() upon a kernel crash?

I tend to count on flush_on_panic more than on a "last moment"
pr_cont->cont_flush(), which was guaranteed to happen immediately
only with early_con.

A kernel crash usually has enough pr_emerg/printk-s to force cont flush.
Even pr_info() will do. We look at the loglevel much later; so even
messages which never make it to the consoles still flush cont buffer.

	-ss

^ permalink raw reply	[flat|nested] 54+ messages in thread

* RE: [PATCH 3/3] lockdep: Use line-buffered printk() for lockdep messages.
  2018-11-23 15:56                   ` Steven Rostedt
  2018-11-24  0:24                     ` Tetsuo Handa
@ 2018-11-28 13:29                     ` David Laight
  2018-11-29 10:09                       ` Tetsuo Handa
  1 sibling, 1 reply; 54+ messages in thread
From: David Laight @ 2018-11-28 13:29 UTC (permalink / raw)
  To: 'Steven Rostedt', Petr Mladek
  Cc: Tetsuo Handa, Sergey Senozhatsky, Linus Torvalds,
	Sergey Senozhatsky, Dmitriy Vyukov, Alexander Potapenko,
	Fengguang Wu, Josh Poimboeuf, LKML, Andrew Morton, linux-mm,
	Ingo Molnar, Peter Zijlstra, Will Deacon

From: Steven Rostedt
> > Steven told me on Plumbers conference that even few initial
> > characters saved him a day few times.
> 
> Yes, and that has happened more than once. I would reboot and retest
> code that is crashing, and due to a triple fault, the machine would
> reboot because of some race, and the little output I get from the
> console would help tremendously.
> 
> Remember, debugging the kernel is a lot like forensics, especially when
> it's from a customer's site. You look at all the evidence that you can
> get, and sometimes it's just 10 characters in the output that gives you
> an idea of where things went wrong. I'm really not liking the buffering
> idea because of this.

Yep, it is a real PITA that syslogd (or linux equiv) stops messages being
written to the console by the kernel (which used to be synchronous) and
instead writes them out from userspace.
By that time it has all died.

Sometimes you want a printk() that writes the data to the serial port
before returning.

I also spent a week trying to work out why a customer kernel was
locking up - only to finally find out that the distro they were
using set 'panic on opps' - making it almost impossible to find
out what was happening.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 3/3] lockdep: Use line-buffered printk() for lockdep messages.
  2018-11-28 13:29                     ` David Laight
@ 2018-11-29 10:09                       ` Tetsuo Handa
  2018-11-30 16:01                         ` Petr Mladek
  0 siblings, 1 reply; 54+ messages in thread
From: Tetsuo Handa @ 2018-11-29 10:09 UTC (permalink / raw)
  To: David Laight, 'Steven Rostedt', Petr Mladek, Sergey Senozhatsky
  Cc: Linus Torvalds, Sergey Senozhatsky, Dmitriy Vyukov,
	Alexander Potapenko, Fengguang Wu, Josh Poimboeuf, LKML,
	Andrew Morton, linux-mm, Ingo Molnar, Peter Zijlstra,
	Will Deacon

On 2018/11/28 22:29, David Laight wrote:
> I also spent a week trying to work out why a customer kernel was
> locking up - only to finally find out that the distro they were
> using set 'panic on opps' - making it almost impossible to find
> out what was happening.

How can line buffering negatively affect this case? The current thread
which triggered an oops will keep calling printk() until that current
thread reaches panic(), won't it?

The only case where line buffering negatively affects will be that the
initial 10 bytes of a critical line failed to reach log_store() because
some other thread (not the current thread doing a series of printk())
halted CPUs / reset the whole machine _before_ the initial 10 bytes of
a critical line reaches log_store().



On 2018/11/26 13:34, Sergey Senozhatsky wrote:
> Or... Instead.
> We can just leave pr_cont() alone for now. And make it possible to
> reconstruct messages - IOW, inject some info to printk messages. We
> do this at Samsung (inject CPU number at the beginning of every
> message. `cat serial.0 | grep "\[1\]"` to grep for all messages from
> CPU1). Probably this would be the simplest thing.

Yes, I sent a patch which helps reconstructing messages at
http://lkml.kernel.org/r/1543045075-3008-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp .

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 3/3] lockdep: Use line-buffered printk() for lockdep messages.
  2018-11-29 10:09                       ` Tetsuo Handa
@ 2018-11-30 16:01                         ` Petr Mladek
  0 siblings, 0 replies; 54+ messages in thread
From: Petr Mladek @ 2018-11-30 16:01 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: David Laight, 'Steven Rostedt',
	Sergey Senozhatsky, Linus Torvalds, Sergey Senozhatsky,
	Dmitriy Vyukov, Alexander Potapenko, Fengguang Wu,
	Josh Poimboeuf, LKML, Andrew Morton, linux-mm, Ingo Molnar,
	Peter Zijlstra, Will Deacon

On Thu 2018-11-29 19:09:26, Tetsuo Handa wrote:
> On 2018/11/28 22:29, David Laight wrote:
> > I also spent a week trying to work out why a customer kernel was
> > locking up - only to finally find out that the distro they were
> > using set 'panic on opps' - making it almost impossible to find
> > out what was happening.

Did the machine rebooted before the messages reached console or
did it produced crash-dump or frozen?

panic() tries relatively hard to flush the messages to the console,
see printk_safe_flush_on_panic() and console_flush_on_panic().
It is less aggressive when crashdump is called. It might deadlock
in console drivers.

Hmm, it might also fail when another CPU is still running and
steals console_lock. We might want to disable
console_trylock_spinning() if the caller is not
panic_cpu.


> On 2018/11/26 13:34, Sergey Senozhatsky wrote:
> > Or... Instead.
> > We can just leave pr_cont() alone for now. And make it possible to
> > reconstruct messages - IOW, inject some info to printk messages. We
> > do this at Samsung (inject CPU number at the beginning of every
> > message. `cat serial.0 | grep "\[1\]"` to grep for all messages from
> > CPU1). Probably this would be the simplest thing.
> 
> Yes, I sent a patch which helps reconstructing messages at
> http://lkml.kernel.org/r/1543045075-3008-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp .

All the buffering approaches have problems that cannot be solved
easily. The prefix-based approach looks like the best alternative
at the moment.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 54+ messages in thread

end of thread, other threads:[~2018-11-30 16:01 UTC | newest]

Thread overview: 54+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-11-02 13:31 [PATCH v6 1/3] printk: Add line-buffered printk() API Tetsuo Handa
2018-11-02 13:31 ` [PATCH 2/3] mm: Use line-buffered printk() for show_free_areas() Tetsuo Handa
2018-11-07 14:07   ` Petr Mladek
2018-11-02 13:31 ` [PATCH 3/3] lockdep: Use line-buffered printk() for lockdep messages Tetsuo Handa
2018-11-02 13:36   ` Peter Zijlstra
2018-11-03  2:00     ` Tetsuo Handa
2018-11-06  8:38     ` Petr Mladek
2018-11-06  9:05       ` Sergey Senozhatsky
2018-11-06 12:57         ` Petr Mladek
2018-11-06  9:56       ` Tetsuo Handa
2018-11-07 15:19   ` Petr Mladek
2018-11-08  4:45     ` Sergey Senozhatsky
2018-11-08 11:37       ` Tetsuo Handa
2018-11-09  6:12         ` Sergey Senozhatsky
2018-11-09  9:55           ` Tetsuo Handa
2018-11-09 15:43             ` Petr Mladek
2018-11-10  2:42               ` Tetsuo Handa
2018-11-23 12:46                 ` Petr Mladek
2018-11-23 13:12                   ` Tetsuo Handa
2018-11-23 15:56                   ` Steven Rostedt
2018-11-24  0:24                     ` Tetsuo Handa
2018-11-26  4:34                       ` Sergey Senozhatsky
2018-11-28 13:29                     ` David Laight
2018-11-29 10:09                       ` Tetsuo Handa
2018-11-30 16:01                         ` Petr Mladek
2018-11-10  8:52               ` Tetsuo Handa
2018-11-23 12:52                 ` Petr Mladek
2018-11-09 14:08           ` Linus Torvalds
2018-11-09 14:42             ` Steven Rostedt
2018-11-08 11:53       ` Petr Mladek
2018-11-08 12:44         ` Sergey Senozhatsky
2018-11-08 14:21           ` Sergey Senozhatsky
2018-11-09  9:54     ` Tetsuo Handa
2018-11-02 14:40 ` [PATCH v6 1/3] printk: Add line-buffered printk() API Matthew Wilcox
2018-11-03  1:55   ` Tetsuo Handa
2018-11-02 18:12 ` kbuild test robot
2018-11-06 14:35 ` Sergey Senozhatsky
2018-11-07 10:21   ` Petr Mladek
2018-11-07 12:54     ` Tetsuo Handa
2018-11-08  2:21     ` Sergey Senozhatsky
2018-11-08 11:24       ` Petr Mladek
2018-11-08 11:46         ` Tetsuo Handa
2018-11-08 12:30         ` Sergey Senozhatsky
2018-11-09 14:10           ` Petr Mladek
2018-11-12  7:59             ` Sergey Senozhatsky
2018-11-12 10:42               ` Tetsuo Handa
2018-11-17 10:14               ` Tetsuo Handa
2018-11-07 10:52   ` Tetsuo Handa
2018-11-07 11:01     ` David Laight
2018-11-07 12:00       ` Petr Mladek
2018-11-07 11:45     ` Petr Mladek
2018-11-08  2:30     ` Sergey Senozhatsky
2018-11-07 13:41 ` Petr Mladek
2018-11-07 14:06 ` Petr Mladek

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).